Thursday, July 10, 2008

Re: [HACKERS] gsoc, text search selectivity and dllist enhancments

Jan Urbański wrote:

> Oh, one important thing. You need to choose a bucket width for the LC
> algorithm, that is decide after how many elements will you prune your
> data structure. I chose to prune after every twenty tsvectors.

Do you prune after X tsvectors regardless of the number of lexemes in
them? I don't think that preserves the algorithm's properties; if there's
a bunch of very short tsvectors followed by long ones, the pruning
would take place too early for the initial lexemes. I think you should
count lexemes, not tsvectors.
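
To illustrate the suggestion, here is a minimal Python sketch of Lossy
Counting in which the prune is triggered by the running lexeme count
rather than by the number of tsvectors. It is not the code from the patch:
the function name lossy_count, the bucket_width parameter, and the
representation of a tsvector as a plain list of lexeme strings are all
assumptions made for illustration.

```python
def lossy_count(tsvectors, bucket_width):
    """Lossy Counting over lexemes, pruning every `bucket_width` lexemes.

    `tsvectors` is assumed to be an iterable of lists of lexeme strings
    (a stand-in for real tsvectors). Returns {lexeme: counted frequency}.
    """
    counts = {}  # lexeme -> [frequency, delta]; delta bounds the undercount
    seen = 0     # lexemes processed so far, across all tsvectors

    for tsvector in tsvectors:
        for lexeme in tsvector:
            seen += 1
            # Current bucket id: ceil(seen / bucket_width)
            bucket = -(-seen // bucket_width)

            if lexeme in counts:
                counts[lexeme][0] += 1
            else:
                # A new entry may have been missed in all earlier buckets
                counts[lexeme] = [1, bucket - 1]

            # Prune at bucket boundaries, which depend on the lexeme count
            # only, so short and long tsvectors are treated alike
            if seen % bucket_width == 0:
                stale = [k for k, (f, d) in counts.items() if f + d <= bucket]
                for key in stale:
                    del counts[key]

    return {k: f for k, (f, d) in counts.items()}
```

With three one-lexeme tsvectors followed by a five-lexeme one and a bucket
width of four, the prune fires after the fourth and eighth lexeme, no
matter which tsvector they fall in:

    lossy_count([["a"], ["b"], ["a"], ["c", "a", "d", "a", "e"]], 4)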


--
Alvaro Herrera

http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
