Thursday, July 10, 2008

Re: [HACKERS] gsoc, text search selectivity and dllist enhancements

On Wed, 9 Jul 2008, Jan Urbański wrote:

> Jan Urbański wrote:


> Do you think it's worthwhile to implement the LC algorithm in C and send it
> out, so others could try it out? Heck, maybe it's worthwhile to replace the
> current compute_minimal_stats() algorithm with LC and see how that compares?

Teodor and I are using LC for phrase estimation in one application, and
from our understanding of the original paper this algorithm might not be
good for sampling, since all the theory behind it is about streaming the
FULL data. As for technique, we use a suffix tree, which should be fine
for a typical sample size.
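
For readers who want to try it out: assuming LC here means the Lossy
Counting algorithm of Manku and Motwani, a minimal sketch could look like
the one below. The names lossy_count, frequent_items, epsilon and support
are illustrative assumptions, not taken from PostgreSQL or from the
application mentioned above. Note that the error bound is stated relative
to n, the number of items actually streamed, which is why the guarantee
covers only the sample when the input is a sample rather than the full
data.

```python
import math
from typing import Dict, Hashable, Iterable, List, Tuple


def lossy_count(
    stream: Iterable[Hashable], epsilon: float
) -> Tuple[Dict[Hashable, List[int]], int]:
    """Single pass over `stream`; returns ({item: [count, delta]}, n).

    `count` underestimates the true frequency by at most epsilon * n,
    where n is the number of items seen. Hypothetical helper for
    illustration only.
    """
    if not 0 < epsilon < 1:
        raise ValueError("epsilon must be in (0, 1)")
    width = math.ceil(1 / epsilon)  # bucket width
    entries: Dict[Hashable, List[int]] = {}
    n = 0
    for item in stream:
        n += 1
        bucket = math.ceil(n / width)  # id of the current bucket
        if item in entries:
            entries[item][0] += 1
        else:
            # delta = maximum number of occurrences we may have missed
            entries[item] = [1, bucket - 1]
        if n % width == 0:
            # Bucket boundary: drop entries that cannot be frequent.
            stale = [k for k, (f, d) in entries.items() if f + d <= bucket]
            for k in stale:
                del entries[k]
    return entries, n


def frequent_items(
    entries: Dict[Hashable, List[int]], n: int, support: float, epsilon: float
) -> List[Hashable]:
    """Items whose counted frequency is at least (support - epsilon) * n."""
    threshold = (support - epsilon) * n
    return sorted(k for k, (f, _d) in entries.items() if f >= threshold)
```

A call such as lossy_count(words, 0.01) followed by
frequent_items(entries, n, 0.05, 0.01) would report every word that makes
up at least 5% of the streamed input, possibly together with some words
between 4% and 5%.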

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
