Thursday, August 14, 2008

Re: [HACKERS] gsoc, oprrest function for text search take 2

Jan Urbański <j.urbanski@students.mimuw.edu.pl> writes:

> Heikki Linnakangas wrote:
>> Speaking of which, a lot of time seems to be spent on detoasting. I'd like to
>> understand that a better. Where is the detoasting coming from?
>
> Hmm, maybe bttext_pattern_cmp does some detoasting? It calls
> PG_GETARG_TEXT_PP(), which in turn calls pg_detoast_datum_packed(). Oh, and
> also I think that compare_lexeme_textfreq() uses DatumGetTextP() and that also
> does detoasting.

DatumGetTextP() will detoast packed data (ie, 1-byte length headers) whereas
DatumGetTextPP will only detoast compressed or externally stored data. I
suspect you're seeing the former.

> The root of all evil could by keeping a Datum in the TextFreq array, and not
> a "text *", which is something you pointed out earlier and I apparently
> didn't understand.

Well it doesn't really matter which type. If you store Datums which are
already detoasted then the DatumGetTextP and DatumGetTextPP will just be noops
anyways. If you store packed data (from DatumGetTextPP) then it's probably
safer to store it as Datums so if you need to pass it to any functions which
don't expect packed data they'll untoast it.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

No comments: