Friday, July 4, 2008

Re: [HACKERS] PATCH: CITEXT 2.0

Replying to myself, but I've made some local changes (see other
messages) and just wanted to follow up on some of my own comments.

On Jul 2, 2008, at 21:38, David E. Wheeler wrote:

>> 4) Operator = citext_eq is not correct. See comment http://doxygen.postgresql.org/varlena_8c.html#8621d064d14f259c594e4df3c1a64cac
>
> So should citextcmp() call strncmp() instead of varstr_cmp()? The
> latter is what I saw in varlena.c.

I'm guessing that the answer is "no," since varstr_cmp() uses
strncmp() internally, as appropriate to the locale. Correct?

>> There must be a difference between equality and collation. For
>> example, in the Czech language 'láska' and 'laská' are different
>> words, which means that 'láska' != 'laská'. But there is no difference
>> in collation order. See Unicode Universal Collation Algorithm for details.
>
> I'll leave the collation stuff to the functions I call (*far* from
> my specialty), but I'll add a test for this and make sure it works
> as expected. Um, although, with what collation should it be tested?
> The tests I wrote assume en_US.UTF-8.

I added this test and it passes:

SELECT isnt( 'láska'::citext, 'laská'::citext,
    'Different accented characters should not be equivalent' );
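
For readers without a patched server at hand, the property this test checks can be sketched outside the database. The Python below is only an illustration, not the patch's C code: citext_cmp() and citext_eq() are hypothetical helpers, and plain code-point ordering stands in for the locale-aware comparison that varstr_cmp() would perform.

```python
def citext_cmp(a: str, b: str) -> int:
    """Three-way, case-insensitive comparison (hypothetical helper)."""
    # Fold case first, as a case-insensitive type must.
    la, lb = a.lower(), b.lower()
    # Assumption: code-point ordering stands in for the locale-aware
    # comparison done inside the database.
    return (la > lb) - (la < lb)


def citext_eq(a: str, b: str) -> bool:
    """Equality is 'compares as zero' (hypothetical helper)."""
    return citext_cmp(a, b) == 0


# Escapes keep the accented characters unambiguous.
laska_1 = "l\u00e1ska"  # láska
laska_2 = "lask\u00e1"  # laská

print(citext_eq("L\u00c1SKA", laska_1))  # differs only in case
print(citext_eq(laska_1, laska_2))       # differs in accent placement
```

The first call should report the strings as equal, since they differ only in case; the second should report them as different, since moving the accent changes the word.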

>> 5) There are several commented out lines in CREATE OPERATOR
>> statement mostly related to NEGATOR. Is there some reason for that?
>
> I copied it from the original citext.sql. Not sure what effect it has.

I restored these (and one of them was wrong anyway).

>> Also OPERATOR || has probably wrong negator.
>
> Right, good catch.

Stupid question: What would the negation of || actually be? There
isn't one, is there?

Thanks!

David
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
