Wednesday, July 23, 2008

Re: [HACKERS] [WIP] collation support revisited (phase 1)

On Tue, Jul 22, 2008 at 04:32:39PM +0200, Zdenek Kotala wrote:
> >Oh, so you're thinking of a charset as a sort of check constraint. If
> >your locale is turkish and you have a column marked charset ASCII then
> >storing lower('HI') results in an error.
>
> Yeah, if you use strcoll function it fails when illegal character is found.
> See
> http://www.opengroup.org/onlinepubs/009695399/functions/strcoll.html

Weird, at least in glibc and ICU it can't happen, but perhaps there are
other implementations where it can...

> Collation cannot be defined on any character. There is not any relation
> between
> Latin and Chinese characters. Collation has sense when you are able to
> specify < = > operators.

There is no standardised relation. However, if your collation library
decides to sort all Chinese characters after all Latin characters,
it has defined a collation that works for strings containing any
characters... Which is basically the approach glibc and ICU take.

I think it's kind of pathetic that the standard says strcoll can set
errno but reserves no return value to indicate an error. I wonder how
many platforms actually use that "feature".
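
For what it's worth, the only portable way to notice such a failure is the
one the POSIX page suggests: set errno to 0, call strcoll, then look at
errno. A minimal sketch of that pattern follows; checked_strcoll is a
made-up helper name, not something from PostgreSQL or libc, and on glibc I
would expect the failure branch never to be taken.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper (name and signature are illustrative only).
 * Compares a and b with strcoll() under the current LC_COLLATE locale.
 * Since strcoll() reserves no return value for errors, failure is
 * detected by clearing errno first and inspecting it afterwards.
 * *failed is set to 1 if strcoll() set errno, 0 otherwise.
 */
static int checked_strcoll(const char *a, const char *b, int *failed)
{
    int saved_errno = errno;   /* preserve the caller's errno */
    int result;

    errno = 0;
    result = strcoll(a, b);

    if (errno != 0) {
        /* Leave errno as set by strcoll() so the caller can inspect it. */
        *failed = 1;
    } else {
        *failed = 0;
        errno = saved_errno;   /* no error: restore the previous value */
    }
    return result;
}
```

A caller would check the failed flag before trusting the sign of the
result, since the return value is meaningless when the comparison failed.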

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
