Monday, July 21, 2008

[BUGS] BUG #4319: lower()/upper() does not know about UNICODE case mapping

The following bug has been logged online:

Bug reference: 4319
Logged by: Valentine Gogichashvili
Email address: valgog@gmail.com
PostgreSQL version: 8.3.1
Operating system: SuSE Linux (kernel 2.6.13-15.11-default)
Description: lower()/upper() does not know about UNICODE case mapping
Details:

Hi,

I understand that this is more of a feature than a bug, but that does not help me anyway...

On UNICODE databases, the lower() and upper() functions use the system locale
settings (which cannot be changed after initializing the DB?) and do not know
anything about UNICODE case mapping.

The problem really becomes 'a problem' on multilingual systems. I have to
store data for German, Russian and Romanian languages together.

On an 8.2.3 database with LC_CTYPE set to en_EN, the lower() function actually
corrupts UTF8 data by lowercasing UTF8 control bytes... I did not have a chance
to check how it works on 8.3, as I do not have any db instance with
LC_CTYPE set to en_EN.
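
To illustrate the kind of corruption described above, here is a minimal Python sketch. It assumes that the damage comes from lowercasing each byte as if it were a single Latin-1 character; that model is an assumption made for illustration, not a confirmed description of what the server does. The helper names bytewise_lower and is_valid_utf8 are hypothetical.

```python
def bytewise_lower(data: bytes) -> bytes:
    """Lowercase every byte as if it were one Latin-1 character.

    Assumed model of a single-byte-locale lower() applied to UTF-8 data.
    """
    return data.decode("latin-1").lower().encode("latin-1")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


original = "Ä".encode("utf-8")       # two bytes: 0xC3 0x84
damaged = bytewise_lower(original)   # lead byte 0xC3 is "lowered" to 0xE3

print(original, is_valid_utf8(original))
print(damaged, is_valid_utf8(damaged))
```

Under this assumption the lead byte 0xC3 is treated as the Latin-1 letter "Ã" and mapped to 0xE3, which announces a three-byte sequence that never arrives, so the result is no longer valid UTF-8.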

I can understand that LC_COLLATE is something that is not clear from the
context of the character. But the case pair is always defined in the UNICODE
standard and should not depend on LC_CTYPE. Or am I wrong?
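
For comparison, a short Python sketch of locale-independent case mapping. Python's str.lower() applies the Unicode default case mappings and does not consult LC_CTYPE; the sample words are illustrative only and are not taken from my data.

```python
# Sample words in the three languages mentioned above (illustrative only).
samples = {
    "German": "GRÖSSE",
    "Russian": "ПРИВЕТ",
    "Romanian": "ȘTIINȚĂ",
}

# str.lower() uses Unicode default case mappings, independent of the locale.
lowered = {language: text.lower() for language, text in samples.items()}

for language, text in lowered.items():
    print(language, text)
```

All three scripts are lowercased in one pass without any per-language locale setting.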

Regards,

-- Valentine Gogichashvili

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
