Saturday, June 21, 2008

Re: [BUGS] BUG #4257: about unicode extend

"arli weng" <program@163.com> writes:
> the command (chinese by utf-8):
> INSERT INTO "title" VALUES(46307243,46307898,'é…&lsqauo;鼠𪕨');
> in postgres report error:
> invalid byte sequence for encoding "UNICODE": 0xf0

I don't believe this is actually an 8.3 server. In 8.1 or later that
encoding would be referred to as "UTF8"; also, 8.1 and later would show
all bytes of the complained-of character not just the first one.

8.0 and before only support 16-bit Unicode code points (ie, 3-byte
utf8 sequences). We have support for 4-byte sequences in 8.1 and
later. Also, there were some fixes in this area in Jan 2007, so
whichever branch you use, make sure you get a minor release that's
newer than that.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

No comments: