Sunday, July 20, 2008

Re: [HACKERS] [WIP] collation support revisited (phase 1)

I have been trying to work out how to avoid creating a new catalog for character sets, and I came up with the following ideas. Please correct me if they are wrong.

Since a collation has to have a defined character set, I suggest reusing the existing encoding infrastructure and the list of encodings in chklocale.c. Currently, databases are created with a specified encoding, not a specified character set. Instead of pointing a record in the collation catalog to a record in a character set catalog, I think we could store only the name (a string) of the encoding.

So each collation would be defined over one of the encodings listed in chklocale.c. Each database would be able to use only collations created over the same ("compatible") encoding, as determined by encoding_match_list. Each standard collation (from the SQL standard) would be defined over all possible encodings (hard-coded).
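
To make the idea concrete, here is a rough sketch of the compatibility check I have in mind. It is only an illustration, not existing PostgreSQL code: the struct CollationDef, both function names, and the rule of comparing names while ignoring case and punctuation are assumptions of mine, and a real version would consult encoding_match_list in chklocale.c rather than compare strings directly.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical collation record: the encoding is stored by name (a string),
 * not as a reference to a row in a separate character set catalog.
 */
typedef struct
{
    const char *collname;
    const char *encoding;   /* NULL means "defined over all encodings" */
} CollationDef;

/*
 * Compare two encoding names, ignoring case and any non-alphanumeric
 * characters, so that "UTF-8", "utf8" and "UTF_8" are treated as equal.
 * (Assumed matching rule, for illustration only.)
 */
static bool
encoding_names_match(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    for (;;)
    {
        /* skip punctuation such as '-' and '_' in both names */
        while (*a && !isalnum((unsigned char) *a))
            a++;
        while (*b && !isalnum((unsigned char) *b))
            b++;

        /* if either name is exhausted, they match only if both are */
        if (*a == '\0' || *b == '\0')
            return *a == *b;

        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return false;

        a++;
        b++;
    }
}

/*
 * A database may use a collation only if the collation is defined over all
 * encodings (the hard-coded standard collations) or over the database's
 * own encoding.
 */
static bool
collation_usable_in_database(const CollationDef *coll, const char *db_encoding)
{
    if (coll->encoding == NULL)
        return true;
    return encoding_names_match(coll->encoding, db_encoding);
}
```

With this, a collation recorded with encoding "LATIN2" would be rejected in a UTF8 database, while a standard collation with a NULL encoding would be accepted everywhere.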

Comments?

Regards

     Radek Strnad

On Sat, Jul 12, 2008 at 5:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
> I think if we support UTF8 encoding, then it makes sense to create our own charsets,
> because system locales could have a collation defined for that.

Say what?  I cannot imagine a scenario in which a user-defined encoding
would be useful. The amount of infrastructure you need for a new
encoding is so large that providing management commands is just silly
--- anyone who can create the infrastructure can do the last little bit
for themselves.  The analogy to index access methods is on point, again.

                       regards, tom lane
