Friday, July 18, 2008

Re: [HACKERS] Postgres-R: primary key patches

markus@bluegap.ch (Markus Wanner) writes:
> Hello Chris,
>
> chris wrote:
>> Slony-I does the same, with the "variation" that it permits the option
>> of using a "candidate primary key," namely an index that is unique+NOT
>> NULL.
>>
>> If it is possible to support that broader notion, that might make
>> addition of these sorts of logic more widely useful.
>
> Well, yeah, that's technically not much different, so it would
> probably be very easy to extend Postgres-R to work on any arbitrary
> index.
>
> But what do we have primary keys for, in the first place? Isn't it
> exactly the *primary* key into the table, which you want to use for
> replication? Or do we need an additional per-table configuration
> option for that? A REPLICATION KEY besides the PRIMARY KEY?

I agree with you that tables are *supposed* to have primary keys;
that's proper design, and if tables are missing them, then something
is definitely broken.

Sometimes, unfortunately, people make errors in design, and we wind up
needing to accommodate situations that are "less than perfect."

The "happy happenstance" is that, in modern versions of PostgreSQL, a
unique index may be added in the background so that this may be
rectified without outage if you can live with a "candidate primary
key" rather than a true PRIMARY KEY.

It seems to me that supporting this broader notion can cover a number
of "design sins," which is a very kind accommodation, even though it
is surely preferable to design the key in earlier rather than later.

>> I know Jan Wieck has in mind the idea of adding an interface to enable
>> doing highly efficient IUD (Insert/Update/Delete) via generating a way
>> to do direct heap updates, which would be *enormously* more efficient
>> than the present need (in Slony-I, for instance) to parse, plan and
>> execute thousands of IUD statements. For UPDATE/DELETE to work
>> requires utilizing (candidate) primary keys, so there is some
>> seemingly relevant similarity there.
>
> Definitely. The remote backend does exactly that for Postgres-R: it
> takes a change set, which consists of one or more tuple collections,
> and then applies these collections. See ExecProcessCollection() in
> execMain.c.
>
> (Although, I'm still less than thrilled about the internal storage
> format of these tuple collections. That can certainly be improved and
> simplified.)

You may want to have a chat with Jan; he's got some thoughts on a more
general-purpose mechanism that would be good for this as well as for
(we think) extremely efficient bulk data loading.
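For illustration, this is roughly the kind of per-row statement a
trigger-based replica such as Slony-I has to parse, plan and execute
for every replicated change, with the row located via the (candidate)
primary key; a direct heap update interface would sidestep that
per-statement overhead. (Table, column names and values are
hypothetical.)

  -- One generated statement per replicated row change:
  UPDATE accounts
     SET balance = 42.17
   WHERE acct_no = 1001;  -- row located via the candidate primary key

  DELETE FROM accounts
   WHERE acct_no = 1002;  -- likewise keyed on the unique, NOT NULL index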
--
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://cbbrowne.com/info/lsf.html
Rules of the Evil Overlord #145. "My dungeon cell decor will not
feature exposed pipes. While they add to the gloomy atmosphere, they
are good conductors of vibrations and a lot of prisoners know Morse
code." <http://www.eviloverlord.com/>

