Friday, September 12, 2008

Re: [HACKERS] Synchronous Log Shipping Replication

On Fri, 2008-09-12 at 17:24 +0300, Hannu Krosing wrote:
> On Fri, 2008-09-12 at 17:08 +0300, Heikki Linnakangas wrote:
> > Hmm, built-in rsync capability would be cool. Probably not in the first
> > phase, though..
>
> We have it for WAL shipping, in form of GUC "archive_command" :)
>
> Why not add full_backup_command ?

I see the current design is all master-push centered, i.e. the master is
in control of everything WAL related. That makes it hard to create a
slave which is simply pointed to the server and takes all its data from
there...

Why not have a design where the slave is in control of its own data?
I mean the slave could ask for the base files (possibly through a
special function deployed on the master), then ask for the WAL stream
and so on. That would easily let a slave cascade too, as it could relay
the WAL stream and serve the base backup too... or there could be
special WAL repository software with the same interface as a normal
master, but offering a choice of base backups and WAL streams. On top of
that, a slave-in-control approach would also allow multiple slaves at
the same time for a given master...
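
A rough sketch of the "same interface" idea, to make the cascading
argument concrete. Everything here is hypothetical: the names WalSource,
base_backup and wal_since are made up for illustration and are not
existing postgres functions, and WAL positions are simplified to list
indexes counted from the base backup.

```python
from typing import List, Protocol


class WalSource(Protocol):
    """Hypothetical interface: anything a slave can pull its data from."""

    def base_backup(self) -> bytes: ...

    def wal_since(self, position: int) -> List[bytes]: ...


class Master:
    """Toy master: one base backup plus an append-only list of WAL records."""

    def __init__(self, base: bytes) -> None:
        self._base = base
        self._wal: List[bytes] = []

    def write(self, record: bytes) -> None:
        self._wal.append(record)

    def base_backup(self) -> bytes:
        return self._base

    def wal_since(self, position: int) -> List[bytes]:
        return self._wal[position:]


class Slave:
    """Pulls from any WalSource and exposes the same interface itself,
    so another slave can be pointed at it (cascading)."""

    def __init__(self, upstream: WalSource) -> None:
        self._upstream = upstream
        self._base = upstream.base_backup()
        self._wal: List[bytes] = []

    def pull(self) -> int:
        # Ask upstream only for the records we do not have yet.
        new_records = self._upstream.wal_since(len(self._wal))
        self._wal.extend(new_records)
        return len(new_records)

    def base_backup(self) -> bytes:
        return self._base

    def wal_since(self, position: int) -> List[bytes]:
        return self._wal[position:]
```

Because Slave only depends on the interface, Slave(master) and
Slave(another_slave) are set up the same way, and a WAL repository would
just be a third class offering the same two calls.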

The way it would work would be something like:

* configure the slave with a postgres connection to the master;
* the slave will connect and set up some meta data on the master
identifying itself and telling the master to keep the WAL needed by this
slave, and also get some meta data about the master's details if needed;
* the slave will call a special function on the master and ask for the
base backup to be streamed (potentially compressed with special
knowledge of postgres internals);
* once the base backup is streamed, or possibly in parallel, ask for
streaming the WAL files;
* when the base backup is finished, start applying the WAL stream, which
was cached in the meantime, while its streaming continues;
* keep the master updated about the state of the slave, so the master
can know if it needs to keep the WAL files which were not yet streamed;
* in case of network error, the slave connects again and resumes
streaming the WAL from where it left off;
* in case of extended network outage, the master could decide to
unsubscribe the slave once a certain time-out has elapsed;
* when the slave finds itself unsubscribed after a longer disconnection,
it could ask for a new base backup based on differences only... some
kind of built in rsync thingy;
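
The master-side bookkeeping implied by the steps above could look
roughly like this. It is only a sketch under assumptions: the class and
method names are invented, WAL positions are plain integers, and the
time-out policy (drop a slave that has not reported for longer than a
fixed number of seconds) is just one possible choice.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Subscription:
    confirmed_position: int  # last WAL position the slave reported as received
    last_seen: float         # time of the slave's last report, in seconds


class SubscriptionRegistry:
    """Hypothetical per-master record of subscribed slaves."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self._subs: Dict[str, Subscription] = {}

    def subscribe(self, slave_id: str, position: int, now: float) -> None:
        # Slave identifies itself and asks the master to keep WAL for it.
        self._subs[slave_id] = Subscription(position, now)

    def report(self, slave_id: str, position: int, now: float) -> bool:
        # Slave reports its state. False means it was unsubscribed in the
        # meantime and has to ask for a new base backup.
        sub = self._subs.get(slave_id)
        if sub is None:
            return False
        sub.confirmed_position = max(sub.confirmed_position, position)
        sub.last_seen = now
        return True

    def expire(self, now: float) -> List[str]:
        # Unsubscribe slaves that have been silent longer than the time-out.
        stale = [sid for sid, sub in self._subs.items()
                 if now - sub.last_seen > self.timeout_seconds]
        for sid in stale:
            del self._subs[sid]
        return stale

    def oldest_needed_position(self) -> Optional[int]:
        # WAL before this position is not needed by any subscribed slave.
        if not self._subs:
            return None
        return min(sub.confirmed_position for sub in self._subs.values())
```

The master would keep all WAL from oldest_needed_position() onwards, and
a slave that gets False back from report() knows it has to start over
with a (possibly differential) base backup.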

The only downside of this approach is that the slave machine needs a
full postgres superuser connection to the master. That could be a
security problem in certain scenarios. The master-centric scenario needs
a connection in the other direction, which might be seen as more secure,
I don't know for sure...

Cheers,
Csaba.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
