Wednesday, August 13, 2008

Re: [HACKERS] Transaction-controlled robustness for replication

On Tue, 2008-08-12 at 13:33 -0400, Bruce Momjian wrote:
> Simon Riggs wrote:
> > > > > with values of:
> > > > >
> > > > > nothing: have network traffic send WAL as needed
> > > > > netflush: wait for flush of WAL network packets to slave
> > > > > process: wait for slave to process WAL traffic and
> > > > > optionally fsync
> > > >
> > > > Suggest
> > > > async
> > > > syncnet
> > > > syncdisk
> > >
> > > I think the first two are fine, but 'syncdisk' might be wrong if the slave
> > > has 'synchronous_commit = off'. Any ideas?
> >
> > Yes, synchronous_commit can be set in the postgresql.conf, but its great
> > advantage is it is a userset parameter.
> >
> > The main point of the post is that the parameter would be transaction
> > controlled, so *must* be set in the transaction and thus *must* be set
> > on the master. Otherwise the capability is not available in the way I am
> > describing.
>
> Oh, so synchronous_commit would not control WAL sync on the slave? What
> about our fsync parameter? Because the slave is read-only, I saw no
> disadvantage of setting synchronous_commit to off in postgresql.conf on
> the slave.

The setting of synchronous_commit will be important if the standby
becomes the primary. I can see many cases where we might want "syncnet"
mode (i.e. no fsync of WAL data to disk on standby) and yet want
synchronous_commit=on when it becomes primary.

So if we were to use the same parameter for both purposes, it would be confusing.
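
To illustrate keeping the two settings apart, here is a minimal sketch. It is not PostgreSQL code, and every name in it (ServerConfig, CommitMessage, standby_must_fsync_wal) is hypothetical. It assumes the replication mode travels with each transaction from the master, while synchronous_commit stays a local setting that only matters once the standby is promoted.

```python
from dataclasses import dataclass


@dataclass
class ServerConfig:
    # Local commit behaviour; on a standby this only matters after promotion.
    synchronous_commit: bool


@dataclass
class CommitMessage:
    # Hypothetical message shipped from master to standby with a commit.
    wal_position: int
    # "async", "syncnet" or "syncdisk", chosen per transaction on the master.
    replication_mode: str


def standby_must_fsync_wal(msg: CommitMessage, standby: ServerConfig) -> bool:
    """Decide whether the standby fsyncs incoming WAL before acknowledging.

    The decision depends only on the mode sent by the master. The standby's
    own synchronous_commit is deliberately ignored here.
    """
    return msg.replication_mode == "syncdisk"


# A standby running in "syncnet" mode can still keep synchronous_commit = on,
# ready for the day it becomes the primary.
standby = ServerConfig(synchronous_commit=True)
needs_fsync = standby_must_fsync_wal(CommitMessage(100, "syncnet"), standby)
```

With one shared parameter, the "syncnet" standby above could not also be configured for synchronous_commit = on after failover.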

> > synchronous_commit applies to transaction commits. The code path would
> > be completely different here, so having parameter passed as an info byte
> > from master will not cause code structure problems or performance
> > problems.
>
> OK, I was just trying to simplify it.

I understood why you've had those thoughts and commend the lateral
thinking. I just don't think that on this occasion we've discovered any
better ways of doing it.

> The big problem with an async
> slave is that not only would you have lost data in a failover, but the
> database might be inconsistent, like fsync = off, which is something I
> think we want to try to avoid, which is why I was suggesting
> synchronous_commit = off.
>
> Or were you thinking of always doing fsync on the slave, no matter what.
> I am worried the slave might not be able to keep up (being
> single-threaded) and therefore we should allow a way to async commit on
> the slave.

I'm a bit confused here. I have not said I always want async, nor have I
said I always want sync.

The main thing is we agree there will be 3 settings, including two
variants of synchronous replication: one fairly safe and one ultra safe.
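
As a rough sketch of the three settings, using the names suggested earlier in the thread (async, syncnet, syncdisk): the enum and the function commit_may_return below are hypothetical and only model what the master waits for before a commit returns, under my reading of the descriptions above.

```python
from enum import Enum


class ReplicationMode(Enum):
    ASYNC = "async"        # send WAL as needed, never wait for the standby
    SYNCNET = "syncnet"    # wait until WAL has been flushed over the network
    SYNCDISK = "syncdisk"  # also wait until the standby has fsynced the WAL


def commit_may_return(mode: ReplicationMode,
                      wal_flushed_to_standby: bool,
                      wal_fsynced_on_standby: bool) -> bool:
    """Return True when the master may report the commit as complete."""
    if mode is ReplicationMode.ASYNC:
        return True
    if mode is ReplicationMode.SYNCNET:
        return wal_flushed_to_standby
    # SYNCDISK: the "ultra safe" variant needs both conditions.
    return wal_flushed_to_standby and wal_fsynced_on_standby
```

Here "fairly safe" corresponds to SYNCNET and "ultra safe" to SYNCDISK.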

For the ultra safe mode we really need to see how sync replication will
work before we comment on where we might introduce fsyncs. I'm presuming
that incoming WAL will be written to WAL files (and optionally fsynced).
You might be talking about applying WAL records to the database and then
fsyncing them, but we do need to allow for crash recovery of the standby
server, so the data must be synced to WAL files before it is synced to
the database.
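
The ordering constraint can be sketched as follows. This is a toy model with hypothetical names (Standby, receive_wal, sync_database), not a description of how the standby will actually be implemented. It tracks WAL positions as plain integers and refuses to sync database changes whose WAL has not yet been fsynced.

```python
class Standby:
    """Toy model of a standby that tracks WAL positions as integers."""

    def __init__(self) -> None:
        self.wal_written = 0  # highest position written to WAL files
        self.wal_synced = 0   # highest position fsynced in WAL files
        self.db_synced = 0    # highest position synced to database files

    def receive_wal(self, upto: int, fsync: bool) -> None:
        # Incoming WAL is always written to WAL files, optionally fsynced.
        self.wal_written = max(self.wal_written, upto)
        if fsync:
            self.wal_synced = self.wal_written

    def sync_database(self, upto: int) -> None:
        # Crash recovery of the standby relies on WAL being durable first.
        if upto > self.wal_synced:
            raise RuntimeError("WAL must be synced before database files")
        self.db_synced = max(self.db_synced, upto)


standby = Standby()
standby.receive_wal(10, fsync=False)  # written, but not yet durable
standby.receive_wal(10, fsync=True)   # now durable in WAL files
standby.sync_database(10)             # allowed only after the WAL fsync
```

Calling sync_database(10) before the second receive_wal call would raise, which is the point of the rule.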

> Certainly if the master is async sending the data, there is
> no need to do a synchronous_commit on the slave.

Agreed.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


