PostgreSQL: Re: [HACKERS] Transaction-controlled robustness for replication

Tuesday, July 22, 2008

Re: [HACKERS] Transaction-controlled robustness for replication

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Simon Riggs wrote:
> Asynchronous commit controls whether we go to disk at time of commit, or
> whether we defer this slightly. We have the same options with
> replication: do we replicate at time of commit, or do we defer this
> slightly for performance reasons. DRBD and other replication systems
> show us that there is actually another difference when talking about
> synchronous replication: do we go to disk on the standby before
> acknowledging the primary?
>
> We can generalise this as three closed questions, answered either Yes
> (Synchronous) or No (Asynchronous)
>
> * Does WAL get forced to disk on primary at commit time?
> * Does WAL get forced across link to standby at commit time?
> * Does WAL get forced to disk on standby at commit time?
* Does WAL get applied [and synced] to disk on standby at commit time?
This is important if you want to use the standby as a read-only.
I am slightly confused about what the fsync setting does to all this, hence
the brackets.

I think that questions 2 and 3 are trivially bundled together. Once the
user can specify 2, implementing 3 should be trivial and vice versa.
I am not even convinced that these need to be two different parameters.
Also please note that an answer of "yes" to 3 means that 2 must also
be answered "yes".

> We could represent this with 3 parameters:
> synchronous_commit = on | off
> synchronous_standby_transfer = on | off
> synchronous_standby_wal_fsync = on | off
synchronous_standby_apply = on | off # just to propose a name

> Changing the parameter setting at transaction-level would be expensive
> if we had to set three parameters.
What exactly does "expensive" mean? All three parameters can probably be set
in one TCP packet from client to server.

> Or we could use just a single parameter
> synchronous_commit = 'AAA', 'SAA', 'SSA', 'SSS' or on |off when no
> log-based replication is defined
>
> Having the ability to set these at the transaction-level would be very
> cool. Having it set via a *single* parameter would make it much more
> viable to switch between AAA for bulk, low importance data and SSS for
> very low volume, critical data, or somewhere in between on the same
> server, at the same time.
The problem with a single parameter is that everything becomes position
dependent and if whyever a new parameter is introduced, it's not easy to
upgrade old application code.

> So proposal in summary is
> * allow various modes of synchronous replication for perf/robustness
> * allow modes to be specified per-transaction
> * allow modes to be specified as a single parameter
How about creating named modes? This would give the user the ability to
define more fine-grained control especially in larger clusters of fail-over/read-only
servers without totally clogging the parameter space and application code.
Whether this should be done SQL-style or in some config file is not so clear to me,
although I'd prefer SQL-style like

CREATE SYNCHRONIZING MODE immediate_readonly AS
LOCAL SYNCHRONOUS APPLY
192.168.0.10 SYNCHRONOUS APPLY -- read-only slave
192.168.0.11 SYNCHRONOUS APPLY -- read-only slave
192.168.0.20 SYNCHRONOUS SHIP -- backup-server
192.168.0.21 SYNCHRONOUS SHIP -- backup-server
192.168.0.30 SYNHCRONOUS FSYNC -- backup-server with fast disks
;

and then something like

synchronize_mode = immediate_readonly;

Yeah, I know, give patches not pipe-dreams :)

Regards,
Jens-Wolfhard Schicke
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFIhoAdzhchXT4RR5ARAo/6AJ9R6LA0TsPvD/TBy6Bh1q0q5JvyKQCbBycx
1CKc8dqxnlvmH/hbi1Px+v8=
=l5P4
-----END PGP SIGNATURE-----

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

PostgreSQL

Tuesday, July 22, 2008

Re: [HACKERS] Transaction-controlled robustness for replication

No comments:

Blog Archive