Friday, May 30, 2008

Re: [HACKERS] Core team statement on replication in PostgreSQL

On 5/30/08, Gurjeet Singh <singh.gurjeet@gmail.com> wrote:
> On Fri, May 30, 2008 at 10:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > But since you mention it: one of the plausible answers for fixing the
> > vacuum problem for read-only slaves is to have the slaves push an xmin
> > back upstream to the master to prevent premature vacuuming. The current
> > design of pg_standby is utterly incapable of handling that requirement.
> > So there might be an implementation dependency there, depending on how
> > we want to solve that problem.
>
> I think it would be best to not make the slave interfere with the master's
> operations; that's only going to increase the operational complexity of such
> a solution.

I disagree - it's better to consider syncronized WAL-slaves
as equal to master, so having queries there affect master is ok.

You need to remeber this solution tries not to replace 100-node Slony-I
setups. You can run sanity checks on slaves or use them to load-balance
read-only OLTP queries, but not random stuff.

> There could be multiple slaves following a master, some serving
> data-warehousing queries, some for load-balancing reads, some others just
> for disaster recovery, and then some just to mitigate human errors by
> re-applying the logs with a delay.

To run warehousing queries you better use Slony-I / Londiste. For
warehousring you want different / more indexes on tables anyway,
so I think it's quite ok to say "don't do it" for complex queries
on WAL-slaves.

> I don't think any one installation would see all of the above mentioned
> scenarios, but we need to take care of multiple slaves operating off of a
> single master; something similar to cascaded Slony-I.

Again, the synchronized WAL replication is not generic solution.
Use Slony/Londiste if you want to get totally independent slaves.

Thankfully the -core has set concrete and limited goals,
that means it is possible to see working code in reasonable time.
I think that should apply to read-only slaves too.

If we try to make it handle any load, it will not be finished in any time.

Now if we limit the scope I've seen 2 variants thus far:

1) Keep slave max in sync, let the load there affect master (xmin).
- Slave can be used to load-balance OLTP load
- Slave should not be used for complex queries.

2) If long query is running, let slave lag (avoid applying WAL data).
- Slave cannot be used to load-balance OLTP load
- Slave can be used for complex queries (although no new indexes
or temp tables can be created).

I think 1) is more important (and more easily implementable) case.

For 2) we already have solutions (Slony/Londiste/Bucardo, etc)
so there is no point to make effort to solve this here.

--
marko

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

No comments: