Saturday, May 31, 2008

[HACKERS] synchronized scans for VACUUM

Previous thread for reference:

http://archives.postgresql.org/pgsql-patches/2007-06/msg00096.php

The objections to synchronized scans for VACUUM raised in that thread,
in summary:

1. vacuum sometimes progresses faster than a regular heapscan, because
it doesn't need to check WHERE clauses, etc.

2. vacuum takes breaks from the scan to clean up the indexes when it
runs out of maintenance_work_mem.

3. vacuum takes breaks for the cost delay.

4. vacuum will dirty a lot of the blocks as it goes, and that will
interact in some way with the ring buffer.
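
To make objections 2 and 3 concrete, here is a toy model of a vacuum
falling behind a plain scan of the same table. It is a sketch in Python,
not PostgreSQL code: the function name, its parameters, and the
one-block-per-tick cost model are all assumptions made for illustration.
max_dead_tuples stands in for the limit imposed by maintenance_work_mem,
and cleanup_ticks for the length of one index cleanup pass (a cost-delay
nap could be modeled the same way).

```python
def blocks_behind(nblocks, dead_per_block, max_dead_tuples, cleanup_ticks):
    """Return how far a vacuum trails a plain scan when the plain scan ends.

    Toy model (hypothetical, not PostgreSQL's algorithm): both scans start
    at block 0 and read one block per tick. The vacuum stops reading for
    cleanup_ticks ticks whenever its dead-tuple list fills up.
    """
    vacuum_pos = 0   # number of blocks the vacuum has read so far
    collected = 0    # dead tuples remembered since the last cleanup
    pause_left = 0   # remaining ticks of the current cleanup pause

    # The plain scan reads one block per tick, so it finishes in nblocks ticks.
    for _tick in range(nblocks):
        if pause_left > 0:
            pause_left -= 1          # vacuum is busy cleaning indexes
            continue
        vacuum_pos += 1
        collected += dead_per_block
        if collected >= max_dead_tuples:
            collected = 0
            pause_left = cleanup_ticks
    return nblocks - vacuum_pos


if __name__ == "__main__":
    # The list fills every 100 blocks and each cleanup costs 50 ticks.
    print(blocks_behind(1000, 10, 1000, 50))
```

With these made-up numbers the vacuum ends 300 blocks behind the plain
scan. With cleanup_ticks set to 0, or a list large enough never to fill,
the two stay together.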

I'd like to address these one by one to see what problems are really in
our way:

1. This would mean that it's not an I/O-limited scan. I think as long as
we're talking about regular table scans that can benefit from
synchronized scanning, a vacuum of the same table would also benefit. A
microbenchmark could show whether any benefit exists.

2. There have been suggestions about a more compact representation for
the tuple ID list. If that works, vacuum would need far fewer index
cleanup passes, which would largely solve this problem.

3. Offering synchronized vacuums could reduce the need for these
elective pauses.

4. This probably has more to do with the buffer ring than synchronized
scans. There could be some bad interaction there, but I don't see that
it's clearly bad.
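
As an illustration of point 2, here is one possible shape for a more
compact tuple ID list: a fixed-size bitmap of line pointer offsets per
heap block, compared with a flat array of 6-byte tuple IDs. This is a
sketch in Python, not the representation proposed in any patch. The
function names are hypothetical, and the figures of 291 offsets per
block (8 kB pages) and 4 bytes per block number are assumptions.

```python
from collections import defaultdict

TID_BYTES = 6  # one (block, offset) tuple ID in a flat array


def flat_list_bytes(tids):
    """Memory used by a flat array of tuple IDs."""
    return TID_BYTES * len(tids)


def to_block_bitmaps(tids):
    """Group (block, offset) pairs into one integer bitmap per block."""
    bitmaps = defaultdict(int)
    for block, offset in tids:
        bitmaps[block] |= 1 << offset
    return dict(bitmaps)


def bitmap_bytes(bitmaps, max_offsets_per_block=291, block_key_bytes=4):
    """Estimated memory if each block stores a fixed-size bitmap.

    Both defaults are assumptions for this sketch, not measured values.
    """
    per_block = block_key_bytes + (max_offsets_per_block + 7) // 8
    return per_block * len(bitmaps)


def is_dead(bitmaps, block, offset):
    """Lookup that index cleanup would perform for each index entry."""
    return bool((bitmaps.get(block, 0) >> offset) & 1)


if __name__ == "__main__":
    # 100 blocks with 50 dead tuples each (offsets start at 1).
    tids = [(b, o) for b in range(100) for o in range(1, 51)]
    bitmaps = to_block_bitmaps(tids)
    print(flat_list_bytes(tids), bitmap_bytes(bitmaps))
```

Under these assumptions the flat array needs 30000 bytes and the bitmaps
4100. The fixed-size bitmap only wins at roughly seven or more dead
tuples per block, so a sparse table would need a different encoding.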

Additionally, with the possible exception of #4, I don't see the
situation being worse than it is currently.

Thoughts?

Regards,
Jeff Davis


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
