Thursday, September 11, 2008

Re: [GENERAL] "Healing" a table after massive updates

In response to Alvaro Herrera <alvherre@commandprompt.com>:

> Bill Moran wrote:
> > In response to "Gauthier, Dave" <dave.gauthier@intel.com>:
> >
> > > I might be able to answer my own question...
> > >
> > > vacuum FULL (analyze is optional)
> >
> > CLUSTER _may_ be a better choice, but carefully read the docs regarding
> > it's drawbacks first. You may want to do some benchmarks to see if it's
> > really needed before you commit to it as a scheduled operation.
>
> What drawbacks?

There's the whole "there will be two copies of the table on-disk" thing
that could be an issue if it's a large table.

Depending on the version of PG in use, there are warnings about tuple
visibility during CLUSTER. It seems as if most of these have been
removed for 8.3.

And while this applies to VACUUM FULL as well, it might just be a bad
idea. If the number of rows inserted isn't a significant increase of
the overall size of the table, he may find that overall performance is
better if he uses plain old VACUUM so that the FSM stays at a stable
size.

Some of this is dependent on PG version, which the OP neglected to mention.
Other stuff is dependent on what kind of maintenance window he has, which
was also not mentioned. Other stuff id dependent on various various
details of the actual process, which (looking back through the thread)
I may have misunderstood. I took the post to mean that he was loading
additional data into a pre-existing table, which is a difference scenario
than loading up a virgin table.

So, you know, standard disclaimer about YMMV, which it seems that _I_
worded poorly.

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

No comments: