Sunday, July 20, 2008

Re: [PATCHES] pg_dump additional options for performance

* daveg (daveg@sonic.net) wrote:
> One observation, indexes should be built right after the table data
> is loaded for each table, this way, the index build gets a hot cache
> for the table data instead of having to re-read it later as we do now.

That's not how pg_dump has traditionally worked, and the point of this
patch is to add options to easily segregate the main pieces of the
existing pg_dump output (main schema definition, data dump, key/index
building). Your suggestion does bring up an interesting point, though:
if pg_dump's traditional output structure were to change, the
"--schema-post-load" set of objects wouldn't be as clear to newcomers,
since the data load and the index builds would be interleaved in the
regular output.
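
To make the difference between the two orderings concrete, here is a
rough sketch. It is not pg_dump code: the table and index names and the
helper functions are all made up for illustration, and it only models
the order in which steps would appear in the output.

```python
from itertools import groupby

# Hypothetical dump contents: table name -> index names (all invented).
TABLES = {
    "orders": ["orders_pkey", "orders_customer_idx"],
    "customers": ["customers_pkey"],
}

def traditional_order(tables):
    """All table definitions, then all data, then all indexes."""
    steps = [("schema", t) for t in tables]
    steps += [("data", t) for t in tables]
    steps += [("index", i) for t in tables for i in tables[t]]
    return steps

def interleaved_order(tables):
    """All table definitions, then for each table its data followed
    immediately by its indexes (the ordering suggested above)."""
    steps = [("schema", t) for t in tables]
    for t, indexes in tables.items():
        steps.append(("data", t))
        steps += [("index", i) for i in indexes]
    return steps

def is_segregated(steps):
    """True if each kind of step forms one contiguous block."""
    kinds = [kind for kind, _ in groupby(k for k, _ in steps)]
    return len(kinds) == len(set(kinds))

if __name__ == "__main__":
    for name, order in (("traditional", traditional_order(TABLES)),
                        ("interleaved", interleaved_order(TABLES))):
        print(name, "segregated:", is_segregated(order))
        for kind, obj in order:
            print("   ", kind, obj)
```

In the traditional ordering every kind of step sits in one block, which
is what makes a clean three-way split possible; in the interleaved
ordering the data and index steps alternate, so there is no single
"post-load" block left in the regular output.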

I'd be curious about the performance impact this has on an actual load
too. It would probably be more valuable on smaller loads, where it
matters less anyway, than on loads larger than the cache size.
Still, not an issue for this patch, imv.

Thanks,

Stephen
