Sunday, September 21, 2008

Re: [HACKERS] parallel pg_restore

Andrew Dunstan writes:
> I am working on getting parallel pg_restore working. I'm currently
> getting all the scaffolding working, and hope to have a naive prototype
> posted within about a week.

> The major question is how to choose the restoration order so as to
> maximize efficiency both on the server and in reading the archive.

One of the first software design principles I ever learned was to
separate policy from mechanism. ISTM in this first cut you ought to
concentrate on mechanism and let the policy just be something dumb
(but coded separately from the infrastructure). We can refine it after
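
A minimal sketch of that separation, in Python rather than pg_restore's C. All names here (run_restore, dumb_policy, the item names) are hypothetical and not taken from any pg_restore code. The mechanism only tracks which items have their dependencies satisfied; the policy is a separate function that picks among them and can be swapped without touching the loop.

```python
from typing import Callable, Dict, List, Set

# Policy: given the items that are ready to restore, pick the next one.
Policy = Callable[[List[str]], str]


def dumb_policy(ready: List[str]) -> str:
    """Deliberately naive policy: take the first ready item."""
    return ready[0]


def run_restore(deps: Dict[str, Set[str]], policy: Policy) -> List[str]:
    """Mechanism: track readiness and delegate every choice to the policy.

    deps maps each item to the set of items it depends on (hypothetical
    representation, assumed for this sketch).
    """
    done: List[str] = []
    done_set: Set[str] = set()
    pending = {item: set(d) for item, d in deps.items()}
    while pending:
        # An item is ready once everything it depends on has been restored.
        ready = sorted(i for i, d in pending.items() if d <= done_set)
        if not ready:
            raise ValueError("dependency cycle or missing item")
        chosen = policy(ready)
        del pending[chosen]
        done.append(chosen)
        done_set.add(chosen)
    return done
```

Replacing dumb_policy with something smarter (say, largest table first) changes the order but never the dependency bookkeeping.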

> Another question is what we should do if the user supplies an explicit
> order with --use-list. I'm inclined to say we should stick strictly with
> the supplied order. Or maybe that should be an option.

Hmm. I think --use-list is used more for selecting a subset of items
to restore than for forcing a nondefault restore order. Forcing the
order used to be a major purpose, but that was years ago before we
had the dependency-driven-restore-order code working. So I'd vote that
the default behavior is to still allow parallel restore when this option
is used, and we should provide an orthogonal option that disables use of
parallel restore.

You'd really want the latter anyway for some cases, i.e., when you don't
want the restore trying to hog the machine. Maybe the right form for
the extra option is just a limit on how many connections to use. Set it
to one to force the exact restore order, and to other values to throttle
how much of the machine the restore tries to eat.
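
To illustrate the connection-limit idea, here is a rough Python sketch using a thread pool as a stand-in for database connections. The function name restore_with_limit and its parameters are assumptions made for this example, and it treats the items as independent of each other; a real implementation would combine the limit with dependency tracking.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def restore_with_limit(
    items: List[str],
    restore_one: Callable[[str], None],
    max_connections: int,
) -> List[str]:
    """Run restore_one on each item using at most max_connections workers.

    Returns the items in the order they finished. Each worker thread stands
    in for one database connection (an assumption of this sketch).
    """
    if max_connections < 1:
        raise ValueError("max_connections must be at least 1")

    completed: List[str] = []
    lock = threading.Lock()

    def worker(item: str) -> None:
        restore_one(item)
        with lock:
            completed.append(item)

    # Items are queued in list order; with one worker they also run and
    # finish in exactly that order.
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        futures = [pool.submit(worker, item) for item in items]
        for future in futures:
            future.result()  # re-raise any error from a worker
    return completed
```

With max_connections set to 1 the supplied order is followed exactly; larger values only bound how many items are in flight at once.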

One problem here though is that you'd need to be sure you behave sanely
when there is a dependency chain passing through an object that's not to
be restored. The ordering of the rest of the chain still ought to honor
the dependencies I think.
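
One way to picture this, as a sketch only: when an item is left out of the restore, the selected items that depended on it inherit its dependencies, so the ordering constraint survives the gap. The function effective_deps and the dict-of-sets representation are hypothetical, chosen for this Python example.

```python
from typing import Dict, Set


def effective_deps(
    deps: Dict[str, Set[str]], selected: Set[str]
) -> Dict[str, Set[str]]:
    """For each selected item, find the selected items it must wait for,
    looking through any skipped items in between."""
    result: Dict[str, Set[str]] = {}
    for item in selected:
        found: Set[str] = set()
        seen: Set[str] = set()
        stack = list(deps.get(item, ()))
        while stack:
            dep = stack.pop()
            if dep in seen:
                continue
            seen.add(dep)
            if dep in selected:
                # A selected dependency carries its own constraints,
                # so there is no need to walk past it.
                found.add(dep)
            else:
                # Skipped item: pass its dependencies through.
                stack.extend(deps.get(dep, ()))
        result[item] = found
    return result
```

For a chain c -> b -> a where b is not restored, c still ends up ordered after a.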

regards, tom lane
