Saturday, July 19, 2008

Re: [PATCHES] pg_dump additional options for performance

Index: doc/src/sgml/ref/pg_dump.sgml
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/doc/src/sgml/ref/pg_dump.sgml,v
retrieving revision 1.102
diff -c -r1.102 pg_dump.sgml
*** doc/src/sgml/ref/pg_dump.sgml 13 Apr 2008 03:49:21 -0000 1.102
--- doc/src/sgml/ref/pg_dump.sgml 20 Jul 2008 06:33:30 -0000
***************
*** 133,139 ****
<para>
Include large objects in the dump. This is the default behavior
except when <option>--schema</>, <option>--table</>, or
! <option>--schema-only</> is specified, so the <option>-b</>
switch is only useful to add large objects to selective dumps.
</para>
</listitem>
--- 133,140 ----
<para>
Include large objects in the dump. This is the default behavior
except when <option>--schema</>, <option>--table</>, or
! <option>--schema-only</> or <option>--schema-pre-load</> or
! <option>--schema-post-load</> is specified, so the <option>-b</>
switch is only useful to add large objects to selective dumps.
</para>
</listitem>
***************
*** 443,448 ****
--- 444,471 ----
</varlistentry>

<varlistentry>
+ <term><option>--schema-pre-load</option></term>
+ <listitem>
+ <para>
+ Dump only the object definitions (schema) required to load data. Dumps
+ exactly what <option>--schema-only</> would dump, but only those
+ statements before the data load.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>--schema-post-load</option></term>
+ <listitem>
+ <para>
+ Dump only the object definitions (schema) required after data has been
+ loaded. Dumps exactly what <option>--schema-only</> would dump, but
+ only those statements after the data load.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>-S <replaceable class="parameter">username</replaceable></option></term>
<term><option>--superuser=<replaceable class="parameter">username</replaceable></option></term>
<listitem>
***************
*** 774,779 ****
--- 797,830 ----
</para>

<para>
+ The output of pg_dump can be notionally divided into three parts:
+ <itemizedlist>
+ <listitem>
+ <para>
+ Pre-Schema - objects required before data loading, such as
+ <command>CREATE TABLE</command>.
+ This part can be requested using <option>--schema-pre-load</>.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Table Data - data can be requested using <option>--data-only</>.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Post-Schema - objects required after data loading, such as
+ <command>ALTER TABLE</command> and <command>CREATE INDEX</command>.
+ This part can be requested using <option>--schema-post-load</>.
+ </para>
+ </listitem>
+ </itemizedlist>
+ This allows us to work more easily with large data dump files when
+ there is some need to edit commands or resequence their execution for
+ performance.
+ </para>
+
+ <para>
Because <application>pg_dump</application> is used to transfer data
to newer versions of <productname>PostgreSQL</>, the output of
<application>pg_dump</application> can be loaded into
Index: doc/src/sgml/ref/pg_restore.sgml
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/doc/src/sgml/ref/pg_restore.sgml,v
retrieving revision 1.75
diff -c -r1.75 pg_restore.sgml
*** doc/src/sgml/ref/pg_restore.sgml 13 Apr 2008 03:49:21 -0000 1.75
--- doc/src/sgml/ref/pg_restore.sgml 20 Jul 2008 06:33:18 -0000
***************
*** 321,326 ****
--- 321,350 ----
</varlistentry>

<varlistentry>
+ <term><option>--schema-post-load</option></term>
+ <listitem>
+ <para>
+ Dump only the object definitions (schema) required after data has been
+ loaded. Dumps exactly what <option>--schema-only</> would dump, but
+ only those statements after the data load.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-S <replaceable class="parameter">username</replaceable></option></term>
+ <term><option>--superuser=<replaceable class="parameter">username</replaceable></option></term>
+ <listitem>
+ <para>
+ Specify the superuser user name to use when disabling triggers.
+ This is only relevant if <option>--disable-triggers</> is used.
+ (Usually, it's better to leave this out, and instead start the
+ resulting script as superuser.)
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>-S <replaceable class="parameter">username</replaceable></option></term>
<term><option>--superuser=<replaceable class="parameter">username</replaceable></option></term>
<listitem>
***************
*** 572,577 ****
--- 596,629 ----
</para>

<para>
+ The actions of pg_restore can be notionally divided into three parts:
+ <itemizedlist>
+ <listitem>
+ <para>
+ Pre-Schema - objects required before data loading, such as
+ <command>CREATE TABLE</command>.
+ This part can be requested using <option>--schema-pre-load</>.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Table Data - data can be requested using <option>--data-only</>.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Post-Schema - objects required after data loading, such as
+ <command>ALTER TABLE</command> and <command>CREATE INDEX</command>.
+ This part can be requested using <option>--schema-post-load</>.
+ </para>
+ </listitem>
+ </itemizedlist>
+ This allows us to work more easily with large data dump files when
+ there is some need to edit commands or resequence their execution for
+ performance.
+ </para>
+
+ <para>
The limitations of <application>pg_restore</application> are detailed below.

<itemizedlist>
Index: src/bin/pg_dump/pg_backup.h
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/bin/pg_dump/pg_backup.h,v
retrieving revision 1.47
diff -c -r1.47 pg_backup.h
*** src/bin/pg_dump/pg_backup.h 13 Apr 2008 03:49:21 -0000 1.47
--- src/bin/pg_dump/pg_backup.h 20 Jul 2008 05:19:34 -0000
***************
*** 89,95 ****
int use_setsessauth;/* Use SET SESSION AUTHORIZATION commands
* instead of OWNER TO */
char *superuser; /* Username to use as superuser */
! int dataOnly;
int dropSchema;
char *filename;
int schemaOnly;
--- 89,95 ----
int use_setsessauth;/* Use SET SESSION AUTHORIZATION commands
* instead of OWNER TO */
char *superuser; /* Username to use as superuser */
! int dumpObjFlags; /* which objects types to dump */
int dropSchema;
char *filename;
int schemaOnly;
Index: src/bin/pg_dump/pg_backup_archiver.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/bin/pg_dump/pg_backup_archiver.c,v
retrieving revision 1.157
diff -c -r1.157 pg_backup_archiver.c
*** src/bin/pg_dump/pg_backup_archiver.c 4 May 2008 08:32:21 -0000 1.157
--- src/bin/pg_dump/pg_backup_archiver.c 20 Jul 2008 05:19:34 -0000
***************
*** 56,62 ****
static void _selectTablespace(ArchiveHandle *AH, const char *tablespace);
static void processEncodingEntry(ArchiveHandle *AH, TocEntry *te);
static void processStdStringsEntry(ArchiveHandle *AH, TocEntry *te);
! static teReqs _tocEntryRequired(TocEntry *te, RestoreOptions *ropt, bool include_acls);
static void _disableTriggersIfNecessary(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt);
static void _enableTriggersIfNecessary(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt);
static TocEntry *getTocEntryByDumpId(ArchiveHandle *AH, DumpId id);
--- 56,62 ----
static void _selectTablespace(ArchiveHandle *AH, const char *tablespace);
static void processEncodingEntry(ArchiveHandle *AH, TocEntry *te);
static void processStdStringsEntry(ArchiveHandle *AH, TocEntry *te);
! static int _tocEntryRequired(TocEntry *te, RestoreOptions *ropt, bool include_acls);
static void _disableTriggersIfNecessary(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt);
static void _enableTriggersIfNecessary(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt);
static TocEntry *getTocEntryByDumpId(ArchiveHandle *AH, DumpId id);
***************
*** 129,135 ****
{
ArchiveHandle *AH = (ArchiveHandle *) AHX;
TocEntry *te;
! teReqs reqs;
OutputContext sav;
bool defnDumped;

--- 129,135 ----
{
ArchiveHandle *AH = (ArchiveHandle *) AHX;
TocEntry *te;
! int reqs;
OutputContext sav;
bool defnDumped;

***************
*** 175,193 ****
* Work out if we have an implied data-only restore. This can happen if
* the dump was data only or if the user has used a toc list to exclude
* all of the schema data. All we do is look for schema entries - if none
! * are found then we set the dataOnly flag.
*
! * We could scan for wanted TABLE entries, but that is not the same as
! * dataOnly. At this stage, it seems unnecessary (6-Mar-2001).
*/
! if (!ropt->dataOnly)
{
int impliedDataOnly = 1;

for (te = AH->toc->next; te != AH->toc; te = te->next)
{
reqs = _tocEntryRequired(te, ropt, true);
! if ((reqs & REQ_SCHEMA) != 0)
{ /* It's schema, and it's wanted */
impliedDataOnly = 0;
break;
--- 175,193 ----
* Work out if we have an implied data-only restore. This can happen if
* the dump was data only or if the user has used a toc list to exclude
* all of the schema data. All we do is look for schema entries - if none
! * are found then say we only want DATA type objects.
*
! * We could scan for wanted TABLE entries, but that is not the same.
! * At this stage, it seems unnecessary (6-Mar-2001).
*/
! if (!WANT_DATA(ropt->dumpObjFlags))
{
int impliedDataOnly = 1;

for (te = AH->toc->next; te != AH->toc; te = te->next)
{
reqs = _tocEntryRequired(te, ropt, true);
! if (WANT_PRE_SCHEMA(reqs) || WANT_POST_SCHEMA(reqs))
{ /* It's schema, and it's wanted */
impliedDataOnly = 0;
break;
***************
*** 195,201 ****
}
if (impliedDataOnly)
{
! ropt->dataOnly = impliedDataOnly;
ahlog(AH, 1, "implied data-only restore\n");
}
}
--- 195,201 ----
}
if (impliedDataOnly)
{
! ropt->dumpObjFlags = REQ_DATA;
ahlog(AH, 1, "implied data-only restore\n");
}
}
***************
*** 236,242 ****
AH->currentTE = te;

reqs = _tocEntryRequired(te, ropt, false /* needn't drop ACLs */ );
! if (((reqs & REQ_SCHEMA) != 0) && te->dropStmt)
{
/* We want the schema */
ahlog(AH, 1, "dropping %s %s\n", te->desc, te->tag);
--- 236,242 ----
AH->currentTE = te;

reqs = _tocEntryRequired(te, ropt, false /* needn't drop ACLs */ );
! if (((reqs & REQ_PRE_SCHEMA) != 0) && te->dropStmt)
{
/* We want the schema */
ahlog(AH, 1, "dropping %s %s\n", te->desc, te->tag);
***************
*** 278,284 ****
/* Dump any relevant dump warnings to stderr */
if (!ropt->suppressDumpWarnings && strcmp(te->desc, "WARNING") == 0)
{
! if (!ropt->dataOnly && te->defn != NULL && strlen(te->defn) != 0)
write_msg(modulename, "warning from original dump file: %s\n", te->defn);
else if (te->copyStmt != NULL && strlen(te->copyStmt) != 0)
write_msg(modulename, "warning from original dump file: %s\n", te->copyStmt);
--- 278,284 ----
/* Dump any relevant dump warnings to stderr */
if (!ropt->suppressDumpWarnings && strcmp(te->desc, "WARNING") == 0)
{
! if (!WANT_DATA(ropt->dumpObjFlags) && te->defn != NULL && strlen(te->defn) != 0)
write_msg(modulename, "warning from original dump file: %s\n", te->defn);
else if (te->copyStmt != NULL && strlen(te->copyStmt) != 0)
write_msg(modulename, "warning from original dump file: %s\n", te->copyStmt);
***************
*** 286,292 ****

defnDumped = false;

! if ((reqs & REQ_SCHEMA) != 0) /* We want the schema */
{
ahlog(AH, 1, "creating %s %s\n", te->desc, te->tag);

--- 286,293 ----

defnDumped = false;

! if ((WANT_PRE_SCHEMA(reqs) && WANT_PRE_SCHEMA(ropt->dumpObjFlags)) ||
! (WANT_POST_SCHEMA(reqs) && WANT_POST_SCHEMA(ropt->dumpObjFlags))) /* We want the schema */
{
ahlog(AH, 1, "creating %s %s\n", te->desc, te->tag);

***************
*** 331,337 ****
/*
* If we have a data component, then process it
*/
! if ((reqs & REQ_DATA) != 0)
{
/*
* hadDumper will be set if there is genuine data component for
--- 332,338 ----
/*
* If we have a data component, then process it
*/
! if (WANT_DATA(reqs))
{
/*
* hadDumper will be set if there is genuine data component for
***************
*** 343,349 ****
/*
* If we can output the data, then restore it.
*/
! if (AH->PrintTocDataPtr !=NULL && (reqs & REQ_DATA) != 0)
{
#ifndef HAVE_LIBZ
if (AH->compression != 0)
--- 344,350 ----
/*
* If we can output the data, then restore it.
*/
! if (AH->PrintTocDataPtr !=NULL && WANT_DATA(reqs))
{
#ifndef HAVE_LIBZ
if (AH->compression != 0)
***************
*** 415,421 ****
/* Work out what, if anything, we want from this entry */
reqs = _tocEntryRequired(te, ropt, true);

! if ((reqs & REQ_SCHEMA) != 0) /* We want the schema */
{
ahlog(AH, 1, "setting owner and privileges for %s %s\n",
te->desc, te->tag);
--- 416,422 ----
/* Work out what, if anything, we want from this entry */
reqs = _tocEntryRequired(te, ropt, true);

! if (WANT_PRE_SCHEMA(reqs)) /* We want the schema */
{
ahlog(AH, 1, "setting owner and privileges for %s %s\n",
te->desc, te->tag);
***************
*** 473,479 ****
_disableTriggersIfNecessary(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt)
{
/* This hack is only needed in a data-only restore */
! if (!ropt->dataOnly || !ropt->disable_triggers)
return;

ahlog(AH, 1, "disabling triggers for %s\n", te->tag);
--- 474,480 ----
_disableTriggersIfNecessary(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt)
{
/* This hack is only needed in a data-only restore */
! if (!WANT_DATA(ropt->dumpObjFlags) || !ropt->disable_triggers)
return;

ahlog(AH, 1, "disabling triggers for %s\n", te->tag);
***************
*** 499,505 ****
_enableTriggersIfNecessary(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt)
{
/* This hack is only needed in a data-only restore */
! if (!ropt->dataOnly || !ropt->disable_triggers)
return;

ahlog(AH, 1, "enabling triggers for %s\n", te->tag);
--- 500,506 ----
_enableTriggersIfNecessary(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt)
{
/* This hack is only needed in a data-only restore */
! if (!WANT_DATA(ropt->dumpObjFlags) || !ropt->disable_triggers)
return;

ahlog(AH, 1, "enabling triggers for %s\n", te->tag);
***************
*** 1321,1327 ****
return NULL;
}

! teReqs
TocIDRequired(ArchiveHandle *AH, DumpId id, RestoreOptions *ropt)
{
TocEntry *te = getTocEntryByDumpId(AH, id);
--- 1322,1328 ----
return NULL;
}

! int
TocIDRequired(ArchiveHandle *AH, DumpId id, RestoreOptions *ropt)
{
TocEntry *te = getTocEntryByDumpId(AH, id);
***************
*** 2026,2035 ****
te->defn);
}

! static teReqs
_tocEntryRequired(TocEntry *te, RestoreOptions *ropt, bool include_acls)
{
! teReqs res = REQ_ALL;

/* ENCODING and STDSTRINGS items are dumped specially, so always reject */
if (strcmp(te->desc, "ENCODING") == 0 ||
--- 2027,2036 ----
te->defn);
}

! static int
_tocEntryRequired(TocEntry *te, RestoreOptions *ropt, bool include_acls)
{
! int res = ropt->dumpObjFlags;

/* ENCODING and STDSTRINGS items are dumped specially, so always reject */
if (strcmp(te->desc, "ENCODING") == 0 ||
***************
*** 2109,2125 ****
if ((strcmp(te->desc, "<Init>") == 0) && (strcmp(te->tag, "Max OID") == 0))
return 0;

- /* Mask it if we only want schema */
- if (ropt->schemaOnly)
- res = res & REQ_SCHEMA;
-
- /* Mask it we only want data */
- if (ropt->dataOnly)
- res = res & REQ_DATA;
-
/* Mask it if we don't have a schema contribution */
if (!te->defn || strlen(te->defn) == 0)
! res = res & ~REQ_SCHEMA;

/* Finally, if there's a per-ID filter, limit based on that as well */
if (ropt->idWanted && !ropt->idWanted[te->dumpId - 1])
--- 2110,2118 ----
if ((strcmp(te->desc, "<Init>") == 0) && (strcmp(te->tag, "Max OID") == 0))
return 0;

/* Mask it if we don't have a schema contribution */
if (!te->defn || strlen(te->defn) == 0)
! res = res & ~(REQ_PRE_SCHEMA | REQ_POST_SCHEMA);

/* Finally, if there's a per-ID filter, limit based on that as well */
if (ropt->idWanted && !ropt->idWanted[te->dumpId - 1])
Index: src/bin/pg_dump/pg_backup_archiver.h
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/bin/pg_dump/pg_backup_archiver.h,v
retrieving revision 1.76
diff -c -r1.76 pg_backup_archiver.h
*** src/bin/pg_dump/pg_backup_archiver.h 7 Nov 2007 12:24:24 -0000 1.76
--- src/bin/pg_dump/pg_backup_archiver.h 20 Jul 2008 05:19:34 -0000
***************
*** 158,169 ****
STAGE_FINALIZING
} ArchiverStage;

! typedef enum
! {
! REQ_SCHEMA = 1,
! REQ_DATA = 2,
! REQ_ALL = REQ_SCHEMA + REQ_DATA
! } teReqs;

typedef struct _archiveHandle
{
--- 158,173 ----
STAGE_FINALIZING
} ArchiverStage;

! #define REQ_PRE_SCHEMA (1 << 0)
! #define REQ_DATA (1 << 1)
! #define REQ_POST_SCHEMA (1 << 2)
! #define REQ_ALL (REQ_PRE_SCHEMA + REQ_DATA + REQ_POST_SCHEMA)
!
! #define WANT_PRE_SCHEMA(req) ((req & REQ_PRE_SCHEMA) == REQ_PRE_SCHEMA)
! #define WANT_DATA(req) ((req & REQ_DATA) == REQ_DATA)
! #define WANT_POST_SCHEMA(req) ((req & REQ_POST_SCHEMA) == REQ_POST_SCHEMA)
! #define WANT_ALL(req) ((req & REQ_ALL) == REQ_ALL)
!

typedef struct _archiveHandle
{
***************
*** 317,323 ****
extern void ReadToc(ArchiveHandle *AH);
extern void WriteDataChunks(ArchiveHandle *AH);

! extern teReqs TocIDRequired(ArchiveHandle *AH, DumpId id, RestoreOptions *ropt);
extern bool checkSeek(FILE *fp);

#define appendStringLiteralAHX(buf,str,AH) \
--- 321,327 ----
extern void ReadToc(ArchiveHandle *AH);
extern void WriteDataChunks(ArchiveHandle *AH);

! extern int TocIDRequired(ArchiveHandle *AH, DumpId id, RestoreOptions *ropt);
extern bool checkSeek(FILE *fp);

#define appendStringLiteralAHX(buf,str,AH) \
Index: src/bin/pg_dump/pg_dump.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/bin/pg_dump/pg_dump.c,v
retrieving revision 1.496
diff -c -r1.496 pg_dump.c
*** src/bin/pg_dump/pg_dump.c 18 Jul 2008 03:32:52 -0000 1.496
--- src/bin/pg_dump/pg_dump.c 20 Jul 2008 05:57:58 -0000
***************
*** 72,77 ****
--- 72,81 ----
bool dataOnly;
bool aclsSkip;

+ /* groups of objects: default is we dump all groups */
+
+ int dumpObjFlags;
+
/* subquery used to convert user ID (eg, datdba) to user name */
static const char *username_subquery;

***************
*** 224,231 ****
RestoreOptions *ropt;

static int disable_triggers = 0;
! static int outputNoTablespaces = 0;
static int use_setsessauth = 0;

static struct option long_options[] = {
{"data-only", no_argument, NULL, 'a'},
--- 228,237 ----
RestoreOptions *ropt;

static int disable_triggers = 0;
! static int outputNoTablespaces = 0;
static int use_setsessauth = 0;
+ static int use_schemaPreLoadOnly;
+ static int use_schemaPostLoadOnly;

static struct option long_options[] = {
{"data-only", no_argument, NULL, 'a'},
***************
*** 265,270 ****
--- 271,278 ----
{"disable-dollar-quoting", no_argument, &disable_dollar_quoting, 1},
{"disable-triggers", no_argument, &disable_triggers, 1},
{"no-tablespaces", no_argument, &outputNoTablespaces, 1},
+ {"schema-pre-load", no_argument, &use_schemaPreLoadOnly, 1},
+ {"schema-post-load", no_argument, &use_schemaPostLoadOnly, 1},
{"use-set-session-authorization", no_argument, &use_setsessauth, 1},

{NULL, 0, NULL, 0}
***************
*** 456,467 ****
if (optind < argc)
dbname = argv[optind];

! if (dataOnly && schemaOnly)
{
! write_msg(NULL, "options -s/--schema-only and -a/--data-only cannot be used together\n");
exit(1);
}

if (dataOnly && outputClean)
{
write_msg(NULL, "options -c/--clean and -a/--data-only cannot be used together\n");
--- 464,504 ----
if (optind < argc)
dbname = argv[optind];

! /*
! * Look for conflicting options relating to object groupings
! */
! if (schemaOnly && dataOnly)
! {
! write_msg(NULL, "options %s and %s cannot be used together\n",
! "-s/--schema-only", "-a/--data-only");
! exit(1);
! }
! else if ((schemaOnly || dataOnly) &&
! (use_schemaPreLoadOnly == 1 || use_schemaPostLoadOnly == 1))
{
! write_msg(NULL, "options %s and %s cannot be used together\n",
! schemaOnly ? "-s/--schema-only" : "-a/--data-only",
! use_schemaPostLoadOnly == 1 ? "--schema-post-load" : "--schema-pre-load ");
exit(1);
}

+ /*
+ * Decide which of the object groups we will dump
+ */
+ dumpObjFlags = REQ_ALL;
+
+ if (dataOnly)
+ dumpObjFlags = REQ_DATA;
+
+ if (use_schemaPreLoadOnly == 1)
+ dumpObjFlags = REQ_PRE_SCHEMA;
+
+ if (use_schemaPostLoadOnly == 1)
+ dumpObjFlags = REQ_POST_SCHEMA;
+
+ if (schemaOnly)
+ dumpObjFlags = (REQ_PRE_SCHEMA | REQ_POST_SCHEMA);
+
if (dataOnly && outputClean)
{
write_msg(NULL, "options -c/--clean and -a/--data-only cannot be used together\n");
***************
*** 638,644 ****
* Dumping blobs is now default unless we saw an inclusion switch or -s
* ... but even if we did see one of these, -b turns it back on.
*/
! if (include_everything && !schemaOnly)
outputBlobs = true;

/*
--- 675,681 ----
* Dumping blobs is now default unless we saw an inclusion switch or -s
* ... but even if we did see one of these, -b turns it back on.
*/
! if (include_everything && WANT_PRE_SCHEMA(dumpObjFlags))
outputBlobs = true;

/*
***************
*** 650,656 ****
if (g_fout->remoteVersion < 80400)
guessConstraintInheritance(tblinfo, numTables);

! if (!schemaOnly)
getTableData(tblinfo, numTables, oids);

if (outputBlobs && hasBlobs(g_fout))
--- 687,693 ----
if (g_fout->remoteVersion < 80400)
guessConstraintInheritance(tblinfo, numTables);

! if (WANT_DATA(dumpObjFlags))
getTableData(tblinfo, numTables, oids);

if (outputBlobs && hasBlobs(g_fout))
***************
*** 704,710 ****
dumpStdStrings(g_fout);

/* The database item is always next, unless we don't want it at all */
! if (include_everything && !dataOnly)
dumpDatabase(g_fout);

/* Now the rearrangeable objects. */
--- 741,747 ----
dumpStdStrings(g_fout);

/* The database item is always next, unless we don't want it at all */
! if (include_everything && WANT_DATA(dumpObjFlags))
dumpDatabase(g_fout);

/* Now the rearrangeable objects. */
***************
*** 726,732 ****
ropt->noTablespace = outputNoTablespaces;
ropt->disable_triggers = disable_triggers;
ropt->use_setsessauth = use_setsessauth;
! ropt->dataOnly = dataOnly;

if (compressLevel == -1)
ropt->compression = 0;
--- 763,769 ----
ropt->noTablespace = outputNoTablespaces;
ropt->disable_triggers = disable_triggers;
ropt->use_setsessauth = use_setsessauth;
! ropt->dumpObjFlags = dumpObjFlags;

if (compressLevel == -1)
ropt->compression = 0;
***************
*** 3389,3395 ****
continue;

/* Ignore indexes of tables not to be dumped */
! if (!tbinfo->dobj.dump)
continue;

if (g_verbose)
--- 3426,3432 ----
continue;

/* Ignore indexes of tables not to be dumped */
! if (!tbinfo->dobj.dump || !WANT_POST_SCHEMA(dumpObjFlags))
continue;

if (g_verbose)
***************
*** 5140,5146 ****
int ncomments;

/* Comments are SCHEMA not data */
! if (dataOnly)
return;

/* Search for comments associated with catalogId, using table */
--- 5177,5183 ----
int ncomments;

/* Comments are SCHEMA not data */
! if (!WANT_PRE_SCHEMA(dumpObjFlags))
return;

/* Search for comments associated with catalogId, using table */
***************
*** 5191,5197 ****
PQExpBuffer target;

/* Comments are SCHEMA not data */
! if (dataOnly)
return;

/* Search for comments associated with relation, using table */
--- 5228,5234 ----
PQExpBuffer target;

/* Comments are SCHEMA not data */
! if (!WANT_PRE_SCHEMA(dumpObjFlags))
return;

/* Search for comments associated with relation, using table */
***************
*** 5543,5549 ****
char *qnspname;

/* Skip if not to be dumped */
! if (!nspinfo->dobj.dump || dataOnly)
return;

/* don't dump dummy namespace from pre-7.3 source */
--- 5580,5586 ----
char *qnspname;

/* Skip if not to be dumped */
! if (!nspinfo->dobj.dump || !WANT_PRE_SCHEMA(dumpObjFlags))
return;

/* don't dump dummy namespace from pre-7.3 source */
***************
*** 5592,5598 ****
dumpType(Archive *fout, TypeInfo *tinfo)
{
/* Skip if not to be dumped */
! if (!tinfo->dobj.dump || dataOnly)
return;

/* Dump out in proper style */
--- 5629,5635 ----
dumpType(Archive *fout, TypeInfo *tinfo)
{
/* Skip if not to be dumped */
! if (!tinfo->dobj.dump || !WANT_PRE_SCHEMA(dumpObjFlags))
return;

/* Dump out in proper style */
***************
*** 6237,6243 ****
PQExpBuffer q;

/* Skip if not to be dumped */
! if (!stinfo->dobj.dump || dataOnly)
return;

q = createPQExpBuffer();
--- 6274,6280 ----
PQExpBuffer q;

/* Skip if not to be dumped */
! if (!stinfo->dobj.dump || !WANT_PRE_SCHEMA(dumpObjFlags))
return;

q = createPQExpBuffer();
***************
*** 6284,6290 ****
if (!include_everything)
return false;
/* And they're schema not data */
! if (dataOnly)
return false;
return true;
}
--- 6321,6327 ----
if (!include_everything)
return false;
/* And they're schema not data */
! if (!WANT_PRE_SCHEMA(dumpObjFlags))
return false;
return true;
}
***************
*** 6305,6311 ****
FuncInfo *funcInfo;
FuncInfo *validatorInfo = NULL;

! if (dataOnly)
return;

/*
--- 6342,6348 ----
FuncInfo *funcInfo;
FuncInfo *validatorInfo = NULL;

! if (!WANT_PRE_SCHEMA(dumpObjFlags))
return;

/*
***************
*** 6565,6571 ****
int i;

/* Skip if not to be dumped */
! if (!finfo->dobj.dump || dataOnly)
return;

query = createPQExpBuffer();
--- 6602,6608 ----
int i;

/* Skip if not to be dumped */
! if (!finfo->dobj.dump || !WANT_PRE_SCHEMA(dumpObjFlags))
return;

query = createPQExpBuffer();
***************
*** 6960,6966 ****
TypeInfo *sourceInfo;
TypeInfo *targetInfo;

! if (dataOnly)
return;

if (OidIsValid(cast->castfunc))
--- 6997,7003 ----
TypeInfo *sourceInfo;
TypeInfo *targetInfo;

! if (!WANT_PRE_SCHEMA(dumpObjFlags))
return;

if (OidIsValid(cast->castfunc))
***************
*** 7110,7116 ****
char *oprcanhash;

/* Skip if not to be dumped */
! if (!oprinfo->dobj.dump || dataOnly)
return;

/*
--- 7147,7153 ----
char *oprcanhash;

/* Skip if not to be dumped */
! if (!oprinfo->dobj.dump || !WANT_PRE_SCHEMA(dumpObjFlags))
return;

/*
***************
*** 7494,7500 ****
int i;

/* Skip if not to be dumped */
! if (!opcinfo->dobj.dump || dataOnly)
return;

/*
--- 7531,7537 ----
int i;

/* Skip if not to be dumped */
! if (!opcinfo->dobj.dump || !WANT_PRE_SCHEMA(dumpObjFlags))
return;

/*
***************
*** 7802,7808 ****
int i;

/* Skip if not to be dumped */
! if (!opfinfo->dobj.dump || dataOnly)
return;

/*
--- 7839,7845 ----
int i;

/* Skip if not to be dumped */
! if (!opfinfo->dobj.dump || !WANT_PRE_SCHEMA(dumpObjFlags))
return;

/*
***************
*** 8071,8077 ****
bool condefault;

/* Skip if not to be dumped */
! if (!convinfo->dobj.dump || dataOnly)
return;

query = createPQExpBuffer();
--- 8108,8114 ----
bool condefault;

/* Skip if not to be dumped */
! if (!convinfo->dobj.dump || !WANT_PRE_SCHEMA(dumpObjFlags))
return;

query = createPQExpBuffer();
***************
*** 8225,8231 ****
bool convertok;

/* Skip if not to be dumped */
! if (!agginfo->aggfn.dobj.dump || dataOnly)
return;

query = createPQExpBuffer();
--- 8262,8268 ----
bool convertok;

/* Skip if not to be dumped */
! if (!agginfo->aggfn.dobj.dump || !WANT_PRE_SCHEMA(dumpObjFlags))
return;

query = createPQExpBuffer();
***************
*** 8428,8434 ****
PQExpBuffer delq;

/* Skip if not to be dumped */
! if (!prsinfo->dobj.dump || dataOnly)
return;

q = createPQExpBuffer();
--- 8465,8471 ----
PQExpBuffer delq;

/* Skip if not to be dumped */
! if (!prsinfo->dobj.dump || !WANT_PRE_SCHEMA(dumpObjFlags))
return;

q = createPQExpBuffer();
***************
*** 8497,8503 ****
char *tmplname;

/* Skip if not to be dumped */
! if (!dictinfo->dobj.dump || dataOnly)
return;

q = createPQExpBuffer();
--- 8534,8540 ----
char *tmplname;

/* Skip if not to be dumped */
! if (!dictinfo->dobj.dump || !WANT_PRE_SCHEMA(dumpObjFlags))
return;

q = createPQExpBuffer();
***************
*** 8582,8588 ****
PQExpBuffer delq;

/* Skip if not to be dumped */
! if (!tmplinfo->dobj.dump || dataOnly)
return;

q = createPQExpBuffer();
--- 8619,8625 ----
PQExpBuffer delq;

/* Skip if not to be dumped */
! if (!tmplinfo->dobj.dump || !WANT_PRE_SCHEMA(dumpObjFlags))
return;

q = createPQExpBuffer();
***************
*** 8648,8654 ****
int i_dictname;

/* Skip if not to be dumped */
! if (!cfginfo->dobj.dump || dataOnly)
return;

q = createPQExpBuffer();
--- 8685,8691 ----
int i_dictname;

/* Skip if not to be dumped */
! if (!cfginfo->dobj.dump || !WANT_PRE_SCHEMA(dumpObjFlags))
return;

q = createPQExpBuffer();
***************
*** 8784,8790 ****
PQExpBuffer sql;

/* Do nothing if ACL dump is not enabled */
! if (dataOnly || aclsSkip)
return;

sql = createPQExpBuffer();
--- 8821,8827 ----
PQExpBuffer sql;

/* Do nothing if ACL dump is not enabled */
! if (!WANT_PRE_SCHEMA(dumpObjFlags) || aclsSkip)
return;

sql = createPQExpBuffer();
***************
*** 8821,8827 ****
{
if (tbinfo->relkind == RELKIND_SEQUENCE)
dumpSequence(fout, tbinfo);
! else if (!dataOnly)
dumpTableSchema(fout, tbinfo);

/* Handle the ACL here */
--- 8858,8864 ----
{
if (tbinfo->relkind == RELKIND_SEQUENCE)
dumpSequence(fout, tbinfo);
! else if (WANT_PRE_SCHEMA(dumpObjFlags))
dumpTableSchema(fout, tbinfo);

/* Handle the ACL here */
***************
*** 9128,9134 ****
PQExpBuffer delq;

/* Only print it if "separate" mode is selected */
! if (!tbinfo->dobj.dump || !adinfo->separate || dataOnly)
return;

/* Don't print inherited defaults, either */
--- 9165,9171 ----
PQExpBuffer delq;

/* Only print it if "separate" mode is selected */
! if (!tbinfo->dobj.dump || !adinfo->separate || !WANT_PRE_SCHEMA(dumpObjFlags))
return;

/* Don't print inherited defaults, either */
***************
*** 9213,9219 ****
PQExpBuffer q;
PQExpBuffer delq;

! if (dataOnly)
return;

q = createPQExpBuffer();
--- 9250,9256 ----
PQExpBuffer q;
PQExpBuffer delq;

! if (!WANT_POST_SCHEMA(dumpObjFlags))
return;

q = createPQExpBuffer();
***************
*** 9282,9288 ****
PQExpBuffer delq;

/* Skip if not to be dumped */
! if (!coninfo->dobj.dump || dataOnly)
return;

q = createPQExpBuffer();
--- 9319,9325 ----
PQExpBuffer delq;

/* Skip if not to be dumped */
! if (!coninfo->dobj.dump || !WANT_POST_SCHEMA(dumpObjFlags))
return;

q = createPQExpBuffer();
***************
*** 9675,9681 ****
*
* Add a 'SETVAL(seq, last_val, iscalled)' as part of a "data" dump.
*/
! if (!dataOnly)
{
resetPQExpBuffer(delqry);

--- 9712,9718 ----
*
* Add a 'SETVAL(seq, last_val, iscalled)' as part of a "data" dump.
*/
! if (WANT_PRE_SCHEMA(dumpObjFlags))
{
resetPQExpBuffer(delqry);

***************
*** 9778,9784 ****
tbinfo->dobj.catId, 0, tbinfo->dobj.dumpId);
}

! if (!schemaOnly)
{
resetPQExpBuffer(query);
appendPQExpBuffer(query, "SELECT pg_catalog.setval(");
--- 9815,9821 ----
tbinfo->dobj.catId, 0, tbinfo->dobj.dumpId);
}

! if (WANT_PRE_SCHEMA(dumpObjFlags))
{
resetPQExpBuffer(query);
appendPQExpBuffer(query, "SELECT pg_catalog.setval(");
***************
*** 9811,9817 ****
const char *p;
int findx;

! if (dataOnly)
return;

query = createPQExpBuffer();
--- 9848,9854 ----
const char *p;
int findx;

! if (!WANT_POST_SCHEMA(dumpObjFlags))
return;

query = createPQExpBuffer();
***************
*** 10019,10025 ****
PGresult *res;

/* Skip if not to be dumped */
! if (!rinfo->dobj.dump || dataOnly)
return;

/*
--- 10056,10062 ----
PGresult *res;

/* Skip if not to be dumped */
! if (!rinfo->dobj.dump || !WANT_POST_SCHEMA(dumpObjFlags))
return;

/*
Index: src/bin/pg_dump/pg_restore.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/bin/pg_dump/pg_restore.c,v
retrieving revision 1.88
diff -c -r1.88 pg_restore.c
*** src/bin/pg_dump/pg_restore.c 13 Apr 2008 03:49:22 -0000 1.88
--- src/bin/pg_dump/pg_restore.c 20 Jul 2008 05:57:51 -0000
***************
*** 78,83 ****
--- 78,90 ----
static int no_data_for_failed_tables = 0;
static int outputNoTablespaces = 0;
static int use_setsessauth = 0;
+ bool dataOnly = false;
+ bool schemaOnly = false;
+
+ static int use_schemaPreLoadOnly;
+ static int use_schemaPostLoadOnly;
+
+ int dumpObjFlags;

struct option cmdopts[] = {
{"clean", 0, NULL, 'c'},
***************
*** 114,119 ****
--- 121,128 ----
{"disable-triggers", no_argument, &disable_triggers, 1},
{"no-data-for-failed-tables", no_argument, &no_data_for_failed_tables, 1},
{"no-tablespaces", no_argument, &outputNoTablespaces, 1},
+ {"schema-pre-load", no_argument, &use_schemaPreLoadOnly, 1},
+ {"schema-post-load", no_argument, &use_schemaPostLoadOnly, 1},
{"use-set-session-authorization", no_argument, &use_setsessauth, 1},

{NULL, 0, NULL, 0}
***************
*** 145,151 ****
switch (c)
{
case 'a': /* Dump data only */
! opts->dataOnly = 1;
break;
case 'c': /* clean (i.e., drop) schema prior to create */
opts->dropSchema = 1;
--- 154,160 ----
switch (c)
{
case 'a': /* Dump data only */
! dataOnly = true;
break;
case 'c': /* clean (i.e., drop) schema prior to create */
opts->dropSchema = 1;
***************
*** 213,219 ****
opts->triggerNames = strdup(optarg);
break;
case 's': /* dump schema only */
! opts->schemaOnly = 1;
break;
case 'S': /* Superuser username */
if (strlen(optarg) != 0)
--- 222,228 ----
opts->triggerNames = strdup(optarg);
break;
case 's': /* dump schema only */
! schemaOnly = true;
break;
case 'S': /* Superuser username */
if (strlen(optarg) != 0)
***************
*** 295,300 ****
--- 304,344 ----
opts->useDB = 1;
}

+ /*
+ * Look for conflicting options relating to object groupings
+ */
+ if (schemaOnly && dataOnly)
+ {
+ write_msg(NULL, "options %s and %s cannot be used together\n",
+ "-s/--schema-only", "-a/--data-only");
+ exit(1);
+ }
+ else if ((schemaOnly || dataOnly) &&
+ (use_schemaPreLoadOnly == 1 || use_schemaPostLoadOnly == 1))
+ {
+ write_msg(NULL, "options %s and %s cannot be used together\n",
+ schemaOnly ? "-s/--schema-only" : "-a/--data-only",
+ use_schemaPostLoadOnly == 1 ? "--schema-post-load" : "--schema-pre-load ");
+ exit(1);
+ }
+
+ /*
+ * Decide which of the object groups we will dump
+ */
+ dumpObjFlags = REQ_ALL;
+
+ if (dataOnly)
+ dumpObjFlags = REQ_DATA;
+
+ if (use_schemaPreLoadOnly == 1)
+ dumpObjFlags = REQ_PRE_SCHEMA;
+
+ if (use_schemaPostLoadOnly == 1)
+ dumpObjFlags = REQ_POST_SCHEMA;
+
+ if (schemaOnly)
+ dumpObjFlags = (REQ_PRE_SCHEMA | REQ_POST_SCHEMA);
+
opts->disable_triggers = disable_triggers;
opts->noDataForFailedTables = no_data_for_failed_tables;
opts->noTablespace = outputNoTablespaces;
On Sun, 2008-07-20 at 05:47 +0100, Simon Riggs wrote:
> On Sat, 2008-07-19 at 23:07 -0400, Stephen Frost wrote:
> > Simon,
> >
> > I agree with adding these options in general, since I find myself
> > frustrated by having to vi huge dumps to change simple schema things.
> > A couple of comments on the patch though:
> >
> > - Conflicting option handling
> > I think we are doing our users a disservice by putting it on them to
> > figure out exactly what:
> > multiple object groups cannot be used together
> > means to them. You and I may understand what an "object group" is,
> > and why there can be only one, but it's a great deal less clear than
> > the prior message of
> > options -s/--schema-only and -a/--data-only cannot be used together
> > My suggestion would be to either list out the specific options which
> > can't be used together, as was done previously, or add a bit of (I
> > realize, boring) code and actually tell the user which of the
> > conflicting options were used.
> >
> > - Documentation
> > When writing the documentation I would stress that "pre-schema" and
> > "post-schema" be defined in terms of PostgreSQL objects and why they
> > are pre vs. post.
> >
> > - Technically, the patch needs to be updated slightly since another
> > pg_dump-related patch was committed recently which also added
> > options and thus causes a conflict.
> >
> > Beyond those minor points, the patch looks good to me.
>
> Thanks for the review. I'll make the changes you suggest.

Patch updated to head, plus changes/docs requested.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

No comments: