PostgreSQL: Sep 28, 2008

Sunday, September 28, 2008

Re: [GENERAL] inserting to a multi-table view

On Sun, 28 Sep 2008 21:43:49 -0700,
"Richard Broersma" <richard.broersma@gmail.com> wrote:

[...]

> Well, I was able to get PostgreSQL Update-able views to work nearly as
> well as the update-able queries did in Access.

Would you mind sharing a sample schema?

> As a side note, you'll notice that MS-Access will not allow
> update-able queries based on ODBC linked table like it does on its
> native tables for this reason.

That's right, I did find that out once but didn't know whether it was
due to ODBC limitations or something else.

> Natural Primary key/Foreign key CASCADE UPDATEs don't work well with
> update-able views. Choose one or the other.

Not sure what you mean; can you please tell more about what doesn't work
well with updateable views what the choice is?

Thanks for the feedback,

--
Seb

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[COMMITTERS] uint - uint: Initial checking for the unsigned integer data type.

Log Message:
-----------
Initial checking for the unsigned integer data type.

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

[COMMITTERS] uint - uint: New Directory

Update of /cvsroot/uint/uint/sql
In directory pgfoundry.org:/tmp/cvs-serv98957/sql

Log Message:
Directory /cvsroot/uint/uint/sql added to the repository

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

[COMMITTERS] uint - uint: New Directory

Update of /cvsroot/uint/uint/expected
In directory pgfoundry.org:/tmp/cvs-serv98957/expected

Log Message:
Directory /cvsroot/uint/uint/expected added to the repository

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

Re: [NOVICE] absolute novice wanting knowledgeable opinion about front end

Phil,

PostgreSQL is a server side database, so not quite clear what you mean by not mixing front with back. Regardless of what you choose for your front-end, its not going to be completely tied to PostgreSQL.

It might be a good stepping stone to stick with your Access front end and just switch all your tables to linked PostgreSQL tables especially if you have a lot of time invested in writing Access functions.

For the most part you can use all the functions you have written in MS Access if you stick with Linked Tables. If you use pass-thrus or postgresql views then you can take advantage of PostgreSQL specific functionality. You can mix and match all 3 strategies (linked tables, linked views, sql pass-thru) in the same MS Access database.

On top of that you inherit PostgreSQL ACID, cascade update/delete, network efficiency (e..g passing statements along the pipe instead of index reads) security stuff even with linked tables. We have a bunch of applications we have written that use PostgreSQL as a backend and MS Access as a front-end. And also a bunch that use SQL Server as back end and MS Access as front-end. They actually work well together and don't suffer from the network issues that a pure MS Access solution does (e.g. 15 clients, slow over slow network etc) .

Thanks,
Regina

-----Original Message-----
From: Phil [mailto:philbaseless-postgres@yahoo.com]
Sent: Sun 9/28/2008 11:42 PM
To: Obe, Regina; Tom Lane
Cc: pgsql-novice@postgresql.org
Subject: Re: [NOVICE] absolute novice wanting knowledgeable opinion about front end

This was interesting and the comments in the article about Access's ease of use
being a bain or boon is appropriate. But it made it easy toy to target ourselves
and not have to muck thru a generic db app.

I'm not planning to mix front and back end's.

So far I found report generators and sql builders. Form builders will be more
difficult to find. The ones in MSaccess integrate a lot of their GUI app
features and are very powerful. For example columns can be greyed out or not
depending on content. The forms in Access are often used to make up for it's
lack of data security that would probably be handled by postgres's ACID
compliance. I need to educate myself on ACID compliance and other SQL that is
new and improved over Msaccess spec.

I see I would have to rewrite a lot of Access functions also.

What would be nice is if someone had a sample DB and frontend that mimic's
Access's 'Northwind traders' sample.

Anyway thanks for the replies from everyone.

>> (Anyone want to start putting together a page on wiki.postgresql.org
>> about Access compatibility?)
>
>> regards, tom lane
>
> If it helps we wrote a quick one. I think its already listed on the
> wiki too.
>
> http://www.postgresonline.com/journal/index.php?/archives/24-Using-MS-Access-with-PostgreSQL.html
>
> Hope that helps,
> Regina
>
>
> -----------------------------------------
> The substance of this message, including any attachments, may be
> confidential, legally privileged and/or exempt from disclosure
> pursuant to Massachusetts law. It is intended
> solely for the addressee. If you received this in error, please
> contact the sender and delete the material from any computer.

[COMMITTERS] uint - uint: New Directory

Update of /cvsroot/uint/uint/data
In directory pgfoundry.org:/tmp/cvs-serv96724/data

Log Message:
Directory /cvsroot/uint/uint/data added to the repository

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

Re: [GENERAL] inserting to a multi-table view

On Sun, Sep 28, 2008 at 5:31 PM, Seb <spluque@gmail.com> wrote:
> I've read this thread with great interest as I'm coming to PostgreSQL
> from the MS Access world of databases, where one can enter new data into
> queries/forms and tables get automatically updated/deleted/inserted into
> where expected.

Well, I was able to get PostgreSQL Update-able views to work nearly
as well as the update-able queries did in Access.

In the case of update-able joined queries, I would expect that
MS-Access had the advantage since it was using Pessimistic locking on
native Access tables rather than the Optimistic locking that MS-Access
uses on all ODBC linked tables. As a side note, you'll notice that
MS-Access will not allow update-able queries based on ODBC linked
table like it does on its native tables for this reason.

> I'm also leaning towards using natural keys where possible and was
> wondering how best to create multi-table views that can be
> updated/deleted/inserted into.

Natural Primary key/Foreign key CASCADE UPDATEs don't work well with
update-able views. Choose one or the other.

> Particularly, I'm
> curious to learn how PostgreSQL database maintainers handle data
> entry/modification requiring multi-table queries. Thanks.

My opionion is that Multitable update-able views are not safe if you
plan to allow multiple users concurrent access to the view. Because
of this I scrapped these kinds of views for multiple prepared
statements issued in a serializable level transaction.

--
Regards,
Richard Broersma Jr.

Visit the Los Angeles PostgreSQL Users Group (LAPUG)
http://pugs.postgresql.org/lapug

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[ADMIN] PostgreSQL Cache

Hi,

I need to perform some timed testing, thus need to make sure that disk cache does not affect me. Is clearing the OS (Ubuntu) disk cache, ( by running: sudo echo 3 | sudo tee /proc/sys/vm/drop_caches ) enough to do this? If not can you please point me to some site please since all I am finding is such command.

Thanks and regards

Matthew

--
Matthew Pulis
URL : http://www.solutions-lab.net
MSN : pulis_matthew[@]hotmail.com
ICQ : 145951110
Skype : solutions-lab.net

[COMMITTERS] uint - uint: Imported Sources

Update of /cvsroot/uint/uint/uint
In directory pgfoundry.org:/tmp/cvs-serv58775

Log Message:
Unsigned Integer Data Type

Status:

Vendor Tag: UINT_1_0_0
Release Tags: start

No conflicts created by this import

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

[ANNOUNCE] == PostgreSQL Weekly News - September 28 2008 ==

== PostgreSQL Weekly News - September 28 2008 ==

A lively discussion of SE-PostgreSQL continues on -hackers.

Registration for PGDay.IT is open.
https://register.pgday.org/

ArcGIS 9.3 now supports PostgreSQL.
http://www.esri.com/software/arcgis/geodatabase/about/whats-new.html

== PostgreSQL Product News ==

check_postgres 2.2.1 released.
http://bucardo.org/check_postgres/

MyJSQLView 2.91 released.
http://myjsqlview.sourceforge.net/

PL/Proxy 2.0.7 released.
http://pgfoundry.org/projects/plproxy/

== PostgreSQL Jobs for September ==

http://archives.postgresql.org/pgsql-jobs/2008-09/threads.php

== PostgreSQL Local ==

PgDay.fr will be October 4 in Toulouse. The Call for Papers is open:
http://www.postgresqlfr.org/?q=node/1686
Registration:
http://www.pgday.fr/doku.php/inscription

The Russian PostgreSQL Users' Group meeting will be October 6th in
Moscow. Special guests expected include Gavin Roy of MyYearbook.com
and Asko Oja and Marko Kreen of Skype.
http://forum.postgresmen.ru/viewtopic.php?f=6&t=38

The Highload++ conference will be October 6-8 in Moscow, Russia.
Gavin Roy, Asko Oja and Maxim Boguk will talk about things PostgreSQL.
http://highload.ru

PostgreSQL Conference West 2008 will be October 10-12 at Portland
State University in Portland, Oregon.
http://www.postgresqlconference.org/
Talk submission at:
http://www.postgresqlconference.org/west08/talk_submission/

Ohio LinuxFest 2008, held October 11 in Columbus, will once again have
a PostgreSQL booth this year. Contact melanie AT dunslane DOT net to
volunteer.

PostgreSQL User Group Germany is doing a talk and workshop at Open
Source day 2008 October 11 in Magdeburg, Germany.
http://www.open-source-tag.de/

The European PostgreSQL Day (PGDay 2008) will be October 17 and 18 in
Prato, Tuscany, Italy. Registration is open at
http://register.pgday.org/ http://www.pgday.org/en/

PostgreSQL has a table at LinuxLive, Olympia, London, UK on 23-25
October, 2008. Write to Dave Page to participate.
dpage AT pgamin DOT org

== PostgreSQL in the News ==

Planet PostgreSQL: http://www.planetpostgresql.org/

PostgreSQL Weekly News is brought to you this week by David Fetter
Jean-Paul Argudo, Gabriele Bartolini, Nikolay Samokhvalov, and Andreas
(ads) Scherbaum.

Submit news and announcements by Sunday at 3:00pm Pacific time.
Please send English language ones to david@fetter.org, German language
to pwn@pgug.de, Italian language to pwn@itpug.org.

== Applied Patches ==

Tom Lane committed:

- Fix dblink_connect() so that it verifies that a password is supplied
in the conninfo string *before* trying to connect to the remote
server, not after. As pointed out by Marko Kreen, in certain
not-very-plausible situations this could result in sending a
password from the postgres user's .pgpass file, or other places that
non-superusers shouldn't have access to, to an untrustworthy remote
server. The cleanest fix seems to be to expose libpq's
conninfo-string-parsing code so that dblink can check for a password
option without duplicating the parsing logic. Joe Conway, with a
little cleanup by Tom Lane.

- Get rid of pgpass_from_client tracking inside libpq --- given the
conclusion that presence of the password in the conninfo string must
be checked *before* risking a connection attempt, there is no point
in checking it afterwards. This makes the specification of
PQconnectionUsedPassword() a bit simpler and perhaps more generally
useful, too.

- In pgsql/src/pl/plpgsql/src/pl_exec.c, fix unportable syntax used in
recent patch. Per results from buildfarm member 'bear'.

- In pgsql/src/backend/rewrite/rewriteHandler.c, fix more problems
with rewriter failing to set Query.hasSubLinks when inserting a
SubLink expression into a rule query. We missed cases where the
original query contained a sub-SELECT in a function in FROM, a
multi-row VALUES list, or a RETURNING list. Per bug #4434 from Dean
Rasheed and subsequent investigation. Back-patch to 8.1; older
releases don't have the issue because they didn't try to be smart
about setting hasSubLinks only when needed.

- Establish the rule that array types should have the same typdelim as
their element types. Since the backend doesn't actually pay
attention to the array type's delimiter, this has no functional
effect, but it seems better for the catalog entries to be
consistent. Per gripe from Greg Mullane and subsequent discussion.

- In pgsql/src/backend/utils/adt/like_match.c, make LIKE throw an
error if the escape character is at the end of the pattern (ie, has
nothing to quote), rather than silently ignoring the character as
has been our historical behavior. This is required by SQL spec and
should help reduce the sort of user confusion seen in bug #4436.
Per discussion. This is not so much a bug fix as a definitional
change, and it could break existing applications; so not
back-patched. It might deserve being mentioned as an
incompatibility in the 8.4 release notes.

- In pgsql/src/backend/utils/adt/formatting.c, fix pointer-advancement
bugs in MS and US cases of new to_timestamp() code. Alex Hunsaker.

- Add hooks to let plugins override the planner's lookups in
pg_statistic. Simon Riggs, with some editorialization by me.

- In pgsql/src/backend/utils/adt/selfuncs.c, dept of second thoughts:
let's make sure that get_index_stats_hook is only applied to
expression indexes, not to plain relations. The original coding in
btcostestimate conflated the two cases, but it's not hard to use
get_relation_stats_hook instead when we're looking to the underlying
relation.

Heikki Linnakangas committed:

- Make LC_COLLATE and LC_CTYPE database-level settings. Collation and
ctype are now more like encoding, stored in new datcollate and
datctype columns in pg_database. This is a stripped-down version of
Radek Strnad's patch, with further changes by me.

- Tighten the check in initdb and CREATE DATABASE that the chosen
encoding matches the encoding of the locale. LC_COLLATE is now
checked in addition to LC_CTYPE.

- In pgsql/src/bin/pg_dump/pg_dump.c, fix pg_dump bug in the
database-level collation patch. "datcollate" and "datctype" columns
were misspelled. Per report from Chris Browne.

Bruce Momjian committed:

- Mention battery-backed cache under hardware selection options.

- Add comment about the use of EXEC_BACKEND.

- In pgsql/doc/src/sgml/charset.sgml, fix markup tag error, envvar ->
envar.

- In pgsql/src/backend/utils/adt/datetime.c, fix integral timestamps
so the output is consistent in all cases to round: select interval
'0:0:0.7', interval '@ 0.70 secs', interval '0.7 seconds'; Ron
Mayer.

- In pgsql/doc/src/sgml/plpgsql.sgml, add documentation about when
trigger values NEW/OLD return NULL. Jeff Davis.

Magnus Hagander committed:

- In pgsql/src/backend/utils/misc/guc.c, only show source file and
line numbers to superusers, for consistent security level with other
parts of the system. Per gripe from Tom.

- Make sure pg_control is opened in binary mode, to deal with
situtations when the file contains an EOF maker (0x1A) on Windows.
ITAGAKI Takahiro.

- In pgsql/src/bin/pg_resetxlog/pg_resetxlog.c, silence compiler
warning caused by recent collation patch.

Andrew Dunstan committed:

- In pgsql/src/backend/utils/adt/like_match.c, compare escaped chars
case insensitively for ILIKE - per gripe from TGL.

== Rejected Patches (for now) ==

No one was disappointed this week :-)

== Pending Patches ==

Gregory Stark sent in another patch to do prefetching for bitmap heap
scans and index scans, sets POSIX_FADV_SEQUENTIAL for bulk sequential
scans, and fixes autoconf tests for posix_fadvise.

Simon Riggs sent in two more revisions of his infrastructure changes
for recovery.

Andrew Dunstan sent in three WIP patches for parallel restore.

Heikki Linnakangas sent in doc changes for the FSM rewrite.

KaiGai Kohei sent in another revision of the SE-PostgreSQL patches.

Heikki Linnakangas sent in two more revision of his FSM rewrite, this
time without WAL logging.

Tatsuo Ishii, Jeff Davis and Tom Lane sent in more work on the CTE
patch.

Benedek Laszlo sent in another revision of his patch to add roles to
pg_dump.

Peter Eisentraut sent in another revision of his patch to fix some
issues in vpath builds.

---------------------------(end of broadcast)---------------------------
-To unsubscribe from this list, send an email to:

pgsql-announce-unsubscribe@postgresql.org

Re: [PERFORM] Slow updates, poor IO

On Sun, Sep 28, 2008 at 9:08 PM, John Huttley <John@mib-infotech.co.nz> wrote:
> Ah yess... actually I can get the Kingston stuff locally.
> However at the moment I'm happily married and want to keep it that way!
>
> Everything is in pairs too. Actually its a lot cheaper than when it first
> came out, but still
> a lot more than your corner shop DDR-2 stuff.

I don't mean to keep arguing here, but it's not any more expensive
than the same speed DDR-2 667MHz memory. for ECC memory memory,
they're almost the same price.

http://www.crucial.com/store/listparts.aspx?model=PowerEdge%201950

That's memory for my Dell PowerEdge web server, and it's $105.99 for 2
1Gig sticks. $56.99 * 2 = $113.98. It's only 7.99 more. I get the
point about not wanting to anger the wife, but maybe if you ask for it
nice for Christmas? :)

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [NOVICE] absolute novice wanting knowledgeable opinion about front end

On Sun, Sep 28, 2008 at 8:42 PM, Phil <philbaseless-postgres@yahoo.com> wrote:
> Anyway thanks for the replies from everyone.

Here is a link for some MS-Access Lessons that I've learned:

http://www.utteraccess.com/forums/uafavgo.php?Cat=&F_Board=93&Thread=1494943&partnumber=1&postmarker=

--
Regards,
Richard Broersma Jr.

Visit the Los Angeles PostgreSQL Users Group (LAPUG)
http://pugs.postgresql.org/lapug

--
Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice

Re: [PERFORM] Slow updates, poor IO

I've canned the db and got rid my of data.
I'm in the midst of doing some other benchmarking for a possible change
to the bacula database.

Loading up 1M records into a table of 60M records complete with indexes.
It's still going...

--john

Dan Langille wrote:
>
> On Sep 28, 2008, at 10:01 PM, John Huttley wrote:
>
>>
>>
>> Greg Smith wrote:
>>> On Mon, 29 Sep 2008, John Huttley wrote:
>>>
>>>> checkpoint _segments=16 is fine, going to 64 made no improvement.
>>>
>>> You might find that it does *after* increasing shared_buffers. If
>>> the buffer cache is really small, the checkpoints can't have very
>>> much work to do, so their impact on performance is smaller. Once
>>> you've got a couple of hundred MB on there, the per-checkpoint
>>> overhead can be considerable.
>>>
>> Ahh bugger, I've just trashed my test setup.
>
> Pardon? How did you do that?
>

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [NOVICE] absolute novice wanting knowledgeable opinion about front end

This was interesting and the comments in the article about Access's ease of use
being a bain or boon is appropriate. But it made it easy toy to target ourselves
and not have to muck thru a generic db app.

I'm not planning to mix front and back end's.

So far I found report generators and sql builders. Form builders will be more
difficult to find. The ones in MSaccess integrate a lot of their GUI app
features and are very powerful. For example columns can be greyed out or not
depending on content. The forms in Access are often used to make up for it's
lack of data security that would probably be handled by postgres's ACID
compliance. I need to educate myself on ACID compliance and other SQL that is
new and improved over Msaccess spec.

I see I would have to rewrite a lot of Access functions also.

What would be nice is if someone had a sample DB and frontend that mimic's
Access's 'Northwind traders' sample.

Anyway thanks for the replies from everyone.

>> (Anyone want to start putting together a page on wiki.postgresql.org
>> about Access compatibility?)
>
>> regards, tom lane
>
> If it helps we wrote a quick one. I think its already listed on the
> wiki too.
>
> http://www.postgresonline.com/journal/index.php?/archives/24-Using-MS-Access-with-PostgreSQL.html
>
> Hope that helps,
> Regina
>
>
> -----------------------------------------
> The substance of this message, including any attachments, may be
> confidential, legally privileged and/or exempt from disclosure
> pursuant to Massachusetts law. It is intended
> solely for the addressee. If you received this in error, please
> contact the sender and delete the material from any computer.

--
Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice

Re: [PERFORM] Slow updates, poor IO

On Sep 28, 2008, at 10:01 PM, John Huttley wrote:

>
>
> Greg Smith wrote:
>> On Mon, 29 Sep 2008, John Huttley wrote:
>>
>>> checkpoint _segments=16 is fine, going to 64 made no improvement.
>>
>> You might find that it does *after* increasing shared_buffers. If
>> the buffer cache is really small, the checkpoints can't have very
>> much work to do, so their impact on performance is smaller. Once
>> you've got a couple of hundred MB on there, the per-checkpoint
>> overhead can be considerable.
>>
> Ahh bugger, I've just trashed my test setup.

Pardon? How did you do that?

--
Dan Langille
http://langille.org/

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [GENERAL] NULL values seem to short-circuit my unique index

Matthew Wilson wrote:
> I'm trying to comprehend how NULL values interact with unique indexes.
> It seems like I can insert two rows with NULL values in a column with a
> unique constraint just fine.
>
> Is there something special about NULL? Can anyone post some links to
> explain what is going on?
>

http://www.postgresql.org/docs/8.3/interactive/ddl-constraints.html#AEN2058
Last paragraph just above 5.3.4.

> What's the deal with NULL?
>

NULL = NULL is not true, it's null
NULL <> NULL is not false, it's null

It's the normal SQL 3 state logic (true/false/null) with only the true
value from the comparison causing the constraint violation. Think of
the unique constraint check like "does this value equal any other value
already recorded".

klint.

--
Klint Gore
Database Manager
Sheep CRC
A.G.B.U.
University of New England
Armidale NSW 2350

Ph: 02 6773 3789
Fax: 02 6773 3266
EMail: kgore4@une.edu.au

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [PERFORM] Slow updates, poor IO

Ah yess... actually I can get the Kingston stuff locally.
However at the moment I'm happily married and want to keep it that way!

Everything is in pairs too. Actually its a lot cheaper than when it first came out, but still
a lot more than your corner shop DDR-2 stuff.

--John

Scott Marlowe wrote:

On Sun, Sep 28, 2008 at 8:01 PM, John Huttley <John@mib-infotech.co.nz> wrote:

Ahh bugger, I've just trashed my test setup. I've settled on  64Mb shared memory since I've only got 1Gb or RAM and the system impact of 256M is severe. Also it uses FB-DIMMS which cost arm+leg+first born

 http://www.crucial.com/search/searchresults.aspx?keywords=buffered  Fully buffered memory there is $56.99 for a 1 Gig stick.  That's hardly an arm and a leg.  Considering many pgsql DBAs make that in 1 to 3 hours, it's not much at all really.  A lot cheaper than pulling your hair out trying to make a db server run on 1 Gig.

Re: [GENERAL] NULL values seem to short-circuit my unique index

Matthew Wilson <matt@tplus1.com> writes:
> It seems like I can insert two rows with NULL values in a column with a
> unique constraint just fine.

This is per SQL spec and quite well-documented in our manual ...

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] NULL values seem to short-circuit my unique index

When you think of null as "unknown", it makes sense.

Does an unknown value equal another unknown value?

http://www.postgresql.org/docs/8.3/interactive/functions-logical.html

http://www.postgresql.org/docs/8.3/interactive/functions-comparison.html

--
Postgresql & php tutorials
http://www.designmagick.com/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [HACKERS] parallel pg_restore - WIP patch

Index: pg_backup.h
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_backup.h,v
retrieving revision 1.47
diff -c -r1.47 pg_backup.h
*** pg_backup.h 13 Apr 2008 03:49:21 -0000 1.47
--- pg_backup.h 29 Sep 2008 02:43:51 -0000
***************
*** 123,128 ****
--- 123,130 ----
int suppressDumpWarnings; /* Suppress output of WARNING entries
* to stderr */
bool single_txn;
+ int number_of_threads;
+ bool truncate_before_load;

bool *idWanted; /* array showing which dump IDs to emit */
} RestoreOptions;
***************
*** 165,170 ****
--- 167,173 ----
extern void CloseArchive(Archive *AH);

extern void RestoreArchive(Archive *AH, RestoreOptions *ropt);
+ extern void RestoreArchiveParallel(Archive *AH, RestoreOptions *ropt);

/* Open an existing archive */
extern Archive *OpenArchive(const char *FileSpec, const ArchiveFormat fmt);
Index: pg_backup_archiver.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_backup_archiver.c,v
retrieving revision 1.158
diff -c -r1.158 pg_backup_archiver.c
*** pg_backup_archiver.c 5 Sep 2008 23:53:42 -0000 1.158
--- pg_backup_archiver.c 29 Sep 2008 02:43:52 -0000
***************
*** 27,38 ****
--- 27,50 ----

#include <unistd.h>

+ #include <sys/types.h>
+ #include <sys/wait.h>
+
+
#ifdef WIN32
#include <io.h>
#endif

#include "libpq/libpq-fs.h"

+ typedef struct _parallel_slot
+ {
+ pid_t pid;
+ TocEntry *te;
+ DumpId dumpId;
+ } ParallelSlot;
+
+ #define NO_SLOT (-1)

const char *progname;

***************
*** 70,76 ****
--- 82,99 ----
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static OutputContext SetOutput(ArchiveHandle *AH, char *filename, int compression);
static void ResetOutput(ArchiveHandle *AH, OutputContext savedContext);
+ static bool work_is_being_done(ParallelSlot *slot, int n_slots);
+ static int get_next_slot(ParallelSlot *slots, int n_slots);
+ static TocEntry *get_next_work_item(ArchiveHandle *AH);
+ static void prestore(ArchiveHandle *AH, TocEntry *te);
+ static void mark_work_done(ArchiveHandle *AH, pid_t worker, ParallelSlot *slots, int n_slots);
+ static int _restore_one_te(ArchiveHandle *ah, TocEntry *te, RestoreOptions *ropt,bool is_parallel);
+ static void _reduce_dependencies(ArchiveHandle * AH, TocEntry *te);
+ static void _fix_dependency_counts(ArchiveHandle *AH);
+ static void _inhibit_data_for_failed_table(ArchiveHandle *AH, TocEntry * te);
+

+ static ArchiveHandle *GAH;

/*
* Wrapper functions.
***************
*** 125,137 ****

/* Public */
void
RestoreArchive(Archive *AHX, RestoreOptions *ropt)
{
ArchiveHandle *AH = (ArchiveHandle *) AHX;
TocEntry *te;
teReqs reqs;
OutputContext sav;
- bool defnDumped;

AH->ropt = ropt;
AH->stage = STAGE_INITIALIZING;
--- 148,579 ----

/* Public */
void
+ RestoreArchiveParallel(Archive *AHX, RestoreOptions *ropt)
+ {
+
+ ArchiveHandle *AH = (ArchiveHandle *) AHX;
+ ParallelSlot *slots;
+ int next_slot;
+ TocEntry *next_work_item = NULL;
+ int work_status;
+ pid_t ret_child;
+ int n_slots = ropt->number_of_threads;
+ TocEntry *te;
+ teReqs reqs;
+
+
+ /* AH->debugLevel = 99; */
+ /* some routines that use ahlog() don't get passed AH */
+ GAH = AH;
+
+ ahlog(AH,1,"entering RestoreARchiveParallel\n");
+
+
+ slots = (ParallelSlot *) calloc(sizeof(ParallelSlot),n_slots);
+ AH->ropt = ropt;
+
+ /*
+ if (ropt->create)
+ die_horribly(AH,modulename,
+ "parallel restore is incompatible with --create\n");
+ */
+
+
+ if (ropt->dropSchema)
+ die_horribly(AH,modulename,
+ "parallel restore is incompatible with --clean\n");
+
+ if (!ropt->useDB)
+ die_horribly(AH,modulename,
+ "parallel restore requires direct database connection\n");
+
+
+ #ifndef HAVE_LIBZ
+
+ /* make sure we won't need (de)compression we haven't got */
+ if (AH->compression != 0 && AH->PrintTocDataPtr != NULL)
+ {
+ for (te = AH->toc->next; te != AH->toc; te = te->next)
+ {
+ reqs = _tocEntryRequired(te, ropt, false);
+ if (te->hadDumper && (reqs & REQ_DATA) != 0)
+ die_horribly(AH, modulename,
+ "cannot restore from compressed archive (compression not supported in this installation)\n");
+ }
+ }
+ #endif
+
+ ahlog(AH, 1, "connecting to database for restore\n");
+ if (AH->version < K_VERS_1_3)
+ die_horribly(AH, modulename,
+ "direct database connections are not supported in pre-1.3 archives\n");
+
+ /* XXX Should get this from the archive */
+ AHX->minRemoteVersion = 070100;
+ AHX->maxRemoteVersion = 999999;
+
+ /* correct dependency counts in case we're doing a partial restore */
+ if (ropt->idWanted == NULL)
+ InitDummyWantedList(AHX,ropt);
+ _fix_dependency_counts(AH);
+
+ /*
+ * Since we're talking to the DB directly, don't send comments since they
+ * obscure SQL when displaying errors
+ */
+ AH->noTocComments = 1;
+
+ /* Do all the early stuff in a single connection in the parent.
+ * There's no great point in running it in parallel and it will actually
+ * run faster in a single connection because we avoid all the connection
+ * and setup overhead, including the 0.5s sleep below.
+ */
+ ConnectDatabase(AHX, ropt->dbname,
+ ropt->pghost, ropt->pgport, ropt->username,
+ ropt->requirePassword);
+
+
+ /*
+ * Establish important parameter values right away.
+ */
+ _doSetFixedOutputState(AH);
+
+ while((next_work_item = get_next_work_item(AH)) != NULL)
+ {
+ /* XXX need to improve this test in case there is no table data */
+ /* need to test for indexes, FKs, PK, Unique, etc */
+ if(strcmp(next_work_item->desc,"TABLE DATA") == 0)
+ break;
+ (void) _restore_one_te(AH, next_work_item, ropt, false);
+
+ next_work_item->prestored = true;
+
+ _reduce_dependencies(AH,next_work_item);
+ }
+
+
+ /*
+ * now close parent connection in prep for parallel step.
+ */
+ PQfinish(AH->connection);
+ AH->connection = NULL;
+
+ /* blow away any preserved state from the previous connection */
+
+ if (AH->currSchema)
+ free(AH->currSchema);
+ AH->currSchema = strdup("");
+ if (AH->currUser)
+ free(AH->currUser);
+ AH->currUser = strdup("");
+ if (AH->currTablespace)
+ free(AH->currTablespace);
+ AH->currTablespace = NULL;
+ AH->currWithOids = -1;
+
+ /* main parent loop */
+
+ ahlog(AH,1,"entering main loop\n");
+
+ while (((next_work_item = get_next_work_item(AH)) != NULL) ||
+ (work_is_being_done(slots,n_slots)))
+ {
+ if (next_work_item != NULL &&
+ ((next_slot = get_next_slot(slots,n_slots)) != NO_SLOT))
+ {
+ /* there is work still to do and a worker slot available */
+
+ pid_t child;
+
+ next_work_item->prestored = true;
+
+ child = fork();
+ if (child == 0)
+ {
+ prestore(AH,next_work_item);
+ /* should not happen ... we expect prestore to exit */
+ exit(1);
+ }
+ else if (child > 0)
+ {
+ slots[next_slot].pid = child;
+ slots[next_slot].te = next_work_item;
+ slots[next_slot].dumpId = next_work_item->dumpId;
+ }
+ else
+ {
+ /* XXX fork error - handle it! */
+ }
+ /* delay just long enough betweek forks to give the catalog some
+ * breathing space. Without this sleep I got
+ * "tuple concurrently updated" errors.
+ */
+ /* pg_usleep(500000); */
+ continue; /* in case the slots are not yet full */
+ }
+ /* if we get here there must be work being done */
+ ret_child = wait(&work_status);
+
+ if (WIFEXITED(work_status) && WEXITSTATUS(work_status) == 0)
+ {
+ mark_work_done(AH, ret_child, slots, n_slots);
+ }
+ else if (WIFEXITED(work_status) && WEXITSTATUS(work_status) == 1)
+ {
+ int i;
+
+ for (i = 0; i < n_slots; i++)
+ {
+ if (slots[i].pid == ret_child)
+ _inhibit_data_for_failed_table(AH, slots[i].te);
+ break;
+ }
+ mark_work_done(AH, ret_child, slots, n_slots);
+ }
+ else
+ {
+ /* XXX something went wrong - deal with it */
+ }
+ }
+
+ /*
+ * now process the ACLs - no need to do this in parallel
+ */
+
+ /* reconnect from parent */
+ ConnectDatabase(AHX, ropt->dbname,
+ ropt->pghost, ropt->pgport, ropt->username,
+ ropt->requirePassword);
+
+ /*
+ * Scan TOC to output ownership commands and ACLs
+ */
+ for (te = AH->toc->next; te != AH->toc; te = te->next)
+ {
+ AH->currentTE = te;
+
+ /* Work out what, if anything, we want from this entry */
+ reqs = _tocEntryRequired(te, ropt, true);
+
+ if ((reqs & REQ_SCHEMA) != 0) /* We want the schema */
+ {
+ ahlog(AH, 1, "setting owner and privileges for %s %s\n",
+ te->desc, te->tag);
+ _printTocEntry(AH, te, ropt, false, true);
+ }
+ }
+
+ /* clean up */
+ PQfinish(AH->connection);
+ AH->connection = NULL;
+
+ }
+
+ static bool
+ work_is_being_done(ParallelSlot *slot, int n_slots)
+ {
+ ahlog(GAH,1,"is work being done?\n");
+ while(n_slots--)
+ {
+ if (slot->pid > 0)
+ return true;
+ slot++;
+ }
+ ahlog(GAH,1,"work is not being done\n");
+ return false;
+ }
+
+ static int
+ get_next_slot(ParallelSlot *slots, int n_slots)
+ {
+ int i;
+
+ for (i = 0; i < n_slots; i++)
+ {
+ if (slots[i].pid == 0)
+ {
+ ahlog(GAH,1,"available slots is %d\n",i);
+ return i;
+ }
+ }
+ ahlog(GAH,1,"No slot available\n");
+ return NO_SLOT;
+ }
+
+ static TocEntry*
+ get_next_work_item(ArchiveHandle *AH)
+ {
+ TocEntry *te;
+ teReqs reqs;
+
+ /* just search from the top of the queue until we find an available item.
+ * Note that the queue isn't reordered in the current implementation. If
+ * we ever do reorder it, then certain code that processes entries from the
+ * current item to the end of the queue will probably need to be
+ * re-examined.
+ */
+
+ for (te = AH->toc->next; te != AH->toc; te = te->next)
+ {
+ if (!te->prestored && te->depCount < 1)
+ {
+ /* make sure it's not an ACL */
+ reqs = _tocEntryRequired (te, AH->ropt, false);
+ if ((reqs & (REQ_SCHEMA | REQ_DATA)) != 0)
+ {
+ ahlog(AH,1,"next item is %d\n",te->dumpId);
+ return te;
+ }
+ }
+ }
+ ahlog(AH,1,"No item ready\n");
+ return NULL;
+ }
+
+ static void
+ prestore(ArchiveHandle *AH, TocEntry *te)
+ {
+ RestoreOptions *ropt = AH->ropt;
+ int retval;
+
+ /* close and reopen the archive so we have a private copy that doesn't
+ * stomp on anyone else's file pointer
+ */
+
+ (AH->ReopenPtr)(AH);
+
+ ConnectDatabase((Archive *)AH, ropt->dbname,
+ ropt->pghost, ropt->pgport, ropt->username,
+ ropt->requirePassword);
+
+ /*
+ * Establish important parameter values right away.
+ */
+ _doSetFixedOutputState(AH);
+
+ retval = _restore_one_te(AH, te, ropt, true);
+
+ PQfinish(AH->connection);
+ exit(retval);
+
+ }
+
+ static void
+ mark_work_done(ArchiveHandle *AH, pid_t worker,
+ ParallelSlot *slots, int n_slots)
+ {
+
+ TocEntry *te = NULL;
+ int i;
+
+ for (i = 0; i < n_slots; i++)
+ {
+ if (slots[i].pid == worker)
+ {
+ te = slots[i].te;
+ slots[i].pid = 0;
+ slots[i].te = NULL;
+ slots[i].dumpId = 0;
+ break;
+ }
+ }
+
+ /* Assert (te != NULL); */
+
+ _reduce_dependencies(AH,te);
+
+
+ }
+
+
+ /*
+ * Make sure the head of each dependency chain is a live item
+ *
+ * Once this is established the property will be maintained by
+ * _reduce_dependencies called as items are done.
+ */
+ static void
+ _fix_dependency_counts(ArchiveHandle *AH)
+ {
+ TocEntry * te;
+ RestoreOptions * ropt = AH->ropt;
+ bool * RealDumpIds;
+ int i;
+
+
+ RealDumpIds = calloc(AH->maxDumpId, sizeof(bool));
+ for (te = AH->toc->next; te != AH->toc; te = te->next)
+ {
+ RealDumpIds[te->dumpId-1] = true;
+ if (te->depCount == 0 && ! ropt->idWanted[te->dumpId -1])
+ _reduce_dependencies(AH,te);
+ }
+
+ /*
+ * It is possible that the dependencies list items that are
+ * not in the archive at all. Reduce the depcounts so those get
+ * ignored.
+ */
+ for (te = AH->toc->next; te != AH->toc; te = te->next)
+ for (i = 0; i < te->nDeps; i++)
+ if (!RealDumpIds[te->dependencies[i]-1])
+ te->depCount--;
+ }
+
+ static void
+ _reduce_dependencies(ArchiveHandle * AH, TocEntry *te)
+ {
+ DumpId item = te->dumpId;
+ RestoreOptions * ropt = AH->ropt;
+ int i;
+
+ for (te = te->next; te != AH->toc; te = te->next)
+ {
+ if (te->nDeps == 0)
+ continue;
+
+ for (i = 0; i < te->nDeps; i++)
+ if (te->dependencies[i] == item)
+ te->depCount = te->depCount - 1;
+
+ /* If this is a table data item we are making available,
+ * make the table's dependencies depend on this item instead of
+ * the table definition, so they
+ * don't get scheduled until the data is loaded.
+ * Have to do this now before the main loop gets to anything
+ * further down the list.
+ */
+ if (te->depCount == 0 && strcmp(te->desc,"TABLEDATA") == 0)
+ {
+ TocEntry *tes;
+ int j;
+ for (tes = te->next; tes != AH->toc; tes = tes->next)
+ for (j = 0; j < tes->nDeps; j++)
+ if (tes->dependencies[j] == item)
+ tes->dependencies[j] = te->dumpId;
+ }
+
+ /*
+ * If this item won't in fact be done, and is now at
+ * 0 dependency count, we pretend it's been done and
+ * reduce the dependency counts of all the things that
+ * depend on it, by a recursive call
+ */
+ if (te->depCount == 0 && ! ropt->idWanted[te->dumpId -1])
+ _reduce_dependencies(AH,te);
+ }
+
+ }
+
+
+ /* Public */
+ void
RestoreArchive(Archive *AHX, RestoreOptions *ropt)
{
ArchiveHandle *AH = (ArchiveHandle *) AHX;
TocEntry *te;
teReqs reqs;
OutputContext sav;

AH->ropt = ropt;
AH->stage = STAGE_INITIALIZING;
***************
*** 171,176 ****
--- 613,632 ----
AH->noTocComments = 1;
}

+ #ifndef HAVE_LIBZ
+
+ /* make sure we won't need (de)compression we haven't got */
+ if (AH->compression != 0 && AH->PrintTocDataPtr != NULL)
+ {
+ for (te = AH->toc->next; te != AH->toc; te = te->next)
+ {
+ reqs = _tocEntryRequired(te, ropt, false);
+ if (te->hadDumper && (reqs & REQ_DATA) != 0)
+ die_horribly(AH, modulename, "cannot restore from compressed archive (compression not supported in this installation)\n");
+ }
+ }
+ #endif
+
/*
* Work out if we have an implied data-only restore. This can happen if
* the dump was data only or if the user has used a toc list to exclude
***************
*** 270,409 ****
*/
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
! AH->currentTE = te;
!
! /* Work out what, if anything, we want from this entry */
! reqs = _tocEntryRequired(te, ropt, false);
!
! /* Dump any relevant dump warnings to stderr */
! if (!ropt->suppressDumpWarnings && strcmp(te->desc, "WARNING") == 0)
! {
! if (!ropt->dataOnly && te->defn != NULL && strlen(te->defn) != 0)
! write_msg(modulename, "warning from original dump file: %s\n", te->defn);
! else if (te->copyStmt != NULL && strlen(te->copyStmt) != 0)
! write_msg(modulename, "warning from original dump file: %s\n", te->copyStmt);
! }
!
! defnDumped = false;
!
! if ((reqs & REQ_SCHEMA) != 0) /* We want the schema */
! {
! ahlog(AH, 1, "creating %s %s\n", te->desc, te->tag);
!
! _printTocEntry(AH, te, ropt, false, false);
! defnDumped = true;
!
! /*
! * If we could not create a table and --no-data-for-failed-tables
! * was given, ignore the corresponding TABLE DATA
! */
! if (ropt->noDataForFailedTables &&
! AH->lastErrorTE == te &&
! strcmp(te->desc, "TABLE") == 0)
! {
! TocEntry *tes;
!
! ahlog(AH, 1, "table \"%s\" could not be created, will not restore its data\n",
! te->tag);
!
! for (tes = te->next; tes != AH->toc; tes = tes->next)
! {
! if (strcmp(tes->desc, "TABLE DATA") == 0 &&
! strcmp(tes->tag, te->tag) == 0 &&
! strcmp(tes->namespace ? tes->namespace : "",
! te->namespace ? te->namespace : "") == 0)
! {
! /* mark it unwanted */
! ropt->idWanted[tes->dumpId - 1] = false;
! break;
! }
! }
! }
!
! /* If we created a DB, connect to it... */
! if (strcmp(te->desc, "DATABASE") == 0)
! {
! ahlog(AH, 1, "connecting to new database \"%s\"\n", te->tag);
! _reconnectToDB(AH, te->tag);
! }
! }
!
! /*
! * If we have a data component, then process it
! */
! if ((reqs & REQ_DATA) != 0)
! {
! /*
! * hadDumper will be set if there is genuine data component for
! * this node. Otherwise, we need to check the defn field for
! * statements that need to be executed in data-only restores.
! */
! if (te->hadDumper)
! {
! /*
! * If we can output the data, then restore it.
! */
! if (AH->PrintTocDataPtr !=NULL && (reqs & REQ_DATA) != 0)
! {
! #ifndef HAVE_LIBZ
! if (AH->compression != 0)
! die_horribly(AH, modulename, "cannot restore from compressed archive (compression not supported in this installation)\n");
! #endif
!
! _printTocEntry(AH, te, ropt, true, false);
!
! if (strcmp(te->desc, "BLOBS") == 0 ||
! strcmp(te->desc, "BLOB COMMENTS") == 0)
! {
! ahlog(AH, 1, "restoring %s\n", te->desc);
!
! _selectOutputSchema(AH, "pg_catalog");
!
! (*AH->PrintTocDataPtr) (AH, te, ropt);
! }
! else
! {
! _disableTriggersIfNecessary(AH, te, ropt);
!
! /* Select owner and schema as necessary */
! _becomeOwner(AH, te);
! _selectOutputSchema(AH, te->namespace);
!
! ahlog(AH, 1, "restoring data for table \"%s\"\n",
! te->tag);
!
! /*
! * If we have a copy statement, use it. As of V1.3,
! * these are separate to allow easy import from
! * withing a database connection. Pre 1.3 archives can
! * not use DB connections and are sent to output only.
! *
! * For V1.3+, the table data MUST have a copy
! * statement so that we can go into appropriate mode
! * with libpq.
! */
! if (te->copyStmt && strlen(te->copyStmt) > 0)
! {
! ahprintf(AH, "%s", te->copyStmt);
! AH->writingCopyData = true;
! }
!
! (*AH->PrintTocDataPtr) (AH, te, ropt);
!
! AH->writingCopyData = false;
!
! _enableTriggersIfNecessary(AH, te, ropt);
! }
! }
! }
! else if (!defnDumped)
! {
! /* If we haven't already dumped the defn part, do so now */
! ahlog(AH, 1, "executing %s %s\n", te->desc, te->tag);
! _printTocEntry(AH, te, ropt, false, false);
! }
! }
! } /* end loop over TOC entries */

/*
* Scan TOC again to output ownership commands and ACLs
--- 726,733 ----
*/
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
! (void) _restore_one_te(AH, te, ropt, false);
! }

/*
* Scan TOC again to output ownership commands and ACLs
***************
*** 451,456 ****
--- 775,955 ----
}
}

+ static int
+ _restore_one_te(ArchiveHandle *AH, TocEntry *te,
+ RestoreOptions *ropt, bool is_parallel)
+ {
+ teReqs reqs;
+ bool defnDumped;
+ int retval = 0;
+
+ AH->currentTE = te;
+
+ /* Work out what, if anything, we want from this entry */
+ reqs = _tocEntryRequired(te, ropt, false);
+
+ /* Dump any relevant dump warnings to stderr */
+ if (!ropt->suppressDumpWarnings && strcmp(te->desc, "WARNING") == 0)
+ {
+ if (!ropt->dataOnly && te->defn != NULL && strlen(te->defn) != 0)
+ write_msg(modulename, "warning from original dump file: %s\n", te->defn);
+ else if (te->copyStmt != NULL && strlen(te->copyStmt) != 0)
+ write_msg(modulename, "warning from original dump file: %s\n", te->copyStmt);
+ }
+
+ defnDumped = false;
+
+ if ((reqs & REQ_SCHEMA) != 0) /* We want the schema */
+ {
+ ahlog(AH, 1, "creating %s %s\n", te->desc, te->tag);
+
+ _printTocEntry(AH, te, ropt, false, false);
+ defnDumped = true;
+
+ /*
+ * If we could not create a table and --no-data-for-failed-tables
+ * was given, ignore the corresponding TABLE DATA
+ *
+ * For the parallel case this must be done in the parent, so we just
+ * set a return value.
+ */
+ if (ropt->noDataForFailedTables &&
+ AH->lastErrorTE == te &&
+ strcmp(te->desc, "TABLE") == 0)
+ {
+ if (is_parallel)
+ retval = 1;
+ else
+ _inhibit_data_for_failed_table(AH,te);
+ }
+
+ /* If we created a DB, connect to it... */
+ /* won't happen in parallel restore */
+ if (strcmp(te->desc, "DATABASE") == 0)
+ {
+ ahlog(AH, 1, "connecting to new database \"%s\"\n", te->tag);
+ _reconnectToDB(AH, te->tag);
+ }
+ }
+
+ /*
+ * If we have a data component, then process it
+ */
+ if ((reqs & REQ_DATA) != 0)
+ {
+ /*
+ * hadDumper will be set if there is genuine data component for
+ * this node. Otherwise, we need to check the defn field for
+ * statements that need to be executed in data-only restores.
+ */
+ if (te->hadDumper)
+ {
+ /*
+ * If we can output the data, then restore it.
+ */
+ if (AH->PrintTocDataPtr !=NULL && (reqs & REQ_DATA) != 0)
+ {
+ _printTocEntry(AH, te, ropt, true, false);
+
+ if (strcmp(te->desc, "BLOBS") == 0 ||
+ strcmp(te->desc, "BLOB COMMENTS") == 0)
+ {
+ ahlog(AH, 1, "restoring %s\n", te->desc);
+
+ _selectOutputSchema(AH, "pg_catalog");
+
+ (*AH->PrintTocDataPtr) (AH, te, ropt);
+ }
+ else
+ {
+ _disableTriggersIfNecessary(AH, te, ropt);
+
+ /* Select owner and schema as necessary */
+ _becomeOwner(AH, te);
+ _selectOutputSchema(AH, te->namespace);
+
+ ahlog(AH, 1, "restoring data for table \"%s\"\n",
+ te->tag);
+
+ if (ropt->truncate_before_load)
+ {
+ if (AH->connection)
+ StartTransaction(AH);
+ else
+ ahprintf(AH, "BEGIN;\n\n");
+
+ ahprintf(AH, "TRUNCATE TABLE %s;\n\n",
+ fmtId(te->tag)); }
+
+ /*
+ * If we have a copy statement, use it. As of V1.3,
+ * these are separate to allow easy import from
+ * withing a database connection. Pre 1.3 archives can
+ * not use DB connections and are sent to output only.
+ *
+ * For V1.3+, the table data MUST have a copy
+ * statement so that we can go into appropriate mode
+ * with libpq.
+ */
+ if (te->copyStmt && strlen(te->copyStmt) > 0)
+ {
+ ahprintf(AH, "%s", te->copyStmt);
+ AH->writingCopyData = true;
+ }
+
+ (*AH->PrintTocDataPtr) (AH, te, ropt);
+
+ AH->writingCopyData = false;
+
+ if (ropt->truncate_before_load)
+ {
+ if (AH->connection)
+ CommitTransaction(AH);
+ else
+ ahprintf(AH, "COMMIT;\n\n");
+ }
+
+
+ _enableTriggersIfNecessary(AH, te, ropt);
+ }
+ }
+ }
+ else if (!defnDumped)
+ {
+ /* If we haven't already dumped the defn part, do so now */
+ ahlog(AH, 1, "executing %s %s\n", te->desc, te->tag);
+ _printTocEntry(AH, te, ropt, false, false);
+ }
+ }
+
+ return retval;
+ }
+
+ static void
+ _inhibit_data_for_failed_table(ArchiveHandle *AH, TocEntry * te)
+ {
+ TocEntry *tes;
+ RestoreOptions *ropt = AH->ropt;
+
+ ahlog(AH, 1, "table \"%s\" could not be created, will not restore its data\n",
+ te->tag);
+
+ for (tes = te->next; tes != AH->toc; tes = tes->next)
+ {
+ if (strcmp(tes->desc, "TABLE DATA") == 0 &&
+ strcmp(tes->tag, te->tag) == 0 &&
+ strcmp(tes->namespace ? tes->namespace : "",
+ te->namespace ? te->namespace : "") == 0)
+ {
+ /* mark it unwanted */
+ ropt->idWanted[tes->dumpId - 1] = false;
+
+ _reduce_dependencies(AH, tes);
+ break;
+ }
+ }
+ }
+
/*
* Allocate a new RestoreOptions block.
* This is mainly so we can initialize it, but also for future expansion,
***************
*** 653,662 ****
while (te != AH->toc)
{
if (_tocEntryRequired(te, ropt, true) != 0)
! ahprintf(AH, "%d; %u %u %s %s %s %s\n", te->dumpId,
te->catalogId.tableoid, te->catalogId.oid,
te->desc, te->namespace ? te->namespace : "-",
te->tag, te->owner);
te = te->next;
}

--- 1152,1167 ----
while (te != AH->toc)
{
if (_tocEntryRequired(te, ropt, true) != 0)
! {
! int i;
! ahprintf(AH, "%d;[%d: ",te->dumpId, te->nDeps);
! for (i=0 ;i<te->nDeps; i++)
! ahprintf(AH, "%d ",te->dependencies[i]);
! ahprintf(AH, "] %u %u %s %s %s %s\n",
te->catalogId.tableoid, te->catalogId.oid,
te->desc, te->namespace ? te->namespace : "-",
te->tag, te->owner);
+ }
te = te->next;
}

***************
*** 1948,1965 ****
--- 2453,2473 ----
deps = (DumpId *) realloc(deps, sizeof(DumpId) * depIdx);
te->dependencies = deps;
te->nDeps = depIdx;
+ te->depCount = depIdx;
}
else
{
free(deps);
te->dependencies = NULL;
te->nDeps = 0;
+ te->depCount = 0;
}
}
else
{
te->dependencies = NULL;
te->nDeps = 0;
+ te->depCount = 0;
}

if (AH->ReadExtraTocPtr)
Index: pg_backup_archiver.h
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_backup_archiver.h,v
retrieving revision 1.76
diff -c -r1.76 pg_backup_archiver.h
*** pg_backup_archiver.h 7 Nov 2007 12:24:24 -0000 1.76
--- pg_backup_archiver.h 29 Sep 2008 02:43:52 -0000
***************
*** 99,104 ****
--- 99,105 ----
struct _restoreList;

typedef void (*ClosePtr) (struct _archiveHandle * AH);
+ typedef void (*ReopenPtr) (struct _archiveHandle * AH);
typedef void (*ArchiveEntryPtr) (struct _archiveHandle * AH, struct _tocEntry * te);

typedef void (*StartDataPtr) (struct _archiveHandle * AH, struct _tocEntry * te);
***************
*** 212,217 ****
--- 213,219 ----
WriteBufPtr WriteBufPtr; /* Write a buffer of output to the archive */
ReadBufPtr ReadBufPtr; /* Read a buffer of input from the archive */
ClosePtr ClosePtr; /* Close the archive */
+ ReopenPtr ReopenPtr; /* Reopen the archive */
WriteExtraTocPtr WriteExtraTocPtr; /* Write extra TOC entry data
* associated with the current archive
* format */
***************
*** 231,236 ****
--- 233,239 ----
char *archdbname; /* DB name *read* from archive */
bool requirePassword;
PGconn *connection;
+ char *cachepw;
int connectToDB; /* Flag to indicate if direct DB connection is
* required */
bool writingCopyData; /* True when we are sending COPY data */
***************
*** 284,289 ****
--- 287,293 ----
DumpId dumpId;
bool hadDumper; /* Archiver was passed a dumper routine (used
* in restore) */
+ bool prestored; /* keep track of parallel restore */
char *tag; /* index tag */
char *namespace; /* null or empty string if not in a schema */
char *tablespace; /* null if not in a tablespace; empty string
***************
*** 296,301 ****
--- 300,306 ----
char *copyStmt;
DumpId *dependencies; /* dumpIds of objects this one depends on */
int nDeps; /* number of dependencies */
+ int depCount; /* adjustable tally of dependencies */

DataDumperPtr dataDumper; /* Routine to dump data for object */
void *dataDumperArg; /* Arg for above routine */
Index: pg_backup_custom.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_backup_custom.c,v
retrieving revision 1.40
diff -c -r1.40 pg_backup_custom.c
*** pg_backup_custom.c 28 Oct 2007 21:55:52 -0000 1.40
--- pg_backup_custom.c 29 Sep 2008 02:43:52 -0000
***************
*** 40,45 ****
--- 40,46 ----
static size_t _WriteBuf(ArchiveHandle *AH, const void *buf, size_t len);
static size_t _ReadBuf(ArchiveHandle *AH, void *buf, size_t len);
static void _CloseArchive(ArchiveHandle *AH);
+ static void _ReopenArchive(ArchiveHandle *AH);
static void _PrintTocData(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt);
static void _WriteExtraToc(ArchiveHandle *AH, TocEntry *te);
static void _ReadExtraToc(ArchiveHandle *AH, TocEntry *te);
***************
*** 120,125 ****
--- 121,127 ----
AH->WriteBufPtr = _WriteBuf;
AH->ReadBufPtr = _ReadBuf;
AH->ClosePtr = _CloseArchive;
+ AH->ReopenPtr = _ReopenArchive;
AH->PrintTocDataPtr = _PrintTocData;
AH->ReadExtraTocPtr = _ReadExtraToc;
AH->WriteExtraTocPtr = _WriteExtraToc;
***************
*** 835,840 ****
--- 837,879 ----
AH->FH = NULL;
}

+ static void
+ _ReopenArchive(ArchiveHandle *AH)
+ {
+ lclContext *ctx = (lclContext *) AH->formatData;
+ pgoff_t tpos;
+
+ if (AH->mode == archModeWrite)
+ {
+ die_horribly(AH,modulename,"Can only reopen input archives");
+ }
+ else if ((! AH->fSpec) || strcmp(AH->fSpec, "") == 0)
+ {
+ die_horribly(AH,modulename,"Cannot reopen stdin");
+ }
+
+ tpos = ftello(AH->FH);
+
+ if (fclose(AH->FH) != 0)
+ die_horribly(AH, modulename, "could not close archive file: %s\n",
+ strerror(errno));
+
+ AH->FH = fopen(AH->fSpec, PG_BINARY_R);
+ if (!AH->FH)
+ die_horribly(AH, modulename, "could not open input file \"%s\": %s\n",
+ AH->fSpec, strerror(errno));
+
+ if (ctx->hasSeek)
+ {
+ fseeko(AH->FH, tpos, SEEK_SET);
+ }
+ else
+ {
+ die_horribly(AH,modulename,"cannot reopen non-seekable file");
+ }
+
+ }
+
/*--------------------------------------------------
* END OF FORMAT CALLBACKS
*--------------------------------------------------
Index: pg_backup_db.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_backup_db.c,v
retrieving revision 1.80
diff -c -r1.80 pg_backup_db.c
*** pg_backup_db.c 16 Aug 2008 02:25:06 -0000 1.80
--- pg_backup_db.c 29 Sep 2008 02:43:52 -0000
***************
*** 138,148 ****

ahlog(AH, 1, "connecting to database \"%s\" as user \"%s\"\n", newdb, newuser);

! if (AH->requirePassword)
{
password = simple_prompt("Password: ", 100, false);
if (password == NULL)
die_horribly(AH, modulename, "out of memory\n");
}

do
--- 138,153 ----

ahlog(AH, 1, "connecting to database \"%s\" as user \"%s\"\n", newdb, newuser);

! if (AH->requirePassword && AH->cachepw == NULL)
{
password = simple_prompt("Password: ", 100, false);
if (password == NULL)
die_horribly(AH, modulename, "out of memory\n");
+ AH->requirePassword = true;
+ }
+ else if (AH->requirePassword)
+ {
+ password = AH->cachepw;
}

do
***************
*** 174,180 ****
}
} while (new_pass);

! if (password)
free(password);

/* check for version mismatch */
--- 179,185 ----
}
} while (new_pass);

! if (password != AH->cachepw)
free(password);

/* check for version mismatch */
***************
*** 206,220 ****
if (AH->connection)
die_horribly(AH, modulename, "already connected to a database\n");

! if (reqPwd)
{
password = simple_prompt("Password: ", 100, false);
if (password == NULL)
die_horribly(AH, modulename, "out of memory\n");
AH->requirePassword = true;
}
else
AH->requirePassword = false;

/*
* Start the connection. Loop until we have a password if requested by
--- 211,231 ----
if (AH->connection)
die_horribly(AH, modulename, "already connected to a database\n");

! if (reqPwd && AH->cachepw == NULL)
{
password = simple_prompt("Password: ", 100, false);
if (password == NULL)
die_horribly(AH, modulename, "out of memory\n");
AH->requirePassword = true;
}
+ else if (reqPwd)
+ {
+ password = AH->cachepw;
+ }
else
+ {
AH->requirePassword = false;
+ }

/*
* Start the connection. Loop until we have a password if requested by
***************
*** 241,247 ****
} while (new_pass);

if (password)
! free(password);

/* check to see that the backend connection was successfully made */
if (PQstatus(AH->connection) == CONNECTION_BAD)
--- 252,258 ----
} while (new_pass);

if (password)
! AH->cachepw = password;

/* check to see that the backend connection was successfully made */
if (PQstatus(AH->connection) == CONNECTION_BAD)
Index: pg_backup_files.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_backup_files.c,v
retrieving revision 1.34
diff -c -r1.34 pg_backup_files.c
*** pg_backup_files.c 28 Oct 2007 21:55:52 -0000 1.34
--- pg_backup_files.c 29 Sep 2008 02:43:52 -0000
***************
*** 87,92 ****
--- 87,93 ----
AH->WriteBufPtr = _WriteBuf;
AH->ReadBufPtr = _ReadBuf;
AH->ClosePtr = _CloseArchive;
+ AH->ReopenPtr = NULL;
AH->PrintTocDataPtr = _PrintTocData;
AH->ReadExtraTocPtr = _ReadExtraToc;
AH->WriteExtraTocPtr = _WriteExtraToc;
Index: pg_backup_tar.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_backup_tar.c,v
retrieving revision 1.62
diff -c -r1.62 pg_backup_tar.c
*** pg_backup_tar.c 15 Nov 2007 21:14:41 -0000 1.62
--- pg_backup_tar.c 29 Sep 2008 02:43:52 -0000
***************
*** 143,148 ****
--- 143,149 ----
AH->WriteBufPtr = _WriteBuf;
AH->ReadBufPtr = _ReadBuf;
AH->ClosePtr = _CloseArchive;
+ AH->ReopenPtr = NULL;
AH->PrintTocDataPtr = _PrintTocData;
AH->ReadExtraTocPtr = _ReadExtraToc;
AH->WriteExtraTocPtr = _WriteExtraToc;
Index: pg_restore.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_restore.c,v
retrieving revision 1.88
diff -c -r1.88 pg_restore.c
*** pg_restore.c 13 Apr 2008 03:49:22 -0000 1.88
--- pg_restore.c 29 Sep 2008 02:43:52 -0000
***************
*** 78,83 ****
--- 78,84 ----
static int no_data_for_failed_tables = 0;
static int outputNoTablespaces = 0;
static int use_setsessauth = 0;
+ static int truncate_before_load = 0;

struct option cmdopts[] = {
{"clean", 0, NULL, 'c'},
***************
*** 92,97 ****
--- 93,99 ----
{"ignore-version", 0, NULL, 'i'},
{"index", 1, NULL, 'I'},
{"list", 0, NULL, 'l'},
+ {"multi-thread",1,NULL,'m'},
{"no-privileges", 0, NULL, 'x'},
{"no-acl", 0, NULL, 'x'},
{"no-owner", 0, NULL, 'O'},
***************
*** 114,119 ****
--- 116,122 ----
{"disable-triggers", no_argument, &disable_triggers, 1},
{"no-data-for-failed-tables", no_argument, &no_data_for_failed_tables, 1},
{"no-tablespaces", no_argument, &outputNoTablespaces, 1},
+ {"truncate-before-load", no_argument, &truncate_before_load, 1},
{"use-set-session-authorization", no_argument, &use_setsessauth, 1},

{NULL, 0, NULL, 0}
***************
*** 139,145 ****
}
}

! while ((c = getopt_long(argc, argv, "acCd:ef:F:h:iI:lL:n:Op:P:RsS:t:T:U:vWxX:1",
cmdopts, NULL)) != -1)
{
switch (c)
--- 142,148 ----
}
}

! while ((c = getopt_long(argc, argv, "acCd:ef:F:h:iI:lL:m:n:Op:P:RsS:t:T:U:vWxX:1",
cmdopts, NULL)) != -1)
{
switch (c)
***************
*** 182,187 ****
--- 185,194 ----
opts->tocFile = strdup(optarg);
break;

+ case 'm':
+ opts->number_of_threads = atoi(optarg); /* XXX fix error checking */
+ break;
+
case 'n': /* Dump data for this schema only */
opts->schemaNames = strdup(optarg);
break;
***************
*** 262,268 ****
break;

case 0:
! /* This covers the long options equivalent to -X xxx. */
break;

case '1': /* Restore data in a single transaction */
--- 269,278 ----
break;

case 0:
! /*
! * This covers the long options without a short equivalent,
! * including those equivalent to -X xxx.
! */
break;

case '1': /* Restore data in a single transaction */
***************
*** 299,304 ****
--- 309,329 ----
opts->noDataForFailedTables = no_data_for_failed_tables;
opts->noTablespace = outputNoTablespaces;
opts->use_setsessauth = use_setsessauth;
+ opts->truncate_before_load = truncate_before_load;
+
+ if (opts->single_txn)
+ {
+ if (opts->number_of_threads > 1)
+ {
+ write_msg(NULL, "single transaction not compatible with multi-threading");
+ exit(1);
+ }
+ else if (opts->truncate_before_load)
+ {
+ write_msg(NULL, "single transaction not compatible with truncate-before-load");
+ exit(1);
+ }
+ }

if (opts->formatName)
{
***************
*** 330,335 ****
--- 355,362 ----

AH = OpenArchive(inputFileSpec, opts->format);

+ /* XXX looks like we'll have to do sanity checks in the parallel archiver */
+
/* Let the archiver know how noisy to be */
AH->verbose = opts->verbose;

***************
*** 351,356 ****
--- 378,385 ----

if (opts->tocSummary)
PrintTOCSummary(AH, opts);
+ else if (opts->number_of_threads > 1)
+ RestoreArchiveParallel(AH, opts);
else
RestoreArchive(AH, opts);

Andrew Dunstan wrote:
>
>
>>
>> this works better but there is something fishy still - using the same
>> dump file I get a proper restore using pg_restore normally. If I
>> however use -m for a parallel one I only get parts (in this case only
>> 243 of the 709 tables) of the database restored ...
>>
>>
>>
>
> Yes, there are several funny things going on, including some stuff
> with dependencies. I'll have a new patch tomorrow with luck. Thanks
> for testing.
>
>

OK, in this version a whole heap of bugs are fixed, mainly those to do
with dependencies and saved state. I get identical row counts in the
source and destination now, quite reliably.

cheers

andrew

[GENERAL] NULL values seem to short-circuit my unique index

I'm trying to comprehend how NULL values interact with unique indexes.
It seems like I can insert two rows with NULL values in a column with a
unique constraint just fine.

Is there something special about NULL? Can anyone post some links to
explain what is going on?

Here's the example that stumped me:

I created an organization table and a category table:

matt=# \d category
Table "public.category"
Column | Type | Modifiers
--------------------+---------+-------------------------------------------------------
organization_id | integer |
id | integer | not null default nextval('category_id_seq'::regclass)
name | text |
parent_category_id | integer |
Indexes:
"category_pkey" PRIMARY KEY, btree (id)
"nodup_categories" UNIQUE, btree (organization_id, name, parent_category_id)
Foreign-key constraints:
"category_organization_id_fkey" FOREIGN KEY (organization_id) REFERENCES organization(id)
"category_parent_category_id_fkey" FOREIGN KEY (parent_category_id) REFERENCES category(id)

I thought that nodup_categories index would prevent me from putting in
these values, but I was wrong:

matt=# insert into category (organization_id, name) values (1, 'bogus');
INSERT 0 1
matt=# insert into category (organization_id, name) values (1, 'bogus');
INSERT 0 1
matt=# insert into category (organization_id, name) values (1, 'bogus');
INSERT 0 1
matt=# select * from category;
organization_id | id | name | parent_category_id
-----------------+----+-------+--------------------
1 | 1 | bogus |
1 | 2 | bogus |
1 | 3 | bogus |
(3 rows)

So, obviously there's something I'm missing. Clearly an index exists.

What's the deal with NULL?

I think I'll use some other value besides NULL to indicate categories
with parents. Then I would need to figure out how to handle the FK
constraint on the parent_category_id column.

Matt

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[HACKERS] Operation needed for datfrozenxid bug?

Hello,

The datfrozenxid bug is fixed in 8.3.4,
but do I need additional operations after upgrade postgres binaries?

| Fix potential miscalculation of datfrozenxid (Alvaro)
| This error may explain some recent reports of failure
| to remove old pg_clog data.

I assume that pg_database.datfrozenxid is wrongly set to 3
(FirstNormalTransactionId) if the bug has occurred.

1. Can I use the following SQL to check whether the bug has occurred?
SELECT * FROM pg_database WHERE datfrozenxid = 3
(will return some rows under the bug.)

2. What is needed to fix the corrupted system catalog,
something like the following SQL?
UPDATE pg_database SET datfrozenxid = relfrozenxid
FROM (SELECT relfrozenxid FROM pg_class
WHERE relkind IN ('r', 't')
ORDER BY age(relfrozenxid) DESC LIMIT 1) AS tmp
WHERE datname = current_database();

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [NOVICE] absolute novice wanting knowledgeable opinion about front end

> (Anyone want to start putting together a page on wiki.postgresql.org
> about Access compatibility?)

> regards, tom lane

If it helps we wrote a quick one. I think its already listed on the wiki too.

http://www.postgresonline.com/journal/index.php?/archives/24-Using-MS-Access-with-PostgreSQL.html

Hope that helps,
Regina

The substance of this message, including any attachments, may be confidential, legally privileged and/or exempt from disclosure pursuant to Massachusetts law. It is intended solely for the addressee. If you received this in error, please contact the sender and delete the material from any computer.

Re: [PERFORM] Slow updates, poor IO

On Sun, Sep 28, 2008 at 8:01 PM, John Huttley <John@mib-infotech.co.nz> wrote:
> Ahh bugger, I've just trashed my test setup.
> I've settled on 64Mb shared memory since I've only got 1Gb or RAM and the
> system impact of 256M is severe.
> Also it uses FB-DIMMS which cost arm+leg+first born

http://www.crucial.com/search/searchresults.aspx?keywords=buffered

Fully buffered memory there is $56.99 for a 1 Gig stick. That's
hardly an arm and a leg. Considering many pgsql DBAs make that in 1
to 3 hours, it's not much at all really. A lot cheaper than pulling
your hair out trying to make a db server run on 1 Gig.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Slow updates, poor IO

Greg Smith wrote:
> On Mon, 29 Sep 2008, John Huttley wrote:
>
>> checkpoint _segments=16 is fine, going to 64 made no improvement.
>
> You might find that it does *after* increasing shared_buffers. If the
> buffer cache is really small, the checkpoints can't have very much
> work to do, so their impact on performance is smaller. Once you've
> got a couple of hundred MB on there, the per-checkpoint overhead can
> be considerable.
>
Ahh bugger, I've just trashed my test setup.
I've settled on 64Mb shared memory since I've only got 1Gb or RAM and
the system impact of 256M is severe.
Also it uses FB-DIMMS which cost arm+leg+first born

>> It would be nice if thing like
>> * The effect of updates on indexed tables
>> * Fill Factor
>> * reindex after restore
>> Were mentioned in the 'performance' section of the manual, since
>> that's the part someone will go to when looking for a solution.
>
> If you have to reindex after restore to get good performance, that
> means what you should do instead is drop the indexes on the table
> during the restore and then create them once the data is there. The
> REINDEX is more aimed at when the system has been running for a while
> and getting fragmented.

I thought that the pg_dump generated files did that, so I dismissed it
initially. Maybe I did a data only restore into an existing schema..
>
> Unfortunately most of the people who know enough about those topics to
> really do a good treatment of them are too busy fixing slow systems to
> have time to write about it. There are many articles on this general
> topic trickling out at
> http://wiki.postgresql.org/wiki/Performance_Optimization you might
> find valuable in addition to the manual.
>
An of course this is now in mail archive!

> --
> * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
>
>

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Slow updates, poor IO

On Mon, 29 Sep 2008, John Huttley wrote:

> checkpoint _segments=16 is fine, going to 64 made no improvement.

You might find that it does *after* increasing shared_buffers. If the
buffer cache is really small, the checkpoints can't have very much work to
do, so their impact on performance is smaller. Once you've got a couple
of hundred MB on there, the per-checkpoint overhead can be considerable.

> It would be nice if thing like
> * The effect of updates on indexed tables
> * Fill Factor
> * reindex after restore
> Were mentioned in the 'performance' section of the manual, since that's
> the part someone will go to when looking for a solution.

If you have to reindex after restore to get good performance, that means
what you should do instead is drop the indexes on the table during the
restore and then create them once the data is there. The REINDEX is more
aimed at when the system has been running for a while and getting
fragmented.

Unfortunately most of the people who know enough about those topics to
really do a good treatment of them are too busy fixing slow systems to
have time to write about it. There are many articles on this general
topic trickling out at
http://wiki.postgresql.org/wiki/Performance_Optimization you might find
valuable in addition to the manual.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [NOVICE] absolute novice wanting knowledgeable opinion about front end

"Phil" <philbaseless-postgres@yahoo.com> writes:
> How would I enable my secretaries to keep creating and modifying forms and
> reports if I switch to an open source DB.

You can use Access with postgres, and likely with mysql too. There are
certain gotchas and things that don't work, but for the most part it
works. Trawl our mailing list archives for some more info, but the
key point is that you connect it via an ODBC driver.

(Anyone want to start putting together a page on wiki.postgresql.org
about Access compatibility?)

regards, tom lane

--
Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice

Re: [HACKERS] Ad-hoc table type?

> pgsql@mohawksoft.com writes:
>> Being able to insert arbitrary named values, and extracting them
>> similarly, IMHO works "better" and more naturally than some external
>> aggregate system built on a column. I know it is a little "outside the
>> box" thinking, what do you think?
>
> I'm failing to see the point. Allowing columns to spring into existence
> without any forethought seems to me to be all minuses and no pluses
> worth mentioning.
>
> * What if the column name is just a typo?

In an automated system like PHP, Java, etc. that's not too likely.

>
> * What datatype should it have? ("Always varchar" is just lame.)

varchar or text is not "just lame," SQLite used to do that exclusively.
One could argue that XML is nothing more than text.

>
> * Should it have an index? If so, should it be unique?

The answer to that is, well, no, not unless the dba generates one or it is
declared. Just like any other column. All the rules that apply to "create
table" and "alter table add column" just apply naturally as would be
expected.

create adhoc table userdata(username varchar, email varchar, primary
key(email));

>
> * If you keep doing this, you'll soon find yourself reading out
> unbelievably wide tables (lots of columns), which won't be especially
> easy or efficient to process on either the backend or the client side.
> Plus you might run into the max-columns-per-tuple limit.

Well, I fully understand that it is not a general purpose "unlimited"
width sort of thing. In a programing environment, the target environment
for this type of feature, it is unlikely to be a run-away problem.

>
> If you've expended enough thought to be sure that the column is not just
> a typo, ISTM that you can afford to enter an ALTER TABLE ADD COLUMN
> command to tell the database the results of your genius.

Like I said, if you've never dealt with a live web site, maintained by a
team of "web dudes," working furiously to keep their job and get paid,
your only hope to keep up with "Oh! I needed to add the 'time to live' of
the session into the session data" is to use an aggregate storage system.

>
> I do see the point that switching from "member of an hstore column" to
> "real database column" is pretty painful, but I don't see that "allow
> columns to spring into existence" solves that in any meaningful way.
> Is there some other way we could address such conversions?

Every other solution creates a second tier of data storage. You either
deal with data elements at the table level, or you create a "roll your
own" aggregate mechanism, or make a HUGE table of "user,name,value" table
and force a join and index scan for every select. (A million users, 5-10
attributes each is an expensive join.)
>
> BTW, I think it is (or should be) possible to create an index on
> hstore->'mycol', so at least one of the reasons why you should *need*
> to switch to a "real" database column seems bogus.

Oh, yea, function indexes work great. I think you did that right?

For what its worth, I don't expect you to jump all over this. It really is
a divergence from classic SQL design. I'm not even sure I like it. In
fact, I don't like it, but the argument that you are being forced to
create a second class data storage mechanism or a relational join for data
that is logically in a single relation does cause one to ponder the
problem.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Sunday, September 28, 2008

Blog Archive