Thursday, September 11, 2008

[ANNOUNCE] Npgsql2 RC2 Released!!

Hi, all!!

The Npgsql Development Team is proud to announce the Npgsql2 RC2 release!

Npgsql is a .NET data provider, written 100% in C#, which allows .NET
programs to talk to PostgreSQL backends. Npgsql is licensed under the BSD license.
More info can be obtained from http://www.npgsql.org

Among the highlights, thanks to Josh Cooley, we have support for the
newer version of the Entity Framework that ships with .NET 3.5 SP1.

We fixed problems with NetworkStream "seek not supported" exceptions.

We also fixed a lot of problems with the connection pool. The Mono
version of Npgsql wasn't using the connection pool at all; now it does.
So, if you have a heavy Npgsql usage pattern, you will benefit from this fix.


You can see the full changelog and release notes here:
http://pgfoundry.org/frs/shownotes.php?release_id=1217

You can download it from here: http://downloads.npgsql.org

I'd like to thank Josh Cooley for all his help and support.

Thank you to everyone who helped us with this release through feedback, bug
fixes, bug reports and comments.
We couldn't have made it without your help. Thank you very much!

Please, give it a try and let us know if you have any problems.
Check out our forums: http://forums.npgsql.org

--
Regards,

Francisco Figueiredo Jr.
http://fxjr.blogspot.com
http://www.npgsql.org

---------------------------(end of broadcast)---------------------------
-To unsubscribe from this list, send an email to:

pgsql-announce-unsubscribe@postgresql.org

Re: [GENERAL] psql scripting tutorials

At 7:13am -0400 on Thu, 11 Sep 2008, Harald Fuchs wrote:
> Nice trick, but when I try the following variant:
>
> psql -v TEST=16 -c 'select :TEST as "input"'
>
> I get [a syntax error]

> This seems to be contrary to the psql manual page:

Nope. Take a look at the -c option. Specifically "Thus you cannot mix
SQL and psql meta-commands with this option."

You might try shell interpretation:

$ TEST=16; psql -c "select $TEST as \"input1\";"
$ TEST=16; echo "select $TEST as \"input1\";" | psql
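
Variable interpolation does work for scripts fed on stdin (unlike -c), so
another variant along the same lines might be (a sketch, not tested here):

$ echo 'select :TEST as "input1";' | psql -v TEST=16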

Kevin

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] declare column update expression

On Thu, Sep 11, 2008 at 1:35 AM, Chris Velevitch
<chris.velevitch@gmail.com> wrote:
> In 7.4, how do I declare that a column in a table is to be
> automatically set to the value of some fixed expression whenever a row
> is updated?

Oh yeah, if you can upgrade from 7.4, that would be a good idea; it's
getting a bit old, and the newer versions of pgsql like 8.3 are
literally many times faster at most things. We just upgraded from 8.1
to 8.3 and resolved a lot of performance issues; I can't imagine how
slow it would be running 7.4 nowadays.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] psql scripting tutorials

Peter Eisentraut wrote:

> psql manual page:
>
> -c command
>
> --command command
> Specifies that psql is to execute one command string, command,
> and then exit. This is useful in shell scripts.
>
> command must be either a command string that is completely
> parsable by the server (i.e., it contains no psql specific
> features), or a single backslash command. Thus you cannot mix
> SQL and psql meta-commands with this option.

It doesn't say anything about variable expansion ... And that seems to be a
different matter, because the point there is that the command is going to have
a single destination (either \-processing or being sent to the server).

Is psql just being lazy here and avoiding parsing the command?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[COMMITTERS] libpqtypes - libpqtypes: handlers.c:duphandlers was not checking return

Log Message:
-----------
handlers.c:duphandlers was not checking return value of strdup, also not always cleaning up properly

Modified Files:
--------------
libpqtypes/src:
handler.c (r1.6 -> r1.7)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/libpqtypes/libpqtypes/src/handler.c.diff?r1=1.6&r2=1.7)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

[COMMITTERS] stackbuilder - wizard: fixup some strings that were out of date and

Log Message:
-----------
fixup some strings that were out of date and didn't work well on non-win32 platforms

Modified Files:
--------------
wizard:
IntroductionPage.cpp (r1.18 -> r1.19)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/stackbuilder/wizard/IntroductionPage.cpp.diff?r1=1.18&r2=1.19)
StackBuilder.pot (r1.5 -> r1.6)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/stackbuilder/wizard/StackBuilder.pot.diff?r1=1.5&r2=1.6)
Wizard.cpp (r1.11 -> r1.12)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/stackbuilder/wizard/Wizard.cpp.diff?r1=1.11&r2=1.12)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

Re: [GENERAL] declare column update expression

On Thu, Sep 11, 2008 at 1:35 AM, Chris Velevitch
<chris.velevitch@gmail.com> wrote:
> In 7.4, how do I declare that a column in a table is to be
> automatically set to the value of some fixed expression whenever a row
> is updated?
>
> Eg column last_modified is always set to current_timestamp

A trigger, as Pavel said. Writing them in plpgsql seems a bit hard at
first, but it's a simple language and it's pretty easy to write stuff
like this in.

Here's a simple example of a last-modified trigger using plpgsql from way back:

-- FUNCTION --

CREATE FUNCTION modtime () RETURNS opaque AS '
BEGIN
    -- overwrite the last-modified column on every insert/update
    new.lm := ''now'';
    RETURN new;
END;
' LANGUAGE 'plpgsql';

-- TABLE --

CREATE TABLE dtest (
    id int primary key,
    fluff text,
    lm timestamp without time zone
);


--TRIGGER --

CREATE TRIGGER dtest
    BEFORE UPDATE OR INSERT ON dtest
    FOR EACH ROW EXECUTE PROCEDURE modtime(lm);  -- the (lm) argument is ignored; the function hard-codes new.lm

-- SQL TESTS --

INSERT INTO dtest (id, fluff) VALUES (1,'this is a test');
INSERT INTO dtest (id, fluff) VALUES (2,'this is another test');
SELECT * FROM dtest;
1 | this is a test | 2003-04-02 10:33:12.577089
2 | this is another test | 2003-04-02 10:33:18.591148
UPDATE dtest SET id=3 WHERE id=1;
3 | this is a test | 2003-04-02 10:34:52.219963 [1]
UPDATE dtest SET fluff='now is the time' WHERE id=2;
SELECT * FROM dtest WHERE id=2;
2 | now is the time | 2003-04-02 10:38:06.259443 [2]
UPDATE dtest SET lm='2003-04-02 08:30:00' WHERE id=3;
SELECT * FROM dtest WHERE id=3;
3 | this is a test | 2003-04-02 10:36:15.45687 [3]

[1] The timestamp changed for this record when we changed the id field.
[2] The timestamp also changes when we update the fluff field.
[3] We tried to set lm directly, but the trigger on dtest
intercepted the change and forced it back to the current timestamp.
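
On 8.x the same trigger would usually be written with RETURNS trigger and
dollar quoting; a minimal sketch of that style (the opaque/quoted-body form
above is what 7.4 needs):

CREATE OR REPLACE FUNCTION modtime() RETURNS trigger AS $$
BEGIN
    NEW.lm := now();   -- stamp the row regardless of what the statement supplied
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER dtest_modtime
    BEFORE INSERT OR UPDATE ON dtest
    FOR EACH ROW EXECUTE PROCEDURE modtime();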


>
>
>
> Chris
> --
> Chris Velevitch
> Manager - Adobe Platform Users Group, Sydney
> m: 0415 469 095
> www.apugs.org.au
>
> Adobe Platform Users Group, Sydney
> September meeting: It's Going To Be Brilliant
> Date: Mon 29th September 6pm for 6:30 start
> Details and RSVP on http://apugs2008september.eventbrite.com
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[COMMITTERS] stackbuilder - wizard: Automatic merge using stringmerge script.

Log Message:
-----------
Automatic merge using stringmerge script.

Modified Files:
--------------
wizard/i18n/de_DE:
StackBuilder.po (r1.4 -> r1.5)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/stackbuilder/wizard/i18n/de_DE/StackBuilder.po.diff?r1=1.4&r2=1.5)
wizard/i18n/fr_FR:
StackBuilder.po (r1.2 -> r1.3)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/stackbuilder/wizard/i18n/fr_FR/StackBuilder.po.diff?r1=1.2&r2=1.3)
wizard/i18n/ja_JP:
StackBuilder.po (r1.3 -> r1.4)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/stackbuilder/wizard/i18n/ja_JP/StackBuilder.po.diff?r1=1.3&r2=1.4)
wizard/i18n/ru_RU:
StackBuilder.po (r1.4 -> r1.5)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/stackbuilder/wizard/i18n/ru_RU/StackBuilder.po.diff?r1=1.4&r2=1.5)
wizard/i18n/sv_SE:
StackBuilder.po (r1.4 -> r1.5)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/stackbuilder/wizard/i18n/sv_SE/StackBuilder.po.diff?r1=1.4&r2=1.5)
wizard/i18n/tr_TR:
StackBuilder.po (r1.3 -> r1.4)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/stackbuilder/wizard/i18n/tr_TR/StackBuilder.po.diff?r1=1.3&r2=1.4)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

Re: [HACKERS] Potential Join Performance Issue

Attached is a patch that will disable the physical-tlist optimization
for hash join if the number of batches is greater than 1. The patch and
performance results were created by Michael Henderson (graduate
student).

To keep the changes simple, the update simply calls
ExecChooseHashTableSize() in create_hashjoin_plan() to re-calculate the
expected number of batches. This is more efficient and results in fewer
code changes than modifying the HashPath struct to store the number of
batches and updating that variable when costing (as cost_hashjoin() will
be called many times during costing).

We have also attached some performance results that show a dramatic
effect when disabling the physical-tlist optimization for joins with
more than one batch.

I do not know the performance tradeoffs of using the physical-tlist
optimization to avoid projection on the outer relation for joins with
one batch. However, there is a potentially huge penalty if the optimizer
is wrong. If the optimizer suggests one batch, and on execution either
due to poor estimates or data skew more than one batch is needed, then
the join operator will perform considerably more I/Os on the outer
relation that still contains the unnecessary attributes.

An ideal solution would detect at execution time if the inner relation
remained in memory (one batch) and decide to disable/enable the
physical-tlist optimization on the outer relation accordingly. At this
time, we are uncertain if this would be desirable or possible.

Sincerely,

Dr. Ramon Lawrence
Assistant Professor, Department of Computer Science, University of
British Columbia Okanagan
E-mail: ramon.lawrence@ubc.ca


-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: September 9, 2008 6:47 PM
To: Lawrence, Ramon
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Potential Join Performance Issue

"Lawrence, Ramon" <ramon.lawrence@ubc.ca> writes:
> Our research group has been using the PostgreSQL code base to test new
> join algorithms. During testing, we noticed that the planner is not
> pushing down projections to the outer relation in a hash join. Although
> this makes sense for in-memory (1 batch) joins, for joins larger than
> memory (such as for TPC-H DSS), this causes the system to perform
> significantly more disk I/Os when reading/writing batches of the outer
> relation.

Hm. The proposed patch seems a bit brute-force, since it loses the
benefit of the physical-tlist optimization even if the relations are
certainly too small to require batching.

> A more complicated modification alternative is to add a state variable
> to allow the planner to know how many batches the hash join expects and
> only push down the projection if it is greater than one. However,
> pushing the projection on the outer relation is almost always the best
> choice as it eliminates unneeded attributes for operators above the hash
> join in the plan and will be robust in the case of poor estimates.

Nonetheless, I'm inclined to do it that way. The "robust in the case of
poor estimates" argument doesn't convince me, because the incremental
cost isn't *that* large if we get it wrong; and the other argument is
just bogus because we don't do physical tlists at or above joins anyhow.

regards, tom lane

Re: [HACKERS] [Review] pgbench duration option

ITAGAKI Takahiro wrote:
> ***************
> *** 29,36 ****
> --- 29,40 ----
> #include "postgres_fe.h"
>
> #include "libpq-fe.h"
> + #include "pqsignal.h"
>
> #include <ctype.h>
> + #include <signal.h>
> + #include <sys/time.h>
> + #include <unistd.h>
>
> #ifdef WIN32
> #undef FD_SETSIZE

sys/time.h and unistd.h are #included just a few lines after that, but
within an #ifndef WIN32 block. I don't think the patch added any
codepaths where we'd need those header files on Windows, so I presume
that was just an oversight and those two extra #includes can be removed?
I don't have a Windows environment to test it myself.

Also, should we be using pqsignal at all? It's not clear to me what it
is, to be honest, but there's a note in pqsignal.h that says "This
shouldn't be in libpq, but the monitor and some other things need it..."

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [pgsql-advocacy] Ohio LinuxFest Booth

On Thursday 11 September 2008 09:50:08 Melanie wrote:
> Actually, I spoke with JD last night and we liked the idea of stickers
> that have the url on them.  I would expect most people attending are tech
> savvy and even if they could give a rat's about the sustainability (hey,
> they might) they're probably more interested in getting their needs from
> the web vs crap that sits around.  Though everything mankind does involves
> the environment, the products that can be made from recycled goods and/or
> can be recycled is obviously ideal.
>

I was one who favored the idea of ditching the CDs before, but I think I was
more geared to the OSCon/LinuxWorld crowd. Now, having thought about it some
more, I think there is something to be said for having a live CD, where users
1) don't have to wait for a download (yeah, it isn't large, but the perception
might be that it is) and, more importantly, 2) can completely bypass the
installation process. Yes, I know most people don't think installing Postgres
is hard, but if you tell someone to apt-get postgres, and 5 minutes later they
are staring at IDENT errors... well, that is a barrier that a live CD gets you past.

Don't get me wrong, I think stickers and pins would both be pretty good ideas
too.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

--
Sent via pgsql-advocacy mailing list (pgsql-advocacy@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-advocacy

[COMMITTERS] pgtreelib - pgtreelib: make_dot

Log Message:
-----------
make_dot

Modified Files:
--------------
pgtreelib/src/graph:
make_dot.pl (r1.1 -> r1.2)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/pgtreelib/pgtreelib/src/graph/make_dot.pl.diff?r1=1.1&r2=1.2)
make_dot.conf (r1.1 -> r1.2)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/pgtreelib/pgtreelib/src/graph/make_dot.conf.diff?r1=1.1&r2=1.2)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

[COMMITTERS] pgtreelib - pgtreelib: psql

Log Message:
-----------
psql

Modified Files:
--------------
pgtreelib/src/main:
crud_node.psql (r1.7 -> r1.8)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/pgtreelib/pgtreelib/src/main/crud_node.psql.diff?r1=1.7&r2=1.8)
traversal.psql (r1.6 -> r1.7)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/pgtreelib/pgtreelib/src/main/traversal.psql.diff?r1=1.6&r2=1.7)
base.psql (r1.8 -> r1.9)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/pgtreelib/pgtreelib/src/main/base.psql.diff?r1=1.8&r2=1.9)
schema.sql (r1.7 -> r1.8)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/pgtreelib/pgtreelib/src/main/schema.sql.diff?r1=1.7&r2=1.8)
util.psql (r1.7 -> r1.8)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/pgtreelib/pgtreelib/src/main/util.psql.diff?r1=1.7&r2=1.8)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

[pgus-general] provision for donations

I assume that PGUS will not be accepting donations until the process for
the 501(c)(3) is complete. Will interested persons be able to make
donations to the organization in the same way that they purchase a
membership?

--
Regards,
Richard Broersma Jr.

Visit the Los Angeles PostgreSQL Users Group (LAPUG)
http://pugs.postgresql.org/lapug

--
Sent via pgus-general mailing list (pgus-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgus-general

[JDBC] Found a problem with 8.2-508 JDBC 4

PostgreSQL: 8.2.4
JDK: 1.6.0_07
JDBC Driver: 8.2-508 JDBC4

I have found a problem with 8.2-508 JDBC 4. I can recreate the problem every time.

Summary of problem:

When the SQL below is executed, the result set returns "null" for the field
"calendar_name". When I use drivers "8.2-506 JDBC 4" and "8.2-507 JDBC 4" it
works correctly and I get a valid value.

SQL that returns "null" in 508 but not in 506 and 507 for the field "calendar_name":

SELECT to_char(event.start_timestamp, 'Mon FMDD, YYYY') AS start_date,
       to_char(event.end_timestamp, 'Mon FMDD, YYYY') AS end_date,
       to_char(event.start_timestamp, 'FMHH:MI am') AS start_time,
       to_char(event.end_timestamp, 'FMHH:MI am') AS end_time,
       to_char(event.news_letter_timestamp, 'YYYYMMDD') AS news_letter_date,
       to_char(event.news_letter_timestamp, 'FMHHMISS') AS news_letter_time,
       to_char(last_updated_timestamp, 'MM/DD/YYYY') AS last_updated_date,
       to_char(event.created_timestamp, 'MM/DD/YYYY') AS created_date,
       event.*,
       calendar.name AS calendar_name
FROM calendar.event, calendar.calendar
WHERE event.id=? AND event.fk_calendar_id=calendar.id

When I rearrange the SQL as follows, driver 508 returns a non-"null" value for
calendar_name. Both the SQL above and below work with 506 and 507:

SELECT to_char(event.start_timestamp, 'Mon FMDD, YYYY') AS start_date,
       to_char(event.end_timestamp, 'Mon FMDD, YYYY') AS end_date,
       to_char(event.start_timestamp, 'FMHH:MI am') AS start_time,
       to_char(event.end_timestamp, 'FMHH:MI am') AS end_time,
       to_char(event.news_letter_timestamp, 'YYYYMMDD') AS news_letter_date,
       to_char(event.news_letter_timestamp, 'FMHHMISS') AS news_letter_time,
       to_char(last_updated_timestamp, 'MM/DD/YYYY') AS last_updated_date,
       to_char(event.created_timestamp, 'MM/DD/YYYY') AS created_date,
       calendar.name AS calendar_name,
       event.*
FROM calendar.event, calendar.calendar
WHERE event.id=? AND event.fk_calendar_id=calendar.id

Thanks,

Lance Campbell
Project Manager/Software Architect
Web Services at Public Affairs
University of Illinois
217.333.0382
http://webservices.uiuc.edu

Re: [ADMIN] replica of database

On Thu, Sep 11, 2008 at 7:42 AM, Rex Mabry <rexmabry@yahoo.com> wrote:
> Version 8.3 has pg_standby. Does anyone know how well that works for
> failover?

pg_standby is part of the point-in-time recovery suite, which can be
used to create a cold (recovery on command) or warm (continuous
recovery) standby server ready to take over with a few commands. It's quite
different from Slony, in that there's no hot read-only standby, so
there's no load balancing; however, it usually produces less load on
the master db, so that's a plus.

PITR is quite well tested and used in a lot of production
environments. It has very different design objectives from Slony,
pgpool, or PgBouncer, which all allow you to have multiple live servers
at once.
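
For reference, the standard 8.3 warm-standby wiring is just two settings; a
minimal sketch, with placeholder paths and trigger-file location:

# postgresql.conf on the master
archive_mode = on
archive_command = 'cp %p /path/to/archive/%f'

# recovery.conf on the standby
restore_command = 'pg_standby -t /tmp/pgsql.trigger.5442 /path/to/archive %f %p %r'

Creating the trigger file tells pg_standby to stop waiting for new WAL, so the
standby finishes recovery and comes up as the new master.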

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: [HACKERS] Synchronous Log Shipping Replication

Markus Wanner <markus@bluegap.ch> writes:
> Tom Lane wrote:
>> Sooner or later we shall have to bite the bullet and set up a
>> multiplexing system to transmit multiple event types to backends with
>> just one signal. We already did it for signals to the postmaster.

> Agreed. However, it's non-trivial if you want reliable queues (i.e. no
> message skipped, as with signals) for varying message sizes.

No, that's not what I had in mind at all, just the ability to deliver
one of a specified set of event notifications --- ie, get around the
fact that Unix only gives us two user-definable signal types.

For signals sent from other backends, it'd be sufficient to put a
bitmask field into PGPROC entries, which the sender could OR bits into
before sending the one "real" signal event (either SIGUSR1 or SIGUSR2).

I'm not sure what to do if we need signals sent from processes that
aren't connected to shared memory; but maybe we need not cross that
bridge here.

(Also, I gather that the Windows implementation could already support
a bunch more signal types without much trouble.)

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [GENERAL] psql scripting tutorials

Harald Fuchs wrote:
> In article <Pine.GSO.4.64.0809110111250.244@westnet.com>,
> Greg Smith <gsmith@gregsmith.com> writes:
>
>> On Tue, 9 Sep 2008, Artacus wrote:
>>> Can psql access environmental variables or command line params?
>
>> $ cat test.sql
>> select :TEST as "input";
>> $ psql -v TEST=16 -f test.sql
>> input
>> -------
>> 16
>> (1 row)
>
> Nice trick, but when I try the following variant:
>
> psql -v TEST=16 -c 'select :TEST as "input"'
>
> I get
>
> ERROR: syntax error at or near ":"
> LINE 1: select :TEST as "input"
>
> This seems to be contrary to the psql manual page:
>
> These assignments are done during a very early stage of start-up...
>
>

psql manual page:

-c command

--command command
Specifies that psql is to execute one command string, command,
and then exit. This is useful in shell scripts.

command must be either a command string that is completely
parsable by the server (i.e., it contains no psql specific
features), or a single backslash command. Thus you cannot mix
SQL and psql meta-commands with this option.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [HACKERS] pg_regress inputdir

Jorgen Austvik - Sun Norway wrote:

> The attached patch makes pg_regress write converted files to
> <inputdir>/sql and <inputdir>/expected, which is one way to make it read
> and write to the same directory. Tested on Solaris x86 with pgsql "make
> check" and standalone.

Okay, so this patch does change it in a way that it still works, but
what else do you need to be able to run the test from another directory?
I tried to run the test from another directory with this patch
installed, and found that it didn't work because it's replacing
@abs_builddir@ in the input files improperly (to the current path; it
should be using the output dir path, I think)

So maybe this is a step in the right direction, but ISTM you need a
slightly larger patch for it to be actually useful.

If I am not making sense, then maybe I am not understanding what you
mean by running it standalone. In that case, please explain.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[COMMITTERS] plproxy - plproxy: news for 2.0.7

Log Message:
-----------
news for 2.0.7

Modified Files:
--------------
plproxy:
NEWS (r1.13 -> r1.14)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/plproxy/plproxy/NEWS.diff?r1=1.13&r2=1.14)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

[COMMITTERS] pgsql: Adjust the parser to accept the typename syntax INTERVAL ...

Log Message:
-----------
Adjust the parser to accept the typename syntax INTERVAL ... SECOND(n)
and the literal syntax INTERVAL 'string' ... SECOND(n), as required by the
SQL standard. Our old syntax put (n) directly after INTERVAL, which was
a mistake, but will still be accepted for backward compatibility as well
as symmetry with the TIMESTAMP cases.

Change intervaltypmodout to show it in the spec's way, too. (This could
potentially affect clients, if there are any that analyze the typmod of an
INTERVAL in any detail.)

Also fix interval input to handle 'min:sec.frac' properly; I had overlooked
this case in my previous patch.

Document the use of the interval fields qualifier, which up to now we had
never mentioned in the docs. (I think the omission was intentional because
it didn't work per spec; but it does now, or at least close enough to be
credible.)

Modified Files:
--------------
pgsql/doc/src/sgml:
datatype.sgml (r1.227 -> r1.228)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/datatype.sgml?r1=1.227&r2=1.228)
pgsql/src/backend/parser:
gram.y (r2.622 -> r2.623)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/parser/gram.y?r1=2.622&r2=2.623)
pgsql/src/backend/utils/adt:
datetime.c (r1.191 -> r1.192)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/adt/datetime.c?r1=1.191&r2=1.192)
timestamp.c (r1.191 -> r1.192)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/adt/timestamp.c?r1=1.191&r2=1.192)
pgsql/src/test/regress/expected:
interval.out (r1.21 -> r1.22)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/test/regress/expected/interval.out?r1=1.21&r2=1.22)
pgsql/src/test/regress/sql:
interval.sql (r1.13 -> r1.14)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/test/regress/sql/interval.sql?r1=1.13&r2=1.14)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

Re: [PATCHES] hash index improving v3

On Wed, Sep 10, 2008 at 10:17:31PM -0600, Alex Hunsaker wrote:
> On Wed, Sep 10, 2008 at 9:49 PM, Alex Hunsaker <badalex@gmail.com> wrote:
> > On Wed, Sep 10, 2008 at 7:04 AM, Kenneth Marshall <ktm@rice.edu> wrote:
> >> On Tue, Sep 09, 2008 at 07:23:03PM -0600, Alex Hunsaker wrote:
> >>> On Tue, Sep 9, 2008 at 7:48 AM, Kenneth Marshall <ktm@rice.edu> wrote:
> >>> > I think that the glacial speed for generating a big hash index is
> >>> > the same problem that the original code faced.
> >>>
> >>> Yeah sorry, I was not saying it was a new problem with the patch. Err
> >>> at least not trying to :) *Both* of them had been running at 18+ (I
> >>> finally killed them sometime Sunday or around +32 hours...)
> >>>
> >>> > It would be useful to have an equivalent test for the hash-only
> >>> > index without the modified int8 hash function, since that would
> >>> > be more representative of its performance. The collision rates
> >>> > that I was observing in my tests of the old and new mix() functions
> >>> > was about 2 * (1/10000) of what you test generated. You could just
> >>> > test against the integers between 1 and 2000000.
> >>>
> >>> Sure but then its pretty much just a general test of patch vs no
> >>> patch. i.e. How do we measure how much longer collisions take when
> >>> the new patch makes things faster? That's what I was trying to
> >>> measure... Though I apologize I don't think that was clearly stated
> >>> anywhere...
> >>
> >> Right, I agree that we need to benchmark the collision processing
> >> time difference. I am not certain that two data points is useful
> >> information. There are 469 collisions with our current hash function
> >> on the integers from 1 to 2000000. What about testing the performance
> >> at power-of-2 multiples of 500, i.e. 500, 1000, 2000, 4000, 8000,...
> >> Unless you adjust the fill calculation for the CREATE INDEX, I would
> >> stop once the time to create the index spikes. It might also be useful
> >> to see if a CLUSTER affects the performance as well. What do you think
> >> of that strategy?
> >
> > Not sure it will be a good benchmark of collision processing. Then
> > again you seem to have studied the hash algo closer than me. Ill go
> > see about doing this. Stay tuned.
>
> Assuming I understood you correctly, And I probably didn't this does
> not work very well because you max out at 27,006 values before you get
> this error:
> ERROR: index row size 8152 exceeds hash maximum 8144
> HINT: Values larger than a buffer page cannot be indexed.
>
> So is a power-of-2 multiple of 500 not simply:
> x = 500;
> while(1)
> {
> print x;
> x *= 2;
> }
>
> ?
>
Alex,

I meant to check the performance with increasing numbers of collisions,
not increasing size of the hashed item. In other words, something like
this:

for ($coll = 500; $coll <= 1000000; $coll = $coll * 2) {
    for ($i = 0; $i <= 1000000; $i++) {
        hash(int8 $i);
    }
    # add the appropriate number of collisions, distributed evenly to
    # minimize the packing overrun problem
    for ($dup = 0; $dup <= $coll; $dup++) {
        hash(int8 MAX_INT + $dup * 1000000/$coll);
    }
}
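
In table form, the base case being timed is roughly this (a sketch; the table
and index names are made up, and the duplicate keys would go in as a second
INSERT):

CREATE TABLE hashtest (v int8);
INSERT INTO hashtest SELECT generate_series(1, 2000000);
CREATE INDEX hashtest_hash_idx ON hashtest USING hash (v);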

Ken

--
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches

Re: [GENERAL] No error when column doesn't exist

Tom Lane wrote:
> The ideas I had involved not considering the cast interpretation when
> the actual syntax is table.column and some-set-of-other-conditions.
> While this is certainly possible to implement, any variant of it will
> break the existing 100% equivalence of foo.bar and bar(foo); which
> seems to me to be a nice principle, though I grant you won't find it
> anywhere in the SQL standard.

I think if we say that functions can be used as table attributes, and
types can be used as (cast) functions, and tables are types, then we are
simply stuck with the current behavior. Individually, these all make
sense, so you can't break that chain without some really complicated warts.
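
A quick illustration of the equivalence Tom mentions (a sketch with made-up names):

CREATE TABLE foo (bar int);
INSERT INTO foo VALUES (1);

SELECT foo.bar FROM foo;   -- attribute notation
SELECT bar(foo) FROM foo;  -- functional notation; returns the same thing
-- which is also why a misspelled "column" can silently resolve to a function
-- or cast applied to the whole row instead of raising an error.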

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Autocommit, isolation level, and vacuum behavior

On 2008-09-10 16:46, Jack Orenstein wrote:

> Application requirement. We need to do something for each row retrieved from BIG
> and the something is expensive. We do the scan slowly (30 second sleep inside
> the loop) to amortize the cost.

Then do the processing in separate transactions like this (in pseudocode):

$last_id = -1;
do {
    begin transaction;
    $result = select * from bigtable
                  where id > $last_id
                  and processed = false
                  order by id limit 1;
    if ( empty($result) ) {
        rollback;
        break;
    }
    do_something_expensive_with($result[0]);
    update bigtable set processed=true where id=$result[0][id];
    commit;
    sleep 30;
} while (true);

Always avoid long running transactions. This is recommended for any
transactional database.

Regards
Tometzky
--
...although Eating Honey was a very good thing to do, there was a
moment just before you began to eat it which was better than when you
were...
Winnie the Pooh

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [HACKERS] Synchronous Log Shipping Replication

Fujii Masao wrote:
> I think that this case would often happen. So, we should establish a certain
> solution or procedure to the case where TLI of the master doesn't match
> TLI of the slave. If we only allow the case where TLI of both servers is the
> same, the configuration after failover always needs to get the base backup
> on the new master. It's unacceptable for many users. But, I think that it's
> the role of admin or external tools to copy history files to the slave from
> the master.

Hmm. There are more problems than the TLI with that. For the original
master to catch up by replaying WAL from the new slave, without
restoring from a full backup, the original master must not have written to
disk *any* WAL that hasn't made it to the slave yet. That is certainly not
true for asynchronous replication, but it also throws off the idea of
flushing the WAL concurrently to the local disk and to the slave in
synchronous mode.

I agree that having to take a new base backup to get the old master to catch
up with the new master sucks, so I hope someone sees a way around that.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [ADMIN] replica of database

On Thu, 2008-09-11 at 10:35 -0400, Andrew Sullivan wrote:
> On Wed, Sep 10, 2008 at 09:52:41PM -0700, Joshua D. Drake wrote:
> > Version 7.3 is end of life and I don't know of *any* of the replication
> > technologies that will work with it.
>
> The 1.0.x Slony releases work with 7.3.x, x>2. I don't recommend
> sticking with 7.3, however.


If you ignore this very wise advice, stick with 7.3, and decide to
give Slony 1.0.x a shot, be prepared for lots of late nights rebuilding
replicas and dealing with a whole multitude of other bugs it had back then.

--
Brad Nicholson 416-673-4106
Database Administrator, Afilias Canada Corp.


--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

[GENERAL] European PGDay 2008 - registration open

The European PGDay 2008 is now open for registration. We are working on
finalizing the English schedule, but we already have an impressive list
of speakers available at http://www.pgday.org/en/speakers. The schedule
will have at least one English talk at all times, as well as a number of
Italian ones. There will also be social activities after the conference
- stay tuned to http://www.pgday.org/en/ for details.

While the full schedule is being worked on, attendees can now register
their participation in the conference at https://register.pgday.org.
Although the conference will charge no entry fee, we do ask all
attendees to register at this site as soon as possible, to help us plan
the schedule to best suit your needs. It is also recommended that you
start looking at your travel arrangements - information about travel and
accommodation is available at http://www.pgday.org/en/.

We would also like to thank our sponsors who make it possible to make
this conference the best in Europe, while keeping it free:

Gold Sponsor - EnterpriseDB
Bronze Sponsors - 2ndQuadrant, Continuent, Dalibo, Sonologic and Sun
Microsystems

See http://www.pgday.org/en/sponsors for a full list of sponsors.

There are still openings in the sponsorship program for all interested
parties - see http://www.pgday.org/en/sponsors/campaign

--
The PGDay.eu organization team,
ITPUG and PostgreSQL Europe

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] declare column update expression

On Thu, Sep 11, 2008 at 6:01 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> you should to use trigger

I've never used a trigger before; it looks messy and error-prone having
to write functions.

How is it that you can declare the default value of a column on insert
but not on update?
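
A minimal illustration of the difference, with a throwaway table t:

CREATE TABLE t (id int, lm timestamp DEFAULT now());

INSERT INTO t (id) VALUES (1);     -- lm is filled in, because the column was omitted
UPDATE t SET id = 2 WHERE id = 1;  -- lm is left untouched; DEFAULT never applies on UPDATE,
                                   -- which is why re-stamping it needs a trigger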


Chris
--
Chris Velevitch
Manager - Adobe Platform Users Group, Sydney
m: 0415 469 095
www.apugs.org.au

Adobe Platform Users Group, Sydney
September meeting: It's Going To Be Brilliant
Date: Mon 29th September 6pm for 6:30 start
Details and RSVP on http://apugs2008september.eventbrite.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[COMMITTERS] plproxy - plproxy: Use server_encoding as remote connection encoding

Log Message:
-----------
Use server_encoding as remote connection encoding

Previous use of client_encoding was wrong, but seemed to work because
of binary i/o which bypassed the encoding checks.

Noticed and fixed by Hiroshi Saito

Modified Files:
--------------
plproxy/src:
execute.c (r1.13 -> r1.14)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/plproxy/plproxy/src/execute.c.diff?r1=1.13&r2=1.14)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

Re: [HACKERS] Synchronous Log Shipping Replication

On Thu, Sep 11, 2008 at 3:17 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On Wed, 2008-09-10 at 17:57 +0900, Fujii Masao wrote:
>
>> My sequence covers several cases :
>>
>> * There is no missing WAL file.
>> * There is a lot of missing WAL file.
>
> This is the likely case for any medium+ sized database.

I'm sorry, but I could not understand what you mean.

>
>> * There are missing history files. Failover always generates the gap
>> of
>> history file because TLI is incremented when archive recovery is
>> completed.
>
> Yes, but failover doesn't happen while we are configuring replication,
> it can only happen after we have configured replication. It would be
> theoretically possible to take a copy from one server and then try to
> synchronise with a 3rd copy of the same server, but that seems perverse
> and bug prone. So I advise that we only allow replication when the
> timeline of the standby matches the timeline of the master, having it as
> an explicit check.

Umm... my explanation seems to have been unclear :(
Here is the case I am assuming:

1) Replication is configured, i.e. the master and the slave work fine.
2) The master goes down, and failover happens. When the slave becomes
the new master, the TLI is incremented and a new history file is generated.
3) In order to catch up with the new master, the server which was originally
the master needs the missing history file, because there is now a gap in
TLI between the two servers.

I think this case will happen often. So, we should establish a solution or
procedure for the case where the TLI of the master doesn't match the TLI
of the slave. If we only allow the case where the TLI of both servers is the
same, the configuration after failover always needs to take a base backup
of the new master, which is unacceptable for many users. But I think that
it's the role of the admin or external tools to copy history files to the
slave from the master.

>> In your design, does not initial setup block the master?
>> Does your design cover above-mentioned case?
>
> The way I described it does not block the master. It does defer the
> point at which we can start using synchronous replication, so perhaps
> that is your objection. I think it is acceptable: good food takes time
> to cook.

Yes. I understood your design.

> IMHO it will be confusing to be transferring both old and new data at
> the same time from master to slave. We will have two different processes
> sending and two different processes receiving. You'll need to work
> through about four times as many failure modes, all of which will need
> testing. Diagnosing problems in it via the log hurts my head just
> thinking about it. ISTM that will severely impact the initial robustness
> of the software for this feature. Perhaps in time it is the right way.

In my procedure, old WAL files are copied by the admin using scp, rsync
or another external tool. So, I don't think that my procedure makes the
problem more difficult. Since there are many setup cases, we should
not leave all procedures to Postgres, I think.

> Anyway, feels like we're getting close to some good designs. There isn't
> much difference between what we're discussing here.

Yes. Thank you for your great ideas.

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Transaction Snapshots and Hot Standby

Simon Riggs wrote:
> So part of the handshake between
> primary and standby must be "what is your recentxmin?". The primary will
> then use the lower/earliest of the two.

Even then, the master might already have vacuumed away tuples that are
visible to an already running transaction in the slave, before the slave
connects. Presumably the master doesn't wait for the slave to connect
before starting to accept new connections.

>> As you mentioned, the options there are to defer applying WAL, or cancel
>> queries. I think both options need the same ability to detect when
>> you're about to remove a tuple that's still visible to some snapshot,
>> just the action is different. We should probably provide a GUC to
>> control which you want.
>
> I don't see any practical way of telling whether a tuple removal will
> affect a snapshot or not. Each removed row would need to be checked
> against each standby snapshot. Even if those were available, it would be
> too costly.

How about using the same method as we use in HeapTupleSatisfiesVacuum?
Before replaying a vacuum record, look at the xmax of the tuple
(assuming it committed). If it's < slave's OldestXmin, it can be
removed. Otherwise not. Like HeapTupleSatisfiesVacuum, it's
conservative, but doesn't require any extra bookkeeping.

And vice versa: if we implement the more precise book-keeping, with all
snapshots in shared memory or something, we might as well use it in
HeapTupleSatisfiesVacuum. That has been discussed before, but it's a
separate project.

> It was also suggested we might take the removed rows and put them in a
> side table, but that makes me think of the earlier ideas for HOT and so
> I've steered clear of that.

Yeah, that's non-trivial. Basically a whole new, different
implementation of MVCC, but without changing any on-disk formats.

BTW, we haven't talked about how to acquire a snapshot in the slave.
You'll somehow need to know which transactions have not yet committed,
but will in the future. In the master, we keep track of in-progress
transaction in the ProcArray, so I suppose we'll need to do the same in
the slave. Very similar to prepared transactions, actually. I believe
the Abort records, which are not actually needed for normal operation,
become critical here. The slave will need to put an entry to ProcArray
for any new XLogRecord.xl_xid it sees in the WAL, and remove the entry
at a Commit and Abort record. And clear them all at a shutdown record.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [GENERAL] "Healing" a table after massive updates

In response to Alvaro Herrera <alvherre@commandprompt.com>:

> Bill Moran wrote:
> > In response to "Gauthier, Dave" <dave.gauthier@intel.com>:
> >
> > > I might be able to answer my own question...
> > >
> > > vacuum FULL (analyze is optional)
> >
> > CLUSTER _may_ be a better choice, but carefully read the docs regarding
> > it's drawbacks first. You may want to do some benchmarks to see if it's
> > really needed before you commit to it as a scheduled operation.
>
> What drawbacks?

There's the whole "there will be two copies of the table on-disk" thing
that could be an issue if it's a large table.

Depending on the version of PG in use, there are warnings about tuple
visibility during CLUSTER. It seems as if most of these have been
removed for 8.3.

And while this applies to VACUUM FULL as well, it might just be a bad
idea. If the number of rows inserted isn't a significant increase in
the overall size of the table, he may find that overall performance is
better if he uses plain old VACUUM so that the FSM stays at a stable
size.

Some of this is dependent on the PG version, which the OP neglected to mention.
Other stuff is dependent on what kind of maintenance window he has, which
was also not mentioned. Other stuff is dependent on various details of the
actual process, which (looking back through the thread) I may have
misunderstood. I took the post to mean that he was loading additional data
into a pre-existing table, which is a different scenario from loading up a
virgin table.

So, you know, standard disclaimer about YMMV, which it seems that _I_
worded poorly.

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [HACKERS] Move src/tools/backend/ to wiki

Peter Eisentraut wrote:
> Alvaro Herrera wrote:
>> So I went ahead and moved its mention to a separate question, where it
>> has a lot more visibility. (I also added an URL to anoncvs, where
>> people can see it more readily.)
>
> Perhaps the low visibility also has to do with the fact that
> documentation is hidden under "tools".

Well, with the way it was before it was even more hidden. Now at least
it has its own question, and has working links and all.

However, I wonder how much value there really is in the developer's FAQ,
considering that some answers seem rather poor. For example, the
answer on ereport() was wrong, and nobody ever pointed it out. The
answer on palloc/pfree is very incomplete too.

My question is: is this resource actually useful to anybody?

> Once we start applying the argument that things should be moved to
> the wiki based on getting more people to work on it, we might as well
> move the source code to the wiki altogether. ;-)

Hey, now that's a clever idea!

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[COMMITTERS] plproxy - plproxy: disable binary i/o completely

Log Message:
-----------
disable binary i/o completely

Modified Files:
--------------
plproxy/src:
type.c (r1.6 -> r1.7)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/plproxy/plproxy/src/type.c.diff?r1=1.6&r2=1.7)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

[COMMITTERS] plproxy - plproxy: allow non-ascii chars in idents

Log Message:
-----------
allow non-ascii chars in idents

Modified Files:
--------------
plproxy/src:
scanner.l (r1.12 -> r1.13)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/plproxy/plproxy/src/scanner.l.diff?r1=1.12&r2=1.13)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

[COMMITTERS] plproxy - plproxy: more serious encoding test

Log Message:
-----------
more serious encoding test

Modified Files:
--------------
plproxy/sql:
plproxy_encoding.sql (r1.2 -> r1.3)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/plproxy/plproxy/sql/plproxy_encoding.sql.diff?r1=1.2&r2=1.3)
plproxy/expected:
plproxy_encoding.out (r1.3 -> r1.4)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/plproxy/plproxy/expected/plproxy_encoding.out.diff?r1=1.3&r2=1.4)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

Re: [GENERAL] "Healing" a table after massive updates

Bill Moran wrote:
> In response to "Gauthier, Dave" <dave.gauthier@intel.com>:
>
> > I might be able to answer my own question...
> >
> > vacuum FULL (analyze is optional)
>
> CLUSTER _may_ be a better choice, but carefully read the docs regarding
> it's drawbacks first. You may want to do some benchmarks to see if it's
> really needed before you commit to it as a scheduled operation.

What drawbacks?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] "Healing" a table after massive updates

In response to "Gauthier, Dave" <dave.gauthier@intel.com>:

> I might be able to answer my own question...
>
> vacuum FULL (analyze is optional)

CLUSTER _may_ be a better choice, but carefully read the docs regarding
its drawbacks first. You may want to do some benchmarks to see if it's
really needed before you commit to it as a scheduled operation.

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] "Healing" a table after massive updates

Brad Nicholson wrote:

> If you want to compact the the table, you either need to use CLUSTER or
> VACUUM FULL + REINDEX.

Actually those are all pretty slow. If you can do a no-op ALTER TYPE
that rewrites the entire table, it is a lot faster. Something like

ALTER TABLE tab ALTER COLUMN col TYPE integer;

Assume that column "col" on table "tab" already has type integer.
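
One way to confirm the rewrite happened is to watch the table's relfilenode
change (a sketch; assumes the 8.x behavior that ALTER COLUMN TYPE always
rewrites, and made-up names):

SELECT relfilenode FROM pg_class WHERE relname = 'tab';  -- note the value
ALTER TABLE tab ALTER COLUMN col TYPE integer;           -- no-op type change forces a rewrite
SELECT relfilenode FROM pg_class WHERE relname = 'tab';  -- new value: the table was rewritten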

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [ADMIN] replica of database

On Wed, Sep 10, 2008 at 09:52:41PM -0700, Joshua D. Drake wrote:
> Version 7.3 is end of life and I don't know of *any* of the replication
> technologies that will work with it.

The 1.0.x Slony releases work with 7.3.x, x>2. I don't recommend
sticking with 7.3, however.

A

--
Andrew Sullivan
ajs@commandprompt.com
+1 503 667 4564 x104
http://www.commandprompt.com/

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: [GENERAL] psql scripting tutorials

Harald Fuchs wrote:

> Nice trick, but when I try the following variant:
>
> psql -v TEST=16 -c 'select :TEST as "input"'
>
> I get
>
> ERROR: syntax error at or near ":"
> LINE 1: select :TEST as "input"
>
> This seems to be contrary to the psql manual page:
>
> These assignments are done during a very early stage of start-up...

Seems like a bug in -c ...

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [HACKERS] Transaction Snapshots and Hot Standby

Thanks for the detailed thinking. At least one very good new idea here,
some debate on other points.


On Thu, 2008-09-11 at 09:24 +0300, Heikki Linnakangas wrote:

> And still we can't escape the scenario that the slave receives a WAL
> record that vacuums away a tuple that's still visible according to a
> snapshot used in the slave. Even with the proposed scheme, this can happen:
>
> 1. Slave receives a snapshot from master
> 2. A long-running transaction begins on the slave, using that snapshot
> 3. Network connection is lost
> 4. Master hits a timeout, and decides to discard the snapshot it sent to
> the slave
> 5. A tuple visible to the snapshot is vacuumed
> 6. Network connection is re-established
> 7. Slave receives the vacuum WAL record, even though the long-running
> transaction still needs the tuple.

Interesting point. (4) is a problem, though not for the reason you
suggest. If we were to stop and start the master, that would be sufficient
to discard the snapshot that the standby is using and so cause problems.
So the standby *must* tell the master the recentxmin it is using, as you
suggest later; good thinking. Part of the handshake between
primary and standby must therefore be "what is your recentxmin?". The
primary will then use the lower/earlier of the two.

> I like the idea of acquiring snapshots locally in the slave much more.

Me too. We just need to know how, if at all.

> As you mentioned, the options there are to defer applying WAL, or cancel
> queries. I think both options need the same ability to detect when
> you're about to remove a tuple that's still visible to some snapshot,
> just the action is different. We should probably provide a GUC to
> control which you want.

I don't see any practical way of telling whether a tuple removal will
affect a snapshot or not. Each removed row would need to be checked
against each standby snapshot. Even if those were available, it would be
too costly. And even if we can do that, ISTM that neither option is
acceptable: if we cancel queries then touching a frequently updated
table is nearly impossible, or if we delay applying WAL then the standby
could fall behind, impairing its ability for use in HA. (If there was a
way, yes, we should have a parameter for it).

It was also suggested we might take the removed rows and put them in a
side table, but that makes me think of the earlier ideas for HOT and so
I've steered clear of that.

You might detect blocks that have had tuples removed from them *after* a
query started by either
* keeping a hash table of changed blocks - it would be a very big data
structure and hard to keep clean
* adding an additional "last cleaned LSN" onto every data block
* keeping an extra LSN on the bufhdr for each of the shared_buffers,
plus keeping a hash table of blocks that have been cleaned and then
paged out
Once detected, your only option is to cancel the query.

ISTM that if we want to avoid making recentxmin the same on both primary
and standby, then the only viable options are the three in the original post.

> However, if we still want to provide the behavior that "as long as the
> network connection works, the master will not remove tuples still needed
> in the slave" as an option, a lot simpler implementation is to
> periodically send the slave's oldest xmin to master. Master can take
> that into account when calculating its own oldest xmin. That requires a
> lot less communication than the proposed scheme to send snapshots back
> and forth. A softer version of that is also possible, where the master
> obeys the slave's oldest xmin, but only up to a point.

I like this very much. Much simpler implementation and no need for a
delay in granting snapshots. I'll go for this as the default
implementation. Thanks for the idea.
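
Purely to illustrate the kind of value involved (this is not the proposed
mechanism itself): on 8.3 the txid functions already expose a backend's
snapshot, and its first field is the xmin a standby would be reporting back
under this scheme, e.g.

SELECT txid_current_snapshot();                      -- xmin:xmax:running xids
SELECT txid_snapshot_xmin(txid_current_snapshot());  -- just the xmin

The value the standby sends would be the minimum of that over all of its
open snapshots.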

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[BUGS] Incorrect cursor behaviour with gist index

Hi,

I'm using PostgreSQL 8.3.1 with PostGIS 1.3.3.

I have the following table:

CREATE TABLE fog_4752 (
description text,
gid integer NOT NULL,
item_class text,
item_id integer,
origin_x double precision,
origin_y double precision,
origin_z double precision,
geometry geometry,
CONSTRAINT enforce_dims_geometry CHECK ((ndims(geometry) = 2)),
CONSTRAINT enforce_srid_geometry CHECK ((srid(geometry) = 27700)) );

INSERT INTO fog_4752 (description, gid, item_class, item_id, origin_x, origin_y, origin_z, geometry) VALUES ('Polygon', 6, 'Polygon', 6, 270463.5995574299, 660527.33722885954, 0, '0103000020346C0000010000000500000042098568C0E014411917774DA44F26419475BFC6784608411917774DA44F26419475BFC678460841936EDB0B1901224142098568C0E01441936EDB0B1901224142098568C0E014411917774DA44F2641');
INSERT INTO fog_4752 (description, gid, item_class, item_id, origin_x, origin_y, origin_z, geometry) VALUES ('Polygon', 4, 'Polygon', 4, 306782.6950348168, 112627.83974142233, 0, '0103000020346C00000100000005000000288A4FB70C430741E06CA5E47060F240EAA4C6336FD11941E06CA5E47060F240EAA4C6336FD1194130FD41FD044F0241288A4FB70C43074130FD41FD044F0241288A4FB70C430741E06CA5E47060F240');
INSERT INTO fog_4752 (description, gid, item_class, item_id, origin_x, origin_y, origin_z, geometry) VALUES ('Polygon', 5, 'Polygon', 5, 224805.30810014351, 415632.86486705049, 0, '0103000020346C00000100000005000000A0BDB7907EBA04415A4590094F4612417256A12EEB1311415A4590094F4612417256A12EEB1311419892D7F01B3B2041A0BDB7907EBA04419892D7F01B3B2041A0BDB7907EBA04415A4590094F461241');
INSERT INTO fog_4752 (description, gid, item_class, item_id, origin_x, origin_y, origin_z, geometry) VALUES ('Polygon', 1, 'Polygon', 1, 317159.57945692743, 809954.47290725145, 0, '0103000020346C0000010000000500000018E0648798E71641E6B7DC1478FF2A415CE0AA36489F0F41E6B7DC1478FF2A415CE0AA36489F0F41861465CF1170264118E0648798E71641861465CF1170264118E0648798E71641E6B7DC1478FF2A41');
INSERT INTO fog_4752 (description, gid, item_class, item_id, origin_x, origin_y, origin_z, geometry) VALUES ('Polygon', 2, 'Polygon', 2, 457247.5191554199, 527703.21662584448, 0, '0103000020346C00000100000005000000B50BF40E7B642041475B545A4EF4224186DD520906081741475B545A4EF4224186DD5209060817411E9EFD061D821A41B50BF40E7B6420411E9EFD061D821A41B50BF40E7B642041475B545A4EF42241');
INSERT INTO fog_4752 (description, gid, item_class, item_id, origin_x, origin_y, origin_z, geometry) VALUES ('Polygon', 3, 'Polygon', 3, 567242.49402979179, 197718.29200272885, 0, '0103000020346C000001000000050000003FCF4C7C885E23415E698CEE51801041BA452CFB42811E415E698CEE51801041BA452CFB42811E41E075E49D8189FE403FCF4C7C885E2341E075E49D8189FE403FCF4C7C885E23415E698CEE51801041');

ALTER TABLE ONLY fog_4752 ADD CONSTRAINT fog_4752_pkey PRIMARY KEY (gid);

CREATE INDEX fog_4752_geometry_sidx ON fog_4752 USING gist (geometry);

Now I'm running these SQL commands:

SET ENABLE_SEQSCAN = OFF;
BEGIN;
DECLARE C63 SCROLL CURSOR FOR
select * from fog_4752
where (geometry && setsrid('BOX(111697.268 85647.94,655446.012 679205.729)'::box2d,27700) and intersects(geometry,'SRID=27700;POLYGON((655446.011617731 679205.729188659,111697.267899139 679205.729188659,111697.267899139 85647.940243935,655446.011617731 85647.940243935,655446.011617731 679205.729188659))'::geometry));
FETCH ABSOLUTE -1 IN C63;
FETCH ABSOLUTE 1 IN C63;
FETCH FORWARD 10 IN C63;
FETCH ABSOLUTE -1 IN C63;
CLOSE C63;
END;

The query used to create the cursor selects 5 of the 6 rows in the table.

The problem is this: The "FETCH ABSOLUTE -1 IN C63" commands return zero rows, when clearly they should return one row, namely the last row in the cursor.

As far as I understand, ENABLE_SEQSCAN = OFF forces the query to use the gist index, which would otherwise not be used. However, if the cursor selected a suitably sized subset of a large enough table, the gist index would be used regardless of the ENABLE_SEQSCAN setting, so simply setting ENABLE_SEQSCAN = ON is not a solution that works in all cases.
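
For anyone reproducing this, one quick way to confirm that the gist index is in fact chosen is to EXPLAIN the bounding-box predicate from the cursor's query on its own (same table and constants as above):

SET enable_seqscan = off;
EXPLAIN
 SELECT * FROM fog_4752
 WHERE geometry && setsrid('BOX(111697.268 85647.94,655446.012 679205.729)'::box2d, 27700);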

To me this seems to be a bug in PostgreSQL. If it is, can it be fixed?

Regards,

Martin Schäfer
Principal Software Engineer
Cadcorp
Computer Aided Development Corporation Ltd.
1 Heathcock Court, London, WC2R 0NT
martin.schaefer@cadcorp.com
www.cadcorp.com

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [GENERAL] "Healing" a table after massive updates

On Thu, 2008-09-11 at 07:01 -0700, Gauthier, Dave wrote:
> Hi:
>
>
>
> I have a job that loads a large table, but then has to "update" about
> half the records for various reasons. My perception of what happens
> on update for a particular record is...
>
> - a new record will be inserted with the updated value(s).
>
> - The old record is marked as being obsolete.
>
> - Not sure what happens to index elements that pointed to the original
> (now obsolete) record. Is it updated to point directly at the newly
> inserted record? Or does it use the obsolete record as a "link" to
> the newly inserted record?

Depends on the version of Postgres. Prior to 8.3, every updated row leaves
behind a dead tuple and dead index entries. In 8.3, such updates can be HOT
updates, which create no new index entries and let the dead heap tuples be
cleaned up within the page. 8.3 might be a big help for you; it could remove
the need to vacuum this table entirely.
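
If you want to see how much of your update load actually becomes HOT after
moving to 8.3, the stats views show it (assuming the stats collector is on;
'mytable' is a placeholder):

SELECT n_tup_upd, n_tup_hot_upd
  FROM pg_stat_user_tables
 WHERE relname = 'mytable';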

> My concern is that the resulting table is not in optimal shape for
> queries. I would like to get rid of the obsoleted records (vacuum I
> believe) but also "heal" the table in terms of filling in the holes
> left where those deleted records used to be (will gather more records
> per disk block read if record density on disk is greater). Is there a
> way to do this?

Regular VACUUM is the correct operation to get rid of the dead tuples.

If you want to compact the table, you either need to use CLUSTER or
VACUUM FULL + REINDEX.
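
For example, a minimal sketch (table and index names are placeholders; the
CLUSTER ... USING spelling is new in 8.3, older releases use
CLUSTER indexname ON tablename):

VACUUM ANALYZE mytable;               -- marks dead tuples reusable, refreshes stats
CLUSTER mytable USING mytable_pkey;   -- rewrites the table compactly in index order
-- or:
VACUUM FULL mytable;                  -- compacts in place, takes an exclusive lock
REINDEX TABLE mytable;                -- rebuilds indexes bloated by the updates
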
--
Brad Nicholson 416-673-4106
Database Administrator, Afilias Canada Corp.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] "Healing" a table after massive updates

I might be able to answer my own question...

vacuum FULL (analyze is optional)

 

Correct?

 


From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Gauthier, Dave
Sent: Thursday, September 11, 2008 10:01 AM
To: pgsql-general@postgresql.org
Subject: [GENERAL] "Healing" a table after massive updates

 

Hi:

 

I have a job that loads a large table, but then has to “update” about half the records for various reasons.  My perception of what happens on update for a particular record is...

- a new record will be inserted with the updated value(s).

- The old record is marked as being obsolete.

- Not sure what happens to index elements that pointed to the original (now obsolete) record. Is it updated to point directly at the newly inserted record?  Or does it use the obsolete record as a “link” to the newly inserted record?

 

My concern is that the resulting table is not in optimal shape for queries.  I would like to get rid of the obsoleted records (vacuum I believe) but also “heal” the table in terms of filling in the holes left where those deleted records used to be (will gather more records per disk block read if record density on disk is greater).  Is there a way to do this?

 

Thanks

Re: [pgsql-es-ayuda] Replicacion.

Edilberto:

There are other options depending on your needs ... there is bucardo ...
Pgcluster and others that are paid. With Miguel's help I am trying out an
application that he knows...
... but I won't be able to look into any of this until Monday of next week ...
as you must have read in the news ... after the damage caused by the two
hurricanes that have just passed through here in Cuba ... we are busy with
recovery work.

Regards,
Gilberto.

--
TIP 10: don't use HTML in your question, whoever answers probably won't be able to read it

Re: [HACKERS] Transaction Snapshots and Hot Standby

Csaba Nagy wrote:
> and that means in fact that if you have
> continuously overlapping small transactions, the "blocking horizon"
> could even be blocked forever, as there'll always be a query running,
> and the new queries will always have the snapshot of the currently
> running ones because WAL recovery is stalled...

Hmm, no I don't think the WAL recovery can become completely stalled. To
completely stop progressing, we'd need to take a new snapshot that
includes transaction X, and at the same time be blocked on a vacuum
record that vacuums a tuple that's visible to transaction X. I don't
think that can happen, because for such a scenario to arise, at the
corresponding point in time on the master, the vacuum would've had to
remove a tuple that was still visible to a newly starting transaction.
Which can't happen. I think..

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[GENERAL] "Healing" a table after massive updates

Hi:

 

I have a job that loads a large table, but then has to “update” about half the records for various reasons.  My perception of what happens on update for a particular record is...

- a new record will be inserted with the updated value(s).

- The old record is marked as being obsolete.

- Not sure what happens to index elements that pointed to the original (now obsolete) record. Is it updated to point directly at the newly inserted record?  Or does it use the obsolete record as a “link” to the newly inserted record?

 

My concern is that the resulting table is not in optimal shape for queries.  I would like to get rid of the obsoleted records (vacuum I believe) but also “heal” the table in terms of filling in the holes left where those deleted records used to be (will gather more records per disk block read if record density on disk is greater).  Is there a way to do this?

 

Thanks

[COMMITTERS] pgsql: Initialize the minimum frozen Xid in vac_update_datfrozenxid

Log Message:
-----------
Initialize the minimum frozen Xid in vac_update_datfrozenxid using
GetOldestXmin() instead of RecentGlobalXmin; this is safer because we do not
depend on the latter being correctly set elsewhere, and while it is more
expensive, this code path is not performance-critical. This is a real
risk for autovacuum, because it can execute whole cycles without doing
a single vacuum, which would mean that RecentGlobalXmin would stay at its
initialization value, FirstNormalTransactionId, causing a bogus value to be
inserted in pg_database. This bug could explain some recent reports of
failure to truncate pg_clog.

At the same time, change the initialization of RecentGlobalXmin to
InvalidTransactionId, and ensure that it's set to something else whenever
it's going to be used. Using it as FirstNormalTransactionId in HOT page
pruning could result in data loss. InitPostgres takes care of setting it
to a valid value, but the extra checks are there to prevent "special"
backends from behaving in unusual ways.

Per Tom Lane's detailed problem dissection in 29544.1221061979@sss.pgh.pa.us

Tags:
----
REL8_3_STABLE

Modified Files:
--------------
pgsql/src/backend/access/heap:
heapam.c (r1.249.2.2 -> r1.249.2.3)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/heap/heapam.c?r1=1.249.2.2&r2=1.249.2.3)
pgsql/src/backend/access/index:
indexam.c (r1.101 -> r1.101.2.1)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/index/indexam.c?r1=1.101&r2=1.101.2.1)
pgsql/src/backend/commands:
vacuum.c (r1.364.2.1 -> r1.364.2.2)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/commands/vacuum.c?r1=1.364.2.1&r2=1.364.2.2)
pgsql/src/backend/executor:
nodeBitmapHeapscan.c (r1.22 -> r1.22.2.1)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/executor/nodeBitmapHeapscan.c?r1=1.22&r2=1.22.2.1)
pgsql/src/backend/utils/init:
postinit.c (r1.180 -> r1.180.2.1)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/init/postinit.c?r1=1.180&r2=1.180.2.1)
pgsql/src/backend/utils/time:
tqual.c (r1.109 -> r1.109.2.1)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/time/tqual.c?r1=1.109&r2=1.109.2.1)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

Re: [pgsql-advocacy] Ohio LinuxFest Booth(revised)

Let me clarify: stickers that have the url for the download on them...

Melanie wrote:
Actually, I spoke with JD last night and we liked the idea of stickers that have the url on them.  I would expect most people attending are tech savvy and even if they could give a rat's about sustainability (hey, they might) they're probably more interested in getting what they need from the web vs crap that sits around.  Though everything mankind does involves the environment, products that can be made from recycled goods and/or can be recycled are obviously ideal.

Hey Robert -  I understand you're involved in an open source show that is to be held in Atlanta.  Atlanta is like 5 hours from us, we'd like to participate, what's involved?




Robert Treat wrote:
> On Wednesday 10 September 2008 19:18:06 Josh Berkus wrote:
>> On Wednesday 10 September 2008 14:56, Melanie wrote:
>>> This saves me from putting out a request for this stuff later.  Andrew
>>> wants to give out CD's with Postgres on it, thoughts where I might
>>> arrange to get those?
>> Well, we didn't print any on the West Coast because we've become concerned
>> about environmental issues.  We did get USB keys for "special people" (they
>> cost $9 each to buy) which could be used for handing out images, but not to
>> every attendee.
> Yeah, you don't want to do that at OHLF, the audience is far too varied. I
> could see CD's though, especially with the positive feedback that's been
> received.