Sunday, July 13, 2008

[ANNOUNCE] == PostgreSQL Weekly News - July 13 2008 ==

== PostgreSQL Product News ==

Open Technology Group has created a high-availability training course.
http://www.otg-nc.com/training-courses/coursedetail.php?courseid=65&cat_id=8

== PostgreSQL Jobs for July ==

http://archives.postgresql.org/pgsql-jobs/2008-07/threads.php

== PostgreSQL Local ==

The Call for Papers for European PGDay has begun.
http://www.pgday.org/en/call4papers

pgDay Portland is July 20, just before OSCON.
http://pugs.postgresql.org/node/400

PGCon Brazil 2008 will be on September 26-27 at Unicamp in Campinas.
http://pgcon.postgresql.org.br/index.en.html

PGDay.(IT|EU) 2008 will be October 17 and 18 in Prato.
http://www.pgday.org/it/

== PostgreSQL in the News ==

Planet PostgreSQL: http://www.planetpostgresql.org/

General Bits, Archives and occasional new articles:
http://www.varlena.com/GeneralBits/

PostgreSQL Weekly News is brought to you this week by David Fetter.

Submit news and announcements by Sunday at 3:00pm Pacific time.
Please send English language ones to david@fetter.org, German language
to pwn@pgug.de, Italian language to pwn@itpug.org.

== Applied Patches ==

Peter Eisentraut committed:

- In pgsql/doc/src/sgml/func.sgml, added documentation for function
xmlagg.

- Allow binary-coercible types for cast function arguments and return
types. Document return type of cast functions. Also change
documentation to prefer the term "binary coercible" in its present
sense instead of the previous term "binary compatible".

Tom Lane committed:

- Fix AT TIME ZONE (in all three variants) so that we first try to
interpret the timezone argument as a timezone abbreviation, and only
try it as a full timezone name if that fails. The zic database has
four zones (CET, EET, MET, WET) that are full daylight-savings zones
and yet have names that are the same as their abbreviations for
standard time, resulting in ambiguity. In the timestamp input
functions we resolve the ambiguity by preferring the abbreviation,
and AT TIME ZONE should work the same way. (No functionality is
lost because the zic database also has other names for these zones,
e.g. Europe/Zurich.) Per gripe from Jaromir Talir. Back-patch to 8.1.
Older releases did not have the issue because AT TIME ZONE only
accepted abbreviations not zone names. (Thus, this patch also
arguably fixes a compatibility botch introduced at 8.1: in ambiguous
cases we now behave the same as 8.0 did.)

- In pgsql/src/backend/utils/adt/selfuncs.c, fix estimate_num_groups()
to assume that GROUP BY expressions yielding boolean results always
contribute two groups, regardless of the expression contents. This
is very substantially more accurate than the regular heuristic for
certain boolean tests like "col IS NULL". Per gripe from Sam Mason.
Back-patch to all supported releases, since the behavior of
estimate_num_groups() hasn't changed all that much since 7.4.

- In pgsql/src/backend/utils/error/elog.c, fix performance bug in
write_syslog(): the code to preferentially break the log message at
newlines cost O(N^2) for very long messages with few or no newlines.
For messages in the megabyte range this became the dominant cost.
Per gripe from Achilleas Mantzios. Patch all the way back, since
this is a safe change with no portability risks. I am also thinking
of increasing PG_SYSLOG_LIMIT, but that should be done separately.

- In pgsql/src/backend/utils/error/elog.c, increase PG_SYSLOG_LIMIT
(the max line length sent to syslog()) from 128 to 1024 to improve
performance when sending large elog messages. Also add a comment
about why we use that number. Since this represents an externally
visible behavior change, and might possibly result in portability
issues, it seems best not to back-patch it.

- Fix mis-calculation of extParam/allParam sets for plan nodes, as
seen in bug #4290. The fundamental bug is that masking extParam by
outer_params, as finalize_plan had been doing, caused us to lose the
information that an initPlan depended on the output of a sibling
initPlan. On reflection the best thing to do seemed to be not to
try to adjust outer_params for this case but get rid of it entirely.
The only thing it was really doing for us was to filter out param
IDs associated with SubPlan nodes, and that can be done (with
greater accuracy) while processing individual SubPlan nodes in
finalize_primnode. This approach was vindicated by the discovery
that the masking method was hiding a second bug: SS_finalize_plan
failed to remove extParam bits for initPlan output params that were
referenced in the main plan tree (it only got rid of those
referenced by other initPlans). It's not clear that this caused any
real problems, given the limited use of extParam by the executor,
but it's certainly not what was intended. I originally thought that
there was also a problem with needing to include indirect
dependencies on external params in initPlans' param sets, but it
turns out that the executor handles this correctly so long as the
depended-on initPlan is earlier in the initPlans list than the one
using its output. That seems a bit of a fragile assumption, but it
is true at the moment, so I just documented it in some code comments
rather than making what would be rather invasive changes to remove
the assumption. Back-patch to 8.1. Previous versions don't have
the case of initPlans referring to other initPlans' outputs, so
while the existing logic is still questionable for them, there are
not any known bugs to be fixed. So I'll refrain from changing them
for now.

- Tighten up SS_finalize_plan's computation of valid_params to exclude
Params of the current query level that aren't in fact output
parameters of the current initPlans. (This means, for example,
output parameters of regular subplans.) To make this work correctly
for output parameters coming from sibling initplans requires
rejiggering the API of SS_finalize_plan just a bit: we need the
siblings to be visible to it, rather than hidden as
SS_make_initplan_from_plan had been doing. This is really part of
my response to bug #4290, but I concluded this part probably
shouldn't be back-patched, since all that it's doing is to make a
debugging cross-check tighter.

- Add unchangeable GUC "variables" segment_size, wal_block_size, and
wal_segment_size to make those configuration parameters available to
clients, in the same way that block_size was previously exposed.
Bernd Helmle, with comments from Abhijit Menon-Sen and some further
tweaking by me.

- In pgsql/src/backend/utils/time/snapmgr.c, fix a few typos in
comments and sort header inclusions alphabetically.

- Fix an oversight in the original implementation of
performMultipleDeletions(): the alreadyDeleted list has to be passed
down through deleteDependentObjects(), else objects that are deleted
via auto/internal dependencies don't get reported back up to
performMultipleDeletions(). Depending on the visitation order, this
could cause the code to try to delete an already-deleted object,
leading to strange errors in DROP OWNED (typically "cache lookup
failed for relation NNNNN" or similar). Per bug #4289. Patch for
back branches only. This code has recently been rewritten in HEAD,
and doesn't have this particular bug anymore.

- Multi-column GIN indexes. Teodor Sigaev

- Const-ify the arguments of str_tolower() and friends to suppress
compile warnings. Clean up various unneeded cruft that was left
behind after creating those routines. Introduce some convenience
functions str_tolower_z etc. to eliminate tedious and error-prone
double arguments in formatting.c. (Currently there seems no need to
export the latter, but maybe reconsider this later.)

- In pgsql/src/include/pg_config_manual.h, don't make --enable-cassert
turn on RANDOMIZE_ALLOCATED_MEMORY automatically; it's just too dang
expensive. Per recent discussion, but I just got my nose rubbed in
it again while doing some performance checking.

- More replacements of "binary compatible" with "binary coercible".

- In pgsql/doc/src/sgml/ref/create_cast.sgml, fix a couple of stray
misuses of "binary compatible".

- Clean up the use of some page-header-access macros: principally, use
SizeOfPageHeaderData instead of sizeof(PageHeaderData) in places
where that makes the code clearer, and avoid casting between Page
and PageHeader where possible. Zdenek Kotala, with some additional
cleanup by Heikki Linnakangas. I did not apply the parts of the
proposed patch that would have resulted in slightly changing the
on-disk format of hash indexes; it seems to me that's not a win as
long as there's any chance of having in-place upgrade for 8.4.

- Change the PageGetContents() macro to guarantee its result is
maxalign'd, thereby forestalling any problems with alignment of the
data structure placed there. Since SizeOfPageHeaderData is
maxalign'd anyway in 8.3 and HEAD, this does not actually change
anything right now, but it is foreseeable that the header size will
change again someday. I had to fix a couple of places that were
assuming that the content offset is just SizeOfPageHeaderData rather
than MAXALIGN(SizeOfPageHeaderData). Per discussion of Zdenek's
page-macros patch.

- Create a type-specific typanalyze routine for tsvector, which
collects stats on the most common individual lexemes in place of the
mostly-useless default behavior of counting duplicate tsvectors.
Future work: create selectivity estimation functions that actually
do something with these stats. (Some other things we ought to look
at doing: using the Lossy Counting algorithm in
compute_minimal_stats, and using the element-counting idea for stats
on regular arrays.) Jan Urbanski
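
As a rough illustration of the tsvector typanalyze item above: counting
duplicate whole values says little when nearly every document is
distinct, while counting individual lexemes surfaces the common ones.
The sketch below is Python, not the committed C code. Modelling a
tsvector as a set of strings, and the names most_common_values and
most_common_lexemes, are assumptions made only for this example.

```python
from collections import Counter


def most_common_values(docs, k):
    """Default-style stats: count identical whole values."""
    return Counter(frozenset(d) for d in docs).most_common(k)


def most_common_lexemes(docs, k):
    """Lexeme-level stats: count how many documents contain each lexeme."""
    counts = Counter()
    for d in docs:
        counts.update(set(d))  # each lexeme counted once per document
    return counts.most_common(k)


# Hypothetical sample data: three distinct "tsvectors".
docs = [
    {"postgres", "index", "gin"},
    {"postgres", "vacuum"},
    {"postgres", "index", "gist"},
]

if __name__ == "__main__":
    print(most_common_values(docs, 1))
    print(most_common_lexemes(docs, 2))
```

With this sample, every whole value occurs exactly once, so the
value-level count carries no signal, whereas the lexeme-level count
shows that "postgres" appears in all three documents.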
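
The PageGetContents() item above relies on rounding the page header
size up to the maximum alignment. A minimal sketch of that arithmetic
follows, in Python rather than the C macros. The alignment of 8 bytes
and the header sizes used are assumptions for illustration, since the
real values depend on the platform and release.

```python
MAXIMUM_ALIGNOF = 8  # assumed alignment; the real value is platform-dependent


def maxalign(length, align=MAXIMUM_ALIGNOF):
    """Round length up to the next multiple of align (a power of two)."""
    return (length + align - 1) & ~(align - 1)


def page_contents_offset(header_size):
    """Offset of page contents: the header size rounded up, not the raw size."""
    return maxalign(header_size)


if __name__ == "__main__":
    for size in (20, 24, 25):  # hypothetical header sizes
        print(size, "->", page_contents_offset(size))
```

When the header size is already a multiple of the alignment, the two
offsets coincide, which is why code assuming the raw header size only
breaks once the header size changes.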
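
The lookup order described in the AT TIME ZONE item above (abbreviation
first, full zone name only if that fails) can be sketched as follows.
This is Python, not the backend code. The two tables and the function
resolve_zone are hypothetical stand-ins for the real abbreviation list
and zic database.

```python
# Hypothetical tables. Abbreviations map to fixed UTC offsets in seconds;
# full zone names stand for rule sets that include daylight-savings changes.
ABBREVIATIONS = {"CET": 3600, "EET": 7200}
FULL_ZONES = {"CET", "EET", "Europe/Zurich"}


def resolve_zone(name):
    """Try the name as an abbreviation first, then as a full zone name."""
    if name in ABBREVIATIONS:
        return ("abbrev", ABBREVIATIONS[name])
    if name in FULL_ZONES:
        return ("zone", name)
    raise ValueError("time zone %r not recognized" % name)


if __name__ == "__main__":
    print(resolve_zone("CET"))            # ambiguous name: abbreviation wins
    print(resolve_zone("Europe/Zurich"))  # unambiguous full zone name
```

An ambiguous name such as CET resolves to the fixed-offset
abbreviation, while the daylight-savings rules stay reachable through
an unambiguous name such as Europe/Zurich.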

Bruce Momjian committed:

- In pgsql/src/backend/utils/misc/guc.c, add comment for deadlock_timeout:
"This is PGC_SIGHUP so all backends have the same value."

Neil Conway committed:

- In pgsql/src/backend/access/gin/README, minor improvements to the
Gin internal documentation.

Heikki Linnakangas committed:

- In pgsql/contrib/pg_standby/pg_standby.c, fix WAL file cutoff point
calculation in pg_standby. Patch by Simon Riggs, per bug report
from Ferenc Felhoffer.

Alvaro Herrera committed:

- Make sure we only try to free snapshots that have been passed
through CopySnapshot, per Neil Conway. Also add a comment about the
assumption in GetSnapshotData that the argument is statically
allocated. Also, fix some more typos in comments in snapmgr.c.

Teodor Sigaev committed:

- Add caching of query to GIN/GiST consistent function. Per
performance gripe from nomao.com.

== Rejected Patches (for now) ==

No one was disappointed this week :-)

== Pending Patches ==

Heikki Linnakangas sent in a revision of the page macros cleanup.

Simon Riggs sent in a patch to change PGC_USERSET to PGC_SUSET for
logging files.

Bernd Helmle sent in a patch which adds some missing descriptions for
aggregates, functions and conversions.

Pavel Stehule, with feedback from Marko Kreen, sent in two more
revisions of his table function support patch.

Ken Camann sent in a patch to get Postgres to compile under 64-bit
Windows.

Jaime Casanova sent in another revision of his patch which makes
granting INSERT on a table extend to any sequences attached.

Tom Lane sent in a revised version of David Wheeler's case-insensitive
text patch.

