Saturday, May 24, 2008

Re: [PERFORM] I/O on select count(*)

On May 18, 2008, at 1:28 AM, Greg Smith wrote:
> I just collected all the good internals information included in
> this thread and popped it onto
> http://wiki.postgresql.org/wiki/Hint_Bits
> where I'll continue to hack away at the text until it's
> readable. Thanks to everyone who answered my questions here,
> that's good progress toward clearing up a very underdocumented area.
>
> I note a couple of potential TODO items not on the official list
> yet that came up during this discussion:
>
> -Smooth latency spikes when switching commit log pages by
> preallocating cleared pages before they are needed
>
> -Improve bulk loading by setting "frozen" hint bits for tuple
> inserts which occur within the same database transaction as the
> creation of the table into which they're being inserted
>
> Did I miss anything? I think everything brought up falls either
> into one of those two or the existing "Consider having the
> background writer update the transaction status hint bits..." TODO.

-Evaluate impact of improved caching of CLOG per Greenplum:

Per Luke Lonergan:
I'll find out if we can extract our code that did the work. It was
simple but scattered in a few routines. In concept it worked like this:

1 - Ignore if hint bits are unset, use them if set. This affects
heapam and vacuum I think.
2 - implement a cache for clog lookups based on the optimistic
assumption that the data was inserted in bulk. Put the cache one
call away from heapgetnext()

I forget the details of (2). As I recall, if we fall off of the
assumption, the penalty for long scans gets large-ish (maybe 2X), but
since when do people full table scan when their updates/inserts are
so scattered across TIDs? It's an obvious big win for DW work.
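
To make the idea in (2) concrete, here is a minimal sketch of a one-entry
cache sitting in front of a commit-log lookup. It is not Greenplum's code,
which is not shown in this thread; the names LastXidCache, scan and the
dict standing in for CLOG are all hypothetical, and a real implementation
would live in C next to the heap scan code. It only illustrates why the
bulk-insert assumption pays off: consecutive tuples share an inserting
transaction, so almost every lookup is answered from the cache.

```python
from typing import Callable, Iterable, Optional

COMMITTED = "committed"
ABORTED = "aborted"
IN_PROGRESS = "in_progress"


class LastXidCache:
    """One-entry cache in front of a (hypothetical) commit-log lookup."""

    def __init__(self, clog_lookup: Callable[[int], str]) -> None:
        self._lookup = clog_lookup
        self._xid: Optional[int] = None
        self._status: Optional[str] = None
        self.hits = 0
        self.misses = 0

    def status(self, xid: int) -> str:
        # Optimistic path: same transaction as the previous tuple.
        if xid == self._xid and self._status is not None:
            self.hits += 1
            return self._status
        self.misses += 1
        status = self._lookup(xid)
        # Only final states are remembered; an in-progress transaction
        # may still commit or abort before the next lookup.
        if status != IN_PROGRESS:
            self._xid, self._status = xid, status
        return status


def scan(xmins: Iterable[int], cache: LastXidCache) -> int:
    """Count tuples whose inserting transaction committed."""
    return sum(1 for xid in xmins if cache.status(xid) == COMMITTED)


# Stand-in for CLOG: transaction id -> final status (assumed data).
clog = {100: COMMITTED, 101: ABORTED}

# Bulk-loaded table: every tuple was inserted by transaction 100.
bulk_cache = LastXidCache(clog.__getitem__)
bulk_visible = scan([100] * 1000, bulk_cache)

# Worst case: the inserting transaction alternates on every tuple.
scattered_cache = LastXidCache(clog.__getitem__)
scattered_visible = scan([100, 101] * 500, scattered_cache)
```

In the bulk case only the first tuple reaches the underlying lookup. In the
alternating case every tuple misses, so the cache adds a comparison per
tuple and saves nothing, which is the "fall off of the assumption" penalty
described above.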

--
Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828
