Saturday, June 21, 2008

Re: [HACKERS] Hash index build patch has *worse* performance at small table sizes

Did we ever do anything about this?

---------------------------------------------------------------------------

Tom Lane wrote:
> I've been reviewing the hash index build patch submitted here:
> http://archives.postgresql.org/pgsql-patches/2007-10/msg00154.php
>
> Although it definitely helps on large indexes, it's actually
> counterproductive on not-so-large ones. The test case I'm using
> is random integers generated like this:
> create table foo as select (random() * N)::int as f1
> from generate_series(1,N);
> select count(*) from foo; -- force hint bit updates
> checkpoint;
> then timing
> create index fooi on foo using hash(f1);
>
> Using all-default configuration settings on some not-very-new hardware,
> at N = 1E6 I see
>
> 8.3.1: 30 sec
> With pre-expansion of index (CVS HEAD): 24 sec
> With sorting: 72 sec
> To build a btree index on same data: 34 sec
>
> Now this isn't amazingly surprising, because the original argument for
> doing sorting was to improve locality of access to the index during
> the build, and that only matters if you've got an index significantly
> bigger than memory. If the index fits in RAM then the sort is pure
> overhead.
>
> The obvious response to this is to use the sorting approach only when
> the estimated index size exceeds some threshold. One possible choice of
> threshold would be shared_buffers (or temp_buffers for a temp index)
> but I think that is probably too conservative, because in most scenarios
> the kernel's disk cache is available too. Plus you can't tweak that
> setting without a postmaster restart. I'm tempted to use
> effective_cache_size, which attempts to measure an appropriate number
> and can be set locally within the session doing the CREATE INDEX if
> necessary. Or we could invent a new GUC parameter, but that is probably
> overkill.
>
> Comments?
>
> regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--
Bruce Momjian <bruce@momjian.us>

http://momjian.us

EnterpriseDB

http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

No comments: