Friday, May 9, 2008

Re: [HACKERS] constraint exclusion analysis caching

On Fri, 2008-05-09 at 08:47 -0400, Andrew Dunstan wrote:

> Yesterday a client and I were sad to discover that the overhead of
> constraint exclusion is apparently O(n) in the number of partitions, and
> that where we had ~180 partitions each with a simple constraint (check
> (field = nnn)) the overhead appeared to amount to about 0.25s on some
> quite performant hardware, which is way too high for our application.
> Actual execution of the query in question was taking one tenth of that
> time.
>
> For now we're going to work around this by directing the queries
> directly to the child tables, although this does involve fairly large
> application changes.
>
> However, I wondered if we couldn't mitigate this by caching the results
> of constraint exclusion analysis for a particular table + condition. I
> have no idea how hard this would be, but in principle it seems silly to
> keep paying the same penalty over and over again.
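
For context, a minimal sketch of the setup being described, using the
column name from the report and hypothetical table names:

    -- Inheritance-based partitioning, one child per key value; the
    -- report above has ~180 such children.
    CREATE TABLE events (id int, field int, payload text);

    CREATE TABLE events_1 (CHECK (field = 1)) INHERITS (events);
    CREATE TABLE events_2 (CHECK (field = 2)) INHERITS (events);
    -- ... one child per value ...

    -- With constraint exclusion on, the planner tests the WHERE clause
    -- against every child's CHECK constraint, so planning overhead
    -- grows linearly with the number of children.
    SET constraint_exclusion = on;
    EXPLAIN SELECT * FROM events WHERE field = 2;

    -- The workaround described above: target the matching child
    -- directly, skipping the exclusion pass entirely.
    SELECT * FROM events_2 WHERE field = 2;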

I think the only way forward is to put an index across the constraints,
so that the exclusion time becomes O(log N) rather than O(N).

Currently the constraints are all independent of each other and can even
overlap. So we would need a way of:

* confirming that the partitions are non-overlapping
* defining some structure for them, so they can be organised in a
sequence that supports either a bsearch or an index

The latter requires some kind of top-down definition, which hopefully is
on the way from Gavin.
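
As a sketch of what such a top-down, non-overlapping definition could
look like (the syntax is illustrative; it is essentially the shape that
PostgreSQL 10 later adopted as declarative partitioning):

    -- Bounds are declared centrally and cannot overlap, so the system
    -- can keep them sorted and locate the matching partition with a
    -- binary search instead of testing each CHECK constraint in turn.
    CREATE TABLE events (id int, field int, payload text)
        PARTITION BY LIST (field);

    CREATE TABLE events_1 PARTITION OF events FOR VALUES IN (1);
    CREATE TABLE events_2 PARTITION OF events FOR VALUES IN (2);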

This would then allow exclusion to take place dynamically within the
executor, enabling a form of nested join.
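
For what it's worth, later releases did implement executor-time pruning
along these lines. An illustration against the declarative sketch
above, using PostgreSQL 12+ syntax to keep the key unknown at plan
time:

    -- With a generic plan the partition key is a parameter, so
    -- exclusion has to happen inside the executor.
    SET plan_cache_mode = force_generic_plan;
    PREPARE q(int) AS SELECT * FROM events WHERE field = $1;
    EXPLAIN (ANALYZE) EXECUTE q(2);
    -- The output reports "Subplans Removed: ..." for the children
    -- pruned at executor startup.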

My other requirements are noted here...
http://wiki.postgresql.org/wiki/Image:Partitioning_Requirements.pdf

I'm not working on this at all at the moment.

--
Simon Riggs
2ndQuadrant

http://www.2ndQuadrant.com


