Thursday, September 4, 2008

Re: [PERFORM] limit clause breaks query planner?

"Matt Smiley" <mss@rentrak.com> writes:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> default cost settings will cause it to prefer bitmap scan for retrieving
>> up to about a third of the table, in my experience). I too am confused
>> about why it doesn't prefer that choice in the OP's example.

> It looks like the bitmap scan has a higher cost estimate because the
> entire bitmap index must be built before beginning the heap scan and
> returning rows up the pipeline.

Oh, of course. The LIMIT is small enough to make it look like we can
get the required rows after scanning only a small part of the table,
so the bitmap scan will lose out in the cost comparison because of its
high startup cost.
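
The effect can be sketched numerically. This is a minimal illustration, assuming the cost of fetching only part of a plan's output is interpolated linearly between its startup cost and its total cost. The function name and all cost figures below are made up for illustration and are not taken from the OP's plan.

```python
def limited_cost(startup_cost, total_cost, est_rows, limit):
    """Estimated cost of fetching only the first `limit` rows of a plan,
    interpolating linearly between its startup cost and its total cost."""
    fraction = min(1.0, limit / est_rows) if est_rows > 0 else 1.0
    return startup_cost + (total_cost - startup_cost) * fraction


# Hypothetical estimates: the seqscan can return rows immediately, while the
# bitmap scan must build the whole bitmap before returning its first row.
seq = {"startup_cost": 0.0, "total_cost": 20000.0}
bitmap = {"startup_cost": 3000.0, "total_cost": 12000.0}
est_rows = 500_000  # rows the planner expects the filter to match
limit = 15

seq_limited = limited_cost(est_rows=est_rows, limit=limit, **seq)
bitmap_limited = limited_cost(est_rows=est_rows, limit=limit, **bitmap)

print(f"seqscan under LIMIT: {seq_limited:.2f}")
print(f"bitmap  under LIMIT: {bitmap_limited:.2f}")
```

With these numbers the bitmap scan is cheaper for the full result set, but under the LIMIT it can never cost less than its startup cost, so the seqscan looks far cheaper.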

Ultimately the only way that we could get the right answer would be if
the planner realized that the required rows are concentrated at the end
of the table instead of being randomly scattered. This isn't something
that is considered at all right now in seqscan cost estimates. I'm not
sure offhand whether the existing correlation stats would be of use for
it, or whether we'd have to get ANALYZE to gather additional data.
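
To see how far off the scattered-rows assumption can be, here is a small simulation. It is only a sketch: the table size, match count, and helper name are hypothetical, and it models a seqscan as a walk over row positions in physical order rather than anything the planner actually computes.

```python
import random


def rows_scanned_for_limit(matches, limit):
    """Walk the table in physical order and return how many rows a
    sequential scan reads before it has collected `limit` matches."""
    found = 0
    for position, is_match in enumerate(matches, start=1):
        if is_match:
            found += 1
            if found == limit:
                return position
    return len(matches)  # fewer than `limit` matches: whole table read


TABLE_ROWS = 100_000  # hypothetical table size
MATCHING = 30_000     # hypothetical number of rows passing the filter
LIMIT = 15

# Case 1: matching rows scattered randomly through the table.
rng = random.Random(0)
scattered = [True] * MATCHING + [False] * (TABLE_ROWS - MATCHING)
rng.shuffle(scattered)

# Case 2: the same matching rows, all concentrated at the end of the table.
clustered = [False] * (TABLE_ROWS - MATCHING) + [True] * MATCHING

# What a uniform-scatter assumption predicts the scan will need to read.
uniform_estimate = LIMIT * TABLE_ROWS / MATCHING

scattered_scanned = rows_scanned_for_limit(scattered, LIMIT)
clustered_scanned = rows_scanned_for_limit(clustered, LIMIT)

print(f"uniform estimate: {uniform_estimate:.0f} rows")
print(f"scattered actual: {scattered_scanned} rows")
print(f"clustered actual: {clustered_scanned} rows")
```

In the clustered case the scan has to read every non-matching row first, so the real work is orders of magnitude larger than the estimate that made the seqscan look attractive.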

regards, tom lane

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
