Tuesday, July 15, 2008

Re: [GENERAL] optimizer ignoring primary key and doing sequence scan

On Tue, Jul 15, 2008 at 12:19 AM, Edoardo Panfili <edoardo@aspix.it> wrote:
> Scott Marlowe ha scritto:
>>
>> On Mon, Jul 14, 2008 at 1:54 PM, Chris Hoy <chris.hoy@hoyc.fsnet.co.uk>
>> wrote:
>>>
>>> Hi
>>>
>>> I have a number of tables in my database where the queries appear to
>>> ignoring the primary key and doing a seq scan instead, however other
>>> tables
>>> appear to be fine. I can see any difference between them.
>>>
>>> Is their any way of determination why the otimizer isn't picking up the
>>> primary key?
>>>
>>> Version 8.3.3 windows
>>>
>>> An example of a non working table is:
>>>
>>> select * from industries where industryid = 1;
>>> "Seq Scan on industries (cost=0.00..1.02 rows=1 width=116) (actual
>>> time=0.011..0.013 rows=1 loops=1)"
>>
>> According to this there's only one row in the table. why WOULD
>> postgresql use an index when it can just scan the one row table in a
>> split second.
>>
> I agree with you that it can depend on the size of the table but where you
> can read that the table contains only one row?

Actually I meant to write one page or block there, not row. But it's
the same diff really. 1 Row or 100, if they fit in an 8k block
together, pgsql isn't going to use an index to look them up. It kinda
knows which block they'll be in ahead of time.

> I try with my table (39910 rows, no index on column note)
> explain analyze select * from table where note='single example';
>
> Seq Scan on table (cost=0.00..2458.88 rows=13 width=327) (actual
> time=10.901..481.896 rows=1 loops=1)

Yeah that'll take 481 blocks scanned to get the answer.

> On the postgres manual I can find "Estimated number of rows output by this
> plan node (Again, only if executed to completion.)" regarding the third
> parameter of the explain
>
> Where is my error?

I'm not seeing an error, just a possible misunderstanding of pgsql
plans queries. Without an index above on note, it has to do a
sequential scan there's no index to shorten up the work.

The estimated number of rows are how many pgsql thinks it will get
back when it runs the query, based on the statistics it has stored
from the last time analyze was run.

The actual number of rows listed in the explain analyze output is the
actual number of rows, not the estimated number...

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

No comments: