Tuesday, August 19, 2008

Re: [PERFORM] Slow query with a lot of data

On Tue, 19 Aug 2008, Moritz Onken wrote:
> explain select
> a."user", b.category, sum(1.0/b.cat_count)::float
> from result a, domain_categories b
> where a."domain" = b."domain"
> group by a."user", b.category;

> Both results and domain_categories are clustered on domain and analyzed.
> Why is it still sorting on domain? I thought the clustering should prevent
> the planner from doing this?

As far as I can tell, it should. If both tables are clustered on an index
on domain, and then analysed, the planner should no longer have to sort on
domain.
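
To be explicit about what I mean by clustered and analysed, here is a rough
sketch. The index names are made up (I do not know what yours are called),
and the CLUSTER ... USING form assumes PostgreSQL 8.3 or later:

```sql
-- Hypothetical index names; substitute whatever indexes exist on "domain".
CLUSTER result USING result_domain_idx;
CLUSTER domain_categories USING domain_categories_domain_idx;

-- Refresh planner statistics so the planner sees the new physical ordering.
ANALYZE result;
ANALYZE domain_categories;
```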

Could you post here the results of running:

select * from pg_stats where attname = 'domain';
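
If the full output is too wide to paste, a narrower variant like the one
below should show the interesting part. The table names are taken from your
query. The correlation column tells the planner how closely the physical row
order follows the order of the column values, so I would expect it to be
close to 1 (or -1) right after a CLUSTER and ANALYZE:

```sql
-- Narrowed version of the query above; table names assumed from the
-- original query ("result" and "domain_categories").
select tablename, attname, n_distinct, correlation
from pg_stats
where attname = 'domain'
  and tablename in ('result', 'domain_categories');
```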

> It took 50 minutes to run this query for 280 users ("and "user" IN ([280
> ids])"), 78000 rows were returned and stored in a table. Is this reasonable?

Sounds like an awfully long time to me. Also, I think restricting it to
280 users is probably not making it any faster.

Matthew

--
It's one of those irregular verbs - "I have an independent mind," "You are
an eccentric," "He is round the twist."
-- Bernard Woolley, Yes Prime Minister

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
