Wednesday, July 30, 2008

[PERFORM] why "WHERE uid NOT IN" is so slow, and EXCEPT in the same situtation is fast?

Two queries which do the same thing, first one takes ages to complete
(did wait several minutes and cancelled it), while the second one took
9 seconds? Don't they do the same thing?

miernik=> EXPLAIN SELECT uid FROM locks WHERE uid NOT IN (SELECT uid FROM locks INNER JOIN wys USING (uid, login));
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on locks (cost=38341.39..61365389.71 rows=48446 width=4)
Filter: (NOT (subplan))
SubPlan
-> Materialize (cost=38341.39..39408.47 rows=79508 width=4)
-> Hash Join (cost=3997.27..37989.89 rows=79508 width=4)
Hash Cond: (((wys.uid)::integer = (locks.uid)::integer) AND ((wys.login)::text = (locks.login)::text))
-> Seq Scan on wys (cost=0.00..13866.51 rows=633451 width=16)
-> Hash (cost=2069.91..2069.91 rows=96891 width=16)
-> Seq Scan on locks (cost=0.00..2069.91 rows=96891 width=16)
(9 rows)

Time: 231,634 ms
miernik=> EXPLAIN SELECT uid FROM locks EXCEPT (SELECT uid FROM locks INNER JOIN wys USING (uid, login));
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
SetOp Except (cost=59306.12..60188.11 rows=17640 width=4)
-> Sort (cost=59306.12..59747.12 rows=176399 width=4)
Sort Key: "*SELECT* 1".uid
-> Append (cost=0.00..41823.79 rows=176399 width=4)
-> Subquery Scan "*SELECT* 1" (cost=0.00..3038.82 rows=96891 width=4)
-> Seq Scan on locks (cost=0.00..2069.91 rows=96891 width=4)
-> Subquery Scan "*SELECT* 2" (cost=3997.27..38784.97 rows=79508 width=4)
-> Hash Join (cost=3997.27..37989.89 rows=79508 width=4)
Hash Cond: (((wys.uid)::integer = (locks.uid)::integer) AND ((wys.login)::text = (locks.login)::text))
-> Seq Scan on wys (cost=0.00..13866.51 rows=633451 width=16)
-> Hash (cost=2069.91..2069.91 rows=96891 width=16)
-> Seq Scan on locks (cost=0.00..2069.91 rows=96891 width=16)
(12 rows)

Time: 1479,238 ms
miernik=>

--
Miernik
http://miernik.name/


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

No comments: