> I'm trying to update a field in a table that has about 5 millin records with
> a table that has about 3.5 million records. I've created indexes for th a
> joined columns, but PostgreSQL 8.1 doesn't seem to want to use them. This
> makes for a very slow update.
8.3 is a bit smarter on some of these types of queries, updating to
that may (or may not) help in this situation.
> 1.) Why won't it use the indexes?
The pgsql planner chooses whether to use indexes versus sequential
scans based on whether one or the other is "cheaper" to use.
> 2.) How can I make this update faster?
Get a faster server?
> fsa=# EXPLAIN UPDATE growers
> SET grower_id = id2.grower_id
> FROM id2 WHERE growers.fsa_id = id2.fsa_id;
> QUERY PLAN
> --------------------------------------------------------------------------
> Hash Join (cost=70375.50..1603795.30 rows=6802720 width=355)
> Hash Cond: ("outer".fsa_id = "inner".fsa_id)
> -> Seq Scan on growers (cost=0.00..672373.20 rows=6802720 width=351)
> -> Hash (cost=46249.20..46249.20 rows=1966920 width=44)
> -> Seq Scan on id2 (cost=0.00..46249.20 rows=1966920 width=44)
1: Post explain analyze (if you can wait long enough for the update to finish.
2: Is the table bloated from multiple updates? Have you been
vacuuming between each update you've tested? Do you have autovacuum
enabled and is it aggresive enough to keep up?
3: If you want to force pgsql to use an index for testing purposes,
try running this before your update query:
set enable_indexscan=off;
and see if it's faster or slower.
Pgsql indexes don't have visibility, and this contributes to the
higher cost of index scans on pgsql than some other dbs. However,
sequential scans make sense if you're updating the whole table in
almost any database.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
No comments:
Post a Comment