Thursday, September 25, 2008

Re: [PERFORM] Slow updates, poor IO

On Thu, Sep 25, 2008 at 1:24 PM, John Huttley <> wrote:
> I've just had an interesting encounter with the slow full table update
> problem that is inherent with MVCC
> The system is 64 bit linux with 2.6.25 kernel feeding scsi disks.
> the table is
> CREATE TABLE file (
> fileid integer NOT NULL,
> fileindex integer DEFAULT 0 NOT NULL,
> jobid integer NOT NULL,
> pathid integer NOT NULL,
> filenameid integer NOT NULL,
> markid integer DEFAULT 0 NOT NULL,
> lstat text NOT NULL,
> md5 text NOT NULL,
> perms text
> );
> ALTER TABLE file ADD CONSTRAINT file_pkey PRIMARY KEY (fileid);
> CREATE INDEX file_fp_idx ON file USING btree (filenameid, pathid);
> CREATE INDEX file_jobid_idx ON file USING btree (jobid);
> There are 2.7M rows.
> running update file set perms='0664' took about 10 mins

So, how many rows would already be set to 0664? Would adding a where
clause speed it up?

update file set perms='0664' where perms <> '0664';
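To see how much work that WHERE clause would actually save, a quick count first helps (a sketch only; the grouping query is my addition, reusing the table and column names quoted above):

```sql
-- How many rows are already at the target value? (hypothetical check)
SELECT perms = '0664' AS already_set, count(*)
FROM file
GROUP BY 1;

-- Rewrite only rows that actually change, avoiding dead tuples and
-- index maintenance for the no-op rows:
UPDATE file SET perms = '0664' WHERE perms IS DISTINCT FROM '0664';
```

Since perms is nullable, `IS DISTINCT FROM` also catches rows where perms is NULL, which the plain `<>` test would skip.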

> during this period, vmstat reported Blocks Out holding in the 4000 to 6000
> range.
> When I dropped the indexes this query ran in 48sec.
> Blocks out peaking at 55000.
> So there is a double whammy.
> MVCC requires more work to be done when indexes are defined and then this
> work
> results in much lower IO, compounding the problem.

That's because the access pattern becomes more random and less
sequential. If you had a large enough drive array you could get that
kind of throughput while updating indexes, since the accesses would
tend to hit different drives most of the time.
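The drop-indexes workaround the original poster stumbled on can be restated as a workflow (a sketch only, assuming exclusive access to the table during the update; index definitions are taken from the post):

```sql
BEGIN;
-- Drop the secondary indexes so the UPDATE only rewrites heap pages,
-- mostly sequentially, instead of doing random index maintenance
-- for every row version MVCC creates.
DROP INDEX file_fp_idx;
DROP INDEX file_jobid_idx;

UPDATE file SET perms = '0664';

-- Rebuilding afterwards is one bulk pass per index, which is far
-- cheaper than 2.7M incremental index insertions.
CREATE INDEX file_fp_idx ON file USING btree (filenameid, pathid);
CREATE INDEX file_jobid_idx ON file USING btree (jobid);
COMMIT;
```

The trade-off is that queries needing those indexes block (or plan badly) for the duration, so this fits maintenance windows rather than live traffic.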

Under heavy load on the production servers at work we see 30 to 60
megs a second of random access with 12 drives, meaning 2.5 to 5 megs
per second per drive. Sequential throughput is about 5 to 10 times
that.

What you're seeing is likely the effect of running a db on
insufficient drive hardware.

