Sunday, August 31, 2008

Re: [PERFORM] slow update of index during insert/copy

You may want to investigate pg_bulkload.

http://pgbulkload.projects.postgresql.org/

One major enhancement over COPY is that it does an index merge, rather than modify the index one row at a time. 
http://pgfoundry.org/docman/view.php/1000261/456/20060709_pg_bulkload.pdf



On Sun, Aug 31, 2008 at 6:32 AM, Thomas Finneid <tfinneid@student.matnat.uio.no> wrote:
Hi

I am working on a table which stores up to 125K rows per second and I find that the inserts are a little bit slow. The insert is in reality a COPY of a chunk of rows, up to 125K. A COPY og 25K rows, without an index, is fast enough, about 150ms. With the index, the insert takes about 500ms. The read though, is lightning fast, because of the index. It takes only 10ms to retrieve 1000 rows from a 15M row table. As the table grows to several billion rows, that might change though.

I would like the insert, with an index, to be a lot faster than 500ms, preferrably closer to 150ms. Any advice on what to do?
Additionally, I dont enough about pg configuring to be sure I have included all the important directives and given them proportional values, so any help on that as well would be appreciated.

Here are the details:

postgres 8.2.7 on latest kubuntu, running on dual Opteron quad cores, with 8GB memory and 8 sata disks on a raid controller (no raid config)

table:

create table v1
(
       id_s            integer,
       id_f            integer,
       id_st           integer,
       id_t            integer,
       value1          real,
       value2          real,
       value3          real,
       value4          real,
       value5          real,
       ...
       value20         real
);

create index idx_v1 on v1 (id_s, id_st, id_t);

- insert is a COPY into the 5-8 first columns. the rest are unused so
 far.

postgres config:

autovacuum = off
checkpoint_segments = 96
commit_delay = 5
effective_cache_size = 128000
fsync = on
max_fsm_pages = 208000
max_fsm_relations = 10000
max_connections = 20
shared_buffers = 128000
wal_sync_method = fdatasync
wal_buffers = 256
work_mem = 512000
maintenance_work_mem = 2000000

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

No comments: