Tuesday, June 24, 2008

Re: [PERFORM] Postgresql is very slow

>> Not necessarily - the 'cost' depends on seq_page_cost, which might be set
>> to a value other than 1 (the default). A better approach is
>>
>> SELECT relpages, reltuples FROM pg_class WHERE relname = 'users';
>>
>> which reads the values from the system catalogue.
>>
> The output of the query on the slow server:
>
> SELECT relpages, reltuples FROM pg_class WHERE relname ='users';
> relpages | reltuples
> ----------+-----------
> 54063 | 2307
> (1 row)
>
> The output of the same query on the old server, which is fast:
>
> relpages | reltuples
> ----------+-----------
> 42 | 1637
>
>

This definitely confirms the suspicion about dead tuples etc. On the old
server the table has 1637 tuples and occupies just 42 pages (i.e. about
330kB with 8kB pages), which gives roughly 0.025 pages (about 0.2kB) per
row.

Let's suppose the characteristics of the data (row sizes, etc.) are the
same on both servers - in that case the 2307 rows should occupy about 58
pages, but as you can see from the first output the table occupies 54063
pages, i.e. roughly 420MB instead of about 450kB.
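
If you want to double-check those numbers yourself, a couple of simple
queries should do (just a sketch - it assumes the default 8kB block size):

   -- rough size estimated from the catalogue (8kB pages assumed)
   SELECT relpages, reltuples, relpages * 8 / 1024 AS approx_size_mb
     FROM pg_class WHERE relname = 'users';

   -- actual on-disk size of the table (available since 8.1)
   SELECT pg_size_pretty(pg_relation_size('users'));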

>> > Definitely need a vacuum full on this table, likely
>> > followed by a reindex.
>>
>
> The slow server's load increases whenever I run a simple query. Is it a
> good idea to run VACUUM FULL on the live server's database now, or should
> it be run when the traffic is very low, maybe on the weekend?

The load increases because, with the queries you've sent, the database has
to read the whole table (a sequential scan), and the table may be spread
across the disk (so the disk has to seek a lot).
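
You can see that in the plan - something like this (just substitute one of
the queries that actually cause the load; the 'username' condition below is
only an example):

   EXPLAIN ANALYZE SELECT * FROM users WHERE username = 'someuser';

A 'Seq Scan on users' node reading all those ~54063 pages is what keeps the
disks busy.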

I'd recommend running CLUSTER instead of VACUUM FULL - that should be much
faster in this case. It will lock the table, but the performance already
sucks, so I'd probably prefer a short downtime followed by much faster
processing after that.
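
The whole thing might look like this - note that 'users_pkey' is just my
guess at the primary key index name (the usual default naming), so use
whatever the index is really called on your table:

   -- 8.3 syntax (see the CLUSTER docs mentioned below)
   CLUSTER users USING users_pkey;
   -- on older releases the syntax is reversed:
   -- CLUSTER users_pkey ON users;

   -- refresh the planner statistics afterwards
   ANALYZE users;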

>
>> Yes, that's true. I guess the table holds a lot of dead tuples. I'm not
>> sure why this happens on one server (the new one) and not on the other
>> one. I guess the original one uses some automatic vacuuming (autovacuum,
>> a cron job, or something like that).
>
> There was nothing related to VACUUM of the database in the crontab.

In that case something else is running vacuum on the old server - most
likely autovacuum (see postgresql.conf) or something similar.
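
You can check that easily from psql - for example:

   SHOW autovacuum;

   -- or list all the autovacuum-related settings
   SELECT name, setting FROM pg_settings WHERE name LIKE 'autovacuum%';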

>> As someone already posted, clustering the table (by primary key, for
>> example) should be much faster than vacuuming and give better performance
>> in the end. See
>>
>> http://www.postgresql.org/docs/8.3/interactive/sql-cluster.html
>>
>> The plain reindex won't help here - it won't remove dead tuples.
>>
> I am new to the Postgres database and I didn't understand the "indexing"
> part. Is it related to the primary key column of the table?

Not sure what you mean by the 'indexing' part, so here is the idea.

The principle of clustering is quite simple - by sorting the table according
to an index (by the columns in that index) you may get better performance
when using the index. Another 'bonus' is that it compacts the table on the
disk, so disk seeking is less frequent. These two effects can mean a serious
increase in performance. You may cluster according to any index on the
table, not just the primary key - just choose the most frequently used
index.

Sure, there are some drawbacks - it locks the table, so you can't use the
table while the command is running. It's not an incremental operation and
the order is not enforced when modifying the table - when you modify a row,
the new version won't respect the order, so you have to run the CLUSTER
command again from time to time. And you can cluster by only one index per
table.
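
For the periodic re-clustering, CLUSTER without any arguments re-clusters
all the tables you have clustered before (in the current database), so it's
easy to put into a maintenance job during low traffic - for example:

   -- re-cluster everything previously clustered, then refresh statistics
   CLUSTER;
   ANALYZE;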

>
> Should I run: CLUSTER USERS using 'username';

I guess 'username' is a column, so that won't work - CLUSTER takes an index,
not a column. You have to choose an index (I'd recommend the primary key
index, i.e. the one ending in _pk or _pkey).
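
To see which indexes are actually defined on the table (and pick one for
CLUSTER), something like this should work:

   SELECT indexname, indexdef
     FROM pg_indexes
    WHERE tablename = 'users';

Then use the name of the primary key index from that list in the CLUSTER
command shown earlier.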

Tomas


