Monday, July 21, 2008

Re: [PERFORM] Performance on Sun Fire X4150 x64 (dd, bonnie++, pgbench)

Luke Lonergan wrote:
>
> pgbench is unrelated to the workload you are concerned with if ETL/ELT
> and decision support / data warehousing queries are your target.
>
> Also - placing the xlog on dedicated disks is mostly irrelevant to
> data warehouse / decision support work or ELT. If you need to
> maximize loading speed while concurrent queries are running, it may be
> necessary, but I think you'll be limited in load speed by CPU related
> to data formatting anyway.
>
Indeed. pgbench was run mostly for information and isn't really relevant
to the future workload of this db. (Given the queries it runs, I'm not
sure it's relevant for anything but connection speed; still, it's
interesting for me to have a reference for transaction-like loads.) I
was more interested in the raw disk performance.

>
> The primary performance driver for ELT / DW is sequential transfer
> rate, thus the dd test at 2X memory. With six data disks of this
> type, you should expect a maximum of around 6 x 80 = 480 MB/s. With
> RAID10, depending on the raid adapter, you may need to have two or
> more IO streams to use all platters, otherwise your max speed for one
> query would be 1/2 that, or 240 MB/s.
>
OK, which seems to be on par with what I'm getting (the 240, that is).
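
For reference, the arithmetic behind the quoted estimate can be sketched as below. The 80 MB/s per disk and the halving for a single stream on RAID10 are the figures quoted above; the function name and its parameters are made up for illustration, and the result is only a rough upper bound, not a measurement.

```python
def expected_seq_rate(data_disks, per_disk_mb_s=80, single_stream_raid10=False):
    """Rough upper bound on sequential transfer rate, in MB/s."""
    total = data_disks * per_disk_mb_s
    # Per the quoted estimate, a single IO stream on RAID10 may only
    # reach half of the aggregate rate, depending on the RAID adapter.
    return total // 2 if single_stream_raid10 else total


print(expected_seq_rate(6))                             # 480
print(expected_seq_rate(6, single_stream_raid10=True))  # 240
```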

>
> I'd suggest RAID5, or even better, configure all eight disks as a JBOD
> in the RAID adapter and run ZFS RAIDZ. You would then expect to get
> about 7 x 80 = 560 MB/s on your single query.
>
Do you have a particular controller and disk hardware configuration in
mind when you suggest RAID5?
My understanding was that it is more difficult to find the right
hardware to get good performance on RAID5 than on RAID10.

>
> That said, your single cpu on one query will only be able to scan that
> data at about 300 MB/s (try running a SELECT COUNT(*) against a table
> that is 2X memory size).
>
Not quite 2x memory size, but ~26GB (accounts with scaling factor 2000):

$ time psql -c "select count(*) from accounts" pgbench
   count
-----------
 200000000
(1 row)

real 1m52.050s
user 0m0.020s
sys 0m0.020s
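
To turn that timing into a scan rate, here is a back-of-the-envelope sketch. It assumes the ~26GB figure means 26 x 1024 MB, which is only approximate, and the function name is hypothetical.

```python
def scan_rate_mb_s(table_size_gb, minutes, seconds):
    """Average scan rate in MB/s for a table of the given size."""
    elapsed = minutes * 60 + seconds
    # Assumption: 1 GB = 1024 MB; the table size itself is approximate.
    return table_size_gb * 1024 / elapsed


# "real 1m52.050s" from the run above, against the ~26GB accounts table.
rate = scan_rate_mb_s(26, 1, 52.050)
print(f"{rate:.0f} MB/s")
```

Under those assumptions this comes out a bit under 240 MB/s: below the 300 MB/s single-CPU estimate, and close to the 240 MB/s single-stream figure for RAID10.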


NB: For the sake of completeness, I reran pgbench, taking the average of
10 runs for each scaling factor (same configuration as in the initial
mail; columns are scaling factor, db size, average tps):

scale   db size (MB)   avg tps
    1             20     23451
  100           1565     21898
  200           3127     20474
  300           4688     20003
  400           6249     20637
  500           7810     16434
  600           9372     15114
  700          11000     14595
  800          12000     16090
  900          14000     14894
 1000          15000      3071
 1200          18000      3382
 1400          21000      1888
 1600          24000      1515
 1800          27000      1435
 2000          30000      1354
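
To locate where throughput falls off, here is a small sketch that scans the figures above for the steepest drop between consecutive scaling factors. The data is copied from the table; the helper name is made up for illustration.

```python
# (scaling factor, db size, average tps), copied from the table above.
RESULTS = [
    (1, 20, 23451), (100, 1565, 21898), (200, 3127, 20474),
    (300, 4688, 20003), (400, 6249, 20637), (500, 7810, 16434),
    (600, 9372, 15114), (700, 11000, 14595), (800, 12000, 16090),
    (900, 14000, 14894), (1000, 15000, 3071), (1200, 18000, 3382),
    (1400, 21000, 1888), (1600, 24000, 1515), (1800, 27000, 1435),
    (2000, 30000, 1354),
]


def largest_drop(rows):
    """Return (previous row, next row, tps ratio) for the steepest fall."""
    best = None
    for prev, cur in zip(rows, rows[1:]):
        ratio = cur[2] / prev[2]
        if best is None or ratio < best[2]:
            best = (prev, cur, ratio)
    return best


prev, cur, ratio = largest_drop(RESULTS)
print(f"scale {prev[0]} -> {cur[0]}: tps {prev[2]} -> {cur[2]} "
      f"({ratio:.0%} of previous)")
```

The steepest fall is between scaling factors 900 and 1000, where tps drops to about a fifth of its previous value. That would be consistent with the data set no longer fitting in cache around that size, though that reading is an assumption and not something measured here.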

-- stephane

