Saturday, July 19, 2008

Re: [PERFORM] Performance on Sun Fire X4150 x64 (dd, bonnie++, pgbench)

pgbench is unrelated to the workload you are concerned with if ETL/ELT and decision support / data warehousing queries are your target: it exercises many small OLTP transactions, not the long sequential scans those queries depend on.

Also - placing the xlog on dedicated disks is mostly irrelevant to data warehouse / decision support work or ELT.  If you need to maximize load speed while concurrent queries are running it may be necessary, but I suspect load speed will be limited by the CPU cost of parsing and formatting the data anyway.

The primary performance driver for ELT / DW is sequential transfer rate, thus the dd test at 2X memory.  With six data disks of this type, you should expect a maximum of around 6 x 80 = 480 MB/s.  With RAID10, depending on the raid adapter, you may need to have two or more IO streams to use all platters, otherwise your max speed for one query would be 1/2 that, or 240 MB/s.
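To see whether a second stream actually helps on the RAID10 volume, you can time two concurrent dd reads against one. A rough sketch (file names and the tiny sizes here are placeholders; for a real measurement each file should be about 2X RAM, as with the dd test above, so the page cache can't satisfy the reads):

```shell
# Illustrative sketch: read two files concurrently and compare the aggregate
# rate with a single-stream read. Sizes are deliberately tiny here (1 MB);
# scale count up to ~2x RAM for a meaningful result.
dd if=/dev/zero of=stream1.dat bs=8k count=128 2>/dev/null
dd if=/dev/zero of=stream2.dat bs=8k count=128 2>/dev/null
sync

# time both reads running in parallel
time sh -c '
  dd if=stream1.dat of=/dev/null bs=8k 2>/dev/null &
  dd if=stream2.dat of=/dev/null bs=8k 2>/dev/null &
  wait
'
```

If the two-stream total is close to twice the single-stream rate, the adapter is only engaging half the platters per stream.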

I'd suggest RAID5, or even better, configure all eight disks as a JBOD in the RAID adapter and run ZFS RAIDZ.  You would then expect to get about 7 x 80 = 560 MB/s on your single query.
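A RAIDZ pool over the eight JBOD disks might be created along these lines - a config sketch only, with Solaris-style device names that are entirely hypothetical; substitute your actual disks:

```shell
# Hypothetical device names -- replace with your own eight JBOD disks.
# A single raidz vdev over 8 disks gives ~7 disks of capacity/bandwidth.
zpool create pgdata raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 \
                          c1t4d0 c1t5d0 c1t6d0 c1t7d0
zfs set atime=off pgdata    # same idea as the noatime mounts below
```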

That said, a single CPU core running one query will only be able to scan that data at about 300 MB/s (try running a SELECT COUNT(*) against a table that is 2X memory size).

- Luke

----- Original Message -----
From: pgsql-performance-owner@postgresql.org <pgsql-performance-owner@postgresql.org>
To: pgsql-performance@postgresql.org <pgsql-performance@postgresql.org>
Sent: Sat Jul 19 09:19:43 2008
Subject: [PERFORM] Performance on Sun Fire X4150 x64 (dd, bonnie++, pgbench)


I'm trying to run a few basic tests to see what a current machine can
deliver (typical workload ETL like, long running aggregate queries,
medium size db ~100 to 200GB).

I'm currently checking the system (dd, bonnie++) to see if performance
is within the normal range, but I'm having trouble relating the results
to anything known. Scouting the archives, more than a few people seem
familiar with this hardware, so if someone could look at these numbers
and raise a flag where something looks far out of range for such a
system, that would be appreciated. I also added some raw pgbench numbers
at the end.

(Many thanks to Greg Smith; his pages were extremely helpful for getting
started. Any mistakes are mine.)

Hardware:

Sun Fire X4150 x64

2x Quad-Core Intel(R) Xeon(R) X5460 processors (2x6MB L2, 3.16 GHz, 1333 MHz FSB)
16GB of memory (4x2GB PC2-5300 667 MHz ECC fully buffered DDR2 DIMMs)

6x 146GB 10K RPM SAS in RAID10 - for OS + data
2x 146GB 10K RPM SAS in RAID1 - for xlog
Sun StorageTek SAS HBA Internal (Adaptec AAC-RAID)


OS is Ubuntu 7.10 x86_64 running kernel 2.6.22-14
OS is on ext3
data is on xfs, mounted noatime
xlog is on ext2, mounted noatime


data
$ time sh -c "dd if=/dev/zero of=bigfile bs=8k count=4000000 && sync"
4000000+0 records in
4000000+0 records out
32768000000 bytes (33 GB) copied, 152.359 seconds, 215 MB/s

real    2m36.895s
user    0m0.570s
sys     0m36.520s

$ time dd if=bigfile of=/dev/null bs=8k
4000000+0 records in
4000000+0 records out
32768000000 bytes (33 GB) copied, 114.723 seconds, 286 MB/s

real    1m54.725s
user    0m0.450s
sys     0m22.060s


xlog
$ time sh -c "dd if=/dev/zero of=bigfile bs=8k count=4000000 && sync"
4000000+0 records in
4000000+0 records out
32768000000 bytes (33 GB) copied, 389.216 seconds, 84.2 MB/s

real    6m50.155s
user    0m0.420s
sys     0m26.490s

$ time dd if=bigfile of=/dev/null bs=8k
4000000+0 records in
4000000+0 records out
32768000000 bytes (33 GB) copied, 294.556 seconds, 111 MB/s

real    4m54.558s
user    0m0.430s
sys     0m23.480s



bonnie++ -s 32g -n 256

data:
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
lid-statsdb-1   32G 101188  98 202523  20 107642  13 88931  88 271576  19 980.7   2
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                256 11429  93 +++++ +++ 17492  71 11097  91 +++++ +++  2473  11



xlog
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
lid-statsdb-1   32G  62973  59  69981   5  35433   4 87977  85 119749   9 496.2   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                256    551  99 +++++ +++ 300935  99   573  99 +++++ +++  1384  99

pgbench

postgresql 8.2.9 with data and xlog as mentioned above

postgresql.conf:
shared_buffers = 4GB
checkpoint_segments = 8
effective_cache_size = 8GB

The script runs over scaling factors 1 to 1000, running pgbench 3 times
at each factor with "pgbench -t 2000 -c 8 -S pgbench".

It's a bit limited; I'll try a much longer run, increase the number of
tests, and compute mean and stddev, since the 3 runs sometimes vary a
lot (at scaling factor 1000, for example, the runs are respectively
1952, 940, and 3162), so the graph is pretty ugly.
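For reference, the loop described above can be sketched as a small driver script. The scale list and pgbench flags are taken from the post; the echoes make this a dry run that only prints the commands - drop them to actually execute the benchmark:

```shell
#!/bin/sh
# Dry-run sketch of the benchmark driver described above.
# The echoes just print each command; remove them to really run pgbench.
cmds=$(
  for s in 1 5 10 20 30 40 50 100 200 300 400 500 600 700 800 900 1000; do
    echo "pgbench -i -s $s pgbench"            # rebuild test db at this scale
    for run in 1 2 3; do
      echo "pgbench -t 2000 -c 8 -S pgbench"   # select-only, 8 clients
    done
  done
)
printf '%s\n' "$cmds"
```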

I get (scaling factor, database size in MB, median tps of the 3 runs):

scale  size(MB)  median tps
    1        20       22150
    5        82       22998
   10       160       22301
   20       316       22857
   30       472       23012
   40       629       17434
   50       785       22179
  100      1565       20193
  200      3127       23788
  300      4688       15494
  400      6249       23513
  500      7810       18868
  600      9372       22146
  700     11000       14555
  800     12000       10742
  900     14000       13696
 1000     15000         940

cheers,

-- stephane

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
