Quite a lot of answers!
> Are all INSERTs made by one application? Maybe PREPARED STATEMENTS will help you? What about optimization at the application level?
Yes, it's all one application (through JDBC); optimizations (grouped transactions, prepared statements) will be done, but this was our very first test under hard conditions, and it scared us all :p.
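To make this concrete, here is roughly what we plan on the JDBC side: one prepared statement per table, rows queued with addBatch() and everything committed in a single transaction instead of one commit per row. The connection URL and the table/column names (sensor_0001 with ts, temperature, humidity, light) are invented for the example:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Timestamp;

public class BatchInsertSketch {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/sensors", "user", "password");
        conn.setAutoCommit(false); // group the INSERTs into one transaction

        // One prepared statement per sensor table; the statement is parsed
        // once and reused for every row.
        PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO sensor_0001 (ts, temperature, humidity, light)"
                + " VALUES (?, ?, ?, ?)");

        for (int i = 0; i < 10; i++) {
            ps.setTimestamp(1, new Timestamp(System.currentTimeMillis()));
            ps.setFloat(2, 21.5f);  // temperature reading
            ps.setFloat(3, 40.2f);  // humidity reading
            ps.setFloat(4, 310f);   // light reading
            ps.addBatch();          // queue the row instead of executing it now
        }
        ps.executeBatch(); // send the whole batch in one round trip
        conn.commit();     // one commit for all 10 rows
        ps.close();
        conn.close();
    }
}

The single commit is the important part: with default settings every commit forces a WAL flush to disk, so committing once per batch instead of once per row removes most of that cost.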
> Can you tell us what kind of application this is? It sounds like a control systems application where you will write the current values of the sensors with each scan of a PLC. If so, is that a good idea? Also is 3,000 sensors realistic? That would be a lot of sensors for one control system.
Our research project is trying to manage large-scale sensor network deployments. 3,000 is quite a huge deployment, but it can be realistic for a large agricultural deployment, for example.
> That would be an insert plus updates to each of your 6 indexes every 0.33 ms. Is that a good idea? Is there a better strategy? What are you measuring with the instruments e.g. is this a process plant or manufacturing facility? What will people do with this data?
I try to suppress as many of the indexes as I can. Actually, I only really need the index on the timestamp, to see the last readings for example and to search historical data by period; the others (on the values) are more for queries like "when was this sensor over 45 °C", but those can work without indexes (slow to query, but less heavy at insert time). I get the data from different TelosB motes that gather temperature, humidity and light.
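For illustration, with the same invented table layout as above, this is the only index I would keep per table, together with the queries it has to serve; the value search at the end deliberately has no index to use:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class TimestampIndexSketch {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/sensors", "user", "password");
        Statement st = conn.createStatement();

        // The single index kept per sensor table: on the timestamp.
        st.execute("CREATE INDEX sensor_0001_ts_idx ON sensor_0001 (ts)");

        // Served by the index: the last readings...
        st.executeQuery("SELECT * FROM sensor_0001 ORDER BY ts DESC LIMIT 10");
        // ...and historical data by period (dates are made up).
        st.executeQuery("SELECT * FROM sensor_0001"
                + " WHERE ts BETWEEN '2008-05-01' AND '2008-05-02'");

        // Not indexed: "when was this sensor over 45 °C" becomes a
        // sequential scan - slow to read, but free at insert time.
        st.executeQuery("SELECT ts FROM sensor_0001 WHERE temperature > 45");

        st.close();
        conn.close();
    }
}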
> Have you checked what you are bottlenecking on - CPU or disk? Try iostat/top/etc during the inserts. Also check actual disk utilization (iostat -x on linux/freebsd; varies on others) to see what percentage of time the disk/storage device is busy.
I looked at the output of iostat and top; the postgres process was at 70% CPU. Yes, I know my test machine is not brand new, but I have to find a good solution with it.
OK, I just ran some tests. It seems that I hammered the tables too hard right after creating them, so the vacuum analyze had not been run yet. I have better results now:
Average time to write 10 rows into each table:

On 1,000 tables:
  without any indexes:            ~1.5 s
  with only the timestamp index:  ~2.5 s
  with all indexes:               ~30 s

On 3,000 tables:
  without any indexes:            ~8 s
  with only the timestamp index:  ~45 s
  with all indexes:               ~3 min
I don't know why, but the difference with indexes is quite huge! When I ran my vacuum, the system warned me about "max_fsm_relations" (1000). Do you think raising it will change something (as Scott said)? I didn't have time to run tests with vacuum analyze on the system tables; see you tomorrow for more news...
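For the record, max_fsm_relations lives in postgresql.conf and needs a server restart to change; with 3,000 tables plus their indexes the default of 1000 is clearly too low, so VACUUM cannot track free space in most of the relations. The vacuum itself can also be driven from JDBC, but VACUUM refuses to run inside a transaction block, so autocommit has to stay on. A rough sketch, with the same invented connection details as above:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class VacuumSketch {
    public static void main(String[] args) throws Exception {
        // Assumed change in postgresql.conf (server restart required):
        //   max_fsm_relations = 10000   # more than tables + indexes combined
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/sensors", "user", "password");
        conn.setAutoCommit(true); // VACUUM cannot run inside a transaction block
        Statement st = conn.createStatement();
        st.execute("VACUUM ANALYZE"); // whole database, system tables included
        st.close();
        conn.close();
    }
}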