Monday, June 23, 2008

Re: [HACKERS] pg_stat_statements

On Mon, 2008-06-23 at 15:22 +0900, ITAGAKI Takahiro wrote:
> I wrote:
> > I will try to measure the overhead of logging in several implementations:
> > 1. Log statements and dump them into server logs.
> > 2. Log statements and filter them before they are written.
> > 3. Store statements in shared memory.
> > I know 1 is slow, but I don't know what part of it is really slow;
>
> I tested the overhead of SQL logging with pgbench.
> $ pgbench -s10 -c10 -t10000 -n -S -M prepared
>
>  logging type          |  tps  |    %
> -----------------------+-------+--------
>  0. no logging         | 10651 | 100.0%
>  1. log to pg_log/*    |  6535 |  61.4%
>  2. log to /dev/null   |  8347 |  78.4%
>  3. store in memory    | 10280 |  96.5%
>
> As expected, 1 is about 40% slower than running with no logging at all.
> Also, filtering logs before they are written to files does not seem to
> be a perfect solution: redirecting logs to /dev/null is the *fastest*
> possible filter, yet it still costs over 20% in throughput. On the
> other hand, 3 has only 3.5% overhead.
>
> I think storing SQLs in server memory is worth trying even if it brings
> some troubles, for example memory management. We could use either hooks
> or DTrace for the purpose, but I'm working on the hook method because
> of portability.
>
> I'll send a core patch and an extension module to -patches. I hope only
> the core patch will be applied; the extension module is better developed
> separately from the core.

I think I now understand what you are asking for and why: you want to
see the distribution of execution times for particular queries without
needing to log *every* execution, including parameters.

How about we have another log mode, call it log_statement_summary, in
which we keep track of the running average execution time of each
protocol 3 prepared plan. Once we have 40 executions of a plan, we log
any statement, with parameters, whose execution time is more than twice
the running average. That way we have an automatic filter that reduces
the amount of logging, without changing any current tuning methodologies
or tools.
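
As a rough illustration of that filter, the per-plan bookkeeping could
look something like the following. PlanStatsEntry, summary_record and
the two constants are invented names for the example; how the entry is
found and how the parameters are attached to the log line are left out.

#include "postgres.h"

#define SUMMARY_MIN_CALLS  40    /* executions before the average is trusted */
#define SUMMARY_THRESHOLD  2.0   /* log anything slower than 2x the average */

typedef struct PlanStatsEntry
{
    uint32      planid;         /* key: identifier of the prepared plan */
    int64       calls;          /* executions seen so far */
    double      mean_ms;        /* running average execution time, in ms */
} PlanStatsEntry;

/* Called once per execution with the elapsed time for this run. */
static void
summary_record(PlanStatsEntry *entry, double elapsed_ms, const char *query)
{
    bool        slow;

    /* test against the average *before* folding this run in, so a
     * single outlier cannot hide itself */
    slow = (entry->calls >= SUMMARY_MIN_CALLS &&
            elapsed_ms > SUMMARY_THRESHOLD * entry->mean_ms);

    /* incremental running average: mean += (x - mean) / n */
    entry->calls++;
    entry->mean_ms += (elapsed_ms - entry->mean_ms) / entry->calls;

    if (slow)
        ereport(LOG,
                (errmsg("plan %u: %.3f ms against a %.3f ms average: %s",
                        entry->planid, elapsed_ms, entry->mean_ms, query)));
}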

We could also have a function that causes each backend to dump the
current averages of all plans through to the stats collector, so you can
assemble a global view. But that should be on-demand, not a continuous
flow of stats, IMHO.
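
The on-demand part could then be a SQL-callable function that walks the
same per-backend table. In this sketch it reuses the hypothetical
plan_stats hash and PlanStatsEntry from above, and the hand-off to the
stats collector (or the signalling of other backends) is reduced to a
log line.

/* Illustrative only: dump this backend's current averages on demand. */
#include "postgres.h"
#include "fmgr.h"
#include "utils/hsearch.h"

extern HTAB *plan_stats;        /* assumed backend-local hash of averages */

PG_FUNCTION_INFO_V1(dump_plan_averages);

Datum
dump_plan_averages(PG_FUNCTION_ARGS)
{
    HASH_SEQ_STATUS status;
    PlanStatsEntry *entry;

    hash_seq_init(&status, plan_stats);
    while ((entry = (PlanStatsEntry *) hash_seq_search(&status)) != NULL)
        elog(LOG, "plan %u: " INT64_FORMAT " calls, avg %.3f ms",
             entry->planid, entry->calls, entry->mean_ms);

    PG_RETURN_VOID();
}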

I'd been thinking about ways to specify "desired execution time" for any
query, so we could log only those queries that aren't performing as
expected.

Yes, doing this as a plugin makes a lot of sense to me.

--
Simon Riggs

www.2ndQuadrant.com

PostgreSQL Training, Services and Support


