Monday, September 8, 2008

Re: [HACKERS] reducing statistics write overhead

Martin Pihlak writes:
> Attached is a patch which adds a timestamp to pgstat.stat file header,
> backend_read_statsfile uses this to determine if the file is fresh.
> During the wait loop, the stats request message is retransmitted to
> compensate for possible loss of message(s).

> The collector only writes the file mostly at PGSTAT_STAT_INTERVAL frequency;
> currently no extra custom timeouts can be passed in the message. This can
> of course be added if the need arises.

Hmm. With the timestamp in the file, ISTM that we could put all the
intelligence on the reader side. Reader checks file, sends message if
it's too stale. The collector just writes when it's told to, no
filtering. In this design, rate limiting relies on the readers to not
be unreasonable in how old a file they'll accept; and there's no problem
with different readers having different requirements.
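
As a rough illustration of the reader-side check, here is a minimal sketch
in C. It is not code from the patch: the ts_usec type and both function
names are hypothetical, and timestamps are assumed to be plain microsecond
counts. Each reader passes its own max_age, which is how different readers
can have different requirements.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical timestamp type: microseconds since some epoch. */
typedef int64_t ts_usec;

/* True if a file stamped file_ts is no older than max_age as of now. */
bool stats_file_is_fresh(ts_usec file_ts, ts_usec now, ts_usec max_age)
{
    return file_ts >= now - max_age;
}

/* True if the reader should ask the collector to write a new file. */
bool reader_should_request(ts_usec file_ts, ts_usec now, ts_usec max_age)
{
    return !stats_file_is_fresh(file_ts, now, max_age);
}
```

With a file stamped 700 and a clock reading 1000, a reader that tolerates
500 units of staleness accepts the file, while one that tolerates only 200
sends a request.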

A possible problem with this idea is that two readers might send request
messages at about the same time, resulting in two writes where there
need have been only one. However I think that could be fixed if we add
a little more information to the request messages and have the collector
remember the file timestamp it last wrote out. There are various ways
you could design it but what comes to mind here is for readers to send
a timestamp defined as minimum acceptable file timestamp (ie, their
current clock time less whatever slop they want to allow). Then,
when the collector gets a request with that timestamp <= its last
write timestamp, it knows it need not do anything.
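
One way this could look, as a sketch only: the Collector struct, the
ts_usec type, and both function names below are hypothetical, and the
actual file write is reduced to bumping a counter. The point is just the
comparison of the requested cutoff against the last write timestamp.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical timestamp type: microseconds since some epoch. */
typedef int64_t ts_usec;

/* Hypothetical collector state for this sketch. */
typedef struct {
    ts_usec last_write_ts;  /* timestamp stamped into the last file written */
    int     writes;         /* number of file writes performed */
} Collector;

/* Reader side: minimum acceptable file timestamp = clock time less slop. */
ts_usec reader_min_acceptable(ts_usec now, ts_usec slop)
{
    return now - slop;
}

/*
 * Collector side: handle one request. Returns true if a write was done.
 * A request whose cutoff is <= the last write timestamp is already
 * satisfied by the file on disk, so nothing needs to happen.
 */
bool collector_handle_request(Collector *c, ts_usec min_acceptable, ts_usec now)
{
    if (min_acceptable <= c->last_write_ts)
        return false;

    /* A real collector would write the stats file here, stamped with now. */
    c->last_write_ts = now;
    c->writes++;
    return true;
}
```

If two readers send cutoffs of 500 and 501 at nearly the same moment, the
first request triggers a write stamped around 1002 and the second is then
a no-op, since 501 <= 1002.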

With signaling like that, there are no excess writes *and* no delay in
responding to a new live write request. It's sort of annoying that
the readers have to sleep for an arbitrary interval though. If we could
get rid of that part...

regards, tom lane
