Sunday, August 3, 2008

Re: [HACKERS] Mini improvement: statement_cost_limit

On Sunday 03 August 2008 15:12:22 Simon Riggs wrote:
> On Sun, 2008-08-03 at 00:44 -0700, daveg wrote:
> > On Sat, Aug 02, 2008 at 09:30:08PM +0200, Hans-Jürgen Schönig wrote:
> > > On Aug 2, 2008, at 8:38 PM, Tom Lane wrote:
> > > >Andrew Dunstan <andrew@dunslane.net> writes:
> > > >>Hans-Jürgen Schönig wrote:
> > > >>>i introduced a GUC called statement_cost_limit which can be used to
> > > >>>error out if a statement is expected to be too expensive.
> > > >>
> > > >>You clearly have far more faith in the cost estimates than I do.
> > > >
> > > >Wasn't this exact proposal discussed and rejected awhile back?
> > >
> > > i don't remember precisely.
> > > i have seen it on simon's wiki page and it is something which would
> > > have been useful in some cases in the past.
>
> I still support it. Regrettably, many SQL developers introduce product
> joins and other unintentional errors. Why let problem queries through?
> Security-wise they're great Denial of Service attacks, bringing the
> server to its knees better than most ways I know, in conjunction with a
> nice hefty work_mem setting. 27 table product joins: memory, CPU, I/O
> and diskspace resources used all in a simple killer query.
>

ISTR that what ended up killing the enthusiasm for this was that most people
realized that this GUC was just a poor tool to take a stab at solving other
problems (ie. rate limiting cpu for queries).

> If anybody thinks costs are inaccurate, don't use it. Or better still
> improve the cost models. It isn't any harder or easier to find a useful
> value than it is to use statement_timeout. What's the difference between
> picking an arbitrary time and an arbitrary cost? You need to alter the
> value according to people's complaints in both cases.
>

I think the original argument for statement_timeout was that long running
queries were known to cause have wrt vacuum strategies (remember, that one
has been in the back end a long time). ISTR some recent threds on -hackers
questioning whether statement_timeout should be eliminated itself.

> > I think a variation on this could be very useful in development and test
> > environments. Suppose it raised a warning or notice if the cost was over
> > the limit. Then one could set a limit of a few million on the development
> > and test servers and developers would at least have a clue that they
> > needed to look at explain for that query. As it is now, one can exhort
> > them to run explain, but it has no effect. Instead we later see queries
> > killed by a 24 hour timeout with estimated costs ranging from "until they
> > unplug the machine and dump it" to "until the sun turns into a red
> > giant".
>
> Great argument. So that's 4 in favour at least.
>

Not such a great argument. Cost models on development servers can and often
are quite different from those on production, so you might be putting an
artifical limit on top of your developers.

> A compromise would be to have log_min_statement_cost (or
> warn_min_statement_cost) which will at least help find these problems in
> testing before we put things live, but that still won't help with
> production issues.
>
> Another alternative would be to have a plugin that can examine the plan
> immediately after planner executes, so you can implement this yourself,
> plus some other possibilities.
>

I still think it is worth revisiting what problems people are trying to solve,
and see if there are better tools they can be given to solve them. Barring
that, I suppose a crude solution is better than nothing, though I fear people
might point at the crude solution as a good enough solution to justify not
working on better solutions.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

No comments: