Monday, July 7, 2008

Re: [SQL] how to control the execution plan ?

On Mon, Jul 7, 2008 at 3:14 AM, Sabin Coanda <sabin.coanda@deuromedia.ro> wrote:
> Hi there,
>
> I try to execute the following statement:
>
> SELECT *
> FROM (
> SELECT "MY_FUNCTION_A"(bp."COL_A", NULL::boolean) AS ALIAS_A
> FROM "TABLE_A" bp
> JOIN "TABLE_B" pn ON bp."COL_B" = pn."PK_ID"
> JOIN "TABLE_C" vbo ON bp."COL_C" = vbo."PK_ID"
> WHERE pn."Editor"::text ~~ 'Some%'::text AND bp."COL_A" IS NOT NULL AND
> bp."COL_A"::text <> ''::text
> ) x
> WHERE (x.ALIAS_A::text ) IS NULL;
>
> The problem is the excution plan first make Seq Scan on "TABLE_A", with
> Filter: (("COL_A" IS NOT NULL) AND (("COL_A")::text <> ''::text) AND
> (("MY_FUNCTION_A"("COL_A", NULL::boolean))::text IS NULL))". This way,
> MY_FUNCTION_A crashes for some unsupported data provided by "COL_A".
>
> I'd like to get an execution plan which is filtering first the desired rows,
> and just after that compute te column value "MY_FUNCTION_A"(bp."COL_A",
> NULL::boolean).

Just wondering what the query plans look like here, both regular
explain, and if you can wait for it to execute, explain analyze.

I'm guessing that the function is not indexed / indexable. Is it
marked immutable (and is it actually immutable) or stable (and is
stable)?

If it's immutable then you can create an index on it and that should
speed things up.

>
> I made different combinations, including a subquery like:
>
> SELECT *
> FROM (
> SELECT "MY_FUNCTION_A"(y."COL_A", NULL::boolean) AS ALIAS_A
> FROM (
> SELECT bp."COL_A"
> FROM "TABLE_A" bp
> JOIN "TABLE_B" pn ON bp."COL_B" = pn."PK_ID"
> JOIN "TABLE_C" vbo ON bp."COL_C" = vbo."PK_ID"
> WHERE pn."COL_E"::text ~~ 'Some%'::text AND bp."COL_A" IS NOT NULL
> AND bp."COL_A"::text <> ''::text
> ) y
> ) x
> WHERE (x.ALIAS_A::text ) IS NULL;
>
> but postgres analyze is too 'smart' and optimize it as in the previous case,
> with the same Seq Scan on "TABLE_A", and with the same filter.
>
> I thought to change the function MY_FUNCTION_A, to support any argument
> data, but the even that another performance problem will be rised when the
> function will be computed for any row in join, even those that can be
> removed by other filter.
>
> Do you have a solution please ?

If it's still to smart, you can run two queries, one to pull the set
you want to work with from the custom function into a temp table, then
analyze it, then run the query against that.
Not an optimal solution, but it might be the fastest if you can't
index your function.

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql

No comments: