Tuesday, August 12, 2008

[GENERAL] Alias for function return buffer in pl/pgsql?

Hello all, long time no chit-chat on the PG mailing list.  We’re upgrading from 8.0.3 to 8.3 and found that some stored procedures utilizing int_agg that we had left over from 7.3 had terrible performance.  No problem, using ANY() we’re able to regain that performance, more or less, and at the same time greatly simplify our stored procedures.  But things can never be fast enough, can they?  So I have a question or two.  Here’s my function for reference:

 

CREATE OR REPLACE FUNCTION bg_nodes2descendants(INT[], INT) RETURNS SETOF connection_generation AS

'

 

DECLARE

_row connection_generation%ROWTYPE;

_children INT[];

 

BEGIN

 

-- this is faster than constructing in the loop below

--_children = array(SELECT connectee_node_id FROM connections WHERE connection_type_id = 1 AND connector_node_id = ANY($1));

 

FOR _row IN

    SELECT connection_id, connection_type_id, connector_node_id, connector_node_type_id, connectee_node_id,

            connectee_node_type_id, current, timestamp, $2 + 1

        FROM connections WHERE connection_type_id = 1 AND connector_node_id = ANY($1)

    LOOP

        _children := _children || _row.connectee_node_id;

        RETURN NEXT _row;

    END LOOP;

 

IF FOUND THEN

    RETURN QUERY SELECT * FROM bg_nodes2descendants(_children, $2+1);

END IF;

 

RETURN;

END

 

' LANGUAGE 'plpgsql';

 

     So, my concern is alluded to in the comment above.  When I use this function in places where it returns large results, building the _children array directly (in the commented out line) is about 25% faster.  But I’d like to avoid building the children array altogether and would instead like to generate that array from the already collected output rows.  For example, right before the recursive call, I’d like to select a column of the buffered output rows, cast it to an integer[], and pass it into the recursive call.  Is there an internal value I can access for this such as:

 

_children := array(SELECT connectee_node_id FROM $output);

 

Bonus question -  if I rewrite the first FOR loop as:

 

RETURN QUERY SELECT connection_id, connection_type_id, connector_node_id, connector_node_type_id, connectee_node_id,

                             connectee_node_type_id, current, timestamp, $2 + 1 FROM connections

                 WHERE connection_type_id = 1 AND connector_node_id = ANY($1);

 

I get “ERROR:  structure of query does not match function result type”, even though the type signatures of the returned columns match the “connection_generation” rowtype.  I am pretty sure this could be resolved by casting the resulting columns to that row type, but I am lost as to how the syntax to do such a thing would look.

 

Thanks in advance for the help, and keep up the great work.  PG8.3 is an amazing piece of software and it blows me away how much more advanced it gets with every release.

 

Bart Grantham

VP of R&D

Logicworks Inc. – Complex and Managed Hosting

No comments: