Wednesday, May 7, 2008

[HACKERS] Behaviour of MERGE with complex Rules

It seems there is strange behaviour coming from trying to apply complex
Rules to the MERGE statement.

My proposal from here is to allow MERGE to work with Rules, but only
when the Rules are equivalent to simply updatable views. This would
restrict MERGE somewhat, yet be entirely compatible with SQL Standard
behaviour of MERGE, Views etc..

Let's look at some of the strangeness:

Rules allow you to define something like this

CREATE RULE foo_rule AS ON INSERT TO foo DO ALSO INSERT foo2 ...;

so that any insert into foo becomes 2 inserts, 1 into foo, 1 into foo2.
Now if we do an INSERT INTO FOO SELECT .... this gets rewritten into
1. INSERT INTO foo SELECT ...
2. INSERT INTO foo2 SELECT ...
This means that we fire statement-level insert triggers on foo and foo2.

By analogy, we might expect MERGE to behave similarly. That could be
true with trivial examples such as

MERGE INTO foo
USING (SELECT ...) ON (...)
WHEN NOT MATCHED
INSERT

though with a statement like this

MERGE INTO foo
USING (SELECT ...) ON (...)
WHEN NOT MATCHED
INSERT
WHEN MATCHED
UPDATE

what do we expect to happen exactly?

We run the MERGE statement twice, with the insert statements permuted?
But what happens with the UPDATE? Especially if there are also rules
that apply to UPDATE. Would we run it four times? No, the MERGE query
must run once and we must handle the rules within that single execution.
Which means if we do that then MERGE acts differently with Rules than
does an INSERT SELECT. Which feels like a warning...

So we must replace the INSERT with INSERT foo; INSERT foo2 and run the
MERGE. If we do this do we treat the second INSERT as a separate
statement each time it is executed? If so a statement-level trigger on
INSERT foo2 would be executed once for each row we insert. To make that
happen correctly we would need to apply the rule as if the INSERT were a
top-level statement, then execute it within MERGE as if it were not a
top-level statement (i.e. do not execute statement-level triggers for
that statement). If we have AFTER ROW triggers, should they be executed
after each execution of the sub-statement? or should they be executed in
a group at the end of the MERGE statement, as would happen if an INSERT
SELECT.

What would happen if one of the rules contained a MERGE statement? We'd
have to recursively apply rules down into the sub-statements of the
MERGE, yet unravel the trigger behaviour correctly.

All of this makes the hair on the back of my neck stand up. It's taken a
while to realise these issues exist. This feels to me like even stranger
behaviour might lurk somewhere there. This is a long way from clear
behaviour and makes me think it will be a long way from a clean and
useful implementation.

Clear, obvious behaviour for MERGE only seems possible when we have some
restrictions on rules. My proposal is to throw an "ERROR Cannot resolve
rules for MERGE statement", except when the rules meet the following
restrictions:

* allow MERGE with rules which say DO NOTHING
e.g. CREATE RULE foo_rule AS ON INSERT TO foo DO NOTHING;
That's fairly trivial

* allow MERGE when rules which are merely "redirections"
e.g. CREATE RULE foo_rule AS ON INSERT TO foo DO INSTEAD INSERT foo2
So only rules that have a single replacement statement. This is roughly
equivalent to a rule we might create to emulate a simply updatable view.

* allow rules only when all of the actions are similarly redirected, so
we only ever need to make changes to one table
i.e. SELECT, UPDATEs, INSERTs and DELETEs are all redirected

So a MERGE statement like this

MERGE INTO foo
USING (SELECT ...) ON (...)
WHEN NOT MATCHED
INSERT
WHEN MATCHED
UPDATE

with these rules

CREATE RULE foo_i AS ON INSERT TO foo DO INSTEAD INSERT foo2
CREATE RULE foo_u AS ON UPDATE TO foo DO INSTEAD UPDATE foo2
CREATE RULE foo_d AS ON DELETE TO foo DO INSTEAD DELETE foo2

would simply be equivalent to

MERGE INTO foo2
USING (SELECT ...) ON (...)
WHEN NOT MATCHED
INSERT
WHEN MATCHED
UPDATE

In fact, that might even be a better clue as to how to complete the
implementation of updatable views.

So: do we all agree with the restriction on MERGE to only work with
rules equivalent to simple updatability? If we do, should the
implementation of MERGE go on hold until after updatable views are added
(with the required hooks)?

--
Simon Riggs
2ndQuadrant

http://www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

No comments: