Sunday, August 31, 2008

[PATCHES] WIP Join Removal

Index: src/backend/optimizer/path/joinrels.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/optimizer/path/joinrels.c,v
retrieving revision 1.94
diff -c -r1.94 joinrels.c
*** src/backend/optimizer/path/joinrels.c 17 Aug 2008 19:40:11 -0000 1.94
--- src/backend/optimizer/path/joinrels.c 31 Aug 2008 09:41:53 -0000
***************
*** 467,472 ****
--- 467,568 ----
return true;
}

+ /*
+ * consider_join_removal_path
+ * Determine whether a proposed join is removable, and if this results
+ * in a cost reduction for this join, remove the join.
+ *
+ * Callers *must* have already determined that the relation
+ * is on the non-outer side of an outer join.
+ *
+ * Note that removal is correct *even* if there are restrictions on the
+ * checkrel. That is because missing rows don't alter the overall result,
+ * so removing a few more makes no difference at all.
+ * XXX should we check that they are all immutable before we ignore them?
+ */
+ static void
+ consider_join_removal_path(RelOptInfo *keeprel, RelOptInfo *checkrel, RelOptInfo *joinrel)
+ {
+ bool removable = false;
+ int attrno;
+ int minattr = 0;
+
+ if (checkrel->rtekind != RTE_RELATION)
+ return;
+
+ /*
+ * Check that the query targetlist does not contain any non-junk
+ * target entries for the relation. If it does, we cannot remove join.
+ */
+ if (checkrel->min_attr > 0)
+ minattr = checkrel->min_attr;
+
+ for (attrno = minattr; attrno < checkrel->max_attr; attrno++)
+ {
+ if (!bms_is_empty(checkrel->attr_needed[attrno]))
+ return;
+ }
+
+ /*
+ * Check that each row of keeprel joins to exactly one row of checkrel.
+ * So we check whether the columns of the join condition exactly match
+ * a unique index on checkrel. It must be an *exact* match, no more,
+ * no less. If less, we would fail to prove uniqueness; if more we would
+ * effectively have an extra restriction condition that must be applied.
+ */
+ if (checkrel->indexlist)
+ {
+ ListCell *l;
+
+ foreach(l, checkrel->indexlist)
+ {
+ IndexOptInfo *info = lfirst(l);
+
+ /*
+ * Stop searching as soon as we find a matching unique index.
+ * We don't care if the index allows nulls, we only care
+ * if the join cannot return more than one row for keeprel.
+ *
+ * XXX Seems not worth searching partial indexes because those
+ * are only usable with a appropriate restriction, so the
+ * only way they could ever be usable is if there was a
+ * restriction that exactly matched the index predicate,
+ * which is possible but generally unlikely.
+ *
+ * XXX Expressional indexes would only be useful if the
+ * join condition contained that expression: also unlikely.
+ */
+ if (info->unique && !info->indexprs && !info->indpred)
+ {
+ removable = true;
+
+ /* check joinrel->joininfo
+ * but also somehow consider equivalence classes
+ * if (joinrel->has_eclass_joins)
+ */
+
+ /* if matches then return true */
+ }
+ }
+ }
+
+ /*
+ * We can now remove join by pulling up child plan from the keeprel.
+ * This needs to be done considering costs, since its possible for
+ * a nested inner indexscan plan to be cheaper. So it isn't
+ * always desirable to remove the join.
+ *
+ * XXX we keep resjunk attributes in the targetlist of keeprel
+ * for now, but these seem to be potentially removable also
+ */
+ if (removable &&
+ joinrel->cheapest_total_path < keeprel->cheapest_total_path)
+ {
+ elog(LOG, "join removed");
+ joinrel->pathlist = keeprel->pathlist;
+ joinrel->joininfo = keeprel->baserestrictinfo;
+ }
+ }

/*
* make_join_rel
***************
*** 597,602 ****
--- 693,699 ----
add_paths_to_joinrel(root, joinrel, rel2, rel1,
JOIN_RIGHT, sjinfo,
restrictlist);
+ consider_join_removal_path(rel1, rel2, joinrel);
break;
case JOIN_FULL:
if (is_dummy_rel(rel1) && is_dummy_rel(rel2))
As discussed on 26 June, "Join Removal/Vertical Partitioning", here's a
patch to remove joins in certain circumstances.

Tested and blind reviewed, but this is complex and subtle enough I
welcome and expect your comments on corner cases and missed
complications. (Lord knows, I've been down a few blind alleys writing it
to date...)

Patch works, but there's a bit I haven't finished yet - checking unique
indexes. So patch is marked WIP for now. Depending upon how long
commitfest lasts I may have a fully working version. (Yes, I know its
only 1 hours work, but haven't worked out how yet...)

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

No comments: