mirror of
https://github.com/postgres/postgres.git
synced 2025-11-13 00:03:54 -05:00
The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
src/backend/nodes/README
Node Structures
===============
Andrew Yu (11/94)
Introduction
------------
The current node structures are plain old C structures. "Inheritance" is
achieved by convention. No additional functions will be generated. Functions
that manipulate node structures reside in this directory.
FILES IN THIS DIRECTORY (src/backend/nodes/)
General-purpose node manipulation functions:
copyfuncs.c - copy a node tree
equalfuncs.c - compare two node trees
outfuncs.c - convert a node tree to text representation
readfuncs.c - convert text representation back to a node tree
makefuncs.c - creator functions for some common node types
nodeFuncs.c - some other general-purpose manipulation functions
Specialized manipulation functions:
bitmapset.c - Bitmapset support
list.c - generic list support
params.c - Param support
tidbitmap.c - TIDBitmap support
value.c - support for Value nodes
FILES IN src/include/nodes/
Node definitions:
nodes.h - define node tags (NodeTag)
primnodes.h - primitive nodes
parsenodes.h - parse tree nodes
pathnodes.h - path tree nodes and planner internal structures
plannodes.h - plan tree nodes
execnodes.h - executor nodes
memnodes.h - memory nodes
pg_list.h - generic list
Steps to Add a Node
-------------------
Suppose you want to define a node Foo:
1. Add a tag (T_Foo) to the enum NodeTag in nodes.h. (If you insert the
tag in a way that moves the numbers associated with existing tags,
you'll need to recompile the whole tree after doing this. It doesn't
force initdb though, because the numbers never go to disk.)
2. Add the structure definition to the appropriate include/nodes/???.h file.
If you intend to inherit from, say a Plan node, put Plan as the first field
of your struct definition.
3. If you intend to use copyObject, equal, nodeToString or stringToNode,
add an appropriate function to copyfuncs.c, equalfuncs.c, outfuncs.c
and readfuncs.c accordingly. (Except for frequently used nodes, don't
bother writing a creator function in makefuncs.c) The header comments
in those files give general rules for whether you need to add support.
4. Add cases to the functions in nodeFuncs.c as needed. There are many
other places you'll probably also need to teach about your new node
type. Best bet is to grep for references to one or two similar existing
node types to find all the places to touch.
Historical Note
---------------
Prior to the current simple C structure definitions, the Node structures
used a pseudo-inheritance system which automatically generated creator and
accessor functions. Since every node inherited from LispValue, the whole thing
was a mess. Here's a little anecdote:
LispValue definition -- class used to support lisp structures
in C. This is here because we did not want to totally rewrite
planner and executor code which depended on lisp structures when
we ported postgres V1 from lisp to C. -cim 4/23/90