PostgreSQL/src/include/executor/execPartition.h
Alvaro Herrera 3f2393edef Redesign initialization of partition routing structures
This speeds up write operations (INSERT, UPDATE, DELETE, COPY, as well
as the future MERGE) on partitioned tables.

This changes the setup for tuple routing so that it does far less work
during the initial setup and pushes more work out to when partitions
receive tuples.  PartitionDispatchData structs for sub-partitioned
tables are only created when a tuple gets routed through them.  The
possibly large arrays in the PartitionTupleRouting struct have largely
been removed.  The partitions[] array remains but now never contains any
NULL gaps.  Previously the NULLs had to be skipped during
ExecCleanupTupleRouting(), which could add a large overhead to the
cleanup when the number of partitions was large.  The partitions[] array
is allocated small to start with and only enlarged when we route tuples
to enough partitions that it runs out of space. This allows us to keep
simple single-row partition INSERTs running quickly.
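
The growth strategy described above can be pictured with a small standalone sketch. This is not the code from execPartition.c: the type names, the initial size of 8 slots, and the doubling policy are all assumptions chosen for illustration. It only shows the idea of a dense array that starts small, grows on demand, and can be cleaned up without skipping NULL gaps.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-partition ResultRelInfo. */
typedef struct DemoPartition
{
	int			partition_id;
} DemoPartition;

/* Hypothetical, simplified analogue of the partitions[] bookkeeping. */
typedef struct DemoRouting
{
	DemoPartition **partitions;	/* dense: slots 0..num_partitions-1 are non-NULL */
	int			num_partitions;	/* slots in use */
	int			max_partitions;	/* slots allocated */
} DemoRouting;

#define DEMO_INITIAL_SLOTS 8	/* assumed starting size, not PostgreSQL's */

static void
demo_routing_init(DemoRouting *r)
{
	r->partitions = NULL;
	r->num_partitions = 0;
	r->max_partitions = 0;
}

/*
 * Record a partition the first time a tuple is routed to it.
 * Returns the slot index, or -1 if memory could not be allocated.
 */
static int
demo_routing_add(DemoRouting *r, DemoPartition *part)
{
	assert(part != NULL);

	if (r->num_partitions >= r->max_partitions)
	{
		int			newmax = (r->max_partitions == 0) ?
			DEMO_INITIAL_SLOTS : r->max_partitions * 2;
		DemoPartition **newarr = realloc(r->partitions,
										 sizeof(DemoPartition *) * (size_t) newmax);

		if (newarr == NULL)
			return -1;
		r->partitions = newarr;
		r->max_partitions = newmax;
	}

	r->partitions[r->num_partitions] = part;
	return r->num_partitions++;
}

/*
 * Release every recorded partition (assumed to be malloc'd by the caller).
 * Only used slots are visited, so no NULL checks are needed.
 * Returns the number of partitions visited.
 */
static int
demo_routing_cleanup(DemoRouting *r)
{
	int			visited = 0;

	for (int i = 0; i < r->num_partitions; i++)
	{
		free(r->partitions[i]);
		visited++;
	}
	free(r->partitions);
	demo_routing_init(r);
	return visited;
}
```

In this sketch a single-row INSERT touches one slot and allocates only the initial array, while the cleanup cost depends on the number of partitions that actually received tuples rather than on the total partition count.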

The arrays in PartitionTupleRouting which stored the tuple translation maps
have now been removed.  These have been moved out into a
PartitionRoutingInfo struct which is an additional field in ResultRelInfo.

The find_all_inheritors() call still remains by far the slowest part of
ExecSetupPartitionTupleRouting(). This commit just removes the other slow
parts.

In passing also rename the tuple translation maps from being ParentToChild
and ChildToParent to being RootToPartition and PartitionToRoot. The old
names mislead you into thinking that a partition of some sub-partitioned
table would translate to the rowtype of the sub-partitioned table rather
than the root partitioned table.
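
To make the direction of the renamed maps concrete, here is a minimal sketch of a root-to-partition conversion in which a NULL map means "no conversion required", matching the convention documented for pi_RootToPartitionMap in the header. The DemoConversionMap type and demo_root_to_partition() are hypothetical; tuples are reduced to plain int arrays, and the real TupleConversionMap machinery is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, simplified conversion map: attr_map[i] is the index of the
 * root-table column that supplies column i of the partition.
 */
typedef struct DemoConversionMap
{
	int			natts;
	const int  *attr_map;
} DemoConversionMap;

/*
 * Convert a tuple from root-table column order to partition column order.
 * A NULL map means the layouts already match, so the input is returned
 * unchanged and partition_slot is not touched.
 */
static const int *
demo_root_to_partition(const DemoConversionMap *map,
					   const int *root_tuple,
					   int *partition_slot)
{
	if (map == NULL)
		return root_tuple;

	assert(partition_slot != NULL);
	for (int i = 0; i < map->natts; i++)
		partition_slot[i] = root_tuple[map->attr_map[i]];
	return partition_slot;
}
```

The point of the sketch is that the source format is always the root table's, even when the partition sits below an intermediate sub-partitioned table, which is what the names RootToPartition and PartitionToRoot are meant to convey.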

Authors: David Rowley and Amit Langote, heavily revised by Álvaro Herrera
Testing help from Jesper Pedersen and Kato Sho.
Discussion: https://postgr.es/m/CAKJS1f_1RJyFquuCKRFHTdcXqoPX-PYqAd7nz=GVBwvGh4a6xA@mail.gmail.com
2018-11-16 15:01:05 -03:00


/*--------------------------------------------------------------------
 * execPartition.h
 *		POSTGRES partitioning executor interface
 *
 * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * IDENTIFICATION
 *		src/include/executor/execPartition.h
 *--------------------------------------------------------------------
 */
#ifndef EXECPARTITION_H
#define EXECPARTITION_H
#include "nodes/execnodes.h"
#include "nodes/parsenodes.h"
#include "nodes/plannodes.h"
#include "partitioning/partprune.h"
/* See execPartition.c for the definitions. */
typedef struct PartitionDispatchData *PartitionDispatch;
typedef struct PartitionTupleRouting PartitionTupleRouting;
/*
 * PartitionRoutingInfo
 *
 * Additional result relation information specific to routing tuples to a
 * table partition.
 */
typedef struct PartitionRoutingInfo
{
	/*
	 * Map for converting tuples in root partitioned table format into
	 * partition format, or NULL if no conversion is required.
	 */
	TupleConversionMap *pi_RootToPartitionMap;

	/*
	 * Map for converting tuples in partition format into the root partitioned
	 * table format, or NULL if no conversion is required.
	 */
	TupleConversionMap *pi_PartitionToRootMap;

	/*
	 * Slot to store tuples in partition format, or NULL when no translation
	 * is required between root and partition.
	 */
	TupleTableSlot *pi_PartitionTupleSlot;
} PartitionRoutingInfo;
/*
 * PartitionedRelPruningData - Per-partitioned-table data for run-time pruning
 * of partitions.  For a multilevel partitioned table, we have one of these
 * for the topmost partition plus one for each non-leaf child partition.
 *
 * subplan_map[] and subpart_map[] have the same definitions as in
 * PartitionedRelPruneInfo (see plannodes.h); though note that here,
 * subpart_map contains indexes into PartitionPruningData.partrelprunedata[].
 *
 * subplan_map          Subplan index by partition index, or -1.
 * subpart_map          Subpart index by partition index, or -1.
 * present_parts        A Bitmapset of the partition indexes that we
 *                      have subplans or subparts for.
 * context              Contains the context details required to call
 *                      the partition pruning code.
 * pruning_steps        List of PartitionPruneSteps used to
 *                      perform the actual pruning.
 * do_initial_prune     true if pruning should be performed during
 *                      executor startup (for this partitioning level).
 * do_exec_prune        true if pruning should be performed during
 *                      executor run (for this partitioning level).
 */
typedef struct PartitionedRelPruningData
{
	int		   *subplan_map;
	int		   *subpart_map;
	Bitmapset  *present_parts;
	PartitionPruneContext context;
	List	   *pruning_steps;
	bool		do_initial_prune;
	bool		do_exec_prune;
} PartitionedRelPruningData;
/*
 * PartitionPruningData - Holds all the run-time pruning information for
 * a single partitioning hierarchy containing one or more partitions.
 * partrelprunedata[] is an array ordered such that parents appear before
 * their children; in particular, the first entry is the topmost partition,
 * which was actually named in the SQL query.
 */
typedef struct PartitionPruningData
{
	int			num_partrelprunedata;	/* number of array entries */
	PartitionedRelPruningData partrelprunedata[FLEXIBLE_ARRAY_MEMBER];
} PartitionPruningData;
/*
 * PartitionPruneState - State object required for plan nodes to perform
 * run-time partition pruning.
 *
 * This struct can be attached to plan types which support arbitrary Lists of
 * subplans containing partitions, to allow subplans to be eliminated due to
 * the clauses being unable to match to any tuple that the subplan could
 * possibly produce.
 *
 * execparamids         Contains paramids of PARAM_EXEC Params found within
 *                      any of the partprunedata structs.  Pruning must be
 *                      done again each time the value of one of these
 *                      parameters changes.
 * other_subplans       Contains indexes of subplans that don't belong to any
 *                      "partprunedata", e.g. UNION ALL children that are not
 *                      partitioned tables, or a partitioned table for which
 *                      the planner deemed run-time pruning to be useless.
 *                      These must not be pruned.
 * prune_context        A short-lived memory context in which to execute the
 *                      partition pruning functions.
 * do_initial_prune     true if pruning should be performed during executor
 *                      startup (at any hierarchy level).
 * do_exec_prune        true if pruning should be performed during
 *                      executor run (at any hierarchy level).
 * num_partprunedata    Number of items in "partprunedata" array.
 * partprunedata        Array of PartitionPruningData pointers for the plan's
 *                      partitioned relation(s), one for each partitioning
 *                      hierarchy that requires run-time pruning.
 */
typedef struct PartitionPruneState
{
	Bitmapset  *execparamids;
	Bitmapset  *other_subplans;
	MemoryContext prune_context;
	bool		do_initial_prune;
	bool		do_exec_prune;
	int			num_partprunedata;
	PartitionPruningData *partprunedata[FLEXIBLE_ARRAY_MEMBER];
} PartitionPruneState;

extern PartitionTupleRouting *ExecSetupPartitionTupleRouting(ModifyTableState *mtstate,
							   Relation rel);
extern ResultRelInfo *ExecFindPartition(ModifyTableState *mtstate,
				  ResultRelInfo *rootResultRelInfo,
				  PartitionTupleRouting *proute,
				  TupleTableSlot *slot,
				  EState *estate);
extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
						PartitionTupleRouting *proute);
extern PartitionPruneState *ExecCreatePartitionPruneState(PlanState *planstate,
							  PartitionPruneInfo *partitionpruneinfo);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
extern Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
								int nsubplans);

#endif /* EXECPARTITION_H */
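
The comments on PartitionedRelPruningData say that subplan_map[] and subpart_map[] are indexed by partition, hold -1 when there is no entry, and that subpart_map[] points into an array in which parents precede their children. The following standalone sketch shows one way such maps could be walked to collect the surviving subplans. It is not the implementation behind ExecFindMatchingSubPlans(): the DemoRelPruning type is hypothetical, a 32-bit mask stands in for Bitmapset (so the sketch assumes fewer than 32 partitions and subplans), and the set of partitions that survive pruning is supplied directly instead of being computed from pruning steps.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified analogue of one PartitionedRelPruningData entry. */
typedef struct DemoRelPruning
{
	int			nparts;			/* number of partitions at this level */
	const int  *subplan_map;	/* subplan index by partition index, or -1 */
	const int  *subpart_map;	/* index into the hierarchy array, or -1 */
	uint32_t	surviving;		/* assumed input: bit i set if partition i
								 * survived pruning at this level */
} DemoRelPruning;

/*
 * Collect the subplans reachable from rels[relidx] through surviving
 * partitions.  Leaf partitions contribute their subplan directly;
 * sub-partitioned children are handled by recursing into their own entry.
 * Recursion terminates as long as parents precede children in rels[].
 */
static uint32_t
demo_find_matching_subplans(const DemoRelPruning *rels, int relidx)
{
	const DemoRelPruning *rel = &rels[relidx];
	uint32_t	result = 0;

	for (int i = 0; i < rel->nparts; i++)
	{
		if ((rel->surviving & (UINT32_C(1) << i)) == 0)
			continue;			/* partition i was pruned */

		if (rel->subplan_map[i] >= 0)
			result |= UINT32_C(1) << rel->subplan_map[i];
		else if (rel->subpart_map[i] >= 0)
			result |= demo_find_matching_subplans(rels, rel->subpart_map[i]);
	}
	return result;
}
```

For example, with a root whose second partition is itself partitioned, pruning that one root partition removes every subplan beneath it without the child level ever being examined.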