mirror of
https://github.com/postgres/postgres.git
synced 2025-05-24 00:03:23 -04:00
This commit changes the query jumbling code in queryjumblefuncs.c to be generated automatically based on the information of the nodes in the headers of src/include/nodes/ by using gen_node_support.pl. This approach offers many advantages: - Support for query jumbling for all the utility statements, based on the state of their parsed Nodes and not only their query string. This will greatly ease the switch to normalize the information of some DDLs, like SET or CALL for example (this is left unchanged and should be part of a separate discussion). With this feature, the number of entries stored for utilities in pg_stat_statements is reduced (for example now "CHECKPOINT" and "checkpoint" mean the same thing with the same query ID). - Documentation of query jumbling directly in the structure definition of the nodes. Since this code has been introduced in pg_stat_statements and then moved to code, the reasons behind the choices of what should be included in the jumble are rather sparse. Note that some explanation is added for the most relevant parts, as a start. - Overall code reduction and more consistency with the other parts generating read, write and copy depending on the nodes. The query jumbling is controlled by a couple of new node attributes, documented in nodes/nodes.h: - custom_query_jumble, to mark a Node as having a custom implementation. - no_query_jumble, to ignore entirely a Node. - query_jumble_ignore, to ignore a field in a Node. - query_jumble_location, to mark a location in a Node, for normalization. This can apply only to int fields, with "location" in their name (only Const as of this commit). There should be no compatibility impact on pg_stat_statements, as the new code applies the jumbling to the same fields for each node (its regression tests have no modification, for one). Some benchmark of the query jumbling between HEAD and this commit for SELECT and DMLs has proved that this new code does not cause a performance regression, with computation times close for both methods. For utility queries, the new method is slower than the previous method of calculating a hash of the query string, though we are talking about extra ns-level changes based on what I measured, which is unnoticeable even for OLTP workloads as a query ID is calculated once per query post-parse analysis. Author: Michael Paquier Reviewed-by: Peter Eisentraut Discussion: https://postgr.es/m/Y5BHOUhX3zTH/ig6@paquier.xyz
The PostgreSQL contrib tree --------------------------- This subtree contains porting tools, analysis utilities, and plug-in features that are not part of the core PostgreSQL system, mainly because they address a limited audience or are too experimental to be part of the main source tree. This does not preclude their usefulness. User documentation for each module appears in the main SGML documentation. When building from the source distribution, these modules are not built automatically, unless you build the "world" target. You can also build and install them all by running "make all" and "make install" in this directory; or to build and install just one selected module, do the same in that module's subdirectory. Some directories supply new user-defined functions, operators, or types. To make use of one of these modules, after you have installed the code you need to register the new SQL objects in the database system by executing a CREATE EXTENSION command. In a fresh database, you can simply do CREATE EXTENSION module_name; See the PostgreSQL documentation for more information about this procedure.