When estimating the cost/size of a pre-sorted path for a given upper
relation using local stats, this function dereferences the passed-in
PgFdwPathExtraData pointer without checking that it is not NULL. But
that is not a bug as the pointer is guaranteed to be non-NULL in that
case; to avoid confusion, add an Assert to ensure that it is not NULL
before dereferencing it.
Reported-by: Ranier Vilela <ranier.vf@gmail.com>
Author: Etsuro Fujita <etsuro.fujita@gmail.com>
Reviewed-by: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://postgr.es/m/CAEudQArgiALbV1akQpeZOgim7XP05n%3DbDP1%3DTcOYLA43nRX_vA%40mail.gmail.com
The previous minimum was to maintain support for Python 3.5, but we
now require Python 3.6 anyway (commit 45363fca637), so that reason is
obsolete. A small raise to Meson 0.57 allows getting rid of a fair
amount of version conditionals and silences some future-deprecated
warnings.
With the version bump, the following deprecation warnings appeared and
are fixed:
WARNING: Project targets '>=0.57' but uses feature deprecated since '0.55.0': ExternalProgram.path. use ExternalProgram.full_path() instead
WARNING: Project targets '>=0.57' but uses feature deprecated since '0.56.0': meson.build_root. use meson.project_build_root() or meson.global_build_root() instead.
It turns out that meson 0.57.0 and 0.57.1 are buggy for our use, so
the minimum is actually set to 0.57.2. This is specific to this
version series; in the future we won't necessarily need to be this
precise.
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/42e13eb0-862a-441e-8d84-4f0fd5f6def0%40eisentraut.org
This provides a convenient way to look up a database's OID. For
example, the query
SELECT * FROM pg_shdepend
WHERE dbid = (SELECT oid FROM pg_database
WHERE datname = current_database());
can now be simplified to
SELECT * FROM pg_shdepend
WHERE dbid = current_database()::regdatabase;
Like the regrole type, regdatabase has cluster-wide scope, so we
disallow regdatabase constants from appearing in stored
expressions.
Bumps catversion.
Author: Ian Lawrence Barwick <barwick@gmail.com>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/aBpjJhyHpM2LYcG0%40nathan
We concluded that commit e5a3c9d9b is a feature rather than a fix; since
it was added after feature freeze, revert it.
Reported-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reported-by: Michael Paquier <michael@paquier.xyz>
Reported-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/ed2296f1-1a6b-4932-b870-5bb18c2591ae%40oss.nttdata.com
Previously, postgres_fdw always 1) opened a remote transaction in READ
WRITE mode even when the local transaction was READ ONLY, causing a READ
ONLY transaction using it that references a foreign table mapped to a
remote view executing a volatile function to write in the remote side,
and 2) opened the remote transaction in NOT DEFERRABLE mode even when
the local transaction was DEFERRABLE, causing a SERIALIZABLE READ ONLY
DEFERRABLE transaction using it to abort due to a serialization failure
in the remote side.
To avoid these, modify postgres_fdw to open a remote transaction in the
same access/deferrable modes as the local transaction. This commit also
modifies it to open a remote subtransaction in the same access mode as
the local subtransaction.
Although these issues exist since the introduction of postgres_fdw,
there have been no reports from the field. So it seems fine to just fix
them in master only.
Author: Etsuro Fujita <etsuro.fujita@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAPmGK16n_hcUUWuOdmeUS%2Bw4Q6dZvTEDHb%3DOP%3D5JBzo-M3QmpQ%40mail.gmail.com
postgres_fdw tries to use PG_TRY blocks to ensure that it will
eventually free the PGresult created by the remote modify command.
However, it's fundamentally impossible for this scheme to work
reliably when there's RETURNING data, because the query could fail
in between invocations of postgres_fdw's DirectModify methods.
There is at least one instance of exactly this situation in the
regression tests, and the ensuing session-lifespan leak is visible
under Valgrind.
We can improve matters by using a memory context reset callback
attached to the ExecutorState context. That ensures that the
PGresult will be freed when the ExecutorState context is torn
down, even if control never reaches postgresEndDirectModify.
I have little faith that there aren't other potential PGresult
leakages in the backend modules that use libpq. So I think it'd
be a good idea to apply this concept universally by creating
infrastructure that attaches a reset callback to every PGresult
generated in the backend. However, that seems too invasive for
v18 at this point, let alone the back branches. So for the
moment, apply this narrow fix that just makes DirectModify safe.
I have a patch in the queue for the more general idea, but it
will have to wait for v19.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/2976982.1748049023@sss.pgh.pa.us
Backpatch-through: 13
A 'void *' argument suggests that the caller might pass an arbitrary
struct, which is appropriate for functions like libc's read/write, or
pq_sendbytes(). 'uint8 *' is more appropriate for byte arrays that
have no structure, like the cancellation keys or SCRAM tokens. Some
places used 'char *', but 'uint8 *' is better because 'char *' is
commonly used for null-terminated strings. Change code around SCRAM,
MD5 authentication, and cancellation key handling to follow these
conventions.
Discussion: https://www.postgresql.org/message-id/61be9e31-7b7d-49d5-bc11-721800d89d64@eisentraut.org
When building a ForeignPath for a joinrel, if there's a possibility
that EvalPlanQual will be executed, we must identify a suitable path
for EPQ checks. If the outer or inner path of the chosen path is a
ForeignPath representing a pushed-down join, we replace it with its
fdw_outerpath to ensure that the EPQ check path consists entirely of
local joins.
If the chosen path is a MergePath, and its outer or inner path is a
ForeignPath that is not already well enough ordered, the MergePath
will have non-NIL outersortkeys or innersortkeys indicating the
desired ordering to be created by an explicit Sort node. If we then
replace the outer or inner path with its corresponding fdw_outerpath,
and that path is already sufficiently ordered, we end up in an
inconsistent state: the MergePath has non-NIL outersortkeys or
innersortkeys, and its input path is already properly ordered. This
inconsistency can result in an Assert failure or the addition of a
redundant Sort node.
To fix, check if the new outer or inner path of a MergePath is already
properly sorted, and set its outersortkeys or innersortkeys to NIL if
so.
Bug: #18902
Reported-by: Nikita Kalinin <n.kalinin@postgrespro.ru>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/18902-71c1bed2b9f7c46f@postgresql.org
A subsequent commit will reclassify oauth_client_secret from dispchar=""
to dispchar="*", so that UIs will treat it like a secret. For our FDWs,
this change will move that option from SERVER to USER MAPPING, which we
need to avoid.
But upon further discussion, we don't really want our FDWs to use our
builtin Device Authorization flow at all, for several reasons:
- the URL and code would be printed to the server logs, not sent over
the client connection
- tokens are not cached/refreshed, so every single connection has to be
manually authorized by a user with a browser
- oauth_client_secret needs to belong to the foreign server, but options
on SERVER are publicly accessible
- all non-superusers would need password_required=false, which is
dangerous
Future OAuth work can use FDWs as a motivating use case. But for now,
disallow all oauth_* connection options for these two extensions.
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250415191435.55.nmisch%40google.com
Make sure that function declarations use names that exactly match the
corresponding names from function definitions in a few places. These
inconsistencies were all introduced during Postgres 18 development.
This commit was written with help from clang-tidy, by mechanically
applying the same rules as similar clean-up commits (the earliest such
commit was commit 035ce1fe).
When planning queries to partitioned tables, we clone all
EquivalenceMembers belonging to the partitioned table into em_is_child
EquivalenceMembers for each non-pruned partition. For partitioned tables
with large numbers of partitions, this meant the ec_members list could
become large and code searching that list would become slow. Effectively,
the more partitions which were present, the more searches needed to be
performed for operations such as find_ec_member_matching_expr() during
create_plan() and the more partitions present, the longer these searches
would take, i.e., a quadratic slowdown.
To fix this, here we adjust how we store EquivalenceMembers for
em_is_child members. Instead of storing these directly in ec_members,
these are now stored in a new array of Lists in the EquivalenceClass,
which is indexed by the relid. When we want to find EquivalenceMembers
belonging to a certain child relation, we can narrow the search to the
array element for that relation.
To make EquivalenceMember lookup easier and to reduce the amount of code
change, this commit provides a pair of functions to allow iteration over
the EquivalenceMembers of an EC which also handles finding the child
members, if required. Callers that never need to look at child members
can remain using the foreach loop over ec_members, which will now often
be faster due to only parent-level members being stored there.
The actual performance increases here are highly dependent on the number
of partitions and the query being planned. Performance increases can be
visible with as few as 8 partitions, but the speedup is marginal for
such low numbers of partitions. The speedups become much more visible
with a few dozen to hundreds of partitions. With some tested queries
using 56 partitions, the planner was around 3x faster than before. For
use cases with thousands of partitions, these are likely to become
significantly faster. Some testing has shown planner speedups of 60x or
more with 8192 partitions.
Author: Yuya Watari <watari.yuya@gmail.com>
Co-authored-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrey Lepikhov <a.lepikhov@postgrespro.ru>
Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru>
Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Tested-by: Thom Brown <thom@linux.com>
Tested-by: newtglobal postgresql_contributors <postgresql_contributors@newtglobalcorp.com>
Discussion: https://postgr.es/m/CAJ2pMkZNCgoUKSE%2B_5LthD%2BKbXKvq6h2hQN8Esxpxd%2Bcxmgomg%40mail.gmail.com
Change the PathKey struct to use CompareType to record the sort
direction instead of hardcoding btree strategy numbers. The
CompareType is then converted to the index-type-specific strategy when
the plan is created.
This reduces the number of places btree strategy numbers are
hardcoded, and it's a self-contained subset of a larger effort to
allow non-btree indexes to behave like btrees.
Author: Mark Dilger <mark.dilger@enterprisedb.com>
Co-authored-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
It seems potentially useful to label our shared libraries with version
information, now that a facility exists for retrieving that. This
patch labels them with the PG_VERSION string. There was some
discussion about using semantic versioning conventions, but that
doesn't seem terribly helpful for modules with no SQL-level presence;
and for those that do have SQL objects, we typically expect them
to support multiple revisions of the SQL definitions, so it'd still
not be very helpful.
I did not label any of src/test/modules/. It seems unnecessary since
we don't install those, and besides there ought to be someplace that
still provides test coverage for the original PG_MODULE_MAGIC macro.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/dd4d1b59-d0fe-49d5-b28f-1e463b68fa32@gmail.com
The tests introduced by commit 76563f88cfb only work when Unix-domain
sockets are available. This is optional on Windows, and buildfarm
member drongo runs without them. To fix, skip the test if Unix-domain
sockets are not enabled.
If a var belongs to the innerrel of the joinrel, it's not possible that
it belongs to the outerrel. This commit removes the redundant check from
the if-clause but keeps it as an assertion.
Discussion: https://postgr.es/m/flat/CAHewXN=8aW4hd_W71F7Ua4+_w0=bppuvvTEBFBF6G0NuSXLwUw@mail.gmail.com
Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Alexander Pyhalov <a.yhalov@postgrespro.ru>
Backpatch-through: 17
Semi-join joins below left/right join are deparsed as
subqueries. Thus, we can't refer to subqueries vars from upper relations.
This commit avoids pulling conditions from them.
Reported-by: Robins Tharakan <tharakan@gmail.com>
Bug: #18852
Discussion: https://postgr.es/m/CAEP4nAzryLd3gwcUpFBAG9MWyDfMRX8ZjuyY2XXjyC_C6k%2B_Zw%40mail.gmail.com
Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Backpatch-through: 17
SCRAM pass-through should not bypass the FDW security check as it was
implemented for postgres_fdw in commit 761c79508e7.
This commit improves the security check by adding new SCRAM
pass-through checks to ensure that the required SCRAM connection
options are not overwritten by the user mapping or foreign server
options. This is meant to match the security requirements for a
password-using connection.
Since libpq has no SCRAM-specific equivalent of
PQconnectionUsedPassword(), we enforce this instead by making the
use_scram_passthrough option of postgres_fdw imply
require_auth=scram-sha-256. This means that if use_scram_passthrough
is set, some situations that might otherwise have worked are
preempted, for example GSSAPI with delegated credentials. This could
be enhanced in the future if there is desire for more flexibility.
Reported-by: Jacob Champion <jacob.champion@enterprisedb.com>
Author: Matheus Alcantara <mths.dev@pm.me>
Co-authored-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAFY6G8ercA1KES%3DE_0__R9QCTR805TTyYr1No8qF8ZxmMg8z2Q%40mail.gmail.com
We have collected several instances of a workaround for GCC bug 53119,
which caused false-positive compiler warnings. This bug has long been
fixed, but was still seen on the buildfarm, most recently on lapwing
with gcc (Debian 4.7.2-5). (The GCC bug tracker mentions that a fix
was backported to 4.7.4 and 4.8.3.)
That compiler no longer runs warning-free since commit 6fdd5d95634, so
we don't need to keep these workarounds. And furthermore, the
consensus appears to be that we don't want to keep supporting that era
of platform anymore at all.
This reverts the following commits:
d937904cce6a3d82e4f9c2127de7b59105a134b3
506428d091760650971433f6bc083531c307b368
b449afb582bb9015bfbb85abc10ce122aef9ec70
6392f2a0968c20ecde4d27b6652703ad931fce92
bad0763a4d7be3005eae35d460c73ac4bc7ebaad
5e0c761d0a13c7b4f7c5de618ac38560d74d74d0
and makes a few similar fixes to newer code.
Discussion: https://www.postgresql.org/message-id/flat/e170d61f-01ab-4cf9-ab68-91cd1fac62c5%40eisentraut.org
Discussion: https://www.postgresql.org/message-id/flat/CA%2BTgmoYEAm-KKZibAP3hSqbTFTjUd47XtVcf3xSFDpyecXX9uQ%40mail.gmail.com
Modules can use RegisterExtensionExplainOption to register new
EXPLAIN options, and GetExplainExtensionId, GetExplainExtensionState,
and SetExplainExtensionState to store related state inside the
ExplainState object.
Since this substantially increases the amount of code that needs
to handle ExplainState-related tasks, move a few bits of existing
code to a new file explain_state.c and add the rest of this
infrastructure there.
See the comments at the top of explain_state.c for further
explanation of how this mechanism works.
This does not yet provide a way for such such options to do anything
useful. The intention is that we'll add hooks for that purpose in a
separate commit.
Discussion: http://postgr.es/m/CA+TgmoYSzg58hPuBmei46o8D3SKX+SZoO4K_aGQGwiRzvRApLg@mail.gmail.com
Reviewed-by: Srinath Reddy <srinath2133@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Up to now we just punted on showing the window definitions used
in a plan, with window function calls represented as "OVER (?)".
To improve that, show the window definition implemented by each
WindowAgg plan node, and reference their window names in OVER.
For nameless window clauses generated by "OVER (...)", assign
unique names w1, w2, etc.
In passing, re-order the properties shown for a WindowAgg node
so that the Run Condition (if any) appears after the Window
property and before the Filter (if any). This seems more
sensible since the Run Condition is associated with the Window
and acts before the Filter.
Thanks to David G. Johnston and Álvaro Herrera for design
suggestions.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/144530.1741469955@sss.pgh.pa.us
This commit adds a new "remote_backend_pid" output column to
the postgres_fdw_get_connections function. It returns the process ID of
the remote backend, on the foreign server, handling the connection.
This enhancement is useful for troubleshooting, monitoring, and reporting.
For example, if a connection is unexpectedly closed by the foreign server,
the remote backend's PID can help diagnose the cause.
No extension version bump is needed, as commit c297a47c5f already
handled it for v18~.
Author: Sagar Dilip Shedge <sagar.shedge92@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAPhYifF25q5xUQWXETfKwhc0YVa_6+tfG9Kw4bCvCjpCWxYs2A@mail.gmail.com
explain.c has grown rather large, so move various functions that
are principally concerned with output generation to a new source
file, explain_format.c, instead of lumping them in with everything
else that is part of explain.c
Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Discussion: http://postgr.es/m/CA+TgmoYutMw1Jgo8BWUmB3TqnOhsEAJiYO=rOQufF4gPLWmkLQ@mail.gmail.com
Commit ddb17e387aa28d61521227377b00f997756b8a27 attempted to avoid
confusing users by displaying digits after the decimal point only when
nloops > 1, since it's impossible to have a fraction row count after a
single iteration. However, this made the regression tests unstable since
parallal queries will have nloops>1 for all nodes below the Gather or
Gather Merge in normal cases, but if the workers don't start in time and
the leader finishes all the work, they will suddenly have nloops==1,
making it unpredictable whether the digits after the decimal point would
be displayed or not. Although 44cbba9a7f51a3888d5087fc94b23614ba2b81f2
seemed to fix the immediate failures, it may still be the case that there
are lower-probability failures elsewhere in the regression tests.
Various fixes are possible here. For example, it has previously been
proposed that we should try to display the digits after the decimal
point only if rows/nloops is an integer, but currently rows is storead
as a float so it's not theoretically an exact quantity -- precision
could be lost in extreme cases. It has also been proposed that we
should try to display the digits after the decimal point only if we're
under some sort of construct that could potentially cause looping
regardless of whether it actually does. While such ideas are not
without merit, this patch adopts the much simpler solution of always
display two decimal digits. If that approach stands up to scrutiny
from the buildfarm and human users, it spares us the trouble of doing
anything more complex; if not, we can reassess.
This commit incidentally reverts 44cbba9a7f51a3888d5087fc94b23614ba2b81f2,
which should no longer be needed.
Author: Robert Haas <robertmhaas@gmail.com>
Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Discussion: http://postgr.es/m/CA+TgmoazzVHn8sFOMFAEwoqBTDxKT45D7mvkyeHgqtoD2cn58Q@mail.gmail.com
This adds a new variant of generated columns that are computed on read
(like a view, unlike the existing stored generated columns, which are
computed on write, like a materialized view).
The syntax for the column definition is
... GENERATED ALWAYS AS (...) VIRTUAL
and VIRTUAL is also optional. VIRTUAL is the default rather than
STORED to match various other SQL products. (The SQL standard makes
no specification about this, but it also doesn't know about VIRTUAL or
STORED.) (Also, virtual views are the default, rather than
materialized views.)
Virtual generated columns are stored in tuples as null values. (A
very early version of this patch had the ambition to not store them at
all. But so much stuff breaks or gets confused if you have tuples
where a column in the middle is completely missing. This is a
compromise, and it still saves space over being forced to use stored
generated columns. If we ever find a way to improve this, a bit of
pg_upgrade cleverness could allow for upgrades to a newer scheme.)
The capabilities and restrictions of virtual generated columns are
mostly the same as for stored generated columns. In some cases, this
patch keeps virtual generated columns more restricted than they might
technically need to be, to keep the two kinds consistent. Some of
that could maybe be relaxed later after separate careful
considerations.
Some functionality that is currently not supported, but could possibly
be added as incremental features, some easier than others:
- index on or using a virtual column
- hence also no unique constraints on virtual columns
- extended statistics on virtual columns
- foreign-key constraints on virtual columns
- not-null constraints on virtual columns (check constraints are supported)
- ALTER TABLE / DROP EXPRESSION
- virtual column cannot have domain type
- virtual columns are not supported in logical replication
The tests in generated_virtual.sql have been copied over from
generated_stored.sql with the keyword replaced. This way we can make
sure the behavior is mostly aligned, and the differences can be
visible. Some tests for currently not supported features are
currently commented out.
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Tested-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/a368248e-69e4-40be-9c07-6c3b5880b0a6@eisentraut.org
If a column is omitted in an INSERT, and there's no column default,
the code in preptlist.c generates a NULL Const to be inserted.
Furthermore, if the column is of a domain type, we wrap the Const
in CoerceToDomain, so as to throw a run-time error if the domain
has a NOT NULL constraint. That's fine as far as it goes, but
there are two problems:
1. We're being sloppy about the type/typmod that the Const is
labeled with. It really should have the domain's base type/typmod,
since it's the input to CoerceToDomain not the output. This can
result in coerce_to_domain inserting a useless length-coercion
function (useless because it's being applied to a null). The
coercion would typically get const-folded away later, but it'd
be better not to create it in the first place.
2. We're not applying expression preprocessing (specifically,
eval_const_expressions) to the resulting expression tree.
The planner's primary expression-preprocessing pass already happened,
so that means the length coercion step and CoerceToDomain node miss
preprocessing altogether.
This is at the least inefficient, since it means the length coercion
and CoerceToDomain will actually be executed for each inserted row,
though they could be const-folded away in most cases. Worse, it
seems possible that missing preprocessing for the length coercion
could result in an invalid plan (for example, due to failing to
perform default-function-argument insertion). I'm not aware of
any live bug of that sort with core datatypes, and it might be
unreachable for extension types as well because of restrictions of
CREATE CAST, but I'm not entirely convinced that it's unreachable.
Hence, it seems worth back-patching the fix (although I only went
back to v14, as the patch doesn't apply cleanly at all in v13).
There are several places in the rewriter that are building null
domain constants the same way as preptlist.c. While those are
before the planner and hence don't have any reachable bug, they're
still applying a length coercion that will be const-folded away
later, uselessly wasting cycles. Hence, make a utility routine
that all of these places can call to do it right.
Making this code more careful about the typmod assigned to the
generated NULL constant has visible but cosmetic effects on some
of the plans shown in contrib/postgres_fdw's regression tests.
Discussion: https://postgr.es/m/1865579.1738113656@sss.pgh.pa.us
Backpatch-through: 14
This allows the RETURNING list of INSERT/UPDATE/DELETE/MERGE queries
to explicitly return old and new values by using the special aliases
"old" and "new", which are automatically added to the query (if not
already defined) while parsing its RETURNING list, allowing things
like:
RETURNING old.colname, new.colname, ...
RETURNING old.*, new.*
Additionally, a new syntax is supported, allowing the names "old" and
"new" to be changed to user-supplied alias names, e.g.:
RETURNING WITH (OLD AS o, NEW AS n) o.colname, n.colname, ...
This is useful when the names "old" and "new" are already defined,
such as inside trigger functions, allowing backwards compatibility to
be maintained -- the interpretation of any existing queries that
happen to already refer to relations called "old" or "new", or use
those as aliases for other relations, is not changed.
For an INSERT, old values will generally be NULL, and for a DELETE,
new values will generally be NULL, but that may change for an INSERT
with an ON CONFLICT ... DO UPDATE clause, or if a query rewrite rule
changes the command type. Therefore, we put no restrictions on the use
of old and new in any DML queries.
Dean Rasheed, reviewed by Jian He and Jeff Davis.
Discussion: https://postgr.es/m/CAEZATCWx0J0-v=Qjc6gXzR=KtsdvAE7Ow=D=mu50AgOe+pvisQ@mail.gmail.com
This enables SCRAM authentication for postgres_fdw when connecting to
a foreign server without having to store a plain-text password on user
mapping options.
This is done by saving the SCRAM ClientKey and ServeryKey from the
client authentication and using those instead of the plain-text
password for the server-side SCRAM exchange. The new foreign-server
or user-mapping option "use_scram_passthrough" enables this.
Co-authored-by: Matheus Alcantara <mths.dev@pm.me>
Co-authored-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/27b29a35-9b96-46a9-bc1a-914140869dac@gmail.com
Despite the best efforts of commit 0e5c82380, we're still seeing
occasional failures of postgres_fdw's query_cancel test in the
buildfarm. Investigation suggests that its 100ms timeout is
still not enough to reliably ensure that the remote side starts
the query before receiving the cancel request --- and if it
hasn't, it will just discard the request because it's idle.
We discussed allowing a cancel request to kill the next-received
query, but that would have wide and perhaps unpleasant side-effects.
What seems safer is to make postgres_fdw do what a human user would
likely do, which is issue another cancel request if the first one
didn't seem to do anything. We'll keep the same overall 30 second
grace period before concluding things are broken, but issue additional
cancel requests after 1 second, then 2 more seconds, then 4, then 8.
(The next one in series is 16 seconds, but we'll hit the 30 second
timeout before that.)
Having done that, revert the timeout in query_cancel.sql to 10 ms.
That will still be enough on most machines, most of the time, for
the remote query to start; but now we're intentionally risking the
race condition occurring sometimes in the buildfarm, so that the
repeat-cancel code path will get some testing.
As before, back-patch to v17. We might eventually contemplate
back-patching this further, and/or adding similar logic to dblink.
But given the lack of field complaints to date, this feels like
mostly an exercise in test case stabilization, so v17 is enough.
Discussion: https://postgr.es/m/colnv3lzzmc53iu5qoawynr6qq7etn47lmggqr65ddtpjliq5d@glkveb4m6nop
The new compact_attrs array stores a few select fields from
FormData_pg_attribute in a more compact way, using only 16 bytes per
column instead of the 104 bytes that FormData_pg_attribute uses. Using
CompactAttribute allows performance-critical operations such as tuple
deformation to be performed without looking at the FormData_pg_attribute
element in TupleDesc which means fewer cacheline accesses.
For some workloads, tuple deformation can be the most CPU intensive part
of processing the query. Some testing with 16 columns on a table
where the first column is variable length showed around a 10% increase in
transactions per second for an OLAP type query performing aggregation on
the 16th column. However, in certain cases, the increases were much
higher, up to ~25% on one AMD Zen4 machine.
This also makes pg_attribute.attcacheoff redundant. A follow-on commit
will remove it, thus shrinking the FormData_pg_attribute struct by 4
bytes.
Author: David Rowley
Reviewed-by: Andres Freund, Victor Yegorov
Discussion: https://postgr.es/m/CAApHDvrBztXP3yx=NKNmo3xwFAFhEdyPnvrDg3=M0RhDs+4vYw@mail.gmail.com
The topic of turning EXPLAIN's BUFFERS option on with the ANALYZE option
has come up a few times over the past few years. In many ways, doing this
seems like a good idea as it may be more obvious to users why a given
query is running more slowly than they might expect. Also, from my own
(David's) personal experience, I've seen users posting to the mailing
lists with two identical plans, one slow and one fast asking why their
query is sometimes slow. In many cases, this is due to additional reads.
Having BUFFERS on by default may help reduce some of these questions, and
if not, make it more obvious to the user before they post, or save a
round-trip to the mailing list when additional I/O effort is the cause of
the slowness.
The general consensus is that we want BUFFERS on by default with
ANALYZE. However, there were more than zero concerns raised with doing
so. The primary reason against is the additional verbosity, making it
harder to read large plans. Another concern was that buffer information
isn't always useful so may not make sense to have it on by default.
It's currently December, so let's commit this to see if anyone comes
forward with a strong objection against making this change. We have over
half a year remaining in the v18 cycle where we could still easily consider
reverting this if someone were to come forward with a convincing enough
reason as to why doing this is a bad idea.
There were two patches independently submitted to achieve this goal, one
by me and the other by Guillaume. This commit is a mix of both of these
patches with some additional work done by me to adjust various
additional places in the documentation which include EXPLAIN ANALYZE
output.
Author: Guillaume Lelarge, David Rowley
Reviewed-by: Robert Haas, Greg Sabino Mullane, Michael Christofides
Discussion: https://postgr.es/m/CANNMO++W7MM8T0KyXN3ZheXXt-uLVM3aEtZd+WNfZ=obxffUiA@mail.gmail.com
Like INT64_FORMAT and UINT64_FORMAT, these macros produce format
strings for 64-bit integers. However, INT64_HEX_FORMAT and
UINT64_HEX_FORMAT generate the output in hexadecimal instead of
decimal. Besides introducing these macros, this commit makes use
of them in several places. This was originally intended to be part
of commit 5d6187d2a2, but I left it out because I felt there was a
nonzero chance that back-patching these new macros into c.h could
cause problems with third-party code. We tend to be less cautious
with such changes in new major versions.
Note that UINT64_HEX_FORMAT was originally added in commit
ee1b30f128, but it was placed in test_radixtree.c, so it wasn't
widely available. This commit moves UINT64_HEX_FORMAT to c.h.
Discussion: https://postgr.es/m/ZwQvtUbPKaaRQezd%40nathan
Several places treat MyStartTime as a "long", which is only 32 bits
wide on some platforms. In reality, MyStartTime is a pg_time_t,
i.e., a signed 64-bit integer. This will lead to interesting bugs
on the aforementioned systems in 2038 when signed 32-bit integers
are no longer sufficient to store Unix time (e.g., "pg_ctl start"
hanging). To fix, ensure that MyStartTime is handled as a 64-bit
value everywhere. (Of course, users will need to ensure that
time_t is 64 bits wide on their system, too.)
Co-authored-by: Max Johnson
Discussion: https://postgr.es/m/CO1PR07MB905262E8AC270FAAACED66008D682%40CO1PR07MB9052.namprd07.prod.outlook.com
Backpatch-through: 12
This commit adds a "user_name" output column to
the postgres_fdw_get_connections function, returning the name
of the local user mapped to the foreign server for each connection.
If a public mapping is used, it returns "public."
This helps identify postgres_fdw connections more easily,
such as determining which connections are invalid, closed,
or used within the current transaction.
No extension version bump is needed, as commit c297a47c5f
already handled it for v18~.
Author: Hayato Kuroda
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/b492a935-6c7e-8c08-e485-3c1d64d7d10f@oss.nttdata.com
This test occasionally shows
+WARNING: could not get result of cancel request due to timeout
which appears to be because the cancel request is sometimes unluckily
sent to the remote session between queries, and then it's ignored.
This patch tries to make that less probable in three ways:
1. Use a test query that does not involve remote estimates, so that
no EXPLAINs are sent.
2. Make sure that the remote session is ready-to-go (transaction
started, SET commands sent) before we start the timer.
3. Increase the statement_timeout to 100ms, to give the local
session enough time to plan and issue the query.
We might have to go higher than 100ms to make this adequately
stable in the buildfarm, but let's see how it goes.
Back-patch to v17 where this test was introduced.
Jelte Fennema-Nio and Tom Lane
Discussion: https://postgr.es/m/578934.1725045685@sss.pgh.pa.us
Previously, when a path type was disabled by e.g. enable_seqscan=false,
we either avoided generating that path type in the first place, or
more commonly, we added a large constant, called disable_cost, to the
estimated startup cost of that path. This latter approach can distort
planning. For instance, an extremely expensive non-disabled path
could seem to be worse than a disabled path, especially if the full
cost of that path node need not be paid (e.g. due to a Limit).
Or, as in the regression test whose expected output changes with this
commit, the addition of disable_cost can make two paths that would
normally be distinguishible in cost seem to have fuzzily the same cost.
To fix that, we now count the number of disabled path nodes and
consider that a high-order component of both the startup cost and the
total cost. Hence, the path list is now sorted by disabled_nodes and
then by total_cost, instead of just by the latter, and likewise for
the partial path list. It is important that this number is a count
and not simply a Boolean; else, as soon as we're unable to respect
disabled path types in all portions of the path, we stop trying to
avoid them where we can.
Because the path list is now sorted by the number of disabled nodes,
the join prechecks must compute the count of disabled nodes during
the initial cost phase instead of postponing it to final cost time.
Counts of disabled nodes do not cross subquery levels; at present,
there is no reason for them to do so, since the we do not postpone
path selection across subquery boundaries (see make_subplan).
Reviewed by Andres Freund, Heikki Linnakangas, and David Rowley.
Discussion: http://postgr.es/m/CA+TgmoZ_+MS+o6NeGK2xyBv-xM+w1AfFVuHE4f_aq6ekHv7YSQ@mail.gmail.com
When pg_dump retrieves the list of database objects and performs the
data dump, there was possibility that objects are replaced with others
of the same name, such as views, and access them. This vulnerability
could result in code execution with superuser privileges during the
pg_dump process.
This issue can arise when dumping data of sequences, foreign
tables (only 13 or later), or tables registered with a WHERE clause in
the extension configuration table.
To address this, pg_dump now utilizes the newly introduced
restrict_nonsystem_relation_kind GUC parameter to restrict the
accesses to non-system views and foreign tables during the dump
process. This new GUC parameter is added to back branches too, but
these changes do not require cluster recreation.
Back-patch to all supported branches.
Reviewed-by: Noah Misch
Security: CVE-2024-7348
Backpatch-through: 12
The buildfarm member "hake" reported a failure in the regression test
added by commit 857df3cef7, where postgres_fdw_get_connections(true)
returned unexpected results.
The function postgres_fdw_get_connections(true) checks
if a connection is closed by using POLLRDHUP in the requested events
and calling poll(). Previously, the function only considered
POLLRDHUP or 0 as valid returned events. However, poll() can also
return POLLHUP, POLLERR, and/or POLLNVAL. So if any of these events
were returned, postgres_fdw_get_connections(true) would report
incorrect results. postgres_fdw_get_connections(true) failed to
account for these return events.
This commit updates postgres_fdw_get_connections(true) to correctly
report a closed connection when poll() returns not only POLLRDHUP
but also POLLHUP, POLLERR, or POLLNVAL.
Discussion: https://postgr.es/m/fd8f6186-9e1e-4b9a-92c5-e71e3697d381@oss.nttdata.com
This commit extends the postgres_fdw_get_connections() function
to check if connections are closed. This is useful for detecting closed
postgres_fdw connections that could prevent successful transaction
commits. Users can roll back transactions immediately upon detecting
closed connections, avoiding unnecessary processing of failed
transactions.
This feature is available only on systems supporting the non-standard
POLLRDHUP extension to the poll system call, including Linux.
Author: Hayato Kuroda
Reviewed-by: Shinya Kato, Zhihong Yu, Kyotaro Horiguchi, Andres Freund
Reviewed-by: Onder Kalaci, Takamichi Osumi, Vignesh C, Tom Lane, Ted Yu
Reviewed-by: Katsuragi Yuta, Peter Smith, Shubham Khanna, Fujii Masao
Discussion: https://postgr.es/m/TYAPR01MB58662809E678253B90E82CE5F5889@TYAPR01MB5866.jpnprd01.prod.outlook.com
This commit extends the postgres_fdw_get_connections() function to
include a new used_in_xact column, indicating whether each connection
is used in the current transaction.
This addition is particularly useful for the upcoming feature that
will check if connections are closed. By using those information,
users can verify if postgres_fdw connections used in a transaction
remain open. If any connection is closed, the transaction cannot
be committed successfully. In this case users can roll back it
immediately without waiting for transaction end.
The SQL API for postgres_fdw_get_connections() is updated by
this commit and may change in the future. To handle compatibility
with older SQL declarations, an API versioning system is introduced,
allowing the function to behave differently based on the API version.
Author: Hayato Kuroda
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/be9382f7-5072-4760-8b3f-31d6dffa8d62@oss.nttdata.com
This allows us to skip it in Cygwin, where it's reportedly flaky because
of platform bugs or something.
Backpatch to 17, where the test was introduced by commit 2466d6654f85.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/e4d0cb33-6be5-e4d5-ae49-9eac3ff2b005@gmail.com
Commit d844cd75a disallowed rewind in a non-scrollable cursor to resolve
anomalies arising from such a cursor operation. However, this failed to
take into account the assumption in postgres_fdw that when rescanning a
foreign relation, it can rewind the cursor created for scanning the
foreign relation without specifying the SCROLL option, regardless of its
scrollability, causing this error when it tried to do such a rewind in a
non-scrollable cursor. Fix by modifying postgres_fdw to instead
recreate the cursor, regardless of its scrollability, when rescanning
the foreign relation. (If we had a way to check its scrollability, we
could improve this by rewinding it if it is scrollable and recreating it
if not, but we do not have it, so this commit modifies it to recreate it
in any case.)
Per bug #17889 from Eric Cyr. Devrim Gunduz also reported this problem.
Back-patch to v15 where that commit enforced the prohibition.
Reviewed by Tom Lane.
Discussion: https://postgr.es/m/17889-e8c39a251d258dda%40postgresql.org
Discussion: https://postgr.es/m/b415ac3255f8352d1ea921cf3b7ba39e0587768a.camel%40gunduz.org
If we intend to generate a Memoize node on top of a path, we need
cache keys of some sort. Currently we search for the cache keys in
the parameterized clauses of the path as well as the lateral_vars of
its parent. However, it turns out that this is not sufficient because
there might be lateral references derived from PlaceHolderVars, which
we fail to take into consideration.
This oversight can cause us to miss opportunities to utilize the
Memoize node. Moreover, in some plans, failing to recognize all the
cache keys could result in performance regressions. This is because
without identifying all the cache keys, we would need to purge the
entire cache every time we get a new outer tuple during execution.
This patch fixes this issue by extracting lateral Vars from within
PlaceHolderVars and subsequently including them in the cache keys.
In passing, this patch also includes a comment clarifying that Memoize
nodes are currently not added on top of join relation paths. This
explains why this patch only considers PlaceHolderVars that are due to
be evaluated at baserels.
Author: Richard Guo
Reviewed-by: Tom Lane, David Rowley, Andrei Lepikhov
Discussion: https://postgr.es/m/CAMbWs48jLxn0pAPZpJ50EThZ569Xrw+=4Ac3QvkpQvNszbeoNg@mail.gmail.com
Hash joins can support semijoin with the LHS input on the right, using
the existing logic for inner join, combined with the assurance that only
the first match for each inner tuple is considered, which can be
achieved by leveraging the HEAP_TUPLE_HAS_MATCH flag. This can be very
useful in some cases since we may now have the option to hash the
smaller table instead of the larger.
Merge join could likely support "Right Semi Join" too. However, the
benefit of swapping inputs tends to be small here, so we do not address
that in this patch.
Note that this patch also modifies a test query in join.sql to ensure it
continues testing as intended. With this patch the original query would
result in a right-semi-join rather than semi-join, compromising its
original purpose of testing the fix for neqjoinsel's behavior for
semi-joins.
Author: Richard Guo
Reviewed-by: wenhui qiu, Alena Rybakina, Japin Li
Discussion: https://postgr.es/m/CAMbWs4_X1mN=ic+SxcyymUqFx9bB8pqSLTGJ-F=MHy4PW3eRXw@mail.gmail.com
Previously, when considering LIMIT pushdown, postgres_fdw failed to
check whether the query has this clause, which led to pushing false
LIMIT clauses, causing incorrect results.
This clause has been supported since v13, so we need to do a
remote-version check before deciding that it will be safe to push such a
clause, but we do not currently have a way to do the check (without
accessing the remote server); disable pushing such a clause for now.
Oversight in commit 357889eb1. Back-patch to v13, where that commit
added the support.
Per bug #18467 from Onder Kalaci.
Patch by Japin Li, per a suggestion from Tom Lane, with some changes to
the comments by me. Review by Onder Kalaci, Alvaro Herrera, and me.
Discussion: https://postgr.es/m/18467-7bb89084ff03a08d%40postgresql.org