Compare commits

...

2755 Commits

Author SHA1 Message Date
Richard Guo
5069fef1cf Expand virtual generated columns for ALTER COLUMN TYPE
For the subcommand ALTER COLUMN TYPE of the ALTER TABLE command, the
USING expression may reference virtual generated columns.  These
columns must be expanded before the expression is fed through
expression_planner and the expression-execution machinery.  Failing to
do so can result in incorrect rewrite decisions, and can also lead to
"ERROR:  unexpected virtual generated column reference".

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/b5f96b24-ccac-47fd-9e20-14681b894f36@gmail.com
2025-06-26 12:17:12 +09:00
Peter Eisentraut
62a47aea1d doc: Some copy-editing around constraint validation and enforcement
Author: Robert Treat <rob@xzilla.net>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CACJufxFo4yTwzbSZrP%2BzQiR6_M00skoZMFaUnNJCdY6he%3DuQfA%40mail.gmail.com
2025-06-25 12:46:16 +02:00
Peter Eisentraut
60dda7bbc4 pg_createsubscriber: Rename option --remove to --clean
After discussion, the name --remove was suboptimally chosen.  --clean
has more precedent in other PostgreSQL tools.

Reviewed-by: Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com>
Discussion: https://www.postgresql.org/message-id/84be7ff3-2763-4c0f-ac1e-ca9862077f41@eisentraut.org
2025-06-25 10:50:43 +02:00
Peter Eisentraut
0cd69b3d7e Restrict virtual columns to use built-in functions and types
Just like selecting from a view is exploitable (CVE-2024-7348),
selecting from a table with virtual generated columns is exploitable.
Users who are concerned about this can avoid selecting from views, but
telling them to avoid selecting from tables is less practical.

To address this, restrict the generation expressions of virtual
generated columns to using built-in functions and types, and restrict
the columns themselves to having a built-in type.
We assume that built-in functions and types cannot be exploited for
this purpose.
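
For example, a definition along these lines is now rejected (function
and table names are hypothetical):

    CREATE FUNCTION my_inc(i int) RETURNS int LANGUAGE sql IMMUTABLE
        RETURN i + 1;
    -- rejected: the generation expression uses a non-built-in function
    CREATE TABLE t (a int,
                    b int GENERATED ALWAYS AS (my_inc(a)) VIRTUAL);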

In the future, this could be expanded by some new mechanism to declare
other functions and types as safe or trusted for this purpose, but
that is to be designed.

(An alternative approach might have been to expand the
restrict_nonsystem_relation_kind GUC to handle this, like the fix for
CVE-2024-7348.  But that is kind of an ugly approach.  That fix had to
fit in the constraints of fixing an ancient vulnerability in all
branches.  Since virtual generated columns are new, we're free from
the constraints of the past, and we can and should use cleaner
options.)

Reported-by: Feike Steenbergen <feikesteenbergen@gmail.com>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAK_s-G2Q7de8Q0qOYUR%3D_CTB5FzzVBm5iZjOp%2BmeVWpMpmfO0w%40mail.gmail.com
2025-06-25 09:56:49 +02:00
Amit Kapila
69e5cdc47f Doc: Improve documentation of stream abort.
Protocol v4 introduces parallel streaming, which allows Stream Abort
messages to include additional abort information such as LSN and
timestamp. However, the current documentation only states, "This field is
available since protocol version 4," which may misleadingly suggest that
the fields are always present when using protocol v4.

This patch clarifies that the abort LSN and timestamp are included only
when parallel streaming is enabled, even under protocol v4.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 16, where it was introduced
Discussion: https://postgr.es/m/CAO6_XqoKteQR1AnaR8iPcegbBE+HkAc2-g12rxN04yOt4-2ORg@mail.gmail.com
2025-06-25 10:25:15 +05:30
Michael Paquier
661643deda Avoid scribbling of VACUUM options
This fixes two issues with the handling of VacuumParams in vacuum_rel().
This code path updates the passed-in VacuumParams to reflect the
"truncate" and "index_cleanup" options of the relation being worked on.
Because a single VacuumParams pointer can be shared across multiple
relations, incorrect options could be used in the two following
scenarios (illustrated below):
- Multiple relations in a single VACUUM command.
- TOAST relations vacuumed with their main relation.
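
For example, with hypothetical tables t1 and t2 in a single command:

    ALTER TABLE t1 SET (vacuum_index_cleanup = off);
    -- t2 could previously pick up the "index_cleanup" value derived
    -- from t1's reloption, because the same VacuumParams was reused.
    VACUUM t1, t2;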

The problem is avoided by having the two callers of vacuum_rel() pass
in copies of VacuumParams, made before the pointer is updated for the
"truncate" and "index_cleanup" options.

The refactoring of the VACUUM options and parameters done in 0d831389749a
did not introduce the issue itself, but it paved the way for the problem
dealt with in this commit, with b84dbc8eb80b for "truncate" and
a96c41feec6b for "index_cleanup" being added a couple of years
after the initial refactoring.  HEAD will be improved with a different
patch that hardens the uses of VacuumParams across the tree.  This
cannot be backpatched as it introduces an ABI breakage.

The backend portion of the patch has been authored by Nathan, while I
have implemented the tests.  The tests rely on injection points to check
the option values, making them faster and more reliable than the tests
originally proposed by Shihao, while also providing more coverage.
This part can only be backpatched down to v17.

Reported-by: Shihao Zhong <zhong950419@gmail.com>
Author: Nathan Bossart <nathandbossart@gmail.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAGRkXqTo+aK=GTy5pSc-9cy8H2F2TJvcrZ-zXEiNJj93np1UUw@mail.gmail.com
Backpatch-through: 13
2025-06-25 10:03:46 +09:00
Fujii Masao
82015fd9bd doc: Fix type description of io_workers GUC for consistency.
The documentation previously described the type of the io_workers GUC
parameter as "int". However, the documentation consistently uses "integer"
for parameters of this type.

This commit updates the type description of io_workers to "integer"
for consistency with other GUC parameter descriptions.

Author: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/e95c899a-2aeb-45b7-8fd3-7a27dcdb475b@oss.nttdata.com
2025-06-25 09:02:31 +09:00
Fujii Masao
a9c2bde929 doc: Mention ANALYZE VERBOSE in track_cost_delay_timing description.
The documentation for track_cost_delay_timing describes where cost-based
vacuum delay timing information is displayed when the setting is enabled.
While this information is also shown in the output of ANALYZE VERBOSE,
that was previously omitted from the list.

This commit updates the documentation to include ANALYZE VERBOSE in the list,
clarifying that it also reports cost-based delay timing information.
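
For example (hypothetical table; cost-based delays must be enabled for
any timing to be collected):

    SET track_cost_delay_timing = on;
    SET vacuum_cost_delay = 1;
    ANALYZE VERBOSE t;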

Author: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/e95c899a-2aeb-45b7-8fd3-7a27dcdb475b@oss.nttdata.com
2025-06-25 09:01:13 +09:00
Fujii Masao
84c4e10e13 doc: Add secondary index entries for vacuum-related parameters.
For parameters that exist as both configuration and storage options,
the documentation typically includes secondary index entries to
help users distinguish and locate the relevant references easily.

However, such index entries were missing for vacuum_truncate and
vacuum_max_eager_freeze_failure_rate, both introduced in v18.

This commit adds appropriate secondary index terms for these parameters
to ensure consistency with other parameters and improve usability of
the documentation index.

Author: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/e95c899a-2aeb-45b7-8fd3-7a27dcdb475b@oss.nttdata.com
2025-06-25 08:59:25 +09:00
Tom Lane
fd519419c9 Prevent excessive delays before launching new logrep workers.
The logical replication launcher process would sometimes sleep
for as much as 3 minutes before noticing that it is supposed
to launch a new worker.  This could happen if
(1) WaitForReplicationWorkerAttach absorbed a process latch wakeup
that was meant to cause ApplyLauncherMain to do work, or
(2) logicalrep_worker_launch reported failure, either because of
resource limits or because the new worker terminated immediately.

In case (2), the expected behavior is that we retry the launch after
wal_retrieve_retry_interval, but that didn't reliably happen.

It's not clear how often such conditions would occur in the field,
but in our subscription test suite they are somewhat common,
especially in tests that exercise cases that cause quick worker
failure.  That causes the tests to take substantially longer than
they ought to do on typical setups.

To fix (1), make WaitForReplicationWorkerAttach re-set the latch
before returning if it cleared it while looping.  To fix (2), ensure
that we reduce wait_time to no more than wal_retrieve_retry_interval
when logicalrep_worker_launch reports failure.  In passing, fix a
couple of perhaps-hypothetical race conditions, e.g. examining
worker->in_use without a lock.

Backpatch to v16.  Problem (2) didn't exist before commit 5a3a95385
because the previous code always set wait_time to
wal_retrieve_retry_interval when launching a worker, regardless of
success or failure of the launch.  That behavior also greatly
mitigated problem (1), so I'm not excited about adapting the remainder
of the patch to the substantially-different code in older branches.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/817604.1750723007@sss.pgh.pa.us
Backpatch-through: 16
2025-06-24 14:14:07 -04:00
Álvaro Herrera
c2da1a5d63
Make query jumbling also squash PARAM_EXTERN params
Commit 62d712ecfd94 made query jumbling squash lists of Consts as a
single element, but there's no reason not to treat PARAM_EXTERN
parameters the same.  For these purposes, these values are indeed
constants for any particular execution of a query.

In particular, this should make list squashing more useful for
applications using extended query protocol, which would use parameters
extensively.

A complication arises: if a query has both external parameters and
squashable lists, then the parameter number used as placeholder for the
squashed list might be inconsistent with regard to the parameter
numbers used by the query literal.  To reduce the surprise factor, all
parameters are renumbered starting from 1 in that case.
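
As an illustration, a query sent via the extended query protocol with an
IN list of external parameters (hypothetical table):

    SELECT * FROM t WHERE id IN ($1, $2, $3);
    -- the parameter list is now squashed for jumbling purposes,
    -- just as a list of constants would be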

Author: Sami Imseih <samimseih@gmail.com>
Author: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAA5RZ0tRXoPG2y6bMgBCWNDt0Tn=unRerbzYM=oW0syi1=C1OA@mail.gmail.com
2025-06-24 19:36:32 +02:00
Álvaro Herrera
debad29d22
Improve jumble squashing through CoerceViaIO and RelabelType
There's no principled reason for query jumbling to only remove the first
layer of RelabelType and CoerceViaIO.  Change it to see through as many
layers as there are.
2025-06-24 19:36:12 +02:00
Melanie Plageman
303ba0573c Test that vacuum removes tuples older than OldestXmin
If vacuum fails to prune a tuple killed before OldestXmin, it will
decide to freeze its xmax and later error out in pre-freeze checks.

Add a test reproducing this scenario to the recovery suite which creates
a table on a primary, updates the table to generate dead tuples for
vacuum, and then, during the vacuum, uses a replica to force
GlobalVisState->maybe_needed on the primary to move backwards and
precede the value of OldestXmin set at the beginning of vacuuming the
table.

This test is coverage for a case fixed in 83c39a1f7f3. The test was
originally committed to master in aa607980aee but later reverted in
efcbb76efe4 due to test instability.

The test requires multiple index passes. In Postgres 17+, vacuum uses a
TID store for the dead TIDs that is very space efficient. With the old
minimum maintenance_work_mem of 1 MB, it required a large number of dead
rows to generate enough dead TIDs to force multiple index
vacuuming passes. Once the source code changes were made to allow a
minimum maintenance_work_mem value of 64kB, the test could be made much
faster and more stable.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAAKRu_ZJBkidusDut6i%3DbDCiXzJEp93GC1%2BNFaZt4eqanYF3Kw%40mail.gmail.com
Backpatch-through: 17
2025-06-24 09:20:16 -04:00
Daniel Gustafsson
054beebb7c doc: Remove dead link to NewbieDoc Docbook Guide
The link returns 404 and no replacement is available in the project
on Sourceforge where the content once was. Since we already link to
resources for both beginner and experienced docs hackers, remove the
dead link.

Backpatch to all supported versions as the link was added in 8.1.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reported-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxH=YzQPDOe+2WuYZ7seD-BOyjCBmP6JiErpoSiVZWDRnw@mail.gmail.com
Backpatch-through: 13
2025-06-24 11:49:37 +02:00
Peter Eisentraut
49fe1c83ec Fix virtual generated column type checking for ALTER TABLE
Virtual generated columns have some special checks in
CheckAttributeType(), mainly to check that domains are not used.  But
this check was only applied during CREATE TABLE, not during ALTER
TABLE.  This fixes that.

Reported-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/CACJufxE0KHR__-h=zHXbhSNZXMMs4LYo4-dbj8H3YoStYBok1Q@mail.gmail.com
2025-06-24 11:31:26 +02:00
Fujii Masao
0cb5145a32 doc: Fix incorrect UUID index entry in function documentation.
Previously, the UUID functions documentation defined the "UUID" index entry
to link to the UUID data type page, even though that entry already exists there.
Instead, the UUID functions page should define its own index entry linking
to itself.

This commit updates the UUID index entry in the UUID functions documentation
to point to the correct section, improving navigation and avoiding duplication.

Back-patch to all supported versions.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/f33e0493-5773-4296-87c5-7ce459054cfe@oss.nttdata.com
Backpatch-through: 13
2025-06-24 14:21:10 +09:00
Amit Kapila
6531f36283 Fix missing comment update in 1462aad2e4.
Remove the part of the comment that says we don't allow toggling the
two_phase option, as that is now supported as of commit 1462aad2e4.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Author: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/OSCPR01MB1496656725F3951AEE8749EBDF579A@OSCPR01MB14966.jpnprd01.prod.outlook.com
2025-06-24 09:51:07 +05:30
Michael Paquier
fc39b286ad psql: Rename meta-command \close to \close_prepared
\close was introduced in d55322b0da60 to allow closing a prepared
statement using the extended protocol in psql.  Per discussion, the name
"close" is ambiguous.  At the SQL level, CLOSE is used to close a
cursor.  At the protocol level, the Close message can be used to close
either a statement or a portal.

This patch renames \close to \close_prepared to avoid any ambiguity and
make it clear that this is used to close a prepared statement.  This new
name has been chosen based on the feedback from the author and the
reviewers.
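
A minimal psql session using the new name might look like this (a
sketch, not taken verbatim from the documentation):

    SELECT $1 \parse stmt1
    \bind_named stmt1 42 \g
    \close_prepared stmt1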

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/3e694442-0df5-4f92-a08f-c5d4c4346b85@eisentraut.org
2025-06-24 13:12:46 +09:00
Alexander Korotkov
f3ed72ca07 Temporarily remove 046_checkpoint_logical_slot.pl
This new test was intended to check the handling of the replication slot's
restart lsn fixed in ca307d5cec90.  However, it also reveals another issue
related to logical decoding.  This commit temporarily removes this test to
keep the buildfarm and CFbot green and avoid distorting others' work.  This
test will be restored once we investigate and fix the issue.

Discussion: https://postgr.es/m/CAAKRu_ZCOzQpEumLFgG_%2Biw3FTa%2BhJ4SRpxzaQBYxxM_ZAzWcA%40mail.gmail.com
2025-06-23 21:33:50 +03:00
Alexander Korotkov
70d8a91f82 Remove excess assert from InvalidatePossiblyObsoleteSlot()
ca307d5cec90 introduced keeping WAL segments by slot's last saved restart LSN.
It also added an assertion that the slot's restart LSN never goes backward.
However, situations when the restart LSN goes backward have been spotted by
buildfarm animals and investigated in the thread.

When pg_receivewal starts the replication, it sets the last replayed LSN to
the beginning of the segment, which is older than what
ReplicationSlotReserveWal() set for the slot.  A similar situation can happen
to pg_basebackup.  When standby reconnects to the primary, it sends the last
replayed LSN, which might be older than the last confirmed flush LSN.  In
both these situations, a concurrent checkpoint may trigger an assert trap.

Based on ideas from Vitaly Davydov <v.davydov@postgrespro.ru>,
Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com>,
Vignesh C <vignesh21@gmail.com>,
Amit Kapila <amit.kapila16@gmail.com>.

Reported-by: Vignesh C <vignesh21@gmail.com>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CALDaNm3s-jpQTe1MshsvQ8GO%3DTLj233JCdkQ7uZ6pwqRVpxAdw%40mail.gmail.com
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
2025-06-23 21:27:42 +03:00
Tom Lane
ccd5bc93fd Include _mm512_zextsi128_si512() in AVX-512 configure probes.
Commit 43da39430 added a dependency on this intrinsic to our
AVX-512 CRC code.  It turns out this intrinsic was added to
gcc later than the other ones we were using, so that there
are platforms where the new code fails to compile.  Since only
relatively old (pre-gcc-10) compilers are affected, it doesn't
seem worth trying to make the AVX-512 CRC code actually work
on these platforms.  Just add the new intrinsic to the configure
probe, so that we'll conclude the code can't be built.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/3350336.1750690281@sss.pgh.pa.us
2025-06-23 11:50:21 -04:00
John Naylor
43da394304 Properly fix AVX-512 CRC calculation bug
The problem that led to the workaround in f83f14881c7 was not in fact
a compiler bug, but a failure to zero the upper bits of the vector
register containing the initial scalar CRC value. Fix that and revert
the workaround.

Diagnosed-by: Nathan Bossart <nathandbossart@gmail.com>
Diagnosed-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Tested-by: Andy Fan <zhihuifan1213@163.com>
Tested-by: Soumyadeep Chakraborty <soumyadeep2007@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Discussion: https://postgr.es/m/PH8PR11MB82866B07AA6758D12F699C00FB70A@PH8PR11MB8286.namprd11.prod.outlook.com
2025-06-23 18:03:56 +07:00
Peter Eisentraut
2c0d8b9508 meson: Fix meson warning
WARNING: You should add the boolean check kwarg to the run_command call.
             It currently defaults to false,
             but it will default to true in meson 2.0.

Introduced by commit bc46104fc9a.

(This only happens in the msvc branch.  All the other run_command
calls are ok.)

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/42e13eb0-862a-441e-8d84-4f0fd5f6def0%40eisentraut.org
2025-06-22 14:13:46 +02:00
Tom Lane
ea06263c4a Doc: improve documentation about width_bucket().
Specify whether the bucket bounds are inclusive or exclusive,
and improve some other vague language.  Explain the behavior that
occurs when the "low" bound is greater than the "high" bound.
Make width_bucket_numeric's comment more like that for
width_bucket_float8, in particular noting that infinite
bounds are rejected (since they became possible in v14).
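
A quick sketch of the documented behavior:

    SELECT width_bucket(5.35, 0.024, 10.06, 5);   -- returns 3
    SELECT width_bucket(11, 0, 10, 5);            -- above "high": returns 6
    SELECT width_bucket(-1, 0, 10, 5);            -- below "low": returns 0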

Reported-by: Ben Peachey Higdon <bpeacheyhigdon@gmail.com>
Author: Robert Treat <rob@xzilla.net>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/2BD74F86-5B89-4AC1-8F13-23CED3546AC1@gmail.com
Backpatch-through: 13
2025-06-21 12:52:37 -04:00
Bruce Momjian
fa638edc74 doc PG 18 relnotes: update to current, add one commit 2025-06-20 23:53:15 -04:00
Bruce Momjian
d21cf31f7c doc PG 18 relnotes: indent tag blocks 2025-06-20 23:37:30 -04:00
Bruce Momjian
fed7aa8f56 doc PG 18 relnotes: add remaining missing link tags 2025-06-20 22:44:42 -04:00
Tom Lane
a16ef313f2 Remove planner's have_dangerous_phv() join-order restriction.
Commit 85e5e222b, which added (a forerunner of) this logic,
argued that

    Adding the necessary complexity to make this work doesn't seem like
    it would be repaid in significantly better plans, because in cases
    where such a PHV exists, there is probably a corresponding join order
    constraint that would allow a good plan to be found without using the
    star-schema exception.

The flaw in this claim is that there may be other join-order
restrictions that prevent us from finding a join order that doesn't
involve a "dangerous" PHV.  In particular we now recognize that
small join_collapse_limit or from_collapse_limit could prevent it.
Therefore, let's bite the bullet and make the case work.

We don't have to extend the executor's support for nestloop parameters
as I thought at the time, because we can instead push the evaluation
of the placeholder's expression into the left-hand input of the
NestLoop node.  So there's not really a lot of downside to this
solution, and giving the planner more join-order flexibility should
have value beyond just avoiding failure.

Having said that, there surely is a nonzero risk of introducing
new bugs.  Since this failure mode escaped detection for ten years,
such cases don't seem common enough to justify a lot of risk.
Therefore, let's put this fix into master but leave the back branches
alone (for now anyway).

Bug: #18953
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Diagnosed-by: Richard Guo <guofenglinux@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18953-1c9883a9d4afeb30@postgresql.org
2025-06-20 15:55:12 -04:00
Tom Lane
5861b1f343 Use SnapshotDirty when checking for conflicting index names.
While choosing an autogenerated name for an index, look for
pre-existing relations using a SnapshotDirty snapshot, instead of the
previous behavior that considered only committed-good pg_class rows.
This allows us to detect and avoid conflicts against indexes that are
still being built.
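
A sketch of the scenario (hypothetical table name):

    -- Session 1, in a transaction that has not yet committed:
    BEGIN;
    CREATE INDEX ON big_tab (a);   -- picks the name "big_tab_a_idx"

    -- Session 2, concurrently:
    CREATE INDEX ON big_tab (a);   -- now sees the in-progress index via
                                   -- SnapshotDirty and chooses another name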

It's still possible to fail due to a race condition, but the window
is now just the amount of time that it takes DefineIndex to validate
all its parameters, call smgrcreate(), and enter the index's pg_class
row.  Formerly the race window covered the entire time needed to
create and fill an index, which could be very long if the table is
large.  Worse, if the conflicting index creation is part of a larger
transaction, it wouldn't be visible till COMMIT.

So this isn't a complete solution, but it should greatly ameliorate
the problem, and the patch is simple enough to be back-patchable.

It might at some point be useful to do the same for pg_constraint
entries (cf. ChooseConstraintName, ConstraintNameExists, and related
functions).  However, in the absence of field complaints, I'll leave
that alone for now.  The relation-name test should be good enough for
index-based constraints, while foreign-key constraints seem to be okay
since they require exclusive locks to create.

Bug: #18959
Reported-by: Maximilian Chrzan <maximilian.chrzan@here.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Discussion: https://postgr.es/m/18959-f63b53b864bb1417@postgresql.org
Backpatch-through: 13
2025-06-20 13:41:11 -04:00
Tom Lane
2f6e240d7a pgxs.mk: remove unreachable rule for deleting regress.def.
We never create regress.def, and if we did this code would fail to
delete it, because "win" is not the correct PORTNAME for Windows.

This thinko seems to have originated in commit 7a6b562fd from 1999,
although it got moved around multiple times since then.

Author: Christoph Berg <myon@debian.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/aFVR7R7VDX7y2ruc@msg.df7cb.de
2025-06-20 12:12:29 -04:00
Alexander Korotkov
4464fddf7b Improve runtime and output of tests for replication slots checkpointing.
The TAP tests that verify logical and physical replication slot behavior
during checkpoints (046_checkpoint_logical_slot.pl and
047_checkpoint_physical_slot.pl) inserted two batches of 2 million rows each,
generating approximately 520 MB of WAL.  On slow machines, or when compiled
with '-DRELCACHE_FORCE_RELEASE -DCATCACHE_FORCE_RELEASE', this caused the
tests to run for 8-9 minutes and occasionally time out, as seen on the
buildfarm animal prion.

This commit modifies the mentioned tests to use the $node->advance_wal()
function, thereby reducing runtime.  Since the generated data is not
actually used, advancing the WAL this way is a good alternative and cuts
the total wall-clock run time.

While here, remove superfluous '\n' characters from several note() calls;
these appeared literally in the buildfarm logs and looked odd.  Also, remove
the unnecessary 'shared_preload_libraries' GUC from the config and add a
check for the availability of the 'injection_points' extension.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Vitaly Davydov <v.davydov@postgrespro.ru>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/fbc5d94e-6fbd-4a64-85d4-c9e284a58eb2%40gmail.com
Backpatch-through: 17
2025-06-20 01:41:28 +03:00
Bruce Momjian
a8360f074c doc PG 18 relnotes: add links to command and struct tags 2025-06-19 17:14:20 -04:00
Jeff Davis
6c29088fc6 Correct docs about partitions and EXCLUDE constraints.
In version 17 we added support for cross-partition EXCLUDE
constraints, as long as they included all partition key columns and
compared them with equality (see 8c852ba9a4). I updated the docs for
exclusion constraints, but I missed that the docs for CREATE TABLE
still said that they were not supported. This commit fixes that.
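
A sketch of the sort of definition allowed as of v17 under those rules
(requires the btree_gist extension; names are hypothetical):

    CREATE EXTENSION btree_gist;
    CREATE TABLE reservations (
        room   int,
        during tsrange,
        EXCLUDE USING gist (room WITH =, during WITH &&)
    ) PARTITION BY LIST (room);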

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Co-authored-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/c955d292-b92d-42d1-a2a0-1ec6715a2546@illuminatedcomputing.com
Backpatch-through: 17
2025-06-19 12:43:27 -07:00
Bruce Momjian
ed117c4c6c doc PG 18 relnotes: add links for applications 2025-06-19 11:59:00 -04:00
Bruce Momjian
d8aa21b74f doc: add xreflabel text for libpq and PL/Python
to be used for PG 18 release notes
2025-06-19 11:51:12 -04:00
Peter Eisentraut
dec6643487 Improve pg_dump/pg_dumpall help synopses and terminology
Increase consistency of --help and man page synopses between pg_dump
and pg_dumpall.  These should now be very similar, as pg_dumpall can
now also produce non-text dump output.  But actually, they had drifted
further apart.

- Use verb "export" consistently, instead of "dump" or "extract".
- Use "SQL script" instead of just "script" or "text file".
- Maintain consistent distinction between SQL script and other
  formats/archives (which is relevant for pg_restore).

Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://www.postgresql.org/message-id/flat/3f71d8a7-095b-4829-9b0b-fce09e9866b3%40eisentraut.org
2025-06-19 13:57:16 +02:00
Amit Kapila
1546e17f9d Improve log messages and docs for slot synchronization.
Improve the clarity of LOG messages when a failover logical slot
synchronization fails, making the reasons more explicit for easier
debugging.

Update the documentation to outline scenarios where slot synchronization
can fail, especially during the initial sync, and emphasize that
pg_sync_replication_slot() is primarily intended for testing and
debugging purposes.

We also discussed improving the functionality of
pg_sync_replication_slot() so that it can be used reliably, but we will
take up that work in the next version after some more discussion and review.

Reported-by: Suraj Kharage <suraj.kharage@enterprisedb.com>
Author: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 17, where it was introduced
Discussion: https://postgr.es/m/CAF1DzPWTcg+m+x+oVVB=y4q9=PYYsL_mujVp7uJr-_oUtWNGbA@mail.gmail.com
2025-06-19 09:48:08 +05:30
Bruce Momjian
a03805920b doc PG 18 relnotes: add links for server variables 2025-06-18 21:20:04 -04:00
Fujii Masao
db0c93f172 doc: Mention GIN indexes support parallel builds.
Commit 8492feb98f6 added support for parallel CREATE INDEX on GIN indexes.
However, previously two places in the documentation and two in the source
code comments still stated that only B-tree and BRIN indexes support
parallel builds.

This commit updates those references to correctly include GIN indexes.
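
For example, a GIN build like the following can now use parallel
workers, subject to the usual parallel-maintenance settings
(hypothetical table):

    SET max_parallel_maintenance_workers = 4;
    CREATE INDEX ON docs USING gin (to_tsvector('english', body));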

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://postgr.es/m/7d27d068-90e2-4022-9bd7-09b0fd3d4f47@oss.nttdata.com
2025-06-19 09:12:34 +09:00
Fujii Masao
b57d707708 doc: Fix incorrect description of INCLUDING COMMENTS in CREATE FOREIGN TABLE.
Commit 302cf157592 added support for LIKE in CREATE FOREIGN TABLE.
In this feature, since indexes are not created for foreign tables,
comments on indexes are not copied either.

However, the documentation incorrectly stated that index comments
would be copied when using INCLUDING COMMENTS. This commit
corrects that by removing the mention of index comments.
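
For example (hypothetical server and source table), the following copies
column definitions and column comments, but no index comments, since no
indexes are created:

    CREATE FOREIGN TABLE ft (LIKE src_tbl INCLUDING COMMENTS)
        SERVER remote_srv;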

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/f86cd84f-a6a3-4451-bae7-5cca9e63b06d@oss.nttdata.com
2025-06-19 09:07:19 +09:00
Bruce Momjian
d0d1bcb1e8 doc: fix for commit 09f7d36ba16 in changing "_" to "-".
I thought underscores wouldn't even work in "id"s, so I never checked to
see if anything referenced them, but it turns out they do work, so adjust
the calling site to use the dash syntax.
2025-06-18 16:48:26 -04:00
Bruce Momjian
09f7d36ba1 doc config.sgml: use "-" and not "_" for varlistentry "id"s
Change "id"s of file_copy_method and enable_self_join_elimination for
consistency with the rest of the guc "id"s.  These are new entries for
PG 18.
2025-06-18 16:43:27 -04:00
Fujii Masao
c2e2589ab9 pg_dump: Allow pg_dump to dump the statistics for foreign tables.
Commit 1fd1bd87101 introduced support for dumping statistics with
pg_dump and pg_dumpall, covering tables, materialized views, and indexes.
However, it overlooked foreign tables, even though functions like
pg_restore_relation_stats() support them.

This commit fixes that oversight by allowing pg_dump and pg_dumpall
to include statistics for foreign tables.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/3772e4e4-ef39-4deb-bb76-aa8165f33fb6@oss.nttdata.com
2025-06-18 14:53:55 +09:00
Michael Paquier
9e1183953f Document "relrewrite" at the top of heap_create_with_catalog()
This parameter was introduced in 325f2ec5557f but, unlike all the other
arguments of heap_create_with_catalog(), was not documented.

Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-by: Steven Niu <niushiji@gmail.com>
Discussion: https://postgr.es/m/aE--bmEv-gJUTH5v@paquier.xyz
2025-06-18 11:03:21 +09:00
Fujii Masao
428a87607b doc: Reorder protocol version option descriptions in libpq docs.
Commit 285613c60a7 introduced the min_protocol_version and
max_protocol_version connection options for libpq, but their descriptions
were placed in the middle of the unrelated ssl_min_protocol_version and
ssl_max_protocol_version entries.

This commit moves the min_protocol_version and max_protocol_version
descriptions to appear after the SSL-related options. This improves
the logical order and makes it easier for users to locate the relevant
settings in the libpq documentation.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/a3391f36-30f5-4d4a-825b-232476819de8@oss.nttdata.com
2025-06-18 09:18:40 +09:00
Bruce Momjian
bb43c97bab doc PG 18 relnotes: add markup, still need to add links 2025-06-17 20:00:38 -04:00
Daniel Gustafsson
917c00d761 Fix allocation check to test the right variable
The memory allocation for cancelConn->be_cancel_key was accidentally
checking the be_cancel_key member in the conn object instead of the
one in cancelConn.

Author: Ranier Vilela <ranier.vf@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAEudQAq4ySDR6dsg9xwurBXwud02hX7XCOZZAcZx-JMn6A06nA@mail.gmail.com
2025-06-17 22:42:38 +02:00
Tomas Vondra
0cf205e122 amcheck: Fix posting tree checks in gin_index_check()
Fix two issues in parent_key validation in posting trees:

* It's not enough to check stack->parentblk is valid to determine if the
  parentkey is valid. It's possible parentblk is set to a valid block
  number, but parentkey is invalid. So check parentkey directly.

* We don't need to invalidate parentkey for all child pages of the
  rightmost page. It's enough to invalidate it for the rightmost child
  only, which means we can check more cases (fewer false negatives).

Issues reported by Arseniy Mukhin, along with a proposed patch. Review
by Andrey M. Borodin, cleanup and improvements by me.

Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CAE7r3MJ611B9TE=YqBBncewp7-k64VWs+sjk7XF6fJUX77uFBA@mail.gmail.com
2025-06-17 16:48:11 +02:00
Tomas Vondra
cdd1a431f2 amcheck: Fix parent key check in gin_index_check()
The checks introduced by commit 14ffaece0fb5 did not get the parent key
checks quite right, missing some data corruption cases. In particular:

* The "rightlink" check was not working as intended, because rightlink
  is a BlockNumber, and InvalidBlockNumber is 0xFFFFFFFF, so

    !GinPageGetOpaque(page)->rightlink

  almost always evaluates to false (except for rightlink=0). So in most
  cases parenttup was left NULL, preventing any checks against parent.

* Use GinGetDownlink() to retrieve child blkno to avoid triggering
  Assert, same as the core GIN code.

Issues reported by Arseniy Mukhin, along with a proposed patch. Review
by Andrey M. Borodin, cleanup and improvements by me.

Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CAE7r3MJ611B9TE=YqBBncewp7-k64VWs+sjk7XF6fJUX77uFBA@mail.gmail.com
2025-06-17 15:46:29 +02:00
Tomas Vondra
0b54b39233 amcheck: Fix checks of entry order for GIN indexes
This tightens a couple of checks used when checking GIN indexes, which might
have resulted in incorrect results (false positives/negatives).

* The code skipped ordering checks if the entries were for different
  attributes (for multi-column GIN indexes), possibly missing some cases
  of data corruption. But the attribute number is part of the ordering,
  so we can check that.

* The root page was skipped when checking entry order, but that is
  unnecessary. The root page is subject to the same ordering rules, we
  can process it just like any other page.

* The high key on the right-most page was not checked, but that is
  needed only for inner pages (we don't store the high key for those).
  For leaf pages we can check the high key just fine.

* Correct the detection of split pages. If the page gets split, the
  cached parent key is greater than the current child key (not less, as
  the code incorrectly expected).

Issues reported by Arseniy Mukhin, along with a proposed patch. Review
by Andrey M. Borodin, cleanup and improvements by me.

Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CAE7r3MJ611B9TE=YqBBncewp7-k64VWs+sjk7XF6fJUX77uFBA@mail.gmail.com
2025-06-17 14:55:29 +02:00
Tomas Vondra
8dd41c0bff amcheck: Remove unused GinScanItem->parentlsn field
The field was introduced by commit 14ffaece0fb5, but is unused and
unnecessary. So remove it.

Issues reported by Arseniy Mukhin, along with a proposed patch. Review
by Andrey M. Borodin, cleanup and minor improvements by me.

Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CAE7r3MJ611B9TE=YqBBncewp7-k64VWs+sjk7XF6fJUX77uFBA@mail.gmail.com
2025-06-17 14:17:38 +02:00
Tomas Vondra
c89d6b889c amcheck: Test gin_index_check on a multicolumn index
Adds a regression test with gin_index_check() on a multicolumn index,
to verify it's handled correctly and improve test coverage for code
introduced by 14ffaece0fb5.
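
A minimal usage sketch (hypothetical table):

    CREATE EXTENSION amcheck;
    CREATE TABLE t (a text[], b text[]);
    CREATE INDEX t_gin_idx ON t USING gin (a, b);
    SELECT gin_index_check('t_gin_idx');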

Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CAE7r3MJ611B9TE=YqBBncewp7-k64VWs+sjk7XF6fJUX77uFBA@mail.gmail.com
2025-06-17 14:14:54 +02:00
Peter Eisentraut
6f55fb7411 doc: Mention the default io_method
It was previously not documented.

Author: Daniel Westermann (DWE) <daniel.westermann@dbi-services.com>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/ZR0P278MB04279CB0C1D8F49DE68F168ED2AF2%40ZR0P278MB0427.CHEP278.PROD.OUTLOOK.COM
2025-06-17 07:41:15 +02:00
Bruce Momjian
23c67e8a83 doc PG 18 relnotes: add author for initdb commit 04bec894a04
Needed to run src/tools/add_commit_links.pl.
2025-06-16 21:04:14 -04:00
Masahiko Sawada
d87d07b7ad Fix re-distributing previously distributed invalidation messages during logical decoding.
Commit 4909b38af0 introduced logic to distribute invalidation messages
from catalog-modifying transactions to all concurrent in-progress
transactions. However, since each transaction distributes not only its
original invalidation messages but also previously distributed
messages to other transactions, this leads to an exponential increase
in allocation request size for invalidation messages, ultimately
causing memory allocation failure.

This commit fixes this issue by tracking distributed invalidation
messages separately per decoded transaction and not redistributing
these messages to other in-progress transactions. The maximum size of
distributed invalidation messages that one transaction can store is
limited to MAX_DISTR_INVAL_MSG_PER_TXN (8MB). Once the size of the
distributed invalidation messages exceeds this threshold, we
invalidate all caches in locations where distributed invalidation
messages need to be executed.

Back-patch to all supported versions where we introduced the fix by
commit 4909b38af0.

Note that this commit adds two new fields to ReorderBufferTXN to store
the distributed transactions. This change breaks ABI compatibility in
back branches, affecting third-party extensions that depend on the
size of the ReorderBufferTXN struct, though this scenario seems
unlikely.

Additionally, it adds a new flag to the txn_flags field of
ReorderBufferTXN to indicate distributed invalidation message
overflow. This should not affect existing implementations, as it is
unlikely that third-party extensions use unused bits in the txn_flags
field.

Bug: #18938 #18942
Author: vignesh C <vignesh21@gmail.com>
Reported-by: Duncan Sands <duncan.sands@deepbluecap.com>
Reported-by: John Hutchins <john.hutchins@wicourts.gov>
Reported-by: Laurence Parry <greenreaper@hotmail.com>
Reported-by: Max Madden <maxmmadden@gmail.com>
Reported-by: Braulio Fdo Gonzalez <brauliofg@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/680bdaf6-f7d1-4536-b580-05c2760c67c6@deepbluecap.com
Discussion: https://postgr.es/m/18942-0ab1e5ae156613ad@postgresql.org
Discussion: https://postgr.es/m/18938-57c9a1c463b68ce0@postgresql.org
Discussion: https://postgr.es/m/CAD1FGCT2sYrP_70RTuo56QTizyc+J3wJdtn2gtO3VttQFpdMZg@mail.gmail.com
Discussion: https://postgr.es/m/CANO2=B=2BT1hSYCE=nuuTnVTnjidMg0+-FfnRnqM6kd23qoygg@mail.gmail.com
Backpatch-through: 13
2025-06-16 17:36:01 -07:00
David Rowley
33b06a2001 Fix possible Assert failure in verify_compact_attribute()
Sometimes the TupleDesc used in verify_compact_attribute() is shared
among backends, and since CompactAttribute.attcacheoff gets updated
during tuple deformation, it was possible for another backend to set
attcacheoff on a given CompactAttribute in the small window of time
between copying the attcacheoff of the live CompactAttribute into the
'tmp' CompactAttribute and the Assert verifying that the live and tmp
CompactAttributes matched.

Here we adjust the code to make a copy of the live CompactAttribute so
that we're not trying to Assert against a shared copy of it.

Author: David Rowley <dgrowleyml@gmail.com>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/7195e408-758c-4031-8e61-4f842c716ac0@gmail.com
2025-06-17 10:49:36 +12:00
Andres Freund
e9a3615a52 aio: Add missing memory barrier when waiting for IO handle
Previously there was no memory barrier enforcing correct memory ordering when
waiting for a free IO handle. However, in the much more common case of waiting
for IO to complete, memory barriers were already present.

On strongly ordered architectures like x86 this had no negative consequences,
but on some armv8 hardware (observed on Apple hardware), it was possible for
the IO worker's update to PgAioHandle->state to become visible before its
update to ->distilled_result, leading to rather confusing assertion
failures. The failures were rare enough that the bug sometimes took days to
reproduce when running 027_stream_regress in a loop.

Once finally debugged, it was easy enough to come up with a much quicker
repro: Trigger a lot of very fast IO by limiting io_combine_limit to 1 and
ensure that we always have to wait for a free handle by setting
io_max_concurrency to 1. Triggering lots of concurrent seqscans in that setup
triggers the issue within seconds.

One reason this was hard to debug was that the assertion failure most commonly
happened in WaitReadBuffers(), rather than in the AIO subsystem itself. The
assertions added in this commit make problems like this easier to understand.

Also add a comment to the IO worker explaining that we rely on the lwlock
acquisition for correct memory ordering.

I think it'd be good to add a tap test that stress tests buffer IO, but that's
material for a separate patch.

Thanks a lot to Alexander and Konstantin for all the debugging help.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Investigated-by: Andres Freund <andres@anarazel.de>
Investigated-by: Alexander Lakhin <exclusion@gmail.com>
Investigated-by: Konstantin Knizhnik <knizhnik@garret.ru>
Discussion: https://postgr.es/m/2dkz7azclpeiqcmouamdixyn5xhlzy4rvikxrbovyzvi6rnv5c@pz7o7osv2ahf
2025-06-16 12:36:01 -04:00
Peter Eisentraut
ee685c9baf doc: Clean up title case use 2025-06-16 11:43:52 +02:00
Peter Eisentraut
f24fdf9855 libpq-oauth: Add exports.list to .gitignore 2025-06-16 11:16:52 +02:00
Peter Eisentraut
a876464abc Message style improvements
Some message style improvements in new code, and some small
refactorings to make translations easier.
2025-06-16 11:14:39 +02:00
John Naylor
f83f14881c Workaround code generation bug in clang
At optimization level -O0, builds on recent clang fail to produce the
correct CRC32C with our AVX-512 implementation. For now, just disable
the runtime check for clang at -O0. When this is fixed upstream and we
know the extent of the breakage, we can adjust to be version-specific.

Reported-by: Soumyadeep Chakraborty <soumyadeep2007@gmail.com>
Reported-by: Andy Fan <zhihuifan1213@163.com>
Tested-by: Andy Fan <zhihuifan1213@163.com>
Discussion: https://postgr.es/m/CAE-ML%2B-OV6p9uvCFBcSQjZUEh__y0h-KjN%2BBseyGJHt7u8EP%2Bw%40mail.gmail.com
Discussion: https://postgr.es/m/87o6uqd3iv.fsf%40163.com
2025-06-16 09:27:15 +07:00
Tom Lane
fd385c4c62 Add commit b27644bad to .git-blame-ignore-revs. 2025-06-15 13:11:04 -04:00
Tom Lane
b27644bade Sync typedefs.list with the buildfarm.
Our maintenance of typedefs.list has been a little haphazard
(and apparently we can't alphabetize worth a darn).  Replace
the file with the authoritative list from our buildfarm, and
run pgindent using that.

I also updated the additions/exclusions lists in pgindent where
necessary to keep pgindent from messing things up significantly.
Notably, now that regex_t and some related names are macros not real
typedefs, we have to whitelist them explicitly.  The exclusions list
has also drifted noticeably, presumably due to changes of system
headers on the buildfarm animals that contribute to the list.

Unlike in prior years, I've not manually added typedef names that
are missing from the buildfarm's list because they are not used to
declare any variables or fields.  So there are a few places where
the typedef declaration itself is formatted worse than before,
e.g. typedef enum IoMethod.  I could preserve the names that were
manually added to the list previously, but I'd really prefer to find
a less manual way of dealing with these cases.  A quick grep finds
about 75 such symbols, most of which have never gotten any special
treatment.

Per discussion among pgsql-release, doing this now seems appropriate
even though we're still a week or two away from making the v18 branch.
2025-06-15 13:04:24 -04:00
Peter Eisentraut
6d6480066c psql: Change new \conninfo to use SSL instead of TLS
Commit bba2fbc6238 introduced a new implementation of the \conninfo
command in psql.  That new code uses the term "TLS" while the rest of
PostgreSQL, including the rest of psql, consistently uses "SSL".  This
is uselessly confusing.  This changes the new code to use "SSL" as
well.

Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/f4ff9294-b491-4053-83f5-11c10ab8c999@eisentraut.org
2025-06-15 11:07:00 +02:00
David Rowley
2f98f967fa Improve comments for TidRangeEval
Here we provide a bit more detail on why TidRangeEval() returns false
when trss_mintid is greater than trss_maxtid.

Reported-by: Junwang Zhao <zhjwpku@gmail.com>
Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/CAEG8a3KUbUUqQgfK5X8Sj-%2BppPtGNTU%2BZiep0Rxr7SLjoR%2BB6w%40mail.gmail.com
2025-06-14 17:18:31 +12:00
Fujii Masao
0fe50417ec doc: Add note about "Client User" and "Superuser" fields in \conninfo output.
In the \conninfo psql command, the "Client User" column shows the user who
established the connection, while the "Superuser" column reflects whether
the current user in the current execution context is a superuser. This means
the users referred to in these columns can differ, for example, if the current
user was changed with the SET ROLE command.
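
For example (hypothetical role):

    SET ROLE analyst;
    \conninfo
    -- "Client User" still shows the user that established the connection,
    -- while "Superuser" reflects the current role after SET ROLE.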

This commit adds a note to the \conninfo documentation to clarify
this behavior and avoid potential confusion.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/685961b8-b6ce-40bb-b2d5-c2ff135d3388@oss.nttdata.com
2025-06-14 10:39:26 +09:00
Fujii Masao
be37ac20fc psql: Report full protocol version in \conninfo output.
Commit bba2fbc6238 modified \conninfo to display the protocol version
used by the current connection, but it only showed the major version (e.g., 3).

This commit updates \conninfo to display the full protocol version (e.g., 3.2).
Since support for new version 3.2 was added in v18, and the server supports
both 3.0 and 3.2, showing the complete version helps users understand
exactly which protocol version the current session is using.

Although this is a minor behavior change, it's considered a fix for
an oversight in the original patch and is included in v18.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/685961b8-b6ce-40bb-b2d5-c2ff135d3388@oss.nttdata.com
2025-06-14 10:37:12 +09:00
Alexander Korotkov
eb124c3d6d Add TAP tests to check replication slot advance during the checkpoint
The new tests verify that logical and physical replication slots are still
valid after an immediate restart on checkpoint completion when the slot was
advanced during the checkpoint.

This commit introduces two new injection points to make these tests possible:

* checkpoint-before-old-wal-removal - triggered in the checkpointer process
  just before old WAL segments cleanup;
* logical-replication-slot-advance-segment - triggered in
  LogicalConfirmReceivedLocation() when restart_lsn was changed enough to
  point to the next WAL segment.

Discussion: https://postgr.es/m/flat/1d12d2-67235980-35-19a406a0%4063439497
Author: Vitaly Davydov <v.davydov@postgrespro.ru>
Author: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 17
2025-06-14 03:55:21 +03:00
Alexander Korotkov
ca307d5cec Keep WAL segments by slot's last saved restart LSN
The patch fixes the issue with the unexpected removal of old WAL segments
after checkpoint, followed by an immediate restart.  The issue occurs when
a slot is advanced after the start of the checkpoint and before old WAL
segments are removed at the end of the checkpoint.

The patch introduces a new in-memory state for slots: last_saved_restart_lsn,
which is used to calculate the oldest LSN for removing WAL segments. This
state is updated every time with the current restart_lsn at the moment when
the slot is saved to disk.

This fix changes the shared memory layout.  It's applied to HEAD only because
we don't have to preserve ABI compatibility during the beta stage.  Another
fix that doesn't affect the ABI is committed to back branches.

Discussion: https://postgr.es/m/1d12d2-67235980-35-19a406a0%4063439497
Author: Vitaly Davydov <v.davydov@postgrespro.ru>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
2025-06-14 03:36:04 +03:00
Peter Geoghegan
c45a1dba0d nbtree: _bt_readnextpage doesn't affect markPos.
_bt_readnextpage expects so->currPos.buf to be InvalidBuffer (and for
the position's page to be unlocked) when called.  However, it does not
expect there to be no pins held on any page.  In particular, so->markPos
might hold a separate pin, both before and after the call.  Fix some
comments that seemed to suggest otherwise.

Follow-up commit to commit 7c319f54, which made _bt_killitems drop pins
it acquired itself.
2025-06-13 19:58:47 -04:00
Jeff Davis
a0c7b76537 Comment fixups from 626df47ad9.
Reported-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/CAHut+PspbHQmRCBL1c-opoJeTUKUaFFfUQJd2rhDZqwUrWCi7w@mail.gmail.com
2025-06-13 10:02:24 -07:00
Daniel Gustafsson
29aaeceee2 psql: Reword help message and docs for WATCH_INTERVAL
Reword the documentation around the default value to make interaction
between WATCH_INTERVAL and the \watch command clearer.  While there,
also remove a stray parenthesis left over from a previous version of
the patch.
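
For example, setting the variable changes the default interval used by
\watch when no interval argument is given:

    \set WATCH_INTERVAL 5
    SELECT count(*) FROM pg_stat_activity \watch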

Reported-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/c34a650b-6f8b-4da7-9ebb-b6df03ce009d@eisentraut.org
2025-06-13 15:13:09 +02:00
Michael Paquier
6e951f279b psql: Forbid use of COPY and \copy while in a pipeline
Running COPY within a pipeline can break protocol synchronization in
multiple ways.  psql is limited in terms of result processing when mixing
COPY commands with normal queries while controlling a pipeline with the
new meta-commands, for the following reasons:
- In COPY mode, the backend ignores additional Sync messages and will
not send a matching ReadyForQuery expected by the frontend.  Doing a
\syncpipeline just after COPY will leave the frontend waiting for a
ReadyForQuery message that won't be sent, leaving psql out-of-sync.
- libpq automatically sends a Sync with the Copy message which is not
tracked in the command queue, creating an unexpected synchronisation
point that psql cannot really know about.  While it is possible to track
such activity for a \copy, this cannot really be done sanely with plain
COPY queries.  Backend failures during a COPY would leave the pipeline
in an aborted state while the backend would be in a clean state, ready
to process commands.

At the end, fixing those issues would require modifications in how libpq
handles pipeline and COPY.  So, rather than implementing workarounds in
psql to shortcut the libpq internals (with command queue handling for
one), and because meta-commands for pipelines in psql are a new feature
with COPY in a pipeline having a limited impact compared to other
queries, this commit forbids the use of COPY within a pipeline to avoid
possible break of protocol synchronisation within psql.  If there is a
use-case for COPY support within pipelines in libpq, this could always
be added in the future, if necessary.
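
With this change, a sequence like the following is rejected by psql
(hypothetical table):

    \startpipeline
    COPY t FROM STDIN;   -- now refused while a pipeline is open
    \endpipeline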

Most of the changes of this commit impacts the tests for psql pipelines,
removing the tests related to COPY.  Some TAP tests still exist for COPY
TO/FROM and \copy to/from, to check that connections are aborted
when this operation is attempted.

Reported-by: Nikita Kalinin <n.kalinin@postgrespro.ru>
Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/AC468509-06E8-4E2A-A4B1-63046A4AC6AB@postgrespro.ru
2025-06-13 10:15:17 +09:00
Michael Paquier
2c76c6ac47 Replace %llu by PRIu64 in AIO io_uring code
This is a continuation of 15a79c73111f, cleaning up the AIO io_uring
code that has been committed after that while still using %llu.

The code changed here is new in v18, so cleaning things now means less
conflicts if this area of the code changes on backpatch once the 18
stable branch is created.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/aEZcGCnYFq642q8k@paquier.xyz
2025-06-13 08:59:47 +09:00
Fujii Masao
84914e964b pg_restore: Fix wrong descriptions of --with-{schema,data,statistics} options.
Commit bde2fb797aa added the --with-schema, --with-data, and --with-statistics
options to pg_restore. These options control whether to restore schema, data,
or statistics if present in the archive. However, the help message and
documentation incorrectly described them as affecting what gets dumped.

This commit corrects those descriptions to clarify that the options control
restoration, not dumping.

Bug: #18952
Reported-by: TAKATSUKA Haruka <harukat@sraoss.co.jp>
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: TAKATSUKA Haruka <harukat@sraoss.co.jp>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/18952-be40a620f8b1e755@postgresql.org
2025-06-12 23:25:21 +09:00
Álvaro Herrera
0f65f3eec4
Fix squashing algorithm for query texts
The algorithm to squash lists of constants added by commit 62d712ecfd94
was a bit too simplistic; we wanted to avoid adding unnecessary
complexity, but cases like direct function calls of typecasting
functions (and others) were missed, and bogus SQL syntax was being shown
in the pg_stat_statements normalized query text field.  To fix normalization
for those cases, we need the parser to transmit information about where
each list of constant values starts and ends, so add that to a couple of
nodes.  Also add a few more test cases to make sure we're doing the
right thing.
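
One sketch of the kind of pattern involved (hypothetical table; the set
of affected cases is broader):

    SELECT * FROM t WHERE id IN (int8(1), int8(2), int8(3));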

The patch initially submitted by Sami added a new private struct in
gram.y to carry the start/end information for A_Expr, but I (Álvaro)
decided that a better fix was to remove the parser indirection via the
in_expr production, and instead create separate components in the a_expr
rule.  I'm surprised that this works and doesn't require more changes,
but I assume (without checking) that the grammar used to be more complex
and got simplified at some point.

Bump catversion.

Author: Sami Imseih <samimseih@gmail.com>
Author: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAA5RZ0tRXoPG2y6bMgBCWNDt0Tn=unRerbzYM=oW0syi1=C1OA@mail.gmail.com
2025-06-12 14:21:21 +02:00
Fujii Masao
f7b11414e9 doc: Document that MAINTAIN privilege allows statistics manipulation functions.
Database object statistics manipulation functions were introduced
in PostgreSQL 18 and are permitted under the MAINTAIN privilege.
However, the documentation previously did not mention these functions
in the list of allowed operations.

This commit updates the MAINTAIN privilege documentation to
explicitly include statistics manipulation functions, clarifying
what the privilege covers.
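
A minimal illustration (role and table names are hypothetical): granting
MAINTAIN on a table allows that role to use the statistics manipulation
functions on it, in addition to the operations already listed:

GRANT MAINTAIN ON TABLE measurements TO stats_admin;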

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://postgr.es/m/7c7e1ad5-fdf9-486f-bc63-40ac99b0461d@oss.nttdata.com
2025-06-12 14:53:32 +09:00
Michael Paquier
f85f6ab051 Revert support for improved tracking of nested queries
This commit reverts the two following commits:
- 499edb09741b, track more precisely query locations for nested
statements.
- 06450c7b8c70, a follow-up fix of 499edb09741b with query locations.
The test introduced in this commit is not reverted; it is proving
useful to track a problem that only pgaudit was able to detect.

These prove to have issues with the tracking of SELECT statements when
these use multiple parentheses, which is something supported by the
grammar.  Incorrect locations and lengths cause pg_stat_statements to
become confused, failing at query normalization with potential
out-of-bounds writes because the location and the length may not match
what can be handled.  Many of the query patterns discussed when this
issue was reported have no test coverage in the main regression test
suite; otherwise the recovery test 027_stream_regress.pl would have
caught the problems, as pg_stat_statements is loaded by the node running
the regression tests.  A first step would be to improve the test
coverage to stress the query normalization logic more.

A different portion of this work was done in 45e0ba30fc40, with the
addition of tests for nested queries.  These can be left in the tree.
They are useful to track the way inner queries are currently tracked by
PGSS with non-top-level entries, and will be useful when reconsidering
in the future the work reverted here.

Reported-by: Alexander Kozhemyakin <a.kozhemyakin@postgrespro.ru>
Discussion: https://postgr.es/m/18947-cdd2668beffe02bf@postgresql.org
2025-06-12 10:08:55 +09:00
Peter Geoghegan
dd2ce37927 Revert "nbtree: Remove useless row compare arg."
This reverts commit 54c6ea8c81db718508eeea50991d3c1c5dff54a5.

Further analysis has shown that the forcenonrequired row compare
behavior is in fact necessary, despite the new restrictions on
RowCompares imposed by _bt_set_startikey following commit 5f4d98d4.

Discussion: https://postgr.es/m/CAH2-Wzm3bKcz3TbHGem3_+SinEyG=VZVPbApQghp7YiZj+MM3g@mail.gmail.com
2025-06-11 18:16:15 -04:00
Jeff Davis
e1458f2f1b Revert a few small patches that were intended for version 19.
- 4c787a24e7e220a60022e47c1776f22f72902899
- 78bd364ee39ca70a8f9cb8719282389866a08e14
- 7a6880fadc177873d5663961ec3a02d67e34dcbe
- 8898082a5d3e94eef073f0e08124137e096e78ef

Suggested-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA+TgmoZ=J=PVNZUNKaxULu+KUVSt3Y-aJ1DZ9Y3Co6mu0z62jA@mail.gmail.com
Discussion: https://postgr.es/m/60e8c6d0a6c08e67f15dbbe9e53df0119c710065.camel@j-davis.com
2025-06-11 15:10:12 -07:00
Masahiko Sawada
b774ad4933 Add tab completion for REJECT_LIMIT option.
This addresses an oversight in commit 4ac2a9bec, which introduced the
REJECT_LIMIT option to the COPY command.
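
For reference, a rough sketch of the option being completed (file path
and table name are hypothetical); REJECT_LIMIT is used together with
ON_ERROR ignore:

COPY measurements FROM '/tmp/data.csv'
  WITH (FORMAT csv, ON_ERROR ignore, REJECT_LIMIT 10);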

Author: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp>
Discussion: https://postgr.es/m/ac23e824d1d602f113a89c91ee56fb23@oss.nttdata.com
2025-06-11 11:44:25 -07:00
Peter Geoghegan
7c319f5491 Make _bt_killitems drop pins it acquired itself.
Teach nbtree's _bt_killitems to leave the so->currPos page that it sets
LP_DEAD items on in whatever state it was in when _bt_killitems was
called.  In particular, make sure that so->dropPin scans don't acquire a
pin whose reference is saved in so->currPos.buf.

Allowing _bt_killitems to change so->currPos.buf like this is wrong.
The immediate consequence of allowing it is that code in _bt_steppage
(that copies so->currPos into so->markPos) will behave as if the scan is
a !so->dropPin scan.  so->markPos will therefore retain the buffer pin
indefinitely, even though _bt_killitems only needs to acquire a pin
(along with a lock) for long enough to mark known-dead items LP_DEAD.

This issue came to light following a report of a failure of an assertion
from recent commit e6eed40e.  The test case in question involves the use
of mark and restore.  An initial call to _bt_killitems takes place that
leaves so->currPos.buf in a state that is inconsistent with the scan
being so->dropPin.  A subsequent call to _bt_killitems for the same
position (following so->currPos being saved in so->markPos, and then
restored as so->currPos) resulted in the failure of an assertion that
tests that so->currPos.buf is InvalidBuffer when the scan is so->dropPin
(non-assert builds got a "resource was not closed" WARNING instead).

The same problem exists on earlier releases, though the issue is far
more subtle there.  Recent commit e6eed40e introduced the so->dropPin
field as a partial replacement for testing so->currPos.buf directly.
Earlier releases won't get an assertion failure (or buffer pin leak),
but they will allow the second _bt_killitems call from the test case to
behave as if a buffer pin was consistently held since the original call
to _bt_readpage.  This is wrong; there will have been an initial window
during which no pin was held on the so->currPos page, and yet the second
_bt_killitems call will neglect to check if so->currPos.lsn continues to
match the page's now-current LSN.

As a result of all this, it's just about possible that _bt_killitems
will set the wrong items LP_DEAD (on release branches).  This could only
happen with merge joins (the sole user of nbtree mark/restore support),
when a concurrently inserted index tuple used a recently-recycled TID
(and only when the new tuple was inserted onto the same page as a
distinct concurrently-removed tuple with the same TID).  This is exactly
the scenario that _bt_killitems' check of the page's now-current LSN
against the LSN stashed in currPos was supposed to prevent.

A follow-up commit will make nbtree completely stop conditioning whether
or not a position's pin needs to be dropped on whether the 'buf' field
is set.  All call sites that might need to drop a still-held pin will be
taught to rely on the scan-level so->dropPin field recently introduced
by commit e6eed40e.  That will make bugs of the same general nature as
this one impossible (or make them much easier to detect, at least).

Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/545be1e5-3786-439a-9257-a90d30f8b849@gmail.com
Backpatch-through: 13
2025-06-11 09:17:35 -04:00
Michael Paquier
361499538c psql: Remove PARTITION BY clause in tab completion for unlogged tables
psql's tab completion was still offering PARTITION BY as a possible
pattern after CREATE UNLOGGED TABLE, but the backend has been rejecting
this combination since e2bab2d79204.
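
For context, a minimal sketch of the combination involved (table name
hypothetical):

CREATE UNLOGGED TABLE events (id int) PARTITION BY RANGE (id);
-- rejected by the backend since e2bab2d79204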

Reported-by: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Shinya Kato <shinya11.kato@gmail.com>
Discussion: https://postgr.es/m/CAOzEurQZ1a+6d1K8b=+Ww1NFQVwAt9KSCQsBWXYBaPnYCenK3g@mail.gmail.com
2025-06-11 09:27:28 +09:00
Tom Lane
137935bd11 Don't reduce output request size on non-Unix-socket connections.
Traditionally, libpq's pqPutMsgEnd has rounded down the amount-to-send
to be a multiple of 8K when it is eagerly writing some data.  This
still seems like a good idea when sending through a Unix socket, as
pipes typically have a buffer size of 8K or some fraction/multiple of
that.  But there's not much argument for it on a TCP connection, since
(a) standard MTU values are not commensurate with that, and (b) the
kernel typically applies its own packet splitting/merging logic.

Worse, our SSL and GSSAPI code paths both have API stipulations that
if they fail to send all the data that was offered in the previous
write attempt, we mustn't offer less data in the next attempt; else
we may get "SSL error: bad length" or "GSSAPI caller failed to
retransmit all data needing to be retried".  The previous write
attempt might've been pqFlush attempting to send everything in the
buffer, so pqPutMsgEnd can't safely write less than the full buffer
contents.  (Well, we could add some more state to track exactly how
much the previous write attempt was, but there's little value evident
in such extra complication.)  Hence, apply the round-down only on
AF_UNIX sockets, where we never use SSL or GSSAPI.

Interestingly, we had a very closely related bug report before,
which I attempted to fix in commit d053a879b.  But the test case
we had then seemingly didn't trigger this pqFlush-then-pqPutMsgEnd
scenario, or at least we failed to recognize this variant of the bug.

Bug: #18907
Reported-by: Dorjpalam Batbaatar <htgn.dbat.95@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18907-d41b9bcf6f29edda@postgresql.org
Backpatch-through: 13
2025-06-10 18:39:34 -04:00
Jeff Davis
8898082a5d inet_net_pton.c: use pg_ascii_tolower() rather than tolower().
Avoid dependence on setlocale(). No behavior change.

Discussion: https://postgr.es/m/9875f7f9-50f1-4b5d-86fc-ee8b03e8c162@eisentraut.org
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
2025-06-10 11:23:20 -07:00
Jeff Davis
7a6880fadc isn.c: use pg_ascii_toupper() instead of toupper().
Avoid dependence on setlocale(). No behavior change.

Discussion: https://postgr.es/m/9875f7f9-50f1-4b5d-86fc-ee8b03e8c162@eisentraut.org
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
2025-06-10 11:23:11 -07:00
Jeff Davis
78bd364ee3 contrib/spi/refint.c: use pg_ascii_tolower() instead.
Avoid dependence on setlocale(). No behavior change.

Discussion: https://postgr.es/m/9875f7f9-50f1-4b5d-86fc-ee8b03e8c162@eisentraut.org
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
2025-06-10 11:23:05 -07:00
Jeff Davis
4c787a24e7 copyfromparse.c: use pg_ascii_tolower() rather than tolower().
Avoid dependence on setlocale(). No behavior change.

Discussion: https://postgr.es/m/9875f7f9-50f1-4b5d-86fc-ee8b03e8c162@eisentraut.org
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
2025-06-10 11:22:57 -07:00
Peter Eisentraut
3feff3916e Use exported symbols list on macOS for loadable modules as well
On macOS, when building with the make system, the exported symbols
list $(SHLIB_EXPORTS) was ignored.  This was probably not intentional,
it was probably just forgotten, since that combination has never
actually been used until now (for libpq-oauth).

The meson build system handles this correctly.  Also, other platforms
have been doing this correctly.

This fixes it.  It also does a bit of refactoring to make the code
match the layout for other platforms.

Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/c70ca32e-b109-460d-9810-6e23ebb4473f%40eisentraut.org
2025-06-10 07:04:43 +02:00
Tom Lane
166b4f4560 pg_restore: fix incompatibility with old directory-format dumps.
pg_restore failed to restore large objects (blobs) out of
directory-format dumps made by versions before PG v12.
That's because, due to a bug fixed in commit 548e50976, those
old versions put the wrong filename into the BLOBS TOC entry.
Said bug was harmless before v17, because we ignored the
incorrect filename field --- but commit a45c78e32 assumed it
would be correct.

Reported-by: Pavel Stehule <pavel.stehule@gmail.com>
Author: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAFj8pRCrZ=_e1Rv1N+6vDaH+6gf=9A2mE2J4RvnvKA1bLiXvXA@mail.gmail.com
Backpatch-through: 17
2025-06-08 17:06:39 -04:00
Etsuro Fujita
7d4667c620 Revert "postgres_fdw: Inherit the local transaction's access/deferrable modes."
We concluded that commit e5a3c9d9b is a feature rather than a fix; since
it was added after feature freeze, revert it.

Reported-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reported-by: Michael Paquier <michael@paquier.xyz>
Reported-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/ed2296f1-1a6b-4932-b870-5bb18c2591ae%40oss.nttdata.com
2025-06-08 17:30:00 +09:00
Bruce Momjian
73e26cbeb5 doc PG 18 relnotes: add AFTER trigger user change item
Reported-by: Noah Misch

Discussion: https://postgr.es/m/20250603172123.5f.nmisch@google.com
2025-06-07 11:25:17 -04:00
Bruce Momjian
37e5f0b61f doc PG 18 relnotes: adjust wording of initdb item 48814415d5a
And move to the top of the incompatibility list.  This will impact users
more than any other incompatibility item because of pg_upgrade.
2025-06-07 11:06:47 -04:00
Peter Eisentraut
1a857348e4 plpython: Remove obsolete test expected file
Move plpython_error_5.out to plpython_error.out, since the pre-3.5
version is no longer needed now that we have raised the Python
requirement to 3.6 (commit 45363fca637).

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/d620e7c6-becc-4a8e-9b43-eea0da55faf2@eisentraut.org
2025-06-07 09:04:29 +02:00
Jeff Davis
5b40feab59 Improve CREATE DATABASE error message for invalid libc locale.
Discussion: https://postgr.es/m/73959a14-267b-49c1-8293-291b175682cb@manitou-mail.org
Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
2025-06-06 15:28:51 -07:00
Nathan Bossart
a31767fc09 Use NULL instead of 0 for pointer arguments.
Commit 5fe08c006c fixed this for calls to dshash_create().  This
commit fixes calls to dshash_attach() and dsa_create_in_place().

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aECi_gSD9JnVWQ8T%40nathan
2025-06-06 12:08:17 -05:00
Nathan Bossart
304862973e Fixed signed/unsigned mismatch in test_dsm_registry.
Oversight in commit 8b2bcf3f28.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/aECi_gSD9JnVWQ8T%40nathan
Backpatch-through: 17
2025-06-06 11:40:52 -05:00
Peter Geoghegan
e6eed40e44 Avoid BufferGetLSNAtomic() calls during nbtree scans.
Delay calling BufferGetLSNAtomic() until we finish reading a page that
actually contains items that btgettuple will return to the executor.
This reduces the number of calls during plain index scans (we'll only
call BufferGetLSNAtomic() when _bt_readpage returns true), and totally
eliminates calls during index-only scans, bitmap index scans, and plain
index scans of an unlogged relation.

Currently, when checksums (or wal_log_hints) are enabled, acquiring a
page's LSN in BufferGetLSNAtomic() involves locking the buffer header
(which involves the use of spinlocks).  Testing has shown that enabling
page-level checksums causes large regressions with certain workloads,
especially on larger multi-socket systems.

The regression isn't tied to any Postgres 18 commit.  However, Postgres
18 commit 04bec894 made initdb use checksums by default, so it seems
prudent to address the problem now.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/941f0190-e3c6-4622-9ac7-c04e936e5fdb@vondra.me
Discussion: https://postgr.es/m/CAH2-Wzk-Dg5XWs_jDuiHt4_7ryrSY+n=vxmHY51EVqPDFsKXmg@mail.gmail.com
2025-06-06 10:19:44 -04:00
Robert Haas
016e407f4b pg_prewarm: Allow autoprewarm to use more than 1GB to dump blocks.
Reported-by: Daria Shanina <vilensipkdm@gmail.com>
Author: Daria Shanina <vilensipkdm@gmail.com>
Author: Robert Haas <robertmhaas@gmail.com>
Backpatch-through: 13
2025-06-06 08:18:27 -04:00
Tom Lane
c37be39a74 Doc: improve description of which role runs a trigger.
Refine wording from commit 01463e1cc.

Author: Noah Misch <noah@leadboat.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20250605163441.2f.nmisch@google.com
2025-06-05 15:24:15 -04:00
Peter Geoghegan
54c6ea8c81 nbtree: Remove useless row compare arg.
Use of a RowCompare key makes nbtree index scans ineligible to use
pstate.forcenonrequired following recent bugfix commit 5f4d98d4.
There's no longer any need for _bt_check_rowcompare to accept a
forcenonrequired argument, so remove it.
2025-06-05 14:50:43 -04:00
Álvaro Herrera
e6f98d8848
Avoid bogus scans of partitions when marking FKs enforced
Similar to commit cc733ed164c5: when an unenforced foreign key that
references a partitioned table is altered to be enforced, we scan the
constrained table once for each partition of the referenced partitioned
table.  This is bogus and likely to cause the ALTER TABLE to fail: we
must scan the constrained table only against the top-level partitioned
table.  Oversight in commit eec0040c4bcd.  Fix by eliding those scans.

Author: Amul Sul <sulamul@gmail.com>
Reported-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxF1e_gPOLtsDoaE4VCgQPC8KZW_kPAjPR5Rvv4Ew=fb2A@mail.gmail.com
2025-06-05 18:39:06 +02:00
Tom Lane
04acad82b0 Doc: you must own the target object to use SECURITY LABEL.
For some reason this wasn't mentioned before.

Author: Patrick Stählin <me@packi.ch>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/931e012a-57ba-41ba-9b88-24323a46dec5@packi.ch
Backpatch-through: 13
2025-06-05 11:30:12 -04:00
Álvaro Herrera
cc733ed164
Avoid bogus scans of partitions when validating FKs to partitioned tables
Validating an unvalidated foreign key that references a partitioned
table would try to queue validations for each individual partition of
the referenced table, but this is wrong: each individual partition would
not necessarily have all the referenced rows, so errors would be raised.
Avoid doing that.  The pg_constraint rows that cause this to happen are
only there to support the action triggers that implement the DELETE/
UPDATE actions of the FK, so no validating scan is necessary.

This was an oversight in commit b663b9436e75.

An equivalent oversight exists for NOT ENFORCED constraints, which is
not fixed in this commit.
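
A rough sketch of the scenario (table and constraint names are
hypothetical): an unvalidated FK pointing at a partitioned table is
later validated, and the validating scan must target only the top-level
table:

CREATE TABLE ref (id int PRIMARY KEY) PARTITION BY RANGE (id);
CREATE TABLE ref_1 PARTITION OF ref FOR VALUES FROM (1) TO (1000);
CREATE TABLE fk (id int);
ALTER TABLE fk ADD CONSTRAINT fk_ref
  FOREIGN KEY (id) REFERENCES ref (id) NOT VALID;
ALTER TABLE fk VALIDATE CONSTRAINT fk_ref;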

Author: Amul Sul <sulamul@gmail.com>
Reported-by: Antonin Houska <ah@cybertec.at>
Reviewed-by: jian he <jian.universality@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/26983.1748418675@localhost
2025-06-05 17:17:13 +02:00
Tom Lane
4b05ebf095 Change role names used in trigger test.
The choices made in commit 01463e1cc might pose copyright hazards,
and are more cutesy than informative anyway.

Reported-by: Noah Misch <noah@leadboat.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20250415155850.9b.nmisch@google.com
2025-06-05 11:05:53 -04:00
Magnus Hagander
112e40b867 psql: fix order of join clauses when listing extensions
Commit d696406a9b2 added a new join to the query for extensions, but did
so in the wrong place, causing the AND clause to be applied to the wrong
join.

Author: Suraj Kharage <suraj.kharage@enterprisedb.com>
Reviewed-By: Dilip Kumar <dilipbalaut@gmail.com>
Discussion: https://postgr.es/m/CAF1DzPVBrN-cmPB2zb7ZU=2J4vEF2fNdArGCG9w+9fnKq4v8tg@mail.gmail.com
2025-06-05 09:54:16 +02:00
Michael Paquier
b87163e5f3 Fix copy-pasto with process count calculation in method_io_uring.c
This commit replaces the formula used for "TotalProcs" in
pgaio_uring_shmem_init() with a call to pgaio_uring_procs() for the
shared memory initialization.  The two compute exactly the same value,
so this removes a duplication.

pgaio_uring_procs() is used for shared memory sizing and a sanity check,
and it has some documentation explaining the reasoning behind the
formula.

Author: Japin Li <japinli@hotmail.com>
Discussion: https://postgr.es/m/ME0P300MB044521067A1EDDA9EDEC3793B66DA@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
2025-06-05 09:39:24 +09:00
Nathan Bossart
f9b1192190 doc: Remove notes about "unencrypted" passwords.
The documentation for the pg_authid system catalog and the
pg_shadow system view indicates that passwords might be stored in
cleartext, but that hasn't been possible for some time.

Oversight in commit eb61136dc7.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aD2yKkZro4nbl5ol%40nathan
Backpatch-through: 13
2025-06-04 09:47:25 -05:00
Peter Eisentraut
30c15987d9 doc: Update description of pg_constraint.convalidated
The previous description listed the constraint types that this column
was used for, but that was outdated, since not-valid not-null
constraints are now possible.  So just remove that qualification,
rather than trying to keep it updated.

Author: jian he <jian.universality@gmail.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://www.postgresql.org/message-id/flat/CACJufxFo4yTwzbSZrP%2BzQiR6_M00skoZMFaUnNJCdY6he%3DuQfA%40mail.gmail.com
2025-06-04 15:27:44 +02:00
Peter Eisentraut
48814415d5 doc PG 18 relnotes: Add incompatibility note about checksums now default
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://www.postgresql.org/message-id/flat/CAKAnmmKwiMHik5AHmBEdf5vqzbOBbcwEPHo4-PioWeAbzwcTOQ%40mail.gmail.com
2025-06-04 12:06:08 +02:00
Peter Eisentraut
f777d77387 Don't strip $libdir from LOAD command
Commit 4f7f7b03758 implemented the extension_control_path GUC, and to
make it work it was decided that we should strip the $libdir/ on
module_pathname from .control files, so that extensions don't need to
worry about this change.

This strip logic was implemented in expand_dynamic_library_name(),
which works fine when executing the SQL functions from extensions, but
this function is also called when the LOAD command is executed, and
since the user may explicitly pass the $libdir prefix in the LOAD
parameter, we should not strip it in this case.

This commit fixes this issue by moving the strip logic from
expand_dynamic_library_name() to load_external_function(), which is
called when running the SQL script from extensions.
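
As a small illustration of the now-preserved behavior (the module name
is only an example):

LOAD '$libdir/plpgsql';   -- the explicit $libdir prefix is no longer stripped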

Reported-by: Evan Si <evsi@amazon.com>
Author: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Rahila Syed <rahilasyed90@gmail.com>
Bug: #18920
Discussion: https://www.postgresql.org/message-id/flat/18920-b350b1c0a30af006%40postgresql.org
2025-06-04 11:38:12 +02:00
Michael Paquier
7f3381c7ee psql: Abort connection when using \syncpipeline after COPY TO/FROM
When the backend reads COPY data, it ignores all sync messages, as per
c01641f8aed0.  With psql pipelines, it is possible to manually send sync
messages with \syncpipeline, which leaves the frontend in an
unrecoverable state as the backend will not send the necessary
ReadyForQuery message that is expected to feed psql result consumption
logic.

It could be possible to artificially reduce piped_syncs and
requested_results; however, libpq's state would still have queued sync
messages in its command queue, and the only way to consume those without
directly calling pqCommandQueueAdvance() is to process ReadyForQuery
messages that won't be sent, since the backend ignores these.  Perhaps
this could be improved in the future, but I am not really excited about
introducing this amount of complication in libpq to manipulate the
message queues without a better use case to support it.

Hence, this patch aborts the connection if we detect excessive sync
messages after a COPY in a pipeline to avoid staying in an inconsistent
protocol state, which is the best thing we can do with pipelines in
psql for now.  Note that this change does not prevent wrapping a set
of queries inside a block made of \startpipeline and \endpipeline, only
the use of \syncpipeline for a COPY.
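
For instance, a schematic sequence along these lines (details elided)
now causes psql to abort the connection rather than waiting for a
ReadyForQuery that will never arrive:

\startpipeline
-- ... a COPY TO/FROM executed as part of the pipeline ...
\syncpipeline   -- the backend ignores the Sync while in COPY mode
\endpipeline    -- psql aborts the connection instead of hanging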

Reported-by: Nikita Kalinin <n.kalinin@postgrespro.ru>
Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/18944-8a926c30f68387dd@postgresql.org
2025-06-04 09:01:29 +09:00
Peter Eisentraut
58fbfde152 Fix incorrect format placeholders 2025-06-03 21:38:04 +02:00
Noah Misch
0e164eb9f4 Fix a pg_dump scenario for platforms where SEEK_CUR != 1.
POSIX allows such platforms.  Given the lack of complaints, we may not
currently test on such a platform.  This is new in v18 (commit
7d5c83b4e90c7156655f98b7312a30ae5eeb4d27), so no back-patch.
2025-06-03 11:18:52 -07:00
Fujii Masao
73bdcfab35 Rename log_lock_failure GUC to log_lock_failures for consistency.
This commit renames the GUC log_lock_failure to log_lock_failures
to align with the existing similar setting log_lock_waits, which uses
the plural form. This improves naming consistency across related GUCs.
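
A minimal usage sketch (assuming sufficient privileges to set the
parameter):

SET log_lock_failures = on;   -- previously spelled log_lock_failure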

Suggested-by: Peter Eisentraut <peter@eisentraut.org>
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/7a8198b6-d5b8-4910-b41e-8d3efcbb015d@eisentraut.org
2025-06-03 10:02:55 +09:00
Tom Lane
aa87f69c00 Disallow "=" in names of reloptions and foreign-data options.
We store values for these options as array elements with the syntax
"name=value", hence a name containing "=" confuses matters when
it's time to read the array back in.  Since validation of the
options is often done (long) after this conversion to array format,
that leads to confusing and off-point error messages.  We can
improve matters by rejecting names containing "=" up-front.

(Probably a better design would have involved pairs of array
elements, but it's too late now --- and anyway, there's no
evident use-case for option names like this.  We already
reject such names in some other contexts such as GUCs.)
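
For example (server and option names are made up), an option name
containing "=" is now rejected up-front instead of producing a confusing
error later when the stored "name=value" array element is read back:

CREATE SERVER bad_srv FOREIGN DATA WRAPPER postgres_fdw
  OPTIONS ("host=example" 'value');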

Reported-by: Chapman Flack <jcflack@acm.org>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Chapman Flack <jcflack@acm.org>
Discussion: https://postgr.es/m/6830EB30.8090904@acm.org
Backpatch-through: 13
2025-06-02 15:22:44 -04:00
Melanie Plageman
31a7e175fd Correct heap vacuum boundary state setup ordering
052026c9b9 mistakenly reordered setup steps in heap_vacuum_rel(),
incorrectly moving RelationGetNumberOfBlocks() before
vacuum_get_cutoffs().

OldestXmin must be determined before RelationGetNumberOfBlocks()
calculates the number of blocks in the relation that will be vacuumed.
Otherwise tuples older than OldestXmin may be inserted into the end of
the relation into blocks that are not vacuumed. If additional tuples
newer than those inserted into unscanned blocks but older than
OldestXmin are inserted into free space earlier in the relation, the
result could be advancing pg_class.relfrozenxid to a newer value than an
unfrozen XID in one of the unscanned heap pages.

Assigning an incorrect relfrozenxid can lead to data loss, so it is
imperative that it correctly reflect the oldest unfrozen xid.

Reported-by: Peter Geoghegan <pg@bowt.ie>
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzntqvVEdbbpqG5JqSZGuLWmy4PBfUO-OswfivKchr2gvw%40mail.gmail.com
2025-06-02 10:54:07 -04:00
Peter Eisentraut
fc32be3c94 Fix incorrect format placeholders
Fixes for return type of dclist_count().
2025-06-02 10:12:58 +02:00
Peter Eisentraut
32edf732e8 Rename gist stratnum support function
Commit 7406ab623fe added a gist support function that we internally
refer to by the symbol GIST_STRATNUM_PROC.  This translated from
"well-known" strategy numbers to opfamily-specific strategy numbers.
However, we later (commit 630f9a43cec) changed this to fit into
index-AM-level compare type mapping, so this function actually now
maps from compare type to opfamily-specific strategy numbers.  So this
name is no longer fitting.

Moreover, the index AM level also supports the opposite, a function to
map from strategy number to compare type.  This is currently not
supported in gist, but one might wonder what this function is supposed
to be called when it is added.

This patch changes the naming of the gist-level functionality to be
more in line with the index-AM-level functionality.  This makes sense
because these are essentially the same thing on different levels.
This also changes the names of the externally visible functions that
are provided for use as such a support function.

Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://www.postgresql.org/message-id/37ebb1d9-9036-485f-a215-e55435689917%40eisentraut.org
2025-06-02 08:41:27 +02:00
Michael Paquier
5231ed8262 Use replay LSN as target for cascading logical WAL senders
A cascading WAL sender doing logical decoding (as known as doing its
work on a standby) has been using as flush LSN the value returned by
GetStandbyFlushRecPtr() (last position safely flushed to disk).  This is
incorrect as such processes are only able to decode changes up to the
LSN that has been replayed by the startup process.

This commit changes cascading logical WAL senders to use the replay LSN,
as returned by GetXLogReplayRecPtr().  This distinction is important
particularly during shutdown, when WAL senders need to send any
remaining available data to their clients, switching WAL senders to a
caught-up state.  Using the latest flush LSN rather than the replay LSN
could cause the WAL senders to be stuck in an infinite loop preventing
them to shut down, as the startup process does not run when WAL senders
attempt to catch up, so they could keep waiting for work that would
never happen.

Backpatch down to v16, where logical decoding on standbys has been
introduced.

Author: Alexey Makhmutov <a.makhmutov@postgrespro.ru>
Reviewed-by: Ajin Cherian <itsajin@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/52138028-7246-421c-9161-4fa108b88070@postgrespro.ru
Backpatch-through: 16
2025-06-02 12:03:59 +09:00
Tom Lane
c98975ba85 Add commit 4672b6223 to .git-blame-ignore-revs. 2025-06-01 14:58:42 -04:00
Tom Lane
4672b62239 Run pgindent on the previous commit.
Clean up after rearranging PG_TRY blocks.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2954090.1748723636@sss.pgh.pa.us
Backpatch-through: 13
2025-06-01 14:55:24 -04:00
Tom Lane
c6f7f11d8f Fix edge-case resource leaks in PL/Python error reporting.
PLy_elog_impl and its subroutine PLy_traceback intended to avoid
leaking any PyObject reference counts, but their coverage of the
matter was sadly incomplete.  In particular, out-of-memory errors
in most of the string-construction subroutines could lead to
reference count leaks, because those calls were outside the
PG_TRY blocks responsible for dropping reference counts.

Fix by (a) adjusting the scopes of the PG_TRY blocks, and
(b) moving the responsibility for releasing the reference counts
of the traceback-stack objects to PLy_elog_impl.  This requires
some additional "volatile" markers, but not too many.

In passing, fix an ancient thinko: use of the "e_module_o" PyObject
was guarded by "if (e_type_s)", where surely "if (e_module_o)"
was meant.  This would only have visible consequences if the
"__name__" attribute were present but the "__module__" attribute
wasn't, which apparently never happens; but someday it might.

Rearranging the PG_TRY blocks requires indenting a fair amount
of code one more tab stop, which I'll do separately for clarity.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2954090.1748723636@sss.pgh.pa.us
Backpatch-through: 13
2025-06-01 14:48:35 -04:00
Etsuro Fujita
e5a3c9d9b5 postgres_fdw: Inherit the local transaction's access/deferrable modes.
Previously, postgres_fdw always 1) opened a remote transaction in READ
WRITE mode even when the local transaction was READ ONLY, causing a READ
ONLY transaction using it that references a foreign table mapped to a
remote view executing a volatile function to write in the remote side,
and 2) opened the remote transaction in NOT DEFERRABLE mode even when
the local transaction was DEFERRABLE, causing a SERIALIZABLE READ ONLY
DEFERRABLE transaction using it to abort due to a serialization failure
in the remote side.

To avoid these, modify postgres_fdw to open a remote transaction in the
same access/deferrable modes as the local transaction.  This commit also
modifies it to open a remote subtransaction in the same access mode as
the local subtransaction.

Although these issues exist since the introduction of postgres_fdw,
there have been no reports from the field.  So it seems fine to just fix
them in master only.

Author: Etsuro Fujita <etsuro.fujita@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAPmGK16n_hcUUWuOdmeUS%2Bw4Q6dZvTEDHb%3DOP%3D5JBzo-M3QmpQ%40mail.gmail.com
2025-06-01 17:30:00 +09:00
Dean Rasheed
b006bcd531 Fix MERGE into a plain inheritance parent table.
When a MERGE's target table is the parent of an inheritance tree, any
INSERT actions insert into the parent table using ModifyTableState's
rootResultRelInfo. However, there are two bugs in the way it is
initialized:

1. ExecInitMerge() incorrectly uses a different ResultRelInfo entry
from ModifyTableState's resultRelInfo array to build the insert
projection, which may not be compatible with rootResultRelInfo.

2. ExecInitModifyTable() does not fully initialize rootResultRelInfo.
Specifically, ri_WithCheckOptions, ri_WithCheckOptionExprs,
ri_returningList, and ri_projectReturning are not initialized.

This can lead to crashes, or incorrect query results due to failing to
check WCO's or process the RETURNING list for INSERT actions.

Fix both these bugs in ExecInitMerge(), noting that it is only
necessary to fully initialize rootResultRelInfo if the MERGE has
INSERT actions and the target table is a plain inheritance parent.
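
A condensed sketch of the affected shape (table names hypothetical): a
MERGE whose INSERT action targets a plain inheritance parent:

CREATE TABLE parent (id int, val text);
CREATE TABLE child () INHERITS (parent);
MERGE INTO parent p
  USING (VALUES (1, 'x')) AS s(id, val) ON p.id = s.id
  WHEN NOT MATCHED THEN INSERT (id, val) VALUES (s.id, s.val);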

Backpatch to v15, where MERGE was introduced.

Reported-by: Andres Freund <andres@anarazel.de>
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/4rlmjfniiyffp6b3kv4pfy4jw3pciy6mq72rdgnedsnbsx7qe5@j5hlpiwdguvc
Backpatch-through: 15
2025-05-31 12:12:58 +01:00
Michael Paquier
e050af2868 Change internal plan ID type from uint64 to int64
uint64 was chosen to be consistent with the type used by the query ID,
but the conclusion of a recent discussion for the query ID is that int64
is a better fit, as the signed form is what is shown to the user in
PGSS or EXPLAIN outputs.

This commit changes the plan ID to use int64, following c3eda50b0648
that has done the same for the query ID.

The plan ID is new to v18, introduced in 2a0cd38da5cc.

Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/aCvzJNwetyEI3Sgo@paquier.xyz
2025-05-31 09:40:45 +09:00
Nathan Bossart
706054b11b Ensure we have a snapshot when updating various system catalogs.
A few places that access system catalogs don't set up an active
snapshot before potentially accessing their TOAST tables.  To fix,
push an active snapshot just before each section of code that might
require accessing one of these TOAST tables, and pop it shortly
afterwards.  While at it, this commit adds some rather strict
assertions in an attempt to prevent such issues in the future.

Commit 16bf24e0e4 recently removed pg_replication_origin's TOAST
table in order to fix the same problem for that catalog.  On the
back-branches, those bugs are left in place.  We cannot easily
remove a catalog's TOAST table on released major versions, and only
replication origins with extremely long names are affected.  Given
the low severity of the issue, fixing older versions doesn't seem
worth the trouble of significantly modifying the patch.

Also, on v13 and v14, the aforementioned strict assertions have
been omitted because commit 2776922201, which added
HaveRegisteredOrActiveSnapshot(), was not back-patched.  While we
could probably back-patch it now, I've opted against it because it
seems unlikely that new TOAST snapshot issues will be introduced in
the oldest supported versions.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: 13
2025-05-30 15:17:28 -05:00
Tom Lane
232d8caeaa Fix memory leakage in postgres_fdw's DirectModify code path.
postgres_fdw tries to use PG_TRY blocks to ensure that it will
eventually free the PGresult created by the remote modify command.
However, it's fundamentally impossible for this scheme to work
reliably when there's RETURNING data, because the query could fail
in between invocations of postgres_fdw's DirectModify methods.
There is at least one instance of exactly this situation in the
regression tests, and the ensuing session-lifespan leak is visible
under Valgrind.

We can improve matters by using a memory context reset callback
attached to the ExecutorState context.  That ensures that the
PGresult will be freed when the ExecutorState context is torn
down, even if control never reaches postgresEndDirectModify.

I have little faith that there aren't other potential PGresult
leakages in the backend modules that use libpq.  So I think it'd
be a good idea to apply this concept universally by creating
infrastructure that attaches a reset callback to every PGresult
generated in the backend.  However, that seems too invasive for
v18 at this point, let alone the back branches.  So for the
moment, apply this narrow fix that just makes DirectModify safe.
I have a patch in the queue for the more general idea, but it
will have to wait for v19.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/2976982.1748049023@sss.pgh.pa.us
Backpatch-through: 13
2025-05-30 13:45:41 -04:00
Tom Lane
d98cefe114 Allow larger packets during GSSAPI authentication exchange.
Our GSSAPI code only allows packet sizes up to 16kB.  However it
emerges that during authentication, larger packets might be needed;
various authorities suggest 48kB or 64kB as the maximum packet size.
This limitation caused login failure for AD users who belong to many
AD groups.  To add insult to injury, we gave an unintelligible error
message, typically "GSSAPI context establishment error: The routine
must be called again to complete its function: Unknown error".

As noted in code comments, the 16kB packet limit is effectively a
protocol constant once we are doing normal data transmission: the
GSSAPI code splits the data stream at those points, and if we change
the limit then we will have cross-version compatibility problems
due to the receiver's buffer being too small in some combinations.
However, during the authentication exchange the packet sizes are
not determined by us, but by the underlying GSSAPI library.  So we
might as well just try to send what the library tells us to.
An unpatched recipient will fail on a packet larger than 16kB,
but that's not worse than the sender failing without even trying.
So this doesn't introduce any meaningful compatibility problem.

We still need a buffer size limit, but we can easily make it be
64kB rather than 16kB until transport negotiation is complete.
(Larger values were discussed, but don't seem likely to add
anything.)

Reported-by: Chris Gooch <cgooch@bamfunds.com>
Fix-suggested-by: Jacob Champion <jacob.champion@enterprisedb.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/DS0PR22MB5971A9C8A3F44BCC6293C4DABE99A@DS0PR22MB5971.namprd22.prod.outlook.com
Backpatch-through: 13
2025-05-30 12:55:15 -04:00
Fujii Masao
961553daf5 Make XactLockTableWait() and ConditionalXactLockTableWait() more interruptible.
Previously, XactLockTableWait() and ConditionalXactLockTableWait() could enter
a non-interruptible loop when they successfully acquired a lock on a transaction
but the transaction still appeared to be running. Since this loop continued
until the transaction completed, it could result in long, uninterruptible waits.

Although this scenario is generally unlikely since XactLockTableWait() and
ConditionalXactLockTableWait() can basically acquire a transaction lock
only when the transaction is not running, it can occur in a hot standby.
In such cases, the transaction may still appear active due to
the KnownAssignedXids list, even while no lock on the transaction exists.
For example, this situation can happen when creating a logical replication
slot on a standby.

The cause of the non-interruptible loop was the absence of CHECK_FOR_INTERRUPTS()
within it. This commit adds CHECK_FOR_INTERRUPTS() to the loop in both functions,
ensuring they can be interrupted safely.

Back-patch to all supported branches.

Author: Kevin K Biju <kevinkbiju@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAM45KeELdjhS-rGuvN=ZLJ_asvZACucZ9LZWVzH7bGcD12DDwg@mail.gmail.com
Backpatch-through: 13
2025-05-31 00:08:40 +09:00
David Rowley
c3eda50b06 Change internal queryid type from uint64 to int64
uint64 was perhaps chosen in cff440d36 as the type was uint32 prior to
that widening work.

Having this as uint64 doesn't make much sense and just adds the overhead of
having to remember that we always output this in its signed form.  Let's
remove that overhead.

The signed form output is seemingly required since we have no way to
represent the full range of uint64 in an SQL type.  We use BIGINT in places
like pg_stat_statements, which maps directly to int64.
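
For instance, what users actually see is the signed bigint form, as in
this hypothetical query:

SELECT queryid, calls FROM pg_stat_statements LIMIT 1;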

The release notes "Source Code" section may want to mention this
adjustment as some extensions may wish to adjust their code.

Author: David Rowley <dgrowleyml@gmail.com>
Suggested-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/50cb0c8b-994b-48f9-a1c4-13039eb3536b@eisentraut.org
2025-05-30 22:59:39 +12:00
Bruce Momjian
03c53a7314 doc PG 18 relnotes: modify async I/O item for other improvements
Add "etc." to indicate other actions will also be improved by
asynchronous I/O.

Reported-by: Melanie Plageman

Discussion: https://postgr.es/m/CAAKRu_bqjgSYA+OdemL-X91Yv53OwsVARZy+-tRyj8YQ=kcj0A@mail.gmail.com
2025-05-29 12:37:05 -04:00
Tom Lane
470273da0f Avoid resource leaks when a dblink connection fails.
If we hit out-of-memory between creating the PGconn and inserting
it into dblink's hashtable, we'd lose track of the PGconn, which
is quite bad since it represents a live connection to a remote DB.
Fix by rearranging things so that we create the hashtable entry
first.

Also reduce the number of states we have to deal with by getting rid
of the separately-allocated remoteConn object, instead allocating it
in-line in the hashtable entries.  (That incidentally removes a
session-lifespan memory leak observed in the regression tests.)

There is an apparently-irreducible remaining OOM hazard, which
is that if the connection fails at the libpq level (ie it's
CONNECTION_BAD) then we have to pstrdup the PGconn's error message
before we can release it, and theoretically that could fail.  However,
in such cases we're only leaking memory not a live remote connection,
so I'm not convinced that it's worth sweating over.

This is a pretty low-probability failure mode of course, but losing
a live connection seems bad enough to justify back-patching.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/1346940.1748381911@sss.pgh.pa.us
Backpatch-through: 13
2025-05-29 10:39:55 -04:00
Fujii Masao
3c4d7557e0 Fix assertion failure in pg_prewarm() on objects without storage.
An assertion test added in commit 049ef33 could fail when pg_prewarm()
was called on objects without storage, such as partitioned tables.
This resulted in the following failure in assert-enabled builds:

    Failed Assert("RelFileNumberIsValid(rlocator.relNumber)")

Note that, in non-assert builds, pg_prewarm() just failed with an error
in that case, so there was no ill effect in practice.

This commit fixes the issue by having pg_prewarm() raise an error early
if the specified object has no storage. This approach is similar to
the fix in commit 4623d7144 for pg_freespacemap.
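
A brief sketch of the failing call (table name hypothetical, with the
pg_prewarm extension installed); partitioned tables have no storage of
their own:

CREATE TABLE metrics (id int) PARTITION BY RANGE (id);
SELECT pg_prewarm('metrics');   -- now raises a clear error up-front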

Back-patched to v17, where the issue was introduced.

Author: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/e082e6027610fd0a4091ae6d033aa117@oss.nttdata.com
Backpatch-through: 17
2025-05-29 17:50:32 +09:00
Michael Paquier
c3623703f3 Add AioUringCompletion in wait_event_names.txt
Oversight in c325a7633fcb, where the LWLock tranche AioUringCompletion
has been added.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/aDT5sBOxJTdulXnE@paquier.xyz
2025-05-29 13:25:05 +09:00
Bruce Momjian
a1de1b0833 doc PG 18 relnotes: split apart log_connections item
Also add details to asynchronous I/O item.

Reported-by: Melanie Plageman

Discussion: https://postgr.es/m/CAAKRu_YsVvyantS0X0Y_-vp_97=yGaoYJMXXyCEkR7pumAH3Jg@mail.gmail.com
2025-05-28 22:43:36 -04:00
Michael Paquier
35a428f30b pg_stat_statements: Fix parameter number gaps in normalized queries
pg_stat_statements anticipates that certain constant locations may be
recorded multiple times and attempts to avoid calculating a length for
these locations in fill_in_constant_lengths().

However, during generate_normalized_query() where normalized query
strings are generated, these locations are not excluded from
consideration.  This could increment the parameter number counter for
every recorded occurrence at such a location, leading to an incorrect
normalization in certain cases with gaps in the numbers reported.

For example, take this query:
SELECT WHERE '1' IN ('2'::int, '3'::int::text)
Before this commit, it would be normalized like that, with gaps in the
parameter numbers:
SELECT WHERE $1 IN ($3::int, $4::int::text)
However the correct, less confusing one should be like that:
SELECT WHERE $1 IN ($2::int, $3::int::text)

This commit fixes the computation of the parameter numbers to track the
number of constants replaced with an $n by a separate counter instead of
the iterator used to loop through the list of locations.

The underlying query IDs are not changed, neither are the normalized
strings for existing PGSS hash entries.  New entries with fresh
normalized queries would automatically get reshaped based on the new
parameter numbering.

Issue discovered while discussing a separate problem for HEAD, but this
affects all the stable branches.

Author: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0tzxvWXsacGyxrixdhy3tTTDfJQqxyFBRFh31nNHBQ5qA@mail.gmail.com
Backpatch-through: 13
2025-05-29 11:26:03 +09:00
Bruce Momjian
089f27cf8a doc: clarify log_connections new "setup_durations" output 2025-05-28 21:42:34 -04:00
Bruce Momjian
bf6034d00d doc PG 18 relnotes: move ANALYZE item,split ANALYZE/EXPLAIN item
Reported-by: Yugo Nagata

Author: Yugo Nagata

Discussion: https://postgr.es/m/20250528232503.7db770f651c2c821c0e3c1df@sraoss.co.jp
2025-05-28 18:43:31 -04:00
Tom Lane
e5d64fd654 Tighten parsing of datetime input.
ParseFraction only expects to deal with fields that contain a decimal
point and digit(s).  However it's possible in some edge cases for it
to be passed input that doesn't look like that.  In particular the
input could look like a valid floating-point number, such as ".123e6".
strtod() will happily eat that, possibly producing a result that is
not within the expected range 0..1, which can result in integer
overflow in the callers.  That doesn't have any security consequences,
but it's still not very desirable.  Fix by checking that the input
has the expected form.

Similarly, DecodeNumberField only expects to deal with fields that
contain a decimal point and digit(s), but it's sometimes abused to
parse strings that might not look like that.  This could result in
failure to reject bogus input, yielding silly results.  Again, fix
by rejecting input that doesn't look as-expected.  That decision
also means that we can affirmatively answer the very old comment
questioning whether we couldn't save some duplicative code by
using ParseFractionalSecond here.

While these changes should only reject input that nobody would
consider valid, it still doesn't seem like a change to make in
stable branches.  Apply to HEAD only.

Reported-by: Evgeniy Gorbanev <gorbanev.es@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1328335.1748371099@sss.pgh.pa.us
2025-05-28 15:10:48 -04:00
Tom Lane
be86ca103a Fix memory leakage when function compilation fails.
In pl_comp.c, initially create the plpgsql function's cache context
under the assumed-short-lived caller's context, and reparent it under
CacheMemoryContext only upon success.  This avoids a process-lifespan
leak of 8kB or more if the function contains syntax errors.  (This
leakage has existed for a long time without many complaints, but as
we move towards a possibly multi-threaded future, getting rid of
process-lifespan leaks grows more important.)
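
As an example of the sort of case that previously leaked (the function
body is deliberately broken, names hypothetical):

CREATE FUNCTION broken() RETURNS int LANGUAGE plpgsql
AS $$ BEGIN $$;   -- missing END: compilation fails with a syntax error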

In funccache.c, arrange to reclaim the CachedFunction struct in case
the language-specific compile callback function throws an error;
previously, that resulted in an independent process-lifespan leak.
This is arguably a new bug in v18, since the leakage now occurred
for SQL-language functions as well as plpgsql.

Also, don't fill fn_xmin/fn_tid/dcallback until after successful
completion of the compile callback.  This avoids a scenario where a
partially-built function cache might appear already valid upon later
inspection, and another scenario where dcallback might fail upon being
presented with an incomplete cache entry.  We would have to reach such
a faulty cache entry via a pre-existing fn_extra pointer, so I'm not
sure these scenarios correspond to any live bug.  (The predecessor
code in pl_comp.c never took any care about this, and we've heard no
complaints about that.)  Still, it's better to be careful.

Given the lack of field complaints, I'm not very excited about
back-patching any of this; but it seems still in-scope for v18.

Discussion: https://postgr.es/m/999171.1748300004@sss.pgh.pa.us
2025-05-28 13:29:45 -04:00
Bruce Momjian
c861092b0e doc PG 18 relnotes: clarify multiplication item
Reported-by: Dean Rasheed

Author: Dean Rasheed

Discussion: https://postgr.es/m/CAEZATCXZGU3LLMZHobYys1MLpyNMAus7+UUpWeeFYwSaPNC2CA@mail.gmail.com
2025-05-28 12:34:11 -04:00
Michael Paquier
4fbb46f612 Adjust regex for test with opening parenthesis in character classes
As written, the test was throwing an error because of an unbalanced
parenthesis.  The regex used in the test is adjusted to not fail and to
test the case of an opening parenthesis in a character class after some
nested square brackets.

Oversight in d46911e584d4.

Discussion: https://postgr.es/m/16ab039d1af455652bdf4173402ddda145f2c73b.camel@cybertec.at
2025-05-28 09:43:31 +09:00
Michael Paquier
d46911e584 Fix conversion of SIMILAR TO regexes for character classes
The code that translates SIMILAR TO pattern matching expressions to
POSIX-style regular expressions did not consider that square brackets
can be nested.  For example, in an expression like [[:alpha:]%_], the
logic replaced the placeholders '_' and '%' but it should not have.

This commit fixes the conversion logic by tracking the nesting level of
square brackets marking character class areas, while considering that
in expressions like []] or [^]] the first closing square bracket is a
regular character.  Multiple tests are added to show how the conversions
should or should not be applied while in a character class area, with
specific cases added for all the characters converted outside character
classes, like an opening parenthesis '(', dollar sign '$', etc.
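
A quick illustration of the corrected behavior (values chosen
arbitrarily): inside the bracketed class, '%' and '_' are ordinary
characters rather than wildcards:

SELECT '%' SIMILAR TO '[[:alpha:]%_]';    -- true, matches the literal '%'
SELECT 'abc' SIMILAR TO '[[:alpha:]%_]';  -- false, the class matches one character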

Author: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/16ab039d1af455652bdf4173402ddda145f2c73b.camel@cybertec.at
Backpatch-through: 13
2025-05-28 08:58:40 +09:00
Bruce Momjian
3e782ca322 doc PG 18 relnotes: add removal details to MD5 item
Reported-by: Nathan Bossart

Author: Nathan Bossart

Discussion: https://postgr.es/m/aDXLoTcBYjfyqeTA@nathan
2025-05-27 17:50:52 -04:00
Bruce Momjian
08b8aa1748 doc PG 18 relnotes: fix markup
Reported-by: Peter Smith

Discussion: https://postgr.es/m/CAHut+PswZ7wFtpNgv3bdtYK5D0eGMpvz4CcnAxvj7gR_acazGQ@mail.gmail.com
2025-05-27 17:34:45 -04:00
Jeff Davis
34eb2a80d5 Change pg_dump default for statistics export.
Set the default behavior of pg_dump and pg_dumpall to be
--no-statistics.

Leave the default for pg_restore and pg_upgrade to be
--with-statistics.

Discussion: https://postgr.es/m/CA+TgmoZ9=RnWcCOZiKYYjZs_AW1P4QXCw--h4dOLLHuf1Omung@mail.gmail.com
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
2025-05-27 13:54:38 -07:00
Masahiko Sawada
4c08ecd161 Fix assertion when decrementing eager scanning success and failure counters.
Previously, we asserted that the eager scan's success and failure
counters were positive before decrementing them. However, this
assumption was incorrect, as it's possible that some blocks have
already been eagerly scanned by the time eager scanning is disabled.

This commit replaces the assertions with guards to handle this
scenario gracefully.

With this change, we continue to allow read-ahead operations by the
read stream that exceed the success and failure caps. While there is a
possibility that overruns will trigger eager scans of additional
pages, this does not pose a practical concern as the overruns will not
be substantial and remain within an acceptable range.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAD21AoConf6tkVCv-=JhQJj56kYsDwo4jG5+WqgT+ukSkYomSQ@mail.gmail.com
2025-05-27 11:42:36 -07:00
Peter Eisentraut
c53f3b9cc8 Improve file_copy_method entry in postgresql.conf.sample
Improve the wording of the comment a bit, fix whitespace.  Also move
the entry so that the section order is consistent with config.sgml.
2025-05-26 14:52:00 +02:00
Daniel Gustafsson
1f62dbf5f0 doc: Fix wording in JIT README
Remove superfluous 'is' from sentence.

Author: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/20250526154412.5f77dfead87af9afc089cc48@sraoss.co.jp
2025-05-26 13:30:01 +02:00
Michael Paquier
52a1df85f2 Fix race condition in subscription TAP test 021_twophase
The test did not wait for all the subscriptions to have caught up when
dropping the subscription "tab_copy".  In a slow environment, it could
be possible for the replay of the COMMIT PREPARED transaction "mygid"
to not be confirmed yet, causing one prepared transaction to be left
around before moving to the next steps of the test.

One failure noticed is a transaction found in pg_prepared_xacts for the
cases where copy_data = false and two_phase = true, but there should be
none after dropping the subscription.

As an extra safety measure, a check is added before dropping the
subscription, scanning pg_prepared_xacts to make sure that no prepared
transactions are left once both subscriptions have caught up.

Issue introduced by a8fd13cab0ba, fixing a problem similar to
eaf5321c3524.

Per buildfarm member kestrel.

Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CALDaNm329QaZ+bwU--bW6GjbNSZ8-38cDE8QWofafub7NV67oA@mail.gmail.com
Backpatch-through: 15
2025-05-26 17:28:37 +09:00
Amit Kapila
3bcb554fd2 Doc: Make logical replication examples executable in bulk.
To improve the usability of logical replication examples, we need to
enable bulk copy-pasting of DML/DDL series.

Currently, output command tags and prompts disrupt this workflow. While
prompts are typically removed, converting them to comments is acceptable
here, given the multi-server context.

Additionally, ensure all examples containing operators like < and > are
wrapped in CDATA blocks to guarantee correct rendering and consistency
with other places.

Author: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/CAKFQuwbhbL1uaDTuo9shmo1rA-fX6XGotR7qZQ7rd-ia5ZDoQA@mail.gmail.com
2025-05-26 11:05:05 +05:30
Fujii Masao
47d90b741d doc: Fix documentation for snapshot export in logical decoding.
The documentation for exported snapshots in logical decoding previously
stated that snapshot creation may fail on a hot standby. This is no longer
accurate, as snapshot exporting on standbys has been supported since
PostgreSQL 10. This commit removes the outdated description.

Additionally, the docs referred to the NOEXPORT_SNAPSHOT option to
suppress snapshot exporting in CREATE_REPLICATION_SLOT. However,
since PostgreSQL 15, NOEXPORT_SNAPSHOT is considered legacy syntax
and retained only for backward compatibility. This commit updates
the documentation for v15 and later to use the modern equivalent:
SNAPSHOT 'nothing'. The older syntax is preserved in documentation for
v14 and earlier.

Back-patched to all supported branches.

Reported-by: Kevin K Biju <kevinkbiju@gmail.com>
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Kevin K Biju <kevinkbiju@gmail.com>
Discussion: https://postgr.es/m/174791480466.798.17122832105389395178@wrigleys.postgresql.org
Backpatch-through: 13
2025-05-26 12:47:33 +09:00
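For illustration, on a replication connection the modern option syntax and the
legacy keyword look roughly like this (slot name is hypothetical):

    CREATE_REPLICATION_SLOT my_slot LOGICAL pgoutput (SNAPSHOT 'nothing');
    CREATE_REPLICATION_SLOT my_slot LOGICAL pgoutput NOEXPORT_SNAPSHOT;  -- legacy, pre-v15 style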
Bruce Momjian
44ce4e1593 doc PG 18 relnotes: clarify btree skip-scan item
Reported-by: Peter Geoghegan

Discussion: https://postgr.es/m/CAH2-Wzko57+sT=FcxHHo7jnPLhh35up_5aAvogLtj_D9bATsgQ@mail.gmail.com
2025-05-23 17:02:33 -04:00
Jacob Champion
a8f093234d oauth: Correct missing comma in Requires.private
I added libcurl to the Requires.private section of libpq.pc in commit
b0635bfda, but I missed that the Autoconf side needs commas added
explicitly. Configurations which used both --with-libcurl and
--with-openssl ended up with the following entry:

    Requires.private: libssl, libcrypto libcurl

The pkg-config parser appears to be fairly lenient in this case, and
accepts the whitespace as an equivalent separator, but let's not rely on
that. Add an add_to_list macro (inspired by Makefile.global's
add_to_path) to build up the PKG_CONFIG_REQUIRES_PRIVATE list correctly.

Reported-by: Wolfgang Walther <walther@technowledgy.de>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
Discussion: https://postgr.es/m/CAOYmi+k2z7Rqj5xiWLUT0+bSXLvdE7TYgS5gCOSqSyXyTSSXiQ@mail.gmail.com
2025-05-23 13:05:38 -07:00
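With the comma added, the expected entry for that configuration reads:

    Requires.private: libssl, libcrypto, libcurl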
Jacob Champion
cbc8fd0c9a oauth: Limit JSON parsing depth in the client
Check the ctx->nested level as we go, to prevent a server from running
the client out of stack space.

The limit we choose when communicating with authorization servers can't
be overly strict, since those servers will continue to add extensions in
their JSON documents which we need to correctly ignore. For the SASL
communication, we can be more conservative, since there are no defined
extensions (and the peer is probably more Postgres code).

Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://postgr.es/m/CAOYmi%2Bm71aRUEi0oQE9ciBnBS8xVtMn3CifaPu2kmJzUfhOZgA%40mail.gmail.com
2025-05-23 13:05:33 -07:00
Bruce Momjian
1ca583f6c0 doc PG 18 relnotes: update to current
Includes runtime injection point item by Michael Paquier.

Reported-by: Michael Paquier

Author: Michael Paquier

Discussion: https://postgr.es/m/aDAS0_eWzeGl4sok@paquier.xyz
2025-05-23 16:01:07 -04:00
Tom Lane
02502c1bca Fix per-relation memory leakage in autovacuum.
PgStat_StatTabEntry and AutoVacOpts structs were leaked until
the end of the autovacuum worker's run, which is bad news if
there are a lot of relations in the database.

Note: pfree'ing the PgStat_StatTabEntry structs here seems a bit
risky, because pgstat_fetch_stat_tabentry_ext does not guarantee
anything about whether its result is long-lived.  It appears okay
so long as autovacuum forces PGSTAT_FETCH_CONSISTENCY_NONE, but
I think that API could use a re-think.

Also ensure that the VacuumRelation structure passed to
vacuum() is in recoverable storage.

Back-patch to v15 where we started to manage table statistics
this way.  (The AutoVacOpts leakage is probably older, but
I'm not excited enough to worry about just that part.)

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
Backpatch-through: 15
2025-05-23 14:43:43 -04:00
Tom Lane
6aa33afe6d Fix AlignedAllocRealloc to cope sanely with OOM.
If the inner allocation call returns NULL, we should restore the
previous state and return NULL.  Previously this code pfree'd
the old chunk anyway, which is surely wrong.

Also, make it call MemoryContextAllocationFailure rather than
summarily returning NULL.  The fact that we got control back from the
inner call proves that MCXT_ALLOC_NO_OOM was passed, so this change
is just cosmetic, but someday it might be less so.

This is just a latent bug at present: AFAICT no in-core callers use
this function at all, let alone call it with MCXT_ALLOC_NO_OOM.
Still, it's the kind of bug that might bite back-patched code pretty
hard someday, so let's back-patch to v17 where the bug was introduced
(by commit 743112a2e).

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
Backpatch-through: 17
2025-05-23 11:47:33 -04:00
Daniel Gustafsson
fb844b9f06 Revert function to get memory context stats for processes
Due to concerns raised about the approach, and memory leaks found
in sensitive contexts, the functionality is reverted. This reverts
commits 45e7e8ca9, f8c115a6c, d2a1ed172, 55ef7abf8 and 042a66291
for v18 with an intent to revisit this patch for v19.

Discussion: https://postgr.es/m/594293.1747708165@sss.pgh.pa.us
2025-05-23 15:44:54 +02:00
Peter Eisentraut
70a13c528b Move oauth_validator_libraries in postgresql.conf.sample
Move oauth_validator_libraries in postgresql.conf.sample to be grouped
with the other CONN_AUTH_AUTH settings, rather than making up a new
ad-hoc category.  This matches the internal categorization and also
how it is listed in the documentation.
2025-05-23 09:03:09 +02:00
Bruce Momjian
883339c170 doc PG 18 relnotes: adjust CREATE SUBSCRIPTION attribution
Reported-by: vignesh C

Discussion: https://postgr.es/m/CALDaNm0Wy-vJ6dE+e=y=yuq31i2KvGf-Rs-u6QOG4K7TpU_6Tw@mail.gmail.com
2025-05-22 23:02:11 -04:00
Bruce Momjian
7ddfac79f2 doc PG 18 relnotes: clarify btree skip scan item
Reported-by: Peter Geoghegan

Discussion: https://postgr.es/m/CAH2-Wz=2CWXgO1+uyR-VfN3ALMtFnfTtXK-VtkoQQ89ogm=4sg@mail.gmail.com
2025-05-22 22:24:18 -04:00
Bruce Momjian
3b7140d27e doc PG 18 relnotes: remove duplicate commit entry
Item related to btree skip scans.
2025-05-22 21:41:38 -04:00
Tom Lane
b7ab88ddb1 Fix assorted new memory leaks in libpq.
Valgrind'ing the postgres_fdw tests showed me that libpq was leaking
PGconn.be_cancel_key.  It looks like freePGconn is expecting
pqDropServerData to release it ... but in a cancel connection
object, that doesn't happen.

Looking a little closer, I was dismayed to find that freePGconn
also missed freeing the pgservice, min_protocol_version,
max_protocol_version, sslkeylogfile, scram_client_key_binary,
and scram_server_key_binary strings.  There's much less excuse
for those oversights.  Worse, that's from five different commits
(a460251f0, 4b99fed75, 285613c60, 2da74d8d6, 761c79508),
some of them by extremely senior hackers.

Fortunately, all of these are new in v18, so we haven't
shipped any leaky versions of libpq.

While at it, reorder the operations in freePGconn to match the
order of the fields in struct PGconn.  Some of those free's seem
to have been inserted with the aid of a dartboard.
2025-05-22 20:35:32 -04:00
Melanie Plageman
cb1456423d Replace deprecated log_connections values in docs and tests
9219093cab2607f modularized log_connections output to allow more
granular control over which aspects of connection establishment are
logged. It converted the boolean log_connections GUC into a list of strings
and deprecated previously supported boolean-like values on, off, true,
false, 1, 0, yes, and no. Those values still work, but they are
supported mainly for backwards compatibility. As such, documented
examples of log_connections should not use these deprecated values.

Update references in the docs to deprecated log_connections values. Many
of the tests use log_connections. This commit also updates the tests to
use the new values of log_connections. In some of the tests, the updated
log_connections value covers a narrower set of aspects (e.g. the
'authentication' aspect in the tests in src/test/authentication and the
'receipt' aspect in src/test/postmaster). In other cases, the new value
for log_connections is a superset of the previous included aspects (e.g.
'all' in src/test/kerberos/t/001_auth.pl).

Reported-by: Peter Eisentraut <peter@eisentraut.org>
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/e1586594-3b69-4aea-87ce-73a7488cdc97%40eisentraut.org
2025-05-22 17:14:54 -04:00
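For example, a postgresql.conf entry using the new list form might be (the
exact set of aspects depends on what needs to be logged):

    log_connections = 'receipt,authentication'    # or 'all'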
Tom Lane
d376ab570e In ExecInitModifyTable, don't scribble on the source plan.
The code carelessly modified mtstate->ps.plan->targetlist,
which it's not supposed to do.  Fortunately, there's not
really any need to do that because the planner already
set up a perfectly acceptable targetlist for the plan node.
We just need to remove the erroneous assignments and update some
relevant comments.

As it happens, the erroneous assignments caused the targetlist to
point to a different part of the source plan tree, so that there
isn't really a risk of the pointer becoming dangling after executor
termination.  The only visible effect of this change we can find is
that EXPLAIN will show upper references to the ModifyTable's output
expressions using different variables.  Formerly it showed Vars from
the first target relation that survived executor-startup pruning.
Now it always shows such references using the first relation appearing
in the planner output, independently of what happens during executor
pruning.  On the whole that seems like a good thing.

Also make a small tweak in ExplainPreScanNode to ensure that the first
relation will receive a refname assignment in set_rtable_names, even
if it got pruned at startup.  Previously the Vars might be shown
without any table qualification, which is confusing in a multi-table
query.

I considered back-patching this, but since the bug doesn't seem to
have any really terrible consequences in existing branches, it
seems better to not change their EXPLAIN output.  It's not too late
for v18 though, especially since v18 already made other changes in
the EXPLAIN output for these cases.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Andres Freund <andres@anarazel.de>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/213261.1747611093@sss.pgh.pa.us
2025-05-22 14:28:51 -04:00
Tom Lane
f24605e2dc Fix memory leak in XMLSERIALIZE(... INDENT).
xmltotext_with_options sometimes tries to replace the existing
root node of a libxml2 document.  In that case xmlDocSetRootElement
will unlink and return the old root node; if we fail to free it,
it's leaked for the remainder of the session.  The amount of memory
at stake is not large, a couple hundred bytes per occurrence, but
that could still become annoying in heavy usage.

Our only other xmlDocSetRootElement call is not at risk because
it's working on a just-created document, but let's modify that
code too to make it clear that it's dependent on that.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Discussion: https://postgr.es/m/1358967.1747858817@sss.pgh.pa.us
Backpatch-through: 16
2025-05-22 13:52:46 -04:00
Nathan Bossart
5d6eac80cd pg_dump: Adjust reltuples from 0 to -1 for dumps of older versions.
Before v14, a reltuples value of 0 was ambiguous: it could either
mean the relation is empty, or it could mean that it hadn't yet
been vacuumed or analyzed.  (Commit 3d351d916b taught v14 and newer
to use -1 for the latter case.)  This ambiguity allegedly can cause
the planner to choose inefficient plans after restoring to v18 or
newer.  To fix, let's just dump reltuples as -1 in that case.  This
will cause some truly empty tables to be seen as not-yet-processed,
but that seems unlikely to cause too much trouble in practice.

Note that we could alternatively teach pg_restore_relation_stats()
to translate reltuples based on the version argument, but since
that function doesn't exist until v18, there's no particular
advantage to that approach.  That is, there's no chance of
restoring stats dumped from a pre-v14 server to another pre-v14
server.  Per discussion, the current policy is to fix pre-v18
behavior differences during export and everything else during
import.

Commit 9879105024 fixed a similar problem for vacuumdb by removing
the check for reltuples != 0.  Presumably we could reinstate that
check now, but I've chosen to leave it in place in case reltuples
isn't accurate.  As before, processing some empty tables seems
relatively harmless.

Author: Hari Krishna Sunder <hari.db.pg@gmail.com>
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/CAAeiqZ0o2p4SX5_xPcuAbbsmXjg6MJLNuPYSLUjC%3DWh-VeW64A%40mail.gmail.com
2025-05-22 10:23:26 -05:00
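To illustrate the convention (hypothetical table name), after restoring stats
dumped from a pre-v14 server:

    SELECT relname, reltuples FROM pg_class WHERE relname = 'my_table';
    -- reltuples = -1 : not yet vacuumed or analyzed (v14+ convention)
    -- reltuples =  0 : relation was observed to be empty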
Amit Langote
1722d5eb05 Revert "Don't lock partitions pruned by initial pruning"
As pointed out by Tom Lane, the patch introduced fragile and invasive
design around plan invalidation handling when locking of prunable
partitions was deferred from plancache.c to the executor. In
particular, it violated assumptions about CachedPlan immutability and
altered executor APIs in ways that are difficult to justify given the
added complexity and overhead.

This also removes the firstResultRels field added to PlannedStmt in
commit 28317de72, which was intended to support deferred locking of
certain ModifyTable result relations.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/605328.1747710381@sss.pgh.pa.us
2025-05-22 17:02:35 +09:00
Peter Eisentraut
f3622b6476 doc: Move documentation of md5_password_warnings to a better place
Commit db6a4a985bc categorized md5_password_warnings as an
authentication setting, and the placement in postgresql.conf.sample
matches that, but in the documentation it ended up under logging
settings, which isn't unreasonable but inconsistent.  This moves the
documentation chunk to authentication settings as well.
2025-05-21 16:29:05 +02:00
Michael Paquier
3d0c3a418f Adjust operation names of pg_aios to match the documentation
pg_aios used the terms "read" and "write" for vectored I/O read and
write operations, respectively.  The documentation refers to them as
"readv" and "writev", and the code uses internally the terms
PGAIO_OP_READV and PGAIO_OP_WRITEV for them, as of "vectored".

This commit adjusts these operation names to match with the code and the
documentation.

Oversight in 8e293e689bab.

Author: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Discussion: https://postgr.es/m/6df1e949d1d759ad2767c18e5845963e@oss.nttdata.com
2025-05-21 15:58:03 +09:00
Fujii Masao
0bd762e81f Fix incorrect WAL description for PREPARE TRANSACTION record.
Since commit 8b1dccd37c7, the PREPARE TRANSACTION WAL record includes
information about dropped statistics entries. However, the WAL resource
manager description function for PREPARE TRANSACTION record failed to
parse this information correctly and always assumed there were
no such entries.

As a result, for example, pg_waldump could not display the dropped
statistics entries stored in PREPARE TRANSACTION records.

The root cause was that ParsePrepareRecord() did not set the number of
statistics entries to drop on commit or abort. These values remained
zero-initialized and were never updated from the parsed record.

This commit fixes the issue by properly setting those values during parsing.
With this fix, pg_waldump can now correctly report dropped statistics
entries in PREPARE TRANSACTION records.

Back-patch to v15, where commit 8b1dccd37c7 was introduced.

Author: Daniil Davydov <3danissimo@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAJDiXgh-6Epb2XiJe4uL0zF-cf0_s_7Lw1TfEHDMLzYjEmfGOw@mail.gmail.com
Backpatch-through: 15
2025-05-21 11:55:14 +09:00
Michael Paquier
06450c7b8c Fix regression with location calculation of nested statements
The statement location calculated for some nested query cases was wrong
when multiple queries are sent as a single string, separated by
semicolons.  As pointed out by Sami Imseih, the location calculation was
incorrect when the last query of a nested statement with multiple
queries does **NOT** finish with a semicolon.  In this case, the
statement length tracked by RawStmt is 0, which is equivalent to saying
that the string should be used until its end.  The code previously
discarded this case entirely, causing the location to remain at 0, the
same as pointing at the beginning of the string.  This caused
pg_stat_statements to store incorrect query strings.

This issue was introduced in 499edb09741b.  I have looked at the
diffs generated by pgaudit back then, and noticed the difference
generated for this nested query case, but I missed the point that
it was an actual regression with an existing case.  A test case is added
in pg_stat_statements to provide some coverage, restoring the pre-17
behavior for the calculation of the query locations.  Special thanks to
David Steele, who, through an analysis of the test diffs generated by
pgaudit with the new v18 logic, has poked me about the fact that my
original analysis of the matter was wrong.

The test output of pg_overexplain is updated to reflect the new logic,
as the new locations refer to the beginning of the argument passed to
the function explain_filter().  When the module was introduced in
8d5ceb113e3f, which was after 499edb09741b (for the new calculation
method), the locations of the test were not actually right: the plan
generated for the query string given in input of the function pointed to
the top-level query, not the nested one.

Reported-by: David Steele <david@pgbackrest.org>
Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: David Steele <david@pgbackrest.org>
Discussion: https://postgr.es/m/844a3b38-bbf1-4fb2-9fd6-f58c35c09917@pgbackrest.org
2025-05-21 10:22:12 +09:00
Nathan Bossart
a6060f1cbe pg_dump: Fix array literals in fetchAttributeStats().
Presently, fetchAttributeStats() builds array literals by treating
the elements as SQL identifiers.  This is incorrect for a couple of
reasons:

* Array literal content must match the external text representation
  of the array, i.e., what array_out() would return.  One notable
  problem is that double quotes are escaped with "" in identifiers
  but with \" in array literals.  To fix, build the array content
  using the pre-existing appendPGArray() function.

* Array literals must be written as string constants.  A notable
  problem here is that single quotes are escaped via '' in strings
  but are not escaped in the text representation of an array.  To
  fix, append the aforementioned array literal content to the query
  with appendStringLiteralAH().

While at it, modify a test case to use an identifier that would
cause the test to fail without this change.

Oversight in commit 9c02e3a986.

Reported-by: Philippe Beaudoin <pbh.emaj@free.fr>
Author: Jian He <jian.universality@gmail.com>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Co-authored-by: Stepan Neretin <slpmcf@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Bug: #18923
Discussion: https://postgr.es/m/18923-e79273f87c6bed69%40postgresql.org
2025-05-20 16:31:00 -05:00
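The difference in escaping rules can be seen with a hypothetical column name
containing a double quote:

    CREATE TABLE t ("col""umn" int);   -- identifier: embedded quote is doubled
    SELECT '{"col\"umn"}'::text[];     -- array literal in a string constant: quote is backslash-escaped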
Heikki Linnakangas
cbf53e2b8a Fix cross-version upgrade test failure
Commit 29f7ce6fe7 added another view that needs adjustment in the
cross-version upgrade test. This should fix the XversionUpgrade
failures in the buildfarm.

Backpatch-through: 16
Discussion: https://www.postgresql.org/message-id/18929-077d6b7093b176e2@postgresql.org
2025-05-20 10:39:14 +03:00
Michael Paquier
54675d8986 doc: Clarify use of _ccnew and _ccold in REINDEX CONCURRENTLY
Invalid indexes are suffixed with "_ccnew" or "_ccold".  The
documentation neglected to mention the initial underscore.
ChooseRelationName() may also append an extra number if indexes with a
similar name already exist; let's add a note about that too.

Author: Alec Cozens <acozens@pixelpower.com>
Discussion: https://postgr.es/m/174733277404.1455388.11471370288789479593@wrigleys.postgresql.org
Backpatch-through: 13
2025-05-20 14:39:06 +09:00
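Such leftover invalid indexes can be listed with a catalog query along these
lines:

    SELECT c.relname
    FROM pg_index i
    JOIN pg_class c ON c.oid = i.indexrelid
    WHERE NOT i.indisvalid;
    -- names look like "idx_ccnew", "idx_ccold", or "idx_ccnew1" if the name was already taken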
Andres Freund
acad909321 aio: Fix possible state confusions due to interrupt processing
elog()/ereport() process interrupts iff the log message is < ERROR and the
log message will be emitted. aio's debug messages are emitted via ereport(),
but in some places the code is not ready for interrupts to be processed.

Fix the issue using a few different methods:

1) handle interrupts arriving concurrently - in some places it's easy to
   detect that by fetching the handle's generation a bit earlier
2) Check if interrupts made the work needing to be done obsolete
3) Disallow interrupts, as there's no sane way to make interrupt processing
   safe

To prevent some similar issues from being re-introduced, assert that
interrupts are held in pgaio_io_update_state().

This commit also fixes the contents of a debug message I added in 039bfc457e4.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/mvpm7ga3dfgz7bvum22hmuz26cariylmcppb3irayftc7bwk3r@l7gb6gr7azhc
2025-05-19 21:07:06 -04:00
Heikki Linnakangas
29f7ce6fe7 Fix deparsing FETCH FIRST <expr> ROWS WITH TIES
In the grammar, <expr> is a c_expr, which accepts only a limited set
of integer literals and simple expressions without parens. The
deparsing logic didn't quite match the grammar rule, and failed to use
parens e.g. for "5::bigint".

To fix, always surround the expression with parens. It would be nice to
omit the parens in simple cases, but unfortunately it's non-trivial to
detect such simple cases. Even if the expression is a simple literal
123 in the original query, after parse analysis it becomes a FuncExpr
with COERCE_IMPLICIT_CAST rather than a simple Const.

Reported-by: yonghao lee
Backpatch-through: 13
Discussion: https://www.postgresql.org/message-id/18929-077d6b7093b176e2@postgresql.org
2025-05-19 18:50:26 +03:00
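For example (hypothetical table t), a cast is not a c_expr, so the
parenthesized form is what the grammar accepts and what the deparsing now
always emits:

    SELECT * FROM t ORDER BY a FETCH FIRST (5::bigint) ROWS WITH TIES;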
Amit Kapila
ad5eaf390c Don't retreat slot's confirmed_flush LSN.
Prevent moving the confirmed_flush backwards, as this could lead to data
duplication issues caused by replicating already replicated changes.

This can happen when a client acknowledges an LSN it doesn't have to do
anything for, and thus didn't store persistently. After a restart, the
client can send the prior LSN that it stored persistently as an
acknowledgement, but we need to ignore such an LSN to avoid retreating
the confirmed_flush LSN.

Diagnosed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Author: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Tested-by: Nisha Moond <nisha.moond412@gmail.com>
Backpatch-through: 13
Discussion: https://postgr.es/m/CAJpy0uDZ29P=BYB1JDWMCh-6wXaNqMwG1u1mB4=10Ly0x7HhwQ@mail.gmail.com
Discussion: https://postgr.es/m/OS0PR01MB57164AB5716AF2E477D53F6F9489A@OS0PR01MB5716.jpnprd01.prod.outlook.com
2025-05-19 12:13:06 +05:30
Tom Lane
f8db5c7a3f Doc: add pre-branch task to run src/tools/copyright.pl.
It's common for some files with last year's copyright date
to sneak into the tree between early January (when we normally run
copyright.pl) and feature freeze.  Immediately before branching
the new release is an ideal time to fix the stragglers, so add a
note about it to the RELEASE_CHANGES checklist.

Discussion: https://postgr.es/m/CALa6HA4_Wu7-2PV0xv-Q84cT8eG7rTx6bdjUV0Pc=McAwkNMfQ@mail.gmail.com
2025-05-18 23:31:44 -04:00
Michael Paquier
2c6469d4cd Fix incorrect year in some copyright notices
A couple of new files have been added in the tree with a copyright year
of 2024 while we were already in 2025.  These should be marked with
2025, so let's fix them.

Reported-by: Shaik Mohammad Mujeeb <mujeeb.sk.dev@gmail.com>
Discussion: https://postgr.es/m/CALa6HA4_Wu7-2PV0xv-Q84cT8eG7rTx6bdjUV0Pc=McAwkNMfQ@mail.gmail.com
2025-05-19 09:46:52 +09:00
Michael Paquier
11b2dc3709 ecpg: Add missing newline in meson.build
Noticed while performing a routine sanity check of the files in the
tree.  Issue introduced by 28f04984f0c2.

Discussion: https://postgr.es/m/CALa6HA4_Wu7-2PV0xv-Q84cT8eG7rTx6bdjUV0Pc=McAwkNMfQ@mail.gmail.com
2025-05-19 09:44:17 +09:00
Alexander Korotkov
3d3a81fc24 Fix tuple_fraction calculation in generate_orderedappend_paths()
6b94e7a6da adjusted generate_orderedappend_paths() to consider fractional
paths.  However, it didn't manage to interpret the tuple_fraction value
correctly.  According to the header comment of grouping_planner(), a
tuple_fraction >= 1 specifies the absolute number of expected tuples.  That
number must be divided by the expected total number of tuples to get the
actual fraction.

Even though this is a bug fix, we don't backpatch it.  The risks of the side
effects of plan changes on stable branches are too high.

Reported-by: Andrei Lepikhov <lepihov@gmail.com>
Discussion: https://postgr.es/m/3ca271fa-ca5c-458c-8934-eb148622b270%40gmail.com
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
2025-05-18 23:49:50 +03:00
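For example, if tuple_fraction is 100 and the ordered append path is expected
to return 10,000 tuples in total, the fraction actually used is
100 / 10000 = 0.01.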
Tom Lane
12eee85e51 Make our usage of memset_s() conform strictly to the C11 standard.
Per the letter of the C11 standard, one must #define
__STDC_WANT_LIB_EXT1__ as 1 before including <string.h> in order to
have access to memset_s().  It appears that many platforms are lenient
about this, because we weren't doing it and yet the code appeared to
work anyway.  But we now find that with -std=c11, macOS is strict and
doesn't declare memset_s, leading to compile failures since we try to
use it anyway.  (Given the lack of prior reports, perhaps this is new
behavior in the latest SDK?  No matter, we're clearly in the wrong.)

In addition to the immediate problem, which could be fixed merely by
adding the needed #define to explicit_bzero.c, it seems possible that
our configure-time probe for memset_s() could fail in case a platform
implements the function in some odd way due to this spec requirement.
This concern can be fixed in largely the same way that we dealt with
strchrnul() in 6da2ba1d8: switch to using a declaration-based
configure probe instead of a does-it-link probe.

Back-patch to v13 where we started using memset_s().

Reported-by: Lakshmi Narayana Velayudam <dev.narayana.v@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAA4pTnLcKGG78xeOjiBr5yS7ZeE-Rh=FaFQQGOO=nPzA1L8yEA@mail.gmail.com
Backpatch-through: 13
2025-05-18 12:45:55 -04:00
Daniel Gustafsson
0d4dad200d Fix function name reference in comment
Ensure that we refer to the function being used, rather than the
name of the resulting function in question.

Author: Paul A Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA+renyVZNiHEv5ceKDjA4j5xC6NT6mRuW33BDERBQMi_90_t6A@mail.gmail.com
2025-05-18 10:05:38 +02:00
Daniel Gustafsson
5987553fde Align organization wording in copyright statement
This aligns the copyright and legal notice wording with commit
a233a603bab8 and pgweb commit 2d764dbc083ab8.  Backpatch down
to all supported versions.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Dave Page <dpage@pgadmin.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/744E414E-3F52-404C-97FB-ED9B3AA37DC8@yesql.se
Backpatch-through: 13
2025-05-16 11:20:07 -04:00
Richard Guo
fe29b2a1da Fix Assert failure in XMLTABLE parser
In an XMLTABLE expression, columns can be marked NOT NULL, and the
parser internally fabricates an option named "is_not_null" to
represent this.  However, the parser also allows users to specify
arbitrary option names.  This creates a conflict: a user can
explicitly use "is_not_null" as an option name and assign it a
non-Boolean value, which violates internal assumptions and triggers an
assertion failure.

To fix, this patch checks whether a user-supplied name collides with
the internally reserved option name and raises an error if so.
Additionally, the internal name is renamed to "__pg__is_not_null" to
further reduce the risk of collision with user-defined names.

Reported-by: Евгений Горбанев <gorbanyoves@basealt.ru>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/6bac9886-65bf-4cec-96bd-e304159f28db@basealt.ru
Backpatch-through: 15
2025-05-15 17:09:04 +09:00
Richard Guo
2c0ed86d39 Add explicit initialization for all PlannerGlobal fields
When creating a new PlannerGlobal node in standard_planner(), most
fields are explicitly initialized, but a few are not.  This doesn't
cause any functional issues, as makeNode() zeroes all fields by
default.  However, the inconsistency is undesirable from a clarity and
maintenance perspective.

This patch explicitly initializes the remaining fields to improve
consistency and readability.

Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAMbWs4-TgQHNOiouqGcuHoBqbJjWyx4UxGKxUY3FrF4trGbcPA@mail.gmail.com
2025-05-14 09:59:31 +09:00
Daniel Gustafsson
6e289f2d5d Fix order of parameters in POD documentation
The documentation for log_check() had the parameters in the wrong
order.  Also while there, rename %parameters to %params to match the
documentation for similar functions which use %params.  Backpatch
down to v14 where this was introduced.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/9F503B5-32F2-45D7-A0AE-952879AD65F1@yesql.se
Backpatch-through: 14
2025-05-13 07:29:14 -04:00
Amit Kapila
8ede692de5 Fix the race condition in the test added by 7c99dc587.
After executing ALTER SUBSCRIPTION tap_sub SET PUBLICATION, we did not
wait for the new walsender process to restart. As a result, an INSERT
executed immediately after the ALTER could be decoded and skipped,
since it is not part of any subscribed publication. And the old
apply worker could also confirm the LSN of such an INSERT. This could
cause the replication to resume from a point after the INSERT. In such
cases, we miss the expected warning about the missing publication.

To fix this, ensure the walsender has restarted before continuing after
ALTER SUBSCRIPTION.

Reported-by: Tom Lane as per CI
Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/1230066.1745992333@sss.pgh.pa.us
2025-05-13 09:54:29 +05:30
Álvaro Herrera
dbf42b84ac Add tab-complete for ALTER DOMAIN ADD [CONSTRAINT]
We can add tab-completion with "CHECK (" and "NOT NULL" after ALTER
DOMAIN ADD [CONSTRAINT].

ALTER DOMAIN dom ADD -> CHECK (
ALTER DOMAIN dom ADD -> NOT NULL
ALTER DOMAIN dom ADD -> CONSTRAINT
ALTER DOMAIN dom ADD CONSTRAINT nm -> CHECK (
ALTER DOMAIN dom ADD CONSTRAINT nm -> NOT NULL

Author: jian he <jian.universality@gmail.com>
Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://postgr.es/m/CACJufxG_f6LzAT_McC-kKmQWpuWnOYKyNBw8Kv3xzTjPqmeHcA@mail.gmail.com
2025-05-11 10:16:45 -04:00
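A completed statement of this form would look like (hypothetical domain and
constraint names):

    ALTER DOMAIN positive_int ADD CONSTRAINT positive_int_check CHECK (VALUE > 0);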
Álvaro Herrera
0588656366 Fix comment of tsquerysend()
The comment describes the order in which fields are sent, and it had one
of the fields in the wrong place.

This has been wrong since e6dbcb72fafa (2008), so backpatch all the way
back.

Author: Emre Hasegeli <emre@hasegeli.com>
Discussion: https://postgr.es/m/CAE2gYzzf38bR_R=izhpMxAmqHXKeM5ajkmukh4mNs_oXfxcMCA@mail.gmail.com
2025-05-11 09:47:10 -04:00
Álvaro Herrera
dc9a2d54fd relcache: Avoid memory leak on tables with no CHECK constraints
As complained about by Valgrind, in commit a379061a22a8 I failed to
realize that I was causing rd_att->constr->check to become allocated
when no CHECK constraints exist; previously it'd remain NULL.  (This was
my bug, not the mentioned commit author's.)  Fix by making the
allocation conditional, and by setting ->check to NULL if unallocated.

Reported-by: Yasir <yasir.hussain.shah@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/202505082025.57ijx3qrbx7u@alvherre.pgsql
2025-05-11 09:22:12 -04:00
Álvaro Herrera
7b2ad43426 Sort includes in alphabetical order
Added by commit 042a66291b04, no backpatch needed.
2025-05-11 09:15:05 -04:00
Tom Lane
d4a7e4e179 Fix incorrect "return NULL" in BumpAllocLarge().
This must be "return MemoryContextAllocationFailure(context, size, flags)"
instead.  The effect of this oversight is that if we got a malloc
failure right here, the code would act as though MCXT_ALLOC_NO_OOM
had been specified, whether it was or not.  That would likely lead
to a null-pointer-dereference crash at the unsuspecting call site.

Noted while messing with a patch to improve our Valgrind leak
detection support.  Back-patch to v17 where this code came in.
2025-05-10 20:22:39 -04:00
Noah Misch
4a4ee0c2c1 Remove GLOBALTABLESPACE_OID assert for locked buffers.
Commit f4ece891fc2f3f96f0571720a1ae30db8030681b added the assertion in
an attempt to catch some defects even after VACUUM FULL or REINDEX.
However, IsCatalogTextUniqueIndexOid(tag.relNumber) always returns false
after a relfilenode change, provoking unintended assertion failures.

Reported-by: Adam Guo <adamguo@amazon.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Bug: #18912
Discussion: https://postgr.es/m/18912-a41c9bd0e0ad19b1@postgresql.org
2025-05-10 07:36:27 -07:00
Bruce Momjian
99ddf8615c doc PG 18 relnotes: mv. hash joins and GROUP BY item to General
Reported-by: David Rowley

Discussion: https://postgr.es/m/CAApHDvqJz+Zf7a6abisqoTGottDSRD+YPx=aQSgCsCKD476vGA@mail.gmail.com
2025-05-09 23:40:02 -04:00
Michael Paquier
c259ba881c aio: Use runtime arguments with injections points in tests
This cleans up the code related to the testing infrastructure of AIO
that used injection points, switching the test code to use the new
facility for injection points added by 371f2db8b05e rather than tweaks
that pass and reset arguments to the callbacks being run.

This removes all the dependencies to USE_INJECTION_POINTS in the AIO
code.  pgaio_io_call_inj(), pgaio_inj_io_get() and pgaio_inj_cur_handle
are now gone.

Reviewed-by: Greg Burd <greg@burd.me>
Discussion: https://postgr.es/m/Z_y9TtnXubvYAApS@paquier.xyz
2025-05-10 12:36:57 +09:00
Michael Paquier
36e5fda632 injection_points: Add support and tests for runtime arguments
This commit provides some test coverage for the runtime arguments of
injection points, for both INJECTION_POINT_CACHED() and
INJECTION_POINT(), as extended in 371f2db8b05e.

The SQL functions injection_points_cached() and injection_points_run()
are extended so that it is possible to pass an optional string value to
them.

Reviewed-by: Greg Burd <greg@burd.me>
Discussion: https://postgr.es/m/Z_y9TtnXubvYAApS@paquier.xyz
2025-05-10 07:40:25 +09:00
Michael Paquier
371f2db8b0 Add support for runtime arguments in injection points
The macros INJECTION_POINT() and INJECTION_POINT_CACHED() are extended
with an optional argument that can be passed down to the callback
attached when an injection point is run, giving callbacks the
possibility to manipulate a stack state given by the caller.  The
existing callbacks in modules injection_points and test_aio have their
declarations adjusted based on that.

da7226993fd4 (core AIO infrastructure) and 93bc3d75d8e1 (test_aio) have
been relying on a set of workarounds where a static variable called
pgaio_inj_cur_handle is used as runtime argument in the injection point
callbacks used by the AIO tests, in combination with a TRY/CATCH block
to reset the argument value.  The infrastructure introduced in this
commit will be reused for the AIO tests, simplifying them.

Reviewed-by: Greg Burd <greg@burd.me>
Discussion: https://postgr.es/m/Z_y9TtnXubvYAApS@paquier.xyz
2025-05-10 06:56:26 +09:00
Bruce Momjian
89372d0aaa doc PG 18 relnotes: fix missing parens for crc32c()
Reported-by: Steven Niu

Discussion: https://postgr.es/m/CABBtG=ejqK58cFWpw3etVZfQfhjC-qOqV+9GQWRnLO+p9wYMbw@mail.gmail.com
2025-05-09 14:16:17 -04:00
Tom Lane
95129709fd Skip RSA-PSS ssl test when using LibreSSL.
Presently, LibreSSL does not have working support for RSA-PSS,
so disable that test.  Per discussion at
https://marc.info/?l=libressl&m=174664225002441&w=2
they do intend to fix this, but it's a ways off yet.

Reported-by: Thomas Munro <thomas.munro@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA+hUKG+fLqyweHqFSBcErueUVT0vDuSNWui-ySz3+d_APmq7dw@mail.gmail.com
Backpatch-through: 15
2025-05-09 12:29:01 -04:00
Tom Lane
75d73331d0 Hack one ssl test case to pass with current LibreSSL.
With LibreSSL, our test of error logging for cert chain depths > 0
reports the wrong certificate.  This is almost certainly their bug
not ours, so just tweak the test to accept their answer.

No back-patch needed, since this test case wasn't enabled before
e0f373ee4.

Reported-by: Thomas Munro <thomas.munro@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA+hUKG+fLqyweHqFSBcErueUVT0vDuSNWui-ySz3+d_APmq7dw@mail.gmail.com
2025-05-09 11:53:51 -04:00
Tom Lane
0aaf69965d Centralize ssl tests' check for whether we're using LibreSSL.
Right now there's only one caller, so that this is merely
an exercise in shoving code from one module to another,
but there will shortly be another one.  It seems better to
avoid having two copies of this highly-subject-to-change test.

Back-patch to v15, where we first introduced some tests that
don't work with LibreSSL.

Reported-by: Thomas Munro <thomas.munro@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA+hUKG+fLqyweHqFSBcErueUVT0vDuSNWui-ySz3+d_APmq7dw@mail.gmail.com
Backpatch-through: 15
2025-05-09 11:50:33 -04:00
Peter Eisentraut
bc35adee8d doc: Put new options in consistent order on man pages 2025-05-09 09:03:41 +02:00
Heikki Linnakangas
b28c59a6cd Use 'void *' for arbitrary buffers, 'uint8 *' for byte arrays
A 'void *' argument suggests that the caller might pass an arbitrary
struct, which is appropriate for functions like libc's read/write, or
pq_sendbytes(). 'uint8 *' is more appropriate for byte arrays that
have no structure, like the cancellation keys or SCRAM tokens. Some
places used 'char *', but 'uint8 *' is better because 'char *' is
commonly used for null-terminated strings. Change code around SCRAM,
MD5 authentication, and cancellation key handling to follow these
conventions.

Discussion: https://www.postgresql.org/message-id/61be9e31-7b7d-49d5-bc11-721800d89d64@eisentraut.org
2025-05-08 22:01:25 +03:00
Heikki Linnakangas
965213d9c5 Use more mundane 'int' type for cancel key lengths in libpq
The documented max length of a cancel key is 256 bytes, so it fits in
uint8. It nevertheless seems weird to not just use 'int', like in
commit 0f1433f053 for the backend.

Discussion: https://www.postgresql.org/message-id/61be9e31-7b7d-49d5-bc11-721800d89d64%40eisentraut.org
2025-05-08 22:01:20 +03:00
Bruce Momjian
9d710a1ac0 PG 18 relnotes: adjust RETURNING new/old item
Reported-by: jian he

Discussion: https://postgr.es/m/CACJufxFM1avdwu=OrTx_uMAjTDbFOj1Gp7mnNHOofTVj9QtmRw@mail.gmail.com
2025-05-08 11:11:08 -04:00
Daniel Gustafsson
8fcc648780 doc: Fix title markup for AT TIME ZONE and AT LOCAL
The title for AT TIME ZONE and AT LOCAL was accidentally wrapping the
"and" in the <literal> tag.  Backpatch to v17 where it was introduced
in 97957fdbaa42.

Author: Noboru Saito <noborusai@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Tatsuo Ishii <ishii@postgresql.org>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAAM3qn+7QUWW9R6_YwPKXmky0xGE4n63U3EsxZeWE_QtogeU8g@mail.gmail.com
Backpatch-through: 17
2025-05-08 13:53:16 +02:00
Richard Guo
c06e909c26 Track the number of presorted outer pathkeys in MergePath
When creating an explicit Sort node for the outer path of a mergejoin,
we need to determine the number of presorted keys of the outer path to
decide whether explicit incremental sort can be applied.  Currently,
this is done by repeatedly calling pathkeys_count_contained_in.

This patch caches the number of presorted outer pathkeys in MergePath,
allowing us to save several calls to pathkeys_count_contained_in.  It
can be considered a complement to the changes in commit 828e94c9d.

Reported-by: David Rowley <dgrowleyml@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/CAApHDvqvBireB_w6x8BN5txdvBEHxVgZBt=rUnpf5ww5P_E_ww@mail.gmail.com
2025-05-08 18:21:32 +09:00
Richard Guo
773db22269 Suppress unnecessary explicit sorting for EPQ mergejoin path
When building a ForeignPath for a joinrel, if there's a possibility
that EvalPlanQual will be executed, we must identify a suitable path
for EPQ checks.  If the outer or inner path of the chosen path is a
ForeignPath representing a pushed-down join, we replace it with its
fdw_outerpath to ensure that the EPQ check path consists entirely of
local joins.

If the chosen path is a MergePath, and its outer or inner path is a
ForeignPath that is not already well enough ordered, the MergePath
will have non-NIL outersortkeys or innersortkeys indicating the
desired ordering to be created by an explicit Sort node.  If we then
replace the outer or inner path with its corresponding fdw_outerpath,
and that path is already sufficiently ordered, we end up in an
inconsistent state: the MergePath has non-NIL outersortkeys or
innersortkeys, and its input path is already properly ordered.  This
inconsistency can result in an Assert failure or the addition of a
redundant Sort node.

To fix, check if the new outer or inner path of a MergePath is already
properly sorted, and set its outersortkeys or innersortkeys to NIL if
so.

Bug: #18902
Reported-by: Nikita Kalinin <n.kalinin@postgrespro.ru>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/18902-71c1bed2b9f7c46f@postgresql.org
2025-05-08 18:20:18 +09:00
Bruce Momjian
9fef27a83b doc PG 18 relnotes: adjust pg_log_backend_memory_contexts()
Reported-by: David Rowley

Discussion: https://postgr.es/m/CAApHDvrGLBqs_Vm9COMY7uBDvUDMKds7RwC20YjEPf+XRTY9XQ@mail.gmail.com
2025-05-07 21:11:16 -04:00
Bruce Momjian
f8d49aa130 doc PG 18 relnotes: add pg_log_backend_memory_contexts() mention
Now zero-based.

Reported-by: David Rowley

Discussion: https://postgr.es/m/CAApHDvqMfTBdfwc0Z-tHXLnBMKJLYEZDApgUzA7x_PUDZsY3GA@mail.gmail.com
2025-05-07 20:36:21 -04:00
Bruce Momjian
69aca072eb doc PG 18 relnotes: adjust pgbench per-script reporting item
Also run src/tools/add_commit_links.pl for a previous commit.

Reported-by: Yugo Nagata

Discussion: https://postgr.es/m/20250507195941.c6e1b48c73f062b727f686a8@sraoss.co.jp
2025-05-07 16:56:26 -04:00
Bruce Momjian
3bd5271729 doc PG 18 relnotes: mention GROUP SET fixes
Reported-by: Richard Guo

Discussion: https://postgr.es/m/CAMbWs4_asKPqTCt0h9pp=zHc9vmPcnczbHeF6Xkxn1LhLapcTQ@mail.gmail.com
2025-05-07 16:39:49 -04:00
Nathan Bossart
16bf24e0e4 Remove pg_replication_origin's TOAST table.
A few places that access this catalog don't set up an active
snapshot before potentially accessing its TOAST table.  However,
roname (the replication origin name) is the only varlena column, so
this is only a problem if the name requires out-of-line storage.
This commit removes its TOAST table to avoid needing to set up a
snapshot.  It also places a limit on replication origin names so
that attempts to set long names will fail with a more user-friendly
error.  The chosen limit of 512 bytes should be sufficient to
avoid "row is too big" errors independent of BLCKSZ, but it should
also be lenient enough for all reasonable use-cases.

Bumps catversion.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
2025-05-07 14:47:36 -05:00
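For example, an attempt to create an origin whose name exceeds the new limit
now fails up front with a clear error rather than a "row is too big" failure
(error wording illustrative):

    SELECT pg_replication_origin_create(repeat('x', 600));
    -- ERROR: replication origin name is too long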
Peter Geoghegan
5f4d98d4f3 Prevent premature nbtree array advancement.
nbtree array index scans could fail to return matching tuples in rare
cases where the missed tuples cover key space that the scan's arrays
incorrectly indicate has already been read.  These cases involved nearby
tuples with NULL values that were evaluated using a skip array key while
in pstate.forcenonrequired mode.

To fix, prevent forcenonrequired mode from prematurely advancing the
scan's array keys beyond key space that the scan has yet to read tuples
from: reset the scan's array keys (to the first elements in the current
scan direction) before the _bt_checkkeys call for pstate.finaltup.  That
way _bt_checkkeys starts from a clean slate, which ensures that it will
call _bt_advance_array_keys (while passing it sktrig_required=true).
This reliably restores the invariant that the scan's arrays always
accurately track its progress through the index's key space (at least
when the scan is "between pages").

Oversight in commit 8a510275, which optimized nbtree search scan key
comparisons.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://postgr.es/m/CAH2-WzmodSE+gpTd1CRGU9ez8ytyyDS+Kns2r9NzgUp1s56kpw@mail.gmail.com
2025-05-07 15:20:42 -04:00
Peter Geoghegan
7e25c9363a nbtree: tighten up array recheck rules.
Be more conservative when performing a scheduled recheck of an nbtree
scan's array keys once on the next page, having set so->scanBehind: back
out of reading the page (perform another primitive scan instead) when
the next page's high key/finaltup has an untruncated prefix of matching
values and truncated suffix attributes associated with lower-order keys.
In other words, stop assuming that the lower-order keys have been
satisfied by the truncated suffix attributes in this context (only do so
when considering scheduling a recheck within _bt_advance_array_keys).

The new behavior is more logical: if the next page read after setting
so->scanBehind can only contain tuples that are themselves "behind the
scan", that's reason enough to cut our losses.  In general, when we set
so->scanBehind, we only expect to perform one recheck on the next page
to make a final decision about whether or not to continue the current
primitive index scan.  It seems unprincipled for the recheck to allow a
_bt_readpage to continue unless the scan's arrays will advance/unless
the page might actually contain relevant tuples.

In practice it is highly unlikely that things will line up like this
(the untruncated prefix of attribute values from the next page's high
key is seldom an exact match for their corresponding array's current
element following array advancement on the original/previous page).
That gives us all the more reason to keep things simple and consistent.

This was arguably an oversight in commit 9a2e2a285a, which improved
nbtree array primitive scan scheduling.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzkXzJajgyW-pCQ7vaDPhaT3huU+Zw_j448rpCBEsu2YOQ@mail.gmail.com
2025-05-07 15:17:40 -04:00
Nathan Bossart
acea3fc49f pg_dumpall: Add --sequence-data.
I recently added this option to pg_dump, but I forgot to add it to
pg_dumpall, too.  There's probably little use for it at the moment,
but we will need it if/when we teach pg_upgrade to use pg_dumpall
to dump the database schemas.

Oversight in commit 9c49f0e8cd.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aBE8rHFo922xQUwh%40nathan
2025-05-07 13:36:51 -05:00
Alexander Korotkov
ab42d643c1 Refactor ChangeVarNodesExtended() using the custom callback
fc069a3a6319 implemented Self-Join Elimination (SJE) and put related logic
into ChangeVarNodes_walker().  This commit provides refactoring to remove the
SJE-related logic from ChangeVarNodes_walker() but adds a custom callback to
ChangeVarNodesExtended(), which has a chance to process a node before
ChangeVarNodes_walker().  Passing this callback to ChangeVarNodesExtended()
allows SJE-related node handling to be kept within analyzejoins.c.

Reported-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49PE3CvnV8vrQ0Dr%3DHqgZZmX0tdNbzVNJxqc8yg-8kDQQ%40mail.gmail.com
Author: Andrei Lepikhov <lepihov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
2025-05-07 11:10:16 +03:00
Peter Eisentraut
2448c7a9e0 doc: Put some psql documentation pieces back into alphabetical order 2025-05-07 08:23:44 +02:00
Peter Eisentraut
c0cf282551 Remove some tabs in C string literals 2025-05-07 08:23:44 +02:00
Peter Eisentraut
c11bd5f500 doc: Add link to table
Formal tables should generally have an xref in the text that points to
them.  Add them here.
2025-05-07 08:23:44 +02:00
Peter Eisentraut
a2c6d84acd doc: Fix up spacing around verbatim DocBook elements 2025-05-07 08:23:44 +02:00
Michael Paquier
c4c236ab5c Fix some comments related to IO workers
IO workers are treated as auxiliary processes.  The comments fixed in
this commit stated that there could be only one auxiliary process of
each BackendType at the same time.  This is not true for IO workers, as
up to MAX_IO_WORKERS of them can co-exist at the same time.

Author: Cédric Villemain <Cedric.Villemain@data-bene.io>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/e4a3ac45-abce-4b58-a043-b4a31cd11113@Data-Bene.io
2025-05-07 14:55:57 +09:00
Peter Eisentraut
09a47c68e2 Fix whitespace 2025-05-07 07:01:03 +02:00
Bruce Momjian
b560ce7884 doc PG 18 relnotes: adjust partition planning item
Reported-by: David Rowley

Discussion: https://postgr.es/m/CAApHDvqgK7uqPZAwxsfBiFhvBHHB0txaUxhUrdwG4d5Mik_RnA@mail.gmail.com
2025-05-06 21:15:44 -04:00
Bruce Momjian
ada78f9bef doc PG 18 relnotes: small adjustments regarding options
Reported-by: jian he

Discussion: https://postgr.es/m/CACJufxH1jo=hv77AK0HUJYBBMuPmr6+JT+8g-yovuJmHUPGOZQ@mail.gmail.com
2025-05-06 17:17:46 -04:00
Bruce Momjian
575f6003ed doc PG 18 relnotes: move partition locking item to General Perf
Reported-by: Amit Langote

Discussion: https://postgr.es/m/CA+HiwqE+8Pui_NCCC7zgacnet0Cf3tc_vU+P=nhLDES-8xuCUw@mail.gmail.com
2025-05-06 16:03:56 -04:00
Bruce Momjian
45750c6cfe doc PG 18 relnotes: adjust partition items
Reported-by: David Rowley

Discussion: https://postgr.es/m/CAApHDvo+BrVTXMBPjNXBTnAovJWN9+-dYc0kN7rSDqdNvpggZQ@mail.gmail.com
2025-05-06 15:45:03 -04:00
Tom Lane
caa76b91a6 Stamp 18beta1. 2025-05-05 16:25:46 -04:00
Bruce Momjian
c0e6aace02 doc PG 18 relnotes: reword OAuth item
Reported-by: Jacob Champion

Discussion: https://postgr.es/m/CAOYmi+mEQOqBSJas5V5t__b+6h_MLxyy3JFrVJEq638fnNxi0A@mail.gmail.com
2025-05-05 15:42:03 -04:00
Bruce Momjian
0de2e1c8b5 doc PG 18 relnotes: add mention of pg_stat_reset_backend_stats()
This is for WAL statistics.

Reported-by: Bertrand Drouvot

Discussion: https://postgr.es/m/aBjGlj+Yi++fVRQt@ip-10-97-1-34.eu-west-3.compute.internal
2025-05-05 14:56:58 -04:00
Bruce Momjian
092e72a930 doc PG 18 relnotes: adjust hash item
Reported-by: David Rowley

Discussion: https://postgr.es/m/CAApHDvrNmGncNgZMh2oBG5K-+4d1LGJgzrz7180OcHRT1VFojw@mail.gmail.com
2025-05-05 12:30:35 -04:00
Bruce Momjian
cf847d6340 doc PG 18 relnotes: split partition optimizer item into two
Reported-by: David Rowley

Discussion: https://postgr.es/m/CAApHDvohfoJ0D9eiUuVyHU_kq2Y7A_jAjWVsUt0Fm7Gw1Q=1cQ@mail.gmail.com
2025-05-05 11:59:56 -04:00
Noah Misch
627acc3caa With GB18030, prevent SIGSEGV from reading past end of allocation.
With GB18030 as source encoding, applications could crash the server via
SQL functions convert() or convert_from().  Applications themselves
could crash after passing unterminated GB18030 input to libpq functions
PQescapeLiteral(), PQescapeIdentifier(), PQescapeStringConn(), or
PQescapeString().  Extension code could crash by passing unterminated
GB18030 input to jsonapi.h functions.  All those functions have been
intended to handle untrusted, unterminated input safely.

A crash required allocating the input such that the last byte of the
allocation was the last byte of a virtual memory page.  Some malloc()
implementations take measures against that, making the SIGSEGV hard to
reach.  Back-patch to v13 (all supported versions).

Author: Noah Misch <noah@leadboat.com>
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Backpatch-through: 13
Security: CVE-2025-4207
2025-05-05 04:52:04 -07:00
Noah Misch
5be213caaa Refactor test_escape.c for additional ways of testing.
Start the file with static functions not specific to pe_test_vectors
tests.  This way, new tests can use them without disrupting the file's
layout.  Change report_result() PQExpBuffer arguments to plain strings.
Back-patch to v13 (all supported versions), for the next commit.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Backpatch-through: 13
Security: CVE-2025-4207
2025-05-05 04:52:04 -07:00
Peter Eisentraut
18c4fff640 Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: f90ee4803c30491e5c49996b973b8a30de47bfb2
2025-05-05 12:04:49 +02:00
Bruce Momjian
b3754dcc9f doc PG 18 relnotes: adjust COPY and REJECT_LIMIT items
Reported-by: Atsushi Torikoshi

Discussion: https://postgr.es/m/CAM6-o=CEF6tKAjtGMEOd45YySwNRXPu8d_zyYq=fhnia9hOU6Q@mail.gmail.com
2025-05-04 22:37:20 -04:00
Bruce Momjian
d83981c24b doc PG 18 relnotes: move and clarify constraint items
Reported-by: Álvaro Herrera

Discussion: https://postgr.es/m/202505041135.cpo7zgdcya2u@alvherre.pgsql
2025-05-04 22:08:20 -04:00
Bruce Momjian
8c9eec540d doc PG 18 relnotes: add commit for cancel key and protocol neg.
Reported-by: Jelte Fennema-Nio

Discussion: https://postgr.es/m/CAGECzQQehQrhkNNXvLiBgE3odBbTPG=9PzV8F4Oqq3kOorK0Sw@mail.gmail.com
2025-05-04 21:44:39 -04:00
Bruce Momjian
a675149e87 doc PG 18 relnotes: fix libpq wording
Reported-by: Jelte Fennema-Nio

Discussion: https://postgr.es/m/CAGECzQT4804OLOP+nDBxDpMw3Soq=g+fKOE7NryBHggy4GgEcg@mail.gmail.com
2025-05-03 18:50:03 -04:00
Alexander Korotkov
2782f3b845 Revert "Refactor ChangeVarNodesExtended() using the custom callback"
This reverts commit 250a718aadad68793e82103282247556a46a3cfc.
It shouldn't have been pushed during the release freeze.

Reported-by: Tom Lane
Discussion: https://postgr.es/m/E1uBIbY-000owH-0O%40gemulon.postgresql.org
2025-05-03 22:42:05 +03:00
Alexander Korotkov
250a718aad Refactor ChangeVarNodesExtended() using the custom callback
fc069a3a6319 implemented Self-Join Elimination (SJE) and put related logic
into ChangeVarNodes_walker().  This commit provides refactoring to remove the
SJE-related logic from ChangeVarNodes_walker() but adds a custom callback to
ChangeVarNodesExtended(), which has a chance to process a node before
ChangeVarNodes_walker().  Passing this callback to ChangeVarNodesExtended()
allows SJE-related node handling to be kept within analyzejoins.c.

Reported-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49PE3CvnV8vrQ0Dr%3DHqgZZmX0tdNbzVNJxqc8yg-8kDQQ%40mail.gmail.com
Author: Andrei Lepikhov <lepihov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
2025-05-03 22:30:52 +03:00
Bruce Momjian
fb21ed6c38 doc: update guidelines on non-ASCII characters in docs 2025-05-03 14:45:26 -04:00
Bruce Momjian
24987c6f06 doc PG 18 relnotes: add GROUP BY column elimination item
With a nod to PG 9.6.

Reported-by: jian he

Discussion: https://postgr.es/m/CACJufxEqs=EXZETwtaOooTFhZrtxvSWg8M2uPfzjNtS3wQ6Dzw@mail.gmail.com
2025-05-03 12:57:18 -04:00
Bruce Momjian
04b269da56 doc PG 18 relnotes: move protocol version item to "server"
Reported-by: Jelte Fennema-Nio

Discussion: https://postgr.es/m/CAGECzQSTBgTsDJPxOHWKo7106-YnnYQGzpzNJdis+xTKGUhu2g@mail.gmail.com
2025-05-03 12:19:54 -04:00
Etsuro Fujita
5201bba266 Fix memory allocation/copy mistakes.
The previous code was allocating more memory and copying more data than
necessary because it specified the wrong PgStat_KindInfo member as the
size argument for MemoryContextAlloc and memcpy, respectively.

Although these issues have existed since 5891c7a8e, there have been no reports
from the field.  So for now, it seems sufficient to fix them in master.

Author: Etsuro Fujita <etsuro.fujita@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Gurjeet Singh <gurjeet@singh.im>
Discussion: https://postgr.es/m/CAPmGK15eTRCZTnfgQ4EuBNo%3DQLYGFEbXS_7m2dXqtkcT7L8qrQ%40mail.gmail.com
2025-05-03 20:00:00 +09:00
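For illustration, a generic sketch of the class of mistake described above, using a hypothetical struct (not the actual PgStat_KindInfo definition): the size passed to the allocation and to memcpy must come from the member that describes the data being copied.

    #include <stdlib.h>
    #include <string.h>

    typedef struct KindInfo
    {
        size_t  shared_size;        /* size of a larger, unrelated structure */
        size_t  shared_data_len;    /* size of the data that gets copied */
    } KindInfo;

    static void *
    copy_kind_data(const KindInfo *info, const void *src)
    {
        /* Using info->shared_size here would allocate and copy too much. */
        void   *dst = malloc(info->shared_data_len);

        if (dst != NULL)
            memcpy(dst, src, info->shared_data_len);
        return dst;
    }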
Etsuro Fujita
6e91b9c16f Fix typos in comments.
Also adjust the phrasing in the comments.

Author: Etsuro Fujita <etsuro.fujita@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Gurjeet Singh <gurjeet@singh.im>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAPmGK17%3DPHSDZ%2B0G6jcj12buyyE1bQQc3sbp1Wxri7tODT-SDw%40mail.gmail.com
Backpatch-through: 15
2025-05-03 19:10:00 +09:00
Bruce Momjian
9fd989ff99 doc PG 18 relnotes: update chapter tags for recent commit 2025-05-02 20:10:10 -04:00
Bruce Momjian
9f8fcadb20 doc PG 18 relnotes: adjust libpq trace & protocol version items
Reported-by: Jelte Fennema-Nio

Discussion: https://postgr.es/m/CAGECzQQj0r_JX38fa-_kepp9UaMzCcujRAYaJG2+fPks1b8MVg@mail.gmail.com
2025-05-02 20:09:12 -04:00
Bruce Momjian
aa82ebdc29 doc PG 18 relnotes: reword and reorder items
Also move ssl_groups to a more appropriate section.

Reported-by: Jacob Champion (ssl_groups item)

Discussion: https://postgr.es/m/CAOYmi+k_zpGaDOrwV46_j-O-a_hSWxcXM6h8vccq45Y28deP-g@mail.gmail.com
2025-05-02 19:59:17 -04:00
Peter Geoghegan
0f08df4068 Avoid treating nonrequired nbtree keys as required.
Consistently prevent nbtree array advancement from treating a scankey as
required when operating in pstate.forcenonrequired mode.  Otherwise, we
risk a NULL pointer dereference.  This was possible in the path where
_bt_check_compare is called to recheck a tuple that advanced all of the
scan's arrays to matching values: its continuescan=false handling
expects _bt_advance_array_keys to have been called with a valid pstate,
but it'll always be NULL during sktrig_required=false calls (which is
how _bt_advance_array_keys must be called when pstate.forcenonrequired is set).

Oversight in commit 8a510275, which optimized nbtree search scan key
comparisons.

Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://postgr.es/m/CAHgHdKsn2W=gPBmj7p6MjQFvxB+zZDBkwTSg0o3f5Hh8rkRrsA@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WzmodSE+gpTd1CRGU9ez8ytyyDS+Kns2r9NzgUp1s56kpw@mail.gmail.com
2025-05-02 17:50:58 -04:00
Tomas Vondra
1681a70df3 Fix memory leak in _gin_parallel_merge
To insert the merged GIN entries in _gin_parallel_merge, the leader
calls ginEntryInsert(). This may allocate memory, e.g. for a new leaf
tuple. This was allocated in the PortalContext, and kept until the end
of the index build. For most GIN indexes the amount of leaked memory is
negligible, but for custom opclasses with large keys it may cause OOMs.

Fixed by calling ginEntryInsert() in a temporary memory context, reset
after each insert. Other ginEntryInsert() callers do this too, except
that the context is reset after batches of inserts. More frequent resets
don't seem to hurt performance; they may even help a bit.

Report and fix by Vinod Sridharan.

Author: Vinod Sridharan <vsridh90@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAFMdLD4p0VBd8JG=Nbi=BKv6rzFAiGJ_sXSFrw-2tNmNZFO5Kg@mail.gmail.com
2025-05-02 23:05:18 +02:00
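For readers unfamiliar with the pattern, a simplified sketch of the fix described above; the loop and the insert_one_entry() helper are hypothetical stand-ins for the real merge loop and the ginEntryInsert() call:

    #include "postgres.h"
    #include "utils/memutils.h"

    extern void insert_one_entry(int i);    /* stands in for ginEntryInsert() */

    static void
    merge_entries_with_temp_context(int nentries)
    {
        MemoryContext tmpCtx = AllocSetContextCreate(CurrentMemoryContext,
                                                     "entry insert temp context",
                                                     ALLOCSET_DEFAULT_SIZES);

        for (int i = 0; i < nentries; i++)
        {
            MemoryContext oldCtx = MemoryContextSwitchTo(tmpCtx);

            insert_one_entry(i);

            MemoryContextSwitchTo(oldCtx);
            MemoryContextReset(tmpCtx);     /* free this insert's allocations */
        }

        MemoryContextDelete(tmpCtx);
    }

Resetting after every iteration keeps peak memory bounded by a single insert rather than by the whole index build.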
Tom Lane
e83a8ae447 Don't use a tuplestore if we don't have to for SQL-language functions.
We only need a tuplestore if we're actually going to accumulate
multiple result tuples.  Obviously then we don't need one for non-set-
returning functions; but even a SRF doesn't need one if we decide to
use "lazyEval" (one row at a time) mode.  In these cases, it's
sufficient to use the junkfilter's result slot to hold the single row
that's due to be returned.  We just need to "materialize" that slot
to ensure it holds onto the data past shutdown of the sub-executor.

The original intent of this patch was partially to save a few cycles
(by not putting tuples into a tuplestore only to pull them back out
immediately), but mostly to ensure that we don't use a tuplestore
in non-set-returning functions.  That's because I had concerns
about whether a tuplestore is safe to keep across queries,
which was possible for functions invoked via long-lived FmgrInfos
such as those kept in the typcache.  There are no cases where SRFs
are called that way, so getting rid of the tuplestore in non-SRFs
should make things safer.

However, it emerges that running fmgr_sql in a short-lived context
(as 595d1efed made it do) makes the existing coding unsafe anyway:
we can end up with a long-lived TupleTableSlot holding a freeable
reference to a short-lived tuple, resulting in a double-free crash.
Not trying to pull tuples out of the tuplestore using that slot
dodges the problem, so I'm going to commit this now rather than
invent a band-aid solution for v18.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2443532.1744919968@sss.pgh.pa.us
Discussion: https://postgr.es/m/9f975803-1a1c-4f21-b987-f572e110e860@gmail.com
2025-05-02 16:16:20 -04:00
Álvaro Herrera
c83a38758d
Handle self-referencing FKs correctly in partitioned tables
For self-referencing foreign keys in partitioned tables, we weren't
handling creation of pg_constraint rows during CREATE TABLE ... PARTITION OF
as well as ALTER TABLE ... ATTACH PARTITION.  This is an old bug -- mostly,
we broke this in 614a406b4ff1 while trying to fix it (so 12.13, 13.9,
14.6, 15.0, and later all behave incorrectly).  This commit reverts part
of that with additional fixes for full correctness, and installs more
tests to verify the parts we broke, not just the catalog contents but
also the user-visible behavior.

Backpatch to all live branches.  In branches 13 and 14, commit
46a8c27a7226 changed the behavior during DETACH to drop a FK
constraint rather than trying to repair it, because the complete fix of
repairing catalog constraints was problematic due to lack of previous
fixes.  For this reason, the test behavior in those branches is a bit
different.  However, as best as I can tell, the fix works correctly
there.

In release notes we have to recommend that all self-referencing foreign
keys on partitioned tables be recreated if partitions have been created
or attached after the FK was created, keeping in mind that violating
rows might already be present on the referencing side.

Reported-by: Guillaume Lelarge <guillaume@lelarge.info>
Reported-by: Matthew Gabeler-Lee <fastcat@gmail.com>
Reported-by: Luca Vallisa <luca.vallisa@gmail.com>
Discussion: https://postgr.es/m/CAECtzeWHCA+6tTcm2Oh2+g7fURUJpLZb-=pRXgeWJ-Pi+VU=_w@mail.gmail.com
Discussion: https://postgr.es/m/18156-a44bc7096f0683e6@postgresql.org
Discussion: https://postgr.es/m/CAAT=myvsiF-Attja5DcWoUWh21R12R-sfXECY2-3ynt8kaOqjw@mail.gmail.com
2025-05-02 21:25:50 +02:00
Tom Lane
ac557793d4 Doc: correct spelling of meson switch.
It's --auto-features not --auto_features.

Reported-by: Egor Chindyaskin <kyzevan23@mail.ru>
Discussion: https://postgr.es/m/172465652540.862882.17808523044292761256@wrigleys.postgresql.org
Discussion: https://postgr.es/m/1979661.1746212726@sss.pgh.pa.us
Backpatch-through: 16
2025-05-02 15:12:49 -04:00
Jacob Champion
3db68212a3 oauth: Correct SSL dependency for libpq-oauth.a
libpq-oauth.a includes libpq-int.h, which includes OpenSSL headers. The
Autoconf side picks up the necessary include directories via CPPFLAGS,
but Meson needs the dependency to be made explicit.

Reported-by: Nathan Bossart <nathandbossart@gmail.com>
Tested-by: Nathan Bossart <nathandbossart@gmail.com>
Tested-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/aBTgjDfrdOZmaPgv%40nathan
2025-05-02 10:45:12 -07:00
Peter Eisentraut
81eaaa2c41 Make "directory" setting work with extension_control_path
The extension_control_path setting (commit 4f7f7b03758) did not
support extensions that set a custom "directory" setting in their
control file.  Very few extensions use that and during the discussion
on the previous commit it was suggested to maybe remove that
functionality.  But a fix was easier than initially thought, so this
just adds that support.  The fix is to use the control->control_dir as
a share dir to return the path of the extension script files.

To make this work more sensibly overall, the directory suffix
"extension" is no longer to be included in the extension_control_path
value.  To quote the patch, it would be

-extension_control_path = '/usr/local/share/postgresql/extension:/home/my_project/share/extension:$system'
+extension_control_path = '/usr/local/share/postgresql:/home/my_project/share:$system'

During the initial patch, there was some discussion on which of these
two approaches would be better, and the committed patch was a 50/50
decision.  But the support for the "directory" setting pushed it the
other way, and also it seems like many people didn't like the previous
behavior much.

Author: Matheus Alcantara <mths.dev@pm.me>
Reviewed-by: Christoph Berg <myon@debian.org>
Reviewed-by: David E. Wheeler <david@justatheory.com>
Discussion: https://www.postgresql.org/message-id/flat/aAi1VACxhjMhjFnb%40msg.df7cb.de#0cdf7b7d727cc593b029650daa3c4fbc
2025-05-02 16:35:48 +02:00
Bruce Momjian
a724c7889f doc: first draft of the PG 18 release notes 2025-05-01 22:36:58 -04:00
Noah Misch
c6a26e4ccd Doc: stop implying recommendation of insecure search_path value.
SQL "SET search_path = 'pg_catalog, pg_temp'" is silently equivalent to
"SET search_path = pg_temp, pg_catalog, "pg_catalog, pg_temp"" instead
of the intended "SET search_path = pg_catalog, pg_temp".  (The intent
was a two-element search path.  With the single quotes, it instead
specifies one element with a comma and a space in the middle of the
element.)  In addition to the SET statement, this affects SET clauses of
CREATE FUNCTION, ALTER ROLE, and ALTER DATABASE.  It does not affect the
set_config() SQL function.

Though the documentation did not show an insecure command, remove single
quotes that could entice a reader to write an insecure command.
Back-patch to v13 (all supported versions).

Reported-by: Sven Klemm <sven@timescale.com>
Author: Sven Klemm <sven@timescale.com>
Backpatch-through: 13
2025-05-01 16:51:59 -07:00
Peter Eisentraut
0064020680 doc: Flesh out extension docs for the "prefix" make variable
The variable is a bit magical in how it requires "postgresql" or
"pgsql" to be part of the path, and files end up in its "share" and
"lib" subdirectories.  So mention all that and show an example of
setting "extension_control_path" and "dynamic_library_path" to use
those locations.

Author: David E. Wheeler <david@justatheory.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: Christoph Berg <myon@debian.org>
Discussion: https://www.postgresql.org/message-id/6B5BF07B-8A21-48E3-858C-1DC22F3A28B4@justatheory.com
2025-05-01 22:23:52 +02:00
Jacob Champion
4ea1254f35 oauth: Fix Autoconf build on macOS
Oversight in b0635bfda. -lintl is necessary for gettext on Mac, which
libpq-oauth depends on via pgport/pgcommon. (I'd incorrectly removed
this change from an earlier version of the patch, where it was suggested
by Peter Eisentraut.)

Per buildfarm member indri.
2025-05-01 12:35:52 -07:00
Jacob Champion
b0635bfda0 oauth: Move the builtin flow into a separate module
The additional packaging footprint of the OAuth Curl dependency, as well
as the existence of libcurl in the address space even if OAuth isn't
ever used by a client, has raised some concerns. Split off this
dependency into a separate loadable module called libpq-oauth.

When configured using --with-libcurl, libpq.so searches for this new
module via dlopen(). End users may choose not to install the libpq-oauth
module, in which case the default flow is disabled.

For static applications using libpq.a, the libpq-oauth staticlib is a
mandatory link-time dependency for --with-libcurl builds. libpq.pc has
been updated accordingly.

The default flow relies on some libpq internals. Some of these can be
safely duplicated (such as the SIGPIPE handlers), but others need to be
shared between libpq and libpq-oauth for thread-safety. To avoid
exporting these internals to all libpq clients forever, these
dependencies are instead injected from the libpq side via an
initialization function. This also lets libpq communicate the offsets of
PGconn struct members to libpq-oauth, so that we can function without
crashing if the module on the search path came from a different build of
Postgres. (A minor-version upgrade could swap the libpq-oauth module out
from under a long-running libpq client before it does its first load of
the OAuth flow.)

This ABI is considered "private". The module has no SONAME or version
symlinks, and it's named libpq-oauth-<major>.so to avoid mixing and
matching across Postgres versions. (Future improvements may promote this
"OAuth flow plugin" to a first-class concept, at which point we would
need a public API to replace this anyway.)

Additionally, NLS support for error messages in b3f0be788a was
incomplete, because the new error macros weren't being scanned by
xgettext. Fix that now.

Per request from Tom Lane and Bruce Momjian. Based on an initial patch
by Daniel Gustafsson, who also contributed docs changes. The "bare"
dlopen() concept came from Thomas Munro. Many people reviewed the design
and implementation; thank you!

Co-authored-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Christoph Berg <myon@debian.org>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Wolfgang Walther <walther@technowledgy.de>
Discussion: https://postgr.es/m/641687.1742360249%40sss.pgh.pa.us
2025-05-01 09:14:30 -07:00
Nathan Bossart
a3ef0b570c Remove extra "not" in pg_upgrade documentation.
Oversight in commit cb45dc3afb.

Reported-by: Erik Rijkers <er@xs4all.nl>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Discussion: https://postgr.es/m/7b856277-62ad-80f0-36e1-a134ec3c9cab%40xs4all.nl
2025-05-01 09:31:36 -05:00
Dean Rasheed
d73d4cfdfc doc: Warn that ts_headline() output is not HTML-safe.
Add a documentation warning to ts_headline() pointing out that, when
working with untrusted input documents, the output is not guaranteed
to be safe for direct inclusion in web pages. This is because, while
it does remove some XML tags from the input, it doesn't remove all
HTML markup, and so the result may be unsafe (e.g., it might permit
XSS attacks).

To guard against that, all HTML markup should be removed from the
input, making it plain text, or the output should be passed through an
HTML sanitizer.

In addition, document precisely what the default text search parser
recognises as valid XML tags, since that's what determines which XML
tags ts_headline() will remove.

Reported-by: Richard Neill <richard.neill@telos.digital>
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Backpatch-through: 13
2025-05-01 11:03:43 +01:00
Peter Eisentraut
06c4f3ae80 doc: Improve explanations when a table rewrite is needed
Further improvement for commit 11bd8318602.  That commit confused
identity and generated columns; fix that.  Also, virtual generated
columns have since been added; add more details about that.  Also some
small rewordings and reformattings to further improve clarity.

Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://postgr.es/m/00e6eb5f5c793b8ef722252c7a519c9a@oss.nttdata.com
2025-05-01 08:57:48 +02:00
Peter Geoghegan
9d924dbb37 Adjust overstrong nbtree skip array assertion.
Make an nbtree array preprocessing assertion account for scans that add
fewer skip arrays than initially expected due to preprocessing finding
an unsatisfiable array qual.

Oversight in commit 92fe23d9.

Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://postgr.es/m/CAHgHdKtQMhHy5qcB3KqCcGiW-Rp8P7KzUFRa9ZMKUiv6zen7LQ@mail.gmail.com
2025-04-30 23:15:51 -04:00
Michael Paquier
92ee8a4df5 doc: Mention cost-based delays for total_[auto]{vacuum,analyze}_time
30a6ed0ce4b has added four attributes to pg_stat_all_tables to track the
cumulative time spent in [auto]vacuum and [auto]analyze.  It was not
mentioned that the vacuum cost-based delays are included in these
numbers, which could be confusing now that the delays are included in
the vacuum progress view (bb8dff9995f2).

This commit adds an extra note about this matter.

Reported-by: Magnus Hagander <magnus@hagander.net>
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/CABUevEz9v1ZNToPyD98JnWDGZgG=SmPZKkSNzU9hXQ-nGTQF0g@mail.gmail.com
2025-05-01 08:52:19 +09:00
Daniel Gustafsson
45e7e8ca9e Convert strncpy to strlcpy
We try to avoid using strncpy() due to the ease with which it can
be misused.  Convert this callsite to use strlcpy() instead to
match similar codepaths in this file.

Suggested-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/2a796830-de2d-4030-b480-d673f6cc5d94@eisentraut.org
2025-04-30 23:00:47 +02:00
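A minimal sketch of the substitution, with hypothetical buffer and argument names (strlcpy() is provided by the platform or by libpgport):

    #include <string.h>

    static void
    copy_ident(char *dst, size_t dstsize, const char *src)
    {
        /*
         * strlcpy() always nul-terminates dst (as long as dstsize > 0) and
         * truncates src if it does not fit, so no manual termination is
         * needed, unlike with strncpy().
         */
        strlcpy(dst, src, dstsize);
    }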
Nathan Bossart
2d6745a66b doc: Add missing reference to track_cost_delay_timing.
Oversight in commit bb8dff9995.
2025-04-30 14:45:54 -05:00
Nathan Bossart
9879105024 vacuumdb: Don't skip empty relations in --missing-stats-only mode.
Presently, --missing-stats-only skips relations with reltuples set
to 0 because empty relations don't get optimizer statistics.
However, before v14, a reltuples value of 0 was ambiguous: it could
either mean the relation is empty, or it could mean that it hadn't
yet been vacuumed or analyzed.  (Commit 3d351d916b taught v14 and
newer to use -1 for the latter case.)  This ambiguity can cause
--missing-stats-only to inadvertently skip relations that need
optimizer statistics after upgrades to v18 and newer (since
reltuples is now transferred from the old cluster).

To fix, simply remove the check for reltuples != 0.  This will
cause --missing-stats-only to analyze some empty tables, but that
doesn't seem too terrible a trade-off.

Reported-by: Christoph Berg <myon@debian.org>
Reviewed-by: Christoph Berg <myon@debian.org>
Discussion: https://postgr.es/m/aAjyvW5_fRGNr7yF%40msg.df7cb.de
2025-04-30 14:12:59 -05:00
Nathan Bossart
d5f1b6a75b Further adjust guidance for running vacuumdb after pg_upgrade.
Since pg_upgrade does not transfer the cumulative statistics used
to trigger autovacuum and autoanalyze, the server may take much
longer than expected to process them post-upgrade.  Currently, we
recommend analyzing only relations for which optimizer statistics
were not transferred by using the --analyze-in-stages and
--missing-stats-only options.  This commit appends another
recommendation to analyze all relations to update the relevant
cumulative statistics by using the --analyze-only option.  This is
similar to the recommendation for pg_stat_reset().

Reported-by: Christoph Berg <myon@debian.org>
Reviewed-by: Christoph Berg <myon@debian.org>
Discussion: https://postgr.es/m/aAfxfKC82B9NvJDj%40msg.df7cb.de
2025-04-30 14:12:59 -05:00
Nathan Bossart
f60420cff6 doc: Alphabetize long options for pg_dump[all].
The current ordering strategy for these pages is to list the short
options in alphabetical order followed by the long options in
alphabetical order.  If an option has both a short variant and a
long variant, the short variant takes precedence.  This commit
moves a few recently added options to match this style.  We should
probably adjust all pages and --help output to list the long and
short options in one combined alphabetical list (with the long
variants taking precedence), but that is a much larger change, so
it is left as a future exercise.

Oversights in commits a5cf808be5, 1fd1bd8710, and bde2fb797a.

Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/aBFBtsltgu3-IU1d%40nathan
2025-04-30 13:07:51 -05:00
Tom Lane
368c3fbf9d Update time zone data files to tzdata release 2025b.
DST law changes in Chile: there is a new time zone America/Coyhaique
for Chile's Aysén Region, to account for it changing to UTC-03
year-round and thus diverging from America/Santiago.

Historical corrections for Iran.

Backpatch-through: 13
2025-04-30 11:13:49 -04:00
Daniel Gustafsson
f8c115a6cb Typo and doc fixups for memory context reporting
This fixes typos in comments and docs, and makes a small documentation
change for clarity.  Found via post-commit review.

Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAH2L28vt16C9xTuK+K7QZvtA3kCNWXOEiT=gEekUw3Xxp9LVQw@mail.gmail.com
2025-04-30 11:10:27 +02:00
Daniel Gustafsson
d2a1ed1727 Add missing string terminator
When copying the string, strncpy() won't add nul termination since
the string length is equal to the length specified.  Explicitly
set a nul terminator after copying to properly terminate. Found
via post-commit review.

Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAH2L28vt16C9xTuK+K7QZvtA3kCNWXOEiT=gEekUw3Xxp9LVQw@mail.gmail.com
2025-04-30 10:34:08 +02:00
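To illustrate the bug pattern being fixed, a hypothetical example (not the actual callsite): when the length passed to strncpy() equals the source length, no terminator is written, so one has to be added explicitly.

    #include <stdio.h>
    #include <string.h>

    int
    main(void)
    {
        const char *name = "context";
        char        buf[32];

        strncpy(buf, name, strlen(name));   /* copies the bytes, but no '\0' */
        buf[strlen(name)] = '\0';           /* the fix: terminate explicitly */

        printf("%s\n", buf);
        return 0;
    }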
David Rowley
991407ae86 Add 918e7287e to .git-blame-ignore-revs 2025-04-30 19:27:56 +12:00
David Rowley
918e7287ed Fix broken indentation
I forgot to run pgindent in d8555e522.

Reported-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Discussion: https://postgr.es/m/156083c9-eac0-418d-9667-92dec4d6d6cd@oss.nttdata.com
2025-04-30 19:18:30 +12:00
David Rowley
d8555e522e Fix a couple of comment typos
Author: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/CAEG8a3+MRwDKc4YSFKKPKq7Y+vMufVC5u94wM5KZPB2CbgCxnQ@mail.gmail.com
2025-04-30 13:40:46 +12:00
Tom Lane
810a8b1c80 Give up on running with NetBSD/OpenBSD's default semaphore settings.
This reverts commit 38da053463bef32adf563ddee5277d16d2b6c5af, which
attempted to preserve our ability to start with only 60 semaphores.

Subsequent changes (particularly 55b454d0e) have put that idea pretty
much permanently out of reach: people wishing to use Postgres v18 on
OpenBSD or NetBSD will have no choice but to increase those platforms'
default values of SEMMNI and SEMMNS.

Hence, revert 38da05346's changes in SEMAS_PER_SET and the minimum
tested value of max_connections.  Adjust a comment from the subsequent
patch 6d0154196, and tweak the wording in runtime.sgml to make it
clear that changing SEMMNI/SEMMNS is no longer even a little bit
optional on these platforms.

Although 38da05346 was later back-patched into v17, leave that branch
alone: it's still capable of starting with 60 semaphores, and there's
no reason to break that.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/E1tuZNv-0037Gs-34@gemulon.postgresql.org
Discussion: https://postgr.es/m/1052019.1745947915@sss.pgh.pa.us
2025-04-29 17:27:52 -04:00
Jacob Champion
e974f1c216 oauth: Classify oauth_client_secret as a password
Tell UIs to hide the value of oauth_client_secret, like the other
passwords. Due to the previous commit, this does not affect postgres_fdw
and dblink, but add a comment to try to warn others of the hazard in the
future.

Reported-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250415191435.55.nmisch%40google.com
2025-04-29 13:08:55 -07:00
Jacob Champion
d2e7d2a09d oauth: Disallow OAuth connections via postgres_fdw/dblink
A subsequent commit will reclassify oauth_client_secret from dispchar=""
to dispchar="*", so that UIs will treat it like a secret. For our FDWs,
this change will move that option from SERVER to USER MAPPING, which we
need to avoid.

But upon further discussion, we don't really want our FDWs to use our
builtin Device Authorization flow at all, for several reasons:

- the URL and code would be printed to the server logs, not sent over
  the client connection
- tokens are not cached/refreshed, so every single connection has to be
  manually authorized by a user with a browser
- oauth_client_secret needs to belong to the foreign server, but options
  on SERVER are publicly accessible
- all non-superusers would need password_required=false, which is
  dangerous

Future OAuth work can use FDWs as a motivating use case. But for now,
disallow all oauth_* connection options for these two extensions.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250415191435.55.nmisch%40google.com
2025-04-29 13:08:24 -07:00
Jacob Champion
45363fca63 Bump the minimum supported Python version to 3.6.8
Python 3.2 is no longer tested by the buildfarm, and there are only a
handful of buildfarm animals running versions older than 3.6, which
itself went end-of-life in 2021. Python 3.6.8 is the default version
shipped in RHEL8, so that seems like a reasonable baseline for PG18.

Now that we use the Python Limited API as of 0793ab810, older versions
of Python should continue functioning for users of PL/Python in
particular, so soften the language from "required" to "supported".

Wording by Tom Lane. Separate from the review of the patch itself,
several people provided input on the choice of cutoff: Christoph Berg,
Devrim Gündüz, Florents Tselai, Jelte Fennema-Nio, and Renan Alves
Fonseca. Thank you!

Suggested-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/16098.1745079444%40sss.pgh.pa.us
2025-04-29 13:04:19 -07:00
Peter Eisentraut
eec34099c3 Fix whitespace typo in string 2025-04-29 19:16:11 +02:00
Nathan Bossart
2b49492eda initdb: Do not report default autovacuum_worker_slots.
Commit 6d01541960 taught initdb to lower the default value of
autovacuum_worker_slots for systems with very few semaphores.  It
also added a "fake" report for the chosen value, i.e., initdb
prints a message about selecting the default, but the value was
already selected in a previous test.  Per discussion, this is not a
precedent we want to set, and it seems unnecessary to report
everything derived from max_connections, so let's remove the "fake"
report.

Reported-by: Peter Eisentraut <peter@eisentraut.org>
Suggested-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/de722583-4ba4-4063-bc41-e20684978116%40eisentraut.org
2025-04-29 11:41:42 -05:00
Bruce Momjian
faced8e6a4 doc: adjust max_files_per_process again
Reported-by: Andres Freund

Discussion: https://postgr.es/m/5yqochswkulckuzzrwgv2nqdrfh4k4coc4uwq4lvgzkfwnbjbd@46igbiwjabn2
2025-04-29 10:30:08 -04:00
Bruce Momjian
9a9e60fed3 doc: clarify new behavior of max_files_per_process 2025-04-29 09:45:41 -04:00
Peter Eisentraut
913c60b067 doc: Small example improvement
Add a comment character before a line annotation, so that the query
can be used as presented.

Reported-by: Yaroslav Saburov <y.saburov@gmail.com>
Author: Euler Taveira <euler@eulerto.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://www.postgresql.org/message-id/flat/174393459040.678.17810152410419444783%40wrigleys.postgresql.org
2025-04-29 14:43:35 +02:00
Alexander Korotkov
2260c7f6d9 Fixes for ChangeVarNodes_walker()
This commit fixes two bugs in the ChangeVarNodes_walker() function.

 * When considering RestrictInfo, walk down to its clauses based on the
   presence of the relid to be deleted not just in clause_relids but also in
   required_relids.

 * Incrementally adjust num_base_rels based on the change of clause_relids
   instead of recalculating it using clause_relids, which could contain
   outer-join relids.

Reported-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49PE3CvnV8vrQ0Dr%3DHqgZZmX0tdNbzVNJxqc8yg-8kDQQ%40mail.gmail.com
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
2025-04-29 14:34:44 +03:00
Peter Eisentraut
15b1b4dd3f pg_restore: Improve --help synopsis
The --help synopsis should only be one line.  This rephrases the first
line a bit to reflect the new functionality of restoring multiple
databases from pg_dumpall output.  Additional explanations are better
kept in the man page.
2025-04-29 11:32:49 +02:00
Peter Eisentraut
dadc58f50a pg_restore: Put new option in consistent order in --help output
Also make the description a bit more consistent with similar options.
2025-04-29 10:59:05 +02:00
Amit Kapila
3ff2a1f0c9 Fix assertion failure during decoding from synced slots.
The slot synchronization skips updating the confirmed_flush LSN of the
local slot if the local slot has a newer catalog_xmin or restart_lsn, but
still allows updating the two_phase and two_phase_at fields of the slot.
This opens up a window for the prepared transactions between old
confirmed_flush LSN and two_phase_at to unexpectedly get decoded and sent
to the downstream after promotion.  Then, while decoding the commit
prepared record, an assertion will fail because it expects that the
prepare hasn't been sent to the downstream.

The fix is to skip updating the other slot fields when we skip updating
the confirmed_flush LSN of the slot.

We didn't backpatch this commit as two_phase_at was not synced in back
branches, which means prepared transactions won't be unexpectedly sent to
downstream.

We discovered this problem while analyzing BF failure reported in the
discussion link.

Reliably reproducing this issue without a debugger is difficult. Given
its rarity, adding a specific injection point to test it doesn't seem
worthwhile, so we won't be adding a dedicated test case.

Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/OS0PR01MB5716B44052000EB91EFAE60E94BC2@OS0PR01MB5716.jpnprd01.prod.outlook.com
2025-04-29 12:52:05 +05:30
Peter Eisentraut
ef1811ac9a pg_verifybackup: Message style improvements 2025-04-29 09:19:15 +02:00
Peter Eisentraut
c893245ec3 test_slru: Fix incorrect format placeholders
Before commit a0ed19e0a9e there was a cast around these; the cast
inadvertently changed the signedness, but that made the format
placeholder correct.  Commit a0ed19e0a9e removed the casts, so now the
format placeholders have the wrong signedness.
2025-04-29 09:09:00 +02:00
Amit Kapila
9807617a92 Doc: Specify the interaction of publish_generated_columns with column list.
Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAHut+PtnjLiNFFh-3f9cXH0wnwqjdkTjQNbVmZdZ1y+zKt_PPg@mail.gmail.com
2025-04-29 09:01:43 +05:30
Melanie Plageman
f132815fd7 Add maintenance_io_concurrency flag to some read stream users
Index vacuuming and [auto]prewarm AIO concurrency should be governed by
maintenance_io_concurrency. As such, pass those read stream users the
READ_STREAM_MAINTENANCE flag, which causes their read stream
distance to be calculated with maintenance_io_concurrency instead of
effective_io_concurrency. This was an oversight in the original commits
making those operations use the read stream API.

Discussion: https://postgr.es/m/flat/CAAKRu_aopDxTo4b41Mt_7Zc-z0_ngocrY8SFCCY6Aph1HgwuNw%40mail.gmail.com
2025-04-28 14:19:45 -04:00
Peter Geoghegan
ce72e7e02e Fix obsolete nbtree array advancement comment.
The check for whether another primitive scan is required after all, once
the scan reaches the next leaf page, was moved from _bt_checkkeys to its
_bt_readpage caller by commit 9a2e2a28.  Update a comment that incorrectly
described the recheck mechanism as something that takes place in
_bt_checkkeys.

Also fix an older typo in related code comments.
2025-04-28 12:49:17 -04:00
Peter Geoghegan
b75fedcab7 Make NULL tuple values always advance skip arrays.
_bt_check_compare neglected to handle a case that can arise when the
scan's keys are temporarily treated as nonrequired, as an optimization:
whenever a NULL tuple value was encountered that had a skip array whose
current element wasn't already NULL, _bt_check_compare failed to advance
the array to the NULL element.  This allowed _bt_check_compare to fail
to return matching tuples containing a NULL value (though only with an
array column that came before a skip array column with NULLs, and only
during _bt_readpage calls that set pstate.forcenonrequired=true on a
page where the higher-order column also had to advance).

To fix, teach _bt_check_compare to handle this case just like any other
case where a skip array key is unsatisfied and must be advanced directly
(due to the key being considered a nonrequired key).

Oversight in commit 8a510275, which optimized nbtree search scan key
comparisons with skip arrays.

Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://postgr.es/m/CAHgHdKtLFWZcjr87hMH0hYDHgcifu4Tj7iHz-xh8qsJREt5cqA@mail.gmail.com
2025-04-28 12:11:08 -04:00
Álvaro Herrera
0e13b13d26
Fix pg_dump for inherited validated not-null constraints
When a child constraint is validated and the parent constraint it
derives from isn't, pg_dump must be coerced into printing the child
constraint; failing to do would result in a dump that restores the
constraint as not valid, which would be incorrect.

Co-authored-by: jian he <jian.universality@gmail.com>
Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de>
Reported-by: jian he <jian.universality@gmail.com>
Message-id: https://postgr.es/m/CACJufxGHNNMc0E2JphUqJMzD3=bwRSuAEVBF5ekgkG8uY0Q3hg@mail.gmail.com
2025-04-28 16:25:06 +02:00
Peter Eisentraut
c061000311 pg_combinebackup: Message style improvements 2025-04-28 14:26:49 +02:00
Alexander Korotkov
73e7361376 Restore comments in ChangeVarNodesExtended()
This commit restores comments in ChangeVarNodesExtended(), which were
accidentally removed by fc069a3a6319.

Reported-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49PE3CvnV8vrQ0Dr%3DHqgZZmX0tdNbzVNJxqc8yg-8kDQQ%40mail.gmail.com
2025-04-28 11:20:22 +03:00
Amit Kapila
aaf9e95e87 Fix xmin advancement during fast_forward decoding.
During logical decoding, we advance the catalog_xmin of the logical slot
too early in fast_forward mode, resulting in required catalog data being
removed by vacuum.  This mode is normally used to advance the slot without
processing the changes, but we still can't let the slot's xmin advance to
an incorrect value.

Commit f49a80c481 fixed a similar issue where the logical slot's
catalog_xmin was getting advanced prematurely during non-fast-forward
mode. During xl_running_xacts processing, instead of directly advancing
the slot's xmin to the oldest running xid in the record, it allowed the
xmin to be held back for snapshots that can be used for
not-yet-replayed transactions, as those might consider older txns as
running too.  However, it missed the fact that the same problem can happen
during fast_forward mode decoding: we won't build a base snapshot in
that mode, and a future call to get_changes from the same slot can miss
seeing the required catalog changes, leading to incorrect results.

This commit allows building the base snapshot even in fast_forward mode to
prevent the early advancement of xmin.

Reported-by: Amit Kapila <amit.kapila16@gmail.com>
Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 13
Discussion: https://postgr.es/m/CAA4eK1LqWncUOqKijiafe+Ypt1gQAQRjctKLMY953J79xDBgAg@mail.gmail.com
Discussion: https://postgr.es/m/OS0PR01MB57163087F86621D44D9A72BF94BB2@OS0PR01MB5716.jpnprd01.prod.outlook.com
2025-04-28 11:35:54 +05:30
Michael Paquier
b225c5e76e Remove circular #include's between wait_event.h and wait_event_types.h
wait_event_types.h is generated by the code, and it included wait_event.h.
wait_event.h in turn included wait_event_types.h, causing a circular
dependency between the two.

wait_event_types.h only needs to know about the wait event classes, so
this information is moved into its own file, and wait_event_types.h uses
this new header so that it no longer depends on wait_event.h.

Note that such errors can be found with clang-tidy, with commands like
this one:
clang-tidy source_file.c --checks=misc-header-include-cycle -- \
  -I/install/path/include/ -I/install/path/include/server/

Issue introduced by fa88928470b5.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/350192.1745768770@sss.pgh.pa.us
2025-04-28 09:08:15 +09:00
Alexander Korotkov
1aa7cf9eb8 Disallow removing placeholders during Self-Join Elimination.
fc069a3a6319 implements Self-Join Elimination (SJE), which can remove base
relations when appropriate.  However, regression tests for SJE only cover
the case when placeholder variables (PHVs) are evaluated and needed only
in a single base rel.  If this baserel is removed due to SJE, its clauses,
including PHVs, will be transferred to the keeping relation.  Removing these
PHVs may trigger an error on plan creation -- thanks to b3ff6c742f6c for
detecting that.

This commit skips removal of PHVs during SJE.  It might also happen that
we skip the removal of some PHVs that could be removed.  However, the overhead
of extra PHVs is small compared to the complexity of analysis needed to remove
them.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Alena Rybakina <a.rybakina@postgrespro.ru>
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
2025-04-28 01:40:42 +03:00
Tom Lane
2f5b056203 Remove inappropriate inclusions of c.h and postgres_fe.h.
Per our usual policy, Postgres header files should not include these;
the decision as to which one to use is to be made in the calling .c
file instead.

These errors aren't particularly new, but I'm not feeling a need
to back-patch these changes; it's mostly just neatnik-ism.
2025-04-27 16:58:57 -04:00
Tom Lane
94b84a6072 Don't use double-quotes in #include's of system headers, redux.
This cleans up some loose ends left by commit e8ca9ed1d.  I hadn't
looked closely enough at these places before, but now I have.

The use of double-quoted #includes for Perl headers in plperl_system.h
seems to be simply a mistake introduced in 6c944bf3c and faithfully
copied forward since then.  (I had thought possibly it was required
by some weird Windows build setup, but there's no evidence of that in
our history.)

The occurrences in SectionMemoryManager.h and SectionMemoryManager.cpp
evidently stem from those files' origin as LLVM code.  It's
understandable that LLVM would treat their own files as needing
double-quoted #includes; but they're still system headers to us.

I also applied the same check to *.c files, and found a few other
random incorrect usages in both directions.

Our ECPG headers and test files routinely use angle brackets to refer
to ECPG headers.  I left those usages alone, since it seems reasonable
for an ECPG user to regard those headers as system headers.
2025-04-27 13:23:19 -04:00
Tom Lane
2311f193ea Remove circular #include's between plpython.h and plpy_util.h.
plpython.h included plpy_util.h, simply on the grounds that "it's
easier to just include it everywhere".  However, plpy_util.h must
include plpython.h, or it won't pass headerscheck.  While the
resulting circularity doesn't have any immediate bad effect,
it's poor design.  We have seen serious messes arise in the past
from overly-broad inclusion footprints created by such circularities,
so let's establish a project policy against it.

To fix, just replace *.c files' inclusions of plpython.h with
plpy_util.h.  They'll pull in plpython.h indirectly; indeed, almost
all have already done so via inclusions of other plpy_xxx.h headers.
(Any extensions using plpython.h can do likewise without breaking
the compatibility of their code with prior Postgres versions.)

Reported-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/aAxQ6fcY5QQV1lo3@ip-10-97-1-34.eu-west-3.compute.internal
2025-04-27 11:43:02 -04:00
Tom Lane
e8ca9ed1d2 Don't use double-quotes in #include's of system headers.
While few if any C compilers will complain about this, it's
inconsistent with our other #include's of the same headers.

There are some other questionable usages in
src/include/jit/SectionMemoryManager.h and
src/pl/plperl/plperl_system.h, but perhaps those have a
reason to be like that.  I can't see that these do.

Noticed while fooling around with a script to do analysis
of our header cross-inclusions.
2025-04-26 20:30:27 -04:00
David Rowley
936457419d Eliminate divide in new fast-path locking code
c4d5cb71d2 adjusted the fast-path locking code to allow some
configuration of the number of fast-path locking slots via the
max_locks_per_transaction GUC.  In that commit the FAST_PATH_REL_GROUP()
macro used integer division to determine the fast-path locking group slot
to use for the lock.

The divisor in this case is always a power-of-two value.  Here we swap
out the divide for a bitwise-AND, which is a significantly faster
operation to perform.

In passing, adjust the code that's setting FastPathLockGroupsPerBackend
so that it's more clear that the value being set is a power-of-two.

Also, adjust some comments in the area which contained some magic
numbers.  It seems better to justify the 1024 upper limit in the
location where the #define is made instead of where it is used.

Author: David Rowley <drowleyml@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAApHDvodr3bcnpxcs7+k-3cFwYR0tP-BYhyd2PpDhe-bCx9i=g@mail.gmail.com
2025-04-27 11:53:40 +12:00
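For readers unfamiliar with the trick, a generic sketch under the stated assumption that the divisor is a power of two (the function and parameter names are hypothetical, not the actual FAST_PATH_REL_GROUP() macro):

    #include <assert.h>
    #include <stdint.h>

    static inline uint32_t
    group_for_hash(uint32_t hash, uint32_t ngroups)
    {
        /* the mask is only equivalent when ngroups is a power of two */
        assert(ngroups > 0 && (ngroups & (ngroups - 1)) == 0);

        return hash & (ngroups - 1);    /* same result as hash % ngroups */
    }

Integer division is typically far more expensive than a bitwise AND, which is why this matters on a hot locking path.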
John Naylor
27757677ca Match parameter in new function to earlier equivalents
Oversight in commit 3c6e8c123.
2025-04-27 03:03:52 +07:00
Bruce Momjian
10e8176950 doc: improve wording of vacuum_max_eager_freeze_failure_rate 2025-04-26 11:41:23 -04:00
Andres Freund
039bfc457e aio: Improve debug logging around waiting for IOs
Trying to investigate a bug report by Alexander Lakhin made it apparent that
the debug logging around waiting for IO completion is insufficient. Fix that.

Discussion: https://postgr.es/m/h4in2db37vepagmi2oz5vvqymjasc5gyb4lpqkunj4eusu274i@37jpd3c2spd3
2025-04-25 13:31:25 -04:00
Andres Freund
500b61769f Fix bug allowing io_combine_limit > io_max_combine_limit
10f66468475 intended to limit the value of io_combine_limit to the minimum of
io_combine_limit and io_max_combine_limit. To avoid issues with interdependent
GUCs, it introduced io_combine_limit_guc and set io_combine_limit in assign
hooks. That plan was thwarted by guc_tables.c accidentally still referencing
io_combine_limit, instead of io_combine_limit_guc.  That lead to the GUC
machinery overriding the work done in the assign hooks, potentially leaving
io_combine_limit with a too high value.

The consequence of this bug was that when running with io_combine_limit >
io_max_combine_limit, the AIO machinery would not have reserved large enough
iovec and IO data arrays, with one IO's arrays overlapping with another IO's,
leading to total confusion.

To make such a problem easier to detect in the future, add assertions to
pgaio_io_set_handle_data_* checking the length is smaller than
io_max_combine_limit (not just PG_IOV_MAX).

It'd be nice to have a few tests for this, but it's not entirely obvious how
to do so portably.

As remarked upon by Tom, the GUC assignment hooks really shouldn't set the
underlying variable; that's the job of the GUC machinery.  Change that as well.

Discussion: https://postgr.es/m/c5jyqnuwrpigd35qe7xdypxsisdjrdba5iw63mhcse4mzjogxo@qdjpv22z763f
2025-04-25 13:31:24 -04:00
Andres Freund
0d9114b704 aio: Fix crash potential for pg_aios views due to late state update
pgaio_io_reclaim() reset the fields in PgAioHandle before updating the state
to IDLE or incrementing the generation. For most things that's OK, but for
pg_get_aios() it is not - if it copied the PgAioHandle while fields were being
reset, we wouldn't detect that and could call
pgaio_io_get_target_description() with ioh->target == PGAIO_TID_INVALID,
leading to a crash.

Fix this issue by incrementing the generation and state earlier, before
resetting.

Also add an assertion to pgaio_io_get_target_description() for the target to
be valid - that'd have made this case a bit easier to debug. While at it,
add/update a few related assertions.

Author: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/062daca9-dfad-4750-9da8-b13388301ad9@gmail.com
2025-04-25 13:31:13 -04:00
Peter Eisentraut
76d52e7165 Fix incorrect format placeholders
Before commit a0ed19e0a9e there was a cast around these; the cast
inadvertently changed the signedness, but that made the format
placeholder correct.  Commit a0ed19e0a9e removed the casts, so now the
format placeholders have the wrong signedness.
2025-04-25 16:49:30 +02:00
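As a generic illustration of the signedness issue (hypothetical variable, not the code touched here): the placeholder has to match the type actually passed to the printf-style function.

    #include <stdio.h>

    int
    main(void)
    {
        unsigned int nitems = 42;

        printf("%d\n", (int) nitems);   /* old style: the cast made %d correct */
        printf("%u\n", nitems);         /* without the cast, %u must be used */
        return 0;
    }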
Peter Eisentraut
385959bdea Fix terminology in comment and message
Should be "bracket" not "brace" for [].
2025-04-25 16:26:28 +02:00
Peter Eisentraut
0787646e1d Small code consistency improvement
Adjust the way the increment operators are placed to be consistent
throughout the function.  Fixup for commit c1da7281060.
2025-04-25 13:01:31 +02:00
Amit Kapila
50b8ad30f7 Fix typo in test file name added in commit 4909b38af0.
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Backpatch-through: 13
Discussion: https://postgr.es/m/CANhcyEXsObdjkjxEnq10aJumDpa5J6aiPzgTh_w4KCWRYHLw6Q@mail.gmail.com
2025-04-25 12:46:02 +05:30
Fujii Masao
632f62dcec doc: remove unnecessary secondary index terms for replication settings.
Previously, config.sgml included secondary index terms for
max_replication_slots and max_active_replication_origins. These are
no longer necessary, as each parameter now has a single distinct index entry.

The secondary index terms were originally useful because
max_active_replication_origins was part of max_replication_slots,
and separate index entries helped users locate each setting. However,
commit 04ff636cbce split them into independent parameters,
making the secondary terms redundant.

This commit removes the unnecessary secondary index entries to
simplify the documentation.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://postgr.es/m/e825e7a7-4877-441d-93c1-25377db36c31@oss.nttdata.com
2025-04-25 14:58:14 +09:00
Bruce Momjian
6389db2320 doc: simplify new EXPLAIN ANALYZE BUFFERS description 2025-04-24 22:02:35 -04:00
Michael Paquier
3631612eae psql: Fix assertion failures with pipeline mode
A correct cocktail of COPY FROM, SELECT and/or DML queries and
\syncpipeline was able to break the logic in charge of discarding
results of a pipeline, done in discardAbortedPipelineResults().  Such
a sequence makes the backend generate a FATAL error, due to a protocol
synchronization loss.

This problem comes down to the fact that we did not consider the case of
libpq returning a PGRES_FATAL_ERROR when discarding the results of an
aborted pipeline.  The discarding code is changed so that this result
status is handled as a special case, with the caller of
discardAbortedPipelineResults() being responsible for consuming the
result.

A couple of tests are added to cover the problems reported, bringing an
interesting gain in coverage as there were no tests in the tree covering
the case of protocol synchronization loss.

Issue introduced by 41625ab8ea3d.

Reported-by: Alexander Kozhemyakin <a.kozhemyakin@postgrespro.ru>
Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/ebf6ce77-b180-4d6b-8eab-71f641499ddf@postgrespro.ru
2025-04-24 12:22:53 +09:00
Michael Paquier
923ae50cf5 Add sanity check for dshash entries when reading pgstats file
Not having this check would produce a core dump at startup when running
pgstat_read_statsfile(), in the case where the information of a stats
kind for an entry in the dshash could not be found.  The same check
already happens for fixed-numbered stats and entries that are stored
with their names.  This issue can be seen with custom stats kinds.

Note that this problem can be reproduced with what is in the core code:
- Tweak the test module injection_points to not load the fixed-numbered
stats part, leaving only the variable-numbered stats.
- Create an instance with injection_points defined in
shared_preload_libraries.
- Create a pgstats entry by attaching and running a point.
- Restart the server without shared_preload_libraries.  The startup
process detects that something is wrong and reports a WARNING.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/aAieZAvM+K1d89R2@ip-10-97-1-34.eu-west-3.compute.internal
2025-04-24 09:20:01 +09:00
Tom Lane
bc19f63f80 Avoid possibly-theoretical OOM crash hazard in hash_create().
One place in hash_create() used DynaHashAlloc() as a convenient
shorthand for MemoryContextAlloc().  That was fine when it was
written, but it stopped being fine when 9c911ec06 changed
DynaHashAlloc() to use MCXT_ALLOC_NO_OOM (mea culpa).  Change
the code to call plain MemoryContextAlloc() as intended.

I think that this bug may be unreachable in practice, since we now
always create AllocSets with some space already allocated, so that
an OOM failure here for a non-shared hash table should be impossible
(with a hash table name of reasonable length anyway).  And there
aren't enough shared hash tables to make a crash for one of those
probable.  Nonetheless it's clearly not operating as designed, so
back-patch to v16 where 9c911ec06 came in.

Reported-by: Maksim Korotkov <m.korotkov@postgrespro.ru>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/219bdccd460510efaccf90b57e5e5ef2@postgrespro.ru
Backpatch-through: 16
2025-04-23 16:04:55 -04:00
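A rough sketch of the distinction at play, with a hypothetical caller (only the allocation behavior is the point): plain MemoryContextAlloc() raises an error on allocation failure, while an allocation made with MCXT_ALLOC_NO_OOM, e.g. via MemoryContextAllocExtended(), can return NULL and must be checked.

    #include "postgres.h"
    #include "utils/memutils.h"

    static char *
    copy_table_name(MemoryContext cxt, const char *tabname)
    {
        /* throws ERROR on out-of-memory, so the result is never NULL */
        char   *copy = MemoryContextAlloc(cxt, strlen(tabname) + 1);

        /*
         * By contrast, MemoryContextAllocExtended(cxt, size, MCXT_ALLOC_NO_OOM)
         * may return NULL; using it where the result is never checked, as in
         * the bug above, trades a clean error for a crash.
         */
        strcpy(copy, tabname);
        return copy;
    }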
Jacob Champion
005ccae0f2 oauth: Support Python 3.6 in tests
RHEL8 ships a patched 3.6.8 as its base Python version, and I
accidentally let some newer Python-isms creep into oauth_server.py
during development.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Tested-by: Renan Alves Fonseca <renanfonseca@gmail.com>
Tested-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/16098.1745079444%40sss.pgh.pa.us
2025-04-23 11:16:45 -07:00
Alexander Korotkov
bb78e42678 Maintain RelIdToTypeIdCacheHash in TypeCacheOpcCallback()
b85a9d046efd introduced a new RelIdToTypeIdCacheHash, whose entries should
exist for typecache entries with TCFLAGS_HAVE_PG_TYPE_DATA flag set or any
of TCFLAGS_OPERATOR_FLAGS set or tupDesc set.  However, TypeCacheOpcCallback(),
which resets TCFLAGS_OPERATOR_FLAGS, neglected to update
RelIdToTypeIdCacheHash.

This commit adds a delete_rel_type_cache_if_needed() call to the
TypeCacheOpcCallback() function to maintain RelIdToTypeIdCacheHash after
resetting TCFLAGS_OPERATOR_FLAGS.

Also, this commit fixes the name of the delete_rel_type_cache_if_needed()
function in its mentions in the comments.

Reported-by: Noah Misch
Discussion: https://postgr.es/m/20250411203241.e9.nmisch%40google.com
2025-04-23 20:26:52 +03:00
Alexander Korotkov
9f404d7922 Properly prepare varinfos in estimate_multivariate_bucketsize()
To estimate with extended statistics, we need to clear the varnullingrels
field in the expression, and duplicates are not allowed in the GroupVarInfo
list.  We might re-use add_unique_group_var(), but we don't do so for two
reasons.

  1) We must keep the origin_rinfos list ordered exactly the same way as
     varinfos.
  2) add_unique_group_var() is designed for estimate_num_groups(), where a
     larger number of groups is worse.   While estimating the number of hash
     buckets, we have the opposite: a lesser number of groups is worse.
     Therefore, we don't have to remove "known equal" vars: the removed var
     may valuably contribute to the multivariate statistics to grow the number
     of groups.

This commit adds custom code to estimate_multivariate_bucketsize() to
initialize varinfos properly.

Reported-by: Robins Tharakan <tharakan@gmail.com>
Discussion: https://postgr.es/m/18885-da51324078588253%40postgresql.org
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
2025-04-23 20:25:21 +03:00
Tom Lane
3db61db48e Change the names generated for child foreign key constraints.
When a foreign key constraint is placed on a partitioned table, we
actually make two pg_constraint entries associated with that table.
(I have my doubts about the wisdom of that, but it's been like that
since v12 and post-feature-freeze is no time to be messing with such
entrenched decisions.)  The second "child" entry always had a name
generated according to the default rule, "table_column(s)_fkey[nnn]",
even if the primary entry had an unrelated user-specified name.  The
trouble with doing that is that the default name could collide with
the user-specified name of some other constraint on the same table.
While we were willing to adjust the generated name to avoid
collisions, that only helps if it's made second; if it's made first
then creation of the other constraint would fail, potentially causing
dump/reload or pg_upgrade failures.

The core of the problem here is that we're infringing on user
namespace, so I doubt that there's any 100% solution other than to
find a way to not need the "child" entry.  In the meantime, it seems
like it'd be an improvement to make the child's name be the name of
the parent constraint with an underscore and digit(s) appended as
necessary to make it unique.  This rule can in theory fail in the same
way, but it seems much less probable; for one thing, this rule is
guaranteed not to match primary entries having auto-generated names.
(While an auto-generated primary name isn't user-specified to begin
with, it acts like that during dump/reload, so collisions against such
names are definitely possible.)

An additional bonus, visible in some of the regression test cases
that change here, arises from the fact that some error messages
cite the child constraint's name not the parent's.  In the
previous approach the two names could be completely unrelated,
leading to user confusion --- the more so since psql's \d command
hides child constraints.  With this approach it's hopefully much
clearer which constraint-the-user-knows-about is failing.

However, that does mean that there's user-visible behavior change
occurring here, making it seem like not something to back-patch.
I feel it's not too late for v18, though.

Reported-by: Kirill Reshke <reshkekirill@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Alvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/CALdSSPhGitjpTfzEMJN-Y2x+Q-5QChSxAsmSJ1-E8mQJLkHOqQ@mail.gmail.com
2025-04-23 12:03:02 -04:00
Daniel Gustafsson
994a100b37 Allocate JsonLexContexts on the heap to avoid warnings
The stack-allocated JsonLexContexts, in combination with codepaths
using goto, were causing warnings when compiling with LTO enabled,
as the optimizer is unable to figure out that this is safe.  Rather than
contort the code with workarounds, simply heap-allocate the structs
instead, as these are not in any performance-critical paths.

Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2074634.1744839761@sss.pgh.pa.us
2025-04-23 11:02:05 +02:00
Michael Paquier
0ff95e0a5b psql: Rework TAP routine psql_fails_like() to define WAL sender context
The routine was coded so that a WAL sender was always used, a state
required only for one failure test related to START_REPLICATION.  This
test is changed so that a WAL sender is used by passing a replication
option to psql_fails_like(), instead of forcing the use of a WAL sender
for all the tests.

This has come up as useful in the context of a separate bug fix where
we are looking at extending tests for some failure scenarios.  These
tests need to happen in the context of a normal backend, and not a WAL
sender where the extended query protocol cannot be used.

Discussion: https://postgr.es/m/aAXkJIOildLUA7vQ@paquier.xyz
2025-04-23 15:33:07 +09:00
Amit Kapila
0e091ce409 Fix an oversight in 3f28b2fcac.
Commit 3f28b2fcac tried to ensure that the replication origin shouldn't be
advanced in case of an ERROR in the apply worker, so that it can request
the same data again after restart. However, it is possible that an ERROR
was caught and handled by a (say PL/pgSQL) function, and the apply worker
continues to apply further changes, in which case, we shouldn't reset the
replication origin.

Ensure to reset the origin only when the apply worker exits after an
ERROR.

Commit 3f28b2fcac added new function geterrlevel, which we removed in HEAD
as part of this commit, but kept it in backbranches to avoid breaking any
applications. A separate case can be made to have such a function even for
HEAD.

Reported-by: Shawn McCoy <shawn.the.mccoy@gmail.com>
Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 16, where it was introduced
Discussion: https://postgr.es/m/CALsgZNCGARa2mcYNVTSj9uoPcJo-tPuWUGECReKpNgTpo31_Pw@mail.gmail.com
2025-04-23 11:08:24 +05:30
Michael Paquier
1f7878c33c Remove assertion based on pending_since in pgstat_report_stat()
This assertion, based on pending_since (a timestamp used to prevent stats
reports from being too frequent, or set when a partial flush happens), is
reached when it is found that no data can be flushed even though a
previous call of pgstat_report_stat() determined that some stats data was
in need of a flush.  pending_since is set when some stats data is pending
(in non-force mode) or if report attempts are too frequent, and is reset
to 0 once all stats have been flushed.

Since 5cbbe70a9cc6, WAL senders have begun to report their stats on a
periodic basis for IO stats in v16~ and backend stats on HEAD, creating
some friction with the concurrent pgstat_report_stat() calls that can
happen in the context of a WAL sender (shutdown callback doing a final
report or backend-related code paths).  This problem is the cause of
spurious failures in the TAP tests.

In theory, this assertion can be also reached in v15, even if that's
very unlikely.  For example, a process, say a background worker, could
do periodic and direct stats flushes with concurrent calls of
pgstat_report_stat() that could cause conflicting values of
pending_since.  This can be done with WAL or SLRU stats flushes using
pgstat_flush_wal() or pgstat_slru_flush().  HEAD makes this situation
easier to happen with custom cumulative stats.

This commit removes the assertion altogether, per discussion, as it is
more useful to keep the state of things as they are for the WAL sender.
The assertion could use a special state based on for example
am_walsender, but I doubt that this would be meaningful in the long run
based on the other arguments raised while discussing this issue.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/1489124.1744685908@sss.pgh.pa.us
Discussion: https://postgr.es/m/dwrkeszz6czvtkxzr5mqlciy652zau5qqnm3cp5f3p2po74ppk@omg4g3cc6dgq
Backpatch-through: 15
2025-04-23 13:53:29 +09:00
Tom Lane
e0f373ee42 Re-enable SSL connect_fails tests, and fix related race conditions.
Cluster.pm's connect_fails routine has long had the ability to
sniff the postmaster log file for expected messages after a
connection failure.  However, that's always had a race condition:
on some platforms it's possible for psql to exit and the test
script to slurp up the postmaster log before the backend process
has been able to write out its final log messages.  Back in
commit 55828a6b6 we disabled a bunch of tests after discovering
that, and the aim of this patch is to re-enable them.

(The sibling function connect_ok doesn't seem to have a similar
problem, mainly because the messages we look for come out during
the authentication handshake, so that if psql reports successful
connection they should certainly have been emitted already.)

The solution used here is borrowed from 002_connection_limits.pl's
connect_fails_wait routine: set the server's log_min_messages setting
to DEBUG2 so that the postmaster will log child-process exit, and then
wait till we see that log entry before checking for the messages we
are actually interested in.

If a TAP test uses connect_fails' log_like or log_unlike options, and
forgets to set log_min_messages, those connect_fails calls will now
hang until timeout.  Fixing up the existing callers shows that we had
several other TAP tests that were in theory vulnerable to the same
problem.  It's unclear whether the lack of failures is just luck, or
lack of buildfarm coverage, or perhaps there is some obscure timing
effect that only manifests in SSL connections.  In any case, this
change should in principle make those other call sites more robust.
I'm not inclined to back-patch though, unless sometime we observe
an actual failure in one of them.

Reported-by: Andrew Dunstan <andrew@dunslane.net>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/984fca80-85a8-4c6f-a5cc-bb860950b435@dunslane.net
2025-04-22 15:10:50 -04:00
Tom Lane
da83b1ea10 Avoid depending on post-UPDATE row order in float4/float8 tests.
While heapam reproduces the insertion order of rows well, updates
can move rows to varying places depending on autovacuum activity.
In most regression tests we've guarded against getting variable
results due to that, but float4.sql and float8.sql had escaped
notice so far because they update tables that are too small for
autovacuum to pay attention to.

With increasing interest in non-heap table AMs, it seems worth
allowing for update behaviors that are not like heapam's.  Hence,
add ORDER BY to stabilize the results in case the updates put
the rows in a different order.  (We'll continue to assume that a
seqscan will reproduce original insertion order, though.  Removing
that assumption would require vastly-more-invasive test changes.)

Author: Pavel Borisov <pashkin.elfe@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CALT9ZEExHAnBoBVQzQuWPMKUbapF5-FBO3fdeYG3s2tuWQz1NQ@mail.gmail.com
2025-04-22 14:24:21 -04:00
Tom Lane
eaf582806c gen_node_support.pl: improve error message for unclosed struct.
This error message was 'runaway "struct_name"', which isn't all
that clear; I think 'could not find closing brace for "struct_name"'
is better.  Also, provide the location of the struct start using the
script's usual '$file:$lineno' style.

Bug: #18901
Reported-by: Clemens Ruck <clemens.ruck@t-online.de>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18901-424272abe01357e6@postgresql.org
2025-04-22 13:56:31 -04:00
Michael Paquier
e29df428a1 doc: Mention naming convention used by injection points
All the injection points used in the tree have relied on an implied
rule: their names should be made of lower-case characters, with dashes
between the words used.

This commit adds a light mention about that in the docs, encouraging the
practice.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://postgr.es/m/OSCPR01MB14966E14C1378DEE51FB7B7C5F5B32@OSCPR01MB14966.jpnprd01.prod.outlook.com
Backpatch-through: 17
2025-04-22 12:41:29 +09:00
David Rowley
0b06459f3c Doc: reword text explaining the --maintenance-db option
The previous text was a little clumsy.  Here we improve that.

Author: David Rowley <dgrowleyml@gmail.com>
Reported-by: Noboru Saito <noborusai@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/CAAM3qnJtv5YbjpwDfVOYN2gZ9zGSLFM1UGJgptSXmwfifOZJFQ@mail.gmail.com
Backpatch-through: 13
2025-04-22 14:54:22 +12:00
Michael Paquier
02c63f9438 Rename injection point for invalidation messages at end of transaction
This injection point was named "AtEOXact_Inval-with-transInvalInfo", not
respecting the implied naming convention that injection points should
use lower-case characters, with terms separated by dashes.  All the
other points defined in the tree follow this style, so let's be more
consistent.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://postgr.es/m/OSCPR01MB14966E14C1378DEE51FB7B7C5F5B32@OSCPR01MB14966.jpnprd01.prod.outlook.com
Backpatch-through: 17
2025-04-22 10:01:38 +09:00
David Rowley
5e6f9a9c4e Doc: various fixups
* Use <symbol> tags for CONNECTION_* #defines

We were using an inconsistent mix of <literal> and sometimes <function>
tags.

* Use <application> tag for libpq

There was a mix of <literal> and <productname>.

Also fix a whitespace issue.

None of these seem critical enough mistakes to backpatch.

Author: Noboru Saito <noborusai@gmail.com>
Discussion: https://postgr.es/m/CAAM3qnJtv5YbjpwDfVOYN2gZ9zGSLFM1UGJgptSXmwfifOZJFQ@mail.gmail.com
2025-04-22 11:10:08 +12:00
David Rowley
d010cc6cca Doc: fix incorrect punctuation
Author: Noboru Saito <noborusai@gmail.com>
Discussion: https://postgr.es/m/CAAM3qnJtv5YbjpwDfVOYN2gZ9zGSLFM1UGJgptSXmwfifOZJFQ@mail.gmail.com
Backpatch-through: 17
2025-04-22 11:04:04 +12:00
Jeff Davis
90260e2ec6 Fix INITCAP() word boundaries for PG_UNICODE_FAST.
Word boundaries are based on whether a character is alphanumeric or
not. For the PG_UNICODE_FAST collation, alphanumeric includes
non-ASCII digits; whereas for the PG_C_UTF8 collation, it only
includes digits 0-9. Pass down the right information from the
pg_locale_t into initcap_wbnext to differentiate the behavior.

Reported-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250417135841.33.nmisch@google.com
2025-04-21 12:34:58 -07:00
Tom Lane
80b727eb9d Use the same cmd_context throughout a walsender's lifetime.
exec_replication_command created a cmd_context to work in and
then deleted it on exit.  This is pretty dangerous because
some replication commands start/finish transactions.  In the
wake of commit 1afe31f03, that could lead to re-selecting a
CurrentMemoryContext that's already been deleted, leading to
hilarity such as a memory context that is its own parent.

To fix, let's make the cmd_context persist across
exec_replication_command calls; instead of deleting it, we'll just
reset it each time.  In this way it retains the same identity and
there's no problem if transaction abort restores it as the working
context.  It probably even saves a few microseconds to do this.

This fix also ensures that exec_replication_command returns to the
caller (PostgresMain) with the same context active that had been
when it was called (probably MessageContext).  The previous
coding could get that wrong too.

Reported-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAO6_XqoJA7-_G6t7Uqe5nWF3nj+QBGn4F6Ptp=rUGDr0zo+KvA@mail.gmail.com
2025-04-21 12:09:36 -04:00
Tom Lane
5ec8b01c30 MemoryContextCreate: assert parent is valid and different from node.
The case of "node == parent" might seem impossible, since we just
allocated the new node.  But it's possible if parent is a dangling
reference to a recently-deleted context.  In fact, given aset.c's
habit of recycling contexts, it's actually rather likely if that's so.
If we'd had this assertion before, it would have simplified debugging
a recently-identified walsender issue.
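
For illustration, the guard amounts to something like the following
near the top of MemoryContextCreate (exact placement and wording may
differ):

    /* parent may legitimately be NULL only for TopMemoryContext */
    Assert(parent == NULL || MemoryContextIsValid(parent));
    /* a recycled chunk can make a dangling parent pointer equal node */
    Assert(node != parent);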

Reported-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAO6_XqoJA7-_G6t7Uqe5nWF3nj+QBGn4F6Ptp=rUGDr0zo+KvA@mail.gmail.com
2025-04-21 11:34:36 -04:00
Fujii Masao
706cbed351 doc: Fix memory context level in pg_log_backend_memory_contexts() example.
Commit d9e03864b6b changed the memory context level numbers shown by
pg_log_backend_memory_contexts() to be 1-based. However, the example in
the documentation was not updated and still used 0-based numbering.

This commit updates the example to match the current 1-based output.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: David Rowley <drowleyml@gmail.com>
Discussion: https://postgr.es/m/1ad6d388-1b43-400d-bec9-36d52f755f74@oss.nttdata.com
2025-04-21 14:53:25 +09:00
David Rowley
78eda9e264 Fix a few more duplicate words in comments
Similar to 84fd3bc14 but these ones were found using a regex that can span
multiple lines.

Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvrMcr8XD107H3NV=WHgyBcu=sx5+7=WArr-n_cWUqdFXQ@mail.gmail.com
2025-04-21 13:50:50 +12:00
David Rowley
84fd3bc141 Fix a few duplicate words in comments
These are all new to v18

Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvrMcr8XD107H3NV=WHgyBcu=sx5+7=WArr-n_cWUqdFXQ@mail.gmail.com
2025-04-21 10:41:18 +12:00
Noah Misch
8180136652 Comment on need to MarkBufferDirty() if omitting DELAY_CHKPT_START.
Blocking checkpoint phase 2 requires MarkBufferDirty() and
BUFFER_LOCK_EXCLUSIVE; neither suffices by itself.  transam/README documents
this, citing SyncOneBuffer().  Update the DELAY_CHKPT_START documentation to
say this.  Expand the heap_inplace_update_and_unlock() comment that cites
XLogSaveBufferForHint() as precedent, since heap_inplace_update_and_unlock()
could have opted not to use DELAY_CHKPT_START.

Commit 8e7e672cdaa6bfec85d4d5dd9be84159df23bb41 added DELAY_CHKPT_START to
heap_inplace_update_and_unlock().  Since commit
bc6bad88572501aecaa2ac5d4bc900ac0fd457d5 reverted it in non-master branches,
no back-patch.

Discussion: https://postgr.es/m/20250406180054.26.nmisch@google.com
2025-04-20 12:00:17 -07:00
Noah Misch
714bd9e3a7 Test restartpoints in archive recovery.
v14 commit 1f95181b44c843729caaa688f74babe9403b5850 and its v13
equivalent caused timing-dependent failures in archive recovery, at
restartpoints.  The symptom was "invalid magic number 0000 in log
segment X, offset 0", "unexpected pageaddr X in log segment Y, offset 0"
[X < Y], or an assertion failure.  Commit
3635a0a35aafd3bfa80b7a809bc6e91ccd36606a and predecessors back-patched
v15 changes to fix that.  This test reproduces the problem
probabilistically, typically in less than 1000 iterations of the test.
Hence, buildfarm and CI runs would have surfaced enough failures to get
attention within a day.

Reported-by: Arun Thirupathi <arunth@google.com>
Discussion: https://postgr.es/m/20250306193013.36.nmisch@google.com
Backpatch-through: 13
2025-04-20 08:28:48 -07:00
Noah Misch
2d5350cfbd Avoid ERROR at ON COMMIT DELETE ROWS after relhassubclass=f.
Commit 7102070329d8147246d2791321f9915c3b5abf31 fixed a similar bug, but
it missed the case of database-wide ANALYZE ("use_own_xacts" mode).
Commit a07e03fd8fa7daf4d1356f7cb501ffe784ea6257 changed consequences
from silent discard of a pg_class stats (relpages et al.) update to
ERROR "tuple to be updated was already modified".  Losing a relpages
update of an ON COMMIT DELETE ROWS table was negligible, but a
COMMIT-time error isn't negligible.  Back-patch to v13 (all supported
versions).

Reported-by: Richard Guo <guofenglinux@gmail.com>
Reported-by: Robins Tharakan <tharakan@gmail.com>
Discussion: https://postgr.es/m/CAMbWs4-XwMKMKJ_GT=p3_-_=j9rQSEs1FbDFUnW9zHuKPsPNEQ@mail.gmail.com
Backpatch-through: 13
2025-04-20 08:28:48 -07:00
David Rowley
d47f922246 Fix issue with ORDER BY / DISTINCT aggregates and FILTER
1349d2790 added support so that aggregate functions with an ORDER BY or
DISTINCT clause could make use of presorted inputs to avoid an implicit
sort within nodeAgg.c.  That commit failed to consider that a FILTER
clause may exist that filters rows before the aggregate function
arguments are evaluated.  That can be problematic if an aggregate
argument contains an expression which could error out during evaluation.
It's perfectly valid to want to have a FILTER clause which eliminates
such values, and with the pre-sorted path added in 1349d2790, it was
possible that the planner would produce a plan with a Sort node above
the Aggregate to perform the sort on the aggregate's arguments long before
the Aggregate node would filter out the non-matching values.

Here we fix this by inspecting ORDER BY / DISTINCT aggregate functions
which have a FILTER clause to see if the aggregate's arguments are
anything more complex than a Var or a Const.  Evaluating these isn't
going to cause an error.  If we find any non-Var, non-Const parameters
then the planner will now opt to perform the sort in the Aggregate node
for these aggregates, i.e. disable the presorted aggregate optimization.
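
A rough sketch of the shape of that check; the helper name and the
can_presort flag are placeholders rather than the committed code:

    /* presorting is only safe if evaluating the args cannot error out */
    static bool
    agg_args_are_simple(Aggref *aggref)
    {
        ListCell   *lc;

        foreach(lc, aggref->args)
        {
            TargetEntry *tle = (TargetEntry *) lfirst(lc);

            if (!IsA(tle->expr, Var) && !IsA(tle->expr, Const))
                return false;
        }
        return true;
    }

    if (aggref->aggfilter != NULL && !agg_args_are_simple(aggref))
        can_presort = false;    /* sort inside nodeAgg.c instead */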

An alternative fix would have been to completely disallow the presorted
optimization for Aggrefs with any FILTER clause, but that wasn't done as
that could cause large performance regressions for queries that see
significant gains from 1349d2790 due to presorted results coming in from
an Index Scan.

Backpatch to 16, where 1349d2790 was introduced

Author: David Rowley <dgrowleyml@gmail.com>
Reported-by: Kaimeh <kkaimeh@gmail.com>
Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAK-%2BJz9J%3DQ06-M7cDJoPNeYbz5EZDqkjQbJnmRyQyzkbRGsYkA%40mail.gmail.com
Backpatch-through: 16
2025-04-20 22:12:07 +12:00
Michael Paquier
78231baaf9 psql: Split extended query protocol meta-commands in --help=commands
Compared to v17, where only \bind was able to do extended query protocol
work, v18 now has a total of 11 meta-commands related to the extended
query protocol.  These were all listed under the "General" section of the
--help=commands output even though they are specialized, bloating the
generated output.

All these meta-commands are moved into a new section called "Extended
Query Protocol", listed at the end of --help=commands.

This split has been suggested by Noah Misch.

Discussion: https://postgr.es/m/20250415213450.1f.nmisch@google.com
2025-04-20 08:34:38 +09:00
Michael Paquier
5743d122fc psql: Improve descriptions of \\flush[request] in --help
Noah has reported that the current wording was confusing compared to the
description of the underlying libpq routine.  The new wording is from
me.

Reported-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250415213450.1f.nmisch@google.com
2025-04-20 08:16:57 +09:00
Michael Paquier
5ee7bd944e psql: Fix incorrect status code returned by \getresults
When an invalid number of results is requested for \getresults, the
status code returned by exec_command_getresults() was PSQL_CMD_SKIP_LINE
and not PSQL_CMD_ERROR.

This led to incorrect behaviors, with ON_ERROR_STOP for example.
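
In sketch form; the variable name and error text are illustrative, not
the committed code:

    if (num_results <= 0)
    {
        pg_log_error("\\getresults: invalid number of requested results");
        return PSQL_CMD_ERROR;      /* previously PSQL_CMD_SKIP_LINE */
    }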

Reported-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250415213450.1f.nmisch@google.com
2025-04-20 08:15:39 +09:00
Tom Lane
d05996340d Be more wary of corrupt data in pageinspect's heap_page_items().
The original intent in heap_page_items() was to return nulls, not
throw an error or crash, if an item was sufficiently corrupt that
we couldn't safely extract data from it.  However, commit d6061f83a
utterly missed that memo, and not only put in an un-length-checked
copy of the tuple's data section, but also managed to break the check
on sane nulls-bitmap length.  Either mistake could possibly lead to
a SIGSEGV crash if the tuple is corrupt.

Bug: #18896
Reported-by: Dmitry Kovalenko <d.kovalenko@postgrespro.ru>
Author: Dmitry Kovalenko <d.kovalenko@postgrespro.ru>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18896-add267b8e06663e3@postgresql.org
Backpatch-through: 13
2025-04-19 16:37:42 -04:00
Michael Paquier
88e947136b Fix typos and grammar in the code
The large majority of these have been introduced by recent commits done
in the v18 development cycle.

Author: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/9a7763ab-5252-429d-a943-b28941e0e28b@gmail.com
2025-04-19 19:17:42 +09:00
Michael Paquier
114f7fa81c Rename injection points used in AIO tests
The format of the injection point names used by the AIO code does not
match the existing naming convention used everywhere else in the code,
so let's be consistent.  These points are used in test_aio.

Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/Z_yTB80bdu1sYDqJ@paquier.xyz
2025-04-19 18:53:35 +09:00
Fujii Masao
3aad76a0a9 Make pg_upgrade log message with control file path translatable.
Commit 173c97812ff replaced the hardcoded "global/pg_control" in a
pg_upgrade log message with a string-literal concatenation of the
XLOG_CONTROL_FILE macro.  However, this change made the message
untranslatable.

This commit fixes the issue by using %s with XLOG_CONTROL_FILE instead of
that literal concatenation, allowing the message to be translated properly.
It also wraps the file path in double quotes for consistency with similar
log messages.
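
Illustratively (the actual message text in the tree differs):

    /* before: the path becomes part of the translatable string */
    pg_fatal("could not open " XLOG_CONTROL_FILE);

    /* after: translators see one stable string; the path is substituted */
    pg_fatal("could not open file \"%s\"", XLOG_CONTROL_FILE);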

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Masao Fujii <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/20250407.155546.2129693791769531891.horikyota.ntt@gmail.com
2025-04-18 18:35:40 +09:00
Tatsuo Ishii
05883bd6e5 Doc: fix missing comma at the end of a line.
Backpatch to 17, where the line was added.

Reported by Noboru Saito while he was working on translating the file
into Japanese.

Discussion: https://postgr.es/m/20250417.203047.1321297410457834775.ishii%40postgresql.org
Reported-by: Noboru Saito <noborusai@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Backpatch-through: 17
2025-04-18 09:38:46 +09:00
David Rowley
1bd08f6ba5 Fixup various older misuses of appendPQExpBuffer
Use appendPQExpBufferStr when there are no parameters and
appendPQExpBufferChar when the string length is 1.

Unlike 3fae25cbb, which fixed this issue for code that was new to v18,
this one fixes up instances which exist in the backbranches.  We've
historically tried to maintain this standard and if we're going to
continue doing that, then we won't be doing that selectively based on
when the code was introduced.  Now seems like a good time to flush out the
existing misuses.  Waiting until v19 just prolongs their existence in
terms of released versions that the misuses exist in.
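
For example, with an illustrative buffer and strings:

    PQExpBufferData buf;

    initPQExpBuffer(&buf);

    /* wasteful: goes through the printf machinery for fixed strings */
    appendPQExpBuffer(&buf, "SELECT relname FROM pg_class");
    appendPQExpBuffer(&buf, ";");

    /* preferred */
    appendPQExpBufferStr(&buf, "SELECT relname FROM pg_class");
    appendPQExpBufferChar(&buf, ';');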

Author: David Rowley <drowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvoARMvPeXTTC0HnpARBHn-WgVstc8XFCyMGOzvgu_1HvQ@mail.gmail.com
2025-04-18 12:15:08 +12:00
David Rowley
d9e03864b6 Make levels 1-based in pg_log_backend_memory_contexts()
Both pg_get_process_memory_contexts() and pg_backend_memory_contexts
have 1-based levels, whereas pg_log_backend_memory_contexts() was using
0-based levels.  Align these.

This results in slightly saner behavior from MemoryContextStatsDetail()
in regards to the max_level.  Previously it would stop at 1 level before
the maximum requested level rather than at that level.

Reported-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Author: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Author: David Rowley <drowleyml@gmail.com>
Reviewed-by: Melih Mutlu <m.melihmutlu@gmail.com>
Reviewed-by: Rahila Syed <rahilasyed90@gmail.com>
Discussion: https://postgr.es/m/395ea5d4fe190480efa95bf533485c70@oss.nttdata.com
2025-04-18 09:04:28 +12:00
Tom Lane
fc5e966f73 Suppress "may be used uninitialized" warnings from older compilers.
The "children" list won't be used until "got_children" has been set
true, but older compilers don't get that; about half a dozen
buildfarm animals are warning about this.  Issue added by 11ff192b5.

While here, improve slightly-shaky grammar in comment.

Discussion: https://postgr.es/m/2057835.1744833309@sss.pgh.pa.us
2025-04-17 16:47:04 -04:00
Tom Lane
4aad2cb770 Portability fix: isdigit() must be passed an unsigned char.
Oversight in commit 40b9c2701, per buildfarm member mamba.
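
The pattern in question, as a sketch:

    /* plain char may be signed, so cast before calling <ctype.h> classifiers */
    while (isdigit((unsigned char) *ptr))
        ptr++;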
2025-04-17 16:33:21 -04:00
Tom Lane
0400ae4a68 Cache typlens of a SQL function's input arguments.
This gets rid of repetitive get_typlen calls in postquel_sub_params,
which show up as costing a few percent of the runtime in simple test
cases (more with more parameters).
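
A sketch of the idea; the field and array names are illustrative, not
the committed structure:

    /* once, when the function's cache entry is (re)built */
    fcache->argtyplen = (int16 *) palloc(nargs * sizeof(int16));
    for (int i = 0; i < nargs; i++)
        fcache->argtyplen[i] = get_typlen(argtypes[i]);

    /* postquel_sub_params then reads fcache->argtyplen[i] instead of
     * calling get_typlen() for every parameter on every call */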

In combination with the preceding patches, this gets us most of the
way back down to the amount of per-call overhead that functions.c
had before commit 0dca5d68d.  There are some more things that could
be done, but this seems like an okay place to stop for v18.
2025-04-17 12:56:40 -04:00
Tom Lane
0313c5dc62 Make SQLFunctionCache long-lived again.
At this point, the only data structures we allocate directly in
fcontext are the SQLFunctionCache struct itself, the ParamListInfo
struct, and the execution_state array, all of which are small and
perfectly capable of being re-used across executions of the same
FmgrInfo.  Hence, let's give them the same lifespan as the FmgrInfo.
This step gets rid of the separate SQLFunctionLink struct and makes
fn_extra point to SQLFunctionCache again.  We also get rid of the
separate fcontext memory context and allocate these items directly
in fn_mcxt.

For notational simplicity, SQLFunctionCache still has an fcontext
field, but it's just a copy of fn_mcxt.

The motivation for this is to allow these structures to live as
long as the FmgrInfo and be re-used across calls, restoring the
original design without its propensity for memory leaks.  This
gets rid of some per-call overhead that we added in 0dca5d68d.

We also make an effort to re-use the JunkFilter and result slot.
Those might need to change if the function definition changes,
so we compromise by rebuilding them if the cached plan changes.

This also moves the tuplestore into fn_mcxt so that it can be
re-used across calls, again undoing a change made in 0dca5d68d.
2025-04-17 12:56:31 -04:00
Tom Lane
f45a5444ee Split some storage out to separate subcontexts of fcontext.
Put the JunkFilter and its result slot (and thence also
some subsidiary data such as the result tupledesc) into a
separate subcontext "jfcontext".  This doesn't accomplish
a lot at this point, because we make a new JunkFilter each
time through the SQL function.  However, the plan is to make
the fcontext long-lived, and that raises the possibility
that we'll need a new JunkFilter because the plan for the
result-generating query changes.  A separate context makes
it easy to free the obsoleted data when that happens.

Also, instead of always running the sub-executor in fcontext,
make a separate context for it if we're doing lazy eval of
a SRF, and otherwise just run it inside CurrentMemoryContext.
2025-04-17 12:56:21 -04:00
Tom Lane
595d1efeda Make functions.c mostly run in a short-lived memory context.
Previously, much of this code ran with CurrentMemoryContext set
to be the function's fcontext, so that we tended to leak a lot of
stuff there.  Commit 0dca5d68d dealt with that by releasing the
fcontext at the completion of each SQL function call, but we'd
like to go back to the previous approach of allowing the fcontext
to be query-lifespan.  To control the leakage problem, rearrange
the code so that we mostly run in the memory context that fmgr_sql
is called in (which we expect to be short-lived).  Notably, this
means that parsing/planning is all done in the short-lived context
and doesn't leak cruft into fcontext.

This patch also fixes the allocation of execution_state records
so that we don't leak them across executions.  I set that up
with a re-usable array that contains at least as many
execution_state structs as we need for the current querytree.
The chain structure is still there, but it's not really doing
much for us, and maybe somebody will be motivated to get rid
of it.  I'm not though.

This incidentally also moves the call of BlessTupleDesc to be
with the code that creates the JunkFilter.  That doesn't make
much difference now, but a later patch will reduce the number
of times the JunkFilter gets made, and we needn't bless the
results any more often than that.

We still leak a fair amount in fcontext, particularly when
executing utility statements, but that's material for a
separate patch step; the point here is only to get rid of
unintentional allocations in fcontext.
2025-04-17 12:56:08 -04:00
Tom Lane
09b07c2953 Minor performance improvement for SQL-language functions.
Late in the development of commit 0dca5d68d, I added a step to copy
the result tlist we extract from the cached final query, because
I was afraid that that might not last as long as the JunkFilter that
we're passing it off to.  However, that turns out to cost a noticeable
number of cycles, and it's really quite unnecessary because the
JunkFilter will not examine that tlist after it's been created.
(ExecFindJunkAttribute would use it, but we don't use that function
on this JunkFilter.)  Hence, remove the copy step.  For safety,
reset the might-become-dangling jf_targetList pointer to NIL.

In passing, remove DR_sqlfunction.cxt, which we don't use anymore;
it's confusing because it's not entirely clear which context it
ought to point at.
2025-04-17 12:55:58 -04:00
Noah Misch
f4ece891fc Assert lack of hazardous buffer locks before possible catalog read.
Commit 0bada39c83a150079567a6e97b1a25a198f30ea3 fixed a bug of this kind,
which existed in all branches for six days before detection.  While the
probability of reaching the trouble was low, the disruption was extreme.  No
new backends could start, and service restoration needed an immediate
shutdown.  Hence, add this to catch the next bug like it.

The new check in RelationIdGetRelation() suffices to make autovacuum detect
the bug in commit 243e9b40f1b2dd09d6e5bf91ebf6e822a2cd3704 that led to commit
0bada39.  This also checks in a number of similar places.  It replaces each
Assert(IsTransactionState()) that pertained to a conditional catalog read.

No back-patch for now, but a back-patch of commit 243e9b4 should back-patch
this, too.  A back-patch could omit the src/test/regress changes, since back
branches won't gain new index columns.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/20250410191830.0e.nmisch@google.com
Discussion: https://postgr.es/m/10ec0bc3-5933-1189-6bb8-5dec4114558e@gmail.com
2025-04-17 05:00:30 -07:00
Daniel Gustafsson
b669293e34 pg_dump: Set private_data pointer to NULL in callback
The end callback for ZStandard compression frees the private_data
but didn't set the pointer to NULL after freeing.  This is not a
bug as the code stands today, since nothing dereferences the
pointer upon returning from the callback, but it is good practice
to do so.
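
In sketch form (the struct and field names are only loosely modeled on
the compression callback code):

    pg_free(cs->private_data);
    cs->private_data = NULL;    /* don't leave a dangling pointer behind */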

Author: Alexander Kuznetsov <kuznetsovam@altlinux.org>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/efaee52b-9550-44ca-8633-ea86076b3283@altlinux.org
2025-04-17 12:58:00 +02:00
Fujii Masao
e4b0f86e1f pg_dump: Fix incorrect archive format shown in error message.
In pg_dump and pg_restore, _allocAH() calls _discoverArchiveFormat() to
determine the archive format when the input format is unknown.
If the input or discovered format is unrecognized, it reports an error
including the archive format number.

If the discovered format is unrecognized, its number should be shown in
the error message.  But previously the error message mistakenly showed
the originally requested format number (i.e., the unknown one) instead
of the discovered one, due to referencing the wrong variable in the
error message.

This commit corrects the issue by using the appropriate variable in
the error message.

This fix has no practical impact since _discoverArchiveFormat() never
returns an unrecognized format and that error message is actually
never output. Therefore, while the issue exists in back branches,
it's not worth the trouble and buildfarm cycles to back-patch.
So this fix is applied only to the master branch.

Author: Mahendra Singh Thalor <mahi6run@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAKYtNAqu+N-Ab2Fq6wzNSOm_-0N-BMneanYNV1+6kFDXjva1Eg@mail.gmail.com
2025-04-17 09:52:47 +09:00
Jeff Davis
2e5353be25 Another unintentional behavior change in commit e9931bfb75.
Reported-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250412123430.8c.nmisch@google.com
2025-04-16 16:49:42 -07:00
Jeff Davis
b107744ce7 Improve comment in regc_pg_locale.c.
Reported-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250412123430.8c.nmisch@google.com
2025-04-16 16:49:35 -07:00
David Rowley
3fae25cbb3 Fixup various new-to-v18 usages of appendPQExpBuffer
Use appendPQExpBufferStr when there are no parameters and
appendPQExpBufferChar when the string length is 1.

Author: David Rowley <drowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvoARMvPeXTTC0HnpARBHn-WgVstc8XFCyMGOzvgu_1HvQ@mail.gmail.com
2025-04-17 11:37:55 +12:00
David Rowley
f3281f9f93 Improve comments for estimate_multivariate_ndistinct()
estimate_multivariate_ndistinct() is coded to assume the caller handles
passing it a list of GroupVarInfos with unique 'var' fields over the
entire list.  6bb6a62f3 added code which didn't ensure this and that
could result in estimate_multivariate_ndistinct() erroring out with:

ERROR:  corrupt MVNDistinct entry

This occurred because estimate_multivariate_ndistinct() first searches
for a set of stats that match to at least two of the given GroupVarInfos
and then later assumes that the MVNDistinctItem.items array of the
best matching stats will have an entry for those two columns.  If the
GroupVarInfos List contained a duplicate entry, then the same column could
be matched twice, and that could trick the code into thinking we have
>= 2 columns matched in cases where only a single distinct column has been
matched.  This could result in a failure to find the correct
MVNDistinctItem in the stats as the array containing those never
contains an item for single columns.

Here we make it more clear that the function needs a distinct set of
GroupVarInfos and also tidy up a few other comments to make things a bit
easier to follow.

Author: David Rowley <drowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvocZCUhM9W9mJ39d6oQz7ePKoqFnao_347mvC-A7QatcQ@mail.gmail.com
2025-04-17 11:03:24 +12:00
Tom Lane
ab3d8afc7f Sync declarations and definitions of two new tablecmds.c functions.
Buildfarm member drongo complained because the definitions of these
functions used "const Oid foo" where the forward declarations just
had "Oid foo".  (I'm a bit surprised that drongo seems to be the only
complainant.)  I chose to fix this by removing the "consts" because
(a) I'm generally not a fan of using const that way, and (b) it was
a minority usage even within these two functions, let alone compared
to the rest of our code base.

Oversight in commit eec0040c4, so no need for back-patch.
2025-04-16 17:59:08 -04:00
Álvaro Herrera
11ff192b5b Elide not-null constraint checks on child tables during PK creation
We were unnecessarily acquiring AccessExclusiveLock on all child tables
when "ALTER TABLE ONLY sometab ADD PRIMARY KEY" was run on their parent
table, an oversight in commit 14e87ffa5c54.  This caused deadlocks
during pg_restore of partitioned tables.

The reason to acquire the AEL was that we need to verify that child
tables have the involved columns already marked as not-null; but if the
parent table has an inheritable not-null constraint, then all children
must necessarily be in the correct state already, so we can skip the
check, which avoids acquiring the lock.  Reorder the code so that it
works that way.  This doesn't change things in the case where the
constraint doesn't exist, but that case is of lesser importance because
it doesn't occur during parallel pg_restore.

While at it, reword some errmsg() and add errhint() to similar cases in
related but not adjacent code.

Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/67469c1c-38bc-7d94-918a-67033f5dd731@gmx.net
Discussion: https://postgr.es/m/2045026.1743801143@sss.pgh.pa.us
Discussion: https://postgr.es/m/1280408.1744650810@sss.pgh.pa.us
2025-04-16 21:51:23 +02:00
Daniel Gustafsson
1fd3566ebc Update pg_config.h.in with libnuma changes
Add macros from autoheader which were accidentally omitted in
commit 65c298f61fc. There is no function change by this as no
code is currently using the missing macro.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/CF6D7D7F-E1C4-45BE-9019-0F4B4BC7C135@yesql.se
2025-04-16 20:16:57 +02:00
Tom Lane
1fc3403626 Fix pg_dump --clean with partitioned indexes.
We'd try to drop the partitions of a partitioned index separately,
which is disallowed by the backend, leading to an error during
restore.  While the error is harmless, it causes problems if you
try to use --single-transaction mode.

Fortunately, there seems no need to do a DROP at all, since the
partition will go away silently when we drop either the parent index
or the partition's table.  So just make the DROP conditional on not
being a partition.

Reported-by: jian he <jian.universality@gmail.com>
Author: jian he <jian.universality@gmail.com>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxF0QSdkjFKF4di-JGWN6CSdQYEAhGPmQJJCdkSZtd=oLg@mail.gmail.com
Backpatch-through: 13
2025-04-16 13:31:59 -04:00
Andrew Dunstan
40b9c27014 pg_restore cleanups
. remove unnecessary oid_string list stuff
. use pg_get_line_buf() instead of open-coding it
. cleaner parsing of map.dat lines

Reverts 2b69afbe50d ("add new list type simple_oid_string_list to fe-utils/simple_list")

Author: Álvaro Herrera <alvherre@kurilemu.de>
Author: Andrew Dunstan <andrew@dunslane.net>

Discussion: https://postgr.es/m/202504141220.343fmoxfsbj4@alvherre.pgsql
2025-04-16 12:04:34 -04:00
Richard Guo
3b35f9a4c5 Fix an incorrect check in get_memoize_path
Memoize typically marks cache entries as complete after fully scanning
the inner side of a join.  However, in the case of unique joins, we
skip to the next outer tuple as soon as the first matching inner tuple
is found, leaving no opportunity to scan the inner side to completion.
To work around that, we mark cache entries as complete after fetching
the first matching inner tuple in unique joins.

This approach is only safe when all of the join's restriction clauses
are parameterized; otherwise, there is no guarantee that reading just
one tuple from the inner side is sufficient.

Currently, we check for this by verifying that the number of clauses
in ppi_clauses is no less than the number of the join's restriction
clauses.  However, this check isn't entirely reliable, as ppi_clauses
includes join clauses available from all outer rels, not just the
current outer rel.  This means the check could pass even if a
restriction clause isn't parameterized, as long as another join
clause, which doesn't belong to the current join, is included in
ppi_clauses.

To fix this, we explicitly check whether each restriction clause of
the current join is present in ppi_clauses.
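
Roughly, as a sketch rather than the committed hunk, with ppi_clauses
being the inner path's parameterization clauses:

    foreach(lc, extra->restrictlist)
    {
        RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);

        /* every join restriction clause must be parameterized */
        if (!list_member_ptr(ppi_clauses, rinfo))
            return NULL;        /* cannot mark cache entries complete early */
    }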

While we're here, remove the XXX comment from the modified code, as
it's not justified; in certain cases, it's not possible to move a join
clause to the inner side.

This is arguably a bugfix, but no backpatch given the lack of field
reports.

Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Discussion: https://postgr.es/m/CAMbWs4-8JPouj=wBDj4DhK-WO4+Xdx=A2jbjvvyyTBQneJ1=BQ@mail.gmail.com
2025-04-16 10:55:44 +09:00
Daniel Gustafsson
5ee476294c doc: Fix typos in documentation
This fixes a set of typos introduced during the v18 development
cycle.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/7038B4C5-2742-42B1-A8F0-0FFEAECF02A7@yesql.se
2025-04-15 21:32:18 +02:00
Tom Lane
7c87284940 Fix failure for generated column with a not-null domain constraint.
If a GENERATED column is declared to have a domain data type where
the domain's constraints disallow null values, INSERT commands failed
because we built a targetlist that included coercing a null constant
to the domain's type.  The failure occurred even when the generated
value would have been perfectly OK.  This is adjacent to the issues
fixed in 0da39aa76, but we didn't notice for lack of testing a domain
with such a constraint.

We aren't going to use the result of the targetlist entry for the
generated column --- ExecComputeStoredGenerated will overwrite it.
So it's not really necessary that it have the exact datatype of
the generated column.  This patch fixes the problem by changing
the targetlist entry to be a null Const of the domain's base type,
which should be sufficiently legal.  (We do have to tweak
ExecCheckPlanOutput to accept the situation, though.)
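
The substitution is roughly as follows; variable names are illustrative:

    /* resolve the domain down to its base type, then build a NULL of that */
    baseTypeMod = atttypmod;
    baseTypeId = getBaseTypeAndTypmod(atttype, &baseTypeMod);
    new_expr = (Node *) makeNullConst(baseTypeId, baseTypeMod, attcollation);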

This has been broken since we implemented generated columns.
However, this patch only applies easily as far back as v14, partly
because I (tgl) only carried 0da39aa76 back that far, but mostly
because v14 significantly refactored the handling of INSERT/UPDATE
targetlists.  Given the lack of field complaints and the short
remaining support lifetime of v13, I judge the cost-benefit ratio
not good for devising a version that would work in v13.

Reported-by: jian he <jian.universality@gmail.com>
Author: jian he <jian.universality@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxG59tip2+9h=rEv-ykOFjt0cbsPVchhi0RTij8bABBA0Q@mail.gmail.com
Backpatch-through: 14
2025-04-15 12:08:34 -04:00
Fujii Masao
f840f8ee30 doc: Fix missing whitespace in pg_restore documentation.
Previously, a space was missing between "<option>--exclude-schema</option>"
and "for" in the pg_restore documentation. This commit fixes the typo by
adding the missing whitespace.

Back-patch to v17 where the typo was added.

Author: Lele Gaifax <lele@metapensiero.it>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/87lds3ysm0.fsf@metapensiero.it
Backpatch-through: 17
2025-04-15 23:15:06 +09:00
Daniel Gustafsson
7ae13170ba pg_combinebackup: Fix incorrect code documentation
The code comment for parse_oid accidentally used the wrong parameter
when referring to the location of the last backup. Also, while there,
improve sentence wording by removing a superfluous word.

Backpatch to v17 where pg_combinebackup was added

Author: Amul Sul <sulamul@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CAAJ_b95ecWgzcS4K3Dx0E_Yp-SLwK5JBasFgioKMSjhQLw9xvg@mail.gmail.com
Backpatch-through: 17
2025-04-15 15:27:08 +02:00
Peter Eisentraut
c55df7c6ea Fix incorrect format placeholders
BlockNumber is unsigned int.  Fix for commit 14ffaece0fb.
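
For instance (message text illustrative):

    /* BlockNumber is a uint32, so use %u rather than %d */
    elog(DEBUG2, "processing block %u", blkno);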
2025-04-14 08:56:33 +02:00
Peter Eisentraut
7cd171a5d2 Add more source files to pg_verifybackup/nls.mk
also related to commit 8dfd3129027
2025-04-14 08:32:46 +02:00
David Rowley
b51f86e49a Doc: use "an SQL" consistently rather than "a SQL"
Per the precedent set by 04539e73f, adjust article prefixes for "SQL" to
use "an" consistently rather than "a", i.e., "an es-que-ell" rather than
"a sequel".

Both of these are new to v18. Also see b1b13d2b5, d866f0374 and
7bdd489d3.
2025-04-14 11:55:18 +12:00
Daniel Gustafsson
2970c75dd9 Mark sslkeylogfile as Debug option
Mark the sslkeylogfile option as "D" (debug), as this truly is a debug
option, and it will allow postgres_fdw et al. to filter it out as
well.  Also update the display length to match that of an SSL key,
as they are both filename-based inputs.

Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/CAOYmi+=5GyBKpu7bU4D_xkAnYJTj=rMzGaUvHO99-DpNG_YKcw@mail.gmail.com
2025-04-13 21:53:03 +02:00
Andrew Dunstan
64e193f5dd Make AIO error test more portable
Alpine Linux's C library (musl) spells one error message differently.

Reported-by: Wolfgang Walther
2025-04-13 14:39:45 -04:00
Andrew Dunstan
f09088a01d Free memory properly in pg_restore.c
Thinko in commit 39729ec01d2. Mea maxima culpa.

Per Mahendra Singh Thalor <mahi6run@gmail.com>
2025-04-12 14:54:48 -04:00
Tom Lane
78637a8be2 Doc: do a little copy-editing on Index Storage Parameters list.
Add a paragraph break per suggestion from David G. Johnston.
Use a consistent voice for all the different parameter
descriptions, and fix a couple of grammatical issues.

Reported-by: Igor Korot <ikorot01@gmail.com>
Co-authored-by: "David G. Johnston" <david.g.johnston@gmail.com>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA+FnnTz=EW1VQRpWB9J+G-NSchrPFcw4nR7d0JqzEK9jWKB35A@mail.gmail.com
2025-04-12 13:42:31 -04:00
Tom Lane
e708ffe79d Fix GIN's shimTriConsistentFn to not corrupt its input.
Commit 0f21db36d made an assumption that GIN triConsistentFns
would not modify their input entryRes[] arrays.  But in fact,
the "shim" triConsistentFn that we use for opclasses that don't
supply their own did exactly that, potentially leading to wrong
answers from a GIN index search.  Through bad luck, none of the
test cases that we have for such opclasses exposed the bug.

One response to this could be that the assumption of consistency check
functions not modifying entryRes[] arrays is a bad one, but it still
seems reasonable to me.  Notably, shimTriConsistentFn is itself
assuming that with respect to the underlying boolean consistentFn,
so it's sure being self-centered in supposing that it gets to do so.

Fortunately, it's quite simple to fix shimTriConsistentFn to restore
the entry-time state of entryRes[], so let's do that instead.
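
Conceptually the fix looks like this sketch; the committed version may
restore only the entries it actually changed:

    /* remember the caller's entryRes[] before scribbling on it */
    saved = palloc(key->nentries * sizeof(GinTernaryValue));
    memcpy(saved, key->entryRes, key->nentries * sizeof(GinTernaryValue));

    /* (probe the boolean consistentFn with the MAYBE entries forced to
     * FALSE and then to TRUE, as the shim already did) */

    /* put the caller's view back before returning */
    memcpy(key->entryRes, saved, key->nentries * sizeof(GinTernaryValue));
    pfree(saved);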

This issue doesn't affect any core GIN opclasses, since they all
supply their own triConsistentFns.  It does affect contrib modules
btree_gin, hstore, and intarray.

Along the way, I (tgl) noticed that shimTriConsistentFn failed to
pick up on a "recheck" flag returned by its first call to the boolean
consistentFn.  This may be only a latent problem, since it would be
unlikely for a consistentFn to set recheck for the all-false case
and not any other cases.  (Indeed, none of our contrib modules do
that.)  Nonetheless, it's formally wrong.

Reported-by: Vinod Sridharan <vsridh90@gmail.com>
Author: Vinod Sridharan <vsridh90@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAFMdLD7XzsXfi1+DpTqTgrD8XU0i2C99KuF=5VHLWjx4C1pkcg@mail.gmail.com
Backpatch-through: 13
2025-04-12 12:28:02 -04:00
Peter Geoghegan
a6cab6a78e Harmonize function parameter names for Postgres 18.
Make sure that function declarations use names that exactly match the
corresponding names from function definitions in a few places.  These
inconsistencies were all introduced during Postgres 18 development.

This commit was written with help from clang-tidy, by mechanically
applying the same rules as similar clean-up commits (the earliest such
commit was commit 035ce1fe).
2025-04-12 12:07:36 -04:00
Michael Paquier
fdb69dd582 Fix instability with WAL fsync test in stats.sql
A backend using wal_sync_method set to "open_sync" or "open_datasync"
would fail the test checking the WAL sync data in pg_stat_io.  These
modes guarantee that a sync is done when WAL is written to disk, and the
data checked by the test is not incremented in this case,
issue_xlog_fsync() doing nothing.

Oversight in commit a051e71e28a1.

Author: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0uxwg3xAi4nvdBMJ-zJQEeyg+RotuU+ebM2F6CKmnvaYA@mail.gmail.com
2025-04-12 13:09:48 +09:00
Daniel Gustafsson
847bbb21f8 Fix recently introduced typos
This fixes typos in docs and comments introduced during the v18
development cycle, to keep them from ending up in backbranches.

Author: Jacob Brazeal <jacob.brazeal@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA+COZaCgGua25f2hSrjrDLJcJJAHkwoKgTTqUy-wyL1=64JNjw@mail.gmail.com
2025-04-11 22:17:12 +02:00
Nathan Bossart
5822bf21d5 Add missing space in pg_restore documentation.
Oversight in commit 1495eff7bd.
2025-04-11 10:05:32 -05:00
Peter Eisentraut
914ea1c93c Add missing source file to pg_verifybackup/nls.mk
added by commit 8dfd3129027
2025-04-11 10:53:36 +02:00
Peter Eisentraut
b63cbacb86 Add missing source file to pg_dump/nls.mk
added by commit c1da7281060
2025-04-11 10:28:59 +02:00
Peter Eisentraut
9e0e1cfc3e Add missing source file to pg_upgrade/nls.mk
added by commit 40e2e5e92b7
2025-04-11 10:26:51 +02:00
Peter Eisentraut
7d430a5728 Add missing PGDLLIMPORT markings
Discussion: https://www.postgresql.org/message-id/flat/25095db5-b595-4b85-9100-d358907c25b5%40eisentraut.org
2025-04-11 08:59:52 +02:00
Michael Paquier
2e57790836 Fix race with synchronous_standby_names at startup
synchronous_standby_names cannot be reloaded safely by backends, and the
checkpointer is in charge of updating a state in shared memory if the
GUC is enabled in WalSndCtl, to let the backends know if they should
wait or not for a given LSN.  This provides strict control over the
timing of the waiting queues when the GUC is enabled or disabled, then
reloaded.  The checkpointer is also in charge of waking up the backends
that could be waiting for an LSN when the GUC is disabled.

This logic had a race condition at startup, where it would be possible
for backends not to wait for an LSN even if synchronous_standby_names is
enabled.  This would cause visibility issues with transactions that
should have been waited for but were not.  The problem lasts until the
checkpointer does its initial update of the shared memory state when it
loads synchronous_standby_names.

In order to take care of this problem, the shared memory state in
WalSndCtl is extended to detect if it has been initialized by the
checkpointer, and not only check if synchronous_standby_names is
defined.  In WalSndCtlData, sync_standbys_defined is renamed to
sync_standbys_status, a bits8 able to know about two states:
- If the shared memory state has been initialized.  This flag is set by
the checkpointer at startup once, and never removed.
- If synchronous_standby_names is known as defined in the shared memory
state.  This is the same as the previous sync_standbys_defined in
WalSndCtl.

This method gives a way for backends to decide what they should do until
the shared memory area is initialized, and they now ultimately fall back
to a check on the GUC value in this case, which is the best thing that
can be done.
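
In sketch form, with the flag and variable names assumed from the
description above rather than taken from the committed headers:

    #define SYNC_STANDBY_INIT     0x01  /* checkpointer set up the state */
    #define SYNC_STANDBY_DEFINED  0x02  /* synchronous_standby_names is set */

    /* backend side: before the shared state is initialized, fall back
     * to looking at the GUC value directly */
    if ((WalSndCtl->sync_standbys_status & SYNC_STANDBY_INIT) == 0)
        should_wait = SyncStandbysDefined();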

Fortunately, SyncRepUpdateSyncStandbysDefined() is called immediately by
the checkpointer when this process starts, so the window is very narrow.
It is possible to enlarge the problematic window by making the
checkpointer wait at the beginning of SyncRepUpdateSyncStandbysDefined()
with a hardcoded sleep for example, and doing so has showed that a 2PC
visibility test is indeed failing.  On machines slow enough, this bug
would cause spurious failures.

In 17~, we have looked at the possibility of adding an injection point
to have a reproducible test, but as the problematic window happens at
early startup, we would need to invent a way to make an injection point
optionally persistent across restarts when attached, something that
would be fine for this case as it would involve the checkpointer.  This
issue is quite old, and can be reproduced on all the stable branches.

Author: Melnikov Maksim <m.melnikov@postgrespro.ru>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/163fcbec-900b-4b07-beaa-d2ead8634bec@postgrespro.ru
Backpatch-through: 13
2025-04-11 10:00:21 +09:00
David Rowley
530050d8d2 Add code comment explaining ins_since_vacuum and aborted inserts
Sami complained that there's a discrepancy between n_mod_since_analyze
and n_ins_since_vacuum, as the former only accounts for committed changes
and the latter tracks committed and aborted inserts.  Nobody seemed
overly concerned that this would cause any concerning issues.  The
repercussions, from what I can tell, are limited to causing an
autovacuum to trigger for inserts sooner than it otherwise might. For
typical ratios of commits to aborts, it's unlikely to ever be noticed.

Fixing things to make it so n_ins_since_vacuum only displays committed
inserts would require an additional field in PgStat_TableCounts, which
does not quite seem worthwhile at this stage.  This commit just adds a
comment with some details to mention that we know about it, which will
hopefully prevent repeat discussions.

Reported-by: Sami Imseih <samimseih@gmail.com>
Author: David Rowley <drowleyml@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAApHDvpgV3a-R2EGmPOh0L-x3pHbZpM3y4dySWfy+UqUazwDQA@mail.gmail.com
2025-04-11 11:36:21 +12:00
Andrew Dunstan
39729ec01d Fix fat fingering in 22cb6d28950
Per Rainier Vilela
2025-04-10 19:08:04 -04:00
David Rowley
928394b664 Improve various new-to-v18 appendStringInfo calls
Similar to 8461424fd, here we adjust a few new locations which were not
using the most suitable appendStringInfo* function for the intended
purpose.

Author: David Rowley <drowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvqJnNjueb=Eoj8K+8n0g7nj_AcPWSiCj5RNV4fDejAfqA@mail.gmail.com
2025-04-11 10:07:22 +12:00
Daniel Gustafsson
55ef7abf88 Rename global variable backing DSA area
The global variable backing the DSA area for Memory Context stats
reporting had a too generic name, rename to be more descriptive.
Independently reported by Peter and Laurenz.

Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Peter Eisentraut <peter@eisentraut.org>
Reported-by: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/d51172bd4e7f4b07a18a0288ca1b1c28a71a5f6a.camel@cybertec.at
Discussion: https://postgr.es/m/25095db5-b595-4b85-9100-d358907c25b5@eisentraut.org
2025-04-10 22:40:27 +02:00
Andrew Dunstan
22cb6d2895 Fix memory leak in pg_restore.c
Oversight in 1495eff7bdb

Author: Ranier Vilela <ranier.vf@gmail.com>
2025-04-10 14:57:02 -04:00
Tom Lane
d89335eea6 Doc: remove long-obsolete advice about generated constraint names.
It's been twenty years since we generated constraint names that
look like "$N".  So this advice about double-quoting such names
is well past its sell-by date, and now it merely seems confusing.

Reported-by: Yaroslav Saburov <y.saburov@gmail.com>
Author: "David G. Johnston" <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/174393459040.678.17810152410419444783@wrigleys.postgresql.org
Backpatch-through: 13
2025-04-10 14:49:10 -04:00
Tom Lane
f27eb0325b Remove useless check for negative result of ip_addrsize().
By inspection, ip_addrsize() can't return a negative result.
(If it could, we'd have way bigger problems elsewhere.)
So delete useless check in network_send().  Most C compilers
are probably perfectly capable of removing this code by
themselves, but it's confusing/misleading.

Bug: #18889
Reported-by: Daniel Elishakov <dan-eli@mail.ru>
Discussion: https://postgr.es/m/18889-73d4f19e953a629e@postgresql.org
2025-04-10 14:18:07 -04:00
Andrew Dunstan
4170298b6e Further cleanup for directory creation on pg_dump/pg_dumpall
Instead of two separate (and different) implementations, refactor to use
a single common routine.

Along the way, remove use of a hardcoded file permissions constant in
favor of the common project setting for directory creation.

Author: Mahendra Singh Thalor <mahi6run@gmail.com>

Discussion: https://postgr.es/m/CAKYtNApihL8X1h7XO-zOjznc8Ca66Aevgvhc9zOTh6DBh2iaeA@mail.gmail.com
2025-04-10 12:11:36 -04:00
Amit Kapila
4909b38af0 Fix data loss in logical replication.
Data loss can happen when DDLs like ALTER PUBLICATION ... ADD TABLE ...
or ALTER TYPE ... that don't take a strong lock on the table run
concurrently with DMLs on the tables involved in the DDL. This happens
because logical decoding doesn't distribute invalidations to concurrent
transactions, and those transactions use stale cache data to decode the
changes. The problem becomes bigger because we keep using the stale cache
even after those in-progress transactions are finished, and thus skip
changes that need to be sent to the client.

This commit fixes the issue by distributing invalidation messages from
catalog-modifying transactions to all concurrent in-progress transactions.
This allows the necessary rebuild of the catalog cache when decoding new
changes after concurrent DDL.

We observed performance regression primarily during frequent execution of
*publication DDL* statements that modify the published tables. The
regression is minor or nearly nonexistent for DDLs that do not affect the
published tables or occur infrequently, making this a worthwhile cost to
resolve a longstanding data loss issue.

An alternative approach considered was to take a strong lock on each
affected table during publication modification. However, this would only
address issues related to publication DDLs (but not the ALTER TYPE ...)
and require locking every relation in the database for publications
created as FOR ALL TABLES, which is impractical.

The bug exists in all supported branches, but we are backpatching till 14.
The fix for 13 requires somewhat bigger changes than this fix, so the fix
for that branch is still under discussion.

Reported-by: hubert depesz lubaczewski <depesz@depesz.com>
Reported-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Tested-by: Benoit Lobréau <benoit.lobreau@dalibo.com>
Backpatch-through: 14
Discussion: https://postgr.es/m/de52b282-1166-1180-45a2-8d8917ca74c6@enterprisedb.com
Discussion: https://postgr.es/m/CAD21AoAenVqiMjpN-PvGHL1N9DWnHSq673bfgr6phmBUzx=kLQ@mail.gmail.com
2025-04-10 13:14:40 +05:30
Peter Eisentraut
9ad19295e9 Fix incorrect format placeholders
for commits 8f427187db7, 6ee3b91bad2
2025-04-10 08:04:35 +02:00
David Rowley
d7c04db27a Update wording in optimizer/README for EquivalenceClasses
d69d45a5a changed how em_is_child members are stored in
EquivalenceClasses.  Children are no longer stored in the ec_members
list.  optimizer/README mentioned that most operations "should ignore
child members", but that felt a little untrue now since child members
are now stored in a separate place, they simply won't be found by the
normal means of looking (a foreach loop over ec_members), and if you don't
find them, there's technically no need to "ignore" them.

Here we tweak the wording slightly to reflect the new storage location
for child members.

Reported-by: Amit Langote <amitlangote09@gmail.com>
Author: Amit Langote <amitlangote09@gmail.com>
Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqE8v=EuAP_3F_A2xn8zWx+nG_etW_Fe_DvKO-Fkx=+DdQ@mail.gmail.com
2025-04-10 17:33:58 +12:00
Amit Kapila
d438515c29 Cosmetic fixes for pg_createsubscriber's -all option.
Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAHut+PsmSCQ-ENSDQ0YOUcsgzT=GG-E9jyXBvxd51A_dMXH5XA@mail.gmail.com
2025-04-10 10:30:05 +05:30
Tomas Vondra
d15acc915d ci: Check for missing dependencies in meson builds
Extends the Linux and Windows meson builds with a check for missing
dependencies by running

    ninja -t missingdeps

after the build. This highlights unintended dependencies.

Reviewed-by: Andres Freund <andres@anarazel.de>
https://postgr.es/m/CALdSSPi5fj0a7UG7Fmw2cUD1uWuckU_e8dJ+6x-bJEokcSXzqA@mail.gmail.com
2025-04-09 22:01:58 +02:00
Tomas Vondra
3887d0cfeb Cleanup of pg_numa.c
This moves/renames some of the functions defined in pg_numa.c:

* pg_numa_get_pagesize() is renamed to pg_get_shmem_pagesize(), and
  moved to src/backend/storage/ipc/shmem.c. The new name better reflects
  that the page size is not related to NUMA, and it's specifically about
  the page size used for the main shared memory segment.

* move pg_numa_available() to src/backend/storage/ipc/shmem.c, i.e. into
  the backend (which is more appropriate for functions callable from SQL).
  While at it, improve the comment to explain what page size it returns.

* remove unnecessary includes from src/port/pg_numa.c, which were adding
  unnecessary dependencies (src/port should be suitable for frontend).
  These were either leftovers or unnecessary thanks to the other changes
  in this commit.

This eliminates unnecessary dependencies on backend symbols, which we
don't want in src/port.

Reported-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
https://postgr.es/m/CALdSSPi5fj0a7UG7Fmw2cUD1uWuckU_e8dJ+6x-bJEokcSXzqA@mail.gmail.com
2025-04-09 21:50:17 +02:00
Nathan Bossart
e2665efd0f pg_upgrade: Mention that we preserve database OIDs in a comment.
Oversight in commit aa01051418.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/4055696.1744134682%40sss.pgh.pa.us
2025-04-09 14:27:08 -05:00
Tom Lane
837cc73af2 Fix performance issue in deadlock-parallel isolation test.
With debug_discard_caches = 1, the runtime of this test script
increased by about a factor of 10 after commit 0dca5d68d.  That's
causing some of our buildfarm animals to fail with a timeout.

The reason for the increased time is that now we are re-planning
some intentionally-non-inlineable SQL functions on every execution,
where the previous coding held onto the original plans throughout
the outer query.  The previous behavior was arguably quite buggy,
so I don't think 0dca5d68d deserves blame here.  But we would
like this test script to not take so long.

To fix, instead of forcing a "parallel safe" label via a
non-inlineable SQL function, apply it directly to the advisory-lock
functions by making internal-language aliases for them.  A small
problem is that the advisory-lock functions return void but this
test would really like them to return integer 1.  I cheated here by
declaring the aliases as returning "int".  That's perhaps undue
familiarity with the implementation of PG_RETURN_VOID(), but that
hasn't changed in twenty years and is unlikely to do so in the next
twenty.  That gets us an integer 0 result, and then an inline-able
wrapper to convert that to an integer 1 allows the rest of the script
to remain unchanged.

For me, this reduces the runtime with debug_discard_caches = 1
by about 100x, making the test comfortably faster than before
instead of slower.

Discussion: https://postgr.es/m/136163.1744179562@sss.pgh.pa.us
2025-04-09 12:28:34 -04:00
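For illustration, a minimal sketch of the internal-language alias trick described
above (the function name here is made up, and the exact clause ordering is an
assumption):

    -- alias for the built-in pg_advisory_lock(), declared as returning int
    CREATE FUNCTION advisory_lock_as_int(bigint) RETURNS int
        PARALLEL SAFE LANGUAGE internal AS 'pg_advisory_lock';

An inline-able SQL wrapper can then turn the resulting 0 into the 1 that the
test script expects.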
Noah Misch
5bbc596391 Fix test races between syscache-update-pruned.spec and autovacuum.
This spec fails ~3% of my Valgrind runs, and the spec has failed on Valgrind
buildfarm member skink at a similar rate.  Two problems contributed to that:

- A competing buffer pin triggered VACUUM's lazy_scan_noprune() path, causing
  "tuples missed: 1 dead from 1 pages not removed due to cleanup lock
  contention".  FREEZE fixes that.

- The spec ran lazy VACUUM immediately after VACUUM FULL.  The spec implicitly
  assumed lazy VACUUM prunes the one tuple that VACUUM FULL made dead.  First
  wait for old snapshots, making that assumption reliable.

This also adds two forms of defense in depth:

- Wait for snapshots using shared catalog pruning rules (VISHORIZON_SHARED).
  This avoids the removable cutoff moving backward when an XID-bearing
  autoanalyze process runs in another database.  That may never happen in this
  test, but it's cheap insurance.

- Use lazy VACUUM option DISABLE_PAGE_SKIPPING.  Commit
  c2dc1a79767a0f947e1145f82eb65dfe4360d25f did this for a related requirement
  in other tests, but I suspect FREEZE is necessary and sufficient in all
  these tests.

Back-patch to v17, where the test first appeared.

Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/sv3taq4e6ea4qckimien3nxp3sz4b6cw6sfcy4nhwl52zpur4g@h6i6tohxmizu
Backpatch-through: 17
2025-04-09 07:23:39 -07:00
Peter Eisentraut
306dd6e727 Update config.guess and config.sub 2025-04-09 12:41:54 +02:00
Heikki Linnakangas
0f1433f053 Fix a few oversights in the longer cancel keys patch
Change MyCancelKeyLength's type from uint8 to int. While it always
fits in a uint8, plain int is less surprising, as there's no
particular reason for it to be uint8.

Fix one ProcSignalInit caller that passed 'false' instead of NULL for
the pointer argument.

Author: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/61be9e31-7b7d-49d5-bc11-721800d89d64@eisentraut.org
2025-04-09 13:11:42 +03:00
Daniel Gustafsson
ef366b7d7e Perform missed catversion bump
Commit c57971034e69ca renamed an argument for a function but neglected
to bump the catversion to reflect this.

Reported-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvqOega=dPtu3h2C5fJWJEuaGCMDib_sVfhKQqgUNJVmFA@mail.gmail.com
2025-04-09 09:29:12 +02:00
Tom Lane
dd496eedea Doc: note that two examples in optimizer/README are oversimplified.
These examples fail to account for join clauses generated by
EquivalenceClasses, but since we haven't mentioned EquivalenceClasses
yet it seems like it'd just add confusion to make them fully accurate.
Instead, parenthetically note that they're oversimplified.

Reported-by: Zeyuan Hu <ferrishu3886@gmail.com>
Co-authored-by: David Rowley <dgrowleyml@gmail.com>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACvHWmYFo+60yMqKJajDDvKN5EM41YHrCT3oxukwXmGAqpWvyw@mail.gmail.com
2025-04-08 23:03:33 -04:00
Tom Lane
b65b9da568 Adjust AdjustUpgrade.pm for commit b1720fe63.
Need to delete the functions we no longer have available from
the dumps to be reloaded from old versions.

Per buildfarm.
2025-04-08 20:21:03 -04:00
Tom Lane
b1720fe63f Move contrib/spi testing from core regression tests to contrib/spi.
It's weird to have the core regression tests depending on contrib
code, and coverage testing shows that those test queries add nothing
to the core-code coverage of the core tests.  So pull those test bits
out and put them into ordinary test scripts inside contrib/spi/,
making that more like other contrib modules.

Aside from being structurally nicer, anything we can take out of the
core tests (which are executed multiple times per check-world run)
and put into tests executed only once should be a win.  It doesn't
look like this change will buy a whole lot of milliseconds, but a
cycle saved is a cycle earned.

Also, there is some discussion around possibly removing refint and/or
autoinc altogether.  I don't know if that will happen, but we'd
certainly need to decouple them from the core tests to do so.

The tests for autoinc were quite intertwined with the undocumented
"ttdummy" trigger in regress.c.  That made the tests very hard to
understand and contributed nothing to autoinc's testing either.
So I just deleted ttdummy and rewrote the autoinc tests without it.

I realized while doing this that the description of autoinc in
the SGML docs is not a great description of what the function
actually does, so the patch includes some updates to those docs.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/3872677.1744077559@sss.pgh.pa.us
2025-04-08 19:12:03 -04:00
Daniel Gustafsson
c57971034e Rename argument in pg_get_process_memory_contexts().
During development the third argument to pg_get_process_memory_contexts
was a retry count, but it was changed to a timeout instead.  The param
name was accidentally left in pg_proc.dat though.  Fix by renaming to
the correct parameter name.

Author: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/3eb40b3e-45c7-426a-b7f8-81f7d05a9b53@oss.nttdata.com
2025-04-08 23:09:13 +02:00
Peter Eisentraut
8969194b73 Fix incorrect format placeholder
for commit 749a9e20c97
2025-04-08 19:12:03 +02:00
Nathan Bossart
b0a4c3e88b Prevent 006_transfer_modes.pl from leaving files behind.
This test was leaving files like delete_old_cluster.{sh,bat} in the
source directory for VPATH and meson builds.  To fix, change the
directory to tmp_check before running the test, as was done in
commits 15b6d21553, 8af917be6b, and c462b054ba.

Oversight in commit af0d4901c1.

Reported-by: Andrew Dunstan <andrew@dunslane.net> (on Discord)
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/Z_RHkG770w3SE0yU%40nathan
2025-04-08 10:57:31 -05:00
Daniel Gustafsson
88edd661c8 ci: Add MBUILD_TARGET for NetBSD and OpenBSD
Commit b2bdb972c0 added MBUILD_TARGET to ensure that meson builds
the tests before running them, this adds MBUILD_TARGET to OpenBSD
and NetBSD builds as well where it was missing.

No backpatching since OpenBSD and NetBSD support does not exist
in the backbranch CI.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAN55FZ2LNnRrtL+cpSdEg44fQcLPq_GjJjfNa0vz+xqEdq=ZHw@mail.gmail.com
2025-04-08 15:28:29 +02:00
Tomas Vondra
91f1fe90c7 pg_buffercache: Change page_num type to bigint
The page_num was defined as integer, which should be sufficient for the
near future (with 4K pages it's 8TB). But it's virtually free to return
bigint, and get a wider range. This was agreed on the thread, but I
forgot to tweak this in ba2a3c2302f1.

While at it, make the data types in CREATE VIEW a bit more consistent.

Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.com
2025-04-08 12:38:42 +02:00
Tomas Vondra
b8a6078ca8 doc: Correct pg_shmem_allocations_numa.size data type
The code in pg_get_shmem_allocations_numa() returned 'size' as int64,
but the docs said int32.

Report and fix by Noriyoshi Shinoda.

Reported-by: Noriyoshi Shinoda <noriyoshi.shinoda@hpe.com>
Discussion: https://postgr.es/m/DM4PR84MB1734308EB741A6ECFF040C27EEAA2@DM4PR84MB1734.NAMPRD84.PROD.OUTLOOK.COM
2025-04-08 12:36:36 +02:00
Amit Kapila
12eece5fd5 Fix uninitialized index information access during apply.
The issue happens when building conflict information during apply of
INSERT or UPDATE operations that violate unique constraints on leaf
partitions.

The problem was introduced in commit 9ff68679b5, which removed the
redundant calls to ExecOpenIndices/ExecCloseIndices. The previous code was
relying on the redundant ExecOpenIndices call in
apply_handle_tuple_routing() to build the index information required for
unique key conflict detection.

The fix is to delay building the index information until a conflict is
detected instead of relying on ExecOpenIndices to do the same. The
additional benefit of this approach is that it avoids building index
information when there is no conflict.

Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/TYAPR01MB57244ADA33DDA57119B9D26494A62@TYAPR01MB5724.jpnprd01.prod.outlook.com
2025-04-08 15:35:42 +05:30
Thomas Munro
7ea21f4ee2 Fix typo in docs.
Typo in previous commit.
2025-04-08 22:02:45 +12:00
Thomas Munro
f78ca6f3eb Introduce file_copy_method setting.
It can be set to either COPY (the default) or CLONE if the system
supports it.  CLONE causes callers of copydir(), currently CREATE
DATABASE ... STRATEGY=FILE_COPY and ALTER DATABASE ... SET TABLESPACE =
..., to use copy_file_range (Linux, FreeBSD) or copyfile (macOS) to copy
files instead of a read-write loop over the contents.

CLONE gives the kernel the opportunity to share block ranges on
copy-on-write file systems and push copying down to storage on others,
depending on configuration.  On some systems CLONE can be used to clone
large databases quickly with CREATE DATABASE ... TEMPLATE=source
STRATEGY=FILE_COPY.

Other operating systems could be supported; patches welcome.

Co-authored-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGLM%2Bt%2BSwBU-cHeMUXJCOgBxSHLGZutV5zCwY4qrCcE02w%40mail.gmail.com
2025-04-08 21:35:38 +12:00
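For illustration, a minimal sketch of using the new setting (assuming it can be
set at session level; the database names are hypothetical):

    SET file_copy_method = clone;
    CREATE DATABASE fast_copy TEMPLATE = source_db STRATEGY = file_copy;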
Daniel Gustafsson
042a66291b Add function to get memory context stats for processes
This adds a function for retrieving memory context statistics
and information from backends as well as auxiliary processes.
The intended use case is cluster debugging when under memory
pressure or with unanticipated memory usage characteristics.

When calling the function it sends a signal to the specified
process to submit statistics regarding its memory contexts
into dynamic shared memory.  Each memory context is returned
in detail, followed by a cumulative total in case the number
of contexts exceeds the max allocated amount of shared memory.
Each process is limited to using at most 1MB of memory for this.

A summary can also be explicitly requested by the user, this
will return the TopMemoryContext and a cumulative total of
all lower contexts.

In order to not block on busy processes the caller specifies
the number of seconds during which to retry before timing out.
In the case where no statistics are published within the set
timeout,  the last known statistics are returned, or NULL if
no previously published statistics exist.  This allows dash-
board type queries to continually publish even if the target
process is temporarily congested.  Context records contain a
timestamp to indicate when they were submitted.

Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Discussion: https://postgr.es/m/CAH2L28v8mc9HDt8QoSJ8TRmKau_8FM_HKS41NeO9-6ZAkuZKXw@mail.gmail.com
2025-04-08 11:06:56 +02:00
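A minimal usage sketch, assuming the argument order (pid, summary, timeout in
seconds) implied by the description above; the PID is a placeholder:

    -- detailed per-context stats for backend 1234, waiting up to 5 seconds
    SELECT * FROM pg_get_process_memory_contexts(1234, false, 5);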
Andres Freund
15f0cb26b5 Increase BAS_BULKREAD based on effective_io_concurrency
Before, BAS_BULKREAD was always of size 256kB. With the default
io_combine_limit of 16, that only allowed 1-2 IOs to be in flight -
insufficient even on very low latency storage.

We don't just want to increase the size to a much larger hardcoded value, as
very large rings (10s of MBs of buffers) appear to have negative
performance effects when reading in data that the OS has cached (but not when
actually needing to do IO).

To address this, increase the size of BAS_BULKREAD to allow for
io_combine_limit * effective_io_concurrency buffers getting read in. To
prevent the ring being much larger than useful, limit the increased size with
GetPinLimit().

The formula outlined above keeps the ring size to sizes for which we have not
observed performance regressions, unless very large effective_io_concurrency
values are used together with large shared_buffers setting.

Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/lqwghabtu2ak4wknzycufqjm5ijnxhb4k73vzphlt2a3wsemcd@gtftg44kdim6
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah@brqs62irg4dt
2025-04-08 02:41:03 -04:00
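As a worked example of the sizing formula (illustrative values, not defaults
asserted by this commit): with io_combine_limit = 16 blocks (128kB at the 8kB
block size) and effective_io_concurrency = 16, the ring would target
16 * 16 = 256 buffers, i.e. 2MB, still subject to the GetPinLimit() clamp.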
Andres Freund
dcf7e1697b Add pg_buffercache_evict_{relation,all} functions
In addition to the added functions, the pg_buffercache_evict() function now
shows whether the buffer was flushed.

pg_buffercache_evict_relation(): Evicts all shared buffers in a
relation at once.
pg_buffercache_evict_all(): Evicts all shared buffers at once.

Both functions provide mechanism to evict multiple shared buffers at
once. They are designed to address the inefficiency of repeatedly calling
pg_buffercache_evict() for each individual buffer, which can be time-consuming
when dealing with large shared buffer pools. (e.g., ~477ms vs. ~2576ms for
16GB of fully populated shared buffers).

These functions are intended for developer testing and debugging
purposes and are available to superusers only.

Minimal tests for the new functions are included. Also, there was no test for
pg_buffercache_evict(), so a test for it has been added too.

No new extension version is needed, as it was already increased this release
by ba2a3c2302f.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Aidar Imamov <a.imamov@postgrespro.ru>
Reviewed-by: Joseph Koshakow <koshy44@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ0h_YoSqqutxV6DES1RW8ig6wcA8CR9rJk358YRMxZFmw%40mail.gmail.com
2025-04-08 02:19:32 -04:00
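A minimal usage sketch (assuming pg_buffercache_evict_relation() accepts a
regclass argument; requires superuser and the pg_buffercache extension):

    SELECT pg_buffercache_evict_relation('my_table'::regclass);
    SELECT pg_buffercache_evict_all();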
David Rowley
d69d45a5a9 Speedup child EquivalenceMember lookup in planner
When planning queries to partitioned tables, we clone all
EquivalenceMembers belonging to the partitioned table into em_is_child
EquivalenceMembers for each non-pruned partition.  For partitioned tables
with large numbers of partitions, this meant the ec_members list could
become large and code searching that list would become slow.  Effectively,
the more partitions which were present, the more searches needed to be
performed for operations such as find_ec_member_matching_expr() during
create_plan() and the more partitions present, the longer these searches
would take, i.e., a quadratic slowdown.

To fix this, here we adjust how we store EquivalenceMembers for
em_is_child members.  Instead of storing these directly in ec_members,
these are now stored in a new array of Lists in the EquivalenceClass,
which is indexed by the relid.  When we want to find EquivalenceMembers
belonging to a certain child relation, we can narrow the search to the
array element for that relation.

To make EquivalenceMember lookup easier and to reduce the amount of code
change, this commit provides a pair of functions to allow iteration over
the EquivalenceMembers of an EC which also handles finding the child
members, if required.  Callers that never need to look at child members
can remain using the foreach loop over ec_members, which will now often
be faster due to only parent-level members being stored there.

The actual performance increases here are highly dependent on the number
of partitions and the query being planned.  Performance increases can be
visible with as few as 8 partitions, but the speedup is marginal for
such low numbers of partitions.  The speedups become much more visible
with a few dozen to hundreds of partitions.  With some tested queries
using 56 partitions, the planner was around 3x faster than before.  For
use cases with thousands of partitions, these are likely to become
significantly faster.  Some testing has shown planner speedups of 60x or
more with 8192 partitions.

Author: Yuya Watari <watari.yuya@gmail.com>
Co-authored-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrey Lepikhov <a.lepikhov@postgrespro.ru>
Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru>
Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Tested-by: Thom Brown <thom@linux.com>
Tested-by: newtglobal postgresql_contributors <postgresql_contributors@newtglobalcorp.com>
Discussion: https://postgr.es/m/CAJ2pMkZNCgoUKSE%2B_5LthD%2BKbXKvq6h2hQN8Esxpxd%2Bcxmgomg%40mail.gmail.com
2025-04-08 18:09:57 +12:00
Amit Kapila
105b2cb336 Stabilize 035_standby_logical_decoding.pl.
Some tests try to invalidate logical slots on the standby server by
running VACUUM on the primary. The problem is that xl_running_xacts was
getting generated and replayed before the VACUUM command, leading to the
advancement of the active slot's catalog_xmin. Due to this, active slots
were not getting invalidated, leading to test failures.

We fix it by skipping the generation of xl_running_xacts for the required
tests with the help of injection points. As the required interface for
injection points was not present in back branches, we fixed the failing
tests in them by disallowing the slot to become active for the required
cases (where rows_removed conflict could be generated).

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 16, where it was introduced
Discussion: https://postgr.es/m/Z6oQXc8LmiTLfwLA@ip-10-97-1-34.eu-west-3.compute.internal
2025-04-08 09:38:02 +05:30
Bruce Momjian
46b4ba533c Fix PG 17 [NOT] NULL optimization bug for domains
A PG 17 optimization allowed columns with NOT NULL constraints to skip
table scans for IS NULL queries, and to skip IS NOT NULL checks for IS
NOT NULL queries.  This didn't work for domain types, since domain types
don't follow the IS NULL/IS NOT NULL constraint logic.  To fix, disable
this optimization for domains for PG 17+.

Reported-by: Jan Behrens

Diagnosed-by: Tom Lane

Discussion: https://postgr.es/m/Z37p0paENWWUarj-@momjian.us

Backpatch-through: 17
2025-04-07 21:33:42 -04:00
Michael Paquier
039549d70f Flush the IO statistics of active WAL senders more frequently
WAL senders do not flush their statistics until they exit, limiting the
monitoring possible for live processes.  This is penalizing when WAL
senders are running for a long time, like in streaming or logical
replication setups, because it is not possible to know the amount of IO
they generate while running.

This commit makes WAL senders more aggressive with their statistics
flush, using an interval of 1 second, with the flush timing calculated
based on the existing GetCurrentTimestamp() done before the sleeps done
to wait for some activity.  Note that the sleep done for logical and
physical WAL senders happens in two different code paths, so the stats
flushes need to happen in these two places.

One test is added for the physical WAL sender case, and one for the
logical WAL sender case.  This can be done in a stable fashion by
relying on the WAL generated by the TAP tests in combination with a
stats reset while a server is running, but only on HEAD as WAL data has
been added to pg_stat_io in a051e71e28a1.

This issue exists since a9c70b46dbe and the introduction of pg_stat_io,
so backpatch down to v16.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Discussion: https://postgr.es/m/Z73IsKBceoVd4t55@ip-10-97-1-34.eu-west-3.compute.internal
Backpatch-through: 16
2025-04-08 07:57:19 +09:00
Tomas Vondra
ba2a3c2302 Add pg_buffercache_numa view with NUMA node info
Introduces a new view pg_buffercache_numa, showing NUMA memory nodes
for individual buffers. For each buffer the view returns an entry for
each memory page, with the associated NUMA node.

The database blocks and OS memory pages may have different size - the
default block size is 8KB, while the memory page is 4K (on x86). But
other combinations are possible, depending on configure parameters,
platform, etc. This means buffers may overlap with multiple memory
pages, each associated with a different NUMA node.

To determine the NUMA node for a buffer, we first need to touch the
memory pages using pg_numa_touch_mem_if_required, otherwise we might get
status -2 (ENOENT = The page is not present), indicating the page is
either unmapped or unallocated.

The view may be relatively expensive, especially when accessed for the
first time in a backend, as it touches all memory pages to get reliable
information about the NUMA node. This may also force allocation of the
shared memory.

Author: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.com
2025-04-07 23:08:17 +02:00
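For example, a quick look at the new view (selecting all columns, since the
exact column list is not spelled out above):

    SELECT * FROM pg_buffercache_numa LIMIT 10;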
Tomas Vondra
8cc139bec3 Introduce pg_shmem_allocations_numa view
Introduce new pg_shmem_allocations_numa view with information about how
shared memory is distributed across NUMA nodes. For each shared memory
segment, the view returns one row for each NUMA node backing it, with
the total amount of memory allocated from that node.

The view may be relatively expensive, especially when executed for the
first time in a backend, as it has to touch all memory pages to get
reliable information about the NUMA node. This may also force allocation
of the shared memory.

Unlike pg_shmem_allocations, the view does not show anonymous shared
memory allocations. It also does not show memory allocated using the
dynamic shared memory infrastructure.

Author: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.com
2025-04-07 23:08:17 +02:00
Tomas Vondra
65c298f61f Add support for basic NUMA awareness
Add basic NUMA awareness routines, using a minimal src/port/pg_numa.c
portability wrapper and an optional build dependency, enabled by
--with-libnuma configure option. For now this is Linux-only, other
platforms may be supported later.

A built-in SQL function pg_numa_available() allows checking NUMA
support, i.e. that the server was built/linked with the NUMA library.

The main function introduced is pg_numa_query_pages(), which allows
determining the NUMA node for individual memory pages. Internally the
function uses move_pages(2) syscall, as it allows batching, and is more
efficient than get_mempolicy(2).

Author: Jakub Wartak <jakub.wartak@enterprisedb.com>
Co-authored-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.com
2025-04-07 23:08:17 +02:00
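A minimal check of NUMA support from SQL, using the function named above:

    SELECT pg_numa_available();  -- false if the server was not built with --with-libnuma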
Álvaro Herrera
17bcf4f545
Use specific collation where needed in new test
Oversight in commit a379061a22a8.

Per Czech buildfarm members jay and hippopotamus.
2025-04-07 21:58:06 +02:00
Tom Lane
8cfbdf8f4d Fix some issues in contrib/spi/refint.c.
check_foreign_key incorrectly used a single cache entry for its saved
plans for a 'c' (cascade) trigger, although there are two different
queries to execute depending on whether it fires for an update or a
delete.  This caused the wrong things to be done if both types of
event occur in one session.  (This was indeed visible in the triggers
regression test, but apparently nobody ever questioned it.)  To fix,
add the operation type to the cache key.

Its debug log output failed to distinguish update from delete
events, too.

Also, change the intended trigger usage from BEFORE ROW to AFTER ROW,
and add checks insisting on that usage.  BEFORE is really rather
unsafe, since if there are other BEFORE triggers they might change or
cancel the operation we are trying to check.  AFTER triggers are the
standard way to propagate changes to other rows, so we should follow
that way here.

In passing, remove a useless duplicate lookup of the cache entry.

This code is mostly intended as a documentation example, so we
won't consider a back-patch.

Author: Dmitrii Bondar <d.bondar@postgrespro.ru>
Reviewed-by: Paul Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Lilian Ontowhee <ontowhee@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/79755a2b18ed4fe5e29da6a87a1e00d1@postgrespro.ru
2025-04-07 15:54:16 -04:00
Andres Freund
8e293e689b aio: Make AIO more compatible with valgrind
In some edge cases valgrind flags issues with the memory referenced by
IOs. All of the cases addressed in this change are false positives.

Most of the false positives are caused by UnpinBuffer[NoOwner] marking buffer
data as inaccessible. This happens even though the AIO subsystem still holds a
pin. That's good, there shouldn't be accesses to the buffer outside of AIO
related code until it is pinned by "user" code again. But it requires some
explicit work - if the buffer is not pinned by the current backend, we need to
explicitly mark the buffer data accessible/inaccessible while executing
completion callbacks.

That however causes a cascading issue in IO workers: After the completion
callbacks for a buffer are executed, the page is marked as inaccessible. If
subsequently the same worker is executing IO targeting the same buffer, we
would get an error, as the memory is still marked inaccessible. To avoid that,
we need to explicitly mark the memory as accessible in IO workers.

Another issue is that IO executed in workers or via io_uring will not mark
memory as DEFINED. In the case of workers that is because valgrind does not
track memory definedness across processes. For io_uring that is because
valgrind does not understand io_uring, and therefore its IOs never mark memory
as defined, whether the completions are processed in the defining process or
in another context.  It's not entirely clear how to best solve that. The
current user of AIO is not affected, as it explicitly marks buffers as DEFINED
& NOACCESS anyway.  Defer solving this issue until we have a user with
different needs.

Per buildfarm animal skink.

Reviewed-by: Noah Misch <noah@leadboat.com>
Co-authored-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/3pd4322mogfmdd5nln3zphdwhtmq3rzdldqjwb2sfqzcgs22lf@ok2gletdaoe6
2025-04-07 15:20:30 -04:00
Andres Freund
8ab4241b9f localbuf: Add Valgrind buffer access instrumentation
This mirrors 1e0dfd166b3 (+ 46ef520b9566), for temporary table buffers. This
is mainly interesting right now because the AIO work currently triggers
spurious valgrind errors, and the fix for that is cleaner if temp buffers
behave the same as shared buffers.

This requires one change beyond the annotations themselves, namely to pin
local buffers while writing them out in FlushRelationBuffers().

Reviewed-by: Noah Misch <noah@leadboat.com>
Co-authored-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/3pd4322mogfmdd5nln3zphdwhtmq3rzdldqjwb2sfqzcgs22lf@ok2gletdaoe6
2025-04-07 15:20:30 -04:00
Masahiko Sawada
a13d49014d doc: Fix a typo in pg_recvlogical documentation.
Oversight in cf2655a9029a.

Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Discussion: https://postgr.es/m/OS3PR01MB5718DD1466E2B9043448AE5094AA2@OS3PR01MB5718.jpnprd01.prod.outlook.com
2025-04-07 12:13:08 -07:00
Tom Lane
969ab9d4f5 Follow-up fixes for SHA-2 patch (commit 749a9e20c).
This changes the check for valid characters in the salt string to
only allow plain ASCII letters and digits.  The previous coding was
locale-dependent which doesn't really seem like a great idea here;
moreover it could not work correctly in multibyte encodings.

This fixes a careless pointer-use-after-pfree, too.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Andres Freund <andres@anarazel.de>
Author: Bernd Helmle <mailings@oopsware.de>
Discussion: https://postgr.es/m/6fab35422df6b6b9727fdcc243c5fa1c667dd3b5.camel@oopsware.de
2025-04-07 14:14:28 -04:00
Tom Lane
b73e6d71a8 Fix erroneous construction of functions' dependencies on transforms.
The list of transform objects that a function should use is specified
in CREATE FUNCTION's TRANSFORM clause, and then represented indirectly
in pg_proc.protrftypes.  However, ProcedureCreate completely ignored
that for purposes of constructing pg_depend entries, and instead made
the function depend on any transforms that exist for its parameter or
return data types.  This is bad in both directions: the function could
be made dependent on a transform it does not actually use, or it
could try to use a transform that's since been dropped.  (The latter
scenario would require use of a transform that's not for any of the
parameter or return types, but that seems legit for cases where the
function performs SQL operations internally.)

To fix, pass in the list of transform objects that CreateFunction
identified, and build pg_depend entries from that not from the
parameter/return types.  This results in changes in the expected
test outputs in contrib/bool_plperl, which I guess are due to
different ordering of pg_depend entries -- that test case is
surely not exercising either of the problem scenarios.

This fix is not back-patchable as-is: changing the signature of
ProcedureCreate seems too risky in stable branches.  We could
do something like making ProcedureCreate a wrapper around
ProcedureCreateExt or so.  However, I'm more inclined to do
nothing in the back branches.  We had no field complaints up to
now, so the hazards don't seem to be a big issue in practice.
And we couldn't do anything about existing pg_depend entries,
so a back-patched fix would result in a mishmash of dependencies
created according to different rules.  That cure could be worse
than the disease, perhaps.

I bumped catversion just to lay down a marker that the expected
contents of pg_depend are a bit different than before.

Reported-by: Chapman Flack <jcflack@acm.org>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3112950.1743984111@sss.pgh.pa.us
2025-04-07 13:31:37 -04:00
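For context, a TRANSFORM clause of the kind whose object list now drives the
pg_depend entries; this sketch is purely illustrative and assumes the hstore,
plpython3u and hstore_plpython3u extensions are installed:

    CREATE FUNCTION hstore_echo(h hstore) RETURNS hstore
        TRANSFORM FOR TYPE hstore
        LANGUAGE plpython3u AS $$ return h $$;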
Álvaro Herrera
a379061a22
Allow NOT NULL constraints to be added as NOT VALID
This allows them to be added without scanning the table, and validating
them afterwards without holding access exclusive lock on the table after
any violating rows have been deleted or fixed.

Doing ALTER TABLE ... SET NOT NULL for a column that has an invalid
not-null constraint validates that constraint.  ALTER TABLE .. VALIDATE
CONSTRAINT is also supported.  There are various checks on whether an
invalid constraint is allowed in a child table when the parent table has
a valid constraint; this should match what we do for enforced/not
enforced constraints.

pg_attribute.attnotnull is now only an indicator for whether a not-null
constraint exists for the column; whether it's valid or invalid must be
queried in pg_constraint.  Applications can continue to query
pg_attribute.attnotnull as before, but now it's possible that NULL rows
are present in the column even when that's set to true.

For backend internal purposes, we cache the nullability status in
CompactAttribute->attnullability that each tuple descriptor carries
(replacing CompactAttribute.attnotnull, which was a mirror of
Form_pg_attribute.attnotnull).  During the initial tuple descriptor
creation, based on the pg_attribute scan, we set this to UNRESTRICTED if
pg_attribute.attnotnull is false, or to UNKNOWN if it's true; then we
update the latter to VALID or INVALID depending on the pg_constraint
scan.  This flag is also copied when tupledescs are copied.

Comparing tuple descs for equality must also compare the
CompactAttribute.attnullability flag and return false in case of a
mismatch.

pg_dump deals with these constraints by storing the OIDs of invalid
not-null constraints in a separate array, and running a query to obtain
their properties.  The regular table creation SQL omits them entirely.
They are then dealt with in the same way as "separate" CHECK
constraints, and dumped after the data has been loaded.  Because no
additional pg_dump infrastructure was required, we don't bump its
version number.

I decided not to bump catversion either, because the old catalog state
works perfectly in the new world.  (Trying to run with new catalog state
and the old server version would likely run into issues, however.)

System catalogs do not support invalid not-null constraints (because
commit 14e87ffa5c54 didn't allow them to have pg_constraint rows
anyway.)

Author: Rushabh Lathia <rushabh.lathia@gmail.com>
Author: Jian He <jian.universality@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Tested-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAGPqQf0KitkNack4F5CFkFi-9Dqvp29Ro=EpcWt=4_hs-Rt+bQ@mail.gmail.com
2025-04-07 19:19:50 +02:00
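A minimal sketch of the new capability (table and constraint names are
hypothetical):

    ALTER TABLE orders ADD CONSTRAINT orders_customer_nn
        NOT NULL customer_id NOT VALID;   -- no full-table scan yet
    ALTER TABLE orders VALIDATE CONSTRAINT orders_customer_nn;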
Andrew Dunstan
b52a4a5f28 Clean up error messages from 1495eff7bdb
Quote file names, and mostly avoid hard coded file names. Along the way
make a few other minor improvements.

Discussion: https://postgr.es/m/20250407.152721.1397761902317499205.horikyota.ntt@gmail.com
2025-04-07 12:22:41 -04:00
Tom Lane
3516ea768c Add local-address escape "%L" to log_line_prefix.
This escape shows the numeric server IP address that the client
has connected to.  Unix-socket connections will show "[local]".
Non-client processes (e.g. background processes) will show "[none]".

We expect that this option will be of interest to only a fairly
small number of users.  Therefore the implementation is optimized
for the case where it's not used (that is, we don't do the string
conversion until we have to), and we've not added the field to
csvlog or jsonlog formats.

Author: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Cary Huang <cary.huang@highgo.ca>
Reviewed-by: David Steele <david@pgmasters.net>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAKAnmmK-U+UicE-qbNU23K--Q5XTLdM6bj+gbkZBZkjyjrd3Ow@mail.gmail.com
2025-04-07 11:06:05 -04:00
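For example, a prefix that includes the new escape (any prefix containing %L
works; the rest of the string is arbitrary):

    ALTER SYSTEM SET log_line_prefix = '%m [%p] %L ';
    SELECT pg_reload_conf();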
Andrew Dunstan
8f5e419484 Revert "Use workaround of __builtin_setjmp only on MINGW on MSVCRT"
This reverts commit c313fa4602defe1be947370ab5b217ca163a1e3c.

This is found to cause issues on x86_64 Windows even when using UCRT.

Discussion: https://postgr.es/m/3312149.1744001936@sss.pgh.pa.us
2025-04-07 11:01:15 -04:00
Andres Freund
8ce79483dc read_stream: Fix overflow hazard with large shared buffers
If the limit returned by GetAdditionalPinLimit() is large, the buffer_limit
variable in read_stream_start_pending_read() can overflow. While the code is
careful to limit buffer_limit PG_INT16_MAX, we subsequently add the number of
forwarded buffers.

The overflow can lead to assertion failures, crashes or wrong query results
when using large shared buffers.

It seems easier to avoid this if we make the buffer_limit variable an int,
instead of an int16.  Do so, and clamp buffer_limit after adding the number of
forwarded buffers.

It's possible we might want to address this and related issues more widely by
changing to int instead of int16 more widely, but since the consequences of
this bug can be confusing, it seems better to fix it now.

This bug was introduced in ed0b87caaca.

Discussion: https://postgr.es/m/ewvz3cbtlhrwqk7h6ca6cctiqh7r64ol3pzb3iyjycn2r5nxk5@tnhw3a5zatlr
2025-04-07 09:45:00 -04:00
Alexander Korotkov
717d0e8dd9 Remove GUC_NOT_IN_SAMPLE from enable_self_join_elimination
fc069a3a6319 implements Self-Join Elimination (SJE) and provides a new GUC
variable: enable_self_join_elimination.  This new GUC variable was marked
as GUC_NOT_IN_SAMPLE.  However, enable_self_join_elimination is documented
and is not different from any other enable_* GUCs.  Thus, remove
GUC_NOT_IN_SAMPLE from it and add it to the postgresql.conf.sample.

Discussion: https://postgr.es/m/CAPpHfdsqMTEsmxk3aQwt6xPz%2BKpUELO%3D6fzmER9ZRGrbs4uMfA%40mail.gmail.com
Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
2025-04-07 16:28:54 +03:00
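To see the GUC in action, something like the following (the table is
hypothetical and needs a unique join key, e.g. a primary key on id, for the
self-join to be eliminated):

    SET enable_self_join_elimination = on;
    EXPLAIN (COSTS OFF)
    SELECT t1.id FROM items t1 JOIN items t2 USING (id);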
Daniel Gustafsson
ae60947643 psql: Clarify help message for WATCH_INTERVAL
The help message for WATCH_INTERVAL was hard to interpret and didn't
follow the style of other messages; this updates it to make it fit in
better and be easier to interpret.

Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/20250326.120732.1167093737847500721.horikyota.ntt@gmail.com
2025-04-07 13:44:58 +02:00
Michael Paquier
d6f118444d Fix grammar in log message of pg_restore.c
Introduced by 1495eff7bdb0.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20250407.151359.72428746612514925.horikyota.ntt@gmail.com
2025-04-07 15:37:34 +09:00
Michael Paquier
2c7bd2ba50 libpq: Fix some issues in TAP tests for service files
The valid service file was not correctly shaped, as append_to_file() was
called with an array as input.  This is changed so that the parameter and
value pairs from the valid connection string are appended to the valid
service file one by one.

Even with the first issue fixed, the tests should fail.  However, they
have been passing because all the connection attempts relied on the
default values given to PGPORT and PGHOST from the node when using
Cluster.pm's connect_ok() and connect_fails(), rather than the data in
the service file.  The test is updated to use an interesting trick: a
dummy node is initialized but not started, and all the connection
attempts are done through it.  This ensures that the data inside the
service file is used for all the connection tests.  Note that breaking
the contents of the valid service file on purpose makes all the tests
that rely on it fail.

Issues introduced by 72c2f36d5727.

Author: Andrew Jackson <andrewjackson947@gmail.com>
Discussion: https://postgr.es/m/CAKK5BkG_6_YSaebM6gG=8EuKaY7_VX1RFgYeySuwFPh8FZY73g@mail.gmail.com
2025-04-07 12:55:09 +09:00
Michael Paquier
c36eda2591 Clarify comment for worst-case allocation in quote_literal_cstr()
palloc() is invoked with a specific formula for its allocation size in
quote_literal_cstr().  This wastes some memory, but the size is large
enough to cover even the worst-case scenarios.

No explanations were given about the reasons behind these numbers.  This
commit adds more documentation about all that.

Author: Steve Chavez <steve@supabase.io>
Discussion: https://postgr.es/m/CAGRrpzZ9bToRWS+fAnjxDJrxwZN1QcJ-y1Pn2yg=Hst6rydLtw@mail.gmail.com
2025-04-07 10:02:12 +09:00
Michael Paquier
3191a593d6 Fix use-after-free in pgstat_fetch_stat_backend_by_pid()
stats_fetch_consistency set to "snapshot" causes the backend entry
"beentry" retrieved by pgstat_get_beentry_by_proc_number() to be reset
at the beginning of pgstat_fetch_stat_backend() when fetching the
backend pgstats entry.  As coded, "beentry" was being accessed after
being freed.  This commit moves all the accesses to "beentry" to happen
before calling pgstat_fetch_stat_backend(), fixing the problem.

This problem could be reached by calling the SQL functions
pg_stat_get_backend_io() or pg_stat_get_backend_wal().

Issue caught by valgrind.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/f1788cc0-253a-4a3a-aee0-1b8ab9538736@gmail.com
2025-04-07 09:51:40 +09:00
Fujii Masao
173c97812f Use XLOG_CONTROL_FILE macro consistently for control file name.
The XLOG_CONTROL_FILE macro (defined in access/xlog_internal.h)
represents the control file name. While some parts of the codebase already
use this macro, others previously hardcoded the file name as a string.

This commit replaces those hardcoded strings with the macro,
ensuring consistent usage throughout the code. This makes future
maintenance easier and improves searchability, for example when
grepping for control file usage.

Author: Anton A. Melnikov <a.melnikov@postgrespro.ru>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Masao Fujii <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/0841ec77-47e5-452a-adb4-c6fa55d605fc@postgrespro.ru
2025-04-07 09:27:33 +09:00
Daniel Gustafsson
a233a603ba doc: Clarify project naming
Clarify the project naming in the history section of the docs
to match the recent license preamble changes.

Backpatch to all supported versions.

Author: Dave Page <dpage@pgadmin.org>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA+OCxozLzK2+Jc14XZyWXSp6L9Ot+3efwXUE35FJG=fsbib2EA@mail.gmail.com
Backpatch-through: 13
2025-04-07 00:03:18 +02:00
Andrew Dunstan
643a1a6198 Clean up checking for pg_dumpall output directory
Coverity objected to the original code, and in any case this is much
cleaner, using the existing routine pg_check_dir() instead of rolling
its own test.

Per suggestion from Tom Lane.
2025-04-06 17:04:58 -04:00
Tom Lane
218ab68275 Doc: fix PDF "contents ... exceed the available area" warnings.
Tweak column widths in a new table, similarly to some previous
fixes such as b62381d9a.

Per buildfarm.
2025-04-06 16:27:39 -04:00
Nathan Bossart
de48056ec7 pg_upgrade: Fix memory leak in check_for_unicode_update().
This function was initializing the "task" variable before a couple
of early returns.  To fix, postpone the initialization until just
before it's needed.

Per Coverity.

Discussion: https://postgr.es/m/Z_KMsUH2-FEbiNjC%40nathan
2025-04-06 15:11:41 -05:00
Andres Freund
57dec20fd4 aio: Avoid spurious coverity warning
PgAioResult.result is never accessed in the relevant path, but coverity
complains about an uninitialized access anyway. So just zero-initialize the
whole thing.  While at it, reduce the scope of the variable.

Reported-by: Ranier Vilela <ranier.vf@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CAEudQApsKqd-s+fsUQ0OmxJAMHmBSXxrAz3dCs+uvqb3iRtjSw@mail.gmail.com
2025-04-06 12:07:02 -04:00
Tom Lane
8ab6ef2bb8 Fix memory leaks in px_crypt_shacrypt().
Per Coverity.  I don't think these are of any actual significance
since the function ought to be invoked in a short-lived context.
Still, if it's trying to be neat it should get it right.

Also const-ify a constant and fix up typedef formatting.
2025-04-06 11:57:22 -04:00
Tom Lane
2e4ccf1b45 Use "(void)" to mark pgstat_lock_entry(..., false) calls.
This should silence Coverity's complaints about the result being
sometimes ignored.

I'm inclined to think that these routines are simply misdesigned,
because sometimes it's okay to ignore the result and sometimes it
isn't, and we have no way to enforce the latter.  But for now
I just added a comment.
2025-04-06 11:37:09 -04:00
Andrew Dunstan
5e19154390 Avoid unnecessary copying of a string in pg_restore.c
Coverity complained about a possible overrun in the copy, but there is
no actual need to copy the string at all.
2025-04-06 09:21:09 -04:00
Andrew Dunstan
6d5417e634 Fix a couple of memory leaks in pg_restore.c
per complaint from Coverity.
2025-04-06 09:09:25 -04:00
Peter Eisentraut
a8025f5448 Relax ordering-related hardcoded btree requirements in planning
There were several places in ordering-related planning where a
requirement for btree was hardcoded but an amcanorder index could
suffice.  This fixes that.  We just need to do the necessary mapping
between strategy numbers and compare types and adjust some related
APIs so that this works independent of btree strategy numbers.  For
instance, non-btree amcanorder indexes can now be used to support
sorting and merge joins.  Also, predtest.c works independent of btree
strategy numbers now.

To avoid performance regressions, some details on btree and other
built-in index types are still hardcoded as shortcuts, but other index
types now have access to the same features by providing the required
flags and callbacks.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Co-authored-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-04-06 14:43:51 +02:00
Alexander Korotkov
3a1a7c5a70 Revert "Put enable_self_join_elimination into postgresql.conf.sample"
This reverts commit c2d329260cd8.

Reported-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/D292EB44-806E-439A-82A4-491A1BA59E7A%40yesql.se
2025-04-06 14:30:20 +03:00
Alexander Korotkov
c2d329260c Put enable_self_join_elimination into postgresql.conf.sample
fc069a3a6319 implements Self-Join Elimination (SJE) and provides a new
GUC variable: enable_self_join_elimination.  This commit adds
enable_self_join_elimination to the postgresql.conf.sample, as it was
forgotten in the original commit.

Discussion: https://postgr.es/m/CAHewXN%3D%2Bghd6O6im46q7j2u6c3H6vkXtXmF%3D_v4CfGSnjje8PA%40mail.gmail.com
Author: Tender Wang <tndrwang@gmail.com>
2025-04-06 13:24:16 +03:00
John Naylor
3c6e8c1238 Compute CRC32C using AVX-512 instructions where available
The previous implementation of CRC32C on x86 relied on the native
CRC32 instruction from the SSE 4.2 extension, which operates on
up to 8 bytes at a time. We can get a substantial speedup by using
carryless multiplication on SIMD registers, processing 64 bytes per
loop iteration. Shorter inputs fall back to ordinary CRC instructions.
On Intel Tiger Lake hardware (2020), CRC is now 50% faster for inputs
between 64 and 112 bytes, and 3x faster for 256 bytes.

The VPCLMULQDQ instruction on 512-bit registers has been available
on Intel hardware since 2019 and AMD since 2022. There is an older
variant for 128-bit registers, but at least on Zen 2 it performs worse
than normal CRC instructions for short inputs.

We must now do a runtime check, even for builds that target SSE
4.2. This doesn't matter in practice for WAL (arguably the most
critical case), because since commit e2809e3a1 the final computation
with the 20-byte WAL header is inlined and unrolled when targeting
that extension. Compared with two direct function calls, testing
showed equal or slightly faster performance in performing an indirect
function call on several dozen bytes followed by inlined instructions
on constant input of 20 bytes.

The MIT-licensed implementation was generated with the "generate"
program from

https://github.com/corsix/fast-crc32/

Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
Instruction" V. Gopal, E. Ozturk, et al., 2009

Co-authored-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Co-authored-by: Paul Amonson <paul.d.amonson@intel.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Reviewed-by: Matthew Sterrett <matthewsterrett2@gmail.com> (earlier version)
Tested-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Tested-by: David Rowley <dgrowleyml@gmail.com> (earlier version)
Discussion: https://postgr.es/m/BL1PR11MB530401FA7E9B1CA432CF9DC3DC192@BL1PR11MB5304.namprd11.prod.outlook.com
Discussion: https://postgr.es/m/PH8PR11MB82869FF741DFA4E9A029FF13FBF72@PH8PR11MB8286.namprd11.prod.outlook.com
2025-04-06 14:04:30 +07:00
Daniel Gustafsson
683df3f4de Quote filename in error message
Project standard is to quote filenames in error and log messages, which
commit 2da74d8d640 missed in two error messages.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reported-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/20250404.120328.103562371975971823.horikyota.ntt@gmail.com
2025-04-05 22:10:28 +02:00
Tom Lane
691836405f Fix parse_cte.c's failure to examine sub-WITHs in DML statements.
makeDependencyGraphWalker thought that only SelectStmt nodes could
contain a WithClause.  Which was true in our original implementation
of WITH, but astonishingly we missed updating this code when we added
the ability to attach WITH to INSERT/UPDATE/DELETE (and later MERGE).
Moreover, since it was coded to deliberately block recursion to a
WithClause, even updating raw_expression_tree_walker didn't save it.

The upshot of this was that we didn't see references to outer CTE
names appearing within an inner WITH, and would neither complain about
disallowed recursion nor account for such references when sorting CTEs
into a usable order.  The lack of complaints about this is perhaps not
so surprising, because typical usage of WITH wouldn't hit either case.
Still, it's pretty broken; failing to detect recursion here leads to
assert failures or worse later on.

Fix by factoring out the processing of sub-WITHs into a new function
WalkInnerWith, and invoking that for all the statement types that
can have WITH.

Bug: #18878
Reported-by: Yu Liang <luy70@psu.edu>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18878-a26fa5ab6be2f2cf@postgresql.org
Backpatch-through: 13
2025-04-05 15:01:48 -04:00
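The affected statement shape is roughly the following (an illustrative sketch
assuming a table t(x int), not the reporter's actual test case); the inner WITH
attached to the INSERT references the outer CTE:

    WITH outer_cte AS (SELECT 1 AS x),
         ins AS (
             INSERT INTO t (x)
             WITH inner_cte AS (SELECT x FROM outer_cte)
             SELECT x FROM inner_cte
             RETURNING x
         )
    SELECT * FROM ins;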
Álvaro Herrera
749a9e20c9
Add modern SHA-2 based password hashes to pgcrypto.
This adapts the publicly available reference implementation on
https://www.akkadia.org/drepper/SHA-crypt.txt and adds the new hash
algorithms sha256crypt and sha512crypt to crypt() and gen_salt()
respectively.

Author: Bernd Helmle <mailings@oopsware.de>
Reviewed-by: Japin Li <japinli@hotmail.com>
Discussion: https://postgr.es/m/c763235a2757e2f5f9e3e27268b9028349cef659.camel@oopsware.de
2025-04-05 19:17:13 +02:00
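A minimal usage sketch, assuming gen_salt() accepts the algorithm names exactly
as given above:

    SELECT crypt('secret', gen_salt('sha512crypt'));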
Tom Lane
e33f2335a9 Avoid double transformation of json_array()'s subquery.
transformJsonArrayQueryConstructor() applied transformStmt() to
the same subquery tree twice.  While this causes no issue in many
cases, there are some where it causes a coredump, thanks to the
parser's habit of scribbling on its input.

Fix by making a copy before the first transformation (compare
0f43083d1).  This is quite brute-force, but then so is the
whole business of transforming the input twice.  Per discussion
in the bug thread, this implementation of json_array() parsing
should be replaced completely.  But that will take some work
and will surely not be back-patchable, so for the moment let's
take the easy way out.

Oversight in 7081ac46a.  Back-patch to v16 where that came in.

Bug: #18877
Reported-by: Yu Liang <luy70@psu.edu>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18877-c3c3ad75845833bb@postgresql.org
Backpatch-through: 16
2025-04-05 12:13:35 -04:00
Andrew Dunstan
5db3bf7391 Clean up from commit 1495eff7bdb
Fix some comments, and remove the hacky way of quoting database names in
favor of appendStringLiteralConn.
2025-04-05 08:00:24 -04:00
Álvaro Herrera
64fba9c617
Set log_statement=none in t/002_pg_upgrade.pl
This should make the test a wee bit faster on high-load machines (e.g.,
when running under valgrind).

Per complaint from Andres Freund.

Discussion: https://postgr.es/m/cwbcyjp2ts7o7xgy5y5gwtcd4zltvncsj67el7xgci7xbwrhlu@k363vk5tce4g
2025-04-05 11:41:01 +02:00
Álvaro Herrera
4be6a74cfb
pg_dump: Tiny header cleanup
In commits 9c02e3a986da and 8ec0aaeae094, Nathan added a duplicate
TocEntry typedef forward declaration (plus assorted #ifdef hackery to
avoid C99 preprocessor issues) to deal with some very old untidiness
regarding the DefnDumperPtr function prototype being located in pg_backup.h.
But there's no reason to have the DefnDumperPtr typedef (and the
accompanying DataDumperPtr typedef) in that file at all; they are better
placed in pg_backup_archiver.h, the internal header, because they are
only used internally.  That also requires zero #ifdef hackery, so move
them there.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/202504042140.qo66ggw6wzsz@alvherre.pgsql
2025-04-05 11:22:40 +02:00
Nathan Bossart
f0d0083f52 pg_dump: Fix query for gathering attribute stats on older versions.
Commit 9c02e3a986 taught pg_dump to retrieve attribute statistics
for 64 relations at a time.  pg_dump supports dumping from v9.2 and
newer versions, but our query for retrieving statistics for
multiple relations uses WITH ORDINALITY and multi-argument
UNNEST(), both of which were introduced in v9.4.  To fix, we resort
to gathering statistics for a single relation at a time on versions
older than v9.4.

Per buildfarm member crake.

Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/Z_BcWVMvlUIJ_iuZ%40nathan
2025-04-04 21:05:30 -05:00
Tom Lane
43b8e6c4ab Repair misbehavior with duplicate entries in FK SET column lists.
Since v15 we've had an option to apply a foreign key constraint's
ON DELETE SET DEFAULT or SET NULL action to just some of the
referencing columns.  There was not a check for duplicate entries in
the list of columns-to-set, though.  That caused a potential memory
stomp in CreateConstraintEntry(), which incautiously assumed that
the list of columns-to-set couldn't be longer than the number of key
columns.  Even after fixing that, the case doesn't work because you
get an error like "multiple assignments to same column" from the SQL
command that is generated to do the update.

We could either raise an error for duplicate columns or silently
suppress the dups, and after a bit of thought I chose to do the
latter.  This is motivated by the fact that duplicates in the FK
column list are legal, so it's not real clear why duplicates
in the columns-to-set list shouldn't be.  Of course there's no
need to actually set the column more than once.

I left in the fix in CreateConstraintEntry() too, just because
it didn't seem like such low-level code ought to be making
assumptions about what it's handed.

Bug: #18879
Reported-by: Yu Liang <luy70@psu.edu>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18879-259fc59d072bd4d7@postgresql.org
Backpatch-through: 15
2025-04-04 20:11:48 -04:00
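
A hedged sketch of the affected syntax (available since v15); with the fix
above, the duplicate entry in the columns-to-set list is silently ignored
rather than causing a memory stomp or a later error. Table names are
illustrative.

    CREATE TABLE parent (id int PRIMARY KEY);
    CREATE TABLE child (
        parent_id int,
        note      text,
        FOREIGN KEY (parent_id) REFERENCES parent (id)
            ON DELETE SET NULL (parent_id, parent_id)  -- duplicate is now deduplicated
    );
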
Tom Lane
0f43083d16 functions.c: copy trees from source_list before parse analysis etc.
This is yet another bit of fallout from the fact that backend/parser
(like other code) feels free to scribble on the parse tree it's
handed.  In this case that resulted in modifying the
relatively-short-lived copy in the cached function's source_list.
That would be fine since we only need each source_list tree once
... except that if the parser fails after making some changes,
the function cache entry remains as-is and will still be there
if the user tries to execute the function again.  Then we have
problems because we're feeding a non-pristine tree to the parser.

The most expedient fix is a quick copyObject().  I considered
other answers like somehow marking the cache entry invalid
temporarily, but that would add complexity and I'm not sure
it's worth it.  In typical scenarios we'd only do this once
per function query per session.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/6d442183-102c-498a-81d1-eeeb086cdc5a@gmail.com
2025-04-04 18:26:51 -04:00
Andrew Dunstan
2ef5790806 Fix a couple of error messages and tests for them
oversights in 1495eff7bdb and 289f74d0cb2. Mea culpa.
2025-04-04 17:07:45 -04:00
Nathan Bossart
8ec0aaeae0 Prevent redeclaration of typedef TocEntry.
Commit 9c02e3a986 added a forward declaration for this typedef that
caused redeclarations, which is not valid in C99.  To fix, add some
preprocessor guards to avoid a redefinition, as is done elsewhere
(e.g., commit 382092a0cd).

Per buildfarm.
2025-04-04 15:56:23 -05:00
Andrew Dunstan
289f74d0cb Add more TAP tests for pg_dumpall
Author: Matheus Alcantara <matheusssilv97@gmail.com>
Author: Mahendra Singh Thalor <mahi6run@gmail.com>
2025-04-04 16:07:46 -04:00
Andrew Dunstan
1495eff7bd Non text modes for pg_dumpall, correspondingly change pg_restore
pg_dumpall acquires a new -F/--format option, with the same meanings as
pg_dump. The default is p, meaning plain text. For any other value, a
directory is created containing two files, globals.dat and map.dat. The
first contains SQL for restoring the global data, and the second
contains a map from oids to database names. It will also contain a
subdirectory called databases, inside which it will create archives in
the specified format, named using the database oids.

In these cases the -f argument is required.

If pg_restore encounters a directory containing globals.dat, and no
toc.dat, it restores the global settings and then restores each
database.

pg_restore acquires two new options: -g/--globals-only which suppresses
restoration of any databases, and --exclude-database which inhibits
restoration of particular database(s), in the same way the option
works in pg_dumpall.

Author: Mahendra Singh Thalor <mahi6run@gmail.com>
Co-authored-by:  Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: jian he <jian.universality@gmail.com>
Reviewed-by: Srinath Reddy <srinath2133@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>

Discussion: https://postgr.es/m/cb103623-8ee6-4ba5-a2c9-f32e3a4933fa@dunslane.net
2025-04-04 16:01:22 -04:00
Andrew Dunstan
2b69afbe50 add new list type simple_oid_string_list to fe-utils/simple_list
This type contains both an oid and a string.

This will be used in forthcoming changes to pg_restore.

Author: Andrew Dunstan <andrew@dunslane.net>
2025-04-04 16:01:22 -04:00
Andrew Dunstan
c1da728106 Move common pg_dump code related to connections to a new file
ConnectDatabase is used by pg_dumpall, pg_restore and pg_dump, so move
the common code to a new file.

New file name: connectdb.c

Author:    Mahendra Singh Thalor <mahi6run@gmail.com>
2025-04-04 16:01:22 -04:00
Nathan Bossart
ff3a7f0b68 Remove unused function parameters in pg_backup_archiver.c.
Thanks to commit 9c02e3a986, which modified some of the changes
from commit a0a4601765, we can remove the now-unused ArchiveHandle
parameter from _tocEntryRestorePass() and move_to_ready_heap().

Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/Z-3x2AnPCP331JA3%40nathan
2025-04-04 14:55:04 -05:00
Nathan Bossart
9c02e3a986 pg_dump: Retrieve attribute statistics in batches.
Currently, pg_dump gathers attribute statistics with a query per
relation, which can cause pg_dump to take significantly longer,
especially when there are many relations.  This commit addresses
this by teaching pg_dump to gather attribute statistics for 64
relations at a time.  Some simple tests showed this was the optimal
batch size, but performance may vary depending on the workload.

Our lookahead code determines the next batch of relations by
searching the TOC sequentially for relevant entries.  This approach
assumes that we will dump all such entries in TOC order, which
unfortunately isn't true for dump formats that use
RestoreArchive().  RestoreArchive() does multiple passes through
the TOC and selectively dumps certain groups of entries each time.
This is particularly problematic for index stats and a subset of
matview stats; both are in SECTION_POST_DATA, but matview stats
that depend on matview data are dumped in RESTORE_PASS_POST_ACL,
while all other stats are dumped in RESTORE_PASS_MAIN.  To handle
this, this commit moves all statistics data entries in
SECTION_POST_DATA to RESTORE_PASS_POST_ACL, which ensures that we
always dump them in TOC order.  A convenient side effect of this
change is that we can revert a decent chunk of commit a0a4601765,
but that is left for a follow-up commit.

Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CADkLM%3Dc%2Br05srPy9w%2B-%2BnbmLEo15dKXYQ03Q_xyK%2BriJerigLQ%40mail.gmail.com
2025-04-04 14:51:08 -05:00
Nathan Bossart
7d5c83b4e9 pg_dump: Reduce memory usage of dumps with statistics.
Right now, pg_dump stores all generated commands for statistics in
memory.  These commands can be quite large and therefore can
significantly increase pg_dump's memory footprint.  To fix, wait
until we are about to write out the commands before generating
them, and be sure to free the commands after writing.  This is
implemented via a new defnDumper callback that works much like the
dataDumper one but is specifically designed for TOC entries.

Custom dumps that include data might write the TOC twice (to update
data offset information), which would ordinarily cause pg_dump to
run the attribute statistics queries twice.  However, as a hack, we
save the length of the written-out entry in the first pass and skip
over it in the second.  While there is no known technical issue
with executing the queries multiple times and rewriting the
results, it's expensive and feels risky, so let's avoid it.

As an exception, we _do_ execute the queries twice for the tar
format.  This format does a second pass through the TOC to generate
the restore.sql file.  pg_restore doesn't use this file, so even if
the second round of queries returns different results than the
first, it won't corrupt the output; the archive and restore.sql
file will just have different content.  A follow-up commit will
teach pg_dump to gather attribute statistics in batches, which our
testing indicates more than makes up for the added expense of
running the queries twice.

Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CADkLM%3Dc%2Br05srPy9w%2B-%2BnbmLEo15dKXYQ03Q_xyK%2BriJerigLQ%40mail.gmail.com
2025-04-04 14:51:08 -05:00
Nathan Bossart
e3cc039a7d Skip second WriteToc() call for custom-format dumps without data.
Presently, "pg_dump --format=custom" calls WriteToc() twice.  The
second call updates the data offset information, which allegedly
makes parallel pg_restore significantly faster.  However, if we're
not dumping any data, there are no data offsets to update, so we
can skip this step.

Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/Z9c1rbzZegYQTOQE%40nathan
2025-04-04 14:51:08 -05:00
Melanie Plageman
d9c7911e1a Use streaming read I/O in autoprewarm
Make a read stream for each valid fork of each valid relation
represented in the autoprewarm dump file and prewarm those blocks
through the read stream API instead of by directly invoking
ReadBuffer().

Co-authored-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Co-authored-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru> (earlier versions)
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>  (earlier versions)
Reviewed-by: Matheus Alcantara <mths.dev@pm.me> (earlier versions)
Discussion: https://postgr.es/m/flat/CAN55FZ3n8Gd%2BhajbL%3D5UkGzu_aHGRqnn%2BxktXq2fuds%3D1AOR6Q%40mail.gmail.com
2025-04-04 15:28:54 -04:00
Melanie Plageman
6acab8bdbc Refactor autoprewarm_database_main() in preparation for read stream
Autoprewarm prewarms blocks from a dump file representing the contents
of shared buffers at the time it was dumped. It uses a sorted array of
BlockInfoRecords, each representing a block from one of the cluster's
databases and tables.

autoprewarm_database_main() prewarms all the blocks from a single
database. It is optimized to ensure we don't try to open the same
relation or fork over and over again if it has been dropped or is
invalid. The main loop handled this by carefully setting various local
variables to sentinel values when a run of blocks should be skipped.

This method won't work with the read stream API. The read stream
callback must be able to advance the current position in the
BlockInfoRecord array to allow reading ahead additional blocks; however,
a read stream maps 1:1 with a relation and fork combination. So the
main loop in autoprewarm_database_main() must also advance the
position in the array of BlockInfoRecords to skip invalid relations and
forks. This split control doesn't fit well with the current flow control
in autoprewarm_database_main().

To make it compatible with the read stream API, change
autoprewarm_database_main() to explicitly fast-forward in the
BlockInfoRecords array past the blocks belonging to an invalid relation
or fork.

This commit only implements the new control flow -- it does not use the
read stream API.

Co-authored-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Co-authored-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/flat/CAN55FZ3n8Gd%2BhajbL%3D5UkGzu_aHGRqnn%2BxktXq2fuds%3D1AOR6Q%40mail.gmail.com
2025-04-04 15:28:49 -04:00
Melanie Plageman
7f848cb788 Remove superfluous autoprewarm check
autoprewarm_database_main() prewarms blocks from the same database. It
is passed an array of sorted BlockInfoRecords and a start and stop index
into the array. The range represented should include only blocks
belonging to global objects or blocks from a single database. Remove an
unnecessary check that the current block is from the same database and
add an assert to ensure this invariant remains. Doing so removes a
special case that makes future refactoring to accommodate read
streamifying autoprewarm easier.

Noticed off-list by Andres Freund
2025-04-04 15:28:39 -04:00
Peter Geoghegan
b3f1a13f22 Avoid extra index searches through preprocessing.
Transform low_compare and high_compare nbtree skip array inequalities
(with opclasses that offer skip support) in such a way as to allow
_bt_first to consistently apply later keys when it descends the tree.
This can lower the number of index searches for multi-column scans that
use a ">" key on one of the index's prefix columns (or use a "<" key,
when scanning backwards) when it precedes some later lower-order key.

For example, an index qual "WHERE a > 5 AND b = 2" will now be converted
to "WHERE a >= 6 AND b = 2" by a new preprocessing step that takes place
after low_compare and high_compare have been finalized.  That way, the
initial call to _bt_first can use "WHERE a >= 6 AND b = 2" to find an
initial position, rather than just using "WHERE a > 5" -- "b = 2" can be
applied during every _bt_first call.  There's a decent chance that this
will allow such a scan to avoid the extra search that might otherwise be
needed to determine the lowest "a" value still satisfying "WHERE a > 5".

The transformation process can only lower the total number of index
pages read when the use of a more restrictive set of initial positioning
keys in _bt_first actually allows the scan to land on some later leaf
page directly, relative to the unoptimized case (or on an earlier leaf
page directly, when scanning backwards).  But the savings can really add
up in cases where an affected skip array comes after some other array.
For example, a scan indexqual "WHERE x IN (1, 2, 3) AND y > 5 AND z = 2"
can save as many as 3 _bt_first calls by applying the new transformation
to its "y" array (up to 1 extra search can be avoided per "x" element).

Follow-up to commit 92fe23d9, which added nbtree skip scan.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=FJ78K3WsF3iWNxWnUCY9f=Jdg3QPxaXE=uYUbmuRz5Q@mail.gmail.com
2025-04-04 14:14:08 -04:00
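
A hedged example of a query shape that can benefit from the preprocessing
transformation described above; the table and index are illustrative, and
actual savings depend on the data distribution.

    CREATE TABLE events (a int, b int, payload text);
    CREATE INDEX events_a_b_idx ON events (a, b);

    -- With skip support on the integer opclass, "a > 5" can be rewritten
    -- internally to "a >= 6", letting _bt_first also apply "b = 2" when
    -- positioning the scan, rather than only "a > 5".
    EXPLAIN (COSTS OFF)
    SELECT * FROM events WHERE a > 5 AND b = 2;
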
Peter Geoghegan
21a152b37f Improve nbtree skip scan primitive scan scheduling.
Don't allow nbtree scans with skip arrays to end any primitive scan on
its first leaf page without giving some consideration to how many times
the scan's arrays advanced while changing at least one skip array
(though continue not caring about the number of array advancements that
only affected SAOP arrays, even during skip scans with SAOP arrays).
Now when a scan performs more than 3 such array advancements in the
course of reading a single leaf page, it is taken as a signal that the
next page is unlikely to be skippable.  We'll therefore continue the
ongoing primitive index scan, at least until we can perform a recheck
against the next page's finaltup.

Testing has shown that this new heuristic occasionally makes all the
difference with skip scans that were expected to rely on the "passed
first page" heuristic added by commit 9a2e2a28.  Without it, there is a
remaining risk that certain kinds of skip scans will never quite manage
to clear the initial hurdle of performing a primitive scan that lasts
beyond its first leaf page (or that such a skip scan will only clear
that initial hurdle when it has already wasted noticeably-many cycles
due to inefficient primitive scan scheduling).

Follow-up to commits 92fe23d9 and 9a2e2a28.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=RVdG3zWytFWBsyW7fWH7zveFvTHed5JKEsuTT0RCO_A@mail.gmail.com
2025-04-04 13:58:05 -04:00
Masahiko Sawada
cf2655a902 pg_recvlogical: Add --failover option.
This new option instructs pg_recvlogical to create the logical
replication slot with the failover option enabled. It can be used in
conjunction with the --create-slot option.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Michael Banck <mbanck@gmx.net>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/OSCPR01MB14966C54097FC83AF19F3516BF5AC2@OSCPR01MB14966.jpnprd01.prod.outlook.com
2025-04-04 10:39:57 -07:00
Jeff Davis
3556c89321 Oversight in commit b81ffa13e3.
Should warn if a materialized view may be affected, as well.
2025-04-04 10:28:52 -07:00
Peter Geoghegan
8a510275dd Further optimize nbtree search scan key comparisons.
Postgres 17 commit e0b1ee17 added two complementary optimizations to
nbtree: the "prechecked" and "firstmatch" optimizations.  _bt_readpage
was made to avoid needlessly evaluating keys that are guaranteed to be
satisfied by applying page-level context.  "prechecked" did this for
keys required in the current scan direction, while "firstmatch" did it
for keys required in the opposite-to-scan direction only.

The "prechecked" design had a number of notable issues.  It didn't
account for the fact that an = array scan key's sk_argument field might
need to advance at the point of the page precheck (it didn't check the
precheck tuple against the key's array, only the key's sk_argument,
which needlessly made it ineffective in cases involving stepping to a
page having advanced the scan's arrays using a truncated high key).
"prechecked" was also completely ineffective when only one scan key
wasn't guaranteed to be satisfied by every tuple (it didn't recognize
that it was still safe to avoid evaluating other, earlier keys).

The "firstmatch" optimization had similar limitations.  It could only be
applied after _bt_readpage found its first matching tuple, regardless of
why any earlier tuples failed to satisfy the scan's index quals.  This
allowed unsatisfied non-required scan keys to impede the optimization.

Replace both optimizations with a new optimization, without any of these
limitations: the "startikey" optimization.  Affected _bt_readpage calls
generate a page-level key offset ("startikey"), that their _bt_checkkeys
calls can then start at.  This is an offset to the first key that isn't
known to be satisfied by every tuple on the page.

Although this is independently useful work, its main goal is to avoid
performance regressions with index scans that use skip arrays, but still
never manage to skip over irrelevant leaf pages.  We must avoid wasting
CPU cycles on overly granular skip array maintenance in these cases.
The new "startikey" optimization helps with this by selectively
disabling array maintenance for the duration of a _bt_readpage call.
This has no lasting consequences for the scan's array keys (they'll
still reliably track the scan's progress through the index's key space
whenever the scan is "between pages").

Skip scan adds skip arrays during preprocessing using simple, static
rules, and decides how best to navigate/apply the scan's skip arrays
dynamically, at runtime.  The "startikey" optimization enables this
approach.  As a result of all this, the planner doesn't need to generate
distinct, competing index paths (one path for skip scan, another for an
equivalent traditional full index scan).  The overall effect is to make
scan runtime close to optimal, even when the planner works off an
incorrect cardinality estimate.  Scans will also perform well given a
skipped column with data skew: individual groups of pages with many
distinct values (in respect of a skipped column) can be read about as
efficiently as before -- without the scan being forced to give up on
skipping over other groups of pages that are provably irrelevant.

Many scans that cannot possibly skip will still benefit from the use of
skip arrays, since they'll allow the "startikey" optimization to be as
effective as possible (by allowing preprocessing to mark all the scan's
keys as required).  A scan that uses a skip array on "a" for a qual
"WHERE a BETWEEN 0 AND 1_000_000 AND b = 42" is often much faster now,
even when every tuple read by the scan has its own distinct "a" value.
However, there are still some remaining regressions, affecting certain
trickier cases.

Scans whose index quals have several range skip arrays, each on some
high cardinality column, can still be slower than they were before the
introduction of skip scan -- even with the new "startikey" optimization.
There are also known regressions affecting very selective index scans
that use a skip array.  The underlying issue with such selective scans
is that they never get as far as reading a second leaf page, and so will
never get a chance to consider applying the "startikey" optimization.
In principle, all regressions could be avoided by teaching preprocessing
to not add skip arrays whenever they aren't expected to help, but it
seems best to err on the side of robust performance.

Follow-up to commit 92fe23d9, which added nbtree skip scan.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Reviewed-By: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=Y93jf5WjoOsN=xvqpMjRy-bxCE037bVFi-EasrpeUJA@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WznWDK45JfNPNvDxh6RQy-TaCwULaM5u5ALMXbjLBMcugQ@mail.gmail.com
2025-04-04 12:27:52 -04:00
Peter Geoghegan
92fe23d93a Add nbtree skip scan optimization.
Teach nbtree multi-column index scans to opportunistically skip over
irrelevant sections of the index given a query with no "=" conditions on
one or more prefix index columns.  When nbtree is passed input scan keys
derived from a predicate "WHERE b = 5", new nbtree preprocessing steps
output "WHERE a = ANY(<every possible 'a' value>) AND b = 5" scan keys.
That is, preprocessing generates a "skip array" (and an output scan key)
for the omitted prefix column "a", which makes it safe to mark the scan
key on "b" as required to continue the scan.  The scan is therefore able
to repeatedly reposition itself by applying both the "a" and "b" keys.

A skip array has "elements" that are generated procedurally and on
demand, but otherwise works just like a regular ScalarArrayOp array.
Preprocessing can freely add a skip array before or after any input
ScalarArrayOp arrays.  Index scans with a skip array decide when and
where to reposition the scan using the same approach as any other scan
with array keys.  This design builds on the design for array advancement
and primitive scan scheduling added to Postgres 17 by commit 5bf748b8.

Testing has shown that skip scans of an index with a low cardinality
skipped prefix column can be multiple orders of magnitude faster than an
equivalent full index scan (or sequential scan).  In general, the
cardinality of the scan's skipped column(s) limits the number of leaf
pages that can be skipped over.

The core B-Tree operator classes on most discrete types generate their
array elements with the help of their own custom skip support routine.
This infrastructure gives nbtree a way to generate the next required
array element by incrementing (or decrementing) the current array value.
It can reduce the number of index descents in cases where the next
possible indexable value frequently turns out to be the next value
stored in the index.  Opclasses that lack a skip support routine fall
back on having nbtree "increment" (or "decrement") a skip array's
current element by setting the NEXT (or PRIOR) scan key flag, without
directly changing the scan key's sk_argument.  These sentinel values
behave just like any other value from an array -- though they can never
locate equal index tuples (they can only locate the next group of index
tuples containing the next set of non-sentinel values that the scan's
arrays need to advance to).

A skip array's range is constrained by "contradictory" inequality keys.
For example, a skip array on "x" will only generate the values 1 and 2
given a qual such as "WHERE x BETWEEN 1 AND 2 AND y = 66".  Such a skip
array qual usually has near-identical performance characteristics to a
comparable SAOP qual "WHERE x = ANY('{1, 2}') AND y = 66".  However,
improved performance isn't guaranteed.  Much depends on physical index
characteristics.

B-Tree preprocessing is optimistic about skipping working out: it
applies static, generic rules when determining where to generate skip
arrays, which assumes that the runtime overhead of maintaining skip
arrays will pay for itself -- or lead to only a modest performance loss.
As things stand, these assumptions are much too optimistic: skip array
maintenance will lead to unacceptable regressions with unsympathetic
queries (queries whose scan can't skip over many irrelevant leaf pages).
An upcoming commit will address the problems in this area by enhancing
_bt_readpage's approach to saving cycles on scan key evaluation, making
it work in a way that directly considers the needs of = array keys
(particularly = skip array keys).

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Masahiro Ikeda <masahiro.ikeda@nttdata.com>
Reviewed-By: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Alena Rybakina <a.rybakina@postgrespro.ru>
Discussion: https://postgr.es/m/CAH2-Wzmn1YsLzOGgjAQZdn1STSG_y8qP__vggTaPAYXJP+G4bw@mail.gmail.com
2025-04-04 12:27:04 -04:00
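
A hedged sketch of a query that can use a skip scan after the commit above;
names are illustrative, and whether skipping wins depends on the cardinality
of the omitted prefix column.

    CREATE TABLE orders (region int, customer_id int, total numeric);
    CREATE INDEX orders_region_customer_idx ON orders (region, customer_id);

    -- No "=" condition on the leading column "region": nbtree can now
    -- behave roughly as if the qual were
    --   region = ANY(<every region value>) AND customer_id = 42
    -- and skip over irrelevant sections of the index.
    EXPLAIN (COSTS OFF)
    SELECT * FROM orders WHERE customer_id = 42;
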
Tom Lane
3ba2cdaa45 Stabilize regression test from c0962a113.
Per buildfarm.

Co-authored-by: Alena Rybakina <a.rybakina@postgrespro.ru>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/srnuqlttuimzmvoulhsrbgvj4vnul6b65osswvua7sfkqsvmuy@yg7apybpxp34
2025-04-04 11:57:26 -04:00
Melanie Plageman
64e7fa43a9 Fix autoprewarm neglect of tablespaces
While prewarming blocks from a dump file, autoprewarm_database_main()
mistakenly ignored tablespace when detecting the beginning of the next
relation to prewarm. Because RelFileNumbers are only unique within a
tablespace, autoprewarm could miss prewarming blocks from a
relation with the same RelFileNumber in a different tablespace.

Though this situation is likely rare in practice, it's best to make the
code correct. Do so by explicitly checking for the RelFileNumber when
detecting a new relation.

Reported-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/97c36982-603b-494a-95f4-aaf2a12ac27e%40iki.fi
2025-04-04 11:34:06 -04:00
Nathan Bossart
742317a80f Add commit e1a8b1ad58 to .git-blame-ignore-revs. 2025-04-04 09:41:59 -05:00
Nathan Bossart
e1a8b1ad58 Re-pgindent pg_largeobject.c after commit 0d6c477664. 2025-04-04 09:38:22 -05:00
Alexander Korotkov
c0962a113d Convert 'x IN (VALUES ...)' to 'x = ANY ...' when appropriate
This commit implements the automatic conversion of 'x IN (VALUES ...)' into
ScalarArrayOpExpr.  That simplifies the query tree, eliminating the appearance
of an unnecessary join.

Since VALUES describes a relational table, and the value of such a list is
a table row, the optimizer is likely to underestimate cardinality for the
join form because it cannot use MCV statistics there.  The cardinality
estimation machinery can, however, work with the array-inclusion check:
if the array is small enough (< 100 elements), it evaluates the statistics
element by element.

We perform the transformation in convert_ANY_sublink_to_join() if the VALUES
RTE is suitable and the transformation is applicable.  The conversion is only
possible for operations on scalar values, not rows.  Also, we currently
support the transformation only when it ends up with a constant array.
Otherwise, the evaluation of non-hashed SAOP might be slower than the
corresponding Hash Join with VALUES.

Discussion: https://postgr.es/m/0184212d-1248-4f1f-a42d-f5cb1c1976d2%40tantorlabs.com
Author: Alena Rybakina <a.rybakina@postgrespro.ru>
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Ivan Kush <ivan.kush@tantorlabs.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
2025-04-04 16:01:50 +03:00
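
A hedged illustration of the rewrite described above; the table is
illustrative, and the resulting plan shape can be compared with EXPLAIN.

    CREATE TABLE t (x int);

    -- Before this commit, the IN (VALUES ...) form was planned as a join
    -- against a VALUES RTE; it can now be converted to a ScalarArrayOpExpr,
    -- roughly equivalent to the second query.
    EXPLAIN (COSTS OFF) SELECT * FROM t WHERE x IN (VALUES (1), (2), (3));
    EXPLAIN (COSTS OFF) SELECT * FROM t WHERE x = ANY (ARRAY[1, 2, 3]);
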
Alexander Korotkov
d48d2e2dc8 Extract make_SAOP_expr() function from match_orclause_to_indexcol()
This commit extracts the code to generate ScalarArrayOpExpr on top of the list
of expressions from match_orclause_to_indexcol() into a separate function
make_SAOP_expr().  This function was extracted to be used in optimization for
conversion of 'x IN (VALUES ...)' to 'x = ANY ...'.  make_SAOP_expr() is
placed in clauses.c file as only two additional headers were needed there
compared with other places.

Discussion: https://postgr.es/m/0184212d-1248-4f1f-a42d-f5cb1c1976d2%40tantorlabs.com
Author: Alena Rybakina <a.rybakina@postgrespro.ru>
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Ivan Kush <ivan.kush@tantorlabs.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
2025-04-04 16:01:28 +03:00
Peter Eisentraut
ee1ae8b99f Fix crash/valgrind error
Fix for commit 9ef1851685b: We have to skip indexes where sortopfamily
is NULL.  This takes the place of the previous btree check.  Detected
by valgrind on the buildfarm.
2025-04-04 14:45:53 +02:00
Heikki Linnakangas
b4f453f6ab docs: Clarify that NULL arg to set_config() means reset to default
Author: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Zhang Mingli <zmlpostgres@gmail.com>
Discussion: https://www.postgresql.org/message-id/CAKFQuwY0SK6JdCci1VJX6xsztRXgGeVEY-grkENZx%2B3CZpyPcQ@mail.gmail.com
2025-04-04 15:17:17 +03:00
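
A brief sketch of the clarified behavior:

    -- Passing NULL as the new value resets the setting to its default
    -- for the session (or for the transaction, if is_local is true).
    SELECT set_config('search_path', NULL, false);
    SHOW search_path;
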
Heikki Linnakangas
7afca7edef Relax assertion in finding correct GiST parent
Commit 28d3c2ddcf introduced an assertion that if the memorized
downlink location in the insertion stack isn't valid, the parent's
LSN should've changed too. Turns out that was too strict. In
gistFindCorrectParent(), if we walk right, we update the parent's
block number and clear its memorized 'downlinkoffnum'. That triggered
the assertion on next call to gistFindCorrectParent(), if the parent
needed to be split too. Relax the assertion, so that it's OK if
downlinkOffnum is InvalidOffsetNumber.

Backpatch to v13-, all supported versions. The assertion was added in
commit 28d3c2ddcf in v12.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://www.postgresql.org/message-id/18396-03cac9beb2f7aac3@postgresql.org
2025-04-04 13:49:00 +03:00
Fujii Masao
534874fac0 Allow "COPY table TO" command to copy rows from materialized views.
Previously, "COPY table TO" command worked only with plain tables and
did not support materialized views, even when they were populated and
had physical storage. To copy rows from materialized views,
"COPY (query) TO" command had to be used, instead.

This commit extends "COPY table TO" to support populated materialized
views directly, improving usability and performance, as "COPY table TO"
is generally faster than "COPY (query) TO". Note that copying from
unpopulated materialized views will still result in an error.

Author: jian he <jian.universality@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CACJufxHVxnyRYy67hiPePNCPwVBMzhTQ6FaL9_Te5On9udG=yg@mail.gmail.com
2025-04-04 19:32:00 +09:00
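
A hedged example of the newly supported form; names are illustrative.

    CREATE TABLE measurements (id int, val numeric);
    INSERT INTO measurements VALUES (1, 10.5), (200, 42.0);
    CREATE MATERIALIZED VIEW recent_measurements AS
        SELECT * FROM measurements WHERE id > 100;

    -- Previously this required COPY (SELECT * FROM recent_measurements) TO ...;
    -- a populated materialized view can now be named directly.
    COPY recent_measurements TO STDOUT;
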
Peter Eisentraut
9ef1851685 Support non-btree indexes in get_actual_variable_range()
This was previously not supported because the btree strategy numbers
were hardcoded.  Now we can support this for any index that has the
required strategy mapping support and the required operators.

If an index scan used for get_actual_variable_range() requires
recheck, we now just ignore it instead of erroring out.  With btree we
knew this couldn't happen, but now it might.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Co-authored-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-04-04 12:21:34 +02:00
Fujii Masao
0d6c477664 Extend ALTER DEFAULT PRIVILEGES to define default privileges for large objects.
Previously, ALTER DEFAULT PRIVILEGES did not support large objects.
This meant that to grant privileges to users other than the owner,
permissions had to be manually assigned each time a large object
was created, which was inconvenient.

This commit extends ALTER DEFAULT PRIVILEGES to allow defining default
access privileges for large objects. With this change, specified privileges
will automatically apply to newly created large objects, making privilege
management more efficient.

As a side effect, this commit introduces the new keyword OBJECTS
since it's used in the syntax of ALTER DEFAULT PRIVILEGES.

Original patch by Haruka Takatsuka, with some fixes and tests by Yugo Nagata,
and rebased by Laurenz Albe.

Author: Takatsuka Haruka <harukat@sraoss.co.jp>
Co-authored-by: Yugo Nagata <nagata@sraoss.co.jp>
Co-authored-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Masao Fujii <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/20240424115242.236b499b2bed5b7a27f7a418@sraoss.co.jp
2025-04-04 19:02:17 +09:00
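
A hedged sketch of the new capability. The exact syntax below
(GRANT ... ON LARGE OBJECTS) is an assumption based on the new OBJECTS
keyword mentioned above, and the role name is hypothetical; consult the
documentation for the final form.

    -- Assumed syntax: default privileges for large objects created by the
    -- current role; app_reader is a hypothetical role.
    ALTER DEFAULT PRIVILEGES
        GRANT SELECT, UPDATE ON LARGE OBJECTS TO app_reader;

    -- A large object created afterwards would then carry these privileges
    -- without a separate GRANT per object.
    SELECT lo_create(0);
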
Heikki Linnakangas
6e9c81836e Use standard die() signal handler in walreceiver
This gets rid of the bespoke ProcessWalRcvInterrupts() function,
which lets walreceiver terminate at any CHECK_FOR_INTERRUPTS() call.
And it's less code anyway.

We can now use the standard libpqsrv_connect_params() libpq wrapper
from libpq-be-fe-helpers.h, removing more code. We attempted to do
that earlier already in commit 728f86fec6, but that was reverted
because it didn't call ProcessWalRcvInterrupts() and therefore didn't
react to shutdown requests. Now that ProcessWalRcvInterrupts() is
gone, it works. As stated in that commit, this also leads to
libpqwalreceiver reserving file descriptors for libpq connections,
which is nice.

Author: Andres Freund <andres@anarazel.de> (the earlier commit)
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Yura Sokolov <y.sokolov@postgrespro.ru>
2025-04-04 12:38:32 +03:00
Peter Eisentraut
8123e91f5a Convert PathKey to use CompareType
Change the PathKey struct to use CompareType to record the sort
direction instead of hardcoding btree strategy numbers.  The
CompareType is then converted to the index-type-specific strategy when
the plan is created.

This reduces the number of places btree strategy numbers are
hardcoded, and it's a self-contained subset of a larger effort to
allow non-btree indexes to behave like btrees.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Co-authored-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-04-04 11:22:20 +02:00
Daniel Gustafsson
daa16893fa doc: Clarify the system value for sslrootcert
The documentation for the special value "system" for sslrootcert could
be misinterpreted to mean the default operating system CA store, which
it may be, but it's defined to be the default CA store of the SSL lib
used.

Backpatch down to v16 where support for the system value was added.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: George MacKerron <george@mackerron.co.uk>
Discussion: https://postgr.es/m/B3CBBAA3-6EA3-4AB7-8619-4BBFAB93DDB4@yesql.se
Backpatch-through: 16
2025-04-04 09:47:36 +02:00
Amit Kapila
898c131b58 pg_createsubscriber: Improve error messages.
Consistently, an option name is used in the error messages where
applicable. Also, change the code to use pg_fatal() instead of a
combination of pg_log_error() and exit().

Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CALDaNm0HxF1RH27LP7VisLzNsSJbssy8a64M5p6UduDaBq6-ag@mail.gmail.com
2025-04-04 10:58:59 +05:30
Fujii Masao
d5d85f1881 Fix logical decoding test to correctly check slot removal on standby.
The regression test for logical decoding verifies whether a logical slot
is correctly dropped on a standby when its associated database is dropped.
However, the test mistakenly retrieved slot information from the primary
instead of the standby, causing incorrect behavior.

This commit fixes the issue by ensuring the test correctly checks the slot
on the standby.

Back-patch to all supported versions.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/1fdfd020-a509-403c-bd8f-a04664aba148@oss.nttdata.com
Backpatch-through: 13
2025-04-04 13:32:46 +09:00
Fujii Masao
c754bdd8a2 Fix logical decoding regression tests to correctly check slot existence.
The regression tests for logical decoding verify whether a logical slot
exists or has been dropped. Previously, these tests attempted to
retrieve "slot_name" from the result of slot(), but since "slot_name" was
not included in the result, slot()->{'slot_name'} always returned undef,
leading to incorrect behavior.

This commit fixes the issue by checking the "plugin" field in the result
of slot() instead, ensuring the tests properly verify slot existence.

Back-patch to all supported versions.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/OSCPR01MB149667EC4E738769CA80B7EA5F5AE2@OSCPR01MB14966.jpnprd01.prod.outlook.com
Backpatch-through: 13
2025-04-04 13:09:06 +09:00
Tomas Vondra
1aff1dc8df Revert "Improve accounting for memory used by shared hash tables"
This reverts commit f5930f9a98ea65d659d41600a138e608988ad122.

This broke the expansion of private hash tables, which reallocates the
directory. But that's impossible when it's allocated together with the
other fields, and dir_realloc() failed with BogusFree. Clearly, this
needs rethinking.

Discussion: https://postgr.es/m/CAApHDvriCiNkm=v521AP6PKPfyWkJ++jqZ9eqX4cXnhxLv8w-A@mail.gmail.com
2025-04-04 04:43:50 +02:00
Amit Langote
88f55bc976 Make derived clause lookup in EquivalenceClass more efficient
Derived clauses are stored in ec_derives, a List of RestrictInfos.
These clauses are later looked up by matching the left and right
EquivalenceMembers along with the clause's parent EC.

This linear search becomes expensive in queries with many joins or
partitions, where ec_derives may contain thousands of entries. In
particular, create_join_clause() can spend significant time scanning
this list.

To improve performance, introduce a hash table (ec_derives_hash) that
is built when the list reaches 32 entries -- the same threshold used
for join_rel_hash. The original list is retained alongside the hash
table to support EC merging and serialization
(_outEquivalenceClass()).

Each clause is stored in the hash table using a canonicalized key: the
EquivalenceMember with the lower memory address is placed in the key
before the one with the higher memory address. This avoids storing or
searching for both permutations of the same clause. For clauses
involving a constant EM, the key places NULL in the first slot and the
non-constant EM in the second.

The hash table is initialized using list_length(ec_derives_list) as
the size hint. simplehash internally adjusts this to the next power of
two after dividing by the fillfactor, so this typically results in at
least 64 buckets near the threshold -- avoiding immediate resizing
while adapting to the actual number of entries.

The lookup logic for derived clauses is now centralized in
ec_search_derived_clause_for_ems(), which consults the hash table when
available and falls back to the list otherwise.

The new ec_clear_derived_clauses() always frees ec_derives_list, even
though some of the original code paths that cleared the old
ec_derives field did not. This ensures consistent cleanup and avoids
leaking memory when large lists are discarded.

An assertion originally placed in find_derived_clause_for_ec_member()
is moved into ec_search_derived_clause_for_ems() so that it is
enforced consistently, regardless of whether the hash table or list is
used for lookup.

This design incorporates suggestions by David Rowley, who proposed
both the key canonicalization and the initial sizing approach to
balance memory usage and CPU efficiency.

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Tested-by: Dmitry Dolgov <9erthalion6@gmail.com>
Tested-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Tested-by: Amit Langote <amitlangote09@gmail.com>
Tested-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAExHW5vZiQtWU6moszLP5iZ8gLX_ZAUbgEX0DxGLx9PGWCtqUg@mail.gmail.com
2025-04-04 10:45:05 +09:00
Amit Langote
887160d1be Add assertion to verify derived clause has constant RHS
find_derived_clause_for_ec_member() searches for a previously-derived
clause that equates a non-constant EquivalenceMember to a constant.
It is only called for EquivalenceClasses with ec_has_const set, and
with a non-constant member the EquivalenceMember to search for.

The matched clause is expected to have the non-constant member on the
left-hand side and the constant EquivalenceMember on the right.

Assert that the RHS is indeed a constant, to catch violations of this
structure and enforce assumptions made by
generate_base_implied_equalities_const().

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CAExHW5scMxyFRqOFE6ODmBiW2rnVBEmeEcA-p4W_CyuEikURdA@mail.gmail.com
2025-04-04 10:45:05 +09:00
Melanie Plageman
67be093562 Use AIO batchmode for bitmap heap scans
Previously bitmap heap scan was not AIO batchmode safe because of the
visibility map reads potentially done for the "skip fetch" optimization
(which skipped fetching tuples from the heap if the pages were all
visible and none of the columns were used in the query).

The skip fetch optimization implementation was found to have bugs and
was removed in 459e7bf8e2f8, so we can safely enable batchmode for
bitmap heap scans.
2025-04-03 18:23:02 -04:00
Melanie Plageman
54a3615f15 Remove misleading read stream asserts in a few users
Several read stream users asserted that the read stream was exhausted
after looping on that very condition. It was pointed out in a
review of an as-yet-uncommitted read stream user [1] that this was
confusing and could lead the reader to think there was a possibility of
some kind of race condition. Remove these asserts.

[1] https://postgr.es/m/F9ACE8D0-B807-4A17-B6BD-87EF0717983D%40yesql.se
2025-04-03 18:22:37 -04:00
Tom Lane
dbd437e670 Fix oversight in commit 0dca5d68d.
As coded, fmgr_sql() would get an assertion failure for a SQL function
that has an empty body and is declared to return some type other than
VOID.  Typically you'd never get that far because fmgr_sql_validator()
would reject such a definition (I suspect that's how come I managed to
miss the bug).  But if check_function_bodies is off or the function is
polymorphic, the validation check wouldn't get made.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/0fde377a-3870-4d18-946a-ce008ee5bb88@gmail.com
2025-04-03 16:03:12 -04:00
Daniel Gustafsson
46c4c7cbc6 oauth: Remove timeout from t/002_client when not needed
The connect_timeout=1 setting for the --hang-forever test was left in
place and used by later tests, causing unexpected timeouts on slower
buildfarm animals. Remove it when no longer needed.

Per buildfarm member skink, reported by Andres on Discord.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Reported-by: Andres Freund <andres@anarazel.de>
2025-04-03 20:41:09 +02:00
Daniel Gustafsson
8ae0a37932 oauth: Fix build on platforms without epoll/kqueue
register_socket() missed a variable declaration if neither
HAVE_SYS_EPOLL_H nor HAVE_SYS_EVENT_H was defined.

While we're fixing that, adjust the tests to check pg_config.h for one
of the multiplexer implementations, rather than assuming that Windows is
the only platform without support. (Christoph reported this on
hurd-amd64, an experimental Debian.)

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Reported-by: Christoph Berg <myon@debian.org>
Discussion: https://postgr.es/m/Z-sPFl27Y0ZC-VBl%40msg.df7cb.de
2025-04-03 20:37:52 +02:00
Jeff Davis
945126234b Fix unintentional 'NULL' string literal in pg_upgrade.
Introduced in 2a083ab807.

Discussion: https://postgr.es/m/e852442da35b4f31acc600ed98bbee0f12e65e0c.camel@j-davis.com
Reviewed-by: Michael Paquier <michael@paquier.xyz>
2025-04-03 11:04:37 -07:00
Jeff Davis
b81ffa13e3 pg_upgrade check for Unicode-dependent relations.
This check will not cause an upgrade failure, only a warning.

Discussion: https://postgr.es/m/ef03d678b39a64392f4b12e0f59d1495c740969e.camel%40j-davis.com
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
2025-04-03 10:45:38 -07:00
Masahiko Sawada
fd09c1316b Restrict copying of invalidated replication slots.
Previously, invalidated logical and physical replication slots could
be copied using the pg_copy_logical_replication_slot and
pg_copy_physical_replication_slot functions. Replication slots that
were invalidated for reasons other than WAL removal retained their
restart_lsn. This meant that a new slot copied from an invalidated
slot could have a restart_lsn pointing to a WAL segment that might
have already been removed.

This commit restricts the copying of invalidated replication slots.

Backpatch to v16, where slots could retain their restart_lsn when
invalidated for reasons other than WAL removal.

For v15 and earlier, this check is not required since slots can only
be invalidated due to WAL removal, and existing checks already handle
this issue.

Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CANhcyEU65aH0VYnLiu%3DOhNNxhnhNhwcXBeT-jvRe1OiJTo_Ayg%40mail.gmail.com
Backpatch-through: 16
2025-04-03 10:30:00 -07:00
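
A hedged sketch of the operation that is now rejected for invalidated slots;
the slot names are illustrative.

    -- Copying a slot that has been invalidated (e.g., due to
    -- max_slot_wal_keep_size) now raises an error instead of producing
    -- a copy whose restart_lsn may point at already-removed WAL.
    SELECT pg_copy_logical_replication_slot('invalidated_slot', 'new_slot');
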
Álvaro Herrera
f104192e52
Remove duplicate set of print_notnull
I inserted the second one by mistake in commit 14e87ffa5c54.

Reported-by: jian he <jian.universality@gmail.com>
Confirmed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CACJufxFqckBFxPfCixHHbOr0zMLksviTj2m3o12-tErfx_PvTg@mail.gmail.com
2025-04-03 17:34:25 +02:00
Daniel Gustafsson
b82e7eddb0 Add missing declarations to pg_config.h.in
Add missing pg_config.h.in declarations from 09be39112654
where the corresponding autoconf/meson declarations were
added.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/70145721-6949-4ABF-BB54-63F866488DF8@yesql.se
2025-04-03 13:57:27 +02:00
Daniel Gustafsson
2da74d8d64 libpq: Add support for dumping SSL key material to file
This adds a new connection parameter which instructs libpq to
write out keymaterial clientside into a file in order to make
connection debugging with Wireshark and similar tools possible.
The file format used is the standardized NSS format.

Author: Abhishek Chanda <abhishek.becs@gmail.com>
Co-authored-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/CAKiP-K85C8uQbzXKWf5wHQPkuygGUGcufke713iHmYWOe9q2dA@mail.gmail.com
2025-04-03 13:16:43 +02:00
Heikki Linnakangas
e4309f73f6 Add support for sorted gist index builds to btree_gist
This enables sortsupport in the btree_gist extension for faster builds
of gist indexes.

The sorted gist index build strategy is now the default. Regression
tests are unchanged (except for one small change in the 'enum' test to
add coverage for enum values added later) and now exercise the sorted
build strategy.

One version of this was committed a long time ago already, in commit
9f984ba6d2, but it was quickly reverted because of buildfarm
failures. The failures were presumably caused by some small bugs, but
we never got around to debug and commit it again. This patch was
written from scratch, implementing the same idea, with some fragments
and ideas from the original patch.

Author: Bernd Helmle <mailings@oopsware.de>
Author: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://www.postgresql.org/message-id/64d324ce2a6d535d3f0f3baeeea7b25beff82ce4.camel@oopsware.de
2025-04-03 13:46:35 +03:00
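
A hedged usage sketch; the sorted build is chosen automatically when
sortsupport is available, so the SQL itself is unchanged. Names are
illustrative.

    CREATE EXTENSION IF NOT EXISTS btree_gist;

    CREATE TABLE readings (ts timestamptz, sensor int);
    -- A gist index on a btree_gist-supported type can now be built with
    -- the faster sorted strategy by default.
    CREATE INDEX readings_sensor_gist ON readings USING gist (sensor);
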
Heikki Linnakangas
9370978da8 Fix boilerplate comments in btree_gist
A few of these were copy-pasted wrong, like the comment "Bytea ops" in
btree_numeric.c. Instead of fixing the incorrect ones, replace them
all with generic comment "GiST support functions".

Also tidy up the inconsistent newlines between various functions while
we're at it.
2025-04-03 13:39:33 +03:00
Peter Eisentraut
82a46cca99 Update Unicode data to Unicode 16.0.0
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://www.postgresql.org/message-id/flat/146349e4-4687-4321-91af-f235572490a8@eisentraut.org
2025-04-03 12:00:09 +02:00
Peter Eisentraut
231064aa0f plpython: Add test for returning Python set from SETOF function
This is claimed in the documentation but there was no test case for
it.

Reported-by: Bogdan Grigorenko <gri.bogdan.2020@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/173543330569.680.6706329879058172623%40wrigleys.postgresql.org
2025-04-03 11:09:50 +02:00
Amit Kapila
d1d83827ba Doc: Improve -R option added in e5aeed4b80.
Author: Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CAHut+PvJPnaL=70SbBe3fYg2nq74Z=Yv4X=zRpUWYfOi-q6=2w@mail.gmail.com
2025-04-03 14:27:13 +05:30
Álvaro Herrera
8806e4e8de
002_pg_upgrade.pl: Move pg_dump test code for better stability
The alleged "statistics pg_dump bug" that prevented us from enabling
stats dumping in commit 172259afb563 wasn't a pg_dump bug after all: it
was just a side effect of not running pg_dump at the right time (namely,
before giving autovacuum some time to do its thing and then disabling it
to stabilize things).  Move the code around to fix this problem and
enable statistics dumping.

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Diagnosed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/5f3703fd7f27da62a8f3615218f937507f522347.camel@j-davis.com
Discussion: https://postgr.es/m/CAExHW5sDm+aGb7A4EXK=X9rkrmSPDgc03EdADt=wWkdMO=XPSA@mail.gmail.com
2025-04-03 10:16:24 +02:00
Álvaro Herrera
abe56227b2
002_pg_upgrade.pl: rename some variables for clarity
This renames %node_params to %old_node_params, @initdb_params to
@old_initdb_params, and adds separate @new_initdb_params and
%new_node_params rather than reusing the former in confusing ways.

Extracted from a larger patch from the same author.

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAExHW5sDm+aGb7A4EXK=X9rkrmSPDgc03EdADt=wWkdMO=XPSA@mail.gmail.com
2025-04-03 09:56:58 +02:00
Richard Guo
ea5d3f5233 Remove duplicated comment in get_relation_constraints
The check for non-inheritable constraints is performed later, and the
same comment is included at that point.

While we're here, remove one extraneous blank line.

Author: jian he <jian.universality@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CACJufxETi6x86S8EkH8mRfOcm2AenoE9t1pyCFVMpU34gVhF3w@mail.gmail.com
2025-04-03 16:43:53 +09:00
Peter Eisentraut
84fea854c9 Update Unicode data to CLDR 47
No actual changes result.
2025-04-03 09:20:25 +02:00
Peter Eisentraut
bbf24fe2f1 Update code comment
Commit 4e7f62bc386 added a new input file to a script but didn't
update the comment listing the input files.
2025-04-03 09:20:25 +02:00
Peter Eisentraut
34f04aa653 Fix update-unicode make target
The addition of SpecialCasing.txt by commit 286a365b9c2 was not added
to the make target dependencies, so the invoked script would fail
because the required file wasn't downloaded first.  (The meson version
appears to work correctly.)
2025-04-03 09:20:25 +02:00
Amit Kapila
4868c96bc8 Fix slot synchronization for two_phase enabled slots.
The issue is that the transactions prepared before two-phase decoding is
enabled can fail to replicate to the subscriber after being committed on a
promoted standby following a failover. This is because the two_phase_at
field of a slot, which tracks the LSN from which two-phase decoding
starts, is not synchronized to standby servers. Without two_phase_at,
logical decoding might incorrectly identify prepared transactions as
already replicated to the subscriber after promotion of the standby
server, causing them to be skipped.

To address the issue on HEAD, the two_phase_at field of the slot is
exposed by the pg_replication_slots view and allows the slot
synchronization to copy this value to the corresponding synced slot on the
standby server.

This bug is likely to occur if the user toggles the two_phase option to
true after initial slot creation. Given that altering the two_phase option
of a replication slot is not allowed in PostgreSQL 17, this bug is less
likely to occur. We can't change the view/function definition in
back-branches, so we can't push the same fix, but we are brainstorming an
appropriate solution for PG17.

Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/TYAPR01MB5724CC7C288535BBCEEE65DA94A72@TYAPR01MB5724.jpnprd01.prod.outlook.com
2025-04-03 12:26:54 +05:30
Tom Lane
a7187c3723 Remove unnecessary type violation in tsvectorrecv().
compareentry() is declared to work on WordEntryIN structs, but
tsvectorrecv() is using it in two places to work on WordEntry
structs.  This is almost okay, since WordEntry is the first
field of WordEntryIN.  But on machines with 8-byte pointers,
WordEntryIN will have a larger alignment spec than WordEntry,
and it's at least theoretically possible that the compiler
could generate code that depends on the larger alignment.

Given the lack of field reports, this may be just a hypothetical bug
that upsets nothing except sanitizer tools.  Or it may be real on
certain hardware but nobody's tried to use tsvectorrecv() on such
hardware.  In any case we should fix it, and the fix is trivial:
just change compareentry() so that it works on WordEntry without any
mention of WordEntryIN.  We can also get rid of the quite-useless
intermediate function WordEntryCMP.

Bug: #18875
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18875-07a29c49c825a608@postgresql.org
Backpatch-through: 13
2025-04-02 16:17:43 -04:00
Andres Freund
24da5b239a Add test for HeapBitmapScan's broken skip_fetch optimization
In the previous commit HeapBitmapScan's skip_fetch optimization was removed,
due to being broken in not easily fixable ways. Add a test that verifies we
don't re-introduce this bug if somebody tries to re-add the feature.

Only add the test to master for now, as it's possible it's not entirely
stable. That seems sufficient, as we're not going to re-introduce the feature
on the backbranches. I did verify that the test passes on all branches. If the
test turns out to be unproblematic, we can backpatch it later, should we feel
a need to do so.

Discussion: https://postgr.es/m/CAEze2Wg3gXXZTr6_rwC+s4-o2ZVFB5F985uUSgJTsECx6AmGcQ@mail.gmail.com
2025-04-02 14:58:39 -04:00
Andres Freund
459e7bf8e2 Remove HeapBitmapScan's skip_fetch optimization
The optimization does not take the removal of TIDs by a concurrent vacuum into
account. The concurrent vacuum can remove dead TIDs and make pages ALL_VISIBLE
while those dead TIDs are referenced in the bitmap. This can lead to a
skip_fetch scan returning too many tuples.

It likely would be possible to implement this optimization safely, but we
don't have the necessary infrastructure in place. Nor is it clear that it's
worth building that infrastructure, given how limited the skip_fetch
optimization is.

In the backbranches we just disable the optimization by always passing
need_tuples=true to table_beginscan_bm(). We can't perform API/ABI changes in
the backbranches and we want to make the change as minimal as possible.

Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reported-By: Konstantin Knizhnik <knizhnik@garret.ru>
Discussion: https://postgr.es/m/CAEze2Wg3gXXZTr6_rwC+s4-o2ZVFB5F985uUSgJTsECx6AmGcQ@mail.gmail.com
Backpatch-through: 13
2025-04-02 14:54:20 -04:00
Tom Lane
0dca5d68d7 Change SQL-language functions to use the plan cache.
In the historical implementation of SQL functions (if they don't get
inlined), we built plans for all the contained queries at first call
within an outer query, and then re-used those plans for the duration
of the outer query, and then forgot everything.  This was not ideal,
not least because the plans could not be customized to specific values
of the function's parameters.  Our plancache infrastructure seems
mature enough to be used here.  That will solve both the problem with
not being able to build custom plans and the problem with not being
able to share work across successive outer queries.

Aside from those performance concerns, this change fixes a
longstanding bugaboo with SQL functions: you could not write DDL that
would affect later statements in the same function.  That's mostly
still true with new-style SQL functions, since the results of parse
analysis are baked into the stored query trees (and protected by
dependency records).  But for old-style SQL functions, it will now
work much as it does with PL/pgSQL functions, because we delay parse
analysis and planning of each query until we're ready to run it.
Some edge cases that require replanning are now handled better too;
see for example the new rowsecurity test, where we now detect an RLS
context change that was previously missed.

One other edge-case change that might be worthy of a release note
is that we now insist that a SQL function's result be generated
by the physically-last query within it.  Previously, if the last
original query was deleted by a DO INSTEAD NOTHING rule, we'd be
willing to take the result from the preceding query instead.
This behavior was undocumented except in source-code comments,
and it seems hard to believe that anyone's relying on it.

Along the way to this feature, we needed a few infrastructure changes:

* The plancache can now take either a raw parse tree or an
analyzed-but-not-rewritten Query as the starting point for a
CachedPlanSource.  If given a Query, it is the caller's responsibility
that nothing will happen to invalidate that form of the query.
We use this for new-style SQL functions, where what's in pg_proc is
serialized Query(s) and we trust the dependency mechanism to disallow
DDL that would break those.

* The plancache now offers a way to invoke a post-rewrite callback
to examine/modify the rewritten parse tree when it is rebuilding
the parse trees after a cache invalidation.  We need this because
SQL functions sometimes adjust the parse tree to make its output
exactly match the declared result type; if the plan gets rebuilt,
that has to be re-done.

* There is a new backend module utils/cache/funccache.c that
abstracts the idea of caching data about a specific function
usage (a particular function and set of input data types).
The code in it is moved almost verbatim from PL/pgSQL, which
has done that for a long time.  We use that logic now for
SQL-language functions too, and maybe other PLs will have use
for it in the future.

Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://postgr.es/m/8216639.NyiUUSuA9g@aivenlaptop
2025-04-02 14:06:02 -04:00
Heikki Linnakangas
e9e7b66044 Add GiST and btree sortsupport routines for range types
For GiST, having a sortsupport function allows building the index
using the "sorted build" method, which is much faster.

For b-tree, the sortsupport routine doesn't give any new
functionality, but speeds up sorting a tiny bit. The difference is not
very significant, about 2% in cursory testing on my laptop, because
the range type comparison function has quite a lot of overhead from
detoasting. In any case, since we have the function for GiST anyway,
we might as well register it for the btree opfamily too.

Author: Bernd Helmle <mailings@oopsware.de>
Discussion: https://www.postgresql.org/message-id/64d324ce2a6d535d3f0f3baeeea7b25beff82ce4.camel@oopsware.de
2025-04-02 19:51:28 +03:00
Heikki Linnakangas
ea3f9b6da3 docs: Fix column count attribute in table
Nothing seems to actually depend on the attribute, as the docs built
successfully, but let's be tidy.

Reported offlist by Matthias van de Meent
2025-04-02 18:21:07 +03:00
Tomas Vondra
46df9487d9 Improve accounting for PredXactList, RWConflictPool and PGPROC
Various places allocated shared memory by first allocating a small chunk
using ShmemInitStruct(), followed by ShmemAlloc() calls to allocate more
memory. Unfortunately, ShmemAlloc() does not update ShmemIndex, so this
affected pg_shmem_allocations - it only showed the initial chunk.

This commit modifies the following allocations, to allocate everything
as a single chunk, and then split it internally.

- PredXactList
- RWConflictPool
- PGPROC structures
- Fast-Path Lock Array

The fast-path lock array is allocated separately, not as a part of the
PGPROC structures allocation.
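
To illustrate the single-chunk pattern described above (a minimal sketch
only; the structure name, fields, and sizes here are hypothetical, not the
committed code), a single ShmemInitStruct() request can be split internally
like this:

    #include "postgres.h"
    #include "storage/shmem.h"

    typedef struct DemoShared
    {
        int         nitems;         /* hypothetical header field */
    } DemoShared;

    static void
    DemoSharedInit(int nitems)
    {
        bool        found;
        Size        size;
        char       *ptr;
        DemoShared *hdr;
        int        *items;

        /* size the header plus the trailing array as one request */
        size = add_size(MAXALIGN(sizeof(DemoShared)),
                        mul_size(nitems, sizeof(int)));

        /* one ShmemInitStruct() call, so pg_shmem_allocations sees it all */
        ptr = ShmemInitStruct("Demo Shared State", size, &found);

        hdr = (DemoShared *) ptr;
        items = (int *) (ptr + MAXALIGN(sizeof(DemoShared)));

        if (!found)
        {
            hdr->nitems = nitems;
            memset(items, 0, nitems * sizeof(int));
        }
    }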

Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAH2L28vHzRankszhqz7deXURxKncxfirnuW68zD7+hVAqaS5GQ@mail.gmail.com
2025-04-02 17:14:28 +02:00
Tomas Vondra
f5930f9a98 Improve accounting for memory used by shared hash tables
pg_shmem_allocations tracks the memory allocated by ShmemInitStruct(),
but for shared hash tables that covered only the header and hash
directory.  The remaining parts (segments and buckets) were allocated
later using ShmemAlloc(), which does not update the shmem accounting.
Thus, these allocations were not shown in pg_shmem_allocations.

This commit improves the situation by allocating all the hash table
parts at once, using a single ShmemInitStruct() call. This way the
ShmemIndex entries (and thus pg_shmem_allocations) better reflect the
proper size of the hash table.

This affects allocations for private (non-shared) hash tables too, as
the hash_create() code is shared. For non-shared tables this however
makes no practical difference.

This changes the alignment a bit. ShmemAlloc() aligns the chunks using
CACHELINEALIGN(), which means some parts (header, directory, segments)
were aligned this way. Allocating all parts as a single chunk removes
this (implicit) alignment. We've considered adding explicit alignment,
but we've decided not to - it seems to be merely a coincidence due to
using the ShmemAlloc() API, not due to necessity.

Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAH2L28vHzRankszhqz7deXURxKncxfirnuW68zD7+hVAqaS5GQ@mail.gmail.com
2025-04-02 17:14:28 +02:00
Tom Lane
bd178960c6 Need to do CommandCounterIncrement after StoreAttrMissingVal.
Without this, an additional change to the same pg_attribute row
within the same command will fail.  This is possible at least with
ALTER TABLE ADD COLUMN on a multiple-inheritance-pathway structure.
(Another potential hazard is that immediately-following operations
might not see the missingval.)
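
Schematically, the rule being applied is the following (a sketch of the
pattern, not the committed code; the variables are illustrative):

    /* first update of the pg_attribute row, e.g. setting atthasdef */
    CatalogTupleUpdate(attrrel, &tuple->t_self, tuple);

    /* make that update visible to the rest of this command ... */
    CommandCounterIncrement();

    /*
     * ... so a second update of the same row (atthasmissing/attmissingval)
     * fetches the new tuple version instead of failing.
     */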

Introduced by 95f650674, which split the former coding that
used a single pg_attribute update to change both atthasdef and
atthasmissing/attmissingval into two updates, but missed that
this should entail two CommandCounterIncrements as well.  Like
that fix, back-patch through v13.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/025a3ffa-5eff-4a88-97fb-8f583b015965@gmail.com
Backpatch-through: 13
2025-04-02 11:13:01 -04:00
Heikki Linnakangas
b05751220b docs: Add a new section and a table listing protocol versions
Move the discussion on protocol versions and version negotiation to a
new "Protocol versions" section. Add a table listing all the different
protocol versions, starting from the obsolete protocol version 2, and
the PostgreSQL versions that support each.

Discussion: https://www.postgresql.org/message-id/69f53970-1d55-4165-9151-6fb524e36af9@iki.fi
2025-04-02 16:41:51 +03:00
Heikki Linnakangas
a460251f0a Make cancel request keys longer
Currently, the cancel request key is a 32-bit token, which isn't very
much entropy. If you want to cancel another session's query, you can
brute-force it. In most environments, an unauthorized cancellation of
a query isn't very serious, but it nevertheless would be nice to have
more protection from it. Hence make the key longer, to make it harder
to guess.

The longer cancellation keys are generated when using the new protocol
version 3.2. For connections using version 3.0, short 4-byte keys are
still used.

The new longer key length is not hardcoded in the protocol anymore;
the client is expected to deal with variable-length keys, up to 256
bytes. This flexibility allows e.g. a connection pooler to add more
information to the cancel key, which might be useful for finding the
connection.

Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions)
Discussion: https://www.postgresql.org/message-id/508d0505-8b7a-4864-a681-e7e5edfe32aa@iki.fi
2025-04-02 16:41:48 +03:00
Heikki Linnakangas
285613c60a libpq: Add min/max_protocol_version connection options
All supported versions of the PostgreSQL server send the
NegotiateProtocolVersion message when an unsupported minor protocol
version is requested by a client. But many other applications that
implement the PostgreSQL protocol (connection poolers, or other
databases) do not, and the same is true for PostgreSQL server versions
older than 9.3. Connecting to such other applications thus fails if a
client requests a protocol version different than 3.0.

This patch adds a max_protocol_version connection option to libpq that
specifies the protocol version that libpq should request from the
server. Currently only 3.0 is supported, but that will change in a
future commit that bumps the protocol version. Even after that version
bump the default will likely stay 3.0 for the time being. Once more of
the ecosystem supports the NegotiateProtocolVersion message we might
want to change the default to the latest minor version.

This also adds the similar min_protocol_version connection option, to
allow the client to specify that connecting should fail if a lower
protocol version is attempted by the server. This can be used to
ensure that certain protocol features are used, which can be
particularly useful if those features impact security.
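
For illustration, a client using these options might look like the
following (a minimal sketch; only the two option names and the value 3.0
are taken from this commit, the rest is ordinary libpq boilerplate):

    #include <stdio.h>
    #include <stdlib.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        /* require exactly protocol 3.0: fail rather than fall back lower */
        PGconn     *conn = PQconnectdb("dbname=postgres "
                                       "min_protocol_version=3.0 "
                                       "max_protocol_version=3.0");

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            PQfinish(conn);
            return EXIT_FAILURE;
        }

        PQfinish(conn);
        return EXIT_SUCCESS;
    }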

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions)
Discussion: https://www.postgresql.org/message-id/CAGECzQTfc_O%2BHXqAo5_-xG4r3EFVsTefUeQzSvhEyyLDba-O9w@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CAGECzQRbAGqJnnJJxTdKewTsNOovUt4bsx3NFfofz3m2j-t7tA@mail.gmail.com
2025-04-02 16:41:45 +03:00
Heikki Linnakangas
5070349102 libpq: Handle NegotiateProtocolVersion message differently
Previously libpq would always error out if the server sends a
NegotiateProtocolVersion message. This was fine because libpq only
supported a single protocol version and did not support any protocol
parameters. But in the upcoming commits, we will introduce a new
protocol version and the NegotiateProtocolVersion message starts to
actually be used.

This patch modifies the client side checks to allow a range of
supported protocol versions, instead of only allowing the exact
version that was requested. Currently this "range" only contains the
3.0 version, but in a future commit we'll change this.

Also clarify the error messages, making them suitable for the world
where libpq will support multiple protocol versions and protocol
extensions.

Note that until the later commits that introduce a new protocol version,
this change does not have any behavioural effect, because libpq will
only request version 3.0 and will never send protocol parameters, and
therefore will never receive a NegotiateProtocolVersion message from
the server.

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions)
Discussion: https://www.postgresql.org/message-id/CAGECzQTfc_O%2BHXqAo5_-xG4r3EFVsTefUeQzSvhEyyLDba-O9w@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CAGECzQRbAGqJnnJJxTdKewTsNOovUt4bsx3NFfofz3m2j-t7tA@mail.gmail.com
2025-04-02 16:41:42 +03:00
Peter Eisentraut
748e98d05b Fix code comment
The changes made in commit d2b4b4c2259 contained incorrect comments:
They said that certain forward declarations were necessary to "avoid
including pathnodes.h here", but the file is itself pathnodes.h!  So
change the comment to just say it's a forward declaration in one case,
and in the other case we don't need the declaration at all because it
already appeared earlier in the file.
2025-04-02 14:46:47 +02:00
Heikki Linnakangas
09be391126 Add timingsafe_bcmp(), for constant-time memory comparison
timingsafe_bcmp() should be used instead of memcmp() or a naive
for-loop, when comparing passwords or secret tokens, to avoid leaking
information about the secret token by timing. This commit just
introduces the function but does not change any existing code to use
it yet.
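
For readers unfamiliar with the idea: a constant-time comparison
accumulates differences with bitwise OR instead of returning at the first
mismatch, so the run time depends only on the length, not on where (or
whether) the inputs differ. A generic sketch of the technique (not
necessarily the committed implementation):

    #include <stddef.h>

    /* Return 0 iff b1 and b2 are equal over n bytes; time depends only on n. */
    static int
    constant_time_bcmp(const void *b1, const void *b2, size_t n)
    {
        const unsigned char *p1 = b1;
        const unsigned char *p2 = b2;
        unsigned char diff = 0;
        size_t      i;

        for (i = 0; i < n; i++)
            diff |= p1[i] ^ p2[i];

        return diff != 0;
    }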

Co-authored-by: Jelte Fennema-Nio <github-tech@jeltef.nl>
Discussion: https://www.postgresql.org/message-id/7b86da3b-9356-4e50-aa1b-56570825e234@iki.fi
2025-04-02 15:32:40 +03:00
Heikki Linnakangas
85d799ba8a docs: Update phrase on message lengths in the protocol
The reasoning for why all the message formats are parseable without
the explicit message length field is anachronistic; the real reason is
that protocol version 2 did not have a message length field. There's
nothing wrong with relying on the message length, like we do in the
CopyData messages, even though it often still makes sense to have
length fields for individual parts in messages.

Discussion: https://www.postgresql.org/message-id/02a4eed2-98f0-4796-9d4f-12128ff44fe0@iki.fi
2025-04-02 15:32:33 +03:00
Andres Freund
a6285b150a tests: Fix incompatibility of test_aio with *_FORCE_RELEASE
The test added in 93bc3d75d8e failed in a build with RELCACHE_FORCE_RELEASE
and CATCACHE_FORCE_RELEASE defined. The test intentionally forgets to exit
batchmode - normally that would trigger an error at the end of the
transaction, which the test verifies.  However, with RELCACHE_FORCE_RELEASE
and CATCACHE_FORCE_RELEASE defined, we get other code (output function lookup)
entering batchmode and erroring out because batchmode isn't allowed to be
entered recursively.

Fix that by changing the queries in question to not output any rows. That's
not exactly pretty, but seems to avoid the problem reliably.

Eventually we might want to make RELCACHE_FORCE_RELEASE and
CATCACHE_FORCE_RELEASE GUCs, so we can disable them where necessary - this
isn't the first test having difficulty with those debug options. But that's
for later.

Per buildfarm member prion.

Discussion: https://postgr.es/m/uc62i6vi5gd4bi6wtjj5poadqxolgy55e7ihkmf3mthjegb6zl@zqo7xez7sc2r
2025-04-02 07:57:11 -04:00
Andres Freund
43dca8a116 tests: Cope with WARNINGs during failed CREATE DB on windows
The test added in 93bc3d75d8e sometimes fails on Windows, due to warnings like
WARNING:  some useless files may be left behind in old database directory "base/16514"

The reason for that is createdb_failure_callback() does not ensure that there
are no open file descriptors for files in the partially created,
to-be-dropped, database. We do take care in dropdb(), but that involves
waiting for checkpoints and a ProcSignalBarrier, which we probably don't want
to do in an error callback.  This should probably be fixed one day, but for
now 001_aio.pl needs to cope.

Per buildfarm animals fairywren and drongo.

Discussion: https://postgr.es/m/uc62i6vi5gd4bi6wtjj5poadqxolgy55e7ihkmf3mthjegb6zl@zqo7xez7sc2r
2025-04-02 07:51:48 -04:00
Peter Eisentraut
eec0040c4b Add support for NOT ENFORCED in foreign key constraints
This expands the NOT ENFORCED constraint flag, previously only
supported for CHECK constraints (commit ca87c415e2f), to foreign key
constraints.

Normally, when a foreign key constraint is created on a table, action
and check triggers are added to maintain data integrity.  With this
patch, if a constraint is marked as NOT ENFORCED, integrity checks are
no longer required, making these triggers unnecessary.  Consequently,
when creating a NOT ENFORCED foreign key constraint, triggers will not
be created, and the constraint will be marked as NOT VALID.
Similarly, if an existing foreign key constraint is changed to NOT
ENFORCED, the associated triggers will be dropped, and the constraint
will also be marked as NOT VALID.  Conversely, if a NOT ENFORCED
foreign key constraint is changed to ENFORCED, the necessary triggers
will be created, and the constraint will be changed to VALID by
performing the necessary validation.

Since not-enforced foreign key constraints have no triggers, the
shortcut used for example in psql and pg_dump to skip looking for
foreign keys if the relation is known not to have triggers no longer
applies.  (It already didn't work for partitioned tables.)

Author: Amul Sul <sulamul@gmail.com>
Reviewed-by: Joel Jacobson <joel@compiler.org>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Isaac Morland <isaac.morland@gmail.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Tested-by: Triveni N <triveni.n@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
2025-04-02 13:36:44 +02:00
Andres Freund
327d987df1 tests: Cope with io_method in TEMP_CONFIG in test_aio
If io_method is set in TEMP_CONFIG the test added in 93bc3d75d8e fails,
because it assumes the io_method specified at initdb is actually used.

Fix that by appending the io_method again, after initdb (and thus after
TEMP_CONFIG has been added by Cluster.pm).

Per buildfarm animal bumblebee

Discussion: https://postgr.es/m/zh5u22wbpcyfw2ddl3lsvmsxf4yvsrvgxqwwmfjddc4c2khsgp@gfysyjsaelr5
2025-04-02 07:00:40 -04:00
Alexander Korotkov
bc22dc0e0d Get rid of WALBufMappingLock
Allow multiple backends to initialize WAL buffers concurrently.  This way
`MemSet((char *) NewPage, 0, XLOG_BLCKSZ);` can run in parallel without
taking a single LWLock in exclusive mode.

The new algorithm works as follows:
 * reserve a page for initialization using XLogCtl->InitializeReserved,
 * ensure the page is written out,
 * once the page is initialized, try to advance XLogCtl->InitializedUpTo and
   signal to waiters using XLogCtl->InitializedUpToCondVar condition
   variable,
 * repeat previous steps until we reserve initialization up to the target
   WAL position,
 * wait until concurrent initialization finishes using a
   XLogCtl->InitializedUpToCondVar.

Now, multiple backends can, in parallel, concurrently reserve pages,
initialize them, and advance XLogCtl->InitializedUpTo to point to the latest
initialized page.

Author: Yura Sokolov <y.sokolov@postgrespro.ru>
Co-authored-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Tested-by: Michael Paquier <michael@paquier.xyz>
2025-04-02 12:44:24 +03:00
Fujii Masao
b53b88109f Improve error message when standby does not accept connections.
Even after reaching the minimum recovery point, if there are long-lived
write transactions with more than 64 subtransactions on the primary, the recovery
snapshot may not yet be ready for hot standby, delaying read-only
connections on the standby. Previously, when read-only connections were
not accepted due to this condition, the following error message was logged:

    FATAL:  the database system is not yet accepting connections
    DETAIL:  Consistent recovery state has not been yet reached.

This DETAIL message was misleading because the following message was
already logged in this case:

    LOG:  consistent recovery state reached

This contradiction, i.e., indicating that the recovery state was consistent
while also stating it wasn’t, caused confusion.

This commit improves the error message to better reflect the actual state:

    FATAL: the database system is not yet accepting connections
    DETAIL: Recovery snapshot is not yet ready for hot standby.
    HINT: To enable hot standby, close write transactions with more than 64 subtransactions on the primary server.

To implement this, the commit introduces a new postmaster signal,
PMSIGNAL_RECOVERY_CONSISTENT. When the startup process reaches
a consistent recovery state, it sends this signal to the postmaster,
allowing it to correctly recognize that state.

Since this is not a clear bug, the change is applied only to the master
branch and is not back-patched.

Author: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Co-authored-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp>
Discussion: https://postgr.es/m/02db8cd8e1f527a8b999b94a4bee3165@oss.nttdata.com
2025-04-02 15:13:01 +09:00
David Rowley
121d774cae Doc: add information about partition locking
The documentation around locking of partitions for the executor startup
phase of run-time partition pruning wasn't clear about which partitions
were being locked.  Fix that.

Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/CAApHDvp738G75HfkKcfXaf3a8s%3D6mmtOLh46tMD0D2hAo1UCzA%40mail.gmail.com
Backpatch-through: 13
2025-04-02 14:02:44 +13:00
Melanie Plageman
b3219c69fc aio: Add errcontext for processing I/Os for another backend
Push an ErrorContextCallback adding additional detail about the process
performing the I/O and the owner of the I/O when those are not the same.

For io_method worker, this adds context specifying which process owns
the I/O that the I/O worker is processing.

For io_method io_uring, this adds context only when a backend is
*completing* I/O for another backend. It specifies the pid of the owning
process.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/rdml3fpukrqnas7qc5uimtl2fyytrnu6ymc2vjf2zuflbsjuul%40hyizyjsexwmm
2025-04-01 19:53:07 -04:00
David Rowley
b136db07c6 Fix planner's failure to identify multiple hashable ScalarArrayOpExprs
50e17ad28 (v14) and 29f45e299 (v15) made it so the planner could identify
IN and NOT IN clauses which have Const lists as right-hand arguments and
when an appropriate hash function is available for the data types, mark
the ScalarArrayOpExpr as hashable so the executor could execute it more
optimally by building and probing a hash table during expression
evaluation.

These commits both worked correctly when there was only a single
ScalarArrayOpExpr in the given expression being processed by the
planner, but when there were multiple, only the first was checked and any
subsequent ones were not identified, which resulted in less optimal
expression evaluation during query execution for all but the first found
ScalarArrayOpExpr.

Backpatch to 14, where 50e17ad28 was introduced.

Author: David Geier <geidav.pg@gmail.com>
Discussion: https://postgr.es/m/29a76f51-97b0-4c07-87b7-ec8e3b5345c9@gmail.com
Backpatch-through: 14
2025-04-02 11:56:29 +13:00
Tom Lane
6c12ae09f5 Introduce a SQL-callable function array_sort(anyarray).
Create a function that will sort the elements of an array
according to the element type's sort order.  If the array
has more than one dimension, the sub-arrays of the first
dimension are sorted per normal array-comparison rules,
leaving their contents alone.

In support of this, add pg_type.typarray to the set of fields
cached by the typcache.

Author: Junwang Zhao <zhjwpku@gmail.com>
Co-authored-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://postgr.es/m/CAEG8a3J41a4dpw_-F94fF-JPRXYxw-GfsgoGotKcjs9LVfEEvw@mail.gmail.com
2025-04-01 18:03:55 -04:00
Tom Lane
6da2ba1d8a Fix detection and handling of strchrnul() for macOS 15.4.
As of 15.4, macOS has strchrnul(), but access to it is blocked behind
a check for MACOSX_DEPLOYMENT_TARGET >= 15.4.  But our does-it-link
configure check finds it, so we try to use it, and fail with the
present default deployment target (namely 15.0).  This accounts for
today's buildfarm failures on indri and sifaka.

This is the identical problem that we faced some years ago when Apple
introduced preadv and pwritev in the same way.  We solved that in
commit f014b1b9b by using AC_CHECK_DECLS instead of AC_CHECK_FUNCS
to check the functions' availability.  So do the same now for
strchrnul().  Interestingly, we already had a workaround for
"the link check doesn't agree with <string.h>" cases with glibc,
which we no longer need since only the header declaration is being
checked.
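
(For context, on platforms where the declaration is unavailable a
substitute is trivial; a hedged sketch of such a fallback, not necessarily
what the tree uses, is:)

    /*
     * Like strchrnul(): return the first occurrence of c in s, or a pointer
     * to the terminating NUL if c does not occur.
     */
    static char *
    strchrnul_fallback(const char *s, int c)
    {
        while (*s != '\0' && *s != (char) c)
            s++;
        return (char *) s;
    }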

Testing this revealed that the meson version of this check has never
worked, because it failed to use "-Werror=unguarded-availability-new".
(Apparently nobody's tried to build with meson on macOS versions that
lack preadv/pwritev as standard.)  Adjust that while at it.  Also,
we had never put support for "-Werror=unguarded-availability-new"
into v13, but we need that now.

Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Co-authored-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/385134.1743523038@sss.pgh.pa.us
Backpatch-through: 13
2025-04-01 16:50:09 -04:00
Andrew Dunstan
c313fa4602 Use workaround of __builtin_setjmp only on MINGW on MSVCRT
MSVCRT is not present on Windows/ARM64, and the workaround is not
necessary on any UCRT-based toolchain.

Author: Lars Kanis <lars@greiz-reinsdorf.de>

Discussion: https://postgr.es/m/CAHXCYb2OjNHtoGVKyXtXmw4B3bUXwJX6M-Lcp1KcMCRUMLOocA@mail.gmail.com
2025-04-01 16:24:59 -04:00
Andres Freund
e19dc74491 aio: Minor comment improvements
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/usbwzckj7q3jhfx3ann3nrfnukmupbs35axvq5zfyeo6nvrzrm@onjhxs2du4st
2025-04-01 16:06:48 -04:00
Andres Freund
fdd146a8ef aio: Add README.md explaining higher level design
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de
Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
2025-04-01 16:06:48 -04:00
Nathan Bossart
5aec7e07fb doc: Adjust some notes about pg_upgrade's file transfer modes.
--copy-file-range and --swap were not mentioned in a few places
that discuss the available file transfer modes.  This entire page
would likely benefit from an overhaul, but that's v19 material at
this point.

Oversights in commits d93627bcbe and 626d7236b6.
2025-04-01 14:37:47 -05:00
Andres Freund
00066aa173 md: Add comment & assert to buffer-zeroing path in md[start]readv()
mdreadv() has a codepath to zero out buffers when a read returns zero bytes,
guarded by a check for zero_damaged_pages || InRecovery.

The InRecovery codepath to zero out buffers in mdreadv() appears to be
unreachable. The only known paths to reach mdreadv()/mdstartreadv() in
recovery are XLogReadBufferExtended(), vm_readbuf(), and fsm_readbuf(), each
of which takes care to extend the relation if necessary. This looks to either
have been the case for a long time, or the code was never reachable.

The zero_damaged_pages path is incomplete, as missing segments are not
created.

Putting blocks into the buffer-pool that do not exist on disk is rather
problematic, as such blocks will, at least initially, not be found by scans
that rely on smgrnblocks(), as they are beyond EOF. It also can cause weird
problems with relation extension, as relation extension does not expect blocks
beyond EOF to exist.

Therefore we would like to remove that path.

mdstartreadv(), which I added in e5fe570b51c, does not implement this zeroing
logic. I had started a discussion about that a while ago (linked below), but
forgot to act on the conclusion of the discussion, namely to disable the
in-memory-zeroing behavior.

We could certainly implement equivalent zeroing logic in mdstartreadv(), but
it would have to be more complicated due to potential differences in the
zero_damaged_pages setting between the definer and completor of IO. Given
that we want to remove the logic anyway, implementing it there does not seem
worthwhile.

For now, put an Assert(false) and comments documenting this choice into
mdreadv() and comments documenting the deprecation of the path in mdreadv()
and the non-implementation of it in mdstartreadv().  If we, during testing,
discover that we do need the path, we can implement it at that time.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/postgr.es/m/20250330024513.ac.nmisch@google.com
Discussion: https://postgr.es/m/postgr.es/m/3qxxsnciyffyf3wyguiz4besdp5t5uxvv3utg75cbcszojlz7p@uibfzmnukkbd
2025-04-01 13:50:39 -04:00
Andres Freund
93bc3d75d8 aio: Add test_aio module
To make the tests possible, a few functions from bufmgr.c/localbuf.c had to be
exported, via buf_internals.h.

Reviewed-by: Noah Misch <noah@leadboat.com>
Co-authored-by: Andres Freund <andres@anarazel.de>
Co-authored-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
2025-04-01 13:47:46 -04:00
Andres Freund
60f566b4f2 aio: Add pg_aios view
The new view lists all IO handles that are currently in use and is mainly
useful for PG developers, but may also be useful when tuning PG.

Bumps catversion.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
2025-04-01 13:30:33 -04:00
Andres Freund
46250cdcb0 docs: Add acronym and glossary entries for I/O and AIO
These are fairly basic, but better than nothing.  While there are several
opportunities to link to these entries, this patch does not add any. They will
however be referenced by future patches.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250326183102.92.nmisch@google.com
2025-04-01 13:30:33 -04:00
Álvaro Herrera
172259afb5
Verify roundtrip dump/restore of regression database
Add a test to pg_upgrade's test suite that verifies that
dump-restore-dump of regression database produces equivalent output to
dumping it directly.  This was already being tested by running
pg_upgrade itself, but non-binary-upgrade mode was not being covered.

The regression database has accrued, over time, a sufficient collection
of interesting objects to ensure good coverage, but there hasn't been a
concerted effort to be completely exhaustive, so it is likely still
possible to have more.

This'd belong more naturally in the pg_dump test suite, but we chose to
put it in src/bin/pg_upgrade/t/002_pg_upgrade.pl because we need a run
of the regression tests which is already done here, so this has less
total test runtime impact.  Also, experiments have shown that using
parallel dump/restore is slightly faster, so we use --format=directory -j2.

This test has already reported pg_dump bugs, as fixed in fd41ba93e463,
74563f6b9021, d611f8b1587b, 4694aedf63bf.

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/CAExHW5uF5V=Cjecx3_Z=7xfh4rg2Wf61PT+hfquzjBqouRzQJQ@mail.gmail.com
2025-04-01 18:50:40 +02:00
Peter Eisentraut
764d501d24 Remove a stray "pgrminclude" annotation
We don't use those anymore.  Fix for commit 8492feb98f6.
2025-04-01 15:28:22 +02:00
Peter Eisentraut
113ecf1f8c Fix minor C type confusion
Returning false instead of NULL gets a compiler error under gcc-14
-std=gnu23, and it appears to have been unintentional.  Fix for commit
8492feb98f6.
2025-04-01 15:28:22 +02:00
Heikki Linnakangas
2904324a88 heapam: Only set tuple's block once per page in pagemode
Due to splitting the block id into two 16 bit integers, BlockIdSet()
is more expensive than one might think.  Doing it once per returned
tuple shows up as a small but reliably reproducible cost.  It's simple
enough to set the block number just once per block in pagemode, so do
so.
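
Schematically (a sketch of the idea, not the committed code; the variables
are illustrative):

    ItemPointerData tid;

    /* once per heap page in pagemode: */
    ItemPointerSetBlockNumber(&tid, blkno);

    /* then, for each tuple returned from that page, only the offset changes: */
    ItemPointerSetOffsetNumber(&tid, lineoff);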

Author: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/lxzj26ga6ippdeunz6kuncectr5gfuugmm2ry22qu6hcx6oid6@lzx3sjsqhmt6
2025-04-01 13:24:27 +03:00
John Naylor
af0c248557 Use function attributes for SSE 4.2 even when targeting that extension
On Red Hat 9 systems (or similar), the packaged gcc targets x86-64-v2,
but clang does not. This has caused build failures in the wake of
commit e2809e3a1 when building --with-llvm.

The most expedient fix is to use the same function attributes for
the inlined function as we do for the global function.

Reported-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> (plus members skimmer and bumblebee)
Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us>
Tested-by: Todd Cook <cookt@blackduck.com>
Discussion: https://postgr.es/m/CANWCAZZSxs3a1YRKehkgk2OHKbrVn+xZ+AWW8Co2R_f70NqqmA@mail.gmail.com
2025-04-01 12:01:58 +07:00
David Rowley
3dbdf86c63 Fix failing regression test on x86-32 machines
95d6e9af0 added code to display the tuplestore storage type for
WindowAgg nodes and added a test to ensure the "Disk" storage method was
working correctly by setting work_mem to 64 and running a test which
caused the WindowAgg to go to disk.  Seemingly, the number of rows
chosen there wasn't quite enough for that to happen on 32-bit x86.

Fix this by increasing the number of rows slightly.

I suspect the buildfarm didn't catch this as MEMORY_CONTEXT_CHECKING
builds will use a bit more memory for MemoryChunks to store the
requested_size and also because of the additional space to store the
chunk's sentinel byte.

Reported-by: Christoph Berg <myon@debian.org>
Discussion: https://postgr.es/m/Z-q3ZAM4OhE-4UiI@msg.df7cb.de
2025-04-01 10:52:25 +13:00
Tom Lane
2fd3e2fa5c Fix accidentally-harmless thinko in psqlscan_test_variable().
This code was passing literal strings to psqlscan_emit,
which is quite contrary to that function's specification:
"If you pass it something that is not part of the yytext
string, you are making a mistake".  It accidentally worked
anyway, even in non-safe_encoding mode.  psqlscan_emit
would compute a garbage "reference" pointer, but would
never dereference that since the passed string is all-ASCII.
So there's no live bug today, but that is a happenstance
outcome of psqlscan_emit's current implementation.

Let's make psqlscan_test_variable do what it's supposed to,
namely append directly to the output buffer.  This is just
future-proofing against possible changes in psqlscan_emit,
so I don't feel a need to back-patch.
2025-03-31 12:16:32 -04:00
Peter Eisentraut
0fcf02ad45 doc: Mention clock synchronization recommendation for hot_standby_feedback
hot_standby_feedback mechanics assume that clocks are synchronized,
but this was not clear from the documentation.

Author: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CAKZiRmwBcALLrDgCyEhHP1enUxtPMjyNM_d1A2Lng3_6Rf4Qfw%40mail.gmail.com
2025-03-31 16:54:50 +02:00
John Naylor
e2809e3a10 Inline CRC computation for small fixed-length input on x86
pg_crc32c.h now has a simplified copy of the loop in pg_crc32c_sse42.c
suitable for inlining where possible.
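
A hedged sketch of what such an inlined SSE 4.2 loop typically looks like
(illustrative only, not the committed code; it requires building with SSE
4.2 enabled or an equivalent target attribute):

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>
    #include <nmmintrin.h>      /* SSE 4.2 CRC32 intrinsics */

    static inline uint32_t
    crc32c_sse42_inline(uint32_t crc, const void *data, size_t len)
    {
        const unsigned char *p = data;

        /* eight bytes at a time, then finish byte-wise */
        while (len >= 8)
        {
            uint64_t    chunk;

            memcpy(&chunk, p, 8);
            crc = (uint32_t) _mm_crc32_u64(crc, chunk);
            p += 8;
            len -= 8;
        }
        while (len-- > 0)
            crc = _mm_crc32_u8(crc, *p++);

        return crc;
    }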

This may slightly reduce contention for the WAL insertion lock,
but that hasn't been tested. The motivation for this change is to avoid
regressing for a future commit that will use a function pointer for
non-constant input in all x86 builds.

While it's technically possible to make a similar change for Arm and
LoongArch, there are some questions about how inlining should work
since those platforms prefer stricter alignment. There are also no
immediate plans to add additional implementations for them.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Discussion: https://postgr.es/m/CANWCAZZEiTzhZcuwTiJ2=opiNpAUn1vuDRu1N02z61AthwRZLA@mail.gmail.com
Discussion: https://postgr.es/m/CANWCAZYRhLHArpyfV4uRK-Rw9N5oV5HMkkKtBehcuTjNOMwCZg@mail.gmail.com
2025-03-31 13:17:21 +07:00
Jeff Davis
4694aedf63 Add relallfrozen to pg_dump statistics.
Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/CADkLM=desCuf3dVHasADvdUVRmb-5gO0mhMO5u9nzgv6i7U86Q@mail.gmail.com
2025-03-30 22:14:06 -07:00
Andres Freund
2a5e709e72 Enable IO concurrency on all systems
Previously effective_io_concurrency and maintenance_io_concurrency could not
be set above 0 on machines without fadvise support. AIO enables IO concurrency
without such support, via io_method=worker.

Currently only subsystems using the read stream API will take advantage of
this. Other users of maintenance_io_concurrency (like recovery prefetching)
which leverage OS advice directly will not benefit from this change. In those
cases, maintenance_io_concurrency will have no effect on I/O behavior.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CAAKRu_atGgZePo=_g6T3cNtfMf0QxpvoUh5OUqa_cnPdhLd=gw@mail.gmail.com
2025-03-30 19:16:47 -04:00
Andres Freund
ae3df4b341 read_stream: Introduce and use optional batchmode support
Submitting IO in larger batches can be more efficient than doing so
one-by-one, particularly for many small reads. It does, however, require
the ReadStreamBlockNumberCB callback to abide by the restrictions of AIO
batching (c.f. pgaio_enter_batchmode()). Basically, the callback may not:
a) block without first calling pgaio_submit_staged(), unless a
   to-be-waited-on lock cannot be part of a deadlock, e.g. because it is
   never held while waiting for IO.

b) directly or indirectly start another batch pgaio_enter_batchmode()

As this requires care and is nontrivial in some cases, batching is only
used with explicit opt-in.

This patch adds an explicit flag (READ_STREAM_USE_BATCHING) to read_stream and
uses it where appropriate.

There are two cases where batching would likely be beneficial, but where we
aren't using it yet:

1) bitmap heap scans, because the callback reads the VM

   This should soon be solved, because we are planning to remove the use of
   the VM, due to that not being sound.

2) The first phase of heap vacuum

   This could be made to support batchmode, but would require some care.

Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
2025-03-30 18:36:41 -04:00
Andres Freund
f4d0730bbc aio: Basic read_stream adjustments for real AIO
Adapt the read stream logic for real AIO:
- If AIO is enabled, we shouldn't issue advice, but if it isn't, we should
  continue issuing advice
- AIO benefits from reading ahead with direct IO
- If effective_io_concurrency=0, pass READ_BUFFERS_SYNCHRONOUSLY to
  StartReadBuffers() to ensure synchronous IO execution

There are further improvements we should consider:

- While in read_stream_look_ahead(), we can use AIO batch submission mode for
  increased efficiency. That however requires care to avoid deadlocks and is
  thus done separately.
- It can be beneficial to defer starting new IOs until we can issue multiple
  IOs at once. That however requires non-trivial heuristics to decide when to
  do so.

Reviewed-by: Noah Misch <noah@leadboat.com>
Co-authored-by: Andres Freund <andres@anarazel.de>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
2025-03-30 18:26:44 -04:00
Andres Freund
b27f8637ea docs: Reframe track_io_timing related docs as wait time
With AIO it does not make sense anymore to track the time for each individual
IO, as multiple IOs can be in-flight at the same time. Instead we now track
the time spent *waiting* for IOs.

This should be reflected in the docs. While, so far, we only do a subset of
reads, and no other operations, via AIO, describing the GUC and view columns
as measuring IO waits is accurate for synchronous and asynchronous IO.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/5dzyoduxlvfg55oqtjyjehez5uoq6hnwgzor4kkybkfdgkj7ag@rbi4gsmzaczk
2025-03-30 18:04:40 -04:00
Andres Freund
12ce89fd07 bufmgr: Use AIO in StartReadBuffers()
This finally introduces the first actual use of AIO. StartReadBuffers() now
uses the AIO routines to issue IO.

As the implementation of StartReadBuffers() is also used by the functions for
reading individual blocks (StartReadBuffer() and through that
ReadBufferExtended()) this means all buffered read IO passes through the AIO
paths.  However, as those are synchronous reads, actually performing the IO
asynchronously would be rarely beneficial. Instead such IOs are flagged to
always be executed synchronously. This way we don't have to duplicate a fair
bit of code.

When io_method=sync is used, the IO patterns generated after this change are
the same as before, i.e. actual reads are only issued in WaitReadBuffers() and
StartReadBuffers() may issue prefetch requests.  This allows bypassing most of
the actual asynchronicity, which is important to make a change as big as this
less risky.

One thing worth calling out is that, if IO is actually executed
asynchronously, the precise meaning of what track_io_timing is measuring has
changed. Previously it tracked the time for each IO, but that does not make
sense when multiple IOs are executed concurrently. Now it only measures the
time actually spent waiting for IO. A subsequent commit will adjust the docs
for this.

While AIO is now actually used, the logic in read_stream.c will often prevent
using sufficiently many concurrent IOs. That will be addressed in the next
commit.

Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Co-authored-by: Andres Freund <andres@anarazel.de>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de
Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
2025-03-30 18:02:23 -04:00
Andres Freund
047cba7fa0 bufmgr: Implement AIO read support
This commit implements the infrastructure to perform asynchronous reads into
the buffer pool.

To do so, it:

- Adds readv AIO callbacks for shared and local buffers

  It may be worth calling out that shared buffer completions may be run in a
  different backend than where the IO started.

- Adds an AIO wait reference to BufferDesc, to allow backends to wait for
  in-progress asynchronous IOs

- Adapts StartBufferIO(), WaitIO(), TerminateBufferIO(), and their localbuf.c
  equivalents, to be able to deal with AIO

- Moves the code to handle BM_PIN_COUNT_WAITER into a helper function, as it
  now also needs to be called on IO completion

As of this commit, nothing issues AIO on shared/local buffers. A future commit
will update StartReadBuffers() to do so.

Buffer reads executed through this infrastructure will report invalid page /
checksum errors / warnings differently than before:

In the error case the error message will cover all the blocks that were
included in the read, rather than just reporting the first invalid
block. If more than one block is invalid, the error will include information
about the range of the read, the first invalid block and the number of invalid
pages, with a HINT towards the server log for per-block details.

For the warning case (i.e. zero_damaged_pages) we would previously emit one
warning message for each buffer in a multi-block read. Now there is only a
single warning message for the entire read, again referring to the server log
for more details in case of multiple checksum failures within a single larger
read.

Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de
Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
2025-03-30 17:28:03 -04:00
Andres Freund
ef64fe26ba aio: Add WARNING result status
If an IO succeeds, but issues a warning, e.g. due to a page verification
failure with zero_damaged_pages, we want to issue that warning in the context
of the issuer of the IO, not the process that executes the completion (always
the case for worker).

It's already possible for a completion callback to report a custom error
message; we just didn't have a result status that allowed a user of AIO to
know that a warning should be emitted even though the IO request succeeded.

All that's needed for that is a dedicated PGAIO_RS_ value.

Previously there were not enough bits in PgAioResult.id for the new
value, so increase it. While at it, add defines for the number of bits and
static asserts to check that the widths are appropriate.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250329212929.a6.nmisch@google.com
2025-03-30 16:27:10 -04:00
Andres Freund
d445990adc Let caller of PageIsVerified() control ignore_checksum_failure
For AIO the completion of a read into shared buffers (i.e. verifying the page
including the checksum, updating the BufferDesc to reflect the IO) can happen
in a different backend than the backend that started the IO. As
ignore_checksum_failure can differ between backends, we need to allow the
caller of PageIsVerified() control whether to ignore checksum failures.

The commit leaves a gap in the PIV_* values, as an upcoming commit, which
depends on this commit, will add PIV_LOG_LOG, which better fits just after
PIV_LOG_WARNING.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250329212929.a6.nmisch@google.com
2025-03-30 16:27:10 -04:00
Andres Freund
b96d3c3897 pgstat: Allow checksum errors to be reported in critical sections
For AIO we execute completion callbacks in critical sections (to ensure that
AIO can in the future be used for WAL, which in turn requires that we can call
completion callbacks in critical sections, to get the resources for WAL
io). To report checksum errors a backend now has to call
pgstat_prepare_report_checksum_failure(), before entering a critical section,
which guarantees the relevant pgstats entry is in shared memory, the relevant
DSM segment is mapped into the backend's memory and the address is known via a
PgStat_EntryRef.
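
The resulting calling pattern is roughly the following (a sketch only; the
exact signature of pgstat_prepare_report_checksum_failure() is assumed here
to take the database OID, and the surrounding code is illustrative):

    /* before entering the critical section: pin the per-database stats entry */
    pgstat_prepare_report_checksum_failure(dboid);

    START_CRIT_SECTION();
    /*
     * Run the completion callback; if page verification finds a checksum
     * failure here, it can be counted without allocating memory or mapping
     * a DSM segment, because the entry was prepared above.
     */
    END_CRIT_SECTION();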

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/wkjj4p2rmkevutkwc6tewoovdqznj6c6nvjmvii4oo5wmbh5sr@retq7d6uqs4j
2025-03-30 16:12:04 -04:00
Andres Freund
4244cf6876 Add errhint_internal()
We have errmsg_internal(), errdetail_internal(), but not errhint_internal().

Sometimes it is useful to output a hint with an already translated format
string (e.g. because there are different messages depending on the condition). For
message/detail we do that with the _internal() variants, but we can't do that
with hint today.  It's possible to work around that by using something
like
  str = psprintf(translated_format, args);
  ereport(...
          errhint("%s", str);
but that's not exactly pretty and makes it harder to avoid memory leaks.
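
With errhint_internal() the same thing can be written directly, e.g. (an
illustrative sketch, with made-up message text and variables):
  ereport(ERROR,
          errcode(ERRCODE_DATA_CORRUPTED),
          errmsg("invalid page in block %u of relation %s", blkno, relpath),
          /* format string is already translated; skip the gettext lookup */
          errhint_internal(translated_format, args));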

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/ym3dqpa4xcvoeknewcw63x77vnqdosbqcetjinb2zfoh65k55m@m4ozmwhr6lk6
2025-03-30 16:10:51 -04:00
Tomas Vondra
49b82522f1 Remove incidental md5() function use from test
Replace md5() with sha256() in tests introduced in 14ffaece0fb5, to
allow test to pass in OpenSSL FIPS mode.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3518736.1743307492@sss.pgh.pa.us
2025-03-30 13:22:39 +02:00
Andres Freund
d6d8054dc7 localbuf: Track pincount in BufferDesc as well
For AIO on temporary table buffers the AIO subsystem needs to be able to
ensure a pin on a buffer while AIO is going on, even if the IO issuing query
errors out. Tracking the buffer in LocalRefCount does not work, as it would
cause CheckForLocalBufferLeaks() to assert out.

Instead, also track the refcount in BufferDesc.state, not just
LocalRefCount. This also makes local buffers behave a bit more akin to shared
buffers.

Note that we still don't need locking: AIO completion callbacks for local
buffers are executed in the issuing session (i.e. nobody else has access to
the BufferDesc).

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
2025-03-29 16:36:51 -04:00
Andres Freund
08ccd56ac7 aio, bufmgr: Comment fixes/improvements
Some of these comments have been wrong for a while (12f3867f5534), some I
recently introduced (da7226993fd, 55b454d0e14). This includes an update to a
comment in FlushBuffer(), which will be copied in a future commit.

These changes seem big enough to be worth doing in separate commits.

Suggested-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250319212530.80.nmisch@google.com
2025-03-29 14:45:42 -04:00
Andres Freund
50cb7505b3 aio: Implement support for reads in smgr/md/fd
This implements the following:

1) An smgr AIO target, for AIO on smgr files. This should be usable not just
   for md.c but also other SMGR implementation if we ever get them.
2) readv support in fd.c, which requires a small bit of infrastructure work in
   fd.c
3) smgr.c and md.c support for readv

There still is nothing performing AIO, but as of this commit it would be
possible.

As part of this change FileGetRawDesc() actually ensures that the file is
opened - previously it was basically not usable. It's used to reopen a file in
IO workers.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de
Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
2025-03-29 13:38:35 -04:00
Andres Freund
dee8002468 Fix mis-attribution of checksum failure stats to the wrong database
Checksum failure stats could be attributed to the wrong database in two cases:

- when a read of a shared relation encountered a checksum error, it would be
  attributed to the current database, instead of the "database" representing
  shared relations

- when using CREATE DATABASE ... STRATEGY WAL_LOG checksum errors in the
  source database would be attributed to the current database

The checksum stats reporting via PageIsVerifiedExtended(PIV_REPORT_STAT) does
not have access to the information about what database a page belongs to.

This fixes the issue by removing PIV_REPORT_STAT and delegating the
responsibility to report stats to the caller, which now can learn about the
number of checksum failures via a new optional argument.

As this changes the signature of PageIsVerifiedExtended() and all callers
should adapt to the new signature, use the occasion to rename the function to
PageIsVerified() and remove the compatibility macro.

We could instead have fixed this by adding information about the database to
the args of PageIsVerified(), but there are soon-to-be-applied patches that
need to separate the stats reporting from the PageIsVerified() call
anyway. Those patches also include testing for the failure paths, something we
inexplicably have not had.

As there is no caller of pgstat_report_checksum_failure() left, remove it.

It'd be possible, but awkward, to fix this in the back branches. We considered
the work not quite worth it, as mis-attributed stats should still elicit
concern. The emitted error messages do allow attributing the errors
correctly.

Discussion: https://postgr.es/m/5tyic6epvdlmd6eddgelv47syg2b5cpwffjam54axp25xyq2ga@ptwkinxqo3az
Discussion: https://postgr.es/m/mglpvvbhighzuwudjxzu4br65qqcxsnyvio3nl4fbog3qknwhg@e4gt7npsohuz
2025-03-29 13:38:35 -04:00
Tomas Vondra
68f97aeadb amcheck: Add a GIN index to the CREATE INDEX CONCURRENTLY tests
The existing CREATE INDEX CONCURRENTLY tests check only B-Tree, but
can be cheaply extended to also check GIN. This helps increase test
coverage for GIN amcheck, especially related to handling concurrent page
splits and posting list trees.

This already helped to identify several issues during development of the
GIN amcheck support.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-By: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/BC221A56-977C-418E-A1B8-9EFC881D80C5%40enterprisedb.com
2025-03-29 16:47:44 +01:00
Tomas Vondra
ca738bdc4c amcheck: Add a test with GIN index on JSONB data
Extend the existing test of GIN checks to also include an index on JSONB
data, using the jsonb_path_ops opclass. This is a common enough usage of
GIN that it makes sense to have better test coverage for it.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-By: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/BC221A56-977C-418E-A1B8-9EFC881D80C5%40enterprisedb.com
2025-03-29 16:47:44 +01:00
Tomas Vondra
ec4327d106 amcheck: Fix indentation in verify_gin.c
I forgot to reindent the code after a couple last-minute adjustments
just before committing 14ffaece0fb53fed8ddbc46d2b353e1c4834863a.

Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
2025-03-29 16:47:44 +01:00
Andres Freund
116e851db5 Fix "‘static’ is not at beginning of declaration" warning
b98be8a2a2a used "const static" instead of "static const". We normally use the
latter form.

Discussion: https://postgr.es/m/z4mc2hzecahyq3paupfsouhuupmzmgum45md3k5my6bmo7gvn7@z5j26doqamqy
2025-03-29 10:48:59 -04:00
Tomas Vondra
14ffaece0f amcheck: Add gin_index_check() to verify GIN index
Adds a new function, validating two kinds of invariants on a GIN index:

- parent-child consistency: Paths in a GIN graph have to contain
  consistent keys. Tuples on parent pages consistently include tuples
  from child pages; parent tuples do not require any adjustments.

- balanced-tree / graph: Each internal page has at least one downlink,
  and can reference either only leaf pages or only internal pages.
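
As a hedged usage sketch (table and index names are hypothetical, and it is
assumed the new function takes the index as a regclass, like bt_index_check()):

CREATE EXTENSION IF NOT EXISTS amcheck;
CREATE TABLE docs (tags text[]);
CREATE INDEX docs_tags_gin ON docs USING gin (tags);
SELECT gin_index_check('docs_tags_gin');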

The GIN verification is based on work by Grigory Kryachko, reworked by
Heikki Linnakangas and with various improvements by Andrey Borodin.
Investigation and fixes for multiple bugs by Kirill Reshke.

Author: Grigory Kryachko <GSKryachko@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-By: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-By: Mark Dilger <mark.dilger@enterprisedb.com>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
2025-03-29 15:44:29 +01:00
Peter Eisentraut
53a2a1564a pgbench: Make set_random_seed() 64-bit everywhere.
Delete an intermediate variable, a redundant cast, a use of long and a
use of long long.  scanf() the seed directly into a uint64, now that we
can do that with SCNu64 from <inttypes.h>.

The previous coding was from pre-C99 times when %lld might not have been
there, so it read into an unsigned long.  Therefore behavior varied
by OS, and --random-seed would accept either 32 or 64 bit seeds.  Now
it's the same everywhere.

Author: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/b936d2fb-590d-49c3-a615-92c3a88c6c19%40eisentraut.org
2025-03-29 15:24:42 +01:00
Tomas Vondra
d70b17636d amcheck: Move common routines into a separate module
Before performing checks on an index, we need to take some safety
measures that apply to all index AMs. This includes:

* verifying that the index can be checked - Only selected AMs are
supported by amcheck (right now only B-Tree). The index has to be
valid and not a temporary index from another session.

* changing (and then restoring) user's security context

* obtaining proper locks on the index (and table, if needed)

* discarding GUC changes from the index functions

Until now this was implemented in the B-Tree amcheck module, but it's
something every AM will have to do. So relocate the code into a new
module verify_common for reuse.

The shared steps are implemented by amcheck_lock_relation_and_check(),
receiving the AM-specific verification as a callback. Custom parameters
may be supplied using a pointer.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Reviewed-By: Mark Dilger <mark.dilger@enterprisedb.com>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
2025-03-29 15:14:49 +01:00
Tomas Vondra
fb9dff7663 Fix grammar in GIN README
Author: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/CALdSSPgu9uAhVYojQ0yjG%3Dq5MaqmiSLUJPhz%2B-u7cA6K6Mc9UA%40mail.gmail.com
2025-03-29 15:14:25 +01:00
Dean Rasheed
8b6a0e2392 Fix MERGE with DO NOTHING actions into a partitioned table.
ExecInitPartitionInfo() duplicates much of the logic in
ExecInitMerge(), except that it failed to handle DO NOTHING
actions. This would cause an "unknown action in MERGE WHEN clause"
error if a MERGE with any DO NOTHING actions attempted to insert into
a partition not already initialised by ExecInitModifyTable().
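
A minimal sketch of the kind of statement that could hit this path (table
names are hypothetical, not taken from the actual report):

CREATE TABLE tgt (id int, val text) PARTITION BY LIST (id);
CREATE TABLE tgt_p1 PARTITION OF tgt FOR VALUES IN (1);
CREATE TABLE src (id int, val text);
INSERT INTO src VALUES (1, 'one');

MERGE INTO tgt t USING src s ON t.id = s.id
  WHEN MATCHED THEN DO NOTHING
  WHEN NOT MATCHED THEN INSERT (id, val) VALUES (s.id, s.val);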

Bug: #18871
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Gurjeet Singh <gurjeet@singh.im>
Discussion: https://postgr.es/m/18871-b44e3c96de3bd2e8%40postgresql.org
Backpatch-through: 15
2025-03-29 09:58:40 +00:00
Peter Eisentraut
a0ed19e0a9 Use PRI?64 instead of "ll?" in format strings (continued).
Continuation of work started in commit 15a79c73, after initial trial.

Author: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/b936d2fb-590d-49c3-a615-92c3a88c6c19%40eisentraut.org
2025-03-29 10:43:57 +01:00
Jeff Davis
a0a4601765 Matview statistics depend on matview data.
REFRESH MATERIALIZED VIEW replaces the storage, which resets
statistics, so statistics must be restored afterward.

If both statistics and data are being dumped for a materialized view,
add a dependency from the former to the latter. Defer the statistics
to SECTION_POST_DATA, and use RESTORE_PASS_POST_ACL.

Reported-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAExHW5s47kmubpbbRJzSM-Zfe0Tj2O3GBagB7YAyE8rQ-V24Uw@mail.gmail.com
2025-03-28 16:12:55 -07:00
Alexander Korotkov
775a06d44c Make group_similar_or_args() reorder clause list as little as possible
Currently, group_similar_or_args() permutes the original positions of clauses
regardless of whether it manages to find any groups of similar clauses.
While we do not provide any strict guarantees about preserving the original
order of OR clauses, it is preferable that the original order be modified as
little as possible.

This commit changes the reordering algorithm of group_similar_or_args() in
the following way.  We reorder each group of similar clauses so that the
first item of the group stays in place, but all the other items are moved
after it.  So, if there are no similar clauses, the order of clauses stays
the same.  When there are some groups, only required reordering happens while
the rest of the clauses remain in their places.

Reported-by: Andrei Lepikhov <lepihov@gmail.com>
Discussion: https://postgr.es/m/3ac7c436-81e1-4191-9caf-b0dd70b51511%40gmail.com
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Alena Rybakina <a.rybakina@postgrespro.ru>
2025-03-28 23:37:49 +02:00
Nathan Bossart
519338ace4 Optimize popcount functions with ARM SVE intrinsics.
This commit introduces SVE implementations of pg_popcount{32,64}.
Unlike the Neon versions, we need an additional configure-time
check to determine if the compiler supports SVE intrinsics, and we
need a runtime check to determine if the current CPU supports SVE
instructions.  Our testing showed that the SVE implementations are
much faster for larger inputs and are comparable to the status
quo for smaller inputs.

Author: "Devanga.Susmitha@fujitsu.com" <Devanga.Susmitha@fujitsu.com>
Co-authored-by: "Chiranmoy.Bhattacharya@fujitsu.com" <Chiranmoy.Bhattacharya@fujitsu.com>
Co-authored-by: "Malladi, Rama" <ramamalladi@hotmail.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/010101936e4aaa70-b474ab9e-b9ce-474d-a3ba-a3dc223d295c-000000%40us-west-2.amazonses.com
Discussion: https://postgr.es/m/OSZPR01MB84990A9A02A3515C6E85A65B8B2A2%40OSZPR01MB8499.jpnprd01.prod.outlook.com
2025-03-28 16:20:20 -05:00
Peter Eisentraut
3c8e463b0d Revert "Tidy up locale thread safety in ECPG library."
This reverts commit 8e993bff5326b00ced137c837fce7cd1e0ecae14.

It causes various build failures on the buildfarm, to be investigated.

Discussion: https://postgr.es/m/CWZBBRR6YA8D.8EHMDRGLCKCD%40neon.tech
2025-03-28 21:27:37 +01:00
Nathan Bossart
6be53c2767 Optimize popcount functions with ARM Neon intrinsics.
This commit introduces Neon implementations of pg_popcount{32,64},
pg_popcount(), and pg_popcount_masked().  As in simd.h, we assume
that all available AArch64 hardware supports Neon, so we don't need
any new configure-time or runtime checks.  Some compilers already
emit Neon instructions for these functions, but our hand-rolled
implementations for pg_popcount() and pg_popcount_masked()
performed better in testing, likely due to better instruction-level
parallelism.

Author: "Chiranmoy.Bhattacharya@fujitsu.com" <Chiranmoy.Bhattacharya@fujitsu.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/010101936e4aaa70-b474ab9e-b9ce-474d-a3ba-a3dc223d295c-000000%40us-west-2.amazonses.com
2025-03-28 14:49:35 -05:00
Heikki Linnakangas
51a0382e8d Fix crash if LockErrorCleanup() is called twice
The refactoring in commit 3c0fd64fec removed the clearing of
awaitedLock from LockErrorCleanup(). It's still needed, otherwise
LockErrorCleanup() during abort processing will try to update the
LOCALLOCK struct even after the lock has already been released. Put it
back.

Reported-by: Richard Guo <guofenglinux@gmail.com>
Reported-by: Robins Tharakan <tharakan@gmail.com>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAMbWs4_dNX1SzBmvFdoY-LxJh_4W_BjtVd5i008ihfU-wFF=eg@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/18832-38e5575b1bbd7277@postgresql.org
Discussion: https://www.postgresql.org/message-id/e11a30e5-c0d8-491d-8546-3a1b50c10ad4@gmail.com
2025-03-28 20:19:17 +02:00
Nathan Bossart
9ac6f7e7ce Rename TRY_POPCNT_FAST to TRY_POPCNT_X86_64.
This macro protects x86_64-specific code, and a subsequent commit
will introduce AArch64-specific versions of that code.  To prevent
confusion, let's rename it to clearly indicate that it's for
x86_64.  We should likely move this code to its own file (perhaps
merging it with the AVX-512 popcount code), but that is left as a
future exercise.

Reviewed-by: "Chiranmoy.Bhattacharya@fujitsu.com" <Chiranmoy.Bhattacharya@fujitsu.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/010101936e4aaa70-b474ab9e-b9ce-474d-a3ba-a3dc223d295c-000000%40us-west-2.amazonses.com
2025-03-28 12:27:47 -05:00
Masahiko Sawada
a5419bc72e Fix timestamp overflow in UUIDv7 implementation.
The uuidv7_interval() function previously converted a shifted
microsecond-precision timestamp (64-bit integer) to another 64-bit
integer representing a timestamp with nanosecond precision. This
conversion caused overflow for dates beyond the year 2262. The
millisecond and sub-millisecond parts were then extracted from this
nanosecond-precision timestamp and stored in UUIDv7 values.

With this commit, the millisecond and sub-millisecond parts are stored
directly into the UUIDv7 value without being converted back to a
nanosecond precision timestamp. Following RFC 9562, the timestamp is
stored as an unsigned integer, enabling support for dates up to the
year 10889.
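
As a hedged SQL-level illustration (assuming the interval-accepting uuidv7()
variant maps to this code path):

SELECT uuidv7();
SELECT uuidv7(interval '8000 years');  -- shift well past 2262, previously at risk of overflow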

Reported and fixed by Andrey Borodin, with cosmetic changes and
regression tests by me.

Reported-by: Andrey Borodin <x4mmm@yandex-team.ru>
Author: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/96DEC2D9-659A-40E8-B7BA-AF5D162A9E21@yandex-team.ru
2025-03-28 09:39:11 -07:00
Peter Eisentraut
8e993bff53 Tidy up locale thread safety in ECPG library.
Remove setlocale() and _configthreadlocal() as fallback strategy on
systems that don't have uselocale(), where ECPG tries to control
LC_NUMERIC formatting on input and output of floating point numbers.  It
was probably broken on some systems (NetBSD), and the code was also
quite messy and complicated, with obsolete configure tests (Windows).
It was also arguably broken, or at least had unstated environmental
requirements, if pgtypeslib code was called directly.

Instead, introduce PG_C_LOCALE to refer to the "C" locale as a locale_t
value.  It maps to the special constant LC_C_LOCALE when defined by libc
(macOS, NetBSD), or otherwise uses a process-lifetime locale_t that is
allocated on first use, just as ECPG previously did itself.  The new
replacement might be more widely useful.  Then change the float parsing
and printing code to pass that to _l() functions where appropriate.

Unfortunately the portability of those functions is a bit complicated.
First, many obvious and useful _l() functions are missing from POSIX,
though most standard libraries define some of them anyway.  Second,
although the thread-safe save/restore technique can be used to replace
the missing ones, Windows and NetBSD refused to implement standard
uselocale().  They might have a point: "wide scope" uselocale() is hard
to combine with other code and error-prone, especially in library code.
Luckily they have the  _l() functions we want so far anyway.  So we have
to be prepared for both ways of doing things:

1.  In ECPG, use strtod_l() for parsing, and supply a port.h replacement
using uselocale() over a limited scope if missing.

2.  Inside our own snprintf.c, use three different approaches to format
floats.  For frontend code, call libc's snprintf_l(), or wrap libc's
snprintf() in uselocale() if it's missing.  For backend code, snprintf.c
can keep assuming that the global locale's LC_NUMERIC is "C" and call
libc's snprintf() without change, for now.

(It might eventually be possible to call our in-tree Ryū routines to
display floats in snprintf.c, given the C-locale-always remit of our
in-tree snprintf(), but this patch doesn't risk changing anything that
complicated.)

Author: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Tristan Partin <tristan@partin.io>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/CWZBBRR6YA8D.8EHMDRGLCKCD%40neon.tech
2025-03-28 16:18:36 +01:00
Peter Eisentraut
2247281c47 Cast result of i64abs() back to int64
Without the cast, the return type could be long or long long,
depending on what int64 is underneath.  This doesn't affect code
correctness, but it could result in format-mismatch warnings when
attempting to printf such values using PRId64.

Reported-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+hUKGJc4s+Wyb3EFOQNN9VVK+Qv40r2LK41o9PkS9ThxviTvQ@mail.gmail.com
2025-03-28 14:34:57 +01:00
Robert Haas
83ccc85859 pg_overexplain: Use PG_MODULE_MAGIC_EXT.
I committed this contrib module just after Tom committed
55527368bd07248e91e3d37a782bf66b76f06865; adjust it to match.

Author: Man Zeng <zengman@halodbtech.com>
Discussion: http://postgr.es/m/174313513707.60295.16516085012903412705.pgcf@coridan.postgresql.org
2025-03-28 09:16:29 -04:00
Robert Haas
9f0c36aea0 pg_overexplain: Call previous hooks as appropriate.
It makes no sense to remember the previous values of the hook variables
and then never bother calling those functions. Thanks to Andrei for
spotting my goof.

Author: Andrei Lepikhov <lepihov@gmail.com>
Discussion: http://postgr.es/m/41a344e3-ffb1-4296-8ba7-801f1e9642e5@gmail.com
2025-03-28 09:02:37 -04:00
Peter Eisentraut
cdc168ad4b Add support for not-null constraints on virtual generated columns
This was left out of the original patch for virtual generated columns
(commit 83ea6c54025).

This just involves a bit of extra work in the executor to expand the
generation expressions and run a "IS NOT NULL" test against them.

There is also a bit of work to make sure that not-null constraints are
checked during a table rewrite.
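
A minimal sketch of the newly supported combination (table and column names
are hypothetical):

CREATE TABLE m (
    raw     numeric,
    doubled numeric GENERATED ALWAYS AS (raw * 2) VIRTUAL NOT NULL
);
INSERT INTO m (raw) VALUES (1.5);   -- ok
INSERT INTO m (raw) VALUES (NULL);  -- expected to fail the not-null check on "doubled"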

Author: jian he <jian.universality@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Navneet Kumar <thanit3111@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CACJufxHArQysbDkWFmvK+D1TPHQWWTxWN15cMuUaTYX3xhQXgg@mail.gmail.com
2025-03-28 13:53:37 +01:00
Peter Eisentraut
747ddd38cb Modernize some code a bit
Modernize code in ExecRelCheck() and ExecConstraints() a bit,
preparing the way for some new code.

Co-authored-by: jian he <jian.universality@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Navneet Kumar <thanit3111@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CACJufxHArQysbDkWFmvK+D1TPHQWWTxWN15cMuUaTYX3xhQXgg@mail.gmail.com
2025-03-28 10:49:15 +01:00
Peter Eisentraut
9a9ead1105 Rename a node field for clarity
Rename ResultRelInfo.ri_ConstraintExprs to ri_CheckConstraintExprs.
This reflects its specific purpose better and avoids confusion with
adjacent fields with similar but distinct purposes.

Discussion: https://postgr.es/m/CACJufxHArQysbDkWFmvK+D1TPHQWWTxWN15cMuUaTYX3xhQXgg@mail.gmail.com
2025-03-28 09:50:01 +01:00
Amit Kapila
fb2ea12f42 pg_createsubscriber: Add '--all' option.
The '--all' option indicates that the tool queries the source server
(publisher) for all databases and creates subscriptions on the target
server (subscriber) for databases with matching names. Without this user
needs to explicitly specify all databases by using -d option for each
database.

This simplifies converting a physical standby to a logical subscriber,
particularly during upgrades.

The options '--database', '--publication', '--subscription', and
'--replication-slot' cannot be used when '--all' is specified.

Author: Shubham Khanna <khannashubham1197@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Discussion: https://postgr.es/m/CAHv8RjKhA=_h5vAbozzJ1Opnv=KXYQHQ-fJyaMfqfRqPpnC2bA@mail.gmail.com
2025-03-28 12:26:39 +05:30
Peter Eisentraut
890fc826c9 Use thread-safe strftime_l() instead of strftime().
This removes some setlocale() calls and a lot of commentary about how
dangerous that is.  strftime_l() is from POSIX 2008, and on Windows we
use _wcsftime_l().

While here, adjust error message for strftime_l() failure: it does not
in practice set errno (even though POSIX says it could), so no %m.

Author: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/CA%2BhUKGJqVe0%2BPv9dvC9dSums_PXxGo9SWcxYAMBguWJUGbWz-A%40mail.gmail.com
2025-03-28 07:13:43 +01:00
Amit Kapila
474d7a1fd8 Stabilize tests added in 3abe9dc188.
The problem is that after the ALTER SUBSCRIPTION tap_sub SET PUBLICATION
command, we didn't wait for the new walsender to start on the publisher.
Immediately after the ALTER, we performed an INSERT and expected it to
replicate. However, replication could start from a point after the INSERT
location, and as the subscription isn't copying initial data, we could miss
such an INSERT.

The fix is to wait for the connection to be established between publisher and
subscriber before starting DML operations that are expected to replicate.

As per CI.

Reported-by: Andres Freund <andres@anarazel.de>
Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/CALDaNm2ms1deM5EYNLFEfESv_Kw=Y4AiTB0LP=qGS-UpFwGbPg@mail.gmail.com
2025-03-28 11:03:05 +05:30
Daniel Gustafsson
058b5152f0 Fix guc_malloc calls for consistency and OOM checks
check_createrole_self_grant and check_synchronized_standby_slots
were allocating memory on a LOG elevel without checking if the
allocation succeeded or not, which would have led to a segfault
on allocation failure.

On top of that, a number of callsites were using the ERROR level,
relying on erroring out rather than returning false to let the
GUC machinery handle it gracefully.  Other callsites used WARNING
instead of LOG.  While neither is wrong as such, this changes all
check_ functions to do it consistently with LOG.

init_custom_variable gets its elevel promoted to FATAL to keep
the guc_malloc error handling in line with the rest of the
error handling in that function, which already calls FATAL.  If
we encounter an OOM at this callsite there is no graceful
handling to be had; better to error out hard.

Backpatch the fix to check_createrole_self_grant down to v16
and the fix to check_synchronized_standby_slots down to v17
where they were introduced.

Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Nikita <pm91.arapov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Bug: #18845
Discussion: https://postgr.es/m/18845-582c6e10247377ec@postgresql.org
Backpatch-through: 16
2025-03-27 22:57:34 +01:00
Melanie Plageman
043799fa08 Use streaming read I/O in heap amcheck
Instead of directly invoking ReadBuffer() for each unskippable block in
the heap relation, verify_heapam() now uses the read stream API to
acquire the next buffer to check for corruption.

Author: Matheus Alcantara <matheusssilv97@gmail.com>
Co-authored-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/flat/CAFY6G8eLyz7%2BsccegZYFj%3D5tAUR-GZ9uEq4Ch5gvwKqUwb_hCA%40mail.gmail.com
2025-03-27 14:04:14 -04:00
Tom Lane
4623d71443 Prevent assertion failure in contrib/pg_freespacemap.
Applying pg_freespacemap() to a relation lacking storage (such as a
view) caused an assertion failure, although there was no ill effect
in non-assert builds.  Add an error check for that case.
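
A hedged illustration of the now-handled case (object names are hypothetical):

CREATE EXTENSION IF NOT EXISTS pg_freespacemap;
CREATE VIEW v AS SELECT 1 AS x;
SELECT * FROM pg_freespacemap('v');  -- now an error, rather than an assertion failure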

Bug: #18866
Reported-by: Robins Tharakan <tharakan@gmail.com>
Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Discussion: https://postgr.es/m/18866-d68926d0f1c72d44@postgresql.org
Backpatch-through: 13
2025-03-27 13:20:23 -04:00
Tom Lane
d66997dfe8 Avoid mixing designated and non-designated field initializers.
As revised by commit 9324c8c58, PG_MODULE_MAGIC constructed a
struct initializer containing both designated fields and a
non-designated "0".  That's okay in C, but not in C++, with
the result that extensions written in C++ failed to compile.
Change it to use only designated field initializers.

Author: Yurii Rashkovskii <yrashk@omnigres.com>
Discussion: https://postgr.es/m/CAG=VW14mctsR543gpzLCuJ9JgJqwa=ptmBfGvxEjs+k8Jf7-Bg@mail.gmail.com
2025-03-27 11:06:30 -04:00
Daniel Gustafsson
0f3604a518 psql: Fix incorrect equality comparison
Commit 1a759c83278 contained an incorrect equality comparison
which was discovered by Coverity.

Reported-by: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://postgr.es/m/CAEudQApfAWzLo+oSuy2byXktdr7R8KJC_ACT5VV8fontrL35Pw@mail.gmail.com
2025-03-27 14:09:25 +01:00
Robert Haas
081ec08e6a pg_overexplain: Filter out actual row count from test result.
Per buildfarm, these are not stable. In particular, 1/8 is sometimes
0.12 and sometimes 0.13.
2025-03-27 09:00:46 -04:00
Álvaro Herrera
9fbd53dea5 Remove the query_id_squash_values GUC
Commit 62d712ecfd94 introduced the capability to calculate the same
queryId for queries with different lengths of constants in a list for an
IN clause.  This behavior was originally enabled with a GUC
query_id_squash_values.  After a discussion about the value of such a
GUC, it was decided to back out of the use of a GUC and make the
squashing behavior the only available option.
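
Illustrative only (table name is hypothetical): with squashing always enabled,
both statements below are expected to collapse into a single
pg_stat_statements entry despite their different IN-list lengths.

SELECT * FROM orders WHERE id IN (1, 2, 3);
SELECT * FROM orders WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8);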

Author: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/Z-LZyygkkNyA8-kR@msg.df7cb.de
Discussion: https://postgr.es/m/CA+q6zcVTK-3C-8NWV1oY2NZrvtnMCDqnyYYyk1T7WMUG65MeOQ@mail.gmail.com
2025-03-27 13:33:37 +01:00
Peter Eisentraut
5d5f415816 Expand test a bit
Make pg_constraint output in inherit test show the convalidated column
as well.  This shows the interaction between convalidated and
conenforced.

This is extracted from a larger patch so that this reformatting isn't
distracting there.

Author: Amul Sul <amul.sul@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
2025-03-27 12:11:15 +01:00
Peter Eisentraut
b98be8a2a2 Provide thread-safe pg_localeconv_r().
This involves four different implementation strategies:

1.  For Windows, we now require _configthreadlocale() to be available
and work (commit f1da075d9a0), and the documentation says that the
object returned by localeconv() is in thread-local memory.

2.  For glibc, we translate to nl_langinfo_l() calls, because it
offers the same information that way as an extension, and that API is
thread-safe.

3.  For macOS/*BSD, use localeconv_l(), which is thread-safe.

4.  For everything else, use uselocale() to set the locale for the
thread, and use a big ugly lock to defend against the returned object
being concurrently clobbered.  In practice this currently means only
Solaris.

The new call is used in pg_locale.c, replacing calls to setlocale() and
localeconv().

Author: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/CA%2BhUKGJqVe0%2BPv9dvC9dSums_PXxGo9SWcxYAMBguWJUGbWz-A%40mail.gmail.com
2025-03-27 10:54:28 +01:00
Álvaro Herrera
4a02af8b1a Simplify syntax for ALTER TABLE ALTER CONSTRAINT NO INHERIT
Commit d45597f72fe5 introduced the ability to change a not-null
constraint from NO INHERIT to INHERIT and vice versa, but we included
the SET noise word in the syntax for it.  The SET turns out not to be
necessary and goes against what the SQL standard says for other ALTER
TABLE subcommands, so remove it.
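
Illustrative before/after of the accepted syntax (table and constraint names
are hypothetical):

-- previously:  ALTER TABLE t ALTER CONSTRAINT t_a_not_null SET NO INHERIT;
ALTER TABLE t ALTER CONSTRAINT t_a_not_null NO INHERIT;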

This changes the way this command is processed for constraint types
other than not-null, so there are some error message changes.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Suraj Kharage <suraj.kharage@enterprisedb.com>
Discussion: https://postgr.es/m/202503251602.vsxaehsyaoac@alvherre.pgsql
2025-03-27 09:24:52 +01:00
Michael Paquier
72c2f36d57 libpq: Add TAP tests for service files and names
This commit adds a set of regression tests that checks various patterns
with service names and service files, with:
- Service file with no contents, used as default for PGSERVICEFILE to
prevent any lookups at the HOME directory of an environment where the
test is run.
- Service file with valid service name and its section.
- Service file at the root of PGSYSCONFDIR, named pg_service.conf.
- Missing service file.
- Service name defined as a connection parameter or as PGSERVICE.

Note that PGSYSCONFDIR is set to always point at a temporary directory
created by the test, so that we never try to look at SYSCONFDIR.

This set of tests has come up as a useful independent addition while
discussing a patch that adds an equivalent of PGSERVICEFILE as a
connection parameter as there have never been any tests for service
files and service names.  Torsten Foertsch and Ryo Kanbayashi have
provided a basic implementation, that I have expanded to what is
introduced in this commit.

Author: Torsten Foertsch <tfoertsch123@gmail.com>
Author: Ryo Kanbayashi <kanbayashi.dev@gmail.com>
Author: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAKkG4_nCjx3a_F3gyXHSPWxD8Sd8URaM89wey7fG_9g7KBkOCQ@mail.gmail.com
2025-03-27 16:01:38 +09:00
David Rowley
ad9a23bc4f Optimize Query jumble
f31aad9b0 adjusted query jumbling so it no longer ignores NULL nodes
during the jumble.  This added some overhead.  Here we tune a few
things to make jumbling faster again.  This makes jumbling perform
similar or even slightly faster than prior to that change.

Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAApHDvreP04nhTKuYsPw0F-YN+4nr4f=L72SPeFb81jfv+2c7w@mail.gmail.com
2025-03-27 18:34:34 +13:00
David Rowley
f31aad9b07 Fix query jumbling to account for NULL nodes
Previously, NULL nodes were ignored.  This could cause issues where the
computed query ID could match for queries in which, of two fields next to
each other in their Node struct, one field was NULL and the other
non-NULL.  For example, the Query struct has distinctClause and sortClause
next to each other.  If someone wrote:

SELECT DISTINCT c1 FROM t;

and then:

SELECT c1 FROM t ORDER BY c1;

these would produce the same query ID since, in the first query, we
ignored the NULL sortClause and appended the jumble bytes for the
distinctClause.  In the latter query, we did nothing for the NULL
distinctClause and then jumbled the non-NULL sortClause; since the node
representation stored is the same in both cases, the query IDs were
identical.

Here we fix this by always accounting for NULL nodes, recording that
we saw a NULL in the jumble buffer.  This fixes the issue because the position
at which the NULL is recorded differs between the above two queries.

Author: Bykov Ivan <i.bykov@modernsys.ru>
Author: Michael Paquier <michael@paquier.xyz>
Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/aafce7966e234372b2ba876c0193f1e9%40localhost.localdomain
2025-03-27 18:23:00 +13:00
Michael Paquier
44fe6ceb51 doc: Correct description of values used in FSM for indexes
The implementation of FSM for indexes is simpler than heap, where 0 is
used to track if a page is in-use and (BLCKSZ - 1) if a page is free.
One comment in indexfsm.c and one description in the documentation of
pg_freespacemap were incorrect about that.

Author: Alex Friedman <alexf01@gmail.com>
Discussion: https://postgr.es/m/71eef655-c192-453f-ac45-2772fec2cb04@gmail.com
Backpatch-through: 13
2025-03-27 10:20:41 +09:00
Andres Freund
c325a7633f aio: Add io_method=io_uring
Performing AIO using io_uring can be considerably faster than
io_method=worker, particularly when lots of small IOs are issued, as
a) the context-switch overhead for worker based AIO becomes more significant
b) the number of IO workers can become limiting

io_uring, however, is linux specific and requires an additional compile-time
dependency (liburing).

This implementation is fairly simple and there are substantial optimization
opportunities.

The description of the existing AIO_IO_COMPLETION wait event is updated to
make the difference between it and the new AIO_IO_URING_EXECUTION clearer.

Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de
Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
2025-03-26 19:49:13 -04:00
Andres Freund
8eadd5c73c aio: Add liburing dependency
Will be used in a subsequent commit, to implement io_method=io_uring. Kept
separate for easier review.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
2025-03-26 19:45:32 -04:00
Michael Paquier
f056f75daf doc: Mention possible ephemeral discrepancies in pg_stat_activity
Ephemeral inconsistencies across multiple attributes of pg_stat_activity
can exist as the system is designed to be efficient with a low overhead.
This question is raised by users from time to time based on the data
read in the view, so let's add a note in the docs about this
possibility.

Author: Alex Friedman <alexf01@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/8a275154-a654-44b0-ab37-197802f04c7b@gmail.com
2025-03-27 08:07:54 +09:00
Andres Freund
9469d7fdd2 aio: Rename pgaio_io_prep_* to pgaio_io_start_*
The old naming pattern (mirroring liburing's naming) was inconsistent with
the (not yet introduced) callers. It seems better to get rid of the
inconsistency now than to grow more users of the odd naming.

Reported-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250326001915.bc.nmisch@google.com
2025-03-26 16:10:29 -04:00
Andres Freund
f321ec237a aio: Pass result of local callbacks to ->report_return
Otherwise the results of e.g. temp table buffer verification errors will not
reach bufmgr.c. Obviously that's not right. Found while expanding the tests
for invalid buffer contents.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250326001915.bc.nmisch@google.com
2025-03-26 16:06:54 -04:00
Andres Freund
96da9050a5 aio: Be more paranoid about interrupts
As reported by Noah, it's possible, although practically very unlikely, that
interrupts could be processed in between pgaio_io_reopen() and
pgaio_io_perform_synchronously(). Prevent that by explicitly holding
interrupts.

It also seems good to add an assertion to pgaio_io_before_prep() to ensure
that interrupts are held, as otherwise FDs referenced by the IO could be
closed during interrupt processing. All code in the aio series currently runs
the code with interrupts held, but it seems better to be paranoid.

Reviewed-by: Noah Misch <noah@leadboat.com>
Reported-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250324002939.5c.nmisch@google.com
2025-03-26 16:06:54 -04:00
Robert Haas
47a1f076a7 pg_overexplain: SET jit=off when running tests.
Per buildfarm.
2025-03-26 15:43:25 -04:00
Robert Haas
de65c4dade Fix oversights in commit 8d5ceb113e3f7ddb627bd40b26438a9d2fa05512
It added bogus whitespace at the end of a line in the documentation.
It should not have done that.

The pg_overexplain tests must SET debug_parallel_query = false,
not just RESET debug_parallel_query, or we get failures on test
machines that make debug_parallel_query = true the default.
2025-03-26 14:22:45 -04:00
Robert Haas
8d5ceb113e pg_overexplain: Additional EXPLAIN options for debugging.
There's a fair amount of information in the Plan and PlanState trees
that isn't printed by any existing EXPLAIN option. This means that,
when working on the planner, it's often necessary to rely on facilities
such as debug_print_plan, which produce excessively voluminous
output. Hence, use the new EXPLAIN extension facilities to implement
EXPLAIN (DEBUG) and EXPLAIN (RANGE_TABLE) as extensions to the core
EXPLAIN facility.

A great deal more could be done here, and the specific choices about
what to print and how are definitely arguable, but this is at least
a starting point for discussion and a jumping-off point for possible
future improvements.

Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> (who didn't like it)
Discussion: http://postgr.es/m/CA+TgmoZfvQUBWQ2P8iO30jywhfEAKyNzMZSR+uc2xr9PZBw6eQ@mail.gmail.com
2025-03-26 13:52:21 -04:00
Tomas Vondra
818245506c Keep the decompressed filter in brin_bloom_union
The brin_bloom_union() function combines two BRIN summaries, by merging
one filter into the other. With bloom, we have to decompress the filters
first, but the function failed to update the summary to store the merged
filter. As a consequence, the index may be missing some of the data, and
return false negatives.

This issue exists since BRIN bloom indexes were introduced in Postgres
14, but at that point the union function was called only when two
sessions happened to summarize a range concurrently, which is rare. It
got much easier to hit in 17, as parallel builds use the union function
to merge summaries built by workers.

Fixed by storing a pointer to the decompressed filter, and freeing the
original one. Free the second filter too, if it was decompressed. The
freeing is not strictly necessary, because the union is called in
short-lived contexts, but it's tidy.

Backpatch to 14, where BRIN bloom indexes were introduced.

Reported by Arseniy Mukhin, investigation and fix by me.

Reported-by: Arseniy Mukhin
Discussion: https://postgr.es/m/18855-1cf1c8bcc22150e6%40postgresql.org
Backpatch-through: 14
2025-03-26 17:01:41 +01:00
Tom Lane
55527368bd Use PG_MODULE_MAGIC_EXT in our installable shared libraries.
It seems potentially useful to label our shared libraries with version
information, now that a facility exists for retrieving that.  This
patch labels them with the PG_VERSION string.  There was some
discussion about using semantic versioning conventions, but that
doesn't seem terribly helpful for modules with no SQL-level presence;
and for those that do have SQL objects, we typically expect them
to support multiple revisions of the SQL definitions, so it'd still
not be very helpful.

I did not label any of src/test/modules/.  It seems unnecessary since
we don't install those, and besides there ought to be someplace that
still provides test coverage for the original PG_MODULE_MAGIC macro.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/dd4d1b59-d0fe-49d5-b28f-1e463b68fa32@gmail.com
2025-03-26 11:11:02 -04:00
Tom Lane
9324c8c580 Introduce PG_MODULE_MAGIC_EXT macro.
This macro allows dynamically loaded shared libraries (modules) to
provide a wired-in module name and version, and possibly other
compile-time-constant fields in future.  This information can be
retrieved with the new pg_get_loaded_modules() function.
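
A hedged usage sketch (the exact set of output columns is not spelled out
here); it lists the wired-in name and version reported by each loaded module:

SELECT * FROM pg_get_loaded_modules();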

This feature is expected to be particularly useful for modules
that do not have any exposed SQL functionality and thus are
not associated with a SQL-level extension object.  But even for
modules that do belong to extensions, being able to verify the
actual code version can be useful.

Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Yurii Rashkovskii <yrashk@omnigres.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/dd4d1b59-d0fe-49d5-b28f-1e463b68fa32@gmail.com
2025-03-26 11:06:12 -04:00
Daniel Gustafsson
e92c0632c1 Move GSSAPI includes into its own header
Due to a conflict in macro names on Windows between <wincrypt.h>
and <openssl/ssl.h>, these headers need to be included using a
predictable pattern with an undef to handle that.  The GSSAPI
header <gssapi.h> does include <wincrypt.h>, which causes problems
when compiling PostgreSQL using MSVC with OpenSSL and GSSAPI both
enabled in the tree.  Rather than fixing this piecemeal for each
file including gssapi headers, move the includes and undef
to a new file which should be used to centralize the logic.

This patch is a reworked version of a patch by Imran Zaheer
proposed earlier in the thread.  Once this has proven effective
in master we should look at backporting this, as the problem has
existed at least since v16.

Author: Daniel Gustafsson <daniel@yesql.se>
Co-authored-by: Imran Zaheer <imran.zhir@gmail.com>
Reported-by: Dave Page <dpage@pgadmin.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/20240708173204.3f3xjilglx5wuzx6@awork3.anarazel.de
2025-03-26 15:31:46 +01:00
Daniel Gustafsson
1eb399366e psql: Make test robust against locale variations
The test committed in 1a759c83278 was prone to failing when using
locales with a different decimal separator.  Since the test value
isn't the important part, change to using an integer instead.

Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://postgr.es/m/CAFj8pRDE=7uW7QP4rg-OQLE2i-puYsUUt+eHE-L6_b_J9w=eWg@mail.gmail.com
2025-03-26 13:20:56 +01:00
Peter Eisentraut
3642df265d dblink: SCRAM authentication pass-through
This enables SCRAM authentication for dblink (using dblink_fdw) when
connecting to a foreign server without having to store a plain-text
password on user mapping options

This uses the same approach as it was implemented for postgres_fdw in
commit 761c79508e7.  (It also contains the equivalent of the
subsequent fixes 76563f88cfb and d2028e9bbc1.)

Author: Matheus Alcantara <mths.dev@pm.me>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAFY6G8ercA1KES%3DE_0__R9QCTR805TTyYr1No8qF8ZxmMg8z2Q%40mail.gmail.com
2025-03-26 10:49:23 +01:00
Dean Rasheed
a3b6dfd410 Add support for gamma() and lgamma() functions.
These are useful general-purpose math functions which are included in
POSIX and C99, and are commonly included in other math libraries, so
expose them as SQL-callable functions.
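
A quick sanity check (gamma(n) equals (n-1)! for positive integers, and
lgamma() returns the natural logarithm of the absolute value of gamma()):

SELECT gamma(6), lgamma(6);  -- roughly 120 and ln(120) ≈ 4.7875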

Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Stepan Neretin <sncfmgg@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Discussion: https://postgr.es/m/CAEZATCXpGyfjXCirFk9au+FvM0y2Ah+2-0WSJx7MO368ysNUPA@mail.gmail.com
2025-03-26 09:35:53 +00:00
Richard Guo
7c82b4f711 Fix integer-overflow problem in scram_SaltedPassword()
Setting the iteration count for SCRAM secret generation to INT_MAX
will cause an infinite loop in scram_SaltedPassword() due to integer
overflow, as the loop uses the "i <= iterations" comparison.  To fix,
use "i < iterations" instead.

Back-patch to v16 where the user-settable GUC scram_iterations has
been added.

Author: Kevin K Biju <kevinkbiju@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAM45KeEMm8hnxdTOxA98qhfZ9CzGDdgy3mxgJmy0c+2WwjA6Zg@mail.gmail.com
2025-03-26 17:46:51 +09:00
Michael Paquier
787514b30b Use relation name instead of OID in query jumbling for RangeTblEntry
custom_query_jumble (introduced in 5ac462e2b7ac as a node field
attribute) is now assigned to the expanded reference name "eref" of
RangeTblEntry, adding to the query jumble computation the non-qualified,
aliased relation name, without the list of column names.  The relation
OID is removed from the query jumbling.

The effects of this change can be seen in the tests added by
3430215fe35f, where pg_stat_statements (PGSS) entries are now grouped
using the relation name, ignoring which relation search_path may point at.
For example, these two relations are different, but are now grouped in a
single PGSS entry as they are assigned the same query ID:
CREATE TABLE foo1.tab (a int);
CREATE TABLE foo2.tab (b int);
SET search_path = 'foo1';
SELECT count(*) FROM tab;
SET search_path = 'foo2';
SELECT count(*) FROM tab;
SELECT count(*) FROM foo1.tab;
SELECT count(*) FROM foo2.tab;
SELECT query, calls FROM pg_stat_statements WHERE query ~ 'FROM tab';
          query           | calls
--------------------------+-------
 SELECT count(*) FROM tab |     4
(1 row)

It is still possible to use an alias in the FROM clause to split these.
This behavior is useful for relations re-created with the same name,
where queries based on such relations would be grouped in the same
PGSS entry.  For permanent schemas, it should not really matter in
practice.  The main benefit is for workloads that use a lot of temporary
relations, which are usually re-created with the same name continuously.
These can be a heavy source of bloat in PGSS depending on the workload.
Such entries can now be grouped together, improving the user experience.

The original idea from Christoph Berg used catalog lookups to find
temporary relations, something that the query jumble has never done, and
it could cause some performance regressions.  The idea to use
RangeTblEntry.eref and the relation name, applying the same rules for
all relations, temporary and not temporary, has been proposed by Tom
Lane.  The documentation additions have been suggested by Sami Imseih.

Author: Michael Paquier <michael@paquier.xyz>
Co-authored-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Christoph Berg <myon@debian.org>
Reviewed-by: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/Z9iWXKGwkm8RAC93@msg.df7cb.de
2025-03-26 15:21:05 +09:00
Peter Eisentraut
d2028e9bbc postgres_fdw: Fix tests on some Windows variants
The tests introduced by commit 76563f88cfb only work when Unix-domain
sockets are available.  This is optional on Windows, and buildfarm
member drongo runs without them.  To fix, skip the test if Unix-domain
sockets are not enabled.
2025-03-26 07:00:00 +01:00
Jeff Davis
bde2fb797a Add pg_dump --with-{schema|data|statistics} options.
By adding the positive variants of options, in addition to the
negative variants that already exist, users can be explicit about what
pg_dump should produce.

Discussion: https://postgr.es/m/bd0513e4b1ea2b2f2d06f02720c6579711cb62a6.camel@j-davis.com
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
2025-03-25 17:36:38 -07:00
Michael Paquier
27ee6ede6b Fix two issues with custom_query_jumble in gen_node_support.pl
A node field marked with custom_query_jumble and query_jumble_ignore
would generate some code of a custom routine.  The script is changed so
as custom_query_jumble behaves like the other options in this case,
query_jumble_ignore taking priority, with no code generated.

A comment related to the code generated for node types was misplaced.

Thinkos introduced in 5ac462e2b7ac.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1324036.1742945060@sss.pgh.pa.us
2025-03-26 09:06:36 +09:00
Tom Lane
cb36f8ec21 Fix order of -I switches for building pg_regress.o.
We need the -I switch for libpq_srcdir to come before any -I switches
injected by configure.  Otherwise there is a risk of pulling in a
mismatched version of libpq_fe.h from someplace like
/usr/local/include, if the platform has another Postgres version
installed there.  This evidently accounts for today's buildfarm
failures on "anaconda".

In principle the -I switch for src/port/ is at similar hazard, and has
been for a very long time.  But the only .h files we keep there are
pg_config_paths.h and pthread-win32.h, neither of which get installed
on Unix-ish systems, so the odds of picking up a conflicting header
seem pretty small.  That doubtless accounts for the lack of prior
reports.

Back-patch to v17 where pg_regress acquired a build dependency on
libpq_fe.h.  We could go back further to fix the hazard for src/port/
in older branches, but it seems unlikely to be worth troubling over.

Reported-by: Nathan Bossart <nathandbossart@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/Z-MhRzoc7t-nPUQG@nathan
Backpatch-through: 17
2025-03-25 20:03:56 -04:00
Michael Paquier
3430215fe3 pg_stat_statements: Add more tests with temp tables and namespaces
These tests provide coverage for RangeTblEntry and how query jumbling
works with search_path, as well as the case where relations are
re-created, generating a different query ID as the relation OID is used
in the computation.

A patch is under discussion to switch to a different approach based on
the relation name, and there was no test coverage for this area,
including how queries are currently grouped with search_path.  This is
useful to track how the situation changes between HEAD and any patches
proposed.

Christoph has proposed the test with ON COMMIT DROP temporary tables,
and I have written the second part.

Author: Christoph Berg <myon@debian.org>
Author: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/Z9iWXKGwkm8RAC93@msg.df7cb.de
2025-03-26 07:25:23 +09:00
Nathan Bossart
626d7236b6 pg_upgrade: Add --swap for faster file transfer.
This new option instructs pg_upgrade to move the data directories
from the old cluster to the new cluster and then to replace the
catalog files with those generated for the new cluster.  This mode
can outperform --link, --clone, --copy, and --copy-file-range,
especially on clusters with many relations.

However, this mode creates many garbage files in the old cluster,
which can prolong the file synchronization step if
--sync-method=syncfs is used.  To handle that, we recommend using
--sync-method=fsync with this mode, and pg_upgrade internally uses
"initdb --sync-only --no-sync-data-files" for file synchronization.
pg_upgrade will synchronize the catalog files as they are
transferred.  We assume that the database files transferred from
the old cluster were synchronized prior to upgrade.

This mode also complicates reverting to the old cluster, so we
recommend restoring from backup upon failure during or after file
transfer.  We did consider teaching pg_upgrade how to generate a
revert script for such failures, but we decided against it due to
the rarity of failing during file transfer, the complexity of
generating the script, and the potential for misusing the script.

The new mode is limited to clusters located in the same file
system.  With some effort, we could probably support upgrades
between different file systems, but this mode is unlikely to offer
much benefit if we have to copy the files across file system
boundaries.

It is also limited to upgrades from version 10 or newer.  There are
a few known obstacles for using swap mode to upgrade from older
versions.  For example, the visibility map format changed in v9.6,
and the sequence tuple format changed in v10.  In fact, swap mode
omits the --sequence-data option in its uses of pg_dump and instead
reuses the old cluster's sequence data files.  While teaching swap
mode to deal with these kinds of changes is surely possible (and we
may have to deal with similar problems in the future, anyway), it
doesn't seem worth the effort to support upgrades from
long-unsupported versions.

Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/Zyvop-LxLXBLrZil%40nathan
2025-03-25 16:02:35 -05:00
Nathan Bossart
9c49f0e8cd pg_dump: Add --sequence-data.
This new option instructs pg_dump to dump sequence data when the
--no-data, --schema-only, or --statistics-only option is specified.
This was originally considered for commit a7e5457db8, but it was
left out at that time because there was no known use-case.  A
follow-up commit will use this to optimize pg_upgrade's file
transfer step.

Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/Zyvop-LxLXBLrZil%40nathan
2025-03-25 16:02:35 -05:00
Nathan Bossart
cf131fa942 initdb: Add --no-sync-data-files.
This new option instructs initdb to skip synchronizing any files
in database directories, the database directories themselves, and
the tablespace directories, i.e., everything in the base/
subdirectory and any other tablespace directories.  Other files,
such as those in pg_wal/ and pg_xact/, will still be synchronized
unless --no-sync is also specified.  --no-sync-data-files is
primarily intended for internal use by tools that separately ensure
the skipped files are synchronized to disk.  A follow-up commit
will use this to help optimize pg_upgrade's file transfer step.

The --sync-method=fsync implementation of this option makes use of
a new exclude_dir parameter for walkdir().  When not NULL,
exclude_dir specifies a directory to skip processing.  The
--sync-method=syncfs implementation of this option just skips
synchronizing the non-default tablespace directories.  This means
that initdb will still synchronize some or all of the database
files, but there's not much we can do about that.

Discussion: https://postgr.es/m/Zyvop-LxLXBLrZil%40nathan
2025-03-25 16:02:35 -05:00
Jeff Davis
650ab8aaf1 Stats: use schemaname/relname instead of regclass.
For import and export, use schemaname/relname rather than
regclass.

This is more natural during export, fits with the other arguments
better, and it gives better control over error handling in case we
need to downgrade more errors to warnings.

Also, use text for the argument types for schemaname, relname, and
attname so that casts to "name" are not required.

Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/CADkLM=ceOSsx_=oe73QQ-BxUFR2Cwqum7-UP_fPe22DBY0NerA@mail.gmail.com
2025-03-25 11:16:06 -07:00
Jeff Davis
2a420f7995 Minor doc update for commit 99f8f3fbbc.
Author: Corey Huinker <corey.huinker@gmail.com>
2025-03-25 11:15:52 -07:00
Daniel Gustafsson
1a759c8327 psql: Make default \watch interval configurable
The default interval for \watch to wait between executing queries,
when executed without a specified interval, was hardcoded to two
seconds.  This adds the new variable WATCH_INTERVAL which is used
to set the default interval, making it configurable for the user.
This makes \watch the first command which has a user configurable
default setting.
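
A hedged psql usage sketch; with no interval given to \watch, the query then
repeats every five seconds instead of the previous two:

\set WATCH_INTERVAL 5
SELECT now() \watch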

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/B2FD26B4-8F64-4552-A603-5CC3DF1C7103@yesql.se
2025-03-25 17:53:33 +01:00
Daniel Gustafsson
a19db08274 pg_basebackup: Add missing PQclear in error path
This adds a missing PQclear in the error path of StreamLogicalLog, a
fix in the same vein as e889422d98e with an equivalent low impact.

Author: Steven Niu <niushiji@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/c4b1c627-a3e4-4347-a670-1e28a43ce0eb@gmail.com
2025-03-25 17:24:23 +01:00
Peter Eisentraut
ef7a5af77d refactor: Pass relation OID instead of Relation to createForeignKeyCheckTriggers()
Currently, createForeignKeyCheckTriggers() takes a Relation type as
its first argument, but it doesn't use that argument directly.
Instead, it fetches the relation OID by calling RelationGetRelid().
Therefore, it would be more consistent with other functions in this
area to pass the relation OID directly
instead of the whole Relation.

Author: Amul Sul <amul.sul@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
2025-03-25 17:04:12 +01:00
Peter Eisentraut
639238b978 refactor: Split ATExecAlterConstraintInternal()
Split ATExecAlterConstraintInternal() into two functions:
ATExecAlterConstrDeferrability() and
ATExecAlterConstrInheritability().  This simplifies the code and
avoids unnecessary confusion caused by recursive code, which isn't
needed for ATExecAlterConstrInheritability().

(This also takes over the changes in commit 64224a834ce, as the new
AlterConstrDeferrabilityRecurse() is essentially the old
ATExecAlterChildConstr().)

Author: Amul Sul <amul.sul@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
2025-03-25 16:18:00 +01:00
Peter Eisentraut
a3280e2a49 refactor: Move some code that updates pg_constraint to a separate function
This extracts common/duplicate code for different ALTER CONSTRAINT
variants into a common function.  We plan to add more variants that
would use the same code.

Author: Amul Sul <amul.sul@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
2025-03-25 14:37:22 +01:00
Peter Eisentraut
f4b2a62ae3 Small fixes for Add ALTER TABLE ... ALTER CONSTRAINT ... SET [NO] INHERIT
Small fixes for commit f4e53e10b6c: Add missing calls to
InvokeObjectPostAlterHook() and also CacheInvalidateRelcache().  The
former change could have a user-visible effect.  The latter omission
might have caused other bugs, but it is not clear whether one actually
existed.  With these changes, the code is now more consistent with
similar ALTER CONSTRAINT variants, especially the ones that set the
deferrability.

Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CAF1DzPVfOW6Kk=7SSh7LbneQDJWh=PbJrEC_Wkzc24tHOyQWGg@mail.gmail.com
2025-03-25 13:40:24 +01:00
Alexander Korotkov
62f36d6924 postgres_fdw: Remove redundant check in semijoin_target_ok()
If a var belongs to the innerrel of the joinrel, it's not possible that
it belongs to the outerrel.  This commit removes the redundant check from
the if-clause but keeps it as an assertion.

Discussion: https://postgr.es/m/flat/CAHewXN=8aW4hd_W71F7Ua4+_w0=bppuvvTEBFBF6G0NuSXLwUw@mail.gmail.com
Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Alexander Pyhalov <a.yhalov@postgrespro.ru>
Backpatch-through: 17
2025-03-25 12:49:01 +02:00
Thomas Munro
3c86223c99 libpq: Deprecate pg_int64.
Previously we used pg_int64 in three function prototypes in libpq.  It
was added by commit 461ef73f to expose the platform-dependent type used
for int64 in the C89 era.  As of commit 962da900 it is defined as
standard int64_t, and the dust seems to have settled.

Let's just use int64_t directly in these three client-facing functions
instead of (yet) another name.  We've required C99 and thus <stdint.h>
since PostgreSQL 12, C89 and C++98 compilers are long gone, and client
applications very likely use standard types for their own 64-bit needs.
This also cleans up the obscure placement of a new #include <stdint.h>
directive in postgres_ext.h, required for the new definition.  The
typedef was hiding in there for historical reasons, but it doesn't fit
postgres_ext.h's own description of its purpose and there is no evidence
of client applications including postgres_ext.h directly to see it.

Keep a typedef marked deprecated for backward compatibility, but move it
into libpq-fe.h where it was used.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/CA%2BhUKGKn_EkNNGMY5RzMcKP%2Ba6urT4JF%3DCPhw_zHtQwjvX6P2g%40mail.gmail.com
2025-03-25 21:40:00 +13:00
Peter Eisentraut
be1cc9aaf5 Generalize index support in network support function
The network (inet) support functions currently only supported a
hardcoded btree operator family.  With the generalized compare type
facility, we can generalize this to support any operator family from
any index type that supports the required operators.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Co-authored-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-03-25 07:11:56 +01:00
Michael Paquier
5ac462e2b7 Add support for custom_query_jumble as a node field attribute
This option makes it possible for query jumbling to use a custom
routine for a field of a Node, extending support for
custom_query_jumble as a node field attribute.  When dealing with
complex node structures, this can be simpler than having to enforce a
custom function across a full node.

Custom functions need to be defined in queryjumblefuncs.c, named as
_jumble${node}_${field}(), and take as input the JumbleState, the node
and its field.  The field is not really required if we have the Node,
but it makes custom implementations somewhat easier to think about.  The
code generated by gen_node_support.pl uses a macro called
JUMBLE_CUSTOM(), hiding the internals of the logic inside
queryjumblefuncs.c.

This will be used by an upcoming patch adding a custom routine to a
field of RangeTblEntry, but this facility can become
useful in more cases.

Reviewed-by: Christoph Berg <myon@debian.org>
Discussion: https://postgr.es/m/Z9y43-dRvb4EtxQ0@paquier.xyz
2025-03-25 14:18:00 +09:00
Jeff Davis
626df47ad9 Remove 'additional' pointer from TupleHashEntryData.
Reduces memory required for hash aggregation by avoiding an allocation
and a pointer in the TupleHashEntryData structure. That structure is
used for all buckets, whether occupied or not, so the savings is
substantial.

Discussion: https://postgr.es/m/AApHDvpN4v3t_sdz4dvrv1Fx_ZPw=twSnxuTEytRYP7LFz5K9A@mail.gmail.com
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
2025-03-24 22:06:02 -07:00
Jeff Davis
a0942f441e Add ExecCopySlotMinimalTupleExtra().
Allows an "extra" argument that allocates extra memory at the end of
the MinimalTuple. This is important for callers that need to store
additional data, but do not want to perform an additional allocation.

Suggested-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvppeqw2pNM-+ahBOJwq2QmC0hOAGsmCpC89QVmEoOvsdg@mail.gmail.com
2025-03-24 22:05:53 -07:00
Jeff Davis
4d143509cb Create accessor functions for TupleHashEntry.
Refactor for upcoming optimizations.

Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/1cc3b400a0e8eead18ff967436fa9e42c0c14cfb.camel@j-davis.com
2025-03-24 22:05:41 -07:00
Jeff Davis
cc721c459d HashAgg: use Bump allocator for hash TupleHashTable entries.
The entries aren't freed until the entire hash table is destroyed, so
use the Bump allocator to improve allocation speed, avoid wasting
space on the chunk header, and avoid wasting space due to the
power-of-two allocations.

Discussion: https://postgr.es/m/CAApHDvqv1aNB4cM36FzRwivXrEvBO_LsG_eQ3nqDXTjECaatOQ@mail.gmail.com
Reviewed-by: David Rowley
2025-03-24 22:05:33 -07:00
Amit Kapila
cc4331605a Fix the typo in the test case added in 73eba5004a.
Author: vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CALDaNm2ms1deM5EYNLFEfESv_Kw=Y4AiTB0LP=qGS-UpFwGbPg@mail.gmail.com
Discussion: https://postgr.es/m/CABdArM7FW-_dnthGkg2s0fy1HhUB8C3ELA0gZX1kkbs1ZZoV3Q@mail.gmail.com
2025-03-25 09:39:53 +05:30
Amit Kapila
b87ced747d Fix an oversight in 3abe9dc188.
Forgot to update the comment atop one of the functions.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/OSCPR01MB1496623BE1125B44614494E7AF5A72@OSCPR01MB14966.jpnprd01.prod.outlook.com
2025-03-25 09:26:23 +05:30
Alexander Korotkov
023fb51275 postgres_fdw: Avoid pulling up restrict infos from subqueries
Semi-joins below a left/right join are deparsed as subqueries.  Thus,
we can't refer to the subqueries' vars from upper relations.  This
commit avoids pulling conditions from them.

Reported-by: Robins Tharakan <tharakan@gmail.com>
Bug: #18852
Discussion: https://postgr.es/m/CAEP4nAzryLd3gwcUpFBAG9MWyDfMRX8ZjuyY2XXjyC_C6k%2B_Zw%40mail.gmail.com
Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Backpatch-through: 17
2025-03-25 05:49:47 +02:00
Andres Freund
adb5f85fa5 Redefine max_files_per_process to control additionally opened files
Until now max_files_per_process=N limited each backend to open N files in
total (minus a safety factor), even if there were already more files opened in
postmaster and inherited by backends.  Change max_files_per_process to control
how many additional files each process is allowed to open.

The main motivation for this is the patch to add io_method=io_uring, which
needs to open one file for each backend.  Without this patch, even if
RLIMIT_NOFILE is high enough, postmaster will fail in set_max_safe_fds() if
started with a high max_connections.  The cause of the failure is that, until
now, set_max_safe_fds() subtracted the already open files from
max_files_per_process.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/w6uiicyou7hzq47mbyejubtcyb2rngkkf45fk4q7inue5kfbeo@bbfad3qyubvs
Discussion: https://postgr.es/m/CAGECzQQh6VSy3KG4pN1d=h9J=D1rStFCMR+t7yh_Kwj-g87aLQ@mail.gmail.com
2025-03-24 18:20:18 -04:00
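
A minimal configuration sketch of the changed semantics:

    -- each backend may now open this many files in addition to those already
    -- opened in the postmaster and inherited by backends; takes effect on restart
    ALTER SYSTEM SET max_files_per_process = 1000;
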
Nathan Bossart
7d559c8580 Expand comment for isset_offset.
This field was added in commit 0164a0f9ee to provide a way to
determine whether a storage parameter was explicitly set for the
relation or if it just picked up the default value.  In most cases,
this can be accomplished by giving the storage parameter a special
out-of-range default value (e.g., the
autovacuum_vacuum_insert_threshold storage parameter defaults to
-2), but this approach doesn't work in all cases.  For example, a
Boolean storage parameter cannot be given an out-of-range default,
so we need another way to discover the source of its value.

Reported-by: "David G. Johnston" <david.g.johnston@gmail.com>
Reviewed-by: "David G. Johnston" <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/CAKFQuwYKtEUYKS%2B18gRs-xPhn0qOJgM2KGyyWVCODHuVn9F-XQ%40mail.gmail.com
2025-03-24 15:47:02 -05:00
Melanie Plageman
aea916fe55 Fix bitmapheapscan incorrect recheck of NULL tuples
The bitmap heap scan skip fetch optimization skips fetching the heap
block when a page is set all-visible in the visibility map and no
columns from the table are needed to satisfy the query.

2b73a8cd33b and c3953226a07 changed the control flow of bitmap heap scan
to use the read stream API. The read stream API returns buffers
containing blocks to the user. To make this work with the skip fetch
optimization, we keep a count of the empty tuples we need to emit for
all the blocks skipped and only emit the empty tuples after processing
the next block fetched from the heap or at the end of the scan.

It's incorrect to recheck NULL tuples, so we must set `recheck` to false
before yielding control back to BitmapHeapNext(). This was done before
emitting any remaining empty tuples at the end of the scan but not for
empty tuples emitted during the scan. This meant that if a page fetched
from the heap did require recheck and set `recheck` to true and then we
emitted empty tuples for subsequent blocks, we would get wrong results.

Fix this by always setting `recheck` to false before emitting empty
tuples.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Tested-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/496f7acd-881c-4df3-9bd3-8f8534dfec26%40gmail.com
2025-03-24 16:40:59 -04:00
Álvaro Herrera
0e3e0ec06b Fix typo
2025-03-24 17:36:44 +01:00
Fujii Masao
c68100aa43 Allow pg_recvlogical --drop-slot to work without --dbname.
When pg_recvlogical was introduced in 9.4, the --dbname option was not
required for --drop-slot. Without it, pg_recvlogical --drop-slot connected
using a replication connection (not tied to a specific database) and
was able to drop both physical and logical replication slots, similar to
pg_receivewal --drop-slot.

However, commit 0c013e08cfb unintentionally changed this behavior in 9.5,
making pg_recvlogical always check whether it's connected to a specific
database and fail if it's not. This change was expected for --create-slot
and --start, which handle logical replication slots and require a database
connection, but it was unnecessary for --drop-slot, which should work with
any replication connection. As a result, --dbname became a required option
for --drop-slot.

This commit removes that restriction, restoring the original behavior and
allowing pg_recvlogical --drop-slot to work without specifying --dbname.

Although this issue originated from an unintended change, it has existed
for a long time without complaints or bug reports, and the documentation
never explicitly stated that --drop-slot should work without --dbname.
Therefore, the change is not treated as a bug fix and is applied only to
master.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/b15ecf4f-e5af-4fbb-82c2-a425f453e0b2@oss.nttdata.com
2025-03-25 00:18:27 +09:00
Fujii Masao
dfc13428a9 doc: Clarify required options for each action in pg_recvlogical.
Each pg_recvlogical action requires specific options. For example,
--slot, --dbname, and --file must be specified with the --start action.
Previously, the documentation did not clearly outline these requirements.

This commit updates the documentation to explicitly state
the necessary options for each action.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Co-authored-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/OSCPR01MB14966930B4357BAE8C9D68A8AF5C72@OSCPR01MB14966.jpnprd01.prod.outlook.com
2025-03-25 00:14:38 +09:00
Peter Eisentraut
76563f88cf postgres_fdw: improve security checks
SCRAM pass-through should not bypass the FDW security check as it was
implemented for postgres_fdw in commit 761c79508e7.

This commit improves the security check by adding new SCRAM
pass-through checks to ensure that the required SCRAM connection
options are not overwritten by the user mapping or foreign server
options.  This is meant to match the security requirements for a
password-using connection.

Since libpq has no SCRAM-specific equivalent of
PQconnectionUsedPassword(), we enforce this instead by making the
use_scram_passthrough option of postgres_fdw imply
require_auth=scram-sha-256.  This means that if use_scram_passthrough
is set, some situations that might otherwise have worked are
preempted, for example GSSAPI with delegated credentials.  This could
be enhanced in the future if there is desire for more flexibility.

Reported-by: Jacob Champion <jacob.champion@enterprisedb.com>
Author: Matheus Alcantara <mths.dev@pm.me>
Co-authored-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAFY6G8ercA1KES%3DE_0__R9QCTR805TTyYr1No8qF8ZxmMg8z2Q%40mail.gmail.com
2025-03-24 15:56:53 +01:00
Magnus Hagander
a8eeb22f17 psql: use consistent alias for pg_description
Author: Jelte Fennema-Nio <github-tech@jeltef.nl>
Suggested-By: Michael Banck <mbanck@gmx.net>
Discussion: https://www.postgresql.org/message-id/67813520.170a0220.183245.7bf0%40mx.google.com
2025-03-24 14:31:28 +01:00
Magnus Hagander
d696406a9b psql: show default extension version in \dx output
Reviewed-By: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-By: Michael Banck <mbanck@gmx.net>
Reviewed-By: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-By: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-By: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/CABUevEyTMyXC6OvCWkj+rPnHrfi8_Rw_+DD_jzgFFNPqgf+Oig@mail.gmail.com
2025-03-24 14:25:05 +01:00
Heikki Linnakangas
19c6eb06c5 Add test case for when subscriber table is missing a column
We haven't had bugs in this area, but there's some not entirely
trivial code to detect that case, so it seems good to have test
coverage for it.

Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://www.postgresql.org/message-id/CAHut%2BPtX8P0EGhsk9p%3DhQGUHrzxeCSzANXSMKOvYiLX-EjdyNw@mail.gmail.com
2025-03-24 12:13:32 +02:00
Amit Kapila
73eba5004a Detect and Log multiple_unique_conflicts type conflict.
Introduce a new conflict type, multiple_unique_conflicts, to handle cases
where an incoming row during logical replication violates multiple UNIQUE
constraints.

Previously, the apply worker detected and reported only the first
encountered key conflict (insert_exists/update_exists), causing repeated
failures, as each constraint violation needed to be handled one by
one, making the process slow and error-prone.

With this patch, the apply worker checks all unique constraints upfront
once the first key conflict is detected and reports
multiple_unique_conflicts if multiple violations exist. This allows users
to resolve all conflicts at once by deleting all conflicting tuples rather
than dealing with them individually or skipping the transaction.

In the future, this will also allow us to specify different resolution
handlers for such a conflict type.

Add the stats for this conflict type in pg_stat_subscription_stats.

Author: Nisha Moond <nisha.moond412@gmail.com>
Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Discussion: https://postgr.es/m/CABdArM7FW-_dnthGkg2s0fy1HhUB8C3ELA0gZX1kkbs1ZZoV3Q@mail.gmail.com
2025-03-24 12:30:44 +05:30
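
A minimal sketch (hypothetical table) of a row that violates two UNIQUE
constraints at once, the case this conflict type covers:

    CREATE TABLE t (a int UNIQUE, b int UNIQUE);
    INSERT INTO t VALUES (1, 1);   -- pre-existing row on the subscriber
    -- a replicated INSERT of (1, 1) from the publisher is now reported as a
    -- single multiple_unique_conflicts conflict rather than only the first
    -- insert_exists violation
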
David Rowley
35a92b7c25 Add tests for POSITION(bytea, bytea)
Previously there was no coverage for this function.

Author: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Rustam ALLAKOV <rustamallakov@gmail.com>
Discussion: https://postgr.es/m/CAJ7c6TMT6XCooMVKnCd_tR2oBdGcnjefSeCDCv8jzKy9VkWA5w@mail.gmail.com
2025-03-24 19:32:02 +13:00
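
A sketch of the function now covered by the tests:

    -- '\x6c6c' ("ll") starts at the third byte of '\x68656c6c6f' ("hello")
    SELECT position('\x6c6c'::bytea IN '\x68656c6c6f'::bytea);   -- returns 3
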
Michael Paquier
2a0cd38da5 Allow plugins to set a 64-bit plan identifier in PlannedStmt
This field can be optionally set in a PlannedStmt through the planner
hook, giving extensions the possibility to assign an identifier related
to a computed plan.  The backend is changed to report it in the backend
entry of a process running (including the extended query protocol), with
semantics and APIs to set or get it similar to what is used for the
existing query ID (introduced in the backend via 4f0b0966c8).  The plan
ID is reset at the same timing as the query ID.  Currently, this
information is not added to the system view pg_stat_activity; extensions
can access it through PgBackendStatus.

Some patches have been proposed to provide some features in the planning
area, where a plan identifier is used as a key to know the plan involved
(for statistics, plan storage and manipulations, etc.), and the point of
this commit is to provide an anchor in the backend that extensions can
rely on for future work.   The reset of the plan identifier is
controlled by core and follows the same pattern as the query identifier
added in 4f0b0966c8.

The contents of this commit are extracted from a larger set proposed
originally by Lukas Fittl, that Sami Imseih has proposed as an
independent change, with a few tweaks sprinkled by me.

Author: Lukas Fittl <lukas@fittl.com>
Author: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAP53Pkyow59ajFMHGpmb1BK9WHDypaWtUsS_5DoYUEfsa_Hktg@mail.gmail.com
Discussion: https://postgr.es/m/CAA5RZ0vyWd4r35uUBUmhngv8XqeiJUkJDDKkLf5LCoWxv-t_pw@mail.gmail.com
2025-03-24 13:23:42 +09:00
Tom Lane
8a3e4011f0 psql: Add tab completion for VACUUM and ANALYZE ... ONLY option.
Improve psql's tab completion for VACUUM and ANALYZE by supporting
the ONLY option introduced in 62ddf7ee9.

In passing, simplify some of the VACUUM patterns by making use
of MatchAnyN.

Author: Umar Hayat <postgresql.wizard@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Discussion: https://postgr.es/m/CAD68Dp3L6yW_nWs+MWBs6s8tKLRzXaQdQgVRm4byZe0L-hRD8g@mail.gmail.com
2025-03-23 17:16:08 -04:00
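
A sketch of the option the completion now covers (hypothetical table
names):

    -- ONLY restricts processing to the named parent, skipping partitions/children
    VACUUM (ANALYZE) ONLY parent_table;
    ANALYZE ONLY parent_table;
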
Heikki Linnakangas
2817525f0d Fix rare assertion failure in standby, if primary is restarted
During hot standby, ExpireAllKnownAssignedTransactionIds() and
ExpireOldKnownAssignedTransactionIds() functions mark old transactions
as no-longer running, but they failed to update xactCompletionCount
and latestCompletedXid. AFAICS it would not lead to incorrect query
results, because those functions effectively turn in-progress
transactions into aborted transactions and an MVCC snapshot considers
both as "not visible". But it could surprise GetSnapshotDataReuse()
and trigger the "TransactionIdPrecedesOrEquals(TransactionXmin,
RecentXmin))" assertion in it, if the apparent xmin in a backend would
move backwards. We saw this happen when GetCatalogSnapshot() would
reuse an older catalog snapshot, when GetTransactionSnapshot() had
already advanced TransactionXmin.

The bug goes back all the way to commit 623a9ba79b in v14 that
introduced the snapshot reuse mechanism, but it started to happen more
frequently with commit 952365cded6 which removed a
GetTransactionSnapshot() call from backend startup. That made it more
likely for ExpireOldKnownAssignedTransactionIds() to be called between
GetCatalogSnapshot() and the first GetTransactionSnapshot() in a
backend.

Andres Freund first spotted this assertion failure on buildfarm member
'skink'. Reproduction and analysis by Tomas Vondra.

Backpatch-through: 14
Discussion: https://www.postgresql.org/message-id/oey246mcw43cy4qw2hqjmurbd62lfdpcuxyqiu7botx3typpax%40h7o7mfg5zmdj
2025-03-23 20:41:16 +02:00
Noah Misch
f0446384ea Fix "make clean" for new TAP suite.
Commit 28f04984f0c240b76e61f00cd247554fbc850056 missed this.
2025-03-23 06:12:02 -07:00
Andres Freund
ca3067cc57 aio: Change prefix of PgAioResultStatus values to PGAIO_RS_
The previous prefix wasn't consistent with the naming of other AIO related
enum values. It seems best to rename it before the users are introduced.

Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_Yb+JzQpNsgUxCB0gBi+sE-mi_HmcJF6ALnmO4W+UgwpA@mail.gmail.com
2025-03-22 17:30:44 -04:00
Tom Lane
58fdca2204 plpgsql: make WHEN OTHERS distinct from WHEN SQLSTATE '00000'.
The catchall exception condition OTHERS was represented as
sqlerrstate == 0, which was a poor choice because that comes
out the same as SQLSTATE '00000'.  While we don't issue that
as an error code ourselves, there isn't anything particularly
stopping users from doing so.  Use -1 instead, which can't
match any allowed SQLSTATE string.

While at it, invent a macro PLPGSQL_OTHERS to use instead of
a hard-coded magic number.

While this seems like a bug fix, I'm inclined not to back-patch.
It seems barely possible that someone has written code like this
and would be annoyed by changing the behavior in a minor release.

Reported-by: David Fiedler <david.fido.fiedler@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAHjN70-=H5EpTOuZVbC8mPvRS5EfZ4MY2=OUdVDWoyGvKhb+Rw@mail.gmail.com
2025-03-22 14:17:00 -04:00
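
For context, a minimal runnable example of the OTHERS catchall whose
internal representation this commit changes:

    DO $$
    BEGIN
      PERFORM 1 / 0;              -- raises division_by_zero (SQLSTATE 22012)
    EXCEPTION
      WHEN OTHERS THEN            -- catchall; no longer represented as SQLSTATE '00000'
        RAISE NOTICE 'caught %', SQLSTATE;
    END;
    $$;
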
Peter Geoghegan
9a2e2a285a Improve nbtree array primitive scan scheduling.
Add a new scheduling heuristic: don't end the ongoing primitive index
scan immediately (at the point where _bt_advance_array_keys notices that
the next set of matching tuples must be on a later page) if the primscan
already managed to step right/left from its first leaf page.  Schedule a
recheck against the next sibling leaf page's finaltup instead.

The new heuristic tends to avoid scenarios where the top-level scan
repeatedly starts and ends primitive index scans that each read only one
leaf page from a group of neighboring leaf pages.  Affected top-level
scans will now tend to step forward (or backward) through the index
instead, without wasting cycles on descending the index anew.

The recheck mechanism isn't exactly new.  But up until now it has only
been used to deal with edge cases involving high key finaltups with one
or more truncated -inf attributes that _bt_advance_array_keys deemed
"provisionally satisfied" (satisfied for the purposes of allowing the
scan to step onto the next page, subject to recheck once on that page).
The mechanism was added by commit 5bf748b8, which invented the general
concept of primitive scan scheduling.  It was later enhanced by commit
79fa7b3b, which taught it about cases involving -inf attributes that
satisfy inequality scan keys required in the opposite-to-scan direction
only (arguably, they should have been covered by the earliest version).
Now the recheck mechanism can be applied based on scan-level heuristics,
which have nothing to do with truncated high keys.  Now rechecks might
be performed by _bt_readpage when scanning in _either_ scan direction.

The theory behind the new heuristic is that any primitive scan that
makes it past its first leaf page is one that is already likely to have
arrays whose key values match index tuples that are closely clustered
together in the index.  The rules that determine whether we ever get
past the first page are still conservative (that'll still only happen
when pstate.finaltup strongly suggests that it's the right thing to do).
Surviving past the first leaf page is a strong signal in itself.

Preparation for an upcoming patch that will add skip scan optimizations
to nbtree.  That'll work by adding skip arrays, which behave similarly
to SAOP arrays, but generate their elements procedurally and on-demand.

Note that this commit isn't specifically concerned with skip arrays; the
scheduling logic doesn't (and won't) condition anything on whether the
scan uses skip arrays, SAOP arrays, or some combination of the two
(which seems like a good general principle for _bt_advance_array_keys).
While the problems that this commit ameliorates are more likely with
skip arrays (at least in practice), SAOP arrays (or those with very
dense, contiguous array elements) are also affected.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wzkz0wPe6+02kr+hC+JJNKfGtjGTzpG3CFVTQmKwWNrXNw@mail.gmail.com
2025-03-22 13:02:18 -04:00
Melanie Plageman
e215166c9c Use streaming read I/O in SP-GiST vacuuming
Like 69273b818b1df did for GiST vacuuming, make SP-GiST vacuum use the
read stream API for vacuuming physically contiguous index pages.

Concurrent insertions may cause SP-GiST index tuples to be redirected.
While vacuuming, these are added to a pending list which is later
processed to ensure no dead tuples are left behind. Pages containing
such tuples are still read by directly calling ReadBuffer() and do not
use the read stream API.

Author: Andrey M. Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/37432403-8657-403B-9CDF-5A642BECDD81%40yandex-team.ru
2025-03-21 17:51:22 -04:00
Thomas Munro
e51ca405ed Fix ps display for IO workers.
This code must have missed a memo about the backend type description
being supplied automatically these days, and was duplicating that
information.

Before: "io worker io worker: N"
After:  "io worker N"
2025-03-22 10:13:23 +13:00
Tom Lane
16a3ae504e Revert inappropriate weakening of an Assert in plpgsql.
Commit 682ce911f modified exec_save_simple_expr to accept a Param
in the tlist of a Gather node, rather than the normal case of a Var
referencing the Gather's input.  It turns out that this was a kluge
to work around the bug later fixed in 0f7ec8d9c, namely that setrefs.c
was failing to replace Params in upper plan nodes with Var references
to the same Params appearing in the child tlists.  With that fixed,
there seems no reason to continue to allow a Param here.  (Moreover,
even if we did expect a Param here, the semantically correct thing
to do would be to take the Param as the expression being sought.
Whatever it may represent, it is *not* a reference to the child.)
Hence, revert that part of 682ce911f.

That all happened a long time ago.  However, since the net effect
here is just to tighten an Assert condition, I'm content to change
it only in master.

Discussion: https://postgr.es/m/1565347.1742572349@sss.pgh.pa.us
2025-03-21 15:55:06 -04:00
Masahiko Sawada
04ff636cbc Add GUC option to control maximum active replication origins.
This commit introduces a new GUC option max_active_replication_origins
to control the maximum number of active replication
origins. Previously, this was controlled by
'max_replication_slots'. Having a separate GUC option provides better
flexibility for setting up subscribers, as they may not require
replication slots (for cascading replication) but always require
replication origins.

Author: Euler Taveira <euler@eulerto.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/b81db436-8262-4575-b7c4-bc0c1551000b@app.fastmail.com
2025-03-21 12:20:15 -07:00
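
A minimal sketch for a subscriber setup (assuming that, like
max_replication_slots, the new GUC requires a server restart):

    ALTER SYSTEM SET max_active_replication_origins = 10;
    -- restart the server for the new value to take effect (assumption noted above)
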
Tom Lane
0e032a2240 Place "extern" declaration in the right part of pg_class.h.
errdetail_relkind_not_supported() was declared within
EXPOSE_TO_CLIENT_CODE, which is mistaken since that function
isn't available client-side.  While relatively harmless,
this isn't good precedent.

Discussion: https://postgr.es/m/1134562.1742507765@sss.pgh.pa.us
2025-03-21 15:14:15 -04:00
Tom Lane
cd72c1b76e Label the contents of pg_*_d.h files a little better.
Make genbki.pl emit some boilerplate comments identifying the
sections of the pg_*_d.h files that it generates.  This is in
hopes of making them slightly more readable, in case people
look at those files and not the pg_*.h/pg_*.dat originals.

Discussion: https://postgr.es/m/1134562.1742507765@sss.pgh.pa.us
2025-03-21 15:09:46 -04:00
Melanie Plageman
69273b818b Use streaming read I/O in GiST vacuuming
Like c5c239e26e387 did for btree vacuuming, make GiST vacuum use the
read stream API for sequentially processed pages.

Because it is possible for concurrent insertions to relocate unprocessed
index entries to already vacuumed pages, GiST vacuum must backtrack and
reprocess those pages. These pages are still read with explicit
ReadBuffer() calls.

Author: Andrey M. Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/EFEBED92-18D1-4C0F-A4EB-CD47072EF071%40yandex-team.ru
2025-03-21 14:06:45 -04:00
Melanie Plageman
3f850c3fc5 Assorted trivial cleanup of c5c239e26e
c5c239e26e made btree vacuum use the read stream API. Though it used
functions declared in read_stream.h, it relied on transitively including
it. Explicitly include that file. Also remove an extraneous newline and
decrease the scope of one of the local variables in btvacuumscan().
2025-03-21 14:06:40 -04:00
Tom Lane
7fe312f609 Fix plpgsql's handling of simple expressions in scrollable cursors.
exec_save_simple_expr did not account for the possibility that
standard_planner would stick a Materialize node atop the plan
of even a simple Result, if CURSOR_OPT_SCROLL is set.  This led
to an "unexpected plan node type" error.

This is a very old bug, but it'd only be reached by declaring a
cursor for a "SELECT simple-expression" query and explicitly
marking it scrollable, which is an odd thing to do.  So the lack
of prior reports isn't too surprising.

Bug: #18859
Reported-by: Olleg Samoylov <splarv@ya.ru>
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18859-0d5f28ac99a37059@postgresql.org
Backpatch-through: 13
2025-03-21 11:30:42 -04:00
Melanie Plageman
c5c239e26e Use streaming read I/O in btree vacuuming
Btree vacuum processes all index pages in physical order. Now it uses
the read stream API to get the next buffer instead of explicitly
invoking ReadBuffer().

It is possible for concurrent insertions to cause page splits during
index vacuuming. This can lead to index entries that have yet to be
vacuumed being moved to pages that have already been vacuumed. Btree
vacuum code handles this by backtracking to reprocess those pages. So,
while sequentially encountered pages are now read through the
read stream API, backtracked pages are still read with explicit
ReadBuffer() calls.

Author: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_bW1UOyup%3DjdFw%2BkOF9bCaAm%3D9UpiyZtbPMn8n_vnP%2Big%40mail.gmail.com#3b3a84132fc683b3ee5b40bc4c2ea2a5
2025-03-21 09:09:39 -04:00
Álvaro Herrera
1d617a2028 Change one loop in ATRewriteTable to use 1-based attnums
All TupleDescAttr() calls in tablecmds.c that aren't in loops across all
attributes use AttrNumber-style indexes (1-based); there was only one
place in ATRewriteTable that was stashing 0-based indexes in a list for
later processing.  Switch that to use attnums for consistency.

Author: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxEoYA5ScUr2=CmA1xcpaS_1ixneDbEkVU77X1ctGxY2mA@mail.gmail.com
2025-03-21 10:55:06 +01:00
Thomas Munro
ce1a75c4fe Support buffer forwarding in StartReadBuffers().
StartReadBuffers() reports a short read when it finds a cached block
that ends a range needing I/O by updating the caller's *nblocks.  It
doesn't want to have to unpin the trailing hit that it knows the caller
wants, so the v17 version used sleight of hand in the name of
simplicity: it included it in *nblocks as if it were part of the I/O,
but internally tracked the shorter real I/O size in io_buffers_len (now
removed).

This API change "forwards" the delimiting buffer to the next call.  It's
still pinned, and still stored in the caller's array, but *nblocks no
longer includes stray buffers that are not really part of the operation.
The expectation is that the caller still wants the rest of the blocks
and will call again starting from that point, and now it can pass the
already pinned buffer back in (or choose not to and release it).

The change is needed for the coming asynchronous I/O version's larger
version of the problem: by definition it must move BM_IO_IN_PROGRESS
negotiation from WaitReadBuffers() to StartReadBuffers(), but it might
already have many buffers pinned before it discovers a need to split an
I/O.  (The current synchronous I/O version hides that detail from
callers by looping over smaller reads if required to make all covered
buffers valid in WaitReadBuffers(), so it looks like one operation but
it might occasionally be several under the covers.)

Aside from avoiding unnecessary pin traffic, this will also be important
for later work on out-of-order streams: you can't prioritize data that
is already available right now if that fact is hidden from you.

The new API is natural for read_stream.c (see ed0b87ca).  After a short
read it leaves forwarded buffers where they fell in its circular queue
for the continuing call to pick up.

Single-block StartReadBuffer() and traditional ReadBuffer() share code
but are not affected by the change.  They don't do multi-block I/O.

Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions)
Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
2025-03-21 20:43:59 +13:00
Thomas Munro
ed0b87caac Support buffer forwarding in read_stream.c.
In preparation for a follow-up change to the buffer manager, teach
read_stream.c to manage buffers "forwarded" from one StartReadBuffers()
call to the next after a short read.  This involves a small amount of
extra book-keeping, and opens the way for lower levels to split I/O
operations without having to drop pins, as required for efficient
handling of various edge cases.

Concretely, the "buffers" argument will change from an out parameter to
an in/out parameter.  Buffer queue elements must be initialized on first
use and cleared after they're consumed, but forwarded buffers are left
where they fall ahead of the current pending read in the queue, ready
for use by the operation that continues where a short read left off.
The stream also needs to count them for pin limit management and release
them on reset/early end.

Tested-by: Andres Freund <andres@anarazel.de> (earlier versions)
Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
2025-03-21 18:44:47 +13:00
Fujii Masao
14413d0ef5 doc: Remove incorrect description about dropping replication slots.
pg_drop_replication_slot() can drop replication slots created on
a different database than the one where it is executed. This behavior
has been in place since PostgreSQL 9.4, when pg_drop_replication_slot()
was introduced.

However, commit ff539d mistakenly added the following incorrect
description in the documentation:

     For logical slots, this must be called when connected to
     the same database the slot was created on.

This commit removes that incorrect statement. A similar mistake was
also present in the documentation for the DROP_REPLICATION_SLOT
command, which has now been corrected as well.

Back-patch to all supported versions.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/OSCPR01MB14966C6BE304B5BB2E58D4009F5DE2@OSCPR01MB14966.jpnprd01.prod.outlook.com
Backpatch-through: 13
2025-03-21 12:56:39 +09:00
David Rowley
00b52c3db6 Simplify EXPLAIN code for Memoize
This removes a needless special case for Memoize's FORMAT TEXT EXPLAIN
output.

ExplainPropertyText() outputs the same thing in text mode as the
special-case code was doing, so removing the special-case code results in
the same EXPLAIN output, just with less code.

It seems like a good idea to fix this to help prevent future changes in
this area from copying the same pattern.

Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Reported-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/88a71bcd-0b5c-4d0b-8107-757e96f402d5@tantorlabs.com
2025-03-21 13:40:05 +13:00
Andres Freund
202b12774d bufmgr: Improve stats when a buffer is read in concurrently
Previously we would have the following inaccuracies when a backend tried to
read in a buffer, but that buffer was read in concurrently by another backend:
- the read IO was double-counted in the global buffer access stats (pgBufferUsage)
- the buffer hit was not accounted for in:
  - global buffer access statistics
  - pg_stat_io
  - relation level IO stats
  - vacuum cost balancing

While trying to read in a buffer that is concurrently read in by another
backend is not a common occurrence, it's also not that rare, e.g. due to
concurrent sequential scans on the same relation.  This scenario has become
more likely in PG 17, due to the introduction of read streams, which can pin
multiple buffers before calling StartBufferIO() for all the buffers.

This behaviour has historically grown, but there doesn't seem to be any reason
to continue with the wrong accounting.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_Zk-B08AzPsO-6680LUHLOCGaNJYofaxTFseLa=OepV1g@mail.gmail.com
2025-03-20 19:58:22 -04:00
Andrew Dunstan
12604593e9 Show plperl version in the meson setup summary.
Also, use perl 'version' instead of 'api_versionstring' to sync with
the configure script.

Author: Roman Zharkov <r.zharkov@postgrespro.ru>

Discussion: https://postgr.es/m/93e7f77bf4e1ef4640e4ee733f9e2a78@postgrespro.ru
2025-03-20 18:55:29 -04:00
Andres Freund
fc51a60dd4 smgr: Hold interrupts in most smgr functions
We need to hold interrupts across most of the smgr.c/md.c functions, as
otherwise interrupt processing, e.g. due to a < ERROR elog/ereport, can
trigger procsignal processing, which in turn can trigger smgrreleaseall(). As
the relevant code is not reentrant, we quickly end up in a bad situation.

The only reason we haven't noticed this before is that there is only one
non-error ereport called in affected routines, in register_dirty_segments(),
and that one is extremely rarely reached. If one enables fd.c's FDDEBUG it's
easy to reproduce crashes.

It seems better to put the HOLD_INTERRUPTS()/RESUME_INTERRUPTS() in smgr.c,
instead of trying to push them down to md.c where possible: For one, every
smgr implementation would be vulnerable, for another, a good bit of smgr.c
code itself is affected too.

Eventually we might want a more targeted solution, allowing e.g. a networked
smgr implementation to be interrupted, but many other, more complicated,
problems would need to be fixed for that to be viable (e.g. smgr.c is often
called with interrupts already held).

One could argue this should be backpatched, but the existing < ERROR
elog/ereports that can be reached with unmodified sources are unlikely to be
reached. On balance the risk of backpatching seems higher than the gain - at
least for now.

Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/3vae7l5ozvqtxmd7rr7zaeq3qkuipz365u3rtim5t5wdkr6f4g@vkgf2fogjirl
2025-03-20 17:33:57 -04:00
Tom Lane
fdb5dd6331 Be more paranoid in configure's checks for CRC and POPCNT intrinsics.
In these tests, we need to verify not only that the compiler has heard
of these intrinsics, but that lower-level tools cope with them too.
(For example, the assembler must also know the instructions, and on
some platforms there might be library support involved.)  The hazard
is that the compiler might optimize away the calls altogether,
allowing the configure check to succeed only to have the build fail
later if lower-level support is missing.  The existing code tried to
prevent that by ensuring that the result of the intrinsic is used
for something, but that's really insufficient because we were feeding
constant input to it.  So the compiler would be perfectly entitled to
optimize away the calls anyway.  Fix by making the inputs into global
variables.  (Hypothetically, LTO optimization could still remove the
code --- but that's well past where we'd be likely to hit trouble.)

It is not known that any current compiler would actually optimize
away these calls, and even if that happened it would be unlikely
that any problem would manifest.  Our concern for this stems from
largely-bygone days when it was common to install gcc on platforms
with some other native compiler, so that a compiler-vs-library
support discrepancy was more probable.  Still, there's little
point in defending against such cases in a way that is visibly
incomplete.

I'm content to fix this in master for now; we can back-patch if
any indication appears that it's a live problem for someone.

Discussion: https://postgr.es/m/3368102.1741993462@sss.pgh.pa.us
2025-03-20 16:23:09 -04:00
Robert Haas
50ba65e733 Add an additional hook for EXPLAIN option validation.
Commit c65bc2e1d14a2d4daed7c1921ac518f2c5ac3d17 made it possible for
loadable modules to add EXPLAIN options. Normally, any necessary
validation can be performed by the hook function passed to
RegisterExtensionExplainOption, but if a loadable module wants to sanity
check options against each other, that needs to be done after the entire
options list has been processed. So, add an additional hook for that
purpose.

Author: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: http://postgr.es/m/CAA5RZ0vOcJF91O2e5AQN+V6guMNLMhJx83dxALf-iUZ-hLGO_Q@mail.gmail.com
2025-03-20 13:47:55 -04:00
Nathan Bossart
af0d4901c1 Add test for pg_upgrade file transfer modes.
This new test checks all of pg_upgrade's file transfer modes.  For
each mode, we verify that pg_upgrade either succeeds (and some test
objects successfully reach the new version) or fails with an error
that indicates the mode is not supported on the current platform.
For cross-version tests, we also check that pg_upgrade transfers
non-default tablespaces.  (Tablespaces can't be tested on same
version upgrades because of the version-specific subdirectory
conflict, but we might be able to enable such tests once we teach
pg_upgrade how to handle in-place tablespaces.)

Suggested-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/Zyvop-LxLXBLrZil%40nathan
2025-03-20 11:08:42 -05:00
Nathan Bossart
0164a0f9ee Add vacuum_truncate configuration parameter.
This new parameter works just like the storage parameter of the
same name: if set to true (which is the default), autovacuum and
VACUUM attempt to truncate any empty pages at the end of the table.
It is primarily intended to help users avoid locking issues on hot
standbys.  The setting can be overridden with the storage parameter
or VACUUM's TRUNCATE option.

Since there's presently no way to determine whether a Boolean
storage parameter is explicitly set or has just picked up the
default value, this commit also introduces an isset_offset member
to relopt_parse_elt.

Suggested-by: Will Storey <will@summercat.com>
Author: Nathan Bossart <nathandbossart@gmail.com>
Co-authored-by: Gurjeet Singh <gurjeet@singh.im>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://postgr.es/m/Z2DE4lDX4tHqNGZt%40dev.null
2025-03-20 10:16:50 -05:00
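
A sketch of the three levels at which truncation can now be controlled
(hypothetical table name):

    ALTER SYSTEM SET vacuum_truncate = off;                    -- new server-wide default
    ALTER TABLE standby_sensitive SET (vacuum_truncate = on);  -- per-table storage parameter
    VACUUM (TRUNCATE false) standby_sensitive;                 -- per-command override
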
Peter Eisentraut
618c64ffd3 Revert workarounds for -Wmissing-braces false positives on old GCC
We have collected several instances of a workaround for GCC bug 53119,
which caused false-positive compiler warnings.  This bug has long been
fixed, but was still seen on the buildfarm, most recently on lapwing
with gcc (Debian 4.7.2-5).  (The GCC bug tracker mentions that a fix
was backported to 4.7.4 and 4.8.3.)

That compiler no longer runs warning-free since commit 6fdd5d95634, so
we don't need to keep these workarounds.  And furthermore, the
consensus appears to be that we don't want to keep supporting that era
of platform anymore at all.

This reverts the following commits:

d937904cce6a3d82e4f9c2127de7b59105a134b3
506428d091760650971433f6bc083531c307b368
b449afb582bb9015bfbb85abc10ce122aef9ec70
6392f2a0968c20ecde4d27b6652703ad931fce92
bad0763a4d7be3005eae35d460c73ac4bc7ebaad
5e0c761d0a13c7b4f7c5de618ac38560d74d74d0

and makes a few similar fixes to newer code.

Discussion: https://www.postgresql.org/message-id/flat/e170d61f-01ab-4cf9-ab68-91cd1fac62c5%40eisentraut.org
Discussion: https://www.postgresql.org/message-id/flat/CA%2BTgmoYEAm-KKZibAP3hSqbTFTjUd47XtVcf3xSFDpyecXX9uQ%40mail.gmail.com
2025-03-20 11:25:58 +01:00
Peter Eisentraut
b7076c1e7f Fix extension control path tests
Change the expected extension to be installed from amcheck to plpgsql,
since not all buildfarm animals have the contrib module installed.

Author: Matheus Alcantara <mths.dev@pm.me>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/E7C7BFFB-8857-48D4-A71F-88B359FADCFD@justatheory.com
2025-03-20 10:53:59 +01:00
Peter Eisentraut
47929324c5 Fix typo in comment
2025-03-20 10:44:12 +01:00
Amit Kapila
e5aeed4b80 pg_createsubscriber: Add -R publications option.
This patch introduces a new '-R'/'--remove' option in the
'pg_createsubscriber' utility to specify the object types to be removed
from the subscriber. Currently, we add support to specify 'publications'
as an object type. In the future, other object types like failover-slots
could be added.

This feature optionally allows removing publications on the subscriber
that were replicated from the primary server (before running this tool)
during physical replication. Users may want to retain these publications
in case they want some pre-existing subscribers to point to the newly
created subscriber.

Author: Shubham Khanna <khannashubham1197@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAHv8RjL4OvoYafofTb_U_JD5HuyoNowBoGpMfnEbhDSENA74Kg@mail.gmail.com
2025-03-20 12:21:54 +05:30
Andres Freund
5941946d09 meson: Flush stdout in testwrap
Otherwise the progress won't reliably be displayed during a test.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/kx6xu7suexal5vwsxpy7ybgkcznx6hgywbuhkr6qabcwxjqax2@i4pcpk75jvaa
Backpatch-through: 16
2025-03-19 09:04:09 -04:00
Peter Eisentraut
190dc27998 Update a code comment
The comment explained that ALTER TABLE ADD CONSTRAINT USING INDEX is
only supported with a btree index.  (This is not being changed.)  The
reason is to keep upgrades robust, as explained there.  The other part
of the comment, that btree is the only unique index kind anyway, is
somewhat less true as we're trying to enable unique indexes other than
btree, and it's irrelevant to this check.  There is a check for
indisunique earlier already.  So just remove this part of the comment.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-03-19 10:39:06 +01:00
Peter Eisentraut
4f7f7b0375 extension_control_path
The new GUC extension_control_path specifies a path to look for
extension control files.  The default value is $system, which looks in
the compiled-in location, as before.

The path search uses the same code and works in the same way as
dynamic_library_path.

Some use cases of this are: (1) testing extensions during package
builds, (2) installing extensions outside security-restricted
containers like Python.app (on macOS), (3) adding extensions to
PostgreSQL running in a Kubernetes environment using operators such as
CloudNativePG without having to rebuild the base image for each new
extension.

There is also a tweak in Makefile.global so that it is possible to
install extensions using PGXS into an different directory than the
default, using 'make install prefix=/else/where'.  This previously
only worked when specifying the subdirectories, like 'make install
datadir=/else/where/share pkglibdir=/else/where/lib', for purely
implementation reasons.  (Of course, without the path feature,
installing elsewhere was rarely useful.)

Author: Peter Eisentraut <peter@eisentraut.org>
Co-authored-by: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: David E. Wheeler <david@justatheory.com>
Reviewed-by: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com>
Reviewed-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Reviewed-by: Niccolò Fei <niccolo.fei@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E7C7BFFB-8857-48D4-A71F-88B359FADCFD@justatheory.com
2025-03-19 07:03:20 +01:00
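
A hedged sketch; '$system' keeps the compiled-in location in the search
path, the example directory is hypothetical, and whether a reload (as
opposed to new sessions only) is enough is an assumption here:

    ALTER SYSTEM SET extension_control_path = '/opt/pg_ext/share/extension:$system';
    SELECT pg_reload_conf();
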
Michael Paquier
2cce0fe440 psql: Allow queries terminated by semicolons while in pipeline mode
Currently, the only way to pipe queries in an ongoing pipeline (in a
\startpipeline block) is to leverage the meta-commands able to create
extended queries such as \bind, \parse or \bind_named.

While this is good enough for testing the backend with pipelines, it has
been mentioned that it can also be very useful to allow queries
terminated by semicolons to be appended to a pipeline.  For example, it
would be possible to migrate existing psql scripts to use pipelines by
just adding a set of \startpipeline and \endpipeline meta-commands,
making such scripts more efficient.

Making this change turns out to be simple in psql: when pipeline mode
is active, queries terminated by semicolons are executed through
PQsendQueryParams() with no parameters, instead of the default
PQsendQuery(), the same way pgbench does.  \watch is still forbidden
while in a pipeline, as it expects its results to be processed
synchronously.

The large portion of this commit consists in providing more test
coverage, with mixes of extended queries appended in a pipeline by \bind
and friends, and queries terminated by semicolons.

This improvement has been suggested by Daniel Vérité.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/d67b9c19-d009-4a50-8020-1a0ea92366a1@manitou-mail.org
2025-03-19 13:34:59 +09:00
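
A minimal psql sketch of the new capability; previously the two queries
below would have required \bind or \parse/\bind_named:

    \startpipeline
    SELECT 1;
    SELECT 2;
    \endpipeline
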
Thomas Munro
0b53c08677 Fix compiler warning for commit 434dbf69.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
2025-03-19 17:26:16 +13:00
Thomas Munro
1cf4c56480 oauth: Simplify copy of PGoauthBearerRequest
Follow-up to 03366b61d. Since there are no more const members in the
PGoauthBearerRequest struct, the previous memcpy() can be replaced with
simple assignment.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/p4bd7mn6dxr2zdak74abocyltpfdxif4pxqzixqpxpetjwt34h%40qc6jgfmoddvq
2025-03-19 16:59:25 +13:00
Thomas Munro
873c0fd678 oauth: Improve validator docs on interruptibility
Andres pointed out that EINTR handling is inadequate for real-world use
cases. Direct module writers to our wait APIs instead.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/p4bd7mn6dxr2zdak74abocyltpfdxif4pxqzixqpxpetjwt34h%40qc6jgfmoddvq
2025-03-19 16:58:06 +13:00
Thomas Munro
d7e40845f9 oauth: Disallow synchronous DNS in libcurl
There is concern that a blocking DNS lookup in libpq could stall a
backend process (say, via FDW). Since there's currently no strong
evidence that synchronous DNS is a popular option, disallow it entirely
rather than warning at configure time. We can revisit if anyone
complains.

Per query from Andres Freund.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/p4bd7mn6dxr2zdak74abocyltpfdxif4pxqzixqpxpetjwt34h%40qc6jgfmoddvq
2025-03-19 16:56:19 +13:00
Thomas Munro
434dbf6907 oauth: Fix postcondition for set_timer on macOS
On macOS, readding an EVFILT_TIMER to a kqueue does not appear to clear
out previously queued timer events, so checks for timer expiration do
not work correctly during token retrieval. Switching to IPv4-only
communication exposes the problem, because libcurl is no longer clearing
out other timeouts related to Happy Eyeballs dual-stack handling.

Fully remove and re-register the kqueue timer events during each call to
set_timer(), to clear out any stale expirations.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/CAOYmi%2Bn4EDOOUL27_OqYT2-F2rS6S%2B3mK-ppWb2Ec92UEoUbYA%40mail.gmail.com
2025-03-19 16:45:01 +13:00
Thomas Munro
8d9d5843b5 oauth: Use IPv4-only issuer in oauth_validator tests
The test authorization server implemented in oauth_server.py does not
listen on IPv6. Most of the time, libcurl happily falls back to IPv4
after failing its initial connection, but on NetBSD, something is
consistently showing up on the unreserved IPv6 port and causing a test
failure.

Rather than deal with dual-stack details across all test platforms,
change the issuer to enforce the use of IPv4 only. (This elicits more
punishing timeout behavior from libcurl, so it's a useful change from
the testing perspective as well.)

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Reported-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CAOYmi%2Bn4EDOOUL27_OqYT2-F2rS6S%2B3mK-ppWb2Ec92UEoUbYA%40mail.gmail.com
2025-03-19 16:45:01 +13:00
Amit Langote
28317de723 Ensure first ModifyTable rel initialized if all are pruned
Commit cbc127917e introduced tracking of unpruned relids to avoid
processing pruned relations, and changed ExecInitModifyTable() to
initialize only unpruned result relations. As a result, MERGE
statements that prune all target partitions can now lead to crashes
or incorrect behavior during execution.

The crash occurs because some executor code paths rely on
ModifyTableState.resultRelInfo[0] being present and initialized,
even when no result relations remain after pruning. For example,
ExecMerge() and ExecMergeNotMatched() use the first resultRelInfo
to determine the appropriate action. Similarly,
ExecInitPartitionInfo() assumes that at least one result relation
exists.

To preserve these assumptions, ExecInitModifyTable() now includes the
first result relation in the initialized result relation list if all
result relations for that ModifyTable were pruned. To enable that,
ExecDoInitialPruning() ensures the first relation is locked if it was
pruned and locking is necessary.

To support this exception to the pruning logic, PlannedStmt now
includes a list of RT indexes identifying the first result relation
of each ModifyTable node in the plan. This allows
ExecDoInitialPruning() to check whether each such relation was
pruned and, if so, lock it if necessary.

Bug: #18830
Reported-by: Robins Tharakan <tharakan@gmail.com>
Diagnosed-by: Tender Wang <tndrwang@gmail.com>
Diagnosed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Co-authored-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/18830-1f31ea1dc930d444%40postgresql.org
2025-03-19 12:14:24 +09:00
Thomas Munro
06fb5612c9 Increase io_combine_limit range to 1MB.
The default of 128kB is unchanged, but the upper limit is changed from
32 blocks to 128 blocks, unless the operating system's IOV_MAX is too
low.  Some other RDBMSes seem to cap their multi-block buffer pool I/O
around this number, and it seems useful to allow experimentation.

The concrete change is to our definition of PG_IOV_MAX, which provides
the maximum for io_combine_limit and io_max_combine_limit.  It also
affects a couple of other places that work with arrays of struct iovec
or smaller objects on the stack, so we still don't want to use the
system IOV_MAX directly without a clamp: it is not under our control and
likely to be 1024.  128 seems acceptable for our current usage.

For Windows, we can't use real scatter/gather yet, so we continue to
define our own IOV_MAX value of 16 and emulate preadv()/pwritev() with
loops.  Someone would need to research the trade-offs of raising that
number.

NB if trying to see this working: you might temporarily need to hack
BAS_BULKREAD to be bigger, since otherwise the obvious way of "a very
big SELECT" is limited by that for now.

Suggested-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CA%2BhUKG%2B2T9p-%2BzM6Eeou-RAJjTML6eit1qn26f9twznX59qtCA%40mail.gmail.com
2025-03-19 15:40:35 +13:00
Thomas Munro
10f6646847 Introduce io_max_combine_limit.
The existing io_combine_limit can be changed by users.  The new
io_max_combine_limit is fixed at server startup time, and functions as a
silent clamp on the user setting.  That in itself is probably quite
useful, but the primary motivation is:

aio_init.c allocates shared memory for all asynchronous IOs including
some per-block data, and we didn't want to waste memory you'd never use
by assuming they could be up to PG_IOV_MAX.  This commit already halves
the size of 'AioHandleIov' and 'AioHandleData'.  A follow-up commit can
now expand PG_IOV_MAX without affecting that.

Since our GUC system doesn't support dependencies or cross-checks
between GUCs, the user-settable one now assigns a "raw" value to
io_combine_limit_guc, and the lower of io_combine_limit_guc and
io_max_combine_limit is maintained in io_combine_limit.

Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Discussion: https://postgr.es/m/CA%2BhUKG%2B2T9p-%2BzM6Eeou-RAJjTML6eit1qn26f9twznX59qtCA%40mail.gmail.com
2025-03-19 15:23:54 +13:00
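
A minimal sketch of how the two settings above interact; the parameter values
are illustrative only:

    -- io_max_combine_limit is fixed at server start and silently caps the
    -- user-settable io_combine_limit.
    SHOW io_max_combine_limit;
    SET io_combine_limit = '512kB';  -- effective value is the lower of the two
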
Michael Paquier
17d8bba6da Fix copy-paste error related to the autovacuum launcher in pgstat_io.c
Autovacuum launchers perform no WAL IO reads, but pgstat_tracks_io_op()
was tracking them as an allowed combination for the "init" and "normal"
contexts.

This caused the "read", "read_bytes" and "read_time" attributes of
pg_stat_io to show zeros for the autovacuum launcher rather than NULL.
NULL means that a combination of IO object, IO context and IO operation
has no meaning for a backend type.  Zero, by contrast, implies that the
combination is relevant and that WAL reads are possible in an autovacuum
launcher, which is not the case.

Copy-pasto introduced in a051e71e28a1.

Author: Ranier Vilela <ranier.vf@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CAEudQAopEMAPiUqE7BvDV+x2fUPmKmb9RrsaoDR+hhQzLKg4PQ@mail.gmail.com
2025-03-19 08:52:10 +09:00
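
A quick way to observe the fix from SQL (only the WAL read attributes matter
here, so the column list is left unspecified):

    -- After the fix, the WAL read statistics for the autovacuum launcher
    -- show up as NULL (combination not applicable) rather than zero.
    SELECT * FROM pg_stat_io WHERE backend_type = 'autovacuum launcher';
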
Masahiko Sawada
f4290f20dd Fix assertion failure in parallel vacuum with minimal maintenance_work_mem setting.
bbf668d66fbf lowered the minimum value of maintenance_work_mem to
64kB. However, in parallel vacuum cases, since the initial underlying
DSA size is 256kB, it attempts to perform a cycle of index vacuuming
and table vacuuming with an empty TID store, resulting in an assertion
failure.

This commit ensures that at least one page is processed before index
vacuuming and table vacuuming begins.

Backpatch to 17, where the minimum maintenance_work_mem value was
lowered.

Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAD21AoCEAmbkkXSKbj4dB+5pJDRL4ZHxrCiLBgES_g_g8mVi1Q@mail.gmail.com
Backpatch-through: 17
2025-03-18 16:37:02 -07:00
Michael Paquier
6d3ea48ff1 Optimize check for pending backend IO stats
This commit changes the backend stats code so that it relies on a single
boolean rather than repeated checks based on pg_memory_is_all_zeros(),
making the code cheaper should PgStat_PendingIO get bigger in size.

The frequency of backend stats reports is not a bottleneck, but there is
no reason to not make that cheaper, and the logic is simple as the only
entry points updating backend IO stats are pgstat_count_backend_io_op()
and pgstat_count_backend_io_op_time().

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Discussion: https://postgr.es/m/Z8WYf1jyy4MwOveQ@ip-10-97-1-34.eu-west-3.compute.internal
2025-03-19 08:03:06 +09:00
Nathan Bossart
7fb418f020 Add commit 796bdda484 to .git-blame-ignore-revs. 2025-03-18 17:00:23 -05:00
Nathan Bossart
c9d502eb68 Update guidance for running vacuumdb after pg_upgrade.
Now that pg_upgrade can carry over most optimizer statistics, we
should recommend using vacuumdb's new --missing-stats-only option
to only analyze relations that are missing statistics.

Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/Z5O1bpcwDrMgyrYy%40nathan
2025-03-18 16:32:56 -05:00
Nathan Bossart
edba754f05 vacuumdb: Add option for analyzing only relations missing stats.
This commit adds a new --missing-stats-only option that can be used
with --analyze-only or --analyze-in-stages.  When this option is
specified, vacuumdb will analyze a relation if it lacks any
statistics for a column, expression index, or extended statistics
object.  This new option is primarily intended for use after
pg_upgrade (since it can now retain most optimizer statistics), but
it might be useful in other situations, too.

Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/Z5O1bpcwDrMgyrYy%40nathan
2025-03-18 16:32:56 -05:00
Nathan Bossart
9c03c8d187 vacuumdb: Teach vacuum_one_database() to reuse query results.
Presently, each call to vacuum_one_database() queries the catalogs
to retrieve the list of tables to process.  A follow-up commit will
add a "missing stats only" feature to --analyze-in-stages, which
requires saving the catalog query results (since tables without
statistics will have them after the first stage).  This commit adds
a new parameter to vacuum_one_database() that specifies either a
previously-retrieved list or a place to return the catalog query
results.  Note that nothing uses this new parameter yet.

Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/Z5O1bpcwDrMgyrYy%40nathan
2025-03-18 16:32:55 -05:00
Tom Lane
a6524105d2 Doc: manually break lines in wide UUID examples.
Buildfarm member crake has been complaining "WARNING: The contents of
fo:inline line 1 exceed the available area in the inline-progression
direction by 20500 millipoints. (See position 23808:106)" since
ba57dcfdc went in.  The other doc-building animals are not showing
this warning, and I don't see it on my RHEL8 workstation either, but
I was able to reproduce it on a Fedora 41 box.  So apparently this
is due to a recent-ish change in DocBook's line-breaking heuristics,
which caused it to cope less well with the UUIDs in these examples.
Put in some zero-width spaces to encourage the PDF toolchain to
break these lines in a better place.  (Only one of these examples
actually needs this today, but I marked up all three to ensure that
they get wrapped in a consistent way.)
2025-03-18 15:35:13 -04:00
Andres Freund
499faf9063 smgr: Make SMgrRelation initialization safer against errors
In case the smgr_open callback failed, the ->pincount field would not be
initialized and the relation would not be put onto the unpinned_relns list.

This buglet was introduced in 21d9c3ee4ef7, in 17.

Discussion: https://postgr.es/m/3vae7l5ozvqtxmd7rr7zaeq3qkuipz365u3rtim5t5wdkr6f4g@vkgf2fogjirl
Backpatch-through: 17
2025-03-18 14:04:44 -04:00
Álvaro Herrera
62d712ecfd Introduce squashing of constant lists in query jumbling
pg_stat_statements produces multiple entries for queries like
    SELECT something FROM table WHERE col IN (1, 2, 3, ...)

depending on the number of parameters, because every element of
ArrayExpr is individually jumbled.  Most of the time that's undesirable,
especially if the list becomes too large.

Fix this by introducing a new GUC query_id_squash_values which modifies
the node jumbling code to only consider the first and last element of a
list of constants, rather than each list element individually.  This
affects both the query_id generated by query jumbling and
pg_stat_statements query normalization, so that it suppresses printing of
the individual elements of such a list.

The default value is off, meaning the previous behavior is maintained.

Author: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Sergey Dudoladov (mysterious, off-list)
Reviewed-by: David Geier <geidav.pg@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Sutou Kouhei <kou@clear-code.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Marcos Pegoraro <marcos@f10.com.br>
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Tested-by: Yasuo Honda <yasuo.honda@gmail.com>
Tested-by: Sergei Kornilov <sk@zsrv.org>
Tested-by: Maciek Sakrejda <m.sakrejda@gmail.com>
Tested-by: Chengxi Sun <sunchengxi@highgo.com>
Tested-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Discussion: https://postgr.es/m/CA+q6zcWtUbT_Sxj0V6HY6EZ89uv5wuG5aefpe_9n0Jr3VwntFg@mail.gmail.com
2025-03-18 18:56:11 +01:00
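
A sketch of the intended effect, assuming pg_stat_statements is installed;
the table name is made up:

    SET query_id_squash_values = on;
    SELECT * FROM orders WHERE id IN (1, 2, 3, 4, 5);
    SELECT * FROM orders WHERE id IN (42, 43);
    -- With squashing enabled, both statements are expected to collapse into
    -- a single normalized pg_stat_statements entry.
    SELECT calls, query FROM pg_stat_statements WHERE query LIKE '%orders%';
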
Andres Freund
247ce06b88 aio: Add io_method=worker
The previous commit introduced the infrastructure to start io_workers. This
commit actually makes the workers execute IOs.

IO workers consume IOs from a shared memory submission queue, run traditional
synchronous system calls, and perform the shared completion handling
immediately.  Client code submits most requests by pushing IOs into the
submission queue, and waits (if necessary) using condition variables.  Some
IOs cannot be performed in another process due to lack of infrastructure for
reopening the file, and must be processed synchronously by the client code when
submitted.

For now the default io_method is changed to "worker". We should re-evaluate
that around beta1; we might want to be careful and set the default to "sync"
for 18.

Reviewed-by: Noah Misch <noah@leadboat.com>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Co-authored-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de
Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
2025-03-18 11:54:01 -04:00
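
A minimal check of the new default; the worker-count setting is introduced by
the infrastructure commit below, and its name (io_workers) is assumed here:

    SHOW io_method;   -- "worker" as of this commit; changeable only at startup
    SHOW io_workers;  -- assumed name of the PGC_SIGHUP worker-count setting
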
Andres Freund
55b454d0e1 aio: Infrastructure for io_method=worker
This commit contains the basic, system-wide, infrastructure for
io_method=worker. It does not yet actually execute IO; this commit just
provides the infrastructure for running IO workers, kept separate for easier
review.

The number of IO workers can be adjusted with a PGC_SIGHUP GUC. Eventually
we'd like to make the number of workers dynamically scale up/down based on the
current "IO load".

To allow the number of IO workers to be increased without a restart, we need
to reserve PGPROC entries for the workers unconditionally. This has been
judged to be worth the cost. If it turns out to be problematic, we can
introduce a PGC_POSTMASTER GUC to control the maximum number.

As io workers might be needed during shutdown, e.g. for AIO during the
shutdown checkpoint, a new PMState phase is added. IO workers are shut down
after the shutdown checkpoint has been performed and walsender/archiver have
shut down, but before the checkpointer itself shuts down. See also
87a6690cc69.

Updates PGSTAT_FILE_FORMAT_ID due to the addition of a new BackendType.

Reviewed-by: Noah Misch <noah@leadboat.com>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Co-authored-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de
Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
2025-03-18 11:54:01 -04:00
Jeff Davis
549ea06e42 Fix headerscheck warning.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/93731.1742310701@sss.pgh.pa.us
2025-03-18 08:37:07 -07:00
Tom Lane
4078da6c47 Silence compiler warning.
Assorted buildfarm members are complaining about "'process_list' may
be used uninitialized in this function" since f76892c9f, presumably
because they don't trust that the switch case labels are exhaustive.
We can silence that by initializing the variable to NULL.  Should
a switch fall-through actually happen, we'll get SIGSEGV at the
first use, which is as good as an Assert.
2025-03-18 10:54:10 -04:00
Daniel Gustafsson
daa02c6bd9 Add X25519 to the default set of curves
Since many clients default to the X25519 curve in the TLS handshake,
the fact that the server by default doesn't support it causes an extra
roundtrip for each TLS connection.  By adding multiple curves, which
is supported since 3d1ef3a15c3eb68da, we can reduce the risk of extra
roundtrips.

Author: Daniel Gustafsson <daniel@yesql.se>
Co-authored-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reported-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/20240616234612.6cslu7nqexquvwj7@awork3.anarazel.de
2025-03-18 15:26:27 +01:00
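
For reference, a hedged sketch of overriding the group list; the parameter
name ssl_groups is an assumption here (this commit only changes the default
list of curves):

    ALTER SYSTEM SET ssl_groups = 'x25519:prime256v1';  -- illustrative value
    SELECT pg_reload_conf();  -- TLS settings are applied on reload
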
Robert Haas
4fd02bf7cf Add some new hooks so extensions can add details to EXPLAIN.
Specifically, add a per-node hook that is called after the per-node
information has been displayed but before we display children, and a
per-query hook that is called after existing query-level information
is printed. This assumes that extension-added information should
always go at the end rather than the beginning or the middle, but
that seems like an acceptable limitation for simplicity. It also
assumes that extensions will only want to add information, not remove
or reformat existing details; those also seem like acceptable
restrictions, at least for now.

If multiple EXPLAIN extensions are used, the order in which any
additional details are printed is likely to depend on the order in
which the modules are loaded. That seems OK, since the user may
have opinions about the order in which output should appear, and the
extension author can't really know whether their stuff is more or
less important to a particular user than some other extension.

Discussion: http://postgr.es/m/CA+TgmoYSzg58hPuBmei46o8D3SKX+SZoO4K_aGQGwiRzvRApLg@mail.gmail.com
Reviewed-by: Srinath Reddy <srinath2133@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
2025-03-18 09:28:01 -04:00
Álvaro Herrera
f76892c9ff Simplify reindexdb coding
get_parallel_object_list() was trying to serve two masters, and it was
doing a bad job at both.  In particular, it treated the given user_list
as an output argument, but only sometimes.  This was confusing, and the
two paths through it didn't really have all that much in common, so the
complexity wasn't buying us much.  Split it in two:
get_parallel_tables_list() handles the straightforward cases for
schemas, databases and tables, takes one list as argument and returns
another list.

A new function get_parallel_tabidx_list() handles the case for indexes.
This takes a list as argument and outputs two lists, just like
get_parallel_object_list used to do, but now the API is clearer (IMO
anyway).  Another difference is that accompanying the list of indexes
now we have a list of tables as an OID list rather than a
fully-qualified table name list.  This makes some comparisons easier,
and we don't really need the names of the tables, just their OIDs.
(This requires atooid, which requires <stdlib.h>).

Author: Ranier Vilela <ranier.vf@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CAEudQArfqr0-s0VVPSEh=0kgOgBJvFNdGW=xSL5rBcr0WDMQYQ@mail.gmail.com
2025-03-18 14:21:26 +01:00
Melanie Plageman
cc6be07ebd Increase default maintenance_io_concurrency to 16
Since its introduction in fc34b0d9de27a, the default
maintenance_io_concurrency has been larger than the default
effective_io_concurrency. maintenance_io_concurrency primarily
controlled prefetching done on behalf of the whole system, for
operations like recovery. Therefore it makes sense for it to have a
value equal to or greater than effective_io_concurrency, which controls
I/O concurrency for reading a relation in a bitmap heap scan.

ff79b5b2ab increased effective_io_concurrency to 16, so we'll increase
maintenance_io_concurrency as well. For now, though, we'll keep the
defaults of effective_io_concurrency and maintenance_io_concurrency
equal to one another (16).

On fast, high IOPs systems, significantly higher values of
maintenance_io_concurrency are observably beneficial [1]. However, such
values would flood low IOPs systems and increase overall system I/O
latency.

It is worth mentioning that since 9256822608f and c3e775e608f,
maintenance_io_concurrency also controls the I/O concurrency of each
vacuum worker. Since many autovacuum workers may be simultaneously
issuing I/Os, we want to keep maintenance_io_concurrency appropriately
conservative.

[1] https://postgr.es/m/c5d52837-6256-0556-ac8c-d6d3d558820a%40enterprisedb.com

Suggested-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Discussion: https://postgr.es/m/CAKZiRmxdHQaU%2B2Zpe6d%3Dx%3D0vigJ1sfWwwVYLJAf%3Dud_wQ_VcUw%40mail.gmail.com
2025-03-18 09:08:10 -04:00
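
A small illustration of the relationship described above; the value 64 is only
an example for fast, high-IOPS storage:

    SHOW effective_io_concurrency;        -- 16 by default after ff79b5b2ab
    SHOW maintenance_io_concurrency;      -- now raised to match (16)
    SET maintenance_io_concurrency = 64;  -- illustrative, fast storage only
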
Robert Haas
796bdda484 Fix indentation again.
Because somehow I manage to keep forgetting this.
2025-03-18 09:02:36 -04:00
Robert Haas
c65bc2e1d1 Make it possible for loadable modules to add EXPLAIN options.
Modules can use RegisterExtensionExplainOption to register new
EXPLAIN options, and GetExplainExtensionId, GetExplainExtensionState,
and SetExplainExtensionState to store related state inside the
ExplainState object.

Since this substantially increases the amount of code that needs
to handle ExplainState-related tasks, move a few bits of existing
code to a new file explain_state.c and add the rest of this
infrastructure there.

See the comments at the top of explain_state.c for further
explanation of how this mechanism works.

This does not yet provide a way for such options to do anything
useful. The intention is that we'll add hooks for that purpose in a
separate commit.

Discussion: http://postgr.es/m/CA+TgmoYSzg58hPuBmei46o8D3SKX+SZoO4K_aGQGwiRzvRApLg@mail.gmail.com
Reviewed-by: Srinath Reddy <srinath2133@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
2025-03-18 08:41:12 -04:00
Peter Eisentraut
9d6db8bec1 Allow non-btree unique indexes for matviews
We were rejecting non-btree indexes in some cases owing to the
inability to determine the equality operators for other index AMs;
that problem no longer exists, because we can look up the equality
operator using COMPARE_EQ.

Stop rejecting these indexes, but instead rely on all unique indexes
having equality operators.  Unique indexes must have equality
operators.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-03-18 11:29:15 +01:00
Peter Eisentraut
f278e1fe30 Allow non-btree unique indexes for partition keys
We were rejecting non-btree indexes in some cases owing to the
inability to determine the equality operators for other index AMs;
that problem no longer exists, because we can look up the equality
operator using COMPARE_EQ.  The problem of not knowing the strategy
number for equality in other index AMs is already resolved.

Stop rejecting the indexes upfront, and instead reject any for which
the equality operator lookup fails.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-03-18 11:25:36 +01:00
Peter Eisentraut
7317e64126 Add some opfamily support functions to lsyscache.c
Add get_opfamily_method() and get_opfamily_member_for_cmptype() in
lsyscache.c.  No callers yet, but we'll add some soon.  This is part
of generalizing some parts of the code away from having btree
hardcoded and use CompareType instead.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-03-18 11:17:43 +01:00
Amit Kapila
122a9af5de Fix typo.
Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CALDaNm1KqJ0VFfDJRPbfYi9Shz6LHFEE-Ckn+eqsePfKhebv9w@mail.gmail.com
2025-03-18 14:18:09 +05:30
Amit Kapila
01e27aab05 Use correct variable name in publicationcmds.c.
subid was used in a few places for the publication ID in publicationcmds.c/.h.

Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CALDaNm1KqJ0VFfDJRPbfYi9Shz6LHFEE-Ckn+eqsePfKhebv9w@mail.gmail.com
2025-03-18 14:06:51 +05:30
Masahiko Sawada
c462b054ba Fix the test 005_char_signedness.
pg_upgrade test 005_char_signedness was leaving files like
delete_old_cluster.sh in the source directory for VPATH and meson
builds. The fix is to change the directory to tmp_check before running
the test.

Reported-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: http://postgr.es/m/CA+TgmoYg5e4oznn0XGoJ3+mceG1qe_JJt34rF2JLwvGS5T1hgQ@mail.gmail.com
2025-03-17 21:34:10 -07:00
Michael Paquier
17caf66445 psql: Add \sendpipeline to send query buffers while in a pipeline
In the initial pipeline support for psql added in 41625ab8ea3d, \g was
used as the way to push extended query into an ongoing pipeline.  \gx
was blocked.

These two meta-commands have format-related options that can be applied
when fetching a query result (expanded, etc.).  As the results of a
pipeline are fetched asynchronously, not at the moment of the
meta-command execution but at the moment of a \getresults or a
\endpipeline, authorizing \g while blocking \gx leads to a confusing
implementation, making one think that psql should be smart enough to
remember the output format options defined from the time when \g or \gx
were executed.  Doing so would lead to more code complications when
retrieving a batch of results.  There is an extra argument other than
simplicity here: the output format options defined at the point of a
\getresults or a \endpipeline execution should be what affects the output
format for a batch of results.

To avoid any confusion, we have settled to the introduction of a new
meta-command called \sendpipeline, replacing \g when within a pipeline.
An advantage of this design is that it is possible to add new options
specific to pipelines when sending a query buffer, independent of \g
and \gx, should it prove to be necessary.

Most of the changes of this commit happen in the regression tests, where
\g is replaced by \sendpipeline.  More tests are added to check that \g
is not allowed.

Per discussion between the author, Daniel Vérité and me.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/ad4b9f1a-f7fe-4ab8-8546-90754726d0be@manitou-mail.org
2025-03-18 09:41:21 +09:00
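
A sketch of the resulting psql usage, assuming the pipeline is opened and
closed with the \startpipeline and \endpipeline meta-commands from the earlier
pipeline support:

    \startpipeline
    SELECT 1 \sendpipeline
    SELECT 2 \sendpipeline
    \endpipeline
    -- \g is rejected inside a pipeline; output format options take effect
    -- at \getresults or \endpipeline time.
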
Andres Freund
da7226993f aio: Add core asynchronous I/O infrastructure
The main motivations to use AIO in PostgreSQL are:

a) Reduce the time spent waiting for IO by issuing IO sufficiently early.

   In a few places we have approximated this using posix_fadvise() based
   prefetching, but that is fairly limited (no completion feedback, double the
   syscalls, only works with buffered IO, only works on some OSs).

b) Allow the use of Direct-I/O (DIO).

   DIO can offload most of the work for IO to hardware and thus increase
   throughput / decrease CPU utilization, as well as reduce latency.  While we
   have gained the ability to configure DIO in d4e71df6, it is not yet usable
   for real world workloads, as every IO is executed synchronously.

For portability, the new AIO infrastructure allows AIO to be implemented using
different methods. The choice of the AIO method is controlled by the new
io_method GUC. As of this commit, the only implemented method is "sync",
i.e. AIO is not actually executed asynchronously. The "sync" method exists to
allow bypassing most of the new code initially.

Subsequent commits will introduce additional IO methods, including a
cross-platform method implemented using worker processes and a Linux-specific
method using io_uring.

To allow different parts of postgres to use AIO, the core AIO infrastructure
does not need to know what kind of files it is operating on. The necessary
behavioral differences for different files are abstracted as "AIO
Targets". One example target would be smgr. For boring portability reasons,
all targets currently need to be added to an array in aio_target.c.  This
commit does not implement any AIO targets, just the infrastructure for
them. The smgr target will be added in a later commit.

Completion (and other events) of IOs for one type of file (i.e. one AIO
target) need to be reacted to differently, based on the IO operation and the
callsite. This is made possible by callbacks that can be registered on
IOs. E.g. an smgr read into a local buffer does not need to update the
corresponding BufferDesc (as there is none), but a read into shared buffers
does.  This commit does not contain any callbacks, they will be added in
subsequent commits.

For now the AIO infrastructure only understands READV and WRITEV operations,
but it is expected that more operations will be added. E.g. fsync/fdatasync,
flush_range and network operations like send/recv.

As of this commit, nothing uses the AIO infrastructure. Later commits will add
an smgr target, md.c and bufmgr.c callbacks and then finally use AIO for
read_stream.c IO, which, in one fell swoop, will convert all read stream users
to AIO.

The goal is to use AIO in many more places. There are patches to use AIO for
checkpointer and bgwriter that are reasonably close to being ready. There also
are prototypes to use it for WAL, relation extension, backend writes and many
more. Those prototypes were important to ensure the design of the AIO
subsystem is not too limiting (e.g. WAL writes need to happen in critical
sections, which influenced a lot of the design).

A future commit will add an AIO README explaining the AIO architecture and how
to use the AIO subsystem. The README is added later, as it references details
only added in later commits.

Many many more people than the folks named below have contributed with
feedback, work on semi-independent patches etc. E.g. various folks have
contributed patches to use the read stream infrastructure (added by Thomas in
b5a9b18cd0b) in more places. Similarly, a *lot* of folks have contributed to
the CI infrastructure, which I had started to work on to make adding AIO
feasible.

Some of the work by contributors has gone into the "v1" prototype of AIO,
which heavily influenced the current design of the AIO subsystem. None of the
code from that directly survives, but without the prototype, the current
version of the AIO infrastructure would not exist.

Similarly, the reviewers below have not necessarily looked at the current
design or the whole infrastructure, but have provided very valuable input. I
am to blame for problems, not they.

Author: Andres Freund <andres@anarazel.de>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Co-authored-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Co-authored-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Antonin Houska <ah@cybertec.at>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de
Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
2025-03-17 18:51:33 -04:00
Andres Freund
02844012b3 aio: Basic subsystem initialization
This commit just does the minimal wiring up of the AIO subsystem, added in the
next commit, to the rest of the system. The next commit contains more details
about motivation and architecture.

This commit is kept separate to make it easier to review, separating the
changes across the tree, from the implementation of the new subsystem.

We discussed squashing this commit with the main commit before merging AIO,
but there has been a mild preference for keeping it separate.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
2025-03-17 18:51:33 -04:00
Nathan Bossart
65db3963ae Add commit 203c1b4cc4 to .git-blame-ignore-revs. 2025-03-17 15:58:02 -05:00
Robert Haas
203c1b4cc4 Fix indentation.
Commit 99aeb84703177308c1541e2d11c09fdc59acb724 wasn't fully
reindented prior to commit.
2025-03-17 16:06:17 -04:00
Nathan Bossart
7e05df430b pg_upgrade: Remove some dead code.
Since commit e469f0aaf3, tablespace_suffix can't be empty.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/Z9hc3mkYFKR56Xof%40nathan
2025-03-17 13:18:14 -05:00
Andres Freund
1a22a8a0f1 tests: Expand temp table tests to some pin related matters
Added tests:
- recovery from running out of unpinned local buffers
- that we don't run out of unpinned buffers due to read stream (only recently
  fixed, in 92fc6856cb4)
- temp tables can't be dropped while in use by cursors

Discussion: weskknhckugbdm2yt7sa2uq53xlsax67gcdkac34sanb7qpd3p@hcc2wadao5wy
Discussion: https://postgr.es/m/ge6nsuddurhpmll3xj22vucvqwp4agqz6ndtcf2mhyeydzarst@l75dman5x53p
2025-03-17 14:12:44 -04:00
Robert Haas
99aeb84703 pg_combinebackup: Add -k, --link option.
This is similar to pg_upgrade's --link option, except that here we won't
typically be able to use it for every input file: sometimes we will need
to reconstruct a complete backup from blocks stored in different files.
However, when a whole file does need to be copied, we can use an
optimized copying strategy: see the existing --clone and
--copy-file-range options and the code to use CopyFile() on Windows.
This commit adds a new strategy: add a hard link to an existing file.
Making a hard link doesn't actually copy anything, but it makes sense
for the code to treat it as doing so.

This is useful when the input directories are merely staging directories
that will be removed once the restore is complete. In such cases, there
is no need to actually copy the data, and making a bunch of new hard
links can be very quick. However, it would be quite dangerous to use it
if the input directories might later be reused for any other purpose,
since starting postgres on the output directory would destructively
modify the input directories. For that reason, using this new option
causes pg_combinebackup to emit a warning about the danger involved.

Author: Israel Barth Rubio <barthisrael@gmail.com>
Co-authored-by: Robert Haas <robertmhaas@gmail.com> (cosmetic changes)
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Discussion: http://postgr.es/m/CA+TgmoaEFsYHsMefNaNkU=2SnMRufKE3eVJxvAaX=OWgcnPmPg@mail.gmail.com
2025-03-17 14:03:14 -04:00
Tom Lane
ed762e9425 Unify wording of user-facing "row security" messages.
Row-level security is mostly referred to as "row security" in
user-facing messages.  Commit cd3c45125 introduced one inconsistent
use of "row level security"; make that one match the rest.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20250317.135305.573764276033358827.horikyota.ntt@gmail.com
2025-03-17 12:53:50 -04:00
Michael Paquier
3943f5cff6 Fix inconsistent quoting for some options in TAP tests
This commit addresses some inconsistencies with how the options of some
routines from PostgreSQL/Test/ are written, mainly for init() and
init_from_backup() in Cluster.pm.  These are written as unquoted, except
in the locations updated here.

Changes extracted from a larger patch by the same author.

Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://postgr.es/m/87jz8rzf3h.fsf@wibble.ilmari.org
2025-03-17 14:07:12 +09:00
Michael Paquier
19c6e92b13 Apply more consistent style for command options in TAP tests
This commit reshapes the grammar of some commands to apply a more
consistent style across the board, following rules similar to
ce1b0f9da03e:
- Elimination of some pointless used-once variables.
- Use of long options, to self-document better the options used.
- Use of fat commas to link option names and their assigned values,
including redirections, so that perltidy can be tricked into putting them
together.

Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://postgr.es/m/87jz8rzf3h.fsf@wibble.ilmari.org
2025-03-17 12:42:23 +09:00
Michael Paquier
5721e5453e Revert "Add redo LSN to pgstats files"
This reverts commit b860848232aa, that was added as a prerequisite for
the support of pgstats data flush across checkpoints, linking a pgstats
file to a specific checkpoint redo LSN.

As reported, this is proving to be currently problematic when going
through a pg_upgrade, that does direct manipulations of the control file
in the new cluster.  The LSN stored in the pgstats file is not able to
cope with any changes done in the control file by pg_upgrade yet,
causing the pgstats file to be discarded when starting the new cluster
after overriding its redo LSN (one such manipulation is a `pg_resetwal -l`, where the new
cluster's start LSN is bumped by a hardcoded value of 8 segments, see
copy_xact_xlog_xid).

The least painful path going forward is likely going to be a refactor of
the pgstats code so that it is possible to read and write some of its data
with some routines in src/common/, so that pg_upgrade or pg_resetwal are
able to update its data.  The main point is that we are going to need a
LSN in the stats file should we make it written at checkpoint time and
not only as part of a shutdown sequence.  It is too late to dive into
these details for v18, so let's revert the change, and let's try to
figure out all the details in the next release cycle.  The pgstats file
is currently only written as part of a shutdown sequence, and its
contents are still lost on crash, same as older releases.

Bump PGSTAT_FILE_FORMAT_ID.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2563883.1741826489@sss.pgh.pa.us
2025-03-17 08:35:12 +09:00
Tom Lane
cd3c45125d pg_dump, pg_dumpall, pg_restore: Add --no-policies option.
Add --no-policies option to control row level security policy handling
in dump and restore operations. When this option is used, both CREATE
POLICY commands and ALTER TABLE ... ENABLE ROW LEVEL SECURITY commands
are excluded from dumps and skipped during restores.

This is useful in scenarios where policies need to be redefined in the
target system or when moving data between environments with different
security requirements.

Author: Nikolay Samokhvalov <nik@postgres.ai>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Reviewed-by: newtglobal postgresql_contributors <postgresql_contributors@newtglobalcorp.com>
Discussion: https://postgr.es/m/CAM527d8kG2qPKvbfJ=OYJkT7iRNd623Bk+m-a4ngm+nyHYsHog@mail.gmail.com
2025-03-16 18:08:15 -04:00
Tom Lane
4489044239 contrib/isn: Make weak mode a GUC setting, and fix related functions.
isn's weak mode used to be a simple static variable, settable only
via the isn_weak(boolean) function.  This wasn't optimal, as it meant
the setting neither respected transactions nor responded to RESET ALL.

This patch makes isn.weak a GUC parameter instead, so that
it acts like any other user-settable parameter.

The isn_weak() functions are retained for backwards compatibility.
But we must fix their volatility markings: they were marked IMMUTABLE
which is surely incorrect, and PARALLEL RESTRICTED which isn't right
for GUC-related functions either.  Mark isn_weak(boolean) as
VOLATILE and PARALLEL UNSAFE, matching set_config().  Mark isn_weak()
as STABLE and PARALLEL SAFE, matching current_setting().

Reported-by: Viktor Holmberg <v@viktorh.net>
Diagnosed-by: Daniel Gustafsson <daniel@yesql.se>
Author: Viktor Holmberg <v@viktorh.net>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/790bc1f9-74dc-4b50-94d2-8147315b1556@Spark
2025-03-16 13:45:48 -04:00
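
A brief sketch of the new GUC behavior; apart from the extension name, the
details below are illustrative:

    CREATE EXTENSION IF NOT EXISTS isn;
    SET isn.weak = on;   -- same effect as isn_weak(true), but transactional
    SHOW isn.weak;
    RESET isn.weak;      -- now honored, like any other user-settable GUC
    SELECT isn_weak();   -- kept for backwards compatibility; now STABLE
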
Alexander Korotkov
682c5be25c reindexdb: Fix the index-level REINDEX with multiple jobs
47f99a407d introduced a parallel index-level REINDEX.  The code was written
assuming that running run_reindex_command() with 'async == true' can schedule
a number of queries for a connection.  That's not true, and the second query
sent using run_reindex_command() will wait for the completion of the previous
one.

This commit fixes that by putting REINDEX commands for the same table into a
single query.

Also, this commit removes the 'async' argument from run_reindex_command(),
as its only call site always passes 'async == true'.

Reported-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/202503071820.j25zn3lo4hvn%40alvherre.pgsql
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Backpatch-through: 17
2025-03-16 13:29:15 +02:00
Michael Paquier
83e5763d4d pg_createsubscriber: Remove some code bloat in the atexit() callback
This commit adjusts some code added by e117cfb2f6c6 in the atexit()
callback of pg_createsubscriber.c, in charge of performing post-failure
cleanup actions.  The code loops over all the databases specified, and
it is changed here to rely on a single LogicalRepInfo for each database
rather than always using LogicalRepInfos, simplifying its logic.

Author: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/CAHut+PtdBSVi4iH7BObDVwDNVwOpn+H3fezOBdSTtENx+rhNMw@mail.gmail.com
2025-03-16 19:20:49 +09:00
Andres Freund
771ba90298 localbuf: Introduce StartLocalBufferIO()
To initiate IO on a shared buffer we have StartBufferIO(). For temporary table
buffers no similar function exists - likely because the code for that
currently is very simple due to the lack of concurrency.

However, the upcoming AIO support will make it possible to re-encounter a
local buffer, while the buffer already is the target of IO. In that case we
need to wait for already in-progress IO to complete. This commit makes it
easier to add the necessary code, by introducing StartLocalBufferIO().

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_b9anbWzEs5AAF9WCvcEVmgz-1AkHSQ-CLLy-p7WHzvFw@mail.gmail.com
2025-03-15 22:07:48 -04:00
Andres Freund
4b4d33b9ea localbuf: Introduce FlushLocalBuffer()
Previously we had two paths implementing writing out temporary table
buffers. For shared buffers, the logic for that is centralized in
FlushBuffer(). Introduce FlushLocalBuffer() to do the same for local buffers.

Besides being a nice cleanup on its own, it also makes an upcoming change
slightly easier.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_b9anbWzEs5AAF9WCvcEVmgz-1AkHSQ-CLLy-p7WHzvFw@mail.gmail.com
2025-03-15 22:07:48 -04:00
Andres Freund
dd6f2618f6 localbuf: Introduce TerminateLocalBufferIO()
Previously TerminateLocalBufferIO() was open-coded in multiple places, which
doesn't seem like a great idea. While TerminateLocalBufferIO() currently is
rather simple, an upcoming patch requires additional code to be added to
TerminateLocalBufferIO(), making this modification particularly worthwhile.

For some reason FlushRelationBuffers() previously cleared BM_JUST_DIRTIED,
even though that's never set for temporary buffers. This is not carried over
as part of this change.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_b9anbWzEs5AAF9WCvcEVmgz-1AkHSQ-CLLy-p7WHzvFw@mail.gmail.com
2025-03-15 22:07:48 -04:00
Andres Freund
0762a151b0 localbuf: Introduce InvalidateLocalBuffer()
Previously, there were three copies of this code, two of them
identical. There's no good reason for that.

This change is nice on its own, but the main motivation is the AIO patchset,
which needs to add extra checks to the deduplicated code, which of course is
easier if there is only one version.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_b9anbWzEs5AAF9WCvcEVmgz-1AkHSQ-CLLy-p7WHzvFw@mail.gmail.com
2025-03-15 22:07:48 -04:00
Andres Freund
fa6af9b25e localbuf: Fix dangerous coding pattern in GetLocalVictimBuffer()
If PinLocalBuffer() were to modify the buf_state, the buf_state in
GetLocalVictimBuffer() would be out of date. Currently that does not happen,
as PinLocalBuffer() only modifies the buf_state if adjust_usagecount=true and
GetLocalVictimBuffer() passes false.

However, it's easy to make this not the case anymore - it cost me a few hours
to debug the consequences.

The minimal fix would be to just refetch the buf_state after calling
PinLocalBuffer(), but the same danger exists in later parts of the
function. Instead, declare buf_state in the narrower scopes and re-read the
state in conditional branches.  Besides being safer, it also fits well with
an upcoming set of cleanup patches that move the contents of the conditional
branches in GetLocalVictimBuffer() into helper functions.

I "broke" this in 794f2594479.

Arguably this should be backpatched, but as the relevant functions are not
exported and there is no actual misbehaviour, I chose to not backpatch, at
least for now.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_b9anbWzEs5AAF9WCvcEVmgz-1AkHSQ-CLLy-p7WHzvFw@mail.gmail.com
2025-03-15 22:07:48 -04:00
Andrew Dunstan
5eabd91a83 Silence perl critic
Commit 27bdec06841 uses a loop variable that is not strictly local to
the loop. Perlcritic disapproves, and there's really no reason for it, as
the variable is not used outside the loop.

Per buildfarm animals koel and crake.
2025-03-15 17:41:54 -04:00
Jeff Davis
27bdec0684 Optimization for lower(), upper(), casefold() functions.
Improve performance and reduce table sizes for case mapping.

The main case mapping table stores only 16-bit offsets, which can be
used to look up the mapped code point in any of the case tables (fold,
lower, upper, or title case). Simple case pairs point to the same
offsets.

Generate a function in generate-unicode_case_table.pl that consists of
nested branches to test for specific codepoint ranges that determine
the offset in the main table.

Other approaches were considered, such as representing these ranges as
another structure (rather than branches in a generated function), or a
different approach such as a radix tree, or perfect hashing. The
author implemented and tested these alternatives and settled on the
generated branches.

Author: Alexander Borisov <lex.borisov@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/7cac7e66-9a3b-4e3f-a997-42aa0c401f80%40gmail.com
2025-03-15 13:00:50 -07:00
Melanie Plageman
c3953226a0 Remove table AM callback scan_bitmap_next_block
After pushing the bitmap iterator into table-AM specific code (as part
of making bitmap heap scan use the read stream API in 2b73a8cd33b7),
scan_bitmap_next_block() no longer returns the current block number.
Since scan_bitmap_next_block() isn't returning any relevant information
to bitmap table scan code, it makes more sense to get rid of it.

Now, bitmap table scan code only calls table_scan_bitmap_next_tuple(),
and the heap AM implementation of scan_bitmap_next_block() is a local
helper in heapam_handler.c.

Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/flat/CAAKRu_ZwCwWFeL_H3ia26bP2e7HiKLWt0ZmGXPVwPO6uXq0vaA%40mail.gmail.com
2025-03-15 10:37:46 -04:00
Melanie Plageman
2b73a8cd33 BitmapHeapScan uses the read stream API
Make Bitmap Heap Scan use the read stream API instead of invoking
ReadBuffer() for each block indicated by the bitmap.

The read stream API handles prefetching, so remove all of the explicit
prefetching from bitmap heap scan code.

Now, heap table AM implements a read stream callback which uses the
bitmap iterator to return the next required block to the read stream
code.

Tomas Vondra conducted extensive regression testing of this feature.
Andres Freund, Thomas Munro, and I analyzed regressions and Thomas Munro
patched the read stream API.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Tested-by: Tomas Vondra <tomas@vondra.me>
Tested-by: Andres Freund <andres@anarazel.de>
Tested-by: Thomas Munro <thomas.munro@gmail.com>
Tested-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_ZwCwWFeL_H3ia26bP2e7HiKLWt0ZmGXPVwPO6uXq0vaA%40mail.gmail.com
2025-03-15 10:34:42 -04:00
Melanie Plageman
944e81bf99 Separate TBM[Shared|Private]Iterator and TBMIterateResult
Remove the TBMIterateResult member from the TBMPrivateIterator and
TBMSharedIterator and make tbm_[shared|private_]iterate() take a
TBMIterateResult as a parameter.

This allows tidbitmap API users to manage multiple TBMIterateResults per
scan. This is required for bitmap heap scan to use the read stream API,
with which there may be multiple I/Os in flight at once, each one with a
TBMIterateResult.

Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/d4bb26c9-fe07-439e-ac53-c0e244387e01%40vondra.me
2025-03-15 10:11:19 -04:00
Thomas Munro
799959dc7c Simplify distance heuristics in read_stream.c.
Make the distance control heuristics simpler and more aggressive in
preparation for asynchronous I/O.

The v17 version of read_stream.c made a conservative choice to limit the
look-ahead distance when streaming sequential blocks, because it
couldn't benefit very much from looking ahead further yet.  It had a
three-behavior model where only random I/O would rapidly increase the
look-ahead distance, to support read-ahead advice.  Sequential I/O would
move it towards the io_combine_limit setting, just enough to build one
full-sized synchronous I/O at a time, and then expect kernel read-ahead
to avoid I/O stalls.

That already left I/O performance on the table with advice-based I/O
concurrency, since sequential blocks could be followed by random jumps,
eg with the proposed streaming Bitmap Heap Scan patch.

It is time to delete the cautious middle option and adjust the distance
based on recent I/O needs only, since asynchronous reads will need to be
started ahead of time whether random or sequential.  It is still limited
by io_combine_limit, *_io_concurrency, buffer availability and
strategy ring size, as before.

Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Tested-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
2025-03-16 03:05:07 +13:00
Thomas Munro
7ea8cd1566 Improve read_stream.c advice for dense streams.
read_stream.c tries not to issue read-ahead advice when it thinks the
kernel's own read-ahead should be active, ie when using buffered I/O and
reading sequential blocks.  It previously gave up too easily, and issued
advice only for the first read of up to io_combine_limit blocks in a
larger range of sequential blocks after a random jump.  The following read
could suffer an avoidable I/O stall.

Fix, by continuing to issue advice until the corresponding preadv()
calls catch up with the start of the region we're currently issuing
advice for, if ever.  That's when the kernel actually sees the
sequential pattern.  Advice is now disabled only when the stream is
entirely sequential as far as we can see in the look-ahead window, or
in other words, when a sequential region is larger than we can cover
with the current io_concurrency and io_combine_limit settings.

While refactoring the advice control logic, also get rid of the
"suppress_advice" argument that was passed around between functions to
skip useless posix_fadvise() calls immediately followed by preadv().
read_stream_start_pending_read() can figure that out, so let's
concentrate knowledge of advice heuristics in fewer places (our goal
being to make advice-based I/O concurrency a legacy mode soon).

The problem cases were revealed by Tomas Vondra's extensive regression
testing with many different disk access patterns using Melanie
Plageman's streaming Bitmap Heap Scan patch, in a battle against the
venerable always-issue-advice-and-always-one-block-at-a-time code.

Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Reported-by: Tomas Vondra <tomas@vondra.me>
Reported-by: Andres Freund <andres@anarazel.de>
Tested-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
Discussion: https://postgr.es/m/CA%2BhUKGJ3HSWciQCz8ekP1Zn7N213RfA4nbuotQawfpq23%2Bw-5Q%40mail.gmail.com
2025-03-15 19:04:54 +13:00
Álvaro Herrera
11bd831860 doc: Explain more thoroughly when a table rewrite is needed
Author: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://postgr.es/m/00e6eb5f5c793b8ef722252c7a519c9a@oss.nttdata.com
2025-03-14 20:44:59 +01:00
Tom Lane
1c9242b2cd Doc: remove obsolete comment.
This para should have been removed by 2f9661311, which made it
both false and irrelevant.  Noted while looking at SQL function
plancache patch.
2025-03-14 14:08:47 -04:00
Fujii Masao
6d376c3b0d Add GUC option to log lock acquisition failures.
This commit introduces a new GUC, log_lock_failure, which controls whether
a detailed log message is produced when a lock acquisition fails. Currently,
it only supports logging lock failures caused by SELECT ... NOWAIT.

The log message includes information about all processes holding or
waiting for the lock that couldn't be acquired, helping users analyze and
diagnose the causes of lock failures.

Currently, this option does not log failures from SELECT ... SKIP LOCKED,
as that could generate excessive log messages if many locks are skipped,
causing unnecessary noise.

This mechanism can be extended in the future to support logging
lock failures from other commands, such as LOCK TABLE ... NOWAIT.

Author: Yuki Seino <seinoyu@oss.nttdata.com>
Co-authored-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/411280a186cc26ef7034e0f2dfe54131@oss.nttdata.com
2025-03-14 23:14:12 +09:00
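
A hedged example of the behavior; the table and the conflicting session are
made up:

    SET log_lock_failure = on;
    BEGIN;
    -- Fails immediately if another session holds a conflicting lock; with the
    -- GUC enabled, the server log lists the lock holders and waiters.
    SELECT * FROM accounts WHERE id = 1 FOR UPDATE NOWAIT;
    ROLLBACK;
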
Fujii Masao
e80171d57c Optimize iteration over PGPROC for fast-path lock searches.
This commit improves efficiency in FastPathTransferRelationLocks()
and GetLockConflicts(), which iterate over PGPROCs to search for
fast-path locks.

Previously, these functions recalculated the fast-path group during
every loop iteration, even though it remained constant. This update
optimizes the process by calculating the group once and reusing it
throughout the loop.

The functions also now skip empty fast-path groups, avoiding
unnecessary scans of their slots. Additionally, groups belonging to
inactive backends (with pid=0) are always empty, so checking
the group is sufficient to bypass these backends, further enhancing
performance.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/07d5fd6a-71f1-4ce8-8602-4cc6883f4bd1@oss.nttdata.com
2025-03-14 22:49:29 +09:00
Peter Eisentraut
a359d37019 Simplify and generalize PrepareSortSupportFromIndexRel()
PrepareSortSupportFromIndexRel() was accepting btree strategy numbers
purely for the purpose of comparing them later against btree strategies
to determine if the sort direction was forward or reverse.  Change
that.  Instead, pass a bool directly, to indicate the same without an
unfortunate assumption that a strategy number refers specifically to a
btree strategy.  (This is similar in spirit to commits 0d2aa4d4937 and
c594f1ad2ba.)

(This could arguably be simplified further by having the callers fill
in ssup_reverse directly.  But this way, it preserves consistency by
having all PrepareSortSupport*() variants be responsible for filling
in ssup_reverse.)

Moreover, remove the hardcoded check against BTREE_AM_OID, and check
against amcanorder instead, which is the actual requirement.

Co-authored-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-03-14 10:34:08 +01:00
Álvaro Herrera
1548c3a304 Remove direct handling of reloptions for toast tables
It doesn't actually work, even with allow_system_table_mods turned on:
the ALTER TABLE operation is rejected by ATSimplePermissions(), so even
the error message we're adding in this commit is unreachable.

Add a test case for it.

Author: Nikolay Shaplov <dhyan@nataraj.su>
Discussion: https://postgr.es/m/1913854.tdWV9SEqCh@thinkpad-pgpro
2025-03-14 09:28:51 +01:00
Thomas Munro
92fc6856cb Respect changing pin limits in read_stream.c.
To avoid pinning too much of the buffer pool at once, read_stream.c
previously used LimitAdditionalPins().  The coding was naive, and only
considered the available buffers at stream construction time.

This commit checks before each StartReadBuffers() call with
GetAdditionalPinLimit().  The result might change over time due to pins
acquired outside this stream by the same backend.  No extra CPU cycles
are added to the all-buffered fast-path code, but the I/O-starting path
now considers the up-to-date remaining buffer limit.

In practice it was quite difficult to exceed limits and cause any real
problems in v17, so no back-patch for now, but proposed changes will
make it easier.

Per code review from Andres, in the course of testing his AIO patches.

Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions)
Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
2025-03-14 21:21:09 +13:00
Peter Eisentraut
0793ab8100 Activate Python "Limited API" in PL/Python
This allows building PL/Python against any Python 3.x version and
using another Python 3.x version at run time.  This is useful for
installers that want to run against a separately downloaded Python, so
that they don't have to bundle it themselves.

This builds on the earlier patch to only use APIs supported by the
Limited API.

At the moment, this is not activated on MSVC because that leads to
build failures that no one could explain or cared enough to address.
This could be done later.

Reviewed-by: Jakob Egger <jakob@eggerapps.at>
Discussion: https://www.postgresql.org/message-id/flat/ee410de1-1e0b-4770-b125-eeefd4726a24@eisentraut.org
2025-03-14 08:57:02 +01:00
Peter Eisentraut
05cbd6cb22 Swap order of extern/static and pg_nodiscard
When pg_nodiscard was first added, the C standard draft had it as a
function specifier, and so the code comment about placement was
written with that in mind.  The final C23 standard has it as an
attribute and the placement rules are a bit different for that.
Specifically, it needs to be before extern or static.  (Or at least
both current clang and gcc require that.)  So just swap these.  (To be
clear: The current implementation with gcc attributes doesn't care.
This change is just for maximum forward compatibility for non-gcc
compilers.)  This also keeps the order consistent with the previously
introduced pg_noreturn.  Also update the code comment to reflect the
mentioned developments since its introduction.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/pxr5b3z7jmkpenssra5zroxi7qzzp6eswuggokw64axmdixpnk@zbwxuq7gbbcw
2025-03-14 07:18:07 +01:00
Thomas Munro
01261fb078 Improve buffer manager API for backend pin limits.
Previously the support functions assumed that the caller needed one pin
to make progress, and could optionally use some more, allowing enough
for every connection to do the same.  Add a couple more functions for
callers that want to know:

* what the maximum possible number could be, irrespective of currently
  held pins, for space planning purposes

* how many additional pins they could acquire right now, without the
  special case allowing one pin, for callers that already hold pins and
  could already make progress even if no extra pins are available

The pin limit logic began in commit 31966b15.  This refactoring is
better suited to read_stream.c, which will be adjusted to respect the
remaining limit as it changes over time in a follow-up commit.  It also
computes MaxProportionalPins up front, to avoid performing divisions
whenever a caller needs to check the balance.

Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions)
Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
2025-03-14 17:13:09 +13:00
Amit Kapila
7c99dc587a Fix ALTER SUBSCRIPTION ... SET PUBLICATION ... command.
The problem is that ALTER SUBSCRIPTION ... SET PUBLICATION ... will lead
to restarting of apply worker and after the restart, the apply worker will
use the existing slot and replication origin corresponding to the
subscription. Now, it is possible that before the restart the origin has
not been updated, and the WAL start location points to a location from
before the publication specified by SET PUBLICATION was created, which
can lead to an error like: "ERROR:  publication "pub1" does not exist".
Once this error occurs, apply worker will never be able to proceed and
will always return the same error.

We decided to skip loading the publication if the publication does not
exist. The publication is loaded later and updates the relation entry when
the publication gets created.

We decided not to backpatch this as this is a behaviour change, and we don't
see field reports. This problem has been found by intermittent buildfarm
failures.

Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/flat/CALDaNm0-n8FGAorM%2BbTxkzn%2BAOUyx5%3DL_XmnvOP6T24%2B-NcBKg%40mail.gmail.com
Discussion: https://postgr.es/m/CAA4eK1+T-ETXeRM4DHWzGxBpKafLCp__5bPA_QZfFQp7-0wj4Q@mail.gmail.com
2025-03-14 08:57:40 +05:30
Tom Lane
4618045bee Fix ARRAY_SUBLINK and ARRAY[] for int2vector and oidvector input.
If the given input_type yields valid results from both
get_element_type and get_array_type, initArrayResultAny believed the
former and treated the input as an array type.  However this is
inconsistent with what get_promoted_array_type does, leading to
situations where the output of an ARRAY() subquery is labeled with
the wrong type: it's labeled as oidvector[] but is really a 2-D
array of OID.  That at least results in strange output, and can
result in crashes if further processing such as unnest() is applied.
AFAIK this is only possible with the int2vector and oidvector
types, which are special-cased to be treated mostly as true arrays
even though they aren't quite.

Fix by switching the logic to match get_promoted_array_type by
testing get_array_type not get_element_type, and remove an Assert
thereby made pointless.  (We need not introduce a symmetrical
check for get_element_type in the other if-branch, because
initArrayResultArr will check it.)  This restores the behavior
that existed before bac27394a introduced initArrayResultAny:
the output really is int2vector[] or oidvector[].

Comparable confusion exists when an input of an ARRAY[] construct
is int2vector or oidvector: transformArrayExpr decides it's dealing
with a multidimensional array constructor, and we end up with
something that's a multidimensional OID array but is alleged to be
of type oidvector.  I have not found a crashing case here, but it's
easy to demonstrate totally-wrong results.  Adjust that code so
that what you get is an oidvector[] instead, for consistency with
ARRAY() subqueries.  (This change also makes these types work like
domains-over-arrays in this context, which seems correct.)
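
With the fix, both forms now yield the vector type's array type.  For
illustration (queries and output are illustrative):

  SELECT pg_typeof(ARRAY(SELECT indkey FROM pg_index));   -- int2vector[]
  SELECT pg_typeof(ARRAY['0 1'::int2vector]);             -- int2vector[]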

Bug: #18840
Reported-by: yang lei <ylshiyu@126.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18840-fbc9505f066e50d6@postgresql.org
Backpatch-through: 13
2025-03-13 16:07:55 -04:00
Álvaro Herrera
c7fc8808a9
ATExecSetRelOptions: Reduce scope of 'isnull' variable
Author: Nikolay Shaplov <dhyan@nataraj.su>
Reviewed-by: Timur Magomedov <t.magomedov@postgrespro.ru>
Discussion: https://postgr.es/m/1913854.tdWV9SEqCh@thinkpad-pgpro
2025-03-13 18:15:59 +01:00
Álvaro Herrera
da0f0582e8
Make lwlocknames.h generated file less ugly
We can make the output look a bit better by aligning each lock's
definition, so add some padding space to achieve that.  This change
makes no practical difference, but casual onlookers will be less
distracted by (lack of) whitespace.

Author: Gurjeet Singh <gurjeet@singh.im>
Discussion: https://postgr.es/m/CABwTF4VxfwDtRV-H22_XK4XeDogaV-Vaobu+af5U=8ZAZn9ZZQ@mail.gmail.com
2025-03-13 17:38:21 +01:00
Nathan Bossart
0697b23906 Add reverse(bytea).
This commit introduces a function for reversing the order of the
bytes in binary strings.
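
For example (illustrative):

  SELECT reverse('\xdeadbeef'::bytea);   -- \xefbeadde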

Bumps catversion.

Author: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://postgr.es/m/CAJ7c6TMe0QVRuNssUArbMi0bJJK32%2BzNA3at5m3osrBQ25MHuw%40mail.gmail.com
2025-03-13 11:20:53 -05:00
Peter Eisentraut
bb25276205 Fix copy-and-paste mistake in error message
Introduced in commit a68159ff2b3.
2025-03-13 15:17:08 +01:00
Peter Eisentraut
3691edfab9 pg_noreturn to replace pg_attribute_noreturn()
We want to support a "noreturn" decoration on more compilers besides
just GCC-compatible ones, but for that we need to move the decoration
in front of the function declaration instead of either behind it or
wherever, which is the current style afforded by GCC-style attributes.
Also rename the macro to "pg_noreturn" to be similar to the C11
standard "noreturn".

pg_noreturn is now supported on all compilers that support C11 (using
_Noreturn), as well as GCC-compatible ones (using __attribute__, as
before), as well as MSVC (using __declspec).  (When PostgreSQL
requires C11, the latter two variants can be dropped.)

Now, all supported compilers effectively support pg_noreturn, so the
extra code for !HAVE_PG_ATTRIBUTE_NORETURN can be dropped.

This also fixes a possible problem if third-party code includes
stdnoreturn.h, because then the current definition of

    #define pg_attribute_noreturn() __attribute__((noreturn))

would cause an error.

Note that the C standard does not support a noreturn attribute on
function pointer types.  So we have to drop these here.  There are
only two instances at this time, so it's not a big loss.  In one case,
we can make up for it by adding the pg_noreturn to a wrapper function
and adding a pg_unreachable(), in the other case, the latter was
already done before.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/pxr5b3z7jmkpenssra5zroxi7qzzp6eswuggokw64axmdixpnk@zbwxuq7gbbcw
2025-03-13 12:37:26 +01:00
Richard Guo
cc5d98525d Fix incorrect handling of subquery pullup
When pulling up a subquery, if the subquery's target list items are
used in grouping set columns, we need to wrap them in PlaceHolderVars.
This ensures that expressions retain their separate identity so that
they will match grouping set columns when appropriate.

In 90947674f, we decided to wrap subquery outputs that are non-var
expressions in PlaceHolderVars.  This prevents const-simplification
from merging them into the surrounding expressions after subquery
pullup, which could otherwise lead to failing to match those
subexpressions to grouping set columns, with the effect that they'd
not go to null when expected.

However, that left some loose ends.  If the subquery's target list
contains two or more identical Var expressions, we can still fail to
match the Var expression to the expected grouping set expression.
This is not related to const-simplification, but rather to how we
match expressions to lower target items in setrefs.c.

For sort/group expressions, we use ressortgroupref matching, which
works well.  For other expressions, we primarily rely on comparing the
expressions to determine if they are the same.  Therefore, we need a
way to prevent setrefs.c from matching the expression to some other
identical ones.

To fix, wrap all subquery outputs in PlaceHolderVars if the parent
query uses grouping sets, ensuring that they preserve their separate
identity throughout the whole planning process.

Reported-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/CAMbWs4-meSahaanKskpBn0KKxdHAXC1_EJCVWHxEodqirrGJnw@mail.gmail.com
2025-03-13 16:36:03 +09:00
Richard Guo
4c49611715 Remove code setting wrap_non_vars to true for UNION ALL subqueries
In pull_up_simple_subquery and pull_up_constant_function, there is
code that sets wrap_non_vars to true when dealing with an appendrel
member.  The goal is to wrap subquery outputs that are not simple Vars
in PlaceHolderVars, ensuring that what we pull up doesn't get merged
into a surrounding expression during later processing, which could
cause it to fail to match the expression actually available from the
appendrel.

However, this is unnecessary.  When pulling up an appendrel child
subquery, the only part of the upper query that could reference the
appendrel child yet is the translated_vars list of the associated
AppendRelInfo that we just made for this child.  Furthermore, we do
not want to force use of PHVs in the AppendRelInfo, as there is no
outer join involved.  In fact, perform_pullup_replace_vars always sets
wrap_non_vars to false before performing pullup_replace_vars on the
AppendRelInfo.

This patch simply removes the code that sets wrap_non_vars to true for
UNION ALL subqueries.

Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/CAMbWs4-VXDEi1v+hZYLxpOv0riJxHsCkCH1f46tLnhonEAyGCQ@mail.gmail.com
2025-03-13 16:34:28 +09:00
Jeff Davis
d3b2e5e1ab Refactor convert_case() to prepare for optimizations.
Upcoming optimizations will add complexity to convert_case(). This
patch reorganizes slightly so that the complexity can be contained
within the logic to convert the case of a single character, rather
than mixing it in with logic to iterate through the string.

Reviewed-by: Alexander Borisov <lex.borisov@gmail.com>
Discussion: https://postgr.es/m/44005c3d-88f4-4a26-981f-fd82dfa8e313@gmail.com
2025-03-12 21:51:52 -07:00
Amit Kapila
3abe9dc188 Avoid invalidating all RelationSyncCache entries on publication rename.
On Publication rename, we need to only invalidate the RelationSyncCache
entries corresponding to relations that are part of the publication being
renamed.

As part of this patch, we introduce a new invalidation message to
invalidate the cache maintained by the logical decoding output plugin. We
can't use existing relcache invalidation for this purpose, as that would
unnecessarily cause relcache invalidations in other backends.

This will improve performance by building fewer relation cache entries
during logical replication.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/OSCPR01MB14966C09AA201EFFA706576A7F5C92@OSCPR01MB14966.jpnprd01.prod.outlook.com
2025-03-13 09:16:33 +05:30
Thomas Munro
75da2bece6 Fix read_stream.c for changing io_combine_limit.
In a couple of places, read_stream.c assumed that io_combine_limit would
be stable during the lifetime of a stream.  That is not true in at least
one unusual case: streams held by CURSORs where you could change the GUC
between FETCH commands, with unpredictable results.

Fix, by storing stream->io_combine_limit and referring only to that
after construction.  This mirrors the treatment of the other important
setting {effective,maintenance}_io_concurrency, which is stored in
stream->max_ios.

One of the cases was the queue overflow space, which was sized for
io_combine_limit and could be overrun if the GUC was increased.  Since
that coding was a little hard to follow, also introduce a variable for
better readability instead of open-coding the arithmetic.  Doing so
revealed an off-by-one thinko while clamping max_pinned_buffers to
INT16_MAX, though that wasn't a live bug due to the current limits on
GUC values.

Back-patch to 17.

Discussion: https://postgr.es/m/CA%2BhUKG%2B2T9p-%2BzM6Eeou-RAJjTML6eit1qn26f9twznX59qtCA%40mail.gmail.com
2025-03-13 15:43:34 +13:00
Amit Langote
d4f79865d4 Fix copy-paste error in datum_to_jsonb_internal()
Commit 3c152a27b06 mistakenly repeated JSONTYPE_JSON in a condition,
omitting JSONTYPE_CAST. As a result, datum_to_jsonb_internal() failed
to reject inputs that were casts (e.g., from an enum to json as in the
example below) when used as keys in JSON constructors.

This led to a crash in cases like:

  SELECT JSON_OBJECT('happy'::mood: '123'::jsonb);

where 'happy'::mood is implicitly cast to json. The missing check
meant such casted values weren’t properly rejected as invalid
(non-scalar) JSON keys.

Reported-by: Maciek Sakrejda <maciek@pganalyze.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Maciek Sakrejda <maciek@pganalyze.com>
Discussion: https://postgr.es/m/CADXhmgTJtJZK9A3Na_ry+Xrq-ghjcejBRhcRMzWZvbd__QdgJA@mail.gmail.com
Backpatch-through: 17
2025-03-13 09:56:36 +09:00
Masahiko Sawada
4ecdd4110d pg_rewind: Add dbname to primary_conninfo when using --write-recovery-conf.
This commit enhances pg_rewind's --write-recovery-conf option to
include the dbname in the generated primary_conninfo value when
specified in the --source-server option. With this modification, the
rewound server can connect to the primary server without manual
configuration file modifications when sync_replication_slots is
enabled.

Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/CAD21AoAkW=Ht0k9dVoBTCcqLiiZ2MXhVr+d=j2T_EZMerGrLWQ@mail.gmail.com
2025-03-12 16:56:04 -07:00
David Rowley
cdc1471cc7 Add b955df443 to .git-blame-ignore-revs 2025-03-13 12:44:26 +13:00
David Rowley
b955df4434 Fix indentation issue
Introduced recently by 9e088f7dd

Per buildfarm member koel
2025-03-13 12:41:44 +13:00
Masahiko Sawada
9e088f7dd8 Fix compiler warning in pg_logicalinspect.
Oversight in bd65cb3cd48.

Reported-by: David Rowley <dgrowleyml@gmail.com>
Reported-by: Nathan Bossart <nathandbossart@gmail.com>
Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvqrhFfnetbcwgGkJ=z63T8HfQ_OyP=vX8BYiXyxFKt67w@mail.gmail.com
2025-03-12 14:23:56 -07:00
Heikki Linnakangas
ac4494646d Rename alloc/free functions in reorderbuffer.c
There used to be bespoke pools for these structs to reduce the
palloc/pfree overhead, but that was ripped out a long time ago and
replaced with the generic, cheaper generational memory allocator
(commit a4ccc1cef5). The Get/Return terminology made sense with the
pools, as you "got" an object from the pool and "returned" it later,
but now it just looks weird. Rename to Alloc/Free.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/c9e43d2d-8e83-444f-b111-430377368989@iki.fi
2025-03-12 22:03:39 +02:00
Nathan Bossart
025e7e1eb4 Remove count_one_bits() in acl.c.
The only caller, select_best_grantor(), can instead use
pg_popcount64().  This isn't performance-critical code, but we
might as well use the centralized implementation.  While at it, add
some test coverage for this part of select_best_grantor().

Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/Z9GtL7Nm6hsYyJnF%40nathan
2025-03-12 15:01:52 -05:00
Melanie Plageman
ff79b5b2ab Increase default effective_io_concurrency to 16
The default effective_io_concurrency has been 1 since it was introduced
in b7b8f0b6096d2ab6e. Referencing the associated discussion [1], it
seems 1 was chosen as a conservative value that seemed unlikely to cause
regressions.

Experimentation on high latency cloud storage as well as fast, local
nvme storage (see Discussion link) shows that even slightly higher
values improve query timings substantially. 1 actually performs worse
than 0 [2]. With effective_io_concurrency 1, we are not prefetching
enough to avoid I/O stalls, but we are issuing extra syscalls.

The new default is 16, which should be more appropriate for common
hardware while still avoiding flooding low IOPs devices with I/O
requests.

[1] https://www.postgresql.org/message-id/flat/FDDBA24E-FF4D-4654-BA75-692B3BA71B97%40enterprisedb.com
[2] https://www.postgresql.org/message-id/CAAKRu_Zv08Cic%3DqdCfzrQabpEXGrd9Z9UOW5svEVkCM6%3DFXA9g%40mail.gmail.com
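
Sessions that prefer the old behaviour can still override the default, for
example (illustrative):

  SHOW effective_io_concurrency;      -- now 16 by default
  SET effective_io_concurrency = 1;   -- restore the previous default for this session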

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAAKRu_Z%2BJa-mwXebOoOERMMUMvJeRhzTjad4dSThxG0JLXESxw%40mail.gmail.com
2025-03-12 15:57:44 -04:00
Heikki Linnakangas
af717317a0 Handle interrupts while waiting on Append's async subplans
We did not wake up on interrupts while waiting on async events on an
async-capable append node. For example, if you tried to cancel the
query, nothing would happen until one of the async subplans becomes
readable. To fix, add WL_LATCH_SET to the WaitEventSet.

Backpatch down to v14 where async Append execution was introduced.

Discussion: https://www.postgresql.org/message-id/37a40570-f558-40d3-b5ea-5c2079b3b30b@iki.fi
2025-03-12 20:53:09 +02:00
Tom Lane
f4e7756ef9 Build whole-row Vars the same way during parsing and planning.
makeWholeRowVar() has different rules for constructing a
whole-row Var depending on the kind of RTE it's representing.
This turns out to be problematic because the rewriter and planner
can convert view RTEs and set-returning-function RTEs into
subquery RTEs; so a whole-row Var made during planning might
look different from one made by the parser.  In isolation this
doesn't cause any problem, but if a query contains Vars made
both ways for the same varno, there are cross-checks in the
executor that will complain.  This manifests for UPDATE, DELETE,
and MERGE queries that use whole-row table references.

To fix, we need makeWholeRowVar() to produce the same result
from an inlined RTE as it would have for the original.  For
an inlined view, we can use RangeTblEntry.relid to detect
that this had been a view RTE.  For inlined SRFs, make a
data structure definition change akin to commit 47bb9db75,
and say that we won't clear RangeTblEntry.functions until
the end of planning.  That allows makeWholeRowVar() to
repeat what it would have done with the unmodified RTE.

Reported-by: Duncan Sands <duncan.sands@deepbluecap.com>
Reported-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Diagnosed-by: Tender Wang <tndrwang@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/3518c50a-ab18-482f-b916-a37263622501@deepbluecap.com
Backpatch-through: 13
2025-03-12 11:47:38 -04:00
Melanie Plageman
18cd15e706 Add connection establishment duration logging
Add log_connections option 'setup_durations' which logs durations of
several key parts of connection establishment and backend setup.

For an incoming connection, starting from when the postmaster gets a
socket from accept() and ending when the forked child backend is first
ready for query, there are multiple steps that could each take longer
than expected due to external factors. This logging provides visibility
into authentication and fork duration as well as the end-to-end
connection establishment and backend initialization time.

To make this portable, the timings captured in the postmaster (socket
creation time, fork initiation time) are passed through the
BackendStartupData.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Guillaume Lelarge <guillaume.lelarge@dalibo.com>
Discussion: https://postgr.es/m/flat/CAAKRu_b_smAHK0ZjrnL5GRxnAVWujEXQWpLXYzGbmpcZd3nLYw%40mail.gmail.com
2025-03-12 11:35:27 -04:00
Melanie Plageman
9219093cab Modularize log_connections output
Convert the boolean log_connections GUC into a list GUC comprised of the
connection aspects to log.

This gives users more control over the volume and kind of connection
logging.

The current log_connections options are 'receipt', 'authentication', and
'authorization'. The empty string disables all connection logging. 'all'
enables all available connection logging.
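
For example (illustrative), to log only the authentication and
authorization aspects:

  ALTER SYSTEM SET log_connections = 'authentication,authorization';
  SELECT pg_reload_conf();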

For backwards compatibility, the most common values for the
log_connections boolean are still supported (on, off, 1, 0, true, false,
yes, no). Note that previously supported substrings of on, off, true,
false, yes, and no are no longer supported.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/flat/CAAKRu_b_smAHK0ZjrnL5GRxnAVWujEXQWpLXYzGbmpcZd3nLYw%40mail.gmail.com
2025-03-12 11:35:21 -04:00
Michael Paquier
f554a95379 Remove initialization from PendingBackendStats
9a8dd2c5a6d has added an initialization to PendingBackendStats, which
has been causing compilation warnings in the buildfarm.  This code does
not strictly require it as PendingBackendStats is always initialized
with memset(0), so let's remove it.

Per report from multiple buildfarm members, like ayu and batfish, via
Tom Lane.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/1870853.1741749264@sss.pgh.pa.us
2025-03-12 20:37:43 +09:00
Peter Eisentraut
72a3d0462b Prepare for Python "Limited API" in PL/Python
Using the Python Limited API would allow building PL/Python against
any Python 3.x version and using another Python 3.x version at run
time.  This commit does not activate that, but it prepares the code to
only use APIs supported by the Limited API.

Implementation details:

- Convert static types to heap types
  (https://docs.python.org/3/howto/isolating-extensions.html#heap-types).

- Replace PyRun_String() with component functions.

- Replace PyList_SET_ITEM() with PyList_SetItem().

This was previously committed as c47e8df815c and then reverted because
it wasn't working under Python older than 3.8.  That has been fixed in
this version.  There was a Python API change/bugfix between 3.7 and
3.8 that directly affects this patch.  The relevant commit is
<https://github.com/python/cpython/commit/364f0b0f19c>.  The
workarounds described there have been applied in this patch, and it
has been confirmed to work with Python 3.6 and 3.7.

Reviewed-by: Jakob Egger <jakob@eggerapps.at>
Discussion: https://www.postgresql.org/message-id/flat/ee410de1-1e0b-4770-b125-eeefd4726a24@eisentraut.org
2025-03-12 08:53:54 +01:00
Tom Lane
c872516d8f Doc: silence A4 PDF build warnings.
Commit 0fbceae84 put a "&zwsp;" in almost but not quite the correct
place to avoid "The contents of fo:block line 1 exceed the available
area" warnings.  Per buildfarm.
2025-03-11 23:35:39 -04:00
Heikki Linnakangas
043745c3a0 Improve snapmgr.c comment
Add more details on the different kinds of snapshots, how to use them,
and how the active snapshot stack works.

Discussion: https://www.postgresql.org/message-id/7c56f180-b9e1-481e-8c1d-efa63de3ecbb@iki.fi
2025-03-11 23:28:38 +02:00
Heikki Linnakangas
8076c00592 Assert that a snapshot is active or registered before it's used
The comment in GetTransactionSnapshot() said that you "should call
RegisterSnapshot or PushActiveSnapshot on the returned snap if it is
to be used very long". That felt too unclear to me. Make the comment
more strongly worded.

To enforce that rule and to catch potential bugs where a snapshot
might get invalidated while it's still in use, add an assertion to
HeapTupleSatisfiesMVCC() to check that the snapshot is registered or
pushed to active stack. No new bugs were found by this, but it seems
like good future-proofing. It's not a great place for the check;
HeapTupleSatisfiesMVCC() is in fact safe to call with an unregistered
snapshot, and the assertion won't catch other unsafe uses. But it goes
a long way in practice.

Fix a few cases that were playing fast and loose with that and just
assumed that the snapshot cannot be invalidated during a scan. Those
assumptions were not wrong, but they're not performance critical, so
let's drop the excuses and just register the snapshot. These were
false positives found by the new assertion.

Discussion: https://www.postgresql.org/message-id/7c56f180-b9e1-481e-8c1d-efa63de3ecbb@iki.fi
2025-03-11 23:20:34 +02:00
Masahiko Sawada
bd65cb3cd4 pg_logicalinspect: Fix possible crash when passing a directory path.
Previously, pg_logicalinspect functions were too trusting of their
input and blindly passed it to SnapBuildRestoreSnapshot(). If the
input pointed to a directory, the server could raise a PANIC error while
attempting to fsync_fname() with isdir=false on a directory.

This commit adds validation checks for input filenames and passes the
LSN extracted from the filename to SnapBuildRestoreSnapshot() instead
of the filename itself. It also adds regression tests for various
input patterns and permission checks.

Bug: #18828
Reported-by: Robins Tharakan <tharakan@gmail.com>
Co-authored-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Co-authored-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/18828-0f4701c635064211@postgresql.org
2025-03-11 09:56:40 -07:00
Masahiko Sawada
a49927f04c pg_logicalinspect: Stabilize isolation tests.
The previous isolation tests did not account for the possibility that
the background writer or the checkpointer could write a RUNNING_XACTS
record, which could cause logical decoding to produce more logical
snapshots than expected.

This commit modifies the isolation tests to verify that at least one
logical snapshot contains the expected number of committed or ongoing
catalog-change transactions.

Per buildfarm member skink.

Reported-by: Andres Freund <andres@anarazel.de>
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/5qbxud4pvnvmtuoi7weiizm5hmumxaeohx4vztfhrwlfhyz6rj@buh4435mllwo
2025-03-11 09:30:00 -07:00
Tom Lane
8b1b342544 Improve EXPLAIN's display of window functions.
Up to now we just punted on showing the window definitions used
in a plan, with window function calls represented as "OVER (?)".
To improve that, show the window definition implemented by each
WindowAgg plan node, and reference their window names in OVER.
For nameless window clauses generated by "OVER (...)", assign
unique names w1, w2, etc.
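
For example (illustrative), with

  EXPLAIN (VERBOSE, COSTS OFF)
  SELECT sum(g) OVER (PARTITION BY g % 2 ORDER BY g)
    FROM generate_series(1, 10) AS g;

the WindowAgg node is expected to show its window definition (along the
lines of "Window: w1 AS (PARTITION BY ... ORDER BY ...)"), and the output
column reads "sum(g) OVER w1" rather than "OVER (?)".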

In passing, re-order the properties shown for a WindowAgg node
so that the Run Condition (if any) appears after the Window
property and before the Filter (if any).  This seems more
sensible since the Run Condition is associated with the Window
and acts before the Filter.

Thanks to David G. Johnston and Álvaro Herrera for design
suggestions.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/144530.1741469955@sss.pgh.pa.us
2025-03-11 11:19:54 -04:00
Peter Geoghegan
426ea61117 nbtree: Make BTMaxItemSize into object-like macro.
Make nbtree's "1/3 of a page limit" BTMaxItemSize function-like macro
(which accepts a "page" argument) into an object-like macro that can be
used from code that doesn't have convenient access to an nbtree page.

Preparation for an upcoming patch that adds skip scan to nbtree.
Parallel index scans that use skip scan will serialize datums (not just
SAOP array subscripts) when scheduling primitive scans.  BTMaxItemSize
will be used by btestimateparallelscan to determine how much DSM to
request.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-Wz=H_RG5weNGeUG_TkK87tRBnH9mGCQj6WpM4V4FNWKv2g@mail.gmail.com
2025-03-11 10:35:56 -04:00
Peter Geoghegan
0fbceae841 Show index search count in EXPLAIN ANALYZE, take 2.
Expose the count of index searches/index descents in EXPLAIN ANALYZE's
output for index scan/index-only scan/bitmap index scan nodes.  This
information is particularly useful with scans that use ScalarArrayOp
quals, where the number of index searches can be unpredictable due to
implementation details that interact with physical index characteristics
(at least with nbtree SAOP scans, since Postgres 17 commit 5bf748b8).
The information shown also provides useful context when EXPLAIN ANALYZE
runs a plan with an index scan node that successfully applied the skip
scan optimization (set to be added to nbtree by an upcoming patch).
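
For example (illustrative; the table and index are placeholders), a
ScalarArrayOp qual now reports how many index descents the scan performed:

  EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF, SUMMARY OFF)
  SELECT * FROM t WHERE id IN (1, 500, 999);
  -- the index scan node includes a line such as:  Index Searches: 3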

The instrumentation works by teaching all index AMs to increment a new
nsearches counter whenever a new index search begins.  The counter is
incremented at exactly the same point that index AMs already increment
the pg_stat_*_indexes.idx_scan counter (we're counting the same event,
but at the scan level rather than the relation level).  Parallel queries
have workers copy their local counter struct into shared memory when an
index scan node ends -- even when it isn't a parallel aware scan node.
An earlier version of this patch that only worked with parallel aware
scans became commit 5ead85fb (though that was quickly reverted by commit
d00107cd following "debug_parallel_query=regress" buildfarm failures).

Our approach doesn't match the approach used when tracking other index
scan related costs (e.g., "Rows Removed by Filter:").  It is comparable
to the approach used in similar cases involving costs that are only
readily accessible inside an access method, not from the executor proper
(e.g., "Heap Blocks:" output for a Bitmap Heap Scan, which was recently
enhanced to show per-worker costs by commit 5a1e6df3, using essentially
the same scheme as the one used here).  It is necessary for index AMs to
have direct responsibility for maintaining the new counter, since the
counter might need to be incremented multiple times per amgettuple call
(or per amgetbitmap call).  But it is also necessary for the executor
proper to manage the shared memory now used to transfer each worker's
counter struct to the leader.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Reviewed-By: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-WzkRqvaqR2CTNqTZP0z6FuL4-3ED6eQB0yx38XBNj1v-4Q@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-Wz=PKR6rB7qbx+Vnd7eqeB5VTcrW=iJvAsTsKbdG+kW_UA@mail.gmail.com
2025-03-11 09:20:50 -04:00
Peter Eisentraut
12c5f797ea Update nls.mk for newly added file
Commit f18231e8175 moved some code to a new file, but the new file
wasn't added to nls.mk.
2025-03-11 13:48:14 +01:00
Álvaro Herrera
17ce344f86
BRIN: be more strict about required support procs
With improperly defined operator classes, it's possible to get a
Postgres crash because we'd try to invoke a procedure that doesn't
exist.  This is because the code is being a bit too trusting that the
opclass is correctly defined.  Add some ereport(ERROR)s for cases where
mandatory support procedures are not defined, transforming the crashes
into errors.

The particular case that was reported is an incomplete opclass in
PostGIS.

Backpatch all the way down to 13.

Reported-by: Tobias Wendorff <tobias.wendorff@tu-dortmund.de>
Diagnosed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/fb6d9a35-6c8e-4869-af80-0a4944a793a4@tu-dortmund.de
2025-03-11 12:50:35 +01:00
Daniel Gustafsson
d35d32d711 Add special case fast-paths for strict functions
Many STRICT function calls will have one or two arguments, in which
case we can speed up checking for NULL input by avoiding setting up
a loop over the arguments. This adds EEOP_FUNCEXPR_STRICT_1 and the
corresponding EEOP_FUNCEXPR_STRICT_2 for functions with one and two
arguments respectively.

Author: Andres Freund <andres@anarazel.de>
Co-authored-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://postgr.es/m/415721CE-7D2E-4B74-B5D9-1950083BA03E@yesql.se
Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
2025-03-11 12:02:42 +01:00
Daniel Gustafsson
8dd7c7cd0a Replace EEOP_DONE with special steps for return/no return
Knowing when the side effects of an expression, rather than its return
value, are the intended result of the execution is important for being
able to generate more efficient JITed code. This replaces EEOP_DONE with
two new steps: EEOP_DONE_RETURN and EEOP_DONE_NO_RETURN.  Expressions
which return a value should use the former step; expressions evaluated
only for their side effects, which don't return a value, should use the
latter.

Author: Andres Freund <andres@anarazel.de>
Co-authored-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://postgr.es/m/415721CE-7D2E-4B74-B5D9-1950083BA03E@yesql.se
Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
2025-03-11 12:02:38 +01:00
Peter Eisentraut
dabccf4513 Move RemoveInheritedConstraint() call slightly earlier
This change is harmless and does not affect the existing intended
operation.  It is necessary for a subsequent patch operation (NOT
ENFORCED foreign keys), where we may need to change the child
constraint to enforced.  In this case, we would create the necessary
triggers and queue the constraint for validation, so it is important
to remove any unnecessary constraints before proceeding.

This is a small change that could have been included in the previous
"split tryAttachPartitionForeignKey" refactoring patch (commit
1d26c2d2c4b), but was kept separate to highlight the changes.

Author: Amul Sul <amul.sul@enterprisedb.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA%40mail.gmail.com
2025-03-11 10:43:48 +01:00
Peter Eisentraut
1d26c2d2c4 refactor: Split tryAttachPartitionForeignKey()
Split tryAttachPartitionForeignKey() into three functions:
AttachPartitionForeignKey(), RemoveInheritedConstraint(), and
DropForeignKeyConstraintTriggers(), so they can be reused in some
subsequent patches for the NOT ENFORCED feature.

Author: Amul Sul <amul.sul@enterprisedb.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA%40mail.gmail.com
2025-03-11 09:35:24 +01:00
Peter Eisentraut
64224a834c refactor: re-add ATExecAlterChildConstr()
ATExecAlterChildConstr() was removed in commit 80d7f990496, but it is
needed in some subsequent patches for the NOT ENFORCED feature, to
recurse over child constraints.  This adds it back in slightly altered
form.

Author: Amul Sul <amul.sul@enterprisedb.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA%40mail.gmail.com
2025-03-11 08:43:35 +01:00
Michael Paquier
76def4cdd7 Add WAL data to backend statistics
This commit adds per-backend WAL statistics, providing the same
information as pg_stat_wal, except that it is now possible to know how
much WAL activity is happening in each backend rather than an overall
aggregate of all the activity.  Like pg_stat_wal, the implementation
relies on pgWalUsage, tracking the difference of activity between two
reports to pgstats.

This data can be retrieved with a new system function called
pg_stat_get_backend_wal(), that returns one tuple based on the PID
provided in input.  Like pg_stat_get_backend_io(), this is useful when
joined with pg_stat_activity to get a live picture of the WAL generated
for each running backend, showing how the activity is [un]balanced.
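
For instance, a query of roughly this shape (illustrative) gives a live
per-backend breakdown:

  SELECT a.pid, a.backend_type, w.*
    FROM pg_stat_activity AS a,
         LATERAL pg_stat_get_backend_wal(a.pid) AS w
   WHERE a.backend_type = 'client backend';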

pgstat_flush_backend() gains a new flag value, able to control the flush
of the WAL stats.

This commit relies mostly on the infrastructure provided by
9aea73fc61d4, that has introduced backend statistics.

Bump catalog version.  A bump of PGSTAT_FILE_FORMAT_ID is not required,
as backend stats do not persist on disk.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Discussion: https://postgr.es/m/Z3zqc4o09dM/Ezyz@ip-10-97-1-34.eu-west-3.compute.internal
2025-03-11 09:04:11 +09:00
Andres Freund
59a1592e39 tests: Make postmaster/002_connection_limits deal with verbose logs
When log_error_verbosity=verbose is configured, the test would hang (and then
fail), because the sqlstate is added between the log level and the message.
Make the regex cope with that.

Reported-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/c7ba6bd0-3701-43d1-9087-017777fe9cd2%40dunslane.net
2025-03-10 19:32:26 -04:00
Tom Lane
29d6808ede CREATE INDEX: do update index stats if autovacuum=off.
This fixes a thinko from commit d611f8b15.  The intent was to prevent
updating the stats of the pre-existing heap if autovacuum is off,
but it also disabled updating the stats of the just-created index.
There is AFAICS no good reason to do the latter, since there could not
be any pre-existing stats to refrain from overwriting, and the zeroed
stats that are there to begin with are very unlikely to be useful.
Moreover, the change broke our cross-version upgrade tests again.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1116282.1741374848@sss.pgh.pa.us
2025-03-10 17:49:27 -04:00
Heikki Linnakangas
f7c566a1a2 Fix a few more redundant calls of GetLatestSnapshot()
Commit 2367503177 fixed this in RelationFindReplTupleByIndex(), but I
missed two other similar cases.

Per report from Ranier Vilela.

Discussion: https://www.postgresql.org/message-id/CAEudQArUT1dE45WN87F-Gb7XMy_hW6x1DFd3sqdhhxP-RMDa0Q@mail.gmail.com
Backpatch-through: 13
2025-03-10 18:58:10 +02:00
Heikki Linnakangas
2367503177 Fix snapshot used in logical replication index lookup
The function calls GetLatestSnapshot() to acquire a fresh snapshot,
makes it active, and was meant to pass it to table_tuple_lock(), but
instead called GetLatestSnapshot() again to acquire yet another
snapshot. It was harmless because the heap AM and all other known
table AMs ignore the 'snapshot' argument anyway, but let's be tidy.

In the long run, this perhaps should be redesigned so that snapshot
was not needed in the first place. The table AM API uses TID +
snapshot as the unique identifier for the row version, which is
questionable when the row came from an index scan with a Dirty
snapshot. You might lock a different row version when you use a
different snapshot in the table_tuple_lock() call (a fresh MVCC
snapshot) than in the index scan (DirtySnapshot). However, in the heap
AM and other AMs where the TID alone identifies the row version, it
doesn't matter. So for now, just fix the obvious albeit harmless bug.

This has been wrong ever since the table AM API was introduced in
commit 5db6df0c01, so backpatch to all supported versions.

Discussion: https://www.postgresql.org/message-id/83d243d6-ad8d-4307-8b51-2ee5844f6230@iki.fi
Backpatch-through: 13
2025-03-10 17:07:38 +02:00
Tom Lane
9f87e2593f Doc: improve description of window function processing.
The previous wording talked about a "single pass over the data",
which can be read as promising more than intended (to wit, that only
one WindowAgg plan node will be used).  What we promise is only what
the SQL spec requires, namely that the data not get re-sorted between
window functions with compatible PARTITION BY/ORDER BY clauses.
Adjust the wording in hopes of making this clearer.

Reported-by: Christopher Inokuchi <cinokuchi@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/CABde6B5va2wMsnM79u_x=n9KUgfKQje_pbLROEBmA9Ru5XWidw@mail.gmail.com
Backpatch-through: 13
2025-03-10 10:22:08 -04:00
Alexander Korotkov
6bb6a62f3c Use extended stats for precise estimation of bucket size in hash join
Because columns in a table often have functional dependencies, PostgreSQL's
estimate of the number of distinct values over a set of columns can be too
low (or, much more rarely, too high) when dealing with multi-clause joins.
In the case of hash join, this can end up with a small number of predicted
hash buckets and, as a result, picking a non-optimal merge join.

To improve the situation, we introduce one additional stage of bucket size
estimation: when there are two or more join clauses, the estimator looks up
extended statistics and uses them for multicolumn estimation.  Clauses are
grouped into lists, each containing expressions referencing the same
relation.  The result of the multicolumn estimation made over such a list is
combined with the others according to the caller's logic.  Clauses that are
not estimated are returned to the caller for further estimation.
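
For example (illustrative; the table and column names are placeholders),
defining multivariate ndistinct statistics gives the estimator something to
look up for a two-column join:

  CREATE STATISTICS t1_a_b (ndistinct) ON a, b FROM t1;
  ANALYZE t1;
  -- a hash join on (t1.a, t1.b) can now consult these statistics when
  -- estimating the bucket size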

Discussion: https://postgr.es/m/52257607-57f6-850d-399a-ec33a654457b%40postgrespro.ru
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Andy Fan <zhihui.fan1213@gmail.com>
Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
2025-03-10 13:42:01 +02:00
Alexander Korotkov
fae535da0a Teach Append to consider tuple_fraction when accumulating subpaths.
This change encourages more active use of IndexScan and parameterized
NestLoop paths in partitioned cases under an Append node, as already
happens with plain tables.  As newly added regression tests demonstrate,
it should make the partitionwise technique smarter.

With an indication of how many tuples are needed, it can be preferable to
use the 'fractional branch' subpaths of the Append path list, which are
better suited to that specific number of tuples.  At a higher planning
level, if the optimizer needs all the tuples, it will choose non-fractional
paths.  If, during execution, Append ends up returning fewer tuples than
declared by tuple_fraction, using the 'intermediate' variant of paths does
no harm; but when a sensible set of tuples is selected, it earns a
considerable profit.

The change to the existing regression test demonstrates the positive outcome
of this feature: instead of scanning the whole table, the optimizer prefers
a parameterized scan, being aware that the join has to produce only a single
tuple to perform the query.

Discussion: https://www.postgresql.org/message-id/flat/CAN-LCVPxnWB39CUBTgOQ9O7Dd8DrA_tpT1EY3LNVnUuvAX1NjA%40mail.gmail.com
Author: Nikita Malakhov <hukutoc@gmail.com>
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Andy Fan <zhihuifan1213@163.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
2025-03-10 13:38:39 +02:00
Peter Eisentraut
b83e8a2ca2 Remove support for temporal RESTRICT foreign keys
It isn't clear how these should behave, so let's wait to implement them
until we are sure how to do it.

This feature was initially added by commit 89f908a6d0a, so it hasn't
been released yet.

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://postgr.es/m/e773bc11-4ac1-40de-bb91-814e02f05b6d%40eisentraut.org
2025-03-10 11:31:01 +01:00
David Rowley
e033696596 Fix incorrect #endif comment
Noticed while reading code in this area.
2025-03-10 13:36:04 +13:00
Heikki Linnakangas
03f8e9a7fe Fix incorrect assertion in libpqwalreceiver
Was supposed to check the length of the array, but was checking its
size in bytes.

Author: Jacob Brazeal <jacob.brazeal@gmail.com>
Discussion: https://www.postgresql.org/message-id/CA%2BCOZaA_9afJxj9ZuO73U5P7WXP%2BZM9NGnZvTDCmBFz0FGP%2BwA@mail.gmail.com
2025-03-09 20:40:45 +02:00
Heikki Linnakangas
2a943afcff Fix test name and username used in failed connection attempts
The first failed connection tests the "regular" connections limit, not
the reserved limit.

In the second failed connection, the username doesn't really matter,
but since the previous successful connections used "regress_reserved",
it seems weird to switch back to "regress_regular" for the
expected-to-fail attempt.

Discussion: https://www.postgresql.org/message-id/fd5e9523-78d3-4270-86b2-fd1b1eeb4fc9@iki.fi
2025-03-09 19:47:55 +02:00
Tom Lane
fedfcf6650 Don't try to parallelize array_agg() on an anonymous record type.
This doesn't work because record_recv requires the typmod that
identifies the specific record type (in our session) and
array_agg_deserialize has no convenient way to get that information.
The result is an "input of anonymous composite types is not
implemented" error.

We could probably make this work if we had to, but it does not seem
worth the trouble, given that it took this long to get a field report.
Just shut off parallelization, as though record_recv didn't exist.

Oversight in commit 16fd03e95.  Back-patch to v16 where that
came in.

Reported-by: Kirill Zdornyy <kirill@dineserve.com>
Diagnosed-by: Richard Guo <guofenglinux@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/atLI5Kce2ie1zcYjU0w_kjtVaxiYbYGTihrkLDmGZQnRDD4pnXukIATaABbnIj9pUnelC4ESvCXMm4HAyHg-v61XABaKpERj0A2IXzJZM7g=@dineserve.com
Backpatch-through: 16
2025-03-09 13:11:20 -04:00
Nathan Bossart
3c472a1829 doc: Adjust note about pg_upgrade's --jobs option.
Presently, this section lists a couple of parallelized parts of
pg_upgrade and suggests a starting point for setting the --jobs
option.  The list of parallelized tasks is not particularly
actionable, and the phrasing for the --jobs recommendation is
confusing to some readers.

This commit attempts to improve this section by eliminating the
list of parallelized tasks and instead highlighting that --jobs is
most useful for clusters with multiple databases or tablespaces.
Additionally, the recommendation for setting --jobs is simplified
to suggest starting with the number of CPU cores.

Reported-by: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Magnus Hagander <magnus@hagander.net>
Discussion: https://postgr.es/m/Z8dBn_5iGLNuYiPo%40nathan
2025-03-08 14:28:16 -06:00
Jeff Davis
1852aea3f5 Don't convert to and from floats in pg_dump.
Commit 8f427187db improved performance by remembering relation stats
as native types rather than issuing a new query for each relation.

Using native types is fine for integers like relpages, but reltuples
is floating point. The commit controlled for that complexity by using
setlocale(LC_NUMERIC, "C"). After that, Alexander Lakhin found a
problem in pg_strtof(), fixed in 00d61a08c5.

While we aren't aware of any more problems with that approach, it
seems wise to just use a string the whole way for floating point
values, as Corey's original patch did, and get rid of the
setlocale(). Integers are still converted to native types to avoid
wasting memory.

Co-authored-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/3049348.1740855411@sss.pgh.pa.us
Discussion: https://postgr.es/m/560cca3781740bd69881bb07e26eb8f65b09792c.camel%40j-davis.com
2025-03-08 11:25:36 -08:00
Tom Lane
7fb8801021 Clear errno before calling strtol() in spell.c.
Per POSIX, a caller of strtol() that wishes to check for errors must
set errno to 0 beforehand.  Several places in spell.c neglected that,
so that they risked delivering a false overflow error in case errno
had been ERANGE already.  Given the lack of field reports, this case
may be unreachable at present --- but it's surely trouble waiting to
happen, so fix it.

Author: Jacob Brazeal <jacob.brazeal@gmail.com>
Discussion: https://postgr.es/m/CA+COZaBhsq6EromFm+knMJfzK6nTpG23zJ+K2=nfUQQXcj_xcQ@mail.gmail.com
Backpatch-through: 13
2025-03-08 11:24:25 -05:00
Peter Geoghegan
67fc4c9fd7 Make parallel nbtree index scans use an LWLock.
Teach parallel nbtree index scans to use an LWLock (not a spinlock) to
protect the scan's shared descriptor state.

Preparation for an upcoming patch that will add skip scan optimizations
to nbtree.  That patch will create the need to occasionally allocate
memory while the scan descriptor is locked, while copying datums that
were serialized by another backend.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=PKR6rB7qbx+Vnd7eqeB5VTcrW=iJvAsTsKbdG+kW_UA@mail.gmail.com
2025-03-08 11:10:14 -05:00
Peter Eisentraut
8021c77769 Make amcanorder independent of amconsistentordering
Follow-up to commit af4002b381d: Make amconsistentordering not depend
on amcanorder.  Although they are related, they are independent
properties.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/E1tngY6-0000UL-2n%40gemulon.postgresql.org
2025-03-08 09:37:06 +01:00
Peter Eisentraut
661781f3a3 Fix typo
Duplicate assignment in commit af4002b381d should have been a
different field.  (But it didn't affect the outcome.)

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/E1tngY6-0000UL-2n%40gemulon.postgresql.org
2025-03-08 08:06:30 +01:00
Michael Paquier
21f653cc00 Use stricter ordering in regression test query for pg_stat_io
The query introduced in 8b532771a099 is proving to have ordering issues
under at least the locale cs_CZ.  This commit updates the query to use a
stricter ordering.

Per reports from buildfarm members hippopotamus and jay.
2025-03-08 13:39:57 +09:00
Michael Paquier
8b532771a0 Add regression test listing all the possible tuples in pg_stat_io
pg_stat_io returns a set of tuples based on a combination of three
properties (BackendType, IOObject and IOContext), using
pgstat_tracks_io_object() to decide whether a BackendType should return a
tuple for a given pair of IOObject and IOContext.

This commit adds a regression test to track all the combinations
supported.  This is useful for knowing which tuples are relevant when
adding a new BackendType to the set or when touching
pgstat_tracks_io_object(); while playing with this area I have noticed
that it is easy to break things without the regression tests noticing a
difference in some cases.

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z8exfAehbVbEKXW5@paquier.xyz
2025-03-08 12:22:41 +09:00
Michael Paquier
9a8dd2c5a6 Improve check for detection of pending data in backend statistics
The callback pgstat_backend_have_pending_cb() is used as a way for
pg_stat_report() to detect if there is any pending data for backend
statistics.

It did not include a check based on pgstat_tracks_backend_bktype(), that
discards processes whose backend types do not support backend
statistics.  The logic is not a problem on HEAD, as processes that do
not support backend statistics cannot touch PendingBackendStats, so the
callback would always report that there is no pending data in this case.
However, we would run into trouble once backend statistics include
portions of pending stats that are not always zeroed, like pgWalUsage.

There is no reason for pgstat_backend_have_pending_cb() to not check
for pgstat_tracks_backend_bktype(), anyway, and this pattern is safer in
the long run, so let's update the code to do so.

While on it, this commit adds a proper initialization to
PendingBackendStats.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/Z8l6EMM4ImVoWRkg@ip-10-97-1-34.eu-west-3.compute.internal
2025-03-08 10:56:30 +09:00
Peter Geoghegan
8e167e6188 nbtree: refine _bt_readnextpage contract comments.
Another minor follow-up commit for commit 1bd4bc85, which changed the
_bt_readnextpage contract.
2025-03-07 18:35:13 -05:00
Nathan Bossart
088f8e2d56 Assert that wrapper_handler()'s argument is within expected range.
pqsignal() already does a similar check, but strange Valgrind
reports have us wondering if wrapper_handler() is somehow getting
called with an invalid signal number.

Reported-by: Tomas Vondra <tomas@vondra.me>
Suggested-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/ace01111-f9ac-4f61-b1b1-8e9379415444%40vondra.me
Backpatch-through: 17
2025-03-07 15:23:09 -06:00
Tom Lane
34c3c5ce1c Include column name in build_attrmap_by_position's error reports.
Formerly we only provided the column number, but it's frequently
more useful to mention the column name.  The input tupdesc often
doesn't have useful column names, but the output tupdesc usually
contains user-supplied names, so report that one.

Author: Marcos Pegoraro <marcos@f10.com.br>
Co-authored-by: jian he <jian.universality@gmail.com>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Co-authored-by: Erik Wienhold <ewie@ewie.name>
Reviewed-by: Vladlen Popolitov <v.popolitov@postgrespro.ru>
Discussion: https://postgr.es/m/CAB-JLwanky28gjAMdnMh1CjyO1b2zLdr6UOA1-oY9G7PVL9KKQ@mail.gmail.com
2025-03-07 13:24:20 -05:00
Andres Freund
b48832cddb tests: Don't fail due to high default timeout in postmaster/003_start_stop
Some BF animals use very high timeouts due to their slowness. Unfortunately
postmaster/003_start_stop fails if a high timeout is configured, due to
authentication_timeout having a fairly low max.

As this test is reasonably fast, the easiest fix seems to be to cap the
timeout to 600.

Per buildfarm animal skink.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/ggflhkciwdyotpoie323chu2c2idpjk5qimrn462encwx2io7s@thmcxl7i6dpw
2025-03-07 13:09:16 -05:00
Andres Freund
71d1ed6fe1 tests: Fix race condition in postmaster/002_connection_limits
The test occasionally failed due to unexpected connection limit errors being
encountered after having waited for FATAL errors on another connection. These
spurious failures were caused by the backend reporting FATAL errors to the
client before detaching from the PGPROC entry. Adding a sleep(1) before
proc_exit() makes it easy to reproduce that problem.

To fix the issue, add a helper function that waits for postmaster to notice
the process having exited. For now this is implemented by waiting for the
DEBUG2 message that postmaster logs in that case. That's not the prettiest
fix, but simple. If we notice this problem elsewhere, it might be worthwhile
to make this more general, e.g. by adding an injection point.

Reported-by: Tomas Vondra <tomas@vondra.me>
Diagnosed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Tested-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/ggflhkciwdyotpoie323chu2c2idpjk5qimrn462encwx2io7s@thmcxl7i6dpw
2025-03-07 13:09:16 -05:00
Robert Haas
d3fc7a5120 doc: Add missing decimal places to example rowcount.
Commit 95dbd827f2edc4d10bebd7e840a0bd6782cf69b7 updated a bunch
of similar cases in the documentation, but missed this one.

Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
2025-03-07 09:00:53 -05:00
Peter Eisentraut
7f24c02743 Improve possible performance regression
Commit ce62f2f2a0a introduced calls to GetIndexAmRoutineByAmId() in
lsyscache.c functions.  This call is a bit more expensive than a
simple syscache lookup.  So rearrange the nesting so that we call that
one last and do the cheaper checks first.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/E1tngY6-0000UL-2n%40gemulon.postgresql.org
2025-03-07 11:46:33 +01:00
Peter Eisentraut
af4002b381 Rename amcancrosscompare
After more discussion about commit ce62f2f2a0a, rename the index AM
property amcancrosscompare to two separate properties
amconsistentequality and amconsistentordering.  Also improve the
documentation and update some comments that were previously missed.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/E1tngY6-0000UL-2n%40gemulon.postgresql.org
2025-03-07 11:46:33 +01:00
Dean Rasheed
6da469bada Allow casting between bytea and integer types.
This allows smallint, integer, and bigint values to be cast to and
from bytea. The bytea value is the two's complement representation of
the integer, with the most significant byte first. For example:

  1234::bytea -> \x000004d2
  (-1234)::bytea -> \xfffffb2e
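
A cast in the reverse direction should recover the original value; a small,
hypothetical illustration (assuming the same big-endian two's complement
representation is used when reading the bytes back):

  SELECT '\x000004d2'::bytea::integer;    -- 1234
  SELECT ((-1234)::bytea)::integer;       -- -1234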

Author: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-by: Joel Jacobson <joel@compiler.org>
Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/CAJ7c6TPtOp6%2BkFX5QX3fH1SVr7v65uHr-7yEJ%3DGMGQi5uhGtcA%40mail.gmail.com
2025-03-07 09:31:18 +00:00
Jeff Davis
d611f8b158 CREATE INDEX: don't update table stats if autovacuum=off.
We previously fixed this for binary upgrade in 71b66171d0, but a
similar problem remained when dumping statistics without data.

Fix by not opportunistically updating table stats during CREATE INDEX
when autovacuum is disabled. For stats to be stable at all, the server
needs to be aware that it should not take every opportunity to update
stats. Per discussion, autovacuum=off is a signal that the user
expects stats to be stable; though if necessary, we could create
a more specific mode in the future.

Reported-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAExHW5vf9D+8-a5_BEX3y=2y_xY9hiCxV1=C+FnxDvfprWvkng@mail.gmail.com
Discussion: https://postgr.es/m/ca81cbf6e6ea2af838df972801ad4da52640a503.camel%40j-davis.com
2025-03-06 19:39:14 -08:00
John Naylor
19e57f4f78 Revert "vacuumdb: Add option for analyzing only relations missing stats."
This reverts commit 5f8eb25706b62923c53172e453c8a4dedd877a3d, which was
included in my branch by mistake.
2025-03-07 10:35:21 +07:00
John Naylor
fcabc3adf8 Doc: correct aggressive vacuum threshold for multixact members storage
The threshold is two billion members, which was interpreted as 2GB
in the documentation. Fix to reflect that each member takes up five
bytes, which translates to about 10GB. This is not exact, because of
page boundaries. While at it, mention the maximum size 20GB.
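
As rough arithmetic: 2,000,000,000 members x 5 bytes/member is about 10 GB
for the aggressive-vacuum threshold, and the hard cap of 2^32 members x
5 bytes is about 20 GB.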

This has been wrong since commit c552e171d16e, so backpatch to
version 14.

Author: Alex Friedman <alexf01@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/CACbFw60UOk6fCC02KsyT3OfU9Dnuq5roYxdw2aFisiN_p1L0bg@mail.gmail.com
Backpatch-through: 14
2025-03-07 10:22:56 +07:00
Nathan Bossart
5f8eb25706 vacuumdb: Add option for analyzing only relations missing stats.
This commit adds a new --missing-only option that can be used in
conjunction with --analyze-only and --analyze-in-stages.  When this
option is specified, vacuumdb will generate ANALYZE commands for a
relation if it is missing any statistics it should ordinarily have.
For example, if a table has statistics for one column but not
another, we will analyze the whole table.  A similar principle
applies to extended statistics, expression indexes, and table
inheritance.

Co-authored-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: TODO
Discussion: https://postgr.es/m/Z5O1bpcwDrMgyrYy%40nathan
2025-03-07 10:17:35 +07:00
Michael Paquier
e2080261cc Fix race condition in TAP test 007_pre_auth
The authentication test added in c76db55c9085 expects a backend to start
and wait at the injection point "init-pre-auth".  A query is used to
retrieve the PID of the backend waiting at authentication, but its WHERE
clause was too soft, checking only for a backend in a "starting" state.

As proved by the CI, this WHERE clause is not enough.  There is a small
window between the moment when the backend is reported as "starting" in
its backend entry and the moment when it waits in its injection point,
and it was possible for the test to return the PID of a backend process
not yet waiting in the injection point, causing spurious failures.  This
issue is fixed by tweaking the query retrieving the PID of the backend
waiting before authentication so that we check for "init-pre-auth" in its
wait_event.  An extra check based on the backend_type is added, based on
a suggestion by Jacob, to be more cautious.

Error spotted by the CI on Windows, but it could happen anywhere, as
long as the authentication path is slow enough compared to the TAP test.

Reported-by: Andres Freund <andres@anarazel.de>
Author: Jacob Champion <jacob.champion@enterprisedb.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/soexrl7oeyku24bj3czupxmv27ow35u6edymp5y3oyoysbe2kb@r3tgoos2xp2x
2025-03-07 08:12:45 +09:00
Álvaro Herrera
24503fa95c
reindexdb: move PQfinish() calls to the right place
get_parallel_object_list() has no business closing a connection it did
not create.  Make things more sensible by closing the connection at the
level where it is created, in reindex_one_database().

Extracted from a larger patch by the same author.  However, the patch as
submitted not only was not described as containing this change, but in
addition it contained a fatal flaw whereby reindexdb would crash and
fail across all of its TAP tests, which is why I list myself as
co-author.

Author: Ranier Vilela <ranier.vf@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CAEudQArfqr0-s0VVPSEh=0kgOgBJvFNdGW=xSL5rBcr0WDMQYQ@mail.gmail.com
2025-03-06 19:40:06 +01:00
Tom Lane
0f21db36d6 Fix some performance issues in GIN query startup.
If a GIN index search had a lot of search keys (for example,
"jsonbcol ?| array[]" with tens of thousands of array elements),
both ginFillScanKey() and startScanKey() took O(N^2) time.
Worse, those loops were uncancelable for lack of CHECK_FOR_INTERRUPTS.

The problem in ginFillScanKey() is the brute-force search key
de-duplication done in ginFillScanEntry().  The most expedient
solution seems to be to just stop trying to de-duplicate once
there are "too many" search keys.  We could imagine working harder,
say by using a sort-and-unique algorithm instead of brute force
compare-all-the-keys.  But it seems unlikely to be worth the trouble.
There is no correctness issue here, since the code already allowed
duplicate keys if any extra_data is present.

The problem in startScanKey() is the loop that attempts to identify
the first non-required search key.  In the submitted test case, that
vainly tests all the key positions, and each iteration takes O(N)
time.  One part of that is that it's reinitializing the entryRes[]
array from scratch each time, which is entirely unnecessary given
that the triConsistentFn isn't supposed to scribble on its input.
We can easily adjust the array contents incrementally instead.
The other part of it is that the triConsistentFn may itself take
O(N) time (and does in this test case).  This is all extremely
brute force: in simple cases with AND or OR semantics, we could
know without any looping whatever that all or none of the keys
are required.  But GIN opclasses don't have any API for exposing
that knowledge, so at least in the short run there is little to
be done about that.  Put in a CHECK_FOR_INTERRUPTS so that at
least the loop is cancelable.

These two changes together resolve the primary complaint that
the test query doesn't respond promptly to cancel interrupts.
Also, while they don't completely eliminate the O(N^2) behavior,
they do provide quite a nice speedup for mid-sized examples.

Bug: #18831
Reported-by: Niek <niek.brasa@hitachienergy.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18831-e845ac44ebc5dd36@postgresql.org
Backpatch-through: 13
2025-03-06 11:54:31 -05:00
Andrew Dunstan
e33969abc1 Further fix for json_strip_nulls documentation
Oversight in commit 4603903d294.

Author: Shinoda, Noriyoshi (SXD Japan FSI) <noriyoshi.shinoda@hpe.com>
2025-03-06 10:24:03 -05:00
Andrew Dunstan
0e76f253f4 Remove extraneous commas in json{b}_strip_nulls documentation
Oversight in commit 4603903d294.

Author: Ian Lawrence Barwick <barwick@gmail.com>
2025-03-06 08:46:15 -05:00
Amit Kapila
588acf6d0e Avoid invalidating all RelationSyncCache entries on publication change.
On change of publication via ALTER PUBLICATION ... SET/ADD/DROP commands,
we were invalidating all the relations present in relation sync cache
maintained by pgoutput. We need to invalidate only the relation entries
that are changed as part of publication DDL.

We have ensured that the publication DDL execution generated the
invalidations required to invalidate impacted relation sync entries in
RelationSyncCache.

This improves the performance by avoiding building the cache entries for
the cases where a publication has many tables but only one of them is
dropped.

Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/OSCPR01MB14966C09AA201EFFA706576A7F5C92@OSCPR01MB14966.jpnprd01.prod.outlook.com
2025-03-06 14:19:38 +05:30
Jeff Davis
1d33de9d68 Organize and deduplicate statistics import tests.
Author: Corey Huinker <corey.huinker@gmail.com>
Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_bWEqUfxhODfJ-XbZC75vq=P6DYOKK6biyey=yM1Ah3Hg@mail.gmail.com
Discussion: https://postgr.es/m/CADkLM=f1n2_Vomq0gKab7xdxDHmJGgn=DE48P8fzQOp3Mrs1Qg@mail.gmail.com
2025-03-06 00:19:22 -08:00
Jeff Davis
f9f4b43b8d Address stats export review comments.
Per discussion, did not use Jian He's patch exactly.

Reported-by: jian he <jian.universality@gmail.com>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/CACJufxFVq=tq9u1zrHWYSbMi1T07gS9Ff0LJScMco4HZmtZ1xw@mail.gmail.com
Discussion: https://postgr.es/m/CADkLM=f1n2_Vomq0gKab7xdxDHmJGgn=DE48P8fzQOp3Mrs1Qg@mail.gmail.com
2025-03-06 00:11:12 -08:00
Jeff Davis
298944e8d8 Address stats import review comments.
Reported-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxHG9MBQozbJQ4JRBcRbUO+t+sx4qLZX092rS_9b4SR_EA@mail.gmail.com
2025-03-05 23:07:25 -08:00
Heikki Linnakangas
39de4f157d Fix compiler warnings about typedef redefinitions
Clang with -Wtypedef-redefinition produced warnings:

    src/include/storage/latch.h:122:3: error: redefinition of typedef 'Latch' is a C11 feature [-Werror,-Wtypedef-redefinition]

Per buildfarm
2025-03-06 03:10:22 +02:00
Michael Paquier
7f7f324eb5 Add more monitoring data for WAL writes in the WAL receiver
This commit adds two improvements related to the monitoring of WAL
writes for the WAL receiver.

First, write counts and timings are now counted in pg_stat_io for the
WAL receiver.  These have been discarded from pg_stat_wal in
ff99918c625a due to performance concerns, related to the fact that we
still relied on an on-disk file for the stats back then, even with
track_wal_io_timing to avoid the overhead of the timestamp calculations.
This implementation is simpler than the original proposal as it is
possible to rely on the APIs of pgstat_io.c to do the job.  Like the
fsync and read data, track_wal_io_timing needs to be enabled to track
the timings.
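
Once the WAL receiver has done some work, its activity should be visible
with a query along these lines (column and label values are assumed here
for illustration):

  SELECT backend_type, object, writes, write_time, fsyncs
    FROM pg_stat_io
   WHERE backend_type = 'walreceiver' AND object = 'wal';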

Second, a wait event is added around the pg_pwrite() call in charge of
the writes, using the existing WAIT_EVENT_WAL_WRITE.  This is useful as
the WAL receiver data is tracked in pg_stat_activity.

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z8gFnH4o3jBm5BRz@ip-10-97-1-34.eu-west-3.compute.internal
2025-03-06 09:41:37 +09:00
Heikki Linnakangas
393e0d2314 Split WaitEventSet functions to separate source file
latch.c now only contains the Latch related functions, which build on
the WaitEventSet abstraction. Most of the platform-dependent stuff is
now in waiteventset.c.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/8a507fb6-df28-49d3-81a5-ede180d7f0fb@iki.fi
2025-03-06 01:26:16 +02:00
Heikki Linnakangas
84e5b2f07a Use ModifyWaitEvent to update exit_on_postmaster_death
This is in preparation for splitting WaitEventSet related functions to
a separate source file. That will hide the details of WaitEventSet
from WaitLatch, so it must use an exposed function instead of
modifying WaitEventSet->exit_on_postmaster_death directly.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/8a507fb6-df28-49d3-81a5-ede180d7f0fb@iki.fi
2025-03-06 01:26:12 +02:00
Fujii Masao
9f25b9f739 ecpg: Fix compiler warning in ecpg build with Meson.
Previously, Meson could produce a warning about the use of 'deps' in ecpg:

    WARNING: Project targets '>=0.54' but uses a feature introduced in '0.60.0': list.<plus>. The right-hand operand was not a list.

The right-hand operand of 'deps' should be a list. This commit fixes
the warning by wrapping it with square brackets.

This issue was introduced in commit 28f04984f0c.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/CAOYmi+ks8wO06Ymxduw2h_eQJ_D4_jHGeyMK0P=p5Q3psnEdMA@mail.gmail.com
2025-03-06 08:22:30 +09:00
Heikki Linnakangas
a98e4dee63 Remove unused ShutdownLatchSupport() function
The only caller was removed in commit 80a8f95b3b. I don't foresee
needing it any time soon, and I'm working on some big changes in this
area, so let's remove it out of the way.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/8a507fb6-df28-49d3-81a5-ede180d7f0fb@iki.fi
2025-03-05 23:52:04 +02:00
Daniel Gustafsson
153836b99a ci: Remove installation of libcurl
The CI images come with libcurl pre-installed since commit a119426
in the pg-vm-images repository, so remove the installation commands
from the Cirrus tasks.  Installation of libcurl packages was added
in the OAuth patchset which introduced the dependency; a backpatch
is thus not applicable.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/8745B9D8-D897-4302-BD4C-FC18F291ECB7@yesql.se
2025-03-05 22:12:20 +01:00
Andres Freund
d4a6c847ca ci: Document what makes certain tasks special
To increase coverage without drastically increasing CI resource usage, we have
different CI tasks test different things (e.g. the linux tasks use
sanitizers).  Unfortunately that can create confusing situations where CI
fails on some OS, but not others, without the problem appearing to be platform
dependent.

To partially address that, add a comment, prefixed with SPECIAL, to each
task that we use to test in some non-default way.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/321570.1741195755@sss.pgh.pa.us
2025-03-05 13:19:28 -05:00
Andres Freund
0a2f5df881 ci: freebsd: Specify debug_parallel_query=regress
A lot of buildfarm animals run with debug_parallel_query=regress, while CI
didn't test that. That lead to the annoying situation of only noticing related
test instabilities after merging changes upstream.

FreeBSD was chosen because it's a relatively fast task. It also tests
debug_write_read_parse_plan_trees etc., which are probably exercised a bit more
heavily with debug_parallel_query=regress.

Discussion: https://postgr.es/m/zbuk4mlov22yfoktf5ub3lwjw2b7ezwphwolbplthepda42int@h6wpvq7orc44
2025-03-05 13:19:28 -05:00
Andres Freund
ad40644eb8 ci: Upgrade FreeBSD image
Upgrade to the current stable version. To avoid needing commits like this in
the future, the CI image name now doesn't contain the OS version number
anymore.

Backpatch to all versions with CI support; we don't want to generate CI images
for multiple FreeBSD versions.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ3_P4JJ6tWZafjf-_XbHgG6DQGXhH-y6Yp78_bwBJjcww@mail.gmail.com
Backpatch-through: 15
2025-03-05 10:33:47 -05:00
Peter Geoghegan
d00107cd63 Revert "Show index search count in EXPLAIN ANALYZE."
This reverts commit 5ead85fbc81162ab1594f656b036a22e814f96b3.

This commit shows test failures with debug_parallel_query=regress.  The
underlying issue needs to be debugged, so revert for now.
2025-03-05 10:27:31 -05:00
Andrew Dunstan
4603903d29 Allow json{b}_strip_nulls to remove null array elements
An additional parameter ("strip_in_arrays") is added to these functions.
It defaults to false. If true, then null array elements are removed as
well as null valued object fields. JSON that just consists of a single
null is not affected.
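
A minimal sketch of the new behavior (output formatting may differ):

  SELECT json_strip_nulls('{"a": 1, "b": null, "c": [1, null, 2]}', true);
  -- {"a": 1, "c": [1, 2]}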

Author: Florents Tselai <florents.tselai@gmail.com>

Discussion: https://postgr.es/m/4BCECCD5-4F40-4313-9E98-9E16BEB0B01D@gmail.com
2025-03-05 10:04:02 -05:00
Peter Geoghegan
5ead85fbc8 Show index search count in EXPLAIN ANALYZE.
Expose the count of index searches/index descents in EXPLAIN ANALYZE's
output for index scan nodes.  This information is particularly useful
with scans that use ScalarArrayOp quals, where the number of index scans
isn't predictable in advance (at least not with optimizations like the
one added to nbtree by Postgres 17 commit 5bf748b8).  It will also be
useful when EXPLAIN ANALYZE shows details of an nbtree index scan that
uses skip scan optimizations set to be introduced by an upcoming patch.

The instrumentation works by teaching index AMs to increment a new
nsearches counter whenever a new index search begins.  The counter is
incremented at exactly the same point that index AMs must already
increment the index's pg_stat_*_indexes.idx_scan counter (we're counting
the same event, but at the scan level rather than the relation level).
The new counter is stored in the scan descriptor (IndexScanDescData),
which explain.c reaches by going through the scan node's PlanState.

This approach doesn't match the approach used when tracking other index
scan specific costs (e.g., "Rows Removed by Filter:").  It is similar to
the approach used in other cases where we must track costs that are only
readily accessible inside an access method, and not from the executor
(e.g., "Heap Blocks:" output for a Bitmap Heap Scan).  It is inherently
necessary to maintain a counter that can be incremented multiple times
during a single amgettuple call (or amgetbitmap call), and directly
exposing PlanState.instrument to index access methods seems unappealing.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-By: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=PKR6rB7qbx+Vnd7eqeB5VTcrW=iJvAsTsKbdG+kW_UA@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WzkRqvaqR2CTNqTZP0z6FuL4-3ED6eQB0yx38XBNj1v-4Q@mail.gmail.com
2025-03-05 09:36:48 -05:00
Heikki Linnakangas
635f580120 Rename some signal and interrupt handling functions for consistency
The usual pattern for handling a signal is that the signal handler
sets a flag and calls SetLatch(MyLatch), and CHECK_FOR_INTERRUPTS() or
other code that is part of a wait loop calls another function to deal
with it. The naming of the functions involved was a bit inconsistent,
however. CHECK_FOR_INTERRUPTS() calls ProcessInterrupts() to do the
heavy-lifting, but the analogous functions in aux processes were
called HandleMainLoopInterrupts(), HandleStartupProcInterrupts(),
etc. Similarly, most subroutines of ProcessInterrupts() were called
Process*(), but some were called Handle*().

To make things less confusing, rename all the functions that are part
of the overall signal/interrupt handling system but are not executed
in a signal handler to e.g. ProcessSomething(), rather than
HandleSomething(). The "Process" prefix is now consistently used in
the non-signal-handler functions, and the "Handle" prefix in functions
that are part of signal handlers, except for some completely unrelated
functions that clearly have nothing to do with signal or interrupt
handling.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/8a384b26-1499-41f6-be33-64b801fb98b8@iki.fi
2025-03-05 16:22:26 +02:00
Álvaro Herrera
f4e53e10b6
Add ALTER TABLE ... ALTER CONSTRAINT ... SET [NO] INHERIT
This allows redefining an existing non-inheritable constraint as
inheritable, which makes it possible to straighten up situations with NO
INHERIT constraints so that they can become normal constraints without
having to re-verify existing data.  For existing inheritance children this
may require creating additional constraints, if they don't exist already.

It also allows doing the opposite, if only for symmetry.
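
A minimal sketch of the new syntax (table and constraint names are made up):

  CREATE TABLE parent (a int,
    CONSTRAINT parent_a_check CHECK (a > 0) NO INHERIT);
  CREATE TABLE child () INHERITS (parent);
  -- make the constraint inheritable; the child gains a matching constraint
  ALTER TABLE parent ALTER CONSTRAINT parent_a_check SET INHERIT;
  -- and the opposite, for symmetry
  ALTER TABLE parent ALTER CONSTRAINT parent_a_check SET NO INHERIT;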

Author: Suraj Kharage <suraj.kharage@enterprisedb.com>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CAF1DzPVfOW6Kk=7SSh7LbneQDJWh=PbJrEC_Wkzc24tHOyQWGg@mail.gmail.com
2025-03-05 13:50:22 +01:00
Michael Paquier
f4694e0f35 Fix some gaps in pg_stat_io with WAL receiver and WAL summarizer
The WAL receiver and WAL summarizer processes each gain a call to
pgstat_report_wal(), to make sure that they report their WAL statistics
to pgstats, gathering data for pg_stat_io.

In the WAL receiver, the stats reports are timed with status updates sent
to the primary, which depend on wal_receiver_status_interval and
wal_receiver_timeout.  This is a conservative choice, but perhaps we
could be more aggressive with the frequency of the stats reports.  An
interesting historical fact is that the WAL receiver does writes and
syncs of WAL, but it has never reported its statistics to pgstats in
pg_stat_wal.

In the WAL summarizer, the stats reports are done each time the process
waits for WAL.

While on it, pg_stat_io is adjusted so that these two processes do not
report any rows when IOObject is not WAL, making the view easier to use
with fewer rows.

Two tests are added in TAP, checking statistics for the WAL summarizer
and the WAL receiver.  Status updates in the WAL receiver are currently
possible in the recovery test 001_stream_rep.pl.

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z8UKZyVSHUUQJHNb@paquier.xyz
2025-03-05 10:17:39 +09:00
Michael Paquier
54d23601b9 psql: Fix memory leak with \gx used within a pipeline
While inside a pipeline, \gx is currently forbidden and will make
exec_command_g() exit early.  There was a memory leak in this code path,
so let's fix it.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/CAO6_XqqFVQjLjZQiL7xdwLpzZEy1ghO_JWvCFPM_OmwF9s7XdA@mail.gmail.com
2025-03-05 07:56:03 +09:00
Tomas Vondra
b229c10164 Enforce memory limit during parallel GIN builds
Index builds are expected to respect maintenance_work_mem, just like
other maintenance operations. For serial builds this is done simply by
flushing the buffer in ginBuildCallback() into the index. But with
parallel builds it's more complicated, because there are multiple places
that can allocate memory.

ginBuildCallbackParallel() does the same thing as ginBuildCallback(),
except that the accumulated items are written into tuplesort. Then the
entries with the same key get merged - first in the worker, then in the
leader - and the TID lists may get (arbitrarily) long. It's unlikely it
would exceed the memory limit, but it's possible. We address this by
evicting some of the data if the list gets too long.

We can't simply dump the whole in-memory TID list. The GIN index bulk
insert code expects to see TIDs in monotonic order; it may fail if the
TIDs go backwards. If the TID lists overlap, evicting the whole current
TID list would break this (a later entry might add "old" TID values into
the already-written part).

In the workers this is not an issue, because the lists never overlap.
But the leader may see overlapping lists produced by the workers.

We can however derive a safe "horizon" TID - the entries (for a given
key) are sorted by (key, first TID), which means no future list can add
values before the last "first TID" we've seen. This patch tracks the
"frozen" part of the TID list, which we know can't change by merging
additional TID lists. If needed, we can evict this part of the list.

We don't want to do this too often - the smaller lists we evict, the
more expensive it'll be to merge them in the next step (especially in
the leader). Therefore we only trim the list if we have at least 1024
frozen items, and if the whole list is at least 64kB large.

These thresholds are somewhat arbitrary and conservative. We might
calculate the values from maintenance_work_mem, but tests show that does
not really improve anything (time, compression ratio, ...). So we stick
to these conservative values to release memory faster.

Author: Tomas Vondra
Reviewed-by: Matthias van de Meent, Andy Fan, Kirill Reshke
Discussion: https://postgr.es/m/6ab4003f-a8b8-4d75-a67f-f25ad98582dc%40enterprisedb.com
2025-03-04 20:41:13 +01:00
Masahiko Sawada
f52345995d pg_upgrade: Check for the expected error message in TAP tests.
Since pg_upgrade prints its error messages on stdout, we can't use
command_fails_like() to check if it fails for the right reason. This
commit uses command_checks_all() in pg_upgrade TAP tests to check the
exit status and stdout, enabling proper verification of error
reasons.

Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://postgr.es/m/87tt8h1vb7.fsf@wibble.ilmari.org
2025-03-04 11:16:12 -08:00
Álvaro Herrera
7bbc46213d
Fix ALTER TABLE error message
This bogus error message was introduced in 2013 by commit f177cbfe676d,
because of misunderstanding the processCASbits() API; at the time, no
test cases were added that would be affected by this change.  Only in
ca87c415e2fc was one added (along with a couple of typos), with an XXX
note that the error message was bogus.  Fix the whole thing, and add
some test cases.

Backpatch all the way back.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/202503041822.aobpqke3igvb@alvherre.pgsql
2025-03-04 20:07:30 +01:00
Masahiko Sawada
bacbc4863b Refactor Copy{From|To}GetRoutine() to use pass-by-reference argument.
The change improves efficiency by eliminating unnecessary copying of
CopyFormatOptions.

Coverity also complained about inefficiencies caused by
pass-by-value.

Oversight in 7717f6300 and 2e4127b6d.

Reported-by: Junwang Zhao <zhjwpku@gmail.com>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us> (per reports from coverity)
Author: Sutou Kouhei <kou@clear-code.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAEG8a3L6YCpPksTQMzjD_CvwDEhW3D_t=5md9BvvdOs5k+TA=Q@mail.gmail.com
2025-03-04 10:38:41 -08:00
Tomas Vondra
0b2a45a5d1 Compress TID lists when writing GIN tuples to disk
When serializing GIN tuples to tuplesorts during parallel index builds,
we can significantly reduce the amount of data by compressing the TID
lists. The GIN opclasses may produce a lot of data (depending on how
many keys are extracted from each row), and the TID compression is very
efficient and effective.

If the number of distinct keys is high, the first worker pass (reading
data from the table and writing them into a private tuplesort) may not
benefit from the compression very much. It is likely to spill data to
disk before the TID lists get long enough for the compression to help.
The second pass (writing the merged data into the shared tuplesort) is
more likely to benefit from compression.

The compression can be seen as a way to reduce the amount of disk space
needed by the parallel builds, because the data is written twice. First
into the per-worker tuplesorts, then into the shared tuplesort.

Author: Tomas Vondra
Reviewed-by: Matthias van de Meent, Andy Fan, Kirill Reshke
Discussion: https://postgr.es/m/6ab4003f-a8b8-4d75-a67f-f25ad98582dc%40enterprisedb.com
2025-03-04 19:02:05 +01:00
Tom Lane
9b4bdf876a Add .gitignore entry for ecpg test detritus.
Oversight in commit 28f04984f.
2025-03-04 12:58:07 -05:00
Tomas Vondra
c878de1db4 Make FP_LOCK_SLOTS_PER_BACKEND look like a function
The FP_LOCK_SLOTS_PER_BACKEND macro looks like a constant, but it
depends on the max_locks_per_transaction GUC, and thus can change. This
is non-obvious and confusing, so make it look more like a function by
renaming it to FastPathLockSlotsPerBackend().

While at it, use the macro when initializing fast-path shared memory,
instead of using the formula.

Reported-by: Andres Freund
Discussion: https://postgr.es/m/ffiwtzc6vedo6wb4gbwelon5nefqg675t5c7an2ta7pcz646cg%40qwmkdb3l4ett
2025-03-04 18:33:12 +01:00
Fujii Masao
91ecb5e0bc Add regression tests for pg_stat_progress_copy.tuples_skipped.
This commit adds tests to verify that tuples_skipped in pg_stat_progress_copy
works as expected. While existing tests checked other fields, tuples_skipped
was previously untested.

This improves test coverage and ensures accurate tracking of skipped tuples.

Author: Jian He <jian.universality@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Josef Šimánek <josef.simanek@gmail.com>
Discussion: https://postgr.es/m/CACJufxFazq-bfyhiO0KBojR=yOr84E25Rqf6mHB0Ow0KPidkKw@mail.gmail.com
2025-03-04 23:56:49 +09:00
Heikki Linnakangas
d2e7068392 Fix outdated comment
Commit bc971f4025 replaced the latch-setting mechanism that the
comment talked about with a condition variable. And before that,
commit 2258e76f90 moved the code so that the comment got detached from
the loop that it talked about, so move the comment closer to the loop.
2025-03-04 15:33:19 +02:00
Daniel Gustafsson
ad13490be0 doc: Expand version compatibility for pg_basebackup features
This updates the paragraph on backwards compatibility for server
features to include --incremental which only works on servers with
v17 or newer.  Backpatch down to v17 where incremental backup was
added.

Author: David G. Johnston <David.G.Johnston@Gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAKFQuwZYfZyeTkS3g2Ovw84TsxHa796xnf-u5kfgn_auyxZk0Q@mail.gmail.com
Backpatch-through: 17
2025-03-04 12:08:27 +01:00
Peter Eisentraut
3abbd8dbeb Fix accidental use of = instead of ==
Fix for commit 630f9a43cec.  It used = instead of ==.  The result
would be an incorrect error message.

Author: Jacob Brazeal <jacob.brazeal@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/CA%2BCOZaC-JMbhQ4O0Q8V1Bxa0R%2BNex_RN9D6UyuLPiEx_CK4Heg%40mail.gmail.com
2025-03-04 09:45:01 +01:00
Peter Eisentraut
f011acdd61 Fix ALTER TABLE ADD VIRTUAL GENERATED COLUMN when table rewrite
demo:
CREATE TABLE gtest20a (a int PRIMARY KEY, b int GENERATED ALWAYS AS (a * 2) VIRTUAL);
ALTER TABLE gtest20a ADD COLUMN c float8 DEFAULT RANDOM() CHECK (b < 60);
ERROR:  no generation expression found for column number 2 of table "pg_temp_17306"

In ATRewriteTable, the pg_attrdef default expression entry corresponding
to the variable OIDNewHeap (if valid) was not populated.  So OIDNewHeap
cannot be used to call expand_generated_columns_in_expr or
build_generation_expression.  Therefore, in ATRewriteTable we can only
use the existing relation to expand the generated expression.

Author: jian he <jian.universality@gmail.com>
Reviewed-by: Srinath Reddy <srinath2133@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CACJufxEJ%3DFoajabWXjszo_yrQeKSxdZ87KJqBW373rSbajKGAA%40mail.gmail.com
2025-03-04 09:18:32 +01:00
Richard Guo
716a051aac Avoid NullTest deduction for clone clauses
In commit b262ad440, we introduced an optimization that reduces an IS
NOT NULL qual on a column defined as NOT NULL to constant true, and an
IS NULL qual on a NOT NULL column to constant false, provided we can
prove that the input expression of the NullTest is not nullable by any
outer join.  This deduction happens after we have generated multiple
clones of the same qual condition to cope with commuted-left-join
cases.
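
For reference, the optimization from b262ad440 covers cases like this
(illustrative example, not taken from the commit):

  CREATE TABLE t (a int NOT NULL);
  -- "a IS NOT NULL" can be reduced to constant true, so no filter is needed
  EXPLAIN SELECT * FROM t WHERE a IS NOT NULL;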

However, performing the NullTest deduction for clone clauses can be
unsafe, because we don't have a reliable way to determine if the input
expression of a NullTest is non-nullable: nullingrel bits in clone
clauses may not reflect reality, so we dare not draw conclusions from
clones about whether Vars are guaranteed not-null.

To fix, we check whether the given RestrictInfo is a clone clause in
restriction_is_always_true and restriction_is_always_false, and avoid
performing any reduction if it is.

There are several ensuing plan changes in predicate.out, and we have
to modify the tests to ensure that they continue to test what they are
intended to.  Additionally, this fix causes the test case added in
f00ab1fd1 to no longer trigger the bug that commit fixed, so we also
remove that test case.

Back-patch to v17 where this bug crept in.

Reported-by: Ronald Cruz <cruz@rentec.com>
Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/f5320d3d-77af-4ce8-b9c3-4715ff33f213@rentec.com
Backpatch-through: 17
2025-03-04 16:11:03 +09:00
Fujii Masao
28f04984f0 ecpg: Add TAP test for the ecpg command.
This commit adds a TAP test to verify that the ecpg command correctly
detects unsupported or disallowed statements in input files and reports
the appropriate error or warning messages.

This test helps catch bugs like the one introduced in commit 3d009e45bd,
which broke ecpg's handling of unsupported COPY FROM STDIN statements,
later fixed by commit 94b914f601b.

Author: Ryo Kanbayashi <kanbayashi.dev@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CANOn0EzoMyxA1m-quDS1UeQUq6FNki6+GGiGucgr9tm2R78rKw@mail.gmail.com
2025-03-04 14:58:46 +09:00
Michael Paquier
c76db55c90 Split pgstat_bestart() into three different routines
pgstat_bestart(), used post-authentication to set up a backend entry
in the PgBackendStatus array, so that its data becomes visible in
pg_stat_activity and related catalogs, has its logic divided into three
routines with this commit, called in order at different steps of the
backend initialization:
* pgstat_bestart_initial() sets up the backend entry with a minimal
amount of information, reporting it with a new BackendState called
STATE_STARTING while waiting for backend initialization and client
authentication to complete.  The main benefit that this offers is
observability, so that it is possible to monitor the backend activity
during authentication.  This step happens earlier than in the logic
prior to this commit.  pgstat_beinit() happens earlier as well, before
authentication.
* pgstat_bestart_security() reports the SSL/GSS status of the
connection, once authentication completes.  Auxiliary processes, for
example, do not need to call this step, hence it is optional.  This
step is called after performing authentication, same as previously.
* pgstat_bestart_final() reports the user and database IDs, takes the
entry out of STATE_STARTING, and reports its application_name.  This is
called as the last step of the three, once authentication completes.

An injection point is added, with a test checking that the "starting"
phase of a backend entry is visible in pg_stat_activity.  Some follow-up
patches are planned to take advantage of this refactoring with more
information provided in backend entries during authentication (LDAP
hanging was a problem for the author, initially).
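
With this in place, backends still going through authentication should be
observable with something like the following (the "starting" state string
is an assumption based on the new STATE_STARTING):

  SELECT pid, state, backend_type
    FROM pg_stat_activity
   WHERE state = 'starting';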

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAOYmi+=60deN20WDyCoHCiecgivJxr=98s7s7-C8SkXwrCfHXg@mail.gmail.com
2025-03-04 14:09:44 +09:00
Michael Paquier
40d3f82744 Add more assertions in palloc0() and palloc_extended()
palloc() includes an assertion checking that an alloc() implementation
never returns NULL for all MemoryContextMethods.

This commit adds a similar assertion in palloc0().  In palloc_extended(),
a different assertion is added, checking that MCXT_ALLOC_NO_OOM is set
when an alloc() routine returns NULL.  These additions can be useful to
catch errors when implementing a new set of MemoryContextMethods
routines.

Author: Andreas Karlsson <andreas@proxel.se>
Discussion: https://postgr.es/m/507e8eba-2035-4a12-a777-98199a66beb8@proxel.se
2025-03-04 10:53:10 +09:00
Masahiko Sawada
ba57dcfdcd doc: Convert UUID functions list to table format.
Convert the list of UUID functions into a table for better
readability. This commit also adds references to the UUID type section
and includes descriptions of different UUID generation algorithm
versions.

Author: Andy Alsup <bluesbreaker@gmail.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/CADOZ7s7OHag+r6w+BzKw2xgb3fVtAD-pU=_N9-9pSe5W1TB+xQ@mail.gmail.com
2025-03-03 15:44:01 -08:00
Tom Lane
246dedc5d0 Allow => syntax for named cursor arguments in plpgsql.
We've traditionally accepted "name := value" syntax for
cursor arguments in plpgsql.  But it turns out that the
equivalent statements in Oracle use "name => value".
Since we accept both forms of punctuation for function
arguments, it makes sense to do the same here.
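
A minimal sketch of both notations (names are made up):

  DO $$
  DECLARE
    c CURSOR (p_id int) FOR SELECT p_id;
  BEGIN
    OPEN c (p_id := 42);   -- traditionally accepted syntax
    CLOSE c;
    OPEN c (p_id => 42);   -- now also accepted, as for function arguments
    CLOSE c;
  END $$;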

Author: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Gilles Darold <gilles@darold.net>
Discussion: https://postgr.es/m/CAFj8pRA3d0ARQEMbABa1n6q25AUdNmyO8aGs56XNf9pD4sRMjQ@mail.gmail.com
2025-03-03 18:00:13 -05:00
Thomas Munro
b6904afae4 ci: Use a RAM disk for NetBSD and OpenBSD.
Put the RAM disk setup for all three *BSD CI tasks into a common script,
replacing the old FreeBSD-specific one from commit 0265e5c1.  This makes
them run 3 times and a bit over 2 times faster, respectively.

NetBSD and FreeBSD now share the same one-liner to mount tmpfs.  OpenBSD
needs a GCP-image specific recipe that knows where to steal an unused
disk partition needed to reserve swap space for an mfs RAM disk, because
its tmpfs is deprecated and currently broken.  The configured size is
enough for our current tests but could potentially need future
expansion.  Thanks to Bilal for the disklabel incantation.

Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGJJ-XrPhN%2BQA4ZUfYAAXcwOSDty9t0vE9Z8__AdacKnQg%40mail.gmail.com
2025-03-04 11:29:21 +13:00
Melanie Plageman
06eae9e621 Trigger more frequent autovacuums with relallfrozen
Calculate the insert threshold for triggering an autovacuum of a
relation based on the number of unfrozen pages.

By only considering the unfrozen portion of the table when calculating
how many tuples to add to the insert threshold, we can trigger more
frequent vacuums of insert-heavy tables. This increases the chances of
vacuuming those pages when they still reside in shared buffers.

This also increases the number of autovacuums triggered by tuples
inserted and not by wraparound risk. We prefer to freeze these pages
during insert-triggered autovacuums, as anti-wraparound vacuums are not
automatically canceled by conflicting lock requests.

We calculate the unfrozen percentage of the table using the recently
added (99f8f3fbbc8f) relallfrozen column of pg_class.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_aj-P7YyBz_cPNwztz6ohP%2BvWis%3Diz3YcomkB3NpYA--w%40mail.gmail.com
2025-03-03 14:42:00 -05:00
Tom Lane
35c8dd9e11 Simplify some logic around setting pg_attribute.atthasdef.
DefineRelation was of the opinion that it could usefully pre-fill
atthasdef flags to eliminate work for StoreAttrDefault.  This is not
the case, however: the tupledesc that it's filling is not the one that
InsertPgAttributeTuples will work from.  The tupledesc used there is
made by RelationBuildLocalRelation, which deliberately doesn't copy
atthasdef.  Moreover, if this did happen as the code thinks, it would
be wrong for the case of plain "DEFAULT NULL" clauses, since we detect
and ignore simple-null-Const defaults later on.  Hence, remove the
useless code.

It also emerges that it's not really worth a special-case path in
StoreAttrDefault() for atthasdef already being set, because as far as
we can see that never happens: cases where an existing default gets
updated always do RemoveAttrDefault first, so as to clean up
possibly-no-longer-correct dependency entries.  If that were the case,
the code would still work anyway.

Also remove a nearby comment made moot by 5eaa0e92e.

Author: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxHFssPvkP1we7WMhPD_1kwgbG52o=kQgL+TnVoX5LOyCQ@mail.gmail.com
2025-03-03 13:35:48 -05:00
Tom Lane
4528768d98 Remove now-dead code in StoreAttrDefault().
StoreAttrDefault() is no longer responsible for filling
attmissingval, so remove the code for that.

Get rid of RawColumnDefault.missingMode, too, as we no longer
need that to pass information around.

While here, clean up some sloppy coding in StoreAttrDefault(),
such as failure to use XXXGetDatum macros.  These aren't bugs
but they're not good code either.

Reported-by: jian he <jian.universality@gmail.com>
Author: jian he <jian.universality@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxHFssPvkP1we7WMhPD_1kwgbG52o=kQgL+TnVoX5LOyCQ@mail.gmail.com
2025-03-03 13:09:20 -05:00
Tom Lane
95f650674d Fix broken handling of domains in atthasmissing logic.
If a domain type has a default, adding a column of that type (without
any explicit DEFAULT clause) failed to install the domain's default
value in existing rows, instead leaving the new column null.  This
is unexpected, and it used to work correctly before v11.  The cause
is confusion in the atthasmissing mechanism about which default value
to install: we'd only consider installing an explicitly-specified
default, and then we'd decide that no table rewrite is needed.

To fix, take the responsibility for filling attmissingval out of
StoreAttrDefault, and instead put it into ATExecAddColumn's existing
logic that derives the correct value to fill the new column with.
Also, centralize the logic that determines the need for
default-related table rewriting there, instead of spreading it over
four or five places.
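
A minimal illustration of the intended behavior (names are made up):

  CREATE DOMAIN d_int AS int DEFAULT 42;
  CREATE TABLE t (a int);
  INSERT INTO t VALUES (1);
  ALTER TABLE t ADD COLUMN b d_int;  -- no explicit DEFAULT clause
  SELECT * FROM t;                   -- the existing row now has b = 42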

In the back branches, we'll leave the attmissingval-filling code
in StoreAttrDefault even though it's now dead, for fear that some
extension may be depending on that functionality to exist there.
A separate HEAD-only patch will clean up the now-useless code.

Reported-by: jian he <jian.universality@gmail.com>
Author: jian he <jian.universality@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxHFssPvkP1we7WMhPD_1kwgbG52o=kQgL+TnVoX5LOyCQ@mail.gmail.com
Backpatch-through: 13
2025-03-03 12:43:44 -05:00
Melanie Plageman
99f8f3fbbc Add relallfrozen to pg_class
Add relallfrozen, an estimate of the number of pages marked all-frozen
in the visibility map.

pg_class already has relallvisible, an estimate of the number of pages
in the relation marked all-visible in the visibility map. This is used
primarily for planning.

relallfrozen, together with relallvisible, is useful for estimating the
outstanding number of all-visible but not all-frozen pages in the
relation for the purposes of scheduling manual VACUUMs and tuning vacuum
freeze parameters.
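
For example, an estimate of pages that are all-visible but not yet
all-frozen (illustrative query):

  SELECT relname, relallvisible - relallfrozen AS visible_not_frozen
    FROM pg_class
   WHERE relkind = 'r'
   ORDER BY visible_not_frozen DESC
   LIMIT 10;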

A future commit will use relallfrozen to trigger more frequent vacuums
on insert-focused workloads with significant volume of frozen data.

Bump catalog version

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_aj-P7YyBz_cPNwztz6ohP%2BvWis%3Diz3YcomkB3NpYA--w%40mail.gmail.com
2025-03-03 11:18:05 -05:00
Tomas Vondra
8492feb98f Allow parallel CREATE INDEX for GIN indexes
Allow using parallel workers to build a GIN index, similarly to BTREE
and BRIN. For large tables this may result in significant speedup when
the build is CPU-bound.

The work is divided so that each worker builds index entries on a subset
of the table, determined by the regular parallel scan used to read the
data. Each worker uses a local tuplesort to sort and merge the entries
for the same key. The TID lists do not overlap (for a given key), which
means the merge sort simply concatenates the two lists. The merged
entries are written into a shared tuplesort for the leader.

The leader needs to merge the sorted entries again, before writing them
into the index. But this way a significant part of the work happens in
the workers, and the leader is left with merging fewer large entries,
which is more efficient.

Most of the parallelism infrastructure is a simplified copy of the code
used by BTREE indexes, omitting the parts irrelevant for GIN indexes
(e.g. uniqueness checks).

Original patch by me, with reviews and substantial improvements by
Matthias van de Meent, certainly enough to make him a co-author.

Author: Tomas Vondra, Matthias van de Meent
Reviewed-by: Matthias van de Meent, Andy Fan, Kirill Reshke
Discussion: https://postgr.es/m/6ab4003f-a8b8-4d75-a67f-f25ad98582dc%40enterprisedb.com
2025-03-03 16:53:06 +01:00
Michael Paquier
3f1db99bfa Handle auxiliary processes in SQL functions of backend statistics
This commit impacts the following SQL functions, authorizing access
to the PGPROC entries of auxiliary processes when attempting to fetch or
reset backend-level pgstats entries:
- pg_stat_reset_backend_stats()
- pg_stat_get_backend_io()

This is relevant since a051e71e28a1, which changed the backend statistics
to authorize at least the WAL summarizer, WAL receiver and WAL writer
processes following the addition of WAL I/O statistics in pg_stat_io and
backend statistics.  Written this way, the code is more flexible with
future changes, adapting automatically to any updates done in
pgstat_tracks_backend_bktype().

While on it, pgstat_report_wal() gains a call to pgstat_flush_backend(),
making sure that backend I/O statistics are updated when calling this
routine.  This makes the statistics report correctly for the WAL writer.
WAL receiver and WAL summarizer do not call pgstat_report_wal() yet
(spoiler: both should).  It should be possible to lift some of the
existing restrictions for other auxiliary processes, as well, but this
is left as future work.

Reported-by: Rahila Syed <rahilasyed90@gmail.com>
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/CAH2L28v9BwN8_y0k6FQ591=0g2Hj_esHLGj3bP38c9nmVykoiA@mail.gmail.com
2025-03-03 09:57:48 +09:00
Fujii Masao
fe186bda78 postgres_fdw: Extend postgres_fdw_get_connections to return remote backend PID.
This commit adds a new "remote_backend_pid" output column to
the postgres_fdw_get_connections function. It returns the process ID of
the remote backend, on the foreign server, handling the connection.

This enhancement is useful for troubleshooting, monitoring, and reporting.
For example, if a connection is unexpectedly closed by the foreign server,
the remote backend's PID can help diagnose the cause.
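
For instance (output column names other than remote_backend_pid are assumed
for illustration):

  SELECT server_name, remote_backend_pid
    FROM postgres_fdw_get_connections();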

No extension version bump is needed, as commit c297a47c5f already
handled it for v18~.

Author: Sagar Dilip Shedge <sagar.shedge92@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAPhYifF25q5xUQWXETfKwhc0YVa_6+tfG9Kw4bCvCjpCWxYs2A@mail.gmail.com
2025-03-03 08:51:30 +09:00
Peter Eisentraut
15a79c7311 Use PRI*64 instead of "ll*" in format strings (minimal trial)
Old: errmsg("hello %llu", (unsigned long long) x)
New: errmsg("hello %" PRIu64, x)

And likewise for everything printf-like.

In the past we had to use long long so localized format strings remained
architecture independent in message catalogs.  Although long long is
expected to be 64 bit everywhere, if we hadn't also cast the int64
values, we'd have generated compiler warnings on systems where int64 was
long.

Now that int64 is int64_t, C99 understands how to format them using
<inttypes.h> macros, the casts are not necessary, and the gettext()
tools recognize the macros and defer expansion until load time.  (And if
we ever manage to get -Wformat-signedness to work for us, that'd help
with these too, but not the type-system-clobbering casts.)

This particular patch converts only pg_checksums.c to the new system,
to allow testing of the translation toolchain for everyone.  If this
works okay, a later patch will convert most of the rest.

Author: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/b936d2fb-590d-49c3-a615-92c3a88c6c19%40eisentraut.org
2025-03-02 13:53:03 +01:00
Tom Lane
00d61a08c5 Fix pg_strtof() to not crash on NULL endptr.
We had managed not to notice this simple oversight because none
of our calls exercised the case --- until commit 8f427187d.
That led to pg_dump crashing on any platform that uses this code
(currently Cygwin and Mingw).

Even though there's no immediate bug in the back branches, backpatch,
because a non-POSIX-compliant strtof() substitute is trouble waiting
to happen for extensions or future back-patches.

Diagnosed-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/339b3902-4e98-4e31-a744-94e43b7b9292@gmail.com
Backpatch-through: 13
2025-03-01 14:22:56 -05:00
Peter Eisentraut
56ba0463d3 Set amcancrosscompare to true for hash
This was missed in the refactoring in patch ce62f2f2a0a, which thus
created a regression.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E1tngY6-0000UL-2n%40gemulon.postgresql.org
2025-03-01 09:15:27 +01:00
Thomas Munro
c301a0a74a Work around OAuth/EVFILT_TIMER quirk on NetBSD.
NetBSD's EVFILT_TIMER doesn't like zero timeouts, as introduced by
commit b3f0be788.  Steal the workaround from the same problem on Linux
from a few lines up: round zero up to one.  Do this only for NetBSD, as
the other systems with the kevent() API accept zero and shouldn't have
to insert a small bogus wait.

Future improvement ideas:
 * when NetBSD < 10 falls out of support, we could try NODE_ABSTIME for
   the "fire now" meaning if timeout == 0
 * when libcurl tells us to start a 0ms timer and call it back, we could
   figure out how to handle that more directly without involving the
   kernel (the current architecture doesn't make that straightforward)

Failures with EINVAL errors could be seen on the new optional NetBSD CI
task that we're trying to keep green as a candidate for inclusion as
a default-enabled CI task.  The NetBSD build farm animals aren't testing
OAuth yet, so no breakage there.

Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/CA%2BhUKGJ%2BWyJ26QGvO_nkgvbxgw%2B03U4EQ4Hxw%2BQBft6Np%2BXW7w%40mail.gmail.com
2025-03-01 14:41:02 +13:00
Masahiko Sawada
8a1012b35d Re-export NextCopyFromRawFields() to copy.h.
Commit 7717f630069 removed NextCopyFromRawFields() from copy.h. While
it was hoped that NextCopyFrom() could serve as an alternative,
certain use cases still require NextCopyFromRawFields(). For instance,
extensions like file_text_array_fdw, which process source data with an
unknown number of columns, rely on this function.

Per buildfarm member crake.

Reported-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Sutou Kouhei <kou@clear-code.com>
Discussion: https://postgr.es/m/5c7e1ac8-5083-4c08-af19-cb9ade2f16ce@dunslane.net
2025-02-28 15:11:41 -08:00
Nathan Bossart
e636da9200 Adjust auto_explain's GUC descriptions.
This commit adjusts auto_explain's GUC descriptions to follow the
style guidelines established by commit 977d865c36.  Specifically,
it ensures the accepted special values are listed in a consistent
manner.

Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/e82d4647-ce7f-45c7-9b01-fb900a050767%40tantorlabs.com
2025-02-28 16:05:51 -06:00
Tom Lane
8b49392b27 Tweak regex to avoid a bug in Perl 5.16.3.
For some reason, 5.16.3 (and perhaps slightly earlier/later versions)
goes into an infinite loop with the version-replacement regex installed
by commit fc0d0ce97.  We can work around that by using an explicit
"\n" instead of the line-start metacharacter "^".

Reported-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0u9dV3CdKqkqdusA_RdvBkwWe0c0rxcFWj++VYoutFYSw@mail.gmail.com
2025-02-28 15:20:24 -05:00
Masahiko Sawada
7717f63006 Refactor COPY FROM to use format callback functions.
This commit introduces a new CopyFromRoutine struct, which is a set of
callback routines to read tuples in a specific format. It also makes
COPY FROM with the existing formats (text, CSV, and binary) utilize
these format callbacks.

This change is a preliminary step towards making the COPY FROM command
extensible in terms of input formats.

Similar to 2e4127b6d2d, this refactoring contributes to a performance
improvement by reducing the number of "if" branches that need to be
checked on a per-row basis when sending field representations in text
or CSV mode. The performance benchmark results showed ~5% performance
gain in text or CSV mode.

Author: Sutou Kouhei <kou@clear-code.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/20231204.153548.2126325458835528809.kou@clear-code.com
2025-02-28 10:29:36 -08:00
Robert Haas
77cb08be51 Avoid including explain.h in explain_format.h and explain_dr.h
As per a suggestion from Tom Lane, we do this by declaring "struct
ExplainState" here and refer to that rather than "ExplainState".

Also per Tom, CreateExplainSerializeDestReceiver was still defined
in explain.h in addition to explain_dr.h. Remove leftover prototype.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: http://postgr.es/m/CA+TgmoYtaad3i21V0jqua-fbr+CR0ix6uBvEX8_s6BG96abd=g@mail.gmail.com
2025-02-28 13:17:29 -05:00
Robert Haas
51d3e279c3 Fix missing space in EXPLAIN ANALYZE output.
Commit ddb17e387aa28d61521227377b00f997756b8a27 introduced this
regression. Ideally, the regression tests would have caught this
mistake, but apparently they don't test with timing enabled,
presumably because that would make the output vary.

Author: Thom Brown <thom@linux.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
Discussion: http://postgr.es/m/CAA-aLv6nq=UeiyvM7_Mxgo9TVBzs2oh46b9vfyLzuyVEz3j1-g@mail.gmail.com
2025-02-28 13:04:12 -05:00
Jeff Davis
424ededc58 Adjust pg_dump tag for relation stats.
Do not use fmtId(), just use dobj->name directly, like for table data.
2025-02-27 20:42:12 -08:00
Michael Paquier
c2a50ac678 Invent pgstat_fetch_stat_backend_by_pid()
This code is extracted from pg_stat_get_backend_io() in pgstatfuncs.c,
so that it can be shared with other areas that need backend pgstats
entries while having the benefits of the various sanity checks
refactored here.  As per its name, this retrieves backend statistics
based on a PID, with the option of retrieving a BackendType if given in
input.

Currently, this is used for the backend-level IO statistics.  The next
move would be to reuse that for the backend-level WAL statistics.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z3zqc4o09dM/Ezyz@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-28 11:20:31 +09:00
Michael Paquier
2a083ab807 pg_upgrade: Fix inconsistency in memory freeing
The function in charge of freeing the memory from a result created by
PQescapeIdentifier() has to be PQfreemem(), to ensure that both
allocation and free come from libpq.

One spot in pg_upgrade was not respecting that for pg_database's
datlocale (daticulocale in v16) when the collation provider is libc (aka
datlocale/daticulocale is NULL) with an allocation done using
pg_strdup() and a free with PQfreemem().  The code is changed to always
use PQescapeLiteral() when processing the input.

Oversight in 9637badd9f92.  This commit is similar to 48e4ae9a0707 and
5b94e2753439.

Author: Michael Paquier <michael@paquier.xyz>
Co-authored-by: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://postgr.es/m/Z601RQxTmIUohdkV@paquier.xyz
Backpatch-through: 16
2025-02-28 10:15:29 +09:00
Masahiko Sawada
2e4127b6d2 Refactor COPY TO to use format callback functions.
This commit introduces a new CopyToRoutine struct, which is a set of
callback routines to copy tuples in a specific format. It also makes
the existing formats (text, CSV, and binary) utilize these format
callbacks.

This change is a preliminary step towards making the COPY TO command
extensible in terms of output formats.

Additionally, this refactoring contributes to a performance
improvement by reducing the number of "if" branches that need to be
checked on a per-row basis when sending field representations in text
or CSV mode. The performance benchmark results showed ~5% performance
gain in text or CSV mode.

Author: Sutou Kouhei <kou@clear-code.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/20231204.153548.2126325458835528809.kou@clear-code.com
2025-02-27 15:03:52 -08:00
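
A rough sketch of the dispatch pattern (struct and field names are made up
for illustration; the real CopyToRoutine members differ): the callback table
is chosen once per COPY, so the per-row path goes through function pointers
rather than re-testing the format.

    #include <stdio.h>

    /* Simplified stand-in for a per-format callback table. */
    typedef struct DemoCopyToRoutine
    {
        void (*start)(void);
        void (*one_row)(const char *field);
        void (*end)(void);
    } DemoCopyToRoutine;

    static void text_start(void) { printf("-- text format --\n"); }
    static void text_one_row(const char *field) { printf("%s\n", field); }
    static void text_end(void) { }

    static const DemoCopyToRoutine demo_text_routine = {
        text_start, text_one_row, text_end
    };

    int main(void)
    {
        const DemoCopyToRoutine *routine = &demo_text_routine;  /* picked once */
        const char *rows[] = {"a", "b", "c"};

        routine->start();
        for (int i = 0; i < 3; i++)
            routine->one_row(rows[i]);   /* no per-row "if (csv) ... else ..." */
        routine->end();
        return 0;
    }
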
Robert Haas
555960a0fb Create explain_dr.c and move DestReceiver-related code there.
explain.c has grown rather large, and the code that deals with the
DestReceiver that supports the SERIALIZE option is pretty easily severable
from the rest of explain.c; hence, move it to a separate file.

Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Discussion: http://postgr.es/m/CA+TgmoYutMw1Jgo8BWUmB3TqnOhsEAJiYO=rOQufF4gPLWmkLQ@mail.gmail.com
2025-02-27 13:14:16 -05:00
Robert Haas
9173e8b604 Create explain_format.c and move relevant code there.
explain.c has grown rather large, so move various functions that
are principally concerned with output generation to a new source
file, explain_format.c, instead of lumping them in with everything
else that is part of explain.c

Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Discussion: http://postgr.es/m/CA+TgmoYutMw1Jgo8BWUmB3TqnOhsEAJiYO=rOQufF4gPLWmkLQ@mail.gmail.com
2025-02-27 12:37:10 -05:00
Robert Haas
95dbd827f2 EXPLAIN: Always use two fractional digits for row counts.
Commit ddb17e387aa28d61521227377b00f997756b8a27 attempted to avoid
confusing users by displaying digits after the decimal point only when
nloops > 1, since it's impossible to have a fractional row count after a
single iteration. However, this made the regression tests unstable since
parallel queries will have nloops>1 for all nodes below the Gather or
Gather Merge in normal cases, but if the workers don't start in time and
the leader finishes all the work, they will suddenly have nloops==1,
making it unpredictable whether the digits after the decimal point would
be displayed or not. Although 44cbba9a7f51a3888d5087fc94b23614ba2b81f2
seemed to fix the immediate failures, it may still be the case that there
are lower-probability failures elsewhere in the regression tests.

Various fixes are possible here. For example, it has previously been
proposed that we should try to display the digits after the decimal
point only if rows/nloops is an integer, but currently rows is stored
as a float so it's not theoretically an exact quantity -- precision
could be lost in extreme cases. It has also been proposed that we
should try to display the digits after the decimal point only if we're
under some sort of construct that could potentially cause looping,
regardless of whether it actually does. While such ideas are not
without merit, this patch adopts the much simpler solution of always
displaying two decimal digits. If that approach stands up to scrutiny
from the buildfarm and human users, it spares us the trouble of doing
anything more complex; if not, we can reassess.

This commit incidentally reverts 44cbba9a7f51a3888d5087fc94b23614ba2b81f2,
which should no longer be needed.

Author: Robert Haas <robertmhaas@gmail.com>
Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Discussion: http://postgr.es/m/CA+TgmoazzVHn8sFOMFAEwoqBTDxKT45D7mvkyeHgqtoD2cn58Q@mail.gmail.com
2025-02-27 11:27:16 -05:00
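
Illustrative only (plain C, not the EXPLAIN code itself): the difference
between rounding to a whole number and always printing two fractional
digits for rows divided by nloops.

    #include <stdio.h>

    int main(void)
    {
        double ntuples = 7.0;
        double nloops = 3.0;
        double rows = ntuples / nloops;

        printf("rows=%.0f\n", rows);    /* whole number: "rows=2", lossy */
        printf("rows=%.2f\n", rows);    /* two digits:   "rows=2.33" */
        return 0;
    }
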
Peter Eisentraut
ce62f2f2a0 Generalize hash and ordering support in amapi
Stop comparing access method OID values against HASH_AM_OID and
BTREE_AM_OID, and instead check the IndexAmRoutine for an index to see
if it advertises its ability to perform the necessary ordering,
hashing, or cross-type comparing functionality.  A field amcanorder
already existed; this patch uses it more widely.  Fields amcanhash and
amcancrosscompare are added for the other purposes.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-02-27 17:03:31 +01:00
Tom Lane
6eb8a1a4f9 Avoid unnecessary computation of pgbench's script line number.
ParseScript only needs the lineno for meta-commands, so let's not
bother computing it otherwise.  While this doesn't save much given
the previous patch, there's no point in doing unnecessary work.
While we're at it, avoid calling psql_scan_get_location() twice for
a meta-command.

One reason for making this change is that the line number computed
in ParseScript's main loop was actually wrong in most cases: it
would point just past the semicolon of the previous SQL command,
not at what the user thinks the current command's line number is.
We could add some code to skip whitespace before capturing the line
number, but it would be pretty pointless at present.  Just move the
call to avoid the temptation to rely on that value.  (Once we've
lexed the backslash, the computed line number will be right.)

This change also means that pgbench never inquires about the
location before it's lexed something, so that the care taken in
the previous patch to behave sanely in that case is unnecessary.
It seems best to keep that logic, though, as future callers
might depend on it.

Author: Daniel Vérité <daniel@manitou-mail.org>
Discussion: https://postgr.es/m/84a8a89e-adb8-47a9-9d34-c13f7150ee45@manitou-mail.org
2025-02-27 10:57:55 -05:00
Tom Lane
c8c74ad7e1 Get rid of O(N^2) script-parsing overhead in pgbench.
pgbench wants to record the starting line number of each command
in its scripts.  It was computing that by scanning from the script
start and counting newlines, so that O(N^2) work had to be done
for an N-command script.  In a script with 50K lines, this adds
up to about 10 seconds on my machine.

To add insult to injury, the results were subtly wrong, because
expr_scanner_offset() scanned to find the NUL that flex inserts
at the end of the current token --- and before the first yylex
call, no such NUL has been inserted.  So we ended up computing the
script's last line number, not its first one.  This was visible only
in case of \gset at the start of a script, which perhaps accounts
for the lack of complaints.

To fix, steal an idea from plpgsql and track the current lexer
ending position and line count as we advance through the script.
(It's a bit simpler than plpgsql since we never need to back up.)
Also adjust a couple of other places that were invoking scans
from script start when they didn't really need to.  I made a new
psqlscan function psql_scan_get_location() that replaces both
expr_scanner_offset() and expr_scanner_get_lineno(), since in
practice expr_scanner_get_lineno() was only being invoked to find
the line number of the current lexer end position.

Reported-by: Daniel Vérité <daniel@manitou-mail.org>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/84a8a89e-adb8-47a9-9d34-c13f7150ee45@manitou-mail.org
2025-02-27 10:53:38 -05:00
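
A minimal sketch of the incremental idea (names hypothetical, not the
psqlscan API): keep the last offset and line count, and only scan the newly
consumed bytes when asked for the current location, so the total work is
linear in the script length.

    #include <stdio.h>
    #include <string.h>

    typedef struct ScanPos
    {
        size_t offset;      /* bytes consumed so far */
        int    lineno;      /* 1-based line number at 'offset' */
    } ScanPos;

    static void advance(ScanPos *pos, const char *script, size_t new_offset)
    {
        for (size_t i = pos->offset; i < new_offset; i++)
            if (script[i] == '\n')
                pos->lineno++;
        pos->offset = new_offset;
    }

    int main(void)
    {
        const char *script = "SELECT 1;\nSELECT 2;\n\\set x 10\n";
        ScanPos pos = {0, 1};

        advance(&pos, script, 10);              /* past the first command */
        printf("line %d\n", pos.lineno);        /* prints "line 2" */
        advance(&pos, script, strlen(script));
        printf("line %d\n", pos.lineno);        /* prints "line 4" */
        return 0;
    }
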
Alexander Korotkov
e167191dc1 Get rid of ojrelid local variable in remove_rel_from_query()
As spotted by Coverity, the calculation of ojrelid mixes signed and unsigned
types, causing possible overflow and undefined behavior.  Instead of trying to
fix the expression, this commit eliminates the ojrelid local variable.
Explicit branching is used to replace the -1 sentinel value.  That, in turn,
requires changing the signature of the remove_rel_from_eclass() function.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/914330.1740330169%40sss.pgh.pa.us
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
2025-02-27 11:22:01 +02:00
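
As a standalone illustration of the signed/unsigned hazard (plain C, not the
planner code): a signed -1 sentinel compared against an unsigned value is
silently converted to a huge unsigned number.

    #include <stdio.h>

    int main(void)
    {
        int          missing = -1;      /* sentinel meaning "no relid" */
        unsigned int relid = 5;

        /* The signed value is converted to unsigned before the comparison
         * (most compilers warn here). */
        if (relid > missing)
            printf("relid is larger\n");
        else
            printf("relid is NOT larger: -1 became %u\n",
                   (unsigned int) missing);
        return 0;
    }
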
Thomas Munro
55918f798b Remove arbitrary cap on read_stream.c buffer queue.
Previously the internal queue of buffers was capped at max_ios * 4,
though not less than io_combine_limit, at allocation time.  That was
done in the first version based on conservative theories about resource
usage and heuristics pending later work.  The configured I/O depth could
not always be reached with dense random streams generated by ANALYZE,
VACUUM, the proposed Bitmap Heap Scan patch, and also sequential streams
with the proposed AIO subsystem to name some examples.

The new formula is (max_ios + 1) * io_combine_limit, enough buffers for
the full configured I/O concurrency level using the full configured I/O
combine size, plus the buffers from one finished but not yet consumed
full-sized I/O.  Significantly more memory would be needed for high GUC
values if the client code requests a large per-buffer data size, but
that is discouraged (existing and proposed stream users try to keep it
under a few words, if not zero).

With this new formula, an intermediate variable could have overflowed
under maximum GUC values, so its data type is adjusted to cope.

Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
2025-02-27 20:49:48 +13:00
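
The sizing arithmetic, sketched with hypothetical variable names; the
intermediate product is computed in a 64-bit type so large GUC settings
cannot overflow int.

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        int max_ios = 512;              /* illustrative GUC-derived values */
        int io_combine_limit = 32;      /* in buffers */

        /* Enough for the full I/O concurrency level at full combine size,
         * plus one finished-but-unconsumed full-sized I/O.  Use a wide type
         * for the intermediate result. */
        uint64_t queue_size =
            ((uint64_t) max_ios + 1) * (uint64_t) io_combine_limit;

        printf("queue size: %llu buffers\n",
               (unsigned long long) queue_size);
        return 0;
    }
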
Michael Paquier
48e4ae9a07 pg_amcheck: Fix inconsistency in memory freeing
The function in charge of freeing the memory from a result created by
PQescapeIdentifier() has to be PQfreemem(), to ensure that both
allocation and free come from libpq, but one spot in pg_amcheck was
missing that.

Oversight in b859d94c6389.

Author: Ranier Vilela <ranier.vf@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CAEudQArD_nKSnYCNUZiPPsJ2tNXgRmLbXGSOrH1vpOF_XtP0Vg@mail.gmail.com
Discussion: https://postgr.es/m/CAEudQArbTWVSbxq608GRmXJjnNSQ0B6R7CSffNnj2hPWMUsRNg@mail.gmail.com
Backpatch-through: 14
2025-02-27 14:05:51 +09:00
Amit Kapila
8709dccc79 Fix the race condition in ReplicationSlotAcquire().
After commit f41d8468dd, a process could acquire and use a replication
slot that had just been invalidated, leading to failures while accessing
WAL.

To ensure that we don't accidentally start using invalid slots, we must
perform the invalidation check after acquiring the slot or under the
spinlock where we associate the slot with a particular process. We choose
the earlier method to keep the code simple.

Reported-by: Hou Zhijie <houzj.fnst@fujitsu.com>
Author: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CABdArM7J-LbGoMPGUPiFiLOyB_TZ5+YaZb=HMES0mQqzVTn8Gg@mail.gmail.com
2025-02-27 09:47:04 +05:30
Amit Kapila
845511a72a Doc: Additional clarification for -d option of pg_createsubscriber.
Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CALDaNm0zsFUYpe-tLha+-sp3K8KmBXu0o=LUN=8FFtxMLYikPA@mail.gmail.com
2025-02-27 08:50:03 +05:30
Michael Paquier
495864a4cf Refactor code of pg_stat_get_wal() building result tuple
This commit adds to pgstatfuncs.c a new routine called
pg_stat_wal_build_tuple(), a helper routine for pg_stat_get_wal().  This
is in charge of filling one tuple based on the contents of
PgStat_WalStats retrieved from pgstats.

This refactoring will be used by an upcoming patch introducing
backend-level WAL statistics, simplifying the main patch.  Note that
it is not possible for stats_reset to be NULL in pg_stat_wal; the upcoming
backend-level statistics, however, will need to be able to handle that case.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z3zqc4o09dM/Ezyz@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-27 11:54:36 +09:00
Michael Paquier
62ec3e1f67 Fix possible double-release of spinlock in procsignal.c
9d9b9d46f3c5 has added spinlocks to protect the fields in ProcSignal
flags, introducing a code path in ProcSignalInit() where a spinlock
could be released twice if the pss_pid field of a ProcSignalSlot is
found as already set.  Multiple spinlock releases have no effect with
most spinlock implementations, but this could cause the code to run into
issues when the spinlock is acquired concurrently by a different
process.

This sanity check on pss_pid generates a LOG that can be delayed until
after the spinlock is released since, as in older versions up to v17, the
code expects the initialization of the ProcSignalSlot to happen even if
pss_pid is found incorrect.  The code is changed so that the old pss_pid
is read while holding the slot's spinlock, with the LOG from the sanity
check generated after releasing the spinlock, preventing the double
release.

Author: Maksim Melnikov <m.melnikov@postgrespro.ru>
Co-authored-by: Maxim Orlov <orlovmg@gmail.com>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/dca47527-2d8b-4e3b-b5a0-e2deb73371a4@postgrespro.ru
2025-02-27 09:43:06 +09:00
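
A simplified sketch of the corrected ordering (a pthread mutex stands in for
the spinlock; the struct and field names are illustrative): the stale value
is captured while the lock is held, and the message is emitted only after
the single release.

    #include <pthread.h>
    #include <stdio.h>

    typedef struct DemoSlot
    {
        pthread_mutex_t lock;
        int             pid;
    } DemoSlot;

    static void demo_slot_init(DemoSlot *slot, int my_pid)
    {
        int old_pid;

        pthread_mutex_lock(&slot->lock);
        old_pid = slot->pid;        /* capture while holding the lock */
        slot->pid = my_pid;         /* initialize the slot regardless */
        pthread_mutex_unlock(&slot->lock);

        /* Sanity-check message only after the single release. */
        if (old_pid != 0)
            fprintf(stderr, "slot was already claimed by pid %d\n", old_pid);
    }

    int main(void)
    {
        DemoSlot slot = { PTHREAD_MUTEX_INITIALIZER, 1234 };

        demo_slot_init(&slot, 5678);
        return 0;
    }
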
Jeff Davis
15df9d7b51 Remove stray diff introduced by a5cbdeb98a.
Reported-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/Z77IkjmmfbFfNh3f@paquier.xyz
2025-02-26 13:37:14 -08:00
Tom Lane
40e27d04b4 Use attnum to identify index columns in pg_restore_attribute_stats().
Previously we used attname for both table and index columns, but
that is problematic for indexes because their attnames are assigned
by internal rules that don't guarantee to preserve the names across
dump and reload.  (This is what's causing the remaining buildfarm
failures in cross-version-upgrade tests.)  Fortunately we can use
attnum instead, since there's no such thing as adding or dropping
columns in an existing index.  We met this same problem previously
with ALTER INDEX ... SET STATISTICS, and solved it the same way,
cf commit 5b6d13eec.

In pg_restore_attribute_stats() itself, we accept either attnum or
attname, but the policy used by pg_dump is to always use attname
for tables and attnum for indexes.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/1457469.1740419458@sss.pgh.pa.us
2025-02-26 16:36:20 -05:00
Peter Eisentraut
f734c9fc3a Revert "Prepare for Python "Limited API" in PL/Python"
This reverts commit c47e8df815c1c45f4e4fc90d5817d67ab088279f.

That commit makes the plpython tests crash with Python 3.6.* and
3.7.*.  It will need further investigation and testing, so revert for
now.
2025-02-26 21:58:38 +01:00
Masahiko Sawada
945a9e3832 Fix a typo in 005_char_signedness.pl test.
The test in 005_char_signedness.pl was missing a dash in the
--set-char-signedness option. Although the test didn't fail since it
doesn't check the error message, it resulted in an unexpected error
message instead of the intended one.

Oversight in 1aab680591.

Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://postgr.es/m/87tt8h1vb7.fsf@wibble.ilmari.org
2025-02-26 11:10:03 -08:00
Peter Eisentraut
c47e8df815 Prepare for Python "Limited API" in PL/Python
Using the Python Limited API would allow building PL/Python against
any Python 3.x version and using another Python 3.x version at run
time.  This commit does not activate that, but it prepares the code to
only use APIs supported by the Limited API.

Implementation details:

- Convert static types to heap types
  (https://docs.python.org/3/howto/isolating-extensions.html#heap-types).

- Replace PyRun_String() with component functions.

- Replace PyList_SET_ITEM() with PyList_SetItem().

Reviewed-by: Jakob Egger <jakob@eggerapps.at>
Discussion: https://www.postgresql.org/message-id/flat/ee410de1-1e0b-4770-b125-eeefd4726a24@eisentraut.org
2025-02-26 16:14:39 +01:00
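
For the third point, a tiny sketch of the function-call form (Python C API;
illustrative helper, not PL/Python code): PyList_SetItem() steals the
reference to the item and reports errors through its return value, unlike
the PyList_SET_ITEM() macro, which pokes into the list object directly.

    #include <Python.h>

    /* Build the list [0, 1, 2] using only the function-call API. */
    static PyObject *demo_make_list(void)
    {
        PyObject *list = PyList_New(3);

        if (list == NULL)
            return NULL;
        for (Py_ssize_t i = 0; i < 3; i++)
        {
            PyObject *item = PyLong_FromSsize_t(i);

            /* PyList_SetItem steals the reference to item on success. */
            if (item == NULL || PyList_SetItem(list, i, item) < 0)
            {
                Py_DECREF(list);
                return NULL;
            }
        }
        return list;
    }
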
Michael Paquier
0e42d31b0b Adding new PgStat_WalCounters structure in pgstat.h
This new structure contains the counters and the data related to the WAL
activity statistics gathered from WalUsage, separated into its own
structure so that it can be shared across more than one Stats structure in
pgstat.h.

This refactoring will be used by an upcoming patch introducing
backend-level WAL statistics.

Bump PGSTAT_FILE_FORMAT_ID.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z3zqc4o09dM/Ezyz@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-26 16:48:54 +09:00
Michael Paquier
d7cbeaf261 Remove pgstat_flush_wal()
All the processes that generate WAL should call pgstat_report_wal() to
report all their statistics related to WAL, and this is already what
happens in the tree.  Keeping pgstat_report_wal() is confusing while the
other routine is encouraged.

This routine is not required since fc415edf8ca8, where it was lastly
used in pgstat_report_stat() before an equivalent callback existed.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z71oPkJJICrRB5Ws@paquier.xyz
2025-02-26 15:37:28 +09:00
Amit Kapila
e117cfb2f6 Add two-phase option in pg_createsubscriber.
This patch introduces the '--enable-two-phase' option to the
'pg_createsubscriber' utility, allowing users to enable two-phase commit
for all subscriptions during their creation.

Note that even without this option users can enable the two_phase option
for the subscriptions created by pg_createsubscriber. However, it requires
the subscription to be disabled first, which could be inconvenient for
users.

When two-phase commit is enabled, prepared transactions are sent to the
subscriber at the time of 'PREPARE TRANSACTION', and they are processed as
two-phase transactions on the subscriber as well. If disabled, prepared
transactions are sent only when committed and are processed immediately by
the subscriber.

Author: Shubham Khanna <khannashubham1197@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Ajin Cherian <itsajin@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAHv8RjLPdFP=kA5LNSmWZ=+GMXmO+LczvV6p9HJjsXxZz10KGA@mail.gmail.com
2025-02-26 11:12:50 +05:30
Michael Paquier
adc6032fa8 Improve FATAL message for invalid TLI history at recovery
The original message did not mention where the checkpoint record LSN was
found: the control file or a backup_label file.  A couple of LOG messages
are generated before this FATAL check is reached, providing more details
about the way recovery is set up.  However, having this information in
this specific message is useful for debugging.  This is also useful for
instances where log_min_messages is set to FATAL or higher, where LOG
messages do not show up.

Author: Benoit Lobréau <benoit.lobreau@dalibo.com>
Reviewed-by: David Steele <david@pgbackrest.org>
Discussion: https://postgr.es/m/4ed10bc8-5513-4d8e-8643-8abcaa08336d@dalibo.com
2025-02-26 14:26:16 +09:00
Jeff Davis
6ee3b91bad pg_dump: prepare attribute stats query.
Follow precedent in pg_dump for preparing queries to improve
performance. Also, simplify the query by removing unnecessary joins.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Andres Freund <andres@anarazel.de>
Co-authored-by: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CADkLM=dRMC6t8gp9GVf6y6E_r5EChQjMAAh_vPyih_zMiq0zvA@mail.gmail.com
2025-02-25 19:52:11 -08:00
Jeff Davis
8f427187db Avoid unnecessary relation stats query in pg_dump.
The few fields we need can be easily collected in getTables() and
getIndexes() and stored in RelStatsInfo.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Andres Freund <andres@anarazel.de>
Co-authored-by: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CADkLM=f0a43aTd88xW4xCFayEF25g-7hTrHX_WhV40HyocsUGg@mail.gmail.com
2025-02-25 19:51:45 -08:00
Michael Paquier
6c349d83b6 Re-add GUC track_wal_io_timing
This commit is a rework of 2421e9a51d20, about which Andres Freund has
raised some concerns: it is valuable to have both track_io_timing and
track_wal_io_timing in some cases, since the WAL write and fsync paths can
be a major bottleneck for some workloads.  Hence, it can be relevant to
not calculate the WAL timings in environments where pg_test_timing
performs poorly while capturing some IO data under track_io_timing for
the non-WAL IO paths.  The opposite can also be true: it should be
possible to disable the non-WAL timings and enable the WAL timings (the
previous GUC setups allowed this possibility).

track_wal_io_timing is added back in this commit, controlling whether WAL
timings should be calculated in pg_stat_io for the read, fsync and write
paths, as was done previously with pg_stat_wal.  pg_stat_wal previously
tracked only the sync and write parts (now removed); read stats are new
data tracked in pg_stat_io, and all three are aggregated if
track_wal_io_timing is enabled.  The read part matters during recovery
or if an XLogReader is used.

Extra note: finer control over which types of timings are calculated in
pg_stat_io could be provided by a GUC that lists pairs of
(IOObject, IOOp).

Reported-by: Andres Freund <andres@anarazel.de>
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/3opf2wh2oljco6ldyqf7ukabw3jijnnhno6fjb4mlu6civ5h24@fcwmhsgmlmzu
2025-02-26 09:49:59 +09:00
Jeff Davis
a5cbdeb98a Remove redundant pg_set_*_stats() variants.
After commit f3dae2ae58, the primary purpose of separating the
pg_set_*_stats() from the pg_restore_*_stats() variants was
eliminated.

Leave pg_restore_relation_stats() and pg_restore_attribute_stats(),
which satisfy both purposes, and remove pg_set_relation_stats() and
pg_set_attribute_stats().

Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/1457469.1740419458@sss.pgh.pa.us
2025-02-25 16:15:47 -08:00
Andres Freund
ecbff4378b Change _mdfd_segpath() to return paths by value
This basically mirrors the changes done in the predecessor commit. While there
isn't currently a need to get these paths in critical sections, it seems a
shame to unnecessarily allocate memory in these paths now that relpath()
doesn't allocate anymore.

Discussion: https://postgr.es/m/xeri5mla4b5syjd5a25nok5iez2kr3bm26j2qn4u7okzof2bmf@kwdh2vf7npra
2025-02-25 09:02:07 -05:00
Andres Freund
37c87e63f9 Change relpath() et al to return path by value
For AIO, and also some other recent patches, we need the ability to call
relpath() in a critical section. Until now that was not feasible, as it
allocated memory.

The fact that relpath() allocated memory also made it awkward to use in log
messages because we had to take care to free the memory afterwards.  Which we
e.g. didn't do when zeroing out an invalid buffer.

We discussed other solutions, e.g. filling a pre-allocated buffer that's
passed to relpath(), but they all came with plenty of downsides or were larger
projects. The easiest fix seems to be to make relpath() return the path by
value.

To be able to return the path by value we need to determine the maximum length
of a relation path. This patch adds a long #define that computes the exact
maximum, which is verified to be correct in a regression test.

As this changes the signature of relpath(), extensions using it will need to
adapt their code. We discussed leaving a backward-compat shim in place, but
decided it's not worth it given that the use of relpath() doesn't seem
widespread.

Discussion: https://postgr.es/m/xeri5mla4b5syjd5a25nok5iez2kr3bm26j2qn4u7okzof2bmf@kwdh2vf7npra
2025-02-25 09:02:07 -05:00
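
A minimal sketch of the by-value pattern (the type, length bound and path
layout are illustrative, not the real relpath() output): because a
compile-time maximum length is known, the path can live in a fixed-size
array inside a struct returned by value, so nothing has to be allocated
even in a critical section.

    #include <stdio.h>

    #define DEMO_RELPATH_MAX 64         /* illustrative bound, not the real one */

    typedef struct DemoRelPath
    {
        char str[DEMO_RELPATH_MAX + 1];
    } DemoRelPath;

    static DemoRelPath demo_relpath(unsigned tblspc, unsigned db, unsigned rel)
    {
        DemoRelPath path;

        snprintf(path.str, sizeof(path.str), "base/%u/%u", db, rel);
        (void) tblspc;                  /* tablespace handling elided */
        return path;                    /* copied out by value, no allocation */
    }

    int main(void)
    {
        DemoRelPath p = demo_relpath(0, 16384, 2619);

        printf("%s\n", p.str);
        return 0;
    }
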
Peter Eisentraut
32c393f9f1 Remove obsolete Python version check
The checked version is already the current minimum supported version
(3.2).

Discussion: https://www.postgresql.org/message-id/flat/ee410de1-1e0b-4770-b125-eeefd4726a24@eisentraut.org
2025-02-25 14:11:38 +01:00
Richard Guo
363a6e8c6f Eliminate code duplication in replace_rte_variables callbacks
The callback functions ReplaceVarsFromTargetList_callback and
pullup_replace_vars_callback are both used to replace Vars in an
expression tree that reference a particular RTE with items from a
targetlist, and they both need to expand whole-tuple references and
deal with OLD/NEW RETURNING list Vars.  As a result, currently there
is significant code duplication between these two functions.

This patch introduces a new function, ReplaceVarFromTargetList, to
perform the replacement and calls it from both callback functions,
thereby eliminating code duplication.

Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CAEZATCWhr=FM4X5kCPvVs-g2XEk+ceLsNtBK_zZMkqFn9vUjsw@mail.gmail.com
2025-02-25 16:11:34 +09:00
Richard Guo
1e4351af32 Expand virtual generated columns in the planner
Commit 83ea6c540 added support for virtual generated columns that are
computed on read.  All Var nodes in the query that reference virtual
generated columns must be replaced with the corresponding generation
expressions.  Currently, this replacement occurs in the rewriter.
However, this approach has several issues.  If a Var referencing a
virtual generated column has any varnullingrels, those varnullingrels
need to be propagated into the generation expression.  Failing to do
so can lead to "wrong varnullingrels" errors and improper outer-join
removal.

Additionally, if such a Var comes from the nullable side of an outer
join, we may need to wrap the generation expression in a
PlaceHolderVar to ensure that it is evaluated at the right place and
hence is forced to null when the outer join should do so.  In certain
cases, such as when the query uses grouping sets, we also need a
PlaceHolderVar for anything that is not a simple Var to isolate
subexpressions.  Failure to do so can result in incorrect results.

To fix these issues, this patch expands the virtual generated columns
in the planner rather than in the rewriter, and leverages the
pullup_replace_vars architecture to avoid code duplication.  The
generation expressions will be correctly marked with nullingrel bits
and wrapped in PlaceHolderVars when needed by the pullup_replace_vars
callback function.  This requires handling the OLD/NEW RETURNING list
Vars in pullup_replace_vars_callback, as it may now deal with Vars
referencing the result relation instead of a subquery.

The "wrong varnullingrels" error was reported by Alexander Lakhin.
The incorrect result issue and the improper outer-join removal issue
were reported by Richard Guo.

Author: Richard Guo <guofenglinux@gmail.com>
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/75eb1a6f-d59f-42e6-8a78-124ee808cda7@gmail.com
2025-02-25 16:10:25 +09:00
Michael Paquier
560a842d63 Fix untranslatable string concatenation in pg_upgrade
Oversight in 1aab6805919b.

Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20250225.140953.1271748916018759840.horikyota.ntt@gmail.com
2025-02-25 15:53:32 +09:00
Amit Kapila
5b8f2ccc0a Doc: Fix pg_copy_logical_replication_slot description.
This commit documents that the failover option is not copied when using
the pg_copy_logical_replication_slot function.

In passing, we modify the comments in the function to clarify the reason
for this behavior.

Reported-by: <duffieldzane@gmail.com>
Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 17, where it was introduced
Discussion: https://postgr.es/m/173976850802.682632.11315364077431550250@wrigleys.postgresql.org
2025-02-25 09:42:07 +05:30
Jeff Davis
15601fa21a Missing doc update for f3dae2ae58. 2025-02-24 17:27:32 -08:00
Jeff Davis
f3dae2ae58 Do not use in-place updates for statistics import.
The use of in-place updates was originally there to follow the
precedent of ANALYZE and to reduce the potential for bloat on
pg_class. Per discussion, it's not worth the risks.

Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/cpdanvzykcb5o64rmapkx6n5gjypoce3y52hff7ocxupgpbxu4@53jmlyvukijo
2025-02-24 17:10:59 -08:00
Michael Paquier
3ce357584e psql: Add pipeline status to prompt and some state variables
This commit adds %P to psql prompts, which reports the status of a
pipeline depending on PQpipelineStatus(): on, off or abort.

The following variables are added to report the state of an ongoing
pipeline:
- PIPELINE_SYNC_COUNT: reports the number of piped syncs.
- PIPELINE_COMMAND_COUNT: reports the number of piped commands, a
command being either \bind, \bind_named, \close or \parse.
- PIPELINE_RESULT_COUNT: reports the results available to read with
\getresults.

These variables can be used with \echo or in a prompt, using "%:name:"
in PROMPT1, PROMPT2 or PROMPT3.  Some basic regression tests are added
for these.  The suggestion to use variables to show the details about
the status counters comes from me.  The original patch proposed was less
extensible, hardcoding the output in the prompt.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/CAO6_XqroE7JuMEm1sWz55rp9fAYX2JwmcP_3m_v51vnOFdsLiQ@mail.gmail.com
2025-02-25 10:07:24 +09:00
Amit Langote
cbb9086c9e Fix bug in cbc127917 to handle nested Append correctly
A non-leaf partition with a subplan that is an Append node was
omitted from PlannedStmt.unprunableRelids because it was mistakenly
included in PlannerGlobal.prunableRelids due to the way
PartitionedRelPruneInfo.leafpart_rti_map[] is constructed. This
happened when a non-leaf partition used an unflattened Append or
MergeAppend.  As a result, ExecGetRangeTableRelation() reported an
error when called from CreatePartitionPruneState() to process the
partition's own PartitionPruneInfo, since it was treated as prunable
when it should not have been.

Reported-by: Alexander Lakhin <exclusion@gmail.com> (via sqlsmith)
Diagnosed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/74839af6-aadc-4f60-ae77-ae65f94bf607@gmail.com
2025-02-25 09:24:42 +09:00
Masahiko Sawada
48796a98d5 Fix assertion when decoding XLOG_PARAMETER_CHANGE on promoted primary.
When a standby replays an XLOG_PARAMETER_CHANGE record that lowers
wal_level below logical, we invalidate all logical slots in hot
standby mode. However, if this record was replayed while not in hot
standby mode, logical slots could remain valid even after promotion,
potentially causing an assertion failure during WAL record decoding.

To fix this issue, this commit adds a check for hot_standby status
when restoring a logical replication slot on standbys. This check
ensures that logical slots are invalidated when they become
incompatible due to insufficient wal_level during recovery.

Backpatch to v16 where logical decoding on standby was introduced.

Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/CAD21AoABoFwGY_Rh2aeE6tEq3HkJxf0c6UeOXn4VV9v6BAQPSw%40mail.gmail.com
Backpatch-through: 16
2025-02-24 14:03:04 -08:00
Daniel Gustafsson
d1146dc2a7 oauth: Rename macro to avoid collisions on Windows
Our JSON parsing code defined the macros OPTIONAL and REQUIRED to decorate
the structs with, for increased readability.  This, however, collides with
macros in the <windef.h> header on Windows.

../src/interfaces/libpq/fe-auth-oauth-curl.c:398:9: warning: "OPTIONAL" redefined
  398 | #define OPTIONAL false
      |         ^~~~~~~~
In file included from D:/a/_temp/msys64/ucrt64/include/windef.h:9,
                 from D:/a/_temp/msys64/ucrt64/include/windows.h:69,
                 from D:/a/_temp/msys64/ucrt64/include/winsock2.h:23,
                 from ../src/include/port/win32_port.h:60,
                 from ../src/include/port.h:24,
                 from ../src/include/c.h:1331,
                 from ../src/include/postgres_fe.h:28,
                 from ../src/interfaces/libpq/fe-auth-oauth-curl.c:16:
include/minwindef.h:65:9: note: this is the location of the previous definition
   65 | #define OPTIONAL
      |         ^~~~~~~~

Rename to avoid compilation errors in anticipation of implementing
support for Windows.

Reported-by: Dave Cramer (on PostgreSQL Hacking Discord)
2025-02-24 22:20:37 +01:00
Daniel Gustafsson
03366b61df oauth: Fix incorrect const markers in struct
Two members in PGoauthBearerRequest were incorrectly marked as const.
While in there, align the name of the struct with the typedef as per
project style.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/912516.1740329361@sss.pgh.pa.us
2025-02-24 22:20:29 +01:00
Melanie Plageman
bfe56cdf9a Delay extraction of TIDBitmap per page offsets
Pages from the bitmap created by the TIDBitmap API can be exact or
lossy. The TIDBitmap API extracts the tuple offsets from exact pages
into an array for the convenience of the caller.

This was done in tbm_private|shared_iterate() right after advancing the
iterator. However, as long as tbm_private|shared_iterate() sets a
reference to the PagetableEntry in the TBMIterateResult, the offset
extraction can be done later.

Waiting to extract the tuple offsets has a few benefits. For the shared
iterator case, it allows us to extract the offsets after dropping the
shared iterator state lock, reducing time spent holding a contended
lock.

Separating the iteration step and extracting the offsets later also
allows us to avoid extracting the offsets for prefetched blocks. Those
offsets were never used, so the overhead of extracting and storing them
was wasted.

The real motivation for this change, however, is that future commits
will make bitmap heap scan use the read stream API. This requires a
TBMIterateResult per issued block. By removing the array of tuple
offsets from the TBMIterateResult and only extracting the offsets when
they are used, we reduce the memory required for per buffer data
substantially.

Suggested-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGLHbKP3jwJ6_%2BhnGi37Pw3BD5j2amjV3oSk7j-KyCnY7Q%40mail.gmail.com
2025-02-24 16:10:19 -05:00
Melanie Plageman
b8778c4cd8 Add lossy indicator to TBMIterateResult
TBMIterateResult->ntuples is -1 when the page in the bitmap is lossy.
Add an explicit lossy indicator so that we can move ntuples out of the
TBMIterateResult in a future commit.

Discussion: https://postgr.es/m/CA%2BhUKGLHbKP3jwJ6_%2BhnGi37Pw3BD5j2amjV3oSk7j-KyCnY7Q%40mail.gmail.com
2025-02-24 16:10:13 -05:00
Nathan Bossart
c56e8af75e Fix comment for MAX_BACKENDS.
This comment mentions that we check that the configured number of
backends does not exceed MAX_BACKENDS in RegisterBackgroundWorker()
and relevant GUC check hooks, neither of which has those checks
anymore.  To fix, adjust this comment to say that we do the check
in InitializeMaxBackends().

Oversights in commits 6bc8ef0b7f and 0b1fe1413e.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/Z7zOEzz8lNjaU9yf%40nathan
2025-02-24 15:02:09 -06:00
Robert Haas
e87c14b19e libpq: Trace all NegotiateProtocolVersion fields
Previously, the names of the unsupported protocol options were not
traced. Since NegotiateProtocolVersion has not really been used yet,
that has not mattered much, but we hope to use it eventually, so let's
fix this.

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/CAGECzQTfc_O+HXqAo5_-xG4r3EFVsTefUeQzSvhEyyLDba-O9w@mail.gmail.com
2025-02-24 12:06:21 -05:00
Robert Haas
c9d94ea215 libpq: Add PQfullProtocolVersion to exports.txt
This is necessary to be able to actually use the function on Windows;
bug introduced in commit cdb6b0fdb0b2face270406905d31f8f513b015cc.

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/CAGECzQTfc_O+HXqAo5_-xG4r3EFVsTefUeQzSvhEyyLDba-O9w@mail.gmail.com
2025-02-24 11:47:31 -05:00
Tom Lane
9de2cc455e Fix confusion about data type of pg_class.relpages and relallvisible.
Although they're exposed as int4 in pg_class, relpages and
relallvisible are really of type BlockNumber, that is uint32.
Correct type puns in relation_statistics_update() and remove
inappropriate range-checks.  The type puns are only cosmetic
issues, but the range checks would cause failures with huge
relations.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/614341.1740269035@sss.pgh.pa.us
2025-02-24 11:16:04 -05:00
Daniel Gustafsson
e889422d98 pg_amcheck: PQclear query results
While the potential memory leak is small, make sure to PQclear the query
results before disconnecting.

Author: Jiao Shuntian <312199339@qq.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/tencent_F34922C91C41E76C734773E767C9FBDB9906@qq.com
2025-02-24 16:03:19 +01:00
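
The pattern being enforced, in a minimal standalone form (connection string
and query are illustrative):

    #include <stdio.h>
    #include <libpq-fe.h>

    int main(void)
    {
        PGconn *conn = PQconnectdb("dbname=postgres");

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "%s", PQerrorMessage(conn));
            PQfinish(conn);
            return 1;
        }

        PGresult *res = PQexec(conn, "SELECT 1");
        /* ... inspect the result ... */
        PQclear(res);               /* free the result before disconnecting */
        PQfinish(conn);
        return 0;
    }
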
Andres Freund
5ee75e32fa Add static asserts for MAX_BACKENDS limiting factors
So far the various dependencies were documented in the comment above
MAX_BACKENDS, but not checked.

Discussion: https://postgr.es/m/CA+COZaBO_s3LfALq=b+HcBHFSOEGiApVjrRacCe4VP9m7CJsNQ@mail.gmail.com
2025-02-24 06:23:41 -05:00
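
The general shape of such a check, using standard C11 _Static_assert with
made-up constants (the in-tree code uses its own assertion macro and the
real limits):

    #define DEMO_MAX_BACKENDS_BITS 18
    #define DEMO_MAX_BACKENDS      ((1 << DEMO_MAX_BACKENDS_BITS) - 1)
    #define DEMO_REFCOUNT_BITS     18

    /* Fails at compile time if a dependent field silently becomes too narrow. */
    _Static_assert(DEMO_MAX_BACKENDS <= ((1 << DEMO_REFCOUNT_BITS) - 1),
                   "reference-count field too narrow for DEMO_MAX_BACKENDS");

    int main(void) { return 0; }
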
Andres Freund
418451bfe1 bufmgr: Make it easier to change number of buffer state bits
In an upcoming commit I'd like to change the number of bits for the usage
count (the current max is 5, fitting in three bits, but we reserve four
bits). Until now that required adjusting a bunch of magic constants; now the
constants are defined based on the number of bits reserved.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/lxzj26ga6ippdeunz6kuncectr5gfuugmm2ry22qu6hcx6oid6@lzx3sjsqhmt6
Discussion: https://postgr.es/m/riivolmg6uzfvpzfn6wjo3ghwt42rcec43ok6mv4oenfg654y7@x7dbposbskwd
2025-02-24 06:23:41 -05:00
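
The gist of deriving the constants from a single bit count (illustrative
names only, not the buffer-header definitions): change one number and the
mask and maximum follow automatically.

    #include <stdio.h>

    #define DEMO_USAGE_COUNT_BITS  4
    #define DEMO_USAGE_COUNT_SHIFT 0
    #define DEMO_USAGE_COUNT_MASK \
        (((1u << DEMO_USAGE_COUNT_BITS) - 1) << DEMO_USAGE_COUNT_SHIFT)
    #define DEMO_MAX_USAGE_COUNT   ((1u << DEMO_USAGE_COUNT_BITS) - 1)

    int main(void)
    {
        printf("mask=0x%08x max=%u\n",
               DEMO_USAGE_COUNT_MASK, DEMO_MAX_USAGE_COUNT);
        return 0;
    }
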
Andres Freund
cd3ccf88aa Base LWLock limits directly on MAX_BACKENDS
Jacob reported that comments for LW_SHARED_MASK referenced a MAX_BACKENDS
limit of 2^23-1, but that MAX_BACKENDS is actually limited to 2^18-1. The
limit was lowered in 48354581a49c, but the comment in lwlock.c wasn't updated.

Instead of just fixing the comment, it seems better to directly base the
lwlock defines on MAX_BACKENDS and add static assertions to ensure that there
is enough space. That way there's no comment that can go out of sync in the
future.

As part of that change I noticed that for some reason the high bit wasn't used
for flags, which seems somewhat odd. Redefine the flag values to start at the
highest bit.

Reported-by: Jacob Brazeal <jacob.brazeal@gmail.com>
Reviewed-by: Jacob Brazeal <jacob.brazeal@gmail.com>
Discussion: https://postgr.es/m/CA+COZaBO_s3LfALq=b+HcBHFSOEGiApVjrRacCe4VP9m7CJsNQ@mail.gmail.com
2025-02-24 06:23:41 -05:00
Andres Freund
6394a3a61c Move MAX_BACKENDS to procnumber.h
MAX_BACKENDS influences many things besides postmaster. I e.g. noticed that we
don't have static assertions ensuring BUF_REFCOUNT_MASK is big enough for
MAX_BACKENDS; adding them would require including postmaster.h in
buf_internals.h, which doesn't seem right.

While at that, add MAX_BACKENDS_BITS, as that's useful in various places for
static assertions (to be added in subsequent commits).

Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/wptizm4qt6yikgm2pt52xzyv6ycmqiutloyvypvmagn7xvqkce@d4xuv3mylpg4
2025-02-24 06:23:41 -05:00
John Naylor
0600d276d4 Silence warning in older versions of Valgrind
Due to a misunderstanding on my part, commit 235328ee4 did not go far
enough to silence older versions of Valgrind. For those, it was the bit
scan that was problematic, not the subsequent bit-masking operation. To
fix, use the unaligned path for the trailing bytes. Since we don't have
a bit scan here anymore, also remove some comments and endian-specific
coding around that.

Reported-by: Anton A. Melnikov <a.melnikov@postgrespro.ru>
Discussion: https://postgr.es/m/f3aa2d45-3b28-41c5-9499-a1bc30e0f8ec@postgrespro.ru
Backpatch-through: 17
2025-02-24 18:03:29 +07:00
Michael Paquier
2421e9a51d Remove read/sync fields from pg_stat_wal and GUC track_wal_io_timing
The four following attributes are removed from pg_stat_wal:
* wal_write
* wal_sync
* wal_write_time
* wal_sync_time

a051e71e28a1 has added an equivalent of this information in pg_stat_io
with more granularity, as it is now spread across the backend types, IO
contexts and IO objects.  So, keeping the same information in pg_stat_wal
has little benefit.

Another benefit of this commit is the removal of PendingWalStats,
simplifying an upcoming patch to add per-backend WAL statistics, which
already support IO statistics and which have access to the write/sync
stats data of WAL.

The GUC track_wal_io_timing, that was used to enable or disable the
aggregation of the write and sync timings for WAL, is also removed.
pgstat_prepare_io_time() is simplified.

Bump catalog version.
Bump PGSTAT_FILE_FORMAT_ID, due to the update of PgStat_WalStats.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z7RkQ0EfYaqqjgz/@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-24 09:51:56 +09:00
Tom Lane
fc0d0ce978 Ignore hash's relallvisible when checking pg_upgrade from pre-v10.
Our cross-version upgrade tests have been failing for some pre-v10
source versions since commit 1fd1bd871.  This turns out to be
because relallvisible may change for tables that have hash indexes,
because the upgrade process forcibly reindexes such indexes to
deal with the changes made in v10.

Fortunately, the set of tables that have such indexes is small
and won't change anymore in those branches.  So just hack up
AdjustUpgrade.pm to not compare the relallvisible values of
those specific tables.

While here, also tighten the regex that suppresses comparison
of version fields.

Discussion: https://postgr.es/m/812817.1740277228@sss.pgh.pa.us
2025-02-23 14:16:26 -05:00
Peter Eisentraut
454c182f85 backend libpq void * argument for binary data
Change some backend libpq functions to take void * for binary data
instead of char *.  This removes the need for numerous casts.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
2025-02-23 14:27:02 +01:00
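
In miniature, the kind of cast churn this avoids (function name is made up):
a void * parameter accepts any buffer type directly.

    #include <stdio.h>
    #include <string.h>

    /* Taking void * lets callers pass any buffer type without casting. */
    static void demo_send_bytes(const void *data, size_t len)
    {
        fwrite(data, 1, len, stdout);
    }

    int main(void)
    {
        unsigned char raw[4] = {0xDE, 0xAD, 0xBE, 0xEF};
        const char *text = "hello";

        demo_send_bytes(raw, sizeof(raw));      /* no (char *) cast needed */
        demo_send_bytes(text, strlen(text));
        return 0;
    }
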
Peter Eisentraut
ebdccead16 SnapBuildRestoreContents() void * argument for binary data
Change internal snapbuild API function to take void * for binary data
instead of char *.  This removes the need for numerous casts.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
2025-02-23 12:38:21 +01:00
Michael Paquier
a4e986ef5a Add more tests for utility commands in pipelines
This commit checks interactions with pipelines and implicit transaction
blocks for the following commands that have their own behaviors when
used in pipelines depending on their order in a pipeline and sync
requests:
- SET LOCAL
- REINDEX CONCURRENTLY
- VACUUM
- Subtransactions (SAVEPOINT, ROLLBACK TO SAVEPOINT)

These scenarios could be tested only with pgbench previously.  The
meta-commands of psql controlling pipelines make these easier to
implement and debug, and they can be run in a SQL script.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/CAO6_XqroE7JuMEm1sWz55rp9fAYX2JwmcP_3m_v51vnOFdsLiQ@mail.gmail.com
2025-02-23 16:43:07 +09:00
Peter Eisentraut
f98765f0ce jsonb internal API void * argument for binary data
Change some internal jsonb API functions to take void * for binary
data instead of char *.  This removes the need for numerous casts.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
2025-02-23 08:34:55 +01:00
Jeff Davis
cb45dc3afb Documentation fixups for dumping statistics.
Reported-by: Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com>
Reported-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/OSCPR01MB149665630030E7F54FDA8B27BF5C72@OSCPR01MB14966.jpnprd01.prod.outlook.com
Discussion: https://postgr.es/m/25d26774-25fa-46f2-9888-c6a707d1fef7@dunslane.net
2025-02-22 10:03:11 -08:00
Álvaro Herrera
bba2fbc623 Change \conninfo to use tabular format
(Initially the proposal was to keep \conninfo alone and add this feature
as \conninfo+, but we decided against keeping the original.)

Also display more fields than before, though not as many as were
suggested during the discussion.  In particular, we don't show 'role'
nor 'session authorization', for both of which a case can probably be made.
These can be added as followup commits, if we agree to it.

Some (most?) reviewers actually reviewed rather different versions of
the patch and do not necessarily endorse the current one.

Co-authored-by: Maiquel Grassi <grassi@hotmail.com.br>
Co-authored-by: Hunaid Sohail <hunaidpgml@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Sami Imseih <simseih@amazon.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Pavel Luzanov <p.luzanov@postgrespro.ru>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Erik Wienhold <ewie@ewie.name>
Discussion: https://postgr.es/m/CP8P284MB24965CB63DAC00FC0EA4A475EC462@CP8P284MB2496.BRAP284.PROD.OUTLOOK.COM
2025-02-22 10:05:26 +01:00
Amit Langote
4f1b6e5bb4 Remove unstable test suite added by 525392d57
The 'cached-plan-inval' test suite, introduced in 525392d57 under
src/test/modules/delay_execution, aimed to verify that cached plan
invalidation triggers replanning after deferred locks are taken.
However, its ExecutorStart_hook-based approach relies on lock timing
assumptions that, in retrospect, are fragile. This instability was
exposed by failures on BF animal trilobite, which builds with
CLOBBER_CACHE_ALWAYS.

One option was to dynamically disable the cache behavior that causes
the test suite to fail by setting "debug_discard_caches = 0", but it
seems better to remove the suite. The risk of future failures due to
other cache flush hazards outweighs the benefit of catching real
breakage in the backend behavior it tests.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2990641.1740117879@sss.pgh.pa.us
2025-02-22 15:19:23 +09:00
Andres Freund
f8d7f29b3e Allow lwlocks to be disowned
To implement AIO writes, the backend initiating writes needs to transfer the
lock ownership to the AIO subsystem, so the lock held during the write can be
released in another backend.

Other backends need to be able to "complete" an asynchronously started IO to
avoid deadlocks (consider e.g. one backend starting IO for a buffer and then
waiting for a heavyweight lock held by another backend, followed by the
current holder of the heavyweight lock waiting for the IO to complete).

To that end, this commit adds LWLockDisown() and LWLockReleaseDisowned(). If
code uses LWLockDisown() it's the code's responsibility to ensure that the
lock is released in case of errors.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/1f6b50a7-38ef-4d87-8246-786d39f46ab9@iki.fi
2025-02-21 20:55:23 -05:00
Robert Haas
44cbba9a7f Adjust EXPLAIN test case to filter out "Actual Rows" values.
Per the buildfarm, these tests appear to be unstable in the wake of
commit ddb17e387aa28d61521227377b00f997756b8a27. I'm not sure that
just hiding this output is the right way forward, because I think
there may be other test cases that will fail with lower probability
even after this fix. However, it's hard to tell right now, because
this is failing on a number of buildfarm animals. So let's try this
for now to either get a clearer picture of what else is broken, or
as a stopgap until we decide what the permanent fix should be, or
perhaps this will be the permanent fix after all.
2025-02-21 19:20:41 -05:00
Tom Lane
98fc31d649 Avoid race condition between "GRANT role" and "DROP ROLE".
Concurrently dropping either the granted role or the grantee
does not stop GRANT from completing, instead resulting in a
dangling role reference in pg_auth_members.  That's relatively
harmless in the short run, but inconsistent catalog entries
are not a good thing.

This patch solves the problem by adding the granted and grantee
roles as explicit shared dependencies of the pg_auth_members entry.
That's a bit indirect, but it works because the pg_shdepend code
applies the necessary locking and rechecking.

Commit 6566133c5 previously established similar handling for
the grantor column of pg_auth_members; it's not clear why it
didn't cover the other two role OID columns.

A side-effect of this approach is that DROP OWNED BY will now drop
pg_auth_members entries that mention the target role as either the
granted or grantee role.  That's clearly appropriate for the
grantee, since we'll drop its other privileges too.  It doesn't
seem too far out of line for the granted role, since we're
presumably about to drop it and besides we're removing all reasons
why it'd matter to be a member of it.  (One could argue that this
makes DropRole's code to auto-drop pg_auth_members entries
unnecessary, but I chose to leave it in place since perhaps some
people's workflows expect that to work without a DROP OWNED BY.)

Note to patch readers: CreateRole's first CommandCounterIncrement
call is now unconditional, because this change creates another
case in which it's needed, and it seemed to be more trouble than
it's worth to preserve that micro-optimization.

Arguably this is a bug fix, but the fact that it changes the
expected contents of pg_shdepend seems like not a great thing
to do in the stable branches, and perhaps we don't want the
change in DROP OWNED BY semantics there either.  On the other
hand, I opted not to force a catversion bump in HEAD, because
the presence or absence of these entries doesn't matter for
most purposes.

Reported-by: Virender Singla <virender.cse@gmail.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/CAM6Zo8woa62ZFHtMKox6a4jb8qQ=w87R2L0K8347iE-juQL2EA@mail.gmail.com
2025-02-21 17:07:01 -05:00
Robert Haas
ddb17e387a Allow EXPLAIN to indicate fractional rows.
When nloops > 1, we now display two digits after the decimal point,
rather than none. This is important because what we print is actually
planstate->instrument->ntuples / nloops, and sometimes what you want
to know is planstate->instrument->ntuples. You can estimate that by
multiplying the displayed row count by the displayed nloops value, but
the fact that the displayed value is rounded makes that inexact. It's
still inexact even if we show these two extra decimal places, but less
so. Perhaps we will agree on a way to further improve this output later,
but for now this seems better than doing nothing.

Author: Ibrar Ahmed <ibrar.ahmad@gmail.com>
Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Greg Stark <stark@mit.edu>
Reviewed-by: Naeem Akhter <akhternaeem@gmail.com>
Reviewed-by: Hamid Akhtar <hamid.akhtar@percona.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrei Lepikhov <a.lepikhov@postgrespro.ru>
Reviewed-by: Guillaume Lelarge <guillaume@lelarge.info>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: Alena Rybakina <a.rybakina@postgrespro.ru>
Discussion: http://postgr.es/m/603c8f070905281830g2e5419c4xad2946d149e21f9d%40mail.gmail.com
2025-02-21 16:14:13 -05:00
Masahiko Sawada
78d3f48895 Add test 005_char_signedness.pl to meson.build.
Oversight in a8238f87f98 where the test has been added.

Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
2025-02-21 12:31:16 -08:00
Tom Lane
29d75b25b5 Fix pg_dumpall to cope with dangling OIDs in pg_auth_members.
There is a race condition between "GRANT role" and "DROP ROLE",
which allows GRANT to install pg_auth_members entries that refer to
dropped roles.  (Commit 6566133c5 prevented that for the grantor
field, but not for the granted or grantee roles.)  We'll soon fix
that, at least in HEAD, but pg_dumpall needs to cope with the
situation in case of pre-existing inconsistency.  As pg_dumpall
stands, it will emit invalid commands like 'GRANT foo TO ""',
which causes pg_upgrade to fail.  Fix it to emit warnings and skip
those GRANTs, instead.

There was some discussion of removing the problem by changing
dumpRoleMembership's query to use JOIN not LEFT JOIN, but that
would result in silently ignoring such entries.  It seems better
to produce a warning.

Pre-v16 branches already coped with dangling grantor OIDs by simply
omitting the GRANTED BY clause.  I left that behavior as-is, although
it's somewhat inconsistent with the behavior of later branches.

Reported-by: Virender Singla <virender.cse@gmail.com>
Discussion: https://postgr.es/m/CAM6Zo8woa62ZFHtMKox6a4jb8qQ=w87R2L0K8347iE-juQL2EA@mail.gmail.com
Backpatch-through: 13
2025-02-21 13:37:15 -05:00
Masahiko Sawada
dfd8e6c73e Fix an issue with index scan using pg_trgm due to char signedness on different architectures.
GIN and GiST indexes utilizing pg_trgm's opclasses store sorted
trigrams within index tuples. When comparing and sorting each trigram,
pg_trgm treats each trigram as a 'char[3]' type in C. However, the
char type in C can be interpreted as either signed char or unsigned
char, depending on the platform, if the signedness is not explicitly
specified. Consequently, during replication between different CPU
architectures, there was an issue where index scans on standby servers
could not locate matching index tuples due to the differing treatment
of character signedness.

This change introduces comparison functions for trgm that explicitly
handle signed char and unsigned char. The appropriate comparison
function will be dynamically selected based on the character
signedness stored in the control file. Therefore, upgraded clusters
can utilize the indexes without rebuilding, provided the cluster
upgrade occurs on platforms with the same character signedness as the
original cluster initialization.

The default char signedness information was introduced in 44fe30fdab6,
so no backpatch.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
2025-02-21 10:27:39 -08:00
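
To see why the stored ordering depends on the platform, a small standalone
example (not pg_trgm code): the same two bytes compare differently as plain
char on signed-char and unsigned-char platforms, while an explicit unsigned
comparison is stable everywhere.

    #include <stdio.h>

    static int cmp_as_plain_char(char a, char b)
    {
        return (a > b) - (a < b);   /* result depends on platform signedness */
    }

    static int cmp_as_unsigned(unsigned char a, unsigned char b)
    {
        return (a > b) - (a < b);   /* same result on every platform */
    }

    int main(void)
    {
        char hi = (char) 0xE9;      /* e.g. a byte from a multibyte character */
        char lo = 'a';

        printf("plain char:    %d\n", cmp_as_plain_char(hi, lo));
        printf("unsigned char: %d\n",
               cmp_as_unsigned((unsigned char) hi, (unsigned char) lo));
        return 0;
    }
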
Masahiko Sawada
1aab680591 pg_upgrade: Add --set-char-signedness to set the default char signedness of new cluster.
This change adds a new option --set-char-signedness to pg_upgrade. It
enables users to set the char signedness explicitly during pg_upgrade. This
helps in cases where a user who knows they copied the v17 source cluster
from x86 (signedness=true) to ARM (signedness=false) can run pg_upgrade
properly without the prerequisite of acquiring an x86 VM.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
2025-02-21 10:23:39 -08:00
Masahiko Sawada
a8238f87f9 pg_upgrade: Preserve default char signedness value from old cluster.
Commit 44fe30fdab6 introduced the 'default_char_signedness' field in
controlfile. Newly created database clusters always set this field to
'signed'.

This change ensures that pg_upgrade updates the
'default_char_signedness' to 'unsigned' if the source database cluster
has signedness=false. For source clusters from v17 or earlier, which
lack the 'default_char_signedness' information, pg_upgrade assumes the
source cluster was initialized on the same platform where pg_upgrade
is running. It then sets the 'default_char_signedness' value according
to the current platform's default character signedness.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
2025-02-21 10:19:40 -08:00
Masahiko Sawada
30666d1857 pg_resetwal: Add --char-signedness option to change the default char signedness.
With the newly added option --char-signedness, pg_resetwal updates the
default char signedness flag in the controlfile. This option is
primarily intended for an upcoming patch that pg_upgrade supports
preserving the default char signedness during upgrades, and is not
meant for manual operation.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
2025-02-21 10:14:36 -08:00
Masahiko Sawada
44fe30fdab Add default_char_signedness field to ControlFileData.
The signedness of the 'char' type in C is
implementation-dependent. For instance, 'signed char' is used by
default on x86 CPUs, while 'unsigned char' is used on AArch64
CPUs. Previously, we accidentally let C implementation signedness
affect persistent data. This led to inconsistent results when
comparing char data across different platforms.

This commit introduces a new 'default_char_signedness' field in
ControlFileData to store the signedness of the 'char' type. While this
change does not encourage the use of 'char' without explicitly
specifying its signedness, this field can be used as a hint to ensure
consistent behavior for pre-v18 data files that store data sorted by
the 'char' type on disk (e.g., GIN and GiST indexes), especially in
cross-platform replication scenarios.

Newly created database clusters unconditionally set the default char
signedness to true. pg_upgrade (with an upcoming commit) changes this
flag for clusters if the source database cluster has
signedness=false. As a result, signedness=false setting will become
rare over time. If we had known about the problem during the last
development cycle that forced initdb (v8.3), we would have made all
clusters signed or all clusters unsigned. Making pg_upgrade the only
source of signedness=false will cause the population of database
clusters to converge toward that retrospective ideal.

Bump catalog version (for the catalog changes) and PG_CONTROL_VERSION
(for the additions in ControlFileData).

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
2025-02-21 10:12:08 -08:00
Bruce Momjian
901a1cf8b4 doc: clarify default checksum behavior in non-master branches
Also simplify and correct data checksum wording in master now that it is
the default.  PG 13 did not have the awkward wording.

Reported-by: Felix <afripowered@gmail.com>

Reviewed-by: Laurenz Albe

Discussion: https://postgr.es/m/173928241056.707.3989867022954178032@wrigleys.postgresql.org

Backpatch-through: 14
2025-02-21 13:03:29 -05:00
Bruce Momjian
6ea0734e41 doc: remove non-breaking space in SGML files, causes make error 2025-02-21 12:15:53 -05:00
Andres Freund
32ce58e9e9 Make test portlock logic work with meson
Previously the portlock logic, added in 9b4eafcaf41, didn't actually work
properly when the tests were run via meson. 9b4eafcaf41 used the
MESON_BUILD_ROOT environment variable to determine the directory for the port
lock directory, but that's never set for running the tests.  That meant that
each test used its own portlock dir, unless the PG_TEST_PORT_DIR environment
variable was set.

Fix the problem by setting top_builddir for the environment. That's also used
for the autoconf/make build.

Backpatch back to 16, where meson support was added.

Reported-by: Zharkov Roman <r.zharkov@postgrespro.ru>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Backpatch-through: 16
2025-02-21 11:25:05 -05:00
Michael Paquier
665cafe8a4 Fix cross-version upgrades with XMLSERIALIZE(NO INDENT)
Dumps from versions older than v16 do not know about NO INDENT in an
XMLSERIALIZE() clause.  This commit adjusts AdjustUpgrade.pm so that NO
INDENT is discarded in the contents of the new dump adjusted for
comparison when the old version is v15 or older.  This should be enough
to make the cross-version upgrade tests pass.

Per report from buildfarm member crake.  Oversight in 984410b92326.

Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/88b183f1-ebf9-4f51-9144-3704380ccae7@dunslane.net
Backpatch-through: 16
2025-02-21 20:37:31 +09:00
Peter Eisentraut
329304c901 Support text position search functions with nondeterministic collations
This allows using text position search functions with nondeterministic
collations.  These functions are

- position, strpos
- replace
- split_part
- string_to_array
- string_to_table

which all use common internal infrastructure.

There was previously no internal implementation of this, so it was met
with a not-supported error.  This adds the internal implementation and
removes the error.

Unlike with deterministic collations, the search cannot use any
byte-by-byte optimized techniques but has to go substring by
substring.  We also need to consider that the found match could have a
different length than the needle and that there could be substrings of
different length matching at a position.  In most cases, we need to
find the longest such substring (greedy semantics), but this can be
configured by each caller.
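
As an illustrative sketch only (this assumes an ICU-enabled build; the
collation name and strings are made up), the newly supported calls look
like this:

    CREATE COLLATION ci (provider = icu, locale = 'und-u-ks-level2',
                         deterministic = false);
    -- Both of these raised a not-supported error before this commit.
    SELECT position('bar' IN 'fooBARbaz' COLLATE ci);       -- 4
    SELECT replace('fooBARbaz' COLLATE ci, 'bar', '###');   -- foo###baz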

Reviewed-by: Euler Taveira <euler@eulerto.com>
Discussion: https://www.postgresql.org/message-id/flat/582b2613-0900-48ca-8b0d-340c06f4d400@eisentraut.org
2025-02-21 12:21:17 +01:00
Daniel Gustafsson
41336bf085 doc: Add links to olsen93 and ong90 in bibliography
The bibliography entries for olsen93 and ong90 lacked links to
online copies.  While ong90 is available in digital form, the
olsen93 thesis is only available as a physical copy in the UCB
library.  To save people from searching for it, we still link
to it via the UCB library page.

Reported-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxFcJYdRvzgt59N26XjFp2tFFUXu+VN+x8Uo0NbDUCMCbw@mail.gmail.com
2025-02-21 11:28:42 +01:00
Amit Kapila
b4e0d0c53f Fix a WARNING for data origin discrepancies.
Previously, a WARNING was issued at the time of defining a subscription
with origin=NONE only when the publisher subscribed to the same table from
other publishers, indicating potential data origination from different
origins. However, the publisher can subscribe to the partition ancestors
or partition children of the table from other publishers, which could also
result in mixed-origin data inclusion. So, give a WARNING in those cases
as well.
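
For illustration (the names and connection string below are hypothetical),
a subscription like the following now draws the WARNING when the publisher
itself subscribes to the table, or to one of its partition ancestors or
children, from elsewhere:

    CREATE SUBSCRIPTION sub_node_c
        CONNECTION 'host=node_b dbname=postgres'
        PUBLICATION pub_parted
        WITH (origin = none, copy_data = true);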

Reported-by: Sergey Tatarintsev <s.tatarintsev@postgrespro.ru>
Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 16, where it was introduced
Discussion: https://postgr.es/m/5eda6a9c-63cf-404d-8a49-8dcb116a29f3@postgrespro.ru
2025-02-21 14:34:40 +05:30
Michael Paquier
984410b923 Add missing deparsing of [NO] INDENT to XMLSERIALIZE()
NO INDENT is the default, and is added if no explicit indentation
flag was provided with XMLSERIALIZE().

Oversight in 483bdb2afec9.
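
A minimal illustration of the clause whose deparsing is now covered
(assuming a build with libxml support; the document literal is arbitrary):

    SELECT XMLSERIALIZE(DOCUMENT xml '<root><a/></root>' AS text INDENT);
    SELECT XMLSERIALIZE(DOCUMENT xml '<root><a/></root>' AS text NO INDENT);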

Author: Jim Jones <jim.jones@uni-muenster.de>
Discussion: https://postgr.es/m/bebd457e-5b43-46b3-8fc6-f6a6509483ba@uni-muenster.de
Backpatch-through: 16
2025-02-21 17:30:56 +09:00
Peter Eisentraut
7d6d2c4bbd Drop opcintype from index AM strategy translation API
The type argument wasn't actually necessary.  It was a remnant
of converting the API of the gist strategy translation from using
opclass to using opfamily+opcintype (commits c09e5a6a016,
622f678c102).  For looking up the gist translation function, we used
the convention "amproclefttype = amprocrighttype = opclass's
opcintype" (see pg_amproc.h).  But each operator family should only
have one translation function, and getting the right type for the
lookup is sometimes cumbersome and fragile, so this is all
unnecessarily complicated.

To simplify this, change the gist strategy support procedure to take
"any", "any" as argument.  (This is arbitrary but seems intuitive.
The alternative of using InvalidOid as argument(s) upsets various DDL
commands, so it's not practical.)  Then we don't need opcintype for
the lookup, and we can remove it from all the API layers introduced by
commit c09e5a6a016.

This also adds some more documentation about the correct signature of
the gist support function and adds more checks in gistvalidate().
This was previously underspecified.  (It relied implicitly on the
convention mentioned above.)

Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-02-21 09:07:16 +01:00
Peter Eisentraut
7202d72787 backend launchers void * arguments for binary data
Change backend launcher functions to take void * for binary data
instead of char *.  This removes the need for numerous casts.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
2025-02-21 08:03:33 +01:00
Jeff Davis
b50a554cc8 Fix for pg_restore_attribute_stats().
Use RelationGetIndexExpressions() rather than rd_indexprs directly.

Author: Corey Huinker <corey.huinker@gmail.com>
2025-02-20 22:31:22 -08:00
Michael Paquier
41625ab8ea psql: Add support for pipelines
With \bind, \parse, \bind_named and \close, it is possible to issue
queries from psql using the extended protocol.  However, it was not
possible to send these queries using libpq's pipeline mode.  This
feature has two advantages:
- Testing.  Pipeline tests were only possible with pgbench, using TAP
tests.  It now becomes possible to have more SQL tests that are able to
stress the backend with pipelines and extended queries.  More tests,
discussed on other threads, will be added in a follow-up commit.  Some
external projects in the community had to implement their own facility
to work around this limitation.
- Emulation of custom workloads, with more control over the actions
taken by a client with libpq APIs.  It is possible to emulate more
workload patterns to bottleneck the backend with the extended query
protocol.

This patch adds six new meta-commands to be able to control pipelines:
* \startpipeline starts a new pipeline.  All extended queries are queued
until the end of the pipeline is reached or a sync request is sent and
processed.
* \endpipeline ends an existing pipeline.  All queued commands are sent
to the server and all responses are processed by psql.
* \syncpipeline queues a synchronisation request, without flushing the
commands to the server, equivalent of PQsendPipelineSync().
* \flush, equivalent of PQflush().
* \flushrequest, equivalent of PQsendFlushRequest()
* \getresults reads the server's results for the queries in a pipeline.
Unsent data is automatically pushed when \getresults is called.  It is
possible to control the number of results read in a single meta-command
execution with an optional parameter, 0 means that all the results
should be read.
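
A rough sketch of how these meta-commands fit together in a psql script
(the queries and parameter values are purely illustrative):

    \startpipeline
    SELECT $1 \bind 'one' \g
    SELECT $1, $2 \bind 'two' 'three' \g
    \flushrequest
    \getresults
    \endpipeline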

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/CAO6_XqroE7JuMEm1sWz55rp9fAYX2JwmcP_3m_v51vnOFdsLiQ@mail.gmail.com
2025-02-21 11:19:59 +09:00
Michael Paquier
40af897eb7 Add braces for if block with large comment in psql's common.c
A patch touching this area of the code is under review, and this format
makes the code slightly harder to read.

Extracted from a larger patch by the same author.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/CAO6_XqroE7JuMEm1sWz55rp9fAYX2JwmcP_3m_v51vnOFdsLiQ@mail.gmail.com
2025-02-21 09:18:49 +09:00
Daniel Gustafsson
2c53dec7f4 Add missing entry to oauth_validator test .gitignore
Commit b3f0be788 accidentally missed adding the oauth client test
binary to the relevant .gitignore.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2839306.1740082041@sss.pgh.pa.us
2025-02-20 21:29:21 +01:00
Peter Eisentraut
3e4d868615 Remove various unnecessary (char *) casts
Remove a number of (char *) casts that are unnecessary.  Or in some
cases, rewrite the code to make the purpose of the cast clearer.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
2025-02-20 19:49:27 +01:00
Jeff Davis
ab84d0ff80 Trial fix for old cross-version upgrades.
Per buildfarm and reports, it seems that 9.X to 18 upgrades were
failing after commit 1fd1bd8710 due to an incorrect regex. Loosen the
regex to accommodate older versions.

Reported-by: vignesh C <vignesh21@gmail.com>
Reported-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/CALDaNm3GUs+U8Nt4S=V5zmb+K8-RfAc03vRENS0teeoq0Lc6Tw@mail.gmail.com
Discussion: https://postgr.es/m/ea4cbbc1-c5a5-43d1-9618-8ff3f2155bfe@dunslane.net
2025-02-20 10:21:24 -08:00
Andrew Dunstan
8e4d72573c Ignore blank lines in pgindent exclude files
Currently a blank line matches everything, which is almost never what
someone would want. If they really want that they can use a wildcard
regex to do it.

Author: Zsolt Parragi <zsolt.parragi@percona.com>

Discussion: https://postgr.es/m/CAN4CZFNka+2q3=-Dithr4w65RJfwPaV92T62spEzLn+T4MgcMg@mail.gmail.com
2025-02-20 11:36:07 -05:00
Daniel Gustafsson
9d9a71002a cirrus: Temporarily fix libcurl link error
On FreeBSD the ftp/curl port appears to be missing a minimum
version dependency on libssh2, so the following starts showing
up after upgrading to curl 8.11.1_1:

  libcurl.so.4: Undefined symbol "libssh2_session_callback_set2"

While awaiting an upgrade of the FreeBSD CI images to version 14, work
around the issue.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAOYmi+kZAka0sdxCOBxsQc2ozEZGZKHWU_9nrPXg3sG1NJ-zJw@mail.gmail.com
2025-02-20 16:25:47 +01:00
Daniel Gustafsson
b3f0be788a Add support for OAUTHBEARER SASL mechanism
This commit implements OAUTHBEARER, RFC 7628, and OAuth 2.0 Device
Authorization Grants, RFC 8628.  In order to use this there is a
new pg_hba auth method called oauth.  When speaking to an OAuth-
enabled server, it looks a bit like this:

  $ psql 'host=example.org oauth_issuer=... oauth_client_id=...'
  Visit https://oauth.example.org/login and enter the code: FPQ2-M4BG

Device authorization is currently the only supported flow so the
OAuth issuer must support that in order for users to authenticate.
Third-party clients may however extend this and provide their own
flows.  The built-in device authorization flow is currently not
supported on Windows.

In order for validation to happen server side a new framework for
plugging in OAuth validation modules is added.  As validation is
implementation specific, with no default specified in the standard,
PostgreSQL does not ship with one built-in.  Each pg_hba entry can
specify a specific validator or be left blank for the validator
installed as default.

This adds a requirement on libcurl for the client side support,
which is optional to build, but the server side has no additional
build requirements.  In order to run the tests, Python is required
as this adds an HTTPS server written in Python.  Tests are gated
behind PG_TEST_EXTRA as they open ports.

This patch has been a multi-year project with many contributors
involved with reviews and in-depth discussions:  Michael Paquier,
Heikki Linnakangas, Zhihong Yu, Mahendrakar Srinivasarao, Andrey
Chudnovsky and Stephen Frost to name a few.  While Jacob Champion
is the main author there have been some levels of hacking by others.
Daniel Gustafsson contributed the validation module and various bits
and pieces; Thomas Munro wrote the client side support for kqueue.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Co-authored-by: Daniel Gustafsson <daniel@yesql.se>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Antonin Houska <ah@cybertec.at>
Reviewed-by: Kashif Zeeshan <kashi.zeeshan@gmail.com>
Discussion: https://postgr.es/m/d1b467a78e0e36ed85a09adf979d04cf124a9d4b.camel@vmware.com
2025-02-20 16:25:17 +01:00
Jeff Davis
1fd1bd8710 Transfer statistics during pg_upgrade.
Add support to pg_dump for dumping stats, and use that during
pg_upgrade so that statistics are transferred during upgrade. In most
cases this removes the need for a costly re-analyze after upgrade.

Some statistics are not transferred, such as extended statistics or
statistics with a custom stakind.

Now pg_dump accepts the options --schema-only, --no-schema,
--data-only, --no-data, --statistics-only, and --no-statistics; which
allow all combinations of schema, data, and/or stats. The options are
named this way to preserve compatibility with the previous
--schema-only and --data-only options.

Statistics are in SECTION_DATA, unless the object itself is in
SECTION_POST_DATA.

The stats are represented as calls to pg_restore_relation_stats() and
pg_restore_attribute_stats().

Author: Corey Huinker, Jeff Davis
Reviewed-by: Jian He
Discussion: https://postgr.es/m/CADkLM=fzX7QX6r78fShWDjNN3Vcr4PVAnvXxQ4DiGy6V=0bCUA@mail.gmail.com
Discussion: https://postgr.es/m/CADkLM%3DcB0rF3p_FuWRTMSV0983ihTRpsH%2BOCpNyiqE7Wk0vUWA%40mail.gmail.com
2025-02-20 01:29:06 -08:00
Amit Kapila
7da344b9f8 Improve errdetail message added by ac0e33136a.
Make it consistent with other similar messages.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/20250220.140839.1444694904721968348.horikyota.ntt@gmail.com
2025-02-20 14:02:29 +05:30
Amit Langote
525392d572 Don't lock partitions pruned by initial pruning
Before executing a cached generic plan, AcquireExecutorLocks() in
plancache.c locks all relations in a plan's range table to ensure the
plan is safe for execution. However, this locks runtime-prunable
relations that will later be pruned during "initial" runtime pruning,
introducing unnecessary overhead.

This commit defers locking for such relations to executor startup and
ensures that if the CachedPlan is invalidated due to concurrent DDL
during this window, replanning is triggered. Deferring these locks
avoids unnecessary locking overhead for pruned partitions, resulting
in significant speedup, particularly when many partitions are pruned
during initial runtime pruning.

* Changes to locking when executing generic plans:

AcquireExecutorLocks() now locks only unprunable relations, that is,
those found in PlannedStmt.unprunableRelids (introduced in commit
cbc127917e), to avoid locking runtime-prunable partitions
unnecessarily.  The remaining locks are taken by
ExecDoInitialPruning(), which acquires them only for partitions that
survive pruning.

This deferral does not affect the locks required for permission
checking in InitPlan(), which takes place before initial pruning.
ExecCheckPermissions() now includes an Assert to verify that all
relations undergoing permission checks, none of which can be in the
set of runtime-prunable relations, are properly locked.

* Plan invalidation handling:

Deferring locks introduces a window where prunable relations may be
altered by concurrent DDL, invalidating the plan. A new function,
ExecutorStartCachedPlan(), wraps ExecutorStart() to detect and handle
invalidation caused by deferred locking. If invalidation occurs,
ExecutorStartCachedPlan() updates CachedPlan using the new
UpdateCachedPlan() function and retries execution with the updated
plan. To ensure all code paths that may be affected by this handle
invalidation properly, all callers of ExecutorStart that may execute a
PlannedStmt from a CachedPlan have been updated to use
ExecutorStartCachedPlan() instead.

UpdateCachedPlan() replaces stale plans in CachedPlan.stmt_list. A new
CachedPlan.stmt_context, created as a child of CachedPlan.context,
allows freeing old PlannedStmts while preserving the CachedPlan
structure and its statement list. This ensures that loops over
statements in upstream callers of ExecutorStartCachedPlan() remain
intact.

ExecutorStart() and ExecutorStart_hook implementations now return a
boolean value indicating whether plan initialization succeeded with a
valid PlanState tree in QueryDesc.planstate, or false otherwise, in
which case QueryDesc.planstate is NULL. Hook implementations are
required to call standard_ExecutorStart() at the beginning, and if it
returns false, they should likewise return false without proceeding further.

* Testing:

To verify these changes, the delay_execution module tests scenarios
where cached plans become invalid due to changes in prunable relations
after deferred locks.

* Note to extension authors:

ExecutorStart_hook implementations must verify plan validity after
calling standard_ExecutorStart(), as explained earlier. For example:

    if (prev_ExecutorStart)
        plan_valid = prev_ExecutorStart(queryDesc, eflags);
    else
        plan_valid = standard_ExecutorStart(queryDesc, eflags);

    if (!plan_valid)
        return false;

    <extension-code>

    return true;

Extensions accessing child relations, especially prunable partitions,
via ExecGetRangeTableRelation() must now ensure their RT indexes are
present in es_unpruned_relids (introduced in commit cbc127917e), or
they will encounter an error. This is a strict requirement after this
change, as only relations in that set are locked.

The idea of deferring some locks to executor startup, allowing locks
for prunable partitions to be skipped, was first proposed by Tom Lane.
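
For illustration, a sketch of the kind of case that benefits (the table and
names are invented; forcing a generic plan merely makes the effect easy to
observe):

    CREATE TABLE measurement (ts date, v int) PARTITION BY RANGE (ts);
    CREATE TABLE measurement_2024 PARTITION OF measurement
        FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
    CREATE TABLE measurement_2025 PARTITION OF measurement
        FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');

    SET plan_cache_mode = force_generic_plan;
    PREPARE q(date) AS SELECT * FROM measurement WHERE ts = $1;
    -- When the cached generic plan is executed, only the partition that
    -- survives initial pruning (measurement_2025 here) is locked at
    -- executor startup.
    EXECUTE q('2025-06-01');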

Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions)
Reviewed-by: David Rowley <dgrowleyml@gmail.com> (earlier versions)
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (earlier versions)
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
2025-02-20 17:09:48 +09:00
Amit Kapila
4aa6fa3cd0 Include schema/table publications even with exclude options in dump.
The current implementation inconsistently includes the public schema but not
information_schema when those are specified in FOR TABLES IN SCHEMA ...
Apart from that, the current behavior for publications w.r.t. the exclude
options (--exclude-table, --exclude-schema) differs from what we do at
other places. We try to avoid including publications for the corresponding
tables or schemas when an exclude-table or exclude-schema option is given,
unlike what we do for views using functions defined in a particular schema,
or for a subscription pointing to publications, with their corresponding
exclude options.

I decided not to backpatch this as it leads to a behavior change and we don't
see any field report for current behavior.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/1270733.1734134272@sss.pgh.pa.us
2025-02-20 11:25:29 +05:30
Michael Paquier
f11674f8df doc: Fix typo in section "WAL configuration"
pg_stat_io has an attribute named fsync_time, not sync_time.

Oversight in 2f70871c2bc1.

Discussion: https://postgr.es/m/Z7RkQ0EfYaqqjgz/@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-20 14:22:00 +09:00
Michael Paquier
4538bd3f1d doc: Add details about object "wal" in pg_stat_io
This commit adds a short description of what kind of activity is tracked
in pg_stat_io for the object "wal", with a link pointing to the section
"WAL configuration" that has a lot of details on the matter.

This should perhaps have been added in a051e71e28a1, but things are what
they are.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z7RkQ0EfYaqqjgz/@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-20 14:16:23 +09:00
Michael Paquier
2f70871c2b doc: Recommend pg_stat_io rather than pg_stat_wal in WAL configuration
Since a051e71e28a1, pg_stat_io is able to track statistics for the WAL
activity, providing an equivalent of pg_stat_wal with more granularity
for the fsyncs/writes counts and timings, as the data is split across
backend types.

This commit now recommends pg_stat_io rather than pg_stat_wal in the
section "WAL configuration", some of the latter's attributes being
candidates for removal in a follow-up commit.

Extracted from a larger patch by the same author.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z7RkQ0EfYaqqjgz/@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-20 13:55:00 +09:00
Michael Paquier
71f17823ba Fix FATAL message for invalid recovery timeline at beginning of recovery
If the requested recovery timeline is not reachable, the logged
checkpoint and timeline should be the values read from the
backup_label when it is defined.  The message generated used the values
from the control file in this case, which is fine when recovering from
the control file without a backup_label, but not if there is a
backup_label.

Issue introduced in ee994272ca50.  v15 has introduced xlogrecovery.c and
more simplifications in this area (4a92a1c3d1c3, a27048cbcb58), making
this change a bit simpler to think about, so backpatch only down to this
version.

Author: David Steele <david@pgbackrest.org>
Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Benoit Lobréau <benoit.lobreau@dalibo.com>
Discussion: https://postgr.es/m/c3d617d4-1696-4aa7-8a4d-5a7d19cc5618@pgbackrest.org
Backpatch-through: 15
2025-02-20 10:42:20 +09:00
Andres Freund
d38bab5edd pgbench: Increase RLIMIT_NOFILE if necessary
pgbench already had code to check if the soft rlimit is too low for the
specified number of connections. If too low, it errored out, telling the user
to increase the limit.

However, we can do better: If the hard limit allows, increase the soft limit
to be sufficient for the number of connections.

It is common for the soft limit to be considerably lower than the hard limit,
due to the danger of soft limits > 1024 breaking programs that use
select(2), as explained in [1].

[1]: https://0pointer.net/blog/file-descriptor-limits.html

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAGECzQQh6VSy3KG4pN1d%3Dh9J%3DD1rStFCMR%2Bt7yh_Kwj-g87aLQ%40mail.gmail.com
2025-02-19 19:35:09 -05:00
Michael Paquier
9b1cb58c5f test_escape: Fix output of --help
The short option name -f was not listed, only its long option name
--force-unsupported.

Author: Japin Li
Discussion: https://postgr.es/m/ME0P300MB04452BD1FB1B277D4C1C20B9B6C52@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
Backpatch-through: 13
2025-02-20 09:30:54 +09:00
Tomas Vondra
9ba7bcc894 Correct relation size estimate with low fillfactor
Since commit 29cf61ade3, table_block_relation_estimate_size() considers
fillfactor when estimating the number of rows in a relation before the
first ANALYZE. The formula however did not consider that tuples may be
larger than the available space determined by fillfactor, ending up with
a density of 0. This ultimately meant the relation was estimated to
contain a single row.

The executor however places at least one tuple per page, even with very
low fillfactor values, so the density should be at least 1. Fixed by
clamping the density estimate using clamp_row_est().
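
A sketch of the scenario (illustrative only; STORAGE PLAIN is used so the
inserted tuples stay uncompressed and in-line, and therefore wider than the
space a fillfactor of 10 leaves on a page):

    CREATE TABLE t (payload text STORAGE PLAIN) WITH (fillfactor = 10);
    INSERT INTO t SELECT repeat('x', 2000) FROM generate_series(1, 1000);
    -- Before the first ANALYZE the row estimate is derived from the
    -- relation size; it is now clamped so each page is assumed to hold at
    -- least one row instead of collapsing to a one-row estimate.
    EXPLAIN SELECT * FROM t;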

Reported by Heikki Linnakangas. Fix by me, with regression test inspired
by example provided by Heikki.

Backpatch to 17, where the issue was introduced.

Reported-by: Heikki Linnakangas
Backpatch-through: 17
Discussion: https://postgr.es/m/2bf9d973-7789-4937-a7ca-0af9fb49c71e@iki.fi
2025-02-19 23:53:37 +01:00
Tom Lane
e596e077bb Assert that ExecOpenIndices and ExecCloseIndices are not repeated.
These functions should be called at most once per ResultRelInfo;
it's wasteful to do otherwise, and certainly the pattern of
opening twice and then closing twice is a bad idea.  Moreover,
aminsertcleanup functions might not be prepared to be called twice,
as the just-hardened code in BRIN demonstrates.

This amounts to an API change, since such coding patterns were
safe even if wasteful before v17.  Hence, apply to HEAD only.
(Extension code violating this new rule faces some risk in v17,
but we just fixed brininsertcleanup and there are probably few
other aminsertcleanup functions as yet.  So the odds of breaking
usable code seem higher than the odds of doing something useful
with a back-patch.)

Bug: #18815
Reported-by: Sergey Belyashov <sergey.belyashov@gmail.com>
Discussion: https://postgr.es/m/18815-2a0407cc7f40b327@postgresql.org
2025-02-19 16:45:12 -05:00
Tom Lane
9ff68679b5 Fix crash in brininsertcleanup during logical replication.
Logical replication crashes if the subscriber's partitioned table
has a BRIN index.  There are two independently blamable causes,
and this patch fixes both:

1. brininsertcleanup fails if called twice for the same IndexInfo,
because it half-destroys its BrinInsertState but leaves it still
linked from ii_AmCache.  brininsert would also fail in that state,
so it's pretty hard to see any advantage to this coding.  Fully
remove the BrinInsertState, instead, so that a new brininsert
call would create a new cache.

2. A logical replication subscriber sometimes does ExecOpenIndices
twice on the same ResultRelInfo, followed by doing ExecCloseIndices
twice; the second call reaches the brininsertcleanup bug.  Quite
aside from tickling unexpected cases in aminsertcleanup methods,
this seems very wasteful, because the IndexInfos built in the
first ExecOpenIndices call are just lost during the second call,
and have to be rebuilt at possibly-nontrivial cost.  We should
establish a coding rule that you don't do that.

The problematic coding is that when the target table is partitioned,
apply_handle_tuple_routing calls ExecFindPartition which does
ExecOpenIndices (and expects that ExecCleanupTupleRouting will
close the indexes again).  Using the ResultRelInfo made by
ExecFindPartition, it calls apply_handle_delete_internal or
apply_handle_insert_internal, both of which think they need to do
ExecOpenIndices/ExecCloseIndices for themselves.  They do in the main
non-partitioned code paths, but not here.  The simplest fix is to pull
their ExecOpenIndices/ExecCloseIndices calls out and put them in the
call sites for the non-partitioned cases.  (We could have refactored
apply_handle_update_internal similarly, but I did not do so today
because there's no bug there: the partitioned code path doesn't
call it.)

Also, remove the always-duplicative open/close calls within
apply_handle_tuple_routing itself.

Since brininsertcleanup and indeed the whole aminsertcleanup mechanism
are new in v17, there's no observable bug in older branches.  A case
could be made for trying to avoid these duplicative open/close calls
in the older branches, but for now it seems not worth the trouble and
risk of new bugs.

Bug: #18815
Reported-by: Sergey Belyashov <sergey.belyashov@gmail.com>
Discussion: https://postgr.es/m/18815-2a0407cc7f40b327@postgresql.org
Backpatch-through: 17
2025-02-19 16:35:15 -05:00
Tomas Vondra
a1b4f289be Consider BufFiles when adjusting hashjoin parameters
Until now ExecChooseHashTableSize() considered only the size of the
in-memory hash table, and ignored the memory needed for the batch files,
which can be a significant amount, because each batch needs two BufFiles
(each with a BLCKSZ buffer). The same issue applies to increasing the
number of batches during execution.

It's also possible to trigger a "batch explosion", e.g. due to duplicate
values or skew. We've seen reports of joins with hundreds of thousands
(or even millions) of batches, consuming gigabytes of memory, triggering
OOM errors. These cases may be fairly rare, but it's clearly possible to
hit them.

These issues can't be prevented during planning. Even if we improve
that, it does not help with execution-time batch explosion. We can
however reduce the impact and use as little memory as possible.

This patch improves the behavior by adjusting how the memory is divided
between the hash table and batch files. It may be better to use fewer
batch files, even if it means the hash table will exceed the limit.

The capacity of the hash node may be increased either by doubling the
number of batches, or doubling the size of the in-memory hash table. The
outcome is the same, but the memory usage may be very different. For low
nbatch values it's better to add batches, for high nbatch values it's
better to allow a larger hash table.

The patch considers both options, both during the initial sizing and
then during execution, to minimize how much the limit gets exceeded.

It might seem this patch is relaxing the memory limit - allowing it to
be exceeded. But that's not really the case. It has always been like
that, except the memory used by batches was ignored.

Allowing the hash table to grow may also prevent the batch explosion.
If there's a large batch that can't be split (due to hash collisions or
duplicate values), at some point the memory limit will increase enough
for the batch to fit into the hash table.

This patch was in the works for a long time. The early versions were
posted in 2019, and revived every year or two when we happened to get
the next report of OOM due to a hashjoin batch explosion. Each of those
patch versions was reviewed by a couple of people. I'm mentioning only
Melanie Plageman and Robert Haas, because they reviewed the last
version, and the older patches are very different.

Reviewed-by: Melanie Plageman, Robert Haas
Discussion: https://postgr.es/m/7bed6c08-72a0-4ab9-a79c-e01fcdd0940f@vondra.me
Discussion: https://postgr.es/m/20190504003414.bulcbnge3rhwhcsh%40development
Discussion: https://postgr.es/m/20190428141901.5dsbge2ka3rxmpk6%40development
2025-02-19 21:08:20 +01:00
Andres Freund
8b886a4e34 tests: BackgroundPsql: Fix potential for lost errors on windows
This addresses various corner cases in BackgroundPsql:

- On Windows stdout and stderr may arrive out of order, leading to errors not
  being reported, or attributed to the wrong statement.

  To fix, emit the "query-separation banner" on both stdout and stderr and
  wait for both.

- Very occasionally the "query-separation banner" would not get removed, because
  we waited until the banner arrived, but then replaced the banner plus
  newline.

  To fix, wait for banner and newline.

- For interactive psql replacing $banner\n is not sufficient, as interactive
  psql outputs \r\n.

- For interactive psql, where commands are echoed to stdout, the \echo
  command, rather than its output, would be matched.

  This would sometimes lead to output from the prior query, or wait_connect(),
  being returned in the next command.

  This also affected wait_connect(), leading to sometimes sending queries to
  psql before the connection actually was established.

While debugging these issues I also found that it's hard to know whether a
query separation banner was attributed to the right query. Make that easier by
counting the queries each BackgroundPsql instance has emitted and include the
number in the banner.

Also emit psql stdout/stderr in query() and wait_connect() as Test::More
notes, without that it's rather hard to debug some issues in CI and buildfarm.

As this can cause issues not just for to-be-added tests, but also for
existing ones, backpatch the fix to all supported versions.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/wmovm6xcbwh7twdtymxuboaoarbvwj2haasd3sikzlb3dkgz76@n45rzycluzft
Backpatch-through: 13
2025-02-19 10:45:48 -05:00
Álvaro Herrera
80d7f99049
Add ATAlterConstraint struct for ALTER .. CONSTRAINT
Replace the use of Constraint with a new ATAlterConstraint struct, which
allows us to pass additional information.  No functionality is added by
this commit.  This is necessary for future work that allows altering
constraints in other ways.

I (Álvaro) took the liberty of restructuring the code for ALTER
CONSTRAINT beyond what Amul did.  The original coding before Amul's
patch was unnecessarily baroque, and this change makes things simpler
by removing one level of subroutine.  Also, partly remove the assumption
that only partitioned tables are relevant (by passing sensible 'recurse'
arguments) and no longer ignore whether ONLY was specified.  I say
'partly' because the current coding only walks down via the 'conparentid'
relationship, which is only used for partitioned tables; but future
patches could handle ONLY or not for other types of constraint changes
for legacy inheritance trees too.

Author: Amul Sul <sulamul@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CAAJ_b94bfgPV-8Mw_HwSBeheVwaK9=5s+7+KbBj_NpwXQFgDGg@mail.gmail.com
2025-02-19 13:06:13 +01:00
Alexander Korotkov
e983ee9380 Improve statistics estimation for single-column GROUP BY in sub-queries
This commit follows the idea of commit 4767bc8ff2.  If a sub-query has only
one GROUP BY column, we can consider its output variable as being unique.
We can employ this fact in the statistics to make more precise estimations
in the upper query block.
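
An illustrative shape of query that benefits (table and column names are
invented):

    EXPLAIN
    SELECT *
      FROM orders o
      JOIN (SELECT customer_id
              FROM payments
             GROUP BY customer_id) p
        ON o.customer_id = p.customer_id;
    -- The sub-query's single GROUP BY column can now be treated as unique
    -- when estimating the join in the upper query block.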

Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
2025-02-19 11:59:30 +02:00
Amit Kapila
8a695d7998 Add a test for commit ac0e33136a using the injection point.
This test uses an injection point to bypass the time overhead caused by
the idle_replication_slot_timeout GUC, which has a minimum value of one
minute.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Author: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
2025-02-19 15:02:22 +05:30
Michael Paquier
302cf15759 Add support for LIKE in CREATE FOREIGN TABLE
LIKE enables the creation of foreign tables based on the column
definitions, constraints and objects of the defined source relation(s).

This feature mirrors the behavior of CREATE TABLE LIKE, but ignores
the INCLUDING sub-options that do not make sense for foreign tables:
INDEXES, COMPRESSION, IDENTITY and STORAGE.  The supported sub-options
are COMMENTS, CONSTRAINTS, DEFAULTS, GENERATED and STATISTICS, mapping
with the clauses already supported by the command.

Note that the restriction with LIKE in CREATE FOREIGN TABLE was added in
a0c6dfeecfcc.
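
A rough sketch of the new syntax (the foreign server and relation names are
hypothetical, and a usable foreign server is assumed to exist):

    CREATE TABLE orders (
        id   int  NOT NULL,
        note text DEFAULT 'n/a'
    );

    CREATE FOREIGN TABLE orders_remote
        (LIKE orders INCLUDING DEFAULTS INCLUDING CONSTRAINTS)
        SERVER remote_srv;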

Author: Zhang Mingli
Reviewed-by: Álvaro Herrera, Sami Imseih, Michael Paquier
Discussion: https://postgr.es/m/42d3f855-2275-4361-a42a-826172ca2dc4@Spark
2025-02-19 15:50:37 +09:00
Amit Langote
e7563e3c75 doc: Fix some issues with JSON_TABLE() examples
1. Remove an unused PASSING variable.

 2. Adjust formatting of JSON data used in an example to be valid
    under strict mode

Reported-by: Miłosz Chmura <mieszko4@gmail.com>
Author: Robert Treat <rob@xzilla.net>
Discussion: https://postgr.es/m/173859550337.1071.4748984213168572913@wrigleys.postgresql.org
2025-02-19 15:08:17 +09:00
Amit Kapila
ac0e33136a Invalidate inactive replication slots.
This commit introduces idle_replication_slot_timeout GUC that allows
inactive slots to be invalidated at the time of checkpoint. Because
checkpoints happen at checkpoint_timeout intervals, there can be some lag
between when the idle_replication_slot_timeout was exceeded and when the
slot invalidation is triggered at the next checkpoint. To avoid such lags,
users can force a checkpoint to promptly invalidate inactive slots.

Note that the idle timeout invalidation mechanism is not applicable for
slots that do not reserve WAL or for slots on the standby server that are
synced from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because they
don't perform logical decoding to produce changes.

The slots can become inactive for a long period if a subscriber is down
due to a system error or inaccessible because of network issues. If such a
situation persists, it might be more practical to recreate the subscriber
rather than attempt to recover the node and wait for it to catch up, which
could be time-consuming.

Also, external tools that create replication slots (e.g., for migrations
or upgrades) may fail to remove them if an error occurs, leaving
behind unused slots that take up space and resources. Manually cleaning
them up can be tedious and error-prone, and without intervention, these
lingering slots can cause unnecessary WAL retention and system bloat.

As the duration of idle_replication_slot_timeout is in minutes, any test
using that would be time-consuming. We are planning to commit a follow up
patch for tests by using the injection point framework.
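
A sketch of how the new GUC might be used (the value is arbitrary, and the
pg_replication_slots columns shown are only what this example cares about):

    ALTER SYSTEM SET idle_replication_slot_timeout = '120min';
    SELECT pg_reload_conf();
    -- Invalidation happens at checkpoint time; forcing one applies it
    -- promptly instead of waiting for the next checkpoint_timeout.
    CHECKPOINT;
    SELECT slot_name, active, invalidation_reason
      FROM pg_replication_slots;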

Author: Nisha Moond <nisha.moond412@gmail.com>
Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://postgr.es/m/OS0PR01MB5716C131A7D80DAE8CB9E88794FC2@OS0PR01MB5716.jpnprd01.prod.outlook.com
2025-02-19 09:29:50 +05:30
Tom Lane
b464e51ab3 Update to latest Snowball sources.
It's been some time since we did this, partly because the upstream
snowball project hasn't formally tagged a new release since 2021.
The main motivation for doing it now is to absorb a bug fix
(their commit e322673a841d9abd69994ae8cd20e191090b6ef4), which
prevents a null pointer dereference crash if SN_create_env() gets
a malloc failure at just the wrong point.  We'll patch the back
branches with only that change, but we might as well do the full
sync dance on HEAD.

Aside from a bunch of mostly-minor tweaks to existing stemmers, this
update adds a new stemmer for Estonian.  It also removes the existing
stemmer for Romanian using ISO-8859-2 encoding.  Upstream apparently
concluded that ISO-8859-2 doesn't provide an adequate representation
of some Romanian characters, and the UTF-8 implementation should be
used instead.

While at it, update the README's instructions for doing a sync,
which have not been adjusted during the addition of meson tooling.

Thanks to Maksim Korotkov for discovering the null-pointer
bug and submitting the fix to upstream snowball.

Reported-by: Maksim Korotkov <m.korotkov@postgrespro.ru>
Discussion: https://postgr.es/m/1d1a46-67ab1000-21-80c451@83151435
2025-02-18 21:13:54 -05:00
Richard Guo
71d02dc478 Fix unsafe access to BufferDescriptors
When considering a local buffer, the GetBufferDescriptor() call in
BufferGetLSNAtomic() would be retrieving a shared buffer with a bad
buffer ID.  Since the code checks whether the buffer is shared before
using the retrieved BufferDesc, this issue did not lead to any
malfunction.  Nonetheless this seems like trouble waiting to happen,
so fix it by ensuring that GetBufferDescriptor() is only called when
we know the buffer is shared.

Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAHewXNku-o46-9cmUgyv6LkSZ25doDrWq32p=oz9kfD8ovVJMg@mail.gmail.com
Backpatch-through: 13
2025-02-19 11:05:35 +09:00
Richard Guo
c39392ebae Fix freeing a child join's SpecialJoinInfo
In try_partitionwise_join, we try to break down the join between two
partitioned relations into joins between matching partitions.  To
achieve this, we iterate through each pair of partitions from the two
joining relations and create child join relations for them.  To reduce
memory accumulation during each iteration, one step we take is freeing
the SpecialJoinInfos created for the child joins.

A child join's SpecialJoinInfo is a copy of the parent join's
SpecialJoinInfo, with some members being translated copies of their
counterparts in the parent.  However, when freeing the bitmapset
members in a child join's SpecialJoinInfo, we failed to check whether
they were translated copies.  As a result, we inadvertently freed the
members that were still in use by the parent SpecialJoinInfo, leading
to crashes when those freed members were accessed.

To fix, check if each member of the child join's SpecialJoinInfo is a
translated copy and free it only if that's the case.  This requires
passing the parent join's SpecialJoinInfo as a parameter to
free_child_join_sjinfo.

Back-patch to v17 where this bug crept in.

Bug: #18806
Reported-by: 孟令彬 <m_lingbin@126.com>
Diagnosed-by: Tender Wang <tndrwang@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/18806-d70b0c9fdf63dcbf@postgresql.org
Backpatch-through: 17
2025-02-19 10:02:32 +09:00
Michael Paquier
aef6f907f6 test_escape: Fix handling of short options in getopt_long()
This addresses two errors in the module, based on the set of options
supported:
- '-c', for --conninfo, was not listed.
- '-f', for --force-unsupported, was not listed.

While on it, these are now listed in an alphabetical order.

Author: Japin Li
Discussion: https://postgr.es/m/ME0P300MB04451FB20CE0346A59C25CADB6FA2@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
Backpatch-through: 13
2025-02-19 09:45:42 +09:00
Michael Paquier
f2e4c2b203 Make the description of some GUCs more consistent
This commit improves the description of a couple of GUCs, to be more
consistent with the style of their surroundings:
* array_nulls
* enable_self_join_elimination
* optimize_bounded_sort
* row_security
* synchronize_seqscans

Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20250218.103240.1422205966404509831.horikyota.ntt@gmail.com
2025-02-19 08:42:35 +09:00
Bruce Momjian
06dc1ffd24 doc: add example of sign mismatch with POSIX/ISO-8601 time zones
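
For reference, a small sketch of the mismatch being documented: POSIX-style
zone specifications count positive offsets as west of Greenwich, the
opposite of the ISO-8601 convention.

    SET TIME ZONE 'UTC+5';                         -- POSIX: 5 hours west of UTC
    SELECT '2025-06-01 12:00:00+00'::timestamptz;  -- shown as 2025-06-01 07:00:00-05
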
Author: Laurenz Albe

Discussion: https://postgr.es/m/eb4d1e15c6822c1937be1491118500dd9201492f.camel@cybertec.at
2025-02-18 15:51:31 -05:00
Jeff Davis
a1f7f80bfe Update outdated comments in nodeAgg.c.
Author: Zhang Mingli
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/198a8d1e-0792-4e7f-828e-902aa342f36e@Spark
2025-02-18 10:37:50 -08:00
Melanie Plageman
c623e8593e Reduce scope of heap vacuum per_buffer_data
Move lazy_scan_heap()'s per_buffer_data variable into a tighter scope.
In lazy_scan_heap()'s phase I heap vacuuming, the read stream API
returns a pointer to the next block number to vacuum. As long as
read_stream_next_buffer() returns a valid buffer, per_buffer_data should
always be valid.

Move per_buffer_data into a tighter scope and make sure it is reset to
NULL on each iteration so that we get a core dump instead of bogus data
from a previous block if something goes wrong in the read stream API.

Suggested-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/626104.1739729538%40sss.pgh.pa.us
2025-02-18 09:29:10 -05:00
Daniel Gustafsson
95ef3d9029 Add PGErrorVerbosity to typedefs.list
PGErrorVerbosity was missing, which resulted in incorrect whitespace
alignment going back all the way to e3860ffa4dd0.  No backpatch for
this though since we don't pgindent backbranches.

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAGECzQTVi8n-HW4Q27je-b9ckQk7zf6bS_it42gNvQu+DX0NCQ@mail.gmail.com
2025-02-18 13:23:13 +01:00
David Rowley
593509202f Fix poorly written regression test
bd10ec529 added code to allow redundant functionally dependent GROUP BY
columns to be removed using unique indexes and NOT NULL constraints as
proofs of functional dependency.  In that commit, I (David) added a test
to ensure that when there are multiple indexes available to remove columns
that we pick the index that allows us to remove the most columns.  This
test was faulty as it assumed the t3 table's primary key index was valid
to use as functional dependency proof, but that's not the case since
that's defined as deferrable.

Here we adjust the tests added by that commit to use the t2 table instead.
That's defined with a non-deferrable primary key.

Author: songjinzhou <tsinghualucky912@foxmail.com>
Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Japin Li <japinli@hotmail.com>
Discussion: https://postgr.es/m/tencent_CD414C79D39668455DF80D35143B87634C08@qq.com
2025-02-19 00:42:22 +13:00
Amit Kapila
217919dd09 Raise a WARNING for max_slot_wal_keep_size in pg_createsubscriber.
During the pg_createsubscriber execution, it is possible that the required
WAL is removed from the primary/publisher node due to
'max_slot_wal_keep_size'.

This patch raises a WARNING during the '--dry-run' mode if the
'max_slot_wal_keep_size' is set to a non-default value on the
primary/publisher node.

Author: Shubham Khanna <khannashubham1197@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CAHv8Rj+deqsQXOMa7Tck8CBQUbsua=+4AuMVQ2=MPM0f-ZHbjA@mail.gmail.com
2025-02-18 12:15:43 +05:30
John Naylor
53d3daa491 Specialize intarray sorting
There is at least one report in the field of storing millions of
integers in arrays, so it seems like a good time to specialize
intarray's qsort function. In doing so, streamline the comparators:
Previously there were three: two for sorting (one per direction)
and one passed to qunique_arg. To preserve the early exit in the
case of descending input, pass the direction as an argument to
the comparator. This requires giving up duplicate detection, which
previously allowed skipping the qunique_arg() call. Testing showed
no regressions this way.

In passing, get rid of nearby checks that the input has at least
two elements, since preserving them would make some macros less
readable. These are not necessary for correctness, and seem like
premature optimizations.

Author: Andrey M. Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/098A3E67-E4A6-4086-9C66-B1EAEB1DFE1C@yandex-team.ru
2025-02-18 11:04:55 +07:00
Amit Kapila
164bac92f0 Doc: Improve pg_replication_slots.inactive_since description.
Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAHut+PssvVMTWVtUPto6HbPO8pgVsvtzndt_FdBomA_Oq4zf3w@mail.gmail.com
2025-02-18 09:23:43 +05:30
Thomas Munro
2509b857cc Fix typo in 2a8a0067.
Builds configured with Valgrind but without assertions would fail due to
a typo in the recent change.  This should be included when back-patching
2a8a0067 into v17.
2025-02-18 14:44:59 +13:00
Daniel Gustafsson
9cdc21b533 Fix translator notes in comments
The translator comments detailing what a %s inclusion refers to were
accidentally including too many address types.  In practice this is
not a problem since it's not a translated string, but to minimize any
risk of confusion let's fix them anyway.  Even though this exists in
backbranches there is little use for backpatch as the translation work
has already happened there, so let's avoid the churn.

Author: Japin Li <japinli@hotmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/ME0P300MB04458DE627480614ABE639D2B6FB2@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
2025-02-17 20:23:34 +01:00
Tomas Vondra
c407d5426b Add tab completion for ALTER USER/ROLE RESET
Currently tab completion for ALTER USER RESET shows a list of all
configuration parameters that may be set on a role, irrespectively of
which parameters are actually set. This patch improves tab completion to
offer only parameters that are set.

Author: Robins Tharakan
Reviewed-By: Tomas Vondra
Discussion: https://postgr.es/m/CAEP4nAzqiT6VbVC5r3nq5byLTnPzjniVGzEMpYcnAHQyNzEuaw%40mail.gmail.com
2025-02-17 18:12:15 +01:00
Tomas Vondra
9df8727c50 Add tab completion for ALTER DATABASE RESET
Currently tab completion for ALTER DATABASE RESET shows a list of all
configuration parameters that may be set on a database, irrespective
of which parameters are actually set. This patch improves tab completion
to offer only parameters that are set.

Author: Robins Tharakan
Reviewed-By: Tomas Vondra
Discussion: https://postgr.es/m/CAEP4nAzqiT6VbVC5r3nq5byLTnPzjniVGzEMpYcnAHQyNzEuaw%40mail.gmail.com
2025-02-17 18:12:15 +01:00
Alexander Korotkov
fc069a3a63 Implement Self-Join Elimination
The Self-Join Elimination (SJE) feature removes an inner join of a plain
table to itself in the query tree if it is proven that the join can be
replaced with a scan without impacting the query result.  The inner
relation of the self-join gets replaced with the outer one in the query,
equivalence classes,
and planner info structures.  Also, the inner restrictlist moves to the
outer one with the removal of duplicated clauses.  Thus, this optimization
reduces the length of the range table list (this especially makes sense for
partitioned relations), reduces the number of restriction clauses and,
in turn, selectivity estimations, and potentially improves total planner
prediction for the query.

This feature is dedicated to avoiding redundancy, which can appear after
pull-up transformations or the creation of an EquivalenceClass-derived clause
like the below.

  SELECT * FROM t1 WHERE x IN (SELECT t3.x FROM t1 t3);
  SELECT * FROM t1 WHERE EXISTS (SELECT t3.x FROM t1 t3 WHERE t3.x = t1.x);
  SELECT * FROM t1,t2, t1 t3 WHERE t1.x = t2.x AND t2.x = t3.x;

In the future, we could also reduce redundancy caused by subquery pull-up
after unnecessary outer join removal in cases like the one below.

  SELECT * FROM t1 WHERE x IN
    (SELECT t3.x FROM t1 t3 LEFT JOIN t2 ON t2.x = t1.x);

Also, it can drastically help to join partitioned tables, removing entries
even before their expansion.

The SJE proof is based on innerrel_is_unique() machinery.

We can remove a self-join when for each outer row:

 1. At most, one inner row matches the join clause;
 2. Each matched inner row must be (physically) the same as the outer one;
 3. Inner and outer rows have the same row mark.

In this patch, we use the following approach to identify a self-join:

 1. Collect all merge-joinable join quals which look like a.x = b.x;
 2. Add to the list above the baserestrictinfo of the inner table;
 3. Check innerrel_is_unique() for the qual list.  If it returns false, skip
    this pair of joining tables;
 4. Check uniqueness, proved by the baserestrictinfo clauses. To prove the
    possibility of self-join elimination, the inner and outer clauses must
    match exactly.

The relation replacement procedure is not trivial and is partly combined
with the one used to remove useless left joins.  Tests covering this feature
were added to join.sql.  Some of the existing regression tests changed due
to self-join removal logic.
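
For instance, a join of the following shape (names invented, and assuming
enable_self_join_elimination is on) is now planned as a single scan of t1,
the primary key proving that the joined rows are identical:

    CREATE TABLE t1 (x int PRIMARY KEY, payload text);
    EXPLAIN (COSTS OFF)
    SELECT * FROM t1 a JOIN t1 b ON a.x = b.x;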

Discussion: https://postgr.es/m/flat/64486b0b-0404-e39e-322d-0801154901f3%40postgrespro.ru
Author: Andrey Lepikhov <a.lepikhov@postgrespro.ru>
Author: Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru>
Co-authored-by: Alexander Korotkov <aekorotkov@gmail.com>
Co-authored-by: Alena Rybakina <lena.ribackina@yandex.ru>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Simon Riggs <simon@2ndquadrant.com>
Reviewed-by: Jonathan S. Katz <jkatz@postgresql.org>
Reviewed-by: David Rowley <david.rowley@2ndquadrant.com>
Reviewed-by: Thomas Munro <thomas.munro@enterprisedb.com>
Reviewed-by: Konstantin Knizhnik <k.knizhnik@postgrespro.ru>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Hywel Carver <hywel@skillerwhale.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Ronan Dunklau <ronan.dunklau@aiven.io>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Reviewed-by: Greg Stark <stark@mit.edu>
Reviewed-by: Jaime Casanova <jcasanov@systemguards.com.ec>
Reviewed-by: Michał Kłeczek <michal@kleczek.org>
Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
2025-02-17 12:44:12 +02:00
Alexander Korotkov
3fb58625d1 Revert: Get rid of WALBufMappingLock
This commit reverts 6a2275b895.  A buildfarm failure on batta spotted a
concurrency issue, which requires further investigation.
2025-02-17 12:35:28 +02:00
Amit Langote
75dfde1363 Fix an oversight in cbc127917 to handle MERGE correctly
ExecInitModifyTable() forgot to trim MERGE-related lists to exclude
entries for result relations pruned during initial pruning, so fix
that.

While at it, make the function's use of the pruned resultRelations
list, rather than ModifyTable.resultRelations, more consistent.

Reported-by: Alexander Lakhin <exclusion@gmail.com> (via sqlsmith)
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/e72c94d9-e5f9-4753-9bc1-69d72bd54b8a@gmail.com
2025-02-17 16:12:03 +09:00
Michael Paquier
6a8a7ce476 Add information about WAL buffers full to VACUUM/ANALYZE (VERBOSE)
This commit adds the information about the number of times WAL buffers
have been full to the logs generated by VACUUM/ANALYZE (VERBOSE) and in
the logs generated by autovacuum, complementing the existing information
stored by WalUsage.

This is the last part of the backend code where the value of
wal_buffers_full can be reported, similarly to all the other fields of
WalUsage.  320545bfcfee and ce5bcc4a9f26 have done the same for EXPLAIN
and pgss.

Author: Bertrand Drouvot
Reviewed-by: Ilia Evdokimov
Discussion: https://postgr.es/m/Z6SOha5YFFgvpwQY@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-17 15:09:51 +09:00
Michael Paquier
320545bfcf Add information about WAL buffers being full to EXPLAIN (WAL)
This is similar to ce5bcc4a9f26, relying on the addition of
wal_buffers_full to WalUsage.  This time, the information is added to
the output generated by EXPLAIN (WAL).
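
A usage sketch (the table name is illustrative); the WAL summary line in the
output is expected to report the new buffers-full counter alongside the
existing records/fpi/bytes figures:

  EXPLAIN (ANALYZE, WAL, COSTS OFF)
    INSERT INTO t SELECT g FROM generate_series(1, 100000) AS g;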

Author: Bertrand Drouvot
Reviewed-by: Ilia Evdokimov
Discussion: https://postgr.es/m/Z6SOha5YFFgvpwQY@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-17 14:50:33 +09:00
Michael Paquier
ce5bcc4a9f pg_stat_statements: Add wal_buffers_full
wal_buffers_full tracks the number of times WAL buffers become full,
giving hints to be able to tune the GUC wal_buffers.

Up to now, this information was only available in pg_stat_wal.  With
this field available in WalUsage since eaf502747bac, exposing it in
pg_stat_statements is straightforward, and it offers more granularity
at query level.
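
A minimal query sketch against the new column:

  SELECT query, calls, wal_buffers_full
    FROM pg_stat_statements
   ORDER BY wal_buffers_full DESC
   LIMIT 5;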

pg_stat_statements does not need a version bump as one has been done in
commit cf54a2c00254 for this development cycle.

Author: Bertrand Drouvot
Reviewed-by: Ilia Evdokimov
Discussion: https://postgr.es/m/Z6SOha5YFFgvpwQY@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-17 13:55:17 +09:00
Michael Paquier
eaf502747b Move wal_buffers_full from PgStat_PendingWalStats to WalUsage
wal_buffers_full has been introduced in pg_stat_wal in 8d9a935965f, as
some information providing metrics for the tuning of the GUC
wal_buffers.  WalUsage has been introduced before that in df3b181499.

Moving this field is proving to be beneficial for several reasons:
- This information can now be made available in more layers, providing
more granularity than just pg_stat_wal, on a per-query basis: EXPLAIN,
pgss and VACUUM/ANALYZE logs.
- A patch is under discussion to provide statistics for WAL at backend
level, and this move simplifies a bit the handling of pending
statistics.  The remaining data in PgStat_PendingWalStats now relates to
write/sync counters and times, with equivalents present in pg_stat_io,
that backend statistics are able to already track.  So this should cut
all the dependencies between PgStat_PendingWalStats and WAL stats at
backend level.

As of this change, wal_buffers_full only shows in pg_stat_wal.

Author: Bertrand Drouvot
Reviewed-by: Ilia Evdokimov
Discussion: https://postgr.es/m/Z6SOha5YFFgvpwQY@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-17 13:14:28 +09:00
Alexander Korotkov
6a2275b895 Get rid of WALBufMappingLock
Allow multiple backends to initialize WAL buffers concurrently.  This way
`MemSet((char *) NewPage, 0, XLOG_BLCKSZ);` can run in parallel without
taking a single LWLock in exclusive mode.

The new algorithm works as follows:
 * reserve a page for initialization using XLogCtl->InitializeReserved,
 * ensure the page is written out,
 * once the page is initialized, try to advance XLogCtl->InitializedUpTo and
   signal to waiters using XLogCtl->InitializedUpToCondVar condition
   variable,
 * repeat previous steps until we reserve initialization up to the target
   WAL position,
 * wait until concurrent initialization finishes using a
   XLogCtl->InitializedUpToCondVar.

Now, multiple backends can, in parallel, concurrently reserve pages,
initialize them, and advance XLogCtl->InitializedUpTo to point to the latest
initialized page.

Author: Yura Sokolov <y.sokolov@postgrespro.ru>
Co-authored-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
2025-02-17 04:25:29 +02:00
Richard Guo
fbc0fe9a2e Adjust tuples estimate for appendrels
In set_append_rel_size(), we currently set rel->tuples to rel->rows
for an appendrel.  Generally, rel->tuples is the raw number of tuples
in the relation and rel->rows is the estimated number of tuples after
the relation's restriction clauses have been applied.  Although an
appendrel itself doesn't directly enforce any quals today, its child
relations may.  Therefore, setting rel->tuples equal to rel->rows for
an appendrel isn't always appropriate.

Doing so can lead to issues in cost estimates in some cases.  For
instance, when estimating the number of distinct values from an
appendrel, we would not be able to adjust the estimate based on the
restriction selectivity.

This patch addresses this by setting an appendrel's tuples to the
total number of tuples accumulated from each live child, which better
aligns with reality.

This is arguably a bug, but nobody has complained about that until
now, so no back-patch.

Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Alena Rybakina <a.rybakina@postgrespro.ru>
Discussion: https://postgr.es/m/CAMbWs4_TG_+kVn6fjG-5GYzzukrNK57=g9eUo4gsrUG26OFawg@mail.gmail.com
2025-02-17 11:13:15 +09:00
Tom Lane
a7f95859ef In fmtIdEnc(), handle failure of enlargePQExpBuffer().
Coverity complained that we weren't doing that, and it's right.

This fix just makes fmtIdEnc() honor the general convention that OOM
causes a PQExpBuffer to become marked "broken", without any immediate
error.  In the pretty-unlikely case that we actually did hit OOM here,
the end result would be to return an empty string to the caller,
probably resulting in invalid SQL syntax in an issued command (if
nothing else went wrong, which is even more unlikely).  It's tempting
to throw an "out of memory" error if the buffer becomes broken, but
there's not a lot of point in doing that only here and not in hundreds
of other PQExpBuffer-using places in pg_dump and similar callers.
The whole issue could do with some non-time-crunched redesign, perhaps.

This is a followup to the fixes for CVE-2025-1094, and should be
included if cherry-picking those fixes.
2025-02-16 12:46:35 -05:00
Tom Lane
9f45e6a91d Make escaping functions retain trailing bytes of an invalid character.
Instead of dropping the trailing byte(s) of an invalid or incomplete
multibyte character, replace only the first byte with a known-invalid
sequence, and process the rest normally.  This seems less likely to
confuse incautious callers than the behavior adopted in 5dc1e42b4.

While we're at it, adjust PQescapeStringInternal to produce at most
one bleat about invalid multibyte characters per string.  This
matches the behavior of PQescapeInternal, and avoids the risk of
producing tons of repetitive junk if a long string is simply given
in the wrong encoding.

This is a followup to the fixes for CVE-2025-1094, and should be
included if cherry-picking those fixes.

Author: Andres Freund <andres@anarazel.de>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/20250215012712.45@rfd.leadboat.com
Backpatch-through: 13
2025-02-15 16:20:21 -05:00
Thomas Munro
2a8a00674e Fix explicit valgrind interaction in read_stream.c.
By calling wipe_mem() on per-buffer data memory that has been released,
we are also telling Valgrind that the memory is "noaccess".  We need to
set it to "undefined" before giving it to the registered callback to
fill in, when a slot is reused.

As discovered by build farm animal skink when the VACUUM streamification
patches landed (the first users of per-buffer data).

Pushing to master only for now, to clear the error on skink.  It's also
possible that external code might discover the per-buffer data feature
in v17, and reasonable to expect Valgrind not to produce spurious
memcheck reports, but the back-patch is deferred until after the
imminent minor release is out of the way.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Tested-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKG%2Bg6aXpi2FEHqeLOzE%2BxYw%3DOV%2B-N5jhOEnnV%2BF0USM9xA%40mail.gmail.com
2025-02-15 13:14:03 +13:00
Andres Freund
efdadeb223 Fix PQescapeLiteral()/PQescapeIdentifier() length handling
In 5dc1e42b4fa I fixed bugs in various escape functions, unfortunately as part
of that I introduced a new bug in PQescapeLiteral()/PQescapeIdentifier(). The
bug is that I made PQescapeInternal() just use strlen(), rather than taking
the specified input length into account.

That's bad, because it can lead to including input that wasn't intended to be
included (in case len is shorter than null termination of the string) and
because it can lead to reading invalid memory if the input string is not null
terminated.

Expand test_escape to cover this kind of bug:

a) for escape functions with length support, append data that should not be
   escaped and check that it is not

b) add valgrind requests to detect access of bytes that should not be touched

Author: Tom Lane <tgl@sss.pgh.pa.us>
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/Z64jD3u46gObCo1p@pryzbyj2023
Backpatch: 13
2025-02-14 18:09:19 -05:00
Nathan Bossart
7720082ae5 Add delay time to VACUUM/ANALYZE (VERBOSE) and autovacuum logs.
Commit bb8dff9995 added this information to the
pg_stat_progress_vacuum and pg_stat_progress_analyze system views.
This commit adds the same information to the output of VACUUM and
ANALYZE with the VERBOSE option and to the autovacuum logs.
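
A usage sketch (the table name is illustrative); per bb8dff9995, the delay
time is only gathered when track_cost_delay_timing is enabled:

  SET track_cost_delay_timing = on;
  VACUUM (VERBOSE) pgbench_accounts;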

Suggested-by: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/ZmaXmWDL829fzAVX%40ip-10-97-1-34.eu-west-3.compute.internal
2025-02-14 14:53:28 -06:00
Daniel Gustafsson
9ad1b3d01f pgcrypto: Add support for CFB mode in AES encryption
Cipher Feedback Mode, CFB, is a self-synchronizing stream cipher which
is very similar to CBC performed in reverse. Since OpenSSL supports it,
we can easily plug it into the existing cipher selection code without
any need for infrastructure changes.
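
A hedged usage sketch; the mode string 'aes-cfb' is an assumption based on
the existing aes-<mode> naming convention in pgcrypto:

  CREATE EXTENSION IF NOT EXISTS pgcrypto;
  -- round trip through AES in CFB mode ('aes-cfb' is assumed naming)
  SELECT decrypt(encrypt('secret payload', 'sixteen byte key', 'aes-cfb'),
                 'sixteen byte key', 'aes-cfb');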

This patch was simultaneously submitted by Umar Hayat and Vladyslav
Nebozhyn, the latter of whom suggested the feature. The committed patch
is Umar's version.

Author: Umar Hayat <postgresql.wizard@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CAPBGcbxo9ASzq14VTpQp3mnUJ5omdgTWUJOvWV0L6nNigWE5jw@mail.gmail.com
2025-02-14 21:18:37 +01:00
Nathan Bossart
760bf588de Use PqMsg_Progress macro in HandleParallelMessage().
Commit a99cc6c6b4 introduced the PqMsg_Progress macro but missed
updating HandleParallelMessage() accordingly.

Backpatch-through: 17
2025-02-14 12:57:13 -06:00
Melanie Plageman
c3e775e608 Use streaming read I/O in VACUUM's third phase
Make vacuum's third phase (its second pass over the heap), which reaps
dead items collected in the first phase and marks them as reusable, use
the read stream API. This commit adds a new read stream callback,
vacuum_reap_lp_read_stream_next(), that looks ahead in the TidStore and
returns the next block number to read for vacuum.

Author: Melanie Plageman <melanieplageman@gmail.com>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGKN3oy0bN_3yv8hd78a4%2BM1tJC9z7mD8%2Bf%2ByA%2BGeoFUwQ%40mail.gmail.com
2025-02-14 12:57:49 -05:00
Melanie Plageman
9256822608 Use streaming read I/O in VACUUM's first phase
Make vacuum's first phase, which prunes and freezes tuples and records
dead TIDs, use the read stream API by converting
heap_vac_scan_next_block() to a read stream callback.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_aLwANZpxHc0tC-6OT0OQT4TftDGkKAO5yigMUOv_Tcsw%40mail.gmail.com
2025-02-14 12:57:43 -05:00
Melanie Plageman
32acad7d1d Convert heap_vac_scan_next_block() boolean parameters to flags
The read stream API only allows one piece of extra per block state to be
passed back to the API user (per_buffer_data). lazy_scan_heap() needs
two pieces of per-buffer data: whether or not the block was all-visible
in the visibility map and whether or not it was eagerly scanned.

Convert these two pieces of information to flags so that they can be
populated by heap_vac_scan_next_block() and returned to
lazy_scan_heap(). A future commit will turn heap_vac_scan_next_block()
into the read stream callback for heap phase I vacuuming.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_bmx33jTqATP5GKNFYwAg02a9dDtk4U_ciEjgBHZSVkOQ%40mail.gmail.com
2025-02-14 12:57:37 -05:00
Nathan Bossart
977d865c36 Describe special values in GUC descriptions more consistently.
Many GUCs accept special values like -1 or an empty string to
disable the feature, use a system default, etc.  While the
documentation consistently lists these special values, the GUC
descriptions do not.  Many such descriptions fail to mention the
special values, and those that do vary in phrasing and placement.
This commit aims to bring some consistency to this area by applying
the following rules:

* Special values should be listed at the end of the long
  description.
* Descriptions should use numerals (e.g., "0") instead of words
  (e.g., "zero").
* Special value mentions should be concise and direct (e.g., "0
  disables the timeout.", "An empty string means use the operating
  system setting.").
* Multiple special values should be listed in ascending order.

Of course, there are exceptions, such as
max_pred_locks_per_relation and search_path, whose special values
are too complex to include.  And there are cases like
listen_addresses, where the meaning of an empty string is arguably
too obvious to include.  In those cases, I've refrained from adding
special value information to the GUC description.

Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: "David G. Johnston" <david.g.johnston@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/Z6aIy4aywxUZHAo6%40nathan
2025-02-14 10:44:30 -06:00
Daniel Gustafsson
67a0234157 Fix assertion on dereferenced object
Commit 27cc7cd2bc8a accidentally placed the assertion ensuring
that the pointer isn't NULL after it had already been accessed.
Fix by moving the pointer dereferencing to after the assertion.
Backpatch to all supported branches.

Author: Dmitry Koval <d.koval@postgrespro.ru>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/1618848d-cdc7-414b-9c03-08cf4bef4408@postgrespro.ru
Backpatch-through: 13
2025-02-14 11:50:56 +01:00
Thomas Munro
9e17ac997f Remove obsolete comment.
Commit 755a4c10d19d prevented StartReadBuffers() from crossing md.c
segment boundaries in one operation, but a comment about that
possibility remained.
2025-02-14 13:16:05 +13:00
Nathan Bossart
432c30dc4e Remove unused parameter from execute_extension_script().
This function's schemaOid parameter appears to have never been used
for anything.

Author: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
Discussion: https://postgr.es/m/20250214010218.550ebe4ec1a7c7811a7fa2bb%40sraoss.co.jp
2025-02-13 16:47:42 -06:00
Peter Eisentraut
ed5e5f0710 Remove unnecessary (char *) casts [xlog]
Remove (char *) casts no longer needed after XLogRegisterData() and
XLogRegisterBufData() argument type change.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
2025-02-13 10:57:07 +01:00
Peter Eisentraut
cdaeff9b39 XLogRegisterData, XLogRegisterBufData void * argument for binary data
Change XLogRegisterData() and XLogRegisterBufData() functions to take
void * for binary data instead of char *.  This will remove the need
for numerous casts (done in a separate commit for clarity).

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
2025-02-13 10:33:14 +01:00
Michael Paquier
773c51dd39 Fix MakeTransitionCaptureState() to return a consistent result
When an UPDATE trigger referencing a new table and a DELETE trigger
referencing an old table are both present, MakeTransitionCaptureState()
returns an inconsistent result for UPDATE commands in its set of flags
and tuplestores holding the TransitionCaptureState for transition
tables.

As proved by the test added here, this issue causes a crash in v14 and
earlier versions (down to 11, actually, older versions do not support
triggers on partitioned tables) during cross-partition updates on a
partitioned table.  v15 and newer versions are safe thanks to
7103ebb7aae8.

This commit fixes the function so that it returns a consistent state
by using portions of the changes made in commit 7103ebb7aae8 for v13 and
v14.  v15 and newer versions are slightly tweaked to match with the
older versions, mainly for consistency across branches.

Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20250207.150238.968446820828052276.horikyota.ntt@gmail.com
Backpatch-through: 13
2025-02-13 16:30:58 +09:00
Masahiko Sawada
abfb29648f Rename RBTXN_PREPARE to RBTXN_IS_PREPARE for better clarification.
The RBTXN_PREPARE flag and the rbtxn_prepared macro could be misinterpreted
as either indicating the transaction type (e.g. a prepared transaction or
a normal transaction) or its current state (e.g. skipped or its prepare
message is sent), especially after commit 072ee847ad4 introduced the
RBTXN_SENT_PREPARE flag and the rbtxn_sent_prepare macro.

The RBTXN_PREPARE flag (and its corresponding macro) have been renamed
to RBTXN_IS_PREPARE to explicitly indicate the transaction
type. Therefore, this commit also adds the RBTXN_IS_PREPARE flag to
the transaction that is a prepared transaction and has been skipped,
which previously had only the RBTXN_SKIPPED_PREPARE flag.

Reviewed-by: Amit Kapila, Peter Smith
Discussion: https://postgr.es/m/CAA4eK1KgNmBsG%3D155E7QQ6TX9RoWnM4z5Z20SvsbwxSe_QXYsg%40mail.gmail.com
2025-02-12 16:55:00 -08:00
Masahiko Sawada
072ee847ad Skip logical decoding of already-aborted transactions.
Previously, transaction aborts were detected concurrently only during
system catalog scans while replaying a transaction in streaming mode.

This commit adds an additional CLOG lookup to check the transaction
status, allowing the logical decoding to skip changes also when it
doesn't touch system catalogs, if the transaction is already
aborted. This optimization enhances logical decoding performance,
especially for large transactions that have already been rolled back,
as it avoids unnecessary disk or network I/O.

To avoid potential slowdowns caused by frequent CLOG lookups for small
transactions (most of which commit), the CLOG lookup is performed only
for large transactions before eviction. The performance benchmark
results showed there is no noticeable performance regression due to
CLOG lookups.

Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Ajin Cherian
Reviewed-by: Dilip Kumar, Andres Freund
Discussion: https://postgr.es/m/CAD21AoDht9Pz_DFv_R2LqBTBbO4eGrpa9Vojmt5z5sEx3XwD7A@mail.gmail.com
2025-02-12 16:31:34 -08:00
Nathan Bossart
9e66a2b784 Remove unneeded volatile qualifier in fmgr.c.
Currently, the save_nestlevel variable in fmgr_security_definer()
is marked volatile.  While this may have been necessary when it was
used in a PG_CATCH section (as explained in the comment for PG_TRY
in elog.h), it appears to have been unnecessary since commit
82a47982f3, which removed its use in a PG_CATCH section.

Author: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z6xbAgXKY2L-3d5Q%40jrouhaud
2025-02-12 15:45:40 -06:00
Tom Lane
fd602f29c1 Clean up impenetrable logic in pg_basebackup/receivelog.c.
Coverity complained about possible double free of HandleCopyStream's
"copybuf".  AFAICS it's mistaken, but it is easy to see why it's
confused, because management of that buffer is impossibly confusing.
It's unreasonable that HandleEndOfCopyStream frees the buffer in some
cases but not others, updates the caller's state for that in no case,
and has not a single comment about how complicated that makes things.

Let's put all the responsibility for freeing copybuf in the actual
owner of that variable, HandleCopyStream.  This results in one more
PQfreemem call than before, but the logic is far easier to follow,
both for humans and machines.

Since this isn't (quite) actually broken, no back-patch.
2025-02-12 16:07:23 -05:00
Tom Lane
fcd77a6873 Fix minor memory leaks in pg_dump.
Coverity reported the two oversights in getPublicationTables.
Valgrind found the one in determineNotNullFlags.

The mistakes in getPublicationTables seem too minor to be worth
back-patching.  determineNotNullFlags could be run enough times
to matter, but that code is new in v18.  So, no back-patch.
2025-02-12 15:46:31 -05:00
Andres Freund
c45963c5d5 ci: Collect core files on NetBSD and OpenBSD
Support for the NetBSD and OpenBSD operating systems was added to CI in the
prior commit. Now add support for collecting core files and generating
backtraces for all core files.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ32ySyYa06k9MFd+VY5vHhUyBpvgmJUZae5PihjzaurVg@mail.gmail.com
2025-02-12 09:40:20 -05:00
Andres Freund
e291573534 ci: Test NetBSD and OpenBSD
NetBSD and OpenBSD Postgres CI images are now generated [1], but aren't yet
utilized for Postgres' CI. This commit adds CI support for them.

For now the tasks will be manually triggered, to save on CI credits.

[1] https://github.com/anarazel/pg-vm-images

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ32ySyYa06k9MFd+VY5vHhUyBpvgmJUZae5PihjzaurVg@mail.gmail.com
2025-02-12 09:40:07 -05:00
Andres Freund
b64d83115c meson: Fix failure to detect bsd_auth.h presence
The bsd_auth.h file needs to be included after 'sys/types.h', as documented in
https://man.openbsd.org/authenticate.3

The reason a similar looking stanza works for autoconf is that autoconf
automatically adds AC_INCLUDES_DEFAULT, which in turn includes sys/types.h.

Backpatch to all versions with meson support.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/637haqqyhg2wlz7q6wq25m2qupe67g7f2uupngzui64zypy4x2@ysr2xnmynmu4
Backpatch-through: 16
2025-02-12 08:15:53 -05:00
Michael Paquier
0fc68c8421 Fix issue in recovery test 041_checkpoint_at_promote
The phase of the test waiting for a restartpoint to complete was not
working as intended, due to an incorrectly written log_contains() call.

The problem reported by the author could be simply reproduced by
removing the injection_points_wakeup() call: the test succeeds rather
than waiting for the restartpoint completion.  In most cases, the
restartpoint completion is fast enough that the test offered the wanted
coverage.  On slow machines, it could have become unreliable.

Oversight in 6782709df81f.

Author: Nitin Jadhav
Discussion: https://postgr.es/m/CAMm1aWa_6u+o52r7h7G6pX-oWD0Qraf0ee17Ma50qxGS0B_Rzg@mail.gmail.com
Backpatch-through: 17
2025-02-12 17:58:25 +09:00
Michael Paquier
5b94e27534 Fix some inconsistencies with memory freeing in pg_createsubscriber
The correct function documented to free the memory allocated for the
result returned by PQescapeIdentifier() and PQescapeLiteral() is
PQfreemem().  pg_createsubscriber.c relied on pg_free() instead, which
is not incorrect as both do a free() internally, but inconsistent with
the documentation.

While at it, this commit fixes a small memory leak introduced by
4867f8a555ce, as the code of pg_createsubscriber makes this effort.

Author: Ranier Vilela
Reviewed-by: Euler Taveira
Discussion: https://postgr.es/m/CAEudQAp=AW5dJXrGLbC_aZg_9nOo=42W7uLDRONFQE-gcgnkgQ@mail.gmail.com
Backpatch-through: 17
2025-02-12 17:11:43 +09:00
Peter Eisentraut
1b5841d461 Remove unnecessary (char *) casts [checksum]
Remove some (char *) casts related to uses of the pg_checksum_page()
function.  These casts are useless, because everything involved
already has the right type.  Moreover, these casts actually silently
discarded a const qualifier.  The declaration of a higher-level
function needs to be adjusted to fix that.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
2025-02-12 08:59:48 +01:00
Peter Eisentraut
827b4060a8 Remove unnecessary (char *) casts [mem]
Remove (char *) casts around memory functions such as memcmp(),
memcpy(), or memset() where the cast is useless.  Since these
functions don't take char * arguments anyway, these casts are at best
complicated casts to (void *), about which see commit 7f798aca1d5.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
2025-02-12 08:50:13 +01:00
Peter Eisentraut
506183bce7 Remove unnecessary (char *) casts [string]
Remove (char *) casts around string functions where the arguments or
result already have the right type and the cast is useless (or worse,
potentially casts away a qualifier, but this doesn't appear to be the
case here).

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
2025-02-12 08:49:18 +01:00
John Naylor
0bc34ad692 Doc: Fix punctuation errors
Author: 斉藤登 <noborusai@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/CAAM3qnL6i-BSu5rB2+KiHLjMCOXiQEiPMBvEj7F1CgUzZMooLA@mail.gmail.com
Backpatch-through: 13
2025-02-12 13:18:14 +07:00
Nathan Bossart
bb8dff9995 Add cost-based vacuum delay time to progress views.
This commit adds the amount of time spent sleeping due to
cost-based delay to the pg_stat_progress_vacuum and
pg_stat_progress_analyze system views.  A new configuration
parameter named track_cost_delay_timing, which is off by default,
controls whether this information is gathered.  For vacuum, the
reported value includes the sleep time of any associated parallel
workers.  However, parallel workers only report their sleep time
once per second to avoid overloading the leader process.

Bumps catversion.
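
A brief usage sketch; track_cost_delay_timing is named in this commit, while
the delay_time column name shown below is an assumption:

  SET track_cost_delay_timing = on;
  -- from another session while a VACUUM is running:
  SELECT pid, relid::regclass, phase, delay_time
    FROM pg_stat_progress_vacuum;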

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Sergei Kornilov <sk@zsrv.org>
Discussion: https://postgr.es/m/ZmaXmWDL829fzAVX%40ip-10-97-1-34.eu-west-3.compute.internal
2025-02-11 16:38:14 -06:00
Nathan Bossart
e5b0b0ce15 Add is_analyze parameter to vacuum_delay_point().
This function is used in both vacuum and analyze code paths, and a
follow-up commit will require distinguishing between the two.  This
commit forces callers to specify whether they are in a vacuum or
analyze path, but it does not use that information for anything
yet.

Author: Nathan Bossart <nathandbossart@gmail.com>
Co-authored-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/ZmaXmWDL829fzAVX%40ip-10-97-1-34.eu-west-3.compute.internal
2025-02-11 16:38:14 -06:00
Melanie Plageman
d0d649e916 Limit pgbench COPY FREEZE to ordinary relations
pgbench client-side data generation uses COPY FREEZE to load data for most
tables. COPY FREEZE isn't supported for partitioned tables and since pgbench
only supports partitioning pgbench_accounts, pgbench used a hard-coded check to
skip COPY FREEZE and use plain COPY for a partitioned pgbench_accounts.

If the user has manually partitioned one of the other pgbench tables, this
causes client-side data generation to error out with:

ERROR:  cannot perform COPY FREEZE on a partitioned table

Fix this by limiting COPY FREEZE to ordinary tables (RELKIND_RELATION).
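
A minimal reproduction sketch of the underlying restriction (table names are
illustrative):

  CREATE TABLE h (x int) PARTITION BY RANGE (x);
  CREATE TABLE h1 PARTITION OF h FOR VALUES FROM (0) TO (100);
  COPY h FROM STDIN WITH (FREEZE);
  -- ERROR:  cannot perform COPY FREEZE on a partitioned table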

Author: Sergey Tatarintsev <s.tatarintsev@postgrespro.ru>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/flat/97f55fca-8a7b-4da8-b413-7d1c57010676%40postgrespro.ru
2025-02-11 16:52:08 -05:00
Jeff Davis
38172d1856 Injection points for hash aggregation.
Requires adding a guard against shift-by-32. Previously, that was
impossible because the number of partitions was always greater than 1,
but a new injection point can force the number of partitions to 1.

Discussion: https://postgr.es/m/ff4e59305e5d689e03cd256a736348d3e7958f8f.camel@j-davis.com
2025-02-11 11:26:25 -08:00
Melanie Plageman
052026c9b9 Eagerly scan all-visible pages to amortize aggressive vacuum
Aggressive vacuums must scan every unfrozen tuple in order to advance
the relfrozenxid/relminmxid. Because data is often vacuumed before it is
old enough to require freezing, relations may build up a large backlog
of pages that are set all-visible but not all-frozen in the visibility
map. When an aggressive vacuum is triggered, all of these pages must be
scanned. These pages have often been evicted from shared buffers and
even from the kernel buffer cache. Thus, aggressive vacuums often incur
large amounts of extra I/O at the expense of foreground workloads.

To amortize the cost of aggressive vacuums, eagerly scan some
all-visible but not all-frozen pages during normal vacuums.

All-visible pages that are eagerly scanned and set all-frozen in the
visibility map are counted as successful eager freezes and those not
frozen are counted as failed eager freezes.

If too many eager scans fail in a row, eager scanning is temporarily
suspended until a later portion of the relation. The number of failures
tolerated is configurable globally and per table.
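
As a configuration sketch only: the setting name below is an assumption, not
taken from this log, and the table name is illustrative:

  -- per-table cap on consecutive eager-freeze failures (name assumed)
  ALTER TABLE big_history SET (vacuum_max_eager_freeze_failure_rate = 0.05);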

To effectively amortize aggressive vacuums, we cap the number of
successes as well. Capping eager freeze successes also limits the amount
of potentially wasted work if these pages are modified again before the
next aggressive vacuum. Once we reach the maximum number of blocks
successfully eager frozen, eager scanning is disabled for the remainder
of the vacuum of the relation.

Original design idea from Robert Haas, with enhancements from
Andres Freund, Tomas Vondra, and me

Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_ZF_KCzZuOrPrOqjGVe8iRVWEAJSpzMgRQs%3D5-v84cXUg%40mail.gmail.com
2025-02-11 13:53:48 -05:00
Andres Freund
4dd09a1d41 config: Rename "Asynchronous Behavior" to "I/O"
"I/O" seems more descriptive than "Asynchronous Behavior", given that some of
the GUCs in the section don't relate to anything asynchronous.

Most other abbreviations in the config sections are un-abbreviated, but
"Input/Output" seems less likely to be helpful than just IO or I/O.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/x3tlw2jk5gm3r3mv47hwrshffyw7halpczkfbk3peksxds7bvc@lguk43z3bsyq
2025-02-11 12:53:40 -05:00
Andres Freund
740766d37c config: Split "Worker Processes" out of "Asynchronous Behavior"
Having all the worker related GUCs in the same section as IO controlling GUCs
doesn't really make sense. Create a separate section for them.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/x3tlw2jk5gm3r3mv47hwrshffyw7halpczkfbk3peksxds7bvc@lguk43z3bsyq
2025-02-11 12:53:40 -05:00
Tom Lane
c366d2bdba Allow extension functions to participate in in-place updates.
Commit 1dc5ebc90 allowed PL/pgSQL to perform in-place updates
of expanded-object variables that are being updated with
assignments like "x := f(x, ...)".  However this was allowed
only for a hard-wired list of functions f(), since we need to
be sure that f() will not modify the variable if it fails.
It was always envisioned that we should make that extensible,
but at the time we didn't have a good way to do so.  Since
then we've invented the idea of "support functions" to allow
attaching specialized optimization knowledge to functions,
and that is a perfect mechanism for doing this.

Hence, adjust PL/pgSQL to use a support function request instead
of hard-wired logic to decide if in-place update is safe.
Preserve the previous optimizations by creating support functions
for the three functions that were previously hard-wired.
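
For context, a sketch of the "x := f(x, ...)" assignment shape this targets,
assuming array_append was one of the previously hard-wired functions:

  DO $$
  DECLARE
    a int[] := '{}';
  BEGIN
    FOR i IN 1..10000 LOOP
      -- candidate for in-place update of the expanded array value
      a := array_append(a, i);
    END LOOP;
  END $$;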

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Discussion: https://postgr.es/m/CACxu=vJaKFNsYxooSnW1wEgsAO5u_v1XYBacfVJ14wgJV_PYeg@mail.gmail.com
2025-02-11 12:49:34 -05:00
Tom Lane
6c7251db0c Implement new optimization rule for updates of expanded variables.
If a read/write expanded variable is declared locally to the
assignment statement that is updating it, and it is referenced
exactly once in the assignment RHS, then we can optimize the
operation as a direct update of the expanded value, whether
or not the function(s) operating on it can be trusted not to
modify the value before throwing an error.  This works because
if an error does get thrown, we no longer care what value the
variable has.

In cases where that doesn't work, fall back to the previous
rule that checks for safety of the top-level function.

In any case, postpone determination of whether these optimizations
are feasible until we are executing a Param referencing the target
variable and that variable holds a R/W expanded object.  While the
previous incarnation of exec_check_rw_parameter was pretty cheap,
this is a bit less so, and our plan to invoke support functions
will make it even less so.  So avoiding the check for variables
where it couldn't be useful should be a win.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Discussion: https://postgr.es/m/CACxu=vJaKFNsYxooSnW1wEgsAO5u_v1XYBacfVJ14wgJV_PYeg@mail.gmail.com
2025-02-11 12:34:59 -05:00
Tom Lane
36fb9ef269 Detect whether plpgsql assignment targets are "local" variables.
Mark whether the target of a potentially optimizable assignment
is "local", in the sense of being declared inside any exception
block that could trap an error thrown from the assignment.
(This implies that we needn't preserve the variable's value
in case of an error.  This patch doesn't do anything with the
knowledge, but the next one will.)

Normally, this requires a post-parsing scan of the function's
parse tree, since we don't know while parsing a BEGIN ...
construct whether we will find EXCEPTION at its end.  However,
if there are no BEGIN ... EXCEPTION blocks in the function at
all, then all assignments are local, even those to variables
representing function arguments.  We optimize that common case
by initializing the target_is_local flags to "true", and fixing
them up with a post-scan only if we found EXCEPTION.

Note that variables' default-value expressions are never interesting
for expanded-variable optimization, since they couldn't contain a
reference to the target variable anyway.  But the code is set up
to compute their target_param and target_is_local correctly anyway,
for consistency and in case someone thinks of a use for that data.

I added a bit of plpgsql_dumptree support to help verify that this
code sets the flags as expected.  I also added a plpgsql_dumptree
call in plpgsql_compile_inline.  It was at best an oversight that
"#option dump" didn't work in a DO block; now it does.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Discussion: https://postgr.es/m/CACxu=vJaKFNsYxooSnW1wEgsAO5u_v1XYBacfVJ14wgJV_PYeg@mail.gmail.com
2025-02-11 12:27:15 -05:00
Tom Lane
a654af21ae Preliminary refactoring of plpgsql expression construction.
This short and boring patch simply moves the responsibility for
initializing PLpgSQL_expr.target_param into plpgsql parsing,
rather than doing it at first execution of the expr as before.
This doesn't save anything in terms of runtime, since the work was
trivial and done only once per expr anyway.  But it makes the info
available during parsing, which will be useful for the next step.

Likewise set PLpgSQL_expr.func during parsing.  According to the
comments, this was once impossible; but it's certainly possible
since we invented the plpgsql_curr_compile variable.  Again, this
saves little runtime, but it seems far cleaner conceptually.

While at it, I reordered stuff in struct PLpgSQL_expr to make it
clearer which fields are filled when, and merged some duplicative
code in pl_gram.y.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Discussion: https://postgr.es/m/CACxu=vJaKFNsYxooSnW1wEgsAO5u_v1XYBacfVJ14wgJV_PYeg@mail.gmail.com
2025-02-11 12:20:05 -05:00
Tom Lane
6a7283dd2f Refactor pl_funcs.c to provide a usage-independent tree walker.
We haven't done this up to now because there was only one use-case,
namely plpgsql_free_function_memory's search for expressions to clean
up.  However an upcoming patch has another need for walking plpgsql
functions' statement trees, so let's create sharable tree-walker
infrastructure in the same style as expression_tree_walker().

This patch actually makes the code shorter, although that's
mainly down to having used a more compact coding style.  (I didn't
write a separate subroutine for each statement type, and I made
use of some newer notations like foreach_ptr.)

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Discussion: https://postgr.es/m/CACxu=vJaKFNsYxooSnW1wEgsAO5u_v1XYBacfVJ14wgJV_PYeg@mail.gmail.com
2025-02-11 12:14:12 -05:00
Peter Eisentraut
6998db59c2 Replace AssertMacro() with Assert() when not in macro
This was forgotten to be changed in commit 9c727360bcc.
2025-02-11 11:12:05 +01:00
Michael Paquier
c9238ad853 Fix indentation of comment in plannodes.h
Oversight in commit 3d17d7d7fb7a.  Worth noting that pgindent was fine
as-is.

Author: Sami Imseih
Discussion: https://postgr.es/m/CAA5RZ0t80hP2aTv97QtEJy39GkxKmDBVDiTBApfiuTa4O=TEWQ@mail.gmail.com
2025-02-11 07:40:03 +09:00
Tom Lane
5bf12323b6 Adapt appendPsqlMetaConnect() to the new fmtId() encoding expectations.
We need to tell fmtId() what encoding to assume, but this function
doesn't know that.  Fortunately we can fix that without changing the
function's API, because we can just use SQL_ASCII.  That's because
database names in connection requests are effectively binary not text:
no encoding-aware processing will happen on them.

This fixes XversionUpgrade failures seen in the buildfarm.  The
alternative of having pg_upgrade use setFmtEncoding() is unappetizing,
given that it's connecting to multiple databases that may have
different encodings.

Andres Freund, Noah Misch, Tom Lane

Security: CVE-2025-1094
2025-02-10 16:30:03 -05:00
Jeff Davis
9f12da78d9 Lock table in ShareUpdateExclusive when importing index stats.
Follow locking behavior of ANALYZE when importing statistics. In
particular, when importing index statistics, the table must be locked
in ShareUpdateExclusive mode. Fixes a bug reported by Jian He.

ANALYZE doesn't update statistics on partitioned indexes, and the
locking requirements are slightly different for in-place updates on
partitioned indexes versus normal indexes. To be conservative, lock
both the partitioned table and the partitioned index in
ShareUpdateExclusive mode when importing stats for a partitioned
index.

Author: Corey Huinker
Reported-by: Jian He
Reviewed-by: Michael Paquier
Discussion: https://www.postgresql.org/message-id/CACJufxGreTY7qsCV8%2BBkuv0p5SXGTScgh%3DD%2BDq6%3D%2B_%3DXTp7FWg%40mail.gmail.com
2025-02-10 12:58:13 -08:00
Andres Freund
979205e47b Fix type in test_escape test
On machines where char is unsigned this could lead to option parsing looping
endlessly. It's also too narrow a type on other hardware.

Found via Tom Lane's monitoring of the buildfarm.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Security: CVE-2025-1094
Backpatch-through: 13
2025-02-10 12:12:14 -05:00
Andres Freund
32c34006b2 docs: EUC_TW can be up to four bytes wide, not three
Backpatch-through: 13
Security: CVE-2025-1094
2025-02-10 10:03:37 -05:00
Andres Freund
ac00ff1c96 Add test of various escape functions
As highlighted by the prior commit, writing correct escape functions is less
trivial than one might hope.

This test module tries to verify that different escaping functions behave
reasonably. It e.g. tests:

- Invalidly encoded input to an escape function leads to invalidly encoded
  output

- Trailing incomplete multi-byte characters are handled sensibly

- Escaped strings are parsed as single statement by psql's parser (which
  derives from the backend parser)

There are further tests that would be good to add. But even in the current
state it was rather useful for writing the fix in the prior commit.

Reviewed-by: Noah Misch <noah@leadboat.com>
Backpatch-through: 13
Security: CVE-2025-1094
2025-02-10 10:03:37 -05:00
Andres Freund
5dc1e42b4f Fix handling of invalidly encoded data in escaping functions
Previously invalidly encoded input to various escaping functions could lead to
the escaped string getting incorrectly parsed by psql.  To be safe, escaping
functions need to ensure that neither invalid nor incomplete multi-byte
characters can be used to "escape" from being quoted.

Functions which can report errors now return an error in more cases than
before. Functions that cannot report errors now replace invalid input bytes
with a byte sequence that cannot be used to escape the quotes and that is
guaranteed to error out when a query is sent to the server.

The following functions are fixed by this commit:
- PQescapeLiteral()
- PQescapeIdentifier()
- PQescapeString()
- PQescapeStringConn()
- fmtId()
- appendStringLiteral()

Reported-by: Stephen Fewer <stephen_fewer@rapid7.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Backpatch-through: 13
Security: CVE-2025-1094
2025-02-10 10:03:37 -05:00
Andres Freund
3e98c8ce50 Specify the encoding of input to fmtId()
This commit adds fmtIdEnc() and fmtQualifiedIdEnc(), which allow specifying
the encoding as an explicit argument.  Additionally setFmtEncoding() is
provided, which defines the encoding when no explicit encoding is provided, to
avoid breaking all code using fmtId().

All users of fmtId()/fmtQualifiedId() are either converted to the explicit
version or a call to setFmtEncoding() has been added.

This commit does not yet utilize the now well-defined encoding, that will
happen in a subsequent commit.

Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Backpatch-through: 13
Security: CVE-2025-1094
2025-02-10 10:03:37 -05:00
Andres Freund
4dc2896353 Add pg_encoding_set_invalid()
There are cases where we cannot / do not want to error out for invalidly
encoded input. In such cases it can be useful to replace e.g. an incomplete
multi-byte character with bytes that will trigger an error when getting
validated as part of a larger string.

Unfortunately, until now, for some encodings no such sequence existed. For
those encodings this commit removes one previously accepted input combination
- we consider that to be ok, as the chosen bytes are outside of the valid
ranges for the encodings, we just previously failed to detect that.

As we cannot add a new field to pg_wchar_table without breaking ABI, this is
implemented "in-line" in the newly added function.

Author: Noah Misch <noah@leadboat.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Backpatch-through: 13
Security: CVE-2025-1094
2025-02-10 10:03:37 -05:00
Michael Paquier
3d17d7d7fb Reformat node comments in plannodes.h
This is similar to d575051b9af9 but this time for the comments in
plannodes.h to avoid long lines, which is useful if adding per-field
annotations with pg_node_attr() to these planner structures.

Some patches are under discussion to add such properties to planner
fields, which is something that may or may not happen, and this change
makes future proposals easier to work on and review, which being more
consistent in style with the parse nodes.

Author: Sami Imseih
Discussion: https://postgr.es/m/Z5xTb5iBHVGns35R@paquier.xyz
2025-02-10 09:58:25 +09:00
Peter Eisentraut
9926f854d0 Cache NO ACTION foreign keys separately from RESTRICT foreign keys
Now that we generate different SQL for temporal NO ACTION vs RESTRICT
foreign keys, we should cache their query plans with different keys.
Since the key also includes the constraint oid, this shouldn't be
necessary, but we have been seeing build farm failures that suggest we
might sometimes be using a cached NO ACTION plan to implement a RESTRICT
constraint.
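
For context, a hedged sketch of a temporal foreign key where NO ACTION and
RESTRICT now generate different SQL (the schema is illustrative; btree_gist
is assumed to be required for the WITHOUT OVERLAPS key):

  CREATE EXTENSION IF NOT EXISTS btree_gist;
  CREATE TABLE parent (
    id int,
    valid_at daterange,
    PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
  );
  CREATE TABLE child (
    id int,
    valid_at daterange,
    parent_id int,
    FOREIGN KEY (parent_id, PERIOD valid_at)
      REFERENCES parent (id, PERIOD valid_at) ON DELETE RESTRICT
  );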

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2025-02-09 13:43:56 +01:00
Peter Eisentraut
a9258629ed Make TLS write functions' buffer arguments pointers const
This also makes it match the equivalent APIs in libpq.

Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
2025-02-09 12:43:30 +01:00
Michael Paquier
169208092f Refactor TAP test code for file comparisons into new routine in Utils.pm
This unifies the output produced when any differences are found in the
files provided, information that 027_stream_regress did not show on
failures.  TAP tests of pg_combinebackup and pg_upgrade now rely on the
refactored routine, reducing the dependency on the diff command.  The
callers of this routine can optionally specify a custom line-comparison
function.

There are a couple of tests that still use directly a diff command:
001_pg_bsd_indent, 017_shm and test_json_parser's 003.  These rely on
different properties and are left out for now.

Extracted from a larger patch by the same author.

Author: Ashutosh Bapat
Discussion: https://postgr.es/m/Z6RQS-tMzGYjlA-H@paquier.xyz
2025-02-09 16:52:33 +09:00
Tom Lane
ecb8226af6 PDF docs build: avoid spurious "warn" in build logs.
Improve on e4c886519 so that the string "warn" appears in
the output when there's a problem, and not when there isn't.
This should silence noise I've been seeing in my buildfarm
warning scraper.
2025-02-07 22:12:38 -05:00
Tom Lane
fb056564ec Fix pgbench performance issue induced by commit af35fe501.
Commit af35fe501 caused "pgbench -i" to emit a '\r' character
for each data row loaded (when stderr is a terminal).
That's effectively invisible on-screen, but it causes the
connected terminal program to consume a lot of cycles.
It's even worse if you're connected over ssh, as the data
then has to pass through the ssh tunnel.

Simplest fix is to move the added logic inside the if-tests
that check whether to print a progress line.  We could do
it another way that avoids duplicating these few lines,
but on the whole this seems the most transparent way to
write it.

Like the previous commit, back-patch to all supported versions.

Reported-by: Andres Freund <andres@anarazel.de>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/4k4drkh7bcmdezq6zbkhp25mnrzpswqi2o75d5uv2eeg3aq6q7@b7kqdmzzwzgb
Backpatch-through: 13
2025-02-07 13:41:42 -05:00
Tom Lane
11bba6e494 Doc: clarify behavior of timestamptz input some more.
Try to make it absolutely plain that we don't retain the
originally specified time zone, only the UTC timestamp.

While at it, make glossary entries for "UTC" and "GMT".
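
A small sketch of the documented behavior:

  SET timezone = 'UTC';
  SELECT '2025-02-07 12:00:00+05'::timestamptz;
  -- yields 2025-02-07 07:00:00+00; the original +05 offset is not retained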

Author: Robert Treat <rob@xzilla.net>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/173796426022.1064.9135167366862649513@wrigleys.postgresql.org
Backpatch-through: 13
2025-02-07 12:40:41 -05:00
Peter Eisentraut
b92c03342d Allow non-btree speculative insertion indexes
Previously, only btrees were supported as the arbiter index for
speculative insertion because there was no way to get the equality
strategy number for other index methods.  We have this now (commit
c09e5a6a016), so we can support this.

At the moment, only btree supports unique indexes, so this does not
change anything in practice, but it would allow another index method
that has amcanunique to be supported.
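
For reference, speculative insertion backs INSERT ... ON CONFLICT, where a
unique index (today always a btree) serves as the arbiter (the table name is
illustrative):

  INSERT INTO t (k, v) VALUES (1, 'x')
    ON CONFLICT (k) DO UPDATE SET v = EXCLUDED.v;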

Co-authored-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-02-07 11:23:34 +01:00
Peter Eisentraut
bfe21b760e Support non-btree indexes for foreign keys
Previously, only btrees were supported as the referenced unique index
for foreign keys because there was no way to get the equality strategy
number for other index methods.  We have this now (commit
c09e5a6a016), so we can support this.  In fact, this is now just a
special case of the existing generalized "period" foreign key
support, since that already knows how to lookup equality strategy
numbers.

Note that this does not change the requirement that the referenced
index needs to be unique, and at the moment, only btree supports that,
so this does not change anything in practice, but it would allow
another index method that has amcanunique to be supported.

Co-authored-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-02-07 11:23:34 +01:00
Peter Eisentraut
83ea6c5402 Virtual generated columns
This adds a new variant of generated columns that are computed on read
(like a view, unlike the existing stored generated columns, which are
computed on write, like a materialized view).

The syntax for the column definition is

    ... GENERATED ALWAYS AS (...) VIRTUAL

and VIRTUAL is also optional.  VIRTUAL is the default rather than
STORED to match various other SQL products.  (The SQL standard makes
no specification about this, but it also doesn't know about VIRTUAL or
STORED.)  (Also, virtual views are the default, rather than
materialized views.)
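
For example, a minimal table using the new column kind (the schema is
illustrative):

  CREATE TABLE orders (
    qty   int,
    price numeric,
    total numeric GENERATED ALWAYS AS (qty * price) VIRTUAL
  );
  -- total is computed when the row is read, unlike STORED generated columns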

Virtual generated columns are stored in tuples as null values.  (A
very early version of this patch had the ambition to not store them at
all.  But so much stuff breaks or gets confused if you have tuples
where a column in the middle is completely missing.  This is a
compromise, and it still saves space over being forced to use stored
generated columns.  If we ever find a way to improve this, a bit of
pg_upgrade cleverness could allow for upgrades to a newer scheme.)

The capabilities and restrictions of virtual generated columns are
mostly the same as for stored generated columns.  In some cases, this
patch keeps virtual generated columns more restricted than they might
technically need to be, to keep the two kinds consistent.  Some of
that could maybe be relaxed later after separate careful
considerations.

Some functionality that is currently not supported, but could possibly
be added as incremental features, some easier than others:

- index on or using a virtual column
- hence also no unique constraints on virtual columns
- extended statistics on virtual columns
- foreign-key constraints on virtual columns
- not-null constraints on virtual columns (check constraints are supported)
- ALTER TABLE / DROP EXPRESSION
- virtual column cannot have domain type
- virtual columns are not supported in logical replication

The tests in generated_virtual.sql have been copied over from
generated_stored.sql with the keyword replaced.  This way we can make
sure the behavior is mostly aligned, and the differences can be
visible.  Some tests for currently not supported features are
currently commented out.

Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Tested-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/a368248e-69e4-40be-9c07-6c3b5880b0a6@eisentraut.org
2025-02-07 09:46:59 +01:00
Amit Langote
cbc127917e Track unpruned relids to avoid processing pruned relations
This commit introduces changes to track unpruned relations explicitly,
making it possible for top-level plan nodes, such as ModifyTable and
LockRows, to avoid processing partitions pruned during initial
pruning.  Scan-level nodes, such as Append and MergeAppend, already
avoid the unnecessary processing by accessing partition pruning
results directly via part_prune_index. In contrast, top-level nodes
cannot access pruning results directly and need to determine which
partitions remain unpruned.

To address this, this commit introduces a new bitmapset field,
es_unpruned_relids, which the executor uses to track the set of
unpruned relations.  This field is referenced during plan
initialization to skip initializing certain nodes for pruned
partitions. It is initialized with PlannedStmt.unprunableRelids,
a new field that the planner populates with RT indexes of relations
that cannot be pruned during runtime pruning. These include relations
not subject to partition pruning and those required for execution
regardless of pruning.

PlannedStmt.unprunableRelids is computed during set_plan_refs() by
removing the RT indexes of runtime-prunable relations, identified
from PartitionPruneInfos, from the full set of relation RT indexes.
ExecDoInitialPruning() then updates es_unpruned_relids by adding
partitions that survive initial pruning.

To support this, PartitionedRelPruneInfo and PartitionedRelPruningData
now include a leafpart_rti_map[] array that maps partition indexes to
their corresponding RT indexes. The former is used in set_plan_refs()
when constructing unprunableRelids, while the latter is used in
ExecDoInitialPruning() to convert partition indexes returned by
get_matching_partitions() into RT indexes, which are then added to
es_unpruned_relids.

These changes make it possible for ModifyTable and LockRows nodes to
process only relations that remain unpruned after initial pruning.
ExecInitModifyTable() trims lists, such as resultRelations,
withCheckOptionLists, returningLists, and updateColnosLists, to
consider only unpruned partitions. It also creates ResultRelInfo
structs only for these partitions. Similarly, child RowMarks for
pruned relations are skipped.

By avoiding unnecessary initialization of structures for pruned
partitions, these changes improve the performance of updates and
deletes on partitioned tables during initial runtime pruning.

Due to ExecInitModifyTable() changes as described above, EXPLAIN on a
plan for UPDATE and DELETE that uses runtime initial pruning no longer
lists partitions pruned during initial pruning.

Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions)
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
2025-02-07 17:15:09 +09:00
Michael Paquier
926c7fce03 injection_points: Tweak more permutation in isolation test "basic"
The CI has reported that using a marker to force the output of the
detach step to happen after the wait step was not enough, as
isolationtester has managed to report the detach step as waiting before
the wait step finishes in some runs.

src/test/isolation/README tells that there is a more drastic method to
enforce the ordering of the output: an empty step positioned just after
the wait step can force the wait step to complete before the detach step
begins.  This method has been able to pass 10 runs in the CI here, while
HEAD seems to fail 15~20% of the time in the CF bot.

Discussion: https://postgr.es/m/Z6WO8FbqK_FHmrzC@paquier.xyz
2025-02-07 13:58:22 +09:00
Michael Paquier
428fadb7e9 Move SQL tests of pg_stat_io for WAL data to recovery test 029_stats_restart
Three tests in the main regression test suite are proving to not be
portable across multiple runs on a deployed cluster as stats of
pg_stat_io are reset.  Problems happen for tests on:
- Writes of WAL in the init context, when creating a WAL segment.
- Syncs of WAL in the init context, when creating a WAL segment.
- Reads of WAL in the normal context, requiring a WAL record to be read.
For a `make check`, this could rely on the checkpoint record read by the
startup process when starting the cluster, something that is not going
to work for a deployed node.

Two of the three tests are moved to the recovery TAP test
029_stats_restart, where we already check the consistency of stats
data.  The test for syncs is dropped as TAP can run with fsync=off.  The
other two are checked with some data from a freshly-initialized cluster.

Per discussion with Tom Lane, Bertrand Drouvot and Nazir Bilal Yavuz.

Discussion: https://postgr.es/m/915687.1738780322@sss.pgh.pa.us
2025-02-07 09:42:31 +09:00
Nathan Bossart
401a6956fa Disallow COPY FREEZE on foreign tables.
This didn't actually work: the COPY succeeds, but the FREEZE
optimization isn't applied.  There doesn't seem to be an easy way
to support FREEZE on foreign tables, so let's follow the precedent
established by commit 5c9a5513a3 by raising an error early.  This
is arguably a bug fix, but due to the lack of reports, the minimal
discussion on the mailing list, and the potential to break existing
scripts, I am not back-patching it for now.

Author: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Zhang Mingli <zmlpostgres@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0ujeNgKpE3OrLtR%3DeJGa5LkGMekFzQTwjgw%3DrzaLufQLQ%40mail.gmail.com
2025-02-06 15:23:40 -06:00
Daniel Gustafsson
a99a32e43e libpq: Handle asynchronous actions during SASL
This adds the ability for a SASL mechanism to signal PQconnectPoll()
that some arbitrary work, external to the Postgres connection, is
required for authentication to continue.  There is no consumer for
this capability as part of this commit, it is infrastructure which
is required for future work on supporting the OAUTHBEARER mechanism.

To ensure that threads are not blocked waiting for the SASL mechanism
to make long-running calls, the mechanism communicates with the top-
level client via the "altsock": a file or socket descriptor, opaque to
this layer of libpq, which is signaled when work is ready to be done
again.  The altsock temporarily replaces the regular connection
descriptor, so existing PQsocket() clients should continue to operate
correctly using their existing polling implementations.

For a mechanism to use this it should set an authentication callback,
conn->async_auth(), and a cleanup callback, conn->cleanup_async_auth(),
and return SASL_ASYNC during the exchange.  It should then assign
conn->altsock during the first call to async_auth().  When the cleanup
callback is called, either because authentication has succeeded or
because the connection is being dropped, the altsock must be released
and disconnected from the PGconn object.
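
During the exchange, a mechanism could opt in roughly like this (a
simplified sketch, not the actual OAUTHBEARER code; my_async_auth and
my_cleanup_async_auth are hypothetical callbacks):

  conn->async_auth = my_async_auth;
  conn->cleanup_async_auth = my_cleanup_async_auth;
  return SASL_ASYNC;

and, inside the first call to my_async_auth(), hand over the descriptor
that PQconnectPoll() should expose to callers:

  conn->altsock = my_external_fd;    /* hypothetical descriptor */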

This was extracted from the larger OAUTHBEARER patchset which has
been developed, and reviewed by many, over several years and it is
thus likely that some reviewer credit of much earlier versions has
been accidentally omitted.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Antonin Houska <ah@cybertec.at>
Discussion: https://postgr.es/m/CAOYmi+kJqzo6XsR9TEhvVfeVNQ-TyFM5LATypm9yoQVYk=4Wrw@mail.gmail.com
2025-02-06 22:19:21 +01:00
Daniel Gustafsson
44ec095751 Remove support for linking with libeay32 and ssleay32
The OpenSSL project stopped using the eay names back in 2016
on platforms other than Microsoft Windows, and version 1.1.0
removed the names from Windows as well. Since we now require
OpenSSL 1.1.1 we can remove support for using the eay names
from our tree as well.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/3C445F8E-D43E-4970-9CD9-A54882197714@yesql.se
Discussion: https://postgr.es/m/CAHrt6656W9OnFomQTHBGYDcM5CKZ7hcgzFt8L+N0ezBZfcN3zA@mail.gmail.com
2025-02-06 20:26:46 +01:00
Nathan Bossart
527f8fec22 Fix autovacuum_vacuum_max_threshold's GUC description.
Most GUCs that accept a special value to disable the feature
mention it in their GUC description.  This commit adds that
information to autovacuum_vacuum_max_threshold's description.

Oversight in commit 306dc520b9.
2025-02-06 11:59:12 -06:00
Daniel Gustafsson
affd38e55a pgcrypto: Remove static storage class from variables
Variables p, sp and ep were labeled with static storage class
but are all assigned before use so they cannot carry any data
across calls.  Fix by removing the static label.

Also while in there, make the magic variable const as it will
never change.

Author: Japin Li <japinli@hotmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/ME0P300MB0445096B67ACE8CE25772F00B6F72@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
2025-02-06 15:13:40 +01:00
Michael Paquier
9e020050b8 injection_points: Re-enable permutation in isolation test "basic"
This test has been disabled in 9f00edc22888 due to an unstable expected
output, where it would be possible for the wait step to report its
result after the detach step is done.  The expected output was ordered
so that the detach would always report last.

Isolation test permutations have the option to use markers to control
the ordering for cases like this one, as documented in
src/test/isolation/README.  The permutation is enabled once again, this
time with a marker added so that the detach step reports only after the
wait step has finished, ensuring a correct output ordering.

Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/Z6MBZTX5EqQ6A8Zc@paquier.xyz
2025-02-06 10:39:41 +09:00
Nathan Bossart
306dc520b9 Introduce autovacuum_vacuum_max_threshold.
One way autovacuum chooses tables to vacuum is by comparing the
number of updated or deleted tuples with a value calculated using
autovacuum_vacuum_threshold and autovacuum_vacuum_scale_factor.
The threshold specifies the base value for comparison, and the
scale factor specifies the fraction of the table size to add to it.
This strategy ensures that smaller tables are vacuumed after fewer
updates/deletes than larger tables, which is reasonable in many
cases but can result in infrequent vacuums on very large tables.
This is undesirable for a couple of reasons, such as very large
tables incurring a huge amount of bloat between vacuums.

This new parameter provides a way to set a limit on the value
calculated with autovacuum_vacuum_threshold and
autovacuum_vacuum_scale_factor so that very large tables are
vacuumed more frequently.  By default, it is set to 100,000,000
tuples, but it can be disabled by setting it to -1.  It can also be
adjusted for individual tables by changing storage parameters.
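
For illustration, the effective threshold amounts to something like
the following (a sketch with simplified variable names, not the actual
autovacuum code):

  /* reltuples: estimated number of tuples in the table */
  float4  thresh = vac_base_thresh + vac_scale_factor * reltuples;

  /* the new cap; a value of -1 disables it */
  if (vac_max_thresh >= 0 && thresh > vac_max_thresh)
      thresh = vac_max_thresh;

  /* the table is vacuumed once enough tuples have been updated/deleted */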

Author: Nathan Bossart <nathandbossart@gmail.com>
Co-authored-by: Frédéric Yhuel <frederic.yhuel@dalibo.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Michael Banck <mbanck@gmx.net>
Reviewed-by: Joe Conway <mail@joeconway.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Reviewed-by: Vinícius Abrahão <vinnix.bsd@gmail.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Alena Rybakina <a.rybakina@postgrespro.ru>
Discussion: https://postgr.es/m/956435f8-3b2f-47a6-8756-8c54ded61802%40dalibo.com
2025-02-05 15:48:18 -06:00
Tom Lane
a14707da56 Show more-intuitive titles for psql commands \dt, \di, etc.
If exactly one relation type is requested in a command of the \dtisv
family, say "tables", "indexes", etc instead of "relations".  This
should cover the majority of actual uses, without creating a huge
number of new translatable strings.  The error messages for no
matching relations are adjusted as well.

In passing, invent "pg_log_error_internal()" to be used for frontend
error messages that don't seem to need translation, analogously to
errmsg_internal() in the backend.  The implementation is a bit cheesy,
being just a macro to prevent xgettext from recognizing a trigger
keyword.  This won't avoid a useless gettext lookup cycle at runtime
--- but surely we don't care about an extra microsecond or two in
what's supposed to be a can't-happen case.  I (tgl) also made
"pg_fatal_internal()", though it's not used in this patch.

Author: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAKAnmm+7o93fQV-RFkGaN1QnP-0D4d3JTykD+cLueqjDMKdfag@mail.gmail.com
2025-02-05 12:45:58 -05:00
Daniel Gustafsson
ee4667f018 doc: Update links which returned 404
Two links in the isn module documentation were pointing to tools
which had been moved, resulting in 404 error responses.  Update
to the new URLs for the tools.  The link to the Sequoia 2000 page
in the history section was no longer working, and since the page
is no longer available online update our link to point at the
paper instead which is on a stable URL.

These links exist in all versions of the documentation so backpatch
to all supported branches.

Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: charukiewicz@protonmail.com
Discussion: https://postgr.es/m/173679670185.705.8565555804465055355@wrigleys.postgresql.org
Backpatch-through: 13
2025-02-05 13:58:40 +01:00
Amit Kapila
0ec3c295e7 Avoid updating inactive_since for invalid replication slots.
It is possible for the inactive_since value of an invalid replication slot
to be updated multiple times, for example when the slot is released or at
restart, which is unexpected behavior.  This is harmless because invalid
slots are not allowed to be accessed, but it is still not prudent to update
them.  We are also planning to invalidate slots for other reasons, such as
idle time, and it would look odd for a slot's inactive_since to show a
recent time after the slot has been invalidated due to idle time.  This
patch therefore ensures that the inactive_since field is not updated for
invalid slots.

In passing, ensure that the same inactive_since time is used for all the
slots restored from disk at restart.

Author: Nisha Moond <nisha.moond412@gmail.com>
Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CABdArM7QdifQ_MHmMA=Cc4v8+MeckkwKncm2Nn6tX9wSCQ-+iw@mail.gmail.com
2025-02-05 08:56:14 +05:30
Andres Freund
b2bdb972c0 meson: ci: ensure tests are built before running them
Meson 1.7 stopped building all the dependencies of tests as part of the
default build target.  This breaks CI because we only built the default
target before running the tests, and ran them with --no-rebuild.

The simplest fix would be to remove --no-rebuild from MTEST_ARGS, but it seems
better to explicitly build the test dependencies, so compiler warnings /
errors are visible as part of the build step.

Discussion: https://postgr.es/m/CAGECzQSvM3iSDmjF+=Kof5an6jN8UbkP_4cKKT9w6GZavmb5yQ@mail.gmail.com
Backpatch: 16-, where meson was added
2025-02-04 17:56:19 -05:00
Andres Freund
26aca4de43 meson: Add missing dependencies for libpq tests
The missing dependency was, e.g., visible when doing
  ninja clean && ninja meson-test-prereq && meson test --no-rebuild --suite setup --suite libpq

This is a bit more complicated than other related fixes, because until now
libpq's tests depended on 'frontend_code', which includes a dependency on
fe_utils, which in turns on libpq. That in turn required
src/interfaces/libpq/test to be entered from the top-level, not from
libpq/meson.build.  Because of that the test definitions in libpq/meson.build
could not declare a dependency on the binaries defined in
libpq/test/meson.build.

To fix this, this commit creates frontend_no_fe_utils_code, which allows us to
recurse into libpq/test from within libpq/meson.build.

Apply this to all branches with meson support, as part of an effort to fix
incorrect test dependencies that can lead to test failures.

Discussion: https://postgr.es/m/CAGECzQSvM3iSDmjF+=Kof5an6jN8UbkP_4cKKT9w6GZavmb5yQ@mail.gmail.com
Discussion: https://postgr.es/m/bdba588f-69a9-4f3e-9b95-62d07210a32e@eisentraut.org
Backpatch: 16-, where meson support was added
2025-02-04 17:56:19 -05:00
Andres Freund
c89525d57b meson: Add missing dependencies to libpq_pipeline test
The missing dependency was, e.g., visible when doing
  ninja clean && ninja meson-test-prereq && meson test --no-rebuild --suite setup --suite libpq_pipeline

Apply this to all branches with meson support, as part of an effort to fix
incorrect test dependencies that can lead to test failures.

Discussion: https://postgr.es/m/CAGECzQSvM3iSDmjF+=Kof5an6jN8UbkP_4cKKT9w6GZavmb5yQ@mail.gmail.com
Discussion: https://postgr.es/m/bdba588f-69a9-4f3e-9b95-62d07210a32e@eisentraut.org
Backpatch: 16-, where meson support was added
2025-02-04 17:56:19 -05:00
Andres Freund
1be5c37372 meson: Add test dependencies for test_json_parser
This is required to ensure correct test dependencies, previously
the test binaries would not necessarily be built.

The missing dependency was, e.g., visible when doing
  ninja clean && ninja meson-test-prereq && meson test --no-rebuild --suite setup --suite test_json_parser

Apply this to all branches with meson support, as part of an effort to fix
incorrect test dependencies that can lead to test failures.

Author: Peter Eisentraut <peter@eisentraut.org>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAGECzQSvM3iSDmjF+=Kof5an6jN8UbkP_4cKKT9w6GZavmb5yQ@mail.gmail.com
Discussion: https://postgr.es/m/bdba588f-69a9-4f3e-9b95-62d07210a32e@eisentraut.org
Backpatch: 17-, where test_json_parser was added
2025-02-04 17:56:19 -05:00
Andres Freund
69b05d581e meson: Add pg_regress_ecpg to ecpg test dependencies
This is required to ensure correct test dependencies, previously
pg_regress_ecpg would not necessarily be built.

The missing dependency was, e.g., visible when doing
  ninja clean && ninja meson-test-prereq && meson test --no-rebuild --suite setup --suite ecpg

Apply this to all branches with meson support, as part of an effort to fix
incorrect test dependencies that can lead to test failures.

Discussion: https://postgr.es/m/CAGECzQSvM3iSDmjF+=Kof5an6jN8UbkP_4cKKT9w6GZavmb5yQ@mail.gmail.com
Discussion: https://postgr.es/m/bdba588f-69a9-4f3e-9b95-62d07210a32e@eisentraut.org
Backpatch: 16-, where meson support was added
2025-02-04 17:56:19 -05:00
Andres Freund
74ef4855b0 meson: Improve dependencies for tmp_install test target
The missing dependency was, e.g., visible when doing
  ninja clean && ninja meson-test-prereq && meson test --no-rebuild --suite setup --suite cube
because meson (and thus its internal meson-test-prereq target) did not know
about a lot of the required targets.

Previously tmp_install did not actually depend on the relevant files being
built. That was mostly not visible, because "meson test" currently uses the
'default' targets as a test's dependency if no dependency is specified.
However, there are plans to narrow that on the meson side, to make it quicker
to run tests.

Apply this to all branches with meson support, as part of an effort to fix
incorrect test dependencies that can lead to test failures.

Discussion: https://postgr.es/m/CAGECzQSvM3iSDmjF+=Kof5an6jN8UbkP_4cKKT9w6GZavmb5yQ@mail.gmail.com
Discussion: https://postgr.es/m/bdba588f-69a9-4f3e-9b95-62d07210a32e@eisentraut.org
Backpatch: 16-, where meson support was added
2025-02-04 17:56:19 -05:00
Andres Freund
c2ede6640c meson: Narrow dependencies for 'install-quiet' target
Previously test dependencies, which are not actually installed, were
unnecessarily built.

Apply this to all branches with meson support, as part of an effort to fix
incorrect test dependencies that can lead to test failures.

Discussion: https://postgr.es/m/CAGECzQSvM3iSDmjF+=Kof5an6jN8UbkP_4cKKT9w6GZavmb5yQ@mail.gmail.com
Discussion: https://postgr.es/m/bdba588f-69a9-4f3e-9b95-62d07210a32e@eisentraut.org
Backpatch: 16-, where meson support was added
2025-02-04 17:56:19 -05:00
Alexander Korotkov
ff1975ddd0 pg_controldata: Fix possible errors on corrupted pg_control
Protect against malformed timestamps.  Also protect against negative WalSegSz
as it triggers division by zero:

((0x100000000UL) / (WalSegSz)) can turn into zero in

XLogFileName(xlogfilename, ControlFile->checkPointCopy.ThisTimeLineID,
             segno, WalSegSz);

because if WalSegSz is -1 then by arithmetic rules in C we get
0x100000000UL / 0xFFFFFFFFFFFFFFFFUL == 0.
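
The hazard can be reproduced in isolation (a standalone illustration,
not code from the tree):

  #include <stdint.h>
  #include <stdio.h>

  int
  main(void)
  {
      int         WalSegSz = -1;  /* as read from a corrupted pg_control */
      uint64_t    q = UINT64_C(0x100000000) / (uint64_t) WalSegSz;

      printf("%llu\n", (unsigned long long) q);   /* prints 0 */
      return 0;
  }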

Author: Ilyasov Ian <ianilyasov@outlook.com>
Author: Anton Voloshin <a.voloshin@postgrespro.ru>
Backpatch-through: 13
2025-02-05 00:45:49 +02:00
Alexander Korotkov
627d63419e Allow usage of match_orclause_to_indexcol() for joins
This commit allows transformation of OR-clauses into SAOPs for index scans
within nested loop joins.  That required the following changes.

 1. Make match_orclause_to_indexcol() and group_similar_or_args() understand
    const-ness in the same way as match_opclause_to_indexcol().  This
    generally makes our approach more uniform.
 2. Make match_join_clauses_to_index() pass OR-clauses to
    match_clause_to_index().
 3. Also switch match_join_clauses_to_index() to use list_append_unique_ptr()
    for adding clauses to *joinorclauses.  That avoids possible duplicates
    when processing the same clauses with different indexes.  Previously such
    duplicates were eliminated in match_clause_to_index(), but now
    group_similar_or_args() each time generates distinct copies of grouped
    OR clauses.

Discussion: https://postgr.es/m/CAPpHfdv%2BjtNwofg-p5z86jLYZUTt6tR17Wy00ta0dL%3DwHQN3ZA%40mail.gmail.com
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Alena Rybakina <a.rybakina@postgrespro.ru>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
2025-02-04 23:21:49 +02:00
Alexander Korotkov
23ef119f58 Revise the header comment for match_clause_to_indexcol()
Since d4378c0005e6, match_clause_to_indexcol() doesn't always return NULL
for an OR clause.  This commit reflects that in the function header comment.

Reported-by: Pavel Borisov <pashkin.elfe@gmail.com>
2025-02-04 23:18:47 +02:00
Nathan Bossart
f3e4aeb744 vacuumdb: Add missing PQfinish() calls to vacuum_one_database().
A few of the version checks in vacuum_one_database() do not call
PQfinish() before exiting.  This precedent was unintentionally
established in commit 00d1e88d36, and while it's probably not too
problematic, it seems better to properly close the connection.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/Z6JAwqN1I8ljTuXp%40nathan
Backpatch-through: 13
2025-02-04 13:26:57 -06:00
Peter Eisentraut
cc2c9fa696 sepgsql: update TAP test to use fat comma style
Adopt the style introduced by commit ce1b0f9da03 to this new test
file.

Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/87y0yv2har.fsf@wibble.ilmari.org
2025-02-04 15:51:42 +01:00
Michael Paquier
a051e71e28 Add data for WAL in pg_stat_io and backend statistics
This commit adds WAL IO stats to both pg_stat_io view and per-backend IO
statistics (pg_stat_get_backend_io()).  This change is possible since
f92c854cf406, as WAL IO is not counted in blocks in some code paths
where its stats data is measured (like WAL read in xlogreader.c).

IOContext gains IOCONTEXT_INIT and IOObject IOOBJECT_WAL, with the
following combinations allowed:
- IOOBJECT_WAL/IOCONTEXT_NORMAL is used to track I/O operations done on
already-created WAL segments.
- IOOBJECT_WAL/IOCONTEXT_INIT is used for tracking I/O operations done
when initializing WAL segments.

The core changes are done in pg_stat_io.c, backend statistics inherit
them.  Backend statistics and pg_stat_io are now available for the WAL
writer, the WAL receiver and the WAL summarizer processes.

I/O timing data is controlled by the GUC track_io_timing, like the
existing data of pg_stat_io for consistency.  The timings related to
IOOBJECT_WAL show up if the GUC is enabled (disabled by default).

Bump pgstats file version, due to the additions in IOObject and
IOContext, impacting the amount of data written for the fixed-numbered
IO stats kind in the pgstats file.

Author: Nazir Bilal Yavuz
Reviewed-by: Bertrand Drouvot, Nitin Jadhav, Amit Kapila, Michael
Paquier, Melanie Plageman, Bharath Rupireddy
Discussion: https://postgr.es/m/CAN55FZ3AiQ+ZMxUuXnBpd0Rrh1YhwJ5FudkHg=JU0P+-W8T4Vg@mail.gmail.com
2025-02-04 16:50:00 +09:00
Peter Eisentraut
622f678c10 Integrate GistTranslateCompareType() into IndexAmTranslateCompareType()
This turns GistTranslateCompareType() into a callback function of the
gist index AM instead of a standalone function.  The existing callers
are changed to use IndexAmTranslateCompareType().  This then makes
that code not hardcoded toward gist.

This means in particular that the temporal keys code is now
independent of gist.  Also, this generalizes commit 74edabce7a3, so
index access methods other than the previously hardcoded ones
could now work as REPLICA IDENTITY in a logical replication
subscriber.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Co-authored-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-02-03 10:53:18 +01:00
Tom Lane
43a15eb940 Fix incorrect range in pg_regress comment.
A comment in pg_regress incorrectly stated that alternative
output files could be named test_{i}.out with 0 < i <= 9.
However, the valid range is actually 0 <= i <= 9.
(The user-facing docs have this right already.)

Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Discussion: https://postgr.es/m/6e6c4dea-07a1-4a83-9bb7-77b9b3324c37@tantorlabs.com
2025-02-02 22:37:13 -05:00
Michael Paquier
b998fedab7 Improve comment on top of pgstat_count_io_op_time()
This commit adds more documentation to pgstat_count_io_op_time() in
pgstat_io.c, explaining its internals for pgstat_count_buffer_*(),
pgBufferUsage and the contexts where these are used.

Extracted from a larger patch by the same author.

Author: Nazir Bilal Yavuz
Discussion: https://postgr.es/m/CAN55FZ3AiQ+ZMxUuXnBpd0Rrh1YhwJ5FudkHg=JU0P+-W8T4Vg@mail.gmail.com
2025-02-03 11:19:58 +09:00
Michael Paquier
fcce828529 Fix typo in xlog.c
"recovery" is not a verb.  Introduced in 68cb5af46cd8.
2025-02-03 09:22:45 +09:00
Peter Eisentraut
c09e5a6a01 Convert strategies to and from compare types
For each Index AM, provide a mapping between operator strategies and
the system-wide generic concept of a comparison type.  For example,
for btree, BTLessStrategyNumber maps to and from COMPARE_LT.  Numerous
places in the planner and executor think directly in terms of btree
strategy numbers (and a few in terms of hash strategy numbers.)  These
should be converted over subsequent commits to think in terms of
CompareType instead.  (This commit doesn't make any use of this API
yet.)
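
For btree, the mapping boils down to something like this (a minimal
sketch of the correspondence; the committed API is a per-AM callback
rather than a standalone function like this):

  static CompareType
  btree_strategy_to_compare_type(StrategyNumber strat)
  {
      switch (strat)
      {
          case BTLessStrategyNumber:          return COMPARE_LT;
          case BTLessEqualStrategyNumber:     return COMPARE_LE;
          case BTEqualStrategyNumber:         return COMPARE_EQ;
          case BTGreaterEqualStrategyNumber:  return COMPARE_GE;
          case BTGreaterStrategyNumber:       return COMPARE_GT;
          default:                            return COMPARE_INVALID;
      }
  }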

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-02-02 10:26:04 +01:00
Peter Eisentraut
119fc30dd5 Move CompareType to separate header file
We'll want to make use of it in more places, and we'd prefer to not
have to include all of primnodes.h everywhere.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-02-02 08:11:57 +01:00
Michael Paquier
d61b9662b0 Mention jsonlog in description of logging_collector in GUC table
The description of logging_collector mentioned only stderr and csvlog,
forgetting about jsonlog.  Oversight in dc686681e079, which added support
for jsonlog in log_destination.

While on it, the description in the GUC table is tweaked to be more
consistent with the documentation and postgresql.conf.sample.

Author: Umar Hayat
Reviewed-by: Ashutosh Bapat, Tom Lane
Discussion: https://postgr.es/m/CAD68Dp1K_vBYqBEukHw=1jF7e76t8aszGZTFL2ugi=H7r=a7MA@mail.gmail.com
Backpatch-through: 13
2025-02-02 11:31:21 +09:00
Peter Eisentraut
43493cceda Add get_opfamily_name() function
This refactors and simplifies various existing code to make use of the
new function.

Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-02-01 10:42:58 +01:00
Peter Eisentraut
a5709b5bb2 Rename GistTranslateStratnum() to GistTranslateCompareType()
Follow up to commit 630f9a43cec.  The previous name had become
confusing, because it doesn't actually translate a strategy number but
a CompareType into a strategy number.  We might add the inverse at
some point, which would then probably be called something like
GistTranslateStratnum.

Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-02-01 10:18:46 +01:00
Peter Eisentraut
2452e71ff2 Add script to keep .editorconfig in sync with .gitattributes
Our repo already contained an .editorconfig file, but it was not kept
up to date with .gitattributes.  This adds a script that keeps these
files in sync.  A big advantage of the editorconfig file is that many
editors/IDEs get automatically configured to trim trailing whitespace
and add a final newline on save, while .gitattributes only
complains about these problems instead of automatically fixing them.

This also adds rules to .gitattributes for Python files as well as for
C files in pg_bsd_indent directory (which have a different tab_width
than most C files due to being vendored in).

Author: Jelte Fennema-Nio <github-tech@jeltef.nl>
Discussion: https://www.postgresql.org/message-id/flat/CAGECzQQGzbroAXi+Yicp3HvcCo4=g84kaOgjuvQ5MW9F0ubOGg@mail.gmail.com
2025-02-01 10:09:45 +01:00
Amit Langote
79e872fedb Add commit 76aa615943 to .git-blame-ignore-revs 2025-02-01 16:48:18 +09:00
Tom Lane
53a4936505 Doc: add commentary about cowboy assignment of maintenance_work_mem.
Whilst working on commit 041e8b95b I happened to notice that
parallel_vacuum_main() assigns directly to the maintenance_work_mem
GUC.  This is definitely not per project conventions, so I tried to
fix it to use SetConfigOption().  But that fails with "parameter
cannot be set during a parallel operation".  It doesn't seem worth
working on a cleaner answer, at least not till we have a few more
instances of similar problems.  But add some commentary, just so
nobody gets the idea that this is an approved way to set a GUC.
2025-01-31 15:17:15 -05:00
Tom Lane
d4c3a6b8ad Remove obsolete restriction on the range of log_rotation_size.
When syslogger.c was first written, we didn't want to assume that
all platforms have 64-bit ftello.  But we've been assuming that
since v13 (cf commit 799d22461), so let's use that in syslogger.c
and allow log_rotation_size to range up to INT_MAX kilobytes.

The old code effectively limited log_rotation_size to 2GB regardless
of platform.  While nobody's complained, that doesn't seem too far
away from what might be thought reasonable these days.

I noticed this while searching for instances of "1024L" in connection
with commit 041e8b95b.  These were the last such instances.
(We still have instances of L-suffixed literals, but most of them
are associated with wait intervals for pg_usleep or similar functions.
I don't see any urgent reason to change that.)
2025-01-31 14:36:56 -05:00
Tom Lane
041e8b95b8 Get rid of our dependency on type "long" for memory size calculations.
Consistently use "Size" (or size_t, or in some places int64 or double)
as the type for variables holding memory allocation sizes.  In most
places variables' data types were fine already, but we had an ancient
habit of computing bytes from kilobytes-units GUCs with code like
"work_mem * 1024L".  That risks overflow on Win64 where they did not
make "long" as wide as "size_t".  We worked around that by restricting
such GUCs' ranges, so you couldn't set work_mem et al higher than 2GB
on Win64.  This patch removes that restriction, after replacing such
calculations with "work_mem * (Size) 1024" or variants of that.

It should be noted that this patch was constructed by searching
outwards from the GUCs that have MAX_KILOBYTES as upper limit.
So I can't positively guarantee there are no other places doing
memory-size arithmetic in int or long variables.  I do however feel
pretty confident that increasing MAX_KILOBYTES on Win64 is safe now.
Also, nothing in our code should be dealing in multiple-gigabyte
allocations without authorization from a relevant GUC, so it seems
pretty likely that this search caught everything that could be at
risk of overflow.

Author: Vladlen Popolitov <v.popolitov@postgrespro.ru>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1a01f0-66ec2d80-3b-68487680@27595217
2025-01-31 13:52:40 -05:00
Daniel Gustafsson
f8d8581ed8 require_auth: prepare for multiple SASL mechanisms
Prior to this patch, the require_auth implementation assumed that
the AuthenticationSASL protocol message was using SCRAM-SHA-256.
In preparation for future SASL mechanisms, like OAUTHBEARER, split
the implementation into two tiers: the first checks the acceptable
AUTH_REQ_* codes, and the second checks acceptable mechanisms if
AUTH_REQ_SASL et.al are permitted.

conn->allowed_sasl_mechs contains a list of pointers to acceptable
mechanisms, and pg_SASL_init() will bail if the selected mechanism
isn't contained in this array.

Since there's only one mechanism supported right now, one branch
of the second tier cannot be exercised yet and is protected by an
Assert(false) call.  This assertion will need to be removed when
the next mechanism is added.

This patch is extracted from a larger body of work aimed at adding
support for OAUTHBEARER in libpq.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/CAOYmi+kJqzo6XsR9TEhvVfeVNQ-TyFM5LATypm9yoQVYk=4Wrw@mail.gmail.com
2025-01-31 15:47:28 +01:00
Daniel Gustafsson
e21d6f2971 Move PG_MAX_AUTH_TOKEN_LENGTH to libpq/auth.h
Future SASL mechanisms, like OAUTHBEARER, will use this as a limit on
token messages coming from the client, so promote it to the header
file to make it available.

This patch is extracted from a larger body of work aimed at adding
support for OAUTHBEARER in libpq.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/CAOYmi+kJqzo6XsR9TEhvVfeVNQ-TyFM5LATypm9yoQVYk=4Wrw@mail.gmail.com
2025-01-31 15:39:35 +01:00
Daniel Gustafsson
59d6c03956 doc: Fix pg_buffercache_evict() title
Use <function> rather than <structname> in the <title> to be consistent
with how other functions in this module are documented. Also suffix the
function name with () for consistency.

Backpatch to v17 where pg_buffercache_evict was introduced.

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAExHW5uKWH8CuZc9NCb8XxSQc6uzvACV0cScebm54kF763ERAw@mail.gmail.com
Backpatch-through: 17
2025-01-31 10:44:21 +01:00
Amit Langote
76aa615943 Fix bad indentation introduced in commit d47cbf474
Per buildfarm member koel
2025-01-31 16:44:24 +09:00
Amit Langote
d47cbf474e Perform runtime initial pruning outside ExecInitNode()
This commit builds on the prior change that moved PartitionPruneInfos
out of individual plan nodes into a list in PlannedStmt, making it
possible to initialize PartitionPruneStates without traversing the
plan tree and perform runtime initial pruning before ExecInitNode()
initializes the plan trees.  These tasks are now handled in a new
routine, ExecDoInitialPruning(), which is called by InitPlan()
before calling ExecInitNode() on various plan trees.

ExecDoInitialPruning() performs the initial pruning and saves the
result -- a Bitmapset of indexes for surviving child subnodes -- in
es_part_prune_results, a list in EState.

PartitionPruneStates created for initial pruning are stored in
es_part_prune_states, another list in EState, for later use during
exec pruning. Both lists are parallel to es_part_prune_infos, which
holds the PartitionPruneInfos from PlannedStmt, enabling shared
indexing.

PartitionPruneStates initialized in ExecDoInitialPruning() now
include only the PartitionPruneContexts for initial pruning steps.
Exec pruning contexts are initialized later in
ExecInitPartitionExecPruning() when the parent plan node is
initialized, as the exec pruning step expressions depend on the parent
node's PlanState.

The existing function PartitionPruneFixSubPlanMap() has been
repurposed for this initialization, to avoid duplicating a similar loop
structure for finding the PartitionedRelPruningData whose exec pruning
contexts need to be initialized.  It has been renamed to
InitExecPruningContexts() to reflect its new primary responsibility.
The original logic to "fix subplan maps" remains intact but is now
encapsulated within the renamed function.

This commit removes two obsolete Asserts in partkey_datum_from_expr().
The ExprContext used for pruning expression evaluation is now
independent of the parent PlanState, making these Asserts unnecessary.

By centralizing pruning logic and decoupling it from the plan
initialization step (ExecInitNode()), this change sets the stage for
future patches that will use the result of initial pruning to
save the overhead of redundant processing for pruned partitions.

Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
2025-01-31 15:47:15 +09:00
Amit Kapila
f41d8468dd Raise an error while trying to acquire an invalid slot.
Once a replication slot is invalidated, it cannot be altered or used to
fetch changes. However, a process could still acquire an invalid slot and
fail later.

For example, if a process acquires a logical slot that was invalidated due
to wal_removed, it will eventually fail in CreateDecodingContext() when
attempting to access the removed WAL. Similarly, for physical replication
slots, even if the slot is invalidated and invalidation_reason is set to
wal_removed, the walsender does not currently check for invalidation when
starting physical replication. Instead, replication starts, and an error
is only reported later while trying to access WAL.  Similarly, we prohibit
modifying slot properties for invalid slots, but currently raise that error
only after the slot has been acquired.

This patch improves error handling by detecting invalid slots earlier, at
the time of slot acquisition, which is the first step.  This also helps to
unify the different ERROR messages raised in different places into a
consistent message for invalid slots.  This means that the message for
invalid slots will change to a generic one.

This will also be helpful for future patches that plan to invalidate slots
for more reasons, such as idle_timeout, because we will not have to modify
multiple places for such cases and risk missing one of them.

Author: Nisha Moond <nisha.moond412@gmail.com>
Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CABdArM6pBL5hPnSQ+5nEVMANcF4FCH7LQmgskXyiLY75TMnKpw@mail.gmail.com
2025-01-31 10:27:35 +05:30
Michael Paquier
a632cd354d injection_points: Add routine able to drop all stats
This serves as an example of how to use the new function introduced in
ce5c620fb625, pgstat_drop_matching_entries(), with a callback able to
filter the entries dropped.

A SQL function named injection_points_stats_drop() is added with some
tests.

Author: Lukas Fitti
Discussion: https://postgr.es/m/CAP53PkwuFbo3NkwZgxwNRMjMfqPEqidD-SggaoQ4ijotBVLJAA@mail.gmail.com
2025-01-31 12:41:39 +09:00
Michael Paquier
ce5c620fb6 Add pgstat_drop_matching_entries() to pgstats
This allows users of the cumulative statistics to drop entries in the
shared hash stats table, deleting as well local references.  Callers of
this function can optionally define a callback able to filter which
entries to drop, similarly to pgstat_reset_matching_entries() with its
callback do_reset().

pgstat_drop_all_entries() is refactored so that it uses this new function.

Author: Lukas Fitti
Discussion: https://postgr.es/m/CAP53PkwuFbo3NkwZgxwNRMjMfqPEqidD-SggaoQ4ijotBVLJAA@mail.gmail.com
2025-01-31 12:27:19 +09:00
Michael Paquier
1e380fa7d8 Fix comment of StrategySyncStart()
The top comment of StrategySyncStart() mentions BufferSync(), but this
function calls BgBufferSync(), not BufferSync().

Oversight in 9cd00c457e6a.

Author: Ashutosh Bapat
Discussion: https://postgr.es/m/CAExHW5tgkjag8i-s=RFrCn5KAWDrC4zEPPkfUKczfccPOxBRQQ@mail.gmail.com
Backpatch-through: 13
2025-01-31 11:05:57 +09:00
Tom Lane
b9d232b9de Use "ssize_t" not "long" in max_stack_depth-related code.
This change adapts these functions to the machine's address width
without depending on "long" to be the right size.  (It isn't on
Win64, for example.)  While it seems unlikely anyone would care
to run with a stack depth limit exceeding 2GB, this is part of a
general push to avoid using type "long" to represent memory sizes.

It's convenient to use ssize_t rather than the perhaps-more-obvious
choice of size_t/Size, because the code involved depends on working
with a signed data type.  Our MAX_KILOBYTES limit already ensures
that ssize_t will be sufficient to represent the maximum value of
max_stack_depth.

Extracted from a larger patch by Vladlen, plus additional hackery
by me.

Author: Vladlen Popolitov <v.popolitov@postgrespro.ru>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1a01f0-66ec2d80-3b-68487680@27595217
2025-01-30 16:44:47 -05:00
Tom Lane
b9aa4166fa Avoid integer overflow while testing wal_skip_threshold condition.
smgrDoPendingSyncs had two distinct risks of integer overflow while
deciding which way to ensure durability of a newly-created relation.
First, it accumulated the total size of all forks in a variable of
type BlockNumber (uint32).  While we restrict an individual fork's
size to fit in that, I don't believe there's such a restriction on
all of them added together.  Second, it proceeded to multiply the
sum by BLCKSZ, which most certainly could overflow a uint32.

(The exact expression is total_blocks * BLCKSZ / 1024.  The
compiler might choose to optimize that to total_blocks * 8,
which is not at quite as much risk of overflow as a literal
reading would be, but it's still wrong.)

If an overflow did occur it could lead to a poor choice to
shove a very large relation into WAL instead of fsync'ing it.
This wouldn't be fatal, but it could be inefficient.

Change total_blocks to uint64 which should be plenty, and
rearrange the comparison calculation to be overflow-safe.
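
A sketch of the overflow-safe shape (illustrative only, with simplified
variable handling):

  uint64      total_blocks = 0;   /* sum of all forks' sizes, in blocks */

  /* ... accumulate the nblocks of every fork of every pending rel ... */

  /* compare in blocks so that no product of uint32 values is formed */
  if (total_blocks >= (uint64) wal_skip_threshold * 1024 / BLCKSZ)
  {
      /* big enough: sync the files instead of WAL-logging their contents */
  }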

I noticed this while looking for ramifications of the proposed
change in MAX_KILOBYTES.  It's not entirely clear to me why
wal_skip_threshold is limited to MAX_KILOBYTES in the
first place, but in any case this code is unsafe regardless
of the range of wal_skip_threshold.

Oversight in c6b92041d which introduced wal_skip_threshold,
so back-patch to v13.

Discussion: https://postgr.es/m/1a01f0-66ec2d80-3b-68487680@27595217
Backpatch-through: 13
2025-01-30 15:36:44 -05:00
Melanie Plageman
a5358c14b2 Move BitmapTableScan per-scan setup into a helper
Add BitmapTableScanSetup(), a helper which contains all of the code that
must be done on every scan of the table in a bitmap table scan. This
includes scanning the index, building the bitmap, and setting up the
scan descriptors.

Pushing this setup into a helper function makes BitmapHeapNext() more
readable.

Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ1vXu%2BZdT0_MM-i1vbTdfHHf0KR3cK6R5gs6dNNNpyrJw%40mail.gmail.com
2025-01-30 15:28:33 -05:00
Tom Lane
115a365519 Simplify executor's handling of CaseTestExpr & CoerceToDomainValue.
Instead of deciding at runtime whether to read from casetest.value
or caseValue_datum, split EEOP_CASE_TESTVAL into two opcodes and
make the decision during expression compilation.  Similarly for
EEOP_DOMAIN_TESTVAL.  This actually results in net less code,
mainly because llvmjit_expr.c's code for handling these opcodes
gets shorter.  The performance gain is doubtless negligible, but
this seems worth changing anyway on grounds of simplicity and
understandability.

Author: Andreas Karlsson <andreas@proxel.se>
Co-authored-by: Xing Guo <higuoxing@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACpMh+AiBYAWn+D1aU7Rsy-V1tox06Cbc0H3qA7rwL5zdJ=anQ@mail.gmail.com
2025-01-30 13:21:42 -05:00
Amit Kapila
6252b1eaf8 Doc: Generated column replication.
Commit 7054186c4e added the support to publish generated stored columns.
This patch adds detailed documentation for that feature.

Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/B80D17B2-2C8E-4C7D-87F2-E5B4BE3C069E%40gmail.com
Discussion: https://postgr.es/m/CAHut+PsYmAvKhUjA1AaR1rxLdeSBKiBko8wKyf4_H8nEEqDuOg@mail.gmail.com
2025-01-30 11:09:18 +05:30
Amit Langote
bb3ec16e14 Move PartitionPruneInfo out of plan nodes into PlannedStmt
This moves PartitionPruneInfo from plan nodes to PlannedStmt,
simplifying traversal by centralizing all PartitionPruneInfo
structures in a single list in it, which holds all instances for the
main query and its subqueries. Instead of plan nodes (Append or
MergeAppend) storing PartitionPruneInfo pointers, they now reference
an index in this list.
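
The node-level change amounts to roughly the following shape (the field
names here only illustrate the description above and should be treated
as illustrative rather than authoritative):

  /* PlannedStmt gains one flat list covering the whole plan tree */
  List   *partPruneInfos;     /* all PartitionPruneInfo structures */

  /* Append/MergeAppend reference an element of that list by index */
  int     part_prune_index;   /* index into partPruneInfos, or -1 */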

A bitmapset field is added to PartitionPruneInfo to store the RT
indexes corresponding to the apprelids field in Append or MergeAppend.
This allows execution pruning logic to verify that it operates on the
correct plan node, mainly to facilitate debugging.

Duplicated code in set_append_references() and
set_mergeappend_references() is refactored into a new function,
register_pruneinfo().  This updates RT indexes by applying rtoffset
and adds PartitionPruneInfo to the global list in PlannerGlobal.

By allowing pruning to be performed without traversing the plan tree,
this change lays the groundwork for runtime initial pruning to occur
independently of plan tree initialization.

Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> (earlier version)
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
2025-01-30 11:57:32 +09:00
Tom Lane
ba0da16bd0 Require callers of coerce_to_domain() to supply base type/typmod.
In view of the issue fixed in commit 0da39aa76, it no longer seems
like a great idea for coerce_to_domain() to offer to perform a lookup
that its caller probably should have done already.  The caller should
be providing a value of the domain's base type, so it's hard to
envision a valid case where it hasn't looked up that type.  After
0da39aa76 there is only one caller using the option for internal
lookup, and that one can trivially be rearranged to not do that.
So this seems more like a bug-encouraging misfeature than a useful
shortcut; let's get rid of it (in HEAD only, there's no need to
break any external callers in back branches).

Discussion: https://postgr.es/m/1865579.1738113656@sss.pgh.pa.us
2025-01-29 15:42:25 -05:00
Tom Lane
0da39aa766 Handle default NULL insertion a little better.
If a column is omitted in an INSERT, and there's no column default,
the code in preptlist.c generates a NULL Const to be inserted.
Furthermore, if the column is of a domain type, we wrap the Const
in CoerceToDomain, so as to throw a run-time error if the domain
has a NOT NULL constraint.  That's fine as far as it goes, but
there are two problems:

1. We're being sloppy about the type/typmod that the Const is
labeled with.  It really should have the domain's base type/typmod,
since it's the input to CoerceToDomain not the output.  This can
result in coerce_to_domain inserting a useless length-coercion
function (useless because it's being applied to a null).  The
coercion would typically get const-folded away later, but it'd
be better not to create it in the first place.

2. We're not applying expression preprocessing (specifically,
eval_const_expressions) to the resulting expression tree.
The planner's primary expression-preprocessing pass already happened,
so that means the length coercion step and CoerceToDomain node miss
preprocessing altogether.

This is at the least inefficient, since it means the length coercion
and CoerceToDomain will actually be executed for each inserted row,
though they could be const-folded away in most cases.  Worse, it
seems possible that missing preprocessing for the length coercion
could result in an invalid plan (for example, due to failing to
perform default-function-argument insertion).  I'm not aware of
any live bug of that sort with core datatypes, and it might be
unreachable for extension types as well because of restrictions of
CREATE CAST, but I'm not entirely convinced that it's unreachable.
Hence, it seems worth back-patching the fix (although I only went
back to v14, as the patch doesn't apply cleanly at all in v13).

There are several places in the rewriter that are building null
domain constants the same way as preptlist.c.  While those are
before the planner and hence don't have any reachable bug, they're
still applying a length coercion that will be const-folded away
later, uselessly wasting cycles.  Hence, make a utility routine
that all of these places can call to do it right.

Making this code more careful about the typmod assigned to the
generated NULL constant has visible but cosmetic effects on some
of the plans shown in contrib/postgres_fdw's regression tests.

Discussion: https://postgr.es/m/1865579.1738113656@sss.pgh.pa.us
Backpatch-through: 14
2025-01-29 15:31:55 -05:00
Tom Lane
6cddecdfb0 Avoid breaking SJIS encoding while de-backslashing Windows paths.
When running on Windows, canonicalize_path() converts '\' to '/'
to prevent confusing the Windows command processor.  It was
doing that in a non-encoding-aware fashion; but in SJIS there
are valid two-byte characters whose second byte matches '\'.
So encoding corruption ensues if such a character is used in
the path.

We can fairly easily fix this if we know which encoding is
in use, but a lot of our utilities don't have much of a clue
about that.  After some discussion we decided we'd settle for
fixing this only in psql, and assuming that its value of
client_encoding matches what the user is typing.
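
Conceptually, the conversion then needs to advance by whole characters,
along these lines (a sketch, not the psql code; "encoding" stands for
the client_encoding ID):

  for (char *p = path; *p != '\0'; p += PQmblen(p, encoding))
  {
      if (PQmblen(p, encoding) == 1 && *p == '\\')
          *p = '/';       /* only rewrite true single-byte backslashes */
  }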

It seems hopeless to get the server to deal with the problematic
characters in database path names, so we'll just declare that
case to be unsupported.  That means nothing need be done in
the server, nor in utility programs whose only contact with
file path names is for database paths.  But psql frequently
deals with client-side file paths, so it'd be good if it
didn't mess those up.

Bug: #18735
Reported-by: Koichi Suzuki <koichi.suzuki@enterprisedb.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Koichi Suzuki <koichi.suzuki@enterprisedb.com>
Discussion: https://postgr.es/m/18735-4acdb3998bb9f2b1@postgresql.org
Backpatch-through: 13
2025-01-29 14:24:36 -05:00
Tom Lane
f6ff75f796 Make BufferIsExclusiveLocked and BufferIsDirty work for local buffers.
These functions tried to check the state of the buffer's content lock
even for local buffers.  Since we don't use the content lock for a
local buffer, that would lead to a "false" result from
LWLockHeldByMeInMode, which would mean a misleading "false" answer
from BufferIsExclusiveLocked (we'd rather that case always return
"true") or an assertion failure in BufferIsDirty.

The core code never applies these two functions to local buffers,
and apparently no extensions do either, since we've not heard
complaints.  Still, in the name of future-proofing, let's fix
them to act as though a pinned local buffer is content-locked.

Author: Srinath Reddy <srinath2133@gmail.com>
Discussion: https://postgr.es/m/19396ef77f8.1098c4a1810508.2255483659262451647@zohocorp.com
2025-01-29 13:23:31 -05:00
John Naylor
128897b101 Fix grammatical typos around possessive "its"
Some places spelled it "it's", which is short for "it is".
In passing, fix a couple other nearby grammatical errors.

Author: Jacob Brazeal <jacob.brazeal@gmail.com>
Discussion: https://postgr.es/m/CA+COZaAO8g1KJCV0T48=CkJMjAnnfTGLWOATz+2aCh40c2Nm+g@mail.gmail.com
2025-01-29 14:39:14 +07:00
John Naylor
235328ee4a Revert "Speed up tail processing when hashing aligned C strings, take two"
This reverts commit a365d9e2e8c1ead27203a4431211098292777d3b.

Older versions of Valgrind raise an error, so go back to the bytewise
loop for the final word in the input.

Reported-by: Anton A. Melnikov <a.melnikov@postgrespro.ru>
Discussion: https://postgr.es/m/a3a959f6-14b8-4819-ac04-eaf2aa2e868d@postgrespro.ru
Backpatch-through: 17
2025-01-29 13:35:43 +07:00
Michael Paquier
4f071349c0 Improve test coverage of network address functions
The following functions were not covered by any tests:
- abbrev(inet)
- set_masklen(cidr)
- set_masklen(inet)
- netmask(inet)
- hostmask(inet)

While at it, this improves the output of some of the existing queries in
the inet test to use better aliases.

Author: Aleksander Alekseev
Reviewed-by: Jacob Champion, Keisuke Kuroda, Tom Lane
Discussion: https://postgr.es/m/CAJ7c6TOyZ9bGNrDK6Z3Q0gr9ow8ZpOm+=+01mpE0dsdH4C+u9A@mail.gmail.com
2025-01-29 08:49:48 +09:00
Amit Kapila
75eb9766ec Rename pubgencols_type to pubgencols in pg_publication.
The column added in commit e65dbc9927, pubgencols_type, was inconsistent
with the naming conventions of other columns in the pg_publication
catalog.

Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/CALDaNm1u-ufVOW-RUsXSooqzkpohxfZYy=z78fbcr_9Pq5hbCg@mail.gmail.com
2025-01-28 10:42:46 +05:30
Michael Paquier
30a6ed0ce4 Track per-relation cumulative time spent in [auto]vacuum and [auto]analyze
This commit adds four fields to the statistics of relations, aggregating
the amount of time spent for each operation on a relation:
- total_vacuum_time, for manual vacuum.
- total_autovacuum_time, for vacuum done by the autovacuum daemon.
- total_analyze_time, for manual analyze.
- total_autoanalyze_time, for analyze done by the autovacuum daemon.

This gives users the option to derive the average time spent for these
operations with the help of the related "count" fields.

Bump catalog version (for the catalog changes) and PGSTAT_FILE_FORMAT_ID
(for the additions in PgStat_StatTabEntry).

Author: Sami Imseih
Reviewed-by: Bertrand Drouvot, Michael Paquier
Discussion: https://postgr.es/m/CAA5RZ0uVOGBYmPEeGF2d1B_67tgNjKx_bKDuL+oUftuoz+=Y1g@mail.gmail.com
2025-01-28 09:57:32 +09:00
Peter Eisentraut
5afaba6297 doc: Meson is not experimental on Windows
The installation documentation stated that using Meson is
experimental.  But since this is the only way to build using Visual
Studio on Windows, this would imply that that whole build procedure is
experimental, which isn't true.  So qualify this statement a bit more.
We keep the statement that Meson is experimental on other platforms,
since it doesn't have full, confirmed feature parity with the make
build system.

Author: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://www.postgresql.org/message-id/flat/a3e76618-4cb5-4d54-a71c-da4fb8ba571b@eisentraut.org
2025-01-27 12:02:00 +01:00
Michael Paquier
65281391a9 Print out error position for some ALTER TABLE ALTER COLUMN type
A ParseState has existed in ATPrepAlterColumnType() since its introduction
in 077db40fa1f3, but it has never relied on a query string that could be
used to point at a location in the original string on error.

The output of some regression tests are updated, showing the error
location where applicable.  Six error strings are upgraded with the
error location.

Author: Jian He
Discussion: https://postgr.es/m/CACJufxGfbPfWLjcEz33G9eW_epDW0UDi2H05i9eSTPKGJ4rxSA@mail.gmail.com
2025-01-27 13:51:23 +09:00
Michael Paquier
14793f4719 pg_amcheck: Fix test failure on Windows with non-existing role
For SSPI auth extra users need to be explicitly allowed, or we get
"SSPI authentication failed" instead of the expected "role does not
exist" error.

This report also means that the test has never worked on Windows since
its introduction in 9706092839db, because it has always bumped on an
authentication failure rather than an error about the role not existing.

Oversight in eef4a33f62f7, which added a pattern check on the error
generated by the command.

Per report from Tom Lane, via buildfarm member drongo.

Author: Dagfinn Ilmari Mannsåker
Reviewed-by: Andrew Dunstan
Discussion: https://postgr.es/m/379085.1737734611@sss.pgh.pa.us
2025-01-27 08:00:19 +09:00
Noah Misch
2f12df7eb4 Test postmaster with program_options_handling_ok() et al.
Most executables already get that testing.  To occupy the customary
001_basic.pl name, this renumbers the new-in-October tests of
src/test/postmaster/t.

Reviewed by Thomas Munro.

Discussion: https://postgr.es/m/20241215022701.a1.nmisch@google.com
2025-01-26 09:39:05 -08:00
Álvaro Herrera
0a16c8326c
Add missing CommandCounterIncrement
For commit b663b9436e75 I thought this was useless, but turns out not to
be for the case where a partitioned table has two identical foreign key
constraints which can both be matched by the same constraint in a
partition during attach.  This CCI makes the match search for the second
constraint in the parent ignore the constraint in the child that has
already been matched by the first constraint in the parent.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/c599253c-1ccd-4161-80fc-c9065e037a09@gmail.com
2025-01-26 17:34:28 +01:00
Noah Misch
d28cd3e7b2 At update of non-LP_NORMAL TID, fail instead of corrupting page header.
The right mix of DDL and VACUUM could corrupt a catalog page header such
that PageIsVerified() durably fails, requiring a restore from backup.
This affects only catalogs that both have a syscache and have DDL code
that uses syscache tuples to construct updates.  One of the test
permutations shows a variant not yet fixed.

This makes !TransactionIdIsValid(TM_FailureData.xmax) possible with
TM_Deleted.  I think core and PGXN are indifferent to that.

Per bug #17821 from Alexander Lakhin.  Back-patch to v13 (all supported
versions).  The test case is v17+, since it uses INJECTION_POINT.

Discussion: https://postgr.es/m/17821-dd8c334263399284@postgresql.org
2025-01-25 11:28:14 -08:00
Noah Misch
81772a495e Merge copies of converting an XID to a FullTransactionId.
Assume twophase.c is the performance-sensitive caller, and preserve its
choice of unlikely() branch hint.  Add some retrospective rationale for
that choice.  Back-patch to v17, for the next commit to use it.
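
The shared logic follows the usual conversion pattern (a sketch of the
idea; the merged helper's actual name and location are not shown here):

  FullTransactionId   nextfxid = ReadNextFullTransactionId();
  TransactionId       nextxid = XidFromFullTransactionId(nextfxid);
  uint32              epoch = EpochFromFullTransactionId(nextfxid);

  /* an xid numerically beyond nextxid must be from the previous epoch */
  if (unlikely(xid > nextxid))
      epoch--;

  return FullTransactionIdFromEpochAndXid(epoch, xid);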

Reviewed (in earlier versions) by Michael Paquier.

Discussion: https://postgr.es/m/17821-dd8c334263399284@postgresql.org
Discussion: https://postgr.es/m/20250116010051.f3.nmisch@google.com
2025-01-25 11:28:14 -08:00
Noah Misch
4f6ec3831d Disable runningcheck for src/test/modules/injection_points/specs.
Directory "injection_points" has specified NO_INSTALLCHECK since before
commit c35f419d6efbdf1a050250d84b687e6705917711 added the specs, but
that commit neglected to disable the corresponding meson runningcheck.
The alternative would be to enable "make installcheck" for ISOLATION,
but the GNU make build system lacks a concept of setting NO_INSTALLCHECK
for REGRESS without also setting it for ISOLATION.  Back-patch to v17,
where that commit first appeared, to avoid surprises when back-patching
additional specs.

Discussion: https://postgr.es/m/17821-dd8c334263399284@postgresql.org
2025-01-25 11:28:14 -08:00
Noah Misch
7819a25cd1 Test ECPG decadd(), decdiv(), decmul(), and decsub() for risnull() input.
Since commit 757fb0e5a9a61ac8d3a67e334faeea6dc0084b3f, these
Informix-compat functions return 0 without changing the output
parameter.  Initialize the output parameter before the test call, making
that obvious.  Before this, the expected test output had been depending
on freed stack memory.  "gcc -ftrivial-auto-var-init=pattern" revealed
that.  Back-patch to v13 (all supported versions).

Discussion: https://postgr.es/m/20250106192748.cf.nmisch@google.com
2025-01-25 11:28:14 -08:00
Tom Lane
d83a108c10 Doc: recommend "psql -X" for restoring pg_dump scripts.
This practice avoids possible problems caused by non-default psql
options, such as disabling AUTOCOMMIT.

Author: Shinya Kato <Shinya11.Kato@oss.nttdata.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://postgr.es/m/96ff23a5d858ff72ca8e823a014d16fe@oss.nttdata.com
Backpatch-through: 13
2025-01-25 12:42:22 -05:00
Andres Freund
87a6690cc6 Change shutdown sequence to terminate checkpointer last
The main motivation for this change is to have a process that can serialize
stats after all other processes have terminated. Serializing stats already
happens in checkpointer, even though walsenders can be active longer.

The only reason the current shutdown sequence does not actively cause problems
is that walsender currently does not generate any stats. However, there is an
upcoming patch changing that.

Another need for this change originates in the AIO patchset, where IO
workers (which, in some edge cases, can emit stats of their own) need to run
while the shutdown checkpoint is being written.

This commit changes the shutdown sequence so checkpointer is signalled (via
SIGINT) to trigger writing the shutdown checkpoint without also causing
checkpointer to exit.  Once checkpointer has written the shutdown checkpoint it
notifies postmaster via PMSIGNAL_XLOG_IS_SHUTDOWN and waits for the
termination signal (SIGUSR2, as before).  Checkpointer is now terminated after
all children, other than dead-end children and logger, have been terminated,
tracked using the new PM_WAIT_CHECKPOINTER PMState.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-25 11:37:13 -05:00
Tom Lane
04ace176e0 Tighten pg_restore's recognition of its -F (format) option values.
Instead of checking just the first letter, match the whole string
using pg_strcasecmp.  Per the documentation, we allow either just
the first letter (e.g. "c") or the whole name ("custom"); but we
will no longer accept random variations such as "chump".  This
matches pg_dump's longstanding parsing code for the same option.

Also for consistency with pg_dump, recognize "p"/"plain".  We don't
support it, but we can give a more helpful error message than
"unrecognized archive format".

Author: Srinath Reddy <srinath2133@gmail.com>
Discussion: https://postgr.es/m/CAFC+b6pfK-BGcWW1kQmtxVrCh-JGjB2X02rLPQs_ZFaDGjZDsQ@mail.gmail.com
2025-01-25 11:24:16 -05:00
Jeff Davis
d2ca16bb50 Fix PDF doc build.
Reported-by: Tom Lane
Discussion: https://postgr.es/m/608525.1737781222@sss.pgh.pa.us
2025-01-25 00:12:30 -08:00
Tomas Vondra
38273b5f83 Use the correct sizeof() in BufFileLoadBuffer
The sizeof() call should reference buffer.data, because that's the
buffer we're reading data into, not the whole PGAlignedBuffer union.
This was introduced by 44cac93464, which replaced the simple buffer
with a PGAlignedBuffer field.

It's benign, because the buffer is the largest field of the union, so
the sizes are the same. But it's easy to trip over this in a patch, so
fix and backpatch. Commit 44cac93464 went into 12, but that's EOL.

Backpatch-through: 13
Discussion: https://postgr.es/m/928bdab1-6567-449f-98c4-339cd2203b87@vondra.me
2025-01-25 02:12:59 +01:00
Jeff Davis
bfc5992069 Add SQL function CASEFOLD().
Useful for caseless matching. Similar to LOWER(), but avoids edge-case
problems with using LOWER() for caseless matching.

For collations that support it, CASEFOLD() handles characters with
more than two case variations or multi-character case variations. Some
characters may fold to uppercase. The results of case folding are also
more stable across Unicode versions than LOWER() or UPPER().
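
A rough sketch of intended usage, not text from the commit itself; it
assumes the applicable collation's provider supports case folding (for
example the builtin PG_UNICODE_FAST locale or ICU), and the table and
column names are invented:

  SELECT casefold('Straße') = casefold('STRASSE');          -- caseless match
  SELECT * FROM users
    WHERE casefold(email) = casefold('Alice@Example.COM');

The point is to compare folded values rather than LOWER()'d ones.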

Discussion: https://postgr.es/m/a1886ddfcd8f60cb3e905c93009b646b4cfb74c5.camel%40j-davis.com
Reviewed-by: Ian Lawrence Barwick
2025-01-24 14:56:22 -08:00
Andres Freund
f15538cd27 postmaster: Adjust which processes we expect to have exited
Comments and code stated that we expect checkpointer to have been signalled in
case of immediate shutdown / fatal errors, but didn't treat archiver and
walsenders the same. That doesn't seem right.

I had started digging through the history to see where this oddity was
introduced, but it's not the fault of a single commit.

Instead treat archiver, checkpointer, and walsenders the same.

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-24 17:08:33 -05:00
Andres Freund
463a2ebd9f postmaster: Commonalize FatalError paths
This includes some behavioral changes:

- Previously PM_WAIT_XLOG_ARCHIVAL wasn't handled in HandleFatalError(),
  which doesn't seem quite right.

- Previously a fatal error in PM_WAIT_XLOG_SHUTDOWN led to jumping back to
  PM_WAIT_BACKENDS; now we go to PM_WAIT_DEAD_END. Jumping backwards doesn't
  seem quite right, and we didn't do so when checkpointer failed to fork during
  a shutdown.

- Previously a checkpointer fork failure didn't call SetQuitSignalReason(),
  which would lead to quickdie() reporting
  "terminating connection because of unexpected SIGQUIT signal"
  which seems even worse than the PMQUIT_FOR_CRASH message. If I saw that in
  the log I'd suspect somebody outside of postgres had sent SIGQUITs.

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-24 17:08:31 -05:00
Andres Freund
8edd8c77c8 postmaster: Move code to switch into FatalError state into function
There are two places switching to FatalError mode, behaving somewhat
differently. An upcoming commit will introduce a third. That doesn't seem
like a good idea.

This commit just moves the FatalError related code from HandleChildCrash()
into its own function; a subsequent commit will evolve the state machine
change to be suitable for other callers.

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-24 17:00:10 -05:00
Andres Freund
f0b7ab7251 postmaster: Don't repeatedly transition to crashing state
Previously HandleChildCrash() skipped logging and signalling child exits if
already in an immediate shutdown or in FatalError state, but still
transitioned server state in response to a crash. That's redundant.

In the other place we transition to FatalError, we do take care to not do so
when already in FatalError state.

To make it easier to combine different paths for entering FatalError state,
only do so once in HandleChildCrash().

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-24 17:00:10 -05:00
Andres Freund
d239c1a8e5 postmaster: Don't open-code TerminateChildren() in HandleChildCrash()
After removing the duplication, no user of sigquit_child() remains;
therefore remove it.

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-24 17:00:10 -05:00
Andres Freund
4d271e3ec2 checkpointer: Request checkpoint via latch instead of signal
The motivation for this change is that a future commit will use SIGINT for
another purpose (postmaster requesting WAL access to be shut down) and that
there are no other signals that we could readily use (see code comment for the
reason why SIGTERM shouldn't be used). But it's also a tad nicer / more
efficient to use SetLatch(), as it avoids sending signals when checkpointer
is already busy.

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-24 17:00:10 -05:00
Tom Lane
a5579a90af Make jsonb casts to scalar types translate JSON null to SQL NULL.
Formerly, these cases threw an error "cannot cast jsonb null to type
<whatever>".  That seems less than helpful though.  It's also
inconsistent with the behavior of the ->> operator, which translates
JSON null to SQL NULL, as do some other jsonb functions.
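
For example (a sketch of the new behavior, not text from the commit):

  SELECT ('null'::jsonb)::int;             -- now NULL rather than an error
  SELECT ('{"a": null}'::jsonb) ->> 'a';   -- NULL, as before

so the casts now behave consistently with the ->> operator.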

Discussion: https://postgr.es/m/3851203.1722552717@sss.pgh.pa.us
2025-01-24 13:20:44 -05:00
Peter Eisentraut
13a255c195 Fix copy-and-paste typo 2025-01-24 17:45:55 +01:00
Daniel Gustafsson
035f99cbeb pgcrypto: Make it possible to disable built-in crypto
When using OpenSSL and/or the underlying operating system in FIPS
mode, no non-FIPS certified crypto implementations should be used.
While that is already possible by just not invoking the built-in
crypto in pgcrypto, this adds a GUC which prohibits the code from
being called.  This doesn't change the FIPS status of PostgreSQL
but can make it easier for sites which target FIPS compliance to
ensure that violations cannot occur.
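
A sketch of how such a setting would be used; the GUC name shown here
(pgcrypto.builtin_crypto_enabled) is an assumption for illustration:

  -- in postgresql.conf, or via ALTER SYSTEM:
  --   pgcrypto.builtin_crypto_enabled = off
  SELECT crypt('secret', gen_salt('bf'));  -- crypt()/gen_salt() rely on
                                           -- built-in (non-OpenSSL) code and
                                           -- would now be rejected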

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Joe Conway <mail@joeconway.com>
Reviewed-by: Joe Conway <mail@joeconway.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/16b4a157-9ea1-44d0-b7b3-4c85df5de97b@joeconway.com
2025-01-24 14:25:08 +01:00
Daniel Gustafsson
924d89a354 pgcrypto: Add function to check FIPS mode
This adds a SQL-callable function for reading and returning the status
of the FIPS configuration of OpenSSL.  If OpenSSL is operating with FIPS
enabled it will return true, otherwise false.  As this adds a function
to the SQL file, bump the extension version to 1.4.
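
Presumably usable along these lines (the function name fips_mode shown
here is an assumption based on the description above):

  CREATE EXTENSION IF NOT EXISTS pgcrypto;
  SELECT fips_mode();   -- true if OpenSSL reports FIPS mode enabled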

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Joe Conway <mail@joeconway.com>
Discussion: https://postgr.es/m/8f979145-e206-475a-a31b-73c977a4134c@joeconway.com
2025-01-24 14:18:40 +01:00
Álvaro Herrera
c44c2d2759
Fix instability in recently added regression tests
We missed the usual ORDER BY clause.

Author: Amul Sul <amul.sul@enterprisedb.com>
Discussion: https://postgr.es/m/CAAJ_b974U3Vvf-qGwFyZ73DFHqyFJP9TOmuiXR2Kp8KVcJtP6w@mail.gmail.com
2025-01-24 12:54:46 +01:00
Peter Eisentraut
aeb8ea361a Convert sepgsql tests to TAP
Add a TAP test for sepgsql.  This automates the previously required
manual setup before the test.  The actual tests are still run by
pg_regress, as before, but now called from within the TAP Perl script.

The previous manual test script (test_sepgsql) is left in place, since
its purpose is (also) to test whether a running instance was properly
initialized for sepgsql.  But it has been changed to call pg_regress
directly and no longer require make.

Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/651a5baf-5c45-4a5a-a202-0c8453a4ebf8@eisentraut.org
2025-01-24 12:39:47 +01:00
Peter Eisentraut
02ed3c2bdc meson: Fix sepgsql installation
The sepgsql.sql file should be installed under share/contrib/, not
share/extension/, since it is not an extension.  This makes it match
what make install does.

Discussion: https://www.postgresql.org/message-id/flat/651a5baf-5c45-4a5a-a202-0c8453a4ebf8@eisentraut.org
2025-01-24 10:26:12 +01:00
Michael Paquier
fd4c4ede70 initdb: Convert tests to use long options with fat comma style
This is similar to ce1b0f9da03e, but this time this rule is applied to
some of the TAP tests of initdb.

Author: Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/878qr146ra.fsf@wibble.ilmari.org
2025-01-24 15:19:38 +09:00
Peter Eisentraut
473a575e05 Return yyparse() result not via global variable
Instead of passing the parse result from yyparse() via a global
variable, pass it via a function output argument.

This complements earlier work to make the parsers reentrant.

Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
2025-01-24 06:55:39 +01:00
Amit Kapila
6fc4fc42da Doc: Fix a typo introduced in 4a0e7314f1.
Author: Erik Rijkers <er@xs4all.nl>
Discussion: https://postgr.es/m/6e625c81-968e-42d0-802d-edfaf9cfac11@xs4all.nl
2025-01-24 08:25:21 +05:30
Amit Kapila
117f9f328e Doc: Fix column name in pg_publication catalog.
Commit e65dbc9927 incorrectly spelled the column name in the
pg_publication catalog. In passing, make the order of columns in the doc
match the actual catalog.

Author: Shinoda, Noriyoshi <noriyoshi.shinoda@hpe.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/DM4PR84MB1734F8F140E4477580761F93EEE02@DM4PR84MB1734.NAMPRD84.PROD.OUTLOOK.COM
2025-01-24 08:11:29 +05:30
Tom Lane
4f15759bdc Don't ask for bug reports about pthread_is_threaded_np() != 0.
We thought that this condition was unreachable in ExitPostmaster,
but actually it's possible if you have both a misconfigured locale
setting and some other mistake that causes PostmasterMain to bail
out before reaching its own check of pthread_is_threaded_np().

Given the lack of other reports, let's not ask for bug reports if
this occurs; instead just give the same hint as in PostmasterMain.

Bug: #18783
Reported-by: anani191181515@gmail.com
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/18783-d1873b95a59b9103@postgresql.org
Discussion: https://postgr.es/m/206317.1737656533@sss.pgh.pa.us
Backpatch-through: 13
2025-01-23 14:23:04 -05:00
Tom Lane
01463e1ccc Ensure that AFTER triggers run as the instigating user.
With deferred triggers, it is possible that the current role changes
between the time when the trigger is queued and the time it is
executed (for example, the triggering data modification could have
been executed in a SECURITY DEFINER function).

Up to now, deferred trigger functions would run with the current role
set to whatever was active at commit time.  That does not matter for
foreign-key constraints, whose correctness doesn't depend on the
current role.  But for user-written triggers, the current role
certainly can matter.

Hence, fix things so that AFTER triggers are fired under the role
that was active when they were queued, matching the behavior of
BEFORE triggers which would have actually fired at that time.
(If the trigger function is marked SECURITY DEFINER, that of course
overrides this, as it always has.)
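
A minimal sketch of the scenario (names invented for illustration):

  CREATE FUNCTION log_role() RETURNS trigger LANGUAGE plpgsql AS
  $$ BEGIN RAISE NOTICE 'AFTER trigger fired as %', current_user; RETURN NULL; END $$;

  CREATE CONSTRAINT TRIGGER audit_role AFTER INSERT ON audited
    DEFERRABLE INITIALLY DEFERRED
    FOR EACH ROW EXECUTE FUNCTION log_role();

If the INSERT happens inside a SECURITY DEFINER function, the deferred
trigger now reports the role that queued it, not whichever role happens
to be active at COMMIT.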

This does not create any new security exposure: if you do DML on a
table owned by a hostile user, that user has always had various ways
to exploit your permissions, such as the aforementioned BEFORE
triggers, default expressions, etc.  It might remove some security
exposure, because the old behavior could potentially expose some
other role besides the one directly modifying the table.

There was discussion of making a larger change, such as running as
the trigger's owner.  However, that would break the common idiom of
capturing the value of CURRENT_USER in a trigger for auditing/logging
purposes.  This change will make no difference in the typical scenario
where the current role doesn't change before commit.

Arguably this is a bug fix, but it seems too big a semantic change
to consider for back-patching.

Author: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Joseph Koshakow <koshy44@gmail.com>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://postgr.es/m/77ee784cf248e842f74588418f55c2931e47bd78.camel@cybertec.at
2025-01-23 12:25:55 -05:00
Jeff Davis
4e7f62bc38 Add support for Unicode case folding.
Expand case mapping tables to include entries for case folding, which
are parsed from CaseFolding.txt.

Discussion: https://postgr.es/m/a1886ddfcd8f60cb3e905c93009b646b4cfb74c5.camel%40j-davis.com
2025-01-23 09:06:50 -08:00
Tom Lane
7921927bbb Reverse the search order in afterTriggerAddEvent().
When scanning existing AfterTriggerSharedData records in search
of a match to the event being queued, we were examining the
records from oldest to newest.  But it makes more sense to do
the opposite.  The newest record is likely to be from the current
query, while the oldest is likely to be from some previous command
in the same transaction, which will likely have different details.

There aren't expected to be very many active AfterTriggerSharedData
records at once, so that this change is unlikely to make any
spectacular difference.  Still, having added a nontrivially-expensive
bms_equal call to this loop yesterday, I feel a need to shave cycles
where possible.

Discussion: https://postgr.es/m/4166712.1737583961@sss.pgh.pa.us
2025-01-23 11:08:05 -05:00
Álvaro Herrera
b663b9436e
Allow NOT VALID foreign key constraints on partitioned tables
This feature was intentionally omitted when FKs were first implemented
for partitioned tables, and had been requested a few times; the
usefulness is clear.

Validation can happen for each partition individually, which is useful
to contain the number of locks held and the duration; or it can be
executed for the partitioning hierarchy as a single command, which
validates all child constraints that haven't been validated already.

This is also useful to implement NOT ENFORCED constraints on top.
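
A minimal sketch of the shape of the feature (names invented):

  CREATE TABLE ref (id int PRIMARY KEY);
  CREATE TABLE parted (id int, ref_id int) PARTITION BY RANGE (id);
  ALTER TABLE parted
    ADD CONSTRAINT parted_ref_fk FOREIGN KEY (ref_id) REFERENCES ref (id) NOT VALID;
  -- validate later, either per partition or for the whole hierarchy at once:
  ALTER TABLE parted VALIDATE CONSTRAINT parted_ref_fk;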

Author: Amul Sul <sulamul@gmail.com>
Discussion: https://postgr.es/m/CAAJ_b96Bp=-ZwihPPtuaNX=SrZ0U6ZsXD3+fgARO0JuKa8v2jQ@mail.gmail.com
2025-01-23 15:54:38 +01:00
Amit Kapila
b35434b134 Fix buildfarm failure introduced by commit e65dbc9927.
The patch had incorrectly specified the default value for
publish_generated_columns during the query formation in pg_dump.

Author: Vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CAA4eK1KfZYTD8Hpi9TD1KaB8rNUBR9baUvTxa5wYyZDGbEaa6g@mail.gmail.com
2025-01-23 17:47:15 +05:30
Peter Eisentraut
34694ec888 Convert macros to static inline functions (htup_details.h, itup.h)
Discussion: https://www.postgresql.org/message-id/flat/5b558da8-99fb-0a99-83dd-f72f05388517@enterprisedb.com
2025-01-23 12:12:08 +01:00
Peter Eisentraut
b15b8c5cf8 Add some const decorations (htup.h)
Discussion: https://www.postgresql.org/message-id/flat/5b558da8-99fb-0a99-83dd-f72f05388517@enterprisedb.com
2025-01-23 12:12:08 +01:00
Amit Kapila
e65dbc9927 Change publication's publish_generated_columns option type to enum.
The current boolean publish_generated_columns option only supports a
binary choice, which is insufficient for future enhancements where
generated columns can be of different types (e.g., stored or virtual).
This commit changes the option to an enum; the supported values for
publish_generated_columns are now 'none' and 'stored'.
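
For example (a sketch, with a hypothetical table t containing a stored
generated column):

  CREATE PUBLICATION pub_gen FOR TABLE t
    WITH (publish_generated_columns = 'stored');
  -- 'none' keeps generated columns out of the publication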

Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/d718d219-dd47-4a33-bb97-56e8fc4da994@eisentraut.org
Discussion: https://postgr.es/m/B80D17B2-2C8E-4C7D-87F2-E5B4BE3C069E@gmail.com
2025-01-23 15:28:37 +05:30
Michael Paquier
eef4a33f62 Add error pattern checks for some TAP tests for non-existing objects
Some tests are updated to use command_fails_like(), gaining a check for
the error output generated.  The test changed in pg_amcheck came up
after noticing that an incorrect option name still made the test pass
while the command failed.  The three other tests changed in
src/bin/scripts/ were noticed by me, in passing.

Author: Dagfinn Ilmari Mannsåker, Michael Paquier
Discussion: https://postgr.es/m/87bjvy50cs.fsf@wibble.ilmari.org
2025-01-23 16:03:48 +09:00
Michael Paquier
858b4db378 Improve TAP tests of pg_basebackup
This addresses some minor issues with the TAP tests of pg_basebackup:
- Remove three duplicated tests used for incorrect option combinations.
- Add more pattern checks for commands doomed to fail, to make sure that
the error generated is the expected one.  These are for tests related to
the tablespace mapping and incorrect option combinations.
- Fix the description of one test for the case of backup target versus
format.

Issues noticed while reviewing this area of the tests.

Discussion: https://postgr.es/m/87bjvy50cs.fsf@wibble.ilmari.org
2025-01-23 15:15:36 +09:00
Tom Lane
172e6b3adb Support RN (roman-numeral format) in to_number().
We've long had roman-numeral output support in to_char(),
but lacked the reverse conversion.  Here it is.
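
For example (illustrative only):

  SELECT to_number('XIV', 'RN');   -- 14
  SELECT to_char(14, 'RN');        -- the long-standing reverse direction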

Author: Hunaid Sohail <hunaidpgml@gmail.com>
Reviewed-by: Maciek Sakrejda <m.sakrejda@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAMWA6ybh4M1VQqpmnu2tfSwO+3gAPeA8YKnMHVADeB=XDEvT_A@mail.gmail.com
2025-01-22 15:18:50 -05:00
Nathan Bossart
f0ee648527 Fix comment about AVX-512 popcount support.
Since commit f78667bd91, we've used __attribute__((target(...)))
instead of extra compiler flags for AVX-512 support, but this
comment still says that we put the code in a separate file because
it might require extra compiler flags.  Let's just remove that part
of the comment.
2025-01-22 14:11:37 -06:00
Tom Lane
ea68ea6320 Repair incorrect handling of AfterTriggerSharedData.ats_modifiedcols.
This patch fixes two distinct errors that both ultimately trace
to commit 71d60e2aa, which added the ats_modifiedcols field.

The more severe error is that ats_modifiedcols wasn't accounted for
in afterTriggerAddEvent's scanning loop that looks for a pre-existing
duplicate AfterTriggerSharedData.  Thus, a new event could be
incorrectly matched to an AfterTriggerSharedData that has a different
value of ats_modifiedcols, resulting in the wrong tg_updatedcols
bitmap getting passed to the trigger whenever it finally gets fired.
We'd not noticed because (a) few triggers consult tg_updatedcols,
and (b) we had no tests exercising a case where such a trigger was
called as an AFTER trigger.  In the test case added by this commit,
contrib/lo's trigger fails to remove a large object when expected
because (without this fix) it thinks the LO OID column hasn't changed.

The other problem was introduced by commit ce5aaea8c, which copied the
modified-columns bitmap into trigger-related storage.  It made a copy
for every trigger event, whereas what we really want is to make a new
copy only when we make a new AfterTriggerSharedData entry.  (We could
imagine adding extra logic to reduce the number of bitmap copies still
more, but it doesn't look worthwhile at the moment.)  In a simple test
of an UPDATE of 10000000 rows with a single AFTER trigger, this thinko
roughly tripled the amount of memory consumed by the pending-triggers
data structures, from 160446744 to 480443440 bytes.

Fixing the first problem requires introducing a bms_equal() call into
afterTriggerAddEvent's scanning loop, which is slightly annoying from
a speed perspective.  However, getting rid of the excessive bms_copy()
calls from the second problem balances that out; overall speed of
trigger operations is the same or slightly better, in my tests.

Discussion: https://postgr.es/m/3496294.1737501591@sss.pgh.pa.us
Backpatch-through: 13
2025-01-22 11:58:20 -05:00
Amit Kapila
991974bb48 Fix \dRp+ output when describing publications with a lower server version.
psql did not account for the new column "Generated columns" not being
present on servers of older versions. This was introduced in recent
commit 7054186c4e.

Author: Vignesh C
Reviewed-by: Peter Smith
Discussion: https://postgr.es/m/CALDaNm3OcXdY0EzDEKAfaK9gq2B67Mfsgxu93+_249ohyts=0g@mail.gmail.com
2025-01-22 15:27:37 +05:30
Peter Eisentraut
41084409f6 Additional tests for stored generated columns
Some additional tests have been created during the development of
virtual generated columns (not included here).  This commit adds
equivalent tests to the existing test set for stored generated
columns.  This includes expanded tests related to MERGE, subqueries,
whole-row references, permissions, domains, partitioning, and
triggers.

Author: Peter Eisentraut <peter@eisentraut.org>
Co-authored-by: jian he <jian.universality@gmail.com>
Co-authored-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/a368248e-69e4-40be-9c07-6c3b5880b0a6@eisentraut.org
2025-01-22 07:32:21 +01:00
Michael Paquier
ce1b0f9da0 Improve grammar of options for command arrays in TAP tests
This commit rewrites a good chunk of the command arrays in TAP tests
with a grammar based on the following rules:
- Fat commas are used between option names and their values, making it
clear to both humans and perltidy that values and names are bound
together.  This is particularly useful for the readability of multi-line
command arrays, and there are plenty of them in the TAP tests.  Most of
the test code is updated to use this style.  Some commands used
parentheses to show the link, or attached values and options in a single
string.  These are updated to use fat commas instead.
- Option names are switched to use their long names, making them more
self-documenting.  Based on a suggestion by Andrew Dunstan.
- Add some trailing commas after the last item in multi-line arrays,
which is a common Perl style.

Not all the places are taken care of, but this covers a very good chunk
of them.

Author: Dagfinn Ilmari Mannsåker
Reviewed-by: Michael Paquier, Peter Smith, Euler Taveira
Discussion: https://postgr.es/m/87jzc46d8u.fsf@wibble.ilmari.org
2025-01-22 14:47:13 +09:00
Amit Kapila
4a0e7314f1 Doc: Update the interaction of tablesync with wal_retrieve_retry_interval.
In passing, update the documentation that explains the process of initial
data replication to explicitly state that it uses a table synchronization
worker.

Author: Vignesh C
Reviewed-by: Peter Smith, Shlok Kyal, Amit Kapila
Discussion: https://postgr.es/m/CALDaNm3RxGcD4cDAV5Q0_A4n06F3+AAMpxiyND9Zn0dB86hFmg@mail.gmail.com
2025-01-22 10:54:53 +05:30
Michael Paquier
be31ac2519 Run perltidy
A follow-up patch will adjust the TAP tests to follow a more structured
format for option lists in commands, which perltidy is able to cope
with better.  Putting the tree into a clean state first makes the next
change a bit easier.  perltidy v20230309 has been used.

Author: Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/87jzc46d8u.fsf@wibble.ilmari.org
2025-01-22 10:15:32 +09:00
Tom Lane
4907ba304c Doc: simplify the tutorial's window-function examples.
For the purposes of this discussion, row_number() is just as good
as rank(), and its behavior is easier to understand and describe.
So let's switch the examples to using row_number().

Along the way to checking the results given in the tutorial,
I found it helpful to extract the empsalary table we use in the
regression tests, which is evidently the same data that was used
to make these results.  So I shoved that into advanced.source
to improve the coverage of that file a little.  (There's still
several pages of the tutorial that are not included in it,
but at least now 3.5 Window Functions is covered.)

Suggested-by: "David G. Johnston" <david.g.johnston@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/173737973383.1070.1832752929070067441@wrigleys.postgresql.org
2025-01-21 14:43:21 -05:00
Álvaro Herrera
db19a5061c
Reword recent error messages: "should" -> "must"
Most were introduced in the 17 timeframe.  The ones in wparser_def.c are
very old.

I also changed "JSON path expression for column \"%s\" should return
single item without wrapper" to "JSON path expression for column \"%s\"
must return single item when no wrapper is requested" to avoid
ambiguity.

Backpatch to 17.

Crickets: https://postgr.es/m/202501131819.26ors7oouafu@alvherre.pgsql
2025-01-21 15:24:49 +01:00
Álvaro Herrera
9b21f203dd
Fix detach of a partition that has a toplevel FK to a partitioned table
In common cases, foreign keys are defined on the toplevel partitioned
table; but if instead one is defined on a partition and references a
partitioned table, and the referencing partition is detached, we would
examine the pg_constraint row on the partition being detached, and fail
to realize that the sub-constraints must be left alone.  This causes the
ALTER TABLE DETACH process to fail with

 ERROR:  could not find ON INSERT check triggers of foreign key constraint NNN

This is similar but not quite the same as what was fixed by
53af9491a043.  This bug doesn't affect branches earlier than 15, because
the detach procedure was different there, so we only backpatch down to
15.

Fix by skipping the modification of constraints that are children of
other constraints being detached.

Author: Amul Sul <sulamul@gmail.com>
Diagnosed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAAJ_b97GuPh6wQPbxQS-Zpy16Oh+0aMv-w64QcGrLhCOZZ6p+g@mail.gmail.com
2025-01-21 14:53:46 +01:00
Peter Eisentraut
1772d554b0 Fix NO ACTION temporal foreign keys when the referenced endpoints change
If an UPDATE of the referenced row changes the temporal start/end times,
shrinking the span for which the row is valid, we get a false return from
ri_Check_Pk_Match(), but overlapping references may still be valid if
their reference didn't overlap with the removed span.

We need to consider what span(s) are still provided in the referenced
table.  Instead of returning that from ri_Check_Pk_Match(), we can
just look it up in the main SQL query.

Reported-by: Sam Gabrielsson <sam@movsom.se>
Author: Paul Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2025-01-21 14:39:24 +01:00
Peter Eisentraut
888d4523f0 Improve whitespace in without_overlaps test
Make some indentation better and more consistent.  Extracted from
another patch with some actual test changes.

Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2025-01-21 12:14:49 +01:00
Peter Eisentraut
44b61efb79 Improve generated_stored test
The test table names gtest11s and gtest12s were originally chosen
to signify "stored", when the idea was to have virtual columns in the
same test file.  This is no longer the idea, so this naming is
irrelevant.  (The upcoming feature of virtual generated columns will
have a test file that is initially a copy of generated_stored.sql, and
this random difference will be even more annoying then.)  Clean this
up by dropping the suffix.

Discussion: https://www.postgresql.org/message-id/flat/a368248e-69e4-40be-9c07-6c3b5880b0a6@eisentraut.org
2025-01-21 08:13:40 +01:00
Amit Langote
fb9f955025 Refactor ExecScan() to allow inlining of its core logic
This commit refactors ExecScan() by moving its tuple-fetching,
filtering, and projection logic into an inline-able function,
ExecScanExtended(), defined in src/include/executor/execScan.h.
ExecScanExtended() accepts parameters for EvalPlanQual state,
qualifiers (ExprState), and projection (ProjectionInfo).

Specialized variants of the execution function of a given Scan node
(for example, ExecSeqScan() for SeqScan) can then pass const-NULL for
unused parameters.  This allows the compiler to inline the logic and
eliminate unnecessary branches or checks.  Each variant function thus
contains only the necessary code, optimizing execution for scans
where these features are not needed.

The variant function to be used is determined in the ExecInit*()
function of the node and assigned to the ExecProcNode function pointer
in the node's PlanState, effectively turning runtime checks and
conditional branches on the NULLness of epqstate, qual, and projInfo
into static ones, provided the compiler successfully eliminates
unnecessary checks from the inlined code of ExecScanExtended().

Currently, only ExecSeqScan() is modified to take advantage of this
inline-ability.  Other Scan nodes might benefit from such specialized
variant functions but that is left as future work.

Benchmarks performed by Junwang Zhao, David Rowley and myself show up
to a 5% reduction in execution time for queries that rely heavily on
Seq Scans. The most significant improvements were observed in
scenarios where EvalPlanQual, qualifiers, and projection were not
required, but other cases also benefit from reduced runtime overhead
due to the inlining and removal of unnecessary code paths.

The idea for this patch first came from Andres Freund in an off-list
discussion. The refactoring approach implemented here is based on a
proposal by David Rowley, significantly improving upon the patch I
(amitlan) initially proposed.

Suggested-by: Andres Freund <andres@anarazel.de>
Co-authored-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Tested-by: Junwang Zhao <zhjwpku@gmail.com>
Tested-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqGaH-otvqW_ce-paL=96JvU4j+Xbuk+14esJNDwefdkOg@mail.gmail.com
2025-01-21 12:53:03 +09:00
Michael Paquier
4feba03d8b Rework handling of pending data for backend statistics
9aea73fc61d4 added support for backend statistics, relying on
PgStat_EntryRef->pending for its data pending for flush.  This design
lacks flexibility, because the pending list does some memory
allocation, making it unsuitable for incrementing counters in critical
sections.

Pending data of backend statistics is reworked so the implementation
does not depend on PgStat_EntryRef->pending anymore, relying on a static
area of memory to store the counters that are flushed when stats are
reported to the pgstats dshash.  An advantage of this approach is to
allow the pending data to be manipulated in critical sections; some
patches are under discussion and require that.

The pending data is tracked by PendingBackendStats, local to
pgstat_backend.c.  Two routines are introduced to allow IO statistics to
update the backend-side counters.  have_static_pending_cb and
flush_static_cb are used for the flush, instead of flush_pending_cb.

Author: Bertrand Drouvot, Michael Paquier
Discussion: https://postgr.es/m/66efowskppsns35v5u2m7k4sdnl7yoz5bo64tdjwq7r5lhplrz@y7dme5xwh2r5
2025-01-21 11:30:42 +09:00
Michael Paquier
28de66cee5 Rename some pgstats callbacks related to flush of entries
The two callbacks have_fixed_pending_cb and flush_fixed_cb have been
introduced in fc415edf8ca8 to provide a way for fixed-numbered
statistics to control the flush of their data.  These are renamed to
respectively have_static_pending_cb and flush_static_cb.  The
restriction that these only apply to fixed-numbered stats is removed.

A follow-up patch will make use of them for backend statistics.  This
stats kind is variable-numbered, and patches are under discussion to
track WAL data for IO and backend stats which cannot use
PgStat_EntryRef->pending as pending data would be touched in critical
sections, where no memory allocation can happen.

Per discussion with Andres Freund.

Author: Bertrand Drouvot
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/66efowskppsns35v5u2m7k4sdnl7yoz5bo64tdjwq7r5lhplrz@y7dme5xwh2r5
2025-01-21 10:12:39 +09:00
Tom Lane
60c513f8fa Update time zone data files to tzdata release 2025a.
DST law changes in Paraguay.
Historical corrections for the Philippines.

Backpatch-through: 13
2025-01-20 16:49:15 -05:00
Tom Lane
8108674f0e Avoid using timezone Asia/Manila in regression tests.
The freshly-released 2025a version of tzdata has a refined estimate
for the longitude of Manila, changing their value for LMT in
pre-standardized-timezone days.  This changes the output of one of
our test cases.  Since we need to be able to run with system tzdata
files that may or may not contain this update, we'd better stop
making that specific test.

I switched it to use Asia/Singapore, which has a roughly similar UTC
offset.  That LMT value hasn't changed in tzdb since 2003, so we can
hope that it's well established.

I also noticed that this set of make_timestamptz tests only exercises
zones east of Greenwich, which seems rather sad, and was not the
original intent AFAICS.  (We've already changed these tests once
to stabilize their results across tzdata updates, cf 66b737cd9;
it looks like I failed to consider the UTC-offset-sign aspect then.)
To improve that, add a test with Pacific/Honolulu.  That LMT offset
is also quite old in tzdb, so we'll cross our fingers that it doesn't
get improved.
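
The tests in question exercise make_timestamptz() with explicit zone
names, roughly like this (a sketch, not the exact regression cases):

  SELECT make_timestamptz(1846, 12, 10, 0, 0, 0, 'Asia/Singapore');
  SELECT make_timestamptz(1846, 12, 10, 0, 0, 0, 'Pacific/Honolulu');

For dates old enough to fall in a zone's LMT era, the result depends on
the zone's historical LMT offset, which is why zones whose LMT values
have long been stable in tzdb are preferable.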

Reported-by: Christoph Berg <cb@df7cb.de>
Discussion: https://postgr.es/m/Z46inkznCxesvDEb@msg.df7cb.de
Backpatch-through: 13
2025-01-20 15:47:53 -05:00
Peter Eisentraut
86749ea3b7 Improve generated_stored test
It makes more sense to put the catalog sanity check at the end of the
test rather than at the beginning, so that it can also check whatever
the tests did rather than just whatever happened before the tests.

Suggested-by: jian he <jian.universality@gmail.com>

Discussion: https://www.postgresql.org/message-id/flat/a368248e-69e4-40be-9c07-6c3b5880b0a6@eisentraut.org
2025-01-20 15:27:33 +01:00
Peter Eisentraut
443a8e4ae3 Add some more use of Page/PageData rather than char *
Discussion: https://www.postgresql.org/message-id/flat/692ee0da-49da-4d32-8dca-da224cc2800e@eisentraut.org
2025-01-20 13:05:50 +01:00
Peter Eisentraut
4f4a1d853a Add const qualifiers to bufpage.h
This makes use of the new PageData type.

PageGetSpecialPointer() had to be turned back into a macro, because it
is used in a way that sometimes it takes const and returns const and
sometimes takes non-const and returns non-const.

Discussion: https://www.postgresql.org/message-id/flat/692ee0da-49da-4d32-8dca-da224cc2800e@eisentraut.org
2025-01-20 11:06:57 +01:00
Peter Eisentraut
6e4df237fb Add PageData C type
This adds the C type PageData and makes the existing type Page a
pointer to it.  This follows the usual PostgreSQL C type naming scheme
of Foo/FooData pairs.  (Prior to commit ddbba3aac86, PageData existed
as an unrelated type.)  The type definitions are compatible, so this
doesn't change anything except some of the naming.

Discussion: https://www.postgresql.org/message-id/flat/692ee0da-49da-4d32-8dca-da224cc2800e@eisentraut.org
2025-01-20 11:06:49 +01:00
Thomas Munro
73f6b9a3b0 Fix latch event policy that hid socket events.
If a WaitEventSetWait() caller asks for multiple events, an already set
latch would previously prevent other events from being reported at the
same time.  Now, we'll also poll the kernel for other events that would
fit in the caller's output buffer with a zero wait time.  This policy
change doesn't affect callers that ask for only one event.

The main caller affected is the postmaster.  If its latch is set
extremely frequently by backends launching workers and workers exiting,
we don't want it to handle only those jobs and ignore incoming client
connections.

Back-patch to 16 where the postmaster began using the API.  The
fast-return policy changed here is older than that, but doesn't cause
any known problems in earlier releases.

Reported-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/Z1n5UpAiGDmFcMmd%40nathan
2025-01-20 16:43:29 +13:00
Michael Paquier
6cf1647d87 Fix header check for continuation records where standbys could be stuck
XLogPageRead() checks immediately for an invalid WAL record header on a
standby, to be able to handle the case of continuation records that need
to be read across two different sources.  As written, the check was too
generic, applying to any target LSN.  Based on an analysis by Kyotaro
Horiguchi, what really matters is to make sure that the page header is
checked when attempting to read an LSN at the boundary of a segment, to
handle the case of a continuation record that spans multiple pages
across multiple segments, since WAL receivers, when spawned, request
WAL from the beginning of a segment.  This fix has been proposed by
Kyotaro Horiguchi.

This could cause standbys to loop infinitely when dealing with a
continuation record during a timeline jump, in the case where the
contents of the record in the follow-up page are invalid.

Some regression tests are added to check such scenarios, able to
reproduce the original problem.  In the test, the contents of a
continuation record are overwritten with junk zeros on its follow-up
page, and replayed on standbys.  This is inspired by 039_end_of_wal.pl,
and is enough to show how standbys should react on promotion by not
being stuck.  Without the fix, the test would fail with a timeout.  The
test to reproduce the problem has been written by Alexander Kukushkin.

The original check has been introduced in 066871980183, for a similar
problem.

Author: Kyotaro Horiguchi, Alexander Kukushkin
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CAFh8B=mozC+e1wGJq0H=0O65goZju+6ab5AU7DEWCSUA2OtwDg@mail.gmail.com
Backpatch-through: 13
2025-01-20 09:29:42 +09:00
Tom Lane
23d7562018 Remove PrintBufferDescs() and PrintPinnedBufs().
These have been #ifdef'd out for a long time, and in fact have
been uncompilable since commit 48354581a of 2016-04-10.  The
fact that nobody noticed for so long demonstrates their lack of
usefulness, so let's remove them rather than fix them.

Author: Jacob Brazeal <jacob.brazeal@gmail.com>
Discussion: https://postgr.es/m/CA+COZaB+9CN_f63PPRoVhHjYmCwwmb_9CWLxqCJdMWDqs1a-JA@mail.gmail.com
2025-01-19 14:00:22 -05:00
Andrew Dunstan
ea5ff5833c Be clearer about when jsonapi's need_escapes is needed
Most operations beyond pure json parsing need to set need_escapes to
true to get access to field names and string scalars. Document this
fact more explicitly.

Slightly tweaked patch from:

Author: Corey Huinker <corey.huinker@gmail.com>

Discussion: https://postgr.es/m/CADkLM=c49Vkfg2+A8ubSuEtaGEjuaKZXCA6SrXA8kdwHjx3uxQ@mail.gmail.com
2025-01-19 09:09:58 -05:00
Jeff Davis
d3d0983169 Support PG_UNICODE_FAST locale in the builtin collation provider.
The PG_UNICODE_FAST locale uses code point sort order (fast,
memcmp-based) combined with Unicode character semantics. The character
semantics are based on Unicode full case mapping.

Full case mapping can map a single codepoint to multiple codepoints,
such as "ß" uppercasing to "SS". Additionally, it handles
context-sensitive mappings like the "final sigma", and it uses
titlecase mappings such as "Dž" when titlecasing (rather than plain
uppercase mappings).

Importantly, the uppercasing of "ß" as "SS" is specifically mentioned
by the SQL standard. In Postgres, UCS_BASIC uses plain ASCII semantics
for case mapping and pattern matching, so if we changed it to use the
PG_UNICODE_FAST locale, it would offer better compliance with the
standard. For now, though, do not change the behavior of UCS_BASIC.
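
A sketch of how the locale might be exercised, assuming a UTF-8 database
and that CREATE COLLATION accepts the builtin provider this way:

  CREATE COLLATION unicode_fast (provider = builtin, locale = 'PG_UNICODE_FAST');
  SELECT upper('straße' COLLATE unicode_fast);   -- STRASSE, via full case mapping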

Discussion: https://postgr.es/m/ddfd67928818f138f51635712529bc5e1d25e4e7.camel@j-davis.com
Discussion: https://postgr.es/m/27bb0e52-801d-4f73-a0a4-02cfdd4a9ada@eisentraut.org
Reviewed-by: Peter Eisentraut, Daniel Verite
2025-01-17 15:56:30 -08:00
Jeff Davis
286a365b9c Support Unicode full case mapping and conversion.
Generate tables from Unicode SpecialCasing.txt to support more
sophisticated case mapping behavior:

 * support case mappings to multiple codepoints, such as "ß"
   uppercasing to "SS"
 * support conditional case mappings, such as the "final sigma"
 * support titlecase variants, such as "dž" uppercasing to "DŽ" but
   titlecasing to "Dž"

Discussion: https://postgr.es/m/ddfd67928818f138f51635712529bc5e1d25e4e7.camel@j-davis.com
Discussion: https://postgr.es/m/27bb0e52-801d-4f73-a0a4-02cfdd4a9ada@eisentraut.org
Reviewed-by: Peter Eisentraut, Daniel Verite
2025-01-17 15:56:20 -08:00
Nathan Bossart
6a9b2a631a vacuumdb: Fix comment for vacuum_one_database().
Since commit e0c2933a76, vacuum_one_database() always uses a
catalog query to discover the tables to process, but this comment
still notes the special case for which we used a catalog query
before that commit.  Let's just remove that note.

Also, commit 7781f4e3e7 renamed the "tables" parameter to "objects"
but missed updating this comment.  This commit fixes that as well.
2025-01-17 15:23:14 -06:00
Tom Lane
86e4efc52b Add documentation about calling version-1 C functions from C.
This topic wasn't really covered before, so fill in some details.

Author: Florents Tselai <florents.tselai@gmail.com>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/90853055-5BBD-493D-91E5-721677C7C59B@gmail.com
2025-01-17 14:37:38 -05:00
Dean Rasheed
43830ecb8a Fix parsing of qualified relation names in RETURNING.
Given a qualified refname, refnameNamespaceItem() will search for a
matching namespace item by relation OID, rather than by name. Commit
80feb727c8 broke this by adding additional namespace items for OLD and
NEW in the RETURNING list, which have the same relation OID, causing
ambiguity. Fix this by ignoring these in the search, which is correct
since they don't match the qualified relation name, and so there is no
real ambiguity.

Reported by Richard Guo.

Discussion: https://postgr.es/m/CAMbWs49MBjWYWDROJ8MZ%3DY%2B4UgRQa10wzik1tWrD5yto9eoGXg%40mail.gmail.com
2025-01-17 10:35:07 +00:00
John Naylor
e24d77080b Speed up hex_encode with bytewise lookup
Previously, hex_encode looked up each nibble of the input
separately. We now use a larger lookup table containing the two-byte
encoding of every possible input byte, resulting in a 1/3 reduction
in encoding time.
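
hex_encode() backs the SQL-level hex output path, so the effect is
visible through calls such as:

  SELECT encode('\xdeadbeef'::bytea, 'hex');   -- 'deadbeef'

(the speedup is internal; the result is unchanged).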

Reviewed by Tom Lane, Michael Paquier, Nathan Bossart, David Rowley

Discussion: https://postgr.es/m/CANWCAZZvXuJMgqMN4u068Yqa19CEjS31tQKZp_qFFFbgYfaXqQ%40mail.gmail.com
2025-01-17 16:29:25 +07:00
Peter Eisentraut
0869ea43e9 Remove flex version checks
Remove the flex version checks from configure and meson.  The cutoff
versions are all so ancient that this is no longer relevant, and what
the actual cutoff should be is a bit fuzzy.

This also removes the ancient behavior that configure would also
accept a "lex" program if it is actually flex.  This aligns the check
with meson in this respect.

For future reference, as of this commit, these are relevant flex
versions:

- The hard required minimum is flex 2.5.34 as of commit b1ef48980dd,
  but this has not actually been tested.

- Prior to this, the minimum enforced by configure/meson was flex
  2.5.35, which is the oldest present in the buildfarm right now.

- As of commit 6fdd5d95634, the oldest version that will compile
  without warnings due to flex-generated code is flex 2.5.36.

- The oldest version that probably still has some practical relevance
  is flex 2.5.37, which ships with CentOS/RHEL 7.

Discussion: https://www.postgresql.org/message-id/1a204ccd-7ae6-478c-a431-407b5c48ccc6@eisentraut.org
2025-01-17 09:30:42 +01:00
Peter Eisentraut
b0eff10988 Add pg_nodiscard decorations to base64 functions
The result of pg_b64_encode() and pg_b64_decode() should be checked
for errors.  This attribute could detect mistakes such as those fixed
in commit ff030ebe250 and d278541be42.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAEudQAq-3yHsSdWoOOaw%2BgAQYgPMpMGuB5pt2yCXgv-YuxG2Hg%40mail.gmail.com
2025-01-17 08:21:32 +01:00
Michael Paquier
a6c70f68cd Revert recent changes related to handling of 2PC files at recovery
This commit reverts 8f67f994e8ea (down to v13) and c3de0f9eed38 (down to
v17), as these are proving to not be completely correct regarding two
aspects:
- In v17 and newer branches, c3de0f9eed38's check for epoch handling is
incorrect, and does not correctly handle frozen epochs.  Logic closer
to widen_snapshot_xid() should be used.  The 2PC code should try to
integrate more deeply with FullTransactionIds, 5a1dfde8334b not being enough.
- In v13 and newer branches, 8f67f994e8ea is a workaround for the real
issue, which is that we should not attempt CLOG lookups without reaching
consistency.  This has existed since 728bd991c3c4, and it is reachable with
ProcessTwoPhaseBuffer() called by restoreTwoPhaseData() at the beginning
of recovery.

Per discussion with Noah Misch.

Discussion: https://postgr.es/m/20250116010051.f3.nmisch@google.com
Backpatch-through: 13
2025-01-17 13:27:39 +09:00
Nathan Bossart
0dc9c7d200 Remove redefinitions of SIG_* macros in win32_port.h.
It is not clear why these were originally added.  One hypothesis is
that an ancient version of MinGW didn't define them.  In any case,
they appear to now be superfluous, so let's remove them.  If
nothing else, the buildfarm might offer us clues to their origins.

Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/Z4chOKfnthRH71mw%40nathan
2025-01-16 20:55:24 -06:00
Tom Lane
bf826ea062 Fix setrefs.c's failure to do expression processing on prune steps.
We should run the expression subtrees of PartitionedRelPruneInfo
structs through fix_scan_expr.  Failure to do so means that
AlternativeSubPlans within those expressions won't be cleaned up
properly, resulting in "unrecognized node type" errors since v14.

It seems fairly likely that at least some of the other steps done
by fix_scan_expr are important here as well, resulting in as-yet-
undetected bugs.  Therefore, I've chosen to back-patch this to
all supported branches including v13, even though the known
symptom doesn't manifest in v13.

Per bug #18778 from Alexander Lakhin.

Discussion: https://postgr.es/m/18778-24cd399df6c806af@postgresql.org
2025-01-16 20:40:07 -05:00
Melanie Plageman
f7a8fc10cc Add and use BitmapHeapScanDescData struct
Move the several members of HeapScanDescData which are specific to
Bitmap Heap Scans into a new struct, BitmapHeapScanDescData, which
inherits from HeapScanDescData.

This reduces the size of the HeapScanDescData for other types of scans
and will allow us to add additional bitmap heap scan-specific members in
the future without fear of bloating the HeapScanDescData.

Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/c736f6aa-8b35-4e20-9621-62c7c82e2168%40vondra.me
2025-01-16 18:42:39 -05:00
Michael Paquier
7b6468cc95 Rework macro pgstat_is_ioop_tracked_in_bytes()
As written, it was triggering a compilation warning for old versions of
clang, as reported by buildfarm members ayu, batfish and demoiselle.
Forcing a cast with "unsigned int" should fix the warning.

While on it, the macro is moved to pgstat.h, closer to the declaration
of IOOp, per suggestion from Tom Lane.

Reported-by: Tom Lane
Reviewed-by: Bertrand Drouvot, Tom Lane, Nazir Bilal Yavuz
Discussion: https://postgr.es/m/1272824.1736961543@sss.pgh.pa.us
2025-01-17 08:26:17 +09:00
Nathan Bossart
d4a43b2837 Convert libpgport's pqsignal() to a void function.
The protections added by commit 3b00fdba9f introduced race
conditions to this function that can lead to bogus return values.
Since nobody seems to inspect the return value, this is of little
consequence, but it would have been nice to convert it to a void
function to avoid any possibility of a bogus return value.  I
originally thought that doing so would have required also modifying
legacy-pqsignal.c's version of the function (which would've
required an SONAME bump), but commit 9a45a89c38 gave
legacy-pqsignal.c its own dedicated extern for pqsignal(), thereby
decoupling it enough that libpgport's pqsignal() can be modified.

This commit also adds an assertion for the return value of
sigaction()/signal().  Since a failure most likely indicates a
coding error, and nobody has ever bothered to check pqsignal()'s
return value, it's probably not worth the effort to do anything
fancier.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/Z4chOKfnthRH71mw%40nathan
2025-01-16 16:41:05 -06:00
Nathan Bossart
5cda4fdb0b Avoid calling pqsignal() with invalid signals on Windows frontends.
As noted by the comment at the top of port/pqsignal.c, Windows
frontend programs can only use pqsignal() with the 6 signals
required by C.  Most places avoid using invalid signals via #ifndef
WIN32, but initdb and pg_test_fsync check whether the signal itself
is defined, which doesn't work because win32_port.h defines many
extra signals for the signal emulation code.  pg_regress seems to
have missed the memo completely.  These issues aren't causing any
real problems today because nobody checks the return value of
pqsignal(), but a follow-up commit will add some error checking.

To fix, surround all frontend calls to pqsignal() that use signals
that are invalid on Windows with #ifndef WIN32.  We cannot simply
skip defining the extra signals in win32_port.h for frontends
because they are needed in places such as pgkill().

Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/Z4chOKfnthRH71mw%40nathan
2025-01-16 15:56:39 -06:00
Tom Lane
d7674c9fab Seek zone abbreviations in the IANA data before timezone_abbreviations.
If a time zone abbreviation used in datetime input is defined in
the currently active timezone, use that definition in preference
to looking in the timezone_abbreviations list.  That allows us to
correctly handle abbreviations that have different meanings in
different timezones.  Also, it eliminates an inconsistency between
datetime input and datetime output: the non-ISO datestyles for
timestamptz have always printed abbreviations taken from the IANA
data, not from timezone_abbreviations.  Before this fix, it was
possible to demonstrate cases where casting a timestamp to text
and back fails or changes the value significantly because of that
inconsistency.

While this change removes the ability to override the IANA data about
an abbreviation known in the current zone, it's not clear that there's
any real use-case for doing so.  But it is clear that this makes life
a lot easier for dealing with abbreviations that have conflicts across
different time zones.

Also update the pg_timezone_abbrevs view to report abbreviations
that are recognized via the IANA data, and *not* report any
timezone_abbreviations entries that are thereby overridden.
Under the hood, there are now two SRFs, one that pulls the IANA
data and one that pulls timezone_abbreviations entries.  They're
combined by logic in the view.  This approach was useful for
debugging (since the functions can be called on their own).
While I don't intend to document the functions explicitly,
they might be useful to call directly.

Also improve DecodeTimezoneAbbrev's caching logic so that it can
cache zone abbreviations found in the IANA data.  Without that,
this patch would have caused a noticeable degradation of the
runtime of timestamptz_in.

Per report from Aleksander Alekseev and additional investigation.

Discussion: https://postgr.es/m/CAJ7c6TOATjJqvhnYsui0=CO5XFMF4dvTGH+skzB--jNhqSQu5g@mail.gmail.com
2025-01-16 14:11:19 -05:00
Tom Lane
bc10219b9c Make pg_interpret_timezone_abbrev() check sp->defaulttype too.
This omission caused it to not recognize the furthest-back zone
abbreviation when working with timezone data compiled with relatively
recent zic (2018f or newer).  Older versions of zic produced a dummy
DST transition at the Big Bang, so that the oldest abbreviation could
always be found in the sp->types[] array; but newer versions don't do
that, so that we must examine defaulttype as well as the types[] array
to be sure of seeing all the abbreviations.

While this has been broken for six or so years, we'd managed not
to notice for two reasons: (1) many platforms are still using
ancient zic for compatibility reasons, so that the issue did not
manifest in builds using --with-system-tzdata; (2) the oldest
zone abbreviation is almost always "LMT", which we weren't
supporting anyway (but an upcoming patch will accept that).

While at it, update pg_next_dst_boundary() to use sp->defaulttype
as the time type for non-DST zones and times before the oldest
DST transition.  The existing code there predates upstream's
invention of the sp->defaulttype field, and its heuristic for
finding the oldest time type has now been subsumed into the
code that sets sp->defaulttype.

Possibly this should be back-patched, but I'm not currently aware
of any visible consequences of this bug in released branches.

Per report from Aleksander Alekseev and additional investigation.

Discussion: https://postgr.es/m/CAJ7c6TOATjJqvhnYsui0=CO5XFMF4dvTGH+skzB--jNhqSQu5g@mail.gmail.com
2025-01-16 12:43:03 -05:00
Peter Geoghegan
901bd4a65a Fix nbtree contradictory array element comment.
Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp
execution.
2025-01-16 11:26:41 -05:00
Álvaro Herrera
86374c9a0e
Split ATExecValidateConstraint into reusable pieces
With this, we have separate functions to add validation requests to
ALTER TABLE's phase 3 queue for check and foreign key constraints, which
allows reusing them in future commits -- particularly this will allow us
to perform validation of invalid foreign key constraints in partitioned
tables.

We could have let the check constraint code alone since we don't need to
reuse that for anything at this point, but it seems cleaner and more
consistent to do both at the same time.

Author: Amul Sul <sulamul@gmail.com>
Discussion: https://postgr.es/m/CAAJ_b96Bp=-ZwihPPtuaNX=SrZ0U6ZsXD3+fgARO0JuKa8v2jQ@mail.gmail.com
2025-01-16 16:44:24 +01:00
Dean Rasheed
80feb727c8 Add OLD/NEW support to RETURNING in DML queries.
This allows the RETURNING list of INSERT/UPDATE/DELETE/MERGE queries
to explicitly return old and new values by using the special aliases
"old" and "new", which are automatically added to the query (if not
already defined) while parsing its RETURNING list, allowing things
like:

  RETURNING old.colname, new.colname, ...

  RETURNING old.*, new.*

Additionally, a new syntax is supported, allowing the names "old" and
"new" to be changed to user-supplied alias names, e.g.:

  RETURNING WITH (OLD AS o, NEW AS n) o.colname, n.colname, ...

This is useful when the names "old" and "new" are already defined,
such as inside trigger functions, allowing backwards compatibility to
be maintained -- the interpretation of any existing queries that
happen to already refer to relations called "old" or "new", or use
those as aliases for other relations, is not changed.

For an INSERT, old values will generally be NULL, and for a DELETE,
new values will generally be NULL, but that may change for an INSERT
with an ON CONFLICT ... DO UPDATE clause, or if a query rewrite rule
changes the command type. Therefore, we put no restrictions on the use
of old and new in any DML queries.
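
As an illustrative sketch (the table and column names are hypothetical;
the syntax is as described above):

    UPDATE accounts SET balance = balance - 100.00
     WHERE id = 42
     RETURNING old.balance AS balance_before, new.balance AS balance_after;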

Dean Rasheed, reviewed by Jian He and Jeff Davis.

Discussion: https://postgr.es/m/CAEZATCWx0J0-v=Qjc6gXzR=KtsdvAE7Ow=D=mu50AgOe+pvisQ@mail.gmail.com
2025-01-16 14:57:35 +00:00
Peter Eisentraut
7407b2d48c Remove dead code
As of commit 9895b35cb88, AlterDomainAddConstraint() can only be
called with constraints of type CONSTR_CHECK and CONSTR_NOTNULL.  So
all the code to check for and reject other constraint type values is
dead and can be removed.

Author: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CACJufxHitd5LGLBSSAPShhtDWxT0ViVKTHinkYW-skBX93TcpA@mail.gmail.com
2025-01-16 14:37:28 +01:00
Peter Eisentraut
7a947ed25b refactor: split ATExecAlterConstrRecurse()
This splits out a couple of subroutines from
ATExecAlterConstrRecurse().  This makes the main function a bit
smaller, and a future patch (NOT ENFORCED foreign-key constraints)
will also want to call some of the pieces separately.

Author: Amul Sul <amul.sul@enterprisedb.com>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA%40mail.gmail.com
2025-01-16 13:24:11 +01:00
Peter Eisentraut
d278541be4 Fix error handling of pg_b64_decode()
Fix for commit 761c79508e7.  The previous error handling logic was not
quite correct.

Discussion: https://www.postgresql.org/message-id/flat/CAEudQAq-3yHsSdWoOOaw%2BgAQYgPMpMGuB5pt2yCXgv-YuxG2Hg%40mail.gmail.com
2025-01-16 09:02:21 +01:00
Peter Eisentraut
ff030ebe25 Check return of pg_b64_encode() for error
Forgotten in commit 761c79508e7.

Author: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAEudQAq-3yHsSdWoOOaw%2BgAQYgPMpMGuB5pt2yCXgv-YuxG2Hg%40mail.gmail.com
2025-01-16 08:35:57 +01:00
Amit Kapila
965b2cc0a4 Doc: Improve the Replica Identity information.
This commit improves the Replica Identity information and clarifies its
related restrictions.

Reported-by: James Coleman
Author: Peter Smith
Co-authored-by: Robert Treat
Reviewed-by: Laurenz Albe, Amit Kapila
Discussion: https://postgr.es/m/CAAaqYe_=7qFSqW7qavvhVy58mmzk1uSQ0RReRiUHyKO5znvr7g@mail.gmail.com
2025-01-16 08:53:39 +05:30
Michael Paquier
32a18cc0a7 Move routines to manipulate WAL into PostgreSQL::Test::Cluster
These facilities were originally in the recovery TAP test
039_end_of_wal.pl.  A follow-up bug fix with a TAP test doing similar
WAL manipulations requires them, and all these had better not be
duplicated due to their complexity.  The routine names are tweaked to
use "wal" more consistently, similarly to the existing "advance_wal".

In v14 and v13, the new routines are moved to PostgresNode.pm.
039_end_of_wal.pl is updated to use the refactored routines, without
changing its coverage.

Reviewed-by: Alexander Kukushkin
Discussion: https://postgr.es/m/CAFh8B=mozC+e1wGJq0H=0O65goZju+6ab5AU7DEWCSUA2OtwDg@mail.gmail.com
Backpatch-through: 13
2025-01-16 09:25:29 +09:00
Peter Eisentraut
d5221c49a3 Fix cpluspluscheck for "Change gist stratnum function to use CompareType"
Commit 630f9a43cec introduced an enum forward declaration, which
doesn't work in C++.  To fix, just include the header file to get the
type.
2025-01-15 23:11:08 +01:00
Melanie Plageman
3edc67d337 Add more general summary to vacuumlazy.c
Add more comments at the top of vacuumlazy.c on heap relation vacuuming
implementation.

Previously vacuumlazy.c only had details related to dead TID storage.
This commit adds a more general summary to help future developers
understand the heap relation vacuum design and implementation at a high
level.

Reviewed-by: Alena Rybakina, Robert Haas, Andres Freund, Bilal Yavuz
Discussion: https://postgr.es/m/flat/CAAKRu_ZF_KCzZuOrPrOqjGVe8iRVWEAJSpzMgRQs%3D5-v84cXUg%40mail.gmail.com
2025-01-15 14:17:32 -05:00
Peter Eisentraut
44512e7c95 Add a bit of documentation related to IWYU
Add some basic information about IWYU to src/tools/pginclude/README.

Discussion: https://www.postgresql.org/message-id/flat/9395d484-eff4-47c2-b276-8e228526c8ae@eisentraut.org
2025-01-15 18:57:53 +01:00
Peter Eisentraut
fecc8021e1 IWYU pragmas for catalog headers
Add "IWYU pragma: export" annotations in each catalog header file so
that, for instance, including "catalog/pg_aggregate.h" is considered
acceptable in place of "catalog/pg_aggregate_d.h".  This is very
common and it seems better to silence IWYU about it than trying to fix
this up.

Discussion: https://www.postgresql.org/message-id/flat/9395d484-eff4-47c2-b276-8e228526c8ae@eisentraut.org
2025-01-15 18:57:53 +01:00
Peter Eisentraut
74938d1320 IWYU widely useful pragmas
Add various widely useful "IWYU pragma" annotations, such as

- Common header files such as c.h, postgres.h should be "always_keep".

- System headers included in c.h, postgres.h etc. should be considered
  "export".

- Some portability headers such as getopt_long.h should be
  "always_keep", so they are not considered superfluous on some
  platforms.

- Certain system headers included from portability headers should be
  considered "export" because the purpose of the portability header is
  to wrap them.

- Superfluous includes marked as "for backward compatibility" get a
  formal IWYU annotation.

- Generated header included in utils/syscache.h is marked exported.
  This is a very commonly used include and this avoids lots of
  complaints.

Discussion: https://www.postgresql.org/message-id/flat/9395d484-eff4-47c2-b276-8e228526c8ae@eisentraut.org
2025-01-15 18:57:53 +01:00
Peter Eisentraut
761c79508e postgres_fdw: SCRAM authentication pass-through
This enables SCRAM authentication for postgres_fdw when connecting to
a foreign server without having to store a plain-text password on user
mapping options.

This is done by saving the SCRAM ClientKey and ServerKey from the
client authentication and using those instead of the plain-text
password for the server-side SCRAM exchange.  The new foreign-server
or user-mapping option "use_scram_passthrough" enables this.
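
A hypothetical setup might look like this (server, host, and role names
are illustrative; the option name is as described above):

    CREATE SERVER remote_pg FOREIGN DATA WRAPPER postgres_fdw
      OPTIONS (host 'remote.example.com', dbname 'appdb',
               use_scram_passthrough 'true');
    CREATE USER MAPPING FOR app_user SERVER remote_pg;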

Co-authored-by: Matheus Alcantara <mths.dev@pm.me>
Co-authored-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/27b29a35-9b96-46a9-bc1a-914140869dac@gmail.com
2025-01-15 17:58:05 +01:00
Peter Eisentraut
b6463ea6ef Downgrade error in object_aclmask_ext() to internal
The "does not exist" error in object_aclmask_ext() was written as
ereport(), suggesting that it is user-facing.  This is problematic:
get_object_class_descr() is meant to be for internal errors only and
does not support translation.

For the has_xxx_privilege functions, the error has not been
user-facing since commit 403ac226ddd.  The remaining users are
pg_database_size() and pg_tablespace_size().  The call stack here is
pretty deep and this dependency is not obvious.  Here we can put in an
explicit existence check with a bespoke error message early in the
function.

Then we can downgrade the error in object_aclmask_ext() to a normal
"cache lookup failed" internal error.

Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/da2f8942-be6d-48d0-ac1c-a053370a6b1f@eisentraut.org
2025-01-15 16:58:44 +01:00
Peter Eisentraut
de9037d0d0 Downgrade errors in object_ownercheck() to internal
The "does not exist" errors in object_ownership() were written as
ereport(), suggesting that they are user-facing.  But no code path
except one can reach this function without first checking that the
object exists.  If this were actually a user-facing error message,
then there would be some problems: get_object_class_descr() is meant
to be for internal errors only and does not support translation.

The one case that can reach this without first checking the object
existence is from be_lo_unlink().  (This makes some sense since large
objects are referred to by their OID directly.)  In this one case, we
can add a line of code to check the object existence explicitly,
consistent with other LO code.

For the rest, downgrade the error messages to elog()s.  The new
message wordings are the same as in DropObjectById().

Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/da2f8942-be6d-48d0-ac1c-a053370a6b1f@eisentraut.org
2025-01-15 16:58:44 +01:00
Peter Eisentraut
6fdd5d9563 Drop warning-free support for Flex 2.5.35
This removes all the various workarounds for avoiding compiler
warnings with Flex 2.5.35.  Several recent patches have added
additional warnings that would either need to be fixed along the lines
of the existing workarounds, or we decide to no longer care about
this, which we do here.

Flex 2.5.35 is extremely outdated, and you can't even download it
anymore from any of the Flex project sites, so it's nearly impossible
to support.

After this, using Flex 2.5.35 will still work, but the generated code
will produce numerous compiler warnings.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/1a204ccd-7ae6-478c-a431-407b5c48ccc6@eisentraut.org
2025-01-15 15:35:08 +01:00
Peter Eisentraut
630f9a43ce Change gist stratnum function to use CompareType
This adjusts commit 7406ab623fe so that the gist strategy number
mapping support function uses the CompareType enum as
input, instead of the "well-known" RT*StrategyNumber strategy numbers.

This is a bit cleaner, since you are not dealing with two sets of
strategy numbers.  Also, this will enable us to subsume this system
into a more general system of using CompareType to define operator
semantics across index methods.

Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-01-15 11:34:04 +01:00
Peter Eisentraut
6339f6468e Rename RowCompareType to CompareType
RowCompareType served as a way to describe the fundamental meaning of
an operator, notionally independent of an operator class (although so
far this was only really supported for btrees).  Its original purpose
was for use inside RowCompareExpr, and it has also found some small
use outside, such as for get_op_btree_interpretation().

We want to expand this now, as a more general way to describe operator
semantics for other index access methods, including gist (to improve
GistTranslateStratnum()) and others not written yet.  To avoid future
confusion, we rename the type to CompareType and the symbols from
ROWCOMPARE_XXX to COMPARE_XXX to reflect their more general purpose.

Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-01-15 08:44:01 +01:00
Tom Lane
9a45a89c38 Avoid symbol collisions between pqsignal.c and legacy-pqsignal.c.
In the name of ABI stability (that is, to avoid a library major
version bump for libpq), libpq still exports a version of pqsignal()
that we no longer want to use ourselves.  However, since that has
the same link name as the function exported by src/port/pqsignal.c,
there is a link ordering dependency determining which version will
actually get used by code that uses libpq as well as libpgport.a.

It now emerges that the wrong version has been used by pgbench and
psql since commit 06843df4a rearranged their link commands.  This
can result in odd failures in pgbench with the -T switch, since its
SIGALRM handler will now not be marked SA_RESTART.  psql may have
some edge-case problems in \watch, too.

Since we don't want to depend on link ordering effects anymore,
let's fix this in the same spirit as b6c7cfac8: use macros to change
the actual link names of the competing functions.  We cannot change
legacy-pqsignal.c's exported name of course, so the victim has to be
src/port/pqsignal.c.

In master, rename its exported name to be pqsignal_fe in frontend or
pqsignal_be in backend.  (We could perhaps have gotten away with using
the same symbol in both cases, but since the FE and BE versions now
work a little differently, it seems advisable to use different names.)

In back branches, rename to pqsignal_fe in frontend but keep it as
pqsignal in backend.  The frontend change could affect third-party
code that is calling pqsignal from libpgport.a or libpgport_shlib.a,
but only if the code is compiled against port.h from a different minor
release than libpgport.  Since we don't support using libpgport as a
shared library, it seems unlikely that there will be such a problem.
I left the backend symbol unchanged to avoid an ABI break for
extensions.  This means that the link ordering hazard still exists
for any extension that links against libpq.  However, none of our own
extensions use both pqsignal() and libpq, and we're not making things
any worse for third-party extensions that do.

Report from Andy Fan, diagnosis by Fujii Masao, patch by me.
Back-patch to all supported branches, as 06843df4a was.

Discussion: https://postgr.es/m/87msfz5qv2.fsf@163.com
2025-01-14 18:50:24 -05:00
Melanie Plageman
2ae98ea5ab Synchronize guc_tables.c categories with vacuum docs categories
ca9c6a5680d consolidated most of the vacuum-related GUCs' documentation
into a new subsection. af2317652d5daf8b then enforced this order in
postgresql.conf.sample. This commit reorganizes the GUC groups in
guc_tables.c/h to match the updated ordering in the docs.

Reported-by: Álvaro Herrera
Reviewed-by: Álvaro Herrera, Alena Rybakina
Discussion: https://postgr.es/m/202501132046.m4mcvxxswznu%40alvherre.pgsql
2025-01-14 15:31:00 -05:00
Dean Rasheed
00f4c2959d psql: Add option to use expanded mode to all list commands.
This allows "x" to be appended to any psql list-like meta-command,
forcing its output to be displayed in expanded mode. This improves
readability in cases where the output is very wide. For example,
"\dfx+" (or equivalently "\df+x") will produce a list of functions,
with additional details, in expanded mode.

This works with all \d* meta-commands, plus \l, \z, and \lo_list, with
the one exception that the expanded mode option "x" cannot be appended
to "\d" by itself, since "\dx" already means something else.

Dean Rasheed, reviewed by Greg Sabino Mullane.

Discussion: https://postgr.es/m/CAEZATCVXJk3KsmCncf7PAVbxdDAUDm3QzDgGT7mBYySWikuOYw@mail.gmail.com
2025-01-14 16:29:15 +00:00
Fujii Masao
94b914f601 ecpg: Restore detection of unsupported COPY FROM STDIN.
The ecpg command includes code to warn about unsupported COPY FROM STDIN
statements in input files. However, since commit 3d009e45bd,
this functionality has been broken due to a bug introduced in that commit,
causing ecpg to fail to detect the statement.

This commit resolves the issue, restoring ecpg's ability to detect
COPY FROM STDIN and issue a warning as intended.

Back-patch to all supported versions.

Author: Ryo Kanbayashi
Reviewed-by: Hayato Kuroda, Tom Lane
Discussion: https://postgr.es/m/CANOn0Ez_t5uDCUEV8c1YORMisJiU5wu681eEVZzgKwOeiKhkqQ@mail.gmail.com
2025-01-15 01:23:02 +09:00
Dean Rasheed
4cb560b53f Consistently spell "leakproof" without a hyphen.
The overwhelming majority of places already did this, but a small
handful of places had a hyphen.

Yugo Nagata.

Discussion: https://postgr.es/m/CAEZATCXnnuORE2BoGwHw2zbtVvsPOLhbfVmEk9GxRzK%2Bx3OW-Q%40mail.gmail.com
2025-01-14 13:50:54 +00:00
Dean Rasheed
2355e51110 psql: Add leakproof indicator to \df+, \do+, \dAo+, and \dC+ output.
This allows users to determine whether particular functions are
leakproof, and whether the underlying functions used by operators and
casts are leakproof. This is useful to determine whether indexes can
be used in queries on security barrier views or tables with row-level
security policies.
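
For example, in psql (a sketch; the exact column label in the output may
vary):

    \df+ lower
    \do+ =

The \df+ output now includes a column indicating whether lower() is
leakproof, and \do+ does the same for the functions underlying the
matched operators.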

Yugo Nagata, reviewed by Erik Wienhold and Dean Rasheed.

Discussion: https://postgr.es/m/20240701220817.483f9b645b95611f8b1f65da%40sranhm.sraoss.co.jp
2025-01-14 13:23:24 +00:00
Heikki Linnakangas
af8cd1639a Fix catcache invalidation of a list entry that's being built
If a new catalog tuple is inserted that belongs to a catcache list
entry, and cache invalidation happens while the list entry is being
built, the list entry might miss the newly inserted tuple.

To fix, change the way we detect concurrent invalidations while a
catcache entry is being built. Keep a stack of entries that are being
built, and apply cache invalidation to those entries in addition to
the real catcache entries. This is similar to the in-progress list in
relcache.c.

Back-patch to all supported versions.

Reviewed-by: Noah Misch
Discussion: https://www.postgresql.org/message-id/2234dc98-06fe-42ed-b5db-ac17384dc880@iki.fi
2025-01-14 14:28:49 +02:00
Michael Paquier
ce9a74707d Bump PGSTAT_FILE_FORMAT_ID
Oversight in f92c854cf406, which changed the definition of
PgStat_BktypeIO, impacting PgStat_IO, the on-disk format for I/O
pgstats data.
2025-01-14 15:17:22 +09:00
Michael Paquier
720e529840 Fix potential integer overflow in bringetbitmap()
This function expects an "int64" as result and stores the number of
pages to add to the index scan bitmap as an "int", multiplying its final
result by 10.  For a sufficiently large relation, this can theoretically
overflow when counting more than (INT32_MAX / 10) pages, given that the
number of pages is upper-bounded by MaxBlockNumber.

To avoid the overflow, this commit redefines "totalpages", used to
calculate the result, to be an "int64" rather than an "int".

Reported-by: Evgeniy Gorbanyov
Author: James Hunter
Discussion: https://www.postgresql.org/message-id/07704817-6fa0-460c-b1cf-cd18f7647041@basealt.ru
Backpatch-through: 13
2025-01-14 15:12:56 +09:00
Michael Paquier
d35ea27e51 Move information about pgstats kinds into its own header pgstat_kind.h
This includes all the definitions for the various PGSTAT_KIND_* values,
the range allowed for custom stats kinds, and some macros related to
all that.

One use case behind this split is the possibility of using this
information in frontend tools, without having to rely on pgstat.h and its
backend footprint.

Author: Michael Paquier
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/Z24fyb3ipXKR38oS@paquier.xyz
2025-01-14 12:43:07 +09:00
Michael Paquier
d2181b3218 Remove assertion in pgstat_count_io_op()
An equivalent check is done with pgstat_is_ioop_tracked_in_bytes(), so
there is no need for this extra one.  Small cleanup that should have
been included in f92c854cf406.

Author: Nazir Bilal Yavuz
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/CAN55FZ0oqxBaaHAEsj=xFqkzE3n5P=3RA1V_igXwL-RV7QRzyw@mail.gmail.com
2025-01-14 12:19:51 +09:00
Michael Paquier
f92c854cf4 Make pg_stat_io count IOs as bytes instead of blocks for some operations
Currently in pg_stat_io view, IOs are counted as blocks of size
BLCKSZ.  There are two limitations with this design:
* The actual number of I/O requests sent to the kernel is lower because
I/O requests may be merged before being sent.  Additionally, it gives
the impression that all I/Os are done in units of the block size, which
obscures the benefits of merging I/O requests.
* Some patches are in progress to extend pg_stat_io to track operations
that are not tied to the block size.  For example, WAL reads are done in
variable numbers of bytes and cannot be shown correctly in the pg_stat_io
view, and we want to keep all this data in a single system view rather
than spreading it across multiple relations, to ease monitoring.

WaitReadBuffers() can now be tracked as a single read operation
worth N blocks.  Same for ExtendBufferedRelShared() and
ExtendBufferedRelLocal() for extensions.

Three columns are added to pg_stat_io for reads, writes and extensions
for the byte calculations.  op_bytes, which was always hardcoded to
BLCKSZ, is removed.  IO backend statistics are updated to reflect these
changes.
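
As a sketch of what this enables (assuming the new byte columns are named
read_bytes, write_bytes and extend_bytes, to match the existing reads,
writes and extends counters):

    SELECT backend_type, object, context,
           reads, read_bytes, writes, write_bytes, extends, extend_bytes
      FROM pg_stat_io
     WHERE reads > 0 OR writes > 0;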

Bump catalog version.

Author: Nazir Bilal Yavuz
Reviewed-by: Bertrand Drouvot, Melanie Plageman
Discussion: https://postgr.es/m/CAN55FZ0oqxBaaHAEsj=xFqkzE3n5P=3RA1V_igXwL-RV7QRzyw@mail.gmail.com
2025-01-14 12:14:29 +09:00
Jeff Davis
b4a07f532b Revert "TupleHashTable: store additional data along with tuple."
This reverts commit e0ece2a981ee9068f50c4423e303836c2585eb02 due to
performance regressions.

Reported-by: David Rowley
2025-01-13 14:14:33 -08:00
Melanie Plageman
af2317652d Reorder vacuum GUCs in postgresql.conf.sample to match docs
ca9c6a5680d consolidated most of the vacuum-related GUCs' documentation into
a new subsection. It neglected, however, to reorganize
postgresql.conf.sample to match the new order. Do this now.

Reported-by: Álvaro Herrera
Discussion: https://postgr.es/m/202501110902.5banlseavz7c%40alvherre.pgsql
2025-01-13 15:21:04 -05:00
Peter Geoghegan
1c854eb893 Add BTOPTIONS_PROC comments to nbtree.h.
Add comments explaining the purpose of B-Tree support function 5 to
nbtree.h for consistency (all other support functions were already
described by nearby comments).

This fixes what was arguably an oversight in commit 911e702077, or in
follow-up doc commit 15cb2bd2 (which documented support function 5 in
btree.sgml, but neglected to add anything to nbtree.h).
2025-01-13 15:02:14 -05:00
Peter Geoghegan
597b1ffbf1 Move nbtree preprocessing into new .c file.
Quite a bit of code within nbtutils.c is only called during nbtree
preprocessing.  Move that code into a new .c file, nbtpreprocesskeys.c.
Also reorder some of the functions within the new file for clarity.

This commit has no functional impact.  It is strictly mechanical.

Author: Peter Geoghegan <pg@bowt.ie>
Suggested-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/CAH2-WznwNn1BDOpWxHBUK1f3Rdw8pO9UCenWXnvT=n9GO8GnLA@mail.gmail.com
Discussion: https://postgr.es/m/86930045-5df5-494a-b4f1-815bc3fbcce0%40iki.fi
2025-01-13 12:15:00 -05:00
Nathan Bossart
a8a762bc46 Add commit 6e826278f1 to .git-blame-ignore-revs. 2025-01-13 09:37:32 -06:00
Richard Guo
6e826278f1 Fix pgindent damage
Oversight in commit e0ece2a98.
2025-01-13 11:27:32 +09:00
Daniel Gustafsson
97698cc517 Fix HBA option count
Commit 27a1f8d108 missed updating the max HBA option count to
account for the new option added.  Fix by bumping the counter
and adjusting the relevant comment to match.  Backpatch down to
all supported branches like the erroneous commit.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/286764.1736697356@sss.pgh.pa.us
Backpatch-through: v13
2025-01-12 23:44:39 +01:00
Dean Rasheed
a93e2a1e25 Fix JsonExpr deparsing to quote variable names in the PASSING clause.
When deparsing a JsonExpr, variable names in the PASSING clause were
not quoted. However, since they are parsed as ColLabel tokens, some
variable names require double quotes to ensure that they are properly
interpreted. Fix by using quote_identifier() in the deparsing code.

This oversight was limited to the SQL/JSON query functions
JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE().

Back-patch to v17, where these functions were added.

Dean Rasheed, reviewed by Tom Lane.

Discussion: https://postgr.es/m/CAEZATCXTpAS%3DncfLNTZ7YS6O5puHeLg_SUYAit%2Bcs7wsrd9Msg%40mail.gmail.com
2025-01-12 13:35:12 +00:00
Dean Rasheed
d673eefd41 Fix XMLTABLE() deparsing to quote namespace names if necessary.
When deparsing an XMLTABLE() expression, XML namespace names were not
quoted. However, since they are parsed as ColLabel tokens, some names
require double quotes to ensure that they are properly interpreted.
Fix by using quote_identifier() in the deparsing code.

Back-patch to all supported versions.

Dean Rasheed, reviewed by Tom Lane.

Discussion: https://postgr.es/m/CAEZATCXTpAS%3DncfLNTZ7YS6O5puHeLg_SUYAit%2Bcs7wsrd9Msg%40mail.gmail.com
2025-01-12 12:54:32 +00:00
Tom Lane
29dfffae0a Repair memory leaks in plpython.
PLy_spi_execute_plan (PLyPlan.execute) and PLy_cursor_plan
(plpy.cursor) use PLy_output_convert to convert Python values
into Datums that can be passed to the query-to-execute.  But they
failed to pay much attention to its warning that it can leave "cruft
generated along the way" behind.  Repeated use of these methods can
result in a substantial memory leak for the duration of the calling
plpython function.

To fix, make a temporary memory context to invoke PLy_output_convert
in.  This also lets us get rid of the rather fragile code that was
here for retail pfree's of the converted Datums.  Indeed, we don't
need the PLyPlanObject.values field anymore at all, though I left it
in place in the back branches in the name of ABI stability.

Mat Arye and Tom Lane, per report from Mat Arye.  Back-patch to all
supported branches.

Discussion: https://postgr.es/m/CADsUR0DvVgnZYWwnmKRK65MZg7YLUSTDLV61qdnrwtrAJgU6xw@mail.gmail.com
2025-01-11 11:45:56 -05:00
Peter Eisentraut
ca87c415e2 Add support for NOT ENFORCED in CHECK constraints
This adds support for the NOT ENFORCED/ENFORCED flag for constraints,
with support for check constraints.

The plan is to eventually support this for foreign key constraints,
where it is typically more useful.

Note that CHECK constraints do not currently support ALTER operations,
so changing the enforceability of an existing constraint isn't
possible without dropping and recreating it.  This could be added
later.
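
For example (hypothetical table; the constraint-attribute syntax is as
added by this commit):

    CREATE TABLE orders (
        qty integer,
        CONSTRAINT qty_positive CHECK (qty > 0) NOT ENFORCED
    );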

Author: Amul Sul <amul.sul@enterprisedb.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Tested-by: Triveni N <triveni.n@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
2025-01-11 10:52:30 +01:00
Tatsuo Ishii
72ceb21b02 Fix a compiler warning in initStringInfo().
Fix a compiler warning found by Cfbot. This was caused by commit
bb86e85e442.
2025-01-11 15:52:37 +09:00
Jeff Davis
ceb2855522 Fix redefinition of type in commit e0ece2a981. 2025-01-10 17:45:27 -08:00
Jeff Davis
e0ece2a981 TupleHashTable: store additional data along with tuple.
Previously, the caller needed to allocate the memory and the
TupleHashTable would store a pointer to it. That wastes space for the
palloc overhead as well as the size of the pointer itself.

Now, the TupleHashTable relies on the caller to correctly specify the
additionalsize, and allocates that amount of space. The caller can
then request a pointer into that space.

Discussion: https://postgr.es/m/b9cbf0219a9859dc8d240311643ff4362fd9602c.camel@j-davis.com
Reviewed-by: Heikki Linnakangas
2025-01-10 17:14:37 -08:00
David Rowley
34c6e65242 Make verify_compact_attribute available in non-assert builds
6f3820f37 adjusted the assert-enabled validation of the CompactAttribute
to call a new external function to perform the validation.  That commit
made the function available only when building with
USE_ASSERT_CHECKING.  Because TupleDescCompactAttr() is a static
inline function, the call to verify_compact_attribute() was compiled
into any extension which uses TupleDescCompactAttr().  As a result,
loading an assert-enabled build of such an extension into a PostgreSQL
server built without asserts failed, because the function was
unavailable in core.

To fix this, make verify_compact_attribute() available unconditionally,
but make it do nothing unless building with USE_ASSERT_CHECKING.

Author: Andrew Kane <andrew@ankane.org>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAOdR5yHfMEMW00XGo=v1zCVUS6Huq2UehXdvKnwtXPTcZwXhmg@mail.gmail.com
2025-01-11 13:45:54 +13:00
Tatsuo Ishii
a9dcbb4d5c Add new StringInfo APIs to allow callers to specify the buffer size.
Previously, the StringInfo APIs allocated buffers with a fixed initial
allocation size of 1024 bytes.  This may be too large and inappropriate
for some callers that can make do with smaller memory buffers.  To fix
this, introduce new APIs that allow callers to specify the initial
buffer size.

extern StringInfo makeStringInfoExt(int initsize);
extern void initStringInfoExt(StringInfo str, int initsize);

Existing APIs (makeStringInfo() and initStringInfo()) are changed to
call makeStringInfoExt and initStringInfoExt respectively (via inline
helper functions makeStringInfoInternal and initStringInfoInternal),
with the default buffer size of 1024.

Reviewed-by: Nathan Bossart, David Rowley, Michael Paquier, Gurjeet Singh
Discussion: https://postgr.es/m/20241225.123704.1194662271286702010.ishii%40postgresql.org
2025-01-11 08:23:46 +09:00
Melanie Plageman
ca9c6a5680 Consolidate docs for vacuum-related GUCs in new subsection
GUCs related to vacuum's freezing behavior were documented in a
subsection of the Client Connection Defaults documentation. These GUCs
don't belong there, as they affect the freezing behavior of all vacuums
-- including autovacuums.

There wasn't a clear alternative location, so this commit makes a new
"Server Configuration" docs subsection, "Vacuuming", with a subsection
for "Freezing". It also moves the "Automatic Vacuuming" subsection and
the docs on GUCs controlling cost-based vacuum delay under the new
"Vacuuming" subsection.

The other vacuum-related GUCs under the "Resource Consumption"
subsection have been left in their current location, as they seem to fit
there.

The GUCs' documentation was largely lifted and shifted. The only
modification made was the addition of a few missing <literal> tags.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/flat/CAAKRu_aQUOaMYrcjNuXeSkJtaX9oRUzKP57bsYbC0gVVWS%2BcbA%40mail.gmail.com
2025-01-10 18:22:19 -05:00
Daniel Gustafsson
27a1f8d108 Fix missing ldapscheme option in pg_hba_file_rules()
The ldapscheme option was missed when inspecting the HbaLine for
assembling rows for the pg_hba_file_rules function.  Backpatch
to all supported versions.

Author: Laurenz Albe <laurenz.albe@cybertec.at>
Reported-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Bug: 18769
Discussion: https://postgr.es/m/18769-dd8610cbc0405172@postgresql.org
Backpatch-through: v13
2025-01-10 22:02:58 +01:00
Peter Geoghegan
5b14ec0a48 Fix obsolete nbtree README left link remarks.
Oversight in commit 1bd4bc85, which made nbtree backwards scans operate
off of a copy of each page's left link as of the time of its call to
_bt_readpage.
2025-01-10 15:42:17 -05:00
Nathan Bossart
3d0b4b1068 Use a non-locking initial test in TAS_SPIN on AArch64.
Our testing showed that this is helpful at sufficiently high
contention levels and doesn't hurt performance on smaller machines.
The new TAS_SPIN macro for AArch64 is identical to the ones added
for PPC and x86_64 (see commits bc2a050d40 and b03d196be0).

Reported-by: Salvatore Dipietro
Reviewed-by: Jingtang Zhang, Andres Freund
Tested-by: Tom Lane
Discussion: https://postgr.es/m/ZxgDEb_VpWyNZKB_%40nathan
2025-01-10 13:18:04 -06:00
Andres Freund
28e7a9968e postmaster: Rename some shutdown related PMState phase names
The previous names weren't particularly clear.  Future patches will add more
shutdown phases, making it even more important for the phase names to be
understandable.

Suggested-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/d2cd8fd3-396a-4390-8f0b-74be65e72899@iki.fi
2025-01-10 11:43:00 -05:00
Andres Freund
e84712c738 postmaster: Make btmask_add() variadic
Suggested-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/d2cd8fd3-396a-4390-8f0b-74be65e72899@iki.fi
2025-01-10 11:43:00 -05:00
Andres Freund
7e957cbb50 postmaster: Introduce variadic btmask_all_except()
Upcoming patches would otherwise need btmask_all_except3().

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/w3z6w3g4aovivs735nk4pzjhmegntecesm3kktpebchegm5o53@aonnq2kn27xi
2025-01-10 11:43:00 -05:00
Andres Freund
40d4031abd postmaster: Improve logging of signals sent by postmaster
Previously many, in some cases important, signals were never logged. In other
cases the signal name was only included numerically.

As part of this, change the debug log level the signals are logged at to
DEBUG3; previously some were DEBUG2, some DEBUG4.

Also switch from direct use of kill() to signal the autovacuum launcher to
using signal_child(). There doesn't seem to be a reason for directly using
kill().

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-10 11:43:00 -05:00
Andres Freund
7148cbbdc6 postmaster: Update pmState via a wrapper function
This makes logging of state changes easier - state transitions are now logged
at DEBUG1. Without that logging it was surprisingly hard to understand the
current state of the system while debugging.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-10 11:42:56 -05:00
Álvaro Herrera
cc811f92ba
Adjust signature of cluster_rel() and its subroutines
cluster_rel() receives the OID of the relation to process, which it
opens and locks; but then its subroutine copy_table_data() also receives
the relation OID and opens it by itself.  This is a bit wasteful.  It's
better to have cluster_rel() receive the relation already open, and pass
it down to its subroutines as necessary; then cluster_rel closes the rel
before returning.  This simplifies things.

But a better motivation to make this change is that a future command to
do logical-decoding-based "concurrent VACUUM FULL" will need to release
all locks on the relation (and possibly on the clustering index) at some
point.  Since it makes little sense to keep the relation reference
without the lock, the cluster_rel() function will also close it (and
the index).  With this arrangement, neither the function nor its
subroutines need to open extra references, which, again, makes things simpler.

Author: Antonin Houska <ah@cybertec.at>
Discussion: https://postgr.es/m/82651.1720540558@antos
2025-01-10 13:09:38 +01:00
David Rowley
2310064510 Fix UNION planner datatype issue
66c0185a3 gave the planner the ability to have union child queries
provide the union planner with pre-sorted input so that UNION queries
could be more efficiently implemented using Merge Append.

That commit overlooked checking that the UNION target list and the union
child target list's types all match.  In some corner cases, this could
result in the planner producing sorts using the sort operator of the
top-level UNION's target list type rather than of the union child's
target list's type.  The implications of this range from silently
working correctly despite using the wrong sort operator, all the way up
to a segmentation fault.

Here we fix by adjusting the planner so it makes no attempt to have the
subquery produce pre-sorted results when the data type of the UNION
target list and the types from the subquery target list don't match
exactly.

Backpatch to 17, where 66c0185a3 was introduced.

Reported-by: Jason Smith <dqetool@126.com>
Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us>
Bug: 18764
Discussion: https://postgr.es/m/18764-63ad667ea26e877a%40postgresql.org
Backpatch-through: 17
2025-01-10 14:30:25 +13:00
Michael Paquier
f0bf7857be Merge pgstat_count_io_op_n() and pgstat_count_io_op()
The pgstat_count_io_op() function, which counts a single I/O operation,
wraps pgstat_count_io_op_n() with a counter value of 1.  The latter is
declared in pgstat.h and used nowhere in the code, so let's remove it in
favor of the former.

This change also makes the code more symmetric with
pgstat_count_io_op_time(), which already uses a similar set of arguments,
except that it also counts the I/O time.  This will ease a bit the
integration of a follow-up patch that adds byte-level tracking in
pg_stat_io for some of its attributes, lifting the current restriction
based on BLCKSZ as all I/O operations are assumed to be block-based.

Author: Nazir Bilal Yavuz
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/CAN55FZ32ze812=yjyZg1QeXhKvACUM_Nu0_gyPQcUKKuVHL5xA@mail.gmail.com
2025-01-10 09:57:27 +09:00
Michael Paquier
2c14037bb5 Refactor some code related to backend statistics
This commit changes the way pending backend statistics are tracked by
moving them into a new structure called PgStat_BackendPending, removing
PgStat_BackendPendingIO.  PgStat_BackendPending currently only includes
PgStat_PendingIO for the pending I/O stats.

pgstat_flush_backend() is extended with a "flags" argument to control
which parts of the stats of a backend should be flushed.

With this refactoring, it becomes easier to plug into backend statistics
more data.  A patch to add information related to WAL in this stats kind
is under discussion.

Author: Bertrand Drouvot
Discussion: https://postgr.es/m/Z3zqc4o09dM/Ezyz@ip-10-97-1-34.eu-west-3.compute.internal
2025-01-10 09:00:48 +09:00
Nathan Bossart
39e3bcae44 Fix an ALTER GROUP ... DROP USER error message.
This error message stated the privileges required to add a member
to a group even if the user was trying to drop a member:

	postgres=> alter group a drop user b;
	ERROR:  permission denied to alter role
	DETAIL:  Only roles with the ADMIN option on role "a" may add members.

Since the required privileges for both operations are the same, we
can fix this by modifying the message to mention both adding and
dropping members:

	postgres=> alter group a drop user b;
	ERROR:  permission denied to alter role
	DETAIL:  Only roles with the ADMIN option on role "a" may add or drop members.

Author: ChangAo Chen
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/tencent_FAA0D00E3514AAF0BBB6322542A6094FEF05%40qq.com
Backpatch-through: 16
2025-01-09 17:10:13 -06:00
Tom Lane
bebe904038 Use @extschema:name@ notation in contrib transform modules.
Harden hstore_plperl, hstore_plpython, and ltree_plpython
against search-path-based attacks by using @extschema:name@
notation to refer to the underlying hstore or ltree data type.

This allows removal of the previous documentation warning
suggesting that they must be installed in the same schema as
the underlying data type.  In passing, also improve a para in
extend.sgml to suggest using @extschema:name@ for such purposes.
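
For instance, a dependent extension's script could refer to the hstore
type in a schema-safe way along these lines (a sketch; my_fn is
hypothetical, and hstore must be listed in the extension's "requires"
list for the substitution to apply):

    CREATE FUNCTION my_fn(val @extschema:hstore@.hstore) RETURNS text
      LANGUAGE sql AS $$ SELECT val::text $$;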

Discussion: https://postgr.es/m/692480.1736021695@sss.pgh.pa.us
2025-01-09 15:16:56 -05:00
Álvaro Herrera
ebd8fc7e47
Simplify signature of RewriteTable
This function doesn't need the lockmode to be passed: it was being used
to lock the new heap, but that's bogus, because the only caller has
already obtained the appropriate lock on the new heap (which is
unimportant anyway, because the relation's creation is not yet committed
and so no other session can see it).

Noticed while reviewing Antonin Houska's patch to add VACUUM FULL
CONCURRENTLY.
2025-01-09 14:17:12 +01:00
Fujii Masao
6313a76b35 doc: Clarify synchronous_standby_names parameter.
The synchronous_standby_names GUC allows specifying num_sync,
the number of synchronous standbys transactions must wait for
replies from. This value must be an integer greater than zero.
This commit updates the documentation to clarify this requirement.
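
For example (standby names are hypothetical; num_sync is the leading
integer and must be greater than zero):

    ALTER SYSTEM SET synchronous_standby_names = 'FIRST 2 (s1, s2, s3)';
    -- a value such as 'ANY 0 (s1, s2)' would be rejected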

Reported-by: Asphator <asphator@gmail.com>
Discussion: https://postgr.es/m/18663-b02f75cb919f1b60@postgresql.org
2025-01-09 21:04:49 +09:00
Álvaro Herrera
69ab446514
Fix SLRU bank selection code
The originally submitted code (using bit masking) was correct when the
number of slots was restricted to be a power of two -- but that
limitation was removed during development that led to commit
53c2a97a9266, which made the bank selection code incorrect.  This led to
always using a smaller number of banks than available.  Change said code
to use integer modulo instead, which works correctly with an arbitrary
number of banks.

It's likely that we could improve on this to avoid runtime use of
integer division.  But with this change we're, at least, not wasting
memory on unused banks, and more banks mean less contention, which is
likely to have a much higher performance impact than a single
instruction's latency.

Author: Yura Sokolov <y.sokolov@postgrespro.ru>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/9444dc46-ca47-43ed-9058-89c456316306@postgrespro.ru
2025-01-09 07:39:05 +01:00
Thomas Munro
970b97eeb8 Fix off_t overflow in pg_basebackup on Windows.
walmethods.c used off_t to navigate around a pg_wal.tar file that could
exceed 2GB, which doesn't work on Windows and would fail with misleading
errors.  Use pgoff_t instead.

Back-patch to all supported branches.

Author: Davinder Singh <davinder.singh@enterprisedb.com>
Reported-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Discussion: https://postgr.es/m/CAKZiRmyM4YnokK6Oenw5JKwAQ3rhP0YTz2T-tiw5dAQjGRXE3Q%40mail.gmail.com
2025-01-09 16:04:23 +13:00
Thomas Munro
026762dae3 Provide 64-bit ftruncate() and lseek() on Windows.
Change our ftruncate() macro to use the 64-bit variant of chsize(), and
add a new macro to redirect lseek() to _lseeki64().

Back-patch to all supported releases, in preparation for a bug fix.

Tested-by: Davinder Singh <davinder.singh@enterprisedb.com>
Discussion: https://postgr.es/m/CAKZiRmyM4YnokK6Oenw5JKwAQ3rhP0YTz2T-tiw5dAQjGRXE3Q%40mail.gmail.com
2025-01-09 15:00:58 +13:00
Jeff Davis
229e7793d9 Fix duplicate typedef from commit a2f17f004d.
Reported-by: Thomas Munro
2025-01-08 15:25:05 -08:00
Jeff Davis
a2f17f004d Control collation behavior with a method table.
Previously, behavior branched based on the provider. A method table is
less error-prone and more flexible.

The ctype behavior will be addressed in an upcoming commit.

Reviewed-by: Andreas Karlsson
Discussion: https://postgr.es/m/2830211e1b6e6a2e26d845780b03e125281ea17b.camel%40j-davis.com
2025-01-08 14:26:46 -08:00
Jeff Davis
4f5cef2607 Move code for collation version into provider-specific files.
Author: Andreas Karlsson
Discussion: https://postgr.es/m/4548a168-62cd-457b-8d06-9ba7b985c477%40proxel.se
2025-01-08 13:54:07 -08:00
Tom Lane
3c49d462db Disallow NAMEDTUPLESTORE RTEs in stored views, rules, etc.
A named tuplestore is necessarily a transient object, so it makes
no sense to reference one in a persistent object such as a view.
We didn't previously prevent that, with the result that if you
tried you would get some weird failure about how the executor
couldn't find the tuplestore.

We can mechanize a check for this case cheaply by making dependency
extraction complain if it comes across such an RTE.  This is a
plausible way of dealing with it since part of the problem is that we
have no way to make a pg_depend representation of a named tuplestore.

Report and fix by Yugo Nagata.  Although this is an old problem,
it's a very weird corner case and there have been no reports from
end users.  So it seems sufficient to fix it in master.

Discussion: https://postgr.es/m/20240726160714.e74d0db579f2c017e1ca0b7e@sraoss.co.jp
2025-01-08 16:35:54 -05:00
Andrew Dunstan
b20fe54c9c Set exit status for pgindent if pg_bsd_indent fails
Also document the exit codes in the script.

The new exit code is 3, and is not overridden by the exit code set in
--check mode.

Author: Ashutosh Bapat

Discussion: https://postgr.es/m/CAExHW5sPRSiFeLdP-u1Fa5ba7YS2f0gvLjmKOobopKadJwQ_GQ@mail.gmail.com
2025-01-08 10:56:12 -05:00
Peter Eisentraut
7b27f5fd36 plpgsql: pure parser and reentrant scanner
The plpgsql scanner is a wrapper around the core scanner, which
already uses the flex %option reentrant.  This patch only pushes up a
few levels the place where the scanner handle is allocated.  Before,
it was allocated in pl_scanner.c in a global variable, so to the
outside the scanner was not reentrant.  Now, it is allocated in
pl_comp.c and is passed as an argument to yyparse(), similar to how it
is handled in other reentrant scanners.

Also use flex yyextra to handle context information, instead of global
variables.  Again, this uses the existing yyextra support in the core
scanner.  This complements the other changes to make the scanner
reentrant.

The bison option %pure-parser is used to make the generated parser
pure.  This happens in the usual way, since plpgsql has its own bison
parser definition.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
2025-01-08 09:22:57 +01:00
Peter Eisentraut
b18464fad4 Remove useless function declaration
This function apparently never existed.
2025-01-08 08:31:04 +01:00
Michael Paquier
e0c3d5122e pg_freespacemap: Fix declaration of pg_freespace(regclass)
This function called generate_series() without enforcing its input
argument types, making it possible for an attacker to capture this call
by defining, for example, a generate_series(int,bigint).

The internals of pg_freespace(regclass) are changed to force the use of
bigint for the inputs of generate_series().  A more consistent style is
applied to all its hardcoded values while at it.

Issue introduced in 3f323eba89fb.

Reported-by: Noah Misch
Reviewed-by: Noah Misch
Discussion: https://postgr.es/m/20250106190428.ec.nmisch@google.com
2025-01-08 13:16:43 +09:00
Jeff Davis
3f482940db ExecInitAgg: update aggstate->numaggs and ->numtrans earlier.
Functions hash_agg_entry_size() and build_hash_tables() make use of
those values for memory size estimates.

Because this change only affects memory estimates, don't backpatch.

Discussion: https://postgr.es/m/7530bd8783b1a78d53a3c70383e38d8da0a5ffe5.camel%40j-davis.com
2025-01-07 15:13:50 -08:00
Jeff Davis
32ddfaffd1 nodeSetOp.c: missing additionalsize for BuildTupleHashTable().
Provide additionalsize argument, which can affect the calculations for
'nbuckets'. Also, future work for Hash Aggregation will rely on the
correct additionalsize.

Discussion: https://postgr.es/m/7530bd8783b1a78d53a3c70383e38d8da0a5ffe5.camel%40j-davis.com
2025-01-07 14:55:53 -08:00
Jeff Davis
8a96faedc4 Remove unused TupleHashTableData->entrysize.
Discussion: https://postgr.es/m/7530bd8783b1a78d53a3c70383e38d8da0a5ffe5.camel%40j-davis.com
2025-01-07 14:49:18 -08:00
Jeff Davis
834c9e807c Add missing typedefs.list entry for AggStatePerGroupData.
Discussion: https://postgr.es/m/7530bd8783b1a78d53a3c70383e38d8da0a5ffe5.camel%40j-davis.com
2025-01-07 14:33:21 -08:00
Nathan Bossart
4a68d50088 Use PqMsg_* macros in postgres.c.
Commit f4b54e1ed9, which introduced macros for protocol characters,
missed updating a couple of places in postgres.c.

Author: Dave Cramer
Reviewed-by: Fabrízio de Royes Mello
Discussion: https://postgr.es/m/CADK3HHJUVBPoVOmFesPB-fN8_dYt%2BQELV2UB6jxOW2Z40qF-qw%40mail.gmail.com
Backpatch-through: 17
2025-01-07 15:34:19 -06:00
Nathan Bossart
f7e1b3828a Add passwordcheck.min_password_length.
This new parameter can be used to change the minimum allowed
password length (in bytes).  Note that it has no effect if a user
supplies a pre-encrypted password.
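
A minimal sketch, assuming the passwordcheck module is loaded via
shared_preload_libraries (the value 12 is arbitrary):

    ALTER SYSTEM SET passwordcheck.min_password_length = 12;
    SELECT pg_reload_conf();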

Author: Emanuele Musella, Maurizio Boriani
Reviewed-by: Tomas Vondra, Bertrand Drouvot, Japin Li
Discussion: https://postgr.es/m/CA%2BugDNyYtHOtWCqVD3YkSVYDWD_1fO8Jm_ahsDGA5dXhbDPwrQ%40mail.gmail.com
2025-01-07 15:06:40 -06:00
Nathan Bossart
6d01541960 Lower default value of autovacuum_worker_slots in initdb as needed.
Commit c758119e5b increased the default number of semaphores
required for autovacuum workers from 3 to 16.  Unfortunately, some
systems have very low default settings for SEMMNS, and this change
moved the minimum required for Postgres well beyond that limit (see
commit 38da053463 for more details).

With this commit, initdb will lower the default value for
autovacuum_worker_slots as needed, just like it already does for
parameters such as max_connections and shared_buffers.  We test
for (max_connections / 6) slots, which conveniently has the
following properties:

* For the initial max_connections default of 100, the default of
  autovacuum_worker_slots will be 16, which is its initial default
  value specified in the documentation and in guc_tables.c.

* For the lowest possible max_connections default of 25, the
  default of autovacuum_worker_slots will be 4, which means we only
  need one additional semaphore for autovacuum workers (as compared
  to before commit c758119e5b).  This leaves some wiggle room for
  new auxiliary workers, etc. on systems with low SEMMNS, and it
  ensures that the default number of slots will be greater than or
  equal to the default value of autovacuum_max_workers (3).

Reported-by: Tom Lane
Suggested-by: Andres Freund
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/1346002.1736198977%40sss.pgh.pa.us
2025-01-07 14:38:55 -06:00
Álvaro Herrera
0e5b14410e
Fix error message wording
The originals are ambiguous and a bit out of style.

Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/202412141243.efesjyyvzxsz@alvherre.pgsql
2025-01-07 20:07:32 +01:00
Thomas Munro
c4782c4410 Fix meson detection of a couple of 64 bit builtins.
A couple of checks were missed by commit 962da900, so we would fail to
detect the features.

Reported-by: Юрий Соколов <y.sokolov@postgrespro.ru>
Discussion: https://postgr.es/m/42C25E2A-6519-4549-9F47-6B0686E83836%40postgrespro.ru
2025-01-08 07:19:46 +13:00
Álvaro Herrera
5b291d1c9c
Remove unnecessary code to handle CONSTR_NOTNULL
Commit 14e87ffa5c54 needlessly added support for CONSTR_NOTNULL entries
to StoreConstraints.  It's dead code, so remove it.

To make the situation regarding constraint creation clearer, change
comments in heap_create_with_catalog, StoreConstraints, MergeAttributes
to explain which constraint types each of them handles.

Author: 何建 (Jian He) <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxFxzqrCiUNfjJ0tQU+=nKQkQCGtGzUBude=SMOwj5VNjQ@mail.gmail.com
2025-01-07 16:49:41 +01:00
Peter Geoghegan
ec986020de Improve nbtree unsatisfiable RowCompare detection.
Move nbtree's detection of RowCompare quals that are unsatisfiable due
to having a NULL in their first row element: rather than detecting these
cases at the point where _bt_first builds its insertion scan key, do so
earlier, during preprocessing proper.  This brings the RowCompare case
in line with every other case involving an unsatisfiable-due-to-NULL qual.

nbtree now consistently detects such unsatisfiable quals -- even when
they happen to involve a key that isn't examined by _bt_first at all.
Affected cases thereby avoid useless full index scans that cannot
possibly return any matching rows.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-WzmySVXst2hFrOATC-zw1Byg1XC-jYUS314=mzuqsNwk+Q@mail.gmail.com
2025-01-07 10:38:30 -05:00
Peter Geoghegan
428a99b589 nbtree: Simplify _bt_first parallel scan handling.
This new structure relieves _bt_first from having separate calls to
_bt_start_array_keys for the serial case and parallel case.  This saves
code, and seems clearer.

Follow-up to work from commits 4e6e375b and b5ee4e52.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=XjUZjBjHJdhTvuH5MwoJObWGoM2RG2LyFg5WUdWyk=A@mail.gmail.com
2025-01-07 10:29:46 -05:00
Richard Guo
2f8b4007db Remove unused parameter in lookup_var_attr_stats
The parameter 'rel' in lookup_var_attr_stats was once used to raise an
ERROR when ANALYZE failed to acquire sufficient data to build extended
statistics.  bf2a691e0 changed the logic to raise a WARNING in the
caller instead.  As a result, this parameter is no longer needed and
can be removed.  Since this is a static function, we can always easily
reintroduce the parameter if it's ever needed in the future.

Author: Ilia Evdokimov
Reviewed-by: Fabrízio de Royes Mello
Discussion: https://postgr.es/m/b3880f22-5808-4206-88d4-1553a81c3440@tantorlabs.com
2025-01-07 11:24:14 +09:00
Nathan Bossart
c758119e5b Allow changing autovacuum_max_workers without restarting.
This commit introduces a new parameter named
autovacuum_worker_slots that controls how many autovacuum worker
slots to reserve during server startup.  Modifying this new
parameter's value does require a server restart, but it should
typically be set to the upper bound of what you might realistically
need to set autovacuum_max_workers.  With that new parameter in
place, autovacuum_max_workers can now be changed with a SIGHUP
(e.g., pg_ctl reload).

If autovacuum_max_workers is set higher than
autovacuum_worker_slots, a WARNING is emitted, and the server will
only start up to autovacuum_worker_slots workers at a given time.
If autovacuum_max_workers is set to a value less than the number of
currently-running autovacuum workers, the existing workers will
continue running, but no new workers will be started until the
number of running autovacuum workers drops below
autovacuum_max_workers.
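
Sketching the intended usage (the values are illustrative):

    -- requires a server restart to take effect:
    ALTER SYSTEM SET autovacuum_worker_slots = 32;
    -- can now be changed with just a configuration reload:
    ALTER SYSTEM SET autovacuum_max_workers = 10;
    SELECT pg_reload_conf();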

Reviewed-by: Sami Imseih, Justin Pryzby, Robert Haas, Andres Freund, Yogesh Sharma
Discussion: https://postgr.es/m/20240410212344.GA1824549%40nathanxps13
2025-01-06 15:01:22 -06:00
Heikki Linnakangas
5e68f61192 Remove duplicate definitions in proc.h
These are also present in procnumber.h

Reported-by: Peter Eisentraut
Discussion: https://www.postgresql.org/message-id/bd04d675-4672-4f87-800a-eb5d470c15fc@eisentraut.org
2025-01-06 11:56:03 +02:00
Peter Eisentraut
b1ef48980d flex code modernization: Replace YY_EXTRA_TYPE define with flex option
Replace #define YY_EXTRA_TYPE with %option extra-type.  The latter is
the way recommended by the flex manual (available since flex 2.5.34).

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
2025-01-06 09:47:58 +01:00
Fujii Masao
632384d0eb doc: Clarify log level for VERBOSE messages in maintenance commands.
VERBOSE messages from ANALYZE, CLUSTER, REINDEX, and VACUUM are logged
at the INFO level, but this detail was missing from the documentation.
This commit updates the docs to mention the log level for these messages.

Author: Masahiro Ikeda
Reviewed-by: Yugo Nagata
Discussion: https://postgr.es/m/b4a4b7916982dccd9607c8efb3ce5116@oss.nttdata.com
2025-01-06 17:24:10 +09:00
John Naylor
3e70da2781 Always use the caller-provided context for radix tree leaves
Previously, it would not have worked for a caller to pass a slab
context, since it would have been used for other things which likely
had incompatible size. In an attempt to be helpful and avoid possible
space wastage due to aset's power-of-two rounding, RT_CREATE would
create an additional slab context if the value type was fixed-length
and larger than pointer size. The problem was, we have since added
the bump context type, and the generation context was a possibility as
well, so silently overriding the caller's choice may actually be worse.

Commit e8a6f1f908d arranged so that the caller-provided context is
used only for leaves, so it's safe for the caller to use slab here
if they wish. As demonstration, use slab in one of the radix tree
regression tests.

Reviewed by Masahiko Sawada

Discussion: https://postgr.es/m/CANWCAZZDCo4k5oURg_pPxM6+WZ1oiG=sqgjmQiELuyP0Vtrwig@mail.gmail.com
2025-01-06 13:26:02 +07:00
John Naylor
e8a6f1f908 Get rid of radix tree's general purpose memory context
Previously, this was notionally used only for the entry point of the
tree and as a convenient parent for other contexts.

For shared memory, the creator previously allocated the entry point
in this context, but attaching backends didn't have access to that,
so they just used the caller's context. For the sake of consistency,
allocate every instance of an entry point in the caller's context.

For local memory, allocate the control object in the caller's context
as well. This commit also makes the "leaf context" the notional parent
of the child contexts used for nodes, so it's a bit of a misnomer,
but a future commit will make the node contexts independent rather
than children, so leave it this way for now to avoid code churn.

The memory context parameter for RT_CREATE is now unused in the case
of shared memory, so remove it and adjust callers to match.

In passing, remove unused "context" member from struct TidStore,
which seems to have been an oversight.

Reviewed by Masahiko Sawada

Discussion: https://postgr.es/m/CANWCAZZDCo4k5oURg_pPxM6+WZ1oiG=sqgjmQiELuyP0Vtrwig@mail.gmail.com
2025-01-06 11:21:21 +07:00
John Naylor
960013f2a1 Use caller's memory context for radix tree iteration state
Typically only one iterator is present at any time, so it's overkill
to devote an entire context for this. Get rid of it and use the
caller's context.

This is tidy-up work, so no backpatch in this form. However, a
hypothetical extension to v17 that tried to start iteration from
an attaching backend would result in a crash, so that'll be fixed
separately in a way that doesn't change behavior in core.

Patch by me, reported and reviewed by Masahiko Sawada

Discussion: https://postgr.es/m/CAD21AoBB2U47V=F+wQRB1bERov_of5=BOZGaybjaV8FLQyqG3Q@mail.gmail.com
2025-01-06 09:01:58 +07:00
Peter Eisentraut
9a8313dabe Remove useless configure check
The test for "decltype" as a variant of "typeof" apparently never
worked (see also commit 3582b223d49), so remove it.

Discussion: https://www.postgresql.org/message-id/flat/795b1c54-c64a-47f9-8fa3-880dcab59975%40eisentraut.org
2025-01-05 11:34:28 +01:00
Peter Eisentraut
6549a02a51 meson: Fix missing name arguments of cc.compiles() calls
Without it, the check won't show up in the meson setup/configure
output.

Discussion: https://www.postgresql.org/message-id/flat/795b1c54-c64a-47f9-8fa3-880dcab59975%40eisentraut.org
2025-01-05 11:34:28 +01:00
Andrew Dunstan
30f0176263 Document strange jsonb sort order for empty top level arrays
Slightly faulty logic in the original jsonb code (commit d9134d0a355)
results in an empty top level array sorting less than a json null. We
can't change the sort order now since it would affect btree indexes over
jsonb, so document the anomaly.

Backpatch to all live branches (13 .. 17)

In master, also add a code comment noting the anomaly.

Reported-by: Yan Chengpen
Reviewed-by: Jian He

Discussion: https://postgr.es/m/OSBPR01MB45199DD8DA2D1CECD50518188E272@OSBPR01MB4519.jpnprd01.prod.outlook.com
2025-01-03 10:36:30 -05:00
Richard Guo
e28033fe1a Ignore nullingrels when looking up statistics
When looking up statistical data about an expression, we do not need
to concern ourselves with the outer joins that could null the
Vars/PHVs contained in the expression.  Accounting for nullingrels in
the expression could cause estimate_num_groups to count the same Var
multiple times if it's marked with different nullingrels.  This is
incorrect, and could lead to "ERROR:  corrupt MVNDistinct entry" when
searching for multivariate n-distinct.

Furthermore, the nullingrels could prevent us from matching an
expression to expressional index columns or to the expressions in
extended statistics, leading to inaccurate estimates.

To fix, strip out all the nullingrels from the expression before we
look up statistical data about it.  There is one ensuing plan change
in the regression tests, but it looks reasonable and does not
compromise its original purpose.

This patch could result in plan changes, but it fixes an actual bug,
so back-patch to v16 where the outer-join-aware-Var infrastructure was
introduced.

Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4-2Z4k+nFTiZe0Qbu5n8juUWenDAtMzi98bAZQtwHx0-w@mail.gmail.com
2025-01-02 18:06:00 +09:00
David Rowley
d93bb8163c Fix outdated CHUNKHDRSZ value in nodeAgg.c
CHUNKHDRSZ was defined as 16 bytes, which was true when that code went in,
but since c6e0fe1f2, 8 is a more accurate value.  Here we adjust it to use
sizeof(MemoryChunk), which is normally 8, or 16 for cassert builds.
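
In effect (illustrative):

    /* before: hard-coded, accurate prior to c6e0fe1f2 */
    #define CHUNKHDRSZ 16

    /* after: tracks the real header size (8, or 16 in cassert builds) */
    #define CHUNKHDRSZ sizeof(MemoryChunk)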

c6e0fe1f2 first appeared in v16, so this is technically wrong in v16 up
to master, but let's apply this only to master as adjusting this does
influence the estimated number of batches in the aggregate costing code
and we don't want to cause plan instability in released versions.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CAApHDvpMpRQvsTqZo3FinXkgytwxwF8sCyZm83xDj-1s_hLe+w@mail.gmail.com
2025-01-02 22:04:09 +13:00
David Rowley
11012c5037 Fix an assortment of spelling mistakes and typos
Author: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/5812a0b9-b0cf-4151-9a14-d9f00e4f2858@gmail.com
2025-01-02 12:42:01 +13:00
Bruce Momjian
50e6eb731d Update copyright for 2025
Backpatch-through: 13
2025-01-01 11:21:55 -05:00
Tom Lane
98b1efd6ef Update obsolete reference to plpgsql's gram.y file.
This was evidently missed in 05346c131, which renamed that
file to pl_gram.y.

Japin Li

Discussion: https://postgr.es/m/ME0P300MB0445F7CA7456C2AC67D37A01B6092@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
2024-12-30 14:33:45 -05:00
Michael Paquier
b757abefc0 injection_points: Tweak variable-numbered stats to work with pending data
As coded, the module was not using pending entries to store its data
locally before doing a flush to the central dshash with a timed
pgstat_report_stat() call.  Hence, the flush callback was defined but
ended up unused.  As a template, this is more efficient than the
original logic of directly updating the shared memory entries, as it
reduces the number of interactions with the pgstats hash table in
shared memory.

injection_stats_flush_cb() was also missing a pgstat_unlock_entry(), so
add one while at it.

Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/Z3JbLhKFFm6kKfT8@ip-10-97-1-34.eu-west-3.compute.internal
2024-12-30 18:48:18 +09:00
Michael Paquier
c9b3d4909b Fix memory leak in pgoutput with relation attribute map
pgoutput caches the attribute map of a relation, which is free()'d only
when validating a RelationSyncEntry.  However, this code path is not
taken when calling any of the SQL functions able to do some logical
decoding, like pg_logical_slot_{get,peek}_changes(), leaking some memory
into CacheMemoryContext on repeated calls.

To address this, a relation's attribute map is now allocated in
PGOutputData's cachectx, which is free()'d at the end of the execution
of these SQL functions, when logical decoding ends.  That context is
available down to v15.  v13 and v14 have a similar leak, which will be
dealt with later.
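
A minimal sketch of the allocation pattern (illustrative; "data" and
"entry" stand for the PGOutputData and RelationSyncEntry at hand, and
make_attrmap() is only a placeholder for the real map builder):

    MemoryContext oldctx = MemoryContextSwitchTo(data->cachectx);

    entry->attrmap = make_attrmap(indesc->natts);
    MemoryContextSwitchTo(oldctx);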

Reported-by: Masahiko Sawada
Author: Vignesh C
Reviewed-by: Hou Zhijie
Discussion: https://postgr.es/m/CAD21AoDkAhQVSukOfH3_reuF-j4EU0-HxMqU3dU+bSTxsqT14Q@mail.gmail.com
Discussion: https://postgr.es/m/CALDaNm1hewNAsZ_e6FF52a=9drmkRJxtEPrzCB6-9mkJyeBBqA@mail.gmail.com
Backpatch-through: 15
2024-12-30 13:33:09 +09:00
Michael Paquier
ebf2ab40e5 Remove redundant wording in pg_statistic.h
Author: Junwang Zhao
Discussion: https://postgr.es/m/CAEG8a3JbMCHna=N5ZSx6huLnTDfW34kw7Pf2n8+3M-9UrrwesA@mail.gmail.com
2024-12-30 12:18:45 +09:00
Michael Paquier
7e125b20ee Fix failures with incorrect epoch handling for 2PC files at recovery
At the beginning of recovery, an orphaned two-phase file in an epoch
different from the one defined in the checkpoint record could not be
removed, because AdjustToFullTransactionId() relies on the assumption
that all files are either from the current epoch or from the previous
epoch.

If the checkpoint epoch was 0 while the 2PC file was orphaned and in the
future, AdjustToFullTransactionId() would underflow the epoch used to
build the 2PC file path.  In non-assert builds, this would create a
WARNING message referring to a 2PC file with an epoch of "FFFFFFFF" (or
UINT32_MAX), as an effect of the underflow calculation, leaving the
orphaned file around.
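
A sketch of the failing computation (illustrative, not the actual
code; variable names are assumptions):

    uint32      epoch = EpochFromFullTransactionId(next_fxid);  /* 0 */

    if (xid > XidFromFullTransactionId(next_fxid))
        epoch--;        /* wraps around to 0xFFFFFFFF when epoch is 0 */
    /* the 2PC file path is then built with the bogus epoch FFFFFFFF */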

Some tests are added with dummy 2PC files in the past and the future,
checking that these are properly removed.

Issue introduced by 5a1dfde8334b, that has switched two-phase state
files to use FullTransactionIds.

Reported-by: Vitaly Davydov
Author: Michael Paquier
Reviewed-by: Vitaly Davydov
Discussion: https://postgr.es/m/13b5b6-676c3080-4d-531db900@47931709
Backpatch-through: 17
2024-12-30 09:58:02 +09:00
Michael Paquier
e358425815 Fix handling of orphaned 2PC files in the future at recovery
Before 728bd991c3c4, which improved the support for 2PC files during
recovery, the logic scanning files in pg_twophase first checked whether
a file was in the future of the transaction ID horizon, and only then
whether its transaction ID was aborted or committed, which can involve
a pg_xact lookup.  Since that commit, these checks are done in the
reverse order.

Files detected as in the future do not have a state that can be checked
in pg_xact, hence this caused recovery to fail abruptly should an
orphaned 2PC file in the future of the transaction ID horizon exist in
pg_twophase at the beginning of recovery.

A test is added to check for this scenario, using an empty 2PC file with a
transaction ID large enough to be in the future when running the test.
This test is added in 16 and older versions for now.  17 and newer
versions are impacted by a second bug caused by the addition of the
epoch in the 2PC file names.  An equivalent test will be added in these
branches in a follow-up commit, once the second set of issues reported
are fixed.

Author: Vitaly Davydov, Michael Paquier
Discussion: https://postgr.es/m/11e597-676ab680-8d-374f23c0@145466129
Backpatch-through: 13
2024-12-30 08:06:07 +09:00
Tom Lane
68ff25eef1 contrib/pageinspect: Use SQL-standard function bodies.
In the same spirit as 969bbd0fa, 13e3796c9, 3f323eba8.

Tom Lane and Ronan Dunklau

Discussion: https://postgr.es/m/3316564.aeNJFYEL58@aivenlaptop
2024-12-29 14:58:05 -05:00
Tom Lane
667368fd26 contrib/xml2: Use SQL-standard function bodies.
In the same spirit as 969bbd0fa, 13e3796c9, 3f323eba8.

Tom Lane and Ronan Dunklau

Discussion: https://postgr.es/m/3316564.aeNJFYEL58@aivenlaptop
2024-12-29 13:53:00 -05:00
Tom Lane
97a5a16849 contrib/citext: Use SQL-standard function bodies.
In the same spirit as 969bbd0fa, 13e3796c9, 3f323eba8.

Tom Lane and Ronan Dunklau

Discussion: https://postgr.es/m/3316564.aeNJFYEL58@aivenlaptop
2024-12-29 13:37:35 -05:00
David Rowley
eb53ff5517 Fix overly large values/nulls arrays
These arrays were sized with Natts_pg_trigger (19) when they should have
been sized with Natts_pg_event_trigger (7).  We'd better fix this as
it's clearly a mistake and it could become problematic if
pg_event_trigger were to gain a dozen or so more columns in the future.
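
In other words (illustrative):

    /* must match pg_event_trigger's column count, not pg_trigger's */
    Datum       values[Natts_pg_event_trigger];
    bool        nulls[Natts_pg_event_trigger];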

No backpatch as there's no actual bug and the column count on those
tables isn't going to change in released versions.

Author: Xin Zhang <zhanghien@qq.com>
Discussion: https://postgr.es/m/tencent_05AD0FB321A414EC3661204D2102AA6EF605@qq.com
2024-12-29 23:57:43 +13:00
Tom Lane
508a97ee49 Replace PGPROC.isBackgroundWorker with isRegularBackend.
Commit 34486b609 effectively redefined isBackgroundWorker as meaning
"not a regular backend", whereas before it had the narrower
meaning of AmBackgroundWorkerProcess().  For clarity, rename the
field to isRegularBackend and invert its sense.

Discussion: https://postgr.es/m/1808397.1735156190@sss.pgh.pa.us
2024-12-28 16:21:54 -05:00
Tom Lane
34486b6092 Exclude parallel workers from connection privilege/limit checks.
Cause parallel workers to not check datallowconn, rolcanlogin, and
ACL_CONNECT privileges.  The leader already checked these things
(except for rolcanlogin which might have been checked for a different
role).  Re-checking can accomplish little except to induce unexpected
failures in applications that might not even be aware that their query
has been parallelized.  We already had the principle that parallel
workers rely on their leader to pass a valid set of authorization
information, so this change just extends that a bit further.

Also, modify the ReservedConnections, datconnlimit and rolconnlimit
logic so that these limits are only enforced against regular backends,
and only regular backends are counted while checking if the limits
were already reached.  Previously, background processes that had an
assigned database or role were subject to these limits (with rather
random exclusions for autovac workers and walsenders), and the set of
existing processes that counted against each limit was quite haphazard
as well.  The point of these limits, AFAICS, is to ensure the
availability of PGPROC slots for regular backends.  Since all other
types of processes have their own separate pools of PGPROC slots, it
makes no sense either to enforce these limits against them or to count
them while enforcing the limit.

While edge-case failures of these sorts have been possible for a
long time, the problem got a good deal worse with commit 5a2fed911
(CVE-2024-10978), which caused parallel workers to make some of these
checks using the leader's current role where before we had used its
AuthenticatedUserId, thus allowing parallel queries to fail after
SET ROLE.  The previous behavior was fairly accidental and I have
no desire to return to it.

This patch includes reverting 73c9f91a1, which was an emergency hack
to suppress these same checks in some cases.  It wasn't complete,
as shown by a recent bug report from Laurenz Albe.  We can also revert
fd4d93d26 and 492217301, which hacked around the same problems in one
regression test.

In passing, remove the special case for autovac workers in
CheckMyDatabase; it seems cleaner to have AutoVacWorkerMain pass
the INIT_PG_OVERRIDE_ALLOW_CONNS flag, now that that does what's
needed.

Like 5a2fed911, back-patch to supported branches (which sadly no
longer includes v12).

Discussion: https://postgr.es/m/1808397.1735156190@sss.pgh.pa.us
2024-12-28 16:08:50 -05:00
Tom Lane
2bdf1b2a2e Reserve a PGPROC slot and semaphore for the slotsync worker process.
The need for this was missed in commit 93db6cbda, with the result
being that if we launch a slotsync worker it would consume one of
the PGPROCs in the max_connections pool.  That could lead to inability
to launch the worker, or to subsequent failures of connection requests
that should have succeeded according to the configured settings.

Rather than create some one-off infrastructure to support this,
let's group the slotsync worker with the existing autovac launcher
in a new category of "special worker" processes.  These are kind of
like auxiliary processes, but they cannot use that infrastructure
because they need to be able to run transactions.

For the moment, make these processes share the PGPROC freelist
used for autovac workers (which previously supplied the autovac
launcher too).  This is partly to avoid an ABI change in v17,
and partly because it seems silly to have a freelist with
at most two members.  This might be worth revisiting if we grow
enough workers in this category.

Tom Lane and Hou Zhijie.  Back-patch to v17.

Discussion: https://postgr.es/m/1808397.1735156190@sss.pgh.pa.us
2024-12-28 12:30:42 -05:00
Noah Misch
ff90ee6145 In REASSIGN OWNED of a database, lock the tuple as mandated.
Commit aac2c9b4fde889d13f859c233c2523345e72d32b mandated such locking
and attempted to fulfill that mandate, but it missed REASSIGN OWNED.
Hence, it remained possible to lose VACUUM's inplace update of
datfrozenxid if a REASSIGN OWNED processed that database at the same
time.  This didn't affect the other inplace-updated catalog, pg_class.
For pg_class, REASSIGN OWNED calls ATExecChangeOwner() instead of the
generic AlterObjectOwner_internal(), and ATExecChangeOwner() fulfills
the locking mandate.

Like in GRANT, implement this by following the locking protocol for any
catalog subject to the generic AlterObjectOwner_internal().  It would
suffice to do this for IsInplaceUpdateOid() catalogs only.  Back-patch
to v13 (all supported versions).

Kirill Reshke.  Reported by Alexander Kukushkin.

Discussion: https://postgr.es/m/CAFh8B=mpKjAy4Cuun-HP-f_vRzh2HSvYFG3rhVfYbfEBUhBAGg@mail.gmail.com
2024-12-28 07:16:22 -08:00
David Rowley
58a359e585 Speedup tuple deformation with additional function inlining
This adjusts slot_deform_heap_tuple() to add special-case loops to
eliminate much of the branching that was done within the body of the
main deform loop.

Previously, while looping over each attribute to deform,
slot_deform_heap_tuple() would always recheck whether the tuple contains
any NULLs via HeapTupleHasNulls() and, if so, check the given attribute's
bit in the tuple's NULL bitmap.  Since many tuples won't contain any
NULLs, we can
just check HeapTupleHasNulls() once and when there are no NULLs, use a
more compact version of the deforming loop which contains no NULL checking
code at all.

The same is possible for the "slow" mode checking part of the loop.  That
variable was checked several times for each attribute, once to determine
if the offset to the attribute value could be taken from the attcacheoff,
and again to check if the offset could be cached for next time.

These "slow" checks can mostly be eliminated by instead having multiple
loops.  Initially, we can start in the non-slow loop and break out of
that loop if and only if we must stop caching the offset.  This
eliminates branching for both slow and non-slow deforming methods.  The
amount of code required for the no nulls / non-slow version is very
small.  It's possible to have separate loops like this due to the fact
that once we move into slow mode, we never need to switch back into
non-slow mode for a given tuple.

We have the compiler take care of writing out the multiple required
loops by having a pg_attribute_always_inline function which gets called
various times passing in constant values for the "slow" and "hasnulls"
parameters.  This allows the compiler to eliminate const-false branches
and remove comparisons for const-true ones.
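
A minimal sketch of the specialization pattern (illustrative only; the
real per-attribute work is elided and the names are assumptions):

    static pg_attribute_always_inline void
    deform_loop(TupleTableSlot *slot, int natts,
                const bool hasnulls, const bool slow)
    {
        for (int attnum = 0; attnum < natts; attnum++)
        {
            if (hasnulls)
            {
                /* consult the tuple's NULL bitmap here */
            }
            if (slow)
            {
                /* cannot rely on attcacheoff; compute offsets manually */
            }
            /* fetch the attribute value ... */
        }
    }

    /* callers pass constants, so the compiler emits specialized loops */
    if (!HeapTupleHasNulls(tuple))
        deform_loop(slot, natts, false, false);
    else
        deform_loop(slot, natts, true, false);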

This commit has shown overall query performance increases of around 5-20%
in deform-heavy OLAP-type workloads.

Author: David Rowley
Reviewed-by: Victor Yegorov
Discussion: https://postgr.es/m/CAGnEbog92Og2CpC2S8=g_HozGsWtt_3kRS1sXjLz0jKSoCNfLw@mail.gmail.com
Discussion: https://postgr.es/m/CAApHDvo9e0XG71WrefYaRv5n4xNPLK4k8LjD0mSR3c9KR2vi2Q@mail.gmail.com
2024-12-28 12:20:42 +13:00
Michael Paquier
d85ce012f9 Improve handling of date_trunc() units for infinite input values
Previously, if an infinite value was passed to date_trunc(), then the
same infinite value would always be returned regardless of the field
unit given by the caller.  This commit updates the function so that an
error is returned when an invalid unit is passed to date_trunc() with an
infinite value.

This matches the behavior of date_trunc() with a finite value and
date_part() with an infinite value, making the handling of interval,
timestamp and timestamptz more consistent across the board for these two
functions.

Some tests are added to cover all these new failure cases, with an
unsupported unit and infinite values for the three data types.  There
were no test cases in core that checked all these patterns up to now.

Author: Joseph Koshakow
Discussion: https://postgr.es/m/CAAvxfHc4084dGzEJR0_pBZkDuqbPGc5wn7gK_M0XR_kRiCdUJQ@mail.gmail.com
2024-12-27 13:32:40 +09:00
David Rowley
61cac71c23 Remove unused totalrows parameter in compute_expr_stats
The totalrows parameter in compute_expr_stats is unused, so remove it.
This is a static function, so the parameter can easily be added again if
it's ever needed.

Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.ru>
Discussion: https://postgr.es/m/667b92d2-f953-4fcb-9377-3765f5b94187@tantorlabs.com
2024-12-27 10:51:22 +13:00
Peter Eisentraut
3f2d72b493 plpgsql: Rename a variable for clarity
Rename "core_yy_extra_type core_yy" to "core_yy_extra".  The previous
name was a bit unclear and confusing.  The new name matches the name
used elsewhere for the same purpose, for example in
src/backend/parser/gramparse.h.
2024-12-26 11:11:14 +01:00
Michael Paquier
a86cfcae7c Fix typo in comment of compute_return_type() in functioncmds.c
Author: Japin Li
Discussion: https://postgr.es/m/ME0P300MB0445D51BCFA8680F0B35FD6EB60C2@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
2024-12-26 12:53:55 +09:00
Heikki Linnakangas
07f902bd76 meson: Export all libpgport functions in Windows builds
This fixes "unresolved external symbol" errors with extensions that
use functions from libpgport that need special CFLAGS to
compile. Currently, that includes the CRC-32 functions.

Commit 2571c1d5cc did this for libcommon, but I missed that libpgport
has the same issue.

Reported-by: Tom Lane
Backpatch-through: 16, where Meson was introduced
Discussion: https://www.postgresql.org/message-id/CAOdR5yF0krWrxycA04rgUKCgKugRvGWzzGLAhDZ9bzNv8g0Lag@mail.gmail.com
2024-12-25 19:22:25 +02:00
Peter Eisentraut
4c9b453c91 Add commit 301de6a6f60 to .git-blame-ignore-revs. 2024-12-25 18:17:29 +01:00
Peter Eisentraut
301de6a6f6 Partial pgindent of .l and .y files
Trying to clean up the code a bit while we're working on these files
for the reentrant scanner/pure parser patches.  This cleanup only
touches the code sections after the second '%%' in each file, via a
manually-supervised and locally hacked up pgindent.
2024-12-25 17:55:42 +01:00
Heikki Linnakangas
2571c1d5cc meson: Export all libcommon functions in Windows builds
This fixes "unresolved external symbol" errors with extensions that
use functions from libcommon. This was reported with pgvector.

Reported-by: Andrew Kane
Author: Vladlen Popolitov
Backpatch-through: 16, where Meson was introduced
Discussion: https://www.postgresql.org/message-id/CAOdR5yF0krWrxycA04rgUKCgKugRvGWzzGLAhDZ9bzNv8g0Lag@mail.gmail.com
2024-12-25 18:14:18 +02:00
Peter Eisentraut
d663f150b5 guc: reentrant scanner
Use the flex %option reentrant to make the generated scanner
reentrant, and perhaps eventually thread-safe, but that will require
additional work.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
2024-12-25 14:18:07 +01:00
Peter Eisentraut
2a7425d7ee jsonpath scanner: reentrant scanner
Use the flex %option reentrant to make the generated scanner
reentrant and thread-safe.  Note: The parser was already pure.

Simplify flex scan buffer management: Instead of constructing the
buffer from pieces and then using yy_scan_buffer(), we can just use
yy_scan_string(), which does the same thing internally.  (Actually, we
use yy_scan_bytes() here because we already have the length.)

Use flex yyextra to handle context information, instead of global
variables.  This complements the other changes to make the scanner
reentrant.
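
For orientation, a rough sketch of the caller side with a reentrant
scanner (the prefixed function names and extra arguments here are
assumptions):

    yyscan_t    scanner;

    jsonpath_yylex_init_extra(&extra, &scanner); /* state lives in scanner */
    parse_rc = jsonpath_yyparse(&result, scanner);
    jsonpath_yylex_destroy(scanner);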

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
2024-12-24 23:42:41 +01:00
Peter Geoghegan
9b254895c3 Fix nbtree symbol name comment reference.
Oversight in commit 5bf748b86b.
2024-12-24 14:06:16 -05:00
Peter Eisentraut
db6856c991 syncrep parser: pure parser and reentrant scanner
Use the flex %option reentrant and the bison option %pure-parser to
make the generated scanner and parser pure, reentrant, and
thread-safe.

Make the generated scanner use palloc() etc. instead of malloc() etc.
Previously, we only used palloc() for the buffer, but flex would still
use malloc() for its internal structures.  Now, all the memory is
under palloc() control.

Simplify flex scan buffer management: Instead of constructing the
buffer from pieces and then using yy_scan_buffer(), we can just use
yy_scan_string(), which does the same thing internally.

The previous code was necessary because we allocated the buffer with
palloc() and the rest of the state was handled by malloc().  But this
is no longer the case; everything is under palloc() now.

Use flex yyextra to handle context information, instead of global
variables.  This complements the other changes to make the scanner
reentrant.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
2024-12-24 18:05:06 +01:00
Peter Eisentraut
e4a8fb8fef replication parser: pure parser and reentrant scanner
Use the flex %option reentrant and the bison option %pure-parser to
make the generated scanner and parser pure, reentrant, and
thread-safe.

Make the generated scanner use palloc() etc. instead of malloc() etc.
Previously, we only used palloc() for the buffer, but flex would still
use malloc() for its internal structures.  As a result, there could be
some small memory leaks in case of uncaught errors.  Now, all the
memory is under palloc() control, so there are no more such issues.

Simplify flex scan buffer management: Instead of constructing the
buffer from pieces and then using yy_scan_buffer(), we can just use
yy_scan_string(), which does the same thing internally.

The previous code was necessary because we allocated the buffer with
palloc() and the rest of the state was handled by malloc().  But this
is no longer the case; everything is under palloc() now.

Use flex yyextra to handle context information, instead of global
variables.  This complements the other changes to make the scanner
reentrant.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Co-authored-by: Andreas Karlsson <andreas@proxel.se>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
2024-12-24 16:40:09 +01:00
Peter Eisentraut
5af699066f Remove pgrminclude and associated scripts
Per git log, the last time someone tried to do something with
pgrminclude was around 2011.  And it's always had a tendency of
causing trouble when it was active.  Also, pgcompinclude is redundant
with headerscheck.

Discussion: https://www.postgresql.org/message-id/flat/2d4dc7b2-cb2e-49b1-b8ca-ba5f7024f05b%40eisentraut.org
2024-12-24 14:02:42 +01:00
Peter Eisentraut
1eb7cb21c2 Remove pgrminclude annotations
Per git log, the last time someone tried to do something with
pgrminclude was around 2011.  Many (not all) of the "pgrminclude
ignore" annotations are of a newer date but seem to have just been
copied around during refactorings and file moves and don't seem to
reflect an actual need anymore.

There have been some parallel experiments with include-what-you-use
(IWYU) annotations, but these don't seem to correspond very strongly
to pgrminclude annotations, so there is no value in keeping the
existing ones even for that kind of thing.

So, wipe them all away.  We can always add new ones in the future
based on actual needs.

Discussion: https://www.postgresql.org/message-id/flat/2d4dc7b2-cb2e-49b1-b8ca-ba5f7024f05b%40eisentraut.org
2024-12-24 11:49:07 +01:00
David Rowley
6f3820f37a Fix race condition in TupleDescCompactAttr assert code
5983a4cff added CompactAttribute as an abbreviated alternative to
FormData_pg_attribute to allow more cache-friendly processing in tasks
related to TupleDescs.  That commit contained some assert-only code to
check that the CompactAttribute had been populated correctly, however,
the method used to do that checking caused the TupleDesc's
CompactAttribute to be zeroed before it was repopulated and compared to
the snapshot taken before the memset call.  This caused issues as the type
cache caches TupleDescs in shared memory which can be used by multiple
backend processes at the same time.  There was a window of time between
the zero and repopulation of the CompactAttribute where another process
would mistakenly think that the CompactAttribute is invalid due to the
memset.

To fix this, instead of taking a snapshot of the CompactAttribute and
calling populate_compact_attribute() and comparing the snapshot to the
freshly populated TupleDesc's CompactAttribute, refactor things so we
can just populate a temporary CompactAttribute on the stack.  This way
we don't touch the TupleDesc's memory.

Reported-by: Alexander Lakhin, SQLsmith
Discussion: https://postgr.es/m/ca3a256a-5d12-42db-aabe-a75a030d9fb9@gmail.com
2024-12-24 14:54:24 +13:00
Tom Lane
38da053463 Try to avoid semaphore-related test failures on NetBSD/OpenBSD.
These two platforms have a remarkably tight default limit on the
number of SysV semaphores in the system: SEMMNS is only 60
out-of-the-box.  Unless manual action is taken to raise that,
we'll only be able to allocate 3 sets of 16 usable semaphores
each, leading to initdb setting max_connections to just 20.
That's problematic because the core regression tests expect
to be able to launch 20 concurrent sessions, leaving us with
no headroom.  This seems to be the cause of intermittent
buildfarm failures on some machines.

While there's no getting around the fact that you'd better raise
SEMMNS for production use on these platforms, it does seem desirable
for "make check" to pass reliably without that.  We can make that
happen, at least for awhile longer, with two small changes:

* Change sysv_sema.c's SEMAS_PER_SET to 19, so that we can eat up
all of the available semas not just most of them.

* Change initdb to make the smallest max_connections value it will
consider be 25 not 20.

As of HEAD this will leave us with four free semaphores (using the
default values for other relevant parameters such as max_wal_senders).
So we won't need to consider this again until we've invented five
more background processes.  Maybe by then we can switch both these
platforms to some other semaphore API.

For the moment, do this only in master; there've not been field
complaints that might justify a back-patch.

Discussion: https://postgr.es/m/db2773a2-aca0-43d0-99c1-060efcd9954e@gmail.com
2024-12-23 16:46:24 -05:00
Peter Geoghegan
da9517fb3a Reset btpo_cycleid in nbtree VACUUM's REDO routine.
Reset btpo_cycleid to 0 in btree_xlog_vacuum for consistency with
_bt_delitems_vacuum (the corresponding original execution code).  This
makes things neater.

There might be some performance benefit to being consistent like this.
When btvacuumpage doesn't call _bt_delitems_vacuum, it can still
proactively reset btpo_cycleid to 0 via a separate hint-like update
mechanism (it does so whenever it sees that it isn't already set to 0).
And so it's possible that being consistent about resetting btpo_cycleid
like this will save work later on, after standby promotion: subsequent
VACUUMs won't need to clear btpo_cycleid using the hint-like update
mechanism as often as they otherwise would.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CAH2-Wz=+LDFxn9NZyEsCo8ifcyKt6+n-VLyygySEHgMz+oynqw@mail.gmail.com
2024-12-23 15:46:00 -05:00
Tom Lane
c431986de1 postgres_fdw: re-issue cancel requests a few times if necessary.
Despite the best efforts of commit 0e5c82380, we're still seeing
occasional failures of postgres_fdw's query_cancel test in the
buildfarm.  Investigation suggests that its 100ms timeout is
still not enough to reliably ensure that the remote side starts
the query before receiving the cancel request --- and if it
hasn't, it will just discard the request because it's idle.

We discussed allowing a cancel request to kill the next-received
query, but that would have wide and perhaps unpleasant side-effects.
What seems safer is to make postgres_fdw do what a human user would
likely do, which is issue another cancel request if the first one
didn't seem to do anything.  We'll keep the same overall 30 second
grace period before concluding things are broken, but issue additional
cancel requests after 1 second, then 2 more seconds, then 4, then 8.
(The next one in series is 16 seconds, but we'll hit the 30 second
timeout before that.)
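
A sketch of the retry schedule (illustrative pseudocode; the helper
names are made up):

    int     delay = 1;      /* seconds */

    issue_cancel_request();
    while (!query_finished() && !past_deadline())   /* 30s overall */
    {
        wait_up_to(delay);
        if (!query_finished())
            issue_cancel_request();     /* try again */
        delay *= 2;                     /* 1, 2, 4, 8, ... */
    }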

Having done that, revert the timeout in query_cancel.sql to 10 ms.
That will still be enough on most machines, most of the time, for
the remote query to start; but now we're intentionally risking the
race condition occurring sometimes in the buildfarm, so that the
repeat-cancel code path will get some testing.

As before, back-patch to v17.  We might eventually contemplate
back-patching this further, and/or adding similar logic to dblink.
But given the lack of field complaints to date, this feels like
mostly an exercise in test case stabilization, so v17 is enough.

Discussion: https://postgr.es/m/colnv3lzzmc53iu5qoawynr6qq7etn47lmggqr65ddtpjliq5d@glkveb4m6nop
2024-12-23 15:14:30 -05:00
Heikki Linnakangas
1585ff7387 Don't allow GetTransactionSnapshot() in logical decoding
A historic snapshot should only be used for catalog access, not
general queries. We never call GetTransactionSnapshot() during logical
decoding, which is good because it wouldn't be very sensible, so the
code to deal with that was unreachable and untested. Turn it into an
error, to avoid doing that in the future either.
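
Roughly, the new guard amounts to (illustrative; the message text is
not verbatim):

    if (HistoricSnapshotActive())
        elog(ERROR, "cannot take query snapshot during logical decoding");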

Discussion: https://www.postgresql.org/message-id/a868fe78-ddb4-4b0a-9b96-873d91d93cfd@iki.fi
2024-12-23 12:42:55 +02:00
Heikki Linnakangas
952365cded Remove unnecessary GetTransactionSnapshot() calls
In get_database_list() and get_subscription_list(), the
GetTransactionSnapshot() call is not required because the catalog
table scans use the catalog snapshot, which is held until the end of
the scan. See table_beginscan_catalog(), which calls
RegisterSnapshot(GetCatalogSnapshot(relid)).

In InitPostgres, it's a little less obvious that it's not required,
but still true I believe. All the catalog lookups in InitPostgres()
also use the catalog snapshot, and the looked up values are copied
while still holding the snapshot.

Furthermore, as the removed FIXME comments said, calling
GetTransactionSnapshot() didn't really prevent MyProc->xmin from being
reset anyway.

Discussion: https://www.postgresql.org/message-id/7c56f180-b9e1-481e-8c1d-efa63de3ecbb@iki.fi
2024-12-23 12:42:39 +02:00
David Rowley
7ec4b9ff80 Fix incorrect source filename references
Jian He reported the src/include/utility/tcop.h one and the remainder
were found by using a script to look for src/* and check that we have a
filename or directory of that name.

In passing, fix some out-of-date comments.

Reported-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CACJufxGoE3H-7VgO02=PrR4SNuVWDVbfTyUnwO0HvS-Lxurnog@mail.gmail.com
2024-12-23 19:41:49 +13:00
Michael Paquier
7f97b4734f Fix some comments related to library unloading
Library unloading has never been supported with its code removed in
ab02d702ef08, and there were some comments still mentioning that it was
a possible operation.

ChangAo has noticed the incorrect references in dfmgr.c, while I have
noticed the other ones while scanning the rest of the tree for similar
mistakes.

Author: ChangAo Chen, Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/tencent_1D09840A1632D406A610C8C4E2491D74DB0A@qq.com
2024-12-23 14:46:49 +09:00
Heikki Linnakangas
578a7fe7b6 Update TransactionXmin when MyProc->xmin is updated
GetSnapshotData() set TransactionXmin = MyProc->xmin, but when
SnapshotResetXmin() advanced MyProc->xmin, it did not advance
TransactionXmin correspondingly. That meant that TransactionXmin could
be older than MyProc->xmin, and XIDs between TransactionXmin and
the real MyProc->xmin could be vacuumed away. One known consequence is
in pg_subtrans lookups: we might try to look up the status of an XID
that was already truncated away.
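
Conceptually, the fix keeps the two values in lockstep (illustrative):

    /* whenever MyProc->xmin is advanced, advance TransactionXmin too */
    MyProc->xmin = TransactionXmin = newxmin;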

Back-patch to all supported versions.

Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/d27a046d-a1e4-47d1-a95c-fbabe41debb4@iki.fi
2024-12-21 23:42:39 +02:00
David Rowley
db448ce5ad Optimize alignment calculations in tuple form/deform
Here we convert CompactAttribute.attalign from a char, which is directly
derived from pg_attribute.attalign into a uint8, which stores the number
of bytes to align the column's value by in the tuple.

This allows tuple deformation and tuple size calculations to move away
from using the inefficient att_align_nominal() macro, which manually
checks each TYPALIGN_* char to translate that into the alignment bytes
for the given type.  Effectively, this commit changes those to TYPEALIGN
calls, which are branchless and only perform some simple arithmetic with
some bit-twiddling.
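
The contrast, roughly (illustrative; "thisatt" is a Form_pg_attribute
and "att" a CompactAttribute):

    /* old: att_align_nominal() branches on the TYPALIGN_* character */
    off = att_align_nominal(off, thisatt->attalign);

    /* new: attalignby holds the alignment in bytes (1, 2, 4 or 8) */
    off = TYPEALIGN(att->attalignby, off);
    /* i.e. (off + alignby - 1) & ~(alignby - 1), branch-free */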

The removed branches were often mispredicted by CPUs, especially so in
real-world tables which often contain a mishmash of different types
with different alignment requirements.

Author: David Rowley
Reviewed-by: Andres Freund, Victor Yegorov
Discussion: https://postgr.es/m/CAApHDvrBztXP3yx=NKNmo3xwFAFhEdyPnvrDg3=M0RhDs+4vYw@mail.gmail.com
2024-12-21 09:43:26 +13:00
Heikki Linnakangas
1f81b48a9d Mark CatalogSnapshotData static
Like CurrentSnapshotData, it should not be accessed directly outside
snapmgr.c.
2024-12-20 19:37:50 +02:00
Heikki Linnakangas
d5a7bd5670 Fix variable reference in comment
This used to say "nsubxcnt isn't decreased when subtransactions
abort", but there's no variable called nsubxcnt. Commit 8548ddc61b
changed it to "subxcnt", among other typo fixes, but that was wrong
too: the comment actually talks about txn->nsubtxns. That's the field
that's incremented but never decremented and is used for the
allocation earlier in the function.
2024-12-20 19:36:33 +02:00
Melanie Plageman
94bb6c4410 Fix overflow danger in SampleHeapTupleVisible(), take 2
28328ec87b45725 addressed one overflow danger in
SampleHeapTupleVisible() but introduced another, albeit a less likely
one. Modify the binary search code to remove this danger.
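
A sketch of the overflow-safe search shape (illustrative, not the
committed code):

    uint32      start = 0;
    uint32      end = scan->rs_ntuples;     /* unsigned; may be 0 */

    while (start < end)
    {
        uint32      mid = start + (end - start) / 2;

        if (scan->rs_vistuples[mid] < tupoffset)
            start = mid + 1;
        else
            end = mid;
    }
    /* no "end - 1" arithmetic, so nothing can wrap below zero */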

Reported-by: Richard Guo
Reviewed-by: Richard Guo, Ranier Vilela
Discussion: https://postgr.es/m/CAMbWs4_bE%2BNscChbKWzw6HZOipCUyXfA5133qvoXQ654D3B2gQ%40mail.gmail.com
2024-12-20 09:43:44 -05:00
Thomas Munro
38c579b089 Fix corruption when relation truncation fails.
RelationTruncate() does three things, while holding an
AccessExclusiveLock and preventing checkpoints:

1. Logs the truncation.
2. Drops buffers, even if they're dirty.
3. Truncates some number of files.

Step 2 could previously be canceled if it had to wait for I/O, and step
3 could and still can fail in file APIs.  All orderings of these
operations have data corruption hazards if interrupted, so we can't give
up until the whole operation is done.  When dirty pages were discarded
but the corresponding blocks were left on disk due to ERROR, old page
versions could come back from disk, reviving deleted data (see
pgsql-bugs #18146 and several like it).  When primary and standby were
allowed to disagree on relation size, standbys could panic (see
pgsql-bugs #18426) or revive data unknown to visibility management on
the primary (theorized).

Changes:

 * WAL is now unconditionally flushed first
 * smgrtruncate() is now called in a critical section, preventing
   interrupts and causing PANIC on file API failure
 * smgrtruncate() has a new parameter for existing fork sizes,
   because it can't call smgrnblocks() itself inside a critical section

The changes apply to RelationTruncate(), smgr_redo() and
pg_truncate_visibility_map().  That last is also brought up to date with
other evolutions of the truncation protocol.
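
The hardened ordering, in outline (illustrative; argument lists are
abbreviated and partly assumed):

    XLogFlush(lsn);             /* WAL describing the truncate hits disk first */
    START_CRIT_SECTION();       /* no cancel/interrupt from here on */
    smgrtruncate(reln, forks, nforks, old_blocks, new_blocks);
    END_CRIT_SECTION();         /* a file-API failure now PANICs instead of
                                 * leaving buffers and files inconsistent */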

The VACUUM FileTruncate() failure mode had been discussed in older
reports than the ones referenced below, with independent analysis from
many people, but earlier theories on how to fix it were too complicated
to back-patch.  The more recently invented cancellation bug was
diagnosed by Alexander Lakhin.  Other corruption scenarios were spotted
by me while iterating on this patch and earlier commit 75818b3a.

Back-patch to all supported releases.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reported-by: rootcause000@gmail.com
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/18146-04e908c662113ad5%40postgresql.org
Discussion: https://postgr.es/m/18426-2d18da6586f152d6%40postgresql.org
2024-12-20 23:57:02 +13:00
David Rowley
02a8d0c452 Remove pg_attribute.attcacheoff column
The column is no longer needed as the offset is now cached in the
CompactAttribute struct per commit 5983a4cff.

Author: David Rowley
Reviewed-by: Andres Freund, Victor Yegorov
Discussion: https://postgr.es/m/CAApHDvrBztXP3yx=NKNmo3xwFAFhEdyPnvrDg3=M0RhDs+4vYw@mail.gmail.com
2024-12-20 23:22:37 +13:00
Michael Paquier
546371599e Relax regression test for fsync check of backend-level stats
One test added in 9aea73fc61d4 did not take into account that the
backend may have some fsync even after a checkpoint.  Let's relax it to
be more flexible.

Per report from buildfarm member grassquit, via Alexander Lakhin.

Author: Bertrand Drouvot
Discussion: https://postgr.es/m/6143ab0a-9e88-4790-8d9d-50ba45657761@gmail.com
2024-12-20 19:00:18 +09:00
David Rowley
5983a4cffc Introduce CompactAttribute array in TupleDesc, take 2
The new compact_attrs array stores a few select fields from
FormData_pg_attribute in a more compact way, using only 16 bytes per
column instead of the 104 bytes that FormData_pg_attribute uses.  Using
CompactAttribute allows performance-critical operations such as tuple
deformation to be performed without looking at the FormData_pg_attribute
element in TupleDesc which means fewer cacheline accesses.

For some workloads, tuple deformation can be the most CPU intensive part
of processing the query.  Some testing with 16 columns on a table
where the first column is variable length showed around a 10% increase in
transactions per second for an OLAP type query performing aggregation on
the 16th column.  However, in certain cases, the increases were much
higher, up to ~25% on one AMD Zen4 machine.

This also makes pg_attribute.attcacheoff redundant.  A follow-on commit
will remove it, thus shrinking the FormData_pg_attribute struct by 4
bytes.

Author: David Rowley
Reviewed-by: Andres Freund, Victor Yegorov
Discussion: https://postgr.es/m/CAApHDvrBztXP3yx=NKNmo3xwFAFhEdyPnvrDg3=M0RhDs+4vYw@mail.gmail.com
2024-12-20 22:31:26 +13:00
Melanie Plageman
8ac0021b6f Remove final mention of FREEZE_PAGE from comments
b7493e1ab35 removed leftover mentions of XLOG_HEAP2_FREEZE_PAGE records
from comments but neglected to remove one mention of FREEZE_PAGE.

Reported off-list by Alexander Lakhin
2024-12-19 18:52:19 -05:00
Tom Lane
e0a2721f7c Get rid of old version of BuildTupleHashTable().
It was reasonable to preserve the old API of BuildTupleHashTable()
in the back branches, but in HEAD we should actively discourage use
of that version.  There are no remaining callers in core, so just
get rid of it.  Then rename BuildTupleHashTableExt() back to
BuildTupleHashTable().

While at it, fix up the miserably-poorly-maintained header comment
for BuildTupleHashTable[Ext].  It looks like more than one patch in
this area has had the opinion that updating comments is beneath them.

Discussion: https://postgr.es/m/538343.1734646986@sss.pgh.pa.us
2024-12-19 18:07:00 -05:00
Tom Lane
f0b900086a Use ExecGetCommonSlotOps infrastructure in more places.
Append, MergeAppend, and RecursiveUnion can all use the support
functions added in commit 276279295.  The first two can report a
fixed result slot type if all their children return the same fixed
slot type.  That does nothing for the append step itself, but might
allow optimizations in the parent plan node.  RecursiveUnion can
optimize tuple hash table operations in the same way as SetOp now
does.

Patch by me; thanks to Richard Guo and David Rowley for review.

Discussion: https://postgr.es/m/1850138.1731549611@sss.pgh.pa.us
2024-12-19 17:07:14 -05:00
Tom Lane
8d96f57d5c Improve planner's handling of SetOp plans.
Remove the code for inserting flag columns in the inputs of a SetOp.
That was the only reason why there would be resjunk columns in a
set-operations plan tree, so we can get rid of some code that
supported that, too.

Get rid of choose_hashed_setop() in favor of building Paths for
the hashed and sorted alternatives, and letting them fight it out
within add_path().

Remove set_operation_ordered_results_useful(), which was giving wrong
answers due to examining the wrong ancestor node: we need to examine
the immediate SetOperationStmt parent not the topmost node.  Instead
make each caller of recurse_set_operations() pass down the relevant
parent node.  (This thinko seems to have led only to wasted planning
cycles and possibly-inferior plans, not wrong query answers.  Perhaps
we should back-patch it, but I'm not doing so right now.)

Teach generate_nonunion_paths() to consider pre-sorted inputs for
sorted SetOps, rather than always generating a Sort node.

Patch by me; thanks to Richard Guo and David Rowley for review.

Discussion: https://postgr.es/m/1850138.1731549611@sss.pgh.pa.us
2024-12-19 17:02:25 -05:00
Tom Lane
2762792952 Convert SetOp to read its inputs as outerPlan and innerPlan.
The original design for set operations involved appending the two
input relations into one and adding a flag column that allows
distinguishing which side each row came from.  Then the SetOp node
pries them apart again based on the flag.  This is bizarre.  The
only apparent reason to do it is that when sorting, we'd only need
one Sort node not two.  But since sorting is at least O(N log N),
sorting all the data is actually worse than sorting each side
separately --- plus, we have no chance of taking advantage of
presorted input.  On top of that, adding the flag column frequently
requires an additional projection step that adds cycles, and then
the Append node isn't free either.  Let's get rid of all of that
and make the SetOp node have two separate children, using the
existing outerPlan/innerPlan infrastructure.

This initial patch re-implements nodeSetop.c and does a bare minimum
of work on the planner side to generate correctly-shaped plans.
In particular, I've tried not to change the cost estimates here,
so that the visible changes in the regression test results will only
involve removal of useless projection steps and not any changes in
whether to use sorted vs hashed mode.

For SORTED mode, we combine successive identical tuples from each
input into groups, and then merge-join the groups.  The tuple
comparisons now use SortSupport instead of simple equality, but
the group-formation part should involve roughly the same number of
tuple comparisons as before.  The cross-comparisons between left and
right groups probably add to that, but I'm not sure to quantify how
many more comparisons we might need.

For HASHED mode, nodeSetop's logic is almost the same as before,
just refactored into two separate loops instead of one loop that
has an assumption that it will see all the left-hand inputs first.

In both modes, I added early-exit logic to not bother reading the
right-hand relation if the left-hand input is empty, since neither
INTERSECT nor EXCEPT modes can produce any output if the left input
is empty.  This could have been done before in the hashed mode, but
not in sorted mode.  Sorted mode can also stop as soon as it exhausts
the left input; any remaining right-hand tuples cannot have matches.

Also, this patch adds some infrastructure for detecting whether
child plan nodes all output the same type of tuple table slot.
If they do, the hash table logic can use slightly more efficient
code based on assuming that that's the input slot type it will see.
We'll make use of that infrastructure in other plan node types later.

Patch by me; thanks to Richard Guo and David Rowley for review.

Discussion: https://postgr.es/m/1850138.1731549611@sss.pgh.pa.us
2024-12-19 16:23:45 -05:00
Melanie Plageman
2128cebcdb Remove extra prefetch iterator setup for Bitmap Table Scan
1a0da347a7ac98db replaced Bitmap Table Scan's separate private and
shared bitmap iterators with a unified iterator. It accidentally set up
the prefetch iterator twice for non-parallel bitmap table scans. Remove
the extra set up call to tbm_begin_iterate().
2024-12-19 11:55:18 -05:00
Melanie Plageman
754c610e13 Fix bitmap table scan crash on iterator release
1a0da347a7ac98db replaced Bitmap Table Scan's individual private and
shared iterators with a unified iterator. It neglected, however, to
check if the iterator had already been cleaned up before doing so on
rescan. Add this check both on rescan and end scan to be safe.

Reported-by: Richard Guo
Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs48nrhcLY1kcd-u9oD%2B6yiS631F_8Fx8ZGsO-BYDwH%2Bbyw%40mail.gmail.com
2024-12-19 11:55:03 -05:00
Peter Geoghegan
31b0a8f040 Avoid nbtree index scan SAOP scanBehind confusion.
Consistently reset so->scanBehind at the beginning of nbtree array
advancement, even during sktrig_required=false calls (calls where array
advancement is triggered by an unsatisfied non-required array scan key).
Otherwise, it's possible for queries to fail to return all relevant
tuples to the scan given a low-order required scan key that was
previously deemed "satisfied" by a truncated high key attribute value.
This only happened at the point where a later non-required array scan
key needed to be "advanced" once on the next leaf page (that is, once
the right sibling of the truncated high key page was reached).

The underlying issue was that later code within _bt_advance_array_keys
assumed that the so->scanBehind flag must have been set using the
current page's high key (not the previous page's high key).  Any later
successful recheck call to _bt_check_compare would therefore spuriously
be prevented from making _bt_advance_array_keys return true, based on
the faulty belief that the truncated attribute must be from the scan's
current tuple (i.e. the non-pivot tuple at the start of the next page).
_bt_advance_array_keys would return false for the tuple, ultimately
resulting in _bt_checkkeys failing to return a matching tuple.

Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp
execution.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzkJKncfqyAUTeuB5GgRhT1vhsWO2q11dbZNqKmvjopP_g@mail.gmail.com
Backpatch: 17-, where commit 5bf748b8 first appears.
2024-12-19 11:08:55 -05:00
Peter Eisentraut
3e4bacb171 bootstrap: pure parser and reentrant scanner
Use the flex %option reentrant and the bison option %pure-parser to
make the generated scanner and parser pure, reentrant, and
thread-safe.

Make the generated scanner use palloc() etc. instead of malloc() etc.

For the bootstrap scanner and parser, reentrancy and memory management
aren't that important, but we make this change here anyway so that all
the scanners and parsers in the backend use a similar set of options
and APIs.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
2024-12-19 15:37:44 +01:00
Peter Eisentraut
399d0f1e11 Small whitespace improvement
Author: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
2024-12-19 13:00:31 +01:00
Peter Eisentraut
382092a0cd Prevent redeclaration of typedef yyscan_t
Fix for 1f0de66ea2a: We need to prevent redeclaration of typedef
yyscan_t.  (This will work with C11 but not currently with C99.)  The
generated scanner files provide their own typedef, but we also need to
provide one for the interfaces that we expose.  So we need to add some
preprocessor guards to avoid a redefinition.  (This is how the
generated scanner files do it internally as well.)  This way
everything now works independent of the order in which things are
included.
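
The guard follows the same idiom the generated scanners use
internally:

    #ifndef YY_TYPEDEF_YY_SCANNER_T
    #define YY_TYPEDEF_YY_SCANNER_T
    typedef void *yyscan_t;
    #endif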

Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
2024-12-19 11:24:43 +01:00
Michael Paquier
9aea73fc61 Add backend-level statistics to pgstats
This adds a new variable-numbered statistics kind in pgstats, where the
object ID key of the stats entries is based on the proc number of the
backends.  This acts as an upper-bound for the number of stats entries
that can exist at once.  The entries are created when a backend starts
after authentication succeeds, and are removed when the backend exits,
making the stats entry exist for as long as their backend is up and
running.  These are not written to the pgstats file at shutdown (note
that write_to_file is disabled, as a safety measure).

Currently, these stats include only information about the I/O generated
by a backend, using the same layer as pg_stat_io, except that it is now
possible to know how much activity is happening in each backend rather
than an overall aggregate of all the activity.  A function called
pg_stat_get_backend_io() is added to access this data depending on the
PID of a backend.  The existing structure could be expanded in the
future to add more information about other statistics related to
backends, depending on requirements or ideas.

Auxiliary processes are not included in this set of statistics.  These
are less interesting to have than normal backends as they have dedicated
entries in pg_stat_io, and stats kinds of their own.

This commit includes also pg_stat_reset_backend_stats(), function able
to reset all the stats associated to a single backend.

Bump catalog version and PGSTAT_FILE_FORMAT_ID.

Author: Bertrand Drouvot
Reviewed-by: Álvaro Herrera, Kyotaro Horiguchi, Michael Paquier, Nazir
Bilal Yavuz
Discussion: https://postgr.es/m/ZtXR+CtkEVVE/LHF@ip-10-97-1-34.eu-west-3.compute.internal
2024-12-19 13:19:22 +09:00
Michael Paquier
ff7c40d7fd Extract logic filling pg_stat_get_io()'s tuplestore into its own routine
This commit adds pg_stat_io_build_tuples(), a helper routine for
pg_stat_get_io(), that fills its result tuplestore based on the contents
of PgStat_BktypeIO.  This will be used in a follow-up commit that uses
the same structures as pg_stat_io for reporting, including the same
object types and contexts, but for a different statistics kind.

Author: Bertrand Drouvot, Michael Paquier
Discussion: https://postgr.es/m/ZtXR+CtkEVVE/LHF@ip-10-97-1-34.eu-west-3.compute.internal
2024-12-19 10:16:02 +09:00
David Rowley
08cdb079d4 Optimize grouping equality checks with virtual slots
8f4ee9626 fixed an old Assert failure that could happen when the slot
type used to look up the hash table for BuildTupleHashTableExt() users
wasn't a TTSOpsMinimalTuple slot.  The fix for that in the back branches
had to be to pass the TupleTableSlotOps as NULL, however in master,
since we have the inputOps parameter as was added by d96d1d515, we can
pass that down instead.

At least one caller uses a fixed slot that's always TTSOpsVirtual, so
passing down inputOps for these cases allows ExecBuildGroupingEqual() to
skip adding the EEOP_INNER_FETCHSOME ExprEvalStep.

This should increase the performance of hashed subplans very slightly.

Author: Tom Lane, David Rowley
Discussion: https://postgr.es/m/2543667.1734483723@sss.pgh.pa.us
2024-12-19 13:57:21 +13:00
David Rowley
8f4ee96269 Fix Assert failure in WITH RECURSIVE UNION queries
If the non-recursive part of a recursive CTE ended up using
TTSOpsBufferHeapTuple as the table slot type, then a duplicate value
could cause an Assert failure in CheckOpSlotCompatibility() when
checking the hash table for the duplicate value.  The expected slot type
for the deform step was TTSOpsMinimalTuple so the Assert failed when the
TTSOpsBufferHeapTuple slot was used.

This is a long-standing bug which we likely didn't notice because it
seems much more likely that the non-recursive term would have required
projection and used a TTSOpsVirtual slot, which CheckOpSlotCompatibility
is ok with.

There doesn't seem to be any harm done here other than the Assert
failure.  Both TTSOpsMinimalTuple and TTSOpsBufferHeapTuple slot types
require tuple deformation, so the EEOP_*_FETCHSOME ExprState step would
have properly existed in the ExprState.

The solution is to pass NULL for the ExecBuildGroupingEqual's 'lops'
parameter.  This means the ExprState's EEOP_*_FETCHSOME step won't
expect a fixed slot type.  This makes CheckOpSlotCompatibility() happy as
no checking is performed when the ExprEvalStep is not expecting a fixed
slot type.

Reported-by: Richard Guo
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CAMbWs4-8U9q2LAtf8+ghV11zeUReA3AmrYkxzBEv0vKnDxwkKA@mail.gmail.com
Backpatch-through: 13, all supported versions
2024-12-19 13:11:39 +13:00
Melanie Plageman
b7493e1ab3 Remove leftover mentions of XLOG_HEAP2_FREEZE_PAGE records
f83d709760d merged the separate XLOG_HEAP2_FREEZE_PAGE records into a
new combined prune, freeze, and vacuum record with opcode
XLOG_HEAP2_PRUNE_VACUUM_SCAN. Remove the last few references to
XLOG_HEAP2_FREEZE_PAGE records which were accidentally left behind.

Reported-by: Tomas Vondra
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/CA%2BTgmoY1tYff-1CEn8kYt5FsOrynTbtr%3DUZw%3D7mTC1Hv1HpeBQ%40mail.gmail.com
2024-12-18 18:47:21 -05:00
Melanie Plageman
1a0da347a7 Bitmap Table Scans use unified TBMIterator
With the repurposing of TBMIterator as an interface for both parallel
and serial iteration through TIDBitmaps in commit 7f9d4187e7bab10329cc,
bitmap table scans may now use it.

Modify bitmap table scan code to use the TBMIterator. This requires
moving around a bit of code, so a few variables are initialized
elsewhere.

Author: Melanie Plageman
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/c736f6aa-8b35-4e20-9621-62c7c82e2168%40vondra.me
2024-12-18 18:43:39 -05:00
Melanie Plageman
7f9d4187e7 Add common interface for TBMIterators
Add and use TBMPrivateIterator, which replaces the current TBMIterator
for serial use cases, and repurpose TBMIterator to be a unified
interface for both the serial ("private") and parallel ("shared") TID
Bitmap iterator interfaces. This encapsulation simplifies call sites for
callers supporting both parallel and serial TID Bitmap access.
TBMIterator is not yet used in this commit.

Author: Melanie Plageman
Reviewed-by: Tomas Vondra, Heikki Linnakangas
Discussion: https://postgr.es/m/063e4eb4-32d9-439e-a0b1-75565a9835a8%40iki.fi
2024-12-18 18:19:28 -05:00
Melanie Plageman
28328ec87b Fix overflow danger in SampleHeapTupleVisible()
68d9662be1c4b70 made HeapScanDesc->rs_ntuples unsigned but neglected to
change how it was being used in SampleHeapTupleVisible().

Return early if rs_ntuples is 0 to avoid overflowing and incorrectly
executing the loop code in SampleHeapTupleVisible().

Reported-by: Ranier Vilela
Discussion: https://postgr.es/m/CAEudQAot_xQoZyPZjpj1aBUPrPykY5mOPHGyvfe%3Djz%2BWowdA3A%40mail.gmail.com
2024-12-18 18:16:43 -05:00
Melanie Plageman
68d9662be1 Make rs_cindex and rs_ntuples unsigned
HeapScanDescData.rs_cindex and rs_ntuples can't be less than 0. All scan
types using the heap scan descriptor expect these values to be >= 0.
Make that expectation clear by making rs_cindex and rs_ntuples unsigned.

Also remove the test in heapam_scan_bitmap_next_tuple() that checks if
rs_cindex < 0. This was never true, but now that rs_cindex is unsigned,
it makes even less sense.

While we are at it, initialize both rs_cindex and rs_ntuples to 0 in
initscan().

Author: Melanie Plageman
Reviewed-by: Dilip Kumar
Discussion: https://postgr.es/m/CAAKRu_ZxF8cDCM_BFi_L-t%3DRjdCZYP1usd1Gd45mjHfZxm0nZw%40mail.gmail.com
2024-12-18 11:47:38 -05:00
Peter Eisentraut
1f0de66ea2 seg: pure parser and reentrant scanner
Use the flex %option reentrant and the bison option %pure-parser to
make the generated scanner and parser pure, reentrant, and
thread-safe.

Make the generated scanner use palloc() etc. instead of malloc() etc.
Previously, we only used palloc() for the buffer, but flex would still
use malloc() for its internal structures.  As a result, there could be
some small memory leaks in case of uncaught errors.  (We do catch
normal syntax errors as soft errors.)  Now, all the memory is under
palloc() control, so there are no more such issues.

Simplify flex scan buffer management: Instead of constructing the
buffer from pieces and then using yy_scan_buffer(), we can just use
yy_scan_string(), which does the same thing internally.

The previous code was necessary because we allocated the buffer with
palloc() and the rest of the state was handled by malloc().  But this
is no longer the case; everything is under palloc() now.

(We could even get rid of the yylex_destroy() call and just let the
memory context cleanup handle everything.  But for now, we preserve
the existing behavior.)

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
2024-12-18 08:47:53 +01:00
Peter Eisentraut
802fe923e3 cube: pure parser and reentrant scanner
Use the flex %option reentrant and the bison option %pure-parser to
make the generated scanner and parser pure, reentrant, and
thread-safe.

Make the generated scanner use palloc() etc. instead of malloc() etc.
Previously, we only used palloc() for the buffer, but flex would still
use malloc() for its internal structures.  As a result, there could be
some small memory leaks in case of uncaught errors.  (We do catch
normal syntax errors as soft errors.)  Now, all the memory is under
palloc() control, so there are no more such issues.

Simplify flex scan buffer management: Instead of constructing the
buffer from pieces and then using yy_scan_buffer(), we can just use
yy_scan_string(), which does the same thing internally.  (Actually, we
use yy_scan_bytes() here because we already have the length.)

The previous code was necessary because we allocated the buffer with
palloc() and the rest of the state was handled by malloc().  But this
is no longer the case; everything is under palloc() now.

(We could even get rid of the yylex_destroy() call and just let the
memory context cleanup handle everything.  But for now, we preserve
the existing behavior.)

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
2024-12-18 08:47:34 +01:00
Michael Paquier
477728b5d6 psql: Add more information about service name
This commit adds support for the following items in psql, which can show
the service name when available:
- Variable SERVICE.
- Substitution %s in PROMPT{1,2,3}.

This relies on 4b99fed7541e, that has made the service name available in
PGconn for libpq.

Author: Michael Banck
Reviewed-by: Greg Sabino Mullane
Discussion: https://postgr.es/m/6723c612.050a0220.1567f4.b94a@mx.google.com
2024-12-18 15:16:12 +09:00
Michael Paquier
4b99fed754 libpq: Add service name to PGconn and PQservice()
This commit adds one field to PGconn for the database service name (if
any), with PQservice() as the routine to retrieve it.  Like the other
routines in this area, NULL is returned as the result if the connection
is NULL.

A follow-up patch will make use of this feature to be able to display
the service name in the psql prompt.

Author: Michael Banck
Reviewed-by: Greg Sabino Mullane
Discussion: https://postgr.es/m/6723c612.050a0220.1567f4.b94a@mx.google.com
2024-12-18 14:53:42 +09:00
Tom Lane
3f06324705 Fix memory leak in pg_restore with zstd-compressed data.
EndCompressorZstd() neglected to free everything.  This was
most visible with a lot of large objects in the dump.

Per report from Tomasz Szypowski.  Back-patch to v16
where this code came in.

Discussion: https://postgr.es/m/DU0PR04MB94193D038A128EF989F922D199042@DU0PR04MB9419.eurprd04.prod.outlook.com
2024-12-17 22:31:26 -05:00
David Rowley
d96d1d5152 Fix incorrect slot type in BuildTupleHashTableExt
0f5738202 adjusted the execGrouping.c code so it made use of ExprStates to
generate hash values.  That commit made a wrong assumption that the slot
type to pass to ExecBuildHash32FromAttrs() is always &TTSOpsMinimalTuple.
That's not the case as the slot type depends on the slot type passed to
LookupTupleHashEntry(), which for nodeRecursiveunion.c, could be any of
the current slot types.

Here we fix this by adding a new parameter to BuildTupleHashTableExt()
to allow the slot type to be passed in.  In the case of nodeSubplan.c
and nodeAgg.c the slot type is always &TTSOpsVirtual, so for both of
those cases, it's beneficial to pass the known slot type as that allows
ExecBuildHash32FromAttrs() to skip adding the tuple deform step to the
resulting ExprState.  Another possible fix would have been to have
ExecBuildHash32FromAttrs() set "fetch.kind" to NULL so that
ExecComputeSlotInfo() always determines the EEOP_INNER_FETCHSOME is
required; however, that option isn't favorable as it slows down
aggregation and hashed subplan evaluation due to the extra (needless)
deform step.

Thanks to Nathan Bossart for bisecting to find the offending commit
based on Paul's report.

Reported-by: Paul Ramsey <pramsey@cleverelephant.ca>
Discussion: https://postgr.es/m/99F064C1-B3EB-4BE7-97D2-D2A0AA487A71@cleverelephant.ca
2024-12-18 12:05:55 +13:00
Nathan Bossart
84f1b0b031 Accommodate very large dshash tables.
If a dshash table grows very large (e.g., the dshash table for
cumulative statistics when there are millions of tables), resizing
it may fail with an error like:

	ERROR: invalid DSA memory alloc request size 1073741824

To fix, permit dshash resizing to allocate more than 1 GB by
providing the DSA_ALLOC_HUGE flag.

Reported-by: Andreas Scherbaum
Author: Matthias van de Meent
Reviewed-by: Cédric Villemain, Michael Paquier, Andres Freund
Discussion: https://postgr.es/m/80a12d59-0d5e-4c54-866c-e69cd6536471%40pgug.de
Backpatch-through: 13
2024-12-17 15:24:45 -06:00
Tom Lane
7a80e381d1 Skip useless calculation of join RTE column names during EXPLAIN.
There's no need for set_simple_column_names() to compute unique
column names for join RTEs, because a finished plan tree will
not contain any join alias Vars that we could need names for.
Its other, internal callers will not pass it any join RTEs
anyway, so the upshot is we can just skip join RTEs here.

Aside from getting rid of a klugy against-its-documentation use of
set_relation_column_names, this can speed up EXPLAIN substantially
when considering many-join queries, because the upper join RTEs
tend to have a lot of columns.

Sami Imseih, with cosmetic changes by me

Discussion: https://postgr.es/m/CAA5RZ0th3q-0p1pri58z9grG8r8azmEBa8o1rtkwhLmJg_cH+g@mail.gmail.com
2024-12-17 15:52:12 -05:00
Melanie Plageman
dc6acfd910 Count pages set all-visible and all-frozen in VM during vacuum
Heap vacuum already counts and logs pages with newly frozen tuples. Now
count and log the number of pages newly set all-visible and all-frozen
in the visibility map.

Pages that are all-visible but not all-frozen are debt for future
aggressive vacuums. The counts of newly all-visible and all-frozen pages
give us insight into the rate at which this debt is being accrued and
paid down.

Author: Melanie Plageman
Reviewed-by: Masahiko Sawada, Alastair Turner, Nitin Jadhav, Andres Freund, Bilal Yavuz, Tomas Vondra
Discussion: https://postgr.es/m/flat/CAAKRu_ZQe26xdvAqo4weHLR%3DivQ8J4xrSfDDD8uXnh-O-6P6Lg%40mail.gmail.com#6d8d2b4219394f774889509bf3bdc13d,
https://postgr.es/m/ctdjzroezaxmiyah3gwbwm67defsrwj2b5fpfs4ku6msfpxeia%40mwjyqlhwr2wu
2024-12-17 14:19:13 -05:00
Melanie Plageman
4b565a198b Make visibilitymap_set() return previous state of vmbits
It can be useful to know the state of a relation page's VM bits before
visibilitymap_set(). visibilitymap_set() has the old value on hand, so
returning it is simple. This commit does not use visibilitymap_set()'s
new return value.

Author: Melanie Plageman
Reviewed-by: Masahiko Sawada, Andres Freund, Nitin Jadhav, Bilal Yavuz
Discussion: https://postgr.es/m/flat/CAAKRu_ZQe26xdvAqo4weHLR%3DivQ8J4xrSfDDD8uXnh-O-6P6Lg%40mail.gmail.com#6d8d2b4219394f774889509bf3bdc13d,
https://postgr.es/m/ctdjzroezaxmiyah3gwbwm67defsrwj2b5fpfs4ku6msfpxeia%40mwjyqlhwr2wu
2024-12-17 14:19:03 -05:00
Melanie Plageman
f020baa066 Rename LVRelState->frozen_pages
Rename frozen_pages to new_frozen_tuple_pages in LVRelState, the struct
used for tracking state during vacuuming of a heap relation.
frozen_pages sounds like it tracks pages set all-frozen.  That is a
misnomer: it only counts pages with at least one newly frozen tuple, and
such pages are not necessarily all-frozen.

Author: Melanie Plageman
Reviewed-by: Andres Freund, Masahiko Sawada, Nitin Jadhav, Bilal Yavuz

Discussion: https://postgr.es/m/ctdjzroezaxmiyah3gwbwm67defsrwj2b5fpfs4ku6msfpxeia%40mwjyqlhwr2wu
2024-12-17 14:18:59 -05:00
Tom Lane
21fb39cb07 Set max_safe_fds whenever we create shared memory and semaphores.
Formerly we skipped this in bootstrap/check mode and in single-user
mode.  That's bad in check mode because it may allow accepting a
value of max_connections that doesn't actually work: on platforms
where semaphores consume file descriptors, there may not be enough
free FDs left over to satisfy fd.c, causing postmaster start to
fail.  It's also not great in single-user mode, because fd.c will
operate with just the minimum allowable value of max_safe_fds,
resulting in excess file open/close overhead if anything moderately
complicated is done in single-user mode.  (There may be some penalty
for bootstrap mode too, though probably not much.)

Discussion: https://postgr.es/m/2081982.1734393311@sss.pgh.pa.us
2024-12-17 12:23:26 -05:00
Tom Lane
c91963da13 Set the stack_base_ptr in main(), not in random other places.
Previously we did this in PostmasterMain() and InitPostmasterChild(),
which meant that stack depth checking was disabled in non-postmaster
server processes, for instance in single-user mode.  That seems like
a fairly bad idea, since there's no a-priori restriction on the
complexity of queries we will run in single-user mode.  Moreover, this
led to not having quite the same stack depth limit in all processes,
which likely has no real-world effect but it offends my inner neatnik.
Setting the depth in main() guarantees that check_stack_depth() is
armed and has a consistent interpretation of stack depth in all forms
of server processes.

While at it, move the code associated with checking the stack depth
out of tcop/postgres.c (which was never a great home for it) into
a new file src/backend/utils/misc/stack_depth.c.

Discussion: https://postgr.es/m/2081982.1734393311@sss.pgh.pa.us
2024-12-17 12:08:42 -05:00
Tomas Vondra
957ba9ff14 Detect version mismatch in brin_page_items
Commit dae761a87ed modified brin_page_items() to return the new "empty"
flag for each BRIN range. But the new output parameter was added in the
middle, which may cause crashes when using the new binary with old
function definition.

The ideal solution would be to introduce API versioning similar to what
pg_stat_statements does, but it's too late for that as PG17 was already
released (so we can't introduce a new extension version). We could do
something similar in brin_page_items() by checking the number of output
columns (and ignoring the new flag), but it doesn't seem very nice.

Instead, simply error out and suggest updating the extension to the
latest version. pageinspect is a superuser-only extension, and there's
not much reason to run an older version. Moreover, there's a precedent
for this approach in 691e8b2e18.

Reported by Ľuboslav Špilák, investigation and patch by me. Backpatch to
17, same as dae761a87ed.

Reported-by: Ľuboslav Špilák
Reviewed-by: Michael Paquier, Hayato Kuroda, Peter Geoghegan
Backpatch-through: 17
Discussion: https://postgr.es/m/VI1PR02MB63331C3D90E2104FD12399D38A5D2@VI1PR02MB6333.eurprd02.prod.outlook.com
Discussion: https://postgr.es/m/flat/3385a58f-5484-49d0-b790-9a198a0bf236@vondra.me
2024-12-17 17:48:55 +01:00
Tomas Vondra
8cd44db42a Update comments about index parallel builds
Commit b43757171470 allowed parallel builds for BRIN, but left behind
two comments claiming only btree indexes support parallel builds.

Reported by Egor Rogov, along with similar issues in SGML docs.
Backpatch to 17, where parallel builds for BRIN were introduced.

Reported-by: Egor Rogov
Backpatch-through: 17
Discussion: https://postgr.es/m/114e2d5d-125e-07d8-94aa-5ad175fb7443@postgrespro.ru
2024-12-17 15:40:07 +01:00
Peter Eisentraut
fb1a18810f Remove ts_locale.c's lowerstr()
lowerstr() and lowerstr_with_len() in ts_locale.c do the same thing as
str_tolower() that the rest of the system uses, except that the former
don't use the common locale provider framework but instead use the
global libc locale settings.

This patch replaces uses of lowerstr*() with str_tolower(...,
DEFAULT_COLLATION_OID).  For instances that use a libc locale
globally, this will result in exactly the same behavior.  For
instances that use other locale providers, you now get consistent
behavior and are no longer dependent on the libc locale settings (for
this case; there are others).

Most uses of these functions are for processing dictionary and
configuration files.  In those cases, using the default collation
seems appropriate.  At least we don't have a more specific collation
available.  But the code in contrib/pg_trgm should really depend on
the collation of the columns being processed.  This is not done here,
this can be done in a separate patch.

(You can probably construct some edge cases where this change would
create some locale-related upgrade incompatibility, for example if
before you used a combination of ICU and a differently-behaving libc
locale.  We can document this in the release notes, but I don't think
there is anything more we can do about this.)

Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://www.postgresql.org/message-id/flat/653f3b84-fc87-45a7-9a0c-bfb4fcab3e7d%40eisentraut.org
2024-12-17 14:04:55 +01:00
Peter Eisentraut
d3aad4ac57 Remove ts_locale.c's t_isdigit(), t_isspace(), t_isprint()
These do the same thing as the standard isdigit(), isspace(), and
isprint() but with multibyte and encoding support.  But all the
callers are only interested in analyzing single-byte ASCII characters.
So this extra layer is overkill and we can replace the uses with the
standard functions.

All the t_is*() functions in ts_locale.c are under scrutiny because
they don't use the common locale provider framework but instead use
the global libc locale settings.  For the functions being touched by
this patch, we don't need all that anyway, as mentioned above, so the
simplest solution is to just remove them.  The few remaining t_is*()
functions will need a different treatment in a separate patch.

pg_trgm has some compile-time options with macros such as
KEEPONLYALNUM.  These are not documented, and the non-default variant
is not supported by any test cases.  As part of this undertaking, I'm
removing the non-default variant, as it is in the way of cleanup.  So
in this case, the not-KEEPONLYALNUM code path is gone.

Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://www.postgresql.org/message-id/flat/653f3b84-fc87-45a7-9a0c-bfb4fcab3e7d%40eisentraut.org
2024-12-17 12:52:29 +01:00
Richard Guo
60be3f9f0a Avoid unnecessary wrapping for more complex expressions
When pulling up a subquery that is under an outer join, if the
subquery's target list contains a strict expression that uses a
subquery variable, it's okay to pull up the expression without
wrapping it in a PlaceHolderVar: if the subquery variable is forced to
NULL by the outer join, the expression result will come out as NULL
too.

If the strict expression does not contain any subquery variables, the
current code always wraps it in a PlaceHolderVar.  While this is not
incorrect, the analysis could be tighter: if the strict expression
contains any variables of rels that are under the same lowest nulling
outer join as the subquery, we can also avoid wrapping it.  This is
safe because if the subquery variable is forced to NULL by the outer
join, the variables of rels that are under the same lowest nulling
outer join will also be forced to NULL, resulting in the expression
evaluating to NULL as well.  Therefore, it's not necessary to force
the expression to be evaluated below the outer join.  It could be
beneficial to get rid of such PHVs because they could imply lateral
dependencies, which force us to resort to nestloop joins.

This patch checks if the lateral references in the strict expression
contain any variables of rels under the same lowest nulling outer join
as the subquery, and avoids wrapping the expression in that case.

This is fundamentally a generalization of the optimizations for bare
Vars and PHVs introduced in commit f64ec81a8.

No backpatch as this could result in plan changes.

Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4_ENtfRdLaM_bXAxiKRYO7DmwDBDG4_2=VTDi0mJP-jAw@mail.gmail.com
2024-12-17 19:53:01 +09:00
Amit Kapila
2364f61488 Doc: Fix the wrong link on pg_createsubscriber page.
Commit 84db9a0eb1 added an incorrect link to
'initial data synchronization'.  It pointed to a subsection of Row Filter
and didn't provide the required information.

Author: Peter Smith
Reviewed-by: Vignesh C, Pavel Luzanov
Backpatch-through: 17, where it was introduced
Discussion: https://postgr.es/m/CAHut+PtnA4DB_pcv4TDr4NjUSM1=P2N_cuZx5DX09k7LVmaqUA@mail.gmail.com
2024-12-17 15:08:29 +05:30
Michael Paquier
fee2b3ea2e Tweak some comments related to variable-numbered stats in pgstat.c
These comments referred to database objects, but depending on the stats
kind dealt with, this may not be true.

Issues found while reviewing a different patch in this area.

Discussion: https://postgr.es/m/ZtXR+CtkEVVE/LHF@ip-10-97-1-34.eu-west-3.compute.internal
2024-12-17 14:32:35 +09:00
Michael Paquier
0f23dedc91 Print out error position for some more DDLs
The following commands gain some information about the error position in
the query, should they fail when looking at the type used:
- CREATE TYPE (LIKE)
- CREATE TABLE OF

Both are related to typenameType() where the type name lookup is done.
These calls gain the ParseState that already exists in these paths.

Author: Kirill Reshke, Jian He
Reviewed-by: Álvaro Herrera, Michael Paquier
Discussion: https://postgr.es/m/CALdSSPhqfvKbDwqJaY=yEePi_aq61GmMpW88i6ZH7CMG_2Z4Cg@mail.gmail.com
2024-12-17 09:44:06 +09:00
Michael Paquier
e116b703f0 pg_combinebackup: Fix PITR comparison test in 002_compare_backups
The test was creating both the dumps to compare from the same database
on the same node, so it would never detect any mismatches when comparing
the logical dumps of the two servers.

Fixing this issue has revealed that there is a difference in the dumps:
the tablespace paths are different.  This commit uses compare_text()
with a custom comparison function to erase the difference (slightly
tweaked to be able to work with WIN32 and non-WIN32 paths).  This way,
the non-relevant parts of the tablespace path are ignored by the check
while the basic structure of the query string is still compared.

Author: Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/87h67653ns.fsf@wibble.ilmari.org
Backpatch-through: 17
2024-12-17 09:23:49 +09:00
Tomas Vondra
0cc7da4e23 doc: Mention BRIN indexes support parallel builds
Two places in the documentation suggest B-tree is the only index access
method allowing parallel builds. Commit b4375717 added parallel builds
for BRIN too, but failed to update the docs. So fix that, and backpatch
to 17, where parallel BRIN builds were introduced.

Author: Egor Rogov
Backpatch-through: 17
Discussion: https://postgr.es/m/114e2d5d-125e-07d8-94aa-5ad175fb7443@postgrespro.ru
2024-12-16 19:17:58 +01:00
Tomas Vondra
3429145d42 psql: Tab completion for JOIN ... USING column list
For JOIN ... USING, offer attribute names for the first member of the
column list.

Author: Andreas Karlsson
Reviewed-By: Tomas Vondra
Discussion: https://postgr.es/m/3a7e27bc-d6ed-4cb0-9b21-f21143fc1b37@proxel.se
2024-12-16 18:47:03 +01:00
Tomas Vondra
a01f6fa6ad psql: Tab completion for JOIN ... ON/USING
Offer ON/USING clauses for join types that require join conditions (i.e.
anything except for NATURAL/CROSS joins).

Author: Andreas Karlsson
Reviewed-By: Tomas Vondra
Discussion: https://postgr.es/m/3a7e27bc-d6ed-4cb0-9b21-f21143fc1b37@proxel.se
2024-12-16 18:47:03 +01:00
Tomas Vondra
5dd5786b94 psql: Tab completion for LATERAL joins
When listing selectable objects after a JOIN, offer also LATERAL.

Author: Andreas Karlsson
Reviewed-By: Tomas Vondra
Discussion: https://postgr.es/m/3a7e27bc-d6ed-4cb0-9b21-f21143fc1b37@proxel.se
2024-12-16 18:47:03 +01:00
Jeff Davis
86a5d6006a Refactor string case conversion into provider-specific files.
Create API entry points pg_strlower(), etc., that work with any
provider and give the caller control over the destination
buffer. Then, move provider-specific logic into pg_locale_builtin.c,
pg_locale_icu.c, and pg_locale_libc.c as appropriate.

Discussion: https://postgr.es/m/7aa46d77b377428058403723440862d12a8a129a.camel@j-davis.com
2024-12-16 09:35:18 -08:00
Tomas Vondra
de1e298857 psql: Tab completion for CREATE MATERIALIZED VIEW ... USING
The tab completion didn't offer USING for CREATE MATERIALIZED VIEW, so
add it, and offer a list of access methods, followed by SELECT.

Author: Kirill Reshke
Reviewed-By: Karina Litskevich
Discussion: https://postgr.es/m/CALdSSPhVELkvutquqrDB=Ujfq_Pjz=6jn-kzh+291KPNViLTfw@mail.gmail.com
2024-12-16 17:30:32 +01:00
Tomas Vondra
1e1f70c34a psql: Tab completion for CREATE TEMP TABLE ... USING
The USING keyword was offered only for persistent tables, not for
temporary ones. So improve that.

Author: Kirill Reshke
Reviewed-By: Karina Litskevich
Discussion: https://postgr.es/m/CALdSSPhVELkvutquqrDB=Ujfq_Pjz=6jn-kzh+291KPNViLTfw@mail.gmail.com
2024-12-16 17:30:04 +01:00
Tomas Vondra
8f11ef80c5 psql: Tab completion for ALTER TYPE ... CASCADE/RESTRICT
Updates tab completion for ALTER TYPE to offer CASCADE/RESTRICT for a
number of actions on attributes:

    ALTER TYPE ... ADD/DROP/RENAME ATTRIBUTE ... [CASCADE|RESTRICT]
    ALTER TYPE ... TYPE ... [CASCADE|RESTRICT]

Author: Kirill Reshke
Reviewed-By: Karina Litskevich
Discussion: https://postgr.es/m/CALdSSPhVELkvutquqrDB=Ujfq_Pjz=6jn-kzh+291KPNViLTfw@mail.gmail.com
2024-12-16 17:29:30 +01:00
Tomas Vondra
e0275c380c psql: Tab completion for ALTER TYPE ... ADD ATTRIBUTE
Improve psql tab completion for ALTER TYPE ... ADD ATTRIBUTE to offer a
list of existing data types (until now no options were offered).

Author: Kirill Reshke
Reviewed-By: Karina Litskevich
Discussion: https://postgr.es/m/CALdSSPhVELkvutquqrDB=Ujfq_Pjz=6jn-kzh+291KPNViLTfw@mail.gmail.com
2024-12-16 17:29:17 +01:00
Heikki Linnakangas
1dfeb6af7f Make 009_twophase.pl test pass with recovery_min_apply_delay set
The test failed if you ran the regression tests with TEMP_CONFIG with
recovery_min_apply_delay = '500ms'. Fix the race condition by waiting
for the transaction to be applied on the replica, like in a few other
tests.

The failing test was introduced in commit cbfbda7841. Backpatch to all
supported versions like that commit (except v12, which is no longer
supported).

Reported-by: Alexander Lakhin
Discussion: https://www.postgresql.org/message-id/09e2a70a-a6c2-4b5c-aeae-040a7449c9f2@gmail.com
2024-12-16 15:59:21 +02:00
Michael Paquier
39240bcad5 Print out error position for CREATE DOMAIN
This is simply done by pushing down the ParseState available in
ProcessUtility() to DefineDomain(), giving more information about the
position of an error when running a CREATE DOMAIN query.
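
For example, a failing CREATE DOMAIN can now point at the offending type
name (output is illustrative):

    CREATE DOMAIN d AS no_such_type;
    ERROR:  type "no_such_type" does not exist
    LINE 1: CREATE DOMAIN d AS no_such_type;
                               ^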

Most of the queries impacted by this change have been added previously
in 0172b4c9449e.

Author: Kirill Reshke, Jian He
Reviewed-by: Álvaro Herrera, Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/CALdSSPhqfvKbDwqJaY=yEePi_aq61GmMpW88i6ZH7CMG_2Z4Cg@mail.gmail.com
2024-12-16 14:52:11 +09:00
Michael Paquier
3ad8b840ce Add some tests for encoding conversion in COPY TO/FROM
This adds a couple of tests to trigger encoding conversion when input
and server encodings do not match in COPY FROM/TO, or need_transcoding
set to true in the COPY state data.  These tests rely on UTF8 <-> LATIN1
for the valid cases as LATIN1 accepts any bytes, and UTF8 <-> EUC_JP for
some of the invalid cases where a character cannot be understood,
causing a conversion failure.
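
A sketch of the kind of statements exercised (table name hypothetical):

    -- transcoding on output: server encoding differs from the target encoding
    COPY tab TO STDOUT (ENCODING 'LATIN1');
    -- transcoding on input, via the client encoding
    SET client_encoding TO 'LATIN1';
    COPY tab FROM STDIN;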

Both ENCODING and client_encoding are covered.  Test suggested by Andres
Freund.

Author: Sutou Kouhei
Discussion: https://postgr.es/m/20240206222445.hzq22pb2nye7rm67@awork3.anarazel.de
2024-12-16 11:23:38 +09:00
Tom Lane
bf9165bb0c Declare a couple of variables inside not outside a PG_TRY block.
I went through the buildfarm's reports of "warning: variable 'foo'
might be clobbered by 'longjmp' or 'vfork' [-Wclobbered]".  As usual,
none of them are live problems according to my understanding of the
effects of setjmp/longjmp, to wit that local variables might revert
to their values as of PG_TRY entry, due to being kept in registers.
But I did happen to notice that XmlTableGetValue's "cstr" variable
doesn't need to be declared outside the PG_TRY block at all (thus
giving further proof that the -Wclobbered warning has little
connection to real problems).  We might as well move it inside,
and "cur" too, in hopes of eliminating one of the bogus warnings.
2024-12-15 15:50:07 -05:00
Tom Lane
530f89e648 pgbench: fix misprocessing of some nested \if constructs.
An \if command appearing within a false (not-to-be-executed) \if
branch was incorrectly treated the same as \elif.  This could allow
statements within the inner \if to be executed when they should
not be.  Also the missing inner \if stack entry would result in an
assertion failure (in assert-enabled builds) when the final \endif
is reached.

Report and patch by Michail Nikolaev.  Back-patch to all
supported branches.

Discussion: https://postgr.es/m/CANtu0oiA1ke=SP6tauhNqkUdv5QFsJtS1p=aOOf_iU+EhyKkjQ@mail.gmail.com
2024-12-15 14:14:14 -05:00
Fujii Masao
56499315a7 doc: Clarify old WAL files are kept until they are summarized.
The documentation in wal.sgml explains that old WAL files cannot be
removed or recycled until they are archived (when WAL archiving is used)
or replicated (when using replication slots). However, it did not mention
that, similarly, old WAL files are also kept until they are summarized
if WAL summarization is enabled. This commit adds that clarification
to the documentation.

Back-patch to v17 where WAL summarization was added.

Author: Fujii Masao
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/fd0eb0a5-f43b-4e06-b450-cbca011b6cff@oss.nttdata.com
2024-12-15 11:18:18 +09:00
Tom Lane
969bbd0faf contrib/earthdistance: Use SQL-standard function bodies.
The @extschema:name@ feature added by 72a5b1fc8 allows us to
make earthdistance's references to the cube extension fully
search-path-secure, so long as all those references are
resolved at extension installation time not runtime.
To do that, we must convert earthdistance's SQL functions to
the new SQL-standard style; but we wanted to do that anyway.

The functions can be updated in our customary style by running
CREATE OR REPLACE FUNCTION in an extension update script.
However, there's still problems in the "CREATE DOMAIN earth"
command: its references to cube functions could be captured
by hostile objects in earthdistance's installation schema,
if that's not where the cube extension is.  Worse, the reference
to the cube type itself as the domain's base could be captured,
and that's not something we could fix after-the-fact in the
update script.

What I've done about that is to change the "CREATE DOMAIN earth"
command in the base script earthdistance--1.1.sql.  Ordinarily,
changing a released extension script is forbidden; but I think
it's okay here since the results of successful (non-trojaned)
script execution will be identical to before.

A good deal of care is still needed to make the extension's scripts
proof against search-path-based attacks.  We have to make sure that
all the function and operator invocations have exact argument-type
matches, to forestall attacks based on supplying a better match.
Fortunately earthdistance isn't very big, so I've just gone through
it and inspected each call to be sure of that.  The only actual code
changes needed were to spell all floating-point constants in the style
'-1'::float8, rather than depending on runtime type conversions and/or
negations.  (I'm not sure that the shortcuts previously used were
attackable, but removing run-time effort is a good thing anyway.)

I believe that this fixes earthdistance enough that we could
mark it trusted and remove the warnings about it that were
added by 7eeb1d986; but I've not done that here.

The primary reason for dealing with this now is that we've
received reports of pg_upgrade failing for databases that use
earthdistance functions in contexts like generated columns.
That's a consequence of 2af07e2f7 having restricted the search_path
used while evaluating such expressions.  The only way to fix that
is to make the earthdistance functions independent of run-time
search_path.  This patch is very much nicer than the alternative of
attaching "SET search_path" clauses to earthdistance's functions:
it is more secure and doesn't create a run-time penalty.  Therefore,
I've chosen to back-patch this to v16 where @extschema:name@
was added.  It won't help unless users update to 16.7 and issue
"ALTER EXTENSION earthdistance UPDATE" before upgrading to 17,
but at least there's now a way to deal with the problem without
manual intervention in the dump/restore process.

Tom Lane and Ronan Dunklau

Discussion: https://postgr.es/m/3316564.aeNJFYEL58@aivenlaptop
Discussion: https://postgr.es/m/6a6439f1-8039-44e2-8fb9-59028f7f2014@mailbox.org
2024-12-14 16:07:18 -05:00
Álvaro Herrera
62b7a9a778 Refactor some SQL/JSON error messages
Turn type names into "%s" specifiers to 1) avoid getting them translated
and 2) reduce the total number of messages.
2024-12-14 12:55:00 +01:00
Thomas Munro
7bc9a8bdd2 Fix warnings about declaration of environ on MinGW.
POSIX says that the global variable environ shouldn't be declared in a
header, and that you have to declare it yourself.  MinGW declares it in
<stdlib.h> with some macrology that messes up our declarations.  Visual
Studio doesn't warn (there are clues that it may also declare it, but if
so, apparently compatibly).  Suppress our declarations, on MinGW only.

This clears the last warnings on CI's optional MinGW task, and hopefully
on build farm animal fairywren too.

Like 1319997d, no back-patch for now as it's not known to be breaking
anything, and my humble goal is just to keep the MinGW build clean going
forward.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (earlier version)
Discussion: https://postgr.es/m/CA%2BhUKGJLMh%2B6W5E4M_jSFb43gnrA_-Q6-%2BBf3HkBXyGfRFcBsQ%40mail.gmail.com
2024-12-15 00:41:27 +13:00
Thomas Munro
48c142f78d Remove EXTENSION_DONT_CHECK_SIZE from md.c.
Commits 7bb3102c and 3eb77eba removed the only user of the
EXTENSION_DONT_CHECK_SIZE flag, which had previously been required to
checkpoint truncated relations.  Since 7bb3102c, segments have been
opened directly for synchronization without calling _mdfd_getseg(), so
it doesn't need a mode that tolerates non-final short segments.  Remove
the redundant flag and associated comments.

Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/nyj4k7yur5t27rtygvx2i2lrlp6rqfvvhoiiwx4fznynksf2et%404hj2sp42alpe
2024-12-14 21:52:10 +13:00
John Naylor
c72ca3ddec Fix typo
Ryo Kanbayashi

Discussion: https://postgr.es/m/CANOn0ExEQiPVrzkjULkENVac_n4Lknm6dxsU69MSncQap0kJVA%40mail.gmail.com
2024-12-14 09:52:08 +07:00
Tom Lane
7b8cb9cd6a Fix possible crash in pg_dump with identity sequences.
If an owned sequence is considered interesting, force its owning
table to be marked interesting too.  This ensures, in particular,
that we'll fetch the owning table's column names so we have the
data needed for ALTER TABLE ... ADD GENERATED.  Previously there were
edge cases where pg_dump could get SIGSEGV due to not having filled in
the column names.  (The known case is where the owning table has been
made part of an extension while its identity sequence is not a member;
but there may be others.)

Also, if it's an identity sequence, force its dumped-components mask
to exactly match the owning table: dump definition only if we're
dumping the table's definition, dump data only if we're dumping the
table's data, etc.  This generalizes the code introduced in commit
b965f2617 that set the sequence's dump mask to NONE if the owning
table's mask is NONE.  That's insufficient to prevent failures,
because for example the table's mask might only request dumping ACLs,
which would lead us to still emit ALTER TABLE ADD GENERATED even
though we didn't create the table.  It seems better to treat an
identity sequence as though it were an inseparable aspect of the
table, matching the treatment used in the backend's dependency logic.
Perhaps this policy needs additional refinement, but let's wait to
see some field use-cases before changing it further.

While here, add a comment in pg_dump.h warning against writing tests
like "if (dobj->dump == DUMP_COMPONENT_NONE)", which was a bug in this
case.  There is one other example in getPublicationNamespaces, which
if it's not a bug is at least remarkably unclear and under-documented.
Changing that requires a separate discussion, however.

Per report from Artur Zakirov.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/CAKNkYnwXFBf136=u9UqUxFUVagevLQJ=zGd5BsLhCsatDvQsKQ@mail.gmail.com
2024-12-13 14:21:36 -05:00
Álvaro Herrera
3191eccd8a Rewrite maybe_reread_subscription() comment
One sentence was grammatically wrong, but also too terse.  Expand on it.
2024-12-13 07:41:36 +01:00
Álvaro Herrera
fd41ba93e4 Dump not-null constraints on inherited columns correctly
With not-null constraints defined in child tables for columns that are
coming from their parent tables, we were printing ALTER TABLE SET NOT
NULL commands that were missing the constraint name, so the original
constraint name was being lost, which is bogus.  Fix by instead adding
a table constraint declaration with the correct constraint name in the
CREATE TABLE.
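
The emitted CREATE TABLE now looks roughly like this (names hypothetical):

    CREATE TABLE child (
        a integer,
        CONSTRAINT parent_a_not_null NOT NULL a
    )
    INHERITS (parent);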

Oversight in commit 14e87ffa5c54.

We could have fixed it by changing the ALTER TABLE SET NOT NULL to ALTER
TABLE ADD CONSTRAINT, but I'm not sure that's any better.  A potential
problem here might be that if sent to a non-Postgres server, the new
pg_dump output would fail because the "CONSTRAINT foo NOT NULL colname"
syntax isn't SQL-conforming.  However, Postgres' implementation of
inheritance is already non-SQL-conforming, so that'd likely fail anyway.

This problem was only noticed by Ashutosh's proposed test framework for
pg_dump, https://postgr.es/m/CAExHW5uF5V=Cjecx3_Z=7xfh4rg2Wf61PT+hfquzjBqouRzQJQ@mail.gmail.com

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reported-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CAExHW5tbdgAKDfqjDJ-7Fk6PJtHg8D4zUF6FQ4H2Pq8zK38Nyw@mail.gmail.com
2024-12-13 07:38:49 +01:00
Nathan Bossart
a0ff56e2d3 Revert "Don't truncate database and user names in startup packets."
This reverts commit 562bee0fc13dc95710b8db6a48edad2f3d052f2e.

We received a report from the field about this change in behavior,
so it seems best to revert this commit and to add proper
multibyte-aware truncation as a follow-up exercise.

Fixes bug #18711.

Reported-by: Adam Rauch
Reviewed-by: Tom Lane, Bertrand Drouvot, Bruce Momjian, Thomas Munro
Discussion: https://postgr.es/m/18711-7503ee3e449d2c47%40postgresql.org
Backpatch-through: 17
2024-12-12 15:52:04 -06:00
Michael Paquier
4766438aa3 Adjust some comments about structure properties in pg_stat.h
One comment of PgStat_TableCounts mentioned that its pending stats use
memcmp() to check for the all-zero case if there is any activity.  This
is not true since 07e9e28b56, as pg_memory_is_all_zeros() is used.

PgStat_FunctionCounts incorrectly documented that it relied on memcpy().
This has never been correct, and it is not relevant anyway because
function statistics do not have an all-zero check for pending stats.

Checkpoint and bgwriter statistics have been always relying on memcmp()
or pg_memory_is_all_zeros() (since 07e9e28b56 for the latter), and never
mentioned the dependency on event counters for their all-zero checks.
Let's document these properties, like the table statistics.

Author: Bertrand Drouvot
Discussion: https://postgr.es/m/Z1hNLvcPgVLPxCoc@ip-10-97-1-34.eu-west-3.compute.internal
2024-12-12 16:59:22 +09:00
David Rowley
bd10ec5297 Detect redundant GROUP BY columns using UNIQUE indexes
d4c3a156c added support so that when the GROUP BY contained all of the
columns belonging to a relation's PRIMARY KEY, all other columns
belonging to that relation would be removed from the GROUP BY clause.
That's possible because all other columns are functionally dependent on
the PRIMARY KEY and those columns alone ensure the groups are distinct.

Here we expand on that optimization and allow it to work for any unique
indexes on the table rather than just the PRIMARY KEY index.  This
normally requires that all columns in the index are defined with NOT NULL,
however, we can relax that requirement when the index is defined with
NULLS NOT DISTINCT.
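
For illustration (hypothetical table), a unique index over NOT NULL
columns now allows the dependent columns to be dropped from the GROUP BY:

    CREATE TABLE t (a int NOT NULL, b int NOT NULL, c text, d text);
    CREATE UNIQUE INDEX t_a_b_key ON t (a, b);
    -- c and d are functionally dependent on (a, b), so this can be
    -- planned as if it were GROUP BY a, b:
    SELECT a, b, c, d FROM t GROUP BY a, b, c, d;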

When there are multiple suitable indexes to allow columns to be removed,
we prefer the index with the least number of columns as this allows us
to remove the highest number of GROUP BY columns.  One day, we may want to
revisit that decision as it may make more sense to use the narrower set of
columns in terms of the width of the data types and stored/queried data.

This also adjusts the code to make use of RelOptInfo.indexlist rather
than looking up the catalog tables.

In passing, add another short-circuit path to allow bailing out earlier
in cases where it's certainly not possible to remove redundant GROUP BY
columns.  This early exit is now cheaper to do than when this code was
originally written as 00b41463c made it cheaper to check for empty
Bitmapsets.

Patch originally by Zhang Mingli and later worked on by jian he, but after
I (David) worked on it, there was very little of the original left.

Author: Zhang Mingli, jian he, David Rowley
Reviewed-by: jian he, Andrei Lepikhov
Discussion: https://postgr.es/m/327990c8-b9b2-4b0c-bffb-462249f82de0%40Spark
2024-12-12 15:28:38 +13:00
Richard Guo
d8f335156c Improve the test case from 5668a857d
In commit 5668a857d, we fixed an issue with incorrect results in right
semi joins and introduced a test case to verify the fix.  The test
case involves SubPlans and InitPlans, which may not be immediately
apparent in relation to the issue we addressed.

This patch simplifies the test case with a more straightforward query.

Per discussion with Melanie Plageman.

Author: Richard Guo
Discussion: https://postgr.es/m/CAAKRu_a-Cip2XCXp13fmxq+T9BhLAVApHTyjr94awL2mbXHC-Q@mail.gmail.com
2024-12-12 11:21:51 +09:00
Michael Paquier
0172b4c944 Add some regression tests for missing DDL patterns
The following commands gain increased coverage for some of the errors
they can trigger:
- ALTER TABLE .. ALTER COLUMN
- CREATE DOMAIN
- CREATE TYPE (LIKE)

This has come up while discussing the possibility to add more
information about the location of the error in such queries, and it
is useful on its own as there was no coverage until now for the
patterns added in this commit.

Author: Jian He, Kirill Reshke
Reviewed-By: Álvaro Herrera, Michael Paquier
Discussion: https://postgr.es/m/CALdSSPhqfvKbDwqJaY=yEePi_aq61GmMpW88i6ZH7CMG_2Z4Cg@mail.gmail.com
2024-12-12 11:16:45 +09:00
David Rowley
430a5952de Defer remove_useless_groupby_columns() work until query_planner()
Traditionally, remove_useless_groupby_columns() was called during
grouping_planner() directly after the call to preprocess_groupclause().
While in many ways, it made sense to populate the field and remove the
functionally dependent columns from processed_groupClause at the same
time, it's just that doing so had the disadvantage that
remove_useless_groupby_columns() was being called before the RelOptInfos
were populated for the relations mentioned in the query.  Not having
RelOptInfos available meant we needed to manually query the catalog tables
to get the required details about the primary key constraint for the
table.

Here we move the remove_useless_groupby_columns() call to
query_planner() and put it directly after the RelOptInfos are populated.
This is fine to do as processed_groupClause still isn't final at this
point as it can still be modified inside standard_qp_callback() by
make_pathkeys_for_sortclauses_extended().

This commit is just a refactor and simply moves
remove_useless_groupby_columns() into initsplan.c.  A planned follow-up
commit will adjust that function so it uses RelOptInfo instead of doing
catalog lookups and also teach it how to use unique indexes as proofs to
expand the cases where we can remove functionally dependent columns from
the GROUP BY.

Reviewed-by: Andrei Lepikhov, jian he
Discussion: https://postgr.es/m/CAApHDvqLezKwoEBBQd0dp4Y9MDkFBDbny0f3SzEeqOFoU7Z5+A@mail.gmail.com
2024-12-12 14:22:15 +13:00
Masahiko Sawada
78c5e141e9 Add UUID version 7 generation function.
This commit introduces the uuidv7() SQL function, which generates UUID
version 7 as specified in RFC 9562. UUIDv7 combines a Unix timestamp
in milliseconds and random bits, offering both uniqueness and
sortability.

In our implementation, the 12-bit sub-millisecond timestamp fraction
is stored immediately after the timestamp, in the space referred to as
"rand_a" in the RFC. This ensures additional monotonicity within a
millisecond. The rand_a bits also function as a counter. We select a
sub-millisecond timestamp so that it monotonically increases for
generated UUIDs within the same backend, even when the system clock
goes backward or when generating UUIDs at very high
frequency. Therefore, the monotonicity of generated UUIDs is ensured
within the same backend.

This commit also expands the uuid_extract_timestamp() function to
support UUID version 7.

Additionally, an alias uuidv4() is added for the existing
gen_random_uuid() SQL function to maintain consistency.
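
A quick usage sketch of the new and aliased entry points:

    SELECT uuidv7();                          -- UUID version 7
    SELECT uuid_extract_timestamp(uuidv7());  -- now understands version 7
    SELECT uuidv4();                          -- alias for gen_random_uuid()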

Bump catalog version.

Author: Andrey Borodin
Reviewed-by: Sergey Prokhorenko, Przemysław Sztoch, Nikolay Samokhvalov
Reviewed-by: Peter Eisentraut, Jelte Fennema-Nio, Aleksander Alekseev
Reviewed-by: Masahiko Sawada, Lukas Fittl, Michael Paquier, Japin Li
Reviewed-by: Marcos Pegoraro, Junwang Zhao, Stepan Neretin
Reviewed-by: Daniel Vérité
Discussion: https://postgr.es/m/CAAhFRxitJv%3DyoGnXUgeLB_O%2BM7J2BJAmb5jqAT9gZ3bij3uLDA%40mail.gmail.com
2024-12-11 15:54:41 -08:00
David Rowley
89988ac589 Fix further fallout from EXPLAIN ANALYZE BUFFERS change
c2a4078eb adjusted EXPLAIN ANALYZE to default the BUFFERS to ON.  This
(hopefully) fixes the last remaining issue with regression test failures
with -D RELCACHE_FORCE_RELEASE -D CATCACHE_FORCE_RELEASE builds, where
the planner accesses more buffers due to the cold caches.

Discussion: https://postgr.es/m/CAApHDvqLdzgz77JsE-yTki3w9UiKQ-uTMLRctazcu+99-ips3g@mail.gmail.com
2024-12-12 09:50:00 +13:00
Nathan Bossart
e8d5929428 Use pg_memory_is_all_zeros() in pgstatfuncs.c.
There are a few places in this file that use memset() and memcmp()
to determine whether a section of memory is all zeros.  This commit
modifies them to use pg_memory_is_all_zeros() instead.  These
aren't expected to be hot code paths, but this may optimize them a
bit.  Plus, this allows us to remove some variables that were only
needed for the memset() and memcmp().

Author: Bertrand Drouvot
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/Z1hNubHfvMxlW6eu%40ip-10-97-1-34.eu-west-3.compute.internal
2024-12-11 14:19:14 -06:00
Masahiko Sawada
398d3e3b5b Unmark gen_random_uuid() function leakproof.
The functions without arguments don't need to be marked
leakproof. This commit unmarks gen_random_uuid() leakproof for
consistency with upcoming UUID generation functions. Also, this commit
adds a regression test to prevent reintroducing such cases.
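
One way such a consistency check can be written (the actual regression
test may differ):

    -- zero-argument functions should not be marked leakproof
    SELECT proname FROM pg_proc WHERE proleakproof AND pronargs = 0;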

Bump catalog version.

Reported-by: Peter Eisentraut
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/CAD21AoBE1ePPWY1NQEgk3DkqjYzLPZwYTzCySHm0e%2B9a69PfZw%40mail.gmail.com
2024-12-11 10:35:57 -08:00
Daniel Gustafsson
0e033f5b6d Fix a memory leak in dumping functions with TRANSFORMs
The generation of the dump clause for functions with TRANSFORM
calls would leak the memory for holding the result of the Oid
array parsing.  Fix by freeing.

While in there, switch to using pg_malloc instead of palloc in
order to be consistent with the rest of the file.

Author: Oleg Tselebrovskiy <o.tselebrovskiy@postgrespro.ru>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/baf1ae4511288e5b421f41e79a3df1a0@postgrespro.ru
2024-12-11 12:48:22 +01:00
David Rowley
9df2a4b931 Add missing BUFFERS OFF in regression tests, take 2
Similar to 9fa1aaa65, but running with -D RELCACHE_FORCE_RELEASE and
-D CATCACHE_FORCE_RELEASE yielded some additional missing places that
needed BUFFERS OFF.

Discussion: https://postgr.es/m/CANNMO++W7MM8T0KyXN3ZheXXt-uLVM3aEtZd+WNfZ=obxffUiA@mail.gmail.com
2024-12-11 23:16:44 +13:00
David Rowley
9fa1aaa652 Add missing BUFFERS OFF in select_into regression tests
c2a4078eb adjusted EXPLAIN ANALYZE to include BUFFERS by default, but
a few tests in select_into.sql neglected to add BUFFERS OFF.  The
failing tests seem unlikely to ever access buffers during execution, but
they certainly could during planning.

Per buildfarm member kestrel, tayra and calliphoridae.

Discussion: https://postgr.es/m/CANNMO++W7MM8T0KyXN3ZheXXt-uLVM3aEtZd+WNfZ=obxffUiA@mail.gmail.com
2024-12-11 22:56:36 +13:00
David Rowley
c2a4078eba Enable BUFFERS with EXPLAIN ANALYZE by default
The topic of turning EXPLAIN's BUFFERS option on with the ANALYZE option
has come up a few times over the past few years.  In many ways, doing this
seems like a good idea as it may be more obvious to users why a given
query is running more slowly than they might expect.  Also, from my own
(David's) personal experience, I've seen users posting to the mailing
lists with two identical plans, one slow and one fast asking why their
query is sometimes slow.  In many cases, this is due to additional reads.
Having BUFFERS on by default may help reduce some of these questions, and
if not, make it more obvious to the user before they post, or save a
round-trip to the mailing list when additional I/O effort is the cause of
the slowness.
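
In practice (queries illustrative):

    EXPLAIN (ANALYZE) SELECT count(*) FROM pg_class;               -- buffers shown by default
    EXPLAIN (ANALYZE, BUFFERS OFF) SELECT count(*) FROM pg_class;  -- previous behavior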

The general consensus is that we want BUFFERS on by default with
ANALYZE.  However, there were more than zero concerns raised with doing
so.  The primary reason against is the additional verbosity, making it
harder to read large plans.  Another concern was that buffer information
isn't always useful, so it may not make sense to have it on by default.

It's currently December, so let's commit this to see if anyone comes
forward with a strong objection against making this change.  We have over
half a year remaining in the v18 cycle where we could still easily consider
reverting this if someone were to come forward with a convincing enough
reason as to why doing this is a bad idea.

There were two patches independently submitted to achieve this goal, one
by me and the other by Guillaume.  This commit is a mix of both of these
patches with some additional work done by me to adjust various
additional places in the documentation which include EXPLAIN ANALYZE
output.

Author: Guillaume Lelarge, David Rowley
Reviewed-by: Robert Haas, Greg Sabino Mullane, Michael Christofides
Discussion: https://postgr.es/m/CANNMO++W7MM8T0KyXN3ZheXXt-uLVM3aEtZd+WNfZ=obxffUiA@mail.gmail.com
2024-12-11 22:35:11 +13:00
David Rowley
0f5738202b Use ExprStates for hashing in GROUP BY and SubPlans
This speeds up obtaining hash values for GROUP BY and hashed SubPlans by
using the ExprState support for hashing, thus allowing JIT compilation for
obtaining hash values for these operations.

This, even without JIT compilation, has been shown to improve Hash
Aggregate performance in some cases by around 15% and hashed NOT IN
queries in one case by over 30%, however, real-world cases are likely to
see smaller gains as the test cases used were purposefully designed to
have high hashing overheads by keeping the hash table small to prevent
additional memory overheads that would be a factor when working with large
hash tables.

In passing, fix a hypothetical bug in ExecBuildHash32Expr() so that the
initial value is stored directly in the ExprState's result field if
there are no expressions to hash.  None of the current users of this
function use an initial value, so the bug is only hypothetical.

Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Discussion: https://postgr.es/m/CAApHDvpYSO3kc9UryMevWqthTBrxgfd9djiAjKHMPUSQeX9vdQ@mail.gmail.com
2024-12-11 13:47:16 +13:00
Jeff Davis
a43567483c Use in-place updates for pg_restore_relation_stats().
This matches the behavior of vac_update_relstats(), which is important
to avoid bloating pg_class.

Author: Corey Huinker
Discussion: https://postgr.es/m/CADkLM=fc3je+ufv3gsHqjjSSf+t8674RXpuXW62EL55MUEQd-g@mail.gmail.com
2024-12-10 16:30:37 -08:00
Michael Paquier
8ede501685 Improve reporting of pg_upgrade log files on test failure
On failure, the pg_upgrade log files are automatically appended to the
test log file, but the information reported was inconsistent.

A header, with the log file name, was reported with note(), while the
log contents and a footer used print(), making it harder to diagnose
failures when these are split into console output and test log file
because the pg_upgrade log file path in the header may not be included
in the test log file.

The output is now consolidated so that the header uses print() rather than
note().  An extra note() is added to inform that the contents of a
pg_upgrade log file are appended to the test log file.

The diffs from the regression test suite and dump files all use print()
to show their contents on failure.

Author: Joel Jacobson
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/49f7e64a-b9be-4a90-a9fe-210a7740405e@app.fastmail.com
Backpatch-through: 15
2024-12-11 08:48:47 +09:00
David Rowley
50416cc484 Speedup Hash Joins with dedicated functions for ExprState hashing
Hashing of a single Var is a very common operation for ExprState to
perform.  Here we add dedicated ExecJust* functions which help speed up
Hash Joins by removing the interpretation overhead in ExecInterpExpr().

This change currently only affects Hash Joins on a single column.  Hash
Joins with multiple join keys or an expression still run through
ExecInterpExpr().

Some testing has shown up to 10% query performance increases on recent AMD
hardware and nearly 7% increase on an Apple M2 for a query performing a
hash join with a large number of lookups on a small hash table.

This change was extracted from a larger patch which adjusts GROUP BY /
hashed subplans / hashed set operations to use ExprState hashing.

Discussion: https://postgr.es/m/CAApHDvr8Zc0ZgzVoCZLdHGOFNhiJeQ6vrUcS9V7N23zMWQb-eA@mail.gmail.com
2024-12-11 11:32:15 +13:00
Tom Lane
9828905303 Doc: add some commentary about ExecutorRun's NoMovement special case.
Robert Haas expressed concern about whether commit 3eea7a0c9 exposed
the parallel-execution machinery to a case it isn't tested for, namely
a second non-parallel execution of a plan after a parallel execution.
Investigation shows that that can't happen because of pquery.c's
manipulation of the scan direction, but it sure wasn't obvious to
start with.  Add some commentary about that.

Discussion: https://postgr.es/m/CA+TgmoagyKQy=HFw+wLo0AKTYWHui+iKswZ8Jnqqd-cFby-WVg@mail.gmail.com
2024-12-10 17:17:28 -05:00
Noah Misch
8b9cbf4922 Fix elog(FATAL) before PostmasterMain() or just after fork().
Since commit 97550c0711972a9856b5db751539bbaf2f88884c, these failed with
"PANIC:  proc_exit() called in child process" due to uninitialized or
stale MyProcPid.  That was reachable if close() failed in
ClosePostmasterPorts() or setlocale(category, "C") failed, both
unlikely.  Back-patch to v13 (all supported versions).

Discussion: https://postgr.es/m/20241208034614.45.nmisch@google.com
2024-12-10 13:51:59 -08:00
Peter Eisentraut
939b0908c8 Tests for logical replication with temporal keys
This covers some cases that were previously failing before the
"Support for GiST in get_equal_strategy_number()" patch.

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-12-10 15:05:58 +01:00
Peter Eisentraut
74edabce7a Support for GiST in get_equal_strategy_number()
A WITHOUT OVERLAPS primary key or unique constraint is accepted as a
REPLICA IDENTITY, since it guarantees uniqueness.  But subscribers
applying logical decoding messages would fail because there was no
support for looking up the equality operator for a GiST index.  This
fixes that: For GiST indexes we can use the stratnum GiST support
function.
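
A hedged sketch of the kind of table involved (names are hypothetical;
btree_gist is assumed to be available for the scalar key part):

    CREATE EXTENSION IF NOT EXISTS btree_gist;
    CREATE TABLE room_booking (
        room_id   int,
        booked_at daterange,
        PRIMARY KEY (room_id, booked_at WITHOUT OVERLAPS)
    );
    -- such a primary key is backed by a GiST index, so replica identity
    -- lookups need the equality strategy via the stratnum support function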

Reviewed-by: Paul Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-12-10 13:26:09 +01:00
Peter Eisentraut
13544e790e Make the conditions in IsIndexUsableForReplicaIdentityFull() more explicit
IsIndexUsableForReplicaIdentityFull() described a number of conditions
that a suitable index has to fulfill.  But not all of these were
actually checked in the code.  Instead, it appeared to rely on
get_equal_strategy_number() to filter out any indexes that are not
btree or hash.  As we look to generalize index AM capabilities, this
would possibly break if we added additional support in
get_equal_strategy_number().  Instead, write out code to check for the
required capabilities explicitly.  This shouldn't change any behaviors
at the moment.

Reviewed-by: Paul Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-12-10 13:11:34 +01:00
Peter Eisentraut
a2a475b011 Replace get_equal_strategy_number_for_am() by get_equal_strategy_number()
get_equal_strategy_number_for_am() gets the equal strategy number for
an AM.  This currently only supports btree and hash.  In the more
general case, this also depends on the operator class (see for example
GistTranslateStratnum()).  To support that, replace this function with
get_equal_strategy_number() that takes an opclass and derives it from
there.  (This function already existed before as a static function, so
the signature is kept for simplicity.)

This patch is only a refactoring, it doesn't add support for other
index AMs such as gist.  This will be done separately.

Reviewed-by: Paul Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-12-10 12:53:27 +01:00
Peter Eisentraut
321c287351 Improve internal logical replication error for missing equality strategy
This "shouldn't happen", except right now it can with a temporal gist
index (to be fixed soon), because of missing gist support in
get_equal_strategy_number().  But right now, the error is not caught
right away, but instead you get the subsequent error about a "missing
operator 0".  This makes the error more accurate.

Author: Paul Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-12-10 12:30:42 +01:00
Michael Paquier
d37e856410 Fix comments of GUC hooks for timezone_abbreviations
The GUC assign and check hooks used "assign_timezone_abbreviations",
which was incorrect.

Issue noticed while browsing this area of the code, introduced in
0a20ff54f5e6.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/Z1eV6Y8yk77GZhZI@paquier.xyz
Backpatch-through: 16
2024-12-10 13:02:21 +09:00
Michael Paquier
7b2690a571 Fix outdated comment of scram_build_secret()
This routine documented that "iterations" would use a default value if
set to 0 by the caller.  However, the iteration count should always be
set by the caller to a value strictly greater than 0, as enforced by an
assertion.

Oversight in b577743000cd, that has made the iteration count of SCRAM
configurable.

Author: Matheus Alcantara
Discussion: https://postgr.es/m/ac858943-4743-44cd-b4ad-08a0c10cbbc8@gmail.com
Backpatch-through: 16
2024-12-10 12:54:09 +09:00
Masahiko Sawada
724890ffb7 Include necessary header files in radixtree.h.
When #include'ing radixtree.h with RT_SHMEM defined, compiler errors
could be raised due to missing declarations of some types and
functions.

This commit also removes the inclusion of postgres.h since it's
against our usual convention.

Backpatch to v17, where radixtree.h was introduced.

Reviewed-by: Heikki Linnakangas, Álvaro Herrera
Discussion: https://postgr.es/m/CAD21AoCU9YH%2Bb9Rr8YRw7UjmB%3D1zh8GKQkWNiuN9mVhMvkyrRg%40mail.gmail.com
Backpatch-through: 17
2024-12-09 13:07:06 -08:00
David Rowley
36d0229b8f Doc: fix incorrect EXPLAIN ANALYZE output for bloom indexes
It looks like the example case was once modified to increase the number
of rows but the EXPLAIN ANALYZE output wasn't updated to reflect that.

Also adjust the text which discusses the index sizes.  With the example
table size, the bloom index isn't quite 8 times more space efficient
than the btree indexes.

Discussion: https://postgr.es/m/CAApHDvovx8kQ0=HTt85gFDAwmTJHpCgiSvRmQZ_6u_g-vQYM_w@mail.gmail.com
Backpatch-through: 13, all supported versions
2024-12-10 09:24:43 +13:00
Daniel Gustafsson
73a392d236 Fix small memory leaks in GUC checks
Follow-up commit to a9d58bfe8a3a.  Backpatch down to v16 where
this was added in order to keep the code consistent for future
backpatches.

Author: Tofig Aliev <t.aliev@postgrespro.ru>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/bba4313fdde9db46db279f96f3b748b1@postgrespro.ru
Backpatch-through: 16
2024-12-09 20:58:23 +01:00
Nathan Bossart
0a27c3d0f7 Fix various overflow hazards in date and timestamp functions.
This commit makes use of the overflow-aware routines in int.h to
fix a variety of reported overflow bugs in the date and timestamp
code.  It seems unlikely that this fixes all such bugs in this
area, but since the problems seem limited to cases that are far
beyond any realistic usage, I'm not going to worry too much.  Note
that for one bug, I've chosen to simply add a comment about the
overflow hazard because fixing it would require quite a bit of code
restructuring that doesn't seem worth the risk.

Since this is a bug fix, it could be back-patched, but given the
risk of conflicts with the new routines in int.h and the overall
risk/reward ratio of this patch, I've opted not to do so for now.

Fixes bug #18585 (except for the one case that's just commented).

Reported-by: Alexander Lakhin
Author: Matthew Kim, Nathan Bossart
Reviewed-by: Joseph Koshakow, Jian He
Discussion: https://postgr.es/m/31ad2cd1-db94-bdb3-f91a-65ffdb4bef95%40gmail.com
Discussion: https://postgr.es/m/18585-db646741dd649abd%40postgresql.org
2024-12-09 13:47:23 -06:00
Tom Lane
3eea7a0c97 Simplify executor's determination of whether to use parallelism.
Our parallel-mode code only works when we are executing a query
in full, so ExecutePlan must disable parallel mode when it is
asked to do partial execution.  The previous logic for this
involved passing down a flag (variously named execute_once or
run_once) from callers of ExecutorRun or PortalRun.  This is
overcomplicated, and unsurprisingly some of the callers didn't
get it right, since it requires keeping state that not all of
them have handy; not to mention that the requirements for it were
undocumented.  That led to assertion failures in some corner
cases.  The only state we really need for this is the existing
QueryDesc.already_executed flag, so let's just put all the
responsibility in ExecutePlan.  (It could have been done in
ExecutorRun too, leading to a slightly shorter patch -- but if
there's ever more than one caller of ExecutePlan, it seems better
to have this logic in the subroutine than the callers.)

This makes those ExecutorRun/PortalRun parameters unnecessary.
In master it seems okay to just remove them, returning the
API for those functions to what it was before parallelism.
Such an API break is clearly not okay in stable branches,
but for them we can just leave the parameters in place after
documenting that they do nothing.

Per report from Yugo Nagata, who also reviewed and tested
this patch.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/20241206062549.710dc01cf91224809dd6c0e1@sraoss.co.jp
2024-12-09 14:38:19 -05:00
Heikki Linnakangas
4d8275046c Remove remnants of "snapshot too old"
Remove the 'whenTaken' and 'lsn' fields from SnapshotData. After the
removal of the "snapshot too old" feature, they were never set to a
non-zero value.

This largely reverts commit 3e2f3c2e423, which added the
OldestActiveSnapshot tracking, and the init_toast_snapshot()
function. That was only required for setting the 'whenTaken' and 'lsn'
fields. SnapshotToast is now a constant again, like SnapshotSelf and
SnapshotAny. I kept a thin get_toast_snapshot() wrapper around
SnapshotToast though, to check that you have a registered or active
snapshot. That's still a useful sanity check.

Reviewed-by: Nathan Bossart, Andres Freund, Tom Lane
Discussion: https://www.postgresql.org/message-id/cd4b4f8c-e63a-41c0-95f6-6e6cd9b83f6d@iki.fi
2024-12-09 18:13:03 +02:00
Richard Guo
f64ec81a81 Avoid unnecessary wrapping for Vars and PHVs
When pulling up a lateral subquery that is under an outer join, the
current code always wraps a Var or PHV in the subquery's targetlist
into a new PlaceHolderVar if it is a lateral reference to something
outside the subquery.  This is necessary when the Var/PHV references
the non-nullable side of the outer join from the nullable side: we
need to ensure that it is evaluated at the right place and hence is
forced to null when the outer join should do so.  However, if the
referenced rel is under the same lowest nulling outer join, we can
actually omit the wrapping.  That's safe because if the subquery
variable is forced to NULL by the outer join, the lateral reference
variable will come out as NULL too.  It could be beneficial to get rid
of such PHVs because they imply lateral dependencies, which force us
to resort to nestloop joins.

This patch leverages the newly introduced nullingrel_info to check if
the nullingrels of the subquery RTE are a subset of those of the
laterally referenced rel, in order to determine if the referenced rel
is under the same lowest nulling outer join.

No backpatch as this could result in plan changes.

Author: Richard Guo
Reviewed-by: James Coleman, Dmitry Dolgov, Andrei Lepikhov
Discussion: https://postgr.es/m/CAMbWs48uk6C7Z9m_FNT8_21CMCk68hrgAsz=z6zpP1PNZMkeoQ@mail.gmail.com
2024-12-09 20:38:22 +09:00
Richard Guo
5668a857de Fix right-semi-joins in HashJoin rescans
When resetting a HashJoin node for rescans, if it is a single-batch
join and there are no parameter changes for the inner subnode, we can
just reuse the existing hash table without rebuilding it.  However,
for join types that depend on the inner-tuple match flags in the hash
table, we need to reset these match flags to avoid incorrect results.
This applies to right, right-anti, right-semi, and full joins.

When I introduced "Right Semi Join" plan shapes in aa86129e1, I failed
to reset the match flags in the hash table for right-semi joins in
rescans.  This oversight has been shown to produce incorrect results.
This patch fixes it.

Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4-nQF9io2WL2SkD0eXvfPdyBc9Q=hRwfQHCGV2usa0jyA@mail.gmail.com
2024-12-09 20:36:23 +09:00
Michael Paquier
f0c569d715 Fix memory leak in pgoutput with publication list cache
The pgoutput module caches publication names in a list and frees it upon
invalidation.  However, the code forgot to free the actual publication
names within the list elements, as publication names are pstrdup()'d in
GetPublication().  This would cause memory to leak in
CacheMemoryContext, bloating it over time as this context is not
cleaned.

This is a problem for WAL senders running for a long time, as an
accumulation of invalidation requests would bloat its cache memory
usage.  A second case, where this leak is easier to see, involves a
backend calling SQL functions like pg_logical_slot_{get,peek}_changes()
which create a new decoding context with each execution.  More
publications create more bloat.
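
A hedged sketch of the second case (slot and publication names are
hypothetical; the publication must already exist for decoding to work):

    SELECT pg_create_logical_replication_slot('leak_demo_slot', 'pgoutput');
    -- each execution creates a new decoding context and loads the publication
    -- names; before this fix the name strings accumulated in CacheMemoryContext
    SELECT count(*)
    FROM pg_logical_slot_peek_binary_changes('leak_demo_slot', NULL, NULL,
             'proto_version', '1', 'publication_names', 'pub1');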

To address this, this commit adds a new memory context within the
logical decoding context and resets it each time the publication names
cache is invalidated, based on a suggestion from Amit Kapila.  This
ensures that the lifespan of the publication names aligns with that of
the logical decoding context.

This solution changes PGOutputData, which is fine for HEAD but it could
cause an ABI breakage in stable branches as the structure size would
change, so these are left out for now.

Analyzed-by: Michael Paquier, Jeff Davis
Author: Zhijie Hou
Reviewed-by: Michael Paquier, Masahiko Sawada, Euler Taveira
Discussion: https://postgr.es/m/Z0khf9EVMVLOc_YY@paquier.xyz
2024-12-09 16:41:46 +09:00
Michael Paquier
001a537b83 Improve comment about dropped entries in pgstat.c
pgstat_write_statsfile() discards any entries marked as dropped from
being written to the stats file at shutdown, and also included an
assertion based on the same condition.

The intention of the assertion is to check that no pgstats entries are
left around, as terminating backends should drop any entries they still
hold references on before the stats file is written by the
checkpointer; it is not worth taking down the server in this case if
there is a bug making that possible.

Let's improve the comment of this area to document clearly what's
intended.

Based on a discussion with Bertrand Drouvot and Anton A. Melnikov.

Author: Bertrand Drouvot
Discussion: https://postgr.es/m/a13e8cdf-b97a-4ecb-8f42-aaa367974e29@postgrespro.ru
Backpatch-through: 15
2024-12-09 14:35:39 +09:00
Amit Kapila
2d0152d614 Improve the error message introduced in commit 87ce27de696.
The error detail message "Replica identity consists of an unpublished
generated column." implies that the entire replica identity is made up of
an unpublished generated column, which may not be the case.

Reported-by: Peter Smith
Author: Shlok Kyal
Reviewed-by: Peter Smith, Amit Kapila
Discussion: https://postgr.es/m/CAHut+PuwMhKx0PhOA4APhJTLoBGNykbeCQpr_CuwGT-SkswG5w@mail.gmail.com
2024-12-09 09:11:45 +05:30
Michael Paquier
da99fedf8c Fix invalidation of local pgstats references for entry reinitialization
818119afccd3 introduced the "generation" concept in pgstats entries,
incrementing a counter when a pgstats entry is reinitialized, but it did
not account for the fact that backends still holding local references to
such entries need to refresh them when the cached generation is
outdated.  The previous logic only updated local references when an
entry was dropped, but it also needs to consider entries that are
reinitialized.

This matters for replication slot stats (as well as custom pgstats kinds
in 18~), where concurrent drops and creates of a slot could cause
incorrect stats to be locally referenced.  This would lead to an
assertion failure at shutdown when writing out the stats file, as the
backend holding an outdated local reference would not be able to drop
during its shutdown sequence the stats entry that should be dropped, as
the last process holding a reference to the stats entry.  The
checkpointer was then complaining about such an entry late in the
shutdown sequence, after the shutdown checkpoint is finished with the
control file updated, causing the stats file to not be generated.  In
non-assert builds, the entry would just be skipped with the stats file
written.

Note that only logical replication slots use statistics.

A test case based on TAP is added to test_decoding, where a persistent
connection peeking at a slot's data is kept with concurrent drops and
creates of the same slot.  This is based on the isolation test case that
Anton has sent.  As it requires a node shutdown with a check to make
sure that the stats file is written with this specific sequence of
events, TAP is used instead.

Reported-by: Anton A. Melnikov
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/56bf8ff9-dd8c-47b2-872a-748ede82af99@postgrespro.ru
Backpatch-through: 15
2024-12-09 10:45:28 +09:00
David Rowley
1fe5a347e3 Fix possible crash during WindowAgg evaluation
When short-circuiting evaluation of the top-level WindowAgg node using
quals on monotonic window functions (the WindowAgg run condition can
mean there's no need to evaluate subsequent window function results in
the same partition once the run condition becomes false), it was
possible in some cases for the executor to use stale results from the
previous invocation of the window function.

A fix for this was partially done by a5832722, but that commit only
fixed the issue for non-top-level WindowAgg nodes.  I mistakenly thought
that the top-level WindowAgg didn't have this issue, but Jayesh's example
case clearly shows that's incorrect.  At the time, I also thought that
this only affected 32-bit systems as all window functions which then
supported run conditions returned BIGINT, however, that's wrong as
ExecProject is still called and that could cause evaluation of any other
window function belonging to the same WindowAgg node, one of which may
return a byref type.

The only queries affected by this are WindowAggs with a "Run Condition"
which contains at least one window function with a byref result type,
such as lead() or lag() on a byref column.  The window clause must also
contain a PARTITION BY clause (without a PARTITION BY, execution of the
WindowAgg stops immediately when the run condition becomes false and
there's no risk of using the stale results).
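
For illustration, a hedged sketch (hypothetical table and columns) of an
affected query shape:

    SELECT *
    FROM (SELECT grp,
                 val,
                 lead(txt) OVER w AS next_txt,   -- byref result type
                 row_number() OVER w AS rn       -- monotonic; feeds the Run Condition
          FROM t
          WINDOW w AS (PARTITION BY grp ORDER BY val)) s
    WHERE rn <= 2;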

Reported-by: Jayesh Dehankar
Discussion: https://postgr.es/m/193261e2c4d.3dd3cd7c1842.871636075166132237@zohocorp.com
Backpatch-through: 15, where WindowAgg run conditions were added
2024-12-09 14:23:21 +13:00
Tom Lane
3f9b962176 Ensure that pg_amop/amproc entries depend on their lefttype/righttype.
Usually an entry in pg_amop or pg_amproc does not need a dependency on
its amoplefttype/amoprighttype/amproclefttype/amprocrighttype types,
because there is an indirect dependency via the argument types of its
referenced operator or procedure, or via the opclass it belongs to.
However, for some support procedures in some index AMs, the argument
types of the support procedure might not mention the column data type
at all.  Also, the amop/amproc entry might be treated as "loose" in
the opfamily, in which case it lacks a dependency on any particular
opclass; or it might be a cross-type entry having a reference to a
datatype that is not its opclass' opcintype.

The upshot of all this is that there are cases where a datatype can
be dropped while leaving behind amop/amproc entries that mention it,
because there is no path in pg_depend showing that those entries
depend on that type.  Such entries are harmless in normal activity,
because they won't get used, but they cause problems for maintenance
actions such as dropping the operator family.  They also cause pg_dump
to produce bogus output.  The previous commit put a band-aid on the
DROP OPERATOR FAMILY failure, but a real fix is needed.

To fix, add pg_depend entries showing that a pg_amop/pg_amproc entry
depends on its lefttype/righttype.  To avoid bloating pg_depend too
much, skip this if the referenced operator or function has that type
as an input type.  (I did not bother with considering the possible
indirect dependency via the opclass' opcintype; at least in the
reported case, that wouldn't help anyway.)

Probably, the reason this has escaped notice for so long is that
add-on datatypes and relevant opclasses/opfamilies are usually
packaged as extensions nowadays, so that there's no way to drop
a type without dropping the referencing opclasses/opfamilies too.
Still, in the absence of pg_depend entries there's nothing that
constrains DROP EXTENSION to drop the opfamily entries before the
datatype, so it seems possible for a DROP failure to occur anyway.

The specific case that was reported doesn't fail in v13, because
v13 prefers to attach the support procedure to the opclass not the
opfamily.  But it's surely possible to construct other edge cases
that do fail in v13, so patch that too.

Per report from Yoran Heling.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/Z1MVCOh1hprjK5Sf@gmai021
2024-12-07 15:56:28 -05:00
Tom Lane
c82003760d Make getObjectDescription robust against dangling amproc type links.
Yoran Heling reported a case where a data type could be dropped
while references to its OID remain behind in pg_amproc.  This
causes getObjectDescription to fail, which blocks dropping the
operator family (since our DROP code likes to construct descriptions
of everything it's dropping).  The proper fix for this requires
adding more pg_depend entries.  But to allow DROP to go through with
already-corrupt catalogs, tweak getObjectDescription to print "???"
for the type instead of failing when it processes such an entry.

I changed the logic for pg_amop similarly, for consistency,
although it is not known that the problem can manifest in pg_amop.

Per report from Yoran Heling.  Back-patch to all supported
branches (although the problem may be unreachable in v13).

Discussion: https://postgr.es/m/Z1MVCOh1hprjK5Sf@gmai021
2024-12-07 14:28:16 -05:00
Tom Lane
3220ceaf77 Fix is_digit labeling of to_timestamp's FFn format codes.
These format codes produce or consume strings of digits, so they
should be labeled with is_digit = true, but they were not.
This has effect in only one place, where is_next_separator()
is checked to see if the preceding format code should slurp up
all the available digits.  Thus, with a format such as '...SSFF3'
with remaining input '12345', the 'SS' code would consume all
five digits (and then complain about seconds being out of range)
when it should eat only two digits.
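
A minimal illustration of the fixed case:

    -- with the fix, 'SS' consumes two digits (12) and 'FF3' the remaining
    -- three (345); previously 'SS' slurped all five digits and then failed
    -- with seconds out of range
    SELECT to_timestamp('12345', 'SSFF3');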

Per report from Nick Davies.  This bug goes back to d589f9446
where the FFn codes were introduced, so back-patch to v13.

Discussion: https://postgr.es/m/AM8PR08MB6356AC979252CFEA78B56678B6312@AM8PR08MB6356.eurprd08.prod.outlook.com
2024-12-07 13:12:32 -05:00
Peter Eisentraut
263a3f5f7f doc: remove LC_COLLATE and LC_CTYPE from SHOW command
The corresponding read-only server settings have been removed since
in PG16. See commit b0f6c437160db6.

Author: Pierre Giraud <pierre.giraud@dalibo.com>
Discussion: https://www.postgresql.org/message-id/flat/a75a2fb0-f4b3-4c0c-be3d-7a62d266d760%40dalibo.com
2024-12-07 12:55:55 +01:00
Jeff Davis
ffe003cae1 Comment fix: "buffer context lock" to "buffer content lock".
The term "buffer context lock" is outdated as of commit 5d5087363d.
2024-12-06 09:59:12 -08:00
Peter Eisentraut
8743ea1b2e Remove useless casts to (const void *)
Similar to commit 7f798aca1d5, but I didn't think to look for "const"
as well.
2024-12-06 18:49:01 +01:00
Thomas Munro
1319997df9 Fix printf format string warning on MinGW.
Commit 517bf2d91 changed a printf format string to placate MinGW, which
at the time warned about "%lld".  Current MinGW is now warning about the
replacement "%I64d".  Reverting the change clears the warning on the
MinGW CI task, and hopefully it will clear it on build farm animal
fairywren too.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/TYAPR01MB5866A71B744BE01B3BF71791F5AEA%40TYAPR01MB5866.jpnprd01.prod.outlook.com
2024-12-06 12:44:30 +13:00
Peter Eisentraut
792b2c7e6d Remove pg_regex_collation
We can also use the existing pg_regex_locale as the cache key, which
is the only use of this variable.

Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://www.postgresql.org/message-id/flat/b1b92ae1-2e06-4619-a87a-4b4858e547ec%40eisentraut.org
2024-12-05 07:19:37 +01:00
Thomas Munro
71cb352904 Fix header inclusion order in c.h.
Commit 962da900a added #include <stdint.h> to postgres_ext.h, which
broke c.h's header ordering rule.

The system headers on some systems would then lock down off_t's size in
private macros, before they'd had a chance to see our definition of
_FILE_OFFSET_BITS (and presumably other things).  This was picked up by
perl's ABI compatibility checks on some 32 bit systems in the build
farm.

Move #include "postgres_ext.h" down below the system header section, and
make the comments clearer (thanks to Tom for the new wording).

Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2397643.1733347237%40sss.pgh.pa.us
2024-12-05 14:31:39 +13:00
Nathan Bossart
76fd342496 Provide a better error message for misplaced dispatch options.
Before this patch, misplacing a special must-be-first option for
dispatching to a subprogram (e.g., postgres -D . --single) would
fail with an error like

	FATAL:  --single requires a value

This patch adjusts this error to more accurately complain that the
special option wasn't listed first.  The aforementioned error
message now looks like

	FATAL:  --single must be first argument

The dispatch option parsing code has been refactored for use
wherever ParseLongOption() is called.  Beyond the obvious advantage
of avoiding code duplication, this should prevent similar problems
when new dispatch options are added.  Note that we assume that none
of the dispatch option names match another valid command-line
argument, such as the name of a configuration parameter.

Ideally, we'd remove this must-be-first requirement for these
options, but after some investigation, we decided that wasn't worth
the added complexity and behavior changes.

Author: Nathan Bossart, Greg Sabino Mullane
Reviewed-by: Greg Sabino Mullane, Peter Eisentraut, Álvaro Herrera, Tom Lane
Discussion: https://postgr.es/m/CAKAnmmJkZtZAiSryho%3DgYpbvC7H-HNjEDAh16F3SoC9LPu8rqQ%40mail.gmail.com
2024-12-04 15:04:15 -06:00
Bruce Momjian
24c1c63387 Return actual error code from FOP failure in PDF build
Previously we returned "1" on error.  Improvement on 77c189cdafe.

Backpatch-through: master
2024-12-04 14:37:24 -05:00
Peter Eisentraut
dfbb092cff Fix dead code
from commit 85b7efa1cdd

per Coverity report
2024-12-04 16:44:40 +01:00
John Naylor
ccc8194e42 Fix use-after-free in parallel_vacuum_reset_dead_items
parallel_vacuum_reset_dead_items used a local variable to hold a
pointer from the passed vacrel, purely as a shorthand. This pointer
was later freed and a new allocation was made and stored to the
struct. Then the local pointer was mistakenly referenced again.

This apparently happened not to break anything since the freed chunk
would have been put on the context's freelist, so it was accidentally
the same pointer anyway, in which case the DSA handle was correctly
updated. The minimal fix is to change two places so they access
dead_items through the vacrel. This coding style is a maintenance
hazard, so while at it get rid of most other similar usages, which
were inconsistently used anyway.

Analysis and patch by Vallimaharajan G, with further defensive coding
by me

Backpatch to v17, where TidStore came in

Discussion: https://postgr.es/m/1936493cc38.68cb2ef27266.7456585136086197135@zohocorp.com
2024-12-04 16:58:25 +07:00
Peter Eisentraut
7727049e8f Simplify IsIndexUsableForReplicaIdentityFull()
Take Relation as argument instead of IndexInfo.  Building the
IndexInfo is an unnecessary intermediate step here.

A future patch wants to get some information that is in the relcache
but not in IndexInfo, so this will also help there.

Discussion: https://www.postgresql.org/message-id/333d3886-b737-45c3-93f4-594c96bb405d@eisentraut.org
2024-12-04 08:33:28 +01:00
Amit Kapila
87ce27de69 Ensure stored generated columns must be published when required.
Ensure stored generated columns that are part of REPLICA IDENTITY must be
published explicitly for UPDATE and DELETE operations to be published. We
can publish generated columns by listing them in the column list or by
enabling the publish_generated_columns option.
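
A hedged sketch (hypothetical names; option value syntax follows the
commit message and may differ) of the two ways to publish a stored
generated column:

    CREATE TABLE t (id int PRIMARY KEY,
                    gen int GENERATED ALWAYS AS (id * 2) STORED);
    -- either list the generated column explicitly in the column list ...
    CREATE PUBLICATION pub_cols FOR TABLE t (id, gen);
    -- ... or enable publishing of generated columns for the publication
    CREATE PUBLICATION pub_opt FOR TABLE t WITH (publish_generated_columns = true);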

This commit changes the behavior of the test added in commit adedf54e65 by
giving an ERROR for the UPDATE operation in such cases. There is no way to
trigger the bug reported in commit adedf54e65 but we didn't remove the
corresponding code change because it is still relevant when replicating
changes from a publisher with version less than 18.

We decided not to backpatch this behavior change to avoid the risk of
breaking existing output plugins that may be sending generated columns by
default although we are not aware of any such plugin. Also, we didn't see
any reports related to this on STABLE branches which is another reason not
to backpatch this change.

Author: Shlok Kyal, Hou Zhijie
Reviewed-by: Vignesh C, Amit Kapila
Discussion: https://postgr.es/m/CANhcyEVw4V2Awe2AB6i0E5AJLNdASShGfdBLbUd1XtWDboymCA@mail.gmail.com
2024-12-04 09:45:18 +05:30
Bruce Momjian
77c189cdaf Properly use $(AWK) in Makefile, not 'awk'
Fix for commit 498f1307569.

Backpatch-through: master
2024-12-03 22:31:33 -05:00
Thomas Munro
962da900ac Use <stdint.h> and <inttypes.h> for c.h integers.
Redefine our exact width types with standard C99 types and macros,
including int64_t, INT64_MAX, INT64_C(), PRId64 etc.  We were already
using <stdint.h> types in a few places.

One complication is that Windows' <inttypes.h> uses format strings like
"%I64d", "%I32", "%I" for PRI*64, PRI*32, PTR*PTR, instead of mapping to
other standardized format strings like "%lld" etc as seen on other known
systems.  Teach our snprintf.c to understand them.

This removes a lot of configure clutter, and should also allow 64-bit
numbers and other standard types to be used in localized messages
without casting.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/ME3P282MB3166F9D1F71F787929C0C7E7B6312%40ME3P282MB3166.AUSP282.PROD.OUTLOOK.COM
2024-12-04 15:05:38 +13:00
Tom Lane
3b08d5224d Define __EXTENSIONS__ on Solaris, too.
Apparently, if you define _POSIX_C_SOURCE on Solaris,
that's interpreted as "you get ONLY what's defined by POSIX".
Results from BF member hake show that that breaks perl.h,
and doubtless it'd cause more problems if we got past that.
Adopt the suggestion from standards(7) that we also need to
define __EXTENSIONS__, in hopes of un-breaking things.

Discussion: https://postgr.es/m/1654508.1733162761@sss.pgh.pa.us
2024-12-03 20:21:23 -05:00
Bruce Momjian
498f130756 Fix Makefile so invalid characters warning preserves error code
Fix for commit e4c8865196f.

Reported-by: Peter Eisentraut

Discussion: https://postgr.es/m/88cb6ecf-22bb-431e-974b-1cd236a80364@eisentraut.org

Backpatch-through: master
2024-12-03 18:27:41 -05:00
Bruce Momjian
8b318a168a Now that we have non-Latin1 SGML detection, restore Latin1 chars
This reverts the change in commit 641a5b7a144 that converted them to
HTML entities.

Reported-by: Peter Eisentraut

Discussion: https://postgr.es/m/Z05ssoVheWI-rqax@momjian.us

Backpatch-through: master
2024-12-03 17:09:49 -05:00
Jeff Davis
7167e05fc7 Move check for ucol_strcollUTF8 to pg_locale_icu.c
The result of the check is only used by pg_locale_icu.c.

Author: Andreas Karlsson
Discussion: https://postgr.es/m/4548a168-62cd-457b-8d06-9ba7b985c477@proxel.se
2024-12-03 11:36:21 -08:00
Tom Lane
32a7deb2a0 Define _POSIX_C_SOURCE as 200112L on Solaris.
This is an attempt to suppress some compiler warnings that appeared in
the wake of commit 7f798aca1: it seems that by default Solaris/illumos
declares shmdt() to take "char *" not "void *".  We'd like the system
headers to provide modern POSIX APIs, and POSIX 2001 seems to be as
modern as is available there.

illumos' standards(7) man page suggests that we might also need to
define __EXTENSIONS__, but let's see what happens with just this.

Discussion: https://postgr.es/m/1654508.1733162761@sss.pgh.pa.us
2024-12-03 12:44:43 -05:00
Álvaro Herrera
3c5f9f12c8
Fix synchronized_standby_slots GUC check hook
The validate_sync_standby_slots subroutine requires an LWLock, so it
cannot run in processes without PGPROC; skip it there to avoid a crash.

This replaces the current test for ReplicationSlotCtl being not null,
which appears to be a solution for the same problem but less general.
I also rewrote a related comment that mentioned ReplicationSlotCtl in
StandbySlotsHaveCaughtup.

This code came in with commit bf279ddd1c28; backpatch to 17.

Reported-by: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Discussion: https://postgr.es/m/202411281216.sutbxtr6idnn@alvherre.pgsql
2024-12-03 17:50:57 +01:00
Álvaro Herrera
1e5ef3a2a1
Drop "Lock" suffix from LWLock wait event names
Commit da952b415f44 unintentionally reverted the SQL-visible part of
commit 14a910109126, which breaks queries joining pg_wait_events with
pg_stat_activity.  Remove the suffix again.

Backpatch to 17.

Reported-by: Christophe Courtois <christophe.courtois@dalibo.com>
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/18728-450924477056a339%40postgresql.org
Discussion: https://postgr.es/m/Z01w1+LihtRiS0Te@ip-10-97-1-34.eu-west-3.compute.internal
2024-12-03 15:50:03 +01:00
Álvaro Herrera
4cc2a44980
Update obsolete comment
Commit 3aa0395d4ed3 made worrying about BKI_ROWTYPE_OID matching no
longer necessary, but this comment didn't get the memo.
2024-12-03 14:46:31 +01:00
Peter Eisentraut
84a67725cd Fix handling of CREATE DOMAIN with GENERATED constraint syntax
Stuff like

    CREATE DOMAIN foo AS int CONSTRAINT cc GENERATED ALWAYS AS (2) STORED

is not supported for domains, but the parser allows it, because it's
the same syntax as for table constraints.  But CreateDomain() did not
explicitly handle all ConstrType values, so the above would get an
internal error like

    ERROR:  unrecognized constraint subtype: 4

Fix that by providing a user-facing error message for all ConstrType
values.  Also, remove the switch default case, so future additions to
ConstrType are caught.

Reported-by: Jian He <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/CACJufxF8fmM=Dbm4pDFuV_nKGz2-No0k4YifhrF3-rjXTWJM3w@mail.gmail.com
2024-12-03 14:32:45 +01:00
Peter Eisentraut
1acf10549e Fix temporary memory leak in system table index scans
Commit 811af9786b introduced palloc() calls into systable_beginscan()
and systable_beginscan_ordered().  But there was no pfree(), as is the
usual style.

It turns out that an ANALYZE of a partitioned table can invoke many
thousand system table index scans, and this memory is not cleaned up
until the end of the command, so this can temporarily leak quite a bit
of memory.  Maybe there are improvements to be made at a higher level
about this, but for now, insert a couple of corresponding pfree()
calls to fix this particular issue.

Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://www.postgresql.org/message-id/Z0XTfIq5xUtbkiIh@pryzbyj2023
2024-12-03 09:04:20 +01:00
Jeff Davis
1ba0782ce9 Perform provider-specific initialization in new functions.
Reviewed-by: Andreas Karlsson
Discussion: https://postgr.es/m/4548a168-62cd-457b-8d06-9ba7b985c477@proxel.se
2024-12-02 23:24:35 -08:00
Michael Paquier
8817e8d3a4 doc: Clarify some terms for pg_createsubscriber
The last section of pg_createsubscriber used the terms
"publication-name", "replication-slot-name", and "subscription-name".
These terms are not defined on the page, which was confusing, and the
intention is clearly to refer to the values one would give to the
options --publication, --subscription and --replication-slot.  Let's
simplify the documentation by mentioning the option switches, instead of
these terms.

Reported-by: Christophe Courtois
Author: Shubham Khanna
Reviewed-by: Vignesh C, Peter Smith
Discussion: https://postgr.es/m/173288198026.714.15127074046508836738@wrigleys.postgresql.org
Backpatch-through: 17
2024-12-03 16:21:07 +09:00
Jeff Davis
e3fa2b037c Fix unintentional behavior change in commit e9931bfb75.
Prior to that commit, there was a special case to use ASCII case mapping
behavior for the libc provider with a single-byte encoding when that's
the default collation. Commit e9931bfb75 mistakenly eliminated that
special case; this commit restores it.

Discussion: https://postgr.es/m/01a104f0d2179d756261e90d96fd65c36ad6fcf0.camel@j-davis.com
2024-12-02 21:59:02 -08:00
David Rowley
4171c44c9b Revert "Introduce CompactAttribute array in TupleDesc"
This reverts commit d28dff3f6cd6a7562fb2c211ac0fb74a33ffd032.

Quite a large number of buildfarm members didn't like this commit and
it's not yet clear why.  Reverting this before too many animals turn
red.

Discussion: https://postgr.es/m/CAApHDvr9i6T5=iAwQCxFDgMsthr_obVxgwBaEJkC8KUH6yM3Hw@mail.gmail.com
2024-12-03 17:12:38 +13:00
David Rowley
d28dff3f6c Introduce CompactAttribute array in TupleDesc
The new compact_attrs array stores a few select fields from
FormData_pg_attribute in a more compact way, using only 16 bytes per
column instead of the 104 bytes that FormData_pg_attribute uses.  Using
CompactAttribute allows performance-critical operations such as tuple
deformation to be performed without looking at the FormData_pg_attribute
element in TupleDesc which means fewer cacheline accesses.  With this
change, NAMEDATALEN could be increased with a much smaller negative impact
on performance.

For some workloads, tuple deformation can be the most CPU intensive part
of processing the query.  Some testing with 16 columns on a table
where the first column is variable length showed around a 10% increase in
transactions per second for an OLAP type query performing aggregation on
the 16th column.  However, in certain cases, the increases were much
higher, up to ~25% on one AMD Zen4 machine.

This also makes pg_attribute.attcacheoff redundant.  A follow-on commit
will remove it, thus shrinking the FormData_pg_attribute struct by 4
bytes.

Author: David Rowley
Discussion: https://postgr.es/m/CAApHDvrBztXP3yx=NKNmo3xwFAFhEdyPnvrDg3=M0RhDs+4vYw@mail.gmail.com
Reviewed-by: Andres Freund, Victor Yegorov
2024-12-03 16:50:59 +13:00
Bruce Momjian
e4c8865196 doc Makefile: issue warning about chars that cannot be output
A follow-up improvement to commit 641a5b7a144.

Reported-by: Tatsuo Ishii, Tom Lane

Discussion: https://postgr.es/m/20241126.182513.1752581942460106099.ishii@postgresql.org

Backpatch-through: master
2024-12-02 21:25:12 -05:00
Michael Paquier
08691ea958 Rework some code handling pg_subscription data in psql and pg_dump
This commit fixes some inconsistencies found in the frontend code when
dealing with subscription catalog data.

The following changes are done:
- pg_subscription.h gains an EXPOSE_TO_CLIENT_CODE section, so that more
content defined in pg_subscription.h becomes available in
pg_subscription_d.h for the frontend.
- In psql's describe.c, substream can be switched to use CppAsString2()
with its three LOGICALREP_STREAM_* values, with pg_subscription_d.h
included.
- pg_dump.c included pg_subscription.h, which is a header that should
only be used in the backend code.  The code is updated to use
pg_subscription_d.h instead.
- pg_dump stored all the data from pg_subscription in SubscriptionInfo
with only strings, and a good chunk of them are boolean and char values.
Using strings is not necessary, complicates the code (see for example
two_phase_disabled[] removed here), and is inconsistent with the way
other catalogs' data is handled.  The fields of SubscriptionInfo are
reordered to match the order in its catalog, while at it.

Reviewed-by: Hayato Kuroda
Discussion: https://postgr.es/m/Z0lB2kp0ksHgmVuk@paquier.xyz
2024-12-03 09:48:12 +09:00
Thomas Munro
75818b3afb RelationTruncate() must set DELAY_CHKPT_START.
Previously, it set only DELAY_CHKPT_COMPLETE. That was important,
because it meant that if the XLOG_SMGR_TRUNCATE record preceded an
XLOG_CHECKPOINT_ONLINE record in the WAL, then the truncation would also
happen on disk before the XLOG_CHECKPOINT_ONLINE record was
written.

However, it didn't guarantee that the sync request for the truncation
was processed before the XLOG_CHECKPOINT_ONLINE record was written. By
setting DELAY_CHKPT_START, we guarantee that if an XLOG_SMGR_TRUNCATE
record is written to WAL before the redo pointer of a concurrent
checkpoint, the sync request queued by that operation must be processed
by that checkpoint, rather than being left for the following one.

This is a refinement of commit 412ad7a5563.  Back-patch to all supported
releases, like that commit.

Author: Robert Haas <robertmhaas@gmail.com>
Reported-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKG%2B-2rjGZC2kwqr2NMLBcEBp4uf59QT1advbWYF_uc%2B0Aw%40mail.gmail.com
2024-12-03 10:12:05 +13:00
Nathan Bossart
db6a4a985b Deprecate MD5 passwords.
MD5 has been considered to be unsuitable for use as a cryptographic
hash algorithm for some time.  Furthermore, MD5 password hashes in
PostgreSQL are vulnerable to pass-the-hash attacks, i.e., knowing
the username and hashed password is sufficient to authenticate.
The SCRAM-SHA-256 method added in v10 is not subject to these
problems and is considered to be superior to MD5.

This commit marks MD5 password support in PostgreSQL as deprecated
and to be removed in a future release.  The documentation now
contains several deprecation notices, and CREATE ROLE and ALTER
ROLE now emit deprecation warnings when setting MD5 passwords.  The
warnings can be disabled by setting the md5_password_warnings
parameter to "off".

Reviewed-by: Greg Sabino Mullane, Jim Nasby
Discussion: https://postgr.es/m/ZwbfpJJol7lDWajL%40nathan
2024-12-02 13:30:07 -06:00
Dean Rasheed
97173536ed Add a planner support function for numeric generate_series().
This allows the planner to estimate the number of rows returned by
generate_series(numeric, numeric[, numeric]), when the input values
can be estimated at plan time.
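
For example (a hedged sketch; the exact estimate depends on the inputs):

    -- with constant numeric arguments the row count can now be estimated
    -- at plan time instead of using the default set-returning-function estimate
    EXPLAIN SELECT * FROM generate_series(1.0, 10.0, 0.5);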

Song Jinzhou, reviewed by Dean Rasheed and David Rowley.

Discussion: https://postgr.es/m/tencent_F43E7F4DD50EF5986D1051DE8DE547910206%40qq.com
Discussion: https://postgr.es/m/tencent_1F6D5B9A1545E02FD7D0EE508DFD056DE50A%40qq.com
2024-12-02 11:37:57 +00:00
Dean Rasheed
3315235845 Fix #include order in timestamp.c.
Oversight in 036bdcec9f.
2024-12-02 11:34:26 +00:00
Peter Eisentraut
086c84b23d Fix error code for referential action RESTRICT
According to the SQL standard, if the referential action RESTRICT is
triggered, it has its own error code.  We previously didn't use that,
we just used the error code for foreign key violation.  But RESTRICT
is not necessarily an actual foreign key violation.  The foreign key
might still be satisfied in theory afterwards, but the RESTRICT
setting prevents the action even then.  So it's a separate kind of
error condition.
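
A hedged illustration (hypothetical tables):

    CREATE TABLE parent (id int PRIMARY KEY);
    CREATE TABLE child  (parent_id int REFERENCES parent ON DELETE RESTRICT);
    INSERT INTO parent VALUES (1);
    INSERT INTO child  VALUES (1);
    -- fails; now reported with the RESTRICT-specific error code rather than
    -- the plain foreign-key-violation code
    DELETE FROM parent WHERE id = 1;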

Discussion: https://www.postgresql.org/message-id/ea5b2777-266a-46fa-852f-6fca6ec480ad@eisentraut.org
2024-12-02 08:22:34 +01:00
Tom Lane
2f696453d2 Fix broken list-munging in ecpg's remove_variables().
The loops over cursor argument variables neglected to ever advance
"prevvar".  The code would accidentally do the right thing anyway
when removing the first or second list entry, but if it had to
remove the third or later entry then it would also remove all
entries between there and the first entry.  AFAICS this would
only matter for cursors that reference out-of-scope variables,
which is a weird Informix compatibility hack; between that and
the lack of impact for short lists, it's not so surprising that
nobody has complained.  Nonetheless it's a pretty obvious bug.

It would have been more obvious if these loops used a more standard
coding style for chasing the linked lists --- this business with the
"prev" pointer sometimes pointing at the current list entry is
confusing and overcomplicated.  So rather than just add a minimal
band-aid, I chose to rewrite the loops in the same style we use
elsewhere, where the "prev" pointer is NULL until we are dealing with
a non-first entry and we save the "next" pointer at the top of the
loop.  (Two of the four loops touched here are not actually buggy,
but it seems better to make them all look alike.)

Coverity discovered this problem, but not until 2b41de4a5 added code
to free no-longer-needed arguments structs.  With that, the incorrect
link updates are possibly touching freed memory, and it complained
about that.  Nonetheless the list corruption hazard is ancient, so
back-patch to all supported branches.
2024-12-01 14:15:37 -05:00
Tom Lane
e032e4c7dd Avoid mislabeling of lateral references, redux.
As I'd feared, commit 5c9d8636d was still a few bricks shy of a load.
We can't just leave pulled-up lateral-reference Vars with no new
nullingrels: we have to carefully compute what subset of the
to-be-replaced Var's nullingrels apply to them, else we still get
"wrong varnullingrels" errors.  This is a bit tedious, but it looks
like we can use the nullingrel data this patch computes for other
purposes, enabling better optimization.  We don't want to inject
unnecessary plan changes into stable branches though, so leave that
idea for a later HEAD-only patch.

Patch by me, but thanks to Richard Guo for devising a test case that
broke 5c9d8636d, and for preliminary investigation about how to fix
it.  As before, back-patch to v16.

Discussion: https://postgr.es/m/E1tGn4j-0003zi-MP@gemulon.postgresql.org
2024-11-30 12:42:19 -05:00
Peter Eisentraut
49ae9fd8b7 doc: Fix typo
for commit 1e08905842f

Reported-by: Marcos Pegoraro <marcos@f10.com.br>
2024-11-30 08:43:46 +01:00
Peter Eisentraut
5d39becf8b Small indenting fixes in jsonpath_scan.l
Some lines were indented by an inconsistent number of spaces.  While
we're here, also fix some code that used the newline after left
parenthesis style, which is obsolete.
2024-11-29 11:33:21 +01:00
Peter Eisentraut
1e08905842 doc: Improve description of referential actions
Some of the differences between NO ACTION and RESTRICT were not
explained fully.

Discussion: https://www.postgresql.org/message-id/ea5b2777-266a-46fa-852f-6fca6ec480ad@eisentraut.org
2024-11-29 08:53:00 +01:00
Peter Eisentraut
4a2dbfc6be Add tests for foreign keys with case-insensitive collations
Some of the behaviors of the different referential actions, such as
the difference between NO ACTION and RESTRICT are best illustrated
using a case-insensitive collation.  So add some tests for that.

(What is actually being tested here is the behavior with values that
are "distinct" (binary different) but compare as equal.  Another way
to do that would be with positive and negative zeroes with float
types.  But this way seems nicer and more flexible.)
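
A hedged sketch of the kind of setup being exercised (collation and
table names are illustrative; an ICU-enabled build is assumed):

    CREATE COLLATION case_insensitive
        (provider = icu, locale = 'und-u-ks-level2', deterministic = false);
    CREATE TABLE pktab (x text COLLATE case_insensitive PRIMARY KEY);
    CREATE TABLE fktab (x text COLLATE case_insensitive
                          REFERENCES pktab ON DELETE RESTRICT);
    INSERT INTO pktab VALUES ('abc');
    INSERT INTO fktab VALUES ('ABC');   -- distinct bytes, but compares equal to 'abc'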

Discussion: https://www.postgresql.org/message-id/ea5b2777-266a-46fa-852f-6fca6ec480ad@eisentraut.org
2024-11-29 08:52:55 +01:00
Alexander Korotkov
5bba0546ee Skip not SOAP-supported indexes while transforming an OR clause into SAOP
There is no point in transforming OR-clauses into SAOP's if the target index
doesn't support SAOP scans anyway.  This commit adds corresponding checks
to match_orclause_to_indexcol() and group_similar_or_args().  The first check
fixes the actual bug, while the second just saves some cycles.
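
For illustration (hypothetical table and index):

    CREATE INDEX ON t (x);   -- btree supports ScalarArrayOpExpr (SAOP) scans
    -- the OR list may be rewritten into x = ANY ('{1,2,3}') for the index
    -- scan, but only when the index AM supports SAOP scans
    EXPLAIN SELECT * FROM t WHERE x = 1 OR x = 2 OR x = 3;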

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/8174de69-9e1a-0827-0e81-ef97f56a5939%40gmail.com
Author: Alena Rybakina
Reviewed-by: Ranier Vilela, Alexander Korotkov, Andrei Lepikhov
2024-11-29 09:52:12 +02:00
David Rowley
b6612aedc5 Fix typo in header comment for set_operation_ordered_results_useful
Reported-by: Richard Guo
Discussion: https://postgr.es/m/CAMbWs492vMy3XNjDZRtqtHfFTK6HVeDwhrEQH7eXGgF_h5Jnzw@mail.gmail.com
2024-11-29 15:56:24 +13:00
Michael Paquier
18954ce7f6 psql: Sprinkle more CppAsString2() in describe.c
Like 91f5a4a000ea for pg_amcheck, this makes the code more
self-documenting, as there is less need to look in the headers to see
what a hardcoded value means.  This touches queries related to
procedures, AMs,
functions, databases, relations, constraints, collations, types and
extended stats, pulling into psql their *_d.h headers.  The queries are
written the same way as originally.

There are still a couple of hardcoded values.  These cannot be included
yet as they are not exposed in headers that are safe to use in frontend
code.

Note that describe.c was including pg_am.h that should be used only in
backend code.  This is updated to use pg_am_d.h.

Reviewed-by: Daniel Gustafsson, Corey Huinker
Discussion: https://postgr.es/m/Zxb2hpca-pZc6zKe@paquier.xyz
2024-11-29 08:53:09 +09:00
Tom Lane
5c9d8636d3 Avoid mislabeling of lateral references when pulling up a subquery.
If we are pulling up a subquery that's under an outer join, and
the subquery's target list contains a strict expression that uses
both a subquery variable and a lateral-reference variable, it's okay
to pull up the expression without wrapping it in a PlaceHolderVar.
That's safe because if the subquery variable is forced to NULL
by the outer join, the expression result will come out as NULL too,
so we don't have to force that outcome by evaluating the expression
below the outer join.  It'd be correct to wrap in a PHV, but that can
lead to very significantly worse plans, since we'd then have to use
a nestloop plan to pass down the lateral reference to where the
expression will be evaluated.

However, when we do that, we should not mark the lateral reference
variable as being nulled by the outer join, because it isn't after
we pull up the expression in this way.  So the marking logic added
by cb8e50a4a was incorrect in this detail, leading to "wrong
varnullingrels" errors from the consistency-checking logic in
setrefs.c.  It seems to be sufficient to just not mark lateral
references at all in this case.  (I have a nagging feeling that more
complexity may be needed in cases where there are several levels of
outer join, but some attempts to break it with that didn't succeed.)

Per report from Bertrand Mamasam.  Back-patch to v16, as the previous
patch was.

Discussion: https://postgr.es/m/CACZ67_UA_EVrqiFXJu9XK50baEpH=ofEPJswa2kFxg6xuSw-ww@mail.gmail.com
2024-11-28 17:33:16 -05:00
Daniel Gustafsson
0c01f509a3 Fix wording in comment
Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CAHut+PvE+2T2etdTaHi3n+xbCG_UYrshQuCbaAdJCFPpQGLwgQ@mail.gmail.com
2024-11-28 15:17:49 +01:00
Peter Eisentraut
25ec329afa psql: Add tab completion for COPY (MERGE ...
The underlying feature for this was added in PostgreSQL 17.
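
For reference, a hedged sketch of the kind of statement being completed
(table and column names are hypothetical):

    COPY (MERGE INTO target t
          USING source s ON t.id = s.id
          WHEN MATCHED THEN UPDATE SET val = s.val
          WHEN NOT MATCHED THEN INSERT (id, val) VALUES (s.id, s.val)
          RETURNING merge_action(), t.*)
    TO STDOUT;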

Author: Jian He <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/CACJufxEmNjxvf1deR1zBrJbjAMeCooooLRzZ+yaaBuqDKh_6-Q@mail.gmail.com
2024-11-28 09:14:41 +01:00
Peter Eisentraut
7f798aca1d Remove useless casts to (void *)
Many of them just seem to have been copied around for no real reason.
Their presence causes (small) risks of hiding actual type mismatches
or silently discarding qualifiers.

Discussion: https://www.postgresql.org/message-id/flat/461ea37c-8b58-43b4-9736-52884e862820@eisentraut.org
2024-11-28 08:27:20 +01:00
Thomas Munro
97525bc5c8 Require sizeof(bool) == 1.
The C standard says that sizeof(bool) is implementation-defined, but we
know of no current systems where it is not 1.  The last known systems
seem to have been Apple macOS/PowerPC 10.5 and Microsoft Visual C++ 4,
both long defunct.

PostgreSQL has always required sizeof(bool) == 1 for the definition of
bool that it used, but previously it would define its own type if the
system-provided bool had a different size.  That was liable to cause
memory layout problems when interacting with system and third-party
libraries on (by now hypothetical) computers with wider _Bool, and now
C23 has introduced a new problem by making bool a built-in datatype
(like C++), so the fallback code doesn't even compile.  We could
probably work around that, but then we'd be writing new untested code
for a computer that doesn't exist.

Instead, delete the unreachable and C23-uncompilable fallback code, and
let existing static assertions fail if the system-provided bool is too
wide.  If we ever get a problem report from a real system, then it will
be time to figure out what to do about it in a way that also works on
modern compilers.

Note on C++: Previously we avoided including <stdbool.h> or trying to
define a new bool type in headers that might be included by C++ code.
These days we might as well just include <stdbool.h> unconditionally:
it should be visible to C++11 but do nothing, just as in C23.  We
already include <stdint.h> without C++ guards in c.h, and that falls
under the same C99-compatibility section of the C++11 standard as
<stdbool.h>, so let's remove the guards here too.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3198438.1731895163%40sss.pgh.pa.us
2024-11-28 12:01:14 +13:00
Nathan Bossart
4b03a27faf Use __attribute__((target(...))) for SSE4.2 CRC-32C support.
Presently, we check for compiler support for the required
intrinsics both with and without the -msse4.2 compiler flag, and
then depending on the results of those checks, we pick which files
to compile with which flags.  This is tedious and complicated, and
it results in unsustainable coding patterns such as separate files
for each portion of code that may need to be built with different
compiler flags.

This commit makes use of the newly-added support for
__attribute__((target(...))) in the SSE4.2 CRC-32C code.  This
simplifies both the configure-time checks and the build scripts,
and it allows us to place the functions that use the intrinsics in
files that we otherwise do not want to build with special CPU
instructions (although this commit refrains from doing so).  This
is also preparatory work for a proposed follow-up commit that will
further optimize the CRC-32C code with AVX-512 instructions.

While at it, this commit modifies meson's checks for SSE4.2 CRC
support to be the same as autoconf's.  meson was choosing whether
to use a runtime check based purely on whether -msse4.2 is
required, while autoconf has long checked for the __SSE4_2__
preprocessor symbol to decide.  meson's previous approach seems to
work just fine, but this change avoids needing to build multiple
test programs and to keep track of whether to actually use
pg_attribute_target().

Ideally we'd use __attribute__((target(...))) for ARMv8 CRC
support, too, but there's little point in doing so because until
clang 16, using the ARM intrinsics still requires special compiler
flags.  Perhaps we can re-evaluate this decision after some time
has passed.

Author: Raghuveer Devulapalli
Discussion: https://postgr.es/m/PH8PR11MB8286BE735A463468415D46B5FB5C2%40PH8PR11MB8286.namprd11.prod.outlook.com
2024-11-27 16:19:05 -06:00
Álvaro Herrera
6ba9892f5c
Make GUC_check_errdetail messages full sentences
They were all missing punctuation, and one was missing an initial capital.
Per our message style guidelines.

No backpatch, to avoid breaking existing translations.
2024-11-27 19:49:36 +01:00
Álvaro Herrera
fd9924542b
Remove redundant relam initialization
This struct member is initialized again a few lines below in the same
function.  This is cosmetic, so no backpatch.

Reported-by: Jingtang Zhang <mrdrivingduck@gmail.com>
Discussion: https://postgr.es/m/AFF74506-B925-46BB-B875-CF5A946170EB@gmail.com
2024-11-27 19:15:14 +01:00
Tom Lane
2b41de4a5b ecpg: clean up some other assorted memory leaks.
Avoid leaking the prior value when updating the "connection"
state variable.

Ditto for ECPGstruct_sizeof.  (It seems like this one ought to
be statement-local, but testing says it isn't, and I didn't
feel like diving deeper.)

The actual_type[] entries are statement-local, though, so
no need to mm_strdup() strings stored in them.

Likewise, sqlda variables are statement-local, so we can
loc_alloc them.

Also clean up sloppiness around management of the argsinsert and
argsresult lists.

progname changes are strictly to prevent valgrind from complaining
about leaked allocations.

With this, valgrind reports zero leakage in the ecpg preprocessor
for all of our ecpg regression test cases.

Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us
2024-11-27 12:50:23 -05:00
Tom Lane
85312d95e9 ecpg: put all string-valued tokens returned by pgc.l in local storage.
This didn't work earlier in the patch series (I think some of
the strings were ending up in data-type-related structures),
but apparently we're now clean enough for it.  This considerably
reduces process-lifespan memory leakage.

Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us
2024-11-27 12:50:23 -05:00
Tom Lane
0e6060790d ecpg: fix some memory leakage of data-type-related structures.
ECPGfree_type() and related functions were quite incomplete
about removing subsidiary data structures.  Possibly this is
because ecpg wasn't careful to make sure said data structures
always had their own storage.  Previous patches in this series
cleaned up a lot of that, and I had to add a couple more
mm_strdup's here.

Also, ecpg.trailer tended to overwrite struct_member_list[struct_level]
without bothering to free up its previous contents, thus potentially
leaking a lot of struct-member-related storage.  Add
ECPGfree_struct_member() calls at appropriate points.

Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us
2024-11-27 12:50:23 -05:00
Andrew Dunstan
5c32c21afe jsonapi: add lexer option to keep token ownership
Commit 0785d1b8b adds support for libpq as a JSON client, but
allocations for string tokens can still be leaked during parsing
failures. This is tricky to fix for the object_field semantic callbacks:
the field name must remain valid until the end of the object, but if a
parsing error is encountered partway through, object_field_end() won't
be invoked and the client won't get a chance to free the field name.

This patch adds a flag to switch the ownership of parsed tokens to the
lexer. When this is enabled, the client must make a copy of any tokens
it wants to persist past the callback lifetime, but the lexer will
handle necessary cleanup on failure.

Backend uses of the JSON parser don't need to use this flag, since the
parser's allocations will occur in a short lived memory context.

A -o option has been added to test_json_parser_incremental to exercise
the new setJsonLexContextOwnsTokens() API, and the test_json_parser TAP
tests make use of it. (The test program now cleans up allocated memory,
so that tests can be usefully run under leak sanitizers.)

Author: Jacob Champion

Discussion: https://postgr.es/m/CAOYmi+kb38EciwyBQOf9peApKGwraHqA7pgzBkvoUnw5BRfS1g@mail.gmail.com
2024-11-27 12:07:14 -05:00
Andres Freund
262283d5ee ci: Fix cached MacPorts installation management
1.  The error reporting of "port setrequested list-of-packages..."
changed, hiding errors we were relying on to know if all packages in our
list were already installed.  Use a loop instead.

2.  The cached MacPorts installation was shared between PostgreSQL
major branches, though each branch wanted different packages.  Add the
list of packages to cache key, so that different branches, when tested
in one github account/repo such as postgres/postgres, stop fighting with
each other, adding and removing packages.

Back-patch to 15 where CI began.

Author: Thomas Munro <thomas.munro@gmail.com>
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Suggested-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/au2uqfuy2nf43nwy2txmc5t2emhwij7kzupygto3d2ffgtrdgr%40ckvrlwyflnh2
2024-11-27 11:51:54 -05:00
Nathan Bossart
61171a632d Look up backend type in pg_signal_backend() more cheaply.
Commit ccd38024bc, which introduced the pg_signal_autovacuum_worker
role, added a call to pgstat_get_beentry_by_proc_number() for the
purpose of determining whether the process is an autovacuum worker.
This function calls pgstat_read_current_status(), which can be
fairly expensive and may return cached, out-of-date information.
Since we just need to look up the target backend's BackendType, and
we already know its ProcNumber, we can instead inspect the
BackendStatusArray directly, which is much less expensive and
possibly more up-to-date.  There are some caveats with this
approach (which are documented in the code), but it's still
substantially better than before.

Reported-by: Andres Freund
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/ujenaa2uabzfkwxwmfifawzdozh3ljr7geozlhftsuosgm7n7q%40g3utqqyyosb6
2024-11-27 10:32:25 -06:00
Andres Freund
6a5bcf7f7d postmaster: Reduce verbosity of environment dump debug message
Emitting each variable separately is unnecessarily verbose / hard to skim
over. Emit the whole thing in one ereport() to address that.

Also remove program name and function reference from the message. The former
doesn't seem particularly helpful and the latter is provided by the elog.c
infrastructure these days.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/leouteo5ozcrux3fepuhtbp6c56tbfd4naxeokidbx7m75cabz@hhw6g4urlowt
2024-11-27 11:17:23 -05:00
Fujii Masao
631db1d4b2 file_fdw: Add regression tests for ON_ERROR and other options.
This commit introduces regression tests to validate incorrect settings
for the ON_ERROR, LOG_VERBOSITY, and REJECT_LIMIT options in file_fdw.

Author: Atsushi Torikoshi
Reviewed-by: Fujii Masao
Suggested-by: Yugo Nagata
Discussion: https://postgr.es/m/20241113231706.09e5b5ea9640289312835be8@sraoss.co.jp
2024-11-27 23:40:11 +09:00
Fujii Masao
af35fe501a pgbench: Ensure previous progress message is fully cleared when updating.
During pgbench's table initialization, progress updates could display
leftover characters from the previous message if the new message
was shorter. This commit resolves the issue by appending spaces to
the current message to fully overwrite any remaining characters from
the previous line.

Back-patch to all the supported versions.

Author: Yushi Ogiwara, Tatsuo Ishii, Fujii Masao
Reviewed-by: Tatsuo Ishii, Fujii Masao
Discussion: https://postgr.es/m/9a9b8b95b6a709877ae48ad5b0c59bb9@oss.nttdata.com
2024-11-27 23:01:53 +09:00
Álvaro Herrera
09d09d4297
Fix pg_get_constraintdef for NOT NULL constraints on domains
We added pg_constraint rows for all not-null constraints, first for
tables and later for domains; but while the ones for tables were
reverted, the ones for domains were not.  However, we did accidentally
revert ruleutils.c support for the ones on domains in 6f8bb7c1e961,
which breaks running pg_get_constraintdef() on them.  Put that back.

This is only needed in branch 17, because we've reinstated this code in
branch master with commit 14e87ffa5c54.  Add some new tests in both
branches.

I couldn't find anything else that needs de-reverting.
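
For illustration only (domain and constraint names here are made up), a
query of this shape works again on a domain's not-null constraint:

    CREATE DOMAIN posint AS integer NOT NULL CHECK (VALUE > 0);
    SELECT conname, pg_get_constraintdef(oid)
      FROM pg_constraint
     WHERE contypid = 'posint'::regtype;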

Reported-by: Erki Eessaar <erki.eessaar@taltech.ee>
Reviewed-by: Magnus Hagander <magnus@hagander.net>
Discussion: https://postgr.es/m/AS8PR01MB75110350415AAB8BBABBA1ECFE222@AS8PR01MB7511.eurprd01.prod.exchangelabs.com
2024-11-27 13:50:27 +01:00
Peter Eisentraut
0d884f570b gitattributes: Add .cpp files to whitespace checks
Use the same rules as .c files.
2024-11-27 11:20:46 +01:00
Peter Eisentraut
41272784b9 Fix typo
from commit 9044fc1d45a
2024-11-27 11:20:46 +01:00
Peter Eisentraut
96447e9c81 Exclude LLVM files from whitespace checks
Commit 9044fc1d45a added some files from upstream LLVM.  These files
have different whitespace rules, which make the git whitespace checks
powered by gitattributes fail.  To fix, add those files to the exclude
list.
2024-11-27 11:20:46 +01:00
Thomas Munro
a62d90f2e5 Revert "Blind attempt to fix _configthreadlocale() failures on MinGW."
This reverts commit 2cf91ccb73ce888c44e3751548fb7c77e87335f2.

When using the old msvcrt.dll, MinGW would supply its own dummy version
of _configthreadlocale() that just returns -1 if you try to use it.  For
a time we tolerated that to shut the build farm up.  We would fall back
to code that was enough for the tests to pass, but it would surely have
risked crashing a real multithreaded program.

We don't need that kludge anymore, because we can count on ucrt.  We
expect the real _configthreadlocale() to be present, and the ECPG tests
will now fail if it isn't.  The workaround was dead code and it's time
to revert it.

(A later patch still under review proposes to remove this use of
_configthreadlocale() completely but we're unwinding this code in
steps.)

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/d9e7731c-ca1b-477c-9298-fa51e135574a%40eisentraut.org
2024-11-27 23:20:10 +13:00
Thomas Munro
1758d42446 Require ucrt if using MinGW.
Historically we tolerated the absence of various C runtime library
features for the benefit of the MinGW tool chain, because it used
ancient msvcrt.dll for a long period of time.  It now uses ucrt by
default (like Windows 10+, Visual Studio 2015+), and that's the only
configuration we're testing.

In practice, we effectively required ucrt already in PostgreSQL 17, when
commit 8d9a9f03 required _create_locale etc, first available in
msvcr120.dll (Visual Studio 2013, the last of the pre-ucrt series of
runtimes), and for MinGW users that practically meant ucrt because it
was difficult or impossible to use msvcr120.dll.  That may even not have
been the first such case, but old MinGW configurations had already
dropped off our testing radar so we weren't paying much attention.

This commit formalizes the requirement.  It also removes a couple of
obsolete comments that discussed msvcrt.dll limitations, and some tests
of !defined(_MSC_VER) to imply msvcrt.dll.  There are many more
anachronisms, but it'll take some time to figure out how to remove them
all.  APIs affected relate to locales, UTF-8, threads, large files and
more.

Thanks to Peter Eisentraut for the documentation change.  It's not
really necessary to talk about ucrt explicitly in such a short section,
since it's the default for MinGW-w64 and MSYS2.  It's enough to prune
references and broken links to much older tools.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/d9e7731c-ca1b-477c-9298-fa51e135574a%40eisentraut.org
2024-11-27 23:13:45 +13:00
Thomas Munro
f1da075d9a Remove configure check for _configthreadlocale().
All modern Windows systems have _configthreadlocale().  It was first
introduced in msvcr80.dll from Visual Studio 2005.  Historically, MinGW
was stuck on even older msvcrt.dll, but added its own dummy
implementation of the function when using msvcrt.dll years ago anyway,
effectively rendering the configure test useless.  In practice we don't
encounter the dummy anymore because modern MinGW uses ucrt.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/CWZBBRR6YA8D.8EHMDRGLCKCD%40neon.tech
2024-11-27 23:12:19 +13:00
Peter Eisentraut
63e10988f8 Improve slightly misleading internal error message
The error message was talking about RowCompareType but was actually
checking strategy numbers.  While those are closely related, it is
better to be accurate.

Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2024-11-27 10:55:35 +01:00
Amit Kapila
9d7aa406d0 Fix buildfarm failure from commit 8fcd80258b.
The test case was incorrectly matching the error code.

Author: Vignesh C
Discussion: https://postgr.es/m/CALDaNm0C5LPiTxkdqsxiyeaL=nuUP8t6ne81sp9jE0=MFz=-ew@mail.gmail.com
2024-11-27 14:54:26 +05:30
Peter Eisentraut
85b7efa1cd Support LIKE with nondeterministic collations
This allows, for example, using LIKE with case-insensitive collations.
There was previously no internal implementation of this, so it was met
with a not-supported error.  This adds the internal implementation and
removes the error.  The implementation follows the specification of
the SQL standard for this.

Unlike with deterministic collations, the LIKE matching cannot go
character by character but has to go substring by substring.  For
example, if we are matching against LIKE 'foo%bar', we can't start by
looking for an 'f', then an 'o', but instead we have to find
something that matches 'foo'.  This is because the collation could
consider substrings of different lengths to be equal.  This is all
internal to MatchText() in like_match.c.
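
A minimal sketch of the user-visible effect (the collation name and data
are illustrative):

    CREATE COLLATION ci (provider = icu, deterministic = false,
                         locale = 'und-u-ks-level2');
    CREATE TABLE tbl_ci (s text COLLATE ci);
    INSERT INTO tbl_ci VALUES ('FooBar');
    SELECT s FROM tbl_ci WHERE s LIKE 'foo%bar';  -- now matches instead
                                                  -- of raising an error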

The changes in GenericMatchText() in like.c just pass through the
locale information to MatchText(), which was previously not needed.
This matches exactly Generic_Text_IC_like() below.

ILIKE is not affected.  (It's unclear whether ILIKE makes sense under
nondeterministic collations.)

This also updates match_pattern_prefix() in like_support.c to support
optimizing the case of an exact pattern with nondeterministic
collations.  This was already alluded to in the previous code.

(includes documentation examples from Daniel Vérité and test cases
from Paul A Jungwirth)

Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/700d2e86-bf75-4607-9cf2-f5b7802f6e88@eisentraut.org
2024-11-27 08:19:42 +01:00
Amit Kapila
8fcd80258b Improve error message for replication of generated columns.
Currently, logical replication produces a generic error message when
targeting a subscriber-side table column that is either missing or
generated. The error message can be misleading for generated columns.

This patch introduces a specific error message to clarify the issue when
generated columns are involved.

Author: Shubham Khanna
Reviewed-by: Peter Smith, Vignesh C, Amit Kapila
Discussion: https://postgr.es/m/CAHv8RjJBvYtqU7OAofBizOmQOK2Q8h+w9v2_cQWxT_gO7er3Aw@mail.gmail.com
2024-11-27 09:09:20 +05:30
Michael Paquier
d0eb4297cc Handle better implicit transaction state of pipeline mode
When using a pipeline, a transaction starts from the first command and
is committed with a Sync message or when the pipeline ends.

Functions like IsInTransactionBlock() or PreventInTransactionBlock()
were already able to understand a pipeline as being in a transaction
block, but it was not the case of CheckTransactionBlock().  This
function is called for example to generate a WARNING for SET LOCAL,
complaining that it is used outside of a transaction block.

The current state of the code caused multiple problems, like:
- SET LOCAL executed at any stage of a pipeline issued a WARNING, even
if the command was at least second in line, at which point the pipeline
is already in a transaction state.
- LOCK TABLE failed when invoked at any step of a pipeline, even if it
should be able to work within a transaction block.

The pipeline protocol assumes that the first command of a pipeline is
not part of a transaction block, and that any follow-up commands are
considered to be within a transaction block.

This commit changes the backend so that an implicit transaction block is
started each time the first Execute message of a pipeline has finished
processing, with this implicit transaction block ended once a sync is
processed.  The checks based on XACT_FLAGS_PIPELINING in the routines
checking if we are in a transaction block are not necessary: it is
enough to rely on the existing ones.

Some tests are added to pgbench, which can be backpatched down to v17
when \syncpipeline is involved and down to v14 where \startpipeline and
\endpipeline are available.  This is unfortunately limited regarding the
error patterns that can be checked, but it provides coverage for various
pipeline combinations to check if these succeed or fail.  These tests
are able to capture the case of SET LOCAL's WARNING.  The author has
proposed a different feature to improve the coverage by adding similar
meta-commands to psql where error messages could be checked, something
more useful for the cases where commands cannot be used in transaction
blocks, like REINDEX CONCURRENTLY or VACUUM.  This is considered as
future work for v18~.
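
For instance, a pgbench script sketched along these lines (the SET LOCAL
value is arbitrary) no longer triggers the spurious WARNING once the
pipeline is inside its implicit transaction block:

    \startpipeline
    SELECT 1;
    SET LOCAL work_mem = '8MB';
    \syncpipeline
    \endpipeline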

Author: Anthonin Bonnefoy
Reviewed-by: Jelte Fennema-Nio, Michael Paquier
Discussion: https://postgr.es/m/CAO6_XqrWO8uNBQrSu5r6jh+vTGi5Oiyk4y8yXDORdE2jbzw8xw@mail.gmail.com
Backpatch-through: 13
2024-11-27 09:31:22 +09:00
Bruce Momjian
6e80951f49 Fix commit 641a5b7a144 for "nbsp" output in SVG files
In commit 641a5b7a144, I removed "nbsp" characters from SVG files, not
realizing the SVG files were generated from GV files and that the "nbsp"
characters were caused by trailing ASCII spaces in GV files.  This
commit restores the "nbsp" SVG characters and adds a GV comment about
how the trailing spaces cause the "nbsp" output.

Reported-by: Peter Eisentraut

Discussion: https://postgr.es/m/2c5dd601-b245-4092-9c27-6d1ad51609df%40eisentraut.org

Backpatch-through: master
2024-11-26 13:08:13 -05:00
Andres Freund
b8f9afc81f Distinguish between AcquireExternalFD and epoll_create1 / kqueue failing
The error messages in CreateWaitEventSet() made it hard to know whether the
syscall or AcquireExternalFD() failed. This is particularly relevant because
AcquireExternalFD() imposes a lower limit than what would cause syscalls to fail
with EMFILE.

I did not change the message in libpqsrv_connect_prepare(), which is the one
other use of AcquireExternalFD() in our codebase, as the error message already
is less ambiguous.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/xjjx7r4xa7beixuu4qtkdhnwdbchrrpo3gaeb3jsbinvvdiat5@cwjw55mna5of
2024-11-26 12:44:47 -05:00
Peter Eisentraut
4ee130c6e0 meson: Build pgevent as shared_module rather than shared_library
This matches the behavior of the makefiles and the old MSVC build
system.  The main effect is that the build result gets installed into
pkglibdir rather than bindir.  The documentation says to locate the
library in pkglibdir, so this makes the code match the documentation
again.

Reviewed-by: Ryohei Takahashi (Fujitsu) <r.takahashi_2@fujitsu.com>
Discussion: https://www.postgresql.org/message-id/flat/TY3PR01MB118912125614599641CA881B782522%40TY3PR01MB11891.jpnprd01.prod.outlook.com
2024-11-26 18:09:05 +01:00
Álvaro Herrera
e6c32d9fad
Clean up newlines following left parentheses
Most came in during the 17 cycle, so backpatch there.  Some
(particularly reorderbuffer.h) are very old, but backpatching doesn't
seem useful.

Like commits c9d297751959, c4f113e8fef9.
2024-11-26 17:10:07 +01:00
Peter Eisentraut
2a7b2d9717 Improve InitShmemAccess() prototype
The code comment said, 'the argument should be declared "PGShmemHeader
*seghdr", but we use void to avoid having to include ipc.h in
shmem.h.'  We can achieve the original goal with a struct forward
declaration.  (ipc.h was also not the correct header file.)

Discussion: https://www.postgresql.org/message-id/flat/cnthxg2eekacrejyeonuhiaezc7vd7o2uowlsbenxqfkjwgvwj@qgzu6eoqrglb
2024-11-26 08:46:22 +01:00
Richard Guo
e15e567137 Fix test case from a8ccf4e93
Commit a8ccf4e93 uses the same table name "distinct_tbl" in both
select_distinct.sql and select_distinct_on.sql, which could cause
conflicts when these two test scripts are run in parallel.

Fix by renaming the table in select_distinct_on.sql to
"distinct_on_tbl".

Per buildfarm (via Tom Lane)

Discussion: https://postgr.es/m/1572004.1732583549@sss.pgh.pa.us
2024-11-26 11:12:57 +09:00
Michael Paquier
91f5a4a000 pg_amcheck: Use CppAsString2() for relkind and relpersistence in queries
This utility has been using hardcoded values for relkind and
relpersistence in its generated queries.  These queries are switched to
use CppAsString2() instead, with the values fetched directly from the
header of pg_class.  This has the advantage of making the code more
self-documented, as it becomes unnecessary to look at a header for the
meaning of a value.

There should be no functional changes; the queries are generated the
same way as before this commit.

Reviewed-by: Nathan Bossart, Daniel Gustafsson, Álvaro Herrera, Karina
Litskevich
Discussion: https://postgr.es/m/ZxIvemDk0Ob1RGwh@paquier.xyz
2024-11-26 09:45:34 +09:00
Richard Guo
cc4c90cef9 Remove dead code in get_param_path_clause_serials()
The function get_param_path_clause_serials() is used to get the set of
pushed-down clauses enforced within a parameterized Path.  Since we
don't currently support parameterized MergeAppend paths, and it
doesn't look like that is going to change anytime soon (as explained
in the comments for generate_orderedappend_paths), we don't need to
consider MergeAppendPath in this function.

This change won't make any measurable difference in performance; it's
just for clarity's sake.

Author: Richard Guo
Reviewed-by: Andrei Lepikhov
Discussion: https://postgr.es/m/CAMbWs4_Puie4DQ2ODvjQB_3CxYkUODnrJm8jn_ObMAcrjYNW7Q@mail.gmail.com
2024-11-26 09:27:53 +09:00
Richard Guo
a8ccf4e93a Reordering DISTINCT keys to match input path's pathkeys
The ordering of DISTINCT items is semantically insignificant, so we
can reorder them as needed.  In fact, in the parser, we absorb the
sorting semantics of the sortClause as much as possible into the
distinctClause, ensuring that one clause is a prefix of the other.
This can help avoid a possible need to re-sort.

In this commit, we attempt to adjust the DISTINCT keys to match the
input path's pathkeys.  This can likewise help avoid re-sorting, or
allow us to use incremental sort to save effort.

For DISTINCT ON expressions, the parser already ensures that they
match the initial ORDER BY expressions.  When reordering the DISTINCT
keys, we must ensure that the resulting pathkey list matches the
initial distinctClause pathkeys.

This introduces a new GUC, enable_distinct_reordering, which allows
the optimization to be disabled if needed.
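
For example (a sketch; table and index names are made up), the GUC can be
toggled like any other planner setting:

    CREATE TABLE dist_tab (a int, b int);
    CREATE INDEX ON dist_tab (b, a);
    EXPLAIN SELECT DISTINCT a, b FROM dist_tab;
    -- With enable_distinct_reordering on, the DISTINCT keys may be
    -- reordered to (b, a) to match an already-sorted input path.
    SET enable_distinct_reordering = off;  -- fall back to the old behavior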

Author: Richard Guo
Reviewed-by: Andrei Lepikhov
Discussion: https://postgr.es/m/CAMbWs48dR26cCcX0f=8bja2JKQPcU64136kHk=xekHT9xschiQ@mail.gmail.com
2024-11-26 09:25:18 +09:00
Tom Lane
5b8728cd7f Fix NULLIF()'s handling of read-write expanded objects.
If passed a read-write expanded object pointer, the EEOP_NULLIF
code would hand that same pointer to the equality function
and then (unless equality was reported) also return the same
pointer as its value.  This is no good, because a function that
receives a read-write expanded object pointer is fully entitled
to scribble on or even delete the object, thus corrupting the
NULLIF output.  (This problem is likely unobservable with the
equality functions provided in core Postgres, but it's easy to
demonstrate with one coded in plpgsql.)

To fix, make sure the pointer passed to the equality function
is read-only.  We can still return the original read-write
pointer as the NULLIF result, allowing optimization of later
operations.

Per bug #18722 from Alexander Lakhin.  This has been wrong
since we invented expanded objects, so back-patch to all
supported branches.

Discussion: https://postgr.es/m/18722-fd9e645448cc78b4@postgresql.org
2024-11-25 18:09:09 -05:00
Noah Misch
4ba84de459 Avoid "you don't own a lock of type ExclusiveLock" in GRANT TABLESPACE.
This WARNING appeared because SearchSysCacheLocked1() read
cc_relisshared before catcache initialization, when the field is false
unconditionally.  On the basis of reading false there, it constructed a
locktag as though pg_tablespace weren't relisshared.  Only shared
catalogs could be affected, and only GRANT TABLESPACE was affected in
practice.  SearchSysCacheLocked1() callers use one other shared-relation
syscache, DATABASEOID.  DATABASEOID is initialized by the end of
CheckMyDatabase(), making the problem unreachable for pg_database.

Back-patch to v13 (all supported versions).  This has no known impact
before v16, where ExecGrant_common() first appeared.  Earlier branches
avoid trouble by having a separate ExecGrant_Tablespace() that doesn't
use LOCKTAG_TUPLE.  However, leaving this unfixed in v15 could ensnare a
future back-patch of a SearchSysCacheLocked1() call.

Reported by Aya Iwata.

Discussion: https://postgr.es/m/OS7PR01MB11964507B5548245A7EE54E70EA212@OS7PR01MB11964.jpnprd01.prod.outlook.com
2024-11-25 14:42:35 -08:00
Nathan Bossart
96a81c1be9 pg_dump: Add dumpSchema and dumpData derivative flags.
Various parts of pg_dump consult the --schema-only and --data-only
options to determine whether to run a section of code.  While this
is simple enough for two mutually-exclusive options, it will become
progressively more complicated as more options are added.  In
anticipation of that, this commit introduces new internal flags
called dumpSchema and dumpData, which are derivatives of
--schema-only and --data-only.  This commit also removes the
schemaOnly and dataOnly members from the dump/restore options
structs to prevent their use elsewhere.

Note that this change neither adds new user-facing command-line
options nor changes the existing --schema-only and --data-only
options.

Author: Corey Huinker
Reviewed-by: Jeff Davis
Discussion: https://postgr.es/m/CADkLM%3DcQgghMJOS8EcAVBwRO4s1dUVtxGZv5gLPfZkQ1nL1gzA%40mail.gmail.com
2024-11-25 16:36:37 -06:00
Thomas Munro
648333a99f Clean up <stdbool.h> reference in meson.build.
Commit bc5a4dfc accidentally left a check for <stdbool.h> in
meson.build's header_checks.  Synchronize with configure, which no
longer defines HAVE_STDBOOL_H.

There is still a reference to <stdbool.h> in an earlier test to see if
we need -std=c99 to get C99 features, like autoconf 2.69's
AC_PROG_CC_C99.  (Therefore the test removed by this commit was
tautological since day one: you'd have copped "C compiler does not
support C99" before making it this far.)

Back-patch to 16, where meson begins.
2024-11-26 11:29:36 +13:00
Tom Lane
5980f1884f Update configure probes for CFLAGS needed for ARM CRC instructions.
On ARM platforms where the baseline CPU target lacks CRC instructions,
we need to supply a -march flag to persuade the compiler to compile
such instructions.  It turns out that our existing choice of
"-march=armv8-a+crc" has not worked for some time, because recent gcc
will interpret that as selecting software floating point, and then
will spit up if the platform requires hard-float ABI, as most do
nowadays.  The end result was to silently fall back to software CRC,
which isn't very desirable since in practice almost all currently
produced ARM chips do have hardware CRC.

We can fix this by using "-march=armv8-a+crc+simd" to enable the
correct ABI choice.  (This has no impact on the code actually
generated, since neither of the files we compile with this flag
does any floating-point stuff, let alone SIMD.)  Keep the test for
"-march=armv8-a+crc" since that's required for soft-float ABI,
but try that second since most platforms we're likely to build on
use hard-float.

Since this isn't working as-intended on the last several years'
worth of gcc releases, back-patch to all supported branches.

Discussion: https://postgr.es/m/4496616.iHFcN1HehY@portable-bastien
2024-11-25 12:50:17 -05:00
Tom Lane
4570b22666 Support runtime CRC feature probing on NetBSD/ARM using sysctl().
Commit aac831caf left this as a to-do; here's code to do it.
Like the previous patch, this is HEAD-only for now.

Discussion: https://postgr.es/m/4496616.iHFcN1HehY@portable-bastien
2024-11-25 11:53:26 -05:00
Peter Eisentraut
32a2aa77ef Add support for Tcl 9
Tcl 9 changed several API functions to take Tcl_Size, which is
ptrdiff_t, instead of int, for 64-bit enablement.  We have to change a
few local variables to be compatible with that.  We also provide a
fallback typedef of Tcl_Size for older Tcl versions.

The affected variables are used for quantities that will not approach
values beyond the range of int, so this doesn't change any
functionality.

Reviewed-by: Tristan Partin <tristan@partin.io>
Discussion: https://www.postgresql.org/message-id/flat/bce0fe54-75b4-438e-b42b-8e84bc7c0e9c%40eisentraut.org
2024-11-25 11:44:29 +01:00
Thomas Munro
bc5a4dfcf7 Assume that <stdbool.h> conforms to the C standard.
Previously we checked "for <stdbool.h> that conforms to C99" using
autoconf's AC_HEADER_STDBOOL macro.  We've required C99 since PostgreSQL
12, so the test was redundant, and under C23 it was broken: autoconf
2.69's implementation doesn't understand C23's new empty header (the
macros it's looking for went away, replaced by language keywords).
Later autoconf versions fixed that, but let's just remove the
anachronistic test.

HAVE_STDBOOL_H and HAVE__BOOL will no longer be defined, but they
weren't directly tested in core or likely extensions (except in 11, see
below).  PG_USE_STDBOOL (or USE_STDBOOL in 11 and 12) is still defined
when sizeof(bool) is 1, which should be true on all modern systems.
Otherwise we define our own bool type and values of size 1, which would
fail to compile under C23 as revealed by the broken test.  (We'll
probably clean that dead code up in master, but here we want a minimal
back-patchable change.)

This came to our attention when GCC 15 recently started using C23
by default and failed to compile the replacement code, as reported by
Sam James and build farm animal alligator.

Back-patch to all supported releases, and then two older versions that
also know about <stdbool.h>, per the recently-out-of-support policy[1].
12 requires C99 so it's much like the supported releases, but 11 only
assumes C89 so it now uses AC_CHECK_HEADERS instead of the overly picky
AC_HEADER_STDBOOL.  (I could find no discussion of which historical
systems had <stdbool.h> but failed the conformance test; if they ever
existed, they surely aren't relevant to that policy's goals.)

[1] https://wiki.postgresql.org/wiki/Committing_checklist#Policies

Reported-by: Sam James <sam@gentoo.org>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org> (master version)
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (approach)
Discussion: https://www.postgresql.org/message-id/flat/87o72eo9iu.fsf%40gentoo.org
2024-11-25 20:54:28 +13:00
Alexander Korotkov
d4d11940df Remove the wrong assertion from match_orclause_to_indexcol()
Obviously, the constant could be zero.  Also, add the relevant check to
regression tests.

Reported-by: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4-siKJdtWhcbqk4Y-xG12do2Ckm1qw672GNsSnDqL9FQg%40mail.gmail.com
2024-11-25 09:07:30 +02:00
Amit Kapila
d05a387d9d Doc: Clarify the inactive_since field description.
Updated to specify that it represents the exact time a slot became
inactive, rather than the period of inactivity.

Reported-by: Peter Smith
Author: Bruce Momjian, Nisha Moond
Reviewed-by: Amit Kapila, Peter Smith
Backpatch-through: 17
Discussion: https://postgr.es/m/CAHut+PuvsyA5v8y7rYoY9mkDQzUhwaESM05yCByTMaDoRh30tA@mail.gmail.com
2024-11-25 11:12:32 +05:30
Michael Paquier
db80507d98 Simplify some SPI tests of PL/Python
These tests relied on both next() and __next__(), but only the former is
needed since Python 2 support has been removed, so let's simplify the
tests a bit.

Author: Erik Wienhold
Discussion: https://postgr.es/m/173209043143.2092749.13692266486972491694@wrigleys.postgresql.org
2024-11-25 09:43:16 +09:00
Michael Paquier
2ff7c913d9 doc: Fix example with __next__() in PL/Python function
Per PEP 3114, iterator.next() has been renamed to iterator.__next__(),
and one example in the documentation still used next().  This caused the
example provided to fail at function creation, since Python 2 has not
been supported since 19252e8ec93.

Author: Erik Wienhold
Discussion: https://postgr.es/m/173209043143.2092749.13692266486972491694@wrigleys.postgresql.org
Backpatch-through: 15
2024-11-25 09:15:25 +09:00
Noah Misch
5de08f136a Test "options=-crole=" and "ALTER DATABASE SET role".
Commit 7b88529f4363994450bd4cd3c172006a8a77e222 fixed a regression
spanning these features, but it didn't test them.  It did test code
paths sufficient for their present implementations, so no back-patch.

Reported by Matthew Woodcraft.

Discussion: https://postgr.es/m/87iksnsbhx.fsf@golux.woodcraft.me.uk
2024-11-24 12:49:53 -08:00
Alexander Korotkov
ae4569161a Teach bitmap path generation about transforming OR-clauses to SAOP's
When the optimizer generates bitmap paths, it considers breaking an
OR-clause into its arguments one-by-one.  But now, a group of similar
OR-clauses can be transformed into a SAOP during index matching, so
bitmap paths should keep up.

This commit teaches the bitmap path generation machinery to group similar
OR-clauses into dedicated RestrictInfos.  Those RestrictInfos are considered
both to match the index as a whole (as a SAOP) and to match as a set of
individual OR-clause arguments one-by-one (the old way).

Therefore, bitmap path generation will take advantage of the OR-clause to
SAOP transformation.  The old way of handling them is also considered, so
there shouldn't be any planning regression.

Discussion: https://postgr.es/m/CAPpHfdu5iQOjF93vGbjidsQkhHvY2NSm29duENYH_cbhC6x%2BMg%40mail.gmail.com
Author: Alexander Korotkov, Andrey Lepikhov
Reviewed-by: Alena Rybakina, Andrei Lepikhov, Jian he, Robert Haas
Reviewed-by: Peter Geoghegan
2024-11-24 01:41:45 +02:00
Alexander Korotkov
d4378c0005 Transform OR-clauses to SAOP's during index matching
This commit makes match_clause_to_indexcol() match
"(indexkey op C1) OR (indexkey op C2) ... (indexkey op CN)" expression
to the index while transforming it into "indexkey op ANY(ARRAY[C1, C2, ...])"
(ScalarArrayOpExpr node).

This transformation allows handling long OR-clauses with a single IndexScan,
avoiding dividing them into a slower BitmapOr.

We currently restrict Ci to be either Const or Param to apply this
transformation only when it's clearly beneficial.  However, in the future,
we might switch to a more liberal understanding of constants, as is done
in other cases.
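
A sketch of the query shape this affects (names are illustrative):

    CREATE TABLE saop_tab (id int, payload text);
    CREATE INDEX ON saop_tab (id);
    -- The OR arms on the same index key can now be matched as
    -- "id = ANY (ARRAY[1, 2, 3])" during index matching:
    EXPLAIN SELECT * FROM saop_tab WHERE id = 1 OR id = 2 OR id = 3;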

Discussion: https://postgr.es/m/567ED6CA.2040504%40sigaev.ru
Author: Alena Rybakina, Andrey Lepikhov, Alexander Korotkov
Reviewed-by: Peter Geoghegan, Ranier Vilela, Alexander Korotkov, Robert Haas
Reviewed-by: Jian He, Tom Lane, Nikolay Shaplov
2024-11-24 01:40:20 +02:00
Jeff Davis
869ee4f10e Disallow modifying statistics on system columns.
Reported-by: Heikki Linnakangas
Discussion: https://postgr.es/m/df3e1c41-4e6c-40ad-9636-98deefe488cd@iki.fi
2024-11-22 12:40:24 -08:00
Nathan Bossart
efdc7d7475 Add INT64_HEX_FORMAT and UINT64_HEX_FORMAT to c.h.
Like INT64_FORMAT and UINT64_FORMAT, these macros produce format
strings for 64-bit integers.  However, INT64_HEX_FORMAT and
UINT64_HEX_FORMAT generate the output in hexadecimal instead of
decimal.  Besides introducing these macros, this commit makes use
of them in several places.  This was originally intended to be part
of commit 5d6187d2a2, but I left it out because I felt there was a
nonzero chance that back-patching these new macros into c.h could
cause problems with third-party code.  We tend to be less cautious
with such changes in new major versions.

Note that UINT64_HEX_FORMAT was originally added in commit
ee1b30f128, but it was placed in test_radixtree.c, so it wasn't
widely available.  This commit moves UINT64_HEX_FORMAT to c.h.

Discussion: https://postgr.es/m/ZwQvtUbPKaaRQezd%40nathan
2024-11-22 12:41:57 -06:00
Nathan Bossart
8589876b79 Add a couple of recent commits to .git-blame-ignore-revs. 2024-11-22 12:17:35 -06:00
Heikki Linnakangas
8fb5936703 Make the memory layout of Port struct independent of USE_OPENSSL
Commit d39a49c1e4 added new fields to the struct, but missed the "keep
these last" comment on the previous fields. Add placeholder variables
so that the offsets of the fields are the same whether you build with
USE_OPENSSL or not. This is a courtesy to extensions that might peek
at the fields, to make the ABI the same regardless of the options used
to build PostgreSQL.

In reality, I don't expect any extensions to look at the 'raw_buf'
fields. Firstly, they are new in v17, so no one's written such
extensions yet. Secondly, extensions should have no business poking at
those fields anyway. Nevertheless, fix this properly on 'master'. On
v17, we mustn't change the memory layout, so just fix the comments.

Author: Jacob Champion
Discussion: https://www.postgresql.org/message-id/raw/CAOYmi%2BmKVJNzn5_TD_MK%3DhqO64r_w8Gb0FHCLk0oAkW-PJv8jQ@mail.gmail.com
2024-11-22 17:43:04 +02:00
Heikki Linnakangas
ee937f0409 Fix data loss when restarting the bulk_write facility
If a user started a bulk write operation on a fork with existing data
to append data in bulk, the bulk_write machinery would zero out all
previously written pages up to the last page written by the new
bulk_write operation.

This is not an issue for PostgreSQL itself, because we never use the
bulk_write facility on a non-empty fork. But there are use cases where
it makes sense. TimescaleDB extension is known to do that to merge
partitions, for example.

Backpatch to v17, where the bulk_write machinery was introduced.

Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reported-By: Erik Nordström <erik@timescale.com>
Reviewed-by: Erik Nordström <erik@timescale.com>
Discussion: https://www.postgresql.org/message-id/CACAa4VJ%2BQY4pY7M0ECq29uGkrOygikYtao1UG9yCDFosxaps9g@mail.gmail.com
2024-11-22 16:28:24 +02:00
Thomas Munro
aac831cafa Use auxv to check for CRC32 instructions on ARM.
Previously we probed for CRC32 instructions by testing if they caused
SIGILL.  Some have expressed doubts about that technique, the Linux
documentation advises not to use it, and it's not exactly beautiful.
Now that more operating systems expose CPU features to userspace via the
ELF loader in approximately the same way, let's use that instead.

This is expected to work on Linux, FreeBSD and recent OpenBSD.
OpenBSD/ARM has not been tested and is not present in our build farm,
but the API matches FreeBSD.

On macOS, compilers use a more recent baseline ISA so the runtime test
mechanism isn't reached.  (A similar situation is expected for
Windows/ARM when that port lands.)

On NetBSD, runtime feature probing is lost for armv8-a builds.  It looks
potentially doable with sysctl following the example of the cpuctl
program; patches are welcome.

No back-patch for now, since we don't have any evidence of actual
breakage from the previous technique.

Suggested-by: Bastien Roucariès <rouca@debian.org>
Discussion: https://postgr.es/m/4496616.iHFcN1HehY%40portable-bastien
2024-11-22 21:45:25 +13:00
Michael Paquier
ea15816928 psql: Fix category of \parse in output of --help=commands and \?
\parse was listed under the category "Connection", which was incorrect.
Let's move it to "General" like the other meta-commands of the same type
(\bind, \bind_named and \close).

Oversight in commit d55322b0da60.

Discussion: https://postgr.es/m/Zz_x-NEKNeeRlAVc@paquier.xyz
2024-11-22 14:04:21 +09:00
Michael Paquier
768dfd8e65 psql: Include \pset xheader_width in --help=commands|variables
psql's --help was missing the description of the \pset variable
xheader_width, which should be listed when using \? or --help=commands,
and described for --help=variables.

Oversight in a45388d6e098.

Author: Pavel Luzanov
Discussion: https://postgr.es/m/1e3e06d6-0807-4e62-a9f6-c11481e6eb10@postgrespro.ru
Backpatch-through: 16
2024-11-22 12:17:37 +09:00
Thomas Munro
78c09bd9f9 jit: Use -mno-outline-atomics for bitcode on ARM.
If the executable's .o files were produced by a compiler (probably gcc)
not using -moutline-atomics, and the corresponding .bc files were
produced by clang using -moutline-atomics (probably by default), then
the generated bitcode functions would have the target attribute
"+outline-atomics", and could fail at runtime when inlined.  If the
target ISA at bitcode generation time was armv8-a (the most conservative
aarch64 target, no LSE), then LLVM IR atomic instructions would generate
calls to functions in libgcc.a or libclang_rt.*.a that switch between
LL/SC and faster LSE instructions depending on a runtime AT_HWCAP check.
Since the corresponding .o files didn't need those functions, they
wouldn't have been included in the executable, and resolution would
fail.

At least Debian and Ubuntu are known to ship gcc and clang compilers
that target armv8-a but differ on the use of outline atomics by default.

Fix, by suppressing the outline atomics attribute in bitcode explicitly.
Inline LL/SC instructions will be generated for atomic operations in
bitcode built for armv8-a.  Only configure scripts are adjusted for now,
because the meson build system doesn't generate bitcode yet.

This doesn't seem to be a new phenomenon, so real cases of functions
using atomics that are inlined by JIT must be rare in the wild given how
long it took for a bug report to arrive.  The reported case could be
reduced to:

postgres=# set jit_inline_above_cost = 0;
SET
postgres=# set jit_above_cost = 0;
SET
postgres=# select pg_last_wal_receive_lsn();
WARNING:  failed to resolve name __aarch64_swp4_acq_rel
FATAL:  fatal llvm error: Program used external function
'__aarch64_swp4_acq_rel' which could not be resolved!

The change doesn't affect non-ARM systems or later target ISAs.

Back-patch to all supported releases.

Reported-by: Alexander Kozhemyakin <a.kozhemyakin@postgrespro.ru>
Discussion: https://postgr.es/m/18610-37bf303f904fede3%40postgresql.org
2024-11-22 15:29:47 +13:00
Michael Paquier
c06e71d1ac Add write_to_file to PgStat_KindInfo for pgstats kinds
This new field controls whether entries of a stats kind should be written
to the on-disk pgstats file when shutting down an instance.  This
affects both fixed and variable-numbered kinds.

This is useful for custom statistics in itself, and a patch is under
discussion to add a new builtin stats kind where writing the stats is
not necessary.  All the built-in stats kinds, as well as the two
custom stats kinds in the test module injection_points, set this flag to
"true" for now, so that stats entries are written to the on-disk pgstats
file.

Author: Bertrand Drouvot
Reviewed-by: Nazir Bilal Yavuz
Discussion: https://postgr.es/m/Zz7T47nHwYgeYwOe@ip-10-97-1-34.eu-west-3.compute.internal
2024-11-22 10:12:26 +09:00
Bruce Momjian
4c4aaa19a6 doc: clarify how logical replication takes its initial snapshot
Reported-by: Koen De Groote

Discussion: https://postgr.es/m/171606613152.686.7693963105919927503@wrigleys.postgresql.org

Backpatch-through: master
2024-11-21 17:14:33 -05:00
Peter Eisentraut
53dcba9be5 pgindent run
for commit 79b575d3bc0
2024-11-21 21:40:17 +01:00
Álvaro Herrera
b5be29ecaf
Fix newly introduced 010_keep_recycled_wals.pl
It failed to set the archive_command as it desired because of a syntax
problem.  Oversight in commit 90bcc7c2db1d.

This bug doesn't cause the test to fail, because the test only checks
pg_rewind's output messages, not the actual outcome (and the outcome in
both cases is that the file is kept, not deleted).  But in either case
the message about the file being kept is there, so it's hard to get
excited about doing much more.

Reported-by: Antonin Houska <ah@cybertec.at>
Author: Alexander Kukushkin <cyberdemn@gmail.com>
Discussion: https://postgr.es/m/7822.1732167825@antos
2024-11-21 17:04:26 +01:00
Álvaro Herrera
7300ff1bd8
Fix outdated bit in README.tuplock
Apparently this information has been outdated since it was first committed,
because we adopted a different implementation during development per
reviews and this detail was not updated in the README.

This has been wrong since commit 0ac5ad5134f2 introduced the file in
2013.  Backpatch to all live branches.

Reported-by: Will Mortensen <will@extrahop.com>
Discussion: https://postgr.es/m/CAMpnoC6yEQ=c0Rdq-J7uRedrP7Zo9UMp6VZyP23QMT68n06cvA@mail.gmail.com
2024-11-21 16:54:36 +01:00
Peter Eisentraut
79b575d3bc Fix ALTER TABLE / REPLICA IDENTITY for temporal tables
REPLICA IDENTITY USING INDEX did not accept a GiST index.  This should
be allowed when used as a temporal primary key.
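
A rough sketch, assuming the WITHOUT OVERLAPS temporal primary-key syntax
(table and constraint names are made up):

    CREATE EXTENSION IF NOT EXISTS btree_gist;  -- typically needed for the
                                                -- scalar part of the key
    CREATE TABLE booking (
        room   int,
        during daterange,
        CONSTRAINT booking_pk PRIMARY KEY (room, during WITHOUT OVERLAPS)
    );
    -- booking_pk is backed by a GiST index; this is now accepted:
    ALTER TABLE booking REPLICA IDENTITY USING INDEX booking_pk;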

Author: Paul Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://www.postgresql.org/message-id/04579cbf-b134-45e1-8f2d-8c54c849c1ee@illuminatedcomputing.com
2024-11-21 13:50:18 +01:00
Álvaro Herrera
da94e871e8
Unify repetitive error messages 2024-11-21 10:54:30 +01:00
Michael Paquier
ea792bfd93 Fix memory leak in pgoutput for the WAL sender
RelationSyncCache, the hash table in charge of tracking the relation
schemas sent through pgoutput, was forgetting to free the TupleDesc
associated with the two slots used to store the new and old tuples,
causing some memory to be leaked each time a relation is invalidated
when the slots of an existing relation entry are cleaned up.

This is rather hard to notice as the bloat is pretty minimal, but a
long-running WAL sender would be in trouble over time depending on the
workload.  sysbench has proved to be pretty good at showing the problem,
coupled with some memory monitoring of the WAL sender.

Issue introduced in 52e4f0cd472d, that has added row filters for tables
logically replicated.

Author: Boyu Yang
Reviewed-by: Michael Paquier, Hou Zhijie
Discussion: https://postgr.es/m/DM3PR84MB3442E14B340E553313B5C816E3252@DM3PR84MB3442.NAMPRD84.PROD.OUTLOOK.COM
Backpatch-through: 15
2024-11-21 15:14:02 +09:00
Bruce Momjian
f95da9f0e0 More logically order libpq func. includes, e.g., group GUC vals
Reported-by: David Zhang

Discussion: https://postgr.es/m/65909efe-97c6-4863-af4e-21eb5a26dd1e@highgo.ca

Co-authored-by: David Zhang

Backpatch-through: master
2024-11-20 17:09:17 -05:00
Bruce Momjian
70236cf22f doc: clarify that jsonb_path_match() returns an SQL boolean
Not a JSON boolean.  Also clarify that other predicate check expression
functions return a JSON boolean, not an SQL boolean.
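
To illustrate the distinction (a small sketch):

    SELECT jsonb_path_match('{"a": 1}', '$.a == 1');  -- SQL boolean: t
    SELECT jsonb_path_query('{"a": 1}', '$.a == 1');  -- JSON boolean: true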

Reported-by: jian he

Discussion: https://postgr.es/m/CACJufxH7tP1NXCHN1bUBXcEB=dv7-qE+ZjB3UxwK6Em+9Qzb9Q@mail.gmail.com

Backpatch-through: 17
2024-11-20 17:03:45 -05:00
Bruce Momjian
f722dd32de clarify --no-comments option in --help and SGML files
The previous commit, b38bac26e20, missed these cases for dump/restore.

Reported-by: Tom Lane

Discussion: https://postgr.es/m/3495698.1731968093@sss.pgh.pa.us

Backpatch-through: master
2024-11-20 14:48:31 -05:00
Peter Geoghegan
7074337698 Refine nbtree = redundancy preprocessing comment.
Spell out how a = key associated with a SAOP array renders a > key
against the same index column redundant at the relevant point inside
_bt_preprocess_keys.

Follow-up to commit 5bf748b8.
2024-11-20 13:37:08 -05:00
Tom Lane
94131cd53c Avoid assertion failure if a setop leaf query contains setops.
Ordinarily transformSetOperationTree will collect all UNION/
INTERSECT/EXCEPT steps into the setOperations tree of the topmost
Query, so that leaf queries do not contain any setOperations.
However, it cannot thus flatten a subquery that also contains
WITH, ORDER BY, FOR UPDATE, or LIMIT.  I (tgl) forgot that in
commit 07b4c48b6 and wrote an assertion in rule deparsing that
a leaf's setOperations would always be empty.

If it were nonempty then we would want to parenthesize the subquery
to ensure that the output represents the setop nesting correctly
(e.g. UNION below INTERSECT had better get parenthesized).  So
rather than just removing the faulty Assert, let's change it into
an additional case to check to decide whether to add parens.  We
don't expect that the additional case will ever fire, but it's
cheap insurance.

Man Zeng and Tom Lane

Discussion: https://postgr.es/m/tencent_7ABF9B1F23B0C77606FC5FE3@qq.com
2024-11-20 12:03:47 -05:00
Fujii Masao
6c8f670323 file_fdw: Add REJECT_LIMIT option to file_fdw.
Commit 4ac2a9bece introduced the REJECT_LIMIT option for the COPY
command. This commit extends the support for this option to file_fdw.

As with the REJECT_LIMIT option for COPY, this option limits
the maximum number of erroneous rows that can be skipped.
If the number of data type conversion errors exceeds this limit,
accessing the file_fdw foreign table will fail with an error,
even when on_error = 'ignore' is specified.

Since the CREATE/ALTER FOREIGN TABLE commands require foreign
table options to be single-quoted, this commit updates
defGetCopyRejectLimitOption() to also handle string values for them,
in addition to the int64 value used for the COPY command option.
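
A sketch of how the option might be set on a file_fdw foreign table
(server and file names are made up):

    CREATE EXTENSION file_fdw;
    CREATE SERVER csv_files FOREIGN DATA WRAPPER file_fdw;
    CREATE FOREIGN TABLE measurements (id int, reading numeric)
        SERVER csv_files
        OPTIONS (filename '/tmp/measurements.csv', format 'csv',
                 on_error 'ignore', reject_limit '5');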

Author: Atsushi Torikoshi
Reviewed-by: Fujii Masao, Yugo Nagata, Kirill Reshke
Discussion: https://postgr.es/m/bab68a9fc502b12693f0755b6f35f327@oss.nttdata.com
2024-11-20 23:53:19 +09:00
Michael Paquier
15afb7d61c doc: Fix section of functions age(xid) and mxid_age(xid)
In 17~, age(xid) and mxid_age(xid) were listed as deprecated.  Based on
the discussion that led to 48b5aa3143, this is not intentional as this
could break many existing monitoring queries.  Note that vacuumdb also
uses both of them.
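
A typical monitoring query of the kind that relies on them (a sketch):

    SELECT datname, age(datfrozenxid), mxid_age(datminmxid)
      FROM pg_database
     ORDER BY 2 DESC;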

In 16, both functions were listed under "Control Data Functions", which
is incorrect, so let's move them to the list of functions related to
transaction IDs and snapshots.

Author: Bertrand Drouvot
Discussion: https://postgr.es/m/Zzr2zZFyeFKXWe8a@ip-10-97-1-34.eu-west-3.compute.internal
Discussion: https://postgr.es/m/20231114013224.4z6oxa6p6va33rxr@awork3.anarazel.de
Backpatch-through: 16
2024-11-20 14:20:52 +09:00
Tom Lane
a43d7a8c7c Compare collations before merging UNION operations.
In the dim past we figured it was okay to ignore collations
when combining UNION set-operation nodes into a single N-way
UNION operation.  I believe that was fine at the time, but
it stopped being fine when we added nondeterministic collations:
the semantics of distinct-ness are affected by those.  v17 made
it even less fine by allowing per-child sorting operations to
be merged via MergeAppend, although I think we accidentally
avoided any live bug from that.

Add a check that collations match before deciding that two
UNION nodes are equivalent.  I also failed to resist the
temptation to comment plan_union_children() a little better.

Back-patch to all supported branches (v13 now), since they
all have nondeterministic collations.

Discussion: https://postgr.es/m/3605568.1731970579@sss.pgh.pa.us
2024-11-19 18:26:19 -05:00
Fujii Masao
c166454496 Improve error message for database object stats manipulation functions.
Previously, database object statistics manipulation functions like
pg_set_relation_stats() reported unclear error and hint messages
when executed during recovery. These messages were "internal",
making it difficult for users to understand the issue:

  ERROR:  cannot acquire lock mode ShareUpdateExclusiveLock on database objects while recovery is in progress
  HINT:  Only RowExclusiveLock or less can be acquired on database objects during recovery.

This commit updates the error handling so that, if these functions
are called during recovery, they produce clearer messages:

  ERROR:  recovery is in progress
  HINT:  Statistics cannot be modified during recovery.

The related documentation has also been updated to explicitly
clarify that these functions are not available during recovery.

Author: Fujii Masao
Reviewed-by: Heikki Linnakangas, Maxim Orlov
Discussion: https://postgr.es/m/6d313829-5f56-4a28-ae4b-bd01bf1ae791@oss.nttdata.com
2024-11-20 02:00:50 +09:00
Michael Paquier
a3699daea2 libpq: Improve error message when parsing URI parameters and keywords
The error message reported when parameters or keywords include too
much whitespace was "trailing data found", which was confusing because
there was no hint about what was actually wrong.

Issue introduced in 430ce189fc45, hence there is no need for a
backpatch.

Author: Yushi Ogiwara
Reviewed-by: Fujii Masao, Tom Lane, Bruce Momjian
Discussion: https://postgr.es/m/645bd22a53c4da8a1bc7e1e52d9d3b52@oss.nttdata.com
2024-11-19 13:27:42 +09:00
Bruce Momjian
b38bac26e2 doc: clarify pg_dump --no-comments meaning as SQL comments
Discussion: https://postgr.es/m/ZyjdAjEsXbFPkD3t@momjian.us

Backpatch-through: master
2024-11-18 16:30:33 -05:00
Bruce Momjian
cffca3665d doc: clarify text about combining row-level policies
Reported-by: splarv@ya.ru

Discussion: https://postgr.es/m/173045909386.700.9231055113418242392@wrigleys.postgresql.org

Backpatch-through: master
2024-11-18 15:34:59 -05:00
Peter Geoghegan
18ea6b3d0d nbtree: consistently use minoff variable.
This was arguably an oversight in commit 29b64d1de7, which moved this
code from nbtutils.c to its nbtsearch.c caller.
2024-11-18 13:35:28 -05:00
Michael Paquier
c1c09007e2 Improve some code format in gist.c
Author: Tender Wang
Discussion: https://postgr.es/m/CAHewXNmD=K7XmsHq=L1SyyzZYvwU4oaMG9EKSSMe4OrXfykLzg@mail.gmail.com
2024-11-18 13:41:10 +09:00
Michael Paquier
03a42c9652 Use pg_memory_is_all_zeros() in PageIsVerifiedExtended()
Relying on pg_memory_is_all_zeros(), which would apply SIMD instructions
when dealing with an aligned page, is proving to be at least three times
faster than the original size_t-based comparisons when checking if a
BLCKSZ page is full of zeros.  Note that PageIsVerifiedExtended() is
called each time a page is read from disk, and making it faster is a
good thing.

Author: Bertrand Drouvot
Discussion: https://postgr.es/m/CAApHDvq7P-JgFhgtxUPqhavG-qSDVUhyWaEX9M8_MNorFEijZA@mail.gmail.com
2024-11-18 11:52:35 +09:00
Michael Paquier
5be1dabd2a Optimize pg_memory_is_all_zeros() in memutils.h
pg_memory_is_all_zeros() is currently implemented to do only a
byte-per-byte comparison.  While being sufficient for its existing
callers for pgstats entries, it could lead to performance regressions
should it be used for larger memory areas, like 8kB blocks, or even
future commits.

This commit optimizes the implementation of this function to be more
efficient for larger sizes, written in a way that lets compilers
optimize the code.  This is portable across 32-bit and 64-bit architectures.

The implementation handles three cases, depending on the size of the
input provided:
* If less than sizeof(size_t), do a simple byte-by-byte comparison.
* If between sizeof(size_t) and (sizeof(size_t) * 8 - 1):
** Phase 1: byte-by-byte comparison, until the pointer is aligned.
** Phase 2: size_t comparisons, with aligned pointers, up to the last
   aligned location possible.
** Phase 3: byte-by-byte comparison, until the end location.
* If more than (sizeof(size_t) * 8) bytes, this is the same as case 2
except that an additional phase is placed between Phase 1 and Phase 2,
with 8 * sizeof(size_t) comparisons using bitwise OR, to encourage
compilers to use SIMD instructions if available.

The last improvement proves to be at least 3 times faster than the
size_t comparisons, which is what is currently used for the all-zero
page check in PageIsVerifiedExtended().

The optimization tricks that would encourage the use of SIMD
instructions have been suggested by David Rowley.

Author: Bertrand Drouvot
Reviewed-by: Michael Paquier, Ranier Vilela
Discussion: https://postgr.es/m/CAApHDvq7P-JgFhgtxUPqhavG-qSDVUhyWaEX9M8_MNorFEijZA@mail.gmail.com
2024-11-18 10:08:38 +09:00
Noah Misch
7b88529f43 Fix per-session activation of ALTER {ROLE|DATABASE} SET role.
After commit 5a2fed911a85ed6d8a015a6bafe3a0d9a69334ae, the catalog state
resulting from these commands ceased to affect sessions.  Restore the
longstanding behavior, which is like beginning the session with a SET
ROLE command.  If cherry-picking the CVE-2024-10978 fixes, default to
including this, too.  (This fixes an unintended side effect of fixing
CVE-2024-10978.)  Back-patch to v12, like that commit.  The release team
decided to include v12, despite the original intent to halt v12 commits
earlier this week.
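
For example (a sketch with made-up role names), after

    ALTER ROLE alice SET role = 'app_writer';

new sessions for alice again start out as if they had issued
SET ROLE app_writer.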

Tom Lane and Noah Misch.  Reported by Etienne LAFARGE.

Discussion: https://postgr.es/m/CADOZwSb0UsEr4_UTFXC5k7=fyyK8uKXekucd+-uuGjJsGBfxgw@mail.gmail.com
2024-11-15 20:39:56 -08:00
Masahiko Sawada
e5ed873b1b Fix a possibility of logical replication slot's restart_lsn going backwards.
Previously LogicalIncreaseRestartDecodingForSlot() accidentally
accepted any LSN as the candidate_lsn and candidate_valid after the
restart_lsn of the replication slot was updated, so it potentially
caused the restart_lsn to move backwards.

A scenario where this could happen in logical replication is: after a
logical replication restart, based on previous candidate_lsn and
candidate_valid values in memory, the restart_lsn advances upon
receiving a subscriber acknowledgment. Then, logical decoding restarts
from an older point, setting candidate_lsn and candidate_valid based
on an old RUNNING_XACTS record. Subsequent subscriber acknowledgments
then update the restart_lsn to an LSN older than the current value.

In the reported case, after WAL files were removed by a checkpoint,
the retreated restart_lsn prevented logical replication from
restarting due to missing WAL segments.

This change essentially modifies the 'if' condition to 'else if'
condition within the function. The previous code had an asymmetry in
this regard compared to LogicalIncreaseXminForSlot(), which does
almost the same thing for different fields.

The WAL removal issue was reported by Hubert Depesz Lubaczewski.

Backpatch to all supported versions, since the bug exists since 9.4
where logical decoding was introduced.

Reviewed-by: Tomas Vondra, Ashutosh Bapat, Amit Kapila
Discussion: https://postgr.es/m/Yz2hivgyjS1RfMKs%40depesz.com
Discussion: https://postgr.es/m/85fff40e-148b-4e86-b921-b4b846289132%40vondra.me
Backpatch-through: 13
2024-11-15 17:06:11 -08:00
Tom Lane
b69bdcee9c Avoid assertion due to disconnected NFA sub-graphs in regex parsing.
In commit 08c0d6ad6 which introduced "rainbow" arcs in regex NFAs,
I didn't think terribly hard about what to do when creating the color
complement of a rainbow arc.  Clearly, the complement cannot match any
characters, and I took the easy way out by just not building any arcs
at all in the complement arc set.  That mostly works, but Nikolay
Shaplov found a case where it doesn't: if we decide to delete that
sub-NFA later because it's inside a "{0}" quantifier, delsub()
suffered an assertion failure.  That's because delsub() relies on
the target sub-NFA being fully connected.  That was always true
before, and the best fix seems to be to restore that property.
Hence, invent a new arc type CANTMATCH that can be generated in
place of an empty color complement, and drop it again later when we
start NFA optimization.  (At that point we don't need to do delsub()
any more, and besides there are other cases where NFA optimization can
lead to disconnected subgraphs.)

It appears that this bug has no consequences in a non-assert-enabled
build: there will be some transiently leaked NFA states/arcs, but
they'll get cleaned up eventually.  Still, we don't like assertion
failures, so back-patch to v14 where rainbow arcs were introduced.

Per bug #18708 from Nikolay Shaplov.

Discussion: https://postgr.es/m/18708-f94f2599c9d2c005@postgresql.org
2024-11-15 18:23:38 -05:00
Fujii Masao
9a70f67667 Remove unnecessary backslash from CopyFrom() code.
Commit 4ac2a9bece accidentally added an unnecessary backslash
to CopyFrom() code. This commit removes it.

Author: Yugo Nagata
Reviewed-by: Tender Wang
Discussion: https://postgr.es/m/20241112114609.4175a2e175282edd1463dbc6@sraoss.co.jp
2024-11-16 01:59:33 +09:00
Peter Eisentraut
9321d2fdf8 Fix collation handling for foreign keys
Allowing foreign keys where the referenced and the referencing columns
have collations with different notions of equality is problematic.
This can only happen when using nondeterministic collations, for
example, if the referencing column is case-insensitive and the
referenced column is not, or vice versa.  It does not happen if both
collations are deterministic.

To show one example:

    CREATE COLLATION case_insensitive (provider = icu, deterministic = false, locale = 'und-u-ks-level2');

    CREATE TABLE pktable (x text COLLATE "C" PRIMARY KEY);
    CREATE TABLE fktable (x text COLLATE case_insensitive REFERENCES pktable ON UPDATE CASCADE ON DELETE CASCADE);
    INSERT INTO pktable VALUES ('A'), ('a');
    INSERT INTO fktable VALUES ('A');

    BEGIN; DELETE FROM pktable WHERE x = 'a'; TABLE fktable; ROLLBACK;
    BEGIN; DELETE FROM pktable WHERE x = 'A'; TABLE fktable; ROLLBACK;

Both of these DELETE statements delete the one row from fktable.  So
this means that one row from fktable references two rows in pktable,
which should not happen.  (That's why a primary key or unique
constraint is required on pktable.)

When nondeterministic collations were implemented, the SQL standard
available to yours truly said that referential integrity checks should
be performed with the collation of the referenced column, and so
that's how we implemented it.  But this turned out to be a mistake in
the SQL standard, for the same reasons as above, that was later
(SQL:2016) fixed to require both collations to be the same.  So that's
what we are aiming for here.

We don't have to be quite so strict.  We can allow different
collations if they are both deterministic.  This is also good for
backward compatibility.

So the new rule is that the collations either have to be the same or
both deterministic.  Or in other words, if one of them is
nondeterministic, then both have to be the same.
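
Under the new rule, the first example above is rejected, while two different
deterministic collations remain acceptable (a sketch reusing the objects from
the example above):

    -- rejected now: the referencing column uses a nondeterministic collation
    -- that differs from the referenced column's collation
    CREATE TABLE fktable (x text COLLATE case_insensitive REFERENCES pktable);

    -- still allowed: collations differ, but both are deterministic
    CREATE TABLE fktable2 (x text COLLATE "POSIX" REFERENCES pktable);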

Users upgrading from earlier versions who have affected setups will need to
make changes to their schemas (i.e., change one or both collations in
affected foreign-key relationships) before the upgrade will succeed.

Some of the nice test cases for the previous situation in
collate.icu.utf8.sql are now obsolete.  They are changed to just check
the error checking of the new rule.  Note that collate.sql already
contained a test for foreign keys with different deterministic
collations.

A bunch of code in ri_triggers.c that added a COLLATE clause to
enforce the referenced column's collation can be removed, because both
columns now have to have the same notion of equality, so it doesn't
matter which one to use.

Reported-by: Paul Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/78d824e0-b21e-480d-a252-e4b84bc2c24b@illuminatedcomputing.com
2024-11-15 14:55:54 +01:00
Álvaro Herrera
90bcc7c2db
Avoid deleting critical WAL segments during pg_rewind
Previously, in unlucky cases, it was possible for pg_rewind to remove
certain WAL segments from the rewound demoted primary.  In particular
this happens if those files have been marked for archival (i.e., their
.ready files were created) but not yet archived; the newly promoted node
no longer has such files because of them having been recycled, but they
are likely critical for recovery in the demoted node.  If pg_rewind
removes them, recovery is not possible anymore.

Fix this by maintaining a hash table of files in this situation in the
scan that looks for a checkpoint, which the decide_file_actions phase
can consult so that it knows to preserve them.

Backpatch to 14.  The problem also exists in 13, but that branch was not
blessed with commit eb00f1d4bf96, so this patch is difficult to apply
there.  Users of older releases will just have to continue to be extra
careful when rewinding.

Co-authored-by: Полина Бунгина (Polina Bungina) <bungina@gmail.com>
Co-authored-by: Alexander Kukushkin <cyberdemn@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Discussion: https://postgr.es/m/CAAtGL4AhzmBRsEsaDdz7065T+k+BscNadfTqP1NcPmsqwA5HBw@mail.gmail.com
2024-11-15 12:53:12 +01:00
Peter Eisentraut
d31bbfb659 Proper object locking for GRANT/REVOKE
Refactor objectNamesToOids() to use get_object_address() internally if
possible.  Not only does this save a lot of code, it also allows us to
use the object locking provided by get_object_address() for
GRANT/REVOKE.  There was previously a code comment that complained
about the lack of locking in objectNamesToOids(), which is now fixed.

The check in ExecGrant_Type_check() is obsolete because
get_object_address_type() already does the same check.

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/bf72b82c-124d-4efa-a484-bb928e9494e4@eisentraut.org
2024-11-15 11:03:48 +01:00
Heikki Linnakangas
cfd7f36c83 jit: Stop emitting some unnecessary instructions
In EEOP_BOOL_AND_STEP* and EEOP_BOOL_OR_STEP*, we emitted pointless
store instructions to resnull/resvalue, storing values that were just
loaded from the same fields in the previous instructions. They will
surely get optimized away by LLVM if any optimizations are enabled,
but it's better to not emit them in the first place. In
EEOP_BOOL_NOT_STEP, similar story with resnull.

In EEOP_NULLIF, when it returns NULL, there was also a redundant store
to resvalue just after storing a 0 to it. The value of resvalue
doesn't matter when resnull is set, so in fact even storing the 0 is
unnecessary, but I kept that because we tend to do that for general
tidiness.

Author: Xing Guo <higuoxing@gmail.com>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/CACpMh%2BC%3Dg13WdvzLRSponsVWGgxwDSMzQWM4Gz0heOyaA0-N6g@mail.gmail.com
2024-11-15 10:06:36 +02:00
Peter Eisentraut
e468ec0fdd Add an assertion in get_object_address()
Some places declared a Relation before calling get_object_address()
only to assert that the relation is NULL after the call.

The new assertion allows passing NULL as the relation argument at
those places making the code cleaner and easier to understand.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://www.postgresql.org/message-id/ZzG34eNrT83W/Orz@ip-10-97-1-34.eu-west-3.compute.internal
2024-11-15 08:52:43 +01:00
Michael Paquier
818119afcc Fix race conditions with drop of reused pgstats entries
This fixes a set of race conditions with cumulative statistics where a
shared stats entry could be dropped while it should still be valid, in
the event that it is reused: an entry may refer to a different object
but require the same hash key.  This can happen with various stats
kinds, like:
- Replication slots that compute internally an index number, for
different slot names.
- Stats kinds that use an OID in the object key, where a wraparound
causes the same key to be used if an OID is used for the same object.
- As of PostgreSQL 18, custom pgstats kinds could also be an issue,
depending on their implementation.

This issue is fixed by introducing a counter called "generation" in the
shared entries via PgStatShared_HashEntry, initialized at 0 when an
entry is created and incremented when the same entry is reused, to avoid
concurrent issues on drop because of other backends still holding a
reference to it.  This "generation" is copied to the local copy that a
backend holds when looking at an object, then cross-checked with the
shared entry to make sure that the entry is not dropped if it has been
reused, even if its "refcount" would otherwise justify dropping it.

This problem could show up when a backend shuts down and needs to
discard any entries it still holds, causing statistics to be removed
when they should not, or even an assertion failure.  Another report
involved a failure in a standby after an OID wraparound, where the
startup process would FATAL on a "can only drop stats once", stopping
recovery abruptly.  The buildfarm has been sporadically complaining
about the problem, as well, but the window is hard to reach with the
in-core tests.

Note that the issue can be reproduced easily by adding a sleep before
dshash_find() in pgstat_release_entry_ref() to enlarge the problematic
window while repeating test_decoding's isolation test oldest_xmin a
couple of times, for example, as pointed out by Alexander Lakhin.

Reported-by: Alexander Lakhin, Peter Smith
Author: Kyotaro Horiguchi, Michael Paquier
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/CAA4eK1KxuMVyAryz_Vk5yq3ejgKYcL6F45Hj9ZnMNBS-g+PuZg@mail.gmail.com
Discussion: https://postgr.es/m/17947-b9554521ad963c9c@postgresql.org
Backpatch-through: 15
2024-11-15 11:31:58 +09:00
Heikki Linnakangas
5b00786857 Pass MyPMChildSlot as an explicit argument to child process
All the other global variables passed from postmaster to child have
the same value in all the processes, while MyPMChildSlot is more like
a parameter to each child process.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/a102f15f-eac4-4ff2-af02-f9ff209ec66f@iki.fi
2024-11-14 16:12:32 +02:00
Heikki Linnakangas
a78af04270 Assign a child slot to every postmaster child process
Previously, only backends, autovacuum workers, and background workers
had an entry in the PMChildFlags array. With this commit, all
postmaster child processes, including all the aux processes, have an
entry. Dead-end backends still don't get an entry, though, and other
processes that don't touch shared memory will never mark their
PMChildFlags entry as active.

We now maintain separate freelists for different kinds of child
processes. That ensures that there are always slots available for
autovacuum and background workers. Previously, pre-authentication
backends could prevent autovacuum or background workers from starting
up, by using up all the slots.

The code to manage the slots in the postmaster process is in a new
pmchild.c source file, because postmaster.c is already so large.
Assigning pmsignal slot numbers is now pmchild.c's responsibility.
This replaces the PMChildInUse array in pmsignal.c.

Some of the comments in postmaster.c still talked about the "stats
process", but that was removed in commit 5891c7a8ed. Fix those while
we're at it.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/a102f15f-eac4-4ff2-af02-f9ff209ec66f@iki.fi
2024-11-14 16:12:28 +02:00
Heikki Linnakangas
bb861414fe Kill dead-end children when there's nothing else left
Previously, the postmaster would never try to kill dead-end child
processes, even if there were no other processes left. A dead-end
backend will eventually exit, when authentication_timeout expires, but
if a dead-end backend is the only thing that's preventing the server
from shutting down, it seems better to kill it immediately. It's
particularly important if there was a bug in the early startup code
that prevented a dead-end child from timing out and exiting normally.

Includes a test for that case where a dead-end backend previously
prevented the server from shutting down.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/a102f15f-eac4-4ff2-af02-f9ff209ec66f@iki.fi
2024-11-14 16:12:04 +02:00
Heikki Linnakangas
18d67a8d7d Replace postmaster.c's own backend type codes with BackendType
Introduce a separate BackendType for dead-end children, so that we
don't need a separate dead_end flag.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/a102f15f-eac4-4ff2-af02-f9ff209ec66f@iki.fi
2024-11-14 16:06:16 +02:00
Peter Eisentraut
a274bbb1b3 Remove a useless cast to (void *) in hash_search() call
This pattern was previously cleaned up in 54a177a948b, but a new
instance snuck in around the same time in 31966b151e6.
2024-11-14 09:30:14 +01:00
Michael Paquier
13e3796c90 contrib/lo: Use SQL-standard function bodies
Author: Ronan Dunklau
Discussion: https://postgr.es/m/3316564.aeNJFYEL58@aivenlaptop
2024-11-14 13:23:11 +09:00
Michael Paquier
93f9b4a93f xml2: Add tests for functions xpath_nodeset() and xpath_list()
These two functions with their different argument lists have never been
tested in this module, so let's add something.

Author: Ronan Dunklau
Discussion: https://postgr.es/m/ZzMSJkiNZhimjXWx@paquier.xyz
2024-11-14 13:10:36 +09:00
Michael Paquier
3ef038fc4f contrib/lo: Add test for function lo_oid()
Author: Ronan Dunklau
Discussion: https://postgr.es/m/ZzMSJkiNZhimjXWx@paquier.xyz
2024-11-14 12:24:00 +09:00
Peter Geoghegan
4e6e375b00 Add nbtree amgettuple return item function.
This makes it easier to add precondition assertions.  We now assert that
the last call to _bt_readpage succeeded, and that the current item index
is within the bounds of the currPos items array.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Discussion: https://postgr.es/m/CAH2-WznFkEs9K1PtNruti5JjawY-dwj+gkaEh_k1ZE+1xLLGkA@mail.gmail.com
2024-11-13 09:50:57 -05:00
Álvaro Herrera
38c18710b3
Fix pg_upgrade's cross-version tests when old < 18
Because in the 18 cycle we turned checksums on by default with commit
04bec894a04c, and pg_upgrade fails if the setting doesn't match in old
and new clusters, the built-in cross-version pg_upgrade test is failing
if the old version is older than 18.  Fix the script so that it creates
the old cluster with checksums enabled (-k) in cross-version scenarios.

This went unnoticed because the buildfarm doesn't use the same test code
for cross-version testing.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/202411071838.7fgkb7uvavvz@alvherre.pgsql
2024-11-13 11:06:44 +01:00
Peter Eisentraut
f05b5e6346 configure.ac: Remove useless AC_SUBST
No longer used since commit 805e431a386.
2024-11-13 10:29:31 +01:00
Peter Eisentraut
f683ba0867 doc: Update pg_constraint.conexclop docs for WITHOUT OVERLAPS
Fixup for commit fc0438b4e80.

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://www.postgresql.org/message-id/57ea0668-5205-426e-b934-efc89f2186c2@illuminatedcomputing.com
2024-11-13 09:05:02 +01:00
Peter Eisentraut
d56af4c882 doc: Add PERIOD to ALTER TABLE reference docs
Commit 89f908a6d0a documented foreign keys with PERIOD in the CREATE
TABLE docs, but not in ALTER TABLE.  This commit adds the new syntax
to the ALTER TABLE docs.
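
For reference, the ALTER TABLE form being documented looks roughly like this
(table and column names are illustrative, and the referenced table is assumed
to have a matching temporal primary key):

    ALTER TABLE booking
        ADD FOREIGN KEY (room_id, PERIOD valid_during)
        REFERENCES room (room_id, PERIOD valid_during);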

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://www.postgresql.org/message-id/57ea0668-5205-426e-b934-efc89f2186c2@illuminatedcomputing.com
2024-11-13 08:53:08 +01:00
Peter Eisentraut
94daf80bd1 doc: Small improvement in CREATE TABLE / PERIOD documentation
Use placeholders that are more consistent and match the description
better.  Fixup for commit 89f908a6d0a.
2024-11-13 08:51:23 +01:00
Peter Eisentraut
bf62105950 doc: Add WITHOUT OVERLAPS to ALTER TABLE reference docs
Commit fc0438b4e80 documented WITHOUT OVERLAPS in the CREATE TABLE
docs, but not in ALTER TABLE.  This commit adds the new syntax to the
ALTER TABLE docs.
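
The corresponding ALTER TABLE form looks roughly like this (names are
illustrative; valid_during is assumed to be a range-typed column):

    ALTER TABLE room
        ADD PRIMARY KEY (room_id, valid_during WITHOUT OVERLAPS);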

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://www.postgresql.org/message-id/57ea0668-5205-426e-b934-efc89f2186c2@illuminatedcomputing.com
2024-11-13 08:42:34 +01:00
Michael Paquier
d74b590983 Fix comment in injection_point.c
InjectionPointEntry->name was described as a hash key, which was fine
when introduced in d86d20f0ba79, but it is not now.

Oversight in 86db52a5062a, that has changed the way injection points are
stored in shared memory from a hash table to an array.

Backpatch-through: 17
2024-11-13 13:58:09 +09:00
Peter Geoghegan
3be30d0075 Fix obsolete nbtree page reuse FSM comment.
Oversight in commit d088ba5a.
2024-11-12 22:09:00 -05:00
Peter Geoghegan
93063e2e42 Count contrib/bloom index scans in pgstat view.
Maintain the pg_stat_user_indexes.idx_scan pgstat counter during
contrib/Bloom index scans.

Oversight in commit 9ee014fc, which added the Bloom index contrib
module.
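
With the fix in place, a Bloom index scan shows up in the view like any other
index AM (a quick sketch; object names are illustrative):

    CREATE EXTENSION bloom;
    CREATE TABLE t (a int4, b int4);
    CREATE INDEX t_bloom_idx ON t USING bloom (a, b);
    -- after a query that scans t_bloom_idx:
    SELECT idx_scan FROM pg_stat_user_indexes
     WHERE indexrelname = 't_bloom_idx';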

Author: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/c48839d881388ee401a01807c686004d@oss.nttdata.com
Backpatch: 13- (all supported branches).
2024-11-12 20:57:45 -05:00
Amit Langote
bfeeb065ea Add missing word in comment
Discussion: https://postgr.es/m/CA+HiwqFgdp8=0_gi+DU0fPWZbg7qY3KZ_c1Wj1DEvzXC4BCnMQ@mail.gmail.com
2024-11-12 20:39:57 +09:00
Álvaro Herrera
ff239c3bf4
Silence compilers about extractNotNullColumn()
Multiple buildfarm animals warn that a newly added Assert() is
impossible to fail; remove it to avoid the noise.  While at it, use
direct assignment to obtain the value we need, avoiding an unnecessary
memcpy().

(I decided to remove the "pfree" call for the detoasted short-datum;
because this is only used for DDL, it's not problematic to leak such a
small allocation.)

Noted by Tom Lane about 14e87ffa5c54.

Discussion: https://postgr.es/m/3649828.1731083171@sss.pgh.pa.us
2024-11-12 11:35:43 +01:00
Michael Paquier
3f323eba89 pg_freespacemap: Use SQL-standard function bodies
72a5b1fc8804 was the piece missing for the conversion of this module.
pg_freespace is bumped to 1.3, with its function pg_freespace(regclass)
converted to this new style.

There are other modules in the tree that need a similar treatment; these
will be handled later.
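
For context, a SQL-standard function body is the style introduced in
PostgreSQL 14 where the body is parsed at definition time rather than kept as
a string (a generic sketch, not this module's actual function):

    CREATE FUNCTION add_one(x int) RETURNS int
        LANGUAGE SQL
        RETURN x + 1;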

Author: Tom Lane
Reviewed-by: Ronan Dunklau
Discussion: https://postgr.es/m/3395418.1618352794@sss.pgh.pa.us
2024-11-12 17:28:03 +09:00
Alexander Korotkov
db22b90024 Fix arrays comparison in CompareOpclassOptions()
The current code calls array_eq() and does not provide FmgrInfo.  This commit
provides initialization of FmgrInfo and uses C collation as the safe option
for text comparison because we don't know anything about the semantics of
opclass options.

Backpatch to 13, where opclass options were introduced.

Reported-by: Nicolas Maus
Discussion: https://postgr.es/m/18692-72ea398df3ec6712%40postgresql.org
Backpatch-through: 13
2024-11-12 01:44:20 +02:00
Tom Lane
73c9f91a1b Parallel workers use AuthenticatedUserId for connection privilege checks.
Commit 5a2fed911 had an unexpected side-effect: the parallel worker
launched for the new test case would fail if it couldn't use a
superuser-reserved connection slot.  The reason that test failed
while all our pre-existing ones worked is that the connection
privilege tests in InitPostgres had been based on the superuserness
of the leader's AuthenticatedUserId, but after the rearrangements
of 5a2fed911 we were testing the superuserness of CurrentUserId,
which the new test case deliberately made to be a non-superuser.

This all seems very accidental and probably not the behavior we really
want, but a security patch is no time to be redesigning things.
Pending some discussion about desirable semantics, hack it so that
InitPostgres continues to pay attention to the superuserness of
AuthenticatedUserId when starting a parallel worker.

Nathan Bossart and Tom Lane, per buildfarm member sawshark.

Security: CVE-2024-10978
2024-11-11 17:05:53 -05:00
Tom Lane
c4252c9ef0 Fix cross-version upgrade tests.
TestUpgradeXversion knows how to make the main regression database's
references to pg_regress.so be version-independent.  But it doesn't
do that for plperl's database, so that the C function added by
commit b7e3a52a8 is causing cross-version upgrade test failures.
Path of least resistance is to just drop the function at the end
of the new test.

In <= v14, also take the opportunity to clean up the generated
test files.

Security: CVE-2024-10979
2024-11-11 13:57:21 -05:00
Tom Lane
a34c33fd22 Avoid bizarre meson behavior with backslashes in command arguments.
meson makes the backslashes in text2macro.pl's --strip argument
into forward slashes, effectively disabling comment stripping.
That hasn't caused us issues before, but it breaks the test case
for b7e3a52a8.  We don't really need the pattern to be adjustable,
so just hard-wire it into the script instead.

Context: https://github.com/mesonbuild/meson/issues/1564
Security: CVE-2024-10979
2024-11-11 12:20:08 -05:00
Tom Lane
5a2fed911a Fix improper interactions between session_authorization and role.
The SQL spec mandates that SET SESSION AUTHORIZATION implies
SET ROLE NONE.  We tried to implement that within the lowest-level
functions that manipulate these settings, but that was a bad idea.
In particular, guc.c assumes that it doesn't matter in what order
it applies GUC variable updates, but that was not the case for these
two variables.  This problem, compounded by some hackish attempts to
work around it, led to some security-grade issues:

* Rolling back a transaction that had done SET SESSION AUTHORIZATION
would revert to SET ROLE NONE, even if that had not been the previous
state, so that the effective user ID might now be different from what
it had been.

* The same for SET SESSION AUTHORIZATION in a function SET clause.

* If a parallel worker inspected current_setting('role'), it saw
"none" even when it should see something else.

Also, although the parallel worker startup code intended to cope
with the current role's pg_authid row having disappeared, its
implementation of that was incomplete so it would still fail.

Fix by fully separating the miscinit.c functions that assign
session_authorization from those that assign role.  To implement the
spec's requirement, teach set_config_option itself to perform "SET
ROLE NONE" when it sets session_authorization.  (This is undoubtedly
ugly, but the alternatives seem worse.  In particular, there's no way
to do it within assign_session_authorization without incompatible
changes in the API for GUC assign hooks.)  Also, improve
ParallelWorkerMain to directly set all the relevant user-ID variables
instead of relying on some of them to get set indirectly.  That
allows us to survive not finding the pg_authid row during worker
startup.
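
From a session's point of view, the spec-mandated behavior now implemented in
set_config_option looks like this (role names are illustrative):

    SET ROLE some_role;
    SET SESSION AUTHORIZATION other_user;
    SHOW role;      -- back to "none", per the spec's implied SET ROLE NONE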

In v16 and earlier, this includes back-patching 9987a7bf3 which
fixed a violation of GUC coding rules: SetSessionAuthorization
is not an appropriate place to be throwing errors from.

Security: CVE-2024-10978
2024-11-11 10:29:54 -05:00
Nathan Bossart
cd7ab57532 Ensure cached plans are correctly marked as dependent on role.
If a CTE, subquery, sublink, security invoker view, or coercion
projection references a table with row-level security policies, we
neglected to mark the plan as potentially dependent on which role
is executing it.  This could lead to later executions in the same
session returning or hiding rows that should have been hidden or
returned instead.

Reported-by: Wolfgang Walther
Reviewed-by: Noah Misch
Security: CVE-2024-10976
Backpatch-through: 12
2024-11-11 09:00:00 -06:00
Noah Misch
b7e3a52a87 Block environment variable mutations from trusted PL/Perl.
Many process environment variables (e.g. PATH), bypass the containment
expected of a trusted PL.  Hence, trusted PLs must not offer features
that achieve setenv().  Otherwise, an attacker having USAGE privilege on
the language often can achieve arbitrary code execution, even if the
attacker lacks a database server operating system user.

To fix PL/Perl, replace trusted PL/Perl %ENV with a tied hash that just
replaces each modification attempt with a warning.  Sites that reach
these warnings should evaluate the application-specific implications of
proceeding without the environment modification:

  Can the application reasonably proceed without the modification?

    If no, switch to plperlu or another approach.

    If yes, the application should change the code to stop attempting
    environment modifications.  If that's too difficult, add "untie
    %main::ENV" in any code executed before the warning.  For example,
    one might add it to the start of the affected function or even to
    the plperl.on_plperl_init setting.

In passing, link to Perl's guidance about the Perl features behind the
security posture of PL/Perl.

Back-patch to v12 (all supported versions).

Andrew Dunstan and Noah Misch

Security: CVE-2024-10979
2024-11-11 06:23:43 -08:00
Amit Kapila
220cea9411 Doc: Add links to clarify the max_replication_slots.
The GUC max_replication_slots has a different meaning for sending servers
and subscribers. Add cross-links in each section for easy reference.

Author: Tristan Partin
Discussion: https://postgr.es/m/D5FNEPMMFHFX.1OQBCML0TU5AH@partin.io
2024-11-11 15:24:40 +05:30
Michael Paquier
e7a9496de9 Add two attributes to pg_stat_database for parallel workers activity
Two attributes are added to pg_stat_database:
* parallel_workers_to_launch, counting the total number of parallel
workers that were planned to be launched.
* parallel_workers_launched, counting the total number of parallel
workers actually launched.

The ratio of both fields can provide hints that there are not enough
slots available when launching parallel workers, also useful when
pg_stat_statements is not deployed on an instance (i.e. cf54a2c00254).
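
For example, databases that plan more workers than they manage to launch can
be spotted directly from the view:

    SELECT datname,
           parallel_workers_to_launch,
           parallel_workers_launched
      FROM pg_stat_database
     WHERE parallel_workers_to_launch > parallel_workers_launched;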

This commit relies on de3a2ea3b264, that has added two fields to EState,
that get incremented when executing Gather or GatherMerge nodes.

A test is added in select_parallel, where parallel workers are spawned.

Bump catalog version.

Author: Benoit Lobréau
Discussion: https://postgr.es/m/783bc7f7-659a-42fa-99dd-ee0565644e25@dalibo.com
2024-11-11 10:40:48 +09:00
Michael Paquier
bf8835ea97 libpq: Bail out during SSL/GSS negotiation errors
This commit changes libpq so that errors reported by the backend during
the protocol negotiation for SSL and GSS are discarded by the client, as
these may include arbitrary bytes that, if consumed by the client, could
be written to the client's terminal.

A failure with the SSL negotiation now leads to an error immediately
reported, without a retry on any other methods allowed, like a fallback
to a plaintext connection.

A failure with GSS discards the error message received, and we allow a
fallback as it may be possible that the error is caused by a connection
attempt with a pre-v12 server, GSS encryption having been introduced in
v12.  This was a problem only with v17 and newer versions; older
versions discard the error message already in this case, assuming a
failure caused by a lack of support for GSS encryption.

Author: Jacob Champion
Reviewed-by: Peter Eisentraut, Heikki Linnakangas, Michael Paquier
Security: CVE-2024-10977
Backpatch-through: 12
2024-11-11 10:19:52 +09:00
Michael Paquier
5d4298e75f pg_stat_statements: Avoid some locking during PGSS entry scans
A single PGSS entry's spinlock is used to be able to modify "counters"
without holding pgss->lock exclusively, as mentioned at the top of
pg_stat_statements.c and within pgssEntry.

Within a single pgssEntry, stats_since and minmax_stats_since are never
modified without holding pgss->lock exclusively, so there is no need to
hold an entry's spinlock when reading stats_since and
minmax_stats_since, as done when scanning all the PGSS entries for
function calls of pg_stat_statements().

This also restores the consistency between the code and the comments
about the entry's spinlock usage.  This change is a performance
improvement (it can be argued that this is a logic bug), so there is no
need for a backpatch.  This saves two instructions from being read while
holding an entry's spinlock.

Author: Karina Litskevich
Reviewed-by: Michael Paquier, wenhui qiu
Discussion: https://postgr.es/m/CACiT8ibhCmzbcOxM0v4pRLH3abk-95LPkt7_uC2JMP+miPjxsg@mail.gmail.com
2024-11-11 09:02:30 +09:00
Thomas Munro
29d66b2d2f jit: Remove obsolete LLVM version guard.
Commit 9044fc1d needed a version guard when back-patched, but it is
redundant in master as of commit 972c2cd2, and I accidentally left it
in there.
2024-11-11 12:07:24 +13:00
Nathan Bossart
0fa6884065 Fix sign-compare warnings in pg_iovec.h.
The code in question (pg_preadv() and pg_pwritev()) has been around
for a while, but commit 15c9ac3629 moved it to a header file.  If
third-party code that includes this header file is built with
-Wsign-compare on a system without preadv() or pwritev(), warnings
ensue.  This commit fixes said warnings by casting the result of
pg_pread()/pg_pwrite() to size_t, which should be safe because we
will have already checked for a negative value.

Author: Wolfgang Walther
Discussion: https://postgr.es/m/16989737-1aa8-48fd-8dfe-b7ada06509ab%40technowledgy.de
Backpatch-through: 17
2024-11-08 16:11:08 -06:00
Peter Geoghegan
caca6d8d27 Assert consistency of currPage that ended scan.
When _bt_readnextpage is called with our nbtree parallel scan already
seized (i.e. when it is directly called by _bt_first), we never expect a
prior call to _bt_readpage for lastcurrblkno to already indicate that
the scan should end -- the _bt_first caller's blkno must always be read.
After all, the "prior" _bt_readpage call (the call for lastcurrblkno)
probably took place in some other backend (and it might not even have
finished by the time our backend reaches _bt_first/_bt_readnextpage).

Add a documenting assertion to the path where _bt_readnextpage ends the
parallel scan based on information about lastcurrblkno from so->currPos.
Assert that the most recent _bt_readpage call that set so->currPos is in
fact lastcurrblkno's _bt_readpage call.

Follow-up to bugfix commit b5ee4e52.
2024-11-08 16:34:41 -05:00
Nathan Bossart
4225276e25 Move check for USE_AVX512_POPCNT_WITH_RUNTIME_CHECK.
Unlike TRY_POPCNT_FAST, which is defined in pg_bitutils.h, this
macro is defined in c.h (via pg_config.h), so we can check for it
earlier and avoid some unnecessary #includes on systems that lack
AVX-512 support.

Oversight in commit f78667bd91.

Discussion: https://postgr.es/m/Zy5K5Qmlb3Z4dsd4%40nathan
2024-11-08 14:25:28 -06:00
Tom Lane
b8df690492 Improve fix for not entering parallel mode when holding interrupts.
Commit ac04aa84a put the shutoff for this into the planner, which is
not ideal because it doesn't prevent us from re-using a previously
made parallel plan.  Revert the planner change and instead put the
shutoff into InitializeParallelDSM, modeling it on the existing code
there for recovering from failure to allocate a DSM segment.

However, that code path is mostly untested, and testing a bit harder
showed there's at least one bug: ExecHashJoinReInitializeDSM is not
prepared for us to have skipped doing parallel DSM setup.  I also
thought the Assert in ReinitializeParallelWorkers is pretty
ill-advised, and replaced it with a silent Min() operation.

The existing test case added by ac04aa84a serves fine to test this
version of the fix, so no change needed there.

Patch by me, but thanks to Noah Misch for the core idea that we
could shut off worker creation when !INTERRUPTS_CAN_BE_PROCESSED.
Back-patch to v12, as ac04aa84a was.

Discussion: https://postgr.es/m/CAC-SaSzHUKT=vZJ8MPxYdC_URPfax+yoA1hKTcF4ROz_Q6z0_Q@mail.gmail.com
2024-11-08 13:42:10 -05:00
Peter Geoghegan
b5ee4e5202 Avoid nbtree parallel scan currPos confusion.
Commit 1bd4bc85, which refactored nbtree sibling link traversal, made
_bt_parallel_seize reset the scan's currPos so that things were
consistent with the state of a serial backend moving between pages.
This overlooked the fact that _bt_readnextpage relied on the existing
currPos state to decide when to end the scan -- even though it came from
before the scan was seized.  As a result of all this, parallel nbtree
scans could needlessly behave like full index scans.

To fix, teach _bt_readnextpage to explicitly allow the use of an already
read page's so->currPos when deciding whether to end the scan -- even
during parallel index scans (allow it consistently now).  This requires
moving _bt_readnextpage's seizure of the scan to earlier in its loop.
That way _bt_readnextpage either deals with the true so->currPos state,
or an initialized-by-_bt_parallel_seize currPos state set from when the
scan was seized.  Now _bt_steppage (the most important _bt_readnextpage
caller) takes the same uniform approach to setting up its call using
details taken from so->currPos -- regardless of whether the scan happens
to be parallel or serial.

The new loop structure in _bt_readnextpage is prone to getting confused
by P_NONE blknos set when the rightmost or leftmost page was reached.
We could avoid that by adding an explicit check, but that would be ugly.
Avoid this problem by teaching _bt_parallel_seize to end the parallel
scan instead of returning a P_NONE next block/blkno.  Doing things this
way was arguably a missed opportunity for commit 1bd4bc85.  It allows us
to remove a similar "blkno == P_NONE" check from _bt_first.

Oversight in commit 1bd4bc85, which refactored sibling link traversal
(as part of optimizing nbtree backward scan locking).

Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Diagnosed-By: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Reviewed-By: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Discussion: https://postgr.es/m/f8efb9c0f8d1a71b44fd7f8e42e49c25@oss.nttdata.com
2024-11-08 13:10:10 -05:00
Álvaro Herrera
14e87ffa5c
Add pg_constraint rows for not-null constraints
We now create contype='n' pg_constraint rows for not-null constraints on
user tables.  Only one such constraint is allowed for a column.
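
For example, the new rows can be inspected directly (a sketch; table and
column names are illustrative):

    CREATE TABLE person (id int NOT NULL, name text NOT NULL);

    SELECT conname, contype, conrelid::regclass
      FROM pg_constraint
     WHERE conrelid = 'person'::regclass AND contype = 'n';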

We propagate these constraints to other tables during operations such as
adding inheritance relationships, creating and attaching partitions and
creating tables LIKE other tables.  These related constraints mostly
follow the well-known rules of conislocal and coninhcount that we have
for CHECK constraints, with some adaptations: for example, as opposed to
CHECK constraints, we don't match not-null ones by name when descending
a hierarchy to alter or remove it, instead matching by the name of the
column that they apply to.  This means we don't require the constraint
names to be identical across a hierarchy.

The inheritance status of these constraints can be controlled: now we
can be sure that if a parent table has one, then all children will have
it as well.  They can optionally be marked NO INHERIT, and then children
are free not to have one.  (There's currently no support for altering a
NO INHERIT constraint into inheriting down the hierarchy, but that's a
desirable future feature.)

This also opens the door for having these constraints be marked NOT
VALID, as well as allowing UNIQUE+NOT NULL to be used for functional
dependency determination, as envisioned by commit e49ae8d3bc58.  It's
likely possible to allow DEFERRABLE constraints as followup work, as
well.

psql shows these constraints in \d+, though we may want to reconsider if
this turns out to be too noisy.  Earlier versions of this patch hid
constraints that were on the same columns of the primary key, but I'm
not sure that that's very useful.  If clutter is a problem, we might be
better off inventing a new \d++ command and not showing the constraints
in \d+.

For now, we omit these constraints on system catalog columns, because
they're unlikely to achieve anything.

The main difference to the previous attempt at this (b0e96f311985) is
that we now require that such a constraint always exists when a primary
key is in the column; we didn't require this previously which had a
number of unpalatable consequences.  With this requirement, the code is
easier to reason about.  For example:

- We no longer have "throwaway constraints" during pg_dump.  We needed
  those for the case where a table had a PK without a not-null
  underneath, to prevent a slow scan of the data during restore of the
  PK creation, which was particularly problematic for pg_upgrade.

- We no longer have to cope with attnotnull being set spuriously in
  case a primary key is dropped indirectly (e.g., via DROP COLUMN).

Some bits of code in this patch were authored by Jian He.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Author: Bernd Helmle <mailings@oopsware.de>
Reviewed-by: 何建 (jian he) <jian.universality@gmail.com>
Reviewed-by: 王刚 (Tender Wang) <tndrwang@gmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/202408310358.sdhumtyuy2ht@alvherre.pgsql
2024-11-08 13:28:48 +01:00
Amit Langote
075acdd933 Disallow partitionwise join when collations don't match
If the collation of any join key column doesn’t match the collation of
the corresponding partition key, partitionwise joins can yield incorrect
results. For example, rows that would match under the join key collation
might be located in different partitions due to the partitioning
collation. A partitionwise join would then yield different
results from a non-partitionwise join, so disallow it in such cases.
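
A sketch of the kind of setup that is now excluded from partitionwise join
(collation and table names are illustrative):

    CREATE COLLATION ci (provider = icu, deterministic = false, locale = 'und-u-ks-level2');
    CREATE TABLE t1 (x text COLLATE "C") PARTITION BY LIST (x);
    CREATE TABLE t2 (x text COLLATE "C") PARTITION BY LIST (x);

    -- the join key collation (ci) differs from the partition key collation ("C"),
    -- so a partitionwise join is no longer considered for this join
    SELECT * FROM t1 JOIN t2 ON t1.x = t2.x COLLATE ci;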

Reported-by: Tender Wang <tndrwang@gmail.com>
Author: Jian He <jian.universality@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/CAHewXNno_HKiQ6PqyLYfuqDtwp7KKHZiH1J7Pqyz0nr+PS2Dwg@mail.gmail.com
Backpatch-through: 12
2024-11-08 17:25:24 +09:00
Amit Langote
90fe6251c8 Disallow partitionwise grouping when collations don't match
If the collation of any grouping column doesn’t match the collation of
the corresponding partition key, partitionwise grouping can yield
incorrect results. For example, rows that would be grouped under the
grouping collation may end up in different partitions under the
partitioning collation. In such cases, full partitionwise grouping
would produce results that differ from those without partitionwise
grouping, so disallow it.

Partial partitionwise aggregation is still allowed, as the Finalize
step reconciles partition-level aggregates with grouping requirements
across all partitions, ensuring that the final output remains
consistent.

This commit also fixes group_by_has_partkey() by ensuring the
RelabelType node is stripped from grouping expressions when matching
them to partition key expressions to avoid false mismatches.

Bug: #18568
Reported-by: Webbo Han <1105066510@qq.com>
Author: Webbo Han <1105066510@qq.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/18568-2a9afb6b9f7e6ed3@postgresql.org
Discussion: https://postgr.es/m/tencent_9D9103CDA420C07768349CC1DFF88465F90A@qq.com
Discussion: https://postgr.es/m/CAHewXNno_HKiQ6PqyLYfuqDtwp7KKHZiH1J7Pqyz0nr+PS2Dwg@mail.gmail.com
Backpatch-through: 12
2024-11-08 16:07:22 +09:00
Richard Guo
f00ab1fd15 Fix inconsistent RestrictInfo serial numbers
When we generate multiple clones of the same qual condition to cope
with outer join identity 3, we need to ensure that all the clones get
the same serial number.  To achieve this, we reset the
root->last_rinfo_serial counter each time we produce RestrictInfo(s)
from the qual list (see deconstruct_distribute_oj_quals).  This
approach works only if we ensure that we are not changing the qual
list in any way that'd affect the number of RestrictInfos built from
it.

However, with b262ad440, an IS NULL qual on a NOT NULL column might
result in an additional constant-FALSE RestrictInfo.  And different
versions of the same qual clause can lead to different conclusions
about whether it can be reduced to constant-FALSE.  This would affect
the number of RestrictInfos built from the qual list for different
versions, causing inconsistent RestrictInfo serial numbers across
multiple clones of the same qual.  This inconsistency can confuse
users of these serial numbers, such as rebuild_joinclause_attr_needed,
and lead to planner errors such as "ERROR:  variable not found in
subplan target lists".

To fix, reset the root->last_rinfo_serial counter after generating the
additional constant-FALSE RestrictInfo.

Back-patch to v17 where the issue crept in.  In v17, I failed to make
a test case that would expose this bug, so no test case for v17.

Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4-B6kafn+LmPuh-TYFwFyEm-vVj3Qqv7Yo-69CEv14rRg@mail.gmail.com
2024-11-08 11:21:11 +09:00
Nathan Bossart
41b98ddb77 Fix __attribute__((target(...))) usage.
The commonly supported way to specify multiple target options is to
surround the entire list with quotes and to use a comma (with no
extra spaces) as the delimiter.

Oversight in commit f78667bd91.

Discussion: https://postgr.es/m/Zy0jya8nF8CPpv3B%40nathan
2024-11-07 15:27:32 -06:00
Nathan Bossart
f78667bd91 Use __attribute__((target(...))) for AVX-512 support.
Presently, we check for compiler support for the required
intrinsics both with and without extra compiler flags (e.g.,
-mxsave), and then depending on the results of those checks, we
pick which files to compile with which flags.  This is tedious and
complicated, and it results in unsustainable coding patterns such
as separate files for each portion of code that may need to be built
with different compiler flags.

This commit introduces support for __attribute__((target(...))) and
uses it for the AVX-512 code.  This simplifies both the
configure-time checks and the build scripts, and it allows us to
place the functions that use the intrinsics in files that we
otherwise do not want to build with special CPU instructions.  We
are careful to avoid using __attribute__((target(...))) on
compilers that do not understand it, but we still perform the
configure-time checks in case the compiler allows using the
intrinsics without it (e.g., MSVC).

A similar change could likely be made for some of the CRC-32C code,
but that is left as a future exercise.

Suggested-by: Andres Freund
Reviewed-by: Raghuveer Devulapalli, Andres Freund
Discussion: https://postgr.es/m/20240731205254.vfpap7uxwmebqeaf%40awork3.anarazel.de
2024-11-07 13:58:43 -06:00
Álvaro Herrera
f56a01ebdb
doc: Reword ALTER TABLE ATTACH restriction on NO INHERIT constraints
The previous wording is easy to read incorrectly; this change makes it
simpler, less ambiguous, and less prominent.

Backpatch to all live branches.

Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/202411051201.zody6mld7vkw@alvherre.pgsql
2024-11-07 14:06:24 +01:00
Peter Eisentraut
d7a2b5bd87 Clarify a foreign key error message
Clarify the message about type mismatch in foreign key definition to
indicate which column is the referencing one and which is the referenced one.

Reported-by: Jian He <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/CACJufxEL82ao-aXOa=d_-Xip0bix-qdSyNc9fcWxOdkEZFko8w@mail.gmail.com
2024-11-07 11:13:06 +01:00
Michael Paquier
987027bcc0 Remove an obsolete comment in gistinsert()
This is inconsistent since 1f7ef548ec2e where the definition of
gistFormTuple() has changed.

Author: Tender Wang
Reviewed-by: Aleksander Alekseev
Discussion: https://postgr.es/m/CAHewXNkjU95_HdioDVU=5yBq_Xt=GfBv=Od-0oKtiA006pWW7Q@mail.gmail.com
2024-11-07 15:13:50 +09:00
Amit Kapila
7054186c4e Replicate generated columns when 'publish_generated_columns' is set.
This patch builds on the work done in commit 745217a051 by enabling the
replication of generated columns alongside regular column changes through
a new publication parameter: publish_generated_columns.

Example usage:
CREATE PUBLICATION pub1 FOR TABLE tab_gencol WITH (publish_generated_columns = true);

The column list takes precedence. If the generated columns are specified
in the column list, they will be replicated even if
'publish_generated_columns' is set to false. Conversely, if generated
columns are not included in the column list (assuming the user specifies a
column list), they will not be replicated even if
'publish_generated_columns' is true.
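
For example, with an illustrative generated column "gencol" in a column list,
the list wins over the parameter:

    -- gencol is replicated despite publish_generated_columns = false,
    -- because it appears in the publication's column list
    CREATE PUBLICATION pub2 FOR TABLE tab_gencol (a, gencol)
        WITH (publish_generated_columns = false);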

Author: Vignesh C, Shubham Khanna
Reviewed-by: Peter Smith, Amit Kapila, Hayato Kuroda, Shlok Kyal, Ajin Cherian, Hou Zhijie, Masahiko Sawada
Discussion: https://postgr.es/m/B80D17B2-2C8E-4C7D-87F2-E5B4BE3C069E@gmail.com
2024-11-07 08:58:49 +05:30
Michael Paquier
70291a3c66 Improve handling of empty query results in BackgroundPsql::query()
A newline is not added at the end of an empty query result, causing the
banner of the hardcoded \echo to not be discarded.  As a result, scripts
that expect an empty result would see the "QUERY_SEPARATOR" in the output
returned to the caller, which was confusing.

This commit changes BackgroundPsql::query() so that empty results work
correctly, by making the first newline before the banner optional,
bringing more flexibility.

Note that this change affects 037_invalid_database.pl, where three
queries generated an empty result, with the script relying on the data
from the hardcoded banner to exist in the expected output.  These
queries are changed to use query_safe(), leading to a simpler script.

The author has also proposed a test in a different patch where empty
results would exist when using BackgroundPsql.

Author: Jacob Champion
Reviewed-by: Andrew Dunstan, Michael Paquier
Discussion: https://postgr.es/m/CAOYmi+=60deN20WDyCoHCiecgivJxr=98s7s7-C8SkXwrCfHXg@mail.gmail.com
2024-11-07 12:11:27 +09:00
Daniel Gustafsson
f638aafd1e Find invalid databases during upgrade check stage
Before continuing with the checks, start by verifying that all databases
allow connections, to avoid a hard failure without proper error reporting.

Inspired by a larger patch by Thomas Krennwallner.

Discussion: https://postgr.es/m/f9315bf0-e03e-4490-9f0d-5b6f7a6d9908@postsubmeta.net
2024-11-06 15:40:52 +01:00
Daniel Gustafsson
1e37cc6e2c Remove unused variable
The low variable has not been used since it was added in d168b666823
and can be safely removed.  The variable is present in the Sedgewick
paper "Analysis of Shellsort and Related Algorithms" as a parameter
to the shellsort function, but our implementation does not use it.
Remove to improve readability of the code.

Author: Koki Nakamura <btnakamurakoukil@oss.nttdata.com>
Discussion: https://postgr.es/m/8aeb7b3eda53ca4c65fbacf8f43628fb@oss.nttdata.com
2024-11-06 15:11:14 +01:00
Peter Eisentraut
a0be94067e doc: Remove event trigger firing matrix
This is difficult to maintain accurately, and it was probably already
somewhat incorrect, especially in the sql_drop and table_rewrite
categories.

The prior section already documented which DDL commands are *not*
supported (which was also slightly outdated), so let's expand that a
bit and just rely on that instead of listing out each command in full
detail.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CACJufxE_UAuxcM08BW5oVsg34v0cFWoEt8yBa5xSAoKLmL6LTQ%40mail.gmail.com
2024-11-06 13:43:17 +01:00
Thomas Munro
9044fc1d45 Monkey-patch LLVM code to fix ARM relocation bug.
Supply a new memory manager for RuntimeDyld, to avoid crashes in
generated code caused by memory placement that can overflow a 32 bit
data type.  This is a drop-in replacement for the
llvm::SectionMemoryManager class in the LLVM library, with Michael
Smith's proposed fix from
https://www.github.com/llvm/llvm-project/pull/71968.

We hereby slurp it into our own source tree, after moving into a new
namespace llvm::backport and making some minor adjustments so that it
can be compiled with older LLVM versions as far back as 12.  It's harder
to make it work on even older LLVM versions, but it doesn't seem likely
that people are really using them so that is not investigated for now.

The problem could also be addressed by switching to JITLink instead of
RuntimeDyld, and that is the LLVM project's recommended solution as
the latter is about to be deprecated.  We'll have to do that soon enough
anyway, and then when the LLVM version support window advances far
enough in a few years we'll be able to delete this code.  Unfortunately
that wouldn't be enough for PostgreSQL today: in most relevant versions
of LLVM, JITLink is missing or incomplete.

Several other projects have already back-ported this fix into their fork
of LLVM, which is a vote of confidence despite the lack of commit into
LLVM as of today.  We don't have our own copy of LLVM so we can't do
exactly what they've done; instead we have a copy of the whole patched
class so we can pass an instance of it to RuntimeDyld.

The LLVM project hasn't chosen to commit the fix yet, and even if it
did, it wouldn't be back-ported into the releases of LLVM that most of
our users care about, so there is not much point in waiting any longer
for that.  If they make further changes and commit it to LLVM 19 or 20,
we'll still need this for older versions, but we may want to
resynchronize our copy and update some comments.

The changes that we've had to make to our copy can be seen by diffing
our SectionMemoryManager.{h,cpp} files against the ones in the tree of
the pull request.  Per the LLVM project's license requirements, a copy
is in SectionMemoryManager.LICENSE.

This should fix the spate of crash reports we've been receiving lately
from users on large memory ARM systems.

Back-patch to all supported releases.

Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Co-authored-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se> (license aspects)
Reported-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/CAO6_Xqr63qj%3DSx7HY6ZiiQ6R_JbX%2B-p6sTPwDYwTWZjUmjsYBg%40mail.gmail.com
2024-11-06 23:17:18 +13:00
Peter Eisentraut
ecb5af7798 Remove unused #include's from bin .c files
as determined by IWYU

Similar to commit dbbca2cf299, but for bin and some related files.

Discussion: https://www.postgresql.org/message-id/flat/0df1d5b1-8ca8-4f84-93be-121081bde049%40eisentraut.org
2024-11-06 11:11:52 +01:00
Michael Paquier
ba08edb065 Extend Cluster.pm's background_psql() to be able to start asynchronously
This commit extends the constructor routine of BackgroundPsql.pm with a
new "wait" parameter.  If set to 0, the routine returns without waiting
for psql to start, ready to consume input.

background_psql() in Cluster.pm gains the same "wait" parameter.  The
default behavior is still to wait for psql to start.  It is now
possible not to wait, giving TAP scripts the ability to perform
actions between a BackgroundPsql startup and its wait_connect() call.

Author: Jacob Champion
Discussion: https://postgr.es/m/CAOYmi+=60deN20WDyCoHCiecgivJxr=98s7s7-C8SkXwrCfHXg@mail.gmail.com
2024-11-06 15:31:14 +09:00
David Rowley
87f81a5563 Fix hypothetical bug in ExprState building for hashing
adf97c156 gave ExprStates the ability to hash expressions and return a
single hash value.  That commit supports seeding the hash value with an
initial value to have that blended into the final hash value.

Here we fix a hypothetical bug where if there are zero expressions to
hash, the initial value is stored in the wrong location.  The existing
code stored the initial value in an intermediate location, expecting that
when the expressions were hashed, those steps would store the final
hash value in the ExprState.resvalue field.  However, that wouldn't happen
when there are zero expressions to hash.  The correct thing to do instead
is to have a special case for zero expressions and, when we hit that case,
store the initial value directly in the ExprState.resvalue.  The reason
that this is a hypothetical bug is that no code currently calls
ExecBuildHash32Expr passing a non-zero initial value.

Discussion: https://postgr.es/m/CAApHDvpMAL_zxbMRr1LOex3O7Y7R7ZN2i8iUFLQhqQiJMAg3qw@mail.gmail.com
2024-11-06 09:16:00 +13:00
Daniel Gustafsson
7bdaa4b542 Add a Git .mailmap file
This adds a Git .mailmap to unify spellings of committer names.

Discussion: https://postgr.es/m/76773029-A7AD-4BAF-AFC2-E511D26E866D@yesql.se
2024-11-05 13:56:02 +01:00
Heikki Linnakangas
e54688030c Silence meson warning about PG_TEST_EXTRA in src/Makefile.global.in
Commit 99b937a44f introduced this warning when you run "meson setup":

    Configuring Makefile.global using configuration
    ../src/meson.build:31: WARNING: The variable(s) 'PG_TEST_EXTRA' in the input file 'src/Makefile.global.in' are not present in the given configuration data.

To fix, add PG_TEST_EXTRA to the list of variables that are not needed
in the makefiles generated by meson. In meson builds, the makefiles
are only used for PGXS, not for building or testing the server itself.

Reported-by: Peter Eisentraut
Discussion: https://www.postgresql.org/message-id/5c380997-e270-425a-9542-e4ef36a285de@eisentraut.org
2024-11-05 12:25:25 +02:00
Michael Paquier
7d85d87f4d Clear padding of PgStat_HashKey when handling pgstats entries
PgStat_HashKey is currently initialized in a way that could result in
random data if the structure has any padding bytes.  The structure
has no padding bytes currently, fortunately, but it could become a
problem should the structure change at some point in the future.

The code is changed to use memset(0) so that any padding is handled
properly, as it would be surprising to see random failures in
the pgstats entry lookups.  PgStat_HashKey is a structure internal to
pgstats, and an ABI change could be possible in the scope of a bug fix,
so backpatch down to 15 where this has been introduced.

Author: Bertrand Drouvot
Reviewed-by: Jelte Fennema-Nio, Michael Paquier
Discussion: https://postgr.es/m/Zyb7RW1y9dVfO0UH@ip-10-97-1-34.eu-west-3.compute.internal
Backpatch-through: 15
2024-11-05 09:39:43 +09:00
Tom Lane
0704aed0d8 Use portable diff options in pg_bsd_indent's regression test.
We had been using "diff -upd", which evidently works for most people,
but Solaris's diff doesn't like it.  (We'd not noticed because the
Solaris buildfarm animals weren't running this test until they were
upgraded to the latest buildfarm client script.)  Change to "diff -U3"
which is what pg_regress has used for ages.

Per buildfarm (and off-list discussion with Noah Misch).

Back-patch to v16 where this test was added.  In v16,
also back-patch the relevant part of 628c1d1f2 so that
the test script looks about the same in all branches.
2024-11-04 18:08:48 -05:00
Alexander Korotkov
3a7ae6b3d9 Revert pg_wal_replay_wait() stored procedure
This commit reverts 3c5db1d6b0, and subsequent improvements and fixes
including 8036d73ae3, 867d396ccd, 3ac3ec580c, 0868d7ae70, 85b98b8d5a,
2520226c95, 014f9f34d2, e658038772, e1555645d7, 5035172e4a, 6cfebfe88b,
73da6b8d1b, and e546989a26.

The reason for reverting is a set of remaining issues.  Most notably, the
stored procedure appears to need more effort than the utility statement
to turn the backend into a "snapshot-less" state.  This makes an approach
to use stored procedures questionable.

Catversion is bumped.

Discussion: https://postgr.es/m/Zyhj2anOPRKtb0xW%40paquier.xyz
2024-11-04 22:47:57 +02:00
Bruce Momjian
3293b718a0 doc: use more accurate URL for bug reporting
Reported-by: nat@makarevitch.org

Discussion: https://postgr.es/m/172947609746.699.14488791149769110078@wrigleys.postgresql.org

Backpatch-through: master
2024-11-04 15:08:01 -05:00
Tom Lane
b1008c1f01 pg_basebackup, pg_receivewal: fix failure to find password in ~/.pgpass.
Sloppy refactoring in commit cca97ce6a caused these programs
to pass dbname = NULL to libpq if there was no "--dbname" switch
on the command line, where before "replication" would be passed.
This didn't break things completely, because the source server doesn't
care about the dbname specified for a physical replication connection.
However, it did cause libpq to fail to match a ~/.pgpass entry that
has "replication" in the dbname field.  Restore the previous behavior
of passing "replication".

Also, closer inspection shows that if you do specify a dbname
in the connection string, that is what will be matched to ~/.pgpass,
not "replication".  This was the pre-existing behavior so we should
not change it, but the SGML docs were pretty misleading about it.
Improve that.

Per bug #18685 from Toshi Harada.  Back-patch to v17 where the
error crept in.

Discussion: https://postgr.es/m/18685-fee2dd142b9688f1@postgresql.org
Discussion: https://postgr.es/m/2702546.1730740456@sss.pgh.pa.us
2024-11-04 14:36:11 -05:00
Bruce Momjian
32d07a000f doc: remove check of SVG files, since they are derived
revert of change from commit 641a5b7a144

Reported-by: Peter Eisentraut

Discussion: https://postgr.es/m/2c5dd601-b245-4092-9c27-6d1ad51609df@eisentraut.org

Backpatch-through: master
2024-11-04 14:10:34 -05:00
Tom Lane
350e6b8ea8 pg_dump: provide a stable sort order for rules.
Previously, we sorted rules by schema name and then rule name;
if that wasn't unique, we sorted by rule OID.  This can be
problematic for comparing dumps from databases with different
histories, especially since certain rule names like "_RETURN"
are very common.  Let's make the sort key schema name, rule name,
table name, which should be unique.  (This is the same behavior
we've long used for triggers and RLS policies.)
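
For illustration, a minimal sketch of such a three-key comparator (the
struct and field names here are hypothetical, not pg_dump's actual data
structures):

    #include <string.h>

    typedef struct ExampleRuleInfo
    {
        const char *schemaname;
        const char *rulename;
        const char *tablename;
    } ExampleRuleInfo;

    /* Sort by schema name, then rule name, then table name. */
    static int
    example_rule_cmp(const void *a, const void *b)
    {
        const ExampleRuleInfo *ra = (const ExampleRuleInfo *) a;
        const ExampleRuleInfo *rb = (const ExampleRuleInfo *) b;
        int         cmp;

        cmp = strcmp(ra->schemaname, rb->schemaname);
        if (cmp != 0)
            return cmp;
        cmp = strcmp(ra->rulename, rb->rulename);
        if (cmp != 0)
            return cmp;
        return strcmp(ra->tablename, rb->tablename);
    }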

Andreas Karlsson

Discussion: https://postgr.es/m/b4e468d8-0cd6-42e6-ac8a-1d6afa6e0cf1@proxel.se
2024-11-04 13:31:12 -05:00
Masahiko Sawada
215f7af27d Fix typo in comment of gistdoinsert().
Author: Tender Wang
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/CAHewXN%3D3sH2sNw4nC3QGCEVw1Lftmw9m5y1Xje0bXK6ApDrsPQ%40mail.gmail.com
2024-11-04 10:21:59 -08:00
Peter Geoghegan
846cfe0dcc Fix obsolete _bt_first comments.
_bt_first doesn't necessarily hold onto a buffer pin on success exit.
Fix header comments that claimed that we'll always hold onto a pin.

Oversight in commit 2ed5b87f96.
2024-11-04 12:43:54 -05:00
Heikki Linnakangas
0d82970336 docs: Consistently use <optional> to indicate optional parameters
Some functions were using square brackets instead, replace them all
with <optional>.

Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CACJufxFfUbSph5UUSsZbL4SitbuPuW%3DEccpKgEaZrjtrPPuadQ@mail.gmail.com
2024-11-04 18:28:40 +02:00
Peter Geoghegan
b6558e4f83 nbtree: Remove useless 'strat' local variable.
Remove a local variable that was used to avoid overwriting strat_total
with the = operator strategy when a >= operator strategy key was already
included in the initial positioning/insertion scan keys by _bt_first
(for backwards scans it would have to be a <= key that was included).
_bt_first's strat_total local variable now simply tracks the operator
strategy of the final scan key that was included in the scan's insertion
scan key (barring the case where the !used_all_subkeys row compare path
adjusts strat_total in its own way).

_bt_first already treated >= keys (or <= keys) as = keys for initial
positioning purposes.  There is no good reason to remember that that was
what happened; no later _bt_first step cares about the distinction.
Note, in particular, that the insertion scan key's 'nextkey' and
'backward' fields will be initialized the same way regardless.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAH2-Wz=PKR6rB7qbx+Vnd7eqeB5VTcrW=iJvAsTsKbdG+kW_UA@mail.gmail.com
2024-11-04 11:04:30 -05:00
Heikki Linnakangas
3c0fd64fec Split ProcSleep function into JoinWaitQueue and ProcSleep
Split ProcSleep into two functions: JoinWaitQueue and ProcSleep.
JoinWaitQueue is called while holding the partition lock, and inserts
the current process to the wait queue, while ProcSleep() does the
actual sleeping. ProcSleep() is now called without holding the
partition lock, and it no longer re-acquires the partition lock before
returning. That makes the wakeup a little cheaper. Once upon a time,
re-acquiring the partition lock was needed to prevent a signal handler
from longjmping out at a bad time, but these days our signal handlers
just set flags, and longjmping can only happen at points where we
explicitly run CHECK_FOR_INTERRUPTS().

If JoinWaitQueue detects an "early deadlock" before even joining the
wait queue, it returns without changing the shared lock entry, leaving
the cleanup of the shared lock entry to the caller. This makes the
handling of an early deadlock the same as the dontWait=true case.

One small user-visible side-effect of this refactoring is that we now
only set the 'ps' title to say "waiting" when we actually enter the
sleep, not when the lock is skipped because dontWait=true, or when a
deadlock is detected early before entering the sleep.

This eliminates the 'lockAwaited' global variable in proc.c, which was
largely redundant with 'awaitedLock' in lock.c.

Note: Updating the local lock table is now the caller's responsibility.
JoinWaitQueue and ProcSleep are now only responsible for modifying the
shared state. Seems a little nicer that way.

Based on Thomas Munro's earlier patch and observation that ProcSleep
doesn't really need to re-acquire the partition lock.

Reviewed-by: Maxim Orlov
Discussion: https://www.postgresql.org/message-id/7c2090cd-a72a-4e34-afaa-6dd2ef31440e@iki.fi
2024-11-04 17:59:24 +02:00
Robert Haas
0b1765d959 pg_combinebackup: Error if incremental file exists in full backup.
Suppose that you run a command like "pg_combinebackup b1 b2 -o output",
but both b1 and b2 contain an INCREMENTAL.$something file in a directory
that is expected to contain relation files. This is an error, but the
previous code would not detect the problem and instead write a garbage
full file named $something to the output directory. This commit adds
code to detect the error and a test case to verify the behavior.
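
A minimal sketch of the kind of check involved (the helper name is
hypothetical; incremental files carry an "INCREMENTAL." prefix):

    #include <stdbool.h>
    #include <string.h>

    #define INCREMENTAL_PREFIX "INCREMENTAL."

    /*
     * Return true if a file name looks like an incremental file.  In a backup
     * claiming to be a full backup, such a file should be reported as an error
     * rather than reconstructed into a bogus full file.
     */
    static bool
    is_incremental_file_name(const char *filename)
    {
        return strncmp(filename, INCREMENTAL_PREFIX,
                       strlen(INCREMENTAL_PREFIX)) == 0;
    }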

It's difficult to imagine that this will ever happen unless someone
is intentionally trying to break incremental backup, but per discussion,
let's consider that the lack of adequate sanity checking in this area is
a bug and back-patch to v17, where incremental backup was introduced.

Patch by me, reviewed by Bertrand Drouvot and Amul Sul.

Discussion: http://postgr.es/m/CA+TgmoaD7dBYPqe7kMtO0dyto7rd0rUh7joh=JPUSaFszKY6Pg@mail.gmail.com
2024-11-04 10:11:05 -05:00
Robert Haas
6c24801b17 pg_combinebackup: When reconstructing, avoid double slash in filename.
This function is always called with a relative_path that ends in a
slash, so there's no need to insert a second one. So, don't. Instead,
add an assertion to verify that nothing gets broken in the future, and
adjust the comments.
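
A sketch of the idea (a hypothetical helper, not the actual
pg_combinebackup code): since relative_path already ends with a
separator, assert that invariant instead of inserting another slash:

    #include <assert.h>
    #include <stdio.h>
    #include <string.h>

    static void
    example_make_path(char *dst, size_t dstlen, const char *input_directory,
                      const char *relative_path, const char *filename)
    {
        size_t      rlen = strlen(relative_path);

        /* Caller guarantees a trailing separator; don't add a second one. */
        assert(rlen == 0 || relative_path[rlen - 1] == '/');
        snprintf(dst, dstlen, "%s/%s%s", input_directory, relative_path, filename);
    }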

While this is not a critical bug, the duplicate slash is visible in
error messages, which could create confusion, so back-patch to v17.
This is also better in that it keeps the code consistent across
branches.

Patch by me, reviewed by Bertrand Drouvot and Amul Sul.

Discussion: http://postgr.es/m/CA+TgmoaD7dBYPqe7kMtO0dyto7rd0rUh7joh=JPUSaFszKY6Pg@mail.gmail.com
2024-11-04 09:55:02 -05:00
Bruce Momjian
7ac744e72c doc: fix typo in mvcc clarification in commit 2fa255ce9b9
Reported-by: Erik Rijkers (private email)

Backpatch-through: master
2024-11-04 09:24:58 -05:00
Heikki Linnakangas
6ae0897e42 Move TRACE calls into WaitOnLock()
LockAcquire is a long and complex function. Pushing more stuff to its
subroutines makes it a little more manageable.

Reviewed-by: Maxim Orlov
Discussion: https://www.postgresql.org/message-id/7c2090cd-a72a-4e34-afaa-6dd2ef31440e@iki.fi
2024-11-04 16:21:01 +02:00
Heikki Linnakangas
0464f25b6a Set MyProc->heldLocks in ProcSleep
Previously, ProcSleep()'s caller was responsible for setting
MyProc->heldLocks, and we had comments to remind about that.  But it
seems simpler to make ProcSleep() itself responsible for it.
ProcSleep() already set the other info about the lock it's waiting for
(waitLock, waitProcLock and waitLockMode), so it is natural for it to
set heldLocks too.

Reviewed-by: Maxim Orlov
Discussion: https://www.postgresql.org/message-id/7c2090cd-a72a-4e34-afaa-6dd2ef31440e@iki.fi
2024-11-04 16:20:57 +02:00
Peter Geoghegan
62620b6aad Clarify nbtree parallel scan _bt_endpoint contract.
_bt_endpoint is a helper function for _bt_first that's called whenever
no useful insertion scan key can be used, and we need to lock and read
either the leftmost or rightmost leaf page in the index.  Simplify and
document its preconditions, relieving its _bt_first caller from having
to end the parallel scan when it returns false.

Also stop unnecessarily invalidating the current scan position in nearby
code in both _bt_first and _bt_endpoint.  This seems to have been
copy-pasted from _bt_readnextpage, where invalidating the scan's current
position really is necessary.

Follow-up to the refactoring work in commit 1bd4bc85.
2024-11-04 09:05:59 -05:00
Heikki Linnakangas
1fe0466cf2 Fix comment in LockReleaseAll() on when locallock->nLock can be zero
We also reach this case when, for example, a deadlock is detected, not
only when we run out of memory.

Reviewed-by: Maxim Orlov
Discussion: https://www.postgresql.org/message-id/7c2090cd-a72a-4e34-afaa-6dd2ef31440e@iki.fi
2024-11-04 15:31:46 +02:00
Heikki Linnakangas
99b937a44f Add PG_TEST_EXTRA configure option to the Make builds
The Meson builds have PG_TEST_EXTRA as a configure-time variable,
which was not available in the Make builds. To ensure both build
systems are in sync, PG_TEST_EXTRA is now added as a configure-time
variable. It can be set like this:

    ./configure PG_TEST_EXTRA="kerberos, ssl, ..."

Note that to preserve the old behavior, this configure-time variable
is overridden by the PG_TEST_EXTRA environment variable when you run
the tests.

Author: Jacob Champion
Reviewed-by: Ashutosh Bapat, Nazir Bilal Yavuz
2024-11-04 14:09:38 +02:00
Heikki Linnakangas
3d1aec225a Make PG_TEST_EXTRA env var override the "meson setup" option
"meson test" used to ignore the PG_TEST_EXTRA environment variable,
which meant that in order to run additional tests, you had to run
"meson setup -DPG_TEST_EXTRA=...". That's somewhat expensive, and not
consistent with autoconf builds. Allow PG_TEST_EXTRA environment
variable to override the setup-time option at run time, so that you
can do "PG_TEST_EXTRA=... meson test".

To implement this, the configuration time value is passed as an extra
"--pg-test-extra" argument to testwrap instead of adding it to the
test environment. If the environment variable is set at the time of
running test, testwrap uses the value from the environment variable
and ignores the --pg-test-extra option.

Now that "meson test" obeys the environment variable, we can remove it
from the "meson setup" steps in the CI script. It will now be picked
up from the environment variable like with "make check".

Author: Nazir Bilal Yavuz, Ashutosh Bapat
Reviewed-by: Ashutosh Bapat with inputs from Tom Lane and Andrew Dunstan
2024-11-04 14:09:25 +02:00
Amit Kapila
5b0c46ea09 Doc: Update the behavior of generated columns in Logical Replication.
Commit 745217a051 missed updating a few places in the documentation to
reflect the new behavior of generated columns in logical replication.

Reported-by: Peter Smith, Ajin Cherian
Author: Vignesh C
Discussion: https://postgr.es/m/CALDaNm2JOO7szz9+uaQbjmgZOfzbM_9tAQdFF8H5BjkQeaJs0A@mail.gmail.com
Discussion: https://postgr.es/m/B80D17B2-2C8E-4C7D-87F2-E5B4BE3C069E@gmail.com
2024-11-04 09:39:30 +05:30
Michael Paquier
027124a872 Add missing newlines at the end of two SQL files
arrays.sql was already missing it before 49d6c7d8daba, and I have just
noticed it thanks to this commit.  The second one in test_slru has been
introduced by 768a9fd5535f.
2024-11-03 19:42:51 +09:00
Noah Misch
825c72c071 Suppress new "may be used uninitialized" warning.
Buildfarm member mamba fails to deduce that the function never uses this
variable without initializing it.  Back-patch to v12, like commit
b412f402d1e020c5dac94f3bf4a005db69519b99.
2024-11-02 19:42:52 -07:00
Noah Misch
0bada39c83 Fix inplace update buffer self-deadlock.
A CacheInvalidateHeapTuple* callee might call
CatalogCacheInitializeCache(), which needs a relcache entry.  Acquiring
a valid relcache entry might scan pg_class.  Hence, to prevent
undetected LWLock self-deadlock, CacheInvalidateHeapTuple* callers must
not hold BUFFER_LOCK_EXCLUSIVE on buffers of pg_class.  Move the
CacheInvalidateHeapTupleInplace() before the BUFFER_LOCK_EXCLUSIVE.  No
back-patch, since I've reverted commit
243e9b40f1b2dd09d6e5bf91ebf6e822a2cd3704 from non-master branches.

Reported by Alexander Lakhin.  Reviewed by Alexander Lakhin.

Discussion: https://postgr.es/m/10ec0bc3-5933-1189-6bb8-5dec4114558e@gmail.com
2024-11-02 09:04:56 -07:00
Noah Misch
b412f402d1 Move I/O before the index_update_stats() buffer lock region.
Commit a07e03fd8fa7daf4d1356f7cb501ffe784ea6257 enlarged the work done
here under the pg_class heap buffer lock.  Two preexisting actions are
best done before holding that lock.  Both RelationGetNumberOfBlocks()
and visibilitymap_count() do I/O, and the latter might exclusive-lock a
visibility map buffer.  Moving these reduces contention and risk of
undetected LWLock deadlock.  Back-patch to v12, like that commit.

Discussion: https://postgr.es/m/20241031200139.b4@rfd.leadboat.com
2024-11-02 09:04:55 -07:00
Bruce Momjian
2fa255ce9b doc: clarify text around MVCC example query
Reported-by: marlene.brandstaetter@cargonet.software

Discussion: https://postgr.es/m/167765529052.987840.12345375075704447735@wrigleys.postgresql.org

Backpatch-through: master
2024-11-01 16:38:16 -04:00
Bruce Momjian
4a9effe45e doc: remove useless MERGE example
Reported-by: dwayne.towell@gmail.com

Discussion: https://postgr.es/m/167699245721.1902146.6479762301617101634@wrigleys.postgresql.org

Backpatch-through: master
2024-11-01 16:20:27 -04:00
Bruce Momjian
e1a76db1a8 doc: improve tablespace example query and link to helper funcs.
Reported-by: Agustín

Discussion: https://postgr.es/m/172609721070.1128084.6724666076293146476@wrigleys.postgresql.org

Backpatch-through: master
2024-11-01 15:54:16 -04:00
Bruce Momjian
4200fea80e doc: fix ALTER DOMAIN domain_constraint to spell out options
It used to refer to CREATE DOMAIN, but CREATE DOMAIN allows NULL, while
ALTER DOMAIN does not.

Reported-by: elionescu@yahoo.com

Discussion: https://postgr.es/m/172225092461.915373.6103973717483380183@wrigleys.postgresql.org

Backpatch-through: 12
2024-11-01 13:54:28 -04:00
Bruce Momjian
94a8c19eed doc: explain how the home directory is found on Unix-like syst.
Done for libpq, postgres-fdw, and psql.

Reported-by: marc@msys.ch

Discussion: https://postgr.es/m/CAKFQuwZ-T-zsVM7gApS9-XU9vGxC7Oa-UyRQPVcJFagNU=AjOw@mail.gmail.com

Backpatch-through: master
2024-11-01 13:32:21 -04:00
Bruce Momjian
1eb5564230 doc: Add link to listen_addresses as cause of connection failure
Reported-by: k.man.113@gmail.com

Discussion: https://postgr.es/m/171494070007.703.17021965362263796980@wrigleys.postgresql.org

Backpatch-through: master
2024-11-01 13:15:09 -04:00
Tom Lane
cf590c568d Update contrib/sepgsql regression tests for commit 89e51abcb.
Oversight revealed by buildfarm.
2024-11-01 12:50:01 -04:00
Bruce Momjian
641a5b7a14 doc: improve build for non-Latin1 characters
Add README.non-ASCII to explain non-ASCII doc behavior; some text moved
from release.sgml.

Change UTF8 SGML characters to use HTML entities.

Remove unnecessary UTF8 spaces.

Add SVG file check for check-nbsp target.

Add dummy 'pdf' Makefile target.

Reported-by: Yugo Nagata

Discussion: https://postgr.es/m/20241011114122.c90f8a871462da36f2e2afeb@sraoss.co.jp

Backpatch-through: master
2024-11-01 12:46:51 -04:00
Peter Geoghegan
fc7ddededb Clarify nbtree array preprocessing comment.
Oversight in commit 5bf748b8.
2024-11-01 11:43:24 -04:00
Bruce Momjian
6d5444c9ed doc: remove mention of ActiveState for Perl and Tcl on Windows
Replace with Strawberry Perl and Magicsplat Tcl.

Reported-by: Yasir Hussain

Discussion: https://postgr.es/m/CAA9OW9fAAM_WDYYpAquqF6j1hmfRMzHPsFkRfP5E6oSfkF=dMA@mail.gmail.com

Backpatch-through: 12
2024-11-01 11:30:54 -04:00
Heikki Linnakangas
368d8270c8 Rename two functions that wake up other processes
Instead of talking about setting latches, which is a pretty low-level
mechanism, emphasize that they wake up other processes.

This is in preparation for replacing Latches with a new abstraction.
That's still work in progress, but this seems a little tidier anyway,
so let's get this refactoring out of the way already.

Discussion: https://www.postgresql.org/message-id/391abe21-413e-4d91-a650-b663af49500c%40iki.fi
2024-11-01 13:47:24 +02:00
Heikki Linnakangas
a9c546a5a3 Use ProcNumbers instead of direct Latch pointers to address other procs
This is in preparation for replacing Latches with a new abstraction.
That's still work in progress, but this seems a little tidier anyway,
so let's get this refactoring out of the way already.

Discussion: https://www.postgresql.org/message-id/391abe21-413e-4d91-a650-b663af49500c%40iki.fi
2024-11-01 13:47:20 +02:00
Michael Paquier
e819bbb7c8 Remove use of pg_memory_is_all_zeros() in bufpage.c
On closer inspection, this makes the all-zero check of the page more
expensive, so let's remove the new function call in bufpage.c.  The
maths of the check were also incorrect, checking only the first 1kB of
the page for zeros.

This brings back the code to the state it was at 49d6c7d8daba.

Per discussion with David Rowley and Bertrand Drouvot.

Discussion: https://postgr.es/m/CAApHDvrXzPAr3FxoBuB7b3D-okNoNA2jxLun1rW8Yw5wkbqusw@mail.gmail.com
2024-11-01 17:05:36 +09:00
Michael Paquier
07e9e28b56 Add pg_memory_is_all_zeros() in memutils.h
This new function tests whether a memory region of a given length,
starting at a given location, consists only of zeroes.  This unifies in
a single path the all-zero checks that were happening in a couple of
places in the backend code:
- For pgstats entries for relations, the checkpointer and the bgwriter,
where some "all_zeroes" variables were previously compared with
memcmp().
- For all-zero buffer pages in PageIsVerifiedExtended().

This new function uses the same forward scan as the check for all-zero
buffer pages, applying it to the three pgstats paths mentioned above.
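
A sketch of such a forward-scan check (a simple byte-wise version; the
real helper in memutils.h may differ in details such as word-at-a-time
scanning):

    #include <stdbool.h>
    #include <stddef.h>

    static bool
    example_memory_is_all_zeros(const void *ptr, size_t len)
    {
        const unsigned char *p = (const unsigned char *) ptr;
        const unsigned char *end = p + len;

        /* Scan forward and bail out at the first nonzero byte. */
        while (p < end)
        {
            if (*p++ != 0)
                return false;
        }
        return true;
    }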

Author: Bertrand Drouvot
Reviewed-by: Peter Eisentraut, Heikki Linnakangas, Peter Smith
Discussion: https://postgr.es/m/ZupUDDyf1hHI4ibn@ip-10-97-1-34.eu-west-3.compute.internal
2024-11-01 11:35:46 +09:00
Michael Paquier
49d6c7d8da Add SQL function array_reverse()
This function takes an array as input and reverses the order of its
elements.  This operation only affects the first dimension of the
array, like array_shuffle().

The implementation structure is inspired by array_shuffle(), with a
subroutine called array_reverse_n() that may come in handy in the
future, should more functions able to reverse portions of arrays be
introduced.

Bump catalog version.

Author: Aleksander Alekseev
Reviewed-by: Ashutosh Bapat, Tom Lane, Vladlen Popolitov
Discussion: https://postgr.es/m/CAJ7c6TMpeO_ke+QGOaAx9xdJuxa7r=49-anMh3G5476e3CX1CA@mail.gmail.com
2024-11-01 10:32:19 +09:00
Tom Lane
2d8bff603c Make all ereport() calls within gram.y provide error locations.
This patch responds to a comment that I (tgl) made in the
discussion leading up to 774171c4f, that really all errors
occurring during raw parsing should provide error cursors.
Syntax errors reported by Bison will have one, and most of
the handwritten ereport's in gram.y already provide one,
but there were a few stragglers.

(It is not claimed that this handles every failure reachable
during raw parsing --- out-of-memory is an obvious exception.
But this makes a good start on cases that are likely to occur.)

While we're at it, clean up the reported positions for errors
associated with LIMIT/OFFSET clauses.  Previously we were
relying on applying exprLocation() to the contained expressions,
but that leads to slightly odd cursor placement, e.g.

regression=# (select * from foo limit 10) limit 10;
ERROR:  multiple LIMIT clauses not allowed
LINE 1: (select * from foo limit 10) limit 10;
                                           ^

We can afford to keep a little more state in the transient
SelectLimit structs in order to make that better.

Jian He and Tom Lane (extracted from a larger patch by Jian,
with some additional work by me)

Discussion: https://postgr.es/m/CACJufxEmONE3P2En=jopZy1m=cCCUs65M4+1o52MW5og9oaUPA@mail.gmail.com
2024-10-31 16:09:27 -04:00
Tom Lane
89e51abcb2 Add a parse location field to struct FunctionParameter.
This allows an error cursor to be supplied for a bunch of
bad-function-definition errors that previously lacked one,
or that cheated a bit by pointing at the contained type name
when the error isn't really about that.

Bump catversion from an abundance of caution --- I don't think
this node type can actually appear in stored views/rules, but
better safe than sorry.

Jian He and Tom Lane (extracted from a larger patch by Jian,
with some additional work by me)

Discussion: https://postgr.es/m/CACJufxEmONE3P2En=jopZy1m=cCCUs65M4+1o52MW5og9oaUPA@mail.gmail.com
2024-10-31 16:09:27 -04:00
Heikki Linnakangas
b82c877e76 Fix refreshing physical relfilenumber on shared index
Buildfarm member 'prion', which is configured with
-DRELCACHE_FORCE_RELEASE -DCATCACHE_FORCE_RELEASE, failed with errors
like this:

    ERROR:  could not read blocks 0..0 in file "global/2672": read only 0 of 8192 bytes

while running a parallel test group that includes VACUUM FULL on some
catalog tables among other things. I was not able to reproduce that
just by running the tests with -DRELCACHE_FORCE_RELEASE
-DCATCACHE_FORCE_RELEASE, even though 'prion' hit it on first run
after commit 2b9b8ebbf8, so there might be something else that makes
it more susceptible to the race. However, I was able to reproduce it
by adding another test to the same test group that runs "vacuum full
pg_database" repeatedly.

The problem is that RelationReloadIndexInfo() no longer calls
RelationInitPhysicalAddr() on a nailed, shared index, when an
invalidation happens early during backend startup, before the critical
relcaches have been built. Before commit 2b9b8ebbf8, that was done by
RelationReloadNailed(), but it went missing from that path. Add it
back as an explicit step.

Broken by commit 2b9b8ebbf8, which refactored these functions.

Discussion: https://www.postgresql.org/message-id/db876575-8f5b-4193-a538-df7e1f92d47a%40iki.fi
2024-10-31 18:24:48 +02:00
Daniel Gustafsson
fb7e27abfb Remove duplicate words in comments
A few comments contained a duplicate "the" in sentences; fix by removing
one occurrence.

Author: Vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CALDaNm2aEEiPwGJmPdzBxROVvs8n75yCjKz4K1f1B2TdWpzxTA@mail.gmail.com
2024-10-31 11:38:03 +01:00
Heikki Linnakangas
2b9b8ebbf8 Split RelationClearRelation into three different functions
The old RelationClearRelation function did different things depending
on the arguments and circumstances. It could:

a) remove the relation completely from relcache (rebuild == false),
b) mark the entry as invalid (rebuild == true, but not in xact), or
c) rebuild the entry (rebuild == true).

Different callers used it for different purposes, and often assumed a
particular behavior, which was confusing. Split it into three
different functions, one for each of the above actions (one of them,
RelationInvalidateRelation, was already added in commit e6cd857726).
Move the responsibility of choosing the action and calling the right
function to the callers.

Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/9c9e8908-7b3e-4ce7-85a8-00c0e165a3d6%40iki.fi
2024-10-31 10:09:40 +02:00
Heikki Linnakangas
8e2e266221 Simplify call to rebuild relcache entry for indexes
RelationClearRelation(rebuild == true) calls RelationReloadIndexInfo()
for indexes. We can rely on that in RelationIdGetRelation(), instead
of calling RelationReloadIndexInfo() directly. That simplifies the
code a little.

In passing, add a comment in RelationBuildLocalRelation()
explaining why it doesn't call RelationInitIndexAccessInfo(). It's
because at index creation, it's called before the pg_index row has
been created. That's also the reason that RelationClearRelation()
still needs a special case to go through the full-blown rebuild if the
index support information in the relcache entry hasn't been populated
yet.

Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/9c9e8908-7b3e-4ce7-85a8-00c0e165a3d6%40iki.fi
2024-10-31 10:02:58 +02:00
David Rowley
3974bc3196 Remove unused field from SubPlanState struct
bf6c614a2 did some conversion work to use ExprState instead of manually
calling equality functions to check if one set of values is not distinct
from another set.  That patch removed many of the fields that became
redundant as a result of that change, but it forgot to remove
SubPlanState.tab_eq_funcs.  Fix that.

In passing, fix the header comment for TupleHashEntryData to correctly
spell the field name it's talking about.

Author: Rafia Sabih <rafia.pghackers@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Discussion: https://postgr.es/m/CA+FpmFeycdombFzrjZw7Rmc29CVm4OOzCWwu=dVBQ6q=PX8SvQ@mail.gmail.com
Discussion: https://postgr.es/m/CAApHDvrWR2jYVhec=COyF2g2BE_ns91NDsCHAMFiXbyhEujKdQ@mail.gmail.com
2024-10-31 13:44:15 +13:00
Michael Paquier
baa1ae0429 injection_points: Improve comment about disabled isolation permutation
9f00edc22888 has disabled a permutation due to failures in the CI for
FreeBSD environments, but this is a matter of timing.  Let's document
properly why this type of permutation is a bad idea if relying on a wait
done in a SQL function, so as this can be avoided when implementing new
tests (this spec is also a template).

Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/ZyCa2qsopKaw3W3K@paquier.xyz
2024-10-31 08:28:20 +09:00
Peter Geoghegan
492e6b54c6 nbtree: assert no scheduled primscan between pages.
Follow-up to bugfix commit 763d65ae.  Technically this new assertion is
redundant with the assertion recently added to _bt_readpage by that same
commit, but it seems like a good idea to have both.

The new assertion makes it clear that we expect to call _bt_readnextpage
when there's another primitive index scan scheduled, though only when
needed as the final step of ending the current primitive scan.
2024-10-30 15:53:26 -04:00
Peter Geoghegan
81a25790f1 Clarify nbtree array exhaustion comments.
Strictly speaking, we only need to make sure to leave the scan's array
keys in their final positions (final for the current scan direction) to
handle SAOP array exhaustion because btgettuple might only return a
subset of the items for the final page (final for the current scan
direction), before the scan changes direction.  While it's typical for
so->currPos to be invalidated shortly after the scan's arrays are first
exhausted, and while so->currPos invalidation does obviate the need to
leave the scan's arrays in any particular state, we can't rely on any of
that actually happening when handling array exhaustion.  Adjust comments
to make all of that a lot clearer.

Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp
execution.
2024-10-30 13:43:49 -04:00
Nathan Bossart
849110dd3e Optimize sifting down in binaryheap.
Presently, each iteration of the loop in sift_down() will perform
3 comparisons if both children are larger than the parent node (2
for comparing each child to the parent node, and a third to compare
the children to each other).  By first comparing the children to
each other and then comparing the larger child to the parent node,
we can accomplish the same thing with just 2 comparisons (while
also not affecting the number of comparisons in any other case).
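
A sketch of the reordered comparisons for a max-heap of ints (purely
illustrative; binaryheap.c works with a caller-supplied comparator
rather than plain int arrays):

    #include <stddef.h>

    static void
    example_sift_down(int *heap, size_t nelems, size_t i)
    {
        for (;;)
        {
            size_t      left = 2 * i + 1;
            size_t      right = left + 1;
            size_t      largest = left;
            int         tmp;

            if (left >= nelems)
                break;

            /* Comparison 1: find the larger of the two children. */
            if (right < nelems && heap[right] > heap[left])
                largest = right;

            /* Comparison 2: stop if the larger child does not beat the parent. */
            if (heap[largest] <= heap[i])
                break;

            tmp = heap[i];
            heap[i] = heap[largest];
            heap[largest] = tmp;
            i = largest;
        }
    }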

Author: ChangAo Chen
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/tencent_0142D8DA90940B9930BCC08348BBD6D0BB07%40qq.com
2024-10-30 11:28:34 -05:00
Tom Lane
af21152268 Stabilize jsonb_path_query test case.
An operation like '12:34:56'::time_tz takes the UTC offset from
the prevailing time zone, which means that the results change
across DST transitions.  One of the test cases added in ed055d249
failed to consider this.

Per report from Bernhard Wiedemann.  Back-patch to v17, as the
test case was.

Discussion: https://postgr.es/m/ba8e1bc0-8a99-45b7-8397-3f2e94415e03@suse.de
2024-10-30 11:42:34 -04:00
Peter Geoghegan
763d65ae25 Fix bug in nbtree array primitive scan scheduling.
A bug in nbtree's handling of primitive index scan scheduling could lead
to wrong answers when a scrollable cursor was used with an index scan
that had a SAOP index qual.  Wrong answers were only possible when the
scan direction changed after a primitive scan was scheduled, but before
_bt_next was asked to fetch the next tuple in line (i.e. for things to
break, _bt_next had to be denied the opportunity to step off the page in
the same direction as the one used when the primscan was scheduled).
Furthermore, the issue only occurred when the page in question happened
to be the first page to be visited by the entire top-level scan; the
issue hinged upon the cursor backing up to the absolute beginning of the
key space that it returns tuples from (fetching in the opposite scan
direction across a "primitive scan boundary" always worked correctly).

To fix, make _bt_next unset the "needs primitive index scan" flag when
it detects that the current scan direction is not the one that was used
by _bt_readpage back when the primitive scan in question was scheduled.
This fixes the cases that are known to be faulty, and also seems like a
good idea on general robustness grounds.

Affected scrollable cursor cases now avoid a spurious primitive index
scan when they fetch backwards to the absolute start of the key space to
be visited by their cursor.  Fetching backwards now only returns those
tuples at the start of the scan, as expected.  It'll also be okay to
once again fetch forwards from the start at that point, since the scan
will be left in a state that's exactly consistent with the state it was
in before any tuples were ever fetched, as expected.

Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp
execution.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-Wznv49bFsE2jkt4GuZ0tU2C91dEST=50egzjY2FeOcHL4Q@mail.gmail.com
Backpatch: 17-, where commit 5bf748b8 first appears.
2024-10-30 10:57:19 -04:00
Álvaro Herrera
2d5fe51405
Fix some more bugs in foreign keys connecting partitioned tables
* In DetachPartitionFinalize() we were applying a tuple conversion map
  to tuples that didn't need one, which can lead to erratic behavior if
  a partitioned table has a partition with a different column order, as
  reported by Alexander Lakhin. This was introduced by 53af9491a043.
  Don't do that.  Also, modify a recently added test case to exercise
  this.

* The same function as well as CloneFkReferenced() were acquiring
  AccessShareLock on a partition, only to have CreateTrigger() later
  acquire ShareRowExclusiveLock on it.  This can lead to deadlock by
  lock escalation, unnecessarily.  Avoid that by acquiring the stronger
  lock to begin with.  This probably dates back to branch 12, but I have
  never seen a report of this being a problem in the field.

* Innocuous but wasteful: also introduced by 53af9491a043, we were
  reading a pg_constraint tuple from syscache that we don't need, as
  reported by Tender Wang.  Don't.

Backpatch to 15.

Discussion: https://postgr.es/m/461e9c26-2076-8224-e119-84998b6a784e@gmail.com
2024-10-30 10:54:03 +01:00
Peter Eisentraut
2845cd1ca0 meson: Add missing dependency to unicode test programs
The test programs in src/common/unicode/ (case_test, category_test,
norm_test) don't build with meson if the nls option is enabled,
because a libintl dependency is missing.  Fix that.  (The makefiles
are ok.)

Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://www.postgresql.org/message-id/flat/52db1d2b-4b96-473e-b323-a4b16a950fba%40eisentraut.org
2024-10-30 08:30:00 +01:00
Amit Kapila
745217a051 Replicate generated columns when specified in the column list.
This commit allows logical replication to publish and replicate generated
columns when explicitly listed in the column list. We also ensured that
the generated columns were copied during the initial tablesync when they
were published.

A separate commit will allow replicating generated columns even when they
are not specified in the column list (via a new publication option).

The motivation of this work is to allow replication in cases where the
client doesn't have generated columns, for example, when one is trying to
replicate data from Postgres to a non-Postgres database.

Author: Shubham Khanna, Vignesh C, Hou Zhijie
Reviewed-by: Peter Smith, Hayato Kuroda, Shlok Kyal, Amit Kapila
Discussion: https://postgr.es/m/B80D17B2-2C8E-4C7D-87F2-E5B4BE3C069E@gmail.com
2024-10-30 12:36:26 +05:30
Jeff Davis
f22e436bff Add missing CommandCounterIncrement() in stats import functions.
Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/98b2fcf0-f701-369e-d63d-6be9739ce17c@gmail.com
2024-10-29 10:14:23 -07:00
Noah Misch
30d47ec8c6 Unpin buffer before inplace update waits for an XID to end.
Commit a07e03fd8fa7daf4d1356f7cb501ffe784ea6257 changed inplace updates
to wait for heap_update() commands like GRANT TABLE and GRANT DATABASE.
By keeping the pin during that wait, a sequence of autovacuum workers
and an uncommitted GRANT starved one foreground LockBufferForCleanup()
for six minutes, on buildfarm member sarus.  Prevent, at the cost of a
bit of complexity.  Back-patch to v12, like the earlier commit.  That
commit and heap_inplace_lock() have not yet appeared in any release.

Discussion: https://postgr.es/m/20241026184936.ae.nmisch@google.com
2024-10-29 09:39:55 -07:00
Tom Lane
502e7bf7f0 Update time zone data files to tzdata release 2024b.
Historical corrections for Mexico, Mongolia, and Portugal.
Notably, Asia/Choibalsan is now an alias for Asia/Ulaanbaatar
rather than being a separate zone, mainly because the differences
between those zones were found to be based on untrustworthy data.
2024-10-29 11:49:38 -04:00
David Rowley
fcbd1bb661 Reduce variable scope and possibly useless palloc
Move the CreateStmt down to the branch that it's used in, thus
preventing the makeNode() call in cases where the CreateStmt isn't used.

Author: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://postgr.es/m/CAEudQAq=06YPWPhS+yyTbCwv5JLKRz8rm3dWx6JR5Uj_d_fQDA@mail.gmail.com
2024-10-30 01:38:42 +13:00
David Rowley
84b8fccbe5 Doc: add detail about EXPLAIN's "Disabled" property
c01743aa4 and later 161320b4b adjusted the EXPLAIN output to show which
plan nodes were chosen despite being disabled by the various enable*
GUCs.  Prior to e22253467, the disabledness of a node was only evident from
a large startup cost penalty.  Since we now explicitly tag disabled nodes
with a boolean property in EXPLAIN, let's add some documentation to
provide some details about why and when disabled nodes can appear in the
plan.

Author: Laurenz Albe, David Rowley
Discussion: https://postgr.es/m/883729e429267214753d5e438c82c73a58c3db5d.camel@cybertec.at
2024-10-29 23:28:12 +13:00
Peter Eisentraut
014720c6d9 Add missing FATAL => 'all' to a use warnings in Perl
Author: Anton Voloshin <a.voloshin@postgrespro.ru>
Discussion: https://www.postgresql.org/message-id/aa8a55d5-554a-4027-a491-1b0ca7c85f7a@postgrespro.ru
2024-10-29 10:26:17 +01:00
Michael Paquier
4b7bba49e7 doc: Add better description for rewrite functions in event triggers
There are two functions that can be used in event triggers to get more
details about a rewrite happening on a relation.  Both had a limited
documentation:
- pg_event_trigger_table_rewrite_reason() and
pg_event_trigger_table_rewrite_oid() were not mentioned in the main
event trigger section in the paragraph dedicated to the event
table_rewrite.
- pg_event_trigger_table_rewrite_reason() returns an integer which is a
bitmap of the reasons why a rewrite happens.  There was no explanation
about the meaning of these values, forcing the reader to look at the
code to find out that these are defined in event_trigger.h.

While at it, let's add a comment in event_trigger.h, where the
AT_REWRITE_* values are defined, reminding us to update the documentation
when these values are changed.

Backpatch down to 13 as a consequence of 1ad23335f36b, where this area
of the documentation has been heavily reworked.

Author: Greg Sabino Mullane
Discussion: https://postgr.es/m/CAKAnmmL+Z6j-C8dAx1tVrnBmZJu+BSoc68WSg3sR+CVNjBCqbw@mail.gmail.com
Backpatch-through: 13
2024-10-29 15:35:01 +09:00
David Rowley
dda781609f Doc: clarify enable_indexscan=off also disables Index Only Scans
Disabling enable_indexscan has always also disabled Index Only Scans.
Here we make that clearer in the documentation in an attempt to
prevent future complaints about this expected behavior.

Reported-by: Melanie Plageman
Author: David G. Johnston, David Rowley
Backpatch-through: 12, oldest supported version
Discussion: https://postgr.es/m/CAAKRu_atV=kovgpaLREyG68PB5+ncKvJ2UNoeRetEgyC3Yb5Sw@mail.gmail.com
2024-10-29 16:24:10 +13:00
Michael Paquier
49a23441ca Fix dependency of partitioned table and table AM with CREATE TABLE .. USING
A pg_depend entry between a partitioned table and its table access
method was missing when using CREATE TABLE .. USING with an unpinned
access method.  As a result, DROP ACCESS METHOD could succeed even though
it should be blocked, unless CASCADE is specified, when a partitioned
table depends on the table access method.  pg_class.relam would
then hold an orphaned OID value still pointing to the dropped AM.

The problem is fixed by adding a dependency between the partitioned
table and its table access method if set when the relation is created.
A test checking the contents of pg_depend in this case is added.

Issue introduced in 374c7a229042, that has added support for CREATE
TABLE .. USING for partitioned tables.

Reviewed-by: Alexander Lakhin
Discussion: https://postgr.es/m/18674-1ef01eceec278fab@postgresql.org
Backpatch-through: 17
2024-10-29 08:41:33 +09:00
Nathan Bossart
70b9adb98e Ensure we have a snapshot when updating pg_index in index_drop().
I assumed that all index_drop() callers set an active snapshot
beforehand, but that is evidently not true.  One counterexample is
autovacuum, which doesn't set an active snapshot when cleaning up
orphan temp indexes.  To fix, unconditionally push an active
snapshot before updating pg_index in index_drop().

Oversight in commit b52adbad46.

Reported-by: Masahiko Sawada
Reviewed-by: Stepan Neretin, Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoBgF9etQrXbN9or_YHsmBRJHHNUEkhHp9rGK9CyQv5aTQ%40mail.gmail.com
2024-10-28 16:44:31 -05:00
Tom Lane
11b7de4a78 Unify src/common/'s definitions of MaxAllocSize.
As threatened in the previous patch, define MaxAllocSize in
src/include/common/fe_memutils.h rather than having several
copies of it in different src/common/*.c files.  This also
provides an opportunity to document it better.

While this would probably be safe to back-patch, I'll refrain
(for now anyway).
2024-10-28 14:39:01 -04:00
Tom Lane
bd28431672 Guard against enormously long input in pg_saslprep().
Coverity complained that pg_saslprep() could suffer integer overflow,
leading to under-allocation of the output buffer, if the input string
exceeds SIZE_MAX/4.  This hazard seems largely hypothetical, but it's
easy enough to defend against, so let's do so.
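
The defensive pattern, sketched with a stand-in allocation limit
(PostgreSQL's MaxAllocSize is 0x3fffffff bytes; the helper name here is
hypothetical):

    #include <stdbool.h>
    #include <stddef.h>

    /* Stand-in for PostgreSQL's MaxAllocSize. */
    #define EXAMPLE_MAX_ALLOC ((size_t) 0x3fffffff)

    /*
     * Reject input lengths for which "length * 4" could overflow or exceed the
     * allocator's limit, before doing the multiplication.
     */
    static bool
    example_output_length_ok(size_t input_len)
    {
        return input_len <= EXAMPLE_MAX_ALLOC / 4;
    }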

This patch creates a third place in src/common/ where we are locally
defining MaxAllocSize so that we can test against that in the same way
in backend and frontend compiles.  That seems like about two places
too many, so the next patch will move that into common/fe_memutils.h.
I'm hesitant to do that in back branches however.

Back-patch to v14.  The code looks similar in older branches, but
before commit 67a472d71 there was a separate test on the input string
length that prevented this hazard.

Per Coverity report.
2024-10-28 14:33:55 -04:00
Tom Lane
6cfb3a3374 Strip Windows newlines from extension script files manually.
Revert commit 924e03917 in favor of adding code to convert \r\n to \n
explicitly, on Windows only.  The idea of letting text mode do the
work fails for a couple of reasons:

* Per Microsoft documentation, text mode also causes control-Z to be
interpreted as end-of-file.  While it may be unlikely that extension
scripts contain control-Z, we've historically allowed it, and breaking
the case doesn't seem wise.

* Apparently, on some Windows configurations, "r" mode is interpreted
as binary not text mode.  We could force it with "rt" but that would
be inconsistent with our code elsewhere, and it would still require
Windows-specific coding.
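
A minimal sketch of the explicit conversion (not the actual extension.c
code):

    #include <stddef.h>

    /*
     * Collapse \r\n sequences to \n in place and return the new length.
     * A lone \r not followed by \n is left untouched.
     */
    static size_t
    example_strip_crlf(char *buf)
    {
        char       *src = buf;
        char       *dst = buf;

        while (*src != '\0')
        {
            if (src[0] == '\r' && src[1] == '\n')
                src++;          /* drop the \r; the \n is copied below */
            *dst++ = *src++;
        }
        *dst = '\0';
        return (size_t) (dst - buf);
    }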

Thanks to Alexander Lakhin for investigation.

Discussion: https://postgr.es/m/79284195-4993-7b00-f6df-8db28ca60fa3@gmail.com
2024-10-28 13:07:32 -04:00
Peter Eisentraut
8a98822bcc Fix WAL_DEBUG build
broken by commit e18512c000e

Reported-by: Peter Geoghegan <pg@bowt.ie>
2024-10-28 17:44:18 +01:00
Peter Geoghegan
123474cbce nbtree: Minor sibling link traversal tweaks.
Tweak some code comments for clarity, and relocate some local variable
declarations to the scope where they're actually used.

Follow-up to recent commit 1bd4bc85.
2024-10-28 12:22:52 -04:00
Heikki Linnakangas
de5afddc3b Fix overflow in bsearch_arg() with more than INT_MAX elements
This was introduced in commit bfa2cee784, which replaced the old
bsearch_cmp() function we had in extended_stats.c with the current
implementation. The original discussion or commit message of
bfa2cee784 didn't mention where the new implementation came from, but
based on some googling, I'm guessing *BSD or libiberty, all of which
share this same code, with or without this fix.
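
As a generic illustration of how this class of overflow arises and how to
avoid it (this is a sketch, not the actual bsearch_arg() implementation or
its specific fix): with size_t indices, compute the midpoint as
low + (high - low) / 2 so it stays correct for arrays with more than
INT_MAX elements.

    #include <stddef.h>

    static const void *
    example_bsearch(const void *key, const void *base, size_t nmemb, size_t size,
                    int (*compar)(const void *, const void *))
    {
        size_t      low = 0;
        size_t      high = nmemb;

        while (low < high)
        {
            /* low + (high - low) / 2 cannot overflow, unlike (low + high) / 2 */
            size_t      mid = low + (high - low) / 2;
            const char *elem = (const char *) base + mid * size;
            int         cmp = compar(key, elem);

            if (cmp == 0)
                return elem;
            if (cmp > 0)
                low = mid + 1;
            else
                high = mid;
        }
        return NULL;
    }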

Author: Ranier Vilela
Reviewed-by: Nathan Bossart
Backpatch-through: 14
Discussion: https://www.postgresql.org/message-id/CAEudQAp34o_8u6sGSVraLwuMv9F7T9hyHpePXHmRaxR2Aboi%2Bw%40mail.gmail.com
2024-10-28 14:07:38 +02:00
Heikki Linnakangas
22bb889f70 Restore missing line to copyright notice
Commit 12c9423832 in May 2003 accidentally removed the last line of
the copyright notice in getopt.c. Put it back.
2024-10-28 13:08:43 +02:00
Peter Eisentraut
9be4e5d293 Remove unused #include's from contrib, pl, test .c files
as determined by IWYU

Similar to commit dbbca2cf299, but for contrib, pl, and src/test/.

Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/0df1d5b1-8ca8-4f84-93be-121081bde049%40eisentraut.org
2024-10-28 08:02:17 +01:00
Amit Kapila
1bf1140be8 Change the default value of the streaming option to 'parallel'.
Previously the default value of streaming option for a subscription was
'off'. The parallel option indicates that the changes in large
transactions (greater than logical_decoding_work_mem) are to be applied
directly via one of the parallel apply workers, if available.

The parallel mode was introduced in 16, but we refrained from enabling it
by default to avoid any unpleasant behavior in existing applications.
However, we haven't found any such report yet, so this is a good time to
enable it by default.

Reported-by: Vignesh C
Author: Hayato Kuroda, Masahiko Sawada, Peter Smith, Amit Kapila
Discussion: https://postgr.es/m/CALDaNm1=MedhW23NuoePJTmonwsMSp80ddsw+sEJs0GUMC_kqQ@mail.gmail.com
2024-10-28 08:42:05 +05:30
Michael Paquier
6b652e6ce8 Set query ID for inner queries of CREATE TABLE AS and DECLARE
Some utility statements contain queries that can be planned and
executed: CREATE TABLE AS and DECLARE CURSOR.  This commit adds query ID
computation for the inner queries executed by these two utility
commands, with and without EXPLAIN.  This change leads to four new
callers of JumbleQuery() and post_parse_analyze_hook() so as extensions
can decide what to do with this new data.

Previously, extensions relying on the query ID, like pg_stat_statements,
were not able to track these nested queries as the query_id was 0.

For pg_stat_statements, this commit leads to additions under !toplevel
when pg_stat_statements.track is set to "all", as shown in its
regression tests.  The output of EXPLAIN for these two utilities gains a
"Query Identifier" if compute_query_id is enabled.

Author: Anthonin Bonnefoy
Reviewed-by: Michael Paquier, Jian He
Discussion: https://postgr.es/m/CAO6_XqqM6S9bQ2qd=75W+yKATwoazxSNhv5sjW06fjGAtHbTUA@mail.gmail.com
2024-10-28 09:03:20 +09:00
Peter Geoghegan
33b2fbe050 Fix obsolete nbtree split buffer comment.
Oversight in commit d088ba5a.
2024-10-27 10:38:24 -04:00
Peter Eisentraut
e18512c000 Remove unused #include's from backend .c files
as determined by IWYU

These are mostly issues that are new since commit dbbca2cf299.

Discussion: https://www.postgresql.org/message-id/flat/0df1d5b1-8ca8-4f84-93be-121081bde049%40eisentraut.org
2024-10-27 08:26:50 +01:00
Jeff Davis
3aa2373c11 Refactor the code to create a pg_locale_t into new function.
Reviewed-by: Andreas Karlsson
Discussion: https://postgr.es/m/59da7ee4-5e1a-4727-b464-a603c6ed84cd@proxel.se
2024-10-25 16:31:08 -07:00
Tom Lane
924e03917d Read extension script files in text not binary mode.
This change affects only Windows, where it should cause DOS-style
newlines (\r\n) to be converted to plain \n during script loading.
This eliminates one potential discrepancy in the behavior of
extension script files between Windows and non-Windows.  While
there's a small chance that this might cause undesirable behavior
changes for some extensions, it can also be argued that this may
remove behavioral surprises for others.  An example is that in
the buildfarm, we are getting different results for the tests
added by commit 774171c4f depending on whether our git tree has
been checked out with Unix or DOS newlines.

The choice to use binary mode goes all the way back to our invention
of extensions in commit d9572c4e3.  However, I suspect it was not
thought through carefully but was just a side-effect of the ready
availability of an almost-suitable function read_binary_file().
On balance, changing to text mode seems like a better answer than
other ways in which we might fix the inconsistent test results.

Discussion: https://postgr.es/m/2480333.1729784872@sss.pgh.pa.us
2024-10-25 12:19:58 -04:00
Melanie Plageman
de380a62b5 Make table_scan_bitmap_next_block() async-friendly
Move all responsibility for indicating that a block is exhausted into
table_scan_bitmap_next_tuple() and advance the main iterator in
heap-specific code. This flow control makes more sense and is a step
toward using the read stream API for bitmap heap scans.

Previously, table_scan_bitmap_next_block() returned false to indicate
table_scan_bitmap_next_tuple() should not be called for the tuples on
the page. This happened both when 1) there were no visible tuples on the
page and 2) when the block returned by the iterator was past the end of
the table. BitmapHeapNext() (generic bitmap table scan code) handled the
case when the bitmap was exhausted.

It makes more sense for table_scan_bitmap_next_tuple() to return false
when there are no visible tuples on the page and
table_scan_bitmap_next_block() to return false when the bitmap is
exhausted or there are no more blocks in the table.

As part of this new design, TBMIterateResults are no longer used as a
flow control mechanism in BitmapHeapNext(), so we removed
table_scan_bitmap_next_tuple's TBMIterateResult parameter.

Note that the prefetch iterator is still saved in the
BitmapHeapScanState node and advanced in generic bitmap table scan code.
This is because 1) it was not necessary to change the prefetch iterator
location to change the flow control in BitmapHeapNext() 2) modifying
prefetch iterator management requires several more steps better split
over multiple commits and 3) the prefetch iterator will be removed once
the read stream API is used.

Author: Melanie Plageman
Reviewed-by: Tomas Vondra, Andres Freund, Heikki Linnakangas, Mark Dilger
Discussion: https://postgr.es/m/063e4eb4-32d9-439e-a0b1-75565a9835a8%40iki.fi
2024-10-25 10:11:58 -04:00
Melanie Plageman
7bd7aa4d30 Move EXPLAIN counter increment to heapam_scan_bitmap_next_block
Increment the lossy and exact page counters for EXPLAIN of bitmap heap
scans in heapam_scan_bitmap_next_block(). Note that other table AMs will
need to do this as well.

Pushing the counters into heapam_scan_bitmap_next_block() is required to
be able to use the read stream API for bitmap heap scans. The bitmap
iterator must be advanced from inside the read stream callback, so
TBMIterateResults cannot be used as a flow control mechanism in
BitmapHeapNext().

Author: Melanie Plageman
Reviewed-by: Tomas Vondra, Heikki Linnakangas
Discussion: https://postgr.es/m/063e4eb4-32d9-439e-a0b1-75565a9835a8%40iki.fi
2024-10-25 10:11:46 -04:00
Noah Misch
8e7e672cda WAL-log inplace update before revealing it to other sessions.
A buffer lock won't stop a reader that has already checked tuple
visibility.  If a vac_update_datfrozenid() and then a crash happened
during inplace update of a relfrozenxid value, datfrozenxid could
overtake relfrozenxid.  That could lead to "could not access status of
transaction" errors.  Back-patch to v12 (all supported versions).  In
v14 and earlier, this also back-patches the assertion removal from
commit 7fcf2faf9c7dd473208fd6d5565f88d7f733782b.

Discussion: https://postgr.es/m/20240620012908.92.nmisch@google.com
2024-10-25 06:51:03 -07:00
Noah Misch
243e9b40f1 For inplace update, send nontransactional invalidations.
The inplace update survives ROLLBACK.  The inval didn't, so another
backend's DDL could then update the row without incorporating the
inplace update.  In the test this fixes, a mix of CREATE INDEX and ALTER
TABLE resulted in a table with an index, yet relhasindex=f.  That is a
source of index corruption.  Back-patch to v12 (all supported versions).
The back branch versions don't change WAL, because those branches just
added end-of-recovery SIResetAll().  All branches change the ABI of
extern function PrepareToInvalidateCacheTuple().  No PGXN extension
calls that, and there's no apparent use case in extensions.

Reviewed by Nitin Motiani and (in earlier versions) Andres Freund.

Discussion: https://postgr.es/m/20240523000548.58.nmisch@google.com
2024-10-25 06:51:02 -07:00
Daniel Gustafsson
0fe173680e doc: Fix typo in pg_restore_*_stats function documentation
Fix accidental typo from d32d146399, s/intepretation/interpretation/
2024-10-25 14:00:13 +02:00
Alexander Korotkov
aa1e898dea Fix concurrrently in typcache_rel_type_cache.sql
All injection points there must be local.  Otherwise they affect parallel
tests.

Reported-by: Andres Freund
Discussion: https://postgr.es/m/b3ybc66l6lhmtzj2n7ypumz5yjz7njc46sddsqshdtstgj74ah%40qgtn6nzokj6a
2024-10-25 13:12:16 +03:00
Amit Kapila
b8a046081c Doc: Add a caution in alter publication.
Clarify that altering the 'publish_via_partition_root' option can lead to
data loss or duplication when a partition root table is specified as the
replication target.

Reported-by: Maxim Boguk
Author: Hayato Kuroda
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C
Discussion: https://postgr.es/m/18644-6866bbd22178ee16@postgresql.org
2024-10-25 14:19:05 +05:30
Tatsuo Ishii
7175ef870e pgbench: Fix typo.
Fix typo in commit cae0f3c405.

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/70eaa41b-805b-ce19-6004-5a0dccd3f731%40gmail.com
2024-10-25 15:06:16 +09:00
Michael Paquier
248c2d1923 Refactor code converting a publication name List to a StringInfo
The existing get_publications_str() is renamed to GetPublicationsStr()
and moved to pg_subscription.c, so that it can be reused at the two
locations in the tablesync code where the same logic was duplicated.

fetch_remote_table_info() was doing two List->StringInfo conversions
when dealing with a server of version 15 or newer.  The conversion
happens only once now.

This refactoring leads to less code overall.

Author: Peter Smith
Reviewed-by: Michael Paquier, Masahiko Sawada
Discussion: https://postgr.es/m/CAHut+PtJMk4bKXqtpvqVy9ckknCgK9P6=FeG8zHF=6+Em_Snpw@mail.gmail.com
2024-10-25 12:02:04 +09:00
Michael Paquier
1564339bfe Add install rules for Kerberos.pm and AdjustUpgrade.pm
For the same reasons as c3a0818460a8, these can be useful for
out-of-core extension testing.  Kerberos.pm was recently moved to its
current path in 9f899562d420, and AdjustUpgrade.pm was introduced in
52585f8f072a, yet both lacked [un]installation rules for both meson and
configure.

Reported-by: Ashutosh Bapat
Discussion: https://postgr.es/m/ZozqzznkDhfCG7Ng@paquier.xyz
2024-10-25 10:56:34 +09:00
Michael Paquier
9f00edc228 injection_points: Disable one permutation in isolation test "basic"
The first permutation done in the test does a wait, a wakeup, then a
detach.  It is proving to be unstable in the CI for FreeBSD (Windows and
Linux are stable).  The failure shows that the wait is so slow to finish
after being woken up that the detach has time to finish before the wait
does, messing up the expected output.

There may be a platform-specific issue going on here, but for now
disable this permutation to make the CI runs more stable.

Discussion: https://postgr.es/m/ZxrnSGdNtQWAxE3_@paquier.xyz
2024-10-25 10:34:27 +09:00
Richard Guo
ffe12d1d22 Remove the RTE_GROUP RTE if we drop the groupClause
For an EXISTS subquery, the only thing that matters is whether it
returns zero or more than zero rows.  Therefore, we remove certain SQL
features that won't affect that, among them the GROUP BY clauses.

After we drop the groupClause, we'd better remove the RTE_GROUP RTE
and clear the hasGroupRTE flag, as they depend on the groupClause.
Failing to do so could result in a bogus RTE_GROUP entry in the parent
query, leading to an assertion failure on the hasGroupRTE flag.

Reported-by: David Rowley
Author: Richard Guo
Discussion: https://postgr.es/m/CAApHDvp2_yht8uPLyWO-kVGWZhYvx5zjGfSrg4fBQ9fsC13V0g@mail.gmail.com
2024-10-25 09:52:34 +09:00
Jeff Davis
d32d146399 Add functions pg_restore_relation_stats(), pg_restore_attribute_stats().
Similar to the pg_set_*_stats() functions, except with a variadic
signature that's designed to be more future-proof. Additionally, most
problems are reported as WARNINGs rather than ERRORs, allowing most
stats to be restored even if some cannot.

These functions are intended to be called from pg_dump to avoid the
need to run ANALYZE after an upgrade.

Author: Corey Huinker
Discussion: https://postgr.es/m/CADkLM=eErgzn7ECDpwFcptJKOk9SxZEk5Pot4d94eVTZsvj3gw@mail.gmail.com
2024-10-24 12:08:00 -07:00
Tom Lane
534d0ea6c2 Generalize plpgsql's heuristic for importing expanded objects.
If a R/W expanded-object pointer is passed as a function parameter,
take ownership of the object, regardless of its type.  Previously
this happened only for expanded arrays, but that was a result of
sloppy thinking.  (If the plpgsql function did not end by returning
the object, the result would be to leak the object until the
surrounding memory context is cleaned up.  That's not awful,
since non-expanded values have always been managed that way,
but we can do better.)

Per discussion with Michel Pelletier.  There's a lot more to do
here to make plpgsql work efficiently with expanded objects that
aren't arrays, but this is an easy first step.

Discussion: https://postgr.es/m/CACxu=vJaKFNsYxooSnW1wEgsAO5u_v1XYBacfVJ14wgJV_PYeg@mail.gmail.com
2024-10-24 13:28:22 -04:00
Noah Misch
67bab53d64 Fix parallel worker tracking of new catalog relfilenumbers.
Reunite RestorePendingSyncs() with RestoreRelationMap().  If
RelationInitPhysicalAddr() ran after RestoreRelationMap() but before
RestorePendingSyncs(), the relcache entry could cause RelationNeedsWAL()
to return true erroneously.  Trouble required commands of the current
transaction to include REINDEX or CLUSTER of a system catalog.  The
parallel leader correctly derived RelationNeedsWAL()==false from the new
relfilenumber, but the worker saw RelationNeedsWAL()==true.  Worker
MarkBufferDirtyHint() then wrote unwanted WAL.  Recovery of that
unwanted WAL could lose tuples like the system could before commit
c6b92041d38512a4176ed76ad06f713d2e6c01a8 introduced this tracking.
RestorePendingSyncs() and RestoreRelationMap() were adjacent till commit
126ec0bc76d044d3a9eb86538b61242bf7da6db4, so no back-patch for now.

Reviewed by Tom Lane.

Discussion: https://postgr.es/m/20241019232815.c6.nmisch@google.com
2024-10-24 09:16:38 -07:00
Noah Misch
e947224cbb Stop reading uninitialized memory in heap_inplace_lock().
Stop computing a never-used value.  This removes the read; the read had
no functional implications.  Back-patch to v12, like commit
a07e03fd8fa7daf4d1356f7cb501ffe784ea6257.

Reported by Alexander Lakhin.

Discussion: https://postgr.es/m/6c92f59b-f5bc-e58c-9bdd-d1f21c17c786@gmail.com
2024-10-24 09:16:14 -07:00
Fujii Masao
86c30cef4a Refactor GetLockStatusData() to skip backends/groups without fast-path locks.
Previously, GetLockStatusData() checked all slots for every backend
to gather fast-path lock data, which could be inefficient. This commit
refactors it by skipping backends with PID=0 (since they don't hold
fast-path locks) and skipping groups with no registered fast-path locks,
improving efficiency.

This refactoring is particularly beneficial, for example when
max_connections and max_locks_per_transaction are set high,
as it reduces unnecessary checks across numerous slots.

Author: Fujii Masao
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/a0a00c44-31e9-4c67-9846-fb9636213ac9@oss.nttdata.com
2024-10-25 00:18:32 +09:00
Daniel Gustafsson
45188c2ea2 Support configuring TLSv1.3 cipher suites
The ssl_ciphers GUC can only set cipher suites for TLSv1.2 and lower
connections.  For TLSv1.3 connections a different OpenSSL API must be
used.  This adds a new GUC, ssl_tls13_ciphers, which can be used to
configure a colon-separated list of cipher suites to support when
performing a TLSv1.3 handshake.

Original patch by Erica Zhang with additional hacking by me.
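
A hedged usage sketch (the suite names are standard TLSv1.3 cipher suites and
are only illustrative):

    ALTER SYSTEM SET ssl_tls13_ciphers =
        'TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256';
    SELECT pg_reload_conf();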

Author: Erica Zhang <ericazhangy2021@qq.com>
Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/tencent_063F89FA72CCF2E48A0DF5338841988E9809@qq.com
2024-10-24 15:20:32 +02:00
Daniel Gustafsson
3d1ef3a15c Support configuring multiple ECDH curves
The ssl_ecdh_curve GUC only accepts a single value, but the TLS
handshake can list multiple curves in the groups extension (the
extension was renamed since it now covers more than elliptic curves).
This changes the GUC to accept a colon-separated list of curves.
This commit also renames the GUC to ssl_groups to match the new
nomenclature for the TLS extension.

Original patch by Erica Zhang with additional hacking by me.
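
Analogously to the TLSv1.3 cipher suite setting above, a hedged sketch of the
new list form (the group names are illustrative):

    ALTER SYSTEM SET ssl_groups = 'X25519:prime256v1:secp384r1';
    SELECT pg_reload_conf();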

Author: Erica Zhang <ericazhangy2021@qq.com>
Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/tencent_063F89FA72CCF2E48A0DF5338841988E9809@qq.com
2024-10-24 15:20:28 +02:00
Daniel Gustafsson
6c66b7443c Raise the minimum supported OpenSSL version to 1.1.1
Commit a70e01d4306fdbcd retired support for OpenSSL 1.0.2 in order to get
rid of the need for manual initialization of the library.  This left our
API usage compatible with 1.1.0, which was defined as the minimum required
version. Also mention that 3.4 is the minimum version required when using
LibreSSL.

An upcoming commit will introduce support for configuring TLSv1.3 cipher
suites, which requires an API call available only in OpenSSL 1.1.1 and
onwards.  To support this setting, this commit sets 1.1.1 as the new
minimum required version.  The version-specific call for randomness init added
in commit c3333dbc0c0 is removed as it's no longer needed.

Author: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/909A668B-06AD-47D1-B8EB-A164211AAD16@yesql.se
Discussion: https://postgr.es/m/tencent_063F89FA72CCF2E48A0DF5338841988E9809@qq.com
2024-10-24 15:20:19 +02:00
Daniel Gustafsson
f81855171f Handle alphanumeric characters in matching GUC names
The check for whether all GUCs are present in the sample config
file used the POSIX character class :alpha:, which matches alphabetic
characters but not digits.  Since GUC names can contain digits as
well, we need to use the :alnum: character class instead.

Author: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/2CB04559-B1D8-4558-B6F0-8F09093D629F@yesql.se
2024-10-24 15:20:16 +02:00
Alexander Korotkov
e546989a26 Add 'no_error' argument to pg_wal_replay_wait()
This argument allows skipping the error.  Instead, the result status
can be obtained using the pg_wal_replay_wait_status() function.

Catversion is bumped.
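
A hedged sketch of how this might be used on a standby (the argument names and
the millisecond timeout are assumptions, not confirmed by this commit message):

    CALL pg_wal_replay_wait('0/306EE20', timeout => 1000, no_error => true);
    SELECT pg_wal_replay_wait_status();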

Reported-by: Michael Paquier
Discussion: https://postgr.es/m/ZtUF17gF0pNpwZDI%40paquier.xyz
Reviewed-by: Pavel Borisov
2024-10-24 15:02:21 +03:00
Alexander Korotkov
73da6b8d1b Refactor WaitForLSNReplay() to return the result of waiting
Currently, WaitForLSNReplay() immediately throws an error if waiting for LSN
replay is not successful.  This commit teaches WaitForLSNReplay() to return
the result of waiting, while making pg_wal_replay_wait() responsible for
throwing an appropriate error.

This is preparation to adding 'no_error' argument to pg_wal_replay_wait() and
new function pg_wal_replay_wait_status(), which returns the last wait result
status.

Additionally, we stop distinguishing between finding our instance not in
recovery before entering the waiting loop and finding that inside the
waiting loop.  Standby promotion may happen at any moment, even between
issuing a procedure call statement and pg_wal_replay_wait() doing a first
check of recovery status.  Thus, there is no point in distinguishing these
situations.

Also, since we may exit the waiting loop and see our instance not in recovery
without throwing an error, we need to call deleteLSNWaiter() in that case.  We
do this unconditionally for the sake of simplicity; even if the standby was
already promoted after reaching the target LSN, the startup process has surely
already deleted us.

Reported-by: Michael Paquier
Discussion: https://postgr.es/m/ZtUF17gF0pNpwZDI%40paquier.xyz
Reviewed-by: Michael Paquier, Pavel Borisov
2024-10-24 14:38:27 +03:00
Alexander Korotkov
6cfebfe88b Make WaitForLSNReplay() issue FATAL on postmaster death
Reported-by: Michael Paquier
Discussion: https://postgr.es/m/ZvY2C8N4ZqgCFaLu%40paquier.xyz
Reviewed-by: Pavel Borisov
2024-10-24 14:38:06 +03:00
Alexander Korotkov
5035172e4a Move LSN waiting declarations and definitions to better place
3c5db1d6b implemented the pg_wal_replay_wait() stored procedure.  Due to
the patch development history, the implementation resided in
src/backend/commands/waitlsn.c (src/include/commands/waitlsn.h for headers).

014f9f34d moved pg_wal_replay_wait() itself to
src/backend/access/transam/xlogfuncs.c near to the WAL-manipulation functions.
But most of the implementation stayed in place.

The code in src/backend/commands/waitlsn.c has nothing to do with commands,
but is related to WAL.  So, this commit moves this code into
src/backend/access/transam/xlogwait.c (src/include/access/xlogwait.h for
headers).

Reported-by: Peter Eisentraut
Discussion: https://postgr.es/m/18c0fa64-0475-415e-a1bd-665d922c5201%40eisentraut.org
Reviewed-by: Pavel Borisov
2024-10-24 14:37:53 +03:00
Alexander Korotkov
b85a9d046e Avoid looping over all type cache entries in TypeCacheRelCallback()
Currently, when a single relcache entry gets invalidated,
TypeCacheRelCallback() has to loop over all type cache entries to find
the appropriate typentry to invalidate.  Unfortunately, using the syscache
here is impossible, because this callback could be called outside a
transaction, which makes catalog lookups impossible.  This is why the
present commit introduces RelIdToTypeIdCacheHash to map a relation OID to
its composite type OID.

We keep a RelIdToTypeIdCacheHash entry while the corresponding type cache
entry has something to clean.  Therefore, RelIdToTypeIdCacheHash shouldn't
get bloated by a flood of temporary tables.

There are many places in lookup_type_cache() where syscache invalidation,
user interruption, or even an error could occur.  To handle this, we keep
an array of in-progress type cache entries.  If lookup_type_cache() is
interrupted, this array is processed to keep RelIdToTypeIdCacheHash in a
consistent state.

Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov, Pavel Borisov, Jian He, Alexander Lakhin
Reviewed-by: Artur Zakirov
2024-10-24 14:35:52 +03:00
Alexander Korotkov
c1500a1ba7 Update header comment for lookup_type_cache()
Describe the way we handle concurrent invalidation messages.

Discussion: https://postgr.es/m/CAPpHfdsQhwUrnB3of862j9RgHoJM--eRbifvBMvtQxpC57dxCA%40mail.gmail.com
Reviewed-by: Andrei Lepikhov, Artur Zakirov, Pavel Borisov
2024-10-24 14:34:16 +03:00
Michael Paquier
499edb0974 Track more precisely query locations for nested statements
Previously, a Query generated through the transform phase would have an
unset stmt_location, the field tracking the starting point of a query string.

Extensions relying on the statement location to extract the relevant
parts of the source text string would fall back to using the whole
statement instead, leading to confusing results in pg_stat_statements
for queries relying on nested queries, like:
- EXPLAIN, with the top-level and nested query using the same query string,
and a query ID coming from the nested query for the non-top-level entry.
- Multi-statements, with only partial portions of queries being
normalized.
- COPY TO with a query, SELECT or DMLs.

This patch improves things by keeping track of the statement locations
and propagating them to the Query during transform, allowing PGSS to show
only the relevant part of the query for nested queries.  This leads to
less bloat for non-top-level entries, as queries can now be grouped under
the same (toplevel, queryid) pairs in pg_stat_statements.  The result is
a stricter one-to-one mapping between query IDs and their query strings.

The regression tests introduced in 45e0ba30fc40 produce differences
reflecting the new logic.
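
For illustration only, the effect is easiest to see by including the toplevel
flag when reading pg_stat_statements (column names assumed from the extension,
not from this commit):

    SELECT toplevel, queryid, query
    FROM pg_stat_statements
    ORDER BY toplevel DESC, queryid;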

Author: Anthonin Bonnefoy
Reviewed-by: Michael Paquier, Jian He
Discussion: https://postgr.es/m/CAO6_XqqM6S9bQ2qd=75W+yKATwoazxSNhv5sjW06fjGAtHbTUA@mail.gmail.com
2024-10-24 09:29:54 +09:00
Jeff Davis
4b096c67e0 Improve pg_set_attribute_stats() error message.
Previously, an invalid attribute name was caught, but the error
message was unhelpful.
2024-10-23 16:16:39 -07:00
Masahiko Sawada
7b8b8dddd6 Fix typo in tidstore.h.
An oversight in commit f6bef362c.

Reviewed-by: David Rowley
Discussion: https://postgr.es/m/CAD21AoB8MJ5OHtpUw1UEGf7spioFmP3PNH44KNx6Yb3FiZSwKA%40mail.gmail.com
2024-10-23 15:37:00 -07:00
Jeff Davis
0a3f983821 Another documentation fixup.
Reported-by: Erik Rijkers
2024-10-23 10:28:31 -07:00
Jeff Davis
56b1e88c80 Fix compiler warning.
Some buildfarm members complained about an always-true test in the
SOFT_ERROR_OCCURRED macro. Fix by reading the field directly rather
than using the macro.

Reported-by: Tom Lane
Discussion: https://postgr.es/m/2144895.1729653514@sss.pgh.pa.us
2024-10-23 10:24:17 -07:00
Jeff Davis
07d00692c8 Documentation fixup.
Wrong return type for pg_clear_attribute_stats().

Author: Noriyoshi Shinoda
Discussion: https://postgr.es/m/DM4PR84MB17347944F27A552F0CCDF84CEE4C2@DM4PR84MB1734.NAMPRD84.PROD.OUTLOOK.COM
2024-10-23 09:44:36 -07:00
Daniel Gustafsson
940f7a5627 Fix incorrect struct reference in comment
SASL frontend mechanisms are implemented with pg_fe_sasl_mech and
not the _be_ variant which is the backend implementation. Spotted
while reading adjacent code.
2024-10-23 16:13:28 +02:00
Daniel Gustafsson
6d16f9deba Make SASL max message length configurable
The proposed OAUTHBEARER SASL mechanism will need to allow larger
messages in the exchange, since tokens are sent directly by the
client.  Move this limit into the pg_be_sasl_mech struct so that
it can be changed per-mechanism.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAOYmi+nqX_5=Se0W0Ynrr55Fha3CMzwv_R9P3rkpHb=1kG7ZTQ@mail.gmail.com
2024-10-23 16:10:27 +02:00
Daniel Gustafsson
17b4aa77c3 doc: Fix INSERT statement syntax for identity columns
The INSERT statements in the examples were erroneously using
VALUE instead of VALUES. Backpatch to v17 where the examples
were added through a37bb7c1399.
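
A hedged sketch of the corrected spelling (table and column names are
illustrative, not taken from the documentation examples):

    INSERT INTO items (name) VALUES ('widget');  -- identity column filled automatically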

Reported-by: shixiong327926@gmail.com
Discussion: https://postgr.es/m/172958472112.696.6075270400394560263@wrigleys.postgresql.org
Backpatch-through: 17
2024-10-23 14:58:17 +02:00
Amit Langote
55e6d712af Remove unnecessary word in a comment
Relations opened by the executor are only closed once in
ExecCloseRangeTableRelations(), so the word "again" in the comment
for ExecGetRangeTableRelation() is misleading and unnecessary.

Discussion: https://postgr.es/m/CA+HiwqHnw-zR+u060i3jp4ky5UR0CjByRFQz50oZ05de7wUg=Q@mail.gmail.com
Backpatch-through: 12
2024-10-23 17:54:48 +09:00
Michael Paquier
a0bff38d13 ecpg: Fix out-of-bound read in DecodeDateTime()
It was possible for the code to read out-of-bounds data from the
"day_tab" table with some crafted input data.  Let's treat these as
invalid input, since the month number is incorrect.

A test is added to test this case with a check on the errno returned by
the decoding routine.  A test close to the new one added in this commit
was testing for a failure, but did not look at the errno generated, so
let's use this commit to also change it, adding a check on the errno
returned by DecodeDateTime().

Like the other test scripts, dt_test should likely be expanded to
include more checks based on the errnos generated in these code paths.
This is left as future work.

This issue exists since 2e6f97560a83, so backpatch all the way down.

Reported-by: Pavel Nekrasov
Author: Bruce Momjian, Pavel Nekrasov
Discussion: https://postgr.es/m/18614-6bbe00117352309e@postgresql.org
Backpatch-through: 12
2024-10-23 08:33:54 +09:00
Jeff Davis
ce207d2a79 Add functions pg_set_attribute_stats() and pg_clear_attribute_stats().
Enable manipulation of attribute statistics. Only superficial
validation is performed, so it's possible to add nonsense, and it's up
to the planner (or other users of statistics) to behave reasonably in
that case.

Bump catalog version.
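
A hedged sketch of a call (the parameter names are assumptions for
illustration, not confirmed by this commit message):

    SELECT pg_set_attribute_stats(
        relation   => 'public.orders'::regclass,
        attname    => 'status',
        inherited  => false,
        null_frac  => 0.1,
        avg_width  => 8,
        n_distinct => 4);
    SELECT pg_clear_attribute_stats('public.orders'::regclass, 'status', false);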

Author: Corey Huinker
Discussion: https://postgr.es/m/CADkLM=eErgzn7ECDpwFcptJKOk9SxZEk5Pot4d94eVTZsvj3gw@mail.gmail.com
2024-10-22 15:06:55 -07:00
Jeff Davis
dbe6bd4343 Change pg_*_relation_stats() functions to return type to void.
These functions will either raise an ERROR or run to normal
completion, so no return value is necessary.

Bump catalog version.

Author: Corey Huinker
Discussion: https://postgr.es/m/CADkLM=cBF8rnphuTyHFi3KYzB9ByDgx57HwK9Rz2yp7S+Om87w@mail.gmail.com
2024-10-22 12:48:01 -07:00
Tom Lane
774171c4f6 Improve reporting of errors in extension script files.
Previously, CREATE/ALTER EXTENSION gave basically no useful
context about errors reported while executing script files.
I think the idea was that you could run the same commands
manually to see the error, but that's often quite inconvenient.
Let's improve that.

If we get an error during raw parsing, we won't have a current
statement identified by a RawStmt node, but we should always get
a syntax error position.  Show the portion of the script from
the last semicolon-newline before the error position to the first
one after it.  There are cases where this might show only a
fragment of a statement, but that should be uncommon, and it
seems better than showing the whole script file.

Without an error cursor, if we have gotten past raw parsing (which
we probably have), we can report just the current SQL statement as
an item of error context.

In any case also report the script file name as error context,
since it might not be entirely obvious which of a series of
update scripts failed.  We can also show an approximate script
line number in case whatever we printed of the query isn't
sufficiently identifiable.

The error-context code path is already exercised by some
test_extensions test cases, but add tests for the syntax-error
path.

Discussion: https://postgr.es/m/ZvV1ClhnbJLCz7Sm@msg.df7cb.de
2024-10-22 11:31:45 -04:00
Tom Lane
14e5680eee Improve parser's reporting of statement start locations.
Up to now, the parser's reporting of a statement's stmt_location
included any preceding whitespace or comments.  This isn't really
desirable but was done to avoid accounting honestly for nonterminals
that reduce to empty.  It causes problems for pg_stat_statements,
which partially compensates by manually stripping whitespace, but
is not bright enough to strip /*-style comments.  There will be
more problems with an upcoming patch to improve reporting of errors
in extension scripts, so it's time to do something about this.

The thing we have to do to make it work right is to adjust
YYLLOC_DEFAULT to scan the inputs of each production to find the
first one that has a valid location (i.e., did not reduce to
empty).  In theory this adds a little bit of per-reduction overhead,
but in practice it's negligible.  I checked by measuring the time
to run raw_parser() on the contents of information_schema.sql, and
there was basically no change.

Having done that, we can rely on any nonterminal that didn't reduce
to completely empty to have a correct starting location, and we don't
need the kluges the stmtmulti production formerly used.

This should have a side benefit of allowing parse error reports to
include an error position in some cases where they formerly failed to
do so, due to trying to report the position of an empty nonterminal.
I did not go looking for an example though.  The one previously known
case where that could happen (OptSchemaEltList) no longer needs the
kluge it had; but I rather doubt that that was the only case.

Discussion: https://postgr.es/m/ZvV1ClhnbJLCz7Sm@msg.df7cb.de
2024-10-22 11:26:05 -04:00
Fujii Masao
7c4d3fe272 ecpg: Refactor ecpg_log() to skip unnecessary calls to ECPGget_sqlca().
Previously, ecpg_log() always called ECPGget_sqlca() to retrieve sqlca,
even though it was only needed for debug logging. This commit updates
ecpg_log() to call ECPGget_sqlca() only when debug logging is enabled.

Author: Yuto Sasaki
Reviewed-by: Alvaro Herrera, Tom Lane, Fujii Masao
Discussion: https://postgr.es/m/TY2PR01MB3628A85689649BABC9A1C6C3C1782@TY2PR01MB3628.jpnprd01.prod.outlook.com
2024-10-22 23:57:35 +09:00
Álvaro Herrera
53af9491a0
Restructure foreign key handling code for ATTACH/DETACH
... to fix bugs when the referenced table is partitioned.

The catalog representation we chose for foreign keys connecting
partitioned tables (in commit f56f8f8da6af) is inconvenient, in the
sense that a standalone table has a different way to represent the
constraint when referencing a partitioned table, than when the same
table becomes a partition (and vice versa).  Because of this, we need to
create additional catalog rows on detach (pg_constraint and pg_trigger),
and remove them on attach.  We were doing some of those things, but not
all of them, leading to missing catalog rows in certain cases.

The worst problem seems to be that we are missing action triggers after
detaching a partition, which means that you could update/delete rows in
the referenced partitioned table that still had referencing rows, with
the server failing to throw the required errors.

!!!
Note that this means existing databases with FKs that reference
partitioned tables might have rows that break relational integrity, on
tables that were once partitions on the referencing side of the FK.

Another possible problem is that trying to reattach a table
that had been detached would fail indicating that internal triggers
cannot be found, which from the user's point of view is nonsensical.

In branches 15 and above, we fix this by creating a new helper function
addFkConstraint() which is in charge of creating a standalone
pg_constraint row, and repurposing addFkRecurseReferencing() and
addFkRecurseReferenced() so that they're only the recursive routine for
each side of the FK, and they call addFkConstraint() to create
pg_constraint at each partitioning level and add the necessary triggers.
These new routines can be used during partition creation, partition
attach and detach, and foreign key creation.  This reduces redundant
code and simplifies the flow.

In branches 14 and 13, we have a much simpler fix that consists of
simply removing the constraint on detach.  The reason is that those
branches are missing commit f4566345cf40, which reworked the way this
works in a way that we didn't consider back-patchable at the time.

We opted to leave branch 12 alone, because it's different enough from
branch 13 that the fix doesn't apply; and because it is going into EOL
mode very soon, patching it now might do more harm than good, since there
would be no way to undo the damage if it goes wrong.

Existing databases might need to be repaired.

In the future we might want to rethink the catalog representation to
avoid this problem, but for now the code seems to do what's required to
make the constraints operate correctly.

Co-authored-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Co-authored-by: Tender Wang <tndrwang@gmail.com>
Co-authored-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Guillaume Lelarge <guillaume@lelarge.info>
Reported-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Reported-by: Thomas Baehler (SBB CFF FFS) <thomas.baehler2@sbb.ch>
Discussion: https://postgr.es/m/20230420144344.40744130@karst
Discussion: https://postgr.es/m/20230705233028.2f554f73@karst
Discussion: https://postgr.es/m/GVAP278MB02787E7134FD691861635A8BC9032@GVAP278MB0278.CHEP278.PROD.OUTLOOK.COM
Discussion: https://postgr.es/m/18541-628a61bc267cd2d3@postgresql.org
2024-10-22 16:01:18 +02:00
Alexander Korotkov
e1555645d7 Make all Perl warnings fatal in 043_wal_replay_wait.pl
This file was committed after c5385929593, but accidentally missed changing
all warnings into fatal errors.

Reported-by: Anton Voloshin
Discussion: https://postgr.es/m/aa8a55d5-554a-4027-a491-1b0ca7c85f7a%40postgrespro.ru
2024-10-22 13:25:11 +03:00
Peter Eisentraut
d2b4b4c225 Fix C23 compiler warning
The approach of declaring a function pointer with an empty argument
list and hoping that the compiler will not complain about casting it
to another type no longer works with C23, because foo() is now
equivalent to foo(void).

We don't need to do this here.  With a few struct forward declarations
we can supply a correct argument list without having to pull in
another header file.

(This is the only new warning with C23.  Together with the previous
fix a67a49648d9, this makes the whole code compile cleanly under C23.)

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/95c6a9bf-d306-43d8-b880-664ef08f2944%40eisentraut.org
2024-10-22 08:17:42 +02:00
Michael Paquier
45e0ba30fc pg_stat_statements: Add tests for nested queries with level tracking
There have never been any regression tests in PGSS for various query
patterns for nested queries combined with level tracking, like:
- Multi-statements.
- CREATE TABLE AS
- CREATE/REFRESH MATERIALIZED VIEW
- DECLARE CURSOR
- EXPLAIN, with a subset of the above supported.
- COPY.

All the tests added here track historical, sometimes confusing, existing
behaviors.  For example, EXPLAIN stores two PGSS entries with the same
top-level query string but two different query IDs as one is calculated
for the top-level EXPLAIN (this part is right) and a second one for the
inner query in the EXPLAIN (this part is not right).

A couple of patches are under discussion to improve the situation, and
all the tests added here will prove useful to evaluate the changes
discussed.

Author: Anthonin Bonnefoy
Reviewed-by: Michael Paquier, Jian He
Discussion: https://postgr.es/m/CAO6_XqqM6S9bQ2qd=75W+yKATwoazxSNhv5sjW06fjGAtHbTUA@mail.gmail.com
2024-10-22 13:05:51 +09:00
Tom Lane
68ad9816c1 Fix wrong assertion and poor error messages in "COPY (query) TO".
If the query is rewritten into a NOTIFY command by a DO INSTEAD
rule, we'd get an assertion failure, or in non-assert builds
issue a rather confusing error message.  Improve that.

Also fix a longstanding grammar mistake in a nearby error message.

Per bug #18664 from Alexander Lakhin.  Back-patch to all supported
branches.

Tender Wang and Tom Lane

Discussion: https://postgr.es/m/18664-ffd0ebc2386598df@postgresql.org
2024-10-21 15:08:22 -04:00
Heikki Linnakangas
3c7d78427e Update outdated comment on WAL-logged locks with invalid XID
We haven't generated those for a long time.

Discussion: https://www.postgresql.org/message-id/b439edfc-c5e5-43a9-802d-4cb51ec20646@iki.fi
2024-10-21 14:28:43 +03:00
Heikki Linnakangas
1a43de5e0a Fix race condition in committing a serializable transaction
The finished transaction list can contain XIDs that are older than the
serializable global xmin. It's a short-lived state;
ClearOldPredicateLocks() removes any such transactions from the list,
and it's called whenever the global xmin advances. But if another
backend calls SummarizeOldestCommittedSxact() in that window, it will
call SerialAdd() on an XID that's older than the global xmin, or if
there are no more transactions running, when global xmin is
invalid. That trips the assertion in SerialAdd().

Fixes bug #18658 reported by Andrew Bille. Thanks to Alexander Lakhin
for analysis. Backpatch to all versions.

Discussion: https://www.postgresql.org/message-id/18658-7dab125ec688c70b%40postgresql.org
2024-10-21 09:49:21 +03:00
Michael Paquier
57a36e890d Fix grammar of a comment in bufmgr.c
Author: Junwang Zhao
Discussion: https://postgr.es/m/CAEG8a3L5YjxXCjx0LhkwHdDGsNgpFGEqH7SqtXRPNP+dwFMVZQ@mail.gmail.com
2024-10-21 11:25:29 +09:00
Michael Paquier
a7800cf498 injection_points: Add basic isolation test
This test can act as a template when implementing an isolation test with
injection points, and tracks in a much simpler way some of the behaviors
implied in the existing isolation test "inplace" that has been added in
c35f419d6efb.  Particularly, a detach does not affect a backend wait; a
wait needs to be interrupted by a wakeup.

Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/ZxGTONm_ctQz--io@paquier.xyz
2024-10-21 11:10:51 +09:00
Álvaro Herrera
f1c141fe14
Note that index_name in ALTER INDEX ATTACH PARTITION can be schema-qualified
Missed in 8b08f7d4820f; backpatch to all supported branches.

Reported-by: alvaro@datadoghq.com
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/172924785099.698.15236991344616673753@wrigleys.postgresql.org
2024-10-20 15:36:20 +02:00
Amit Langote
11c87216d1 SQL/JSON: Fix some oversights in commit b6e1157e7
The decision in b6e1157e7 to ignore raw_expr when evaluating a
JsonValueExpr was incorrect.  While its value is not ultimately
used (since formatted_expr's value is), failing to initialize it
can lead to problems, for instance, when the expression tree in
raw_expr contains Aggref nodes, which must be initialized to
ensure the parent Agg node works correctly.

Also, optimize eval_const_expressions_mutator()'s handling of
JsonValueExpr a bit.  Currently, when formatted_expr cannot be folded
into a constant, we end up processing it twice -- once directly in
eval_const_expressions_mutator() and again recursively via
ece_generic_processing().  This recursive processing is required to
handle raw_expr. To avoid the redundant processing of formatted_expr,
we now process raw_expr directly in eval_const_expressions_mutator().

Finally, update the comment of JsonValueExpr to describe the roles of
raw_expr and formatted_expr more clearly.

Bug: #18657
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Diagnosed-by: Fabio R. Sluzala <fabio3rs@gmail.com>
Diagnosed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18657-1b90ccce2b16bdb8@postgresql.org
Backpatch-through: 16
2024-10-20 12:20:55 +09:00
Tom Lane
52475b4d30 Fix comment about pg_authid.
pg_shadow is not "publicly readable".  (pg_group is, but there seems
no need to make that distinction here.)  Seems to be a thinko dating
clear back to 7762619e9.

Antonin Houska

Discussion: https://postgr.es/m/31926.1729252247@antos
2024-10-19 11:44:14 -04:00
Jeff Davis
779972e534 Disable autovacuum for tables in stats import tests.
While we haven't observed any test instability, it seems like a good
idea to disable autovacuum during the stats import tests.

Author: Corey Huinker
Discussion: https://postgr.es/m/CADkLM=fajh1Lpcyr_XsMmq-9Z=SGk-u+_Zeac7Pt0RAN3uiVCg@mail.gmail.com
2024-10-18 10:57:46 -07:00
Jeff Davis
b391d882ff Allow pg_set_relation_stats() to set relpages to -1.
While the default value for relpages is 0, if a partitioned table with
at least one child has been analyzed, then the partitioned table will
have a relpages value of -1.
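
A hedged sketch (the parameter names are illustrative) of setting that value
explicitly:

    SELECT pg_set_relation_stats('public.parted'::regclass,
                                 relpages      => -1,
                                 reltuples     => 10000,
                                 relallvisible => 0);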

Author: Corey Huinker
Discussion: https://postgr.es/m/CADkLM=fajh1Lpcyr_XsMmq-9Z=SGk-u+_Zeac7Pt0RAN3uiVCg@mail.gmail.com
2024-10-18 10:44:15 -07:00
Peter Geoghegan
1bd4bc85ca Optimize nbtree backwards scans.
Make nbtree backwards scans optimistically access the next page to be
read to the left by following a prevPage block number that's now stashed
in currPos when the leaf page is first read.  This approach matches the
one taken during forward scans, which follow a symmetric nextPage block
number from currPos.  We stash both a prevPage and a nextPage, since the
scan direction might change (when fetching from a scrollable cursor).

Backwards scans will no longer need to lock the same page twice, except
in rare cases where the scan detects a concurrent page split (or page
deletion).  Testing has shown this optimization to be particularly
effective during parallel index-only backwards scans: ~12% reductions in
query execution time are quite possible.

We're much better off being optimistic; concurrent left sibling page
splits are rare in general.  It's possible that we'll need to lock more
pages than the pessimistic approach would have, but only when there are
_multiple_ concurrent splits of the left sibling page we now start at.
If there's just a single concurrent left sibling page split, the new
approach to scanning backwards will at least break even relative to the
old one (we'll acquire the same number of leaf page locks as before).

The optimization from this commit has long been contemplated by comments
added by commit 2ed5b87f96, which changed the rules for locking/pinning
during nbtree index scans.  The approach that that commit introduced to
leaf level link traversal when scanning forwards is now more or less
applied all the time, regardless of the direction we're scanning in.

Following uniform conventions around sibling link traversal is simpler.
The only real remaining difference between our forward and backwards
handling is that our backwards handling must still detect and recover
from any concurrent left sibling splits (and concurrent page deletions),
as documented in the nbtree README.  That is structured as a single,
isolated extra step that takes place in _bt_readnextpage.

Also use this opportunity to further simplify the functions that deal
with reading pages and traversing sibling links on the leaf level, and
to document their preconditions and postconditions (with respect to
things like buffer locks, buffer pins, and seizing the parallel scan).

This enhancement completely supersedes the one recently added by commit
3f44959f.

Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAEze2WgpBGRgTTxTWVPXc9+PB6fc1a7t+VyGXHzfnrFXcQVxnA@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WzkBTuFv7W2+84jJT8mWZLXVL0GHq2hMUTn6c9Vw=eYrCw@mail.gmail.com
2024-10-18 11:25:32 -04:00
Nathan Bossart
9e2d813d59 Adjust documentation for configuring Linux huge pages.
The present wording about viewing shared_memory_size_in_huge_pages
seems to suggest that the parameter cannot be viewed after startup
at all, whereas the intent is to make it clear that you can't use
"postgres -C" to view this parameter while the server is running.
This commit rephrases this section to remove the ambiguity.
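
For example, while the server is running the value can still be inspected from
SQL (only "postgres -C" is unavailable at that point):

    SHOW shared_memory_size_in_huge_pages;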

Author: Seino Yuki
Reviewed-by: Michael Paquier, David G. Johnston, Fujii Masao
Discussion: https://postgr.es/m/420584fd274f9ec4f337da55ffb3b790%40oss.nttdata.com
Backpatch-through: 15
2024-10-18 10:20:15 -05:00
Peter Eisentraut
4b652692e9 Fix memory leaks from incorrect strsep() uses
Commit 5d2e1cc117b introduced some strsep() uses, but it did the
memory management wrong in some cases.  We need to keep a separate
pointer to the allocated memory so that we can free it later, because
strsep() advances the pointer we pass to it, and at the end it will be
NULL, so any free() calls won't do anything.

(This fixes two of the four places changed in commit 5d2e1cc117b.  The
other two don't have this problem.)

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/79692bf9-17d3-41e6-b9c9-fc8c3944222a@eisentraut.org
2024-10-18 11:29:20 +02:00
Peter Eisentraut
24a36f91e3 Fix strsep() use for SCRAM secrets parsing
The previous code (from commit 5d2e1cc117b) did not detect end of
string correctly, so it would fail to error out if fewer than the
expected number of fields were present, which could then later lead to
a crash when NULL string pointers are accessed.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reported-by: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/79692bf9-17d3-41e6-b9c9-fc8c3944222a@eisentraut.org
2024-10-18 11:15:54 +02:00
Fujii Masao
9272bdeac8 Remove unused code for unlogged materialized views.
Commit 3bf3ab8c56 initially introduced support for unlogged
materialized views, but this was later disallowed by commit 3223b25ff7.
Additionally, commit d25f519107 added more code for handling
unlogged materialized views. This commit cleans up all unused
code related to them.

If unlogged materialized views had been supported in any official
release, psql would need to retain code to handle them for compatibility
with older servers. However, since they were never included in
an official release, this code is no longer necessary.

Author: Pixian Shi
Reviewed-by: Yugo Nagata, Fujii Masao
Discussion: https://postgr.es/m/CAAccyYKRZ=OvAvgowiSH+OELbStLP=p2Ht=R3CgT=OaNSH5DAA@mail.gmail.com
2024-10-18 17:18:57 +09:00
Michael Paquier
19567b3eb4 Fix description of PostgreSQL::Test::Cluster::wait_for_event()
The arguments of the function were listed in an incorrect order in the
description of the routine.  This information can be seen with perldoc.

Issue spotted while working on this area of the code.

Backpatch-through: 17
2024-10-18 13:49:58 +09:00
Jeff Davis
eecd9138a0 Improve ThrowErrorData() comments for use with soft errors.
Reviewed-by: Corey Huinker
Discussion: https://postgr.es/m/901ab7cf01957f92ea8b30b6feeb0eacfb7505fc.camel@j-davis.com
2024-10-17 14:56:44 -07:00
Tom Lane
1fed234f9f ecpg: fix more minor mishandling of bad input in preprocessor.
Don't get confused by an unmatched right brace in the input.
(Previously, this led to discarding information about file-level
variables and then possibly crashing.)

Detect, rather than crash on, an attempt to index into a non-array
variable.

As before, in the absence of field complaints I'm not too
excited about back-patching these.

Per valgrind testing by Alexander Lakhin.

Discussion: https://postgr.es/m/a239aec2-6c79-5fc9-9272-cea41158a360@gmail.com
2024-10-17 15:28:32 -04:00
Thomas Munro
98c7c7152d Fix extreme skew detection in Parallel Hash Join.
After repartitioning the inner side of a hash join that would have
exceeded the allowed size, we check if all the tuples from a parent
partition moved to one child partition.  That is evidence that it
contains duplicate keys and later attempts to repartition will also
fail, so we should give up trying to limit memory (for lack of a better
fallback strategy).

A thinko prevented the check from working correctly in partition 0 (the
one that is partially loaded into memory already).  After
repartitioning, we should check for extreme skew if the *parent*
partition's space_exhausted flag was set, not the child partition's.
The consequence was repeated futile repartitioning until per-partition
data exceeded various limits including "ERROR: invalid DSA memory alloc
request size 1811939328", OS allocation failure, or temporary disk space
errors.  (We could also do something about some of those symptoms, but
that's material for separate patches.)

This problem only became likely when PostgreSQL 16 introduced support
for Parallel Hash Right/Full Join, allowing NULL keys into the hash
table.  Repartitioning always leaves NULL in partition 0, no matter how
many times you do it, because the hash value is all zero bits.  That's
unlikely for other hashed values, but they might still have caused
wasted extra effort before giving up.

Back-patch to all supported releases.

Reported-by: Craig Milhiser <craig@milhiser.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Discussion: https://postgr.es/m/CA%2BwnhO1OfgXbmXgC4fv_uu%3DOxcDQuHvfoQ4k0DFeB0Qqd-X-rQ%40mail.gmail.com
2024-10-17 22:11:59 +13:00
Peter Eisentraut
d893a299ce Remove superfluous forward declaration
The need for this was removed by commit dc9c3b0ff21.
2024-10-17 08:57:56 +02:00
Peter Eisentraut
6234a9ce0e Fix whitespace 2024-10-17 08:43:08 +02:00
Peter Eisentraut
665785d85f Fix unnecessary casts of copyObject() result
The result is already of the correct type, so these casts don't do
anything.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/637eeea8-5663-460b-a114-39572c0f6c6e%40eisentraut.org
2024-10-17 08:36:48 +02:00
Peter Eisentraut
eafda78fc4 Improve node type forward reference
Instead of using Node *, we can use an incomplete struct.  That way,
everything has the correct type and fewer casts are required.  This
technique is already used elsewhere in node type definitions.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/637eeea8-5663-460b-a114-39572c0f6c6e%40eisentraut.org
2024-10-17 08:36:48 +02:00
Peter Eisentraut
41b023946d jsonapi: fully initialize dummy lexer
Valgrind reports that checks on lex->inc_state are undefined for the
"dummy lexer" used for incremental parsing, since it's only partially
initialized on the stack. This was introduced in 0785d1b8b2.
Zero-initialize the whole struct.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://www.postgresql.org/message-id/CAOYmi+n9QWr4gsAADZc6qFQjFViXQYVk=gBy_EvxuqsgPJcb_g@mail.gmail.com
2024-10-17 08:23:46 +02:00
Peter Eisentraut
342fb8a332 Fix unusual include style
Project-internal header files should be included using " ", not < >.
2024-10-17 08:14:45 +02:00
David Rowley
9ca67658d1 Don't store intermediate hash values in ExprState->resvalue
adf97c156 made it so ExprStates could support hashing and changed Hash
Join to use that instead of manually extracting Datums from tuples and
hashing them one column at a time.

When hashing multiple columns or expressions, the code added in that
commit stored the intermediate hash value in the ExprState's resvalue
field.  That was a mistake as steps may be injected into the ExprState
between each hashing step that look at or overwrite the stored
intermediate hash value.  EEOP_PARAM_SET is an example of such a step.

Here we fix this by adding a new dedicated field for storing
intermediate hash values and adjust the code so that all apart from the
final hashing step store their result in the intermediate field.

In passing, rename a variable so that it's more aligned to the
surrounding code and also so a few lines stay within the 80 char margin.

Reported-by: Andres Freund
Reviewed-by: Alena Rybakina <a.rybakina@postgrespro.ru>
Discussion: https://postgr.es/m/CAApHDvqo9eenEFXND5zZ9JxO_k4eTA4jKMGxSyjdTrsmYvnmZw@mail.gmail.com
2024-10-17 14:25:08 +13:00
Michael Paquier
089aac631b Fix validation of COPY FORCE_NOT_NULL/FORCE_NULL for the all-column cases
This commit adds missing checks for COPY FORCE_NOT_NULL and FORCE_NULL
when applied to all columns via "*".  These options now correctly
require CSV mode and are disallowed in COPY TO, making their behavior
consistent with FORCE_QUOTE.

Some regression tests are added to verify the correct behavior for the
all-columns case, including FORCE_QUOTE, which was not tested.
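
A hedged sketch of the all-column form that is now validated (the file path
and table name are illustrative):

    COPY orders FROM '/tmp/orders.csv' WITH (FORMAT csv, FORCE_NULL *);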

Backpatch down to 17, where support for the all-column grammar with
FORCE_NOT_NULL and FORCE_NULL has been added.

Author: Joel Jacobson
Reviewed-by: Zhang Mingli
Discussion: https://postgr.es/m/65030d1d-5f90-4fa4-92eb-f5f50389858e@app.fastmail.com
Backpatch-through: 17
2024-10-17 08:44:50 +09:00
Michael Paquier
03bf0d9a4b Rewrite some regression queries for option checks with COPY
Some queries in copy2 are there to check various option combinations,
and used "stdin" or "stdout" in ways incompatible with the COPY TO or
FROM clauses they were combined with, which was confusing.  This commit
rewrites these queries to use a compatible grammar.

The coverage of the tests is unchanged.  Like the original commit
451d1164b9d0, backpatch down to 16 where these have been introduced.  A
follow-up commit will rely on this area of the tests for a bug fix.

Author: Joel Jacobson
Reviewed-by: Zhang Mingli
Discussion: https://postgr.es/m/65030d1d-5f90-4fa4-92eb-f5f50389858e@app.fastmail.com
Backpatch-through: 16
2024-10-17 07:21:35 +09:00
Peter Geoghegan
c0490b0ef7 nbtree: fix read page recheck typo.
Oversight in commit 79fa7b3b.
2024-10-16 17:38:38 -04:00
Tom Lane
c96de42c4b Further refine _SPI_execute_plan's rule for atomic execution.
Commit 2dc1deaea turns out to have been still a brick shy of a load,
because CALL statements executing within a plpgsql exception block
could still pass the wrong snapshot to stable functions within the
CALL's argument list.  That happened because standard_ProcessUtility
forces isAtomicContext to true if IsTransactionBlock is true, which
it always will be inside a subtransaction.  Then ExecuteCallStmt
would think it does not need to push a new snapshot --- but
_SPI_execute_plan didn't do so either, since it thought it was in
nonatomic mode.

The best fix for this seems to be for _SPI_execute_plan to operate
in atomic execution mode if IsSubTransaction() is true, even when the
SPI context as a whole is non-atomic.  This makes _SPI_execute_plan
have the same rules about when non-atomic execution is allowed as
_SPI_commit/_SPI_rollback have about when COMMIT/ROLLBACK are allowed,
which seems appropriately symmetric.  (If anyone ever tries to allow
COMMIT/ROLLBACK inside a subtransaction, this would all need to be
rethought ... but I'm unconvinced that such a thing could be logically
consistent at all.)

For further consistency, also check IsSubTransaction() in
SPI_inside_nonatomic_context.  That does not matter for its
one present-day caller StartTransaction, which can't be reached
inside a subtransaction.  But if any other callers ever arise,
they'd presumably want this definition.

Per bug #18656 from Alexander Alehin.  Back-patch to all
supported branches, like previous fixes in this area.

Discussion: https://postgr.es/m/18656-cade1780866ef66c@postgresql.org
2024-10-16 17:36:40 -04:00
Jeff Davis
a7f2f6adc2 Whitespace fixup from generated unicode tables.
When running the 'update-unicode' build target, generate files that
conform to pgindent whitespace rules.
2024-10-16 12:21:13 -07:00
Jeff Davis
b360d1762b Fix #include order from e839c8ecc9.
Reported-by: Alexander Korotkov
Discussion: https://postgr.es/m/CAPpHfduAiGSsvUc614Z-JOnyQffcMeJncWMF2HnUL8wFy4fuWA@mail.gmail.com
2024-10-16 12:13:40 -07:00
Masahiko Sawada
1b9b6cc345 Reduce memory block size for decoded tuple storage to 8kB.
Commit a4ccc1cef introduced the Generation Context and modified the
logical decoding process to use a Generation Context with a fixed
block size of 8MB for storing tuple data decoded during logical
decoding (i.e., rb->tup_context). Several reports have indicated that
the logical decoding process can be terminated due to
out-of-memory (OOM) situations caused by excessive memory usage in
rb->tup_context.

This issue can occur when decoding a workload involving several
concurrent transactions, including a long-running transaction that
modifies tuples. By design, the Generation Context does not free a
memory block until all chunks within that block are
released. Consequently, if tuples modified by the long-running
transaction are stored across multiple memory blocks, these blocks
remain allocated until the long-running transaction completes, leading
to substantial memory fragmentation. The memory usage during logical
decoding, tracked by rb->size, does not account for memory
fragmentation, resulting in potentially much higher memory consumption
than the value of the logical_decoding_work_mem parameter.

Various improvement strategies were discussed in the relevant
thread. This change reduces the block size of the Generation Context
used in rb->tup_context from 8MB to 8kB. This modification
significantly decreases the likelihood of substantial memory
fragmentation occurring and is relatively straightforward to
backport. Performance testing across multiple platforms has confirmed
that this change will not introduce any performance degradation that
would impact actual operation.

Backport to all supported branches.

Reported-by: Alex Richman, Michael Guissine, Avi Weinberg
Reviewed-by: Amit Kapila, Fujii Masao, David Rowley
Tested-by: Hayato Kuroda, Shlok Kyal
Discussion: https://postgr.es/m/CAD21AoBTY1LATZUmvSXEssvq07qDZufV4AF-OHh9VD2pC0VY2A%40mail.gmail.com
Backpatch-through: 12
2024-10-16 12:08:05 -07:00
Tom Lane
9b4bf51690 ecpg: fix some minor mishandling of bad input in preprocessor.
Avoid null-pointer crash when considering a cursor declaration
that's outside any C function (a case which is useless anyway).

Ensure a cursor for a prepared statement is marked as initially
not open.  At worst, if we chanced to get not-already-zeroed memory
from malloc(), this oversight would result in failing to issue a
"cursor "foo" has been declared but not opened" warning that would
have been appropriate.

Avoid running off the end of the buffer when there are mismatched
square brackets following a variable name.  This could lead to
SIGSEGV after reaching the end of memory.

Given the lack of field complaints, none of these seem to be worth
back-patching, but let's clean them up in HEAD.

Per valgrind testing by Alexander Lakhin.

Discussion: https://postgr.es/m/5f5bcecd-d7ec-b8c0-6c92-d1a7c6e0f639@gmail.com
2024-10-16 12:25:00 -04:00
Peter Geoghegan
79fa7b3b1a Normalize nbtree truncated high key array behavior.
Commit 5bf748b8 taught nbtree ScalarArrayOp index scans to decide when
and how to start the next primitive index scan based on physical index
characteristics.  This included rules for deciding whether to start a
new primitive index scan (or whether to move onto the right sibling leaf
page instead) that specifically consider truncated lower-order columns
(-inf columns) from leaf page high keys.

These omitted columns were treated as satisfying the scan's required
scan keys, though only for scan keys marked required in the current scan
direction (forward).  Scan keys that didn't get this behavior (those
marked required in the backwards direction only) usually didn't give the
scan reasonable cause to reposition itself to a later leaf page (via
another descent of the index in _bt_first), but _bt_advance_array_keys
would nevertheless always give up by forcing another call to _bt_first.

_bt_advance_array_keys was unwilling to allow the scan to continue onto
the next leaf page, to reconsider whether we really should start another
primitive scan based on the details of the sibling page's tuples.  This
didn't match its behavior with similar cases involving keys required in
the current scan direction (forward), which seems unprincipled.  It led
to an excessive number of primitive scans/index descents for queries
with a higher-order = array scan key (with dense, contiguous values)
mixed with a lower-order required > or >= scan key.

Bring > and >= strategy scan keys in line with other required scan key
types: treat truncated -inf scan keys as having satisfied scan keys
required in either scan direction (forwards and backwards alike) during
array advancement.  That way affected scans can continue to the right
sibling leaf page.  Advancement must now schedule an explicit recheck of
the right sibling page's high key in cases involving > or >= scan keys.
The recheck gives the scan a way to back out and start another primitive
index scan (we can't just rely on _bt_checkkeys with > or >= scan keys).

This work can be considered a stand alone optimization on top of the
work from commit 5bf748b8.  But it was written in preparation for an
upcoming patch that will add skip scan to nbtree.  In practice scans
that use "skip arrays" will tend to be much more sensitive to any
implementation deficiencies in this area.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAH2-Wz=9A_UtM7HzUThSkQ+BcrQsQZuNhWOvQWK06PRkEp=SKQ@mail.gmail.com
2024-10-16 12:17:49 -04:00
Amit Langote
c259b1578e Fix typo in comment of transformJsonAggConstructor()
An oversight of 3a8a1f3254b.

Reported-by: Tender Wang <tndrwang@gmail.com>
Author: Tender Wang <tndrwang@gmail.com>
Backpatch-through: 16
2024-10-16 20:37:02 +09:00
Peter Eisentraut
04bec894a0 initdb: Change default to using data checksums.
Checksums are now on by default.  They can be disabled by the
previously added option --no-data-checksums.

Author: Greg Sabino Mullane <greg@turnstep.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/CAKAnmmKwiMHik5AHmBEdf5vqzbOBbcwEPHo4-PioWeAbzwcTOQ@mail.gmail.com
2024-10-16 08:48:10 +02:00
Peter Eisentraut
67846550dc doc: Fix initdb option xreflabels
Generally, we don't want any overriding xreflabels in the options
list, so that we can link to options and the link renders as the
option name.  The -g option did this differently and config.sgml made
use of that for a link.  The new --no-data-checksums option (commit
983a588e0b8) apparently copied this pattern, but that seems like the
wrong direction, as a future patch revealed.

To fix, remove the two xreflabels and rewrite the link in config.sgml
with an explicit link text.
2024-10-16 08:28:12 +02:00
Nathan Bossart
d5ca15ee54 Add type cast to foreach_internal's loop variable.
C++ requires explicitly casting void pointers to the appropriate
pointer type, which means the foreach_ptr macro cannot be used in
C++ code without this change.

Author: Jelte Fennema-Nio
Reviewed-by: Bruce Momjian
Discussion: https://postgr.es/m/CAGECzQSYG3QfHrc-rOk2KbnB9iJOd7Qu-Xii1s-GTA%3D3JFt49Q%40mail.gmail.com
Backpatch-through: 17
2024-10-15 16:20:49 -05:00
David Rowley
2453196107 Move clause_sides_match_join() into restrictinfo.h
Two near-identical copies of clause_sides_match_join() existed in
joinpath.c and analyzejoins.c.  Deduplicate this by moving the function
into restrictinfo.h.

It isn't quite clear that keeping the inline property of this function
is worthwhile, but this commit is just an exercise in code
deduplication.  More effort would be required to determine if the inline
property is worth keeping.

Author: James Hunter <james.hunter.pg@gmail.com>
Discussion: https://postgr.es/m/CAJVSvF7Nm_9kgMLOch4c-5fbh3MYg%3D9BdnDx3Dv7Fcb64zr64Q%40mail.gmail.com
2024-10-15 21:14:21 +13:00
Masahiko Sawada
7cdfeee320 Add contrib/pg_logicalinspect.
This module provides SQL functions that allow inspecting logical
decoding components.

It currently allows inspecting the contents of serialized logical
snapshots of a running database cluster, which is useful for debugging
or educational purposes.
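
A hedged usage sketch (the function name and snapshot file name are
assumptions for illustration, not taken from this commit message):

    CREATE EXTENSION pg_logicalinspect;
    SELECT * FROM pg_get_logical_snapshot_meta('0-40796E18.snap');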

Author: Bertrand Drouvot
Reviewed-by: Amit Kapila, Shveta Malik, Peter Smith, Peter Eisentraut
Reviewed-by: David G. Johnston
Discussion: https://postgr.es/m/ZscuZ92uGh3wm4tW%40ip-10-97-1-34.eu-west-3.compute.internal
2024-10-14 17:22:02 -07:00
Masahiko Sawada
e2fd615ecc Move SnapBuild and SnapBuildOnDisk structs to snapshot_internal.h.
This commit moves the definitions of the SnapBuild and SnapBuildOnDisk
structs, related to logical snapshots, to the snapshot_internal.h
file. This change allows external tools, such as
pg_logicalinspect (with an upcoming patch), to access and utilize the
contents of logical snapshots.

Author: Bertrand Drouvot
Reviewed-by: Amit Kapila, Shveta Malik, Peter Smith
Discussion: https://postgr.es/m/ZscuZ92uGh3wm4tW%40ip-10-97-1-34.eu-west-3.compute.internal
2024-10-14 17:19:33 -07:00
Tom Lane
dbedc461b4 ecpg: invent a saner syntax for ecpg.addons entries.
Put the rule type at the start not the end, and put spaces
between the constitutent token names instead of smashing them
into an illegible mess.  This has no functional impact but
I think it makes the rules a great deal more readable.

Discussion: https://postgr.es/m/1185216.1724001216@sss.pgh.pa.us
2024-10-14 16:13:56 -04:00
Nathan Bossart
143e3a1ab8 Add commit 7f7474a8e4 to .git-blame-ignore-revs. 2024-10-14 15:09:39 -05:00
Tom Lane
d2f41b4621 ecpg: add cross-checks to parse.pl for usage of internal tables.
parse.pl contains several constant tables that describe tweaks
to be made to the backend grammar.  In the same spirit as
00b0e7204, add cross-checks that each table entry is used at
least once (or exactly once if that's appropriate).  This should
help catch cases where adjustments to the backend grammar cause
a table entry not to match as expected.

Per suggestion from Michael Paquier.

Discussion: https://postgr.es/m/ZsLVbjsc5x5Saesg@paquier.xyz
2024-10-14 15:59:29 -04:00
Jeff Davis
66ac94cdc7 Move libc-specific code from pg_locale.c into pg_locale_libc.c.
Move implementation of pg_locale_t code for libc collations into
pg_locale_libc.c. Other locale-related code, such as
pg_perm_setlocale(), remains in pg_locale.c for now.

Discussion: https://postgr.es/m/flat/2830211e1b6e6a2e26d845780b03e125281ea17b.camel@j-davis.com
2024-10-14 12:48:43 -07:00
Tom Lane
9812138593 ecpg: avoid breaking the IDENT precedence level in two.
Careless string hacking caused parse.pl to transform gram.y's
declaration

%nonassoc    IDENT PARTITION RANGE ROWS ...

into

%nonassoc IDENT
%nonassoc CSTRING PARTITION RANGE ROWS ...

It turns out that this has no semantic impact, because the
generated preproc.c is exactly the same either way (if you
inject a blank line to keep line numbers the same).

Nonetheless, given the great emphasis that the commentary in
gram.y places on keeping those other keywords at the same
precedence level as IDENT, this seems like foolishly risking ecpg
behaving differently from the core parser.  Adjust the code so
that CSTRING is added to the precedence line without breaking it
into two lines.

Discussion: https://postgr.es/m/2157151.1713540065@sss.pgh.pa.us
2024-10-14 15:42:02 -04:00
Jeff Davis
f244a2bb4c Move ICU-specific code from pg_locale.c into pg_locale_icu.c.
Discussion: https://postgr.es/m/flat/2830211e1b6e6a2e26d845780b03e125281ea17b.camel@j-davis.com
2024-10-14 12:13:26 -07:00
Tom Lane
1acd0f5527 ecpg: improve preprocessor's memory management.
Invent a notion of "local" storage that will automatically be
reclaimed at the end of each statement.  Use this for location
strings as well as other visibly short-lived data within the parser.

Also, make cat_str and make_str return local storage and not free
their inputs, which allows dispensing with a whole lot of retail
mm_strdup calls.  We do have to add some new ones in places where
a local-lifetime string needs to be added to a longer-lived data
structure, but on balance there are a lot less mm_strdup calls than
before.

In hopes of flushing out places where changes were necessary,
I changed YYLTYPE from "char *" to "const char *", which forced
const-ification of various function arguments that probably
should've been like that all along.

This still leaks somewhat more memory than v17, but that will be
cleaned up in future commits.

Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us
2024-10-14 13:55:08 -04:00
Tom Lane
f18231e817 ecpg: move some functions into a new file ecpg/preproc/util.c.
mm_alloc and mm_strdup were in type.c, which seems a completely
random choice.  No doubt the original author thought two small
functions didn't deserve their own file.  But I'm about to add
some more memory-management stuff beside them, so let's put them
in a less surprising place.  This seems like a better home for
mmerror, mmfatal, and the cat_str/make_str family, too.

Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us
2024-10-14 13:47:59 -04:00
Tom Lane
a542d5614b ecpg: re-implement preprocessor's string management.
Most productions in the preprocessor grammar construct strings
representing SQL or C statements or fragments thereof.  Instead
of returning these as <str> results of the productions, return
them as "location" values, taking advantage of Bison's flexibility
about what a location is.  We aren't really giving up anything
thereby, since ecpg's error reports have always just given line
numbers, and that's tracked separately.  The advantage of this
is that a single instance of the YYLLOC_DEFAULT macro can
perform all the work needed by the vast majority of productions,
including all the ones made automatically by parse.pl.  This
avoids having large numbers of effectively-identical productions,
which tickles an optimization inefficiency in recent versions of
clang.  (This patch reduces the compilation time for preproc.o
by more than 100-fold with clang 16, and is visibly helpful with
gcc too.)  The compiled parser is noticeably smaller as well.

A disadvantage of this approach is that YYLLOC_DEFAULT is applied
before running the production's semantic action (if any).  This
means it cannot use the method favored by cat_str() of free'ing
all the input strings; if the action needs to look at the input
strings, it'd be looking at dangling storage.  As this stands,
therefore, it leaks memory like a sieve.  This is already a big
patch though, and fixing the memory management seems like a
separable problem, so let's leave that for the next step.
(This does remove some free() calls that I'd have had to touch
anyway, in the expectation that the next step will manage
memory reclamation quite differently.)

Most of the changes here are mindless substitution of "@N" for
"$N" in grammar rules; see the changes to README.parser for
an explanation.

Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us
2024-10-14 13:44:42 -04:00
Tom Lane
6b00549944 ecpg: major cleanup, simplification, and documentation of parse.pl.
Remove a lot of cruft, clean up and document what's left.
This produces the same preproc.y output as before, except for
fewer blank lines.  (It's not like we're making any attempt to
match the layout of gram.y, so I removed the one bit of logic
that seemed to have that in mind.)

Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us
2024-10-14 13:37:33 -04:00
Tom Lane
293fd24425 ecpg: remove check_rules.pl.
As noted in the previous commit, check_rules.pl is now entirely
redundant with checks made by parse.pl, or would be if it weren't
for the places where it's wrong.  It's a waste of build cycles
and maintenance effort, so remove it.

Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us
2024-10-14 13:33:41 -04:00
Tom Lane
00b0e7204d ecpg: clean up documentation of parse.pl, and add more input checking.
README.parser is the user's manual, such as it is, for parse.pl.
It's rather poorly written if you ask me; so try to improve it.
(More could be written here, but this at least covers the same
info in a more organized fashion.)

Also, the single solitary line of usage info in parse.pl itself
was a lie.  Replace.

Add some error checks that the ecpg.addons entries meet the syntax
rules set forth in README.parser.  One of them didn't, but
accidentally worked anyway because the logic in include_addon is
such that 'block' is the default behavior.

Also add a cross-check that each ecpg.addons entry is matched exactly
once in the backend grammar.  This exposed that there are two dead
entries there --- they are dead because the %replace_types table in
parse.pl causes their nonterminals to be ignored altogether.
Removing them doesn't change the generated preproc.y file.

(This implies that check_rules.pl is completely worthless and should
be nuked: it adds build cycles and maintenance effort while failing
to reliably accomplish its one job of detecting dead rules.  I'll
do that separately.)

Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us
2024-10-14 13:29:36 -04:00
Masahiko Sawada
7be4ba4a9d Remove obsolete comment in reorderbuffer.h.
Commit 9fab40ad32e changed ReorderBuffer to use Slab Context for
allocating ReorderBufferTXN entries instead of using a caching
mechanism. The txn->node is no longer used as an element of the list
of preallocated ReorderBufferTXNs.

Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CAD21AoB1CTnX66Ji3zTCnjoPVC9OzYe0B6LygUHcxEB2RV-hFw%40mail.gmail.com
2024-10-14 09:53:05 -07:00
Masahiko Sawada
4681ad4b2f Use construct_array_builtin for FLOAT8OID instead of construct_array.
Commit d746021de1 introduced construct_array_builtin() for built-in
data types, but forgot some replacements linked to FLOAT8OID.

Author: Bertrand Drouvot
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/CAD21AoCERkwmttY44dqUw%3Dm_9QCctu7W%2Bp6B7w_VqxRJA1Qq_Q%40mail.gmail.com
2024-10-14 09:49:29 -07:00
Peter Eisentraut
c594f1ad2b Track scan reversals in MergeJoin
The MergeJoin struct was tracking "mergeStrategies", which was an
array of btree strategy numbers, purely for the purpose of comparing
it later against btree strategies to determine whether the scan direction
was forward or reverse.  Change that.  Instead, track
"mergeReversals", an array of bool, to indicate the same without the
unfortunate assumption that a strategy number refers specifically to a
btree strategy.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2024-10-14 15:36:18 +02:00
Peter Eisentraut
0d2aa4d493 Track sort direction in SortGroupClause
Functions make_pathkey_from_sortop() and transformWindowDefinitions(),
which receive a SortGroupClause, were determining the sort order
(ascending vs. descending) by comparing that structure's operator
strategy to BTLessStrategyNumber, but could just as easily have gotten
it from the SortGroupClause object, if it had such a field, so add
one.  This reduces the number of places that hardcode the assumption
that the strategy refers specifically to a btree strategy, rather than
some other index AM's operators.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2024-10-14 15:36:02 +02:00
Peter Eisentraut
e7d0cf42b1 Allow TAP tests to force checksums off when calling init()
TAP tests can write

    $node->init(no_data_checksums => 1);

to initialize a cluster explicitly without checksums.  Currently, this
is the default, but this change allows running all tests with
checksums enabled, like

    PG_TEST_INITDB_EXTRA_OPTS=--data-checksums meson test ...

And this also prepares the tests for when we switch the default to
checksums enabled.

The pg_checksums tests need to disable checksums so they can test their
own functionality of enabling checksums.  The amcheck/pg_amcheck tests
need to disable checksums because they manually introduce corruption
that they want to detect, but with checksums enabled, the checksum
verification will fail before they even get to their work.

Author: Greg Sabino Mullane <greg@turnstep.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/CAKAnmmKwiMHik5AHmBEdf5vqzbOBbcwEPHo4-PioWeAbzwcTOQ@mail.gmail.com
2024-10-14 11:25:03 +02:00
Peter Eisentraut
199ad00e4b Run pgperltidy on newly-added test code
From commit 85ec945b78 (but apparently not caught by 05d1b9b5c2).
2024-10-14 11:25:03 +02:00
Daniel Gustafsson
40f4f2fa65 doc: Add anchors for COPY format descriptions
When answering support questions online it's helpful to be able to
refer to the specific format by using an anchored link.

Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://postgr.es/m/87edatit3t.fsf@wibble.ilmari.org
2024-10-14 10:15:33 +02:00
Peter Eisentraut
a2d9a9b95a Remove traces of BeOS.
Commit 15abc7788e6 tolerated namespace pollution from BeOS system
headers.  Commit 44f902122 de-supported BeOS.  Since that stuff didn't
make it into the Meson build system, synchronize by removing from
configure.

Author: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Japin Li <japinli@hotmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (the idea, not the patch)
Discussion: https://postgr.es/m/ME3P282MB3166F9D1F71F787929C0C7E7B6312%40ME3P282MB3166.AUSP282.PROD.OUTLOOK.COM
2024-10-14 08:33:36 +02:00
Michael Paquier
9f34cae142 psql: Fix \watch when using interval values less than 1ms
Attempting to use an interval of time less than 1ms would cause \watch
to hang.  This was confusing, so let's change the logic so that an
interval lower than 1ms behaves the same as 0.

Comments are added to mention that the internals of do_watch() had
better rely on "sleep_ms", the interval value in milliseconds.  While at
it, this commit adds a test to check the behavior of interval values
less than 1ms.

\watch hanging for interval values less than 1ms existed before
6f9ee74d45aa, which changed the code to support an interval value of
0.

Reported-by: Heikki Linnakangas
Author: Andrey M. Borodin, Michael Paquier
Discussion: https://postgr.es/m/88445e0e-3156-4b9d-afae-9a1a7b1631f6@iki.fi
Backpatch-through: 16
2024-10-14 12:27:51 +09:00
Jeff Davis
35a015a600 Fixup for pg_set_relation_stats().
Reported-by: Noriyoshi Shinoda
Discussion: https://postgr.es/m/DM4PR84MB17345E2DFF28A5557B7CBC3CEE7A2@DM4PR84MB1734.NAMPRD84.PROD.OUTLOOK.COM
2024-10-13 13:44:23 -07:00
Michael Paquier
c0b74323dc Use MAX_PARALLEL_WORKER_LIMIT for max_parallel_maintenance_workers
max_parallel_maintenance_workers was introduced in 9da0cc35284b, and
used a hardcoded limit of 1024 rather than this constant.

max_parallel_workers and max_parallel_workers_per_gather already used
MAX_PARALLEL_WORKER_LIMIT (1024) as their upper-bound since
6599c9ac3340.

Author: Matthias van de Meent
Reviewed-by: Zhang Mingli
Discussion: https://postgr.es/m/CAEze2WiCiJD+8Wig_wGPyn4vgdPjbnYXy2Rw+9KYi6izTMuP=w@mail.gmail.com
2024-10-13 11:20:30 +09:00
Tom Lane
9f954177b1 Correctly identify which EC members are computable at a plan node.
find_computable_ec_member() had the wrong mental model of what
its primary caller prepare_sort_from_pathkeys() would do with
the selected EquivalenceClass member expression.  We will not
compute the EC expression in a plan node atop the one returning
the passed-in targetlist; rather, the EC expression will be
computed as an additional column of that targetlist.  So any
Var or quasi-Var used in the given tlist is also available to the
EC expression.  In simple cases this makes no difference because
the given tlist is just a list of Vars or quasi-Vars --- but if
we are considering an appendrel member produced by flattening
a UNION ALL, the tlist may contain expressions, resulting in
failure to match and a "could not find pathkey item to sort"
error.

To fix, we can flatten both the tlist and the EC members with
pull_var_clause(), and then just check for subset-ness, so
that the code is actually shorter than before.

While this bug is quite old, the present patch only works back to
v13.  We could possibly make it work in v12 by back-patching parts
of 375398244.  On the whole though I don't like the risk/reward
ratio of that idea.  v12's final release is next month, meaning
there would be no chance to correct matters if the patch causes a
regression.  Since this failure has escaped notice for 14 years,
it's likely nobody will hit it in the field with v12.

Per bug #18652 from Alexander Lakhin.

Andrei Lepikhov and Tom Lane

Discussion: https://postgr.es/m/18652-deaa782ebcca85d1@postgresql.org
2024-10-12 14:56:08 -04:00
Jeff Davis
98c5b191e7 Fix missed case for builtin collation provider.
A missed check for the builtin collation provider could result in
falling through to call isalpha().

This does not appear to have practical consequences because it only
happens for characters in the ASCII range. Regardless, the builtin
provider should not be calling libc functions, so backpatch.

Discussion: https://postgr.es/m/1bd5a0a5192f82c22ee7527e825b18ab0028b2c7.camel@j-davis.com
Backpatch-through: 17
2024-10-11 16:59:29 -07:00
Jeff Davis
e839c8ecc9 Create functions pg_set_relation_stats, pg_clear_relation_stats.
These functions are used to tweak statistics on any relation, provided
that the user has MAINTAIN privilege on the relation, or is the database
owner.
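
As a rough illustration (a sketch only; the relation name is a
placeholder, and the argument names below are assumptions that may not
match the exact signatures added by this commit):

    -- overwrite the planner-visible statistics of a relation
    SELECT pg_set_relation_stats('some_table'::regclass,
                                 relpages => 100,
                                 reltuples => 10000,
                                 relallvisible => 50);

    -- reset them back to their defaults
    SELECT pg_clear_relation_stats('some_table'::regclass);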

Bump catalog version.

Author: Corey Huinker
Discussion: https://postgr.es/m/CADkLM=eErgzn7ECDpwFcptJKOk9SxZEk5Pot4d94eVTZsvj3gw@mail.gmail.com
2024-10-11 16:55:11 -07:00
Daniel Gustafsson
6f782a2a17 Avoid mixing custom and OpenSSL BIO functions
PostgreSQL has for a long time mixed two BIO implementations, which can
lead to subtle bugs and inconsistencies. This cleans up our BIO by just
setting up the methods we need. This patch does not introduce any
functionality changes.

The following methods are no longer defined due to not being needed:

  - gets: Not used by libssl
  - puts: Not used by libssl
  - create: Sets up state not used by libpq
  - destroy: Not used since libpq uses BIO_NOCLOSE; if it were used it would
             close the socket from underneath libpq
  - callback_ctrl: Not implemented by sockets

The following methods are defined for our BIO:

  - read: Used for reading arbitrary length data from the BIO. No change
          in functionality from the previous implementation.
  - write: Used for writing arbitrary length data to the BIO. No change
           in functionality from the previous implementation.
  - ctrl: Used for processing ctrl messages in the BIO (similar to ioctl).
          The only ctrl message which matters is BIO_CTRL_FLUSH used for
          writing out buffered data (or signal EOF and that no more data
          will be written). BIO_CTRL_FLUSH is mandatory to implement and
          is implemented as a no-op since there is no intermediate buffer
          to flush.
          BIO_CTRL_EOF is the out-of-band method for signalling EOF to
          read_ex based BIOs. Our BIO is not read_ex based but someone
          could accidentally call BIO_CTRL_EOF on us, so implement it mainly
          for completeness' sake.

As the implementation is no longer related to BIO_s_socket or calling
SSL_set_fd, methods have been renamed to reference the PGconn and Port
types instead.

This also reverts to using BIO_set_data, with our fallback, as a small
optimization, as BIO_set_app_data requires the ex_data mechanism in OpenSSL.

Author: David Benjamin <davidben@google.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAF8qwaCZ97AZWXtg_y359SpOHe+HdJ+p0poLCpJYSUxL-8Eo8A@mail.gmail.com
2024-10-11 21:58:58 +02:00
Nathan Bossart
4e1fad3787 Add pg_ls_summariesdir().
This function returns the name, size, and last modification time of
each regular file in pg_wal/summaries.  This allows administrators
to grant privileges to view the contents of this directory without
granting privileges on pg_ls_dir(), which allows listing the
contents of many other directories.  This commit also gives the
pg_monitor predefined role EXECUTE privileges on the new
pg_ls_summariesdir() function.
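
For example (a sketch; the column names are assumed to follow the other
pg_ls_* functions):

    -- list WAL summary files without needing pg_ls_dir() privileges
    SELECT name, size, modification
      FROM pg_ls_summariesdir()
     ORDER BY modification DESC;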

Bumps catversion.

Author: Yushi Ogiwara
Reviewed-by: Michael Paquier, Fujii Masao
Discussion: https://postgr.es/m/a0a3af15a9b9daa107739eb45aa9a9bc%40oss.nttdata.com
2024-10-11 11:02:09 -05:00
Heikki Linnakangas
add77755ce Mark consume_xids test functions VOLATILE and PARALLEL UNSAFE
Both functions advance the transaction ID, which modifies the system
state. Thus, they should be marked as VOLATILE.

Additionally, they call the AssignTransactionId function, which cannot
be invoked in parallel mode, so they should be marked as PARALLEL
UNSAFE.

Author: Yushi Ogiwara <btogiwarayuushi@oss.nttdata.com>
Discussion: https://www.postgresql.org/message-id/18f01e4fd46448f88c7a1363050a9955@oss.nttdata.com
2024-10-11 11:09:09 +03:00
Daniel Gustafsson
682512dca8 Fix typo in connection limits test
Spotted while doing post-commit review.
2024-10-11 10:04:23 +02:00
Álvaro Herrera
099c572d33
Use deconstruct_array_builtin instead of deconstruct_array
Commit 062a84442424 introduced use of deconstruct_array when
deconstruct_array_builtin can be used instead.  Do that to save some
code.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Zwi5g2GzlUX1NqxR@ip-10-97-1-34.eu-west-3.compute.internal
2024-10-11 09:54:18 +02:00
Tatsuo Ishii
cae0f3c405 pgbench: Improve result outputs related to failed transactions.
Previously, per-script statistics were never output when all
transactions failed due to serialization or deadlock errors.  However,
it is reasonable to report such information even when there are no
successful transactions, since these failed transactions are now
subject to reporting.

Meanwhile, if the total number of successful, skipped, and failed
transactions is zero, we no longer report the number of failed
transactions, similar to the handling of skipped transactions; this
avoids printing "NaN%" in the failed-transaction report lines.

Also, the number of transactions in per-script results now includes
skipped and failed transactions.  This prevents printing "total of
NaN%" when no transactions are successfully processed.  The number of
transactions actually processed per script, and the TPS based on it,
are now output explicitly on a separate line.

Author: Yugo Nagata
Reviewed-by: Tatsuo Ishii
Discussion: https://postgr.es/m/20240921003544.2436ef8da9c5c8cb963c651b%40sraoss.co.jp
2024-10-11 13:40:23 +09:00
David Rowley
161320b4b9 Adjust EXPLAIN's output for disabled nodes
c01743aa4 added EXPLAIN output to display the plan node's disabled_nodes
count whenever that count is above 0.  Seemingly, there weren't many
people who liked that output as each parent of a disabled node would
also have a "Disabled Nodes" output due to the way disabled_nodes is
accumulated towards the root plan node.  It was often hard and sometimes
impossible to figure out which nodes were disabled from looking at
EXPLAIN.  You might think it would be possible to manually add up the
numbers from the "Disabled Nodes" output of a given node's children to
figure out if that node has a higher disabled_nodes count than its
children, but that wouldn't have worked for Append and Merge Append nodes
if some disabled child nodes were run-time pruned during init plan.  Those
children are not displayed in EXPLAIN.

Here we attempt to improve this output by showing "Disabled: true"
only against the nodes which are explicitly disabled themselves.  That
seems to be the output that's desired by the most people who voiced
their opinion.  This is done by summing up the disabled_nodes of the
given node's children and checking if that number is less than the
disabled_nodes of the current node.
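
For instance (a sketch; "t" is a placeholder table with no usable index,
and only the "Disabled: true" property is meant literally):

    SET enable_seqscan = off;
    EXPLAIN (COSTS OFF) SELECT * FROM t;
    -- the Seq Scan node is still chosen, but is now the only node
    -- labelled "Disabled: true"; its parents no longer repeat an
    -- accumulated "Disabled Nodes" count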

This commit also fixes a bug in make_sort() which was neglecting to set
the Sort's disabled_nodes field.  This should have copied what was done
in cost_sort(), but it hadn't been updated.  With the new output, the
choice to not maintain that field properly was clearly wrong as the
disabled-ness of the node was attributed to the Sort's parent instead.

Reviewed-by: Laurenz Albe, Alena Rybakina
Discussion: https://postgr.es/m/9e4ad616bebb103ec2084bf6f724cfc739e7fabb.camel@cybertec.at
2024-10-11 17:19:59 +13:00
Tom Lane
c75c6f8d28 Don't hard-code the input file name in gen_tabcomplete.pl's output.
Use $ARGV[0], that is the specified input file name, in #line
directives generated by gen_tabcomplete.pl.  This makes code
coverage reports work properly in the meson build system (where
the input file name will be a relative path).

Also fix up brain fade in the meson build rule for tab-complete.c:
we only need to write the input file name once not twice.

Jacob Champion (some cosmetic adjustments by me)

Discussion: https://postgr.es/m/CAOYmi+=+oWAoi8pqnH0MJQqsSn4ddzqDhqRQJvyiN2aJSWvw2w@mail.gmail.com
2024-10-10 17:02:08 -04:00
Tom Lane
95eb4cd4ff Avoid possible segfault in psql's tab completion.
Fix oversight in bd1276a3c: the "words_after_create" stanza in
psql_completion() requires previous_words_count > 0, since it uses
prev_wd.  This condition was formerly assured by the if-else chain
above it, but no more.  If there were no previous words then we'd
dereference an uninitialized pointer, possibly causing a segfault.

Report and patch by Anthonin Bonnefoy.

Discussion: https://postgr.es/m/CAO6_XqrSRE7c_i+D7Hm07K3+6S0jTAmMr60RY41XzaA29Ae5uA@mail.gmail.com
2024-10-10 16:17:38 -04:00
Álvaro Herrera
fd64ed60b6
Unbreak overflow test for attinhcount/coninhcount
Commit 90189eefc1e1 narrowed pg_attribute.attinhcount and
pg_constraint.coninhcount from 32 to 16 bits, but kept other related
structs with 32-bit wide fields: ColumnDef and CookedConstraint contain
an int 'inhcount' field which is itself checked for overflow on
increments, but there's no check that the values aren't above INT16_MAX
before assigning to the catalog columns.  This means that a creative
user can get an inconsistent table definition and override some
protections.

Fix it by changing those other structs to also use int16.

Also, modernize style by using pg_add_s16_overflow for overflow testing
instead of checking for negative values.

We also have Constraint.inhcount, which is here removed completely.
This was added by commit b0e96f311985 and not removed by its revert at
6f8bb7c1e961.  It is not needed by the upcoming not-null constraints
patch.

This is mostly academic, so we agreed not to backpatch to avoid ABI
problems.

Bump catversion because of the changes to parse nodes.

Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Co-authored-by: 何建 (jian he) <jian.universality@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/202410081611.up4iyofb5ie7@alvherre.pgsql
2024-10-10 17:41:01 +02:00
Fujii Masao
1909835c28 Improve descriptions of some pg_stat_checkpoints functions in pg_proc.dat.
Previously, the descriptions of pg_stat_get_checkpointer_num_requested(),
pg_stat_get_checkpointer_restartpoints_requested(),
and pg_stat_get_checkpointer_restartpoints_performed() in pg_proc.dat
referred to "backend". This was misleading because these functions report
the number of checkpoints or restartpoints requested or performed
by processes other than backends as well.

This commit removes "backend" from these descriptions to avoid confusion.

Bump catalog version.

Idea from Anton A. Melnikov
Author: Fujii Masao
Reviewed-by: Anton A. Melnikov
Discussion: https://postgr.es/m/8e5f353f-8b31-4a8e-9cfa-c037f22b4aee@postgrespro.ru
2024-10-11 00:12:29 +09:00
Tom Lane
5a4416192d Avoid crash in estimate_array_length with null root pointer.
Commit 9391f7152 added a "PlannerInfo *root" parameter to
estimate_array_length, but failed to consider the possibility that
NULL would be passed for that, leading to a null pointer dereference.

We could rectify the particular case shown in the bug report by fixing
simplify_function/inline_function to pass through the root pointer.
However, as long as eval_const_expressions is documented to accept
NULL for root, similar hazards would remain.  For now, let's just do
the narrow fix of hardening estimate_array_length to not crash.
Its behavior with NULL root will be the same as it was before
9391f7152, so this is not too awful.

Per report from Fredrik Widlert (via Paul Ramsey).  Back-patch to v17
where 9391f7152 came in.

Discussion: https://postgr.es/m/518339E7-173E-45EC-A0FF-9A4A62AA4F40@cleverelephant.ca
2024-10-09 17:07:53 -04:00
Michael Paquier
f3f06b1330 Apply GUC name from central table in more places of guc.c
The name extracted from the record of the GUC tables is applied to more
internal places of guc.c.  This change has the advantage of simplifying
parse_and_validate_value(), where the "name" was only used in elog
messages, while it was required to match the name from the GUC
record.

pg_parameter_aclcheck() now passes the name of the GUC from its record
in two places rather than the caller's argument.  The value given to
this function goes through convert_GUC_name_for_parameter_acl() that
does a simple ASCII downcasing.

A few GUCs in core mix character casing; a test is added for one of
these code paths using "IntervalStyle".

Author: Peter Smith, Michael Paquier
Discussion: https://postgr.es/m/ZwNh4vkc2NHJHnND@paquier.xyz
2024-10-09 18:47:34 +09:00
Richard Guo
67a54b9e83 Allow pushdown of HAVING clauses with grouping sets
In some cases, we may want to transfer a HAVING clause into WHERE in
hopes of eliminating tuples before aggregation instead of after.

Previously, we couldn't do this if there were any nonempty grouping
sets, because we didn't have a way to tell if the HAVING clause
referenced any columns that were nullable by the grouping sets, and
moving such a clause into WHERE could potentially change the results.

Now, with expressions marked nullable by grouping sets with the RT
index of the RTE_GROUP RTE, it is much easier to identify those
clauses that reference any nullable-by-grouping-sets columns: we just
need to check if the RT index of the RTE_GROUP RTE is present in the
clause.  For other HAVING clauses, they can be safely pushed down.
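
As a sketch of the distinction (table and column names are placeholders,
and the planner behavior noted in the comments is the expected outcome
rather than a verified plan):

    -- "b" is nullable by the grouping sets, so this clause stays in HAVING
    SELECT a, b, count(*) FROM t
    GROUP BY GROUPING SETS ((a, b), (a))
    HAVING b > 0;

    -- "a" appears in every grouping set, so this clause may now be
    -- transferred into WHERE
    SELECT a, b, count(*) FROM t
    GROUP BY GROUPING SETS ((a, b), (a))
    HAVING a > 0;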

Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4-NpzPgtKU=hgnvyn+J-GanxQCjrUi7piNzZ=upiCV=2Q@mail.gmail.com
2024-10-09 17:19:04 +09:00
Richard Guo
828e94c9d2 Consider explicit incremental sort for mergejoins
For a mergejoin, if the given outer path or inner path is not already
well enough ordered, we need to do an explicit sort.  Currently, we
only consider explicit full sort and do not account for incremental
sort.

In this patch, for the outer path of a mergejoin, we choose to use
explicit incremental sort if it is enabled and there are presorted
keys.  For the inner path, though, we cannot use incremental sort
because it does not support mark/restore at present.

The rationale is based on the assumption that incremental sort is
always faster than full sort when there are presorted keys, a premise
that has been applied in various parts of the code.  In addition, the
current cost model tends to favor incremental sort as being cheaper
than full sort in the presence of presorted keys, making it reasonable
not to consider full sort in such cases.

One might wonder what happens if a mergejoin with an incremental sort
as its outer path is selected as the inner path of another mergejoin.
However, this should not be a problem, because mergejoin itself does
not support mark/restore either, and we will add a Material node on
top of it anyway in this case (see final_cost_mergejoin).

There is one ensuing plan change in the regression tests, and we have
to modify that test case to ensure that it continues to test what it
is intended to.

No backpatch as this could result in plan changes.

Author: Richard Guo
Reviewed-by: David Rowley, Tomas Vondra
Discussion: https://postgr.es/m/CAMbWs49x425QrX7h=Ux05WEnt8GS757H-jOP3_xsX5t1FoUsZw@mail.gmail.com
2024-10-09 17:14:42 +09:00
Daniel Gustafsson
c4528fdfa9 Remove incorrect function import from pgindent
Commit 149ac7d4559, which re-implemented pgindent in Perl, explicitly
imported the devnull function from File::Spec, but the module does
not export anything.  In recent versions of Perl, calling a missing
import function causes a warning, which combined with warnings being
fatal causes pgindent to error out.

Backpatch to all supported versions.

Author: Erik Wienhold <ewie@ewie.name>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/2372cd74-11b0-46f9-b28e-8f9627215d19@ewie.name
Backpatch-through: v12
2024-10-09 09:34:34 +02:00
Tom Lane
9735428661 Allow roles created by new test to log in under SSPI.
Semi-blind attempt to fix 6a1d0d470 to work on Windows,
along the same lines as a70f2a57f.  Per buildfarm.
2024-10-08 19:46:50 -04:00
Michael Paquier
cf54a2c002 pg_stat_statements: Add columns to track parallel worker activity
The view pg_stat_statements gains two columns:
- parallel_workers_to_launch, the number of parallel workers planned to
be launched.
- parallel_workers_launched, the number of parallel workers actually
launched.

The ratio of these two columns offers a hint, on a per-statement basis,
that parallel workers are lacking and that some tuning is required, in
coordination with "calls", the number of times a query is executed.

As of now, these numbers are tracked within Gather and GatherMerge
nodes.  They could be extended to utilities that make use of parallel
workers (parallel btree and brin, VACUUM).

The module is bumped to 1.12.

Author: Guillaume Lelarge
Discussion: https://postgr.es/m/CAECtzeWtTGOK0UgKXdDGpfTVSa5bd_VbUt6K6xn8P7X+_dZqKw@mail.gmail.com
2024-10-09 08:30:45 +09:00
Michael Paquier
de3a2ea3b2 Introduce two fields in EState to track parallel worker activity
These fields can be set by executor nodes to record how many parallel
workers were planned to be launched and how many of them have been
actually launched within the number initially planned.  This data
gives an approximation of the parallel worker drought a system is
facing, making it easier to tune the related configuration parameters.

These fields will be used by some follow-up patches to populate other
parts of the system with their data.

Author: Guillaume Lelarge, Benoit Lobréau
Discussion: https://postgr.es/m/783bc7f7-659a-42fa-99dd-ee0565644e25@dalibo.com
Discussion: https://postgr.es/m/CAECtzeWtTGOK0UgKXdDGpfTVSa5bd_VbUt6K6xn8P7X+_dZqKw@mail.gmail.com
2024-10-09 08:07:48 +09:00
Tom Lane
01fce8dab1 Silence assorted annoying test output.
Remove unnecessary chatter about "checking if IO::Socket::UNIX works";
our tests should never print anything on stderr unless there's a
problem.

Add .gitignore entry for temporary directory now being left behind
in src/test/postmaster.
2024-10-08 14:13:01 -04:00
Tom Lane
2d24fd942c Add min and max aggregates for bytea type.
Similar to a0f1fce80, although we chose to duplicate logic
rather than invoke byteacmp, primarily to avoid repeat detoasting.
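
For example (a sketch; table and column are placeholders):

    -- smallest and largest values under byte-wise comparison
    SELECT min(payload), max(payload) FROM messages;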

Marat Buharov, Aleksander Alekseev

Discussion: https://postgr.es/m/CAPCEVGXiASjodos4P8pgyV7ixfVn-ZgG9YyiRZRbVqbGmfuDyg@mail.gmail.com
2024-10-08 13:52:14 -04:00
Andres Freund
57f3702471 Use aux process resource owner in walsender
AIO will need a resource owner to do IO. Right now we create a resowner
on-demand during basebackup, and we could do the same for AIO. But it seems
easier to just always create an aux process resowner.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/1f6b50a7-38ef-4d87-8246-786d39f46ab9@iki.fi
2024-10-08 11:37:45 -04:00
Andres Freund
755a4c10d1 bufmgr/smgr: Don't cross segment boundaries in StartReadBuffers()
With real AIO it doesn't make sense to cross segment boundaries with one
IO. Add smgrmaxcombine() to allow upper layers to query which buffers can be
merged.

We could continue to cross segment boundaries when not using AIO, but it
doesn't really make sense, because md.c will never be able to perform the read
across the segment boundary in one system call. Which means we'll mark more
buffers as undergoing IO than really makes sense - if another backend desires
to read the same blocks, it'll be blocked longer than necessary. So it seems
better to just never cross the boundary.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/1f6b50a7-38ef-4d87-8246-786d39f46ab9@iki.fi
2024-10-08 11:37:45 -04:00
Andres Freund
488f826c72 bufmgr: Return early in ScheduleBufferTagForWriteback() if fsync=off
As pg_flush_data() doesn't do anything with fsync disabled, there's no point
in tracking the buffer for writeback. Arguably the better fix would be to
change pg_flush_data() to flush data even with fsync off, but that's a
behavioral change, whereas this is just a small optimization.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/1f6b50a7-38ef-4d87-8246-786d39f46ab9@iki.fi
2024-10-08 11:37:45 -04:00
Tom Lane
c01fd93088 Silence buildfarm warning chatter from bd1276a3c.
Buildfarm members using -Wextra complained about "warning: suggest
braces around empty body in an 'if' statement".  Do it gcc's way,
though I see no actual readability benefit in this.
2024-10-08 11:15:16 -04:00
Heikki Linnakangas
05d1b9b5c2 Fix typo and run pgperltidy on newly-added test
From commit 85ec945b78.
2024-10-08 15:47:51 +03:00
Heikki Linnakangas
2bbc261ddb Use a shmem_exit callback to remove backend from PMChildFlags on exit
This seems nicer than having to duplicate the logic between
InitProcess() and ProcKill() for which child processes have a
PMChildFlags slot.

Move the MarkPostmasterChildActive() call earlier in InitProcess(),
out of the section protected by the spinlock.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/a102f15f-eac4-4ff2-af02-f9ff209ec66f@iki.fi
2024-10-08 15:06:34 +03:00
Heikki Linnakangas
85ec945b78 Add test for dead-end backends
The code path for launching a dead-end backend because we're out of
slots was not covered by any tests, so add one. (Some tests did hit
the case of launching a dead-end backend because the server is still
starting up, though, so the gap in our test coverage wasn't as big as
it sounds.)

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/a102f15f-eac4-4ff2-af02-f9ff209ec66f@iki.fi
2024-10-08 15:06:31 +03:00
Heikki Linnakangas
6a1d0d470e Add test for connection limits
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/a102f15f-eac4-4ff2-af02-f9ff209ec66f@iki.fi
2024-10-08 15:06:26 +03:00
Tatsuo Ishii
5b7da5c261 Doc: add check to detect non-breaking spaces in the docs.
There were multiple instances where a non-breaking space (nbsp,
U+00A0, 0xc2a0 in UTF-8) was accidentally added to sgml files.  This
commit adds additional checking to detect nbsp.  You can check for
nbsp by:

make -C doc/src/sgml check

or

make -C doc/src/sgml check-nbsp

Authors: Yugo Nagata, Daniel Gustafsson
Reviewed-by: Tatsuo Ishii, Daniel Gustafsson
Discussion: https://postgr.es/m/20240930.153404.202479334310259810.ishii%40postgresql.org
2024-10-08 20:25:18 +09:00
Fujii Masao
a39297ec02 Move check for binary mode and on_error option to the appropriate location.
Commit 9e2d870119 placed the check for binary mode and on_error
before default values were inserted, which was not ideal.
This commit moves the check to a more appropriate position
after default values are set.

Additionally, the comment incorrectly mentioned two checks before
inserting defaults, when there are actually three. This commit corrects
that comment.

Author: Atsushi Torikoshi
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/8830518a-28ac-43a2-8a11-1676d9a3cdf8@oss.nttdata.com
2024-10-08 18:23:43 +09:00
Fujii Masao
4ac2a9bece Add REJECT_LIMIT option to the COPY command.
Previously, when ON_ERROR was set to 'ignore', the COPY command
would skip all rows with data type conversion errors, with no way to
limit the number of skipped rows before failing.

This commit introduces the REJECT_LIMIT option, allowing users to
specify the maximum number of erroneous rows that can be skipped.
If more rows encounter data type conversion errors than allowed by
REJECT_LIMIT, the COPY command will fail with an error, even when
ON_ERROR = 'ignore'.
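
For example (a sketch; the table name and file path are placeholders):

    -- skip at most 10 malformed rows, then fail
    COPY measurements FROM '/tmp/measurements.csv'
        WITH (FORMAT csv, ON_ERROR 'ignore', REJECT_LIMIT 10);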

Author: Atsushi Torikoshi
Reviewed-by: Junwang Zhao, Kirill Reshke, jian he, Fujii Masao
Discussion: https://postgr.es/m/63f99327aa6b404cc951217fa3e61fe4@oss.nttdata.com
2024-10-08 18:19:58 +09:00
Amit Kapila
d759c1a0b8 Stabilize the test added by commit 022564f60c.
The test was unstable in branches 14 and 15 as we were relying on the
number of changes in the table having a toast column to start streaming.
On branches >= 16, we have a GUC debug_logical_replication_streaming which
can stream each change, so the test was stable in those branches.

Change the test to use PREPARE TRANSACTION as that should make the result
consistent and test the code changed in 022564f60c.

Reported-by: Daniel Gustafsson as per buildfarm
Author: Hou Zhijie, Amit Kapila
Backpatch-through: 14
Discussion: https://postgr.es/m/8C2F86AA-981E-4803-B14D-E264C0255330@yesql.se
2024-10-08 12:25:52 +05:30
Michael Paquier
4572d59e3c Improve style of two code paths
In execGrouping.c, execTuplesMatchPrepare() was doing a memory
allocation that was not necessary when the number of columns was 0.
In foreign.c, pg_options_to_table() was assigning a variable twice to
the same value.

Author: Ranier Vilela
Discussion: https://postgr.es/m/CAEudQAqup0agbSzMjSLSTn=OANyCzxENF1+HrSYnr3WyZib7=Q@mail.gmail.com
2024-10-08 10:51:20 +09:00
Jeff Davis
a9ed7d9449 Fix search_path cache initialization.
The cache needs to be available very early, so don't rely on
InitializeSearchPath() to initialize it.

Reported-by: Murat Efendioğlu
Discussion: https://postgr.es/m/CACbCzujQ4zS8MM1bx-==+tr+D3Hk5G1cjN4XkUQ+Q=cEpwhzqg@mail.gmail.com
Backpatch-through: 17
2024-10-07 17:51:14 -07:00
Nathan Bossart
c3b80a7e98 Fix test for password hash length limit.
In commit 8275325a06, I forgot to update password_1.out (an
alternative expected test output file added by commit 3c44e7d8d4),
so this test began failing on machines with FIPS mode enabled.
2024-10-07 17:17:39 -05:00
Nathan Bossart
8318f2b170 vacuumdb: Schema-qualify operator in catalog query's WHERE clause.
Commit 1ab67c9dfa, which modified this catalog query so that it
doesn't return temporary relations, forgot to schema-qualify the
operator.  A comment earlier in the function implores us to fully
qualify everything in the query:

	 * Since we execute the constructed query with the default search_path
	 * (which could be unsafe), everything in this query MUST be fully
	 * qualified.

This commit fixes that.  While at it, add a newline for consistency
with surrounding code.

Reviewed-by: Noah Misch
Discussion: https://postgr.es/m/ZwQJYcuPPUsF0reU%40nathan
Backpatch-through: 12
2024-10-07 16:49:20 -05:00
Nathan Bossart
5d6187d2a2 Fix Y2038 issues with MyStartTime.
Several places treat MyStartTime as a "long", which is only 32 bits
wide on some platforms.  In reality, MyStartTime is a pg_time_t,
i.e., a signed 64-bit integer.  This will lead to interesting bugs
on the aforementioned systems in 2038 when signed 32-bit integers
are no longer sufficient to store Unix time (e.g., "pg_ctl start"
hanging).  To fix, ensure that MyStartTime is handled as a 64-bit
value everywhere.  (Of course, users will need to ensure that
time_t is 64 bits wide on their system, too.)

Co-authored-by: Max Johnson
Discussion: https://postgr.es/m/CO1PR07MB905262E8AC270FAAACED66008D682%40CO1PR07MB9052.namprd07.prod.outlook.com
Backpatch-through: 12
2024-10-07 13:51:03 -05:00
Tom Lane
f391d9dc93 Convert tab-complete's long else-if chain to a switch statement.
Rename tab-complete.c to tab-complete.in.c, create the preprocessor
script gen_tabcomplete.pl, and install Makefile/meson.build rules
to create tab-complete.c from tab-complete.in.c.  The preprocessor
converts match_previous_words' else-if chain into a switch and
populates tcpatterns[] with the data needed by the driver loop.

The initial HeadMatches/TailMatches/Matches test in each else-if arm
is now performed in a table-driven loop.  Where we get a match, the
corresponding switch case is invoked to see if the match succeeds.
(It might not, if there were additional conditions in the original
else-if test.)

The total number of string comparisons done is just about the
same as it was in the previous coding; however, now that we
have table-driven logic underlying the handmade rules, there
is room to improve that.  For now I haven't bothered because
tab completion is still plenty fast enough for human use.
If the number of rules keeps increasing, we might someday
need to do more in that area.

The immediate benefit of all this thrashing is that C compilers
frequently don't deal well with long else-if chains.  On gcc 8.5.0,
this reduces the compile time of tab-complete.c by about a factor of
four, while MSVC is reported to crash outright with the previous
coding.

Discussion: https://postgr.es/m/2208466.1720729502@sss.pgh.pa.us
2024-10-07 12:22:10 -04:00
Tom Lane
bd1276a3c9 Prepare tab-complete.c for preprocessing.
Separate out psql_completion's giant else-if chain of *Matches
tests into a new function.  Add the infrastructure needed for
table-driven checking of the initial match of each completion
rule.  As-is, however, the code continues to operate as it did.
The new behavior applies only if SWITCH_CONVERSION_APPLIED
is #defined, which it is not here.  (The preprocessor added
in the next patch will add a #define for that.)

The first and last couple of bits of psql_completion are not
based on HeadMatches/TailMatches/Matches tests, so they stay
where they are; they won't become part of the switch.

This patch also fixes up a couple of if-conditions that didn't meet
the conditions enumerated in the comment for match_previous_words().
Those restrictions exist to simplify the preprocessor.

Discussion: https://postgr.es/m/2208466.1720729502@sss.pgh.pa.us
2024-10-07 12:19:12 -04:00
Tom Lane
ef0938f7bd Invent "MatchAnyN" option for tab-complete.c's Matches/MatchesCS.
This argument matches any number (including zero) of previous words.
Use it to replace the common coding pattern

	if (HeadMatches("A", "B") && TailMatches("X", "Y"))

with

	if (Matches("A", "B", MatchAnyN, "X", "Y"))

In itself this feature doesn't do much except (arguably) make the
code slightly shorter and more readable.  However, it reduces the
number of complex if-condition patterns that have to be dealt with
in the next commits in this series.

While here, restructure the *Matches implementation functions so
that the actual work is done in functions that take a char **
array of pattern strings, and the versions taking variadic arguments
are thin wrappers around the array ones.  This simplifies the
new Matches logic considerably.  At the end of this patch series,
the array functions will be the only ones that are material to
performance, so having the variadic ones be wrappers makes sense.

Discussion: https://postgr.es/m/2208466.1720729502@sss.pgh.pa.us
2024-10-07 12:13:02 -04:00
Nathan Bossart
8275325a06 Restrict password hash length.
Commit 6aa44060a3 removed pg_authid's TOAST table because the only
varlena column is rolpassword, which cannot be de-TOASTed during
authentication because we haven't selected a database yet and
cannot read pg_class.  Since that change, attempts to set password
hashes that require out-of-line storage will fail with a "row is
too big" error.  This error message might be confusing to users.

This commit places a limit on the length of password hashes so that
attempts to set long password hashes will fail with a more
user-friendly error.  The chosen limit of 512 bytes should be
sufficient to avoid "row is too big" errors independent of BLCKSZ,
but it should also be lenient enough for all reasonable use-cases
(or at least all the use-cases we could imagine).

Reviewed-by: Tom Lane, Jonathan Katz, Michael Paquier, Jacob Champion
Discussion: https://postgr.es/m/89e8649c-eb74-db25-7945-6d6b23992394%40gmail.com
2024-10-07 10:56:16 -05:00
Amit Kapila
022564f60c Fix fetching default toast value during decoding of in-progress transactions.
During logical decoding of in-progress transactions, we perform the toast
table scan while fetching the default toast value for an attribute. We
forgot to initialize the flag during this scan to indicate that the system
table scan is in progress. We need this flag to ensure that during logical
decoding we never directly access the tableam or heap APIs because we check
for concurrent aborts only in systable_* APIs.

Reported-by: Alexander Lakhin
Author: Takeshi Ideriha, Hou Zhijie
Reviewed-by: Amit Kapila, Hou Zhijie
Backpatch-through: 14
Discussion: https://postgr.es/m/18641-6687273b7f15269d@postgresql.org
2024-10-07 15:38:45 +05:30
Daniel Gustafsson
6ae387eb63 doc: Quote value in SET NAMES documentation
The value passed to SET NAMES should be wrapped in single quotes.
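
For example (a minimal sketch):

    SET NAMES 'UTF8';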

Reported-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxG3EoUsbX4ZoMFkWrvBJcSCbPjdpRvPhuQN65fADc3mFg@mail.gmail.com
2024-10-07 11:50:39 +02:00
Michael Paquier
e09fff7c98 doc: Add minimal C and SQL example to add a custom table AM handler
The documentation was rather sparse on this matter and there is no
extension in-core that shows how to do it.  Adding a small example will
hopefully help newcomers.  An advantage of writing things this way is
that the contents are not going to rot because of backend changes.

Author: Phil Eaton
Reviewed-by: Robert Haas, Fabrízio de Royes Mello
Discussion: https://postgr.es/m/CAByiw+r+CS-ojBDP7Dm=9YeOLkZTXVnBmOe_ajK=en8C_zB3_g@mail.gmail.com
2024-10-07 15:47:40 +09:00
Michael Paquier
2e7c4abe5a Use camel case for "DateStyle" in some error messages
This GUC is written as camel-case in most of the documentation and the
GUC table (but not postgresql.conf.sample), and two error messages
hardcoded it with lower-case characters.  Let's use a more consistent
style.

Most of the noise comes from the regression tests, updated to reflect
the GUC name in these error messages.

Author: Peter Smith
Reviewed-by: Peter Eisentraut, Álvaro Herrera
Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com
2024-10-07 12:36:00 +09:00
Tom Lane
f8d9a9f21e Ignore not-yet-defined Portals in pg_cursors view.
pg_cursor() supposed that any Portal it finds in the hash table must
have sourceText set up, but there's an edge case where that is not so.
A newly-created Portal has sourceText = NULL, and that doesn't change
until PortalDefineQuery is called.  In SPI_cursor_open_internal,
we perform GetCachedPlan between CreatePortal and PortalDefineQuery,
and it's possible for user-defined code to execute during that
planning and cause a fetch from the pg_cursors view, resulting in a
null-pointer-dereference crash.  (It looks like the same could happen
in exec_bind_message, but I've not tried to provoke a failure there.)

I considered trying to fix this by setting sourceText sooner, but
there may be instances of this same calling pattern in extensions,
and we couldn't be sure they'd get the memo promptly.  It seems
better to redefine pg_cursor as not showing Portals that have
not yet had PortalDefineQuery called on them, which we can do by
just skipping them if sourceText is still NULL.

(Before a1c692358, pg_cursor would instead return a row with NULL
in the statement column.  We could revert to that behavior but it
doesn't really seem like a better definition, especially since our
documentation doesn't suggest that the column could be NULL.)

Per report from PetSerAl.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/CAKygsHTBXLXjwV43kpZa+Cs+XTiaeeJiZdL4cPBm9f4MTdw7wg@mail.gmail.com
2024-10-06 16:03:48 -04:00
Andrew Dunstan
70fea390cf Move Cluster.pm initialization code to a more obvious place
Commit 460c0076e8 added some module intialization code to set signal
handlers. However, that code has now become somewhat buried, as later
commits added new subroutines. Therefore, move the initialization code
to the module's INIT block where it won't become obscured.
2024-10-06 10:34:45 -04:00
Michael Paquier
430ce189fc libpq: Discard leading and trailing spaces for parameters and values in URIs
Integer values applied a parsing rule through pqParseIntParam() that
made URIs like this one work, even if they include spaces around
values:
"postgresql://localhost:5432/postgres?keepalives=1 &keepalives_idle=1 "

This commit changes the parsing so that spaces before and after parameters
and values are discarded, offering more consistency with the parsing
that libpq already applied to integer values in URIs.

Note that %20 can be used in a URI for a space character.  ECPGconnect()
has been discarding leading and trailing spaces around parameters and
values for a long time as well.  Like f22e84df1dea, this is done
as a HEAD-only change.

Reviewed-by: Yuto Sasaki
Discussion: https://postgr.es/m/Zv3oWOfcrHTph7JK@paquier.xyz
2024-10-06 18:23:02 +09:00
Tom Lane
68dfecbef2 Use generateClonedIndexStmt to propagate CREATE INDEX to partitions.
When instantiating an existing partitioned index for a new child
partition, we use generateClonedIndexStmt to build a suitable
IndexStmt to pass to DefineIndex.  However, when DefineIndex needs
to recurse to instantiate a newly created partitioned index on an
existing child partition, it was doing copyObject on the given
IndexStmt and then applying a bunch of ad-hoc fixups.  This has
a number of problems, primarily that it implies fresh lookups of
referenced objects such as opclasses and collations.  Since commit
2af07e2f7 caused DefineIndex to restrict search_path internally, those
lookups could fail or deliver different results than the original one.
We can avoid those problems and save a few dozen lines of code by
using generateClonedIndexStmt in this code path too.

Another thing this fixes is incorrect propagation of parent-index
comments to child indexes (because the copyObject approach copies
the idxcomment field while generateClonedIndexStmt doesn't).  I had
noticed this in connection with commit c01eb619a, but not run the
problem to ground.

I'm tempted to back-patch this further than v17, but the only thing
it's known to fix in older branches is the comment issue, which is
pretty minor and doesn't seem worth the risk of introducing new
issues in stable branches.  (If anyone does care about that,
clearing idxcomment in the copied IndexStmt would be a safer fix.)

Per bug #18637 from usamoi.  Back-patch to v17 where the search_path
change came in.

Discussion: https://postgr.es/m/18637-f51e314546e3ba2a@postgresql.org
2024-10-05 14:46:44 -04:00
Heikki Linnakangas
f9ecb57a50 Clean up WaitLatch calls that passed latch without WL_LATCH_SET
The 'latch' argument is ignored if WL_LATCH_SET is not given. Clarify
these calls by not pointlessly passing MyLatch.

Discussion: https://www.postgresql.org/message-id/391abe21-413e-4d91-a650-b663af49500c@iki.fi
2024-10-05 15:31:06 +03:00
Heikki Linnakangas
094ae07160 Remove unneeded #include
Unneeded since commit d72731a704.

Discussion: https://www.postgresql.org/message-id/391abe21-413e-4d91-a650-b663af49500c@iki.fi
2024-10-05 15:09:32 +03:00
Heikki Linnakangas
6c0c49f7d3 Remove unused latch
It was left unused by commit bc971f4025, which replaced the latch
usage with a condition variable.

Discussion: https://www.postgresql.org/message-id/391abe21-413e-4d91-a650-b663af49500c@iki.fi
2024-10-05 15:09:27 +03:00
Thomas Munro
adbb27ac89 Reject non-ASCII locale names.
Commit bf03cfd1 started scanning all available BCP 47 locale names on
Windows.  This caused an abort/crash in the Windows runtime library if
the default locale name contained non-ASCII characters, because of our
use of the setlocale() save/restore pattern with "char" strings.  After
switching to another locale with a different encoding, the saved name
could no longer be understood, and setlocale() would abort.

"Turkish_Türkiye.1254" is the example from recent reports, but there are
other examples of countries and languages with non-ASCII characters in
their names, and they appear in Windows' (old style) locale names.

To defend against this:

1.  In initdb, reject non-ASCII locale names given explicitly on the
command line, or returned by the operating system environment with
setlocale(..., ""), or "canonicalized" by the operating system when we
set it.

2.  In initdb only, perform the save-and-restore with Windows'
non-standard wchar_t variant of setlocale(), so that it is not subject
to round trip failures stemming from char string encoding confusion.

3.  In the backend, we don't have to worry about the save-and-restore
problem because we have already vetted the defaults, so we just have to
make sure that CREATE DATABASE also rejects non-ASCII names in any new
databases.  SET lc_XXX doesn't suffer from the problem, but the ban
applies to it too because it uses check_locale().  CREATE COLLATION
doesn't suffer from the problem either, but it doesn't use
check_locale() so it is not included in the new ban for now, to minimize
the change.

Anyone who encounters the new error message should either create a new
duplicated locale with an ASCII-only name using Windows Locale Builder,
or consider using BCP 47 names like "tr-TR".  Users already couldn't
initialize a cluster with "Turkish_Türkiye.1254" on PostgreSQL 16+, but
the new failure mode is an error message that explains why, instead of a
crash.

Back-patch to 16, where bf03cfd1 landed.  Older versions are affected
in theory too, but only 16 and later are causing crash reports.

Reviewed-by: Andrew Dunstan <andrew@dunslane.net> (the idea, not the patch)
Reported-by: Haifang Wang (Centific Technologies Inc) <v-haiwang@microsoft.com>
Discussion: https://postgr.es/m/PH8PR21MB3902F334A3174C54058F792CE5182%40PH8PR21MB3902.namprd21.prod.outlook.com
2024-10-05 13:50:02 +13:00
Tom Lane
f22e84df1d ecpg: avoid adding whitespace around '&' in connection URLs.
The preprocessor really should not have done this to begin with.
The space after '&' was masked by ECPGconnect's skipping spaces
before option keywords, and the space before by dint of libpq
being (mostly) insensitive to trailing space in option values.
We fixed the one known problem with that in 920d51979.  Hence
this patch is mostly cosmetic, and we'll just change it in HEAD.

Discussion: https://postgr.es/m/TY2PR01MB36286A7B97B9A15793335D18C1772@TY2PR01MB3628.jpnprd01.prod.outlook.com
2024-10-04 12:01:50 -04:00
Peter Eisentraut
ddbba3aac8 Rename PageData to GenericXLogPageData
In the PostgreSQL C type naming schema, the type PageData should be
what the pointer of type Page points to.  But in this case it's
actually an unrelated type local to generic_xlog.c.  Rename that to a
more specific name.  This makes room to possibly add a PageData type
with the mentioned meaning, but this is not done here.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/001d457e-c118-4219-8132-e1846c2ae3c9%40eisentraut.org
2024-10-04 12:47:35 +02:00
Dean Rasheed
9428c001f6 Speed up numeric division by always using the "fast" algorithm.
Formerly there were two internal functions in numeric.c to perform
numeric division, div_var() and div_var_fast(). div_var() performed
division exactly to a specified rscale using Knuth's long division
algorithm, while div_var_fast() used the algorithm from the "FM"
library, which approximates each quotient digit using floating-point
arithmetic, and computes a truncated quotient with DIV_GUARD_DIGITS
extra digits. div_var_fast() could be many times faster than
div_var(), but did not guarantee correct results in all cases, and was
therefore only suitable for use in transcendental functions, where
small errors are acceptable.

This commit merges div_var() and div_var_fast() together into a single
function with an extra "exact" boolean parameter, which can be set to
false if the caller is OK with an approximate result. The new function
uses the faster algorithm from the "FM" library, except that when
"exact" is true, it does not truncate the computation with
DIV_GUARD_DIGITS extra digits, but instead performs the full-precision
computation, subtracting off complete multiples of the divisor for
each quotient digit. However, it is able to retain most of the
performance benefits of div_var_fast(), by delaying the propagation of
carries, allowing the inner loop to be auto-vectorized.

Since this may still lead to an inaccurate result, when "exact" is
true, it then inspects the remainder and uses that to adjust the
quotient, if necessary, to make it correct. In practice, the quotient
rarely needs to be adjusted, and never by more than one in the final
digit, though it's difficult to prove that, so the code allows for
larger adjustments, just in case.

In addition, use base-NBASE^2 arithmetic and a 64-bit dividend array,
similar to mul_var(), so that the number of iterations of the outer
loop is roughly halved. Together with the faster algorithm, this makes
div_var() up to around 20 times as fast as the old Knuth algorithm
when "exact" is true, and up to 2 or 3 times as fast as the old
div_var_fast() function when "exact" is false.

Dean Rasheed, reviewed by Joel Jacobson.

Discussion: https://postgr.es/m/CAEZATCVHR10BPDJSANh0u2+Sg6atO3mD0G+CjKDNRMD-C8hKzQ@mail.gmail.com
2024-10-04 09:49:24 +01:00
Michael Paquier
4dd3087300 Remove assertion checking query ID in execMain.c
This assertion has been added by 24f520594809, but Alexander Lakhin has
proved that the ExecutorRun() one can be broken by using a PL function
that manipulates compute_query_id and track_activities, while the ones
in ExecutorFinish() and ExecutorEnd() could be triggered when cleaning
up portals at the beginning of a new query execution.

Discussion: https://postgr.es/m/b37d8e6c-e83d-e157-8865-1b2460a6aef2@gmail.com
2024-10-04 12:51:17 +09:00
Dean Rasheed
259a0a99fe Fix wrong varnullingrels error for MERGE WHEN NOT MATCHED BY SOURCE.
If a MERGE command contains WHEN NOT MATCHED BY SOURCE actions, the
source relation appears on the outer side of the join. Thus, any Vars
referring to the source in the merge join condition, actions, and
RETURNING list should be marked as nullable by the join, since they
are used in the ModifyTable node above the join. Note that this only
applies to the copy of join condition used in the executor to
distinguish MATCHED from NOT MATCHED BY SOURCE cases. Vars in the
original join condition, inside the join node itself, should not be
marked.

Failure to correctly mark these Vars led to a "wrong varnullingrels"
error in the final stage of query planning, in some circumstances. We
happened to get away without this in all previous tests, since they
all involved a ModifyTable node directly on top of the join node, so
that the top plan targetlist coincided with the output of the join,
and the varnullingrels check was more lax. However, if another plan
node, such as a one-time filter Result node, gets inserted between the
ModifyTable node and the join node, then a stricter check is applied,
which fails.

Per bug #18634 from Alexander Lakhin. Thanks to Tom Lane and Richard
Guo for review and analysis.

Back-patch to v17, where WHEN NOT MATCHED BY SOURCE support was added
to MERGE.

Discussion: https://postgr.es/m/18634-db5299c937877f2b%40postgresql.org
2024-10-03 13:48:32 +01:00
Dean Rasheed
dddb5640c6 Fix incorrect non-strict join recheck in MERGE WHEN NOT MATCHED BY SOURCE.
If a MERGE command contains WHEN NOT MATCHED BY SOURCE actions, the
merge join condition is used by the executor to distinguish MATCHED
from NOT MATCHED BY SOURCE cases. However, this qual is executed using
the output from the join subplan node, which nulls the output from the
source relation in the not matched case, and so the result may be
incorrect if the join condition is "non-strict" -- for example,
something like "src.col IS NOT DISTINCT FROM tgt.col".

Fix this by enhancing the join recheck condition with an additional
"src IS NOT NULL" check, so that it does the right thing when
evaluated using the output from the join subplan.
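
As an illustration, the kind of non-strict merge condition affected
might look like this (hypothetical tables tgt and src):

    MERGE INTO tgt
    USING src ON src.col IS NOT DISTINCT FROM tgt.col
    WHEN MATCHED THEN
        UPDATE SET col = src.col
    WHEN NOT MATCHED BY SOURCE THEN
        DELETE;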

Noted by Tom Lane while investigating bug #18634 from Alexander
Lakhin.

Back-patch to v17, where WHEN NOT MATCHED BY SOURCE support was added
to MERGE.

Discussion: https://postgr.es/m/18634-db5299c937877f2b%40postgresql.org
2024-10-03 12:53:03 +01:00
Amit Langote
19531968e8 Replace Unicode apostrophe with ASCII apostrophe
In commit babb3993dbe9, I accidentally introduced a Unicode
apostrophe (U+2019). This commit replaces it with the ASCII
apostrophe (U+0027) for consistency.

Reported-by: Alexander Korotkov <aekorotkov@gmail.com>
Discussion: https://postgr.es/m/CAPpHfduNWMBjkJFtqXJremk6b6YQYO2s3_VEpnj-T_CaUNUYYQ@mail.gmail.com
2024-10-03 20:00:36 +09:00
Fujii Masao
e55f025b05 Refactor CopyFrom() in copyfrom.c.
This commit simplifies CopyFrom() by removing the unnecessary local variable
'skipped', which tracked the number of rows skipped due to on_error = 'ignore'.
That count is already handled by cstate->num_errors, so the 'skipped' variable
was redundant.

Additionally, the condition on_error != COPY_ON_ERROR_STOP is removed.
Since on_error == COPY_ON_ERROR_IGNORE is already checked, and on_error
only has two values (ignore and stop), the additional check was redundant
and made the logic harder to read. Seemingly this was introduced
in preparation for a future patch, but the current checks don’t offer
clear value and have been removed to improve readability.

Author: Atsushi Torikoshi
Reviewed-by: Masahiko Sawada, Fujii Masao
Discussion: https://postgr.es/m/ab59dad10490ea3734cf022b16c24cfd@oss.nttdata.com
2024-10-03 15:59:16 +09:00
Fujii Masao
a1c4c8a9e1 file_fdw: Add on_error and log_verbosity options to file_fdw.
In v17, the on_error and log_verbosity options were introduced for
the COPY command. This commit extends support for these options
to file_fdw.

Setting on_error = 'ignore' for a file_fdw foreign table allows users
to query it without errors, even when the input file contains
malformed rows, by skipping the problematic rows.

Both on_error and log_verbosity options apply to SELECT and ANALYZE
operations on file_fdw foreign tables.
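
For example, a foreign table could be declared along these lines
(hypothetical server, table, and file path):

    CREATE SERVER file_server FOREIGN DATA WRAPPER file_fdw;
    CREATE FOREIGN TABLE measurements_ft (ts timestamptz, value numeric)
        SERVER file_server
        OPTIONS (filename '/tmp/measurements.csv', format 'csv',
                 on_error 'ignore');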

Author: Atsushi Torikoshi
Reviewed-by: Masahiko Sawada, Fujii Masao
Discussion: https://postgr.es/m/ab59dad10490ea3734cf022b16c24cfd@oss.nttdata.com
2024-10-03 15:57:32 +09:00
Fujii Masao
e7834a1a25 Add log_verbosity = 'silent' support to COPY command.
Previously, when the on_error option was set to ignore, the COPY command
would always log NOTICE messages for input rows discarded due to
data type incompatibility. Users had no way to suppress these messages.

This commit introduces a new log_verbosity setting, 'silent',
which prevents the COPY command from emitting NOTICE messages
when on_error = 'ignore' is used, even if rows are discarded.
This feature is particularly useful when processing malformed files
frequently, where a flood of NOTICE messages can be undesirable.
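
A minimal sketch of the new setting (hypothetical table and file path):

    COPY measurements FROM '/tmp/measurements.csv'
        WITH (FORMAT csv, on_error 'ignore', log_verbosity 'silent');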

For example, when frequently loading malformed files via the COPY command
or querying foreign tables using file_fdw (with an upcoming patch to
add on_error support for file_fdw), users may prefer to suppress
these messages to reduce log noise and improve clarity.

Author: Atsushi Torikoshi
Reviewed-by: Masahiko Sawada, Fujii Masao
Discussion: https://postgr.es/m/ab59dad10490ea3734cf022b16c24cfd@oss.nttdata.com
2024-10-03 15:55:37 +09:00
Amit Langote
babb3993db Fix expression list handling in ATExecAttachPartition()
This commit addresses two issues related to the manipulation of the
partition constraint expression list in ATExecAttachPartition().

First, the current use of list_concat() to combine the partition's
constraint (retrieved via get_qual_from_partbound()) with the parent
table’s partition constraint can lead to memory safety issues. After
calling list_concat(), the original constraint (partBoundConstraint)
might no longer be safe to access, as list_concat() may free or modify
it.

Second, there's a logical error in constructing the constraint for
validating against the default partition. The current approach
incorrectly includes a negated version of the parent table's partition
constraint, which is redundant, as it always evaluates to false for
rows in the default partition.

To resolve these issues, list_concat() is replaced with
list_concat_copy(), ensuring that partBoundConstraint remains unchanged
and can be safely reused when constructing the validation constraint
for the default partition.

This fix is not applied to back-branches, as there is no live bug and
the issue has not caused any reported problems in practice.

Nitin Jadhav posted a patch to address the memory safety issue, but I
decided to follow Alvaro Herrera's suggestion from the initial
discussion, as it allows us to fix both the memory safety and logical
issues.

Reported-by: Andres Freund <andres@anarazel.de>
Reported-by: Nitin Jadhav <nitinjadhavpostgres@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/20231115165737.zeulb575cgrbqo74@awork3.anarazel.de
Discussion: https://postgr.es/m/CAMm1aWbmYHM3bqtjyMQ-a+4Ub=dgsb_2E3_up2cn=UGdHNrGTg@mail.gmail.com
2024-10-03 11:59:09 +09:00
Michael Paquier
e2bab2d792 Remove support for unlogged on partitioned tables
The following commands were allowed on partitioned tables, with
different effects:
1) ALTER TABLE SET [UN]LOGGED did not issue an error, and did not update
pg_class.relpersistence.
2) CREATE UNLOGGED TABLE was working with pg_class.relpersistence marked
as initially defined, but partitions did not inherit the UNLOGGED
property, which was confusing.

This commit causes the commands mentioned above to fail for partitioned
tables, instead.
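
For instance, with hypothetical partitioned tables:

    CREATE TABLE events (id int, created date) PARTITION BY RANGE (created);
    ALTER TABLE events SET UNLOGGED;                                  -- now an error
    CREATE UNLOGGED TABLE events2 (id int) PARTITION BY RANGE (id);   -- now an error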

pg_dump is tweaked so that partitioned tables marked as UNLOGGED ignore
the option when dumped from older server versions.  pgbench needs a
tweak for --unlogged and --partitions=N to ignore the UNLOGGED option on
the partitioned tables it creates; their partitions are still unlogged.

Author: Michael Paquier
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/ZiiyGFTBNkqcMQi_@paquier.xyz
2024-10-03 10:55:02 +09:00
Tom Lane
554d3a18f3 Adjust json_manifest_per_file_callback API in one more place.
Oversight in commit d94cf5ca7 (and in my testing of same).

Discussion: https://postgr.es/m/9468.1727895630@sss.pgh.pa.us
2024-10-02 20:27:45 -04:00
Tom Lane
920d51979a Parse libpq's "keepalives" option more like other integer options.
Use pqParseIntParam (nee parse_int_param) instead of using strtol
directly.  This allows trailing whitespace, which the previous coding
didn't, and makes the spelling of the error message consistent with
other similar cases.

This seems to be an oversight in commit e7a221797, which introduced
parse_int_param.  That fixed places that were using atoi(), but missed
this place which was randomly using strtol() instead.

Ordinarily I'd consider this minor cleanup not worth back-patching.
However, it seems that ecpg assumes it can add trailing whitespace
to URL parameters, so that use of the keepalives option fails in
that context.  Perhaps that's worth improving as a separate matter.
In the meantime, back-patch this to all supported branches.

Yuto Sasaki (some further cleanup by me)

Discussion: https://postgr.es/m/TY2PR01MB36286A7B97B9A15793335D18C1772@TY2PR01MB3628.jpnprd01.prod.outlook.com
2024-10-02 17:30:36 -04:00
Robert Haas
d94cf5ca7f File size in a backup manifest should use uint64, not size_t.
size_t is the size of an object in memory, not the size of a file on disk.

Thanks to Tom Lane for noting the error.

Discussion: http://postgr.es/m/1865585.1727803933@sss.pgh.pa.us
2024-10-02 09:59:04 -04:00
Daniel Gustafsson
7b2822ecf9 doc: Missing markup, punctuation and wordsmithing
Various improvements to the documentation like adding missing
markup, improving punctuation, ensuring consistent spelling of
words and minor wordsmithing.

Author: Oleg Sibiryakov <o.sibiryakov@postgrespro.ru>
Discussion: https://postgr.es/m/b7d0a03c-107e-48c7-a5c9-2c6f73cdf78f@postgrespro.ru
2024-10-02 14:50:56 +02:00
Daniel Gustafsson
9c73395104 Add fastpaths for when no objects are found
If no objects are found, there is no reason to inspect the result
columns or to malloc a zero-sized (in reality, 1-byte) heap buffer
for them.  Add a fast path that returns immediately, as other object
inspection functions already do.

Reviewed-by: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://postgr.es/m/C2F05B3C-1414-45DD-AE09-6FEE4D0F89BD@yesql.se
2024-10-02 13:08:55 +02:00
Daniel Gustafsson
1a123e3b13 Remove superfluous PQExpBuffer resetting
Since the buffer was just created, there is no reason to immediately
reset it.

Reviewed-by: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://postgr.es/m/C2F05B3C-1414-45DD-AE09-6FEE4D0F89BD@yesql.se
2024-10-02 13:07:31 +02:00
Daniel Gustafsson
94902b146f doc: Add link to login event trigger example
The login event trigger is not listed on the trigger firing matrix
since it's not fired by a command.  Add a link to the example code
page similar to how the other event triggers link to the matrix.

Reported-by: Marcos Pegoraro <marcos@f10.com.br>
Discussion: https://postgr.es/m/CAB-JLwYS+78rX02BZ3wJ9ykVrd2i3O1K+7jzvZKQ0evquyQiLQ@mail.gmail.com
2024-10-02 12:24:39 +02:00
Fujii Masao
17cc5f666f Fix inconsistent reporting of checkpointer stats.
Previously, the pg_stat_checkpointer view and the checkpoint completion
log message could show different numbers for buffers written
during checkpoints. The view only counted shared buffers,
while the log message included both shared and SLRU buffers,
causing inconsistencies.

This commit resolves the issue by updating both the view and the log message
to separately report shared and SLRU buffers written during checkpoints.
A new slru_written column is added to the pg_stat_checkpointer view
to track SLRU buffers, while the existing buffers_written column now
tracks only shared buffers. This change helps users distinguish
between the two types of buffers in both the pg_stat_checkpointer view
and the checkpoint completion log message.
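
A monitoring query can now distinguish the two kinds of buffers, e.g.:

    SELECT buffers_written, slru_written FROM pg_stat_checkpointer;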

Bump catalog version.

Author: Nitin Jadhav
Reviewed-by: Bharath Rupireddy, Michael Paquier, Kyotaro Horiguchi, Robert Haas
Reviewed-by: Andres Freund, vignesh C, Fujii Masao
Discussion: https://postgr.es/m/CAMm1aWb18EpT0whJrjG+-nyhNouXET6ZUw0pNYYAe+NezpvsAA@mail.gmail.com
2024-10-02 11:17:47 +09:00
Michael Paquier
506eede711 doc: Clarify name of files generated by pg_waldump --save-fullpage
The fork name is always separated from the block number by an underscore
in the names of the files generated, but the docs stuck them together
without a separator, which was confusing.

Author: Christoph Berg
Discussion: https://postgr.es/m/ZvxtSLiix9eceMRM@msg.df7cb.de
Backpatch-through: 16
2024-10-02 11:12:40 +09:00
Tom Lane
da8a4c1666 Reject a copy EOF marker that has data ahead of it on the same line.
We have always documented that a copy EOF marker (\.) must appear
by itself on a line, and that is how psql interprets the rule.
However, the backend's actual COPY FROM logic only insists that
there not be data between the \. and the following newline.
Any data ahead of the \. is parsed as a final line of input.
It's hard to interpret this as anything but an ancient mistake
that we've faithfully carried forward.  Continuing to allow it
is not cost-free, since it could mask client-side bugs that
unnecessarily backslash-escape periods (and thereby risk
accidentally creating an EOF marker).  So, let's remove that
provision and throw error if the EOF marker isn't alone on its
line, matching what the documentation has said right along.
Adjust the relevant error messages to be clearer, too.
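
Sketch for a hypothetical single-column table t, using text-format COPY:

    COPY t FROM STDIN;
    one
    two\.

This input now raises an error because data precedes the marker on its
line; the marker must appear alone, as in:

    one
    two
    \.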

Discussion: https://postgr.es/m/ed659f37-a9dd-42a7-82b9-0da562cc4006@manitou-mail.org
2024-10-01 16:53:54 -04:00
Peter Eisentraut
983a588e0b initdb: Add new option "--no-data-checksums"
Right now this does nothing except override any earlier
--data-checksums option.  But the idea is that --data-checksums could
become the default, and then this option would allow forcing it off
instead.

Author: Greg Sabino Mullane <greg@turnstep.com>
Discussion: https://www.postgresql.org/message-id/flat/CAKAnmmKwiMHik5AHmBEdf5vqzbOBbcwEPHo4-PioWeAbzwcTOQ@mail.gmail.com
2024-10-01 10:50:30 -04:00
Peter Eisentraut
efd72a3d42 Tweak docs to reduce possible impact of data checksums
Author: Greg Sabino Mullane <greg@turnstep.com>
Discussion: https://www.postgresql.org/message-id/flat/CAKAnmmKwiMHik5AHmBEdf5vqzbOBbcwEPHo4-PioWeAbzwcTOQ@mail.gmail.com
2024-10-01 09:58:20 -04:00
Peter Eisentraut
10b721821d Use macro to define the number of enum values
Refactoring in the interest of code consistency, a follow-up to 2e068db56e31.

The argument against inserting a special enum value at the end of the enum
definition is that a switch statement might generate a compiler warning unless
it has a default clause.

Aleksander Alekseev, reviewed by Michael Paquier, Dean Rasheed, Peter Eisentraut

Discussion: https://postgr.es/m/CAJ7c6TMsiaV5urU_Pq6zJ2tXPDwk69-NKVh4AMN5XrRiM7N%2BGA%40mail.gmail.com
2024-10-01 09:30:24 -04:00
Robert Haas
fc1b2ce0ee Fix some pg_verifybackup issues reported by Coverity.
Commit 8dfd3129027969fdd2d9d294220c867d2efd84aa introduced a few
problems. verify_tar_file() forgot to free a buffer; the leak can't
add up to anything material, but might as well fix it.
precheck_tar_backup_file() intended to return after reporting an
error but didn't actually do so. member_copy_control_data() could
try to copy zero bytes (and maybe Coverity thinks it can even be
trying to copy a negative number of bytes).

Per discussion with Tom Lane.

Discussion: http://postgr.es/m/1240823.1727629418@sss.pgh.pa.us
2024-10-01 08:36:54 -04:00
Peter Eisentraut
9c2a6c5a5f Simplify checking for xlocale.h
Instead of XXX_IN_XLOCALE_H for several features XXX, let's just
include <xlocale.h> if HAVE_XLOCALE_H.  The reason for the extra
complication was apparently that some old glibc systems also had an
<xlocale.h>, and you weren't supposed to include it directly, but it's
gone now (as far as I can tell it was harmless to do so anyway).

Author: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CWZBBRR6YA8D.8EHMDRGLCKCD%40neon.tech
2024-10-01 07:23:45 -04:00
Peter Eisentraut
ee4859123e jit: Use opaque pointers in all supported LLVM versions.
LLVM's opaque pointer change began in LLVM 14, but remained optional
until LLVM 16.  When commit 37d5babb added opaque pointer support, we
didn't turn it on for LLVM 14 and 15 yet because we didn't want to risk
weird bitcode incompatibility problems in released branches of
PostgreSQL.  (That might have been overly cautious, I don't know.)

Now that PostgreSQL 18 has dropped support for LLVM versions < 14, and
since it hasn't been released yet and no extensions or bitcode have been
built against it in the wild yet, we can be more aggressive.  We can rip
out the support code and build system clutter that made opaque pointer
use optional.

Author: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussions: https://postgr.es/m/CA%2BhUKGLhNs5geZaVNj2EJ79Dx9W8fyWUU3HxcpZy55sMGcY%3DiA%40mail.gmail.com
2024-10-01 06:10:15 -04:00
Peter Eisentraut
972c2cd288 jit: Require at least LLVM 14, if enabled.
Remove support for LLVM versions 10-13.  The default on all non-EOL'd
OSes represented in our build farm will be at least LLVM 14 when
PostgreSQL 18 ships.

Author: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/CA%2BhUKGLhNs5geZaVNj2EJ79Dx9W8fyWUU3HxcpZy55sMGcY%3DiA%40mail.gmail.com
2024-10-01 04:49:11 -04:00
Daniel Gustafsson
1b4d52c355 doc: Mention the connstring key word for PGSERVICE
The documentation for the connection service file was mentioning
the environment variable early but not the connection string key
word until the last sentence, and only then in an example.  This
adds the keyword to the first paragraph to make it clearer.

Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://postgr.es/m/87r09ibpke.fsf@wibble.ilmari.org
2024-10-01 10:20:14 +02:00
Michael Paquier
cf4401fe6c Fix race condition in COMMIT PREPARED causing orphaned 2PC files
COMMIT PREPARED removes on-disk 2PC files near its end, but the state
indicating whether a file is on disk was read from shared memory
without holding the two-phase state lock.

Because of that, there was a small window where a second backend doing a
PREPARE TRANSACTION could reuse the GlobalTransaction put back into the
2PC free list by the COMMIT PREPARED, overwriting the "ondisk" flag read
afterwards by the COMMIT PREPARED to decide if its on-disk two-phase
state file should be removed, preventing the file deletion.

This commit fixes the issue so that the "ondisk" flag in the
GlobalTransaction is read while holding the two-phase state lock, not
from shared memory after its entry has been added to the free list.

Orphaned two-phase state files flushed to disk after a checkpoint are
discarded at the beginning of recovery.  However, a truncation of
pg_xact/ would make the startup process issue a FATAL when it cannot
read the SLRU page holding the state of the transaction whose 2PC file
was orphaned, which is a necessary step to decide whether the 2PC file
should be removed.  Manually removing the file would be necessary in
this case.

Issue introduced by effe7d9552dd, so backpatch all the way down.

Mea culpa.

Author: wuchengwen
Discussion: https://postgr.es/m/tencent_A7F059B5136A359625C7B2E4A386B3C3F007@qq.com
Backpatch-through: 12
2024-10-01 15:44:03 +09:00
Tatsuo Ishii
3b1a377def Doc: replace unnecessary non-breaking space with ordinal space.
There were unnecessary non-breaking spaces (nbsp, U+00A0, 0xc2a0 in
UTF-8) in the docs.  This commit replaces them with ASCII spaces
(0x20).

config.sgml is backpatched through 17.
ref/drop_extension.sgml is backpatched through 13.

Discussion: https://postgr.es/m/20240930.153404.202479334310259810.ishii%40postgresql.org
Reviewed-by: Yugo Nagata, Daniel Gustafsson
Backpatch-through: 17, 13
2024-10-01 11:34:34 +09:00
Michael Paquier
5deb56387e Expand assertion check for query ID reporting in executor
As formulated, the assertion added in the executor by 24f520594809 to
check that a query ID is set had two problems:
- track_activities may be disabled while compute_query_id is enabled,
causing the query ID to not be reported to pg_stat_activity.
- debug_query_string may not be set in some contexts.  The only path
where this would matter appears to be autovacuum, should parallel
workers be enabled there at some point.  This is not the case currently.

There was no test showing the interactions between the query ID and
track_activities, so let's add one based on a scan of pg_stat_activity.
This assertion is still an experiment at this stage, but let's see if
it shows more paths where query IDs are not properly set while they
should be.

Discussion: https://postgr.es/m/Zvn5616oYXmpXyHI@paquier.xyz
2024-10-01 08:56:21 +09:00
Daniel Gustafsson
102de3be73 Add missing command for pg_maintain in comment
The comment in pg_class_aclmask_ext() which lists the allowed commands
for the pg_maintain role lacked LOCK TABLE.

Reported-by: Yusuke Sugie <btsugieyuusuke@oss.nttdata.com>
Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp>
Discussion: https://postgr.es/m/034d3c60f5daba1919cd90f236b2e22d@oss.nttdata.com
2024-10-01 00:01:32 +02:00
Tom Lane
7702337489 Do not treat \. as an EOF marker in CSV mode for COPY IN.
Since backslash is (typically) not special in CSV data, we should
not be treating \. as special either.  The server historically did
this to keep CSV and TEXT modes more alike and to support V2 protocol;
but V2 protocol is long dead, and the inconsistency with CSV standards
is annoying.  Remove that behavior in CopyReadLineText, and make some
minor consequent code simplifications.
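
As a sketch, a CSV data line consisting solely of \. is now loaded as a
literal value rather than ending the input (hypothetical table and path):

    COPY t FROM '/tmp/data.csv' WITH (FORMAT csv);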

On the client side, we need to fix psql so that it does not check
for \. except when reading data from STDIN (that is, the script
source).  We must do that regardless of TEXT/CSV mode or there is
no way to end the COPY short of script EOF.  Also, be careful
not to send the \. to the server in that case.

This is a small compatibility break in that other applications
beside psql may need similar adjustment.  Also, using an older
version of psql with a v18 server may result in misbehavior
during CSV-mode COPY IN.

Daniel Vérité, reviewed by vignesh C, Robert Haas, and myself

Discussion: https://postgr.es/m/ed659f37-a9dd-42a7-82b9-0da562cc4006@manitou-mail.org
2024-09-30 17:57:12 -04:00
Fujii Masao
a19f83f879 docs: Enhance the pg_stat_checkpointer view documentation.
This commit updates the documentation for the pg_stat_checkpointer view
to clarify what kind of checkpoints or restartpoints each counter tracks.
This makes it easier to understand the meaning of each counter.

Previously, the num_requested description included "backend,"
which could be misleading since requests come from other sources as well.
This commit also removes "backend" from the description of num_requested,
to avoid confusion.

Author: Fujii Masao
Reviewed-by: Anton A. Melnikov
Discussion: https://postgr.es/m/4640258e-d959-4cf0-903c-cd02389c3e05@oss.nttdata.com
2024-10-01 02:01:57 +09:00
Tom Lane
04c64e3fb3 Remove incorrect entries in pg_walsummary's getopt_long call.
For some reason this listed "-f" and "-w" as valid switches, though
the code doesn't implement any such thing nor do the docs mention
them.  The effect of this was that if you tried to use one of these
switches, you'd get an unhelpful error message.

Yusuke Sugie

Discussion: https://postgr.es/m/68e72a2a70f4d84c1c7847b13bcdaef8@oss.nttdata.com
2024-09-30 12:06:54 -04:00
Alvaro Herrera
4dea33ce76 Don't disallow DROP of constraints ONLY on partitioned tables
This restriction seems to have come about due to some fuzzy thinking: in
commit 9139aa19423b we were adding a restriction against ADD constraint
ONLY on partitioned tables (which is sensible) and apparently we thought
the DROP case had to be symmetrical.  However, it isn't, and the
comments about it are mistaken about the effect it would have.  Remove
this limitation.
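
A command of this shape is now accepted (hypothetical table and
constraint names):

    ALTER TABLE ONLY parent_tbl DROP CONSTRAINT parent_check;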

There have been no reports of users bothered by this limitation, so I'm
not backpatching it just yet.  We can revisit this decision later, as needed.

Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/202409261752.nbvlawkxsttf@alvherre.pgsql
Discussion: https://postgr.es/m/7682253a-6f79-6a92-00aa-267c4c412870@lab.ntt.co.jp
	(about commit 9139aa19423b, previously not registered)
2024-09-30 11:58:13 +02:00
Michael Paquier
4c7cd07aa6 Bump catalog version for change in VariableSetStmt
Oversight in dc68515968e8, as this breaks SQL functions with a SET
command.

Reported-by: Tom Lane
Discussion: https://postgr.es/m/1364409.1727673407@sss.pgh.pa.us
2024-09-30 14:52:03 +09:00
Michael Paquier
dc68515968 Show values of SET statements as constants in pg_stat_statements
This is a continuation of work like 11c34b342bd7, done to reduce the
bloat of pg_stat_statements by applying more normalization to query
entries.  This commit is able to detect and normalize values in
VariableSetStmt, resulting in:
SET conf_param = $1

Compared to other parse nodes, VariableSetStmt is embedded in much more
places in the parser, impacting many query patterns in
pg_stat_statements.  A custom jumble function is used, with an extra
field in the node to decide if arguments should be included in the
jumbling or not, as a location field is not enough for this purpose.
This approach allows for finer tuning.
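
For example, the following statements would collapse into a single
normalized entry, "SET work_mem = $1" (parameter chosen arbitrarily):

    SET work_mem = '64MB';
    SET work_mem = '256MB';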

Clauses relying on one or more keywords are not normalized, for example:
* DEFAULT
* FROM CURRENT
* List of keywords.  SET SESSION CHARACTERISTICS AS TRANSACTION,
where it is critical to differentiate different sets of options, is a
good example of why normalization should not happen.

Some commands use VariableSetStmt for SET subclauses; their values are
also normalized:
- ALTER DATABASE
- ALTER ROLE
- ALTER SYSTEM
- CREATE/ALTER FUNCTION

ba90eac7a995 has added test coverage for most of the existing SET
patterns.  The expected output of these tests shows the difference this
commit creates.  Normalization could perhaps be applied to more portions
of the grammar, but what is done here is conservative and good enough as
a starting point.

Author: Greg Sabino Mullane, Michael Paquier
Discussion: https://postgr.es/m/36e5bffe-e989-194f-85c8-06e7bc88e6f7@amazon.com
Discussion: https://postgr.es/m/B44FA29D-EBD0-4DD9-ABC2-16F1CB087074@amazon.com
Discussion: https://postgr.es/m/CAKAnmmJtJY2jzQN91=2QAD2eAJAA-Per61eyO48-TyxEg-q0Rg@mail.gmail.com
2024-09-30 14:02:00 +09:00
Fujii Masao
559efce1d6 Add num_done counter to the pg_stat_checkpointer view.
Checkpoints can be skipped when the server is idle. The existing num_timed and
num_requested counters in pg_stat_checkpointer track both completed and
skipped checkpoints, but there was no way to count only the completed ones.

This commit introduces the num_done counter, which tracks only completed
checkpoints, making it easier to see how many were actually performed.
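
For instance, the new counter can be read alongside the existing ones:

    SELECT num_timed, num_requested, num_done FROM pg_stat_checkpointer;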

Bump catalog version.

Author: Anton A. Melnikov
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/9ea77f40-818d-4841-9dee-158ac8f6e690@oss.nttdata.com
2024-09-30 11:56:05 +09:00
Fujii Masao
20cfec896c reindexdb: Skip reindexing temporary tables and indexes.
Reindexing temp tables or indexes of other sessions is not allowed.
However, reindexdb in parallel mode previously listed them as
the objects to process, leading to failures.

This commit ensures reindexdb in parallel mode skips temporary tables
and indexes by adding a condition based on the relpersistence column
in pg_class to the object listing queries, preventing these issues.

Note that this commit does not affect reindexdb when temporary tables
or indexes are explicitly specified using the -t or -i options;
reindexdb in that case still does not skip them and can cause an error.

Back-patch to v13 where parallel mode was introduced in reindexdb.

Author: Fujii Masao
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/5f37ee56-14fb-44fe-9150-9eb97e10538b@oss.nttdata.com
2024-09-30 11:13:55 +09:00
Michael Paquier
6fd5071909 Set query ID in parallel workers for vacuum, BRIN and btree
All these code paths use their own entry point when starting parallel
workers, but failed to set a query ID, even if they set a text query.
Hence, this data would be missed in pg_stat_activity for the worker
processes.  The main entry point for parallel query processing,
ParallelQueryMain(), is already doing that by saving its query ID in a
dummy PlannedStmt, but not the others.  The code is changed so as the
query ID of these queries is set in their shared state, and reported
back once the parallel workers start.

Some tests are added to show how the failures can happen for btree and
BRIN with a parallel build enforced, which are able to trigger a failure
in an assertion added by 24f520594809 in the recovery TAP test
027_stream_regress.pl where pg_stat_statements is always loaded.  In
this case, the executor path was taken because the index expression
needs to be flattened when building its IndexInfo.

Alexander Lakhin has noticed the problem in btree, and I have noticed
that the issue was more spread.  This is arguably a bug, but nobody has
complained about that until now, so no backpatch is done out of caution.
If folks would like to see a backpatch, well, let me know.

Reported-by: Alexander Lakhin
Reviewed-by: Sami Imseih
Discussion: https://postgr.es/m/cf3547c1-498a-6a61-7b01-819f902a251f@gmail.com
2024-09-30 08:43:28 +09:00
Noah Misch
0d5a3d7574 Remove NULL dereference from RenameRelationInternal().
Defect in last week's commit aac2c9b4fde889d13f859c233c2523345e72d32b,
per Coverity.  Reaching this would need catalog corruption.  Back-patch
to v12, like that commit.
2024-09-29 15:54:25 -07:00
Tom Lane
e9339782a6 In passwordFromFile, don't leak the open file after stat failures.
Oversight in e882bcae0.  Per Coverity.
2024-09-29 13:40:03 -04:00
Noah Misch
c1ff2d8bc5 Avoid 037_invalid_database.pl hang under debug_discard_caches.
Back-patch to v12 (all supported versions).
2024-09-27 15:28:56 -07:00
Nathan Bossart
d8ebcac547 doc: Note that CREATE MATERIALIZED VIEW restricts search_path.
Since v17, CREATE MATERIALIZED VIEW has set search_path to
"pg_catalog, pg_temp" while running the query.  The docs for the
other commands that restrict search_path mention it, but the page
for CREATE MATERIALIZED VIEW does not.  Fix that.

Oversight in commit 4b74ebf726.

Author: Yugo Nagata
Reviewed-by: Jeff Davis
Discussion: https://postgr.es/m/20240805160502.d2a4975802a832b1e04afb80%40sraoss.co.jp
Backpatch-through: 17
2024-09-27 16:21:21 -05:00
Tom Lane
a3179ab692 Recalculate where-needed data accurately after a join removal.
Up to now, remove_rel_from_query() has done a pretty shoddy job
of updating our where-needed bitmaps (per-Var attr_needed and
per-PlaceHolderVar ph_needed relid sets).  It removed direct mentions
of the to-be-removed baserel and outer join, which is the minimum
amount of effort needed to keep the data structures self-consistent.
But it didn't account for the fact that the removed join ON clause
probably mentioned Vars of other relations, and those Vars might now
not be needed as high up in the join tree as before.  It's easy to
show cases where this results in failing to remove a lower outer join
that could also have been removed.

To fix, recalculate the where-needed bitmaps from scratch after
each successful join removal.  This sounds expensive, but it seems
to add only negligible planner runtime.  (We cheat a little bit
by preserving "relation 0" entries in the bitmaps, allowing us to
skip re-scanning the targetlist and HAVING qual.)

The submitted test case drew attention because we had successfully
optimized away the lower join prior to v16.  I suspect that that's
somewhat accidental and there are related cases that were never
optimized before and now can be.  I've not tried to come up with
one, though.

Perhaps we should back-patch this into v16 and v17 to repair the
performance regression.  However, since it took a year for anyone
to notice the problem, it can't be affecting too many people.  Let's
let the patch bake awhile in HEAD, and see if we get more complaints.

Per bug #18627 from Mikaël Gourlaouen.  No back-patch for now.

Discussion: https://postgr.es/m/18627-44f950eb6a8416c2@postgresql.org
2024-09-27 16:04:04 -04:00
Robert Haas
7f7474a8e4 Reindent pg_verifybackup.c. 2024-09-27 11:14:31 -04:00
Robert Haas
8dfd312902 pg_verifybackup: Verify tar-format backups.
This also works for compressed tar-format backups. However, -n must be
used, because we use pg_waldump to verify WAL, and it doesn't yet know
how to verify WAL that is stored inside of a tarfile.

Amul Sul, reviewed by Sravan Kumar and by me, and revised by me.
2024-09-27 08:40:24 -04:00
Fujii Masao
8410f738ad Fix typo in pg_walsummary/nls.mk.
Author: Koki Nakamura
Discussion: https://postgr.es/m/485c613d1db8de2e8169d5afd43e7f9e@oss.nttdata.com
2024-09-27 10:20:22 +09:00
Michael Paquier
09620ea091 Fix incorrect memory access in VACUUM FULL with invalid toast indexes
An invalid toast index is skipped in reindex_relation().  These would be
remnants of a failed REINDEX CONCURRENTLY, and they should never be
rebuilt, as there can only be one valid toast index at a time.

REINDEX_REL_SUPPRESS_INDEX_USE, used by CLUSTER and VACUUM FULL, needs
to maintain a list of the indexes being processed.  The list of indexes
is retrieved from the relation cache, and includes invalid indexes.  The
code missed that invalid toast indexes are ignored in reindex_relation(),
as rebuilding them leads to a hard failure in reindex_index(), and they
were left in the reindex pending list, making the list inconsistent when
rechecked.  The incorrect memory access happened while scanning pg_class
for the refresh of pg_database.datfrozenxid.

This issue exists since REINDEX CONCURRENTLY exists, where invalid toast
indexes can exist, so backpatch all the way down.

Reported-by: Alexander Lakhin
Author: Tender Wang
Discussion: https://postgr.es/m/18630-9aed99c38830657d@postgresql.org
Backpatch-through: 12
2024-09-27 09:40:09 +09:00
Michael Paquier
f762d99c87 Fix catalog data of new LO privilege functions
This commit improves the catalog data in pg_proc for the three
has_largeobject_privilege() functions introduced in 4eada203a5a8:
- Fix their descriptions (typos and consistency).
- Reallocate OIDs to be within the 8000-9999 range as required by
a6417078c414.

Bump catalog version.

Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/ZvUYR0V0dzWaLnsV@paquier.xyz
2024-09-27 07:26:29 +09:00
Nathan Bossart
b52adbad46 Ensure we have a snapshot when updating pg_index entries.
Creating, reindexing, and dropping an index concurrently could
entail accessing pg_index's TOAST table, which was recently added
in commit b52c4fc3c0.  These code paths start and commit their own
transactions, but they do not always set an active snapshot.  This
rightfully leads to assertion failures and ERRORs when trying to
access pg_index's TOAST table, such as the following:

	ERROR:  cannot fetch toast data without an active snapshot

To fix, push an active snapshot just before each section of code
that might require accessing pg_index's TOAST table, and pop it
shortly afterwards.

Reported-by: Alexander Lakhin
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/a97d7401-e7c9-f771-6a00-037379f0a8bb%40gmail.com
2024-09-26 15:51:23 -05:00
Nathan Bossart
9726653185 Improve style of pg_upgrade task callback functions.
I wanted to avoid adjusting this code too much when converting
these tasks to use the new parallelization framework (see commit
40e2e5e92b), which is why this is being done as a follow-up commit.
These stylistic adjustments result in fewer lines of code and fewer
levels of indentation in some places.

While at it, add names to the UpgradeTaskSlotState enum and the
UpgradeTaskSlot struct.  I'm not aware of any established project
policy in this area, but let's at least be consistent within the
same file.

Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/ZunW7XHLd2uTts4f%40nathan
2024-09-26 13:54:37 -05:00
Tom Lane
147bbc90f7 Modernize to_char's Roman-numeral code, fixing overflow problems.
int_to_roman() only accepts plain "int" input, which is fine since
we're going to produce '###############' for any value above 3999
anyway.  However, the numeric and int8 variants of to_char() would
throw an error if the given input exceeded the integer range, while
the float-input variants invoked undefined-per-C-standard behavior.
Fix things so that you uniformly get '###############' for out of
range input.

Also add test cases covering this code, plus the equally-untested
EEEE, V, and PL format codes.

Discussion: https://postgr.es/m/2956175.1725831136@sss.pgh.pa.us
2024-09-26 11:02:31 -04:00
Tom Lane
e3a92ab070 Doc: InitPlans aren't parallel-restricted any more.
Commit e08d74ca1 removed that restriction, but missed updating
the documentation about it.  Noted by Egor Rogov.

Discussion: https://postgr.es/m/cdc8f87b-a378-4e22-6d29-40ae32dd97d1@postgrespro.ru
2024-09-26 10:37:51 -04:00
Amit Kapila
d66572d9fe Doc: Add a note in the upgrade of logical replication clusters.
The documented steps for upgrading the cluster upgraded the publisher
node first, but ideally any node could be upgraded first.

Author: Vignesh C
Discussion: https://postgr.es/m/CALDaNm1_iDO6srWzntqTr0ZDVkk2whVhNKEWAvtgZBfSmuBeZQ@mail.gmail.com
Discussion: https://postgr.es/m/CALDaNm3Y-M+kAqr_mf=_C1kNwAB-cS6S5hTHnKMEqDw4sGEh4Q@mail.gmail.com
2024-09-26 16:14:07 +05:30
Alexander Korotkov
e658038772 Update oid for pg_wal_replay_wait() procedure
Use an oid from 8000-9999 range, as required by 98eab30b93d5.

Reported-by: Michael Paquier
Discussion: https://postgr.es/m/ZvUY6bfTwB0GsyzP%40paquier.xyz
2024-09-26 11:49:41 +03:00
Nathan Bossart
2ceeb638b7 Remove extra whitespace in pg_upgrade status message.
There's no need to add another level of indentation to this status
message.  pg_log() will put it in the right place.

Oversight in commit 347758b120.

Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/ZunW7XHLd2uTts4f%40nathan
Backpatch-through: 17
2024-09-25 11:18:56 -05:00
Alvaro Herrera
dce507356a Turn 'if' condition around to avoid Svace complaint
The unwritten assumption of this code is that both events->head and
events->tail are NULL together (an empty list) or they aren't.  So the
code was testing events->head for nullness and using that as a cue to
dereference events->tail, which annoys the Svace static code analyzer.
We can silence it by testing the events->tail member instead, and add an
assertion about events->head to ensure it's all consistent.

This code is very old and as far as we know, there's never been a bug
report related to this, so there's no need to backpatch.

This was found by the ALT Linux Team using Svace.

Author: Alexander Kuznetsov <kuznetsovam@altlinux.org>
Discussion: https://postgr.es/m/6d0323c3-3f5d-4137-af73-98a5ab90e77c@altlinux.org
2024-09-25 16:42:02 +02:00
Michael Paquier
1ab67c9dfa vacuumdb: Skip temporary tables in query to build list of relations
Running vacuumdb with a non-superuser while another user has created a
temporary table would lead to a mid-flight permission failure,
interrupting the operation.  vacuum_rel() skips temporary relations of
other backends, and it makes no sense for vacuumdb to know about these
relations, so let's switch it to ignore temporary relations entirely.

Adding a qual in the query based on relpersistence simplifies the
generation of its WHERE clause in vacuum_one_database(), hence the
removal of "has_where".

Author: VaibhaveS, Michael Paquier
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/CAM_eQjwfAR=y3G1fGyS1U9FTmc+FyJm9amNfY2QCZBnDDbNPZg@mail.gmail.com
Backpatch-through: 12
2024-09-25 14:43:16 +09:00
Amit Kapila
7fdeaf5774 Doc: Add the steps for upgrading the logical replication cluster.
Author: Vignesh C
Reviewed-by: Peter Smith, Amit Kapila, Hayato Kuroda, Bharath Rupireddy
Discussion: https://postgr.es/m/CALDaNm1_iDO6srWzntqTr0ZDVkk2whVhNKEWAvtgZBfSmuBeZQ@mail.gmail.com
2024-09-25 10:06:10 +05:30
Michael Paquier
ba90eac7a9 pg_stat_statements: Expand tests for SET statements
There are many grammar flavors that depend on the parse node
VariableSetStmt.  This closes the gap in pg_stat_statements by providing
test coverage for what should be a large majority of them, further
improving the work begun in de2aca288569.  This will be used to ease the
evaluation of a path towards more normalization of SET queries with
query jumbling.

Note that SET NAMES (grammar from the standard, synonym of SET
client_encoding) is omitted on purpose; this could use UTF8 with a
conditional script where UTF8 is supported, but that does not seem worth
the maintenance cost for the sake of these tests.

The author submitted most of these in a TAP test (I filled in any holes
I could spot), but queries in a pg_stat_statements SQL file achieve the
same goal while being easier to look at when testing normalization
patterns.

Author: Greg Sabino Mullane, Michael Paquier
Discussion: https://postgr.es/m/CAKAnmmJtJY2jzQN91=2QAD2eAJAA-Per61eyO48-TyxEg-q0Rg@mail.gmail.com
2024-09-25 10:04:44 +09:00
Noah Misch
aac2c9b4fd For inplace update durability, make heap_update() callers wait.
The previous commit fixed some ways of losing an inplace update.  It
remained possible to lose one when a backend working toward a
heap_update() copied a tuple into memory just before inplace update of
that tuple.  In catalogs eligible for inplace update, use LOCKTAG_TUPLE
to govern admission to the steps of copying an old tuple, modifying it,
and issuing heap_update().  This includes MERGE commands.  To avoid
changing most of the pg_class DDL, don't require LOCKTAG_TUPLE when
holding a relation lock sufficient to exclude inplace updaters.
Back-patch to v12 (all supported versions).  In v13 and v12, "UPDATE
pg_class" or "UPDATE pg_database" can still lose an inplace update.  The
v14+ UPDATE fix needs commit 86dc90056dfdbd9d1b891718d2e5614e3e432f35,
and it wasn't worth reimplementing that fix without such infrastructure.

Reviewed by Nitin Motiani and (in earlier versions) Heikki Linnakangas.

Discussion: https://postgr.es/m/20231027214946.79.nmisch@google.com
2024-09-24 15:25:18 -07:00
Noah Misch
a07e03fd8f Fix data loss at inplace update after heap_update().
As previously-added tests demonstrated, heap_inplace_update() could
instead update an unrelated tuple of the same catalog.  It could lose
the update.  Losing relhasindex=t was a source of index corruption.
Inplace-updating commands like VACUUM will now wait for heap_update()
commands like GRANT TABLE and GRANT DATABASE.  That isn't ideal, but a
long-running GRANT already hurts VACUUM progress more just by keeping an
XID running.  The VACUUM will behave like a DELETE or UPDATE waiting for
the uncommitted change.

For implementation details, start at the systable_inplace_update_begin()
header comment and README.tuplock.  Back-patch to v12 (all supported
versions).  In back branches, retain a deprecated heap_inplace_update(),
for extensions.

Reported by Smolkin Grigory.  Reviewed by Nitin Motiani, (in earlier
versions) Heikki Linnakangas, and (in earlier versions) Alexander
Lakhin.

Discussion: https://postgr.es/m/CAMp+ueZQz3yDk7qg42hk6-9gxniYbp-=bG2mgqecErqR5gGGOA@mail.gmail.com
2024-09-24 15:25:18 -07:00
Noah Misch
dbf3f974ee Warn if LOCKTAG_TUPLE is held at commit, under debug_assertions.
The current use always releases this locktag.  A planned use will
continue that intent.  It will involve more areas of code, making unlock
omissions easier.  Warn under debug_assertions, like we do for various
resource leaks.  Back-patch to v12 (all supported versions), the plan
for the commit of the new use.

Reviewed by Heikki Linnakangas.

Discussion: https://postgr.es/m/20240512232923.aa.nmisch@google.com
2024-09-24 15:25:18 -07:00
Jeff Davis
ac30021356 Allow length=-1 for NUL-terminated input to pg_strncoll(), etc.
Like ICU, allow a length of -1 to be specified for NUL-terminated
arguments to pg_strncoll(), pg_strnxfrm(), and pg_strnxfrm_prefix().

Simplifies the code and comments.

Discussion: https://postgr.es/m/2d758e07dff26bcc7cbe2aec57431329bfe3679a.camel@j-davis.com
2024-09-24 15:15:18 -07:00
Tom Lane
1591b38d17 Fix psql describe commands' handling of ACL columns for old servers.
Commit d1379ebf4 carelessly broke printACLColumn for pre-9.4 servers,
by using the cardinality() function which we introduced in 9.4.
We expect psql's describe-related commands to work back to 9.2, so
this is bad.  Use the longstanding array_length() function instead.

Per report from Christoph Berg.  Back-patch to v17.

Discussion: https://postgr.es/m/ZvLXYglRS6hMMhtr@msg.df7cb.de
2024-09-24 17:21:38 -04:00
Jeff Davis
ceeaaed87a Tighten up make_libc_collator() and make_icu_collator().
Ensure that error paths within these functions do not leak a collator,
and return the result rather than using an out parameter. (Error paths
in the caller may still result in a leaked collator, which will be
addressed separately.)

In make_libc_collator(), if the first newlocale() succeeds and the
second one fails, close the first locale_t object.

The function make_icu_collator() doesn't have any external callers, so
change it to be static.

Discussion: https://postgr.es/m/54d20e812bd6c3e44c10eddcd757ec494ebf1803.camel@j-davis.com
2024-09-24 12:01:45 -07:00
Peter Eisentraut
59f0eea7b0 Add further excludes to headerscheck
Some header files under contrib/isn/ are not meant to be included
independently, and they fail -Wmissing-variable-declarations when
doing so.

Reported-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BYVt5MBD-w0HyHpsGb4U8RNge3DvAbDmOFy_epGhZ2Mg%40mail.gmail.com#aba3226c6dd493923bd6ce95d25a2d77
2024-09-24 20:41:47 +02:00
Tom Lane
cd838e2008 Neaten up our choices of SQLSTATEs for XML-related errors.
When our XML-handling modules were first written, the SQL standard
lacked any error codes that were particularly intended for XML
error conditions.  Unsurprisingly, this led to some rather random
choices of errcodes in those modules.  Now the standard has a whole
SQLSTATE class, "Class 10 - XQuery Error", with a reasonably large
selection of relevant-looking errcodes.

In this patch I've chosen one fairly generic code defined by the
standard, 10608 = invalid_argument_for_xquery, and used it where
it seemed appropriate.  I've also made an effort to replace
ERRCODE_INTERNAL_ERROR everywhere it was not clearly reporting
a coding problem; in particular, many of the existing uses look
like they can fairly be reported as ERRCODE_OUT_OF_MEMORY.

It might be interesting to try to map libxml2's error codes into
the standard's new collection, but I've not undertaken that here.

Discussion: https://postgr.es/m/417250.1726341268@sss.pgh.pa.us
2024-09-24 12:59:56 -04:00
Peter Geoghegan
3da436ec09 Update obsolete nbtree array preprocessing comments.
The array->scan_key references fixed up at the end of preprocessing
start out as offsets into the arrayKeyData[] array (the array returned
by _bt_preprocess_array_keys at the start of preprocessing that involves
array scan keys).  Offsets into the arrayKeyData[] array are no longer
guaranteed to be valid offsets into our original scan->keyData[] input
scan key array, but comments describing the array->scan_key references
still talked about scan->keyData[].  Update those comments.

Oversight in commit b5249741.
2024-09-24 12:58:55 -04:00
David Rowley
62ddf7ee9a Add ONLY support for VACUUM and ANALYZE
Since autovacuum does not trigger an ANALYZE for partitioned tables,
users must perform these manually.  However, performing a manual ANALYZE
on a partitioned table would always result in recursively analyzing each
partition and that could be undesirable as autovacuum takes care of that.
For partitioned tables that contain a large number of partitions, having
to analyze each partition could take an unreasonably long time, especially
so for tables with a large number of columns.

Here we allow the ONLY keyword to prefix the name of the table to allow
users to have ANALYZE skip processing partitions.  This option can also
be used with VACUUM, but there is no work to do if VACUUM ONLY is used on
a partitioned table.
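
For a hypothetical partitioned table, that means:

    ANALYZE ONLY measurements;   -- skips processing the partitions
    VACUUM ONLY measurements;    -- accepted, though there is no work to do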

This commit also changes the behavior of VACUUM and ANALYZE for
inheritance parents.  Previously inheritance child tables would not be
processed when operating on the parent.  Now, by default we *do* operate
on the child tables.  ONLY can be used to obtain the old behavior.
The release notes should note this as an incompatibility.  The default
behavior has not changed for partitioned tables as these always
recursively processed the partitions.

Author: Michael Harris <harmic@gmail.com>
Discussion: https://postgr.es/m/CADofcAWATx_haD=QkSxHbnTsAe6+e0Aw8Eh4H8cXyogGvn_kOg@mail.gmail.com
Discussion: https://postgr.es/m/CADofcAXVbD0yGp_EaC9chmzsOoSai3jcfBCnyva3j0RRdRvMVA@mail.gmail.com
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Melih Mutlu <m.melihmutlu@gmail.com>
Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Reviewed-by: jian he <jian.universality@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
2024-09-24 18:03:40 +12:00
Michael Paquier
bbba59e69a Remove ATT_TABLE for ALTER TABLE ... ATTACH/DETACH
Attempting these commands for a non-partitioned table would result in a
failure when creating the relation in transformPartitionCmd().  This
gives the possibility to throw an error earlier with a much better error
message, thanks to d69a3f4d70b7.

The extra test cases are from me.  Note that FINALIZE uses a different
subcommand and it had no coverage for its failure path with
non-partitioned tables.

Author: Álvaro Herrera, Michael Paquier
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/202409190803.tnis52adt2n5@alvherre.pgsql
2024-09-24 08:59:08 +09:00
Tom Lane
75240f65e7 jsonapi: fix memory leakage during OOM error recovery.
Coverity pointed out that inc_lex_level() would leak memory
(not to mention corrupt the pstack data structure) if some
but not all of its three REALLOC's failed.  To fix, store
successfully-updated pointers back into the pstack struct
immediately.

Oversight in 0785d1b8b, so no need for back-patch.
2024-09-23 12:30:51 -04:00
Tomas Vondra
a7e5237f26 Fix asserts in fast-path locking code
Commit c4d5cb71d229 introduced a couple asserts in the fast-path locking
code, upsetting Coverity.

The assert in InitProcGlobal() is clearly wrong, as it assigns instead
of checking the value. This is harmless, but doesn't check anything.

The asserts in FAST_PATH_ macros are written as if for signed values,
but the macros are only called for unsigned ones. That makes the check
for (val >= 0) useless. Checks written as ((uint32) x < max) work for
both signed and unsigned values. Negative values should wrap to values
greater than INT32_MAX.

Per Coverity, report by Tom Lane.

Reported-by: Tom Lane
Discussion: https://postgr.es/m/2891628.1727019959@sss.pgh.pa.us
2024-09-23 11:37:12 +02:00
Tatsuo Ishii
40708acd65 Add memory/disk usage for more executor nodes.
This commit is similar to 95d6e9af07, expanding the idea to CTE scan,
table function scan and recursive union scan nodes so that the maximum
tuplestore memory or disk usage is shown by the EXPLAIN ANALYZE command.
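
For instance, a recursive union can now report its tuplestore usage:

    EXPLAIN (ANALYZE, COSTS OFF)
    WITH RECURSIVE t(n) AS (
        VALUES (1)
        UNION ALL
        SELECT n + 1 FROM t WHERE n < 1000
    )
    SELECT count(*) FROM t;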

Also adjust show_storage_info() so that it accepts storage type and
storage size arguments instead of Tuplestorestate. This allows the
node types to share the formatting code using show_storage_info(). Due
to this, show_material_info() and show_windowagg_info() are also
modified.

Reviewed-by: David Rowley
Discussion: https://postgr.es/m/20240918.211246.1127161704188186085.ishii%40postgresql.org
2024-09-23 16:34:24 +09:00
Nathan Bossart
6aa44060a3 Remove pg_authid's TOAST table.
pg_authid's only varlena column is rolpassword, which unfortunately
cannot be de-TOASTed during authentication because we haven't
selected a database yet and cannot read pg_class.  By removing
pg_authid's TOAST table, attempts to set password hashes that
require out-of-line storage will fail with a "row is too big"
error instead.  We may want to provide a more user-friendly error
in the future, but for now let's just remove the useless TOAST
table.

Bumps catversion.

Reported-by: Alexander Lakhin
Reviewed-by: Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/89e8649c-eb74-db25-7945-6d6b23992394%40gmail.com
2024-09-21 15:17:46 -05:00
Tomas Vondra
c4d5cb71d2 Increase the number of fast-path lock slots
Replace the fixed-size array of fast-path locks with arrays, sized on
startup based on max_locks_per_transaction. This allows using fast-path
locking for workloads that need more locks.

The fast-path locking introduced in 9.2 allowed each backend to acquire
a small number (16) of weak relation locks cheaply. If a backend needs
to hold more locks, it has to insert them into the shared lock table.
This is considerably more expensive, and may be subject to contention
(especially on many-core systems).

The limit of 16 fast-path locks was always rather low, because we have
to lock all relations - not just tables, but also indexes, views, etc.
For planning we need to lock all relations that might be used in the
plan, not just those that actually get used in the final plan. So even
with rather simple queries and schemas, we often need significantly more
than 16 locks.

As partitioning gets used more widely, and the number of partitions
increases, this limit is trivial to hit. Complex queries may easily use
hundreds or even thousands of locks. For workloads doing a lot of I/O
this is not noticeable, but for workloads accessing only data in RAM,
the access to the shared lock table may be a serious issue.

This commit removes the hard-coded limit of the number of fast-path
locks. Instead, the size of the fast-path arrays is calculated at
startup, and can be set much higher than the original 16-lock limit.
The overall fast-path locking protocol remains unchanged.

The variable-sized fast-path arrays can no longer be part of PGPROC, but
are allocated as a separate chunk of shared memory and then referenced
from the PGPROC entries.

The fast-path slots are organized as a 16-way set associative cache. You
can imagine it as a hash table of 16-slot "groups". Each relation is
mapped to exactly one group using hash(relid), and the group is then
processed using linear search, just like the original fast-path cache.
With only 16 entries this is cheap, with good locality.

Treating this as a simple hash table with open addressing would not be
efficient, especially once the hash table gets almost full. The usual
remedy is to grow the table, but we can't do that here easily. The
access would also be more random, with worse locality.

The fast-path arrays are sized using the max_locks_per_transaction GUC.
We try to have enough capacity for the number of locks specified in the
GUC, using the traditional 2^n formula, with an upper limit of 1024 lock
groups (i.e. 16k locks). The default value of max_locks_per_transaction
is 64, which means those instances will have 64 fast-path slots.
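
So, as a sketch, raising the fast-path capacity on a lock-heavy instance
comes down to raising that GUC (a server restart is required):

    ALTER SYSTEM SET max_locks_per_transaction = 1024;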

The main purpose of the max_locks_per_transaction GUC is to size the
shared lock table. It is often set to the "average" number of locks
needed by backends, with some backends using significantly more locks.
This should not be a major issue, however. Some backends may have to
insert locks into the shared lock table, but there can't be too many of
them, limiting the contention.

If more fast-path slots are needed, the only solution is to increase
the GUC, even if the shared lock table
already has sufficient capacity. That is not free, especially in terms
of memory usage (the shared lock table entries are fairly large). It
should only happen on machines with plenty of memory, though.

In the future we may consider a separate GUC for the number of fast-path
slots, but let's try without one first.

Reviewed-by: Robert Haas, Jakub Wartak
Discussion: https://postgr.es/m/510b887e-c0ce-4a0c-a17a-2c6abb8d9a5c@enterprisedb.com
2024-09-21 20:09:35 +02:00
Peter Geoghegan
b524974106 Refactor handling of nbtree array redundancies.
Teach _bt_preprocess_array_keys to eliminate redundant array equality
scan keys directly, rather than just marking them as redundant.  Its
_bt_preprocess_keys caller is no longer required to ignore input scan
keys that were marked redundant in this way.  Oversights like the one
fixed by commit f22e17f7 are no longer possible.

The new scheme also makes it easier for _bt_preprocess_keys to output a
so.keyData[] scan key array with _more_ scan keys than it was passed in
its scan.keyData[] input scan key array.  An upcoming patch that adds
skip scan optimizations to nbtree will take advantage of this.

In passing, remove and rename certain _bt_preprocess_keys variables to
make the difference between our input scan key array and our output scan
key array clearer.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAH2-Wz=9A_UtM7HzUThSkQ+BcrQsQZuNhWOvQWK06PRkEp=SKQ@mail.gmail.com
2024-09-21 13:25:49 -04:00
Tom Lane
54562c9cfa Improve Asserts checking relation matching in parallel scans.
table_beginscan_parallel and index_beginscan_parallel contain
Asserts checking that the relation a worker will use in
a parallel scan is the same one the leader intended.  However,
they were checking for relation OID match, which was not strong
enough to detect the mismatch problem fixed in 126ec0bc7.
What would be strong enough is to compare relfilenodes instead.
Arguably, that's a saner definition anyway, since a scan surely
operates on a physical relation not a logical one.  Hence,
store and compare RelFileLocators not relation OIDs.  Also
ensure that index_beginscan_parallel checks the index identity
not just the table identity.
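
A standalone illustration of the idea (the struct and helper below are
simplified stand-ins, not the actual RelFileLocator code):

    /* Simplified stand-in for comparing physical relation identity. */
    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>

    typedef struct
    {
        uint32_t spcOid;      /* tablespace */
        uint32_t dbOid;       /* database */
        uint32_t relNumber;   /* physical file number */
    } FileLocator;

    static bool
    locators_equal(FileLocator a, FileLocator b)
    {
        return a.spcOid == b.spcOid &&
               a.dbOid == b.dbOid &&
               a.relNumber == b.relNumber;
    }

    int
    main(void)
    {
        FileLocator leader = {1663, 5, 16384};
        FileLocator worker = {1663, 5, 16384};

        /*
         * A rewrite (e.g. TRUNCATE or REINDEX) changes relNumber but not the
         * OID, so this check catches mismatches an OID comparison would miss.
         */
        assert(locators_equal(leader, worker));
        return 0;
    }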

Discussion: https://postgr.es/m/2127254.1726789524@sss.pgh.pa.us
2024-09-20 16:37:55 -04:00
Nathan Bossart
afb03e2ebf Alphabetize #include directives in pg_checksums.c.
Author: Michael Banck
Discussion: https://postgr.es/m/66edaed0.050a0220.32a9ba.42c8%40mx.google.com
2024-09-20 15:18:42 -05:00
Tom Lane
a2ebf3274a Doc: explain how to test ADMIN privilege with pg_has_role().
This has always been possible, but the syntax is a bit obscure,
and our user-facing docs were not very helpful.  Spell it out
more clearly.

Per complaint from Dominique Devienne.  Back-patch to
all supported branches.

Discussion: https://postgr.es/m/CAFCRh-8JNEy+dV4SXFOrWca50u+d=--TO4cq=+ac1oBtfJy4AA@mail.gmail.com
2024-09-20 15:56:34 -04:00
Peter Geoghegan
c00c54a9ac Fix nbtree pgstats accounting with parallel scans.
Commit 5bf748b8, which enhanced nbtree ScalarArrayOp execution, made
parallel index scans work with the new design for arrays via explicit
scheduling of primitive index scans.  Under this scheme a parallel index
scan with array keys will perform the same number of index descents as
an equivalent serial index scan (barring corner cases where an
individual parallel worker discovers that it can advance the scan's
array keys without anybody needing to perform another descent of the
index to get to the relevant page on the leaf level).

Despite all this, the pgstats accounting wasn't updated; it continued to
increment the total number of index scans for the rel once per _bt_first
call, no matter the details.  As a result, the number of (primitive)
index scans could be over-counted during parallel scans.

To fix, delay incrementing the count of index scans until after we've
established that another descent of the index (using either _bt_search
or _bt_endpoint) is required.  That way pg_stat_user_tables.idx_scan
always advances in the same way, regardless of whether or not the scan
makes use of parallelism.

Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp
execution.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAH2-Wz=E7XrkvscBN0U6V81NK3Q-dQOmivvbEsjG-zwEfDdFpg@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WzkRqvaqR2CTNqTZP0z6FuL4-3ED6eQB0yx38XBNj1v-4Q@mail.gmail.com
Backpatch: 17-, where nbtree SAOP execution was enhanced.
2024-09-20 14:06:32 -04:00
Michael Paquier
d35e293878 Add parameter "connstr" to PostgreSQL::Test::Cluster::background_psql
Like for Cluster::psql, this can be handy to force the use of a
connection string with some values overriden, like a "host".

Author: Aidar Imamov
Discussion: https://postgr.es/m/ecacb079efc533aed3c234cbcb5b07b6@postgrespro.ru
2024-09-20 09:59:22 +09:00
Tom Lane
126ec0bc76 Restore relmapper state early enough in parallel workers.
We need to do RestoreRelationMap before loading catalog-derived
state, else the worker may end up with catalog relcache entries
containing stale relfilenode data.  Move up RestoreReindexState
too, on the principle that that should also happen before we
do much of any catalog access.

I think ideally these things would happen even before InitPostgres,
but there are various problems standing in the way of that, notably
that the relmapper thinks "active" mappings should be discarded at
transaction end.  The implication of this is that InitPostgres and
RestoreLibraryState will see the same catalog state as an independent
backend would see, which is probably fine; at least, it's been like
that all along.

Per report from Justin Pryzby.  There is a case to be made that
this should be back-patched.  But given the lack of complaints
before 6e086fa2e and the short amount of time remaining before
17.0 wraps, I'll just put it in HEAD for now.

Discussion: https://postgr.es/m/ZuoU_8EbSTE14o1U@pryzbyj2023
2024-09-19 20:58:21 -04:00
Michael Paquier
91287b5f5d psql: Add tests for repeated calls of \bind[_named]
The implementation assumes that on multiple calls of these meta-commands
the last one wins.  Multiple \g calls in-between mean multiple
executions.

There were no tests to check these properties, hence let's add
something.

Author: Jelte Fennema-Nio, Michael Paquier
Discussion: https://postgr.es/m/CAGECzQSTE7CoM=Gst56Xj8pOvjaPr09+7jjtWqTC40pGETyAuA@mail.gmail.com
2024-09-20 08:59:20 +09:00
Bruce Momjian
658fc6c6af doc PG relnotes: remove warning about commit links in PDF build
Make paragraph empty instead of removing it.

Discussion: https://postgr.es/m/2029579.1726779139@sss.pgh.pa.us

Backpatch-through: 12
2024-09-19 18:05:22 -04:00
Bruce Momjian
c6b1506f71 doc PG relnotes: document "Unresolved ID reference found" cause
Backpatch-through: 12
2024-09-19 12:01:59 -04:00
Bruce Momjian
25f8cf19ab doc PG relnotes: rename commit link paragraph for clarity
FYI, during PDF builds, this link type generates an "Unresolved ID
reference found" warning because it is suppressed from the PDF output.

Backpatch-through: 12
2024-09-19 09:47:22 -04:00
Bruce Momjian
f057781686 Improve Perl script which adds commit links to release notes
Reported-by: Andrew Dunstan

Discussion: https://postgr.es/m/b2465837-56df-4794-a0b5-5e6ed44ed870@dunslane.net

Author: Andrew Dunstan

Backpatch-through: 12
2024-09-19 08:45:33 -04:00
Alexander Korotkov
4c57facbb1 Add UpgradeTaskProcessCB to typedefs.list
While it doesn't directly influence indentation right now, add it for
uniformity.
2024-09-19 14:34:52 +03:00
Alexander Korotkov
a094e8b9e4 Fix order of includes in src/bin/pg_upgrade/info.c 2024-09-19 14:34:00 +03:00
Alexander Korotkov
014f9f34d2 Move pg_wal_replay_wait() to xlogfuncs.c
This commit moves pg_wal_replay_wait() procedure to be a neighbor of
WAL-related functions in xlogfuncs.c.  The implementation of LSN waiting
continues to reside in the same place.

By proposal from Michael Paquier.

Reported-by: Peter Eisentraut
Discussion: https://postgr.es/m/18c0fa64-0475-415e-a1bd-665d922c5201%40eisentraut.org
2024-09-19 14:26:11 +03:00
Michael Paquier
87eeadaea1 psql: Clean up more aggressively state of \bind[_named], \parse and \close
This fixes a couple of issues with the psql meta-commands mentioned
above when called repeatedly:
- The statement name is reset for each call.  If a command errors out,
its send_mode would still be set, causing an incorrect path to be taken
when processing a query.  For \bind_named, this could trigger an
assertion failure as a statement name is always expected for this
meta-command.  This issue has been introduced by d55322b0da60.
- The memory allocated for bind parameters can be leaked.  This is a bug
enlarged by d55322b0da60 that exists since 5b66de3433e2, as it is also
possible to leak memory with \bind in v16 and v17.  This requires a fix
that will be done on the affected branches separately.  This issue is
taken care of here for HEAD.

This patch tightens the cleanup of the state used for the extended
protocol meta-commands (bind parameters, send mode, statement name) by
doing it before running each meta-command on top of doing it once a
query has been processed, avoiding any leaks and the inconsistencies
when mixing calls, by refactoring the cleanup in a single routine used
in all the code paths where this step is required.

Reported-by: Alexander Lakhin
Author: Anthonin Bonnefoy
Discussion: https://postgr.es/m/2e5b89af-a351-ff0a-000c-037ac28314ab@gmail.com
2024-09-19 15:39:01 +09:00
Michael Paquier
d69a3f4d70 Introduce ATT_PARTITIONED_TABLE in tablecmds.c
Partitioned tables and normal tables have been relying on ATT_TABLE in
ATSimplePermissions() to produce error messages that depend on the
relation's relkind, because both relkinds currently support the same set
of ALTER TABLE subcommands.

A patch to restrict SET LOGGED/UNLOGGED for partitioned tables is under
discussion, and introducing ATT_PARTITIONED_TABLE makes subcommand
restrictions for partitioned tables easier to deal with, so let's add
one.  There is no functional change.

Author: Michael Paquier
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/Zt6cDnwSvnuLLnak@paquier.xyz
2024-09-19 12:22:56 +09:00
David Rowley
5d56d07ca3 Optimize tuplestore usage for WITH RECURSIVE CTEs
nodeRecursiveunion.c makes use of two tuplestores and, until now, would
delete and recreate one of these tuplestores after every recursive
iteration.

Here we adjust that behavior and instead reuse one of the existing
tuplestores and just empty it of all tuples using tuplestore_clear().
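
Roughly, the change has this shape (a sketch, not the literal
nodeRecursiveunion.c diff; tuplestore_end(), tuplestore_begin_heap() and
tuplestore_clear() are the real tuplestore.c entry points):

    /* Before: destroy and recreate the tuplestore on every iteration */
    tuplestore_end(node->working_table);
    node->working_table = tuplestore_begin_heap(false, false, work_mem);

    /* After: keep the tuplestore and just discard its tuples */
    tuplestore_clear(node->working_table);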

This saves some free/malloc roundtrips and has shown a 25-30% performance
improvement for queries that perform very little work between recursive
iterations.

This also paves the way to add some EXPLAIN ANALYZE telemetry output for
recursive common table expressions, similar to what was done in 1eff8279d
and 95d6e9af0.  Previously calling tuplestore_end() would have caused
the maximum storage space used to be lost.

Reviewed-by: Tatsuo Ishii
Discussion: https://postgr.es/m/CAApHDvr9yW0YRiK8A2J7nvyT8g17YzbSfOviEWrghazKZbHbig@mail.gmail.com
2024-09-19 15:20:35 +12:00
Bruce Momjian
8a6e85b46e doc PG relnotes: add paragraph explaining the section symbol
And suppress the symbol in print mode, where the section symbol does not
appear.

Discussion: https://postgr.es/m/ZuobILbmGGetxEg5@momjian.us

Backpatch-through: 12
2024-09-18 17:13:19 -04:00
Bruce Momjian
f986882ffd doc PG relnotes: no relnote footnotes for commit links in PDF
In print output, there are too many commit links for footnotes in the
release notes to be useful.

Reported-by: Tom Lane

Discussion: https://postgr.es/m/1709858.1726618961@sss.pgh.pa.us

Backpatch-through: 12
2024-09-18 16:34:52 -04:00
Nathan Bossart
b52c4fc3c0 Add TOAST table to pg_index.
This change allows pg_index rows to use out-of-line storage for the
"indexprs" and "indpred" columns, which enables use-cases such as
very large index expressions.

This system catalog was previously not given a TOAST table due to a
fear of circularity issues (see commit 96cdeae07f).  Testing has
not revealed any such problems, and it seems unlikely that the
entries for system indexes could ever need out-of-line storage.  In
any case, it is still early in the v18 development cycle, so
committing this now will hopefully increase the chances of finding
any unexpected problems prior to release.

Bumps catversion.

Reported-by: Jonathan Katz
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/b611015f-b423-458c-aa2d-be0e655cc1b4%40postgresql.org
2024-09-18 14:42:57 -05:00
Fujii Masao
a7c39db5eb docs: Improve the description of num_timed column in pg_stat_checkpointer.
The previous documentation stated that num_timed reflects the number of
scheduled checkpoints performed. However, checkpoints may be skipped
if the server has been idle, and num_timed counts both skipped and completed
checkpoints. This commit clarifies the description to make it clear that
the counter includes both skipped and completed checkpoints.

Back-patch to v17 where pg_stat_checkpointer was added.

Author: Fujii Masao
Reviewed-by: Alexander Korotkov
Discussion: https://postgr.es/m/9ea77f40-818d-4841-9dee-158ac8f6e690@oss.nttdata.com
2024-09-19 02:14:10 +09:00
Michael Paquier
24f5205948 Add some sanity checks in executor for query ID reporting
This commit adds three sanity checks in code paths of the executor where
it is possible to use hooks, checking that a query ID is reported in
pg_stat_activity if compute_query_id is enabled:
- ExecutorRun()
- ExecutorFinish()
- ExecutorEnd()
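
A hedged sketch of the kind of check added in those paths (the real
assertion may be conditioned differently):

    /* Sketch: if query ID computation is enabled, one must have been reported. */
    if (IsQueryIdEnabled())
        Assert(pgstat_get_my_query_id() != 0);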

This causes the test in pg_stat_statements added in 933848d16dc9 to
complain immediately in ExecutorRun().  The idea behind this commit is
to help extensions to detect if they are missing query ID reports when a
query goes through the executor.  Perhaps this will prove to be a bad
idea, but let's see where this experience goes in v18 and newer
versions.

Reviewed-by: Sami Imseih
Discussion: https://postgr.es/m/ZuJb5xCKHH0A9tMN@paquier.xyz
2024-09-18 14:43:37 +09:00
Fujii Masao
4f08ab5545 postgres_fdw: Extend postgres_fdw_get_connections to return user name.
This commit adds a "user_name" output column to
the postgres_fdw_get_connections function, returning the name
of the local user mapped to the foreign server for each connection.
If a public mapping is used, it returns "public."

This helps identify postgres_fdw connections more easily,
such as determining which connections are invalid, closed,
or used within the current transaction.

No extension version bump is needed, as commit c297a47c5f
already handled it for v18~.

Author: Hayato Kuroda
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/b492a935-6c7e-8c08-e485-3c1d64d7d10f@oss.nttdata.com
2024-09-18 12:51:48 +09:00
Michael Paquier
b14e9ce7d5 Extend PgStat_HashKey.objid from 4 to 8 bytes
This opens the possibility to define keys for more types of statistics
kinds in PgStat_HashKey, the first case being 8-byte query IDs for
statistics like pg_stat_statements.

This increases the size of PgStat_HashKey from 12 to 16 bytes, while
PgStatShared_HashEntry, entry stored in the dshash for pgstats, keeps
the same size due to alignment.

xl_xact_stats_item, that tracks the stats items to drop in commit WAL
records, is increased from 12 to 16 bytes.  Note that individual chunks
in commit WAL records should be multiples of sizeof(int), hence 8-byte
object IDs are stored as two uint32, based on a suggestion from Heikki
Linnakangas.
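
For example, splitting and recombining an 8-byte ID as two uint32 values
can be done like this (illustrative sketch, not the actual WAL record
code):

    #include <stdint.h>

    /* Store a 64-bit object ID as two 32-bit chunks, and read it back. */
    static void
    split_objid(uint64_t objid, uint32_t *hi, uint32_t *lo)
    {
        *hi = (uint32_t) (objid >> 32);
        *lo = (uint32_t) objid;
    }

    static uint64_t
    join_objid(uint32_t hi, uint32_t lo)
    {
        return ((uint64_t) hi << 32) | lo;
    }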

While on it, the field of PgStat_HashKey is renamed from "objoid" to
"objid", as for some stats kinds this field does not refer to OIDs but
just IDs, like for replication slot stats.

This commit bumps the following format variables:
- PGSTAT_FILE_FORMAT_ID, as PgStat_HashKey is written to the stats file
for non-serialized stats kinds in the dshash table.
- XLOG_PAGE_MAGIC for the changes in xl_xact_stats_item.
- Catalog version, for the SQL function pg_stat_have_stats().

Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/ZsvTS9EW79Up8I62@paquier.xyz
2024-09-18 12:44:15 +09:00
Noah Misch
ac04aa84a7 Don't enter parallel mode when holding interrupts.
Doing so caused the leader to hang in wait_event=ParallelFinish, which
required an immediate shutdown to resolve.  Back-patch to v12 (all
supported versions).

Francesco Degrassi

Discussion: https://postgr.es/m/CAC-SaSzHUKT=vZJ8MPxYdC_URPfax+yoA1hKTcF4ROz_Q6z0_Q@mail.gmail.com
2024-09-17 19:53:11 -07:00
Michael Paquier
933848d16d Add missing query ID reporting in extended query protocol
This commit adds query ID reports for two code paths when processing
extended query protocol messages:
- When receiving a bind message, setting it to the first Query retrieved
from the cached plan.
- When receiving an execute message, setting it to the first PlannedStmt
stored in a portal.

An advantage of this method is that it covers all the types of portals
handled in the extended query protocol, particularly these two cases
where the report done in ExecutorStart() is not enough (nor would an
addition in ExecutorRun() be, for the second case):
- Multiple execute messages, with multiple ExecutorRun().
- Portal with execute/fetch messages, like a query with a RETURNING
clause and a fetch size that stores the tuples in a first execute
message going through ExecutorStart() and ExecutorRun(), followed by one
or more execute messages doing only fetches from the tuplestore created
in the first message.  This corresponds to the case where
execute_is_fetch is set, for example.

Note that the query ID reporting done in ExecutorStart() is still
necessary, as an EXECUTE requires it.  Query ID reporting is optimistic
and more calls to pgstat_report_query_id() don't matter as the first
report takes priority except if the report is forced.  The comment in
ExecutorStart() is adjusted to reflect better the reality with the
extended query protocol.
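
As a rough sketch, the added reports boil down to calls of this shape
(the placement inside the bind/execute message handlers is simplified
here):

    /* Bind message: report the query ID of the first Query from the cached plan. */
    if (query_list != NIL)
        pgstat_report_query_id(linitial_node(Query, query_list)->queryId, false);

    /* Execute message: use the first PlannedStmt stored in the portal. */
    if (portal->stmts != NIL)
        pgstat_report_query_id(linitial_node(PlannedStmt, portal->stmts)->queryId, false);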

The test added in pg_stat_statements is a courtesy of Robert Haas.  This
uses psql's \bind metacommand, hence this part is backpatched down to
v16.

Reported-by:  Kaido Vaikla, Erik Wienhold
Author: Sami Imseih
Reviewed-by: Jian He, Andrei Lepikhov, Michael Paquier
Discussion: https://postgr.es/m/CA+427g8DiW3aZ6pOpVgkPbqK97ouBdf18VLiHFesea2jUk3XoQ@mail.gmail.com
Discussion: https://postgr.es/m/CA+TgmoZxtnf_jZ=VqBSyaU8hfUkkwoJCJ6ufy4LGpXaunKrjrg@mail.gmail.com
Discussion: https://postgr.es/m/1391613709.939460.1684777418070@office.mailbox.org
Backpatch-through: 14
2024-09-18 09:59:09 +09:00
Thomas Munro
70d38e3d8a Allow ReadStream to be consumed as raw block numbers.
Commits 041b9680 and 6377e12a changed the interface of
scan_analyze_next_block() to take a ReadStream instead of a BlockNumber
and a BufferAccessStrategy, and to return a value to indicate when the
stream has run out of blocks.

This caused integration problems for at least one known extension that
uses specially encoded BlockNumber values that map to different
underlying storage, because acquire_sample_rows() sets up the stream so
that read_stream_next_buffer() reads blocks from the main fork of the
relation's SMgrRelation.

Provide read_stream_next_block(), as a way for such an extension to
access the stream of raw BlockNumbers directly and forward them to its
own ReadBuffer() calls after decoding, as it could in earlier releases.
The new function returns the BlockNumber and BufferAccessStrategy that
were previously passed directly to scan_analyze_next_block().
Alternatively, an extension could wrap the stream of BlockNumbers in
another ReadStream with a callback that performs any decoding required
to arrive at real storage manager BlockNumber values, so that it could
benefit from the I/O combining and concurrency provided by
read_stream.c.
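
For instance, an extension's sampling loop could look roughly like this
(decode_block(), onerel and the strategy handling are hypothetical):

    BlockNumber           blkno;
    BufferAccessStrategy  bas;

    while ((blkno = read_stream_next_block(stream, &bas)) != InvalidBlockNumber)
    {
        /* map the specially encoded block number to real storage (hypothetical) */
        Buffer  buf = ReadBufferExtended(onerel, MAIN_FORKNUM, decode_block(blkno),
                                         RBM_NORMAL, bas);

        /* ... sample rows from the buffer, as in earlier releases ... */
        ReleaseBuffer(buf);
    }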

Another class of table access method that does nothing in
scan_analyze_next_block() because it is not block-oriented could use
this function to control the number of block sampling loops.  It could
match the previous behavior with "return read_stream_next_block(stream,
&bas) != InvalidBlockNumber".

Ongoing work is expected to provide better ANALYZE support for table
access methods that don't behave like heapam with respect to storage
blocks, but that will be for future releases.

Back-patch to 17.

Reported-by: Mats Kindahl <mats@timescale.com>
Reviewed-by: Mats Kindahl <mats@timescale.com>
Discussion: https://postgr.es/m/CA%2B14425%2BCcm07ocG97Fp%2BFrD9xUXqmBKFvecp0p%2BgV2YYR258Q%40mail.gmail.com
2024-09-18 11:34:28 +12:00
Tom Lane
918e21d251 Repair pg_upgrade for identity sequences with non-default persistence.
Since we introduced unlogged sequences in v15, identity sequences
have defaulted to having the same persistence as their owning table.
However, it is possible to change that with ALTER SEQUENCE, and
pg_dump tries to preserve the logged-ness of sequences when it doesn't
match (as indeed it wouldn't for an unlogged table from before v15).

The fly in the ointment is that ALTER SEQUENCE SET [UN]LOGGED fails
in binary-upgrade mode, because it needs to assign a new relfilenode
which we cannot permit in that mode.  Thus, trying to pg_upgrade a
database containing a mismatching identity sequence failed.

To fix, add syntax to ADD/ALTER COLUMN GENERATED AS IDENTITY to allow
the sequence's persistence to be set correctly at creation, and use
that instead of ALTER SEQUENCE SET [UN]LOGGED in pg_dump.  (I tried to
make SET [UN]LOGGED work without any pg_dump modifications, but that
seems too fragile to be a desirable answer.  This way should be
markedly faster anyhow.)

In passing, document the previously-undocumented SEQUENCE NAME option
that pg_dump also relies on for identity sequences; I see no value
in trying to pretend it doesn't exist.

Per bug #18618 from Anthony Hsu.
Back-patch to v15 where we invented this stuff.

Discussion: https://postgr.es/m/18618-d4eb26d669ed110a@postgresql.org
2024-09-17 15:53:35 -04:00
Alexander Korotkov
2520226c95 Ensure standby promotion point in 043_wal_replay_wait.pl
This commit ensures standby will be promoted at least at the primary insert
LSN we have just observed.  We use pg_switch_wal() to force the insert LSN
to be written then wait for standby to catchup.

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/1d7b08f2-64a2-77fb-c666-c9a74c68eeda%40gmail.com
2024-09-17 22:51:06 +03:00
Alexander Korotkov
85b98b8d5a Minor cleanup related to pg_wal_replay_wait() procedure
* Rename $node_standby1 to $node_standby in 043_wal_replay_wait.pl as there
  is only one standby.
* Remove useless debug printing in 043_wal_replay_wait.pl.
* Fix typo in one check description in 043_wal_replay_wait.pl.
* Fix some wording in comments and documentation.

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/1d7b08f2-64a2-77fb-c666-c9a74c68eeda%40gmail.com
Reviewed-by: Alexander Lakhin
2024-09-17 22:50:43 +03:00
Peter Geoghegan
d8adfc18be Avoid parallel nbtree index scan hangs with SAOPs.
Commit 5bf748b8, which enhanced nbtree ScalarArrayOp execution, made
parallel index scans work with the new design for arrays via explicit
scheduling of primitive index scans.  A backend that successfully
scheduled the scan's next primitive index scan saved its backend local
array keys in shared memory.  Any backend could pick up the scheduled
primitive scan within _bt_first.  This scheme decouples scheduling a
primitive scan from starting the scan (by performing another descent of
the index via a _bt_search call from _bt_first) to make things robust.

The scheme had a deadlock hazard, at least when the leader process
participated in the scan.  _bt_parallel_seize had a code path that made
backends that were not in an immediate position to start a scheduled
primitive index scan wait for some other backend to do so instead.
Under the right circumstances, the leader process could wait here
forever: the leader would wait for any other backend to start the
primitive scan, while every worker was busy waiting on the leader to
consume tuples from the scan's tuple queue.

To fix, don't wait for a scheduled primitive index scan to be started by
some other eligible backend from within _bt_parallel_seize (when the
calling backend isn't in a position to do so itself).  Return false
instead, while recording that the scan has a scheduled primitive index
scan in backend local state.  This leaves the backend in the same state
as the existing case where a backend schedules (or tries to schedule)
another primitive index scan from within _bt_advance_array_keys, before
calling _bt_parallel_seize.  _bt_parallel_seize already handles that
case by returning false without waiting, and without unsetting the
backend local state.  Leaving the backend in this state enables it to
start a previously scheduled primitive index scan once it gets back to
_bt_first.

Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp
execution.

Matthias van de Meent, with tweaks by me.

Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reported-By: Tomas Vondra <tomas@vondra.me>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzmMGaPa32u9x_FvEbPTUkP5e95i=QxR8054nvCRydP-sw@mail.gmail.com
Backpatch: 17-, where nbtree SAOP execution was enhanced.
2024-09-17 11:10:35 -04:00
Peter Eisentraut
89f908a6d0 Add temporal FOREIGN KEY constraints
Add PERIOD clause to foreign key constraint definitions.  This is
supported for range and multirange types.  Temporal foreign keys check
for range containment instead of equality.

This feature matches the behavior of the SQL standard temporal foreign
keys, but it works on PostgreSQL's native ranges instead of SQL's
"periods", which don't exist in PostgreSQL (yet).

Reference actions ON {UPDATE,DELETE} {CASCADE,SET NULL,SET DEFAULT}
are not supported yet.

(previously committed as 34768ee3616, reverted by 8aee330af55; this is
essentially unchanged from those)

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-09-17 11:29:30 +02:00
Peter Eisentraut
fc0438b4e8 Add temporal PRIMARY KEY and UNIQUE constraints
Add WITHOUT OVERLAPS clause to PRIMARY KEY and UNIQUE constraints.
These are backed by GiST indexes instead of B-tree indexes, since they
are essentially exclusion constraints with = for the scalar parts of
the key and && for the temporal part.

(previously committed as 46a0cd4cefb, reverted by 8aee330af55; the new
part is this:)

Because 'empty' && 'empty' is false, the temporal PK/UQ constraint
allowed duplicates, which is confusing to users and breaks internal
expectations.  For instance, when GROUP BY checks functional
dependencies on the PK, it allows selecting other columns from the
table, but in the presence of duplicate keys you could get the value
from any of their rows.  So we need to forbid empties.

This all means that at the moment we can only support ranges and
multiranges for temporal PK/UQs, unlike the original patch (above).
Documentation and tests for this are added.  But this could
conceivably be extended by introducing some more general support for
the notion of "empty" for other types.

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-09-17 11:29:30 +02:00
Peter Eisentraut
7406ab623f Add stratnum GiST support function
This is support function 12 for the GiST AM and translates
"well-known" RT*StrategyNumber values into whatever strategy number is
used by the opclass (since no particular numbers are actually
required).  We will use this to support temporal PRIMARY
KEY/UNIQUE/FOREIGN KEY/FOR PORTION OF functionality.

This commit adds two implementations, one for internal GiST opclasses
(just an identity function) and another for btree_gist opclasses.  It
updates btree_gist from 1.7 to 1.8, adding the support function for
all its opclasses.
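
For the built-in GiST opclasses the translation is just the identity; a
hedged sketch of such a support function (the exact function name and
catalog wiring may differ):

    Datum
    gist_stratnum_identity(PG_FUNCTION_ARGS)
    {
        StrategyNumber strat = PG_GETARG_UINT16(0);

        /* built-in opclasses already use the "well-known" RT* numbers */
        PG_RETURN_UINT16(strat);
    }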

(previously committed as 6db4598fcb8, reverted by 8aee330af55; this is
essentially unchanged from those)

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-09-17 11:29:29 +02:00
Tatsuo Ishii
95d6e9af07 Add memory/disk usage for Window aggregate nodes in EXPLAIN.
This commit is similar to 1eff8279d and expands the idea to Window
aggregate nodes so that users can know how much memory or disk the
tuplestore used.

This commit uses newly introduced tuplestore_get_stats() to inquire this
information and add some additional output in EXPLAIN ANALYZE to
display the information for the Window aggregate node.

Reviewed-by: David Rowley, Ashutosh Bapat, Maxim Orlov, Jian He
Discussion: https://postgr.es/m/20240706.202254.89740021795421286.ishii%40postgresql.org
2024-09-17 14:38:53 +09:00
Nathan Bossart
1bbf1e2f1a Fix redefinition of typedef.
Per buildfarm members sifaka and longfin, clang with
-Wtypedef-redefinition warns of a duplicate typedef unless building
with C11.

Oversight in commit 40e2e5e92b.
2024-09-16 16:33:50 -05:00
Nathan Bossart
c880cf2588 pg_upgrade: Parallelize encoding conversion check.
This commit makes use of the new task framework in pg_upgrade to
parallelize the check for incompatible user-defined encoding
conversions, i.e., those defined on servers older than v14.  This
step will now process multiple databases concurrently when
pg_upgrade's --jobs option is provided a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Nathan Bossart
f93f5f7b98 pg_upgrade: Parallelize WITH OIDS check.
This commit makes use of the new task framework in pg_upgrade to
parallelize the check for tables declared WITH OIDS.  This step
will now process multiple databases concurrently when pg_upgrade's
--jobs option is provided a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Nathan Bossart
cf2f82a37c pg_upgrade: Parallelize incompatible polymorphics check.
This commit makes use of the new task framework in pg_upgrade to
parallelize the check for usage of incompatible polymorphic
functions, i.e., those with arguments of type anyarray/anyelement
rather than the newer anycompatible variants.  This step will now
process multiple databases concurrently when pg_upgrade's --jobs
option is provided a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Nathan Bossart
c34eabfbbf pg_upgrade: Parallelize postfix operator check.
This commit makes use of the new task framework in pg_upgrade to
parallelize the check for user-defined postfix operators.  This
step will now process multiple databases concurrently when
pg_upgrade's --jobs option is provided a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Nathan Bossart
9db3018cf8 pg_upgrade: Parallelize contrib/isn check.
This commit makes use of the new task framework in pg_upgrade to
parallelize the check for contrib/isn functions that rely on the
bigint data type.  This step will now process multiple databases
concurrently when pg_upgrade's --jobs option is provided a value
greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Nathan Bossart
bbf83cab98 pg_upgrade: Parallelize data type checks.
This commit makes use of the new task framework in pg_upgrade to
parallelize the checks for incompatible data types, i.e., data
types whose on-disk format has changed, data types that have been
removed, etc.  This step will now process multiple databases
concurrently when pg_upgrade's --jobs option is provided a value
greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Nathan Bossart
6ab8f27bc7 pg_upgrade: Parallelize retrieving extension updates.
This commit makes use of the new task framework in pg_upgrade to
parallelize retrieving the set of extensions that should be updated
with the ALTER EXTENSION command after upgrade.  This step will now
process multiple databases concurrently when pg_upgrade's --jobs
option is provided a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Nathan Bossart
46cad8b319 pg_upgrade: Parallelize retrieving loadable libraries.
This commit makes use of the new task framework in pg_upgrade to
parallelize retrieving the names of all libraries referenced by
non-built-in C functions.  This step will now process multiple
databases concurrently when pg_upgrade's --jobs option is provided
a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Nathan Bossart
7baa36de58 pg_upgrade: Parallelize subscription check.
This commit makes use of the new task framework in pg_upgrade to
parallelize the part of check_old_cluster_subscription_state() that
verifies each of the subscribed tables is in the 'i' (initialize)
or 'r' (ready) state.  This check will now process multiple
databases concurrently when pg_upgrade's --jobs option is provided
a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Nathan Bossart
6d3d2e8e54 pg_upgrade: Parallelize retrieving relation information.
This commit makes use of the new task framework in pg_upgrade to
parallelize retrieving relation and logical slot information.  This
step will now process multiple databases concurrently when
pg_upgrade's --jobs option is provided a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Nathan Bossart
40e2e5e92b Introduce framework for parallelizing various pg_upgrade tasks.
A number of pg_upgrade steps require connecting to every database
in the cluster and running the same query in each one.  When there
are many databases, these steps are particularly time-consuming,
especially since they are performed sequentially, i.e., we connect
to a database, run the query, and process the results before moving
on to the next database.

This commit introduces a new framework that makes it easy to
parallelize most of these once-in-each-database tasks by processing
multiple databases concurrently.  This framework manages a set of
slots that follow a simple state machine, and it uses libpq's
asynchronous APIs to establish the connections and run the queries.
The --jobs option is used to determine the number of slots to use.
To use this new task framework, callers simply need to provide the
query and a callback function to process its results, and the
framework takes care of the rest.  A more complete description is
provided at the top of the new task.c file.
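
A hedged sketch of how a caller might drive the framework; the
UpgradeTaskProcessCB callback type appears elsewhere in this history,
but the helper names and signatures below are assumptions, not verified
API:

    /* Assumed callback shape: consume one database's query result. */
    static void
    process_rel_counts(DbInfo *dbinfo, PGresult *res, void *arg)
    {
        /* inspect PQntuples(res) / PQgetvalue(res, ...) for this database */
    }

    /* Assumed driver usage: one query fanned out across all databases. */
    static void
    check_rel_counts(ClusterInfo *cluster)
    {
        UpgradeTask *task = upgrade_task_create();

        upgrade_task_add_step(task,
                              "SELECT count(*) FROM pg_catalog.pg_class",
                              process_rel_counts,
                              true,   /* framework frees the PGresult */
                              NULL);  /* no extra callback argument */
        upgrade_task_run(task, cluster);
        upgrade_task_free(task);
    }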

None of the eligible once-in-each-database tasks are converted to
use this new framework in this commit.  That will be done via
several follow-up commits.

Reviewed-by: Jeff Davis, Robert Haas, Daniel Gustafsson, Ilya Gladyshev, Corey Huinker
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Bruce Momjian
d891c49286 doc PG relnotes: fix SGML markup for new commit links
Backpatch-through: 12
2024-09-16 14:23:39 -04:00
Bruce Momjian
2572104552 scripts: add Perl script to add links to release notes
Reported-by: jian he

Discussion: https://postgr.es/m/ZuYsS5XdA7hVcV9l@momjian.us

Backpatch-through: 12
2024-09-16 13:26:37 -04:00
Bruce Momjian
4632e5cf4b Perl scripts: revert 43ce181059d
Small improvement not worth the code churn.

Reported-by: Andrew Dunstan

Discussion: https://postgr.es/m/42f2242a-422b-4aa3-8d60-d67b229c4a52@dunslane.net

Backpatch-through: master
2024-09-15 21:25:24 -04:00
Tom Lane
d5622acb32 Replace usages of xmlXPathCompile() with xmlXPathCtxtCompile().
In existing releases of libxml2, xmlXPathCompile can be driven
to stack overflow because it fails to protect itself against
too-deeply-nested input.  While there is an upstream fix as of
yesterday, it will take years for that to propagate into all
shipping versions.  In the meantime, we can protect our own
usages basically for free by calling xmlXPathCtxtCompile instead.
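
Concretely, the substitution is of this shape (xmlXPathCompile() and
xmlXPathCtxtCompile() are the real libxml2 entry points; the wrapper is
illustrative):

    #include <libxml/xpath.h>

    static xmlXPathCompExprPtr
    compile_xpath(xmlXPathContextPtr xpathctx, const xmlChar *expr)
    {
        /* was: xmlXPathCompile(expr) -- no context, so no nesting-depth check */
        return xmlXPathCtxtCompile(xpathctx, expr);
    }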

(The actual bug is that libxml2 keeps its nesting counter in the
xmlXPathContext, and its parsing code was willing to just skip
counting nesting levels if it didn't have a context.  So if we supply
a context, all is well.  It seems odd actually that it works at all
to not supply a context, because this means that XPath parsing does
not have access to XML namespace info.  Apparently libxml2 never
checks namespaces until runtime?  Anyway, this seems like good
future-proofing even if its only immediate effect is to dodge a bug.)

Sadly, this hack only offers protection with libxml2 2.9.11 and newer.
Before that there are multiple similar problems, so if you are
processing untrusted XML it behooves you to get a newer version.
But we have some pretty old libxml2 in the buildfarm, so it seems
impractical to add a regression test to verify this fix.

Per bug #18617 from Jingzhou Fu.  Back-patch to all supported
versions.

Discussion: https://postgr.es/m/18617-1cee4d2ed1f4e7ae@postgresql.org
Discussion: https://gitlab.gnome.org/GNOME/libxml2/-/issues/799
2024-09-15 13:33:09 -04:00
Bruce Momjian
43ce181059 Perl scripts: eliminate "Useless interpolation" warnings
Eliminate warnings of Perl Critic from src/tools.

Backpatch-through: master
2024-09-15 10:55:37 -04:00
Tom Lane
b8ea0f675f Run regression tests with timezone America/Los_Angeles.
Historically we've used timezone "PST8PDT", but the recent release
2024b of tzdb changes the definition of that zone in a way that
breaks many test cases concerned with dates before 1970.  Although
we've not yet adopted 2024b into our own tree, this is already
problematic for people using --with-system-tzdata if their platform
has already adopted 2024b.  To work with both older and newer
versions of tzdb, switch to using "America/Los_Angeles", accepting
the ensuing changes in regression test results.

Back-patch to all supported branches.

Per report and patch from Wolfgang Walther.

Discussion: https://postgr.es/m/0a997455-5aba-4cf2-a354-d26d8bcbfae6@technowledgy.de
2024-09-14 17:55:02 -04:00
Alvaro Herrera
f64074c88c Add commit 7229ebe011df to .git-blame-ignore-revs. 2024-09-14 20:17:30 +02:00
Tom Lane
94537982ec Remove obsolete comment in pg_stat_statements.
Commit 76db9cb63 removed the use of multiple nesting counters,
but missed one comment describing that arrangement.

Back-patch to v17 where 76db9cb63 came in, just to avoid confusion.

Julien Rouhaud

Discussion: https://postgr.es/m/gfcwh3zjxc2vygltapgo7g6yacdor5s4ynr234b6v2ohhuvt7m@gr42joxalenw
2024-09-14 11:42:31 -04:00
Andrew Dunstan
76f2a0e547 Improve meson's detection of perl build flags
The current method of detecting perl build flags breaks if the path to
perl contains a space. This change makes two improvements. First,
instead of getting a list of ldflags and ccdlflags and then trying to
filter those out of the reported ldopts, we tell perl to suppress
reporting those in the first instance. Second, it tells perl to parse
those and output them, one per line. Thus any space in an option value,
such as a space in a file name, is preserved.

Issue reported off-list by Muralikrishna Bandaru

Discussion: https://postgr.es/01117f88-f465-bf6c-9362-083bd72ca305@dunslane.net

Backpatch to release 16.
2024-09-14 10:26:25 -04:00
Andrew Dunstan
bc46104fc9 Only define NO_THREAD_SAFE_LOCALE for MSVC plperl when required
Latest versions of Strawberry Perl define USE_THREAD_SAFE_LOCALE, and we
therefore get a handshake error when building against such instances.
The solution is to perform a test to see if USE_THREAD_SAFE_LOCALE is
defined and only define NO_THREAD_SAFE_LOCALE if it isn't.

Backpatch the meson.build fix back to release 16 and apply the same
logic to Mkvcbuild.pm in releases 12 through 16.

Original report of the issue from Muralikrishna Bandaru.
2024-09-14 08:47:06 -04:00
Tom Lane
fae55f0bb3 Allow _h_indexbuild() to be interrupted.
When we are building a hash index that is large enough to need
pre-sorting (larger than either maintenance_work_mem or NBuffers),
the initial sorting phase is interruptible, but the insertion
phase wasn't.  Add the missing CHECK_FOR_INTERRUPTS().
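
A sketch of the shape of the fix (not the literal _h_indexbuild() code):

    while ((itup = tuplesort_getindextuple(sortstate, true)) != NULL)
    {
        CHECK_FOR_INTERRUPTS();   /* the previously missing check */

        /* ... insert itup into the hash index ... */
    }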

Per bug #18616 from Alexander Lakhin.  Back-patch to all
supported branches.

Pavel Borisov

Discussion: https://postgr.es/m/18616-acbb9e5caf41e964@postgresql.org
2024-09-13 16:17:04 -04:00
Nathan Bossart
9a23967063 Add commit 2b03cfeea4 to .git-blame-ignore-revs. 2024-09-13 13:06:06 -05:00
Nathan Bossart
70d1c664f4 Fix contrib/pageinspect's test for sequences.
I managed to break this test in two different ways in commit
05036a3155.

First, the output of the new call to tuple_data_split() on the test
sequence is dependent on endianness.  This is fixed by setting a
special start value for the test sequence that produces the same
output regardless of the endianness of the machine.

Second, on versions older than v15, the new test case fails under
"force_parallel_mode = regress" with the following error:

	ERROR:  cannot access temporary tables during a parallel operation

This is because pageinspect's disk-accessing functions are
incorrectly marked PARALLEL SAFE on versions older than v15 (see
commit aeaaf520f4 for details).  This one is fixed by changing the
test sequence to be permanent.  The only reason it was previously
marked temporary was to avoid needing a DROP SEQUENCE command at
the end of the test.  Unlike some other tests in this file, the use
of a permanent sequence here shouldn't result in any test
instability like what was fixed by commit e2933a6e11.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/ZuOKOut5hhDlf_bP%40nathan
Backpatch-through: 12
2024-09-13 10:16:40 -05:00
Peter Eisentraut
433d8f40e9 Remove separate locale_is_c arguments
Since e9931bfb751, ctype_is_c is part of pg_locale_t.  Some functions
passed a pg_locale_t and a bool argument separately.  This can now be
combined into one argument.

Since some callers call MatchText() with locale 0, it is a bit
confusing whether this is all correct.  But it is the case that only
callers that pass a non-zero locale object to MatchText() end up
checking locale->ctype_is_c.  To make that flow a bit more
understandable, add the locale argument to MATCH_LOWER() and GETCHAR()
in like_match.c, instead of implicitly taking it from the outer scope.

Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://www.postgresql.org/message-id/84d415fc-6780-419e-b16c-61a0ca819e2b@eisentraut.org
2024-09-13 16:10:52 +02:00
Amit Langote
2b67bdca52 SQL/JSON: Update example in JSON_QUERY() documentation
Commit e6c45d85dc fixed the behavior of JSON_QUERY() when WITH
CONDITIONAL WRAPPER is used, but the documentation example wasn't
updated to reflect this change. This commit updates the example to
show the correct result.

Per off-list report from Andreas Ulbrich.

Backpatch-through: 17
2024-09-13 16:10:14 +09:00
Amit Kapila
4d8489f4f1 Prohibit altering invalidated replication slots.
ALTER_REPLICATION_SLOT for invalid replication slots should not be allowed
because there is no way to get back the invalidated (logical) slot to
work.

Author: Bharath Rupireddy
Reviewed-by: Peter Smith, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4fSOMiKjQ3=2NVBMTZRTG8Ujg6jsK9z3EvOtvA4vzKQ@mail.gmail.com
2024-09-13 09:31:23 +05:30
Michael Paquier
7b1ddbae36 pg_stat_statements: Add tests with extended query protocol
There are currently no tests in the tree checking that queries using the
extended query protocol are able to map with their query ID.

This can be achieved for some paths of the extended query protocol with
the psql meta-commands \bind or \bind_named, so let's add some tests
based on both.

I have found that to be a useful addition while working on a different
issue.

Discussion: https://postgr.es/m/ZuEt6MOEBSlifBfn@paquier.xyz
2024-09-13 09:41:06 +09:00
Nathan Bossart
05036a3155 Reintroduce support for sequences in pgstattuple and pageinspect.
Commit 4b82664156 restricted a number of functions provided by
contrib modules to only relations that use the "heap" table access
method.  Sequences always use this table access method, but they do
not advertise as such in the pg_class system catalog, so the
aforementioned commit also (presumably unintentionally) removed
support for sequences from some of these functions.  This commit
reintroduces said support for sequences to these functions and adds
a couple of relevant tests.

Co-authored-by: Ayush Vatsa
Reviewed-by: Robert Haas, Michael Paquier, Matthias van de Meent
Discussion: https://postgr.es/m/CACX%2BKaP3i%2Bi9tdPLjF5JCHVv93xobEdcd_eB%2B638VDvZ3i%3DcQA%40mail.gmail.com
Backpatch-through: 12
2024-09-12 16:31:29 -05:00
Jeff Davis
b0c30612c5 Simplify checks for deterministic collations.
Remove redundant checks for locale->collate_is_c now that we always
have a valid pg_locale_t.

Also, remove pg_locale_deterministic() wrapper, which is no longer
useful after commit e9931bfb75. Just check the field directly,
consistent with other fields in pg_locale_t.

Author: Andreas Karlsson
Discussion: https://postgr.es/m/60929555-4709-40a7-b136-bcb44cff5a3c@proxel.se
2024-09-12 13:35:56 -07:00
Jeff Davis
6a9fc11033 Remove redundant check for default collation.
The operative check is for a deterministic collation, so the check for
DEFAULT_COLLATION is redundant. Furthermore, it will be wrong if we
ever support a non-deterministic default collation.

Author: Andreas Karlsson
Discussion: https://postgr.es/m/60929555-4709-40a7-b136-bcb44cff5a3c@proxel.se
2024-09-12 13:35:49 -07:00
Tom Lane
cb599b9ddf Make jsonpath .string() be immutable for datetimes.
Discussion of commit ed055d249 revealed that we don't actually
want jsonpath's .string() method to depend on DateStyle, nor
TimeZone either, because the non-"_tz" jsonpath functions are
supposed to be immutable.  Potentially we could allow a TimeZone
dependency in the "_tz" variants, but it seems better to just
uniformly define this method as returning the same string that
jsonb text output would do.  That's easier to implement too,
saving a couple dozen lines.

Patch by me, per complaint from Peter Eisentraut.  Back-patch
to v17 where this feature came in (in 66ea94e8e).  Also
back-patch ed055d249 to provide test cases.

Discussion: https://postgr.es/m/5e8879d0-a3c8-4be2-950f-d83aa2af953a@eisentraut.org
2024-09-12 14:30:29 -04:00
Fujii Masao
4eada203a5 Add has_largeobject_privilege function.
This function checks whether a user has specific privileges on a large object,
identified by OID. The user can be specified by name or OID, or
defaults to the current user. If the specified large object doesn't exist,
the function returns NULL. It raises an error for a non-existent user name.
This behavior is basically consistent with other privilege inquiry functions
like has_table_privilege.

Bump catalog version.

Author: Yugo Nagata
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/20240702163444.ab586f6075e502eb84f11b1a@sranhm.sraoss.co.jp
2024-09-12 21:51:26 +09:00
Fujii Masao
412229d197 Deduplicate code in LargeObjectExists and myLargeObjectExists.
myLargeObjectExists() and LargeObjectExists() had nearly identical code,
except for handling snapshots. This commit renames myLargeObjectExists()
to LargeObjectExistsWithSnapshot() and refactors LargeObjectExists()
to call it internally, reducing duplication.

Author: Yugo Nagata
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/20240702163444.ab586f6075e502eb84f11b1a@sranhm.sraoss.co.jp
2024-09-12 21:45:42 +09:00
Peter Eisentraut
23d0b48468 Remove hardcoded hash opclass function signature exceptions
hashvalidate(), which validates the signatures of support functions
for the hash AM, contained several hardcoded exceptions.  For example,
hash/date_ops support function 1 was hashint4(), which would
ordinarily fail validation because the function argument is int4, not
date.  But this works internally because int4 and date are of the same
size.  There are several more exceptions like this that happen to work
and were allowed historically but would now fail the function
signature validation.

This patch removes those exceptions by providing new support functions
that have the proper declared signatures.  They internally share most
of the code with the "wrong" functions they replace, so the behavior
is still the same.

With the exceptions gone, hashvalidate() is now simplified and relies
fully on check_amproc_signature().

hashvarlena() and hashvarlenaextended() are kept in pg_proc.dat
because some extensions currently use them to build hash functions for
their own types, and we need to keep exposing these functions as
"LANGUAGE internal" functions for that to continue to work.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/29c3b746-69e7-482a-b37c-dbbf7e5b009b@eisentraut.org
2024-09-12 12:57:43 +02:00
David Rowley
5bb9ba2739 Doc: alphabetize aggregate function table
A few recent JSON aggregates have been added without much consideration
to the existing order.  Put these back in alphabetical order (with the
exception of the JSONB variant of each JSON aggregate).

Author: Wolfgang Walther <walther@technowledgy.de>
Reviewed-by: Marlene Reiterer <marlene.reiterer.03@gmail.com>
Discussion: https://postgr.es/m/6a7b910c-3feb-4006-b817-9b4759cb6bb6%40technowledgy.de
Backpatch-through: 16, where these aggregates were added
2024-09-12 22:36:39 +12:00
Fujii Masao
fefa76f70f Remove old RULE privilege completely.
The RULE privilege for tables was removed in v8.2, but for backward
compatibility, GRANT/REVOKE and privilege functions like
has_table_privilege continued to accept the RULE keyword without
any effect.

After discussions on pgsql-hackers, it was agreed that this compatibility
is no longer needed. Since it's been long enough since the deprecation,
we've decided to fully remove support for RULE privilege,
so GRANT/REVOKE and privilege functions will no longer accept it.

Author: Fujii Masao
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/976a3581-6939-457f-b947-fc3dc836c083@oss.nttdata.com
2024-09-12 19:33:44 +09:00
Peter Eisentraut
811af9786b Don't overwrite scan key in systable_beginscan()
When systable_beginscan() and systable_beginscan_ordered() choose an
index scan, they remap the attribute numbers in the passed-in scan
keys to the attribute numbers of the index, and then write those
remapped attribute numbers back into the scan key passed by the
caller.  This second part is surprising and gratuitous.  It means that
a scan key cannot safely be used more than once (but it might
sometimes work, depending on circumstances).  Also, there is no value
in providing these remapped attribute numbers back to the caller,
since they can't do anything with that.

Fix that by making a copy of the scan keys passed by the caller and
making the modifications there.
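
A minimal sketch of the approach (map_attno_to_index() is a hypothetical
stand-in for the existing remapping logic; key and nkeys mirror the
caller-supplied arguments):

    /* Remap attribute numbers on a local copy, leaving the caller's keys alone. */
    ScanKey  idxkey = palloc(nkeys * sizeof(ScanKeyData));

    memcpy(idxkey, key, nkeys * sizeof(ScanKeyData));

    for (int i = 0; i < nkeys; i++)
        idxkey[i].sk_attno = map_attno_to_index(irel, idxkey[i].sk_attno);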

Also, some code that had to work around the previous situation is
simplified.

Discussion: https://www.postgresql.org/message-id/flat/f8c739d9-f48d-4187-b214-df3391ba41ab@eisentraut.org
2024-09-12 10:48:39 +02:00
Michael Paquier
00c76cf21c Move logic related to WAL replay of Heap/Heap2 into its own file
This brings more clarity to heapam.c, by cleanly separating all the
logic related to WAL replay and the rest of Heap and Heap2, similarly
to other RMGRs like hash, btree, etc.

The header reorganization is also nice in heapam.c, cutting half of the
headers required.

Author: Li Yong
Reviewed-by: Sutou Kouhei, Michael Paquier
Discussion: https://postgr.es/m/EFE55E65-D7BD-4C6A-B630-91F43FD0771B@ebay.com
2024-09-12 13:32:05 +09:00
David Rowley
9fba1ed294 Adjust tuplestore stats API
1eff8279d added an API to tuplestore.c to allow callers to obtain
storage telemetry data.  That API wasn't quite good enough for callers
that perform tuplestore_clear() as the telemetry functions only
accounted for the current state of the tuplestore, not the maximums
before tuplestore_clear() was called.

There's a pending patch that would like to add tuplestore telemetry
output to EXPLAIN ANALYZE for WindowAgg.  That node type uses
tuplestore_clear() before moving to the next window partition and we
want to show the maximum space used, not the space used for the final
partition.

Reviewed-by: Tatsuo Ishii, Ashutosh Bapat
Discussion: https://postgres/m/CAApHDvoY8cibGcicLV0fNh=9JVx9PANcWvhkdjBnDCc9Quqytg@mail.gmail.com
2024-09-12 16:02:01 +12:00
Amit Langote
e6c45d85dc SQL/JSON: Fix JSON_QUERY(... WITH CONDITIONAL WRAPPER)
Currently, when WITH CONDITIONAL WRAPPER is specified, array wrappers
are applied even to a single SQL/JSON item if it is a scalar JSON
value, but this behavior does not comply with the standard.

To fix, apply wrappers only when there are multiple SQL/JSON items
in the result.

Reported-by: Peter Eisentraut <peter@eisentraut.org>
Author: Peter Eisentraut <peter@eisentraut.org>
Author: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/8022e067-818b-45d3-8fab-6e0d94d03626%40eisentraut.org
Backpatch-through: 17
2024-09-12 09:39:56 +09:00
Tom Lane
77761ee5dd Remove incorrect Assert.
check_agglevels_and_constraints() asserted that if we find an
aggregate function in an EXPR_KIND_FROM_SUBSELECT expression, the
expression must be in a LATERAL subquery.  Alexander Lakhin found a
case where that's not so: because of the odd scoping rules for NEW/OLD
within a rule, a reference to NEW/OLD could cause an aggregate to be
considered top-level even though it's in an unmarked sub-select.
The error message that would be thrown seems sufficiently on-point,
so just remove the Assert.  (Hence, this is not a bug for production
builds.)

This Assert was added by me in commit eaccfded9 (9.3 era).  It looks
like I put it in to cross-check that the new logic for detecting
misplaced aggregates (using agglevelsup) caught the same cases that a
previous check on p_lateral_active did.  So there might have been some
related misbehavior before eaccfded9 ... but that's very ancient
history by now, so I didn't dig any deeper.

Per bug #18608 from Alexander Lakhin.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/18608-48de0717508ee429@postgresql.org
2024-09-11 11:41:47 -04:00
Magnus Hagander
280423300b pg_createsubscriber: minor documentation fixes 2024-09-11 16:30:17 +02:00
Peter Eisentraut
8b5c6a54c4 Replace gratuitous memmove() with memcpy()
The index access methods all had similar code that copied the
passed-in scan keys to local storage.  They all used memmove() for
that, which is not wrong, but it seems confusing not to use memcpy()
when that would work.  Presumably, this was all once copied from
ancient code and never adjusted.

Discussion: https://www.postgresql.org/message-id/flat/f8c739d9-f48d-4187-b214-df3391ba41ab@eisentraut.org
2024-09-11 15:21:36 +02:00
Tomas Vondra
842265631d Fix unique key checks in JSON object constructors
When building a JSON object, the code builds a hash table of keys, to
allow checking if the keys are unique. The uniqueness check and adding
the new key happens in json_unique_check_key(), but this assumes the
pointer to the key remains valid.

Unfortunately, two places passed pointers to keys in a buffer, while
also appending more data (additional key/value pairs) to the buffer.
With enough data the buffer is resized by enlargeStringInfo(), which
calls repalloc(), invalidating the earlier key pointers.
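
In sketch form, the hazard and the fix look like this (names simplified;
MemoryContextStrdup() is the real helper, the surrounding variables are
illustrative):

    /* "key" points into the StringInfo buffer that is still growing. */
    const char *key = &out->data[key_offset];

    appendStringInfoString(out, ", \"next\": 1");   /* may repalloc out->data */

    /* "key" can now dangle inside the uniqueness hash table.  The fix stores a
     * copy made in the hash table's own memory context instead: */
    key = MemoryContextStrdup(unique_check_cxt, key);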

Due to this the uniqueness check may fail with both false negatives and
false positives, producing JSON objects with duplicate keys or failing
to produce a perfectly valid JSON object.

This affects multiple functions that enforce uniqueness of keys, all
introduced in PG16 with the new SQL/JSON:

- json_object_agg_unique / jsonb_object_agg_unique
- json_object / jsonb_objectagg

Existing regression tests did not detect the issue, simply because the
initial buffer size is 1024 and the objects were small enough not to
require the repalloc.

With a sufficiently large object, AddressSanitizer reported the access
to invalid memory immediately. So would valgrind, of course.

Fixed by copying the key into the hash table memory context, and adding
regression tests with enough data to repalloc the buffer. Backpatch to
16, where the functions were introduced.

Reported by Alexander Lakhin. Investigation and initial fix by Junwang
Zhao, with various improvements and tests by me.

Reported-by: Alexander Lakhin
Author: Junwang Zhao, Tomas Vondra
Backpatch-through: 16
Discussion: https://postgr.es/m/18598-3279ed972a2347c7@postgresql.org
Discussion: https://postgr.es/m/CAEG8a3JjH0ReJF2_O7-8LuEbO69BxPhYeXs95_x7+H9AMWF1gw@mail.gmail.com
2024-09-11 13:21:10 +02:00
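
A hedged SQL sketch of the uniqueness check involved; the object sizes and
the exact failure mode before the fix are assumptions for illustration:

    SELECT json_object('k' VALUE 1, 'k' VALUE 2 WITH UNIQUE KEYS);
    -- must be rejected, since the key "k" is duplicated
    SELECT json_object('k' VALUE repeat('x', 2000), 'k' VALUE 2 WITH UNIQUE KEYS);
    -- before the fix, values large enough to grow the internal buffer past
    -- its initial 1024 bytes could make the duplicate-key check misbehave
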
Peter Eisentraut
6b25c57a2d Update .gitignore
for commit 0785d1b8b2
2024-09-11 09:26:20 +02:00
Peter Eisentraut
1fb2308e69 Remove obsolete unconstify()
This is no longer needed as of OpenSSL 1.1.0 (the current minimum
version).  LibreSSL made the same change around the same time as well.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/20463f79-a7b0-4bba-a178-d805f99c02f9%40eisentraut.org
2024-09-11 09:18:12 +02:00
Peter Eisentraut
0785d1b8b2 common/jsonapi: support libpq as a client
Based on a patch by Michael Paquier.

For libpq, use PQExpBuffer instead of StringInfo. This requires us to
track allocation failures so that we can return JSON_OUT_OF_MEMORY as
needed rather than exit()ing.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Co-authored-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/d1b467a78e0e36ed85a09adf979d04cf124a9d4b.camel@vmware.com
2024-09-11 09:01:07 +02:00
Amit Kapila
3beb945da9 Improve assertion in FindReplTupleInLocalRel().
The first part of the assertion, which verifies that the passed index must
be a PK or RI, was incorrectly passing the index relation instead of the
heap relation to GetRelationIdentityOrPK(). The assertion was not failing
because the second part of the assertion, which needs to be performed only
when the remote relation has REPLICA_IDENTITY_FULL set, was also incorrect.

The change is not backpatched because the current coding doesn't lead to
any failure.

Reported-by: Dilip Kumar
Author: Amit Kapila
Reviewed-by: Vignesh C
Discussion: https://postgr.es/m/CAFiTN-tmguaT1DXbCC+ZomZg-oZLmU6BPhr0po7akQSG6vNJrg@mail.gmail.com
2024-09-11 09:18:23 +05:30
Noah Misch
65c310b310 Optimize pg_visibility with read streams.
We've measured 5% performance improvement, and this arranges to benefit
automatically from future optimizations to the read_stream subsystem.
The area lacked test coverage, so close that gap.

Nazir Bilal Yavuz

Discussion: https://postgr.es/m/CAN55FZ1_Ru3XpMgTwsU67FTH2fs_FrRROmb7x6zs+F44QBEiww@mail.gmail.com
Discussion: https://postgr.es/m/CAEudQAozv3wTY5TV2t29JcwPydbmKbiWQkZD42S2OgzdixPMDQ@mail.gmail.com
2024-09-10 15:21:33 -07:00
Tom Lane
52c707483c Use a hash table to de-duplicate column names in ruleutils.c.
Commit 8004953b5 added a hash table to avoid O(N^2) cost in choosing
unique relation aliases while deparsing a view or rule.  It did
nothing about the similar O(N^2) (maybe worse) costs of choosing
unique column aliases within each RTE.  However, that's now
demonstrably a bottleneck when deparsing CHECK constraints for wide
tables, so let's use a similar hash table to handle those.

The extra cost of setting up the hash table will not be repaid unless
the table has many columns.  I've set this up so that we use the brute
force method if there are less than 32 columns.  The exact cutoff is
not too critical, but this value seems good because it results in both
code paths getting exercised by existing regression-test cases.

Patch by me; thanks to David Rowley for review.

Discussion: https://postgr.es/m/2885468.1722291250@sss.pgh.pa.us
2024-09-10 16:49:09 -04:00
Tom Lane
bccca780ee Fix some whitespace issues in XMLSERIALIZE(... INDENT).
We must drop whitespace while parsing the input, else libxml2
will include "blank" nodes that interfere with the desired
indentation behavior.  The end result is that we didn't indent
nodes separated by whitespace.

Also, it seems that libxml2 may add a trailing newline when working
in DOCUMENT mode.  This is semantically insignificant, so strip it.

This is in the gray area between being a bug fix and a definition
change.  However, the INDENT option is still pretty new (since v16),
so I think we can get away with changing this in stable branches.
Hence, back-patch to v16.

Jim Jones

Discussion: https://postgr.es/m/872865a8-548b-48e1-bfcd-4e38e672c1e4@uni-muenster.de
2024-09-10 16:20:31 -04:00
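
A small SQL example of the whitespace handling described above
(illustrative input, not from the commit):

    SELECT xmlserialize(DOCUMENT '<root> <a>x</a> <b>y</b> </root>'::xml
                        AS text INDENT);
    -- nodes separated by whitespace are now indented, and any trailing
    -- newline added by libxml2 in DOCUMENT mode is stripped
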
Tom Lane
ed055d249d Improve documentation and testing of jsonpath string() for datetimes.
Point out that the output format depends on DateStyle, and test that,
along with testing some cases previously not covered.

In passing, adjust the horology test to verify that the prevailing
DateStyle is 'Postgres, MDY', much as it has long verified the
prevailing TimeZone.  We expect pg_regress to have set these up,
and there are multiple regression tests relying on these settings.

Also make the formatting of entries in table 9.50 more consistent.

David Wheeler (marginal additional hacking by me); review by jian he

Discussion: https://postgr.es/m/56955B33-6959-4FDA-A459-F00363ECDFEE@justatheory.com
2024-09-10 14:48:13 -04:00
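
A short SQL illustration of the DateStyle dependency noted above (the
sample value is an assumption):

    SET DateStyle = 'Postgres, MDY';
    SELECT jsonb_path_query('"2024-09-10 14:48:13"', '$.datetime().string()');
    -- the text produced by string() follows the prevailing DateStyle
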
Tomas Vondra
861086493f Add PG_TEST_PG_COMBINEBACKUP_MODE to CI tasks
The environment variable PG_TEST_PG_COMBINEBACKUP_MODE has been
available since 35a7b288b975, but was not set by any built-in CI tasks.
This commit modifies two of the CI tasks to use the alternative modes,
to exercise the pg_combinebackup code.

The Linux task uses --copy-file-range, macOS uses --clone.

This is not an exhaustive test of combinations. The supported modes
depend on the operating system and filesystem, and it would be nice to
test all supported combinations. Right now we have just one task for
each OS, and it doesn't seem worth adding more just for this.

Reported-by: Peter Eisentraut
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/48da4a1f-ccd9-4988-9622-24f37b1de2b4%40eisentraut.org
2024-09-10 16:30:38 +02:00
Daniel Gustafsson
390b3cbbb2 Protect against small overread in SASLprep validation
In case of torn UTF8 in the input data we might end up going
past the end of the string since we don't account for length.
While validation won't be performed on a sequence with a NULL
byte, it's better to avoid going past the end to begin with.
Fix by taking the length into consideration.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAOYmi+mTnmM172g=_+Yvc47hzzeAsYPy2C4UBY3HK9p-AXNV0g@mail.gmail.com
2024-09-10 11:02:28 +02:00
Peter Eisentraut
56fead44dc Add amgettreeheight index AM API routine
The only current implementation is for btree where it calls
_bt_getrootheight().  Other index types can now also use this to pass
information to their amcostestimate routine.  Previously, btree was
hardcoded and other index types could not hook into the optimizer at
this point.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2024-09-10 10:03:23 +02:00
Richard Guo
f5050f795a Mark expressions nullable by grouping sets
When generating window_pathkeys, distinct_pathkeys, or sort_pathkeys,
we failed to realize that the grouping/ordering expressions might be
nullable by grouping sets.  As a result, we may incorrectly deem that
the PathKeys are redundant by EquivalenceClass processing and thus
remove them from the pathkeys list.  That would lead to wrong results
in some cases.

To fix this issue, we mark the grouping expressions nullable by
grouping sets if that is the case.  If the grouping expression is a
Var or PlaceHolderVar or constructed from those, we can just add the
RT index of the RTE_GROUP RTE to the existing nullingrels field(s);
otherwise we have to add a PlaceHolderVar to carry on the nullingrel
bit.

However, we have to manually remove this nullingrel bit from
expressions in various cases where these expressions are logically
below the grouping step, such as when we generate groupClause pathkeys
for grouping sets, or when we generate PathTarget for initial input to
grouping nodes.

Furthermore, in set_upper_references, the targetlist and quals of an
Agg node should have nullingrels that include the effects of the
grouping step, ie they will have nullingrels equal to the input
Vars/PHVs' nullingrels plus the nullingrel bit that references the
grouping RTE.  In order to perform exact nullingrels matches, we also
need to manually remove this nullingrel bit.

Bump catversion because this changes the querytree produced by the
parser.

Thanks to Tom Lane for the idea to invent a new kind of RTE.

Per reports from Geoff Winkless, Tobias Wendorff, Richard Guo from
various threads.

Author: Richard Guo
Reviewed-by: Ashutosh Bapat, Sutou Kouhei
Discussion: https://postgr.es/m/CAMbWs4_dp7e7oTwaiZeBX8+P1rXw4ThkZxh1QG81rhu9Z47VsQ@mail.gmail.com
2024-09-10 12:36:48 +09:00
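
A hedged sketch of the kind of query affected; the concrete shape is an
assumption based on the description above, not a case taken from the
reports:

    SELECT a, b
    FROM t
    WHERE b = 1
    GROUP BY GROUPING SETS (a, b)
    ORDER BY b;
    -- b comes out NULL for the (a) grouping set, so the ORDER BY b pathkey
    -- must not be discarded as redundant merely because of the b = 1 qual
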
Richard Guo
247dea89f7 Introduce an RTE for the grouping step
If there are subqueries in the grouping expressions, each of these
subqueries in the targetlist and HAVING clause is expanded into
distinct SubPlan nodes.  As a result, only one of these SubPlan nodes
would be converted to a reference to the grouping key column output by
the Agg node; others would have to get evaluated afresh.  This is not
efficient, and with grouping sets this can cause wrong results issues
in cases where they should go to NULL because they are from the wrong
grouping set.  Furthermore, during re-evaluation, these SubPlan nodes
might use nulled column values from grouping sets, which is not
correct.

This issue is not limited to subqueries.  For other types of
expressions that are part of grouping items, if they are transformed
into another form during preprocessing, they may fail to match lower
target items.  This can also lead to wrong results with grouping sets.

To fix this issue, we introduce a new kind of RTE representing the
output of the grouping step, with columns that are the Vars or
expressions being grouped on.  In the parser, we replace the grouping
expressions in the targetlist and HAVING clause with Vars referencing
this new RTE, so that the output of the parser directly expresses the
semantic requirement that the grouping expressions be gotten from the
grouping output rather than computed some other way.  In the planner,
we first preprocess all the columns of this new RTE and then replace
any Vars in the targetlist and HAVING clause that reference this new
RTE with the underlying grouping expressions, so that we will have
only one instance of a SubPlan node for each subquery contained in the
grouping expressions.

Bump catversion because this changes the querytree produced by the
parser.

Thanks to Tom Lane for the idea to invent a new kind of RTE.

Per reports from Geoff Winkless, Tobias Wendorff, Richard Guo from
various threads.

Author: Richard Guo
Reviewed-by: Ashutosh Bapat, Sutou Kouhei
Discussion: https://postgr.es/m/CAMbWs4_dp7e7oTwaiZeBX8+P1rXw4ThkZxh1QG81rhu9Z47VsQ@mail.gmail.com
2024-09-10 12:35:34 +09:00
Michael Paquier
fba49d5293 Remove emode argument from XLogFileRead() and XLogFileReadAnyTLI()
This change makes the code slightly easier to reason about, because
there is actually no need to know if a specific caller of one of these
routines should fail hard on a PANIC, or just let it go through with a
DEBUG2.

The only caller of XLogFileReadAnyTLI() used DEBUG2, and XLogFileRead()
has never used its emode.  This simplification has been possible since
1bb2558046cc, which introduced XLogFileReadAnyTLI() and split the two
routines apart.

Author: Yugo Nagata
Discussion: https://postgr.es/m/20240906201043.a640f3b44e755d4db2b6943e@sraoss.co.jp
2024-09-10 08:44:31 +09:00
Masahiko Sawada
bb77752342 Add WAL usage reporting to ANALYZE VERBOSE output.
This change adds WAL usage reporting to the output of ANALYZE VERBOSE
and autoanalyze reports. It aligns the analyze output with VACUUM,
providing consistency. Additionally, it aids in troubleshooting cases
where WAL records are generated during analyze operations.

Author: Anthonin Bonnefoy
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/CAO6_Xqr__kTTCLkftqS0qSCm-J7_xbRG3Ge2rWhucxQJMJhcRA%40mail.gmail.com
2024-09-09 14:56:08 -07:00
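
For instance (the output shape is an assumption, modelled on what VACUUM
VERBOSE prints):

    ANALYZE VERBOSE some_table;
    -- INFO:  analyzing "public.some_table"
    -- ...
    -- WAL usage: ... records, ... full page images, ... bytes
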
Tom Lane
de239d01e7 Consistently use PageGetExactFreeSpace() in pgstattuple.
Previously this code used PageGetHeapFreeSpace on heap pages,
and usually used PageGetFreeSpace on index pages (though for some
reason GetHashPageStats used PageGetExactFreeSpace instead).
The difference is that those functions subtract off the size of
a line pointer, and PageGetHeapFreeSpace has some additional
rules about returning zero if adding another line pointer would
require exceeding MaxHeapTuplesPerPage.  Those things make sense
when testing to see if a new tuple can be put on the page, but
they seem pretty strange for pure statistics collection.

Additionally, statapprox_heap had a special rule about counting
a "new" page as being fully available space.  This also seems
strange, because it's not actually usable until VACUUM or some
such process initializes the page.  Moreover, it's inconsistent
with what pgstat_heap does, which is to count such a page as
having zero free space.  So make it work like pgstat_heap, which
as of this patch unconditionally calls PageGetExactFreeSpace.

This is more of a definitional change than a bug fix, so no
back-patch.  The module's documentation doesn't define exactly
what "free space" means either, so we left that as-is.

Frédéric Yhuel, reviewed by Rafia Sabih and Andreas Karlsson.

Discussion: https://postgr.es/m/3a18f843-76f6-4a84-8cca-49537fefa15d@dalibo.com
2024-09-09 14:34:10 -04:00
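
For example, both of these now report free space computed with
PageGetExactFreeSpace() (requires the pgstattuple extension; the table
name is a placeholder):

    SELECT free_space, free_percent FROM pgstattuple('some_table');
    SELECT approx_free_space FROM pgstattuple_approx('some_table');
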
Tom Lane
218527d014 Don't bother checking the result of SPI_connect[_ext] anymore.
SPI_connect/SPI_connect_ext have not returned any value other than
SPI_OK_CONNECT since commit 1833f1a1c in v10; any errors are thrown
via ereport.  (The most likely failure is out-of-memory, which has
always been thrown that way, so callers had better be prepared for
such errors.)  This makes it somewhat pointless to check these
functions' result, and some callers within our code haven't been
bothering; indeed, the only usage example within spi.sgml doesn't
bother.  So it's likely that the omission has propagated into
extensions too.

Hence, let's standardize on not checking, and document the return
value as historical, while not actually changing these functions'
behavior.  (The original proposal was to change their return type
to "void", but that would needlessly break extensions that are
conforming to the old practice.)  This saves a small amount of
boilerplate code in a lot of places.

Stepan Neretin

Discussion: https://postgr.es/m/CAMaYL5Z9Uk8cD9qGz9QaZ2UBJFOu7jFx5Mwbznz-1tBbPDQZow@mail.gmail.com
2024-09-09 12:18:34 -04:00
Robert Haas
cdb6b0fdb0 Add PQfullProtocolVersion() to surface the precise protocol version.
The existing function PQprotocolVersion() does not include the minor
version of the protocol.  In preparation for pending work that will
bump that number for the first time, add a new function to provide it
to clients that may care, using the (major * 10000 + minor)
convention already used by PQserverVersion().

Jacob Champion based on earlier work by Jelte Fennema-Nio

Discussion: http://postgr.es/m/CAOYmi+mM8+6Swt1k7XsLcichJv8xdhPnuNv7-02zJWsezuDL+g@mail.gmail.com
2024-09-09 11:54:55 -04:00
Michael Paquier
5bbdfa8a18 Fix waits of REINDEX CONCURRENTLY for indexes with predicates or expressions
As introduced by f9900df5f94, a REINDEX CONCURRENTLY job done for an
index with predicates or expressions would set PROC_IN_SAFE_IC in its
MyProc->statusFlags, causing it to be ignored by other concurrent
operations.

Such concurrent index rebuilds should never be ignored, as a predicate
or an expression could call a user-defined function that accesses a
different table than the table where the index is rebuilt.

A test that uses injection points is added, backpatched down to 17.
Michail has proposed a different test, but I have added something
simpler with more coverage.

Oversight in f9900df5f949.

Author: Michail Nikolaev
Discussion: https://postgr.es/m/CANtu0oj9A3kZVduFTG0vrmGnKB+DCHgEpzOp0qAyOgmks84j0w@mail.gmail.com
Backpatch-through: 14
2024-09-09 13:49:36 +09:00
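
A minimal SQL sketch of the scenario described above (all names are
placeholders):

    CREATE FUNCTION f(int) RETURNS int AS 'SELECT $1' LANGUAGE sql IMMUTABLE;
    CREATE INDEX idx_expr ON tab (f(col));
    REINDEX INDEX CONCURRENTLY idx_expr;
    -- this rebuild must not be flagged PROC_IN_SAFE_IC, since f() could in
    -- principle access tables other than "tab"
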
Amit Langote
dd8bea88ab SQL/JSON: Avoid initializing unnecessary ON ERROR / ON EMPTY steps
When the ON ERROR / ON EMPTY behavior is to return NULL, returning
NULL directly from ExecEvalJsonExprPath() suffices. Therefore, there's
no need to create separate steps to check the error/empty flag or
those to evaluate the constant NULL expression.  This speeds up
common cases because the default ON ERROR / ON EMPTY behavior for
JSON_QUERY() and JSON_VALUE() is to return NULL.  However, these steps
are necessary if the RETURNING type is a domain, as constraints on the
domain may need to be checked.

Reported-by: Jian He <jian.universality@gmail.com>
Author: Jian He <jian.universality@gmail.com>
Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-09-09 13:46:58 +09:00
Richard Guo
87b6c3c0b7 Fix order of parameters in a cost_sort call
In label_sort_with_costsize, the cost_sort function is called with the
parameters 'input_disabled_nodes' and 'input_cost' in the wrong order.
This does not cause any plan diffs in the regression tests, because
label_sort_with_costsize is only used to label the Sort node nicely
for EXPLAIN, and cost numbers are not displayed in regression tests.

Oversight in e22253467.  Fixed by passing arguments in the right
order.

Per report from Alexander Lakhin running UBSan.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/a9b7231d-68bc-f117-a07c-96688f3e6aef@gmail.com
2024-09-09 12:58:31 +09:00
Michael Paquier
fc415edf8c Add callbacks to control flush of fixed-numbered stats
This commit adds two callbacks in pgstats to have a better control of
the flush timing of pgstat_report_stat(), whose operation depends on the
three PGSTAT_*_INTERVAL variables:
- have_fixed_pending_cb(), to check if a stats kind has any pending
data waiting for a flush.  This is used as a fast path if there are no
pending statistics to flush, and this check is done for fixed-numbered
statistics only if there are no variable-numbered statistics to flush.
A flush will need to happen if at least one callback reports any pending
data.
- flush_fixed_cb(), to do the actual flush.

These callbacks are currently used by the SLRU, WAL and IO statistics,
generalizing the concept for all stats kinds (builtin and custom).

The SLRU and IO stats each relied on one global variable to determine
whether a flush should happen; these are now local to pgstat_slru.c and
pgstat_io.c, cleaning up a bit how the pending flush states are tracked
in pgstat.c.

pgstat_flush_io() and pgstat_flush_wal() are still required, but we do
not need to check their return result anymore.

Reviewed-by: Bertrand Drouvot, Kyotaro Horiguchi
Discussion: https://postgr.es/m/ZtaVO0N-aTwiAk3w@paquier.xyz
2024-09-09 11:12:29 +09:00
Tom Lane
2e62fa62d6 Avoid core dump after getpwuid_r failure.
Looking up a nonexistent user ID would lead to a null pointer
dereference.  That's unlikely to happen here, but perhaps
not impossible.

Thinko in commit 4d5111b3f, noticed by Coverity.
2024-09-08 19:14:40 -04:00
Michael Paquier
d8df7ac5c0 Update extension lookup routines to use the syscache
The following routines are changed to use the syscache entries added for
pg_extension in 490f869d92e5:
- get_extension_oid()
- get_extension_name()
- get_extension_schema()

A catalog scan is costly and could easily lead to a noticeable
performance impact when called once or more per query, so this is going
to be helpful for developers doing extension data lookups.

Author: Andrei Lepikhov
Reviewed-by: Jelte Fennema-Nio
Discussion: https://postgr.es/m/529295b2-6ba9-4dae-acd1-20a9c6fb8f9a@gmail.com
2024-09-07 20:20:46 +09:00
Jeff Davis
51edc4ca54 Remove lc_ctype_is_c().
Instead always fetch the locale and look at the ctype_is_c field.

hba.c relies on regexes working for the C locale without needing
catalog access, which worked before due to a special case for
C_COLLATION_OID in lc_ctype_is_c(). Move the special case to
pg_set_regex_collation() now that lc_ctype_is_c() is gone.

Author: Andreas Karlsson
Discussion: https://postgr.es/m/60929555-4709-40a7-b136-bcb44cff5a3c@proxel.se
2024-09-06 13:23:21 -07:00
Tom Lane
129a2f6679 Fix incorrect pg_stat_io output on 32-bit machines.
pg_stat_get_io() applied TimestampTzGetDatum twice to the
stat_reset_timestamp value.  On 64-bit builds that's harmless because
TimestampTzGetDatum is a no-op, but on 32-bit builds it results in
displaying garbage in the stats_reset column of the pg_stat_io view.

Bug dates to commit a9c70b46d which introduced pg_stat_io, so
back-patch to v16 where that came in.

Bertrand Drouvot

Discussion: https://postgr.es/m/Ztrd+XcPTz1zorkg@ip-10-97-1-34.eu-west-3.compute.internal
2024-09-06 11:57:57 -04:00
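
For reference, the affected column can be inspected with a query such as:

    SELECT backend_type, object, context, stats_reset
    FROM pg_stat_io
    LIMIT 1;
    -- on 32-bit builds, stats_reset previously showed garbage because the
    -- timestamp was converted to a Datum twice
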
Peter Eisentraut
9e43ab3dd7 Remove useless unconstify
Digging into the history, this was not necessary even when it was
added, but might have been some time before that.  In any case, there
is no use for this now.
2024-09-06 11:25:48 +02:00
Amit Langote
bbd4c058a8 SQL/JSON: Fix default ON ERROR behavior for JSON_TABLE
Use EMPTY ARRAY instead of EMPTY.

This change does not affect the runtime behavior of JSON_TABLE(),
which continues to return an empty relation ON ERROR. It only alters
whether the default ON ERROR behavior is shown in the deparsed output.

Reported-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-09-06 13:25:53 +09:00
Amit Langote
ee75a03f37 SQL/JSON: Fix JSON_TABLE() column deparsing
The deparsing code in get_json_expr_options() unnecessarily emitted
the default column-specific ON ERROR / EMPTY behavior when the
top-level ON ERROR behavior in JSON_TABLE was set to ERROR. Fix that
by not overriding the column-specific default, determined based on
the column's JsonExprOp in get_json_table_columns(), with
JSON_BEHAVIOR_ERROR when that is the top-level ON ERROR behavior.

Note that this only removes redundancy; the current deparsing output
is not incorrect, just redundant.

Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-09-06 13:25:47 +09:00
Amit Langote
4d7e24e0f4 Revert recent SQL/JSON related commits
Reverts 68222851d5a8, 565caaa79af, and 3a97460970f, because a few
BF animals didn't like one or all of them.
2024-09-06 12:53:01 +09:00
Amit Langote
3a97460970 SQL/JSON: Avoid initializing unnecessary ON ERROR / ON EMPTY steps
When the ON ERROR / ON EMPTY behavior is to return NULL, returning
NULL directly from ExecEvalJsonExprPath() suffices. Therefore, there's
no need to create separate steps to check the error/empty flag or
those to evaluate the constant NULL expression.  This speeds up
common cases because the default ON ERROR / ON EMPTY behavior for
JSON_QUERY() and JSON_VALUE() is to return NULL.  However, these steps
are necessary if the RETURNING type is a domain, as constraints on the
domain may need to be checked.

Reported-by: Jian He <jian.universality@gmail.com>
Author: Jian He <jian.universality@gmail.com>
Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-09-06 12:05:40 +09:00
Amit Langote
565caaa79a SQL/JSON: Fix default ON ERROR behavior for JSON_TABLE
Use EMPTY ARRAY instead of EMPTY.

This change does not affect the runtime behavior of JSON_TABLE(),
which continues to return an empty relation ON ERROR. It only alters
whether the default ON ERROR behavior is shown in the deparsed output.

Reported-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-09-06 10:14:01 +09:00
Amit Langote
68222851d5 SQL/JSON: Fix JSON_TABLE() column deparsing
The deparsing code in get_json_expr_options() unnecessarily emitted
the default column-specific ON ERROR / EMPTY behavior when the
top-level ON ERROR behavior in JSON_TABLE was set to ERROR. Fix that
by not overriding the column-specific default, determined based on
the column's JsonExprOp in get_json_table_columns(), with
JSON_BEHAVIOR_ERROR when that is the top-level ON ERROR behavior.

Note that this only removes redundancy; the current deparsing output
is not incorrect, just redundant.

Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-09-06 10:13:57 +09:00
Amit Langote
3422f5f93f Update comment about ExprState.escontext
The updated comment provides more helpful guidance by mentioning that
escontext should be set when soft error handling is needed.

Reported-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-09-06 10:13:53 +09:00
Jeff Davis
7829f85a62 Be more careful with error paths in pg_set_regex_collation().
Set global variables after error paths so that they don't end up in an
inconsistent state.

The inconsistent state doesn't lead to an actual problem, because
after an error, pg_set_regex_collation() will be called again before
the globals are accessed.

Change extracted from patch by Andreas Karlsson, though not discussed
explicitly.

Discussion: https://postgr.es/m/60929555-4709-40a7-b136-bcb44cff5a3c@proxel.se
2024-09-05 12:10:08 -07:00
Tom Lane
fadff3fc94 Prevent mis-encoding of "trailing junk after numeric literal" errors.
Since commit 2549f0661, we reject an identifier immediately following
a numeric literal (without separating whitespace), because that risks
ambiguity with hex/octal/binary integers.  However, that patch used
token patterns like "{integer}{ident_start}", which is problematic
because {ident_start} matches only a single byte.  If the first
character after the integer is a multibyte character, this ends up
with flex reporting an error message that includes a partial multibyte
character.  That can cause assorted bad-encoding problems downstream,
both in the report to the client and in the postmaster log file.

To fix, use {identifier} not {ident_start} in the "junk" token
patterns, so that they will match complete multibyte characters.
This seems generally better user experience quite aside from the
encoding problem: for "123abc" the error message will now say that
the error appeared at or near "123abc" instead of "123a".

While at it, add some commentary about why these patterns exist
and how they work.

Report and patch by Karina Litskevich; review by Pavel Borisov.
Back-patch to v15 where the problem came in.

Discussion: https://postgr.es/m/CACiT8iZ_diop=0zJ7zuY3BXegJpkKK1Av-PU7xh0EDYHsa5+=g@mail.gmail.com
2024-09-05 12:42:33 -04:00
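
An SQL example of the improved report (error text paraphrased from the
description above):

    SELECT 123abc;
    -- ERROR:  trailing junk after numeric literal at or near "123abc"
    -- previously the reported fragment stopped after a single byte
    -- ("123a"), which could split a multibyte character
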
Daniel Gustafsson
85837b8037 Fix handling of NULL return value in typarray lookup
Commit 6ebeeae29 accidentally omitted testing the return value from
findTypeByOid which can return NULL.  Fix by adding a check to make
sure that we have a pointer to dereference.

Author: Ranier Vilela <ranier.vf@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAEudQAqfMTH8Ya_J6E-NW_y_JyDFDxtQ4V_g6nY_1=0oDbQqdg@mail.gmail.com
2024-09-05 15:32:22 +02:00
Peter Eisentraut
4af123ad45 Fix misleading error message context
Author: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Stepan Neretin <sncfmgg@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAFj8pRAw+OkVW=FgMKHKyvY3CgtWy3cWdY7XT+S5TJaTttu=oA@mail.gmail.com
2024-09-05 15:19:00 +02:00
Michael Paquier
1b373aed20 Add callback for backend initialization in pgstats
pgstat_initialize() is currently used by the WAL stats as a code path to
take some custom actions when a backend starts.  A callback is added to
generalize the concept so that all stats kinds, builtin and custom, can do
the same if the callback is set.

Reviewed-by: Bertrand Drouvot, Kyotaro Horiguchi
Discussion: https://postgr.es/m/ZtZr1K4PLdeWclXY@paquier.xyz
2024-09-05 16:05:21 +09:00
Michael Paquier
341e9a05e7 Fix two NULL pointer dereferences when reading custom pgstats from file
There were two spots in pgstat_read_statsfile() where it was possible to
finish with a null-pointer-dereference crash for custom pgstats kinds:
- When reading stats for a fixed-numbered stats entry.
- When reading a variable stats entry with name serialization.
For both cases, these issues were reachable by starting a server after
changing shared_preload_libraries so that the stats written previously
could not be loaded.

The code is changed so that the stats are ignored in this case, like the
other code paths doing similar sanity checks.  Two WARNINGs are added to
be able to debug these issues.  A test is added for the case of
fixed-numbered stats with the module injection_points.

Oversights in 7949d9594582, spotted while looking at a different report.

Discussion: https://postgr.es/m/Ztj0Jftsn4xXuXtl@paquier.xyz
2024-09-05 14:36:57 +09:00
Michael Paquier
5735521ac2 Check availability of module injection_points in TAP tests
This fixes defects with installcheck for TAP tests that expect the
module injection_points to exist in an installation, but the contents of
src/test/modules are not installed by default with installcheck.  This
would cause, for example, failures under installcheck-world for a build
with injection points enabled, when the contents of src/test/modules/
are not installed.

The availability of the module can be checked with a scan of
pg_available_extensions.  This check was introduced in 2cdcae9da696, and
it is refactored here as a new routine in Cluster.pm.

Tests are changed in different ways depending on what they need:
- The libpq TAP test sets up a node even without injection points, so it
is enough to check that CREATE EXTENSION can be used.  There is no need
for the variable enable_injection_points.
- In test_misc, 006_signal_autovacuum requires a runtime check.
- 041_checkpoint_at_promote in recovery tests and 005_timeouts in
test_misc are updated to use the routine introduced in Cluster.pm.
- test_slru's 001_multixact, injection_points's 001_stats and
modules/gin/ do not require a check as these modules disable
installcheck entirely.

Discussion: https://postgr.es/m/ZtesYQ-WupeAK7xK@paquier.xyz
2024-09-05 13:29:43 +09:00
David Rowley
908a968612 Optimize WindowAgg's use of tuplestores
When WindowAgg finished one partition of a PARTITION BY, it previously
would call tuplestore_end() to purge all the stored tuples before again
calling tuplestore_begin_heap() and carefully setting up all of the
tuplestore read pointers exactly as required for the given frameOptions.
Since the frameOptions don't change between partitions, this part does
not make much sense.  For queries that had very few rows per partition,
the overhead of this was very large.

It seems much better to create the tuplestore and the read pointers once
and simply call tuplestore_clear() at the end of each partition.
tuplestore_clear() moves all of the read pointers back to the start
position and deletes all the previously stored tuples.

A simple test query with 1 million partitions and 1 tuple per partition
has been shown to run around 40% faster than without this change.  The
additional effort seems to have mostly been spent in malloc/free.

Making this work required adding a new bool field to WindowAggState
which had the unfortunate effect of being the 9th bool field in a group
resulting in the struct being enlarged.  Here we shuffle the fields
around a little so that the two bool fields for runcondition-related
state fit into existing padding.  Also, move the "runcondition" field to
be near those.  This frees up enough space with the other bool fields so
that the newly added one fits into the padding bytes.  This was done to
address a very small but apparent performance regression with queries
containing a large number of rows per partition.

Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Tatsuo Ishii <ishii@postgresql.org>
Discussion: https://postgr.es/m/CAHoyFK9n-QCXKTUWT_xxtXninSMEv%2BgbJN66-y6prM3f4WkEHw%40mail.gmail.com
2024-09-05 16:18:30 +12:00
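
A query along the lines of the test described above (one million
partitions with a single row each):

    SELECT g, count(*) OVER (PARTITION BY g)
    FROM generate_series(1, 1000000) AS g;
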
David Rowley
19b861f880 Speedup WindowAgg code by moving uncommon code out-of-line
The code to calculate the frame offsets is only performed once per scan.
Moving this code out of line gives a small (around 4-5%) speedup when testing
with some CPUs.  Other tested CPUs are indifferent to the change.

Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Tatsuo Ishii <ishii@postgresql.org>
Discussion: https://postgr.es/m/CAApHDvqPgFtwme2Zyf75BpMLwYr2mnUstDyPiP%3DEpudYuQTPPQ%40mail.gmail.com
2024-09-05 15:59:47 +12:00
Jeff Davis
06421b0843 Remove lc_collate_is_c().
Instead just look up the collation and check collate_is_c field.

Author: Andreas Karlsson
Discussion: https://postgr.es/m/60929555-4709-40a7-b136-bcb44cff5a3c@proxel.se
2024-09-04 14:35:25 -07:00
Tom Lane
83eb481d52 Remove test-case workarounds for ancient libedit versions.
This reverts some hacks added in d33a81203 and cd69ec66c.
At the time the concern was the already-ancient version of
libedit shipped in Debian 10 (Buster).  That platform is
now two years past EOL, so desupporting it for PG 18 seems
fine.  (Also, if anyone is really hot to keep testing it,
they can use SKIP_READLINE_TESTS to skip this test.)

We might have to reconsider if any animals still in the
buildfarm don't like this, but the best way to find out
is to try it.

Anton Melnikov

Discussion: https://postgr.es/m/CAGRrpzZU48F2oV3d8eDLr=4TU9xFH5Jt9ED+qU1+X91gMH68Sw@mail.gmail.com
2024-09-04 16:25:28 -04:00
Noah Misch
ddfc556a64 Revert "Optimize pg_visibility with read streams."
This reverts commit ed1b1ee59fb3792baa32f669333b75024ef01bcc and its
followup 1c61fd8b527954f0ec522e5e60a11ce82628b681.  They rendered
collect_corrupt_items() unable to detect corruption.

Discussion: https://postgr.es/m/CAN55FZ1_Ru3XpMgTwsU67FTH2fs_FrRROmb7x6zs+F44QBEiww@mail.gmail.com
Discussion: https://postgr.es/m/CAEudQAozv3wTY5TV2t29JcwPydbmKbiWQkZD42S2OgzdixPMDQ@mail.gmail.com
2024-09-04 11:36:40 -07:00
Peter Eisentraut
82b07eba9e Remove a couple of strerror() calls
Change to using %m in the error message string.  We need to be a bit
careful here to preserve errno until we need to print it.

This change avoids the use of not-thread-safe strerror() and unifies
some error message strings, and maybe makes the code appear more
consistent.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/daa87d79-c044-46c4-8458-8d77241ed7b0%40eisentraut.org
2024-09-04 14:45:31 +02:00
Michael Paquier
a68159ff2b Unify some error messages to ease work of translators
This commit updates a couple of error messages around control file data,
GUCs and server settings, unifying to the same message where possible.
This reduces the translation burden a bit.

Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com
2024-09-04 14:53:18 +09:00
Michael Paquier
b4db64270e Apply more quoting to GUC names in messages
This is a continuation of 17974ec25946.  More quotes are applied to
GUC names in error messages and hints, taking care of what seems to be
all the remaining holes currently in the tree for the GUCs.

Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com
2024-09-04 13:50:44 +09:00
Amit Kapila
6c2b5edecc Collect statistics about conflicts in logical replication.
This commit adds columns in view pg_stat_subscription_stats to show the
number of times a particular conflict type has occurred during the
application of logical replication changes. The following columns are
added:

confl_insert_exists:
        Number of times a row insertion violated a NOT DEFERRABLE unique
        constraint.
confl_update_origin_differs:
        Number of times an update was performed on a row that was
        previously modified by another origin.
confl_update_exists:
        Number of times that the updated value of a row violates a
        NOT DEFERRABLE unique constraint.
confl_update_missing:
        Number of times that the tuple to be updated is missing.
confl_delete_origin_differs:
        Number of times a delete was performed on a row that was
        previously modified by another origin.
confl_delete_missing:
        Number of times that the tuple to be deleted is missing.

The update_origin_differs and delete_origin_differs conflicts can be
detected only when track_commit_timestamp is enabled.

Author: Hou Zhijie
Reviewed-by: Shveta Malik, Peter Smith, Amit Kapila
Discussion: https://postgr.es/m/OS0PR01MB57160A07BD575773045FC214948F2@OS0PR01MB5716.jpnprd01.prod.outlook.com
2024-09-04 08:55:21 +05:30
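
The new counters can be inspected with, for example:

    SELECT subname,
           confl_insert_exists,
           confl_update_origin_differs,
           confl_update_exists,
           confl_update_missing,
           confl_delete_origin_differs,
           confl_delete_missing
    FROM pg_stat_subscription_stats;
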
Richard Guo
9626068f13 Avoid unnecessary post-sort projection
When generating paths for the ORDER BY clause, one thing we need to
ensure is that the output paths project the correct final_target.  To
achieve this, in create_ordered_paths, we compare the pathtarget of
each generated path with the given 'target', and add a post-sort
projection step if the two targets do not match.

Currently we perform a simple pointer comparison between the two
targets.  It turns out that this is not sufficient.  Each sorted_path
generated in create_ordered_paths initially projects the correct
target required by the preceding steps of sort.  If it is the same
pointer as sort_input_target, pointer comparison suffices, because
sort_input_target is always identical to final_target when no
post-sort projection is needed.

However, sorted_path's initial pathtarget may not be the same pointer
as sort_input_target, because in apply_scanjoin_target_to_paths, if
the target to be applied has the same expressions as the existing
reltarget, we only inject the sortgroupref info into the existing
pathtargets, rather than create new projection paths.  As a result,
pointer comparison in create_ordered_paths is not reliable.

Instead, we can compare PathTarget.exprs to determine whether a
projection step is needed.  If the expressions match, we can be
confident that a post-sort projection is not required.

It could be argued that this change adds extra check cost each time we
decide whether a post-sort projection is needed.  However, as
explained in apply_scanjoin_target_to_paths, by avoiding the creation
of projection paths, we save effort both immediately and at plan
creation time.  This, I think, justifies the extra check cost.

There are two ensuing plan changes in the regression tests, but they
look reasonable and are exactly what we are fixing here.  So no
additional test cases are added.

No backpatch as this could result in plan changes.

Author: Richard Guo
Reviewed-by: Peter Eisentraut, David Rowley, Tom Lane
Discussion: https://postgr.es/m/CAMbWs48TosSvmnz88663_2yg3hfeOFss-J2PtnENDH6J_rLnRQ@mail.gmail.com
2024-09-04 12:19:19 +09:00
Richard Guo
4f1124548f Check the validity of commutators for merge/hash clauses
When creating merge or hash join plans in createplan.c, the merge or
hash clauses may need to get commuted to ensure that the outer var is
on the left and the inner var is on the right if they are not already
in the expected form.  This requires that their operators have
commutators.  Failing to find a commutator at this stage would result
in 'ERROR: could not find commutator for operator xxx', with no
opportunity to select an alternative plan.

Typically, this is not an issue because mergejoinable or hashable
operators are expected to always have valid commutators.  But in some
artificial cases this assumption may not hold true.  Therefore, here
in this patch we check the validity of commutators for clauses in the
form "inner op outer" when selecting mergejoin/hash clauses, and
consider a clause unusable for the current pair of outer and inner
relations if it lacks a commutator.

There are not (and should not be) any such operators built into
Postgres that are mergejoinable or hashable but have no commutators;
so we leverage the alias type 'int8alias1' created in equivclass.sql
to build the test case.  This is why the test case is included in
equivclass.sql rather than in join.sql.

Although this is arguably a bug fix, it cannot be reproduced without
installing an incomplete opclass, which is unlikely to happen in
practice, so no back-patch.

Reported-by: Alexander Pyhalov
Author: Richard Guo
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/c59ec04a2fef94d9ffc35a9b17dfc081@postgrespro.ru
2024-09-04 12:17:11 +09:00
Michael Paquier
08b9b9e043 Fix inconsistent LWLock tranche name "CommitTsSLRU"
This term was using an inconsistent casing between the code and the
documentation, using "CommitTsSLRU" in wait_event_names.txt and
"CommitTSSLRU" in the code.

Let's update the term in the code to reflect what's in the
documentation, "CommitTs" being more commonly used, so as
pg_stat_activity shows the same term as the documentation.

Oversight in 53c2a97a9266.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/f7e514cf-2446-21f1-a5d2-8c089a6e2168@gmail.com
Backpatch-through: 17
2024-09-04 10:21:17 +09:00
Michael Paquier
2cdcae9da6 Avoid installcheck failure in TAP tests using injection_points
These tests depend on the test module injection_points to be installed,
but it may not be available as the contents of src/test/modules/ are not
installed by default.

This commit adds a workaround based on a scan of pg_available_extensions
to check if the extension is available, skipping the test if it is not.
This allows installcheck to work transparently.

There are more tests impacted by this problem on HEAD, but for now this
addresses only the tests that exist on HEAD and v17 as the release is
close by.

Reported-by: Maxim Orlov
Discussion: https://postgr.es/m/CACG=ezZkoT-pFz6a9XnyToiuR-Wg8fGELqHLoyBodr+2h-77qA@mail.gmail.com
Backpatch-through: 17
2024-09-04 08:56:23 +09:00
Jeff Davis
12d3345c0d Remember last collation to speed up collation cache.
This optimization is to avoid a performance regression in an upcoming
patch that will remove lc_collate_is_c().

Discussion: https://postgr.es/m/96a559be83329bc66074a3925ebcfa8ceb16dfc5.camel@j-davis.com
Discussion: https://postgr.es/m/646f662e145ab38cff1c04d475f4448f53fc5042.camel@j-davis.com
Discussion: https://postgr.es/m/54565933-d82f-4d7c-8f47-288b1b570fd8@eisentraut.org
2024-09-03 16:30:03 -07:00
Michael Paquier
516ff05539 Simplify makefiles exporting twice enable_injection_points
This is confusing, as it exports twice the same variable.  Oversight in
6782709df81f that has spread in more places afterwards.

Reported-by: Alvaro Herrera, Tom Lane
Discussion: https://postgr.es/m/202408201630.mn6vbohjh7hh@alvherre.pgsql
Backpatch-through: 17
2024-09-04 08:05:44 +09:00
Thomas Munro
813fde73d4 Standardize "read-ahead advice" terminology.
Commit 6654bb920 added macOS's equivalent of POSIX_FADV_WILLNEED, and
changed some explicit references to posix_fadvise to use this more
general name for the concept.  Update some remaining references.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/0827edec-1317-4917-a186-035eb1e3241d%40eisentraut.org
2024-09-04 10:28:53 +12:00
Noah Misch
1c61fd8b52 Fix stack variable scope from previous commit.
The defect came from me, not from that commit's credited author.  Per
buildfarm members olingo and grassquit.

Discussion: https://postgr.es/m/20240903192030.1e@rfd.leadboat.com
2024-09-03 12:44:54 -07:00
Noah Misch
ed1b1ee59f Optimize pg_visibility with read streams.
We've measured 5% performance improvement, and this arranges to benefit
automatically from future optimizations to the read_stream subsystem.

Nazir Bilal Yavuz

Discussion: https://postgr.es/m/CAN55FZ1_Ru3XpMgTwsU67FTH2fs_FrRROmb7x6zs+F44QBEiww@mail.gmail.com
2024-09-03 10:46:20 -07:00
Noah Misch
c582b75851 Add block_range_read_stream_cb(), to deduplicate code.
This replaces two functions for iterating over all blocks in a range.  A
pending patch will use this instead of adding a third.

Nazir Bilal Yavuz

Discussion: https://postgr.es/m/20240820184742.f2.nmisch@google.com
2024-09-03 10:46:20 -07:00
Daniel Gustafsson
ba7625a7a5 Use library functions to edit config in SSL tests
The SSL tests were editing the postgres configuration by directly
reading and writing the files rather than using append_conf() from
the testcode library.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/01F4684C-8C98-4BBE-AB83-AC8D7C746AF8@yesql.se
2024-09-03 18:57:56 +02:00
Daniel Gustafsson
e5f1f0a4f2 Test for PG_TEST_EXTRA separately in SSL tests
PG_TEST_EXTRA is an override and should be tested for separately
from any other test as there is no dependency on whether OpenSSL
is available or not.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/01F4684C-8C98-4BBE-AB83-AC8D7C746AF8@yesql.se
2024-09-03 18:57:54 +02:00
Daniel Gustafsson
31a98934d1 Fix typos in code comments and test data
The typos in 005_negotiate_encryption.pl and pg_combinebackup.c
shall be backported to v17 where they were introduced.

Backpatch-through: v17
Discussion: https://postgr.es/m/Ztaj7BkN4658OMxF@paquier.xyz
2024-09-03 11:33:38 +02:00
Peter Eisentraut
2b5f57977f Add const qualifiers to XLogRegister*() functions
Add const qualifiers to XLogRegisterData() and XLogRegisterBufData().
Several unconstify() calls can be removed.

Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://www.postgresql.org/message-id/dd889784-9ce7-436a-b4f1-52e4a5e577bd@eisentraut.org
2024-09-03 08:06:03 +02:00
Michael Paquier
4236825197 Fix typos and grammar in code comments and docs
Author: Alexander Lakhin
Discussion: https://postgr.es/m/f7e514cf-2446-21f1-a5d2-8c089a6e2168@gmail.com
2024-09-03 14:49:04 +09:00
Michael Paquier
c7cd2d6ed0 Define PG_TBLSPC_DIR for path pg_tblspc/ in data folder
Similarly to 2065ddf5e34c, this introduces a define for "pg_tblspc".
This makes the style more consistent with the existing PG_STAT_TMP_DIR,
for example.

There is a difference with the other cases with the introduction of
PG_TBLSPC_DIR_SLASH, required in two places for recovery and backups.

Author: Bertrand Drouvot
Reviewed-by: Ashutosh Bapat, Álvaro Herrera, Yugo Nagata, Michael
Paquier
Discussion: https://postgr.es/m/ZryVvjqS9SnV1GPP@ip-10-97-1-34.eu-west-3.compute.internal
2024-09-03 09:11:54 +09:00
Daniel Gustafsson
94eec79633 doc: Consistently use result set in documentation
We use "result set" in all other places so let's be consistent
across the entire documentation.

Reported-by: grantgryczan@gmail.com
Discussion: https://postgr.es/m/172187924855.915373.15595156724215203822@wrigleys.postgresql.org
2024-09-02 18:36:57 +02:00
Peter Eisentraut
2befd22790 Fix rarely-run test for message wording change
fixup for 2e6a8047f0

Reported-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
2024-09-02 17:40:32 +02:00
Daniel Gustafsson
c3333dbc0c Only perform pg_strong_random init when required
The random number generator in OpenSSL 1.1.1 was redesigned to provide
fork safety by default, thus removing the need for calling RAND_poll
after forking to ensure that two processes cannot share the same state.
Since we now support 1.1.0 as the minimum version, and 1.1.0 is being
increasingly phased out from production use, only perform the RAND_poll
initialization for installations running 1.1.0 by checking the OpenSSL
version number.

LibreSSL changed random number generator when forking OpenSSL and has
provided fork safety since version 2.0.2.

This removes the overhead of initializing the RNG for strong random
for the vast majority of users for whom it is no longer required.

Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CA+hUKGKh7QrYzu=8yWEUJvXtMVm_CNWH1L_TLWCbZMwbi1XP2Q@mail.gmail.com
2024-09-02 13:52:27 +02:00
Daniel Gustafsson
a70e01d430 Remove support for OpenSSL older than 1.1.0
OpenSSL 1.0.2 has been EOL from the upstream OpenSSL project for
some time, and is no longer the default OpenSSL version with any
vendor which packages PostgreSQL. By retiring support for OpenSSL
1.0.2 we can remove a lot of no longer required complexity for
managing state within libcrypto which is now handled by OpenSSL.

Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/ZG3JNursG69dz1lr@paquier.xyz
Discussion: https://postgr.es/m/CA+hUKGKh7QrYzu=8yWEUJvXtMVm_CNWH1L_TLWCbZMwbi1XP2Q@mail.gmail.com
2024-09-02 13:51:48 +02:00
Daniel Gustafsson
6ebeeae296 Cache typarray for fast lookups in binary upgrade mode
When upgrading a large schema it adds significant overhead to perform
individual catalog lookups per relation in order to retrieve Oid for
preserving Oid calls. This instead adds the typarray to the TypeInfo
cache which then allows for fast lookups using the existing API. A
35% reduction of pg_dump runtime in binary upgrade mode was observed
with this change.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/8F1F1E1D-D17B-4B33-B014-EDBCD15F3F0B@yesql.se
2024-09-02 10:17:46 +02:00
Peter Eisentraut
4d5111b3f1 More use of getpwuid_r() directly
Remove src/port/user.c, call getpwuid_r() directly.  This reduces some
complexity and allows better control of the error behavior.  For
example, the old code would in some circumstances silently truncate
the result string, or produce error message strings that the caller
wouldn't use.

src/port/user.c used to be called src/port/thread.c and contained
various portability complications to support thread-safety.  These are
all obsolete, and all but the user-lookup functions have already been
removed.  This patch completes this by also removing the user-lookup
functions.

Also convert src/backend/libpq/auth.c to use getpwuid_r() for
thread-safety.

Originally, I tried to be overly correct by using
sysconf(_SC_GETPW_R_SIZE_MAX) to get the buffer size for getpwuid_r(),
but that doesn't work on FreeBSD.  All the OS where I could find the
source code internally use 1024 as the suggested buffer size, so I
just ended up hardcoding that.  The previous code used BUFSIZ, which
is an unrelated constant from stdio.h, so its use seemed
inappropriate.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://www.postgresql.org/message-id/flat/5f293da9-ceb4-4937-8e52-82c25db8e4d3%40eisentraut.org
2024-09-02 09:04:30 +02:00
Michael Paquier
23138284cd Rename enum labels of PG_Locale_Strategy
PG_REGEX_BUILTIN was added in f69319f2f1fb but it did not follow the
same pattern as the previous labels, i.e. PG_LOCALE_*.  In addition to
this, the two libc strategies did not include in the name that they were
related to this library.

The enum labels are renamed as PG_STRATEGY_type[_subtype] to make the
code clearer, in accordance to the library and the functions they rely
on.

Author: Andreas Karlsson
Discussion: https://postgr.es/m/6f81200f-68fd-411e-97a1-d1f291d2e222@proxel.se
2024-09-02 08:18:41 +09:00
Thomas Munro
4effd0844d Fix unfairness in all-cached parallel seq scan.
Commit b5a9b18c introduced block streaming infrastructure with a special
fast path for all-cached scans, and commit b7b0f3f2 connected the
infrastructure up to sequential scans.  One of the fast path
micro-optimizations had an unintended consequence: it interfered with
parallel sequential scan's block range allocator (from commit 56788d21),
which has its own ramp-up and ramp-down algorithm when handing out
groups of pages to workers.  A scan of an all-cached table could give
extra blocks to one worker, when others had finished.  In some plans
(probably already very bad plans, such as the one reported by
Alexander), the unfairness could be magnified.

An internal buffer of 16 block numbers is removed, keeping just a single
block buffer for technical reasons.

Back-patch to 17.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/63a63690-dd92-c809-0b47-af05459e95d1%40gmail.com
2024-08-31 17:28:02 +12:00
Thomas Munro
ecd56459cf Stabilize 039_end_of_wal test.
The first test was sensitive to the insert LSN after setting up the
catalogs, which depended on environmental things like the locales on the
OS and usernames.  Switch to a new WAL file before the first test, as a
simple way to put every computer into the same state.

Back-patch to all supported releases.

Reported-by: Anton Voloshin <a.voloshin@postgrespro.ru>
Reported-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/b26aeac2-cb6d-4633-a7ea-945baae83dcf%40postgrespro.ru
2024-08-31 14:48:44 +12:00
Masahiko Sawada
d7613ea72f Clarify restrict_nonsystem_relation_kind description.
This change improves the description of the
restrict_nonsystem_relation_kind parameter in guc_table.c and the
documentation for better clarity.

Backpatch to 12, where this GUC parameter was introduced.

Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/6a96f1af-22b4-4a80-8161-1f26606b9ee2%40eisentraut.org
Backpatch-through: 12
2024-08-30 15:06:09 -07:00
Tom Lane
0e5c823806 Make postgres_fdw's query_cancel test less flaky.
This test occasionally shows

+WARNING:  could not get result of cancel request due to timeout

which appears to be because the cancel request is sometimes unluckily
sent to the remote session between queries, and then it's ignored.

This patch tries to make that less probable in three ways:

1. Use a test query that does not involve remote estimates, so that
no EXPLAINs are sent.
2. Make sure that the remote session is ready-to-go (transaction
started, SET commands sent) before we start the timer.
3. Increase the statement_timeout to 100ms, to give the local
session enough time to plan and issue the query.

We might have to go higher than 100ms to make this adequately
stable in the buildfarm, but let's see how it goes.

Back-patch to v17 where this test was introduced.

Jelte Fennema-Nio and Tom Lane

Discussion: https://postgr.es/m/578934.1725045685@sss.pgh.pa.us
2024-08-30 16:47:39 -04:00
Tom Lane
cb8e50a4a0 Avoid inserting PlaceHolderVars in cases where pre-v16 PG did not.
Commit 2489d76c4 removed some logic from pullup_replace_vars()
that avoided wrapping a PlaceHolderVar around a pulled-up
subquery output expression if the expression could be proven
to go to NULL anyway (because it contained Vars or PHVs of the
pulled-up relation and did not contain non-strict constructs).
But removing that logic turns out to cause performance regressions
in some cases, because the extra PHV blocks subexpression folding,
and will do so even if outer-join reduction later turns it into a
no-op with no phnullingrels bits.  This can for example prevent
an expression from being matched to an index.

The reason for always adding a PHV was to ensure we had someplace
to put the varnullingrels marker bits of the Var being replaced.
However, it turns out we can optimize in exactly the same cases that
the previous code did, because we can instead attach the needed
varnullingrels bits to the contained Var(s)/PHV(s).

This is not a complete solution --- it would be even better if we
could remove PHVs after reducing them to no-ops.  It doesn't look
practical to back-patch such an improvement, but this change seems
safe and at least gets rid of the performance-regression cases.

Per complaint from Nikhil Raj.  Back-patch to v16 where the
problem appeared.

Discussion: https://postgr.es/m/CAG1ps1xvnTZceKK24OUfMKLPvDP2vjT-d+F2AOCWbw_v3KeEgg@mail.gmail.com
2024-08-30 12:42:12 -04:00
Tom Lane
3409b4db63 Remove one memoize test case added by commit 069d0ff02.
This test case turns out to depend on the assumption that a non-Var
subquery output that's underneath an outer join will always get
wrapped in a PlaceHolderVar.  But that behavior causes performance
regressions in some cases compared to what happened before v16.
The next commit will avoid inserting a PHV in the same cases where
pre-v16 did, and that causes get_memoized_path to not detect that
a memoize plan could be used.

Commit this separately, in hopes that we can restore the test after
making get_memoized_path smarter.  (It's failing to find memoize
plans in adjacent cases where no PHV was ever inserted, so there
is definitely room for improvement there.)

Discussion: https://postgr.es/m/CAG1ps1xvnTZceKK24OUfMKLPvDP2vjT-d+F2AOCWbw_v3KeEgg@mail.gmail.com
2024-08-30 12:22:31 -04:00
Michael Paquier
c39afc38cf Define PG_LOGICAL_DIR for path pg_logical/ in data folder
This is similar to 2065ddf5e34c, but this time for pg_logical/ itself
and its contents, like the paths for snapshots, mappings or origin
checkpoints.

Author: Bertrand Drouvot
Reviewed-by: Ashutosh Bapat, Yugo Nagata, Michael Paquier
Discussion: https://postgr.es/m/ZryVvjqS9SnV1GPP@ip-10-97-1-34.eu-west-3.compute.internal
2024-08-30 15:25:12 +09:00
Michael Paquier
2065ddf5e3 Define PG_REPLSLOT_DIR for path pg_replslot/ in data folder
This commit replaces most of the hardcoded values of "pg_replslot" by a
new PG_REPLSLOT_DIR #define.  This makes the style more consistent with
the existing PG_STAT_TMP_DIR, for example.  More places will follow a
similar change.

Author: Bertrand Drouvot
Reviewed-by: Ashutosh Bapat, Yugo Nagata, Michael Paquier
Discussion: https://postgr.es/m/ZryVvjqS9SnV1GPP@ip-10-97-1-34.eu-west-3.compute.internal
2024-08-30 10:42:21 +09:00
Michael Paquier
a83a944e9f Rename pg_sequence_read_tuple() to pg_get_sequence_data()
This commit removes log_cnt from the tuple returned by the SQL function.
This field is an internal counter that tracks when a WAL record should
be generated for a sequence, and it is reset each time the sequence is
restored or recovered.  It is not needed for rebuilding the sequence DDL
commands in pg_dump and pg_upgrade, where this function is used.  The
field can still be queried with a scan of the "table" created
under-the-hood for a sequence.

Issue noticed while hacking on a feature that can rely on this new
function rather than pg_sequence_last_value(), aimed at making sequence
computation more easily pluggable.

Bump catalog version.

Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/Zsvka3r-y2ZoXAdH@paquier.xyz
2024-08-30 08:49:24 +09:00
Tom Lane
43f2e7634d Fix mis-deparsing of ORDER BY lists when there is a name conflict.
If an ORDER BY item in SELECT is a bare identifier, the parser
first seeks it as an output column name of the SELECT (for SQL92
compatibility).  However, ruleutils.c is expecting the SQL99
interpretation where such a name is an input column name.  So it's
possible to produce an incorrect display of a view in the (admittedly
pretty ill-advised) case where some other column is renamed in the
SELECT output list to match an ORDER BY column.

This can be fixed by table-qualifying such names in the dumped
view text.  To avoid cluttering less-ill-advised queries, we'd
like to do so only when there's an actual name conflict.
That requires passing the current get_query_def call's resultDesc
parameter down to get_variable, so that it can determine what
the output column names are.  In hopes of reducing rather than
increasing notational clutter in ruleutils.c, I moved that value
into the deparse_context struct and removed it from the parameter
lists of get_query_def's other subroutines.

I made a few other cosmetic changes while at it:
* Likewise move the colNamesVisible parameter into deparse_context.
* Rename deparse_context's windowTList field to targetList,
since it's no longer used only in connection with WINDOW clauses.
* Replace the special_exprkind field with a bool inGroupBy,
since that was all it was being used for, and the apparent
flexibility of storing a ParseExprKind proved to be illusory.
(We need a separate varInOrderBy field to make this patch work.)
* Remove useless save/restore logic in get_select_query_def.

In principle, this bug is quite old.  However, it seems unreachable
before 1b4d280ea, because before that the presence of "new" and "old"
entries in a view's rangetable caused us to always table-qualify every
Var reference in dumped views.  Hence, back-patch to v16 where that
came in.

Per bug #18589 from Quynh Tran.

Discussion: https://postgr.es/m/18589-70091cb81db1a3f1@postgresql.org
2024-08-29 13:24:17 -04:00
Peter Eisentraut
edee0c621d Message style improvements 2024-08-29 14:43:34 +02:00
Peter Eisentraut
894be11adf Put generated_stored test objects in a schema
This avoids naming conflicts with concurrent tests with similarly
named objects.  Currently, there are none, but tests for virtual
generated columns are planned to be added.

Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Tomasz Rybak <tomasz.rybak@post.pl>
Discussion: https://www.postgresql.org/message-id/flat/a368248e-69e4-40be-9c07-6c3b5880b0a6@eisentraut.org
2024-08-29 12:24:47 +02:00
Peter Eisentraut
b9ed496925 Rename regress test generated to generated_stored
This makes naming room to have another test file for virtual generated
columns.

Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Tomasz Rybak <tomasz.rybak@post.pl>
Discussion: https://www.postgresql.org/message-id/flat/a368248e-69e4-40be-9c07-6c3b5880b0a6@eisentraut.org
2024-08-29 12:08:31 +02:00
Peter Eisentraut
4d68a04324 Disallow USING clause when altering type of generated column
This does not make sense.  It would write the output of the USING
clause into the converted column, which would violate the generation
expression.  This adds a check to error out if this is specified.

There was a test for this, but that test errored out for a different
reason, so it was not effective.

Reported-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Yugo NAGATA <nagata@sraoss.co.jp>
Discussion: https://www.postgresql.org/message-id/flat/c7083982-69f4-4b14-8315-f9ddb20b9834%40eisentraut.org
2024-08-29 09:06:15 +02:00
Heikki Linnakangas
478846e768 Rename some shared memory initialization routines
To make them follow the usual naming convention where
FoobarShmemSize() calculates the amount of shared memory needed by
Foobar subsystem, and FoobarShmemInit() performs the initialization.

I didn't rename CreateLWLocks() and InitShmemIndex(), because they are
a little special. They need to be called before any of the other
ShmemInit() functions, because they set up the shared memory
bookkeeping itself. I also didn't rename InitProcGlobal(), because
unlike other ShmemInit functions, it's not called by individual
backends.

Reviewed-by: Andreas Karlsson
Discussion: https://www.postgresql.org/message-id/c09694ff-2453-47e5-b26c-32a16cd75ce6@iki.fi
2024-08-29 09:46:21 +03:00
Heikki Linnakangas
fbce7dfc77 Refactor lock manager initialization to make it a bit less special
Split the shared and local initialization to separate functions, and
follow the common naming conventions. With this, we no longer create
the LockMethodLocalHash hash table in the postmaster process, which
was always pointless.

Reviewed-by: Andreas Karlsson
Discussion: https://www.postgresql.org/message-id/c09694ff-2453-47e5-b26c-32a16cd75ce6@iki.fi
2024-08-29 09:46:06 +03:00
Michael Paquier
9f87da1cff Refactor some code for ALTER TABLE SET LOGGED/UNLOGGED in tablecmds.c
Both sub-commands used the same routine to switch the relpersistence of a
relation, duplicated the same checks, and used a style inconsistent with
access methods and tablespaces.

SET LOGGED/UNLOGGED is refactored to avoid any duplication, setting the
reason why a relation rewrite happens within ATPrepChangePersistence().
This shaves some code.

Discussion: https://postgr.es/m/ZiiyGFTBNkqcMQi_@paquier.xyz
2024-08-29 15:31:30 +09:00
Peter Eisentraut
d7fe02fb9e Fixup for prefetching support on macOS
The new code path (commit 6654bb92047) should call FileAccess() first,
like the posix_fadvise() path.

Reported-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/0827edec-1317-4917-a186-035eb1e3241d%40eisentraut.org
2024-08-29 08:22:28 +02:00
Amit Kapila
640178c92e Rename the conflict types for the origin differ cases.
The conflict types 'update_differ' and 'delete_differ' indicate that a row
to be modified was previously altered by another origin. Rename those to
'update_origin_differs' and 'delete_origin_differs' to clarify their
meaning.

Author: Hou Zhijie
Reviewed-by: Shveta Malik, Peter Smith
Discussion: https://postgr.es/m/CAA4eK1+HEKwG_UYt4Zvwh5o_HoCKCjEGesRjJX38xAH3OxuuYA@mail.gmail.com
2024-08-29 09:12:12 +05:30
Amit Kapila
9d90e2bdaf Doc: Fix the ambiguity in the description of failover slots.
The failover slots ensure a seamless transition of a subscriber after the
standby is promoted.  But the docs for it also explain the behavior of
asynchronous replication, which can confuse readers.

Reported-by: Masahiro Ikeda
Backpatch-through: 17
Discussion: https://postgr.es/m/OS3PR01MB6390B660F4198BB9745E0526B18B2@OS3PR01MB6390.jpnprd01.prod.outlook.com
2024-08-29 08:56:52 +05:30
Peter Eisentraut
6654bb9204 Add prefetching support on macOS
macOS doesn't have posix_fadvise(), but fcntl() with the F_RDADVISE
command does the same thing.

Some related documentation has been generalized to not mention
posix_fadvise() specifically anymore.
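
For reference, the macOS call looks roughly like this (a sketch of the
pattern, not the committed code):

    #ifdef __APPLE__
    #include <fcntl.h>

    /* Ask the kernel to read a byte range ahead; purely advisory. */
    static void
    prefetch_range(int fd, off_t offset, int nbytes)
    {
        struct radvisory ra;

        ra.ra_offset = offset;
        ra.ra_count = nbytes;
        (void) fcntl(fd, F_RDADVISE, &ra);
    }
    #endif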

Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/0827edec-1317-4917-a186-035eb1e3241d%40eisentraut.org
2024-08-28 07:28:27 +02:00
Peter Eisentraut
2e6a8047f0 Message style improvements 2024-08-27 16:54:10 +02:00
Peter Eisentraut
dc26ff2f22 Fix misplaced translator comments
They did not immediately precede the code they were applying to.
2024-08-27 16:15:28 +02:00
Masahiko Sawada
7229ebe011 Fix indentation. 2024-08-26 16:16:12 -07:00
Masahiko Sawada
52f1d6730b Fix memory counter update in ReorderBuffer.
Commit 5bec1d6bc5e changed the memory usage updates of the
ReorderBufferTXN to zero all at once by subtracting txn->size, rather
than updating it for each change. However, if TOAST reconstruction
data remained in the transaction when freeing it, there were cases
where it further subtracted the memory counter from zero, resulting in
an assertion failure.

This change calculates the memory size for each change and updates the
memory usage to precisely the amount that has been freed.

Backpatch to v17, where this was introduced.

Reviewed-by: Amit Kapila, Shlok Kyal
Discussion: https://postgr.es/m/CAD21AoAqkNUvicgKPT_dXzNoOwpPkVTg0QPPxEcWmzT0moCJ1g%40mail.gmail.com
Backpatch-through: 17
2024-08-26 11:00:07 -07:00
Peter Geoghegan
09a8407dbf Fix nbtree lookahead overflow bug.
Add bounds checking to nbtree's lookahead/skip-within-a-page mechanism.
Otherwise it's possible for cases with lots of before-array-keys tuples
to overflow an int16 variable, causing the mechanism to generate an out
of bounds page offset number.

Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp
execution.

Reported-By: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/6c68ac42-bbb5-8b24-103e-af0e279c536f@gmail.com
Backpatch: 17-, where nbtree SAOP execution was enhanced.
2024-08-26 11:29:15 -04:00
Peter Eisentraut
dbe37f1adb pg_upgrade: Message style improvements 2024-08-26 14:40:48 +02:00
Dean Rasheed
7cac6307a4 Fix compiler warning in mul_var_short().
Some compilers (e.g., gcc before version 7) mistakenly think "carry"
might be used uninitialized.

Reported by Tom Lane, per various buildfarm members, e.g. arowana.
2024-08-26 11:00:20 +01:00
Alexander Korotkov
8daa62a10c Revert: Avoid looping over all type cache entries in TypeCacheRelCallback()
This commit reverts c14d4acb8 as the patch design didn't take into account
that TypeCacheEntry could be invalidated during the lookup_type_cache() call.

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/1927cba4-177e-5c23-cbcc-d444a850304f%40gmail.com
2024-08-26 00:22:44 +03:00
Alexander Korotkov
c14d4acb81 Avoid looping over all type cache entries in TypeCacheRelCallback()
Currently, when a single relcache entry gets invalidated,
TypeCacheRelCallback() has to loop over all type cache entries to find the
appropriate typentry to invalidate.  Unfortunately, using the syscache here
is impossible, because this callback could be called outside a transaction,
which makes catalog lookups impossible.  This is why the present commit
introduces RelIdToTypeIdCacheHash to map a relation OID to its composite
type OID.

We keep a RelIdToTypeIdCacheHash entry while the corresponding type cache
entry has something to clean.  Therefore, RelIdToTypeIdCacheHash shouldn't
get bloated in the case of a flood of temporary tables.

Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov, Pavel Borisov
2024-08-25 03:21:23 +03:00
Alexander Korotkov
3890d90c15 Revert support for ALTER TABLE ... MERGE/SPLIT PARTITION(S) commands
This commit reverts 1adf16b8fb, 87c21bb941, and subsequent fixes and
improvements including df64c81ca9, c99ef1811a, 9dfcac8e15, 885742b9f8,
842c9b2705, fcf80c5d5f, 96c7381c4c, f4fc7cb54b, 60ae37a8bc, 259c96fa8f,
449cdcd486, 3ca43dbbb6, 2a679ae94e, 3a82c689fd, fbd4321fd5, d53a4286d7,
c086896625, 4e5d6c4091, 04158e7fa3.

The reason for reverting is security issues related to repeatable name lookups
(CVE-2014-0062).  Even though 04158e7fa3 solved part of the problem, there
are still remaining issues, which aren't feasible to even carefully analyze
before the RC deadline.

Reported-by: Noah Misch, Robert Haas
Discussion: https://postgr.es/m/20240808171351.a9.nmisch%40google.com
Backpatch-through: 17
2024-08-24 18:48:48 +03:00
Peter Eisentraut
6e8a0317b4 pg_createsubscriber: Message style improvements 2024-08-24 15:56:32 +02:00
Tom Lane
ff59d5d2cf Provide feature-test macros for libpq features added in v17.
As per the policy established in commit 6991e774e, invent macros
that can be tested at compile time to detect presence of new libpq
features.  This should make calling code more readable and less
error-prone than checking the libpq version would be (especially
since we don't expose that at compile time; the server version is
an unreliable substitute).
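
For illustration, a compile-time check might look like this
(LIBPQ_HAS_ASYNC_CANCEL is given as one plausible macro name; the
authoritative list is whatever libpq-fe.h ships with your libpq):

    #include <libpq-fe.h>

    /* Prefer a feature-test macro over guessing from the libpq version. */
    #ifdef LIBPQ_HAS_ASYNC_CANCEL
    #define HAVE_NONBLOCKING_CANCEL 1
    #else
    #define HAVE_NONBLOCKING_CANCEL 0
    #endif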

Discussion: https://postgr.es/m/2042418.1724346970@sss.pgh.pa.us
2024-08-23 10:12:56 -04:00
Peter Eisentraut
a2bbc58f74 thread-safety: gmtime_r(), localtime_r()
Use gmtime_r() and localtime_r() instead of gmtime() and localtime(),
for thread-safety.

There are a few affected calls in libpq and ecpg's libpgtypes, which
are probably effectively bugs, because those libraries already claim
to be thread-safe.

There is one affected call in the backend.  Most of the backend
otherwise uses the custom functions pg_gmtime() and pg_localtime(),
which are implemented differently.

While we're here, change the call in the backend to gmtime*() instead
of localtime*(), since for that use time zone behavior is irrelevant,
and this side-steps any questions about when time zones are
initialized by localtime_r() vs localtime().

Portability: gmtime_r() and localtime_r() are in POSIX but are not
available on Windows.  Windows has functions gmtime_s() and
localtime_s() that can fulfill the same purpose, so we add some small
wrappers around them.  (Note that these *_s() functions are also
different from the *_s() functions in the bounds-checking extension of
C11.  We are not using those here.)

On MinGW, you can get the POSIX-style *_r() functions by defining
_POSIX_C_SOURCE appropriately before including <time.h>.  This leads
to a conflict at least in plpython because apparently _POSIX_C_SOURCE
gets defined in some header there, and then our replacement
definitions conflict with the system definitions.  To avoid that sort
of thing, we now always define _POSIX_C_SOURCE on MinGW and use the
POSIX-style functions here.
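
As a rough sketch of the substitution (plain POSIX, not the patched code
itself), the thread-safe variant writes into a caller-provided buffer:

    #include <stdio.h>
    #include <time.h>

    static void
    print_utc(time_t t)
    {
        struct tm   tmbuf;          /* caller-provided; no shared static state */
        char        buf[64];

        if (gmtime_r(&t, &tmbuf) != NULL)
        {
            strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S", &tmbuf);
            printf("%s UTC\n", buf);
        }
    }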

Reviewed-by: Stepan Neretin <sncfmgg@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/eba1dc75-298e-4c46-8869-48ba8aad7d70@eisentraut.org
2024-08-23 07:43:04 +02:00
Michael Paquier
94a3373ac5 Rework new SLRU test with injection points
Rather than the SQL injection_points_load(), this commit changes the
injection point test introduced in 768a9fd5535f to rely on the two
macros INJECTION_POINT_LOAD() and INJECTION_POINT_CACHED(), that have
been originally introduced for the sake of this test.

This runs the test as a two-step process: load the injection point, then
run its callback directly from the local cache loaded.  What the test
did originally was also fine, but the point here is to have an example
in core of how to use these new macros.

While on it, fix the header ordering in multixact.c, as pointed out by
Alexander Korotkov.  This was an oversight in 768a9fd5535f.

Per discussion with Álvaro Herrera.

Author: Michael Paquier
Discussion: https://postgr.es/m/ZsUnJUlSOBNAzwW1@paquier.xyz
Discussion: https://postgr.es/m/CAPpHfduzaBz7KMhwuVOZMTpG=JniPG4aUosXPZCxZydmzq_oEQ@mail.gmail.com
2024-08-23 12:11:36 +09:00
Michael Paquier
2e35c67f95 injection_point: Add injection_points.stats
This GUC controls if cumulative statistics are enabled or not in the
module.  Custom statistics require the module to be loaded with
shared_preload_libraries, hence this GUC is made PGC_POSTMASTER.  By
default, the stats are disabled.  001_stats.pl is updated to enable the
statistics, as it is the only area where these are required now.

This will be used by an upcoming change for the injection point test
added by 768a9fd5535f where stats should not be used, as the test runs a
point callback in a critical section.  And the module injection_points
will need to be loaded with shared_preload_libraries there.

Per discussion with Álvaro Herrera.

Author: Michael Paquier
Discussion: https://postgr.es/m/ZsUnJUlSOBNAzwW1@paquier.xyz
2024-08-23 11:36:41 +09:00
Michael Paquier
b2b023aa37 injection_points: Add initialization of shmem state when loading module
This commit adds callbacks to initialize the shared memory state of the
module when loaded with shared_preload_libraries.  This is necessary to
be able to update the test introduced in 768a9fd5535f to use the macros
INJECTION_POINT_{LOAD,CACHED}() rather than a SQL function in the module
injection_points forcing a load, as this test runs a callback in a
critical section where no memory allocation should happen.

Initializing the shared memory state of the module while loading
provides a strict control on the timing of its allocation.  If the
module is not loaded at startup, it will use a GetNamedDSMSegment()
instead to initialize its shmem state on-the-fly.

Per discussion with Álvaro Herrera.

Author: Michael Paquier
Discussion: https://postgr.es/m/ZsUnJUlSOBNAzwW1@paquier.xyz
2024-08-23 10:12:58 +09:00
Amit Kapila
edcb712585 Doc: explain the log format of logical replication conflicts.
This commit adds a detailed explanation of the log format for logical
replication conflicts.

Author: Hou Zhijie
Reviewed-by: Shveta Malik, Peter Smith, Hayato Kuroda
Discussion: https://postgr.es/m/OS0PR01MB5716352552DFADB8E9AD1D8994C92@OS0PR01MB5716.jpnprd01.prod.outlook.com
Discussion: https://postgr.es/m/OS0PR01MB57162EDE8BA17F3EE08A24CA948D2@OS0PR01MB5716.jpnprd01.prod.outlook.com
2024-08-22 14:11:50 +05:30
Michael Paquier
d55322b0da psql: Add more meta-commands able to use the extended protocol
Currently, only unnamed prepared statements are supported by psql with
the meta-command \bind.  With only this command, it is not possible to
test named statement creation, execution, or closing through the extended
protocol.

This commit introduces three additional commands:
* \parse creates a prepared statement using the extended protocol,
acting as a wrapper of libpq's PQsendPrepare().
* \bind_named binds and executes an existing prepared statement using
the extended protocol, for PQsendQueryPrepared().
* \close closes an existing prepared statement using the extended
protocol, for PQsendClosePrepared().

This is going to be useful to add regression tests for the extended
query protocol, and I have some plans for that on separate threads.
Note that \bind relies on PQsendQueryParams().

The code of psql is refactored so that bind_flag is replaced by an enum in
_psqlSettings that tracks the type of libpq routine to execute, based on
the meta-command involved, with the default being PQsendQuery().  This
refactoring piece has been written by me, while Anthonin has implemented
the rest.
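
Roughly, the three meta-commands map onto the libpq calls named above; a
minimal sketch (error handling and PQgetResult() consumption omitted):

    #include <libpq-fe.h>

    static void
    extended_protocol_sketch(PGconn *conn)
    {
        const char *values[1] = {"42"};

        PQsendPrepare(conn, "stmt1", "SELECT $1::int", 1, NULL);   /* \parse */
        /* ... drain results with PQgetResult() ... */
        PQsendQueryPrepared(conn, "stmt1", 1, values,
                            NULL, NULL, 0);                        /* \bind_named */
        /* ... drain results ... */
        PQsendClosePrepared(conn, "stmt1");                        /* \close */
    }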

Author: Anthonin Bonnefoy, Michael Paquier
Reviewed-by: Aleksander Alekseev, Jelte Fennema-Nio
Discussion: https://postgr.es/m/CAO6_XqpSq0Q0kQcVLCbtagY94V2GxNP3zCnR6WnOM8WqXPK4nw@mail.gmail.com
2024-08-22 16:25:57 +09:00
Noah Misch
a36aa223ec Fix attach of a previously-detached injection point.
It's normal for the name in a free slot to match the new name.  The
max_inuse mechanism kept simple cases from reaching the problem.  The
problem could appear when index 0 was the previously-detached entry and
index 1 is in use.  Back-patch to v17, where this code first appeared.
2024-08-22 00:07:04 -07:00
Alexander Korotkov
04158e7fa3 Avoid repeated table name lookups in createPartitionTable()
Currently, createPartitionTable() opens the newly created table using its
name.  This approach is prone to a privilege escalation attack, because we
might end up opening a different table than the one we just created.

This commit addresses the issue by opening the newly created table by its
OID.  It is tricky to get a relation OID out of ProcessUtility(), so we
extend TableLikeClause with a new newRelationOid field, which is filled
within ProcessUtility() so that the caller can access it.

Security: CVE-2014-0062
Reported-by: Noah Misch
Discussion: https://postgr.es/m/20240808171351.a9.nmisch%40google.com
Reviewed-by: Pavel Borisov, Dmitry Koval
2024-08-22 09:50:48 +03:00
Richard Guo
9bb842f95e Small code simplification
Apply the same code simplification to ATExecAddColumn as was done in
7ff9afbbd: apply GETSTRUCT() once instead of doing it repeatedly in
the same function.

Author: Tender Wang
Discussion: https://postgr.es/m/CAHewXNkO9+U437jvKT14s0MCu6Qpf6G-p2mZK5J9mAi4cHDgpQ@mail.gmail.com
2024-08-22 11:41:08 +09:00
Michael Paquier
490f869d92 Create syscache entries for pg_extension
Two syscache identifiers are added for extension names and OIDs.

Shared libraries of extensions might want to invalidate or update their
own caches whenever a CREATE, ALTER or DROP EXTENSION command is run for
their extension (in any backend).  Right now this is non-trivial to do
correctly and efficiently, but, if an extension catalog is part of a
syscache, this could simply be done by registering a callback using
CacheRegisterSyscacheCallback for the relevant syscache.

Another case where this is useful is a loaded library where some of its
code paths rely on some objects of the extension to exist; it can be
simpler and more efficient to do an existence check directly on the
extension through the syscache.
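
A sketch of what such a registration could look like in an extension's
shared library (EXTENSIONOID is assumed here to be one of the two new
identifiers; check syscache.h for the exact names):

    #include "postgres.h"
    #include "utils/inval.h"
    #include "utils/syscache.h"

    static void
    my_extension_inval(Datum arg, int cacheid, uint32 hashvalue)
    {
        /* drop whatever this library caches about its own extension */
    }

    void
    register_my_extension_callback(void)
    {
        CacheRegisterSyscacheCallback(EXTENSIONOID, my_extension_inval,
                                      (Datum) 0);
    }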

Author: Jelte Fennema-Nio
Reviewed-by: Alexander Korotkov, Pavel Stehule
Discussion: https://postgr.es/m/CAGECzQTWm9sex719Hptbq4j56hBGUti7J9OWjeMobQ1ccRok9w@mail.gmail.com
2024-08-22 10:48:25 +09:00
Jeff Davis
a839567784 Fix obsolete comments in varstr_cmp(). 2024-08-21 09:19:21 -07:00
Tom Lane
86488cdf12 Disallow creating binary-coercible casts involving range types.
For a long time we have forbidden binary-coercible casts to or from
composite and array types, because such a cast cannot work correctly:
the type OID embedded in the value would need to change, but it won't
in a binary coercion.  That reasoning applies equally to range types,
but we overlooked installing a similar restriction here when we
invented range types.  Do so now.

Given the lack of field complaints, we won't change this in stable
branches, but it seems not too late for v17.

Per discussion of a problem noted by Peter Eisentraut.

Discussion: https://postgr.es/m/076968e1-0852-40a9-bc0b-117cd3f0e43c@eisentraut.org
2024-08-21 12:00:03 -04:00
Robert Haas
c01743aa48 Show number of disabled nodes in EXPLAIN ANALYZE output.
Now that disable_cost is not included in the cost estimate, there's
no visible sign in EXPLAIN output of which plan nodes are disabled.
Fix that by propagating the number of disabled nodes from Path to
Plan, and then showing it in the EXPLAIN output.

There is some question about whether this is a desirable change.
While I personally believe that it is, it seems best to make it a
separate commit, in case we decide to back out just this part, or
rework it.

Reviewed by Andres Freund, Heikki Linnakangas, and David Rowley.

Discussion: http://postgr.es/m/CA+TgmoZ_+MS+o6NeGK2xyBv-xM+w1AfFVuHE4f_aq6ekHv7YSQ@mail.gmail.com
2024-08-21 10:14:35 -04:00
Robert Haas
e222534679 Treat number of disabled nodes in a path as a separate cost metric.
Previously, when a path type was disabled by e.g. enable_seqscan=false,
we either avoided generating that path type in the first place, or
more commonly, we added a large constant, called disable_cost, to the
estimated startup cost of that path. This latter approach can distort
planning. For instance, an extremely expensive non-disabled path
could seem to be worse than a disabled path, especially if the full
cost of that path node need not be paid (e.g. due to a Limit).
Or, as in the regression test whose expected output changes with this
commit, the addition of disable_cost can make two paths that would
normally be distinguishable in cost seem to have fuzzily the same cost.

To fix that, we now count the number of disabled path nodes and
consider that a high-order component of both the startup cost and the
total cost. Hence, the path list is now sorted by disabled_nodes and
then by total_cost, instead of just by the latter, and likewise for
the partial path list.  It is important that this number is a count
and not simply a Boolean; else, as soon as we're unable to respect
disabled path types in all portions of the path, we stop trying to
avoid them where we can.

Because the path list is now sorted by the number of disabled nodes,
the join prechecks must compute the count of disabled nodes during
the initial cost phase instead of postponing it to final cost time.

Counts of disabled nodes do not cross subquery levels; at present,
there is no reason for them to do so, since we do not postpone
path selection across subquery boundaries (see make_subplan).

Reviewed by Andres Freund, Heikki Linnakangas, and David Rowley.

Discussion: http://postgr.es/m/CA+TgmoZ_+MS+o6NeGK2xyBv-xM+w1AfFVuHE4f_aq6ekHv7YSQ@mail.gmail.com
2024-08-21 10:12:30 -04:00
Robert Haas
2b03cfeea4 Fix pgindent damage
Oversight in commit a95ff1fe2eb4926b13e0940ad1f37d048704bdb0
2024-08-21 09:58:11 -04:00
Peter Eisentraut
4baff50132 doc: remove llvm-config search from configure documentation
As of 4dd29b6833, we no longer attempt to locate any other llvm-config
variant than plain llvm-config in configure-based builds; update the
documentation accordingly. (For Meson-based builds, we still use Meson's
LLVMDependencyConfigTool [0], which runs through a set of possible
suffixes [1], so no need to update the documentation there.)

[0]: 7d28ff2939/mesonbuild/dependencies/dev.py (L184)
[1]: 7d28ff2939/mesonbuild/environment.py (L183)

Author: Ole Peder Brandtzæg <olebra@samfundet.no>
Discussion: https://www.postgresql.org/message-id/20240518224601.gtisttjerylukjr5%40samfundet.no
2024-08-21 15:11:21 +02:00
Amit Kapila
d43b8bb6b8 Fix typos in 9758174e2e.
Reported off-list by Erik Rijkers
2024-08-21 16:45:36 +05:30
Peter Eisentraut
7ff9afbbd1 Small code simplification
Apply GETSTRUCT() once instead of doing it repeatedly in the same
function.  This simplifies the notation and makes the function's
structure more similar to the surrounding ones.

Discussion: https://www.postgresql.org/message-id/flat/a368248e-69e4-40be-9c07-6c3b5880b0a6@eisentraut.org
2024-08-21 09:21:25 +02:00
Amit Kapila
3f28b2fcac Don't advance origin during apply failure.
We advance origin progress during abort on successful streaming and
application of ROLLBACK in parallel streaming mode. But the origin
shouldn't be advanced during an error or unsuccessful apply due to
shutdown. Otherwise, it will result in a transaction loss as such a
transaction won't be sent again by the server.

Reported-by: Hou Zhijie
Author: Hayato Kuroda and Shveta Malik
Reviewed-by: Amit Kapila
Backpatch-through: 16
Discussion: https://postgr.es/m/TYAPR01MB5692FAC23BE40C69DA8ED4AFF5B92@TYAPR01MB5692.jpnprd01.prod.outlook.com
2024-08-21 09:22:32 +05:30
Jeff Davis
a95ff1fe2e Slightly refactor varstr_sortsupport() to improve readability.
Author: Andreas Karlsson
Discussion: https://postgr.es/m/69c2a864-846f-4309-bd5a-aaa1c34f9a11@proxel.se
2024-08-20 15:32:39 -07:00
Michael Paquier
15c1abd977 Remove _PG_fini()
ab02d702ef08 has removed from the backend the code able to support the
unloading of modules, because this has never worked.  This removes the
last references to _PG_fini(), that could be used as a callback for
modules to manipulate the stack when unloading a library.

The test module ldap_password_func declared it as a no-op.  The function
declaration in fmgr.h is gone.

It was left around in 2022 to avoid breaking extension code, but at this
stage there is also a benefit in letting extension developers know that
keeping the unloading code is pointless, and this change means less
maintenance.

Reviewed-by: Tom Lane, Heikki Linnakangas
Discussion: https://postgr.es/m/ZsQfi0AUJoMF6NSd@paquier.xyz
2024-08-21 07:24:03 +09:00
Alvaro Herrera
678a8358d1
Minor wording change in table "JSON Creation Functions"
For readability.  Backpatch to 16.

Author: Erik Wienhold <ewie@ewie.name>
Discussion: https://postgr.es/m/8ddac732-d650-4958-b9c9-ea8e6116251e@ewie.name
2024-08-20 17:53:40 -04:00
Jeff Davis
0fb0f68933 Improve configure error for ICU libraries if pkg-config is absent.
If pkg-config is not installed, the ICU libraries cannot be found, but
the custom configure error message did not mention this. This might
lead to confusion about the actual problem. To improve this, remove
the explicit error message and rely on PKG_CHECK_MODULES' generic
error message.

Author: Michael Banck
Reported-by: Holger Jakobs
Discussion: https://postgr.es/m/ccd579ed-4949-d3de-ab13-9e6456fd2caf%40jakobs.com
Discussion: https://postgr.es/m/66b5d05c.050a0220.7c8ce.a951@mx.google.com
2024-08-20 12:25:06 -07:00
Nathan Bossart
5ff9b6b4d9 Fix a couple of wait event descriptions.
The descriptions for ProcArrayGroupUpdate and XactGroupUpdate claim
that these events mean we are waiting for the group leader "at end
of a parallel operation," but neither pertains to parallel
operations.  This commit reverts these descriptions to their
wording before commit 3048898e73, i.e., "end of a parallel
operation" is changed to "transaction end."

Author: Sameer Kumar
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CAGPeHmh6UMrKQHKCmX%2B5vV5TH9P%3DKw9en3k68qEem6J%3DyrZPUA%40mail.gmail.com
Backpatch-through: 13
2024-08-20 13:43:20 -05:00
Alvaro Herrera
768a9fd553
Add injection-point test for new multixact CV usage
Before commit a0e0fb1ba56f, multixact.c contained a case in the
multixact-read path where it would loop sleeping 1ms each time until
another multixact-create path completed; this was not covered by any
tests.  That commit changed the code to rely on a condition variable
instead.  Add a test now, which relies on injection points and "loading"
thereof (because it runs in a critical section), per commit
4b211003ecc2.

Author: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Michaël Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/0925F9A9-4D53-4B27-A87E-3D83A757B0E0@yandex-team.ru
2024-08-20 14:21:34 -04:00
John Naylor
4d93bbd4e0 Document limit on the number of out-of-line values per table
Document the hard limit stemming from the size of an OID, and also
mention the performance impact that occurs before the hard limit
is reached.

Jakub Wartak and Robert Haas
Backpatch to all supported versions

Discussion: https://postgr.es/m/CAKZiRmwWhp2yxjqJLwbBjHdfbJBcUmmKMNAZyBjjtpgM9AMatQ%40mail.gmail.com
2024-08-20 13:36:33 +07:00
Amit Kapila
9758174e2e Log the conflicts while applying changes in logical replication.
This patch provides additional logging information in the following
conflict scenarios while applying changes:

insert_exists: Inserting a row that violates a NOT DEFERRABLE unique constraint.
update_differ: Updating a row that was previously modified by another origin.
update_exists: The updated row value violates a NOT DEFERRABLE unique constraint.
update_missing: The tuple to be updated is missing.
delete_differ: Deleting a row that was previously modified by another origin.
delete_missing: The tuple to be deleted is missing.

For insert_exists and update_exists conflicts, the log can include the origin
and commit timestamp details of the conflicting key with track_commit_timestamp
enabled.

update_differ and delete_differ conflicts can only be detected when
track_commit_timestamp is enabled on the subscriber.

We do not offer additional logging for exclusion constraint violations because
these constraints can specify rules that are more complex than simple equality
checks. Resolving such conflicts won't be straightforward. This area can be
further enhanced if required.

Author: Hou Zhijie
Reviewed-by: Shveta Malik, Amit Kapila, Nisha Moond, Hayato Kuroda, Dilip Kumar
Discussion: https://postgr.es/m/OS0PR01MB5716352552DFADB8E9AD1D8994C92@OS0PR01MB5716.jpnprd01.prod.outlook.com
2024-08-20 08:35:11 +05:30
David Rowley
adf97c1562 Speed up Hash Join by making ExprStates support hashing
Here we add ExprState support for obtaining a 32-bit hash value from a
list of expressions.  This allows both faster hashing and also JIT
compilation of these expressions.  This is especially useful when hash
joins have multiple join keys as the previous code called ExecEvalExpr on
each hash join key individually and that was inefficient as tuple
deformation would have only taken into account one key at a time, which
could lead to walking the tuple once for each join key.  With the new
code, we'll determine the maximum attribute required and deform the tuple
to that point only once.

Some performance tests done with this change have shown up to a 20%
performance increase of a query containing a Hash Join without JIT
compilation and up to a 26% performance increase when JIT is enabled and
optimization and inlining were performed by the JIT compiler.  The
performance increase with a single join column was smaller, at around 14%
with and without JIT.  This test was done using a fairly small hash
table and a large number of hash probes.  The increase will likely be
less with large tables, especially ones larger than L3 cache as memory
pressure is more likely to be the limiting factor there.

This commit only addresses Hash Joins, but lays expression evaluation
and JIT compilation infrastructure for other hashing needs such as Hash
Aggregate.

Author: David Rowley
Reviewed-by: Alexey Dvoichenkov <alexey@hyperplane.net>
Reviewed-by: Tels <nospam-pg-abuse@bloodgate.com>
Discussion: https://postgr.es/m/CAApHDvoexAxgQFNQD_GRkr2O_eJUD1-wUGm%3Dm0L%2BGc%3DT%3DkEa4g%40mail.gmail.com
2024-08-20 13:38:22 +12:00
Bruce Momjian
9380e5f129 doc: improve create/alter sequence CYCLE syntax
Reported-by: Peter Smith

Discussion: https://postgr.es/m/CAHut+PtqwZwPfGq62xq2614_ce2ejDmbB9CfP+a1azxpneFRBQ@mail.gmail.com

Author: Peter Smith

Backpatch-through: master
2024-08-19 20:18:03 -04:00
Bruce Momjian
e28a2719be doc: mention postgres_fdw INSERT ON CONFLICT limitation
Reported-by: Fujii Masao

Discussion: https://postgr.es/m/47801526-d017-4c89-9f52-c02c449a139b@oss.nttdata.com

Author: Fujii Masao

Backpatch-through: master
2024-08-19 19:54:39 -04:00
Bruce Momjian
cf3bb26204 doc: clarify create database in start docs uses command line
Reported-by: vrms@netcologne.de

Discussion: https://postgr.es/m/172251463564.915373.17748961617119647662@wrigleys.postgresql.org

Backpatch-through: master
2024-08-19 19:22:10 -04:00
Bruce Momjian
6467993fb5 doc: Improve vague pg_createsubscriber description
Discussion: https://postgr.es/m/ZqX_4J-nFTQtmj6K@momjian.us

Author: Euler Taveira

Backpatch-through: 17
2024-08-19 18:27:22 -04:00
Alvaro Herrera
52f3de874b
Avoid failure to open dropped detached partition
When a partition is detached and immediately dropped, a prepared
statement could try to compute a new partition descriptor that includes
it.  This leads to this kind of error:
ERROR:  could not open relation with OID 457639

Avoid this by skipping the partition in expand_partitioned_rtentry if it
doesn't exist.

Noted by me while investigating bug #18559.  Kuntal Ghosh helped to
identify the exact failure.

Backpatch to 14, where DETACH CONCURRENTLY was introduced.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/202408122233.bo4adt3vh5bi@alvherre.pgsql
2024-08-19 16:09:10 -04:00
Tomas Vondra
0d06a7eac4 Document that search_path is reported by the server
Commit 28a1121fd912 marked search_path as GUC_REPORT, but failed to
update the relevant places in docs. There are two places listing the GUC
options reported to the client, so update both.

Reported-by: Tom Lane
Discussion: https://postgr.es/m/CAFh8B=k8s7WrcqhafmYhdN1+E5LVzZi_QaYDq8bKvrGJTAhY2Q@mail.gmail.com
2024-08-19 19:52:37 +02:00
Tomas Vondra
28a1121fd9 Mark search_path as GUC_REPORT
Report search_path changes to the client. Multi-tenant applications
often map tenants to schemas, and use search_path to pick the tenant a
given connection works with. This breaks when a connection pool (like
PgBouncer) is in use, because the search_path may change unexpectedly.

There are other GUCs we might want reported (e.g. various timeouts), but
search_path is by far the biggest foot gun that can lead either to
puzzling failures during query execution (when objects are missing or
are defined differently), or even to accessing incorrect data.

Many existing tools modify search_path, pg_dump being a notable example.

Ideally, clients could specify which GUCs are interesting and should be
subject to this reporting, but we don't support that. GUC_REPORT is what
connection pools rely on for other interesting GUCs, so just use that.

When this change was initially proposed in 2014, one of the concerns was
impact on performance. But this was addressed by commit 2432b1a04087,
which ensures we report each GUC at most once per query, no matter how
many times it changed during execution.

Eventually, this might be replaced or superseded by making the protocol
extensible in this direction, but it's unclear when (or if) that will
happen. Until then, we can leverage GUC_REPORT.
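
Once the parameter is reported, a pooler or client can read the last known
value without issuing a query; a minimal sketch using plain libpq:

    #include <stdio.h>
    #include <libpq-fe.h>

    static void
    show_search_path(PGconn *conn)
    {
        /* NULL means the server never reported this parameter (older servers) */
        const char *sp = PQparameterStatus(conn, "search_path");

        printf("search_path = %s\n", sp ? sp : "(not reported)");
    }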

Author: Alexander Kukushkin, Jelte Fennema-Nio
Discussion: https://postgr.es/m/CAFh8B=k8s7WrcqhafmYhdN1+E5LVzZi_QaYDq8bKvrGJTAhY2Q@mail.gmail.com
2024-08-19 17:04:14 +02:00
Tomas Vondra
5cb902e9d5 Explain dropdb can't use syscache because of TOAST
Add a comment explaining dropdb() can't rely on syscache. The issue with
flattened rows was fixed by commit 0f92b230f88b, but better to have
a clear explanation why the systable scan is necessary. The other places
doing in-place updates on pg_database have the same comment.

Suggestion and patch by Yugo Nagata. Backpatch to 12, same as the fix.

Author: Yugo Nagata
Backpatch-through: 12
Discussion: https://postgr.es/m/CAJTYsWWNkCt+-UnMhg=BiCD3Mh8c2JdHLofPxsW3m2dkDFw8RA@mail.gmail.com
2024-08-19 13:31:51 +02:00
Daniel Gustafsson
4fdb6558c2 Fix regression in TLS session ticket disabling
Commit 274bbced disabled session tickets for TLSv1.3 on top of the
already disabled TLSv1.2 session tickets, but accidentally caused
a regression where TLSv1.2 session tickets were incorrectly sent.
Fix by unconditionally disabling TLSv1.2 session tickets and only
disable TLSv1.3 tickets when the right version of OpenSSL is used.

Backpatch to all supported branches.

Reported-by: Cameron Vogt <cvogt@automaticcontrols.net>
Reported-by: Fire Emerald <fire.github@gmail.com>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/DM6PR16MB3145CF62857226F350C710D1AB852@DM6PR16MB3145.namprd16.prod.outlook.com
Backpatch-through: v12
2024-08-19 12:55:11 +02:00
Thomas Munro
2724ff381a Fix harmless LC_COLLATE[_MASK] confusion.
Commit ca051d8b101 called newlocale(LC_COLLATE, ...) instead of
newlocale(LC_COLLATE_MASK, ...), in code reached only on FreeBSD.  They
have the same value on that OS, explaining why it worked.  Fix.
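
The distinction, sketched with plain POSIX (not the actual call site):

    #include <locale.h>

    /* newlocale() takes a category *mask*; LC_COLLATE is the category number
     * and only happens to equal LC_COLLATE_MASK on FreeBSD. */
    static locale_t
    make_collation_locale(const char *name)
    {
        return newlocale(LC_COLLATE_MASK, name, (locale_t) 0);
    }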

Back-patch to 14, where ca051d8b101 landed.
2024-08-19 22:12:55 +12:00
Heikki Linnakangas
56d23855c8 Fix garbled process name on backend crash
The log message on backend crash used the wrong variable, which could be
uninitialized. Introduced in commit 28a520c0b7.

Reported-by: Alexander Lakhin
Discussion: https://www.postgresql.org/message-id/451b0797-83b8-cdbc-727f-8d7a7b0e3bca@gmail.com
2024-08-19 09:48:25 +03:00
Michael Paquier
bd06cc338d Fix more holes with SLRU code in need of int64 for segment numbers
This is a continuation of c9e24573905b, containing changes included in
the proposed patch that were missed in the actual commit.  I managed to
miss these diffs while rebasing the original patch.

Thanks to Noah Misch, Peter Eisentraut and Alexander Korotkov for the
pokes.

Discussion: https://postgr.es/m/92fe572d-638e-4162-aef6-1c42a2936f25@eisentraut.org
Discussion: https://postgr.es/m/20240810175055.cd.nmisch@google.com
Backpatch-through: 17
2024-08-19 12:34:18 +09:00
Alvaro Herrera
7b063ff26a
Search for SLRU page only in its own bank
One of the two slot scans in SlruSelectLRUPage was not walking only the
slots in the specific bank where the buffer could be; change it to do
that.

Oversight in 53c2a97a9266.

Author: Sergey Sargsyan <sergey.sargsyan.2001@gmail.com>
Discussion: https://postgr.es/m/18582-5f301dd30ba91a38@postgresql.org
2024-08-18 20:49:57 -04:00
Michael Paquier
2793acecee injection_points: Add stats for point caching and loading
This adds two counters to the fixed-numbered stats of injection points
to track the number of times injection points have been cached and
loaded from the cache, as of the additions coming from a0a5869a8598 and
4b211003ecc2.

These should have been part of f68cd847fa40, but I have lacked time and
energy back then, and it did not prevent the code to be a useful
template.

While on it, this commit simplifies the description of a few tests while
adding coverage for the new stats data.

Author: Yogesh Sharma
Discussion: https://postgr.es/m/3a6977f7-54ab-43ce-8806-11d5e15526a2@catprosystems.com
2024-08-19 09:03:52 +09:00
Thomas Munro
b10528e6cc ci: Upgrade MacPorts version to 2.10.1.
MacPorts version 2.9.3 started failing in our ci_macports_packages.sh
script, for reasons not fully determined, but plausibly linked to the
release of 2.10.1.  2.10.1 seems to work, so let's switch to it.

Back-patch to 15, where CI began.

Reported-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/81f104e8-f0a9-43c0-85bd-2bbbf590a5b8%40eisentraut.org
2024-08-19 11:47:37 +12:00
Michael Paquier
a5f4ff6c80 doc: Fix typo in section for custom pgstats
Per offline report from Erik Rijkers.
2024-08-19 07:53:47 +09:00
Tomas Vondra
0f92b230f8 Fix DROP DATABASE for databases with many ACLs
Commit c66a7d75e652 modified DROP DATABASE so that if interrupted, the
database is known to be in an invalid state and can only be dropped.
This is done by setting a flag using an in-place update, so that it's
not lost in case of rollback.

For databases with many ACLs, this may however fail like this:

  ERROR:  wrong tuple length

This happens because with many ACLs, the pg_database.datacl attribute
gets TOASTed. The dropdb() code reads the tuple from the syscache, which
means it's detoasted. But the in-place update expects the tuple length
to match the on-disk tuple.

Fixed by reading the tuple from the catalog directly, not from syscache.

Report and fix by Ayush Tiwari. Backpatch to 12. The DROP DATABASE fix
was backpatched to 11, but 11 is EOL at this point.

Reported-by: Ayush Tiwari
Author: Ayush Tiwari
Reviewed-by: Tomas Vondra
Backpatch-through: 12
Discussion: https://postgr.es/m/CAJTYsWWNkCt+-UnMhg=BiCD3Mh8c2JdHLofPxsW3m2dkDFw8RA@mail.gmail.com
2024-08-19 00:04:48 +02:00
Thomas Munro
d426718d8d Fix cpluspluscheck for pg_verifybackup.h.
simplehash.h references pg_fatal(), which cpluspluscheck says is
undeclared, causing the CI CompilerWarnings task to fail since commit
aa2d6b15.  Include the header it needs.

Discussion: https://postgr.es/m/CA%2BhUKGJC3d4PXkErpfOWrzQqcq6MLiCv0%2BAH0CMQnB6hdLUFEw%40mail.gmail.com
2024-08-19 07:59:30 +12:00
Noah Misch
64740853f0 Fix comments on wal_level=minimal, CREATE TABLESPACE and CREATE DATABASE.
Commit 97ddda8a82ac470ae581d0eb485b6577707678bc removed the rmtree()
behavior from XLOG_TBLSPC_CREATE, obsoleting that part of the comment.
The comment's point about XLOG_DBASE_CREATE was wrong when commit
fa0f466d5329e10b16f3b38c8eaf5306f7e234e8 introduced the point.  (It
would have been accurate if that commit had predated commit
fbcbc5d06f53aea412130deb52e216aa3883fb8d introducing the second
checkpoint of CREATE DATABASE.)  Nothing can skip log_smgrcreate() on
the basis of wal_level=minimal, so don't comment on that.

Commit c6b92041d38512a4176ed76ad06f713d2e6c01a8 expanded WAL skipping
from five specific operations to relfilenodes generally, hence the
CreateDatabaseUsingFileCopy() comment change.

Discussion: https://postgr.es/m/20231008022204.cc@rfd.leadboat.com
2024-08-18 12:03:59 -07:00
Bruce Momjian
03e9b958ee docs: fix incorrect plpgsql error message
Change "$1" to "username".

Reported-by: philipp.salvisberg@gmail.com

Discussion: https://postgr.es/m/172112109590.736590.12219129462878821880@wrigleys.postgresql.org

Backpatch-through: 12
2024-08-16 22:50:54 -04:00
Bruce Momjian
151da217a3 C comment: fix for commit b5a9b18cd0b
The commit was "Provide API for streaming relation data.".

Reported-by: Nazir Bilal Yavuz

Discussion: https://postgr.es/m/CAN55FZ3KsZ2faZs1sK5J0W+_8B3myB232CfLYGie4u4BBMwP3g@mail.gmail.com

Backpatch-through: master
2024-08-16 21:12:18 -04:00
David Rowley
bd8fe12ef3 Relocate a badly placed Assert in COPY FROM code
There's not much point in asserting a pointer isn't NULL after some code
has already dereferenced that pointer.

Adjust the code so that the Assert occurs before the pointer dereference.

The Assert probably has questionable value in the first place, but it
seems worth keeping around to document the contract between
CopyMultiInsertInfoNextFreeSlot() and its callers.
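
The general shape of the fix, as a standalone sketch (using assert() in
place of Assert(), with an illustrative struct):

    #include <assert.h>

    typedef struct { int nused; } SlotBuffer;   /* illustrative only */

    static void
    add_row(SlotBuffer *buf)
    {
        assert(buf != NULL);    /* must come before the first dereference */
        buf->nused++;
    }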

Author: Amul Sul <sulamul@gmail.com>
Discussion: https://postgr.es/m/CAAJ_b94hXQzXaJxTLShkxQUgezf_SUxhzX9TH2f-g6gP7bne7g@mail.gmail.com
2024-08-17 10:36:23 +12:00
Nathan Bossart
1d80d6b50e Further reduce dependence on -fwrapv semantics in jsonb.
Commit 108d2adb9e missed updating a few places in the jsonb code
that rely on signed integer wrapping for correctness.  These can
also be fixed by using pg_abs_s32() to negate a signed integer
(that is known to be negative) for comparison with an unsigned
integer.

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/bfff906f-300d-81ea-83b7-f2c93845e7f2%40gmail.com
2024-08-16 15:06:40 -05:00
Robert Haas
aa2d6b15d6 pg_verifybackup: Move some declarations to new pg_verifybackup.h
This is in preparation for adding a second source file to this
directory.

Amul Sul, reviewed by Sravan Kumar and revised a bit by me.

Discussion: http://postgr.es/m/CAAJ_b95mcGjkfAf1qduOR97CokW8-_i-dWLm3v6x1w2-OW9M+A@mail.gmail.com
2024-08-16 15:09:42 -04:00
Robert Haas
af99d44a88 pg_verifybackup: Move skip_checksums into verifier_context.
This is in preparation for adding a second source file to this
directory. It will need access to this value. Also, fewer global
variables is usually a good thing.

Amul Sul, reviewed by Sravan Kumar and revised a bit by me.

Discussion: http://postgr.es/m/CAAJ_b95mcGjkfAf1qduOR97CokW8-_i-dWLm3v6x1w2-OW9M+A@mail.gmail.com
2024-08-16 14:52:52 -04:00
Robert Haas
76dd015e85 Improve more comments in astreamer_gzip.c.
Duplicate the comment from astreamer_plain_writer_new instead of just
referring to it. Add a further note to mention that there are dangers
if anything else is written to the same FILE. Also add a comment where
we dup() the filehandle, referring to the existing comment in
astreamer_gzip_writer_finalize(), because the dup() looks wrong on
first glance without that comment to clarify.

Per concerns expressed by Tom Lane on pgsql-security, and using
some wording suggested by him.

Discussion: http://postgr.es/m/CA+TgmoYTFAD0YTh4HC1Nuhn0YEyoQi0_CENFgVzAY_YReiSksQ@mail.gmail.com
2024-08-16 13:45:23 -04:00
Alvaro Herrera
b8b3f861fb
libpq: Trace all messages received from the server
Not all messages that libpq received from the server would be sent
through our message tracing logic.  This commit tries to fix that by
introducing a new function pqParseDone which makes it harder to forget
to do so.

The messages that we now newly send through our tracing logic are:

- CopyData (received by COPY TO STDOUT)
- Authentication requests
- NegotiateProtocolVersion
- Some ErrorResponse messages during connection startup
- ReadyForQuery when received after a FunctionCall message

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/CAGECzQSoPHtZ4xe0raJ6FYSEiPPS+YWXBhOGo+Y1YecLgknF3g@mail.gmail.com
2024-08-16 13:23:18 -04:00
Tom Lane
6be39d77a7 Fix extraction of week and quarter fields from intervals.
"EXTRACT(WEEK FROM interval_value)" formerly threw an error.
Define it as "tm->tm_mday / 7".  (With C99 division semantics,
this gives consistent results for negative intervals.)

"EXTRACT(QUARTER FROM interval_value)" has been implemented
all along, but it formerly gave extremely strange results for
negative intervals.  Fix it so that the output for -N months
is the negative of the output for N months.
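
A tiny standalone illustration of why truncating division keeps the result
symmetric when the input is negated:

    #include <stdio.h>

    int
    main(void)
    {
        int days = 17;

        printf("%d\n", days / 7);       /* 2 */
        printf("%d\n", (-days) / 7);    /* -2: C99 truncates toward zero */
        return 0;
    }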

Per bug #18348 from Michael Bondarenko and subsequent discussion.

Discussion: https://postgr.es/m/18348-b097a3587dfde8a4@postgresql.org
2024-08-16 12:35:53 -04:00
Nathan Bossart
108d2adb9e Remove dependence on -fwrapv semantics in jsonb.
This commit updates a couple of places in the jsonb code to no
longer rely on signed integer wrapping for correctness.  Like
commit 9e9a2b7031, this is intended to move us closer towards
removing -fwrapv, which may enable some compiler optimizations.
However, there is presently no plan to actually remove that
compiler option in the near future.

This commit makes use of the newly introduced pg_abs_s32() routine
to negate a signed integer (that is known to be negative) for
comparison with an unsigned integer.  In passing, change one use of
INT_MIN to the more portable PG_INT32_MIN.

Reported-by: Alexander Lakhin
Author: Joseph Koshakow
Reviewed-by: Jian He
Discussion: https://postgr.es/m/CAAvxfHdBPOyEGS7s%2Bxf4iaW0-cgiq25jpYdWBqQqvLtLe_t6tw%40mail.gmail.com
2024-08-16 11:24:44 -05:00
Peter Eisentraut
95b856de23 Remove incidental md5() function use from test
To allow test to pass in OpenSSL FIPS mode, similar to 657f5f223e, for
a new test that has been added since.

Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://www.postgresql.org/message-id/86763810-70a1-4872-8ba7-1676f788e5a2@eisentraut.org
2024-08-16 17:14:32 +02:00
Heikki Linnakangas
077ad4bd76 Relax fsyncing at end of a bulk load that was not WAL-logged
And improve the comments.

Backpatch to v17 where this was introduced.

Reviewed-by: Noah Misch
Discussion: https://www.postgresql.org/message-id/cac7d1b6-8358-40be-af0b-21bc9b27d34c@iki.fi
2024-08-16 14:45:37 +03:00
Heikki Linnakangas
3943da46bc Refactor CopyOneRowTo
The handling of binary and text formats are quite different here, so
it's more clear to check for the format first and have two separate
loops.

Author: jian he <jian.universality@gmail.com>
Reviewed-by: Ilia Evdokimov, Junwang Zhao
Discussion: https://www.postgresql.org/message-id/CACJufxFzHCeFBQF0M%2BSgk_NwknWuQ4oU7tS1isVeBrbhcKOHkg@mail.gmail.com
2024-08-16 13:48:10 +03:00
Heikki Linnakangas
1153422eda Remove unused 'cur_skey' argument from IndexScanOK()
Commit a78fcfb51243 removed the last use of it.

Author: Hugo Zhang, Aleksander Alekseev
Reviewed-by: Daniel Gustafsson
Discussion: https://www.postgresql.org/message-id/NT0PR01MB129459E243721B954611938F9CDD2%40NT0PR01MB1294.CHNPR01.prod.partner.outlook.cn
2024-08-16 13:13:43 +03:00
Peter Eisentraut
e882bcae03 libpq: Fix minor TOCTOU violation
libpq checks the permissions of the password file before opening it.
Because this is done in two separate operations, a static analyzer
would flag it as a time-of-check-time-of-use violation.  In practice,
you can't do anything with that, but it still seems better style to fix
it.

To fix it, open the file first and then check the permissions on the
opened file handle.
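
The open-then-check pattern, as a generic POSIX sketch (not the libpq code
itself):

    #include <stdio.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>

    static FILE *
    open_private_file(const char *path)
    {
        int         fd = open(path, O_RDONLY);
        struct stat st;

        if (fd < 0)
            return NULL;
        /* check permissions on the handle we actually opened */
        if (fstat(fd, &st) != 0 ||
            !S_ISREG(st.st_mode) ||
            (st.st_mode & (S_IRWXG | S_IRWXO)) != 0)
        {
            close(fd);
            return NULL;
        }
        return fdopen(fd, "r");
    }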

Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/a3356054-14ae-4e7a-acc6-249d19dac20b%40eisentraut.org
2024-08-16 06:41:17 +02:00
Alexander Korotkov
e3ec9dc1bf Add missing wait_for_catchup() to pg_visibility tap test
e2ed7e32271a introduced check of pg_visibility on standby.  This commit adds
missing wait_for_catchup() to synchronize standby before querying it.
2024-08-16 00:58:32 +03:00
Alexander Korotkov
e2ed7e3227 Fix GetStrictOldestNonRemovableTransactionId() on standby
e85662df44 implemented GetStrictOldestNonRemovableTransactionId() function
for computation of xid horizon that avoid reporting of false errors.
However, GetStrictOldestNonRemovableTransactionId() uses
GetRunningTransactionData() even on standby leading to an assertion failure.

Given that we decided to ignore KnownAssignedXids and standby can't have
own running xids, we switch to use TransamVariables->nextXid as a xid horizon.

Also, revise the comment regarding ignoring KnownAssignedXids with more
detailed reasoning provided by Heikki.

Reported-by: Heikki Linnakangas
Discussion: https://postgr.es/m/42218c4f-2c8d-40a3-8743-4d34dd0e4cce%40iki.fi
Reviewed-by: Heikki Linnakangas
2024-08-16 00:18:55 +03:00
Nathan Bossart
9e9a2b7031 Remove dependence on -fwrapv semantics in a few places.
This commit attempts to update a few places, such as the money,
numeric, and timestamp types, to no longer rely on signed integer
wrapping for correctness.  This is intended to move us closer
towards removing -fwrapv, which may enable some compiler
optimizations.  However, there is presently no plan to actually
remove that compiler option in the near future.

Besides using some of the existing overflow-aware routines in
int.h, this commit introduces and makes use of some new ones.
Specifically, it adds functions that accept a signed integer and
return its absolute value as an unsigned integer with the same
width (e.g., pg_abs_s64()).  It also adds functions that accept an
unsigned integer, store the result of negating that integer in a
signed integer with the same width, and return whether the negation
overflowed (e.g., pg_neg_u64_overflow()).

Finally, this commit adds a couple of tests for timestamps near
POSTGRES_EPOCH_JDATE.
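
The comparison pattern, sketched outside the tree (pg_abs_s32 here mimics
the helper in src/include/common/int.h):

    #include <stdbool.h>
    #include <stdint.h>

    static inline uint32_t
    pg_abs_s32(int32_t x)
    {
        /* cast first, negate in unsigned arithmetic: defined even for INT32_MIN */
        return (x < 0) ? (0 - (uint32_t) x) : (uint32_t) x;
    }

    /* e.g. does a negative offset stay within an unsigned element count? */
    static inline bool
    neg_offset_in_range(int32_t off, uint32_t count)
    {
        return off < 0 && pg_abs_s32(off) <= count;
    }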

Author: Joseph Koshakow
Reviewed-by: Tom Lane, Heikki Linnakangas, Jian He
Discussion: https://postgr.es/m/CAAvxfHdBPOyEGS7s%2Bxf4iaW0-cgiq25jpYdWBqQqvLtLe_t6tw%40mail.gmail.com
2024-08-15 15:47:31 -05:00
Tom Lane
ad89d71978 Add 97add39c0 to .git-blame-ignore-revs. 2024-08-15 11:43:55 -04:00
Tom Lane
97add39c03 Clean up indentation and whitespace inconsistencies in ecpg.
ecpg's lexer and parser files aren't normally processed by
pgindent, and unsurprisingly there's a lot of code in there
that doesn't really match project style.  I spent some time
running pgindent over the fragments of these files that are
C code, and this is the result.  This is in the same spirit
as commit 30ed71e42, though apparently Peter used a different
method for that one, since it didn't find these problems.

Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us
2024-08-15 11:41:46 -04:00
Robert Haas
516b87502d Do not hardcode PG_PROTOCOL_LATEST in NegotiateProtocolVersion
We shouldn't ask the client to use a protocol version later than the
one that they requested. To avoid that, if the client requests a
version newer than the latest one we support, set FrontendProtocol
to the latest version we support, not the requested version. Then,
use that value when building the NegotiateProtocolVersion message.
(It seems good on general principle to avoid setting FrontendProtocol
to a version we don't support, anyway.)
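
In sketch form (with simplified stand-ins for the real pqcomm.h macros; this is not the committed code), the rule described above is just a clamp:

    #include <stdint.h>

    typedef uint32_t ProtocolVersion;

    /* Simplified stand-ins; the real definitions live in pqcomm.h. */
    #define PG_PROTOCOL(m, n)   (((m) << 16) | (n))
    #define PG_PROTOCOL_LATEST  PG_PROTOCOL(3, 0)

    /* Never advertise a version newer than what we support, and never
     * newer than what the client asked for. */
    static ProtocolVersion
    choose_protocol_version(ProtocolVersion requested)
    {
        return (requested > PG_PROTOCOL_LATEST) ? PG_PROTOCOL_LATEST : requested;
    }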

None of this really matters right now, because we only support a
single protocol version, but if that ever changes, we'll need this.

Jelte Fennema-Nio, reviewed by me and incorporating some of my
proposed wording

Discussion: https://postgr.es/m/CAGECzQTyXDNtMXdq2L-Wp=OvOCPa07r6+U_MGb==h90MrfT+fQ@mail.gmail.com
2024-08-15 10:44:15 -04:00
Dean Rasheed
8dc28d7eb8 Optimise numeric multiplication using base-NBASE^2 arithmetic.
Currently mul_var() uses the schoolbook multiplication algorithm,
which is O(n^2) in the number of NBASE digits. To improve performance
for large inputs, convert the inputs to base NBASE^2 before
multiplying, which effectively halves the number of digits in each
input, theoretically speeding up the computation by a factor of 4. In
practice, the actual speedup for large inputs varies between around 3
and 6 times, depending on the system and compiler used. In turn, this
significantly reduces the runtime of the numeric_big regression test.
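
A toy sketch of the digit-combining idea (not mul_var() itself): pairs of base-NBASE digits are folded into base-NBASE^2 values so each partial product fits comfortably in 64 bits.  Carry propagation and the conversion back to base NBASE are omitted, and even digit counts are assumed for brevity.

    #include <stdint.h>

    #define NBASE     10000
    #define NBASE_SQ  100000000     /* NBASE^2 */

    /* a and b are most-significant-first arrays of base-NBASE digits with
     * even lengths na and nb; result has (na + nb) / 2 zero-initialized
     * base-NBASE^2 slots. */
    static void
    mul_base_nbase_sq(const int16_t *a, int na, const int16_t *b, int nb,
                      int64_t *result)
    {
        int     na2 = na / 2;
        int     nb2 = nb / 2;

        for (int i = 0; i < na2; i++)
        {
            int64_t ai = (int64_t) a[2 * i] * NBASE + a[2 * i + 1];

            for (int j = 0; j < nb2; j++)
            {
                int64_t prod = ai * ((int64_t) b[2 * j] * NBASE + b[2 * j + 1]);

                result[i + j] += prod / NBASE_SQ;       /* high half */
                result[i + j + 1] += prod % NBASE_SQ;   /* low half */
            }
        }
        /* A real implementation defers and then propagates carries, and
         * converts the result back to base NBASE. */
    }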

For this to work, 64-bit integers are required for the products of
base-NBASE^2 digits, so this works best on 64-bit machines, on which
it is faster whenever the shorter input has more than 4 or 5 NBASE
digits. On 32-bit machines, the additional overheads, especially
during carry propagation and the final conversion back to base-NBASE,
are significantly higher, and it is only faster when the shorter input
has more than around 50 NBASE digits. When the shorter input has more
than 6 NBASE digits (so that mul_var_short() cannot be used), but
fewer than around 50 NBASE digits, there may be a noticeable slowdown
on 32-bit machines. That seems to be an acceptable tradeoff, given the
performance gains for other inputs, and the effort that would be
required to maintain code specifically targeting 32-bit machines.

Joel Jacobson and Dean Rasheed.

Discussion: https://postgr.es/m/9d8a4a42-c354-41f3-bbf3-199e1957db97%40app.fastmail.com
2024-08-15 10:36:17 +01:00
Dean Rasheed
c4e44224cf Extend mul_var_short() to 5 and 6-digit inputs.
Commit ca481d3c9a introduced mul_var_short(), which is used by
mul_var() whenever the shorter input has 1-4 NBASE digits and the
exact product is requested. As speculated on in that commit, it can be
extended to work for more digits in the shorter input. This commit
extends it up to 6 NBASE digits (up to 24 decimal digits), for which
it also gives a significant speedup. This covers more cases likely to
occur in real-world queries, for which using base-NBASE^2 arithmetic
provides little benefit.

To avoid code bloat and duplication, refactor it a bit using macros
and exploiting the fact that some portions of the code are shared
between the different cases.

Dean Rasheed, reviewed by Joel Jacobson.

Discussion: https://postgr.es/m/9d8a4a42-c354-41f3-bbf3-199e1957db97%40app.fastmail.com
2024-08-15 10:33:12 +01:00
Peter Eisentraut
fce7cb6da0 Variable renaming in dbcommands.c
There were several sets of very similar local variable names, such as
"downer" and "dbowner", which was very confusing and error-prone.
Rename the former to "ownerEl" and so on, similar to collationcmds.c
and typecmds.c.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/e5bce225-ee04-40c7-a280-ea7214318048%40eisentraut.org
2024-08-15 07:08:12 +02:00
Jeff Davis
a3c6aa42ee Fix doc typo: unicode_assigned() return type.
Reported-by: Hironobu SUZUKI
Discussion: https://postgr.es/m/5dd88820-bb00-4b90-904b-738ea2e4ee2e@interdb.jp
Backpatch-through: 17
2024-08-14 19:07:09 -07:00
David Rowley
80ffcb8427 Improve ALTER PUBLICATION validation and error messages
Attempting to add a system column for a table to an existing publication
would result in the not very intuitive error message of:

ERROR:  negative bitmapset member not allowed

Here we improve that to have it display the same error message as a user
would see if they tried adding a system column for a table when adding
it to the publication in the first place.

Doing this requires making the function which validates the list of
columns an extern function.  The signature of the static function wasn't
an ideal external API as it made the code more complex than it needed to be.
Here we adjust the function to have it populate a Bitmapset of attribute
numbers.  Doing it this way allows code simplification.

There was no particular bug here other than the weird error message, so
no backpatch.

Bug: #18558
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Peter Smith, David Rowley
Discussion: https://postgr.es/m/18558-411bc81b03592125@postgresql.org
2024-08-15 13:10:25 +12:00
Nathan Bossart
ef6e028f05 Add a couple of recent commits to .git-blame-ignore-revs. 2024-08-14 14:25:54 -05:00
Alvaro Herrera
a5c6b8f22c
libpq: Trace responses to SSLRequest and GSSENCRequest
Since these are single bytes instead of v2 or v3 messages they need
custom tracing logic.  These "messages" don't even have official names
in the protocol specification, so I (Jelte) called them SSLResponse and
GSSENCResponse here.

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/CAGECzQSoPHtZ4xe0raJ6FYSEiPPS+YWXBhOGo+Y1YecLgknF3g@mail.gmail.com
2024-08-14 14:53:55 -04:00
Peter Eisentraut
5304fec4d8 Apply PGDLLIMPORT markings to some GUC variables
According to the commit message of 8ec569479, all variables in header
files must be marked with PGDLLIMPORT. In commit d3cc5ffe81f6 some
variables were moved from launch_backend.c to several header files.

This adds PGDLLIMPORT to the moved variables.

Author: Sofia Kopikova <s.kopikova@postgrespro.ru>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/e0b17014-5319-4dd6-91cd-93d9c8fc9539%40postgrespro.ru
2024-08-14 11:36:12 +02:00
Peter Eisentraut
c8e2d422fd Remove TRACE_SORT macro
The TRACE_SORT macro guarded the availability of the trace_sort GUC
setting.  But it has been enabled by default ever since it was
introduced in PostgreSQL 8.1, and there have been no reports that
someone wanted to disable it.  So just remove the macro to simplify
things.  (For the avoidance of doubt: The trace_sort GUC is still
there.  This only removes the rarely-used macro guarding it.)

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://www.postgresql.org/message-id/flat/be5f7162-7c1d-44e3-9a78-74dcaa6529f2%40eisentraut.org
2024-08-14 08:07:52 +02:00
Thomas Munro
bf3401fe81 Harmonize MinGW CODESET lookup with MSVC.
Historically, MinGW environments lacked some Windows API calls, so we
took a different code path in win32_langinfo().  Somehow, the code
change in commit 35eeea62 (removing setlocale() calls) caused one
particular 001_initdb.pl test to fail on MinGW + ICU builds, because
pg_import_system_collations() found no collations.  It might take a
MinGW user to discover the exact reason.

Updating that function to use the same code as MSVC seems to fix that
test, so let's do that.  (There are plenty more places that test for MSVC
unnecessarily, to be investigated later.)

While here, also rename the helper function win32_langinfo() to
win32_get_codeset(), to explain what it does less confusingly; it's not
really a general langinfo() substitute.

Noticed by triggering the optional MinGW CI task; no build farm animals
failed.

Discussion: https://postgr.es/m/CA%2BhUKGKBWfhXQ3J%2B2Lj5PhKvQnGD%3DsywA0XQcb7boTCf%3DerVLg%40mail.gmail.com
2024-08-14 15:04:14 +12:00
Masahiko Sawada
4c1b4cdb86 Add resource statistics reporting to ANALYZE VERBOSE.
Previously, log_autovacuum_min_duration utilized dedicated code for
logging resource statistics, such as system and buffer usage during
autoanalyze. However, this logging functionality was not utilized by
ANALYZE VERBOSE.

This commit adds resource statistics reporting to ANALYZE VERBOSE by
reusing the same logging code as autoanalyze.

Author: Anthonin Bonnefoy
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/CAO6_Xqr__kTTCLkftqS0qSCm-J7_xbRG3Ge2rWhucxQJMJhcRA%40mail.gmail.com
2024-08-13 19:23:56 -07:00
Masahiko Sawada
c584781bcc Use pgBufferUsage for buffer usage tracking in analyze.
Previously, (auto)analyze used global variables VacuumPageHit,
VacuumPageMiss, and VacuumPageDirty to track buffer usage. However,
pgBufferUsage provides a more generic way to track buffer usage with
support functions.

This change replaces those global variables with pgBufferUsage in
analyze. Since analyze was the sole user of those variables, it
removes their declarations. Vacuum previously used those variables but
replaced them with pgBufferUsage as part of a bug fix, commit
5cd72cc0c.

Additionally, it adjusts the buffer usage message in both vacuum and
analyze for better consistency.

Author: Anthonin Bonnefoy
Reviewed-by: Masahiko Sawada, Michael Paquier
Discussion: https://postgr.es/m/CAO6_Xqr__kTTCLkftqS0qSCm-J7_xbRG3Ge2rWhucxQJMJhcRA%40mail.gmail.com
2024-08-13 18:49:45 -07:00
Thomas Munro
2488058dc3 Include <xlocale.h> for macOS, take II.
Fix typo in macro name.

Discussion: https://postgr.es/m/CA%2BhUKG%2Bk-o3N_SyNJNJpAcdtMo_HheN30miAeXehk9yw%3D9WYzA%40mail.gmail.com
2024-08-13 23:43:04 +12:00
Thomas Munro
52ea7f0e05 Include <xlocale.h> for older macOS.
Commit 35eeea62 forgot to include <xlocale.h> when using locale_t
(which didn't seem to be required on the newer Apple SDK used by CI,
hence the mistake).  Let's see if this fixes build farm animals longfin and
sifika.
2024-08-13 23:02:05 +12:00
Thomas Munro
35eeea6230 Use thread-safe nl_langinfo_l(), not nl_langinfo().
This gets rid of some setlocale() calls.  The remaining call to
setlocale() in pg_get_encoding_from_locale() is a query of the name
of the current locale when none was provided (in a multi-threaded future
that would need more work).

All known non-Windows targets have nl_langinfo_l(), from POSIX 2008, and
for Windows we already do something thread-safe.
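
A minimal POSIX 2008 example of the thread-safe variant (illustrative and independent of the PostgreSQL sources; the locale name is an assumption):

    #include <langinfo.h>
    #include <locale.h>
    #include <stdio.h>

    int
    main(void)
    {
        /* Query CODESET against an explicit locale_t instead of the
         * process-global locale set with setlocale(). */
        locale_t    loc = newlocale(LC_CTYPE_MASK, "en_US.UTF-8", (locale_t) 0);

        if (loc == (locale_t) 0)
            return 1;
        printf("codeset: %s\n", nl_langinfo_l(CODESET, loc));
        freelocale(loc);
        return 0;
    }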

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/CA%2BhUKGJqVe0%2BPv9dvC9dSums_PXxGo9SWcxYAMBguWJUGbWz-A%40mail.gmail.com
2024-08-13 22:34:53 +12:00
Thomas Munro
14c648ff00 All POSIX systems have langinfo.h and CODESET.
We don't need configure probes for HAVE_LANGINFO_H (it is implied by
!WIN32), and we don't need to consider systems that have it but don't
define CODESET (that was for OpenBSD in commit 81cca218, but it has now
had it for 19 years).

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/CA%2BhUKGJqVe0%2BPv9dvC9dSums_PXxGo9SWcxYAMBguWJUGbWz-A%40mail.gmail.com
2024-08-13 22:13:52 +12:00
Peter Eisentraut
93660d1c27 Use errmsg_internal for debug messages
Some newer code was applying this inconsistently.
2024-08-13 10:01:49 +02:00
Peter Eisentraut
a67a49648d Rename C23 keyword
constexpr is a keyword in C23.  Rename a conflicting identifier for
future-proofing.

Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/08abc832-1384-4aca-a535-1a79765b565e%40eisentraut.org
2024-08-13 06:15:28 +02:00
Alvaro Herrera
ea92f3a0a5
libpq: Trace frontend authentication challenges
If tracing was enabled during connection startup, these messages would
previously be listed in the trace output as something like this:

F	54	Unknown message: 70
mismatched message length: consumed 4, expected 54

With this commit their type and contents are now correctly listed:

F	36	StartupMessage	 3 0 "user" "foo" "database" "alvherre"
F	54	SASLInitialResponse	 "SCRAM-SHA-256" 32 'n,,n=,r=nq5zEPR/VREHEpOAZzH8Rujm'
F	108	SASLResponse	 'c=biws,r=nq5zEPR/VREHEpOAZzH8RujmVtWZDQ8glcrvy9OMNw7ZqFUn,p=BBwAKe0WjSvigB6RsmmArAC+hwucLeuwJrR5C/HQD5M='

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAGECzQSoPHtZ4xe0raJ6FYSEiPPS+YWXBhOGo+Y1YecLgknF3g@mail.gmail.com
2024-08-12 19:12:54 -04:00
Alvaro Herrera
12d6c727ca
Fix nls.mk to reflect astreamer files relocation
In the recent commit f80b09bac8, astreamer files were moved to another
directory, but this change was not reflected in nls.mk.  This commit
corrects that oversight.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20240806.102123.648178476296575604.horikyota.ntt@gmail.com
2024-08-12 18:42:18 -04:00
Alvaro Herrera
c899c6839f
Fix creation of partition descriptor during concurrent detach+drop
If a partition undergoes DETACH CONCURRENTLY immediately followed by
DROP, this could cause a problem for a concurrent transaction
recomputing the partition descriptor when running a prepared statement,
because it tries to dereference a pointer to a tuple that's not found in
a catalog scan.

The existing retry logic added in commit dbca3469ebf8 is sufficient to
cope with the overall problem, provided we don't try to dereference a
non-existent heap tuple.

Arguably, the code in RelationBuildPartitionDesc() has been wrong all
along, since no check was added in commit 898e5e3290a7 against receiving
a NULL tuple from the catalog scan; that bug has only become
user-visible with DETACH CONCURRENTLY which was added in branch 14.
Therefore, even though there's no known mechanism to cause a crash
because of this, backpatch the addition of such a check to all supported
branches.  In branches prior to 14, this would cause the code to fail
with a "missing relpartbound for relation XYZ" error instead of
crashing; that's okay, because there are no reports of such behavior
anyway.

Author: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/18559-b48286d2eacd9a4e@postgresql.org
2024-08-12 18:17:56 -04:00
Jeff Davis
a459ac504c Remove unnecessary check for NULL locale, per Coverity.
Discussion: https://postgr.es/m/3804933.1723394010@sss.pgh.pa.us
Reported-by: Tom Lane
2024-08-12 12:26:23 -07:00
Peter Geoghegan
1343ae954c Give nbtree move right function internal linkage.
Declare _bt_moveright() static.  This is a minor modularity win; the
routine was already private to nbtsearch.c for all practical purposes.

Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAEze2WgWVzCNEXQB_op5MMZMDgJ3fg3AhVm6bq2iZPpJNXGhWw@mail.gmail.com
2024-08-12 14:36:55 -04:00
Tom Lane
2aecbd7526 Log more info when wait-for-catchup tests time out.
Cluster.pm's wait_for_catchup and allied subroutines don't provide
enough information to diagnose the problem when a wait times out.
In hopes of debugging some intermittent buildfarm failures, let's
dump the ending state of the relevant system view when that happens.

Add this to v17 too, but not stable branches.

Discussion: https://postgr.es/m/352068.1723422725@sss.pgh.pa.us
2024-08-12 13:18:36 -04:00
Nathan Bossart
760162fedb Add user-callable CRC functions.
We've had code for CRC-32 and CRC-32C for some time (for WAL
records, etc.), but there was no way for users to call it, despite
apparent popular demand.  The new crc32() and crc32c() functions
accept bytea input and return bigint (to avoid returning negative
values).

Bumps catversion.

Author: Aleksander Alekseev
Reviewed-by: Peter Eisentraut, Tom Lane
Discussion: https://postgr.es/m/CAJ7c6TNMTGnqnG%3DyXXUQh9E88JDckmR45H2Q%2B%3DucaCLMOW1QQw%40mail.gmail.com
2024-08-12 10:35:06 -05:00
David Rowley
313df8f5ad Fix outdated comments
A few fields in ResultRelInfo are now also used for MERGE.  Update the
comments to mention that.

Reported-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxH8-NvFhLcSZZTTW+1M9AfS4+SOTKmyPG7ZhzNvN=+NkA@mail.gmail.com
2024-08-12 23:41:13 +12:00
David Rowley
ffabb56c94 Fix a series of typos and outdated references
Author: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/c1d63754-cb85-2d8a-8409-bde2c4d2d04b@gmail.com
2024-08-12 23:27:09 +12:00
Heikki Linnakangas
8de5ca1dc9 Fix bad indentation introduced in commit f011e82c2c 2024-08-12 10:57:03 +03:00
Heikki Linnakangas
3354f85284 Consolidate postmaster code to launch background processes
Much of the code in process_pm_child_exit() to launch replacement
processes when one exits or when progressing to the next postmaster state
was unnecessary, because the ServerLoop will launch any missing
background processes anyway. Remove the redundant code and let
ServerLoop handle it.

In ServerLoop, move the code to launch all the processes to a new
subroutine, to group it all together.

Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://www.postgresql.org/message-id/8f2118b9-79e3-4af7-b2c9-bd5818193ca4@iki.fi
2024-08-12 10:04:26 +03:00
Peter Eisentraut
4eb5089e26 Remove dead code
After e9931bfb751, the locale argument of SB_lower_char() is never
NULL, so the branch that deals with NULL can be removed (similar to
how e9931bfb751 for example removed those branches in str_tolower()).

Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://www.postgresql.org/message-id/4f562d84-87f4-44dc-8946-01d6c437936f@eisentraut.org
2024-08-12 08:52:30 +02:00
Peter Eisentraut
f1976df5ea Remove fe_memutils from libpgcommon_shlib
libpq must not use palloc/pfree. It's not allowed to exit on allocation
failure, and mixing the frontend pfree with malloc is architecturally
unsound.

Remove fe_memutils from the shlib build entirely, to keep devs from
accidentally depending on it in the future.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/CAOYmi+=pg=W5L1h=3MEP_EB24jaBu2FyATrLXqQHGe7cpuvwyg@mail.gmail.com
2024-08-12 08:30:39 +02:00
Peter Eisentraut
94980c4567 Remove support for old realpath() API
The now-preferred way to call realpath() is to pass NULL as the
second argument and get back a malloc'ed result.  We still supported
the old way of providing our own buffer as the second argument, for
some platforms that didn't support the new way yet.  Those were only
Solaris before version 11 and some older AIX versions (7.1 and
newer appear to support the new variant).  We don't support those
platform versions anymore, so we can remove this extra code.
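
For reference, the now-preferred calling style (a standalone POSIX example, not PostgreSQL code):

    #include <stdio.h>
    #include <stdlib.h>

    int
    main(int argc, char **argv)
    {
        /* Pass NULL as the second argument and let realpath() allocate
         * a right-sized buffer; the caller frees it. */
        char   *resolved = realpath(argc > 1 ? argv[1] : ".", NULL);

        if (resolved == NULL)
        {
            perror("realpath");
            return 1;
        }
        printf("%s\n", resolved);
        free(resolved);
        return 0;
    }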

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://www.postgresql.org/message-id/flat/9e638b49-5c3f-470f-a392-2cbedb2f7855%40eisentraut.org
2024-08-12 08:04:35 +02:00
David Rowley
f0d1127595 Remove "parent" column from pg_backend_memory_contexts
32d3ed816 added the "path" column to pg_backend_memory_contexts to allow
a stable method of obtaining the parent MemoryContext of a given row in
the view.  Using the "path" column is now the preferred method of
obtaining the parent row.

Previously, any queries which were self-joining to this view using the
"name" and "parent" columns could get incorrect results due to the fact
that names are not unique.  Here we aim to explicitly break such queries
so that they can be corrected and use the "path" column instead.

It is possible that there are more innocent users of the parent column
that just need an indication of the parent and having to write out a
self-joining CTE may be an unnecessary hassle for those cases.  Let's
remove the column for now and see if anyone comes back with any
complaints.  This does seem like a good time to attempt to get rid of
the column as we still have around 1 year to revert this if someone comes
back with a valid complaint.  Plus this view is new to v14 and is quite
niche, so perhaps not many people will be affected.

Author: Melih Mutlu <m.melihmutlu@gmail.com>
Discussion: https://postgr.es/m/CAGPVpCT7NOe4fZXRL8XaoxHpSXYTu6GTpULT_3E-HT9hzjoFRA@mail.gmail.com
2024-08-12 15:42:16 +12:00
Peter Geoghegan
3f44959f47 Avoid unneeded nbtree backwards scan buffer locks.
Teach nbtree backwards scans to avoid relocking a just-read leaf page to
read its current left sibling link when it isn't truly necessary.  This
happened inside _bt_readnextpage whenever _bt_readpage had already
determined that there'll be no further matches to the left (or at least
none for the current primitive index scan, for a scan with array keys).

A new precheck inside _bt_readnextpage is all that we need to avoid
these useless lock acquisitions.  Arguably, using a precheck like this
was a missed opportunity for commit 2ed5b87f96, which taught nbtree to
drop leaf page pins early to avoid blocking cleanup by VACUUM.  Forwards
scans already managed to avoid relocking the page like this.

The optimization added by this commit is particularly helpful with
backwards scans that use array keys where the scan must perform multiple
primitive index scans.  Such backwards scans will now avoid a useless
leaf page re-lock at the end of each primitive index scan.

Note that this commit does not attempt to avoid needlessly re-locking a
leaf page that was just read when the scan must follow the leaf page's
left link.  That more ambitious optimization could work by stashing the
left link when the page is first read by a backwards scan, allowing the
subsequent _bt_readnextpage call to optimistically skip re-reading the
original page just to get a new copy of its left link.  For now we only
address cases where we don't care about our original page's left link.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=xgs7PojG=EUvhgadwENzu_mY_riNh-w9wFPsaS717ew@mail.gmail.com
2024-08-11 15:42:52 -04:00
Heikki Linnakangas
f011e82c2c Initialize HASHCTL differently, to suppress Coverity warning
Coverity complained that the hash_create() call might access
hash_table_ctl->hctl. That's a false alarm, hash_create() only
accesses that field when passed the HASH_SHARED_MEM flag. Try to
silence it by using a plain local variable instead of a const. That's
how the HASHCTL is initialized in all the other hash_create() calls.
2024-08-11 20:21:16 +03:00
Tom Lane
b2be5cb2ab Suppress Coverity warnings about Asserts in get_name_for_var_field.
Coverity thinks dpns->plan could be null at these points.  That
shouldn't really be possible, but it's easy enough to modify the
Asserts so they'd not core-dump if it were true.

These are new in b919a97a6.  Back-patch to v13; the v12 version
of the patch didn't have these Asserts.
2024-08-11 12:24:56 -04:00
Tom Lane
364de74cff Allow adjusting session_authorization and role in parallel workers.
The code intends to allow GUCs to be set within parallel workers
via function SET clauses, but not otherwise.  However, doing so fails
for "session_authorization" and "role", because the assign hooks for
those attempt to set the subsidiary "is_superuser" GUC, and that call
falls foul of the "not otherwise" prohibition.  We can't switch to
using GUC_ACTION_SAVE for this, so instead add a new GUC variable
flag GUC_ALLOW_IN_PARALLEL to mark is_superuser as being safe to set
anyway.  (This is okay because is_superuser has context PGC_INTERNAL
and thus only hard-wired calls can change it.  We'd need more thought
before applying the flag to other GUCs; but maybe there are other
use-cases.)  This isn't the prettiest fix perhaps, but other
alternatives we thought of would be much more invasive.

While here, correct a thinko in commit 059de3ca4: when rejecting
a GUC setting within a parallel worker, we should return 0 not -1
if the ereport doesn't longjmp.  (This seems to have no consequences
right now because no caller cares, but it's inconsistent.)  Improve
the comments to try to forestall future confusion of the same kind.

Despite the lack of field complaints, this seems worth back-patching.
Thanks to Nathan Bossart for the idea to invent a new flag,
and for review.

Discussion: https://postgr.es/m/2833457.1723229039@sss.pgh.pa.us
2024-08-10 15:51:30 -04:00
Alexander Korotkov
0868d7ae70 Add tests for pg_wal_replay_wait() errors
Improve test coverage for pg_wal_replay_wait() procedure by adding test
cases when it errors out.
2024-08-10 21:43:14 +03:00
Alexander Korotkov
3ac3ec580c Improve header comment for WaitLSNSetLatches()
Reflect the fact that we remove waiters from the heap, not just set their
latches.
2024-08-10 21:43:09 +03:00
Alexander Korotkov
867d396ccd Adjust pg_wal_replay_wait() procedure behavior on promoted standby
pg_wal_replay_wait() is intended to be called on a standby.  However, the
standby can be promoted to primary at any moment, even concurrently with the
pg_wal_replay_wait() call.  If recovery is not currently in progress,
that doesn't mean the wait was unsuccessful.  Thus, we always need to recheck
whether the target LSN has been replayed.

Reported-by: Kevin Hale Boyes
Discussion: https://postgr.es/m/CAPpHfdu5QN%2BZGACS%2B7foxmr8_nekgA2PA%2B-G3BuOUrdBLBFb6Q%40mail.gmail.com
Author: Alexander Korotkov
2024-08-10 21:43:02 +03:00
John Naylor
bbf668d66f Lower minimum maintenance_work_mem to 64kB
Since the introduction of TID store, vacuum uses far less memory in
the common case than in versions 16 and earlier. Invoking multiple
rounds of index vacuuming in turn requires a much larger table. It'd
be a good idea anyway to cover this case in regression testing, and a
lower limit is less painful for slow buildfarm animals. The reason to
do it now is to re-enable coverage of the bugfix in commit 83c39a1f7f.

For consistency, give autovacuum_work_mem the same treatment.

Suggested by Andres Freund
Tested by Melanie Plageman
Backpatch to v17, where TID store was introduced

Discussion: https://postgr.es/m/20240516205458.ohvlzis5b5tvejru@awork3.anarazel.de
Discussion: https://postgr.es/m/20240722164745.fvaoh6g6zprisqgp%40awork3.anarazel.de
2024-08-10 14:52:56 +07:00
Peter Eisentraut
f5a1311fcc Fix inappropriate uses of atol()
Some code using atol() would not work correctly if sizeof(long)==4:

- src/bin/pg_basebackup/pg_basebackup.c: Would miscount size of a
  tablespace over 2 TB.

- src/bin/pg_basebackup/streamutil.c: Would truncate a timeline ID
  beyond INT32_MAX.

- src/bin/pg_rewind/libpq_source.c: Would miscount size of files
  larger than 2 GB (but this currently cannot happen).

Replace these with atoll().
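
To illustrate the failure mode on a platform where long is 32 bits (a standalone example, not taken from the patched files):

    #include <stdio.h>
    #include <stdlib.h>

    int
    main(void)
    {
        const char *size = "3000000000";        /* ~3 GB, above INT32_MAX */

        /* With a 32-bit long, atol() cannot represent this value and the
         * result is not meaningful; atoll() returns a long long and can. */
        printf("atol : %ld\n", atol(size));
        printf("atoll: %lld\n", atoll(size));
        return 0;
    }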

In one case, the use of atol() did not result in incorrect behavior
but seems inconsistent with related code:

- src/interfaces/ecpg/ecpglib/execute.c: Gratuitous, since it
  processes a value from pg_type.typlen, which is int16.

Replace this with atoi().

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://www.postgresql.org/message-id/flat/a52738ad-06bc-4d45-b59f-b38a8a89de49%40eisentraut.org
2024-08-10 08:22:31 +02:00
Alvaro Herrera
7adec2d5fc
libpq: Trace StartupMessage/SSLRequest/GSSENCRequest correctly
libpq tracing via PQtrace would uselessly print the wrong thing for
these types of messages.  With this commit, their type and contents
would be correctly listed.  (This can be verified with PQconnectStart(),
but we don't use that in libpq_pipeline, so I (Álvaro) haven't bothered
to add any tests.)

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/CAGECzQSoPHtZ4xe0raJ6FYSEiPPS+YWXBhOGo+Y1YecLgknF3g@mail.gmail.com
2024-08-09 17:55:01 -04:00
Heikki Linnakangas
a79ed10e6c Fix comment on processes being kept over a restart
All child processes except the syslogger are killed on a restart. The
archiver might be already running though, if it was started during
recovery.

The split in the comments between "other special children" and the
first group of "background tasks" seemed really arbitrary, so I just
merged them all into one group.

Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://www.postgresql.org/message-id/8f2118b9-79e3-4af7-b2c9-bd5818193ca4@iki.fi
2024-08-10 00:06:19 +03:00
Heikki Linnakangas
28a520c0b7 Refactor code to handle death of a backend or bgworker in postmaster
Currently, when a child process exits, the postmaster first scans
through BackgroundWorkerList, to see if the child process was a
background worker. If not found, it then scans through BackendList to
see if it was a regular backend. That leads to some duplication
between the bgworker and regular backend cleanup code, as both have an
entry in the BackendList that needs to be cleaned up in the same way.
Refactor that so that we scan just the BackendList to find the child
process, and if it was a background worker, do the additional
bgworker-specific cleanup in addition to the normal Backend cleanup.

Change HandleChildCrash so that it doesn't try to handle the cleanup
of the process that already exited, only the signaling of all the
other processes. When called for any of the aux processes, the caller
had already cleared the *PID global variable, so the code in
HandleChildCrash() to do that was unused.

On Windows, if a child process exits with ERROR_WAIT_NO_CHILDREN, it's
now logged with that exit code, instead of 0. Also, if a bgworker
exits with ERROR_WAIT_NO_CHILDREN, it's now treated as crashed and is
restarted. Previously it was treated as a normal exit.

If a child process is not found in the BackendList, the log message
now calls it "untracked child process" rather than "server process".
Arguably that should be a PANIC, because we do track all the child
processes in the list, so failing to find a child process is highly
unexpected. But if we want to change that, let's discuss and do that
as a separate commit.

Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://www.postgresql.org/message-id/835232c0-a5f7-4f20-b95b-5b56ba57d741@iki.fi
2024-08-10 00:04:43 +03:00
Heikki Linnakangas
b43100fa71 Make BackgroundWorkerList doubly-linked
This allows ForgetBackgroundWorker() and ReportBackgroundWorkerExit()
to take a RegisteredBgWorker pointer as argument, rather than a list
iterator. That feels a little more natural. But more importantly, this
paves the way for more refactoring in the next commit.

Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://www.postgresql.org/message-id/835232c0-a5f7-4f20-b95b-5b56ba57d741@iki.fi
2024-08-09 22:44:20 +03:00
Nathan Bossart
7fceb5725b doc: Standardize use of dashes in references to CRC and SHA.
Presently, we inconsistently use dashes in references to these
algorithms (e.g., CRC32C versus CRC-32C).  Some popular web sources
appear to prefer dashes, and with this commit, we will, too.

Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/ZrUFpLP-w2zTAHqq%40nathan
2024-08-09 13:16:33 -05:00
Nathan Bossart
8c3548613d doc: Fix name of CRC algorithm in "Reliability" section.
This section claims we use CRC-32 for WAL records and two-phase
state files, but we've actually used CRC-32C since v9.5 (commit
5028f22f6e).  Fix that.

Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/ZrUFpLP-w2zTAHqq%40nathan
Backpatch-through: 12
2024-08-09 10:52:37 -05:00
Tom Lane
b919a97a6c Fix "failed to find plan for subquery/CTE" errors in EXPLAIN.
To deparse a reference to a field of a RECORD-type output of a
subquery, EXPLAIN normally digs down into the subquery's plan to try
to discover exactly which anonymous RECORD type is meant.  However,
this can fail if the subquery has been optimized out of the plan
altogether on the grounds that no rows could pass the WHERE quals,
which has been possible at least since 3fc6e2d7f.  There isn't
anything remaining in the plan tree that would help us, so fall back
to printing the field name as "fN" for the N'th column of the record.
(This will actually be the right thing some of the time, since it
matches the column names we assign to RowExprs.)

In passing, fix a comment typo in create_projection_plan, which
I noticed while experimenting with an alternative fix for this.

Per bug #18576 from Vasya B.  Back-patch to all supported branches.

Richard Guo and Tom Lane

Discussion: https://postgr.es/m/18576-9feac34e132fea9e@postgresql.org
2024-08-09 11:21:39 -04:00
Peter Eisentraut
7da1bdc2c2 Remove obsolete RECHECK keyword completely
This used to be part of CREATE OPERATOR CLASS and ALTER OPERATOR
FAMILY, but it has done nothing (except issue a NOTICE) since
PostgreSQL 8.4.  Commit 30e7c175b81 removed support for dumping from
pre-9.2 servers, so this no longer serves any need.

This now removes it completely, and you'd get a normal parse error if
you used it.

Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://www.postgresql.org/message-id/flat/113ef2d2-3657-4353-be97-f28fceddbca1%40eisentraut.org
2024-08-09 07:18:51 +02:00
Amit Kapila
701cf1e317 Change the misleading local end_lsn for prepared transactions.
The apply worker was using XactLastCommitEnd as local end_lsn for applying
prepare and rollback_prepare. The XactLastCommitEnd value is the end lsn
of the last commit applied before the prepare transaction, which makes no
sense. This LSN is used to decide whether we can send the acknowledgment
of the corresponding remote LSN to the server.

It is okay not to set the local_end LSN with the actual WAL position for
the prepare because we always flush the prepare record. So, we can send
the acknowledgment of the remote_end LSN as soon as prepare is finished.

The current code is misleading but as such doesn't create any problem, so
decided not to backpatch.

Author: Hayato Kuroda
Reviewed-by: Shveta Malik, Amit Kapila
Discussion: https://postgr.es/m/TYAPR01MB5692FA4926754B91E9D7B5F0F5AA2@TYAPR01MB5692.jpnprd01.prod.outlook.com
2024-08-09 10:23:57 +05:30
Alvaro Herrera
4eb179e5bf
libpq: Add suppress argument to pqTraceOutputNchar
In future commits we're going to trace authentication related messages.
Some of these messages contain challenge bytes as part of a
challenge-response flow.  Since these bytes are different for every
connection, we want to normalize them when the PQTRACE_REGRESS_MODE
trace flag is set.  This commit modifies pqTraceOutputNchar to take a
suppress argument, which makes it possible to do so.

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/CAGECzQSoPHtZ4xe0raJ6FYSEiPPS+YWXBhOGo+Y1YecLgknF3g@mail.gmail.com
2024-08-08 20:35:12 -04:00
Alvaro Herrera
a90bdd7a44
Refuse ATTACH of a table referenced by a foreign key
Trying to attach a table as a partition which is already on the
referenced side of a foreign key on the partitioned table that it is
being attached to, leads to strange behavior: we try to clone the
foreign key from the parent to the partition, but this new FK points to
the partition itself, and the mix of pg_constraint rows and triggers
doesn't behave well.

Rather than trying to untangle the mess (which might be possible given
sufficient time), I opted to forbid the ATTACH.  This doesn't seem a
problematic restriction, given that we already fail to create the
foreign key if you do it the other way around, that is, having the
partition first and the FK second.

Backpatch to all supported branches.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/18541-628a61bc267cd2d3@postgresql.org
2024-08-08 19:35:13 -04:00
Alvaro Herrera
498ee9ee2f
Refactor error messages to reduce duplication
I also took the liberty of changing

	errmsg("COPY DEFAULT only available using COPY FROM")
to
	errmsg("COPY %s cannot be used with %s", "DEFAULT", "COPY TO")

because the original wording is unlike all other messages that indicate
option incompatibility.  This message was added by commit 9f8377f7a279
(16-era), in whose development thread there was no discussion on this
point.

Backpatch to 17.
2024-08-08 15:17:11 -04:00
Alexander Korotkov
d0c8cf2a56 Add a caveat to hash_seq_init_with_hash_value() header comment
The typical use-case for hash_seq_init_with_hash_value() is a syscache
invalidation callback.  Add a caveat that the default hash function doesn't
match the syscache hash function, so one needs to define a custom hash function.

Reported-by: Pavel Stehule
Discussion: https://postgr.es/m/CAFj8pRAXmv6eyYx%3DE_BTfyK%3DO_%2ByOF8sXB%3D0bn9eOBt90EgWRA%40mail.gmail.com
Reviewed-by: Pavel Stehule
2024-08-08 11:48:57 +03:00
Heikki Linnakangas
49dc191bd1 Fix pg_rewind debug output to print the source timeline history
getTimelineHistory() is called twice, to read the source and the
target timeline history files. However, the loop to print the file
with the --debug option used the wrong variable when dealing with the
source. As a result, the source's history was always printed as empty.

Spotted while debugging bug #18575, but this does not fix that bug,
just the debugging output. Backpatch to all supported versions.

Discussion: https://www.postgresql.org/message-id/092dd515-b7b4-4fd0-8407-ceca2f02f6ec@iki.fi
2024-08-08 10:20:25 +03:00
Noah Misch
e56ccc8e42 Fix names of "Visual Studio" and Meson in a documentation sentence.
Commit 3cffe7946c268be91a340ec9a27081cb93d67d35 missed this.  Back-patch
to v17, which introduced this.

Discussion: https://postgr.es/m/CAJ7c6TM7ct0EjoCQaLSVYoxxnEw4xCUFebWj77GktWsqEdyCtQ@mail.gmail.com
2024-08-07 11:43:08 -07:00
Tom Lane
8d148bb8b8 Fix edge case in plpgsql's make_callstmt_target().
If the plancache entry for the CALL statement is already stale,
it's possible for us to fetch an old procedure OID out of it,
and then fail with "cache lookup failed for function NNN".
In ordinary usage this never happens because make_callstmt_target
is called just once immediately after building the plancache
entry.  It can be forced however by setting up an erroneous CALL
(that causes make_callstmt_target itself to report an error),
then dropping/recreating the target procedure, then repeating
the erroneous CALL.

To fix, use SPI_plan_get_cached_plan() to fetch the plancache's
plan, rather than assuming we can use SPI_plan_get_plan_sources().
This shouldn't add any noticeable overhead in the normal case,
and in the stale-plan case we'd have had to replan anyway a little
further down.

The other callers of SPI_plan_get_plan_sources() seem OK, because
either they don't need up-to-date plans or they know that the
query was just (re) planned.  But add some commentary in hopes
of not falling into this trap again.

Per bug #18574 from Song Hongyu.  Back-patch to v14 where this coding
was introduced.  (Older branches have comparable code, but it's run
after any required replanning, so there's no issue.)

Discussion: https://postgr.es/m/18574-2ce7ba3249221389@postgresql.org
2024-08-07 12:54:39 -04:00
Alvaro Herrera
2bb969f399
Refactor/reword some error messages to avoid duplicates
Also, remove brackets around "EMPTY [ ARRAY ]".  An error message is
not the place to state that a keyword is optional.

Backpatch to 17.
2024-08-07 11:30:36 -04:00
Robert Haas
22b4a1b561 Improve file header comments for astreamer code.
Make it clear that "astreamer" stands for "archive streamer".
Generalize comments that still believe this code can only be used
by pg_basebackup. Add some comments explaining the asymmetry
between the gzip, lz4, and zstd astreamers, in the hopes of making
life easier for anyone who hacks on this code in the future.

Robert Haas, reviewed by Amul Sul.

Discussion: http://postgr.es/m/CAAJ_b97O2kkKVTWxt8MxDN1o-cDfbgokqtiN2yqFf48=gXpcxQ@mail.gmail.com
2024-08-07 08:49:41 -04:00
Heikki Linnakangas
2676040df0 Make fallback MD5 implementation thread-safe on big-endian systems
Replace a static scratch buffer with a local variable, because a
static buffer makes the function not thread-safe. This function is
used in client-code in libpq, so it needs to be thread-safe. It was
until commit b67b57a966, which replaced the implementation with the
one from pgcrypto.
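
The pattern being fixed, in sketch form (illustrative only, not the pgcrypto code; the function names are hypothetical):

    #include <stdint.h>
    #include <string.h>

    /* Not thread-safe: one scratch buffer shared by every thread. */
    static uint8_t scratch_shared[64];

    static void
    pad_block_unsafe(const uint8_t *data, size_t len)
    {
        memset(scratch_shared, 0, sizeof(scratch_shared));
        memcpy(scratch_shared, data, len < 64 ? len : 64);
        /* ... hashing continues from scratch_shared ... */
    }

    /* Thread-safe: each call gets its own stack-local buffer, so concurrent
     * libpq threads cannot clobber each other's intermediate state. */
    static void
    pad_block_safe(const uint8_t *data, size_t len)
    {
        uint8_t scratch[64];

        memset(scratch, 0, sizeof(scratch));
        memcpy(scratch, data, len < 64 ? len : 64);
        /* ... hashing continues from scratch ... */
    }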

Backpatch to v14, where we switched to the new implementation.

Reviewed-by: Robert Haas, Michael Paquier
Discussion: https://www.postgresql.org/message-id/dfa2015d-ad21-4802-a4cc-3850fc5fff3f@iki.fi
2024-08-07 10:43:52 +03:00
Peter Eisentraut
5388216f6a Revert ECPG's use of pnstrdup()
Commit 0b9466fce added a dependency on fe_memutils' pnstrdup() inside
informix.c.  This adds an exit() path in a library, which we don't
want.  (Unlike libpq, the ecpg libraries don't have an automated check
for that, but it makes sense to keep them to a similar standard.)  The
ecpg code can already handle failure results from the *strdup() call
by itself.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/CAOYmi+=pg=W5L1h=3MEP_EB24jaBu2FyATrLXqQHGe7cpuvwyg@mail.gmail.com
2024-08-07 09:21:07 +02:00
Alexander Korotkov
40064a8ee1 Optimize InvalidateAttoptCacheCallback() and TypeCacheTypCallback()
These callbacks receive hash values as arguments, but those values didn't allow
direct lookups in AttoptCacheHash and TypeCacheHash.  This is why the subject
callbacks currently use a full iteration over the corresponding hash tables.

This commit avoids the full hash iteration in InvalidateAttoptCacheCallback()
and TypeCacheTypCallback().  First, we switch AttoptCacheHash and
TypeCacheHash to use the same hash function as the syscache.  Second, we
use hash_seq_init_with_hash_value() to iterate over only the hash entries
with a matching hash value.
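
A hedged sketch of the resulting callback shape (MyCacheHash and MyCacheEntry are hypothetical, and per the caveat added in d0c8cf2a56 the table must have been created with a hash function matching the syscache's):

    #include "postgres.h"
    #include "utils/hsearch.h"

    typedef struct MyCacheEntry
    {
        Oid         key;            /* hash key */
        bool        valid;
    } MyCacheEntry;

    static HTAB *MyCacheHash = NULL;

    /* Syscache invalidation callback: walk only the entries whose hash
     * value matches the one passed in, instead of the whole table. */
    static void
    InvalidateMyCacheCallback(Datum arg, int cacheid, uint32 hashvalue)
    {
        HASH_SEQ_STATUS status;
        MyCacheEntry   *entry;

        if (MyCacheHash == NULL)
            return;

        hash_seq_init_with_hash_value(&status, MyCacheHash, hashvalue);
        while ((entry = (MyCacheEntry *) hash_seq_search(&status)) != NULL)
            entry->valid = false;   /* force rebuild on next access */
    }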

Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov
2024-08-07 07:06:17 +03:00
Alexander Korotkov
d0f020037e Introduce hash_seq_init_with_hash_value() function
This new function iterates over hash entries with a given hash value.  This function
is designed to avoid a full sequential hash search in the syscache invalidation
callbacks.

Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov
2024-08-07 07:06:17 +03:00
Heikki Linnakangas
3ab2668d48 Use psprintf to simplify gtsvectorout()
The buffer allocation was correct, but looked archaic and scary:

- It was weird to calculate the buffer size before determining which
  format string was used. With the same effort, we could've used the
  right-sized buffer for each branch.

- Commit aa0d3504560 added one more possible return string ("all true
  bits"), but didn't adjust the code at the top of the function to
  calculate the returned string's max size. It was not a live bug,
  because the new string was smaller than the existing ones, but
  seemed wrong in principle.

- Use of sprintf() is generally eyebrow-raising these days.

Switch to psprintf(). psprintf() allocates a larger buffer than what
was allocated before, 128 bytes vs 80 bytes, which is acceptable as
this code is not performance or space critical.

Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/54c29fb0-edf2-48ea-9814-44e918bbd6e8@iki.fi
2024-08-06 23:05:25 +03:00
Heikki Linnakangas
d5f139cb68 Constify fields and parameters in spell.c
I started by marking VoidString as const, and fixing the fallout by
marking more fields and function arguments as const. It proliferated
quite a lot, but all within spell.c and spell.h.

A more narrow patch to get rid of the static VoidString buffer would
be to replace it with '#define VoidString ""', as C99 allows assigning
"" to a non-const pointer, even though you're not allowed to modify
it. But it seems like good hygiene to mark all these as const. In the
structs, the pointers can point to the constant VoidString, or a
buffer allocated with palloc(), or with compact_palloc(), so you
should not modify them.

Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/54c29fb0-edf2-48ea-9814-44e918bbd6e8@iki.fi
2024-08-06 23:04:51 +03:00
Heikki Linnakangas
fe8dd65bf2 Mark misc static global variables as const
Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/54c29fb0-edf2-48ea-9814-44e918bbd6e8@iki.fi
2024-08-06 23:04:48 +03:00
Heikki Linnakangas
85829c973c Make nullSemAction const, add 'const' decorators to related functions
To make it more clear that these should never be modified.

Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/54c29fb0-edf2-48ea-9814-44e918bbd6e8@iki.fi
2024-08-06 23:04:22 +03:00
Heikki Linnakangas
1e35951e71 Turn a few 'validnsps' static variables into locals
There was no need for these to be static buffers, local variables work
just as well. I think they were marked as 'static' to imply that they
are read-only, but 'const' is more appropriate for that, so change
them to const.

To make it possible to mark the variables as 'const', also add 'const'
decorations to the transformRelOptions() signature.

Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/54c29fb0-edf2-48ea-9814-44e918bbd6e8@iki.fi
2024-08-06 23:03:43 +03:00
Jeff Davis
a890ad2149 selfuncs.c: use pg_strxfrm() instead of strxfrm().
pg_strxfrm() takes a pg_locale_t, so it works properly with all
providers. This improves estimates for ICU when performing linear
interpolation within a histogram bin.

Previously, convert_string_datum() always used strxfrm() and relied on
setlocale(). That did not produce good estimates for non-default or
non-libc collations.

Discussion: https://postgr.es/m/89475ee5487d795124f4e25118ea8f1853edb8cb.camel@j-davis.com
2024-08-06 12:25:12 -07:00
Heikki Linnakangas
a54d4ed183 Fix datatypes in comments in instr_time.h
The INSTR_TIME_GET_NANOSEC(t) and INSTR_TIME_GET_MICROSEC(t) macros
return a signed int64.

Discussion: https://www.postgresql.org/message-id/ZrHkv3MAQfwNSmTG@ip-10-97-1-34.eu-west-3.compute.internal
2024-08-06 22:15:55 +03:00
Heikki Linnakangas
39a138fbef Revert "Fix comments in instr_time.h and remove an unneeded cast to int64"
This reverts commit 3dcb09de7b. Tom Lane pointed out that it broke the
abstraction provided by the macros. The callers should not need to
know what the internal type is.

This commit is an exact revert, the next commit will fix the comments
on the macros that incorrectly claim that they return uint64.

Discussion: https://www.postgresql.org/message-id/ZrHkv3MAQfwNSmTG@ip-10-97-1-34.eu-west-3.compute.internal
2024-08-06 22:15:46 +03:00
Tom Lane
6e086fa2e7 Allow parallel workers to cope with a newly-created session user ID.
Parallel workers failed after a sequence like
BEGIN;
CREATE USER foo;
SET SESSION AUTHORIZATION foo;
because check_session_authorization could not see the uncommitted
pg_authid row for "foo".  This is because we ran RestoreGUCState()
in a separate transaction using an ordinary just-created snapshot.
The same disease afflicts any other GUC that requires catalog lookups
and isn't forgiving about the lookups failing.

To fix, postpone RestoreGUCState() into the worker's main transaction
after we've set up a snapshot duplicating the leader's.  This affects
check_transaction_isolation and check_transaction_deferrable, which
think they should only run during transaction start.  Make them
act like check_transaction_read_only, which already knows it should
silently accept the value when InitializingParallelWorker.

This un-reverts commit f5f30c22e.  The original plan was to back-patch
that, but the fact that 0ae5b763e proved to be a pre-requisite shows
that the subtle API change for GUC hooks might actually break some of
them.  The problem we're trying to fix seems not worth taking such a
risk for in stable branches.

Per bug #18545 from Andrey Rachitskiy.

Discussion: https://postgr.es/m/18545-feba138862f19aaa@postgresql.org
2024-08-06 12:36:42 -04:00
Tom Lane
0ae5b763ea Clean up handling of client_encoding GUC in parallel workers.
The previous coding here threw an error from assign_client_encoding
if it was invoked in a parallel worker.  That's a very fundamental
violation of the GUC hook API: assign hooks must not throw errors.
The place to complain is in the check hook, so move the test to
there, and use the regular check-hook API (ie return false) to
report it.

The reason this coding is a problem is that it breaks GUC rollback,
which may occur after we leave InitializingParallelWorker state.
That case seems not actually reachable before now, but commit
f5f30c22e made it reachable, so we need to fix this before that
can be un-reverted.

In passing, improve the commentary in ParallelWorkerMain, and
add a check for failure of SetClientEncoding.  That's another
case that can't happen now but might become possible after
foreseeable code rearrangements (notably, if the shortcut of
skipping PrepareClientEncoding stops being OK).

Discussion: https://postgr.es/m/18545-feba138862f19aaa@postgresql.org
2024-08-06 12:21:53 -04:00
Nathan Bossart
8928817769 Remove volatile qualifiers from pg_stat_statements.c.
Prior to commit 0709b7ee72, which changed the spinlock primitives
to function as compiler barriers, access to variables within a
spinlock-protected section required using a volatile pointer, but
that is no longer necessary.

Reviewed-by: Bertrand Drouvot, Michael Paquier
Discussion: https://postgr.es/m/Zqkv9iK7MkNS0KaN%40nathan
2024-08-06 10:56:37 -05:00
Heikki Linnakangas
3dcb09de7b Fix comments in instr_time.h and remove an unneeded cast to int64
03023a2664 represented time as an int64 on all platforms but forgot to
update the comment related to INSTR_TIME_GET_MICROSEC() and provided
an incorrect comment for INSTR_TIME_GET_NANOSEC().

In passing remove an unneeded cast to int64.

Author: Bertrand Drouvot
Discussion: https://www.postgresql.org/message-id/ZrHkv3MAQfwNSmTG@ip-10-97-1-34.eu-west-3.compute.internal
2024-08-06 14:28:02 +03:00
Michael Paquier
8771298605 Remove unnecessary declaration of heapam_methods
This overlaps with the declaration at the end of heapam_handler.c that
lists all the callback routines for the heap table AM.

Author: Japin Li
Discussion: https://postgr.es/m/ME0P300MB04459456D5C4E70D48116896B6B12@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
2024-08-06 16:27:38 +09:00
Jeff Davis
e9931bfb75 Remove support for null pg_locale_t most places.
Previously, passing NULL for pg_locale_t meant "use the libc provider
and the server environment". Now that the database collation is
represented as a proper pg_locale_t (not dependent on setlocale()),
remove special cases for NULL.

Leave wchar2char() and char2wchar() unchanged for now, because the
callers don't always have a libc-based pg_locale_t available.

Discussion: https://postgr.es/m/cfd9eb85-c52a-4ec9-a90e-a5e4de56e57d@eisentraut.org
Reviewed-by: Peter Eisentraut, Andreas Karlsson
2024-08-05 18:31:48 -07:00
Robert Haas
f80b09bac8 Move astreamer (except astreamer_inject) to fe_utils.
This allows the code to be used by other frontend applications.

Amul Sul, reviewed by Sravan Kumar, Andres Freund (whose input
I specifically solicited regarding the meson.build changes),
and me.

Discussion: http://postgr.es/m/CAAJ_b94StvLWrc_p4q-f7n3OPfr6GhL8_XuAg2aAaYZp1tF-nw@mail.gmail.com
2024-08-05 11:41:57 -04:00
Robert Haas
53b2c921a0 Move recovery injector astreamer to a separate header file.
Unlike the rest of the astreamer (formerly bbstreamer) infrastructure
which is reusable by other tools, astreamer_inject.c seems extremely
specific to pg_basebackup. Hence, move the corresponding declarations
to a separate header file, so that we can move the rest of the code
without moving this.

Amul Sul, reviewed by Sravan Kumar and by me.

Discussion: http://postgr.es/m/CAAJ_b94StvLWrc_p4q-f7n3OPfr6GhL8_XuAg2aAaYZp1tF-nw@mail.gmail.com
2024-08-05 10:55:06 -04:00
Robert Haas
3c90569811 Rename bbstreamer to astreamer.
I (rhaas) intended "bbstreamer" to stand for "base backup streamer,"
but that implies that this infrastructure can only ever be used by
pg_basebackup.  In fact, it is a generally useful way of streaming
data from a tar or compressed tar file, and it could be extended to
work with other archive formats as well if we ever wanted to do that.
Hence, rename it to "astreamer" (archive streamer) in preparation for
reusing the infrastructure from pg_verifybackup (and perhaps
eventually also other utilities, such as pg_combinebackup or
pg_waldump).

This is purely a renaming commit. Comment adjustments and relocation
of the actual code to someplace from which it can be reused are left
to future commits.

Amul Sul, reviewed by Sravan Kumar and by me.

Discussion: http://postgr.es/m/CAAJ_b94StvLWrc_p4q-f7n3OPfr6GhL8_XuAg2aAaYZp1tF-nw@mail.gmail.com
2024-08-05 09:56:25 -04:00
Masahiko Sawada
66e94448ab Restrict accesses to non-system views and foreign tables during pg_dump.
When pg_dump retrieves the list of database objects and performs the
data dump, there was a possibility that objects could be replaced with others
of the same name, such as views, and that pg_dump would then access them. This vulnerability
could result in code execution with superuser privileges during the
pg_dump process.

This issue can arise when dumping data of sequences, foreign
tables (only 13 or later), or tables registered with a WHERE clause in
the extension configuration table.

To address this, pg_dump now utilizes the newly introduced
restrict_nonsystem_relation_kind GUC parameter to restrict the
accesses to non-system views and foreign tables during the dump
process. This new GUC parameter is added to back branches too, but
these changes do not require cluster recreation.

Back-patch to all supported branches.

Reviewed-by: Noah Misch
Security: CVE-2024-7348
Backpatch-through: 12
2024-08-05 06:05:33 -07:00
David Rowley
ca6fde9225 Optimize JSON escaping using SIMD
Here we adjust escape_json_with_len() to make use of SIMD to allow
processing of up to 16-bytes at a time rather than processing a single
byte at a time.  This has been shown to speed up escaping of JSON
strings significantly.
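
A rough SSE2-style sketch of the per-chunk test (the committed code uses PostgreSQL's portable simd.h helpers rather than raw intrinsics): if no byte in a 16-byte chunk is a quote, a backslash, or a control character, the whole chunk can be appended verbatim without per-byte checks.

    #include <emmintrin.h>
    #include <stdbool.h>

    /* Return true if any of the 16 bytes at s needs JSON escaping. */
    static bool
    chunk_needs_escaping(const char *s)
    {
        __m128i     chunk = _mm_loadu_si128((const __m128i *) s);
        __m128i     hits;

        hits = _mm_or_si128(_mm_cmpeq_epi8(chunk, _mm_set1_epi8('"')),
                            _mm_cmpeq_epi8(chunk, _mm_set1_epi8('\\')));
        /* Control characters: unsigned "byte <= 0x1F" via max-compare. */
        hits = _mm_or_si128(hits,
                            _mm_cmpeq_epi8(_mm_max_epu8(chunk, _mm_set1_epi8(0x1F)),
                                           _mm_set1_epi8(0x1F)));
        return _mm_movemask_epi8(hits) != 0;
    }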

Escaping is required for both JSON string properties and also the
property names themselves, so this should also help improve the speed of
the conversion from JSON into text for JSON objects that have property
names 16 or more bytes long.

Escaping JSON strings was often a significant bottleneck for longer
strings.  With these changes, some benchmarking has shown a query
performing nearly 4 times faster when escaping a JSON object with a 1MB
text property.  Tests with shorter text properties saw smaller but still
significant performance improvements.  For example, a test outputting 1024
JSON strings with a text property length ranging from 1 char to 1024 chars
became around 2 times faster.

Author: David Rowley
Reviewed-by: Melih Mutlu
Discussion: https://postgr.es/m/CAApHDvpLXwMZvbCKcdGfU9XQjGCDm7tFpRdTXuB9PVgpNUYfEQ@mail.gmail.com
2024-08-05 23:16:44 +12:00
Amit Kapila
b5df24e520 Fix typo in bufpage.h.
Author: Senglee Choi
Reviewed-by: Tender Wang
Discussion: https://postgr.es/m/CACUsy79U0=S5zWEf6D57F=vB7rOEa86xFY6oovDZ58jRcROCxQ@mail.gmail.com
2024-08-05 14:38:00 +05:30
Michael Paquier
f68cd847fa injection_points: Add some fixed-numbered statistics
Like 75534436a477, this acts mainly as a template to show what can be
achieved with fixed-numbered stats (like WAL, bgwriter, etc.) with the
pluggable cumulative statistics APIs introduced in 7949d9594582.

Fixed-numbered stats are defined in their own file, named
injection_stats_fixed.c, separated entirely from the variable-numbered
case in injection_stats.c.  This is mainly for clarity as having both
examples in the same file would be confusing.

Note that this commit uses the helper routines added in 2eff9e678d35.
The stats stored track globally the number of times injection points
have been attached, detached or run.  Two more fields should be added
later for the number of times a point has been cached or loaded, but
what's here is enough as a template.

More TAP tests are added, providing coverage for fixed-numbered custom
stats.

Author: Michael Paquier
Reviewed-by: Dmitry Dolgov, Bertrand Drouvot
Discussion: https://postgr.es/m/Zmqm9j5EO0I4W8dx@paquier.xyz
2024-08-05 12:29:22 +09:00
Michael Paquier
75534436a4 injection_points: Add some cumulative stats for injection points
This acts as a template of what can be achieved with the pluggable
cumulative stats APIs introduced in 7949d9594582 for the
variable-numbered case where stats entries are stored in the pgstats
dshash, while being potentially useful on its own for injection points,
say to add starting and/or stopping conditions based on the statistics
(want to trigger a callback after N calls, for example?).

Currently, the only data gathered is the number of times an injection
point is run.  More fields can always be added as required.  All the
routines related to the stats are located in their own file, called
injection_stats.c in the test module injection_points, for clarity.

The stats can be used only if the test module is loaded through
shared_preload_libraries.  The key of the dshash uses InvalidOid for the
database, and an int4 hash of the injection point name as object ID.

A TAP test is added to provide coverage for the new custom cumulative
stats APIs, showing the persistency of the data across restarts, for
example.

Author: Michael Paquier
Reviewed-by: Dmitry Dolgov, Bertrand Drouvot
Discussion: https://postgr.es/m/Zmqm9j5EO0I4W8dx@paquier.xyz
2024-08-05 12:06:54 +09:00
Michael Paquier
2eff9e678d Add helper routines to retrieve data for custom fixed-numbered pgstats
This is useful for extensions to get snapshot and shmem data for custom
cumulative statistics when these have a fixed number of objects, so that
they do not need to know about the snapshot internals, aka pgStatLocal.

An upcoming commit introducing an example template for custom cumulative
stats with fixed-numbered objects will make use of these.  While hacking
on that example, I noticed that these helpers are useful for extension
developers in general.

Author: Michael Paquier
Reviewed-by: Dmitry Dolgov, Bertrand Drouvot
Discussion: https://postgr.es/m/Zmqm9j5EO0I4W8dx@paquier.xyz
2024-08-05 11:43:33 +09:00
Alexander Korotkov
8036d73ae3 pg_wal_replay_wait(): Fix typo in the doc
Reported-by: Kevin Hale Boyes
Discussion: https://postgr.es/m/CADAecHWKpaPuPGXAMOH%3DwmhTpydHWGPOk9KWX97UYhp5GdqCWw%40mail.gmail.com
2024-08-04 20:26:48 +03:00
Michael Paquier
7949d95945 Introduce pluggable APIs for Cumulative Statistics
This commit adds support in the backend for $subject, allowing
out-of-core extensions to plug their own custom kinds of cumulative
statistics.  This feature has come up a few times on the lists; the
original suggestion came from Andres Freund, who proposed that
pg_stat_statements use the cumulative statistics APIs in shared
memory rather than its own less efficient internals.  The advantage of
this implementation is that it can be extended to any kind of
statistics.

The stats kinds are divided into two parts:
- The in-core "builtin" stats kinds, with designated initializers, able
to use IDs up to 128.
- The "custom" stats kinds, able to use a range of IDs from 128 to 256
(128 slots available as of this patch), with information saved in
TopMemoryContext.  This can be made larger, if necessary.

There are two types of cumulative statistics in the backend:
- For fixed-numbered objects (like WAL, archiver, etc.).  These are
attached to the snapshot and pgstats shmem control structures for
efficiency, and built-in stats kinds still do that to avoid any
redirection penalty.  The data of custom kinds is stored in a first
array in snapshot structure and a second array in the shmem control
structure, both indexed by their ID, acting as an equivalent of the
builtin stats.
- For variable-numbered objects (like tables, functions, etc.).  These
are stored in a dshash using the stats kind ID in the hash lookup key.

Internally, the handling of the builtin stats is unchanged, and both
fixed and variable-numbered objects are supported.  Structure
definitions for builtin stats kinds are renamed to better reflect the
differences from custom kinds.

Like custom RMGRs, custom cumulative statistics can only be loaded with
shared_preload_libraries at startup, and must allocate a unique ID,
shared across the PostgreSQL extension ecosystem, via the following
wiki page to avoid conflicts:
https://wiki.postgresql.org/wiki/CustomCumulativeStats

This makes the detection of the stats kinds and their handling when
reading and writing stats much easier than, say, allocating IDs for
stats kinds from a shared-memory counter, which could change the ID used
by a stats kind across restarts.
use PGSTAT_KIND_EXPERIMENTAL.

Two examples that can be used as templates for fixed-numbered and
variable-numbered stats kinds will be added in some follow-up commits,
with tests to provide coverage.

Some documentation is added to explain how to use this plugin facility.
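
A rough sketch of how an extension might plug in follows (the
registration function, struct fields, and header name are assumptions on
my part; the follow-up template commits are the authoritative examples):

    /* Hypothetical sketch; exact names and fields are assumptions. */
    #include "postgres.h"
    #include "fmgr.h"
    #include "utils/pgstat_internal.h"      /* assumed header for the new APIs */

    PG_MODULE_MAGIC;

    static const PgStat_KindInfo my_stats_kind = {
        .name = "my_extension",             /* field names are assumptions */
        .fixed_amount = true,               /* fixed-numbered stats */
        /* shared/snapshot sizes and callbacks omitted in this sketch */
    };

    void
    _PG_init(void)
    {
        /*
         * Only reachable via shared_preload_libraries.  PGSTAT_KIND_EXPERIMENTAL
         * is for development only; released extensions use an ID allocated
         * on the wiki page mentioned above.
         */
        pgstat_register_kind(PGSTAT_KIND_EXPERIMENTAL, &my_stats_kind);
    }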

Author: Michael Paquier
Reviewed-by: Dmitry Dolgov, Bertrand Drouvot
Discussion: https://postgr.es/m/Zmqm9j5EO0I4W8dx@paquier.xyz
2024-08-04 19:41:24 +09:00
Peter Eisentraut
365b5a345b Use CXXFLAGS instead of CFLAGS for linking C++ code
Otherwise, this would break when the C and C++ compilers come from
different families and understand different options.  The right flags
were already used for compiling; this change only affects linking.
Also, the meson setup already did this correctly.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/228700.1722717983@sss.pgh.pa.us
2024-08-04 11:17:46 +02:00
Michael Paquier
028b4b21df Fix incorrect format placeholders in pgstat.c
These should have been switched from %d to %u in 3188a4582a8c in the
debugging elogs added in ca1ba50fcb6f.  PgStat_Kind should never be
higher than INT32_MAX, but let's be clean.

Issue noticed while hacking more on this area.
2024-08-04 03:07:20 +09:00
Peter Eisentraut
6618891256 Add -Wmissing-variable-declarations to the standard compilation flags
This warning flag detects global variables not declared in header
files.  This is similar to what -Wmissing-prototypes does for
functions.  (More correctly, it is similar to what
-Wmissing-declarations does for functions, but -Wmissing-prototypes is
a superset of that in C.)
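
For example (an illustrative pattern, not code from this commit), the
warning fires when a global is defined in a .c file without a matching
extern declaration visible from a header:

    /* in foo.c (hypothetical): triggers the warning, no prior declaration */
    int     my_global_counter = 0;

    /* in foo.h, included by foo.c: this extern declaration silences it */
    extern int my_global_counter;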

This flag is new in GCC 14.  Clang has supported it for a while.

Several recent commits have cleaned up warnings triggered by this, so
it should now be clean.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
2024-08-03 11:51:02 +02:00
Jeff Davis
7926a9a80f Small refactoring around ExecCreateTableAs().
Since commit 4b74ebf726, the refresh logic is used to populate
materialized views, so we can simplify the error message in
ExecCreateTableAs().

Also, RefreshMatViewByOid() is moved to just after the
create_ctas_nodata() call to improve code readability.

Author: Yugo Nagata
Discussion: https://postgr.es/m/20240802161301.d975daca9ba7a706fa05ecd7@sraoss.co.jp
2024-08-02 12:52:56 -07:00
Noah Misch
3cffe7946c Fix name of "Visual Studio" in documentation.
Back-patch to v17, which introduced this.

Aleksander Alekseev

Discussion: https://postgr.es/m/CAJ7c6TM7ct0EjoCQaLSVYoxxnEw4xCUFebWj77GktWsqEdyCtQ@mail.gmail.com
2024-08-02 12:49:56 -07:00
Alexander Korotkov
3c5db1d6b0 Implement pg_wal_replay_wait() stored procedure
pg_wal_replay_wait() is to be used on a standby and specifies waiting for
a specific WAL location to be replayed.  This is useful when the user
makes some data changes on the primary and needs a guarantee that the
changes are visible on the standby.

The queue of waiters is stored in the shared memory as an LSN-ordered pairing
heap, where the waiter with the nearest LSN stays on the top.  During
the replay of WAL, waiters whose LSNs have already been replayed are deleted
from the shared memory pairing heap and woken up by setting their latches.

pg_wal_replay_wait() needs to wait without any snapshot held.  Otherwise,
the snapshot could prevent the replay of WAL records, implying a kind of
self-deadlock.  This is why it is only possible to implement
pg_wal_replay_wait() as a procedure working without an active snapshot,
not a function.

Catversion is bumped.

Discussion: https://postgr.es/m/eb12f9b03851bb2583adab5df9579b4b%40postgrespro.ru
Author: Kartyshov Ivan, Alexander Korotkov
Reviewed-by: Michael Paquier, Peter Eisentraut, Dilip Kumar, Amit Kapila
Reviewed-by: Alexander Lakhin, Bharath Rupireddy, Euler Taveira
Reviewed-by: Heikki Linnakangas, Kyotaro Horiguchi
2024-08-02 21:16:56 +03:00
Alvaro Herrera
a83f3088b8 Fix NLS file reference in pg_createsubscriber
pg_createsubscriber is referring to a non-existent message translation
file, causing NLS to not work correctly. This command should use the
same file as pg_basebackup.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20240802.115717.1083441453338151622.horikyota.ntt@gmail.com
2024-08-02 12:05:38 -04:00
Alvaro Herrera
3b2f668b78 pg_createsubscriber: Fix bogus error message
Also some desultory style improvement
2024-08-02 12:01:10 -04:00
Peter Eisentraut
9fb855fe1a Include bison header files into implementation files
Before Bison 3.4, the generated parser implementation files run afoul
of -Wmissing-variable-declarations (in spite of commit ab61c40bfa2)
because declarations for yylval and possibly yylloc are missing.  The
generated header files contain an extern declaration, but the
implementation files don't include the header files.  Since Bison 3.4,
the generated implementation files automatically include the generated
header files, so then it works.

To make this work with older Bison versions as well, include the
generated header file from the .y file.

(With older Bison versions, the generated implementation file contains
effectively a copy of the header file pasted in, so including the
header file is redundant.  But we know this works anyway because the
core grammar uses this arrangement already.)

Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
2024-08-02 10:25:11 +02:00
Heikki Linnakangas
63bef4df97 Minor refactoring of assign_backendlist_entry()
Make assign_backendlist_entry() responsible just for allocating the
Backend struct. Linking it to the RegisteredBgWorker is the caller's
responsibility now. Seems more clear that way.

Discussion: https://www.postgresql.org/message-id/835232c0-a5f7-4f20-b95b-5b56ba57d741@iki.fi
2024-08-01 23:23:55 +03:00
Heikki Linnakangas
ef4c35b416 Fix outdated comment; all running bgworkers are in BackendList
Before commit 8a02b3d732, only bgworkers that connected to a database
had an entry in the BackendList. Commit 8a02b3d732 changed that, but
forgot to update this comment.

Discussion: https://www.postgresql.org/message-id/835232c0-a5f7-4f20-b95b-5b56ba57d741@iki.fi
2024-08-01 23:23:47 +03:00
Michael Paquier
3188a4582a Switch PgStat_Kind from an enum to a uint32 type
A follow-up patch is planned to make cumulative statistics pluggable.
Using a plain type is useful in the internal routines of pgstats
because, once pluggable, PgStat_Kind may hold a value that was not
originally in the enum removed here.

While on it, this commit switches pgstat_is_kind_valid() to use
PgStat_Kind rather than an int, to be more consistent with its existing
callers.  Some loops based on the stats kind IDs are switched to use
PgStat_Kind rather than int, for consistency with the new type.

Author: Michael Paquier
Reviewed-by: Dmitry Dolgov, Bertrand Drouvot
Discussion: https://postgr.es/m/Zmqm9j5EO0I4W8dx@paquier.xyz
2024-08-02 04:49:34 +09:00
Michael Paquier
b860848232 Add redo LSN to pgstats files
This is used in the startup process to check that the pgstats file we
are reading includes the redo LSN referring to the shutdown checkpoint
where it has been written.  The redo LSN in the pgstats file needs to
match what the control file has.

This is intended to be used for an upcoming change that will extend the
write of the stats file to happen during checkpoints, rather than only
shutdown sequences.

Bump PGSTAT_FILE_FORMAT_ID.

Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/Zp8o6_cl0KSgsnvS@paquier.xyz
2024-08-02 01:57:28 +09:00
Peter Eisentraut
c27090bd60 Convert some extern variables to static, Windows code
Similar to 720b0eaae9b, discovered by MinGW.
2024-08-01 13:28:32 +02:00
Peter Eisentraut
6bbbd7f65f Convert an extern variable to static
Similar to 720b0eaae9b, fixes new code from bd15b7db489.
2024-08-01 12:43:26 +02:00
Peter Eisentraut
c671e142bf pg_createsubscriber: Rename option --socket-directory to --socketdir
For consistency with the equivalent option in pg_upgrade.

Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Discussion: https://www.postgresql.org/message-id/flat/1ed82b9b-8e20-497d-a2f8-aebdd793d595%40eisentraut.org
2024-08-01 12:14:01 +02:00
Etsuro Fujita
e66b32e43b Update comment in portal.h.
We store tuples into the portal's tuple store for a PORTAL_ONE_MOD_WITH
query as well.

Back-patch to all supported branches.

Reviewed by Andy Fan.

Discussion: https://postgr.es/m/CAPmGK14HVYBZYZtHabjeCd-e31VT%3Dwx6rQNq8QfehywLcpZ2Hw%40mail.gmail.com
2024-08-01 17:45:00 +09:00
Peter Eisentraut
a292c98d62 Convert node test compile-time settings into run-time parameters
This converts

    COPY_PARSE_PLAN_TREES
    WRITE_READ_PARSE_PLAN_TREES
    RAW_EXPRESSION_COVERAGE_TEST

into run-time parameters

    debug_copy_parse_plan_trees
    debug_write_read_parse_plan_trees
    debug_raw_expression_coverage_test

They can be activated for tests using PG_TEST_INITDB_EXTRA_OPTS.

The compile-time symbols are kept for build farm compatibility, but
they now just determine the default value of the run-time settings.

Furthermore, support for these settings is not compiled in at all
unless assertions are enabled, or the new symbol
DEBUG_NODE_TESTS_ENABLED is defined at compile time, or any of the
legacy compile-time setting symbols are defined.  So there is no
run-time overhead in production builds.  (This is similar to the
handling of DISCARD_CACHES_ENABLED.)

Discussion: https://www.postgresql.org/message-id/flat/30747bd8-f51e-4e0c-a310-a6e2c37ec8aa%40eisentraut.org
2024-08-01 10:09:18 +02:00
Amit Kapila
a67da49e1d Avoid duplicate table scans for cross-partition updates during logical replication.
When performing a cross-partition update in the apply worker, it
needlessly scans the old partition twice, resulting in noticeable
overhead.

This commit optimizes it by removing the redundant table scan.

Author: Hou Zhijie
Reviewed-by: Hayato Kuroda, Amit Kapila
Discussion: https://postgr.es/m/OS0PR01MB571623E39984D94CBB5341D994AB2@OS0PR01MB5716.jpnprd01.prod.outlook.com
2024-08-01 10:11:06 +05:30
Andres Freund
a7f107df2b Evaluate arguments of correlated SubPlans in the referencing ExprState
Until now we generated an ExprState for each parameter to a SubPlan and
evaluated them one-by-one in ExecScanSubPlan. That's sub-optimal, as
creating lots of small ExprStates
a) makes JIT compilation more expensive
b) wastes memory
c) is a bit slower to execute

This commit arranges to evaluate parameters to a SubPlan as part of the
ExprState referencing a SubPlan, using the new EEOP_PARAM_SET expression
step. We emit one EEOP_PARAM_SET for each argument to a subplan, just before
the EEOP_SUBPLAN step.

It likely is worth using EEOP_PARAM_SET in other places as well, e.g. for
SubPlan outputs, nestloop parameters and - more ambitiously - to get rid of
ExprContext->domainValue/caseValue/ecxt_agg*.  But that's for later.

Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru>
Discussion: https://postgr.es/m/20230225214401.346ancgjqc3zmvek@awork3.anarazel.de
2024-07-31 19:54:46 -07:00
Tom Lane
e6a9637488 Revert "Allow parallel workers to cope with a newly-created session user ID."
This reverts commit f5f30c22ed69fb37b896c4d4546b2ab823c3fd61.

Some buildfarm animals are failing with "cannot change
"client_encoding" during a parallel operation".  It looks like
assign_client_encoding is unhappy at being asked to roll back a
client_encoding setting after a parallel worker encounters a
failure.  There must be more to it though: why didn't I see this
during local testing?  In any case, it's clear that moving the
RestoreGUCState() call is not as side-effect-free as I thought.
Given that the bug f5f30c22e intended to fix has gone unreported
for years, it's not something that's urgent to fix; I'm not
willing to risk messing with it further with only days to our
next release wrap.
2024-07-31 20:57:00 -04:00
Jeff Davis
ca2eea3ac8 Add is_create parameter to RefreshMatviewByOid().
RefreshMatviewByOid is used for both REFRESH and CREATE MATERIALIZED
VIEW.  This flag is currently just used for handling internal error
messages, but it is also aimed at improving code readability.

Author: Yugo Nagata
Discussion: https://postgr.es/m/20240726122630.70e889f63a4d7e26f8549de8@sraoss.co.jp
2024-07-31 16:42:19 -07:00
Jeff Davis
f683d3a4ca Remove unused ParamListInfo argument from ExecRefreshMatView.
Author: Yugo Nagata
Discussion: https://postgr.es/m/20240726122630.70e889f63a4d7e26f8549de8@sraoss.co.jp
2024-07-31 16:37:53 -07:00
Tom Lane
f5f30c22ed Allow parallel workers to cope with a newly-created session user ID.
Parallel workers failed after a sequence like
	BEGIN;
	CREATE USER foo;
	SET SESSION AUTHORIZATION foo;
because check_session_authorization could not see the uncommitted
pg_authid row for "foo".  This is because we ran RestoreGUCState()
in a separate transaction using an ordinary just-created snapshot.
The same disease afflicts any other GUC that requires catalog lookups
and isn't forgiving about the lookups failing.

To fix, postpone RestoreGUCState() into the worker's main transaction
after we've set up a snapshot duplicating the leader's.  This affects
check_transaction_isolation and check_transaction_deferrable, which
think they should only run during transaction start.  Make them
act like check_transaction_read_only, which already knows it should
silently accept the value when InitializingParallelWorker.

Per bug #18545 from Andrey Rachitskiy.  Back-patch to all
supported branches, because this has been wrong for awhile.

Discussion: https://postgr.es/m/18545-feba138862f19aaa@postgresql.org
2024-07-31 18:54:10 -04:00
Nathan Bossart
bd15b7db48 Improve performance of dumpSequenceData().
As one might guess, this function dumps the sequence data.  It is
called once per sequence, and each such call executes a query to
retrieve the relevant data for a single sequence.  This can cause
pg_dump to take significantly longer, especially when there are
many sequences.

This commit improves the performance of this function by gathering
all the sequence data with a single query at the beginning of
pg_dump.  This information is stored in a sorted array that
dumpSequenceData() can bsearch() for what it needs.  This follows a
similar approach as previous commits that introduced sorted arrays
for role information, pg_class information, and sequence metadata.
As with those commits, this patch will cause pg_dump to use more
memory, but that isn't expected to be too egregious.

Note that we use the brand new function pg_sequence_read_tuple() in
the query that gathers all sequence data, so we must continue to
use the preexisting query-per-sequence approach for versions older
than 18.
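
The lookup pattern is roughly the following generic sketch (names are
illustrative stand-ins, not the actual pg_dump structures):

    /* Generic sketch of the sorted-array + bsearch() lookup by OID. */
    #include <stdlib.h>

    typedef unsigned int Oid;           /* stand-in for the real typedef */

    typedef struct SequenceItem
    {
        Oid     oid;                    /* sort key */
        long    last_value;
        int     is_called;
    } SequenceItem;

    static int
    seq_cmp(const void *a, const void *b)
    {
        Oid     oa = ((const SequenceItem *) a)->oid;
        Oid     ob = ((const SequenceItem *) b)->oid;

        return (oa > ob) - (oa < ob);
    }

    /* items is collected once up front and sorted with qsort(..., seq_cmp) */
    static SequenceItem *
    lookup_sequence(SequenceItem *items, size_t nitems, Oid oid)
    {
        SequenceItem key = {.oid = oid};

        return bsearch(&key, items, nitems, sizeof(SequenceItem), seq_cmp);
    }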

Reviewed-by: Euler Taveira, Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/20240503025140.GA1227404%40nathanxps13
2024-07-31 10:12:42 -05:00
Nathan Bossart
c8b06bb969 Introduce pg_sequence_read_tuple().
This new function returns the data for the given sequence, i.e.,
the values within the sequence tuple.  Since this function is a
substitute for SELECT from the sequence, the SELECT privilege is
required on the sequence in question.  It returns all NULLs for
sequences for which we lack privileges, other sessions' temporary
sequences, and unlogged sequences on standbys.

This function is primarily intended for use by pg_dump in a
follow-up commit that will use it to optimize dumpSequenceData().
Like pg_sequence_last_value(), which is a support function for the
pg_sequences system view, pg_sequence_read_tuple() is left
undocumented.

Bumps catversion.

Reviewed-by: Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/20240503025140.GA1227404%40nathanxps13
2024-07-31 10:12:42 -05:00
Nathan Bossart
68e9629985 Improve performance of dumpSequence().
This function dumps the sequence definitions.  It is called once
per sequence, and each such call executes a query to retrieve the
metadata for a single sequence.  This can cause pg_dump to take
significantly longer, especially when there are many sequences.

This commit improves the performance of this function by gathering
all the sequence metadata with a single query at the beginning of
pg_dump.  This information is stored in a sorted array that
dumpSequence() can bsearch() for what it needs.  This follows a
similar approach as commits d5e8930f50 and 2329cad1b9, which
introduced sorted arrays for role information and pg_class
information, respectively.  As with those commits, this patch will
cause pg_dump to use more memory, but that isn't expected to be too
egregious.

Note that before version 10, the sequence metadata was stored in
the sequence relation itself, which makes it difficult to gather
all the sequence metadata with a single query.  For those older
versions, we continue to use the preexisting query-per-sequence
approach.

Reviewed-by: Euler Taveira
Discussion: https://postgr.es/m/20240503025140.GA1227404%40nathanxps13
2024-07-31 10:12:42 -05:00
Nathan Bossart
23687e925f Parse sequence type and integer metadata in dumpSequence().
This commit modifies dumpSequence() to parse all the sequence
metadata into the appropriate types instead of carting around
string pointers to the PGresult data.  Besides allowing us to free
the PGresult storage earlier in the function, this eliminates the
need to compare min_value and max_value to their respective
defaults as strings.

This is preparatory work for a follow-up commit that will improve
the performance of dumpSequence() in a similar manner to how commit
2329cad1b9 optimized binary_upgrade_set_pg_class_oids().

Reviewed-by: Euler Taveira
Discussion: https://postgr.es/m/20240503025140.GA1227404%40nathanxps13
2024-07-31 10:12:41 -05:00
David Rowley
057ee9183c Doc: mention executor memory usage for enable_partitionwise* GUCs
Prior to this commit, the docs for enable_partitionwise_aggregate and
enable_partitionwise_join mentioned the additional overheads enabling
these causes for the query planner, but they mentioned nothing about the
possible surge in work_mem-consuming executor nodes that could end up in
the final plan.  Dimitrios reported the OOM killer intervened on his
query as a result of using enable_partitionwise_aggregate=on.

Here we adjust the docs to mention the possible increase in the number of
work_mem-consuming executor nodes that can appear in the final plan as a
result of enabling these GUCs.

Reported-by: Dimitrios Apostolou
Reviewed-by: Ashutosh Bapat
Discussion: https://postgr.es/m/3603c380-d094-136e-e333-610914fb3e80%40gmx.net
Discussion: https://postgr.es/m/CAApHDvoZ0_yqwPFEpb6h261L76BUpmh5GxBQq0LeRzQ5Jh3zzg@mail.gmail.com
Backpatch-through: 12, oldest supported version
2024-08-01 01:25:25 +12:00
Peter Eisentraut
e54a42ac9d Add API and ABI stability guidance to the C language docs
Includes guidance for major and minor version releases, and sets
reasonable expectations for extension developers to follow.

Author: David Wheeler, Peter Eisentraut

Discussion: https://www.postgresql.org/message-id/flat/5DA9F9D2-B8B2-43DE-BD4D-53A4160F6E8D%40justatheory.com
2024-07-31 11:11:09 +02:00
Peter Eisentraut
4f29394ea9 doc: Avoid too prominent use of "backup" on pg_dump man page
Some users inadvertently rely on pg_dump as their primary backup tool,
when better solutions exist.  The pg_dump man page is arguably
misleading in that it starts with

"pg_dump is a utility for backing up a PostgreSQL database."

This tones this down a little bit, by replacing most uses of "backup"
with "export" and adding a short note that pg_dump is not a
general-purpose backup tool.

Discussion: https://www.postgresql.org/message-id/flat/70b48475-7706-4268-990d-fd522b038d96%40eisentraut.org
2024-07-31 07:57:47 +02:00
Peter Eisentraut
73275f093f Make building with LTO work on macOS
When building with -flto, the backend binary must keep many otherwise
unused symbols to make them available to dynamically loaded modules /
extensions.  This has been done via -Wl,--export-dynamic on many
platforms for years.  This flag is not supported by the macOS linker,
though.  Here it's called -Wl,-export_dynamic instead.

Thus, make configure pick up on this variant of the flag as well.
Meson has the logic upstream as of version 1.5.0.

Without this fix, building with -flto fails with errors similar to [1]
and [2].

[1]: https://postgr.es/m/1581936537572-0.post%40n3.nabble.com
[2]: https://postgr.es/m/21800.1499270547%40sss.pgh.pa.us

Author: Wolfgang Walther <walther@technowledgy.de>
Discussion: https://www.postgresql.org/message-id/flat/427c7c25-e8e1-4fc5-a1fb-01ceff185e5b@technowledgy.de
2024-07-31 06:22:02 +02:00
Amit Kapila
0dcea330ba Fix random failure in 021_twophase.
After disabling the subscription, the failed test was changing the
two_phase option for the subscription. We can't change the two_phase
option for a subscription till the corresponding apply worker is active.
The check to ensure that the replication apply worker has exited was
incorrect.

Author: Vignesh C
Discussion: https://postgr.es/m/CALDaNm3YY+bzj+JWJbY+DsUgJ2mPk8OR1ttjVX2cywKr4BUgxw@mail.gmail.com
2024-07-31 08:53:55 +05:30
Jeff Davis
679c5084cf Relax check for return value from second call of pg_strnxfrm().
strxfrm() is not guaranteed to return the exact number of bytes needed
to store the result; it may return a higher value.
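
The underlying libc pattern, in isolation, looks roughly like this
generic sketch (not the committed code); the fix amounts to treating the
second return value as "fits in the buffer" rather than demanding an
exact match with the first call's estimate:

    #include <assert.h>
    #include <stdlib.h>
    #include <string.h>

    static char *
    transform_key(const char *src)
    {
        size_t  needed = strxfrm(NULL, src, 0);    /* size estimate, sans NUL */
        char   *dst = malloc(needed + 1);

        if (dst != NULL)
        {
            size_t  rc = strxfrm(dst, src, needed + 1);

            /*
             * The second call may report fewer bytes than the first call
             * estimated, so only check that the buffer was big enough,
             * not that the two values match exactly.
             */
            assert(rc <= needed);
        }
        return dst;
    }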

Discussion: https://postgr.es/m/32f85d88d1f64395abfe5a10dd97a62a4d3474ce.camel@j-davis.com
Reviewed-by: Heikki Linnakangas
Backpatch-through: 16
2024-07-30 16:23:20 -07:00
Heikki Linnakangas
f822be3962 Refactor getWeights to write to caller-supplied buffer
This gets rid of the static result buffer.

Reviewed-by: Robert Haas
Discussion: https://www.postgresql.org/message-id/7f86e06a-98c5-4ce3-8ec9-3885c8de0358@iki.fi
2024-07-30 22:06:07 +03:00
Heikki Linnakangas
01e51ed780 Replace static buf with a stack-allocated one in 'seg' extension
The buffer is used only locally within the function. Also, the
initialization to '0' characters was unnecessary, the initial content
were always overwritten with sprintf(). I don't understand why it was
done that way, but it's been like that since forever.

In the passing, change from sprintf() to snprintf(). The buffer was
long enough so sprintf() was fine, but this makes it more obvious that
there's no risk of a buffer overflow.

Reviewed-by: Robert Haas
Discussion: https://www.postgresql.org/message-id/7f86e06a-98c5-4ce3-8ec9-3885c8de0358@iki.fi
2024-07-30 22:06:03 +03:00
Heikki Linnakangas
da8a587e2e Replace static buf with a stack-allocated one in ReadControlFile
It's only used very locally within the function.

Reviewed-by: Robert Haas
Discussion: https://www.postgresql.org/message-id/7f86e06a-98c5-4ce3-8ec9-3885c8de0358@iki.fi
2024-07-30 22:05:59 +03:00
Heikki Linnakangas
6151cb7876 Replace static buf with palloc in str_time()
The function is used only once in the startup process, so the leak
into the current memory context is harmless.

This is a tiny step in making the server thread-safe.

Reviewed-by: Robert Haas
Discussion: https://www.postgresql.org/message-id/7f86e06a-98c5-4ce3-8ec9-3885c8de0358@iki.fi
2024-07-30 22:05:51 +03:00
Heikki Linnakangas
5bf948d564 Replace static bufs with a StringInfo in cash_words()
For clarity. The code was correct, and the buffer was large enough,
but string manipulation with no bounds checking is scary.

This incurs an extra palloc+pfree to every call, but in quick
performance testing, it doesn't seem to be significant.

Reviewed-by: Robert Haas
Discussion: https://www.postgresql.org/message-id/7f86e06a-98c5-4ce3-8ec9-3885c8de0358@iki.fi
2024-07-30 22:02:58 +03:00
Heikki Linnakangas
47c98035c6 Remove leftover function declaration
Commit 9d9b9d46f3 removed the function (or rather, moved it to a
different source file and renamed it to SendCancelRequest), but forgot
the declaration in the header file.
2024-07-30 15:19:46 +03:00
Andrew Dunstan
524d490a9f Preserve tz when converting to jsonb timestamptz
This removes an inconsistency in the treatment of different datatypes by
the jsonpath timestamp_tz() function. Conversions from data types that
are not timestamp-aware, such as date and timestamp, are now treated
consistently with conversion from those that are such as timestamptz.

Author: David Wheeler
Reviewed-by: Junwang Zhao and Jeevan Chalke

Discussion: https://postgr.es/m/7DE080CE-6D8C-4794-9BD1-7D9699172FAB%40justatheory.com

Backpatch to release 17.
2024-07-30 07:57:38 -04:00
Thomas Munro
06ffce4559 Remove spinlocks and atomics from meson_options.txt.
Commits e2562667 and 81385261 removed the configure equivalents, but
forgot to remove these options from meson_options.txt.

Revealed by the fact that build farm animals rorqual and francolin
didn't fail, despite being configured to set those options to off.  They
should now fail with unknown option, until they are adjusted.
2024-07-30 23:37:14 +12:00
Thomas Munro
71d6c4b966 Remove useless member of BackendParameters.
Oversight in e2562667, which stopped using SpinlockSemaArray but forgot
to remove it from BackendParameters.

Reported-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/310f4005-91d7-42b2-ac70-92624260dd28%40iki.fi
2024-07-30 23:15:09 +12:00
Thomas Munro
83aadbeb96 Require memory barrier support.
Previously we had a fallback implementation that made a harmless system
call, based on the assumption that system calls must contain a memory
barrier.  That shouldn't be reached on any current system, and it seems
highly likely that we can easily find out how to request explicit memory
barriers, if we've already had to find out how to do atomics on a
hypothetical new system.

Remove comments and a function name that referred to a spinlock used
for fallback memory barriers; that changed in 1b468a13, which left some
misleading words behind in a few places.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Suggested-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/721bf39a-ed8a-44b0-8b8e-be3bd81db748%40technowledgy.de
Discussion: https://postgr.es/m/3351991.1697728588%40sss.pgh.pa.us
2024-07-30 23:01:55 +12:00
Thomas Munro
a011dc399c Require compiler barrier support.
Previously we had a fallback implementation of pg_compiler_barrier()
that called an empty function across a translation unit boundary so the
compiler couldn't see what it did.  That shouldn't be needed on any
current systems, and might not even work with a link time optimizer.
Since we now require compiler-specific knowledge of how to implement
atomics, we should also know how to implement compiler barriers on a
hypothetical new system.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Suggested-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/721bf39a-ed8a-44b0-8b8e-be3bd81db748%40technowledgy.de
Discussion: https://postgr.es/m/3351991.1697728588%40sss.pgh.pa.us
2024-07-30 22:59:30 +12:00
Thomas Munro
8138526136 Remove --disable-atomics, require 32 bit atomics.
Modern versions of all relevant architectures and tool chains have
atomics support.  Since edadeb07, there is no remaining reason to carry
code that simulates atomic flags and uint32 imperfectly with spinlocks.
64 bit atomics are still emulated with spinlocks, if needed, for now.

Any modern compiler capable of implementing C11 <stdatomic.h> must have
the underlying operations we need, though we don't require C11 yet.  We
detect certain compilers and architectures, so hypothetical new systems
might need adjustments here.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (concept, not the patch)
Reviewed-by: Andres Freund <andres@anarazel.de> (concept, not the patch)
Discussion: https://postgr.es/m/3351991.1697728588%40sss.pgh.pa.us
2024-07-30 22:58:57 +12:00
Thomas Munro
e25626677f Remove --disable-spinlocks.
A later change will require atomic support, so it wouldn't make sense
for a hypothetical new system not to be able to implement spinlocks.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (concept, not the patch)
Reviewed-by: Andres Freund <andres@anarazel.de> (concept, not the patch)
Discussion: https://postgr.es/m/3351991.1697728588%40sss.pgh.pa.us
2024-07-30 22:58:37 +12:00
Peter Eisentraut
1330843bb7 pg_createsubscriber: Remove obsolete comment
This comment should have been removed by commit b9639138262.  There is
no replication slot check on the primary anymore.

Author: Euler Taveira <euler@eulerto.com>
Discussion: https://www.postgresql.org/message-id/697d692f-f9d3-41f6-9f0e-29a4fb18e544@app.fastmail.com
2024-07-30 12:32:57 +02:00
Andrew Dunstan
800cd3e923 Stabilize xid_wraparound tests
The tests had a race condition if autovacuum was set to off. Instead we
create all the tables we are interested in with autovacuum disabled, so
they are only ever touched when in danger of wraparound.

Discussion: https://postgr.es/m/3e2cbd24-f45e-4b2b-ba83-8149214f0a4d@dunslane.net

Masahiko Sawada (slightly tweaked by me)

Backpatch to release 17 where these tests were introduced.
2024-07-30 06:24:59 -04:00
Amit Kapila
03b08c8f5f pg_createsubscriber: Fix an unpredictable recovery wait time.
The problem is that the tool is using the LSN returned by
pg_create_logical_replication_slot() as recovery_target_lsn. This LSN is
ahead of the current WAL position, and the recovery waits until the
publisher writes a WAL record that reaches the target, at which point
recovery ends.
On idle systems, this wait time is unpredictable and could lead to failure
in promoting the subscriber. To avoid that, insert a harmless WAL record.

Reported-by: Alexander Lakhin and Tom Lane
Diagnosed-by: Hayato Kuroda
Author: Euler Taveira
Reviewed-by: Hayato Kuroda, Amit Kapila
Backpatch-through: 17
Discussion: https://postgr.es/m/2377319.1719766794%40sss.pgh.pa.us
Discussion: https://postgr.es/m/CA+TgmoYcY+Wb67NAwaHT7MvxCSeV86oSc+va9hHKaasE42ukyw@mail.gmail.com
2024-07-30 14:01:01 +05:30
David Rowley
c19615fe39 Disallow setting MAX_PARTITION_BUFFERS to less than 2
Add some comments to mention that this value must be at least 2 and also
add a StaticAssertDecl to cause compilation failure if anyone tries to
build with an invalid value.

The multiInsertBuffers list must have at least two elements due to how the
code in CopyMultiInsertInfoFlush() pushes the current ResultRelInfo's
CopyMultiInsertBuffer to the end of the list.  If the first element is
also the last element, bad things will happen.
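
The guard presumably looks something like this sketch (the message text
is illustrative):

    /* Fails compilation if MAX_PARTITION_BUFFERS is configured below 2. */
    StaticAssertDecl(MAX_PARTITION_BUFFERS >= 2,
                     "MAX_PARTITION_BUFFERS must be at least 2");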

Author: Zhang Mingli <avamingli@gmail.com>
Discussion: https://postgr.es/m/CAApHDvpQ6t9ROcqbD-OgqR04Kfq4vQKw79Vo6r5j%2BciHwsSfkA%40mail.gmail.com
2024-07-30 20:19:59 +12:00
Jeff Davis
72fe6d24a3 Make collation not depend on setlocale().
Now that the result of pg_newlocale_from_collation() is always
non-NULL, we can move the collate_is_c and ctype_is_c flags into
pg_locale_t. That simplifies the logic in lc_collate_is_c() and
lc_ctype_is_c(), removing the dependence on setlocale().

This commit also eliminates the multi-stage initialization of the
collation cache.

As long as we have catalog access, it's now safe to call
pg_newlocale_from_collation() without checking lc_collate_is_c()
first.

Discussion: https://postgr.es/m/cfd9eb85-c52a-4ec9-a90e-a5e4de56e57d@eisentraut.org
Reviewed-by: Peter Eisentraut, Andreas Karlsson
2024-07-30 00:58:06 -07:00
Richard Guo
9b282a9359 Fix partitionwise join with partially-redundant join clauses
To determine if the two relations being joined can use partitionwise
join, we need to verify the existence of equi-join conditions
involving pairs of matching partition keys for all partition keys.
Currently we do that by looking through the join's restriction
clauses.  However, it has been discovered that this approach is
insufficient, because there might be partition keys known equal by a
specific EC that do not form a join clause, since it happens that
members of the EC other than the partition keys are the ones constrained
to become a join clause.

To address this issue, in addition to examining the join's restriction
clauses, we also check if any partition keys are known equal by ECs,
by leveraging function exprs_known_equal().  To accomplish this, we
enhance exprs_known_equal() to check equality per the semantics of the
opfamily, if provided.

It could be argued that exprs_known_equal() could be called O(N^2)
times, where N is the number of partition key expressions, resulting
in noticeable performance costs if there are a lot of partition key
expressions.  But I think this is not a problem.  The number of a
joinrel's partition key expressions would only be equal to the join
degree, since each base relation within the join contributes only one
partition key expression.  That is to say, it does not scale with the
number of partitions.  A benchmark with a query involving 5-way joins
of partitioned tables, each with 3 partition keys and 1000 partitions,
shows that the planning time is not significantly affected by this
patch (within the margin of error), particularly when compared to the
impact caused by partitionwise join.

Thanks to Tom Lane for the idea of leveraging exprs_known_equal() to
check if partition keys are known equal by ECs.

Author: Richard Guo, Tom Lane
Reviewed-by: Tom Lane, Ashutosh Bapat, Robert Haas
Discussion: https://postgr.es/m/CAN_9JTzo_2F5dKLqXVtDX5V6dwqB0Xk+ihstpKEt3a1LT6X78A@mail.gmail.com
2024-07-30 15:51:54 +09:00
Richard Guo
2309eff62b Refactor the checks for parameterized partial paths
Parameterized partial paths are not supported, and we have several
checks in try_partial_xxx_path functions to enforce this.  For a
partial nestloop join path, we need to ensure that if the inner path
is parameterized, the parameterization is fully satisfied by the
proposed outer path.  For a partial merge/hashjoin join path, we need
to ensure that the inner path is not parameterized.  In all cases, we
need to ensure that the outer path is not parameterized.

However, the comment in try_partial_hashjoin_path does not describe
this correctly.  This patch fixes that.

In addition, this patch simplifies the checks performed in
try_partial_hashjoin_path and try_partial_mergejoin_path with the help
of the macro PATH_REQ_OUTER, and also adds asserts that the outer path
is not parameterized in the try_partial_xxx_path functions.

Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs48mKJ6g_GnYNa7dnw04MHaMK-jnAEBrMVhTp2uUg3Ut4A@mail.gmail.com
2024-07-30 15:49:44 +09:00
Richard Guo
cc9daa09ee Short-circuit sort_inner_and_outer if there are no mergejoin clauses
In sort_inner_and_outer, we create mergejoin join paths by explicitly
sorting both relations on each possible ordering of the available
mergejoin clauses.  However, if there are no available mergejoin
clauses, we can skip this process entirely.

This patch introduces a check for mergeclause_list at the beginning of
sort_inner_and_outer and exits the function if it is found to be
empty.  This might help skip all the statements that come before the
call to select_outer_pathkeys_for_merge, including the build of
UniquePaths in the case of JOIN_UNIQUE_OUTER or JOIN_UNIQUE_INNER.

I doubt there's any measurable performance improvement, but throughout
the run of the regression tests, sort_inner_and_outer is called a
total of 44,424 times.  Among these calls, there are 11,064 instances
where mergeclause_list is found to be empty, which accounts for
approximately one-fourth.  I think this suggests that implementing
this shortcut is worthwhile.

Author: Richard Guo
Reviewed-by: Ashutosh Bapat
Discussion: https://postgr.es/m/CAMbWs48RKiZGFEd5A0JtztRY5ZdvVvNiHh0AKeuoz21F+0dVjQ@mail.gmail.com
2024-07-30 15:46:39 +09:00
Michael Paquier
ca1ba50fcb Add more debugging information when failing to read pgstats files
This is useful to know which part of a stats file is corrupted when
reading it: a WARNING with details about what could not be read is added
to the server logs before giving up on the remaining data in the file.

Author: Michael Paquier
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/Zp8o6_cl0KSgsnvS@paquier.xyz
2024-07-30 15:08:21 +09:00
Amit Langote
7f56eaff2f SQL/JSON: Fix casting for integer EXISTS columns in JSON_TABLE
The current method of coercing the boolean result value of
JsonPathExists() to the target type specified for an EXISTS column,
which is to call the type's input function via json_populate_type(),
leads to an error when the target type is integer, because the
integer input function doesn't recognize boolean literal values as
valid.

Instead use the boolean-to-integer cast function for coercion in that
case so that using integer or domains thereof as type for EXISTS
columns works. Note that coercion for ON ERROR values TRUE and FALSE
already works like that because the parser creates a cast expression
including the cast function, but the coercion of the actual result
value is not handled by the parser.

Tests by Jian He.

Reported-by: Jian He <jian.universality@gmail.com>
Author: Jian He <jian.universality@gmail.com>
Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-07-30 10:34:17 +09:00
Amit Langote
74c96699be SQL/JSON: Some fixes to JsonBehavior expression casting
1. Remove the special case handling when casting the JsonBehavior
   expressions to types with typmod, like 86d33987 did for the casting
   of SQL/JSON constructor functions.

2. Fix casting for fixed-length character and bit string types by
   using assignment-level casts.  This is again similar to what
   86d33987 did, but for ON ERROR / EMPTY expressions.

3. Use runtime coercion for the boolean ON ERROR constants so that
   using fixed-length character string types, for example, for an
   EXISTS column doesn't cause a "value too long for type
   character(n)" when the parser tries to coerce the default ON ERROR
   value "false" to that type, that is, even when clause is not
   specified.

4. Simplify the conditions of when to use runtime coercion vs
   creating the cast expression in the parser itself.  jsonb-valued
   expressions are now always coerced at runtime and boolean
   expressions too if the target type is a string type for the
   reasons mentioned above.

Tests are taken from a patch that Jian He posted.

Reported-by: Jian He <jian.universality@gmail.com>
Author: Jian He <jian.universality@gmail.com>
Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-07-30 10:34:17 +09:00
Jeff Davis
8240401437 Do not return NULL from pg_newlocale_from_collation().
Previously, pg_newlocale_from_collation() returned NULL as a special
case for the DEFAULT_COLLATION_OID if the provider was libc. In that
case the behavior would depend on the last call to setlocale().

Now, consistent with the other providers, it will return a pointer to
default_locale, which is not dependent on setlocale().

Note: for the C and POSIX locales, the locale_t structure within the
pg_locale_t will still be zero, because those locales are implemented
with internal logic and do not use libc at all.

lc_collate_is_c() and lc_ctype_is_c() still depend on setlocale() to
determine the current locale, which will be removed in a subsequent
commit.

Discussion: https://postgr.es/m/cfd9eb85-c52a-4ec9-a90e-a5e4de56e57d@eisentraut.org
Reviewed-by: Peter Eisentraut, Andreas Karlsson
2024-07-29 15:18:06 -07:00
Heikki Linnakangas
6a1d8cef46 Detach syslogger from shared memory
Commit aafc05de1b removed the calls to detach from shared memory from
syslogger startup. That was not intentional, so put them back.

Author: Rui Zhao
Reviewed-by: Aleksander Alekseev
Backpatch-through: 17
Discussion: https://www.postgresql.org/message-id/11505016-8cf3-4691-b996-7faed99b7877.xiyuan.zr@alibaba-inc.com
2024-07-29 22:21:34 +03:00
Heikki Linnakangas
679f940740 Remove dead generators for cyrillic encoding conversion tables
These tools were used to read the koi-iso.tab, koi-win.tab, and
koi-alt.tab files, which contained the mappings between the
single-byte cyrillic encodings. However, those data files were removed
in commit 4c3c8c048d, back in 2003. These code generators have been
unused and unusable ever since.

The generated tables live in cyrillic_and_mic.c. There has been one
change to the tables since they were generated in 1999, in commit
f4b7624eb07a. So if we resurrected the original data tables, that
change would need to be taken into account.

So this code is very dead. The tables in cyrillic_and_mic.c, which
were originally generated by these tools, are now the authoritative
source for these mappings.

Reviewed-by: Tom Lane, Aleksander Alekseev
Discussion: https://www.postgresql.org/message-id/flat/a821c3dc-36ec-4cee-8b41-7ccaa17adb18@iki.fi
2024-07-29 20:38:19 +03:00
Nathan Bossart
5c1ce1bbbe Remove tab completion for CREATE UNLOGGED MATERIALIZED VIEW.
Commit 3bf3ab8c56 added support for unlogged materialized views,
but commit 3223b25ff7 reverted that feature before it made it into
a release.  However, the latter commit left the grammar and
tab-completion support intact.  This commit removes the
tab-completion support to prevent psql from recommending bogus
commands.  I've opted to keep the grammar support so that the
server continues to emit a descriptive error when users try to
create unlogged matviews.

Reported-by: Daniel Westermann, px shi
Author: Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/ZR0P278MB092093E92263DE16734208A5D2C59%40ZR0P278MB0920.CHEP278.PROD.OUTLOOK.COM
Discussion: https://postgr.es/m/CAAccyY%2BWg1Z-9tNfSwLmuZVgGOwqU5u1OP-RWcoAr2UZGuvN_w%40mail.gmail.com
2024-07-29 11:34:12 -05:00
Tom Lane
0f12905215 Count individual SQL commands in pg_restore's --transaction-size mode.
The initial implementation in commit 959b38d77 counted one action
per TOC entry (except for some special cases for multi-blob BLOBS
entries).  This assumes that TOC entries are all about equally
complex, but it turns out that that assumption doesn't hold up very
well in binary-upgrade mode.  For example, even after the previous
commit I was able to cause backend bloat with tables having many
inherited constraints.  There may be other cases too.  (Since no
serious problems have been reported with --single-transaction mode,
we can conclude that the backend copes well with psql's regular
restore scripts; but before 959b38d77 we never ran binary-upgrade
restores with multi-command transactions.)

To fix, count multi-command TOC entries as N actions, allowing the
transaction size to be scaled down when we hit a complex TOC entry.
Rather than add a SQL parser to pg_restore, approximate "multi
command" by counting semicolons in the TOC entry's defn string.
This will be fooled by semicolons appearing in string literals ---
but the error is in the conservative direction, so it doesn't seem
worth working harder.  The biggest risk is with function/procedure
TOC entries, but we can just explicitly skip those.
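
Counting semicolons in the defn string amounts to something like this
sketch (illustrative only; the real code also special-cases functions
and procedures as noted above):

    /* Rough approximation of "number of SQL commands" in a TOC defn string. */
    static int
    count_defn_commands(const char *defn)
    {
        int     n = 0;

        for (const char *p = defn; *p != '\0'; p++)
        {
            if (*p == ';')
                n++;                /* may overcount semicolons in literals */
        }
        return n > 0 ? n : 1;       /* treat an entry as at least one action */
    }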

(This is undoubtedly a hack, and maybe someday we'll be able to
revert it after fixing the backend's bloat issues or rethinking
what pg_dump emits in binary upgrade mode.  But that surely isn't
a project for v17.)

Thanks to Alexander Korotkov for the let's-count-semicolons idea.

Per report from Justin Pryzby.  Back-patch to v17 where txn_size mode
was introduced.

Discussion: https://postgr.es/m/ZqEND4ZcTDBmcv31@pryzbyj2023
2024-07-29 12:17:24 -04:00
Tom Lane
b3f0e0503f Reduce number of commands dumpTableSchema emits for binary upgrade.
Avoid issuing a separate SQL UPDATE command for each column when
directly manipulating pg_attribute contents in binary upgrade mode.
With the separate updates, we triggered a relcache invalidation with
each update.  For a table with N columns, that causes O(N^2) relcache
bloat in txn_size mode because the table's newly-created relcache
entry can't be flushed till end of transaction.  Reducing the number
of commands should make it marginally faster as well as avoiding that
problem.

While at it, likewise avoid issuing a separate UPDATE on pg_constraint
for each inherited constraint.  This is less exciting, first because
inherited (non-partitioned) constraints are relatively rare, and
second because the backend has a good deal of trouble anyway with
restoring tables containing many such constraints, due to
MergeConstraintsIntoExisting being horribly inefficient.  But it seems
more consistent to do it this way here too, and it surely can't hurt.

In passing, fix one place in dumpTableSchema that failed to use ONLY
in ALTER TABLE.  That's not a live bug, but it's inconsistent.
Also avoid silently casting away const from string literals.

Per report from Justin Pryzby.  Back-patch to v17 where txn_size mode
was introduced.

Discussion: https://postgr.es/m/ZqEND4ZcTDBmcv31@pryzbyj2023
2024-07-29 11:53:49 -04:00
Heikki Linnakangas
0393f542d7 Fix double-release of spinlock
Commit 9d9b9d46f3 added spinlocks to protect the fields in ProcSignal
flags, but in EmitProcSignalBarrier(), the spinlock was released
twice. With most spinlock implementations, releasing a lock that's not
held is not easy to notice, because most of the time it does nothing,
but if the spinlock was concurrently acquired by another process, it
could lead to more serious issues. Fortunately, with the
--disable-spinlocks emulation implementation, it caused more visible
failures.
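
For reference, the intended discipline is one release per acquire,
roughly as in this sketch (ExampleSlot and the flag are placeholders;
SpinLockAcquire/SpinLockRelease are the backend's spinlock macros):

    #include "postgres.h"
    #include "storage/spin.h"

    typedef struct ExampleSlot          /* placeholder structure */
    {
        slock_t     mutex;
        int         flags;
    } ExampleSlot;

    static void
    example_set_flag(ExampleSlot *slot, int flag)
    {
        SpinLockAcquire(&slot->mutex);
        slot->flags |= flag;            /* only trivial work while holding it */
        SpinLockRelease(&slot->mutex);  /* exactly one release per acquire */
    }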

In passing, fix a typo in a comment and add an assertion that the
procNumber passed to SendProcSignal looks valid.

Discussion: https://www.postgresql.org/message-id/b8ce284c-18a2-4a79-afd3-1991a2e7d246@iki.fi
2024-07-29 18:17:33 +03:00
Heikki Linnakangas
8bda213ec1 Fix compiler warning/error about typedef redefinitions
Per buildfarm member 'sifaka':

    procsignal.c:87:3: error: redefinition of typedef 'ProcSignalHeader' is a C11 feature [-Werror,-Wtypedef-redefinition]
2024-07-29 16:23:30 +03:00
Heikki Linnakangas
9d9b9d46f3 Move cancel key generation to after forking the backend
Move responsibility of generating the cancel key to the backend
process. The cancel key is now generated after forking, and the
backend advertises it in the ProcSignal array. When a cancel request
arrives, the backend handling it scans the ProcSignal array to find
the target pid and cancel key. This is similar to how this previously
worked in the EXEC_BACKEND case with the ShmemBackendArray, just
reusing the ProcSignal array.

One notable change is that we no longer generate cancellation keys for
non-backend processes. We generated them before just to prevent a
malicious user from canceling them; the keys for non-backend processes
were never actually given to anyone. There is now an explicit flag
indicating whether a process has a valid key or not.

I wrote this originally in preparation for supporting longer cancel
keys, but it's a nice cleanup on its own.

Reviewed-by: Jelte Fennema-Nio
Discussion: https://www.postgresql.org/message-id/508d0505-8b7a-4864-a681-e7e5edfe32aa@iki.fi
2024-07-29 15:37:48 +03:00
Heikki Linnakangas
19de089cdc Fix outdated comment in smgrtruncate()
Commit c5315f4f44 replaced smgr_fsm_nblocks and smgr_vm_nblocks with
smgr_cached_nblocks, but forgot to update this comment.

Author: Kirill Reshke
Discussion: https://www.postgresql.org/message-id/CALdSSPh9VA6SDSVjrcmSPEYramf%2BrFisK7GqJo1dtRnD3vddmA@mail.gmail.com
2024-07-29 14:23:23 +03:00
Richard Guo
513f4472a4 Reduce memory used by partitionwise joins
In try_partitionwise_join, we aim to break down the join between two
partitioned relations into joins between matching partitions.  To
achieve this, we iterate through each pair of partitions from the two
joining relations and create child-join relations for them.  With
potentially thousands of partitions, the local objects allocated in
each iteration can accumulate significant memory usage.  Therefore, we
opt to eagerly free these local objects at the end of each iteration.

In line with this approach, this patch frees the bitmap set that
represents the relids of child-join relations at the end of each
iteration.  Additionally, it modifies build_child_join_rel() to reuse
the AppendRelInfo structures generated within each iteration.

Author: Ashutosh Bapat
Reviewed-by: David Christensen, Richard Guo
Discussion: https://postgr.es/m/CAExHW5s4EqY43oB=ne6B2=-xLgrs9ZGeTr1NXwkGFt2j-OmaQQ@mail.gmail.com
2024-07-29 11:35:51 +09:00
Richard Guo
f47b33a191 Simplify create_merge_append_path for clarity
We don't currently support parameterized MergeAppend paths: there's
little use for an ordered path on the inside of a nestloop.  Given
this, we can simplify create_merge_append_path by directly setting
param_info to NULL instead of calling get_appendrel_parampathinfo.  We
can also simplify the Assert for child paths a little bit.

This change won't make any measurable difference in performance; it's
just for clarity's sake.

Author: Richard Guo
Reviewed-by: Alena Rybakina, Paul A Jungwirth
Discussion: https://postgr.es/m/CAMbWs4_n1bgH2nACMuGsXZct3KH6PBFS0tPdQsXdstRfyxTunQ@mail.gmail.com
2024-07-29 11:33:18 +09:00
Jeff Davis
2e68077b07 Refactor pg_set_regex_collation() for clarity.
Discussion: https://postgr.es/m/63409030-2746-462e-beac-759bd43032ce@proxel.se
Reviewed-by: Andreas Karlsson
2024-07-28 16:55:17 -07:00
David Rowley
da87dc07f1 Add missing pointer dereference in pg_backend_memory_contexts view
32d3ed816 moved the logic for setting the context's name and ident into
a reusable function.  I missed adding a pointer dereference after
copying and pasting the code into that function.  The ident parameter is
a pointer to the ident variable in the calling function, so the
dereference is required to correctly determine whether the contents of
that variable are NULL.

In passing, adjust the if condition to include an == NULL to make it
clearer that it's not checking for == '\0'.

Reported-by: Tom Lane, Coverity
Discussion: https://postgr.es/m/2256588.1722184287@sss.pgh.pa.us
2024-07-29 09:53:10 +12:00
Jeff Davis
c0ef1234df Fix whitespace in commit 005c6b833f.
2024-07-28 13:34:52 -07:00
Jeff Davis
1c461a8d8d Refactor: make default_locale internal to pg_locale.c.
Discussion: https://postgr.es/m/2228884bb1f1a02614b39f71a90c94d2cc8a3a2f.camel@j-davis.com
Reviewed-by: Peter Eisentraut, Andreas Karlsson
2024-07-28 13:07:25 -07:00
Jeff Davis
005c6b833f Change collation cache to use simplehash.h.
Speeds up text comparison expressions when using a collation other
than the database default collation. Does not affect larger operations
such as ORDER BY, because the lookup is only done once.

Discussion: https://postgr.es/m/7bb9f018d20a7b30b9a7f6231efab1b5e50c7720.camel@j-davis.com
Reviewed-by: John Naylor, Andreas Karlsson
2024-07-28 12:39:57 -07:00
Alexander Korotkov
cdd6ab9d1f amcheck: Optimize speed of checking for unique constraint violation
Currently, when amcheck validates a unique constraint, it visits the heap for
each index tuple.  This commit implements skipping of keys that have only
one non-deduplicated index tuple (quite a common case for unique indexes).
That substantially reduces index checking time.

Reported-by: Noah Misch
Discussion: https://postgr.es/m/20240325020323.fd.nmisch%40google.com
Author: Alexander Korotkov, Pavel Borisov
2024-07-28 13:50:57 +03:00
David Rowley
b181062aa5 Fix incorrect return value for pg_size_pretty(bigint)
pg_size_pretty(bigint) would return the value in bytes rather than PB
for the smallest possible bigint value.  This happened due to an incorrect
assumption that the absolute value of -9223372036854775808 could be
stored inside a signed 64-bit type.

Here we fix that by instead storing that value in an unsigned 64-bit type.
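
The underlying pitfall, in isolation (a generic sketch, not the
committed code):

    /* Negating INT64_MIN in a signed type is undefined; go unsigned first. */
    #include <stdint.h>

    static uint64_t
    abs_int64(int64_t v)
    {
        if (v < 0)
            return -(uint64_t) v;   /* safe: conversion happens before negation */
        return (uint64_t) v;
    }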

This bug does exist in versions prior to 15 but the code there is
sufficiently different and the bug seems sufficiently non-critical that
it does not seem worth risking backpatching further.

Author: Joseph Koshakow <koshy44@gmail.com>
Discussion: https://postgr.es/m/CAAvxfHdTsMZPWEHUrZ=h3cky9Ccc3Mtx2whUHygY+ABP-mCmUw@mail.gmail.com
Backpatch-through: 15
2024-07-28 22:22:52 +12:00
Peter Eisentraut
1e666fd7c6 libpq: Use strerror_r instead of strerror
Commit 453c4687377 introduced a use of strerror() into libpq, but that
is not thread-safe.  Fix by using strerror_r() instead.
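
The thread-safe pattern looks roughly like this sketch (note that
strerror_r() has POSIX and GNU variants with different return types,
which real code must account for):

    /* POSIX-flavored strerror_r(): message goes into a caller-owned buffer. */
    #include <stdio.h>
    #include <string.h>

    static void
    report_errno(int err)
    {
        char    msgbuf[256];

        if (strerror_r(err, msgbuf, sizeof(msgbuf)) != 0)
            snprintf(msgbuf, sizeof(msgbuf), "unknown error %d", err);
        fprintf(stderr, "operation failed: %s\n", msgbuf);
    }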

In passing, update some of the code comments added by 453c4687377, as
we have learned more about the reason for the change in OpenSSL that
started this.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/b6fb018b-f05c-4afd-abd3-318c649faf18@highgo.ca
2024-07-28 09:23:24 +02:00
Tom Lane
da4017a694 Doc: fix text's description of regexp_replace's arguments.
Section 9.7.3 had a syntax synopsis for regexp_replace()
that was different from Table 9.10's, but still wrong.
Update that one too.  Oversight in 580f8727c.

Jian He

Discussion: https://postgr.es/m/CACJufxG3NFKKsh6x4fRLv8h3V-HvN4W5dA=zNKMxsNcDwOKang@mail.gmail.com
2024-07-27 15:38:54 -04:00
David Rowley
17a5871d9d Optimize escaping of JSON strings
There were quite a few places where we either had a non-NUL-terminated
string or a text Datum on which we needed to call escape_json().  Many of
these places had to create a temporary string because escape_json()
needs a NUL-terminated cstring.  For text types, those first had to be
converted to cstring before calling escape_json() on them.

Here we introduce two new functions to make escaping JSON more optimal:

escape_json_text() can be given a text Datum to append onto the given
buffer.  This is more efficient as it forgoes the need to convert the text
Datum into a cstring.  A temporary allocation is only required if the text
Datum needs to be detoasted.

escape_json_with_len() can be used when the length of the cstring is
already known or the given string isn't NUL-terminated.  Having this
allows various places which were creating a temporary NUL-terminated
string to just call escape_json_with_len() without any temporary memory
allocations.

Discussion: https://postgr.es/m/CAApHDvpLXwMZvbCKcdGfU9XQjGCDm7tFpRdTXuB9PVgpNUYfEQ@mail.gmail.com
Reviewed-by: Melih Mutlu, Heikki Linnakangas
2024-07-27 23:46:07 +12:00
Heikki Linnakangas
67427f1009 Support falling back to non-preferred readline implementation with meson
To build with -Dreadline=enabled one can use either readline or
libedit. The -Dlibedit_preferred flag is supposed to control the order
of names to look up.  This works fine when either both libraries are
present or -Dreadline is set to auto. However, explicitly enabling
readline with only libedit present but not setting libedit_preferred,
or enabling readline with only readline present but setting
libedit_preferred, are both broken. This is because cc.find_library
will throw an error for a missing dependency as soon as the first
required dependency is checked, making it impossible to fall back to
the alternative.

Here we only check the second of the two dependencies for
requiredness, so we only fail when neither of the two can be found.

Author: Wolfgang Walther
Reviewed-by: Nazir Bilal Yavuz, Alvaro Herrera, Peter Eisentraut
Reviewed-by: Tristan Partin
Discussion: https://www.postgresql.org/message-id/ca8f37e1-a2c3-40e2-91f6-59c3d3652ad4@technowledgy.de
Backpatch: 16-, where meson support was added
2024-07-27 13:53:16 +03:00
Heikki Linnakangas
ff34ae368b Support absolute bindir/libdir in regression tests with meson
Passing an absolute bindir/libdir will install the binaries and
libraries to <build>/tmp_install/<bindir> and
<build>/tmp_install/<libdir> respectively.

This path is correctly passed to the regression test suite via
configure/make, but not via meson, yet. This is because the "/"
operator in the following expression throws away the whole left side
when the right side is an absolute path:

    test_install_location / get_option('libdir')

This was already correctly handled for dir_prefix, which is likely
absolute as well. This patch handles both bindir and libdir in the
same way - prefixing absolute paths with the tmp_install path
correctly.

Author: Wolfgang Walther
Reviewed-by: Nazir Bilal Yavuz, Alvaro Herrera, Peter Eisentraut
Reviewed-by: Tristan Partin
Discussion: https://www.postgresql.org/message-id/ca8f37e1-a2c3-40e2-91f6-59c3d3652ad4@technowledgy.de
Backpatch: 16-, where meson support was added
2024-07-27 13:53:14 +03:00
Heikki Linnakangas
4d8de281b5 Fallback to clang in PATH with meson
Some distributions put clang into a different path than the llvm
binary path.

For example, this is the case on NixOS / nixpkgs, which failed to find
clang with meson before this patch.

Author: Wolfgang Walther
Reviewed-by: Nazir Bilal Yavuz, Alvaro Herrera, Peter Eisentraut
Reviewed-by: Tristan Partin
Discussion: https://www.postgresql.org/message-id/ca8f37e1-a2c3-40e2-91f6-59c3d3652ad4@technowledgy.de
Backpatch: 16-, where meson support was added
2024-07-27 13:53:11 +03:00
Heikki Linnakangas
a00fae9d43 Fallback to uuid for ossp-uuid with meson
The upstream name for the ossp-uuid package / pkg-config file is
"uuid". Many distributions change this to be "ossp-uuid" to not
conflict with e2fsprogs.

This lookup fails on distributions which don't change this name, for
example NixOS / nixpkgs. Both "ossp-uuid" and "uuid" are also checked
in configure.ac.

Author: Wolfgang Walther
Reviewed-by: Nazir Bilal Yavuz, Alvaro Herrera, Peter Eisentraut
Reviewed-by: Tristan Partin
Discussion: https://www.postgresql.org/message-id/ca8f37e1-a2c3-40e2-91f6-59c3d3652ad4@technowledgy.de
Backpatch: 16-, where meson support was added
2024-07-27 13:53:08 +03:00
Michael Paquier
c9e2457390 Fix more holes with SLRU code in need of int64 for segment numbers
This is a continuation of 3937cadfd438, taking care of more areas I have
managed to miss previously.

Reported-by: Noah Misch
Reviewed-by: Noah Misch
Discussion: https://postgr.es/m/20240724130059.1f.nmisch@google.com
Backpatch-through: 17
2024-07-27 07:16:52 +09:00
Nathan Bossart
0dcaea5690 Introduce num_os_semaphores GUC.
The documentation for System V IPC parameters provides complicated
formulas to determine the appropriate values for SEMMNI and SEMMNS.
Furthermore, these formulas have often been wrong because folks
forget to update them (e.g., when adding a new auxiliary process).

This commit introduces a new runtime-computed GUC named
num_os_semaphores that reports the number of semaphores needed for
the configured number of allowed connections, worker processes,
etc.  This new GUC allows us to simplify the formulas in the
documentation, and it should help prevent future inaccuracies.
Like the other runtime-computed GUCs, users can view it with
"postgres -C" before starting the server, which is useful for
preconfiguring the necessary operating system resources.

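For example, the value can also be inspected on a running server (the
value shown is hypothetical):

    SHOW num_os_semaphores;
    --  num_os_semaphores
    -- -------------------
    --  128
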
Reviewed-by: Tom Lane, Sami Imseih, Andres Freund, Robert Haas
Discussion: https://postgr.es/m/20240517164452.GA1914161%40nathanxps13
2024-07-26 15:28:55 -05:00
Robert Haas
8a53539bd6 Wait for WAL summarization to catch up before creating .partial file.
When a standby is promoted, CleanupAfterArchiveRecovery() may decide
to rename the final WAL file from the old timeline by adding ".partial"
to the name. If WAL summarization is enabled and this file is renamed
before its partial contents are summarized, WAL summarization breaks:
the summarizer gets stuck at that point in the WAL stream and just
errors out.

To fix that, first make the startup process wait for WAL summarization
to catch up before renaming the file. Generally, this should be quick,
and if it's not, the user can shut off summarize_wal and try again.
To make this fix work, also teach the WAL summarizer that after a
promotion has occurred, no more WAL can appear on the previous
timeline: previously, the WAL summarizer wouldn't switch to the new
timeline until we actually started writing WAL there, but that meant
that when the startup process was waiting for the WAL summarizer, it
was waiting for an action that the summarizer wasn't yet prepared to
take.

In the process of fixing these bugs, I realized that the logic to wait
for WAL summarization to catch up was spread out in a way that made
it difficult to reuse properly, so this code refactors things to make
it easier.

Finally, add a test case that would have caught this bug and the
previously-fixed bug that WAL summarization sometimes needs to back up
when the timeline changes.

Discussion: https://postgr.es/m/CA+TgmoZGEsZodXC4f=XZNkAeyuDmWTSkpkjCEOcF19Am0mt_OA@mail.gmail.com
2024-07-26 15:00:48 -04:00
Fujii Masao
454aab4b73 postgres_fdw: Fix bug in connection status check.
The buildfarm member "hake" reported a failure in the regression test
added by commit 857df3cef7, where postgres_fdw_get_connections(true)
returned unexpected results.

The function postgres_fdw_get_connections(true) checks
whether a connection is closed by including POLLRDHUP in the requested
events and calling poll().  Previously, the function only considered
POLLRDHUP or 0 as valid returned events.  However, poll() can also
return POLLHUP, POLLERR, and/or POLLNVAL, so if any of these events
were returned, postgres_fdw_get_connections(true) would report
incorrect results because it failed to account for them.

This commit updates postgres_fdw_get_connections(true) to correctly
report a closed connection when poll() returns not only POLLRDHUP
but also POLLHUP, POLLERR, or POLLNVAL.

Discussion: https://postgr.es/m/fd8f6186-9e1e-4b9a-92c5-e71e3697d381@oss.nttdata.com
2024-07-27 03:58:48 +09:00
Nathan Bossart
4b56bb4ab4 pg_upgrade: Move live_check variable to user_opts.
At the moment, pg_upgrade stores whether it is doing a "live check"
(i.e., the user specified --check and the old server is still
running) in a local variable scoped to main().  This live_check
variable is passed to several functions.  To further complicate
matters, a few call sites provide a hard-coded "false" as the
live_check argument.  Specifically, this is done when calling these
functions for the new cluster, for which any live-check-only paths
won't apply.

This commit moves the live_check variable to the global user_opts
variable, which stores information about the options the user
specified on the command line.  This allows us to remove the
live_check parameter from several functions.  For the functions
with callers that provide a hard-coded "false" as the live_check
argument (e.g., get_control_data()), we verify the given cluster is
the old cluster before taking any live-check-only paths.

This small refactoring effort helps simplify some proposed changes
that would parallelize many of pg_upgrade's once-in-each-database
tasks using libpq's asynchronous APIs.  By removing the live_check
parameter, we can more easily convert the functions to callbacks
for the new parallel system.

Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-07-26 13:37:32 -05:00
Tom Lane
5d1d8b3c82 Clarify error message and documentation related to typed tables.
We restrict typed tables (those declared as "OF composite_type")
to be based on stand-alone composite types, not composite types
that are the implicitly-created rowtypes of other tables.
But if you tried to do that, you got the very confusing error
message "type foo is not a composite type".  Provide a more specific
message for that case.  Also clarify related documentation in the
CREATE TABLE man page.

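A minimal sketch of the rejected case (object names are illustrative):

    CREATE TABLE parent_tbl (a int, b text);
    -- parent_tbl's rowtype is an implicitly-created composite type,
    -- so this is rejected; the error message now says so specifically
    CREATE TABLE typed_tbl OF parent_tbl;
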
Erik Wienhold and David G. Johnston, per complaint from Hannu Krosing.

Discussion: https://postgr.es/m/CAMT0RQRysCb_Amy5CTENSc5GfsvXL1a4qX3mv_hx31_v74P==g@mail.gmail.com
2024-07-26 12:39:45 -04:00
Robert Haas
c883453cb2 Fix indentation. 2024-07-26 12:00:04 -04:00
Daniel Gustafsson
161c73462b Fix macro placement in pg_config.h.in
Commit 274bbced85383e831dde accidentally placed the pg_config.h.in
entry for SSL_CTX_set_num_tickets on the wrong line with respect to
where autoheader places it.  Fix by rearranging, and backpatch to the
same level as the original commit.

Reported-by: Marina Polyakova <m.polyakova@postgrespro.ru>
Discussion: https://postgr.es/m/48cebe8c3eaf308bae253b1dbf4e4a75@postgrespro.ru
Backpatch-through: v12
2024-07-26 16:25:28 +02:00
Robert Haas
cf8a489836 Allow WAL summarization to back up when timeline changes.
The old code believed that it was not possible to switch timelines
without first replaying all of the WAL from the old timeline, but
that turns out to be false, as demonstrated by an example from Fujii
Masao. As a result, it assumed that summarization would always
continue from the LSN where summarization previously ended. But in
fact, when a timeline switch occurs without replaying all the WAL
from the previous timeline, we may need to back up to an earlier
LSN. Adjust accordingly.

Discussion: https://postgr.es/m/CA+TgmoZGEsZodXC4f=XZNkAeyuDmWTSkpkjCEOcF19Am0mt_OA@mail.gmail.com
2024-07-26 09:50:31 -04:00
Fujii Masao
857df3cef7 postgres_fdw: Add connection status check to postgres_fdw_get_connections().
This commit extends the postgres_fdw_get_connections() function
to check if connections are closed. This is useful for detecting closed
postgres_fdw connections that could prevent successful transaction
commits. Users can roll back transactions immediately upon detecting
closed connections, avoiding unnecessary processing of failed
transactions.

This feature is available only on systems supporting the non-standard
POLLRDHUP extension to the poll system call, including Linux.

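A sketch of how the check might be used (the check_conn parameter name
and the closed column name are assumptions, not confirmed by this
commit message):

    -- probe each postgres_fdw connection and report the closed ones
    SELECT server_name, closed
    FROM postgres_fdw_get_connections(check_conn => true);
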
Author: Hayato Kuroda
Reviewed-by: Shinya Kato, Zhihong Yu, Kyotaro Horiguchi, Andres Freund
Reviewed-by: Onder Kalaci, Takamichi Osumi, Vignesh C, Tom Lane, Ted Yu
Reviewed-by: Katsuragi Yuta, Peter Smith, Shubham Khanna, Fujii Masao
Discussion: https://postgr.es/m/TYAPR01MB58662809E678253B90E82CE5F5889@TYAPR01MB5866.jpnprd01.prod.outlook.com
2024-07-26 22:16:39 +09:00
Fujii Masao
c297a47c5f postgres_fdw: Add "used_in_xact" column to postgres_fdw_get_connections().
This commit extends the postgres_fdw_get_connections() function to
include a new used_in_xact column, indicating whether each connection
is used in the current transaction.

This addition is particularly useful for the upcoming feature that
will check if connections are closed. By using this information,
users can verify whether postgres_fdw connections used in a transaction
remain open. If any connection is closed, the transaction cannot
be committed successfully. In that case users can roll it back
immediately without waiting for the transaction to end.

The SQL API for postgres_fdw_get_connections() is updated by
this commit and may change in the future. To handle compatibility
with older SQL declarations, an API versioning system is introduced,
allowing the function to behave differently based on the API version.

Author: Hayato Kuroda
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/be9382f7-5072-4760-8b3f-31d6dffa8d62@oss.nttdata.com
2024-07-26 22:15:51 +09:00
Peter Eisentraut
5687f8c0dd pg_createsubscriber: Message style improvements
Refactor some messages, improve quoting.
2024-07-26 14:55:42 +02:00
Heikki Linnakangas
ef7fa900fb Add tests for errors during SSL or GSSAPI handshake
These test that libpq correctly falls back to a plaintext connection
on handshake error, in the "prefer" modes.

Reviewed-by: Michael Paquier
Discussion: https://www.postgresql.org/message-id/CAOYmi%2Bnwvu21mJ4DYKUa98HdfM_KZJi7B1MhyXtnsyOO-PB6Ww%40mail.gmail.com
2024-07-26 15:12:23 +03:00
Heikki Linnakangas
20e0e7da9b Add test for early backend startup errors
The new test exercises the libpq fallback behavior on an early error,
which was fixed in the previous commit.

This adds an IS_INJECTION_POINT_ATTACHED() macro, to allow writing
injected test code alongside the normal source code. In principle, the
new test could've been implemented by an extra test module with a
callback that sets the FrontendProtocol global variable, but I think
it's more clear to have the test code right where the injection point
is, because it has pretty intimate knowledge of the surrounding
context it runs in.

Reviewed-by: Michael Paquier
Discussion: https://www.postgresql.org/message-id/CAOYmi%2Bnwvu21mJ4DYKUa98HdfM_KZJi7B1MhyXtnsyOO-PB6Ww%40mail.gmail.com
2024-07-26 15:12:21 +03:00
Heikki Linnakangas
b9e5249c29 Fix using injection points at backend startup in EXEC_BACKEND mode
Commit 86db52a506 changed the locking of injection points to use only
atomic ops and spinlocks, to make it possible to define injection
points in processes that don't have a PGPROC entry (yet). However, it
didn't work in EXEC_BACKEND mode, because the pointer to shared memory
area was not initialized until the process "attaches" to all the
shared memory structs. To fix, pass the pointer to the child process
along with other global variables that need to be set up early.

Backpatch-through: 17
2024-07-26 15:11:50 +03:00
Heikki Linnakangas
c95d2159c1 Fix fallback behavior when server sends an ERROR early at startup
With sslmode=prefer, the desired behavior is to completely fail the
connection attempt, *not* fall back to a plaintext connection, if the
server responds to the SSLRequest with an error ('E') response instead
of rejecting SSL with an 'N' response. This was broken in commit
05fd30c0e7.

Reported-by: Jacob Champion
Reviewed-by: Michael Paquier
Discussion: https://www.postgresql.org/message-id/CAOYmi%2Bnwvu21mJ4DYKUa98HdfM_KZJi7B1MhyXtnsyOO-PB6Ww%40mail.gmail.com
Backpatch-through: 17
2024-07-26 15:00:36 +03:00
Fujii Masao
284c030a10 doc: Enhance documentation for postgres_fdw_get_connections() output columns.
The documentation previously described the output columns of
postgres_fdw_get_connections() in text format, which was manageable
for the original two columns. However, upcoming patches will add
new columns, making text descriptions less readable.

This commit updates the documentation to use a table format,
making it easier for users to understand each output column.

Author: Fujii Masao, Hayato Kuroda
Reviewed-by: Hayato Kuroda
Discussion: https://postgr.es/m/d04aae8d-05f5-42f4-a263-b962334d9f75@oss.nttdata.com
2024-07-26 20:47:05 +09:00
Daniel Gustafsson
274bbced85 Disable all TLS session tickets
OpenSSL supports two types of session tickets for TLSv1.3, stateless
and stateful. The option we've used only turns off stateless tickets,
leaving stateful tickets active. Use the new API introduced in 1.1.1
to disable all types of tickets.

Backpatch to all supported versions.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20240617173803.6alnafnxpiqvlh3g@awork3.anarazel.de
Backpatch-through: v12
2024-07-26 11:09:45 +02:00
Amit Langote
6f9a62b454 SQL/JSON: Remove useless code in ExecInitJsonExpr()
The code was for adding an unconditional JUMP to the next step,
which is unnecessary processing.

Reported-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-07-26 16:38:46 +09:00
Amit Langote
4fc6a55560 SQL/JSON: Respect OMIT QUOTES when RETURNING domains over jsonb
populate_domain() didn't take into account the omit_quotes flag passed
down to json_populate_type() by ExecEvalJsonCoercion() and that led
to incorrect behavior when the RETURNING type is a domain over
jsonb.  Fix that by passing the flag via a new function parameter
to populate_domain().

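A rough sketch of the affected case (names are illustrative):

    CREATE DOMAIN jsonb_dom AS jsonb;
    -- with the fix, OMIT QUOTES is honored for the domain just as it is
    -- for a plain jsonb RETURNING type
    SELECT JSON_QUERY('{"key": "value"}', '$.key'
                      RETURNING jsonb_dom OMIT QUOTES);
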
Reported-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-07-26 16:08:13 +09:00
Amit Langote
231b7d670b SQL/JSON: Improve error-handling of JsonBehavior expressions
Instead of returning NULL when the JsonBehavior expression value
could not be coerced to the RETURNING type, throw an error informing
the user that it is the JsonBehavior expression that caused the error,
with the actual coercion error message shown in its DETAIL line.

Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-07-26 16:00:56 +09:00
Amit Langote
63e6c5f4a2 SQL/JSON: Fix error-handling of some JsonBehavior expressions
To ensure that errors from executing a JsonBehavior expression that
is coerced in the parser are caught instead of being thrown directly,
pass ErrorSaveContext to ExecInitExprRec() when initializing it.
Also, add an EEOP_JSONEXPR_COERCION_FINISH step to handle the errors
that are caught that way.

Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-07-26 16:00:06 +09:00
Tom Lane
c7301c3b6f Doc: fix misleading syntax synopses for targetlists.
In the syntax synopses for SELECT, INSERT, UPDATE, etc,
SELECT ... and RETURNING ... targetlists were missing { ... }
braces around an OR (|) operator.  That allows misinterpretation
which could lead to confusion.

David G. Johnston, per gripe from masondeanm@aol.com.

Discussion: https://postgr.es/m/172193970148.915373.2403176471224676074@wrigleys.postgresql.org
2024-07-25 19:52:08 -04:00
Tom Lane
e458dc1ac8 Doc: update some HTTP links to point to canonical URLs.
These aren't actually broken at present, but we might as well
avoid redirects.

Joel Jacobson

Discussion: https://postgr.es/m/8ccc96c7-0515-491b-be98-cfacdaeda815@app.fastmail.com
2024-07-25 16:38:28 -04:00
Robert Haas
744ddc6c6a Document restrictions regarding incremental backups and standbys.
If you try to take an incremental backup on a standby and there hasn't
been much system activity, it might fail. Document why this happens.
Also add a hint to the error message you get, to make it more likely
that users will understand what has gone wrong.

Laurenz Albe and Robert Haas

Discussion: https://postgr.es/m/5468641ad821dad7aa3b2d65bf843146443a1b68.camel@cybertec.at
2024-07-25 15:45:06 -04:00
Tom Lane
580f8727ca Add argument names to the regexp_XXX functions.
This change allows these functions to be called using named-argument
notation, which can be helpful for readability, particularly for
the ones with many arguments.

There was considerable debate about exactly which names to use,
but in the end we settled on the names already shown in our
documentation table 9.10.

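For illustration, a call using named-argument notation (the argument
names follow documentation table 9.10, as described above):

    SELECT regexp_replace(string => 'one two two', pattern => 'two',
                          replacement => '2', flags => 'g');
    -- expected: 'one 2 2'
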
The citext extension provides citext-aware versions of some of
these functions, so add argument names to those too.

In passing, fix table 9.10's syntax synopses for regexp_match,
which were slightly wrong about which combinations of arguments
are allowed.

Jian He, reviewed by Dian Fay and others

Discussion: https://postgr.es/m/CACJufxG3NFKKsh6x4fRLv8h3V-HvN4W5dA=zNKMxsNcDwOKang@mail.gmail.com
2024-07-25 14:51:46 -04:00
Peter Eisentraut
05faf06e9c pg_createsubscriber: Message improvements
Objects are typically "in" a database, not "on".
2024-07-25 15:25:42 +02:00
Daniel Gustafsson
88e3da5658 pg_upgrade: Remove unused macro
Commit f06b1c598 removed validate_exec from pg_upgrade and instead
exported it from src/common, but the macro for checking the executable
suffix on Windows was accidentally left behind.  Fix by removing it.

Author: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/c1d63754-cb85-2d8a-8409-bde2c4d2d04b@gmail.com
2024-07-25 15:03:50 +02:00
Daniel Gustafsson
cc59f9d0ff pgcrypto: Remove unused binary from clean target
Generation of the gen-rtab binary was removed in db7d1a7b0 but it
was accidentally left in the cleaning target.  Remove since it is
no longer built.

Author: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/c1d63754-cb85-2d8a-8409-bde2c4d2d04b@gmail.com
2024-07-25 14:27:01 +02:00
Peter Eisentraut
c5c7183026 Remove useless unconstify() call
This should have been part of 67c0ef9752 but was apparently forgotten
there.
2024-07-25 11:38:05 +02:00
Peter Eisentraut
37c6923cf3 Fix -Wmissing-variable-declarations warnings for float.c special case
This adds extern declarations for the global variables defined in
float.c but not meant for external use.  This is a workaround to be
able to add -Wmissing-variable-declarations to the global set of
warning options in the near future.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
2024-07-25 10:40:04 +02:00
Peter Eisentraut
ab61c40bfa Add extern declarations for Bison global variables
This adds extern declarations for some global variables produced by
Bison that are not already declared in its generated header file.
This is a workaround to be able to add -Wmissing-variable-declarations
to the global set of warning options in the near future.

Another longer-term solution would be to convert these grammars to
"pure" parsers in Bison, to avoid global variables altogether.  Note
that the core grammar is already pure, so this patch did not need to
touch it.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
2024-07-25 09:26:08 +02:00
David Rowley
32d3ed8165 Add path column to pg_backend_memory_contexts view
"path" provides a reliable method of determining the parent/child
relationships between memory contexts.  Previously this could be done in
a non-reliable way by writing a recursive query and joining the "parent"
and "name" columns.  This wasn't reliable as the names were not unique,
which could result in joining to the wrong parent.

To make this reliable, "path" stores an array of numerical identifiers
starting with the identifier for TopLevelMemoryContext.  It contains an
element for each intermediate parent between that and the current context.

Incompatibility: Here we also adjust the "level" column to make it
1-based rather than 0-based.  A 1-based level provides a convenient way
to access elements in the "path" array. e.g. path[level] gives the
identifier for the current context.

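An illustrative (assumed) query showing how the path column can be used
to reliably collect a context together with its descendants:

    -- every context at or below CacheMemoryContext: its identifier,
    -- path[level], appears somewhere in each descendant's path array
    SELECT c.name, c.level, c.total_bytes
    FROM pg_backend_memory_contexts AS c,
         pg_backend_memory_contexts AS p
    WHERE p.name = 'CacheMemoryContext'
      AND p.path[p.level] = ANY (c.path);
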
Identifiers are not stable across multiple evaluations of the view.  In
an attempt to make these more stable for ad-hoc queries, the identifiers
are assigned breadth-first.  Contexts closer to TopLevelMemoryContext
are less likely to change between queries and during queries.

Author: Melih Mutlu <m.melihmutlu@gmail.com>
Discussion: https://postgr.es/m/CAGPVpCThLyOsj3e_gYEvLoHkr5w=tadDiN_=z2OwsK3VJppeBA@mail.gmail.com
Reviewed-by: Andres Freund, Stephen Frost, Atsushi Torikoshi,
Reviewed-by: Michael Paquier, Robert Haas, David Rowley
2024-07-25 15:03:28 +12:00
Thomas Munro
64c39bd504 ci: Pin MacPorts version to 2.9.3.
Commit d01ce180 invented a new way to find the latest MacPorts version.
By bad luck, a new beta release has just been published, and it seems
to lack some packages we need.  Go back to searching for this specific
version for now.  We still search with a pattern so that we can find the
package for the running version of macOS, but for now we always look for
2.9.3.  The code to do that had been anticipated already in a commented
out line, I just didn't expect to have to use it so soon...

Also include the whole MacPorts installation script in the cache key, so
that changes to the script cause a fresh installation.  This should make
it a bit easier to reason about the effect of changes on cached state in
github accounts using CI, when we make adjustments.

Back-patch to 15, like d01ce180.

Discussion: https://postgr.es/m/CA%2BhUKGLqJdv6RcwyZ_0H7khxtLTNJyuK%2BvDFzv3uwYbn8hKH6A%40mail.gmail.com
2024-07-25 14:48:01 +12:00
Michael Paquier
b8aa44fd4f doc: Decorate psql page with application markup tags
Noticed while looking at this area of the documentation for a separate
patch.
2024-07-25 10:59:49 +09:00
Thomas Munro
d01ce180d9 ci: Upgrade macOS version from 13 to 14.
1.  Previously we were using ghcr.io/cirruslabs/macos-XXX-base:latest
images, but Cirrus has started ignoring that and using a particular
image, currently ghcr.io/cirruslabs/macos-runner:sonoma, for github
accounts using free CI resources (as opposed to dedicated runner
machines, as cfbot uses).  Let's just ask for that image anyway, to stay
in sync.

2.  Instead of hard-coding a MacPorts installation URL, deduce it from
the running macOS version and the available releases.  This removes the
need to keep the ci_macports_packages.sh in sync with .cirrus.task.yml,
and to advance the MacPorts version from time to time.

3.  Change the cache key we use to cache the whole macports installation
across builds to include the OS major version, to trigger a fresh
installation when appropriate.

Back-patch to 15 where CI began.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA%2BhUKGLqJdv6RcwyZ_0H7khxtLTNJyuK%2BvDFzv3uwYbn8hKH6A%40mail.gmail.com
2024-07-25 11:30:55 +12:00
Nathan Bossart
364509a2e7 pg_upgrade: Retrieve subscription count more efficiently.
Presently, pg_upgrade obtains the number of subscriptions in the
to-be-upgraded cluster by first querying pg_subscription in every
database for the number of subscriptions in only that database.
Then, in count_old_cluster_subscriptions(), it adds all the values
collected in the first step.  This is expensive, especially when
there are many databases.

Fortunately, there is a better way to retrieve the subscription
count.  Since pg_subscription is a shared catalog, we only need to
connect to a single database and query it once.  This commit
modifies pg_upgrade to use that approach, which also allows us to
trim several lines of code.  In passing, move the call to
get_db_subscription_count(), which has been renamed to
get_subscription_count(), from get_db_rel_and_slot_infos() to the
dedicated >= v17 section in check_and_dump_old_cluster().

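The underlying observation, sketched as a query: because pg_subscription
is a shared catalog, one connection is enough.

    -- a single query from any one database returns the cluster-wide count
    SELECT count(*) FROM pg_subscription;
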
We may be able to make similar improvements to
get_old_cluster_logical_slot_infos(), but that is left as a future
exercise.

Reviewed-by: Michael Paquier, Amit Kapila
Discussion: https://postgr.es/m/ZprQJv_TxccN3tkr%40nathan
Backpatch-through: 17
2024-07-24 11:30:33 -05:00
Alvaro Herrera
9f21482fe1
Fix a missing article in the documentation
Per complaint from Grant Gryczan.

It's a very old typo; backpatch all the way back.

Author: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/172179789219.915368.16590585529628354757@wrigleys.postgresql.org
2024-07-24 14:13:55 +02:00
Fujii Masao
97f2bc5aa5 pg_stat_statements: Add regression test for privilege handling.
This commit adds a regression test to verify that pg_stat_statements
correctly handles privileges, improving its test coverage.

Author: Keisuke Kuroda
Reviewed-by: Michael Paquier, Fujii Masao
Discussion: https://postgr.es/m/2224ccf2e12c41ccb81702ef3303d5ac@nttcom.co.jp
2024-07-24 20:54:51 +09:00
Alvaro Herrera
3dd637f3d5
Reset relhassubclass upon attaching table as a partition
We don't allow inheritance parents as partitions, and have checks to
prevent this; but if a table _was_ an inheritance parent in the past
and all its children have been removed, the pg_class.relhassubclass
flag may remain set, which confuses the partition pruning code (most
obviously, it results in an assertion failure; in production builds it
may be worse).

Fix by resetting relhassubclass on attach.

Backpatch to all supported versions.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18550-d5e047e9a897a889@postgresql.org
2024-07-24 12:38:18 +02:00
Amit Kapila
07fbecb87b Doc: Fix the mistakes in the subscription's failover option.
The documentation incorrectly stated that users could not alter the
subscription's failover option when two-phase commit is enabled.

The steps to confirm that the standby server is ready for failover were
incorrect.

Author: Shveta Malik, Hou Zhijie
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/OS0PR01MB571657B72F8D75BD858DCCE394AD2@OS0PR01MB5716.jpnprd01.prod.outlook.com
Discussion: https://postgr.es/m/CAJpy0uBBk+OZXXqQ00Gai09XR+mDi2=9sMBYY0F+BedoFivaMA@mail.gmail.com
2024-07-24 14:24:45 +05:30
Thomas Munro
f6bef362ca Refactor tidstore.c iterator buffering.
Previously, TidStoreIterateNext() would expand the set of offsets for
each block into an internal buffer that it overwrote each time.  In
order to be able to collect the offsets for multiple blocks before
working with them, change the contract.  Now, the offsets are obtained
by a separate call to TidStoreGetBlockOffsets(), which can be called at
a later time.  TidStoreIteratorResult objects are safe to copy and store
in a queue.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CAAKRu_bbkmwAzSBgnezancgJeXrQZXy4G4kBTd+5=cr86H5yew@mail.gmail.com
2024-07-24 17:32:35 +12:00
Amit Kapila
1462aad2e4 Allow altering of two_phase option of a SUBSCRIPTION.
The two_phase option is controlled by both the publisher (as a slot
option) and the subscriber (as a subscription option), so the slot option
must also be modified.

Changing the 'two_phase' option for a subscription from 'true' to 'false'
is permitted only when there are no pending prepared transactions
corresponding to that subscription. Otherwise, the changes of already
prepared transactions can be replicated again along with their corresponding
commit, leading to duplicate data or errors.

To avoid data loss, the 'two_phase' option for a subscription can only be
changed from 'false' to 'true' once the initial data synchronization is
completed. Therefore this is performed later by the logical replication worker.

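For illustration, a sketch of the now-permitted command (the
subscription name is hypothetical):

    -- allowed only when no prepared transactions are pending for this
    -- subscription
    ALTER SUBSCRIPTION mysub SET (two_phase = false);
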
Author: Hayato Kuroda, Ajin Cherian, Amit Kapila
Reviewed-by: Peter Smith, Hou Zhijie, Amit Kapila, Vitaly Davydov, Vignesh C
Discussion: https://postgr.es/m/8fab8-65d74c80-1-2f28e880@39088166
2024-07-24 10:13:36 +05:30
Peter Eisentraut
774d47b6c0 Move all extern declarations for GUC variables to header files
Add extern declarations in appropriate header files for global
variables related to GUC.  In many cases, this was handled quite
inconsistently before, with some GUC variables declared in a header
file and some only pulled in via ad-hoc extern declarations in various
.c files.

Also add PGDLLIMPORT qualifications to those variables.  These were
previously missing because src/tools/mark_pgdllimport.pl has only been
used with header files.

This also fixes -Wmissing-variable-declarations warnings for GUC
variables (not yet part of the standard warning options).

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
2024-07-24 06:31:07 +02:00
Nathan Bossart
991f8cf8ab Detect integer overflow in array_set_slice().
When provided an empty initial array, array_set_slice() fails to
check for overflow when computing the new array's dimensions.
While such overflows are ordinarily caught by ArrayGetNItems(),
commands with the following form are accepted:

	INSERT INTO t (i[-2147483648:2147483647]) VALUES ('{}');

To fix, perform the hazardous computations using overflow-detecting
arithmetic routines.  As with commit 18b585155a, the added test
cases generate errors that include a platform-dependent value, so
we again use psql's VERBOSITY parameter to suppress printing the
message text.

Reported-by: Alexander Lakhin
Author: Joseph Koshakow
Reviewed-by: Jian He
Discussion: https://postgr.es/m/31ad2cd1-db94-bdb3-f91a-65ffdb4bef95%40gmail.com
Backpatch-through: 12
2024-07-23 21:59:02 -05:00
Peter Eisentraut
d3cc5ffe81 Move extern declarations for EXEC_BACKEND to header files
This fixes warnings from -Wmissing-variable-declarations (not yet part
of the standard warning options) under EXEC_BACKEND.  The
NON_EXEC_STATIC variables need a suitable declaration in a header file
under EXEC_BACKEND.

Also fix the inconsistent application of the volatile qualifier for
PMSignalState, which was revealed by this change.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
2024-07-23 15:07:10 +02:00
Noah Misch
840b3b5b4e Fix private struct field name to match the code using it.
Commit 8720a15e9ab121e49174d889eaeafae8ac89de7b added the wrong name.

Nazir Bilal Yavuz

Discussion: https://postgr.es/m/20240720181405.5a.nmisch@google.com
2024-07-23 05:32:03 -07:00
Michael Paquier
3937cadfd4 Use more consistently int64 for page numbers in SLRU-related code
clog.c, async.c and predicate.c included some SLRU page numbers still
handled as 4-byte integers, while int64 should be used for this purpose.

These holes were introduced by 4ed8f0913bfd, which introduced the use
of 8-byte integers for SLRU page numbers but forgot about the code
paths updated by this commit.

Reported-by: Noah Misch
Author: Aleksander Alekseev, Michael Paquier
Discussion: https://postgr.es/m/20240626002747.dc.nmisch@google.com
Backpatch-through: 17
2024-07-23 17:59:05 +09:00
Peter Eisentraut
f68d85bf69 ldapurl is supported with simple bind
The docs currently imply that ldapurl is for search+bind only, but
that's not true.  Rearrange the docs to cover this better.

Add a test for ldapurl with simple bind.  This was previously allowed
but unexercised, and now that it's documented it'd be good to pin the
behavior.

Improve the error message when mixing LDAP bind modes.  The option
names had gone stale; replace them with a more general statement.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAOYmi+nyg9gE0LeP=xQ3AgyQGR=5ZZMkVVbWd0uR8XQmg_dd5Q@mail.gmail.com
2024-07-23 10:17:55 +02:00
Peter Eisentraut
935e675f3c Get rid of a global variable
bootstrap_data_checksum_version can just as easily be passed to where
it is used via function arguments.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
2024-07-23 10:00:41 +02:00
Michael Paquier
ffb0603929 Improve comments in slru.{c,h} about segment name format
slru.h incorrectly described how SLRU segment names are formatted
depending on the segment number and on whether long or short segment
names are used.  This commit closes the gap with a better description
that matches reality.

Reported-by: Noah Misch
Author: Aleksander Alekseev
Discussion: https://postgr.es/m/20240626002747.dc.nmisch@google.com
Backpatch-through: 17
2024-07-23 16:54:51 +09:00
Peter Eisentraut
65504b747f Replace remaining strtok() with strtok_r()
for thread-safety in the server in the future

Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: David Steele <david@pgmasters.net>
Discussion: https://www.postgresql.org/message-id/flat/79692bf9-17d3-41e6-b9c9-fc8c3944222a@eisentraut.org
2024-07-23 09:20:22 +02:00
Peter Eisentraut
4d130b2872 Windows replacement for strtok_r()
They spell it "strtok_s" there.

There are currently no uses, but some will be added soon.

Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: David Steele <david@pgmasters.net>
Discussion: https://www.postgresql.org/message-id/flat/79692bf9-17d3-41e6-b9c9-fc8c3944222a@eisentraut.org
2024-07-23 09:20:22 +02:00
Richard Guo
8b2e9fd26a Remove redundant code in create_gather_merge_path
In create_gather_merge_path, we should always guarantee that the
subpath is adequately ordered, and we do not add a Sort node in
createplan.c for a Gather Merge node.  Therefore, the 'else' branch in
create_gather_merge_path, which computes the cost for a Sort node, is
redundant.

This patch removes the redundant code and emits an error if the
subpath is not sufficiently ordered.  Meanwhile, this patch changes
the check for the subpath's pathkeys in create_gather_merge_plan to an
Assert.

Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs48u=0bWf3epVtULjJ-=M9Hbkz+ieZQAOS=BfbXZFqbDCg@mail.gmail.com
2024-07-23 11:18:53 +09:00
Richard Guo
581df21487 Fix rowcount estimate for gather (merge) paths
In the case of a parallel plan, when computing the number of tuples
processed per worker, we divide the total number of tuples by the
parallel_divisor obtained from get_parallel_divisor(), which accounts
for the leader's contribution in addition to the number of workers.

Accordingly, when estimating the number of tuples for gather (merge)
nodes, we should multiply the number of tuples per worker by the same
parallel_divisor to reverse the division.  However, currently we use
parallel_workers rather than parallel_divisor for the multiplication.
This could result in an underestimation of the number of tuples for
gather (merge) nodes, especially when there are fewer than four
workers.

This patch fixes this issue by using the same parallel_divisor for the
multiplication.  There is one ensuing plan change in the regression
tests, but it looks reasonable and does not compromise its original
purpose of testing parallel-aware hash join.

In passing, this patch removes an unnecessary assignment for path.rows
in create_gather_merge_path, and fixes an uninitialized-variable issue
in generate_useful_gather_paths.

No backpatch as this could result in plan changes.

Author: Anthonin Bonnefoy
Reviewed-by: Rafia Sabih, Richard Guo
Discussion: https://postgr.es/m/CAO6_Xqr9+51NxgO=XospEkUeAg-p=EjAWmtpdcZwjRgGKJ53iA@mail.gmail.com
2024-07-23 10:33:26 +09:00
Tom Lane
d2cba4f2cb Doc: improve description of plpgsql's FETCH and MOVE commands.
We were not being clear about which variants of the "direction"
clause are permitted in MOVE.  Also, the text seemed to be
written with only the FETCH/MOVE NEXT case in mind, so it
didn't apply very well to other variants.

Also, document that "MOVE count IN cursor" only works if count
is a constant.  This is not the whole truth, because some other
cases such as a parenthesized expression will also work, but
we want to push people to use "MOVE FORWARD count" instead.
The constant case is enough to cover what we allow in plain SQL,
and that seems sufficient to claim support for.

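A short sketch of the documented recommendation (cursor and variable
names are hypothetical; as implied above, FORWARD accepts a
non-constant count):

    DO $$
    DECLARE
      curs refcursor;
      n int := 5;
    BEGIN
      OPEN curs FOR SELECT g FROM generate_series(1, 100) g;
      MOVE 3 FROM curs;           -- works: the count is a constant
      MOVE FORWARD n FROM curs;   -- preferred when the count is not a constant
      CLOSE curs;
    END $$;
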
Update a comment in pl_gram.y claiming that we don't document
that point.

Per gripe from Philipp Salvisberg.

Discussion: https://postgr.es/m/172155553388.702.7932496598218792085@wrigleys.postgresql.org
2024-07-22 19:43:12 -04:00
Melanie Plageman
efcbb76efe Revert "Test that vacuum removes tuples older than OldestXmin"
This reverts commit aa607980aee08416211f003ab41aa750f5559712.

This test proved to be unstable on the buildfarm, timing out before the
standby could catch up on 32-bit machines where more rows were required
and failing to reliably trigger multiple index vacuum rounds on 64-bit
machines where fewer rows should be required.

Because the instability is only known to be present on versions of
Postgres with TIDStore used for dead TID storage by vacuum, this is only
being reverted on master and REL_17_STABLE.

As having this coverage may be valuable, there is a discussion on the
thread of possible ways to stabilize the test. If that happens, a fixed
test can be committed again.

Backpatch-through: 17
Reported-by: Tom Lane

Discussion: https://postgr.es/m/614152.1721580711%40sss.pgh.pa.us
2024-07-22 16:58:15 -04:00
Robert Haas
6a6ebb92b0 Initialize wal_level in the initial checkpoint record.
As per Coverity and Tom Lane, commit 402b586d0 (back-patched to v17
as 2b5819e2b) forgot to initialize this new structure member in this
code path.
2024-07-22 15:32:43 -04:00
Robert Haas
e4326fbc60 Remove grotty use of disable_cost for TID scan plans.
Previously, the code charged disable_cost for CurrentOfExpr, and then
subtracted disable_cost from the cost of a TID path that used
CurrentOfExpr as the TID qual, effectively disabling all paths except
that one. Now, we instead suppress generation of the disabled paths
entirely, and generate only the one that the executor will actually
understand.

With this approach, we do not need to rely on disable_cost being
large enough to prevent the wrong path from being chosen, and we
save some CPU cycles by avoiding generating paths that we can't
actually use. In my opinion, the code is also easier to understand
like this.

Patch by me. Review by Heikki Linnakangas.

Discussion: http://postgr.es/m/591b3596-2ea0-4b8e-99c6-fad0ef2801f5@iki.fi
2024-07-22 14:57:53 -04:00
Robert Haas
c0348fd0e3 Add missing call to ConditionVariableCancelSleep().
After calling ConditionVariableSleep() or ConditionVariableTimedSleep()
one or more times, code is supposed to call ConditionVariableCancelSleep()
to remove itself from the waitlist. This code neglected to do so.
As far as I know, that had no observable consequences, but let's make
the code correct.

Discussion: http://postgr.es/m/CA+TgmoYW8eR+KN6zhVH0sin7QH6AvENqw_bkN-bB4yLYKAnsew@mail.gmail.com
2024-07-22 10:02:31 -04:00
Peter Eisentraut
5d2e1cc117 Replace some strtok() with strsep()
strtok() considers adjacent delimiters to be one delimiter, which is
arguably the wrong behavior in some cases.  Replace with strsep(),
which has the right behavior: Adjacent delimiters create an empty
token.

Affected by this are parsing of:

- Stored SCRAM secrets
  ("SCRAM-SHA-256$<iterations>:<salt>$<storedkey>:<serverkey>")

- ICU collation attributes
  ("und@colStrength=primary;colCaseLevel=yes") for ICU older than
  version 54

- PG_COLORS environment variable
  ("error=01;31:warning=01;35:note=01;36:locus=01")

- pg_regress command-line options with comma-separated list arguments
  (--dbname, --create-role) (currently only used pg_regress_ecpg)

Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: David Steele <david@pgmasters.net>
Discussion: https://www.postgresql.org/message-id/flat/79692bf9-17d3-41e6-b9c9-fc8c3944222a@eisentraut.org
2024-07-22 15:45:46 +02:00
Alvaro Herrera
90c1ba52e0
postgres_fdw: Split out the query_cancel test to its own file
This allows us to skip it in Cygwin, where it's reportedly flaky because
of platform bugs or something.

Backpatch to 17, where the test was introduced by commit 2466d6654f85.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/e4d0cb33-6be5-e4d5-ae49-9eac3ff2b005@gmail.com
2024-07-22 12:49:57 +02:00
Peter Eisentraut
683be87fbb Add port/ replacement for strsep()
from OpenBSD, similar to strlcat, strlcpy

There are currently no uses, but some will be added soon.

Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: David Steele <david@pgmasters.net>
Discussion: https://www.postgresql.org/message-id/flat/79692bf9-17d3-41e6-b9c9-fc8c3944222a@eisentraut.org
2024-07-22 09:50:30 +02:00
Richard Guo
7e187a7386 Fix unstable test in select_parallel.sql
One test case added in 22d946b0f verifies the plan of a non-parallel
nestloop join.  The planner's choice of join order is arbitrary, and
slight variations in underlying statistics could result in a different
displayed plan.  To stabilize the test result, here we enforce the
join order using a lateral join.

While here, modify the test case to verify that parallel nestloop join
is not generated if the inner path is not parallel-safe, which is what
we wanted to test in 22d946b0f.

Reported-by: Alexander Lakhin as per buildfarm
Author: Richard Guo
Discussion: https://postgr.es/m/7c09a439-e48d-5460-cfa0-a371b1a57066@gmail.com
2024-07-22 11:29:21 +09:00
Michael Paquier
2d8ef5e24f Add new error code for "file name too long"
This new error code, named file_name_too_long, maps internally to the
errno ENAMETOOLONG to produce a proper error code rather than an
internal code under errcode_for_file_access().  This error code can be
reached with some SQL command patterns, such as an overly long snapshot
file name.

Reported-by: Alexander Lakhin
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/Zo4ROR9mgy8bowMo@paquier.xyz
2024-07-22 09:28:01 +09:00
Andres Freund
5ec2c529f5 meson: Add dependency lookups via names used by cmake
Particularly on Windows it's useful to look up dependencies via cmake,
instead of pkg-config. Meson supports doing so. Unfortunately the
dependency names used by various projects often differ between their
pkg-config and cmake files.

This would look a lot neater if we could rely on meson >= 0.60.0...

Reviewed-by: Tristan Partin <tristan@partin.io>
Discussion: https://postgr.es/m/20240709065101.xhc74r3mdg2lmn4w@awork3.anarazel.de
Backpatch: 16-, where meson support was added
2024-07-20 13:51:14 -07:00
Andres Freund
2416fdb3ee meson: Add support for detecting ossp-uuid without pkg-config
This is necessary as ossp-uuid on Windows installs neither pkg-config
nor cmake dependency information.  Nor is there another supported uuid
implementation available on Windows.

Reported-by: Dave Page <dpage@pgadmin.org>
Reviewed-by: Tristan Partin <tristan@partin.io>
Discussion: https://postgr.es/m/20240709065101.xhc74r3mdg2lmn4w@awork3.anarazel.de
Backpatch: 16-, where meson support was added
2024-07-20 13:51:14 -07:00
Andres Freund
7ed2ce0b25 meson: Add support for detecting gss without pkg-config
This is required as MIT Kerberos provides neither pkg-config nor cmake
dependency information on Windows.

Reported-by: Dave Page <dpage@pgadmin.org>
Reviewed-by: Tristan Partin <tristan@partin.io>
Discussion: https://postgr.es/m/20240709065101.xhc74r3mdg2lmn4w@awork3.anarazel.de
Backpatch: 16-, where meson support was added
2024-07-20 13:51:14 -07:00
Andres Freund
c3dafaaac3 meson: Add missing argument to gssapi.h check
These were missing since the initial introduction of the meson based build, in
e6927270cd18. As-is this is unlikely to cause an issue, but a future commit
will add support for detecting gssapi without use of dependency(), which could
fail due to this.

Discussion: https://postgr.es/m/20240708225659.gmyqoosi7km6ysgn@awork3.anarazel.de
Backpatch: 16-, where the meson based build was added
2024-07-20 13:51:14 -07:00
Tom Lane
220003b9b9 Correctly check updatability of columns targeted by INSERT...DEFAULT.
If a view has some updatable and some non-updatable columns, we failed
to verify updatability of any columns for which an INSERT or UPDATE
on the view explicitly specifies a DEFAULT item (unless the view has
a declared default for that column, which is rare anyway, and one
would almost certainly not write one for a non-updatable column).
This would lead to an unexpected "attribute number N not found in
view targetlist" error rather than the intended error.

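A minimal sketch of the affected case (object names are illustrative):

    CREATE TABLE base_tbl (id int);
    CREATE VIEW v AS SELECT id, id * 2 AS doubled FROM base_tbl;
    -- "doubled" is not an updatable column; specifying DEFAULT for it now
    -- raises the intended "cannot insert into column" error instead of
    -- "attribute number N not found in view targetlist"
    INSERT INTO v (id, doubled) VALUES (1, DEFAULT);
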
Per bug #18546 from Alexander Lakhin.  This bug is old, so back-patch
to all supported branches.

Discussion: https://postgr.es/m/18546-84a292e759a9361d@postgresql.org
2024-07-20 13:40:15 -04:00
Noah Misch
8720a15e9a Use read streams in CREATE DATABASE when STRATEGY=WAL_LOG.
While this doesn't significantly change runtime now, it arranges for
STRATEGY=WAL_LOG to benefit automatically from future optimizations to
the read_stream subsystem.  For large tables in the template database,
this does read 16x as many bytes per system call.  Platforms with high
per-call overhead, if any, may see an immediate benefit.

Nazir Bilal Yavuz

Discussion: https://postgr.es/m/CAN55FZ0JKL6vk1xQp6rfOXiNFV1u1H0tJDPPGHWoiO3ea2Wc=A@mail.gmail.com
2024-07-20 04:22:12 -07:00
Noah Misch
a858be17c3 Add a way to create read stream object by using SMgrRelation.
Currently a read stream object can be created only by using a Relation.

Nazir Bilal Yavuz

Discussion: https://postgr.es/m/CAN55FZ0JKL6vk1xQp6rfOXiNFV1u1H0tJDPPGHWoiO3ea2Wc=A@mail.gmail.com
2024-07-20 04:22:12 -07:00
Noah Misch
af07a827b9 Refactor PinBufferForBlock() to remove checks about persistence.
There are checks in the PinBufferForBlock() function to set the
persistence of the relation.  This function is called for each block in
the relation.  Instead, set the persistence of the relation before
calling PinBufferForBlock().

Nazir Bilal Yavuz

Discussion: https://postgr.es/m/CAN55FZ0JKL6vk1xQp6rfOXiNFV1u1H0tJDPPGHWoiO3ea2Wc=A@mail.gmail.com
2024-07-20 04:22:12 -07:00
Noah Misch
e00c45f685 Remove "smgr_persistence == 0" dead code.
Reaching that code would have required multiple processes performing
relation extension during recovery, which does not happen.  That caller
has the persistence available, so pass it.  This was dead code as soon
as commit 210622c60e1a9db2e2730140b8106ab57d259d15 added it.

Discussion: https://postgr.es/m/CAN55FZ0JKL6vk1xQp6rfOXiNFV1u1H0tJDPPGHWoiO3ea2Wc=A@mail.gmail.com
2024-07-20 04:22:12 -07:00
Nathan Bossart
22b0ccd65d Add overflow checks to money type.
None of the arithmetic functions for the money type handle
overflow.  This commit introduces several helper functions with
overflow checking and makes use of them in the money type's
arithmetic functions.

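A sketch of the kind of overflow that is now reported as an error (the
values are illustrative, near the type's upper limit):

    -- previously this could wrap around silently; it now raises an
    -- overflow error
    SELECT '92233720368547758.07'::money + '0.01'::money;
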
Fixes bug #18240.

Reported-by: Alexander Lakhin
Author: Joseph Koshakow
Discussion: https://postgr.es/m/18240-c5da758d7dc1ecf0%40postgresql.org
Discussion: https://postgr.es/m/CAAvxfHdBPOyEGS7s%2Bxf4iaW0-cgiq25jpYdWBqQqvLtLe_t6tw%40mail.gmail.com
Backpatch-through: 12
2024-07-19 11:52:32 -05:00
Melanie Plageman
aa607980ae Test that vacuum removes tuples older than OldestXmin
If vacuum fails to prune a tuple killed before OldestXmin, it will
decide to freeze its xmax and later error out in pre-freeze checks.

Add a test reproducing this scenario to the recovery suite which creates
a table on a primary, updates the table to generate dead tuples for
vacuum, and then, during the vacuum, uses a replica to force
GlobalVisState->maybe_needed on the primary to move backwards and
precede the value of OldestXmin set at the beginning of vacuuming the
table.

This commit is separate from the fix in case there are test stability
issues.

Author: Melanie Plageman
Reviewed-by: Peter Geoghegan
Discussion: https://postgr.es/m/CAAKRu_apNU2MPBK96V%2BbXjTq0RiZ-%3DA4ZTaysakpx9jxbq1dbQ%40mail.gmail.com
2024-07-19 12:04:11 -04:00
Melanie Plageman
83c39a1f7f Ensure vacuum removes all visibly dead tuples older than OldestXmin
If vacuum fails to remove a tuple with xmax older than
VacuumCutoffs->OldestXmin and younger than GlobalVisState->maybe_needed,
it may attempt to freeze the tuple's xmax and then ERROR out in
pre-freeze checks with "cannot freeze committed xmax".

Fix this by having vacuum always remove tuples older than OldestXmin.

It is possible for GlobalVisState->maybe_needed to precede OldestXmin if
maybe_needed is forced to go backward while vacuum is running. This can
happen if a disconnected standby with a running transaction older than
VacuumCutoffs->OldestXmin reconnects to the primary after vacuum
initially calculates GlobalVisState and OldestXmin.

In back branches starting with 14, the first version using
GlobalVisState, failing to remove tuples older than OldestXmin during
pruning caused vacuum to infinitely loop in lazy_scan_prune(), as
investigated on this [1] thread. After 1ccc1e05ae removed the retry loop
in lazy_scan_prune() and stopped comparing tuples to OldestXmin, the
hang could no longer happen, but we could still attempt to freeze dead
tuples with xmax older than OldestXmin -- resulting in an ERROR.

Fix this by always removing dead tuples with xmax older than
VacuumCutoffs->OldestXmin. This is okay because the standby won't replay
the tuple removal until the tuple is removable. Thus, the worst that can
happen is a recovery conflict.

[1] https://postgr.es/m/20240415173913.4zyyrwaftujxthf2%40awork3.anarazel.de#1b216b7768b5bd577a3d3d51bd5aadee

Back-patch through 14

Author: Melanie Plageman
Reviewed-by: Peter Geoghegan, Robert Haas, Andres Freund, Heikki Linnakangas, and Noah Misch
Discussion: https://postgr.es/m/CAAKRu_bDD7oq9ZwB2OJqub5BovMG6UjEYsoK2LVttadjEqyRGg%40mail.gmail.com
2024-07-19 12:04:00 -04:00
Heikki Linnakangas
5784a493f1 Move resowner from common JitContext to LLVM specific
Only the LLVM specific code uses it since resource owners were made
extensible in commit b8bff07daa85c837a2747b4d35cd5a27e73fb7b2. This is
new in v17, so backpatch there to keep the branches from diverging
just yet.

Author: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/fd3a2a00-6605-4e30-a118-48418b478e6e@proxel.se
2024-07-19 10:27:06 +03:00
Michael Paquier
3a137ab7e5 Add more test coverage for jsonpath "$.*" with arrays
There was no coverage for the code path to unwrap an array before
applying ".*" to it, so add tests to provide more coverage for both
objects and arrays.

This shows, for example, that no results are returned for an array of
scalars, and what results are returned when the array contains an
object.  A few more scenarios are covered with the strict/lax modes and
the operator "@?".

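For illustration, roughly the kind of case the new tests cover (the
expected output is assumed):

    -- lax mode unwraps the array: the object member's value is returned,
    -- while scalar array elements yield nothing
    SELECT jsonb_path_query('[1, "two", {"a": 3}]'::jsonb, 'lax $.*');
    -- expected: 3
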
Author: David Wheeler
Reported-by: David G. Johnston, Stepan Neretin
Discussion: https://postgr.es/m/A95346F9-6147-46E0-809E-532A485D71D6@justatheory.com
2024-07-19 14:17:56 +09:00
Etsuro Fujita
5c571a34d0 postgres_fdw: Avoid "cursor can only scan forward" error.
Commit d844cd75a disallowed rewind in a non-scrollable cursor to resolve
anomalies arising from such a cursor operation.  However, this failed to
take into account the assumption in postgres_fdw that when rescanning a
foreign relation, it can rewind the cursor created for scanning the
foreign relation without specifying the SCROLL option, regardless of its
scrollability, causing this error when it tried to do such a rewind in a
non-scrollable cursor.  Fix by modifying postgres_fdw to instead
recreate the cursor, regardless of its scrollability, when rescanning
the foreign relation.  (If we had a way to check its scrollability, we
could improve this by rewinding it if it is scrollable and recreating it
if not, but we do not have it, so this commit modifies it to recreate it
in any case.)

Per bug #17889 from Eric Cyr.  Devrim Gunduz also reported this problem.
Back-patch to v15 where that commit enforced the prohibition.

Reviewed by Tom Lane.

Discussion: https://postgr.es/m/17889-e8c39a251d258dda%40postgresql.org
Discussion: https://postgr.es/m/b415ac3255f8352d1ea921cf3b7ba39e0587768a.camel%40gunduz.org
2024-07-19 13:15:00 +09:00
Michael Paquier
c145f321b6 Propagate query IDs of utility statements in functions
For utility statements defined within a function, the query tree is
copied to a PlannedStmt as utility commands do not require planning.
However, the query ID was missing from the information passed down.

This prevents plugins that rely on the query ID, like
pg_stat_statements, from tracking utility statements within function
calls.  Tests
are added to check this behavior, depending on pg_stat_statements.track.

This is an old bug.  However, since v16 (3db72ebcbe20), query IDs for
utilities have been computed from their parsed trees rather than the
query string, leading to less bloat with utilities, so backpatch down
only to that version.

Author: Anthonin Bonnefoy
Discussion: https://postgr.es/m/CAO6_XqrGp-uwBqi3vBPLuRULKkddjC7R5QZCgsFren=8E+m2Sg@mail.gmail.com
Backpatch-through: 16
2024-07-19 10:21:01 +09:00
Tom Lane
cd85ae1114 Improve pg_ctl's message for shutdown after recovery.
If pg_ctl tries to start the postmaster, but the postmaster shuts down
because it completed a point-in-time recovery, pg_ctl used to report
a message that indicated a failure.  It's not really a failure, so
instead say "server shut down because of recovery target settings".

Zhao Junwang, Crisp Lee, Laurenz Albe

Discussion: https://postgr.es/m/CAGHPtV7GttPZ-HvxZuYRy70jLGQMEm5=LQc4fKGa=J74m2VZbg@mail.gmail.com
2024-07-18 13:48:58 -04:00
Tom Lane
56c6be57af Doc: improve description of plpgsql's RAISE command.
RAISE accepts either = or := in the USING clause, so fix the
syntax synopsis to show that.

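For example, both spellings are accepted (a minimal sketch):

    DO $$
    BEGIN
        RAISE NOTICE 'first'  USING DETAIL = 'option assigned with =';
        RAISE NOTICE 'second' USING DETAIL := 'option assigned with :=';
    END;
    $$;
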
Rearrange and wordsmith the descriptions of the different syntax
variants, in hopes of improving clarity.

Igor Gnatyuk, reviewed by Jian He and Laurenz Albe; minor additional
wordsmithing by me

Discussion: https://postgr.es/m/CAEu6iLvhF5sdGeat2x4_L0FvWW_SiN--ma8ya7CZd-oJoV+yqQ@mail.gmail.com
2024-07-18 12:37:58 -04:00
Robert Haas
402b586d0a Do not summarize WAL if generated with wal_level=minimal.
To do this, we must include the wal_level in the first WAL record
covered by each summary file; so add wal_level to struct Checkpoint
and the payload of XLOG_CHECKPOINT_REDO and XLOG_END_OF_RECOVERY.

This, in turn, requires bumping XLOG_PAGE_MAGIC and, since the
Checkpoint is also stored in the control file, also
PG_CONTROL_VERSION. It's not great to do that so late in the release
cycle, but the alternative seems to be to ship v17 without robust
protections against this scenario, which could result in corrupted
incremental backups.

A side effect of this patch is that, when a server with
wal_level=replica is started with summarize_wal=on for the first time,
summarization will no longer begin with the oldest WAL that still
exists in pg_wal, but rather from the first checkpoint after that.
This change should be harmless, because a WAL summary for a partial
checkpoint cycle can never make an incremental backup possible when
it would otherwise not have been.

Report by Fujii Masao. Patch by me. Review and/or testing by Jakub
Wartak and Fujii Masao.

Discussion: http://postgr.es/m/6e30082e-041b-4e31-9633-95a66de76f5d@oss.nttdata.com
2024-07-18 12:09:48 -04:00
Michael Paquier
a0a5869a85 Add INJECTION_POINT_CACHED() to run injection points directly from cache
This new macro is able to perform a direct lookup from the local cache
of injection points (refreshed each time a point is loaded or run),
without touching the shared memory state of injection points at all.

This works in combination with INJECTION_POINT_LOAD(), and it is better
suited than INJECTION_POINT() for use in a critical section: because it
retrieves the callback from backend-private memory, it avoids all memory
allocations even if a concurrent detach happens after the LOAD().

The documentation is updated to describe in more detail how to use this
new macro with a load.  Some tests are added to the module
injection_points based on a new SQL function that acts as a wrapper of
INJECTION_POINT_CACHED().

Based on a suggestion from Heikki Linnakangas.

Author: Heikki Linnakangas, Michael Paquier
Discussion: https://postgr.es/m/58d588d0-e63f-432f-9181-bed29313dece@iki.fi
2024-07-18 09:50:41 +09:00
Tom Lane
6159331acf Doc: fix minor syntax error in example.
The CREATE TABLE option is GENERATED BY DEFAULT *AS* IDENTITY.

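For reference, the corrected spelling looks like this (table and column
names are illustrative):

    CREATE TABLE items (
        id   bigint GENERATED BY DEFAULT AS IDENTITY,
        name text
    );
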
Per bug #18543 from Ondřej Navrátil.  Seems to have crept in
in a37bb7c13, so back-patch to v17 where that was added.

Discussion: https://postgr.es/m/18543-93c721689f9928e8@postgresql.org
2024-07-17 15:17:52 -04:00
Nathan Bossart
a99cc6c6b4 Use PqMsg_* macros in more places.
Commit f4b54e1ed9, which introduced macros for protocol characters,
missed updating a few places.  It also did not introduce macros for
messages sent from parallel workers to their leader processes.
This commit adds a new section in protocol.h for those.

Author: Aleksander Alekseev
Discussion: https://postgr.es/m/CAJ7c6TNTd09AZq8tGaHS3LDyH_CCnpv0oOz2wN1dGe8zekxrdQ%40mail.gmail.com
Backpatch-through: 17
2024-07-17 10:51:00 -05:00
Andrew Dunstan
f2a0d5808c Avoid error in recovery test if history file is not yet present
Error was detected when testing use of libpq sessions instead of psql
for polling queries.

Discussion: https://postgr.es/m/e86b6d2d-20d8-4ac9-9a98-165fff7db886@dunslane.net

Backpatch to all live branches
2024-07-17 10:44:20 -04:00
Amit Langote
86d33987e8 SQL/JSON: Rethink c2d93c3802b
This essentially reverts c2d93c3802b, except for its tests.  The problem with
c2d93c3802b was that it only changed the casting behavior for types
with typmod, and had coding issues noted in the post-commit review.

This commit changes coerceJsonFuncExpr() to use assignment-level casts
instead of explicit casts to coerce the result of JSON constructor
functions to the specified or the default RETURNING type.  Using
assignment-level casts fixes the problem that using explicit casts was
leading to the wrong typmod / length coercion behavior -- truncating
results longer than the specified length instead of erroring out --
which c2d93c3802b aimed to solve.

That restricts the set of allowed target types to string types, the
same set that's currently allowed.

Discussion: https://postgr.es/m/202406291824.reofujy7xdj3@alvherre.pgsql
2024-07-17 17:14:01 +09:00
Michael Paquier
ec678692f6 Make write of pgstats file durable at shutdown
This switches the pgstats write code to use durable_rename() rather than
rename().  This ensures that the stats file's data is durable when the
statistics are written, which currently happens only at shutdown, with
the checkpointer doing the job.

Previously, a host failure happening right after shutdown could cause
the statistics to be lost, even though PostgreSQL had been shut down
cleanly.

Suggested-by: Konstantin Knizhnik
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/ZpDQTZ0cAz0WEbh7@paquier.xyz
2024-07-17 11:50:36 +09:00
Jeff Davis
4b74ebf726 When creating materialized views, use REFRESH to load data.
Previously, CREATE MATERIALIZED VIEW ... WITH DATA populated the MV
the same way as CREATE TABLE ... AS.

Instead, reuse the REFRESH logic, which locks down security-restricted
operations and restricts the search_path. This reduces the chance that
a subsequent refresh will fail.

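A hypothetical sketch (object names are made up; a table "widgets"
reachable only via the session search_path is assumed).  With this
change, the initial population resolves names the way a REFRESH would,
so an unqualified reference inside a called function behaves the same at
CREATE ... WITH DATA time as at REFRESH time:

    CREATE FUNCTION count_widgets() RETURNS bigint
        LANGUAGE sql AS 'SELECT count(*) FROM widgets';

    -- populated via the REFRESH logic, so the restricted search_path
    -- applies while evaluating the unqualified "widgets" reference
    CREATE MATERIALIZED VIEW widget_stats AS
        SELECT count_widgets() AS n WITH DATA;
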
Reported-by: Noah Misch
Backpatch-through: 17
Discussion: https://postgr.es/m/20240630222344.db.nmisch@google.com
2024-07-16 15:41:29 -07:00
Nathan Bossart
0a8ca122e5 Add a couple of recent commits to .git-blame-ignore-revs. 2024-07-16 11:04:55 -05:00
Andrew Dunstan
49546ae9c7 Adjust recently added test for pg_signal_autovacuum role
This test was added by commit d2b74882ca, but fails if
log_error_verbosity is set to verbose. Adjust the regex that checks the
error message to allow for it containing an SQL status code.
2024-07-16 10:05:48 -04:00
Amit Langote
884d791b21 SQL/JSON: Fix a paragraph in JSON_TABLE documentation
Using <replaceable>text</replaceable> inside parentheses is not a
common or good style, so rephrase a sentence to avoid it.
Also rephrase the text in that paragraph a bit while at it.

Reported-by: Marcos Pegoraro <marcos@f10.com.br>
Author: Jian He <jian.universality@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/CAB-JLwZqH3Yec6Kz-4-+pa0ZG9QJBsxjJZwYcMZYzEDR_fXnKw@mail.gmail.com
2024-07-16 14:10:58 +09:00
Michael Paquier
d2b74882ca Add tap test for pg_signal_autovacuum role
This commit provides testing coverage for ccd38024bc3c, checking that a
role granted pg_signal_autovacuum_worker is able to stop a vacuum
worker.

An injection point with a wait is placed at the beginning of autovacuum
worker startup to make sure that a worker is still alive while the
signal is sent and processed.

Author: Anthony Leung, Michael Paquier, Kirill Reshke
Reviewed-by: Andrey Borodin, Nathan Bossart
Discussion: https://postgr.es/m/CALdSSPiQPuuQpOkF7x0g2QkA5eE-3xXt7hiJFvShV1bHKDvf8w@mail.gmail.com
2024-07-16 10:05:46 +09:00
Andres Freund
47ecbfdfcc Fix bad indentation introduced in 43cd30bcd1c
Oops.

Reported-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/ZpVZB9rH5tHllO75@nathan
Backpatch: 12-, like 43cd30bcd1c
2024-07-15 15:17:04 -07:00
Andres Freund
4128453003 ci: Use newer LLVM version with gcc, to avoid compiler warnings
gcc emits a warning for LLVM 14 code outside of our control. To avoid that,
update to a newer LLVM version. Do so both in the CompilerWarnings and normal
tasks - the latter don't fail, but the warnings make it more likely that we'd
miss other warnings.

We might want to backpatch this eventually. The higher priority right now is
to unbreak CI though - which is only broken on master, due to 0c3930d0768
interacting badly with c8a6ec206a9 (mea culpa, I should have noticed this
before pushing, but I missed it due to another, independent CI failure).

Discussion: https://postgr.es/m/20240715193754.awdxgrzurxnwwu2t@awork3.anarazel.de
2024-07-15 15:04:15 -07:00
Jeff Davis
8e28778ce3 Add missing RestrictSearchPath() calls.
Reported-by: Noah Misch
Backpatch-through: 17
Discussion: https://postgr.es/m/20240630222344.db.nmisch@google.com
2024-07-15 12:07:35 -07:00
Andres Freund
c8a6ec206a ci: Upgrade to Debian Bookworm
Bullseye is getting long in the tooth, upgrade to the current stable version.

Backpatch to all versions with CI support, we don't want to generate CI images
for multiple Debian versions.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ0fY5EFHXLKCO_%3Dp4pwFmHRoVom_qSE_7B48gpchfAqzw%40mail.gmail.com
Backpatch: 15-, where CI was added
2024-07-15 09:26:01 -07:00
Andres Freund
43cd30bcd1 Fix type confusion in guc_var_compare()
Before this change guc_var_compare() cast the input arguments to
const struct config_generic *.  That's not quite right however, as the input
on one side is often just a char *.

Instead just use char *, the first field in config_generic.

This fixes a -Warray-bounds warning with some versions of gcc. While the
warning is only known to be triggered for <= 15, the issue the warning points
out seems real, so apply the fix everywhere.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reported-by: Erik Rijkers <er@xs4all.nl>
Suggested-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/a74a1a0d-0fd2-3649-5224-4f754e8f91aa%40xs4all.nl
2024-07-15 09:26:01 -07:00
Tom Lane
a0899c0a97 Doc: minor improvements for plpgsql "Transaction Management" section.
Point out that savepoint commands cannot be issued in PL/pgSQL,
and suggest that exception blocks can usually be used instead.

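For instance, instead of SAVEPOINT and ROLLBACK TO, a nested block with
an EXCEPTION clause gives the same partial-rollback behavior (a minimal
sketch; the table name is made up):

    DO $$
    BEGIN
        BEGIN
            INSERT INTO audit_log VALUES (now(), 'attempt');
        EXCEPTION WHEN others THEN
            -- the inner block's changes are rolled back, much like
            -- ROLLBACK TO SAVEPOINT
            RAISE NOTICE 'insert failed: %', SQLERRM;
        END;
    END;
    $$;
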
Add a caveat to the discussion of cursor loops vs. transactions,
pointing out that any locks taken by the cursor query will be lost
at COMMIT.  This is implicit in what's already said, but the existing
text leaves the distinct impression that the auto-hold behavior is
transparent, which it's not really.

Per a couple of recent complaints (one unsigned, and one in bug #18531
from Dzmitry Jachnik).  Back-patch to v17, just so this makes it into
current docs in less than a year-and-a-half.

Discussion: https://postgr.es/m/172076354433.736586.14347210271966220018@wrigleys.postgresql.org
Discussion: https://postgr.es/m/18531-c6dddd33b8555fd2@postgresql.org
2024-07-15 11:59:43 -04:00
Thomas Munro
8583b1f993 Run LLVM verify pass on IR in assert builds.
The problem fixed by commit 53c8d6c9 would have been noticed if we'd
been running LLVM's verify pass on generated IR.  Doing so also reveals
a complaint about incorrect name mangling, fixed here.  Only enabled for
LLVM 17+ because it uses the new pass manager API.

Suggested-by: Dmitry Dolgov <9erthalion6@gmail.com>
Discussion: https://postgr.es/m/CAFj8pRACpVFr7LMdVYENUkScG5FCYMZDDdSGNU-tch%2Bw98OxYg%40mail.gmail.com
2024-07-15 21:48:48 +12:00
Heikki Linnakangas
91651347ba Use correct type for pq_mq_parallel_leader_proc_number variable
It's a ProcNumber, not a process id. Both are integers, so it's
harmless, but clearly wrong.  It's been wrong since forever; the
mistake has survived through a couple of refactorings already.

Spotted-by: Thomas Munro
Discussion: https://www.postgresql.org/message-id/CA+hUKGKPTLSGMyE4Brin-osY8omPLNXmVWDMfrRABLp=6QrR_Q@mail.gmail.com
2024-07-15 11:12:22 +03:00
Heikki Linnakangas
86db52a506 Use atomics to avoid locking in InjectionPointRun()
This allows using injection points without having a PGPROC, like early
at backend startup, or in the postmaster.

The injection points facility is new in v17, so backpatch there.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/4317a7f7-8d24-435e-9e49-29b72a3dc418@iki.fi
2024-07-15 10:22:11 +03:00
Fujii Masao
4e5d6c4091 Fix unstable tests in partition_merge.sql and partition_split.sql.
The tests added by commit c086896625 were unstable due to
missing schema names when checking pg_tables and pg_indexes.

Backpatch to v17.

Reported by buildfarm.
2024-07-15 14:11:13 +09:00
Fujii Masao
c086896625 Fix tablespace handling in MERGE/SPLIT partition commands.
As commit ca4103025d stated, new partitions without a specified tablespace
should inherit the parent relation's tablespace. However, previously,
ALTER TABLE MERGE PARTITIONS and ALTER TABLE SPLIT PARTITION commands
always created new partitions in the default tablespace, ignoring
the parent's tablespace. This commit ensures new partitions inherit
the parent's tablespace.

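A rough sketch of the expected behavior, using the SPLIT PARTITION
syntax from the v17 development cycle (names are illustrative, and
CREATE TABLESPACE needs an existing empty directory):

    CREATE TABLESPACE ts LOCATION '/path/to/dir';
    CREATE TABLE t (id int) PARTITION BY RANGE (id) TABLESPACE ts;
    CREATE TABLE t_1 PARTITION OF t FOR VALUES FROM (0) TO (100);

    ALTER TABLE t SPLIT PARTITION t_1 INTO
        (PARTITION t_1a FOR VALUES FROM (0) TO (50),
         PARTITION t_1b FOR VALUES FROM (50) TO (100));

    -- with this fix, t_1a and t_1b end up in tablespace "ts" rather
    -- than the default tablespace
    SELECT tablename, tablespace FROM pg_tables WHERE tablename LIKE 't_1%';
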
Backpatch to v17 where these commands were introduced.

Author: Fujii Masao
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/abaf390b-3320-40a5-8815-ef476db5cfe7@oss.nttdata.com
2024-07-15 13:11:51 +09:00
Richard Guo
069d0ff022 Check lateral references within PHVs for memoize cache keys
If we intend to generate a Memoize node on top of a path, we need
cache keys of some sort.  Currently we search for the cache keys in
the parameterized clauses of the path as well as the lateral_vars of
its parent.  However, it turns out that this is not sufficient because
there might be lateral references derived from PlaceHolderVars, which
we fail to take into consideration.

This oversight can cause us to miss opportunities to utilize the
Memoize node.  Moreover, in some plans, failing to recognize all the
cache keys could result in performance regressions.  This is because
without identifying all the cache keys, we would need to purge the
entire cache every time we get a new outer tuple during execution.

This patch fixes this issue by extracting lateral Vars from within
PlaceHolderVars and subsequently including them in the cache keys.

In passing, this patch also includes a comment clarifying that Memoize
nodes are currently not added on top of join relation paths.  This
explains why this patch only considers PlaceHolderVars that are due to
be evaluated at baserels.

Author: Richard Guo
Reviewed-by: Tom Lane, David Rowley, Andrei Lepikhov
Discussion: https://postgr.es/m/CAMbWs48jLxn0pAPZpJ50EThZ569Xrw+=4Ac3QvkpQvNszbeoNg@mail.gmail.com
2024-07-15 10:26:33 +09:00
Tom Lane
f96c2c7278 Avoid unhelpful internal error for incorrect recursive-WITH queries.
checkWellFormedRecursion would issue "missing recursive reference"
if a WITH RECURSIVE query contained a single self-reference but
that self-reference was inside a top-level WITH, ORDER BY, LIMIT,
etc, rather than inside the second arm of the UNION as expected.
We already intended to throw more-on-point errors for such cases,
but those error checks must be done before examining the UNION arm
in order to have the desired results.  So this patch need only
move some code (and improve the comments).

Per bug #18536 from Alexander Lakhin.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/18536-0a342ec07901203e@postgresql.org
2024-07-14 13:49:46 -04:00
Noah Misch
d5e6891502 Fix new assertion for MERGE view_name ... DO NOTHING.
Such queries don't expand automatically updatable views, and ModifyTable
uses the wholerow attribute unconditionally.  The user-visible behavior
is fine, so change to more-specific assertions.  Commit
d5f788b41dc2cbdde6e7694c70dda54d829a5ed5 added the wrong assertion.
Back-patch to v17, where commit 5f2e179bd31e5f5803005101eb12a8d7bf8db8f3
introduced MERGE view_name.

Reported by Alexander Lakhin.

Discussion: https://postgr.es/m/e4b40a88-c134-6926-3196-bc4501cb87a2@gmail.com
2024-07-13 08:09:33 -07:00
Noah Misch
7102070329 Don't lose partitioned table reltuples=0 after relhassubclass=f.
ANALYZE sets relhassubclass=f when a partitioned table no longer has
partitions.  An ANALYZE doing that proceeded to apply the inplace update
of pg_class.reltuples to the old pg_class tuple instead of the new
tuple, losing that reltuples=0 change if the ANALYZE committed.
Non-partitioning inheritance trees were unaffected.  Back-patch to v14,
where commit 375aed36ad83f0e021e9bdd3a0034c0c992c66dc introduced
maintenance of partitioned table pg_class.reltuples.

Reported by Alexander Lakhin.

Discussion: https://postgr.es/m/a295b499-dcab-6a99-c06e-01cf60593344@gmail.com
2024-07-13 08:09:33 -07:00
Andrew Dunstan
055891f374 Make sure to run pg_isready on correct port
The current code can have pg_isready unexpectedly succeed if there is a
server running on the default port. To avoid this we delay running the
test until after a node has been created but before it starts, and then
use that node's port, so we are fairly sure there is nothing running on
the port.

Backpatch to all live branches.
2024-07-13 08:06:53 -04:00
Thomas Munro
a8458f508a Fix lost Windows socket EOF events.
Winsock only signals an FD_CLOSE event once if the other end of the
socket shuts down gracefully.  Because each WaitLatchOrSocket() call
constructs and destroys a new event handle every time, with unlucky
timing we can lose it and hang.  We get away with this only if the other
end disconnects non-gracefully, because FD_CLOSE is repeatedly signaled
in that case.

To fix this design flaw in our Windows socket support fundamentally,
we'd probably need to rearchitect it so that a single event handle
exists for the lifetime of a socket, or switch to completely different
multiplexing or async I/O APIs.  That's going to be a bigger job
and probably wouldn't be back-patchable.

This brute force kludge closes the race by explicitly polling with
MSG_PEEK before sleeping.

Back-patch to all supported releases.  This should hopefully clear up
some random build farm and CI hang failures reported over the years.  It
might also allow us to try using graceful shutdown in more places again
(reverted in commit 29992a6) to fix instability in the transmission of
FATAL error messages, but that isn't done by this commit.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Tested-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/176008.1715492071%40sss.pgh.pa.us
2024-07-13 14:59:46 +12:00
Andrew Dunstan
291c420747 Use diff --strip-trailing-cr in pg_regress.c
This was reverted in commit c194de0713. However with a correct
collate.windows.win1252.out we can now re-enable it.

Discussion: https://postgr.es/m/CAN55FZ1objLz3Vn5Afu4ojNESMQpxjxKcp2q18yrKF4eKMLENg@mail.gmail.com
2024-07-12 18:25:11 -04:00
Alvaro Herrera
74e12db19c
Add ORDER BY to new test query
Per buildfarm.
2024-07-12 13:44:19 +02:00
Alvaro Herrera
8391779138
Fix ALTER TABLE DETACH for inconsistent indexes
When a partitioned table has an index that doesn't support a constraint,
but a partition has an equivalent index that does, then a DETACH
operation would misbehave: a crash in assertion-enabled systems (because
we fail to find the constraint in the parent that we expect to), or a
broken coninhcount value (-1) in production systems (because we blindly
believe that we've successfully detached the parent).

While we should reject an ATTACH of a partition with such an index, we
have failed to do so in existing releases, so adding an error in stable
releases might break the (unlikely) existing applications that rely on
this behavior.  At this point I don't even want to reject them in
master, because it'd break pg_upgrade if such databases exist, and there
would be no easy way to fix existing databases without expensive index
rebuilds.

(Later on we could add ALTER TABLE ... ADD CONSTRAINT USING INDEX to
partitioned tables, which would allow the user to fix such patterns.  At
that point we could add more restrictions to prevent the problem from
its root.)

Also, add a test case that leaves one table in this condition, so that
we can verify that pg_upgrade continues to work if we later decide to
change the policy on the master branch.

Backpatch to all supported branches.

Co-authored-by: Tender Wang <tndrwang@gmail.com>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/18500-62948b6fe5522f56@postgresql.org
2024-07-12 12:54:01 +02:00
Michael Paquier
734c057a89 Add assertion in pgstat_write_statsfile() about processes allowed
This routine can currently only be called from the postmaster in
single-user mode or the checkpointer, but there was no sanity check to
make sure that this was always the case.

This has proved to be useful when hacking in this area (at least to me), to
make sure that the write of the pgstats file happens at shutdown, as
wanted by design, in the correct process context.

Discussion: https://postgr.es/m/ZnEiqAITL-VgZDoY@paquier.xyz
2024-07-12 15:09:53 +09:00
Amit Kapila
63909da978 Fix a typo in logicalrep_write_typ().
Author: ChangAo Chen
Discussion: https://postgr.es/m/tencent_CDECB843B30A8B6B5152FA6458F0F00FDE09@qq.com
2024-07-12 10:20:59 +05:30
Amit Kapila
9fd8b331df Fix unstable test in 040_pg_createsubscriber.
The slot synchronization failed because the catalog_xmin of the local
slot (created during slot synchronization) on the standby is ahead of
the remote slot's.
This happens because the INSERT before slot synchronization results in the
generation of a new xid that could be replicated to the standby.  The
test then created a logical slot before the xmin of the physical slot on
the primary had caught up via hot_standby_feedback, so that slot got
some prior value of catalog_xmin.

To fix this we could try to ensure that the physical slot's catalog_xmin
is caught up to the latest value before creating the logical slot, but
we took the simpler path of moving the INSERT to after the logical slot
is synchronized.

Reported-by: Alexander Lakhin as per buildfarm
Diagnosed-by: Amit Kapila, Hou Zhijie, Alexander Lakhin
Author: Hou Zhijie
Backpatch-through: 17
Discussion: https://postgr.es/m/bde6ac67-69cc-c104-5ab6-dd4f5deadf24@gmail.com
2024-07-12 09:29:21 +05:30
Richard Guo
22d946b0f8 Consider materializing the cheapest inner path in parallel nestloop
When generating non-parallel nestloop paths for each available outer
path, we always consider materializing the cheapest inner path if
feasible.  Similarly, in this patch, we also consider materializing
the cheapest inner path when building partial nestloop paths.  This
approach potentially reduces the need to rescan the inner side of a
partial nestloop path for each outer tuple.

Author: Tender Wang
Reviewed-by: Richard Guo, Robert Haas, David Rowley, Alena Rybakina
Reviewed-by: Tomasz Rybak, Paul Jungwirth, Yuki Fujii
Discussion: https://postgr.es/m/CAHewXNkPmtEXNfVQMou_7NqQmFABca9f4etjBtdbbm0ZKDmWvw@mail.gmail.com
2024-07-12 11:16:43 +09:00
Michael Paquier
72c0b24b2d Improve comment of pgstat_read_statsfile()
The comment at the top of pgstat_read_statsfile() mentioned that the
stats are read from the on-disk file into the pgstats dshash.  This is
incorrect for fix-numbered stats as these are loaded directly into
shared memory.  This commit simplifies the comment to be more general.

Author: Bertrand Drouvot
Discussion: https://postgr.es/m/Zo/eJIHUcqKxeSgv@ip-10-97-1-34.eu-west-3.compute.internal
2024-07-12 09:31:33 +09:00
Tom Lane
0d8bd0a72e Improve logical replication connection-failure messages.
These messages mostly said "could not connect to the publisher: %s"
which is lacking context.  Add some verbiage to indicate which
subscription or worker process is failing.

Nisha Moond

Discussion: https://postgr.es/m/CABdArM7q1=zqL++cYd0hVMg3u_tc0S=0Of=Um-KvDhLony0cSg@mail.gmail.com
2024-07-11 13:21:13 -04:00
Tom Lane
a0f1fce80c Add min and max aggregates for composite types (records).
Like min/max for arrays, these are just thin wrappers around
the existing btree comparison function for records.

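A small usage sketch (type and table names are illustrative):

    CREATE TYPE point2d AS (x int, y int);
    CREATE TABLE pts (p point2d);
    INSERT INTO pts VALUES (ROW(1, 9)), (ROW(2, 0)), (ROW(1, 3));

    -- compared field by field, like the existing record comparisons
    SELECT min(p), max(p) FROM pts;   -- (1,3) and (2,0)
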
Aleksander Alekseev

Discussion: https://postgr.es/m/CAO=iB8L4WYSNxCJ8GURRjQsrXEQ2-zn3FiCsh2LMqvWq2WcONg@mail.gmail.com
2024-07-11 11:50:50 -04:00
Masahiko Sawada
bb19b70081 Fix possibility of logical decoding partial transaction changes.
When creating and initializing a logical slot, the restart_lsn is set
to the latest WAL insertion point (or the latest replay point on
standbys). Subsequently, WAL records are decoded from that point to
find the start point for extracting changes in the
DecodingContextFindStartpoint() function. Since the initial
restart_lsn could be in the middle of a transaction, the start point
must be a consistent point where we won't see the data for partial
transactions.

Previously, when not building a full snapshot, serialized snapshots
were restored, and the SnapBuild jumps to the consistent state even
while finding the start point. Consequently, the slot's restart_lsn
and confirmed_flush could be set to the middle of a transaction. This
could lead to various unexpected consequences. Specifically, there
were reports of logical decoding decoding partial transactions, and
assertion failures occurred because only subtransactions were decoded
without decoding their top-level transaction until decoding the commit
record.

To resolve this issue, the changes prevent restoring the serialized
snapshot and jumping to the consistent state while finding the start
point.

On v17 and HEAD, a flag indicating whether snapshot restores should be
skipped has been added to the SnapBuild struct, and SNAPBUILD_VERSION
has been bumped.

On backbranches, the flag is stored in the LogicalDecodingContext
instead, preserving on-disk compatibility.

Backpatch to all supported versions.

Reported-by: Drew Callahan
Reviewed-by: Amit Kapila, Hayato Kuroda
Discussion: https://postgr.es/m/2444AA15-D21B-4CCE-8052-52C7C2DAFE5C%40amazon.com
Backpatch-through: 12
2024-07-11 22:48:23 +09:00
Andrew Dunstan
c194de0713 Change pg_regress.c back to using diff -w on Windows
This partially reverts commit 628c1d1f2c.

It appears that there are non line-end differences in some regression
tests on Windows. To keep the buildfarm and CI clients happy, change
this back for now, pending further investigation.

Per reports from Tatsuo Ishii and Nazir Bilal Yavuz.
2024-07-11 09:34:27 -04:00
Michael Paquier
9e4664d950 Add a new 'F' entry type for fixed-numbered stats in pgstats file
This new entry type is used for all the fixed-numbered statistics,
making possible support for custom pluggable stats.  In short, we need
to be able to detect more easily if a stats kind exists or not when
reading back its data from the pgstats file without a dependency on the
order of the entries read.  The kind ID of the stats is added to the
data written.

The data is written in the same fashion as previously, with the
fixed-numbered stats first and the dshash entries next.  The read part
becomes more flexible, loading fixed-numbered stats into shared memory
based on the new entry type found.

Bump PGSTAT_FILE_FORMAT_ID.

Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/Zot5bxoPYdS7yaoy@paquier.xyz
2024-07-11 16:12:44 +09:00
Michael Paquier
21471f18e9 Add PgStat_KindInfo.init_shmem_cb
This new callback gives fixed-numbered stats the possibility to take
actions based on the area of shared memory allocated for them.

This removes from pgstat_shmem.c any knowledge specific to the types
of fixed-numbered stats, and the initializations happen in their own
files.  Like b68b29bc8fec, this change is useful to make this area of
the code more pluggable, so as custom fixed-numbered stats can take
actions after their shared memory area is initialized.

Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/Zot5bxoPYdS7yaoy@paquier.xyz
2024-07-11 09:21:40 +09:00
Nathan Bossart
cc2236854e Revamp documentation for predefined roles.
Presently, the page for predefined roles contains a table with
brief descriptions of what each role allows.  Below the table,
there is a separate section with more detailed information about
some of the roles.  As the set of predefined roles has grown over
the years, this page has (IMHO) become less readable.

This commit attempts to improve the predefined roles documentation
by abandoning the table in favor of listing each role with its own
complete description, similar to how we document GUCs.  Besides
merging the information that was split between the table and the
section below it, this commit also alphabetizes the roles.  The
alphabetization is imperfect because some of the roles are grouped
(e.g., pg_read_all_data and pg_write_all_data), and we order such
groups by the first role mentioned, but that seemed like a better
choice than breaking the groups apart.  Finally, this commit makes
some stylistic adjustments to the text.

Reviewed-by: David G. Johnston, Robert Haas
Discussion: https://postgr.es/m/ZmtM-4-eRtq8DRf6%40nathan
2024-07-10 16:35:25 -05:00
Dean Rasheed
0dcf753bd8 Improve the numeric width_bucket() computation.
Formerly, the computation of the bucket index involved calling
div_var() with a scale determined by select_div_scale(), and then
taking the floor of the result. That involved computing anything from
16 to 1000 digits after the decimal point, only for floor_var() to
throw them away. In addition, the quotient was computed with rounding
in the final digit, which meant that in rare cases the whole result
could round up to the wrong bucket, and could exceed count. Thus it
was also necessary to clamp the result to the range [1, count], though
that didn't prevent the result being in the wrong internal bucket.

Instead, compute the quotient using floor division, which guarantees
the correct result, as specified by the SQL spec, and doesn't need to
be clamped. This is both much simpler and more efficient, since it no
longer computes any quotient digits after the decimal point.

In addition, it is not necessary to have separate code to handle
reversed bounds, since the signs cancel out when dividing.

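The user-visible behavior is unchanged; for reference, a couple of calls
exercising the normal and reversed-bounds cases:

    -- 5.35 falls in bucket 3 of 5 equal-width buckets over [0.024, 10.06)
    SELECT width_bucket(5.35, 0.024, 10.06, 5);

    -- reversed bounds count buckets downward from the first bound
    SELECT width_bucket(5.35, 10.06, 0.024, 5);
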
As with b0e9e4d76c and a2a0c7c29e, no back-patch.

Dean Rasheed, reviewed by Joel Jacobson.

Discussion: https://postgr.es/m/CAEZATCVbJH%2BLE9EXW8Rk3AxLe%3DjbOk2yrT_AUJGGh5Rah6zoeg%40mail.gmail.com
2024-07-10 20:07:20 +01:00
Andrew Dunstan
628c1d1f2c Use diff's --strip-trailing-cr flag where appropriate on Windows
Test result files might be checked out using Unix or Windows style line
endings, depending on git flags, so on Windows we use the
--strip-trailing-cr flag to tell diff to ignore line endings
differences.

The flag is added to the diff invocation for the test_json_parser module
tests and the pg_bsd_indent tests.  In pg_regress.c we replace the
current use of the "-w" flag, which ignores all whitespace differences,
with this one which only ignores line end differences.

Discussion: https://postgr.es/m/20240707052030.r77hbdkid3mwksop@awork3.anarazel.de
2024-07-10 09:53:47 -04:00
Fujii Masao
05506510de doc: Update track_io_timing documentation to mention pg_stat_io.
The I/O timing information collected when track_io_timing is
enabled is now documented to appear in the pg_stat_io view,
which was previously not mentioned.

This commit also enhances the description of track_io_timing
to clarify that it monitors not only block read and write
but also block extend and fsync operations. Additionally,
the description of track_wal_io_timing has been improved
to mention both WAL write and WAL fsync monitoring.

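For example, with timing enabled (which requires appropriate
privileges to set), the per-operation timing columns in pg_stat_io are
populated; a minimal sketch:

    SET track_io_timing = on;

    SELECT backend_type, object, context,
           reads, read_time, writes, write_time,
           extends, extend_time, fsyncs, fsync_time
    FROM pg_stat_io
    WHERE reads > 0 OR writes > 0;
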
Backpatch to v16 where pg_stat_io was added.

Author: Hajime Matsunaga
Reviewed-by: Melanie Plageman, Nazir Bilal Yavuz, Fujii Masao
Discussion: https://postgr.es/m/TYWPR01MB10742EE4A6F34C33061429D38A4D52@TYWPR01MB10742.jpnprd01.prod.outlook.com
2024-07-10 15:56:07 +09:00
Michael Paquier
d898665bf7 Extend pg_get_acl() to handle sub-object IDs
This patch modifies the pg_get_acl() function to accept a third argument
called "objsubid", bringing it on par with similar functions in this
area like pg_describe_object().  This enables the retrieval of ACLs for
relation attributes when scanning dependencies.

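A usage sketch with a column-level privilege (table, column and role
names are made up; the column is assumed to have attnum 1):

    GRANT SELECT (title) ON books TO reader;

    -- objsubid is the column's attnum; 0 refers to the relation itself
    SELECT pg_get_acl('pg_class'::regclass, 'books'::regclass, 1);
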
Bump catalog version.

Author: Joel Jacobson
Discussion: https://postgr.es/m/f2539bff-64be-47f0-9f0b-df85d3cc0432@app.fastmail.com
2024-07-10 10:14:37 +09:00
Andrew Dunstan
f7bd0a381d Prevent CRLF conversion of inputs in json_parser test module
Do this by opening the file in PG_BINARY_R mode. This prevents us from
getting a wrong byte count from stat().

Per complaint from Andres Freund

Discussion: https://postgr.es/m/20240707052030.r77hbdkid3mwksop@awork3.anarazel.de

Backpatch to release 17 where this code was introduced
2024-07-09 17:29:48 -04:00
Tom Lane
896cd266fd Remove new XML test cases added by e7192486d.
These turn out to produce libxml2-version-dependent error reports.
They aren't adding value that would justify dealing with that,
so just remove them again.  (I had in fact guessed wrong about
what versions matching xml_2.out would produce, but it doesn't
matter because there are other discrepancies.)

Per buildfarm.

Discussion: https://postgr.es/m/trinity-b0161630-d230-4598-9ebc-7a23acdb37cb-1720186432160@3c-app-gmx-bap25
Discussion: https://postgr.es/m/trinity-361ba18b-541a-4fe7-bc63-655ae3a7d599-1720259822452@3c-app-gmx-bs01
2024-07-09 16:31:24 -04:00
Jeff Davis
b3bd18294e Fix missing invalidations for search_path cache.
Reported-by: Noah Misch
Discussion: https://postgr.es/m/20240630223047.1f.nmisch@google.com
Backpatch-through: 17
2024-07-09 12:37:05 -07:00
Tom Lane
e7192486dd Suppress "chunk is not well balanced" errors from libxml2.
libxml2 2.13 has an entirely different rule than earlier versions
about when to emit "chunk is not well balanced" errors.  This
causes regression test output discrepancies for three test cases
that formerly provoked that error (along with others) and now don't.

Closer inspection shows that at least in 2.13, this error is pretty
useless because it can only be emitted after some other more-relevant
error.  So let's get rid of the cross-version discrepancy by just
suppressing it.  In case some older libxml2 version is capable of
emitting this error by itself, suppress only when some other error
has already been captured.

Like 066e8ac6e and 6082b3d5d, this will need to be back-patched,
but let's check the results in HEAD first.  (The patch for xml_2.out,
in particular, is blind since I can't test it here.)

Erik Wienhold and Tom Lane, per report from Frank Streitzig.

Discussion: https://postgr.es/m/trinity-b0161630-d230-4598-9ebc-7a23acdb37cb-1720186432160@3c-app-gmx-bap25
Discussion: https://postgr.es/m/trinity-361ba18b-541a-4fe7-bc63-655ae3a7d599-1720259822452@3c-app-gmx-bs01
2024-07-09 15:01:13 -04:00
Nathan Bossart
ccd38024bc Introduce pg_signal_autovacuum_worker.
Since commit 3a9b18b309, roles with privileges of pg_signal_backend
cannot signal autovacuum workers.  Many users treated the ability
to signal autovacuum workers as a feature instead of a bug, so we
are reintroducing it via a new predefined role.  Having privileges
of this new role, named pg_signal_autovacuum_worker, only permits
signaling autovacuum workers.  It does not permit signaling other
types of superuser backends.

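A usage sketch (the granted role name is illustrative):

    GRANT pg_signal_autovacuum_worker TO maintenance_bot;

    -- maintenance_bot may now cancel or terminate autovacuum workers,
    -- but no other superuser-owned backends
    SELECT pg_terminate_backend(pid)
    FROM pg_stat_activity
    WHERE backend_type = 'autovacuum worker';
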
Bumps catversion.

Author: Kirill Reshke
Reviewed-by: Anthony Leung, Michael Paquier, Andrey Borodin
Discussion: https://postgr.es/m/CALdSSPhC4GGmbnugHfB9G0%3DfAxjCSug_-rmL9oUh0LTxsyBfsg%40mail.gmail.com
2024-07-09 13:03:40 -05:00
Fujii Masao
629520be5f Fix comment in libpqrcv_check_conninfo().
Previously, the comment incorrectly stated that libpqrcv_check_conninfo()
returns true or false based on the connection string check.
However, this function actually has a void return type and
raises an error if the check fails.

Author: Rintaro Ikeda
Reviewed-by: Jelte Fennema-Nio, Fujii Masao
Discussion: https://postgr.es/m/6a1ca81b27fec4da0ccdfaaaec787982@oss.nttdata.com
2024-07-09 21:30:18 +09:00
Dean Rasheed
ca481d3c9a Optimise numeric multiplication for short inputs.
When either input has a small number of digits, and the exact product
is requested, the speed of numeric multiplication can be increased
significantly by using a faster direct multiplication algorithm. This
works by fully computing each result digit in turn, starting with the
least significant, and propagating the carry up. This saves cycles by
not requiring a temporary buffer to store digit products, not making
multiple passes over the digits of the longer input, and not requiring
separate carry-propagation passes.

For now, this is used when the shorter input has 1-4 NBASE digits (up
to 13-16 decimal digits), and the longer input is of any size, which
covers a lot of common real-world cases. Also, the relative benefit
increases as the size of the longer input increases.

Possible future work would be to try extending the technique to larger
numbers of digits in the shorter input.

Joel Jacobson and Dean Rasheed.

Discussion: https://postgr.es/m/44d2ffca-d560-4919-b85a-4d07060946aa@app.fastmail.com
2024-07-09 10:00:42 +01:00
Amit Langote
42de72fa7b SQL/JSON: Various improvements to SQL/JSON query function docs
1. Remove the keyword SELECT from the examples to be consistent
with the examples of other JSON-related functions listed on the
same page.

2. Add <synopsis> tags around the functions' syntax definition

3. Capitalize function names in the syntax synopsis and the examples

4. Use <itemizedlist> lists for dividing the descriptions of
   individual functions into bullet points

5. Significantly rewrite the description of wrapper clauses of
   JSON_QUERY

6. Significantly rewrite the descriptions of ON ERROR / EMPTY
   clauses of JSON_QUERY() and JSON_VALUE() functions

7. Add a note about how JSON_VALUE() and JSON_QUERY() differ when
   returning a JSON null result

8. Move the description of the PASSING clause from the descriptions
   of individual functions into the top paragraph

And other miscellaneous text improvements, typo fixes.

Suggested-by: Thom Brown <thom@linux.com>
Suggested-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Erik Rijkers <er@xs4all.nl>
Discussion: https://postgr.es/m/CAA-aLv7Dfy9BMrhUZ1skcg=OdqysWKzObS7XiDXdotJNF0E44Q@mail.gmail.com
Discussion: https://postgr.es/m/CAKFQuwZNxNHuPk44zDF7z8qZec1Aof10aA9tWvBU5CMhEKEd8A@mail.gmail.com
2024-07-09 16:12:22 +09:00
Amit Kapila
571f7f7086 To improve the code, move the error check in logical_read_xlog_page().
Commit 0fdab27ad6 changed the code to wait for WAL to be available before
determining the timeline but forgot to move the failure check.

This change is to make the related code easier to understand and
enhance; otherwise, there is no bug in the current code.

In passing, improve the nearby comments to explain why we determine
am_cascading_walsender after waiting for the required WAL.

Author: Peter Smith
Reviewed-by: Bertrand Drouvot, Amit Kapila
Discussion: https://postgr.es/m/CAHut+PvqX49fusLyXspV1Mmd_EekPtXG0oT146vZjcb9XDvNgw@mail.gmail.com
2024-07-09 09:00:45 +05:30
Michael Paquier
b68b29bc8f Use pgstat_kind_infos to write fixed shared statistics
This is similar to 9004abf6206e, but this time for the write part of the
stats file.  The code is changed so as, rather than referring to
individual members of PgStat_Snapshot in an order based on their
PgStat_Kind value, a loop based on pgstat_kind_infos is used to retrieve
the contents to write from the snapshot structure, for a size of
PgStat_KindInfo's shared_data_len.

This requires the addition to PgStat_KindInfo of an offset to track the
location of each fixed-numbered stats in PgStat_Snapshot.  This change
is useful to make this area of the code more easily pluggable, and
reduces the knowledge of specific fixed-numbered kinds in pgstat.c.

Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/Zot5bxoPYdS7yaoy@paquier.xyz
2024-07-09 10:27:12 +09:00
David Rowley
c048cd992c Avoid JIT-related test instability in EXPLAIN ANALYZE
036bdcec9 added some code to perform some verification on portions of
the planner costs in EXPLAIN ANALYZE but failed to consider that some
buildfarm animals such as bushmaster and taipan are running very low jit
thresholds.  This caused these animals to fail as they were outputting
JIT-related details in EXPLAIN ANALYZE for the newly added tests.

Here we avoid that by disabling JIT for the plans in question.

Discussion: https://postgr.es/m/CAApHDvpxV4rrO3XUCgGS5N9Wg6f2r0ojJPD2tX2FRV-o9sRTJA@mail.gmail.com
2024-07-09 12:46:48 +12:00
Fujii Masao
c8d5d6c78a Fix limit block handling in pg_wal_summary_contents().
Previously, pg_wal_summary_contents() had two issues,
causing discrepancies between pg_wal_summary_contents()
and the pg_walsummary command on the same WAL summary file:

(1) It did not emit the limit block when that's the only data for
     a particular relation fork.
(2) It emitted the same limit block multiple times if the list of
     block numbers was long enough.

This commit fixes these issues.

Backpatch to v17 where pg_wal_summary_contents() was added.

Author: Fujii Masao
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/90980ee6-2da6-42f6-a7b0-b7bae62ae279@oss.nttdata.com
2024-07-09 09:26:54 +09:00
David Rowley
5a1e6df3b8 Show Parallel Bitmap Heap Scan worker stats in EXPLAIN ANALYZE
Nodes like Memoize report the cache stats for each parallel worker, so it
makes sense to show the exact and lossy pages in Parallel Bitmap Heap Scan
in a similar way.  Likewise, Sort shows the method and memory used for
each worker.

There was some discussion on whether the leader stats should include the
totals for each parallel worker or not.  I did some analysis on this to
see what other parallel node types do and it seems only Parallel Hash does
anything like this.  All the rest, per what's supported by
ExecParallelRetrieveInstrumentation() are consistent with each other.

Author: David Geier <geidav.pg@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Donghang Lin <donghanglin@gmail.com>
Author: Alena Rybakina <lena.ribackina@yandex.ru>
Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Michael Christofides <michael@pgmustard.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Donghang Lin <donghanglin@gmail.com>
Reviewed-by: Masahiro Ikeda <Masahiro.Ikeda@nttdata.com>
Discussion: https://postgr.es/m/b3d80961-c2e5-38cc-6a32-61886cdf766d%40gmail.com
2024-07-09 12:15:47 +12:00
David Rowley
e41f713097 Perform forgotten cat version bump
I missed this in 036bdcec9
2024-07-09 09:56:46 +12:00
David Rowley
036bdcec9f Teach planner how to estimate rows for timestamp generate_series
This provides the planner with row estimates for
generate_series(TIMESTAMP, TIMESTAMP, INTERVAL),
generate_series(TIMESTAMPTZ, TIMESTAMPTZ, INTERVAL) and
generate_series(TIMESTAMPTZ, TIMESTAMPTZ, INTERVAL, TEXT) when the input
parameter values can be estimated during planning.

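For example, the planner can now derive the row count from the bounds
and step instead of using the generic set-returning-function default of
1000 rows (a sketch; exact plan output varies):

    EXPLAIN SELECT *
    FROM generate_series('2024-01-01'::timestamptz,
                         '2024-01-10'::timestamptz,
                         interval '1 day');
    -- the Function Scan row estimate should now be 10, not 1000
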
Author: David Rowley
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CAApHDvrBE%3D%2BASo_sGYmQJ3GvO8GPvX5yxXhRS%3Dt_ybd4odFkhQ%40mail.gmail.com
2024-07-09 09:54:59 +12:00
Andrew Dunstan
5193ca8e15 Symlink pg_replslot robustly on Windows in pg_basebackup test
This reverts commit e9f15bc9. Instead of a hacky solution that didn't
work on Windows, we avoid trying to move the directory possibly across
drives, and instead remove it and recreate it in the new location.

Discussion: https://postgr.es/m/20240707070243.sb77kp4ubowauctz@awork3.anarazel.de

Backpatch to release 14 like the previous patch.
2024-07-08 14:01:49 -04:00
Nathan Bossart
64f34eb2e2 Use CREATE DATABASE ... STRATEGY = FILE_COPY in pg_upgrade.
While this strategy is ordinarily quite costly because it requires
performing two checkpoints, testing shows that it tends to be a
faster choice than WAL_LOG during pg_upgrade, presumably because
fsync is turned off.  Furthermore, we can skip the checkpoints
altogether because the problems they are intended to prevent don't
apply to pg_upgrade.  Instead, we just need to CHECKPOINT once in
the new cluster after making any changes to template0 and before
restoring the rest of the databases.  This ensures that said
template0 changes are written out to disk prior to creating the
databases via FILE_COPY.

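For reference, the strategy pg_upgrade now uses corresponds to this
SQL-level form (the database name is illustrative; WAL_LOG remains the
default strategy elsewhere):

    CREATE DATABASE mydb TEMPLATE template0 STRATEGY = FILE_COPY;
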
Co-authored-by: Matthias van de Meent
Reviewed-by: Ranier Vilela, Dilip Kumar, Robert Haas, Michael Paquier
Discussion: https://postgr.es/m/Zl9ta3FtgdjizkJ5%40nathan
2024-07-08 16:18:00 -05:00
Andrew Dunstan
4b4b931bcd Choose ports for test servers less likely to result in conflicts
If we choose ports in the range typically used for ephemeral ports,
there is a danger of encountering a port conflict due to a race
condition between the time we choose the port and the time the server
actually binds it.  To avoid this, choose ports in a range below that
typically used to allocate ephemeral ports, but higher than the range
typically used by well-known services.

Author: Jelte Fenema-Nio, with some editing by me.

Discussion: https://postgr.es/m/d6ee8761-39d1-0033-1afb-d5a57ee056f2@gmail.com

Backpatch to all live branches (12 and up)
2024-07-08 11:40:58 -04:00
Andrew Dunstan
e4f4c5424c Force nodes for SSL tests to start in TCP mode
Currently they are started in unix socket mode in most cases, and then
converted to run in TCP mode.  This can result in port collisions, and
there is no virtue in starting in unix socket mode, so start in the mode
we will be using from the outset.

Discussion: https://postgr.es/m/d6ee8761-39d1-0033-1afb-d5a57ee056f2@gmail.com

Backpatch to all live branches (12 and up).
2024-07-08 11:40:58 -04:00
Tom Lane
6082b3d5d3 Use xmlParseInNodeContext not xmlParseBalancedChunkMemory.
xmlParseInNodeContext has basically the same functionality with
a different API: we have to supply an xmlNode that's attached to a
document rather than just the document.  That's not hard though.
The benefits are twofold:

* Early 2.13.x releases of libxml2 contain a bug that causes
xmlParseBalancedChunkMemory to return the wrong status value in some
cases.  This breaks our regression tests.  While that bug is now fixed
upstream and will probably never be seen in any production-oriented
distro, it is currently a problem on some more-bleeding-edge-friendly
platforms.

* xmlParseBalancedChunkMemory is considered to depend on libxml2's
semi-deprecated SAX1 APIs, and will go away when and if they do.
There may already be libxml2 builds out there that lack this function.

So there are both short- and long-term reasons to make this change.

While here, avoid allocating an xmlParserCtxt in DOCUMENT parse mode,
since that code path is not going to use it.

Like 066e8ac6e, this will need to be back-patched.  This is just a
trial commit to see if the buildfarm agrees that we can use
xmlParseInNodeContext unconditionally.

Erik Wienhold and Tom Lane, per report from Frank Streitzig.

Discussion: https://postgr.es/m/trinity-b0161630-d230-4598-9ebc-7a23acdb37cb-1720186432160@3c-app-gmx-bap25
Discussion: https://postgr.es/m/trinity-361ba18b-541a-4fe7-bc63-655ae3a7d599-1720259822452@3c-app-gmx-bs01
2024-07-08 14:04:00 -04:00
Dean Rasheed
1ff39f4ff2 Fix scale clamping in numeric round() and trunc().
The numeric round() and trunc() functions clamp the scale argument to
the range between +/- NUMERIC_MAX_RESULT_SCALE (2000), which is much
smaller than the actual allowed range of type numeric. As a result,
they return incorrect results when asked to round/truncate more than
2000 digits before or after the decimal point.

Fix by using the correct upper and lower scale limits based on the
actual allowed (and documented) range of type numeric.

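A hypothetical sketch of a case in the affected range (values chosen
for illustration; with the old clamp to scale 2000, a request like this
presumably lost the value entirely):

    -- the first significant digit of the input is 2500 places after the
    -- decimal point; rounding at place 2500 should yield 3e-2500
    SELECT round(2.5e-2500, 2500);
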
While at it, use the new NUMERIC_WEIGHT_MAX constant instead of
SHRT_MAX in all other overflow checks, and fix a comment thinko in
power_var() introduced by e54a758d24 -- the minimum value of
ln_dweight is -NUMERIC_DSCALE_MAX (-16383), not -SHRT_MAX, though this
doesn't affect the point being made in the comment, that the resulting
local_rscale value may exceed NUMERIC_MAX_DISPLAY_SCALE (1000).

Back-patch to all supported branches.

Dean Rasheed, reviewed by Joel Jacobson.

Discussion: https://postgr.es/m/CAEZATCXB%2BrDTuMjhK5ZxcouufigSc-X4tGJCBTMpZ3n%3DxxQuhg%40mail.gmail.com
2024-07-08 17:48:45 +01:00
Amit Langote
519d710720 Typo fix
Reported-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/CAEG8a3KPi=LayiTwJ11ikF7bcqnZUrcj8NgX0V8nO1mQKZ9GfQ@mail.gmail.com
Backpatch-through: 17
2024-07-08 22:12:55 +09:00
Heikki Linnakangas
cc68ca6d42 Fix outdated comment after removal of direct SSL fallback
The option to fall back from direct SSL to negotiated SSL or a
plaintext connection was removed in commit fb5718f35f.

Discussion: https://www.postgresql.org/message-id/c82ad227-e049-4e18-8898-475a748b5a5a@iki.fi
2024-07-08 12:44:45 +03:00
Michael Paquier
e311c6e539 Renumber pg_get_acl() in pg_proc.dat
a6417078c414 has introduced as project policy that new features
committed during the development cycle should use new OIDs in the
[8000,9999] range.

4564f1cebd43 did not respect that rule, so let's renumber pg_get_acl()
to use an OID in the correct range.

Bump catalog version.
2024-07-08 15:34:33 +09:00
David Rowley
7340d9362a Widen lossy and exact page counters for Bitmap Heap Scan
Both of these counters were using the "long" data type.  On MSVC that's
a 32-bit type.  On modern hardware, I was able to demonstrate that we can
wrap those counters with a query that only takes 15 minutes to run.

This issue may manifest itself either by not showing the values of the
counters because they've wrapped and are less than zero, resulting in
them being filtered by the > 0 checks in show_tidbitmap_info(), or bogus
numbers being displayed which are the actual number modulo 2^32.

Widen these counters to uint64.

Discussion: https://postgr.es/m/CAApHDvpS_97TU+jWPc=T83WPp7vJa1dTw3mojEtAVEZOWh9bjQ@mail.gmail.com
2024-07-08 14:43:09 +12:00
Richard Guo
d7db04dfda Remove an extra period in code comment
Author: Junwang Zhao
Discussion: https://postgr.es/m/CAEG8a3L9GgfKc+XT+NMHPY7atAOVYqjUqKEFQKhcPHFYRW=PuQ@mail.gmail.com
2024-07-08 11:17:22 +09:00
Richard Guo
0ffc0acaf3 Fix right-anti-joins when the inner relation is proven unique
For an inner_unique join, we always assume that the executor will stop
scanning for matches after the first match.  Therefore, for a mergejoin
that is inner_unique and whose mergeclauses are sufficient to identify a
match, we set the skip_mark_restore flag to true, indicating that the
executor need not do mark/restore calls.  However, merge-right-anti-join
did not get this memo and continues scanning the inner side for matches
after the first match.  If there are duplicates in the outer scan, we
may incorrectly skip matching some inner tuples, which can lead to wrong
results.

Here we fix this issue by ensuring that merge-right-anti-join also
advances to the next outer tuple after the first match in inner_unique
cases.  This also saves cycles by avoiding unnecessary scanning of inner
tuples after the first match.

Although hash-right-anti-join does not suffer from this wrong results
issue, we apply the same change to it as well, to help save cycles for
the same reason.

Per bug #18522 from Antti Lampinen, and bug #18526 from Feliphe Pozzer.
Back-patch to v16 where right-anti-join was introduced.

Author: Richard Guo
Discussion: https://postgr.es/m/18522-c7a8956126afdfd0@postgresql.org
2024-07-08 10:11:46 +09:00
Michael Paquier
74b8e6a698 Re-enable autoruns for cmd.exe on Windows
This acts as a revert of b83747a8a65b and 9886744a361b.  As pointed out
by Noah, HEAD and REL_17_STABLE are in a weird state where the code
paths adding /D would limit the spawn of child processes, but we still
have code paths where the spawn of more than one child process would
be possible.

Let's remove these /D switches for now, to bring back the code into a
state consistent with how autorun is configured on a Windows host.

Reported-by: Noah Misch
Discussion: https://postgr.es/m/20240630021211.f3.nmisch@google.com
Backpatch-through: 17
2024-07-08 09:43:59 +09:00
Tom Lane
066e8ac6ea Use xmlAddChildList not xmlAddChild in XMLSERIALIZE.
It looks like we should have been doing this all along,
but we got away with the wrong coding until libxml2 2.13.0
tightened up xmlAddChild's behavior.

There is more stuff to be fixed to be compatible with 2.13.0,
and it will all need to be back-patched.  This is just a
trial commit to see if the buildfarm agrees that we can use
xmlAddChildList unconditionally.

Erik Wienhold, per report from Frank Streitzig.

Discussion: https://postgr.es/m/trinity-b0161630-d230-4598-9ebc-7a23acdb37cb-1720186432160@3c-app-gmx-bap25
Discussion: https://postgr.es/m/trinity-361ba18b-541a-4fe7-bc63-655ae3a7d599-1720259822452@3c-app-gmx-bs01
2024-07-06 15:16:13 -04:00
David Rowley
04bcf9e19a Adjust tuplestore.c not to allocate BufFiles in generation context
590b045c3 made it so tuplestore.c would store tuples inside a
generation.c memory context.  After fixing a bug report in 97651b013, it
seems that it's probably best not to allocate BufFile related
allocations in that context.  Let's keep it just for tuple data.

This adjusts the code to switch to the Tuplestorestate.context's parent,
which is the MemoryContext that tuplestore_begin_common() was called in.
It does not seem worth adding a new field in Tuplestorestate to store
this when we can access it by looking at the Tuplestorestate's
context's parent.

Discussion: https://postgr.es/m/CAApHDvqFt_CdJtSr+E9YLZb7jZAyRCy3hjQ+ktM+dcOFVq-xkg@mail.gmail.com
2024-07-06 17:40:05 +12:00
David Rowley
97651b0139 Fix incorrect sentinel byte logic in GenerationRealloc()
This only affects MEMORY_CONTEXT_CHECKING builds.

This fixes an off-by-one issue in the GenerationRealloc() fast-path
code, which tries to reuse the existing allocation if the existing chunk
is >= the new requested size.  The code there thought it was always ok
to use the existing chunk, but when oldsize == size there isn't enough
space to store the sentinel byte.  If both sizes matched exactly,
set_sentinel() would overwrite the first byte beyond the chunk, and
subsequent GenerationRealloc() calls could then fail the
Assert(chunk->requested_size < oldsize) check, which is trying to ensure
the chunk is large enough to store the sentinel.

The same issue does not exist in aset.c as the sentinel checking code
only adds a sentinel byte if there's enough space in the chunk.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/49275921-7b39-41af-5eb8-97b50ce3312e@gmail.com
Backpatch-through: 16, where the problem was introduced by 0e480385e
2024-07-06 13:59:34 +12:00
Thomas Munro
2a5ef09830 Cope with <regex.h> name clashes.
macOS 15's SDK pulls in headers related to <regex.h> when we include
<xlocale.h>.  This causes our own regex_t implementation to clash with
the OS's regex_t implementation.  Luckily our function names already had
pg_ prefixes, but the macros and typenames did not.

Include <regex.h> explicitly on all POSIX systems, and fix everything
that breaks.  Then we can prove that we are capable of fully hiding and
replacing the system regex API with our own.

1.  Deal with standard-clobbering macros by undefining them all first.
POSIX says they are "symbolic constants".  If they are macros, this
allows us to redefine them.  If they are enums or variables, our macros
will hide them.

2.  Deal with standard-clobbering types by giving our types pg_
prefixes, and then using macros to redirect xxx_t -> pg_xxx_t.

After including our "regex/regex.h", the system <regex.h> is hidden,
because we've replaced all the standard names.  The PostgreSQL source
tree and extensions can continue to use standard prefix-less type and
macro names, but reach our implementation, as long as they include our
"regex/regex.h" header.

Back-patch to all supported branches, so that macOS 15's tool chain can
build them.

Reported-by: Stan Hu <stanhu@gmail.com>
Suggested-by: Tom Lane <tgl@sss.pgh.pa.us>
Tested-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://postgr.es/m/CAMBWrQnEwEJtgOv7EUNsXmFw2Ub4p5P%2B5QTBEgYwiyjy7rAsEQ%40mail.gmail.com
2024-07-06 10:27:16 +12:00
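A rough sketch of the hiding technique outlined in steps 1 and 2 above; the specific names shown are only a sample of what the real "regex/regex.h" covers:

    /* 1. Get any standard-clobbering macros out of the way, then redefine. */
    #undef REG_NOMATCH
    #define REG_NOMATCH 1          /* our value; hides an OS enum or variable too */

    /* 2. Our types carry pg_ prefixes; macros redirect the standard names. */
    typedef struct pg_regex_t
    {
        /* ... our fields ... */
    } pg_regex_t;

    #define regex_t pg_regex_t

    /* Callers keep writing regex_t and REG_NOMATCH, but always reach ours. */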
Tom Lane
8212625e53 Fix placement of "static".
Various buildfarm critters were complaining about

pgbench.c:304:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]

Evidently a thinko in 720b0eaae.
2024-07-05 17:32:55 -04:00
Nathan Bossart
0b1fe1413e Remove check hooks for GUCs that contribute to MaxBackends.
Each of max_connections, max_worker_processes,
autovacuum_max_workers, and max_wal_senders has a GUC check hook
that verifies the sum of those GUCs does not exceed a hard-coded
limit (see the comment for MAX_BACKENDS in postmaster.h).  In
general, the hooks effectively guard against egregious
misconfigurations.

However, this approach has some problems.  Since these check hooks
are called as each GUC is assigned its user-specified value, only
one of the hooks will be called with all the relevant GUCs set.  If
one or more of the user-specified values are less than the initial
values of the GUCs' underlying variables, false positives can
occur.

Furthermore, the error message emitted when one of the check hooks
fails is not tremendously helpful.  For example, the command

	$ pg_ctl -D . start -o "-c max_connections=262100 -c max_wal_senders=10000"

fails with the following error:

	FATAL:  invalid value for parameter "max_wal_senders": 10000

Fortunately, there is an extra copy of this check in
InitializeMaxBackends() that we can rely on, so this commit removes
the aforementioned GUC check hooks in favor of that one.  It also
enhances the error message to clearly show the values of the
relevant GUCs and the hard-coded limit their sum may not exceed.
The downside of this change is that server startup progresses
further before failing due to such misconfigurations (thus taking
longer), but these failures are expected to be rare, so we don't
anticipate any real harm in practice.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/ZnMr2k-Nk5vj7T7H%40nathan
2024-07-05 14:42:55 -05:00
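A hedged sketch of the kind of consolidated check relied upon in InitializeMaxBackends(), with the clearer error reporting described above (the message wording and the extra launcher slot in the sum are illustrative):

    int         total = MaxConnections + autovacuum_max_workers + 1 +
                        max_worker_processes + max_wal_senders;

    if (total > MAX_BACKENDS)
        ereport(FATAL,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("too many server processes configured"),
                 errdetail("\"max_connections\" (%d) plus \"autovacuum_max_workers\" (%d) plus \"max_worker_processes\" (%d) plus \"max_wal_senders\" (%d) must be less than %d.",
                           MaxConnections, autovacuum_max_workers,
                           max_worker_processes, max_wal_senders,
                           MAX_BACKENDS)));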
Tom Lane
ba8f00eef6 Improve PL/Tcl's method for choosing Tcl names of procedures.
Previously, the internal name of a PL/Tcl function was just
"__PLTcl_proc_NNNN", where NNNN is the function OID.  That's pretty
unhelpful when reading an error report.  Plus it prevents us from
testing the CONTEXT output for PL/Tcl errors, since the OIDs shown
in the regression tests wouldn't be stable.

Instead, base the internal name on the result of format_procedure(),
which will be unique in most cases.  For the edge cases where it's
not, we can append the function OID to make it unique.

Sadly, the pltcl_trigger.sql test script still has to suppress the
context reports, because they'd include trigger arguments which
contain relation OIDs per PL/Tcl's longstanding API for triggers.

I had to modify one existing test case to throw a different error
than before, because I found that Tcl 8.5 and Tcl 8.6 spell the
context message for the original error slightly differently.
We might have to make more adjustments in that vein once this
gets wider testing.

Patch by me; thanks to Pavel Stehule for the idea to use
format_procedure() rather than just the proname.

Discussion: https://postgr.es/m/890581.1717609350@sss.pgh.pa.us
2024-07-05 14:14:42 -04:00
Tom Lane
aaab3ee9c6 Doc: minor improvements for our "Brief History" chapter.
Add a link to Joe Hellerstein's paper "Looking Back at Postgres",
which is quite an interesting take on the history of Postgres.

The reference to Appendix E was written when we were still keeping
the entire release-note history there, which we stopped doing some
years ago when the O(N^2) cost of that started to become apparent.
Instead, point to the release note archives on the website.
(This per suggestion from Daniel Gustafsson.)

In passing, move the "ports12" biblioentry to be in alphabetical
order within that section.

Discussion: https://postgr.es/m/3345678.1720071633@sss.pgh.pa.us
2024-07-05 13:12:34 -04:00
Michael Paquier
4b211003ec Support loading of injection points
This can be used to load an injection point and prewarm the
backend-level cache before running it, to avoid issues if the point
cannot be loaded due to restrictions in the code path where it would be
run, like a critical section where no memory allocation can happen
(load_external_function() can do allocations when expanding a library
name).

Tests can use a macro called INJECTION_POINT_LOAD() to load an injection
point.  The test module injection_points gains some tests, and a SQL
function able to load an injection point.

Based on a request from Andrey Borodin, who has implemented a test for
multixacts requiring this facility.

Reviewed-by: Andrey Borodin
Discussion: https://postgr.es/m/ZkrBE1e2q2wGvsoN@paquier.xyz
2024-07-05 18:09:03 +09:00
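A hedged usage sketch of INJECTION_POINT_LOAD(); the point name is made up, and the idea is simply that the callback is resolved before the restricted section so any allocation done by load_external_function() happens early:

    static void
    do_critical_work(void)
    {
        /* resolve and cache the callback while allocations are still allowed */
        INJECTION_POINT_LOAD("my-test-point");

        START_CRIT_SECTION();
        /* ... code that must not allocate; the prewarmed point can fire here ... */
        END_CRIT_SECTION();
    }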
Heikki Linnakangas
98347b5a3a Lift limitation that PGPROC->links must be the first field
Since commit 5764f611e1, we've been using the ilist.h functions for
handling the linked list. There's no need for 'links' to be the first
element of the struct anymore, except for one call in InitProcess
where we used a straight cast from the 'dlist_node *' to PGPROC *,
without the dlist_container() macro. That was just an oversight in
commit 5764f611e1, fix it.

There is no imminent need to move 'links' from being the first field, but
let's be tidy.

Reviewed-by: Aleksander Alekseev, Andres Freund
Discussion: https://www.postgresql.org/message-id/22aa749e-cc1a-424a-b455-21325473a794@iki.fi
2024-07-05 11:21:46 +03:00
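For reference, a short sketch in the spirit of the InitProcess fix; dlist_container() computes the enclosing struct from the member pointer, so 'links' no longer needs to sit at offset zero (the free-list variable name is illustrative):

    dlist_node *node = dlist_pop_head_node(freelist);

    /* old coding assumed 'links' was the first field: (PGPROC *) node */
    PGPROC     *proc = dlist_container(PGPROC, links, node);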
David Rowley
590b045c37 Improve memory management and performance of tuplestore.c
Here we make tuplestore.c use a generation.c memory context rather than
allocating tuples into the CurrentMemoryContext, which primarily is the
ExecutorState or PortalHoldContext memory context.  Not having a
dedicated context can cause the CurrentMemoryContext context to become
bloated when pfree'd chunks are not reused by future tuples.  Using
generation speeds up users of tuplestore.c, such as the Materialize,
WindowAgg and CTE Scan executor nodes.  The main reason for the speedup is
due to generation.c being more memory efficient than aset.c memory
contexts.  Specifically, generation does not round sizes up to the next
power of 2 value.  This both saves memory, allowing more tuples to fit in
work_mem, and makes the memory usage more compact, fitting on fewer
cachelines.  One benchmark showed up to a 22% performance increase in a
query containing a Materialize node.  Much higher gains are possible if
the memory reduction prevents tuplestore.c from spilling to disk.  This is
especially true for WindowAgg nodes where improvements of several thousand
times are possible if the memory reductions made here prevent tuplestore
from spilling to disk.

Additionally, a generation.c memory context is much better suited for this
job as it works well with FIFO palloc/pfree patterns, which is exactly how
tuplestore.c uses it.  Because of the way generation.c allocates memory,
tuples consecutively stored in tuplestores are much more likely to be
stored consecutively in memory.  This allows the CPU's hardware prefetcher
to work more efficiently as it provides a more predictable pattern to
allow cachelines for the next tuple to be loaded from RAM in advance of
them being needed by the executor.

Using a dedicated memory context for storing tuples also allows us to more
efficiently clean up the memory used by the tuplestore as we can reset or
delete the context rather than looping over all stored tuples and
pfree'ing them one by one.

Also, remove a badly placed USEMEM call in readtup_heap().  The tuple
wasn't being allocated in the Tuplestorestate's context, so no need to
adjust the memory consumed by the tuplestore there.

Author: David Rowley
Reviewed-by: Matthias van de Meent, Dmitry Dolgov
Discussion: https://postgr.es/m/CAApHDvp5Py9g4Rjq7_inL3-MCK1Co2CRt_YWFwTU2zfQix0p4A@mail.gmail.com
2024-07-05 17:51:27 +12:00
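A condensed sketch of the arrangement described above, assuming the generation.c constructor and the usual context-switching helpers (the context label is illustrative):

    /* dedicated context for tuple data, child of the caller's context */
    state->context = GenerationContextCreate(CurrentMemoryContext,
                                              "tuplestore tuples",
                                              ALLOCSET_DEFAULT_SIZES);

    /* storing a tuple */
    oldcxt = MemoryContextSwitchTo(state->context);
    /* ... copy the tuple into the generation context ... */
    MemoryContextSwitchTo(oldcxt);

    /* cleanup: one reset instead of pfree'ing every stored tuple */
    MemoryContextReset(state->context);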
David Rowley
53abb1e0eb Fix newly introduced issue in EXPLAIN for Materialize nodes
The code added in 1eff8279d was lacking a check to see if the tuplestore
had been created.  In nodeMaterial.c this is done by ExecMaterial() rather
than by ExecInitMaterial(), so the tuplestore won't be created unless
the node has been executed at least once, as demonstrated by Alexander
in his report.

Here we skip showing any of the new EXPLAIN ANALYZE information when the
Materialize node has not been executed.

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/fe7fc8fb-86e5-ecb0-3cb2-dd2c9a6c482f@gmail.com
2024-07-05 16:56:16 +12:00
Thomas Munro
18501841bc Add simple codepoint redirections to unaccent.rules.
Previously we searched for code points where the Unicode data file
listed an equivalent combining character sequence that added accents.
Some codepoints redirect to a single other codepoint, instead of doing
any combining.  We can follow those references recursively to get the
answer.

Per bug report #18362, which reported missing Ancient Greek characters.
Specifically, precomposed characters with oxia (from the polytonic
accent system used for old Greek) just point to precomposed characters
with tonos (from the monotonic accent system for modern Greek), and we
have to follow the extra hop to find out that they are composed with
an acute accent.

Besides those, the new rule also:

* pulls in a lot of 'Mathematical Alphanumeric Symbols', which are
  copies of the Latin and Greek alphabets and numbers rendered
  in different typefaces, and

* corrects a single mathematical letter that previously came from the
  CLDR transliteration file, but the new rule extracts from the main
  Unicode database file, where clearly the latter is right and the
  former is wrong (reported to CLDR).

Reported-by: Cees van Zeeland <cees.van.zeeland@freedom.nl>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/18362-be6d0cfe122b6354%40postgresql.org
2024-07-05 15:25:31 +12:00
David Rowley
1eff8279d4 Add memory/disk usage for Material nodes in EXPLAIN
Up until now, there was no ability to easily determine if a Material
node caused the underlying tuplestore to spill to disk or even see how
much memory the tuplestore used if it didn't.

Here we add some new functions to tuplestore.c to query this information
and add some additional output in EXPLAIN ANALYZE to display this
information for the Material node.

There are a few other executor node types that use tuplestores, so we
could also consider adding these details to the EXPLAIN ANALYZE for
those nodes too.  Let's consider those independently from this.  Having
the tuplestore.c infrastructure in to allow that is step 1.

Author: David Rowley
Reviewed-by: Matthias van de Meent, Dmitry Dolgov
Discussion: https://postgr.es/m/CAApHDvp5Py9g4Rjq7_inL3-MCK1Co2CRt_YWFwTU2zfQix0p4A@mail.gmail.com
2024-07-05 14:05:08 +12:00
Richard Guo
aa86129e19 Support "Right Semi Join" plan shapes
Hash joins can support semijoin with the LHS input on the right, using
the existing logic for inner join, combined with the assurance that only
the first match for each inner tuple is considered, which can be
achieved by leveraging the HEAP_TUPLE_HAS_MATCH flag.  This can be very
useful in some cases since we may now have the option to hash the
smaller table instead of the larger.

Merge join could likely support "Right Semi Join" too.  However, the
benefit of swapping inputs tends to be small here, so we do not address
that in this patch.

Note that this patch also modifies a test query in join.sql to ensure it
continues testing as intended.  With this patch the original query would
result in a right-semi-join rather than semi-join, compromising its
original purpose of testing the fix for neqjoinsel's behavior for
semi-joins.

Author: Richard Guo
Reviewed-by: wenhui qiu, Alena Rybakina, Japin Li
Discussion: https://postgr.es/m/CAMbWs4_X1mN=ic+SxcyymUqFx9bB8pqSLTGJ-F=MHy4PW3eRXw@mail.gmail.com
2024-07-05 09:26:48 +09:00
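A rough sketch of how the first-match guarantee mentioned above can be enforced while probing the hash table (simplified; not the actual nodeHashjoin.c code). The hashed side holds the semijoin's LHS, and its match bit ensures each LHS row is emitted at most once:

    /* for each bucket entry matching the current probe tuple */
    if (HeapTupleHeaderHasMatch(HJTUPLE_MINTUPLE(hashTuple)))
        continue;               /* this LHS row was already emitted once */

    HeapTupleHeaderSetMatch(HJTUPLE_MINTUPLE(hashTuple));
    /* project and return the hashed (LHS) tuple */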
Tom Lane
5a519abedd Doc: small improvements in discussion of geometric data types.
State explicitly that the coordinates in our geometric data types are
float8.  Also explain that polygons store their bounding box.

While here, fix the table of geometric data types to show type
"line"'s size correctly: it's 24 bytes not 32.  This has somehow
escaped notice since that table was made in 1998.

Per suggestion from Sebastian Skałacki.  The size error seems
important enough to justify back-patching.

Discussion: https://postgr.es/m/172000045661.706.1822177575291548794@wrigleys.postgresql.org
2024-07-04 13:23:32 -04:00
Alvaro Herrera
2ef575c780
Fix copy/paste mistake in comment
Backpatch to 17

Author: Yugo NAGATA <nagata@sraoss.co.jp>
Discussion: https://postgr.es/m/20240704134638.355ad44a445fa1e764a220cd@sranhm.sraoss.co.jp
2024-07-04 13:57:47 +02:00
Alvaro Herrera
768f0c3e21
Remove bogus assertion in pg_atomic_monotonic_advance_u64
This code wanted to ensure that the 'exchange' variable passed to
pg_atomic_compare_exchange_u64 has correct alignment, but apparently
platforms don't actually require anything that doesn't come naturally.

While messing with pg_atomic_monotonic_advance_u64: instead of using
Max() to determine the value to return, just use
pg_atomic_compare_exchange_u64()'s return value to decide; also, use
pg_atomic_compare_exchange_u64 instead of the _impl version; also remove
the unnecessary underscore at the end of variable name "target".

Backpatch to 17, where this code was introduced by commit bf3ff7bf83bc.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/36796438-a718-cf9b-2071-b2c1b947c1b5@gmail.com
2024-07-04 13:25:31 +02:00
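A compact sketch of the revised loop, leaning on pg_atomic_compare_exchange_u64()'s return value (and its refresh of the expected value on failure) rather than a separate Max() computation; illustrative, not the exact committed code:

    static inline uint64
    monotonic_advance(volatile pg_atomic_uint64 *ptr, uint64 target)
    {
        uint64      currval = pg_atomic_read_u64(ptr);

        while (currval < target)
        {
            /* on failure, currval is updated to the concurrently set value */
            if (pg_atomic_compare_exchange_u64(ptr, &currval, target))
                return target;
        }

        return currval;         /* already at or beyond target */
    }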
Daniel Gustafsson
ab0ae64320 doc: Specify when ssl_prefer_server_ciphers was added
The ssl_prefer_server_ciphers setting is quite important from a
security point of view, so simply stating that older versions
don't have it isn't very helpful.  This adds the version when
the GUC was added to help readers.

Backpatch to all supported versions since this setting has been
around since 9.4.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/5D7E0F5E-E620-4D54-8788-66D421AC76F0@yesql.se
Backpatch-through: v12
2024-07-04 11:38:37 +02:00
Michael Paquier
4564f1cebd Add pg_get_acl() to get the ACL for a database object
This function returns the ACL for a database object, specified by
catalog OID and object OID.  This is useful for retrieving the ACL
associated with an object specified by a (class_id,objid) pair,
similarly to the other object-identification functions, when joined
with pg_depend or pg_shdepend.

Original idea by Álvaro Herrera.

Bump catalog version.

Author: Joel Jacobson
Reviewed-by: Isaac Morland, Michael Paquier, Ranier Vilela
Discussion: https://postgr.es/m/80b16434-b9b1-4c3d-8f28-569f21c2c102@app.fastmail.com
2024-07-04 17:09:06 +09:00
Amit Langote
3a8a1f3254 SQL/JSON: Fix some obsolete comments.
JSON_OBJECT(), JSON_OBJECTAGG(), JSON_ARRAY(), and JSON_ARRAYAGG()
added in 7081ac46ace are not transformed into direct calls to
user-defined functions as the comments claim. Fix by mentioning
instead that they are transformed into JsonConstructorExpr nodes,
which may call them, for example, for the *AGG() functions.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/058c856a-e090-ac42-ff00-ffe394f52a87%40gmail.com
Backpatch-through: 16
2024-07-04 16:05:35 +09:00
Michael Paquier
b81a71aa05 Assign error codes where missing for user-facing failures
All the errors triggered in the code paths patched here would cause the
backend to issue an internal_error errcode, which is a state that should
be used only for "can't happen" situations.  However, these code paths
are reachable by the regression tests, and could be seen by users in
valid cases.  Some regression tests expect internal errcodes as they
manipulate the backend state to cause corruption (like checksums), or
use elog() because it is more convenient (like injection points); these
have no need to change.

This reduces the number of internal failures triggered in a check-world
by more than half, while providing correct errcodes for these valid
cases.

Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/Zic_GNgos5sMxKoa@paquier.xyz
2024-07-04 09:48:40 +09:00
Alexander Korotkov
6897f0ec02 Optimize memory access in GetRunningTransactionData()
e85662df44 made GetRunningTransactionData() calculate the oldest running
transaction id within the current database.  This commit optimizes that
calculation by performing a cheap transaction id comparison before fetching
the process database id, since the latter could cause extra cache misses.

Reported-by: Noah Misch
Discussion: https://postgr.es/m/20240630231816.bf.nmisch%40google.com
2024-07-04 02:05:37 +03:00
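A sketch of the reordering, putting the cheap XID comparison ahead of the shared-memory read of the process's database OID (variable names follow the commit messages above but are otherwise illustrative):

    /* test the cheap, already-fetched values first ... */
    if (TransactionIdIsValid(xid) &&
        TransactionIdPrecedes(xid, oldestDatabaseRunningXid) &&
        /* ... and only then touch proc->databaseId, a likely cache miss */
        proc->databaseId == MyDatabaseId)
        oldestDatabaseRunningXid = xid;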
Alexander Korotkov
6c1af5482e Fix typo in GetRunningTransactionData()
e85662df44 made GetRunningTransactionData() calculate the oldest running
transaction id within the current database.  However, because of the typo,
the new code uses oldestRunningXid instead of oldestDatabaseRunningXid
in comparison before updating oldestDatabaseRunningXid.  This commit fixes
that issue.

Reported-by: Noah Misch
Discussion: https://postgr.es/m/20240630231816.bf.nmisch%40google.com
Backpatch-through: 17
2024-07-04 02:05:27 +03:00
David Rowley
4331a11c62 Remove incorrect Asserts in buffile.c
Both BufFileSize() and BufFileAppend() contained Asserts to ensure the
given BufFile(s) had a valid fileset.  A valid fileset isn't required in
either of these functions, so remove the Asserts and adjust the
comments accordingly.

This was noticed while work was being done on a new patch to call
BufFileSize() on a BufFile without a valid fileset.  It seems there's
currently no code in the tree which could trigger these Asserts, so no
need to backpatch this, for now.

Reviewed-by: Peter Geoghegan, Matthias van de Meent, Tom Lane
Discussion: https://postgr.es/m/CAApHDvofgZT0VzydhyGH5MMb-XZzNDqqAbzf1eBZV5HDm3%2BosQ%40mail.gmail.com
2024-07-04 09:44:34 +12:00
Nathan Bossart
2329cad1b9 Improve performance of binary_upgrade_set_pg_class_oids().
This function generates the commands that preserve the OIDs and
relfilenodes of relations during pg_upgrade.  It is called once per
relevant relation, and each such call executes a relatively
expensive query to retrieve information for a single pg_class_oid.
This can cause pg_dump to take significantly longer when
--binary-upgrade is specified, especially when there are many
tables.

This commit improves the performance of this function by gathering
all the required pg_class information with a single query at the
beginning of pg_dump.  This information is stored in a sorted array
that binary_upgrade_set_pg_class_oids() can bsearch() for what it
needs.  This follows a similar approach as commit d5e8930f50, which
introduced a sorted array for role information.

With this patch, 'pg_dump --binary-upgrade' will use more memory,
but that isn't expected to be too egregious.  Per the mailing list
discussion, folks feel that this is worth the trade-off.

Reviewed-by: Corey Huinker, Michael Paquier, Daniel Gustafsson
Discussion: https://postgr.es/m/20240418041712.GA3441570%40nathanxps13
2024-07-03 14:21:50 -05:00
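A generic sketch of the gather-once-then-bsearch pattern described above; the struct layout and field names are illustrative, not pg_dump's actual code:

    typedef struct RelInfoEntry
    {
        Oid         reloid;         /* sort/search key */
        Oid         relfilenode;    /* plus whatever else the one query gathers */
    } RelInfoEntry;

    static int
    relinfo_cmp(const void *a, const void *b)
    {
        Oid         oa = ((const RelInfoEntry *) a)->reloid;
        Oid         ob = ((const RelInfoEntry *) b)->reloid;

        return (oa > ob) - (oa < ob);
    }

    /* once, at startup: run the single query, fill entries[], then sort */
    qsort(entries, nentries, sizeof(RelInfoEntry), relinfo_cmp);

    /* per relation, instead of one catalog query each: */
    RelInfoEntry key = {.reloid = reloid};
    RelInfoEntry *hit = bsearch(&key, entries, nentries,
                                sizeof(RelInfoEntry), relinfo_cmp);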
Nathan Bossart
6e1c4a03a9 Remove is_index parameter from binary_upgrade_set_pg_class_oids().
Since commit 9a974cbcba, this function retrieves the relkind before
it needs to know whether the relation is an index, so we no longer
need callers to provide this information.

Suggested-by: Daniel Gustafsson
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/20240418041712.GA3441570%40nathanxps13
2024-07-03 10:58:26 -05:00
Heikki Linnakangas
f3412a61f3 Avoid 0-length memcpy to NULL with EXEC_BACKEND
memcpy(NULL, src, 0) is forbidden by POSIX, even though every
production version of libc allows it. Let's be tidy.

Per report from Thomas Munro, running UBSan with EXEC_BACKEND.
Backpatch to v17, where this code was added.

Discussion: https://www.postgresql.org/message-id/CA%2BhUKG%2Be-dV7YWBzfBZXsgovgRuX5VmvmOT%2Bv0aXiZJ-EKbXcw@mail.gmail.com
2024-07-03 15:58:14 +03:00
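The fix pattern is simply to guard the zero-length case, since memcpy's pointer arguments must be valid even when the length is zero; a minimal sketch:

    /* memcpy(NULL, src, 0) is undefined behavior per POSIX/C */
    if (len > 0)
        memcpy(dst, src, len);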
Heikki Linnakangas
a06e8f84a1 Tighten check for --forkchild argument when spawning child process
Commit aafc05de1b removed all the other --fork* arguments. Although
this is inconsequential, backpatch to v17 since this is new.

Author: Nathan Bossart
Discussion: https://www.postgresql.org/message-id/ZnCCEN0l3qWv-XpW@nathan
2024-07-03 15:53:30 +03:00
Amit Kapila
ae395f0f7e Fix the testcase introduced in commit 81d20fbf7a.
The failed test was syncing a failover replication slot to the standby to
test that we remove such slots after the standby is converted to a
subscriber by pg_createsubscriber.

In one of the buildfarm members, the sync of the slot failed because the
LSN on the standby was before the syncslot's LSN. We need to wait for the
standby to catch up before trying to sync the slot with
pg_sync_replication_slots().

The other buildfarm member failed because autovacuum generated an xid that
was replicated to the standby at some random point, making slots on the
primary lag behind the standby during slot sync.

Neither of these failures would have occurred if we had used the built-in
slotsync worker, as it would have waited for the standby to sync with the
primary; but for this test, it is sufficient to use
pg_sync_replication_slots().

Reported-by: Alexander Lakhin as per buildfarm
Author: Kuroda Hayato
Reviewed-by: Amit Kapila
Backpatch-through: 17
Discussion: https://postgr.es/m/0dffca12-bf17-4a7a-334d-225569de5e6e@gmail.com
Discussion: https://postgr.es/m/OSBPR01MB25528300C71FDD83EA1DCA12F5DD2@OSBPR01MB2552.jpnprd01.prod.outlook.com
2024-07-03 15:04:59 +05:30
Michael Paquier
9fd0252579 Replace hardcoded identifiers of pgstats file by #defines
This changes pgstat.c so that the three types of entries that can exist in
a pgstats file are not hardcoded anymore, replacing them with
descriptively-named macros, when reading and writing stats files:
- 'N' for named entries, like replication slot stats.
- 'S' for entries identified by a hash.
- 'E' for the end-of-file

This has come up while working on making this area of the code more
pluggable.  The format of the stats file is unchanged, hence there is no
need to bump PGSTAT_FILE_FORMAT_ID.

Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/Zmqm9j5EO0I4W8dx@paquier.xyz
2024-07-03 13:09:20 +09:00
Michael Paquier
dd569214aa Clean up more unused variables in perl code
This is a continuation of 0c1aca461481, with some cleanup in:
- msvc_gendef.pl
- pgindent
- 005_negotiate_encryption.pl, due to an oversight in d39a49c1e459 that
removed %params in test_matrix(), also making $server_config
useless.

Author: Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/87wmm4dkci.fsf@wibble.ilmari.org
2024-07-03 12:43:57 +09:00
Nathan Bossart
dec9d4acdb Add CODE_OF_CONDUCT.md, CONTRIBUTING.md, and SECURITY.md.
These "community health files" provide important information about
the project and will be displayed prominently on the PostgreSQL
GitHub mirror.  For now, they just point to the website, but we may
want to expand on the content in the future.

Reviewed-by: Peter Eisentraut, Alvaro Herrera, Tom Lane
Discussion: https://postgr.es/m/20240417023609.GA3228660%40nathanxps13
2024-07-02 13:03:58 -05:00
Heikki Linnakangas
eb21f5bc67 Remove redundant SetProcessingMode(InitProcessing) calls
After several refactoring iterations, auxiliary processes are no
longer initialized from the bootstrapper. Using the InitProcessing
mode for initializing auxiliary processes is more appropriate. Since
the global variable Mode is initialized to InitProcessing, we can just
remove the redundant calls of SetProcessingMode(InitProcessing).

Author: Xing Guo <higuoxing@gmail.com>
Discussion: https://www.postgresql.org/message-id/CACpMh%2BDBHVT4xPGimzvex%3DwMdMLQEu9PYhT%2BkwwD2x2nu9dU_Q%40mail.gmail.com
2024-07-02 20:14:40 +03:00
Heikki Linnakangas
4d22173ec0 Move bgworker specific logic to bgworker.c
For clarity, we've been slowly moving functions that are not called
from the postmaster process out of postmaster.c.

Author: Xing Guo <higuoxing@gmail.com>
Discussion: https://www.postgresql.org/message-id/CACpMh%2BDBHVT4xPGimzvex%3DwMdMLQEu9PYhT%2BkwwD2x2nu9dU_Q%40mail.gmail.com
2024-07-02 20:12:05 +03:00
Nathan Bossart
8213df9eff pg_dump: Remove some unused return values.
getSchemaData() does not use the return values of many of its get*
helper functions because they store the data elsewhere.  For
example, commit 92316a4582 introduced a separate hash table for
dumpable objects that said helper functions populate.  This commit
changes these functions to return void and removes their "int *"
parameters that returned the number of objects found.

Reviewed-by: Neil Conway, Tom Lane, Daniel Gustafsson
Discussion: https://postgr.es/m/ZmCAtVaOrHpf31PJ%40nathan
2024-07-02 11:22:06 -05:00
Daniel Gustafsson
e930c872b6 Use safe string copy routine
Using memcpy with strlen as the size parameter will not take the
NULL terminator into account, relying instead on the destination
buffer being properly initialized. Replace with strlcpy which is
a safer alternative, and more in line with how we handle copying
strings elsewhere.

Author: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://postgr.es/m/CAEudQApAsbLsQ+gGiw-hT+JwGhgogFa_=5NUkgFO6kOPxyNidQ@mail.gmail.com
2024-07-02 11:16:56 +02:00
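A sketch of the before/after, assuming a fixed-size destination buffer (names are illustrative):

    char        namebuf[NAMEDATALEN];

    /* before: copies the bytes but not the terminator, and can overrun */
    memcpy(namebuf, name, strlen(name));

    /* after: bounded, and always NUL-terminates the destination */
    strlcpy(namebuf, name, sizeof(namebuf));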
Peter Eisentraut
da3ea048ca Remove too demanding test
Remove the test from commit 9c2e660b07.  This test ends up allocating
quite a bit of memory, which can make the test fail with out of memory
errors on some build farm machines.
2024-07-02 10:43:12 +02:00
Peter Eisentraut
9c2e660b07 Limit max parameter number with MaxAllocSize
MaxAllocSize puts an upper bound on the largest possible parameter
number ($268435455).  Use that limit instead of INT_MAX to report that
no parameters exist beyond that point instead of reporting an error
about the maximum allocation size being exceeded.

Author: Erik Wienhold <ewie@ewie.name>
Discussion: https://www.postgresql.org/message-id/flat/5d216d1c-91f6-4cbe-95e2-b4cbd930520c@ewie.name
2024-07-02 09:29:26 +02:00
Peter Eisentraut
d35cd06199 Fix overflow in parsing of positional parameter
Replace atol with pg_strtoint32_safe in the backend parser and with
strtoint in ECPG to reject overflows when parsing the number of a
positional parameter.  With atol from glibc, parameters $2147483648 and
$4294967297 turn into $-2147483648 and $1, respectively.

Author: Erik Wienhold <ewie@ewie.name>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/5d216d1c-91f6-4cbe-95e2-b4cbd930520c@ewie.name
2024-07-02 09:29:26 +02:00
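A small sketch of why an unchecked atol misbehaves here, shown with a range-checked strtol for illustration (the committed code uses the PostgreSQL helpers named above rather than this exact function):

    #include <errno.h>
    #include <limits.h>
    #include <stdbool.h>
    #include <stdlib.h>

    /* "$2147483648" silently became parameter -2147483648 once truncated to
     * a 32-bit int; a checked parse rejects it instead. */
    static bool
    parse_param_number(const char *s, int *result)
    {
        char       *end;
        long        val;

        errno = 0;
        val = strtol(s, &end, 10);
        if (errno == ERANGE || val < 1 || val > INT_MAX || *end != '\0')
            return false;       /* overflow, non-positive, or trailing junk */

        *result = (int) val;
        return true;
    }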
Amit Kapila
4867f8a555 Drop pre-existing subscriptions from the converted subscriber.
We don't need the pre-existing subscriptions on the subscriber newly
formed by pg_createsubscriber.  The apply workers corresponding to these
subscriptions can connect to other publisher nodes and either fetch
unwarranted data or run into ERRORs when connecting to such nodes.

Author: Kuroda Hayato
Reviewed-by: Amit Kapila, Shlok Kyal, Vignesh C
Backpatch-through: 17
Discussion: https://postgr.es/m/OSBPR01MB25526A30A1FBF863ACCDDA3AF5C92@OSBPR01MB2552.jpnprd01.prod.outlook.com
2024-07-02 11:36:21 +05:30
Peter Eisentraut
8f8bcb8883 Improve some global variable declarations
We have in launch_backend.c:

    /*
     * The following need to be available to the save/restore_backend_variables
     * functions.  They are marked NON_EXEC_STATIC in their home modules.
     */
    extern slock_t *ShmemLock;
    extern slock_t *ProcStructLock;
    extern PGPROC *AuxiliaryProcs;
    extern PMSignalData *PMSignalState;
    extern pg_time_t first_syslogger_file_time;
    extern struct bkend *ShmemBackendArray;
    extern bool redirection_done;

That comment is not completely true: ShmemLock, ShmemBackendArray, and
redirection_done are not in fact NON_EXEC_STATIC.  ShmemLock once was,
but was then needed elsewhere.  ShmemBackendArray was static inside
postmaster.c before launch_backend.c was created.  redirection_done
was never static.

This patch moves the declaration of ShmemLock and redirection_done to
a header file.

ShmemBackendArray gets a NON_EXEC_STATIC.  This doesn't make a
difference, since it only exists if EXEC_BACKEND anyway, but it makes
it consistent.

After that, the comment is now correct.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
2024-07-02 07:26:22 +02:00
Peter Eisentraut
881455e57b Add missing includes for some global variables
src/backend/libpq/pqcomm.c: "postmaster/postmaster.h" for Unix_socket_group, Unix_socket_permissions
src/backend/utils/init/globals.c: "postmaster/postmaster.h" for MyClientSocket
src/backend/utils/misc/guc_tables.c: "utils/rls.h" for row_security
src/backend/utils/sort/tuplesort.c: "utils/guc.h" for trace_sort

Nothing currently diagnoses missing includes for global variables, but
this is being cleaned up, and these ones had an obvious header file
available.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
2024-07-02 07:26:22 +02:00
Peter Eisentraut
720b0eaae9 Convert some extern variables to static
These probably should have been static all along, it was only
forgotten out of sloppiness.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
2024-07-02 07:26:22 +02:00
Amit Kapila
a4c87df43a Remove unused structure member in pg_createsubscriber.c.
Author: Kuroda Hayato
Backpatch-through: 17
Discussion: https://postgr.es/m/OSBPR01MB25526A30A1FBF863ACCDDA3AF5C92@OSBPR01MB2552.jpnprd01.prod.outlook.com
2024-07-02 10:28:51 +05:30
David Rowley
65b71dec2d Use TupleDescAttr macro consistently
A few places were directly accessing the attrs[] array. This goes
against the standards set by 2cd708452. Fix that.

Discussion: https://postgr.es/m/CAApHDvrBztXP3yx=NKNmo3xwFAFhEdyPnvrDg3=M0RhDs+4vYw@mail.gmail.com
2024-07-02 13:41:47 +12:00
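For reference, a short sketch of the accessor being standardized on; TupleDescAttr() keeps call sites independent of how the attribute array is represented:

    /* discouraged: direct array access */
    Form_pg_attribute attr = &tupdesc->attrs[i];

    /* preferred */
    Form_pg_attribute attr2 = TupleDescAttr(tupdesc, i);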
Michael Paquier
0c1aca4614 Cleanup perl code from unused variables and routines
This commit removes unused variables and routines from some perl code
that have accumulated across the years.  This touches the following
areas:
- Wait event generation script.
- AdjustUpgrade.pm.
- TAP perl code

Author: Alexander Lakhin
Reviewed-by: Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/70b340bc-244a-589d-ef8b-d8aebb707a84@gmail.com
2024-07-02 09:47:16 +09:00
Michael Paquier
978f38c771 Add information about access method for partitioned relations in \dP+
Since 374c7a229042, it is possible to set a table AM on a partitioned
table.  This information was showing up already in psql with \d+, while
\dP+ provided no information.

This commit extends \dP+ to show the access method used by a partitioned
table or index, if set.

Author: Justin Pryzby
Discussion: https://postgr.es/m/ZkyivySXnbvOogZz@pryzbyj2023
2024-07-02 09:01:38 +09:00
Tom Lane
edadeb0710 Remove support for HPPA (a/k/a PA-RISC) architecture.
This old CPU architecture hasn't been produced in decades, and
whatever instances might still survive are surely too underpowered
for anyone to consider running Postgres on in production.  We'd
nonetheless continued to carry code support for it (largely at my
insistence), because its unique implementation of spinlocks seemed
like a good edge case for our spinlock infrastructure.  However,
our last buildfarm animal of this type was retired last year, and
it seems quite unlikely that another will emerge.  Without the ability
to run tests, the argument that this is useful test code fails to
hold water.  Furthermore, carrying code support for an untestable
architecture has costs not to be ignored.  So, remove HPPA-specific
code, in the same vein as commits 718aa43a4 and 92d70b77e.

Discussion: https://postgr.es/m/3351991.1697728588@sss.pgh.pa.us
2024-07-01 13:55:52 -04:00
Nathan Bossart
7967d10c5b Remove redundant privilege check from pg_sequences system view.
This commit adjusts pg_sequence_last_value() to return NULL instead
of ERROR-ing for sequences for which the current user lacks
privileges.  This allows us to remove the call to
has_sequence_privilege() in the definition of the pg_sequences
system view.

Bumps catversion.

Suggested-by: Michael Paquier
Reviewed-by: Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/20240501005730.GA594666%40nathanxps13
2024-07-01 11:47:40 -05:00
Tom Lane
1afe31f03c Preserve CurrentMemoryContext across Start/CommitTransactionCommand.
Up to now, committing a transaction has caused CurrentMemoryContext to
get set to TopMemoryContext.  Most callers did not pay any particular
heed to this, which is problematic because TopMemoryContext is a
long-lived context that never gets reset.  If the caller assumes it
can leak memory because it's running in a limited-lifespan context,
that behavior translates into a session-lifespan memory leak.

The first-reported instance of this involved ProcessIncomingNotify,
which is called from the main processing loop that normally runs in
MessageContext.  That outer-loop code assumes that whatever it
allocates will be cleaned up when we're done processing the current
client message --- but if we service a notify interrupt, then whatever
gets allocated before the next switch to MessageContext will be
permanently leaked in TopMemoryContext.  sinval catchup interrupts
have a similar problem, and I strongly suspect that some places in
logical replication do too.

To fix this in a generic way, let's redefine the behavior as
"CommitTransactionCommand restores the memory context that was current
at entry to StartTransactionCommand".  This clearly fixes the issue
for the notify and sinval cases, and it seems to match the mental
model that's in use in the logical replication code, to the extent
that anybody thought about it there at all.

For consistency, likewise make subtransaction exit restore the context
that was current at subtransaction start (rather than always selecting
the CurTransactionContext of the parent transaction level).  This case
has less risk of resulting in a permanent leak than the outer-level
behavior has, but it would not meet the principle of least surprise
for some CommitTransactionCommand calls to restore the previous
context while others don't.

While we're here, also change xact.c so that we reset
TopTransactionContext at transaction exit and then re-use it in later
transactions, rather than dropping and recreating it in each cycle.
This probably doesn't save a lot given the context recycling mechanism
in aset.c, but it should save a little bit.  Per suggestion from David
Rowley.  (Parenthetically, the text in src/backend/utils/mmgr/README
implies that this is how I'd planned to implement it as far back as
commit 1aebc3618 --- but the code actually added in that commit just
drops and recreates it each time.)

This commit also cleans up a few places outside xact.c that were
needlessly making TopMemoryContext current, and thus risking more
leaks of the same kind.  I don't think any of them represent
significant leak risks today, but let's deal with them while the
issue is top-of-mind.

Per bug #18512 from wizardbrony.  Commit to HEAD only, as this change
seems to have some risk of breaking things for some callers.  We'll
apply a narrower fix for the known-broken cases in the back branches.

Discussion: https://postgr.es/m/3478884.1718656625@sss.pgh.pa.us
2024-07-01 11:55:19 -04:00
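A sketch of the notify-style leak and the new guarantee (contexts and calls as named above; the assertion is only illustrative):

    /* main loop code, normally running in MessageContext */
    MemoryContext oldcxt = CurrentMemoryContext;

    StartTransactionCommand();
    /* ... service the notify/sinval interrupt ... */
    CommitTransactionCommand();

    /*
     * Previously CurrentMemoryContext was now TopMemoryContext, so anything
     * palloc'd before the loop switched back to MessageContext leaked for
     * the rest of the session.  With this change the caller's context is
     * restored:
     */
    Assert(CurrentMemoryContext == oldcxt);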
Nathan Bossart
6e16b1e420 Add --no-sync to pg_upgrade's uses of pg_dump and pg_dumpall.
There's no reason to ensure that the files pg_upgrade generates
with pg_dump and pg_dumpall have been written safely to disk.  If
there is a crash during pg_upgrade, the upgrade must be restarted
from the beginning; dump files left behind by previous pg_upgrade
attempts cannot be reused.

Reviewed-by: Peter Eisentraut, Tom Lane, Michael Paquier, Daniel Gustafsson
Discussion: https://postgr.es/m/20240503171348.GA1731524%40nathanxps13
2024-07-01 10:18:26 -05:00
Peter Eisentraut
3fb59e789d Remove useless extern keywords
An extern keyword on a function definition (not declaration) is
useless and not the normal style.

Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
2024-07-01 16:40:25 +02:00
Alvaro Herrera
3497c87b05
Fix copy-paste mistake in PQcancelCreate
When an OOM occurred, this function was incorrectly setting a status of
CONNECTION_BAD on the passed in PGconn instead of on the newly created
PGcancelConn.

Mistake introduced with 61461a300c1c.  Backpatch to 17.

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reported-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20240630190040.26.nmisch@google.com
2024-07-01 13:58:22 +02:00
David Rowley
12227a1d5f Add context type field to pg_backend_memory_contexts
Since we now (as of v17) have 4 MemoryContext types, the type of context
seems like useful information to include in the pg_backend_memory_contexts
view.  Here we add that.

Reviewed-by: David Christensen, Michael Paquier
Discussion: https://postgr.es/m/CAApHDvrXX1OR09Zjb5TnB0AwCKze9exZN%3D9Nxxg1ZCVV8W-3BA%40mail.gmail.com
2024-07-01 21:19:01 +12:00
Peter Eisentraut
e26d313bad Remove useless code
BuildDescForRelation() goes out of its way to fill in
->constr->has_not_null, but that value is not used for anything later,
so this code can all be removed.  Note that BuildDescForRelation()
doesn't make any effort to fill in the rest of ->constr, so there is
no claim that that structure is completely filled in.

Reviewed-by: Tomasz Rybak <tomasz.rybak@post.pl>
Discussion: https://www.postgresql.org/message-id/flat/a368248e-69e4-40be-9c07-6c3b5880b0a6@eisentraut.org
2024-07-01 08:50:29 +02:00
Peter Eisentraut
da2aeba8f5 Remove useless initializations
The struct is already initialized to all zeros right before this, and
randomly initializing a few but not all fields to zero again has no
technical or educational value.

Reviewed-by: Tomasz Rybak <tomasz.rybak@post.pl>
Discussion: https://www.postgresql.org/message-id/flat/a368248e-69e4-40be-9c07-6c3b5880b0a6@eisentraut.org
2024-07-01 08:50:10 +02:00
Peter Eisentraut
da486d3601 doc: Clarify that pg_attrdef also stores generation expressions
This was documented with pg_attribute but not with pg_attrdef.

Reported-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/CACJufxE+E-iYmBnZVZHiYA+WpyZZVv7BfiBLpo=T70EZHDU9rw@mail.gmail.com
2024-07-01 08:39:07 +02:00
Amit Kapila
2357c9223b Rename standby_slot_names to synchronized_standby_slots.
The standby_slot_names GUC allows the specification of physical standby
slots that must be synchronized before the logical walsenders associated
with logical failover slots. However, for this purpose, the GUC name is
too generic.

Author: Hou Zhijie
Reviewed-by: Bertrand Drouvot, Masahiko Sawada
Backpatch-through: 17
Discussion: https://postgr.es/m/ZnWeUgdHong93fQN@momjian.us
2024-07-01 11:36:56 +05:30
Peter Eisentraut
0c3930d076 Apply COPT to CXXFLAGS as well
The main use of that make variable is to pass in -Werror.  It makes
sense to apply this to C++ as well.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/fe3e200c-edee-44e0-a6e3-d45dca72873b%40eisentraut.org
2024-07-01 07:30:55 +02:00
Michael Paquier
9004abf620 Use pgstat_kind_infos to read fixed shared statistics
Shared statistics with a fixed number of objects are read from the stats
file in pgstat_read_statsfile() using members of PgStat_ShmemControl and
following an order based on their PgStat_Kind value.

Instead of being explicit, this commit changes the stats read to iterate
over the pgstat_kind_infos array to find the memory locations to read
into, based on a new shared_ctl_off in PgStat_KindInfo that can be used
to define the position of this stats kind in shared memory.  This makes
the read logic simpler, and eases the introduction of future
improvements aimed at making this area more pluggable for external
modules.

Original idea suggested by Andres Freund.

Author: Tristan Partin
Reviewed-by: Andres Freund, Michael Paquier
Discussion: https://postgr.es/m/D12SQ7OYCD85.20BUVF3DWU5K7@neon.tech
2024-07-01 14:26:25 +09:00
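A hedged sketch of the table-driven read loop described above; apart from shared_ctl_off, the member and helper names here are guesses for illustration only:

    for (int kind = PGSTAT_KIND_FIRST_VALID; kind <= PGSTAT_KIND_LAST; kind++)
    {
        const PgStat_KindInfo *info = pgstat_get_kind_info(kind);
        char       *ptr;

        if (info == NULL || !info->fixed_amount)
            continue;

        /* locate this kind's stats in shared memory via its declared offset */
        ptr = ((char *) shmem) + info->shared_ctl_off;

        if (!read_chunk(fpin, ptr, info->shared_data_len))
            goto error;
    }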
Tom Lane
a1333ec048 Further weaken new pg_createsubscriber test on Windows.
Also omit backslashes (\) in the generated database names on Windows.
As before, perhaps we can revert this after updating affected
buildfarm animals.

Discussion: https://postgr.es/m/2509767.1719773880@sss.pgh.pa.us
2024-06-30 23:20:57 -04:00
Michael Paquier
797adaf0fe Format better code for xact_decode()'s XLOG_XACT_INVALIDATIONS
This makes the code more consistent with the surroundings.

Author: ChangAo Chen
Reviewed-by: Ashutosh Bapat
Discussion: https://postgr.es/m/CAExHW5tNTevUh58SKddTtcX3yU_5_PDSC8Mdp-Q2hc9PpZHRJg@mail.gmail.com
2024-07-01 10:08:00 +09:00
Michael Paquier
00d819d46a doc: Add ACL acronym for "Access Control List"
Five places across the docs use this abbreviation, so let's use a proper
acronym entry for it.

Per suggestion from me.

Author: Joel Jacobson
Reviewed-by: Nathan Bossart, David G. Johnston
Discussion: https://postgr.es/m/9253b872-dbb1-42a6-a79e-b1e96effc857@app.fastmail.com
2024-07-01 09:55:37 +09:00
Michael Paquier
b19db55bd6 Remove PgStat_KindInfo.named_on_disk
This field is used to track if a stats kind can use a custom format
representation on disk when reading or writing its stats data.  On HEAD,
this exists for replication slots stats, that need a mapping between an
internal index ID and the slot names.

named_on_disk is currently used nowhere and the callbacks
to_serialized_name and from_serialized_name are in charge of checking if
the serialization of the stats data should apply, so let's remove it.

Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/ZmKVlSX_T5YvIOsd@paquier.xyz
2024-07-01 09:35:36 +09:00
David Rowley
1029bdec2d Improve enlargeStringInfo's ERROR message
Until now, when an enlargeStringInfo() call would cause the StringInfo to
exceed its maximum size, we reported an "out of memory" error.  This is
misleading as it's no such thing.

Here we remove the "out of memory" text and replace it with something
more relevant to better indicate that it's a program limitation that's
been reached.

Reported-by: Michael Banck
Reviewed-by: Daniel Gustafsson, Tom Lane
Discussion: https://postgr.es/m/18484-3e357ade5fe50e61@postgresql.org
2024-07-01 12:11:10 +12:00
Michael Paquier
e26810d01d Stamp HEAD as 18devel.
Let the hacking begin ...
2024-07-01 07:56:10 +09:00
3951 changed files with 372709 additions and 196380 deletions

View File

@ -2,6 +2,12 @@
#
# For instructions on how to enable the CI integration in a repository and
# further details, see src/tools/ci/README
#
#
# NB: Different tasks intentionally test with different, non-default,
# configurations, to increase the chance of catching problems. Each task with
# non-obvious non-default documents their oddity at the top of the task,
# prefixed by "SPECIAL:".
env:
@ -17,10 +23,13 @@ env:
CHECK: check-world PROVE_FLAGS=$PROVE_FLAGS
CHECKFLAGS: -Otarget
PROVE_FLAGS: --timer
# Build test dependencies as part of the build step, to see compiler
# errors/warnings in one place.
MBUILD_TARGET: all testprep
MTEST_ARGS: --print-errorlogs --no-rebuild -C build
PGCTLTIMEOUT: 120 # avoids spurious failures during parallel tests
TEMP_CONFIG: ${CIRRUS_WORKING_DIR}/src/tools/ci/pg_ci_base.conf
PG_TEST_EXTRA: kerberos ldap ssl libpq_encryption load_balance
PG_TEST_EXTRA: kerberos ldap ssl libpq_encryption load_balance oauth
# What files to preserve in case tests fail
@ -52,6 +61,10 @@ on_failure_meson: &on_failure_meson
# To avoid unnecessarily spinning up a lot of VMs / containers for entirely
# broken commits, have a minimal task that all others depend on.
#
# SPECIAL:
# - Builds with --auto-features=disabled and thus almost no enabled
# dependencies
task:
name: SanityCheck
@ -65,7 +78,7 @@ task:
CPUS: 4
BUILD_JOBS: 8
TEST_JOBS: 8
IMAGE_FAMILY: pg-ci-bullseye
IMAGE_FAMILY: pg-ci-bookworm
CCACHE_DIR: ${CIRRUS_WORKING_DIR}/ccache_dir
# no options enabled, should be small
CCACHE_MAXSIZE: "150M"
@ -99,7 +112,7 @@ task:
EOF
build_script: |
su postgres <<-EOF
ninja -C build -j${BUILD_JOBS}
ninja -C build -j${BUILD_JOBS} ${MBUILD_TARGET}
EOF
upload_caches: ccache
@ -122,20 +135,33 @@ task:
src/tools/ci/cores_backtrace.sh linux /tmp/cores
# SPECIAL:
# - Uses postgres specific CPPFLAGS that increase test coverage
# - Specifies configuration options that test reading/writing/copying of node trees
# - Specifies debug_parallel_query=regress, to catch related issues during CI
# - Also runs tests against a running postgres instance, see test_running_script
task:
name: FreeBSD - 13 - Meson
name: FreeBSD - Meson
env:
CPUS: 4
BUILD_JOBS: 4
TEST_JOBS: 8
IMAGE_FAMILY: pg-ci-freebsd-13
IMAGE_FAMILY: pg-ci-freebsd
DISK_SIZE: 50
CCACHE_DIR: /tmp/ccache_dir
CPPFLAGS: -DRELCACHE_FORCE_RELEASE -DCOPY_PARSE_PLAN_TREES -DWRITE_READ_PARSE_PLAN_TREES -DRAW_EXPRESSION_COVERAGE_TEST -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS
CPPFLAGS: -DRELCACHE_FORCE_RELEASE -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS
CFLAGS: -Og -ggdb
# Several buildfarm animals enable these options. Without testing them
# during CI, it would be easy to cause breakage on the buildfarm with CI
# passing.
PG_TEST_INITDB_EXTRA_OPTS: >-
-c debug_copy_parse_plan_trees=on
-c debug_write_read_parse_plan_trees=on
-c debug_raw_expression_coverage_test=on
-c debug_parallel_query=regress
PG_TEST_PG_UPGRADE_MODE: --link
<<: *freebsd_task_template
@ -151,8 +177,7 @@ task:
ccache_cache:
folder: $CCACHE_DIR
# Work around performance issues due to 32KB block size
repartition_script: src/tools/ci/gcp_freebsd_repartition.sh
setup_ram_disk_script: src/tools/ci/gcp_ram_disk.sh
create_user_script: |
pw useradd postgres
chown -R postgres:postgres .
@ -174,11 +199,10 @@ task:
--buildtype=debug \
-Dcassert=true -Dinjection_points=true \
-Duuid=bsd -Dtcl_version=tcl86 -Ddtrace=auto \
-DPG_TEST_EXTRA="$PG_TEST_EXTRA" \
-Dextra_lib_dirs=/usr/local/lib -Dextra_include_dirs=/usr/local/include/ \
build
EOF
build_script: su postgres -c 'ninja -C build -j${BUILD_JOBS}'
build_script: su postgres -c 'ninja -C build -j${BUILD_JOBS} ${MBUILD_TARGET}'
upload_caches: ccache
test_world_script: |
@ -213,12 +237,120 @@ task:
cores_script: src/tools/ci/cores_backtrace.sh freebsd /tmp/cores
task:
depends_on: SanityCheck
trigger_type: manual
env:
# Below are experimentally derived to be a decent choice.
CPUS: 4
BUILD_JOBS: 8
TEST_JOBS: 8
# Default working directory is /tmp, but its total size (1.2 GB) is not
# enough, so different working and cache directory are set.
CIRRUS_WORKING_DIR: /home/postgres/postgres
CCACHE_DIR: /home/postgres/cache
PATH: /usr/sbin:$PATH
CORE_DUMP_DIR: /var/crash
matrix:
- name: NetBSD - Meson
only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*netbsd.*'
env:
OS_NAME: netbsd
IMAGE_FAMILY: pg-ci-netbsd-postgres
PKGCONFIG_PATH: '/usr/lib/pkgconfig:/usr/pkg/lib/pkgconfig'
# initdb fails with: 'invalid locale settings' error on NetBSD.
# Force 'LANG' and 'LC_*' variables to be 'C'.
# See https://postgr.es/m/2490325.1734471752%40sss.pgh.pa.us
LANG: "C"
LC_ALL: "C"
# -Duuid is not set for the NetBSD, see the comment below, above
# configure_script, for more information.
setup_additional_packages_script: |
#pkgin -y install ...
<<: *netbsd_task_template
- name: OpenBSD - Meson
only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*openbsd.*'
env:
OS_NAME: openbsd
IMAGE_FAMILY: pg-ci-openbsd-postgres
PKGCONFIG_PATH: '/usr/lib/pkgconfig:/usr/local/lib/pkgconfig'
UUID: -Duuid=e2fs
TCL: -Dtcl_version=tcl86
setup_additional_packages_script: |
#pkg_add -I ...
# Always core dump to ${CORE_DUMP_DIR}
set_core_dump_script: sysctl -w kern.nosuidcoredump=2
<<: *openbsd_task_template
sysinfo_script: |
locale
id
uname -a
ulimit -a -H && ulimit -a -S
env
ccache_cache:
folder: $CCACHE_DIR
setup_ram_disk_script: src/tools/ci/gcp_ram_disk.sh
create_user_script: |
useradd postgres
chown -R postgres:users /home/postgres
mkdir -p ${CCACHE_DIR}
chown -R postgres:users ${CCACHE_DIR}
setup_core_files_script: |
mkdir -p ${CORE_DUMP_DIR}
chmod -R 770 ${CORE_DUMP_DIR}
chown -R postgres:users ${CORE_DUMP_DIR}
# -Duuid=bsd is not set since 'bsd' uuid option
# is not working on NetBSD & OpenBSD. See
# https://www.postgresql.org/message-id/17358-89806e7420797025@postgresql.org
# And other uuid options are not available on NetBSD.
configure_script: |
su postgres <<-EOF
meson setup \
--buildtype=debugoptimized \
--pkg-config-path ${PKGCONFIG_PATH} \
-Dcassert=true -Dinjection_points=true \
-Dssl=openssl ${UUID} ${TCL} \
-DPG_TEST_EXTRA="$PG_TEST_EXTRA" \
build
EOF
build_script: su postgres -c 'ninja -C build -j${BUILD_JOBS} ${MBUILD_TARGET}'
upload_caches: ccache
test_world_script: |
su postgres <<-EOF
ulimit -c unlimited
# Otherwise tests will fail on OpenBSD, due to inability to start enough
# processes.
ulimit -p 256
meson test $MTEST_ARGS --num-processes ${TEST_JOBS}
EOF
on_failure:
<<: *on_failure_meson
cores_script: |
# Although we try to configure the OS to core dump inside
# ${CORE_DUMP_DIR}, they may not obey this. So, move core files to the
# ${CORE_DUMP_DIR} directory.
find build/ -type f -name '*.core' -exec mv '{}' ${CORE_DUMP_DIR} \;
src/tools/ci/cores_backtrace.sh ${OS_NAME} ${CORE_DUMP_DIR}
# configure feature flags, shared between the task running the linux tests and
# the CompilerWarnings task
LINUX_CONFIGURE_FEATURES: &LINUX_CONFIGURE_FEATURES >-
--with-gssapi
--with-icu
--with-ldap
--with-libcurl
--with-libxml
--with-libxslt
--with-llvm
@ -238,12 +370,13 @@ LINUX_MESON_FEATURES: &LINUX_MESON_FEATURES >-
-Duuid=e2fs
# Check SPECIAL in the matrix: below
task:
env:
CPUS: 4
BUILD_JOBS: 4
TEST_JOBS: 8 # experimentally derived to be a decent choice
IMAGE_FAMILY: pg-ci-bullseye
IMAGE_FAMILY: pg-ci-bookworm
CCACHE_DIR: /tmp/ccache_dir
DEBUGINFOD_URLS: "https://debuginfod.debian.net"
@ -272,6 +405,8 @@ task:
LDFLAGS: $SANITIZER_FLAGS
CC: ccache gcc
CXX: ccache g++
# GCC emits a warning for llvm-14, so switch to a newer one.
LLVM_CONFIG: llvm-config-16
LINUX_CONFIGURE_FEATURES: *LINUX_CONFIGURE_FEATURES
LINUX_MESON_FEATURES: *LINUX_MESON_FEATURES
@ -314,10 +449,15 @@ task:
#DEBIAN_FRONTEND=noninteractive apt-get -y install ...
matrix:
- name: Linux - Debian Bullseye - Autoconf
# SPECIAL:
# - Uses address sanitizer, sanitizer failures are typically printed in
# the server log
# - Configures postgres with a small segment size
- name: Linux - Debian Bookworm - Autoconf
env:
SANITIZER_FLAGS: -fsanitize=address
PG_TEST_PG_COMBINEBACKUP_MODE: --copy-file-range
# Normally, the "relation segment" code basically has no coverage in our
# tests, because we (quite reasonably) don't generate tables large
@ -331,10 +471,12 @@ task:
--enable-cassert --enable-injection-points --enable-debug \
--enable-tap-tests --enable-nls \
--with-segsize-blocks=6 \
--with-libnuma \
--with-liburing \
\
${LINUX_CONFIGURE_FEATURES} \
\
CLANG="ccache clang"
CLANG="ccache clang-16"
EOF
build_script: su postgres -c "make -s -j${BUILD_JOBS} world-bin"
upload_caches: ccache
@ -348,11 +490,18 @@ task:
on_failure:
<<: *on_failure_ac
- name: Linux - Debian Bullseye - Meson
# SPECIAL:
# - Uses undefined behaviour and alignment sanitizers, sanitizer failures
# are typically printed in the server log
# - Test both 64bit and 32 bit builds
# - uses io_method=io_uring
- name: Linux - Debian Bookworm - Meson
env:
CCACHE_MAXSIZE: "400M" # tests two different builds
SANITIZER_FLAGS: -fsanitize=alignment,undefined
PG_TEST_INITDB_EXTRA_OPTS: >-
-c io_method=io_uring
configure_script: |
su postgres <<-EOF
@ -360,7 +509,6 @@ task:
--buildtype=debug \
-Dcassert=true -Dinjection_points=true \
${LINUX_MESON_FEATURES} \
-DPG_TEST_EXTRA="$PG_TEST_EXTRA" \
build
EOF
@ -375,13 +523,22 @@ task:
${LINUX_MESON_FEATURES} \
-Dllvm=disabled \
--pkg-config-path /usr/lib/i386-linux-gnu/pkgconfig/ \
-DPERL=perl5.32-i386-linux-gnu \
-DPG_TEST_EXTRA="$PG_TEST_EXTRA" \
-DPERL=perl5.36-i386-linux-gnu \
-Dlibnuma=disabled \
build-32
EOF
build_script: su postgres -c 'ninja -C build -j${BUILD_JOBS}'
build_32_script: su postgres -c 'ninja -C build-32 -j${BUILD_JOBS}'
build_script: |
su postgres <<-EOF
ninja -C build -j${BUILD_JOBS} ${MBUILD_TARGET}
ninja -C build -t missingdeps
EOF
build_32_script: |
su postgres <<-EOF
ninja -C build-32 -j${BUILD_JOBS} ${MBUILD_TARGET}
ninja -C build -t missingdeps
EOF
upload_caches: ccache
@ -410,8 +567,13 @@ task:
cores_script: src/tools/ci/cores_backtrace.sh linux /tmp/cores
# NB: macOS is by far the most expensive OS to run CI for, therefore no
# expensive additional checks should be added.
#
# SPECIAL:
# - Enables --clone for pg_upgrade and pg_combinebackup
task:
name: macOS - Ventura - Meson
name: macOS - Sonoma - Meson
env:
CPUS: 4 # always get that much for cirrusci macOS instances
@ -420,18 +582,33 @@ task:
# work OK. See
# https://postgr.es/m/20220927040208.l3shfcidovpzqxfh%40awork3.anarazel.de
TEST_JOBS: 8
IMAGE: ghcr.io/cirruslabs/macos-ventura-base:latest
IMAGE: ghcr.io/cirruslabs/macos-runner:sonoma
CIRRUS_WORKING_DIR: ${HOME}/pgsql/
CCACHE_DIR: ${HOME}/ccache
MACPORTS_CACHE: ${HOME}/macports-cache
MACOS_PACKAGE_LIST: >-
ccache
icu
kerberos5
lz4
meson
openldap
openssl
p5.34-io-tty
p5.34-ipc-run
python312
tcl
zstd
CC: ccache cc
CXX: ccache c++
CFLAGS: -Og -ggdb
CXXFLAGS: -Og -ggdb
PG_TEST_PG_UPGRADE_MODE: --clone
PG_TEST_PG_COMBINEBACKUP_MODE: --clone
<<: *macos_task_template
@ -460,20 +637,15 @@ task:
# updates macports every time.
macports_cache:
folder: ${MACPORTS_CACHE}
fingerprint_script: |
# Reinstall packages if the OS major version, the list of the packages
# to install or the MacPorts install script changes.
sw_vers -productVersion | sed 's/\..*//'
echo $MACOS_PACKAGE_LIST
md5 src/tools/ci/ci_macports_packages.sh
reupload_on_changes: true
setup_additional_packages_script: |
sh src/tools/ci/ci_macports_packages.sh \
ccache \
icu \
kerberos5 \
lz4 \
meson \
openldap \
openssl \
p5.34-io-tty \
p5.34-ipc-run \
python312 \
tcl \
zstd
sh src/tools/ci/ci_macports_packages.sh $MACOS_PACKAGE_LIST
# system python doesn't provide headers
sudo /opt/local/bin/port select python3 python312
# Make macports install visible for subsequent steps
@ -490,10 +662,9 @@ task:
-Dextra_lib_dirs=/opt/local/lib \
-Dcassert=true -Dinjection_points=true \
-Duuid=e2fs -Ddtrace=auto \
-DPG_TEST_EXTRA="$PG_TEST_EXTRA" \
build
build_script: ninja -C build -j${BUILD_JOBS}
build_script: ninja -C build -j${BUILD_JOBS} ${MBUILD_TARGET}
upload_caches: ccache
test_world_script: |
@ -562,11 +733,12 @@ task:
# Use /DEBUG:FASTLINK to avoid high memory usage during linking
configure_script: |
vcvarsall x64
meson setup --backend ninja --buildtype debug -Dc_link_args=/DEBUG:FASTLINK -Dcassert=true -Dinjection_points=true -Db_pch=true -Dextra_lib_dirs=c:\openssl\1.1\lib -Dextra_include_dirs=c:\openssl\1.1\include -DTAR=%TAR% -DPG_TEST_EXTRA="%PG_TEST_EXTRA%" build
meson setup --backend ninja --buildtype debug -Dc_link_args=/DEBUG:FASTLINK -Dcassert=true -Dinjection_points=true -Db_pch=true -Dextra_lib_dirs=c:\openssl\1.1\lib -Dextra_include_dirs=c:\openssl\1.1\include -DTAR=%TAR% build
build_script: |
vcvarsall x64
ninja -C build
ninja -C build %MBUILD_TARGET%
ninja -C build -t missingdeps
check_world_script: |
vcvarsall x64
@ -624,7 +796,7 @@ task:
%BASH% -c "meson setup -Ddebug=true -Doptimization=g -Dcassert=true -Dinjection_points=true -Db_pch=true -Dnls=disabled -DTAR=%TAR% build"
build_script: |
%BASH% -c "ninja -C build"
%BASH% -c "ninja -C build ${MBUILD_TARGET}"
upload_caches: ccache
@ -651,7 +823,7 @@ task:
env:
CPUS: 4
BUILD_JOBS: 4
IMAGE_FAMILY: pg-ci-bullseye
IMAGE_FAMILY: pg-ci-bookworm
# Use larger ccache cache, as this task compiles with multiple compilers /
# flag combinations
@ -661,6 +833,9 @@ task:
LINUX_CONFIGURE_FEATURES: *LINUX_CONFIGURE_FEATURES
LINUX_MESON_FEATURES: *LINUX_MESON_FEATURES
# GCC emits a warning for llvm-14, so switch to a newer one.
LLVM_CONFIG: llvm-config-16
<<: *linux_task_template
sysinfo_script: |
@ -696,7 +871,7 @@ task:
--cache gcc.cache \
--enable-dtrace \
${LINUX_CONFIGURE_FEATURES} \
CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang"
CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang-16"
make -s -j${BUILD_JOBS} clean
time make -s -j${BUILD_JOBS} world-bin
@ -707,7 +882,7 @@ task:
--cache gcc.cache \
--enable-cassert \
${LINUX_CONFIGURE_FEATURES} \
CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang"
CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang-16"
make -s -j${BUILD_JOBS} clean
time make -s -j${BUILD_JOBS} world-bin
@ -717,7 +892,7 @@ task:
time ./configure \
--cache clang.cache \
${LINUX_CONFIGURE_FEATURES} \
CC="ccache clang" CXX="ccache clang++" CLANG="ccache clang"
CC="ccache clang" CXX="ccache clang++-16" CLANG="ccache clang-16"
make -s -j${BUILD_JOBS} clean
time make -s -j${BUILD_JOBS} world-bin
@ -729,7 +904,7 @@ task:
--enable-cassert \
--enable-dtrace \
${LINUX_CONFIGURE_FEATURES} \
CC="ccache clang" CXX="ccache clang++" CLANG="ccache clang"
CC="ccache clang" CXX="ccache clang++-16" CLANG="ccache clang-16"
make -s -j${BUILD_JOBS} clean
time make -s -j${BUILD_JOBS} world-bin
@ -753,9 +928,7 @@ task:
docs_build_script: |
time ./configure \
--cache gcc.cache \
CC="ccache gcc" \
CXX="ccache g++" \
CLANG="ccache clang"
CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang-16"
make -s -j${BUILD_JOBS} clean
time make -s -j${BUILD_JOBS} -C doc
@ -774,7 +947,7 @@ task:
${LINUX_CONFIGURE_FEATURES} \
--without-icu \
--quiet \
CC="gcc" CXX"=g++" CLANG="clang"
CC="gcc" CXX"=g++" CLANG="clang-16"
make -s -j${BUILD_JOBS} clean
time make -s headerscheck EXTRAFLAGS='-fmax-errors=10'
headers_cpluspluscheck_script: |


@ -52,6 +52,16 @@ default_freebsd_task_template: &freebsd_task_template
PLATFORM: freebsd
<<: *cirrus_community_vm_template
default_netbsd_task_template: &netbsd_task_template
env:
PLATFORM: netbsd
<<: *cirrus_community_vm_template
default_openbsd_task_template: &openbsd_task_template
env:
PLATFORM: openbsd
<<: *cirrus_community_vm_template
default_windows_task_template: &windows_task_template
env:


@ -1,14 +1,176 @@
root = true
[*.{c,h,l,y,pl,pm}]
indent_style = tab
[*]
indent_size = tab
[*]
trim_trailing_whitespace = true
insert_final_newline = true
indent_style = unset
tab_width = unset
[*.[chly]]
trim_trailing_whitespace = true
insert_final_newline = true
indent_style = tab
tab_width = 4
[*.{sgml,xml}]
[*.cpp]
trim_trailing_whitespace = true
insert_final_newline = true
indent_style = tab
tab_width = 4
[*.pl]
trim_trailing_whitespace = true
insert_final_newline = true
indent_style = tab
tab_width = 4
[*.pm]
trim_trailing_whitespace = true
insert_final_newline = true
indent_style = tab
tab_width = 4
[*.po]
trim_trailing_whitespace = true
insert_final_newline = unset
indent_style = space
tab_width = unset
[*.py]
trim_trailing_whitespace = true
insert_final_newline = true
indent_style = space
tab_width = unset
indent_size = 4
[*.sgml]
trim_trailing_whitespace = true
insert_final_newline = true
indent_style = space
tab_width = unset
indent_size = 1
[*.xsl]
[*.xml]
trim_trailing_whitespace = true
insert_final_newline = true
indent_style = space
tab_width = unset
indent_size = 2
[*.xsl]
trim_trailing_whitespace = true
insert_final_newline = true
indent_style = space
tab_width = unset
indent_size = 1
[*.data]
indent_style = unset
indent_size = unset
trim_trailing_whitespace = unset
insert_final_newline = unset
[contrib/pgcrypto/sql/pgp-armor.sql]
trim_trailing_whitespace = unset
insert_final_newline = true
indent_style = unset
tab_width = unset
[src/backend/catalog/sql_features.txt]
trim_trailing_whitespace = unset
insert_final_newline = true
indent_style = unset
tab_width = unset
[*.out]
indent_style = unset
indent_size = unset
trim_trailing_whitespace = unset
insert_final_newline = unset
[src/interfaces/ecpg/test/expected/*]
indent_style = unset
indent_size = unset
trim_trailing_whitespace = unset
insert_final_newline = unset
[configure]
indent_style = unset
indent_size = unset
trim_trailing_whitespace = unset
insert_final_newline = unset
[ppport.h]
indent_style = unset
indent_size = unset
trim_trailing_whitespace = unset
insert_final_newline = unset
[src/backend/jit/llvm/SectionMemoryManager.cpp]
indent_style = unset
indent_size = unset
trim_trailing_whitespace = unset
insert_final_newline = unset
[src/backend/jit/llvm/SectionMemoryManager.LICENSE]
indent_style = unset
indent_size = unset
trim_trailing_whitespace = unset
insert_final_newline = unset
[src/backend/regex/COPYRIGHT]
indent_style = unset
indent_size = unset
trim_trailing_whitespace = unset
insert_final_newline = unset
[src/backend/snowball/libstemmer/*.c]
indent_style = unset
indent_size = unset
trim_trailing_whitespace = unset
insert_final_newline = unset
[src/backend/utils/mb/Unicode/*-std.txt]
indent_style = unset
indent_size = unset
trim_trailing_whitespace = unset
insert_final_newline = unset
[src/include/jit/SectionMemoryManager.h]
indent_style = unset
indent_size = unset
trim_trailing_whitespace = unset
insert_final_newline = unset
[src/include/snowball/libstemmer/*]
indent_style = unset
indent_size = unset
trim_trailing_whitespace = unset
insert_final_newline = unset
[src/timezone/data/*]
indent_style = unset
indent_size = unset
trim_trailing_whitespace = unset
insert_final_newline = unset
[src/tools/pg_bsd_indent/*]
indent_style = unset
indent_size = unset
trim_trailing_whitespace = unset
insert_final_newline = unset
[src/tools/pg_bsd_indent/tests/*]
indent_style = unset
indent_size = unset
trim_trailing_whitespace = unset
insert_final_newline = unset
[src/tools/pg_bsd_indent/*.[ch]]
trim_trailing_whitespace = unset
insert_final_newline = unset
indent_style = unset
tab_width = 8


@ -14,6 +14,66 @@
#
# $ git log --pretty=format:"%H # %cd%n# %s" $PGINDENTGITHASH -1 --date=iso
b27644bade0348d0dafd3036c47880a349fe9332 # 2025-06-15 13:04:24 -0400
# Sync typedefs.list with the buildfarm.
4672b6223910687b2aab075bcd2dd54ce90d5171 # 2025-06-01 14:55:24 -0400
# Run pgindent on the previous commit.
918e7287ed20eb1fe280ab6c4056ccf94dcd53a8 # 2025-04-30 19:18:30 +1200
# Fix broken indentation
e1a8b1ad587112e67fdc5aa7b388631dde4dbdda # 2025-04-04 09:38:22 -0500
# Re-pgindent pg_largeobject.c after commit 0d6c477664.
796bdda484c838313959f65e2b700f14ac7c0e66 # 2025-03-18 09:02:36 -0400
# Fix indentation again.
203c1b4cc49455364b6bcab8034900d1c016b9cd # 2025-03-17 16:06:17 -0400
# Fix indentation.
b955df443405e056fd9047ef819a1465654f9d79 # 2025-03-13 12:41:44 +1300
# Fix indentation issue
76aa615943049c04efd36ab4765c06eda89cdfea # 2025-01-31 16:44:24 +0900
# Fix bad indentation introduced in commit d47cbf474
6e826278f1ebd9967c0f8adda29c8960a812e344 # 2025-01-13 11:27:32 +0900
# Fix pgindent damage
301de6a6f609cb3ad2d9d31fd8db9ae6c71e6dea # 2024-12-25 17:55:42 +0100
# Partial pgindent of .l and .y files
53dcba9be5746cc126bdb949bf81c29ea2cfc24d # 2024-11-21 21:40:17 +0100
# pgindent run
a7f2f6adc240a2823c2344b89e90bb630dea8803 # 2024-10-16 12:21:13 -0700
# Whitespace fixup from generated unicode tables.
7f7474a8e4002ac9fd4979cc7b16b50b70b70c28 # 2024-09-27 11:14:31 -0400
# Reindent pg_verifybackup.c.
7229ebe011dff3f418251a4836f6d098923aa1e1 # 2024-08-26 16:16:12 -0700
# Fix identation.
2b03cfeea47834913ff769124f4deba88140f662 # 2024-08-21 09:58:11 -0400
# Fix pgindent damage
97add39c038bbdb9082b416ddf04cd20b0d20bf5 # 2024-08-15 11:41:46 -0400
# Clean up indentation and whitespace inconsistencies in ecpg.
8de5ca1dc9fa809102acd1983ee19159d0bc469f # 2024-08-12 10:57:03 +0300
# Fix bad indentation introduced in commit f011e82c2c
c883453cb29cb40c1e59c3c54d159c5e744da8a9 # 2024-07-26 12:00:04 -0400
# Fix indentation.
47ecbfdfcc71e41d7dcc35f0be04f8adbe88397f # 2024-07-15 15:17:04 -0700
# Fix bad indentation introduced in 43cd30bcd1c
b48f275f18d7da4f4863888ad047cbd699698880 # 2024-06-28 10:51:05 -0400
# pgindent, because I forgot to do that.
da256a4a7fdcca35fe7ca808686ad3de6ee22306 # 2024-05-14 16:34:50 -0400
# Pre-beta mechanical code beautification.

.gitattributes vendored

@ -1,10 +1,15 @@
# IMPORTANT: After updating this file, also run src/tools/generate_editorconfig.py
* whitespace=space-before-tab,trailing-space
*.[chly] whitespace=space-before-tab,trailing-space,indent-with-non-tab,tabwidth=4
*.cpp whitespace=space-before-tab,trailing-space,indent-with-non-tab,tabwidth=4
*.pl whitespace=space-before-tab,trailing-space,tabwidth=4
*.pm whitespace=space-before-tab,trailing-space,tabwidth=4
*.po whitespace=space-before-tab,trailing-space,tab-in-indent,-blank-at-eof
*.py whitespace=space-before-tab,trailing-space,tab-in-indent
*.sgml whitespace=space-before-tab,trailing-space,tab-in-indent
*.x[ms]l whitespace=space-before-tab,trailing-space,tab-in-indent
*.xml whitespace=space-before-tab,trailing-space,tab-in-indent
*.xsl whitespace=space-before-tab,trailing-space,tab-in-indent
# Avoid confusing ASCII underlines with leftover merge conflict markers
README conflict-marker-size=32
@ -22,10 +27,14 @@ src/interfaces/ecpg/test/expected/* -whitespace
# These files are maintained or generated elsewhere. We take them as is.
configure -whitespace
ppport.h -whitespace
src/backend/jit/llvm/SectionMemoryManager.cpp -whitespace
src/backend/jit/llvm/SectionMemoryManager.LICENSE -whitespace
src/backend/regex/COPYRIGHT -whitespace
src/backend/snowball/libstemmer/*.c -whitespace
src/backend/utils/mb/Unicode/*-std.txt -whitespace
src/include/jit/SectionMemoryManager.h -whitespace
src/include/snowball/libstemmer/* -whitespace
src/timezone/data/* -whitespace
src/tools/pg_bsd_indent/* -whitespace
src/tools/pg_bsd_indent/tests/* -whitespace
src/tools/pg_bsd_indent/*.[ch] whitespace=-blank-at-eol,-blank-at-eof,tabwidth=8

.github/CODE_OF_CONDUCT.md vendored Normal file

@ -0,0 +1,2 @@
The PostgreSQL code of conduct can be found at
<https://www.postgresql.org/about/policies/coc/>.

.github/CONTRIBUTING.md vendored Normal file

@ -0,0 +1,2 @@
For information about contributing to PostgreSQL, see
<https://www.postgresql.org/developer/>.

.github/SECURITY.md vendored Normal file

@ -0,0 +1,2 @@
For information about reporting security issues, see
<https://www.postgresql.org/support/security/>.

.mailmap Normal file

@ -0,0 +1 @@
Álvaro Herrera <alvherre@alvh.no-ip.org>


@ -1,7 +1,7 @@
PostgreSQL Database Management System
(formerly known as Postgres, then as Postgres95)
(also known as Postgres, formerly known as Postgres95)
Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
Portions Copyright (c) 1994, The Regents of the University of California


@ -38,60 +38,6 @@ ac_c_werror_flag=$ac_save_c_werror_flag
])# PGAC_TEST_PRINTF_ARCHETYPE
# PGAC_TYPE_64BIT_INT(TYPE)
# -------------------------
# Check if TYPE is a working 64 bit integer type. Set HAVE_TYPE_64 to
# yes or no respectively, and define HAVE_TYPE_64 if yes.
AC_DEFUN([PGAC_TYPE_64BIT_INT],
[define([Ac_define], [translit([have_$1_64], [a-z *], [A-Z_P])])dnl
define([Ac_cachevar], [translit([pgac_cv_type_$1_64], [ *], [_p])])dnl
AC_CACHE_CHECK([whether $1 is 64 bits], [Ac_cachevar],
[AC_RUN_IFELSE([AC_LANG_SOURCE(
[typedef $1 ac_int64;
/*
* These are globals to discourage the compiler from folding all the
* arithmetic tests down to compile-time constants.
*/
ac_int64 a = 20000001;
ac_int64 b = 40000005;
int does_int64_work()
{
ac_int64 c,d;
if (sizeof(ac_int64) != 8)
return 0; /* definitely not the right size */
/* Do perfunctory checks to see if 64-bit arithmetic seems to work */
c = a * b;
d = (c + b) / b;
if (d != a+1)
return 0;
return 1;
}
int
main() {
return (! does_int64_work());
}])],
[Ac_cachevar=yes],
[Ac_cachevar=no],
[# If cross-compiling, check the size reported by the compiler and
# trust that the arithmetic works.
AC_COMPILE_IFELSE([AC_LANG_BOOL_COMPILE_TRY([], [sizeof($1) == 8])],
Ac_cachevar=yes,
Ac_cachevar=no)])])
Ac_define=$Ac_cachevar
if test x"$Ac_cachevar" = xyes ; then
AC_DEFINE(Ac_define, 1, [Define to 1 if `]$1[' works and is 64 bits.])
fi
undefine([Ac_define])dnl
undefine([Ac_cachevar])dnl
])# PGAC_TYPE_64BIT_INT
# PGAC_TYPE_128BIT_INT
# --------------------
# Check if __int128 is a working 128 bit integer type, and if so
@ -196,7 +142,7 @@ fi])# PGAC_C_STATIC_ASSERT
AC_DEFUN([PGAC_C_TYPEOF],
[AC_CACHE_CHECK(for typeof, pgac_cv_c_typeof,
[pgac_cv_c_typeof=no
for pgac_kw in typeof __typeof__ decltype; do
for pgac_kw in typeof __typeof__; do
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([],
[int x = 0;
$pgac_kw(x) y;
@ -270,9 +216,10 @@ fi])# PGAC_C_BUILTIN_CONSTANT_P
AC_DEFUN([PGAC_C_BUILTIN_OP_OVERFLOW],
[AC_CACHE_CHECK(for __builtin_mul_overflow, pgac_cv__builtin_op_overflow,
[AC_LINK_IFELSE([AC_LANG_PROGRAM([
PG_INT64_TYPE a = 1;
PG_INT64_TYPE b = 1;
PG_INT64_TYPE result;
#include <stdint.h>
int64_t a = 1;
int64_t b = 1;
int64_t result;
int oflo;
],
[oflo = __builtin_mul_overflow(a, b, &result);])],
@ -557,13 +504,13 @@ fi])# PGAC_HAVE_GCC__SYNC_INT32_CAS
# types, and define HAVE_GCC__SYNC_INT64_CAS if so.
AC_DEFUN([PGAC_HAVE_GCC__SYNC_INT64_CAS],
[AC_CACHE_CHECK(for builtin __sync int64 atomic operations, pgac_cv_gcc_sync_int64_cas,
[AC_LINK_IFELSE([AC_LANG_PROGRAM([],
[PG_INT64_TYPE lock = 0;
__sync_val_compare_and_swap(&lock, 0, (PG_INT64_TYPE) 37);])],
[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <stdint.h>],
[int64_t lock = 0;
__sync_val_compare_and_swap(&lock, 0, (int64_t) 37);])],
[pgac_cv_gcc_sync_int64_cas="yes"],
[pgac_cv_gcc_sync_int64_cas="no"])])
if test x"$pgac_cv_gcc_sync_int64_cas" = x"yes"; then
AC_DEFINE(HAVE_GCC__SYNC_INT64_CAS, 1, [Define to 1 if you have __sync_val_compare_and_swap(int64 *, int64, int64).])
AC_DEFINE(HAVE_GCC__SYNC_INT64_CAS, 1, [Define to 1 if you have __sync_val_compare_and_swap(int64_t *, int64_t, int64_t).])
fi])# PGAC_HAVE_GCC__SYNC_INT64_CAS
# PGAC_HAVE_GCC__ATOMIC_INT32_CAS
@ -588,9 +535,9 @@ fi])# PGAC_HAVE_GCC__ATOMIC_INT32_CAS
# types, and define HAVE_GCC__ATOMIC_INT64_CAS if so.
AC_DEFUN([PGAC_HAVE_GCC__ATOMIC_INT64_CAS],
[AC_CACHE_CHECK(for builtin __atomic int64 atomic operations, pgac_cv_gcc_atomic_int64_cas,
[AC_LINK_IFELSE([AC_LANG_PROGRAM([],
[PG_INT64_TYPE val = 0;
PG_INT64_TYPE expect = 0;
[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <stdint.h>],
[int64_t val = 0;
int64_t expect = 0;
__atomic_compare_exchange_n(&val, &expect, 37, 0, __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);])],
[pgac_cv_gcc_atomic_int64_cas="yes"],
[pgac_cv_gcc_atomic_int64_cas="no"])])
@ -605,29 +552,73 @@ fi])# PGAC_HAVE_GCC__ATOMIC_INT64_CAS
# test the 8-byte variant, _mm_crc32_u64, but it is assumed to be present if
# the other ones are, on x86-64 platforms)
#
# An optional compiler flag can be passed as argument (e.g. -msse4.2). If the
# intrinsics are supported, sets pgac_sse42_crc32_intrinsics, and CFLAGS_CRC.
# If the intrinsics are supported, sets pgac_sse42_crc32_intrinsics.
#
# To detect the case where the compiler knows the function but library support
# is missing, we must link not just compile, and store the results in global
# variables so the compiler doesn't optimize away the call.
AC_DEFUN([PGAC_SSE42_CRC32_INTRINSICS],
[define([Ac_cachevar], [AS_TR_SH([pgac_cv_sse42_crc32_intrinsics_$1])])dnl
AC_CACHE_CHECK([for _mm_crc32_u8 and _mm_crc32_u32 with CFLAGS=$1], [Ac_cachevar],
[pgac_save_CFLAGS=$CFLAGS
CFLAGS="$pgac_save_CFLAGS $1"
AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <nmmintrin.h>],
[unsigned int crc = 0;
crc = _mm_crc32_u8(crc, 0);
crc = _mm_crc32_u32(crc, 0);
/* return computed value, to prevent the above being optimized away */
return crc == 0;])],
[define([Ac_cachevar], [AS_TR_SH([pgac_cv_sse42_crc32_intrinsics])])dnl
AC_CACHE_CHECK([for _mm_crc32_u8 and _mm_crc32_u32], [Ac_cachevar],
[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <nmmintrin.h>
unsigned int crc;
#if defined(__has_attribute) && __has_attribute (target)
__attribute__((target("sse4.2")))
#endif
static int crc32_sse42_test(void)
{
crc = _mm_crc32_u8(crc, 0);
crc = _mm_crc32_u32(crc, 0);
/* return computed value, to prevent the above being optimized away */
return crc == 0;
}],
[return crc32_sse42_test();])],
[Ac_cachevar=yes],
[Ac_cachevar=no])
CFLAGS="$pgac_save_CFLAGS"])
[Ac_cachevar=no])])
if test x"$Ac_cachevar" = x"yes"; then
CFLAGS_CRC="$1"
pgac_sse42_crc32_intrinsics=yes
fi
undefine([Ac_cachevar])dnl
])# PGAC_SSE42_CRC32_INTRINSICS
# PGAC_AVX512_PCLMUL_INTRINSICS
# ---------------------------
# Check if the compiler supports AVX-512 carryless multiplication
# and three-way exclusive-or instructions used for computing CRC.
# AVX-512F is assumed to be supported if the above are.
#
# If the intrinsics are supported, sets pgac_avx512_pclmul_intrinsics.
AC_DEFUN([PGAC_AVX512_PCLMUL_INTRINSICS],
[define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx512_pclmul_intrinsics])])dnl
AC_CACHE_CHECK([for _mm512_clmulepi64_epi128], [Ac_cachevar],
[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <immintrin.h>
__m512i x;
__m512i y;
#if defined(__has_attribute) && __has_attribute (target)
__attribute__((target("vpclmulqdq,avx512vl")))
#endif
static int avx512_pclmul_test(void)
{
__m128i z;
x = _mm512_xor_si512(_mm512_zextsi128_si512(_mm_cvtsi32_si128(0)), x);
y = _mm512_clmulepi64_epi128(x, y, 0);
z = _mm_ternarylogic_epi64(
_mm512_castsi512_si128(y),
_mm512_extracti32x4_epi32(y, 1),
_mm512_extracti32x4_epi32(y, 2),
0x96);
return _mm_crc32_u64(0, _mm_extract_epi64(z, 0));
}],
[return avx512_pclmul_test();])],
[Ac_cachevar=yes],
[Ac_cachevar=no])])
if test x"$Ac_cachevar" = x"yes"; then
pgac_avx512_pclmul_intrinsics=yes
fi
undefine([Ac_cachevar])dnl
])# PGAC_AVX512_PCLMUL_INTRINSICS
# PGAC_ARMV8_CRC32C_INTRINSICS
# ----------------------------
@ -644,9 +635,9 @@ AC_DEFUN([PGAC_ARMV8_CRC32C_INTRINSICS],
AC_CACHE_CHECK([for __crc32cb, __crc32ch, __crc32cw, and __crc32cd with CFLAGS=$1], [Ac_cachevar],
[pgac_save_CFLAGS=$CFLAGS
CFLAGS="$pgac_save_CFLAGS $1"
AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <arm_acle.h>],
[unsigned int crc = 0;
crc = __crc32cb(crc, 0);
AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <arm_acle.h>
unsigned int crc;],
[crc = __crc32cb(crc, 0);
crc = __crc32ch(crc, 0);
crc = __crc32cw(crc, 0);
crc = __crc32cd(crc, 0);
@ -679,9 +670,8 @@ AC_DEFUN([PGAC_LOONGARCH_CRC32C_INTRINSICS],
AC_CACHE_CHECK(
[for __builtin_loongarch_crcc_w_b_w, __builtin_loongarch_crcc_w_h_w, __builtin_loongarch_crcc_w_w_w and __builtin_loongarch_crcc_w_d_w],
[Ac_cachevar],
[AC_LINK_IFELSE([AC_LANG_PROGRAM([],
[unsigned int crc = 0;
crc = __builtin_loongarch_crcc_w_b_w(0, crc);
[AC_LINK_IFELSE([AC_LANG_PROGRAM([unsigned int crc;],
[crc = __builtin_loongarch_crcc_w_b_w(0, crc);
crc = __builtin_loongarch_crcc_w_h_w(0, crc);
crc = __builtin_loongarch_crcc_w_w_w(0, crc);
crc = __builtin_loongarch_crcc_w_d_w(0, crc);
@ -700,20 +690,22 @@ undefine([Ac_cachevar])dnl
# Check if the compiler supports the XSAVE instructions using the _xgetbv
# intrinsic function.
#
# An optional compiler flag can be passed as argument (e.g., -mxsave). If the
# intrinsic is supported, sets pgac_xsave_intrinsics and CFLAGS_XSAVE.
# If the intrinsics are supported, sets pgac_xsave_intrinsics.
AC_DEFUN([PGAC_XSAVE_INTRINSICS],
[define([Ac_cachevar], [AS_TR_SH([pgac_cv_xsave_intrinsics_$1])])dnl
AC_CACHE_CHECK([for _xgetbv with CFLAGS=$1], [Ac_cachevar],
[pgac_save_CFLAGS=$CFLAGS
CFLAGS="$pgac_save_CFLAGS $1"
AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <immintrin.h>],
[return _xgetbv(0) & 0xe0;])],
[define([Ac_cachevar], [AS_TR_SH([pgac_cv_xsave_intrinsics])])dnl
AC_CACHE_CHECK([for _xgetbv], [Ac_cachevar],
[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <immintrin.h>
#if defined(__has_attribute) && __has_attribute (target)
__attribute__((target("xsave")))
#endif
static int xsave_test(void)
{
return _xgetbv(0) & 0xe0;
}],
[return xsave_test();])],
[Ac_cachevar=yes],
[Ac_cachevar=no])
CFLAGS="$pgac_save_CFLAGS"])
[Ac_cachevar=no])])
if test x"$Ac_cachevar" = x"yes"; then
CFLAGS_XSAVE="$1"
pgac_xsave_intrinsics=yes
fi
undefine([Ac_cachevar])dnl
@ -725,30 +717,84 @@ undefine([Ac_cachevar])dnl
# _mm512_setzero_si512, _mm512_maskz_loadu_epi8, _mm512_popcnt_epi64,
# _mm512_add_epi64, and _mm512_reduce_add_epi64 intrinsic functions.
#
# Optional compiler flags can be passed as argument (e.g., -mavx512vpopcntdq
# -mavx512bw). If the intrinsics are supported, sets
# pgac_avx512_popcnt_intrinsics and CFLAGS_POPCNT.
# If the intrinsics are supported, sets pgac_avx512_popcnt_intrinsics.
AC_DEFUN([PGAC_AVX512_POPCNT_INTRINSICS],
[define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx512_popcnt_intrinsics_$1])])dnl
AC_CACHE_CHECK([for _mm512_popcnt_epi64 with CFLAGS=$1], [Ac_cachevar],
[pgac_save_CFLAGS=$CFLAGS
CFLAGS="$pgac_save_CFLAGS $1"
AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <immintrin.h>],
[const char buf@<:@sizeof(__m512i)@:>@;
PG_INT64_TYPE popcnt = 0;
__m512i accum = _mm512_setzero_si512();
const __m512i val = _mm512_maskz_loadu_epi8((__mmask64) 0xf0f0f0f0f0f0f0f0, (const __m512i *) buf);
const __m512i cnt = _mm512_popcnt_epi64(val);
accum = _mm512_add_epi64(accum, cnt);
popcnt = _mm512_reduce_add_epi64(accum);
/* return computed value, to prevent the above being optimized away */
return popcnt == 0;])],
[define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx512_popcnt_intrinsics])])dnl
AC_CACHE_CHECK([for _mm512_popcnt_epi64], [Ac_cachevar],
[AC_LINK_IFELSE([AC_LANG_PROGRAM([[#include <immintrin.h>
#include <stdint.h>
char buf[sizeof(__m512i)];
#if defined(__has_attribute) && __has_attribute (target)
__attribute__((target("avx512vpopcntdq,avx512bw")))
#endif
static int popcount_test(void)
{
int64_t popcnt = 0;
__m512i accum = _mm512_setzero_si512();
__m512i val = _mm512_maskz_loadu_epi8((__mmask64) 0xf0f0f0f0f0f0f0f0, (const __m512i *) buf);
__m512i cnt = _mm512_popcnt_epi64(val);
accum = _mm512_add_epi64(accum, cnt);
popcnt = _mm512_reduce_add_epi64(accum);
return (int) popcnt;
}]],
[return popcount_test();])],
[Ac_cachevar=yes],
[Ac_cachevar=no])
CFLAGS="$pgac_save_CFLAGS"])
[Ac_cachevar=no])])
if test x"$Ac_cachevar" = x"yes"; then
CFLAGS_POPCNT="$1"
pgac_avx512_popcnt_intrinsics=yes
fi
undefine([Ac_cachevar])dnl
])# PGAC_AVX512_POPCNT_INTRINSICS
# PGAC_SVE_POPCNT_INTRINSICS
# --------------------------
# Check if the compiler supports the SVE popcount instructions using the
# svptrue_b64, svdup_u64, svcntb, svld1_u64, svld1_u8, svadd_u64_x,
# svcnt_u64_x, svcnt_u8_x, svaddv_u64, svaddv_u8, svwhilelt_b8_s32,
# svand_n_u64_x, and svand_n_u8_x intrinsic functions.
#
# If the intrinsics are supported, sets pgac_sve_popcnt_intrinsics.
AC_DEFUN([PGAC_SVE_POPCNT_INTRINSICS],
[define([Ac_cachevar], [AS_TR_SH([pgac_cv_sve_popcnt_intrinsics])])dnl
AC_CACHE_CHECK([for svcnt_x], [Ac_cachevar],
[AC_LINK_IFELSE([AC_LANG_PROGRAM([[#include <arm_sve.h>
char buf[128];
#if defined(__has_attribute) && __has_attribute (target)
__attribute__((target("arch=armv8-a+sve")))
#endif
static int popcount_test(void)
{
svbool_t pred = svptrue_b64();
svuint8_t vec8;
svuint64_t accum1 = svdup_u64(0),
accum2 = svdup_u64(0),
vec64;
char *p = buf;
uint64_t popcnt,
mask = 0x5555555555555555;
vec64 = svand_n_u64_x(pred, svld1_u64(pred, (const uint64_t *) p), mask);
accum1 = svadd_u64_x(pred, accum1, svcnt_u64_x(pred, vec64));
p += svcntb();
vec64 = svand_n_u64_x(pred, svld1_u64(pred, (const uint64_t *) p), mask);
accum2 = svadd_u64_x(pred, accum2, svcnt_u64_x(pred, vec64));
p += svcntb();
popcnt = svaddv_u64(pred, svadd_u64_x(pred, accum1, accum2));
pred = svwhilelt_b8_s32(0, sizeof(buf));
vec8 = svand_n_u8_x(pred, svld1_u8(pred, (const uint8_t *) p), 0x55);
return (int) (popcnt + svaddv_u8(pred, svcnt_u8_x(pred, vec8)));
}]],
[return popcount_test();])],
[Ac_cachevar=yes],
[Ac_cachevar=no])])
if test x"$Ac_cachevar" = x"yes"; then
pgac_sve_popcnt_intrinsics=yes
fi
undefine([Ac_cachevar])dnl
])# PGAC_SVE_POPCNT_INTRINSICS


@ -81,58 +81,3 @@ AC_DEFUN([PGAC_STRUCT_SOCKADDR_SA_LEN],
[#include <sys/types.h>
#include <sys/socket.h>
])])# PGAC_STRUCT_SOCKADDR_MEMBERS
# PGAC_TYPE_LOCALE_T
# ------------------
# Check for the locale_t type and find the right header file. macOS
# needs xlocale.h; standard is locale.h, but glibc <= 2.25 also had an
# xlocale.h file that we should not use, so we check the standard
# header first.
AC_DEFUN([PGAC_TYPE_LOCALE_T],
[AC_CACHE_CHECK([for locale_t], pgac_cv_type_locale_t,
[AC_COMPILE_IFELSE([AC_LANG_PROGRAM(
[#include <locale.h>
locale_t x;],
[])],
[pgac_cv_type_locale_t=yes],
[AC_COMPILE_IFELSE([AC_LANG_PROGRAM(
[#include <xlocale.h>
locale_t x;],
[])],
[pgac_cv_type_locale_t='yes (in xlocale.h)'],
[pgac_cv_type_locale_t=no])])])
if test "$pgac_cv_type_locale_t" = 'yes (in xlocale.h)'; then
AC_DEFINE(LOCALE_T_IN_XLOCALE, 1,
[Define to 1 if `locale_t' requires <xlocale.h>.])
fi])# PGAC_TYPE_LOCALE_T
# PGAC_FUNC_WCSTOMBS_L
# --------------------
# Try to find a declaration for wcstombs_l(). It might be in stdlib.h
# (following the POSIX requirement for wcstombs()), or in locale.h, or in
# xlocale.h. If it's in the latter, define WCSTOMBS_L_IN_XLOCALE.
#
AC_DEFUN([PGAC_FUNC_WCSTOMBS_L],
[AC_CACHE_CHECK([for wcstombs_l declaration], pgac_cv_func_wcstombs_l,
[AC_COMPILE_IFELSE([AC_LANG_PROGRAM(
[#include <stdlib.h>
#include <locale.h>],
[#ifndef wcstombs_l
(void) wcstombs_l;
#endif])],
[pgac_cv_func_wcstombs_l='yes'],
[AC_COMPILE_IFELSE([AC_LANG_PROGRAM(
[#include <stdlib.h>
#include <locale.h>
#include <xlocale.h>],
[#ifndef wcstombs_l
(void) wcstombs_l;
#endif])],
[pgac_cv_func_wcstombs_l='yes (in xlocale.h)'],
[pgac_cv_func_wcstombs_l='no'])])])
if test "$pgac_cv_func_wcstombs_l" = 'yes (in xlocale.h)'; then
AC_DEFINE(WCSTOMBS_L_IN_XLOCALE, 1,
[Define to 1 if `wcstombs_l' requires <xlocale.h>.])
fi])# PGAC_FUNC_WCSTOMBS_L


@ -1,5 +1,5 @@
# Copyright (c) 2024, PostgreSQL Global Development Group
# Copyright (c) 2024-2025, PostgreSQL Global Development Group
#
# Verify that required Perl modules are available,

config/config.guess vendored

@ -4,7 +4,7 @@
# shellcheck disable=SC2006,SC2268 # see below for rationale
timestamp='2024-01-01'
timestamp='2024-07-27'
# This file is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by
@ -123,7 +123,7 @@ set_cc_for_build() {
dummy=$tmp/dummy
case ${CC_FOR_BUILD-},${HOST_CC-},${CC-} in
,,) echo "int x;" > "$dummy.c"
for driver in cc gcc c89 c99 ; do
for driver in cc gcc c17 c99 c89 ; do
if ($driver -c -o "$dummy.o" "$dummy.c") >/dev/null 2>&1 ; then
CC_FOR_BUILD=$driver
break
@ -634,7 +634,8 @@ EOF
sed 's/^ //' << EOF > "$dummy.c"
#include <sys/systemcfg.h>
main()
int
main ()
{
if (!__power_pc())
exit(1);
@ -718,7 +719,8 @@ EOF
#include <stdlib.h>
#include <unistd.h>
int main ()
int
main ()
{
#if defined(_SC_KERNEL_BITS)
long bits = sysconf(_SC_KERNEL_BITS);
@ -1621,6 +1623,7 @@ cat > "$dummy.c" <<EOF
#endif
#endif
#endif
int
main ()
{
#if defined (sony)

config/config.sub vendored

@ -2,9 +2,9 @@
# Configuration validation subroutine script.
# Copyright 1992-2024 Free Software Foundation, Inc.
# shellcheck disable=SC2006,SC2268 # see below for rationale
# shellcheck disable=SC2006,SC2268,SC2162 # see below for rationale
timestamp='2024-01-01'
timestamp='2024-05-27'
# This file is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by
@ -120,7 +120,6 @@ case $# in
esac
# Split fields of configuration type
# shellcheck disable=SC2162
saved_IFS=$IFS
IFS="-" read field1 field2 field3 field4 <<EOF
$1
@ -142,10 +141,20 @@ case $1 in
# parts
maybe_os=$field2-$field3
case $maybe_os in
nto-qnx* | linux-* | uclinux-uclibc* \
| uclinux-gnu* | kfreebsd*-gnu* | knetbsd*-gnu* | netbsd*-gnu* \
| netbsd*-eabi* | kopensolaris*-gnu* | cloudabi*-eabi* \
| storm-chaos* | os2-emx* | rtmk-nova* | managarm-* \
cloudabi*-eabi* \
| kfreebsd*-gnu* \
| knetbsd*-gnu* \
| kopensolaris*-gnu* \
| linux-* \
| managarm-* \
| netbsd*-eabi* \
| netbsd*-gnu* \
| nto-qnx* \
| os2-emx* \
| rtmk-nova* \
| storm-chaos* \
| uclinux-gnu* \
| uclinux-uclibc* \
| windows-* )
basic_machine=$field1
basic_os=$maybe_os
@ -161,8 +170,12 @@ case $1 in
esac
;;
*-*)
# A lone config we happen to match not fitting any pattern
case $field1-$field2 in
# Shorthands that happen to contain a single dash
convex-c[12] | convex-c3[248])
basic_machine=$field2-convex
basic_os=
;;
decstation-3100)
basic_machine=mips-dec
basic_os=
@ -170,28 +183,88 @@ case $1 in
*-*)
# Second component is usually, but not always the OS
case $field2 in
# Prevent following clause from handling this valid os
# Do not treat sunos as a manufacturer
sun*os*)
basic_machine=$field1
basic_os=$field2
;;
# Manufacturers
3100* \
| 32* \
| 3300* \
| 3600* \
| 7300* \
| acorn \
| altos* \
| apollo \
| apple \
| atari \
| att* \
| axis \
| be \
| bull \
| cbm \
| ccur \
| cisco \
| commodore \
| convergent* \
| convex* \
| cray \
| crds \
| dec* \
| delta* \
| dg \
| digital \
| dolphin \
| encore* \
| gould \
| harris \
| highlevel \
| hitachi* \
| hp \
| ibm* \
| intergraph \
| isi* \
| knuth \
| masscomp \
| microblaze* \
| mips* \
| motorola* \
| ncr* \
| news \
| next \
| ns \
| oki \
| omron* \
| pc533* \
| rebel \
| rom68k \
| rombug \
| semi \
| sequent* \
| siemens \
| sgi* \
| siemens \
| sim \
| sni \
| sony* \
| stratus \
| sun \
| sun[234]* \
| tektronix \
| tti* \
| ultra \
| unicom* \
| wec \
| winbond \
| wrs)
basic_machine=$field1-$field2
basic_os=
;;
zephyr*)
basic_machine=$field1-unknown
basic_os=$field2
;;
# Manufacturers
dec* | mips* | sequent* | encore* | pc533* | sgi* | sony* \
| att* | 7300* | 3300* | delta* | motorola* | sun[234]* \
| unicom* | ibm* | next | hp | isi* | apollo | altos* \
| convergent* | ncr* | news | 32* | 3600* | 3100* \
| hitachi* | c[123]* | convex* | sun | crds | omron* | dg \
| ultra | tti* | harris | dolphin | highlevel | gould \
| cbm | ns | masscomp | apple | axis | knuth | cray \
| microblaze* | sim | cisco \
| oki | wec | wrs | winbond)
basic_machine=$field1-$field2
basic_os=
;;
*)
basic_machine=$field1
basic_os=$field2
@ -272,26 +345,6 @@ case $1 in
basic_machine=arm-unknown
basic_os=cegcc
;;
convex-c1)
basic_machine=c1-convex
basic_os=bsd
;;
convex-c2)
basic_machine=c2-convex
basic_os=bsd
;;
convex-c32)
basic_machine=c32-convex
basic_os=bsd
;;
convex-c34)
basic_machine=c34-convex
basic_os=bsd
;;
convex-c38)
basic_machine=c38-convex
basic_os=bsd
;;
cray)
basic_machine=j90-cray
basic_os=unicos
@ -714,15 +767,26 @@ case $basic_machine in
vendor=dec
basic_os=tops20
;;
delta | 3300 | motorola-3300 | motorola-delta \
| 3300-motorola | delta-motorola)
delta | 3300 | delta-motorola | 3300-motorola | motorola-delta | motorola-3300)
cpu=m68k
vendor=motorola
;;
dpx2*)
# This used to be dpx2*, but that gets the RS6000-based
# DPX/20 and the x86-based DPX/2-100 wrong. See
# https://oldskool.silicium.org/stations/bull_dpx20.htm
# https://www.feb-patrimoine.com/english/bull_dpx2.htm
# https://www.feb-patrimoine.com/english/unix_and_bull.htm
dpx2 | dpx2[23]00 | dpx2[23]xx)
cpu=m68k
vendor=bull
basic_os=sysv3
;;
dpx2100 | dpx21xx)
cpu=i386
vendor=bull
;;
dpx20)
cpu=rs6000
vendor=bull
;;
encore | umax | mmax)
cpu=ns32k
@ -837,18 +901,6 @@ case $basic_machine in
next | m*-next)
cpu=m68k
vendor=next
case $basic_os in
openstep*)
;;
nextstep*)
;;
ns2*)
basic_os=nextstep2
;;
*)
basic_os=nextstep3
;;
esac
;;
np1)
cpu=np1
@ -937,7 +989,6 @@ case $basic_machine in
;;
*-*)
# shellcheck disable=SC2162
saved_IFS=$IFS
IFS="-" read cpu vendor <<EOF
$basic_machine
@ -972,15 +1023,19 @@ unset -v basic_machine
# Decode basic machines in the full and proper CPU-Company form.
case $cpu-$vendor in
# Here we handle the default manufacturer of certain CPU types in canonical form. It is in
# some cases the only manufacturer, in others, it is the most popular.
# Here we handle the default manufacturer of certain CPU types in canonical form.
# It is in some cases the only manufacturer, in others, it is the most popular.
c[12]-convex | c[12]-unknown | c3[248]-convex | c3[248]-unknown)
vendor=convex
basic_os=${basic_os:-bsd}
;;
craynv-unknown)
vendor=cray
basic_os=${basic_os:-unicosmp}
;;
c90-unknown | c90-cray)
vendor=cray
basic_os=${Basic_os:-unicos}
basic_os=${basic_os:-unicos}
;;
fx80-unknown)
vendor=alliant
@ -1026,11 +1081,29 @@ case $cpu-$vendor in
vendor=alt
basic_os=${basic_os:-linux-gnueabihf}
;;
dpx20-unknown | dpx20-bull)
cpu=rs6000
vendor=bull
# Normalized CPU+vendor pairs that imply an OS, if not otherwise specified
m68k-isi)
basic_os=${basic_os:-sysv}
;;
m68k-sony)
basic_os=${basic_os:-newsos}
;;
m68k-tektronix)
basic_os=${basic_os:-bsd}
;;
m88k-harris)
basic_os=${basic_os:-sysv3}
;;
i386-bull | m68k-bull)
basic_os=${basic_os:-sysv3}
;;
rs6000-bull)
basic_os=${basic_os:-bosx}
;;
mips-sni)
basic_os=${basic_os:-sysv4}
;;
# Here we normalize CPU types irrespective of the vendor
amd64-*)
@ -1038,7 +1111,7 @@ case $cpu-$vendor in
;;
blackfin-*)
cpu=bfin
basic_os=linux
basic_os=${basic_os:-linux}
;;
c54x-*)
cpu=tic54x
@ -1061,7 +1134,7 @@ case $cpu-$vendor in
;;
m68knommu-*)
cpu=m68k
basic_os=linux
basic_os=${basic_os:-linux}
;;
m9s12z-* | m68hcs12z-* | hcs12z-* | s12z-*)
cpu=s12z
@ -1071,7 +1144,7 @@ case $cpu-$vendor in
;;
parisc-*)
cpu=hppa
basic_os=linux
basic_os=${basic_os:-linux}
;;
pentium-* | p5-* | k5-* | k6-* | nexgen-* | viac3-*)
cpu=i586
@ -1085,9 +1158,6 @@ case $cpu-$vendor in
pentium4-*)
cpu=i786
;;
pc98-*)
cpu=i386
;;
ppc-* | ppcbe-*)
cpu=powerpc
;;
@ -1121,9 +1191,6 @@ case $cpu-$vendor in
tx39el-*)
cpu=mipstx39el
;;
x64-*)
cpu=x86_64
;;
xscale-* | xscalee[bl]-*)
cpu=`echo "$cpu" | sed 's/^xscale/arm/'`
;;
@ -1179,90 +1246,227 @@ case $cpu-$vendor in
# Recognize the canonical CPU types that are allowed with any
# company name.
case $cpu in
1750a | 580 \
1750a \
| 580 \
| [cjt]90 \
| a29k \
| aarch64 | aarch64_be | aarch64c | arm64ec \
| aarch64 \
| aarch64_be \
| aarch64c \
| abacus \
| alpha | alphaev[4-8] | alphaev56 | alphaev6[78] \
| alpha64 | alpha64ev[4-8] | alpha64ev56 | alpha64ev6[78] \
| alphapca5[67] | alpha64pca5[67] \
| alpha \
| alpha64 \
| alpha64ev56 \
| alpha64ev6[78] \
| alpha64ev[4-8] \
| alpha64pca5[67] \
| alphaev56 \
| alphaev6[78] \
| alphaev[4-8] \
| alphapca5[67] \
| am33_2.0 \
| amdgcn \
| arc | arceb | arc32 | arc64 \
| arm | arm[lb]e | arme[lb] | armv* \
| avr | avr32 \
| arc \
| arc32 \
| arc64 \
| arceb \
| arm \
| arm64e \
| arm64ec \
| arm[lb]e \
| arme[lb] \
| armv* \
| asmjs \
| avr \
| avr32 \
| ba \
| be32 | be64 \
| bfin | bpf | bs2000 \
| c[123]* | c30 | [cjt]90 | c4x \
| c8051 | clipper | craynv | csky | cydra \
| d10v | d30v | dlx | dsp16xx \
| e2k | elxsi | epiphany \
| f30[01] | f700 | fido | fr30 | frv | ft32 | fx80 \
| javascript \
| h8300 | h8500 \
| hppa | hppa1.[01] | hppa2.0 | hppa2.0[nw] | hppa64 \
| be32 \
| be64 \
| bfin \
| bpf \
| bs2000 \
| c30 \
| c4x \
| c8051 \
| c[123]* \
| clipper \
| craynv \
| csky \
| cydra \
| d10v \
| d30v \
| dlx \
| dsp16xx \
| e2k \
| elxsi \
| epiphany \
| f30[01] \
| f700 \
| fido \
| fr30 \
| frv \
| ft32 \
| fx80 \
| h8300 \
| h8500 \
| hexagon \
| i370 | i*86 | i860 | i960 | ia16 | ia64 \
| ip2k | iq2000 \
| hppa \
| hppa1.[01] \
| hppa2.0 \
| hppa2.0[nw] \
| hppa64 \
| i*86 \
| i370 \
| i860 \
| i960 \
| ia16 \
| ia64 \
| ip2k \
| iq2000 \
| javascript \
| k1om \
| kvx \
| le32 | le64 \
| le32 \
| le64 \
| lm32 \
| loongarch32 | loongarch64 \
| m32c | m32r | m32rle \
| m5200 | m68000 | m680[012346]0 | m68360 | m683?2 | m68k \
| m6811 | m68hc11 | m6812 | m68hc12 | m68hcs12x \
| m88110 | m88k | maxq | mb | mcore | mep | metag \
| microblaze | microblazeel \
| loongarch32 \
| loongarch64 \
| m32c \
| m32r \
| m32rle \
| m5200 \
| m68000 \
| m680[012346]0 \
| m6811 \
| m6812 \
| m68360 \
| m683?2 \
| m68hc11 \
| m68hc12 \
| m68hcs12x \
| m68k \
| m88110 \
| m88k \
| maxq \
| mb \
| mcore \
| mep \
| metag \
| microblaze \
| microblazeel \
| mips* \
| mmix \
| mn10200 | mn10300 \
| mn10200 \
| mn10300 \
| moxie \
| mt \
| msp430 \
| mt \
| nanomips* \
| nds32 | nds32le | nds32be \
| nds32 \
| nds32be \
| nds32le \
| nfp \
| nios | nios2 | nios2eb | nios2el \
| none | np1 | ns16k | ns32k | nvptx \
| nios \
| nios2 \
| nios2eb \
| nios2el \
| none \
| np1 \
| ns16k \
| ns32k \
| nvptx \
| open8 \
| or1k* \
| or32 \
| orion \
| pdp10 \
| pdp11 \
| picochip \
| pdp10 | pdp11 | pj | pjl | pn | power \
| powerpc | powerpc64 | powerpc64le | powerpcle | powerpcspe \
| pj \
| pjl \
| pn \
| power \
| powerpc \
| powerpc64 \
| powerpc64le \
| powerpcle \
| powerpcspe \
| pru \
| pyramid \
| riscv | riscv32 | riscv32be | riscv64 | riscv64be \
| rl78 | romp | rs6000 | rx \
| s390 | s390x \
| riscv \
| riscv32 \
| riscv32be \
| riscv64 \
| riscv64be \
| rl78 \
| romp \
| rs6000 \
| rx \
| s390 \
| s390x \
| score \
| sh | shl \
| sh[1234] | sh[24]a | sh[24]ae[lb] | sh[23]e | she[lb] | sh[lb]e \
| sh[1234]e[lb] | sh[12345][lb]e | sh[23]ele | sh64 | sh64le \
| sparc | sparc64 | sparc64b | sparc64v | sparc86x | sparclet \
| sh \
| sh64 \
| sh64le \
| sh[12345][lb]e \
| sh[1234] \
| sh[1234]e[lb] \
| sh[23]e \
| sh[23]ele \
| sh[24]a \
| sh[24]ae[lb] \
| sh[lb]e \
| she[lb] \
| shl \
| sparc \
| sparc64 \
| sparc64b \
| sparc64v \
| sparc86x \
| sparclet \
| sparclite \
| sparcv8 | sparcv9 | sparcv9b | sparcv9v | sv1 | sx* \
| sparcv8 \
| sparcv9 \
| sparcv9b \
| sparcv9v \
| spu \
| sv1 \
| sx* \
| tahoe \
| thumbv7* \
| tic30 | tic4x | tic54x | tic55x | tic6x | tic80 \
| tic30 \
| tic4x \
| tic54x \
| tic55x \
| tic6x \
| tic80 \
| tron \
| ubicom32 \
| v70 | v850 | v850e | v850e1 | v850es | v850e2 | v850e2v3 \
| v70 \
| v810 \
| v850 \
| v850e \
| v850e1 \
| v850e2 \
| v850e2v3 \
| v850es \
| vax \
| vc4 \
| visium \
| w65 \
| wasm32 | wasm64 \
| wasm32 \
| wasm64 \
| we32k \
| x86 | x86_64 | xc16x | xgate | xps100 \
| xstormy16 | xtensa* \
| x86 \
| x86_64 \
| xc16x \
| xgate \
| xps100 \
| xstormy16 \
| xtensa* \
| ymp \
| z8k | z80)
| z80 \
| z8k)
;;
*)
@ -1307,7 +1511,6 @@ case $basic_os in
os=`echo "$basic_os" | sed -e 's|nto-qnx|qnx|'`
;;
*-*)
# shellcheck disable=SC2162
saved_IFS=$IFS
IFS="-" read kernel os <<EOF
$basic_os
@ -1354,6 +1557,23 @@ case $os in
unixware*)
os=sysv4.2uw
;;
# The marketing names for NeXT's operating systems were
# NeXTSTEP, NeXTSTEP 2, OpenSTEP 3, OpenSTEP 4. 'openstep' is
# mapped to 'openstep3', but 'openstep1' and 'openstep2' are
# mapped to 'nextstep' and 'nextstep2', consistent with the
# treatment of SunOS/Solaris.
ns | ns1 | nextstep | nextstep1 | openstep1)
os=nextstep
;;
ns2 | nextstep2 | openstep2)
os=nextstep2
;;
ns3 | nextstep3 | openstep | openstep3)
os=openstep3
;;
ns4 | nextstep4 | openstep4)
os=openstep4
;;
# es1800 is here to avoid being matched by es* (a different OS)
es1800*)
os=ose
@ -1424,6 +1644,7 @@ case $os in
;;
utek*)
os=bsd
vendor=`echo "$vendor" | sed -e 's|^unknown$|tektronix|'`
;;
dynix*)
os=bsd
@ -1440,21 +1661,25 @@ case $os in
386bsd)
os=bsd
;;
ctix* | uts*)
ctix*)
os=sysv
vendor=`echo "$vendor" | sed -e 's|^unknown$|convergent|'`
;;
uts*)
os=sysv
;;
nova*)
os=rtmk-nova
;;
ns2)
os=nextstep2
kernel=rtmk
os=nova
;;
# Preserve the version number of sinix5.
sinix5.*)
os=`echo "$os" | sed -e 's|sinix|sysv|'`
vendor=`echo "$vendor" | sed -e 's|^unknown$|sni|'`
;;
sinix*)
os=sysv4
vendor=`echo "$vendor" | sed -e 's|^unknown$|sni|'`
;;
tpf*)
os=tpf
@ -1595,6 +1820,14 @@ case $cpu-$vendor in
os=
obj=elf
;;
# The -sgi and -siemens entries must be before the mips- entry
# or we get the wrong os.
*-sgi)
os=irix
;;
*-siemens)
os=sysv4
;;
mips*-cisco)
os=
obj=elf
@ -1607,7 +1840,8 @@ case $cpu-$vendor in
os=
obj=coff
;;
*-tti) # must be before sparc entry or we get the wrong os.
# This must be before the sparc-* entry or we get the wrong os.
*-tti)
os=sysv3
;;
sparc-* | *-sun)
@ -1639,7 +1873,7 @@ case $cpu-$vendor in
os=hpux
;;
*-hitachi)
os=hiux
os=hiuxwe2
;;
i860-* | *-att | *-ncr | *-altos | *-motorola | *-convergent)
os=sysv
@ -1683,12 +1917,6 @@ case $cpu-$vendor in
*-encore)
os=bsd
;;
*-sgi)
os=irix
;;
*-siemens)
os=sysv4
;;
*-masscomp)
os=rtu
;;
@ -1735,40 +1963,193 @@ case $os in
ghcjs)
;;
# Now accept the basic system types.
# The portable systems comes first.
# Each alternative MUST end in a * to match a version number.
gnu* | android* | bsd* | mach* | minix* | genix* | ultrix* | irix* \
| *vms* | esix* | aix* | cnk* | sunos | sunos[34]* \
| hpux* | unos* | osf* | luna* | dgux* | auroraux* | solaris* \
| sym* | plan9* | psp* | sim* | xray* | os68k* | v88r* \
| hiux* | abug | nacl* | netware* | windows* \
| os9* | macos* | osx* | ios* | tvos* | watchos* \
| mpw* | magic* | mmixware* | mon960* | lnews* \
| amigaos* | amigados* | msdos* | newsos* | unicos* | aof* \
| aos* | aros* | cloudabi* | sortix* | twizzler* \
| nindy* | vxsim* | vxworks* | ebmon* | hms* | mvs* \
| clix* | riscos* | uniplus* | iris* | isc* | rtu* | xenix* \
| mirbsd* | netbsd* | dicos* | openedition* | ose* \
| bitrig* | openbsd* | secbsd* | solidbsd* | libertybsd* | os108* \
| ekkobsd* | freebsd* | riscix* | lynxos* | os400* \
| bosx* | nextstep* | cxux* | oabi* \
| ptx* | ecoff* | winnt* | domain* | vsta* \
| udi* | lites* | ieee* | go32* | aux* | hcos* \
| chorusrdb* | cegcc* | glidix* | serenity* \
| cygwin* | msys* | moss* | proelf* | rtems* \
| midipix* | mingw32* | mingw64* | mint* \
| uxpv* | beos* | mpeix* | udk* | moxiebox* \
| interix* | uwin* | mks* | rhapsody* | darwin* \
| openstep* | oskit* | conix* | pw32* | nonstopux* \
| storm-chaos* | tops10* | tenex* | tops20* | its* \
| os2* | vos* | palmos* | uclinux* | nucleus* | morphos* \
| scout* | superux* | sysv* | rtmk* | tpf* | windiss* \
| powermax* | dnix* | nx6 | nx7 | sei* | dragonfly* \
| skyos* | haiku* | rdos* | toppers* | drops* | es* \
| onefs* | tirtos* | phoenix* | fuchsia* | redox* | bme* \
| midnightbsd* | amdhsa* | unleashed* | emscripten* | wasi* \
| nsk* | powerunix* | genode* | zvmoe* | qnx* | emx* | zephyr* \
| fiwix* | mlibc* | cos* | mbr* | ironclad* )
abug \
| aix* \
| amdhsa* \
| amigados* \
| amigaos* \
| android* \
| aof* \
| aos* \
| aros* \
| atheos* \
| auroraux* \
| aux* \
| beos* \
| bitrig* \
| bme* \
| bosx* \
| bsd* \
| cegcc* \
| chorusos* \
| chorusrdb* \
| clix* \
| cloudabi* \
| cnk* \
| conix* \
| cos* \
| cxux* \
| cygwin* \
| darwin* \
| dgux* \
| dicos* \
| dnix* \
| domain* \
| dragonfly* \
| drops* \
| ebmon* \
| ecoff* \
| ekkobsd* \
| emscripten* \
| emx* \
| es* \
| fiwix* \
| freebsd* \
| fuchsia* \
| genix* \
| genode* \
| glidix* \
| gnu* \
| go32* \
| haiku* \
| hcos* \
| hiux* \
| hms* \
| hpux* \
| ieee* \
| interix* \
| ios* \
| iris* \
| irix* \
| ironclad* \
| isc* \
| its* \
| l4re* \
| libertybsd* \
| lites* \
| lnews* \
| luna* \
| lynxos* \
| mach* \
| macos* \
| magic* \
| mbr* \
| midipix* \
| midnightbsd* \
| mingw32* \
| mingw64* \
| minix* \
| mint* \
| mirbsd* \
| mks* \
| mlibc* \
| mmixware* \
| mon960* \
| morphos* \
| moss* \
| moxiebox* \
| mpeix* \
| mpw* \
| msdos* \
| msys* \
| mvs* \
| nacl* \
| netbsd* \
| netware* \
| newsos* \
| nextstep* \
| nindy* \
| nonstopux* \
| nova* \
| nsk* \
| nucleus* \
| nx6 \
| nx7 \
| oabi* \
| ohos* \
| onefs* \
| openbsd* \
| openedition* \
| openstep* \
| os108* \
| os2* \
| os400* \
| os68k* \
| os9* \
| ose* \
| osf* \
| oskit* \
| osx* \
| palmos* \
| phoenix* \
| plan9* \
| powermax* \
| powerunix* \
| proelf* \
| psos* \
| psp* \
| ptx* \
| pw32* \
| qnx* \
| rdos* \
| redox* \
| rhapsody* \
| riscix* \
| riscos* \
| rtems* \
| rtmk* \
| rtu* \
| scout* \
| secbsd* \
| sei* \
| serenity* \
| sim* \
| skyos* \
| solaris* \
| solidbsd* \
| sortix* \
| storm-chaos* \
| sunos \
| sunos[34]* \
| superux* \
| syllable* \
| sym* \
| sysv* \
| tenex* \
| tirtos* \
| toppers* \
| tops10* \
| tops20* \
| tpf* \
| tvos* \
| twizzler* \
| uclinux* \
| udi* \
| udk* \
| ultrix* \
| unicos* \
| uniplus* \
| unleashed* \
| unos* \
| uwin* \
| uxpv* \
| v88r* \
|*vms* \
| vos* \
| vsta* \
| vxsim* \
| vxworks* \
| wasi* \
| watchos* \
| wince* \
| windiss* \
| windows* \
| winnt* \
| xenix* \
| xray* \
| zephyr* \
| zvmoe* )
;;
# This one is extra strict with allowed versions
sco3.2v2 | sco3.2v[4-9]* | sco5v6*)
@ -1829,9 +2210,9 @@ esac
case $kernel-$os-$obj in
linux-gnu*- | linux-android*- | linux-dietlibc*- | linux-llvm*- \
| linux-mlibc*- | linux-musl*- | linux-newlib*- \
| linux-relibc*- | linux-uclibc*- )
| linux-relibc*- | linux-uclibc*- | linux-ohos*- )
;;
uclinux-uclibc*- )
uclinux-uclibc*- | uclinux-gnu*- )
;;
managarm-mlibc*- | managarm-kernel*- )
;;
@ -1856,7 +2237,7 @@ case $kernel-$os-$obj in
echo "Invalid configuration '$1': '$os' needs 'windows'." 1>&2
exit 1
;;
kfreebsd*-gnu*- | kopensolaris*-gnu*-)
kfreebsd*-gnu*- | knetbsd*-gnu*- | netbsd*-gnu*- | kopensolaris*-gnu*-)
;;
vxworks-simlinux- | vxworks-simwindows- | vxworks-spe-)
;;
@ -1864,6 +2245,8 @@ case $kernel-$os-$obj in
;;
os2-emx-)
;;
rtmk-nova-)
;;
*-eabi*- | *-gnueabi*-)
;;
none--*)
@ -1890,7 +2273,7 @@ case $vendor in
*-riscix*)
vendor=acorn
;;
*-sunos*)
*-sunos* | *-solaris*)
vendor=sun
;;
*-cnk* | *-aix*)


@ -25,8 +25,8 @@ AC_DEFUN([PGAC_LLVM_SUPPORT],
AC_MSG_ERROR([$LLVM_CONFIG does not work])
fi
# and whether the version is supported
if echo $pgac_llvm_version | $AWK -F '.' '{ if ([$]1 >= 10) exit 1; else exit 0;}';then
AC_MSG_ERROR([$LLVM_CONFIG version is $pgac_llvm_version but at least 10 is required])
if echo $pgac_llvm_version | $AWK -F '.' '{ if ([$]1 >= 14) exit 1; else exit 0;}';then
AC_MSG_ERROR([$LLVM_CONFIG version is $pgac_llvm_version but at least 14 is required])
fi
AC_MSG_NOTICE([using llvm $pgac_llvm_version])


@ -59,57 +59,16 @@ AC_SUBST(BISONFLAGS)
# PGAC_PATH_FLEX
# --------------
# Look for Flex, set the output variable FLEX to its path if found.
# Reject versions before 2.5.35 (the earliest version in the buildfarm
# as of 2022). Also find Flex if its installed under `lex', but do not
# accept other Lex programs.
AC_DEFUN([PGAC_PATH_FLEX],
[AC_CACHE_CHECK([for flex], pgac_cv_path_flex,
[# Let the user override the test
if test -n "$FLEX"; then
pgac_cv_path_flex=$FLEX
else
pgac_save_IFS=$IFS
IFS=$PATH_SEPARATOR
for pgac_dir in $PATH; do
IFS=$pgac_save_IFS
if test -z "$pgac_dir" || test x"$pgac_dir" = x"."; then
pgac_dir=`pwd`
fi
for pgac_prog in flex lex; do
pgac_candidate="$pgac_dir/$pgac_prog"
if test -f "$pgac_candidate" \
&& $pgac_candidate --version </dev/null >/dev/null 2>&1
then
echo '%%' > conftest.l
if $pgac_candidate -t conftest.l 2>/dev/null | grep FLEX_SCANNER >/dev/null 2>&1; then
pgac_flex_version=`$pgac_candidate --version 2>/dev/null`
if echo "$pgac_flex_version" | sed ['s/[.a-z]/ /g'] | $AWK '{ if ([$]1 == 2 && ([$]2 > 5 || ([$]2 == 5 && [$]3 >= 35))) exit 0; else exit 1;}'
then
pgac_cv_path_flex=$pgac_candidate
break 2
else
AC_MSG_ERROR([
*** The installed version of Flex, $pgac_candidate, is too old to use with PostgreSQL.
*** Flex version 2.5.35 or later is required, but this is $pgac_flex_version.])
fi
fi
fi
done
done
rm -f conftest.l lex.yy.c
: ${pgac_cv_path_flex=no}
fi
])[]dnl AC_CACHE_CHECK
if test x"$pgac_cv_path_flex" = x"no"; then
[PGAC_PATH_PROGS(FLEX, flex)
if test -z "$FLEX"; then
AC_MSG_ERROR([flex not found])
else
FLEX=$pgac_cv_path_flex
pgac_flex_version=`$FLEX --version 2>/dev/null`
AC_MSG_NOTICE([using $pgac_flex_version])
fi
pgac_flex_version=`$FLEX --version 2>/dev/null`
AC_MSG_NOTICE([using $pgac_flex_version])
AC_SUBST(FLEX)
AC_SUBST(FLEXFLAGS)
])# PGAC_PATH_FLEX
@ -315,3 +274,83 @@ AC_DEFUN([PGAC_CHECK_STRIP],
AC_SUBST(STRIP_STATIC_LIB)
AC_SUBST(STRIP_SHARED_LIB)
])# PGAC_CHECK_STRIP
# PGAC_CHECK_LIBCURL
# ------------------
# Check for required libraries and headers, and test to see whether the current
# installation of libcurl is thread-safe.
AC_DEFUN([PGAC_CHECK_LIBCURL],
[
AC_CHECK_HEADER(curl/curl.h, [],
[AC_MSG_ERROR([header file <curl/curl.h> is required for --with-libcurl])])
AC_CHECK_LIB(curl, curl_multi_init, [
AC_DEFINE([HAVE_LIBCURL], [1], [Define to 1 if you have the `curl' library (-lcurl).])
AC_SUBST(LIBCURL_LDLIBS, -lcurl)
],
[AC_MSG_ERROR([library 'curl' does not provide curl_multi_init])])
pgac_save_CPPFLAGS=$CPPFLAGS
pgac_save_LDFLAGS=$LDFLAGS
pgac_save_LIBS=$LIBS
CPPFLAGS="$LIBCURL_CPPFLAGS $CPPFLAGS"
LDFLAGS="$LIBCURL_LDFLAGS $LDFLAGS"
LIBS="$LIBCURL_LDLIBS $LIBS"
# Check to see whether the current platform supports threadsafe Curl
# initialization.
AC_CACHE_CHECK([for curl_global_init thread safety], [pgac_cv__libcurl_threadsafe_init],
[AC_RUN_IFELSE([AC_LANG_PROGRAM([
#include <curl/curl.h>
],[
curl_version_info_data *info;
if (curl_global_init(CURL_GLOBAL_ALL))
return -1;
info = curl_version_info(CURLVERSION_NOW);
#ifdef CURL_VERSION_THREADSAFE
if (info->features & CURL_VERSION_THREADSAFE)
return 0;
#endif
return 1;
])],
[pgac_cv__libcurl_threadsafe_init=yes],
[pgac_cv__libcurl_threadsafe_init=no],
[pgac_cv__libcurl_threadsafe_init=unknown])])
if test x"$pgac_cv__libcurl_threadsafe_init" = xyes ; then
AC_DEFINE(HAVE_THREADSAFE_CURL_GLOBAL_INIT, 1,
[Define to 1 if curl_global_init() is guaranteed to be thread-safe.])
fi
# Fail if a thread-friendly DNS resolver isn't built.
AC_CACHE_CHECK([for curl support for asynchronous DNS], [pgac_cv__libcurl_async_dns],
[AC_RUN_IFELSE([AC_LANG_PROGRAM([
#include <curl/curl.h>
],[
curl_version_info_data *info;
if (curl_global_init(CURL_GLOBAL_ALL))
return -1;
info = curl_version_info(CURLVERSION_NOW);
return (info->features & CURL_VERSION_ASYNCHDNS) ? 0 : 1;
])],
[pgac_cv__libcurl_async_dns=yes],
[pgac_cv__libcurl_async_dns=no],
[pgac_cv__libcurl_async_dns=unknown])])
if test x"$pgac_cv__libcurl_async_dns" = xno ; then
AC_MSG_ERROR([
*** The installed version of libcurl does not support asynchronous DNS
*** lookups. Rebuild libcurl with the AsynchDNS feature enabled in order
*** to use it with libpq.])
fi
CPPFLAGS=$pgac_save_CPPFLAGS
LDFLAGS=$pgac_save_LDFLAGS
LIBS=$pgac_save_LIBS
])# PGAC_CHECK_LIBCURL

configure vendored

File diff suppressed because it is too large.


@ -17,13 +17,13 @@ dnl Read the Autoconf manual for details.
dnl
m4_pattern_forbid(^PGAC_)dnl to catch undefined macros
AC_INIT([PostgreSQL], [17beta2], [pgsql-bugs@lists.postgresql.org], [], [https://www.postgresql.org/])
AC_INIT([PostgreSQL], [18beta1], [pgsql-bugs@lists.postgresql.org], [], [https://www.postgresql.org/])
m4_if(m4_defn([m4_PACKAGE_VERSION]), [2.69], [], [m4_fatal([Autoconf version 2.69 is required.
Untested combinations of 'autoconf' and PostgreSQL versions are not
recommended. You can remove the check from 'configure.ac' but it is then
your responsibility whether the result works or not.])])
AC_COPYRIGHT([Copyright (c) 1996-2024, PostgreSQL Global Development Group])
AC_COPYRIGHT([Copyright (c) 1996-2025, PostgreSQL Global Development Group])
AC_CONFIG_SRCDIR([src/backend/access/common/heaptuple.c])
AC_CONFIG_AUX_DIR(config)
AC_PREFIX_DEFAULT(/usr/local/pgsql)
@ -186,18 +186,6 @@ PGAC_ARG_BOOL(enable, rpath, yes,
[do not embed shared library search path in executables])
AC_SUBST(enable_rpath)
#
# Spinlocks
#
PGAC_ARG_BOOL(enable, spinlocks, yes,
[do not use spinlocks])
#
# Atomic operations
#
PGAC_ARG_BOOL(enable, atomics, yes,
[do not use atomic operations])
#
# --enable-debug adds -g to compiler flags
#
@ -248,6 +236,8 @@ AC_SUBST(enable_dtrace)
PGAC_ARG_BOOL(enable, tap-tests, no,
[enable TAP tests (requires Perl and IPC::Run)])
AC_SUBST(enable_tap_tests)
AC_ARG_VAR(PG_TEST_EXTRA,
[enable selected extra tests (overridden at runtime by PG_TEST_EXTRA environment variable)])
#
# Injection points
@ -530,6 +520,15 @@ if test "$GCC" = yes -a "$ICC" = no; then
# This was included in -Wall/-Wformat in older GCC versions
PGAC_PROG_CC_CFLAGS_OPT([-Wformat-security])
PGAC_PROG_CXX_CFLAGS_OPT([-Wformat-security])
# gcc 14+, clang for a while
# (Supported in C++ by clang but not gcc. For consistency, omit in C++.)
save_CFLAGS=$CFLAGS
PGAC_PROG_CC_CFLAGS_OPT([-Wmissing-variable-declarations])
PERMIT_MISSING_VARIABLE_DECLARATIONS=
if test x"$save_CFLAGS" != x"$CFLAGS"; then
PERMIT_MISSING_VARIABLE_DECLARATIONS=-Wno-missing-variable-declarations
fi
AC_SUBST(PERMIT_MISSING_VARIABLE_DECLARATIONS)
# Disable strict-aliasing rules; needed for gcc 3.3+
PGAC_PROG_CC_CFLAGS_OPT([-fno-strict-aliasing])
PGAC_PROG_CXX_CFLAGS_OPT([-fno-strict-aliasing])
@ -634,8 +633,12 @@ if test "$with_llvm" = yes ; then
PGAC_PROG_VARCC_VARFLAGS_OPT(CLANG, BITCODE_CFLAGS, [-fexcess-precision=standard])
PGAC_PROG_VARCXX_VARFLAGS_OPT(CLANGXX, BITCODE_CXXFLAGS, [-fexcess-precision=standard])
PGAC_PROG_VARCC_VARFLAGS_OPT(CLANG, BITCODE_CFLAGS, [-Xclang -no-opaque-pointers])
PGAC_PROG_VARCXX_VARFLAGS_OPT(CLANGXX, BITCODE_CXXFLAGS, [-Xclang -no-opaque-pointers])
# Ideally bitcode should perhaps match $CC's use, or not, of outline atomic
# functions, but for now we err on the side of suppressing them in bitcode,
# because we can't assume they're available at runtime. This affects aarch64
# builds using the basic armv8-a ISA without LSE support.
PGAC_PROG_VARCXX_VARFLAGS_OPT(CLANG, BITCODE_CFLAGS, [-mno-outline-atomics])
PGAC_PROG_VARCXX_VARFLAGS_OPT(CLANG, BITCODE_CXXFLAGS, [-mno-outline-atomics])
NOT_THE_CFLAGS=""
PGAC_PROG_VARCC_VARFLAGS_OPT(CLANG, NOT_THE_CFLAGS, [-Wunused-command-line-argument])
@ -690,10 +693,10 @@ if test "$enable_profiling" = yes && test "$ac_cv_prog_cc_g" = yes; then
fi
fi
# On Solaris, we need this #define to get POSIX-conforming versions
# of many interfaces (sigwait, getpwuid_r, ...).
# On Solaris, we need these #defines to get POSIX-conforming versions
# of many interfaces (sigwait, getpwuid_r, shmdt, ...).
if test "$PORTNAME" = "solaris"; then
CPPFLAGS="$CPPFLAGS -D_POSIX_PTHREAD_SEMANTICS"
CPPFLAGS="$CPPFLAGS -D_POSIX_C_SOURCE=200112L -D__EXTENSIONS__ -D_POSIX_PTHREAD_SEMANTICS"
fi
# We already have this in Makefile.win32, but configure needs it too
@ -804,7 +807,6 @@ for dir in $with_includes $SRCH_INC; do
fi
done
IFS=$ac_save_IFS
AC_SUBST(INCLUDES)
#
@ -832,11 +834,7 @@ AC_MSG_RESULT([$with_icu])
AC_SUBST(with_icu)
if test "$with_icu" = yes; then
PKG_CHECK_MODULES(ICU, icu-uc icu-i18n, [],
[AC_MSG_ERROR([ICU library not found
If you have ICU already installed, see config.log for details on the
failure. It is possible the compiler isn't looking in the proper directory.
Use --without-icu to disable ICU support.])])
PKG_CHECK_MODULES(ICU, icu-uc icu-i18n)
fi
#
@ -977,6 +975,18 @@ AC_SUBST(with_readline)
PGAC_ARG_BOOL(with, libedit-preferred, no,
[prefer BSD Libedit over GNU Readline])
#
# liburing
#
AC_MSG_CHECKING([whether to build with liburing support])
PGAC_ARG_BOOL(with, liburing, no, [build with io_uring support, for asynchronous I/O],
[AC_DEFINE([USE_LIBURING], 1, [Define to build with io_uring support. (--with-liburing)])])
AC_MSG_RESULT([$with_liburing])
AC_SUBST(with_liburing)
if test "$with_liburing" = yes; then
PKG_CHECK_MODULES(LIBURING, liburing)
fi
#
# UUID library
@ -1009,6 +1019,62 @@ fi
AC_SUBST(with_uuid)
#
# libcurl
#
AC_MSG_CHECKING([whether to build with libcurl support])
PGAC_ARG_BOOL(with, libcurl, no, [build with libcurl support],
[AC_DEFINE([USE_LIBCURL], 1, [Define to 1 to build with libcurl support. (--with-libcurl)])])
AC_MSG_RESULT([$with_libcurl])
AC_SUBST(with_libcurl)
if test "$with_libcurl" = yes ; then
# Check for libcurl 7.61.0 or higher (corresponding to RHEL8 and the ability
# to explicitly set TLS 1.3 ciphersuites).
PKG_CHECK_MODULES(LIBCURL, [libcurl >= 7.61.0])
# Curl's flags are kept separate from the standard CPPFLAGS/LDFLAGS. We use
# them only for libpq-oauth.
LIBCURL_CPPFLAGS=
LIBCURL_LDFLAGS=
# We only care about -I, -D, and -L switches. Note that -lcurl will be added
# to LIBCURL_LDLIBS by PGAC_CHECK_LIBCURL, below.
for pgac_option in $LIBCURL_CFLAGS; do
case $pgac_option in
-I*|-D*) LIBCURL_CPPFLAGS="$LIBCURL_CPPFLAGS $pgac_option";;
esac
done
for pgac_option in $LIBCURL_LIBS; do
case $pgac_option in
-L*) LIBCURL_LDFLAGS="$LIBCURL_LDFLAGS $pgac_option";;
esac
done
AC_SUBST(LIBCURL_CPPFLAGS)
AC_SUBST(LIBCURL_LDFLAGS)
# OAuth requires python for testing
if test "$with_python" != yes; then
AC_MSG_WARN([*** OAuth support tests require --with-python to run])
fi
fi
#
# libnuma
#
AC_MSG_CHECKING([whether to build with libnuma support])
PGAC_ARG_BOOL(with, libnuma, no, [build with libnuma support],
[AC_DEFINE([USE_LIBNUMA], 1, [Define to build with NUMA support. (--with-libnuma)])])
AC_MSG_RESULT([$with_libnuma])
AC_SUBST(with_libnuma)
if test "$with_libnuma" = yes ; then
AC_CHECK_LIB(numa, numa_available, [], [AC_MSG_ERROR([library 'libnuma' is required for NUMA support])])
PKG_CHECK_MODULES(LIBNUMA, numa)
fi
#
# XML
#
@ -1296,18 +1362,8 @@ failure. It is possible the compiler isn't looking in the proper directory.
Use --without-zlib to disable zlib support.])])
fi
if test "$enable_spinlocks" = yes; then
AC_DEFINE(HAVE_SPINLOCKS, 1, [Define to 1 if you have spinlocks.])
else
AC_MSG_WARN([
*** Not using spinlocks will cause poor performance.])
fi
if test "$enable_atomics" = yes; then
AC_DEFINE(HAVE_ATOMICS, 1, [Define to 1 if you want to use atomics if available.])
else
AC_MSG_WARN([
*** Not using atomic operations will cause poor performance.])
if test "$with_libcurl" = yes ; then
PGAC_CHECK_LIBCURL
fi
if test "$with_gssapi" = yes ; then
@ -1335,30 +1391,17 @@ fi
if test "$with_ssl" = openssl ; then
dnl Order matters!
# Minimum required OpenSSL version is 1.0.2
AC_DEFINE(OPENSSL_API_COMPAT, [0x10002000L],
# Minimum required OpenSSL version is 1.1.1
AC_DEFINE(OPENSSL_API_COMPAT, [0x10101000L],
[Define to the OpenSSL API version in use. This avoids deprecation warnings from newer OpenSSL versions.])
if test "$PORTNAME" != "win32"; then
AC_CHECK_LIB(crypto, CRYPTO_new_ex_data, [], [AC_MSG_ERROR([library 'crypto' is required for OpenSSL])])
AC_CHECK_LIB(ssl, SSL_new, [], [AC_MSG_ERROR([library 'ssl' is required for OpenSSL])])
else
AC_SEARCH_LIBS(CRYPTO_new_ex_data, [eay32 crypto], [], [AC_MSG_ERROR([library 'eay32' or 'crypto' is required for OpenSSL])])
AC_SEARCH_LIBS(SSL_new, [ssleay32 ssl], [], [AC_MSG_ERROR([library 'ssleay32' or 'ssl' is required for OpenSSL])])
fi
AC_CHECK_LIB(crypto, CRYPTO_new_ex_data, [], [AC_MSG_ERROR([library 'crypto' is required for OpenSSL])])
AC_CHECK_LIB(ssl, SSL_new, [], [AC_MSG_ERROR([library 'ssl' is required for OpenSSL])])
# Functions introduced in OpenSSL 1.1.1.
AC_CHECK_FUNCS([SSL_CTX_set_ciphersuites], [], [AC_MSG_ERROR([OpenSSL version >= 1.1.1 is required for SSL support])])
# Function introduced in OpenSSL 1.0.2, not in LibreSSL.
AC_CHECK_FUNCS([SSL_CTX_set_cert_cb])
# Functions introduced in OpenSSL 1.1.0. We used to check for
# OPENSSL_VERSION_NUMBER, but that didn't work with 1.1.0, because LibreSSL
# defines OPENSSL_VERSION_NUMBER to claim version 2.0.0, even though it
# doesn't have these OpenSSL 1.1.0 functions. So check for individual
# functions.
AC_CHECK_FUNCS([OPENSSL_init_ssl BIO_meth_new ASN1_STRING_get0_data HMAC_CTX_new HMAC_CTX_free])
# OpenSSL versions before 1.1.0 required setting callback functions, for
# thread-safety. In 1.1.0, it's no longer required, and CRYPTO_lock()
# function was removed.
AC_CHECK_FUNCS([CRYPTO_lock])
# Function introduced in OpenSSL 1.1.1.
AC_CHECK_FUNCS([X509_get_signature_info])
# Function introduced in OpenSSL 1.1.1, not in LibreSSL.
AC_CHECK_FUNCS([X509_get_signature_info SSL_CTX_set_num_tickets SSL_CTX_set_keylog_callback])
AC_DEFINE([USE_OPENSSL], 1, [Define to 1 to build with OpenSSL support. (--with-ssl=openssl)])
elif test "$with_ssl" != no ; then
AC_MSG_ERROR([--with-ssl must specify openssl])
@ -1456,15 +1499,12 @@ AC_SUBST(UUID_LIBS)
## Header files
##
AC_HEADER_STDBOOL
AC_CHECK_HEADERS(m4_normalize([
atomic.h
copyfile.h
execinfo.h
getopt.h
ifaddrs.h
langinfo.h
mbarrier.h
sys/epoll.h
sys/event.h
@ -1475,6 +1515,7 @@ AC_CHECK_HEADERS(m4_normalize([
sys/ucred.h
termios.h
ucred.h
xlocale.h
]))
if expr x"$pgac_cv_check_readline" : 'x-lreadline' >/dev/null ; then
@ -1618,6 +1659,13 @@ if test "$PORTNAME" = "win32" ; then
AC_CHECK_HEADERS(crtdefs.h)
fi
if test "$with_libcurl" = yes ; then
# Error out early if this platform can't support libpq-oauth.
if test "$ac_cv_header_sys_event_h" != yes -a "$ac_cv_header_sys_epoll_h" != yes; then
AC_MSG_ERROR([client-side OAuth is not supported on this platform])
fi
fi
##
## Types, structures, compiler characteristics
##
@ -1630,6 +1678,7 @@ PGAC_C_STATIC_ASSERT
PGAC_C_TYPEOF
PGAC_C_TYPES_COMPATIBLE
PGAC_C_BUILTIN_CONSTANT_P
PGAC_C_BUILTIN_OP_OVERFLOW
PGAC_C_BUILTIN_UNREACHABLE
PGAC_C_COMPUTED_GOTO
PGAC_STRUCT_TIMEZONE
@ -1637,8 +1686,6 @@ PGAC_UNION_SEMUN
AC_CHECK_TYPES(socklen_t, [], [], [#include <sys/socket.h>])
PGAC_STRUCT_SOCKADDR_SA_LEN
PGAC_TYPE_LOCALE_T
# MSVC doesn't cope well with defining restrict to __restrict, the
# spelling it understands, because it conflicts with
# __declspec(restrict). Therefore we define pg_restrict to the
@ -1719,25 +1766,12 @@ if test "$ac_cv_sizeof_off_t" -lt 8; then
fi
fi
AC_CHECK_SIZEOF([bool], [],
[#ifdef HAVE_STDBOOL_H
#include <stdbool.h>
#endif])
dnl We use <stdbool.h> if we have it and it declares type bool as having
dnl size 1. Otherwise, c.h will fall back to declaring bool as unsigned char.
if test "$ac_cv_header_stdbool_h" = yes -a "$ac_cv_sizeof_bool" = 1; then
AC_DEFINE([PG_USE_STDBOOL], 1,
[Define to 1 to use <stdbool.h> to define type bool.])
fi
##
## Functions, global variables
##
PGAC_VAR_INT_TIMEZONE
PGAC_FUNC_WCSTOMBS_L
# Some versions of libedit contain strlcpy(), setproctitle(), and other
# symbols that that library has no business exposing to the world. Pending
@ -1750,18 +1784,19 @@ AC_CHECK_FUNCS(m4_normalize([
backtrace_symbols
copyfile
copy_file_range
elf_aux_info
getauxval
getifaddrs
getpeerucred
inet_pton
kqueue
localeconv_l
mbstowcs_l
memset_s
posix_fallocate
ppoll
pthread_is_threaded_np
setproctitle
setproctitle_fast
strchrnul
strsignal
syncfs
sync_file_range
@ -1795,12 +1830,15 @@ AC_CHECK_DECLS(posix_fadvise, [], [], [#include <fcntl.h>])
]) # fi
AC_CHECK_DECLS(fdatasync, [], [], [#include <unistd.h>])
AC_CHECK_DECLS([strlcat, strlcpy, strnlen])
AC_CHECK_DECLS([strlcat, strlcpy, strnlen, strsep, timingsafe_bcmp])
# We can't use AC_CHECK_FUNCS to detect these functions, because it
# won't handle deployment target restrictions on macOS
AC_CHECK_DECLS([preadv], [], [], [#include <sys/uio.h>])
AC_CHECK_DECLS([pwritev], [], [], [#include <sys/uio.h>])
AC_CHECK_DECLS([strchrnul], [], [], [#include <string.h>])
AC_CHECK_DECLS([memset_s], [], [], [#define __STDC_WANT_LIB_EXT1__ 1
#include <string.h>])
# This is probably only present on macOS, but may as well check always
AC_CHECK_DECLS(F_FULLFSYNC, [], [], [#include <fcntl.h>])
@ -1814,6 +1852,8 @@ AC_REPLACE_FUNCS(m4_normalize([
strlcat
strlcpy
strnlen
strsep
timingsafe_bcmp
]))
AC_REPLACE_FUNCS(pthread_barrier_wait)
@ -1849,7 +1889,6 @@ fi
# Win32 (really MinGW) support
if test "$PORTNAME" = "win32"; then
AC_CHECK_FUNCS(_configthreadlocale)
AC_LIBOBJ(dirmod)
AC_LIBOBJ(kill)
AC_LIBOBJ(open)
@ -1945,54 +1984,18 @@ for the exact reason.]])],
# Run tests below here
# --------------------
dnl Check to see if we have a working 64-bit integer type.
dnl Since Postgres 8.4, we no longer support compilers without a working
dnl 64-bit type; but we have to determine whether that type is called
dnl "long int" or "long long int".
PGAC_TYPE_64BIT_INT([long int])
if test x"$HAVE_LONG_INT_64" = x"yes" ; then
pg_int64_type="long int"
else
PGAC_TYPE_64BIT_INT([long long int])
if test x"$HAVE_LONG_LONG_INT_64" = x"yes" ; then
pg_int64_type="long long int"
else
AC_MSG_ERROR([Cannot find a working 64-bit integer type.])
fi
fi
AC_DEFINE_UNQUOTED(PG_INT64_TYPE, $pg_int64_type,
[Define to the name of a signed 64-bit integer type.])
# Select the printf length modifier that goes with that, too.
if test x"$pg_int64_type" = x"long long int" ; then
INT64_MODIFIER='"ll"'
else
INT64_MODIFIER='"l"'
fi
AC_DEFINE_UNQUOTED(INT64_MODIFIER, $INT64_MODIFIER,
[Define to the appropriate printf length modifier for 64-bit ints.])
# has to be down here, rather than with the other builtins, because
# the test uses PG_INT64_TYPE.
PGAC_C_BUILTIN_OP_OVERFLOW
# Check size of void *, size_t (enables tweaks for > 32bit address space)
AC_CHECK_SIZEOF([void *])
AC_CHECK_SIZEOF([size_t])
AC_CHECK_SIZEOF([long])
AC_CHECK_SIZEOF([long long])
# Determine memory alignment requirements for the basic C data types.
AC_CHECK_ALIGNOF(short)
AC_CHECK_ALIGNOF(int)
AC_CHECK_ALIGNOF(long)
if test x"$HAVE_LONG_LONG_INT_64" = x"yes" ; then
AC_CHECK_ALIGNOF(long long int)
fi
AC_CHECK_ALIGNOF(int64_t)
AC_CHECK_ALIGNOF(double)
# Compute maximum alignment of any basic type.
@ -2016,17 +2019,11 @@ MAX_ALIGNOF=$ac_cv_alignof_double
if test $ac_cv_alignof_long -gt $MAX_ALIGNOF ; then
AC_MSG_ERROR([alignment of 'long' is greater than the alignment of 'double'])
fi
if test x"$HAVE_LONG_LONG_INT_64" = xyes && test $ac_cv_alignof_long_long_int -gt $MAX_ALIGNOF ; then
AC_MSG_ERROR([alignment of 'long long int' is greater than the alignment of 'double'])
if test $ac_cv_alignof_int64_t -gt $MAX_ALIGNOF ; then
AC_MSG_ERROR([alignment of 'int64_t' is greater than the alignment of 'double'])
fi
AC_DEFINE_UNQUOTED(MAXIMUM_ALIGNOF, $MAX_ALIGNOF, [Define as the maximum alignment requirement of any C data type.])
# Some platforms predefine the types int8, int16, etc. Only check
# a (hopefully) representative subset.
AC_CHECK_TYPES([int8, uint8, int64, uint64], [], [],
[#include <stdio.h>])
# Some compilers offer a 128-bit integer scalar type.
PGAC_TYPE_128BIT_INT
@ -2087,42 +2084,32 @@ fi
# Check for XSAVE intrinsics
#
CFLAGS_XSAVE=""
PGAC_XSAVE_INTRINSICS([])
if test x"$pgac_xsave_intrinsics" != x"yes"; then
PGAC_XSAVE_INTRINSICS([-mxsave])
fi
PGAC_XSAVE_INTRINSICS()
if test x"$pgac_xsave_intrinsics" = x"yes"; then
AC_DEFINE(HAVE_XSAVE_INTRINSICS, 1, [Define to 1 if you have XSAVE intrinsics.])
fi
AC_SUBST(CFLAGS_XSAVE)
# Check for AVX-512 popcount intrinsics
#
CFLAGS_POPCNT=""
PG_POPCNT_OBJS=""
if test x"$host_cpu" = x"x86_64"; then
PGAC_AVX512_POPCNT_INTRINSICS([])
if test x"$pgac_avx512_popcnt_intrinsics" != x"yes"; then
PGAC_AVX512_POPCNT_INTRINSICS([-mavx512vpopcntdq -mavx512bw])
fi
PGAC_AVX512_POPCNT_INTRINSICS()
if test x"$pgac_avx512_popcnt_intrinsics" = x"yes"; then
PG_POPCNT_OBJS="pg_popcount_avx512.o pg_popcount_avx512_choose.o"
AC_DEFINE(USE_AVX512_POPCNT_WITH_RUNTIME_CHECK, 1, [Define to 1 to use AVX-512 popcount instructions with a runtime check.])
fi
fi
AC_SUBST(CFLAGS_POPCNT)
AC_SUBST(PG_POPCNT_OBJS)
# Check for SVE popcount intrinsics
#
if test x"$host_cpu" = x"aarch64"; then
PGAC_SVE_POPCNT_INTRINSICS()
if test x"$pgac_sve_popcnt_intrinsics" = x"yes"; then
AC_DEFINE(USE_SVE_POPCNT_WITH_RUNTIME_CHECK, 1, [Define to 1 to use SVE popcount instructions with a runtime check.])
fi
fi
# Check for Intel SSE 4.2 intrinsics to do CRC calculations.
#
# First check if the _mm_crc32_u8 and _mm_crc32_u64 intrinsics can be used
# with the default compiler flags. If not, check if adding the -msse4.2
# flag helps. CFLAGS_CRC is set to -msse4.2 if that's required.
PGAC_SSE42_CRC32_INTRINSICS([])
if test x"$pgac_sse42_crc32_intrinsics" != x"yes"; then
PGAC_SSE42_CRC32_INTRINSICS([-msse4.2])
fi
PGAC_SSE42_CRC32_INTRINSICS()
# Are we targeting a processor that supports SSE 4.2? gcc, clang and icc all
# define __SSE4_2__ in that case.
@ -2135,11 +2122,15 @@ AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [
# Check for ARMv8 CRC Extension intrinsics to do CRC calculations.
#
# First check if __crc32c* intrinsics can be used with the default compiler
# flags. If not, check if adding -march=armv8-a+crc flag helps.
# flags. If not, check if adding "-march=armv8-a+crc+simd" flag helps.
# On systems using soft-float ABI, "-march=armv8-a+crc" is required instead.
# CFLAGS_CRC is set if the extra flag is required.
PGAC_ARMV8_CRC32C_INTRINSICS([])
if test x"$pgac_armv8_crc32c_intrinsics" != x"yes"; then
PGAC_ARMV8_CRC32C_INTRINSICS([-march=armv8-a+crc])
PGAC_ARMV8_CRC32C_INTRINSICS([-march=armv8-a+crc+simd])
if test x"$pgac_armv8_crc32c_intrinsics" != x"yes"; then
PGAC_ARMV8_CRC32C_INTRINSICS([-march=armv8-a+crc])
fi
fi
# Check for LoongArch CRC intrinsics to do CRC calculations.
@ -2152,17 +2143,26 @@ AC_SUBST(CFLAGS_CRC)
# Select CRC-32C implementation.
#
# If we are targeting a processor that has Intel SSE 4.2 instructions, we can
# use the special CRC instructions for calculating CRC-32C. If we're not
# targeting such a processor, but we can nevertheless produce code that uses
# the SSE intrinsics, perhaps with some extra CFLAGS, compile both
# implementations and select which one to use at runtime, depending on whether
# SSE 4.2 is supported by the processor we're running on.
# There are three methods of calculating CRC, in order of increasing
# performance:
#
# Similarly, if we are targeting an ARM processor that has the CRC
# instructions that are part of the ARMv8 CRC Extension, use them. And if
# we're not targeting such a processor, but can nevertheless produce code that
# uses the CRC instructions, compile both, and select at runtime.
# 1. The fallback using a lookup table, called slicing-by-8
# 2. CRC-32C instructions (found in e.g. Intel SSE 4.2 and ARMv8 CRC Extension)
# 3. Algorithms using carryless multiplication instructions
# (e.g. Intel PCLMUL and Arm PMULL)
#
# If we can produce code (via function attributes or additional compiler
# flags) that uses #2 (and possibly #3), we compile all implementations
# and select which one to use at runtime, depending on what is supported
# by the processor we're running on.
#
# If we are targeting a processor that has #2, we can use that without
# runtime selection.
#
# Note that we do not use __attribute__((target("..."))) for the ARM CRC
# instructions because until clang 16, using the ARM intrinsics still requires
# special -march flags. Perhaps we can re-evaluate this decision after some
# time has passed.
#
# You can skip the runtime check by setting the appropriate USE_*_CRC32 flag to 1
# in the template or configure command line.
@ -2205,7 +2205,7 @@ fi
AC_MSG_CHECKING([which CRC-32C implementation to use])
if test x"$USE_SSE42_CRC32C" = x"1"; then
AC_DEFINE(USE_SSE42_CRC32C, 1, [Define to 1 use Intel SSE 4.2 CRC instructions.])
PG_CRC32C_OBJS="pg_crc32c_sse42.o"
PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
AC_MSG_RESULT(SSE 4.2)
else
if test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
@ -2238,6 +2238,19 @@ else
fi
AC_SUBST(PG_CRC32C_OBJS)
# Check for carryless multiplication intrinsics to do vectorized CRC calculations.
#
if test x"$host_cpu" = x"x86_64"; then
PGAC_AVX512_PCLMUL_INTRINSICS()
fi
AC_MSG_CHECKING([for vectorized CRC-32C])
if test x"$pgac_avx512_pclmul_intrinsics" = x"yes"; then
AC_DEFINE(USE_AVX512_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use AVX-512 CRC algorithms with a runtime check.])
AC_MSG_RESULT(AVX-512 with runtime check)
else
AC_MSG_RESULT(none)
fi
# Select semaphore implementation type.
if test "$PORTNAME" != "win32"; then
@ -2423,7 +2436,13 @@ fi
# For linkers that understand --export-dynamic, add that to the LDFLAGS_EX_BE
# (backend specific ldflags). On some platforms this will always fail (e.g.,
# windows), but on others it depends on the choice of linker (e.g., solaris).
# macOS uses -export_dynamic instead. (On macOS, the option is only
# needed when also using -flto, but we add it anyway since it's
# harmless.)
PGAC_PROG_CC_LD_VARFLAGS_OPT(LDFLAGS_EX_BE, [-Wl,--export-dynamic], $link_test_func)
if test x"$LDFLAGS_EX_BE" = x""; then
PGAC_PROG_CC_LD_VARFLAGS_OPT(LDFLAGS_EX_BE, [-Wl,-export_dynamic], $link_test_func)
fi
AC_SUBST(LDFLAGS_EX_BE)
# Create compiler version string
@ -2520,12 +2539,6 @@ AC_CONFIG_HEADERS([src/include/pg_config.h],
echo >src/include/stamp-h
])
AC_CONFIG_HEADERS([src/include/pg_config_ext.h],
[
# Update timestamp for pg_config_ext.h (see Makefile.global)
echo >src/include/stamp-ext-h
])
AC_CONFIG_HEADERS([src/interfaces/ecpg/include/ecpg_config.h],
[echo >src/interfaces/ecpg/include/stamp-h])

View File

@ -32,6 +32,8 @@ SUBDIRS = \
passwordcheck \
pg_buffercache \
pg_freespacemap \
pg_logicalinspect \
pg_overexplain \
pg_prewarm \
pg_stat_statements \
pg_surgery \

View File

@ -3,14 +3,17 @@
MODULE_big = amcheck
OBJS = \
$(WIN32RES) \
verify_common.o \
verify_gin.o \
verify_heapam.o \
verify_nbtree.o
EXTENSION = amcheck
DATA = amcheck--1.3--1.4.sql amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
PGFILEDESC = "amcheck - function for verifying relation integrity"
REGRESS = check check_btree check_heap
REGRESS = check check_btree check_gin check_heap
EXTRA_INSTALL = contrib/pg_walinspect
TAP_TESTS = 1

View File

@ -0,0 +1,14 @@
/* contrib/amcheck/amcheck--1.4--1.5.sql */
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.5'" to load this file. \quit
-- gin_index_check()
--
CREATE FUNCTION gin_index_check(index regclass)
RETURNS VOID
AS 'MODULE_PATHNAME', 'gin_index_check'
LANGUAGE C STRICT;
REVOKE ALL ON FUNCTION gin_index_check(regclass) FROM PUBLIC;
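A minimal usage sketch (not part of the upgrade script above): the index name gin_check_demo_idx and the role checker_role are illustrative. gin_index_check() takes any GIN index regclass and, per the verify_gin.c comment further down, acquires only AccessShareLock on the index and its table.
-- Hypothetical session after this update script is installed:
ALTER EXTENSION amcheck UPDATE TO '1.5';
-- Optionally re-grant execution to a monitoring role (it is revoked from PUBLIC above):
GRANT EXECUTE ON FUNCTION gin_index_check(regclass) TO checker_role;
SELECT gin_index_check('gin_check_demo_idx');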

View File

@ -1,5 +1,5 @@
# amcheck extension
comment = 'functions for verifying relation integrity'
default_version = '1.4'
default_version = '1.5'
module_pathname = '$libdir/amcheck'
relocatable = true

View File

@ -57,8 +57,8 @@ ERROR: could not open relation with OID 17
BEGIN;
CREATE INDEX bttest_a_brin_idx ON bttest_a USING brin(id);
SELECT bt_index_parent_check('bttest_a_brin_idx');
ERROR: only B-Tree indexes are supported as targets for verification
DETAIL: Relation "bttest_a_brin_idx" is not a B-Tree index.
ERROR: expected "btree" index as targets for verification
DETAIL: Relation "bttest_a_brin_idx" is a brin index.
ROLLBACK;
-- normal check outside of xact
SELECT bt_index_check('bttest_a_idx');

View File

@ -0,0 +1,90 @@
-- Test of index bulk load
SELECT setseed(1);
setseed
---------
(1 row)
CREATE TABLE "gin_check"("Column1" int[]);
-- posting trees (frequently used entries)
INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
-- posting leaves (sparse entries)
INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
SELECT gin_index_check('gin_check_idx');
gin_index_check
-----------------
(1 row)
-- cleanup
DROP TABLE gin_check;
-- Test index inserts
SELECT setseed(1);
setseed
---------
(1 row)
CREATE TABLE "gin_check"("Column1" int[]);
CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
ALTER INDEX gin_check_idx SET (fastupdate = false);
-- posting trees
INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
-- posting leaves
INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
SELECT gin_index_check('gin_check_idx');
gin_index_check
-----------------
(1 row)
-- cleanup
DROP TABLE gin_check;
-- Test GIN over text array
SELECT setseed(1);
setseed
---------
(1 row)
CREATE TABLE "gin_check_text_array"("Column1" text[]);
-- posting trees
INSERT INTO gin_check_text_array select array_agg(sha256(round(random()*300)::text::bytea)::text) from generate_series(1, 100000) as i group by i % 10000;
-- posting leaves
INSERT INTO gin_check_text_array select array_agg(sha256(round(random()*300 + 300)::text::bytea)::text) from generate_series(1, 10000) as i group by i % 100;
CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
SELECT gin_index_check('gin_check_text_array_idx');
gin_index_check
-----------------
(1 row)
-- cleanup
DROP TABLE gin_check_text_array;
-- Test GIN over jsonb
CREATE TABLE "gin_check_jsonb"("j" jsonb);
INSERT INTO gin_check_jsonb values ('{"a":[["b",{"x":1}],["b",{"x":2}]],"c":3}');
INSERT INTO gin_check_jsonb values ('[[14,2,3]]');
INSERT INTO gin_check_jsonb values ('[1,[14,2,3]]');
CREATE INDEX "gin_check_jsonb_idx" on gin_check_jsonb USING GIN("j" jsonb_path_ops);
SELECT gin_index_check('gin_check_jsonb_idx');
gin_index_check
-----------------
(1 row)
-- cleanup
DROP TABLE gin_check_jsonb;
-- Test GIN multicolumn index
CREATE TABLE "gin_check_multicolumn"(a text[], b text[]);
INSERT INTO gin_check_multicolumn (a,b) values ('{a,c,e}','{b,d,f}');
CREATE INDEX "gin_check_multicolumn_idx" on gin_check_multicolumn USING GIN(a,b);
SELECT gin_index_check('gin_check_multicolumn_idx');
gin_index_check
-----------------
(1 row)
-- cleanup
DROP TABLE gin_check_multicolumn;

View File

@ -1,6 +1,8 @@
# Copyright (c) 2022-2024, PostgreSQL Global Development Group
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
amcheck_sources = files(
'verify_common.c',
'verify_gin.c',
'verify_heapam.c',
'verify_nbtree.c',
)
@ -24,6 +26,7 @@ install_data(
'amcheck--1.1--1.2.sql',
'amcheck--1.2--1.3.sql',
'amcheck--1.3--1.4.sql',
'amcheck--1.4--1.5.sql',
kwargs: contrib_data_args,
)
@ -35,6 +38,7 @@ tests += {
'sql': [
'check',
'check_btree',
'check_gin',
'check_heap',
],
},
@ -45,6 +49,7 @@ tests += {
't/003_cic_2pc.pl',
't/004_verify_nbtree_unique.pl',
't/005_pitr.pl',
't/006_verify_gin.pl',
],
},
}

View File

@ -0,0 +1,62 @@
-- Test of index bulk load
SELECT setseed(1);
CREATE TABLE "gin_check"("Column1" int[]);
-- posting trees (frequently used entries)
INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
-- posting leaves (sparse entries)
INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
SELECT gin_index_check('gin_check_idx');
-- cleanup
DROP TABLE gin_check;
-- Test index inserts
SELECT setseed(1);
CREATE TABLE "gin_check"("Column1" int[]);
CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
ALTER INDEX gin_check_idx SET (fastupdate = false);
-- posting trees
INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
-- posting leaves
INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
SELECT gin_index_check('gin_check_idx');
-- cleanup
DROP TABLE gin_check;
-- Test GIN over text array
SELECT setseed(1);
CREATE TABLE "gin_check_text_array"("Column1" text[]);
-- posting trees
INSERT INTO gin_check_text_array select array_agg(sha256(round(random()*300)::text::bytea)::text) from generate_series(1, 100000) as i group by i % 10000;
-- posting leaves
INSERT INTO gin_check_text_array select array_agg(sha256(round(random()*300 + 300)::text::bytea)::text) from generate_series(1, 10000) as i group by i % 100;
CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
SELECT gin_index_check('gin_check_text_array_idx');
-- cleanup
DROP TABLE gin_check_text_array;
-- Test GIN over jsonb
CREATE TABLE "gin_check_jsonb"("j" jsonb);
INSERT INTO gin_check_jsonb values ('{"a":[["b",{"x":1}],["b",{"x":2}]],"c":3}');
INSERT INTO gin_check_jsonb values ('[[14,2,3]]');
INSERT INTO gin_check_jsonb values ('[1,[14,2,3]]');
CREATE INDEX "gin_check_jsonb_idx" on gin_check_jsonb USING GIN("j" jsonb_path_ops);
SELECT gin_index_check('gin_check_jsonb_idx');
-- cleanup
DROP TABLE gin_check_jsonb;
-- Test GIN multicolumn index
CREATE TABLE "gin_check_multicolumn"(a text[], b text[]);
INSERT INTO gin_check_multicolumn (a,b) values ('{a,c,e}','{b,d,f}');
CREATE INDEX "gin_check_multicolumn_idx" on gin_check_multicolumn USING GIN(a,b);
SELECT gin_index_check('gin_check_multicolumn_idx');
-- cleanup
DROP TABLE gin_check_multicolumn;
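Beyond the fixed test tables above, a catalog-driven query can exercise the new check against every valid GIN index in the current database. This is only a sketch, assuming the amcheck extension is installed and the caller is allowed to check all of the underlying tables; gin_index_check() returns void, so the second column is empty when no corruption is found.
SELECT c.oid::regclass AS index_name,
       gin_index_check(c.oid::regclass)
FROM pg_class c
JOIN pg_am am ON am.oid = c.relam
JOIN pg_index i ON i.indexrelid = c.oid
WHERE am.amname = 'gin' AND i.indisvalid;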

View File

@ -1,5 +1,5 @@
# Copyright (c) 2021-2024, PostgreSQL Global Development Group
# Copyright (c) 2021-2025, PostgreSQL Global Development Group
use strict;
use warnings FATAL => 'all';
@ -9,13 +9,13 @@ use PostgreSQL::Test::Utils;
use Test::More;
my ($node, $result);
my $node;
#
# Test set-up
#
$node = PostgreSQL::Test::Cluster->new('test');
$node->init;
$node->init(no_data_checksums => 1);
$node->append_conf('postgresql.conf', 'autovacuum=off');
$node->start;
$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
@ -87,19 +87,6 @@ sub relation_filepath
return "$pgdata/$rel";
}
# Returns the fully qualified name of the toast table for the named relation
sub get_toast_for
{
my ($relname) = @_;
return $node->safe_psql(
'postgres', qq(
SELECT 'pg_toast.' || t.relname
FROM pg_catalog.pg_class c, pg_catalog.pg_class t
WHERE c.relname = '$relname'
AND c.reltoastrelid = t.oid));
}
# (Re)create and populate a test table of the given name.
sub fresh_test_table
{

View File

@ -1,5 +1,5 @@
# Copyright (c) 2021-2024, PostgreSQL Global Development Group
# Copyright (c) 2021-2025, PostgreSQL Global Development Group
# Test CREATE INDEX CONCURRENTLY with concurrent modifications
use strict;
@ -10,7 +10,7 @@ use PostgreSQL::Test::Utils;
use Test::More;
my ($node, $result);
my $node;
#
# Test set-up
@ -21,8 +21,9 @@ $node->append_conf('postgresql.conf',
'lock_timeout = ' . (1000 * $PostgreSQL::Test::Utils::timeout_default));
$node->start;
$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
$node->safe_psql('postgres', q(CREATE TABLE tbl(i int)));
$node->safe_psql('postgres', q(CREATE TABLE tbl(i int, j jsonb)));
$node->safe_psql('postgres', q(CREATE INDEX idx ON tbl(i)));
$node->safe_psql('postgres', q(CREATE INDEX ginidx ON tbl USING gin(j)));
#
# Stress CIC with pgbench.
@ -40,13 +41,13 @@ $node->pgbench(
{
'002_pgbench_concurrent_transaction' => q(
BEGIN;
INSERT INTO tbl VALUES(0);
INSERT INTO tbl VALUES(0, '{"a":[["b",{"x":1}],["b",{"x":2}]],"c":3}');
COMMIT;
),
'002_pgbench_concurrent_transaction_savepoints' => q(
BEGIN;
SAVEPOINT s1;
INSERT INTO tbl VALUES(0);
INSERT INTO tbl VALUES(0, '[[14,2,3]]');
COMMIT;
),
'002_pgbench_concurrent_cic' => q(
@ -54,7 +55,10 @@ $node->pgbench(
\if :gotlock
DROP INDEX CONCURRENTLY idx;
CREATE INDEX CONCURRENTLY idx ON tbl(i);
DROP INDEX CONCURRENTLY ginidx;
CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
SELECT bt_index_check('idx',true);
SELECT gin_index_check('ginidx');
SELECT pg_advisory_unlock(42);
\endif
)

View File

@ -1,5 +1,5 @@
# Copyright (c) 2021-2024, PostgreSQL Global Development Group
# Copyright (c) 2021-2025, PostgreSQL Global Development Group
# Test CREATE INDEX CONCURRENTLY with concurrent prepared-xact modifications
use strict;
@ -25,7 +25,7 @@ $node->append_conf('postgresql.conf',
'lock_timeout = ' . (1000 * $PostgreSQL::Test::Utils::timeout_default));
$node->start;
$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
$node->safe_psql('postgres', q(CREATE TABLE tbl(i int)));
$node->safe_psql('postgres', q(CREATE TABLE tbl(i int, j jsonb)));
#
@ -41,7 +41,7 @@ my $main_h = $node->background_psql('postgres');
$main_h->query_safe(
q(
BEGIN;
INSERT INTO tbl VALUES(0);
INSERT INTO tbl VALUES(0, '[[14,2,3]]');
));
my $cic_h = $node->background_psql('postgres');
@ -50,6 +50,7 @@ $cic_h->query_until(
qr/start/, q(
\echo start
CREATE INDEX CONCURRENTLY idx ON tbl(i);
CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
));
$main_h->query_safe(
@ -60,7 +61,7 @@ PREPARE TRANSACTION 'a';
$main_h->query_safe(
q(
BEGIN;
INSERT INTO tbl VALUES(0);
INSERT INTO tbl VALUES(0, '[[14,2,3]]');
));
$node->safe_psql('postgres', q(COMMIT PREPARED 'a';));
@ -69,7 +70,7 @@ $main_h->query_safe(
q(
PREPARE TRANSACTION 'b';
BEGIN;
INSERT INTO tbl VALUES(0);
INSERT INTO tbl VALUES(0, '"mary had a little lamb"');
));
$node->safe_psql('postgres', q(COMMIT PREPARED 'b';));
@ -86,6 +87,9 @@ $cic_h->quit;
$result = $node->psql('postgres', q(SELECT bt_index_check('idx',true)));
is($result, '0', 'bt_index_check after overlapping 2PC');
$result = $node->psql('postgres', q(SELECT gin_index_check('ginidx')));
is($result, '0', 'gin_index_check after overlapping 2PC');
#
# Server restart shall not change whether prepared xact blocks CIC
@ -94,7 +98,7 @@ is($result, '0', 'bt_index_check after overlapping 2PC');
$node->safe_psql(
'postgres', q(
BEGIN;
INSERT INTO tbl VALUES(0);
INSERT INTO tbl VALUES(0, '{"a":[["b",{"x":1}],["b",{"x":2}]],"c":3}');
PREPARE TRANSACTION 'spans_restart';
BEGIN;
CREATE TABLE unused ();
@ -108,12 +112,16 @@ $reindex_h->query_until(
\echo start
DROP INDEX CONCURRENTLY idx;
CREATE INDEX CONCURRENTLY idx ON tbl(i);
DROP INDEX CONCURRENTLY ginidx;
CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
));
$node->safe_psql('postgres', "COMMIT PREPARED 'spans_restart'");
$reindex_h->quit;
$result = $node->psql('postgres', q(SELECT bt_index_check('idx',true)));
is($result, '0', 'bt_index_check after 2PC and restart');
$result = $node->psql('postgres', q(SELECT gin_index_check('ginidx')));
is($result, '0', 'gin_index_check after 2PC and restart');
#
@ -136,14 +144,14 @@ $node->pgbench(
{
'003_pgbench_concurrent_2pc' => q(
BEGIN;
INSERT INTO tbl VALUES(0);
INSERT INTO tbl VALUES(0,'null');
PREPARE TRANSACTION 'c:client_id';
COMMIT PREPARED 'c:client_id';
),
'003_pgbench_concurrent_2pc_savepoint' => q(
BEGIN;
SAVEPOINT s1;
INSERT INTO tbl VALUES(0);
INSERT INTO tbl VALUES(0,'[false, "jnvaba", -76, 7, {"_": [1]}, 9]');
PREPARE TRANSACTION 'c:client_id';
COMMIT PREPARED 'c:client_id';
),
@ -163,7 +171,25 @@ $node->pgbench(
SELECT bt_index_check('idx',true);
SELECT pg_advisory_unlock(42);
\endif
),
'005_pgbench_concurrent_cic' => q(
SELECT pg_try_advisory_lock(42)::integer AS gotginlock \gset
\if :gotginlock
DROP INDEX CONCURRENTLY ginidx;
CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
SELECT gin_index_check('ginidx');
SELECT pg_advisory_unlock(42);
\endif
),
'006_pgbench_concurrent_ric' => q(
SELECT pg_try_advisory_lock(42)::integer AS gotginlock \gset
\if :gotginlock
REINDEX INDEX CONCURRENTLY ginidx;
SELECT gin_index_check('ginidx');
SELECT pg_advisory_unlock(42);
\endif
)
});
$node->stop;

View File

@ -1,5 +1,5 @@
# Copyright (c) 2023-2024, PostgreSQL Global Development Group
# Copyright (c) 2023-2025, PostgreSQL Global Development Group
# This regression test checks the behavior of the btree validation in the
# presence of breaking sort order changes.

View File

@ -1,4 +1,4 @@
# Copyright (c) 2021-2024, PostgreSQL Global Development Group
# Copyright (c) 2021-2025, PostgreSQL Global Development Group
# Test integrity of intermediate states by PITR to those states
use strict;

View File

@ -0,0 +1,316 @@
# Copyright (c) 2021-2025, PostgreSQL Global Development Group
use strict;
use warnings FATAL => 'all';
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
my $node;
my $blksize;
# to get a split fast, we want tuples to be as large as possible, but at the same time we don't want them to be toasted.
my $filler_size = 1900;
#
# Test set-up
#
$node = PostgreSQL::Test::Cluster->new('test');
$node->init(no_data_checksums => 1);
$node->append_conf('postgresql.conf', 'autovacuum=off');
$node->start;
$blksize = int($node->safe_psql('postgres', 'SHOW block_size;'));
$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
$node->safe_psql(
'postgres', q(
CREATE OR REPLACE FUNCTION random_string( INT ) RETURNS text AS $$
SELECT string_agg(substring('0123456789abcdefghijklmnopqrstuvwxyz', ceil(random() * 36)::integer, 1), '') from generate_series(1, $1);
$$ LANGUAGE SQL;));
# Tests
invalid_entry_order_leaf_page_test();
invalid_entry_order_inner_page_test();
invalid_entry_columns_order_test();
inconsistent_with_parent_key__parent_key_corrupted_test();
inconsistent_with_parent_key__child_key_corrupted_test();
inconsistent_with_parent_key__parent_key_corrupted_posting_tree_test();
sub invalid_entry_order_leaf_page_test
{
my $relname = "test";
my $indexname = "test_gin_idx";
$node->safe_psql(
'postgres', qq(
DROP TABLE IF EXISTS $relname;
CREATE TABLE $relname (a text[]);
INSERT INTO $relname (a) VALUES ('{aaaaa,bbbbb}');
CREATE INDEX $indexname ON $relname USING gin (a);
));
my $relpath = relation_filepath($indexname);
$node->stop;
my $blkno = 1; # root
# produce wrong order by replacing aaaaa with ccccc
string_replace_block(
$relpath,
'aaaaa',
'ccccc',
$blkno
);
$node->start;
my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
my $expected = "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295";
like($stderr, qr/$expected/);
}
sub invalid_entry_order_inner_page_test
{
my $relname = "test";
my $indexname = "test_gin_idx";
# to break the order in the inner page we need at least 3 items (rightmost key in the inner level is not checked for the order)
# so fill the table until we have 2 splits
$node->safe_psql(
'postgres', qq(
DROP TABLE IF EXISTS $relname;
CREATE TABLE $relname (a text[]);
INSERT INTO $relname (a) VALUES (('{' || 'pppppppppp' || random_string($filler_size) ||'}')::text[]);
INSERT INTO $relname (a) VALUES (('{' || 'qqqqqqqqqq' || random_string($filler_size) ||'}')::text[]);
INSERT INTO $relname (a) VALUES (('{' || 'rrrrrrrrrr' || random_string($filler_size) ||'}')::text[]);
INSERT INTO $relname (a) VALUES (('{' || 'ssssssssss' || random_string($filler_size) ||'}')::text[]);
INSERT INTO $relname (a) VALUES (('{' || 'tttttttttt' || random_string($filler_size) ||'}')::text[]);
INSERT INTO $relname (a) VALUES (('{' || 'uuuuuuuuuu' || random_string($filler_size) ||'}')::text[]);
INSERT INTO $relname (a) VALUES (('{' || 'vvvvvvvvvv' || random_string($filler_size) ||'}')::text[]);
INSERT INTO $relname (a) VALUES (('{' || 'wwwwwwwwww' || random_string($filler_size) ||'}')::text[]);
CREATE INDEX $indexname ON $relname USING gin (a);
));
my $relpath = relation_filepath($indexname);
$node->stop;
my $blkno = 1; # root
# we have rrrrrrrrrr... and tttttttttt... as keys in the root, so produce a wrong order by replacing rrrrrrrrrr....
string_replace_block(
$relpath,
'rrrrrrrrrr',
'zzzzzzzzzz',
$blkno
);
$node->start;
my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
my $expected = "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295";
like($stderr, qr/$expected/);
}
sub invalid_entry_columns_order_test
{
my $relname = "test";
my $indexname = "test_gin_idx";
$node->safe_psql(
'postgres', qq(
DROP TABLE IF EXISTS $relname;
CREATE TABLE $relname (a text[],b text[]);
INSERT INTO $relname (a,b) VALUES ('{aaa}','{bbb}');
CREATE INDEX $indexname ON $relname USING gin (a,b);
));
my $relpath = relation_filepath($indexname);
$node->stop;
my $blkno = 1; # root
# mess column numbers
# root items order before: (1,aaa), (2,bbb)
# root items order after: (2,aaa), (1,bbb)
my $attrno_1 = pack('s', 1);
my $attrno_2 = pack('s', 2);
my $find = qr/($attrno_1)(.)(aaa)/s;
my $replace = $attrno_2 . '$2$3';
string_replace_block(
$relpath,
$find,
$replace,
$blkno
);
$find = qr/($attrno_2)(.)(bbb)/s;
$replace = $attrno_1 . '$2$3';
string_replace_block(
$relpath,
$find,
$replace,
$blkno
);
$node->start;
my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
my $expected = "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295";
like($stderr, qr/$expected/);
}
sub inconsistent_with_parent_key__parent_key_corrupted_test
{
my $relname = "test";
my $indexname = "test_gin_idx";
# fill the table until we have a split
$node->safe_psql(
'postgres', qq(
DROP TABLE IF EXISTS $relname;
CREATE TABLE $relname (a text[]);
INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string($filler_size) ||'}')::text[]);
INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string($filler_size) ||'}')::text[]);
INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string($filler_size) ||'}')::text[]);
INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string($filler_size) ||'}')::text[]);
INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string($filler_size) ||'}')::text[]);
CREATE INDEX $indexname ON $relname USING gin (a);
));
my $relpath = relation_filepath($indexname);
$node->stop;
my $blkno = 1; # root
# we have nnnnnnnnnn... as parent key in the root, so replace it with something smaller than the child's keys
string_replace_block(
$relpath,
'nnnnnnnnnn',
'aaaaaaaaaa',
$blkno
);
$node->start;
my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
my $expected = "index \"$indexname\" has inconsistent records on page 3 offset 3";
like($stderr, qr/$expected/);
}
sub inconsistent_with_parent_key__child_key_corrupted_test
{
my $relname = "test";
my $indexname = "test_gin_idx";
# fill the table until we have a split
$node->safe_psql(
'postgres', qq(
DROP TABLE IF EXISTS $relname;
CREATE TABLE $relname (a text[]);
INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string($filler_size) ||'}')::text[]);
INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string($filler_size) ||'}')::text[]);
INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string($filler_size) ||'}')::text[]);
INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string($filler_size) ||'}')::text[]);
INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string($filler_size) ||'}')::text[]);
CREATE INDEX $indexname ON $relname USING gin (a);
));
my $relpath = relation_filepath($indexname);
$node->stop;
my $blkno = 3; # leaf
# we have nnnnnnnnnn... as parent key in the root, so replace child key with something bigger
string_replace_block(
$relpath,
'nnnnnnnnnn',
'pppppppppp',
$blkno
);
$node->start;
my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
my $expected = "index \"$indexname\" has inconsistent records on page 3 offset 3";
like($stderr, qr/$expected/);
}
sub inconsistent_with_parent_key__parent_key_corrupted_posting_tree_test
{
my $relname = "test";
my $indexname = "test_gin_idx";
$node->safe_psql(
'postgres', qq(
DROP TABLE IF EXISTS $relname;
CREATE TABLE $relname (a text[]);
INSERT INTO $relname (a) select ('{aaaaa}') from generate_series(1,10000);
CREATE INDEX $indexname ON $relname USING gin (a);
));
my $relpath = relation_filepath($indexname);
$node->stop;
my $blkno = 2; # posting tree root
# we have a posting tree for the 'aaaaa' key with the root at the 2nd block
# and two leaf pages, 3 and 4. Replace the 4th page's high key with (1,1)
# so that there are TIDs in the leaf page that are larger than the new high key.
my $find = pack('S*', 0, 4, 0) . '....';
my $replace = pack('S*', 0, 4, 0, 1, 1);
string_replace_block(
$relpath,
$find,
$replace,
$blkno
);
$node->start;
my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
my $expected = "index \"$indexname\": tid exceeds parent's high key in postingTree leaf on block 4";
like($stderr, qr/$expected/);
}
# Returns the filesystem path for the named relation.
sub relation_filepath
{
my ($relname) = @_;
my $pgdata = $node->data_dir;
my $rel = $node->safe_psql('postgres',
qq(SELECT pg_relation_filepath('$relname')));
die "path not found for relation $relname" unless defined $rel;
return "$pgdata/$rel";
}
# Substitute the pattern 'find' with 'replace' within block number 'blkno' of the file 'filename'.
sub string_replace_block
{
my ($filename, $find, $replace, $blkno) = @_;
my $fh;
open($fh, '+<', $filename) or BAIL_OUT("open failed: $!");
binmode $fh;
my $offset = $blkno * $blksize;
my $buffer;
sysseek($fh, $offset, 0) or BAIL_OUT("seek failed: $!");
sysread($fh, $buffer, $blksize) or BAIL_OUT("read failed: $!");
$buffer =~ s/$find/'"' . $replace . '"'/gee;
sysseek($fh, $offset, 0) or BAIL_OUT("seek failed: $!");
syswrite($fh, $buffer) or BAIL_OUT("write failed: $!");
close($fh) or BAIL_OUT("close failed: $!");
return;
}
done_testing();
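For reference, the relation_filepath() helper above only wraps a catalog lookup; the same information, together with the block size the test uses to compute byte offsets, is available directly from SQL. A sketch (the index name test_gin_idx is the one the corruption subs above create):
SHOW block_size;
SELECT pg_relation_filepath('test_gin_idx');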

View File

@ -0,0 +1,191 @@
/*-------------------------------------------------------------------------
*
* verify_common.c
* Utility functions common to all access methods.
*
* Copyright (c) 2016-2025, PostgreSQL Global Development Group
*
* IDENTIFICATION
* contrib/amcheck/verify_common.c
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include "access/genam.h"
#include "access/table.h"
#include "access/tableam.h"
#include "verify_common.h"
#include "catalog/index.h"
#include "catalog/pg_am.h"
#include "commands/tablecmds.h"
#include "utils/guc.h"
#include "utils/syscache.h"
static bool amcheck_index_mainfork_expected(Relation rel);
/*
* Check if index relation should have a file for its main relation fork.
* Verification uses this to skip unlogged indexes when in hot standby mode,
* where there is simply nothing to verify.
*
* NB: Caller should call index_checkable() before calling here.
*/
static bool
amcheck_index_mainfork_expected(Relation rel)
{
if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
!RecoveryInProgress())
return true;
ereport(NOTICE,
(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
RelationGetRelationName(rel))));
return false;
}
/*
* Amcheck main workhorse.
* Given an index relation OID, lock the relation.
* Next, take a number of standard actions:
* 1) make sure the index can be checked,
* 2) change the context of the user,
* 3) keep track of GUCs modified via index functions,
* 4) execute the callback function to verify integrity.
*/
void
amcheck_lock_relation_and_check(Oid indrelid,
Oid am_id,
IndexDoCheckCallback check,
LOCKMODE lockmode,
void *state)
{
Oid heapid;
Relation indrel;
Relation heaprel;
Oid save_userid;
int save_sec_context;
int save_nestlevel;
/*
* We must lock table before index to avoid deadlocks. However, if the
* passed indrelid isn't an index then IndexGetRelation() will fail.
* Rather than emitting a not-very-helpful error message, postpone
* complaining, expecting that the is-it-an-index test below will fail.
*
* In hot standby mode this will raise an error when parentcheck is true.
*/
heapid = IndexGetRelation(indrelid, true);
if (OidIsValid(heapid))
{
heaprel = table_open(heapid, lockmode);
/*
* Switch to the table owner's userid, so that any index functions are
* run as that user. Also lock down security-restricted operations
* and arrange to make GUC variable changes local to this command.
*/
GetUserIdAndSecContext(&save_userid, &save_sec_context);
SetUserIdAndSecContext(heaprel->rd_rel->relowner,
save_sec_context | SECURITY_RESTRICTED_OPERATION);
save_nestlevel = NewGUCNestLevel();
}
else
{
heaprel = NULL;
/* Set these just to suppress "uninitialized variable" warnings */
save_userid = InvalidOid;
save_sec_context = -1;
save_nestlevel = -1;
}
/*
* Open the target index relations separately (like relation_openrv(), but
* with heap relation locked first to prevent deadlocking). In hot
* standby mode this will raise an error when parentcheck is true.
*
* There is no need for the usual indcheckxmin usability horizon test
* here, even in the heapallindexed case, because index undergoing
* verification only needs to have entries for a new transaction snapshot.
* (If this is a parentcheck verification, there is no question about
* committed or recently dead heap tuples lacking index entries due to
* concurrent activity.)
*/
indrel = index_open(indrelid, lockmode);
/*
* Since we did the IndexGetRelation call above without any lock, it's
* barely possible that a race against an index drop/recreation could have
* netted us the wrong table.
*/
if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_TABLE),
errmsg("could not open parent table of index \"%s\"",
RelationGetRelationName(indrel))));
/* Check that the relation is suitable for checking */
if (index_checkable(indrel, am_id))
check(indrel, heaprel, state, lockmode == ShareLock);
/* Roll back any GUC changes executed by index functions */
AtEOXact_GUC(false, save_nestlevel);
/* Restore userid and security context */
SetUserIdAndSecContext(save_userid, save_sec_context);
/*
* Release locks early. That's ok here because nothing in the called
* routines will trigger shared cache invalidations to be sent, so we can
* relax the usual pattern of only releasing locks after commit.
*/
index_close(indrel, lockmode);
if (heaprel)
table_close(heaprel, lockmode);
}
/*
* Basic checks about the suitability of a relation for checking as an index.
*
* NB: Intentionally not checking permissions; the function is normally not
* callable by non-superusers. If granted, it's useful to be able to check a
* whole cluster.
*/
bool
index_checkable(Relation rel, Oid am_id)
{
if (rel->rd_rel->relkind != RELKIND_INDEX ||
rel->rd_rel->relam != am_id)
{
HeapTuple amtup;
HeapTuple amtuprel;
amtup = SearchSysCache1(AMOID, ObjectIdGetDatum(am_id));
amtuprel = SearchSysCache1(AMOID, ObjectIdGetDatum(rel->rd_rel->relam));
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("expected \"%s\" index as targets for verification", NameStr(((Form_pg_am) GETSTRUCT(amtup))->amname)),
errdetail("Relation \"%s\" is a %s index.",
RelationGetRelationName(rel), NameStr(((Form_pg_am) GETSTRUCT(amtuprel))->amname))));
}
if (RELATION_IS_OTHER_TEMP(rel))
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot access temporary tables of other sessions"),
errdetail("Index \"%s\" is associated with temporary relation.",
RelationGetRelationName(rel))));
if (!rel->rd_index->indisvalid)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot check index \"%s\"",
RelationGetRelationName(rel)),
errdetail("Index is not valid.")));
return amcheck_index_mainfork_expected(rel);
}

View File

@ -0,0 +1,31 @@
/*-------------------------------------------------------------------------
*
* verify_common.h
* Shared routines for amcheck verifications.
*
* Copyright (c) 2016-2025, PostgreSQL Global Development Group
*
* IDENTIFICATION
* contrib/amcheck/verify_common.h
*
*-------------------------------------------------------------------------
*/
#include "storage/bufpage.h"
#include "storage/lmgr.h"
#include "storage/lockdefs.h"
#include "utils/relcache.h"
#include "miscadmin.h"
/* Typedefs for callback functions for amcheck_lock_relation_and_check */
typedef void (*IndexCheckableCallback) (Relation index);
typedef void (*IndexDoCheckCallback) (Relation rel,
Relation heaprel,
void *state,
bool readonly);
extern void amcheck_lock_relation_and_check(Oid indrelid,
Oid am_id,
IndexDoCheckCallback check,
LOCKMODE lockmode, void *state);
extern bool index_checkable(Relation rel, Oid am_id);

View File

@ -0,0 +1,794 @@
/*-------------------------------------------------------------------------
*
* verify_gin.c
* Verifies the integrity of GIN indexes based on invariants.
*
*
* GIN index verification checks a number of invariants:
*
* - consistency: Paths in GIN graph have to contain consistent keys: tuples
* on parent pages consistently include tuples from children pages.
*
* - graph invariants: Each internal page must have at least one downlink, and
* can reference either only leaf pages or only internal pages.
*
*
* Copyright (c) 2016-2025, PostgreSQL Global Development Group
*
* IDENTIFICATION
* contrib/amcheck/verify_gin.c
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include "access/gin_private.h"
#include "access/nbtree.h"
#include "catalog/pg_am.h"
#include "utils/memutils.h"
#include "utils/rel.h"
#include "verify_common.h"
#include "string.h"
/*
* GinScanItem represents one item of depth-first scan of the index.
*/
typedef struct GinScanItem
{
int depth;
IndexTuple parenttup;
BlockNumber parentblk;
BlockNumber blkno;
struct GinScanItem *next;
} GinScanItem;
/*
* GinPostingTreeScanItem represents one item of a depth-first posting tree scan.
*/
typedef struct GinPostingTreeScanItem
{
int depth;
ItemPointerData parentkey;
BlockNumber parentblk;
BlockNumber blkno;
struct GinPostingTreeScanItem *next;
} GinPostingTreeScanItem;
PG_FUNCTION_INFO_V1(gin_index_check);
static void gin_check_parent_keys_consistency(Relation rel,
Relation heaprel,
void *callback_state, bool readonly);
static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
static IndexTuple gin_refind_parent(Relation rel,
BlockNumber parentblkno,
BlockNumber childblkno,
BufferAccessStrategy strategy);
static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
OffsetNumber offset);
/*
* gin_index_check(index regclass)
*
* Verify integrity of GIN index.
*
* Acquires AccessShareLock on heap & index relations.
*/
Datum
gin_index_check(PG_FUNCTION_ARGS)
{
Oid indrelid = PG_GETARG_OID(0);
amcheck_lock_relation_and_check(indrelid,
GIN_AM_OID,
gin_check_parent_keys_consistency,
AccessShareLock,
NULL);
PG_RETURN_VOID();
}
/*
* Read item pointers from leaf entry tuple.
*
* Returns a palloc'd array of ItemPointers. The number of items is returned
* in *nitems.
*/
static ItemPointer
ginReadTupleWithoutState(IndexTuple itup, int *nitems)
{
Pointer ptr = GinGetPosting(itup);
int nipd = GinGetNPosting(itup);
ItemPointer ipd;
int ndecoded;
if (GinItupIsCompressed(itup))
{
if (nipd > 0)
{
ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
if (nipd != ndecoded)
elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
nipd, ndecoded);
}
else
ipd = palloc(0);
}
else
{
ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
}
*nitems = nipd;
return ipd;
}
/*
* Scans through a posting tree (given by its root), and verifies that the keys
* on child pages are consistent with the parent.
*
* Allocates a separate memory context and scans through posting tree graph.
*/
static void
gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
{
BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
GinPostingTreeScanItem *stack;
MemoryContext mctx;
MemoryContext oldcontext;
int leafdepth;
mctx = AllocSetContextCreate(CurrentMemoryContext,
"posting tree check context",
ALLOCSET_DEFAULT_SIZES);
oldcontext = MemoryContextSwitchTo(mctx);
/*
* We don't know the height of the tree yet, but as soon as we encounter a
* leaf page, we will set 'leafdepth' to its depth.
*/
leafdepth = -1;
/* Start the scan at the root page */
stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
stack->depth = 0;
ItemPointerSetInvalid(&stack->parentkey);
stack->parentblk = InvalidBlockNumber;
stack->blkno = posting_tree_root;
elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
while (stack)
{
GinPostingTreeScanItem *stack_next;
Buffer buffer;
Page page;
OffsetNumber i,
maxoff;
BlockNumber rightlink;
CHECK_FOR_INTERRUPTS();
buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
RBM_NORMAL, strategy);
LockBuffer(buffer, GIN_SHARE);
page = (Page) BufferGetPage(buffer);
Assert(GinPageIsData(page));
/* Check that the tree has the same height in all branches */
if (GinPageIsLeaf(page))
{
ItemPointerData minItem;
int nlist;
ItemPointerData *list;
char tidrange_buf[MAXPGPATH];
ItemPointerSetMin(&minItem);
elog(DEBUG1, "page blk: %u, type leaf", stack->blkno);
if (leafdepth == -1)
leafdepth = stack->depth;
else if (stack->depth != leafdepth)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
RelationGetRelationName(rel), stack->blkno)));
list = GinDataLeafPageGetItems(page, &nlist, minItem);
if (nlist > 0)
snprintf(tidrange_buf, sizeof(tidrange_buf),
"%d tids (%u, %u) - (%u, %u)",
nlist,
ItemPointerGetBlockNumberNoCheck(&list[0]),
ItemPointerGetOffsetNumberNoCheck(&list[0]),
ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
else
snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
if (stack->parentblk != InvalidBlockNumber)
elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
stack->blkno,
stack->parentblk,
ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
tidrange_buf);
else
elog(DEBUG3, "blk %u: root leaf, %s",
stack->blkno,
tidrange_buf);
if (stack->parentblk != InvalidBlockNumber &&
ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
nlist > 0 && ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
RelationGetRelationName(rel), stack->blkno)));
}
else
{
LocationIndex pd_lower;
ItemPointerData bound;
int lowersize;
/*
* Check that tuples in each page are properly ordered and
* consistent with parent high key
*/
maxoff = GinPageGetOpaque(page)->maxoff;
rightlink = GinPageGetOpaque(page)->rightlink;
elog(DEBUG1, "page blk: %u, type data, maxoff %d", stack->blkno, maxoff);
if (stack->parentblk != InvalidBlockNumber)
elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
stack->blkno, maxoff, stack->parentblk,
ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
else
elog(DEBUG3, "blk %u: root internal posting tree page with %u items",
stack->blkno, maxoff);
/*
* A GIN posting tree internal page stores PostingItems in the
* 'lower' part of the page. The 'upper' part is unused. The
* number of elements is stored in the opaque area (maxoff). Make
* sure the size of the 'lower' part agrees with 'maxoff'
*
* We didn't set pd_lower until PostgreSQL version 9.4, so if this
* check fails, it could also be because the index was
* binary-upgraded from an earlier version. That was a long time
* ago, though, so just report an error if it doesn't match.
*/
pd_lower = ((PageHeader) page)->pd_lower;
lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u)",
RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
/*
* Before the PostingItems, there's one ItemPointerData in the
* 'lower' part that stores the page's high key.
*/
bound = *GinDataPageGetRightBound(page);
/*
* A GIN page's right bound has a sane value only when it is not the
* high key of the rightmost page (at a given level). The rightmost
* page does not store its high key explicitly; the value is
* infinity.
*/
if (ItemPointerIsValid(&stack->parentkey) &&
rightlink != InvalidBlockNumber &&
!ItemPointerEquals(&stack->parentkey, &bound))
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
RelationGetRelationName(rel),
ItemPointerGetBlockNumberNoCheck(&bound),
ItemPointerGetOffsetNumberNoCheck(&bound),
stack->blkno, stack->parentblk,
ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
{
GinPostingTreeScanItem *ptr;
PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
/* ItemPointerGetOffsetNumber expects a valid pointer */
if (!(i == maxoff &&
rightlink == InvalidBlockNumber))
elog(DEBUG3, "key (%u, %u) -> %u",
ItemPointerGetBlockNumber(&posting_item->key),
ItemPointerGetOffsetNumber(&posting_item->key),
BlockIdGetBlockNumber(&posting_item->child_blkno));
else
elog(DEBUG3, "key (%u, %u) -> %u",
0, 0, BlockIdGetBlockNumber(&posting_item->child_blkno));
if (i == maxoff && rightlink == InvalidBlockNumber)
{
/*
* The rightmost item in the tree level has (0, 0) as the
* key
*/
if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
RelationGetRelationName(rel),
stack->blkno,
ItemPointerGetBlockNumberNoCheck(&posting_item->key),
ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
}
else if (i != FirstOffsetNumber)
{
PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
RelationGetRelationName(rel), stack->blkno, i)));
}
/*
* Check if this tuple is consistent with the downlink in the
* parent.
*/
if (i == maxoff && ItemPointerIsValid(&stack->parentkey) &&
ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
RelationGetRelationName(rel),
stack->blkno, i)));
/* This is an internal page, recurse into the child. */
ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
ptr->depth = stack->depth + 1;
/*
* The rightmost parent key is always an invalid item pointer.
* Its value is 'Infinity' and not explicitly stored.
*/
ptr->parentkey = posting_item->key;
ptr->parentblk = stack->blkno;
ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
ptr->next = stack->next;
stack->next = ptr;
}
}
LockBuffer(buffer, GIN_UNLOCK);
ReleaseBuffer(buffer);
/* Step to next item in the queue */
stack_next = stack->next;
pfree(stack);
stack = stack_next;
}
MemoryContextSwitchTo(oldcontext);
MemoryContextDelete(mctx);
}
/*
* Main entry point for GIN checks.
*
* Allocates memory context and scans through the whole GIN graph.
*/
static void
gin_check_parent_keys_consistency(Relation rel,
Relation heaprel,
void *callback_state,
bool readonly)
{
BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
GinScanItem *stack;
MemoryContext mctx;
MemoryContext oldcontext;
GinState state;
int leafdepth;
mctx = AllocSetContextCreate(CurrentMemoryContext,
"amcheck consistency check context",
ALLOCSET_DEFAULT_SIZES);
oldcontext = MemoryContextSwitchTo(mctx);
initGinState(&state, rel);
/*
* We don't know the height of the tree yet, but as soon as we encounter a
* leaf page, we will set 'leafdepth' to its depth.
*/
leafdepth = -1;
/* Start the scan at the root page */
stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
stack->depth = 0;
stack->parenttup = NULL;
stack->parentblk = InvalidBlockNumber;
stack->blkno = GIN_ROOT_BLKNO;
while (stack)
{
GinScanItem *stack_next;
Buffer buffer;
Page page;
OffsetNumber i,
maxoff,
prev_attnum;
IndexTuple prev_tuple;
BlockNumber rightlink;
CHECK_FOR_INTERRUPTS();
buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
RBM_NORMAL, strategy);
LockBuffer(buffer, GIN_SHARE);
page = (Page) BufferGetPage(buffer);
maxoff = PageGetMaxOffsetNumber(page);
rightlink = GinPageGetOpaque(page)->rightlink;
/* Do basic sanity checks on the page headers */
check_index_page(rel, buffer, stack->blkno);
elog(DEBUG3, "processing entry tree page at blk %u, maxoff: %u", stack->blkno, maxoff);
/*
* It's possible that the page was split since we looked at the
* parent, in which case we missed the downlink of the right
* sibling when we scanned the parent. If so, add the right
* sibling to the stack now.
*/
if (stack->parenttup != NULL)
{
GinNullCategory parent_key_category;
Datum parent_key = gintuple_get_key(&state,
stack->parenttup,
&parent_key_category);
OffsetNumber parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
ItemId iid = PageGetItemIdCareful(rel, stack->blkno,
page, maxoff);
IndexTuple idxtuple = (IndexTuple) PageGetItem(page, iid);
OffsetNumber page_max_key_attnum = gintuple_get_attrnum(&state, idxtuple);
GinNullCategory page_max_key_category;
Datum page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
if (rightlink != InvalidBlockNumber &&
ginCompareAttEntries(&state, page_max_key_attnum, page_max_key,
page_max_key_category, parent_key_attnum,
parent_key, parent_key_category) < 0)
{
/* split page detected, install right link to the stack */
GinScanItem *ptr;
elog(DEBUG3, "split detected for blk: %u, parent blk: %u", stack->blkno, stack->parentblk);
ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
ptr->depth = stack->depth;
ptr->parenttup = CopyIndexTuple(stack->parenttup);
ptr->parentblk = stack->parentblk;
ptr->blkno = rightlink;
ptr->next = stack->next;
stack->next = ptr;
}
}
/* Check that the tree has the same height in all branches */
if (GinPageIsLeaf(page))
{
if (leafdepth == -1)
leafdepth = stack->depth;
else if (stack->depth != leafdepth)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
RelationGetRelationName(rel), stack->blkno)));
}
/*
* Check that tuples in each page are properly ordered and consistent
* with parent high key
*/
prev_tuple = NULL;
prev_attnum = InvalidAttrNumber;
for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
{
ItemId iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
IndexTuple idxtuple = (IndexTuple) PageGetItem(page, iid);
OffsetNumber current_attnum = gintuple_get_attrnum(&state, idxtuple);
GinNullCategory current_key_category;
Datum current_key;
if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
RelationGetRelationName(rel), stack->blkno, i)));
current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
/*
* Compare the entry to the preceding one.
*
* Don't check for high key on the rightmost inner page, as this
* key is not really stored explicitly.
*
* The entries may be for different attributes, so make sure to
* use ginCompareAttEntries for comparison.
*/
if ((i != FirstOffsetNumber) &&
!(i == maxoff && rightlink == InvalidBlockNumber && !GinPageIsLeaf(page)))
{
Datum prev_key;
GinNullCategory prev_key_category;
prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
if (ginCompareAttEntries(&state, prev_attnum, prev_key,
prev_key_category, current_attnum,
current_key, current_key_category) >= 0)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" has wrong tuple order on entry tree page, block %u, offset %u, rightlink %u",
RelationGetRelationName(rel), stack->blkno, i, rightlink)));
}
/*
* Check if this tuple is consistent with the downlink in the
* parent.
*/
if (stack->parenttup &&
i == maxoff)
{
GinNullCategory parent_key_category;
OffsetNumber parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
Datum parent_key = gintuple_get_key(&state,
stack->parenttup,
&parent_key_category);
if (ginCompareAttEntries(&state, current_attnum, current_key,
current_key_category, parent_key_attnum,
parent_key, parent_key_category) > 0)
{
/*
* There was a discrepancy between parent and child
* tuples. We need to verify that it is not the result of a
* concurrent page split. So, lock the parent and try to
* re-find the downlink for the current page. It may be
* missing due to a concurrent page split; this is OK.
*/
pfree(stack->parenttup);
stack->parenttup = gin_refind_parent(rel, stack->parentblk,
stack->blkno, strategy);
/* If the downlink was re-found, make a final check before failing */
if (!stack->parenttup)
elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
stack->blkno, stack->parentblk);
else
{
parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
parent_key = gintuple_get_key(&state,
stack->parenttup,
&parent_key_category);
/*
* Check whether the refreshed parent key is properly
* adjusted. If so, proceed to the next key.
*/
if (ginCompareAttEntries(&state, current_attnum, current_key,
current_key_category, parent_key_attnum,
parent_key, parent_key_category) > 0)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" has inconsistent records on page %u offset %u",
RelationGetRelationName(rel), stack->blkno, i)));
}
}
}
/* If this is an internal page, recurse into the child */
if (!GinPageIsLeaf(page))
{
GinScanItem *ptr;
ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
ptr->depth = stack->depth + 1;
/* last tuple in layer has no high key */
if (i == maxoff && rightlink == InvalidBlockNumber)
ptr->parenttup = NULL;
else
ptr->parenttup = CopyIndexTuple(idxtuple);
ptr->parentblk = stack->blkno;
ptr->blkno = GinGetDownlink(idxtuple);
ptr->next = stack->next;
stack->next = ptr;
}
/* If this item is a pointer to a posting tree, recurse into it */
else if (GinIsPostingTree(idxtuple))
{
BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
}
else
{
ItemPointer ipd;
int nipd;
ipd = ginReadTupleWithoutState(idxtuple, &nipd);
for (int j = 0; j < nipd; j++)
{
if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
RelationGetRelationName(rel), stack->blkno)));
}
pfree(ipd);
}
prev_tuple = CopyIndexTuple(idxtuple);
prev_attnum = current_attnum;
}
LockBuffer(buffer, GIN_UNLOCK);
ReleaseBuffer(buffer);
/* Step to next item in the queue */
stack_next = stack->next;
if (stack->parenttup)
pfree(stack->parenttup);
pfree(stack);
stack = stack_next;
}
MemoryContextSwitchTo(oldcontext);
MemoryContextDelete(mctx);
}
/*
* Verify that a freshly-read page looks sane.
*/
static void
check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
{
Page page = BufferGetPage(buffer);
/*
* ReadBuffer verifies that every newly-read page passes
* PageHeaderIsValid, which means it either contains a reasonably sane
* page header or is all-zero. We have to defend against the all-zero
* case, however.
*/
if (PageIsNew(page))
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" contains unexpected zero page at block %u",
RelationGetRelationName(rel),
BufferGetBlockNumber(buffer)),
errhint("Please REINDEX it.")));
/*
* Additionally check that the special area looks sane.
*/
if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" contains corrupted page at block %u",
RelationGetRelationName(rel),
BufferGetBlockNumber(buffer)),
errhint("Please REINDEX it.")));
if (GinPageIsDeleted(page))
{
if (!GinPageIsLeaf(page))
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" has deleted internal page %u",
RelationGetRelationName(rel), blockNo)));
if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" has deleted page %u with tuples",
RelationGetRelationName(rel), blockNo)));
}
else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" has page %u with exceeding count of tuples",
RelationGetRelationName(rel), blockNo)));
}
/*
* Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
*
* If found, returns a palloc'd copy of the downlink tuple. Otherwise,
* returns NULL.
*/
static IndexTuple
gin_refind_parent(Relation rel, BlockNumber parentblkno,
BlockNumber childblkno, BufferAccessStrategy strategy)
{
Buffer parentbuf;
Page parentpage;
OffsetNumber o,
parent_maxoff;
IndexTuple result = NULL;
parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
strategy);
LockBuffer(parentbuf, GIN_SHARE);
parentpage = BufferGetPage(parentbuf);
if (GinPageIsLeaf(parentpage))
{
UnlockReleaseBuffer(parentbuf);
return result;
}
parent_maxoff = PageGetMaxOffsetNumber(parentpage);
for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
{
ItemId p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
IndexTuple itup = (IndexTuple) PageGetItem(parentpage, p_iid);
if (GinGetDownlink(itup) == childblkno)
{
/* Found it! Make copy and return it */
result = CopyIndexTuple(itup);
break;
}
}
UnlockReleaseBuffer(parentbuf);
return result;
}
static ItemId
PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
OffsetNumber offset)
{
ItemId itemid = PageGetItemId(page, offset);
if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
BLCKSZ - MAXALIGN(sizeof(GinPageOpaqueData)))
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("line pointer points past end of tuple space in index \"%s\"",
RelationGetRelationName(rel)),
errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
block, offset, ItemIdGetOffset(itemid),
ItemIdGetLength(itemid),
ItemIdGetFlags(itemid))));
/*
* Verify that the line pointer isn't LP_REDIRECT, LP_UNUSED or
* LP_DEAD, since GIN never uses any of those. Verify that the
* line pointer has storage, too.
*/
if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
ItemIdIsDead(itemid) || ItemIdGetLength(itemid) == 0)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("invalid line pointer storage in index \"%s\"",
RelationGetRelationName(rel)),
errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
block, offset, ItemIdGetOffset(itemid),
ItemIdGetLength(itemid),
ItemIdGetFlags(itemid))));
return itemid;
}
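
For orientation, a minimal usage sketch of the checks above. It assumes the GIN verification is exposed at the SQL level by the amcheck extension as gin_index_check() (the function name is an assumption here), with a placeholder index name; corruption is reported via ERROR, as in the code above.

CREATE EXTENSION IF NOT EXISTS amcheck;
SELECT gin_index_check('my_gin_index'::regclass);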

View File

@ -3,7 +3,7 @@
* verify_heapam.c
* Functions to check postgresql heap relations for corruption
*
* Copyright (c) 2016-2024, PostgreSQL Global Development Group
* Copyright (c) 2016-2025, PostgreSQL Global Development Group
*
* contrib/amcheck/verify_heapam.c
*-------------------------------------------------------------------------
@ -12,18 +12,23 @@
#include "access/detoast.h"
#include "access/genam.h"
#include "access/heapam.h"
#include "access/heaptoast.h"
#include "access/multixact.h"
#include "access/relation.h"
#include "access/table.h"
#include "access/toast_internals.h"
#include "access/visibilitymap.h"
#include "access/xact.h"
#include "catalog/pg_am.h"
#include "catalog/pg_class.h"
#include "funcapi.h"
#include "miscadmin.h"
#include "storage/bufmgr.h"
#include "storage/procarray.h"
#include "storage/read_stream.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/rel.h"
PG_FUNCTION_INFO_V1(verify_heapam);
@ -114,7 +119,10 @@ typedef struct HeapCheckContext
Relation valid_toast_index;
int num_toast_indexes;
/* Values for iterating over pages in the relation */
/*
* Values for iterating over pages in the relation. `blkno` is the most
* recent block in the buffer yielded by the read stream API.
*/
BlockNumber blkno;
BufferAccessStrategy bstrategy;
Buffer buffer;
@ -149,7 +157,32 @@ typedef struct HeapCheckContext
Tuplestorestate *tupstore;
} HeapCheckContext;
/*
* The per-relation data provided to the read stream API for heap amcheck to
* use in its callback for the SKIP_PAGES_ALL_FROZEN and
* SKIP_PAGES_ALL_VISIBLE options.
*/
typedef struct HeapCheckReadStreamData
{
/*
* `range` is used by all SkipPages options. SKIP_PAGES_NONE uses the
* default read stream callback, block_range_read_stream_cb(), which takes
* a BlockRangeReadStreamPrivate as its callback_private_data. `range`
* keeps track of the current block number across
* read_stream_next_buffer() invocations.
*/
BlockRangeReadStreamPrivate range;
SkipPages skip_option;
Relation rel;
Buffer *vmbuffer;
} HeapCheckReadStreamData;
/* Internal implementation */
static BlockNumber heapcheck_read_stream_next_unskippable(ReadStream *stream,
void *callback_private_data,
void *per_buffer_data);
static void check_tuple(HeapCheckContext *ctx,
bool *xmin_commit_status_ok,
XidCommitStatus *xmin_commit_status);
@ -227,6 +260,11 @@ verify_heapam(PG_FUNCTION_ARGS)
BlockNumber last_block;
BlockNumber nblocks;
const char *skip;
ReadStream *stream;
int stream_flags;
ReadStreamBlockNumberCB stream_cb;
void *stream_data;
HeapCheckReadStreamData stream_skip_data;
/* Check supplied arguments */
if (PG_ARGISNULL(0))
@ -400,7 +438,46 @@ verify_heapam(PG_FUNCTION_ARGS)
if (TransactionIdIsNormal(ctx.relfrozenxid))
ctx.oldest_xid = ctx.relfrozenxid;
for (ctx.blkno = first_block; ctx.blkno <= last_block; ctx.blkno++)
/* Now that `ctx` is set up, set up the read stream */
stream_skip_data.range.current_blocknum = first_block;
stream_skip_data.range.last_exclusive = last_block + 1;
stream_skip_data.skip_option = skip_option;
stream_skip_data.rel = ctx.rel;
stream_skip_data.vmbuffer = &vmbuffer;
if (skip_option == SKIP_PAGES_NONE)
{
/*
* It is safe to use batchmode as block_range_read_stream_cb takes no
* locks.
*/
stream_cb = block_range_read_stream_cb;
stream_flags = READ_STREAM_SEQUENTIAL |
READ_STREAM_FULL |
READ_STREAM_USE_BATCHING;
stream_data = &stream_skip_data.range;
}
else
{
/*
* It would not be safe to naively use batchmode, as
* heapcheck_read_stream_next_unskippable takes locks. It shouldn't be
* too hard to convert though.
*/
stream_cb = heapcheck_read_stream_next_unskippable;
stream_flags = READ_STREAM_DEFAULT;
stream_data = &stream_skip_data;
}
stream = read_stream_begin_relation(stream_flags,
ctx.bstrategy,
ctx.rel,
MAIN_FORKNUM,
stream_cb,
stream_data,
0);
while ((ctx.buffer = read_stream_next_buffer(stream, NULL)) != InvalidBuffer)
{
OffsetNumber maxoff;
OffsetNumber predecessor[MaxOffsetNumber];
@ -413,30 +490,11 @@ verify_heapam(PG_FUNCTION_ARGS)
memset(predecessor, 0, sizeof(OffsetNumber) * MaxOffsetNumber);
/* Optionally skip over all-frozen or all-visible blocks */
if (skip_option != SKIP_PAGES_NONE)
{
int32 mapbits;
mapbits = (int32) visibilitymap_get_status(ctx.rel, ctx.blkno,
&vmbuffer);
if (skip_option == SKIP_PAGES_ALL_FROZEN)
{
if ((mapbits & VISIBILITYMAP_ALL_FROZEN) != 0)
continue;
}
if (skip_option == SKIP_PAGES_ALL_VISIBLE)
{
if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) != 0)
continue;
}
}
/* Read and lock the next page. */
ctx.buffer = ReadBufferExtended(ctx.rel, MAIN_FORKNUM, ctx.blkno,
RBM_NORMAL, ctx.bstrategy);
/* Lock the next page. */
Assert(BufferIsValid(ctx.buffer));
LockBuffer(ctx.buffer, BUFFER_LOCK_SHARE);
ctx.blkno = BufferGetBlockNumber(ctx.buffer);
ctx.page = BufferGetPage(ctx.buffer);
/* Perform tuple checks */
@ -795,6 +853,8 @@ verify_heapam(PG_FUNCTION_ARGS)
break;
}
read_stream_end(stream);
if (vmbuffer != InvalidBuffer)
ReleaseBuffer(vmbuffer);
@ -811,6 +871,42 @@ verify_heapam(PG_FUNCTION_ARGS)
PG_RETURN_NULL();
}
/*
* Heap amcheck's read stream callback for getting the next unskippable block.
* This callback is only used when 'all-visible' or 'all-frozen' is provided
* as the skip option to verify_heapam(). With the default 'none',
* block_range_read_stream_cb() is used instead.
*/
static BlockNumber
heapcheck_read_stream_next_unskippable(ReadStream *stream,
void *callback_private_data,
void *per_buffer_data)
{
HeapCheckReadStreamData *p = callback_private_data;
/* Loops over [current_blocknum, last_exclusive) blocks */
for (BlockNumber i; (i = p->range.current_blocknum++) < p->range.last_exclusive;)
{
uint8 mapbits = visibilitymap_get_status(p->rel, i, p->vmbuffer);
if (p->skip_option == SKIP_PAGES_ALL_FROZEN)
{
if ((mapbits & VISIBILITYMAP_ALL_FROZEN) != 0)
continue;
}
if (p->skip_option == SKIP_PAGES_ALL_VISIBLE)
{
if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) != 0)
continue;
}
return i;
}
return InvalidBlockNumber;
}
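
As a usage sketch: the skip argument of verify_heapam() is what selects between the two read stream callbacks set up above ('none' takes the block_range_read_stream_cb path, while 'all-frozen' and 'all-visible' go through heapcheck_read_stream_next_unskippable()). The table name below is a placeholder.

SELECT * FROM verify_heapam('my_table'::regclass, skip => 'all-frozen');
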
/*
* Shared internal implementation for report_corruption and
* report_toast_corruption.
@ -1567,11 +1663,11 @@ check_tuple_attribute(HeapCheckContext *ctx)
struct varlena *attr;
char *tp; /* pointer to the tuple data */
uint16 infomask;
Form_pg_attribute thisatt;
CompactAttribute *thisatt;
struct varatt_external toast_pointer;
infomask = ctx->tuphdr->t_infomask;
thisatt = TupleDescAttr(RelationGetDescr(ctx->rel), ctx->attnum);
thisatt = TupleDescCompactAttr(RelationGetDescr(ctx->rel), ctx->attnum);
tp = (char *) ctx->tuphdr + ctx->tuphdr->t_hoff;
@ -1592,7 +1688,7 @@ check_tuple_attribute(HeapCheckContext *ctx)
/* Skip non-varlena values, but update offset first */
if (thisatt->attlen != -1)
{
ctx->offset = att_align_nominal(ctx->offset, thisatt->attalign);
ctx->offset = att_nominal_alignby(ctx->offset, thisatt->attalignby);
ctx->offset = att_addlength_pointer(ctx->offset, thisatt->attlen,
tp + ctx->offset);
if (ctx->tuphdr->t_hoff + ctx->offset > ctx->lp_len)
@ -1608,8 +1704,8 @@ check_tuple_attribute(HeapCheckContext *ctx)
}
/* Ok, we're looking at a varlena attribute. */
ctx->offset = att_align_pointer(ctx->offset, thisatt->attalign, -1,
tp + ctx->offset);
ctx->offset = att_pointer_alignby(ctx->offset, thisatt->attalignby, -1,
tp + ctx->offset);
/* Get the (possibly corrupt) varlena datum */
attdatum = fetchatt(thisatt, tp + ctx->offset);
@ -1763,7 +1859,6 @@ check_tuple_attribute(HeapCheckContext *ctx)
static void
check_toasted_attribute(HeapCheckContext *ctx, ToastedAttribute *ta)
{
SnapshotData SnapshotToast;
ScanKeyData toastkey;
SysScanDesc toastscan;
bool found_toasttup;
@ -1787,10 +1882,9 @@ check_toasted_attribute(HeapCheckContext *ctx, ToastedAttribute *ta)
* Check if any chunks for this toasted object exist in the toast table,
* accessible via the index.
*/
init_toast_snapshot(&SnapshotToast);
toastscan = systable_beginscan_ordered(ctx->toast_rel,
ctx->valid_toast_index,
&SnapshotToast, 1,
get_toast_snapshot(), 1,
&toastkey);
found_toasttup = false;
while ((toasttup =
@ -1875,7 +1969,9 @@ check_tuple(HeapCheckContext *ctx, bool *xmin_commit_status_ok,
/*
* Convert a TransactionId into a FullTransactionId using our cached values of
* the valid transaction ID range. It is the caller's responsibility to have
* already updated the cached values, if necessary.
* already updated the cached values, if necessary. This is akin to
* FullTransactionIdFromAllowableAt(), but it tolerates corruption in the form
* of an xid before epoch 0.
*/
static FullTransactionId
FullTransactionIdFromXidAndCtx(TransactionId xid, const HeapCheckContext *ctx)

View File

@ -14,7 +14,7 @@
* that every visible heap tuple has a matching index tuple.
*
*
* Copyright (c) 2017-2024, PostgreSQL Global Development Group
* Copyright (c) 2017-2025, PostgreSQL Global Development Group
*
* IDENTIFICATION
* contrib/amcheck/verify_nbtree.c
@ -30,21 +30,23 @@
#include "access/tableam.h"
#include "access/transam.h"
#include "access/xact.h"
#include "verify_common.h"
#include "catalog/index.h"
#include "catalog/pg_am.h"
#include "catalog/pg_opfamily_d.h"
#include "commands/tablecmds.h"
#include "common/pg_prng.h"
#include "lib/bloomfilter.h"
#include "miscadmin.h"
#include "storage/lmgr.h"
#include "storage/smgr.h"
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/snapmgr.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "amcheck",
.version = PG_VERSION
);
/*
* A B-Tree cannot possibly have this many levels, since there must be one
@ -158,14 +160,22 @@ typedef struct BtreeLastVisibleEntry
ItemPointer tid; /* Heap tid */
} BtreeLastVisibleEntry;
/*
* arguments for the bt_index_check_callback callback
*/
typedef struct BTCallbackState
{
bool parentcheck;
bool heapallindexed;
bool rootdescend;
bool checkunique;
} BTCallbackState;
PG_FUNCTION_INFO_V1(bt_index_check);
PG_FUNCTION_INFO_V1(bt_index_parent_check);
static void bt_index_check_internal(Oid indrelid, bool parentcheck,
bool heapallindexed, bool rootdescend,
bool checkunique);
static inline void btree_index_checkable(Relation rel);
static inline bool btree_index_mainfork_expected(Relation rel);
static void bt_index_check_callback(Relation indrel, Relation heaprel,
void *state, bool readonly);
static void bt_check_every_level(Relation rel, Relation heaprel,
bool heapkeyspace, bool readonly, bool heapallindexed,
bool rootdescend, bool checkunique);
@ -240,15 +250,21 @@ Datum
bt_index_check(PG_FUNCTION_ARGS)
{
Oid indrelid = PG_GETARG_OID(0);
bool heapallindexed = false;
bool checkunique = false;
BTCallbackState args;
args.heapallindexed = false;
args.rootdescend = false;
args.parentcheck = false;
args.checkunique = false;
if (PG_NARGS() >= 2)
heapallindexed = PG_GETARG_BOOL(1);
if (PG_NARGS() == 3)
checkunique = PG_GETARG_BOOL(2);
args.heapallindexed = PG_GETARG_BOOL(1);
if (PG_NARGS() >= 3)
args.checkunique = PG_GETARG_BOOL(2);
bt_index_check_internal(indrelid, false, heapallindexed, false, checkunique);
amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
bt_index_check_callback,
AccessShareLock, &args);
PG_RETURN_VOID();
}
@ -266,18 +282,23 @@ Datum
bt_index_parent_check(PG_FUNCTION_ARGS)
{
Oid indrelid = PG_GETARG_OID(0);
bool heapallindexed = false;
bool rootdescend = false;
bool checkunique = false;
BTCallbackState args;
args.heapallindexed = false;
args.rootdescend = false;
args.parentcheck = true;
args.checkunique = false;
if (PG_NARGS() >= 2)
heapallindexed = PG_GETARG_BOOL(1);
args.heapallindexed = PG_GETARG_BOOL(1);
if (PG_NARGS() >= 3)
rootdescend = PG_GETARG_BOOL(2);
if (PG_NARGS() == 4)
checkunique = PG_GETARG_BOOL(3);
args.rootdescend = PG_GETARG_BOOL(2);
if (PG_NARGS() >= 4)
args.checkunique = PG_GETARG_BOOL(3);
bt_index_check_internal(indrelid, true, heapallindexed, rootdescend, checkunique);
amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
bt_index_check_callback,
ShareLock, &args);
PG_RETURN_VOID();
}
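
A usage sketch matching the argument handling above (the index name is a placeholder): the optional arguments are heapallindexed and checkunique for bt_index_check(), and heapallindexed, rootdescend and checkunique for bt_index_parent_check().

SELECT bt_index_check('my_btree_index'::regclass, true, true);
SELECT bt_index_parent_check('my_btree_index'::regclass, true, true, true);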
@ -286,193 +307,46 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
* Helper for bt_index_[parent_]check, coordinating the bulk of the work.
*/
static void
bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
bool rootdescend, bool checkunique)
bt_index_check_callback(Relation indrel, Relation heaprel, void *state, bool readonly)
{
Oid heapid;
Relation indrel;
Relation heaprel;
LOCKMODE lockmode;
Oid save_userid;
int save_sec_context;
int save_nestlevel;
BTCallbackState *args = (BTCallbackState *) state;
bool heapkeyspace,
allequalimage;
if (parentcheck)
lockmode = ShareLock;
else
lockmode = AccessShareLock;
/*
* We must lock table before index to avoid deadlocks. However, if the
* passed indrelid isn't an index then IndexGetRelation() will fail.
* Rather than emitting a not-very-helpful error message, postpone
* complaining, expecting that the is-it-an-index test below will fail.
*
* In hot standby mode this will raise an error when parentcheck is true.
*/
heapid = IndexGetRelation(indrelid, true);
if (OidIsValid(heapid))
{
heaprel = table_open(heapid, lockmode);
/*
* Switch to the table owner's userid, so that any index functions are
* run as that user. Also lock down security-restricted operations
* and arrange to make GUC variable changes local to this command.
*/
GetUserIdAndSecContext(&save_userid, &save_sec_context);
SetUserIdAndSecContext(heaprel->rd_rel->relowner,
save_sec_context | SECURITY_RESTRICTED_OPERATION);
save_nestlevel = NewGUCNestLevel();
RestrictSearchPath();
}
else
{
heaprel = NULL;
/* Set these just to suppress "uninitialized variable" warnings */
save_userid = InvalidOid;
save_sec_context = -1;
save_nestlevel = -1;
}
/*
* Open the target index relations separately (like relation_openrv(), but
* with heap relation locked first to prevent deadlocking). In hot
* standby mode this will raise an error when parentcheck is true.
*
* There is no need for the usual indcheckxmin usability horizon test
* here, even in the heapallindexed case, because index undergoing
* verification only needs to have entries for a new transaction snapshot.
* (If this is a parentcheck verification, there is no question about
* committed or recently dead heap tuples lacking index entries due to
* concurrent activity.)
*/
indrel = index_open(indrelid, lockmode);
/*
* Since we did the IndexGetRelation call above without any lock, it's
* barely possible that a race against an index drop/recreation could have
* netted us the wrong table.
*/
if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_TABLE),
errmsg("could not open parent table of index \"%s\"",
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" lacks a main relation fork",
RelationGetRelationName(indrel))));
/* Relation suitable for checking as B-Tree? */
btree_index_checkable(indrel);
if (btree_index_mainfork_expected(indrel))
/* Extract metadata from metapage, and sanitize it in passing */
_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
if (allequalimage && !heapkeyspace)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
RelationGetRelationName(indrel))));
if (allequalimage && !_bt_allequalimage(indrel, false))
{
bool heapkeyspace,
allequalimage;
bool has_interval_ops = false;
if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" lacks a main relation fork",
RelationGetRelationName(indrel))));
/* Extract metadata from metapage, and sanitize it in passing */
_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
if (allequalimage && !heapkeyspace)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
RelationGetRelationName(indrel))));
if (allequalimage && !_bt_allequalimage(indrel, false))
{
bool has_interval_ops = false;
for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
has_interval_ops = true;
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
RelationGetRelationName(indrel)),
has_interval_ops
? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
: 0));
}
/* Check index, possibly against table it is an index on */
bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
heapallindexed, rootdescend, checkunique);
for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
{
has_interval_ops = true;
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
RelationGetRelationName(indrel)),
has_interval_ops
? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
: 0));
}
}
/* Roll back any GUC changes executed by index functions */
AtEOXact_GUC(false, save_nestlevel);
/* Restore userid and security context */
SetUserIdAndSecContext(save_userid, save_sec_context);
/*
* Release locks early. That's ok here because nothing in the called
* routines will trigger shared cache invalidations to be sent, so we can
* relax the usual pattern of only releasing locks after commit.
*/
index_close(indrel, lockmode);
if (heaprel)
table_close(heaprel, lockmode);
}
/*
* Basic checks about the suitability of a relation for checking as a B-Tree
* index.
*
* NB: Intentionally not checking permissions, the function is normally not
* callable by non-superusers. If granted, it's useful to be able to check a
* whole cluster.
*/
static inline void
btree_index_checkable(Relation rel)
{
if (rel->rd_rel->relkind != RELKIND_INDEX ||
rel->rd_rel->relam != BTREE_AM_OID)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("only B-Tree indexes are supported as targets for verification"),
errdetail("Relation \"%s\" is not a B-Tree index.",
RelationGetRelationName(rel))));
if (RELATION_IS_OTHER_TEMP(rel))
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot access temporary tables of other sessions"),
errdetail("Index \"%s\" is associated with temporary relation.",
RelationGetRelationName(rel))));
if (!rel->rd_index->indisvalid)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot check index \"%s\"",
RelationGetRelationName(rel)),
errdetail("Index is not valid.")));
}
/*
* Check if B-Tree index relation should have a file for its main relation
* fork. Verification uses this to skip unlogged indexes when in hot standby
* mode, where there is simply nothing to verify. We behave as if the
* relation is empty.
*
* NB: Caller should call btree_index_checkable() before calling here.
*/
static inline bool
btree_index_mainfork_expected(Relation rel)
{
if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
!RecoveryInProgress())
return true;
ereport(DEBUG1,
(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
RelationGetRelationName(rel))));
return false;
/* Check index, possibly against table it is an index on */
bt_check_every_level(indrel, heaprel, heapkeyspace, readonly,
args->heapallindexed, args->rootdescend, args->checkunique);
}
/*
@ -721,7 +595,7 @@ bt_check_every_level(Relation rel, Relation heaprel, bool heapkeyspace,
RelationGetRelationName(state->heaprel));
table_index_build_scan(state->heaprel, state->rel, indexinfo, true, false,
bt_tuple_present_callback, (void *) state, scan);
bt_tuple_present_callback, state, scan);
ereport(DEBUG1,
(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
@ -1433,6 +1307,13 @@ bt_target_page_check(BtreeCheckState *state)
bool lowersizelimit;
ItemPointer scantid;
/*
* True if we already called bt_entry_unique_check() for the current
* item. This helps to avoid visiting the heap for keys that appear
* only once and thus cannot be part of a unique violation.
*/
bool unique_checked = false;
CHECK_FOR_INTERRUPTS();
itemid = PageGetItemIdCareful(state, state->targetblock,
@ -1592,8 +1473,7 @@ bt_target_page_check(BtreeCheckState *state)
*/
lowersizelimit = skey->heapkeyspace &&
(P_ISLEAF(topaque) || BTreeTupleGetHeapTID(itup) == NULL);
if (tupsize > (lowersizelimit ? BTMaxItemSize(state->target) :
BTMaxItemSizeNoHeapTid(state->target)))
if (tupsize > (lowersizelimit ? BTMaxItemSize : BTMaxItemSizeNoHeapTid))
{
ItemPointer tid = BTreeTupleGetPointsToTID(itup);
char *itid,
@ -1775,12 +1655,18 @@ bt_target_page_check(BtreeCheckState *state)
/*
* If the index is unique verify entries uniqueness by checking the
* heap tuples visibility.
* heap tuples visibility. Immediately check posting tuples and
* tuples with repeated keys. Postpone the check for keys seen for
* the first time.
*/
if (state->checkunique && state->indexinfo->ii_Unique &&
P_ISLEAF(topaque) && !skey->anynullkeys)
P_ISLEAF(topaque) && !skey->anynullkeys &&
(BTreeTupleIsPosting(itup) || ItemPointerIsValid(lVis.tid)))
{
bt_entry_unique_check(state, itup, state->targetblock, offset,
&lVis);
unique_checked = true;
}
if (state->checkunique && state->indexinfo->ii_Unique &&
P_ISLEAF(topaque) && OffsetNumberNext(offset) <= max)
@ -1799,6 +1685,9 @@ bt_target_page_check(BtreeCheckState *state)
* data (whole index tuple or last posting in index tuple). Key
* containing null value does not violate unique constraint and
* treated as different to any other key.
*
* If the next key is the same as the previous one, do the
* bt_entry_unique_check() call if it was postponed.
*/
if (_bt_compare(state->rel, skey, state->target,
OffsetNumberNext(offset)) != 0 || skey->anynullkeys)
@ -1808,6 +1697,11 @@ bt_target_page_check(BtreeCheckState *state)
lVis.postingIndex = -1;
lVis.tid = NULL;
}
else if (!unique_checked)
{
bt_entry_unique_check(state, itup, state->targetblock, offset,
&lVis);
}
skey->scantid = scantid; /* Restore saved scan key state */
}
@ -1890,10 +1784,19 @@ bt_target_page_check(BtreeCheckState *state)
rightkey->scantid = NULL;
/* The first key on the next page is the same */
if (_bt_compare(state->rel, rightkey, state->target, max) == 0 && !rightkey->anynullkeys)
if (_bt_compare(state->rel, rightkey, state->target, max) == 0 &&
!rightkey->anynullkeys)
{
Page rightpage;
/*
* Do the bt_entry_unique_check() call if it was
* postponed.
*/
if (!unique_checked)
bt_entry_unique_check(state, itup, state->targetblock,
offset, &lVis);
elog(DEBUG2, "cross page equal keys");
rightpage = palloc_btree_page(state,
rightblock_number);
@ -2441,7 +2344,7 @@ bt_child_highkey_check(BtreeCheckState *state,
* So, now we traverse to the right of that cousin page and
* current child level page under consideration still belongs
* to the subtree of target's left sibling. Thus, we need to
* match child's high key to it's left uncle page high key.
* match child's high key to its left uncle page high key.
* Thankfully we saved it, it's called a "low key" of target
* page.
*/

View File

@ -2,7 +2,7 @@
*
* auth_delay.c
*
* Copyright (c) 2010-2024, PostgreSQL Global Development Group
* Copyright (c) 2010-2025, PostgreSQL Global Development Group
*
* IDENTIFICATION
* contrib/auth_delay/auth_delay.c
@ -14,11 +14,12 @@
#include <limits.h>
#include "libpq/auth.h"
#include "port.h"
#include "utils/guc.h"
#include "utils/timestamp.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "auth_delay",
.version = PG_VERSION
);
/* GUC Variables */
static int auth_delay_milliseconds = 0;

View File

@ -1,4 +1,4 @@
# Copyright (c) 2022-2024, PostgreSQL Global Development Group
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
auth_delay_sources = files(
'auth_delay.c',

View File

@ -3,7 +3,7 @@
* auto_explain.c
*
*
* Copyright (c) 2008-2024, PostgreSQL Global Development Group
* Copyright (c) 2008-2025, PostgreSQL Global Development Group
*
* IDENTIFICATION
* contrib/auto_explain/auto_explain.c
@ -16,13 +16,16 @@
#include "access/parallel.h"
#include "commands/explain.h"
#include "commands/explain_format.h"
#include "commands/explain_state.h"
#include "common/pg_prng.h"
#include "executor/instrument.h"
#include "jit/jit.h"
#include "nodes/params.h"
#include "utils/guc.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "auto_explain",
.version = PG_VERSION
);
/* GUC variables */
static int auto_explain_log_min_duration = -1; /* msec or -1 */
@ -72,7 +75,7 @@ static bool current_query_sampled = false;
(nesting_level == 0 || auto_explain_log_nested_statements) && \
current_query_sampled)
/* Saved hook values in case of unload */
/* Saved hook values */
static ExecutorStart_hook_type prev_ExecutorStart = NULL;
static ExecutorRun_hook_type prev_ExecutorRun = NULL;
static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
@ -81,7 +84,7 @@ static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
static void explain_ExecutorStart(QueryDesc *queryDesc, int eflags);
static void explain_ExecutorRun(QueryDesc *queryDesc,
ScanDirection direction,
uint64 count, bool execute_once);
uint64 count);
static void explain_ExecutorFinish(QueryDesc *queryDesc);
static void explain_ExecutorEnd(QueryDesc *queryDesc);
@ -95,7 +98,7 @@ _PG_init(void)
/* Define custom GUC variables. */
DefineCustomIntVariable("auto_explain.log_min_duration",
"Sets the minimum execution time above which plans will be logged.",
"Zero prints all plans. -1 turns this feature off.",
"-1 disables logging plans. 0 means log all plans.",
&auto_explain_log_min_duration,
-1,
-1, INT_MAX,
@ -106,8 +109,8 @@ _PG_init(void)
NULL);
DefineCustomIntVariable("auto_explain.log_parameter_max_length",
"Sets the maximum length of query parameters to log.",
"Zero logs no query parameters, -1 logs them in full.",
"Sets the maximum length of query parameter values to log.",
"-1 means log values in full.",
&auto_explain_log_parameter_max_length,
-1,
-1, INT_MAX,
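
A configuration sketch matching the revised GUC descriptions above; auto_explain must already be loaded, for example via session_preload_libraries as in the TAP test further below.

SET auto_explain.log_min_duration = 0;            -- 0 logs all plans, -1 disables logging
SET auto_explain.log_parameter_max_length = -1;   -- -1 logs parameter values in full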
@ -323,15 +326,15 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
*/
static void
explain_ExecutorRun(QueryDesc *queryDesc, ScanDirection direction,
uint64 count, bool execute_once)
uint64 count)
{
nesting_level++;
PG_TRY();
{
if (prev_ExecutorRun)
prev_ExecutorRun(queryDesc, direction, count, execute_once);
prev_ExecutorRun(queryDesc, direction, count);
else
standard_ExecutorRun(queryDesc, direction, count, execute_once);
standard_ExecutorRun(queryDesc, direction, count);
}
PG_FINALLY();
{

View File

@ -1,4 +1,4 @@
# Copyright (c) 2022-2024, PostgreSQL Global Development Group
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
auto_explain_sources = files(
'auto_explain.c',

View File

@ -1,5 +1,5 @@
# Copyright (c) 2021-2024, PostgreSQL Global Development Group
# Copyright (c) 2021-2025, PostgreSQL Global Development Group
use strict;
use warnings FATAL => 'all';
@ -28,7 +28,7 @@ sub query_log
}
my $node = PostgreSQL::Test::Cluster->new('main');
$node->init('auth_extra' => [ '--create-role', 'regress_user1' ]);
$node->init(auth_extra => [ '--create-role' => 'regress_user1' ]);
$node->append_conf('postgresql.conf',
"session_preload_libraries = 'auto_explain'");
$node->append_conf('postgresql.conf', "auto_explain.log_min_duration = 0");
@ -212,4 +212,17 @@ REVOKE SET ON PARAMETER auto_explain.log_format FROM regress_user1;
DROP USER regress_user1;
});
# Test pg_get_loaded_modules() function. This function is particularly
# useful for modules with no SQL presence, such as auto_explain.
my $res = $node->safe_psql(
"postgres", q{
SELECT module_name,
version = current_setting('server_version') as version_ok,
regexp_replace(file_name, '\..*', '') as file_name_stripped
FROM pg_get_loaded_modules()
WHERE module_name = 'auto_explain';
});
like($res, qr/^auto_explain\|t\|auto_explain$/, "pg_get_loaded_modules() ok");
done_testing();

View File

@ -3,7 +3,7 @@
* basebackup_to_shell.c
* target base backup files to a shell command
*
* Copyright (c) 2016-2024, PostgreSQL Global Development Group
* Copyright (c) 2016-2025, PostgreSQL Global Development Group
*
* contrib/basebackup_to_shell/basebackup_to_shell.c
*-------------------------------------------------------------------------
@ -18,7 +18,10 @@
#include "utils/acl.h"
#include "utils/guc.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "basebackup_to_shell",
.version = PG_VERSION
);
typedef struct bbsink_shell
{

View File

@ -1,4 +1,4 @@
# Copyright (c) 2022-2024, PostgreSQL Global Development Group
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
basebackup_to_shell_sources = files(
'basebackup_to_shell.c',

View File

@ -1,4 +1,4 @@
# Copyright (c) 2021-2024, PostgreSQL Global Development Group
# Copyright (c) 2021-2025, PostgreSQL Global Development Group
use strict;
use warnings FATAL => 'all';
@ -24,8 +24,8 @@ my $node = PostgreSQL::Test::Cluster->new('primary');
# Make sure pg_hba.conf is set up to allow connections from backupuser.
# This is only needed on Windows machines that don't use UNIX sockets.
$node->init(
'allows_streaming' => 1,
'auth_extra' => [ '--create-role', 'backupuser' ]);
allows_streaming => 1,
auth_extra => [ '--create-role' => 'backupuser' ]);
$node->append_conf('postgresql.conf',
"shared_preload_libraries = 'basebackup_to_shell'");
@ -37,15 +37,19 @@ $node->safe_psql('postgres', 'CREATE ROLE trustworthy');
# to keep test times reasonable. Using @pg_basebackup_defs as the first
# element of the array passed to IPC::Run interpolates the array (as it is
# not a reference to an array)...
my @pg_basebackup_defs = ('pg_basebackup', '--no-sync', '-cfast');
my @pg_basebackup_defs =
('pg_basebackup', '--no-sync', '--checkpoint' => 'fast');
# This particular test module generally wants to run with -Xfetch, because
# -Xstream is not supported with a backup target, and with -U backupuser.
my @pg_basebackup_cmd = (@pg_basebackup_defs, '-U', 'backupuser', '-Xfetch');
my @pg_basebackup_cmd = (
@pg_basebackup_defs,
'--username' => 'backupuser',
'--wal-method' => 'fetch');
# Can't use this module without setting basebackup_to_shell.command.
$node->command_fails_like(
[ @pg_basebackup_cmd, '--target', 'shell' ],
[ @pg_basebackup_cmd, '--target' => 'shell' ],
qr/shell command for backup is not configured/,
'fails if basebackup_to_shell.command is not set');
@ -64,13 +68,13 @@ $node->reload();
# Should work now.
$node->command_ok(
[ @pg_basebackup_cmd, '--target', 'shell' ],
[ @pg_basebackup_cmd, '--target' => 'shell' ],
'backup with no detail: pg_basebackup');
verify_backup('', $backup_path, "backup with no detail");
# Should fail with a detail.
$node->command_fails_like(
[ @pg_basebackup_cmd, '--target', 'shell:foo' ],
[ @pg_basebackup_cmd, '--target' => 'shell:foo' ],
qr/a target detail is not permitted because the configured command does not include %d/,
'fails if detail provided without %d');
@ -87,19 +91,19 @@ $node->reload();
# Should fail due to lack of permission.
$node->command_fails_like(
[ @pg_basebackup_cmd, '--target', 'shell' ],
[ @pg_basebackup_cmd, '--target' => 'shell' ],
qr/permission denied to use basebackup_to_shell/,
'fails if required_role not granted');
# Should fail due to lack of a detail.
$node->safe_psql('postgres', 'GRANT trustworthy TO backupuser');
$node->command_fails_like(
[ @pg_basebackup_cmd, '--target', 'shell' ],
[ @pg_basebackup_cmd, '--target' => 'shell' ],
qr/a target detail is required because the configured command includes %d/,
'fails if %d is present and detail not given');
# Should work.
$node->command_ok([ @pg_basebackup_cmd, '--target', 'shell:bar' ],
$node->command_ok([ @pg_basebackup_cmd, '--target' => 'shell:bar' ],
'backup with detail: pg_basebackup');
verify_backup('bar.', $backup_path, "backup with detail");
@ -127,15 +131,19 @@ sub verify_backup
# Untar.
my $extract_path = PostgreSQL::Test::Utils::tempdir;
system_or_bail($tar, 'xf', $backup_dir . '/' . $prefix . 'base.tar',
'-C', $extract_path);
system_or_bail(
$tar,
'xf' => $backup_dir . '/' . $prefix . 'base.tar',
'-C' => $extract_path);
# Verify.
$node->command_ok(
[
'pg_verifybackup', '-n',
'-m', "${backup_dir}/${prefix}backup_manifest",
'-e', $extract_path
'pg_verifybackup',
'--no-parse-wal',
'--manifest-path' => "${backup_dir}/${prefix}backup_manifest",
'--exit-on-error',
$extract_path
],
"$test_name: backup verifies ok");
}

View File

@ -17,7 +17,7 @@
* a file is successfully archived and then the system crashes before
* a durable record of the success has been made.
*
* Copyright (c) 2022-2024, PostgreSQL Global Development Group
* Copyright (c) 2022-2025, PostgreSQL Global Development Group
*
* IDENTIFICATION
* contrib/basic_archive/basic_archive.c
@ -36,9 +36,11 @@
#include "storage/copydir.h"
#include "storage/fd.h"
#include "utils/guc.h"
#include "utils/memutils.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "basic_archive",
.version = PG_VERSION
);
static char *archive_directory = NULL;

View File

@ -1,4 +1,4 @@
# Copyright (c) 2022-2024, PostgreSQL Global Development Group
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
basic_archive_sources = files(
'basic_archive.c',

View File

@ -3,7 +3,7 @@
* blcost.c
* Cost estimate function for bloom indexes.
*
* Copyright (c) 2016-2024, PostgreSQL Global Development Group
* Copyright (c) 2016-2025, PostgreSQL Global Development Group
*
* IDENTIFICATION
* contrib/bloom/blcost.c
@ -13,7 +13,6 @@
#include "postgres.h"
#include "bloom.h"
#include "fmgr.h"
#include "utils/selfuncs.h"
/*

View File

@ -3,7 +3,7 @@
* blinsert.c
* Bloom index build and insert functions.
*
* Copyright (c) 2016-2024, PostgreSQL Global Development Group
* Copyright (c) 2016-2025, PostgreSQL Global Development Group
*
* IDENTIFICATION
* contrib/bloom/blinsert.c
@ -16,15 +16,16 @@
#include "access/generic_xlog.h"
#include "access/tableam.h"
#include "bloom.h"
#include "catalog/index.h"
#include "miscadmin.h"
#include "nodes/execnodes.h"
#include "storage/bufmgr.h"
#include "storage/indexfsm.h"
#include "storage/smgr.h"
#include "utils/memutils.h"
#include "utils/rel.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "bloom",
.version = PG_VERSION
);
/*
* State of bloom index build. We accumulate one page data here before
@ -141,7 +142,7 @@ blbuild(Relation heap, Relation index, IndexInfo *indexInfo)
/* Do the heap scan */
reltuples = table_index_build_scan(heap, index, indexInfo, true, true,
bloomBuildCallback, (void *) &buildstate,
bloomBuildCallback, &buildstate,
NULL);
/* Flush last page if needed (it will be, unless heap was empty) */

View File

@ -3,7 +3,7 @@
* bloom.h
* Header for bloom index.
*
* Copyright (c) 2016-2024, PostgreSQL Global Development Group
* Copyright (c) 2016-2025, PostgreSQL Global Development Group
*
* IDENTIFICATION
* contrib/bloom/bloom.h
@ -110,12 +110,9 @@ typedef struct BloomOptions
* FreeBlockNumberArray - array of block numbers sized so that metadata fill
* all space in metapage.
*/
typedef BlockNumber FreeBlockNumberArray[
MAXALIGN_DOWN(
BLCKSZ - SizeOfPageHeaderData - MAXALIGN(sizeof(BloomPageOpaqueData))
- MAXALIGN(sizeof(uint16) * 2 + sizeof(uint32) + sizeof(BloomOptions))
) / sizeof(BlockNumber)
];
typedef BlockNumber FreeBlockNumberArray[MAXALIGN_DOWN(BLCKSZ - SizeOfPageHeaderData - MAXALIGN(sizeof(BloomPageOpaqueData))
- MAXALIGN(sizeof(uint16) * 2 + sizeof(uint32) + sizeof(BloomOptions)))
/ sizeof(BlockNumber)];
/* Metadata of bloom index */
typedef struct BloomMetaPageData

View File

@ -3,7 +3,7 @@
* blscan.c
* Bloom index scan functions.
*
* Copyright (c) 2016-2024, PostgreSQL Global Development Group
* Copyright (c) 2016-2025, PostgreSQL Global Development Group
*
* IDENTIFICATION
* contrib/bloom/blscan.c
@ -17,9 +17,6 @@
#include "miscadmin.h"
#include "pgstat.h"
#include "storage/bufmgr.h"
#include "storage/lmgr.h"
#include "utils/memutils.h"
#include "utils/rel.h"
/*
* Begin scan of bloom index.
@ -55,10 +52,7 @@ blrescan(IndexScanDesc scan, ScanKey scankey, int nscankeys,
so->sign = NULL;
if (scankey && scan->numberOfKeys > 0)
{
memmove(scan->keyData, scankey,
scan->numberOfKeys * sizeof(ScanKeyData));
}
memcpy(scan->keyData, scankey, scan->numberOfKeys * sizeof(ScanKeyData));
}
/*
@ -121,6 +115,9 @@ blgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
*/
bas = GetAccessStrategy(BAS_BULKREAD);
npages = RelationGetNumberOfBlocks(scan->indexRelation);
pgstat_count_index_scan(scan->indexRelation);
if (scan->instrument)
scan->instrument->nsearches++;
for (blkno = BLOOM_HEAD_BLKNO; blkno < npages; blkno++)
{

View File

@ -3,7 +3,7 @@
* blutils.c
* Bloom index utilities.
*
* Portions Copyright (c) 2016-2024, PostgreSQL Global Development Group
* Portions Copyright (c) 2016-2025, PostgreSQL Global Development Group
* Portions Copyright (c) 1990-1993, Regents of the University of California
*
* IDENTIFICATION
@ -17,14 +17,11 @@
#include "access/generic_xlog.h"
#include "access/reloptions.h"
#include "bloom.h"
#include "catalog/index.h"
#include "commands/vacuum.h"
#include "miscadmin.h"
#include "storage/bufmgr.h"
#include "storage/freespace.h"
#include "storage/indexfsm.h"
#include "storage/lmgr.h"
#include "utils/memutils.h"
#include "varatt.h"
/* Signature dealing macros - note i is assumed to be of type int */
#define GETWORD(x,i) ( *( (BloomSignatureWord *)(x) + ( (i) / SIGNWORDBITS ) ) )
@ -112,6 +109,9 @@ blhandler(PG_FUNCTION_ARGS)
amroutine->amoptsprocnum = BLOOM_OPTIONS_PROC;
amroutine->amcanorder = false;
amroutine->amcanorderbyop = false;
amroutine->amcanhash = false;
amroutine->amconsistentequality = false;
amroutine->amconsistentordering = false;
amroutine->amcanbackward = false;
amroutine->amcanunique = false;
amroutine->amcanmulticol = true;
@ -137,6 +137,7 @@ blhandler(PG_FUNCTION_ARGS)
amroutine->amvacuumcleanup = blvacuumcleanup;
amroutine->amcanreturn = NULL;
amroutine->amcostestimate = blcostestimate;
amroutine->amgettreeheight = NULL;
amroutine->amoptions = bloptions;
amroutine->amproperty = NULL;
amroutine->ambuildphasename = NULL;
@ -152,6 +153,8 @@ blhandler(PG_FUNCTION_ARGS)
amroutine->amestimateparallelscan = NULL;
amroutine->aminitparallelscan = NULL;
amroutine->amparallelrescan = NULL;
amroutine->amtranslatestrategy = NULL;
amroutine->amtranslatecmptype = NULL;
PG_RETURN_POINTER(amroutine);
}
@ -201,7 +204,7 @@ initBloomState(BloomState *state, Relation index)
UnlockReleaseBuffer(buffer);
index->rd_amcache = (void *) opts;
index->rd_amcache = opts;
}
memcpy(&state->opts, index->rd_amcache, sizeof(state->opts));

View File

@ -3,7 +3,7 @@
* blvacuum.c
* Bloom VACUUM functions.
*
* Copyright (c) 2016-2024, PostgreSQL Global Development Group
* Copyright (c) 2016-2025, PostgreSQL Global Development Group
*
* IDENTIFICATION
* contrib/bloom/blvacuum.c
@ -14,13 +14,9 @@
#include "access/genam.h"
#include "bloom.h"
#include "catalog/storage.h"
#include "commands/vacuum.h"
#include "miscadmin.h"
#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "storage/indexfsm.h"
#include "storage/lmgr.h"
/*
@ -61,7 +57,7 @@ blbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
*itupPtr,
*itupEnd;
vacuum_delay_point();
vacuum_delay_point(false);
buffer = ReadBufferExtended(index, MAIN_FORKNUM, blkno,
RBM_NORMAL, info->strategy);
@ -191,7 +187,7 @@ blvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
Buffer buffer;
Page page;
vacuum_delay_point();
vacuum_delay_point(false);
buffer = ReadBufferExtended(index, MAIN_FORKNUM, blkno,
RBM_NORMAL, info->strategy);

View File

@ -3,7 +3,7 @@
* blvalidate.c
* Opclass validator for bloom.
*
* Copyright (c) 2016-2024, PostgreSQL Global Development Group
* Copyright (c) 2016-2025, PostgreSQL Global Development Group
*
* IDENTIFICATION
* contrib/bloom/blvalidate.c
@ -18,9 +18,7 @@
#include "catalog/pg_amop.h"
#include "catalog/pg_amproc.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_opfamily.h"
#include "catalog/pg_type.h"
#include "utils/builtins.h"
#include "utils/lsyscache.h"
#include "utils/regproc.h"
#include "utils/syscache.h"
@ -38,8 +36,6 @@ blvalidate(Oid opclassoid)
Oid opcintype;
Oid opckeytype;
char *opclassname;
HeapTuple familytup;
Form_pg_opfamily familyform;
char *opfamilyname;
CatCList *proclist,
*oprlist;
@ -62,12 +58,7 @@ blvalidate(Oid opclassoid)
opclassname = NameStr(classform->opcname);
/* Fetch opfamily information */
familytup = SearchSysCache1(OPFAMILYOID, ObjectIdGetDatum(opfamilyoid));
if (!HeapTupleIsValid(familytup))
elog(ERROR, "cache lookup failed for operator family %u", opfamilyoid);
familyform = (Form_pg_opfamily) GETSTRUCT(familytup);
opfamilyname = NameStr(familyform->opfname);
opfamilyname = get_opfamily_name(opfamilyoid, false);
/* Fetch all operators and support functions of the opfamily */
oprlist = SearchSysCacheList1(AMOPSTRATEGY, ObjectIdGetDatum(opfamilyoid));
@ -126,7 +117,7 @@ blvalidate(Oid opclassoid)
{
ereport(INFO,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("gist opfamily %s contains function %s with wrong signature for support number %d",
errmsg("bloom opfamily %s contains function %s with wrong signature for support number %d",
opfamilyname,
format_procedure(procform->amproc),
procform->amprocnum)));
@ -218,7 +209,6 @@ blvalidate(Oid opclassoid)
ReleaseCatCacheList(proclist);
ReleaseCatCacheList(oprlist);
ReleaseSysCache(familytup);
ReleaseSysCache(classtup);
return result;

View File

@ -1,4 +1,4 @@
# Copyright (c) 2022-2024, PostgreSQL Global Development Group
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
bloom_sources = files(
'blcost.c',

View File

@ -1,5 +1,5 @@
# Copyright (c) 2021-2024, PostgreSQL Global Development Group
# Copyright (c) 2021-2025, PostgreSQL Global Development Group
# Test generic xlog record work for bloom index replication.
use strict;

View File

@ -4,7 +4,10 @@
#include "plperl.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "bool_plperl",
.version = PG_VERSION
);
PG_FUNCTION_INFO_V1(bool_to_plperl);

View File

@ -104,9 +104,9 @@ SELECT spi_test();
DROP EXTENSION plperl CASCADE;
NOTICE: drop cascades to 6 other objects
DETAIL: drop cascades to function spi_test()
drop cascades to extension bool_plperl
DETAIL: drop cascades to extension bool_plperl
drop cascades to function perl2int(integer)
drop cascades to function perl2text(text)
drop cascades to function perl2undef()
drop cascades to function bool2perl(boolean,boolean,boolean)
drop cascades to function spi_test()

View File

@ -104,9 +104,9 @@ SELECT spi_test();
DROP EXTENSION plperlu CASCADE;
NOTICE: drop cascades to 6 other objects
DETAIL: drop cascades to function spi_test()
drop cascades to extension bool_plperlu
DETAIL: drop cascades to extension bool_plperlu
drop cascades to function perl2int(integer)
drop cascades to function perl2text(text)
drop cascades to function perl2undef()
drop cascades to function bool2perl(boolean,boolean,boolean)
drop cascades to function spi_test()

View File

@ -1,4 +1,4 @@
# Copyright (c) 2022-2024, PostgreSQL Global Development Group
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
if not perl_dep.found()
subdir_done()

View File

@ -7,17 +7,17 @@
#include "access/stratnum.h"
#include "utils/builtins.h"
#include "utils/bytea.h"
#include "utils/cash.h"
#include "utils/date.h"
#include "utils/float.h"
#include "utils/inet.h"
#include "utils/numeric.h"
#include "utils/timestamp.h"
#include "utils/uuid.h"
#include "utils/varbit.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "btree_gin",
.version = PG_VERSION
);
typedef struct QueryInfo
{

View File

@ -1,4 +1,4 @@
# Copyright (c) 2022-2024, PostgreSQL Global Development Group
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
btree_gin_sources = files(
'btree_gin.c',

View File

@ -33,12 +33,14 @@ EXTENSION = btree_gist
DATA = btree_gist--1.0--1.1.sql \
btree_gist--1.1--1.2.sql btree_gist--1.2.sql btree_gist--1.2--1.3.sql \
btree_gist--1.3--1.4.sql btree_gist--1.4--1.5.sql \
btree_gist--1.5--1.6.sql btree_gist--1.6--1.7.sql
btree_gist--1.5--1.6.sql btree_gist--1.6--1.7.sql \
btree_gist--1.7--1.8.sql btree_gist--1.8--1.9.sql
PGFILEDESC = "btree_gist - B-tree equivalent GiST operator classes"
REGRESS = init int2 int4 int8 float4 float8 cash oid timestamp timestamptz \
time timetz date interval macaddr macaddr8 inet cidr text varchar char \
bytea bit varbit numeric uuid not_equal enum bool partitions
bytea bit varbit numeric uuid not_equal enum bool partitions \
stratnum without_overlaps
SHLIB_LINK += $(filter -lm, $(LIBS))
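
The new stratnum and without_overlaps regression tests presumably exercise btree_gist's support for temporal keys; a rough sketch of the kind of statement such a test would run (table and column names are hypothetical):

CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE TABLE reservation (
    room int,
    during daterange,
    PRIMARY KEY (room, during WITHOUT OVERLAPS)
);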

View File

@ -5,20 +5,19 @@
#include "btree_gist.h"
#include "btree_utils_var.h"
#include "utils/builtins.h"
#include "utils/bytea.h"
#include "utils/fmgrprotos.h"
#include "utils/sortsupport.h"
#include "utils/varbit.h"
/*
** Bit ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_bit_compress);
PG_FUNCTION_INFO_V1(gbt_bit_union);
PG_FUNCTION_INFO_V1(gbt_bit_picksplit);
PG_FUNCTION_INFO_V1(gbt_bit_consistent);
PG_FUNCTION_INFO_V1(gbt_bit_penalty);
PG_FUNCTION_INFO_V1(gbt_bit_same);
PG_FUNCTION_INFO_V1(gbt_bit_sortsupport);
PG_FUNCTION_INFO_V1(gbt_varbit_sortsupport);
/* define for comparison */
@ -122,7 +121,7 @@ static const gbtree_vinfo tinfo =
/**************************************************
* Bit ops
* GiST support functions
**************************************************/
Datum
@ -137,7 +136,7 @@ Datum
gbt_bit_consistent(PG_FUNCTION_ARGS)
{
GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
void *query = (void *) DatumGetByteaP(PG_GETARG_DATUM(1));
void *query = DatumGetByteaP(PG_GETARG_DATUM(1));
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
/* Oid subtype = PG_GETARG_OID(3); */
@ -162,8 +161,6 @@ gbt_bit_consistent(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(retval);
}
Datum
gbt_bit_union(PG_FUNCTION_ARGS)
{
@ -174,7 +171,6 @@ gbt_bit_union(PG_FUNCTION_ARGS)
&tinfo, fcinfo->flinfo));
}
Datum
gbt_bit_picksplit(PG_FUNCTION_ARGS)
{
@ -197,7 +193,6 @@ gbt_bit_same(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(result);
}
Datum
gbt_bit_penalty(PG_FUNCTION_ARGS)
{
@ -208,3 +203,46 @@ gbt_bit_penalty(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_var_penalty(result, o, n, PG_GET_COLLATION(),
&tinfo, fcinfo->flinfo));
}
static int
gbt_bit_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
GBT_VARKEY *key1 = PG_DETOAST_DATUM(x);
GBT_VARKEY *key2 = PG_DETOAST_DATUM(y);
GBT_VARKEY_R arg1 = gbt_var_key_readable(key1);
GBT_VARKEY_R arg2 = gbt_var_key_readable(key2);
Datum result;
/* for leaf items we expect lower == upper, so only compare lower */
result = DirectFunctionCall2(byteacmp,
PointerGetDatum(arg1.lower),
PointerGetDatum(arg2.lower));
GBT_FREE_IF_COPY(key1, x);
GBT_FREE_IF_COPY(key2, y);
return DatumGetInt32(result);
}
Datum
gbt_bit_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_bit_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}
Datum
gbt_varbit_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_bit_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -5,7 +5,7 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "common/int.h"
#include "utils/sortsupport.h"
typedef struct boolkey
{
@ -13,9 +13,7 @@ typedef struct boolkey
bool upper;
} boolKEY;
/*
** bool ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_bool_compress);
PG_FUNCTION_INFO_V1(gbt_bool_fetch);
PG_FUNCTION_INFO_V1(gbt_bool_union);
@ -23,6 +21,7 @@ PG_FUNCTION_INFO_V1(gbt_bool_picksplit);
PG_FUNCTION_INFO_V1(gbt_bool_consistent);
PG_FUNCTION_INFO_V1(gbt_bool_penalty);
PG_FUNCTION_INFO_V1(gbt_bool_same);
PG_FUNCTION_INFO_V1(gbt_bool_sortsupport);
static bool
gbt_boolgt(const void *a, const void *b, FmgrInfo *flinfo)
@ -83,10 +82,9 @@ static const gbtree_ninfo tinfo =
/**************************************************
* bool ops
* GiST support functions
**************************************************/
Datum
gbt_bool_compress(PG_FUNCTION_ARGS)
{
@ -121,11 +119,10 @@ gbt_bool_consistent(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_BOOL(gbt_num_consistent(&key, (void *) &query, &strategy,
PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
Datum
gbt_bool_union(PG_FUNCTION_ARGS)
{
@ -133,10 +130,9 @@ gbt_bool_union(PG_FUNCTION_ARGS)
void *out = palloc(sizeof(boolKEY));
*(int *) PG_GETARG_POINTER(1) = sizeof(boolKEY);
PG_RETURN_POINTER(gbt_num_union((void *) out, entryvec, &tinfo, fcinfo->flinfo));
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_bool_penalty(PG_FUNCTION_ARGS)
{
@ -167,3 +163,24 @@ gbt_bool_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_bool_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
boolKEY *arg1 = (boolKEY *) DatumGetPointer(x);
boolKEY *arg2 = (boolKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return (int32) arg1->lower - (int32) arg2->lower;
}
Datum
gbt_bool_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_bool_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -5,19 +5,17 @@
#include "btree_gist.h"
#include "btree_utils_var.h"
#include "utils/builtins.h"
#include "utils/bytea.h"
#include "utils/fmgrprotos.h"
#include "utils/sortsupport.h"
/*
** Bytea ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_bytea_compress);
PG_FUNCTION_INFO_V1(gbt_bytea_union);
PG_FUNCTION_INFO_V1(gbt_bytea_picksplit);
PG_FUNCTION_INFO_V1(gbt_bytea_consistent);
PG_FUNCTION_INFO_V1(gbt_bytea_penalty);
PG_FUNCTION_INFO_V1(gbt_bytea_same);
PG_FUNCTION_INFO_V1(gbt_bytea_sortsupport);
/* define for comparison */
@ -70,7 +68,6 @@ gbt_byteacmp(const void *a, const void *b, Oid collation, FmgrInfo *flinfo)
PointerGetDatum(b)));
}
static const gbtree_vinfo tinfo =
{
gbt_t_bytea,
@ -87,10 +84,9 @@ static const gbtree_vinfo tinfo =
/**************************************************
* Text ops
* GiST support functions
**************************************************/
Datum
gbt_bytea_compress(PG_FUNCTION_ARGS)
{
@ -99,13 +95,11 @@ gbt_bytea_compress(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_var_compress(entry, &tinfo));
}
Datum
gbt_bytea_consistent(PG_FUNCTION_ARGS)
{
GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
void *query = (void *) DatumGetByteaP(PG_GETARG_DATUM(1));
void *query = DatumGetByteaP(PG_GETARG_DATUM(1));
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
/* Oid subtype = PG_GETARG_OID(3); */
@ -122,8 +116,6 @@ gbt_bytea_consistent(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(retval);
}
Datum
gbt_bytea_union(PG_FUNCTION_ARGS)
{
@ -134,7 +126,6 @@ gbt_bytea_union(PG_FUNCTION_ARGS)
&tinfo, fcinfo->flinfo));
}
Datum
gbt_bytea_picksplit(PG_FUNCTION_ARGS)
{
@ -157,7 +148,6 @@ gbt_bytea_same(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(result);
}
Datum
gbt_bytea_penalty(PG_FUNCTION_ARGS)
{
@ -168,3 +158,35 @@ gbt_bytea_penalty(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_var_penalty(result, o, n, PG_GET_COLLATION(),
&tinfo, fcinfo->flinfo));
}
static int
gbt_bytea_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
GBT_VARKEY *key1 = PG_DETOAST_DATUM(x);
GBT_VARKEY *key2 = PG_DETOAST_DATUM(y);
GBT_VARKEY_R xkey = gbt_var_key_readable(key1);
GBT_VARKEY_R ykey = gbt_var_key_readable(key2);
Datum result;
/* for leaf items we expect lower == upper, so only compare lower */
result = DirectFunctionCall2(byteacmp,
PointerGetDatum(xkey.lower),
PointerGetDatum(ykey.lower));
GBT_FREE_IF_COPY(key1, x);
GBT_FREE_IF_COPY(key2, y);
return DatumGetInt32(result);
}
Datum
gbt_bytea_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_bytea_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -7,6 +7,7 @@
#include "btree_utils_num.h"
#include "common/int.h"
#include "utils/cash.h"
#include "utils/sortsupport.h"
typedef struct
{
@ -14,9 +15,7 @@ typedef struct
Cash upper;
} cashKEY;
/*
** Cash ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_cash_compress);
PG_FUNCTION_INFO_V1(gbt_cash_fetch);
PG_FUNCTION_INFO_V1(gbt_cash_union);
@ -25,6 +24,7 @@ PG_FUNCTION_INFO_V1(gbt_cash_consistent);
PG_FUNCTION_INFO_V1(gbt_cash_distance);
PG_FUNCTION_INFO_V1(gbt_cash_penalty);
PG_FUNCTION_INFO_V1(gbt_cash_same);
PG_FUNCTION_INFO_V1(gbt_cash_sortsupport);
static bool
gbt_cashgt(const void *a, const void *b, FmgrInfo *flinfo)
@ -111,10 +111,10 @@ cash_dist(PG_FUNCTION_ARGS)
PG_RETURN_CASH(ra);
}
/**************************************************
* Cash ops
**************************************************/
/**************************************************
* GiST support functions
**************************************************/
Datum
gbt_cash_compress(PG_FUNCTION_ARGS)
@ -150,12 +150,11 @@ gbt_cash_consistent(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_BOOL(gbt_num_consistent(&key, (void *) &query, &strategy,
PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
GIST_LEAF(entry), &tinfo,
fcinfo->flinfo));
}
Datum
gbt_cash_distance(PG_FUNCTION_ARGS)
{
@ -169,11 +168,10 @@ gbt_cash_distance(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_FLOAT8(gbt_num_distance(&key, (void *) &query, GIST_LEAF(entry),
PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
&tinfo, fcinfo->flinfo));
}
Datum
gbt_cash_union(PG_FUNCTION_ARGS)
{
@ -181,10 +179,9 @@ gbt_cash_union(PG_FUNCTION_ARGS)
void *out = palloc(sizeof(cashKEY));
*(int *) PG_GETARG_POINTER(1) = sizeof(cashKEY);
PG_RETURN_POINTER(gbt_num_union((void *) out, entryvec, &tinfo, fcinfo->flinfo));
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_cash_penalty(PG_FUNCTION_ARGS)
{
@ -215,3 +212,29 @@ gbt_cash_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_cash_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
cashKEY *arg1 = (cashKEY *) DatumGetPointer(x);
cashKEY *arg2 = (cashKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
if (arg1->lower > arg2->lower)
return 1;
else if (arg1->lower < arg2->lower)
return -1;
else
return 0;
}
Datum
gbt_cash_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_cash_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -5,8 +5,9 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "utils/builtins.h"
#include "utils/fmgrprotos.h"
#include "utils/date.h"
#include "utils/sortsupport.h"
typedef struct
{
@ -14,9 +15,7 @@ typedef struct
DateADT upper;
} dateKEY;
/*
** date ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_date_compress);
PG_FUNCTION_INFO_V1(gbt_date_fetch);
PG_FUNCTION_INFO_V1(gbt_date_union);
@ -25,6 +24,7 @@ PG_FUNCTION_INFO_V1(gbt_date_consistent);
PG_FUNCTION_INFO_V1(gbt_date_distance);
PG_FUNCTION_INFO_V1(gbt_date_penalty);
PG_FUNCTION_INFO_V1(gbt_date_same);
PG_FUNCTION_INFO_V1(gbt_date_sortsupport);
static bool
gbt_dategt(const void *a, const void *b, FmgrInfo *flinfo)
@ -128,11 +128,9 @@ date_dist(PG_FUNCTION_ARGS)
/**************************************************
* date ops
* GiST support functions
**************************************************/
Datum
gbt_date_compress(PG_FUNCTION_ARGS)
{
@ -167,12 +165,11 @@ gbt_date_consistent(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_BOOL(gbt_num_consistent(&key, (void *) &query, &strategy,
PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
GIST_LEAF(entry), &tinfo,
fcinfo->flinfo));
}
Datum
gbt_date_distance(PG_FUNCTION_ARGS)
{
@ -186,11 +183,10 @@ gbt_date_distance(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_FLOAT8(gbt_num_distance(&key, (void *) &query, GIST_LEAF(entry),
PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
&tinfo, fcinfo->flinfo));
}
Datum
gbt_date_union(PG_FUNCTION_ARGS)
{
@ -198,10 +194,9 @@ gbt_date_union(PG_FUNCTION_ARGS)
void *out = palloc(sizeof(dateKEY));
*(int *) PG_GETARG_POINTER(1) = sizeof(dateKEY);
PG_RETURN_POINTER(gbt_num_union((void *) out, entryvec, &tinfo, fcinfo->flinfo));
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_date_penalty(PG_FUNCTION_ARGS)
{
@ -238,7 +233,6 @@ gbt_date_penalty(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(result);
}
Datum
gbt_date_picksplit(PG_FUNCTION_ARGS)
{
@ -257,3 +251,26 @@ gbt_date_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_date_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
dateKEY *akey = (dateKEY *) DatumGetPointer(x);
dateKEY *bkey = (dateKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return DatumGetInt32(DirectFunctionCall2(date_cmp,
DateADTGetDatum(akey->lower),
DateADTGetDatum(bkey->lower)));
}
Datum
gbt_date_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_date_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -6,7 +6,9 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "fmgr.h"
#include "utils/builtins.h"
#include "utils/fmgrprotos.h"
#include "utils/fmgroids.h"
#include "utils/sortsupport.h"
/* enums are really Oids, so we just use the same structure */
@ -16,9 +18,7 @@ typedef struct
Oid upper;
} oidKEY;
/*
** enum ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_enum_compress);
PG_FUNCTION_INFO_V1(gbt_enum_fetch);
PG_FUNCTION_INFO_V1(gbt_enum_union);
@ -26,6 +26,7 @@ PG_FUNCTION_INFO_V1(gbt_enum_picksplit);
PG_FUNCTION_INFO_V1(gbt_enum_consistent);
PG_FUNCTION_INFO_V1(gbt_enum_penalty);
PG_FUNCTION_INFO_V1(gbt_enum_same);
PG_FUNCTION_INFO_V1(gbt_enum_sortsupport);
static bool
@ -99,10 +100,9 @@ static const gbtree_ninfo tinfo =
/**************************************************
* Enum ops
* GiST support functions
**************************************************/
Datum
gbt_enum_compress(PG_FUNCTION_ARGS)
{
@ -137,7 +137,7 @@ gbt_enum_consistent(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_BOOL(gbt_num_consistent(&key, (void *) &query, &strategy,
PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
GIST_LEAF(entry), &tinfo,
fcinfo->flinfo));
}
@ -149,10 +149,9 @@ gbt_enum_union(PG_FUNCTION_ARGS)
void *out = palloc(sizeof(oidKEY));
*(int *) PG_GETARG_POINTER(1) = sizeof(oidKEY);
PG_RETURN_POINTER(gbt_num_union((void *) out, entryvec, &tinfo, fcinfo->flinfo));
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_enum_penalty(PG_FUNCTION_ARGS)
{
@ -183,3 +182,39 @@ gbt_enum_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_enum_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
oidKEY *arg1 = (oidKEY *) DatumGetPointer(x);
oidKEY *arg2 = (oidKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return DatumGetInt32(CallerFInfoFunctionCall2(enum_cmp,
ssup->ssup_extra,
InvalidOid,
arg1->lower,
arg2->lower));
}
Datum
gbt_enum_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
FmgrInfo *flinfo;
ssup->comparator = gbt_enum_ssup_cmp;
/*
* Since gbt_enum_ssup_cmp() uses enum_cmp() like the rest of the
* comparison functions, it also needs to pass flinfo when calling it. The
* caller to a SortSupport comparison function doesn't provide an FmgrInfo
* struct, so look it up now, save it in ssup_extra and use it in
* gbt_enum_ssup_cmp() later.
*/
flinfo = MemoryContextAlloc(ssup->ssup_cxt, sizeof(FmgrInfo));
fmgr_info_cxt(F_ENUM_CMP, flinfo, ssup->ssup_cxt);
ssup->ssup_extra = flinfo;
PG_RETURN_VOID();
}

View File

@ -6,6 +6,7 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "utils/float.h"
#include "utils/sortsupport.h"
typedef struct float4key
{
@ -13,9 +14,7 @@ typedef struct float4key
float4 upper;
} float4KEY;
/*
** float4 ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_float4_compress);
PG_FUNCTION_INFO_V1(gbt_float4_fetch);
PG_FUNCTION_INFO_V1(gbt_float4_union);
@ -24,6 +23,7 @@ PG_FUNCTION_INFO_V1(gbt_float4_consistent);
PG_FUNCTION_INFO_V1(gbt_float4_distance);
PG_FUNCTION_INFO_V1(gbt_float4_penalty);
PG_FUNCTION_INFO_V1(gbt_float4_same);
PG_FUNCTION_INFO_V1(gbt_float4_sortsupport);
static bool
gbt_float4gt(const void *a, const void *b, FmgrInfo *flinfo)
@ -107,10 +107,9 @@ float4_dist(PG_FUNCTION_ARGS)
/**************************************************
* float4 ops
* GiST support functions
**************************************************/
Datum
gbt_float4_compress(PG_FUNCTION_ARGS)
{
@ -145,12 +144,11 @@ gbt_float4_consistent(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_BOOL(gbt_num_consistent(&key, (void *) &query, &strategy,
PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
GIST_LEAF(entry), &tinfo,
fcinfo->flinfo));
}
Datum
gbt_float4_distance(PG_FUNCTION_ARGS)
{
@ -164,11 +162,10 @@ gbt_float4_distance(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_FLOAT8(gbt_num_distance(&key, (void *) &query, GIST_LEAF(entry),
PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
&tinfo, fcinfo->flinfo));
}
Datum
gbt_float4_union(PG_FUNCTION_ARGS)
{
@ -176,10 +173,9 @@ gbt_float4_union(PG_FUNCTION_ARGS)
void *out = palloc(sizeof(float4KEY));
*(int *) PG_GETARG_POINTER(1) = sizeof(float4KEY);
PG_RETURN_POINTER(gbt_num_union((void *) out, entryvec, &tinfo, fcinfo->flinfo));
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_float4_penalty(PG_FUNCTION_ARGS)
{
@ -210,3 +206,24 @@ gbt_float4_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_float4_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
float4KEY *arg1 = (float4KEY *) DatumGetPointer(x);
float4KEY *arg2 = (float4KEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return float4_cmp_internal(arg1->lower, arg2->lower);
}
Datum
gbt_float4_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_float4_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -6,6 +6,7 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "utils/float.h"
#include "utils/sortsupport.h"
typedef struct float8key
{
@ -13,9 +14,7 @@ typedef struct float8key
float8 upper;
} float8KEY;
/*
** float8 ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_float8_compress);
PG_FUNCTION_INFO_V1(gbt_float8_fetch);
PG_FUNCTION_INFO_V1(gbt_float8_union);
@ -24,6 +23,7 @@ PG_FUNCTION_INFO_V1(gbt_float8_consistent);
PG_FUNCTION_INFO_V1(gbt_float8_distance);
PG_FUNCTION_INFO_V1(gbt_float8_penalty);
PG_FUNCTION_INFO_V1(gbt_float8_same);
PG_FUNCTION_INFO_V1(gbt_float8_sortsupport);
static bool
@ -113,10 +113,10 @@ float8_dist(PG_FUNCTION_ARGS)
PG_RETURN_FLOAT8(fabs(r));
}
/**************************************************
* float8 ops
**************************************************/
/**************************************************
* GiST support functions
**************************************************/
Datum
gbt_float8_compress(PG_FUNCTION_ARGS)
@ -152,12 +152,11 @@ gbt_float8_consistent(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_BOOL(gbt_num_consistent(&key, (void *) &query, &strategy,
PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
GIST_LEAF(entry), &tinfo,
fcinfo->flinfo));
}
Datum
gbt_float8_distance(PG_FUNCTION_ARGS)
{
@ -171,11 +170,10 @@ gbt_float8_distance(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_FLOAT8(gbt_num_distance(&key, (void *) &query, GIST_LEAF(entry),
PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
&tinfo, fcinfo->flinfo));
}
Datum
gbt_float8_union(PG_FUNCTION_ARGS)
{
@ -183,10 +181,9 @@ gbt_float8_union(PG_FUNCTION_ARGS)
void *out = palloc(sizeof(float8KEY));
*(int *) PG_GETARG_POINTER(1) = sizeof(float8KEY);
PG_RETURN_POINTER(gbt_num_union((void *) out, entryvec, &tinfo, fcinfo->flinfo));
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_float8_penalty(PG_FUNCTION_ARGS)
{
@ -217,3 +214,24 @@ gbt_float8_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_float8_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
float8KEY *arg1 = (float8KEY *) DatumGetPointer(x);
float8KEY *arg2 = (float8KEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return float8_cmp_internal(arg1->lower, arg2->lower);
}
Datum
gbt_float8_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_float8_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -0,0 +1,87 @@
/* contrib/btree_gist/btree_gist--1.7--1.8.sql */
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "ALTER EXTENSION btree_gist UPDATE TO '1.8'" to load this file. \quit
CREATE FUNCTION gist_translate_cmptype_btree(int)
RETURNS smallint
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
ALTER OPERATOR FAMILY gist_oid_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_int2_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_int4_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_int8_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_float4_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_float8_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_timestamp_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_timestamptz_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_time_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_date_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_interval_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_cash_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_macaddr_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_text_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_bpchar_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_bytea_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_numeric_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_bit_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_vbit_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_inet_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_cidr_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_timetz_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_uuid_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_macaddr8_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_enum_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;
ALTER OPERATOR FAMILY gist_bool_ops USING gist ADD
FUNCTION 12 ("any", "any") gist_translate_cmptype_btree (int) ;

View File

@ -0,0 +1,197 @@
/* contrib/btree_gist/btree_gist--1.8--1.9.sql */
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "ALTER EXTENSION btree_gist UPDATE TO '1.9'" to load this file. \quit
CREATE FUNCTION gbt_bit_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_varbit_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_bool_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_bytea_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_cash_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_date_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_enum_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_float4_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_float8_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_inet_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_int2_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_int4_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_int8_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_intv_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_macaddr_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_macad8_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_numeric_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_oid_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_text_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_bpchar_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_time_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_ts_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
CREATE FUNCTION gbt_uuid_sortsupport(internal)
RETURNS void
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;
ALTER OPERATOR FAMILY gist_bit_ops USING gist ADD
FUNCTION 11 (bit, bit) gbt_bit_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_vbit_ops USING gist ADD
FUNCTION 11 (varbit, varbit) gbt_varbit_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_bool_ops USING gist ADD
FUNCTION 11 (bool, bool) gbt_bool_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_bytea_ops USING gist ADD
FUNCTION 11 (bytea, bytea) gbt_bytea_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_cash_ops USING gist ADD
FUNCTION 11 (money, money) gbt_cash_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_date_ops USING gist ADD
FUNCTION 11 (date, date) gbt_date_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_enum_ops USING gist ADD
FUNCTION 11 (anyenum, anyenum) gbt_enum_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_float4_ops USING gist ADD
FUNCTION 11 (float4, float4) gbt_float4_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_float8_ops USING gist ADD
FUNCTION 11 (float8, float8) gbt_float8_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_inet_ops USING gist ADD
FUNCTION 11 (inet, inet) gbt_inet_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_cidr_ops USING gist ADD
FUNCTION 11 (cidr, cidr) gbt_inet_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_int2_ops USING gist ADD
FUNCTION 11 (int2, int2) gbt_int2_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_int4_ops USING gist ADD
FUNCTION 11 (int4, int4) gbt_int4_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_int8_ops USING gist ADD
FUNCTION 11 (int8, int8) gbt_int8_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_interval_ops USING gist ADD
FUNCTION 11 (interval, interval) gbt_intv_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_macaddr_ops USING gist ADD
FUNCTION 11 (macaddr, macaddr) gbt_macaddr_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_macaddr8_ops USING gist ADD
FUNCTION 11 (macaddr8, macaddr8) gbt_macad8_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_numeric_ops USING gist ADD
FUNCTION 11 (numeric, numeric) gbt_numeric_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_oid_ops USING gist ADD
FUNCTION 11 (oid, oid) gbt_oid_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_text_ops USING gist ADD
FUNCTION 11 (text, text) gbt_text_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_bpchar_ops USING gist ADD
FUNCTION 11 (bpchar, bpchar) gbt_bpchar_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_time_ops USING gist ADD
FUNCTION 11 (time, time) gbt_time_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_timetz_ops USING gist ADD
FUNCTION 11 (timetz, timetz) gbt_time_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_timestamp_ops USING gist ADD
FUNCTION 11 (timestamp, timestamp) gbt_ts_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_timestamptz_ops USING gist ADD
FUNCTION 11 (timestamptz, timestamptz) gbt_ts_sortsupport (internal) ;
ALTER OPERATOR FAMILY gist_uuid_ops USING gist ADD
FUNCTION 11 (uuid, uuid) gbt_uuid_sortsupport (internal) ;
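A minimal sketch of applying the update (table and index names are hypothetical): once support function 11 is registered, GiST builds on these opclasses can take the sorted-build path driven by the new comparators instead of inserting tuples one by one.
ALTER EXTENSION btree_gist UPDATE TO '1.9';
-- hypothetical table; with gbt_int4_sortsupport available, CREATE INDEX
-- can sort the keys up front and build the GiST index bottom-up
CREATE TABLE events (id int4, created date);
CREATE INDEX events_id_idx ON events USING gist (id);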

View File

@ -3,13 +3,19 @@
*/
#include "postgres.h"
#include "access/cmptype.h"
#include "access/stratnum.h"
#include "utils/builtins.h"
PG_MODULE_MAGIC;
PG_MODULE_MAGIC_EXT(
.name = "btree_gist",
.version = PG_VERSION
);
PG_FUNCTION_INFO_V1(gbt_decompress);
PG_FUNCTION_INFO_V1(gbtreekey_in);
PG_FUNCTION_INFO_V1(gbtreekey_out);
PG_FUNCTION_INFO_V1(gist_translate_cmptype_btree);
/**************************************************
* In/Out for keys
@ -51,3 +57,28 @@ gbt_decompress(PG_FUNCTION_ARGS)
{
PG_RETURN_POINTER(PG_GETARG_POINTER(0));
}
/*
* Returns the btree number for supported operators, otherwise invalid.
*/
Datum
gist_translate_cmptype_btree(PG_FUNCTION_ARGS)
{
CompareType cmptype = PG_GETARG_INT32(0);
switch (cmptype)
{
case COMPARE_EQ:
PG_RETURN_UINT16(BTEqualStrategyNumber);
case COMPARE_LT:
PG_RETURN_UINT16(BTLessStrategyNumber);
case COMPARE_LE:
PG_RETURN_UINT16(BTLessEqualStrategyNumber);
case COMPARE_GT:
PG_RETURN_UINT16(BTGreaterStrategyNumber);
case COMPARE_GE:
PG_RETURN_UINT16(BTGreaterEqualStrategyNumber);
default:
PG_RETURN_UINT16(InvalidStrategy);
}
}

View File

@ -1,6 +1,6 @@
# btree_gist extension
comment = 'support for indexing common datatypes in GiST'
default_version = '1.7'
default_version = '1.9'
module_pathname = '$libdir/btree_gist'
relocatable = true
trusted = true

View File

@ -7,7 +7,7 @@
#include "btree_utils_num.h"
#include "catalog/pg_type.h"
#include "utils/builtins.h"
#include "utils/inet.h"
#include "utils/sortsupport.h"
typedef struct inetkey
{
@ -15,15 +15,14 @@ typedef struct inetkey
double upper;
} inetKEY;
/*
** inet ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_inet_compress);
PG_FUNCTION_INFO_V1(gbt_inet_union);
PG_FUNCTION_INFO_V1(gbt_inet_picksplit);
PG_FUNCTION_INFO_V1(gbt_inet_consistent);
PG_FUNCTION_INFO_V1(gbt_inet_penalty);
PG_FUNCTION_INFO_V1(gbt_inet_same);
PG_FUNCTION_INFO_V1(gbt_inet_sortsupport);
static bool
@ -86,10 +85,9 @@ static const gbtree_ninfo tinfo =
/**************************************************
* inet ops
* GiST support functions
**************************************************/
Datum
gbt_inet_compress(PG_FUNCTION_ARGS)
{
@ -115,7 +113,6 @@ gbt_inet_compress(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(retval);
}
Datum
gbt_inet_consistent(PG_FUNCTION_ARGS)
{
@ -139,11 +136,10 @@ gbt_inet_consistent(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_BOOL(gbt_num_consistent(&key, (void *) &query,
PG_RETURN_BOOL(gbt_num_consistent(&key, &query,
&strategy, GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
Datum
gbt_inet_union(PG_FUNCTION_ARGS)
{
@ -151,10 +147,9 @@ gbt_inet_union(PG_FUNCTION_ARGS)
void *out = palloc(sizeof(inetKEY));
*(int *) PG_GETARG_POINTER(1) = sizeof(inetKEY);
PG_RETURN_POINTER(gbt_num_union((void *) out, entryvec, &tinfo, fcinfo->flinfo));
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_inet_penalty(PG_FUNCTION_ARGS)
{
@ -185,3 +180,29 @@ gbt_inet_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_inet_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
inetKEY *arg1 = (inetKEY *) DatumGetPointer(x);
inetKEY *arg2 = (inetKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
if (arg1->lower < arg2->lower)
return -1;
else if (arg1->lower > arg2->lower)
return 1;
else
return 0;
}
Datum
gbt_inet_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_inet_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -6,6 +6,7 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "common/int.h"
#include "utils/sortsupport.h"
typedef struct int16key
{
@ -13,9 +14,7 @@ typedef struct int16key
int16 upper;
} int16KEY;
/*
** int16 ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_int2_compress);
PG_FUNCTION_INFO_V1(gbt_int2_fetch);
PG_FUNCTION_INFO_V1(gbt_int2_union);
@ -24,6 +23,8 @@ PG_FUNCTION_INFO_V1(gbt_int2_consistent);
PG_FUNCTION_INFO_V1(gbt_int2_distance);
PG_FUNCTION_INFO_V1(gbt_int2_penalty);
PG_FUNCTION_INFO_V1(gbt_int2_same);
PG_FUNCTION_INFO_V1(gbt_int2_sortsupport);
static bool
gbt_int2gt(const void *a, const void *b, FmgrInfo *flinfo)
@ -112,10 +113,9 @@ int2_dist(PG_FUNCTION_ARGS)
/**************************************************
* int16 ops
* GiST support functions
**************************************************/
Datum
gbt_int2_compress(PG_FUNCTION_ARGS)
{
@ -150,11 +150,10 @@ gbt_int2_consistent(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_BOOL(gbt_num_consistent(&key, (void *) &query, &strategy,
PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
Datum
gbt_int2_distance(PG_FUNCTION_ARGS)
{
@ -168,11 +167,10 @@ gbt_int2_distance(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_FLOAT8(gbt_num_distance(&key, (void *) &query, GIST_LEAF(entry),
PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
&tinfo, fcinfo->flinfo));
}
Datum
gbt_int2_union(PG_FUNCTION_ARGS)
{
@ -180,10 +178,9 @@ gbt_int2_union(PG_FUNCTION_ARGS)
void *out = palloc(sizeof(int16KEY));
*(int *) PG_GETARG_POINTER(1) = sizeof(int16KEY);
PG_RETURN_POINTER(gbt_num_union((void *) out, entryvec, &tinfo, fcinfo->flinfo));
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_int2_penalty(PG_FUNCTION_ARGS)
{
@ -214,3 +211,27 @@ gbt_int2_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_int2_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
int16KEY *arg1 = (int16KEY *) DatumGetPointer(x);
int16KEY *arg2 = (int16KEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
if (arg1->lower < arg2->lower)
return -1;
else if (arg1->lower > arg2->lower)
return 1;
else
return 0;
}
Datum
gbt_int2_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_int2_ssup_cmp;
PG_RETURN_VOID();
}

View File

@ -2,10 +2,10 @@
* contrib/btree_gist/btree_int4.c
*/
#include "postgres.h"
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "common/int.h"
#include "utils/sortsupport.h"
typedef struct int32key
{
@ -13,9 +13,7 @@ typedef struct int32key
int32 upper;
} int32KEY;
/*
** int32 ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_int4_compress);
PG_FUNCTION_INFO_V1(gbt_int4_fetch);
PG_FUNCTION_INFO_V1(gbt_int4_union);
@ -24,7 +22,7 @@ PG_FUNCTION_INFO_V1(gbt_int4_consistent);
PG_FUNCTION_INFO_V1(gbt_int4_distance);
PG_FUNCTION_INFO_V1(gbt_int4_penalty);
PG_FUNCTION_INFO_V1(gbt_int4_same);
PG_FUNCTION_INFO_V1(gbt_int4_sortsupport);
static bool
gbt_int4gt(const void *a, const void *b, FmgrInfo *flinfo)
@ -113,10 +111,9 @@ int4_dist(PG_FUNCTION_ARGS)
/**************************************************
* int32 ops
* GiST support functions
**************************************************/
Datum
gbt_int4_compress(PG_FUNCTION_ARGS)
{
@ -151,11 +148,10 @@ gbt_int4_consistent(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_BOOL(gbt_num_consistent(&key, (void *) &query, &strategy,
PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
Datum
gbt_int4_distance(PG_FUNCTION_ARGS)
{
@ -169,11 +165,10 @@ gbt_int4_distance(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_FLOAT8(gbt_num_distance(&key, (void *) &query, GIST_LEAF(entry),
PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
&tinfo, fcinfo->flinfo));
}
Datum
gbt_int4_union(PG_FUNCTION_ARGS)
{
@ -181,10 +176,9 @@ gbt_int4_union(PG_FUNCTION_ARGS)
void *out = palloc(sizeof(int32KEY));
*(int *) PG_GETARG_POINTER(1) = sizeof(int32KEY);
PG_RETURN_POINTER(gbt_num_union((void *) out, entryvec, &tinfo, fcinfo->flinfo));
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_int4_penalty(PG_FUNCTION_ARGS)
{
@ -215,3 +209,27 @@ gbt_int4_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_int4_ssup_cmp(Datum a, Datum b, SortSupport ssup)
{
int32KEY *ia = (int32KEY *) DatumGetPointer(a);
int32KEY *ib = (int32KEY *) DatumGetPointer(b);
/* for leaf items we expect lower == upper, so only compare lower */
if (ia->lower < ib->lower)
return -1;
else if (ia->lower > ib->lower)
return 1;
else
return 0;
}
Datum
gbt_int4_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_int4_ssup_cmp;
PG_RETURN_VOID();
}

View File

@ -6,6 +6,7 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "common/int.h"
#include "utils/sortsupport.h"
typedef struct int64key
{
@ -13,9 +14,7 @@ typedef struct int64key
int64 upper;
} int64KEY;
/*
** int64 ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_int8_compress);
PG_FUNCTION_INFO_V1(gbt_int8_fetch);
PG_FUNCTION_INFO_V1(gbt_int8_union);
@ -24,6 +23,7 @@ PG_FUNCTION_INFO_V1(gbt_int8_consistent);
PG_FUNCTION_INFO_V1(gbt_int8_distance);
PG_FUNCTION_INFO_V1(gbt_int8_penalty);
PG_FUNCTION_INFO_V1(gbt_int8_same);
PG_FUNCTION_INFO_V1(gbt_int8_sortsupport);
static bool
@ -113,10 +113,9 @@ int8_dist(PG_FUNCTION_ARGS)
/**************************************************
* int64 ops
* GiST support functions
**************************************************/
Datum
gbt_int8_compress(PG_FUNCTION_ARGS)
{
@ -151,11 +150,10 @@ gbt_int8_consistent(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_BOOL(gbt_num_consistent(&key, (void *) &query, &strategy,
PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
Datum
gbt_int8_distance(PG_FUNCTION_ARGS)
{
@ -169,11 +167,10 @@ gbt_int8_distance(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_FLOAT8(gbt_num_distance(&key, (void *) &query, GIST_LEAF(entry),
PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
&tinfo, fcinfo->flinfo));
}
Datum
gbt_int8_union(PG_FUNCTION_ARGS)
{
@ -181,10 +178,9 @@ gbt_int8_union(PG_FUNCTION_ARGS)
void *out = palloc(sizeof(int64KEY));
*(int *) PG_GETARG_POINTER(1) = sizeof(int64KEY);
PG_RETURN_POINTER(gbt_num_union((void *) out, entryvec, &tinfo, fcinfo->flinfo));
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_int8_penalty(PG_FUNCTION_ARGS)
{
@ -215,3 +211,28 @@ gbt_int8_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_int8_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
int64KEY *arg1 = (int64KEY *) DatumGetPointer(x);
int64KEY *arg2 = (int64KEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
if (arg1->lower < arg2->lower)
return -1;
else if (arg1->lower > arg2->lower)
return 1;
else
return 0;
}
Datum
gbt_int8_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_int8_ssup_cmp;
PG_RETURN_VOID();
}

View File

@ -5,7 +5,8 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "utils/builtins.h"
#include "utils/fmgrprotos.h"
#include "utils/sortsupport.h"
#include "utils/timestamp.h"
typedef struct
@ -14,10 +15,7 @@ typedef struct
upper;
} intvKEY;
/*
** Interval ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_intv_compress);
PG_FUNCTION_INFO_V1(gbt_intv_fetch);
PG_FUNCTION_INFO_V1(gbt_intv_decompress);
@ -27,6 +25,7 @@ PG_FUNCTION_INFO_V1(gbt_intv_consistent);
PG_FUNCTION_INFO_V1(gbt_intv_distance);
PG_FUNCTION_INFO_V1(gbt_intv_penalty);
PG_FUNCTION_INFO_V1(gbt_intv_same);
PG_FUNCTION_INFO_V1(gbt_intv_sortsupport);
static bool
@ -113,7 +112,7 @@ static const gbtree_ninfo tinfo =
Interval *
abs_interval(Interval *a)
{
static Interval zero = {0, 0, 0};
static const Interval zero = {0, 0, 0};
if (DatumGetBool(DirectFunctionCall2(interval_lt,
IntervalPGetDatum(a),
@ -137,10 +136,9 @@ interval_dist(PG_FUNCTION_ARGS)
/**************************************************
* interval ops
* GiST support functions
**************************************************/
Datum
gbt_intv_compress(PG_FUNCTION_ARGS)
{
@ -224,7 +222,7 @@ gbt_intv_consistent(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_BOOL(gbt_num_consistent(&key, (void *) query, &strategy,
PG_RETURN_BOOL(gbt_num_consistent(&key, query, &strategy,
GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
@ -242,7 +240,7 @@ gbt_intv_distance(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_FLOAT8(gbt_num_distance(&key, (void *) query, GIST_LEAF(entry),
PG_RETURN_FLOAT8(gbt_num_distance(&key, query, GIST_LEAF(entry),
&tinfo, fcinfo->flinfo));
}
@ -254,7 +252,7 @@ gbt_intv_union(PG_FUNCTION_ARGS)
void *out = palloc(sizeof(intvKEY));
*(int *) PG_GETARG_POINTER(1) = sizeof(intvKEY);
PG_RETURN_POINTER(gbt_num_union((void *) out, entryvec, &tinfo, fcinfo->flinfo));
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
@ -295,3 +293,26 @@ gbt_intv_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_intv_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
intvKEY *arg1 = (intvKEY *) DatumGetPointer(x);
intvKEY *arg2 = (intvKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return DatumGetInt32(DirectFunctionCall2(interval_cmp,
IntervalPGetDatum(&arg1->lower),
IntervalPGetDatum(&arg2->lower)));
}
Datum
gbt_intv_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_intv_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -5,8 +5,9 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "utils/builtins.h"
#include "utils/fmgrprotos.h"
#include "utils/inet.h"
#include "utils/sortsupport.h"
typedef struct
{
@ -15,9 +16,7 @@ typedef struct
char pad[4]; /* make struct size = sizeof(gbtreekey16) */
} macKEY;
/*
** OID ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_macad_compress);
PG_FUNCTION_INFO_V1(gbt_macad_fetch);
PG_FUNCTION_INFO_V1(gbt_macad_union);
@ -25,6 +24,7 @@ PG_FUNCTION_INFO_V1(gbt_macad_picksplit);
PG_FUNCTION_INFO_V1(gbt_macad_consistent);
PG_FUNCTION_INFO_V1(gbt_macad_penalty);
PG_FUNCTION_INFO_V1(gbt_macad_same);
PG_FUNCTION_INFO_V1(gbt_macaddr_sortsupport);
static bool
@ -88,11 +88,9 @@ static const gbtree_ninfo tinfo =
/**************************************************
* macaddr ops
* GiST support functions
**************************************************/
static uint64
mac_2_uint64(macaddr *m)
{
@ -105,8 +103,6 @@ mac_2_uint64(macaddr *m)
return res;
}
Datum
gbt_macad_compress(PG_FUNCTION_ARGS)
{
@ -141,7 +137,7 @@ gbt_macad_consistent(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_BOOL(gbt_num_consistent(&key, (void *) query, &strategy,
PG_RETURN_BOOL(gbt_num_consistent(&key, query, &strategy,
GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
@ -153,7 +149,7 @@ gbt_macad_union(PG_FUNCTION_ARGS)
void *out = palloc0(sizeof(macKEY));
*(int *) PG_GETARG_POINTER(1) = sizeof(macKEY);
PG_RETURN_POINTER(gbt_num_union((void *) out, entryvec, &tinfo, fcinfo->flinfo));
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
@ -194,3 +190,26 @@ gbt_macad_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_macaddr_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
macKEY *arg1 = (macKEY *) DatumGetPointer(x);
macKEY *arg2 = (macKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return DatumGetInt32(DirectFunctionCall2(macaddr_cmp,
MacaddrPGetDatum(&arg1->lower),
MacaddrPGetDatum(&arg2->lower)));
}
Datum
gbt_macaddr_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_macaddr_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -5,8 +5,9 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "utils/builtins.h"
#include "utils/fmgrprotos.h"
#include "utils/inet.h"
#include "utils/sortsupport.h"
typedef struct
{
@ -15,9 +16,7 @@ typedef struct
/* make struct size = sizeof(gbtreekey16) */
} mac8KEY;
/*
** OID ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_macad8_compress);
PG_FUNCTION_INFO_V1(gbt_macad8_fetch);
PG_FUNCTION_INFO_V1(gbt_macad8_union);
@ -25,7 +24,7 @@ PG_FUNCTION_INFO_V1(gbt_macad8_picksplit);
PG_FUNCTION_INFO_V1(gbt_macad8_consistent);
PG_FUNCTION_INFO_V1(gbt_macad8_penalty);
PG_FUNCTION_INFO_V1(gbt_macad8_same);
PG_FUNCTION_INFO_V1(gbt_macad8_sortsupport);
static bool
gbt_macad8gt(const void *a, const void *b, FmgrInfo *flinfo)
@ -88,11 +87,9 @@ static const gbtree_ninfo tinfo =
/**************************************************
* macaddr ops
* GiST support functions
**************************************************/
static uint64
mac8_2_uint64(macaddr8 *m)
{
@ -105,8 +102,6 @@ mac8_2_uint64(macaddr8 *m)
return res;
}
Datum
gbt_macad8_compress(PG_FUNCTION_ARGS)
{
@ -141,11 +136,10 @@ gbt_macad8_consistent(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_BOOL(gbt_num_consistent(&key, (void *) query, &strategy,
PG_RETURN_BOOL(gbt_num_consistent(&key, query, &strategy,
GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
Datum
gbt_macad8_union(PG_FUNCTION_ARGS)
{
@ -153,10 +147,9 @@ gbt_macad8_union(PG_FUNCTION_ARGS)
void *out = palloc0(sizeof(mac8KEY));
*(int *) PG_GETARG_POINTER(1) = sizeof(mac8KEY);
PG_RETURN_POINTER(gbt_num_union((void *) out, entryvec, &tinfo, fcinfo->flinfo));
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_macad8_penalty(PG_FUNCTION_ARGS)
{
@ -194,3 +187,26 @@ gbt_macad8_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_macaddr8_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
mac8KEY *arg1 = (mac8KEY *) DatumGetPointer(x);
mac8KEY *arg2 = (mac8KEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return DatumGetInt32(DirectFunctionCall2(macaddr8_cmp,
Macaddr8PGetDatum(&arg1->lower),
Macaddr8PGetDatum(&arg2->lower)));
}
Datum
gbt_macad8_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_macaddr8_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -11,16 +11,16 @@
#include "utils/builtins.h"
#include "utils/numeric.h"
#include "utils/rel.h"
#include "utils/sortsupport.h"
/*
** Bytea ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_numeric_compress);
PG_FUNCTION_INFO_V1(gbt_numeric_union);
PG_FUNCTION_INFO_V1(gbt_numeric_picksplit);
PG_FUNCTION_INFO_V1(gbt_numeric_consistent);
PG_FUNCTION_INFO_V1(gbt_numeric_penalty);
PG_FUNCTION_INFO_V1(gbt_numeric_same);
PG_FUNCTION_INFO_V1(gbt_numeric_sortsupport);
/* define for comparison */
@ -90,10 +90,9 @@ static const gbtree_vinfo tinfo =
/**************************************************
* Text ops
* GiST support functions
**************************************************/
Datum
gbt_numeric_compress(PG_FUNCTION_ARGS)
{
@ -102,13 +101,11 @@ gbt_numeric_compress(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_var_compress(entry, &tinfo));
}
Datum
gbt_numeric_consistent(PG_FUNCTION_ARGS)
{
GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
void *query = (void *) DatumGetNumeric(PG_GETARG_DATUM(1));
void *query = DatumGetNumeric(PG_GETARG_DATUM(1));
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
/* Oid subtype = PG_GETARG_OID(3); */
@ -125,8 +122,6 @@ gbt_numeric_consistent(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(retval);
}
Datum
gbt_numeric_union(PG_FUNCTION_ARGS)
{
@ -137,7 +132,6 @@ gbt_numeric_union(PG_FUNCTION_ARGS)
&tinfo, fcinfo->flinfo));
}
Datum
gbt_numeric_same(PG_FUNCTION_ARGS)
{
@ -149,7 +143,6 @@ gbt_numeric_same(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(result);
}
Datum
gbt_numeric_penalty(PG_FUNCTION_ARGS)
{
@ -215,8 +208,6 @@ gbt_numeric_penalty(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(result);
}
Datum
gbt_numeric_picksplit(PG_FUNCTION_ARGS)
{
@ -227,3 +218,35 @@ gbt_numeric_picksplit(PG_FUNCTION_ARGS)
&tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(v);
}
static int
gbt_numeric_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
GBT_VARKEY *key1 = PG_DETOAST_DATUM(x);
GBT_VARKEY *key2 = PG_DETOAST_DATUM(y);
GBT_VARKEY_R arg1 = gbt_var_key_readable(key1);
GBT_VARKEY_R arg2 = gbt_var_key_readable(key2);
Datum result;
/* for leaf items we expect lower == upper, so only compare lower */
result = DirectFunctionCall2(numeric_cmp,
PointerGetDatum(arg1.lower),
PointerGetDatum(arg2.lower));
GBT_FREE_IF_COPY(key1, x);
GBT_FREE_IF_COPY(key2, y);
return DatumGetInt32(result);
}
Datum
gbt_numeric_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_numeric_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -5,6 +5,7 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "utils/sortsupport.h"
typedef struct
{
@ -12,9 +13,7 @@ typedef struct
Oid upper;
} oidKEY;
/*
** OID ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_oid_compress);
PG_FUNCTION_INFO_V1(gbt_oid_fetch);
PG_FUNCTION_INFO_V1(gbt_oid_union);
@ -23,6 +22,7 @@ PG_FUNCTION_INFO_V1(gbt_oid_consistent);
PG_FUNCTION_INFO_V1(gbt_oid_distance);
PG_FUNCTION_INFO_V1(gbt_oid_penalty);
PG_FUNCTION_INFO_V1(gbt_oid_same);
PG_FUNCTION_INFO_V1(gbt_oid_sortsupport);
static bool
@ -113,10 +113,9 @@ oid_dist(PG_FUNCTION_ARGS)
/**************************************************
* Oid ops
* GiST support functions
**************************************************/
Datum
gbt_oid_compress(PG_FUNCTION_ARGS)
{
@ -151,11 +150,10 @@ gbt_oid_consistent(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_BOOL(gbt_num_consistent(&key, (void *) &query, &strategy,
PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
Datum
gbt_oid_distance(PG_FUNCTION_ARGS)
{
@ -169,11 +167,10 @@ gbt_oid_distance(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_FLOAT8(gbt_num_distance(&key, (void *) &query, GIST_LEAF(entry),
PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
&tinfo, fcinfo->flinfo));
}
Datum
gbt_oid_union(PG_FUNCTION_ARGS)
{
@ -181,10 +178,9 @@ gbt_oid_union(PG_FUNCTION_ARGS)
void *out = palloc(sizeof(oidKEY));
*(int *) PG_GETARG_POINTER(1) = sizeof(oidKEY);
PG_RETURN_POINTER(gbt_num_union((void *) out, entryvec, &tinfo, fcinfo->flinfo));
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_oid_penalty(PG_FUNCTION_ARGS)
{
@ -215,3 +211,29 @@ gbt_oid_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_oid_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
oidKEY *arg1 = (oidKEY *) DatumGetPointer(x);
oidKEY *arg2 = (oidKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
if (arg1->lower > arg2->lower)
return 1;
else if (arg1->lower < arg2->lower)
return -1;
else
return 0;
}
Datum
gbt_oid_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_oid_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -5,11 +5,11 @@
#include "btree_gist.h"
#include "btree_utils_var.h"
#include "utils/builtins.h"
#include "mb/pg_wchar.h"
#include "utils/fmgrprotos.h"
#include "utils/sortsupport.h"
/*
** Text ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_text_compress);
PG_FUNCTION_INFO_V1(gbt_bpchar_compress);
PG_FUNCTION_INFO_V1(gbt_text_union);
@ -18,6 +18,8 @@ PG_FUNCTION_INFO_V1(gbt_text_consistent);
PG_FUNCTION_INFO_V1(gbt_bpchar_consistent);
PG_FUNCTION_INFO_V1(gbt_text_penalty);
PG_FUNCTION_INFO_V1(gbt_text_same);
PG_FUNCTION_INFO_V1(gbt_text_sortsupport);
PG_FUNCTION_INFO_V1(gbt_bpchar_sortsupport);
/* define for comparison */
@ -162,10 +164,9 @@ static gbtree_vinfo bptinfo =
/**************************************************
* Text ops
* GiST support functions
**************************************************/
Datum
gbt_text_compress(PG_FUNCTION_ARGS)
{
@ -186,13 +187,11 @@ gbt_bpchar_compress(PG_FUNCTION_ARGS)
return gbt_text_compress(fcinfo);
}
Datum
gbt_text_consistent(PG_FUNCTION_ARGS)
{
GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
void *query = (void *) DatumGetTextP(PG_GETARG_DATUM(1));
void *query = DatumGetTextP(PG_GETARG_DATUM(1));
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
/* Oid subtype = PG_GETARG_OID(3); */
@ -215,12 +214,11 @@ gbt_text_consistent(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(retval);
}
Datum
gbt_bpchar_consistent(PG_FUNCTION_ARGS)
{
GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
void *query = (void *) DatumGetTextP(PG_GETARG_DATUM(1));
void *query = DatumGetTextP(PG_GETARG_DATUM(1));
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
/* Oid subtype = PG_GETARG_OID(3); */
@ -242,7 +240,6 @@ gbt_bpchar_consistent(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(retval);
}
Datum
gbt_text_union(PG_FUNCTION_ARGS)
{
@ -253,7 +250,6 @@ gbt_text_union(PG_FUNCTION_ARGS)
&tinfo, fcinfo->flinfo));
}
Datum
gbt_text_picksplit(PG_FUNCTION_ARGS)
{
@ -276,7 +272,6 @@ gbt_text_same(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(result);
}
Datum
gbt_text_penalty(PG_FUNCTION_ARGS)
{
@ -287,3 +282,69 @@ gbt_text_penalty(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_var_penalty(result, o, n, PG_GET_COLLATION(),
&tinfo, fcinfo->flinfo));
}
static int
gbt_text_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
GBT_VARKEY *key1 = PG_DETOAST_DATUM(x);
GBT_VARKEY *key2 = PG_DETOAST_DATUM(y);
GBT_VARKEY_R arg1 = gbt_var_key_readable(key1);
GBT_VARKEY_R arg2 = gbt_var_key_readable(key2);
Datum result;
/* for leaf items we expect lower == upper, so only compare lower */
result = DirectFunctionCall2Coll(bttextcmp,
ssup->ssup_collation,
PointerGetDatum(arg1.lower),
PointerGetDatum(arg2.lower));
GBT_FREE_IF_COPY(key1, x);
GBT_FREE_IF_COPY(key2, y);
return DatumGetInt32(result);
}
Datum
gbt_text_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_text_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}
static int
gbt_bpchar_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
GBT_VARKEY *key1 = PG_DETOAST_DATUM(x);
GBT_VARKEY *key2 = PG_DETOAST_DATUM(y);
GBT_VARKEY_R arg1 = gbt_var_key_readable(key1);
GBT_VARKEY_R arg2 = gbt_var_key_readable(key2);
Datum result;
/* for leaf items we expect lower == upper, so only compare lower */
result = DirectFunctionCall2Coll(bpcharcmp,
ssup->ssup_collation,
PointerGetDatum(arg1.lower),
PointerGetDatum(arg2.lower));
GBT_FREE_IF_COPY(key1, x);
GBT_FREE_IF_COPY(key2, y);
return DatumGetInt32(result);
}
Datum
gbt_bpchar_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_bpchar_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}
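Because the comparators above pass ssup->ssup_collation through to bttextcmp and bpcharcmp, a sorted GiST build orders leaf keys under the same collation as the indexed column. A small, hypothetical illustration (the table and index names are made up for this sketch and assume the btree_gist extension is installed):
CREATE EXTENSION IF NOT EXISTS btree_gist;
-- hypothetical table: the column's "C" collation is what the sortsupport
-- comparator is handed during a sorted index build
CREATE TABLE words (w text COLLATE "C");
CREATE INDEX words_w_idx ON words USING gist (w);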

View File

@ -5,8 +5,9 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "utils/builtins.h"
#include "utils/fmgrprotos.h"
#include "utils/date.h"
#include "utils/sortsupport.h"
#include "utils/timestamp.h"
typedef struct
@ -15,9 +16,7 @@ typedef struct
TimeADT upper;
} timeKEY;
/*
** time ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_time_compress);
PG_FUNCTION_INFO_V1(gbt_timetz_compress);
PG_FUNCTION_INFO_V1(gbt_time_fetch);
@ -28,6 +27,8 @@ PG_FUNCTION_INFO_V1(gbt_time_distance);
PG_FUNCTION_INFO_V1(gbt_timetz_consistent);
PG_FUNCTION_INFO_V1(gbt_time_penalty);
PG_FUNCTION_INFO_V1(gbt_time_same);
PG_FUNCTION_INFO_V1(gbt_time_sortsupport);
PG_FUNCTION_INFO_V1(gbt_timetz_sortsupport);
#ifdef USE_FLOAT8_BYVAL
@ -92,8 +93,6 @@ gbt_timelt(const void *a, const void *b, FmgrInfo *flinfo)
TimeADTGetDatumFast(*bb)));
}
static int
gbt_timekey_cmp(const void *a, const void *b, FmgrInfo *flinfo)
{
@ -150,11 +149,9 @@ time_dist(PG_FUNCTION_ARGS)
/**************************************************
* time ops
* GiST support functions
**************************************************/
Datum
gbt_time_compress(PG_FUNCTION_ARGS)
{
@ -163,7 +160,6 @@ gbt_time_compress(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_num_compress(entry, &tinfo));
}
Datum
gbt_timetz_compress(PG_FUNCTION_ARGS)
{
@ -216,7 +212,7 @@ gbt_time_consistent(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_BOOL(gbt_num_consistent(&key, (void *) &query, &strategy,
PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
@ -233,7 +229,7 @@ gbt_time_distance(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_FLOAT8(gbt_num_distance(&key, (void *) &query, GIST_LEAF(entry),
PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
&tinfo, fcinfo->flinfo));
}
@ -258,11 +254,10 @@ gbt_timetz_consistent(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_BOOL(gbt_num_consistent(&key, (void *) &qqq, &strategy,
PG_RETURN_BOOL(gbt_num_consistent(&key, &qqq, &strategy,
GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
Datum
gbt_time_union(PG_FUNCTION_ARGS)
{
@ -270,10 +265,9 @@ gbt_time_union(PG_FUNCTION_ARGS)
void *out = palloc(sizeof(timeKEY));
*(int *) PG_GETARG_POINTER(1) = sizeof(timeKEY);
PG_RETURN_POINTER(gbt_num_union((void *) out, entryvec, &tinfo, fcinfo->flinfo));
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
Datum
gbt_time_penalty(PG_FUNCTION_ARGS)
{
@ -313,7 +307,6 @@ gbt_time_penalty(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(result);
}
Datum
gbt_time_picksplit(PG_FUNCTION_ARGS)
{
@ -332,3 +325,26 @@ gbt_time_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_timekey_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
timeKEY *arg1 = (timeKEY *) DatumGetPointer(x);
timeKEY *arg2 = (timeKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return DatumGetInt32(DirectFunctionCall2(time_cmp,
TimeADTGetDatumFast(arg1->lower),
TimeADTGetDatumFast(arg2->lower)));
}
Datum
gbt_time_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_timekey_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -7,9 +7,10 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "utils/builtins.h"
#include "utils/datetime.h"
#include "utils/fmgrprotos.h"
#include "utils/timestamp.h"
#include "utils/float.h"
#include "utils/sortsupport.h"
typedef struct
{
@ -17,9 +18,7 @@ typedef struct
Timestamp upper;
} tsKEY;
/*
** timestamp ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_ts_compress);
PG_FUNCTION_INFO_V1(gbt_tstz_compress);
PG_FUNCTION_INFO_V1(gbt_ts_fetch);
@ -31,6 +30,7 @@ PG_FUNCTION_INFO_V1(gbt_tstz_consistent);
PG_FUNCTION_INFO_V1(gbt_tstz_distance);
PG_FUNCTION_INFO_V1(gbt_ts_penalty);
PG_FUNCTION_INFO_V1(gbt_ts_same);
PG_FUNCTION_INFO_V1(gbt_ts_sortsupport);
#ifdef USE_FLOAT8_BYVAL
@ -40,6 +40,8 @@ PG_FUNCTION_INFO_V1(gbt_ts_same);
#endif
/* define for comparison */
static bool
gbt_tsgt(const void *a, const void *b, FmgrInfo *flinfo)
{
@ -95,7 +97,6 @@ gbt_tslt(const void *a, const void *b, FmgrInfo *flinfo)
TimestampGetDatumFast(*bb)));
}
static int
gbt_tskey_cmp(const void *a, const void *b, FmgrInfo *flinfo)
{
@ -126,7 +127,6 @@ gbt_ts_dist(const void *a, const void *b, FmgrInfo *flinfo)
return fabs(INTERVAL_TO_SEC(i));
}
static const gbtree_ninfo tinfo =
{
gbt_t_ts,
@ -190,12 +190,10 @@ tstz_dist(PG_FUNCTION_ARGS)
PG_RETURN_INTERVAL_P(abs_interval(r));
}
/**************************************************
* timestamp ops
* GiST support functions
**************************************************/
static inline Timestamp
tstz_to_ts_gmt(TimestampTz ts)
{
@ -212,7 +210,6 @@ gbt_ts_compress(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(gbt_num_compress(entry, &tinfo));
}
Datum
gbt_tstz_compress(PG_FUNCTION_ARGS)
{
@ -265,7 +262,7 @@ gbt_ts_consistent(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_BOOL(gbt_num_consistent(&key, (void *) &query, &strategy,
PG_RETURN_BOOL(gbt_num_consistent(&key, &query, &strategy,
GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
@ -282,7 +279,7 @@ gbt_ts_distance(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_FLOAT8(gbt_num_distance(&key, (void *) &query, GIST_LEAF(entry),
PG_RETURN_FLOAT8(gbt_num_distance(&key, &query, GIST_LEAF(entry),
&tinfo, fcinfo->flinfo));
}
@ -306,7 +303,7 @@ gbt_tstz_consistent(PG_FUNCTION_ARGS)
key.upper = (GBT_NUMKEY *) &kkk[MAXALIGN(tinfo.size)];
qqq = tstz_to_ts_gmt(query);
PG_RETURN_BOOL(gbt_num_consistent(&key, (void *) &qqq, &strategy,
PG_RETURN_BOOL(gbt_num_consistent(&key, &qqq, &strategy,
GIST_LEAF(entry), &tinfo, fcinfo->flinfo));
}
@ -325,7 +322,7 @@ gbt_tstz_distance(PG_FUNCTION_ARGS)
key.upper = (GBT_NUMKEY *) &kkk[MAXALIGN(tinfo.size)];
qqq = tstz_to_ts_gmt(query);
PG_RETURN_FLOAT8(gbt_num_distance(&key, (void *) &qqq, GIST_LEAF(entry),
PG_RETURN_FLOAT8(gbt_num_distance(&key, &qqq, GIST_LEAF(entry),
&tinfo, fcinfo->flinfo));
}
@ -337,7 +334,7 @@ gbt_ts_union(PG_FUNCTION_ARGS)
void *out = palloc(sizeof(tsKEY));
*(int *) PG_GETARG_POINTER(1) = sizeof(tsKEY);
PG_RETURN_POINTER(gbt_num_union((void *) out, entryvec, &tinfo, fcinfo->flinfo));
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
@ -398,3 +395,26 @@ gbt_ts_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_ts_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
tsKEY *arg1 = (tsKEY *) DatumGetPointer(x);
tsKEY *arg2 = (tsKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return DatumGetInt32(DirectFunctionCall2(timestamp_cmp,
TimestampGetDatumFast(arg1->lower),
TimestampGetDatumFast(arg2->lower)));
}
Datum
gbt_ts_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_ts_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -9,7 +9,6 @@
#include "access/gist.h"
#include "btree_gist.h"
#include "utils/rel.h"
typedef char GBT_NUMKEY;

View File

@ -9,8 +9,7 @@
#include "btree_gist.h"
#include "btree_utils_var.h"
#include "utils/builtins.h"
#include "utils/pg_locale.h"
#include "mb/pg_wchar.h"
#include "utils/rel.h"
/* used for key sorting */

View File

@ -6,7 +6,6 @@
#include "access/gist.h"
#include "btree_gist.h"
#include "mb/pg_wchar.h"
/* Variable length key */
typedef bytea GBT_VARKEY;
@ -42,7 +41,17 @@ typedef struct
GBT_VARKEY *(*f_l2n) (GBT_VARKEY *, FmgrInfo *flinfo); /* convert leaf to node */
} gbtree_vinfo;
/*
* Free ptr1 in case it's a copy of ptr2.
*
* This is adapted from varlena's PG_FREE_IF_COPY, though it doesn't require
* fcinfo access.
*/
#define GBT_FREE_IF_COPY(ptr1, ptr2) \
do { \
if ((Pointer) (ptr1) != DatumGetPointer(ptr2)) \
pfree(ptr1); \
} while (0)
extern GBT_VARKEY_R gbt_var_key_readable(const GBT_VARKEY *k);

View File

@ -6,6 +6,7 @@
#include "btree_gist.h"
#include "btree_utils_num.h"
#include "port/pg_bswap.h"
#include "utils/sortsupport.h"
#include "utils/uuid.h"
typedef struct
@ -15,9 +16,7 @@ typedef struct
} uuidKEY;
/*
* UUID ops
*/
/* GiST support functions */
PG_FUNCTION_INFO_V1(gbt_uuid_compress);
PG_FUNCTION_INFO_V1(gbt_uuid_fetch);
PG_FUNCTION_INFO_V1(gbt_uuid_union);
@ -25,6 +24,7 @@ PG_FUNCTION_INFO_V1(gbt_uuid_picksplit);
PG_FUNCTION_INFO_V1(gbt_uuid_consistent);
PG_FUNCTION_INFO_V1(gbt_uuid_penalty);
PG_FUNCTION_INFO_V1(gbt_uuid_same);
PG_FUNCTION_INFO_V1(gbt_uuid_sortsupport);
static int
@ -93,10 +93,9 @@ static const gbtree_ninfo tinfo =
/**************************************************
* uuid ops
* GiST support functions
**************************************************/
Datum
gbt_uuid_compress(PG_FUNCTION_ARGS)
{
@ -148,7 +147,7 @@ gbt_uuid_consistent(PG_FUNCTION_ARGS)
key.lower = (GBT_NUMKEY *) &kkk->lower;
key.upper = (GBT_NUMKEY *) &kkk->upper;
PG_RETURN_BOOL(gbt_num_consistent(&key, (void *) query, &strategy,
PG_RETURN_BOOL(gbt_num_consistent(&key, query, &strategy,
GIST_LEAF(entry), &tinfo,
fcinfo->flinfo));
}
@ -160,7 +159,7 @@ gbt_uuid_union(PG_FUNCTION_ARGS)
void *out = palloc(sizeof(uuidKEY));
*(int *) PG_GETARG_POINTER(1) = sizeof(uuidKEY);
PG_RETURN_POINTER(gbt_num_union((void *) out, entryvec, &tinfo, fcinfo->flinfo));
PG_RETURN_POINTER(gbt_num_union(out, entryvec, &tinfo, fcinfo->flinfo));
}
/*
@ -233,3 +232,24 @@ gbt_uuid_same(PG_FUNCTION_ARGS)
*result = gbt_num_same((void *) b1, (void *) b2, &tinfo, fcinfo->flinfo);
PG_RETURN_POINTER(result);
}
static int
gbt_uuid_ssup_cmp(Datum x, Datum y, SortSupport ssup)
{
uuidKEY *arg1 = (uuidKEY *) DatumGetPointer(x);
uuidKEY *arg2 = (uuidKEY *) DatumGetPointer(y);
/* for leaf items we expect lower == upper, so only compare lower */
return uuid_internal_cmp(&arg1->lower, &arg2->lower);
}
Datum
gbt_uuid_sortsupport(PG_FUNCTION_ARGS)
{
SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
ssup->comparator = gbt_uuid_ssup_cmp;
ssup->ssup_extra = NULL;
PG_RETURN_VOID();
}

View File

@ -1,5 +1,8 @@
-- enum check
create type rainbow as enum ('r','o','y','g','b','i','v');
create type rainbow as enum ('r','o','g','b','i','v');
-- enum values added later take some different codepaths internally,
-- so make sure we have coverage for those too
alter type rainbow add value 'y' before 'g';
CREATE TABLE enumtmp (a rainbow);
\copy enumtmp from 'data/enum.data'
SET enable_seqscan=on;

View File

@ -0,0 +1,13 @@
-- test stratnum translation support func
SELECT gist_translate_cmptype_btree(7);
gist_translate_cmptype_btree
------------------------------
0
(1 row)
SELECT gist_translate_cmptype_btree(3);
gist_translate_cmptype_btree
------------------------------
3
(1 row)

View File

@ -0,0 +1,92 @@
-- Core must test WITHOUT OVERLAPS
-- with an int4range + daterange,
-- so here we do some simple tests
-- to make sure int + daterange works too,
-- since that is the expected use-case.
CREATE TABLE temporal_rng (
id integer,
valid_at daterange,
CONSTRAINT temporal_rng_pk PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
);
\d temporal_rng
Table "public.temporal_rng"
Column | Type | Collation | Nullable | Default
----------+-----------+-----------+----------+---------
id | integer | | not null |
valid_at | daterange | | not null |
Indexes:
"temporal_rng_pk" PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
SELECT pg_get_constraintdef(oid) FROM pg_constraint WHERE conname = 'temporal_rng_pk';
pg_get_constraintdef
---------------------------------------------
PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
(1 row)
SELECT pg_get_indexdef(conindid, 0, true) FROM pg_constraint WHERE conname = 'temporal_rng_pk';
pg_get_indexdef
-------------------------------------------------------------------------------
CREATE UNIQUE INDEX temporal_rng_pk ON temporal_rng USING gist (id, valid_at)
(1 row)
INSERT INTO temporal_rng VALUES
(1, '[2000-01-01,2001-01-01)');
-- same key, doesn't overlap:
INSERT INTO temporal_rng VALUES
(1, '[2001-01-01,2002-01-01)');
-- overlaps but different key:
INSERT INTO temporal_rng VALUES
(2, '[2000-01-01,2001-01-01)');
-- should fail:
INSERT INTO temporal_rng VALUES
(1, '[2000-06-01,2001-01-01)');
ERROR: conflicting key value violates exclusion constraint "temporal_rng_pk"
DETAIL: Key (id, valid_at)=(1, [06-01-2000,01-01-2001)) conflicts with existing key (id, valid_at)=(1, [01-01-2000,01-01-2001)).
-- Foreign key
CREATE TABLE temporal_fk_rng2rng (
id integer,
valid_at daterange,
parent_id integer,
CONSTRAINT temporal_fk_rng2rng_pk PRIMARY KEY (id, valid_at WITHOUT OVERLAPS),
CONSTRAINT temporal_fk_rng2rng_fk FOREIGN KEY (parent_id, PERIOD valid_at)
REFERENCES temporal_rng (id, PERIOD valid_at)
);
\d temporal_fk_rng2rng
Table "public.temporal_fk_rng2rng"
Column | Type | Collation | Nullable | Default
-----------+-----------+-----------+----------+---------
id | integer | | not null |
valid_at | daterange | | not null |
parent_id | integer | | |
Indexes:
"temporal_fk_rng2rng_pk" PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
Foreign-key constraints:
"temporal_fk_rng2rng_fk" FOREIGN KEY (parent_id, PERIOD valid_at) REFERENCES temporal_rng(id, PERIOD valid_at)
SELECT pg_get_constraintdef(oid) FROM pg_constraint WHERE conname = 'temporal_fk_rng2rng_fk';
pg_get_constraintdef
---------------------------------------------------------------------------------------
FOREIGN KEY (parent_id, PERIOD valid_at) REFERENCES temporal_rng(id, PERIOD valid_at)
(1 row)
-- okay
INSERT INTO temporal_fk_rng2rng VALUES
(1, '[2000-01-01,2001-01-01)', 1);
-- okay spanning two parent records:
INSERT INTO temporal_fk_rng2rng VALUES
(2, '[2000-01-01,2002-01-01)', 1);
-- key is missing
INSERT INTO temporal_fk_rng2rng VALUES
(3, '[2000-01-01,2001-01-01)', 3);
ERROR: insert or update on table "temporal_fk_rng2rng" violates foreign key constraint "temporal_fk_rng2rng_fk"
DETAIL: Key (parent_id, valid_at)=(3, [01-01-2000,01-01-2001)) is not present in table "temporal_rng".
-- key exists but is outside range
INSERT INTO temporal_fk_rng2rng VALUES
(4, '[2001-01-01,2002-01-01)', 2);
ERROR: insert or update on table "temporal_fk_rng2rng" violates foreign key constraint "temporal_fk_rng2rng_fk"
DETAIL: Key (parent_id, valid_at)=(2, [01-01-2001,01-01-2002)) is not present in table "temporal_rng".
-- key exists but is partly outside range
INSERT INTO temporal_fk_rng2rng VALUES
(5, '[2000-01-01,2002-01-01)', 2);
ERROR: insert or update on table "temporal_fk_rng2rng" violates foreign key constraint "temporal_fk_rng2rng_fk"
DETAIL: Key (parent_id, valid_at)=(2, [01-01-2000,01-01-2002)) is not present in table "temporal_rng".

View File

@ -1,4 +1,4 @@
# Copyright (c) 2022-2024, PostgreSQL Global Development Group
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
btree_gist_sources = files(
'btree_bit.c',
@ -50,6 +50,8 @@ install_data(
'btree_gist--1.4--1.5.sql',
'btree_gist--1.5--1.6.sql',
'btree_gist--1.6--1.7.sql',
'btree_gist--1.7--1.8.sql',
'btree_gist--1.8--1.9.sql',
kwargs: contrib_data_args,
)
@ -89,6 +91,8 @@ tests += {
'enum',
'bool',
'partitions',
'stratnum',
'without_overlaps',
],
},
}

View File

@ -1,6 +1,10 @@
-- enum check
create type rainbow as enum ('r','o','y','g','b','i','v');
create type rainbow as enum ('r','o','g','b','i','v');
-- enum values added later take some different codepaths internally,
-- so make sure we have coverage for those too
alter type rainbow add value 'y' before 'g';
CREATE TABLE enumtmp (a rainbow);

View File

@ -0,0 +1,3 @@
-- test stratnum translation support func
SELECT gist_translate_cmptype_btree(7);
SELECT gist_translate_cmptype_btree(3);

View File

@ -0,0 +1,53 @@
-- Core must test WITHOUT OVERLAPS
-- with an int4range + daterange,
-- so here we do some simple tests
-- to make sure int + daterange works too,
-- since that is the expected use-case.
CREATE TABLE temporal_rng (
id integer,
valid_at daterange,
CONSTRAINT temporal_rng_pk PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
);
\d temporal_rng
SELECT pg_get_constraintdef(oid) FROM pg_constraint WHERE conname = 'temporal_rng_pk';
SELECT pg_get_indexdef(conindid, 0, true) FROM pg_constraint WHERE conname = 'temporal_rng_pk';
INSERT INTO temporal_rng VALUES
(1, '[2000-01-01,2001-01-01)');
-- same key, doesn't overlap:
INSERT INTO temporal_rng VALUES
(1, '[2001-01-01,2002-01-01)');
-- overlaps but different key:
INSERT INTO temporal_rng VALUES
(2, '[2000-01-01,2001-01-01)');
-- should fail:
INSERT INTO temporal_rng VALUES
(1, '[2000-06-01,2001-01-01)');
-- Foreign key
CREATE TABLE temporal_fk_rng2rng (
id integer,
valid_at daterange,
parent_id integer,
CONSTRAINT temporal_fk_rng2rng_pk PRIMARY KEY (id, valid_at WITHOUT OVERLAPS),
CONSTRAINT temporal_fk_rng2rng_fk FOREIGN KEY (parent_id, PERIOD valid_at)
REFERENCES temporal_rng (id, PERIOD valid_at)
);
\d temporal_fk_rng2rng
SELECT pg_get_constraintdef(oid) FROM pg_constraint WHERE conname = 'temporal_fk_rng2rng_fk';
-- okay
INSERT INTO temporal_fk_rng2rng VALUES
(1, '[2000-01-01,2001-01-01)', 1);
-- okay spanning two parent records:
INSERT INTO temporal_fk_rng2rng VALUES
(2, '[2000-01-01,2002-01-01)', 1);
-- key is missing
INSERT INTO temporal_fk_rng2rng VALUES
(3, '[2000-01-01,2001-01-01)', 3);
-- key exists but is outside range
INSERT INTO temporal_fk_rng2rng VALUES
(4, '[2001-01-01,2002-01-01)', 2);
-- key exists but is partly outside range
INSERT INTO temporal_fk_rng2rng VALUES
(5, '[2000-01-01,2002-01-01)', 2);

View File

@ -4,6 +4,8 @@ MODULES = citext
EXTENSION = citext
DATA = citext--1.4.sql \
citext--1.7--1.8.sql \
citext--1.6--1.7.sql \
citext--1.5--1.6.sql \
citext--1.4--1.5.sql \
citext--1.3--1.4.sql \

Some files were not shown because too many files have changed in this diff.