59426 Commits

Author SHA1 Message Date
Fujii Masao
aab422af95 Fix logical decoding test to correctly check slot removal on standby.
The regression test for logical decoding verifies whether a logical slot
is correctly dropped on a standby when its associated database is dropped.
However, the test mistakenly retrieved slot information from the primary
instead of the standby, causing incorrect behavior.

This commit fixes the issue by ensuring the test correctly checks the slot
on the standby.

Back-patch to all supported versions.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/1fdfd020-a509-403c-bd8f-a04664aba148@oss.nttdata.com
Backpatch-through: 13
2025-04-04 13:33:52 +09:00
Fujii Masao
5570e103bf Fix logical decoding regression tests to correctly check slot existence.
The regression tests for logical decoding verify whether a logical slot
exists or has been dropped. Previously, these tests attempted to
retrieve "slot_name" from the result of slot(), but since "slot_name" was
not included in the result, slot()->{'slot_name'} always returned undef,
leading to incorrect behavior.

This commit fixes the issue by checking the "plugin" field in the result
of slot() instead, ensuring the tests properly verify slot existence.

Back-patch to all supported versions.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/OSCPR01MB149667EC4E738769CA80B7EA5F5AE2@OSCPR01MB14966.jpnprd01.prod.outlook.com
Backpatch-through: 13
2025-04-04 13:10:21 +09:00
Masahiko Sawada
a4309e85f4 Restrict copying of invalidated replication slots.
Previously, invalidated logical and physical replication slots could
be copied using the pg_copy_logical_replication_slot and
pg_copy_physical_replication_slot functions. Replication slots that
were invalidated for reasons other than WAL removal retained their
restart_lsn. This meant that a new slot copied from an invalidated
slot could have a restart_lsn pointing to a WAL segment that might
have already been removed.

This commit restricts the copying of invalidated replication slots.

Backpatch to v16, where slots could retain their restart_lsn when
invalidated for reasons other than WAL removal.

For v15 and earlier, this check is not required since slots can only
be invalidated due to WAL removal, and existing checks already handle
this issue.

Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CANhcyEU65aH0VYnLiu%3DOhNNxhnhNhwcXBeT-jvRe1OiJTo_Ayg%40mail.gmail.com
Backpatch-through: 16
2025-04-03 10:30:02 -07:00
Tom Lane
e0191121b2 Remove unnecessary type violation in tsvectorrecv().
compareentry() is declared to work on WordEntryIN structs, but
tsvectorrecv() is using it in two places to work on WordEntry
structs.  This is almost okay, since WordEntry is the first
field of WordEntryIN.  But on machines with 8-byte pointers,
WordEntryIN will have a larger alignment spec than WordEntry,
and it's at least theoretically possible that the compiler
could generate code that depends on the larger alignment.

Given the lack of field reports, this may be just a hypothetical bug
that upsets nothing except sanitizer tools.  Or it may be real on
certain hardware but nobody's tried to use tsvectorrecv() on such
hardware.  In any case we should fix it, and the fix is trivial:
just change compareentry() so that it works on WordEntry without any
mention of WordEntryIN.  We can also get rid of the quite-useless
intermediate function WordEntryCMP.

Bug: #18875
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18875-07a29c49c825a608@postgresql.org
Backpatch-through: 13
2025-04-02 16:17:51 -04:00
Andres Freund
78cb2466f7 Remove HeapBitmapScan's skip_fetch optimization
The optimization does not take the removal of TIDs by a concurrent vacuum into
account. The concurrent vacuum can remove dead TIDs and make pages ALL_VISIBLE
while those dead TIDs are referenced in the bitmap. This can lead to a
skip_fetch scan returning too many tuples.

It likely would be possible to implement this optimization safely, but we
don't have the necessary infrastructure in place. Nor is it clear that it's
worth building that infrastructure, given how limited the skip_fetch
optimization is.

In the backbranches we just disable the optimization by always passing
need_tuples=true to table_beginscan_bm(). We can't perform API/ABI changes in
the backbranches and we want to make the change as minimal as possible.

Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reported-By: Konstantin Knizhnik <knizhnik@garret.ru>
Discussion: https://postgr.es/m/CAEze2Wg3gXXZTr6_rwC+s4-o2ZVFB5F985uUSgJTsECx6AmGcQ@mail.gmail.com
Backpatch-through: 13
2025-04-02 14:42:03 -04:00
Tom Lane
0941aadcd5 Need to do CommandCounterIncrement after StoreAttrMissingVal.
Without this, an additional change to the same pg_attribute row
within the same command will fail.  This is possible at least with
ALTER TABLE ADD COLUMN on a multiple-inheritance-pathway structure.
(Another potential hazard is that immediately-following operations
might not see the missingval.)

Introduced by 95f650674, which split the former coding that
used a single pg_attribute update to change both atthasdef and
atthasmissing/attmissingval into two updates, but missed that
this should entail two CommandCounterIncrements as well.  Like
that fix, back-patch through v13.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/025a3ffa-5eff-4a88-97fb-8f583b015965@gmail.com
Backpatch-through: 13
2025-04-02 11:13:01 -04:00
Peter Eisentraut
b19893b94b Fix code comment
The changes made in commit d2b4b4c2259 contained incorrect comments:
They said that certain forward declarations were necessary to "avoid
including pathnodes.h here", but the file is itself pathnodes.h!  So
change the comment to just say it's a forward declaration in one case,
and in the other case we don't need the declaration at all because it
already appeared earlier in the file.
2025-04-02 14:42:12 +02:00
David Rowley
e962d6d12b Doc: add information about partition locking
The documentation around locking of partitions for the executor startup
phase of run-time partition pruning wasn't clear about which partitions
were being locked.  Fix that.

Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/CAApHDvp738G75HfkKcfXaf3a8s%3D6mmtOLh46tMD0D2hAo1UCzA%40mail.gmail.com
Backpatch-through: 13
2025-04-02 14:03:48 +13:00
David Rowley
5672a83997 Fix planner's failure to identify multiple hashable ScalarArrayOpExprs
50e17ad28 (v14) and 29f45e299 (v15) made it so the planner could identify
IN and NOT IN clauses which have Const lists as right-hand arguments and
when an appropriate hash function is available for the data types, mark
the ScalarArrayOpExpr as hashable so the executor could execute it more
optimally by building and probing a hash table during expression
evaluation.

These commits both worked correctly when there was only a single
ScalarArrayOpExpr in the given expression being processed by the
planner, but when there were multiple, only the first was checked and any
subsequent ones were not identified, which resulted in less optimal
expression evaluation during query execution for all but the first found
ScalarArrayOpExpr.

Backpatch to 14, where 50e17ad28 was introduced.

Author: David Geier <geidav.pg@gmail.com>
Discussion: https://postgr.es/m/29a76f51-97b0-4c07-87b7-ec8e3b5345c9@gmail.com
Backpatch-through: 14
2025-04-02 11:57:27 +13:00
Tom Lane
915e889680 Fix detection and handling of strchrnul() for macOS 15.4.
As of 15.4, macOS has strchrnul(), but access to it is blocked behind
a check for MACOSX_DEPLOYMENT_TARGET >= 15.4.  But our does-it-link
configure check finds it, so we try to use it, and fail with the
present default deployment target (namely 15.0).  This accounts for
today's buildfarm failures on indri and sifaka.

This is the identical problem that we faced some years ago when Apple
introduced preadv and pwritev in the same way.  We solved that in
commit f014b1b9b by using AC_CHECK_DECLS instead of AC_CHECK_FUNCS
to check the functions' availability.  So do the same now for
strchrnul().  Interestingly, we already had a workaround for
"the link check doesn't agree with <string.h>" cases with glibc,
which we no longer need since only the header declaration is being
checked.

Testing this revealed that the meson version of this check has never
worked, because it failed to use "-Werror=unguarded-availability-new".
(Apparently nobody's tried to build with meson on macOS versions that
lack preadv/pwritev as standard.)  Adjust that while at it.  Also,
we had never put support for "-Werror=unguarded-availability-new"
into v13, but we need that now.

Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Co-authored-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/385134.1743523038@sss.pgh.pa.us
Backpatch-through: 13
2025-04-01 16:49:51 -04:00
Dean Rasheed
25303678a1 Fix MERGE with DO NOTHING actions into a partitioned table.
ExecInitPartitionInfo() duplicates much of the logic in
ExecInitMerge(), except that it failed to handle DO NOTHING
actions. This would cause an "unknown action in MERGE WHEN clause"
error if a MERGE with any DO NOTHING actions attempted to insert into
a partition not already initialised by ExecInitModifyTable().

Bug: #18871
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Gurjeet Singh <gurjeet@singh.im>
Discussion: https://postgr.es/m/18871-b44e3c96de3bd2e8%40postgresql.org
Backpatch-through: 15
2025-03-29 09:50:14 +00:00
Daniel Gustafsson
8afec4ef67 Fix guc_malloc calls for consistency and OOM checks
check_createrole_self_grant and check_synchronized_standby_slots
were allocating memory on a LOG elevel without checking if the
allocation succeeded or not, which would have led to a segfault
on allocation failure.

On top of that, a number of callsites were using the ERROR level,
relying on erroring out rather than returning false to allow the
GUC machinery handle it gracefully.  Other callsites used WARNING
instead of LOG.  While neither being not wrong, this changes all
check_ functions do it consistently with LOG.

init_custom_variable gets a promoted elevel to FATAL to keep
the guc_malloc error handling in line with the rest of the
error handling in that function which already call FATAL.  If
we encounter an OOM in this callsite there is no graceful
handling to be had, better to error out hard.

Backpatch the fix to check_createrole_self_grant down to v16
and the fix to check_synchronized_standby_slots down to v17
where they were introduced.

Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Nikita <pm91.arapov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Bug: #18845
Discussion: https://postgr.es/m/18845-582c6e10247377ec@postgresql.org
Backpatch-through: 16
2025-03-27 22:57:34 +01:00
Tom Lane
51d038da82 Prevent assertion failure in contrib/pg_freespacemap.
Applying pg_freespacemap() to a relation lacking storage (such as a
view) caused an assertion failure, although there was no ill effect
in non-assert builds.  Add an error check for that case.

Bug: #18866
Reported-by: Robins Tharakan <tharakan@gmail.com>
Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Discussion: https://postgr.es/m/18866-d68926d0f1c72d44@postgresql.org
Backpatch-through: 13
2025-03-27 13:20:23 -04:00
Michael Paquier
2afdb9dd96 doc: Correct description of values used in FSM for indexes
The implementation of FSM for indexes is simpler than heap, where 0 is
used to track if a page is in-use and (BLCKSZ - 1) if a page is free.
One comment in indexfsm.c and one description in the documentation of
pg_freespacemap were incorrect about that.

Author: Alex Friedman <alexf01@gmail.com>
Discussion: https://postgr.es/m/71eef655-c192-453f-ac45-2772fec2cb04@gmail.com
Backpatch-through: 13
2025-03-27 10:20:45 +09:00
Tomas Vondra
cb0ad70b8e Keep the decompressed filter in brin_bloom_union
The brin_bloom_union() function combines two BRIN summaries, by merging
one filter into the other. With bloom, we have to decompress the filters
first, but the function failed to update the summary to store the merged
filter. As a consequence, the index may be missing some of the data, and
return false negatives.

This issue exists since BRIN bloom indexes were introduced in Postgres
14, but at that point the union function was called only when two
sessions happened to summarize a range concurrently, which is rare. It
got much easier to hit in 17, as parallel builds use the union function
to merge summaries built by workers.

Fixed by storing a pointer to the decompressed filter, and freeing the
original one. Free the second filter too, if it was decompressed. The
freeing is not strictly necessary, because the union is called in
short-lived contexts, but it's tidy.

Backpatch to 14, where BRIN bloom indexes were introduced.

Reported by Arseniy Mukhin, investigation and fix by me.

Reported-by: Arseniy Mukhin
Discussion: https://postgr.es/m/18855-1cf1c8bcc22150e6%40postgresql.org
Backpatch-through: 14
2025-03-26 17:02:17 +01:00
Richard Guo
34fbfe1f57 Fix integer-overflow problem in scram_SaltedPassword()
Setting the iteration count for SCRAM secret generation to INT_MAX
will cause an infinite loop in scram_SaltedPassword() due to integer
overflow, as the loop uses the "i <= iterations" comparison.  To fix,
use "i < iterations" instead.

Back-patch to v16 where the user-settable GUC scram_iterations has
been added.

Author: Kevin K Biju <kevinkbiju@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAM45KeEMm8hnxdTOxA98qhfZ9CzGDdgy3mxgJmy0c+2WwjA6Zg@mail.gmail.com
2025-03-26 17:49:57 +09:00
Tom Lane
f186f90e55 Fix order of -I switches for building pg_regress.o.
We need the -I switch for libpq_srcdir to come before any -I switches
injected by configure.  Otherwise there is a risk of pulling in a
mismatched version of libpq_fe.h from someplace like
/usr/local/include, if the platform has another Postgres version
installed there.  This evidently accounts for today's buildfarm
failures on "anaconda".

In principle the -I switch for src/port/ is at similar hazard, and has
been for a very long time.  But the only .h files we keep there are
pg_config_paths.h and pthread-win32.h, neither of which get installed
on Unix-ish systems, so the odds of picking up a conflicting header
seem pretty small.  That doubtless accounts for the lack of prior
reports.

Back-patch to v17 where pg_regress acquired a build dependency on
libpq_fe.h.  We could go back further to fix the hazard for src/port/
in older branches, but it seems unlikely to be worth troubling over.

Reported-by: Nathan Bossart <nathandbossart@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/Z-MhRzoc7t-nPUQG@nathan
Backpatch-through: 17
2025-03-25 20:03:56 -04:00
Alexander Korotkov
c5cf99e9e5 postgres_fdw: Remove redundant check in semijoin_target_ok()
If a var belongs to the innerrel of the joinrel, it's not possible that
it belongs to the outerrel.  This commit removes the redundant check from
the if-clause but keeps it as an assertion.

Discussion: https://postgr.es/m/flat/CAHewXN=8aW4hd_W71F7Ua4+_w0=bppuvvTEBFBF6G0NuSXLwUw@mail.gmail.com
Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Alexander Pyhalov <a.yhalov@postgrespro.ru>
Backpatch-through: 17
2025-03-25 12:49:14 +02:00
Alexander Korotkov
729fe699e6 postgres_fdw: Avoid pulling up restrict infos from subqueries
Semi-join joins below left/right join are deparsed as
subqueries.  Thus, we can't refer to subqueries vars from upper relations.
This commit avoids pulling conditions from them.

Reported-by: Robins Tharakan <tharakan@gmail.com>
Bug: #18852
Discussion: https://postgr.es/m/CAEP4nAzryLd3gwcUpFBAG9MWyDfMRX8ZjuyY2XXjyC_C6k%2B_Zw%40mail.gmail.com
Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Backpatch-through: 17
2025-03-25 05:50:39 +02:00
Heikki Linnakangas
302ce5bd93 Fix rare assertion failure in standby, if primary is restarted
During hot standby, ExpireAllKnownAssignedTransactionIds() and
ExpireOldKnownAssignedTransactionIds() functions mark old transactions
as no-longer running, but they failed to update xactCompletionCount
and latestCompletedXid. AFAICS it would not lead to incorrect query
results, because those functions effectively turn in-progress
transactions into aborted transactions and an MVCC snapshot considers
both as "not visible". But it could surprise GetSnapshotDataReuse()
and trigger the "TransactionIdPrecedesOrEquals(TransactionXmin,
RecentXmin))" assertion in it, if the apparent xmin in a backend would
move backwards. We saw this happen when GetCatalogSnapshot() would
reuse an older catalog snapshot, when GetTransactionSnapshot() had
already advanced TransactionXmin.

The bug goes back all the way to commit 623a9ba79b in v14 that
introduced the snapshot reuse mechanism, but it started to happen more
frequently with commit 952365cded6 which removed a
GetTransactionSnapshot() call from backend startup. That made it more
likely for ExpireOldKnownAssignedTransactionIds() to be called between
GetCatalogSnapshot() and the first GetTransactionSnapshot() in a
backend.

Andres Freund first spotted this assertion failure on buildfarm member
'skink'. Reproduction and analysis by Tomas Vondra.

Backpatch-through: 14
Discussion: https://www.postgresql.org/message-id/oey246mcw43cy4qw2hqjmurbd62lfdpcuxyqiu7botx3typpax%40h7o7mfg5zmdj
2025-03-23 20:41:52 +02:00
Tom Lane
1353b1161a Fix plpgsql's handling of simple expressions in scrollable cursors.
exec_save_simple_expr did not account for the possibility that
standard_planner would stick a Materialize node atop the plan
of even a simple Result, if CURSOR_OPT_SCROLL is set.  This led
to an "unexpected plan node type" error.

This is a very old bug, but it'd only be reached by declaring a
cursor for a "SELECT simple-expression" query and explicitly
marking it scrollable, which is an odd thing to do.  So the lack
of prior reports isn't too surprising.

Bug: #18859
Reported-by: Olleg Samoylov <splarv@ya.ru>
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18859-0d5f28ac99a37059@postgresql.org
Backpatch-through: 13
2025-03-21 11:30:42 -04:00
Fujii Masao
97ce4d375a doc: Remove incorrect description about dropping replication slots.
pg_drop_replication_slot() can drop replication slots created on
a different database than the one where it is executed. This behavior
has been in place since PostgreSQL 9.4, when pg_drop_replication_slot()
was introduced.

However, commit ff539d mistakenly added the following incorrect
description in the documentation:

     For logical slots, this must be called when connected to
     the same database the slot was created on.

This commit removes that incorrect statement. A similar mistake was
also present in the documentation for the DROP_REPLICATION_SLOT
command, which has now been corrected as well.

Back-patch to all supported versions.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/OSCPR01MB14966C6BE304B5BB2E58D4009F5DE2@OSCPR01MB14966.jpnprd01.prod.outlook.com
Backpatch-through: 13
2025-03-21 12:58:43 +09:00
Andres Freund
9de86e367a meson: Flush stdout in testwrap
Otherwise the progress won't reliably be displayed during a test.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/kx6xu7suexal5vwsxpy7ybgkcznx6hgywbuhkr6qabcwxjqax2@i4pcpk75jvaa
Backpatch-through: 16
2025-03-19 09:04:09 -04:00
Masahiko Sawada
a38dce3c4a Fix assertion failure in parallel vacuum with minimal maintenance_work_mem setting.
bbf668d66fbf lowered the minimum value of maintenance_work_mem to
64kB. However, in parallel vacuum cases, since the initial underlying
DSA size is 256kB, it attempts to perform a cycle of index vacuuming
and table vacuuming with an empty TID store, resulting in an assertion
failure.

This commit ensures that at least one page is processed before index
vacuuming and table vacuuming begins.

Backpatch to 17, where the minimum maintenance_work_mem value was
lowered.

Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAD21AoCEAmbkkXSKbj4dB+5pJDRL4ZHxrCiLBgES_g_g8mVi1Q@mail.gmail.com
Backpatch-through: 17
2025-03-18 16:36:59 -07:00
Andres Freund
ee578921b6 smgr: Make SMgrRelation initialization safer against errors
In case the smgr_open callback failed, the ->pincount field would not be
initialized and the relation would not be put onto the unpinned_relns list.

This buglet was introduced in 21d9c3ee4ef7, in 17.

Discussion: https://postgr.es/m/3vae7l5ozvqtxmd7rr7zaeq3qkuipz365u3rtim5t5wdkr6f4g@vkgf2fogjirl
Backpatch-through: 17
2025-03-18 13:44:10 -04:00
Alexander Korotkov
09ef2f8df1 reindexdb: Fix the index-level REINDEX with multiple jobs
47f99a407d introduced a parallel index-level REINDEX.  The code was written
assuming that running run_reindex_command() with 'async == true' can schedule
a number of queries for a connection.  That's not true, and the second query
sent using run_reindex_command() will wait for the completion of the previous
one.

This commit fixes that by putting REINDEX commands for the same table into a
single query.

Also, this commit removes the 'async' argument from run_reindex_command(),
as only its call always passes 'async == true'.

Reported-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/202503071820.j25zn3lo4hvn%40alvherre.pgsql
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Backpatch-through: 17
2025-03-16 13:29:20 +02:00
Tom Lane
c826cd1b1d Fix ARRAY_SUBLINK and ARRAY[] for int2vector and oidvector input.
If the given input_type yields valid results from both
get_element_type and get_array_type, initArrayResultAny believed the
former and treated the input as an array type.  However this is
inconsistent with what get_promoted_array_type does, leading to
situations where the output of an ARRAY() subquery is labeled with
the wrong type: it's labeled as oidvector[] but is really a 2-D
array of OID.  That at least results in strange output, and can
result in crashes if further processing such as unnest() is applied.
AFAIK this is only possible with the int2vector and oidvector
types, which are special-cased to be treated mostly as true arrays
even though they aren't quite.

Fix by switching the logic to match get_promoted_array_type by
testing get_array_type not get_element_type, and remove an Assert
thereby made pointless.  (We need not introduce a symmetrical
check for get_element_type in the other if-branch, because
initArrayResultArr will check it.)  This restores the behavior
that existed before bac27394a introduced initArrayResultAny:
the output really is int2vector[] or oidvector[].

Comparable confusion exists when an input of an ARRAY[] construct
is int2vector or oidvector: transformArrayExpr decides it's dealing
with a multidimensional array constructor, and we end up with
something that's a multidimensional OID array but is alleged to be
of type oidvector.  I have not found a crashing case here, but it's
easy to demonstrate totally-wrong results.  Adjust that code so
that what you get is an oidvector[] instead, for consistency with
ARRAY() subqueries.  (This change also makes these types work like
domains-over-arrays in this context, which seems correct.)

Bug: #18840
Reported-by: yang lei <ylshiyu@126.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18840-fbc9505f066e50d6@postgresql.org
Backpatch-through: 13
2025-03-13 16:07:55 -04:00
Thomas Munro
e273468070 Fix read_stream.c for changing io_combine_limit.
In a couple of places, read_stream.c assumed that io_combine_limit would
be stable during the lifetime of a stream.  That is not true in at least
one unusual case: streams held by CURSORs where you could change the GUC
between FETCH commands, with unpredictable results.

Fix, by storing stream->io_combine_limit and referring only to that
after construction.  This mirrors the treatment of the other important
setting {effective,maintenance}_io_concurrency, which is stored in
stream->max_ios.

One of the cases was the queue overflow space, which was sized for
io_combine_limit and could be overrun if the GUC was increased.  Since
that coding was a little hard to follow, also introduce a variable for
better readability instead of open-coding the arithmetic.  Doing so
revealed an off-by-one thinko while clamping max_pinned_buffers to
INT16_MAX, though that wasn't a live bug due to the current limits on
GUC values.

Back-patch to 17.

Discussion: https://postgr.es/m/CA%2BhUKG%2B2T9p-%2BzM6Eeou-RAJjTML6eit1qn26f9twznX59qtCA%40mail.gmail.com
2025-03-13 15:48:02 +13:00
Amit Langote
8b2392ae3d Fix copy-paste error in datum_to_jsonb_internal()
Commit 3c152a27b06 mistakenly repeated JSONTYPE_JSON in a condition,
omitting JSONTYPE_CAST. As a result, datum_to_jsonb_internal() failed
to reject inputs that were casts (e.g., from an enum to json as in the
example below) when used as keys in JSON constructors.

This led to a crash in cases like:

  SELECT JSON_OBJECT('happy'::mood: '123'::jsonb);

where 'happy'::mood is implicitly cast to json. The missing check
meant such casted values weren’t properly rejected as invalid
(non-scalar) JSON keys.

Reported-by: Maciek Sakrejda <maciek@pganalyze.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Maciek Sakrejda <maciek@pganalyze.com>
Discussion: https://postgr.es/m/CADXhmgTJtJZK9A3Na_ry+Xrq-ghjcejBRhcRMzWZvbd__QdgJA@mail.gmail.com
Backpatch-through: 17
2025-03-13 09:56:49 +09:00
Heikki Linnakangas
e731e9d5ee Handle interrupts while waiting on Append's async subplans
We did not wake up on interrupts while waiting on async events on an
async-capable append node. For example, if you tried to cancel the
query, nothing would happen until one of the async subplans becomes
readable. To fix, add WL_LATCH_SET to the WaitEventSet.

Backpatch down to v14 where async Append execution was introduced.

Discussion: https://www.postgresql.org/message-id/37a40570-f558-40d3-b5ea-5c2079b3b30b@iki.fi
2025-03-12 20:53:16 +02:00
Tom Lane
ca0830e5a2 Build whole-row Vars the same way during parsing and planning.
makeWholeRowVar() has different rules for constructing a
whole-row Var depending on the kind of RTE it's representing.
This turns out to be problematic because the rewriter and planner
can convert view RTEs and set-returning-function RTEs into
subquery RTEs; so a whole-row Var made during planning might
look different from one made by the parser.  In isolation this
doesn't cause any problem, but if a query contains Vars made
both ways for the same varno, there are cross-checks in the
executor that will complain.  This manifests for UPDATE, DELETE,
and MERGE queries that use whole-row table references.

To fix, we need makeWholeRowVar() to produce the same result
from an inlined RTE as it would have for the original.  For
an inlined view, we can use RangeTblEntry.relid to detect
that this had been a view RTE.  For inlined SRFs, make a
data structure definition change akin to commit 47bb9db75,
and say that we won't clear RangeTblEntry.functions until
the end of planning.  That allows makeWholeRowVar() to
repeat what it would have done with the unmodified RTE.

Reported-by: Duncan Sands <duncan.sands@deepbluecap.com>
Reported-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Diagnosed-by: Tender Wang <tndrwang@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/3518c50a-ab18-482f-b916-a37263622501@deepbluecap.com
Backpatch-through: 13
2025-03-12 11:47:19 -04:00
Álvaro Herrera
ade976f8b4
BRIN: be more strict about required support procs
With improperly defined operator classes, it's possible to get a
Postgres crash because we'd try to invoke a procedure that doesn't
exist.  This is because the code is being a bit too trusting that the
opclass is correctly defined.  Add some ereport(ERROR)s for cases where
mandatory support procedures are not defined, transforming the crashes
into errors.

The particular case that was reported is an incomplete opclass in
PostGIS.

Backpatch all the way down to 13.

Reported-by: Tobias Wendorff <tobias.wendorff@tu-dortmund.de>
Diagnosed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/fb6d9a35-6c8e-4869-af80-0a4944a793a4@tu-dortmund.de
2025-03-11 12:50:35 +01:00
Heikki Linnakangas
f1ef111a09 Fix a few more redundant calls of GetLatestSnapshot()
Commit 2367503177 fixed this in RelationFindReplTupleByIndex(), but I
missed two other similar cases.

Per report from Ranier Vilela.

Discussion: https://www.postgresql.org/message-id/CAEudQArUT1dE45WN87F-Gb7XMy_hW6x1DFd3sqdhhxP-RMDa0Q@mail.gmail.com
Backpatch-through: 13
2025-03-10 19:00:08 +02:00
Heikki Linnakangas
c1dd3a9443 Fix snapshot used in logical replication index lookup
The function calls GetLatestSnapshot() to acquire a fresh snapshot,
makes it active, and was meant to pass it to table_tuple_lock(), but
instead called GetLatestSnapshot() again to acquire yet another
snapshot. It was harmless because the heap AM and all other known
table AMs ignore the 'snapshot' argument anyway, but let's be tidy.

In the long run, this perhaps should be redesigned so that snapshot
was not needed in the first place. The table AM API uses TID +
snapshot as the unique identifier for the row version, which is
questionable when the row came from an index scan with a Dirty
snapshot. You might lock a different row version when you use a
different snapshot in the table_tuple_lock() call (a fresh MVCC
snapshot) than in the index scan (DirtySnapshot). However, in the heap
AM and other AMs where the TID alone identifies the row version, it
doesn't matter. So for now, just fix the obvious albeit harmless bug.

This has been wrong ever since the table AM API was introduced in
commit 5db6df0c01, so backpatch to all supported versions.

Discussion: https://www.postgresql.org/message-id/83d243d6-ad8d-4307-8b51-2ee5844f6230@iki.fi
Backpatch-through: 13
2025-03-10 17:07:59 +02:00
Tom Lane
0f27bd14e4 Doc: improve description of window function processing.
The previous wording talked about a "single pass over the data",
which can be read as promising more than intended (to wit, that only
one WindowAgg plan node will be used).  What we promise is only what
the SQL spec requires, namely that the data not get re-sorted between
window functions with compatible PARTITION BY/ORDER BY clauses.
Adjust the wording in hopes of making this clearer.

Reported-by: Christopher Inokuchi <cinokuchi@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/CABde6B5va2wMsnM79u_x=n9KUgfKQje_pbLROEBmA9Ru5XWidw@mail.gmail.com
Backpatch-through: 13
2025-03-10 10:22:15 -04:00
Tom Lane
43847dd5e9 Don't try to parallelize array_agg() on an anonymous record type.
This doesn't work because record_recv requires the typmod that
identifies the specific record type (in our session) and
array_agg_deserialize has no convenient way to get that information.
The result is an "input of anonymous composite types is not
implemented" error.

We could probably make this work if we had to, but it does not seem
worth the trouble, given that it took this long to get a field report.
Just shut off parallelization, as though record_recv didn't exist.

Oversight in commit 16fd03e95.  Back-patch to v16 where that
came in.

Reported-by: Kirill Zdornyy <kirill@dineserve.com>
Diagnosed-by: Richard Guo <guofenglinux@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/atLI5Kce2ie1zcYjU0w_kjtVaxiYbYGTihrkLDmGZQnRDD4pnXukIATaABbnIj9pUnelC4ESvCXMm4HAyHg-v61XABaKpERj0A2IXzJZM7g=@dineserve.com
Backpatch-through: 16
2025-03-09 13:11:20 -04:00
Tom Lane
99c01aadf9 Clear errno before calling strtol() in spell.c.
Per POSIX, a caller of strtol() that wishes to check for errors must
set errno to 0 beforehand.  Several places in spell.c neglected that,
so that they risked delivering a false overflow error in case errno
had been ERANGE already.  Given the lack of field reports, this case
may be unreachable at present --- but it's surely trouble waiting to
happen, so fix it.

Author: Jacob Brazeal <jacob.brazeal@gmail.com>
Discussion: https://postgr.es/m/CA+COZaBhsq6EromFm+knMJfzK6nTpG23zJ+K2=nfUQQXcj_xcQ@mail.gmail.com
Backpatch-through: 13
2025-03-08 11:24:42 -05:00
Nathan Bossart
c3510cfc8b Assert that wrapper_handler()'s argument is within expected range.
pqsignal() already does a similar check, but strange Valgrind
reports have us wondering if wrapper_handler() is somehow getting
called with an invalid signal number.

Reported-by: Tomas Vondra <tomas@vondra.me>
Suggested-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/ace01111-f9ac-4f61-b1b1-8e9379415444%40vondra.me
Backpatch-through: 17
2025-03-07 15:23:09 -06:00
John Naylor
5c8dcf9483 Doc: correct aggressive vacuum threshold for multixact members storage
The threshold is two billion members, which was interpreted as 2GB
in the documentation. Fix to reflect that each member takes up five
bytes, which translates to about 10GB. This is not exact, because of
page boundaries. While at it, mention the maximum size 20GB.

This has been wrong since commit c552e171d16e, so backpatch to
version 14.

Author: Alex Friedman <alexf01@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/CACbFw60UOk6fCC02KsyT3OfU9Dnuq5roYxdw2aFisiN_p1L0bg@mail.gmail.com
Backpatch-through: 14
2025-03-07 10:24:06 +07:00
Tom Lane
9094eb25b7 Fix some performance issues in GIN query startup.
If a GIN index search had a lot of search keys (for example,
"jsonbcol ?| array[]" with tens of thousands of array elements),
both ginFillScanKey() and startScanKey() took O(N^2) time.
Worse, those loops were uncancelable for lack of CHECK_FOR_INTERRUPTS.

The problem in ginFillScanKey() is the brute-force search key
de-duplication done in ginFillScanEntry().  The most expedient
solution seems to be to just stop trying to de-duplicate once
there are "too many" search keys.  We could imagine working harder,
say by using a sort-and-unique algorithm instead of brute force
compare-all-the-keys.  But it seems unlikely to be worth the trouble.
There is no correctness issue here, since the code already allowed
duplicate keys if any extra_data is present.

The problem in startScanKey() is the loop that attempts to identify
the first non-required search key.  In the submitted test case, that
vainly tests all the key positions, and each iteration takes O(N)
time.  One part of that is that it's reinitializing the entryRes[]
array from scratch each time, which is entirely unnecessary given
that the triConsistentFn isn't supposed to scribble on its input.
We can easily adjust the array contents incrementally instead.
The other part of it is that the triConsistentFn may itself take
O(N) time (and does in this test case).  This is all extremely
brute force: in simple cases with AND or OR semantics, we could
know without any looping whatever that all or none of the keys
are required.  But GIN opclasses don't have any API for exposing
that knowledge, so at least in the short run there is little to
be done about that.  Put in a CHECK_FOR_INTERRUPTS so that at
least the loop is cancelable.

These two changes together resolve the primary complaint that
the test query doesn't respond promptly to cancel interrupts.
Also, while they don't completely eliminate the O(N^2) behavior,
they do provide quite a nice speedup for mid-sized examples.

Bug: #18831
Reported-by: Niek <niek.brasa@hitachienergy.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18831-e845ac44ebc5dd36@postgresql.org
Backpatch-through: 13
2025-03-06 11:54:27 -05:00
Andres Freund
3f4c5e38e8 ci: Upgrade FreeBSD image
Upgrade to the current stable version. To avoid needing commits like this in
the future, the CI image name now doesn't contain the OS version number
anymore.

Backpatch to all versions with CI support, we don't want to generate CI images
for multiple FreeBSD versions.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ3_P4JJ6tWZafjf-_XbHgG6DQGXhH-y6Yp78_bwBJjcww@mail.gmail.com
2025-03-05 10:29:08 -05:00
Álvaro Herrera
4e026be5f1
Fix ALTER TABLE error message
This bogus error message was introduced in 2013 by commit f177cbfe676d,
because of misunderstanding the processCASbits() API; at the time, no
test cases were added that would be affected by this change.  Only in
ca87c415e2fc was one added (along with a couple of typos), with an XXX
note that the error message was bogus.  Fix the whole, add some test
cases.

Backpatch all the way back.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/202503041822.aobpqke3igvb@alvherre.pgsql
2025-03-04 20:07:30 +01:00
Daniel Gustafsson
56e6a31868 doc: Expand version compatibility for pg_basebackup features
This updates the paragraph on backwards compatitibility for server
features to include --incremental which only works on servers with
v17 or newer.  Backpatch down to v17 where incremental backup was
added.

Author: David G. Johnston <David.G.Johnston@Gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAKFQuwZYfZyeTkS3g2Ovw84TsxHa796xnf-u5kfgn_auyxZk0Q@mail.gmail.com
Backpatch-through: 17
2025-03-04 12:08:27 +01:00
Richard Guo
bc5a08af3c Avoid NullTest deduction for clone clauses
In commit b262ad440, we introduced an optimization that reduces an IS
NOT NULL qual on a column defined as NOT NULL to constant true, and an
IS NULL qual on a NOT NULL column to constant false, provided we can
prove that the input expression of the NullTest is not nullable by any
outer join.  This deduction happens after we have generated multiple
clones of the same qual condition to cope with commuted-left-join
cases.

However, performing the NullTest deduction for clone clauses can be
unsafe, because we don't have a reliable way to determine if the input
expression of a NullTest is non-nullable: nullingrel bits in clone
clauses may not reflect reality, so we dare not draw conclusions from
clones about whether Vars are guaranteed not-null.

To fix, we check whether the given RestrictInfo is a clone clause in
restriction_is_always_true and restriction_is_always_false, and avoid
performing any reduction if it is.

There are several ensuing plan changes in predicate.out, and we have
to modify the tests to ensure that they continue to test what they are
intended to.  Additionally, this fix causes the test case added in
f00ab1fd1 to no longer trigger the bug that commit fixed, so we also
remove that test case.

Back-patch to v17 where this bug crept in.

Reported-by: Ronald Cruz <cruz@rentec.com>
Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/f5320d3d-77af-4ce8-b9c3-4715ff33f213@rentec.com
Backpatch-through: 17
2025-03-04 16:17:19 +09:00
Tom Lane
d6dd2a02ba Fix broken handling of domains in atthasmissing logic.
If a domain type has a default, adding a column of that type (without
any explicit DEFAULT clause) failed to install the domain's default
value in existing rows, instead leaving the new column null.  This
is unexpected, and it used to work correctly before v11.  The cause
is confusion in the atthasmissing mechanism about which default value
to install: we'd only consider installing an explicitly-specified
default, and then we'd decide that no table rewrite is needed.

To fix, take the responsibility for filling attmissingval out of
StoreAttrDefault, and instead put it into ATExecAddColumn's existing
logic that derives the correct value to fill the new column with.
Also, centralize the logic that determines the need for
default-related table rewriting there, instead of spreading it over
four or five places.

In the back branches, we'll leave the attmissingval-filling code
in StoreAttrDefault even though it's now dead, for fear that some
extension may be depending on that functionality to exist there.
A separate HEAD-only patch will clean up the now-useless code.

Reported-by: jian he <jian.universality@gmail.com>
Author: jian he <jian.universality@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxHFssPvkP1we7WMhPD_1kwgbG52o=kQgL+TnVoX5LOyCQ@mail.gmail.com
Backpatch-through: 13
2025-03-03 12:43:29 -05:00
Tom Lane
d69c781084 Fix pg_strtof() to not crash on NULL endptr.
We had managed not to notice this simple oversight because none
of our calls exercised the case --- until commit 8f427187d.
That led to pg_dump crashing on any platform that uses this code
(currently Cygwin and Mingw).

Even though there's no immediate bug in the back branches, backpatch,
because a non-POSIX-compliant strtof() substitute is trouble waiting
to happen for extensions or future back-patches.

Diagnosed-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/339b3902-4e98-4e31-a744-94e43b7b9292@gmail.com
Backpatch-through: 13
2025-03-01 14:22:56 -05:00
Michael Paquier
ee78823ff5 pg_upgrade: Fix inconsistency in memory freeing
The function in charge of freeing the memory from a result created by
PQescapeIdentifier() has to be PQfreemem(), to ensure that both
allocation and free come from libpq.

One spot in pg_upgrade was not respecting that for pg_database's
datlocale (daticulocale in v16) when the collation provider is libc (aka
datlocale/daticulocale is NULL) with an allocation done using
pg_strdup() and a free with PQfreemem().  The code is changed to always
use PQescapeLiteral() when processing the input.

Oversight in 9637badd9f92.  This commit is similar to 48e4ae9a0707 and
5b94e2753439.

Author: Michael Paquier <michael@paquier.xyz>
Co-authored-by: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://postgr.es/m/Z601RQxTmIUohdkV@paquier.xyz
Backpatch-through: 16
2025-02-28 10:15:32 +09:00
Michael Paquier
f903d4da92 pg_amcheck: Fix inconsistency in memory freeing
The function in charge of freeing the memory from a result created by
PQescapeIdentifier() has to be PQfreemem(), to ensure that both
allocation and free come from libpq, but one spot in pg_amcheck was
missing that.

Oversight in b859d94c6389.

Author: Ranier Vilela <ranier.vf@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CAEudQArD_nKSnYCNUZiPPsJ2tNXgRmLbXGSOrH1vpOF_XtP0Vg@mail.gmail.com
Discussion: https://postgr.es/m/CAEudQArbTWVSbxq608GRmXJjnNSQ0B6R7CSffNnj2hPWMUsRNg@mail.gmail.com
Backpatch-through: 14
2025-02-27 14:05:55 +09:00
Amit Kapila
7c906c5b46 Doc: Fix pg_copy_logical_replication_slot description.
This commit documents that the failover option is not copied when using
the pg_copy_logical_replication_slot function.

In passing, we modify the comments in the function clarifying the reason
for this behavior.

Reported-by: <duffieldzane@gmail.com>
Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 17, where it was introduced
Discussion: https://postgr.es/m/173976850802.682632.11315364077431550250@wrigleys.postgresql.org
2025-02-25 09:22:16 +05:30
Masahiko Sawada
174952ece1 Fix assertion when decoding XLOG_PARAMETER_CHANGE on promoted primary.
When a standby replays an XLOG_PARAMETER_CHANGE record that lowers
wal_level below logical, we invalidate all logical slots in hot
standby mode. However, if this record was replayed while not in hot
standby mode, logical slots could remain valid even after promotion,
potentially causing an assertion failure during WAL record decoding.

To fix this issue, this commit adds a check for hot_standby status
when restoring a logical replication slot on standbys. This check
ensures that logical slots are invalidated when they become
incompatible due to insufficient wal_level during recovery.

Backpatch to v16 where logical decoding on standby was introduced.

Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/CAD21AoABoFwGY_Rh2aeE6tEq3HkJxf0c6UeOXn4VV9v6BAQPSw%40mail.gmail.com
Backpatch-through: 16
2025-02-24 14:03:07 -08:00