The callback function pa_shutdown() accesses MyLogicalRepWorker which may
not be initialized if there is an error during the initialization of the
parallel apply worker. The other problem is that by the time it is invoked
even after the initialization of the worker, the MyLogicalRepWorker will
be reset by another callback logicalrep_worker_onexit. So, it won't have
the required information.
To fix this, register the shutdown callback after we are attached to the
worker slot.
After this fix, we observed another issue which is that sometimes the
leader apply worker tries to receive the message from the error queue that
might already be detached by the parallel apply worker leading to an
error. To prevent such an error, we ensure that the leader apply worker
detaches from the parallel apply worker's error queue before stopping it.
Reported-by: Sawada Masahiko
Author: Hou Zhijie
Reviewed-by: Sawada Masahiko, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDo+yUwNq6nTrvE2h9bB2vZfcag=jxWc7QxuWCmkDAqcA@mail.gmail.com
Commit 1de58df4, which added page-level freezing, taught VACUUM to reuse
each page's "set-visibility-map" snapshotConflictHorizon for freezing
(at least in the vast majority of cases where freezing went ahead).
This made VACUUM FREEZE much less prone to generating recovery conflicts
on standbys; VACUUM FREEZE became only slightly more likely to cause
recovery conflicts than an equivalent VACUUM.
Update old documentation that specifically warned of the likelihood of
recovery conflicts from VACUUM FREEZE. Explain the same general issue
(the issue of VACUUM generating recovery conflicts even in the absence
of dead row cleanup) using the example of conflicts caused by VISIBLE
WAL records.
If an SRF in the FROM clause references a table having row-level
security policies, and we inline that SRF into the calling query,
we neglected to mark the plan as potentially dependent on which
role is executing it. This could lead to later executions in the
same session returning or hiding rows that should have been hidden
or returned instead.
Our thanks to Wolfgang Walther for reporting this problem.
Stephen Frost and Tom Lane
Security: CVE-2023-2455
The two methods don't cooperate, so set_config_option("search_path",
...) has been ineffective under non-empty overrideStack. This defect
enabled an attacker having database-level CREATE privilege to execute
arbitrary code as the bootstrap superuser. While that particular attack
requires v13+ for the trusted extension attribute, other attacks are
feasible in all supported versions.
Standardize on the combination of NewGUCNestLevel() and
set_config_option("search_path", ...). It is newer than
PushOverrideSearchPath(), more-prevalent, and has no known
disadvantages. The "override" mechanism remains for now, for
compatibility with out-of-tree code. Users should update such code,
which likely suffers from the same sort of vulnerability closed here.
Back-patch to v11 (all supported versions).
Alexander Lakhin. Reported by Alexander Lakhin.
Security: CVE-2023-2454
This wait event was documented as "CommitTsBuffer" since its
introduction, but the code named it "CommitTSBuffer". This commit fixes
the code to follow the term documented, which is also more consistent
with the naming of the other wait events used for commit timestamps.
Introduced by 5da1493.
Author: Alexander Lakhin
Discussion: https://postgr.es/m/e8c38840-596a-83d6-bd8d-cebc51111572@gmail.com
Backpatch-through: 13
Commit f75cec4fff87 changed the order in which the relations are
permission-checked in RI_Initial_Check, which the sepgsql test is
sensitive to. Adapt.
Discussion: https://postgr.es/m/3468125.1683238309@sss.pgh.pa.us
Commit 153e215677 added the portlock directory. This is created in
$ENV{top_builddir} if it is set. Under PGXS, top_builddir points into
the installation directory, which is not necessarily writable and in
any case inappropriate to use by a test suite. The cause of the
problem is that the prove_installcheck target in Makefile.global
exports top_builddir, which isn't useful (since no other Perl code
actually reads it) and breaks this use case. The reason this code is
there is probably that is has been dragged around with various other
changes, in particular a0fc813266, but without a real purpose of its
own. By just removing the exporting of top_builddir in
prove_installcheck, the portlock directory then ends up under
tmp_check in the build directory, which is more suitable.
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://www.postgresql.org/message-id/78d1cfa6-0065-865d-584b-cde6d8c18aff@enterprisedb.com
If we exit a PG_TRY block early via "continue", "break", "goto", or
"return", we'll skip unwinding its exception stack. This change
moves a couple of such "return" statements in PL/Python out of
PG_TRY blocks. This was introduced in d0aa965c0a and affects all
supported versions.
We might also be able to add compile-time checks to prevent
recurrence, but that is left as a future exercise.
Reported-by: Mikhail Gribkov, Xing Guo
Author: Xing Guo
Reviewed-by: Michael Paquier, Andres Freund, Tom Lane
Discussion: https://postgr.es/m/CAMEv5_v5Y%2B-D%3DCO1%2Bqoe16sAmgC4sbbQjz%2BUtcHmB6zcgS%2B5Ew%40mail.gmail.com
Discussion: https://postgr.es/m/CACpMh%2BCMsGMRKFzFMm3bYTzQmMU5nfEEoEDU2apJcc4hid36AQ%40mail.gmail.com
Backpatch-through: 11 (all supported versions)
RI_Initial_Check was setting up a list of RTEPermissionInfo for
ExecCheckPermissions() wrong, and the problem is subtle enough that it
doesn't have any immediate effect in core code. However, if an
extension is using the ExecutorCheckPerms_hook, then it would get the
wrong parameters and perhaps arrive at a wrong conclusion, or outright
malfunction. Fix by constructing that list and the RTE list more
honestly.
We also add an assertion check to verify that these lists match. This
new assertion would have caught this bug.
Co-authored-by: Олег Целебровский (Oleg Tselebrovskii) <o.tselebrovskiy@postgrespro.ru>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/3722b7a2cbe27a1796ee40824bd86dd1@postgrespro.ru
These functions incautiously fetched the array's first lower bound
even when the array is zero-dimensional, thus fetching the word
after the allocated array space. While almost always harmless,
with very bad luck this could result in SIGSEGV. Fix by adding
an early exit for empty input.
Per bug #17920 from Alexander Lakhin.
Discussion: https://postgr.es/m/17920-f7c228c627b6d02e%40postgresql.org
Like plperl before f47004add, plpython wasn't being sufficiently
careful about checking that list-of-list structures represent
rectangular arrays, so that it would accept some cases in which
different parts of the "array" are nested to different depths.
This was exacerbated by Python's weak distinction between
sequences and lists, so that in some cases strings could get
treated as though they are lists (and burst into individual
characters) even though a different ordering of the upper-level
list would give a different result.
Some of this behavior was unreachable (without risking a crash)
before 81eaaf65e. It seems like a good idea to clean it all up
in the same releases, rather than shipping a non-crashing but
nonetheless visibly buggy behavior in the name of minimal change.
Hence, back-patch.
Per bug #17912 and further testing by Alexander Lakhin.
Discussion: https://postgr.es/m/17912-82ceed78731d9cdc@postgresql.org
This reverts commit ec386948948c and its fixup 589bb816499e.
This change was intended to support query planning avoiding acquisition
of locks on partitions that were going to be pruned; however, the
overall project took a different direction at [1] and this bit is no
longer needed. Put things back the way they were as agreed in [2], to
avoid unnecessary complexity.
Discussion: [1] https://postgr.es/m/4191508.1674157166@sss.pgh.pa.us
Discussion: [2] https://postgr.es/m/20230502175409.kcoirxczpdha26wt@alvherre.pgsql
Add:
- "Restartpoint"
- "Log sequence number"
"LSN" was already listed in the Acronyms appendix, but it is more
suitable as a glossary entry, so move it there and have the acronyms
entry link into the glossary.
Also turn on DocBook parameter glossentry.show.acronym to show
acronyms for glossary entries, which is being used here.
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/60915312-62cd-9c94-0d94-556023ece45f%40enterprisedb.com
During exit, the logical replication apply worker tries to release session
level locks, if any. However, if the apply worker exits due to an error
before its connection is initialized, trying to release locks can lead to
assertion failure. The locks will be acquired once the worker is
initialized, so we don't need to release them till the worker
initialization is complete.
Reported-by: Alexander Lakhin
Author: Hou Zhijie based on inputs from Sawada Masahiko and Amit Kapila
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/2185d65f-5aae-3efa-c48f-fb42b173ef5c@gmail.com
plperl, plpython, and pltcl all provide query-execution functions
that are thin wrappers around SPI_execute() or its variants.
The SPI functions document their row-count limit arguments clearly,
as "maximum number of rows to return, or 0 for no limit". However
the PLs' documentation failed to explain this special behavior of
zero, so that a reader might well assume it means "fetch zero
rows". Improve that.
Daniel Gustafsson and Tom Lane, per report from Kieran McCusker
Discussion: https://postgr.es/m/CAGgUQ6H6qYScctOhktQ9HLFDDoafBKHyUgJbZ6q_dOApnzNTXg@mail.gmail.com
The <source>_traverse_files functions take a callback for processing
files, but both the local and libpq source implementations called the
function directly without using the callback argument. While there is
no bug right now as the function called is the same as the callback,
fix by calling the callback to reduce the risk of subtle bugs in the
future.
Author: Junwang Zhao <zhjwpku@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAEG8a3Jdwgh+PZr2zh1=t8apA4Yz8tKq+uubPqoCt14nvWKHEw@mail.gmail.com
plperl_array_to_datum() wasn't sufficiently careful about checking
that nested lists represent a rectangular array structure; it would
accept inputs such as "[1, []]". This is a bit related to the
PL/Python bug fixed in commit 81eaaf65e, but it doesn't seem to
provide any direct route to a memory stomp. Instead the likely
failure mode is for makeMdArrayResult to be passed fewer Datums than
the claimed array dimensionality requires, possibly leading to a wild
pointer dereference and SIGSEGV.
Per report from Alexander Lakhin. It's been broken for a long
time, so back-patch to all supported branches.
Discussion: https://postgr.es/m/5ebae5e4-d401-fadf-8585-ac3eaf53219c@gmail.com
If PLySequence_ToArray came across a zero-length sublist, it'd compute
the overall array size as zero, possibly leading to a memory clobber.
(This would likely qualify as a security bug, were it not that plpython
is an untrusted language already.)
I think there are other corner-case issues in this code as well, notably
that the error messages don't match the core code and for some ranges
of array sizes you'd get "invalid memory alloc request size" rather than
the intended message about array size.
Really this code has no business doing its own array size calculation
at all, so remove the faulty code in favor of using ArrayGetNItems().
Per bug #17912 from Alexander Lakhin. Bug seems to have come in with
commit 94aceed31, so back-patch to all supported branches.
Discussion: https://postgr.es/m/17912-82ceed78731d9cdc@postgresql.org
CREATE SCHEMA AUTHORIZATION with appended schema elements can lead to
crashes when comparing the schema name of the query with the schemas
used in the qualification of some clauses in the elements' queries.
The origin of the problem is that the transformation routine for the
elements listed in a CREATE SCHEMA query uses as new, expected, schema
name the one listed in CreateSchemaStmt itself. However, depending on
the query, CreateSchemaStmt.schemaname may be NULL, being computed
instead from the role specification of the query given by the
AUTHORIZATION clause, that could be either:
- A user name string, with the new schema name being set to the same
value as the role given.
- Guessed from CURRENT_ROLE, SESSION_ROLE or CURRENT_ROLE, with a new
schema name computed from the security context where CREATE SCHEMA is
running.
Regression tests are added for CREATE SCHEMA with some appended elements
(some of them with schema qualifications), covering also some role
specification patterns.
While on it, this simplifies the context structure used during the
transformation of the elements listed in a CREATE SCHEMA query by
removing the fields for the role specification and the role type. They
were not used, and for the role specification this could be confusing as
the schema name may by extracted from that at the beginning of
CreateSchemaCommand().
This issue exists for a long time, so backpatch down to all the versions
supported.
Reported-by: Song Hongyu
Author: Michael Paquier
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/17909-f65c12dfc5f0451d@postgresql.org
Backpatch-through: 11
Commit 7d71d3dd08 changed resetting the VacuumFailsafeActive flag to an
assertion since the flag is reset before starting vacuuming a relation.
This however failed to take recursive calls of vacuum_rel() and vacuum
of TOAST tables into consideration. Fix by reverting back to resettting
the flag.
Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reported-by: John Naylor <john.naylor@enterprisedb.com>
Discussion: https://postgr.es/m/CAFBsxsFz=GqaG5Ens5aNgVYoV2Y+pfMUijX0ku+CCkWfALwiqg@mail.gmail.com
The test to ensure that decoding changes via logical slot from another
database will fail was incorrectly done on the primary node instead of on
the standby node.
In the passing, make the test to wait for replay catchup by using
wait_for_replay_catchup(). This will make it consistent with the way we
wait at other places in the test.
Author: Shi yu
Reviewed-by: Bertrand Drouvot, Amit Kapila
Discussion: https://postgr.es/m/OSZPR01MB6310B0A507A0F2A2D379F38CFD6A9@OSZPR01MB6310.jpnprd01.prod.outlook.com
The call to XLogGetReplicationSlotMinimumLSN() might return a
greater LSN than the one given to the function. Subsequent segment
number calculations might then underflow, which could result in
unexpected behavior when removing or recyling WAL files. This was
introduced with max_slot_wal_keep_size in c655077639. To fix, skip
the block of code for replication slots if the LSN is greater.
Reported-by: Xu Xingwang
Author: Kyotaro Horiguchi
Reviewed-by: Junwang Zhao
Discussion: https://postgr.es/m/17903-4288d439dee856c6%40postgresql.org
Backpatch-through: 13
The current code unintentionally uses the wrong datum to construct an array.
The bug was introduced by 096dd80f3c, so no backpatching is needed.
Reported-by: David Steele
Discussion: https://postgr.es/m/d46f9265-ff3c-6743-2278-6772598233c2%40pgmasters.net
Author: Nathan Bossart
Reviewed-by: David Steele, Tom Lane
Python 3 changed the behavior of PyMapping_Check(), breaking the
test in plpython_to_hstore() that verifies whether a function result
to be transformed is acceptable. A backwards-compatible fix is to
first verify that the object doesn't pass PySequence_Check().
Perhaps accidentally, our other uses of PyMapping_Check() already
follow uses of PySequence_Check(), so that no other bugs were
created by this change.
Per bug #17908 from Alexander Lakhin. Back-patch to all supported
branches.
Dmitry Dolgov and Tom Lane
Discussion: https://postgr.es/m/17908-3f19a125d56a11d6@postgresql.org
As written, pg_dump would call twice parse_compress_specification() for
the custom and directory formats to build a compression specification if
no compression option is defined, as these formats should be compressed
by default when compiled with zlib, or use no compression without zlib.
This made the code logic quite confusing, and the first compression
specification built would be incorrect before being overwritten by the
second one.
Rather than creating two compression specifications, this commit changes
a bit the order of the checks for the compression options so as
compression_algorithm_str is now set to a correct value for the custom
and format directory when no compression option is defined. This makes
the code easier to understand, as parse_compress_specification() is now
called once for all the format, with or without user-specified
compression methods. One comment was also confusing for the non-zlib
case, so remove it while on it.
This code has been introduced in 5e73a60 when adding support for
compression specifications in pg_dump.
Per discussion with Justin Pryzby and Georgios Kokolatos.
Discussion: https://postgr.es/m/20230225050214.GH1653@telsasoft.com
Commit ce6b672e44 changed dumping GRANT commands to ensure that
grantors already have an ADMIN OPTION on the role for which it
is granting permissions. Looping over the grants per role has a
stop condition on dumping the grant statements, but accidentally
missed updating the variable for the conditional check.
Author: Andreas Scherbaum <ads@pgug.de>
Co-authored-by: Artur Zakirov <zaartur@gmail.com>
Discussion: https://postgr.es/m/de44299d-cd31-b41f-2c2a-161fa5e586a5@pgug.de
Commit 664d75753 added the BackgroundPsql module with helper functions
for tests running interactive or background psql tasks. The new module
was however not added to the install rules of the build systems.
Reported-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/c0ba3008-dbc8-e53f-29f2-2e9abe72b2a2@enterprisedb.com
The recently added inclusion of guc.h in smgr.h is not necessary and
introduces more server-related stuff. Removing the directive helps
avoid potential issues with including sgmr.h in frontends.
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20230425.115748.2130383825066921512.horikyota.ntt%40gmail.com
SLRUFlushSync has been accidently removed during dee663f, that has moved
the flush of the SLRU files to the checkpointer, so add it back. The
issue has been noticed by Thomas when checking for orphaned wait
events.
Author: Thomas Munro
Reviewed-by: Bharath Rupireddy
Discussion: https://postgr.es/m/CA+hUKGK6tqm59KuF1z+h5Y8fsWcu5v8+84kduSHwRzwjB2aa_A@mail.gmail.com
Commit 1021bd6a89 excluded autovacuum workers from cost-limit balance
calculations when per-relation options were set. The code checks for
limit and cost_delay being greater than zero, but since cost_delay can
be set to -1 the test needs to check for greater than or zero.
Backpatch to all supported branches since 1021bd6a89 was backpatched
all the way at the time.
Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAD21AoBS7o6Ljt_vfqPQPf67AhzKu3fR0iqk8B=vVYczMugKMQ@mail.gmail.com
Backpatch-through: v11 (all supported branches)
The leak would show up when using batch inserts with foreign tables
included in a partition tree, as the slots used in the batch were not
reset once processed. In order to fix this problem, some
ExecClearTuple() are added to clean up the slots used once a batch is
filled and processed, mapping with the number of slots currently in use
as tracked by the counter ri_NumSlots.
This buffer refcount leak has been introduced in b676ac4 with the
addition of the executor facility to improve bulk inserts for FDWs, so
backpatch down to 14.
Alexander has provided the patch (slightly modified by me). The test
for postgres_fdw comes from me, based on the test case that the author
has sent in the report.
Author: Alexander Pyhalov
Discussion: https://postgr.es/m/b035780a740efd38dc30790c76927255@postgrespro.ru
Backpatch-through: 14