Previously TerminateLocalBufferIO() was open-coded in multiple places, which
doesn't seem like a great idea. While TerminateLocalBufferIO() currently is
rather simple, an upcoming patch requires additional code to be added to
TerminateLocalBufferIO(), making this deduplication particularly worthwhile.
For some reason FlushRelationBuffers() previously cleared BM_JUST_DIRTIED,
even though that's never set for temporary buffers. This is not carried over
as part of this change.
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_b9anbWzEs5AAF9WCvcEVmgz-1AkHSQ-CLLy-p7WHzvFw@mail.gmail.com
Previously, there were three copies of this code, two of them
identical. There's no good reason for that.
This change is nice on its own, but the main motivation is the AIO patchset,
which needs to add extra checks to the deduplicated code, which of course is
easier if there is only one version.
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_b9anbWzEs5AAF9WCvcEVmgz-1AkHSQ-CLLy-p7WHzvFw@mail.gmail.com
If PinLocalBuffer() were to modify the buf_state, the buf_state in
GetLocalVictimBuffer() would be out of date. Currently that does not happen,
as PinLocalBuffer() only modifies the buf_state if adjust_usagecount=true and
GetLocalVictimBuffer() passes false.
However, future changes could easily break that assumption - it cost me a few hours
to debug the consequences.
The minimal fix would be to just refetch the buf_state after calling
PinLocalBuffer(), but the same danger exists in later parts of the
function. Instead, declare buf_state in the narrower scopes and re-read the
state in conditional branches. Besides being safer, it also fits well with
an upcoming set of cleanup patches that move the contents of the conditional
branches in GetLocalVictimBuffer() into helper functions.
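For illustration, the safer shape looks roughly like this (simplified;
FlushLocalBuffer() is a stand-in for one of the helper functions mentioned
above):

    PinLocalBuffer(bufHdr, false);  /* could modify buf_state some day */

    {
        /* take a fresh read in this narrow scope, rather than trusting
         * a function-scoped copy fetched before the call above */
        uint32  buf_state = pg_atomic_read_u32(&bufHdr->state);

        if (buf_state & BM_DIRTY)
            FlushLocalBuffer(bufHdr);
    }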
I "broke" this in 794f2594479.
Arguably this should be backpatched, but as the relevant functions are not
exported and there is no actual misbehaviour, I chose to not backpatch, at
least for now.
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_b9anbWzEs5AAF9WCvcEVmgz-1AkHSQ-CLLy-p7WHzvFw@mail.gmail.com
Commit 27bdec06841 uses a loop variable that is not strictly local to
the loop. Perlcritic disapproves, and there's really no reason for it, as
the variable is not used outside the loop.
Per buildfarm animals koel and crake.
Improve performance and reduce table sizes for case mapping.
The main case mapping table stores only 16-bit offsets, which can be
used to look up the mapped code point in any of the case tables (fold,
lower, upper, or title case). Simple case pairs point to the same
offsets.
Generate a function in generate-unicode_case_table.pl that consists of
nested branches testing for specific codepoint ranges to determine the
offset in the main table.
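For illustration, the generated function has roughly this shape (the
ranges, offsets, and array names here are hypothetical; the real ones are
derived from the Unicode data files):

    static uint16
    case_index(uint32 cp)
    {
        /* nested range tests narrowing down to a table offset */
        if (cp < 0x0180)
        {
            if (cp < 0x0100)
                return case_offsets_latin1[cp];
            return case_offsets_latin_ext[cp - 0x0100];
        }
        if (cp >= 0x0370 && cp < 0x0400)
            return case_offsets_greek[cp - 0x0370];
        return 0;               /* no case mapping */
    }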
Other approaches were considered, such as representing these ranges as
another structure (rather than branches in a generated function), a radix
tree, or perfect hashing. The
author implemented and tested these alternatives and settled on the
generated branches.
Author: Alexander Borisov <lex.borisov@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/7cac7e66-9a3b-4e3f-a997-42aa0c401f80%40gmail.com
After pushing the bitmap iterator into table-AM specific code (as part
of making bitmap heap scan use the read stream API in 2b73a8cd33b7),
scan_bitmap_next_block() no longer returns the current block number.
Since scan_bitmap_next_block() isn't returning any relevant information
to bitmap table scan code, it makes more sense to get rid of it.
Now, bitmap table scan code only calls table_scan_bitmap_next_tuple(),
and the heap AM implementation of scan_bitmap_next_block() is a local
helper in heapam_handler.c.
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/flat/CAAKRu_ZwCwWFeL_H3ia26bP2e7HiKLWt0ZmGXPVwPO6uXq0vaA%40mail.gmail.com
Make Bitmap Heap Scan use the read stream API instead of invoking
ReadBuffer() for each block indicated by the bitmap.
The read stream API handles prefetching, so remove all of the explicit
prefetching from bitmap heap scan code.
Now, heap table AM implements a read stream callback which uses the
bitmap iterator to return the next required block to the read stream
code.
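Sketched, the callback has roughly this shape (simplified; BitmapScanState
here is a stand-in for the heap AM's private scan state, and the actual
function name differs):

    static BlockNumber
    bitmap_stream_read_next(ReadStream *stream,
                            void *callback_private_data,
                            void *per_buffer_data)
    {
        BitmapScanState *state = callback_private_data;
        TBMIterateResult *tbmres = per_buffer_data;

        /* ask the bitmap iterator for the next block to read */
        if (!tbm_private_iterate(state->iterator, tbmres))
            return InvalidBlockNumber;  /* bitmap exhausted */

        return tbmres->blockno;
    }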
Tomas Vondra conducted extensive regression testing of this feature.
Andres Freund, Thomas Munro, and I analyzed regressions and Thomas Munro
patched the read stream API.
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Tested-by: Tomas Vondra <tomas@vondra.me>
Tested-by: Andres Freund <andres@anarazel.de>
Tested-by: Thomas Munro <thomas.munro@gmail.com>
Tested-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_ZwCwWFeL_H3ia26bP2e7HiKLWt0ZmGXPVwPO6uXq0vaA%40mail.gmail.com
Remove the TBMIterateResult member from the TBMPrivateIterator and
TBMSharedIterator and make tbm_[shared|private_]iterate() take a
TBMIterateResult as a parameter.
This allows tidbitmap API users to manage multiple TBMIterateResults per
scan. This is required for bitmap heap scan to use the read stream API,
with which there may be multiple I/Os in flight at once, each one with a
TBMIterateResult.
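For example, a caller can now keep one caller-owned result per pending
read, roughly like this (hypothetical caller code; MAX_IN_FLIGHT is made
up, and exhaustion is assumed to be reported via the return value):

    TBMIterateResult results[MAX_IN_FLIGHT];    /* caller-owned storage */
    int         nqueued = 0;

    while (nqueued < MAX_IN_FLIGHT &&
           tbm_private_iterate(iterator, &results[nqueued]))
        nqueued++;              /* one result per I/O in flight */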
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/d4bb26c9-fe07-439e-ac53-c0e244387e01%40vondra.me
Make the distance control heuristics simpler and more aggressive in
preparation for asynchronous I/O.
The v17 version of read_stream.c made a conservative choice to limit the
look-ahead distance when streaming sequential blocks, because it
couldn't benefit very much from looking ahead further yet. It had a
three-behavior model where only random I/O would rapidly increase the
look-ahead distance, to support read-ahead advice. Sequential I/O would
move it towards the io_combine_limit setting, just enough to build one
full-sized synchronous I/O at a time, and then expect kernel read-ahead
to avoid I/O stalls.
That already left I/O performance on the table with advice-based I/O
concurrency, since sequential blocks could be followed by random jumps,
eg with the proposed streaming Bitmap Heap Scan patch.
It is time to delete the cautious middle option and adjust the distance
based on recent I/O needs only, since asynchronous reads will need to be
started ahead of time whether random or sequential. It is still limited
by io_combine_limit, *_io_concurrency, buffer availability and
strategy ring size, as before.
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Tested-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
read_stream.c tries not to issue read-ahead advice when it thinks the
kernel's own read-ahead should be active, ie when using buffered I/O and
reading sequential blocks. It previously gave up too easily, and issued
advice only for the first read of up to io_combine_limit blocks in a
larger range of sequential blocks after a random jump. The following read
could suffer an avoidable I/O stall.
Fix, by continuing to issue advice until the corresponding preadv()
calls catch up with the start of the region we're currently issuing
advice for, if ever. That's when the kernel actually sees the
sequential pattern. Advice is now disabled only when the stream is
entirely sequential as far as we can see in the look-ahead window, or
in other words, when a sequential region is larger than we can cover
with the current io_concurrency and io_combine_limit settings.
While refactoring the advice control logic, also get rid of the
"suppress_advice" argument that was passed around between functions to
skip useless posix_fadvise() calls immediately followed by preadv().
read_stream_start_pending_read() can figure that out, so let's
concentrate knowledge of advice heuristics in fewer places (our goal
being to make advice-based I/O concurrency a legacy mode soon).
The problem cases were revealed by Tomas Vondra's extensive regression
testing with many different disk access patterns using Melanie
Plageman's streaming Bitmap Heap Scan patch, in a battle against the
venerable always-issue-advice-and-always-one-block-at-a-time code.
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Reported-by: Tomas Vondra <tomas@vondra.me>
Reported-by: Andres Freund <andres@anarazel.de>
Tested-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
Discussion: https://postgr.es/m/CA%2BhUKGJ3HSWciQCz8ekP1Zn7N213RfA4nbuotQawfpq23%2Bw-5Q%40mail.gmail.com
This commit introduces a new GUC, log_lock_failure, which controls whether
a detailed log message is produced when a lock acquisition fails. Currently,
it only supports logging lock failures caused by SELECT ... NOWAIT.
The log message includes information about all processes holding or
waiting for the lock that couldn't be acquired, helping users analyze and
diagnose the causes of lock failures.
Currently, this option does not log failures from SELECT ... SKIP LOCKED,
as that could generate excessive log messages if many locks are skipped,
causing unnecessary noise.
This mechanism can be extended in the future to support logging
lock failures from other commands, such as LOCK TABLE ... NOWAIT.
Author: Yuki Seino <seinoyu@oss.nttdata.com>
Co-authored-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/411280a186cc26ef7034e0f2dfe54131@oss.nttdata.com
This commit improves efficiency in FastPathTransferRelationLocks()
and GetLockConflicts(), which iterate over PGPROCs to search for
fast-path locks.
Previously, these functions recalculated the fast-path group during
every loop iteration, even though it remained constant. This update
optimizes the process by calculating the group once and reusing it
throughout the loop.
The functions also now skip empty fast-path groups, avoiding
unnecessary scans of their slots. Additionally, groups belonging to
inactive backends (with pid=0) are always empty, so checking
the group is sufficient to bypass these backends, further enhancing
performance.
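In sketch form (simplified; FAST_PATH_GROUP_BITS() is an illustrative
accessor, not the actual macro name):

    uint32  group = FAST_PATH_REL_GROUP(relid); /* now loop-invariant */

    for (int i = 0; i < ProcGlobal->allProcCount; i++)
    {
        PGPROC     *proc = &ProcGlobal->allProcs[i];

        /* empty group: nothing to scan; also skips pid == 0 backends */
        if (FAST_PATH_GROUP_BITS(proc, group) == 0)
            continue;

        /* ... examine only this group's slots ... */
    }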
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/07d5fd6a-71f1-4ce8-8602-4cc6883f4bd1@oss.nttdata.com
PrepareSortSupportFromIndexRel() was accepting btree strategy numbers
purely for the purpose of comparing it later against btree strategies
to determine if the sort direction was forward or reverse. Change
that. Instead, pass a bool directly, to indicate the same without an
unfortunate assumption that a strategy number refers specifically to a
btree strategy. (This is similar in spirit to commits 0d2aa4d4937 and
c594f1ad2ba.)
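Illustratively, a call site changes along these lines (hypothetical
caller):

    -   PrepareSortSupportFromIndexRel(indexRel, BTGreaterStrategyNumber, ssup);
    +   PrepareSortSupportFromIndexRel(indexRel, true /* reverse */, ssup);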
(This could arguably be simplified further by having the callers fill
in ssup_reverse directly. But this way, it preserves consistency by
having all PrepareSortSupport*() variants be responsible for filling
in ssup_reverse.)
Moreover, remove the hardcoded check against BTREE_AM_OID, and check
against amcanorder instead, which is the actual requirement.
Co-authored-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
It doesn't actually work, even with allow_system_table_mods turned on:
the ALTER TABLE operation is rejected by ATSimplePermissions(), so even
the error message we're adding in this commit is unreachable.
Add a test case for it.
Author: Nikolay Shaplov <dhyan@nataraj.su>
Discussion: https://postgr.es/m/1913854.tdWV9SEqCh@thinkpad-pgpro
To avoid pinning too much of the buffer pool at once, read_stream.c
previously used LimitAdditionalPins(). The coding was naive, and only
considered the available buffers at stream construction time.
This commit checks before each StartReadBuffers() call with
GetAdditionalPinLimit(). The result might change over time due to pins
acquired outside this stream by the same backend. No extra CPU cycles
are added to the all-buffered fast-path code, but the I/O-starting path
now considers the up-to-date remaining buffer limit.
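Sketched (simplified from the new coding):

    /* re-check the allowance just before each read; pins acquired
     * elsewhere in this backend may have changed it */
    uint32  limit = GetAdditionalPinLimit();

    if (nblocks > limit)
        nblocks = Max(limit, 1);    /* clamp, but always allow progress */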
In practice it was quite difficult to exceed limits and cause any real
problems in v17, so no back-patch for now, but proposed changes will
make it easier.
Per code review from Andres, in the course of testing his AIO patches.
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions)
Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
This allows building PL/Python against any Python 3.x version and
using another Python 3.x version at run time. This is useful for
installers that want to run against a separately downloaded Python, so
that they don't have to bundle it themselves.
This builds on the earlier patch to only use APIs supported by the
Limited API.
At the moment, this is not activated on MSVC because that leads to
build failures that no one could explain or cared enough to address.
This could be done later.
Reviewed-by: Jakob Egger <jakob@eggerapps.at>
Discussion: https://www.postgresql.org/message-id/flat/ee410de1-1e0b-4770-b125-eeefd4726a24@eisentraut.org
When pg_nodiscard was first added, the C standard draft had it as a
function specifier, and so the code comment about placement was
written with that in mind. The final C23 standard has it as an
attribute and the placement rules are a bit different for that.
Specifically, it needs to be before extern or static. (Or at least
both current clang and gcc require that.) So just swap these. (To be
clear: The current implementation with gcc attributes doesn't care.
This change is just for maximum forward compatibility for non-gcc
compilers.) This also keeps the order consistent with the previously
introduced pg_noreturn. Also update the code comment to reflect the
mentioned developments since its introduction.
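In diff form (hypothetical declaration):

    -   extern pg_nodiscard void *alloc_or_fail(size_t size);
    +   pg_nodiscard extern void *alloc_or_fail(size_t size);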
Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/pxr5b3z7jmkpenssra5zroxi7qzzp6eswuggokw64axmdixpnk@zbwxuq7gbbcw
Previously the support functions assumed that the caller needed one pin
to make progress, and could optionally use some more, allowing enough
for every connection to do the same. Add a couple more functions for
callers that want to know:
* what the maximum possible number could be, irrespective of currently
held pins, for space planning purposes
* how many additional pins they could acquire right now, without the
special case allowing one pin, for callers that already hold pins and
could already make progress even if no extra pins are available
The pin limit logic began in commit 31966b15. This refactoring is
better suited to read_stream.c, which will be adjusted to respect the
remaining limit as it changes over time in a follow-up commit. It also
computes MaxProportionalPins up front, to avoid performing divisions
whenever a caller needs to check the balance.
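A sketch of how a caller might use the two new functions (simplified;
GetPinLimit() and GetAdditionalPinLimit() per the descriptions above):

    uint32  max_pins = GetPinLimit();   /* absolute ceiling, ignoring
                                         * currently held pins; good for
                                         * sizing arrays up front */

    uint32  extra = GetAdditionalPinLimit();    /* what we may take now */

    if (extra == 0 && pins_held == 0)
        extra = 1;              /* the "one pin to make progress" special
                                 * case is now the caller's decision */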
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions)
Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
The problem is that ALTER SUBSCRIPTION ... SET PUBLICATION ... will lead
to a restart of the apply worker, and after the restart, the apply worker will
use the existing slot and replication origin corresponding to the
subscription. Now, it is possible that before the restart, the origin has
not been updated, and the WAL start location points to a position earlier
than where the publication referenced by SET PUBLICATION was created, which
can lead to an error like: "ERROR: publication "pub1" does not exist".
Once this error occurs, the apply worker will never be able to proceed and
will always return the same error.
We decided to skip loading the publication if it does not exist. The
publication is loaded later, and the relation entry is updated, once the
publication gets created.
We decided not to backpatch this as this is a behaviour change, and we don't
see field reports. This problem has been found by intermittent buildfarm
failures.
Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/flat/CALDaNm0-n8FGAorM%2BbTxkzn%2BAOUyx5%3DL_XmnvOP6T24%2B-NcBKg%40mail.gmail.com
Discussion: https://postgr.es/m/CAA4eK1+T-ETXeRM4DHWzGxBpKafLCp__5bPA_QZfFQp7-0wj4Q@mail.gmail.com
If the given input_type yields valid results from both
get_element_type and get_array_type, initArrayResultAny believed the
former and treated the input as an array type. However this is
inconsistent with what get_promoted_array_type does, leading to
situations where the output of an ARRAY() subquery is labeled with
the wrong type: it's labeled as oidvector[] but is really a 2-D
array of OID. That at least results in strange output, and can
result in crashes if further processing such as unnest() is applied.
AFAIK this is only possible with the int2vector and oidvector
types, which are special-cased to be treated mostly as true arrays
even though they aren't quite.
Fix by switching the logic to match get_promoted_array_type by
testing get_array_type not get_element_type, and remove an Assert
thereby made pointless. (We need not introduce a symmetrical
check for get_element_type in the other if-branch, because
initArrayResultArr will check it.) This restores the behavior
that existed before bac27394a introduced initArrayResultAny:
the output really is int2vector[] or oidvector[].
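The fix is essentially a polarity change in initArrayResultAny(), roughly
(simplified; surrounding code omitted):

    -   if (OidIsValid(get_element_type(input_type)))
    +   if (!OidIsValid(get_array_type(input_type)))
            /* accumulate the inputs as arrays, via initArrayResultArr() */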
Comparable confusion exists when an input of an ARRAY[] construct
is int2vector or oidvector: transformArrayExpr decides it's dealing
with a multidimensional array constructor, and we end up with
something that's a multidimensional OID array but is alleged to be
of type oidvector. I have not found a crashing case here, but it's
easy to demonstrate totally-wrong results. Adjust that code so
that what you get is an oidvector[] instead, for consistency with
ARRAY() subqueries. (This change also makes these types work like
domains-over-arrays in this context, which seems correct.)
Bug: #18840
Reported-by: yang lei <ylshiyu@126.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18840-fbc9505f066e50d6@postgresql.org
Backpatch-through: 13
We can make the output look a bit better by aligning each lock's
definition, so add some padding space to achieve that. This change
makes no practical difference, but casual onlookers will be less
distracted by (lack of) whitespace.
Author: Gurjeet Singh <gurjeet@singh.im>
Discussion: https://postgr.es/m/CABwTF4VxfwDtRV-H22_XK4XeDogaV-Vaobu+af5U=8ZAZn9ZZQ@mail.gmail.com
We want to support a "noreturn" decoration on more compilers besides
just GCC-compatible ones, but for that we need to move the decoration
in front of the function declaration instead of either behind it or
wherever, which is the current style afforded by GCC-style attributes.
Also rename the macro to "pg_noreturn" to be similar to the C11
standard "noreturn".
pg_noreturn is now supported on all compilers that support C11 (using
_Noreturn), as well as GCC-compatible ones (using __attribute__, as
before), as well as MSVC (using __declspec). (When PostgreSQL
requires C11, the latter two variants can be dropped.)
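Sketched, the definition has roughly this shape (simplified):

    #if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
    #define pg_noreturn _Noreturn
    #elif defined(__GNUC__)
    #define pg_noreturn __attribute__((noreturn))
    #elif defined(_MSC_VER)
    #define pg_noreturn __declspec(noreturn)
    #endif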
Now, all supported compilers effectively support pg_noreturn, so the
extra code for !HAVE_PG_ATTRIBUTE_NORETURN can be dropped.
This also fixes a possible problem if third-party code includes
stdnoreturn.h, because then the current definition of
#define pg_attribute_noreturn() __attribute__((noreturn))
would cause an error.
Note that the C standard does not support a noreturn attribute on
function pointer types. So we have to drop these here. There are
only two instances at this time, so it's not a big loss. In one case,
we can make up for it by adding the pg_noreturn to a wrapper function
and adding a pg_unreachable(), in the other case, the latter was
already done before.
Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/pxr5b3z7jmkpenssra5zroxi7qzzp6eswuggokw64axmdixpnk@zbwxuq7gbbcw
When pulling up a subquery, if the subquery's target list items are
used in grouping set columns, we need to wrap them in PlaceHolderVars.
This ensures that expressions retain their separate identity so that
they will match grouping set columns when appropriate.
In 90947674f, we decided to wrap subquery outputs that are non-var
expressions in PlaceHolderVars. This prevents const-simplification
from merging them into the surrounding expressions after subquery
pullup, which could otherwise lead to failing to match those
subexpressions to grouping set columns, with the effect that they'd
not go to null when expected.
However, that left some loose ends. If the subquery's target list
contains two or more identical Var expressions, we can still fail to
match the Var expression to the expected grouping set expression.
This is not related to const-simplification, but rather to how we
match expressions to lower target items in setrefs.c.
For sort/group expressions, we use ressortgroupref matching, which
works well. For other expressions, we primarily rely on comparing the
expressions to determine if they are the same. Therefore, we need a
way to prevent setrefs.c from matching the expression to some other
identical ones.
To fix, wrap all subquery outputs in PlaceHolderVars if the parent
query uses grouping sets, ensuring that they preserve their separate
identity throughout the whole planning process.
Reported-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/CAMbWs4-meSahaanKskpBn0KKxdHAXC1_EJCVWHxEodqirrGJnw@mail.gmail.com
In pull_up_simple_subquery and pull_up_constant_function, there is
code that sets wrap_non_vars to true when dealing with an appendrel
member. The goal is to wrap subquery outputs that are not simple Vars
in PlaceHolderVars, ensuring that what we pull up doesn't get merged
into a surrounding expression during later processing, which could
cause it to fail to match the expression actually available from the
appendrel.
However, this is unnecessary. When pulling up an appendrel child
subquery, the only part of the upper query that could reference the
appendrel child yet is the translated_vars list of the associated
AppendRelInfo that we just made for this child. Furthermore, we do
not want to force use of PHVs in the AppendRelInfo, as there is no
outer join in between. In fact, perform_pullup_replace_vars always sets
wrap_non_vars to false before performing pullup_replace_vars on the
AppendRelInfo.
This patch simply removes the code that sets wrap_non_vars to true for
UNION ALL subqueries.
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/CAMbWs4-VXDEi1v+hZYLxpOv0riJxHsCkCH1f46tLnhonEAyGCQ@mail.gmail.com
Upcoming optimizations will add complexity to convert_case(). This
patch reorganizes slightly so that the complexity can be contained
within the logic to convert the case of a single character, rather
than mixing it in with logic to iterate through the string.
Reviewed-by: Alexander Borisov <lex.borisov@gmail.com>
Discussion: https://postgr.es/m/44005c3d-88f4-4a26-981f-fd82dfa8e313@gmail.com
On Publication rename, we need to only invalidate the RelationSyncCache
entries corresponding to relations that are part of the publication being
renamed.
As part of this patch, we introduce a new invalidation message to
invalidate the cache maintained by the logical decoding output plugin. We
can't use existing relcache invalidation for this purpose, as that would
unnecessarily cause relcache invalidations in other backends.
This will improve performance by building fewer relation cache entries
during logical replication.
Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/OSCPR01MB14966C09AA201EFFA706576A7F5C92@OSCPR01MB14966.jpnprd01.prod.outlook.com
In a couple of places, read_stream.c assumed that io_combine_limit would
be stable during the lifetime of a stream. That is not true in at least
one unusual case: streams held by CURSORs where you could change the GUC
between FETCH commands, with unpredictable results.
Fix, by storing stream->io_combine_limit and referring only to that
after construction. This mirrors the treatment of the other important
setting {effective,maintenance}_io_concurrency, which is stored in
stream->max_ios.
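Sketched (simplified):

    /* at stream construction: capture the GUC once */
    stream->io_combine_limit = io_combine_limit;

    /* later: size reads only from the captured copy, so a SET between
     * FETCH commands cannot change the stream's geometry */
    if (stream->pending_read_nblocks < stream->io_combine_limit)
        read_stream_look_ahead(stream);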
One of the cases was the queue overflow space, which was sized for
io_combine_limit and could be overrun if the GUC was increased. Since
that coding was a little hard to follow, also introduce a variable for
better readability instead of open-coding the arithmetic. Doing so
revealed an off-by-one thinko while clamping max_pinned_buffers to
INT16_MAX, though that wasn't a live bug due to the current limits on
GUC values.
Back-patch to 17.
Discussion: https://postgr.es/m/CA%2BhUKG%2B2T9p-%2BzM6Eeou-RAJjTML6eit1qn26f9twznX59qtCA%40mail.gmail.com
Commit 3c152a27b06 mistakenly repeated JSONTYPE_JSON in a condition,
omitting JSONTYPE_CAST. As a result, datum_to_jsonb_internal() failed
to reject inputs that were casts (e.g., from an enum to json as in the
example below) when used as keys in JSON constructors.
This led to a crash in cases like:
SELECT JSON_OBJECT('happy'::mood: '123'::jsonb);
where 'happy'::mood is implicitly cast to json. The missing check
meant such casted values weren't properly rejected as invalid
(non-scalar) JSON keys.
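In essence, the condition changes like this (surrounding code simplified):

    -        tcategory == JSONTYPE_JSON ||
    -        tcategory == JSONTYPE_JSON))
    +        tcategory == JSONTYPE_JSON ||
    +        tcategory == JSONTYPE_CAST))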
Reported-by: Maciek Sakrejda <maciek@pganalyze.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Maciek Sakrejda <maciek@pganalyze.com>
Discussion: https://postgr.es/m/CADXhmgTJtJZK9A3Na_ry+Xrq-ghjcejBRhcRMzWZvbd__QdgJA@mail.gmail.com
Backpatch-through: 17
This commit enhances pg_rewind's --write-recovery-conf option to
include the dbname in the generated primary_conninfo value when
specified in the --source-server option. With this modification, the
rewound server can connect to the primary server without manual
configuration file modifications when sync_replication_slots is
enabled.
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/CAD21AoAkW=Ht0k9dVoBTCcqLiiZ2MXhVr+d=j2T_EZMerGrLWQ@mail.gmail.com
There used to be bespoke pools for these structs to reduce the
palloc/pfree overhead, but that was ripped out a long time ago and
replaced with the generic, cheaper generational memory allocator
(commit a4ccc1cef5). The Get/Return terminology made sense with the
pools, as you "got" an object from the pool and "returned" it later,
but now it just looks weird. Rename to Alloc/Free.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/c9e43d2d-8e83-444f-b111-430377368989@iki.fi
The only caller, select_best_grantor(), can instead use
pg_popcount64(). This isn't performance-critical code, but we
might as well use the centralized implementation. While at it, add
some test coverage for this part of select_best_grantor().
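The replacement is a direct call, along these lines (variable names
illustrative):

    /* count the set bits in the grant-option mask */
    int     nbits = pg_popcount64(grant_option_bits);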
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/Z9GtL7Nm6hsYyJnF%40nathan
We did not wake up on interrupts while waiting on async events on an
async-capable append node. For example, if you tried to cancel the
query, nothing would happen until one of the async subplans becomes
readable. To fix, add WL_LATCH_SET to the WaitEventSet.
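Sketched, the fix amounts to (simplified):

    AddWaitEventToSet(set, WL_LATCH_SET, PGINVALID_SOCKET, MyLatch, NULL);

    /* ... and once WaitEventSetWait() reports events: */
    if (event.events & WL_LATCH_SET)
    {
        ResetLatch(MyLatch);
        CHECK_FOR_INTERRUPTS();     /* e.g. react to a query cancel */
    }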
Backpatch down to v14 where async Append execution was introduced.
Discussion: https://www.postgresql.org/message-id/37a40570-f558-40d3-b5ea-5c2079b3b30b@iki.fi
makeWholeRowVar() has different rules for constructing a
whole-row Var depending on the kind of RTE it's representing.
This turns out to be problematic because the rewriter and planner
can convert view RTEs and set-returning-function RTEs into
subquery RTEs; so a whole-row Var made during planning might
look different from one made by the parser. In isolation this
doesn't cause any problem, but if a query contains Vars made
both ways for the same varno, there are cross-checks in the
executor that will complain. This manifests for UPDATE, DELETE,
and MERGE queries that use whole-row table references.
To fix, we need makeWholeRowVar() to produce the same result
from an inlined RTE as it would have for the original. For
an inlined view, we can use RangeTblEntry.relid to detect
that this had been a view RTE. For inlined SRFs, make a
data structure definition change akin to commit 47bb9db75,
and say that we won't clear RangeTblEntry.functions until
the end of planning. That allows makeWholeRowVar() to
repeat what it would have done with the unmodified RTE.
Reported-by: Duncan Sands <duncan.sands@deepbluecap.com>
Reported-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Diagnosed-by: Tender Wang <tndrwang@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/3518c50a-ab18-482f-b916-a37263622501@deepbluecap.com
Backpatch-through: 13
Add log_connections option 'setup_durations' which logs durations of
several key parts of connection establishment and backend setup.
For an incoming connection, starting from when the postmaster gets a
socket from accept() and ending when the forked child backend is first
ready for query, there are multiple steps that could each take longer
than expected due to external factors. This logging provides visibility
into authentication and fork duration as well as the end-to-end
connection establishment and backend initialization time.
To make this portable, the timings captured in the postmaster (socket
creation time, fork initiation time) are passed through the
BackendStartupData.
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Guillaume Lelarge <guillaume.lelarge@dalibo.com>
Discussion: https://postgr.es/m/flat/CAAKRu_b_smAHK0ZjrnL5GRxnAVWujEXQWpLXYzGbmpcZd3nLYw%40mail.gmail.com
Convert the boolean log_connections GUC into a list GUC comprised of the
connection aspects to log.
This gives users more control over the volume and kind of connection
logging.
The current log_connections options are 'receipt', 'authentication', and
'authorization'. The empty string disables all connection logging. 'all'
enables all available connection logging.
For backwards compatibility, the most common values for the
log_connections boolean are still supported (on, off, 1, 0, true, false,
yes, no). Note that previously supported substrings of on, off, true,
false, yes, and no are no longer supported.
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/flat/CAAKRu_b_smAHK0ZjrnL5GRxnAVWujEXQWpLXYzGbmpcZd3nLYw%40mail.gmail.com
9a8dd2c5a6d has added an initialization to PendingBackendStats, which
has been causing compilation warnings in the buildfarm. This code does
not strictly require it as PendingBackendStats is always initialized
with memset(0), so let's remove it.
Per report from multiple buildfarm members, like ayu and batfish, via
Tom Lane.
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/1870853.1741749264@sss.pgh.pa.us
Using the Python Limited API would allow building PL/Python against
any Python 3.x version and using another Python 3.x version at run
time. This commit does not activate that, but it prepares the code to
only use APIs supported by the Limited API.
Implementation details:
- Convert static types to heap types, as sketched below
(https://docs.python.org/3/howto/isolating-extensions.html#heap-types).
- Replace PyRun_String() with component functions.
- Replace PyList_SET_ITEM() with PyList_SetItem().
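As a sketch of the first item (illustrative only, not the actual plpython
code):

    static PyType_Slot sketch_slots[] =
    {
        {Py_tp_doc, (void *) "illustrative heap type"},
        {0, NULL}
    };

    static PyType_Spec sketch_spec =
    {
        .name = "plpy.Sketch",
        .basicsize = sizeof(PyObject),
        .flags = Py_TPFLAGS_DEFAULT,
        .slots = sketch_slots
    };

    /* created at module init time with the Limited API, instead of a
     * statically allocated PyTypeObject */
    PyObject   *type = PyType_FromSpec(&sketch_spec);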
This was previously committed as c47e8df815c and then reverted because
it wasn't working under Python older than 3.8. That has been fixed in
this version. There was a Python API change/bugfix between 3.7 and
3.8 that directly affects this patch. The relevant commit is
<https://github.com/python/cpython/commit/364f0b0f19c>. The
workarounds described there have been applied in this patch, and it
has been confirmed to work with Python 3.6 and 3.7.
Reviewed-by: Jakob Egger <jakob@eggerapps.at>
Discussion: https://www.postgresql.org/message-id/flat/ee410de1-1e0b-4770-b125-eeefd4726a24@eisentraut.org
Commit 0fbceae84 put a "&zwsp;" in almost but not quite the correct
place to avoid "The contents of fo:block line 1 exceed the available
area" warnings. Per buildfarm.
The comment in GetTransactionSnapshot() said that you "should call
RegisterSnapshot or PushActiveSnapshot on the returned snap if it is
to be used very long". That felt too unclear to me. Make the comment
more strongly worded.
To enforce that rule and to catch potential bugs where a snapshot
might get invalidated while it's still in use, add an assertion to
HeapTupleSatisfiesMVCC() to check that the snapshot is registered or
pushed to active stack. No new bugs were found by this, but it seems
like good future-proofing. It's not a great place for the check;
HeapTupleSatisfiesMVCC() is in fact safe to call with an unregistered
snapshot, and the assertion won't catch other unsafe uses. But it goes
a long way in practice.
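The assertion is essentially (simplified):

    /* the snapshot must be registered or on the active snapshot stack,
     * else invalidation could yank it away mid-scan */
    Assert(snapshot->regd_count > 0 || snapshot->active_count > 0);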
Fix a few cases that were playing fast and loose with that and just
assumed that the snapshot cannot be invalidated during a scan. Those
assumptions were not wrong, but they're not performance critical, so
let's drop the excuses and just register the snapshot. These were
false positives found by the new assertion.
Discussion: https://www.postgresql.org/message-id/7c56f180-b9e1-481e-8c1d-efa63de3ecbb@iki.fi
Previously, pg_logicalinspect functions were too trusting of their
input and blindly passed it to SnapBuildRestoreSnapshot(). If the
input pointed to a directory, the server could fail with a PANIC error
while attempting to fsync_fname() with isdir=false on a directory.
This commit adds validation checks for input filenames and passes the
LSN extracted from the filename to SnapBuildRestoreSnapshot() instead
of the filename itself. It also adds regression tests for various
input patterns and permission checks.
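The validation is conceptually like this (simplified; serialized snapshot
files are named "%X-%X.snap"):

    uint32  hi, lo;

    if (sscanf(filename, "%X-%X.snap", &hi, &lo) != 2)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("invalid snapshot file name \"%s\"", filename)));

    /* rebuild the path from the parsed LSN, not from the raw input */
    lsn = ((XLogRecPtr) hi << 32) | lo;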
Bug: #18828
Reported-by: Robins Tharakan <tharakan@gmail.com>
Co-authored-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Co-authored-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/18828-0f4701c635064211@postgresql.org