PostgreSQL

mirror of https://github.com/postgres/postgres.git synced 2025-05-13 01:13:08 -04:00

Author	SHA1	Message	Date
David Rowley	f3281f9f93	Improve comments for estimate_multivariate_ndistinct() estimate_multivariate_ndistinct() is coded to assume the caller handles passing it a list of GroupVarInfos with unique 'var' fields over the entire list. 6bb6a62f3 added code which didn't ensure this and that could result in estimate_multivariate_ndistinct() erroring out with: ERROR: corrupt MVNDistinct entry This occurred because estimate_multivariate_ndistinct() first searches for a set of stats that match to at least two of the given GroupVarInfos and then later assumes that the MVNDistinctItem.items array of the best matching stats will have an entry for those two columns. If the GroupVarInfos List contained a duplicate entry then the same column could be matched to twice and that could trick the code into thinking we have >= 2 columns matched in cases where only a single distinct column has been matched. This could result in a failure to find the correct MVNDistinctItem in the stats as the array containing those never contains an item for single columns. Here we make it more clear that the function needs a distinct set of GroupVarInfos and also tidy up a few other comments to make things a bit easier to follow. Author: David Rowley <drowleyml@gmail.com> Discussion: https://postgr.es/m/CAApHDvocZCUhM9W9mJ39d6oQz7ePKoqFnao_347mvC-A7QatcQ@mail.gmail.com	2025-04-17 11:03:24 +12:00
Tom Lane	ab3d8afc7f	Sync declarations and definitions of two new tablecmds.c functions. Buildfarm member drongo complained because the definitions of these functions used "const Oid foo" where the forward declarations just had "Oid foo". (I'm a bit surprised that drongo seems to be the only complainant.) I chose to fix this by removing the "consts" because (a) I'm generally not a fan of using const that way, and (b) it was a minority usage even within these two functions, let alone compared to the rest of our code base. Oversight in commit eec0040c4, so no need for back-patch.	2025-04-16 17:59:08 -04:00
Álvaro Herrera	11ff192b5b	Elide not-null constraint checks on child tables during PK creation We were unnecessarily acquiring AccessExclusiveLock on all child tables when "ALTER TABLE ONLY sometab ADD PRIMARY KEY" was run on their parent table, an oversight in commit 14e87ffa5c54. This caused deadlocks during pg_restore of partitioned tables. The reason to acquire the AEL was that we need to verify that child tables have the involved columns already marked as not-null; but if the parent table has an inheritable not-null constraint, then all children must necessarily be in the correct state already, so we can skip the check, which avoids acquiring the lock. Reorder the code so that it works that way. This doesn't change things in the case where the constraint doesn't exist, but that case is of lesser importance because it doesn't occur during parallel pg_restore. While at it, reword some errmsg() and add errhint() to similar cases in related but not adjacent code. Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/67469c1c-38bc-7d94-918a-67033f5dd731@gmx.net Discussion: https://postgr.es/m/2045026.1743801143@sss.pgh.pa.us Discussion: https://postgr.es/m/1280408.1744650810@sss.pgh.pa.us	2025-04-16 21:51:23 +02:00
Daniel Gustafsson	1fd3566ebc	Update pg_config.h.in with libnuma changes Add macros from autoheader which were accidentally omitted in commit 65c298f61fc. There is no function change by this as no code is currently using the missing macro. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Discussion: https://postgr.es/m/CF6D7D7F-E1C4-45BE-9019-0F4B4BC7C135@yesql.se	2025-04-16 20:16:57 +02:00
Tom Lane	1fc3403626	Fix pg_dump --clean with partitioned indexes. We'd try to drop the partitions of a partitioned index separately, which is disallowed by the backend, leading to an error during restore. While the error is harmless, it causes problems if you try to use --single-transaction mode. Fortunately, there seems no need to do a DROP at all, since the partition will go away silently when we drop either the parent index or the partition's table. So just make the DROP conditional on not being a partition. Reported-by: jian he <jian.universality@gmail.com> Author: jian he <jian.universality@gmail.com> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CACJufxF0QSdkjFKF4di-JGWN6CSdQYEAhGPmQJJCdkSZtd=oLg@mail.gmail.com Backpatch-through: 13	2025-04-16 13:31:59 -04:00
Andrew Dunstan	40b9c27014	pg_restore cleanups . remove unnecessary oid_string list stuff . use pg_get_line_buf() instead of open-coding it . cleaner parsing of map.dat lines Reverts 2b69afbe50d add new list type simple_oid_string_list to fe-utils/simple_list Author: Álvaro Herrera <alvherre@kurilemu.de> Author: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/202504141220.343fmoxfsbj4@alvherre.pgsql	2025-04-16 12:04:34 -04:00
Richard Guo	3b35f9a4c5	Fix an incorrect check in get_memoize_path Memoize typically marks cache entries as complete after fully scanning the inner side of a join. However, in the case of unique joins, we skip to the next outer tuple as soon as the first matching inner tuple is found, leaving no opportunity to scan the inner side to completion. To work around that, we mark cache entries as complete after fetching the first matching inner tuple in unique joins. This approach is only safe when all of the join's restriction clauses are parameterized; otherwise, there is no guarantee that reading just one tuple from the inner side is sufficient. Currently, we check for this by verifying that the number of clauses in ppi_clauses is no less than the number of the join's restriction clauses. However, this check isn't entirely reliable, as ppi_clauses includes join clauses available from all outer rels, not just the current outer rel. This means the check could pass even if a restriction clause isn't parameterized, as long as another join clause, which doesn't belong to the current join, is included in ppi_clauses. To fix this, we explicitly check whether each restriction clause of the current join is present in ppi_clauses. While we're here, remove the XXX comment from the modified code, as it's not justified; in certain cases, it's not possible to move a join clause to the inner side. This is arguably a bugfix, but no backpatch given the lack of field reports. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Discussion: https://postgr.es/m/CAMbWs4-8JPouj=wBDj4DhK-WO4+Xdx=A2jbjvvyyTBQneJ1=BQ@mail.gmail.com	2025-04-16 10:55:44 +09:00
Tom Lane	7c87284940	Fix failure for generated column with a not-null domain constraint. If a GENERATED column is declared to have a domain data type where the domain's constraints disallow null values, INSERT commands failed because we built a targetlist that included coercing a null constant to the domain's type. The failure occurred even when the generated value would have been perfectly OK. This is adjacent to the issues fixed in 0da39aa76, but we didn't notice for lack of testing a domain with such a constraint. We aren't going to use the result of the targetlist entry for the generated column --- ExecComputeStoredGenerated will overwrite it. So it's not really necessary that it have the exact datatype of the generated column. This patch fixes the problem by changing the targetlist entry to be a null Const of the domain's base type, which should be sufficiently legal. (We do have to tweak ExecCheckPlanOutput to accept the situation, though.) This has been broken since we implemented generated columns. However, this patch only applies easily as far back as v14, partly because I (tgl) only carried 0da39aa76 back that far, but mostly because v14 significantly refactored the handling of INSERT/UPDATE targetlists. Given the lack of field complaints and the short remaining support lifetime of v13, I judge the cost-benefit ratio not good for devising a version that would work in v13. Reported-by: jian he <jian.universality@gmail.com> Author: jian he <jian.universality@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CACJufxG59tip2+9h=rEv-ykOFjt0cbsPVchhi0RTij8bABBA0Q@mail.gmail.com Backpatch-through: 14	2025-04-15 12:08:34 -04:00
Daniel Gustafsson	7ae13170ba	pg_combinebackup: Fix incorrect code documentation The code comment for parse_oid accidentally used the wrong parameter when referring to the location of the last backup. Also, while there, improve sentence wording by removing a superfluous word. Backpatch to v17 where pg_combinebackup was addedd Author: Amul Sul <sulamul@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CAAJ_b95ecWgzcS4K3Dx0E_Yp-SLwK5JBasFgioKMSjhQLw9xvg@mail.gmail.com Backpatch-through: 17	2025-04-15 15:27:08 +02:00
Peter Eisentraut	7cd171a5d2	Add more source files to pg_verifybackup/nls.mk also related to commit 8dfd3129027	2025-04-14 08:32:46 +02:00
Daniel Gustafsson	2970c75dd9	Mark sslkeylogfile as Debug option Mark the sslkeylogile option as "D" debug as this truly is a debug option, and it will allow postgres_fdw et.al to filter it out as well. Also update the display length to match that for an ssl key as they are both filename based inputs. Author: Daniel Gustafsson <daniel@yesql.se> Reported-by: Jacob Champion <jacob.champion@enterprisedb.com> Discussion: https://postgr.es/m/CAOYmi+=5GyBKpu7bU4D_xkAnYJTj=rMzGaUvHO99-DpNG_YKcw@mail.gmail.com	2025-04-13 21:53:03 +02:00
Andrew Dunstan	64e193f5dd	Make AIO error test more portable Alpine Linux's C library (musl) spells one error message differently. Reported-by: Wolfgang Walther	2025-04-13 14:39:45 -04:00
Andrew Dunstan	f09088a01d	Free memory properly in pg_restore.c Thinko in commit 39729ec01d2. Mea maxima culpa. Per Mahendra Singh Thalor <mahi6run@gmail.com>	2025-04-12 14:54:48 -04:00
Tom Lane	e708ffe79d	Fix GIN's shimTriConsistentFn to not corrupt its input. Commit 0f21db36d made an assumption that GIN triConsistentFns would not modify their input entryRes[] arrays. But in fact, the "shim" triConsistentFn that we use for opclasses that don't supply their own did exactly that, potentially leading to wrong answers from a GIN index search. Through bad luck, none of the test cases that we have for such opclasses exposed the bug. One response to this could be that the assumption of consistency check functions not modifying entryRes[] arrays is a bad one, but it still seems reasonable to me. Notably, shimTriConsistentFn is itself assuming that with respect to the underlying boolean consistentFn, so it's sure being self-centered in supposing that it gets to do so. Fortunately, it's quite simple to fix shimTriConsistentFn to restore the entry-time state of entryRes[], so let's do that instead. This issue doesn't affect any core GIN opclasses, since they all supply their own triConsistentFns. It does affect contrib modules btree_gin, hstore, and intarray. Along the way, I (tgl) noticed that shimTriConsistentFn failed to pick up on a "recheck" flag returned by its first call to the boolean consistentFn. This may be only a latent problem, since it would be unlikely for a consistentFn to set recheck for the all-false case and not any other cases. (Indeed, none of our contrib modules do that.) Nonetheless, it's formally wrong. Reported-by: Vinod Sridharan <vsridh90@gmail.com> Author: Vinod Sridharan <vsridh90@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAFMdLD7XzsXfi1+DpTqTgrD8XU0i2C99KuF=5VHLWjx4C1pkcg@mail.gmail.com Backpatch-through: 13	2025-04-12 12:28:02 -04:00
Peter Geoghegan	a6cab6a78e	Harmonize function parameter names for Postgres 18. Make sure that function declarations use names that exactly match the corresponding names from function definitions in a few places. These inconsistencies were all introduced during Postgres 18 development. This commit was written with help from clang-tidy, by mechanically applying the same rules as similar clean-up commits (the earliest such commit was commit 035ce1fe).	2025-04-12 12:07:36 -04:00
Michael Paquier	fdb69dd582	Fix instability with WAL fsync test in stats.sql A backend using wal_sync_method set to "open_sync" or "open_datasync" would fail the test checking the WAL sync data in pg_stat_io. These modes guarantee that a sync is done when WAL is written to disk, and the data checked by the test is not incremented in this case, issue_xlog_fsync() doing nothing. Oversight in commit a051e71e28a1. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0uxwg3xAi4nvdBMJ-zJQEeyg+RotuU+ebM2F6CKmnvaYA@mail.gmail.com	2025-04-12 13:09:48 +09:00
Daniel Gustafsson	847bbb21f8	Fix recently introduced typos This fixes typos in docs and comments introduced during the v18 development cycle, to keep them from ending up in backbranches. Author: Jacob Brazeal <jacob.brazeal@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CA+COZaCgGua25f2hSrjrDLJcJJAHkwoKgTTqUy-wyL1=64JNjw@mail.gmail.com	2025-04-11 22:17:12 +02:00
Peter Eisentraut	914ea1c93c	Add missing source file to pg_verifybackup/nls.mk added by commit 8dfd3129027	2025-04-11 10:53:36 +02:00
Peter Eisentraut	b63cbacb86	Add missing source file to pg_dump/nls.mk added by commit c1da7281060	2025-04-11 10:28:59 +02:00
Peter Eisentraut	9e0e1cfc3e	Add missing source file to pg_upgrade/nls.mk added by commit 40e2e5e92b7	2025-04-11 10:26:51 +02:00
Peter Eisentraut	7d430a5728	Add missing PGDLLIMPORT markings Discussion: https://www.postgresql.org/message-id/flat/25095db5-b595-4b85-9100-d358907c25b5%40eisentraut.org	2025-04-11 08:59:52 +02:00
Michael Paquier	2e57790836	Fix race with synchronous_standby_names at startup synchronous_standby_names cannot be reloaded safely by backends, and the checkpointer is in charge of updating a state in shared memory if the GUC is enabled in WalSndCtl, to let the backends know if they should wait or not for a given LSN. This provides a strict control on the timing of the waiting queues if the GUC is enabled or disabled, then reloaded. The checkpointer is also in charge of waking up the backends that could be waiting for a LSN when the GUC is disabled. This logic had a race condition at startup, where it would be possible for backends to not wait for a LSN even if synchronous_standby_names is enabled. This would cause visibility issues with transactions that we should be waiting for but they were not. The problem lasts until the checkpointer does its initial update of the shared memory state when it loads synchronous_standby_names. In order to take care of this problem, the shared memory state in WalSndCtl is extended to detect if it has been initialized by the checkpointer, and not only check if synchronous_standby_names is defined. In WalSndCtlData, sync_standbys_defined is renamed to sync_standbys_status, a bits8 able to know about two states: - If the shared memory state has been initialized. This flag is set by the checkpointer at startup once, and never removed. - If synchronous_standby_names is known as defined in the shared memory state. This is the same as the previous sync_standbys_defined in WalSndCtl. This method gives a way for backends to decide what they should do until the shared memory area is initialized, and they now ultimately fall back to a check on the GUC value in this case, which is the best thing that can be done. Fortunately, SyncRepUpdateSyncStandbysDefined() is called immediately by the checkpointer when this process starts, so the window is very narrow. It is possible to enlarge the problematic window by making the checkpointer wait at the beginning of SyncRepUpdateSyncStandbysDefined() with a hardcoded sleep for example, and doing so has showed that a 2PC visibility test is indeed failing. On machines slow enough, this bug would cause spurious failures. In 17~, we have looked at the possibility of adding an injection point to have a reproducible test, but as the problematic window happens at early startup, we would need to invent a way to make an injection point optionally persistent across restarts when attached, something that would be fine for this case as it would involve the checkpointer. This issue is quite old, and can be reproduced on all the stable branches. Author: Melnikov Maksim <m.melnikov@postgrespro.ru> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/163fcbec-900b-4b07-beaa-d2ead8634bec@postgrespro.ru Backpatch-through: 13	2025-04-11 10:00:21 +09:00
David Rowley	530050d8d2	Add code comment explaining ins_since_vacuum and aborted inserts Sami complained that there's a discrepancy between n_mod_since_analyze and n_ins_since_vacuum, as the former only accounts for committed changes and the latter tracks committed and aborted inserts. Nobody seemed overly concerned that this would cause any concerning issues. The repercussions, from what I can tell, are limited to causing an autovacuum to trigger for inserts sooner than it otherwise might. For typical ratios of commits to aborts, it's unlikely to ever be noticed. Fixing things to make it so n_ins_since_vacuum only displays committed inserts would require an additional field in PgStat_TableCounts, which does not quite seem worthwhile at this stage. This commit just adds a comment with some details to mention that we know about it, which will hopefully prevent repeat discussions. Reported-by: Sami Imseih <samimseih@gmail.com> Author: David Rowley <drowleyml@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAApHDvpgV3a-R2EGmPOh0L-x3pHbZpM3y4dySWfy+UqUazwDQA@mail.gmail.com	2025-04-11 11:36:21 +12:00
Andrew Dunstan	39729ec01d	Fix fat fingering in 22cb6d28950 Per Rainier Vilela	2025-04-10 19:08:04 -04:00
David Rowley	928394b664	Improve various new-to-v18 appendStringInfo calls Similar to 8461424fd, here we adjust a few new locations which were not using the most suitable appendStringInfo* function for the intended purpose. Author: David Rowley <drowleyml@gmail.com Discussion: https://postgr.es/m/CAApHDvqJnNjueb=Eoj8K+8n0g7nj_AcPWSiCj5RNV4fDejAfqA@mail.gmail.com	2025-04-11 10:07:22 +12:00
Daniel Gustafsson	55ef7abf88	Rename global variable backing DSA area The global variable backing the DSA area for Memory Context stats reporting had a too generic name, rename to be more descriptive. Independently reported by Peter and Laurenz. Author: Daniel Gustafsson <daniel@yesql.se> Reported-by: Peter Eisentraut <peter@eisentraut.org> Reported-by: Laurenz Albe <laurenz.albe@cybertec.at> Discussion: https://postgr.es/m/d51172bd4e7f4b07a18a0288ca1b1c28a71a5f6a.camel@cybertec.at Discussion: https://postgr.es/m/25095db5-b595-4b85-9100-d358907c25b5@eisentraut.org	2025-04-10 22:40:27 +02:00
Andrew Dunstan	22cb6d2895	Fix memory leak in pg_restore.c Oversight in 1495eff7bdb Author: Ranier Vilela <ranier.vf@gmail.com>	2025-04-10 14:57:02 -04:00
Tom Lane	f27eb0325b	Remove useless check for negative result of ip_addrsize(). By inspection, ip_addrsize() can't return a negative result. (If it could, we'd have way bigger problems elsewhere.) So delete useless check in network_send(). Most C compilers are probably perfectly capable of removing this code by themselves, but it's confusing/misleading. Bug: #18889 Reported-by: Daniel Elishakov <dan-eli@mail.ru> Discussion: https://postgr.es/m/18889-73d4f19e953a629e@postgresql.org	2025-04-10 14:18:07 -04:00
Andrew Dunstan	4170298b6e	Further cleanup for directory creation on pg_dump/pg_dumpall Instead of two separate (and different) implementations, refactor to use a single common routine. Along the way, remove use of a hardcoded file permissions constant in favor of the common project setting for directory creation. Author: Mahendra Singh Thalor <mahi6run@gmail.com> Discussion: https://postgr.es/m/CAKYtNApihL8X1h7XO-zOjznc8Ca66Aevgvhc9zOTh6DBh2iaeA@mail.gmail.com	2025-04-10 12:11:36 -04:00
Amit Kapila	4909b38af0	Fix data loss in logical replication. Data loss can happen when the DDLs like ALTER PUBLICATION ... ADD TABLE ... or ALTER TYPE ... that don't take a strong lock on table happens concurrently to DMLs on the tables involved in the DDL. This happens because logical decoding doesn't distribute invalidations to concurrent transactions and those transactions use stale cache data to decode the changes. The problem becomes bigger because we keep using the stale cache even after those in-progress transactions are finished and skip the changes required to be sent to the client. This commit fixes the issue by distributing invalidation messages from catalog-modifying transactions to all concurrent in-progress transactions. This allows the necessary rebuild of the catalog cache when decoding new changes after concurrent DDL. We observed performance regression primarily during frequent execution of publication DDL statements that modify the published tables. The regression is minor or nearly nonexistent for DDLs that do not affect the published tables or occur infrequently, making this a worthwhile cost to resolve a longstanding data loss issue. An alternative approach considered was to take a strong lock on each affected table during publication modification. However, this would only address issues related to publication DDLs (but not the ALTER TYPE ...) and require locking every relation in the database for publications created as FOR ALL TABLES, which is impractical. The bug exists in all supported branches, but we are backpatching till 14. The fix for 13 requires somewhat bigger changes than this fix, so the fix for that branch is still under discussion. Reported-by: hubert depesz lubaczewski <depesz@depesz.com> Reported-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Tested-by: Benoit Lobréau <benoit.lobreau@dalibo.com> Backpatch-through: 14 Discussion: https://postgr.es/m/de52b282-1166-1180-45a2-8d8917ca74c6@enterprisedb.com Discussion: https://postgr.es/m/CAD21AoAenVqiMjpN-PvGHL1N9DWnHSq673bfgr6phmBUzx=kLQ@mail.gmail.com	2025-04-10 13:14:40 +05:30
Peter Eisentraut	9ad19295e9	Fix incorrect format placeholders for commits 8f427187db7, 6ee3b91bad2	2025-04-10 08:04:35 +02:00
David Rowley	d7c04db27a	Update wording in optimizer/README for EquivalenceClasses d69d45a5a changed how em_is_child members are stored in EquivalenceClasses. Children are no longer stored in the ec_members list. optimizer/README mentioned that most operations "should ignore child members", but that felt a little untrue now since child members are now stored in a separate place, they simply won't be found by the normal means of looking (a foreach loop over ec_members), and if you don't find them, there's technically no need to "ignore" them. Here we tweak the wording slightly to reflect the new storage location for child members. Reported-by: Amit Langote <amitlangote09@gmail.com> Author: Amit Langote <amitlangote09@gmail.com> Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CA+HiwqE8v=EuAP_3F_A2xn8zWx+nG_etW_Fe_DvKO-Fkx=+DdQ@mail.gmail.com	2025-04-10 17:33:58 +12:00
Amit Kapila	d438515c29	Cosmetic fixes for pg_createsubscriber's -all option. Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAHut+PsmSCQ-ENSDQ0YOUcsgzT=GG-E9jyXBvxd51A_dMXH5XA@mail.gmail.com	2025-04-10 10:30:05 +05:30
Tomas Vondra	3887d0cfeb	Cleanup of pg_numa.c This moves/renames some of the functions defined in pg_numa.c: * pg_numa_get_pagesize() is renamed to pg_get_shmem_pagesize(), and moved to src/backend/storage/ipc/shmem.c. The new name better reflects that the page size is not related to NUMA, and it's specifically about the page size used for the main shared memory segment. * move pg_numa_available() to src/backend/storage/ipc/shmem.c, i.e. into the backend (which more appropriate for functions callable from SQL). While at it, improve the comment to explain what page size it returns. * remove unnecessary includes from src/port/pg_numa.c, adding unnecessary dependencies (src/port should be suitable for frontent). These were either leftovers or unnecessary thanks to the other changes in this commit. This eliminates unnecessary dependencies on backend symbols, which we don't want in src/port. Reported-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> https://postgr.es/m/CALdSSPi5fj0a7UG7Fmw2cUD1uWuckU_e8dJ+6x-bJEokcSXzqA@mail.gmail.com	2025-04-09 21:50:17 +02:00
Nathan Bossart	e2665efd0f	pg_upgrade: Mention that we preserve database OIDs in a comment. Oversight in commit aa01051418. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/4055696.1744134682%40sss.pgh.pa.us	2025-04-09 14:27:08 -05:00
Tom Lane	837cc73af2	Fix performance issue in deadlock-parallel isolation test. With debug_discard_caches = 1, the runtime of this test script increased by about a factor of 10 after commit 0dca5d68d. That's causing some of our buildfarm animals to fail with a timeout. The reason for the increased time is that now we are re-planning some intentionally-non-inlineable SQL functions on every execution, where the previous coding held onto the original plans throughout the outer query. The previous behavior was arguably quite buggy, so I don't think 0dca5d68d deserves blame here. But we would like this test script to not take so long. To fix, instead of forcing a "parallel safe" label via a non-inlineable SQL function, apply it directly to the advisory-lock functions by making internal-language aliases for them. A small problem is that the advisory-lock functions return void but this test would really like them to return integer 1. I cheated here by declaring the aliases as returning "int". That's perhaps undue familiarity with the implementation of PG_RETURN_VOID(), but that hasn't changed in twenty years and is unlikely to do so in the next twenty. That gets us an integer 0 result, and then an inline-able wrapper to convert that to an integer 1 allows the rest of the script to remain unchanged. For me, this reduces the runtime with debug_discard_caches = 1 by about 100x, making the test comfortably faster than before instead of slower. Discussion: https://postgr.es/m/136163.1744179562@sss.pgh.pa.us	2025-04-09 12:28:34 -04:00
Noah Misch	5bbc596391	Fix test races between syscache-update-pruned.spec and autovacuum. This spec fails ~3% of my Valgrind runs, and the spec has failed on Valgrind buildfarm member skink at a similar rate. Two problems contributed to that: - A competing buffer pin triggered VACUUM's lazy_scan_noprune() path, causing "tuples missed: 1 dead from 1 pages not removed due to cleanup lock contention". FREEZE fixes that. - The spec ran lazy VACUUM immediately after VACUUM FULL. The spec implicitly assumed lazy VACUUM prunes the one tuple that VACUUM FULL made dead. First wait for old snapshots, making that assumption reliable. This also adds two forms of defense in depth: - Wait for snapshots using shared catalog pruning rules (VISHORIZON_SHARED). This avoids the removable cutoff moving backward when an XID-bearing autoanalyze process runs in another database. That may never happen in this test, but it's cheap insurance. - Use lazy VACUUM option DISABLE_PAGE_SKIPPING. Commit c2dc1a79767a0f947e1145f82eb65dfe4360d25f did this for a related requirement in other tests, but I suspect FREEZE is necessary and sufficient in all these tests. Back-patch to v17, where the test first appeared. Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/sv3taq4e6ea4qckimien3nxp3sz4b6cw6sfcy4nhwl52zpur4g@h6i6tohxmizu Backpatch-through: 17	2025-04-09 07:23:39 -07:00
Heikki Linnakangas	0f1433f053	Fix a few oversights in the longer cancel keys patch Change MyCancelKeyLength's type from uint8 to int. While it always fits in a uint8, plain int is less surprising, as there's no particular reason for it to be uint8. Fix one ProcSignalInit caller that passed 'false' instead of NULL for the pointer argument. Author: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/61be9e31-7b7d-49d5-bc11-721800d89d64@eisentraut.org	2025-04-09 13:11:42 +03:00
Daniel Gustafsson	ef366b7d7e	Perform missed catversion bump Commit c57971034e69ca renamed an argument for a function but missed to bump the catversion to reflect this. Reported-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAApHDvqOega=dPtu3h2C5fJWJEuaGCMDib_sVfhKQqgUNJVmFA@mail.gmail.com	2025-04-09 09:29:12 +02:00
Tom Lane	dd496eedea	Doc: note that two examples in optimizer/README are oversimplified. These examples fail to account for join clauses generated by EquivalenceClasses, but since we haven't mentioned EquivalenceClasses yet it seems like it'd just add confusion to make them fully accurate. Instead, parenthetically note that they're oversimplified. Reported-by: Zeyuan Hu <ferrishu3886@gmail.com> Co-authored-by: David Rowley <dgrowleyml@gmail.com> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CACvHWmYFo+60yMqKJajDDvKN5EM41YHrCT3oxukwXmGAqpWvyw@mail.gmail.com	2025-04-08 23:03:33 -04:00
Tom Lane	b65b9da568	Adjust AdjustUpgrade.pm for commit b1720fe63. Need to delete the functions we no longer have available from the dumps to be reloaded from old versions. Per buildfarm.	2025-04-08 20:21:03 -04:00
Tom Lane	b1720fe63f	Move contrib/spi testing from core regression tests to contrib/spi. It's weird to have the core regression tests depending on contrib code, and coverage testing shows that those test queries add nothing to the core-code coverage of the core tests. So pull those test bits out and put them into ordinary test scripts inside contrib/spi/, making that more like other contrib modules. Aside from being structurally nicer, anything we can take out of the core tests (which are executed multiple times per check-world run) and put into tests executed only once should be a win. It doesn't look like this change will buy a whole lot of milliseconds, but a cycle saved is a cycle earned. Also, there is some discussion around possibly removing refint and/or autoinc altogether. I don't know if that will happen, but we'd certainly need to decouple them from the core tests to do so. The tests for autoinc were quite intertwined with the undocumented "ttdummy" trigger in regress.c. That made the tests very hard to understand and contributed nothing to autoinc's testing either. So I just deleted ttdummy and rewrote the autoinc tests without it. I realized while doing this that the description of autoinc in the SGML docs is not a great description of what the function actually does, so the patch includes some updates to those docs. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/3872677.1744077559@sss.pgh.pa.us	2025-04-08 19:12:03 -04:00
Daniel Gustafsson	c57971034e	Rename argument in pg_get_process_memory_contexts(). During development the third argument to pg_get_process_memory_contexts was a retry count, but it was changed to a timeout instead. The param name was accidentally left in pg_proc.dat though. Fix by renaming to the correct parameter name. Author: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/3eb40b3e-45c7-426a-b7f8-81f7d05a9b53@oss.nttdata.com	2025-04-08 23:09:13 +02:00
Nathan Bossart	b0a4c3e88b	Prevent 006_transfer_modes.pl from leaving files behind. This test was leaving files like delete_old_cluster.{sh,bat} in the source directory for VPATH and meson builds. To fix, change the directory to tmp_check before running the test, as was done in commits 15b6d21553, 8af917be6b, and c462b054ba. Oversight in commit af0d4901c1. Reported-by: Andrew Dunstan <andrew@dunslane.net> (on Discord) Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/Z_RHkG770w3SE0yU%40nathan	2025-04-08 10:57:31 -05:00
Amit Kapila	12eece5fd5	Fix uninitialized index information access during apply. The issue happens when building conflict information during apply of INSERT or UPDATE operations that violate unique constraints on leaf partitions. The problem was introduced in commit 9ff68679b5, which removed the redundant calls to ExecOpenIndices/ExecCloseIndices. The previous code was relying on the redundant ExecOpenIndices call in apply_handle_tuple_routing() to build the index information required for unique key conflict detection. The fix is to delay building the index information until a conflict is detected instead of relying on ExecOpenIndices to do the same. The additional benefit of this approach is that it avoids building index information when there is no conflict. Author: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by:Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/TYAPR01MB57244ADA33DDA57119B9D26494A62@TYAPR01MB5724.jpnprd01.prod.outlook.com	2025-04-08 15:35:42 +05:30
Thomas Munro	f78ca6f3eb	Introduce file_copy_method setting. It can be set to either COPY (the default) or CLONE if the system supports it. CLONE causes callers of copydir(), currently CREATE DATABASE ... STRATEGY=FILE_COPY and ALTER DATABASE ... SET TABLESPACE = ..., to use copy_file_range (Linux, FreeBSD) or copyfile (macOS) to copy files instead of a read-write loop over the contents. CLONE gives the kernel the opportunity to share block ranges on copy-on-write file systems and push copying down to storage on others, depending on configuration. On some systems CLONE can be used to clone large databases quickly with CREATE DATABASE ... TEMPLATE=source STRATEGY=FILE_COPY. Other operating systems could be supported; patches welcome. Co-authored-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Ranier Vilela <ranier.vf@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKGLM%2Bt%2BSwBU-cHeMUXJCOgBxSHLGZutV5zCwY4qrCcE02w%40mail.gmail.com	2025-04-08 21:35:38 +12:00
Daniel Gustafsson	042a66291b	Add function to get memory context stats for processes This adds a function for retrieving memory context statistics and information from backends as well as auxiliary processes. The intended usecase is cluster debugging when under memory pressure or unanticipated memory usage characteristics. When calling the function it sends a signal to the specified process to submit statistics regarding its memory contexts into dynamic shared memory. Each memory context is returned in detail, followed by a cumulative total in case the number of contexts exceed the max allocated amount of shared memory. Each process is limited to use at most 1Mb memory for this. A summary can also be explicitly requested by the user, this will return the TopMemoryContext and a cumulative total of all lower contexts. In order to not block on busy processes the caller specifies the number of seconds during which to retry before timing out. In the case where no statistics are published within the set timeout, the last known statistics are returned, or NULL if no previously published statistics exist. This allows dash- board type queries to continually publish even if the target process is temporarily congested. Context records contain a timestamp to indicate when they were submitted. Author: Rahila Syed <rahilasyed90@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Discussion: https://postgr.es/m/CAH2L28v8mc9HDt8QoSJ8TRmKau_8FM_HKS41NeO9-6ZAkuZKXw@mail.gmail.com	2025-04-08 11:06:56 +02:00
Andres Freund	15f0cb26b5	Increase BAS_BULKREAD based on effective_io_concurrency Before, BAS_BULKREAD was always of size 256kB. With the default io_combine_limit of 16, that only allowed 1-2 IOs to be in flight - insufficient even on very low latency storage. We don't just want to increase the size to a much larger hardcoded value, as very large rings (10s of MBs of of buffers), appear to have negative performance effects when reading in data that the OS has cached (but not when actually needing to do IO). To address this, increase the size of BAS_BULKREAD to allow for io_combine_limit * effective_io_concurrency buffers getting read in. To prevent the ring being much larger than useful, limit the increased size with GetPinLimit(). The formula outlined above keeps the ring size to sizes for which we have not observed performance regressions, unless very large effective_io_concurrency values are used together with large shared_buffers setting. Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/lqwghabtu2ak4wknzycufqjm5ijnxhb4k73vzphlt2a3wsemcd@gtftg44kdim6 Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah@brqs62irg4dt	2025-04-08 02:41:03 -04:00
Andres Freund	dcf7e1697b	Add pg_buffercache_evict_{relation,all} functions In addition to the added functions, the pg_buffercache_evict() function now shows whether the buffer was flushed. pg_buffercache_evict_relation(): Evicts all shared buffers in a relation at once. pg_buffercache_evict_all(): Evicts all shared buffers at once. Both functions provide mechanism to evict multiple shared buffers at once. They are designed to address the inefficiency of repeatedly calling pg_buffercache_evict() for each individual buffer, which can be time-consuming when dealing with large shared buffer pools. (e.g., ~477ms vs. ~2576ms for 16GB of fully populated shared buffers). These functions are intended for developer testing and debugging purposes and are available to superusers only. Minimal tests for the new functions are included. Also, there was no test for pg_buffercache_evict(), test for this added too. No new extension version is needed, as it was already increased this release by ba2a3c2302f. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Aidar Imamov <a.imamov@postgrespro.ru> Reviewed-by: Joseph Koshakow <koshy44@gmail.com> Discussion: https://postgr.es/m/CAN55FZ0h_YoSqqutxV6DES1RW8ig6wcA8CR9rJk358YRMxZFmw%40mail.gmail.com	2025-04-08 02:19:32 -04:00
David Rowley	d69d45a5a9	Speedup child EquivalenceMember lookup in planner When planning queries to partitioned tables, we clone all EquivalenceMembers belonging to the partitioned table into em_is_child EquivalenceMembers for each non-pruned partition. For partitioned tables with large numbers of partitions, this meant the ec_members list could become large and code searching that list would become slow. Effectively, the more partitions which were present, the more searches needed to be performed for operations such as find_ec_member_matching_expr() during create_plan() and the more partitions present, the longer these searches would take, i.e., a quadratic slowdown. To fix this, here we adjust how we store EquivalenceMembers for em_is_child members. Instead of storing these directly in ec_members, these are now stored in a new array of Lists in the EquivalenceClass, which is indexed by the relid. When we want to find EquivalenceMembers belonging to a certain child relation, we can narrow the search to the array element for that relation. To make EquivalenceMember lookup easier and to reduce the amount of code change, this commit provides a pair of functions to allow iteration over the EquivalenceMembers of an EC which also handles finding the child members, if required. Callers that never need to look at child members can remain using the foreach loop over ec_members, which will now often be faster due to only parent-level members being stored there. The actual performance increases here are highly dependent on the number of partitions and the query being planned. Performance increases can be visible with as few as 8 partitions, but the speedup is marginal for such low numbers of partitions. The speedups become much more visible with a few dozen to hundreds of partitions. With some tested queries using 56 partitions, the planner was around 3x faster than before. For use cases with thousands of partitions, these are likely to become significantly faster. Some testing has shown planner speedups of 60x or more with 8192 partitions. Author: Yuya Watari <watari.yuya@gmail.com> Co-authored-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andrey Lepikhov <a.lepikhov@postgrespro.ru> Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru> Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Tested-by: Thom Brown <thom@linux.com> Tested-by: newtglobal postgresql_contributors <postgresql_contributors@newtglobalcorp.com> Discussion: https://postgr.es/m/CAJ2pMkZNCgoUKSE%2B_5LthD%2BKbXKvq6h2hQN8Esxpxd%2Bcxmgomg%40mail.gmail.com	2025-04-08 18:09:57 +12:00

1 2 3 4 5 ...

45468 Commits