17684 Commits

Author SHA1 Message Date
Nathan Bossart
d5f1b6a75b Further adjust guidance for running vacuumdb after pg_upgrade.
Since pg_upgrade does not transfer the cumulative statistics used
to trigger autovacuum and autoanalyze, the server may take much
longer than expected to process them post-upgrade.  Currently, we
recommend analyzing only relations for which optimizer statistics
were not transferred by using the --analyze-in-stages and
--missing-stats-only options.  This commit appends another
recommendation to analyze all relations to update the relevant
cumulative statistics by using the --analyze-only option.  This is
similar to the recommendation for pg_stat_reset().

Reported-by: Christoph Berg <myon@debian.org>
Reviewed-by: Christoph Berg <myon@debian.org>
Discussion: https://postgr.es/m/aAfxfKC82B9NvJDj%40msg.df7cb.de
2025-04-30 14:12:59 -05:00
Nathan Bossart
f60420cff6 doc: Alphabetize long options for pg_dump[all].
The current ordering strategy for these pages is to list the short
options in alphabetical order followed by the long options in
alphabetical order.  If an option has both a short variant and a
long variant, the short variant takes precedence.  This commit
moves a few recently added options to match this style.  We should
probably adjust all pages and --help output to list the long and
short options in one combined alphabetical list (with the long
variants taking precedence), but that is a much larger change, so
it is left as a future exercise.

Oversights in commits a5cf808be5, 1fd1bd8710, and bde2fb797a.

Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/aBFBtsltgu3-IU1d%40nathan
2025-04-30 13:07:51 -05:00
Daniel Gustafsson
f8c115a6cb Typo and doc fixups for memory context reporting
This fixes comment and docs typos as well as a small documentation
change to make it clearer.  Found via post-commit review.

Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAH2L28vt16C9xTuK+K7QZvtA3kCNWXOEiT=gEekUw3Xxp9LVQw@mail.gmail.com
2025-04-30 11:10:27 +02:00
Tom Lane
810a8b1c80 Give up on running with NetBSD/OpenBSD's default semaphore settings.
This reverts commit 38da053463bef32adf563ddee5277d16d2b6c5af, which
attempted to preserve our ability to start with only 60 semaphores.

Subsequent changes (particularly 55b454d0e) have put that idea pretty
much permanently out of reach: people wishing to use Postgres v18 on
OpenBSD or NetBSD will have no choice but to increase those platforms'
default values of SEMMNI and SEMMNS.

Hence, revert 38da05346's changes in SEMAS_PER_SET and the minimum
tested value of max_connections.  Adjust a comment from the subsequent
patch 6d0154196, and tweak the wording in runtime.sgml to make it
clear that changing SEMMNI/SEMMNS is no longer even a little bit
optional on these platforms.

Although 38da05346 was later back-patched into v17, leave that branch
alone: it's still capable of starting with 60 semaphores, and there's
no reason to break that.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/E1tuZNv-0037Gs-34@gemulon.postgresql.org
Discussion: https://postgr.es/m/1052019.1745947915@sss.pgh.pa.us
2025-04-29 17:27:52 -04:00
Jacob Champion
45363fca63 Bump the minimum supported Python version to 3.6.8
Python 3.2 is no longer tested by the buildfarm, and there are only a
handful of buildfarm animals running versions older than 3.6, which
itself went end-of-life in 2021. Python 3.6.8 is the default version
shipped in RHEL8, so that seems like a reasonable baseline for PG18.

Now that we use the Python Limited API as of 0793ab810, older versions
of Python should continue functioning for users of PL/Python in
particular, so soften the language from "required" to "supported".

Wording by Tom Lane. Separate from the review of the patch itself,
several people provided input on the choice of cutoff: Christoph Berg,
Devrim Gündüz, Florents Tselai, Jelte Fennema-Nio, and Renan Alves
Fonseca. Thank you!

Suggested-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/16098.1745079444%40sss.pgh.pa.us
2025-04-29 13:04:19 -07:00
Bruce Momjian
faced8e6a4 doc: adjust max_files_per_process again
Reported-by: Andres Freund

Discussion: https://postgr.es/m/5yqochswkulckuzzrwgv2nqdrfh4k4coc4uwq4lvgzkfwnbjbd@46igbiwjabn2
2025-04-29 10:30:08 -04:00
Bruce Momjian
9a9e60fed3 doc: clarify new behavior of max_files_per_process 2025-04-29 09:45:41 -04:00
Peter Eisentraut
913c60b067 doc: Small example improvement
Add a comment character before a line annotation, so that the query
can be used as presented.

Reported-by: Yaroslav Saburov <y.saburov@gmail.com>
Author: Euler Taveira <euler@eulerto.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://www.postgresql.org/message-id/flat/174393459040.678.17810152410419444783%40wrigleys.postgresql.org
2025-04-29 14:43:35 +02:00
Amit Kapila
9807617a92 Doc: Specify the interaction of publish_generated_columns with column list.
Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAHut+PtnjLiNFFh-3f9cXH0wnwqjdkTjQNbVmZdZ1y+zKt_PPg@mail.gmail.com
2025-04-29 09:01:43 +05:30
Bruce Momjian
10e8176950 doc: improve wording of vacuum_max_eager_freeze_failure_rate 2025-04-26 11:41:23 -04:00
Fujii Masao
632f62dcec doc: remove unnecessary secondary index terms for replication settings.
Previously, config.sgml included secondary index terms for
max_replication_slots and max_active_replication_origins. These are
no longer necessary, as each parameter now has a single distinct index entry.

The secondary index terms were originally useful because
max_active_replication_origins was part of max_replication_slots,
and separate index entries helped users locate each setting. However,
commit 04ff636cbce split them into independent parameters,
making the secondary terms redundant.

This commit removes the unnecessary secondary index entries to
simplify the documentation.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://postgr.es/m/e825e7a7-4877-441d-93c1-25377db36c31@oss.nttdata.com
2025-04-25 14:58:14 +09:00
Bruce Momjian
6389db2320 doc: simplify new EXPLAIN ANALYZE BUFFERS description 2025-04-24 22:02:35 -04:00
Michael Paquier
e29df428a1 doc: Mention naming convention used by injection points
All the injection points used in the tree have relied on an implied
rule: their names should be made of lower-case characters, with dashes
between the words used.

This commit adds a light mention about that in the docs, encouraging the
practice.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://postgr.es/m/OSCPR01MB14966E14C1378DEE51FB7B7C5F5B32@OSCPR01MB14966.jpnprd01.prod.outlook.com
Backpatch-through: 17
2025-04-22 12:41:29 +09:00
David Rowley
0b06459f3c Doc: reword text explaining the --maintenance-db option
The previous text was a little clumsy.  Here we improve that.

Author: David Rowley <dgrowleyml@gmail.com>
Reported-by: Noboru Saito <noborusai@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/CAAM3qnJtv5YbjpwDfVOYN2gZ9zGSLFM1UGJgptSXmwfifOZJFQ@mail.gmail.com
Backpatch-through: 13
2025-04-22 14:54:22 +12:00
David Rowley
5e6f9a9c4e Doc: various fixups
* Use <symbol> tags for CONNECTION_* #defines

We were using an inconsistent mix of <literal> and sometimes <function>
tags.

* Use <application> tag for libpq

There was a mix of <literal> and <productname>

Also fix a whitespace issue.

None of these seem critical enough mistakes to backpatch.

Author: Noboru Saito <noborusai@gmail.com>
Discussion: https://postgr.es/m/CAAM3qnJtv5YbjpwDfVOYN2gZ9zGSLFM1UGJgptSXmwfifOZJFQ@mail.gmail.com
2025-04-22 11:10:08 +12:00
David Rowley
d010cc6cca Doc: fix incorrect punctuation
Author: Noboru Saito <noborusai@gmail.com>
Discussion: https://postgr.es/m/CAAM3qnJtv5YbjpwDfVOYN2gZ9zGSLFM1UGJgptSXmwfifOZJFQ@mail.gmail.com
Backpatch-through: 17
2025-04-22 11:04:04 +12:00
Fujii Masao
706cbed351 doc: Fix memory context level in pg_log_backend_memory_contexts() example.
Commit d9e03864b6b changed the memory context level numbers shown by
pg_log_backend_memory_contexts() to be 1-based. However, the example in
the documentation was not updated and still used 0-based numbering.

This commit updates the example to match the current 1-based output.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: David Rowley <drowleyml@gmail.com>
Discussion: https://postgr.es/m/1ad6d388-1b43-400d-bec9-36d52f755f74@oss.nttdata.com
2025-04-21 14:53:25 +09:00
Michael Paquier
88e947136b Fix typos and grammar in the code
The large majority of these have been introduced by recent commits done
in the v18 development cycle.

Author: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/9a7763ab-5252-429d-a943-b28941e0e28b@gmail.com
2025-04-19 19:17:42 +09:00
Tatsuo Ishii
05883bd6e5 Doc: fix missing comma at the end of a line.
Backpatch to 17, where the line was added.

Reported by Noboru Saito while he was working on translating the file
into Japanese.

Discussion: https://postgr.es/m/20250417.203047.1321297410457834775.ishii%40postgresql.org
Reported-by: Noboru Saito <noborusai@gmail.com>
Reviewed-by: Daniel Gustafs <daniel@yesql.se>
Backpatch-through: 17
2025-04-18 09:38:46 +09:00
Daniel Gustafsson
5ee476294c doc: Fix typos in documentation
This fixes a set of typos introduced during the v18 development
cycle.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/7038B4C5-2742-42B1-A8F0-0FFEAECF02A7@yesql.se
2025-04-15 21:32:18 +02:00
Fujii Masao
f840f8ee30 doc: Fix missing whitespace in pg_restore documentation.
Previously, a space was missing between "<option>--exclude-schema</option>"
and "for" in the pg_restore documentation. This commit fixes the typo by
adding the missing whitespace.

Back-patch to v17 where the typo was added.

Author: Lele Gaifax <lele@metapensiero.it>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/87lds3ysm0.fsf@metapensiero.it
Backpatch-through: 17
2025-04-15 23:15:06 +09:00
David Rowley
b51f86e49a Doc: use "an SQL" consistently rather than "a SQL"
Per the precedent set by 04539e73f, adjust article prefixes for "SQL" to
use "an" consistently rather than "a", i.e., "an es-que-ell" rather than
"a sequel".

Both of these are new to v18. Also see b1b13d2b5, d866f0374 and
7bdd489d3.
2025-04-14 11:55:18 +12:00
Tom Lane
78637a8be2 Doc: do a little copy-editing on Index Storage Parameters list.
Add a paragraph break per suggestion from David G. Johnston.
Use a consistent voice for all the different parameter
descriptions, and fix a couple of grammatical issues.

Reported-by: Igor Korot <ikorot01@gmail.com>
Co-authored-by: "David G. Johnston" <david.g.johnston@gmail.com>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA+FnnTz=EW1VQRpWB9J+G-NSchrPFcw4nR7d0JqzEK9jWKB35A@mail.gmail.com
2025-04-12 13:42:31 -04:00
Daniel Gustafsson
847bbb21f8 Fix recently introduced typos
This fixes typos in docs and comments introduced during the v18
development cycle, to keep them from ending up in backbranches.

Author: Jacob Brazeal <jacob.brazeal@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA+COZaCgGua25f2hSrjrDLJcJJAHkwoKgTTqUy-wyL1=64JNjw@mail.gmail.com
2025-04-11 22:17:12 +02:00
Nathan Bossart
5822bf21d5 Add missing space in pg_restore documentation.
Oversight in commit 1495eff7bd.
2025-04-11 10:05:32 -05:00
Tom Lane
d89335eea6 Doc: remove long-obsolete advice about generated constraint names.
It's been twenty years since we generated constraint names that
look like "$N".  So this advice about double-quoting such names
is well past its sell-by date, and now it merely seems confusing.

Reported-by: Yaroslav Saburov <y.saburov@gmail.com>
Author: "David G. Johnston" <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/174393459040.678.17810152410419444783@wrigleys.postgresql.org
Backpatch-through: 13
2025-04-10 14:49:10 -04:00
Amit Kapila
d438515c29 Cosmetic fixes for pg_createsubscriber's -all option.
Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAHut+PsmSCQ-ENSDQ0YOUcsgzT=GG-E9jyXBvxd51A_dMXH5XA@mail.gmail.com
2025-04-10 10:30:05 +05:30
Tom Lane
b1720fe63f Move contrib/spi testing from core regression tests to contrib/spi.
It's weird to have the core regression tests depending on contrib
code, and coverage testing shows that those test queries add nothing
to the core-code coverage of the core tests.  So pull those test bits
out and put them into ordinary test scripts inside contrib/spi/,
making that more like other contrib modules.

Aside from being structurally nicer, anything we can take out of the
core tests (which are executed multiple times per check-world run)
and put into tests executed only once should be a win.  It doesn't
look like this change will buy a whole lot of milliseconds, but a
cycle saved is a cycle earned.

Also, there is some discussion around possibly removing refint and/or
autoinc altogether.  I don't know if that will happen, but we'd
certainly need to decouple them from the core tests to do so.

The tests for autoinc were quite intertwined with the undocumented
"ttdummy" trigger in regress.c.  That made the tests very hard to
understand and contributed nothing to autoinc's testing either.
So I just deleted ttdummy and rewrote the autoinc tests without it.

I realized while doing this that the description of autoinc in
the SGML docs is not a great description of what the function
actually does, so the patch includes some updates to those docs.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/3872677.1744077559@sss.pgh.pa.us
2025-04-08 19:12:03 -04:00
Tomas Vondra
91f1fe90c7 pg_buffercache: Change page_num type to bigint
The page_num was defined as integer, which should be sufficient for the
near future (with 4K pages it's 8TB). But it's virtually free to return
bigint, and get a wider range. This was agreed on the thread, but I
forgot to tweak this in ba2a3c2302f1.

While at it, make the data types in CREATE VIEW a bit more consistent.

Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.co
2025-04-08 12:38:42 +02:00
Tomas Vondra
b8a6078ca8 doc: Correct pg_shmem_allocations_numa.size data type
The code in pg_get_shmem_allocations_numa() returned 'size' as int64,
but the docs said int32.

Report and fix by Noriyoshi Shinoda.

Reported-by: Noriyoshi Shinoda <noriyoshi.shinoda@hpe.com>
Discussion: https://postgr.es/m/DM4PR84MB1734308EB741A6ECFF040C27EEAA2@DM4PR84MB1734.NAMPRD84.PROD.OUTLOOK.COM
2025-04-08 12:36:36 +02:00
Thomas Munro
7ea21f4ee2 Fix typo in docs.
Typo in previous commit.
2025-04-08 22:02:45 +12:00
Thomas Munro
f78ca6f3eb Introduce file_copy_method setting.
It can be set to either COPY (the default) or CLONE if the system
supports it.  CLONE causes callers of copydir(), currently CREATE
DATABASE ... STRATEGY=FILE_COPY and ALTER DATABASE ... SET TABLESPACE =
..., to use copy_file_range (Linux, FreeBSD) or copyfile (macOS) to copy
files instead of a read-write loop over the contents.

CLONE gives the kernel the opportunity to share block ranges on
copy-on-write file systems and push copying down to storage on others,
depending on configuration.  On some systems CLONE can be used to clone
large databases quickly with CREATE DATABASE ... TEMPLATE=source
STRATEGY=FILE_COPY.

Other operating systems could be supported; patches welcome.

Co-authored-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGLM%2Bt%2BSwBU-cHeMUXJCOgBxSHLGZutV5zCwY4qrCcE02w%40mail.gmail.com
2025-04-08 21:35:38 +12:00
Daniel Gustafsson
042a66291b Add function to get memory context stats for processes
This adds a function for retrieving memory context statistics
and information from backends as well as auxiliary processes.
The intended usecase is cluster debugging when under memory
pressure or unanticipated memory usage characteristics.

When calling the function it sends a signal to the specified
process to submit statistics regarding its memory contexts
into dynamic shared memory.  Each memory context is returned
in detail, followed by a cumulative total in case the number
of contexts exceed the max allocated amount of shared memory.
Each process is limited to use at most 1Mb memory for this.

A summary can also be explicitly requested by the user, this
will return the TopMemoryContext and a cumulative total of
all lower contexts.

In order to not block on busy processes the caller specifies
the number of seconds during which to retry before timing out.
In the case where no statistics are published within the set
timeout,  the last known statistics are returned, or NULL if
no previously published statistics exist.  This allows dash-
board type queries to continually publish even if the target
process is temporarily congested.  Context records contain a
timestamp to indicate when they were submitted.

Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Discussion: https://postgr.es/m/CAH2L28v8mc9HDt8QoSJ8TRmKau_8FM_HKS41NeO9-6ZAkuZKXw@mail.gmail.com
2025-04-08 11:06:56 +02:00
Andres Freund
dcf7e1697b Add pg_buffercache_evict_{relation,all} functions
In addition to the added functions, the pg_buffercache_evict() function now
shows whether the buffer was flushed.

pg_buffercache_evict_relation(): Evicts all shared buffers in a
relation at once.
pg_buffercache_evict_all(): Evicts all shared buffers at once.

Both functions provide mechanism to evict multiple shared buffers at
once. They are designed to address the inefficiency of repeatedly calling
pg_buffercache_evict() for each individual buffer, which can be time-consuming
when dealing with large shared buffer pools. (e.g., ~477ms vs. ~2576ms for
16GB of fully populated shared buffers).

These functions are intended for developer testing and debugging
purposes and are available to superusers only.

Minimal tests for the new functions are included. Also, there was no test for
pg_buffercache_evict(), test for this added too.

No new extension version is needed, as it was already increased this release
by ba2a3c2302f.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Aidar Imamov <a.imamov@postgrespro.ru>
Reviewed-by: Joseph Koshakow <koshy44@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ0h_YoSqqutxV6DES1RW8ig6wcA8CR9rJk358YRMxZFmw%40mail.gmail.com
2025-04-08 02:19:32 -04:00
Bruce Momjian
46b4ba533c Fix PG 17 [NOT] NULL optimization bug for domains
A PG 17 optimization allowed columns with NOT NULL constraints to skip
table scans for IS NULL queries, and to skip IS NOT NULL checks for IS
NOT NULL queries.  This didn't work for domain types, since domain types
don't follow the IS NULL/IS NOT NULL constraint logic.  To fix, disable
this optimization for domains for PG 17+.

Reported-by: Jan Behrens

Diagnosed-by: Tom Lane

Discussion: https://postgr.es/m/Z37p0paENWWUarj-@momjian.us

Backpatch-through: 17
2025-04-07 21:33:42 -04:00
Tomas Vondra
ba2a3c2302 Add pg_buffercache_numa view with NUMA node info
Introduces a new view pg_buffercache_numa, showing NUMA memory nodes
for individual buffers. For each buffer the view returns an entry for
each memory page, with the associated NUMA node.

The database blocks and OS memory pages may have different size - the
default block size is 8KB, while the memory page is 4K (on x86). But
other combinations are possible, depending on configure parameters,
platform, etc. This means buffers may overlap with multiple memory
pages, each associated with a different NUMA node.

To determine the NUMA node for a buffer, we first need to touch the
memory pages using pg_numa_touch_mem_if_required, otherwise we might get
status -2 (ENOENT = The page is not present), indicating the page is
either unmapped or unallocated.

The view may be relatively expensive, especially when accessed for the
first time in a backend, as it touches all memory pages to get reliable
information about the NUMA node. This may also force allocation of the
shared memory.

Author: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.com
2025-04-07 23:08:17 +02:00
Tomas Vondra
8cc139bec3 Introduce pg_shmem_allocations_numa view
Introduce new pg_shmem_alloctions_numa view with information about how
shared memory is distributed across NUMA nodes. For each shared memory
segment, the view returns one row for each NUMA node backing it, with
the total amount of memory allocated from that node.

The view may be relatively expensive, especially when executed for the
first time in a backend, as it has to touch all memory pages to get
reliable information about the NUMA node. This may also force allocation
of the shared memory.

Unlike pg_shmem_allocations, the view does not show anonymous shared
memory allocations. It also does not show memory allocated using the
dynamic shared memory infrastructure.

Author: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.com
2025-04-07 23:08:17 +02:00
Tomas Vondra
65c298f61f Add support for basic NUMA awareness
Add basic NUMA awareness routines, using a minimal src/port/pg_numa.c
portability wrapper and an optional build dependency, enabled by
--with-libnuma configure option. For now this is Linux-only, other
platforms may be supported later.

A built-in SQL function pg_numa_available() allows checking NUMA
support, i.e. that the server was built/linked with the NUMA library.

The main function introduced is pg_numa_query_pages(), which allows
determining the NUMA node for individual memory pages. Internally the
function uses move_pages(2) syscall, as it allows batching, and is more
efficient than get_mempolicy(2).

Author: Jakub Wartak <jakub.wartak@enterprisedb.com>
Co-authored-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.com
2025-04-07 23:08:17 +02:00
Tom Lane
8cfbdf8f4d Fix some issues in contrib/spi/refint.c.
check_foreign_key incorrectly used a single cache entry for its saved
plans for a 'c' (cascade) trigger, although there are two different
queries to execute depending on whether it fires for an update or a
delete.  This caused the wrong things to be done if both types of
event occur in one session.  (This was indeed visible in the triggers
regression test, but apparently nobody ever questioned it.)  To fix,
add the operation type to the cache key.

Its debug log output failed to distinguish update from delete
events, too.

Also, change the intended trigger usage from BEFORE ROW to AFTER ROW,
and add checks insisting on that usage.  BEFORE is really rather
unsafe, since if there are other BEFORE triggers they might change or
cancel the operation we are trying to check.  AFTER triggers are the
standard way to propagate changes to other rows, so we should follow
that way here.

In passing, remove a useless duplicate lookup of the cache entry.

This code is mostly intended as a documentation example, so we
won't consider a back-patch.

Author: Dmitrii Bondar <d.bondar@postgrespro.ru>
Reviewed-by: Paul Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Lilian Ontowhee <ontowhee@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/79755a2b18ed4fe5e29da6a87a1e00d1@postgrespro.ru
2025-04-07 15:54:16 -04:00
Masahiko Sawada
a13d49014d doc: Fix a typo in pg_recvlogical documentation.
Oversight in cf2655a9029a.

Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Discussion: https://postgr.es/m/OS3PR01MB5718DD1466E2B9043448AE5094AA2@OS3PR01MB5718.jpnprd01.prod.outlook.com
2025-04-07 12:13:08 -07:00
Álvaro Herrera
a379061a22
Allow NOT NULL constraints to be added as NOT VALID
This allows them to be added without scanning the table, and validating
them afterwards without holding access exclusive lock on the table after
any violating rows have been deleted or fixed.

Doing ALTER TABLE ... SET NOT NULL for a column that has an invalid
not-null constraint validates that constraint.  ALTER TABLE .. VALIDATE
CONSTRAINT is also supported.  There are various checks on whether an
invalid constraint is allowed in a child table when the parent table has
a valid constraint; this should match what we do for enforced/not
enforced constraints.

pg_attribute.attnotnull is now only an indicator for whether a not-null
constraint exists for the column; whether it's valid or invalid must be
queried in pg_constraint.  Applications can continue to query
pg_attribute.attnotnull as before, but now it's possible that NULL rows
are present in the column even when that's set to true.

For backend internal purposes, we cache the nullability status in
CompactAttribute->attnullability that each tuple descriptor carries
(replacing CompactAttribute.attnotnull, which was a mirror of
Form_pg_attribute.attnotnull).  During the initial tuple descriptor
creation, based on the pg_attribute scan, we set this to UNRESTRICTED if
pg_attribute.attnotnull is false, or to UNKNOWN if it's true; then we
update the latter to VALID or INVALID depending on the pg_constraint
scan.  This flag is also copied when tupledescs are copied.

Comparing tuple descs for equality must also compare the
CompactAttribute.attnullability flag and return false in case of a
mismatch.

pg_dump deals with these constraints by storing the OIDs of invalid
not-null constraints in a separate array, and running a query to obtain
their properties.  The regular table creation SQL omits them entirely.
They are then dealt with in the same way as "separate" CHECK
constraints, and dumped after the data has been loaded.  Because no
additional pg_dump infrastructure was required, we don't bump its
version number.

I decided not to bump catversion either, because the old catalog state
works perfectly in the new world.  (Trying to run with new catalog state
and the old server version would likely run into issues, however.)

System catalogs do not support invalid not-null constraints (because
commit 14e87ffa5c54 didn't allow them to have pg_constraint rows
anyway.)

Author: Rushabh Lathia <rushabh.lathia@gmail.com>
Author: Jian He <jian.universality@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Tested-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAGPqQf0KitkNack4F5CFkFi-9Dqvp29Ro=EpcWt=4_hs-Rt+bQ@mail.gmail.com
2025-04-07 19:19:50 +02:00
Tom Lane
3516ea768c Add local-address escape "%L" to log_line_prefix.
This escape shows the numeric server IP address that the client
has connected to.  Unix-socket connections will show "[local]".
Non-client processes (e.g. background processes) will show "[none]".

We expect that this option will be of interest to only a fairly
small number of users.  Therefore the implementation is optimized
for the case where it's not used (that is, we don't do the string
conversion until we have to), and we've not added the field to
csvlog or jsonlog formats.

Author: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Cary Huang <cary.huang@highgo.ca>
Reviewed-by: David Steele <david@pgmasters.net>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAKAnmmK-U+UicE-qbNU23K--Q5XTLdM6bj+gbkZBZkjyjrd3Ow@mail.gmail.com
2025-04-07 11:06:05 -04:00
Daniel Gustafsson
a233a603ba doc: Clarify project naming
Clarify the project naming in the history section of the docs
to match the recent license preamble changes.

Backpatch to all supported versions.

Author: Dave Page <dpage@pgadmin.org>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA+OCxozLzK2+Jc14XZyWXSp6L9Ot+3efwXUE35FJG=fsbib2EA@mail.gmail.com
Backpatch-through: 13
2025-04-07 00:03:18 +02:00
Tom Lane
218ab68275 Doc: fix PDF "contents ... exceed the available area" warnings.
Tweak column widths in a new table, similarly to some previous
fixes such as b62381d9a.

Per buildfarm.
2025-04-06 16:27:39 -04:00
Álvaro Herrera
749a9e20c9
Add modern SHA-2 based password hashes to pgcrypto.
This adapts the publicly available reference implementation on
https://www.akkadia.org/drepper/SHA-crypt.txt and adds the new hash
algorithms sha256crypt and sha512crypt to crypt() and gen_salt()
respectively.

Author: Bernd Helmle <mailings@oopsware.de>
Reviewed-by: Japin Li <japinli@hotmail.com>
Discussion: https://postgr.es/m/c763235a2757e2f5f9e3e27268b9028349cef659.camel@oopsware.de
2025-04-05 19:17:13 +02:00
Andrew Dunstan
1495eff7bd Non text modes for pg_dumpall, correspondingly change pg_restore
pg_dumpall acquires a new -F/--format option, with the same meanings as
pg_dump. The default is p, meaning plain text. For any other value, a
directory is created containing two files, globals.data and map.dat. The
first contains SQL for restoring the global data, and the second
contains a map from oids to database names. It will also contain a
subdirectory called databases, inside which it will create archives in
the specified format, named using the database oids.

In these casess the -f argument is required.

If pg_restore encounters a directory containing globals.dat, and no
toc.dat, it restores the global settings and then restores each
database.

pg_restore acquires two new options: -g/--globals-only which suppresses
restoration of any databases, and --exclude-database which inhibits
restoration of particualr database(s) in the same way the same option
works in pg_dumpall.

Author: Mahendra Singh Thalor <mahi6run@gmail.com>
Co-authored-by:  Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: jian he <jian.universality@gmail.com>
Reviewed-by: Srinath Reddy <srinath2133@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>

Discussion: https://postgr.es/m/cb103623-8ee6-4ba5-a2c9-f32e3a4933fa@dunslane.net
2025-04-04 16:01:22 -04:00
Masahiko Sawada
cf2655a902 pg_recvlogical: Add --failover option.
This new option instructs pg_recvlogical to create the logical
replication slot with the failover option enabled. It can be used in
conjunction with the --create-slot option.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Michael Banck <mbanck@gmx.net>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/OSCPR01MB14966C54097FC83AF19F3516BF5AC2@OSCPR01MB14966.jpnprd01.prod.outlook.com
2025-04-04 10:39:57 -07:00
Peter Geoghegan
92fe23d93a Add nbtree skip scan optimization.
Teach nbtree multi-column index scans to opportunistically skip over
irrelevant sections of the index given a query with no "=" conditions on
one or more prefix index columns.  When nbtree is passed input scan keys
derived from a predicate "WHERE b = 5", new nbtree preprocessing steps
output "WHERE a = ANY(<every possible 'a' value>) AND b = 5" scan keys.
That is, preprocessing generates a "skip array" (and an output scan key)
for the omitted prefix column "a", which makes it safe to mark the scan
key on "b" as required to continue the scan.  The scan is therefore able
to repeatedly reposition itself by applying both the "a" and "b" keys.

A skip array has "elements" that are generated procedurally and on
demand, but otherwise works just like a regular ScalarArrayOp array.
Preprocessing can freely add a skip array before or after any input
ScalarArrayOp arrays.  Index scans with a skip array decide when and
where to reposition the scan using the same approach as any other scan
with array keys.  This design builds on the design for array advancement
and primitive scan scheduling added to Postgres 17 by commit 5bf748b8.

Testing has shown that skip scans of an index with a low cardinality
skipped prefix column can be multiple orders of magnitude faster than an
equivalent full index scan (or sequential scan).  In general, the
cardinality of the scan's skipped column(s) limits the number of leaf
pages that can be skipped over.

The core B-Tree operator classes on most discrete types generate their
array elements with the help of their own custom skip support routine.
This infrastructure gives nbtree a way to generate the next required
array element by incrementing (or decrementing) the current array value.
It can reduce the number of index descents in cases where the next
possible indexable value frequently turns out to be the next value
stored in the index.  Opclasses that lack a skip support routine fall
back on having nbtree "increment" (or "decrement") a skip array's
current element by setting the NEXT (or PRIOR) scan key flag, without
directly changing the scan key's sk_argument.  These sentinel values
behave just like any other value from an array -- though they can never
locate equal index tuples (they can only locate the next group of index
tuples containing the next set of non-sentinel values that the scan's
arrays need to advance to).

A skip array's range is constrained by "contradictory" inequality keys.
For example, a skip array on "x" will only generate the values 1 and 2
given a qual such as "WHERE x BETWEEN 1 AND 2 AND y = 66".  Such a skip
array qual usually has near-identical performance characteristics to a
comparable SAOP qual "WHERE x = ANY('{1, 2}') AND y = 66".  However,
improved performance isn't guaranteed.  Much depends on physical index
characteristics.

B-Tree preprocessing is optimistic about skipping working out: it
applies static, generic rules when determining where to generate skip
arrays, which assumes that the runtime overhead of maintaining skip
arrays will pay for itself -- or lead to only a modest performance loss.
As things stand, these assumptions are much too optimistic: skip array
maintenance will lead to unacceptable regressions with unsympathetic
queries (queries whose scan can't skip over many irrelevant leaf pages).
An upcoming commit will address the problems in this area by enhancing
_bt_readpage's approach to saving cycles on scan key evaluation, making
it work in a way that directly considers the needs of = array keys
(particularly = skip array keys).

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Masahiro Ikeda <masahiro.ikeda@nttdata.com>
Reviewed-By: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Alena Rybakina <a.rybakina@postgrespro.ru>
Discussion: https://postgr.es/m/CAH2-Wzmn1YsLzOGgjAQZdn1STSG_y8qP__vggTaPAYXJP+G4bw@mail.gmail.com
2025-04-04 12:27:04 -04:00
Heikki Linnakangas
b4f453f6ab docs: Clarify that NULL arg to set_config() means reset to default
Author: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Zhang Mingli <zmlpostgres@gmail.com>
Discussion: https://www.postgresql.org/message-id/CAKFQuwY0SK6JdCci1VJX6xsztRXgGeVEY-grkENZx%2B3CZpyPcQ@mail.gmail.com
2025-04-04 15:17:17 +03:00
Fujii Masao
534874fac0 Allow "COPY table TO" command to copy rows from materialized views.
Previously, "COPY table TO" command worked only with plain tables and
did not support materialized views, even when they were populated and
had physical storage. To copy rows from materialized views,
"COPY (query) TO" command had to be used, instead.

This commit extends "COPY table TO" to support populated materialized
views directly, improving usability and performance, as "COPY table TO"
is generally faster than "COPY (query) TO". Note that copying from
unpopulated materialized views will still result in an error.

Author: jian he <jian.universality@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CACJufxHVxnyRYy67hiPePNCPwVBMzhTQ6FaL9_Te5On9udG=yg@mail.gmail.com
2025-04-04 19:32:00 +09:00