Compare commits

...

18 Commits

Author SHA1 Message Date
Michael Paquier
fd7d7b7191 Improve checks for GUC recovery_target_timeline
Currently check_recovery_target_timeline() converts any value that is
not "current", "latest", or a valid integer to 0.  So, for example, the
following configuration added to postgresql.conf followed by a startup:
recovery_target_timeline = 'bogus'
recovery_target_timeline = '9999999999'

...  results in the following error patterns:
FATAL:  22023: recovery target timeline 0 does not exist
FATAL:  22023: recovery target timeline 1410065407 does not exist

This is confusing, because the server does not report what the user
actually configured: the timeline number in the error is unrelated to the
value given for the GUC.

The root of the problem is that no range check is performed on the GUC
value passed in for recovery_target_timeline.  This commit improves the
situation by using strtou64() and by applying stricter range checks.  Test
cases are added for an incorrect timeline value and for upper-bound and
lower-bound timeline values, checking the sanity of the reports based on
the contents of the server logs.

Author: David Steele <david@pgmasters.net>
Discussion: https://postgr.es/m/e5d472c7-e9be-4710-8dc4-ebe721b62cea@pgbackrest.org
2025-07-03 11:14:20 +09:00
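
For illustration, here is a minimal sketch of the stricter check described above, written as a standard GUC check hook.  The shape follows the usual check-hook pattern, but the details are an approximation rather than a copy of the committed code; timeline IDs are assumed to be 32-bit values starting at 1.

    /* Sketch only: stricter validation for recovery_target_timeline. */
    static bool
    check_recovery_target_timeline(char **newval, void **extra, GucSource source)
    {
        if (strcmp(*newval, "current") != 0 && strcmp(*newval, "latest") != 0)
        {
            char       *endp;
            uint64      tli;

            errno = 0;
            tli = strtou64(*newval, &endp, 10);

            /* reject trailing junk, overflow, and out-of-range timelines */
            if (*endp != '\0' || errno == ERANGE ||
                tli < 1 || tli > PG_UINT32_MAX)
            {
                GUC_check_errdetail("\"%s\" is not a valid timeline.", *newval);
                return false;
            }
        }
        return true;
    }
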
Richard Guo
0da29e4cb1 Enable use of Memoize for ANTI joins
Currently, we do not support Memoize for SEMI and ANTI joins because
nested loop SEMI/ANTI joins do not scan the inner relation to
completion, which prevents Memoize from marking the cache entry as
complete.  One might argue that we could mark the cache entry as
complete after fetching the first inner tuple, but that would not be
safe: if the first inner tuple and the current outer tuple do not
satisfy the join clauses, a second inner tuple matching the parameters
would find the cache entry already marked as complete.

However, if the inner side is provably unique, this issue doesn't
arise, since there would be no second matching tuple.  That said, this
doesn't help in the case of SEMI joins, because a SEMI join with a
provably unique inner side would already have been reduced to an inner
join by reduce_unique_semijoins.

Therefore, in this patch, we check whether the inner relation is
provably unique for ANTI joins and enable the use of Memoize in such
cases.

Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Discussion: https://postgr.es/m/CAMbWs48FdLiMNrmJL-g6mDvoQVt0yNyJAqMkv4e2Pk-5GKCZLA@mail.gmail.com
2025-07-03 10:57:26 +09:00
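
A hedged sketch of the planner-side test this implies, using the join-type and uniqueness information available where Memoize paths are considered; the exact committed form may differ:

    /* Sketch: Memoize eligibility by join type, per the reasoning above */
    if (jointype == JOIN_SEMI)
        return NULL;    /* a provably-unique SEMI join was already reduced */
    if (jointype == JOIN_ANTI && !extra->inner_unique)
        return NULL;    /* cache entries could never be marked complete */
    /* otherwise, a Memoize path over the inner side may be considered */
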
Michael Paquier
7b2eb72b1b Add InjectionPointList() to retrieve list of injection points
This routine is useful to retrieve the list of injection points currently
attached in a system.  One use would be in a set-returning function;
out-of-core code could also make use of it.

This hides the internals of the shared memory array holding the
information about the injection points (point name, library and function
name), allocating the result in a palloc'd List consumable by the caller.

Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Rahila Syed <rahilasyed90@gmail.com>
Discussion: https://postgr.es/m/Z_xYkA21KyLEHvWR@paquier.xyz
Discussion: https://postgr.es/m/aBG2rPwl3GE7m1-Q@paquier.xyz
2025-07-03 08:41:25 +09:00
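
A sketch of how a caller might consume the palloc'd List; the element type and its field names follow the description above (point name, library name, function name) and should be treated as an approximation:

    List       *points = InjectionPointList();
    ListCell   *lc;

    foreach(lc, points)
    {
        InjectionPointData *point = lfirst(lc);

        elog(DEBUG1, "injection point \"%s\" (%s in %s)",
             point->name, point->function, point->library);
    }
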
Tom Lane
fe05430ace Correctly copy the target host identification in PQcancelCreate.
PQcancelCreate failed to copy struct pg_conn_host's "type" field,
instead leaving it zero (a/k/a CHT_HOST_NAME).  This seemingly
has no great ill effects if it should have been CHT_UNIX_SOCKET
instead, but if it should have been CHT_HOST_ADDRESS then a
null-pointer dereference will occur when the cancelConn is used.

Bug: #18974
Reported-by: Maxim Boguk <maxim.boguk@gmail.com>
Author: Sergei Kornilov <sk@zsrv.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18974-575f02b2168b36b3@postgresql.org
Backpatch-through: 17
2025-07-02 15:48:02 -04:00
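
The shape of the fix, sketched with placeholder src/dst pointers standing for the original and cancel connections' host entries: every field of struct pg_conn_host must be copied, including "type", whose zero value happens to equal CHT_HOST_NAME and so masked the omission except in the CHT_HOST_ADDRESS case.

    /* illustrative only; src/dst are placeholders */
    dst->type = src->type;          /* the previously missed field */
    if (src->host)
        dst->host = strdup(src->host);
    if (src->hostaddr)
        dst->hostaddr = strdup(src->hostaddr);
    if (src->port)
        dst->port = strdup(src->port);
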
Nathan Bossart
0c2b7174c3 Fix cross-version upgrade test breakage from commit fe07100e82.
In commit fe07100e82, I renamed a couple of functions in
test_dsm_registry to make it clear what they are testing.  However,
the buildfarm's cross-version upgrade tests run pg_upgrade with the
test modules installed, so this caused errors like:

    ERROR:  could not find function "get_val_in_shmem" in file ".../test_dsm_registry.so"

To fix, revert those renames.  I could probably get away with only
un-renaming the C symbols, but I figured I'd avoid introducing
function name mismatches.  Also, AFAICT the buildfarm's
cross-version upgrade tests do not run the test module tests
post-upgrade, else we'll need to properly version the extension.

Per buildfarm member crake.

Discussion: https://postgr.es/m/aGVuYUNW23tStUYs%40nathan
2025-07-02 13:26:33 -05:00
Nathan Bossart
bb109382ef Make more use of RELATION_IS_OTHER_TEMP().
A few places were open-coding it instead of using this handy macro.

Author: Junwang Zhao <zhjwpku@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAEG8a3LjTGJcOcxQx-SUOGoxstG4XuCWLH0ATJKKt_aBTE5K8w%40mail.gmail.com
2025-07-02 12:32:19 -05:00
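
For reference, the macro tests whether a relation is a temporary relation belonging to another session; per its definition in rel.h, it replaces open-coded checks of this form:

    /* open-coded form */
    if (rel->rd_rel->relpersistence == RELPERSISTENCE_TEMP &&
        !rel->rd_islocaltemp)
        elog(ERROR, "cannot access temporary tables of other sessions");

    /* equivalent macro form */
    if (RELATION_IS_OTHER_TEMP(rel))
        elog(ERROR, "cannot access temporary tables of other sessions");
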
Nathan Bossart
fe07100e82 Add GetNamedDSA() and GetNamedDSHash().
Presently, the dynamic shared memory (DSM) registry only provides
GetNamedDSMSegment(), which allocates a fixed-size segment.  To use
the DSM registry for more sophisticated things like dynamic shared
memory areas (DSAs) or a hash table backed by a DSA (dshash), users
need to create a DSM segment that stores various handles and LWLock
tranche IDs and to write fairly complicated initialization code.
Furthermore, there is likely little variation in this
initialization code between libraries.

This commit introduces functions that simplify allocating a DSA or
dshash within the DSM registry.  These functions are very similar
to GetNamedDSMSegment().  Notable differences include the lack of
an initialization callback parameter and the prohibition of calling
the functions more than once for a given entry in each backend
(which should be trivially avoidable in most circumstances).  While
at it, this commit bumps the maximum DSM registry entry name length
from 63 bytes to 127 bytes.

Also note that even though one could presumably detach/destroy the
DSAs and dshashes created in the registry, such use-cases are not
yet well-supported, if for no other reason than the associated DSM
registry entries cannot be removed.  Adding such support is left as
a future exercise.

The test_dsm_registry test module contains tests for the new
functions and also serves as a complete usage example.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Florents Tselai <florents.tselai@gmail.com>
Reviewed-by: Rahila Syed <rahilasyed90@gmail.com>
Discussion: https://postgr.es/m/aEC8HGy2tRQjZg_8%40nathan
2025-07-02 11:50:52 -05:00
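
A usage sketch under the description above; the "found" convention mirrors GetNamedDSMSegment(), while the entry names and the exact parameter lists are assumptions for illustration:

    bool        found;
    dsa_area   *area;
    dshash_table *htab;

    /* attach to (or first create) a registry-managed DSA and dshash */
    area = GetNamedDSA("my_extension_area", &found);
    htab = GetNamedDSHash("my_extension_hash", &my_dshash_params, &found);

    /* per the message above, call each at most once per entry per backend */
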
Peter Geoghegan
9ca30a0b04 Update obsolete row compare preprocessing comments.
Restore the nbtree preprocessing comments describing how we mark nbtree row
compare members required to the way they read prior to 2016 bugfix commit
a298a1e0.

Oversight in commit bd3f59fd, which made nbtree preprocessing revert to
the original 2006 rules, but neglected to revert these comments.

Backpatch-through: 18
2025-07-02 12:36:35 -04:00
Tom Lane
7374b3a536 Allow width_bucket()'s "operand" input to be NaN.
The array-based variant of width_bucket() has always accepted NaN
inputs, treating them as equal but larger than any non-NaN,
as we do in ordinary comparisons.  But up to now, the four-argument
variants threw errors for a NaN operand.  This is inconsistent
and unnecessary, since we can perfectly well regard NaN as falling
after the last bucket.

We do still throw error for NaN or infinity histogram-bound inputs,
since there's no way to compute sensible bucket boundaries.

Arguably this is a bug fix, but given the lack of field complaints
I'm content to fix it in master.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/2822872.1750540911@sss.pgh.pa.us
2025-07-02 11:34:40 -04:00
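
A standalone sketch of the new rule for the four-argument variants, assuming ascending, finite bounds (the committed code also handles descending bounds and still errors out on NaN or infinity histogram bounds):

    #include <math.h>

    /* "count" buckets span [bound1, bound2); NaN sorts after all non-NaN */
    static int
    bucket_of(double operand, double bound1, double bound2, int count)
    {
        if (isnan(operand) || operand >= bound2)
            return count + 1;   /* overflow bucket, now also used for NaN */
        if (operand < bound1)
            return 0;           /* underflow bucket */
        return (int) floor((operand - bound1) * count / (bound2 - bound1)) + 1;
    }
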
Álvaro Herrera
c989affb52
Fix error message for ALTER CONSTRAINT ... NOT VALID
Trying to alter a constraint so that it becomes NOT VALID results in an
error that assumes the constraint is a foreign key.  This is potentially
wrong, so give a more generic error message.

While at it, give CREATE CONSTRAINT TRIGGER a better error message as
well.

Co-authored-by: jian he <jian.universality@gmail.com>
Co-authored-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de>
Co-authored-by: Amul Sul <sulamul@gmail.com>
Discussion: https://postgr.es/m/CACJufxHSp2puxP=q8ZtUGL1F+heapnzqFBZy5ZNGUjUgwjBqTQ@mail.gmail.com
2025-07-02 17:02:27 +02:00
Peter Geoghegan
bd3f59fdb7 Make row compares robust during nbtree array scans.
Recent nbtree bugfix commit 5f4d98d4 added a special case to the code
that sets up a page-level prefix of keys that are definitely satisfied
by every tuple on the page: whenever _bt_set_startikey reached a row
compare key, we'd refuse to apply the pstate.forcenonrequired behavior
in scans where that usually happens (scans with a higher-order array
key).  That hack made the scan avoid essentially the same infinite
cycling behavior that also affected nbtree scans with redundant keys
(keys that preprocessing could not eliminate) prior to commit f09816a0.
There are now serious doubts about this row compare workaround.

Testing has shown that a scan with a row compare key and an array key
could still read the same leaf page twice (without the scan's direction
changing), which isn't supposed to be possible following the SAOP
enhancements added by Postgres 17 commit 5bf748b8.  Also, we still
allowed a required row compare key to be used with forcenonrequired mode
when its header key happened to be beyond the pstate.ikey set by
_bt_set_startikey, which was complicated and brittle.

The underlying problem was that row compares had inconsistent rules
around how scans start (which keys can be used for initial positioning
purposes) and how scans end (which keys can set continuescan=false).
Quals with redundant keys that could not be eliminated by preprocessing
also had that same quality to them prior to today's bugfix f09816a0.  It
now seems prudent to bring row compare keys in line with the new charter
for required keys, by making the start and end rules symmetric.

This commit fixes two points of disagreement between _bt_first and
_bt_check_rowcompare.  Firstly, _bt_check_rowcompare was capable of
ending the scan at the point where it needed to compare an ISNULL-marked
row compare member that came immediately after a required row compare
member.  _bt_first now has symmetric handling for NULL row compares.
Secondly, _bt_first had its own ideas about which keys were safe to use
for initial positioning purposes.  It could use fewer or more keys than
_bt_check_rowcompare.  _bt_first now uses the same requiredness markings
as _bt_check_rowcompare for this.

Now that _bt_first and _bt_check_rowcompare agree on how to start and
end scans, we can get rid of the forcenonrequired special case, without
any risk of infinite cycling.  This approach also makes row compare keys
behave more like regular scalar keys, particularly within _bt_first.

Fixing these inconsistencies necessitates dealing with a related issue
with the way that row compares were marked required by preprocessing: we
didn't mark any lower-order row members required following 2016 bugfix
commit a298a1e0.  That approach was overly broad.  The bug in question was
actually an oversight in how _bt_check_rowcompare dealt with tuple NULL
values that failed to satisfy a scan key marked required in the opposite
scan direction (it was a bug in 2011 commits 6980f817 and 882368e8, not
a bug in 2006 commit 3a0a16cb).  Go back to marking row compare members
as required using the original 2006 rules, and fix the 2016 bug in a
more principled way: by limiting use of the "set continuescan=false with
a key required in the opposite scan direction upon encountering a NULL
tuple value" optimization to the first/most significant row member key.
While it isn't safe to use an implied IS NOT NULL qualifier to end the
scan when it comes from a required lower-order row compare member key,
it _is_ generally safe for such a required member key to end the scan --
provided the key is marked required in the _current_ scan direction.

This fixes what was arguably an oversight in either commit 5f4d98d4 or
commit 8a510275.  It is a direct follow-up to today's commit f09816a0.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Discussion: https://postgr.es/m/CAH2-Wz=pcijHL_mA0_TJ5LiTB28QpQ0cGtT-ccFV=KzuunNDDQ@mail.gmail.com
Backpatch-through: 18
2025-07-02 09:48:15 -04:00
Peter Geoghegan
f09816a0a7 Make handling of redundant nbtree keys more robust.
nbtree preprocessing's handling of redundant (and contradictory) keys
created problems for scans with = arrays.  It was just about possible
for a scan with an = array key and one or more redundant keys (keys that
preprocessing could not eliminate due an incomplete opfamily and a
cross-type key) to get stuck.  Testing has shown that infinite cycling
where the scan never manages to make forward progress was possible.
This could happen when the scan's arrays were reset in _bt_readpage's
forcenonrequired=true path (added by bugfix commit 5f4d98d4) when the
arrays weren't at least advanced up to the same point that they were in
at the start of the _bt_readpage call.  Earlier redundant keys prevented
the finaltup call to _bt_advance_array_keys from reaching lower-order
keys that needed to be used to sufficiently advance the scan's arrays.

To fix, make preprocessing leave the scan's keys in a state that is as
close as possible to how it'll usually leave them (in the common case
where there's no redundant keys that preprocessing failed to eliminate).
Now nbtree preprocessing _reliably_ leaves behind at most one required
>/>= key per index column, and at most one required </<= key per index
column.  Columns that have one or more = keys that are eligible to be
marked required (based on the traditional rules) prioritize the = keys
over redundant inequality keys; they'll _reliably_ be left with only one
of the = keys as the index column's only required key.

Keys that are not marked required (whether due to the new preprocessing
step running or for some other reason) are relocated to the end of the
so->keyData[] array as needed.  That way they'll always be evaluated
after the scan's required keys, and so cannot prevent code in places
like _bt_advance_array_keys and _bt_first from reaching a required key.

Also teach _bt_first to decide which initial positioning keys to use
based on the same requiredness markings that have long been used by
_bt_checkkeys/_bt_advance_array_keys.  This is a necessary condition for
reliably avoiding infinite cycling.  _bt_advance_array_keys expects to
be able to reason about what'll happen in the next _bt_first call should
it start another primitive index scan, by evaluating inequality keys
that were marked required in the opposite-to-scan direction only.
Now everybody (_bt_first, _bt_checkkeys, and _bt_advance_array_keys)
will always agree on which exact key will be used on each index column
to start and/or end the scan (except when row compare keys are involved,
which have similar problems not addressed by this commit).

An upcoming commit will finish off the work started by this commit by
harmonizing how _bt_first, _bt_checkkeys, and _bt_advance_array_keys
apply row compare keys to start and end scans.

This fixes what was arguably an oversight in either commit 5f4d98d4 or
commit 8a510275.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Discussion: https://postgr.es/m/CAH2-Wz=ds4M+3NXMgwxYxqU8MULaLf696_v5g=9WNmWL2=Uo2A@mail.gmail.com
Backpatch-through: 18
2025-07-02 09:40:49 -04:00
Daniel Gustafsson
8eede2c720 doc: pg_buffercache documentation wordsmithing
A few words seemed to have gone missing in the leading paragraphs.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Co-authored-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/aGTQYZz9L0bjlzVL@ip-10-97-1-34.eu-west-3.compute.internal
Backpatch-through: 18
2025-07-02 11:42:36 +02:00
Peter Eisentraut
f039c22441 meson: Increase minimum version to 0.57.2
The previous minimum was kept to maintain support for Python 3.5, but we
now require Python 3.6 anyway (commit 45363fca637), so that reason is
obsolete.  A small raise to Meson 0.57 allows getting rid of a fair
amount of version conditionals and silences some warnings about
soon-to-be-deprecated features.

With the version bump, the following deprecation warnings appeared and
are fixed:

WARNING: Project targets '>=0.57' but uses feature deprecated since '0.55.0': ExternalProgram.path. use ExternalProgram.full_path() instead
WARNING: Project targets '>=0.57' but uses feature deprecated since '0.56.0': meson.build_root. use meson.project_build_root() or meson.global_build_root() instead.

It turns out that meson 0.57.0 and 0.57.1 are buggy for our use, so
the minimum is actually set to 0.57.2.  This is specific to this
version series; in the future we won't necessarily need to be this
precise.

Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/42e13eb0-862a-441e-8d84-4f0fd5f6def0%40eisentraut.org
2025-07-02 11:14:53 +02:00
Peter Eisentraut
de5aa15209 Reformat some node comments
Use per-field comments for IndexInfo, instead of one big header
comment listing all the fields.  This makes the relevant comments
easier to find, and it also makes it less likely that comments will be
left out of date when fields are added or removed, as has happened in
the past.

Author: Japin Li <japinli@hotmail.com>
Discussion: https://www.postgresql.org/message-id/flat/ME0P300MB04453E6C7EA635F0ECF41BFCB6832%40ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
2025-07-02 09:50:51 +02:00
Masahiko Sawada
3811ca3600 Fix missing FSM vacuum opportunities on tables without indexes.
Commit c120550edb86 optimized the vacuuming of relations without
indexes (a.k.a. one-pass strategy) by directly marking dead item IDs
as LP_UNUSED. However, the periodic FSM vacuum was still checking if
dead item IDs had been marked as LP_DEAD when attempting to vacuum the
FSM every VACUUM_FSM_EVERY_PAGES blocks. This condition was never met
due to the optimization, resulting in missed FSM vacuum
opportunities.

This commit modifies the periodic FSM vacuum condition to use the
number of tuples deleted during HOT pruning. This count includes items
marked as either LP_UNUSED or LP_REDIRECT, both of which are expected
to result in new free space to report.

Back-patch to v17 where the vacuum optimization for tables with no
indexes was introduced.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAD21AoBL8m6B9GSzQfYxVaEgvD7-Kr3AJaS-hJPHC+avm-29zw@mail.gmail.com
Backpatch-through: 17
2025-07-01 23:25:20 -07:00
John Naylor
9adb58a3cc Remove implicit cast from 'void *'
Commit e2809e3a101 added code to a header which assigns a pointer
to void to a pointer to unsigned char. This causes build errors for
extensions written in C++. Fix by adding an explicit cast.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CANWCAZaCq9AHBuhs%3DMx7Gg_0Af9oRU7iAqr0itJCtfmsWwVmnQ%40mail.gmail.com
Backpatch-through: 18
2025-07-02 11:51:10 +07:00
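
The issue in miniature: C performs the conversion from "void *" implicitly, while C++ requires an explicit cast, so headers that may be included by C++ extensions need the cast spelled out (the malloc call here is just a stand-in for the pointer in question):

    void *buf = malloc(16);
    unsigned char *p = (unsigned char *) buf;   /* valid in both C and C++ */
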
Michael Paquier
3369a3b49b Fix bug in archive streamer with LZ4 decompression
When decompressing some input data, the calculations of the initial
starting point and of the initial size were incorrect, potentially leading
to failures when decompressing contents with LZ4.  These initialization
points are fixed in this commit, bringing the logic closer to what
exists for gzip and zstd.

The compressed data itself is intact (for example, backups taken with LZ4
can still be decompressed with a "lz4" command); only the decompression
part reading the input data was impacted by this issue.

This code path impacts pg_basebackup and pg_verifybackup, which can use
the LZ4 decompression routines with an archive streamer, or any tools
that try to use the archive streamers in src/fe_utils/.

The issue is easier to reproduce with files that have a low compression
rate, like ones filled with random data, for a size of at least 512kB,
but this could happen with anything as long as it is stored in a data
folder.  Some tests are added based on this idea, with a file filled
with random bytes grabbed from the backend, written at the root of the
data folder.  This proved good enough to reproduce the original
problem.

Author: Mikhail Gribkov <youzhick@gmail.com>
Discussion: https://postgr.es/m/CAMEv5_uQS1Hg6KCaEP2JkrTBbZ-nXQhxomWrhYQvbdzR-zy-wA@mail.gmail.com
Backpatch-through: 15
2025-07-02 13:48:36 +09:00
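
A generic sketch of the streaming pattern at issue, using the public LZ4 frame API; variable names are illustrative and the actual streamer code differs.  Each call must present the input from exactly where the previous call stopped consuming:

    size_t      in_pos = 0;

    while (in_pos < input_len)
    {
        size_t      src_size = input_len - in_pos;  /* unconsumed input */
        size_t      dst_size = sizeof(outbuf);
        size_t      ret;

        ret = LZ4F_decompress(dctx, outbuf, &dst_size,
                              input + in_pos, &src_size, NULL);
        if (LZ4F_isError(ret))
            break;              /* report decompression failure */
        in_pos += src_size;     /* advance by what was actually consumed */
        /* forward dst_size bytes of outbuf to the next archive streamer */
    }
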
54 changed files with 1797 additions and 793 deletions

View File

@@ -24,7 +24,7 @@ tests += {
       'tests': [
         't/001_basic.pl',
       ],
-      'env': {'GZIP_PROGRAM': gzip.found() ? gzip.path() : '',
-              'TAR': tar.found() ? tar.path() : '' },
+      'env': {'GZIP_PROGRAM': gzip.found() ? gzip.full_path() : '',
+              'TAR': tar.found() ? tar.full_path() : '' },
     },
 }

View File

@@ -34,7 +34,7 @@ tests += {
     'sql': [
       'dblink',
     ],
-    'regress_args': ['--dlpath', meson.build_root() / 'src/test/regress'],
+    'regress_args': ['--dlpath', meson.project_build_root() / 'src/test/regress'],
   },
   'tap': {
     'tests': [

View File

@@ -39,7 +39,7 @@ tests += {
       'postgres_fdw',
       'query_cancel',
     ],
-    'regress_args': ['--dlpath', meson.build_root() / 'src/test/regress'],
+    'regress_args': ['--dlpath', meson.project_build_root() / 'src/test/regress'],
   },
   'tap': {
     'tests': [

View File

@@ -65,7 +65,7 @@
      </para>
      <para>
-      The minimum required version of <application>Meson</application> is 0.54.
+      The minimum required version of <application>Meson</application> is 0.57.2.
      </para>
     </listitem>

View File

@@ -37,12 +37,12 @@
  <para>
   This module provides the <function>pg_buffercache_pages()</function>
-  function (wrapped in the <structname>pg_buffercache</structname> view),
+  function (wrapped in the <structname>pg_buffercache</structname> view), the
   <function>pg_buffercache_numa_pages()</function> function (wrapped in the
   <structname>pg_buffercache_numa</structname> view), the
   <function>pg_buffercache_summary()</function> function, the
   <function>pg_buffercache_usage_counts()</function> function, the
-  <function>pg_buffercache_evict()</function>, the
+  <function>pg_buffercache_evict()</function> function, the
   <function>pg_buffercache_evict_relation()</function> function and the
   <function>pg_buffercache_evict_all()</function> function.
  </para>
@@ -55,7 +55,7 @@
  </para>
  <para>
-  The <function>pg_buffercache_numa_pages()</function> provides
+  The <function>pg_buffercache_numa_pages()</function> function provides
   <acronym>NUMA</acronym> node mappings for shared buffer entries.  This
   information is not part of <function>pg_buffercache_pages()</function>
   itself, as it is much slower to retrieve.

View File

@@ -11,10 +11,11 @@ project('postgresql',
   version: '19devel',
   license: 'PostgreSQL',

-  # We want < 0.56 for python 3.5 compatibility on old platforms. EPEL for
-  # RHEL 7 has 0.55. < 0.54 would require replacing some uses of the fs
-  # module, < 0.53 all uses of fs. So far there's no need to go to >=0.56.
-  meson_version: '>=0.54',
+  # We want < 0.62 for python 3.6 compatibility on old platforms.
+  # RHEL 8 has 0.58. < 0.57 would require various additional
+  # backward-compatibility conditionals.
+  # Meson 0.57.0 and 0.57.1 are buggy, therefore >=0.57.2.
+  meson_version: '>=0.57.2',

   default_options: [
     'warning_level=1', #-Wall equivalent
     'b_pch=false',
@@ -1288,7 +1289,7 @@ pyopt = get_option('plpython')
 python3_dep = not_found_dep
 if not pyopt.disabled()
   pm = import('python')
-  python3_inst = pm.find_installation(python.path(), required: pyopt)
+  python3_inst = pm.find_installation(python.full_path(), required: pyopt)
   if python3_inst.found()
     python3_dep = python3_inst.dependency(embed: true, required: pyopt)
     # Remove this check after we depend on Meson >= 1.1.0
@@ -3150,13 +3151,13 @@ gen_kwlist_cmd = [
 ###

 if host_system == 'windows'
-  pg_ico = meson.source_root() / 'src' / 'port' / 'win32.ico'
+  pg_ico = meson.project_source_root() / 'src' / 'port' / 'win32.ico'
   win32ver_rc = files('src/port/win32ver.rc')
   rcgen = find_program('src/tools/rcgen', native: true)

   rcgen_base_args = [
     '--srcdir', '@SOURCE_DIR@',
-    '--builddir', meson.build_root(),
+    '--builddir', meson.project_build_root(),
     '--rcout', '@OUTPUT0@',
     '--out', '@OUTPUT1@',
     '--input', '@INPUT@',
@@ -3165,11 +3166,11 @@ if host_system == 'windows'

   if cc.get_argument_syntax() == 'msvc'
     rc = find_program('rc', required: true)
-    rcgen_base_args += ['--rc', rc.path()]
+    rcgen_base_args += ['--rc', rc.full_path()]
     rcgen_outputs = ['@BASENAME@.rc', '@BASENAME@.res']
   else
     windres = find_program('windres', required: true)
-    rcgen_base_args += ['--windres', windres.path()]
+    rcgen_base_args += ['--windres', windres.full_path()]
     rcgen_outputs = ['@BASENAME@.rc', '@BASENAME@.obj']
   endif
@@ -3402,7 +3403,7 @@ foreach t1 : configure_files
   potentially_conflicting_files += meson.current_build_dir() / t
 endforeach
 foreach sub, fnames : generated_sources_ac
-  sub = meson.build_root() / sub
+  sub = meson.project_build_root() / sub
   foreach fname : fnames
     potentially_conflicting_files += sub / fname
   endforeach
@@ -3502,7 +3503,7 @@ run_target('install-test-files',
 ###############################################################

 # DESTDIR for the installation we'll run tests in
-test_install_destdir = meson.build_root() / 'tmp_install/'
+test_install_destdir = meson.project_build_root() / 'tmp_install/'

 # DESTDIR + prefix appropriately munged
 if build_system != 'windows'
@@ -3545,7 +3546,7 @@ test('install_test_files',
      is_parallel: false,
      suite: ['setup'])

-test_result_dir = meson.build_root() / 'testrun'
+test_result_dir = meson.project_build_root() / 'testrun'


 # XXX: pg_regress doesn't assign unique ports on windows. To avoid the
@@ -3556,12 +3557,12 @@ testport = 40000

 test_env = environment()

-test_initdb_template = meson.build_root() / 'tmp_install' / 'initdb-template'
+test_initdb_template = meson.project_build_root() / 'tmp_install' / 'initdb-template'
 test_env.set('PG_REGRESS', pg_regress.full_path())
 test_env.set('REGRESS_SHLIB', regress_module.full_path())
 test_env.set('INITDB_TEMPLATE', test_initdb_template)
 # for Cluster.pm's portlock logic
-test_env.set('top_builddir', meson.build_root())
+test_env.set('top_builddir', meson.project_build_root())

 # Add the temporary installation to the library search path on platforms where
 # that works (everything but windows, basically). On windows everything
@@ -3605,26 +3606,20 @@ sys.exit(sp.returncode)
 # Test Generation
 ###############################################################

-# When using a meson version understanding exclude_suites, define a
-# 'tmp_install' test setup (the default) that excludes tests running against a
-# pre-existing install and a 'running' setup that conflicts with creation of
-# the temporary installation and tap tests (which don't support running
-# against a running server).
+# Define a 'tmp_install' test setup (the default) that excludes tests
+# running against a pre-existing install and a 'running' setup that
+# conflicts with creation of the temporary installation and tap tests
+# (which don't support running against a running server).
 running_suites = []
 install_suites = []
-if meson.version().version_compare('>=0.57')
-  runningcheck = true
-else
-  runningcheck = false
-endif

 testwrap = files('src/tools/testwrap')

 foreach test_dir : tests
   testwrap_base = [
     testwrap,
-    '--basedir', meson.build_root(),
+    '--basedir', meson.project_build_root(),
     '--srcdir', test_dir['sd'],
     # Some test suites are not run by default but can be run if selected by the
     # user via variable PG_TEST_EXTRA. Pass configuration time value of
@@ -3714,7 +3709,7 @@ foreach test_dir : tests
       install_suites += test_group

       # some tests can't support running against running DB
-      if runningcheck and t.get('runningcheck', true)
+      if t.get('runningcheck', true)
         test(test_group_running / kind,
              python,
              args: [
@@ -3741,8 +3736,8 @@ foreach test_dir : tests
       endif

       test_command = [
-        perl.path(),
-        '-I', meson.source_root() / 'src/test/perl',
+        perl.full_path(),
+        '-I', meson.project_source_root() / 'src/test/perl',
         '-I', test_dir['sd'],
       ]
@@ -3797,13 +3792,11 @@ foreach test_dir : tests
 endforeach # directories with tests

-# repeat condition so meson realizes version dependency
-if meson.version().version_compare('>=0.57')
-  add_test_setup('tmp_install',
-    is_default: true,
-    exclude_suites: running_suites)
-  add_test_setup('running',
-    exclude_suites: ['setup'] + install_suites)
-endif
+add_test_setup('tmp_install',
+  is_default: true,
+  exclude_suites: running_suites)
+add_test_setup('running',
+  exclude_suites: ['setup'] + install_suites)
@@ -3860,7 +3853,7 @@ tar_gz = custom_target('tar.gz',
     '--format', 'tar.gz',
     '-9',
     '--prefix', distdir + '/',
-    '-o', join_paths(meson.build_root(), '@OUTPUT@'),
+    '-o', join_paths(meson.project_build_root(), '@OUTPUT@'),
     pg_git_revision],
   output: distdir + '.tar.gz',
 )
@@ -3870,11 +3863,11 @@ if bzip2.found()
     build_always_stale: true,
     command: [git, '-C', '@SOURCE_ROOT@',
               '-c', 'core.autocrlf=false',
-              '-c', 'tar.tar.bz2.command="@0@" -c'.format(bzip2.path()),
+              '-c', 'tar.tar.bz2.command="@0@" -c'.format(bzip2.full_path()),
               'archive',
               '--format', 'tar.bz2',
               '--prefix', distdir + '/',
-              '-o', join_paths(meson.build_root(), '@OUTPUT@'),
+              '-o', join_paths(meson.project_build_root(), '@OUTPUT@'),
               pg_git_revision],
     output: distdir + '.tar.bz2',
   )
@@ -3891,10 +3884,7 @@ alias_target('pgdist', [tar_gz, tar_bz2])
 # But not if we are in a subproject, in case the parent project wants to
 # create a dist using the standard Meson command.
 if not meson.is_subproject()
-  # We can only pass the identifier perl here when we depend on >= 0.55
-  if meson.version().version_compare('>=0.55')
-    meson.add_dist_script(perl, '-e', 'exit 1')
-  endif
+  meson.add_dist_script(perl, '-e', 'exit 1')
 endif
@@ -3903,106 +3893,102 @@ endif
 # The End, The End, My Friend
 ###############################################################

-if meson.version().version_compare('>=0.57')
-
-  summary(
-    {
-      'data block size': '@0@ kB'.format(cdata.get('BLCKSZ') / 1024),
-      'WAL block size': '@0@ kB'.format(cdata.get('XLOG_BLCKSZ') / 1024),
-      'segment size': get_option('segsize_blocks') != 0 ?
-        '@0@ blocks'.format(cdata.get('RELSEG_SIZE')) :
-        '@0@ GB'.format(get_option('segsize')),
-    },
-    section: 'Data layout',
-  )
-
-  summary(
-    {
-      'host system': '@0@ @1@'.format(host_system, host_cpu),
-      'build system': '@0@ @1@'.format(build_machine.system(),
-                                       build_machine.cpu_family()),
-    },
-    section: 'System',
-  )
-
-  summary(
-    {
-      'linker': '@0@'.format(cc.get_linker_id()),
-      'C compiler': '@0@ @1@'.format(cc.get_id(), cc.version()),
-    },
-    section: 'Compiler',
-  )
-
-  summary(
-    {
-      'CPP FLAGS': ' '.join(cppflags),
-      'C FLAGS, functional': ' '.join(cflags),
-      'C FLAGS, warnings': ' '.join(cflags_warn),
-      'C FLAGS, modules': ' '.join(cflags_mod),
-      'C FLAGS, user specified': ' '.join(get_option('c_args')),
-      'LD FLAGS': ' '.join(ldflags + get_option('c_link_args')),
-    },
-    section: 'Compiler Flags',
-  )
-
-  if llvm.found()
-    summary(
-      {
-        'C++ compiler': '@0@ @1@'.format(cpp.get_id(), cpp.version()),
-      },
-      section: 'Compiler',
-    )
-
-    summary(
-      {
-        'C++ FLAGS, functional': ' '.join(cxxflags),
-        'C++ FLAGS, warnings': ' '.join(cxxflags_warn),
-        'C++ FLAGS, user specified': ' '.join(get_option('cpp_args')),
-      },
-      section: 'Compiler Flags',
-    )
-  endif
-
-  summary(
-    {
-      'bison': '@0@ @1@'.format(bison.full_path(), bison_version),
-      'dtrace': dtrace,
-      'flex': '@0@ @1@'.format(flex.full_path(), flex_version),
-    },
-    section: 'Programs',
-  )
-
-  summary(
-    {
-      'bonjour': bonjour,
-      'bsd_auth': bsd_auth,
-      'docs': docs_dep,
-      'docs_pdf': docs_pdf_dep,
-      'gss': gssapi,
-      'icu': icu,
-      'ldap': ldap,
-      'libcurl': libcurl,
-      'libnuma': libnuma,
-      'liburing': liburing,
-      'libxml': libxml,
-      'libxslt': libxslt,
-      'llvm': llvm,
-      'lz4': lz4,
-      'nls': libintl,
-      'openssl': ssl,
-      'pam': pam,
-      'plperl': [perl_dep, perlversion],
-      'plpython': python3_dep,
-      'pltcl': tcl_dep,
-      'readline': readline,
-      'selinux': selinux,
-      'systemd': systemd,
-      'uuid': uuid,
-      'zlib': zlib,
-      'zstd': zstd,
-    },
-    section: 'External libraries',
-    list_sep: ' ',
-  )
-
-endif
+summary(
+  {
+    'data block size': '@0@ kB'.format(cdata.get('BLCKSZ') / 1024),
+    'WAL block size': '@0@ kB'.format(cdata.get('XLOG_BLCKSZ') / 1024),
+    'segment size': get_option('segsize_blocks') != 0 ?
+      '@0@ blocks'.format(cdata.get('RELSEG_SIZE')) :
+      '@0@ GB'.format(get_option('segsize')),
+  },
+  section: 'Data layout',
+)
+
+summary(
+  {
+    'host system': '@0@ @1@'.format(host_system, host_cpu),
+    'build system': '@0@ @1@'.format(build_machine.system(),
+                                     build_machine.cpu_family()),
+  },
+  section: 'System',
+)
+
+summary(
+  {
+    'linker': '@0@'.format(cc.get_linker_id()),
+    'C compiler': '@0@ @1@'.format(cc.get_id(), cc.version()),
+  },
+  section: 'Compiler',
+)
+
+summary(
+  {
+    'CPP FLAGS': ' '.join(cppflags),
+    'C FLAGS, functional': ' '.join(cflags),
+    'C FLAGS, warnings': ' '.join(cflags_warn),
+    'C FLAGS, modules': ' '.join(cflags_mod),
+    'C FLAGS, user specified': ' '.join(get_option('c_args')),
+    'LD FLAGS': ' '.join(ldflags + get_option('c_link_args')),
+  },
+  section: 'Compiler Flags',
+)
+
+if llvm.found()
+  summary(
+    {
+      'C++ compiler': '@0@ @1@'.format(cpp.get_id(), cpp.version()),
+    },
+    section: 'Compiler',
+  )
+
+  summary(
+    {
+      'C++ FLAGS, functional': ' '.join(cxxflags),
+      'C++ FLAGS, warnings': ' '.join(cxxflags_warn),
+      'C++ FLAGS, user specified': ' '.join(get_option('cpp_args')),
+    },
+    section: 'Compiler Flags',
+  )
endif
+
+summary(
+  {
+    'bison': '@0@ @1@'.format(bison.full_path(), bison_version),
+    'dtrace': dtrace,
+    'flex': '@0@ @1@'.format(flex.full_path(), flex_version),
+  },
+  section: 'Programs',
+)
+
+summary(
+  {
+    'bonjour': bonjour,
+    'bsd_auth': bsd_auth,
+    'docs': docs_dep,
+    'docs_pdf': docs_pdf_dep,
+    'gss': gssapi,
+    'icu': icu,
+    'ldap': ldap,
+    'libcurl': libcurl,
+    'libnuma': libnuma,
+    'liburing': liburing,
+    'libxml': libxml,
+    'libxslt': libxslt,
+    'llvm': llvm,
+    'lz4': lz4,
+    'nls': libintl,
+    'openssl': ssl,
+    'pam': pam,
+    'plperl': [perl_dep, perlversion],
+    'plpython': python3_dep,
+    'pltcl': tcl_dep,
+    'readline': readline,
+    'selinux': selinux,
+    'systemd': systemd,
+    'uuid': uuid,
+    'zlib': zlib,
+    'zstd': zstd,
+  },
+  section: 'External libraries',
+  list_sep: ' ',
+)

View File

@@ -431,7 +431,7 @@ static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
                                    BlockNumber blkno, Page page,
                                    bool sharelock, Buffer vmbuffer);
-static void lazy_scan_prune(LVRelState *vacrel, Buffer buf,
+static int  lazy_scan_prune(LVRelState *vacrel, Buffer buf,
                             BlockNumber blkno, Page page,
                             Buffer vmbuffer, bool all_visible_according_to_vm,
                             bool *has_lpdead_items, bool *vm_page_frozen);
@@ -1245,6 +1245,7 @@ lazy_scan_heap(LVRelState *vacrel)
         Buffer      buf;
         Page        page;
         uint8       blk_info = 0;
+        int         ndeleted = 0;
         bool        has_lpdead_items;
         void       *per_buffer_data = NULL;
         bool        vm_page_frozen = false;
@@ -1387,10 +1388,10 @@ lazy_scan_heap(LVRelState *vacrel)
          * line pointers previously marked LP_DEAD.
          */
         if (got_cleanup_lock)
-            lazy_scan_prune(vacrel, buf, blkno, page,
-                            vmbuffer,
-                            blk_info & VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM,
-                            &has_lpdead_items, &vm_page_frozen);
+            ndeleted = lazy_scan_prune(vacrel, buf, blkno, page,
+                                       vmbuffer,
+                                       blk_info & VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM,
+                                       &has_lpdead_items, &vm_page_frozen);

         /*
          * Count an eagerly scanned page as a failure or a success.
@@ -1481,7 +1482,7 @@ lazy_scan_heap(LVRelState *vacrel)
          * table has indexes.  There will only be newly-freed space if we
          * held the cleanup lock and lazy_scan_prune() was called.
          */
-        if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items &&
+        if (got_cleanup_lock && vacrel->nindexes == 0 && ndeleted > 0 &&
             blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
         {
             FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
@@ -1936,8 +1937,10 @@ cmpOffsetNumbers(const void *a, const void *b)
  * *vm_page_frozen is set to true if the page is newly set all-frozen in the
  * VM.  The caller currently only uses this for determining whether an eagerly
  * scanned page was successfully set all-frozen.
+ *
+ * Returns the number of tuples deleted from the page during HOT pruning.
  */
-static void
+static int
 lazy_scan_prune(LVRelState *vacrel,
                 Buffer buf,
                 BlockNumber blkno,
@@ -2208,6 +2211,8 @@ lazy_scan_prune(LVRelState *vacrel,
             *vm_page_frozen = true;
         }
     }
+
+    return presult.ndeleted;
 }

 /*
View File

@@ -16,6 +16,7 @@
 #include "postgres.h"

 #include "access/nbtree.h"
+#include "common/int.h"
 #include "lib/qunique.h"
 #include "utils/array.h"
 #include "utils/lsyscache.h"
@@ -56,6 +57,8 @@ static void _bt_skiparray_strat_decrement(IndexScanDesc scan, ScanKey arraysk,
                                           BTArrayKeyInfo *array);
 static void _bt_skiparray_strat_increment(IndexScanDesc scan, ScanKey arraysk,
                                           BTArrayKeyInfo *array);
+static void _bt_unmark_keys(IndexScanDesc scan, int *keyDataMap);
+static int  _bt_reorder_array_cmp(const void *a, const void *b);
 static ScanKey _bt_preprocess_array_keys(IndexScanDesc scan, int *new_numberOfKeys);
 static void _bt_preprocess_array_keys_final(IndexScanDesc scan, int *keyDataMap);
 static int  _bt_num_array_keys(IndexScanDesc scan, Oid *skip_eq_ops_out,
@@ -96,7 +99,7 @@ static int _bt_compare_array_elements(const void *a, const void *b, void *arg);
  * incomplete sets of cross-type operators, we may fail to detect redundant
  * or contradictory keys, but we can survive that.)
  *
- * The output keys must be sorted by index attribute.  Presently we expect
+ * Required output keys are sorted by index attribute.  Presently we expect
  * (but verify) that the input keys are already so sorted --- this is done
  * by match_clauses_to_index() in indxpath.c.  Some reordering of the keys
  * within each attribute may be done as a byproduct of the processing here.
@@ -127,29 +130,36 @@ static int _bt_compare_array_elements(const void *a, const void *b, void *arg);
  * This has the potential to be much more efficient than a full index scan
  * (though it behaves like a full scan when there's many distinct "x" values).
  *
- * If possible, redundant keys are eliminated: we keep only the tightest
+ * Typically, redundant keys are eliminated: we keep only the tightest
  * >/>= bound and the tightest </<= bound, and if there's an = key then
  * that's the only one returned.  (So, we return either a single = key,
  * or one or two boundary-condition keys for each attr.)  However, if we
  * cannot compare two keys for lack of a suitable cross-type operator,
- * we cannot eliminate either.  If there are two such keys of the same
- * operator strategy, the second one is just pushed into the output array
- * without further processing here.  We may also emit both >/>= or both
- * </<= keys if we can't compare them.  The logic about required keys still
- * works if we don't eliminate redundant keys.
+ * we cannot eliminate either key.
  *
- * Note that one reason we need direction-sensitive required-key flags is
- * precisely that we may not be able to eliminate redundant keys.  Suppose
- * we have "x > 4::int AND x > 10::bigint", and we are unable to determine
- * which key is more restrictive for lack of a suitable cross-type operator.
- * _bt_first will arbitrarily pick one of the keys to do the initial
- * positioning with.  If it picks x > 4, then the x > 10 condition will fail
- * until we reach index entries > 10; but we can't stop the scan just because
- * x > 10 is failing.  On the other hand, if we are scanning backwards, then
- * failure of either key is indeed enough to stop the scan.  (In general, when
- * inequality keys are present, the initial-positioning code only promises to
- * position before the first possible match, not exactly at the first match,
- * for a forward scan; or after the last match for a backward scan.)
+ * When all redundant keys could not be eliminated, we'll output a key array
+ * that can more or less be treated as if it had no redundant keys.  Suppose
+ * we have "x > 4::int AND x > 10::bigint AND x < 70", and we are unable to
+ * determine which > key is more restrictive for lack of a suitable cross-type
+ * operator.  We'll arbitrarily pick one of the > keys; the other > key won't
+ * be marked required.  Obviously, the scan will be less efficient if we
+ * choose x > 4 over x > 10 -- but it can still largely proceed as if there
+ * was only a single > condition.  "x > 10" will be placed at the end of the
+ * so->keyData[] output array.  It'll always be evaluated last, after the keys
+ * that could be marked required in the usual way (after "x > 4 AND x < 70").
+ * This can sometimes result in so->keyData[] keys that aren't even in index
+ * attribute order (if the qual involves multiple attributes).  The scan's
+ * required keys will still be in attribute order, though, so it can't matter.
+ *
+ * This scheme ensures that _bt_first always uses the same set of keys at the
+ * start of a forwards scan as those _bt_checkkeys uses to determine when to
+ * end a similar backwards scan (and vice-versa).  _bt_advance_array_keys
+ * depends on this: it expects to be able to reliably predict what the next
+ * _bt_first call will do by testing whether _bt_checkkeys' routines report
+ * that the final tuple on the page is past the end of matches for the scan's
+ * keys with the scan direction flipped.  If it is (if continuescan=false),
+ * then it follows that calling _bt_first will, at a minimum, relocate the
+ * scan to the very next leaf page (in the current scan direction).
 *
 * As a byproduct of this work, we can detect contradictory quals such
 * as "x = 1 AND x > 2".  If we see that, we return so->qual_ok = false,
@@ -188,7 +198,8 @@ _bt_preprocess_keys(IndexScanDesc scan)
     int         numberOfEqualCols;
     ScanKey     inkeys;
     BTScanKeyPreproc xform[BTMaxStrategyNumber];
-    bool        test_result;
+    bool        test_result,
+                redundant_key_kept = false;
     AttrNumber  attno;
     ScanKey     arrayKeyData;
     int        *keyDataMap = NULL;
@@ -388,7 +399,8 @@ _bt_preprocess_keys(IndexScanDesc scan)
                     xform[j].inkey = NULL;
                     xform[j].inkeyi = -1;
                 }
-                /* else, cannot determine redundancy, keep both keys */
+                else
+                    redundant_key_kept = true;
             }
             /* track number of attrs for which we have "=" keys */
             numberOfEqualCols++;
@@ -409,6 +421,8 @@ _bt_preprocess_keys(IndexScanDesc scan)
             else
                 xform[BTLessStrategyNumber - 1].inkey = NULL;
         }
+        else
+            redundant_key_kept = true;
     }

     /* try to keep only one of >, >= */
@@ -426,6 +440,8 @@ _bt_preprocess_keys(IndexScanDesc scan)
             else
                 xform[BTGreaterStrategyNumber - 1].inkey = NULL;
         }
+        else
+            redundant_key_kept = true;
     }

     /*
@@ -466,25 +482,6 @@ _bt_preprocess_keys(IndexScanDesc scan)
         /* check strategy this key's operator corresponds to */
         j = inkey->sk_strategy - 1;

-        /* if row comparison, push it directly to the output array */
-        if (inkey->sk_flags & SK_ROW_HEADER)
-        {
-            ScanKey     outkey = &so->keyData[new_numberOfKeys++];
-
-            memcpy(outkey, inkey, sizeof(ScanKeyData));
-            if (arrayKeyData)
-                keyDataMap[new_numberOfKeys - 1] = i;
-            if (numberOfEqualCols == attno - 1)
-                _bt_mark_scankey_required(outkey);
-
-            /*
-             * We don't support RowCompare using equality; such a qual would
-             * mess up the numberOfEqualCols tracking.
-             */
-            Assert(j != (BTEqualStrategyNumber - 1));
-            continue;
-        }
-
         if (inkey->sk_strategy == BTEqualStrategyNumber &&
             (inkey->sk_flags & SK_SEARCHARRAY))
         {
@@ -593,9 +590,8 @@ _bt_preprocess_keys(IndexScanDesc scan)
              * the new scan key.
              *
              * Note: We do things this way around so that our arrays are
-             * always in the same order as their corresponding scan keys,
-             * even with incomplete opfamilies.  _bt_advance_array_keys
-             * depends on this.
+             * always in the same order as their corresponding scan keys.
+             * _bt_preprocess_array_keys_final expects this.
              */
             ScanKey     outkey = &so->keyData[new_numberOfKeys++];
@@ -607,6 +603,7 @@ _bt_preprocess_keys(IndexScanDesc scan)
                 xform[j].inkey = inkey;
                 xform[j].inkeyi = i;
                 xform[j].arrayidx = arrayidx;
+                redundant_key_kept = true;
             }
         }
     }
@@ -622,6 +619,15 @@ _bt_preprocess_keys(IndexScanDesc scan)
     if (arrayKeyData)
         _bt_preprocess_array_keys_final(scan, keyDataMap);

+    /*
+     * If there are remaining redundant inequality keys, we must make sure
+     * that each index attribute has no more than one required >/>= key, and
+     * no more than one required </<= key.  Attributes that have one or more
+     * required = keys now must keep only one required key (the first = key).
+     */
+    if (unlikely(redundant_key_kept) && so->qual_ok)
+        _bt_unmark_keys(scan, keyDataMap);
+
     /* Could pfree arrayKeyData/keyDataMap now, but not worth the cycles */
 }
@@ -746,9 +752,12 @@ _bt_fix_scankey_strategy(ScanKey skey, int16 *indoption)
  *
  * Depending on the operator type, the key may be required for both scan
  * directions or just one.  Also, if the key is a row comparison header,
- * we have to mark its first subsidiary ScanKey as required.  (Subsequent
- * subsidiary ScanKeys are normally for lower-order columns, and thus
- * cannot be required, since they're after the first non-equality scankey.)
+ * we have to mark the appropriate subsidiary ScanKeys as required.  In such
+ * cases, the first subsidiary key is required, but subsequent ones are
+ * required only as long as they correspond to successive index columns and
+ * match the leading column as to sort direction.  Otherwise the row
+ * comparison ordering is different from the index ordering and so we can't
+ * stop the scan on the basis of those lower-order columns.
  *
  * Note: when we set required-key flag bits in a subsidiary scankey, we are
  * scribbling on a data structure belonging to the index AM's caller, not on
@@ -786,12 +795,25 @@ _bt_mark_scankey_required(ScanKey skey)
     if (skey->sk_flags & SK_ROW_HEADER)
     {
         ScanKey     subkey = (ScanKey) DatumGetPointer(skey->sk_argument);
+        AttrNumber  attno = skey->sk_attno;

         /* First subkey should be same column/operator as the header */
-        Assert(subkey->sk_flags & SK_ROW_MEMBER);
-        Assert(subkey->sk_attno == skey->sk_attno);
+        Assert(subkey->sk_attno == attno);
         Assert(subkey->sk_strategy == skey->sk_strategy);
-        subkey->sk_flags |= addflags;
+
+        for (;;)
+        {
+            Assert(subkey->sk_flags & SK_ROW_MEMBER);
+            if (subkey->sk_attno != attno)
+                break;          /* non-adjacent key, so not required */
+            if (subkey->sk_strategy != skey->sk_strategy)
+                break;          /* wrong direction, so not required */
+            subkey->sk_flags |= addflags;
+            if (subkey->sk_flags & SK_ROW_END)
+                break;
+            subkey++;
+            attno++;
+        }
     }
 }
@@ -847,8 +869,7 @@ _bt_compare_scankey_args(IndexScanDesc scan, ScanKey op,
                 cmp_op;
     StrategyNumber strat;

-    Assert(!((leftarg->sk_flags | rightarg->sk_flags) &
-             (SK_ROW_HEADER | SK_ROW_MEMBER)));
+    Assert(!((leftarg->sk_flags | rightarg->sk_flags) & SK_ROW_MEMBER));

     /*
      * First, deal with cases where one or both args are NULL.  This should
@@ -924,6 +945,16 @@ _bt_compare_scankey_args(IndexScanDesc scan, ScanKey op,
         return true;
     }

+    /*
+     * We don't yet know how to determine redundancy when it involves a row
+     * compare key (barring simple cases involving IS NULL/IS NOT NULL)
+     */
+    if ((leftarg->sk_flags | rightarg->sk_flags) & SK_ROW_HEADER)
+    {
+        Assert(!((leftarg->sk_flags | rightarg->sk_flags) & SK_BT_SKIP));
+        return false;
+    }
+
     /*
      * If either leftarg or rightarg are equality-type array scankeys, we need
      * specialized handling (since by now we know that IS NULL wasn't used)
@@ -1467,6 +1498,283 @@ _bt_skiparray_strat_increment(IndexScanDesc scan, ScanKey arraysk,
     }
 }
/*
* _bt_unmark_keys() -- make superfluous required keys nonrequired after all
*
* When _bt_preprocess_keys fails to eliminate one or more redundant keys, it
* calls here to make sure that no index attribute has more than one > or >=
* key marked required, and no more than one required < or <= key. Attributes
* with = keys will always get one = key as their required key. All other
* keys that were initially marked required get "unmarked" here. That way,
* _bt_first and _bt_checkkeys will reliably agree on which keys to use to
* start and/or to end the scan.
*
* We also relocate keys that become/started out nonrequired to the end of
* so->keyData[]. That way, _bt_first and _bt_checkkeys cannot fail to reach
* a required key due to some earlier nonrequired key getting in the way.
*
* Only call here when _bt_compare_scankey_args returned false at least once
* (otherwise, calling here will just waste cycles).
*/
static void
_bt_unmark_keys(IndexScanDesc scan, int *keyDataMap)
{
BTScanOpaque so = (BTScanOpaque) scan->opaque;
AttrNumber attno;
bool *unmarkikey;
int nunmark,
nunmarked,
nkept,
firsti;
ScanKey keepKeys,
unmarkKeys;
FmgrInfo *keepOrderProcs = NULL,
*unmarkOrderProcs = NULL;
bool haveReqEquals,
haveReqForward,
haveReqBackward;
/*
* Do an initial pass over so->keyData[] that determines which keys to
* keep as required. We expect so->keyData[] to still be in attribute
* order when we're called (though we don't expect any particular order
* among each attribute's keys).
*
* When both equality and inequality keys remain on a single attribute, we
* *must* make sure that exactly one of the equalities remains required.
* Any requiredness markings that we might leave on later keys/attributes
* are predicated on there being required = keys on all prior columns.
*/
unmarkikey = palloc0(so->numberOfKeys * sizeof(bool));
nunmark = 0;
/* Set things up for first key's attribute */
attno = so->keyData[0].sk_attno;
firsti = 0;
haveReqEquals = false;
haveReqForward = false;
haveReqBackward = false;
for (int i = 0; i < so->numberOfKeys; i++)
{
ScanKey origkey = &so->keyData[i];
if (origkey->sk_attno != attno)
{
/* Reset for next attribute */
attno = origkey->sk_attno;
firsti = i;
haveReqEquals = false;
haveReqForward = false;
haveReqBackward = false;
}
/* Equalities get priority over inequalities */
if (haveReqEquals)
{
/*
* We already found the first "=" key for this attribute. We've
* already decided that all its other keys will be unmarked.
*/
Assert(!(origkey->sk_flags & SK_SEARCHNULL));
unmarkikey[i] = true;
nunmark++;
continue;
}
else if ((origkey->sk_flags & SK_BT_REQFWD) &&
(origkey->sk_flags & SK_BT_REQBKWD))
{
/*
* Found the first "=" key for attno. All other attno keys will
* be unmarked.
*/
Assert(origkey->sk_strategy == BTEqualStrategyNumber);
haveReqEquals = true;
for (int j = firsti; j < i; j++)
{
/* Unmark any prior inequality keys on attno after all */
if (!unmarkikey[j])
{
unmarkikey[j] = true;
nunmark++;
}
}
continue;
}
/* Deal with inequalities next */
if ((origkey->sk_flags & SK_BT_REQFWD) && !haveReqForward)
{
haveReqForward = true;
continue;
}
else if ((origkey->sk_flags & SK_BT_REQBKWD) && !haveReqBackward)
{
haveReqBackward = true;
continue;
}
/*
* We have either a redundant inequality key that will be unmarked, or
* we have a key that wasn't marked required in the first place
*/
unmarkikey[i] = true;
nunmark++;
}
/* Should only be called when _bt_compare_scankey_args reported failure */
Assert(nunmark > 0);
/*
* Next, allocate temp arrays: one for required keys that'll remain
* required, the other for all remaining keys
*/
unmarkKeys = palloc(nunmark * sizeof(ScanKeyData));
keepKeys = palloc((so->numberOfKeys - nunmark) * sizeof(ScanKeyData));
nunmarked = 0;
nkept = 0;
if (so->numArrayKeys)
{
unmarkOrderProcs = palloc(nunmark * sizeof(FmgrInfo));
keepOrderProcs = palloc((so->numberOfKeys - nunmark) * sizeof(FmgrInfo));
}
/*
* Next, copy the contents of so->keyData[] into the appropriate temp
* array.
*
* Scans with = array keys need us to maintain invariants around the order
* of so->orderProcs[] and so->arrayKeys[] relative to so->keyData[]. See
* _bt_preprocess_array_keys_final for a full explanation.
*/
for (int i = 0; i < so->numberOfKeys; i++)
{
ScanKey origkey = &so->keyData[i];
ScanKey unmark;
if (!unmarkikey[i])
{
/*
* Key gets to keep its original requiredness markings.
*
* Key will stay in its original position, unless we're going to
* unmark an earlier key (in which case this key gets moved back).
*/
memcpy(keepKeys + nkept, origkey, sizeof(ScanKeyData));
if (so->numArrayKeys)
{
keyDataMap[i] = nkept;
memcpy(keepOrderProcs + nkept, &so->orderProcs[i],
sizeof(FmgrInfo));
}
nkept++;
continue;
}
/*
* Key will be unmarked as needed, and moved to the end of the array,
* next to other keys that will become (or always were) nonrequired
*/
unmark = unmarkKeys + nunmarked;
memcpy(unmark, origkey, sizeof(ScanKeyData));
if (so->numArrayKeys)
{
keyDataMap[i] = (so->numberOfKeys - nunmark) + nunmarked;
memcpy(&unmarkOrderProcs[nunmarked], &so->orderProcs[i],
sizeof(FmgrInfo));
}
/*
* Preprocessing only generates skip arrays when it knows that they'll
* be the only required = key on the attr. We'll never unmark them.
*/
Assert(!(unmark->sk_flags & SK_BT_SKIP));
/*
* Also shouldn't have to unmark an IS NULL or an IS NOT NULL key.
* They aren't cross-type, so an incomplete opfamily can't matter.
*/
Assert(!(unmark->sk_flags & SK_ISNULL) ||
!(unmark->sk_flags & (SK_BT_REQFWD | SK_BT_REQBKWD)));
/* Clear requiredness flags on redundant key (and on any subkeys) */
unmark->sk_flags &= ~(SK_BT_REQFWD | SK_BT_REQBKWD);
if (unmark->sk_flags & SK_ROW_HEADER)
{
ScanKey subkey = (ScanKey) DatumGetPointer(unmark->sk_argument);
Assert(subkey->sk_strategy == unmark->sk_strategy);
for (;;)
{
Assert(subkey->sk_flags & SK_ROW_MEMBER);
subkey->sk_flags &= ~(SK_BT_REQFWD | SK_BT_REQBKWD);
if (subkey->sk_flags & SK_ROW_END)
break;
subkey++;
}
}
nunmarked++;
}
/* Copy both temp arrays back into so->keyData[] to reorder */
Assert(nkept == so->numberOfKeys - nunmark);
Assert(nunmarked == nunmark);
memcpy(so->keyData, keepKeys, sizeof(ScanKeyData) * nkept);
memcpy(so->keyData + nkept, unmarkKeys, sizeof(ScanKeyData) * nunmarked);
/* Done with temp arrays */
pfree(unmarkikey);
pfree(keepKeys);
pfree(unmarkKeys);
/*
* Now copy so->orderProcs[] temp entries needed by scans with = array
* keys back (just like with the so->keyData[] temp arrays)
*/
if (so->numArrayKeys)
{
memcpy(so->orderProcs, keepOrderProcs, sizeof(FmgrInfo) * nkept);
memcpy(so->orderProcs + nkept, unmarkOrderProcs,
sizeof(FmgrInfo) * nunmarked);
/* Also fix-up array->scan_key references */
for (int arridx = 0; arridx < so->numArrayKeys; arridx++)
{
BTArrayKeyInfo *array = &so->arrayKeys[arridx];
array->scan_key = keyDataMap[array->scan_key];
}
/*
* Sort so->arrayKeys[] based on its new BTArrayKeyInfo.scan_key
* offsets, so that its order matches so->keyData[] order as expected
*/
qsort(so->arrayKeys, so->numArrayKeys, sizeof(BTArrayKeyInfo),
_bt_reorder_array_cmp);
/* Done with temp arrays */
pfree(unmarkOrderProcs);
pfree(keepOrderProcs);
}
}
/*
* qsort comparator for reordering so->arrayKeys[] BTArrayKeyInfo entries
*/
static int
_bt_reorder_array_cmp(const void *a, const void *b)
{
BTArrayKeyInfo *arraya = (BTArrayKeyInfo *) a;
BTArrayKeyInfo *arrayb = (BTArrayKeyInfo *) b;
return pg_cmp_s32(arraya->scan_key, arrayb->scan_key);
}
/*
 * _bt_preprocess_array_keys() -- Preprocess SK_SEARCHARRAY scan keys
 *
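To make the unmarking fallback concrete, here is a hedged SQL sketch (the table, index, and opfamily are hypothetical; complete built-in opfamilies can prove this redundancy outright, so _bt_unmark_keys only matters when a cross-type comparison proc is missing):

-- Assume x is indexed through a hypothetical opfamily that has operators for
-- comparing x against both int2 and int8, but lacks the int2-vs-int8 support
-- proc needed to compare the two constants with each other.  Preprocessing
-- then cannot prove either key redundant, so both survive; _bt_unmark_keys
-- guarantees that at most one of them remains marked required per scan
-- direction, keeping _bt_first and _bt_checkkeys in agreement.
SELECT * FROM t WHERE x >= 4::int8 AND x >= 5::int2;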


@@ -960,46 +960,51 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
/*----------
 * Examine the scan keys to discover where we need to start the scan.
 * The selected scan keys (at most one per index column) are remembered by
 * storing their addresses into the local startKeys[] array.  The final
 * startKeys[] entry's strategy is set in strat_total.  (Actually, there
 * are a couple of cases where we force a less/more restrictive strategy.)
 *
 * We must use the key that was marked required (in the direction opposite
 * our own scan's) during preprocessing.  Each index attribute can only
 * have one such required key.  In general, the keys that we use to find
 * an initial position when scanning forwards are the same keys that end
 * the scan on the leaf level when scanning backwards (and vice-versa).
 *
 * When the scan keys include cross-type operators, _bt_preprocess_keys
 * may not be able to eliminate redundant keys; in such cases it will
 * arbitrarily pick a usable key for each attribute (and scan direction),
 * ensuring that there is no more than one key required in each direction.
 * We stop considering further keys once we reach the first nonrequired
 * key (which must come after all required keys), so this can't affect us.
 *
 * The required keys that we use as starting boundaries have to be =, >,
 * or >= keys for a forward scan or =, <, <= keys for a backwards scan.
 * We can use keys for multiple attributes so long as the prior attributes
 * had only =, >= (resp. =, <=) keys.  These rules are very similar to the
 * rules that preprocessing used to determine which keys to mark required.
 * We cannot always use every required key as a positioning key, though.
 * Skip arrays necessitate independently applying our own rules here.
 * Skip arrays are always generally considered = array keys, but we'll
 * nevertheless treat them as inequalities at certain points of the scan.
 * When that happens, it _might_ have implications for the number of
 * required keys that we can safely use for initial positioning purposes.
 *
 * For example, a forward scan with a skip array on its leading attribute
 * (with no low_compare/high_compare) will have at least two required scan
 * keys, but we won't use any of them as boundary keys during the scan's
 * initial call here.  Our positioning key during the first call here can
 * be thought of as representing "> -infinity".  Similarly, if such a skip
 * array's low_compare is "a > 'foo'", then we position using "a > 'foo'"
 * during the scan's initial call here; a lower-order key such as "b = 42"
 * can't be used until the "a" array advances beyond MINVAL/low_compare.
 *
 * On the other hand, if such a skip array's low_compare was "a >= 'foo'",
 * then we _can_ use "a >= 'foo' AND b = 42" during the initial call here.
 * A subsequent call here might have us use "a = 'fop' AND b = 42".  Note
 * that we treat = and >= as equivalent when scanning forwards (just as we
 * treat = and <= as equivalent when scanning backwards).  We effectively
 * do the same thing (though with a distinct "a" element/value) each time.
 *
 * All keys (with the exception of SK_SEARCHNULL keys and SK_BT_SKIP
 * array keys whose array is "null_elem=true") imply a NOT NULL qualifier.
@@ -1011,21 +1016,20 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
 * traversing a lot of null entries at the start of the scan.
 *
 * In this loop, row-comparison keys are treated the same as keys on their
 * first (leftmost) columns.  We'll add all lower-order columns of the row
 * comparison that were marked required during preprocessing below.
 *
 * _bt_advance_array_keys needs to know exactly how we'll reposition the
 * scan (should it opt to schedule another primitive index scan).  It is
 * critical that primscans only be scheduled when they'll definitely make
 * some useful progress.  _bt_advance_array_keys does this by calling
 * _bt_checkkeys routines that report whether a tuple is past the end of
 * matches for the scan's keys (given the scan's current array elements).
 * If the page's final tuple is "after the end of matches" for a scan that
 * uses the *opposite* scan direction, then it must follow that it's also
 * "before the start of matches" for the actual current scan direction.
 * It is therefore essential that all of our initial positioning rules are
 * symmetric with _bt_checkkeys's corresponding continuescan=false rule.
 * If you update anything here, _bt_checkkeys/_bt_advance_array_keys might
 * need to be kept in sync.
 *----------
@@ -1034,18 +1038,17 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
if (so->numberOfKeys > 0)
{
    AttrNumber curattr;
    ScanKey bkey;
    ScanKey impliesNN;
    ScanKey cur;

    /*
     * bkey will be set to the key that preprocessing left behind as the
     * boundary key for this attribute, in this scan direction (if any)
     */
    cur = so->keyData;
    curattr = 1;
    bkey = NULL;
    /* Also remember any scankey that implies a NOT NULL constraint */
    impliesNN = NULL;
@@ -1058,23 +1061,29 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
{
    if (i >= so->numberOfKeys || cur->sk_attno != curattr)
    {
/* Done looking for the curattr boundary key */
Assert(bkey == NULL ||
(bkey->sk_attno == curattr &&
(bkey->sk_flags & (SK_BT_REQFWD | SK_BT_REQBKWD))));
Assert(impliesNN == NULL ||
(impliesNN->sk_attno == curattr &&
(impliesNN->sk_flags & (SK_BT_REQFWD | SK_BT_REQBKWD))));
/*
 * If this is a scan key for a skip array whose current
 * element is MINVAL, choose low_compare (when scanning
 * backwards it'll be MAXVAL, and we'll choose high_compare).
 *
 * Note: if the array's low_compare key makes 'bkey' NULL,
 * then we behave as if the array's first element is -inf,
 * except when !array->null_elem implies a usable NOT NULL
 * constraint.
 */
if (bkey != NULL &&
    (bkey->sk_flags & (SK_BT_MINVAL | SK_BT_MAXVAL)))
{
    int ikey = bkey - so->keyData;
    ScanKey skipequalitykey = bkey;
    BTArrayKeyInfo *array = NULL;

    for (int arridx = 0; arridx < so->numArrayKeys; arridx++)
@@ -1087,35 +1096,35 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
    if (ScanDirectionIsForward(dir))
    {
        Assert(!(skipequalitykey->sk_flags & SK_BT_MAXVAL));
        bkey = array->low_compare;
    }
    else
    {
        Assert(!(skipequalitykey->sk_flags & SK_BT_MINVAL));
        bkey = array->high_compare;
    }

    Assert(bkey == NULL ||
           bkey->sk_attno == skipequalitykey->sk_attno);

    if (!array->null_elem)
        impliesNN = skipequalitykey;
    else
        Assert(bkey == NULL && impliesNN == NULL);
}

/*
 * If we didn't find a usable boundary key, see if we can
 * deduce a NOT NULL key
 */
if (bkey == NULL && impliesNN != NULL &&
    ((impliesNN->sk_flags & SK_BT_NULLS_FIRST) ?
     ScanDirectionIsForward(dir) :
     ScanDirectionIsBackward(dir)))
{
    /* Yes, so build the key in notnullkeys[keysz] */
    bkey = &notnullkeys[keysz];
    ScanKeyEntryInitialize(bkey,
                           (SK_SEARCHNOTNULL | SK_ISNULL |
                            (impliesNN->sk_flags &
                             (SK_BT_DESC | SK_BT_NULLS_FIRST))),
@@ -1130,12 +1139,12 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
}

/*
 * If preprocessing didn't leave a usable boundary key, quit;
 * else save the boundary key pointer in startKeys[]
 */
if (bkey == NULL)
    break;
startKeys[keysz++] = bkey;

/*
 * We can only consider adding more boundary keys when the one
@@ -1143,7 +1152,7 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
 * (during backwards scans we can only do so when the key that
 * we just added to startKeys[] uses the = or <= strategy)
 */
strat_total = bkey->sk_strategy;
if (strat_total == BTGreaterStrategyNumber ||
    strat_total == BTLessStrategyNumber)
    break;
@@ -1154,19 +1163,19 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
 * make strat_total > or < (and stop adding boundary keys).
 * This can only happen with opclasses that lack skip support.
 */
if (bkey->sk_flags & (SK_BT_NEXT | SK_BT_PRIOR))
{
    Assert(bkey->sk_flags & SK_BT_SKIP);
    Assert(strat_total == BTEqualStrategyNumber);

    if (ScanDirectionIsForward(dir))
    {
        Assert(!(bkey->sk_flags & SK_BT_PRIOR));
        strat_total = BTGreaterStrategyNumber;
    }
    else
    {
        Assert(!(bkey->sk_flags & SK_BT_NEXT));
        strat_total = BTLessStrategyNumber;
    }
@@ -1180,24 +1189,30 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
/*
 * Done if that was the last scan key output by preprocessing.
 * Also done if we've now examined all keys marked required.
 */
if (i >= so->numberOfKeys ||
    !(cur->sk_flags & (SK_BT_REQFWD | SK_BT_REQBKWD)))
    break;

/*
 * Reset for next attr.
 */
Assert(cur->sk_attno == curattr + 1);
curattr = cur->sk_attno;
bkey = NULL;
impliesNN = NULL;
}

/*
 * If we've located the starting boundary key for curattr, we have
 * no interest in curattr's other required key
 */
if (bkey != NULL)
    continue;

/*
 * Is this key the starting boundary key for curattr?
 *
 * If not, does it imply a NOT NULL constraint?  (Because
 * SK_SEARCHNULL keys are always assigned BTEqualStrategyNumber,
@@ -1207,27 +1222,20 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
{
    case BTLessStrategyNumber:
    case BTLessEqualStrategyNumber:
        if (ScanDirectionIsBackward(dir))
            bkey = cur;
        else if (impliesNN == NULL)
            impliesNN = cur;
        break;
    case BTEqualStrategyNumber:
        bkey = cur;
        break;
    case BTGreaterEqualStrategyNumber:
    case BTGreaterStrategyNumber:
        if (ScanDirectionIsForward(dir))
            bkey = cur;
        else if (impliesNN == NULL)
            impliesNN = cur;
        break;
}
}
@@ -1253,16 +1261,18 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
Assert(keysz <= INDEX_MAX_KEYS);
for (int i = 0; i < keysz; i++)
{
    ScanKey bkey = startKeys[i];

    Assert(bkey->sk_attno == i + 1);

    if (bkey->sk_flags & SK_ROW_HEADER)
    {
        /*
         * Row comparison header: look to the first row member instead
         */
        ScanKey subkey = (ScanKey) DatumGetPointer(bkey->sk_argument);
bool loosen_strat = false,
tighten_strat = false;
/*
 * Cannot be a NULL in the first row member: _bt_preprocess_keys
@@ -1270,9 +1280,18 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
 * ever getting this far
 */
Assert(subkey->sk_flags & SK_ROW_MEMBER);
Assert(subkey->sk_attno == bkey->sk_attno);
Assert(!(subkey->sk_flags & SK_ISNULL));
/*
* This is either a > or >= key (during backwards scans it is
* either < or <=) that was marked required during preprocessing.
* Later so->keyData[] keys can't have been marked required, so
* our row compare header key must be the final startKeys[] entry.
*/
Assert(subkey->sk_flags & (SK_BT_REQFWD | SK_BT_REQBKWD));
Assert(i == keysz - 1);
/*
 * The member scankeys are already in insertion format (ie, they
 * have sk_func = 3-way-comparison function)
@@ -1280,112 +1299,141 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
memcpy(inskey.scankeys + i, subkey, sizeof(ScanKeyData));

/*
 * Now look to later row compare members.
 *
 * If there's an "index attribute gap" between two row compare
 * members, the second member won't have been marked required, and
 * so can't be used as a starting boundary key here.  The part of
 * the row comparison that we do still use has to be treated as a
 * ">=" or "<=" condition.  For example, a qual "(a, c) > (1, 42)"
 * with an omitted intervening index attribute "b" will use an
 * insertion scan key "a >= 1".  Even the first "a = 1" tuple on
 * the leaf level might satisfy the row compare qual.
 *
 * We're able to use a _more_ restrictive strategy when we reach a
 * NULL row compare member, since they're always unsatisfiable.
 * For example, a qual "(a, b, c) >= (1, NULL, 77)" will use an
 * insertion scan key "a > 1".  All tuples where "a = 1" cannot
 * possibly satisfy the row compare qual, so this is safe.
 */
Assert(!(subkey->sk_flags & SK_ROW_END));
for (;;)
{
    subkey++;
    Assert(subkey->sk_flags & SK_ROW_MEMBER);

    if (subkey->sk_flags & SK_ISNULL)
    {
        /*
         * NULL member key, can only use earlier keys.
         *
         * We deliberately avoid checking if this key is marked
         * required.  All earlier keys are required, and this key
         * is unsatisfiable either way, so we can't miss anything.
         */
        tighten_strat = true;
        break;
    }

    if (!(subkey->sk_flags & (SK_BT_REQFWD | SK_BT_REQBKWD)))
    {
        /* nonrequired member key, can only use earlier keys */
        loosen_strat = true;
        break;
    }

    Assert(subkey->sk_attno == keysz + 1);
    Assert(subkey->sk_strategy == bkey->sk_strategy);
    Assert(keysz < INDEX_MAX_KEYS);
    memcpy(inskey.scankeys + keysz, subkey,
           sizeof(ScanKeyData));
    keysz++;
    if (subkey->sk_flags & SK_ROW_END)
        break;
}
Assert(!(loosen_strat && tighten_strat));
if (loosen_strat)
{
/* Use less restrictive strategy (and fewer member keys) */
switch (strat_total)
{
case BTLessStrategyNumber:
strat_total = BTLessEqualStrategyNumber;
break;
case BTGreaterStrategyNumber:
strat_total = BTGreaterEqualStrategyNumber;
break;
}
}
if (tighten_strat)
{
/* Use more restrictive strategy (and fewer member keys) */
switch (strat_total)
{
case BTLessEqualStrategyNumber:
strat_total = BTLessStrategyNumber;
break;
case BTGreaterEqualStrategyNumber:
strat_total = BTGreaterStrategyNumber;
break;
}
}
/* done adding to inskey (row comparison keys always come last) */
break;
}
/*
* Ordinary comparison key/search-style key.
*
* Transform the search-style scan key to an insertion scan key by
* replacing the sk_func with the appropriate btree 3-way-comparison
* function.
*
* If scankey operator is not a cross-type comparison, we can use the
* cached comparison function; otherwise gotta look it up in the
* catalogs. (That can't lead to infinite recursion, since no
* indexscan initiated by syscache lookup will use cross-data-type
* operators.)
*
* We support the convention that sk_subtype == InvalidOid means the
* opclass input type; this hack simplifies life for ScanKeyInit().
*/
if (bkey->sk_subtype == rel->rd_opcintype[i] ||
bkey->sk_subtype == InvalidOid)
{
FmgrInfo *procinfo;
procinfo = index_getprocinfo(rel, bkey->sk_attno, BTORDER_PROC);
ScanKeyEntryInitializeWithInfo(inskey.scankeys + i,
bkey->sk_flags,
bkey->sk_attno,
InvalidStrategy,
bkey->sk_subtype,
bkey->sk_collation,
procinfo,
bkey->sk_argument);
} }
else
{
    RegProcedure cmp_proc;

    cmp_proc = get_opfamily_proc(rel->rd_opfamily[i],
                                 rel->rd_opcintype[i],
                                 bkey->sk_subtype, BTORDER_PROC);
    if (!RegProcedureIsValid(cmp_proc))
        elog(ERROR, "missing support function %d(%u,%u) for attribute %d of index \"%s\"",
             BTORDER_PROC, rel->rd_opcintype[i], bkey->sk_subtype,
             bkey->sk_attno, RelationGetRelationName(rel));
    ScanKeyEntryInitialize(inskey.scankeys + i,
                           bkey->sk_flags,
                           bkey->sk_attno,
                           InvalidStrategy,
                           bkey->sk_subtype,
                           bkey->sk_collation,
                           cmp_proc,
                           bkey->sk_argument);
}
}
@@ -1474,6 +1522,8 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
if (!BufferIsValid(so->currPos.buf))
{
Assert(!so->needPrimScan);
/*
 * We only get here if the index is completely empty.  Lock relation
 * because nothing finer to lock exists.  Without a buffer lock, it's
@@ -1492,7 +1542,6 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
if (!BufferIsValid(so->currPos.buf))
{
    _bt_parallel_done(scan);
    return false;
}
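A hedged SQL sketch of the two strat_total adjustments made above (the table and index are hypothetical):

-- CREATE TABLE rowcmp (a int, b int, c int);
-- CREATE INDEX rowcmp_idx ON rowcmp (a, b, c);

-- "Index attribute gap": the second row member references "c", skipping "b",
-- so it was never marked required; the scan positions with a loosened
-- "a >= 1" insertion key.
SELECT * FROM rowcmp WHERE (a, c) > (1, 42);

-- NULL member: no "a = 1" tuple can possibly satisfy the qual, so the scan
-- positions with a tightened "a > 1" insertion key.
SELECT * FROM rowcmp WHERE (a, b, c) >= (1, NULL, 42);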


@@ -44,7 +44,6 @@ static bool _bt_array_decrement(Relation rel, ScanKey skey, BTArrayKeyInfo *arra
static bool _bt_array_increment(Relation rel, ScanKey skey, BTArrayKeyInfo *array);
static bool _bt_advance_array_keys_increment(IndexScanDesc scan, ScanDirection dir,
                                             bool *skip_array_set);
static bool _bt_tuple_before_array_skeys(IndexScanDesc scan, ScanDirection dir,
                                         IndexTuple tuple, TupleDesc tupdesc, int tupnatts,
                                         bool readpagetup, int sktrig, bool *scanBehind);
@@ -52,7 +51,6 @@ static bool _bt_advance_array_keys(IndexScanDesc scan, BTReadPageState *pstate,
                                   IndexTuple tuple, int tupnatts, TupleDesc tupdesc,
                                   int sktrig, bool sktrig_required);
#ifdef USE_ASSERT_CHECKING
static bool _bt_verify_keys_with_arraykeys(IndexScanDesc scan);
#endif
static bool _bt_oppodir_checkkeys(IndexScanDesc scan, ScanDirection dir,
@@ -1034,73 +1032,6 @@ _bt_advance_array_keys_increment(IndexScanDesc scan, ScanDirection dir,
return false;
}
/*
* _bt_rewind_nonrequired_arrays() -- Rewind SAOP arrays not marked required
*
* Called when _bt_advance_array_keys decides to start a new primitive index
* scan on the basis of the current scan position being before the position
* that _bt_first is capable of repositioning the scan to by applying an
* inequality operator required in the opposite-to-scan direction only.
*
* Although equality strategy scan keys (for both arrays and non-arrays alike)
* are either marked required in both directions or in neither direction,
* there is a sense in which non-required arrays behave like required arrays.
* With a qual such as "WHERE a IN (100, 200) AND b >= 3 AND c IN (5, 6, 7)",
* the scan key on "c" is non-required, but nevertheless enables positioning
* the scan at the first tuple >= "(100, 3, 5)" on the leaf level during the
* first descent of the tree by _bt_first. Later on, there could also be a
* second descent, that places the scan right before tuples >= "(200, 3, 5)".
* _bt_first must never be allowed to build an insertion scan key whose "c"
* entry is set to a value other than 5, the "c" array's first element/value.
* (Actually, it's the first in the current scan direction. This example uses
* a forward scan.)
*
* Calling here resets the array scan key elements for the scan's non-required
* arrays. This is strictly necessary for correctness in a subset of cases
* involving "required in opposite direction"-triggered primitive index scans.
* Not all callers are at risk of _bt_first using a non-required array like
* this, but advancement always resets the arrays when another primitive scan
* is scheduled, just to keep things simple. Array advancement even makes
* sure to reset non-required arrays during scans that have no inequalities.
* (Advancement still won't call here when there are no inequalities, though
* that's just because it's all handled indirectly instead.)
*
* Note: _bt_verify_arrays_bt_first is called by an assertion to enforce that
* everybody got this right.
*
* Note: In practice almost all SAOP arrays are marked required during
* preprocessing (if necessary by generating skip arrays). It is hardly ever
* truly necessary to call here, but consistently doing so is simpler.
*/
static void
_bt_rewind_nonrequired_arrays(IndexScanDesc scan, ScanDirection dir)
{
Relation rel = scan->indexRelation;
BTScanOpaque so = (BTScanOpaque) scan->opaque;
int arrayidx = 0;
for (int ikey = 0; ikey < so->numberOfKeys; ikey++)
{
ScanKey cur = so->keyData + ikey;
BTArrayKeyInfo *array = NULL;
if (!(cur->sk_flags & SK_SEARCHARRAY) ||
cur->sk_strategy != BTEqualStrategyNumber)
continue;
array = &so->arrayKeys[arrayidx++];
Assert(array->scan_key == ikey);
if ((cur->sk_flags & (SK_BT_REQFWD | SK_BT_REQBKWD)))
continue;
Assert(array->num_elems != -1); /* No non-required skip arrays */
_bt_array_set_low_or_high(rel, cur, array,
ScanDirectionIsForward(dir));
}
}
/*
 * _bt_tuple_before_array_skeys() -- too early to advance required arrays?
 *
@@ -1380,8 +1311,6 @@ _bt_start_prim_scan(IndexScanDesc scan, ScanDirection dir)
 */
if (so->needPrimScan)
{
    /*
     * Flag was set -- must call _bt_first again, which will reset the
     * scan's needPrimScan flag
@@ -2007,14 +1936,7 @@ _bt_advance_array_keys(IndexScanDesc scan, BTReadPageState *pstate,
 */
else if (has_required_opposite_direction_only && pstate->finaltup &&
         unlikely(!_bt_oppodir_checkkeys(scan, dir, pstate->finaltup)))
    goto new_prim_scan;
continue_scan:
@@ -2045,8 +1967,6 @@ continue_scan:
 */
so->oppositeDirCheck = has_required_opposite_direction_only;

/*
 * skip by setting "look ahead" mechanism's offnum for forwards scans
 * (backwards scans check scanBehind flag directly instead)
@@ -2142,48 +2062,6 @@ end_toplevel_scan:
}

#ifdef USE_ASSERT_CHECKING
/*
* Verify that the scan's qual state matches what we expect at the point that
* _bt_start_prim_scan is about to start a just-scheduled new primitive scan.
*
* We enforce a rule against non-required array scan keys: they must start out
* with whatever element is the first for the scan's current scan direction.
* See _bt_rewind_nonrequired_arrays comments for an explanation.
*/
static bool
_bt_verify_arrays_bt_first(IndexScanDesc scan, ScanDirection dir)
{
BTScanOpaque so = (BTScanOpaque) scan->opaque;
int arrayidx = 0;
for (int ikey = 0; ikey < so->numberOfKeys; ikey++)
{
ScanKey cur = so->keyData + ikey;
BTArrayKeyInfo *array = NULL;
int first_elem_dir;
if (!(cur->sk_flags & SK_SEARCHARRAY) ||
cur->sk_strategy != BTEqualStrategyNumber)
continue;
array = &so->arrayKeys[arrayidx++];
if (((cur->sk_flags & SK_BT_REQFWD) && ScanDirectionIsForward(dir)) ||
((cur->sk_flags & SK_BT_REQBKWD) && ScanDirectionIsBackward(dir)))
continue;
if (ScanDirectionIsForward(dir))
first_elem_dir = 0;
else
first_elem_dir = array->num_elems - 1;
if (array->cur_elem != first_elem_dir)
return false;
}
return _bt_verify_keys_with_arraykeys(scan);
}
/*
 * Verify that the scan's "so->keyData[]" scan keys are in agreement with
 * its array key state
@@ -2194,6 +2072,7 @@ _bt_verify_keys_with_arraykeys(IndexScanDesc scan)
BTScanOpaque so = (BTScanOpaque) scan->opaque;
int last_sk_attno = InvalidAttrNumber,
    arrayidx = 0;
bool nonrequiredseen = false;

if (!so->qual_ok)
    return false;
@@ -2217,8 +2096,16 @@ _bt_verify_keys_with_arraykeys(IndexScanDesc scan)
if (array->num_elems != -1 &&
    cur->sk_argument != array->elem_values[array->cur_elem])
    return false;

if (cur->sk_flags & (SK_BT_REQFWD | SK_BT_REQBKWD))
{
if (last_sk_attno > cur->sk_attno)
return false;
if (nonrequiredseen)
return false;
}
else
nonrequiredseen = true;
last_sk_attno = cur->sk_attno;
}
@@ -2551,37 +2438,12 @@ _bt_set_startikey(IndexScanDesc scan, BTReadPageState *pstate)
if (!(key->sk_flags & (SK_BT_REQFWD | SK_BT_REQBKWD)))
{
    /* Scan key isn't marked required (corner case) */
    break;              /* unsafe */
}
if (key->sk_flags & SK_ROW_HEADER)
{
    /* RowCompare inequalities currently aren't supported */
    break;              /* "unsafe" */
}
if (key->sk_strategy != BTEqualStrategyNumber)
{
@ -3078,76 +2940,7 @@ _bt_check_rowcompare(ScanKey skey, IndexTuple tuple, int tupnatts,
@@ -3078,76 +2940,7 @@ _bt_check_rowcompare(ScanKey skey, IndexTuple tuple, int tupnatts,
Assert(subkey->sk_flags & SK_ROW_MEMBER);

/* When a NULL row member is compared, the row never matches */
if (subkey->sk_flags & SK_ISNULL)
{
    /*
@@ -3172,6 +2965,114 @@ _bt_check_rowcompare(ScanKey skey, IndexTuple tuple, int tupnatts,
    return false;
}
if (subkey->sk_attno > tupnatts)
{
/*
* This attribute is truncated (must be high key). The value for
* this attribute in the first non-pivot tuple on the page to the
* right could be any possible value. Assume that truncated
* attribute passes the qual.
*/
Assert(BTreeTupleIsPivot(tuple));
return true;
}
datum = index_getattr(tuple,
subkey->sk_attno,
tupdesc,
&isNull);
if (isNull)
{
int reqflags;
if (forcenonrequired)
{
/* treating scan's keys as non-required */
}
else if (subkey->sk_flags & SK_BT_NULLS_FIRST)
{
/*
* Since NULLs are sorted before non-NULLs, we know we have
* reached the lower limit of the range of values for this
* index attr. On a backward scan, we can stop if this qual
* is one of the "must match" subset. However, on a forwards
* scan, we must keep going, because we may have initially
* positioned to the start of the index.
*
* All required NULLS FIRST > row members can use NULL tuple
* values to end backwards scans, just like with other values.
* A qual "WHERE (a, b, c) > (9, 42, 'foo')" can terminate a
* backwards scan upon reaching the index's rightmost "a = 9"
* tuple whose "b" column contains a NULL (if not sooner).
* Since "b" is NULLS FIRST, we can treat its NULLs as "<" 42.
*/
reqflags = SK_BT_REQBKWD;
/*
* When a most significant required NULLS FIRST < row compare
* member sees NULL tuple values during a backwards scan, it
* signals the end of matches for the whole row compare/scan.
* A qual "WHERE (a, b, c) < (9, 42, 'foo')" will terminate a
* backwards scan upon reaching the rightmost tuple whose "a"
* column has a NULL. The "a" NULL value is "<" 9, and yet
* our < row compare will still end the scan. (This isn't
* safe with later/lower-order row members. Notice that it
* can only happen with an "a" NULL some time after the scan
* completely stops needing to use its "b" and "c" members.)
*/
if (subkey == (ScanKey) DatumGetPointer(skey->sk_argument))
reqflags |= SK_BT_REQFWD; /* safe, first row member */
if ((subkey->sk_flags & reqflags) &&
ScanDirectionIsBackward(dir))
*continuescan = false;
}
else
{
/*
* Since NULLs are sorted after non-NULLs, we know we have
* reached the upper limit of the range of values for this
* index attr. On a forward scan, we can stop if this qual is
* one of the "must match" subset. However, on a backward
* scan, we must keep going, because we may have initially
* positioned to the end of the index.
*
* All required NULLS LAST < row members can use NULL tuple
* values to end forwards scans, just like with other values.
* A qual "WHERE (a, b, c) < (9, 42, 'foo')" can terminate a
* forwards scan upon reaching the index's leftmost "a = 9"
* tuple whose "b" column contains a NULL (if not sooner).
* Since "b" is NULLS LAST, we can treat its NULLs as ">" 42.
*/
reqflags = SK_BT_REQFWD;
/*
* When a most significant required NULLS LAST > row compare
* member sees NULL tuple values during a forwards scan, it
* signals the end of matches for the whole row compare/scan.
* A qual "WHERE (a, b, c) > (9, 42, 'foo')" will terminate a
* forwards scan upon reaching the leftmost tuple whose "a"
* column has a NULL. The "a" NULL value is ">" 9, and yet
* our > row compare will end the scan. (This isn't safe with
* later/lower-order row members. Notice that it can only
* happen with an "a" NULL some time after the scan completely
* stops needing to use its "b" and "c" members.)
*/
if (subkey == (ScanKey) DatumGetPointer(skey->sk_argument))
reqflags |= SK_BT_REQBKWD; /* safe, first row member */
if ((subkey->sk_flags & reqflags) &&
ScanDirectionIsForward(dir))
*continuescan = false;
}
/*
* In any case, this indextuple doesn't match the qual.
*/
return false;
}
/* Perform the test --- three-way comparison not bool operator */
cmpresult = DatumGetInt32(FunctionCall2Coll(&subkey->sk_func,
                                            subkey->sk_collation,
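A hedged SQL illustration of the NULL-member termination rule described in the comments above (same hypothetical rowcmp table as earlier; assumes the index's default NULLS LAST ordering):

-- A forwards scan of this qual may now stop at the leftmost "a = 9" tuple
-- whose "b" is NULL: with NULLS LAST, that NULL sorts after every non-NULL
-- "b", so no later "a = 9" tuple can satisfy the row compare either.
SELECT * FROM rowcmp WHERE (a, b, c) < (9, 42, 77);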


@@ -4994,13 +4994,25 @@ check_recovery_target_timeline(char **newval, void **extra, GucSource source)
    rttg = RECOVERY_TARGET_TIMELINE_LATEST;
else
{
    char *endp;
    uint64 timeline;

    rttg = RECOVERY_TARGET_TIMELINE_NUMERIC;

    errno = 0;
    timeline = strtou64(*newval, &endp, 0);

    if (*endp != '\0' || errno == EINVAL || errno == ERANGE)
    {
        GUC_check_errdetail("\"%s\" is not a valid number.",
                            "recovery_target_timeline");
        return false;
    }

    if (timeline < 1 || timeline > PG_UINT32_MAX)
    {
        GUC_check_errdetail("\"%s\" must be between %u and %u.",
                            "recovery_target_timeline", 1, UINT_MAX);
        return false;
    }
}
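A hedged sketch of the resulting validation behavior (the messages paraphrase the GUC_check_errdetail calls above; assumes UINT_MAX is 4294967295 on the platform):

ALTER SYSTEM SET recovery_target_timeline = 'bogus';       -- rejected: not a valid number
ALTER SYSTEM SET recovery_target_timeline = '9999999999';  -- rejected: must be between 1 and 4294967295
ALTER SYSTEM SET recovery_target_timeline = '0';           -- rejected: must be between 1 and 4294967295
ALTER SYSTEM SET recovery_target_timeline = 'latest';      -- still accepted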


@@ -2711,8 +2711,7 @@ MergeAttributes(List *columns, const List *supers, char relpersistence,
                RelationGetRelationName(relation))));

/* If existing rel is temp, it must belong to this session */
if (RELATION_IS_OTHER_TEMP(relation))
    ereport(ERROR,
            (errcode(ERRCODE_WRONG_OBJECT_TYPE),
             errmsg(!is_partition
@@ -17230,15 +17229,13 @@ ATExecAddInherit(Relation child_rel, RangeVar *parent, LOCKMODE lockmode)
                RelationGetRelationName(parent_rel))));

/* If parent rel is temp, it must belong to this session */
if (RELATION_IS_OTHER_TEMP(parent_rel))
    ereport(ERROR,
            (errcode(ERRCODE_WRONG_OBJECT_TYPE),
             errmsg("cannot inherit from temporary relation of another session")));

/* Ditto for the child */
if (RELATION_IS_OTHER_TEMP(child_rel))
    ereport(ERROR,
            (errcode(ERRCODE_WRONG_OBJECT_TYPE),
             errmsg("cannot inherit to temporary relation of another session")));
@@ -20309,15 +20306,13 @@ ATExecAttachPartition(List **wqueue, Relation rel, PartitionCmd *cmd,
                RelationGetRelationName(rel))));

/* If the parent is temp, it must belong to this session */
if (RELATION_IS_OTHER_TEMP(rel))
    ereport(ERROR,
            (errcode(ERRCODE_WRONG_OBJECT_TYPE),
             errmsg("cannot attach as partition of temporary relation of another session")));

/* Ditto for the partition */
if (RELATION_IS_OTHER_TEMP(attachrel))
    ereport(ERROR,
            (errcode(ERRCODE_WRONG_OBJECT_TYPE),
             errmsg("cannot attach temporary relation of another session as partition")));


@@ -53,7 +53,7 @@ llvm_irgen_args = [
if ccache.found()
  llvm_irgen_command = ccache
  llvm_irgen_args = [clang.full_path()] + llvm_irgen_args
else
  llvm_irgen_command = clang
endif


@@ -154,13 +154,17 @@ add_paths_to_joinrel(PlannerInfo *root,
/*
 * See if the inner relation is provably unique for this outer rel.
 *
 * We have some special cases: for JOIN_SEMI, it doesn't matter since the
 * executor can make the equivalent optimization anyway.  It also doesn't
 * help enable use of Memoize, since a semijoin with a provably unique
 * inner side should have been reduced to an inner join in that case.
 * Therefore, we need not expend planner cycles on proofs.  (For
 * JOIN_ANTI, although it doesn't help the executor for the same reason,
 * it can benefit Memoize paths.)  For JOIN_UNIQUE_INNER, we must be
 * considering a semijoin whose inner side is not provably unique (else
 * reduce_unique_semijoins would've simplified it), so there's no point in
 * calling innerrel_is_unique.  However, if the LHS covers all of the
 * semijoin's min_lefthand, then it's appropriate to set inner_unique
 * because the path produced by create_unique_path will be unique relative
 * to the LHS.  (If we have an LHS that's only part of the min_lefthand,
 * that is *not* true.)  For JOIN_UNIQUE_OUTER, pass JOIN_INNER to avoid
@@ -169,12 +173,6 @@ add_paths_to_joinrel(PlannerInfo *root,
switch (jointype)
{
    case JOIN_SEMI:
        extra.inner_unique = false; /* well, unproven */
        break;
    case JOIN_UNIQUE_INNER:
@@ -715,16 +713,21 @@ get_memoize_path(PlannerInfo *root, RelOptInfo *innerrel,
    return NULL;

/*
 * Currently we don't do this for SEMI and ANTI joins, because nested loop
 * SEMI/ANTI joins don't scan the inner node to completion, which means
 * memoize cannot mark the cache entry as complete.  Nor can we mark the
 * cache entry as complete after fetching the first inner tuple, because
 * if that tuple and the current outer tuple don't satisfy the join
 * clauses, a second inner tuple that satisfies the parameters would find
 * the cache entry already marked as complete.  The only exception is when
 * the inner relation is provably unique, as in that case, there won't be
 * a second matching tuple and we can safely mark the cache entry as
 * complete after fetching the first inner tuple.  Note that in such
 * cases, the SEMI join should have been reduced to an inner join by
 * reduce_unique_semijoins.
 */
if ((jointype == JOIN_SEMI || jointype == JOIN_ANTI) &&
    !extra->inner_unique)
    return NULL;

/*
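A hedged sketch of a plan shape this enables (whether Memoize is actually chosen still depends on statistics, costs, and enable_memoize; object names hypothetical):

CREATE TABLE big (x int);
CREATE TABLE small (y int PRIMARY KEY);
EXPLAIN (COSTS OFF)
SELECT * FROM big b
WHERE NOT EXISTS (SELECT 1 FROM small s WHERE s.y = b.x);
-- Possible plan, now that the provably unique inner side lifts the
-- SEMI/ANTI restriction:
--   Nested Loop Anti Join
--     -> Seq Scan on big b
--     -> Memoize
--          Cache Key: b.x
--          -> Index Only Scan using small_pkey on small s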


@ -2668,6 +2668,12 @@ alter_table_cmd:
					c->alterDeferrability = true;
					if ($4 & CAS_NO_INHERIT)
						c->alterInheritability = true;
/* handle unsupported case with specific error message */
if ($4 & CAS_NOT_VALID)
ereport(ERROR,
errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("constraints cannot be altered to be NOT VALID"),
parser_errposition(@4));
					processCASbits($4, @4, "FOREIGN KEY",
								   &c->deferrable,
								   &c->initdeferred,


@ -15,6 +15,20 @@
 * current backend.  This function guarantees that only one backend
 * initializes the segment and that all other backends just attach it.
 *
* A DSA can be created in or retrieved from the registry by calling
* GetNamedDSA(). As with GetNamedDSMSegment(), if a DSA with the provided
* name does not yet exist, it is created. Otherwise, GetNamedDSA()
* ensures the DSA is attached to the current backend. This function
* guarantees that only one backend initializes the DSA and that all other
* backends just attach it.
*
* A dshash table can be created in or retrieved from the registry by
* calling GetNamedDSHash(). As with GetNamedDSMSegment(), if a hash
* table with the provided name does not yet exist, it is created.
* Otherwise, GetNamedDSHash() ensures the hash table is attached to the
* current backend. This function guarantees that only one backend
* initializes the table and that all other backends just attach it.
*
 * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
@ -32,6 +46,12 @@
#include "storage/shmem.h"
#include "utils/memutils.h"
#define DSMR_NAME_LEN 128
#define DSMR_DSA_TRANCHE_SUFFIX " DSA"
#define DSMR_DSA_TRANCHE_SUFFIX_LEN (sizeof(DSMR_DSA_TRANCHE_SUFFIX) - 1)
#define DSMR_DSA_TRANCHE_NAME_LEN (DSMR_NAME_LEN + DSMR_DSA_TRANCHE_SUFFIX_LEN)
typedef struct DSMRegistryCtxStruct
{
	dsa_handle	dsah;
@ -40,15 +60,48 @@ typedef struct DSMRegistryCtxStruct
static DSMRegistryCtxStruct *DSMRegistryCtx;

-typedef struct DSMRegistryEntry
+typedef struct NamedDSMState
{
-	char		name[64];
	dsm_handle	handle;
	size_t		size;
} NamedDSMState;
typedef struct NamedDSAState
{
dsa_handle handle;
int tranche;
char tranche_name[DSMR_DSA_TRANCHE_NAME_LEN];
} NamedDSAState;
typedef struct NamedDSHState
{
NamedDSAState dsa;
dshash_table_handle handle;
int tranche;
char tranche_name[DSMR_NAME_LEN];
} NamedDSHState;
typedef enum DSMREntryType
{
DSMR_ENTRY_TYPE_DSM,
DSMR_ENTRY_TYPE_DSA,
DSMR_ENTRY_TYPE_DSH,
} DSMREntryType;
typedef struct DSMRegistryEntry
{
char name[DSMR_NAME_LEN];
DSMREntryType type;
union
{
NamedDSMState dsm;
NamedDSAState dsa;
NamedDSHState dsh;
} data;
} DSMRegistryEntry;

static const dshash_parameters dsh_params = {
-	offsetof(DSMRegistryEntry, handle),
+	offsetof(DSMRegistryEntry, type),
	sizeof(DSMRegistryEntry),
	dshash_strcmp,
	dshash_strhash,
@ -141,7 +194,7 @@ GetNamedDSMSegment(const char *name, size_t size,
		ereport(ERROR,
				(errmsg("DSM segment name cannot be empty")));

-	if (strlen(name) >= offsetof(DSMRegistryEntry, handle))
+	if (strlen(name) >= offsetof(DSMRegistryEntry, type))
		ereport(ERROR,
				(errmsg("DSM segment name too long")));
@ -158,32 +211,39 @@ GetNamedDSMSegment(const char *name, size_t size,
	entry = dshash_find_or_insert(dsm_registry_table, name, found);
	if (!(*found))
	{
NamedDSMState *state = &entry->data.dsm;
dsm_segment *seg;
entry->type = DSMR_ENTRY_TYPE_DSM;
		/* Initialize the segment. */
-		dsm_segment *seg = dsm_create(size, 0);
+		seg = dsm_create(size, 0);

		dsm_pin_segment(seg);
		dsm_pin_mapping(seg);

-		entry->handle = dsm_segment_handle(seg);
-		entry->size = size;
+		state->handle = dsm_segment_handle(seg);
+		state->size = size;
		ret = dsm_segment_address(seg);

		if (init_callback)
			(*init_callback) (ret);
	}
-	else if (entry->size != size)
-	{
+	else if (entry->type != DSMR_ENTRY_TYPE_DSM)
		ereport(ERROR,
-				(errmsg("requested DSM segment size does not match size of "
-						"existing segment")));
-	}
+				(errmsg("requested DSM segment does not match type of existing entry")));
+	else if (entry->data.dsm.size != size)
+		ereport(ERROR,
+				(errmsg("requested DSM segment size does not match size of existing segment")));
	else
	{
-		dsm_segment *seg = dsm_find_mapping(entry->handle);
+		NamedDSMState *state = &entry->data.dsm;
+		dsm_segment *seg;

		/* If the existing segment is not already attached, attach it now. */
+		seg = dsm_find_mapping(state->handle);
		if (seg == NULL)
		{
-			seg = dsm_attach(entry->handle);
+			seg = dsm_attach(state->handle);
			if (seg == NULL)
				elog(ERROR, "could not map dynamic shared memory segment");
@ -198,3 +258,180 @@ GetNamedDSMSegment(const char *name, size_t size,
	return ret;
}
/*
* Initialize or attach a named DSA.
*
* This routine returns a pointer to the DSA. A new LWLock tranche ID will be
* generated if needed. Note that the lock tranche will be registered with the
* provided name. Also note that this should be called at most once for a
* given DSA in each backend.
*/
dsa_area *
GetNamedDSA(const char *name, bool *found)
{
DSMRegistryEntry *entry;
MemoryContext oldcontext;
dsa_area *ret;
Assert(found);
if (!name || *name == '\0')
ereport(ERROR,
(errmsg("DSA name cannot be empty")));
if (strlen(name) >= offsetof(DSMRegistryEntry, type))
ereport(ERROR,
(errmsg("DSA name too long")));
/* Be sure any local memory allocated by DSM/DSA routines is persistent. */
oldcontext = MemoryContextSwitchTo(TopMemoryContext);
/* Connect to the registry. */
init_dsm_registry();
entry = dshash_find_or_insert(dsm_registry_table, name, found);
if (!(*found))
{
NamedDSAState *state = &entry->data.dsa;
entry->type = DSMR_ENTRY_TYPE_DSA;
/* Initialize the LWLock tranche for the DSA. */
state->tranche = LWLockNewTrancheId();
strcpy(state->tranche_name, name);
LWLockRegisterTranche(state->tranche, state->tranche_name);
/* Initialize the DSA. */
ret = dsa_create(state->tranche);
dsa_pin(ret);
dsa_pin_mapping(ret);
/* Store handle for other backends to use. */
state->handle = dsa_get_handle(ret);
}
else if (entry->type != DSMR_ENTRY_TYPE_DSA)
ereport(ERROR,
(errmsg("requested DSA does not match type of existing entry")));
else
{
NamedDSAState *state = &entry->data.dsa;
if (dsa_is_attached(state->handle))
ereport(ERROR,
(errmsg("requested DSA already attached to current process")));
/* Initialize existing LWLock tranche for the DSA. */
LWLockRegisterTranche(state->tranche, state->tranche_name);
/* Attach to existing DSA. */
ret = dsa_attach(state->handle);
dsa_pin_mapping(ret);
}
dshash_release_lock(dsm_registry_table, entry);
MemoryContextSwitchTo(oldcontext);
return ret;
}
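As a usage sketch (the names my_area, my_extension_attach, and
"my_extension_dsa" are illustrative, not part of this patch), a backend calls
GetNamedDSA() at most once and caches the result:

    #include "storage/dsm_registry.h"
    #include "utils/dsa.h"

    static dsa_area *my_area = NULL;    /* hypothetical per-backend cache */

    static void
    my_extension_attach(void)
    {
        bool    found;

        /* First caller creates the DSA; every other backend just attaches. */
        if (my_area == NULL)
            my_area = GetNamedDSA("my_extension_dsa", &found);
    }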
/*
* Initialize or attach a named dshash table.
*
* This routine returns the address of the table. The tranche_id member of
* params is ignored; new tranche IDs will be generated if needed. Note that
* the DSA lock tranche will be registered with the provided name with " DSA"
* appended. The dshash lock tranche will be registered with the provided
* name. Also note that this should be called at most once for a given table
* in each backend.
*/
dshash_table *
GetNamedDSHash(const char *name, const dshash_parameters *params, bool *found)
{
DSMRegistryEntry *entry;
MemoryContext oldcontext;
dshash_table *ret;
Assert(params);
Assert(found);
if (!name || *name == '\0')
ereport(ERROR,
(errmsg("DSHash name cannot be empty")));
if (strlen(name) >= offsetof(DSMRegistryEntry, type))
ereport(ERROR,
(errmsg("DSHash name too long")));
/* Be sure any local memory allocated by DSM/DSA routines is persistent. */
oldcontext = MemoryContextSwitchTo(TopMemoryContext);
/* Connect to the registry. */
init_dsm_registry();
entry = dshash_find_or_insert(dsm_registry_table, name, found);
if (!(*found))
{
NamedDSAState *dsa_state = &entry->data.dsh.dsa;
NamedDSHState *dsh_state = &entry->data.dsh;
dshash_parameters params_copy;
dsa_area *dsa;
entry->type = DSMR_ENTRY_TYPE_DSH;
/* Initialize the LWLock tranche for the DSA. */
dsa_state->tranche = LWLockNewTrancheId();
sprintf(dsa_state->tranche_name, "%s%s", name, DSMR_DSA_TRANCHE_SUFFIX);
LWLockRegisterTranche(dsa_state->tranche, dsa_state->tranche_name);
/* Initialize the LWLock tranche for the dshash table. */
dsh_state->tranche = LWLockNewTrancheId();
strcpy(dsh_state->tranche_name, name);
LWLockRegisterTranche(dsh_state->tranche, dsh_state->tranche_name);
/* Initialize the DSA for the hash table. */
dsa = dsa_create(dsa_state->tranche);
dsa_pin(dsa);
dsa_pin_mapping(dsa);
/* Initialize the dshash table. */
memcpy(&params_copy, params, sizeof(dshash_parameters));
params_copy.tranche_id = dsh_state->tranche;
ret = dshash_create(dsa, &params_copy, NULL);
/* Store handles for other backends to use. */
dsa_state->handle = dsa_get_handle(dsa);
dsh_state->handle = dshash_get_hash_table_handle(ret);
}
else if (entry->type != DSMR_ENTRY_TYPE_DSH)
ereport(ERROR,
(errmsg("requested DSHash does not match type of existing entry")));
else
{
NamedDSAState *dsa_state = &entry->data.dsh.dsa;
NamedDSHState *dsh_state = &entry->data.dsh;
dsa_area *dsa;
/* XXX: Should we verify params matches what table was created with? */
if (dsa_is_attached(dsa_state->handle))
ereport(ERROR,
(errmsg("requested DSHash already attached to current process")));
/* Initialize existing LWLock tranches for the DSA and dshash table. */
LWLockRegisterTranche(dsa_state->tranche, dsa_state->tranche_name);
LWLockRegisterTranche(dsh_state->tranche, dsh_state->tranche_name);
/* Attach to existing DSA for the hash table. */
dsa = dsa_attach(dsa_state->handle);
dsa_pin_mapping(dsa);
/* Attach to existing dshash table. */
ret = dshash_attach(dsa, params, dsh_state->handle, NULL);
}
dshash_release_lock(dsm_registry_table, entry);
MemoryContextSwitchTo(oldcontext);
return ret;
}
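A corresponding sketch for GetNamedDSHash() (the entry layout and names below
are hypothetical): the caller supplies dshash_parameters, whose tranche_id is
ignored as noted above, and caches the attached table:

    typedef struct MyHashEntry
    {
        char    key[64];    /* illustrative fixed-size key */
        int     value;
    } MyHashEntry;

    static const dshash_parameters my_params = {
        offsetof(MyHashEntry, value),   /* key size */
        sizeof(MyHashEntry),
        dshash_strcmp,
        dshash_strhash,
        dshash_strcpy
        /* tranche_id is ignored by GetNamedDSHash() */
    };

    static dshash_table *my_table = NULL;

    static void
    my_attach_table(void)
    {
        bool    found;

        if (my_table == NULL)
            my_table = GetNamedDSHash("my_table", &my_params, &found);
    }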


@ -4067,8 +4067,9 @@ float84ge(PG_FUNCTION_ARGS)
 * with the specified characteristics.  An operand smaller than the
 * lower bound is assigned to bucket 0.  An operand greater than or equal
 * to the upper bound is assigned to an additional bucket (with number
- * count+1).  We don't allow "NaN" for any of the float8 inputs, and we
- * don't allow either of the histogram bounds to be +/- infinity.
+ * count+1).  We don't allow the histogram bounds to be NaN or +/- infinity,
+ * but we do allow those values for the operand (taking NaN to be larger
+ * than any other value, as we do in comparisons).
 */
Datum
width_bucket_float8(PG_FUNCTION_ARGS)
@ -4084,12 +4085,11 @@ width_bucket_float8(PG_FUNCTION_ARGS)
				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_WIDTH_BUCKET_FUNCTION),
				 errmsg("count must be greater than zero")));

-	if (isnan(operand) || isnan(bound1) || isnan(bound2))
+	if (isnan(bound1) || isnan(bound2))
		ereport(ERROR,
				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_WIDTH_BUCKET_FUNCTION),
-				 errmsg("operand, lower bound, and upper bound cannot be NaN")));
+				 errmsg("lower and upper bounds cannot be NaN")));

-	/* Note that we allow "operand" to be infinite */
	if (isinf(bound1) || isinf(bound2))
		ereport(ERROR,
				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_WIDTH_BUCKET_FUNCTION),
@ -4097,15 +4097,15 @@ width_bucket_float8(PG_FUNCTION_ARGS)
	if (bound1 < bound2)
	{
-		if (operand < bound1)
-			result = 0;
-		else if (operand >= bound2)
+		if (isnan(operand) || operand >= bound2)
		{
			if (pg_add_s32_overflow(count, 1, &result))
				ereport(ERROR,
						(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
						 errmsg("integer out of range")));
		}
+		else if (operand < bound1)
+			result = 0;
		else
		{
			if (!isinf(bound2 - bound1))
@ -4135,7 +4135,7 @@ width_bucket_float8(PG_FUNCTION_ARGS)
	}
	else if (bound1 > bound2)
	{
-		if (operand > bound1)
+		if (isnan(operand) || operand > bound1)
			result = 0;
		else if (operand <= bound2)
		{
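Under the new rules the operand may be NaN or infinite while the bounds may
not; for example (a sketch of the expected behavior, consistent with the
regression test changes further down this page):

    SELECT width_bucket('NaN'::float8, 0.0, 10.0, 5);   -- returns 6 (count+1)
    SELECT width_bucket('-inf'::float8, 0.0, 10.0, 5);  -- returns 0
    SELECT width_bucket(5.0::float8, 'NaN', 10.0, 5);   -- still an error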


@ -1960,8 +1960,9 @@ generate_series_numeric_support(PG_FUNCTION_ARGS)
 * with the specified characteristics.  An operand smaller than the
 * lower bound is assigned to bucket 0.  An operand greater than or equal
 * to the upper bound is assigned to an additional bucket (with number
- * count+1).  We don't allow "NaN" for any of the numeric inputs, and we
- * don't allow either of the histogram bounds to be +/- infinity.
+ * count+1).  We don't allow the histogram bounds to be NaN or +/- infinity,
+ * but we do allow those values for the operand (taking NaN to be larger
+ * than any other value, as we do in comparisons).
 */
Datum
width_bucket_numeric(PG_FUNCTION_ARGS)
@ -1979,17 +1980,13 @@ width_bucket_numeric(PG_FUNCTION_ARGS)
				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_WIDTH_BUCKET_FUNCTION),
				 errmsg("count must be greater than zero")));

-	if (NUMERIC_IS_SPECIAL(operand) ||
-		NUMERIC_IS_SPECIAL(bound1) ||
-		NUMERIC_IS_SPECIAL(bound2))
+	if (NUMERIC_IS_SPECIAL(bound1) || NUMERIC_IS_SPECIAL(bound2))
	{
-		if (NUMERIC_IS_NAN(operand) ||
-			NUMERIC_IS_NAN(bound1) ||
-			NUMERIC_IS_NAN(bound2))
+		if (NUMERIC_IS_NAN(bound1) || NUMERIC_IS_NAN(bound2))
			ereport(ERROR,
					(errcode(ERRCODE_INVALID_ARGUMENT_FOR_WIDTH_BUCKET_FUNCTION),
-					 errmsg("operand, lower bound, and upper bound cannot be NaN")));
+					 errmsg("lower and upper bounds cannot be NaN")));

-		/* We allow "operand" to be infinite; cmp_numerics will cope */
		if (NUMERIC_IS_INF(bound1) || NUMERIC_IS_INF(bound2))
			ereport(ERROR,
					(errcode(ERRCODE_INVALID_ARGUMENT_FOR_WIDTH_BUCKET_FUNCTION),


@ -584,3 +584,49 @@ IsInjectionPointAttached(const char *name)
	return false;				/* silence compiler */
#endif
}
/*
* Retrieve a list of all the injection points currently attached.
*
* This list is palloc'd in the current memory context.
*/
List *
InjectionPointList(void)
{
#ifdef USE_INJECTION_POINTS
List *inj_points = NIL;
uint32 max_inuse;
LWLockAcquire(InjectionPointLock, LW_SHARED);
max_inuse = pg_atomic_read_u32(&ActiveInjectionPoints->max_inuse);
for (uint32 idx = 0; idx < max_inuse; idx++)
{
InjectionPointEntry *entry;
InjectionPointData *inj_point;
uint64 generation;
entry = &ActiveInjectionPoints->entries[idx];
generation = pg_atomic_read_u64(&entry->generation);
/* skip free slots */
if (generation % 2 == 0)
continue;
inj_point = (InjectionPointData *) palloc0(sizeof(InjectionPointData));
inj_point->name = pstrdup(entry->name);
inj_point->library = pstrdup(entry->library);
inj_point->function = pstrdup(entry->function);
inj_points = lappend(inj_points, inj_point);
}
LWLockRelease(InjectionPointLock);
return inj_points;
#else
elog(ERROR, "Injection points are not supported by this build");
return NIL; /* keep compiler quiet */
#endif
}
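A hypothetical consumer of InjectionPointList() (the variable names are ours)
iterates the returned List with the usual pg_list.h helpers:

    List       *points = InjectionPointList();
    ListCell   *lc;

    foreach(lc, points)
    {
        InjectionPointData *p = (InjectionPointData *) lfirst(lc);

        elog(DEBUG1, "injection point \"%s\" (%s in %s)",
             p->name, p->function, p->library);
    }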


@ -531,6 +531,21 @@ dsa_attach(dsa_handle handle)
	return area;
}
/*
* Returns whether the area with the given handle was already attached by the
* current process. The area must have been created with dsa_create (not
* dsa_create_in_place).
*/
bool
dsa_is_attached(dsa_handle handle)
{
/*
* An area handle is really a DSM segment handle for the first segment, so
* we can just search for that.
*/
return dsm_find_mapping(handle) != NULL;
}
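A minimal sketch of the intended use (hypothetical caller; dsa_attach would
otherwise fail if the area were already attached in this process):

    dsa_area   *area = NULL;

    if (!dsa_is_attached(handle))
        area = dsa_attach(handle);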
/*
 * Attach to an area that was created with dsa_create_in_place.  The caller
 * must somehow know the location in memory that was used when the area was


@ -93,9 +93,9 @@ tests += {
  'sd': meson.current_source_dir(),
  'bd': meson.current_build_dir(),
  'tap': {
-    'env': {'GZIP_PROGRAM': gzip.found() ? gzip.path() : '',
-            'TAR': tar.found() ? tar.path() : '',
-            'LZ4': program_lz4.found() ? program_lz4.path() : '',
+    'env': {'GZIP_PROGRAM': gzip.found() ? gzip.full_path() : '',
+            'TAR': tar.found() ? tar.full_path() : '',
+            'LZ4': program_lz4.found() ? program_lz4.full_path() : '',
    },
    'tests': [
      't/010_pg_basebackup.pl',


@ -91,9 +91,9 @@ tests += {
  'bd': meson.current_build_dir(),
  'tap': {
    'env': {
-      'GZIP_PROGRAM': gzip.found() ? gzip.path() : '',
-      'LZ4': program_lz4.found() ? program_lz4.path() : '',
-      'ZSTD': program_zstd.found() ? program_zstd.path() : '',
+      'GZIP_PROGRAM': gzip.found() ? gzip.full_path() : '',
+      'LZ4': program_lz4.found() ? program_lz4.full_path() : '',
+      'ZSTD': program_zstd.found() ? program_zstd.full_path() : '',
      'with_icu': icu.found() ? 'yes' : 'no',
    },
    'tests': [


@ -23,10 +23,10 @@ tests += {
  'sd': meson.current_source_dir(),
  'bd': meson.current_build_dir(),
  'tap': {
-    'env': {'GZIP_PROGRAM': gzip.found() ? gzip.path() : '',
-            'TAR': tar.found() ? tar.path() : '',
-            'LZ4': program_lz4.found() ? program_lz4.path() : '',
-            'ZSTD': program_zstd.found() ? program_zstd.path() : ''},
+    'env': {'GZIP_PROGRAM': gzip.found() ? gzip.full_path() : '',
+            'TAR': tar.found() ? tar.full_path() : '',
+            'LZ4': program_lz4.found() ? program_lz4.full_path() : '',
+            'ZSTD': program_zstd.found() ? program_zstd.full_path() : ''},
    'tests': [
      't/001_basic.pl',
      't/002_algorithm.pl',


@ -16,6 +16,22 @@ my $primary = PostgreSQL::Test::Cluster->new('primary');
$primary->init(allows_streaming => 1);
$primary->start;
# Create file with some random data and an arbitrary size, useful to check
# the solidity of the compression and decompression logic. The size of the
# file is chosen to be around 640kB. This has proven to be large enough to
# detect some issues related to LZ4, and low enough to not impact the runtime
# of the test significantly.
my $junk_data = $primary->safe_psql(
'postgres', qq(
SELECT string_agg(encode(sha256(i::bytea), 'hex'), '')
FROM generate_series(1, 10240) s(i);));
my $data_dir = $primary->data_dir;
my $junk_file = "$data_dir/junk";
open my $jf, '>', $junk_file
or die "Could not create junk file: $!";
print $jf $junk_data;
close $jf;
# Create a tablespace directory.
my $source_ts_path = PostgreSQL::Test::Utils::tempdir_short();
@ -52,6 +68,12 @@ my @test_configuration = (
		'backup_archive' => [ 'base.tar.lz4', "$tsoid.tar.lz4" ],
		'enabled' => check_pg_config("#define USE_LZ4 1")
	},
{
'compression_method' => 'lz4',
'backup_flags' => [ '--compress', 'server-lz4:5' ],
'backup_archive' => [ 'base.tar.lz4', "$tsoid.tar.lz4" ],
'enabled' => check_pg_config("#define USE_LZ4 1")
},
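# (For reference, the equivalent manual invocation this new test case
# exercises would be something like the following; the backup path is
# illustrative:
#   pg_basebackup -D /path/to/backup -Ft --compress server-lz4:5
# i.e. server-side LZ4 at an explicit compression level.)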
	{
		'compression_method' => 'zstd',
		'backup_flags' => [ '--compress', 'server-zstd' ],


@ -15,6 +15,22 @@ my $primary = PostgreSQL::Test::Cluster->new('primary');
$primary->init(allows_streaming => 1);
$primary->start;
# Create file with some random data and an arbitrary size, useful to check
# the solidity of the compression and decompression logic. The size of the
# file is chosen to be around 640kB. This has proven to be large enough to
# detect some issues related to LZ4, and low enough to not impact the runtime
# of the test significantly.
my $junk_data = $primary->safe_psql(
'postgres', qq(
SELECT string_agg(encode(sha256(i::bytea), 'hex'), '')
FROM generate_series(1, 10240) s(i);));
my $data_dir = $primary->data_dir;
my $junk_file = "$data_dir/junk";
open my $jf, '>', $junk_file
or die "Could not create junk file: $!";
print $jf $junk_data;
close $jf;
my $backup_path = $primary->backup_dir . '/client-backup';
my $extract_path = $primary->backup_dir . '/extracted-backup';
@ -37,6 +53,12 @@ my @test_configuration = (
		'backup_archive' => 'base.tar.lz4',
		'enabled' => check_pg_config("#define USE_LZ4 1")
	},
{
'compression_method' => 'lz4',
'backup_flags' => [ '--compress', 'client-lz4:1' ],
'backup_archive' => 'base.tar.lz4',
'enabled' => check_pg_config("#define USE_LZ4 1")
},
	{
		'compression_method' => 'zstd',
		'backup_flags' => [ '--compress', 'client-zstd:5' ],


@ -322,9 +322,9 @@ astreamer_lz4_decompressor_content(astreamer *streamer,
	mystreamer = (astreamer_lz4_frame *) streamer;
	next_in = (uint8 *) data;
-	next_out = (uint8 *) mystreamer->base.bbs_buffer.data;
+	next_out = (uint8 *) mystreamer->base.bbs_buffer.data + mystreamer->bytes_written;
	avail_in = len;
-	avail_out = mystreamer->base.bbs_buffer.maxlen;
+	avail_out = mystreamer->base.bbs_buffer.maxlen - mystreamer->bytes_written;

	while (avail_in > 0)
	{
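The fix makes successive calls append to the output buffer rather than
overwrite it: decompressed bytes from an earlier chunk already occupy the
first bytes_written bytes, so the next write must start past them.  A minimal
sketch of the invariant (names abbreviated and illustrative):

    /* resume after prior output instead of at the buffer start */
    next_out  = buffer + bytes_written;
    avail_out = buffer_maxlen - bytes_written;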


@ -157,35 +157,6 @@ typedef struct ExprState
 * entries for a particular index.  Used for both index_build and
 * retail creation of index entries.
 *
 * ii_Concurrent, ii_BrokenHotChain, and ii_ParallelWorkers are used only
 * during index build; they're conventionally zeroed otherwise.
 * ----------------
@ -193,31 +164,67 @@ typedef struct ExprState
typedef struct IndexInfo
{
	NodeTag		type;

	/* total number of columns in index */
	int			ii_NumIndexAttrs;

	/* number of key columns in index */
	int			ii_NumIndexKeyAttrs;

	/*
	 * Underlying-rel attribute numbers used as keys (zeroes indicate
	 * expressions).  It also contains info about included columns.
	 */
	AttrNumber	ii_IndexAttrNumbers[INDEX_MAX_KEYS];

	/* expr trees for expression entries, or NIL if none */
	List	   *ii_Expressions; /* list of Expr */

	/* exec state for expressions, or NIL if none */
	List	   *ii_ExpressionsState;	/* list of ExprState */

	/* partial-index predicate, or NIL if none */
	List	   *ii_Predicate;	/* list of Expr */

	/* exec state for expressions, or NIL if none */
	ExprState  *ii_PredicateState;

	/* Per-column exclusion operators, or NULL if none */
	Oid		   *ii_ExclusionOps;	/* array with one entry per column */

	/* Underlying function OIDs for ExclusionOps */
	Oid		   *ii_ExclusionProcs;	/* array with one entry per column */

	/* Opclass strategy numbers for ExclusionOps */
	uint16	   *ii_ExclusionStrats; /* array with one entry per column */

	/* These are like Exclusion*, but for unique indexes */
	Oid		   *ii_UniqueOps;	/* array with one entry per column */
	Oid		   *ii_UniqueProcs; /* array with one entry per column */
	uint16	   *ii_UniqueStrats;	/* array with one entry per column */

	/* is it a unique index? */
	bool		ii_Unique;

	/* is NULLS NOT DISTINCT? */
	bool		ii_NullsNotDistinct;

	/* is it valid for inserts? */
	bool		ii_ReadyForInserts;

	/* IndexUnchanged status determined yet? */
	bool		ii_CheckedUnchanged;

	/* aminsert hint, cached for retail inserts */
	bool		ii_IndexUnchanged;

	/* are we doing a concurrent index build? */
	bool		ii_Concurrent;

	/* did we detect any broken HOT chains? */
	bool		ii_BrokenHotChain;

	/* is it a summarizing index? */
	bool		ii_Summarizing;

	/* is it a WITHOUT OVERLAPS index? */
	bool		ii_WithoutOverlaps;

	/* # of workers requested (excludes leader) */
	int			ii_ParallelWorkers;

	/* Oid of index AM */
	Oid			ii_Am;

	/* private cache area for index AM */
	void	   *ii_AmCache;

	/* memory context holding this IndexInfo */
	MemoryContext ii_Context;
} IndexInfo;


@ -28,7 +28,7 @@ node_support_input_i = [
node_support_input = []
foreach i : node_support_input_i
-  node_support_input += meson.source_root() / 'src' / 'include' / i
+  node_support_input += meson.project_source_root() / 'src' / 'include' / i
endforeach

node_support_output = [


@ -1,6 +1,6 @@
# Copyright (c) 2022-2025, PostgreSQL Global Development Group

# See https://github.com/mesonbuild/meson/issues/10338
-pch_c_h = meson.source_root() / meson.current_source_dir() / 'c_pch.h'
-pch_postgres_h = meson.source_root() / meson.current_source_dir() / 'postgres_pch.h'
-pch_postgres_fe_h = meson.source_root() / meson.current_source_dir() / 'postgres_fe_pch.h'
+pch_c_h = meson.project_source_root() / meson.current_source_dir() / 'c_pch.h'
+pch_postgres_h = meson.project_source_root() / meson.current_source_dir() / 'postgres_pch.h'
+pch_postgres_fe_h = meson.project_source_root() / meson.current_source_dir() / 'postgres_fe_pch.h'


@ -72,7 +72,7 @@ pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
{
	if (__builtin_constant_p(len) && len < 32)
	{
-		const unsigned char *p = data;
+		const unsigned char *p = (const unsigned char *) data;

		/*
		 * For small constant inputs, inline the computation to avoid a


@ -13,10 +13,15 @@
#ifndef DSM_REGISTRY_H
#define DSM_REGISTRY_H
#include "lib/dshash.h"
extern void *GetNamedDSMSegment(const char *name, size_t size,
								void (*init_callback) (void *ptr),
								bool *found);
extern dsa_area *GetNamedDSA(const char *name, bool *found);
extern dshash_table *GetNamedDSHash(const char *name,
const dshash_parameters *params,
bool *found);
extern Size DSMRegistryShmemSize(void);
extern void DSMRegistryShmemInit(void);


@ -145,6 +145,7 @@ extern dsa_area *dsa_create_in_place_ext(void *place, size_t size,
										 size_t init_segment_size,
										 size_t max_segment_size);
extern dsa_area *dsa_attach(dsa_handle handle);
extern bool dsa_is_attached(dsa_handle handle);
extern dsa_area *dsa_attach_in_place(void *place, dsm_segment *segment);
extern void dsa_release_in_place(void *place);
extern void dsa_on_dsm_detach_release_in_place(dsm_segment *, Datum);


@ -11,6 +11,19 @@
#ifndef INJECTION_POINT_H
#define INJECTION_POINT_H
#include "nodes/pg_list.h"
/*
* Injection point data, used when retrieving a list of all the attached
* injection points.
*/
typedef struct InjectionPointData
{
const char *name;
const char *library;
const char *function;
} InjectionPointData;
/*
 * Injection points require --enable-injection-points.
 */
@ -47,6 +60,9 @@ extern void InjectionPointCached(const char *name, void *arg);
extern bool IsInjectionPointAttached(const char *name);
extern bool InjectionPointDetach(const char *name);
/* Get the current set of injection points attached */
extern List *InjectionPointList(void);
#ifdef EXEC_BACKEND
extern PGDLLIMPORT struct InjectionPointsCtl *ActiveInjectionPoints;
#endif


@ -137,6 +137,7 @@ PQcancelCreate(PGconn *conn)
		goto oom_error;

	originalHost = conn->connhost[conn->whichhost];
cancelConn->connhost[0].type = originalHost.type;
	if (originalHost.host)
	{
		cancelConn->connhost[0].host = strdup(originalHost.host);


@ -6,7 +6,7 @@
# Emulation of PGAC_CHECK_STRIP
strip_bin = find_program(get_option('STRIP'), required: false, native: true)
-strip_cmd = strip_bin.found() ? [strip_bin.path()] : [':']
+strip_cmd = strip_bin.found() ? [strip_bin.full_path()] : [':']

working_strip = false
if strip_bin.found()
@ -49,8 +49,8 @@ pgxs_kv = {
  'PORTNAME': portname,
  'PG_SYSROOT': pg_sysroot,
-  'abs_top_builddir': meson.build_root(),
-  'abs_top_srcdir': meson.source_root(),
+  'abs_top_builddir': meson.project_build_root(),
+  'abs_top_srcdir': meson.project_source_root(),
  'enable_rpath': get_option('rpath') ? 'yes' : 'no',
  'enable_nls': libintl.found() ? 'yes' : 'no',
@ -123,7 +123,7 @@ pgxs_kv = {
if llvm.found()
  pgxs_kv += {
-    'CLANG': clang.path(),
+    'CLANG': clang.full_path(),
    'CXX': ' '.join(cpp.cmd_array()),
    'LLVM_BINPATH': llvm_binpath,
  }
@ -258,7 +258,7 @@ pgxs_deps = {
pgxs_cdata = configuration_data(pgxs_kv)
foreach b, p : pgxs_bins
-  pgxs_cdata.set(b, p.found() ? p.path() : '')
+  pgxs_cdata.set(b, p.found() ? p.full_path() : '')
endforeach

foreach pe : pgxs_empty


@ -96,7 +96,7 @@ tests += {
      'plperl_transaction',
      'plperl_env',
    ],
-    'regress_args': ['--dlpath', meson.build_root() / 'src/test/regress'],
+    'regress_args': ['--dlpath', meson.project_build_root() / 'src/test/regress'],
  },
}


@ -39,7 +39,7 @@ tests += {
      'reindex_conc',
      'vacuum',
    ],
-    'regress_args': ['--dlpath', meson.build_root() / 'src/test/regress'],
+    'regress_args': ['--dlpath', meson.project_build_root() / 'src/test/regress'],
    # The injection points are cluster-wide, so disable installcheck
    'runningcheck': false,
  },


@ -77,7 +77,7 @@ tests += {
      't/002_client.pl',
    ],
    'env': {
-      'PYTHON': python.path(),
+      'PYTHON': python.full_path(),
      'with_libcurl': oauth_flow_supported ? 'yes' : 'no',
      'with_python': 'yes',
    },


@ -5,6 +5,12 @@ SELECT set_val_in_shmem(1236);
(1 row)
SELECT set_val_in_hash('test', '1414');
set_val_in_hash
-----------------
(1 row)
\c
SELECT get_val_in_shmem();
 get_val_in_shmem
@ -12,3 +18,9 @@ SELECT get_val_in_shmem();
             1236
(1 row)
SELECT get_val_in_hash('test');
get_val_in_hash
-----------------
1414
(1 row)


@ -1,4 +1,6 @@
CREATE EXTENSION test_dsm_registry;
SELECT set_val_in_shmem(1236);
SELECT set_val_in_hash('test', '1414');
\c
SELECT get_val_in_shmem();
SELECT get_val_in_hash('test');


@ -8,3 +8,9 @@ CREATE FUNCTION set_val_in_shmem(val INT) RETURNS VOID
CREATE FUNCTION get_val_in_shmem() RETURNS INT
	AS 'MODULE_PATHNAME' LANGUAGE C;
CREATE FUNCTION set_val_in_hash(key TEXT, val TEXT) RETURNS VOID
AS 'MODULE_PATHNAME' LANGUAGE C;
CREATE FUNCTION get_val_in_hash(key TEXT) RETURNS TEXT
AS 'MODULE_PATHNAME' LANGUAGE C;


@ -15,6 +15,7 @@
#include "fmgr.h"
#include "storage/dsm_registry.h"
#include "storage/lwlock.h"
#include "utils/builtins.h"
PG_MODULE_MAGIC;
@ -24,15 +25,31 @@ typedef struct TestDSMRegistryStruct
	LWLock		lck;
} TestDSMRegistryStruct;

-static TestDSMRegistryStruct *tdr_state;
+typedef struct TestDSMRegistryHashEntry
{
char key[64];
dsa_pointer val;
} TestDSMRegistryHashEntry;
static TestDSMRegistryStruct *tdr_dsm;
static dsa_area *tdr_dsa;
static dshash_table *tdr_hash;
static const dshash_parameters dsh_params = {
offsetof(TestDSMRegistryHashEntry, val),
sizeof(TestDSMRegistryHashEntry),
dshash_strcmp,
dshash_strhash,
dshash_strcpy
};
static void
-tdr_init_shmem(void *ptr)
+init_tdr_dsm(void *ptr)
{
-	TestDSMRegistryStruct *state = (TestDSMRegistryStruct *) ptr;
+	TestDSMRegistryStruct *dsm = (TestDSMRegistryStruct *) ptr;

-	LWLockInitialize(&state->lck, LWLockNewTrancheId());
-	state->val = 0;
+	LWLockInitialize(&dsm->lck, LWLockNewTrancheId());
+	dsm->val = 0;
}

static void
@ -40,11 +57,17 @@ tdr_attach_shmem(void)
{
	bool		found;

-	tdr_state = GetNamedDSMSegment("test_dsm_registry",
-								   sizeof(TestDSMRegistryStruct),
-								   tdr_init_shmem,
-								   &found);
-	LWLockRegisterTranche(tdr_state->lck.tranche, "test_dsm_registry");
+	tdr_dsm = GetNamedDSMSegment("test_dsm_registry_dsm",
+								 sizeof(TestDSMRegistryStruct),
+								 init_tdr_dsm,
+								 &found);
+	LWLockRegisterTranche(tdr_dsm->lck.tranche, "test_dsm_registry");
if (tdr_dsa == NULL)
tdr_dsa = GetNamedDSA("test_dsm_registry_dsa", &found);
if (tdr_hash == NULL)
tdr_hash = GetNamedDSHash("test_dsm_registry_hash", &dsh_params, &found);
}

PG_FUNCTION_INFO_V1(set_val_in_shmem);
@ -53,9 +76,9 @@ set_val_in_shmem(PG_FUNCTION_ARGS)
{
	tdr_attach_shmem();

-	LWLockAcquire(&tdr_state->lck, LW_EXCLUSIVE);
-	tdr_state->val = PG_GETARG_INT32(0);
-	LWLockRelease(&tdr_state->lck);
+	LWLockAcquire(&tdr_dsm->lck, LW_EXCLUSIVE);
+	tdr_dsm->val = PG_GETARG_INT32(0);
+	LWLockRelease(&tdr_dsm->lck);

	PG_RETURN_VOID();
}
@ -68,9 +91,57 @@ get_val_in_shmem(PG_FUNCTION_ARGS)
	tdr_attach_shmem();

-	LWLockAcquire(&tdr_state->lck, LW_SHARED);
-	ret = tdr_state->val;
-	LWLockRelease(&tdr_state->lck);
+	LWLockAcquire(&tdr_dsm->lck, LW_SHARED);
+	ret = tdr_dsm->val;
+	LWLockRelease(&tdr_dsm->lck);

	PG_RETURN_INT32(ret);
}
PG_FUNCTION_INFO_V1(set_val_in_hash);
Datum
set_val_in_hash(PG_FUNCTION_ARGS)
{
TestDSMRegistryHashEntry *entry;
char *key = TextDatumGetCString(PG_GETARG_DATUM(0));
char *val = TextDatumGetCString(PG_GETARG_DATUM(1));
bool found;
if (strlen(key) >= offsetof(TestDSMRegistryHashEntry, val))
ereport(ERROR,
(errmsg("key too long")));
tdr_attach_shmem();
entry = dshash_find_or_insert(tdr_hash, key, &found);
if (found)
dsa_free(tdr_dsa, entry->val);
entry->val = dsa_allocate(tdr_dsa, strlen(val) + 1);
strcpy(dsa_get_address(tdr_dsa, entry->val), val);
dshash_release_lock(tdr_hash, entry);
PG_RETURN_VOID();
}
PG_FUNCTION_INFO_V1(get_val_in_hash);
Datum
get_val_in_hash(PG_FUNCTION_ARGS)
{
TestDSMRegistryHashEntry *entry;
char *key = TextDatumGetCString(PG_GETARG_DATUM(0));
text *val = NULL;
tdr_attach_shmem();
entry = dshash_find(tdr_hash, key, false);
if (entry == NULL)
PG_RETURN_NULL();
val = cstring_to_text(dsa_get_address(tdr_dsa, entry->val));
dshash_release_lock(tdr_hash, entry);
PG_RETURN_TEXT_P(val);
}


@ -187,4 +187,54 @@ ok( $logfile =~
	qr/FATAL: .* recovery ended before configured recovery target was reached/,
	'recovery end before target reached is a fatal error');
# Invalid timeline target
$node_standby = PostgreSQL::Test::Cluster->new('standby_9');
$node_standby->init_from_backup($node_primary, 'my_backup',
has_restoring => 1);
$node_standby->append_conf('postgresql.conf',
"recovery_target_timeline = 'bogus'");
$res = run_log(
[
'pg_ctl',
'--pgdata' => $node_standby->data_dir,
'--log' => $node_standby->logfile,
'start',
]);
ok(!$res, 'invalid timeline target (bogus value)');
my $log_start = $node_standby->wait_for_log("is not a valid number");
# Timeline target out of min range
$node_standby->append_conf('postgresql.conf',
"recovery_target_timeline = '0'");
$res = run_log(
[
'pg_ctl',
'--pgdata' => $node_standby->data_dir,
'--log' => $node_standby->logfile,
'start',
]);
ok(!$res, 'invalid timeline target (lower bound check)');
$log_start =
$node_standby->wait_for_log("must be between 1 and 4294967295", $log_start);
# Timeline target out of max range
$node_standby->append_conf('postgresql.conf',
"recovery_target_timeline = '4294967296'");
$res = run_log(
[
'pg_ctl',
'--pgdata' => $node_standby->data_dir,
'--log' => $node_standby->logfile,
'start',
]);
ok(!$res, 'invalid timeline target (upper bound check)');
$log_start =
$node_standby->wait_for_log("must be between 1 and 4294967295", $log_start);
done_testing();
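For reference, a configuration that passes the new checks looks like the
following (a sketch; per the range validated above, a numeric target must be
between 1 and 4294967295):

    # postgresql.conf
    recovery_target_timeline = 'latest'    # or 'current', or e.g. '2'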


@ -195,54 +195,123 @@ ORDER BY proname DESC, proargtypes DESC, pronamespace DESC LIMIT 1;
(1 row)

--
-- Forwards scan RowCompare qual whose row arg has a NULL that affects our
-- initial positioning strategy
--
explain (costs off)
SELECT proname, proargtypes, pronamespace
FROM pg_proc
WHERE (proname, proargtypes) >= ('abs', NULL) AND proname <= 'abs'
ORDER BY proname, proargtypes, pronamespace;
                                                   QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Index Only Scan using pg_proc_proname_args_nsp_index on pg_proc
   Index Cond: ((ROW(proname, proargtypes) >= ROW('abs'::name, NULL::oidvector)) AND (proname <= 'abs'::name))
(2 rows)

SELECT proname, proargtypes, pronamespace
FROM pg_proc
WHERE (proname, proargtypes) >= ('abs', NULL) AND proname <= 'abs'
ORDER BY proname, proargtypes, pronamespace;
 proname | proargtypes | pronamespace
---------+-------------+--------------
(0 rows)

--
-- Forwards scan RowCompare quals whose row arg has a NULL that ends scan
--
explain (costs off)
SELECT proname, proargtypes, pronamespace
FROM pg_proc
WHERE proname >= 'abs' AND (proname, proargtypes) < ('abs', NULL)
ORDER BY proname, proargtypes, pronamespace;
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Index Only Scan using pg_proc_proname_args_nsp_index on pg_proc
   Index Cond: ((proname >= 'abs'::name) AND (ROW(proname, proargtypes) < ROW('abs'::name, NULL::oidvector)))
(2 rows)

SELECT proname, proargtypes, pronamespace
FROM pg_proc
WHERE proname >= 'abs' AND (proname, proargtypes) < ('abs', NULL)
ORDER BY proname, proargtypes, pronamespace;
 proname | proargtypes | pronamespace
---------+-------------+--------------
(0 rows)

--
-- Backwards scan RowCompare qual whose row arg has a NULL that affects our
-- initial positioning strategy
--
explain (costs off)
SELECT proname, proargtypes, pronamespace
FROM pg_proc
WHERE proname >= 'abs' AND (proname, proargtypes) <= ('abs', NULL)
ORDER BY proname DESC, proargtypes DESC, pronamespace DESC;
                                                   QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Index Only Scan Backward using pg_proc_proname_args_nsp_index on pg_proc
   Index Cond: ((proname >= 'abs'::name) AND (ROW(proname, proargtypes) <= ROW('abs'::name, NULL::oidvector)))
(2 rows)

SELECT proname, proargtypes, pronamespace
FROM pg_proc
WHERE proname >= 'abs' AND (proname, proargtypes) <= ('abs', NULL)
ORDER BY proname DESC, proargtypes DESC, pronamespace DESC;
 proname | proargtypes | pronamespace
---------+-------------+--------------
(0 rows)

--
-- Backwards scan RowCompare qual whose row arg has a NULL that ends scan
--
explain (costs off)
SELECT proname, proargtypes, pronamespace
FROM pg_proc
WHERE (proname, proargtypes) > ('abs', NULL) AND proname <= 'abs'
ORDER BY proname DESC, proargtypes DESC, pronamespace DESC;
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Index Only Scan Backward using pg_proc_proname_args_nsp_index on pg_proc
   Index Cond: ((ROW(proname, proargtypes) > ROW('abs'::name, NULL::oidvector)) AND (proname <= 'abs'::name))
(2 rows)

SELECT proname, proargtypes, pronamespace
FROM pg_proc
WHERE (proname, proargtypes) > ('abs', NULL) AND proname <= 'abs'
ORDER BY proname DESC, proargtypes DESC, pronamespace DESC;
 proname | proargtypes | pronamespace
---------+-------------+--------------
(0 rows)

-- Makes B-Tree preprocessing deal with unmarking redundant keys that were
-- initially marked required (test case relies on current row compare
-- preprocessing limitations)
explain (costs off)
SELECT proname, proargtypes, pronamespace
FROM pg_proc
WHERE proname = 'zzzzzz' AND (proname, proargtypes) > ('abs', NULL)
AND pronamespace IN (1, 2, 3) AND proargtypes IN ('26 23', '5077')
ORDER BY proname, proargtypes, pronamespace;
                                                                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using pg_proc_proname_args_nsp_index on pg_proc
   Index Cond: ((ROW(proname, proargtypes) > ROW('abs'::name, NULL::oidvector)) AND (proname = 'zzzzzz'::name) AND (proargtypes = ANY ('{"26 23",5077}'::oidvector[])) AND (pronamespace = ANY ('{1,2,3}'::oid[])))
(2 rows)

SELECT proname, proargtypes, pronamespace
FROM pg_proc
WHERE proname = 'zzzzzz' AND (proname, proargtypes) > ('abs', NULL)
AND pronamespace IN (1, 2, 3) AND proargtypes IN ('26 23', '5077')
ORDER BY proname, proargtypes, pronamespace;
 proname | proargtypes | pronamespace
---------+-------------+--------------
(0 rows)

--
-- Performs a recheck of > key following array advancement on previous (left
-- sibling) page that used a high key whose attribute value corresponding to
-- the > key was -inf (due to being truncated when the high key was created).
--
-- XXX This relies on the assumption that tenk1_thous_tenthous has a truncated
-- high key "(183, -inf)" on the first page that we'll scan.  The test will only


@ -748,6 +748,11 @@ ALTER TABLE unique_tbl ALTER CONSTRAINT unique_tbl_i_key ENFORCED;
ERROR:  cannot alter enforceability of constraint "unique_tbl_i_key" of relation "unique_tbl"
ALTER TABLE unique_tbl ALTER CONSTRAINT unique_tbl_i_key NOT ENFORCED;
ERROR:  cannot alter enforceability of constraint "unique_tbl_i_key" of relation "unique_tbl"
-- can't make an existing constraint NOT VALID
ALTER TABLE unique_tbl ALTER CONSTRAINT unique_tbl_i_key NOT VALID;
ERROR: constraints cannot be altered to be NOT VALID
LINE 1: ...ABLE unique_tbl ALTER CONSTRAINT unique_tbl_i_key NOT VALID;
^
DROP TABLE unique_tbl;
--
-- EXCLUDE constraints


@ -1359,7 +1359,7 @@ LINE 1: ...e ALTER CONSTRAINT fktable_fk_fkey NOT DEFERRABLE INITIALLY ...
ALTER TABLE fktable ALTER CONSTRAINT fktable_fk_fkey NO INHERIT;
ERROR:  constraint "fktable_fk_fkey" of relation "fktable" is not a not-null constraint
ALTER TABLE fktable ALTER CONSTRAINT fktable_fk_fkey NOT VALID;
-ERROR:  FOREIGN KEY constraints cannot be marked NOT VALID
+ERROR:  constraints cannot be altered to be NOT VALID
LINE 1: ...ER TABLE fktable ALTER CONSTRAINT fktable_fk_fkey NOT VALID;
                                                             ^
ALTER TABLE fktable ALTER CONSTRAINT fktable_fk_fkey ENFORCED NOT ENFORCED;


@ -25,6 +25,7 @@ begin
    ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
    ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
    ln := regexp_replace(ln, 'Index Searches: \d+', 'Index Searches: N');
ln := regexp_replace(ln, 'Memory: \d+kB', 'Memory: NkB');
    return next ln;
  end loop;
end;
@ -500,3 +501,62 @@ RESET max_parallel_workers_per_gather;
RESET parallel_tuple_cost;
RESET parallel_setup_cost;
RESET min_parallel_table_scan_size;
-- Ensure memoize works for ANTI joins
CREATE TABLE tab_anti (a int, b boolean);
INSERT INTO tab_anti SELECT i%3, false FROM generate_series(1,100)i;
ANALYZE tab_anti;
-- Ensure we get a Memoize plan for ANTI join
SELECT explain_memoize('
SELECT COUNT(*) FROM tab_anti t1 LEFT JOIN
LATERAL (SELECT DISTINCT ON (a) a, b, t1.a AS x FROM tab_anti t2) t2
ON t1.a+1 = t2.a
WHERE t2.a IS NULL;', false);
explain_memoize
--------------------------------------------------------------------------------------------
Aggregate (actual rows=1.00 loops=N)
-> Nested Loop Anti Join (actual rows=33.00 loops=N)
-> Seq Scan on tab_anti t1 (actual rows=100.00 loops=N)
-> Memoize (actual rows=0.67 loops=N)
Cache Key: (t1.a + 1), t1.a
Cache Mode: binary
Hits: 97 Misses: 3 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Subquery Scan on t2 (actual rows=0.67 loops=N)
Filter: ((t1.a + 1) = t2.a)
Rows Removed by Filter: 2
-> Unique (actual rows=2.67 loops=N)
-> Sort (actual rows=67.33 loops=N)
Sort Key: t2_1.a
Sort Method: quicksort Memory: NkB
-> Seq Scan on tab_anti t2_1 (actual rows=100.00 loops=N)
(15 rows)
-- And check we get the expected results.
SELECT COUNT(*) FROM tab_anti t1 LEFT JOIN
LATERAL (SELECT DISTINCT ON (a) a, b, t1.a AS x FROM tab_anti t2) t2
ON t1.a+1 = t2.a
WHERE t2.a IS NULL;
count
-------
33
(1 row)
-- Ensure we do not add memoize node for SEMI join
EXPLAIN (COSTS OFF)
SELECT * FROM tab_anti t1 WHERE t1.a IN
(SELECT a FROM tab_anti t2 WHERE t2.b IN
(SELECT t1.b FROM tab_anti t3 WHERE t2.a > 1 OFFSET 0));
QUERY PLAN
-------------------------------------------------
Nested Loop Semi Join
-> Seq Scan on tab_anti t1
-> Nested Loop Semi Join
Join Filter: (t1.a = t2.a)
-> Seq Scan on tab_anti t2
-> Subquery Scan on "ANY_subquery"
Filter: (t2.b = "ANY_subquery".b)
-> Result
One-Time Filter: (t2.a > 1)
-> Seq Scan on tab_anti t3
(10 rows)
DROP TABLE tab_anti;


@ -1464,9 +1464,21 @@ ERROR: count must be greater than zero
SELECT width_bucket(3.5::float8, 3.0::float8, 3.0::float8, 888);
ERROR:  lower bound cannot equal upper bound
SELECT width_bucket('NaN', 3.0, 4.0, 888);
 width_bucket
--------------
          889
(1 row)

SELECT width_bucket('NaN'::float8, 3.0::float8, 4.0::float8, 888);
 width_bucket
--------------
          889
(1 row)

SELECT width_bucket(0, 'NaN', 4.0, 888);
ERROR:  lower and upper bounds cannot be NaN
SELECT width_bucket(0::float8, 'NaN', 4.0::float8, 888);
ERROR:  lower and upper bounds cannot be NaN
SELECT width_bucket(2.0, 3.0, '-inf', 888);
ERROR:  lower and upper bounds must be finite
SELECT width_bucket(0::float8, '-inf', 4.0::float8, 888);


@ -143,38 +143,83 @@ SELECT proname, proargtypes, pronamespace
 ORDER BY proname DESC, proargtypes DESC, pronamespace DESC LIMIT 1;
 --
--- Add coverage for RowCompare quals whose rhs row has a NULL that ends scan
+-- Forwards scan RowCompare qual whose row arg has a NULL that affects our
+-- initial positioning strategy
 --
 explain (costs off)
 SELECT proname, proargtypes, pronamespace
 FROM pg_proc
-WHERE proname = 'abs' AND (proname, proargtypes) < ('abs', NULL)
+WHERE (proname, proargtypes) >= ('abs', NULL) AND proname <= 'abs'
 ORDER BY proname, proargtypes, pronamespace;
 SELECT proname, proargtypes, pronamespace
 FROM pg_proc
-WHERE proname = 'abs' AND (proname, proargtypes) < ('abs', NULL)
+WHERE (proname, proargtypes) >= ('abs', NULL) AND proname <= 'abs'
 ORDER BY proname, proargtypes, pronamespace;
 --
--- Add coverage for backwards scan RowCompare quals whose rhs row has a NULL
--- that ends scan
+-- Forwards scan RowCompare quals whose row arg has a NULL that ends scan
 --
 explain (costs off)
 SELECT proname, proargtypes, pronamespace
 FROM pg_proc
-WHERE proname = 'abs' AND (proname, proargtypes) > ('abs', NULL)
+WHERE proname >= 'abs' AND (proname, proargtypes) < ('abs', NULL)
+ORDER BY proname, proargtypes, pronamespace;
+SELECT proname, proargtypes, pronamespace
+FROM pg_proc
+WHERE proname >= 'abs' AND (proname, proargtypes) < ('abs', NULL)
+ORDER BY proname, proargtypes, pronamespace;
+--
+-- Backwards scan RowCompare qual whose row arg has a NULL that affects our
+-- initial positioning strategy
+--
+explain (costs off)
+SELECT proname, proargtypes, pronamespace
+FROM pg_proc
+WHERE proname >= 'abs' AND (proname, proargtypes) <= ('abs', NULL)
 ORDER BY proname DESC, proargtypes DESC, pronamespace DESC;
 SELECT proname, proargtypes, pronamespace
 FROM pg_proc
-WHERE proname = 'abs' AND (proname, proargtypes) > ('abs', NULL)
+WHERE proname >= 'abs' AND (proname, proargtypes) <= ('abs', NULL)
 ORDER BY proname DESC, proargtypes DESC, pronamespace DESC;
 --
--- Add coverage for recheck of > key following array advancement on previous
--- (left sibling) page that used a high key whose attribute value corresponding
--- to the > key was -inf (due to being truncated when the high key was created).
+-- Backwards scan RowCompare qual whose row arg has a NULL that ends scan
+--
+explain (costs off)
+SELECT proname, proargtypes, pronamespace
+FROM pg_proc
+WHERE (proname, proargtypes) > ('abs', NULL) AND proname <= 'abs'
+ORDER BY proname DESC, proargtypes DESC, pronamespace DESC;
+SELECT proname, proargtypes, pronamespace
+FROM pg_proc
+WHERE (proname, proargtypes) > ('abs', NULL) AND proname <= 'abs'
+ORDER BY proname DESC, proargtypes DESC, pronamespace DESC;
+-- Makes B-Tree preprocessing deal with unmarking redundant keys that were
+-- initially marked required (test case relies on current row compare
+-- preprocessing limitations)
+explain (costs off)
+SELECT proname, proargtypes, pronamespace
+FROM pg_proc
+WHERE proname = 'zzzzzz' AND (proname, proargtypes) > ('abs', NULL)
+AND pronamespace IN (1, 2, 3) AND proargtypes IN ('26 23', '5077')
+ORDER BY proname, proargtypes, pronamespace;
+SELECT proname, proargtypes, pronamespace
+FROM pg_proc
+WHERE proname = 'zzzzzz' AND (proname, proargtypes) > ('abs', NULL)
+AND pronamespace IN (1, 2, 3) AND proargtypes IN ('26 23', '5077')
+ORDER BY proname, proargtypes, pronamespace;
+--
+-- Performs a recheck of > key following array advancement on previous (left
+-- sibling) page that used a high key whose attribute value corresponding to
+-- the > key was -inf (due to being truncated when the high key was created).
 --
 -- XXX This relies on the assumption that tenk1_thous_tenthous has a truncated
 -- high key "(183, -inf)" on the first page that we'll scan. The test will only
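As background for these quals: a row comparison is evaluated lexicographically, so a NULL in a trailing column of the row argument yields unknown for rows that match on the leading column, which is what lets the scan end early or pick a different initial position. A standalone sketch, with an illustrative table and index that are not part of the patch:

CREATE TABLE rc_demo (x text, y int);
CREATE INDEX rc_demo_x_y_idx ON rc_demo (x, y);
INSERT INTO rc_demo VALUES ('abs', 1), ('abs', 2), ('acos', 1);
-- (x, y) < ('abs', NULL) is true while x < 'abs' but unknown (never true)
-- once x = 'abs', because y < NULL is NULL; an index scan can therefore
-- stop at the first 'abs' tuple instead of walking all of them.
SELECT * FROM rc_demo WHERE x >= 'abs' AND (x, y) < ('abs', NULL);
-- expected: zero rows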

View File

@ -537,6 +537,9 @@ CREATE TABLE UNIQUE_NOTEN_TBL(i int UNIQUE NOT ENFORCED);
 ALTER TABLE unique_tbl ALTER CONSTRAINT unique_tbl_i_key ENFORCED;
 ALTER TABLE unique_tbl ALTER CONSTRAINT unique_tbl_i_key NOT ENFORCED;
+-- can't make an existing constraint NOT VALID
+ALTER TABLE unique_tbl ALTER CONSTRAINT unique_tbl_i_key NOT VALID;
 DROP TABLE unique_tbl;
 --
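The added test pins down that constraint validity is one-way under ALTER CONSTRAINT. For contrast, a minimal sketch of the supported direction, using illustrative names:

CREATE TABLE t_nv (i int);
ALTER TABLE t_nv ADD CONSTRAINT t_nv_check CHECK (i > 0) NOT VALID;
ALTER TABLE t_nv VALIDATE CONSTRAINT t_nv_check;  -- NOT VALID -> valid is allowed
-- Demoting a valid constraint back to NOT VALID is what the test above
-- expects to fail.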

View File

@ -26,6 +26,7 @@ begin
 ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
 ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
 ln := regexp_replace(ln, 'Index Searches: \d+', 'Index Searches: N');
+ln := regexp_replace(ln, 'Memory: \d+kB', 'Memory: NkB');
 return next ln;
 end loop;
 end;
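The added regexp_replace keeps the expected output stable, since the memory figure Memoize reports can vary between runs. The substitution can be tried directly; the input string here is made up:

SELECT regexp_replace('Memoize (actual rows=3 loops=N)  Memory: 17kB',
                      'Memory: \d+kB', 'Memory: NkB');
-- => Memoize (actual rows=3 loops=N)  Memory: NkB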
@ -244,3 +245,29 @@ RESET max_parallel_workers_per_gather;
 RESET parallel_tuple_cost;
 RESET parallel_setup_cost;
 RESET min_parallel_table_scan_size;
+-- Ensure memoize works for ANTI joins
+CREATE TABLE tab_anti (a int, b boolean);
+INSERT INTO tab_anti SELECT i%3, false FROM generate_series(1,100)i;
+ANALYZE tab_anti;
+
+-- Ensure we get a Memoize plan for ANTI join
+SELECT explain_memoize('
+SELECT COUNT(*) FROM tab_anti t1 LEFT JOIN
+LATERAL (SELECT DISTINCT ON (a) a, b, t1.a AS x FROM tab_anti t2) t2
+ON t1.a+1 = t2.a
+WHERE t2.a IS NULL;', false);
+
+-- And check we get the expected results.
+SELECT COUNT(*) FROM tab_anti t1 LEFT JOIN
+LATERAL (SELECT DISTINCT ON (a) a, b, t1.a AS x FROM tab_anti t2) t2
+ON t1.a+1 = t2.a
+WHERE t2.a IS NULL;
+
+-- Ensure we do not add memoize node for SEMI join
+EXPLAIN (COSTS OFF)
+SELECT * FROM tab_anti t1 WHERE t1.a IN
+(SELECT a FROM tab_anti t2 WHERE t2.b IN
+(SELECT t1.b FROM tab_anti t3 WHERE t2.a > 1 OFFSET 0));
+
+DROP TABLE tab_anti;
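Beyond the regression test itself, a hedged sketch of the plan shape this enables, using illustrative tables (a unique inner side, which is the case this feature covers) and with hash/merge joins disabled to force a nested loop; none of this is from the patch:

CREATE TABLE anti_outer (a int);
CREATE TABLE anti_inner (a int PRIMARY KEY);  -- provably unique inner side
INSERT INTO anti_outer SELECT i % 100 FROM generate_series(1, 10000) i;
INSERT INTO anti_inner SELECT i FROM generate_series(1, 50) i;
ANALYZE anti_outer;
ANALYZE anti_inner;
SET enable_hashjoin = off;
SET enable_mergejoin = off;
EXPLAIN (COSTS OFF)
SELECT * FROM anti_outer o
WHERE NOT EXISTS (SELECT 1 FROM anti_inner i WHERE i.a = o.a);
-- One plausible plan: Nested Loop Anti Join over a Memoize node that
-- caches the parameterized scan of anti_inner, keyed by o.a.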

View File

@ -869,6 +869,8 @@ SELECT width_bucket(5.0::float8, 3.0::float8, 4.0::float8, 0);
 SELECT width_bucket(5.0::float8, 3.0::float8, 4.0::float8, -5);
 SELECT width_bucket(3.5::float8, 3.0::float8, 3.0::float8, 888);
 SELECT width_bucket('NaN', 3.0, 4.0, 888);
+SELECT width_bucket('NaN'::float8, 3.0::float8, 4.0::float8, 888);
+SELECT width_bucket(0, 'NaN', 4.0, 888);
 SELECT width_bucket(0::float8, 'NaN', 4.0::float8, 888);
 SELECT width_bucket(2.0, 3.0, '-inf', 888);
 SELECT width_bucket(0::float8, '-inf', 4.0::float8, 888);

View File

@ -7,7 +7,7 @@ tests += {
   'tap': {
     'env': {
       'with_ssl': ssl_library,
-      'OPENSSL': openssl.found() ? openssl.path() : '',
+      'OPENSSL': openssl.found() ? openssl.full_path() : '',
     },
     'tests': [
       't/001_ssltests.pl',

View File

@ -601,6 +601,7 @@ DR_intorel
 DR_printtup
 DR_sqlfunction
 DR_transientrel
+DSMREntryType
 DSMRegistryCtxStruct
 DSMRegistryEntry
 DWORD
@ -1290,6 +1291,7 @@ InjectionPointCacheEntry
 InjectionPointCallback
 InjectionPointCondition
 InjectionPointConditionType
+InjectionPointData
 InjectionPointEntry
 InjectionPointSharedState
 InjectionPointsCtl
@ -1737,6 +1739,9 @@ Name
 NameData
 NameHashEntry
 NamedArgExpr
+NamedDSAState
+NamedDSHState
+NamedDSMState
 NamedLWLockTranche
 NamedLWLockTrancheRequest
 NamedTuplestoreScan
@ -3006,6 +3011,7 @@ Tcl_Obj
 Tcl_Size
 Tcl_Time
 TempNamespaceStatus
+TestDSMRegistryHashEntry
 TestDSMRegistryStruct
 TestDecodingData
 TestDecodingTxnData