222 Commits

Author SHA1 Message Date
Michael Paquier
9e98583898 Create routine able to set single-call SRFs for Materialize mode
Set-returning functions that use the Materialize mode, creating a
tuplestore to include all the tuples returned in a set rather than doing
so in multiple calls, use roughly the same set of steps to prepare
ReturnSetInfo for this job:
- Check if ReturnSetInfo supports returning a tuplestore and if the
materialize mode is enabled.
- Create a tuplestore for all the tuples part of the returned set in the
per-query memory context, stored in ReturnSetInfo->setResult.
- Build a tuple descriptor mostly from get_call_result_type(), then
stored in ReturnSetInfo->setDesc.  Note that there are some cases where
the SRF's tuple descriptor has to be the one specified by the function
caller.

This refactoring is done so as there are (well, should be) no behavior
changes in any of the in-core functions refactored, and the centralized
function that checks and sets up the function's ReturnSetInfo can be
controlled with a set of bits32 options.  Two of them prove to be
necessary now:
- SRF_SINGLE_USE_EXPECTED to use expectedDesc as tuple descriptor, as
expected by the function's caller.
- SRF_SINGLE_BLESS to validate the tuple descriptor for the SRF.

The same initialization pattern is simplified in 28 places per my
count as of src/backend/, shaving up to ~900 lines of code.  These
mostly come from the removal of the per-query initializations and the
sanity checks now grouped in a single location.  There are more
locations that could be simplified in contrib/, that are left for a
follow-up cleanup.

fcc2817, 07daca5 and d61a361 have prepared the areas of the code related
to this change, to ease this refactoring.

Author: Melanie Plageman, Michael Paquier
Reviewed-by: Álvaro Herrera, Justin Pryzby
Discussion: https://postgr.es/m/CAAKRu_azyd1Z3W_r7Ou4sorTjRCs+PxeHw1CWJeXKofkE6TuZg@mail.gmail.com
2022-03-07 10:26:29 +09:00
Amit Kapila
7a85073290 Reconsider pg_stat_subscription_workers view.
It was decided (refer to the Discussion link below) that the stats
collector is not an appropriate place to store the error information of
subscription workers.

This patch changes the pg_stat_subscription_workers view (introduced by
commit 8d74fc96db) so that it stores only statistics counters:
apply_error_count and sync_error_count, and has one entry for
each subscription. The removed error information such as error-XID and
the error message would be stored in another way in the future which is
more reliable and persistent.

After removing these error details, there is no longer any relation
information, so the subscription statistics are now a cluster-wide
statistics.

The patch also changes the view name to pg_stat_subscription_stats since
the word "worker" is an implementation detail that we use one worker for
one tablesync and one apply.

Author: Masahiko Sawada, based on suggestions by Andres Freund
Reviewed-by: Peter Smith, Haiying Tang, Takamichi Osumi, Amit Kapila
Discussion: https://postgr.es/m/20220125063131.4cmvsxbz2tdg6g65@alap3.anarazel.de
2022-03-01 06:17:52 +05:30
Michael Paquier
d61a361d1a Remove all traces of tuplestore_donestoring() in the C code
This routine is a no-op since dd04e95 from 2003, with a macro kept
around for compatibility purposes.  This has led to the same code
patterns being copy-pasted around for no effect, sometimes in confusing
ways like in pg_logical_slot_get_changes_guts() from logical.c where the
code was actually incorrect.

This issue has been discussed on two different threads recently, so
rather than living with this legacy, remove any uses of this routine in
the C code to simplify things.  The compatibility macro is kept to avoid
breaking any out-of-core modules that depend on it.

Reported-by: Tatsuhito Kasahara, Justin Pryzby
Author: Tatsuhito Kasahara
Discussion: https://postgr.es/m/20211217200419.GQ17618@telsasoft.com
Discussion: https://postgr.es/m/CAP0=ZVJeeYfAeRfmzqAF2Lumdiv4S4FewyBnZd4DPTrsSQKJKw@mail.gmail.com
2022-02-17 09:52:02 +09:00
Bruce Momjian
27b77ecf9f Update copyright for 2022
Backpatch-through: 10
2022-01-07 19:04:57 -05:00
Amit Kapila
8d74fc96db Add a view to show the stats of subscription workers.
This commit adds a new system view pg_stat_subscription_workers, that
shows information about any errors which occur during the application of
logical replication changes as well as during performing initial table
synchronization. The subscription statistics entries are removed when the
corresponding subscription is removed.

It also adds an SQL function pg_stat_reset_subscription_worker() to reset
single subscription errors.

The contents of this view can be used by an upcoming patch that skips the
particular transaction that conflicts with the existing data on the
subscriber.

This view can be extended in the future to track other xact related
statistics like the number of xacts committed/aborted for subscription
workers.

Author: Masahiko Sawada
Reviewed-by: Greg Nancarrow, Hou Zhijie, Tang Haiying, Vignesh C, Dilip Kumar, Takamichi Osumi, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
2021-11-30 08:54:30 +05:30
Michael Paquier
a45ed975c5 Fix memory overrun when querying pg_stat_slru
pg_stat_get_slru() in pgstatfuncs.c would point to one element after the
end of the array PgStat_SLRUStats when finishing to scan its entries.
This had no direct consequences as no data from the extra memory area
was read, but static analyzers would rightfully complain here.  So let's
be clean.

While on it, this adds one regression test in the area reserved for
system views.

Reported-by: Alexander Kozhemyakin, via AddressSanitizer
Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/17280-37da556e86032070@postgresql.org
Backpatch-through: 13
2021-11-12 21:49:21 +09:00
Andres Freund
1bc8e7b099 pgstat: split reporting/fetching of bgwriter and checkpointer stats.
These have been unrelated since bgwriter and checkpointer were split into two
processes in 806a2aee379. As there several pending patches (shared memory
stats, extending the set of tracked IO / buffer statistics) that are made a
bit more awkward by the grouping, split them. Done separately to make
reviewing easier.

This does *not* change the contents of pg_stat_bgwriter or move fields out of
bgwriter/checkpointer stats that arguably do not belong in either. However
pgstat_fetch_global() was renamed and split into
pgstat_fetch_stat_checkpointer() and pgstat_fetch_stat_bgwriter().

Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20210405092914.mmxqe7j56lsjfsej@alap3.anarazel.de
2021-08-04 19:16:04 -07:00
Amit Kapila
ab5e48f153 Replace magic constants used in pg_stat_get_replication_slot().
A few variables have been using 10 as a magic constant while
PG_STAT_GET_REPLICATION_SLOT_COLS can be used instead.

Author: Masahiko Sawada
Reviewed-By: Amit Kapila
Backpatch-through: 14, where it was introduced
Discussion: https://postgr.es/m/CAD21AoBvqODDfmD17DkEuPCvV2KbruukXQ2Vwrv5Xi-TsAsTJA@mail.gmail.com
2021-06-30 11:29:53 +05:30
Amit Kapila
3fa17d3771 Use HTAB for replication slot statistics.
Previously, we used to use the array of size max_replication_slots to
store stats for replication slots. But that had two problems in the cases
where a message for dropping a slot gets lost: 1) the stats for the new
slot are not recorded if the array is full and 2) writing beyond the end
of the array if the user reduces the max_replication_slots.

This commit uses HTAB for replication slot statistics, resolving both
problems. Now, pgstat_vacuum_stat() search for all the dead replication
slots in stats hashtable and tell the collector to remove them. To avoid
showing the stats for the already-dropped slots, pg_stat_replication_slots
view searches slot stats by the slot name taken from pg_replication_slots.

Also, we send a message for creating a slot at slot creation, initializing
the stats. This reduces the possibility that the stats are accumulated
into the old slot stats when a message for dropping a slot gets lost.

Reported-by: Andres Freund
Author: Sawada Masahiko, test case by Vignesh C
Reviewed-by: Amit Kapila, Vignesh C, Dilip Kumar
Discussion: https://postgr.es/m/20210319185247.ldebgpdaxsowiflw@alap3.anarazel.de
2021-04-27 09:09:11 +05:30
Bruce Momjian
9660834dd8 adjust query id feature to use pg_stat_activity.query_id
Previously, it was pg_stat_activity.queryid to match the
pg_stat_statements queryid column.  This is an adjustment to patch
4f0b0966c8.  This also adjusts some of the internal function calls to
match.  Catversion bumped.

Reported-by: Álvaro Herrera, Julien Rouhaud

Discussion: https://postgr.es/m/20210408032704.GA7498@alvherre.pgsql
2021-04-20 12:22:26 -04:00
Amit Kapila
f5fc2f5b23 Add information of total data processed to replication slot stats.
This adds the statistics about total transactions count and total
transaction data logically sent to the decoding output plugin from
ReorderBuffer. Users can query the pg_stat_replication_slots view to check
these stats.

Suggested-by: Andres Freund
Author: Vignesh C and Amit Kapila
Reviewed-by: Sawada Masahiko, Amit Kapila
Discussion: https://postgr.es/m/20210319185247.ldebgpdaxsowiflw@alap3.anarazel.de
2021-04-16 07:34:43 +05:30
Amit Kapila
cca57c1d9b Use NameData datatype for slotname in stats.
This will make it consistent with the other usage of slotname in the code.
In the passing, change pgstat_report_replslot signature to use a structure
rather than multiple parameters.

Reported-by: Andres Freund
Author: Vignesh C
Reviewed-by: Sawada Masahiko, Amit Kapila
Discussion: https://postgr.es/m/20210319185247.ldebgpdaxsowiflw@alap3.anarazel.de
2021-04-14 08:55:03 +05:30
Bruce Momjian
0f61727b75 Fixes for query_id feature
Ignore parallel workers in pg_stat_statements
  Oversight in 4f0b0966c8 which exposed queryid in parallel workers.
  Counters are aggregated by the main backend process so parallel workers
  would report duplicated activity, and could also report activity for the
  wrong entry as they are only aware of the top level queryid.

Fix thinko in pg_stat_get_activity when retrieving the queryid.

Remove unnecessary call to pgstat_report_queryid().

Reported-by: Amit Kapila, Andres Freund, Thomas Munro

Discussion: https://postgr.es/m/20210408051735.lfbdzun5zdlax5gd@alap3.anarazel.de p634GTSOqnDW86Owrn6qDAVosC5dJjXjp7BMfc5Gz1Q@mail.gmail.com

Author: Julien Rouhaud
2021-04-08 11:16:01 -04:00
Bruce Momjian
4f0b0966c8 Make use of in-core query id added by commit 5fd9dfa5f5
Use the in-core query id computation for pg_stat_activity,
log_line_prefix, and EXPLAIN VERBOSE.

Similar to other fields in pg_stat_activity, only the queryid from the
top level statements are exposed, and if the backends status isn't
active then the queryid from the last executed statements is displayed.

Add a %Q placeholder to include the queryid in log_line_prefix, which
will also only expose top level statements.

For EXPLAIN VERBOSE, if a query identifier has been computed, either by
enabling compute_query_id or using a third-party module, display it.

Bump catalog version.

Discussion: https://postgr.es/m/20210407125726.tkvjdbw76hxnpwfi@nol

Author: Julien Rouhaud

Reviewed-by: Alvaro Herrera, Nitin Jadhav, Zhihong Yu
2021-04-07 14:04:06 -04:00
Stephen Frost
c9c41c7a33 Rename Default Roles to Predefined Roles
The term 'default roles' wasn't quite apt as these roles aren't able to
be modified or removed after installation, so rename them to be
'Predefined Roles' instead, adding an entry into the newly added
Obsolete Appendix to help users of current releases find the new
documentation.

Bruce Momjian and Stephen Frost

Discussion: https://postgr.es/m/157742545062.1149.11052653770497832538%40wrigleys.postgresql.org
and https://www.postgresql.org/message-id/20201120211304.GG16415@tamriel.snowman.net
2021-04-01 15:32:06 -04:00
Fujii Masao
ff99918c62 Track total amounts of times spent writing and syncing WAL data to disk.
This commit adds new GUC track_wal_io_timing. When this is enabled,
the total amounts of time XLogWrite writes and issue_xlog_fsync syncs
WAL data to disk are counted in pg_stat_wal. This information would be
useful to check how much WAL write and sync affect the performance.

Enabling track_wal_io_timing will make the server query the operating
system for the current time every time WAL is written or synced,
which may cause significant overhead on some platforms. To avoid such
additional overhead in the server with track_io_timing enabled,
this commit introduces track_wal_io_timing as a separate parameter from
track_io_timing.

Note that WAL write and sync activity by walreceiver has not been tracked yet.

This commit makes the server also track the numbers of times XLogWrite
writes and issue_xlog_fsync syncs WAL data to disk, in pg_stat_wal,
regardless of the setting of track_wal_io_timing. This counters can be
used to calculate the WAL write and sync time per request, for example.

Bump PGSTAT_FILE_FORMAT_ID.

Bump catalog version.

Author: Masahiro Ikeda
Reviewed-By: Japin Li, Hayato Kuroda, Masahiko Sawada, David Johnston, Fujii Masao
Discussion: https://postgr.es/m/0509ad67b585a5b86a83d445dfa75392@oss.nttdata.com
2021-03-09 16:52:06 +09:00
Michael Paquier
f9264d1524 Remove support for SSL compression
PostgreSQL disabled compression as of e3bdb2d and the documentation
recommends against using it since.  Additionally, SSL compression has
been disabled in OpenSSL since version 1.1.0, and was disabled in many
distributions long before that.  The most recent TLS version, TLSv1.3,
disallows compression at the protocol level.

This commit removes the feature itself, removing support for the libpq
parameter sslcompression (parameter still listed for compatibility
reasons with existing connection strings, just ignored), and removes
the equivalent field in pg_stat_ssl and de facto PgBackendSSLStatus.

Note that, on top of removing the ability to activate compression by
configuration, compression is actively disabled in both frontend and
backend to avoid overrides from local configurations.

A TAP test is added for deprecated SSL parameters to check after
backwards compatibility.

Bump catalog version.

Author: Daniel Gustafsson
Reviewed-by: Peter Eisentraut, Magnus Hagander, Michael Paquier
Discussion:  https://postgr.es/m/7E384D48-11C5-441B-9EC3-F7DB1F8518F6@yesql.se
2021-03-09 11:16:47 +09:00
Magnus Hagander
960869da08 Add pg_stat_database counters for sessions and session time
This add counters for number of sessions, the different kind of session
termination types, and timers for how much time is spent in active vs
idle in a database to pg_stat_database.

Internally this also renames the parameter "force" to disconnect. This
was the only use-case for the parameter before, so repurposing it to
this mroe narrow usecase makes things cleaner than inventing something
new.

Author: Laurenz Albe
Reviewed-By: Magnus Hagander, Soumyadeep Chakraborty, Masahiro Ikeda
Discussion: https://postgr.es/m/b07e1f9953701b90c66ed368656f2aef40cac4fb.camel@cybertec.at
2021-01-17 13:52:31 +01:00
Tomas Vondra
8a4f618e7a Report progress of COPY commands
This commit introduces a view pg_stat_progress_copy, reporting progress
of COPY commands.  This allows rough estimates how far a running COPY
progressed, with the caveat that the total number of bytes may not be
available in some cases (e.g. when the input comes from the client).

Author: Josef Šimánek
Reviewed-by: Fujii Masao, Bharath Rupireddy, Vignesh C, Matthias van de Meent
Discussion: https://postgr.es/m/CAFp7QwqMGEi4OyyaLEK9DR0+E+oK3UtA4bEjDVCa4bNkwUY2PQ@mail.gmail.com
Discussion: https://postgr.es/m/CAFp7Qwr6_FmRM6pCO0x_a0mymOfX_Gg+FEKet4XaTGSW=LitKQ@mail.gmail.com
2021-01-06 21:51:06 +01:00
Bruce Momjian
ca3b37487b Update copyright for 2021
Backpatch-through: 9.5
2021-01-02 13:06:25 -05:00
Fujii Masao
01469241b2 Track total number of WAL records, FPIs and bytes generated in the cluster.
Commit 6b466bf5f2 allowed pg_stat_statements to track the number of
WAL records, full page images and bytes that each statement generated.
Similarly this commit allows us to track the cluster-wide WAL statistics
counters.

New columns wal_records, wal_fpi and wal_bytes are added into the
pg_stat_wal view, and reports the total number of WAL records,
full page images and bytes generated in the , respectively.

Author: Masahiro Ikeda
Reviewed-by: Amit Kapila, Movead Li, Kyotaro Horiguchi, Fujii Masao
Discussion: https://postgr.es/m/35ef960128b90bfae3b3fdf60a3a860f@oss.nttdata.com
2020-12-02 13:00:15 +09:00
Amit Kapila
8e90ec5580 Track statistics for streaming of changes from ReorderBuffer.
This adds the statistics about transactions streamed to the decoding
output plugin from ReorderBuffer. Users can query the
pg_stat_replication_slots view to check these stats and call
pg_stat_reset_replication_slot to reset the stats of a particular slot.
Users can pass NULL in pg_stat_reset_replication_slot to reset stats of
all the slots.

Commit 9868167500 has added the basic infrastructure to capture the stats
of slot and this commit extends the statistics collector to track
additional information about slots.

Bump the catversion as we have added new columns in the catalog entry.

Author: Ajin Cherian and Amit Kapila
Reviewed-by: Sawada Masahiko and Dilip Kumar
Discussion: https://postgr.es/m/CAA4eK1+chpEomLzgSoky-D31qev19AmECNiEAietPQUGEFhtVA@mail.gmail.com
2020-10-29 09:11:51 +05:30
Amit Kapila
03d51b776d Change the attribute name in pg_stat_replication_slots view.
Change the attribute 'name' to 'slot_name' in pg_stat_replication_slots
view to make it clear and that way we will be consistent with the other
places like pg_stat_wal_receiver view where we display the same attribute.

In the passing, fix the typo in one of the macros in the related code.

Bump the catversion as we have modified the name in the catalog as well.

Reported-by: Noriyoshi Shinoda
Author: Noriyoshi Shinoda
Reviewed-by: Sawada  Masahiko and Amit Kapila
Discussion: https://postgr.es/m/CA+fd4k5_pPAYRTDrO2PbtTOe0eHQpBvuqmCr8ic39uTNmR49Eg@mail.gmail.com
2020-10-20 10:24:36 +05:30
Amit Kapila
9868167500 Track statistics for spilling of changes from ReorderBuffer.
This adds the statistics about transactions spilled to disk from
ReorderBuffer. Users can query the pg_stat_replication_slots view to check
these stats and call pg_stat_reset_replication_slot to reset the stats of
a particular slot. Users can pass NULL in pg_stat_reset_replication_slot
to reset stats of all the slots.

This commit extends the statistics collector to track this information
about slots.

Author: Sawada Masahiko and Amit Kapila
Reviewed-by: Amit Kapila and Dilip Kumar
Discussion: https://postgr.es/m/CA+fd4k5_pPAYRTDrO2PbtTOe0eHQpBvuqmCr8ic39uTNmR49Eg@mail.gmail.com
2020-10-08 09:09:08 +05:30
Fujii Masao
8d9a935965 Add pg_stat_wal statistics view.
This view shows the statistics about WAL activity. Currently it has only
two columns: wal_buffers_full and stats_reset. wal_buffers_full column
indicates the number of times WAL data was written to the disk because
WAL buffers got full. This information is useful when tuning wal_buffers.
stats_reset column indicates the time at which these statistics were
last reset.

pg_stat_wal view is also the basic infrastructure to expose other
various statistics about WAL activity later.

Bump PGSTAT_FILE_FORMAT_ID due to the change in pgstat format.

Bump catalog version.

Author: Masahiro Ikeda
Reviewed-by: Takayuki Tsunakawa, Kyotaro Horiguchi, Amit Kapila, Fujii Masao
Discussion: https://postgr.es/m/188bd3f2d2233cf97753b5ced02bb050@oss.nttdata.com
2020-10-02 10:17:11 +09:00
Michael Paquier
11a68e4b53 Tweak behavior of pg_stat_activity.leader_pid
The initial implementation of leader_pid in pg_stat_activity added by
b025f32 took the approach to strictly print what a PGPROC entry
includes.  In short, if a backend has been involved in parallel query at
least once, leader_pid would remain set as long as the backend is alive.
For a parallel group leader, this means that the field would always be
set after it participated at least once in parallel query, and after
more discussions this could be confusing if using for example a
connection pooler.

This commit changes the data printed so as leader_pid becomes always
NULL for a parallel group leader, showing up a non-NULL value only for
the parallel workers, and actually as long as a parallel query is
running as workers are shut down once the query has completed.

This does not change the definition of any catalog, so no catalog bump
is needed.  Per discussion with Justin Pryzby, Álvaro Herrera, Julien
Rouhaud and me.

Discussion: https://postgr.es/m/20200721035145.GB17300@paquier.xyz
Backpatch-through: 13
2020-07-26 16:32:11 +09:00
Tom Lane
5cbfce562f Initial pgindent and pgperltidy run for v13.
Includes some manual cleanup of places that pgindent messed up,
most of which weren't per project style anyway.

Notably, it seems some people didn't absorb the style rules of
commit c9d297751, because there were a bunch of new occurrences
of function calls with a newline just after the left paren, all
with faulty expectations about how the rest of the call would get
indented.
2020-05-14 13:06:50 -04:00
Tom Lane
81ca868630 Improve management of SLRU statistics collection.
Instead of re-identifying which statistics bucket to use for a given
SLRU on every counter increment, do it once during shmem initialization.
This saves a fair number of cycles, and there's no real cost because
we could not have a bucket assignment that varies over time or across
backends anyway.

Also, get rid of the ill-considered decision to let pgstat.c pry
directly into SLRU's shared state; it's cleaner just to have slru.c
pass the stats bucket number.

In consequence of these changes, there's no longer any need to store
an SLRU's LWLock tranche info in shared memory, so get rid of that,
making this a net reduction in shmem consumption.  (That partly
reverts fe702a7b3.)

This is basically code review for 28cac71bd, so I also cleaned up
some comments, removed a dangling extern declaration, fixed some
things that should be static and/or const, etc.

Discussion: https://postgr.es/m/3618.1589313035@sss.pgh.pa.us
2020-05-13 13:08:23 -04:00
Fujii Masao
043e3e0401 Use proper GetDatum function in pg_stat_get_slru().
This commit changes pg_stat_get_slru() so that it uses
TimestampTzGetDatum() for stats_reset field because that field
stores the timestamp with time zone value. Previously
Int64GetDatum() was used.

Author: Fujii Masao
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/b8784fe6-1401-ab35-aa14-d57b5bb8e312@oss.nttdata.com
2020-05-13 22:20:37 +09:00
Magnus Hagander
7e4e574744 Allow pg_read_all_stats to access all stats views again
The views pg_stat_progress_* had not gotten the memo that
pg_read_all_stats is supposed to be able to read all statistics. Also
make a pass over all text-returning pg_stat_xyz functions that could
return "insufficient privilege" and make sure they also respect
pg_read_all_status.

Reported-by: Andrey M. Borodin
Reviewed-by: Andrey M. Borodin, Kyotaro Horiguchi
Discussion: https://postgr.es/m/13145F2F-8458-4977-9D2D-7B2E862E5722@yandex-team.ru
2020-04-20 12:53:40 +02:00
Tomas Vondra
28cac71bd3 Collect statistics about SLRU caches
There's a number of SLRU caches used to access important data like clog,
commit timestamps, multixact, asynchronous notifications, etc. Until now
we had no easy way to monitor these shared caches, compute hit ratios,
number of reads/writes etc.

This commit extends the statistics collector to track this information
for a predefined list of SLRUs, and also introduces a new system view
pg_stat_slru displaying the data.

The list of built-in SLRUs is fixed, but additional SLRUs may be defined
in extensions. Unfortunately, there's no suitable registry of SLRUs, so
this patch simply defines a fixed list of SLRUs with entries for the
built-in ones and one entry for all additional SLRUs. Extensions adding
their own SLRU are fairly rare, so this seems acceptable.

This patch only allows monitoring of SLRUs, not tuning. The SLRU sizes
are still fixed (hard-coded in the code) and it's not entirely clear
which of the SLRUs might need a GUC to tune size. In a way, allowing us
to determine that is one of the goals of this patch.

Bump catversion as the patch introduces new functions and system view.

Author: Tomas Vondra
Reviewed-by: Alvaro Herrera
Discussion: https://www.postgresql.org/message-id/flat/20200119143707.gyinppnigokesjok@development
2020-04-02 02:34:21 +02:00
David Rowley
b07642dbcd Trigger autovacuum based on number of INSERTs
Traditionally autovacuum has only ever invoked a worker based on the
estimated number of dead tuples in a table and for anti-wraparound
purposes. For the latter, with certain classes of tables such as
insert-only tables, anti-wraparound vacuums could be the first vacuum that
the table ever receives. This could often lead to autovacuum workers being
busy for extended periods of time due to having to potentially freeze
every page in the table. This could be particularly bad for very large
tables. New clusters, or recently pg_restored clusters could suffer even
more as many large tables may have the same relfrozenxid, which could
result in large numbers of tables requiring an anti-wraparound vacuum all
at once.

Here we aim to reduce the work required by anti-wraparound and aggressive
vacuums in general, by triggering autovacuum when the table has received
enough INSERTs. This is controlled by adding two new GUCs and reloptions;
autovacuum_vacuum_insert_threshold and
autovacuum_vacuum_insert_scale_factor. These work exactly the same as the
existing scale factor and threshold controls, only base themselves off the
number of inserts since the last vacuum, rather than the number of dead
tuples. New controls were added rather than reusing the existing
controls, to allow these new vacuums to be tuned independently and perhaps
even completely disabled altogether, which can be done by setting
autovacuum_vacuum_insert_threshold to -1.

We make no attempt to skip index cleanup operations on these vacuums as
they may trigger for an insert-mostly table which continually doesn't have
enough dead tuples to trigger an autovacuum for the purpose of removing
those dead tuples. If we were to skip cleaning the indexes in this case,
then it is possible for the index(es) to become bloated over time.

There are additional benefits to triggering autovacuums based on inserts,
as tables which never contain enough dead tuples to trigger an autovacuum
are now more likely to receive a vacuum, which can mark more of the table
as "allvisible" and encourage the query planner to make use of Index Only
Scans.

Currently, we still obey vacuum_freeze_min_age when triggering these new
autovacuums based on INSERTs. For large insert-only tables, it may be
beneficial to lower the table's autovacuum_freeze_min_age so that tuples
are eligible to be frozen sooner. Here we've opted not to zero that for
these types of vacuums, since the table may just be insert-mostly and we
may otherwise freeze tuples that are still destined to be updated or
removed in the near future.

There was some debate to what exactly the new scale factor and threshold
should default to. For now, these are set to 0.2 and 1000, respectively.
There may be some motivation to adjust these before the release.

Author: Laurenz Albe, Darafei Praliaskouski
Reviewed-by: Alvaro Herrera, Masahiko Sawada, Chris Travers, Andres Freund, Justin Pryzby
Discussion: https://postgr.es/m/CAC8Q8t%2Bj36G_bLF%3D%2B0iMo6jGNWnLnWb1tujXuJr-%2Bx8ZCCTqoQ%40mail.gmail.com
2020-03-28 19:20:12 +13:00
Peter Eisentraut
8e8a0becb3 Unify several ways to tracking backend type
Add a new global variable MyBackendType that uses the same BackendType
enum that was previously only used by the stats collector.  That way
several duplicate ways of checking what type a particular process is
can be simplified.  Since it's no longer just for stats, move to
miscinit.c and rename existing functions to match the expanded
purpose.

Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/c65e5196-4f04-4ead-9353-6088c19615a3@2ndquadrant.com
2020-03-13 14:01:10 +01:00
Fujii Masao
e65497df8f Report progress of streaming base backup.
This commit adds pg_stat_progress_basebackup view that reports
the progress while an application like pg_basebackup is taking
a base backup. This uses the progress reporting infrastructure
added by c16dc1aca5e0, adding support for streaming base backup.

Bump catversion.

Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi, Amit Langote, Sergei Kornilov
Discussion: https://postgr.es/m/9ed8b801-8215-1f3d-62d7-65bff53f6e94@oss.nttdata.com
2020-03-03 12:03:43 +09:00
Michael Paquier
b025f32e0b Add leader_pid to pg_stat_activity
This new field tracks the PID of the group leader used with parallel
query.  For parallel workers and the leader, the value is set to the
PID of the group leader.  So, for the group leader, the value is the
same as its own PID.  Note that this reflects what PGPROC stores in
shared memory, so as leader_pid is NULL if a backend has never been
involved in parallel query.  If the backend is using parallel query or
has used it at least once, the value is set until the backend exits.

Author: Julien Rouhaud
Reviewed-by: Sergei Kornilov, Guillaume Lelarge, Michael Paquier, Tomas
Vondra
Discussion: https://postgr.es/m/CAOBaU_Yy5bt0vTPZ2_LUM6cUcGeqmYNoJ8-Rgto+c2+w3defYA@mail.gmail.com
2020-02-06 09:18:06 +09:00
Alvaro Herrera
c9d2977519 Clean up newlines following left parentheses
We used to strategically place newlines after some function call left
parentheses to make pgindent move the argument list a few chars to the
left, so that the whole line would fit under 80 chars.  However,
pgindent no longer does that, so the newlines just made the code
vertically longer for no reason.  Remove those newlines, and reflow some
of those lines for some extra naturality.

Reviewed-by: Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/20200129200401.GA6303@alvherre.pgsql
2020-01-30 13:42:14 -03:00
Alvaro Herrera
a166d408eb Report progress of ANALYZE commands
This uses the progress reporting infrastructure added by c16dc1aca5e0,
adding support for ANALYZE.

Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Co-authored-by: Tatsuro Yamada <tatsuro.yamada.tf@nttcom.co.jp>
Reviewed-by: Julien Rouhaud, Robert Haas, Anthony Nowocien, Kyotaro Horiguchi,
	Vignesh C, Amit Langote
2020-01-15 11:14:39 -03:00
Alvaro Herrera
f5d28710c7 Reimplement nullification of walsender timestamp
Make the value null only at pg_stat_activity-output time, as suggested
by Tom Lane, instead of messing with the internal state.  This should
appease buildfarm members with force_parallel_mode=regress, which are
running parallel queries on logical replication walsenders.

The fact that walsenders can run parallel queries should perhaps be
studied more carefully, but for the moment let's get rid of the red
blots in buildfarm.

Backpatch to pg10, like the previous commit.

Discussion: https://postgr.es/m/30804.1578438763@sss.pgh.pa.us
2020-01-08 14:33:49 -03:00
Bruce Momjian
7559d8ebfa Update copyrights for 2020
Backpatch-through: update all files in master, backpatch legal files through 9.4
2020-01-01 12:21:45 -05:00
Alvaro Herrera
c4dcd9144b Avoid splitting C string literals with \-newline
Using \ is unnecessary and ugly, so remove that.  While at it, stitch
the literals back into a single line: we've long discouraged splitting
error message literals even when they go past the 80 chars line limit,
to improve greppability.

Leave contrib/tablefunc alone.

Discussion: https://postgr.es/m/20191223195156.GA12271@alvherre.pgsql
2019-12-24 12:44:12 -03:00
Magnus Hagander
252b707bc4 Return NULL for checksum failures if checksums are not enabled
Returning 0 could falsely indicate that there is no problem. NULL
correctly indicates that there is no information about potential
problems.

Also return 0 as numbackends instead of NULL for shared objects (as no
connection can be made to a shared object only).

Author: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
2019-04-17 13:51:48 +02:00
Magnus Hagander
77bd49adba Show shared object statistics in pg_stat_database
This adds a row to the pg_stat_database view with datoid 0 and datname
NULL for those objects that are not in a database. This was added
particularly for checksums, but we were already tracking more satistics
for these objects, just not returning it.

Also add a checksum_last_failure column that holds the timestamptz of
the last checksum failure that occurred in a database (or in a
non-dataabase file), if any.

Author: Julien Rouhaud <rjuju123@gmail.com>
2019-04-12 14:04:50 +02:00
Stephen Frost
b0b39f72b9 GSSAPI encryption support
On both the frontend and backend, prepare for GSSAPI encryption
support by moving common code for error handling into a separate file.
Fix a TODO for handling multiple status messages in the process.
Eliminate the OIDs, which have not been needed for some time.

Add frontend and backend encryption support functions.  Keep the
context initiation for authentication-only separate on both the
frontend and backend in order to avoid concerns about changing the
requested flags to include encryption support.

In postmaster, pull GSSAPI authorization checking into a shared
function.  Also share the initiator name between the encryption and
non-encryption codepaths.

For HBA, add "hostgssenc" and "hostnogssenc" entries that behave
similarly to their SSL counterparts.  "hostgssenc" requires either
"gss", "trust", or "reject" for its authentication.

Similarly, add a "gssencmode" parameter to libpq.  Supported values are
"disable", "require", and "prefer".  Notably, negotiation will only be
attempted if credentials can be acquired.  Move credential acquisition
into its own function to support this behavior.

Add a simple pg_stat_gssapi view similar to pg_stat_ssl, for monitoring
if GSSAPI authentication was used, what principal was used, and if
encryption is being used on the connection.

Finally, add documentation for everything new, and update existing
documentation on connection security.

Thanks to Michael Paquier for the Windows fixes.

Author: Robbie Harwood, with changes to the read/write functions by me.
Reviewed in various forms and at different times by: Michael Paquier,
   Andres Freund, David Steele.
Discussion: https://www.postgresql.org/message-id/flat/jlg1tgq1ktm.fsf@thriss.redhat.com
2019-04-03 15:02:33 -04:00
Alvaro Herrera
ab0dfc961b Report progress of CREATE INDEX operations
This uses the progress reporting infrastructure added by c16dc1aca5e0,
adding support for CREATE INDEX and CREATE INDEX CONCURRENTLY.

There are two pieces to this: one is index-AM-agnostic, and the other is
AM-specific.  The latter is fairly elaborate for btrees, including
reportage for parallel index builds and the separate phases that btree
index creation uses; other index AMs, which are much simpler in their
building procedures, have simplistic reporting only, but that seems
sufficient, at least for non-concurrent builds.

The index-AM-agnostic part is fairly complete, providing insight into
the CONCURRENTLY wait phases as well as block-based progress during the
index validation table scan.  (The index validation index scan requires
patching each AM, which has not been included here.)

Reviewers: Rahila Syed, Pavan Deolasee, Tatsuro Yamada
Discussion: https://postgr.es/m/20181220220022.mg63bhk26zdpvmcj@alvherre.pgsql
2019-04-02 15:18:08 -03:00
Robert Haas
6f97457e0d Add progress reporting for CLUSTER and VACUUM FULL.
This uses the same progress reporting infrastructure added in commit
c16dc1aca5e01e6acaadfcf38f5fc964a381dc62 and extends it to these
additional cases.  We lack the ability to track the internal progress
of sorts and index builds so the information reported is
coarse-grained for some parts of the operation, but it still seems
like a significant improvement over having nothing at all.

Tatsuro Yamada, reviewed by Thomas Munro, Masahiko Sawada, Michael
Paquier, Jeff Janes, Alvaro Herrera, Rafia Sabih, and by me.  A fair
amount of polishing also by me.

Discussion: http://postgr.es/m/59A77072.3090401@lab.ntt.co.jp
2019-03-25 10:59:04 -04:00
Magnus Hagander
6b9e875f72 Track block level checksum failures in pg_stat_database
This adds a column that counts how many checksum failures have occurred
on files belonging to a specific database. Both checksum failures
during normal backend processing and those created when a base backup
detects a checksum failure are counted.

Author: Magnus Hagander
Reviewed by: Julien Rouhaud
2019-03-09 10:47:30 -08:00
Peter Eisentraut
f9692a769b Hide other user's pg_stat_ssl rows
Change pg_stat_ssl so that an unprivileged user can only see their own
rows; other rows will be all null.  This makes the behavior consistent
with pg_stat_activity, where information about where the connection
came from is also restricted.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/63117976-d02c-c8e2-3aef-caa31a5ab8d3%402ndquadrant.com
2019-02-21 19:51:52 +01:00
Peter Eisentraut
f60a0e9677 Add more columns to pg_stat_ssl
Add columns client_serial and issuer_dn to pg_stat_ssl.  These allow
uniquely identifying the client certificate.

Rename the existing column clientdn to client_dn, to make the naming
more consistent and easier to read.

Discussion: https://www.postgresql.org/message-id/flat/398754d8-6bb5-c5cf-e7b8-22e5f0983caf@2ndquadrant.com/
2019-02-01 00:33:47 +01:00
Peter Eisentraut
e77cfa54d7 Fix pg_stat_ssl.clientdn
Return null if there is no client certificate.  This is how it has
always been documented, but in reality it returned an empty string.

Reviewed-by: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
Discussion: https://www.postgresql.org/message-id/flat/398754d8-6bb5-c5cf-e7b8-22e5f0983caf@2ndquadrant.com/
2019-01-29 13:06:33 +01:00
Bruce Momjian
97c39498e5 Update copyright for 2019
Backpatch-through: certain files through 9.4
2019-01-02 12:44:25 -05:00