Peter Geoghegan 5100010ee4 Teach VACUUM to bypass unnecessary index vacuuming.
VACUUM has never needed to call ambulkdelete() for each index in cases
where there are precisely zero TIDs in its dead_tuples array by the end
of its first pass over the heap (also its only pass over the heap in
this scenario).  Index vacuuming is simply not required when this
happens.  Index cleanup will still go ahead, but in practice most calls
to amvacuumcleanup() are usually no-ops when there were zero preceding
ambulkdelete() calls.  In short, VACUUM has generally managed to avoid
index scans when there were clearly no index tuples to delete from
indexes.  But cases with _close to_ no index tuples to delete were
another matter -- a round of ambulkdelete() calls took place (one per
index), each of which performed a full index scan.

VACUUM now behaves just as if there were zero index tuples to delete in
cases where there are in fact "virtually zero" such tuples.  That is, it
can now bypass index vacuuming and heap vacuuming as an optimization
(though not index cleanup).  Whether or not VACUUM bypasses indexes is
determined dynamically, based on the just-observed number of heap pages
in the table that have one or more LP_DEAD items (LP_DEAD items in heap
pages have a 1:1 correspondence with index tuples that still need to be
deleted from each index in the worst case).

We only skip index vacuuming when 2% or less of the table's pages have
one or more LP_DEAD items -- bypassing index vacuuming as an
optimization must not noticeably impede setting bits in the visibility
map.  As a further condition, the dead_tuples array (i.e. VACUUM's array
of LP_DEAD item TIDs) must not exceed 32MB at the point that the first
pass over the heap finishes, which is also when the decision to bypass
is made.  (The VACUUM must also have been able to fit all TIDs in its
maintenance_work_mem-bound dead_tuples space, though with a default
maintenance_work_mem setting it can't matter.)

This avoids surprising jumps in the duration and overhead of routine
vacuuming with workloads where successive VACUUM operations consistently
have almost zero dead index tuples.  The number of LP_DEAD items may
well accumulate over multiple VACUUM operations, before finally the
threshold is crossed and VACUUM performs conventional index vacuuming.
Even then, the optimization will have avoided a great deal of largely
unnecessary index vacuuming.

In the future we may teach VACUUM to skip index vacuuming on a per-index
basis, using a much more sophisticated approach.  For now we only
consider the extreme cases, where we can be quite confident that index
vacuuming just isn't worth it using simple heuristics.

Also log information about how many heap pages have one or more LP_DEAD
items when autovacuum logging is enabled.

Author: Masahiko Sawada <sawada.mshk@gmail.com>
Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAD21AoD0SkE11fMw4jD4RENAwBMcw1wasVnwpJVw3tVqPOQgAw@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WzmkebqPd4MVGuPTOS9bMFvp9MDs5cRTCOsv1rQJ3jCbXw@mail.gmail.com
2021-04-07 16:14:54 -07:00
2021-01-02 13:06:25 -05:00
2021-04-07 21:47:55 +02:00
2019-12-18 09:13:13 +01:00
2019-11-12 08:13:55 +01:00
2021-04-03 17:30:49 +09:00
2021-04-03 17:30:49 +09:00
2021-01-02 13:06:25 -05:00
2020-02-10 20:47:50 +01:00
2020-02-10 20:47:50 +01:00
2020-02-10 20:47:50 +01:00

PostgreSQL Database Management System
=====================================

This directory contains the source code distribution of the PostgreSQL
database management system.

PostgreSQL is an advanced object-relational database management system
that supports an extended subset of the SQL standard, including
transactions, foreign keys, subqueries, triggers, user-defined types
and functions.  This distribution also contains C language bindings.

PostgreSQL has many language interfaces, many of which are listed here:

	https://www.postgresql.org/download/

See the file INSTALL for instructions on how to build and install
PostgreSQL.  That file also lists supported operating systems and
hardware platforms and contains information regarding any other
software packages that are required to build or run the PostgreSQL
system.  Copyright and license information can be found in the
file COPYRIGHT.  A comprehensive documentation set is included in this
distribution; it can be read as described in the installation
instructions.

The latest version of this software may be obtained at
https://www.postgresql.org/download/.  For more information look at our
web site located at https://www.postgresql.org/.
Description
he World's Most Advanced Open Source Relational Database.
Readme 743 MiB
Languages
C 85.3%
PLpgSQL 6%
Perl 4.4%
Yacc 1.2%
Meson 0.7%
Other 2.2%