The purpose of this change is to eliminate the need for every caller
of SearchSysCache, SearchSysCacheCopy, SearchSysCacheExists,
GetSysCacheOid, and SearchSysCacheList to know the maximum number
of allowable keys for a syscache entry (currently 4). This will
make it far easier to increase the maximum number of keys in a
future release should we choose to do so, and it makes the code
shorter, too.
Design and review by Tom Lane.
This patch allows the frame to start from CURRENT ROW (in either RANGE or
ROWS mode), and it also adds support for ROWS n PRECEDING and ROWS n FOLLOWING
start and end points. (RANGE value PRECEDING/FOLLOWING isn't there yet ---
the grammar works, but that's all.)
Hitoshi Harada, reviewed by Pavel Stehule
being called as aggregates, and to get the aggregate transition state memory
context if needed. Use it instead of poking directly into AggState and
WindowAggState in places that shouldn't know so much.
We should have done this in 8.4, probably, but better late than never.
Revised version of a patch by Hitoshi Harada.
of shared or nailed system catalogs. This has two key benefits:
* The new CLUSTER-based VACUUM FULL can be applied safely to all catalogs.
* We no longer have to use an unsafe reindex-in-place approach for reindexing
shared catalogs.
CLUSTER on nailed catalogs now works too, although I left it disabled on
shared catalogs because the resulting pg_index.indisclustered update would
only be visible in one database.
Since reindexing shared system catalogs is now fully transactional and
crash-safe, the former special cases in REINDEX behavior have been removed;
shared catalogs are treated the same as non-shared.
This commit does not do anything about the recently-discussed problem of
deadlocks between VACUUM FULL/CLUSTER on a system catalog and other
concurrent queries; will address that in a separate patch. As a stopgap,
parallel_schedule has been tweaked to run vacuum.sql by itself, to avoid
such failures during the regression tests.
the input values into a string. The two argument version also does the same
thing, but inserts delimiters between elements.
Original patch by Pavel Stehule, reviewed by David E. Wheeler and me.
and implement OVERLAY() for bit strings and bytea.
In passing also convert text OVERLAY() to a true built-in, instead of
relying on a SQL function.
Leonardo F, reviewed by Kevin Grittner
the various disk-size-reporting functions will respond to query cancel
reasonably promptly even in very large databases. Per report from
Kevin Grittner.
pg_constraint before searching pg_trigger. This allows saner handling of
corner cases; in particular we now say "constraint is not deferrable"
rather than "constraint does not exist" when the command is applied to
a constraint that's inherently non-deferrable. Per a gripe several months
ago from hubert depesz lubaczewski.
To make this work without breaking user-defined constraint triggers,
we have to add entries for them to pg_constraint. However, in return
we can remove the pgconstrname column from pg_constraint, which represents
a fairly sizable space savings. I also replaced the tgisconstraint column
with tgisinternal; the old meaning of tgisconstraint can now be had by
testing for nonzero tgconstraint, while there is no other way to get
the old meaning of nonzero tgconstraint, namely that the trigger was
internally generated rather than being user-created.
In passing, fix an old misstatement in the docs and comments, namely that
pg_trigger.tgdeferrable is exactly redundant with pg_constraint.condeferrable.
Actually, we mark RI action triggers as nondeferrable even when they belong to
a nominally deferrable FK constraint. The SET CONSTRAINTS code now relies on
that instead of hard-coding a list of exception OIDs.
can upgrade clusters without renaming the tablespace directories. New
directory structure format is, e.g.:
$PGDATA/pg_tblspc/20981/PG_8.5_201001061/719849/83292814
of the string". The previous coding treated only -1 that way, and would
produce an invalid result value for other negative values.
We ought to fix it so that 2-parameter bit substring() is a different C
function and the 3-parameter form throws error for negative length, but
that takes a pg_proc change which is impractical in the back branches;
and in any case somebody might be relying on -1 working this way.
So just do this as a back-patchable fix.
we're not going to support that anymore.
I did keep the 64-bit-CRC-with-32-bit-arithmetic code, since it has a
performance excuse to live. It's a bit moot since that's all ifdef'd
out, of course.
This patch only supports seq_page_cost and random_page_cost as parameters,
but it provides the infrastructure to scalably support many more.
In particular, we may want to add support for effective_io_concurrency,
but I'm leaving that as future work for now.
Thanks to Tom Lane for design help and Alvaro Herrera for the review.
"column < constant", and the comparison value is in the first or last
histogram bin or outside the histogram entirely, try to fetch the actual
column min or max value using an index scan (if there is an index on the
column). If successful, replace the lower or upper histogram bound with
that value before carrying on with the estimate. This limits the
estimation error caused by moving min/max values when the comparison
value is close to the min or max. Per a complaint from Josh Berkus.
It is tempting to consider using this mechanism for mergejoinscansel as well,
but that would inject index fetches into main-line join estimation not just
endpoint cases. I'm refraining from that until we can get a better handle
on the costs of doing this type of lookup.
This is needed to avoid unwanted interference with SUBSTRING behavior,
as per bug #5257 from Roman Kononov. Also, add some basic intelligence
about character classes (bracket expressions) since we now have several
behaviors that aren't appropriate inside a character class.
As with the previous patch in this area, I'm reluctant to back-patch
since it might affect applications that are relying on the prior
behavior.
to be just a minor extension of the previous patch that made "x IS NULL"
indexable, because we can treat the IS NOT NULL condition as if it were
"x < NULL" or "x > NULL" (depending on the index's NULLS FIRST/LAST option),
just like IS NULL is treated like "x = NULL". Aside from any possible
usefulness in its own right, this is an important improvement for
index-optimized MAX/MIN aggregates: it is now reliably possible to get
a column's min or max value cheaply, even when there are a lot of nulls
cluttering the interesting end of the index.
Since the int2vector type is intended only for internal use, this patch doesn't
worry about prettifying the error messages, which has the fringe benefit of
avoiding creating additional translatable strings. For a type intended to be
used by end-users, we would want to do better, but the approach taken here
seems like the correct trade-off for this case.
Caleb Welton
and teach ANALYZE to compute such stats for tables that have subclasses.
Per my proposal of yesterday.
autovacuum still needs to be taught about running ANALYZE on parent tables
when their subclasses change, but the feature is useful even without that.
Rewrite or adjust various comments for clarity. Remove one bogus comment that
doesn't reflect what the code actually does. Improve the description of the
lo_compat_privileges option.
Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record.
New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far.
This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required.
Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit.
Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
non-kluge method for controlling the order in which values are fed to an
aggregate function. At the same time eliminate the old implementation
restriction that DISTINCT was only supported for single-argument aggregates.
Possibly release-notable behavioral change: formerly, agg(DISTINCT x)
dropped null values of x unconditionally. Now, it does so only if the
agg transition function is strict; otherwise nulls are treated as DISTINCT
normally would, ie, you get one copy.
Andrew Gierth, reviewed by Hitoshi Harada
correctly when the output bit width is wider than the given integer by
something other than a multiple of 8 bits.
This has been wrong since I first wrote that code for 8.0 :-(. Kudos to
Roman Kononov for being the first to notice, though I didn't use his
patch. Per bug #5237.
an allegedly immutable index function. It was previously recognized that
we had to prevent such a function from executing SET/RESET ROLE/SESSION
AUTHORIZATION, or it could trivially obtain the privileges of the session
user. However, since there is in general no privilege checking for changes
of session-local state, it is also possible for such a function to change
settings in a way that might subvert later operations in the same session.
Examples include changing search_path to cause an unexpected function to
be called, or replacing an existing prepared statement with another one
that will execute a function of the attacker's choosing.
The present patch secures VACUUM, ANALYZE, and CREATE INDEX/REINDEX against
these threats, which are the same places previously deemed to need protection
against the SET ROLE issue. GUC changes are still allowed, since there are
many useful cases for that, but we prevent security problems by forcing a
rollback of any GUC change after completing the operation. Other cases are
handled by throwing an error if any change is attempted; these include temp
table creation, closing a cursor, and creating or deleting a prepared
statement. (In 7.4, the infrastructure to roll back GUC changes doesn't
exist, so we settle for rejecting changes of "search_path" in these contexts.)
Original report and patch by Gurjeet Singh, additional analysis by
Tom Lane.
Security: CVE-2009-4136
support any indexable commutative operator, not just equality. Two rows
violate the exclusion constraint if "row1.col OP row2.col" is TRUE for
each of the columns in the constraint.
Jeff Davis, reviewed by Robert Haas
Instead of expensive cross joins to resolve the ACL, add table-returning
function aclexplode() that expands the ACL into a useful form, and join
against that.
Also, implement the role_*_grants views as a thin layer over the respective
*_privileges views instead of essentially repeating the same code twice.
fixes bug #4596
by Joachim Wieland, with cleanup by me
checked to determine whether the trigger should be fired.
For BEFORE triggers this is mostly a matter of spec compliance; but for AFTER
triggers it can provide a noticeable performance improvement, since queuing of
a deferred trigger event and re-fetching of the row(s) at end of statement can
be short-circuited if the trigger does not need to be fired.
Takahiro Itagaki, reviewed by KaiGai Kohei.
it works just as well to have them be ordinary identifiers, and this gets rid
of a number of ugly special cases. Plus we aren't interfering with non-rule
usage of these names.
catversion bump because the names change internally in stored rules.
that are not handled by find_coercion_pathway, notably composite->RECORD.
Now that 8.4 supports composites as primary keys, it's worth dealing with
this case.
adding the ModifyTable node type --- I had been thinking ModifyTable should
replace Append as a special case in push_plan(), but actually both of them
have to be special-cased.
when FOR UPDATE is propagated down into a sub-select expanded from a view.
Similar bug to parser's isLockedRel issue that I fixed yesterday; likewise
seems not quite worth the effort to back-patch.
underneath the Limit node, not atop it. This fixes the old problem that such
a query might unexpectedly return fewer rows than the LIMIT says, due to
LockRows discarding updated rows.
There is a related problem that LockRows might destroy the sort ordering
produced by earlier steps; but fixing that by pushing LockRows below Sort
would create serious performance problems that are unjustified in many
real-world applications, as well as potential deadlock problems from locking
many more rows than expected. Instead, keep the present semantics of applying
FOR UPDATE after ORDER BY within a single query level; but allow the user to
specify the other way by writing FOR UPDATE in a sub-select. To make that
work, track whether FOR UPDATE appeared explicitly in sub-selects or got
pushed down from the parent, and don't flatten a sub-select that contained an
explicit FOR UPDATE.
those accepted by date_in(). I confused julian day numbers and number of
days since the postgres epoch 2000-01-01 in the original patch.
I just noticed that it's still easy to get such out-of-range values into
the database using to_date or +- operators, but this patch doesn't do
anything about those functions.
Per report from James Pye.
style by default. Per discussion, there seems to be hardly anything that
really relies on being able to change the regex flavor, so the ability to
select it via embedded options ought to be enough for any stragglers.
Also, if we didn't remove the GUC, we'd really be morally obligated to
mark the regex functions non-immutable, which'd possibly create performance
issues.
are named in the UPDATE's SET list.
Note: the schema of pg_trigger has not actually changed; we've just started
to use a column that was there all along. catversion bumped anyway so that
this commit is included in the history of potentially interesting changes
to system catalog contents.
Itagaki Takahiro
Also insert a couple of Asserts that check for stack overflow.
Bogus coding appears to be new in 8.4 --- older releases had a much
simpler algorithm here. Per bug #5111.
* Stop escaping ? and {. As of SQL:2008, SIMILAR TO is defined to have
POSIX-compatible interpretation of ? as well as {m,n} and related constructs,
so we should allow these things through to our regex engine.
* Escape ^ and $. It appears that our regex engine will treat ^^ at the
beginning of the string the same as ^, and similarly for $$ at the end of
the string, which meant that SIMILAR TO was effectively ignoring ^ at the
start of the pattern and $ at the end. Since these are not supposed to be
metacharacters, this is a bug.
The second part of this is arguably a back-patchable bug fix, but I'm
hesitant to do that because it might break applications that are expecting
something like "col SIMILAR TO '^foo$'" to work like a POSIX pattern.
Seems safer to only change it at a major version boundary.
Per discussion of an example from Doug Gorley.