Add some checks that seem logically necessary, in particular let's make
real sure that HS slave sessions cannot create temp tables. (If they did
they would think that temp tables belonging to the master's session with
the same BackendId were theirs. We *must* not allow myTempNamespace to
become set in a slave session.)
Change setval() and nextval() so that they are only allowed on temp sequences
in a read-only transaction. This seems consistent with what we allow for
table modifications in read-only transactions. Since an HS slave can't have a
temp sequence, this also provides a nicer cure for the setval PANIC reported
by Erik Rijkers.
Make the error messages more uniform, and have them mention the specific
command being complained of. This seems worth the trifling amount of extra
code, since people are likely to see such messages a lot more than before.
VACUUM FULL INPLACE), along with a boatload of subsidiary code and complexity.
Per discussion, the use case for this method of vacuuming is no longer large
enough to justify maintaining it; not to mention that we don't wish to invest
the work that would be needed to make it play nicely with Hot Standby.
Aside from the code directly related to old-style VACUUM FULL, this commit
removes support for certain WAL record types that could only be generated
within VACUUM FULL, redirect-pointer removal in heap_page_prune, and
nontransactional generation of cache invalidation sinval messages (the last
being the sticking point for Hot Standby).
We still have to retain all code that copes with finding HEAP_MOVED_OFF and
HEAP_MOVED_IN flag bits on existing tuples. This can't be removed as long
as we want to support in-place update from pre-9.0 databases.
heap_sync() to the callers, because heap_sync() is sometimes called even
if the operation itself is WAL-logged. This eliminates the bogus unlogged
records from CLUSTER that Simon Riggs reported, patch by Fujii Masao.
We need to free the OID list returned by ExecInsertIndexTuples to avoid
a query-lifespan memory leak. When many rows require rechecking, this
can be a significant leak --- it's even more than the space used for the
queued trigger events.
Dean Rasheed
This includes two new kinds of postmaster processes, walsenders and
walreceiver. Walreceiver is responsible for connecting to the primary server
and streaming WAL to disk, while walsender runs in the primary server and
streams WAL from disk to the client.
Documentation still needs work, but the basics are there. We will probably
pull the replication section to a new chapter later on, as well as the
sections describing file-based replication. But let's do that as a separate
patch, so that it's easier to see what has been added/changed. This patch
also adds a new section to the chapter about FE/BE protocol, documenting the
protocol used by walsender/walreceivxer.
Bump catalog version because of two new functions,
pg_last_xlog_receive_location() and pg_last_xlog_replay_location(), for
monitoring the progress of replication.
Fujii Masao, with additional hacking by me
This patch also removes buffer-usage statistics from the track_counts
output, since this (or the global server statistics) is deemed to be a better
interface to this information.
Itagaki Takahiro, reviewed by Euler Taveira de Oliveira.
checked to determine whether the trigger should be fired.
For BEFORE triggers this is mostly a matter of spec compliance; but for AFTER
triggers it can provide a noticeable performance improvement, since queuing of
a deferred trigger event and re-fetching of the row(s) at end of statement can
be short-circuited if the trigger does not need to be fired.
Takahiro Itagaki, reviewed by KaiGai Kohei.
This is intentionally similar to the recently revised syntax for EXPLAIN
options, ie, (name value, ...). The old syntax is still supported for
backwards compatibility, but we intend that any options added in future
will be provided only in the new syntax.
Robert Haas, Emmanuel Cecchet
The current implementation fires an AFTER ROW trigger for each tuple that
looks like it might be non-unique according to the index contents at the
time of insertion. This works well as long as there aren't many conflicts,
but won't scale to massive unique-key reassignments. Improving that case
is a TODO item.
Dean Rasheed
temp relations; this is no more expensive than before, now that we have
pg_class.relistemp. Insert tests into bufmgr.c to prevent attempting
to fetch pages from nonlocal temp relations. This provides a low-level
defense against bugs-of-omission allowing temp pages to be loaded into shared
buffers, as in the contrib/pgstattuple problem reported by Stuart Bishop.
While at it, tweak a bunch of places to use new relcache tests (instead of
expensive probes into pg_namespace) to detect local or nonlocal temp tables.
has_column_privilege and has_any_column_privilege SQL functions; fix the
information_schema views that are supposed to pay attention to column
privileges; adjust pg_stats to show stats for any column you have select
privilege on; and fix COPY to allow copying a subset of columns if the user
has suitable per-column privileges for all the columns.
To improve efficiency of some of the information_schema views, extend the
has_xxx_privilege functions to allow inquiring about the OR of a set of
privileges in just one call. This is just exposing capability that already
existed in the underlying aclcheck routines.
In passing, make the information_schema views report the owner's own
privileges as being grantable, since Postgres assumes this even when the grant
option bit is not set in the ACL. This is a longstanding oversight.
Also, make the new has_xxx_privilege functions for foreign data objects follow
the same coding conventions used by the older ones.
Stephen Frost and Tom Lane
practically free given prior 8.4 changes in plancache and portal management,
and it makes it a lot easier for ExecutorStart/Run/End hooks to get at the
query text. Extracted from Itagaki Takahiro's pg_stat_statements patch,
with minor editorialization.
that a Portal is a useful and sufficient additional argument for
CreateDestReceiver --- it just isn't, in most cases. Instead formalize
the approach of passing any needed parameters to the receiver separately.
One unexpected benefit of this change is that we can declare typedef Portal
in a less surprising location.
This patch is just code rearrangement and doesn't change any functionality.
I'll tackle the HOLD-cursor-vs-toast problem in a follow-on patch.
(but not locked, as that would risk deadlocks). Also, make it work in a small
ring of buffers to avoid having bulk inserts trash the whole buffer arena.
Robert Haas, after an idea of Simon Riggs'.
and heap_deformtuple in favor of the newer functions heap_form_tuple et al
(which do the same things but use bool control flags instead of arbitrary
char values). Eliminate the former duplicate coding of these functions,
reducing the deprecated functions to mere wrappers around the newer ones.
We can't get rid of them entirely because add-on modules probably still
contain many instances of the old coding style.
Kris Jurka
There are two ways to track a snapshot: there's the "registered" list, which
is used for arbitrary long-lived snapshots; and there's the "active stack",
which is used for the snapshot that is considered "active" at any time.
This also allows users of snapshots to stop worrying about snapshot memory
allocation and freeing, and about using PG_TRY blocks around ActiveSnapshot
assignment. This is all done automatically now.
As a consequence, this allows us to reset MyProc->xmin when there are no
more snapshots registered in the current backend, reducing the impact that
long-running transactions have on VACUUM.
snapmgmt.c file for the former. The header files have also been reorganized
in three parts: the most basic snapshot definitions are now in a new file
snapshot.h, and the also new snapmgmt.h keeps the definitions for snapmgmt.c.
tqual.h has been reduced to the bare minimum.
This patch is just a first step towards managing live snapshots within a
transaction; there is no functionality change.
Per my proposal to pgsql-patches on 20080318191940.GB27458@alvh.no-ip.org and
subsequent discussion.
COPY. We need a restriction here because when the delimiter occurs as a
data character, it is emitted with a backslash, and that will only work
as desired if CopyReadAttributesText() will interpret the backslash sequence
as representing the second character literally. This is currently untrue
for 'b', 'f', 'n', 'r', 't', 'v', 'x', and octal digits. For future-proofing
and simplicity of explanation, it seems best to disallow a-z and 0-9.
We must also disallow dot, since "\." by itself would look like copy EOF.
Note: "\N" is by default the null print string, so N would also cause a
problem, but that is already tested for.
CopyAttributeOutText(), so that control characters are converted to the
C-style escape sequences even if they happen to be equal to the column
delimiter (as is true by default for tab, for example). Oversight in my
previous patch to restore pre-8.3 behavior of COPY OUT escaping. Per report
from Tomas Szepe.
namely that \r, \n, \t, \b, \f, \v are dumped as those two-character
representations rather than a backslash and the literal control character.
I had made it do the other to save some code, but this was ill-advised,
because dump files in which these characters appear literally are prone to
newline mangling. Fortunately, doing it the old way should only cost a few
more lines of code, and not slow down the copy loop materially.
Per bug #3795 from Lou Duchez.
but no database changes have been made since the last CommandCounterIncrement.
This should result in a significant improvement in the number of "commands"
that can typically be performed within a transaction before hitting the 2^32
CommandId size limit. In particular this buys back (and more) the possible
adverse consequences of my previous patch to fix plan caching behavior.
The implementation requires tracking whether the current CommandCounter
value has been "used" to mark any tuples. CommandCounter values stored into
snapshots are presumed not to be used for this purpose. This requires some
small executor changes, since the executor used to conflate the curcid of
the snapshot it was using with the command ID to mark output tuples with.
Separating these concepts allows some small simplifications in executor APIs.
Something for the TODO list: look into having CommandCounterIncrement not do
AcceptInvalidationMessages. It seems fairly bogus to be doing it there,
but exactly where to do it instead isn't clear, and I'm disinclined to mess
with asynchronous behavior during late beta.
no need for serialization against snapshot-taking because the xact doesn't
affect anyone else's snapshot anyway. Per discussion. Also, move various
info about the interlocking of transactions and snapshots out of code comments
and into a hopefully-more-cohesive discussion in access/transam/README.
Also, remove a couple of now-obsolete comments about having to force some WAL
to be written to persuade RecordTransactionCommit to do its thing.
profiling that CopyAttributeOutText was taking an unreasonable fraction of
the backend run time (like 66%!) on the following trivial test case:
$ time psql -c "copy (select repeat('xyzzy',50) from generate_series(1,10000000)) to stdout" regression >/dev/null
The time is all being spent on scanning the string for characters to be
escaped, which most of the time there aren't any of. Some tweaking to take
as many tests as possible out of the inner loop reduced the runtime of this
example by more than 10%. In a real-world case it wouldn't be as useful
a speedup, but it still seems worth adding a few lines here.
types of unspecified parameters when submitted via extended query protocol.
This worked in 8.2 but I had broken it during plancache changes. DECLARE
CURSOR is now treated almost exactly like a plain SELECT through parse
analysis, rewrite, and planning; only just before sending to the executor
do we divert it away to ProcessUtility. This requires a special-case check
in a number of places, but practically all of them were already special-casing
SELECT INTO, so it's not too ugly. (Maybe it would be a good idea to merge
the two by treating IntoClause as a form of utility statement? Not going to
worry about that now, though.) That approach doesn't work for EXPLAIN,
however, so for that I punted and used a klugy solution of running parse
analysis an extra time if under extended query protocol.
access to the planner's cursor-related planning options, and provide new
FETCH/MOVE routines that allow access to the full power of those commands.
Small refactoring of planner(), pg_plan_query(), and pg_plan_queries()
APIs to make it convenient to pass the planning options down from SPI.
This is the core-code portion of Pavel Stehule's patch for scrollable
cursor support in plpgsql; I'll review and apply the plpgsql changes
separately.
module and teach PREPARE and protocol-level prepared statements to use it.
In service of this, rearrange utility-statement processing so that parse
analysis does not assume table schemas can't change before execution for
utility statements (necessary because we don't attempt to re-acquire locks
for utility statements when reusing a stored plan). This requires some
refactoring of the ProcessUtility API, but it ends up cleaner anyway,
for instance we can get rid of the QueryContext global.
Still to do: fix up SPI and related code to use the plan cache; I'm tempted to
try to make SQL functions use it too. Also, there are at least some aspects
of system state that we want to ensure remain the same during a replan as in
the original processing; search_path certainly ought to behave that way for
instance, and perhaps there are others.
fixup various places in the tree that were clearing a StringInfo by hand.
Making this function a part of the API simplifies client code slightly,
and avoids needlessly peeking inside the StringInfo interface.
storing mostly-redundant Query trees in prepared statements, portals, etc.
To replace Query, a new node type called PlannedStmt is inserted by the
planner at the top of a completed plan tree; this carries just the fields of
Query that are still needed at runtime. The statement lists kept in portals
etc. now consist of intermixed PlannedStmt and bare utility-statement nodes
--- no Query. This incidentally allows us to remove some fields from Query
and Plan nodes that shouldn't have been there in the first place.
Still to do: simplify the execution-time range table; at the moment the
range table passed to the executor still contains Query trees for subqueries.
initdb forced due to change of stored rules.