If a zeroed page is present in the heap, ALTER TABLE .. SET TABLESPACE will
set the LSN and TLI while copying it, which is wrong, and heap_xlog_newpage()
will do the same thing during replay, so the corruption propagates to any
standby. Note, however, that the bug can't be demonstrated unless archiving
is enabled, since in that case we skip WAL logging altogether, and the LSN/TLI
are not set.
Back-patch to 8.0; prior releases do not have tablespaces.
Analysis and patch by Jeff Davis. Adjustments for back-branches and minor
wordsmithing by me.
an allegedly immutable index function. It was previously recognized that
we had to prevent such a function from executing SET/RESET ROLE/SESSION
AUTHORIZATION, or it could trivially obtain the privileges of the session
user. However, since there is in general no privilege checking for changes
of session-local state, it is also possible for such a function to change
settings in a way that might subvert later operations in the same session.
Examples include changing search_path to cause an unexpected function to
be called, or replacing an existing prepared statement with another one
that will execute a function of the attacker's choosing.
The present patch secures VACUUM, ANALYZE, and CREATE INDEX/REINDEX against
these threats, which are the same places previously deemed to need protection
against the SET ROLE issue. GUC changes are still allowed, since there are
many useful cases for that, but we prevent security problems by forcing a
rollback of any GUC change after completing the operation. Other cases are
handled by throwing an error if any change is attempted; these include temp
table creation, closing a cursor, and creating or deleting a prepared
statement. (In 7.4, the infrastructure to roll back GUC changes doesn't
exist, so we settle for rejecting changes of "search_path" in these contexts.)
Original report and patch by Gurjeet Singh, additional analysis by
Tom Lane.
Security: CVE-2009-4136
use the old relfilenode in the new tablespace. There might be another relation
in the new tablespace with the same relfilenode, so we must generate a fresh
relfilenode in the new tablespace.
The 8.3 patch to let deleted relation files linger as zero-length files until
the next checkpoint made this more obvious: moving a relation from one table
space another, and then back again, caused a collision with the lingering
file.
Back-patch to 8.1. The issue is present in 8.0 as well, but it doesn't seem
worth fixing there, because we didn't have protection from OID collisions
after OID wraparound before 8.1.
Report by Guillaume Lelarge.
current transaction has any open references to the target relation or index
(implying it has an active query using the relation). Also back-patch the
8.2 fix that prohibits TRUNCATE and CLUSTER when there are pending
AFTER-trigger events. Per suggestion from Heikki.
varoattno along with varattno. This resulted in having Vars that were not
seen as equal(), causing inheritance of the "same" constraint from different
parent relations to fail. An example is
create table pp1 (f1 int check (f1>0));
create table cc1 (f2 text, f3 int) inherits (pp1);
create table cc2(f4 float) inherits(pp1,cc1);
Backpatch as far as 7.4. (The test case still fails in 7.4, for reasons
that I don't feel like investigating at the moment.)
This is a backpatch commit only. The fix will be applied in HEAD as part
of the upcoming pg_constraint patch.
checked to see if it's been initialized to all non-nulls. The implicit NOT
NULL constraint was not being checked during the ALTER (in fact, not even if
there was an explicit NOT NULL too), because ATExecAddColumn neglected to
set the flag needed to make the test happen. This has been broken since
the capability was first added, in 8.0.
Brendan Jurd, per a report from Kaloyan Iliev.
made query plan. Use of ALTER COLUMN TYPE creates a hazard for cached
query plans: they could contain Vars that claim a column has a different
type than it now has. Fix this by checking during plan startup that Vars
at relation scan level match the current relation tuple descriptor. Since
at that point we already have at least AccessShareLock, we can be sure the
column type will not change underneath us later in the query. However,
since a backend's locks do not conflict against itself, there is still a
hole for an attacker to exploit: he could try to execute ALTER COLUMN TYPE
while a query is in progress in the current backend. Seal that hole by
rejecting ALTER TABLE whenever the target relation is already open in
the current backend.
This is a significant security hole: not only can one trivially crash the
backend, but with appropriate misuse of pass-by-reference datatypes it is
possible to read out arbitrary locations in the server process's memory,
which could allow retrieving database content the user should not be able
to see. Our thanks to Jeff Trout for the initial report.
Security: CVE-2007-0556
a table. Otherwise a USING clause that yields NULL can leave the table
violating its constraint (possibly there are other cases too). Per report
from Alexander Pravking.
comment line where output as too long, and update typedefs for /lib
directory. Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).
Backpatch to 8.1.X.
argument as a 'regclass' value instead of a text string. The frontend
conversion of text string to pg_class OID is now encapsulated as an
implicitly-invocable coercion from text to regclass. This provides
backwards compatibility to the old behavior when the sequence argument
is explicitly typed as 'text'. When the argument is just an unadorned
literal string, it will be taken as 'regclass', which means that the
stored representation will be an OID. This solves longstanding problems
with renaming sequences that are referenced in default expressions, as
well as new-in-8.1 problems with renaming such sequences' schemas or
moving them to another schema. All per recent discussion.
Along the way, fix some rather serious problems in dbmirror's support
for mirroring sequence operations (int4 vs int8 confusion for instance).
the parent table, even if the command that creates them is executed by
someone else (such as a superuser or a member of the owning role).
Per gripe from Michael Fuhr.
use these instead of its previous hack of changing pg_class.reltriggers.
Documentation is lacking, will add that later.
Patch by Satoshi Nagayasu, review and some extra work by Tom Lane.
erroring out as it has done for the last couple weeks. Document that this
form is now ignored because indexes can't usefully have different owners
from their parent tables. Fix pg_dump to not generate ALTER OWNER commands
for indexes.
discussion of getting around this by relaxing the checks made for regular
users, but I'm disinclined to toy with the security model right now,
so just special-case it for superusers where needed.
This was not especially critical before, but it is now that we track
ownership dependencies --- the dependency for the rowtype *must* shift
to the new owner. Spotted by Bernd Helmle.
Also fix a problem introduced by recent change to allow non-superusers
to do ALTER OWNER in some cases: if the table had a toast table, ALTER
OWNER failed *even for superusers*, because the test being applied would
conclude that the new would-be owner had no create rights on pg_toast.
A side-effect of the fix is to disallow changing the ownership of indexes
or toast tables separately from their parent table, which seems a good
idea on the whole.
requiring superuserness always, allow an owner to reassign ownership
to any role he is a member of, if that role would have the right to
create a similar object. These three requirements essentially state
that the would-be alterer has enough privilege to DROP the existing
object and then re-CREATE it as the new role; so we might as well
let him do it in one step. The ALTER TABLESPACE case is a bit
squirrely, but the whole concept of non-superuser tablespace owners
is pretty dubious anyway. Stephen Frost, code review by Tom Lane.
have adequate mechanisms for tracking the contents of databases and
tablespaces). This solves the longstanding problem that you can drop a
user who still owns objects and/or has access permissions.
Alvaro Herrera, with some kibitzing from Tom Lane.
and pg_auth_members. There are still many loose ends to finish in this
patch (no documentation, no regression tests, no pg_dump support for
instance). But I'm going to commit it now anyway so that Alvaro can
make some progress on shared dependencies. The catalog changes should
be pretty much done.
up have the standard layout with unused space between pd_lower and pd_upper.
When this is set, XLogInsert will omit the unused space without bothering
to scan it to see if it's zero. That saves time in XLogInsert, and also
allows reversion of my earlier patch to make PageRepairFragmentation et al
explicitly re-zero freed space. Per suggestion by Heikki Linnakangas.
representation as the jointree) with two lists of RTEs, one showing
the RTEs accessible by qualified names, and the other showing the RTEs
accessible by unqualified names. I think this is conceptually simpler
than what we did before, and it's sure a whole lot easier to search.
This seems to eliminate the parse-time bottleneck for deeply nested
JOIN structures that was exhibited by phil@vodafone.
key, compare the new and old row versions. If the foreign key column has
not changed, we needn't enqueue the trigger, since the update cannot
violate the foreign key. This optimization was previously applied in the
RI trigger function, but it is more efficient to avoid firing the trigger
altogether. Per recent discussion on pgsql-hackers.
Also add a regression test for some unintuitive foreign key behavior, and
refactor some code that deals with the OIDs of the various RI trigger
functions.
keys, rather than a single trigger for both events. This should not change
functionality, but it is more consistent: previously, there were trigger
functions for both "check_insert" and "check_update", but the former was
used for both events.
Bump catalog version number (not strictly necessary, but best to be
cautious).
which is neither needed by nor related to that header. Remove the bogus
inclusion and instead include the header in those C files that actually
need it. Also fix unnecessary inclusions and bad inclusion order in
tsearch2 files.
indexes. Replace all heap_openr and index_openr calls by heap_open
and index_open. Remove runtime lookups of catalog OID numbers in
various places. Remove relcache's support for looking up system
catalogs by name. Bulky but mostly very boring patch ...
indexes. Extend the macros in include/catalog/*.h to carry the info
about hand-assigned OIDs, and adjust the genbki script and bootstrap
code to make the relations actually get those OIDs. Remove the small
number of RelOid_pg_foo macros that we had in favor of a complete
set named like the catname.h and indexing.h macros. Next phase will
get rid of internal use of names for looking up catalogs and indexes;
but this completes the changes forcing an initdb, so it looks like a
good place to commit.
Along the way, I made the shared relations (pg_database etc) not be
'bootstrap' relations any more, so as to reduce the number of hardwired
entries and simplify changing those relations in future. I'm not
sure whether they ever really needed to be handled as bootstrap
relations, but it seems to work fine to not do so now.
of just a relation OID, thereby not having to open the relation for itself.
This actually saves code rather than adding it for most of the existing
callers, which had the rel open already. The main point though is to be
able to use this rather than plain addRangeTableEntry in setTargetTable,
thus saving one relation_openrv/relation_close cycle for every INSERT,
UPDATE, or DELETE. Seems to provide a several percent win on simple
INSERTs.
change saves a great deal of space in pg_proc and its primary index,
and it eliminates the former requirement that INDEX_MAX_KEYS and
FUNC_MAX_ARGS have the same value. INDEX_MAX_KEYS is still embedded
in the on-disk representation (because it affects index tuple header
size), but FUNC_MAX_ARGS is not. I believe it would now be possible
to increase FUNC_MAX_ARGS at little cost, but haven't experimented yet.
There are still a lot of vestigial references to FUNC_MAX_ARGS, which
I will clean up in a separate pass. However, getting rid of it
altogether would require changing the FunctionCallInfoData struct,
and I'm not sure I want to buy into that.
overly strong lock on pg_depend, and it wasn't closing the rel when done.
The latter bug was masked by the ResourceOwner code, which is something
that should be changed.
ExclusiveLock rather than AccessExclusiveLock. This will allow concurrent
SELECT queries to proceed on the table. Per discussion with Andrew at
SuperNews.
to write out data that we are about to tell the filesystem to drop.
smgr_internal_unlink already had a DropRelFileNodeBuffers call to
get rid of dead buffers without a write after it's no longer possible
to roll back the deleting transaction. Adding a similar call in
smgrtruncate simplifies callers and makes the overall division of
labor clearer. This patch removes the former behavior that VACUUM
would write all dirty buffers of a relation unconditionally.
of tuples when passing data up through multiple plan nodes. A slot can now
hold either a normal "physical" HeapTuple, or a "virtual" tuple consisting
of Datum/isnull arrays. Upper plan levels can usually just copy the Datum
arrays, avoiding heap_formtuple() and possible subsequent nocachegetattr()
calls to extract the data again. This work extends Atsushi Ogawa's earlier
patch, which provided the key idea of adding Datum arrays to TupleTableSlots.
(I believe however that something like this was foreseen way back in Berkeley
days --- see the old comment on ExecProject.) A test case involving many
levels of join of fairly wide tables (about 80 columns altogether) showed
about 3x overall speedup, though simple queries will probably not be
helped very much.
I have also duplicated some code in heaptuple.c in order to provide versions
of heap_formtuple and friends that use "bool" arrays to indicate null
attributes, instead of the old convention of "char" arrays containing either
'n' or ' '. This provides a better match to the convention used by
ExecEvalExpr. While I have not made a concerted effort to get rid of uses
of the old routines, I think they should be deprecated and eventually removed.
column with a default expression. In that situation, we need to rewrite
the heap relation. To evaluate the new default expression, we use
ExecEvalExpr(); however, this can allocate memory in the current memory
context, and ATRewriteTable() does not switch out of the active portal's
heap memory context. The end result is a rather large memory leak (on
the order of gigabytes for a reasonably sized table).
This patch changes ATRewriteTable() to switch to the per-tuple memory
context before beginning the per-tuple loop. It also removes an explicit
heap_freetuple() in the loop, since that is no longer needed.
In an unrelated change, I noticed the code was scanning through the
attributes of the new tuple descriptor for each tuple of the old table.
I changed this to use precomputation, which should slightly speed up
the loop.
Thanks to steve@deefs.net for reporting the leak.
command. This is useful because we can allow truncation of tables
referenced by foreign keys, so long as the referencing table is
truncated in the same command.
Alvaro Herrera
is the minimum required fix. I want to look next at taking advantage of
it by simplifying the message semantics in the shared inval message queue,
but that part can be held over for 8.1 if it turns out too ugly.
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
more than 65K columns, or when the created table has more than 65K columns
due to adding inherited columns from parent relations. Fix a similar
crash when processing SELECT queries with more than 65K target list
entries. In all three cases we would eventually detect the error and
elog, but the check was being made too late.
clause implicitly whenever one is not given explicitly. Remove concept
of a schema having an associated tablespace, and simplify the rules for
selecting a default tablespace for a table or index. It's now just
(a) explicit TABLESPACE clause; (b) default_tablespace if that's not an
empty string; (c) database's default. This will allow pg_dump to use
SET commands instead of tablespace clauses to determine object locations
(but I didn't actually make it do so). All per recent discussions.
of HeapTupleSatisfiesItself() to trigger a hint-bit update on the tuple:
if the row was updated or deleted by a subtransaction of my own transaction
that was later rolled back. This cannot occur in pre-8.0 of course, so
the hint-bit patch applied a couple weeks ago is OK for existing releases.
But for 8.0 it seems we had better fix things so that RI_FKey_check can
pass the correct buffer number to HeapTupleSatisfiesItself. Accordingly,
add fields to the TriggerData struct to carry the buffer ID(s) for the
old and new tuple(s). There are other possible solutions but this one
seems cleanest; it will allow other AFTER-trigger functions to safely
do tqual.c calls if they want to. Put new fields at end of struct so
that there is no API breakage.
at the top level of the column's old default expression before adding
an implicit coercion to the new column type. This seems to satisfy the
principle of least surprise, as per discussion of bug #1290.
NO ACTION check is deferrable. This seems to be a closer approximation
to what the SQL spec says than what we were doing before, and it prevents
some anomalous behaviors that are possible now that triggers can fire
during the execution of PL functions.
Stephan Szabo.