mirror of
https://github.com/postgres/postgres.git
synced 2025-05-21 00:02:53 -04:00
This patch adds the core-system infrastructure needed to support updates on foreign tables, and extends contrib/postgres_fdw to allow updates against remote Postgres servers. There's still a great deal of room for improvement in optimization of remote updates, but at least there's basic functionality there now. KaiGai Kohei, reviewed by Alexander Korotkov and Laurenz Albe, and rather heavily revised by Tom Lane.
329 lines
12 KiB
Plaintext
329 lines
12 KiB
Plaintext
<!-- doc/src/sgml/postgres-fdw.sgml -->
|
|
|
|
<sect1 id="postgres-fdw" xreflabel="postgres_fdw">
|
|
<title>postgres_fdw</title>
|
|
|
|
<indexterm zone="postgres-fdw">
|
|
<primary>postgres_fdw</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The <filename>postgres_fdw</> module provides the foreign-data wrapper
|
|
<literal>postgres_fdw</literal>, which can be used to access data
|
|
stored in external <productname>PostgreSQL</productname> servers.
|
|
</para>
|
|
|
|
<para>
|
|
The functionality provided by this module overlaps substantially
|
|
with the functionality of the older <xref linkend="dblink"> module.
|
|
But <filename>postgres_fdw</> provides more transparent and
|
|
standards-compliant syntax for accessing remote tables, and can give
|
|
better performance in many cases.
|
|
</para>
|
|
|
|
<para>
|
|
To prepare for remote access using <filename>postgres_fdw</>:
|
|
<orderedlist spacing="compact">
|
|
<listitem>
|
|
<para>
|
|
Install the <filename>postgres_fdw</> extension using <xref
|
|
linkend="sql-createextension">.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Create a foreign server object, using <xref linkend="sql-createserver">,
|
|
to represent each remote database you want to connect to.
|
|
Specify connection information, except <literal>user</literal> and
|
|
<literal>password</literal>, as options of the server object.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Create a user mapping, using <xref linkend="sql-createusermapping">, for
|
|
each database user you want to allow to access each foreign server.
|
|
Specify the remote user name and password to use as
|
|
<literal>user</literal> and <literal>password</literal> options of the
|
|
user mapping.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Create a foreign table, using <xref linkend="sql-createforeigntable">,
|
|
for each remote table you want to access. The columns of the foreign
|
|
table must match the referenced remote table. You can, however, use
|
|
table and/or column names different from the remote table's, if you
|
|
specify the correct remote names as options of the foreign table object.
|
|
</para>
|
|
</listitem>
|
|
</orderedlist>
|
|
</para>
|
|
|
|
<para>
|
|
Now you need only <command>SELECT</> from a foreign table to access
|
|
the data stored in its underlying remote table. You can also modify
|
|
the remote table using <command>INSERT</>, <command>UPDATE</>, or
|
|
<command>DELETE</>. (Of course, the remote user you have specified
|
|
in your user mapping must have privileges to do these things.)
|
|
</para>
|
|
|
|
<para>
|
|
It is generally recommended that the columns of a foreign table be declared
|
|
with exactly the same data types, and collations if applicable, as the
|
|
referenced columns of the remote table. Although <filename>postgres_fdw</>
|
|
is currently rather forgiving about performing data type conversions at
|
|
need, surprising semantic anomalies may arise when types or collations do
|
|
not match, due to the remote server interpreting <literal>WHERE</> clauses
|
|
slightly differently from the local server.
|
|
</para>
|
|
|
|
<para>
|
|
Note that a foreign table can be declared with fewer columns, or with a
|
|
different column order, than its underlying remote table has. Matching
|
|
of columns to the remote table is by name, not position.
|
|
</para>
|
|
|
|
<sect2>
|
|
<title>FDW Options of postgres_fdw</title>
|
|
|
|
<sect3>
|
|
<title>Connection Options</title>
|
|
|
|
<para>
|
|
A foreign server using the <filename>postgres_fdw</> foreign data wrapper
|
|
can have the same options that <application>libpq</> accepts in
|
|
connection strings, as described in <xref linkend="libpq-paramkeywords">,
|
|
except that these options are not allowed:
|
|
|
|
<itemizedlist spacing="compact">
|
|
<listitem>
|
|
<para>
|
|
<literal>user</literal> and <literal>password</literal> (specify these
|
|
for a user mapping, instead)
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<literal>client_encoding</> (this is automatically set from the local
|
|
server encoding)
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<literal>fallback_application_name</> (always set to
|
|
<literal>postgres_fdw</>)
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</para>
|
|
|
|
<para>
|
|
Only superusers may connect to foreign servers without password
|
|
authentication, so always specify the <literal>password</literal> option
|
|
for user mappings belonging to non-superusers.
|
|
</para>
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>Object Name Options</title>
|
|
|
|
<para>
|
|
These options can be used to control the names used in SQL statements
|
|
sent to the remote <productname>PostgreSQL</productname> server. These
|
|
options are needed when a foreign table is created with names different
|
|
from the underlying remote table's names.
|
|
</para>
|
|
|
|
<variablelist>
|
|
|
|
<varlistentry>
|
|
<term><literal>schema_name</literal></term>
|
|
<listitem>
|
|
<para>
|
|
This option, which can be specified for a foreign table, gives the
|
|
schema name to use for the foreign table on the remote server. If this
|
|
option is omitted, the name of the foreign table's schema is used.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><literal>table_name</literal></term>
|
|
<listitem>
|
|
<para>
|
|
This option, which can be specified for a foreign table, gives the
|
|
table name to use for the foreign table on the remote server. If this
|
|
option is omitted, the foreign table's name is used.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><literal>column_name</literal></term>
|
|
<listitem>
|
|
<para>
|
|
This option, which can be specified for a column of a foreign table,
|
|
gives the column name to use for the column on the remote server.
|
|
If this option is omitted, the column's name is used.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
</variablelist>
|
|
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>Cost Estimation Options</title>
|
|
|
|
<para>
|
|
<filename>postgres_fdw</> retrieves remote data by executing queries
|
|
against remote servers, so ideally the estimated cost of scanning a
|
|
foreign table should be whatever it costs to be done on the remote
|
|
server, plus some overhead for communication. The most reliable way to
|
|
get such an estimate is to ask the remote server and then add something
|
|
for overhead — but for simple queries, it may not be worth the cost
|
|
of an additional remote query to get a cost estimate.
|
|
So <filename>postgres_fdw</> provides the following options to control
|
|
how cost estimation is done:
|
|
</para>
|
|
|
|
<variablelist>
|
|
|
|
<varlistentry>
|
|
<term><literal>use_remote_estimate</literal></term>
|
|
<listitem>
|
|
<para>
|
|
This option, which can be specified for a foreign table or a foreign
|
|
server, controls whether <filename>postgres_fdw</> issues remote
|
|
<command>EXPLAIN</command> commands to obtain cost estimates.
|
|
A setting for a foreign table overrides any setting for its server,
|
|
but only for that table.
|
|
The default is <literal>false</literal>.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><literal>fdw_startup_cost</literal></term>
|
|
<listitem>
|
|
<para>
|
|
This option, which can be specified for a foreign server, is a numeric
|
|
value that is added to the estimated startup cost of any foreign-table
|
|
scan on that server. This represents the additional overhead of
|
|
establishing a connection, parsing and planning the query on the
|
|
remote side, etc.
|
|
The default value is <literal>100</literal>.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><literal>fdw_tuple_cost</literal></term>
|
|
<listitem>
|
|
<para>
|
|
This option, which can be specified for a foreign server, is a numeric
|
|
value that is used as extra cost per-tuple for foreign-table
|
|
scans on that server. This represents the additional overhead of
|
|
data transfer between servers. You might increase or decrease this
|
|
number to reflect higher or lower network delay to the remote server.
|
|
The default value is <literal>0.01</literal>.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
</variablelist>
|
|
|
|
<para>
|
|
When <literal>use_remote_estimate</literal> is true,
|
|
<filename>postgres_fdw</> obtains rowcount and cost estimates from the
|
|
remote server and then adds <literal>fdw_startup_cost</literal> and
|
|
<literal>fdw_tuple_cost</literal> to the cost estimates. When
|
|
<literal>use_remote_estimate</literal> is false,
|
|
<filename>postgres_fdw</> performs local rowcount and cost estimation
|
|
and then adds <literal>fdw_startup_cost</literal> and
|
|
<literal>fdw_tuple_cost</literal> to the cost estimates. This local
|
|
estimation is unlikely to be very accurate unless local copies of the
|
|
remote table's statistics are available. Running
|
|
<xref linkend="sql-analyze"> on the foreign table is the way to update
|
|
the local statistics; this will perform a scan of the remote table and
|
|
then calculate and store statistics just as though the table were local.
|
|
Keeping local statistics can be a useful way to reduce per-query planning
|
|
overhead for a remote table — but if the remote table is
|
|
frequently updated, the local statistics will soon be obsolete.
|
|
</para>
|
|
|
|
</sect3>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Connection Management</title>
|
|
|
|
<para>
|
|
<filename>postgres_fdw</filename> establishes a connection to a
|
|
foreign server during the first query that uses a foreign table
|
|
associated with the foreign server. This connection is kept and
|
|
re-used for subsequent queries in the same session. However, if
|
|
multiple user identities (user mappings) are used to access the foreign
|
|
server, a connection is established for each user mapping.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Transaction Management</title>
|
|
|
|
<para>
|
|
During a query that references any remote tables on a foreign server,
|
|
<filename>postgres_fdw</filename> opens a transaction on the
|
|
remote server if one is not already open corresponding to the current
|
|
local transaction. The remote transaction is committed or aborted when
|
|
the local transaction commits or aborts. Savepoints are similarly
|
|
managed by creating corresponding remote savepoints.
|
|
</para>
|
|
|
|
<para>
|
|
The remote transaction uses <literal>SERIALIZABLE</>
|
|
isolation level when the local transaction has <literal>SERIALIZABLE</>
|
|
isolation level; otherwise it uses <literal>REPEATABLE READ</>
|
|
isolation level. This choice ensures that if a query performs multiple
|
|
table scans on the remote server, it will get snapshot-consistent results
|
|
for all the scans. A consequence is that successive queries within a
|
|
single transaction will see the same data from the remote server, even if
|
|
concurrent updates are occurring on the remote server due to other
|
|
activities. That behavior would be expected anyway if the local
|
|
transaction uses <literal>SERIALIZABLE</> or <literal>REPEATABLE READ</>
|
|
isolation level, but it might be surprising for a <literal>READ
|
|
COMMITTED</> local transaction. A future
|
|
<productname>PostgreSQL</productname> release might modify these rules.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Remote Query Optimization</title>
|
|
|
|
<para>
|
|
<filename>postgres_fdw</> attempts to optimize remote queries to reduce
|
|
the amount of data transferred from foreign servers. This is done by
|
|
sending query <literal>WHERE</> clauses to the remote server for
|
|
execution, and by not retrieving table columns that are not needed for
|
|
the current query. To reduce the risk of misexecution of queries,
|
|
<literal>WHERE</> clauses are not sent to the remote server unless they use
|
|
only built-in data types, operators, and functions. Operators and
|
|
functions in the clauses must be <literal>IMMUTABLE</> as well.
|
|
</para>
|
|
|
|
<para>
|
|
The query that is actually sent to the remote server for execution can
|
|
be examined using <command>EXPLAIN VERBOSE</>.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Author</title>
|
|
<para>
|
|
Shigeru Hanada <email>shigeru.hanada@gmail.com</email>
|
|
</para>
|
|
</sect2>
|
|
|
|
</sect1>
|