6 Commits

Author SHA1 Message Date
Michael Paquier
6e951f279b psql: Forbid use of COPY and \copy while in a pipeline
Running COPY within a pipeline can break protocol synchronization in
multiple ways.  psql is limited in terms of result processing if mixing
COPY commands with normal queries while controlling a pipeline with the
new meta-commands, as an effect of the following reasons:
- In COPY mode, the backend ignores additional Sync messages and will
not send a matching ReadyForQuery expected by the frontend.  Doing a
\syncpipeline just after COPY will leave the frontend waiting for a
ReadyForQuery message that won't be sent, leaving psql out-of-sync.
- libpq automatically sends a Sync with the Copy message which is not
tracked in the command queue, creating an unexpected synchronisation
point that psql cannot really know about.  While it is possible to track
such activity for a \copy, this cannot really be done sanely with plain
COPY queries.  Backend failures during a COPY would leave the pipeline
in an aborted state while the backend would be in a clean state, ready
to process commands.

At the end, fixing those issues would require modifications in how libpq
handles pipeline and COPY.  So, rather than implementing workarounds in
psql to shortcut the libpq internals (with command queue handling for
one), and because meta-commands for pipelines in psql are a new feature
with COPY in a pipeline having a limited impact compared to other
queries, this commit forbids the use of COPY within a pipeline to avoid
possible break of protocol synchronisation within psql.  If there is a
use-case for COPY support within pipelines in libpq, this could always
be added in the future, if necessary.

Most of the changes of this commit impacts the tests for psql pipelines,
removing the tests related to COPY.  Some TAP tests still exist for COPY
TO/FROM and \copy to/from, to check that that connections are aborted
when this operation is attempted.

Reported-by: Nikita Kalinin <n.kalinin@postgrespro.ru>
Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/AC468509-06E8-4E2A-A4B1-63046A4AC6AB@postgrespro.ru
2025-06-13 10:15:17 +09:00
Michael Paquier
2cce0fe440 psql: Allow queries terminated by semicolons while in pipeline mode
Currently, the only way to pipe queries in an ongoing pipeline (in a
\startpipeline block) is to leverage the meta-commands able to create
extended queries such as \bind, \parse or \bind_named.

While this is good enough for testing the backend with pipelines, it has
been mentioned that it can also be very useful to allow queries
terminated by semicolons to be appended to a pipeline.  For example, it
would be possible to migrate existing psql scripts to use pipelines by
just adding a set of \startpipeline and \endpipeline meta-commands,
making such scripts more efficient.

Doing such a change is proving to be simple in psql: queries terminated
by semicolons can be executed through PQsendQueryParams() without any
parameters set when the pipeline mode is active, instead of
PQsendQuery(), the default, like pgbench.  \watch is still forbidden
while in a pipeline, as it expects its results to be processed
synchronously.

The large portion of this commit consists in providing more test
coverage, with mixes of extended queries appended in a pipeline by \bind
and friends, and queries terminated by semicolons.

This improvement has been suggested by Daniel Vérité.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/d67b9c19-d009-4a50-8020-1a0ea92366a1@manitou-mail.org
2025-03-19 13:34:59 +09:00
Michael Paquier
17caf66445 psql: Add \sendpipeline to send query buffers while in a pipeline
In the initial pipeline support for psql added in 41625ab8ea3d, \g was
used as the way to push extended query into an ongoing pipeline.  \gx
was blocked.

These two meta-commands have format-related options that can be applied
when fetching a query result (expanded, etc.).  As the results of a
pipeline are fetched asynchronously, not at the moment of the
meta-command execution but at the moment of a \getresults or a
\endpipeline, authorizing \g while blocking \gx leads to a confusing
implementation, making one think that psql should be smart enough to
remember the output format options defined from the time when \g or \gx
were executed.  Doing so would lead to more code complications when
retrieving a batch of results.  There is an extra argument other than
simplicity here: the output format options defined at the point of a
\getresults or a \endpipeline execution should be what affect the output
format for a batch of results.

To avoid any confusion, we have settled to the introduction of a new
meta-command called \sendpipeline, replacing \g when within a pipeline.
An advantage of this design is that it is possible to add new options
specific to pipelines when sending a query buffer, independent of \g
and \gx, should it prove to be necessary.

Most of the changes of this commit happen in the regression tests, where
\g is replaced by \sendpipeline.  More tests are added to check that \g
is not allowed.

Per discussion between the author, Daniel Vérité and me.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/ad4b9f1a-f7fe-4ab8-8546-90754726d0be@manitou-mail.org
2025-03-18 09:41:21 +09:00
Michael Paquier
3ce357584e psql: Add pipeline status to prompt and some state variables
This commit adds %P to psql prompts, able to report the status of a
pipeline depending on PQpipelineStatus(): on, off or abort.

The following variables are added to report the state of an ongoing
pipeline:
- PIPELINE_SYNC_COUNT: reports the number of piped syncs.
- PIPELINE_COMMAND_COUNT: reports the number of piped commands, a
command being either \bind, \bind_named, \close or \parse.
- PIPELINE_RESULT_COUNT: reports the results available to read with
\getresults.

These variables can be used with \echo or in a prompt, using "%:name:"
in PROMPT1, PROMPT2 or PROMPT3.  Some basic regression tests are added
for these.  The suggestion to use variables to show the details about
the status counters comes from me.  The original patch proposed was less
extensible, hardcoding the output in the prompt.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/CAO6_XqroE7JuMEm1sWz55rp9fAYX2JwmcP_3m_v51vnOFdsLiQ@mail.gmail.com
2025-02-25 10:07:24 +09:00
Michael Paquier
a4e986ef5a Add more tests for utility commands in pipelines
This commit checks interactions with pipelines and implicit transaction
blocks for the following commands that have their own behaviors when
used in pipelines depending on their order in a pipeline and sync
requests:
- SET LOCAL
- REINDEX CONCURRENTLY
- VACUUM
- Subtransactions (SAVEPOINT, ROLLBACK TO SAVEPOINT)

These scenarios could be tested only with pgbench previously.  The
meta-commands of psql controlling pipelines make these easier to
implement, debug, and they can be run in a SQL script.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/CAO6_XqroE7JuMEm1sWz55rp9fAYX2JwmcP_3m_v51vnOFdsLiQ@mail.gmail.com
2025-02-23 16:43:07 +09:00
Michael Paquier
41625ab8ea psql: Add support for pipelines
With \bind, \parse, \bind_named and \close, it is possible to issue
queries from psql using the extended protocol.  However, it was not
possible to send these queries using libpq's pipeline mode.  This
feature has two advantages:
- Testing.  Pipeline tests were only possible with pgbench, using TAP
tests.  It now becomes possible to have more SQL tests that are able to
stress the backend with pipelines and extended queries.  More tests will
be added in a follow-up commit that were discussed on some other
threads.  Some external projects in the community had to implement their
own facility to work around this limitation.
- Emulation of custom workloads, with more control over the actions
taken by a client with libpq APIs.  It is possible to emulate more
workload patterns to bottleneck the backend with the extended query
protocol.

This patch adds six new meta-commands to be able to control pipelines:
* \startpipeline starts a new pipeline.  All extended queries are queued
until the end of the pipeline are reached or a sync request is sent and
processed.
* \endpipeline ends an existing pipeline.  All queued commands are sent
to the server and all responses are processed by psql.
* \syncpipeline queues a synchronisation request, without flushing the
commands to the server, equivalent of PQsendPipelineSync().
* \flush, equivalent of PQflush().
* \flushrequest, equivalent of PQsendFlushRequest()
* \getresults reads the server's results for the queries in a pipeline.
Unsent data is automatically pushed when \getresults is called.  It is
possible to control the number of results read in a single meta-command
execution with an optional parameter, 0 means that all the results
should be read.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/CAO6_XqroE7JuMEm1sWz55rp9fAYX2JwmcP_3m_v51vnOFdsLiQ@mail.gmail.com
2025-02-21 11:19:59 +09:00