PostgreSQL/doc/src/sgml/trouble.sgml
Tom Lane 7c81a130a4 Fix erroneous claim that 'postmaster -S' leaves the postmaster running
in the foreground --- in fact, it auto-detaches.
2000-03-26 06:58:17 +00:00

274 lines
11 KiB
Plaintext

<Chapter Id="trouble">
<Title>Troubleshooting</Title>
<sect1>
<title>Postmaster Startup Failures</title>
<para>
There are several common reasons for the postmaster to fail to start up.
Check the postmaster's log file, or start it by hand (without redirecting
standard output or standard error) to see what complaint messages appear.
Some of the possible error messages are reasonably self-explanatory,
but here are some that are not:
</para>
<para>
<ProgramListing>
FATAL: StreamServerPort: bind() failed: Address already in use
Is another postmaster already running on that port?
</ProgramListing>
This usually means just what it suggests: you accidentally started a
second postmaster on the same port where one is already running.
However, if the kernel error
message is not "Address already in use" or some variant of that wording,
there may be a different problem. For example, trying to start a
postmaster on a reserved port number may draw something like
<ProgramListing>
$ postmaster -i -p 666
FATAL: StreamServerPort: bind() failed: Permission denied
Is another postmaster already running on that port?
</ProgramListing>
</para>
<para>
<ProgramListing>
IpcMemoryCreate: shmget failed (Invalid argument) key=5440001, size=83918612, permission=600
FATAL 1: ShmemCreate: cannot create region
</ProgramListing>
A message like this probably means that your kernel's limit on the size
of shared memory areas is smaller than the buffer area that Postgres
is trying to create. (Or it could mean that you don't have SysV-style
shared memory support configured into your kernel at all.) As a temporary
workaround, you can try starting the postmaster with a smaller-than-normal
number of buffers (-B switch). You will eventually want to reconfigure
your kernel to increase the allowed shared memory size, however.
You may see this message when trying to start multiple postmasters on
the same machine, if their total space requests exceed the kernel limit.
</para>
<para>
<ProgramListing>
IpcSemaphoreCreate: semget failed (No space left on device) key=5440026, num=16, permission=600
</ProgramListing>
A message like this does <emphasis>not</emphasis> mean that you've run out
of disk space; it means that your kernel's limit on the number of SysV
semaphores is smaller than the number Postgres wants to create. As above,
you may be able to work around the problem by starting the postmaster with
a reduced number of backend processes (-N switch), but you'll eventually
want to increase the kernel limit.
</para>
</sect1>
<sect1>
<title>Client Connection Problems</title>
<para>
Once you have a running postmaster, trying to connect to it with
client applications can fail for a variety of reasons. The sample
error messages shown here are for clients based on recent versions
of libpq --- clients based on other interface libraries may produce
other messages with more or less information.
</para>
<para>
<ProgramListing>
connectDB() -- connect() failed: Connection refused
Is the postmaster running (with -i) at 'server.joe.com' and accepting connections on TCP/IP port '5432'?
</ProgramListing>
This is the generic "I couldn't find a postmaster to talk to" failure.
It looks like the above when TCP/IP communication is attempted, or like
this when attempting Unix-socket communication to a local postmaster:
<ProgramListing>
connectDB() -- connect() failed: No such file or directory
Is the postmaster running at 'localhost' and accepting connections on Unix socket '5432'?
</ProgramListing>
The last line is useful in verifying that the client is trying to connect
where it is supposed to. If there is in fact no postmaster
running there, the kernel error message will typically be either
"Connection refused" or "No such file or directory", as illustrated.
(It is particularly important to realize that "Connection refused" in
this context does <emphasis>not</emphasis> mean that the postmaster
got your connection request and rejected it --- that case will produce
a different message, as shown below.)
Other error messages such as "Connection timed out" may indicate more
fundamental problems, like lack of network connectivity.
</para>
<para>
<ProgramListing>
No pg_hba.conf entry for host 123.123.123.123, user joeblow, database testdb
</ProgramListing>
This is what you are most likely to get if you succeed in contacting
a postmaster, but it doesn't want to talk to you. As the message
suggests, the postmaster refused the connection request because it
found no authorizing entry in its pg_hba.conf configuration file.
</para>
<para>
<ProgramListing>
Password authentication failed for user 'joeblow'
</ProgramListing>
Messages like this indicate that you contacted the postmaster, and it's
willing to talk to you, but not until you pass the authorization method
specified in the pg_hba.conf file. Check the password you're providing,
or check your Kerberos or IDENT software if the complaint mentions
one of those authentication types.
</para>
<para>
<ProgramListing>
FATAL 1: SetUserId: user 'joeblow' is not in 'pg_shadow'
</ProgramListing>
This is another variant of authentication failure: no Postgres create_user
command has been executed for the given username.
</para>
<para>
<ProgramListing>
FATAL 1: Database testdb does not exist in pg_database
</ProgramListing>
There's no database by that name under the control of this postmaster.
Note that if you don't specify a database name, it defaults to your
Postgres username, which may or may not be the right thing.
</para>
</sect1>
<sect1>
<title>Debugging Messages</title>
<para>
The <Application>postmaster</Application> occasionally prints out
messages which
are often helpful during troubleshooting. If you wish
to view debugging messages from the <Application>postmaster</Application>,
you can
start it with the -d option and redirect the output to
the log file:
<ProgramListing>
% postmaster -d > pm.log 2>&1 &
</ProgramListing>
If you do not wish to see these messages, you can type
<ProgramListing>
% postmaster -S
</ProgramListing>
and the <Application>postmaster</Application> will be "S"ilent.
No ampersand ("&amp") is required in this case, since the postmaster
automatically detaches from the terminal when -S is specified.
</Para>
<sect2 Id="pg-options-trouble">
<Title>pg_options</Title>
<Para>
<Note>
<Para>
Contributed by <ULink url="mailto:dz@cs.unitn.it">Massimo Dal Zotto</ULink>
</Para>
</Note>
</para>
<Para>
The optional file <filename>data/pg_options</filename> contains runtime
options used by the backend to control trace messages and other backend
tunable parameters.
What makes this file interesting is the fact that it is re-read by a backend
when it receives a SIGHUP signal, making thus possible to change run-time
options on the fly without needing to restart
<productname>Postgres</productname>.
The options specified in this file may be debugging flags used by the trace
package (<filename>backend/utils/misc/trace.c</filename>) or numeric
parameters which can be used by the backend to control its behaviour.
New options and parameters must be defined in
<filename>backend/utils/misc/trace.c</filename> and
<filename>backend/include/utils/trace.h</filename>.
</para>
<para>
pg_options can also be specified with the <option>-T</option> switch of
<productname>Postgres</productname>:
<programlisting>
postgres <replaceable>options</replaceable> -T "verbose=2,query,hostlookup-"
</programlisting>
</para>
<Para>
The functions used for printing errors and debug messages can now make use
of the <citetitle>syslog(2)</citetitle> facility. Message printed to stdout
or stderr are prefixed by a timestamp containing also the backend pid:
<programlisting>
#timestamp #pid #message
980127.17:52:14.173 [29271] StartTransactionCommand
980127.17:52:14.174 [29271] ProcessUtility: drop table t;
980127.17:52:14.186 [29271] SIIncNumEntries: table is 70% full
980127.17:52:14.186 [29286] Async_NotifyHandler
980127.17:52:14.186 [29286] Waking up sleeping backend process
980127.19:52:14.292 [29286] Async_NotifyFrontEnd
980127.19:52:14.413 [29286] Async_NotifyFrontEnd done
980127.19:52:14.466 [29286] Async_NotifyHandler done
</programlisting>
</para>
<para>
This format improves readability of the logs and allows people to understand
exactly which backend is doing what and at which time. It also makes
easier to write simple awk or perl scripts which monitor the log to
detect database errors or problem, or to compute transaction time statistics.
</para>
<para>
Messages printed to syslog use the log facility LOG_LOCAL0.
The use of syslog can be controlled with the syslog pg_option.
Unfortunately many functions call directly <function>printf()</function>
to print their messages to stdout or stderr and this output can't be
redirected to syslog or have timestamps in it.
It would be advisable that all calls to printf would be replaced with the
PRINTF macro and output to stderr be changed to use EPRINTF instead so that
we can control all output in a uniform way.
</Para>
<para>
The format of the <filename>pg_options</filename> file is as follows:
<programlisting>
# <replaceable>comment</replaceable>
<replaceable>option</replaceable>=<replaceable class="parameter">integer_value</replaceable> # set value for <replaceable>option</replaceable>
<replaceable>option</replaceable> # set <replaceable>option</replaceable> = 1
<replaceable>option</replaceable>+ # set <replaceable>option</replaceable> = 1
<replaceable>option</replaceable>- # set <replaceable>option</replaceable> = 0
</programlisting>
Note that <replaceable class="parameter">keyword</replaceable> can also be
an abbreviation of the option name defined in
<filename>backend/utils/misc/trace.c</filename>.
</Para>
<Para>
Refer to <xref linkend="pg-options-title" endterm="pg-options-title">
for a complete list of option keywords and possible values.
</para>
</sect2>
</sect1>
</Chapter>
<!-- Keep this comment at the end of the file
Local variables:
mode: sgml
sgml-omittag:nil
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"./reference.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:"/usr/lib/sgml/CATALOG"
sgml-local-ecat-files:nil
End:
-->