mirror of
				https://github.com/postgres/postgres.git
				synced 2025-10-24 00:03:18 -04:00 
			
		
		
		
	
		
			
				
	
	
		
			804 lines
		
	
	
		
			35 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
			804 lines
		
	
	
		
			35 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
| 
 | |
|                 Frequently Asked Questions (FAQ) for PostgreSQL
 | |
|                                        
 | |
|    Last updated: Wed Oct 5 07:24:44 EDT 2005
 | |
|    
 | |
|    Current maintainer: Bruce Momjian (pgman@candle.pha.pa.us)
 | |
|    
 | |
|    The most recent version of this document can be viewed at
 | |
|    http://www.postgresql.org/files/documentation/faqs/FAQ.html.
 | |
|    
 | |
|    Platform-specific questions are answered at
 | |
|    http://www.postgresql.org/docs/faq/.
 | |
|      _________________________________________________________________
 | |
|    
 | |
|                              General Questions
 | |
|                                       
 | |
|    1.1) What is PostgreSQL? How is it pronounced?
 | |
|    1.2) What is the copyright of PostgreSQL?
 | |
|    1.3) What platforms does PostgreSQL support?
 | |
|    1.4) Where can I get PostgreSQL?
 | |
|    1.5) Where can I get support?
 | |
|    1.6) How do I submit a bug report?
 | |
|    1.7) What is the latest release?
 | |
|    1.8) What documentation is available?
 | |
|    1.9) How do I find out about known bugs or missing features?
 | |
|    1.10) How can I learn SQL?
 | |
|    1.11) How do I join the development team?
 | |
|    1.12) How does PostgreSQL compare to other DBMSs?
 | |
|    1.13) Who controls PostgreSQL?
 | |
|    
 | |
|                            User Client Questions
 | |
|                                       
 | |
|    2.1) What interfaces are available for PostgreSQL?
 | |
|    2.2) What tools are available for using PostgreSQL with Web pages?
 | |
|    2.3) Does PostgreSQL have a graphical user interface?
 | |
|    
 | |
|                           Administrative Questions
 | |
|                                       
 | |
|    3.1) How do I install PostgreSQL somewhere other than
 | |
|    /usr/local/pgsql?
 | |
|    3.2) How do I control connections from other hosts?
 | |
|    3.3) How do I tune the database engine for better performance?
 | |
|    3.4) What debugging features are available?
 | |
|    3.5) Why do I get "Sorry, too many clients" when trying to connect?
 | |
|    3.6) Why do I need to do a dump and restore to upgrade PostgreSQL
 | |
|    releases?
 | |
|    3.7) What computer hardware should I use?
 | |
|    
 | |
|                            Operational Questions
 | |
|                                       
 | |
|    4.1) How do I SELECT only the first few rows of a query? A random row?
 | |
|    4.2) How do I find out what tables, indexes, databases, and users are
 | |
|    defined? How do I see the queries used by psql to display them?
 | |
|    4.3) How do you change a column's data type?
 | |
|    4.4) What is the maximum size for a row, a table, and a database?
 | |
|    4.5) How much database disk space is required to store data from a
 | |
|    typical text file?
 | |
|    4.6) Why are my queries slow? Why don't they use my indexes?
 | |
|    4.7) How do I see how the query optimizer is evaluating my query?
 | |
|    4.8) How do I perform regular expression searches and case-insensitive
 | |
|    regular expression searches? How do I use an index for
 | |
|    case-insensitive searches?
 | |
|    4.9) In a query, how do I detect if a field is NULL? How can I sort on
 | |
|    whether a field is NULL or not?
 | |
|    4.10) What is the difference between the various character types?
 | |
|    4.11.1) How do I create a serial/auto-incrementing field?
 | |
|    4.11.2) How do I get the value of a SERIAL insert?
 | |
|    4.11.3) Doesn't currval() lead to a race condition with other users?
 | |
|    4.11.4) Why aren't my sequence numbers reused on transaction abort?
 | |
|    Why are there gaps in the numbering of my sequence/SERIAL column?
 | |
|    4.12) What is an OID? What is a CTID?
 | |
|    4.13) Why do I get the error "ERROR: Memory exhausted in
 | |
|    AllocSetAlloc()"?
 | |
|    4.14) How do I tell what PostgreSQL version I am running?
 | |
|    4.15) How do I create a column that will default to the current time?
 | |
|    4.16) How do I perform an outer join?
 | |
|    4.17) How do I perform queries using multiple databases?
 | |
|    4.18) How do I return multiple rows or columns from a function?
 | |
|    4.19) Why do I get "relation with OID ##### does not exist" errors
 | |
|    when accessing temporary tables in PL/PgSQL functions?
 | |
|    4.20) What replication solutions are available?
 | |
|    4.21) Why are my table and column names not recognized in my query?
 | |
|      _________________________________________________________________
 | |
|    
 | |
|                              General Questions
 | |
|                                       
 | |
|   1.1) What is PostgreSQL? How is it pronounced?
 | |
|   
 | |
|    PostgreSQL is pronounced Post-Gres-Q-L, and is also sometimes referred
 | |
|    to as just Postgres. An audio file is available in MP3 format for
 | |
|    those would like to hear the pronunciation.
 | |
|    
 | |
|    PostgreSQL is an object-relational database system that has the
 | |
|    features of traditional commercial database systems with enhancements
 | |
|    to be found in next-generation DBMS systems. PostgreSQL is free and
 | |
|    the complete source code is available.
 | |
|    
 | |
|    PostgreSQL development is performed by a team of mostly volunteer
 | |
|    developers spread throughout the world and communicating via the
 | |
|    Internet. It is a community project and is not controlled by any
 | |
|    company. To get involved, see the developer's FAQ at
 | |
|    http://www.postgresql.org/files/documentation/faqs/FAQ_DEV.html
 | |
|    
 | |
|   1.2) What is the copyright of PostgreSQL?
 | |
|   
 | |
|    PostgreSQL is distributed under the classic BSD license. It has no
 | |
|    restrictions on how the source code can be used. We like it and have
 | |
|    no intention of changing it.
 | |
|    
 | |
|    This is the BSD license we use:
 | |
|    
 | |
|    PostgreSQL Data Base Management System
 | |
|    
 | |
|    Portions Copyright (c) 1996-2005, PostgreSQL Global Development Group
 | |
|    Portions Copyright (c) 1994-1996 Regents of the University of
 | |
|    California
 | |
|    
 | |
|    Permission to use, copy, modify, and distribute this software and its
 | |
|    documentation for any purpose, without fee, and without a written
 | |
|    agreement is hereby granted, provided that the above copyright notice
 | |
|    and this paragraph and the following two paragraphs appear in all
 | |
|    copies.
 | |
|    
 | |
|    IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO ANY PARTY
 | |
|    FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES,
 | |
|    INCLUDING LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND
 | |
|    ITS DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN
 | |
|    ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 | |
|    
 | |
|    THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY WARRANTIES,
 | |
|    INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
 | |
|    MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE
 | |
|    PROVIDED HEREUNDER IS ON AN "AS IS" BASIS, AND THE UNIVERSITY OF
 | |
|    CALIFORNIA HAS NO OBLIGATIONS TO PROVIDE MAINTENANCE, SUPPORT,
 | |
|    UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
 | |
|    
 | |
|   1.3) What platforms does PostgreSQL support?
 | |
|   
 | |
|    In general, any modern Unix-compatible platform should be able to run
 | |
|    PostgreSQL. The platforms that had received explicit testing at the
 | |
|    time of release are listed in the installation instructions.
 | |
|    
 | |
|    PostgreSQL also runs natively on Microsoft Windows NT-based operating
 | |
|    systems like Win2000, WinXP, and Win2003. A prepackaged installer is
 | |
|    available at http://pgfoundry.org/projects/pginstaller. MSDOS-based
 | |
|    versions of Windows (Win95, Win98, WinMe) can run PostgreSQL using
 | |
|    Cygwin.
 | |
|    
 | |
|    There is also a Novell Netware 6 port at http://forge.novell.com, and
 | |
|    an OS/2 (eComStation) version at
 | |
|    http://hobbes.nmsu.edu/cgi-bin/h-search?sh=1&button=Search&key=postgre
 | |
|    SQL&stype=all&sort=type&dir=%2F.
 | |
|    
 | |
|   1.4) Where can I get PostgreSQL?
 | |
|   
 | |
|    Via web browser, use http://www.postgresql.org/ftp/, and via ftp, use
 | |
|    ftp://ftp.PostgreSQL.org/pub/.
 | |
|    
 | |
|   1.5) Where can I get support?
 | |
|   
 | |
|    The PostgreSQL community provides assistance to many of its users via
 | |
|    email. The main web site to subscribe to the email lists is
 | |
|    http://www.postgresql.org/community/lists/. The general or bugs lists
 | |
|    are a good place to start.
 | |
|    
 | |
|    The major IRC channel is #postgresql on Freenode (irc.freenode.net).
 | |
|    To connect you can use the Unix program irc -c '#postgresql' "$USER"
 | |
|    irc.freenode.net or use any other IRC clients. A Spanish one also
 | |
|    exists on the same network, (#postgresql-es), and a French one,
 | |
|    (#postgresqlfr). There is also a PostgreSQL channel on EFNet.
 | |
|    
 | |
|    A list of commercial support companies is available at
 | |
|    http://techdocs.postgresql.org/companies.php.
 | |
|    
 | |
|   1.6) How do I submit a bug report?
 | |
|   
 | |
|    Visit the PostgreSQL bug form at
 | |
|    http://www.postgresql.org/support/submitbug.
 | |
|    
 | |
|    Also check out our ftp site ftp://ftp.PostgreSQL.org/pub/ to see if
 | |
|    there is a more recent PostgreSQL version.
 | |
|    
 | |
|   1.7) What is the latest release?
 | |
|   
 | |
|    The latest release of PostgreSQL is version 8.0.4.
 | |
|    
 | |
|    We plan to have a major release every year, with minor releases every
 | |
|    few months.
 | |
|    
 | |
|   1.8) What documentation is available?
 | |
|   
 | |
|    PostgreSQL includes extensive documentation, including a large manual,
 | |
|    manual pages, and some test examples. See the /doc directory. You can
 | |
|    also browse the manuals online at http://www.PostgreSQL.org/docs.
 | |
|    
 | |
|    There are two PostgreSQL books available online at
 | |
|    http://www.postgresql.org/docs/books/awbook.html and
 | |
|    http://www.commandprompt.com/ppbook/. There are a number of PostgreSQL
 | |
|    books available for purchase. One of the most popular ones is by Korry
 | |
|    Douglas. A list of book reviews can be found at
 | |
|    http://techdocs.PostgreSQL.org/techdocs/bookreviews.php. There is also
 | |
|    a collection of PostgreSQL technical articles at
 | |
|    http://techdocs.PostgreSQL.org/.
 | |
|    
 | |
|    The command line client program psql has some \d commands to show
 | |
|    information about types, operators, functions, aggregates, etc. - use
 | |
|    \? to display the available commands.
 | |
|    
 | |
|    Our web site contains even more documentation.
 | |
|    
 | |
|   1.9) How do I find out about known bugs or missing features?
 | |
|   
 | |
|    PostgreSQL supports an extended subset of SQL:2003. See our TODO list
 | |
|    for known bugs, missing features, and future plans.
 | |
|    
 | |
|   1.10) How can I learn SQL?
 | |
|   
 | |
|    First, consider the PostgreSQL-specific books mentioned above. Another
 | |
|    one is "Teach Yourself SQL in 21 Days, Second Edition" at
 | |
|    http://members.tripod.com/er4ebus/sql/index.htm. Many of our users
 | |
|    like The Practical SQL Handbook, Bowman, Judith S., et al.,
 | |
|    Addison-Wesley. Others like The Complete Reference SQL, Groff et al.,
 | |
|    McGraw-Hill.
 | |
|    
 | |
|    There is also a nice tutorial at
 | |
|    http://www.intermedia.net/support/sql/sqltut.shtm, at
 | |
|    http://ourworld.compuserve.com/homepages/graeme_birchall/HTM_COOK.HTM,
 | |
|    and at http://sqlcourse.com.
 | |
|    
 | |
|   1.11) How do I join the development team?
 | |
|   
 | |
|    See the Developer's FAQ.
 | |
|    
 | |
|   1.12) How does PostgreSQL compare to other DBMSs?
 | |
|   
 | |
|    There are several ways of measuring software: features, performance,
 | |
|    reliability, support, and price.
 | |
|    
 | |
|    Features
 | |
|           PostgreSQL has most features present in large commercial DBMSs,
 | |
|           like transactions, subselects, triggers, views, foreign key
 | |
|           referential integrity, and sophisticated locking. We have some
 | |
|           features they do not have, like user-defined types,
 | |
|           inheritance, rules, and multi-version concurrency control to
 | |
|           reduce lock contention.
 | |
|           
 | |
|    Performance
 | |
|           PostgreSQL's performance is comparable to other commercial and
 | |
|           open source databases. It is faster for some things, slower for
 | |
|           others. Our performance is usually +/-10% compared to other
 | |
|           databases.
 | |
|           
 | |
|    Reliability
 | |
|           We realize that a DBMS must be reliable, or it is worthless. We
 | |
|           strive to release well-tested, stable code that has a minimum
 | |
|           of bugs. Each release has at least one month of beta testing,
 | |
|           and our release history shows that we can provide stable, solid
 | |
|           releases that are ready for production use. We believe we
 | |
|           compare favorably to other database software in this area.
 | |
|           
 | |
|    Support
 | |
|           Our mailing lists provide contact with a large group of
 | |
|           developers and users to help resolve any problems encountered.
 | |
|           While we cannot guarantee a fix, commercial DBMSs do not always
 | |
|           supply a fix either. Direct access to developers, the user
 | |
|           community, manuals, and the source code often make PostgreSQL
 | |
|           support superior to other DBMSs. There is commercial
 | |
|           per-incident support available for those who need it. (See FAQ
 | |
|           section 1.5.)
 | |
|           
 | |
|    Price
 | |
|           We are free for all use, both commercial and non-commercial.
 | |
|           You can add our code to your product with no limitations,
 | |
|           except those outlined in our BSD-style license stated above.
 | |
|           
 | |
|   1.13) Who controls PostgreSQL?
 | |
|   
 | |
|    If you are looking for a PostgreSQL gatekeeper, central committee, or
 | |
|    controlling company, give up --- there isn't one. We do have a core
 | |
|    committee and CVS committers, but these groups are more for
 | |
|    administrative purposes than control. The project is directed by the
 | |
|    community of developers and users, which anyone can join. All you need
 | |
|    to do is subscribe to the mailing lists and participate in the
 | |
|    discussions. (See the Developer's FAQ for information on how to get
 | |
|    involved in PostgreSQL development.)
 | |
|      _________________________________________________________________
 | |
|    
 | |
|                            User Client Questions
 | |
|                                       
 | |
|   2.1) What interfaces are available for PostgreSQL?
 | |
|   
 | |
|    The PostgreSQL install includes only the C and embedded C interfaces.
 | |
|    All other interfaces are independent projects that are downloaded
 | |
|    separately; being separate allows them to have their own release
 | |
|    schedule and development teams.
 | |
|    
 | |
|    Some programming languages like PHP include an interface to
 | |
|    PostgreSQL. Interfaces for languages like Perl, TCL, Python, and many
 | |
|    others are available at http://gborg.postgresql.org in the
 | |
|    Drivers/Interfaces section and via Internet search.
 | |
|    
 | |
|   2.2) What tools are available for using PostgreSQL with Web pages?
 | |
|   
 | |
|    A nice introduction to Database-backed Web pages can be seen at:
 | |
|    http://www.webreview.com
 | |
|    
 | |
|    For Web integration, PHP (http://www.php.net) is an excellent
 | |
|    interface.
 | |
|    
 | |
|    For complex cases, many use the Perl and DBD::Pg with CGI.pm or
 | |
|    mod_perl.
 | |
|    
 | |
|   2.3) Does PostgreSQL have a graphical user interface?
 | |
|   
 | |
|    Yes, see http://techdocs.postgresql.org/guides/GUITools for a detailed
 | |
|    list.
 | |
|      _________________________________________________________________
 | |
|    
 | |
|                           Administrative Questions
 | |
|                                       
 | |
|   3.1) How do I install PostgreSQL somewhere other than /usr/local/pgsql?
 | |
|   
 | |
|    Specify the --prefix option when running configure.
 | |
|    
 | |
|   3.2) How do I control connections from other hosts?
 | |
|   
 | |
|    By default, PostgreSQL only allows connections from the local machine
 | |
|    using Unix domain sockets or TCP/IP connections. Other machines will
 | |
|    not be able to connect unless you modify listen_addresses in the
 | |
|    postgresql.conf file, enable host-based authentication by modifying
 | |
|    the $PGDATA/pg_hba.conf file, and restart the server.
 | |
|    
 | |
|   3.3) How do I tune the database engine for better performance?
 | |
|   
 | |
|    There are three major areas for potential performance improvement:
 | |
|    
 | |
|    Query Changes
 | |
|           This involves modifying queries to obtain better performance:
 | |
|           
 | |
|           + Creation of indexes, including expression and partial indexes
 | |
|           + Use of COPY instead of multiple INSERTs
 | |
|           + Grouping of multiple statements into a single transaction to
 | |
|             reduce commit overhead
 | |
|           + Use of CLUSTER when retrieving many rows from an index
 | |
|           + Use of LIMIT for returning a subset of a query's output
 | |
|           + Use of Prepared queries
 | |
|           + Use of ANALYZE to maintain accurate optimizer statistics
 | |
|           + Regular use of VACUUM or pg_autovacuum
 | |
|           + Dropping of indexes during large data changes
 | |
|             
 | |
|    Server Configuration
 | |
|           A number of postgresql.conf settings affect performance. For
 | |
|           more details, see Administration Guide/Server Run-time
 | |
|           Environment/Run-time Configuration for a full listing, and for
 | |
|           commentary see
 | |
|           http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_co
 | |
|           nf_e.html and
 | |
|           http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html.
 | |
|           
 | |
|    Hardware Selection
 | |
|           The effect of hardware on performance is detailed in
 | |
|           http://candle.pha.pa.us/main/writings/pgsql/hw_performance/inde
 | |
|           x.html and http://www.powerpostgresql.com/PerfList/.
 | |
|           
 | |
|   3.4) What debugging features are available?
 | |
|   
 | |
|    There are many log_* server configuration variables that enable
 | |
|    printing of query and process statistics which can be very useful for
 | |
|    debugging and performance measurements.
 | |
|    
 | |
|   3.5) Why do I get "Sorry, too many clients" when trying to connect?
 | |
|   
 | |
|    You have reached the default limit is 100 database sessions. You need
 | |
|    to increase the postmaster's limit on how many concurrent backend
 | |
|    processes it can start by changing the max_connections value in
 | |
|    postgresql.conf and restarting the postmaster.
 | |
|    
 | |
|   3.6) Why do I need to do a dump and restore to upgrade between major
 | |
|   PostgreSQL releases?
 | |
|   
 | |
|    The PostgreSQL team makes only small changes between minor releases,
 | |
|    so upgrading from 7.4.0 to 7.4.1 does not require a dump and restore.
 | |
|    However, major releases (e.g. from 7.3 to 7.4) often change the
 | |
|    internal format of system tables and data files. These changes are
 | |
|    often complex, so we don't maintain backward compatibility for data
 | |
|    files. A dump outputs data in a generic format that can then be loaded
 | |
|    in using the new internal format.
 | |
|    
 | |
|   3.7) What computer hardware should I use?
 | |
|   
 | |
|    Because PC hardware is mostly compatible, people tend to believe that
 | |
|    all PC hardware is of equal quality. It is not. ECC RAM, SCSI, and
 | |
|    quality motherboards are more reliable and have better performance
 | |
|    than less expensive hardware. PostgreSQL will run on almost any
 | |
|    hardware, but if reliability and performance are important it is wise
 | |
|    to research your hardware options thoroughly. Our email lists can be
 | |
|    used to discuss hardware options and tradeoffs.
 | |
|      _________________________________________________________________
 | |
|    
 | |
|                            Operational Questions
 | |
|                                       
 | |
|   4.1) How do I SELECT only the first few rows of a query? A random row?
 | |
|   
 | |
|    To retrieve only a few rows, if you know at the number of rows needed
 | |
|    at the time of the SELECT use LIMIT . If an index matches the ORDER BY
 | |
|    it is possible the entire query does not have to be executed. If you
 | |
|    don't know the number of rows at SELECT time, use a cursor and FETCH.
 | |
|    
 | |
|    To SELECT a random row, use:
 | |
|     SELECT col
 | |
|     FROM tab
 | |
|     ORDER BY random()
 | |
|     LIMIT 1;
 | |
| 
 | |
|   4.2) How do I find out what tables, indexes, databases, and users are
 | |
|   defined? How do I see the queries used by psql to display them?
 | |
|   
 | |
|    Use the \dt command to see tables in psql. For a complete list of
 | |
|    commands inside psql you can use \?. Alternatively you can read the
 | |
|    source code for psql in file pgsql/src/bin/psql/describe.c, it
 | |
|    contains SQL commands that generate the output for psql's backslash
 | |
|    commands. You can also start psql with the -E option so it will print
 | |
|    out the queries it uses to execute the commands you give. PostgreSQL
 | |
|    also provides an SQL compliant INFORMATION SCHEMA interface you can
 | |
|    query to get information about the database.
 | |
|    
 | |
|    There are also system tables beginning with pg_ that describe these
 | |
|    too.
 | |
|    
 | |
|    Use psql -l will list all databases.
 | |
|    
 | |
|    Also try the file pgsql/src/tutorial/syscat.source. It illustrates
 | |
|    many of the SELECTs needed to get information from the database system
 | |
|    tables.
 | |
|    
 | |
|   4.3) How do you change a column's data type?
 | |
|   
 | |
|    Changing the data type of a column can be done easily in 8.0 and later
 | |
|    with ALTER TABLE ALTER COLUMN TYPE.
 | |
|    
 | |
|    In earlier releases, do this:
 | |
|     BEGIN;
 | |
|     ALTER TABLE tab ADD COLUMN new_col new_data_type;
 | |
|     UPDATE tab SET new_col = CAST(old_col AS new_data_type);
 | |
|     ALTER TABLE tab DROP COLUMN old_col;
 | |
|     COMMIT;
 | |
| 
 | |
|    You might then want to do VACUUM FULL tab to reclaim the disk space
 | |
|    used by the expired rows.
 | |
|    
 | |
|   4.4) What is the maximum size for a row, a table, and a database?
 | |
|   
 | |
|    These are the limits:
 | |
|    
 | |
|    Maximum size for a database? unlimited (32 TB databases exist)
 | |
|    Maximum size for a table? 32 TB
 | |
|    Maximum size for a row? 1.6TB
 | |
|    Maximum size for a field? 1 GB
 | |
|    Maximum number of rows in a table? unlimited
 | |
|    Maximum number of columns in a table? 250-1600 depending on column
 | |
|    types
 | |
|    Maximum number of indexes on a table? unlimited
 | |
|    
 | |
|    Of course, these are not actually unlimited, but limited to available
 | |
|    disk space and memory/swap space. Performance may suffer when these
 | |
|    values get unusually large.
 | |
|    
 | |
|    The maximum table size of 32 TB does not require large file support
 | |
|    from the operating system. Large tables are stored as multiple 1 GB
 | |
|    files so file system size limits are not important.
 | |
|    
 | |
|    The maximum table size and maximum number of columns can be quadrupled
 | |
|    by increasing the default block size to 32k.
 | |
|    
 | |
|    One limitation is that indexes can not be created on columns longer
 | |
|    than about 2,000 characters. Fortunately, such indexes are rarely
 | |
|    needed. Uniqueness is best guaranteed by a funtion index of an MD5
 | |
|    hash of the long column, and full text indexing allows for searching
 | |
|    of words within the column.
 | |
|    
 | |
|   4.5) How much database disk space is required to store data from a typical
 | |
|   text file?
 | |
|   
 | |
|    A PostgreSQL database may require up to five times the disk space to
 | |
|    store data from a text file.
 | |
|    
 | |
|    As an example, consider a file of 100,000 lines with an integer and
 | |
|    text description on each line. Suppose the text string avergages
 | |
|    twenty bytes in length. The flat file would be 2.8 MB. The size of the
 | |
|    PostgreSQL database file containing this data can be estimated as 6.4
 | |
|    MB:
 | |
|     28 bytes: each row header (approximate)
 | |
|     24 bytes: one int field and one text field
 | |
|    + 4 bytes: pointer on page to tuple
 | |
|    ----------------------------------------
 | |
|     56 bytes per row
 | |
| 
 | |
|    The data page size in PostgreSQL is 8192 bytes (8 KB), so:
 | |
| 
 | |
|    8192 bytes per page
 | |
|    -------------------   =  146 rows per database page (rounded down)
 | |
|      56 bytes per row
 | |
| 
 | |
|    100000 data rows
 | |
|    --------------------  =  685 database pages (rounded up)
 | |
|       146 rows per page
 | |
| 
 | |
| 685 database pages * 8192 bytes per page  =  5,611,520 bytes (5.6 MB)
 | |
| 
 | |
|    Indexes do not require as much overhead, but do contain the data that
 | |
|    is being indexed, so they can be large also.
 | |
|    
 | |
|    NULLs are stored as bitmaps, so they use very little space.
 | |
|    
 | |
|   4.6) Why are my queries slow? Why don't they use my indexes?
 | |
|   
 | |
|    Indexes are not used by every query. Indexes are used only if the
 | |
|    table is larger than a minimum size, and the query selects only a
 | |
|    small percentage of the rows in the table. This is because the random
 | |
|    disk access caused by an index scan can be slower than a straight read
 | |
|    through the table, or sequential scan.
 | |
|    
 | |
|    To determine if an index should be used, PostgreSQL must have
 | |
|    statistics about the table. These statistics are collected using
 | |
|    VACUUM ANALYZE, or simply ANALYZE. Using statistics, the optimizer
 | |
|    knows how many rows are in the table, and can better determine if
 | |
|    indexes should be used. Statistics are also valuable in determining
 | |
|    optimal join order and join methods. Statistics collection should be
 | |
|    performed periodically as the contents of the table change.
 | |
|    
 | |
|    Indexes are normally not used for ORDER BY or to perform joins. A
 | |
|    sequential scan followed by an explicit sort is usually faster than an
 | |
|    index scan of a large table.
 | |
|    However, LIMIT combined with ORDER BY often will use an index because
 | |
|    only a small portion of the table is returned. In fact, though MAX()
 | |
|    and MIN() don't use indexes, it is possible to retrieve such values
 | |
|    using an index with ORDER BY and LIMIT:
 | |
|     SELECT col
 | |
|     FROM tab
 | |
|     ORDER BY col [ DESC ]
 | |
|     LIMIT 1;
 | |
| 
 | |
|    If you believe the optimizer is incorrect in choosing a sequential
 | |
|    scan, use SET enable_seqscan TO 'off' and run query again to see if an
 | |
|    index scan is indeed faster.
 | |
|    
 | |
|    When using wild-card operators such as LIKE or ~, indexes can only be
 | |
|    used in certain circumstances:
 | |
|      * The beginning of the search string must be anchored to the start
 | |
|        of the string, i.e.
 | |
|           + LIKE patterns must not start with %.
 | |
|           + ~ (regular expression) patterns must start with ^.
 | |
|      * The search string can not start with a character class, e.g.
 | |
|        [a-e].
 | |
|      * Case-insensitive searches such as ILIKE and ~* do not utilize
 | |
|        indexes. Instead, use expression indexes, which are described in
 | |
|        section 4.8.
 | |
|      * The default C locale must be used during initdb because it is not
 | |
|        possible to know the next-greatest character in a non-C locale.
 | |
|        You can create a special text_pattern_ops index for such cases
 | |
|        that work only for LIKE indexing.
 | |
|        
 | |
|    In pre-8.0 releases, indexes often can not be used unless the data
 | |
|    types exactly match the index's column types. This was particularly
 | |
|    true of int2, int8, and numeric column indexes.
 | |
|    
 | |
|   4.7) How do I see how the query optimizer is evaluating my query?
 | |
|   
 | |
|    See the EXPLAIN manual page.
 | |
|    
 | |
|   4.8) How do I perform regular expression searches and case-insensitive
 | |
|   regular expression searches? How do I use an index for case-insensitive
 | |
|   searches?
 | |
|   
 | |
|    The ~ operator does regular expression matching, and ~* does
 | |
|    case-insensitive regular expression matching. The case-insensitive
 | |
|    variant of LIKE is called ILIKE.
 | |
|    
 | |
|    Case-insensitive equality comparisons are normally expressed as:
 | |
|     SELECT *
 | |
|     FROM tab
 | |
|     WHERE lower(col) = 'abc';
 | |
| 
 | |
|    This will not use an standard index. However, if you create a
 | |
|    expresssion index, it will be used:
 | |
|     CREATE INDEX tabindex ON tab (lower(col));
 | |
| 
 | |
|    If the above index is created as UNIQUE, though the column can store
 | |
|    upper and lowercase characters, it can not have identical values that
 | |
|    differ only in case. To force a particular case to be stored in the
 | |
|    column, use a CHECK constraint or a trigger.
 | |
|    
 | |
|   4.9) In a query, how do I detect if a field is NULL? How can I sort on
 | |
|   whether a field is NULL or not?
 | |
|   
 | |
|    You test the column with IS NULL and IS NOT NULL, like this:
 | |
|    SELECT *
 | |
|    FROM tab
 | |
|    WHERE col IS NULL;
 | |
| 
 | |
|    To sort by the NULL status, use the IS NULL and IS NOT NULL modifiers
 | |
|    in your ORDER BY clause. Things that are true will sort higher than
 | |
|    things that are false, so the following will put NULL entries at the
 | |
|    top of the resulting list:
 | |
|    SELECT *
 | |
|    FROM tab
 | |
|    ORDER BY (col IS NOT NULL)
 | |
| 
 | |
|   4.10) What is the difference between the various character types?
 | |
|   
 | |
|         Type    Internal Name                    Notes
 | |
|      VARCHAR(n) varchar       size specifies maximum length, no padding
 | |
|      CHAR(n)    bpchar        blank padded to the specified fixed length
 | |
|      TEXT       text          no specific upper limit on length
 | |
|      BYTEA      bytea         variable-length byte array (null-byte safe)
 | |
|      "char"     char          one character
 | |
|    
 | |
|    You will see the internal name when examining system catalogs and in
 | |
|    some error messages.
 | |
|    
 | |
|    The first four types above are "varlena" types (i.e., the first four
 | |
|    bytes on disk are the length, followed by the data). Thus the actual
 | |
|    space used is slightly greater than the declared size. However, long
 | |
|    values are also subject to compression, so the space on disk might
 | |
|    also be less than expected.
 | |
|    VARCHAR(n) is best when storing variable-length strings and it limits
 | |
|    how long a string can be. TEXT is for strings of unlimited length,
 | |
|    with a maximum of one gigabyte.
 | |
|    
 | |
|    CHAR(n) is for storing strings that are all the same length. CHAR(n)
 | |
|    pads with blanks to the specified length, while VARCHAR(n) only stores
 | |
|    the characters supplied. BYTEA is for storing binary data,
 | |
|    particularly values that include NULL bytes. All the types described
 | |
|    here have similar performance characteristics.
 | |
|    
 | |
|   4.11.1) How do I create a serial/auto-incrementing field?
 | |
|   
 | |
|    PostgreSQL supports a SERIAL data type. It auto-creates a sequence.
 | |
|    For example, this:
 | |
|     CREATE TABLE person (
 | |
|         id   SERIAL,
 | |
|         name TEXT
 | |
|     );
 | |
| 
 | |
|    is automatically translated into this:
 | |
|     CREATE SEQUENCE person_id_seq;
 | |
|     CREATE TABLE person (
 | |
|         id   INT4 NOT NULL DEFAULT nextval('person_id_seq'),
 | |
|         name TEXT
 | |
|     );
 | |
| 
 | |
|    See the create_sequence manual page for more information about
 | |
|    sequences.
 | |
|    
 | |
|   4.11.2) How do I get the value of a SERIAL insert?
 | |
|   
 | |
|    One approach is to retrieve the next SERIAL value from the sequence
 | |
|    object with the nextval() function before inserting and then insert it
 | |
|    explicitly. Using the example table in 4.11.1, an example in a
 | |
|    pseudo-language would look like this:
 | |
|     new_id = execute("SELECT nextval('person_id_seq')");
 | |
|     execute("INSERT INTO person (id, name) VALUES (new_id, 'Blaise Pascal')");
 | |
| 
 | |
|    You would then also have the new value stored in new_id for use in
 | |
|    other queries (e.g., as a foreign key to the person table). Note that
 | |
|    the name of the automatically created SEQUENCE object will be named
 | |
|    <table>_< serialcolumn>_seq, where table and serialcolumn are the
 | |
|    names of your table and your SERIAL column, respectively.
 | |
|    
 | |
|    Alternatively, you could retrieve the assigned SERIAL value with the
 | |
|    currval() function after it was inserted by default, e.g.,
 | |
|     execute("INSERT INTO person (name) VALUES ('Blaise Pascal')");
 | |
|     new_id = execute("SELECT currval('person_id_seq')");
 | |
| 
 | |
|   4.11.3) Doesn't currval() lead to a race condition with other users?
 | |
|   
 | |
|    No. currval() returns the current value assigned by your session, not
 | |
|    by all sessions.
 | |
|    
 | |
|   4.11.4) Why aren't my sequence numbers reused on transaction abort? Why are
 | |
|   there gaps in the numbering of my sequence/SERIAL column?
 | |
|   
 | |
|    To improve concurrency, sequence values are given out to running
 | |
|    transactions as needed and are not locked until the transaction
 | |
|    completes. This causes gaps in numbering from aborted transactions.
 | |
|    
 | |
|   4.12) What is an OID? What is a CTID?
 | |
|   
 | |
|    Every row that is created in PostgreSQL gets a unique OID unless
 | |
|    created WITHOUT OIDS. OIDs are autotomatically assigned unique 4-byte
 | |
|    integers that are unique across the entire installation. However, they
 | |
|    overflow at 4 billion, and then the OIDs start being duplicated.
 | |
|    PostgreSQL uses OIDs to link its internal system tables together.
 | |
|    
 | |
|    To uniquely number columns in user tables, it is best to use SERIAL
 | |
|    rather than OIDs because SERIAL sequences are unique only within a
 | |
|    single table. and are therefore less likely to overflow. SERIAL8 is
 | |
|    available for storing eight-byte sequence values.
 | |
|    
 | |
|    CTIDs are used to identify specific physical rows with block and
 | |
|    offset values. CTIDs change after rows are modified or reloaded. They
 | |
|    are used by index entries to point to physical rows.
 | |
|    
 | |
|   4.13) Why do I get the error "ERROR: Memory exhausted in AllocSetAlloc()"?
 | |
|   
 | |
|    You probably have run out of virtual memory on your system, or your
 | |
|    kernel has a low limit for certain resources. Try this before starting
 | |
|    postmaster:
 | |
|     ulimit -d 262144
 | |
|     limit datasize 256m
 | |
| 
 | |
|    Depending on your shell, only one of these may succeed, but it will
 | |
|    set your process data segment limit much higher and perhaps allow the
 | |
|    query to complete. This command applies to the current process, and
 | |
|    all subprocesses created after the command is run. If you are having a
 | |
|    problem with the SQL client because the backend is returning too much
 | |
|    data, try it before starting the client.
 | |
|    
 | |
|   4.14) How do I tell what PostgreSQL version I am running?
 | |
|   
 | |
|    From psql, type SELECT version();
 | |
|    
 | |
|   4.15) How do I create a column that will default to the current time?
 | |
|   
 | |
|    Use CURRENT_TIMESTAMP:
 | |
|     CREATE TABLE test (x int, modtime TIMESTAMP DEFAULT CURRENT_TIMESTAMP );
 | |
| 
 | |
|   4.16) How do I perform an outer join?
 | |
|   
 | |
|    PostgreSQL supports outer joins using the SQL standard syntax. Here
 | |
|    are two examples:
 | |
|     SELECT *
 | |
|     FROM t1 LEFT OUTER JOIN t2 ON (t1.col = t2.col);
 | |
| 
 | |
|    or
 | |
|     SELECT *
 | |
|     FROM t1 LEFT OUTER JOIN t2 USING (col);
 | |
| 
 | |
|    These identical queries join t1.col to t2.col, and also return any
 | |
|    unjoined rows in t1 (those with no match in t2). A RIGHT join would
 | |
|    add unjoined rows of t2. A FULL join would return the matched rows
 | |
|    plus all unjoined rows from t1 and t2. The word OUTER is optional and
 | |
|    is assumed in LEFT, RIGHT, and FULL joins. Ordinary joins are called
 | |
|    INNER joins.
 | |
|    
 | |
|   4.17) How do I perform queries using multiple databases?
 | |
|   
 | |
|    There is no way to query a database other than the current one.
 | |
|    Because PostgreSQL loads database-specific system catalogs, it is
 | |
|    uncertain how a cross-database query should even behave.
 | |
|    
 | |
|    contrib/dblink allows cross-database queries using function calls. Of
 | |
|    course, a client can also make simultaneous connections to different
 | |
|    databases and merge the results on the client side.
 | |
|    
 | |
|   4.18) How do I return multiple rows or columns from a function?
 | |
|   
 | |
|    It is easy using set-returning functions,
 | |
|    http://techdocs.postgresql.org/guides/SetReturningFunctions
 | |
|    .
 | |
|    
 | |
|   4.19) Why do I get "relation with OID ##### does not exist" errors when
 | |
|   accessing temporary tables in PL/PgSQL functions?
 | |
|   
 | |
|    PL/PgSQL caches function scripts, and an unfortunate side effect is
 | |
|    that if a PL/PgSQL function accesses a temporary table, and that table
 | |
|    is later dropped and recreated, and the function called again, the
 | |
|    function will fail because the cached function contents still point to
 | |
|    the old temporary table. The solution is to use EXECUTE for temporary
 | |
|    table access in PL/PgSQL. This will cause the query to be reparsed
 | |
|    every time.
 | |
|    
 | |
|   4.20) What replication solutions are available?
 | |
|   
 | |
|    Though "replication" is a single term, there are several technologies
 | |
|    for doing replication, with advantages and disadvantages for each.
 | |
|    
 | |
|    Master/slave replication allows a single master to receive read/write
 | |
|    queries, while slaves can only accept read/SELECT queries. The most
 | |
|    popular freely available master-slave PostgreSQL replication solution
 | |
|    is Slony-I.
 | |
|    
 | |
|    Multi-master replication allows read/write queries to be sent to
 | |
|    multiple replicated computers. This capability also has a severe
 | |
|    impact on performance due to the need to synchronize changes between
 | |
|    servers. Pgcluster is the most popular such solution freely available
 | |
|    for PostgreSQL.
 | |
|    
 | |
|    There are also commercial and hardware-based replication solutions
 | |
|    available supporting a variety of replication models.
 | |
|    
 | |
|   4.21) Why are my table and column names not recognized in my query?
 | |
|   
 | |
|    The most common cause is the use of double-quotes around table or
 | |
|    column names during table creation. When double-quotes are used, table
 | |
|    and column names (called identifiers) are stored case-sensitive,
 | |
|    meaning you must use double-quotes when referencing the names in a
 | |
|    query. Some interfaces, like pgAdmin, automatically double-quote
 | |
|    identifiers during table creation. So, for identifiers to be
 | |
|    recognized, you must either:
 | |
|      * Avoid double-quoting identifiers when creating tables
 | |
|      * Use only lowercase characters in identifiers
 | |
|      * Double-quote identifiers when referencing them in queries
 |