diff --git a/src/tools/backend/backend_dirs.html b/src/tools/backend/backend_dirs.html index 433324bff13..3c64f4f7cf5 100644 --- a/src/tools/backend/backend_dirs.html +++ b/src/tools/backend/backend_dirs.html @@ -1,428 +1,351 @@ - -
--Click on any of the section headings to see the source code for that section. - -
--Because PostgreSQL requires access to system tables for almost every -operation, getting those system tables in place is a problem. -You can't just create the tables and insert data into them in the normal way, -because table creation and insertion requires the tables to already -exist. -This code jams the data directly into tables using a -special syntax used only by the bootstrap procedure. -
--This checks the process name(argv[0]) and various flags, and passes -control to the postmaster or postgres backend code. -
--This creates shared memory, and then goes into a loop waiting for -connection requests. -When a connection request arrives, a postgres backend is started, -and the connection is passed to it. -
--This handles communication to the client processes. -
--This contains the postgres backend main handler, as well as the -code that makes calls to the parser, optimizer, executor, and -/commands functions. -
--This converts SQL queries coming from libpq into command-specific -structures to be used the the optimizer/executor, or /commands -routines. -The SQL is lexically analyzed into keywords, identifiers, and constants, -and passed to the parser. -The parser creates command-specific structures to hold the elements of -the query. -The command-specific structures are then broken apart, checked, and passed to -/commands processing routines, or converted into Lists of -Nodes to be handled by the optimizer and executor. -
--This uses the parser output to generate an optimal plan for the -executor. -
--This takes the parser query output, and generates all possible methods of -executing the request. -It examines table join order, where clause restrictions, -and optimizer table statistics to evaluate each possible execution -method, and assigns a cost to each. -
--optimizer/path evaluates all possible ways to join the requested tables. -When the number of tables becomes great, the number of tests made -becomes great too. -The Genetic Query Optimizer considers each table separately, then figures -the most optimal order to perform the join. -For a few tables, this method takes longer, but for a large number of -tables, it is faster. -There is an option to control when this feature is used. -
--This takes the optimizer/path output, chooses the path with the -least cost, and creates a plan for the executor. -
--This does special plan processing. -
--This contains support routines used by other parts of the optimizer. -
--This handles select, insert, update, and delete statements. -The operations required to handle these statement types include -heap scans, index scans, sorting, joining tables, grouping, aggregates, -and uniqueness. -
--These process SQL commands that do not require complex handling. -It includes vacuum, copy, alter, create table, create type, and -many others. -The code is called with the structures generated by the parser. -Most of the routines do some processing, then call lower-level functions -in the catalog directory to do the actual work. -
--This contains functions that manipulate the system tables or catalogs. -Table, index, procedure, operator, type, and aggregate creation and -manipulation routines are here. -These are low-level routines, and are usually called by upper routines -that pre-format user requests into a predefined format. -
-
-These allow uniform resource access by the backend.
-
-
-
-storage/buffer
-- shared buffer pool manager
-
-
-storage/file
-- file manager
-
-
-storage/ipc
-- semaphores and shared memory
-
-
-storage/large_object
-- large objects
-
-
-storage/lmgr
-- lock manager
-
-
-storage/page
-- page manager
-
-
-storage/smgr
-- storage/disk manager
-
-
-
-These control the way data is accessed in heap, indexes, and
-transactions.
-
-
-
-access/common
-- common access routines
-
-
-access/gist
-- easy-to-define access method system
-
-
-access/hash
-- hash
-
-
-access/heap
-- heap is use to store data rows
-
-
-access/index
-- used by all index types
-
-
-access/nbtree
-- Lehman and Yao's btree management algorithm
-
-
-access/rtree
-- used for indexing of 2-dimensional data
-
-
-access/transam
-- transaction manager (BEGIN/ABORT/COMMIT)
-
-
-
-PostgreSQL stores information about SQL queries in structures called -nodes. -Nodes are generic containers that have a type field and then a -type-specific data section. -Nodes are usually placed in Lists. -A List is container with an elem element, -and a next field that points to the next List. -These List structures are chained together in a forward linked list. -In this way, a chain of Lists can contain an unlimited number of Node -elements, and each Node can contain any data type. -These are used extensively in the parser, optimizer, and executor to -store requests and data. -
--This contains all the PostgreSQL builtin data types. -
--PostgreSQL supports arbitrary data types, so no data types are hard-coded -into the core backend routines. -When the backend needs to find out about a type, is does a lookup of a -system table. -Because these system tables are referred to often, a cache is maintained -that speeds lookups. -There is a system relation cache, a function/operator cache, and a relation -information cache. -This last cache maintains information about all recently-accessed -tables, not just system ones. -
--Reports backend errors to the front end. -
--This handles the calling of dynamically-loaded functions, and the calling -of functions defined in the system tables. -
--These hash routines are used by the cache and memory-manager routines to -do quick lookups of dynamic data storage structures maintained by the -backend. -
--When PostgreSQL allocates memory, it does so in an explicit context. -Contexts can be statement-specific, transaction-specific, or -persistent/global. -By doing this, the backend can easily free memory once a statement or -transaction completes. -
--When statement output must be sorted as part of a backend operation, -this code sorts the tuples, either in memory or using disk files. -
--These routines do checking of tuple internal columns to determine if the -current row is still valid, or is part of a non-committed transaction or -superseded by a new row. -
--There are include directories for each subsystem. -
--This houses several generic routines. -
--This is used for regular expression handling in the backend, i.e. '~'. -
--This does processing for the rules system. -
-Click on any of the section headings to see the source code +for that section.
+ +Because PostgreSQL requires access to system tables for almost +every operation, getting those system tables in place is a problem. +You can't just create the tables and insert data into them in the +normal way, because table creation and insertion require the +tables to already exist. This code jams the data directly +into tables using a special syntax used only by the bootstrap +procedure.
+ +This checks the process name (argv[0]) and various flags, and +passes control to the postmaster or postgres backend code.
+ +This creates shared memory, and then goes into a loop waiting +for connection requests. When a connection request arrives, a +postgres backend is started, and the connection is passed to +it.
+ +This handles communication to the client processes.
+ +This contains the postgres backend main handler, as well +as the code that makes calls to the parser, optimizer, executor, +and /commands functions.
+ +This converts SQL queries coming from libpq into +command-specific structures to be used by the optimizer/executor, +or /commands routines. The SQL is lexically analyzed into +keywords, identifiers, and constants, and passed to the parser. The +parser creates command-specific structures to hold the elements of +the query. The command-specific structures are then broken apart, +checked, and passed to /commands processing routines, or +converted into Lists of Nodes to be handled by the +optimizer and executor.
+ +This uses the parser output to generate an optimal plan for the +executor.
+ +This takes the parser query output, and generates all possible +methods of executing the request. It examines table join order, +where clause restrictions, and optimizer table statistics to +evaluate each possible execution method, and assigns a cost to +each.
+ +optimizer/path evaluates all possible ways to join the +requested tables. As the number of tables grows, the number of +join orderings to test grows rapidly. The Genetic Query Optimizer +considers each table separately, then determines an efficient +order in which to perform the joins. For a few tables, this method +takes longer, but for a large number of tables, it is faster. There +is an option to control when this feature is used.
+ +This takes the optimizer/path output, chooses the path +with the least cost, and creates a plan for the executor.
+ +This does special plan processing.
+ +This contains support routines used by other parts of the +optimizer.
+ +This handles select, insert, update, and delete +statements. The operations required to handle these statement types +include heap scans, index scans, sorting, joining tables, grouping, +aggregates, and uniqueness.
+ +These process SQL commands that do not require complex handling. +They include vacuum, copy, alter, create table, create type, +and many others. The code is called with the structures generated +by the parser. Most of the routines do some processing, then call +lower-level functions in the catalog directory to do the actual +work.
+ +This contains functions that manipulate the system tables or +catalogs. Table, index, procedure, operator, type, and aggregate +creation and manipulation routines are here. These are low-level +routines, and are usually called by upper routines that pre-format +user requests into a predefined format.
+ +These allow uniform resource access by the backend.
+
+ storage/buffer - shared
+buffer pool manager
+ storage/file - file
+manager
+ storage/freespace - free
+space map
+ storage/ipc - semaphores and
+shared memory
+ storage/large_object
+- large objects
+ storage/lmgr - lock
+manager
+ storage/page - page
+manager
+ storage/smgr - storage/disk
+manager
+
+
These control the way data is accessed in heap, indexes, and
+transactions.
+
+ access/common - common
+access routines
+ access/gist - easy-to-define
+access method system
+ access/hash - hash
+ access/heap - heap is used to
+store data rows
+ access/index - used by all
+index types
+ access/nbtree - Lehman and
+Yao's btree management algorithm
+ access/rtree - used for
+indexing of 2-dimensional data
+ access/transam -
+transaction manager (BEGIN/ABORT/COMMIT)
+
+
PostgreSQL stores information about SQL queries in structures +called nodes. Nodes are generic containers that have a +type field and then a type-specific data section. Nodes are +usually placed in Lists. A List is a container with an +elem element, and a next field that points to the +next List. These List structures are chained together +in a forward linked list. In this way, a chain of Lists can +contain an unlimited number of Node elements, and each +Node can contain any data type. These are used extensively +in the parser, optimizer, and executor to store requests and +data.
+ +This contains all the PostgreSQL builtin data types.
+ +PostgreSQL supports arbitrary data types, so no data types are +hard-coded into the core backend routines. When the backend needs +to find out about a type, it does a lookup of a system table. +Because these system tables are referred to often, a cache is +maintained that speeds lookups. There is a system relation cache, a +function/operator cache, and a relation information cache. This +last cache maintains information about all recently-accessed +tables, not just system ones.
+ +Reports backend errors to the front end.
+ +This handles the calling of dynamically-loaded functions, and +the calling of functions defined in the system tables.
+ +These hash routines are used by the cache and memory-manager +routines to do quick lookups of dynamic data storage structures +maintained by the backend.
+ +When PostgreSQL allocates memory, it does so in an explicit +context. Contexts can be statement-specific, transaction-specific, +or persistent/global. By doing this, the backend can easily free +memory once a statement or transaction completes.
+ +When statement output must be sorted as part of a backend +operation, this code sorts the tuples, either in memory or using +disk files.
+ +These routines do checking of tuple internal columns to +determine if the current row is still valid, or is part of a +non-committed transaction or superseded by a new row.
+ +There are include directories for each subsystem.
+ +This houses several generic routines.
+ +This is used for regular expression handling in the backend, +i.e. '~'.
+ +
+
+
The statement is then identified as complex (SELECT / INSERT / UPDATE / DELETE) or a simple, e.g CREATE USER, ANALYZE, , -etc. Utility commands are processed by statement-specific functions in backend/commands. Complex statements -require more handling.
+etc. Simple utility commands are processed by statement-specific +functions in backend/commands. +Complex statements require more handling.The parser takes a complex query, and creates a Query structure that @@ -98,7 +100,7 @@ optimal index usage.
The Plan is then passed to the executor for execution, and the -result returned to the client. The Plan actually as set of nodes, +result returned to the client. The Plan is actually a set of nodes, arranged in a tree structure with a top-level node, and various sub-nodes as children.