mirror of https://github.com/postgres/postgres.git, synced 2025-05-22 00:02:02 -04:00
Update documentation to reflect availability of aggregate(DISTINCT).
Try to provide a more lucid discussion in 'Using Aggregate Functions' tutorial section.
parent 662371cc5d
commit ff6fe1502d
@@ -361,39 +361,90 @@ DELETE FROM classname;
|
|||||||
Like most other query languages,
|
Like most other query languages,
|
||||||
<ProductName>PostgreSQL</ProductName> supports
|
<ProductName>PostgreSQL</ProductName> supports
|
||||||
aggregate functions.
|
aggregate functions.
|
||||||
The current implementation of
|
An aggregate function computes a single result from multiple input rows.
|
||||||
<ProductName>Postgres</ProductName> aggregate functions have some limitations.
|
For example, there are aggregates to compute the
|
||||||
Specifically, while there are aggregates to compute
|
<Function>count</Function>, <Function>sum</Function>,
|
||||||
such functions as the <Function>count</Function>, <Function>sum</Function>,
|
|
||||||
<Function>avg</Function> (average), <Function>max</Function> (maximum) and
|
<Function>avg</Function> (average), <Function>max</Function> (maximum) and
|
||||||
<Function>min</Function> (minimum) over a set of instances, aggregates can only
|
<Function>min</Function> (minimum) over a set of instances.
|
||||||
appear in the target list of a query and not directly in the
|
</para>
|
||||||
qualification (the where clause). As an example,
|
|
||||||
|
<Para>
|
||||||
|
It is important to understand the interaction between aggregates and
|
||||||
|
SQL's <Command>where</Command> and <Command>having</Command> clauses.
|
||||||
|
The fundamental difference between <Command>where</Command> and
|
||||||
|
<Command>having</Command> is this: <Command>where</Command> selects
|
||||||
|
input rows before groups and aggregates are computed (thus, it controls
|
||||||
|
which rows go into the aggregate computation), whereas
|
||||||
|
<Command>having</Command> selects group rows after groups and
|
||||||
|
aggregates are computed. Thus, the
|
||||||
|
<Command>where</Command> clause may not contain aggregate functions;
|
||||||
|
it makes no sense to try to use an aggregate to determine which rows
|
||||||
|
will be inputs to the aggregates. On the other hand,
|
||||||
|
<Command>having</Command> clauses always contain aggregate functions.
|
||||||
|
(Strictly speaking, you are allowed to write a <Command>having</Command>
|
||||||
|
clause that doesn't use aggregates, but it's wasteful; the same condition
|
||||||
|
could be used more efficiently at the <Command>where</Command> stage.)
|
||||||
|
</para>
|
||||||
|
|
||||||
|
<Para>
|
||||||
|
As an example, we can find the highest low-temperature reading anywhere
|
||||||
|
with
|
||||||
|
|
||||||
<ProgramListing>
|
<ProgramListing>
|
||||||
SELECT max(temp_lo) FROM weather;
|
SELECT max(temp_lo) FROM weather;
|
||||||
</ProgramListing>
|
</ProgramListing>
|
||||||
|
|
||||||
is allowed, while
|
If we want to know which city (or cities) that reading occurred in,
|
||||||
|
we might try
|
||||||
|
|
||||||
<ProgramListing>
|
<ProgramListing>
|
||||||
SELECT city FROM weather WHERE temp_lo = max(temp_lo);
|
SELECT city FROM weather WHERE temp_lo = max(temp_lo);
|
||||||
</ProgramListing>
|
</ProgramListing>
|
||||||
|
|
||||||
is not. However, as is often the case the query can be restated to accomplish
|
but this will not work since the aggregate max() can't be used in
|
||||||
the intended result; here by using a <FirstTerm>subselect</FirstTerm>:
|
<Command>where</Command>. However, as is often the case the query can be
|
||||||
|
restated to accomplish the intended result; here by using a
|
||||||
|
<FirstTerm>subselect</FirstTerm>:
|
||||||
<ProgramListing>
|
<ProgramListing>
|
||||||
SELECT city FROM weather WHERE temp_lo = (SELECT max(temp_lo) FROM weather);
|
SELECT city FROM weather WHERE temp_lo = (SELECT max(temp_lo) FROM weather);
|
||||||
</ProgramListing>
|
</ProgramListing>
|
||||||
|
This is OK because the sub-select is an independent computation that
|
||||||
|
computes its own aggregate separately from what's happening in the outer
|
||||||
|
select.
|
||||||
</Para>
|
</Para>
|
||||||
|
|
||||||
<Para>
|
<Para>
|
||||||
Aggregates may also have <FirstTerm>group by</FirstTerm> clauses:
|
Aggregates are also very useful in combination with
|
||||||
|
<FirstTerm>group by</FirstTerm> clauses. For example, we can get the
|
||||||
|
maximum low temperature observed in each city with
|
||||||
<ProgramListing>
|
<ProgramListing>
|
||||||
SELECT city, max(temp_lo)
|
SELECT city, max(temp_lo)
|
||||||
FROM weather
|
FROM weather
|
||||||
GROUP BY city;
|
GROUP BY city;
|
||||||
</ProgramListing>
|
</ProgramListing>
|
||||||
|
which gives us one output row per city. We can filter these grouped
|
||||||
|
rows using <Command>having</Command>:
|
||||||
|
<ProgramListing>
|
||||||
|
SELECT city, max(temp_lo)
|
||||||
|
FROM weather
|
||||||
|
GROUP BY city
|
||||||
|
HAVING min(temp_lo) < 0;
|
||||||
|
</ProgramListing>
|
||||||
|
which gives us the same results for only the cities that have some
|
||||||
|
below-zero readings. Finally, if we only care about cities whose
|
||||||
|
names begin with 'P', we might do
|
||||||
|
<ProgramListing>
|
||||||
|
SELECT city, max(temp_lo)
|
||||||
|
FROM weather
|
||||||
|
WHERE city like 'P%'
|
||||||
|
GROUP BY city
|
||||||
|
HAVING min(temp_lo) < 0;
|
||||||
|
</ProgramListing>
|
||||||
|
Note that we can apply the city-name restriction in
|
||||||
|
<Command>where</Command>, since it needs no aggregate. This is
|
||||||
|
more efficient than adding the restriction to <Command>having</Command>,
|
||||||
|
because we avoid doing the grouping and aggregate calculations
|
||||||
|
for all rows that fail the <Command>where</Command> check.
|
||||||
</Para>
|
</Para>
|
||||||
</sect1>
|
</sect1>
|
||||||
</Chapter>
|
</Chapter>
|
||||||
|
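The WHERE versus HAVING discussion added to the tutorial in the hunk above can be tried out with a short script. This is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for PostgreSQL, the sample rows are invented for illustration, and ORDER BY city is added (it is not in the tutorial queries) so that the output order is deterministic.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; the rows below are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weather (city TEXT, temp_lo INTEGER)")
conn.executemany(
    "INSERT INTO weather VALUES (?, ?)",
    [
        ("San Francisco", 46),
        ("San Francisco", 43),
        ("Hayward", 37),
        ("Portland", -3),
        ("Portland", 10),
        ("Pierre", -12),
        ("Pierre", -1),
        ("Anchorage", -20),
    ],
)


def rows(sql):
    """Run a query and return all result rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# An aggregate directly in WHERE is rejected: WHERE filters input rows
# before any aggregate has been computed.
try:
    rows("SELECT city FROM weather WHERE temp_lo = max(temp_lo)")
    where_aggregate_rejected = False
except sqlite3.Error:
    where_aggregate_rejected = True

# The subselect computes its own aggregate independently of the outer query.
hottest_cities = rows(
    "SELECT city FROM weather "
    "WHERE temp_lo = (SELECT max(temp_lo) FROM weather)"
)

# One output row per city.
per_city = rows(
    "SELECT city, max(temp_lo) FROM weather GROUP BY city ORDER BY city"
)

# HAVING filters the grouped rows after the aggregates are computed.
below_zero = rows(
    "SELECT city, max(temp_lo) FROM weather "
    "GROUP BY city HAVING min(temp_lo) < 0 ORDER BY city"
)

# WHERE drops non-matching rows before grouping, so less work is grouped.
p_cities = rows(
    "SELECT city, max(temp_lo) FROM weather WHERE city LIKE 'P%' "
    "GROUP BY city HAVING min(temp_lo) < 0 ORDER BY city"
)

print(where_aggregate_rejected, hottest_cities, per_city, below_zero, p_cities)
```

With this sample data, Hayward and San Francisco drop out at the HAVING step because they have no below-zero reading, and Anchorage drops out at the WHERE step because its name does not begin with 'P'.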
@@ -1,5 +1,5 @@
|
|||||||
<!--
|
<!--
|
||||||
$Header: /cvsroot/pgsql/doc/src/sgml/ref/select.sgml,v 1.22 1999/08/06 13:50:31 thomas Exp $
|
$Header: /cvsroot/pgsql/doc/src/sgml/ref/select.sgml,v 1.23 1999/12/13 17:39:38 tgl Exp $
|
||||||
Postgres documentation
|
Postgres documentation
|
||||||
-->
|
-->
|
||||||
|
|
||||||
@@ -202,10 +202,10 @@ SELECT [ ALL | DISTINCT [ ON <replaceable class="PARAMETER">column</replaceable>
|
|||||||
|
|
||||||
<para>
|
<para>
|
||||||
<command>DISTINCT</command> will eliminate all duplicate rows from the
|
<command>DISTINCT</command> will eliminate all duplicate rows from the
|
||||||
selection.
|
result.
|
||||||
<command>DISTINCT ON <replaceable class="PARAMETER">column</replaceable></command>
|
<command>DISTINCT ON <replaceable class="PARAMETER">column</replaceable></command>
|
||||||
will eliminate all duplicates in the specified column; this is
|
will eliminate all duplicates in the specified column; this is
|
||||||
equivalent to using
|
similar to using
|
||||||
<command>GROUP BY <replaceable class="PARAMETER">column</replaceable></command>.
|
<command>GROUP BY <replaceable class="PARAMETER">column</replaceable></command>.
|
||||||
<command>ALL</command> will return all candidate rows,
|
<command>ALL</command> will return all candidate rows,
|
||||||
including duplicates.
|
including duplicates.
|
||||||
@@ -320,11 +320,13 @@ GROUP BY <replaceable class="PARAMETER">column</replaceable> [, ...]
|
|||||||
|
|
||||||
<para>
|
<para>
|
||||||
GROUP BY will condense into a single row all rows that share the
|
GROUP BY will condense into a single row all rows that share the
|
||||||
same values for the
|
same values for the grouped columns. Aggregate functions, if any,
|
||||||
grouped columns; aggregates return values derived from all rows
|
are computed across all rows making up each group, producing a
|
||||||
that make up the group. The value returned for an ungrouped
|
separate value for each group (whereas without GROUP BY, an
|
||||||
and unaggregated column is dependent on the order in which rows
|
aggregate produces a single value computed across all the selected
|
||||||
happen to be read from the database.
|
rows). When GROUP BY is present, it is not valid to refer to
|
||||||
|
ungrouped columns except within aggregate functions, since there
|
||||||
|
would be more than one possible value to return for an ungrouped column.
|
||||||
</para>
|
</para>
|
||||||
</refsect2>
|
</refsect2>
|
||||||
|
|
||||||
@@ -354,7 +356,8 @@ HAVING <replaceable class="PARAMETER">cond_expr</replaceable>
|
|||||||
<para>
|
<para>
|
||||||
Each column referenced in
|
Each column referenced in
|
||||||
<replaceable class="PARAMETER">cond_expr</replaceable> shall unambiguously
|
<replaceable class="PARAMETER">cond_expr</replaceable> shall unambiguously
|
||||||
reference a grouping column.
|
reference a grouping column, unless the reference appears within an
|
||||||
|
aggregate function.
|
||||||
</para>
|
</para>
|
||||||
</refsect2>
|
</refsect2>
|
||||||
|
|
||||||
|
@@ -642,15 +642,16 @@ CAST '<replaceable>string</replaceable>' AS <replaceable>type</replaceable>
|
|||||||
<member><replaceable>a_expr</replaceable> <replaceable>right_unary_operator</replaceable></member>
|
<member><replaceable>a_expr</replaceable> <replaceable>right_unary_operator</replaceable></member>
|
||||||
<member><replaceable>left_unary_operator</replaceable> <replaceable>a_expr</replaceable></member>
|
<member><replaceable>left_unary_operator</replaceable> <replaceable>a_expr</replaceable></member>
|
||||||
<member>parameter</member>
|
<member>parameter</member>
|
||||||
<member>functional expressions</member>
|
<member>functional expression</member>
|
||||||
<member>aggregate expressions</member>
|
<member>aggregate expression</member>
|
||||||
</simplelist>
|
</simplelist>
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
We have already discussed constants and attributes. The two kinds of
|
We have already discussed constants and attributes. The three kinds of
|
||||||
operator expressions indicate respectively binary and left_unary
|
operator expressions indicate respectively binary (infix), right-unary
|
||||||
expressions. The following sections discuss the remaining options.
|
(suffix) and left-unary (prefix) operators. The following sections
|
||||||
|
discuss the remaining options.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<sect2>
|
<sect2>
|
||||||
@@ -690,7 +691,7 @@ CREATE FUNCTION dept (name)
|
|||||||
enclosed in parentheses:
|
enclosed in parentheses:
|
||||||
|
|
||||||
<synopsis>
|
<synopsis>
|
||||||
<replaceable>function</replaceable> (<replaceable>a_expr</replaceable> [, <replaceable>a_expr</replaceable> )
|
<replaceable>function</replaceable> (<replaceable>a_expr</replaceable> [, <replaceable>a_expr</replaceable> ... ] )
|
||||||
</synopsis>
|
</synopsis>
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
@@ -705,20 +706,40 @@ sqrt(emp.salary)
|
|||||||
</sect2>
|
</sect2>
|
||||||
|
|
||||||
<sect2>
|
<sect2>
|
||||||
<title>Aggregate Expression</title>
|
<title>Aggregate Expressions</title>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
An <firstterm>aggregate expression</firstterm>
|
An <firstterm>aggregate expression</firstterm> represents the application
|
||||||
represents a simple aggregate (i.e., one that computes a single value)
|
of an aggregate function across the rows selected by a query.
|
||||||
or an aggregate function (i.e., one that computes a set of values).
|
An aggregate function reduces multiple inputs to a single output value,
|
||||||
The syntax is the following:
|
such as the sum or average of the inputs.
|
||||||
|
The syntax of an aggregate expression is one of the following:
|
||||||
|
|
||||||
<synopsis>
|
<simplelist>
|
||||||
<replaceable>aggregate_name</replaceable> (<replaceable>attribute</replaceable>)
|
<member><replaceable>aggregate_name</replaceable> (<replaceable>expression</replaceable>)</member>
|
||||||
</synopsis>
|
<member><replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable>)</member>
|
||||||
|
<member><replaceable>aggregate_name</replaceable> ( * )</member>
|
||||||
|
</simplelist>
|
||||||
|
|
||||||
where <replaceable>aggregate_name</replaceable>
|
where <replaceable>aggregate_name</replaceable> is a previously defined
|
||||||
must be a previously defined aggregate.
|
aggregate, and <replaceable>expression</replaceable> is any expression
|
||||||
|
that doesn't itself contain an aggregate expression.
|
||||||
|
</para>
|
||||||
|
|
||||||
|
<para>
|
||||||
|
The first form of aggregate expression invokes the aggregate across all
|
||||||
|
input rows for which the given expression yields a non-null value.
|
||||||
|
The second form invokes the aggregate for all distinct non-null values
|
||||||
|
of the expression found in the input rows. The last form invokes the
|
||||||
|
aggregate once for each input row regardless of null or non-null values;
|
||||||
|
since no particular input value is specified, it is generally only useful
|
||||||
|
for the count() aggregate.
|
||||||
|
</para>
|
||||||
|
|
||||||
|
<para>
|
||||||
|
For example, count(*) yields the total number of input rows;
|
||||||
|
count(f1) yields the number of input rows in which f1 is non-null;
|
||||||
|
count(distinct f1) yields the number of distinct non-null values of f1.
|
||||||
</para>
|
</para>
|
||||||
</sect2>
|
</sect2>
|
||||||
|
|
||||||
|
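The three aggregate expression forms described in the last hunk (plain, DISTINCT, and `*`) can be compared side by side. The following is a minimal sketch, again assuming Python's sqlite3 module with an in-memory SQLite database in place of PostgreSQL; the table `t`, its column `f1`, and the five sample values are illustrative and not taken from the commit.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; table t and its data are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (f1 INTEGER)")
# Five input rows: the value 1 twice, the value 2 once, and two nulls.
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (1,), (2,), (None,), (None,)])

total_rows, non_null, distinct_non_null = conn.execute(
    "SELECT count(*), count(f1), count(DISTINCT f1) FROM t"
).fetchone()

# count(*)           counts every input row, null or not
# count(f1)          counts only rows where f1 is non-null
# count(DISTINCT f1) counts the distinct non-null values of f1
print(total_rows, non_null, distinct_non_null)  # prints: 5 3 2
```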