mirror of
https://github.com/postgres/postgres.git
synced 2025-05-31 00:01:57 -04:00
Add another sequential scan email.
This commit is contained in:
parent
43e740b317
commit
d87677022b
@ -345,7 +345,7 @@ From owner-pgsql-hackers@hub.org Tue Oct 19 10:31:10 1999
|
|||||||
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
|
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
|
||||||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA29087
|
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA29087
|
||||||
for <maillist@candle.pha.pa.us>; Tue, 19 Oct 1999 10:31:08 -0400 (EDT)
|
for <maillist@candle.pha.pa.us>; Tue, 19 Oct 1999 10:31:08 -0400 (EDT)
|
||||||
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.14 $) with ESMTP id KAA27535 for <maillist@candle.pha.pa.us>; Tue, 19 Oct 1999 10:19:47 -0400 (EDT)
|
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.15 $) with ESMTP id KAA27535 for <maillist@candle.pha.pa.us>; Tue, 19 Oct 1999 10:19:47 -0400 (EDT)
|
||||||
Received: from localhost (majordom@localhost)
|
Received: from localhost (majordom@localhost)
|
||||||
by hub.org (8.9.3/8.9.3) with SMTP id KAA30328;
|
by hub.org (8.9.3/8.9.3) with SMTP id KAA30328;
|
||||||
Tue, 19 Oct 1999 10:12:10 -0400 (EDT)
|
Tue, 19 Oct 1999 10:12:10 -0400 (EDT)
|
||||||
@ -454,7 +454,7 @@ From owner-pgsql-hackers@hub.org Tue Oct 19 21:25:30 1999
|
|||||||
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
|
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
|
||||||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA28130
|
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA28130
|
||||||
for <maillist@candle.pha.pa.us>; Tue, 19 Oct 1999 21:25:26 -0400 (EDT)
|
for <maillist@candle.pha.pa.us>; Tue, 19 Oct 1999 21:25:26 -0400 (EDT)
|
||||||
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.14 $) with ESMTP id VAA10512 for <maillist@candle.pha.pa.us>; Tue, 19 Oct 1999 21:15:28 -0400 (EDT)
|
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.15 $) with ESMTP id VAA10512 for <maillist@candle.pha.pa.us>; Tue, 19 Oct 1999 21:15:28 -0400 (EDT)
|
||||||
Received: from localhost (majordom@localhost)
|
Received: from localhost (majordom@localhost)
|
||||||
by hub.org (8.9.3/8.9.3) with SMTP id VAA50745;
|
by hub.org (8.9.3/8.9.3) with SMTP id VAA50745;
|
||||||
Tue, 19 Oct 1999 21:07:23 -0400 (EDT)
|
Tue, 19 Oct 1999 21:07:23 -0400 (EDT)
|
||||||
@ -1006,7 +1006,7 @@ From pgsql-general-owner+M2497@hub.org Fri Jun 16 18:31:03 2000
|
|||||||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||||||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA04165
|
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA04165
|
||||||
for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 17:31:01 -0400 (EDT)
|
for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 17:31:01 -0400 (EDT)
|
||||||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.14 $) with ESMTP id RAA13110 for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 17:20:12 -0400 (EDT)
|
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.15 $) with ESMTP id RAA13110 for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 17:20:12 -0400 (EDT)
|
||||||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||||||
by hub.org (8.10.1/8.10.1) with SMTP id e5GLDaM14477;
|
by hub.org (8.10.1/8.10.1) with SMTP id e5GLDaM14477;
|
||||||
Fri, 16 Jun 2000 17:13:36 -0400 (EDT)
|
Fri, 16 Jun 2000 17:13:36 -0400 (EDT)
|
||||||
@ -3032,3 +3032,133 @@ Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org
|
|||||||
Don't you know, in this new Dark Age, we're all light. --XTC
|
Don't you know, in this new Dark Age, we're all light. --XTC
|
||||||
|
|
||||||
|
|
||||||
|
From cjs@cynic.net Wed Apr 24 23:19:23 2002
|
||||||
|
Return-path: <cjs@cynic.net>
|
||||||
|
Received: from angelic.cynic.net ([202.232.117.21])
|
||||||
|
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g3P3JM414917
|
||||||
|
for <pgman@candle.pha.pa.us>; Wed, 24 Apr 2002 23:19:22 -0400 (EDT)
|
||||||
|
Received: from localhost (localhost [127.0.0.1])
|
||||||
|
by angelic.cynic.net (Postfix) with ESMTP
|
||||||
|
id 1F36F870E; Thu, 25 Apr 2002 12:19:14 +0900 (JST)
|
||||||
|
Date: Thu, 25 Apr 2002 12:19:14 +0900 (JST)
|
||||||
|
From: Curt Sampson <cjs@cynic.net>
|
||||||
|
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
||||||
|
cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||||||
|
Subject: Re: Sequential Scan Read-Ahead
|
||||||
|
In-Reply-To: <200204250156.g3P1ufh05751@candle.pha.pa.us>
|
||||||
|
Message-ID: <Pine.NEB.4.43.0204251118040.445-100000@angelic.cynic.net>
|
||||||
|
MIME-Version: 1.0
|
||||||
|
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
||||||
|
Status: OR
|
||||||
|
|
||||||
|
On Wed, 24 Apr 2002, Bruce Momjian wrote:
|
||||||
|
|
||||||
|
> > 1. Not all systems do readahead.
|
||||||
|
>
|
||||||
|
> If they don't, that isn't our problem. We expect it to be there, and if
|
||||||
|
> it isn't, the vendor/kernel is at fault.
|
||||||
|
|
||||||
|
It is your problem when another database kicks Postgres' ass
|
||||||
|
performance-wise.
|
||||||
|
|
||||||
|
And at that point, *you're* at fault. You're the one who's knowingly
|
||||||
|
decided to do things inefficiently.
|
||||||
|
|
||||||
|
Sorry if this sounds harsh, but this, "Oh, someone else is to blame"
|
||||||
|
attitude gets me steamed. It's one thing to say, "We don't support
|
||||||
|
this." That's fine; there are often good reasons for that. It's a
|
||||||
|
completely different thing to say, "It's an unrelated entity's fault we
|
||||||
|
don't support this."
|
||||||
|
|
||||||
|
At any rate, relying on the kernel to guess how to optimise for
|
||||||
|
the workload will never work as well as the software that
|
||||||
|
knows the workload doing the optimization.
|
||||||
|
|
||||||
|
The lack of support thing is no joke. Sure, lots of systems nowadays
|
||||||
|
support unified buffer cache and read-ahead. But how many, besides
|
||||||
|
Solaris, support free-behind, which is also very important to avoid
|
||||||
|
blowing out your buffer cache when doing sequential reads? And who
|
||||||
|
at all supports read-ahead for reverse scans? (Or does Postgres
|
||||||
|
not do those, anyway? I can see the support is there.)
|
||||||
|
|
||||||
|
And even when the facilities are there, you create problems by
|
||||||
|
using them. Look at the OS buffer cache, for example. Not only do
|
||||||
|
we lose efficiency by using two layers of caching, but (as people
|
||||||
|
have pointed out recently on the lists), the optimizer can't even
|
||||||
|
know how much or what is being cached, and thus can't make decisions
|
||||||
|
based on that.
|
||||||
|
|
||||||
|
> Yes, seek() in file will turn off read-ahead. Grabbing bigger chunks
|
||||||
|
> would help here, but if you have two people already reading from the
|
||||||
|
> same file, grabbing bigger chunks of the file may not be optimal.
|
||||||
|
|
||||||
|
Grabbing bigger chunks is always optimal, AFAICT, if they're not
|
||||||
|
*too* big and you use the data. A single 64K read takes very little
|
||||||
|
longer than a single 8K read.
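
[Illustrative note, not part of the original message.] The chunk-size and
syscall-count points can be sketched as follows. This is a minimal sketch
under stated assumptions: the 1 MB scratch file, the helper names
count_reads and make_file, and the 8K/64K sizes are all chosen here for
illustration, and it only counts read calls rather than measuring time.

```python
import os
import tempfile


def count_reads(path, chunk_size):
    """Read a file with unbuffered os.read calls of chunk_size bytes.

    Returns (number of non-empty reads, total bytes read).
    """
    calls = 0
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            data = os.read(fd, chunk_size)
            if not data:  # an empty result means end of file
                break
            calls += 1
            total += len(data)
    finally:
        os.close(fd)
    return calls, total


def make_file(size):
    """Create a scratch file of `size` zero bytes (hypothetical test data)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * size)
    return path


FILE_SIZE = 1024 * 1024  # assumed size, for illustration only
path = make_file(FILE_SIZE)
try:
    small_calls, small_bytes = count_reads(path, 8 * 1024)
    large_calls, large_bytes = count_reads(path, 64 * 1024)
finally:
    os.remove(path)

print("8K reads:", small_calls, "64K reads:", large_calls)
```

Both passes transfer the same bytes; the larger chunk size simply needs
fewer kernel/userland transitions to do it. Whether that matters in
practice depends on how much CPU the application has to spare.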
|
||||||
|
|
||||||
|
> > 3. Even when the read-ahead does occur, you're still doing more
|
||||||
|
> > syscalls, and thus more expensive kernel/userland transitions, than
|
||||||
|
> > you have to.
|
||||||
|
>
|
||||||
|
> I would guess the performance impact is minimal.
|
||||||
|
|
||||||
|
If it were minimal, people wouldn't work so hard to build multi-level
|
||||||
|
thread systems, where multiple userland threads are scheduled on
|
||||||
|
top of kernel threads.
|
||||||
|
|
||||||
|
However, it does depend on how much CPU your particular application
|
||||||
|
is using. You may have it to spare.
|
||||||
|
|
||||||
|
> http://candle.pha.pa.us/mhonarc/todo.detail/performance/msg00009.html
|
||||||
|
|
||||||
|
Well, this message has some points in it that I feel are just incorrect.
|
||||||
|
|
||||||
|
1. It is *not* true that you have no idea where data is when
|
||||||
|
using a storage array or other similar system. While you
|
||||||
|
certainly ought not worry about things such as head positions
|
||||||
|
and so on, it's been a given for a long, long time that two
|
||||||
|
blocks that have close index numbers are going to be close
|
||||||
|
together in physical storage.
|
||||||
|
|
||||||
|
2. Raw devices are quite standard across Unix systems (except
|
||||||
|
in the unfortunate case of Linux, which I think has been
|
||||||
|
remedied, hasn't it?). They're very portable, and have just as
|
||||||
|
well--if not better--defined write semantics as a filesystem.
|
||||||
|
|
||||||
|
3. My observations of OS performance tuning over the past six
|
||||||
|
or eight years contradict the statement, "There's a considerable
|
||||||
|
cost in complexity and code in using "raw" storage too, and
|
||||||
|
it's not a one off cost: as the technologies change, the "fast"
|
||||||
|
way to do things will change and the code will have to be
|
||||||
|
updated to match." While optimizations have been removed over
|
||||||
|
the years, the basic optimizations (order reads by block number,
|
||||||
|
do larger reads rather than smaller, cache the data) have
|
||||||
|
remained unchanged for a long, long time.
|
||||||
|
|
||||||
|
4. "Better to leave this to the OS vendor where possible, and
|
||||||
|
take advantage of the tuning they do." Well, sorry guys, but
|
||||||
|
have a look at the tuning they do. It hasn't changed in years,
|
||||||
|
except to remove now-unnecessary complexity related to really,
|
||||||
|
really old and slow disk devices, and to add a few things that
|
||||||
|
guess workload but still do a worse job than if the workload
|
||||||
|
generator just did its own optimisations in the first place.
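
[Illustrative note, not part of the original message.] Two of the basic
optimizations named in point 3, ordering reads by block number and doing
larger reads rather than smaller ones, can be sketched as a small
request-coalescing step. This is a hypothetical sketch, not Postgres code:
the function name coalesce_block_requests and the max_run limit are
assumptions made for the example.

```python
def coalesce_block_requests(block_numbers, max_run=8):
    """Sort requested block numbers and merge adjacent ones into runs.

    Returns a list of (start_block, block_count) tuples, each of which
    could be served by one larger read instead of several small ones.
    max_run caps how many blocks go into a single read.
    """
    runs = []
    for block in sorted(set(block_numbers)):
        if runs:
            start, count = runs[-1]
            # Extend the current run if this block directly follows it
            # and the run has not yet reached the size cap.
            if block == start + count and count < max_run:
                runs[-1] = (start, count + 1)
                continue
        runs.append((block, 1))
    return runs


requests = [7, 3, 4, 5, 20, 6]
print(coalesce_block_requests(requests))
```

With the sample input above, blocks 3 through 7 collapse into one
five-block read and block 20 stays a separate single-block read.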
|
||||||
|
|
||||||
|
> http://candle.pha.pa.us/mhonarc/todo.detail/optimizer/msg00011.html
|
||||||
|
|
||||||
|
Well, this one, with statements like "Postgres does have control
|
||||||
|
over its buffer cache," I don't know what to say. You can interpret
|
||||||
|
the statement however you like, but in the end Postgres has very little
|
||||||
|
control at all over how data is moved between memory and disk.
|
||||||
|
|
||||||
|
BTW, please don't take me as saying that all control over physical
|
||||||
|
IO should be done by Postgres. I just think that Postgres could do
|
||||||
|
a better job of managing data transfer between disk and memory than
|
||||||
|
the OS can. The rest of the things (using raw partitions, read-ahead,
|
||||||
|
free-behind, etc.) just drop out of that one idea.
|
||||||
|
|
||||||
|
cjs
|
||||||
|
--
|
||||||
|
Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org
|
||||||
|
Don't you know, in this new Dark Age, we're all light. --XTC
|
||||||
|
|
||||||
|
|
||||||
|