mirror of
				https://github.com/postgres/postgres.git
				synced 2025-10-26 00:02:18 -04:00 
			
		
		
		
	
		
			
				
	
	
		
			2627 lines
		
	
	
		
			114 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
			2627 lines
		
	
	
		
			114 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
| From pgsql-performance-owner+M17204@postgresql.org Wed Feb 15 16:28:34 2006
 | |
| Return-path: <pgsql-performance-owner+M17204@postgresql.org>
 | |
| Received: from ams.hub.org (ams.hub.org [200.46.204.13])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k1FLSV527014
 | |
| 	for <pgman@candle.pha.pa.us>; Wed, 15 Feb 2006 16:28:31 -0500 (EST)
 | |
| Received: from postgresql.org (postgresql.org [200.46.204.71])
 | |
| 	by ams.hub.org (Postfix) with ESMTP id 168C967B584;
 | |
| 	Wed, 15 Feb 2006 17:28:29 -0400 (AST)
 | |
| X-Original-To: pgsql-performance-postgresql.org@localhost.postgresql.org
 | |
| Received: from localhost (av.hub.org [200.46.204.144])
 | |
| 	by postgresql.org (Postfix) with ESMTP id BB0AB9DCB9E
 | |
| 	for <pgsql-performance-postgresql.org@localhost.postgresql.org>; Wed, 15 Feb 2006 17:27:56 -0400 (AST)
 | |
| Received: from postgresql.org ([200.46.204.71])
 | |
| 	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
 | |
| 	with ESMTP id 22055-07
 | |
| 	for <pgsql-performance-postgresql.org@localhost.postgresql.org>;
 | |
| 	Wed, 15 Feb 2006 17:27:57 -0400 (AST)
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Received: from sss.pgh.pa.us (sss.pgh.pa.us [66.207.139.130])
 | |
| 	by postgresql.org (Postfix) with ESMTP id F385E9DCB98
 | |
| 	for <pgsql-performance@postgresql.org>; Wed, 15 Feb 2006 17:27:53 -0400 (AST)
 | |
| Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
 | |
| 	by sss.pgh.pa.us (8.13.1/8.13.1) with ESMTP id k1FLRsqd019780;
 | |
| 	Wed, 15 Feb 2006 16:27:54 -0500 (EST)
 | |
| To: Gary Doades <gpd@gpdnet.co.uk>
 | |
| cc: pgsql-performance@postgresql.org
 | |
| Subject: Re: [PERFORM] Strange Create Index behaviour 
 | |
| In-Reply-To: <19510.1140036968@sss.pgh.pa.us> 
 | |
| References: <43F38867.6010701@gpdnet.co.uk> <19510.1140036968@sss.pgh.pa.us>
 | |
| Comments: In-reply-to Tom Lane <tgl@sss.pgh.pa.us>
 | |
| 	message dated "Wed, 15 Feb 2006 15:56:08 -0500"
 | |
| Date: Wed, 15 Feb 2006 16:27:54 -0500
 | |
| Message-ID: <19779.1140038874@sss.pgh.pa.us>
 | |
| From: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.11 required=5 tests=[AWL=0.110]
 | |
| X-Spam-Score: 0.11
 | |
| X-Mailing-List: pgsql-performance
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-performance>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-performance.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-performance-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-performance@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-performance>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-performance>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-performance-owner@postgresql.org
 | |
| Status: ORr
 | |
| 
 | |
| I wrote:
 | |
| > Interesting.  I tried your test script and got fairly close times
 | |
| > for all the cases on two different machines:
 | |
| > 	old HPUX machine: shortest 5800 msec, longest 7960 msec
 | |
| > 	new Fedora 4 machine: shortest 461 msec, longest 608 msec
 | |
| 
 | |
| > So what this looks like to me is a corner case that FreeBSD's qsort
 | |
| > fails to handle well.
 | |
| 
 | |
| I tried forcing PG to use src/port/qsort.c on the Fedora machine,
 | |
| and lo and behold:
 | |
| 	new Fedora 4 machine: shortest 434 msec, longest 8530 msec
 | |
| 
 | |
| So it sure looks like this script does expose a problem on BSD-derived
 | |
| qsorts.  Curiously, the case that's much the worst for me is the third
 | |
| in the script, while the shortest time is the first case, which was slow
 | |
| for Gary.  So I'd venture that the *BSD code has been tweaked somewhere
 | |
| along the way, in a manner that moves the problem around without really
 | |
| fixing it.  (Anyone want to compare the actual FreeBSD source to what
 | |
| we have?)
 | |
| 
 | |
| This is pretty relevant stuff, because there was a thread recently
 | |
| advocating that we stop using the platform qsort on all platforms:
 | |
| http://archives.postgresql.org/pgsql-hackers/2005-12/msg00610.php
 | |
| 
 | |
| It's really interesting to see a case where port/qsort is radically
 | |
| worse than other qsorts ... unless we figure that out and fix it,
 | |
| I think the idea of using port/qsort everywhere has just taken a
 | |
| major hit.
 | |
| 
 | |
| 			regards, tom lane
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 5: don't forget to increase your free space map settings
 | |
| 
 | |
| From pgsql-performance-owner+M17212@postgresql.org Wed Feb 15 18:29:07 2006
 | |
| Return-path: <pgsql-performance-owner+M17212@postgresql.org>
 | |
| Received: from ams.hub.org (ams.hub.org [200.46.204.13])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k1FNT6509074
 | |
| 	for <pgman@candle.pha.pa.us>; Wed, 15 Feb 2006 18:29:06 -0500 (EST)
 | |
| Received: from postgresql.org (postgresql.org [200.46.204.71])
 | |
| 	by ams.hub.org (Postfix) with ESMTP id 2BE6267B58B;
 | |
| 	Wed, 15 Feb 2006 19:29:04 -0400 (AST)
 | |
| X-Original-To: pgsql-performance-postgresql.org@localhost.postgresql.org
 | |
| Received: from localhost (av.hub.org [200.46.204.144])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 7C3D49DC803;
 | |
| 	Wed, 15 Feb 2006 19:28:30 -0400 (AST)
 | |
| Received: from postgresql.org ([200.46.204.71])
 | |
| 	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
 | |
| 	with ESMTP id 47149-10; Wed, 15 Feb 2006 19:28:32 -0400 (AST)
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Received: from sss.pgh.pa.us (sss.pgh.pa.us [66.207.139.130])
 | |
| 	by postgresql.org (Postfix) with ESMTP id C56AD9DC843;
 | |
| 	Wed, 15 Feb 2006 19:28:27 -0400 (AST)
 | |
| Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
 | |
| 	by sss.pgh.pa.us (8.13.1/8.13.1) with ESMTP id k1FNSTkm020782;
 | |
| 	Wed, 15 Feb 2006 18:28:29 -0500 (EST)
 | |
| To: Gary Doades <gpd@gpdnet.co.uk>
 | |
| cc: pgsql-performance@postgresql.org, pgsql-hackers@postgresql.org
 | |
| Subject: qsort again (was Re: [PERFORM] Strange Create Index behaviour)
 | |
| In-Reply-To: <43F39E53.1020009@gpdnet.co.uk> 
 | |
| References: <43F38867.6010701@gpdnet.co.uk> <19510.1140036968@sss.pgh.pa.us> <19779.1140038874@sss.pgh.pa.us> <43F39E53.1020009@gpdnet.co.uk>
 | |
| Comments: In-reply-to Gary Doades <gpd@gpdnet.co.uk>
 | |
| 	message dated "Wed, 15 Feb 2006 21:34:11 +0000"
 | |
| Date: Wed, 15 Feb 2006 18:28:29 -0500
 | |
| Message-ID: <20781.1140046109@sss.pgh.pa.us>
 | |
| From: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.11 required=5 tests=[AWL=0.110]
 | |
| X-Spam-Score: 0.11
 | |
| X-Mailing-List: pgsql-performance
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-performance>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-performance.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-performance-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-performance@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-performance>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-performance>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-performance-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| Gary Doades <gpd@gpdnet.co.uk> writes:
 | |
| > If I run the script again, it is not always the first case that is slow, 
 | |
| > it varies from run to run, which is why I repeated it quite a few times 
 | |
| > for the test.
 | |
| 
 | |
| For some reason I hadn't immediately twigged to the fact that your test
 | |
| script is just N repetitions of the exact same structure with random data.
 | |
| So it's not so surprising that you get random variations in behavior
 | |
| with different test data sets.
 | |
| 
 | |
| I did some experimentation comparing the qsort from Fedora Core 4
 | |
| (glibc-2.3.5-10.3) with our src/port/qsort.c.  For those who weren't
 | |
| following the pgsql-performance thread, the test case is just this
 | |
| repeated a lot of times:
 | |
| 
 | |
| create table atest(i int4, r int4);
 | |
| insert into atest (i,r) select generate_series(1,100000), 0;
 | |
| insert into atest (i,r) select generate_series(1,100000), random()*100000;
 | |
| \timing
 | |
| create index idx on atest(r);
 | |
| \timing
 | |
| drop table atest;
 | |
| 
 | |
| I did this 100 times and sorted the reported runtimes.  (Investigation
 | |
| with trace_sort = on confirms that the runtime is almost entirely spent
 | |
| in qsort() called from our performsort --- the Postgres overhead is
 | |
| about 100msec on this machine.)  Results are below.
 | |
| 
 | |
| It seems clear that our qsort.c is doing a pretty awful job of picking
 | |
| qsort pivots, while glibc is mostly managing not to make that mistake.
 | |
| I haven't looked at the glibc code yet to see what they are doing
 | |
| differently.
 | |
| 
 | |
| I'd say this puts a considerable damper on my enthusiasm for using our
 | |
| qsort all the time, as was recently debated in this thread:
 | |
| http://archives.postgresql.org/pgsql-hackers/2005-12/msg00610.php
 | |
| We need to fix our qsort.c before pushing ahead with that idea.
 | |
| 
 | |
| 			regards, tom lane
 | |
| 
 | |
| 
 | |
| 100 runtimes for glibc qsort, sorted ascending:
 | |
| 
 | |
| Time: 459.860 ms
 | |
| Time: 460.209 ms
 | |
| Time: 460.704 ms
 | |
| Time: 461.317 ms
 | |
| Time: 461.538 ms
 | |
| Time: 461.652 ms
 | |
| Time: 461.988 ms
 | |
| Time: 462.573 ms
 | |
| Time: 462.638 ms
 | |
| Time: 462.716 ms
 | |
| Time: 462.917 ms
 | |
| Time: 463.219 ms
 | |
| Time: 463.455 ms
 | |
| Time: 463.650 ms
 | |
| Time: 463.723 ms
 | |
| Time: 463.737 ms
 | |
| Time: 463.750 ms
 | |
| Time: 463.852 ms
 | |
| Time: 463.964 ms
 | |
| Time: 463.988 ms
 | |
| Time: 464.003 ms
 | |
| Time: 464.135 ms
 | |
| Time: 464.372 ms
 | |
| Time: 464.458 ms
 | |
| Time: 464.496 ms
 | |
| Time: 464.551 ms
 | |
| Time: 464.599 ms
 | |
| Time: 464.655 ms
 | |
| Time: 464.656 ms
 | |
| Time: 464.722 ms
 | |
| Time: 464.814 ms
 | |
| Time: 464.827 ms
 | |
| Time: 464.878 ms
 | |
| Time: 464.899 ms
 | |
| Time: 464.905 ms
 | |
| Time: 464.987 ms
 | |
| Time: 465.055 ms
 | |
| Time: 465.138 ms
 | |
| Time: 465.159 ms
 | |
| Time: 465.194 ms
 | |
| Time: 465.310 ms
 | |
| Time: 465.316 ms
 | |
| Time: 465.375 ms
 | |
| Time: 465.450 ms
 | |
| Time: 465.535 ms
 | |
| Time: 465.595 ms
 | |
| Time: 465.680 ms
 | |
| Time: 465.769 ms
 | |
| Time: 465.865 ms
 | |
| Time: 465.892 ms
 | |
| Time: 465.903 ms
 | |
| Time: 466.003 ms
 | |
| Time: 466.154 ms
 | |
| Time: 466.164 ms
 | |
| Time: 466.203 ms
 | |
| Time: 466.305 ms
 | |
| Time: 466.344 ms
 | |
| Time: 466.364 ms
 | |
| Time: 466.388 ms
 | |
| Time: 466.502 ms
 | |
| Time: 466.593 ms
 | |
| Time: 466.725 ms
 | |
| Time: 466.794 ms
 | |
| Time: 466.798 ms
 | |
| Time: 466.904 ms
 | |
| Time: 466.971 ms
 | |
| Time: 466.997 ms
 | |
| Time: 467.122 ms
 | |
| Time: 467.146 ms
 | |
| Time: 467.221 ms
 | |
| Time: 467.224 ms
 | |
| Time: 467.244 ms
 | |
| Time: 467.277 ms
 | |
| Time: 467.587 ms
 | |
| Time: 468.142 ms
 | |
| Time: 468.207 ms
 | |
| Time: 468.237 ms
 | |
| Time: 468.471 ms
 | |
| Time: 468.663 ms
 | |
| Time: 468.700 ms
 | |
| Time: 469.235 ms
 | |
| Time: 469.840 ms
 | |
| Time: 470.472 ms
 | |
| Time: 471.140 ms
 | |
| Time: 472.811 ms
 | |
| Time: 472.959 ms
 | |
| Time: 474.858 ms
 | |
| Time: 477.210 ms
 | |
| Time: 479.571 ms
 | |
| Time: 479.671 ms
 | |
| Time: 482.797 ms
 | |
| Time: 488.852 ms
 | |
| Time: 514.639 ms
 | |
| Time: 529.287 ms
 | |
| Time: 612.185 ms
 | |
| Time: 660.748 ms
 | |
| Time: 742.227 ms
 | |
| Time: 866.814 ms
 | |
| Time: 1234.848 ms
 | |
| Time: 1267.398 ms
 | |
| 
 | |
| 
 | |
| 100 runtimes for port/qsort.c, sorted ascending:
 | |
| 
 | |
| Time: 418.905 ms
 | |
| Time: 420.611 ms
 | |
| Time: 420.764 ms
 | |
| Time: 420.904 ms
 | |
| Time: 421.706 ms
 | |
| Time: 422.466 ms
 | |
| Time: 422.627 ms
 | |
| Time: 423.189 ms
 | |
| Time: 423.302 ms
 | |
| Time: 425.096 ms
 | |
| Time: 425.731 ms
 | |
| Time: 425.851 ms
 | |
| Time: 427.253 ms
 | |
| Time: 430.113 ms
 | |
| Time: 432.756 ms
 | |
| Time: 432.963 ms
 | |
| Time: 440.502 ms
 | |
| Time: 440.640 ms
 | |
| Time: 450.452 ms
 | |
| Time: 458.143 ms
 | |
| Time: 459.212 ms
 | |
| Time: 467.706 ms
 | |
| Time: 468.006 ms
 | |
| Time: 468.574 ms
 | |
| Time: 470.003 ms
 | |
| Time: 472.313 ms
 | |
| Time: 483.622 ms
 | |
| Time: 492.395 ms
 | |
| Time: 509.564 ms
 | |
| Time: 531.037 ms
 | |
| Time: 533.366 ms
 | |
| Time: 535.610 ms
 | |
| Time: 575.523 ms
 | |
| Time: 582.688 ms
 | |
| Time: 593.545 ms
 | |
| Time: 647.364 ms
 | |
| Time: 660.612 ms
 | |
| Time: 677.312 ms
 | |
| Time: 680.288 ms
 | |
| Time: 697.626 ms
 | |
| Time: 833.066 ms
 | |
| Time: 834.511 ms
 | |
| Time: 851.819 ms
 | |
| Time: 920.443 ms
 | |
| Time: 926.731 ms
 | |
| Time: 954.289 ms
 | |
| Time: 1045.214 ms
 | |
| Time: 1059.200 ms
 | |
| Time: 1062.328 ms
 | |
| Time: 1136.018 ms
 | |
| Time: 1260.091 ms
 | |
| Time: 1276.883 ms
 | |
| Time: 1319.351 ms
 | |
| Time: 1438.854 ms
 | |
| Time: 1475.457 ms
 | |
| Time: 1538.211 ms
 | |
| Time: 1549.004 ms
 | |
| Time: 1744.642 ms
 | |
| Time: 1771.258 ms
 | |
| Time: 1959.530 ms
 | |
| Time: 2300.140 ms
 | |
| Time: 2589.641 ms
 | |
| Time: 2612.780 ms
 | |
| Time: 3100.024 ms
 | |
| Time: 3284.125 ms
 | |
| Time: 3379.792 ms
 | |
| Time: 3750.278 ms
 | |
| Time: 4302.278 ms
 | |
| Time: 4780.624 ms
 | |
| Time: 5000.056 ms
 | |
| Time: 5092.604 ms
 | |
| Time: 5168.722 ms
 | |
| Time: 5292.941 ms
 | |
| Time: 5895.964 ms
 | |
| Time: 7003.164 ms
 | |
| Time: 7099.449 ms
 | |
| Time: 7115.083 ms
 | |
| Time: 7384.940 ms
 | |
| Time: 8214.010 ms
 | |
| Time: 8700.771 ms
 | |
| Time: 9331.225 ms
 | |
| Time: 10503.360 ms
 | |
| Time: 12496.026 ms
 | |
| Time: 12982.474 ms
 | |
| Time: 15192.390 ms
 | |
| Time: 15392.161 ms
 | |
| Time: 15958.295 ms
 | |
| Time: 18375.693 ms
 | |
| Time: 18617.706 ms
 | |
| Time: 18927.515 ms
 | |
| Time: 19898.018 ms
 | |
| Time: 20865.979 ms
 | |
| Time: 21000.907 ms
 | |
| Time: 21297.585 ms
 | |
| Time: 21714.518 ms
 | |
| Time: 25423.235 ms
 | |
| Time: 27543.052 ms
 | |
| Time: 28314.182 ms
 | |
| Time: 29400.278 ms
 | |
| Time: 34142.534 ms
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 1: if posting/reading through Usenet, please send an appropriate
 | |
|        subscribe-nomail command to majordomo@postgresql.org so that your
 | |
|        message can get through to the mailing list cleanly
 | |
| 
 | |
| From pgsql-hackers-owner+M79733@postgresql.org Wed Feb 15 20:22:07 2006
 | |
| Return-path: <pgsql-hackers-owner+M79733@postgresql.org>
 | |
| Received: from ams.hub.org (ams.hub.org [200.46.204.13])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k1G1M6529533
 | |
| 	for <pgman@candle.pha.pa.us>; Wed, 15 Feb 2006 20:22:06 -0500 (EST)
 | |
| Received: from postgresql.org (postgresql.org [200.46.204.71])
 | |
| 	by ams.hub.org (Postfix) with ESMTP id E5C5467B58F;
 | |
| 	Wed, 15 Feb 2006 21:22:03 -0400 (AST)
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| Received: from localhost (av.hub.org [200.46.204.144])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 3DAA69DCACE;
 | |
| 	Wed, 15 Feb 2006 21:21:34 -0400 (AST)
 | |
| Received: from postgresql.org ([200.46.204.71])
 | |
| 	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
 | |
| 	with ESMTP id 76351-01; Wed, 15 Feb 2006 21:21:36 -0400 (AST)
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Received: from sss.pgh.pa.us (sss.pgh.pa.us [66.207.139.130])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 2FBB59DCA3F;
 | |
| 	Wed, 15 Feb 2006 21:21:31 -0400 (AST)
 | |
| Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
 | |
| 	by sss.pgh.pa.us (8.13.1/8.13.1) with ESMTP id k1G1LXXi021616;
 | |
| 	Wed, 15 Feb 2006 20:21:33 -0500 (EST)
 | |
| To: Ron <rjpeace@earthlink.net>
 | |
| cc: pgsql-performance@postgresql.org, pgsql-hackers@postgresql.org
 | |
| Subject: Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour) 
 | |
| In-Reply-To: <7.0.1.0.2.20060215194635.03b55da0@earthlink.net> 
 | |
| References: <43F38867.6010701@gpdnet.co.uk> <19510.1140036968@sss.pgh.pa.us> <19779.1140038874@sss.pgh.pa.us> <43F39E53.1020009@gpdnet.co.uk> <20781.1140046109@sss.pgh.pa.us> <7.0.1.0.2.20060215194635.03b55da0@earthlink.net>
 | |
| Comments: In-reply-to Ron <rjpeace@earthlink.net>
 | |
| 	message dated "Wed, 15 Feb 2006 19:57:51 -0500"
 | |
| Date: Wed, 15 Feb 2006 20:21:33 -0500
 | |
| Message-ID: <21615.1140052893@sss.pgh.pa.us>
 | |
| From: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.11 required=5 tests=[AWL=0.110]
 | |
| X-Spam-Score: 0.11
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| Ron <rjpeace@earthlink.net> writes:
 | |
| > How are we choosing our pivots?
 | |
| 
 | |
| See qsort.c: it looks like median of nine equally spaced inputs (ie,
 | |
| the 1/8th points of the initial input array, plus the end points),
 | |
| implemented as two rounds of median-of-three choices.  With half of the
 | |
| data inputs zero, it's not too improbable for two out of the three
 | |
| samples to be zeroes in which case I think the med3 result will be zero
 | |
| --- so choosing a pivot of zero is much more probable than one would
 | |
| like, and doing so in many levels of recursion causes the problem.
 | |
| 
 | |
| I think.  I'm not too sure if the code isn't just being sloppy about the
 | |
| case where many data values are equal to the pivot --- there's a special
 | |
| case there to switch to insertion sort, and maybe that's getting invoked
 | |
| too soon.  It'd be useful to get a line-level profile of the behavior of
 | |
| this code in the slow cases...
 | |
| 
 | |
| 			regards, tom lane
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 3: Have you checked our extensive FAQ?
 | |
| 
 | |
|                http://www.postgresql.org/docs/faq
 | |
| 
 | |
| From pgsql-performance-owner+M17282@postgresql.org Fri Feb 17 23:11:11 2006
 | |
| Return-path: <pgsql-performance-owner+M17282@postgresql.org>
 | |
| Received: from ams.hub.org (ams.hub.org [200.46.204.13])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k1I4BA515503
 | |
| 	for <pgman@candle.pha.pa.us>; Fri, 17 Feb 2006 23:11:10 -0500 (EST)
 | |
| Received: from postgresql.org (postgresql.org [200.46.204.71])
 | |
| 	by ams.hub.org (Postfix) with ESMTP id 2825F67B5F5;
 | |
| 	Sat, 18 Feb 2006 00:11:07 -0400 (AST)
 | |
| X-Original-To: pgsql-performance-postgresql.org@localhost.postgresql.org
 | |
| Received: from localhost (av.hub.org [200.46.204.144])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 7BB8A9DCC4F;
 | |
| 	Wed, 15 Feb 2006 21:37:57 -0400 (AST)
 | |
| Received: from postgresql.org ([200.46.204.71])
 | |
| 	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
 | |
| 	with ESMTP id 79365-02; Wed, 15 Feb 2006 21:38:00 -0400 (AST)
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Received: from postal.corporate.connx.com (postal.corporate.connx.com [65.212.159.187])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 33BEA9DCACE;
 | |
| 	Wed, 15 Feb 2006 21:37:54 -0400 (AST)
 | |
| X-MimeOLE: Produced By Microsoft Exchange V6.5
 | |
| Content-class: urn:content-classes:message
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/plain;
 | |
| 	charset="us-ascii"
 | |
| Subject: Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour) 
 | |
| Date: Wed, 15 Feb 2006 17:37:58 -0800
 | |
| Message-ID: <D425483C2C5C9F49B5B7A41F8944154757D54C@postal.corporate.connx.com>
 | |
| Thread-Topic: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour) 
 | |
| Thread-Index: AcYyl2fPgxfNXHIRRyOEN4ZGeHtA3wAAEaNQ
 | |
| From: "Dann Corbit" <DCorbit@connx.com>
 | |
| To: "Tom Lane" <tgl@sss.pgh.pa.us>, "Ron" <rjpeace@earthlink.net>
 | |
| cc: <pgsql-performance@postgresql.org>, <pgsql-hackers@postgresql.org>
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.075 required=5 tests=[AWL=0.075]
 | |
| X-Spam-Score: 0.075
 | |
| X-Mailing-List: pgsql-performance
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-performance>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-performance.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-performance-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-performance@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-performance>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-performance>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-performance-owner@postgresql.org
 | |
| Content-Transfer-Encoding: 8bit
 | |
| X-MIME-Autoconverted: from quoted-printable to 8bit by candle.pha.pa.us id k1I4BA515503
 | |
| Status: ORr
 | |
| 
 | |
| 
 | |
| 
 | |
| > -----Original Message-----
 | |
| > From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
 | |
| > owner@postgresql.org] On Behalf Of Tom Lane
 | |
| > Sent: Wednesday, February 15, 2006 5:22 PM
 | |
| > To: Ron
 | |
| > Cc: pgsql-performance@postgresql.org; pgsql-hackers@postgresql.org
 | |
| > Subject: Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create
 | |
| Index
 | |
| > behaviour)
 | |
| > 
 | |
| > Ron <rjpeace@earthlink.net> writes:
 | |
| > > How are we choosing our pivots?
 | |
| > 
 | |
| > See qsort.c: it looks like median of nine equally spaced inputs (ie,
 | |
| > the 1/8th points of the initial input array, plus the end points),
 | |
| > implemented as two rounds of median-of-three choices.  With half of
 | |
| the
 | |
| > data inputs zero, it's not too improbable for two out of the three
 | |
| > samples to be zeroes in which case I think the med3 result will be
 | |
| zero
 | |
| > --- so choosing a pivot of zero is much more probable than one would
 | |
| > like, and doing so in many levels of recursion causes the problem.
 | |
| 
 | |
| Adding some randomness to the selection of the pivot is a known
 | |
| technique to fix the oddball partitions problem.  However, Bentley and
 | |
| Sedgewick proved that every quick sort algorithm has some input set that
 | |
| makes it go quadratic (hence the recent popularity of introspective
 | |
| sort, which switches to heapsort if quadratic behavior is detected.  The
 | |
| C++ template I submitted was an example of introspective sort, but
 | |
| PostgreSQL does not use C++ so it was not helpful).
 | |
| 
 | |
| > I think.  I'm not too sure if the code isn't just being sloppy about
 | |
| the
 | |
| > case where many data values are equal to the pivot --- there's a
 | |
| special
 | |
| > case there to switch to insertion sort, and maybe that's getting
 | |
| invoked
 | |
| > too soon.  
 | |
| 
 | |
| Here are some cases known to make qsort go quadratic:
 | |
| 1. Data already sorted
 | |
| 2. Data reverse sorted
 | |
| 3. Data organ-pipe sorted or ramp
 | |
| 4. Almost all data of the same value
 | |
| 
 | |
| There are probably other cases.  Randomizing the pivot helps some, as
 | |
| does check for in-order or reverse order partitions.
 | |
| 
 | |
| Imagine if 1/3 of the partitions fall into a category that causes
 | |
| quadratic behavior (have one of the above formats and have more than
 | |
| CUTOFF elements in them).
 | |
| 
 | |
| It is doubtful that the switch to insertion sort is causing any sort of
 | |
| problems.  It is only going to be invoked on tiny sets, for which it has
 | |
| a fixed cost that is probably less that qsort() function calls on sets
 | |
| of the same size.
 | |
| 
 | |
| >It'd be useful to get a line-level profile of the behavior of
 | |
| > this code in the slow cases...
 | |
| 
 | |
| I guess that my in-order or presorted tests [which often arise when
 | |
| there are very few distinct values] may solve the bad partition
 | |
| problems.  Don't forget that the algorithm is called recursively.
 | |
| 
 | |
| > 			regards, tom lane
 | |
| > 
 | |
| > ---------------------------(end of
 | |
| broadcast)---------------------------
 | |
| > TIP 3: Have you checked our extensive FAQ?
 | |
| > 
 | |
| >                http://www.postgresql.org/docs/faq
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 2: Don't 'kill -9' the postmaster
 | |
| 
 | |
| From kleptog@svana.org Mon Dec 19 06:37:51 2005
 | |
| Return-path: <kleptog@svana.org>
 | |
| Received: from svana.org (mail@svana.org [203.20.62.76])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id jBJBboe20936
 | |
| 	for <pgman@candle.pha.pa.us>; Mon, 19 Dec 2005 06:37:51 -0500 (EST)
 | |
| Received: from kleptog by svana.org with local (Exim 3.35 #1 (Debian))
 | |
| 	id 1EoJKc-00045V-00; Mon, 19 Dec 2005 22:37:30 +1100
 | |
| Date: Mon, 19 Dec 2005 12:37:30 +0100
 | |
| From: Martijn van Oosterhout <kleptog@svana.org>
 | |
| To: Dann Corbit <DCorbit@connx.com>
 | |
| cc: Tom Lane <tgl@sss.pgh.pa.us>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
 | |
|    Bruce Momjian <pgman@candle.pha.pa.us>,
 | |
|    Luke Lonergan <llonergan@greenplum.com>, Neil Conway <neilc@samurai.com>,
 | |
|    pgsql-hackers@postgresql.org
 | |
| Subject: Re: [HACKERS] Re: Which qsort is used
 | |
| Message-ID: <20051219113724.GD12251@svana.org>
 | |
| Reply-To: Martijn van Oosterhout <kleptog@svana.org>
 | |
| References: <D425483C2C5C9F49B5B7A41F8944154757D38D@postal.corporate.connx.com>
 | |
| MIME-Version: 1.0
 | |
| Content-Type: multipart/signed; micalg=pgp-sha1;
 | |
| 	protocol="application/pgp-signature"; boundary="5gxpn/Q6ypwruk0T"
 | |
| Content-Disposition: inline
 | |
| In-Reply-To: <D425483C2C5C9F49B5B7A41F8944154757D38D@postal.corporate.connx.com>
 | |
| User-Agent: Mutt/1.3.28i
 | |
| X-PGP-Key-ID: Length=1024; ID=0x0DC67BE6
 | |
| X-PGP-Key-Fingerprint: 295F A899 A81A 156D B522  48A7 6394 F08A 0DC6 7BE6
 | |
| X-PGP-Key-URL: <http://svana.org/kleptog/0DC67BE6.pgp.asc>
 | |
| Status: OR
 | |
| 
 | |
| 
 | |
| --5gxpn/Q6ypwruk0T
 | |
| Content-Type: text/plain; charset=us-ascii
 | |
| Content-Disposition: inline
 | |
| Content-Transfer-Encoding: quoted-printable
 | |
| 
 | |
| On Fri, Dec 16, 2005 at 10:43:58PM -0800, Dann Corbit wrote:
 | |
| > I am actually quite impressed with the excellence of Bentley's sort out
 | |
| > of the box.  It's definitely the best library implementation of a sort I
 | |
| > have seen.
 | |
| 
 | |
| I'm not sure whether we have a conclusion here, but I do have one
 | |
| question: is there a significant difference in the number of times the
 | |
| comparison routines are called? Comparisons in PostgreSQL are fairly
 | |
| expensive given the fmgr overhead and when comparing tuples it's even
 | |
| worse.
 | |
| 
 | |
| We don't want to accedently pick a routine that saves data shuffling by
 | |
| adding extra comparisons. The stats at [1] don't say. They try to
 | |
| factor in CPU cost but they seem to use unrealistically small values. I
 | |
| would think a number around 50 (or higher) would be more
 | |
| representative.
 | |
| 
 | |
| [1] http://www.cs.toronto.edu/~zhouqq/postgresql/sort/sort.html
 | |
| 
 | |
| Have a nice day,
 | |
| --=20
 | |
| Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
 | |
| > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 | |
| > tool for doing 5% of the work and then sitting around waiting for someone
 | |
| > else to do the other 95% so you can sue them.
 | |
| 
 | |
| --5gxpn/Q6ypwruk0T
 | |
| Content-Type: application/pgp-signature
 | |
| Content-Disposition: inline
 | |
| 
 | |
| -----BEGIN PGP SIGNATURE-----
 | |
| Version: GnuPG v1.0.6 (GNU/Linux)
 | |
| Comment: For info see http://www.gnupg.org
 | |
| 
 | |
| iD8DBQFDpptzIB7bNG8LQkwRAmC6AJ4qYrIm3SYnBV3BybSmm+Gl4vpEywCfRDxg
 | |
| bnIK4INRqOVFNBAKR/gDPcM=
 | |
| =92qA
 | |
| -----END PGP SIGNATURE-----
 | |
| 
 | |
| --5gxpn/Q6ypwruk0T--
 | |
| 
 | |
| From mkoi-pg@aon.at Wed Dec 21 19:44:03 2005
 | |
| Return-path: <mkoi-pg@aon.at>
 | |
| Received: from email.aon.at (warsl404pip5.highway.telekom.at [195.3.96.77])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id jBM0i2e05649
 | |
| 	for <pgman@candle.pha.pa.us>; Wed, 21 Dec 2005 19:44:02 -0500 (EST)
 | |
| Received: (qmail 12703 invoked from network); 22 Dec 2005 00:43:51 -0000
 | |
| Received: from m148p015.dipool.highway.telekom.at (HELO Sokrates) ([62.46.8.111])
 | |
|           (envelope-sender <mkoi-pg@aon.at>)
 | |
|           by smarthub78.highway.telekom.at (qmail-ldap-1.03) with SMTP
 | |
|           for <tgl@sss.pgh.pa.us>; 22 Dec 2005 00:43:51 -0000
 | |
| From: Manfred Koizar <mkoi-pg@aon.at>
 | |
| To: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| cc: "Dann Corbit" <DCorbit@connx.com>, "Qingqing Zhou" <zhouqq@cs.toronto.edu>,
 | |
|    "Bruce Momjian" <pgman@candle.pha.pa.us>,
 | |
|    "Luke Lonergan" <llonergan@greenplum.com>,
 | |
|    "Neil Conway" <neilc@samurai.com>, pgsql-hackers@postgresql.org
 | |
| Subject: Re: [HACKERS] Re: Which qsort is used 
 | |
| Date: Thu, 22 Dec 2005 01:43:34 +0100
 | |
| Message-ID: <odqjq1tv6cb77ri4df0aehqal8o0ljtkar@4ax.com>
 | |
| References: <D425483C2C5C9F49B5B7A41F8944154757D386@postal.corporate.connx.com> <3148.1134795805@sss.pgh.pa.us>
 | |
| In-Reply-To: <3148.1134795805@sss.pgh.pa.us>
 | |
| X-Mailer: Forte Agent 3.1/32.783
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/plain; charset=us-ascii
 | |
| Content-Transfer-Encoding: 7bit
 | |
| Status: OR
 | |
| 
 | |
| On Sat, 17 Dec 2005 00:03:25 -0500, Tom Lane <tgl@sss.pgh.pa.us>
 | |
| wrote:
 | |
| >I've still got a problem with these checks; I think they are a net
 | |
| >waste of cycles on average.  [...]
 | |
| > and when they fail, those cycles are entirely wasted;
 | |
| >you have not advanced the state of the sort at all.
 | |
| 
 | |
| How can we make the initial check "adavance the state of the sort"?
 | |
| One answer might be to exclude the sorted sequence at the start of the
 | |
| array from the qsort, and merge the two sorted lists as the final
 | |
| stage of the sort.
 | |
| 
 | |
| Qsorting N elements costs O(N*lnN), so excluding H elements from the
 | |
| sort reduces the cost by at least O(H*lnN).  The merge step costs O(N)
 | |
| plus some (<=50%) more memory, unless someone knows a fast in-place
 | |
| merge.  So depending on the constant factors involved there might be a
 | |
| usable solution.
 | |
| 
 | |
| I've been playing with some numbers and assuming the constant factors
 | |
| to be equal for all the O()'s this method starts to pay off at
 | |
| 	  H      for N
 | |
| 	  20       100
 | |
| 	 130      1000
 | |
| 	8000    100000
 | |
| Servus
 | |
|  Manfred
 | |
| 
 | |
| From pgsql-hackers-owner+M77795=pgman=candle.pha.pa.us@postgresql.org Thu Dec 22 02:02:28 2005
 | |
| Return-path: <pgsql-hackers-owner+M77795=pgman=candle.pha.pa.us@postgresql.org>
 | |
| Received: from ams.hub.org (ams.hub.org [200.46.204.13])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id jBM72Re16910
 | |
| 	for <pgman@candle.pha.pa.us>; Thu, 22 Dec 2005 02:02:28 -0500 (EST)
 | |
| Received: from postgresql.org (postgresql.org [200.46.204.71])
 | |
| 	by ams.hub.org (Postfix) with ESMTP id A31E067AAA0
 | |
| 	for <pgman@candle.pha.pa.us>; Thu, 22 Dec 2005 03:02:22 -0400 (AST)
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| Received: from localhost (av.hub.org [200.46.204.144])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 2C8EC9DCA92
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>; Thu, 22 Dec 2005 03:01:56 -0400 (AST)
 | |
| Received: from postgresql.org ([200.46.204.71])
 | |
| 	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
 | |
| 	with ESMTP id 26033-04
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>;
 | |
| 	Thu, 22 Dec 2005 03:01:55 -0400 (AST)
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Received: from svana.org (svana.org [203.20.62.76])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 800859DC81D
 | |
| 	for <pgsql-hackers@postgresql.org>; Thu, 22 Dec 2005 03:01:51 -0400 (AST)
 | |
| Received: from kleptog by svana.org with local (Exim 3.35 #1 (Debian))
 | |
| 	id 1EpKRg-0005ox-00; Thu, 22 Dec 2005 18:01:00 +1100
 | |
| Date: Thu, 22 Dec 2005 08:01:00 +0100
 | |
| From: Martijn van Oosterhout <kleptog@svana.org>
 | |
| To: Manfred Koizar <mkoi-pg@aon.at>
 | |
| cc: Tom Lane <tgl@sss.pgh.pa.us>, Dann Corbit <DCorbit@connx.com>,
 | |
|    Qingqing Zhou <zhouqq@cs.toronto.edu>,
 | |
|    Bruce Momjian <pgman@candle.pha.pa.us>,
 | |
|    Luke Lonergan <llonergan@greenplum.com>, Neil Conway <neilc@samurai.com>,
 | |
|    pgsql-hackers@postgresql.org
 | |
| Subject: Re: [HACKERS] Re: Which qsort is used
 | |
| Message-ID: <20051222070057.GA21783@svana.org>
 | |
| Reply-To: Martijn van Oosterhout <kleptog@svana.org>
 | |
| References: <D425483C2C5C9F49B5B7A41F8944154757D386@postal.corporate.connx.com> <3148.1134795805@sss.pgh.pa.us> <odqjq1tv6cb77ri4df0aehqal8o0ljtkar@4ax.com>
 | |
| MIME-Version: 1.0
 | |
| Content-Type: multipart/signed; micalg=pgp-sha1;
 | |
| 	protocol="application/pgp-signature"; boundary="FL5UXtIhxfXey3p5"
 | |
| Content-Disposition: inline
 | |
| In-Reply-To: <odqjq1tv6cb77ri4df0aehqal8o0ljtkar@4ax.com>
 | |
| User-Agent: Mutt/1.3.28i
 | |
| X-PGP-Key-ID: Length=1024; ID=0x0DC67BE6
 | |
| X-PGP-Key-Fingerprint: 295F A899 A81A 156D B522  48A7 6394 F08A 0DC6 7BE6
 | |
| X-PGP-Key-URL: <http://svana.org/kleptog/0DC67BE6.pgp.asc>
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.065 required=5 tests=[AWL=0.065]
 | |
| X-Spam-Score: 0.065
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| 
 | |
| --FL5UXtIhxfXey3p5
 | |
| Content-Type: text/plain; charset=us-ascii
 | |
| Content-Disposition: inline
 | |
| Content-Transfer-Encoding: quoted-printable
 | |
| 
 | |
| On Thu, Dec 22, 2005 at 01:43:34AM +0100, Manfred Koizar wrote:
 | |
| > Qsorting N elements costs O(N*lnN), so excluding H elements from the
 | |
| > sort reduces the cost by at least O(H*lnN).  The merge step costs O(N)
 | |
| > plus some (<=3D50%) more memory, unless someone knows a fast in-place
 | |
| > merge.  So depending on the constant factors involved there might be a
 | |
| > usable solution.
 | |
| 
 | |
| But where are you including the cost to check how many cells are
 | |
| already sorted? That would be O(H), right? This is where we come back
 | |
| to the issue that comparisons in PostgreSQL are expensive. The cpu_cost
 | |
| in the tests I saw so far is unrealistically low.
 | |
| 
 | |
| > I've been playing with some numbers and assuming the constant factors
 | |
| > to be equal for all the O()'s this method starts to pay off at
 | |
| > 	  H      for N
 | |
| > 	  20       100      20%
 | |
| > 	 130      1000      13%
 | |
| > 	8000    100000       8%
 | |
| 
 | |
| Hmm, what are the chances you have 100000 unordered items to sort and
 | |
| that the first 8% will already be in order. ISTM that that probability
 | |
| will be close enough to zero to not matter...
 | |
| 
 | |
| Have a nice day,
 | |
| --=20
 | |
| Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
 | |
| > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 | |
| > tool for doing 5% of the work and then sitting around waiting for someone
 | |
| > else to do the other 95% so you can sue them.
 | |
| 
 | |
| --FL5UXtIhxfXey3p5
 | |
| Content-Type: application/pgp-signature
 | |
| Content-Disposition: inline
 | |
| 
 | |
| -----BEGIN PGP SIGNATURE-----
 | |
| Version: GnuPG v1.0.6 (GNU/Linux)
 | |
| Comment: For info see http://www.gnupg.org
 | |
| 
 | |
| iD8DBQFDqk8oIB7bNG8LQkwRAjJhAJ47eXRi1DJ02cfKcnN2iPkaBB0eaQCeIiF+
 | |
| HOAYIPQrU2gpUUiGT3aGUUw=
 | |
| =R0hU
 | |
| -----END PGP SIGNATURE-----
 | |
| 
 | |
| --FL5UXtIhxfXey3p5--
 | |
| 
 | |
| From pgsql-hackers-owner+M77831=pgman=candle.pha.pa.us@postgresql.org Thu Dec 22 16:59:19 2005
 | |
| Return-path: <pgsql-hackers-owner+M77831=pgman=candle.pha.pa.us@postgresql.org>
 | |
| Received: from ams.hub.org (ams.hub.org [200.46.204.13])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id jBMLxJe07480
 | |
| 	for <pgman@candle.pha.pa.us>; Thu, 22 Dec 2005 16:59:19 -0500 (EST)
 | |
| Received: from postgresql.org (postgresql.org [200.46.204.71])
 | |
| 	by ams.hub.org (Postfix) with ESMTP id D1DBE67AC1B
 | |
| 	for <pgman@candle.pha.pa.us>; Thu, 22 Dec 2005 17:59:16 -0400 (AST)
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| Received: from localhost (av.hub.org [200.46.204.144])
 | |
| 	by postgresql.org (Postfix) with ESMTP id BE8249DCBEB
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>; Thu, 22 Dec 2005 17:58:53 -0400 (AST)
 | |
| Received: from postgresql.org ([200.46.204.71])
 | |
| 	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
 | |
| 	with ESMTP id 64765-01
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>;
 | |
| 	Thu, 22 Dec 2005 17:58:54 -0400 (AST)
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Received: from email.aon.at (warsl404pip7.highway.telekom.at [195.3.96.91])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 3E08E9DCA5C
 | |
| 	for <pgsql-hackers@postgresql.org>; Thu, 22 Dec 2005 17:58:49 -0400 (AST)
 | |
| Received: (qmail 6986 invoked from network); 22 Dec 2005 21:58:49 -0000
 | |
| Received: from m150p015.dipool.highway.telekom.at (HELO Sokrates) ([62.46.8.175])
 | |
|           (envelope-sender <mkoi-pg@aon.at>)
 | |
|           by smarthub76.highway.telekom.at (qmail-ldap-1.03) with SMTP
 | |
|           for <kleptog@svana.org>; 22 Dec 2005 21:58:49 -0000
 | |
| From: Manfred Koizar <mkoi-pg@aon.at>
 | |
| To: Martijn van Oosterhout <kleptog@svana.org>
 | |
| cc: Tom Lane <tgl@sss.pgh.pa.us>, Dann Corbit <DCorbit@connx.com>,
 | |
|    Qingqing Zhou <zhouqq@cs.toronto.edu>,
 | |
|    Bruce Momjian <pgman@candle.pha.pa.us>,
 | |
|    Luke Lonergan <llonergan@greenplum.com>, Neil Conway <neilc@samurai.com>,
 | |
|    pgsql-hackers@postgresql.org
 | |
| Subject: Re: [HACKERS] Re: Which qsort is used
 | |
| Date: Thu, 22 Dec 2005 22:58:31 +0100
 | |
| Message-ID: <4r6mq19fe6937mu9130h45ip3oeg135qo3@4ax.com>
 | |
| References: <D425483C2C5C9F49B5B7A41F8944154757D386@postal.corporate.connx.com> <3148.1134795805@sss.pgh.pa.us> <odqjq1tv6cb77ri4df0aehqal8o0ljtkar@4ax.com> <20051222070057.GA21783@svana.org>
 | |
| In-Reply-To: <20051222070057.GA21783@svana.org>
 | |
| X-Mailer: Forte Agent 3.1/32.783
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/plain; charset=us-ascii
 | |
| Content-Transfer-Encoding: 7bit
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.398 required=5 tests=[AWL=0.398]
 | |
| X-Spam-Score: 0.398
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| On Thu, 22 Dec 2005 08:01:00 +0100, Martijn van Oosterhout
 | |
| <kleptog@svana.org> wrote:
 | |
| >But where are you including the cost to check how many cells are
 | |
| >already sorted? That would be O(H), right?
 | |
| 
 | |
| Yes.  I didn't mention it, because H < N.
 | |
| 
 | |
| > This is where we come back
 | |
| >to the issue that comparisons in PostgreSQL are expensive.
 | |
| 
 | |
| So we agree that we should try to reduce the number of comparisons.
 | |
| How many comparisons does it take to sort 100000 items?  1.5 million?
 | |
| 
 | |
| >Hmm, what are the chances you have 100000 unordered items to sort and
 | |
| >that the first 8% will already be in order. ISTM that that probability
 | |
| >will be close enough to zero to not matter...
 | |
| 
 | |
| If the items are totally unordered, the check is so cheap you won't
 | |
| even notice.  OTOH in Tom's example ...
 | |
| 
 | |
| |What I think is much more probable in the Postgres environment
 | |
| |is almost-but-not-quite-ordered inputs --- eg, a table that was
 | |
| |perfectly ordered by key when filled, but some of the tuples have since
 | |
| |been moved by UPDATEs.
 | |
| 
 | |
| ... I'd not be surprised if H is 90% of N.
 | |
| Servus
 | |
|  Manfred
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 2: Don't 'kill -9' the postmaster
 | |
| 
 | |
| From DCorbit@connx.com Thu Dec 22 17:22:03 2005
 | |
| Return-path: <DCorbit@connx.com>
 | |
| Received: from postal.corporate.connx.com (postal.corporate.connx.com [65.212.159.187])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with SMTP id jBMMLve11671
 | |
| 	for <pgman@candle.pha.pa.us>; Thu, 22 Dec 2005 17:22:03 -0500 (EST)
 | |
| Content-class: urn:content-classes:message
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/plain;
 | |
| 	charset="us-ascii"
 | |
| Subject: RE: [HACKERS] Re: Which qsort is used
 | |
| X-MimeOLE: Produced By Microsoft Exchange V6.5
 | |
| Date: Thu, 22 Dec 2005 14:21:49 -0800
 | |
| Message-ID: <D425483C2C5C9F49B5B7A41F8944154757D3AC@postal.corporate.connx.com>
 | |
| Thread-Topic: [HACKERS] Re: Which qsort is used
 | |
| Thread-Index: AcYHQuXJdKs8JVgmSKywUqld6KYccQAAfWAA
 | |
| From: "Dann Corbit" <DCorbit@connx.com>
 | |
| To: "Manfred Koizar" <mkoi-pg@aon.at>,
 | |
|    "Martijn van Oosterhout" <kleptog@svana.org>
 | |
| cc: "Tom Lane" <tgl@sss.pgh.pa.us>, "Qingqing Zhou" <zhouqq@cs.toronto.edu>,
 | |
|    "Bruce Momjian" <pgman@candle.pha.pa.us>,
 | |
|    "Luke Lonergan" <llonergan@greenplum.com>,
 | |
|    "Neil Conway" <neilc@samurai.com>, <pgsql-hackers@postgresql.org>
 | |
| Content-Transfer-Encoding: 8bit
 | |
| X-MIME-Autoconverted: from quoted-printable to 8bit by candle.pha.pa.us id jBMMLve11671
 | |
| Status: OR
 | |
| 
 | |
| An interesting article on sorting and comparison count:
 | |
| http://www.acm.org/jea/ARTICLES/Vol7Nbr5.pdf
 | |
| 
 | |
| Here is the article, the code, and an implementation that I have been
 | |
| toying with:
 | |
| http://cap.connx.com/chess-engines/new-approach/algos.zip
 | |
| 
 | |
| Algorithm quickheap is especially interesting because it does not
 | |
| require much additional space (just an array of integers up to size
 | |
| log(element_count) and in addition, it has very few data movements.
 | |
| 
 | |
| > -----Original Message-----
 | |
| > From: Manfred Koizar [mailto:mkoi-pg@aon.at]
 | |
| > Sent: Thursday, December 22, 2005 1:59 PM
 | |
| > To: Martijn van Oosterhout
 | |
| > Cc: Tom Lane; Dann Corbit; Qingqing Zhou; Bruce Momjian; Luke
 | |
| Lonergan;
 | |
| > Neil Conway; pgsql-hackers@postgresql.org
 | |
| > Subject: Re: [HACKERS] Re: Which qsort is used
 | |
| > 
 | |
| > On Thu, 22 Dec 2005 08:01:00 +0100, Martijn van Oosterhout
 | |
| > <kleptog@svana.org> wrote:
 | |
| > >But where are you including the cost to check how many cells are
 | |
| > >already sorted? That would be O(H), right?
 | |
| > 
 | |
| > Yes.  I didn't mention it, because H < N.
 | |
| > 
 | |
| > > This is where we come back
 | |
| > >to the issue that comparisons in PostgreSQL are expensive.
 | |
| > 
 | |
| > So we agree that we should try to reduce the number of comparisons.
 | |
| > How many comparisons does it take to sort 100000 items?  1.5 million?
 | |
| > 
 | |
| > >Hmm, what are the chances you have 100000 unordered items to sort and
 | |
| > >that the first 8% will already be in order. ISTM that that
 | |
| probability
 | |
| > >will be close enough to zero to not matter...
 | |
| > 
 | |
| > If the items are totally unordered, the check is so cheap you won't
 | |
| > even notice.  OTOH in Tom's example ...
 | |
| > 
 | |
| > |What I think is much more probable in the Postgres environment
 | |
| > |is almost-but-not-quite-ordered inputs --- eg, a table that was
 | |
| > |perfectly ordered by key when filled, but some of the tuples have
 | |
| since
 | |
| > |been moved by UPDATEs.
 | |
| > 
 | |
| > ... I'd not be surprised if H is 90% of N.
 | |
| > Servus
 | |
| >  Manfred
 | |
| 
 | |
| From pgsql-hackers-owner@postgresql.org Mon Dec 19 13:36:58 2005
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| Received: from localhost (av.hub.org [200.46.204.144])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 1E0CC9DC810
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>; Mon, 19 Dec 2005 13:36:58 -0400 (AST)
 | |
| Received: from postgresql.org ([200.46.204.71])
 | |
| 	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
 | |
| 	with ESMTP id 89341-07
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>;
 | |
| 	Mon, 19 Dec 2005 13:36:52 -0400 (AST)
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Received: from mail.mi8.com (d01gw02.mi8.com [63.240.6.46])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 348A69DC9C2
 | |
| 	for <pgsql-hackers@postgresql.org>; Mon, 19 Dec 2005 13:36:51 -0400 (AST)
 | |
| Received: from 172.16.1.25 by mail.mi8.com with ESMTP (- Welcome to Mi8
 | |
| 	Corporation www.Mi8.com (D2)); Mon, 19 Dec 2005 12:36:45 -0500
 | |
| X-Server-Uuid: 7829E76E-BB9E-4995-8473-3C0929DF7DD1
 | |
| Received: from MI8NYCMAIL06.Mi8.com ([172.16.1.175]) by
 | |
| 	D01HOST03.Mi8.com with Microsoft SMTPSVC(6.0.3790.1830); Mon, 19 Dec
 | |
| 	2005 12:36:44 -0500
 | |
| Received: from 67.103.45.218 ([67.103.45.218]) by MI8NYCMAIL06.Mi8.com (
 | |
| 	[172.16.1.219]) via Exchange Front-End Server mi8owa.mi8.com (
 | |
| 	[172.16.1.106]) with Microsoft Exchange Server HTTP-DAV ; Mon, 19 Dec
 | |
| 	2005 17:36:44 +0000
 | |
| User-Agent: Microsoft-Entourage/11.2.1.051004
 | |
| Date: Mon, 19 Dec 2005 09:36:44 -0800
 | |
| Subject: Re: Re: Which qsort is used
 | |
| From: "Luke Lonergan" <llonergan@greenplum.com>
 | |
| To: "Martijn van Oosterhout" <kleptog@svana.org>,
 | |
| 	"Dann Corbit" <DCorbit@connx.com>
 | |
| cc: "Tom Lane" <tgl@sss.pgh.pa.us>,
 | |
| 	"Qingqing Zhou" <zhouqq@cs.toronto.edu>,
 | |
| 	"Bruce Momjian" <pgman@candle.pha.pa.us>,
 | |
| 	"Neil Conway" <neilc@samurai.com>,
 | |
| 	pgsql-hackers@postgresql.org
 | |
| Message-ID: <BFCC2FAC.16CC0%llonergan@greenplum.com>
 | |
| Thread-Topic: [HACKERS] Re: Which qsort is used
 | |
| Thread-Index: AcYEkKvEA7duDr/yQneMyWGCfNr3rQAMhuDl
 | |
| In-Reply-To: <20051219113724.GD12251@svana.org>
 | |
| MIME-Version: 1.0
 | |
| X-OriginalArrivalTime: 19 Dec 2005 17:36:44.0849 (UTC)
 | |
| 	FILETIME=[C7C6AA10:01C604C2]
 | |
| X-WSS-ID: 6FB830272346940585-01-01
 | |
| Content-Type: text/plain;
 | |
| 	charset=us-ascii
 | |
| Content-Transfer-Encoding: 7bit
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=1.253 required=5 tests=[AWL=0.000,
 | |
| 	RCVD_NUMERIC_HELO=1.253]
 | |
| X-Spam-Score: 1.253
 | |
| X-Spam-Level: *
 | |
| X-Archive-Number: 200512/868
 | |
| X-Sequence-Number: 77716
 | |
| Status: OR
 | |
| 
 | |
| Martin,
 | |
| 
 | |
| On 12/19/05 3:37 AM, "Martijn van Oosterhout" <kleptog@svana.org> wrote:
 | |
| 
 | |
| > I'm not sure whether we have a conclusion here, but I do have one
 | |
| > question: is there a significant difference in the number of times the
 | |
| > comparison routines are called? Comparisons in PostgreSQL are fairly
 | |
| > expensive given the fmgr overhead and when comparing tuples it's even
 | |
| > worse.
 | |
| 
 | |
| It would be interesting to note the comparison count of the different
 | |
| routines.
 | |
| 
 | |
| Something that really grabbed me about the results though is that the
 | |
| relative performance of the routines dramatically shifted when the indirect
 | |
| references in the comparators went in.  The first test I did sorted an array
 | |
| of int4 - these tests that Qingqing did sorted arrays using an indirect
 | |
| pointer list, at which point the same distributions performed very
 | |
| differently.
 | |
| 
 | |
| I suspect that it is the number of comparisons that caused this, and further
 | |
| that the indirection has disabled the compiler optimizations for memory
 | |
| prefetch and other things that it could normally recognize.  Given the usage
 | |
| pattern in Postgres, where sorted things are a mix of strings and intrinsic
 | |
| types, I'm not sure those optimizations could be done by one routine.
 | |
| 
 | |
| I haven't verified this, but it certainly seems that the NetBSD routine is
 | |
| the overall winner for the type of use that Postgres has (sorting the using
 | |
| a pointer list).
 | |
| 
 | |
| - Luke
 | |
| 
 | |
| 
 | |
| 
 | |
| From pgsql-hackers-owner+M81165@postgresql.org Thu Mar 16 18:37:28 2006
 | |
| Return-path: <pgsql-hackers-owner+M81165@postgresql.org>
 | |
| Received: from ams.hub.org (ams.hub.org [200.46.204.13])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2GNbOu11277
 | |
| 	for <pgman@candle.pha.pa.us>; Thu, 16 Mar 2006 18:37:25 -0500 (EST)
 | |
| Received: from postgresql.org (postgresql.org [200.46.204.71])
 | |
| 	by ams.hub.org (Postfix) with ESMTP id A609567BADC;
 | |
| 	Thu, 16 Mar 2006 19:37:21 -0400 (AST)
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| Received: from localhost (av.hub.org [200.46.204.144])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 8E8E19DC828
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>; Thu, 16 Mar 2006 19:36:50 -0400 (AST)
 | |
| Received: from postgresql.org ([200.46.204.71])
 | |
| 	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
 | |
| 	with ESMTP id 31174-02
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>;
 | |
| 	Thu, 16 Mar 2006 19:36:52 -0400 (AST)
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Received: from sss.pgh.pa.us (sss.pgh.pa.us [66.207.139.130])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 8CA419DC840
 | |
| 	for <pgsql-hackers@postgresql.org>; Thu, 16 Mar 2006 19:36:46 -0400 (AST)
 | |
| Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
 | |
| 	by sss.pgh.pa.us (8.13.1/8.13.1) with ESMTP id k2GNagfd023078;
 | |
| 	Thu, 16 Mar 2006 18:36:42 -0500 (EST)
 | |
| To: "Dann Corbit" <DCorbit@connx.com>
 | |
| cc: "Jonah H. Harris" <jonah.harris@gmail.com>, pgsql-hackers@postgresql.org,
 | |
|    "Jerry Sievers" <jerry@jerrysievers.com>
 | |
| Subject: Re: [HACKERS] qsort, once again 
 | |
| In-Reply-To: <D425483C2C5C9F49B5B7A41F8944154757D67F@postal.corporate.connx.com> 
 | |
| References: <D425483C2C5C9F49B5B7A41F8944154757D67F@postal.corporate.connx.com>
 | |
| Comments: In-reply-to "Dann Corbit" <DCorbit@connx.com>
 | |
| 	message dated "Thu, 16 Mar 2006 13:27:33 -0800"
 | |
| MIME-Version: 1.0
 | |
| Content-Type: multipart/mixed; boundary="----- =_aaaaaaaaaa0"
 | |
| Content-ID: <23060.1142551929.0@sss.pgh.pa.us>
 | |
| Date: Thu, 16 Mar 2006 18:36:42 -0500
 | |
| Message-ID: <23077.1142552202@sss.pgh.pa.us>
 | |
| From: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.113 required=5 tests=[AWL=0.113]
 | |
| X-Spam-Score: 0.113
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| ------- =_aaaaaaaaaa0
 | |
| Content-Type: text/plain; charset="us-ascii"
 | |
| Content-ID: <23060.1142551929.1@sss.pgh.pa.us>
 | |
| 
 | |
| >> So at least on randomized data, the swap_cnt thing is a serious loser.
 | |
| >> Need to run some tests on special-case inputs though.  Anyone have a
 | |
| >> test suite they like?
 | |
| 
 | |
| > Here is a distribution maker that will create some torture tests for
 | |
| > sorting programs.
 | |
| 
 | |
| I fleshed out the sort tester that Bentley & McIlroy give pseudocode for
 | |
| in their paper (attached below in case anyone wants to hack on it).  Not
 | |
| very surprisingly, it shows the unmodified B&M algorithm as
 | |
| significantly better than the BSD-lite version:
 | |
| 
 | |
| Our current BSD qsort:
 | |
| distribution SAWTOOTH: max cratio 12.9259, average 0.870261 over 252 tests
 | |
| distribution RAND: max cratio 1.07917, average 0.505924 over 252 tests
 | |
| distribution STAGGER: max cratio 12.9259, average 1.03706 over 252 tests
 | |
| distribution PLATEAU: max cratio 12.9259, average 0.632514 over 252 tests
 | |
| distribution SHUFFLE: max cratio 12.9259, average 1.21631 over 252 tests
 | |
| method COPY: max cratio 3.87533, average 0.666927 over 210 tests
 | |
| method REVERSE: max cratio 5.6248, average 0.710284 over 210 tests
 | |
| method FREVERSE: max cratio 12.9259, average 1.58323 over 210 tests
 | |
| method BREVERSE: max cratio 5.72661, average 1.13674 over 210 tests
 | |
| method SORT: max cratio 0.758625, average 0.350092 over 210 tests
 | |
| method DITHER: max cratio 3.13417, average 0.667222 over 210 tests
 | |
| Overall: average cratio 0.852415 over 1260 tests
 | |
| 
 | |
| without the swap_cnt code:
 | |
| distribution SAWTOOTH: max cratio 5.6248, average 0.745818 over 252 tests
 | |
| distribution RAND: max cratio 1.07917, average 0.510097 over 252 tests
 | |
| distribution STAGGER: max cratio 5.6248, average 1.0494 over 252 tests
 | |
| distribution PLATEAU: max cratio 3.57655, average 0.411549 over 252 tests
 | |
| distribution SHUFFLE: max cratio 5.72661, average 1.05988 over 252 tests
 | |
| method COPY: max cratio 3.87533, average 0.712122 over 210 tests
 | |
| method REVERSE: max cratio 5.6248, average 0.751011 over 210 tests
 | |
| method FREVERSE: max cratio 4.80869, average 0.690224 over 210 tests
 | |
| method BREVERSE: max cratio 5.72661, average 1.13673 over 210 tests
 | |
| method SORT: max cratio 0.806618, average 0.539829 over 210 tests
 | |
| method DITHER: max cratio 3.13417, average 0.702174 over 210 tests
 | |
| Overall: average cratio 0.755348 over 1260 tests
 | |
| 
 | |
| ("cratio" is the ratio of the actual number of comparison function calls
 | |
| to the theoretical expectation of N*lg2(N).)  The insertion sort
 | |
| switchover is a loser for both average and worst-case measurements.
 | |
| 
 | |
| I tried Dann's distributions too, with N = 100000:
 | |
| 
 | |
| Our current BSD qsort:
 | |
| dist fib: cratio 0.0694229
 | |
| dist camel: cratio 0.0903228
 | |
| dist constant: cratio 0.0602126
 | |
| dist five: cratio 0.132288
 | |
| dist ramp: cratio 4.29937
 | |
| dist random: cratio 1.09286
 | |
| dist reverse: cratio 0.5663
 | |
| dist sorted: cratio 0.18062
 | |
| dist ten: cratio 0.174781
 | |
| dist twenty: cratio 0.238098
 | |
| dist two: cratio 0.090365
 | |
| dist perverse: cratio 0.334503
 | |
| dist trig: cratio 0.679846
 | |
| Overall: max cratio 4.29937, average cratio 0.616076 over 13 tests
 | |
| 
 | |
| without the swap_cnt code:
 | |
| dist fib: cratio 0.0694229
 | |
| dist camel: cratio 0.0903228
 | |
| dist constant: cratio 0.0602126
 | |
| dist five: cratio 0.132288
 | |
| dist ramp: cratio 4.29937
 | |
| dist random: cratio 1.09286
 | |
| dist reverse: cratio 0.89184
 | |
| dist sorted: cratio 0.884907
 | |
| dist ten: cratio 0.174781
 | |
| dist twenty: cratio 0.238098
 | |
| dist two: cratio 0.090365
 | |
| dist perverse: cratio 0.334503
 | |
| dist trig: cratio 0.679846
 | |
| Overall: max cratio 4.29937, average cratio 0.695293 over 13 tests
 | |
| 
 | |
| In this set of tests the behavior is just about identical, except for
 | |
| the case of already-sorted input, where the BSD coding runs in O(N)
 | |
| instead of O(N lg2 N) time.  So that evidently is why some unknown
 | |
| person put in the special case.
 | |
| 
 | |
| Some further experimentation destroys my original proposal to limit the
 | |
| size of subfile we'll use the swap_cnt code for: it turns out that that
 | |
| eliminates the BSD code's advantage for presorted input (at least for
 | |
| inputs bigger than the limit) without doing anything much in return.
 | |
| 
 | |
| So my feeling is we should just remove the swap_cnt code and return to
 | |
| the original B&M algorithm.  Being much faster than expected for
 | |
| presorted input doesn't justify being far slower than expected for
 | |
| other inputs, IMHO.  In the context of Postgres I doubt that perfectly
 | |
| sorted input shows up very often anyway.
 | |
| 
 | |
| Comments?
 | |
| 
 | |
| 			regards, tom lane
 | |
| 
 | |
| 
 | |
| ------- =_aaaaaaaaaa0
 | |
| Content-Type: application/octet-stream
 | |
| Content-ID: <23060.1142551929.2@sss.pgh.pa.us>
 | |
| Content-Description: sorttester.c
 | |
| Content-Transfer-Encoding: base64
 | |
| 
 | |
| I2luY2x1ZGUgPHN0ZGlvLmg+CiNpbmNsdWRlIDxzdGRsaWIuaD4KI2luY2x1
 | |
| ZGUgPHN0cmluZy5oPgojaW5jbHVkZSA8bWF0aC5oPgoKI2lmZGVmIFVTRV9R
 | |
| U09SVAojZGVmaW5lIHRlc3RfcXNvcnQgcXNvcnQKI2Vsc2UKZXh0ZXJuIHZv
 | |
| aWQgdGVzdF9xc29ydCh2b2lkICphLCBzaXplX3Qgbiwgc2l6ZV90IGVzLAoJ
 | |
| CQkJCSAgIGludCAoKmNtcCkgKGNvbnN0IHZvaWQgKiwgY29uc3Qgdm9pZCAq
 | |
| KSk7CiNlbmRpZgoKc3RhdGljIGNvbnN0IGludCBuX3ZhbHVlc1tdID0geyAx
 | |
| MDAsIDEwMjMsIDEwMjQsIDEwMjUgfSA7CiNkZWZpbmUgTUFYX04gMTAyNQoK
 | |
| dHlwZWRlZiBlbnVtCnsKCVNBV1RPT1RILCBSQU5ELCBTVEFHR0VSLCBQTEFU
 | |
| RUFVLCBTSFVGRkxFCn0gRElTVDsKI2RlZmluZSBESVNUX0ZJUlNUCVNBV1RP
 | |
| T1RICiNkZWZpbmUgRElTVF9MQVNUCVNIVUZGTEUKCnN0YXRpYyBjb25zdCBj
 | |
| aGFyICogY29uc3QgZGlzdG5hbWVbXSA9IHsKCSJTQVdUT09USCIsICJSQU5E
 | |
| IiwgIlNUQUdHRVIiLCAiUExBVEVBVSIsICJTSFVGRkxFIgp9OwoKdHlwZWRl
 | |
| ZiBlbnVtCnsKCU1DT1BZLCBNUkVWRVJTRSwgTUZSRVZFUlNFLCBNQlJFVkVS
 | |
| U0UsIE1TT1JULCBNRElUSEVSCn0gTU9ETUVUSE9EOwojZGVmaW5lIE1FVEhf
 | |
| RklSU1QJTUNPUFkKI2RlZmluZSBNRVRIX0xBU1QJTURJVEhFUgoKc3RhdGlj
 | |
| IGNvbnN0IGNoYXIgKiBjb25zdCBtZXRobmFtZVtdID0gewoJIkNPUFkiLCAi
 | |
| UkVWRVJTRSIsICJGUkVWRVJTRSIsICJCUkVWRVJTRSIsICJTT1JUIiwgIkRJ
 | |
| VEhFUiIKfTsKCi8qIHBlci10ZXN0IGNvdW50ZXIgKi8Kc3RhdGljIGxvbmcg
 | |
| bmNvbXBhcmVzOwoKLyogYWNjdW11bGF0ZSByZXN1bHRzIGFjcm9zcyB0ZXN0
 | |
| cywgZGlzdC13aXNlICovCnN0YXRpYyBkb3VibGUgc3VtY3JhdGlvX2RbRElT
 | |
| VF9MQVNUKzFdOwpzdGF0aWMgZG91YmxlIG1heGNyYXRpb19kW0RJU1RfTEFT
 | |
| VCsxXTsKc3RhdGljIGludCBudGVzdHNfZFtESVNUX0xBU1QrMV07Ci8qIGFj
 | |
| Y3VtdWxhdGUgcmVzdWx0cyBhY3Jvc3MgdGVzdHMsIG1vZC1tZXRob2Qtd2lz
 | |
| ZSAqLwpzdGF0aWMgZG91YmxlIHN1bWNyYXRpb19tW01FVEhfTEFTVCsxXTsK
 | |
| c3RhdGljIGRvdWJsZSBtYXhjcmF0aW9fbVtNRVRIX0xBU1QrMV07CnN0YXRp
 | |
| YyBpbnQgbnRlc3RzX21bTUVUSF9MQVNUKzFdOwoKCnN0YXRpYyBpbnQKaW50
 | |
| X2NtcChjb25zdCB2b2lkICphLCBjb25zdCB2b2lkICpiKQp7CglpbnQJCWFh
 | |
| ID0gKihjb25zdCBpbnQgKikgYTsKCWludAkJYmIgPSAqKGNvbnN0IGludCAq
 | |
| KSBiOwoKCW5jb21wYXJlcysrOwoKCWlmIChhYSA8IGJiKQoJCXJldHVybiAt
 | |
| MTsKCWlmIChhYSA+IGJiKQoJCXJldHVybiAxOwoJcmV0dXJuIDA7Cn0KCgpz
 | |
| dGF0aWMgdm9pZAp0ZXN0X2NvbW1vbihESVNUIGRpc3QsIE1PRE1FVEhPRCBt
 | |
| ZXRoLCB2b2lkICp4dCwgc2l6ZV90IG4sIHNpemVfdCBzeiwKCQkJaW50ICgq
 | |
| Y21wKSAoY29uc3Qgdm9pZCAqLCBjb25zdCB2b2lkICopKQp7Cglkb3VibGUg
 | |
| bmxvZ247Cglkb3VibGUgY3JhdGlvOwoKCW5jb21wYXJlcyA9IDA7Cgl0ZXN0
 | |
| X3Fzb3J0KHh0LCBuLCBzeiwgY21wKTsKCW5sb2duID0gbiAqIGxvZygoZG91
 | |
| YmxlKSBuKSAvIGxvZygyLjApOwoJY3JhdGlvID0gbmNvbXBhcmVzIC8gbmxv
 | |
| Z247CglzdW1jcmF0aW9fZFtkaXN0XSArPSBjcmF0aW87CglpZiAoY3JhdGlv
 | |
| ID4gbWF4Y3JhdGlvX2RbZGlzdF0pCgkJbWF4Y3JhdGlvX2RbZGlzdF0gPSBj
 | |
| cmF0aW87CgludGVzdHNfZFtkaXN0XSsrOwoJc3VtY3JhdGlvX21bbWV0aF0g
 | |
| Kz0gY3JhdGlvOwoJaWYgKGNyYXRpbyA+IG1heGNyYXRpb19tW21ldGhdKQoJ
 | |
| CW1heGNyYXRpb19tW21ldGhdID0gY3JhdGlvOwoJbnRlc3RzX21bbWV0aF0r
 | |
| KzsKfQoKCi8qIHdvcmsgb24gYSBjb3B5IG9mIHggKi8Kc3RhdGljIHZvaWQK
 | |
| dGVzdF9pbnRfY29weShESVNUIGRpc3QsIGludCB4W10sIGludCBuKQp7Cglp
 | |
| bnQJCXh0W01BWF9OXTsKCgltZW1jcHkoeHQsIHgsIG4gKiBzaXplb2YoaW50
 | |
| KSk7Cgl0ZXN0X2NvbW1vbihkaXN0LCBNQ09QWSwgKHZvaWQgKikgeHQsIG4s
 | |
| IHNpemVvZihpbnQpLCBpbnRfY21wKTsKfQoKLyogcmV2ZXJzZSB0aGUgcmFu
 | |
| Z2Ugc3RhcnQgPD0gaSA8IHN0b3AgKi8Kc3RhdGljIHZvaWQKdGVzdF9pbnRf
 | |
| cmV2ZXJzZShESVNUIGRpc3QsIE1PRE1FVEhPRCBtZXRoLCBpbnQgeFtdLCBp
 | |
| bnQgbiwgaW50IHN0YXJ0LCBpbnQgc3RvcCkKewoJaW50CQl4dFtNQVhfTl07
 | |
| CglpbnQJCWk7CgoJbWVtY3B5KHh0LCB4LCBuICogc2l6ZW9mKGludCkpOwoJ
 | |
| Zm9yIChpID0gc3RhcnQ7IGkgPCBzdG9wOyBpKyspCgl7CgkJeHRbaV0gPSB4
 | |
| W3N0b3AgLSAxIC0gaV07Cgl9Cgl0ZXN0X2NvbW1vbihkaXN0LCBtZXRoLCAo
 | |
| dm9pZCAqKSB4dCwgbiwgc2l6ZW9mKGludCksIGludF9jbXApOwp9CgovKiBw
 | |
| cmUtc29ydCB1c2luZyBhIHRydXN0ZWQgc29ydCAqLwpzdGF0aWMgdm9pZAp0
 | |
| ZXN0X2ludF9zb3J0KERJU1QgZGlzdCwgaW50IHhbXSwgaW50IG4pCnsKCWlu
 | |
| dAkJeHRbTUFYX05dOwoKCW1lbWNweSh4dCwgeCwgbiAqIHNpemVvZihpbnQp
 | |
| KTsKCXFzb3J0KCh2b2lkICopIHh0LCBuLCBzaXplb2YoaW50KSwgaW50X2Nt
 | |
| cCk7Cgl0ZXN0X2NvbW1vbihkaXN0LCBNU09SVCwgKHZvaWQgKikgeHQsIG4s
 | |
| IHNpemVvZihpbnQpLCBpbnRfY21wKTsKfQoKLyogYWRkIGklNSB0byB4W2ld
 | |
| ICovCnN0YXRpYyB2b2lkCnRlc3RfaW50X2RpdGhlcihESVNUIGRpc3QsIGlu
 | |
| dCB4W10sIGludCBuKQp7CglpbnQJCXh0W01BWF9OXTsKCWludAkJaTsKCglm
 | |
| b3IgKGkgPSAwOyBpIDwgbjsgaSsrKQoJewoJCXh0W2ldID0geFtpXSArIGkl
 | |
| NTsKCX0KCXRlc3RfY29tbW9uKGRpc3QsIE1ESVRIRVIsICh2b2lkICopIHh0
 | |
| LCBuLCBzaXplb2YoaW50KSwgaW50X2NtcCk7Cn0KCgppbnQKbWFpbigpCnsK
 | |
| CWludAkJeFtNQVhfTl07CglpbnQJCWlfbjsKCWludAkJbjsKCWludAkJbTsK
 | |
| CURJU1QJZGlzdDsKCU1PRE1FVEhPRCBtZXRoOwoJaW50CQlpOwoJaW50CQlq
 | |
| OwoJaW50CQlrOwoJZG91YmxlCXN1bWNyYXRpbzsKCWludAkJbnRlc3RzOwoK
 | |
| CWZvciAoaV9uID0gMDsgaV9uIDwgc2l6ZW9mKG5fdmFsdWVzKS9zaXplb2Yo
 | |
| bl92YWx1ZXNbMF0pOyBpX24rKykKCXsKCQluID0gbl92YWx1ZXNbaV9uXTsK
 | |
| CQlmb3IgKG0gPSAxOyBtIDwgMipuOyBtICo9IDIpCgkJewoJCQlmb3IgKGRp
 | |
| c3QgPSBESVNUX0ZJUlNUOyBkaXN0IDw9IERJU1RfTEFTVDsgZGlzdCsrKQoJ
 | |
| CQl7CgkJCQlzd2l0Y2ggKGRpc3QpCgkJCQl7CgkJCQkJY2FzZSBTQVdUT09U
 | |
| SDoKCQkJCQkJZm9yIChpID0gaiA9IDAsIGsgPSAxOyBpIDwgbjsgaSsrKQoJ
 | |
| CQkJCQl7CgkJCQkJCQl4W2ldID0gaSAlIG07CgkJCQkJCX0KCQkJCQkJYnJl
 | |
| YWs7CgkJCQkJY2FzZSBSQU5EOgoJCQkJCQlmb3IgKGkgPSBqID0gMCwgayA9
 | |
| IDE7IGkgPCBuOyBpKyspCgkJCQkJCXsKCQkJCQkJCXhbaV0gPSByYW5kKCkg
 | |
| JSBtOwoJCQkJCQl9CgkJCQkJCWJyZWFrOwoJCQkJCWNhc2UgU1RBR0dFUjoK
 | |
| CQkJCQkJZm9yIChpID0gaiA9IDAsIGsgPSAxOyBpIDwgbjsgaSsrKQoJCQkJ
 | |
| CQl7CgkJCQkJCQl4W2ldID0gKGkqbSArIGkpICUgbjsKCQkJCQkJfQoJCQkJ
 | |
| CQlicmVhazsKCQkJCQljYXNlIFBMQVRFQVU6CgkJCQkJCWZvciAoaSA9IGog
 | |
| PSAwLCBrID0gMTsgaSA8IG47IGkrKykKCQkJCQkJewoJCQkJCQkJeFtpXSA9
 | |
| IGkgPCBtID8gaSA6IG07CgkJCQkJCX0KCQkJCQkJYnJlYWs7CgkJCQkJY2Fz
 | |
| ZSBTSFVGRkxFOgoJCQkJCQlmb3IgKGkgPSBqID0gMCwgayA9IDE7IGkgPCBu
 | |
| OyBpKyspCgkJCQkJCXsKCQkJCQkJCXhbaV0gPSAocmFuZCgpJW0pID8gKGor
 | |
| PTIpIDogKGsrPTIpOwoJCQkJCQl9CgkJCQkJCWJyZWFrOwoJCQkJfQoJCQkJ
 | |
| dGVzdF9pbnRfY29weShkaXN0LCB4LCBuKTsgLyogd29yayBvbiBhIGNvcHkg
 | |
| b2YgeCAqLwoJCQkJdGVzdF9pbnRfcmV2ZXJzZShkaXN0LCBNUkVWRVJTRSwg
 | |
| eCwgbiwgMCwgbik7IC8qIG9uIGEgcmV2ZXJzZWQgY29weSAqLwoJCQkJdGVz
 | |
| dF9pbnRfcmV2ZXJzZShkaXN0LCBNRlJFVkVSU0UsIHgsIG4sIDAsIG4vMik7
 | |
| CS8qIGZyb250IGhhbGYgcmV2ZXJzZWQgKi8KCQkJCXRlc3RfaW50X3JldmVy
 | |
| c2UoZGlzdCwgTUJSRVZFUlNFLCB4LCBuLCBuLzIsIG4pOwkvKiBiYWNrIGhh
 | |
| bGYgcmV2ZXJzZWQgKi8KCQkJCXRlc3RfaW50X3NvcnQoZGlzdCwgeCwgbik7
 | |
| IC8qIGFuIG9yZGVyZWQgY29weSAqLwoJCQkJdGVzdF9pbnRfZGl0aGVyKGRp
 | |
| c3QsIHgsIG4pOyAvKiBhZGQgaSU1IHRvIHhbaV0gKi8KCQkJfQoJCX0KCX0K
 | |
| Cglmb3IgKGRpc3QgPSBESVNUX0ZJUlNUOyBkaXN0IDw9IERJU1RfTEFTVDsg
 | |
| ZGlzdCsrKQoJewoJCXByaW50ZigiZGlzdHJpYnV0aW9uICVzOiBtYXggY3Jh
 | |
| dGlvICVnLCBhdmVyYWdlICVnIG92ZXIgJWQgdGVzdHNcbiIsCgkJCSAgIGRp
 | |
| c3RuYW1lW2Rpc3RdLAoJCQkgICBtYXhjcmF0aW9fZFtkaXN0XSwKCQkJICAg
 | |
| c3VtY3JhdGlvX2RbZGlzdF0gLyBudGVzdHNfZFtkaXN0XSwKCQkJICAgbnRl
 | |
| c3RzX2RbZGlzdF0pOwoJfQoKCXN1bWNyYXRpbyA9IDA7CgludGVzdHMgPSAw
 | |
| OwoKCWZvciAobWV0aCA9IE1FVEhfRklSU1Q7IG1ldGggPD0gTUVUSF9MQVNU
 | |
| OyBtZXRoKyspCgl7CgkJcHJpbnRmKCJtZXRob2QgJXM6IG1heCBjcmF0aW8g
 | |
| JWcsIGF2ZXJhZ2UgJWcgb3ZlciAlZCB0ZXN0c1xuIiwKCQkJICAgbWV0aG5h
 | |
| bWVbbWV0aF0sCgkJCSAgIG1heGNyYXRpb19tW21ldGhdLAoJCQkgICBzdW1j
 | |
| cmF0aW9fbVttZXRoXSAvIG50ZXN0c19tW21ldGhdLAoJCQkgICBudGVzdHNf
 | |
| bVttZXRoXSk7CgkJc3VtY3JhdGlvICs9IHN1bWNyYXRpb19tW21ldGhdOwoJ
 | |
| CW50ZXN0cyArPSBudGVzdHNfbVttZXRoXTsKCX0KCglwcmludGYoIk92ZXJh
 | |
| bGw6IGF2ZXJhZ2UgY3JhdGlvICVnIG92ZXIgJWQgdGVzdHNcbiIsCgkJICAg
 | |
| c3VtY3JhdGlvIC8gbnRlc3RzLAoJCSAgIG50ZXN0cyk7CgoJcmV0dXJuIDA7
 | |
| Cn0K
 | |
| 
 | |
| ------- =_aaaaaaaaaa0
 | |
| Content-Type: text/plain
 | |
| Content-Disposition: inline
 | |
| Content-Transfer-Encoding: 8bit
 | |
| MIME-Version: 1.0
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 6: explain analyze is your friend
 | |
| 
 | |
| ------- =_aaaaaaaaaa0--
 | |
| 
 | |
| From pgsql-hackers-owner+M81167@postgresql.org Thu Mar 16 18:48:37 2006
 | |
| Return-path: <pgsql-hackers-owner+M81167@postgresql.org>
 | |
| Received: from ams.hub.org (ams.hub.org [200.46.204.13])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2GNmbu12770
 | |
| 	for <pgman@candle.pha.pa.us>; Thu, 16 Mar 2006 18:48:37 -0500 (EST)
 | |
| Received: from postgresql.org (postgresql.org [200.46.204.71])
 | |
| 	by ams.hub.org (Postfix) with ESMTP id 570D567BADC;
 | |
| 	Thu, 16 Mar 2006 19:48:35 -0400 (AST)
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| Received: from localhost (av.hub.org [200.46.204.144])
 | |
| 	by postgresql.org (Postfix) with ESMTP id B49219DCBC2
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>; Thu, 16 Mar 2006 19:48:12 -0400 (AST)
 | |
| Received: from postgresql.org ([200.46.204.71])
 | |
| 	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
 | |
| 	with ESMTP id 28142-10
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>;
 | |
| 	Thu, 16 Mar 2006 19:48:15 -0400 (AST)
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Received: from sss.pgh.pa.us (sss.pgh.pa.us [66.207.139.130])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 9E95A9DCBAD
 | |
| 	for <pgsql-hackers@postgresql.org>; Thu, 16 Mar 2006 19:48:10 -0400 (AST)
 | |
| Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
 | |
| 	by sss.pgh.pa.us (8.13.1/8.13.1) with ESMTP id k2GNm9Kt023199;
 | |
| 	Thu, 16 Mar 2006 18:48:09 -0500 (EST)
 | |
| To: Darcy Buskermolen <darcy@wavefire.com>
 | |
| cc: pgsql-hackers@postgresql.org, "Dann Corbit" <DCorbit@connx.com>,
 | |
|    "Jonah H. Harris" <jonah.harris@gmail.com>,
 | |
|    "Jerry Sievers" <jerry@jerrysievers.com>
 | |
| Subject: Re: [HACKERS] qsort, once again 
 | |
| In-Reply-To: <200603161541.25929.darcy@wavefire.com> 
 | |
| References: <D425483C2C5C9F49B5B7A41F8944154757D67D@postal.corporate.connx.com> <19646.1142539750@sss.pgh.pa.us> <200603161541.25929.darcy@wavefire.com>
 | |
| Comments: In-reply-to Darcy Buskermolen <darcy@wavefire.com>
 | |
| 	message dated "Thu, 16 Mar 2006 15:41:24 -0800"
 | |
| Date: Thu, 16 Mar 2006 18:48:09 -0500
 | |
| Message-ID: <23198.1142552889@sss.pgh.pa.us>
 | |
| From: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.113 required=5 tests=[AWL=0.113]
 | |
| X-Spam-Score: 0.113
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| Darcy Buskermolen <darcy@wavefire.com> writes:
 | |
| > On Thursday 16 March 2006 12:09, Tom Lane wrote:
 | |
| >> So we still have a problem of software archaeology: who added the
 | |
| >> insertion sort switch to the NetBSD version, and on what grounds?
 | |
| 
 | |
| > This is when that particular code was pushed in, as to why exactly, you'll 
 | |
| > have to ask mycroft.
 | |
| > http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libc/stdlib/qsort.c.diff?r1=1.3&r2=1.4&only_with_tag=MAIN
 | |
| 
 | |
| Interesting.  It looks to me like he replaced the former
 | |
| vaguely-Knuth-based coding with B&M's code, but kept the insertion-
 | |
| sort-after-no-swap special case that was in the previous code.  I'll
 | |
| betcha he didn't test to see whether this was actually such a great
 | |
| idea ...
 | |
| 
 | |
| 			regards, tom lane
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 1: if posting/reading through Usenet, please send an appropriate
 | |
|        subscribe-nomail command to majordomo@postgresql.org so that your
 | |
|        message can get through to the mailing list cleanly
 | |
| 
 | |
| From pgsql-hackers-owner+M81168@postgresql.org Thu Mar 16 19:42:51 2006
 | |
| Return-path: <pgsql-hackers-owner+M81168@postgresql.org>
 | |
| Received: from ams.hub.org (ams.hub.org [200.46.204.13])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2H0gpu19805
 | |
| 	for <pgman@candle.pha.pa.us>; Thu, 16 Mar 2006 19:42:51 -0500 (EST)
 | |
| Received: from postgresql.org (postgresql.org [200.46.204.71])
 | |
| 	by ams.hub.org (Postfix) with ESMTP id 5F20967BADE;
 | |
| 	Thu, 16 Mar 2006 20:42:48 -0400 (AST)
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| Received: from localhost (av.hub.org [200.46.204.144])
 | |
| 	by postgresql.org (Postfix) with ESMTP id AA2239DCBAD
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>; Thu, 16 Mar 2006 20:42:20 -0400 (AST)
 | |
| Received: from postgresql.org ([200.46.204.71])
 | |
| 	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
 | |
| 	with ESMTP id 46728-01
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>;
 | |
| 	Thu, 16 Mar 2006 20:42:23 -0400 (AST)
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Received: from postal.corporate.connx.com (postal.corporate.connx.com [65.212.159.187])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 062EC9DCBA4
 | |
| 	for <pgsql-hackers@postgresql.org>; Thu, 16 Mar 2006 20:42:17 -0400 (AST)
 | |
| X-MimeOLE: Produced By Microsoft Exchange V6.5
 | |
| Content-class: urn:content-classes:message
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/plain;
 | |
| 	charset="us-ascii"
 | |
| Subject: Re: [HACKERS] qsort, once again 
 | |
| Date: Thu, 16 Mar 2006 16:42:20 -0800
 | |
| Message-ID: <D425483C2C5C9F49B5B7A41F8944154757D688@postal.corporate.connx.com>
 | |
| Thread-Topic: [HACKERS] qsort, once again 
 | |
| Thread-Index: AcZJUnwKofAzJ+OKTcqF67UTsFWJEQACP7AA
 | |
| From: "Dann Corbit" <DCorbit@connx.com>
 | |
| To: "Tom Lane" <tgl@sss.pgh.pa.us>
 | |
| cc: "Jonah H. Harris" <jonah.harris@gmail.com>, <pgsql-hackers@postgresql.org>,
 | |
|    "Jerry Sievers" <jerry@jerrysievers.com>
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.096 required=5 tests=[AWL=0.096]
 | |
| X-Spam-Score: 0.096
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Transfer-Encoding: 8bit
 | |
| X-MIME-Autoconverted: from quoted-printable to 8bit by candle.pha.pa.us id k2H0gpu19805
 | |
| Status: OR
 | |
| 
 | |
| > So my feeling is we should just remove the swap_cnt code and return to
 | |
| > the original B&M algorithm.  Being much faster than expected for
 | |
| > presorted input doesn't justify being far slower than expected for
 | |
| > other inputs, IMHO.  In the context of Postgres I doubt that perfectly
 | |
| > sorted input shows up very often anyway.
 | |
| > 
 | |
| > Comments?
 | |
| 
 | |
| Checking for presorted input is O(n).
 | |
| If the input is random, an average of 3 elements will be tested.
 | |
| So adding an in-order check of the data should not be too expensive.
 | |
| 
 | |
| I would benchmark several approaches and see which one is best when used
 | |
| in-place.
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 3: Have you checked our extensive FAQ?
 | |
| 
 | |
|                http://www.postgresql.org/docs/faq
 | |
| 
 | |
| From pgsql-hackers-owner+M81169@postgresql.org Thu Mar 16 20:13:08 2006
 | |
| Return-path: <pgsql-hackers-owner+M81169@postgresql.org>
 | |
| Received: from ams.hub.org (ams.hub.org [200.46.204.13])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2H1D7u25008
 | |
| 	for <pgman@candle.pha.pa.us>; Thu, 16 Mar 2006 20:13:07 -0500 (EST)
 | |
| Received: from postgresql.org (postgresql.org [200.46.204.71])
 | |
| 	by ams.hub.org (Postfix) with ESMTP id 6B48F67BADE;
 | |
| 	Thu, 16 Mar 2006 21:13:04 -0400 (AST)
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| Received: from localhost (av.hub.org [200.46.204.144])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 19FAD9DCC5C
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>; Thu, 16 Mar 2006 21:12:36 -0400 (AST)
 | |
| Received: from postgresql.org ([200.46.204.71])
 | |
| 	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
 | |
| 	with ESMTP id 53608-01
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>;
 | |
| 	Thu, 16 Mar 2006 21:12:37 -0400 (AST)
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Received: from postal.corporate.connx.com (postal.corporate.connx.com [65.212.159.187])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 839069DCC2D
 | |
| 	for <pgsql-hackers@postgresql.org>; Thu, 16 Mar 2006 21:12:32 -0400 (AST)
 | |
| X-MimeOLE: Produced By Microsoft Exchange V6.5
 | |
| Content-class: urn:content-classes:message
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/plain;
 | |
| 	charset="us-ascii"
 | |
| Subject: Re: [HACKERS] qsort, once again 
 | |
| Date: Thu, 16 Mar 2006 17:12:35 -0800
 | |
| Message-ID: <D425483C2C5C9F49B5B7A41F8944154757D68B@postal.corporate.connx.com>
 | |
| Thread-Topic: [HACKERS] qsort, once again 
 | |
| Thread-Index: AcZJUnwKofAzJ+OKTcqF67UTsFWJEQACP7AAAADU8dA=
 | |
| From: "Dann Corbit" <DCorbit@connx.com>
 | |
| To: "Dann Corbit" <DCorbit@connx.com>, "Tom Lane" <tgl@sss.pgh.pa.us>
 | |
| cc: "Jonah H. Harris" <jonah.harris@gmail.com>, <pgsql-hackers@postgresql.org>,
 | |
|    "Jerry Sievers" <jerry@jerrysievers.com>
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.096 required=5 tests=[AWL=0.096]
 | |
| X-Spam-Score: 0.096
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Transfer-Encoding: 8bit
 | |
| X-MIME-Autoconverted: from quoted-printable to 8bit by candle.pha.pa.us id k2H1D7u25008
 | |
| Status: OR
 | |
| 
 | |
| > -----Original Message-----
 | |
| > From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
 | |
| > owner@postgresql.org] On Behalf Of Dann Corbit
 | |
| > Sent: Thursday, March 16, 2006 4:42 PM
 | |
| > To: Tom Lane
 | |
| > Cc: Jonah H. Harris; pgsql-hackers@postgresql.org; Jerry Sievers
 | |
| > Subject: Re: [HACKERS] qsort, once again
 | |
| > 
 | |
| > > So my feeling is we should just remove the swap_cnt code and return
 | |
| to
 | |
| > > the original B&M algorithm.  Being much faster than expected for
 | |
| > > presorted input doesn't justify being far slower than expected for
 | |
| > > other inputs, IMHO.  In the context of Postgres I doubt that
 | |
| perfectly
 | |
| > > sorted input shows up very often anyway.
 | |
| > >
 | |
| > > Comments?
 | |
| > 
 | |
| > Checking for presorted input is O(n).
 | |
| > If the input is random, an average of 3 elements will be tested.
 | |
| > So adding an in-order check of the data should not be too expensive.
 | |
| > 
 | |
| > I would benchmark several approaches and see which one is best when
 | |
| used
 | |
| > in-place.
 | |
| 
 | |
| Even if "hunks" of the input are sorted, the test is a very good idea.
 | |
| 
 | |
| Recall that we are sorting recursively and so we divide the data into
 | |
| chunks.
 | |
| 
 | |
| Consider an example...
 | |
| 
 | |
| Quicksort of a field that contains Sex as 'M' for male, 'F' for female,
 | |
| or NULL for unknown.
 | |
| 
 | |
| The median selection is going to pick one of 'M', 'F', or NULL.
 | |
| After pass 1 of qsort we will have two partitions.  One partition will
 | |
| have all of one type and the other partition will have the other two
 | |
| types.
 | |
| 
 | |
| An in-order check will tell us that the monotone partition is sorted and
 | |
| we are done with it.
 | |
| 
 | |
| Imagine also a table that was clustered but for which we have not
 | |
| updated statistics.  Perhaps it is 98% sorted.  Checking for order in
 | |
| our partitions is probably a good idea.
 | |
| 
 | |
| I think you could also get a good optimization if you are checking for
 | |
| partitions and find a big section of the partition is not ordered (even
 | |
| though the whole thing is not).  If you could perk the ordered size up
 | |
| the tree, you could just add another partition to the merge list and
 | |
| sort the unordered part.
 | |
| 
 | |
| In "C Unleashed" I call this idea partition discovery mergesort.
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 5: don't forget to increase your free space map settings
 | |
| 
 | |
| From pgsql-hackers-owner+M81172@postgresql.org Fri Mar 17 00:27:41 2006
 | |
| Return-path: <pgsql-hackers-owner+M81172@postgresql.org>
 | |
| Received: from ams.hub.org (ams.hub.org [200.46.204.13])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2H5Reu14258
 | |
| 	for <pgman@candle.pha.pa.us>; Fri, 17 Mar 2006 00:27:40 -0500 (EST)
 | |
| Received: from postgresql.org (postgresql.org [200.46.204.71])
 | |
| 	by ams.hub.org (Postfix) with ESMTP id CE86767BAEA;
 | |
| 	Fri, 17 Mar 2006 01:27:36 -0400 (AST)
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| Received: from localhost (av.hub.org [200.46.204.144])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 465549DC874
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>; Fri, 17 Mar 2006 01:27:11 -0400 (AST)
 | |
| Received: from postgresql.org ([200.46.204.71])
 | |
| 	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
 | |
| 	with ESMTP id 76897-01
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>;
 | |
| 	Fri, 17 Mar 2006 01:27:10 -0400 (AST)
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Received: from sss.pgh.pa.us (sss.pgh.pa.us [66.207.139.130])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 2456F9DC871
 | |
| 	for <pgsql-hackers@postgresql.org>; Fri, 17 Mar 2006 01:27:08 -0400 (AST)
 | |
| Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
 | |
| 	by sss.pgh.pa.us (8.13.1/8.13.1) with ESMTP id k2H5R6PF025295;
 | |
| 	Fri, 17 Mar 2006 00:27:06 -0500 (EST)
 | |
| To: "Dann Corbit" <DCorbit@connx.com>
 | |
| cc: "Jonah H. Harris" <jonah.harris@gmail.com>, pgsql-hackers@postgresql.org,
 | |
|    "Jerry Sievers" <jerry@jerrysievers.com>
 | |
| Subject: Re: [HACKERS] qsort, once again 
 | |
| In-Reply-To: <D425483C2C5C9F49B5B7A41F8944154757D68B@postal.corporate.connx.com> 
 | |
| References: <D425483C2C5C9F49B5B7A41F8944154757D68B@postal.corporate.connx.com>
 | |
| Comments: In-reply-to "Dann Corbit" <DCorbit@connx.com>
 | |
| 	message dated "Thu, 16 Mar 2006 17:12:35 -0800"
 | |
| Date: Fri, 17 Mar 2006 00:27:05 -0500
 | |
| Message-ID: <25294.1142573225@sss.pgh.pa.us>
 | |
| From: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.113 required=5 tests=[AWL=0.113]
 | |
| X-Spam-Score: 0.113
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| "Dann Corbit" <DCorbit@connx.com> writes:
 | |
| >> So my feeling is we should just remove the swap_cnt code and return
 | |
| >> to the original B&M algorithm.
 | |
| 
 | |
| > Even if "hunks" of the input are sorted, the test is a very good idea.
 | |
| 
 | |
| Yah know, guys, Bentley and McIlroy are each smarter than any five of
 | |
| us, and I'm quite certain it occurred to them to try prechecking for
 | |
| sorted input.  If that case is not in their code then it's probably
 | |
| because it's a net loss.  Unless you have reason to think that sorted
 | |
| input is *more* common than other cases for the Postgres environment,
 | |
| which is certainly a fact not in evidence.
 | |
| 
 | |
| (Bentley was my thesis adviser for awhile before he went to Bell Labs,
 | |
| so my respect for him is based on direct personal experience.  McIlroy
 | |
| I only know by reputation, but he's sure got a ton of that.)
 | |
| 
 | |
| > Imagine also a table that was clustered but for which we have not
 | |
| > updated statistics.  Perhaps it is 98% sorted.  Checking for order in
 | |
| > our partitions is probably a good idea.
 | |
| 
 | |
| If we are using the sort code rather than the recently-clustered index
 | |
| for such a case, then we have problems elsewhere.  This scenario is not
 | |
| a good argument that the sort code needs to be specialized to handle
 | |
| this case at the expense of other cases; the place to be fixing it is
 | |
| the planner or the statistics-management code.
 | |
| 
 | |
| 			regards, tom lane
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 9: In versions below 8.0, the planner will ignore your desire to
 | |
|        choose an index scan if your joining column's datatypes do not
 | |
|        match
 | |
| 
 | |
| From pgsql-hackers-owner+M81173@postgresql.org Fri Mar 17 00:29:24 2006
 | |
| Return-path: <pgsql-hackers-owner+M81173@postgresql.org>
 | |
| Received: from ams.hub.org (ams.hub.org [200.46.204.13])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2H5TNu14505
 | |
| 	for <pgman@candle.pha.pa.us>; Fri, 17 Mar 2006 00:29:23 -0500 (EST)
 | |
| Received: from postgresql.org (postgresql.org [200.46.204.71])
 | |
| 	by ams.hub.org (Postfix) with ESMTP id A271D67BAEA;
 | |
| 	Fri, 17 Mar 2006 01:29:19 -0400 (AST)
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| Received: from localhost (av.hub.org [200.46.204.144])
 | |
| 	by postgresql.org (Postfix) with ESMTP id C96D79DCA40
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>; Fri, 17 Mar 2006 01:28:55 -0400 (AST)
 | |
| Received: from postgresql.org ([200.46.204.71])
 | |
| 	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
 | |
| 	with ESMTP id 44062-07
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>;
 | |
| 	Fri, 17 Mar 2006 01:28:54 -0400 (AST)
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Received: from postal.corporate.connx.com (postal.corporate.connx.com [65.212.159.187])
 | |
| 	by postgresql.org (Postfix) with ESMTP id CDE6C9DCA4B
 | |
| 	for <pgsql-hackers@postgresql.org>; Fri, 17 Mar 2006 01:28:53 -0400 (AST)
 | |
| X-MimeOLE: Produced By Microsoft Exchange V6.5
 | |
| Content-class: urn:content-classes:message
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/plain;
 | |
| 	charset="us-ascii"
 | |
| Subject: Re: [HACKERS] qsort, once again 
 | |
| Date: Thu, 16 Mar 2006 21:28:52 -0800
 | |
| Message-ID: <D425483C2C5C9F49B5B7A41F8944154757D6A1@postal.corporate.connx.com>
 | |
| Thread-Topic: [HACKERS] qsort, once again 
 | |
| Thread-Index: AcZJg2/Nvc2IdeFUT0id8WtNczZvGQAACfYw
 | |
| From: "Dann Corbit" <DCorbit@connx.com>
 | |
| To: "Tom Lane" <tgl@sss.pgh.pa.us>
 | |
| cc: "Jonah H. Harris" <jonah.harris@gmail.com>, <pgsql-hackers@postgresql.org>,
 | |
|    "Jerry Sievers" <jerry@jerrysievers.com>
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.097 required=5 tests=[AWL=0.097]
 | |
| X-Spam-Score: 0.097
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Transfer-Encoding: 8bit
 | |
| X-MIME-Autoconverted: from quoted-printable to 8bit by candle.pha.pa.us id k2H5TNu14505
 | |
| Status: OR
 | |
| 
 | |
| Well, my point was that it is a snap to implement and test.
 | |
| 
 | |
| It will be better, worse, or the same.
 | |
| 
 | |
| I agree that Bentley is a bloody genius.
 | |
| 
 | |
| > -----Original Message-----
 | |
| > From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
 | |
| > Sent: Thursday, March 16, 2006 9:27 PM
 | |
| > To: Dann Corbit
 | |
| > Cc: Jonah H. Harris; pgsql-hackers@postgresql.org; Jerry Sievers
 | |
| > Subject: Re: [HACKERS] qsort, once again
 | |
| > 
 | |
| > "Dann Corbit" <DCorbit@connx.com> writes:
 | |
| > >> So my feeling is we should just remove the swap_cnt code and return
 | |
| > >> to the original B&M algorithm.
 | |
| > 
 | |
| > > Even if "hunks" of the input are sorted, the test is a very good
 | |
| idea.
 | |
| > 
 | |
| > Yah know, guys, Bentley and McIlroy are each smarter than any five of
 | |
| > us, and I'm quite certain it occurred to them to try prechecking for
 | |
| > sorted input.  If that case is not in their code then it's probably
 | |
| > because it's a net loss.  Unless you have reason to think that sorted
 | |
| > input is *more* common than other cases for the Postgres environment,
 | |
| > which is certainly a fact not in evidence.
 | |
| > 
 | |
| > (Bentley was my thesis adviser for awhile before he went to Bell Labs,
 | |
| > so my respect for him is based on direct personal experience.  McIlroy
 | |
| > I only know by reputation, but he's sure got a ton of that.)
 | |
| > 
 | |
| > > Imagine also a table that was clustered but for which we have not
 | |
| > > updated statistics.  Perhaps it is 98% sorted.  Checking for order
 | |
| in
 | |
| > > our partitions is probably a good idea.
 | |
| > 
 | |
| > If we are using the sort code rather than the recently-clustered index
 | |
| > for such a case, then we have problems elsewhere.  This scenario is
 | |
| not
 | |
| > a good argument that the sort code needs to be specialized to handle
 | |
| > this case at the expense of other cases; the place to be fixing it is
 | |
| > the planner or the statistics-management code.
 | |
| > 
 | |
| > 			regards, tom lane
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 2: Don't 'kill -9' the postmaster
 | |
| 
 | |
| From pgsql-hackers-owner+M81277@postgresql.org Tue Mar 21 13:53:08 2006
 | |
| Return-path: <pgsql-hackers-owner+M81277@postgresql.org>
 | |
| Received: from ams.hub.org (ams.hub.org [200.46.204.13])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2LIr6M03797
 | |
| 	for <pgman@candle.pha.pa.us>; Tue, 21 Mar 2006 13:53:06 -0500 (EST)
 | |
| Received: from postgresql.org (postgresql.org [200.46.204.71])
 | |
| 	by ams.hub.org (Postfix) with ESMTP id 01A8467BBF2;
 | |
| 	Tue, 21 Mar 2006 14:53:00 -0400 (AST)
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| Received: from localhost (av.hub.org [200.46.204.144])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 0ED1D9DCCD2
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>; Tue, 21 Mar 2006 14:52:29 -0400 (AST)
 | |
| Received: from postgresql.org ([200.46.204.71])
 | |
| 	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
 | |
| 	with ESMTP id 38232-04-3
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>;
 | |
| 	Tue, 21 Mar 2006 14:52:26 -0400 (AST)
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Received: from sss.pgh.pa.us (sss.pgh.pa.us [66.207.139.130])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 1A0EE9DCC2C
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 21 Mar 2006 14:52:22 -0400 (AST)
 | |
| Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
 | |
| 	by sss.pgh.pa.us (8.13.1/8.13.1) with ESMTP id k2LIqHPc018733;
 | |
| 	Tue, 21 Mar 2006 13:52:17 -0500 (EST)
 | |
| To: "Dann Corbit" <DCorbit@connx.com>
 | |
| cc: "Jonah H. Harris" <jonah.harris@gmail.com>, pgsql-hackers@postgresql.org,
 | |
|    "Jerry Sievers" <jerry@jerrysievers.com>
 | |
| Subject: Re: [HACKERS] qsort, once again 
 | |
| In-Reply-To: <D425483C2C5C9F49B5B7A41F8944154757D6A1@postal.corporate.connx.com> 
 | |
| References: <D425483C2C5C9F49B5B7A41F8944154757D6A1@postal.corporate.connx.com>
 | |
| Comments: In-reply-to "Dann Corbit" <DCorbit@connx.com>
 | |
| 	message dated "Thu, 16 Mar 2006 21:28:52 -0800"
 | |
| MIME-Version: 1.0
 | |
| Content-Type: multipart/mixed; boundary="----- =_aaaaaaaaaa0"
 | |
| Content-ID: <18685.1142966822.0@sss.pgh.pa.us>
 | |
| Date: Tue, 21 Mar 2006 13:52:17 -0500
 | |
| Message-ID: <18732.1142967137@sss.pgh.pa.us>
 | |
| From: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.113 required=5 tests=[AWL=0.113]
 | |
| X-Spam-Score: 0.113
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| ------- =_aaaaaaaaaa0
 | |
| Content-Type: text/plain; charset="us-ascii"
 | |
| Content-ID: <18685.1142966822.1@sss.pgh.pa.us>
 | |
| 
 | |
| "Dann Corbit" <DCorbit@connx.com> writes:
 | |
| > Well, my point was that it is a snap to implement and test.
 | |
| 
 | |
| Well, having done this, I have to eat my words: it does seem to be a
 | |
| pretty good idea.
 | |
| 
 | |
| The following test numbers are using Bentley & McIlroy's test framework,
 | |
| but modified to test only the case N=10000 rather than the four smaller
 | |
| N values they originally used.  I did that because it exposes quadratic
 | |
| behavior more obviously, and the variance in N made it harder to compare
 | |
| comparison ratios for different cases.  I also added a "NEARSORT" test
 | |
| method, which sorts the input distribution and then exchanges two
 | |
| elements chosen at random.  I did that because I was concerned that
 | |
| nearly sorted input would be the worst case for the presorted-input
 | |
| check, as it would waste the most cycles before failing on such input.
 | |
| 
 | |
| With our existing qsort code, the results look like
 | |
| 
 | |
| distribution SAWTOOTH: max cratio 94.17, min 0.08, average 1.56 over 105 tests
 | |
| distribution RAND: max cratio 1.06, min 0.08, average 0.51 over 105 tests
 | |
| distribution STAGGER: max cratio 6.08, min 0.23, average 1.01 over 105 tests
 | |
| distribution PLATEAU: max cratio 94.17, min 0.08, average 2.12 over 105 tests
 | |
| distribution SHUFFLE: max cratio 94.17, min 0.23, average 1.92 over 105 tests
 | |
| method COPY: max cratio 6.08, min 0.08, average 0.72 over 75 tests
 | |
| method REVERSE: max cratio 5.34, min 0.08, average 0.69 over 75 tests
 | |
| method FREVERSE: max cratio 94.17, min 0.08, average 5.71 over 75 tests
 | |
| method BREVERSE: max cratio 3.86, min 0.08, average 1.41 over 75 tests
 | |
| method SORT: max cratio 0.82, min 0.08, average 0.31 over 75 tests
 | |
| method NEARSORT: max cratio 0.82, min 0.08, average 0.36 over 75 tests
 | |
| method DITHER: max cratio 5.52, min 0.18, average 0.77 over 75 tests
 | |
| Overall: average cratio 1.42 over 525 tests
 | |
| 
 | |
| ("cratio" is the ratio of the actual number of comparison function calls
 | |
| to the theoretical expectation, N log2(N))
 | |
| 
 | |
| That's pretty awful: there are several test cases that make it use
 | |
| nearly 100 times the expected number of comparisons.
 | |
| 
 | |
| Removing the swap_cnt test to bring it close to B&M's original
 | |
| recommendations, we get
 | |
| 
 | |
| distribution SAWTOOTH: max cratio 3.85, min 0.08, average 0.70 over 105 tests
 | |
| distribution RAND: max cratio 1.06, min 0.08, average 0.52 over 105 tests
 | |
| distribution STAGGER: max cratio 6.08, min 0.58, average 1.12 over 105 tests
 | |
| distribution PLATEAU: max cratio 3.70, min 0.08, average 0.34 over 105 tests
 | |
| distribution SHUFFLE: max cratio 3.86, min 0.86, average 1.24 over 105 tests
 | |
| method COPY: max cratio 6.08, min 0.08, average 0.76 over 75 tests
 | |
| method REVERSE: max cratio 5.34, min 0.08, average 0.75 over 75 tests
 | |
| method FREVERSE: max cratio 4.56, min 0.08, average 0.73 over 75 tests
 | |
| method BREVERSE: max cratio 3.86, min 0.08, average 1.41 over 75 tests
 | |
| method SORT: max cratio 0.86, min 0.08, average 0.56 over 75 tests
 | |
| method NEARSORT: max cratio 0.86, min 0.08, average 0.56 over 75 tests
 | |
| method DITHER: max cratio 3.73, min 0.18, average 0.72 over 75 tests
 | |
| Overall: average cratio 0.78 over 525 tests
 | |
| 
 | |
| which is a whole lot better as to both average and worst cases.
 | |
| 
 | |
| I then added some code to check for presorted input (just after the
 | |
| n<7 insertion sort code):
 | |
| 
 | |
| #ifdef CHECK_SORTED
 | |
| 	presorted = 1;
 | |
| 	for (pm = (char *) a + es; pm < (char *) a + n * es; pm += es)
 | |
| 	{
 | |
| 		if (cmp(pm - es, pm) > 0)
 | |
| 		{
 | |
| 			presorted = 0;
 | |
| 			break;
 | |
| 		}
 | |
| 	}
 | |
| 	if (presorted)
 | |
| 		return;
 | |
| #endif
 | |
| 
 | |
| This gives
 | |
| 
 | |
| distribution SAWTOOTH: max cratio 3.88, min 0.08, average 0.62 over 105 tests
 | |
| distribution RAND: max cratio 1.06, min 0.08, average 0.46 over 105 tests
 | |
| distribution STAGGER: max cratio 6.15, min 0.08, average 0.98 over 105 tests
 | |
| distribution PLATEAU: max cratio 3.79, min 0.08, average 0.31 over 105 tests
 | |
| distribution SHUFFLE: max cratio 3.91, min 0.08, average 1.09 over 105 tests
 | |
| method COPY: max cratio 6.15, min 0.08, average 0.72 over 75 tests
 | |
| method REVERSE: max cratio 5.34, min 0.08, average 0.76 over 75 tests
 | |
| method FREVERSE: max cratio 4.58, min 0.08, average 0.73 over 75 tests
 | |
| method BREVERSE: max cratio 3.91, min 0.08, average 1.44 over 75 tests
 | |
| method SORT: max cratio 0.08, min 0.08, average 0.08 over 75 tests
 | |
| method NEARSORT: max cratio 0.89, min 0.08, average 0.39 over 75 tests
 | |
| method DITHER: max cratio 3.73, min 0.18, average 0.72 over 75 tests
 | |
| Overall: average cratio 0.69 over 525 tests
 | |
| 
 | |
| So the worst case seems only very marginally worse, and there is a
 | |
| definite improvement in the average case, even for inputs that aren't
 | |
| entirely sorted.  Importantly, the "near sorted" case that I thought
 | |
| might send it into quadratic behavior doesn't seem to do that.
 | |
| 
 | |
| So, unless anyone wants to do further testing, I'll go ahead and commit
 | |
| these changes.
 | |
| 
 | |
| 			regards, tom lane
 | |
| 
 | |
| PS: Just as a comparison point, here are the results when testing HPUX's
 | |
| library qsort:
 | |
| 
 | |
| distribution SAWTOOTH: max cratio 7.00, min 0.08, average 0.76 over 105 tests
 | |
| distribution RAND: max cratio 1.11, min 0.08, average 0.53 over 105 tests
 | |
| distribution STAGGER: max cratio 7.05, min 0.58, average 1.24 over 105 tests
 | |
| distribution PLATEAU: max cratio 7.00, min 0.08, average 0.43 over 105 tests
 | |
| distribution SHUFFLE: max cratio 7.00, min 0.86, average 1.54 over 105 tests
 | |
| method COPY: max cratio 6.70, min 0.08, average 0.79 over 75 tests
 | |
| method REVERSE: max cratio 7.05, min 0.08, average 0.78 over 75 tests
 | |
| method FREVERSE: max cratio 7.00, min 0.08, average 0.77 over 75 tests
 | |
| method BREVERSE: max cratio 7.00, min 0.08, average 2.11 over 75 tests
 | |
| method SORT: max cratio 0.86, min 0.08, average 0.56 over 75 tests
 | |
| method NEARSORT: max cratio 0.86, min 0.08, average 0.56 over 75 tests
 | |
| method DITHER: max cratio 4.06, min 0.16, average 0.74 over 75 tests
 | |
| Overall: average cratio 0.90 over 525 tests
 | |
| 
 | |
| and here are the results using glibc's qsort, which of course isn't
 | |
| quicksort at all but some kind of merge sort:
 | |
| 
 | |
| distribution SAWTOOTH: max cratio 0.90, min 0.49, average 0.65 over 105 tests
 | |
| distribution RAND: max cratio 0.91, min 0.49, average 0.76 over 105 tests
 | |
| distribution STAGGER: max cratio 0.92, min 0.49, average 0.70 over 105 tests
 | |
| distribution PLATEAU: max cratio 0.84, min 0.49, average 0.54 over 105 tests
 | |
| distribution SHUFFLE: max cratio 0.64, min 0.49, average 0.52 over 105 tests
 | |
| method COPY: max cratio 0.92, min 0.49, average 0.66 over 75 tests
 | |
| method REVERSE: max cratio 0.92, min 0.49, average 0.68 over 75 tests
 | |
| method FREVERSE: max cratio 0.92, min 0.49, average 0.67 over 75 tests
 | |
| method BREVERSE: max cratio 0.92, min 0.49, average 0.68 over 75 tests
 | |
| method SORT: max cratio 0.49, min 0.49, average 0.49 over 75 tests
 | |
| method NEARSORT: max cratio 0.55, min 0.49, average 0.51 over 75 tests
 | |
| method DITHER: max cratio 0.92, min 0.50, average 0.74 over 75 tests
 | |
| Overall: average cratio 0.63 over 525 tests
 | |
| 
 | |
| PPS: final version of test framework attached for the archives.
 | |
| 
 | |
| 
 | |
| ------- =_aaaaaaaaaa0
 | |
| Content-Type: application/octet-stream
 | |
| Content-ID: <18685.1142966822.2@sss.pgh.pa.us>
 | |
| Content-Description: sorttester.c
 | |
| Content-Transfer-Encoding: base64
 | |
| 
 | |
| I2luY2x1ZGUgPHN0ZGlvLmg+CiNpbmNsdWRlIDxzdGRsaWIuaD4KI2luY2x1
 | |
| ZGUgPHN0cmluZy5oPgojaW5jbHVkZSA8bWF0aC5oPgoKI2lmZGVmIFVTRV9R
 | |
| U09SVAojZGVmaW5lIHRlc3RfcXNvcnQgcXNvcnQKI2Vsc2UKZXh0ZXJuIHZv
 | |
| aWQgdGVzdF9xc29ydCh2b2lkICphLCBzaXplX3Qgbiwgc2l6ZV90IGVzLAoJ
 | |
| CQkJCSAgIGludCAoKmNtcCkgKGNvbnN0IHZvaWQgKiwgY29uc3Qgdm9pZCAq
 | |
| KSk7CiNlbmRpZgoKLy9zdGF0aWMgY29uc3QgaW50IG5fdmFsdWVzW10gPSB7
 | |
| IDEwMCwgMTAyMywgMTAyNCwgMTAyNSwgMTAwMDAgfSA7CnN0YXRpYyBjb25z
 | |
| dCBpbnQgbl92YWx1ZXNbXSA9IHsgMTAwMDAgfSA7CiNkZWZpbmUgTUFYX04g
 | |
| MTAwMDAKCnR5cGVkZWYgZW51bQp7CglTQVdUT09USCwgUkFORCwgU1RBR0dF
 | |
| UiwgUExBVEVBVSwgU0hVRkZMRQp9IERJU1Q7CiNkZWZpbmUgRElTVF9GSVJT
 | |
| VAlTQVdUT09USAojZGVmaW5lIERJU1RfTEFTVAlTSFVGRkxFCgpzdGF0aWMg
 | |
| Y29uc3QgY2hhciAqIGNvbnN0IGRpc3RuYW1lW10gPSB7CgkiU0FXVE9PVEgi
 | |
| LCAiUkFORCIsICJTVEFHR0VSIiwgIlBMQVRFQVUiLCAiU0hVRkZMRSIKfTsK
 | |
| CnR5cGVkZWYgZW51bQp7CglNQ09QWSwgTVJFVkVSU0UsIE1GUkVWRVJTRSwg
 | |
| TUJSRVZFUlNFLCBNU09SVCwgTU5FQVJTT1JULCBNRElUSEVSCn0gTU9ETUVU
 | |
| SE9EOwojZGVmaW5lIE1FVEhfRklSU1QJTUNPUFkKI2RlZmluZSBNRVRIX0xB
 | |
| U1QJTURJVEhFUgoKc3RhdGljIGNvbnN0IGNoYXIgKiBjb25zdCBtZXRobmFt
 | |
| ZVtdID0gewoJIkNPUFkiLCAiUkVWRVJTRSIsICJGUkVWRVJTRSIsICJCUkVW
 | |
| RVJTRSIsICJTT1JUIiwgIk5FQVJTT1JUIiwgIkRJVEhFUiIKfTsKCi8qIHBl
 | |
| ci10ZXN0IGNvdW50ZXIgKi8Kc3RhdGljIGxvbmcgbmNvbXBhcmVzOwoKLyog
 | |
| YWNjdW11bGF0ZSByZXN1bHRzIGFjcm9zcyB0ZXN0cywgZGlzdC13aXNlICov
 | |
| CnN0YXRpYyBkb3VibGUgc3VtY3JhdGlvX2RbRElTVF9MQVNUKzFdOwpzdGF0
 | |
| aWMgZG91YmxlIG1heGNyYXRpb19kW0RJU1RfTEFTVCsxXTsKc3RhdGljIGRv
 | |
| dWJsZSBtaW5jcmF0aW9fZFtESVNUX0xBU1QrMV07CnN0YXRpYyBpbnQgbnRl
 | |
| c3RzX2RbRElTVF9MQVNUKzFdOwovKiBhY2N1bXVsYXRlIHJlc3VsdHMgYWNy
 | |
| b3NzIHRlc3RzLCBtb2QtbWV0aG9kLXdpc2UgKi8Kc3RhdGljIGRvdWJsZSBz
 | |
| dW1jcmF0aW9fbVtNRVRIX0xBU1QrMV07CnN0YXRpYyBkb3VibGUgbWF4Y3Jh
 | |
| dGlvX21bTUVUSF9MQVNUKzFdOwpzdGF0aWMgZG91YmxlIG1pbmNyYXRpb19t
 | |
| W01FVEhfTEFTVCsxXTsKc3RhdGljIGludCBudGVzdHNfbVtNRVRIX0xBU1Qr
 | |
| MV07CgoKc3RhdGljIGludAppbnRfY21wKGNvbnN0IHZvaWQgKmEsIGNvbnN0
 | |
| IHZvaWQgKmIpCnsKCWludAkJYWEgPSAqKGNvbnN0IGludCAqKSBhOwoJaW50
 | |
| CQliYiA9ICooY29uc3QgaW50ICopIGI7CgoJbmNvbXBhcmVzKys7CgoJaWYg
 | |
| KGFhIDwgYmIpCgkJcmV0dXJuIC0xOwoJaWYgKGFhID4gYmIpCgkJcmV0dXJu
 | |
| IDE7CglyZXR1cm4gMDsKfQoKCnN0YXRpYyB2b2lkCnRlc3RfY29tbW9uKERJ
 | |
| U1QgZGlzdCwgTU9ETUVUSE9EIG1ldGgsIHZvaWQgKnh0LCBzaXplX3Qgbiwg
 | |
| c2l6ZV90IHN6LAoJCQlpbnQgKCpjbXApIChjb25zdCB2b2lkICosIGNvbnN0
 | |
| IHZvaWQgKikpCnsKCWRvdWJsZSBubG9nbjsKCWRvdWJsZSBjcmF0aW87CgoJ
 | |
| bmNvbXBhcmVzID0gMDsKCXRlc3RfcXNvcnQoeHQsIG4sIHN6LCBjbXApOwoJ
 | |
| LyogbG9nIHRoZSBjb3N0IGJlZm9yZSBkb2luZyBtb3JlIGNtcHMgKi8KCW5s
 | |
| b2duID0gbiAqIGxvZygoZG91YmxlKSBuKSAvIGxvZygyLjApOwoJY3JhdGlv
 | |
| ID0gbmNvbXBhcmVzIC8gbmxvZ247CglzdW1jcmF0aW9fZFtkaXN0XSArPSBj
 | |
| cmF0aW87CglpZiAobnRlc3RzX2RbZGlzdF0gPT0gMCkKCXsKCQltYXhjcmF0
 | |
| aW9fZFtkaXN0XSA9IG1pbmNyYXRpb19kW2Rpc3RdID0gY3JhdGlvOwoJfQoJ
 | |
| ZWxzZQoJewoJCWlmIChjcmF0aW8gPiBtYXhjcmF0aW9fZFtkaXN0XSkKCQkJ
 | |
| bWF4Y3JhdGlvX2RbZGlzdF0gPSBjcmF0aW87CgkJaWYgKGNyYXRpbyA8IG1p
 | |
| bmNyYXRpb19kW2Rpc3RdKQoJCQltaW5jcmF0aW9fZFtkaXN0XSA9IGNyYXRp
 | |
| bzsKCX0KCW50ZXN0c19kW2Rpc3RdKys7CglzdW1jcmF0aW9fbVttZXRoXSAr
 | |
| PSBjcmF0aW87CglpZiAobnRlc3RzX21bbWV0aF0gPT0gMCkKCXsKCQltYXhj
 | |
| cmF0aW9fbVttZXRoXSA9IG1pbmNyYXRpb19tW21ldGhdID0gY3JhdGlvOwoJ
 | |
| fQoJZWxzZQoJewoJCWlmIChjcmF0aW8gPiBtYXhjcmF0aW9fbVttZXRoXSkK
 | |
| CQkJbWF4Y3JhdGlvX21bbWV0aF0gPSBjcmF0aW87CgkJaWYgKGNyYXRpbyA8
 | |
| IG1pbmNyYXRpb19tW21ldGhdKQoJCQltaW5jcmF0aW9fbVttZXRoXSA9IGNy
 | |
| YXRpbzsKCX0KCW50ZXN0c19tW21ldGhdKys7CgkvKiBub3cgY2hlY2sgZm9y
 | |
| IGNvcnJlY3Qgc29ydGluZyAqLwoJewoJCWNoYXIgKnAgPSB4dDsKCQl3aGls
 | |
| ZSAobi0tID4gMSkKCQl7CgkJCWlmIChjbXAocCwgcCArIHN6KSA+IDApCgkJ
 | |
| CXsKCQkJCWZwcmludGYoc3RkZXJyLCAid3Jvbmcgc29ydCByZXN1bHQgZm9y
 | |
| ICVzLyVzIVxuIiwKCQkJCQkJZGlzdG5hbWVbZGlzdF0sIG1ldGhuYW1lW21l
 | |
| dGhdKTsKCQkJCWV4aXQoMSk7CgkJCX0KCQkJcCArPSBzejsKCQl9Cgl9Cn0K
 | |
| CgovKiB3b3JrIG9uIGEgY29weSBvZiB4ICovCnN0YXRpYyB2b2lkCnRlc3Rf
 | |
| aW50X2NvcHkoRElTVCBkaXN0LCBpbnQgeFtdLCBpbnQgbikKewoJaW50CQl4
 | |
| dFtNQVhfTl07CgoJbWVtY3B5KHh0LCB4LCBuICogc2l6ZW9mKGludCkpOwoJ
 | |
| dGVzdF9jb21tb24oZGlzdCwgTUNPUFksICh2b2lkICopIHh0LCBuLCBzaXpl
 | |
| b2YoaW50KSwgaW50X2NtcCk7Cn0KCi8qIHJldmVyc2UgdGhlIHJhbmdlIHN0
 | |
| YXJ0IDw9IGkgPCBzdG9wICovCnN0YXRpYyB2b2lkCnRlc3RfaW50X3JldmVy
 | |
| c2UoRElTVCBkaXN0LCBNT0RNRVRIT0QgbWV0aCwgaW50IHhbXSwgaW50IG4s
 | |
| IGludCBzdGFydCwgaW50IHN0b3ApCnsKCWludAkJeHRbTUFYX05dOwoJaW50
 | |
| CQlpOwoKCW1lbWNweSh4dCwgeCwgbiAqIHNpemVvZihpbnQpKTsKCWZvciAo
 | |
| aSA9IHN0YXJ0OyBpIDwgc3RvcDsgaSsrKQoJewoJCXh0W2ldID0geFtzdG9w
 | |
| IC0gMSAtIGldOwoJfQoJdGVzdF9jb21tb24oZGlzdCwgbWV0aCwgKHZvaWQg
 | |
| KikgeHQsIG4sIHNpemVvZihpbnQpLCBpbnRfY21wKTsKfQoKLyogcHJlLXNv
 | |
| cnQgdXNpbmcgYSB0cnVzdGVkIHNvcnQgKi8Kc3RhdGljIHZvaWQKdGVzdF9p
 | |
| bnRfc29ydChESVNUIGRpc3QsIGludCB4W10sIGludCBuKQp7CglpbnQJCXh0
 | |
| W01BWF9OXTsKCgltZW1jcHkoeHQsIHgsIG4gKiBzaXplb2YoaW50KSk7Cglx
 | |
| c29ydCgodm9pZCAqKSB4dCwgbiwgc2l6ZW9mKGludCksIGludF9jbXApOwoJ
 | |
| dGVzdF9jb21tb24oZGlzdCwgTVNPUlQsICh2b2lkICopIHh0LCBuLCBzaXpl
 | |
| b2YoaW50KSwgaW50X2NtcCk7Cn0KCi8qIG5lYXItc29ydGVkICovCnN0YXRp
 | |
| YyB2b2lkCnRlc3RfaW50X25lYXJzb3J0KERJU1QgZGlzdCwgaW50IHhbXSwg
 | |
| aW50IG4pCnsKCWludAkJeHRbTUFYX05dOwoKCW1lbWNweSh4dCwgeCwgbiAq
 | |
| IHNpemVvZihpbnQpKTsKCXFzb3J0KCh2b2lkICopIHh0LCBuLCBzaXplb2Yo
 | |
| aW50KSwgaW50X2NtcCk7CgkvKiBzd2FwIGEgcmFuZG9tIHR3byBlbGVtZW50
 | |
| cyAqLwoJewoJCWludCBpID0gcmFuZCgpICUgbjsKCQlpbnQgaiA9IHJhbmQo
 | |
| KSAlIG47CgkJaW50IHQgPSB4dFtpXTsKCQl4dFtpXSA9IHh0W2pdOwoJCXh0
 | |
| W2pdID0gdDsKCX0KCXRlc3RfY29tbW9uKGRpc3QsIE1ORUFSU09SVCwgKHZv
 | |
| aWQgKikgeHQsIG4sIHNpemVvZihpbnQpLCBpbnRfY21wKTsKfQoKLyogYWRk
 | |
| IGklNSB0byB4W2ldICovCnN0YXRpYyB2b2lkCnRlc3RfaW50X2RpdGhlcihE
 | |
| SVNUIGRpc3QsIGludCB4W10sIGludCBuKQp7CglpbnQJCXh0W01BWF9OXTsK
 | |
| CWludAkJaTsKCglmb3IgKGkgPSAwOyBpIDwgbjsgaSsrKQoJewoJCXh0W2ld
 | |
| ID0geFtpXSArIGklNTsKCX0KCXRlc3RfY29tbW9uKGRpc3QsIE1ESVRIRVIs
 | |
| ICh2b2lkICopIHh0LCBuLCBzaXplb2YoaW50KSwgaW50X2NtcCk7Cn0KCgpp
 | |
| bnQKbWFpbigpCnsKCWludAkJeFtNQVhfTl07CglpbnQJCWlfbjsKCWludAkJ
 | |
| bjsKCWludAkJbTsKCURJU1QJZGlzdDsKCU1PRE1FVEhPRCBtZXRoOwoJaW50
 | |
| CQlpOwoJaW50CQlqOwoJaW50CQlrOwoJZG91YmxlCXN1bWNyYXRpbzsKCWlu
 | |
| dAkJbnRlc3RzOwoKCWZvciAoaV9uID0gMDsgaV9uIDwgc2l6ZW9mKG5fdmFs
 | |
| dWVzKS9zaXplb2Yobl92YWx1ZXNbMF0pOyBpX24rKykKCXsKCQluID0gbl92
 | |
| YWx1ZXNbaV9uXTsKCQlmb3IgKG0gPSAxOyBtIDwgMipuOyBtICo9IDIpCgkJ
 | |
| ewoJCQlmb3IgKGRpc3QgPSBESVNUX0ZJUlNUOyBkaXN0IDw9IERJU1RfTEFT
 | |
| VDsgZGlzdCsrKQoJCQl7CgkJCQlzd2l0Y2ggKGRpc3QpCgkJCQl7CgkJCQkJ
 | |
| Y2FzZSBTQVdUT09USDoKCQkJCQkJZm9yIChpID0gaiA9IDAsIGsgPSAxOyBp
 | |
| IDwgbjsgaSsrKQoJCQkJCQl7CgkJCQkJCQl4W2ldID0gaSAlIG07CgkJCQkJ
 | |
| CX0KCQkJCQkJYnJlYWs7CgkJCQkJY2FzZSBSQU5EOgoJCQkJCQlmb3IgKGkg
 | |
| PSBqID0gMCwgayA9IDE7IGkgPCBuOyBpKyspCgkJCQkJCXsKCQkJCQkJCXhb
 | |
| aV0gPSByYW5kKCkgJSBtOwoJCQkJCQl9CgkJCQkJCWJyZWFrOwoJCQkJCWNh
 | |
| c2UgU1RBR0dFUjoKCQkJCQkJZm9yIChpID0gaiA9IDAsIGsgPSAxOyBpIDwg
 | |
| bjsgaSsrKQoJCQkJCQl7CgkJCQkJCQl4W2ldID0gKGkqbSArIGkpICUgbjsK
 | |
| CQkJCQkJfQoJCQkJCQlicmVhazsKCQkJCQljYXNlIFBMQVRFQVU6CgkJCQkJ
 | |
| CWZvciAoaSA9IGogPSAwLCBrID0gMTsgaSA8IG47IGkrKykKCQkJCQkJewoJ
 | |
| CQkJCQkJeFtpXSA9IGkgPCBtID8gaSA6IG07CgkJCQkJCX0KCQkJCQkJYnJl
 | |
| YWs7CgkJCQkJY2FzZSBTSFVGRkxFOgoJCQkJCQlmb3IgKGkgPSBqID0gMCwg
 | |
| ayA9IDE7IGkgPCBuOyBpKyspCgkJCQkJCXsKCQkJCQkJCXhbaV0gPSAocmFu
 | |
| ZCgpJW0pID8gKGorPTIpIDogKGsrPTIpOwoJCQkJCQl9CgkJCQkJCWJyZWFr
 | |
| OwoJCQkJfQoJCQkJdGVzdF9pbnRfY29weShkaXN0LCB4LCBuKTsgLyogd29y
 | |
| ayBvbiBhIGNvcHkgb2YgeCAqLwoJCQkJdGVzdF9pbnRfcmV2ZXJzZShkaXN0
 | |
| LCBNUkVWRVJTRSwgeCwgbiwgMCwgbik7IC8qIG9uIGEgcmV2ZXJzZWQgY29w
 | |
| eSAqLwoJCQkJdGVzdF9pbnRfcmV2ZXJzZShkaXN0LCBNRlJFVkVSU0UsIHgs
 | |
| IG4sIDAsIG4vMik7CS8qIGZyb250IGhhbGYgcmV2ZXJzZWQgKi8KCQkJCXRl
 | |
| c3RfaW50X3JldmVyc2UoZGlzdCwgTUJSRVZFUlNFLCB4LCBuLCBuLzIsIG4p
 | |
| OwkvKiBiYWNrIGhhbGYgcmV2ZXJzZWQgKi8KCQkJCXRlc3RfaW50X3NvcnQo
 | |
| ZGlzdCwgeCwgbik7IC8qIGFuIG9yZGVyZWQgY29weSAqLwoJCQkJdGVzdF9p
 | |
| bnRfbmVhcnNvcnQoZGlzdCwgeCwgbik7IC8qIGFuIGFsbW9zdCBvcmRlcmVk
 | |
| IGNvcHkgKi8KCQkJCXRlc3RfaW50X2RpdGhlcihkaXN0LCB4LCBuKTsgLyog
 | |
| YWRkIGklNSB0byB4W2ldICovCgkJCX0KCQl9Cgl9CgoJZm9yIChkaXN0ID0g
 | |
| RElTVF9GSVJTVDsgZGlzdCA8PSBESVNUX0xBU1Q7IGRpc3QrKykKCXsKCQlw
 | |
| cmludGYoImRpc3RyaWJ1dGlvbiAlczogbWF4IGNyYXRpbyAlLjJmLCBtaW4g
 | |
| JS4yZiwgYXZlcmFnZSAlLjJmIG92ZXIgJWQgdGVzdHNcbiIsCgkJCSAgIGRp
 | |
| c3RuYW1lW2Rpc3RdLAoJCQkgICBtYXhjcmF0aW9fZFtkaXN0XSwKCQkJICAg
 | |
| bWluY3JhdGlvX2RbZGlzdF0sCgkJCSAgIHN1bWNyYXRpb19kW2Rpc3RdIC8g
 | |
| bnRlc3RzX2RbZGlzdF0sCgkJCSAgIG50ZXN0c19kW2Rpc3RdKTsKCX0KCglz
 | |
| dW1jcmF0aW8gPSAwOwoJbnRlc3RzID0gMDsKCglmb3IgKG1ldGggPSBNRVRI
 | |
| X0ZJUlNUOyBtZXRoIDw9IE1FVEhfTEFTVDsgbWV0aCsrKQoJewoJCXByaW50
 | |
| ZigibWV0aG9kICVzOiBtYXggY3JhdGlvICUuMmYsIG1pbiAlLjJmLCBhdmVy
 | |
| YWdlICUuMmYgb3ZlciAlZCB0ZXN0c1xuIiwKCQkJICAgbWV0aG5hbWVbbWV0
 | |
| aF0sCgkJCSAgIG1heGNyYXRpb19tW21ldGhdLAoJCQkgICBtaW5jcmF0aW9f
 | |
| bVttZXRoXSwKCQkJICAgc3VtY3JhdGlvX21bbWV0aF0gLyBudGVzdHNfbVtt
 | |
| ZXRoXSwKCQkJICAgbnRlc3RzX21bbWV0aF0pOwoJCXN1bWNyYXRpbyArPSBz
 | |
| dW1jcmF0aW9fbVttZXRoXTsKCQludGVzdHMgKz0gbnRlc3RzX21bbWV0aF07
 | |
| Cgl9CgoJcHJpbnRmKCJPdmVyYWxsOiBhdmVyYWdlIGNyYXRpbyAlLjJmIG92
 | |
| ZXIgJWQgdGVzdHNcbiIsCgkJICAgc3VtY3JhdGlvIC8gbnRlc3RzLAoJCSAg
 | |
| IG50ZXN0cyk7CgoJcmV0dXJuIDA7Cn0K
 | |
| 
 | |
| ------- =_aaaaaaaaaa0
 | |
| Content-Type: text/plain
 | |
| Content-Disposition: inline
 | |
| Content-Transfer-Encoding: 8bit
 | |
| MIME-Version: 1.0
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 5: don't forget to increase your free space map settings
 | |
| 
 | |
| ------- =_aaaaaaaaaa0--
 | |
| 
 | |
| From pgsql-hackers-owner+M81283@postgresql.org Tue Mar 21 15:18:07 2006
 | |
| Return-path: <pgsql-hackers-owner+M81283@postgresql.org>
 | |
| Received: from ams.hub.org (ams.hub.org [200.46.204.13])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2LKI7M12970
 | |
| 	for <pgman@candle.pha.pa.us>; Tue, 21 Mar 2006 15:18:07 -0500 (EST)
 | |
| Received: from postgresql.org (postgresql.org [200.46.204.71])
 | |
| 	by ams.hub.org (Postfix) with ESMTP id AABA167BBF8;
 | |
| 	Tue, 21 Mar 2006 16:18:04 -0400 (AST)
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| Received: from localhost (av.hub.org [200.46.204.144])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 3E6009DC827
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>; Tue, 21 Mar 2006 16:17:38 -0400 (AST)
 | |
| Received: from postgresql.org ([200.46.204.71])
 | |
| 	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
 | |
| 	with ESMTP id 56406-07
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>;
 | |
| 	Tue, 21 Mar 2006 16:17:38 -0400 (AST)
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Received: from stark.xeocode.com (stark.xeocode.com [216.58.44.227])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 21DCF9DC809
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 21 Mar 2006 16:17:35 -0400 (AST)
 | |
| Received: from localhost ([127.0.0.1] helo=stark.xeocode.com)
 | |
| 	by stark.xeocode.com with smtp (Exim 3.36 #1 (Debian))
 | |
| 	id 1FLnI7-0004xB-00; Tue, 21 Mar 2006 15:17:19 -0500
 | |
| To: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| cc: "Dann Corbit" <DCorbit@connx.com>,
 | |
|    "Jonah H. Harris" <jonah.harris@gmail.com>, pgsql-hackers@postgresql.org,
 | |
|    "Jerry Sievers" <jerry@jerrysievers.com>
 | |
| Subject: Re: [HACKERS] qsort, once again
 | |
| References: <D425483C2C5C9F49B5B7A41F8944154757D6A1@postal.corporate.connx.com>
 | |
| 	<18732.1142967137@sss.pgh.pa.us>
 | |
| In-Reply-To: <18732.1142967137@sss.pgh.pa.us>
 | |
| From: Greg Stark <gsstark@mit.edu>
 | |
| Organization: The Emacs Conspiracy; member since 1992
 | |
| Date: 21 Mar 2006 15:17:19 -0500
 | |
| Message-ID: <87lkv3mu28.fsf@stark.xeocode.com>
 | |
| Lines: 13
 | |
| User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.4
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/plain; charset=us-ascii
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.128 required=5 tests=[AWL=0.128]
 | |
| X-Spam-Score: 0.128
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| 
 | |
| Tom Lane <tgl@sss.pgh.pa.us> writes:
 | |
| 
 | |
| > and here are the results using glibc's qsort, which of course isn't
 | |
| > quicksort at all but some kind of merge sort:
 | |
| > ...
 | |
| > Overall: average cratio 0.63 over 525 tests
 | |
| 
 | |
| That looks better both on average and in the worst case. Are the time
 | |
| constants that much worse that the merge sort still takes longer?
 | |
| 
 | |
| -- 
 | |
| greg
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 5: don't forget to increase your free space map settings
 | |
| 
 | |
| From pgsql-hackers-owner+M81285@postgresql.org Tue Mar 21 15:38:06 2006
 | |
| Return-path: <pgsql-hackers-owner+M81285@postgresql.org>
 | |
| Received: from ams.hub.org (ams.hub.org [200.46.204.13])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2LKc6M14799
 | |
| 	for <pgman@candle.pha.pa.us>; Tue, 21 Mar 2006 15:38:06 -0500 (EST)
 | |
| Received: from postgresql.org (postgresql.org [200.46.204.71])
 | |
| 	by ams.hub.org (Postfix) with ESMTP id 0917767BBF8;
 | |
| 	Tue, 21 Mar 2006 16:38:03 -0400 (AST)
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| Received: from localhost (av.hub.org [200.46.204.144])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 069389DC843
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>; Tue, 21 Mar 2006 16:37:39 -0400 (AST)
 | |
| Received: from postgresql.org ([200.46.204.71])
 | |
| 	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
 | |
| 	with ESMTP id 60037-07
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>;
 | |
| 	Tue, 21 Mar 2006 16:37:39 -0400 (AST)
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Received: from sss.pgh.pa.us (sss.pgh.pa.us [66.207.139.130])
 | |
| 	by postgresql.org (Postfix) with ESMTP id BDC039DC827
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 21 Mar 2006 16:37:36 -0400 (AST)
 | |
| Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
 | |
| 	by sss.pgh.pa.us (8.13.1/8.13.1) with ESMTP id k2LKbZid019858;
 | |
| 	Tue, 21 Mar 2006 15:37:35 -0500 (EST)
 | |
| To: Greg Stark <gsstark@mit.edu>
 | |
| cc: "Dann Corbit" <DCorbit@connx.com>,
 | |
|    "Jonah H. Harris" <jonah.harris@gmail.com>, pgsql-hackers@postgresql.org,
 | |
|    "Jerry Sievers" <jerry@jerrysievers.com>
 | |
| Subject: Re: [HACKERS] qsort, once again 
 | |
| In-Reply-To: <87lkv3mu28.fsf@stark.xeocode.com> 
 | |
| References: <D425483C2C5C9F49B5B7A41F8944154757D6A1@postal.corporate.connx.com> <18732.1142967137@sss.pgh.pa.us> <87lkv3mu28.fsf@stark.xeocode.com>
 | |
| Comments: In-reply-to Greg Stark <gsstark@mit.edu>
 | |
| 	message dated "21 Mar 2006 15:17:19 -0500"
 | |
| Date: Tue, 21 Mar 2006 15:37:35 -0500
 | |
| Message-ID: <19857.1142973455@sss.pgh.pa.us>
 | |
| From: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.113 required=5 tests=[AWL=0.113]
 | |
| X-Spam-Score: 0.113
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| Greg Stark <gsstark@mit.edu> writes:
 | |
| > That looks better both on average and in the worst case. Are the time
 | |
| > constants that much worse that the merge sort still takes longer?
 | |
| 
 | |
| Keep in mind that this is only counting the number of
 | |
| comparison-function calls; it's not accounting for any other effects.
 | |
| In particular, for a large sort operation quicksort might win because of
 | |
| its more cache-friendly memory access patterns.
 | |
| 
 | |
| The whole question of our qsort vs the system library's qsort probably
 | |
| needs to be revisited, however, now that we've identified and fixed this
 | |
| particular performance issue.
 | |
| 
 | |
| 			regards, tom lane
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 5: don't forget to increase your free space map settings
 | |
| 
 | |
| From pgsql-hackers-owner+M81289@postgresql.org Tue Mar 21 16:27:30 2006
 | |
| Return-path: <pgsql-hackers-owner+M81289@postgresql.org>
 | |
| Received: from ams.hub.org (ams.hub.org [200.46.204.13])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2LLRTM20101
 | |
| 	for <pgman@candle.pha.pa.us>; Tue, 21 Mar 2006 16:27:30 -0500 (EST)
 | |
| Received: from postgresql.org (postgresql.org [200.46.204.71])
 | |
| 	by ams.hub.org (Postfix) with ESMTP id ACB4F67BBFD;
 | |
| 	Tue, 21 Mar 2006 17:27:27 -0400 (AST)
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| Received: from localhost (av.hub.org [200.46.204.144])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 16E0E9DCA0F
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>; Tue, 21 Mar 2006 17:27:01 -0400 (AST)
 | |
| Received: from postgresql.org ([200.46.204.71])
 | |
| 	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
 | |
| 	with ESMTP id 69903-02
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>;
 | |
| 	Tue, 21 Mar 2006 17:27:02 -0400 (AST)
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Received: from stark.xeocode.com (stark.xeocode.com [216.58.44.227])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 107429DC867
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 21 Mar 2006 17:26:58 -0400 (AST)
 | |
| Received: from localhost ([127.0.0.1] helo=stark.xeocode.com)
 | |
| 	by stark.xeocode.com with smtp (Exim 3.36 #1 (Debian))
 | |
| 	id 1FLoNU-0006M4-00; Tue, 21 Mar 2006 16:26:56 -0500
 | |
| To: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| cc: Greg Stark <gsstark@mit.edu>, "Dann Corbit" <DCorbit@connx.com>,
 | |
|    "Jonah H. Harris" <jonah.harris@gmail.com>, pgsql-hackers@postgresql.org,
 | |
|    "Jerry Sievers" <jerry@jerrysievers.com>
 | |
| Subject: Re: [HACKERS] qsort, once again
 | |
| References: <D425483C2C5C9F49B5B7A41F8944154757D6A1@postal.corporate.connx.com>
 | |
| 	<18732.1142967137@sss.pgh.pa.us> <87lkv3mu28.fsf@stark.xeocode.com>
 | |
| 	<19857.1142973455@sss.pgh.pa.us>
 | |
| In-Reply-To: <19857.1142973455@sss.pgh.pa.us>
 | |
| From: Greg Stark <gsstark@mit.edu>
 | |
| Organization: The Emacs Conspiracy; member since 1992
 | |
| Date: 21 Mar 2006 16:26:55 -0500
 | |
| Message-ID: <87acbjmqu8.fsf@stark.xeocode.com>
 | |
| Lines: 22
 | |
| User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.4
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/plain; charset=us-ascii
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.128 required=5 tests=[AWL=0.128]
 | |
| X-Spam-Score: 0.128
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| Tom Lane <tgl@sss.pgh.pa.us> writes:
 | |
| 
 | |
| > Greg Stark <gsstark@mit.edu> writes:
 | |
| > > That looks better both on average and in the worst case. Are the time
 | |
| > > constants that much worse that the merge sort still takes longer?
 | |
| > 
 | |
| > Keep in mind that this is only counting the number of
 | |
| > comparison-function calls; it's not accounting for any other effects.
 | |
| > In particular, for a large sort operation quicksort might win because of
 | |
| > its more cache-friendly memory access patterns.
 | |
| 
 | |
| My question explicitly recognized that possibility. I'm just a little
 | |
| skeptical since the comparison function in Postgres is often not some simple
 | |
| bit of tightly optimized C code, but rather a complex locale sensitive
 | |
| comparison function or even a bit of SQL expression to evaluate.
 | |
| 
 | |
| Cache effectiveness is may be a minimal factor anyways when the comparison is
 | |
| executing more than a minimal amount of code. And one extra comparison is
 | |
| going to cost a lot more too.
 | |
| 
 | |
| -- 
 | |
| greg
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 9: In versions below 8.0, the planner will ignore your desire to
 | |
|        choose an index scan if your joining column's datatypes do not
 | |
|        match
 | |
| 
 | |
| From pgsql-hackers-owner+M81290@postgresql.org Tue Mar 21 16:48:00 2006
 | |
| Return-path: <pgsql-hackers-owner+M81290@postgresql.org>
 | |
| Received: from ams.hub.org (ams.hub.org [200.46.204.13])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2LLlxM22215
 | |
| 	for <pgman@candle.pha.pa.us>; Tue, 21 Mar 2006 16:47:59 -0500 (EST)
 | |
| Received: from postgresql.org (postgresql.org [200.46.204.71])
 | |
| 	by ams.hub.org (Postfix) with ESMTP id 28A9867BBFD;
 | |
| 	Tue, 21 Mar 2006 17:47:57 -0400 (AST)
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| Received: from localhost (av.hub.org [200.46.204.144])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 1B4849DCC25
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>; Tue, 21 Mar 2006 17:47:27 -0400 (AST)
 | |
| Received: from postgresql.org ([200.46.204.71])
 | |
| 	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
 | |
| 	with ESMTP id 72535-05
 | |
| 	for <pgsql-hackers-postgresql.org@localhost.postgresql.org>;
 | |
| 	Tue, 21 Mar 2006 17:47:28 -0400 (AST)
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Received: from sss.pgh.pa.us (sss.pgh.pa.us [66.207.139.130])
 | |
| 	by postgresql.org (Postfix) with ESMTP id D27239DCC21
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 21 Mar 2006 17:47:24 -0400 (AST)
 | |
| Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
 | |
| 	by sss.pgh.pa.us (8.13.1/8.13.1) with ESMTP id k2LLlNpJ002194;
 | |
| 	Tue, 21 Mar 2006 16:47:23 -0500 (EST)
 | |
| To: Greg Stark <gsstark@mit.edu>
 | |
| cc: "Dann Corbit" <DCorbit@connx.com>,
 | |
|    "Jonah H. Harris" <jonah.harris@gmail.com>, pgsql-hackers@postgresql.org,
 | |
|    "Jerry Sievers" <jerry@jerrysievers.com>
 | |
| Subject: Re: [HACKERS] qsort, once again 
 | |
| In-Reply-To: <87acbjmqu8.fsf@stark.xeocode.com> 
 | |
| References: <D425483C2C5C9F49B5B7A41F8944154757D6A1@postal.corporate.connx.com> <18732.1142967137@sss.pgh.pa.us> <87lkv3mu28.fsf@stark.xeocode.com> <19857.1142973455@sss.pgh.pa.us> <87acbjmqu8.fsf@stark.xeocode.com>
 | |
| Comments: In-reply-to Greg Stark <gsstark@mit.edu>
 | |
| 	message dated "21 Mar 2006 16:26:55 -0500"
 | |
| Date: Tue, 21 Mar 2006 16:47:23 -0500
 | |
| Message-ID: <2193.1142977643@sss.pgh.pa.us>
 | |
| From: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.113 required=5 tests=[AWL=0.113]
 | |
| X-Spam-Score: 0.113
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| Greg Stark <gsstark@mit.edu> writes:
 | |
| > My question explicitly recognized that possibility. I'm just a little
 | |
| > skeptical since the comparison function in Postgres is often not some simple
 | |
| > bit of tightly optimized C code, but rather a complex locale sensitive
 | |
| > comparison function or even a bit of SQL expression to evaluate.
 | |
| 
 | |
| Yeah, I'd guess the same way, but OTOH at least a few people have
 | |
| reported that our qsort code is consistently faster than glibc's (and
 | |
| that was before this fix).  See this thread:
 | |
| http://archives.postgresql.org/pgsql-hackers/2005-12/msg00610.php
 | |
| 
 | |
| Currently I believe that we only use our qsort on Solaris, not any other
 | |
| platform, so if you think that glibc's qsort is better then you've
 | |
| already got your wish.  It seems to need more investigation though.
 | |
| In particular, I'm thinking that the various adjustments we've made
 | |
| to the sort support code over the past month probably invalidate any
 | |
| previous testing of the point, and that we ought to go back and redo
 | |
| those comparisons.
 | |
| 
 | |
| 			regards, tom lane
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 5: don't forget to increase your free space map settings
 | |
| 
 | |
| From pgsql-hackers-owner+M81282@postgresql.org Tue Mar 21 14:09:22 2006
 | |
| Return-path: <pgsql-hackers-owner+M81282@postgresql.org>
 | |
| Received: from ams.hub.org (ams.hub.org [200.46.204.13])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2LK9KM11902
 | |
| 	for <pgman@candle.pha.pa.us>; Tue, 21 Mar 2006 15:09:21 -0500 (EST)
 | |
| Received: from postgresql.org (postgresql.org [200.46.204.71])
 | |
| 	by ams.hub.org (Postfix) with ESMTP id 6B1CF67BBF6;
 | |
| 	Tue, 21 Mar 2006 16:09:18 -0400 (AST)
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| Received: from localhost (av.hub.org [200.46.204.144])
 | |
| 	by postgresql.org (Postfix) with ESMTP id 0B2E19DCA0F;
 | |
| 	Tue, 21 Mar 2006 16:08:50 -0400 (AST)
 | |
| Received: from postgresql.org ([200.46.204.71])
 | |
| 	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
 | |
| 	with ESMTP id 54998-02; Tue, 21 Mar 2006 16:08:50 -0400 (AST)
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Received: from sss.pgh.pa.us (sss.pgh.pa.us [66.207.139.130])
 | |
| 	by postgresql.org (Postfix) with ESMTP id C39619DC9E6;
 | |
| 	Tue, 21 Mar 2006 16:08:45 -0400 (AST)
 | |
| Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
 | |
| 	by sss.pgh.pa.us (8.13.1/8.13.1) with ESMTP id k2LK8flq019571;
 | |
| 	Tue, 21 Mar 2006 15:08:41 -0500 (EST)
 | |
| To: Gary Doades <gpd@gpdnet.co.uk>
 | |
| cc: pgsql-performance@postgresql.org, pgsql-hackers@postgresql.org
 | |
| Subject: Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour) 
 | |
| In-Reply-To: <20781.1140046109@sss.pgh.pa.us> 
 | |
| References: <43F38867.6010701@gpdnet.co.uk> <19510.1140036968@sss.pgh.pa.us> <19779.1140038874@sss.pgh.pa.us> <43F39E53.1020009@gpdnet.co.uk> <20781.1140046109@sss.pgh.pa.us>
 | |
| Comments: In-reply-to Tom Lane <tgl@sss.pgh.pa.us>
 | |
| 	message dated "Wed, 15 Feb 2006 18:28:29 -0500"
 | |
| Date: Tue, 21 Mar 2006 15:08:40 -0500
 | |
| Message-ID: <19570.1142971720@sss.pgh.pa.us>
 | |
| From: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.113 required=5 tests=[AWL=0.113]
 | |
| X-Spam-Score: 0.113
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| Last month I wrote:
 | |
| > It seems clear that our qsort.c is doing a pretty awful job of picking
 | |
| > qsort pivots, while glibc is mostly managing not to make that mistake.
 | |
| 
 | |
| I re-ran Gary's test script using the just-committed improvements to
 | |
| qsort.c, and got pretty nice numbers (attached --- compare to
 | |
| http://archives.postgresql.org/pgsql-performance/2006-02/msg00227.php).
 | |
| So it was wrong to blame his problems on the pivot selection --- the
 | |
| culprit was that ill-considered switch to insertion sort.
 | |
| 
 | |
| 			regards, tom lane
 | |
| 
 | |
| 100 runtimes for latest port/qsort.c, sorted ascending:
 | |
| 
 | |
| Time: 335.481 ms
 | |
| Time: 335.606 ms
 | |
| Time: 335.932 ms
 | |
| Time: 336.039 ms
 | |
| Time: 336.182 ms
 | |
| Time: 336.231 ms
 | |
| Time: 336.711 ms
 | |
| Time: 336.721 ms
 | |
| Time: 336.971 ms
 | |
| Time: 336.982 ms
 | |
| Time: 337.036 ms
 | |
| Time: 337.190 ms
 | |
| Time: 337.223 ms
 | |
| Time: 337.312 ms
 | |
| Time: 337.350 ms
 | |
| Time: 337.423 ms
 | |
| Time: 337.523 ms
 | |
| Time: 337.528 ms
 | |
| Time: 337.565 ms
 | |
| Time: 337.566 ms
 | |
| Time: 337.732 ms
 | |
| Time: 337.741 ms
 | |
| Time: 337.744 ms
 | |
| Time: 337.786 ms
 | |
| Time: 337.790 ms
 | |
| Time: 337.898 ms
 | |
| Time: 337.905 ms
 | |
| Time: 337.952 ms
 | |
| Time: 337.976 ms
 | |
| Time: 338.017 ms
 | |
| Time: 338.123 ms
 | |
| Time: 338.206 ms
 | |
| Time: 338.306 ms
 | |
| Time: 338.514 ms
 | |
| Time: 338.594 ms
 | |
| Time: 338.597 ms
 | |
| Time: 338.683 ms
 | |
| Time: 338.705 ms
 | |
| Time: 338.729 ms
 | |
| Time: 338.748 ms
 | |
| Time: 338.816 ms
 | |
| Time: 338.958 ms
 | |
| Time: 338.963 ms
 | |
| Time: 338.997 ms
 | |
| Time: 339.074 ms
 | |
| Time: 339.106 ms
 | |
| Time: 339.134 ms
 | |
| Time: 339.159 ms
 | |
| Time: 339.226 ms
 | |
| Time: 339.260 ms
 | |
| Time: 339.289 ms
 | |
| Time: 339.341 ms
 | |
| Time: 339.500 ms
 | |
| Time: 339.585 ms
 | |
| Time: 339.595 ms
 | |
| Time: 339.774 ms
 | |
| Time: 339.897 ms
 | |
| Time: 339.927 ms
 | |
| Time: 340.064 ms
 | |
| Time: 340.133 ms
 | |
| Time: 340.172 ms
 | |
| Time: 340.219 ms
 | |
| Time: 340.261 ms
 | |
| Time: 340.323 ms
 | |
| Time: 340.708 ms
 | |
| Time: 340.761 ms
 | |
| Time: 340.785 ms
 | |
| Time: 340.900 ms
 | |
| Time: 340.986 ms
 | |
| Time: 341.339 ms
 | |
| Time: 341.564 ms
 | |
| Time: 341.707 ms
 | |
| Time: 342.155 ms
 | |
| Time: 342.213 ms
 | |
| Time: 342.452 ms
 | |
| Time: 342.515 ms
 | |
| Time: 342.540 ms
 | |
| Time: 342.928 ms
 | |
| Time: 343.548 ms
 | |
| Time: 343.663 ms
 | |
| Time: 344.192 ms
 | |
| Time: 344.952 ms
 | |
| Time: 345.152 ms
 | |
| Time: 345.174 ms
 | |
| Time: 345.444 ms
 | |
| Time: 346.848 ms
 | |
| Time: 348.144 ms
 | |
| Time: 348.842 ms
 | |
| Time: 354.550 ms
 | |
| Time: 356.877 ms
 | |
| Time: 357.475 ms
 | |
| Time: 358.487 ms
 | |
| Time: 364.178 ms
 | |
| Time: 370.730 ms
 | |
| Time: 493.098 ms
 | |
| Time: 648.009 ms
 | |
| Time: 849.345 ms
 | |
| Time: 860.616 ms
 | |
| Time: 936.800 ms
 | |
| Time: 1727.085 ms
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 3: Have you checked our extensive FAQ?
 | |
| 
 | |
|                http://www.postgresql.org/docs/faq
 | |
| 
 |