	Documentation on the fsync() patch from OpenLink
Submitted by: Cees de Groot <C.deGroot@inter.nl.net>
parent 715c6b6d25
commit 5995953a02
doc/README.fsync (normal file, 34 lines)
@@ -0,0 +1,34 @@
Fsync() patch (backend -F option)
=================================

Normally, the Postgres'95 backend makes sure that updates are actually
committed to disk by calling the standard function fsync() in several
places. Fsync() should guarantee that every modification to a given file
has actually been written to disk and no longer lingers in write caches.
This greatly increases the chance that a database will still be usable
after a system crash.
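
To make the guarantee concrete, here is a minimal sketch of a write that is
forced to disk before the caller continues. The helper name durable_append
is made up for this example; it is not a routine from the Postgres'95
source, and the real backend may structure its writes quite differently.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Append len bytes from buf to the file at path, then force them to disk.
 * Returns 0 on success, -1 on any failure. (durable_append is a
 * hypothetical helper for illustration, not a Postgres'95 routine.)
 */
int
durable_append(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    int         fd;

    assert(path != NULL && buf != NULL);

    fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd < 0)
        return -1;

    /* write() may accept fewer bytes than requested, so loop. */
    while (len > 0)
    {
        ssize_t n = write(fd, p, len);

        if (n < 0)
        {
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t) n;
    }

    /* Block until the OS reports that the file's data is on disk. */
    if (fsync(fd) < 0)
    {
        close(fd);
        return -1;
    }

    return close(fd);
}
```

Without the fsync() call, write() could return successfully while the data
is still sitting in the OS write cache.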

However, this operation severely slows down Postgres'95, because at each
of those points it has to wait for the OS to flush its buffers. That cost
is hardest to justify in one-shot operations, like creating a new database
or loading lots of data, where you have a clear restart point if something
goes wrong. That's where the -F option kicks in: it simply disables the
calls to fsync().
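
One way such a switch could work is sketched below. The names
disable_fsync, set_disable_fsync and maybe_fsync are assumptions made for
this example; the patch itself may wire the -F option up differently.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical flag: nonzero when the backend was started with -F. */
static int disable_fsync = 0;

/* Hypothetical setter, called once while parsing the command line. */
static void
set_disable_fsync(int on)
{
    assert(on == 0 || on == 1);
    disable_fsync = on;
}

/*
 * Hypothetical wrapper used in place of direct fsync() calls.
 * With the flag set, it reports success without asking the OS to flush.
 */
static int
maybe_fsync(int fd)
{
    if (disable_fsync)
        return 0;
    return fsync(fd);
}
```

Routing every flush through one wrapper means the rest of the code does not
need to know whether -F was given.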

Without fsync(), the OS is free to buffer, sort, and delay writes as it
sees fit, so this can be a _very_ big performance increase. However, if
the system crashes, large parts of the latest transactions may still be
sitting in memory without having been written to disk, so data loss
is almost certain to occur.
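
If you want to see the difference on your own system, a rough timing sketch
like the one below can help. The helper write_records is made up for this
example, and the record contents are arbitrary; the numbers you get will
depend heavily on your OS, filesystem, and disk.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/*
 * Write nrecords small records to path, calling fsync() after each one
 * when sync_each is nonzero. Returns elapsed wall-clock seconds, or -1.0
 * on error. (write_records is a hypothetical helper for illustration.)
 */
double
write_records(const char *path, int nrecords, int sync_each)
{
    static const char record[] = "0123456789abcdef\n";
    const ssize_t reclen = (ssize_t) (sizeof(record) - 1);
    struct timespec start, end;
    int         fd, i;

    assert(path != NULL && nrecords >= 0);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < nrecords; i++)
    {
        /* Treat a short write or a failed flush as an error. */
        if (write(fd, record, (size_t) reclen) != reclen ||
            (sync_each && fsync(fd) < 0))
        {
            close(fd);
            return -1.0;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    close(fd);
    return (double) (end.tv_sec - start.tv_sec) +
           (double) (end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling it once with sync_each set to 1 and once with 0, on the same path
and record count, gives two times you can compare.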

So it's a tradeoff between data integrity and speed. When initializing a
database, I'd use it: if the machine crashes, you simply remove the files
created and redo the operation. The same goes for bulk-loading data: on
a crash, you remove the database and restore the backup you made before
starting the bulk load (you always make backups before bulk-loading,
don't you?).

Whether you want to use it in production is up to you. If you trust your
operating system, your utility company, and your hardware, you might enable
it; however, keep in mind that you're running in an unsafe mode and that
performance gains will depend very much on access patterns (it won't
help when reading data). I'd recommend against it.