Saturday, January 14, 2012

PostgreSQL. fsync and speed

I was startled to read about data corruption in PostgreSQL in the pgsql-general list, I had never heard of such thing. In Oracle it struck me as well, the ones that get corrupted once in a while are the database blocks in the database buffer, and there's nothing to do about it, but nevermind.
It turns out that the person having this issue was running his server with the fsync option turned off.

What does fsync do?

The fsync() function can be used by an application to indicate that all data for the open file description named by fildes is to be transferred to the storage device associated with the file described by fildes in an implementation-dependent manner. The fsync() function does not return until the system has completed that action or until an error is detected.
http://pubs.opengroup.org/onlinepubs/7908799/xsh/fsync.html

So changes to a file are made over a memory buffer and then you close it or use fsync to "commit" the changes to disk. If you disable it, might gain some boost on your database speed, but under the risk of loosing data should a system crash or error occur.

As a side note, I stumbled upon this stack overflow question:
http://stackoverflow.com/questions/3736533/why-doesnt-mongodb-use-fsync
Where never_had_a_name asks why on earth doesn't MongoDB use fsync on each write, and the most important answer is:

This is going against ACID, more specifically against the D, which stands for durability:
Durability [guarantees] that once the user has been notified of a transaction's success the transaction will not be lost, the transaction's data changes will survive system failure, and that all integrity constraints have been satisfied, so the DBMS won't need to reverse the transaction.

ACID properties mostly apply to traditional RDBMS systems. NoSQL systems, which includes MongoDB, give up on one or more of the ACID properties in order to achieve better scalability. In MongoDB's case durability has been sacrificed for better performance when handling large amounts of updates.

Should the necessity arise, here is one good piece of advice from Craig Ringer in the pgsql-general discussion list:
Since you were running with fsync off, you must have had good backups or replication configured, because the documentation warns you that running with fsync=off will probably destroy your data and is only for expert users.

In case you don't have a current replica or backup: first, before you do ANYTHING ELSE follow the instructions here:


http://wiki.postgresql.org/wiki/Corruption

Now - **AFTER** following the instructions above - start the database in single user recovery mode. See:

http://www.postgresql.org/docs/current/static/app-postgres.html


where the "--single" command line argument for the "postgres" executable is documented.

Once you have a single user mode backend running you can REINDEX then shut the backend down, start back up in multi-user mode, and see if you have any luck.

IMHO I would never sacrifice data integrity over performance, but you never know when a client would come with a problem like this.

No comments:

Post a Comment