> A customer of ours has been having trouble with corrupted data for some
> time. Of course, we've almost always blamed hardware (and we've seen
> RAID controllers have their firmware upgraded, among other actions), but
> the useful thing to know is when corruption has happened, and where.
> So we've been tasked with adding CRCs to data files.
> The idea is that these CRCs are going to be checked just after reading
> files from disk, and calculated just before writing it. They are
> just a protection against the storage layer going mad; they are not
> intended to protect against faulty RAM, CPU or kernel.
This has been suggested before, and the usual objection is precisely
that it only protects from errors in the storage layer, giving a false
sense of security.
Doesn't some filesystems include a per-block CRC, which would achieve
the same thing? ZFS?
> This code would be run-time or compile-time configurable. I'm not
> absolutely sure which yet; the problem with run-time is what to do if
> the user restarts the server with the setting flipped. It would have
> almost no impact on users who don't enable it.
Yeah, seems like it would need to be compile-time or initdb-time
> The implementation I'm envisioning requires the use of a new relation
> fork to store the per-block CRCs. Initially I'm aiming at a CRC32 sum
> for each block. FlushBuffer would calculate the checksum and store it
> in the CRC fork; ReadBuffer_common would read the page, calculate the
> checksum, and compare it to the one stored in the CRC fork.
Surely it would be much simpler to just add a field to the page header.