Wednesday, September 24, 2008

Re: [ADMIN] missing chunk number 0 for toast value

I ran into this issue a while ago. Here's my long internal tech note to my
dev guys on what I did, modified a bit to be more generic:

I'm in the process of migrating our internal db server, and I decided to use the helpdesk as my test database. It backed up fine last night. Something went horribly wrong today, since pg_dump tells me:

> pg_dump: ERROR: missing chunk number 0 for toast value 110439697
> pg_dump: SQL command to dump the contents of table "attachments" failed: PQendcopy() failed.
> pg_dump: Error message from server: ERROR: missing chunk number 0 for toast value 110439697
> pg_dump: The command was: COPY public.attachments (id, transactionid, parent, messageid, subject, filename, contenttype, contentencoding, content, headers, creator, created) TO stdout;


I reindexed attachments. I reindexed the db. I retoasted the index. Or
reindexed the toast. Or toasted the index. Or something:

rt3=# select reltoastrelid::regclass from pg_class where relname = 'attachments';
reltoastrelid
---------------------------
pg_toast.pg_toast_8507627
(1 row)

rt3=# reindex table pg_toast.pg_toast_8507627;
REINDEX
rt3=# \q
-bash-2.05b$ pg_dump rt3 > /tmp/rt3

pg_dump: ERROR: missing chunk number 0 for toast value 110439697
pg_dump: SQL command to dump the contents of table "attachments" failed: PQendcopy() failed.
pg_dump: Error message from server: ERROR: missing chunk number 0 for toast value 110439697
pg_dump: The command was: COPY public.attachments (id, transactionid, parent, messageid, subject, filename, contenttype, contentencoding, content, headers, creator, created) TO stdout;


That didn't work. So... I figured I could find out which record was the bad one. Doing this:

select * from attachments limit 5000 offset 0;
select * from attachments limit 5000 offset 5000;
select * from attachments limit 5000 offset 10000;
select * from attachments limit 5000 offset 15000;

quickly showed me that record 16179 was causing this issue.
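(For a much bigger table you could do the same hunt as a binary search over offsets. Here's a minimal sketch, not something I actually ran: it assumes PyGreSQL's classic pg module, as in the script further down, exactly one bad value, an id primary key to order by, since plain limit/offset without an order by isn't guaranteed to be stable between queries, and a made-up row count of 100000. Selecting the rows is what forces the toasted content column to be read, which is where the "missing chunk" error fires.)

import pg

db = pg.connect('rt3', 'myolddbserver')

def readable(limit, offset):
    # True if this slice of attachments detoasts cleanly
    try:
        db.query("select * from attachments order by id "
                 "limit %d offset %d" % (limit, offset))
        return True
    except Exception:  # the missing-chunk error surfaces as a query error
        return False

lo, hi = 0, 100000              # hi: anything >= the table's row count
while hi - lo > 1:              # invariant: the bad offset is in [lo, hi)
    mid = (lo + hi) // 2
    if readable(mid - lo, lo):  # offsets lo..mid-1 are clean...
        lo = mid                # ...so the bad row is at mid or later
    else:
        hi = mid                # otherwise it's before mid
print "bad row is at offset", lo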


So, in order to resolve this, I did this:

pg_dump -s rt3 > /tmp/rt3schema

followed by:

for each in `psql rt3 -c "\d" | awk '{print $3}' | grep -vw attachments`; do
    pg_dump rt3 -t $each >> /tmp/rt3data
done

(This second command dumps every table except attachments.)
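(Worth noting: newer versions of pg_dump, 8.2 and later if memory serves, support --exclude-table, so pg_dump rt3 -T attachments >> /tmp/rt3data does the same exclusion in one step. The loop above works everywhere.)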

Then I used pg.py to do this:

import pg

# connect to the old (corrupt) server and to the new one
olddb = pg.connect('rt3', 'myolddbserver')
new = pg.connect('rt3', 'localhost')

# slice around the bad record: everything before it, then everything after
first = olddb.query("""select * from attachments limit 16169""").getresult()
last = olddb.query("""select * from attachments limit 100000 offset 16170""").getresult()

for eachRec in first + last:
    new.query("""insert into attachments values %r""" % (eachRec,))

*** I did have to deal with some quotification issues, but you get the
point.
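
If you want to sidestep the quotification entirely, PyGreSQL's higher-level pg.DB wrapper can do the escaping for you. Another minimal sketch, not what I ran: it assumes the bad row can be excluded by its primary key, and the 16170 below is a hypothetical stand-in for whichever id you actually find:

import pg

# pg.DB takes the same connection arguments as pg.connect
olddb = pg.DB('rt3', 'myolddbserver')
newdb = pg.DB('rt3', 'localhost')

bad_id = 16170  # hypothetical: substitute the real primary key value

# dictresult() returns rows as column-name dicts, which insert() accepts
for row in olddb.query("select * from attachments where id <> %d"
                       % bad_id).dictresult():
    newdb.insert('attachments', row)  # pg.DB quotes and escapes for us

The filter on id never touches the content column, so the broken toast pointer is never followed.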


-----Original Message-----
From: pgsql-admin-owner@postgresql.org
[mailto:pgsql-admin-owner@postgresql.org] On Behalf Of BJ Taylor
Sent: Sep 24, 2008 1:32 PM
To: pgsql-admin@postgresql.org
Subject: [ADMIN] missing chunk number 0 for toast value

Our database seems to have been corrupted. It is a heavily used database, and it went several months without any type of vacuuming. When we finally realized it wasn't being vacuumed, we started the process, but it never completed successfully, and our database has never been the same since.

The exact error that we receive now is as follows:


postgres@server:~> pg_dumpall -p 5433 > dbmail_dumpall_23092008.sql
pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: missing chunk number 0 for toast value 554339
pg_dump: The command was: COPY public.dbmail_messageblks (messageblk_idnr, physmessage_id, blocksize, is_header, messageblk) TO stdout;
pg_dumpall: pg_dump failed on database "dbmail", exiting


We have tried using pg_dumpall's -d option so we could get a full dump of the database and just start over from that, but that fails as well. We have also tried reindexing the table; the reindex didn't fail, but it didn't solve our problem either. Our next option is a full vacuum, but we are reluctant to take our mail server down for that long, especially when we do not know for sure that it will either succeed or fix our problem. I have searched the forums, but was unable to find a solution that we have not already tried, and the solutions didn't appear to help others who had this problem either.

Any suggestions?

Thanks,
BJ

