Servercrash
Stefan Holzheu
2004-05-19 16:14:04 UTC
Hello lists,

Yesterday our database server crashed (RAID error). After repairing the
filesystem (ext2), we have a problem with the largest table in the database.

VACUUM aborts with the following messages (translated from German):

DEBUG: vacuuming "messungen.massendaten"
NOTICE: relation "massendaten" TID 211540/73:
DeleteTransactionInProgress 16785658 --- cannot shrink relation
DEBUG: AbortCurrentTransaction
ERROR: invalid page header in block 354500 of relation "massendaten"

All other operations using ctid 211540/73 give:

ERROR: could not access status of transaction 16785658
DETAIL: could not open file "/var/lib/pgsql/data/pg_clog/0010":
No such file or directory

Is there a way to repair the database?

Help welcome!

Regards

Stefan
--
-----------------------------
Dr. Stefan Holzheu
Tel.: 0921/55-5720
Fax.: 0921/55-5799
BITOeK Wiss. Sekretariat
Universitaet Bayreuth
D-95440 Bayreuth
-----------------------------

Tom Lane
2004-05-19 23:44:21 UTC
Post by Stefan Holzheu
Yesterday our database server crashed (RAID error). After repairing the
filesystem (ext2), we have a problem with the largest table in the database.
DEBUG: vacuuming "messungen.massendaten"
DeleteTransactionInProgress 16785658 --- cannot shrink relation
DEBUG: AbortCurrentTransaction
ERROR: invalid page header in block 354500 of relation "massendaten"
ERROR: could not access status of transaction 16785658
No such file or directory
Is there a way to repair the database?
You have at least two corrupted pages in that table: page 354500 has a
header problem, and in page 211540 there's a bogus transaction ID in a
tuple header. That is the *minimum* extent of the data loss;
it's entirely likely that large parts of the pages involved are junk.
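
The link between the bogus transaction ID and the missing file
pg_clog/0010 can be checked with a little arithmetic. The following is
only a sketch: it assumes the default 8 kB block size, 2 status bits per
transaction, and 32 pages per clog segment file. The function name
clog_location is made up for this illustration.

```python
# Sketch: map a transaction ID to its pg_clog segment file.
# All constants are assumptions (default build settings), not values
# taken from the affected server.
BLOCK_SIZE = 8192            # assumed default page size in bytes
BITS_PER_XACT = 2            # assumed: two status bits per transaction
PAGES_PER_SEGMENT = 32       # assumed: clog pages per segment file

XACTS_PER_BYTE = 8 // BITS_PER_XACT
XACTS_PER_PAGE = BLOCK_SIZE * XACTS_PER_BYTE
XACTS_PER_SEGMENT = XACTS_PER_PAGE * PAGES_PER_SEGMENT


def clog_location(xid):
    """Return the clog file name and position holding the status of xid."""
    segment, xid_in_segment = divmod(xid, XACTS_PER_SEGMENT)
    page, xid_in_page = divmod(xid_in_segment, XACTS_PER_PAGE)
    byte_in_page, slot = divmod(xid_in_page, XACTS_PER_BYTE)
    return {
        "file": f"{segment:04X}",        # segment files are named in hex
        "page": page,
        "byte": byte_in_page,
        "bit_shift": slot * BITS_PER_XACT,
    }


if __name__ == "__main__":
    print(clog_location(16785658))
```

Under these assumptions, transaction 16785658 falls into segment 16,
which is file 0010 in hexadecimal, the file named in the error message.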

I would suggest proceeding by examining those pages with pg_filedump or
another tool. If you can make some sense of the damage it might be
possible to do a selective repair. If not, your best bet is to just
zero out the damaged pages --- this will lose the rows that are on those
pages, but at least you can get the rest of the table operational again.
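
As a rough sketch of what zeroing a page by hand involves: the block
number has to be translated into a segment file and a byte offset. The
code below assumes the default 8 kB block size and 1 GB segment files
(131072 blocks each); the helper names and the base path argument are
hypothetical. Anything like this should only be run with the server
stopped and on a copy of the data files.

```python
# Sketch: find where a heap block lives on disk and overwrite it with zeros.
# BLOCK_SIZE and BLOCKS_PER_SEGMENT are assumed defaults, not values read
# from the affected installation.
import os

BLOCK_SIZE = 8192              # assumed default page size in bytes
BLOCKS_PER_SEGMENT = 131072    # assumed: 1 GB segments of 8 kB blocks


def locate_block(block_no):
    """Return (segment number, byte offset inside that segment file)."""
    segment, block_in_segment = divmod(block_no, BLOCKS_PER_SEGMENT)
    return segment, block_in_segment * BLOCK_SIZE


def segment_path(base_path, segment):
    """Segment 0 is the bare file; later segments get a .N suffix."""
    return base_path if segment == 0 else f"{base_path}.{segment}"


def zero_block(path, offset):
    """Overwrite one block with zeros; refuse to write past the file end."""
    if offset + BLOCK_SIZE > os.path.getsize(path):
        raise ValueError("block lies beyond the end of the file")
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(b"\x00" * BLOCK_SIZE)


if __name__ == "__main__":
    for block in (211540, 354500):
        print(block, locate_block(block))
```

With these assumptions, block 211540 is in the second segment file
(suffix .1) at byte offset 659193856, and block 354500 is in the third
(suffix .2) at byte offset 756580352.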

You can find more about this by looking in the mailing list archives.
Threads mentioning pg_filedump would be good places to start.

regards, tom lane

Stefan Holzheu
2004-05-20 10:21:38 UTC
We set zero_damaged_pages to on and got rid of page 354500, but the
error in page 211540 remains :-(

Any idea?

Stefan
