Discussion:
IDE and write cache
Mark Lubratt
2004-02-11 06:14:07 UTC
Interesting discussions on IDE drives and their write caches.

I have a question...

You mentioned that you'd see the problem during a large number of
concurrent transactions. My question is, is this a necessary condition
for the database crashing when the plug was pulled, or did you need to use
a large number of concurrent transactions to "guarantee" that when you
pulled the plug, that it would be at an inopportune time? In other
words, is an IDE drive still "more" susceptible to a power outage
problem even under light load?

Thanks!
Mark


---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match
scott.marlowe
2004-02-11 15:48:48 UTC
Post by Mark Lubratt
Interesting discussions on IDE drives and their write caches.
I have a question...
You mentioned that you'd see the problem during a large number of
concurrent transactions. My question is, is this a necessary condition
for the database crashing when the plug was pulled, or did you need to use
a large number of concurrent transactions to "guarantee" that when you
pulled the plug, that it would be at an inopportune time? In other
words, is an IDE drive still "more" susceptible to a power outage
problem even under light load?
Basically, if the data has been written to WAL and an fsync issued, and
the drive reports the write as done while it is still only in its cache
and not yet on the platters, and you then lose power, the database will
likely be corrupted and will refuse to start up when the machine boots.
Also, of course, some data that was supposedly committed in a
transaction will be lost.
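
To make the sequence above concrete, here is a minimal sketch of the
"write, then fsync" step a database depends on at commit time. It is not
PostgreSQL's actual WAL code; durable_append, demo and the wal.log file name
are made up for illustration. The point is only that once os.fsync returns,
the application has no further way to check whether the drive really put the
data on the platters.

```python
import os
import tempfile


def durable_append(path, record):
    """Append record to path and fsync before returning (hypothetical helper)."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    try:
        view = memoryview(record)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # fsync returns once the OS has pushed the data to the drive.
        # A drive with a volatile write cache may acknowledge at this point
        # while the data is still only in its cache; this code cannot tell.
        os.fsync(fd)
    finally:
        os.close(fd)


def demo():
    """Write two fake commit records to a temp file and read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "wal.log")  # hypothetical file name
        durable_append(log, b"commit 1\n")
        durable_append(log, b"commit 2\n")
        with open(log, "rb") as f:
            return f.read()
```

Reading the file back, as demo() does, only shows that the OS has the data.
It says nothing about the platters, which is why the plug-pull test is the
one that counts.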

So, yeah, the reason for having hundreds of open transactions is that it
makes the window of opportunity for a lying drive to corrupt the database
bigger.

So, yes, even under light load, you could have a corrupted database if you
lose power while a write is happening. Of course, if the database is
sitting idle at the time of the power outage then you're ok.

-------------------------------------------------------------------------

Funny little story. We had an electrician working above our main power
switch (the big box that switches us from line power, to UPS, to the
diesel generator) and said electrician clipped a piece of wire that fell
into the switch, shorting it out, and taking down our entire hosting
center (think $1,000 a minute...)

As I was walking down a hallway, one of the winders / fox pro guys asked
me if my machine would come back up when the power came on (it runs on
dual 36 gig 10krpm SCSI drives under an LSI megaraid with battery backed
cache, and I'd tested it by pulling the plug before going into production.) I'd
been bragging to him about the power plug pull tests it had passed, so of
course, he's just teasing me.

I told him that as long as the power cut hadn't spiked the box and fried
anything we were gold.

An hour later when they got the switch fixed and everything came back up,
my machine came up fine, but the NAS machines that provide the web storage
behind it (not the database, that's local) took about 10 minutes to fsck
or mount or whatever it is they do.

So I'm walking by foxpro guy's desk and I casually say "Well, looks like
my box had some problems coming back up." He smiles, thinking he's got
me, the bragging postgresql guy, by the short ones. "yeah, seems it boots
faster than the network storage it sits on. Just CTRL-ALT-DEL and it was
up and running fine." He laughed along with me. I trust Postgresql. On
SCSI or RAID with battery backed cache.


---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org
Tom Lane
2004-02-11 16:21:14 UTC
Post by Mark Lubratt
You mentioned that you'd see the problem during a large number of
concurrent transactions. My question is, is this a necessary condition
for the database crashing when the plug was pulled, or did you need to use
a large number of concurrent transactions to "guarantee" that when you
pulled the plug, that it would be at an inopportune time? In other
words, is an IDE drive still "more" susceptible to a power outage
problem even under light load?
A heavy load makes it more likely that you'd see the problem, but only
because it improves the odds that there will be data "in flight" at the
instant the power drops. You can still lose on a lightly loaded system
if your luck is bad.
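
As a rough back-of-the-envelope sketch of that point, assume commits arrive
at random at some average rate and each one sits in the drive's cache for a
fixed time before reaching the platters. Both numbers, and the function name
in_flight_probability, are assumptions for illustration and not measurements
from any real drive. Under that simple model, the chance that at least one
write is in flight at a random instant is 1 - exp(-rate * window):

```python
import math


def in_flight_probability(commits_per_second, cache_window_seconds):
    """Chance that at least one write is still in the drive cache at a
    random instant, assuming Poisson arrivals and a fixed cache window
    (an illustrative model, not a property of any specific drive)."""
    return 1.0 - math.exp(-commits_per_second * cache_window_seconds)


# Hypothetical numbers: a 10 ms cache window under light and heavy load.
light = in_flight_probability(1, 0.010)
heavy = in_flight_probability(500, 0.010)
```

The light-load figure is small but not zero, and it only reaches zero when
the commit rate does.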

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings
Tom Ansley
2004-02-11 16:31:54 UTC
How do I unsubscribe from this list?

I have tried everything including doing what it says at the bottom of
the messages about sending a message to majordomo with unsubscribe and
my email address as the header.

Nothing is working.

Please help.

Cheers

Tom