Discussion:
Out of space
Tom Bakken
2004-04-07 16:27:39 UTC
I've been running Postgres for 2 or 3 years without a problem. This
morning my disk space for the database filled up. I need to know what
transaction/log files I can truncate or delete without compromising the
system. These files are located under /var/lib/pgsql/data/

Many of them have dates of more than a year ago. I'm kind of rusty with
this. Postgres works too well to keep me fluent with troubleshooting.

Tom Bakken
Information Resource Manager
Texas USDA, Rural Development
Peter Eisentraut
2004-04-07 17:19:46 UTC
Post by Tom Bakken
I've been running Postgres for 2 or 3 years without a problem.
This morning my disk space for the database filled up. I need to
know what transaction/log files I can truncate or delete without
compromising the system. These files are located under
/var/lib/pgsql/data/
The answer is normally "none" unless you have experienced crashes or
other problems that might have left stale files lying around. But you
say you have had no problems ...

If you have any suspicion in that direction, please show us the exact
files you're thinking about. A note about which PG version you are
running would also help.


---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html
Tom Bakken
2004-04-07 19:23:59 UTC
I'm running version 7.1.2. I was able to drop several tables. That cleared
up some disk space, but for some reason now, the database won't restart.
How can you determine where the problem is when you're running
/etc/rc.d/init.d/postgresql restart? Any ideas on that would be
appreciated.

I've got database dumps so I can always start over.

Here's a listing of /var/lib/pgsql/data:
.:
total 1494316
-rw------- 1 postgres postgres 4 Jun 21 2001 PG_VERSION
drwx------ 6 postgres postgres 4096 Sep 17 2003 base
drwx------ 2 postgres postgres 4096 Oct 27 13:40 global
-rw-r--r-- 1 root root 7640 Jun 29 2001 h
-rw------- 1 postgres postgres 9070 Mar 2 10:56 pg_hba.conf
-rw------- 1 postgres postgres 1118 Jun 21 2001 pg_ident.conf
-rw------- 1 postgres postgres 1528627890 Apr 7 12:32 pg_log
drwx------ 2 postgres postgres 4096 Apr 7 12:32 pg_xlog
-rw------- 1 postgres postgres 3137 Jun 21 2001 postgresql.conf
-rw------- 1 postgres postgres 52 Apr 7 12:32 postmaster.opts

As far as log files to delete, here are some I thought might be safe to
delete under

./base:
total 20
drwx------ 2 postgres postgres 4096 Jul 13 2001 1
drwx------ 2 postgres postgres 8192 Apr 7 12:11 185174
drwx------ 2 postgres postgres 4096 Jun 21 2001 18719
drwx------ 2 postgres postgres 4096 Apr 7 12:11 213304

./base/1:
total 1556
-rw------- 1 postgres postgres 0 Jun 21 2001 1215
-rw------- 1 postgres postgres 0 Jun 21 2001 1216
-rw------- 1 postgres postgres 8192 Jun 21 2001 1219
-rw------- 1 postgres postgres 16384 Jul 16 2001 1247
-rw------- 1 postgres postgres 73728 Jul 13 2001 1249
-rw------- 1 postgres postgres 229376 Jun 21 2001 1255
-rw------- 1 postgres postgres 16384 Jul 16 2001 1259
-rw------- 1 postgres postgres 0 Jun 21 2001 16567
-rw------- 1 postgres postgres 8192 Jun 21 2001 16579
-rw------- 1 postgres postgres 16384 Jun 21 2001 16600
-rw------- 1 postgres postgres 73728 Jun 21 2001 16617
-rw------- 1 postgres postgres 8192 Jun 21 2001 16642
-rw------- 1 postgres postgres 8192 Jun 21 2001 16653
-rw------- 1 postgres postgres 16384 Jun 21 2001 16685
-rw------- 1 postgres postgres 8192 Jun 21 2001 16867
-rw------- 1 postgres postgres 8192 Jun 21 2001 16934
-rw------- 1 postgres postgres 0 Jun 21 2001 16948
-rw------- 1 postgres postgres 8192 Jun 21 2001 16960
-rw------- 1 postgres postgres 0 Jun 21 2001 17033
-rw------- 1 postgres postgres 0 Jun 21 2001 17045
-rw------- 1 postgres postgres 8192 Jun 21 2001 17058
.
.
.
.

I've got a couple of directories that I suspect have stale files. One of
them:

./base/185174 contains what appears to be current information.

Thanks

Tom Bakken
Information Resource Manager
Texas USDA, Rural Development

Tom Lane
2004-04-07 17:22:20 UTC
Post by Tom Bakken
I've been running Postgres for 2 or 3 years without a problem. This
morning my disk space for the database filled up. I need to know what
transaction/log files I can truncate or delete without compromising the
system. These files are located under /var/lib/pgsql/data/
I wouldn't recommend deleting *any* files manually --- unless you find
core files or old files underneath a pgsql_tmp subdirectory. Those you
could zap at little risk.

The best approach is to free up a small amount of space elsewhere,
enough so you can get through a CHECKPOINT without failing. The
checkpoint will hopefully free up some space in pg_xlog. After that you
can look at dropping tables you don't need any more, VACUUM FULL, etc.

regards, tom lane
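
Before deleting anything, it helps to see which files actually hold the space. A minimal read-only sketch, assuming a Python interpreter is available on the host and that it is run as a user who can read the data directory; the function name `largest_files` is made up for illustration, and the path is the data directory named in this thread:

```python
import os
import stat


def largest_files(root, top=5):
    """Return up to `top` (size_in_bytes, path) pairs, largest first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)
            except OSError:
                continue  # vanished or unreadable; skip it
            if stat.S_ISREG(info.st_mode):
                found.append((info.st_size, path))
    found.sort(reverse=True)
    return found[:top]


if __name__ == "__main__":
    # Read-only: this lists files, it never deletes or truncates anything.
    for size, path in largest_files("/var/lib/pgsql/data"):
        print(f"{size:>14,d}  {path}")
```

A single file that dwarfs everything else is a better first suspect than the numbered files under `base`, which are the tables and indexes themselves.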

Tom Bakken
2004-04-07 19:42:35 UTC
I looked in the pg_log file, and it complains that xlogtemp.1091 is missing:

[***@linux04 data]# tail pg_log
DEBUG: Redo record at (1, 516646732); Undo record at (0, 0); Shutdown TRUE
DEBUG: NextTransactionId: 28728439; NextOid: 9098648
FATAL 2: ZeroFill(/var/lib/pgsql/data/pg_xlog/xlogtemp.1033) failed: No such file or directory
/usr/bin/postmaster: Startup proc 1033 exited with status 512 - abort
DEBUG: database system was shut down at 2004-04-07 12:14:38 CDT
DEBUG: CheckPoint record at (1, 516646732)
DEBUG: Redo record at (1, 516646732); Undo record at (0, 0); Shutdown TRUE
DEBUG: NextTransactionId: 28728439; NextOid: 9098648
FATAL 2: ZeroFill(/var/lib/pgsql/data/pg_xlog/xlogtemp.1091) failed: No such file or directory
/usr/bin/postmaster: Startup proc 1091 exited with status 512 - abort

I'm sure I didn't delete it. Regardless, I hope one of you can suggest
something based on this.

Tom Bakken
Information Resource Manager
Texas USDA, Rural Development

Tom Lane
2004-04-07 19:57:18 UTC
Post by Tom Bakken
FATAL 2: ZeroFill(/var/lib/pgsql/data/pg_xlog/xlogtemp.1091) failed: No
such file or directory
/usr/bin/postmaster: Startup proc 1091 exited with status 512 - abort
I'm sure I didn't delete it.
This is just trying to make a new, empty xlog file. I don't quite
understand why the errno is "No such file or directory" --- you wouldn't
think that write() could return that errno. But the most likely bet is
that you don't yet have enough free space on the disk. These files are
16MB each, and it could be that more than one needs to be made.

How much stuff is there in /var/lib/pgsql/data/pg_xlog anyway? I think
that 7.1.2 predates some changes we made to keep down the number of xlog
files that would be kept around.

regards, tom lane
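
To put a number on "enough free space", here is a rough sketch that counts how many 16 MB files would fit in the space currently free. It assumes Python is available; `segments_that_fit` and the constant name are illustrative, not anything Postgres provides:

```python
import shutil

XLOG_SEGMENT_BYTES = 16 * 1024 * 1024  # 16 MB per xlog file, as stated above


def segments_that_fit(path):
    """How many whole 16 MB files fit in the space free at `path`."""
    return shutil.disk_usage(path).free // XLOG_SEGMENT_BYTES


if __name__ == "__main__":
    print(segments_that_fit("/var/lib/pgsql/data/pg_xlog"))
```

A result of 0 or 1 would be consistent with the startup process failing while it tries to create a new file.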

Tom Bakken
2004-04-07 20:45:37 UTC
Tom,

Here's the situation:
[***@linux04 init.d]# cd /var/lib/pgsql/data/
[***@linux04 data]# ls -l
total 1494316
-rw------- 1 postgres postgres 4 Jun 21 2001 PG_VERSION
drwx------ 6 postgres postgres 4096 Sep 17 2003 base
drwx------ 2 postgres postgres 4096 Oct 27 13:40 global
-rw-r--r-- 1 root root 7640 Jun 29 2001 h
-rw------- 1 postgres postgres 9070 Mar 2 10:56 pg_hba.conf
-rw------- 1 postgres postgres 1118 Jun 21 2001 pg_ident.conf
-rw------- 1 postgres postgres 1528630320 Apr 7 14:26 pg_log
drwx------ 2 postgres postgres 4096 Apr 7 14:26 pg_xlog
-rw------- 1 postgres postgres 3137 Jun 21 2001 postgresql.conf
-rw------- 1 postgres postgres 52 Apr 7 14:26 postmaster.opts
[***@linux04 data]# ls -l pg_xlog/
total 16404
-rw------- 1 postgres postgres 16777216 Apr 7 12:14 000000010000001E

I do have a limited amount of space in the partition but I'd like to get rid
of more. Just not sure what to delete if anything.

Tom Bakken
Information Resource Manager
Texas USDA, Rural Development

Tom Lane
2004-04-07 20:56:53 UTC
Post by Tom Bakken
total 1494316
-rw------- 1 postgres postgres 4 Jun 21 2001 PG_VERSION
drwx------ 6 postgres postgres 4096 Sep 17 2003 base
drwx------ 2 postgres postgres 4096 Oct 27 13:40 global
-rw-r--r-- 1 root root 7640 Jun 29 2001 h
-rw------- 1 postgres postgres 9070 Mar 2 10:56 pg_hba.conf
-rw------- 1 postgres postgres 1118 Jun 21 2001 pg_ident.conf
-rw------- 1 postgres postgres 1528630320 Apr 7 14:26 pg_log
drwx------ 2 postgres postgres 4096 Apr 7 14:26 pg_xlog
-rw------- 1 postgres postgres 3137 Jun 21 2001 postgresql.conf
-rw------- 1 postgres postgres 52 Apr 7 14:26 postmaster.opts
total 16404
-rw------- 1 postgres postgres 16777216 Apr 7 12:14 000000010000001E
I do have a limited amount of space in the partition but I'd like to get
rid of more. Just not sure what to delete if anything.
Hm, what is that pg_log file? It's not part of the normal Postgres
fileset. Is it perhaps just the postmaster's stderr output? If so,
you're in luck: truncate that as you see fit, and you'll have some
breathing room.

regards, tom lane
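
Truncating the file in place, rather than deleting it and recreating it from a root shell, leaves the owner and permissions untouched, so the server account can still write to it afterwards. A minimal sketch under that assumption; `truncate_in_place` is a hypothetical helper and the path is passed on the command line:

```python
import os
import sys


def truncate_in_place(path):
    """Empty `path` without replacing it; True if inode and owner are unchanged."""
    before = os.stat(path)
    os.truncate(path, 0)  # same file, same owner, same mode; only the length changes
    after = os.stat(path)
    same_file = (before.st_ino, before.st_uid, before.st_gid) == (
        after.st_ino, after.st_uid, after.st_gid)
    return same_file and after.st_size == 0


if __name__ == "__main__":
    # Destructive: the contents of the named file are discarded.
    print(truncate_in_place(sys.argv[1]))
```

It would be run with the log file as its only argument, for example the `pg_log` file from the listing above, and only once nothing in that log is still needed.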

Tom Bakken
2004-04-07 21:36:51 UTC
Tom,

Doh!! I created that log file and let it get out of hand.

OK, it's truncated and now I've got plenty of space, but it's still
complaining that it can't find xlogtemp.1405:

DEBUG: database system was shut down at 2004-04-07 12:14:38 CDT
DEBUG: CheckPoint record at (1, 516646732)
DEBUG: Redo record at (1, 516646732); Undo record at (0, 0); Shutdown TRUE
DEBUG: NextTransactionId: 28728439; NextOid: 9098648
FATAL 2: ZeroFill(/var/lib/pgsql/data/pg_xlog/xlogtemp.1405) failed: No such file or directory
/usr/bin/postmaster: Startup proc 1405 exited with status 512 - abort

Again, I know I didn't delete it, but regardless, I'm unsure where to go
from here.

Thanks for all your help. I hope we're close to a fix.


Tom Bakken
Information Resource Manager
Texas USDA, Rural Development

Tom Bakken
2004-04-07 19:28:40 UTC
Tom,

I'm not finding any mention of CHECKPOINT in my references. Is that
something from a version newer than 7.1.2?

Tom Bakken
Information Resource Manager
Texas USDA, Rural Development

Tom Lane
2004-04-07 19:46:06 UTC
I'm not finding any mention of CHECKPOINT in my references. Is that
something from a version newer than 7.1.2?
You're running 7.1.2? My, that *is* an old installation. You really
ought to think about an update, particularly if you might be approaching
the 4-billion-transaction event horizon. You do not want to suffer XID
wraparound in a 7.1 installation :-(. See this link for explanations:
http://www.postgresql.org/docs/7.4/static/maintenance.html#VACUUM-FOR-WRAPAROUND

7.1 does have the CHECKPOINT command, though, whether you see it
documented or not.

regards, tom lane

Tom Bakken
2004-04-07 20:13:52 UTC
Of course, I was planning to upgrade but as with most things, too little,
too late...

At this point, I just want to keep it running until I can move to my planned
new platform. Can you tell me where to start with CHECKPOINT?

If it's any help, my problem appears to be a missing file. This is from my
pg_log:

DEBUG: database system was shut down at 2004-04-07 12:14:38 CDT
DEBUG: CheckPoint record at (1, 516646732)
DEBUG: Redo record at (1, 516646732); Undo record at (0, 0); Shutdown TRUE
DEBUG: NextTransactionId: 28728439; NextOid: 9098648
FATAL 2: ZeroFill(/var/lib/pgsql/data/pg_xlog/xlogtemp.1091) failed: No such file or directory
/usr/bin/postmaster: Startup proc 1091 exited with status 512 - abort

Thanks

Tom Bakken
Information Resource Manager
Texas USDA, Rural Development

Tom Lane
2004-04-07 22:24:30 UTC
OK, it's truncated and now I've got plenty of space, but it's still
FATAL 2: ZeroFill(/var/lib/pgsql/data/pg_xlog/xlogtemp.1405) failed: No
such file or directory
I think the "no such file" errno is probably actively misleading. I
took another look at the CVS logs and realized that in 7.1.2, there is
no guarantee that that message actually reflects the cause of the write
failure --- if write() indicates it couldn't write all the bytes, but
does not set errno, then the reported errno will be left over from the
last failed operation. We had patched this by 7.1.3, which is the
version I was looking at locally. Since ENOENT can't be returned by
write() AFAIK, it seems certain that this is indeed a leftover errno
setting.
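
The reasoning in the paragraph above can be sketched as a small decision rule. This is only an illustration of one common way to avoid reporting a stale errno, not a description of what the 7.1.3 patch actually changed; `errno_to_report` is a made-up name, and the caller is assumed to reset errno to 0 just before calling write():

```python
import errno


def errno_to_report(requested, written, errno_after_write):
    """Choose the errno to report for a write of `requested` bytes.

    `written` is what write() returned; `errno_after_write` is errno as read
    right after the call, having been reset to 0 just before it.
    """
    if written == requested:
        return 0  # full write, nothing to report
    if errno_after_write == 0:
        # Short write with errno untouched: assume the disk is full rather
        # than reporting whatever an earlier failed call left behind.
        return errno.ENOSPC
    return errno_after_write


if __name__ == "__main__":
    print(errno.errorcode[errno_to_report(16 * 1024 * 1024, 8192, 0)])
```

Without the reset, a leftover ENOENT from some earlier call would be reported for the short write, which matches the misleading message seen here.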

In short, I still think you are running into some kind of
out-of-disk-space failure. I'm not sure what, but you might look to
whether you've exceeded the postgres user's disk space quota, or
anything along that line. Keep in mind also that an unprivileged user
account normally can't fill the disk as full as root can.

regards, tom lane
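
The gap between what root and an unprivileged account can use is visible in the filesystem statistics. A minimal sketch for a Unix-like host with Python available; `free_bytes` is an illustrative name, and it does not look at per-user quotas, which need the system's quota tools:

```python
import os


def free_bytes(path):
    """Return (free_for_root, free_for_unprivileged) in bytes for the filesystem at `path`."""
    vfs = os.statvfs(path)
    return vfs.f_bfree * vfs.f_frsize, vfs.f_bavail * vfs.f_frsize


if __name__ == "__main__":
    for_root, for_others = free_bytes("/var/lib/pgsql/data")
    print(f"free for root:            {for_root:,d}")
    print(f"free for other accounts:  {for_others:,d}")
    print(f"reserved for root:        {for_root - for_others:,d}")
```

If the second figure is below 16 MB while the first is not, the postgres account cannot create a new xlog file even though the disk does not look full from a root shell.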

Tom Bakken
2004-04-08 13:10:25 UTC
Tom,

I'm not sure how to check whether the postgres user has hit a disk space
quota. I tried storing a rather large file on the postgres partition as the
postgres user and had no problem. I'm more suspicious of the file stored in:

[***@linux04 data]# ls -l /var/lib/pgsql/data/pg_xlog/
total 16404
-rw------- 1 postgres postgres 16777216 Apr 7 12:14 000000010000001E

Is that typical? It doesn't look like it.

Anyway, I've got plenty of space but am unsure of the next step.

Tom Bakken
Information Resource Manager
Texas USDA, Rural Development

Ericson Smith
2004-04-08 13:27:37 UTC
Why not just get a bigger disk?

Warmest regards,
Ericson Smith
Tracking Specialist/DBA
+-----------------------+---------------------------------+
| http://www.did-it.com | "When you have to shoot, shoot, |
| ***@did-it.com | don't talk! - Tuco |
| 516-255-0500 | |
+-----------------------+---------------------------------+
Tom Bakken
2004-04-08 14:08:51 UTC
Just a brief summary of my problem and its resolution:

The postgres log file was gobbling up all my disk space and I wasn't paying
attention. Postgres shut down and wouldn't restart. I truncated the log
file and now had plenty of space but Postgres still wouldn't start.

I noticed it wasn't posting to the log file. When I truncated the log file
I inadvertently changed its ownership to root. I corrected the situation,
but it was still not writing to the log file. I changed the permissions but
again, it still wasn't starting up. I had turned on more extensive
debugging. When I removed the flag, I was surprised to see Postgres start
normally. I must have had it set wrong.
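
A quick way to catch the ownership part of this is to check that the log file still belongs to the account the server runs as. A minimal sketch for a Unix-like host; `writable_by_owner_uid` is a hypothetical helper, and the `postgres` account name and log path are taken from the listings earlier in the thread:

```python
import os
import stat


def writable_by_owner_uid(path, uid):
    """True if `path` is owned by `uid` and has the owner-write bit set."""
    info = os.stat(path)
    return info.st_uid == uid and bool(info.st_mode & stat.S_IWUSR)


if __name__ == "__main__":
    import pwd

    # Assumes an account named "postgres" exists; getpwnam raises KeyError otherwise.
    postgres_uid = pwd.getpwnam("postgres").pw_uid
    print(writable_by_owner_uid("/var/lib/pgsql/data/pg_log", postgres_uid))
```

A result of False after any manual work on the log file points at ownership or mode before anything more exotic.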

Boy do I feel dumb. I think the quote is, "Man proposes, but God disposes."
It's been a humbling experience to air my ignorance.

I really appreciate Postgres and this forum.

Tom, you've helped me (and others) more than once. Many thanks.

Tom Bakken
Information Resource Manager
Texas USDA, Rural Development
