Discussion:
Do Petabyte storage solutions exist?
Tony Reina
2004-04-01 17:41:46 UTC
Permalink
I have a database that will hold massive amounts of scientific data.
Potentially, some estimates are that we could get into needing
Petabytes (1,000 Terabytes) of storage.

1. Do off-the-shelf servers exist that will do Petabyte storage?

2. Is it possible for PostgreSQL to segment a database between
multiple servers? (I was looking at a commercial vendor who had a
product that took rarely used data in Oracle databases and migrated
them to another server to keep frequently accessed data more readily
available.)

Thanks.
-Tony
Bradley Kieser
2004-04-01 18:53:07 UTC
Permalink
Not really answering the question but I thought I would post this anyway
as it may be of interest.

If you want to have some fun (depending on how production-level the
system needs to be) you can build this level of storage using Linux
clusters and cheap IDE drives. No April Fool's joke! I have built servers
in TB blocks using cheap IDE drives in RAID 5 configs! You just whack in
one of those 4-way or 8-way cards and the new high capacity drives
(300GB most likely atm although 250GB are massively cheaper). That's 1TB
-> 2TB per IDE slot x 6 plus the 2 on the motherboard. So you are
talking 12TB per server. Rework a 2U chassis and it's rack-em-up time
and go!
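Taking that ~12 TB-per-server figure at face value (it's a back-of-envelope estimate, not a measured number), a quick sketch of how many such 2U boxes Tony's petabyte would need:

```python
import math

# Figures from the posts above: 1 PB = 1,000 TB (Tony's estimate),
# ~12 TB per cheap-IDE RAID 5 server (Brad's estimate).
petabyte_tb = 1000
tb_per_server = 12

servers_needed = math.ceil(petabyte_tb / tb_per_server)
print(servers_needed)  # 84 servers, i.e. a couple of full racks of 2U boxes
```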

There are extender cards, of course, that will allow you to put in more
drives and with SCSI the game changes completely because you can just
chain them together on a single line.

Okay, there are seriously better options than this, of course, and you
probably have used one of them, but this is still fun!

I think as far as PG storage goes you're really on a losing streak here
because PG clustering really isn't going to support this across multiple
servers. We're not even close to the mark as far as clustered servers
and replication management goes, let alone the storage limit of 2GB per
table. So sadly, PG would have to bow out of this IMHO unless someone
else nukes me on this!

Brad
Post by Tony Reina
I have a database that will hold massive amounts of scientific data.
Potentially, some estimates are that we could get into needing
Petabytes (1,000 Terabytes) of storage.
1. Do off-the-shelf servers exist that will do Petabyte storage?
2. Is it possible for PostgreSQL to segment a database between
multiple servers? (I was looking at a commercial vendor who had a
product that took rarely used data in Oracle databases and migrated
them to another server to keep frequently accessed data more readily
available.)
Thanks.
-Tony
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster
Tony and Bryn Reina
2004-04-01 19:17:51 UTC
Permalink
let alone the storage limit of 2GB per
Post by Bradley Kieser
table. So sadly, PG would have to bow out of this IMHO unless someone
else nukes me on this!
I just checked the PostgreSQL website and it says that tables are limited to
16 TB not 2 GB.

-Tony

scott.marlowe
2004-04-01 20:56:18 UTC
Permalink
Post by Bradley Kieser
let alone the storage limit of 2GB per
Post by Bradley Kieser
table. So sadly, PG would have to bow out of this IMHO unless someone
else nukes me on this!
I just checked the PostgreSQL website and it says that tables are limited to
16 TB not 2 GB.
Actually, it's 32 TB, which can be quadrupled by increasing the block size
to 32k, the maximum allowed, which would make the maximum table size 128
TB.
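Those figures follow from PostgreSQL's 32-bit block numbers: the maximum heap size is 2^32 blocks times the block size, so quadrupling the block size quadruples the table limit. A sketch of the arithmetic:

```python
def max_table_tb(block_size_kb):
    # A table is addressed by a 32-bit block number, so it can hold
    # at most 2**32 blocks of block_size_kb kilobytes each.
    max_blocks = 2 ** 32
    return max_blocks * block_size_kb * 1024 // 1024 ** 4

print(max_table_tb(8))   # 32 TB with the default 8 kB blocks
print(max_table_tb(32))  # 128 TB with 32 kB blocks, the compile-time maximum
```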

I just saw your response before firing off my previous messages.
Apologies if I came off harsh, but I've heard people at my office saying
similar things because they "heard it on the mailing lists" so it must be
true.



Bradley Kieser
2004-04-02 10:38:01 UTC
Permalink
Ah! It's been updated then! Coolio! You just can't beat OpenSource!
;-)
Thx for the update!

Brad
Post by Bradley Kieser
let alone the storage limit of 2GB per
Post by Bradley Kieser
table. So sadly, PG would have to bow out of this IMHO unless someone
else nukes me on this!
I just checked the PostgreSQL website and it says that tables are limited to
16 TB not 2 GB.
-Tony
scott.marlowe
2004-04-02 16:15:08 UTC
Permalink
For quite some time. I believe the max table size of 32 TB was in effect
as far back as 6.5 or so. It's not some new thing. Now, the 8k row
barrier was broken with 7.1. I personally found the 8k row size barrier
to be a bigger problem back then. And 7.1 broke that in 2001, almost
exactly three years ago. 6.5 came out in 1999-06-09, so the limit to table
sizes was gone a very long time ago.
Post by Bradley Kieser
Ah! It's been updated then! Coolio! You just can't beat OpenSource!
;-)
Thx for the update!
Brad
Post by Bradley Kieser
let alone the storage limit of 2GB per
Post by Bradley Kieser
table. So sadly, PG would have to bow out of this IMHO unless someone
else nukes me on this!
I just checked the PostgreSQL website and it says that tables are limited to
16 TB not 2 GB.
-Tony
Tony Reina
2004-04-05 07:44:12 UTC
Permalink
Post by scott.marlowe
For quite some time. I believe the max table size of 32 TB was in effect
as far back as 6.5 or so. It's not some new thing. Now, the 8k row
barrier was broken with 7.1. I personally found the 8k row size barrier
to be a bigger problem back then. And 7.1 broke that in 2001, almost
exactly three years ago. 6.5 came out in 1999-06-09, so the limit to table
sizes was gone a very long time ago.
The PostgreSQL limitations on the users' page
(http://www.postgresql.org/users-lounge/limitations.html) still says
that tables are limited to 16 TB, not 32 TB.

Perhaps it should be updated?

-Tony
Tom Lane
2004-04-05 14:52:12 UTC
Permalink
Post by Tony Reina
The PostgreSQL limitations on the users' page
(http://www.postgresql.org/users-lounge/limitations.html) still says
that tables are limited to 16 TB, not 32 TB.
Perhaps it should be updated?
There was some concern at the time it was written as to whether we were
sure that we'd fixed all the places that treated block numbers as signed
rather than unsigned ints. I still misdoubt that this should be
considered a tested and guaranteed-to-work thing. Those who have done
any testing of, eg, VACUUM FULL on greater-than-16TB tables, please
raise your hands?
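The signed-versus-unsigned concern above is exactly where the 16 TB and 32 TB figures come from: with the default 8 kB blocks, a signed 32-bit block number runs out at 16 TB, an unsigned one at 32 TB. A quick check of that arithmetic:

```python
BLOCK_SIZE = 8 * 1024  # default 8 kB block

# Block numbers are 32-bit; treating them as signed halves the range.
signed_limit_bytes = 2 ** 31 * BLOCK_SIZE
unsigned_limit_bytes = 2 ** 32 * BLOCK_SIZE

print(signed_limit_bytes // 1024 ** 4)    # 16 TB if any code treats them as signed
print(unsigned_limit_bytes // 1024 ** 4)  # 32 TB if all code treats them as unsigned
```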

regards, tom lane

Tony and Bryn Reina
2004-04-01 19:15:13 UTC
Permalink
----- Original Message -----
From: "Bradley Kieser" <***@kieser.net>
To: "Tony Reina" <***@hotmail.com>
Cc: <pgsql-***@postgresql.org>
Sent: Thursday, April 01, 2004 8:53 PM
Subject: Re: [ADMIN] Do Petabyte storage solutions exist?


let alone the storage limit of 2GB per
Post by Bradley Kieser
table. So sadly, PG would have to bow out of this IMHO unless someone
else nukes me on this!
Uh oh, 2 GB limit on table sizes. I didn't realize the limit was that low.

Would a commercial DBMS be the better solution for handling Terabyte databases
and above?


-Tony

Jürgen Cappel
2004-04-01 19:55:55 UTC
Permalink
AFAIK Postgres uses an internal limit of 2 GB per table file with
a lot of files per table to make up some Terabytes. So don't worry!
Let's see what one of the gurus will tell us. Bye.
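The per-file limit is worked around exactly as described: PostgreSQL splits each table into fixed-size segment files on disk. (The segment size is a compile-time constant; the 2 GB figure quoted above is one build setting, and modern builds default to 1 GB.) A sketch of how many on-disk files a big table turns into:

```python
import math

def segment_file_count(table_bytes, segment_bytes=2 * 1024 ** 3):
    # segment_bytes is the compile-time segment size: 2 GB as quoted
    # above (modern PostgreSQL builds default to 1 GB).
    return max(1, math.ceil(table_bytes / segment_bytes))

# A 1 TB table split into 2 GB on-disk files:
print(segment_file_count(1024 ** 4))  # 512 files
```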



-----Original Message-----
From: pgsql-admin-***@postgresql.org
[mailto:pgsql-admin-***@postgresql.org] On Behalf Of Tony and Bryn
Reina
Sent: Thursday, April 1, 2004 21:15
To: Bradley Kieser
Cc: pgsql-***@postgresql.org
Subject: Re: [ADMIN] Do Petabyte storage solutions exist?



----- Original Message -----
From: "Bradley Kieser" <***@kieser.net>
To: "Tony Reina" <***@hotmail.com>
Cc: <pgsql-***@postgresql.org>
Sent: Thursday, April 01, 2004 8:53 PM
Subject: Re: [ADMIN] Do Petabyte storage solutions exist?


let alone the storage limit of 2GB per
Post by Bradley Kieser
table. So sadly, PG would have to bow out of this IMHO unless someone
else nukes me on this!
Uh oh, 2 GB limit on table sizes. I didn't realize the limit was that low.

Would commercial DBMS be the better solution for handling Terabyte databases
and above?


-Tony

Bradley Kieser
2004-04-02 10:36:12 UTC
Permalink
Hi Tony,
Yep, for the time being you're pretty much limited to this for a table.
As far as commercial DBs go, IMHO (without knowing about DB2) Oracle is
the only player in town that will realistically deal with table sizes on
the order of 100s of GB or more. Ingres has limitations similar to PG
although they will deny it; Informix I am a little bit rusty on now, but
certainly when I used it last it didn't scale up much past the low
single-digit GBs per table; and Sybase, IM v HO, is a joke anyway. Hope I
don't offend anyone with that last statement!

The wildcard here is DB2, because they have so renovated the code that I
cannot comment on it anymore.

Oracle's main drawbacks are:
a) VERY resource-intensive with a high process startup overhead.
b) VERY expensive. You are talking license fees into the £100 000s for
big iron installations.

But, as I said, IMHO, (and excluding DB2) Oracle is the only player to
look at.

Hope that this helps!

Brad
Post by Tony and Bryn Reina
----- Original Message -----
Sent: Thursday, April 01, 2004 8:53 PM
Subject: Re: [ADMIN] Do Petabyte storage solutions exist?
let alone the storate limit of 2GB per
Post by Bradley Kieser
table. So sadly, PG would have to bow out of this IMHO unless someone
else nukes me on this!
Uh oh, 2 GB limit on table sizes. I did realize the limit was that low.
Would commercial DBMS be the better solution for handling Terabyte databases
and above?
-Tony
Tony and Bryn Reina
2004-04-02 14:28:13 UTC
Permalink
Post by Bradley Kieser
a) VERY resource-intensive with a high process startup overhead.
b) VERY expensive. You are talking license fees into the £100 000s for
big iron installations.
Wow! 100,000 pounds for software. Now that is expensive! Is that a ballpark
price for most of the commercial DB stuff out there? It would be interesting
to see just how expensive (cost of licensing-wise) commercial DBs really are
from a side-by-side matchup.

-Tony

Bradley Kieser
2004-04-02 15:59:45 UTC
Permalink
No, it isn't. Oracle is expensive but it is also the Rolls Royce, it
seems. I am a strictly OpenSource man so I don't really get into the
pricing thing, but I do know that it is also deal-by-deal and depending
on who and what you are, the prices can vary. E.g. Educational
facilities have massive discounts. Military has massive prices, etc.
Post by Tony and Bryn Reina
Post by Bradley Kieser
a) VERY resource-intensive with a high process startup overhead.
b) VERY expensive. You are talking license fees into the £100 000s for
big iron installations.
Wow! 100,000 pounds for software. Now that is expensive! Is that a ballpark
price for most of the commercial DB stuff out there? It would be interesting
to see just how expensive (cost of licensing-wise) commercial DBs really are
from a side-by-side matchup.
-Tony
Bricklen
2004-04-02 15:32:27 UTC
Permalink
Post by Bradley Kieser
No, it isn't. Oracle is expensive but it is also the Rolls Royce, it
seems. I am a strictly OpenSource man so I don't really get into the
pricing thing, but I do know that it is also deal-by-deal and depending
on who and what you are, the prices can vary. E.g. Educational
facilities have massive discounts. Military has massive prices, etc.
<snip>
You're correct about it being 'deal-by-deal' pricing. You can negotiate
the salesmen down quite a bit, depending on who your company is, the
field you're in, the time of year (eg. end of quarter or year nets
bigger reductions), and especially if you use a bit of cleverness by
getting in-house demos by the big competitors (eg. MSSQL and DB2).

Standard Edition One is listed at around $6500 Canadian per processor,
or $195 per named user. This is all totally negotiable, though.
Apparently MSSQL is priced similarly, though I can't verify that.

Doing price comparisons isn't very helpful; what you really need to do
is analyze your requirements and see what features you actually need, or
will need in the future. I have no affiliation with any of these
companies, so I'm not going to start a marketing war about who's better
etc.

Anyway, as they say, "You get what you pay for".
Andrew Sullivan
2004-04-05 21:28:01 UTC
Permalink
Post by Bricklen
Anyways, ss they say, "You get what you pay for".
This has not been my experience at all. The correlation between
software price and quality looks to me to be something very close to
random.

A
--
Andrew Sullivan | ***@crankycanuck.ca
The fact that technology doesn't work is no bar to success in the marketplace.
--Philip Greenspun

Tom Lane
2004-04-02 15:42:28 UTC
Permalink
Post by Bradley Kieser
No, it isn't. Oracle is expensive but it is also the Rolls Royce, it
seems. I am a strictly OpenSource man so I don't really get into the
pricing thing, but I do know that it is also deal-by-deal and depending
on who and what you are, the prices can vary.
I'm fairly sure that Oracle's pricing scales with the iron you plan to
use: the more or faster CPUs you want to run it on, the more you pay.
A large shop can easily get into the $100K license range, but Oracle
figures that they will have spent way more than that on their hardware.

The trouble with this theory is that as hardware prices fall, Oracle is
collecting a larger and larger share of people's IT budgets. That's why
we are seeing more and more interest in open-source DBs ...

regards, tom lane

C. Bensend
2004-04-02 16:17:53 UTC
Permalink
Post by Tom Lane
I'm fairly sure that Oracle's pricing scales with the iron you plan to
use: the more or faster CPUs you want to run it on, the more you pay.
A large shop can easily get into the $100K license range, but Oracle
figures that they will have spent way more than that on their hardware.
Exactly right, Tom. Oracle's licensing is typically done by number of CPUs
it will be running on. It is also negotiated from site to site. I've been
at two shops during the negotiation of the licensing, and thankfully both
times we were able to keep it under $100,000.

Have I mentioned lately how much I appreciate the developers? :) I love
PostgreSQL...

Benny
--
"I can't believe it's not carp!" -- MXC

Bradley Kieser
2004-04-02 17:46:39 UTC
Permalink
Well I for one find it very difficult to choose a DB other than PG and
do so only under duress. It is really only client demand that drives the
decision away from PG but like you, I am finding that more and more, PG
is winning the deal and winning the day. Once the replication and
ability to place tables and indexes on specified locations is in place,
it will be even more difficult for anyone to argue for paying a license
fee IMHO.

I don't find the data size limits of PG a problem and I do develop some
very large systems so for me personally, PG is largely an unstoppable
force now.
Post by Tom Lane
Post by Bradley Kieser
No, it isn't. Oracle is expensive but it is also the Rolls Royce, it
seems. I am a strictly OpenSource man so I don't really get into the
pricing thing, but I do know that it is also deal-by-deal and depending
on who and what you are, the prices can vary.
I'm fairly sure that Oracle's pricing scales with the iron you plan to
use: the more or faster CPUs you want to run it on, the more you pay.
A large shop can easily get into the $100K license range, but Oracle
figures that they will have spent way more than that on their hardware.
The trouble with this theory is that as hardware prices fall, Oracle is
collecting a larger and larger share of people's IT budgets. That's why
we are seeing more and more interest in open-source DBs ...
regards, tom lane
Andrew Sullivan
2004-04-02 17:41:32 UTC
Permalink
Post by Tom Lane
I'm fairly sure that Oracle's pricing scales with the iron you plan to
use: the more or faster CPUs you want to run it on, the more you pay.
A large shop can easily get into the $100K license range, but Oracle
figures that they will have spent way more than that on their hardware.
This is correct. For a system that I happen to know about, the
all-licenses-in (part of which was a large commercial database we may
or may not be discussing, part some other application server &c.)
price was US$8M (software only). This price was arrived at near the
end of the dotcom nonsense; I get the feeling that things are
somewhat better now. The license fees were that high because of
the number of processors, the amount of memory, and the number and
class of machines involved.

Something which is worth noting, however, is that (at least in my
experience) the curve of the license fees gets very steep near the
end. So, if you're working on 4-way machines and think you'll double
up by adding 4 more processors, you're sadly mistaken. This
investment is part of what causes the adoption rate for new systems
in large shops to be so low: if you're already spending several
millions on licenses for one product, the incremental cost of adding
another license is hardly noticeable, and the savings to be realised
by moving to a competitor are usually relatively small; but the cost
of shifting is very large, because of knowledge, retraining, porting,
&c. For Postgres, however, it is a tremendous opportunity: if it can
make the last steps to be truly broadly competitive with Oracle and
DB2, the potential savings really is large enough to justify the
change. Postgres is already there for some kinds of use (I think it
provides my employer with a great advantage), but it likely needs a
few more features to take the last steps.

A
--
Andrew Sullivan | ***@crankycanuck.ca

Joe Conway
2004-04-02 17:14:21 UTC
Permalink
Post by Tom Lane
I'm fairly sure that Oracle's pricing scales with the iron you plan to
use: the more or faster CPUs you want to run it on, the more you pay.
A large shop can easily get into the $100K license range, but Oracle
figures that they will have spent way more than that on their hardware.
The trouble with this theory is that as hardware prices fall, Oracle is
collecting a larger and larger share of people's IT budgets. That's why
we are seeing more and more interest in open-source DBs ...
That's exactly correct. The last time I looked, Oracle's pricing was
$40K/CPU for the base license, $10K/CPU for table partitioning, $20K/CPU
for RAC (clustering). It is no longer tied to CPU speed, just the number
of CPUs. See:
http://oraclestore.oracle.com/OA_HTML/ibeCCtpSctDspRte.jsp?section=10167
http://oraclestore.oracle.com/OA_HTML/ibeCCtpSctDspRte.jsp?section=11221
http://oraclestore.oracle.com/OA_HTML/ibeCCtpSctDspRte.jsp?section=10183

If you want OLAP and Data Mining, it's another $20K/CPU each. Spatial
(think PostGIS) is a mere $10K/CPU.
http://oraclestore.oracle.com/OA_HTML/ibeCCtpSctDspRte.jsp?section=11222
http://oraclestore.oracle.com/OA_HTML/ibeCCtpSctDspRte.jsp?section=11223
http://oraclestore.oracle.com/OA_HTML/ibeCCtpSctDspRte.jsp?section=10184

So for a pair of quad servers, using RAC, partitioning, OLAP, and data
mining, you're talking
40 + 20 + 10 + 20 + 20 = $110K/CPU
8 x $110K/CPU = $880K
*plus* annual support (roughly 20% of purchase price).
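That arithmetic, spelled out (these are the 2004 list prices quoted above, not current figures):

```python
# Per-CPU list prices from the links above (2004 figures).
per_cpu = {
    "base_license": 40_000,
    "rac_clustering": 20_000,
    "partitioning": 10_000,
    "olap": 20_000,
    "data_mining": 20_000,
}
cpus = 8  # a pair of quad-CPU servers

per_cpu_total = sum(per_cpu.values())        # $110K/CPU
license_total = per_cpu_total * cpus         # $880K
annual_support = int(license_total * 0.20)   # ~20% of purchase price per year

print(per_cpu_total, license_total, annual_support)  # 110000 880000 176000
```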

Joe


scott.marlowe
2004-04-01 20:48:45 UTC
Permalink
Post by Bradley Kieser
I think as far as PG storage goes you're really on a losing streak here
because PG clustering really isn't going to support this across multiple
servers. We're not even close to the mark as far as clustered servers
and replication management goes, let alone the storate limit of 2GB per
table. So sadly, PG would have to bow out of this IMHO unless someone
else nukes me on this!
Hold on, there are instances of running postgresql on SANs that are many
terabytes in size. It will work fine, as long as you only need the one
image of the server running at a time. With FC-AL or more modern
technology you can put ~256 devices on a single fibre loop, and most boxes
can handle four of those controllers, so you have the possibility for 1024
drives. Of course, most kernels are not gonna handle that many drives
well, so you're much better off aggregating the drives on a storage box,
then mounting that from your database server.
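The drive count above is straightforward to reproduce; the 300 GB drive size from earlier in the thread is borrowed here just to put the raw capacity in context:

```python
devices_per_loop = 256  # ~256 devices on a single FC-AL fibre loop
controllers = 4         # most boxes can handle four such controllers

max_drives = devices_per_loop * controllers
raw_tb = max_drives * 300 // 1000  # with the 300 GB drives mentioned earlier

print(max_drives, raw_tb)  # 1024 drives, ~307 TB raw from a single host
```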

HOWEVER, this isn't my biggest gripe, it is the misinformation you're
spreading about a 2g table limit. That's the individual FIELD limit on
postgresql. Tables can be significantly larger than 2g.

If you're not sure ask first, don't spread such misinformation, it makes
both the community and the database look bad.
Post by Bradley Kieser
Brad
Post by Tony Reina
I have a database that will hold massive amounts of scientific data.
Potentially, some estimates are that we could get into needing
Petabytes (1,000 Terabytes) of storage.
1. Do off-the-shelf servers exist that will do Petabyte storage?
2. Is it possible for PostgreSQL to segment a database between
multiple servers? (I was looking at a commercial vendor who had a
product that took rarely used data in Oracle databases and migrated
them to another server to keep frequently accessed data more readily
available.)
Thanks.
-Tony
Christopher Browne
2004-04-01 22:15:51 UTC
Permalink
Post by Bradley Kieser
I think as far as PG storage goes you're really on a losing streak
here because PG clustering really isn't going to support this across
multiple servers. We're not even close to the mark as far as clustered
servers and replication management goes, let alone the storage limit
of 2GB per table. So sadly, PG would have to bow out of this IMHO
unless someone else nukes me on this!
Are you trying to do a bad April Fool's joke?

A "2GB limit" is simply nonsense. I work with a number of databases
where tables contain >>2GB of data.

While there are some of the "pointy-clicky" approaches to clustering
and replication that aren't "there" for PostgreSQL, a '2GB limit' is
certainly NOT one of the reasons to avoid PG.
--
If this was helpful, <http://svcs.affero.net/rm.php?r=cbbrowne> rate me
http://www.ntlug.org/~cbbrowne/oses.html
"Let me get this straight: A company that dominates the desktop, and
can afford to hire an army of the world's best programmers, markets
what is arguably the world's LEAST reliable operating system?
What's wrong with this picture?" -- <***@cc.UManitoba.CA>
Goulet, Dick
2004-04-01 20:12:24 UTC
Permalink
Yeah, move on over to Oracle. Even on older versions the file limit may have been 2GB, but a tablespace could have more than one datafile. The true limit there is 4194303 blocks, where a block can be 2KB, 4KB, 8KB, 16KB, 32KB, or 64KB, and with 10g comes 128KB. Then each table/index can have 4194303 segments, which are user-definable up to the max size of a datafile. Now if you've got a 64KB block size database, that means you can have one segment with a max of 262,144 MB, and since you can have 4194303 of those, the max possible size of a table is 1,099,511,365,632 MB. And if that ain't big enough for you, turn on partitioning. Truly the sky IS the limit.
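Those figures can be reproduced as follows (the numbers are as posted, not checked against Oracle's documentation):

```python
blocks_per_segment = 4_194_303  # quoted max blocks per segment/datafile
block_kb = 64                   # the largest pre-10g block size listed

segment_mb = round(blocks_per_segment * block_kb / 1024)  # ~262,144 MB (~256 GB)
max_table_mb = segment_mb * 4_194_303  # up to 4194303 segments per table/index

print(segment_mb, max_table_mb)  # 262144 and 1099511365632, as quoted
```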

Dick Goulet
Senior Oracle DBA
Oracle Certified 8i DBA

-----Original Message-----
From: Tony and Bryn Reina [mailto:***@hotmail.com]
Sent: Thursday, April 01, 2004 2:15 PM
To: Bradley Kieser
Cc: pgsql-***@postgresql.org
Subject: Re: [ADMIN] Do Petabyte storage solutions exist?



----- Original Message -----
From: "Bradley Kieser" <***@kieser.net>
To: "Tony Reina" <***@hotmail.com>
Cc: <pgsql-***@postgresql.org>
Sent: Thursday, April 01, 2004 8:53 PM
Subject: Re: [ADMIN] Do Petabyte storage solutions exist?


let alone the storage limit of 2GB per
Post by Bradley Kieser
table. So sadly, PG would have to bow out of this IMHO unless someone
else nukes me on this!
Uh oh, 2 GB limit on table sizes. I didn't realize the limit was that low.

Would commercial DBMS be the better solution for handling Terabyte databases
and above?


-Tony

Jürgen Cappel
2004-04-02 11:45:49 UTC
Permalink
Ingres 6.4 is pretty much history and I'm not even sure if it's
supported by CA any more. Better use 2.5 or 2.6! It's offered
for Linux as well.

Regarding performance problems, there are a lot of parameters
to tune an Ingres database. The standard installation out of
the box is never sufficient for a real-world application.
The problem is performance, as you stated quite correctly.
You always have to scale Ingres to your machine's size and
resources; it's preconfigured for a very small machine.

BTW Ingres has quite a remarkable replication system where
you can have multiple master sites where inserts and updates
can happen. They've taken an asynchronous approach that allows sites
or networks to be down for a while without blocking a local
application's transaction. Collision detection is up to you,
however, and there is not much help beyond doing it manually.
I'm currently writing and administering an application with sites
residing in Germany, the US, and South America, all having write access
and networks being down from time to time. Database size is in a
2-digit Gigabyte range.

Bye.



-----Original Message-----
From: Bradley Kieser [mailto:***@kieser.net]
Sent: Friday, April 2, 2004 14:03
To: Jürgen Cappel
Subject: Re: [ADMIN] Do Petabyte storage solutions exist?


Yeah, sorry, my mistake. Thanks for the correction!

But I had serious problems getting a DB with large tables running on
Ingres 6.4, Sequent Dynix cluster. We had all sorts of errors on the
views and performance bombed badly. I really don't think that 6.4 at
least will scale to 100s GB but please tell me if you disagree because I
would like to know other experiences.
You're also a bit rusty on Ingres. There was a problem
with the early 2.5 version being limited to 2^31 bytes
per table. That was fixed end of 2000, early 2001. I've had
table sizes of almost 10 GB in a production database
since then without problems. Bye.
-----Original Message-----
Sent: Friday, April 2, 2004 12:36
To: Tony and Bryn Reina
Subject: Re: [ADMIN] Do Petabyte storage solutions exist?
Hi Tony,
Yep, for the time being you're pretty much limited to this for a table.
As far as commercial DBs go, IMHO (without knowing about DB2) Oracle is
the only player in town that will realistically deal with table sizes in
the order of 100sGB or more. Ingres has limitations similar to PG
although they will deny it, Informix I am a little bit rusty on now but
certainly when I used it last it didn't scale up much past the low
ordinal GBs per table and Sybase, IM v HO, is a joke anyway. Hope I
don't offend anyone with that last statement!
The wildcard here is DB2, because they have so renovated the code that I
cannot comment on it anymore.
a) VERY resource-intensive with a high process startup overhead.
b) VERY expensive. You are talking license fees into the £100 000s for
big iron installations.
But, as I said, IMHO, (and excluding DB2) Oracle is the only player to
look at.
Hope that this helps!
Brad
Post by Tony and Bryn Reina
----- Original Message -----
Sent: Thursday, April 01, 2004 8:53 PM
Subject: Re: [ADMIN] Do Petabyte storage solutions exist?
let alone the storage limit of 2GB per
Post by Bradley Kieser
table. So sadly, PG would have to bow out of this IMHO unless someone
else nukes me on this!
Uh oh, 2 GB limit on table sizes. I didn't realize the limit was that low.
Would commercial DBMS be the better solution for handling Terabyte
databases
Post by Tony and Bryn Reina
and above?
-Tony
Naomi Walker
2004-04-02 17:42:59 UTC
Permalink
Post by Bradley Kieser
Hi Tony,
Yep, for the time being you're pretty much limited to this for a table. As
far as commercial DBs go, IMHO (without knowing about DB2) Oracle is the
only player in town that will realistically deal with table sizes in the
order of 100sGB or more. Ingres has limitations similar to PG although
they will deny it, Informix I am a little bit rusty on now but certainly
when I used it last it didn't scale up much past the low ordinal GBs per
table and Sybase, IM v HO, is a joke anyway. Hope I don't offend anyone
with that last statement!
For the record, I ran Informix with 100G size databases, with no problem.

-- CONFIDENTIALITY NOTICE --

This message is intended for the sole use of the individual and entity to whom it is addressed, and may contain information that is privileged, confidential and exempt from disclosure under applicable law. If you are not the intended addressee, nor authorized to receive for the intended addressee, you are hereby notified that you may not use, copy, disclose or distribute to anyone the message or any information contained in the message. If you have received this message in error, please immediately advise the sender by reply email, and delete the message. Thank you.

Goulet, Dick
2004-04-02 18:11:32 UTC
Permalink
Tom,

I believe PG's biggest problem is that many third party vendors of any significant size (read that as PeopleSoft, SAP, etc.....) don't support PG, and PG as an entity does not have an owner like Oracle, DB2, or SQL Server. There are other problems with PG as well that I'll admit are no barrier to it doing the job in a particular application, but in others they can become a problem. I think that the world is changing & that there will always be a place for PG as well as the commercial DBs.

Dick Goulet
Senior Oracle DBA
Oracle Certified 8i DBA

-----Original Message-----
From: Bradley Kieser [mailto:***@kieser.net]
Sent: Friday, April 02, 2004 12:47 PM
To: Tom Lane
Cc: Tony and Bryn Reina; pgsql-***@postgresql.org
Subject: Re: [ADMIN] Do Petabyte storage solutions exist?


Well I for one find it very difficult to choose a DB other than PG and
do so only under duress. It is really only client demand that drives the
decision away from PG but like you, I am finding that more and more, PG
is winning the deal and winning the day. Once the replication and
ability to place tables and indexes on specified locations is in place,
it will be even more difficult for anyone to argue for paying a license
fee IMHO.

I don't find the data size limits of PG a problem, and I do develop some
very large systems, so for me personally PG is largely an unstoppable
force now.
Post by Tom Lane
Post by Bradley Kieser
No, it isn't. Oracle is expensive but it is also the Rolls Royce, it
seems. I am a strictly OpenSource man so I don't really get into the
pricing thing, but I do know that it is also deal-by-deal and depending
on who and what you are, the prices can vary.
I'm fairly sure that Oracle's pricing scales with the iron you plan to
use: the more or faster CPUs you want to run it on, the more you pay.
A large shop can easily get into the $100K license range, but Oracle
figures that they will have spent way more than that on their hardware.
The trouble with this theory is that as hardware prices fall, Oracle is
collecting a larger and larger share of people's IT budgets. That's why
we are seeing more and more interest in open-source DBs ...
regards, tom lane
Goulet, Dick
2004-04-02 18:43:33 UTC
Permalink
And speaking of Rolls-Royces, there is a commercial product called Teradata that is extremely good at handling PBs of data. Of course, the bottom-of-the-barrel entry price is $400,000US, not including the proprietary hardware & OS you need.

Dick Goulet
Senior Oracle DBA
Oracle Certified 8i DBA

-----Original Message-----
From: Naomi Walker [mailto:***@eldocomp.com]
Sent: Friday, April 02, 2004 12:43 PM
To: Bradley Kieser
Cc: Tony and Bryn Reina; pgsql-***@postgresql.org
Subject: Re: [ADMIN] Do Petabyte storage solutions exist?
Post by Bradley Kieser
Hi Tony,
Yep, for the time being you're pretty much limited to this for a table. As
far as commercial DBs go, IMHO (without knowing about DB2) Oracle is the
only player in town that will realistically deal with table sizes in the
order of 100sGB or more. Ingres has limitations similar to PG although
they will deny it, Informix I am a little bit rusty on now but certainly
when I used it last it didn't scale up much past the low ordinal GBs per
table and Sybase, IM v HO, is a joke anyway. Hope I don't offend anyone
with that last statement!
For the record, I ran Informix with 100G size databases, with no problem.


Gregory S. Williamson
2004-04-02 19:04:37 UTC
Permalink
Informix fees vary, but figure about $33,000 per CPU for a web environment (other licenses are cheaper, for instance a server with only a handful of connections). On the plus side for Informix: the Oracle stuff we had consisted of dozens of tapes and CDs ... Informix was rarely more than a CD and much easier to get going.

Greg Williamson
DBA
GlobeXplorer LLC

-----Original Message-----
From: Tony and Bryn Reina [mailto:***@hotmail.com]
Sent: Fri 4/2/2004 6:28 AM
To: Bradley Kieser
Cc: pgsql-***@postgresql.org
Subject: Re: [ADMIN] Do Petabyte storage solutions exist?
Post by Bradley Kieser
a) VERY resource-intensive with a high process startup overhead.
b) VERY expensive. You are talking license fees into the £100 000s for
big iron installations.
Wow! 100,000 pounds for software. Now that is expensive! Is that a ballpark
price for most of the commercial DB stuff out there? It would be interesting
to see just how expensive (cost of licensing-wise) commercial DBs really are
from a side-by-side matchup.

-Tony
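For the side-by-side matchup Tony asks about, a crude sketch using only the per-CPU list prices quoted in this thread (Python; all figures are hearsay circa 2004, vary deal-by-deal, and the 4-CPU box is just an example):

```python
# Very rough per-CPU license comparison from figures quoted in this
# thread. Real commercial pricing is negotiated deal-by-deal.

quoted_prices_usd_per_cpu = {
    "Oracle Enterprise Edition":  40_000,  # Dick Goulet's figure
    "Oracle Standard Edition":    15_000,  # Dick Goulet's figure
    "Informix (web environment)": 33_000,  # Greg Williamson's figure
    "PostgreSQL":                      0,  # no license fee, at least
}

def license_cost(product, cpus):
    """License fee for a box with `cpus` processors, per quoted prices."""
    return quoted_prices_usd_per_cpu[product] * cpus

for product in quoted_prices_usd_per_cpu:
    print(f"{product:28s} 4 CPUs: ${license_cost(product, 4):,}")
```

Teradata is left out because its $400,000US figure quoted elsewhere in the thread is an entry price, not a per-CPU rate.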

Naomi Walker
2004-04-02 19:32:51 UTC
Permalink
Post by Gregory S. Williamson
Informix fees vary but figure about $33,000 per CPU for a web environment
(other licenses are cheaper, for instance, a server with only a handful of
connections). On the plus side for Informix, the Oracle stuff we had
consists of dozens of tapes and CDs ... Informix was rarely more a CD and
much easier to get going.
And, IMHO, Informix is *much* easier to maintain.



-------------------------------------------------------------------------------------------------------------------------
Naomi Walker Chief Information Officer
Eldorado Computing, Inc.
***@eldocomp.com 602-604-3100
-------------------------------------------------------------------------------------------------------------------------
Forget past mistakes. Forget failures. Forget everything except what you're
going to do now and do it.
- William Durant, founder of General Motors
------------------------------------------------------------------------------------------------------------------------


Goulet, Dick
2004-04-02 20:44:19 UTC
Permalink
Andrew,

You're absolutely right. During the DOTCOM fiasco, commercial database licenses were based on the number of processors & the speed of those processors. Oracle's PowerUnit pricing was one of those stupid attempts. A power unit was defined as 1 CPU running at 1 MHz. Mind you, a power unit was cheap (around $50US as I remember), BUT!!!!! Here's a simple example (that I have intimate knowledge of):

HP9000/L2000, 2-way 700 MHz processors:

Oracle: 2 * 700 * 50 = $70,000US
Server: $30,000US Including OS

Try a SuperDome (12-way, 1000 MHz):

Server: $120,000US
Oracle: 12 * 1000 * 50 = $600,000US

Today things have gotten better, as in less complicated. Oracle dumped PowerUnits for per-CPU pricing. Enterprise Edition is $40,000US per processor ($80,000US for that L2000 today). Standard Edition is $15,000US per processor. Still makes one cringe every time you talk about it. Hopefully Oracle has seen the light. Larry Ellison (CEO) spoke about site licensing at Open World. The rumor mill has it that it'll boil down to # of employees times $150US (the Enterprise Edition per-seat license fee). After that it's have fun: use all the software you want. Of course, there's still that 21% annual maintenance fee that they'll get you for.
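Dick's PowerUnit arithmetic above can be sketched as (Python; the $50/unit rate is his recollection, not an official price):

```python
# Oracle "PowerUnit" pricing as described above:
# one unit = 1 CPU at 1 MHz, roughly $50US per unit at the time.

def powerunit_cost(cpus, mhz, dollars_per_unit=50):
    """License cost under PowerUnit pricing for a given box."""
    return cpus * mhz * dollars_per_unit

print(powerunit_cost(2, 700))     # HP9000/L2000, 2-way 700 MHz -> 70000
print(powerunit_cost(12, 1000))   # SuperDome, 12-way 1000 MHz  -> 600000
```

The point of the example: clock speeds rose much faster than hardware prices, so PowerUnit fees could dwarf the cost of the server itself.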

Dick Goulet
Senior Oracle DBA
Oracle Certified 8i DBA

