pgcrypto and database encryption

Discussion:

(too old to reply)

Silvana Di Martino

2004-03-07 16:13:03 UTC

Going on with my evaluation of pgcrypto as a possible solution for our (EU)
privacy law problems, I have to report the following facts:

1) pgcrypto (understandably) supplies just a small collections of server-side
functions, not a general-purpose database-encryption system.

2) This means that pgcrypto does not supply you with any password management
tool. All pgcrypto functions expect to receive the password from the calling
SQL SELECT code as a parameter.

3) This means that you have to manage by yourself all the details of the
password one-time entry at the RDBMS start-up time, its storing (in RAM,
only) and its passage to the SQL SELECTs. IMHO, this is still far to satisfy
our need for a standard and reliable solution to our "privacy law and DB
encryption" problem.

4) What could actually solve our problem is something like the following
scenario. Imagine that postmaster (or pg_ctrl) could accept a new CL
parameter called "pw". This parameter would contain a sequence of
comma-separated databasename/encryption-password pairs. I mean, something
like this:

postmaster -i -pw=postnuke:"arriba!",phpnuke:"blade runner"

Imagine that postmaster stores these passwords in memory (and only in memory)
as database-specific and database-wide string variables. Now, pgcrypto
functions could read the required password directly from memory (like an
environment variable).

In this way, we could have an encrypted database and a simple way to manage
its passwords.

Any comment?

5) There is also a problem related to what pgcrypto can encrypt and what it
cannot. For example: pgcrypto encrypt functions cannot be applied to DATE and
TIME data types because it would mess up them and make them unacceptable by
the RDBMS engine. We would need specific encrypted data types like ENCDATA
and ENCTIME to handle these cases.

Any comment?

PS: I apologize for bothering americans with these all-EU details but it looks
like this mailing list is the only one I can use for communicating with
others EU pgsql-admins affected by this problem.
-----------------------------------------
Alessandro Bottoni and Silvana Di Martino
***@interfree.it
***@tin.it

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Stephan Szabo

2004-03-07 17:40:58 UTC

Permalink

Post by Silvana Di Martino
3) This means that you have to manage by yourself all the details of the
password one-time entry at the RDBMS start-up time, its storing (in RAM,
only) and its passage to the SQL SELECTs. IMHO, this is still far to satisfy
our need for a standard and reliable solution to our "privacy law and DB
encryption" problem.
4) What could actually solve our problem is something like the following
scenario. Imagine that postmaster (or pg_ctrl) could accept a new CL
parameter called "pw". This parameter would contain a sequence of
comma-separated databasename/encryption-password pairs. I mean, something
postmaster -i -pw=postnuke:"arriba!",phpnuke:"blade runner"
Imagine that postmaster stores these passwords in memory (and only in memory)
as database-specific and database-wide string variables. Now, pgcrypto
functions could read the required password directly from memory (like an
environment variable).
In this way, we could have an encrypted database and a simple way to manage
its passwords.
Any comment?
5) There is also a problem related to what pgcrypto can encrypt and what it
cannot. For example: pgcrypto encrypt functions cannot be applied to DATE and
TIME data types because it would mess up them and make them unacceptable by
the RDBMS engine. We would need specific encrypted data types like ENCDATA
and ENCTIME to handle these cases.
Any comment?

Who has access to the unencrypted versions of the data? Are there people
who will have access to the database who might have access to some of the
encrypted data but not all or someone who has access to the database but
none of the encrypted data? If so, then you presumably need a system
where the decryption will only work for the appropriate people and that
needs to be built in.

Also, what manipulation do you want to do with these values in the
database? Are you going to need to subtract two encrypted dates or
determine that one is greater than the other? What about substring or
ordering for encrypted strings? Does this have to be automatic or is
explicit description of the fact that you want to decrypt them and then do
something sufficient?

Finally, there are some questions about reliability of the system as a
whole. In a system like the -pw above, where do those passwords come from,
is it from a human typing at a console? If so, you need to consider
downtime and recovery time issues based on who has access to the
passwords. This doesn't so much affect the law in question but it may
affect contracts that you have if they have downtime requirements.

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Joe Conway

2004-03-07 18:19:50 UTC

Permalink

Post by Silvana Di Martino
4) What could actually solve our problem is something like the following
scenario. Imagine that postmaster (or pg_ctrl) could accept a new CL
parameter called "pw". This parameter would contain a sequence of
comma-separated databasename/encryption-password pairs. I mean, something
postmaster -i -pw=postnuke:"arriba!",phpnuke:"blade runner"

But you mentioned earlier that the DBA cannot know the passwords, so who
is going to type all that in?

Does the law require protection from a determined DBA, or just casual
viewing by the DBA? *If* it's the latter, you could do something like this:

1. Export an environment variable , say PGMASTERPASS containing a hex
encoded password, something like:

PGMASTERPASS=0102000304 pg_ctl start

2. Use a C function to grab the value of the environment variable -- one
exists in PL/R already. You could write your own based on that.

3. Combine the master password with other information to make it
sufficiently unique as a key for your various purposes. For example,
you might use the md5 hashed password for the current user from
pg_shadow. This combining should be done securely -- I'd recommend
taking the HMAC of the user password using the master as the key. The
result of the HMAC becomes your data encryption/decryption key.

Post by Silvana Di Martino
5) There is also a problem related to what pgcrypto can encrypt and what it
cannot. For example: pgcrypto encrypt functions cannot be applied to DATE and
TIME data types because it would mess up them and make them unacceptable by
the RDBMS engine. We would need specific encrypted data types like ENCDATA
and ENCTIME to handle these cases.

Just use bytea for the encrypted stuff, and write plpgsql functions to
convert the bytea output of the decrypt function back to its native
datatype. Here's a more-or-less complete example of what I mean by all this:

--8<--------------------------------------------------------------------

create or replace function text2bytea(text) returns bytea as '
begin
return $1;
end;
' language plpgsql;

create or replace function timestamp2bytea(timestamp with time zone)
returns bytea as '
begin
return $1;
end;
' language plpgsql;

create or replace function encrypt_timestamp(timestamp with time zone)
returns bytea as '
declare
v_in alias for $1;
v_masterpass bytea;
v_userpass bytea;
v_key bytea;
v_data bytea;
v_iv bytea; --skip for simplicity at the moment
begin
select into v_masterpass decode(value,''hex'') from plr_environ()
where name=''PGMASTERPASS'';
select into v_userpass text2bytea(substr(passwd,4)) from pg_shadow
where usename = current_user;
v_key := hmac(v_userpass, v_masterpass, ''sha1'');
v_data := timestamp2bytea(v_in);

return encrypt(v_data, v_key, ''aes'');
end;
' language plpgsql;

create or replace function decrypt_timestamp(bytea) returns timestamp
with time zone as '
declare
v_in alias for $1;
v_masterpass bytea;
v_userpass bytea;
v_key bytea;
v_data bytea;
v_iv bytea; --skip for simplicity at the moment
begin
select into v_masterpass decode(value,''hex'') from plr_environ()
where name=''PGMASTERPASS'';
select into v_userpass text2bytea(substr(passwd,4)) from pg_shadow
where usename = current_user;
v_key := hmac(v_userpass, v_masterpass, ''sha1'');
v_data := decrypt(v_in, v_key, ''aes'');

return v_data;
end;
' language plpgsql;

-- here you can see the master password
regression=# select decode(value,'hex') from plr_environ() where
name='PGMASTERPASS';
decode
----------------------
\001\002\000\003\004
(1 row)

-- here is an encrypted timestamp
regression=# select encrypt_timestamp(now());
encrypt_timestamp
--------------------------------------------------------------------------------------------------
\340\333*\0221r\177\022e\011_]X
\374\302Y\201\364\264\362\351e\331\006\266\\\331\236\300\256\335
(1 row)

-- and to prove that it works, this example goes full circle
regression=# select decrypt_timestamp(encrypt_timestamp(now()));
decrypt_timestamp
-------------------------------
2004-03-07 10:16:56.192193-08
(1 row)

--8<--------------------------------------------------------------------

As I said above (and others in this thread too), if the DBA (or anyone
with root access on the database server) is sufficiently determined,
they can get around this scheme and view whatever data they want. If
you're really concerned about that scenario, the data should be
encrypted in your application before it ever gets sent to the database,
using a key that is unavailable on the database server.

HTH,

Joe

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ***@postgresql.org

m***@ymogen.net

2004-03-07 22:16:44 UTC

Permalink

Post by Joe Conway
Does the law require protection from a determined DBA, or just casual

The EU directive (and therefore the laws of indiviual countries) requires
that if someone gets access to your *DATABASE* they cannot get personal
details of individuals out of it. That is all. It is intended to protect
people against the kind of idiotic organisations that put their customer
lists in an Excel sheet on their extranet without a password.

This thread has covered many interesting and diverting issues, but the
fundamental issue of legal compliance is more than satisfied by:
1) Encrypting 'personal information' stored in a DB
2) Keeping the keys on a different server than the DB
3) Making reasonable efforts [1] to keep those keys secrets

M

[1] As far as I can tell from discussions with the Data Protection
Registrar, you do not have to protect them against someone rooting the app
server (since that is essentially impossible without silly investments in
specialised hardware or other excessive costs).

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match