Discussion:
Frequent 'deadlock detected' in 7.4 ... or just my bad code?
Marc G. Fournier
2004-04-05 14:53:05 UTC
Permalink
G'day ...

I've got a script that runs on all the servers that dumps IP traffic
data to a 7.4 database ... they all run at the same time, but I'm starting
to get the following on a reasonably regular basis:

ERROR: deadlock detected at /usr/local/abin/ipaudit2ams.pl line 175.

The code that is causing it, from the times I've been able to catch it,
is a simple update to the same table:

$upd->execute( $traffic{$company_id}{$ip_id}{$port}, $company_id, $ip_id, $date ) || die $upd->errstr;

Now, the scripts are wrapped in a BEGIN/END ... if a file fails to be
loaded, I want the whole thing to roll back ... the deadlock itself, I'm
presuming, is because two servers are trying to update the same
$ip_id/$port/$company_id record at the same time ...

Now, reading the DEADLOCKS section at:

http://www.postgresql.org/docs/7.4/static/explicit-locking.html

This is to be expected ... but, other than breaking the transaction
itself into smaller chunks, or staggering the scripts' run times ... is
there something I'm overlooking to eliminate this? I could increase the
deadlock timeout as an option as well ...
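One other angle on "eliminating" it, for what it's worth: since a deadlock
aborts only one of the colliding transactions, a common pattern is to catch
the error and re-run the whole transaction. A minimal sketch, with a
simulated failure standing in for the real BEGIN/END file load (the
`with_retry` name and the simulation are illustrative, not from the thread):

```perl
use strict;
use warnings;

# Hypothetical sketch: retry the whole transaction a few times when the
# failure looks like a deadlock, instead of dying outright.  $work would
# wrap the BEGIN ... END block that loads one input file.
sub with_retry {
    my ( $work, $max_tries ) = @_;
    for my $try ( 1 .. $max_tries ) {
        my $ok = eval { $work->(); 1 };
        return 1 if $ok;
        # Re-raise anything that isn't a deadlock, and give up on the
        # last attempt rather than looping forever.
        die $@ if $@ !~ /deadlock detected/ || $try == $max_tries;
    }
}

# Simulated load: deadlocks twice, then succeeds on the third attempt.
my $tries = 0;
with_retry( sub {
    $tries++;
    die "ERROR: deadlock detected\n" if $tries < 3;
}, 5 );
```

In the real script the sub body would be the $dbh transaction plus the
$upd->execute loop, with a rollback before each retry.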



----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: ***@hub.org Yahoo!: yscrappy ICQ: 7615664

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings
Tom Lane
2004-04-05 15:13:54 UTC
Permalink
Post by Marc G. Fournier
Now, the scripts are wrap'd in a BEGIN/END ... if a file fails to be
loaded, I want the whole thing to rollback ... the deadlock itself, I'm
presuming, is because two servers are trying to update the same
$ip_id/$port/$company_id record, at the same time ...
Actually, the problem is more likely that two servers try to update two
different rows in opposite orders. It's not possible to deadlock when
only one lock is involved.

You could work around this by ensuring that all sessions update rows in
a consistent order; for instance, at the beginning of a transaction sort
your intended updates by primary key and then apply in that order.
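A minimal sketch of that approach, with illustrative data and names (nothing
here is from the actual script): gather the intended updates first, then
apply them in one consistent order, sorted by the composite
company_id / ip_id / port key.

```perl
use strict;
use warnings;

# Illustrative traffic data keyed by company_id -> ip_id -> port.
my %traffic = (
    2 => { 10 => { 80 => 512 } },
    1 => { 30 => { 25 => 128 }, 20 => { 443 => 256 } },
);

# Collect every pending update, sorted numerically at each level, so
# all sessions walk the rows in exactly the same order.
my @updates;
for my $company_id ( sort { $a <=> $b } keys %traffic ) {
    for my $ip_id ( sort { $a <=> $b } keys %{ $traffic{$company_id} } ) {
        for my $port ( sort { $a <=> $b } keys %{ $traffic{$company_id}{$ip_id} } ) {
            push @updates,
                [ $company_id, $ip_id, $port,
                  $traffic{$company_id}{$ip_id}{$port} ];
        }
    }
}

# $upd->execute(@$_) for @updates;   # same row order on every server
```

Two transactions can still block each other, but they can no longer each
hold a lock the other one wants, so no deadlock.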

regards, tom lane

Marc G. Fournier
2004-04-05 17:24:54 UTC
Permalink
Post by Tom Lane
Post by Marc G. Fournier
Now, the scripts are wrap'd in a BEGIN/END ... if a file fails to be
loaded, I want the whole thing to rollback ... the deadlock itself, I'm
presuming, is because two servers are trying to update the same
$ip_id/$port/$company_id record, at the same time ...
Actually, the problem is more likely that two servers try to update two
different rows in opposite orders. It's not possible to deadlock when
only one lock is involved.
You could work around this by ensuring that all sessions update rows in
a consistent order; for instance, at the beginning of a transaction sort
your intended updates by primary key and then apply in that order.
Actually, unless I'm mistaken about how hashes work in Perl, the update
order for all servers is the same ... basically what happens is:

1. a traffic table is read in, and loaded into a hash table that is
ordered by company_id, ip_id and port:

$traffic{$ip_rec{$ip}{'company_id'}}{$ip_id}{$port} += $bytes1 + $bytes2;

2. a foreach loop is run on that resultant list to do the updates to the
database:

foreach $company_id ( keys %traffic ) {
    foreach $ip_id ( keys %{$traffic{$company_id}} ) {
        foreach $port ( keys %{$traffic{$company_id}{$ip_id}} ) {

and the updates are done based on those 3 values, plus the byte value
of $traffic{$company_id}{$ip_id}{$port} ...

Now, my first mistake may be that I'm mis-assuming that the hashes will
be read in a sorted order ... ? If this is the case, though, then sort
order shouldn't be an issue, as all servers would be sorted the same way
...




Marc G. Fournier
2004-04-05 18:13:33 UTC
Permalink
D'oh ... just tested my assumption, it was wrong ... *sigh* okay, back
to the drawing board on the code ...
Can't you just change
Post by Marc G. Fournier
foreach $company_id ( keys %traffic ) {
to
Post by Marc G. Fournier
foreach $company_id ( sort keys %traffic ) {
etc.
Ya, just saw the note from Matt on that also ... didn't realize it was
*that* simple ... I was going to look into using the NULL Berkeley DB
driver ... in the O'Reilly Programming Perl book I have, the index entry
under "Hashes: Automatic Sorting Of" talks about the Berkeley DB driver
... and that's it, nothing about simple sorting like the above ... even
the entry under "Sorting: Hashes automatically" points to the same thing
...

Talk about over-complicating things :(



Jim Seymour
2004-04-06 00:26:29 UTC
Permalink
Post by Marc G. Fournier
D'oh ... just tested my assumption, it was wrong ... *sigh* okay, back
to the drawing board on the code ...
Can't you just change
Post by Marc G. Fournier
foreach $company_id ( keys %traffic ) {
to
Post by Marc G. Fournier
foreach $company_id ( sort keys %traffic ) {
etc.
Ya, just saw the note from Matt on that also ... didn't realize it was
*that* simple ... was going to look into using the NULL Berkeley DB driver
... the O'Reilly Programming Perl book that I have,...
[snip]

Kind of OT for this mailing list but...

What you just ran into is why I recommend to *anybody*, even the most
experienced of designers/engineers/coders/what-have-you, that they
start with "Learning Perl." (Also an O'Reilly book.) Saves no end of
grief ;).

IMO, "Programming Perl" is mainly useful after you already have a good
handle on the language.

Jim


Tom Lane
2004-04-05 17:41:18 UTC
Permalink
D'oh ... just tested my assumption, it was wrong ... *sigh* okay, back
to the drawing board on the code ...
Can't you just change
Post by Marc G. Fournier
foreach $company_id ( keys %traffic ) {
to
Post by Marc G. Fournier
foreach $company_id ( sort keys %traffic ) {
etc.

regards, tom lane

Matt Clark
2004-04-05 17:41:20 UTC
Permalink
Post by Marc G. Fournier
1. a traffic table is read in, and loaded into a hash table that is
$traffic{$ip_rec{$ip}{'company_id'}}{$ip_id}{$port} += $bytes1 + $bytes2;
2. a foreach loop is run on that resultant list to do the updates to the
foreach $company_id ( keys %traffic ) {
foreach $ip_id ( keys %{$traffic{$company_id}} ) {
foreach $port ( keys %{$traffic{$company_id}{$ip_id}} ) {
and the updates are done based on those 3 values, plus the byte value
of $traffic{$company_id}{$ip_id}{$port} ...
Now, my first mistake may be that I'm mis-assuming that the hashes will
be read in a sorted order ... ? If this is the case, though, then sort
order shouldn't be an issue, as all servers would be sorted the same way
The output of keys(%hash) is NOT ordered! Try:

foreach $company_id ( sort keys %traffic ) {
    foreach $ip_id ( sort keys %{$traffic{$company_id}} ) {
        foreach $port ( sort keys %{$traffic{$company_id}{$ip_id}} ) {
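A quick self-contained check of the point (illustrative data): keys %hash
comes back in Perl's internal bucket order, which is neither insertion
order nor sorted order (and in later Perls it even varies between runs).
Only an explicit sort gives every server the same ordering.

```perl
use strict;
use warnings;

# Five keys inserted in a fixed order ...
my %traffic = map { $_ => 1 } qw(30 4 100 7 52);

# ... come back from keys() in an unpredictable order,
my @raw = keys %traffic;

# but an explicit numeric sort is always the same: 4 7 30 52 100.
my @stable = sort { $a <=> $b } keys %traffic;
```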


Matt


Marc G. Fournier
2004-04-05 21:39:08 UTC
Permalink
That appears to have fixed it, thanks ... at least it hasn't happened in a
few hours, and it was happening at least once an hour previously ...
Post by Matt Clark
Post by Marc G. Fournier
1. a traffic table is read in, and loaded into a hash table that is
$traffic{$ip_rec{$ip}{'company_id'}}{$ip_id}{$port} += $bytes1 + $bytes2;
2. a foreach loop is run on that resultant list to do the updates to the
foreach $company_id ( keys %traffic ) {
foreach $ip_id ( keys %{$traffic{$company_id}} ) {
foreach $port ( keys %{$traffic{$company_id}{$ip_id}} ) {
and the updates are done based on those 3 values, plus the byte value
of $traffic{$company_id}{$ip_id}{$port} ...
Now, my first mistake may be that I'm mis-assuming that the hashes will
be read in a sorted order ... ? If this is the case, though, then sort
order shouldn't be an issue, as all servers would be sorted the same way
foreach $company_id ( sort keys %traffic ) {
foreach $ip_id ( sort keys %{$traffic{$company_id}} ) {
foreach $port ( sort keys %{$traffic{$company_id}{$ip_id}} ) {
Matt

Marc G. Fournier
2004-04-05 17:38:55 UTC
Permalink
D'oh ... just tested my assumption, it was wrong ... *sigh* okay, back
to the drawing board on the code ...
Post by Marc G. Fournier
Post by Tom Lane
Post by Marc G. Fournier
Now, the scripts are wrap'd in a BEGIN/END ... if a file fails to be
loaded, I want the whole thing to rollback ... the deadlock itself, I'm
presuming, is because two servers are trying to update the same
$ip_id/$port/$company_id record, at the same time ...
Actually, the problem is more likely that two servers try to update two
different rows in opposite orders. It's not possible to deadlock when
only one lock is involved.
You could work around this by ensuring that all sessions update rows in
a consistent order; for instance, at the beginning of a transaction sort
your intended updates by primary key and then apply in that order.
Actually, unless I'm mistaken about how hashes work in perl, the update
1. a traffic table is read in, and loaded into a hash table that is
$traffic{$ip_rec{$ip}{'company_id'}}{$ip_id}{$port} += $bytes1 + $bytes2;
2. a foreach loop is run on that resultant list to do the updates to the
foreach $company_id ( keys %traffic ) {
foreach $ip_id ( keys %{$traffic{$company_id}} ) {
foreach $port ( keys %{$traffic{$company_id}{$ip_id}} ) {
and the updates are done based on those 3 values, plus the byte value
of $traffic{$company_id}{$ip_id}{$port} ...
Now, my first mistake may be that I'm mis-assuming that the hashes will
be read in a sorted order ... ? If this is the case, though, then sort
order shouldn't be an issue, as all servers would be sorted the same way
...
