Discussion:
[Fwd: binary tree query]
(too old to reply)
Jodi Kanter
2004-01-26 19:16:13 UTC
Permalink
I have a biochemist telling me that this query below is a typical one
for crawling through a taxonomic tree and that this is how I should
represent some peptide information we have. Is there anyone on this list
familiar with such data? I am weak in the science department but this
query looks like it might not be the most efficient approach.
I have not been able to run an explain analyze yet as the database
structure and data are not in place yet. We are just in the planning
stages right now.

Any comments, suggestions, concerns, etc. would be much appreciated.
Would an experienced DBA recommend a different approach? Can anyone
offer some insight into the usefulness of INNER joins and the use of
BETWEEN? I am concernec about performance as well since I expect this
table to get large.

SELECT count(*)
FROM taxon_name
INNER JOIN taxon AS tax_b USING(taxon_id)
INNER JOIN taxon AS tax_v ON (tax_v.left_id BETWEEN
tax_b.left_id AND tax_b.right_id )
INNER JOIN annot ON (tax_v.taxon_id = annot.taxon_id)
INNER JOIN protein ON (protein.prot_id= annot.prot_id)
WHERE annot.pref = 1
AND taxon_name.taxon_id=207245
;



Thanks,
Jodi
--
/_______________________________
//Jodi L Kanter
BioInformatics Database Administrator
University of Virginia
(434) 924-2846
***@virginia.edu <mailto:***@virginia.edu>/



/ /

/ /
Yuji Shinozaki
2004-01-26 19:36:15 UTC
Permalink
Hi Jodi,

I believe the technique they are using is representing a tree as nested
sets. It requires that the database is built with properly nested
left_id's and right_id's, and this technique is often regarded as a
efficient means for retrieving hierachical information.

Here is one reference about it:

http://www.geocrawler.com/archives/3/6/2001/10/0/6961775/

As for the inner join clauses they are in effect analogous to
where's but the optimizer handles them differently. That is where
(pardon the pun) my understanding dwindles. Perhaps someone else has
better insight about inner join's vs where's.

Hope this sheds some light and not too many shadows,

yuji
----
Post by Jodi Kanter
I have a biochemist telling me that this query below is a typical one
for crawling through a taxonomic tree and that this is how I should
represent some peptide information we have. Is there anyone on this list
familiar with such data? I am weak in the science department but this
query looks like it might not be the most efficient approach.
I have not been able to run an explain analyze yet as the database
structure and data are not in place yet. We are just in the planning
stages right now.
Any comments, suggestions, concerns, etc. would be much appreciated.
Would an experienced DBA recommend a different approach? Can anyone
offer some insight into the usefulness of INNER joins and the use of
BETWEEN? I am concernec about performance as well since I expect this
table to get large.
SELECT count(*)
FROM taxon_name
INNER JOIN taxon AS tax_b USING(taxon_id)
INNER JOIN taxon AS tax_v ON (tax_v.left_id BETWEEN
tax_b.left_id AND tax_b.right_id )
INNER JOIN annot ON (tax_v.taxon_id = annot.taxon_id)
INNER JOIN protein ON (protein.prot_id= annot.prot_id)
WHERE annot.pref = 1
AND taxon_name.taxon_id=207245
;
Thanks,
Jodi
--
/_______________________________
//Jodi L Kanter
BioInformatics Database Administrator
University of Virginia
(434) 924-2846
/ /
/ /
Yuji Shinozaki Computer Systems Senior Engineer
***@virginia.edu Advanced Technologies Group
(434)924-7171 Information Technology & Communication
http://www.people.virginia.edu/~ys2n University of Virginia


---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster
Jodi Kanter
2004-01-29 14:10:41 UTC
Permalink
Thanks for the article. It did help some but I am still not sure if
representing a tree is best served using a self referential table.
Wouldn't it be better to separate it into several tables?
Can anyone else comment? Anyone have experience storing data that is
hierarchical in nature (e.g. a tree formation).
Jodi
Post by Yuji Shinozaki
Hi Jodi,
I believe the technique they are using is representing a tree as nested
sets. It requires that the database is built with properly nested
left_id's and right_id's, and this technique is often regarded as a
efficient means for retrieving hierachical information.
http://www.geocrawler.com/archives/3/6/2001/10/0/6961775/
As for the inner join clauses they are in effect analogous to
where's but the optimizer handles them differently. That is where
(pardon the pun) my understanding dwindles. Perhaps someone else has
better insight about inner join's vs where's.
Hope this sheds some light and not too many shadows,
yuji
----
Post by Jodi Kanter
I have a biochemist telling me that this query below is a typical one
for crawling through a taxonomic tree and that this is how I should
represent some peptide information we have. Is there anyone on this list
familiar with such data? I am weak in the science department but this
query looks like it might not be the most efficient approach.
I have not been able to run an explain analyze yet as the database
structure and data are not in place yet. We are just in the planning
stages right now.
Any comments, suggestions, concerns, etc. would be much appreciated.
Would an experienced DBA recommend a different approach? Can anyone
offer some insight into the usefulness of INNER joins and the use of
BETWEEN? I am concernec about performance as well since I expect this
table to get large.
SELECT count(*)
FROM taxon_name
INNER JOIN taxon AS tax_b USING(taxon_id)
INNER JOIN taxon AS tax_v ON (tax_v.left_id BETWEEN
tax_b.left_id AND tax_b.right_id )
INNER JOIN annot ON (tax_v.taxon_id = annot.taxon_id)
INNER JOIN protein ON (protein.prot_id= annot.prot_id)
WHERE annot.pref = 1
AND taxon_name.taxon_id=207245
;
Thanks,
Jodi
--
/_______________________________
//Jodi L Kanter
BioInformatics Database Administrator
University of Virginia
(434) 924-2846
/ /
/ /
Yuji Shinozaki Computer Systems Senior Engineer
(434)924-7171 Information Technology & Communication
http://www.people.virginia.edu/~ys2n University of Virginia
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster
--
/_______________________________
//Jodi L Kanter
BioInformatics Database Administrator
University of Virginia
(434) 924-2846
***@virginia.edu <mailto:***@virginia.edu>/



/ /

/ /
Loading...