I am working on a broad phylogenomic analysis of proteins involved in one particular cellular process. I have constructed a phylogenetic tree for each protein, and I am now comparing those trees in order to extract categories (i.e., allele a of protein a occurs most frequently with allele b of protein b, etc.). I have attempted to construct a supertree using clann, but the resulting tree is extremely difficult to interpret. I would therefore like to carry out a quantitative comparison (and subsequent categorisation) of the trees, but it is unclear to me how to even begin this analysis. Is there a standardised method for comparing phylogenetic trees?
I don't know whether it's exactly what you need, but there are formal algorithms for tree comparison.
There are basically two approaches: one uses branch lengths (the branch score distance) and the other considers topology only (the symmetric-difference metric). Details and references can be found in the manual for treedist from the PHYLIP package, which implements both.
PAUP* has a command of the same name (treedist), which calculates the symmetric-difference metric only. This metric, also known as the Robinson-Foulds distance, is very intuitive: it is the total number of partitions (= splits) that are present in only one of the two trees.
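To make the symmetric-difference idea concrete, here is a minimal sketch in plain Python. It is not how treedist or PAUP* are implemented. It assumes each tree is written by hand as nested tuples of taxon names and that both trees contain the same taxa; the function names (leaf_set, splits, symmetric_difference) are made up for this illustration. For real data you would parse Newick files and use treedist or a phylogenetics library.

```python
def leaf_set(tree):
    """Return the set of taxon names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Return the non-trivial splits (bipartitions) of the tree, viewed as unrooted.

    Each split is stored as the side that does NOT contain a fixed
    reference taxon, so the same split always gets the same representation.
    """
    all_leaves = leaf_set(tree)
    reference = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - clade if reference in clade else clade
        # Skip trivial splits: empty side, single taxon, or all-but-one taxon.
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return clade

    visit(tree)
    return found


def symmetric_difference(tree_a, tree_b):
    """Count the splits that occur in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("both trees must contain the same taxa")
    return len(splits(tree_a) ^ splits(tree_b))


# Three hypothetical five-taxon trees (taxon names are placeholders).
t1 = (("A", "B"), ("C", ("D", "E")))
t2 = (("A", "C"), ("B", ("D", "E")))
t3 = (("A", "B"), ("D", ("C", "E")))

# Pairwise distances could then feed a clustering step to group similar trees.
for name_x, x in [("t1", t1), ("t2", t2), ("t3", t3)]:
    print(name_x, [symmetric_difference(x, y) for y in (t1, t2, t3)])
```

Computing this for every pair of trees gives a distance matrix, which is one possible starting point for categorising the trees. Note that this sketch only works when all trees share the same set of taxa.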