The best free and most up to date phylogenetic tree on the internet?

I find that the phylogenetic trees on Wikipedia lack accuracy. They are confusing because different versions coexist, and some of the terms are no longer scientifically accurate in light of DNA analysis.

So does anyone know where to find the most up-to-date data source on the internet?


Available resources

I think oneZoom.org is probably the best resource. It is based on openTreeOfLife.org (for phylogenetic relationships) and eol.org (Encyclopedia of Life; mainly for the pictures, I think). If you do not much like the display and links of oneZoom.org, then I would just recommend openTreeOfLife.org.

There is also tolweb.org, but it is not as well updated as openTreeOfLife.org (and therefore oneZoom.org), and the dates are harder to find.

Particularities of oneZoom.org

I particularly like that for each node oneZoom.org offers links to

  • wikipedia
  • encyclopedia of life
  • the red list
  • NCBI

These links make it a very valuable tool. oneZoom.org has the advantage of a nice display (although a bit slow to load, IMO), and it is easy to navigate and to find the time to the Most Recent Common Ancestor of a lineage.

I also like that you can just add the species name after "#" in the URL and it will jump to its node. Here is the Kagu for example.

Potential issues in oneZoom.org

Of course, oneZoom.org contains a few mistakes! Some of them are unintentional (a wrong picture, misrepresentation of what is known from the literature, failure to update information); others are caused by lack of knowledge of the true phylogenetic relationships. oneZoom.org does not say how certain we are that a given speciation event is correct or correctly dated; it just displays the best estimates we have. If a specific node is of interest to you, you will need to go to the scientific literature, which will give you much more information about what we know and what we roughly guess.


Open Tree of Life could be what you're after. It was updated last month according to this page. The visual interface is rather bland, but that's probably the price one has to pay for accuracy and completeness.

Vertebrates, arthropods and molluscs are all available as categories in the taxon search bar on the home page.


A Specimen-Based Approach to Justifying Paleontological Data

Most studies use a Bayesian framework for estimating divergence dates with probability curves between a minimum and a maximum bound to represent calibrations (time priors) (Thorne et al. 1998; Drummond et al. 2006; Yang 2006; Yang and Rannala 2006). An appropriately constructed fossil calibration uses the oldest assigned fossil of a taxon as the basis for its minimum age and then constructs these other parameters around it (Benton and Donoghue 2007; Donoghue and Benton 2007). One key to improving the use of paleontological data is recognizing that this first step can be tied explicitly to one or a small set of museum specimens, creating a readily auditable chain of evidence. To minimize error and maximize clarity, all calibration data should be derived explicitly from specific fossil specimens. If links between calibration data and specimens cannot be made, then there are serious questions about the validity of the proposed time priors. In this respect, the fossil specimens used for calibrations represent a standard, much in the same way that a holotype specimen (or type series) is a taxonomic standard. In both cases, these specimens provide a necessary reference point for future inquiries.

The explicit reporting of specimen data is just as crucial to the scientific integrity of a fossil calibration study as is making genetic sequences publicly available or reporting analytical methods. Thus, it is worthwhile to compile, reiterate, and expand on the caveats from previous studies that pertain to the construction and reporting of fossil calibrations (e.g., Graur and Martin 2004; Hedges and Kumar 2004; van Tuinen and Hadly 2004a, 2004b; Benton and Donoghue 2007; Donoghue and Benton 2007; Gandolfo et al. 2008; Parham and Irmis 2008; Benton et al. 2009; Ksepka 2009; Sanders et al. 2010) while providing a simple and explicit protocol (in checklist form) to address them.

The checklist can be divided into two parts, justifying phylogenetic position (Steps 1–3) and justifying age (Steps 4 and 5). In most cases, the data needed to justify calibrations are rarely found in a single publication but tend to be spread across many. In addition to being derived from many sources, such information is rarely explicitly flagged as potentially valuable for calibrations. Therefore, a rigorous and explicit approach is needed for justifying the use of paleontological and geological data for divergence dating. The following steps can be used to develop new calibrations and as a checklist for vetting and justifying previously published calibrations based on fossils. If all five steps are fulfilled, then a calibration can be considered well justified.

(1) Museum numbers of specimen(s) that demonstrate all the relevant characters and provenance data should be listed. Referrals of additional specimens to the focal taxon should be justified.

(2) An apomorphy-based diagnosis of the specimen(s) or an explicit, up-to-date, phylogenetic analysis that includes the specimen(s) should be referenced.

(3) Explicit statements on the reconciliation of morphological and molecular data sets should be given.

(4) The locality and stratigraphic level (to the best of current knowledge) from which the calibrating fossil(s) was/were collected should be specified.

(5) Reference to a published radioisotopic age and/or numeric timescale and details of numeric age selection should be given.
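One way to make the checklist auditable is to record each calibration as a structured record and test which steps still lack supporting data. The Python sketch below is only an illustration under stated assumptions: the class name, the field names, and the rule that a step counts as met when its fields are non-empty are all hypothetical and not part of the protocol itself. The example record uses only the Diacodexis ilicis details quoted in the Figure 2 caption; Steps 2 and 3 are deliberately left blank because that caption does not supply them.

```python
from dataclasses import dataclass, field


@dataclass
class FossilCalibration:
    """One fossil calibration; field names are hypothetical, one group per step."""
    specimen_numbers: list = field(default_factory=list)  # Step 1: museum numbers
    phylogenetic_justification: str = ""  # Step 2: apomorphy-based diagnosis or analysis
    reconciliation_statement: str = ""    # Step 3: morphological vs. molecular topology
    locality: str = ""                    # Step 4: collecting locality
    stratigraphic_level: str = ""         # Step 4: rock unit and level
    age_reference: str = ""               # Step 5: published age or timescale
    age_selection_note: str = ""          # Step 5: how the numeric age was chosen

    def unmet_steps(self):
        """Return the checklist step numbers (1-5) that have no supporting data."""
        checks = {
            1: bool(self.specimen_numbers),
            2: bool(self.phylogenetic_justification.strip()),
            3: bool(self.reconciliation_statement.strip()),
            4: bool(self.locality.strip()) and bool(self.stratigraphic_level.strip()),
            5: bool(self.age_reference.strip()) and bool(self.age_selection_note.strip()),
        }
        return [step for step, met in checks.items() if not met]

    def is_well_justified(self):
        """A calibration is treated as well justified only if all five steps are met."""
        return not self.unmet_steps()


# Record filled in only from the Diacodexis ilicis figure caption in this text.
example = FossilCalibration(
    specimen_numbers=["UM 87854"],
    locality="Locality UM SC-67, Clarks Fork Basin, northern Wyoming",
    stratigraphic_level="Willwood Formation, biozone Wa-0",
    age_reference="Westerhold et al. 2009",
    age_selection_note="youngest bound of the carbon isotope excursion",
)
print(example.unmet_steps())
```

A non-empty field is of course no guarantee that the justification is sound; the sketch only flags what has not been reported at all.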

(1) Museum Numbers of Specimen(s) that Demonstrate all the Relevant Characters and Provenance Data Should be Listed. Referrals of Additional Specimens to the Focal Taxon Should be Justified

Ideally, a fossil used for calibration would be based on a single specimen that preserves all the characters that allow it to be unambiguously assigned to a clade. Single-specimen operational taxonomic units (OTUs) are preferable because, aside from rare mixed specimens, they are almost guaranteed to be from a single species. However, divergence dating studies that use paleontological data for calibrations usually rely on OTUs from phylogenetic analyses that are based on sets of specimens referred to a single taxon by various criteria. In some cases, the basis for a taxonomic referral can be as poor as documenting that the specimen was recovered from the same region or horizon where other specimens were previously reported. Consequently, “chimeric taxa” are a recurring problem in paleontology (Meyer-Berthaud et al. 1992; Padian 2000; Parham 2005).

Because single-specimen fossil OTUs are not always possible, it is necessary to revisit the association and referral of specimens. It may be possible to refer specimens from different localities to a single taxon if there are overlapping diagnostic elements or even through phylogenetic analysis (Gandolfo et al. 1997; Yates 2003; Pol 2004; Boyd et al. 2009; Makovicky 2010). In cases where previously recognized OTUs cannot be objectively assembled, it is necessary to restrict the calibration to a subset of specimens (e.g., Danilov and Parham 2005) or eliminate the OTU from the calibration.

(2) An Apomorphy-Based Diagnosis of the Specimen(s) or an Explicit, Up-to-Date, Phylogenetic Analysis that Includes the Specimen(s) Should be Referenced

Incorrect phylogenetic placement of fossil calibrations can introduce large errors into divergence date estimates (Lee 1999; Brochu 2000; van Tuinen and Hedges 2004; Phillips et al. 2010). Fossil-calibrated dating studies rely on the paleontological literature for calibration placement, but many of the putative oldest representatives of a lineage have never been included in a formal phylogenetic analysis. Gandolfo et al. (2008) identified several instances in which incorrect identifications and taxonomic assignments led to inappropriate fossil calibrations. This is a particular problem for clades that are understudied, represented by a sparse fossil record, and/or routinely overidentified (i.e., placed in a lower level taxon than the data can demonstrate) in the literature (e.g., Cenozoic amphibians and reptiles; Bever 2005; Bell et al. 2010; Sanders et al. 2010). The fact that different authorities may use the same taxon names to refer to different biological entities confounds the problem and may be particularly prevalent when addressing the fossil record of extant lineages. This is why we recommend the use of an apomorphy-based approach to identifying and phylogenetically placing specimens that are relevant for paleontological calibrations. These guidelines can also be applied to trace fossils (e.g., tetrapod footprints) in the case that their identifications are well supported and they show strong evidence for the antiquity of a lineage based on explicit apomorphies (Carrano and Wilson 2001; Li et al. 2008; Brusatte et al. 2011).

Because fossils are incompletely preserved, many extinct species have controversial phylogenetic assignments. Given the analytical burden placed on paleontological data, it is imperative that up-to-date evidence supporting the taxonomic assignment of relevant OTUs be explicitly provided. A recurring pitfall is the understandable enthusiasm of paleontologists to report the oldest geological record of a clade, frequently based upon fragmentary evidence. This can be problematic on two counts. First, fragmentary remains often provide insufficient anatomical evidence to discriminate whether shared characters are products of convergence or common descent. Second, with fragmentary specimens, it can be difficult to distinguish whether the critical fossil belongs to the stem or the crown of the clade that it is being used to calibrate. By definition, the earliest stem members will possess the smallest subset of the diagnostic characters of the crown, and so assigning fragmentary fossils to either the crown or the stem of a clade requires detailed knowledge of character evolution that is not always available. Conversely, fossil specimens of crown clades may not be recognized as such because they lack one or more of the diagnostic characters as a consequence of taphonomy or secondary loss (Hennig 1981; Donoghue and Purnell 2009; Sansom et al. 2010). This issue is especially true for crown clades that are united on the basis of strong molecular evidence but for which limited morphological support is known (e.g., Afrotheria or Boreoeutheria among placental mammals; see Asher et al. 2009). This problem is also likely to occur in poorly represented basal taxa of lineages that underwent substantial morphological evolution long after their origin. In those cases, the taxa that might be of greatest interest in constraining the time of divergence from the nearest living relative may be difficult to identify.

These complexities underscore the need to carefully justify the phylogenetic placement of any specimen used for calibrations. It is not enough to cite a paper that merely mentions the taxon or specimen(s) because the strictness of criteria used in the reported phylogenetic placement of fossils varies among authors (especially when it comes to fragmentary, undescribed, and/or unanalyzed specimens). The phylogenetic position of a fossil taxon can be unstable even when relatively complete specimens are available. Therefore, a thorough knowledge of the paleontological literature is required to make sure that the most recent and/or valid study is being cited. After all, claims about the oldest member(s) of a lineage may change as new data and analyses are published. A good example of this phenomenon is the case of the putative oldest placental mammals, the zhelestids (Archibald 1996). Zhelestids are Cretaceous mammal fossils that were initially hypothesized to be nested deeply within the crown clade of modern orders of placental mammals (Eutheria), the rest of which do not appear until the Cenozoic. In more recent analyses, zhelestids have been steadily moving down the tree (Archibald et al. 2001) and now are hypothesized to be on the stem of Eutheria (Luo and Wible 2005), where they offer no evidence about a minimum date for crown Eutheria. This stemward change in phylogenetic position arose from increasing clarity about the relationships of mammalian orders rather than from correcting errors in earlier morphological study or discovery of better specimens. All three phenomena (new specimens, new interpretations of existing specimens, and phylogenetic revisions) can lead to major revisions in the phylogenetic placement of fossils.

Existing databases such as the Paleobiology Database (www.pbdb.org) may contain detailed taxonomic, geographic, geologic, and stratigraphic information associated with fossil specimens, but relevant phylogenetic information justifying the taxonomic placement of these individual specimens is usually lacking. Moreover, rates of polyphyly in mammalian and molluscan morphotaxa were recently documented to be as high as 19% (Jablonski and Finarelli 2009), illustrating the risks of uncritically accepting taxonomic allocations represented in large-scale databases (as well as the need to construct databases following our specimen-based protocol). Whereas existing databases are extremely useful for identifying the potential oldest specimens assignable to a given clade, explicit, apomorphy-based information is still necessary to justify the phylogenetic position of a specimen for calibration.

(3) Explicit Statements on the Reconciliation of Morphological and Molecular Data Sets Should be Given

In the best cases, fossil specimens possess unambiguous apomorphies that allow them to be assigned to a single extant lineage with confidence. In these instances, assigning fossils to nodes is straightforward. Regardless of the tree topology, the fossil will track the extant lineage and serve as a candidate calibration for all nodes in which it is nested (Fig. 1, Example 1; see, e.g., Smith 2010). In other cases, the position of a fossil is supported by ambiguous apomorphies (i.e., homoplastic characters) and is therefore highly dependent on the topology of a specific analysis. In addition to the changing position of a taxon given different morphological analyses (see Step 2 above), any discrepancy between topologies of morphological and molecular phylogenetic analyses is a potential pitfall that has been underemphasized (Benton et al. 2009; Lyson et al. 2010; Wiens et al. 2010). Different topologies from morphological and molecular analyses can affect fossil calibrations in several ways. In some cases, the placement of a fossil may become ambiguous (Fig. 1, Example 2), leading to uncertainty about which node(s) it can be used to calibrate. If morphological data show high levels of homoplasy, the polarization of morphological characters also may be sensitive to shifting topologies (Fig. 1, Example 3). Different topologies imply different hypotheses of character evolution, potentially impacting the placement of fossils in a tree (Asher et al. 2005; Cadena et al. 2012). Unless morphological and molecular trees are in agreement, the phylogenetic position of a fossil cannot be automatically transferred to a molecular-based topology. Therefore, merely citing a morphological phylogeny that places a fossil taxon (i.e., Step 2) is insufficient justification for a fossil calibration.

Figure 1. Example 1: A fossil (†) with unambiguous synapomorphies can be assigned to a specific lineage (D) with confidence. Regardless of the topology, the fossil will track the extant lineage and serve as a candidate calibration for all nodes above which it is nested. Example 2: Competing phylogenetic hypotheses from different data sets can change the position of fossil calibrations. In the morphological analysis, a fossil is found to be closely related to lineages C and D. Two arrows show the nodes that the fossil could calibrate. A molecular study with a different topology separates lineages C and D, making the placement of the fossil ambiguous. If the fossil is closely related to C, then it could calibrate three nodes. If the fossil is closely related to D, then it is a candidate calibration for just one node. Example 3: Changes to outgroup topology can change the polarization of morphological characters and placement of fossils. In the morphological analysis, a fossil (†) is placed in the C + D clade, sister to D. A molecular analysis changes the relationships of the outgroups (A and B). In a combined analysis, the morphological characters for the C + D clade are polarized in a different way and so using the fossil to calibrate clade C + D would be inappropriate.
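The node counts in Example 2 can be reproduced with a small tree walk. The Python sketch below is illustrative only: the figure itself is not reproduced in this text, so the molecular topology used here, (D,(A,(B,C))), and the internal node labels are hypothetical choices made simply to match the counts given in the caption. The function treats a fossil assigned to an extant lineage as a candidate calibration for every node ancestral to that lineage.

```python
def candidate_nodes(parent_of, lineage):
    """List every internal node ancestral to `lineage`, nearest node first."""
    nodes = []
    node = parent_of.get(lineage)
    while node is not None:
        nodes.append(node)
        node = parent_of.get(node)
    return nodes


# Hypothetical molecular topology (D,(A,(B,C))) stored as a child -> parent map.
# Lineages C and D are separated, as in Example 2; node labels are made up.
MOLECULAR = {
    "B": "BC", "C": "BC",
    "A": "ABC", "BC": "ABC",
    "D": "root", "ABC": "root",
}

print(candidate_nodes(MOLECULAR, "C"))  # ['BC', 'ABC', 'root']
print(candidate_nodes(MOLECULAR, "D"))  # ['root']
```

Under this assumed topology, a fossil allied with C is a candidate for three nodes and a fossil allied with D for only one, which is why an ambiguous placement leaves the set of calibrated nodes undetermined.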

Some problems of incongruent morphological and molecular topologies can be mitigated by either “total evidence” (sensu Kluge 1989) analyses (e.g., Brochu 1997; Hermsen and Hendricks 2008; O'Leary and Gatesy 2008; Ksepka 2009) or through the use of a “molecular scaffold” in resolving morphological character distribution and, therefore, the phylogenetic position of species known only from fossils (e.g., Springer et al. 2001; Danilov and Parham 2006). Both those approaches incorporate, and therefore explicitly attempt to reconcile, the morphological data from fossil specimens with the topologies of molecular analyses, though they make different assumptions about the accuracy of molecular versus morphological data. These methods do not solve every problem, so a conservative approach to calibrating analyses based on poorly supported or controversial placements is warranted. In some cases, it may be conceivable that the morphological and molecular data sets are so incongruent that neither a total evidence nor a molecular scaffold approach is sufficient for reconciling the position of an extinct taxon. For example, given current uncertainty concerning the phylogenetic position of turtles among amniotes, any use of the oldest fossil turtle specimens to calibrate amniote branching events has a two-thirds probability of introducing error into the analysis (see Lyson et al. 2010; Lyson et al. 2012). We recommend against using such controversial OTUs to calibrate divergence dating analyses.

(4) The Locality and Stratigraphic Level (to the Best of Current Knowledge) from which the Calibrating Fossil(s) Was/Were Collected Should be Specified

Unless they are subjected to direct radioisotopic analysis (which is rarely possible), the provenance of specimens used for calibrations must be documented. The accuracy with which a particular fossil can be located to a specific level in a stratigraphic column varies but depends largely on how detailed the locality data are. It might be constrained to a discrete bed in a measured stratigraphic section, or a geologic formation or group, or a depositional basin. Many specimens, especially those collected more than 50 years ago or those derived from the commercial trade, lack detailed stratigraphic and geographic occurrence data and so have limited value for calibration purposes.

Almost any fossil found in situ can be assigned to its source rock unit and often to a particular stratigraphic level within that unit. In the best cases, calibration data will be based upon fossils with precise locality information and stratigraphic context that can be assigned to a particular meter level in a chronostratigraphically well-studied section (Fig. 2). The accuracy with which a fossil can be placed within a stratigraphic framework will have a major impact on estimates of its relative (stratigraphic) and numeric (absolute) age, particularly in light of improvements in correlation, revisions of stratigraphy, and refinements in geochronology. Geologic units (e.g., groups, formations, and members) are the key lithostratigraphic units used by field geologists to correlate and divide the sedimentary rock sequence in a geographic region; they generally have formal names (e.g., Willwood Formation, Fig. 2) and explicitly defined bases and tops.

Figure 2. Every fossil taxon has geographic and geological contexts that provide a basis for determining its age. The example given here is for Diacodexis ilicis. Depending on the phylogeny used, D. ilicis can be a useful minimum calibration for artiodactyl mammals. Six specimens of D. ilicis are known (Gingerich 1989) and the holotype, UM (University of Michigan) 87854, is among the oldest well-dated specimens. UM 87854 is from the Clarks Fork depositional basin in northern Wyoming. Within the Clarks Fork Basin, it is from the Willwood Formation. Within the Willwood Formation, it is from Locality UM SC-67. Locality UM SC-67 is part of a well-studied stratigraphic section for the Early Eocene. Within the Early Eocene, Locality UM SC-67 can be placed in the Wasatchian Land-Mammal Age. Within the Wasatchian, Locality UM SC-67 can be assigned to the biozone Wa-0 and occurs within a global negative carbon isotopic excursion. Wa-0 spans the latter part of this carbon isotope excursion and is inferred to represent ∼95 ky in the stratigraphic section where UM 87854 occurs (Abdul Aziz et al. 2008); the entire global carbon isotope excursion is currently dated to 55.65–55.93 Ma on the basis of radioisotopic ages and orbital tuning methods based on the earth's precessional cycles (Westerhold et al. 2009), giving specimen UM 87854 a minimum age of 55.65 Ma.

Geologic units are never of uniform scale, whether in terms of thickness or geographic extent, because they merely represent mappable units of distinctive rock types. Most importantly, rock units do not represent equal units of time—some rock units may be deposited geologically instantaneously, whereas others might represent millions of years with different portions of the total time range represented at particular outcrops. Nor do the boundaries between lithologic units necessarily coincide with geochronologic divisions (i.e., units of geologic time). But the assignment of a fossil to a named geologic rock unit provides a fixed standard of the relative age of the fossil that can then be used to establish a numeric age as outlined below (5).

Stratigraphy is not a static field. Episodically, stratigraphic nomenclature is revised or entirely redefined with the establishment of new “type sections,” and new lithostratigraphic or biostratigraphic schemes proposed. New descriptions and correlations can lead to refined interpretations of the geologic unit present at a particular geographic locality (e.g., Martz and Parker 2010). The dynamic nature of stratigraphy highlights the importance of detailed geographic locality information for fossil specimens in order to determine the impact of revised stratigraphic interpretations, correlations, and geochronologies upon divergence dating calibrations and, ultimately, divergence time estimates.

(5) Reference to a Published Radioisotopic Age and/or Numeric Timescale and Details of Numeric Age Selection Should be Given

Divergence dating analyses require numeric ages, but paleontologists do not routinely use or report numeric ages. The numeric age of a fossil is generally outside the purview of most paleontologists' research interests for two reasons. First, the geochronologic data required for numeric dates can be difficult to establish for a particular rock unit and geographic locality. Second, though geochronologies evolve, named rock units change much less frequently and so provide a more stable albeit relative comparative framework for reporting fossil occurrences. The translation of fossil occurrences to numeric ages frequently involves a daisy chain of correlations through different geographic localities on the basis of overlapping geological and paleontological evidence (e.g., van Tuinen and Hadly 2004a; Benton et al. 2009; Smith 2011). However, for the vast majority of calibrations, this translation is not explained, meaning the actual numbers used in calculations are not adequately justified.

The numeric age of a fossil is not necessarily stable, particularly if it is established through correlation rather than through direct dating at the section in which the fossil was found. Any numeric age for a fossil specimen is merely the best current estimate and can be refined through time. For example, radioisotopic dating methods have improved dating precision by roughly an order of magnitude in the past 20 years as a result of new methods, recalibration of standards, and cross-testing among existing methods (e.g., Mundil et al. 2004; Erwin 2006; Renne et al. 2010). 40Ar/39Ar and U-Pb ages differ systematically by ∼1%, something that requires correction prior to comparison (e.g., Renne et al. 2010). Because of this ongoing refinement, it is important to fully explain the basis upon which the numeric age is established. If the chain of inference is explicit, the consequences of revisions will be easily identified. At its most basic level, our recommendation for justifying the numeric age of a calibration point is that the translation of relative intervals from paleontological studies should reference geochronological literature or published timescales that include numeric ages (e.g., Hess and Lippolt 1986; Menning et al. 2000; Gradstein et al. 2004; Ogg 2010; Walker and Geissman 2009). Of course, even compiled geologic timescales rely on some interpolation, are themselves constantly undergoing revision, and can become obsolete. Referencing these timescales makes it easier for later workers to revise reported ages.

A second part of this step in the protocol involves the logistical interpretation of the numeric age from the geological timescale. For a minimum age constraint, the youngest age interpretation of the fossil should be used (i.e., the uppermost limit of the relevant time interval) rather than the common practice of adopting a midpoint in the possible range. Because a fossil necessarily postdates the origination of the lineage to which it is assigned, choosing the youngest possible age from an interval will necessarily bias the minimum further from the true age of origination. However, it is important to recognize that the minimum age is only one end point of a constraint and is meant to partially bracket, not approximate on its own, the age of origination. Therefore, the minimum age should accommodate the youngest possible age of the fossil including the error associated with the geochronologic age (van Tuinen et al. 2004; Donoghue and Benton 2007; Benton and Donoghue 2007; Benton et al. 2009).

This youngest possible age should be applied as a hard minimum. The logic behind assigning hard minima based on the youngest possible age of the oldest-known fossil has been discussed extensively (e.g., van Tuinen et al. 2004; Benton and Donoghue 2007; Donoghue and Benton 2007). Some authors may still choose to use soft minima in cases of hypothesized anagenesis or geologic uncertainty, but such instances require careful justification. The arbitrary assignment of a minimum age that postdates the stated youngest estimates for a fossil should be avoided. The justification for arbitrarily expanding the interval might appeal to a conservative bias, but when paleontological data are properly established and justified that practice serves only to introduce unnecessary error into the analysis.
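The arithmetic behind this rule is small enough to sketch. The Python functions below are illustrative only, and the names hard_minimum_age and midpoint_age do not come from any published tool. They contrast the midpoint practice with the youngest-age rule, using the 55.65–55.93 Ma interval reported for the carbon isotope excursion that contains UM 87854. The error_ma parameter is an assumed way to represent geochronologic error; it is set to zero here because no separate error term is quoted for this interval.

```python
def hard_minimum_age(youngest_ma, oldest_ma, error_ma=0.0):
    """Return the hard minimum age (Ma) for a fossil dated to an interval.

    Ages are in millions of years before present, so a smaller number is
    younger. The minimum is the youngest bound of the interval, pushed
    younger still by any stated dating error.
    """
    if youngest_ma > oldest_ma:
        raise ValueError("youngest_ma must not exceed oldest_ma")
    if error_ma < 0:
        raise ValueError("error_ma must be non-negative")
    return youngest_ma - error_ma


def midpoint_age(youngest_ma, oldest_ma):
    """Return the interval midpoint, the practice the text advises against."""
    return (youngest_ma + oldest_ma) / 2.0


# Interval for the carbon isotope excursion, from the Diacodexis ilicis example.
youngest, oldest = 55.65, 55.93
print(hard_minimum_age(youngest, oldest))        # 55.65
print(round(midpoint_age(youngest, oldest), 2))  # 55.79
```

For this well-dated interval the two choices differ by only 0.14 Myr; the gap grows with the width of the interval, so the choice matters most for fossils that are constrained only to a broad stratigraphic range.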

In some cases, either because of poor correlations or poorly documented provenance, the age of a fossil may not be well constrained beyond a very broad stratigraphic interval. But in many cases, it is possible to determine much more precise and accurate dates than are given by a stratigraphic interval. Those data may not be available in the publications describing the fossil specimens used for calibrations, and so it is usually necessary to compile evidence from multiple studies. Anatomically trained fossil systematists may not be able to retrieve those data any more easily than molecular systematists, but by listing the specimen numbers, rock units, and ages in a standardized way, others may check the claim, thus facilitating the refinement of numeric dates over time.

Useful Discussions

In addition to the five steps of the specimen-based protocol, we recommend that authors include some discussion about the history of each node that addresses rejected or obsolete calibrations. Such detailed discussions of calibrations already exist in some papers (e.g., Benton and Donoghue 2007; Hurley et al. 2007; Benton et al. 2009). These summary discussions make it easier for others to assess the justification by highlighting the relevant literature and argumentation. We should expect that through discovery, description, critique, and phylogenetic/stratigraphic analysis that even the best-justified calibrations would eventually be refined or even dramatically changed. In order to facilitate the evolution of justifications, we recommend that explanatory discussions (or citations of such discussions) should become a standard part of calibration reporting.

Other Parameters

The justification of the phylogenetic position and age of a fossil is an important first step to calibrating a node in a divergence dating analysis. In addition to determining which nodes can even be assigned time priors (some may not have useable fossils), this step provides the most tangible data from the fossil record: the hard minimum bound of a calibration interval. The maximum bound and the distribution of probabilities within the minimum–maximum interval are also ostensibly based on the fossil record, but in a much more complex way, because they describe the probability of origination before the oldest known fossil. The idiosyncratic nature of these other parameters precludes us from developing a standard protocol for them.

Ideally, the maximum constraint is established as older than all the oldest possible records, extending back to encompass a time when the ecologic, biogeographic, geologic, and taphonomic conditions for the existence of the lineage are met, but no records are known. For the maximum bound, an intuitive approach that takes into account preservation potential and phylogenetic bracketing has been proposed (e.g., Reisz and Müller 2004a; Müller and Reisz 2005; Benton and Donoghue 2007; Donoghue and Benton 2007; Benton et al. 2009). This approach is borrowed and developed from the fossil recovery potential function established by Marshall (1997). Researchers who use this intuitive approach should provide detailed arguments justifying their decisions so that others can evaluate them and, following the arguments of Benton and Donoghue (2007) and Ho and Phillips (2009), the maximum bounds should be soft and liberal.

Most studies use a Bayesian framework for estimating divergence dates with probability curves between minimum and maximum bounds. In theory, such complex, parameter-rich priors may be better models of the fossil record, but there is presently no practical way to estimate curve parameters (Ho and Phillips 2009). Lee and Skinner (2011) note, “current practice often consists of little more than educated guesswork.” A review of recent studies shows that these parameters are usually not justified (Warnock et al. 2012). The implications of these choices have only recently begun to be explored (Inoue et al. 2010; Clarke et al. 2011; Lee and Skinner 2011; Warnock et al. 2012). That a widely applied methodology rests on such ambiguous assumptions, which have a major impact on results (Clarke et al. 2011; Warnock et al. 2012), is a major limitation of molecular divergence dating studies. The development of objective methods for estimating maximum bounds and probability curves should be a priority (see Future Directions section).
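To make the idea of a hard minimum combined with a soft maximum concrete, here is a small Python sketch of one possible calibration density. The function name `calibration_density`, the flat shape between the bounds, the exponential tail, and the 2.5% tail mass are all assumptions chosen for illustration; they are not parameters recommended by the text, which stresses that such choices currently lack an objective basis.

```python
import math


def calibration_density(t: float, t_min: float, t_max: float,
                        tail_mass: float = 0.025) -> float:
    """Prior density for a node age t (Ma) under an illustrative model.

    - Hard minimum: zero density for ages younger than t_min.
    - Flat density between t_min and t_max holding (1 - tail_mass).
    - Soft maximum: an exponential tail older than t_max holding tail_mass,
      matched in height to the flat part so the curve is continuous there.
    """
    if not 0.0 < tail_mass < 1.0:
        raise ValueError("tail_mass must lie strictly between 0 and 1")
    if t_max <= t_min:
        raise ValueError("t_max must be older (larger) than t_min")
    if t < t_min:
        return 0.0
    height = (1.0 - tail_mass) / (t_max - t_min)
    if t <= t_max:
        return height
    # The tail integrates to height / rate, so this rate gives tail_mass.
    rate = height / tail_mass
    return height * math.exp(-rate * (t - t_max))


# Illustrative bounds only (Ma).
for age in (45.0, 50.0, 66.0, 67.0):
    print(age, calibration_density(age, t_min=47.8, t_max=66.0))
```

Changing `tail_mass` or replacing the flat section with another curve changes the prior substantially, which is exactly the sensitivity the paragraph above warns about.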



I hope soon to move these software listings webpages to a Github archive, and invite others to help contribute to them and maintain them.

Here are 392 phylogeny packages and 54 free web servers, (almost) all that I know about. It is an attempt to be completely comprehensive. I have not made any attempt to exclude programs that do not meet some standard of quality or importance. Updates to these pages are made roughly monthly. Here is a "waiting list" of new programs waiting to have their full entries constructed. Many of the programs in these pages are available on the web, and some of the older ones are also available from ftp server machines.

The programs listed below include both free and non-free ones; in some cases I do not know whether a program is free. I have listed as free those that I knew were free; for the others you have to ask their distributor. Usually when I say that a program is downloadable from a web site, this means that it is available free.

Email addresses in these pages have had the @ symbol replaced by (at) and also surrounded by invisible confusing tags and blank characters in hopes of foiling spambots that harvest email addresses.

Owing to past NSF support of these pages, I am required to note that any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation (NSF supported these pages from 1995-2003).

List of packages arranged by method

Phylogeny programs formerly listed here but no longer distributed

  • General-purpose packages
  • Parsimony programs
  • Distance matrix methods
  • Computation of distances
  • Maximum likelihood methods
  • Bayesian inference methods
  • Quartets methods
  • Artificial-intelligence and genetic algorithms methods
  • Invariants (or Evolutionary Parsimony) methods
  • Interactive tree manipulation
  • Looking for hybridization or recombination events
  • Bootstrapping and other measures of support
  • Compatibility analysis
  • Consensus trees, subtrees, supertrees, distances between trees
  • Tree-based alignment
  • Gene duplication and genomic analysis
  • Biogeographic analysis and host-parasite comparison
  • Comparative method analysis
  • Simulation of trees or data
  • Examination of shapes of trees
  • Clocks, dating and stratigraphy
  • Model Selection
  • Description or prediction of data from trees
  • Tree plotting/drawing
  • Sequence management/job submission
  • Teaching about phylogenies
  • Web or e-mail servers that can analyze data for you
  • PHYLIP
  • PAUP*
  • MEGA
  • Phylo_win
  • ARB
  • DAMBE
  • PAL
  • Bionumerics
  • Mesquite
  • PaupUp
  • BIRCH
  • Bosque
  • EMBOSS
  • phangorn
  • Bio++
  • ETE
  • DendroPy
  • SeaView
  • Crux
  • PHYLIP
  • PAUP*
  • Hennig86
  • MEGA
  • RA
  • NONA
  • CAFCA
  • PHYLIP
  • Phylo_win
  • sog
  • gmaes
  • LVB
  • GeneTree
  • ARB
  • DAMBE
  • MALIGN
  • POY
  • Gambit
  • TNT
  • GelCompar II
  • Bionumerics
  • Network
  • TCS
  • GAPars
  • CRANN
  • Mesquite
  • PAST
  • FootPrinter
  • BPAnalysis
  • Simplot
  • Parsimov
  • NimbleTree
  • PaupUp
  • Notung
  • BIRCH
  • IDEA
  • PSODA
  • PRAP
  • SeqState
  • Bosque
  • PhyloNet
  • EMBOSS
  • phangorn
  • Murka
  • Freqpars
  • SeaView
  • PAUPRat
  • PHYLIP
  • PAUP*
  • MEGA
  • MacT
  • ODEN
  • TREECON
  • DISPAN
  • RESTSITE
  • NTSYSpc
  • METREE
  • GDA
  • SeqPup
  • PHYLTEST
  • Lintre
  • Phylo_win
  • POPTREE2
  • Gambit
  • gmaes
  • DENDRON
  • BIONJ
  • TFPGA
  • MVSP
  • ARB
  • Darwin
  • T-REX
  • sendbs
  • nneighbor
  • DAMBE
  • weighbor
  • DNASIS
  • MINSPNET
  • PAL
  • Arlequin
  • PEBBLE
  • HY-PHY
  • Vanilla
  • GelCompar II
  • Bionumerics
  • qclust
  • TCS
  • Populations
  • Winboot
  • SYN-TAX
  • PTP
  • SplitsTree
  • FastME
  • APE
  • MacVector
  • QuickTree
  • Simplot
  • ProfDist
  • START2
  • STC
  • NimbleTree
  • CBCAnalyzer
  • PaupUp
  • Geneious
  • BIRCH
  • SEMPHY
  • FASTML
  • Rate4Site
  • SWORDS
  • IDEA
  • FAMD
  • Bosque
  • GAME
  • Bioinformatics_Toolbox
  • TreeFit
  • EMBOSS
  • phangorn
  • PC-ORD
  • Bio++
  • UGENE
  • NINJA
  • SeaView
  • Statio
  • TIMER
  • Crux
  • Ancestor
  • ANC-GENE
  • Bn-Bs
  • PHYLIP
  • PAUP*
  • RAPDistance
  • MULTICOMP
  • Microsat
  • DIPLOMO
  • OSA
  • DISPAN
  • RESTSITE
  • NTSYSpc
  • TREE-PUZZLE
  • GCUA
  • DERANGE2
  • POPGENE
  • TFPGA
  • REAP
  • MVSP
  • RSTCALC
  • Genetix
  • DISTANCE
  • Darwin
  • sendbs
  • Arlequin
  • DAMBE
  • DnaSP
  • PAML
  • puzzleboot
  • PAL
  • Vanilla
  • GelCompar II
  • Bionumerics
  • qclust
  • Populations
  • Winboot
  • FSTAT
  • SYN-TAX
  • Phylo_win
  • Phyltools
  • MSA
  • APE
  • YCDMA
  • NSA
  • T-REX
  • LDDist
  • DIVAGE
  • Genepop
  • START2
  • Swaap
  • Swaap PH
  • SPAGeDi
  • CBCAnalyzer
  • PaupUp
  • SEMPHY
  • SWORDS
  • rRNA phylogeny
  • FAMD
  • GAME
  • Bioinformatics_Toolbox
  • GenoDive
  • analysis
  • TreeFit
  • EMBOSS
  • Murka
  • Bio++
  • UGENE
  • POPTREE2
  • DISTREE
  • SeaView
  • Crux
  • Bn-Bs
  • HON-new
  • PHYLIP
  • PAUP*
  • fastDNAml
  • MOLPHY
  • PAML
  • Spectrum
  • SplitsTree
  • TREE-PUZZLE
  • SeqPup
  • Phylo_win
  • PASSML
  • ARB
  • Darwin
  • Modeltest
  • DAMBE
  • PAL
  • dnarates
  • HY-PHY
  • Vanilla
  • DT-ModSel
  • Bionumerics
  • fastDNAmlRev
  • RevDNArates
  • rate-evolution
  • CONSEL
  • EDIBLE
  • PLATO
  • Mesquite
  • PTP
  • Treefinder
  • MetaPIGA
  • RAxML
  • PHYML
  • r8s-bootstrap
  • MrMTgui
  • MrModeltest
  • BootPHYML
  • PARBOOT
  • p4
  • Porn*
  • SIMMAP
  • Spectronet
  • Rhino
  • TipDate
  • ProtTest
  • ModelGenerator
  • Simplot
  • MrAIC
  • Modelfit
  • IQPNNI
  • PARAT
  • ALIFRITZ
  • PhyNav
  • DPRML
  • MultiPhyl
  • NimbleTree
  • PaupUp
  • SSA
  • CoMET
  • BIRCH
  • Mac5
  • Kakusan4
  • GARLI
  • PHYSIG
  • SEMPHY
  • FASTML
  • Rate4Site
  • aLRT
  • McRate
  • EREM
  • PROCOV
  • DART
  • PhyloCoCo
  • PRAP
  • SeqState
  • Leaphy
  • NHML
  • SLR
  • rRNA phylogeny
  • Bosque
  • Concaterpillar
  • PHYLLAB
  • NEPAL
  • EMBOSS
  • CodeAxe
  • phangorn
  • Bio++
  • FastTree
  • nhPhyML
  • PhyML-Multi
  • Segminator
  • raxmlGUI
  • MixtureTree
  • SeaView
  • GZ-Gamma
  • PAUPRat
  • Crux
  • PAML
  • BAMBE
  • PAL
  • Vanilla
  • MrBayes
  • Mesquite
  • PHASE
  • BEAST
  • MrBayes tree scanners
  • p4
  • SIMMAP
  • IMa2
  • BAli-Phy
  • BayesPhylogenies
  • MrBayesPlugin
  • PhyloBayes
  • PHASE
  • Cadence
  • Multidivtime
  • BEST
  • AMBIORE
  • PHYLLAB
  • bms_runner
  • tracer
  • burntrees
  • Bio++
  • Crux
  • ANC-GENE
  • TREE-PUZZLE
  • SplitsTree
  • PHYLTEST
  • GEOMETRY
  • PICA
  • Darwin
  • PhyloQuart
  • Willson quartets programs
  • Gambit
  • IQPNNI
  • STC
  • Quartet Suite
  • LEVEL2
  • Bosque
  • MacClade
  • PHYLIP
  • PDAP
  • TreeTool
  • ARB
  • WINCLADA
  • TreeEdit
  • TreeExplorer
  • TreeThief
  • RadCon
  • Mavric
  • T-REX
  • EDIBLE
  • Mesquite
  • Treefinder
  • TreeView
  • TreeJuxtaposer
  • TreeMe
  • ArboDraw
  • Notung
  • TreeDyn
  • TreeMaker
  • MESA
  • BIRCH
  • SimpleClade
  • Dendroscope
  • Forest
  • Phylocom
  • TreeToy
  • Bioinformatics_Toolbox
  • Phyutility
  • EMBOSS
  • PhyloWidget
  • Phybase
  • ETE
  • TreeGraph 2
  • Crux
  • RecPars
  • TOPALi
  • partimatrix
  • Network
  • TCS
  • T-REX
  • PLATO
  • PPH
  • Spectronet
  • IMa2
  • Simplot
  • START2
  • Likewind
  • DualBrothers
  • cBrother
  • EEEP
  • HGT
  • PhyloNet
  • EvolSimulator
  • Concaterpillar
  • PhyML-Multi
  • RDP3
  • PHYLIP
  • PAUP*
  • Random Cladistics
  • AutoDecay
  • TreeRot
  • DNA Stacks
  • OSA
  • DISPAN
  • PHYLTEST
  • Lintre
  • sog
  • POPTREE2
  • MEGA
  • PARBOOT
  • PICA
  • ModelTest
  • TAXEQ3
  • BAMBE
  • DAMBE
  • puzzleboot
  • CodonBootstrap
  • Gambit
  • PAL
  • MrBayes
  • CONSEL
  • Populations
  • LVB
  • EDIBLE
  • Winboot
  • Mesquite
  • Phylo_win
  • PAST
  • Treefinder
  • RAxML
  • Phyltools
  • PHASE
  • PHYML
  • BEAST
  • r8s-bootstrap
  • MrBayes tree scanners
  • T-REX
  • MrMTgui
  • MrModeltest
  • BootPHYML
  • Porn*
  • ProtTest
  • ModelGenerator
  • Simplot
  • Permute!
  • ELW
  • MultiPhyl
  • GHOSTS
  • PaupUp
  • Geneious
  • BIRCH
  • BayesPhylogenies
  • scaleboot
  • aLRT
  • PhyloBayes
  • SWORDS
  • CTree
  • PRAP
  • FAMD
  • PhyloNet
  • PHYLLAB
  • Phyutility
  • EMBOSS
  • FastTree
  • Bionumerics
  • tracer
  • burntrees
  • Bio++
  • raxmlGUI
  • MixtureTree
  • COMPROB
  • PHYLIP
  • PICA
  • partimatrix
  • PPH
  • Spectronet
  • BIRCH
  • EMBOSS
  • Murka
  • Phybase
  • TIGER
  • COMPONENT
  • TREEMAP
  • NTSYSpc
  • PHYLIP
  • PAUP*
  • REDCON
  • TAXEQ3
  • MEGA
  • RadCon
  • Mesquite
  • PAST
  • Treefinder
  • Robinson and Foulds distance
  • Clann
  • PhyNav
  • SuperTree
  • Supertree scripts
  • PaupUp
  • Supertree
  • BIRCH
  • HeuristicMRF2
  • TopD/fMts
  • Quartet Suite
  • Rainbow
  • PhySIC_IST
  • EEEP
  • PhyloSort
  • FAMD
  • Phyutility
  • EMBOSS
  • phangorn
  • Murka
  • RAxML
  • Phybase
  • iGTP
  • raxmlGUI
  • Crux
  • TreeAlign
  • ClustalW
  • MALIGN
  • GeneDoc
  • DAMBE
  • POY
  • ALIGN
  • DNASIS
  • FootPrinter
  • ALIFRITZ
  • T-Coffee
  • ArboDraw
  • BAli-Phy
  • Geneious
  • BIRCH
  • MAFFT
  • LOBSTER
  • DART
  • MUSCLE
  • Bosque
  • EMBOSS
  • SeaView
  • SuiteMSA
  • DERANGE2
  • FORESTER
  • BPAnalysis
  • Notung
  • TopD/fMts
  • DTscore
  • DART
  • gtp
  • EvolSimulator
  • DTdraw
  • Mgenome
  • Tree Tracker
  • bms_runner
  • MANTiS
  • ETE
  • MANTiS
  • iGTP
  • COMPONENT
  • TREEMAP
  • DIVA
  • TreeFitter
  • GEODIS
  • Tarzan
  • ParaFit
  • Phylocom
  • AxParafit
  • GenGIS
  • S-DIVA
  • Lagrange
  • CoRe-PA
  • Jane
  • PHYLIP
  • CAIC
  • COMPARE
  • PDAP
  • ANCML
  • RIND
  • MacroCAIC
  • Phylogenetic Independence
  • MacClade
  • Mesquite
  • APE
  • Jevtrace
  • SIMMAP
  • PHYLOGR
  • TreeSAAP
  • Permute!
  • Parsimov
  • DIVERGE
  • IDC
  • OUCH
  • BIRCH
  • BayesTraits
  • PHYSIG
  • Cactus-Pie
    • Phylocom
    • pcca
    • EMBOSS
    • bms_runner
    • SLOUCH
    • Phybase
    • COMPONENT
    • Seq-Gen
    • Treevolve and PTreevolve
    • PSeq-Gen
    • COMPARE
    • ROSE
    • PAML
    • ProSeq
    • PAL
    • Vanilla
    • MacClade
    • EDIBLE
    • Mesquite
    • Treefinder
    • Network
    • Phylogen
    • MESA
    • SGRunner
    • apTreeshape
    • Simprot
    • EREM
    • indel-Seq-Gen
    • DAWG
    • EvolveAGene3
    • EvolSimulator
    • Bio::Phylo
    • Recodon
    • NetRecodon
    • SuiteMSA
    • Crux
    • MacroCAIC
    • Genie
    • PAL
    • Vanilla
    • RadCon
    • BRANCHLENGTH
    • APE
    • Tracer
    • SymmeTREE
    • TreeScan
    • MESA
    • apTreeshape
    • TreeStat
    • CTree
    • Phyutility
    • laser
    • PhyRe
    • Bio::Phylo
    • PHYLIP
    • QDate
    • Modeltest
    • PAML
    • RRTree
    • PEBBLE
    • TreeEdit
    • HY-PHY
    • MEGA
    • PAL
    • rate-evolution
    • BRANCHLENGTH
    • r8s
    • PAST
    • Treefinder
    • Network
    • APE
    • BEAST
    • MrMTgui
    • MrModeltest
    • SymmeTREE
    • Porn*
    • TipDate
    • Rhino
    • GHOSTS
    • Cadence
    • Multidivtime
    • CodonRates
    • BIRCH
    • Brownie
    • PATHd8
    • McRate
    • PhyloBayes
    • Cactus-Pie
    • GRate
    • NHML
    • TreeFit
    • PHYLTEST
    • Murka
    • TIMER
    • Modeltest
    • MrMTgui
    • MrModeltest
    • Porn*
    • ModelGenerator
    • ProtTest
    • MrAIC
    • Modelfit
    • DT-ModSel
    • BayesTraits
    • Kakusan4
    • MAPPS
    • DART
    • Concaterpillar
    • Statio
    • jMODELTEST
    • PHYLIP
    • PAUP*
    • TreeTool
    • TreeView
    • NJplot
    • DendroMaker
    • Tree Draw Deck
    • Phylodendron
    • ARB
    • unrooted
    • DAMBE
    • TREECON
    • Mavric
    • TreeExplorer
    • TreeThief
    • Bionumerics
    • FORESTER
    • MacClade
    • MEGA
    • Mesquite
    • Phylogenetic Tree Drawing
    • APE
    • T-REX
    • TreeJuxtaposer
    • Spectronet
    • TreeSetViz
    • TreeGraph 2
    • ArboDraw
    • PaupUp
    • Notung
    • TreeDyn
    • DigTree
    • Geneious
    • BIRCH
    • Paloverde
    • MrEnt
    • FigTree
    • HyperTree
    • GeoPhyloBuilder
    • Dendroscope
    • CTree
    • TreeToy
    • TreeSnatcher Plus
    • DTdraw
    • PHYLLAB
    • Bioinformatics_Toolbox
    • EMBOSS
    • PhyloWidget
    • GenGIS
    • Bio++
    • Bio::Phylo
    • S-DIVA
    • UGENE
    • ETE
    • POPTREE2
    • Segminator
    • MixtureTree
    • SeaView
    • Archaeopteryx
    • SuiteMSA
    • Random Cladistics
    • GDE
    • MUST 2000
    • DNA Stacks
    • SeqPup
    • PARBOOT
    • ARB
    • DAMBE
    • BioEdit
    • Singapore PHYLIP web interface
    • Bionumerics
    • W2H
    • Phyledit
    • GeneStudio Pro
    • Simplot
    • DPRML
    • NimbleTree
    • Geneious
    • BIRCH
    • TOPALi
    • MBEToolbox
    • PISE
    • Bosque
    • Bioinformatics_Toolbox
    • EMBOSS
    • PyCogent
    • PHYDIT
    • Segminator
    • SeaView
    • SuiteMSA

    Table of contents by computer systems on which they work

    • Unix (source code in C or executables). I have included programs that are available as C source code because most Unix workstations have a C compiler. Programs in other compiled languages such as FORTRAN and Pascal, and in interpreted languages such as Java, Perl, Python, or R, are also included, as are Java executables. Many of these programs can also be compiled or run on Windows or Mac OS X systems if the appropriate compilers or interpreters are installed.
      • PHYLIP
      • PAUP*
      • Phylo_win
      • ODEN
      • SeqPup
      • Lintre
      • Microsat
      • OSA
      • TREE-PUZZLE
      • fastDNAml
      • MOLPHY
      • PAML
      • SplitsTree
      • PHYLTEST
      • TreeAlign
      • ClustalW
      • MALIGN
      • GeneDoc
      • COMPARE
      • Seq-Gen
      • TreeTool
      • GDE
      • sog
      • Phylodendron
      • Treevolve and PTreevolve
      • PSeq-Gen
      • POPTREE2
      • gmaes
      • GCUA
      • DERANGE2
      • LVB
      • BIONJ
      • ANCML
      • QDate
      • PASSML
      • TOPALi
      • RecPars
      • PARBOOT
      • ARB
      • DISTANCE
      • Darwin
      • sendbs
      • partimatrix
      • BAMBE
      • nneighbor
      • unrooted
      • ROSE
      • weighbor
      • PhyloQuart
      • puzzleboot
      • Willson quartets programs
      • POY
      • RIND
      • RRTree
      • Mavric
      • dnarates
      • Arlequin
      • HY-PHY
      • Genie
      • Vanilla
      • qclust
      • fastDNAmlRev
      • RevDNArates
      • BRANCHLENGTH
      • TCS
      • CONSEL
      • FORESTER
      • Populations
      • T-REX
      • MrBayes
      • W2H
      • GAPars
      • EDIBLE
      • r8s
      • Mesquite
      • Treefinder
      • PPH
      • PLATO
      • MetaPIGA
      • FastME
      • MSA
      • Phylogenetic Tree Drawing
      • APE
      • PHASE
      • PHYML
      • BEAST
      • TreeView
      • r8s-bootstrap
      • MrBayes tree scanners
      • Robinson and Foulds distance
      • Clann
      • Jevtrace
      • MrMTgui
      • MrModeltest
      • BootPHYML
      • SymmeTREE
      • TreeJuxtaposer
      • LDDist
      • p4
      • Porn*
      • TNT
      • Phylogen
      • Rhino
      • TipDate
      • Phylap
      • Dnatree
      • QuickTree
      • IMa2
      • FootPrinter
      • BPAnalysis
      • ProtTest
      • GEODIS
      • TreeScan
      • TreeSetViz
      • ModelGenerator
      • PHYLOGR
      • ProfDist
      • MrAIC
      • Modelfit
      • IQPNNI
      • PARAT
      • ALIFRITZ
      • PhyNav
      • STC
      • TreeSAAP
      • Likewind
      • ELW
      • TreeGraph 2
      • Supertree scripts
      • Parsimov
      • Bosque
      • DIVERGE
      • T-Coffee
      • CBCAnalyzer
      • GHOSTS
      • Tarzan
      • DT-ModSel
      • DualBrothers
      • apTreeshape
      • Multidivtime
      • Mgenome
      • ParaFit
      • IDC
      • TreeMaker
      • CodonRates
      • BAli-Phy
      • Supertree
      • OUCH
      • TreeDyn
      • DigTree
      • Geneious
      • BIRCH
      • Brownie
      • Mac5
      • BayesPhylogenies
      • BayesTraits
      • Paloverde
      • HeuristicMRF2
      • CRANN
      • Kakusan4
      • PATHd8
      • MAFFT
      • GARLI
      • TreeStat
      • FigTree
      • PHYSIG
      • scaleboot
      • cBrother
      • RAxML
      • MrBayesPlugin
      • LOBSTER
      • SEMPHY
      • FASTML
      • Rate4Site
      • TopD/fMts
      • Quartet Suite
      • Rainbow
      • McRate
      • HyperTree
      • PhyloBayes
      • Cactus-Pie
      • SWORDS
      • Dendroscope
      • Forest
      • Phylocom
      • PhySIC_IST
      • Simprot
      • BEST
      • pcca
      • EREM
      • indel-Seq-Gen
      • MBEToolbox
      • DTscore
      • PROCOV
      • DART
      • EEEP
      • DAWG
      • LEVEL2
      • PSODA
      • PhyloSort
      • PISE
      • MUSCLE
      • AMBIORE
      • CTree
      • PRAP
      • HGT
      • NHML
      • SLR
      • rRNA phylogeny
      • AxParafit
      • EvolveAGene3
      • gtp
      • TreeToy
      • TreeSnatcher Plus
      • EvolSimulator
      • Concaterpillar
      • GAME
      • DTdraw
      • NEPAL
      • PHYLLAB
      • Bioinformatics_Toolbox
      • Tree Tracker
      • analysis
      • CodeAxe
      • Phyutility
      • EMBOSS
      • phangorn
      • FastTree
      • PhyloWidget
      • laser
      • bms_runner
      • tracer
      • burntrees
      • SLOUCH
      • Murka
      • MANTiS
      • Freqpars
      • GenGIS
      • CONSERVE
      • Bio++
      • UGENE
      • DISTREE
      • Phybase
      • ETE
      • PyCogent
      • DendroPy
      • CAIC
      • NINJA
      • MUST
      • nhPhyML
      • PhyML-Multi
      • Segminator
      • iGTP
      • Bio::Phylo
      • Recodon
      • NetRecodon
      • Lagrange
      • CoRe-PA
      • MixtureTree
      • TIGER
      • SeaView
      • Jane
      • GZ-Gamma
      • PAUPRat
      • Archaeopteryx
      • SuiteMSA
      • Crux
      • Ancestor
      • ANC-GENE
      • Bn-Bs
      • HON-new

      (I am just starting to list interpreter code here. Until recently it was listed under Unix, Windows and/or Mac OS X. Until I finish transferring interpreter code here, this list will be incomplete, and you will find many programs written in interpreted languages there.)

      • PAL
      • Mesquite
      • BIRCH
      • PRAP
      • SeqState
      • TCS
      • IDEA
      • PhyloNet
      • Notung
      • Vanilla
      • NINJA
      • qclust
      • PEBBLE
      • SplitsTree
      • SeqPup
      • DPRML
      • MultiPhyl
      • Treefinder
      • PhyloCoCo
      • ProtTest
      • CoMET
      • Segminator
      • jMODELTEST
      • Windows executables (not counting programs that run in a "DOS box"). Programs available as Windows-specific source code are listed here. Java executables are also included. Note that compilers available on Windows systems, particularly the free Cygwin and MinGW compilers, can also be used to compile many of the programs listed above under Unix generic source code. Programs run in interpreted environments such as Perl, Python, R or MATLAB can also be run under Windows if the proper environment is installed; these programs are listed above under Unix.
        • PHYLIP
        • PAUP*
        • TREECON
        • GDA
        • SeqPup
        • MOLPHY
        • GeneDoc
        • COMPONENT
        • TREEMAP
        • COMPARE
        • RAPDistance
        • TreeView
        • Phylodendron
        • POPGENE
        • TFPGA
        • GeneTree
        • MVSP
        • RSTCALC
        • Genetix
        • NJplot
        • unrooted
        • Arlequin
        • DAMBE
        • DnaSP
        • PAML
        • DNASIS
        • MINSPNET
        • BioEdit
        • ProSeq
        • WINCLADA
        • NONA
        • Phylogenetic Independence
        • HY-PHY
        • TreeExplorer
        • Genie
        • MEGA
        • TNT
        • GelCompar II
        • Bionumerics
        • FORESTER
        • Populations
        • T-REX
        • MrBayes
        • EDIBLE
        • Winboot
        • r8s
        • Mesquite
        • Phyledit
        • SYN-TAX
        • PTP
        • DIVA
        • TreeFitter
        • Phylo_win
        • PAST
        • GeneStudio Pro
        • Treefinder
        • PPH
        • MetaPIGA
        • Phyltools
        • MSA
        • Mgenome
        • APE
        • PHASE
        • PHYML
        • YCDMA
        • NSA
        • BEAST
        • Clann
        • Jevtrace
        • MrMTgui
        • MrModeltest
        • SymmeTREE
        • TreeJuxtaposer
        • Network
        • Spectronet
        • Phylogen
        • Phylap
        • Dnatree
        • IMa2
        • ProtTest
        • GEODIS
        • TreeSetViz
        • TreeMe
        • ModelGenerator
        • Simplot
        • PHYLOGR
        • ProfDist
        • START2
        • IQPNNI
        • STC
        • TreeSAAP
        • Swaap
        • Swaap PH
        • TreeGraph 2
        • DIVERGE
        • MESA
        • NimbleTree
        • ArboDraw
        • SPAGeDi
        • CBCAnalyzer
        • DualBrothers
        • PaupUp
        • SSA
        • Multidivtime
        • ParaFit
        • IDC
        • TreeMaker
        • CodonRates
        • BAli-Phy
        • TreeDyn
        • DigTree
        • Geneious
        • Brownie
        • Mac5
        • BayesPhylogenies
        • BayesTraits
        • MrEnt
        • SimpleClade
        • CRANN
        • PATHd8
        • MAFFT
        • GARLI
        • TreeStat
        • FigTree
        • MrBayesPlugin
        • LOBSTER
        • SEMPHY
        • FASTML
        • Rate4Site
        • Quartet Suite
        • Rainbow
        • aLRT
        • McRate
        • HyperTree
        • SWORDS
        • GeoPhyloBuilder
        • Dendroscope
        • Phylocom
        • TOPALi
        • Simprot
        • BEST
        • pcca
        • EREM
        • MBEToolbox
        • DTscore
        • EEEP
        • LEVEL2
        • PSODA
        • PhyloSort
        • RAxML
        • MUSCLE
        • CTree
        • PRAP
        • Leaphy
        • GRate
        • SLR
        • rRNA phylogeny
        • FAMD
        • AxParafit
        • DTscore
        • PROCOV
        • DART
        • EEEP
        • CONSERVE
        • DAWG
        • EvolveAGene3
        • Bosque
        • TreeToy
        • TreeSnatcher Plus
        • GAME
        • TreeFit
        • FastTree
        • PhyloWidget
        • tracer
        • Murka
        • MANTiS
        • PC-ORD
        • GenGIS
        • Bio++
        • S-DIVA
        • UGENE
        • Phybase
        • ETE
        • Cactus-Pie
        • PHYDIT
        • POPTREE2
        • RDP3
        • PhyRe
        • Segminator
        • iGTP
        • Recodon
        • NetRecodon
        • CoRe-PA
        • TIGER
        • SeaView
        • Jane
        • TIMER
        • HON-new
        • PHYLIP
        • PAUP*
        • MEGA
        • Hennig86
        • MEGA
        • RA
        • NONA
        • TREECON
        • Microsat
        • DISPAN
        • RESTSITE
        • NTSYSpc
        • METREE
        • PHYLTEST
        • RAPDistance
        • DIPLOMO
        • TREE-PUZZLE
        • ClustalW
        • MALIGN
        • GeneDoc
        • Random Cladistics
        • POPTREE2
        • GEOMETRY
        • PDAP
        • PICA
        • REDCON
        • TAXEQ3
        • BIONJ
        • ANCML
        • REAP
        • MVSP
        • Lintre
        • T-REX
        • sendbs
        • weighbor
        • POY
        • TreeDis
        • Network
        • Gambit
        • CONSEL
        • LVB
        • FSTAT
        • SYN-TAX
        • FastME
        • MSA
        • QDate
        • Robinson and Foulds distance
        • DIVAGE
        • BPAnalysis
        • TreeScan
        • Genepop
        • Kakusan4
        • DISTREE
        • GZ-Gamma
        • PAUPRat
        • Statio
        • Ancestor
        • ANC-GENE
        • Bn-Bs
        • PHYLIP
        • PAUP*
        • MacT
        • SeqPup
        • Microsat
        • TREE-PUZZLE
        • fastDNAml
        • MacClade
        • Spectrum
        • AutoDecay
        • CAFCA
        • ClustalW
        • TREEMAP
        • CAIC
        • COMPARE
        • Seq-Gen
        • CONSERVE
        • TreeView
        • NJplot
        • DendroMaker
        • MUST
        • DNA Stacks
        • Phylogenetic Investigator
        • Tree Draw Deck
        • Phylodendron
        • TreeRot
        • Treevolve and PTreevolve
        • PSeq-Gen
        • BIONJ
        • GCUA
        • GeneTree
        • QDate
        • LVB
        • T-REX
        • unrooted
        • COMPONENT Lite
        • weighbor
        • Modeltest
        • PAML
        • Willson quartets programs
        • ALIGN
        • CodonBootstrap
        • DNASIS
        • PLATO
        • MacroCAIC
        • RadCon
        • TreeEdit
        • Arlequin
        • HY-PHY
        • TreeThief
        • Genie
        • MrBayes
        • FORESTER
        • r8s
        • GDA
        • Mesquite
        • SYN-TAX
        • DIVA
        • TreeFitter
        • Phylo_win
        • Treefinder
        • PPH
        • MetaPIGA
        • MSA
        • PHYML
        • BEAST
        • Robinson and Foulds distance
        • Clann
        • Jevtrace
        • MrModeltest
        • BootPHYML
        • SymmeTREE
        • TreeJuxtaposer
        • MacVector
        • SIMMAP
        • TNT
        • Phylogen
        • Rhino
        • TipDate
        • Phylap
        • Dnatree
        • ProtTest
        • GEODIS
        • TreeScan
        • TreeSetViz
        • ModelGenerator
        • PHYLOGR
        • ProfDist
        • IQPNNI
        • TreeSAAP
        • Permute!
        • MESA
        • SGRunner
        • DualBrothers
        • Cadence
        • ParaFit
        • TreeMaker
        • BAli-Phy
        • Supertree
        • TreeDyn
        • Geneious
        • Brownie
        • Mac5
        • BayesPhylogenies
        • BayesTraits
        • Paloverde
        • CRANN
        • MAPPS
        • PATHd8
        • MAFFT
        • GARLI
        • TreeStat
        • FigTree
        • MrBayesPlugin
        • SEMPHY
        • Quartet Suite
        • Rainbow
        • HyperTree
        • Kakusan4
        • SWORDS
        • Dendroscope
        • Phylocom
        • PhySIC_IST
        • TOPALi
        • BEST
        • pcca
        • indel-Seq-Gen
        • PhyloCoCo
        • LEVEL2
        • PSODA
        • PhyloSort
        • RAxML
        • MUSCLE
        • CTree
        • PRAP
        • SLR
        • AxParafit
        • EvolveAGene3
        • TreeToy
        • TreeSnatcher Plus
        • GAME
        • GenoDive
        • PhyloWidget
        • tracer
        • MANTiS
        • GenGIS
        • Bio++
        • UGENE
        • Phybase
        • ETE
        • PyCogent
        • DendroPy
        • NEPAL
        • Bosque
        • TreeGraph 2
        • Segminator
        • iGTP
        • Bio::Phylo
        • Recodon
        • Lagrange
        • NetRecodon
        • CoRe-PA
        • TIGER
        • SeaView
        • Jane
        • PAUPRat
        • SuiteMSA

        Analyzing particular types of data

        Here you will find lists of programs that analyze types of data other than molecular sequence data. We will gradually expand this list of data types.

        • RSTCALC
        • POPTREE2
        • Microsat
        • Populations
        • MSA
        • YCDMA
        • Network
        • IMa2
        • Arlequin
        • TFPGA
        • RAPDistance
        • GelCompar II
        • Bionumerics
        • Winboot
        • REAP
        • RESTSITE
        • MVSP
        • DENDRON
        • Phyltools
        • Network
        • BIRCH
        • FAMD
        • EMBOSS
        • PHYLIP
        • MacClade
        • Mesquite
        • ANCML
        • COMPARE
        • PDAP
        • Phylogenetic Independence
        • APE
        • CAIC
        • TreeScan
        • PHYLOGR
        • IDC
        • CoMET
        • OUCH
        • Brownie
        • BayesTraits
        • TNT
        • PHYSIG
        • Cactus-Pie
        • Phylocom
        • pcca
        • EMBOSS
        • Permute!
        • SIMMAP
        • SLOUCH
        • PC-ORD
        • PHYLIP
        • DAMBE
        • Freqpars
        • DISPAN
        • GDA
        • POPGENE
        • YCDMA
        • FSTAT
        • Arlequin
        • DnaSP
        • APE
        • DIVAGE
        • POPTREE2
        • Genepop
        • SPAGeDi
        • GenoDive
        • TreeFit
        • EMBOSS

        (under construction: more coming soon)

        Here are the packages that have most recently been added to these listings: (the most recent ones first). Entries are retained in this list for 6 months. Note also below the "waiting list" area listing programs that are to be added. You can use the submission form here to submit new entries.

        • (List is currently empty because I have been unable to do much updating owing to other pressures, so no new packages have been added in the last year.)

        Here are the packages whose entries have most recently been changed: The date on which each change was entered is shown. Entries are retained in this list for 6 months. (Note that changes may be as small as updated version numbers or a modified web address). The most recent changes are first.

        Other lists of phylogeny software

        • There is one phylogeny software list even more complete and up-to-date than this one: a more recent version of this list. If you are reading this on the web pages at our server evolution.gs.washington.edu , you are reading the most up-to-date version. But if you are reading a version stored anywhere else, you might want to look here instead.
        • Sergios-Orestis Kolokotronis has posted an extensive table of phylogeny programs at his site at the American Museum of Natural History, and near it are others under headings such as "molecular evolution" and "alignment".
        • Wikipedia has a good list of sequence alignment software (including both tree-based and non-tree-based alignment methods) here at http://en.wikipedia.org/wiki/List_of_sequence_alignment_software
        • David Robertson of the Bioinformatics Education and Research at the University of Manchester, England, maintains a very informative web site listing programs (and their web sites) that test for the presence of recombination or hybridization events in DNA sequence data. It lists some programs that are covered here, and others that are outside the scope of these web pages. That site is located at http://bioinf.man.ac.uk/robertson/recombination/programs.shtml .
        • Mike Robeson at the University of Colorado maintains a page with multiple programs listed as Bioinformatics software for Mac OS X.
        • The Bioinformatics Organization, a nonprofit group in Hudson, Massachusetts, has posted the bioinformatics.org web pages. These offer a free membership and host open source software projects in bioinformatics. They also have a Molecular Linux listing of Linux programs to carry out bioinformatics tasks, which can be sorted by keywords.
        • On Wikipedia there is a List of phylogenetic tree visualization software at http://en.wikipedia.org/wiki/List_of_phylogenetic_tree_visualization_software
        • The University of California Museum of Paleontology page of Phylogenetics Software Resources at http://www.ucmp.berkeley.edu/subway/phylo/phylosoft.html . A few programs are listed, but there is a very nice list of software lists there.
        • Andrea Ramge, of Biomax Informatics AG, Martinsried, Germany has created the bioinformatik.de index of resources. It includes a list of software located at http://www.bioinformatik.de/cgi-bin/browse/Catalog/Software . The phylogeny programs listings there are located within the categories for different operating systems. The phylogeny software is under "Phylogenetic Analysis" within each operating system.
        • Richard Christen at the Université de Nice, France, has a list of Tree and Tree-software for visualisation and manipulations dealing with phylogenetic trees at http://bioinfo.unice.fr/biodiv/Tree_editors.html
        • Silvio Nihei at the University of São Paulo in Brazil has produced a list: Programas para Filogenia in Portuguese. It concentrates on a small number of programs that mostly use parsimony methods.

        New programs waiting to be added

        This is a "waiting list" showing links to the web pages of many new phylogeny programs, which I have not yet had time to add to the main listing. They will be listed there, with a single web link and no detailed explanation. I hope that this list will gradually shrink as the new programs are put into the main listing. You can use the submission form here to submit new entries.

        These are waiting to be added:

        • ADAPTSITE, intended to estimate positive and negative selection at a single amino acid site.
        • PhyRe infers adequacy of taxon sampling for phylogenetic studies.
        • phylobase is an R package that contains a class of functions for comparative methods, incorporating one or more trees and trait data.
        • PhyML-mixtures, a PhyML version for mixture of amino acid models (EX2, EX3, EHO, UL2, and UL3).
        • PhyD*, Fast NJ-like algorithms to deal with incomplete distance matrices.
        • SDM a fast distance-based approach for tree and supertree building in phylogenomics.
        • SSIMUL does speciation signal extraction from multigene families.
        • Clearcut carries out Relaxed Neighbor Joining (RNJ), a faster NJ-like distance method.
        • MP-EST (also described here) uses trees from different loci to infer a species tree by a pseudo-maximum-likelihood method.
        • TreeRogue, an R script for getting trees from published figures of them.
        • Serial NetEvolve simulation program evolves serially-sampled sequences with or without recombination.
        • Rococo reconstructs ancestral gene clusters for a multigene family on a given tree.
        • (27 May 2012) Adding one entry from the Waiting List each week, have caught up with submissions submitted through the web form, and am populating a new column in the cross-referenced table, one for multiplatform interpreted code such as Java, Perl, Python, R, and MatLab. The Java programs are entered in that column. After this I will put the others in it, then go back to the main Phylogeny Programs front web page and put in a software category for these interpreters. If you have a new program, or an old one that I don't list, don't wait for me to find it by myself -- I still don't have time to, so use the web submission form.
        • (19 March 2012) Well, I should have known that another quarter of teaching lay ahead. Now that is done and I should make gradual progress over the next two quarters.
        • (23 December 2011) Once again I got stalled by heavy teaching. Now resuming again and hope to gradually catch up. One puzzle is how to handle R packages. There are a great many of them, and most are not listed here yet. It simply is too much work for me to track them all down, figure out their features, and make an entry for each one. So I will put in only those whose authors use our submission form to help create the entry.
        • (20 August 2011) I have resumed adding new submissions that people sent in using our software submission forms -- there are about 10 waiting, some having waited for over 6 months. Apologies for that, I was busy teaching and desperately trying to write up old results. I hope to gradually add all 10 over the next month or so, one at a time.
        • (17 December 2010) I have started (on this 107th anniversary of the Wright Brothers' famous first flight) this News section of the page. The current status is that I have completed (over the past 2-3 years) a complete pass through the listings, updating them. However of course some may have become outdated since then. Ahead is adding new entries, of which I expect there to be 30-40. I am caught up on entries that were submitted by the web submission form.
        • (19 December 2010) I have finished adding the ones that were already in our Waiting List. Now to take some of the approximately 40 leads that I have accumulated and put entries for the relevant ones into the Waiting List.
        • (19 June 2014) Things have been stalled for some years but now I am gradually (and slowly) resuming adding and correcting entries. Please keep using the software submission form, and please be patient. I was stalled owing to needing to write grants, and owing to not getting them so I now have no programming assistance. Progress to being current will be slow. If you are impatient, how about volunteering to help? We could set up some web site accessible to a team.

        Mysteries you can help us solve

        • 3item extracts 3-item statements from "areagrams", whatever that means, but can only be accessed by joining their Yahoo group. So I don't really know what it does.
        • Dependency v2.1 uses Multiple Interdependency to detect functional interactions between amino acids in proteins. Does not seem to actually use a phylogeny.
        • Codep Maximizes co-evolutionary interdependencies to discover interacting proteins. Also does not seem to actually use a phylogeny.
        • PhyloGrapher shows clustering relationships between genes in a genome based on a distance matrix. But is it a phylogeny program?
        • Phylosopher commercial package for functional genomics said to include some phylogeny functions. Does it?
        • Phylogenator server displaying aligned sequences -- does it actually use or construct phylogenies?
        • MultiLocus calculates multiple-locus measures of population differentiation from population genetic data. But unless someone can show me that it can calculate a measure of distance between two populations within a data set, it does not seem appropriate for this list.
        • CIPRES-KEPLER Java-based framework for organizing workflow and submitting jobs. I am not sure whether any specifically phylogeny-based pieces have yet been supplied with this.



        Getting started with BEAST

        Downloading BEAST

        Introductory Tutorials

        As an introduction to using BEAST we provide some basic introductory tutorials using the graphical applications of BEAST to perform analyses using provided example files.

        Citing BEAST

        BEAST is descended from earlier work:

        Drummond AJ, Nicholls GK, Rodrigo AG & Solomon W (2002) Estimating mutation parameters, population history and genealogy simultaneously from temporally spaced sequence data. Genetics, 161, 1307-1320.

        Rambaut A (2000) Estimating the rate of molecular evolution: incorporating non-contemporaneous sequences into maximum likelihood phylogenies. Bioinformatics, 16, 395-399.

        Pybus OG & Rambaut A (2002) GENIE: estimating demographic history from molecular phylogenies. Bioinformatics, 18, 1404-1405.

        BEAST is built on a large body of prior work and appropriate citations for individual modules, models and components will be listed when BEAST is run.

        BEAST-Users mailing list

        Users are strongly advised to join the BEAST mailing-list. This will be used to announce new versions and advise users about bugs and problems.


        Evolutionary history of citrus revealed by most comprehensive study to date

        Citrus fruits -- delectable oranges, lemons, limes, kumquats and grapefruits -- are among the most important commercially cultivated fruit trees in the world, yet little is known of the origin of the citrus species and the history of its domestication.

        Now, Joaquin Dopazo et al., in a new publication in the journal Molecular Biology and Evolution, have performed the largest and most detailed genomic analysis of 30 species of Citrus, representing 34 citrus genotypes, and used chloroplast genomic data to reconstruct the genus's evolutionary history.

        Overall, the results confirm a monophyletic origin -- a single common ancestor that gave rise to all citrus fruit. Another result from the study was the remarkable level of heteroplasmy, or hybridization, observed; the authors showed that such events occurred frequently in Citrus evolution.

        The Citrus evolutionary tree is made of three main branches: the citron/Australian species, the pummelo/micrantha and the papeda/mandarins. The Citrus ancestors were generated in a succession of speciation events occurring between 7.5-6.3 Mya, followed by a second radiation (5.0-3.7 Mya) that separated citrons from Australian species, and finally, Micrantha from Pummelos and Papedas from mandarins. Further radiation of Fortunella, sour and sweet oranges, lemons, and mandarins took place later (1.5-0.2 Mya).

        On a finer scale, the group also identified 6 genes that may be general hotspots of natural genetic variation in Citrus. Advantageous mutations for adaptation were detected in 4 of these genes, matK, ndhF, ycf1 and ccsA. In particular, matK and ndhF were thought to help the Australian varieties adapt to hotter and drier climates while ccsA represents the emergence of mandarins.

        "This new phylogeny based on chloroplast genomes provides an accurate description of the evolution of the genus citrus and clears up years of ambiguities derived from previous proposals based on one or a few nuclear or chloroplast genes," said Dopazo.


        Three Versions of the Tree of Life

        I developed Lifemap, a tool largely inspired by the technology developed for cartography that is free of the limitation described above. Its approach differs from that of OneZoom [10], both in the representation of the tree and in the way images are displayed and interacted with on the screen. This allows a fast and smooth exploration of the biggest tree ever proposed on a single page for exploration.

        Lifemap uses a representation inspired by Treemaps, a method developed in the field of computer sciences in the early 90s for turning file system tree structures (directories in a computer) into a planar space-filling map [12] for clearer visualization. In Treemaps, directories are represented by rectangles that are recursively split in as many subrectangles as there are subdirectories, leading to a fully-filled map. The size of the rectangles is proportional to the number of files in each directory. This approach cannot be directly used to visualize the ToL because it is incompatible with the representations of links between the nodes, which is important in an evolutionary context where branches represent time and must be visible. I, however, kept the idea of filling the map recursively, using a base shape whose size is proportional (but with a square root transformation) to the number of elements in it. In Lifemap, this base shape is a half-circle, and the way these half-circles are arranged within and between each other (Fig 1) ensures that there is space to draw all the branches and guarantees that the branches never intersect (unlike other solutions, [13]).

        Each clade is represented by a half-circle whose size depends on the relative number of species in the clade as compared to its sister clades at a given level. Note that these proportions are not respected at the tree root where the three superkingdoms are arbitrarily given the same size. Computation of the size of each half-circle is based on the angle they are associated with (α and β on the first panel): if nA and nB are the number of species in clades A and B, respectively, the angles in degrees are computed as follows: $\alpha = 180\,\sqrt{n_A}/(\sqrt{n_A}+\sqrt{n_B})$ and $\beta = 180\,\sqrt{n_B}/(\sqrt{n_A}+\sqrt{n_B})$. The square root reduces the difference in half-circle sizes between very small and very large groups. At every level, the half-circles (clades) are randomly distributed within their parental half-circle.
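As a rough illustration of the square-root weighting described above, the sketch below splits a 180-degree half-circle between two sister clades. It assumes the angles are proportional to the square roots of the species counts; the function name `half_circle_angles` is hypothetical and is not taken from the Lifemap code.

```python
import math


def half_circle_angles(n_a, n_b):
    """Split 180 degrees between two sister clades.

    Assumption: each angle is proportional to the square root of the
    number of species in the clade, as the text describes.
    """
    root_a = math.sqrt(n_a)
    root_b = math.sqrt(n_b)
    total = root_a + root_b
    alpha = 180.0 * root_a / total
    beta = 180.0 * root_b / total
    return alpha, beta


# A clade of 100 species next to a clade with a single species.
alpha, beta = half_circle_angles(100, 1)
print(round(alpha, 2), round(beta, 2))
```

Without the square root, the single-species clade would get under 2 degrees; with it, the clade gets roughly 16 degrees, which is what keeps very small groups visible.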

        Lifemap comes in three versions that differ by the tree that is displayed and the information that is associated with tips and nodes when clicking. The general public version (Fig 2A) displays a reduced NCBI taxonomy obtained by removing nonidentified clades and all taxa below the species level. When clicking on nodes or tips, a short description and a picture are displayed (Fig 2B). Pictures and text are obtained from Wikipedia. If no Wikipedia page exists or if a picture is lacking, the user can click on a link to contribute to Wikipedia for these specific taxa by creating a page, modifying the text, and/or adding a picture. Lifemap should thus help identify missing pages and improve the quality and quantity of pages dedicated to clades and species in Wikipedia.

        (A) Example of the appearance of Lifemap when zooming to the primates order. (B) Example of information displayed in the general public version when clicking on a node. (C) Visualization of the path between two taxa. Lemur image credit: Mathias Appel, Flickr (https://flic.kr/p/FKtBbU).

        The NCBI version, named “Lifemap NCBI,” displays the whole NCBI taxonomy and is updated every week. When clicking on a node, the user can (i) get additional information about the current taxa (taxid, number of species), (ii) reach the NCBI web page corresponding to the node, and (iii) download the corresponding subtree in parenthetic format for further analysis. In this version, the user can also add a layer to the tree to visualize at each node the number of fully sequenced genomes.

        The third version is named “Lifemap OTOL.” It displays the latest OTOL synthetic tree and will be updated every time a new version is released. The information displayed when clicking on the nodes is similar to that available in the other two versions: Wikipedia picture and description, taxonomy code, possibility to download the subtree in parenthetic format, and taxonomic sources of information for each node. Other information that can be displayed as layers on Lifemap will be added to the different versions in the future in response to users' suggestions and requests.

        Finally, all three versions give the possibility to compute, visualize, and explore “paths” in the ToL (Fig 2C). This is done either by choosing a source and a destination taxon or by clicking the “view full ancestry” button associated with each node. In the latter case, the destination is set as the root of the tree. The path is computed instantly and highlighted on the tree. The most recent common ancestor (MRCA) is indicated with a marker, and the list of taxa encountered in the route from the source to the destination is returned.
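The path and MRCA computation can be sketched as follows. This is only a minimal illustration on a toy tree stored as a child-to-parent dictionary; the function names and the tiny taxonomy are hypothetical, and nothing here is claimed to reflect how Lifemap itself is implemented.

```python
def ancestry(parent, taxon):
    """Return the nodes from `taxon` up to the root, inclusive."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path


def path_between(parent, source, destination):
    """Return (mrca, route), where route runs from source to destination."""
    up = ancestry(parent, source)
    down = ancestry(parent, destination)
    down_index = {node: i for i, node in enumerate(down)}
    for i, node in enumerate(up):
        if node in down_index:
            # First shared ancestor met while climbing from the source.
            route = up[:i + 1] + down[:down_index[node]][::-1]
            return node, route
    raise ValueError("the two taxa do not share a root")


# Toy child -> parent map (illustrative only).
toy_parent = {
    "Homo": "Hominidae",
    "Pan": "Hominidae",
    "Hominidae": "Primates",
    "Lemur": "Primates",
    "Primates": "Mammalia",
}

mrca, route = path_between(toy_parent, "Homo", "Lemur")
print(mrca, route)
```

Setting the destination to the root ("Mammalia" in the toy tree) gives the equivalent of the “view full ancestry” case: the route is simply the list of ancestors of the source.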


        Homo habilis

        Homo habilis was discovered in Tanzania in the early 1960s by a group led by Louis and Mary Leakey, a married pair of paleoanthropologists. It was dubbed the “handyman” because it was thought to have made stone tools.

        This species lived in eastern and southern Africa between 2.4 million and 1.4 million years ago. Of the multiple species in our genus, Homo habilis is the least humanlike in its anatomy and most similar to apes, according to the Bradshaw Foundation.

        Scientists found that for nearly 500,000 years, Homo habilis lived alongside Homo erectus in eastern Africa, a prehistoric gathering of multiple species of the Homo group, presaging the period when Homo sapiens would cohabitate in Eurasia with Neanderthals and Denisovans.


        Obsolete Dawkinsian evidence for evolution

        It is with hesitation that I pen a blog post that could be construed as critical of Richard Dawkins FRS. Many members of this Nature Ecology & Evolution Community may have first come to understand the Darwinian mechanism through his lucid prose. His books have sold by their millions and feature on many an undergraduate reading list. School science teachers around the globe teach what they have learned from him. In the public imagination he is our greatest living evolutionary biologist.

        But for these reasons it is important to point out where he has erred. Or at least, where scientific progress has discredited his claims. Because of his wide influence, it is in the interests of the public understanding of science that any mistakes he has made should be explicitly corrected.

        In seeking to do so, I am encouraged by statements that Dawkins has often made about willingness of scientists to have their ideas disproven. With that in mind, I can have no doubt that he himself will welcome and seriously consider this post, should he happen upon it.

        My concern is that Richard Dawkins has made very public statements that, if taken to be true today, seriously misrepresent the field of phylogenetics in the era of whole genome sequencing.

        Take a look at this video hosted by the Richard Dawkins Foundation for Reason & Science YouTube Channel. In the video (8:40 minutes in), Dawkins is asked to name the single best piece of evidence for evolution. His response is to claim that phylogenetic analyses of different genes and pseudogenes each independently give us "the same family tree" for the species that carry them. This congruence between gene trees is "overwhelmingly strong evidence" for evolution - the only alternative being a deceptive creator.

        Dawkins makes the same claim more fully in his book The Greatest Show on Earth: The Evidence for Evolution (2009). He writes:

        "Comparative DNA (or protein) evidence can be used to decide - on the evolutionary assumption - which pairs of animals are closer cousins than which others. What turns this into extremely powerful evidence for evolution is that you can construct a tree of genetic resemblances separately for each gene in turn. And the important result is that every gene delivers approximately the same tree of life. Once again, this is exactly what you would expect if you were dealing with a true family tree. It is not what you expect if a designer had surveyed the whole of the animal kingdom and picked and chosen - or 'borrowed' - the best proteins for the job, wherever in the animal kingdom they might be found." (pp. 321-322 emphasis added)

        To illustrate his point, he describes a study by David Penny et al. published in Nature in 1982 using sequence data for 5 proteins from 11 species. Dawkins claims that "All five proteins 'voted' for pretty much the same subset of trees from among the 34 million possible trees. What is more, the consensus tree that the five molecules all voted for turned out to be the same as zoologists had already worked out on anatomic and palaeontological grounds, not molecular grounds." (p. 324)
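For readers wondering where the "34 million" figure comes from: it matches the standard count of unrooted, fully bifurcating trees on n labelled tips, (2n-5)!!, evaluated at n = 11. The short sketch below is just that arithmetic; the function name is my own.

```python
def count_unrooted_binary_trees(n_taxa):
    """Number of unrooted bifurcating trees on n labelled tips: (2n - 5)!!"""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


print(count_unrooted_binary_trees(11))  # 34459425
```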

        He adds that "The intervening years have seen a prolific multiplication of detailed evidence on the exact sequences of genes of lots and lots of species of animals and plants. It is the consistency of agreement among all the different genes in the genome that gives us confidence, not only in the historical accuracy of the consensus tree itself, but also in the fact that evolution has occurred." (pp. 324-325 emphasis added)

        The lay-person reading this, or watching the video above, is given the clear impression that every gene or pseudogene in every living organism gives essentially the same phylogenetic tree, when analysed with its homologs from other species. This is simply not true.

        If this were true, then phylogeny building in the genomic era would be a walk in the park. But, as many of my readers will know from personal experience, it is not.

        If this were true, terms like horizontal gene transfer, incomplete lineage sorting, introgression, and molecular convergence would be rare curiosities in the genomic literature. But they are common (click on the links in the previous sentence to see searches for these terms on Google Scholar).

        If this were true, commonly-used phylogenetic software like ASTRAL, ASTRID and BUCKy, designed to deal with gene tree incongruence, would be seldom used. But they are used often.

        I hardly need to labour my point to the present audience. Dawkins' statements are simply wrong. Gloriously and utterly wrong. To promulgate this teaching is to do a disservice to the work of thousands of scientists working in the field of phylogenomics, who daily seek to make sense of incongruent gene trees.

        It is time for this argument to be retired, or, even better for the public understanding of science, retracted.


        Introduction

        From obscure beginnings, phylogenetics has become an essential tool for understanding molecular sequence variation. In the past decade, huge progress has been made in developing methods for inferring phylogenies and estimating divergence dates. This development has been characterized by increases, both in the complexity of the models used to describe molecular sequence evolution, and in the sophistication of the methods for analyzing these new models. Nevertheless, a well-known problem that has persistently troubled phylogenetic inference is that of substitution rate variation among lineages. In order to infer divergence dates, it is convenient to assume a constant rate of evolution throughout the tree [1, 2]. This practice has been regularly challenged by results from datasets showing considerable departures from clocklike evolution [3–5], and rate variation among lineages can seriously mislead not only divergence date estimation [6] but also phylogenetic inference (e.g., [7, 8]).

        Such problems with the molecular clock hypothesis have resulted in it being abandoned almost entirely for phylogenetic inference in favor of a model that assumes that every branch has an independent rate of molecular evolution. Under such an assumption, it is possible to infer phylogenies (e.g., [9, 10]), but not to estimate molecular rates or divergence times, because the individual contributions of rate and time to molecular evolution cannot be separated. If the rate and time along each branch can only be estimated as their product, then the position of the root of the tree cannot be estimated without additional assumptions such as an outgroup or a non-reversible substitution process. This unrooted alternative to the molecular clock was first suggested by Felsenstein [10] and has formed the basis of all modern phylogenetic inference and is implemented in all major phylogenetic packages (e.g., PHYLIP [11], PAUP* [12], and MrBayes [9]).

        Recently, it has been realized that less drastic alternatives to the unrooted model of phylogeny may exist. Instead of dispensing with the molecular clock entirely, attempts have been made to relax the molecular clock assumption by allowing the rate to vary across the tree [13–15]. For example, local molecular clock models estimate a separate molecular rate for each user-circumscribed group of branches in the tree [6, 13, 16]. However, assigning branches to different groups can be a difficult exercise if the number of sequences is large or if there is considerable uncertainty about the phylogenetic relationships among the taxa. Essentially, such models are only useful in cases in which there is a strong prior hypothesis that the rate of specific taxa will differ from the rest of the tree [6].

        Bayesian relaxed-clock methods, including those published by Thorne et al. [15] and Aris-Brosou and Yang [17], present an enticing alternative to local clock models. These model the molecular rate among lineages as varying in an autocorrelated manner, with the rate in each branch being drawn (a priori) from a parametric distribution whose mean is a function of the rate on the parent branch. For example, a lognormal distribution can be employed with the variance scaled relative to the length of the branch in units of time, implying that the evolutionary rate changes continuously along the branch. Alternatively, the use of an exponential distribution would imply that changes occurred at the nodes, with the size of the change being independent of the branch length.
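To make the autocorrelated lognormal idea concrete, here is a minimal sketch that draws the rate on each branch from a lognormal distribution centred on the parent branch's rate, with a log-scale variance proportional to the branch duration. The choice that the mean equals the parent rate, the parameter name `nu`, and the function names are assumptions made for illustration; this is not the exact parameterisation of the cited methods or of BEAST.

```python
import math
import random


def draw_child_rate(parent_rate, branch_time, nu, rng):
    """Draw a branch rate from a lognormal whose mean is the parent rate.

    Assumption: the log-scale variance is nu * branch_time, so longer
    branches allow the rate to drift further from the parent's rate.
    """
    sigma2 = nu * branch_time
    # Shift the log-mean so that the expected rate equals parent_rate.
    mu = math.log(parent_rate) - sigma2 / 2.0
    return rng.lognormvariate(mu, math.sqrt(sigma2))


def simulate_lineage(root_rate, branch_times, nu, seed=0):
    """Rates along one root-to-tip lineage, each drawn given its parent."""
    rng = random.Random(seed)
    rates = []
    rate = root_rate
    for t in branch_times:
        rate = draw_child_rate(rate, t, nu, rng)
        rates.append(rate)
    return rates


# Four successive branches of differing duration (arbitrary units).
print(simulate_lineage(root_rate=1.0, branch_times=[0.5, 1.0, 2.0, 0.2], nu=0.1))
```

With a small `nu` the rates stay close to the root rate from branch to branch (strong autocorrelation); with a large `nu` successive rates become nearly unrelated.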

        Autocorrelation of rates from ancestral to descendant lineages will occur whenever the largest component of rate variation is due to inherited factors, whether these are life-history traits or biochemical mechanisms. As one looks over smaller and smaller timescales, the differences in such inherited factors become smaller relative to the variance caused by stochastic and uninherited factors (such as environmental or chance events). An alternative way of considering this is that the autocorrelation is so strong that very little of the variation in rate can be attributed to inherited factors. At the other extreme, over very long timescales, we might expect so much variation in the inherited determinants of rate that the autocorrelation from lineage to lineage begins to break down, especially with sparse taxon sampling. However, it is difficult to predict where the boundaries between these effects are and thus to specify what the degree of autocorrelation will be.

        Relaxed-clock models present a potentially useful method for removing the assumption of a strict molecular clock, but a major shortcoming of the methods that have been proposed thus far is that they require the user to specify the tree topology. This is a problem because in many cases, important parts of the tree may be uncertain or unresolved, resulting in a number of plausible tree topologies. Furthermore, a molecular clock may have been assumed when estimating the input tree (for example to find a root), but rate variation among lineages can adversely affect phylogenetic inference (e.g., [7, 8]). In some settings, the tree topology may actually be a nuisance parameter and some other aspect of the model (such as the variance in evolutionary rate, the effective population size, or the age of the most recent common ancestor) is the object of interest. Lastly, the assumption of a relaxed clock will alter the posterior probabilities of alternative tree topologies, so that the best tree under a relaxed-clock model may differ from the best tree under an unrooted or strict molecular clock model. For these reasons, a “relaxed phylogenetics” approach, in which the phylogeny and the divergence dates are co-estimated under a relaxed molecular clock, is preferred [18].

        Here we present a Bayesian Markov chain Monte Carlo (MCMC) [19, 20] method for performing relaxed phylogenetics that is able to co-estimate phylogeny and divergence times under a new class of relaxed-clock models. Its utility is demonstrated through simulation and on 871 real datasets. When absolute rates and divergence dates are estimated, we use probabilistic calibration priors, rather than point calibrations, since these more appropriately incorporate calibration uncertainties. We have implemented this method in the application BEAST [21] in which they can be used in conjunction with a wide range of other evolutionary models.

