A phylogeny of all the plant species in the New World (from BIEN 2) is now available:
Click here to view the tree via an iPlant tree browser.
Cody Hinchliff in Stephen Smith's group at U. Michigan and John Cazes at TACC-iPlant compiled the tree. To view the BIEN 2 phylogeny using the iPlant Big Tree Viewer, go here (or click on the tree at left). The iPlant Tree Viewer was developed by Kris Urie (formerly at the Field Museum) and Adam Kubach (formerly at TACC), under the leadership of Karen Cranston at NESCent.
Methods:
Creation of a species-level multi-gene phylogenetic tree and mapping phylogenetic diversity – To calculate phylogenetic diversity, we used our standardized list of New World species to query GenBank sequence records. Data were gathered from the atpB-rbcL, ndhF, psbA, psbA-psbH, rbcL, and trnT-trnL-trnF marker regions using the software PHLAWD[1] (Smith et al. 2009). Together, these genes are represented for over 65,000 species-level tip taxa from across green plants. Individually, each of these loci is represented by the following numbers of exemplar taxa on GenBank: trnT-trnL-trnF – 45K; trnK-psbA-psbH – 18K; atpB-rbcL (incl. rbcL gene) – 28K; ndhF – 8K. Additional details on the methodology used to extract these data from GenBank and align them are presented in Hinchliff & Smith (ms in prep). The software RAxML 7.3.0 (Stamatakis 2006) was used to estimate the phylogeny from this alignment using the standard single-run, unconstrained ML search method. Penalized likelihood, as implemented in the program treePL[2] (Sanderson 2002; Smith & O’Meara 2012), was then used to estimate divergence times from the molecular branch lengths of this tree.
Taxa not in the BIEN database were pruned from the ultrametric topology. This resulted in a base tree that contained 18,641 species for which BIEN data were available. Where possible, we then grafted the remaining BIEN taxa onto the base tree using taxonomy (genus membership) as a guide. Previous approaches have attached unplaced taxa as polytomies at the base of genera or families (e.g. Webb & Donoghue 2005); however, this has been shown to bias some downstream analyses (Davies et al. 2012). Instead, we chose to attach unplaced taxa at random within genera and to repeat the process over an ensemble of 100 phylogenies for later analysis. Specifically, where the unplaced taxon was a member of a genus that is monophyletic on the base tree, it was randomly placed within that genus. Where an unplaced taxon belongs to a genus that appears polyphyletic on the base phylogeny, the taxon was made sister to a randomly selected congener. Given uncertainties in automating the workflow, no effort was made to randomly place taxa using taxonomic information above the genus level. As a result, 7,550 of the 88,824 species in BIEN 2 could not be placed on the working phylogeny in an automated fashion, because they had no congeners on the base tree.
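The grafting rules described above can be illustrated with a small Python sketch. This is not the code used to build the BIEN 2 trees; it is a minimal illustration under stated assumptions. The `Node` class, the function names, the `Genus_species` tip-naming convention, and the toy five-species base tree are all hypothetical. Branch lengths are ignored, and "randomly placed within that genus" is interpreted here as attaching the new tip as sister to a uniformly chosen node of the genus clade, which is only one possible reading.

```python
import copy
import random


class Node:
    """Minimal rooted-tree node (hypothetical); tips have a name and no children."""

    def __init__(self, name=None, children=None):
        self.name = name
        self.children = list(children or [])
        self.parent = None
        for child in self.children:
            child.parent = self

    def subtree(self):
        """Return all nodes in this subtree, including this node."""
        nodes = [self]
        for child in self.children:
            nodes.extend(child.subtree())
        return nodes

    def tips(self):
        return [n for n in self.subtree() if not n.children]


def genus_of(name):
    # Assumption: tips are named "Genus_species".
    return name.split("_")[0]


def mrca(tips):
    """Most recent common ancestor of a non-empty list of tips."""
    path = []
    node = tips[0]
    while node is not None:
        path.append(node)
        node = node.parent
    for tip in tips[1:]:
        seen = set()
        node = tip
        while node is not None:
            seen.add(id(node))
            node = node.parent
        path = [n for n in path if id(n) in seen]
    return path[0]


def attach_as_sister(root, target, new_name):
    """Insert a new tip as sister to `target`; return the (possibly new) root."""
    parent = target.parent  # capture before Node() re-parents target
    joint = Node(children=[target, Node(new_name)])
    joint.parent = parent
    if parent is None:
        return joint
    parent.children[parent.children.index(target)] = joint
    return root


def graft(root, new_name, rng):
    """Graft one unplaced taxon by genus; return (root, placed_flag)."""
    congeners = [t for t in root.tips() if genus_of(t.name) == genus_of(new_name)]
    if not congeners:
        return root, False  # no congener on the tree: left unplaced
    clade = mrca(congeners)
    if len(clade.tips()) == len(congeners):
        candidates = clade.subtree()  # monophyletic genus: anywhere in the clade
    else:
        candidates = congeners  # polyphyletic genus: sister to a random congener
    return attach_as_sister(root, rng.choice(candidates), new_name), True


def build_ensemble(base, unplaced, n_trees=100, seed=0):
    """Repeat the random grafting on independent copies of the base tree."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_trees):
        tree = copy.deepcopy(base)
        for name in unplaced:
            tree, _placed = graft(tree, name, rng)
        ensemble.append(tree)
    return ensemble


# Toy base tree: Quercus is monophyletic, Acer is polyphyletic, Betula is absent.
base = Node(children=[
    Node(children=[Node("Quercus_alba"), Node("Quercus_rubra")]),
    Node(children=[
        Node("Acer_rubrum"),
        Node(children=[Node("Pinus_strobus"), Node("Acer_saccharum")]),
    ]),
])
trees = build_ensemble(base, ["Quercus_robur", "Acer_negundo", "Betula_nigra"])
```

In this toy example, Quercus_robur always lands inside the Quercus clade, Acer_negundo becomes sister to one of the two Acer tips, and Betula_nigra is left off every tree because it has no congener, mirroring the species that could not be placed automatically.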
[2] https://github.com/blackrim/treePL/wiki
Davies T.J., Kraft N.J.B., Salamin N. & Wolkovich E.M. (2012). Incompletely resolved phylogenetic trees inflate estimates of phylogenetic conservatism. Ecology, 93, 242-247.
Sanderson M.J. (2002). Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. Molecular Biology and Evolution, 19, 101-109.
Smith S.A., Beaulieu J.M. & Donoghue M.J. (2009). Mega-phylogeny approach for comparative biology: an alternative to supertree and supermatrix approaches. BMC Evolutionary Biology, 9, 37.
Smith S.A. & O’Meara B.C. (2012). treePL: divergence time estimation using penalized likelihood for large phylogenies. Bioinformatics, 28, 2689-2690.
Stamatakis A. (2006). RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics, 22, 2688-2690.
Webb C.O. & Donoghue M.J. (2005). Phylomatic: tree assembly for applied phylogenetics. Molecular Ecology Notes, 5, 181-183.