Phylogenomics Using Transcriptome Data

Johanna Taylor Cannon; Kevin Michael Kocot

doi:10.1007/978-1-4939-3774-5_4

Methods Mol Biol. 2016:1452:65-80. doi: 10.1007/978-1-4939-3774-5_4.

Authors

Johanna Taylor Cannon¹, Kevin Michael Kocot²

Affiliations

¹ Department of Zoology, Naturhistoriska Riksmuseet, 50007, SE-104 05, Stockholm, Sweden. Johanna.cannon@nrm.se.
² Department of Biological Sciences and Alabama Museum of Natural History, The University of Alabama, 307 Mary Harmon Bryant Hall, Tuscaloosa, AL, 35487, USA.

PMID: 27460370

Abstract

This chapter presents a generalized protocol for phylogenetic analysis of large-scale molecular datasets, specifically transcriptome data generated on the Illumina sequencing platform. The bench protocol consists of RNA extraction, cDNA synthesis, and Illumina sequencing. Once sequences have been obtained, bioinformatic methods are used to assemble raw reads, identify coding regions, and sort sequences from different species into groups of orthologous genes (OGs). The OGs to be used for phylogenetic inference are selected with a custom shell script, and the selected OGs are then concatenated into a supermatrix. Finally, generalized methods for phylogenomic inference using maximum likelihood and Bayesian inference software are presented.
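The selection and concatenation steps can be illustrated with a minimal Python sketch. It is not the custom shell script the chapter refers to, and every name in it (`select_ogs`, `concatenate`, `min_taxa`) is hypothetical. It assumes each OG has already been aligned and is held in memory as a mapping from taxon name to aligned sequence, that OGs are selected purely by how many taxa they contain, and that taxa missing from an OG are padded with `-` (some pipelines use `?` or `X` instead).

```python
from typing import Dict, List, Tuple

# One aligned orthologous group: taxon name -> aligned sequence (assumed format).
Alignment = Dict[str, str]


def select_ogs(ogs: Dict[str, Alignment], min_taxa: int) -> Dict[str, Alignment]:
    """Keep only OGs sampled for at least `min_taxa` taxa (illustrative criterion)."""
    return {name: aln for name, aln in ogs.items() if len(aln) >= min_taxa}


def concatenate(
    ogs: Dict[str, Alignment], missing: str = "-"
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Concatenate aligned OGs into a supermatrix.

    Returns the supermatrix (taxon -> concatenated sequence) and a list of
    (OG name, start, end) partitions using 1-based inclusive coordinates.
    """
    # Every taxon seen in any OG gets a row in the supermatrix.
    taxa = sorted({taxon for aln in ogs.values() for taxon in aln})
    rows: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Tuple[str, int, int]] = []
    start = 1

    for name in sorted(ogs):
        aln = ogs[name]
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{name}: sequences are not aligned to one length")
        length = lengths.pop()

        for taxon in taxa:
            # Pad taxa absent from this OG so all rows stay the same length.
            rows[taxon].append(aln.get(taxon, missing * length))

        partitions.append((name, start, start + length - 1))
        start += length

    return {taxon: "".join(parts) for taxon, parts in rows.items()}, partitions
```

The partition list records where each OG begins and ends in the supermatrix, which is the information a partitioned maximum likelihood or Bayesian analysis would need. In practice the alignments would be read from files, and the occupancy threshold would be chosen for the dataset at hand.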

Keywords: Illumina; Phylogenomics; Phylogeny; RNAseq; Transcriptomes; cDNA.

MeSH terms

Computational Biology / methods
DNA, Complementary / genetics
Evolution, Molecular
Genomics / methods*
Phylogeny
Sequence Analysis, DNA / methods
Transcriptome / genetics*

Substances

DNA, Complementary