Epi-Gene: An R-Package for Easy Pan-Genome Analysis

Biomed Res Int. 2021 Sep 20;2021:5585586. doi: 10.1155/2021/5585586. eCollection 2021.

Abstract

The main aim of this study was to develop a set of functions that analyze genomic data using less time and memory. Epi-gene is presented as a solution to the problems of handling large sequence files and long computation times. It requires less time and fewer programming skills to work with a large number of genomes. In the current study, some features of the Epi-gene R package are described and illustrated using a dataset of 14 Aeromonas hydrophila genomes. The package also includes joining, relabeling, and conversion functions for handling FASTA-formatted sequences. Various Epi-gene functions were used to calculate the subsets of core genes, accessory genes, and unique genes. Heat maps and phylogenetic genome trees were also constructed. The whole procedure was completed in less than 30 minutes. The package works only on Windows operating systems. It also uses functions from other packages available in the R computing environment, such as dplyr and ggtree.
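The split into core, accessory, and unique genes mentioned in the abstract can be illustrated with a minimal sketch. This is not the Epi-gene API, and it is written in Python rather than R. It assumes the common pan-genome convention that a core gene is present in every genome, a unique gene in exactly one genome, and an accessory gene in more than one but not all. The function name `partition_pangenome`, the strain names, and the gene labels are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def partition_pangenome(genomes):
    """Split gene families into core, accessory, and unique sets.

    genomes: dict mapping a genome name to the set of gene family IDs
    found in that genome (hypothetical input format, not Epi-gene's).
    """
    n_genomes = len(genomes)

    # Count in how many genomes each gene family occurs.
    counts = defaultdict(int)
    for genes in genomes.values():
        for gene in set(genes):
            counts[gene] += 1

    # Assumed definitions: core = present in all genomes,
    # unique = present in exactly one genome (when there are several),
    # accessory = everything in between.
    core = {g for g, c in counts.items() if c == n_genomes}
    unique = {g for g, c in counts.items() if c == 1 and n_genomes > 1}
    accessory = set(counts) - core - unique
    return core, accessory, unique


# Toy data: three made-up strains with illustrative gene labels.
genomes = {
    "strain_A": {"gyrB", "rpoD", "aerA"},
    "strain_B": {"gyrB", "rpoD", "hlyA"},
    "strain_C": {"gyrB", "rpoD", "aerA", "ahh1"},
}

core, accessory, unique = partition_pangenome(genomes)
print("core:", sorted(core))
print("accessory:", sorted(accessory))
print("unique:", sorted(unique))
```

In this toy input, the two genes shared by all three strains fall into the core set, the gene shared by two strains is accessory, and the genes seen in a single strain are unique. A real analysis would first need gene families to be clustered across genomes, a step this sketch does not cover.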

Publication types

  • Retracted Publication

MeSH terms

  • Aeromonas hydrophila / genetics
  • Databases, Genetic
  • Genome, Bacterial*
  • Genomics*
  • Multigene Family
  • Phylogeny
  • Principal Component Analysis
  • Software*