SciApps: An Automated Platform for Processing and Distribution of Plant Genomics Data

Methods Mol Biol. 2022:2443:197-209. doi: 10.1007/978-1-0716-2067-0_10.

Abstract

SciApps is an open-source, web-based platform for processing, storing, visualizing, and distributing genomic data and analysis results. Built upon the Tapis (formerly Agave) platform, SciApps gives users TB-scale data storage through the CyVerse Data Store and access to over one million CPUs through the Extreme Science and Engineering Discovery Environment (XSEDE) resources at the Texas Advanced Computing Center (TACC). SciApps lets users chain individual jobs into automated, reproducible workflows in a distributed cloud, and it provides a management system for data, associated metadata, individual analysis jobs, and multi-step workflows. This chapter provides examples of how to (1) submit, manage, and construct workflows, (2) use public workflows for Bulked Segregant Analysis (BSA), and (3) construct a Data Analysis Center (DAC) and a Data Coordination Center (DCC) for the plant ENCODE project.

Keywords: BSA; Distributed cloud computing; Genetic variation; Plant ENCODE; Plant genomics; Visualization; Workflow management system.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Computational Biology
  • Genome, Plant
  • Genomics* / methods
  • Information Storage and Retrieval
  • Software*
  • Workflow