RNF: a general framework to evaluate NGS read mappers

Bioinformatics. 2016 Jan 1;32(1):136-9. doi: 10.1093/bioinformatics/btv524. Epub 2015 Sep 9.

Abstract

Motivation: Read simulators combined with alignment evaluation tools provide the most straightforward way to evaluate and compare mappers. Simulation of reads is accompanied by information about their positions in the source genome. This information is then used to evaluate alignments produced by the mapper. Finally, reports containing statistics of successful read alignments are created. In the absence of standards for encoding read origins, every evaluation tool has to be made explicitly compatible with the simulator used to generate reads.
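The evaluation workflow described above can be illustrated with a minimal sketch. The naming scheme used here (fields joined by `|`), the `Origin` type, the helper names and the 5-base tolerance are all hypothetical choices made for illustration; they are not taken from any real simulator or evaluation tool. An ad hoc convention of this kind is exactly what a simulator-specific evaluation tool has to know about in advance.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class Origin(NamedTuple):
    """Hypothetical description of where a read comes from (or was mapped to)."""
    chrom: str
    pos: int      # 1-based leftmost coordinate
    strand: str   # "+" or "-"


def encode_name(read_id: str, origin: Origin) -> str:
    # Ad hoc scheme (an assumption for this sketch): id|chrom|pos|strand
    return f"{read_id}|{origin.chrom}|{origin.pos}|{origin.strand}"


def decode_name(name: str) -> Tuple[str, Origin]:
    read_id, chrom, pos, strand = name.split("|")
    return read_id, Origin(chrom, int(pos), strand)


def is_correct(name: str, mapped: Origin, tolerance: int = 5) -> bool:
    # An alignment counts as correct if it hits the same chromosome and
    # strand within `tolerance` bases of the simulated position.
    _, truth = decode_name(name)
    return (
        mapped.chrom == truth.chrom
        and mapped.strand == truth.strand
        and abs(mapped.pos - truth.pos) <= tolerance
    )


def fraction_correct(mappings: Dict[str, Optional[Origin]]) -> float:
    # `mappings` maps each read name to its reported position,
    # or to None when the mapper left the read unmapped.
    correct = sum(
        1 for name, mapped in mappings.items()
        if mapped is not None and is_correct(name, mapped)
    )
    return correct / len(mappings)


if __name__ == "__main__":
    r1 = encode_name("r1", Origin("chr1", 1000, "+"))
    r2 = encode_name("r2", Origin("chr2", 500, "-"))
    report = {r1: Origin("chr1", 1002, "+"), r2: None}
    print(fraction_correct(report))
```

Because the read name carries the ground truth, the evaluation step needs only the mapper's output and no side files from the simulator.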

Results: To overcome this obstacle, we have created a generic format, Read Naming Format (RNF), for assigning read names that encode the reads' original positions. Furthermore, we have developed an associated software package, RnfTools, containing two principal components. MIShmash applies one of several popular read simulation tools (such as DwgSim, Art, Mason and CuReSim) and transforms the generated reads into RNF format. LAVEnder then evaluates a given read mapper using simulated reads in RNF format. Special attention is paid to mapping qualities, which serve to parametrize ROC curves, and to evaluating the effect of read sample contamination.
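One way to read "mapping qualities that parametrize ROC curves" is sketched below: each mapping-quality threshold yields one point, computed from the alignments whose quality is at least that threshold. This is an illustrative assumption, not LAVEnder's documented procedure; the function name `roc_points`, the input layout and the choice of axes (fractions of all simulated reads) are hypothetical.

```python
from typing import List, Tuple


def roc_points(
    alignments: List[Tuple[int, bool]],
    total_reads: int,
) -> List[Tuple[int, float, float]]:
    """Return one (threshold, wrong_fraction, correct_fraction) per threshold.

    `alignments` holds (mapping_quality, is_correct) for every mapped read.
    `total_reads` is the number of simulated reads, which may exceed
    len(alignments) when some reads were left unmapped.
    """
    # Sweep thresholds from the strictest (highest quality) to the loosest.
    thresholds = sorted({mapq for mapq, _ in alignments}, reverse=True)
    points = []
    for q in thresholds:
        kept = [ok for mapq, ok in alignments if mapq >= q]
        correct = sum(kept)           # True counts as 1
        wrong = len(kept) - correct
        points.append((q, wrong / total_reads, correct / total_reads))
    return points


if __name__ == "__main__":
    # Hypothetical toy data: five mapped reads out of ten simulated ones.
    toy = [(60, True), (60, True), (30, True), (30, False), (0, False)]
    for q, wrong, correct in roc_points(toy, total_reads=10):
        print(f"MAPQ >= {q:2d}: wrong={wrong:.2f} correct={correct:.2f}")
```

Lowering the threshold can only keep more alignments, so both fractions are non-decreasing along the sweep, which is what lets the resulting points be drawn as a curve.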

Availability and implementation: RnfTools: http://karel-brinda.github.io/rnftools; RNF specification: http://karel-brinda.github.io/rnf-spec

Contact: karel.brinda@univ-mlv.fr.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation
  • Genome
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Software*