Pairs and Pairix: a file format and a tool for efficient storage and retrieval for Hi-C read pairs

Bioinformatics. 2022 Mar 4;38(6):1729-1731. doi: 10.1093/bioinformatics/btab870.

Abstract

Summary: As the amount of 3D chromosomal interaction data continues to increase, storing and accessing such data efficiently becomes paramount. We introduce Pairs, a block-compressed text file format for storing paired genomic coordinates from Hi-C data, and Pairix, an open-source C application to index and query Pairs files. Pairix (also available in Python and R) extends the functionality of Tabix to paired-coordinate data. We have also developed PairsQC, a collapsible HTML quality control report generator for Pairs files.
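
To make the idea of a paired-coordinate query concrete, the following is a minimal Python sketch, not the Pairix implementation. It assumes the tab-separated column layout given in the publicly available Pairs specification (readID, chr1, pos1, chr2, pos2, strand1, strand2) and header lines beginning with "#". The names Pair, parse_pairs and query_2d are hypothetical and exist only for illustration. The sketch scans every record linearly, which is exactly the cost an index such as the one Pairix builds is meant to avoid.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Pair(NamedTuple):
    """One Hi-C read pair (assumed column order; see the lead-in)."""
    read_id: str
    chr1: str
    pos1: int
    chr2: str
    pos2: int
    strand1: str
    strand2: str


def parse_pairs(lines: Iterable[str]) -> Iterator[Pair]:
    """Yield Pair records from text lines, skipping '#' header lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        yield Pair(f[0], f[1], int(f[2]), f[3], int(f[4]), f[5], f[6])


def query_2d(pairs: Iterable[Pair],
             chr1: str, start1: int, end1: int,
             chr2: str, start2: int, end2: int) -> List[Pair]:
    """Return pairs whose first mate lies in chr1:start1-end1 and whose
    second mate lies in chr2:start2-end2 (inclusive bounds, linear scan)."""
    return [
        p for p in pairs
        if p.chr1 == chr1 and start1 <= p.pos1 <= end1
        and p.chr2 == chr2 and start2 <= p.pos2 <= end2
    ]


# Tiny made-up example; the records are illustrative, not real data.
EXAMPLE = (
    "## pairs format v1.0\n"
    "#columns: readID chr1 pos1 chr2 pos2 strand1 strand2\n"
    "r1\tchr1\t10000\tchr1\t20000\t+\t-\n"
    "r2\tchr1\t15000\tchr2\t500\t+\t+\n"
    "r3\tchr1\t90000\tchr2\t700\t-\t+\n"
)

hits = query_2d(parse_pairs(EXAMPLE.splitlines()),
                "chr1", 1, 50000, "chr2", 1, 1000)
for p in hits:
    print(p.read_id, p.chr1, p.pos1, p.chr2, p.pos2)
```

Only the record whose two mates both fall inside the requested rectangle is returned. A one-dimensional interval index of the Tabix kind constrains a single coordinate, so a query like this one needs both mates to be addressable, which is the extension the summary describes.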

Availability and implementation: The format specification and source code are available at https://github.com/4dn-dcic/pairix, https://github.com/4dn-dcic/Rpairix and https://github.com/4dn-dcic/pairsqc.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Chromosomes
  • Genomics*
  • Quality Control
  • Sequence Analysis
  • Software*