Automated analysis of immunoglobulin genes from high-throughput sequencing: life without a template

J Clin Bioinforma. 2013 Aug 27;3(1):15. doi: 10.1186/2043-9113-3-15.

Abstract

Background: Immunoglobulin (that is, antibody) and T cell receptor genes are created through somatic gene rearrangement from gene segment libraries. Immunoglobulin genes are further diversified by somatic hypermutation and selection during the immune response. Studying the repertoires of these genes yields valuable insights into immune system function in infections, aging, autoimmune diseases and cancers. The introduction of high-throughput sequencing has generated unprecedented amounts of repertoire and mutation data from immunoglobulin genes. However, common analysis programs are not appropriate for pre-processing and analyzing these data, because there is no template or reference for the whole gene.

Results: We present the automated analysis pipeline that we created for this purpose, which integrates software packages developed by us and by others, and demonstrate its performance.

Conclusions: The analysis pipeline presented here is highly modular and makes it possible to analyze the data resulting from high-throughput sequencing of immunoglobulin genes, despite the lack of a template gene. An executable version of the Automation program, together with its source code, is freely available for download from our website: http://immsilico2.lnx.biu.ac.il/Software.html.