Progress on deep learning in genomics

Yi Chuan. 2024 Sep;46(9):701-715. doi: 10.16288/j.yczz.24-151.

Abstract

With the rapid growth of data driven by high-throughput sequencing technologies, genomics has entered the era of big data, which poses significant challenges for traditional bioinformatics methods in handling complex data patterns. At this critical juncture, deep learning, an advanced artificial intelligence technology, offers powerful capabilities for data analysis and pattern recognition and has revitalized genomic research. In this review, we focus on four major deep learning models: convolutional neural network (CNN), recurrent neural network (RNN), long short-term memory (LSTM), and generative adversarial network (GAN). We outline their core principles and comprehensively review their applications in DNA, RNA, and protein research over the past five years. We also explore the use of deep learning in livestock genomics, highlighting its potential benefits and challenges in genetic trait analysis, disease prevention, and genetic improvement. Through this analysis, we aim to show how deep learning can improve precision and efficiency in genomic research and to offer a framework for developing and applying livestock genomic strategies, thereby advancing precision livestock farming and genetic breeding technologies.

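With the rapid development of high-throughput sequencing technologies, genomics has seen explosive growth in data volume, which severely challenges the ability of traditional bioinformatics to handle complex data patterns. At this critical moment of technological change, deep learning, a frontier technology in artificial intelligence, has injected new vitality into genomics research with its powerful capabilities for data analysis and pattern recognition. This article focuses on four core deep learning models, namely the convolutional neural network (CNN), recurrent neural network (RNN), long short-term memory network (LSTM), and generative adversarial network (GAN); systematically describes their basic principles; and reviews their wide application in DNA, RNA, and protein research over the past five years. It further discusses application cases of deep learning in livestock and poultry genomics, revealing its potential value and the challenges it faces in genetic trait analysis, disease prevention, and genetic improvement. Through in-depth analysis, this article aims to describe the role of deep learning in improving the accuracy and processing capacity of genomic data analysis, and to build a conceptual framework to guide the development of research strategies in livestock and poultry genomics and their application in specific scenarios, thereby promoting the development of precision agriculture and genetic improvement technologies.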

Keywords: CNN; GAN; LSTM; RNN; deep learning; genome.

Publication types

  • Review

MeSH terms

  • Animals
  • Computational Biology / methods
  • Deep Learning*
  • Genomics* / methods
  • Humans
  • Livestock / genetics
  • Neural Networks, Computer