The rice genome annotation project: an updated database for mining the rice genome

Nucleic Acids Res. 2024 Nov 18:gkae1061. doi: 10.1093/nar/gkae1061. Online ahead of print.

Abstract

Rice (Oryza sativa L.) is a major cereal crop that provides calories across the world. With a small genome, rice has been used extensively as a model for genetic and genomic studies in the Poaceae. Since the release of the first rice genome sequence in 2002, an improved reference genome assembly, multiple whole genome assemblies, extensive gene expression profiles, and resequencing data from over 3000 rice accessions have been generated. To facilitate access to the rice genome for plant biologists, we updated the Rice Genome Annotation Project database (RGAP; https://rice.uga.edu) with new datasets, including 16 whole genome rice assemblies and sequence variants generated from multiple rice pan-genome projects, among them the 3000 Rice Genomes Project. We updated gene expression abundance data with 80 RNA-sequencing datasets and, to facilitate gene function discovery, performed gene coexpression analysis, resulting in 39 coexpression modules that capture highly connected sets of co-regulated genes. To facilitate comparative genome analyses, 32 335 syntelogs were identified between the Nipponbare reference genome and other rice genomes, and 19 371 syntelogs were identified between Nipponbare and four other Poaceae genomes. Infrastructure improvements to the RGAP database include an upgraded genome browser and data access portals, enhanced website security, and improved website performance.