From Gene to Structure: Unraveling Genomic Dark Matter in Ca. Accumulibacter

Environ Sci Technol. 2024 Dec 19. doi: 10.1021/acs.est.4c09948. Online ahead of print.

Abstract

"Candidatus Accumulibacter" is a unique and pivotal genus of polyphosphate-accumulating organisms prevalent in wastewater treatment plants and plays mainstay roles in the global phosphorus cycle. However, the efforts to fully understand their genetic and metabolic characteristics are largely hindered by major limitations in existing sequence-based annotation methods. Here, we reported an integrated approach combining pangenome analysis, protein structure prediction and clustering, and meta-omic characterization, to uncover genetic and metabolic traits previously unexplored for Ca. Accumulibacter. The identification of a previously overlooked pyrophosphate-fructose 6-phosphate 1-phosphotransferase gene (pfp) suggested that all Ca. Accumulibacter encoded a complete Embden-Meyerhof-Parnas pathway. A homologue of the phosphate-specific transport system accessory protein (PhoU) was suggested to be an inorganic phosphate transport (Pit) accessory protein (Pap) conferring effective and efficient phosphate transport. Additional lineage members were found to encode complete denitrification pathways. A pipeline was built, generating a pan-Ca. Accumulibacter annotation reference database, covering >200,000 proteins and their encoding genes. Benchmarking on 27 Ca. Accumulibacter genomes showed major improvement in the average annotation coverage from 51% to 82%. This pipeline is readily applicable to diverse cultured and uncultured bacteria to establish high-coverage annotation reference databases, facilitating the exploration of genomic dark matter in the bacterial domain.

Keywords: AlphaFold2; enhanced biological phosphorus removal; function annotation; inorganic phosphate transport (Pit) accessory protein; protein structure.