Retracing Phylogenetic, Host and Geographic Origins of Coronaviruses with Coloured Genomic Bootstrap Barcodes: SARS-CoV and SARS-CoV-2 as Case Studies

Viruses. 2023 Jan 31;15(2):406. doi: 10.3390/v15020406.

Abstract

Phylogenetic trees of coronaviruses are difficult to interpret because these viruses undergo frequent genomic recombination. Here, we propose a new method, coloured genomic bootstrap (CGB) barcodes, to highlight the polyphyletic origins of human sarbecoviruses and understand their host and geographic origins. The results indicate that SARS-CoV and SARS-CoV-2 contain genomic regions of mixed ancestry originating from horseshoe bat (Rhinolophus) viruses. First, different regions of SARS-CoV share exclusive ancestry with five Rhinolophus viruses from Southwest China (RfYNLF/31C: 17.9%; RpF46: 3.3%; RspSC2018: 2.0%; Rpe3: 1.3%; RaLYRa11: 1.0%) and 97% of its genome can be related to bat viruses from Yunnan (China), supporting its emergence in the Rhinolophus species of this province. Second, different regions of SARS-CoV-2 share exclusive ancestry with eight Rhinolophus viruses from Yunnan (RpYN06: 5.8%; RaTG13: 4.8%; RmYN02: 3.8%), Laos (RpBANAL103: 3.3%; RmarBANAL236: 1.7%; RmBANAL52: 1.0%; RmBANAL247: 0.7%), and Cambodia (RshSTT200: 2.3%), and 98% of its genome can be related to bat viruses from northern Laos and Yunnan, supporting its emergence in the Rhinolophus species of this region. Although CGB barcodes are very useful in retracing the origins of human sarbecoviruses, further investigations are needed to better take into account the diversity of coronaviruses in bats from Cambodia, Laos, Myanmar, Thailand and Vietnam.
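The exclusive-ancestry percentages quoted above can be tallied with a few lines of Python. This is only an illustrative sketch of the arithmetic, not the CGB barcode method itself: the dictionary layout and the function names `total_exclusive` and `by_region` are assumptions made for this example, and the sums are not figures reported in the abstract.

```python
from collections import defaultdict

# Percentages copied from the abstract: share of each human virus genome
# that has exclusive ancestry with a given Rhinolophus virus.
SARS_COV = {
    "RfYNLF/31C": 17.9,
    "RpF46": 3.3,
    "RspSC2018": 2.0,
    "Rpe3": 1.3,
    "RaLYRa11": 1.0,
}

# For SARS-CoV-2 the abstract also gives the sampling region of each bat virus.
SARS_COV_2 = {
    "RpYN06": ("Yunnan", 5.8),
    "RaTG13": ("Yunnan", 4.8),
    "RmYN02": ("Yunnan", 3.8),
    "RpBANAL103": ("Laos", 3.3),
    "RmarBANAL236": ("Laos", 1.7),
    "RmBANAL52": ("Laos", 1.0),
    "RmBANAL247": ("Laos", 0.7),
    "RshSTT200": ("Cambodia", 2.3),
}


def total_exclusive(shares):
    """Sum the exclusive-ancestry percentages (hypothetical helper)."""
    return round(sum(shares.values()), 1)


def by_region(shares):
    """Group exclusive-ancestry percentages by sampling region (hypothetical helper)."""
    totals = defaultdict(float)
    for region, pct in shares.values():
        totals[region] += pct
    # Round to one decimal place to hide floating-point noise.
    return {region: round(total, 1) for region, total in totals.items()}


if __name__ == "__main__":
    print("SARS-CoV total:", total_exclusive(SARS_COV))
    print("SARS-CoV-2 by region:", by_region(SARS_COV_2))
```

Under these assumptions, the five bat viruses account for 25.5% of the SARS-CoV genome by exclusive ancestry, and the eight bat viruses account for 23.4% of the SARS-CoV-2 genome (Yunnan 14.4%, Laos 6.7%, Cambodia 2.3%). These totals are distinct from the 97% and 98% figures, which describe genome regions that can be related to bat viruses from the named areas rather than regions of exclusive ancestry.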

Keywords: COVID-19; coronavirus; genome; phylogenetic support; recombination; reservoir host; secondary host; tree reconstruction.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • COVID-19* / epidemiology
  • China
  • Chiroptera*
  • Genomics
  • Humans
  • Phylogeny
  • SARS-CoV-2 / genetics
  • Severe acute respiratory syndrome-related coronavirus*

Grants and funding

This research was funded by the “Agence nationale de la recherche” (AAP RA-COVID-19, grant number ANR-21-CO12-0002).