Statistical Challenges in Tracking the Evolution of SARS-CoV-2

Stat Sci. 2022 May;37(2):162-182. doi: 10.1214/22-sts853. Epub 2022 May 16.

Abstract

Genomic surveillance of SARS-CoV-2 has been instrumental in tracking the spread and evolution of the virus during the pandemic. The availability of SARS-CoV-2 molecular sequences isolated from infected individuals, coupled with phylodynamic methods, has provided insights into the origin of the virus, its evolutionary rate, the timing of introductions, the patterns of transmission, and the rise of novel variants that have spread through populations. Despite enormous global efforts by governments, laboratories, and researchers to collect and sequence molecular data, many challenges remain in analyzing and interpreting the data collected. Here, we describe the models and methods currently used to monitor the spread of SARS-CoV-2, discuss long-standing and new statistical challenges, and propose a method for tracking the rise of novel variants during the epidemic.

Keywords: Bayesian nonparametrics; Phylodynamics; SIR models; birth-death processes; coalescent; genetic epidemiology.