Introduction: Colorectal cancer is a common malignancy that can be cured when detected early, but recurrence among survivors is a persistent risk. A field effect of cancer in the colon has been reported and could have implications for surveillance, but studies to date have been limited. A joint analysis of pooled transcriptomic data from all available bulk RNA-sequencing data sets of healthy, histologically normal tumor-adjacent, and tumor tissues was performed to provide an unbiased assessment of the field effect.
Methods: A novel bulk RNA-sequencing data set from biopsies of nondiseased colon obtained at screening colonoscopy and published data sets from the Genomic Data Commons and Sequence Read Archive were considered for inclusion. Analyses were limited to samples with at least 10 million quantified reads. Transcript abundance was estimated with Salmon, and downstream analysis was performed in R.
Results: A total of 1,139 samples were analyzed in 3 cohorts. The primary cohort consisted of 834 independent samples from 8 independent data sets, including 462 healthy, 61 tumor-adjacent, and 311 tumor samples. Tumor-adjacent gene expression was found to represent an intermediate state between healthy and tumor expression. Among differentially expressed genes in tumor-adjacent samples, 1,143 were expressed in patterns similar to tumor samples, and these genes were enriched for cancer-associated pathways.
Discussion: Novel insights into the field effect in colorectal cancer were generated in this mega-analysis of the colorectal transcriptome. Oncogenic features were identified that might help explain metachronous lesions in cancer survivors and could be used for surveillance and risk stratification.