TruSeq-Based Gene Expression Analysis of Formalin-Fixed Paraffin-Embedded (FFPE) Cutaneous T-Cell Lymphoma Samples: Subgroup Analysis Results and Elucidation of Biases from FFPE Sample Processing on the TruSeq Platform

Front Med (Lausanne). 2017 Sep 22:4:153. doi: 10.3389/fmed.2017.00153. eCollection 2017.

Abstract

Cutaneous T-cell lymphomas (CTCLs) are a heterogeneous group of malignancies with courses ranging from indolent to potentially lethal. We recently studied gene expression profiles generated by TruSeq targeted RNA gene expression sequencing in a cohort of 157 patients. We observed that sequencing library quality and depth from formalin-fixed paraffin-embedded (FFPE) skin samples were significantly lower when biopsies were obtained prior to 2009. We also observed that the fresh CTCL samples clustered together, even though they included stage I-IV disease. In this study, we compared TruSeq gene expression patterns in older (≤2008) vs. more recent (≥2009) FFPE samples to determine whether these clustering analyses and the previously described differentially expressed gene (DEG) findings are robust when analyzed by year of biopsy. We also explored biases found in FFPE samples subjected to TruSeq gene expression analysis. Our results showed that the ≤2008 and ≥2009 samples clustered as well as the full data set and, importantly, both analyses produced nearly identical trends and findings. Specifically, both analyses identified nearly identical DEGs when comparing benign vs. (1) stage I-IV and (2) stage IV (alone) CTCL samples. Results obtained using either ≤2008 or ≥2009 samples were strongly correlated. Furthermore, by using subgroup analyses, we were able to identify additional novel DEGs that did not reach statistical significance in the prior full data set analysis. These included the CTCL-upregulated genes BCL11A, SELL, IRF1, SMAD1, CASP1, BIRC5, and MAX and the CTCL-downregulated genes MDM4, SERPINB3, and THBS4. With respect to sample biases, fresh samples clustered tightly together regardless of whether we performed subgroup analyses or full data set analysis. While principal component analysis revealed that fresh samples were spatially closer together, indicating some preprocessing batch effect, they remained in proximity to other normal/benign and FFPE CTCL samples and did not cluster as outliers by themselves. Notably, this did not affect the determination of DEGs when analyzing ≥2009 samples (fresh and FFPE biopsies) vs. ≥2009 FFPE samples alone.
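
The subgroup comparison described above (splitting samples by biopsy year and checking that the two subgroups give correlated results) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the abstract does not state which statistic or software was used, so the log2 fold-change calculation, the Pearson correlation, and all function and variable names here are assumptions.

```python
import numpy as np


def log2_fold_change(case, control, pseudocount=1.0):
    """Per-gene log2 ratio of mean expression (case vs. control).

    Rows are samples, columns are genes. The pseudocount is an
    illustrative choice to avoid division by zero.
    """
    case_mean = case.mean(axis=0) + pseudocount
    control_mean = control.mean(axis=0) + pseudocount
    return np.log2(case_mean / control_mean)


def subgroup_agreement(expr, is_ctcl, biopsy_year, cutoff=2009):
    """Correlate CTCL-vs-benign fold changes between older and recent samples.

    Samples with biopsy_year < cutoff form the older subgroup (<=2008 for the
    default cutoff); the rest form the recent subgroup (>=2009).
    """
    older = biopsy_year < cutoff
    recent = ~older
    lfc_older = log2_fold_change(expr[older & is_ctcl], expr[older & ~is_ctcl])
    lfc_recent = log2_fold_change(expr[recent & is_ctcl], expr[recent & ~is_ctcl])
    # Pearson correlation of the per-gene fold changes across subgroups.
    r = np.corrcoef(lfc_older, lfc_recent)[0, 1]
    return r, lfc_older, lfc_recent


# Synthetic data only: 80 samples x 50 genes with a shared disease effect.
rng = np.random.default_rng(0)
n_samples, n_genes = 80, 50
sample_index = np.arange(n_samples)
is_ctcl = sample_index % 2 == 0                       # alternate CTCL / benign
biopsy_year = np.where(sample_index < 40, 2005, 2012)  # half older, half recent
true_effect = rng.normal(0.0, 1.5, n_genes)           # per-gene log2 effect
baseline = rng.uniform(5.0, 8.0, n_genes)             # per-gene log2 baseline
noise = rng.normal(0.0, 0.5, (n_samples, n_genes))
log_expr = baseline + np.outer(is_ctcl.astype(float), true_effect) + noise
expr = 2.0 ** log_expr

r, lfc_older, lfc_recent = subgroup_agreement(expr, is_ctcl, biopsy_year)
print(f"Pearson r between subgroup fold changes: {r:.3f}")
```

A high correlation in such a check would be consistent with the year of biopsy not driving the differential expression signal; a real analysis would also need normalization and a formal test for each gene.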

Keywords: Sézary syndrome; TruSeq; cutaneous T-cell lymphoma; diagnostic markers; expression profiling; mycosis fungoides; prognostic markers.