Optimized Whole-Slide-Image H&E Stain Normalization: A Step Towards Big Data Integration in Digital Pathology

IEEE Open J Eng Med Biol. 2024 Sep 6:6:35-40. doi: 10.1109/OJEMB.2024.3455011. eCollection 2025.

Abstract

In medical diagnostics, pathology and histology are pivotal for the precise identification of disease. Digital histopathology, enhanced by automation, enables efficient analysis of the massive number of biopsy images produced daily, streamlining the evaluation process. This study focuses on Stain Color Normalization (SCN) within a Whole-Slide Image (WSI) cohort, aiming to reduce batch biases. Building on a published graphical method, this research demonstrates a mathematical, population-driven (data-driven) method that optimizes the dependency on the number of reference WSIs and their corresponding aggregate sums, thereby increasing the efficiency of the SCN process. The method speeds up the analysis of color convergence 50-fold by using stain vector Euclidean distance analysis, and it reduces the number of reference WSIs required by more than half. The approach is validated through a tripartite methodology: 1) stain vector Euclidean distance analysis, 2) distance computation timing, and 3) qualitative and quantitative assessments of SCN across tumor regions of interest. The results validate the performance of the data-driven SCN method and thus its potential to enhance the precision and reliability of computational pathology analyses. This advancement is poised to improve diagnostic processes, therapeutic strategies, and patient prognosis.
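To illustrate the idea of a Euclidean-distance convergence check on aggregated stain vectors, the following is a minimal sketch and not the authors' implementation. It assumes each reference WSI contributes one stain vector (for example, a three-component optical-density vector), that the aggregate is a simple running mean, and that convergence is declared when the aggregate moves by less than a tolerance after one more WSI is added. The function name `references_needed`, the tolerance `tol`, and the stopping rule are all hypothetical choices made for illustration.

```python
import math


def references_needed(stain_vectors, tol=1e-2):
    """Count how many reference vectors are consumed before the running
    mean stabilizes (hypothetical stopping rule, for illustration only).

    stain_vectors: non-empty list of equal-length numeric sequences,
                   one per reference WSI.
    tol:           assumed convergence threshold on the Euclidean distance
                   between successive running means.
    Returns len(stain_vectors) if the threshold is never reached.
    """
    running_sum = [0.0] * len(stain_vectors[0])
    prev_mean = None
    for n, vec in enumerate(stain_vectors, start=1):
        # Update the aggregate sum, then derive the running mean from it.
        running_sum = [s + x for s, x in zip(running_sum, vec)]
        mean = [s / n for s in running_sum]
        # Compare the new aggregate with the previous one.
        if prev_mean is not None and math.dist(mean, prev_mean) < tol:
            return n
        prev_mean = mean
    return len(stain_vectors)
```

Under these assumptions, a cohort whose aggregate stain vector stabilizes early needs fewer reference WSIs, which is the kind of reduction the abstract describes. The actual aggregation and threshold used in the paper may differ.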

Keywords: Glioblastoma; normalization; optimizing; preprocessing; stain-vectors; whole-slide-image.

Grants and funding

This work was supported in part by CloudBank NSF under Grant 1925001, and in part by Internet2 CLASS in collaboration with CloudBank grant, and NIH DK126847.