Liver cancer is a common malignant tumor, and its clinical stage is closely related to the treatment and prognosis of patients. The Barcelona Clinic Liver Cancer (BCLC) staging system, revised by the BCLC group at the University of Barcelona, is currently the globally recognized staging system for liver cancer. However, as related research has advanced, this staging system no longer fully meets clinical needs. In this work, we propose a novel machine learning method for constructing an automatic hepatocellular carcinoma staging model that incorporates far more clinical variables than any existing staging system. Our model is based on random survival forests, which generate a unique hazard function for each patient. B-splines are used to embed the hazard functions as vectors in a low-dimensional space, and a hierarchical clustering method groups similar patients to form staging cohorts. The resulting staging system significantly outperforms the BCLC system in terms of distinctiveness between patients in different stages.
Keywords: B-splines; Cancer staging; Clustering; Random survival forests.
Copyright © 2022 Elsevier Inc. All rights reserved.
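
The embedding and clustering steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the per-patient hazard curves are synthetic Weibull-shaped curves standing in for random survival forest output, and the time grid, knot positions, spline degree, Ward linkage, and the choice of three stages are all assumptions made for illustration. Each curve is represented by the coefficients of a least-squares cubic B-spline, and the coefficient vectors are then grouped by agglomerative clustering.

```python
import numpy as np
from scipy.interpolate import make_lsq_spline
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# ASSUMPTION: synthetic stand-in for random survival forest output.
# One cumulative hazard curve per patient on a shared time grid,
# drawn from three hypothetical risk groups of 10 patients each.
times = np.linspace(0.0, 5.0, 50)
true_group = np.repeat([0, 1, 2], 10)
scales = np.array([2.0, 4.0, 8.0])[true_group] * rng.uniform(0.95, 1.05, size=30)
hazards = (times[None, :] / scales[:, None]) ** 1.5  # shape (30, 50)

# Embed each curve as the coefficient vector of a least-squares cubic
# B-spline with a fixed knot vector (knot positions are illustrative).
k = 3
interior = np.array([1.25, 2.5, 3.75])
knots = np.r_[[times[0]] * (k + 1), interior, [times[-1]] * (k + 1)]
spline = make_lsq_spline(times, hazards.T, knots, k=k)
embedding = spline.c.T  # shape (30, 7): one 7-dimensional vector per patient

# Group patients with similar embedded hazard curves into staging cohorts
# (Ward linkage and three stages are assumptions, not the paper's settings).
Z = linkage(embedding, method="ward")
stage = fcluster(Z, t=3, criterion="maxclust")
```

In a real pipeline the `hazards` array would come from a fitted survival model rather than a closed-form curve, and the number of stages would be chosen from the data, for example by inspecting the dendrogram encoded in `Z`.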