Regression Trees With Fused Leaves

Stat Med. 2024 Nov 20. doi: 10.1002/sim.10272. Online ahead of print.

Abstract

We propose a novel regression tree method named "TreeFuL," an abbreviation for "Tree with Fused Leaves." TreeFuL combines recursive partitioning with fused regularization, offering an alternative to conventional pruning. One of its noteworthy advantages is its capacity for cross-validated amalgamation of non-neighboring terminal nodes, enabled by a leaf coloring scheme that supports tree shearing and node amalgamation. As a result, TreeFuL yields more parsimonious tree models without compromising predictive accuracy. The refined model offers enhanced interpretability, making it particularly well suited for biomedical applications of decision trees, such as disease diagnosis and prognosis. We demonstrate the practical advantages of the proposed method through simulation studies and an analysis of data collected in an obesity study.

Keywords: CART; fused regularization; pruning; regression trees; tree model selection.
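
The idea of amalgamating non-neighboring terminal nodes can be illustrated with a small sketch. This is not the TreeFuL algorithm: the abstract does not give the fused penalty, the leaf coloring scheme, or the cross-validation procedure, so none of them is reproduced here. The `Leaf` class, the `fuse_leaves` function, and the `threshold` parameter are hypothetical names introduced for illustration. The sketch assumes the leaf means of an already-grown tree are available and simply merges leaves whose means are close, regardless of where they sit in the tree.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    label: str   # hypothetical identifier of a terminal node
    mean: float  # mean response of the observations in the leaf
    n: int       # number of observations in the leaf


def fuse_leaves(leaves, threshold):
    """Greedily merge leaves whose mean responses are within `threshold`.

    Illustrative stand-in for fusion: leaves are ordered by mean response,
    so leaves far apart in the tree can land in the same group.
    """
    ordered = sorted(leaves, key=lambda leaf: leaf.mean)
    groups = []
    for leaf in ordered:
        if groups and leaf.mean - groups[-1]["mean"] <= threshold:
            group = groups[-1]
            total = group["n"] + leaf.n
            # Update the group mean as a size-weighted average.
            group["mean"] = (group["mean"] * group["n"] + leaf.mean * leaf.n) / total
            group["n"] = total
            group["labels"].append(leaf.label)
        else:
            groups.append({"labels": [leaf.label], "mean": leaf.mean, "n": leaf.n})
    return groups


# Leaves listed in left-to-right tree order; A and C are not neighbors.
example_leaves = [
    Leaf("A", 1.0, 10),
    Leaf("B", 5.0, 20),
    Leaf("C", 1.2, 30),
    Leaf("D", 5.1, 20),
]
fused = fuse_leaves(example_leaves, threshold=0.5)
for group in fused:
    print(group["labels"], round(group["mean"], 3), group["n"])
```

In this toy input, four leaves collapse into two groups, with A joined to C and B joined to D. In a cross-validated setting, a quantity playing the role of `threshold` would be chosen by held-out prediction error rather than fixed by hand.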