Article

The importance of knowing when to stop: a sequential stopping rule for component-wise gradient boosting

Methods of Information in Medicine, 51 (2): 178--186 (February 2012)
DOI: 10.3414/ME11-02-0030

Abstract

Objectives: Component-wise boosting algorithms have evolved into a popular estimation scheme in biomedical regression settings. The number of iterations is the most important tuning parameter for optimizing their performance. To date, no fully automated strategy for determining the optimal stopping iteration of boosting algorithms has been proposed.

Methods: We propose a fully data-driven sequential stopping rule for boosting algorithms. It combines resampling methods with a modified version of an earlier stopping approach that depends on AIC-based information criteria. The new “subsampling after AIC” stopping rule is applied to component-wise gradient boosting algorithms.

Results: The newly developed sequential stopping rule outperformed earlier approaches when applied to both simulated and real data. Specifically, it improved on purely AIC-based methods when used for the microarray-based prediction of the recurrence of metastases for stage II colon cancer patients.

Conclusions: The proposed sequential stopping rule for boosting algorithms can help to identify the optimal stopping iteration during the fitting process itself, at least for the most common loss functions.
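To make the role of the stopping iteration concrete, the following is a minimal Python sketch of the resampling ingredient mentioned under Methods. It is not the paper's “subsampling after AIC” rule: the AIC step and the sequential combination are not reproduced, because the abstract does not specify them. The sketch assumes squared-error loss, simple linear base learners, centered data without an intercept, and a fixed step length `nu`; the function names `componentwise_l2_boost` and `subsampling_stop` are hypothetical.

```python
import numpy as np


def componentwise_l2_boost(X, y, n_iter, nu=0.1):
    """Component-wise L2 boosting with one linear base learner per column.

    Returns the coefficient path, shape (n_iter + 1, p); row m holds the
    coefficients after m iterations. Assumes centered X and y (no intercept).
    """
    n, p = X.shape
    coef = np.zeros(p)
    path = [coef.copy()]
    resid = y.astype(float).copy()
    col_ss = (X ** 2).sum(axis=0)  # sum of squares of each column
    for _ in range(n_iter):
        # Least-squares slope of the current residuals on each single column.
        slopes = X.T @ resid / col_ss
        # Pick the component that reduces the residual sum of squares most.
        j = int(np.argmax(slopes ** 2 * col_ss))
        # Update only that component, shrunken by the step length nu.
        coef[j] += nu * slopes[j]
        resid -= nu * slopes[j] * X[:, j]
        path.append(coef.copy())
    return np.array(path)


def subsampling_stop(X, y, n_iter=200, n_splits=10, frac=0.5, nu=0.1, seed=0):
    """Choose the stopping iteration by averaged out-of-sample squared error.

    Illustrative only: plain subsampling, without the AIC-based step.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    k = int(frac * n)
    risk = np.zeros(n_iter + 1)
    for _ in range(n_splits):
        idx = rng.permutation(n)
        train, test = idx[:k], idx[k:]
        path = componentwise_l2_boost(X[train], y[train], n_iter, nu)
        pred = X[test] @ path.T  # predictions for every iteration at once
        risk += ((y[test][:, None] - pred) ** 2).mean(axis=0)
    risk /= n_splits
    return int(np.argmin(risk)), risk
```

Calling `subsampling_stop(X, y)` returns the iteration with the lowest averaged out-of-sample error together with the full risk curve, so the curve can be inspected for how flat it is around the minimum.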
