Greedy function approximation: A gradient boosting machine.
Abstract
Function estimation/approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest-descent minimization. A general gradient-descent "boosting" paradigm is developed for additive expansions based on any fitting criterion. Specific algorithms are presented for least-squares, least absolute deviation, and Huber-M loss functions for regression, and multiclass logistic likelihood for classification. Special enhancements are derived for the particular case where the individual additive components are regression trees, and tools for interpreting such "TreeBoost" models are presented. Gradient boosting of regression trees produces competitive, highly robust, interpretable procedures for both regression and classification, especially appropriate for mining less-than-clean data. Connections between this approach and the boosting methods of Freund and Schapire and of Friedman, Hastie and Tibshirani are discussed.
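To make the stagewise additive/steepest-descent connection concrete, the following is a minimal illustrative sketch (not the paper's exact pseudocode) of gradient boosting with regression trees under squared-error loss: at each stage a tree is fit to the negative gradient of the loss (which for least squares is simply the current residuals) and added to the expansion with a shrunken step. The function name gradient_boost_ls and the parameters n_stages, learning_rate, and max_depth are illustrative choices, not notation from the paper.

```python
# Illustrative sketch of least-squares gradient boosting with regression trees.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost_ls(X, y, n_stages=100, learning_rate=0.1, max_depth=3):
    """Builds the stagewise additive expansion F_M(x) = F_0 + sum_m nu * h_m(x)."""
    f0 = float(np.mean(y))                     # F_0: best constant fit under squared error
    pred = np.full(len(y), f0)
    trees = []
    for _ in range(n_stages):
        residuals = y - pred                   # negative gradient of (1/2)(y - F)^2
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)                 # fit base learner to pseudo-residuals
        pred += learning_rate * tree.predict(X)
        trees.append(tree)
    return f0, trees

def predict(f0, trees, X, learning_rate=0.1):
    """Evaluates the fitted additive expansion at new inputs X."""
    return f0 + learning_rate * sum(t.predict(X) for t in trees)
```

Other loss functions in the paper (least absolute deviation, Huber-M, multiclass logistic likelihood) follow the same pattern with different pseudo-residuals and stage-specific line searches.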