XGBoost

Developer(s)	The XGBoost Contributors
Initial release	March 27, 2014
Stable release	1.4.0 / April 10, 2021
Repository	github.com/dmlc/xgboost
Written in	C++
Operating system	Linux, macOS, Windows
Type	Machine learning
License	Apache License 2.0
Website	xgboost.ai

XGBoost^[2] is an open-source software library that provides a regularizing gradient boosting framework for C++, Java, Python,^[3] R,^[4] Julia,^[5] Perl,^[6] and Scala. It works on Linux, Windows,^[7] and macOS.^[8] According to the project description, it aims to provide a "Scalable, Portable and Distributed Gradient Boosting (GBM, GBRT, GBDT) Library". It runs on a single machine as well as on the distributed processing frameworks Apache Hadoop, Apache Spark, Apache Flink, and Dask.^[9]^[10]

It has gained popularity and attention as the algorithm of choice for many winning teams in machine learning competitions.^[11]

History

XGBoost started as a research project by Tianqi Chen^[12] as part of the Distributed (Deep) Machine Learning Community (DMLC) group. It began as a terminal application that could be configured using a libsvm configuration file. It became well known in machine learning competition circles after its use in the winning solution of the Higgs Machine Learning Challenge. Soon after, the Python and R packages were built, and XGBoost now has package implementations for Java, Scala, Julia, Perl, and other languages. This brought the library to more developers and contributed to its popularity in the Kaggle community, where it has been used in a large number of competitions.^[11]

It was soon integrated with a number of other packages, making it easier to use in their respective communities. It has been integrated with scikit-learn for Python users and with the caret package for R users. It can also be integrated into dataflow frameworks such as Apache Spark, Apache Hadoop, and Apache Flink using the abstracted Rabit^[13] and XGBoost4J.^[14] XGBoost is also available on OpenCL for FPGAs.^[15] An efficient, scalable implementation of XGBoost has been published by Tianqi Chen and Carlos Guestrin.^[16]

While an XGBoost model often achieves higher accuracy than a single decision tree, it sacrifices the intrinsic interpretability of decision trees. For example, following the path that a decision tree takes to make its decision is trivial and self-explanatory, but following the paths of hundreds or thousands of trees is much harder. To achieve both performance and interpretability, some model compression techniques allow transforming an XGBoost model into a single "born-again" decision tree that approximates the same decision function.^[17]

Features

Salient features of XGBoost that distinguish it from other gradient boosting algorithms include:^[18]^[19]^[20]

Clever penalization of trees
A proportional shrinking of leaf nodes
Newton boosting
Extra randomization parameter
Implementation on single and distributed systems, and out-of-core computation
Automatic feature selection
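
As a rough illustration of the first two items, the sketch below uses the regularized leaf weight and split gain from the Chen and Guestrin paper,^[16] where G and H are the sums of gradients and hessians in a leaf. The function names `leaf_weight` and `split_gain` and the default values are assumptions made for this example; they are not part of the XGBoost API.

```python
def leaf_weight(G, H, lam=1.0):
    """Optimal leaf value -G / (H + lam) under an L2 penalty lam on leaf weights.

    G and H are the summed gradients and hessians of the examples in the leaf.
    A larger lam shrinks the leaf value toward zero.
    """
    return -G / (H + lam)


def split_gain(GL, HL, GR, HR, lam=1.0, gamma=0.0):
    """Gain of splitting a leaf into left (GL, HL) and right (GR, HR) children.

    gamma is a fixed cost per additional leaf; a split whose gain is not
    positive does not pay for the extra leaf it creates.
    """
    def score(G, H):
        return G * G / (H + lam)

    return 0.5 * (score(GL, HL) + score(GR, HR) - score(GL + GR, HL + HR)) - gamma
```

With `lam=0` and `gamma=0` these reduce to the unpenalized Newton leaf value and gain, so the two parameters can be read as the penalization and shrinking named in the list, under this interpretation.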

The algorithm

XGBoost works as the Newton-Raphson method in function space, unlike gradient boosting, which works as gradient descent in function space. A second-order Taylor approximation of the loss function is used to make the connection to the Newton-Raphson method.
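
The connection can be made explicit with a short derivation, sketched here in the notation of the algorithm that follows and assuming ${\hat {h}}_{m}(x_{i})>0$. Expanding the loss to second order around the current model ${\hat {f}}_{(m-1)}$ for a candidate update $\phi$ gives

$\sum _{i=1}^{N}L\left(y_{i},{\hat {f}}_{(m-1)}(x_{i})+\phi (x_{i})\right)\approx \sum _{i=1}^{N}\left[L\left(y_{i},{\hat {f}}_{(m-1)}(x_{i})\right)+{\hat {g}}_{m}(x_{i})\phi (x_{i})+{\frac {1}{2}}{\hat {h}}_{m}(x_{i})\phi (x_{i})^{2}\right],$

where ${\hat {g}}_{m}(x_{i})$ and ${\hat {h}}_{m}(x_{i})$ are the first and second derivatives of the loss with respect to the prediction, evaluated at the current model. Completing the square in each term yields

${\hat {g}}_{m}(x_{i})\phi (x_{i})+{\frac {1}{2}}{\hat {h}}_{m}(x_{i})\phi (x_{i})^{2}={\frac {1}{2}}{\hat {h}}_{m}(x_{i})\left[-{\frac {{\hat {g}}_{m}(x_{i})}{{\hat {h}}_{m}(x_{i})}}-\phi (x_{i})\right]^{2}-{\frac {{\hat {g}}_{m}(x_{i})^{2}}{2{\hat {h}}_{m}(x_{i})}}.$

The last term does not depend on $\phi$, so minimizing the approximation over $\phi$ is a weighted least-squares fit to the targets $-{\hat {g}}_{m}(x_{i})/{\hat {h}}_{m}(x_{i})$ with weights ${\hat {h}}_{m}(x_{i})$. The pointwise minimizer $\phi (x_{i})=-{\hat {g}}_{m}(x_{i})/{\hat {h}}_{m}(x_{i})$ is the Newton-Raphson step.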

A generic unregularized XGBoost algorithm is:

Input: training set $\{(x_{i},y_{i})\}_{i=1}^{N}$, a differentiable loss function $L(y,F(x))$, a number of weak learners $M$ and a learning rate $\alpha$.

Algorithm:

Initialize model with a constant value:
${\hat {f}}_{(0)}(x)={\underset {\theta }{\arg \min }}\sum _{i=1}^{N}L(y_{i},\theta ).$
For m = 1 to M:
1. Compute the 'gradients' and 'hessians':
  ${\hat {g}}_{m}(x_{i})=\left[{\frac {\partial L(y_{i},f(x_{i}))}{\partial f(x_{i})}}\right]_{f(x)={\hat {f}}_{(m-1)}(x)}.$
  
  ${\hat {h}}_{m}(x_{i})=\left[{\frac {\partial ^{2}L(y_{i},f(x_{i}))}{\partial f(x_{i})^{2}}}\right]_{f(x)={\hat {f}}_{(m-1)}(x)}.$
2. Fit a base learner (or weak learner, e.g. tree) using the training set $\displaystyle \{x_{i},-{\frac {{\hat {g}}_{m}(x_{i})}{{\hat {h}}_{m}(x_{i})}}\}_{i=1}^{N}$ by solving the optimization problem below:
  ${\hat {\phi }}_{m}={\underset {\phi \in \mathbf {\Phi } }{\arg \min }}\sum _{i=1}^{N}{\frac {1}{2}}{\hat {h}}_{m}(x_{i})\left[-{\frac {{\hat {g}}_{m}(x_{i})}{{\hat {h}}_{m}(x_{i})}}-\phi (x_{i})\right]^{2}.$
  
  ${\hat {f}}_{m}(x)=\alpha {\hat {\phi }}_{m}(x).$
3. Update the model:
  ${\hat {f}}_{(m)}(x)={\hat {f}}_{(m-1)}(x)+{\hat {f}}_{m}(x).$
Output: ${\hat {f}}(x)={\hat {f}}_{(M)}(x)=\sum _{m=0}^{M}{\hat {f}}_{m}(x),$ where ${\hat {f}}_{0}(x)={\hat {f}}_{(0)}(x)$ denotes the initial constant model.
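
The steps above can be sketched in plain Python. This is a minimal illustration and not the library's implementation: it assumes a single numeric feature, the logistic loss $L(y,f)=\log(1+e^{f})-yf$ with labels in {0, 1}, and depth-one trees (stumps) as base learners. The names `fit_stump` and `newton_boost` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(ys, scores):
    # Logistic loss L(y, f) = log(1 + e^f) - y*f, summed over the data.
    return sum(math.log(1.0 + math.exp(f)) - y * f for y, f in zip(ys, scores))


def fit_stump(xs, g, h):
    """Step 2: weighted least-squares stump for targets -g/h with weights h."""
    def leaf(idx):
        # The constant w minimizing sum 0.5*h*(-g/h - w)^2 is -sum(g)/sum(h).
        w = -sum(g[i] for i in idx) / sum(h[i] for i in idx)
        loss = sum(0.5 * h[i] * (-g[i] / h[i] - w) ** 2 for i in idx)
        return w, loss

    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best = None
    for k in range(1, len(order)):
        lo, hi = xs[order[k - 1]], xs[order[k]]
        if lo == hi:
            continue  # no threshold fits between equal feature values
        w_left, loss_left = leaf(order[:k])
        w_right, loss_right = leaf(order[k:])
        if best is None or loss_left + loss_right < best[0]:
            best = (loss_left + loss_right, (lo + hi) / 2.0, w_left, w_right)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    _, threshold, w_left, w_right = best
    return lambda x: w_left if x <= threshold else w_right


def newton_boost(xs, ys, M=20, alpha=0.3):
    # Initialization: for the logistic loss the best constant is the log-odds
    # of the mean label (assumes both classes are present).
    p = sum(ys) / len(ys)
    f0 = math.log(p / (1.0 - p))
    scores = [f0] * len(xs)
    learners = []
    for _ in range(M):
        # Step 1: gradients and hessians of the loss at the current model.
        probs = [sigmoid(s) for s in scores]
        g = [pi - yi for pi, yi in zip(probs, ys)]
        h = [pi * (1.0 - pi) for pi in probs]
        # Step 2: fit the base learner; step 3: add it scaled by alpha.
        phi = fit_stump(xs, g, h)
        learners.append(phi)
        scores = [s + alpha * phi(x) for s, x in zip(scores, xs)]
    return lambda x: f0 + alpha * sum(phi(x) for phi in learners)


# Toy data (made up for illustration): labels mostly increase with x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 1, 0, 1, 1]
model = newton_boost(xs, ys)
print([round(sigmoid(model(x)), 3) for x in xs])
```

The hessians become very small when predictions saturate, so a practical implementation would guard the division; the small number of rounds and the learning rate keep this toy run away from that regime.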

Awards

John Chambers Award (2016)^[21]
High Energy Physics meets Machine Learning award (HEP meets ML) (2016)^[22]

References

[1] "Release 1.4.0 · dmlc/xgboost". GitHub. Retrieved 2021-05-13.
[2] "GitHub project webpage".
[3] "Python Package Index PYPI: xgboost". Retrieved 2016-08-01.
[4] "CRAN package xgboost". Retrieved 2016-08-01.
[5] "Julia package listing xgboost". Retrieved 2016-08-01.
[6] "CPAN module AI::XGBoost". Retrieved 2020-02-09.
[7] "Installing XGBoost for Anaconda in Windows". Retrieved 2016-08-01.
[8] "Installing XGBoost on Mac OSX". Retrieved 2016-08-01.
[9] "Dask Homepage".
[10] "Distributed XGBoost with Dask — xgboost 1.5.0-dev documentation". xgboost.readthedocs.io. Retrieved 2021-07-15.
[11] "XGBoost - ML winning solutions (incomplete list)". Retrieved 2016-08-01.
[12] "Story and Lessons behind the evolution of XGBoost". Retrieved 2016-08-01.
[13] "Rabit - Reliable Allreduce and Broadcast Interface". Retrieved 2016-08-01.
[14] "XGBoost4J". Retrieved 2016-08-01.
[15] "XGBoost on FPGAs". Retrieved 2019-08-01.
[16] Chen, Tianqi; Guestrin, Carlos (2016). "XGBoost: A Scalable Tree Boosting System". In Krishnapuram, Balaji; Shah, Mohak; Smola, Alexander J.; Aggarwal, Charu C.; Shen, Dou; Rastogi, Rajeev (eds.). Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, August 13-17, 2016. ACM. pp. 785–794. arXiv:1603.02754. doi:10.1145/2939672.2939785.
[17] Sagi, Omer; Rokach, Lior (2021). "Approximating XGBoost with an interpretable decision tree". Information Sciences. 572: 522–542. doi:10.1016/j.ins.2021.05.055.
[18] Gandhi, Rohith (2019-05-24). "Gradient Boosting and XGBoost". Medium. Retrieved 2020-01-04.
[19] "Boosting algorithm: XGBoost". Towards Data Science. 2017-05-14. Retrieved 2020-01-04.
[20] "Tree Boosting With XGBoost – Why Does XGBoost Win "Every" Machine Learning Competition?". Synced. 2017-10-22. Retrieved 2020-01-04.
[21] "John Chambers Award Previous Winners". Retrieved 2016-08-01.
[22] "HEP meets ML Award". Retrieved 2016-08-01.
