English | 2020 | ISBN: 978-1839218354 | 243 Pages | PDF, EPUB | 347 MB
Get to grips with building robust XGBoost models using Python and scikit-learn for deployment
XGBoost is an industry-proven, open source software library that provides a gradient boosting framework for scaling to billions of data points quickly and efficiently.
The book starts with an introduction to machine learning and XGBoost before gradually moving on to gradient boosting. You’ll cover decision trees in detail and analyze bagging in the machine learning context. You’ll then learn how to build gradient boosting models from scratch, extend gradient boosting to big data, and recognize its limitations. The book also shows you how to implement fast and accurate machine learning models using XGBoost and scikit-learn and takes you through advanced XGBoost techniques by focusing on speed enhancements, deriving parameters mathematically, and building robust models. With the help of detailed case studies, you’ll practice building and fine-tuning regressors and classifiers and become familiar with new tools such as feature importance and the confusion matrix. Finally, you’ll explore alternative base learners, learn invaluable Kaggle tricks such as building non-correlated ensembles and stacking, and prepare XGBoost models for industry deployment with unique transformers and pipelines.
By the end of the book, you’ll be able to build high-performing machine learning models using XGBoost with minimal errors and maximum speed.
What you will learn
- Build machine learning bagging and boosting models
- Develop XGBoost regressors and classifiers with impressive accuracy and speed
- Find out how to analyze variance and bias in machine learning
- Compare XGBoost’s results to decision trees, random forests, and gradient boosting
- Visualize tree-based models and use machine learning to determine the most important features of a dataset
- Implement robust XGBoost models ready for industry deployment
- Build non-correlated ensembles and stack XGBoost models to increase accuracy