Feature Selection in Machine Learning with Feature-engine

English | 2022 | 52 Pages | PDF, EPUB | 15 MB

Discover feature selection algorithms that scale well and overcome the limitations of statistical models or the computational cost of wrapper methods.

Learn how to implement various feature selection methods in a few lines of code using the open-source Python library Feature-engine.

Feature-engine is an open-source Python library for feature engineering and feature selection. It uses pandas and Scikit-learn under the hood to engineer and select feature subsets.

Feature selection is the process of selecting a subset of the variables in a data set to train machine learning algorithms. It is key to developing simpler, faster, and better-performing machine learning models. The aim of any feature selection algorithm is to create classifiers or regression models that run faster and whose outputs are easier for their users to understand.

In this book, you will find feature selection methods described in the scientific literature and used in data science competitions to select the best subsets of predictor variables from your data. These methods extend the feature selection toolkit already provided by Scikit-learn with additional tools that scale better than wrapper methods, overcome the limitations of statistical methods, and capture feature interactions while handling feature redundancy.