Feature Selection with the Boruta Package

Miron B. Kursa, Witold R. Rudnicki

Abstract

This article describes the R package Boruta, which implements a novel feature selection algorithm for finding all relevant variables. The algorithm is designed as a wrapper around the random forest classification algorithm. It iteratively removes features that a statistical test shows to be less relevant than random probes. The Boruta package provides a convenient interface to the algorithm. A short description of the algorithm and examples of its application are presented.
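
The idea of comparing each feature against random probes can be illustrated with a minimal sketch. It is not the Boruta implementation: it assumes absolute Pearson correlation as a stand-in for random forest importance, builds the probes ("shadows") by shuffling each remaining feature, and uses an uncorrected two-sided binomial test at p = 0.5. All names in it (`pearson_abs`, `binom_tail_ge`, `shadow_feature_selection`, and the toy data) are hypothetical and chosen only for illustration.

```python
import math
import random


def pearson_abs(xs, ys):
    """Absolute Pearson correlation; a stand-in importance score (assumption)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / math.sqrt(sxx * syy)


def binom_tail_ge(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def shadow_feature_selection(columns, y, max_runs=50, alpha=0.01, seed=0):
    """Label each feature 'confirmed', 'rejected', or 'tentative'.

    columns maps a feature name to its list of values; y is the target.
    """
    rng = random.Random(seed)
    status = {name: "tentative" for name in columns}
    hits = {name: 0 for name in columns}
    for run in range(1, max_runs + 1):
        # Rejected features are dropped; the rest still contribute probes.
        active = [name for name in columns if status[name] != "rejected"]
        # A probe is a shuffled copy of a feature, so any link to y is broken.
        best_shadow = 0.0
        for name in active:
            shadow = list(columns[name])
            rng.shuffle(shadow)
            best_shadow = max(best_shadow, pearson_abs(shadow, y))
        for name in active:
            if status[name] != "tentative":
                continue
            # A "hit" means the feature beat the best random probe in this run.
            if pearson_abs(columns[name], y) > best_shadow:
                hits[name] += 1
            if binom_tail_ge(hits[name], run) < alpha:
                status[name] = "confirmed"  # significantly more hits than chance
            elif binom_tail_ge(run - hits[name], run) < alpha:
                status[name] = "rejected"  # significantly fewer hits than chance
        if "tentative" not in status.values():
            break
    return status


# Toy data (assumption): one feature tracks the target, three are pure noise.
data_rng = random.Random(42)
y = [i % 2 for i in range(200)]
columns = {"signal": [v + data_rng.gauss(0, 0.1) for v in y]}
for j in range(3):
    columns[f"noise{j}"] = [data_rng.gauss(0, 1) for _ in y]

status = shadow_feature_selection(columns, y)
print(status)
```

Features still labelled `tentative` after `max_runs` iterations are left undecided instead of being forced into either class. A multiple-testing correction is omitted here for brevity.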
