Informaticists are attempting to predict chronic healthcare events that are not fully understood. The resulting models often incorporate large numbers of predictors derived from diverse datasets. This approach may yield desirable performance characteristics, but it sacrifices interpretability and portability. The Bootstrapped Ridge Selector (BoRidge) offers a tool to balance performance with interpretability. On artificially generated binary classification data, the BoRidge (sensitivity: 0.83, specificity: 0.72) outperformed two modern feature selection methods: Bootstrapped LASSO regression (BoLASSO; sensitivity: 0.1, specificity: 1) and a minimal-redundancy-maximal-relevance selector (mRMR; sensitivity: 0.69, specificity: 0.69). On a dataset used to validate a published suicide risk prediction model, the BoRidge selected a model as precise as the published one while using far fewer predictors (114 versus 1,538). The BoRidge has the potential to simplify classification models for complex problems, making them easier to translate and act on.
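
The abstract does not say whether the reported sensitivity and specificity describe recovery of the truly informative predictors or the downstream classifier's predictions. As a minimal sketch under the first reading (an assumption, not the authors' stated evaluation), the two metrics could be computed for any feature selector on synthetic data where the informative predictors are known. The function name and feature labels below are hypothetical.

```python
def selection_sensitivity_specificity(selected, informative, all_features):
    """Score a feature selector against known ground truth.

    Assumed reading: sensitivity is the fraction of truly informative
    features that were selected; specificity is the fraction of noise
    features that were correctly left out.
    """
    selected = set(selected)
    informative = set(informative)
    noise = set(all_features) - informative

    true_positives = len(selected & informative)   # informative and kept
    true_negatives = len(noise - selected)         # noise and dropped

    sensitivity = true_positives / len(informative) if informative else float("nan")
    specificity = true_negatives / len(noise) if noise else float("nan")
    return sensitivity, specificity


# Hypothetical example: 10 synthetic features, the first 4 informative.
all_features = [f"f{i}" for i in range(10)]
informative = ["f0", "f1", "f2", "f3"]
selected = ["f0", "f1", "f2", "f7"]  # one informative feature missed, one noise feature kept

sens, spec = selection_sensitivity_specificity(selected, informative, all_features)
```

Under this reading, a selector that keeps very few features can reach a specificity near 1 while missing most informative predictors, which is the pattern reported for BoLASSO.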

Learning Objective 1: Learn the benefits of feature selection when building predictive models to make such models more interpretable and portable to other sites.


Matthew Lenert (Presenter)
Vanderbilt University

Colin Walsh, Vanderbilt University
