
Estimating a sharp convergence bound for randomized ensembles

Published 4 Mar 2013 in math.PR, cs.SI, math.ST, stat.ML, and stat.TH | arXiv:1303.0727v3

Abstract: When randomized ensembles such as bagging or random forests are used for binary classification, the prediction error of the ensemble tends to decrease and stabilize as the number of classifiers increases. However, the precise relationship between prediction error and ensemble size is unknown in practice. In the standard case where classifiers are aggregated by majority vote, the present work offers a way to quantify this convergence in terms of "algorithmic variance," i.e., the variance of prediction error due only to the randomized training algorithm. Specifically, we study a theoretical upper bound on this variance and show that it is sharp, in the sense that it is attained by a specific family of randomized classifiers. Next, we address the problem of estimating the unknown value of the bound, which leads to a unique twist on the classical problem of non-parametric density estimation. In particular, we develop an estimator for the bound and show that its MSE matches optimal non-parametric rates under certain conditions. (Concurrently with this work, closely related results were obtained by Cannings and Samworth (2017) and Lopes (2019).)
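As a concrete illustration of the quantity being bounded (not the paper's estimator), the sketch below measures algorithmic variance directly by Monte Carlo: it fixes the data and the train/test split, retrains a bagged ensemble under many different random seeds, and reports the seed-to-seed variance of the test error as the ensemble size grows. The dataset, ensemble sizes, and repetition count are all illustrative choices.

```python
# A minimal sketch, assuming scikit-learn: empirically measuring the
# "algorithmic variance" of a bagged ensemble, i.e. the variance of its
# test error across runs that differ only in the training algorithm's
# random seed, with the data held fixed.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

# Fixed binary classification problem and train/test split.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

for t in [1, 5, 25, 100]:            # ensemble sizes
    errors = []
    for seed in range(30):           # vary only the algorithm's randomness
        # Default base estimator is a decision tree; scikit-learn
        # aggregates by averaging predicted probabilities, a close
        # stand-in for the majority vote studied in the paper.
        clf = BaggingClassifier(n_estimators=t, random_state=seed)
        clf.fit(X_tr, y_tr)
        errors.append(np.mean(clf.predict(X_te) != y_te))
    # Algorithmic variance: fluctuation of test error across seeds.
    print(f"t={t:4d}  mean error={np.mean(errors):.4f}  "
          f"algorithmic variance={np.var(errors):.2e}")
```

In a run of this sketch, the variance across seeds typically shrinks by orders of magnitude as t grows while the mean error stabilizes, which is exactly the convergence phenomenon the paper's bound quantifies.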
