Papers
Topics
Authors
Recent
Search
2000 character limit reached

A network flow approach to visualising the roles of covariates in random forests

Published 27 Jun 2017 in stat.OT | (1706.08702v1)

Abstract: We propose novel applications of parallel coordinates plots and Sankey diagrams to represent the hierarchies of interacting covariate effects in random forests. Each visualisation summarises the frequencies of all of the paths through all of the trees in a random forest. Visualisations of the roles of covariates in random forests include: ranked bar or dot charts depicting scalar metrics of the contributions of individual covariates to the predictive accuracy of the random forest; line graphs depicting various summaries of the effect of varying a particular covariate on the predictions from the random forest; heatmaps of metrics of the strengths of interactions between all pairs of covariates; and parallel coordinates plots for each response class depicting the distributions of the values of all covariates among the observations most representative of those predicted to belong that class. Together these visualisations facilitate substantial insights into the roles of covariates in a random forest but do not communicate the frequencies of the hierarchies of covariates effects across the random forest or the orders in which covariates occur in these hierarchies. Our visualisations address these gaps. We demonstrate our visualisations using a random forest fitted to publicly available data and provide a software implementation in the form of an R package.

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.