Pre-validation Revisited

Published 21 May 2025 in stat.ME and stat.ML | (2505.14985v2)

Abstract: Pre-validation is a way to build a prediction model with two datasets of significantly different feature dimensions. Previous work showed that the asymptotic distribution of the resulting test statistic for the pre-validated predictor deviates from a standard Normal, which leads to issues in hypothesis testing. In this paper, we revisit the pre-validation procedure and extend the problem formulation without any independence assumption on the two feature sets. We propose not only an analytical distribution of the test statistic for the pre-validated predictor under certain models, but also a generic bootstrap procedure to conduct inference. We show properties and benefits of pre-validation in prediction, inference and error estimation through simulations and applications, including an analysis of a breast cancer study and a synthetic GWAS example.
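
The abstract does not spell out the procedure itself. As a rough illustration of the classical pre-validation idea, the sketch below computes out-of-fold predictions from the high-dimensional feature set and then uses them as a single covariate next to the low-dimensional features in a second-stage regression. The ridge first stage, the OLS second stage, the fold count, and the names `prevalidate` and `second_stage` are all assumptions made for illustration; none of them is taken from the paper.

```python
import numpy as np

def prevalidate(X_high, y, n_folds=5, ridge=1.0, seed=0):
    """Out-of-fold ridge predictions of y from the high-dimensional features.

    Hypothetical helper: the ridge first stage is an assumption, not the paper's choice.
    """
    n = X_high.shape[0]
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_pv = np.empty(n)
    for held_out in folds:
        train = np.setdiff1d(np.arange(n), held_out)
        Xt, yt = X_high[train], y[train]
        x_mean, y_mean = Xt.mean(axis=0), yt.mean()
        Xc = Xt - x_mean
        # Ridge in dual form: solve an (n_train x n_train) system, cheap when p >> n.
        alpha = np.linalg.solve(Xc @ Xc.T + ridge * np.eye(len(train)), yt - y_mean)
        beta = Xc.T @ alpha
        # Each observation is predicted by a model that never saw its response.
        y_pv[held_out] = y_mean + (X_high[held_out] - x_mean) @ beta
    return y_pv

def second_stage(Z_low, y_pv, y):
    """OLS of y on an intercept, the low-dimensional features, and the pre-validated predictor."""
    design = np.column_stack([np.ones(len(y)), Z_low, y_pv])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # last entry is the coefficient on the pre-validated predictor
```

A naive t-test on the last coefficient is the kind of test statistic whose null distribution, according to the abstract, deviates from a standard Normal, which is what motivates the analytical distribution and the bootstrap procedure the paper proposes.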
