
Exploratory data analysis for large-scale multiple testing problems and its application in gene expression studies

Published 12 Dec 2019 in stat.CO and stat.AP | (1912.06030v1)

Abstract: In large-scale multiple testing problems, a two-class empirical Bayes approach can be used to control the false discovery rate (Fdr) for the entire array of hypotheses under study. This work modifies that approach by adding a sample-splitting step: one part of the data is used for model fitting, and the other part is used to detect significant cases with a screening technique featuring the empirical Bayes mode of Fdr control. Cases with high detection frequency across repeated random sample splits are considered true discoveries, and a critical detection frequency is set to control the overall false discovery rate. The proposed method helps to balance out unwanted sources of variation and addresses potential statistical overfitting of the core empirical model by cross-validation through resampling. Further, concurrent detection frequencies are used to provide visual tools for exploring the inter-relationships between significant cases. The methodology is illustrated using a microarray data set, an RNA-sequencing data set, and several simulation studies. A power analysis is presented to assess the efficiency of the proposed method.
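The split-and-count idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation, and several pieces are assumptions made only for illustration: the fitting half is used solely to estimate an empirical null center and scale (median and MAD of the test statistics), the detection half is screened with the Benjamini-Hochberg rule rather than the paper's empirical Bayes Fdr procedure, and the cutoff of 0.5 on the detection frequency is a hypothetical placeholder for the critical detection frequency. The function names (`z_scores`, `bh_reject`, `detection_frequency`) and the simulated data are likewise hypothetical.

```python
import math
import numpy as np

def z_scores(x, labels):
    """Welch-type two-group statistic per row (rows = cases, columns = samples)."""
    a, b = x[:, labels == 0], x[:, labels == 1]
    se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1] + b.var(axis=1, ddof=1) / b.shape[1])
    return (b.mean(axis=1) - a.mean(axis=1)) / se

def bh_reject(p, q):
    """Benjamini-Hochberg step-up rule (stand-in screen); returns a boolean mask."""
    m = p.size
    order = np.argsort(p)
    passed = p[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])  # largest rank meeting the BH bound
        reject[order[:k + 1]] = True
    return reject

def detection_frequency(x, labels, n_splits=50, q=0.1, seed=0):
    """Fraction of random sample splits in which each case is flagged."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(x.shape[0])
    for _ in range(n_splits):
        fit_idx, det_idx = [], []
        for g in (0, 1):  # split each group in half so both halves keep both groups
            idx = rng.permutation(np.nonzero(labels == g)[0])
            half = idx.size // 2
            fit_idx.extend(idx[:half])
            det_idx.extend(idx[half:])
        fit_idx, det_idx = np.array(fit_idx), np.array(det_idx)
        # Fitting half: estimate an empirical null (assumed: median / MAD).
        z_fit = z_scores(x[:, fit_idx], labels[fit_idx])
        center = np.median(z_fit)
        scale = 1.4826 * np.median(np.abs(z_fit - center))
        # Detection half: standardize against the fitted null, then screen.
        z_det = (z_scores(x[:, det_idx], labels[det_idx]) - center) / scale
        p = np.array([math.erfc(abs(z) / math.sqrt(2)) for z in z_det])  # two-sided normal p-value
        counts += bh_reject(p, q)
    return counts / n_splits

# Simulated data (hypothetical): 200 cases, 10 samples per group,
# with the first 10 cases shifted in group 1.
rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 10)
x = rng.normal(size=(200, 20))
x[:10, labels == 1] += 4.0

freq = detection_frequency(x, labels)
discoveries = np.nonzero(freq >= 0.5)[0]  # 0.5 is an arbitrary illustrative cutoff
print(discoveries)
```

In this sketch the cutoff is fixed by hand; the abstract instead describes choosing the critical detection frequency so that the overall false discovery rate is controlled, which the sketch does not attempt.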
