Scalable Causal Structure Learning via Amortized Conditional Independence Testing

Published 25 Oct 2023 in stat.ME and stat.AP | (2310.16626v2)

Abstract: Controlling false positives (Type I errors) through statistical hypothesis testing is a foundation of modern scientific data analysis. Existing causal structure discovery algorithms either do not provide Type I error control or cannot scale to the size of modern scientific datasets. We consider a variant of the causal discovery problem with two sets of nodes, where the only edges of interest form a bipartite causal subgraph between the sets. We develop Scalable Causal Structure Learning (SCSL), a method for causal structure discovery on bipartite subgraphs that provides Type I error control. SCSL recasts the discovery problem as a simultaneous hypothesis testing problem and uses discrete optimization over the set of possible confounders to obtain an upper bound on the test statistic for each edge. Semi-synthetic simulations demonstrate that SCSL scales to handle graphs with hundreds of nodes while maintaining error control and good power. We demonstrate the practical applicability of the method by applying it to a cancer dataset to reveal connections between somatic gene mutations and metastases to different tissues.
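To make the "simultaneous hypothesis testing" framing concrete, here is a minimal sketch and not the SCSL implementation. It rests on two assumptions the abstract does not confirm: each candidate edge is summarized conservatively by the largest p-value observed across a set of candidate confounder sets, and the resulting p-values are corrected with the standard Benjamini-Hochberg step-up procedure. The abstract does not say which correction SCSL uses. The function names, edge labels, and p-values are all hypothetical and purely illustrative.

```python
def conservative_edge_p_values(p_by_conditioning_set):
    """Collapse per-conditioning-set p-values to one p-value per edge.

    Assumption (not from the paper): taking the maximum over candidate
    confounder sets gives a conservative summary for each edge.
    """
    return {edge: max(ps) for edge, ps in p_by_conditioning_set.items()}


def benjamini_hochberg(p_values, alpha=0.1):
    """Return the sorted indices rejected by the Benjamini-Hochberg procedure."""
    m = len(p_values)
    # Indices of the hypotheses ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank whose p-value passes.
        if p_values[idx] <= alpha * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


# Hypothetical p-values for three candidate edges (source node, target node),
# each tested under two candidate conditioning sets.
raw = {
    ("x1", "y1"): [0.001, 0.004],
    ("x2", "y1"): [0.020, 0.300],
    ("x3", "y1"): [0.008, 0.012],
}
p_max = conservative_edge_p_values(raw)
edges = list(p_max)
rejected = [edges[i] for i in benjamini_hochberg([p_max[e] for e in edges])]
print(rejected)
```

Taking the maximum over conditioning sets means an edge is reported only if it stays significant under every candidate adjustment. That is one simple way to read the abstract's idea of bounding the evidence for each edge over possible confounders.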
