Testing Independence under Biased Sampling
Abstract: Testing for association or dependence between pairs of random variables is a fundamental problem in statistics. In some applications, data are subject to selection bias that induces dependence between observations even when no dependence exists in the population. An important example is given by truncation models, in which observed pairs are restricted to a specific subset of the X-Y plane. Standard tests for independence are not suitable in such cases, and alternative tests that take the selection bias into account are required. To deal with this issue, we generalize the notion of quasi-independence with respect to the sampling mechanism, and study the problem of detecting deviations from it. We develop two test statistics motivated by Hoeffding's classical statistic, and use two approaches to compute their distribution under the null: (i) a bootstrap-based approach, and (ii) a permutation test with non-uniform probabilities over permutations, in which permutations are sampled using either MCMC or importance sampling with various proposal distributions. We show that our tests can handle cases where the biased sampling mechanism is estimated from the data, with an important application to the case of censoring with truncation. We prove the validity of the tests and show, using simulations, that they perform well for important special cases of the problem and improve power compared to competing methods. The tests are applied to four datasets: two that are subject to truncation, with and without censoring, and two that are subject to positive bias mechanisms related to length bias.
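To make the non-uniform permutation idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that under quasi-independence the probability of a permutation is proportional to the product of the sampling weights w(x_i, y_pi(i)), and it samples permutations with a simple Metropolis swap chain. The function names (`weighted_permutation_test`, `truncation_weight`, `pearson`), the tuning parameters, and the use of Pearson correlation as the statistic are all illustrative assumptions; the paper's statistics are Hoeffding-type, and it also considers importance sampling, which is not shown here.

```python
import numpy as np

def truncation_weight(x, y):
    # Hypothetical sampling weight: a pair is observable only when x <= y.
    return float(x <= y)

def pearson(x, y):
    # Stand-in statistic (assumption); larger values count as more extreme.
    return np.corrcoef(x, y)[0, 1]

def weighted_permutation_test(x, y, w, stat, n_perm=200, n_sweep=None, seed=0):
    """Permutation test with P(pi) proportional to prod_i w(x[i], y[pi[i]]).

    Permutations are drawn by a Metropolis chain that proposes swapping
    two entries of the current permutation.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    n_sweep = 2 * n if n_sweep is None else n_sweep  # swaps between draws
    perm = np.arange(n)  # identity has positive weight: the data were observed
    observed = stat(x, y)
    null_stats = np.empty(n_perm)
    for b in range(n_perm):
        for _ in range(n_sweep):
            i, j = rng.integers(0, n, size=2)
            current = w(x[i], y[perm[i]]) * w(x[j], y[perm[j]])
            proposed = w(x[i], y[perm[j]]) * w(x[j], y[perm[i]])
            # Metropolis rule: accept with probability min(1, proposed/current).
            if rng.random() * current < proposed:
                perm[i], perm[j] = perm[j], perm[i]
        null_stats[b] = stat(x, y[perm])
    # One-sided p-value with the usual +1 correction.
    p_value = (1 + np.sum(null_stats >= observed)) / (n_perm + 1)
    return observed, p_value

# Simulated example: X and Y independent in the population, truncated to x <= y.
demo_rng = np.random.default_rng(1)
x_all, y_all = demo_rng.uniform(size=300), demo_rng.uniform(size=300)
keep = x_all <= y_all
x, y = x_all[keep][:60], y_all[keep][:60]
observed, p_value = weighted_permutation_test(x, y, truncation_weight, pearson)
```

Because a swap that would create a pair outside the truncation region has weight zero, it is never accepted, so every sampled permutation respects the restriction x <= y. Consecutive draws from the chain are correlated; the `n_sweep` spacing is a heuristic choice and not a recommendation from the paper.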