Rao's Mixed-Effects Model for Complex Surveys
- Rao's Mixed-Effects Model is a two-level linear mixed model estimated via pairwise composite likelihood, yielding design-consistent results for survey data with clusters.
- The method mitigates high-dimensional integration challenges by summarizing bivariate likelihood contributions, trading some efficiency for robustness in estimating fixed and random effects.
- The approach is efficiently implemented in the R package svylme, making it practical for applications in educational, health, and social science surveys with complex sampling designs.
Rao's Mixed-Effects Model refers to the class of two-level linear mixed models estimated using the pairwise composite likelihood (PCL) approach developed by Rao and co-workers for complex survey data. This methodology yields a general, design-consistent estimator for hierarchical linear models when the model clusters coincide with survey design clusters, without requiring cluster sizes to grow large. Rao's approach offers robust inference for both fixed and random effects under informative and complex sampling, at the cost of some loss in statistical efficiency, particularly for variance components (Lumley et al., 2023).
1. Formulation of the Two-Level Linear Mixed Model
The classical two-level linear mixed model is defined as
where indexes clusters (e.g., schools) and indexes units within cluster . The random effects follow , the residuals , and . Here, is a row of fixed-effects covariates and is a row of random-effects covariates, with parameter vector , where parametrizes variance-covariance structure.
Marginalizing over random effects, the within-cluster response vector has distribution
where , stack the individual-level covariates.
2. Pairwise Composite Likelihood Construction
Full maximum likelihood for mixed models involves high-dimensional integration over random effects. Rao's PCL approach instead employs a sum of bivariate log-densities for all pairs within clusters:
where
and , . The covariance submatrix is extracted from .
Survey sampling induces complex patterns of missing data: only a subset of units (and thus pairs) are observed, each with inclusion probability . The design-weighted composite loglikelihood is
with for sampled units.
3. Statistical Estimation via Composite Scores
Unbiased estimating equations are derived by differentiating with respect to model parameters:
- Fixed effects ():
Solving for fixed yields the generalized least-squares estimator.
- Residual variance ():
- Variance parameters ():
Joint closed-form solutions do not exist; thus, a profiling strategy is employed. For each candidate , corresponding profile estimators and are obtained by maximizing conditionally. The profile deviance is defined as
4. Computational Implementation
The R package svylme provides an efficient implementation via the function svy2lme(design, formula). Key algorithmic steps include:
- Pairwise enumeration: Within each sampled cluster, all observed within-cluster pairs are enumerated. Inclusion weights are calculated using exact or approximate formulas derived from survey sampling probabilities.
- Profiling: For each optimizer trial value of , the corresponding profiles and are computed by applying generalized least squares to pseudo-observations (two rows per pair).
- Optimization: The profile deviance is minimized over using Powell’s BOBYQA derivative-free optimizer (R package minqa). Start values are obtained from an unweighted fit using
lmerfrom lme4. - Efficient blockwise calculation: Closed-form formulas for determinant and inverse of covariance blocks are used for computational efficiency.
5. Variance Estimation and Large-Sample Properties
Under suitable conditions, including a law of large numbers and central limit theorem for weighted sums of pairwise scores, the PCL estimator is consistent and asymptotically normal:
where (sensitivity) and , with . For complex survey designs, sandwich estimators replace and with their weighted analogues:
- "with-replacement" PSU-level estimate of .
6. Efficiency, Robustness, and Comparative Properties
Simulations demonstrate the following empirical properties:
- Under noninformative sampling, the PCL estimator is nearly unbiased for , , and , but is less efficient than full maximum likelihood and stagewise pseudolikelihood estimators with weight scaling. The efficiency loss is most pronounced for variance components (e.g., approximately 2–3 times larger standard errors for random intercept variance in moderate cluster sizes); as cluster sizes decrease, this inefficiency diminishes, with PCL equaling maximum likelihood efficiency when cluster size .
- Under strongly informative sampling (where unit inclusion depends on cluster-level effects), PCL estimators remain unbiased, while stagewise pseudolikelihood—regardless of weight scaling—can display substantial bias in both fixed effects () and variance components, unless clusters are extremely large.
This approach trades statistical efficiency (especially for variance parameters) for robustness to informative sampling and is suitable for general multistage survey designs (Lumley et al., 2023).
7. Applications and Extensions
The PCL methodology is implemented in the svylme R package, supporting the analysis of two-level linear mixed models applied to complex survey data such as educational assessments (e.g., PISA). The method's robustness under complex, informative sampling extends its applicability to a broad class of health and social science surveys featuring multistage cluster designs. The approach does not require large clusters, facilitating estimation in practical settings where cluster sizes are moderate or where only a limited number of within-cluster units are observed.
A plausible implication is that future extensions may generalize PCL methods to multilevel structures beyond two levels or to other classes of generalized linear mixed models, building on the computational strategies and robustness guarantees of Rao's approach.