
Composite Quasi-Likelihood (CQL)

Updated 20 February 2026
  • Composite Quasi-Likelihood (CQL) is a method for estimating group-specific dynamic panel models with spatial correlation while addressing fixed effect misspecification.
  • It leverages pairwise Gaussian quasi-likelihoods and the DC algorithm to jointly optimize model parameters and latent group classifications in high-dimensional settings.
  • Empirical results show high classification accuracy (>95%) and reduced bias, validating its use in complex spatial dynamic panel models.

Composite quasi-likelihood (CQL) refers to a quasi-likelihood methodology for estimation and classification in large dynamic panel models with group-specific heterogeneity and spatially dependent errors. Developed to address the computational infeasibility of the full likelihood in high-dimensional panels with unobserved group structure and spatial correlation, the CQL approach leverages composite (instrumental or marginal) likelihoods, paired with sophisticated optimization algorithms, to enable consistent and efficient parameter and group membership estimation in models subject to fixed effects misspecification and spatial dependence (Chu, 2017).

1. Model Framework and Specification

The foundational setting is a balanced panel $\{y_{it}, x_{it}\}$, $i = 1, \dots, N$, $t = 1, \dots, T$. Each observational unit $i$ belongs to one of $G$ unknown latent groups, $g_i \in \{1, \dots, G\}$. Conditional on group membership, the core dynamic error-correction model is

$$y_{it} = \alpha_i + \beta_{g_i}' x_{it} + \phi_{g_i}\left(y_{i,t-1} - \theta_{g_i}' x_{i,t-1}\right) + u_{it}$$

where:

  • $\alpha_i$ is an individual fixed effect (which may be misspecified),
  • $(\beta_g, \phi_g, \theta_g)$ are group-specific slope and adjustment parameters,
  • $u_{it}$ is an innovation permitting both serial and (contemporaneous) spatial correlation.
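As a concrete illustration of the recursion above, the following sketch simulates one unit's trajectory given its group's parameters. All numerical values are hypothetical and the spatial structure of the innovations is omitted here for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions and group parameters (hypothetical values,
# not taken from Chu, 2017).
T, K = 50, 2                       # time periods, number of regressors
alpha_i = 0.5                      # individual fixed effect
beta_g = np.array([1.0, -0.5])     # group-specific slopes on x_it
phi_g = -0.3                       # error-correction adjustment speed
theta_g = np.array([0.8, 0.2])     # long-run coefficients

x = rng.normal(size=(T, K))        # exogenous regressors
u = rng.normal(scale=0.5, size=T)  # innovations (spatial dependence omitted)

# y_it = alpha_i + beta' x_it + phi (y_{i,t-1} - theta' x_{i,t-1}) + u_it
y = np.empty(T)
y[0] = alpha_i + beta_g @ x[0] + u[0]
for t in range(1, T):
    y[t] = alpha_i + beta_g @ x[t] \
        + phi_g * (y[t - 1] - theta_g @ x[t - 1]) + u[t]
```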

Spatial error dependence is modeled as

$$u_t = W u_t + \varepsilon_t$$

where $u_t = (u_{1t}, \dots, u_{Nt})'$, $W$ is a spatial-weights matrix (diagonal elements zero), and $\varepsilon_t$ is spherical, $E[\varepsilon_t \varepsilon_t'] = \sigma^2 I_N$. This yields $\text{Cov}(u_t) = \sigma^2 (I - W)^{-1} (I - W')^{-1}$ (Chu, 2017).
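The implied covariance can be computed directly from a weights matrix. The sketch below uses a small, hypothetical contiguity matrix, rescaled so that $(I - W)$ is invertible:

```python
import numpy as np

# Implied covariance Cov(u_t) = sigma^2 (I - W)^{-1} (I - W')^{-1}
# for a small illustrative contiguity matrix (hypothetical example).
N, sigma2 = 4, 1.0
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
# Row-normalize and scale so the spectral radius is below 1,
# which guarantees (I - W) is invertible.
W = 0.4 * W / W.sum(axis=1, keepdims=True)

A_inv = np.linalg.inv(np.eye(N) - W)
cov_u = sigma2 * A_inv @ A_inv.T   # symmetric positive definite
```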

2. Composite Quasi-Likelihood Objective

The infeasibility of the $N$-dimensional Gaussian likelihood at large $N$ motivates the use of a composite likelihood. Specifically, the CQL objective is formed by summing pairwise (bivariate) Gaussian quasi-log-likelihood contributions:

$$\mathcal{L}_{CQL}(\psi, \{g_i\}_{i=1}^N) = \sum_{i<j} w_{ij} \sum_{t=2}^T \ell_{ij,t}\left(\psi;\, (y_{it}, y_{jt}),\, (y_{i,t-1}, y_{j,t-1}),\, (x_{it}, x_{jt}),\, (x_{i,t-1}, x_{j,t-1})\right)$$

where

  • $\psi$ is the collection of all structural parameters,
  • $w_{ij}$ are user-specified weights (commonly $w_{ij} = 1$),
  • $\ell_{ij,t}$ is the Gaussian quasi-log-likelihood for the pair $(i, j)$, constructed using the $2 \times 2$ implied covariance from the spatial process (Chu, 2017).
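A simplified sketch of one pair's contribution is given below: it extracts the $2 \times 2$ block of the implied covariance for units $(i, j)$ and evaluates the bivariate Gaussian quasi-log-likelihood over residuals that are assumed to have already been formed from the dynamic model (a stylized illustration, not Chu's exact implementation):

```python
import numpy as np

def pair_loglik(resid_i, resid_j, cov_u, i, j):
    """Gaussian quasi-log-likelihood of the residual pair (e_it, e_jt)
    under the 2x2 covariance block implied by the spatial process."""
    S = cov_u[np.ix_([i, j], [i, j])]        # 2x2 implied covariance block
    S_inv = np.linalg.inv(S)
    _, logdet = np.linalg.slogdet(S)
    e = np.column_stack([resid_i, resid_j])  # T x 2 array of residual pairs
    quad = np.einsum('tk,kl,tl->t', e, S_inv, e)  # e_t' S^{-1} e_t per period
    # Sum of bivariate Gaussian log-densities across t
    return -0.5 * np.sum(logdet + quad + 2.0 * np.log(2.0 * np.pi))
```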

3. Joint Parameter Estimation and Latent Group Classification

Both group memberships $\{g_i\}$ and model parameters $\psi$ are estimated jointly by maximizing the composite quasi-likelihood. This leads to the optimization problem

$$\min_{\psi, \{g_i\}} -\mathcal{L}_{CQL}(\psi, \{g_i\}) \quad \text{s.t. } g_i \in \{1, \dots, G\}, \quad \sum_{g=1}^G 1\{g_i = g\} = 1 \ \forall i.$$

Introducing binary indicators $z_{i,g} = 1\{g_i = g\}$, the problem becomes a nonconvex mixed-integer program (MIP), as the group assignments interact nonlinearly with both the likelihood and the spatial covariance structure (Chu, 2017).

4. Optimization via Difference-of-Convex Decomposition and DCA

Chu demonstrates that $-\mathcal{L}_{CQL}$, given fixed $Z = \{z_{i,g}\}$, can be decomposed as the difference of two proper convex functions in the continuous parameters:

$$Q(\psi, Z) = f(\psi, Z) - g(\psi, Z)$$

where $f$ captures convexified quadratic forms (log-determinant and trace terms, regularized as needed) and $g$ corrects the over-regularization to restore the exact composite objective. This structure enables the application of the DC Algorithm (DCA):

  • At each iteration with membership fixed, compute a subgradient of gg in ψ\psi.
  • Minimize the resulting strongly convex surrogate to update ψ\psi.
  • Optionally, update group memberships ZZ by re-assigning each ii to the group that minimizes its local objective, or perform a discrete optimization step.
  • Alternate until convergence.
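The alternating scheme above can be sketched as a generic block-coordinate loop. The callables (`grad_g`, `solve_convex`, `local_objective`) are placeholders for the model-specific pieces and are not Chu's implementation; the loop only illustrates the control flow of the DCA step followed by unit-wise group reassignment:

```python
def dca_block_coordinate(psi0, Z0, grad_g, solve_convex, local_objective,
                         G, n_iter=50):
    """Schematic DCA / block-coordinate loop for Q(psi, Z) = f - g.

    grad_g(psi, Z)          -> subgradient of g at the current psi
    solve_convex(v, Z)      -> minimizer of the convex surrogate f - <v, psi>
    local_objective(psi,i,g)-> unit i's objective contribution under group g
    """
    psi, Z = psi0, Z0.copy()
    for _ in range(n_iter):
        v = grad_g(psi, Z)              # linearize the concave part -g
        psi = solve_convex(v, Z)        # minimize the strongly convex surrogate
        for i in range(len(Z)):         # reassign each unit to its best group
            Z[i] = min(range(G), key=lambda g: local_objective(psi, i, g))
    return psi, Z
```

With trivial choices for the callables (zero subgradient, group means, squared distance) the loop reduces to a k-means-style alternation, which conveys the block-coordinate structure without the spatial machinery.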

This iterative block-coordinate approach is an efficient heuristic for the MIP, providing a scalable solution route for high-dimensional panels (Chu, 2017).

5. Theoretical Properties

Under moderate regularity:

  • Asymptotic unbiasedness and consistency of the slope parameters $(\beta_g, \phi_g, \theta_g)$ hold jointly as $N, T \to \infty$, even when the individual fixed effects $\alpha_i$ are arbitrary or misspecified.
  • The estimator attains the oracle property: the limiting distribution of the parameter estimates matches that obtained with perfect (oracle) knowledge of group memberships and fixed effects.
  • Asymptotic normality holds for each group:
    $$\sqrt{N_g T} \begin{pmatrix} \widehat{\beta}_g - \beta_g^0 \\ \widehat{\phi}_g - \phi_g^0 \\ \widehat{\theta}_g - \theta_g^0 \end{pmatrix} \Rightarrow \mathcal{N}(0, V_g)$$
    where $N_g$ is the group size and $V_g$ is the appropriate sandwich covariance (Chu, 2017).
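For concreteness, sandwich covariances for composite-likelihood estimators generically take the Godambe-information form below; this is the standard expression, shown as an illustration rather than Chu's exact derivation:

```latex
% Generic sandwich (Godambe) form for a composite-likelihood estimator:
% outer Hessian blocks around the score variance.
V_g = H_g^{-1} J_g H_g^{-1}, \qquad
H_g = -\lim \mathbb{E}\!\left[\nabla^2_{\psi_g} \mathcal{L}_{CQL}\right], \qquad
J_g = \lim \operatorname{Var}\!\left[\nabla_{\psi_g} \mathcal{L}_{CQL}\right]
```

Because the pairwise terms in $\mathcal{L}_{CQL}$ overlap, $J_g$ generally differs from $H_g$, which is why the sandwich form (rather than the inverse Hessian alone) is needed.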

6. Regularity Assumptions

Establishing these properties relies on:

  • (A1) Stationarity and $\alpha$-mixing: $\{(x_{it}, u_{it})\}$ is strictly stationary in $t$ and $\alpha$-mixing across $(i, t)$ with mixing coefficients decaying sufficiently fast.
  • (A2) Group heterogeneity: each group contains a non-negligible share of units, and group-specific parameters are separated by at least $\delta > 0$ in norm.
  • (A3) Spatial-error process: $(I - W)$ is invertible, and the spatial process is $\ell$-mixing in the cross-section.
  • (A4) Moments: uniform fourth-moment bounds on $x_{it}$ and $u_{it}$; parameters contained in a compact set (Chu, 2017).

7. Finite-Sample Behavior and Simulation Results

Monte Carlo experiments with $N \in \{100, 200\}$, $T \in \{20, 50\}$, and $G \in \{2, 3\}$ indicate:

  • Classification accuracy of unit grouping exceeds 95% even at moderate $N, T$.
  • Bias and RMSE of $(\beta_g, \phi_g, \theta_g)$ decrease at the $(NT)^{-1/2}$ rate, consistent with theory.
  • The CQL-DCA solution outperforms both naive pooled QMLE (ignoring groups) and pre-clustering (K-means followed by two-step QMLE) in both parameter estimation bias and group assignment accuracy.

The empirical findings substantiate the theoretical properties, demonstrating both robustness to fixed effect misspecification and the computational viability of the overall estimation framework (Chu, 2017).


Chu's CQL methodology, by constructing a pairwise quasi-likelihood, embedding a mixed-integer group allocation, and applying DC algorithmic optimization, provides a rigorous, computationally tractable, and asymptotically efficient solution to high-dimensional, group-heterogeneous, spatially dependent dynamic panel models (Chu, 2017).
