
Semi-FairVAE: Fair Semi-Supervised VAE

Updated 20 February 2026
  • Semi-FairVAE is a semi-supervised fair representation learning framework that leverages dual encoders—a bias-aware branch and an adversarially purified bias-free branch—to segregate sensitive attributes.
  • It employs a VAE structure combined with adversarial and entropy regularizations, ensuring robust debiasing and minimal performance loss using both labeled and unlabeled data.
  • Empirical evaluations on datasets like Adult and NewsRec demonstrate significant reductions in bias metrics while maintaining high predictive performance.

Semi-FairVAE is a semi-supervised fair representation learning framework that integrates adversarial learning and variational autoencoders (VAEs), designed to enable effective bias removal from latent representations even when only a small fraction of sensitive attribute labels are available. The architecture, combining dual encoder paths—one bias-aware and one adversarially bias-free—and a VAE decoder, introduces a principled mechanism for leveraging both labeled and unlabeled data to drive fair representation learning while maintaining predictive performance (Wu et al., 2022).

1. Architectural Overview

Semi-FairVAE employs two parallel encoders: a bias-aware encoder trained to concentrate sensitive-attribute information in its output, and a bias-free encoder adversarially regularized to suppress such information. During training, downstream prediction models use the concatenated outputs of both branches; at inference, only the bias-free features are retained, with the aim of guaranteeing fairness. The bias-aware encoder predicts a "soft" sensitive-attribute label, while the bias-free output is fed to an adversarially trained discriminator tasked with identifying the sensitive attribute.

A VAE structure is imposed by introducing an additional latent variable $h \sim \mathcal{N}(\mu(x), \sigma^2(x))$ with parameters derived from the input. The jointly modeled latent $c(x) = [h,\, \hat s(x),\, \tilde s(x)]$, where $\hat s(x)$ comes from the bias-aware encoder and $\tilde s(x)$ from the discriminator on the bias-free branch, serves as the input to the VAE decoder that reconstructs the input $x$.
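The forward pass described above can be sketched in NumPy. This is a minimal illustration, not the paper's implementation: the dimensions, the random linear encoders, and the single-layer heads are all stand-ins chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Toy dimensions (illustrative; the paper does not fix these values).
d_in, d_rep, d_h, n_s = 16, 8, 4, 2  # input, encoder output, latent h, sensitive classes

# Random weights stand in for the learned encoders and heads.
W_b = rng.normal(size=(d_in, d_rep))   # bias-aware encoder
W_f = rng.normal(size=(d_in, d_rep))   # bias-free encoder
W_s = rng.normal(size=(d_rep, n_s))    # bias-aware attribute head -> s_hat
W_d = rng.normal(size=(d_rep, n_s))    # discriminator on bias-free features -> s_tilde
W_mu = rng.normal(size=(d_in, d_h))    # VAE mean head mu(x)
W_ls = rng.normal(size=(d_in, d_h))    # VAE log-variance head log sigma^2(x)

x = rng.normal(size=(5, d_in))         # a batch of 5 inputs

r_b = np.tanh(x @ W_b)                 # bias-aware representation r_b(x)
r_f = np.tanh(x @ W_f)                 # bias-free representation r_f(x)
s_hat = softmax(r_b @ W_s)             # soft sensitive-attribute prediction
s_tilde = softmax(r_f @ W_d)           # discriminator output on bias-free branch

mu, log_sigma2 = x @ W_mu, x @ W_ls
h = mu + np.exp(0.5 * log_sigma2) * rng.normal(size=mu.shape)  # reparameterized h

c = np.concatenate([h, s_hat, s_tilde], axis=1)  # decoder input c(x) = [h, s_hat, s_tilde]
```

The concatenated latent `c` is what the decoder would consume; in a real model the random matrices are replaced by trained networks.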

2. Mathematical Formulation and Losses

The model’s learning objective combines several losses, each enforcing different aspects of fairness and representation utility:

  • Semi-supervised VAE loss:

$$L_{\mathrm{VAE}}(x) = \mathbb{E}_{h \sim q_\phi(h|x)}\left[ -\log p_\theta(x \mid h, \hat s, \tilde s) \right] + \mathrm{KL}\left( q_\phi(h|x) \,\|\, p(h) \right)$$

This includes both reconstruction loss and KL divergence on the stochastic latent variable.
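A minimal NumPy sketch of this loss, using a squared-error reconstruction term as a stand-in for the negative log-likelihood (the source does not specify the decoder likelihood used):

```python
import numpy as np

def kl_to_standard_normal(mu, log_sigma2):
    # KL( N(mu, sigma^2) || N(0, I) ), summed over latent dims, per sample.
    return 0.5 * np.sum(np.exp(log_sigma2) + mu**2 - 1.0 - log_sigma2, axis=1)

def vae_loss(x, x_recon, mu, log_sigma2):
    # Squared error as a stand-in for -log p_theta(x | h, s_hat, s_tilde).
    recon = np.sum((x - x_recon) ** 2, axis=1)
    return np.mean(recon + kl_to_standard_normal(mu, log_sigma2))
```

With a perfect reconstruction and a posterior equal to the standard-normal prior, the loss is zero, as expected from the formula.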

  • Adversarial loss (on the bias-free branch):

$$L_{\mathrm{adv}}(x) = \mathbb{E}_x\left[ \sum_{s} \mathbf{1}\{ s = s_{\mathrm{true}} \} \log D_\psi(s \mid r_f(x)) + \sum_{s} \mathbf{1}\{ s \neq s_{\mathrm{true}} \} \log\left( 1 - D_\psi(s \mid r_f(x)) \right) \right]$$

The discriminator $D_\psi$ is trained to minimize this cross-entropy, while the bias-free encoder is updated to maximize it via gradient reversal, driving invariance to $s$.
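Gradient reversal can be sketched without an autograd framework as an operator that is the identity in the forward pass and flips (and scales) the incoming gradient in the backward pass. The class below is a schematic illustration of that mechanism, not the paper's code:

```python
import numpy as np

class GradReverse:
    """Identity in the forward pass; flips and scales the gradient in the
    backward pass, so the encoder ascends the discriminator's loss while
    the discriminator itself descends it."""
    def __init__(self, lam=1.0):
        self.lam = lam  # reversal strength (a hyperparameter)

    def forward(self, r_f):
        return r_f  # features pass through unchanged

    def backward(self, grad_from_discriminator):
        return -self.lam * grad_from_discriminator  # sign-flipped gradient
```

In an autograd framework this is implemented as a custom function with these exact forward/backward rules, placed between the bias-free encoder and the discriminator.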

  • Orthogonality regularization ensures the output subspaces of the bias-aware ($r_b$) and bias-free ($r_f$) encoders are decorrelated:

$$L_{\mathrm{ortho}} = \left\| \mathbb{E}_x\left[ r_b(x)\, r_f(x)^{\top} \right] \right\|_F^2$$
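The orthogonality penalty has a direct batch estimate: the squared Frobenius norm of the empirical cross-correlation matrix between the two representations. A minimal NumPy sketch:

```python
import numpy as np

def ortho_loss(r_b, r_f):
    # Squared Frobenius norm of the batch estimate of E_x[ r_b(x) r_f(x)^T ].
    n = r_b.shape[0]
    cross = (r_b.T @ r_f) / n  # shape (d_b, d_f)
    return np.sum(cross ** 2)
```

When the two branches carry uncorrelated information across the batch, the cross-correlation matrix is zero and the penalty vanishes.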

  • Entropy regularization incentivizes maximal uncertainty in the discriminator's prediction on unlabeled data:

$$L_{\mathrm{ent}}(x) = -\mathbb{E}_{x \in \mathcal{D}_u}\left[ \sum_s D_\psi(s \mid r_f(x)) \log D_\psi(s \mid r_f(x)) \right]$$
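A sketch of the inner term, written here as the negative entropy $\sum_s D \log D$ (which is $\le 0$ and minimized when the discriminator output is uniform); the sign convention in the total objective depends on whether the entropy is being maximized or its negation minimized, and the `eps` guard is an implementation detail, not from the paper:

```python
import numpy as np

def neg_entropy(probs, eps=1e-12):
    # Mean of sum_s D log D over the batch; most negative (i.e., smallest)
    # when probs is the uniform distribution, so minimizing it pushes the
    # discriminator's prediction toward maximum uncertainty.
    return np.mean(np.sum(probs * np.log(probs + eps), axis=1))
```

For a two-class uniform prediction this evaluates to $\log \tfrac{1}{2} \approx -0.693$, its minimum; a one-hot prediction gives $0$.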

The overall objective is a weighted sum:

$$L_{\mathrm{total}} = \sum_{x \in \mathcal{D}_l \cup \mathcal{D}_u} \left\{ L_{\mathrm{VAE}}(x) + \lambda_{\mathrm{adv}} L_{\mathrm{adv}}(x) + \lambda_{\mathrm{ortho}} L_{\mathrm{ortho}}(x) + \lambda_{\mathrm{ent}} L_{\mathrm{ent}}(x) \right\}$$

where $\lambda_{\mathrm{adv}}$, $\lambda_{\mathrm{ortho}}$, and $\lambda_{\mathrm{ent}}$ are hyperparameters.

3. Training Procedure

Semi-FairVAE leverages both labeled and unlabeled data by treating the availability of the sensitive attribute $s$ as a contingency:

  • Labeled samples $(x, y, s)$: the true sensitive attribute replaces $\hat s(x)$ in the decoder input, and $\tilde s$ is set to the uniform distribution, eliminating bias information on that pathway.
  • Unlabeled samples $(x, y)$: the decoder receives model-inferred soft labels $\hat s(x) = q_{\phi_1}(s \mid x)$ and $\tilde s(x) = D_\psi(r_f(x))$, and entropy regularization is applied to drive $\tilde s$ toward maximum entropy.

Optimization alternates between updating the discriminator (minimizing cross-entropy on labeled ss) and jointly updating the encoders and decoder by descending the total loss (with appropriate sign-flips for adversarial terms). Hyperparameters are ramped up from zero across early epochs to stabilize training dynamics (Wu et al., 2022).
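The hyperparameter ramp-up mentioned above can be sketched as a warm-up schedule; a linear ramp is one common choice, and the `warmup_epochs` value here is illustrative since the source does not specify the exact schedule:

```python
def ramp(epoch, warmup_epochs=10, target=1.0):
    # Linear warm-up: weight is 0 at epoch 0 and reaches the target
    # value after warmup_epochs, then stays constant. Applied to the
    # lambda coefficients so adversarial/regularization pressure grows
    # only after the encoders have started learning useful features.
    return target * min(epoch / warmup_epochs, 1.0)
```

Each epoch, the loss weights would be set as, e.g., `lam_adv = ramp(epoch, target=lam_adv_max)` before the encoder/decoder update step.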

4. Evaluation and Metrics

Empirical assessment of Semi-FairVAE includes two benchmarks:

  • Adult (UCI): task is to predict whether income exceeds \$50K, with gender as the sensitive attribute.
  • NewsRec: Click-through-rate ranking for news recommendation; sensitive attribute is user gender.

Metrics span task utility and fairness:

  • Task utility: Accuracy or AUC for the main supervised objective.
  • Fairness: Demographic Parity (DP) gap and Equal Opportunity (OPP) gap for Adult; for NewsRec, Acc@5 (accuracy of gender prediction from the top-5 recommendations) and $\Delta$-AUC (performance gap across gender groups).

Results with only 10% or 20% of $s$-labels demonstrate significant reductions in bias metrics and minimal loss in accuracy compared to standard adversarial and non-fair baselines. For example, on the Adult dataset with logistic regression, the DP gap drops from 0.1548 (no fairness) to 0.1134 (Semi-FairVAE), incurring only about 1% absolute accuracy loss. In NewsRec with NAML, Acc@5 drops from 0.6745 (no fairness) to 0.5634 (Semi-FairVAE) with an end-task AUC decrease of only 0.5% (Wu et al., 2022).
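The demographic parity gap used above has a standard definition: the absolute difference in positive-prediction rates between the two sensitive groups. A minimal sketch for binary predictions and a binary attribute:

```python
import numpy as np

def dp_gap(y_pred, s):
    # Demographic parity gap: | P(y_hat = 1 | s = 0) - P(y_hat = 1 | s = 1) |.
    y_pred, s = np.asarray(y_pred), np.asarray(s)
    return abs(y_pred[s == 0].mean() - y_pred[s == 1].mean())
```

A gap of 0 means both groups receive positive predictions at the same rate; the Adult numbers above (0.1548 vs. 0.1134) are this quantity before and after debiasing.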

5. Significance and Comparative Context

Semi-FairVAE addresses the key practical limitation of standard adversarial fair representation learning—dependence on large quantities of labeled sensitive attribute data. By decomposing information streams into bias-aware and bias-free components, adversarially purifying the latter, enforcing their mutual orthogonality, and leveraging a semi-supervised VAE framework, Semi-FairVAE achieves effective debiasing under weak supervision.

A plausible implication is that this decomposition strategy, especially the explicit adversarial-orthogonal structure, offers a scalable template for other domains where biases may be partially labeled and where leveraging unlabeled data is critical. The entropy regularization on the discriminator, combined with the integration of soft attribute predictions into the generative model, further strengthens bias removal. Consistent improvements in fairness without substantial compromise to utility across distinct tasks and modalities suggest robustness and general applicability.

Semi-FairVAE is situated within a broader landscape of fair VAE-based methods. The Hierarchical VampPrior Variational Fair Auto-Encoder (H-VFAE + VP) introduces fairness regularization via mutual information penalties or MMD, employs hierarchical latent structures, and accommodates semi-supervised sensitive attributes by soft-labeling and variational estimation (Botros et al., 2018). Unlike Semi-FairVAE's explicit dual-path adversarial approach, H-VFAE + VP enforces invariance primarily through regularization in the latent space and semi-supervised ELBO objectives. Both models demonstrate that principled variational approaches, supplemented with adversarial or information-theoretic penalties, can operationalize fairness under incomplete supervision, but Semi-FairVAE’s orthogonality and entropy regularization constitute distinctive architectural advances.
