Semi-FairVAE: Fair Semi-Supervised VAE
- Semi-FairVAE is a semi-supervised fair representation learning framework that leverages dual encoders—a bias-aware branch and an adversarially purified bias-free branch—to segregate sensitive attributes.
- It employs a VAE structure combined with adversarial and entropy regularizations, ensuring robust debiasing and minimal performance loss using both labeled and unlabeled data.
- Empirical evaluations on datasets like Adult and NewsRec demonstrate significant reductions in bias metrics while maintaining high predictive performance.
Semi-FairVAE is a semi-supervised fair representation learning framework that integrates adversarial learning and variational autoencoders (VAEs), designed to enable effective bias removal from latent representations even when only a small fraction of sensitive attribute labels are available. The architecture, combining dual encoder paths—one bias-aware and one adversarially bias-free—and a VAE decoder, introduces a principled mechanism for leveraging both labeled and unlabeled data to drive fair representation learning while maintaining predictive performance (Wu et al., 2022).
1. Architectural Overview
Semi-FairVAE employs two parallel encoders: a bias-aware encoder trained to concentrate sensitive-attribute information in its output, and a bias-free encoder adversarially regularized to suppress such information. During training, downstream prediction models use the concatenated outputs of both branches; at inference, only the bias-free features are retained, with the aim of guaranteeing fairness. The bias-aware encoder predicts a "soft" sensitive attribute label, while the bias-free output is fed to an adversarially trained discriminator tasked with identifying the sensitive attribute.
A VAE structure is imposed by introducing an additional stochastic latent variable $z$ whose posterior parameters $(\mu, \sigma)$ are derived from the input. The jointly modeled latent $[z, \hat{s}, \tilde{s}]$—where $\hat{s}$ comes from the bias-aware encoder and $\tilde{s}$ from the bias-free branch's discriminator—serves as the input to the VAE decoder that reconstructs the input $x$.
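The dual-branch forward pass described above can be sketched in a few lines. This is a minimal NumPy illustration, not the paper's implementation: all parameter names, dimensions, and the use of simple linear maps with `tanh` activations are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(a):
    e = np.exp(a - a.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

d_in, d_rep, n_s = 8, 4, 2                     # input dim, representation dim, sensitive classes

# Hypothetical parameters for the two encoder branches, heads, and VAE posterior.
W_aware = rng.standard_normal((d_in, d_rep))   # bias-aware encoder
W_free  = rng.standard_normal((d_in, d_rep))   # bias-free encoder
W_s     = rng.standard_normal((d_rep, n_s))    # bias-aware attribute head
W_disc  = rng.standard_normal((d_rep, n_s))    # discriminator on bias-free features
W_mu    = rng.standard_normal((d_in, d_rep))   # VAE posterior mean
W_logv  = rng.standard_normal((d_in, d_rep))   # VAE posterior log-variance

x = rng.standard_normal((5, d_in))             # a batch of 5 inputs

r_aware = np.tanh(x @ W_aware)                 # bias-aware representation
r_free  = np.tanh(x @ W_free)                  # bias-free representation

s_hat   = softmax(r_aware @ W_s)               # soft sensitive-attribute label
s_tilde = softmax(r_free @ W_disc)             # discriminator prediction

mu, logv = x @ W_mu, x @ W_logv
z = mu + np.exp(0.5 * logv) * rng.standard_normal(mu.shape)  # reparameterization trick

# Joint latent [z, s_hat, s_tilde] is what the VAE decoder would reconstruct x from.
decoder_in = np.concatenate([z, s_hat, s_tilde], axis=1)
print(decoder_in.shape)  # (5, 8): d_rep + 2 * n_s
```

At inference, only `r_free` would feed the downstream predictor; the decoder and bias-aware branch exist to shape the representations during training.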
2. Mathematical Formulation and Losses
The model’s learning objective combines several losses, each enforcing different aspects of fairness and representation utility:
- Semi-supervised VAE loss:

  $$\mathcal{L}_{\mathrm{VAE}} = -\mathbb{E}_{q(z \mid x)}\big[\log p(x \mid z, \hat{s}, \tilde{s})\big] + \mathrm{KL}\big(q(z \mid x) \,\|\, p(z)\big)$$

  This includes both the reconstruction loss and the KL divergence on the stochastic latent variable $z$.
- Adversarial loss (on the bias-free branch):

  $$\mathcal{L}_{\mathrm{adv}} = -\sum_{i \in \mathcal{D}_L} \log \tilde{s}_{i, s_i}$$

  The discriminator is trained to minimize this cross-entropy, while the bias-free encoder is updated to maximize it via gradient reversal, driving invariance to the sensitive attribute $s$.
- Orthogonality regularization ensures the output subspaces of the bias-aware ($\mathbf{r}_a$) and bias-free ($\mathbf{r}_f$) encoders are decorrelated:

  $$\mathcal{L}_{\mathrm{orth}} = \big\| \mathbf{r}_a^{\top} \mathbf{r}_f \big\|_F^2$$
- Entropy regularization incentivizes maximal uncertainty in the discriminator's prediction on unlabeled data (minimizing the negative entropy drives $\tilde{s}$ toward the uniform distribution):

  $$\mathcal{L}_{\mathrm{ent}} = \sum_{i \in \mathcal{D}_U} \sum_{c} \tilde{s}_{i,c} \log \tilde{s}_{i,c}$$
The overall objective is a weighted sum:

$$\mathcal{L} = \mathcal{L}_{\mathrm{task}} + \mathcal{L}_{\mathrm{VAE}} + \lambda_1 \mathcal{L}_{\mathrm{adv}} + \lambda_2 \mathcal{L}_{\mathrm{orth}} + \lambda_3 \mathcal{L}_{\mathrm{ent}}$$

where $\lambda_1$, $\lambda_2$, and $\lambda_3$ are hyperparameters, and the gradient reversal layer supplies the sign flip on the adversarial term for the encoder update.
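The regularization terms above are all simple batch statistics. A NumPy sketch of how they might be computed for one mini-batch follows; the toy arrays, weights, and the manual sign flip standing in for gradient reversal are illustrative assumptions, not values from the paper.

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(1)
r_aware = rng.standard_normal((4, 3))            # bias-aware representations
r_free  = rng.standard_normal((4, 3))            # bias-free representations
s_tilde = softmax(rng.standard_normal((4, 2)))   # discriminator outputs
s_true  = np.array([0, 1, 1, 0])                 # sensitive labels (labeled subset)

# Adversarial loss: cross-entropy of the discriminator on the bias-free branch.
L_adv = -np.mean(np.log(s_tilde[np.arange(4), s_true]))

# Orthogonality regularizer: squared Frobenius norm of the cross-correlation
# between the two branches' output subspaces.
L_orth = np.sum((r_aware.T @ r_free) ** 2)

# Entropy regularizer (unlabeled data): negative entropy; minimizing it
# pushes the discriminator toward maximally uncertain predictions.
L_ent = np.mean(np.sum(s_tilde * np.log(s_tilde + 1e-12), axis=1))

lam1, lam2, lam3 = 0.1, 0.01, 0.1                # hypothetical weights
# Encoder-side objective, with the adversarial sign flipped manually in lieu
# of a gradient reversal layer.
L_encoder_side = -lam1 * L_adv + lam2 * L_orth + lam3 * L_ent
print(L_adv >= 0, L_orth >= 0, L_ent <= 0)       # True True True
```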
3. Training Procedure
Semi-FairVAE leverages both labeled and unlabeled data by branching on whether the sensitive attribute is available for each sample:
- Labeled samples ($x \in \mathcal{D}_L$): The true sensitive attribute $s$ replaces the soft prediction $\hat{s}$ in the decoder input, and $\tilde{s}$ is set to a uniform distribution, eliminating bias information on that pathway.
- Unlabeled samples ($x \in \mathcal{D}_U$): The decoder receives the model-inferred soft labels $\hat{s}$ and $\tilde{s}$, and entropy regularization is applied to drive $\tilde{s}$ toward maximum entropy.
Optimization alternates between updating the discriminator (minimizing cross-entropy on samples with labeled $s$) and jointly updating the encoders and decoder by descending the total loss (with appropriate sign-flips for adversarial terms). The regularization weights are ramped up from zero across early epochs to stabilize training dynamics (Wu et al., 2022).
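The per-sample branching and the weight ramp-up can be sketched as two small helpers. This is a hedged illustration; the function names, the linear ramp schedule, and the target weight are assumptions, not specifics from the paper.

```python
import numpy as np

def decoder_attribute_inputs(s_label, s_hat, s_tilde, n_classes=2):
    """Choose the attribute vectors fed to the decoder for one sample.

    s_label: true sensitive attribute (int), or None if unlabeled.
    s_hat:   bias-aware soft prediction, shape (n_classes,).
    s_tilde: discriminator soft prediction, shape (n_classes,).
    """
    if s_label is not None:
        one_hot = np.eye(n_classes)[s_label]           # true label replaces s_hat
        uniform = np.full(n_classes, 1.0 / n_classes)  # strip bias info from this pathway
        return one_hot, uniform
    return s_hat, s_tilde                              # unlabeled: model-inferred soft labels

def ramp_weight(epoch, ramp_epochs=10, target=0.1):
    """Linearly ramp a regularization weight from 0 to its target value."""
    return target * min(epoch / ramp_epochs, 1.0)

a, b = decoder_attribute_inputs(1, np.array([0.3, 0.7]), np.array([0.6, 0.4]))
print(a, b)  # [0. 1.] [0.5 0.5]
print(ramp_weight(5), ramp_weight(20))
```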
4. Evaluation and Metrics
Empirical assessment of Semi-FairVAE includes two benchmarks:
- Adult (UCI): Task—predict whether income exceeds 50K, with gender as the sensitive attribute.
- NewsRec: Click-through-rate ranking for news recommendation; sensitive attribute is user gender.
Metrics span task utility and fairness:
- Task utility: Accuracy or AUC for the main supervised objective.
- Fairness: Demographic Parity (DP) gap and Equal Opportunity (OPP) gap for Adult; for NewsRec, Acc@5 (accuracy of gender prediction from top-5 recommendations) and $\Delta$AUC (performance gap across gender groups).
Results under conditions where only a small fraction of sensitive-attribute labels is available demonstrate significant reductions in bias metrics with minimal loss in accuracy compared to standard adversarial and non-fair baselines. For example, on the Adult dataset with logistic regression, the DP gap drops from $0.1548$ (no fairness) to $0.1134$ (Semi-FairVAE) at only a small absolute accuracy cost. In NewsRec with NAML, Acc@5 drops from $0.6745$ (no fairness) to $0.5634$ (Semi-FairVAE) with only a marginal decrease in end-task AUC (Wu et al., 2022).
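The two group-fairness gaps used for Adult are straightforward to compute from binary predictions. A small self-contained example, with toy data invented purely for illustration:

```python
import numpy as np

def dp_gap(y_pred, group):
    """Demographic parity gap: |P(yhat=1 | g=0) - P(yhat=1 | g=1)|."""
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

def eo_gap(y_pred, y_true, group):
    """Equal opportunity gap: true-positive-rate difference between groups."""
    tpr = lambda g: y_pred[(group == g) & (y_true == 1)].mean()
    return abs(tpr(0) - tpr(1))

# Toy predictions, labels, and sensitive-group membership.
y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_true = np.array([1, 0, 1, 0, 1, 1, 0, 1])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])

print(round(dp_gap(y_pred, group), 3))         # 0.5
print(round(eo_gap(y_pred, y_true, group), 3)) # 0.667
```

Lower values on both gaps indicate fairer predictions; Semi-FairVAE's reported improvement on Adult is exactly a reduction of this DP gap.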
5. Significance and Comparative Context
Semi-FairVAE addresses the key practical limitation of standard adversarial fair representation learning—dependence on large quantities of labeled sensitive attribute data. By decomposing information streams into bias-aware and bias-free components, adversarially purifying the latter, enforcing their mutual orthogonality, and leveraging a semi-supervised VAE framework, Semi-FairVAE achieves effective debiasing under weak supervision.
A plausible implication is that this decomposition strategy, especially the explicit adversarial-orthogonal structure, offers a scalable template for other domains where biases may be partially labeled and where leveraging unlabeled data is critical. The entropy regularization on the discriminator, combined with the integration of soft attribute predictions into the generative model, further strengthens bias removal. Consistent improvements in fairness without substantial compromise to utility across distinct tasks and modalities suggest robustness and general applicability.
6. Related Fairness-VAE Methodologies
Semi-FairVAE is situated within a broader landscape of fair VAE-based methods. The Hierarchical VampPrior Variational Fair Auto-Encoder (H-VFAE + VP) introduces fairness regularization via mutual information penalties or MMD, employs hierarchical latent structures, and accommodates semi-supervised sensitive attributes by soft-labeling and variational estimation (Botros et al., 2018). Unlike Semi-FairVAE's explicit dual-path adversarial approach, H-VFAE + VP enforces invariance primarily through regularization in the latent space and semi-supervised ELBO objectives. Both models demonstrate that principled variational approaches, supplemented with adversarial or information-theoretic penalties, can operationalize fairness under incomplete supervision, but Semi-FairVAE’s orthogonality and entropy regularization constitute distinctive architectural advances.