Source Data-absent Unsupervised Domain Adaptation through Hypothesis Transfer and Labeling Transfer

Published 14 Dec 2020 in cs.CV and cs.LG | (2012.07297v3)

Abstract: Unsupervised domain adaptation (UDA) aims to transfer knowledge from a related but different well-labeled source domain to a new unlabeled target domain. Most existing UDA methods require access to the source data, and thus are not applicable when the data are confidential and not shareable due to privacy concerns. This paper aims to tackle a realistic setting in which only a classification model trained on the source data is available, rather than the source data itself. To effectively utilize the source model for adaptation, we propose a novel approach called Source HypOthesis Transfer (SHOT), which learns the feature extraction module for the target domain by fitting the target data features to the frozen source classification module (representing the classification hypothesis). Specifically, SHOT exploits both information maximization and self-supervised learning for the feature extraction module learning to ensure the target features are implicitly aligned with the features of unseen source data via the same hypothesis. Furthermore, we propose a new labeling transfer strategy, which separates the target data into two splits based on the confidence of predictions (labeling information), and then employs semi-supervised learning to improve the accuracy of less-confident predictions in the target domain. We denote labeling transfer as SHOT++ if the predictions are obtained by SHOT. Extensive experiments on both digit classification and object recognition tasks show that SHOT and SHOT++ achieve results surpassing or comparable to the state of the art, demonstrating the effectiveness of our approaches for various visual domain adaptation problems. Code is available at https://github.com/tim-learn/SHOT-plus.

Citations (224)

Summary

The paper introduces SHOT, which adapts a model to the target domain by transferring the frozen source hypothesis (the source classifier) without accessing source data.
It combines information maximization with self-supervised tasks, namely pseudo-labeling and relative rotation prediction, to align target features with the source hypothesis and produce confident predictions.
A labeling transfer phase, called SHOT++ when applied to SHOT's predictions, propagates reliable labels from high-confidence samples to low-confidence ones, achieving strong performance on diverse adaptation benchmarks.

Unsupervised Domain Adaptation Through Source Hypothesis Transfer

This paper introduces an approach to unsupervised domain adaptation (UDA) called Source HypOthesis Transfer (SHOT). SHOT targets the setting in which the source data are unavailable during adaptation, a scenario that arises frequently with confidential or proprietary datasets. It works around this limitation by relying only on a trained source model, rather than the source data itself, to perform domain adaptation.

At its core, the method has two phases: hypothesis transfer (SHOT) and labeling transfer, which the paper calls SHOT++ when it is applied to SHOT's predictions. Hypothesis transfer learns a target-specific feature representation while keeping the classification boundaries of the source hypothesis fixed.

1. Hypothesis Transfer

In hypothesis transfer, SHOT freezes the source model's classification module (the hypothesis) and trains only the feature extraction module on target data. Because source features are unavailable, alignment is implicit: the target features are fitted to the same frozen classifier. To do this, SHOT uses information maximization, which encourages each target prediction to be confident, close to a one-hot vector, while keeping the predictions diverse across classes over the whole target set, so that the network does not collapse onto a single class.
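As a rough illustration of the information maximization idea above, the following sketch computes one common form of the objective on a batch of classifier logits: the mean per-sample entropy minus the entropy of the batch-averaged prediction. This is a minimal sketch using only the Python standard library, not the authors' implementation; the function names, the `eps` constant, and the toy logits are assumptions made for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs, eps=1e-8):
    """Shannon entropy in nats; eps (an assumed constant) avoids log(0)."""
    return -sum(p * math.log(p + eps) for p in probs)

def information_maximization_loss(batch_logits):
    """Hypothetical helper: lower is better.

    First term: mean per-sample entropy (small when predictions are near one-hot).
    Second term: entropy of the mean prediction (large when classes are used evenly).
    """
    probs = [softmax(row) for row in batch_logits]
    n, k = len(probs), len(probs[0])
    cond_ent = sum(entropy(p) for p in probs) / n
    mean_probs = [sum(p[j] for p in probs) / n for j in range(k)]
    marg_ent = entropy(mean_probs)
    return cond_ent - marg_ent

# Toy batches (made-up logits for three samples and three classes).
confident_diverse = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]
collapsed = [[10.0, 0.0, 0.0]] * 3   # confident, but every sample gets class 0
uncertain = [[0.0, 0.0, 0.0]] * 3    # diverse on average, but no sample is confident

for name, batch in [("confident_diverse", confident_diverse),
                    ("collapsed", collapsed),
                    ("uncertain", uncertain)]:
    print(name, round(information_maximization_loss(batch), 4))
```

Only the confident and diverse batch scores clearly below zero; the collapsed and the uniformly uncertain batches both score near zero, which is why the objective needs both terms.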

To further enhance feature alignment, SHOT employs self-supervised learning:

Pseudo-labeling: SHOT derives self-supervised pseudo-labels for target domain samples by clustering their latent representations into prototypes, reinforcing semantic consistency with the source hypothesis.
Relative Rotation Prediction: SHOT incorporates this self-supervised task to predict relative rotational transformations between original and rotated target samples, fostering robust feature learning that is invariant across domain shifts.
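The prototype-based pseudo-labeling step can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's code: class prototypes are computed as prediction-weighted means of the target features, each sample is assigned to the prototype with the highest cosine similarity, and the prototypes are then recomputed once from those assignments. All function names, the single refinement round, and the toy features and probabilities are hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))

def weighted_centroids(features, weights):
    """weights[i][k] is how much sample i contributes to the prototype of class k."""
    num_classes, dim = len(weights[0]), len(features[0])
    centroids = []
    for k in range(num_classes):
        total = sum(w[k] for w in weights) or 1.0
        centroids.append([
            sum(w[k] * f[d] for f, w in zip(features, weights)) / total
            for d in range(dim)
        ])
    return centroids

def nearest_centroid_labels(features, centroids):
    """Assign each feature to the prototype with the highest cosine similarity."""
    return [
        max(range(len(centroids)), key=lambda k: cosine(f, centroids[k]))
        for f in features
    ]

def prototype_pseudo_labels(features, probs, rounds=1):
    """Hypothetical helper: soft prototypes first, then `rounds` hard refinements."""
    centroids = weighted_centroids(features, probs)
    labels = nearest_centroid_labels(features, centroids)
    for _ in range(rounds):
        one_hot = [[1.0 if k == y else 0.0 for k in range(len(probs[0]))]
                   for y in labels]
        centroids = weighted_centroids(features, one_hot)
        labels = nearest_centroid_labels(features, centroids)
    return labels

# Toy data: two feature clusters; the classifier output for sample 2 is wrong.
features = [[1.0, 0.1], [0.9, 0.0], [1.0, 0.2], [0.1, 1.0], [0.0, 0.9], [0.2, 1.0]]
probs = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.1, 0.9], [0.2, 0.8], [0.3, 0.7]]

argmax_labels = [p.index(max(p)) for p in probs]
pseudo_labels = prototype_pseudo_labels(features, probs)
print(argmax_labels)  # [0, 0, 1, 1, 1, 1]
print(pseudo_labels)  # [0, 0, 0, 1, 1, 1]
```

In this toy case the third sample sits in the first cluster in feature space, so the prototype assignment overrides the classifier's raw argmax for it.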

2. Labeling Transfer

Because low-confidence predictions are more likely to be wrong, the paper adds a labeling transfer phase, denoted SHOT++ when the predictions come from SHOT. This phase splits the target samples into "easy" and "hard" subsets according to prediction confidence, then applies a semi-supervised learning method, such as MixMatch, that treats the easy split as labeled and the hard split as unlabeled, propagating reliable labeling information to improve accuracy on the hard split.
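The easy/hard split described above might look like the sketch below. It covers only the split, not the semi-supervised training step (MixMatch is not implemented here). Using the maximum class probability as the confidence score and an `easy_fraction` of 0.5 are assumptions made for illustration; the summary does not state the exact confidence measure or split ratio.

```python
def split_by_confidence(probs, easy_fraction=0.5):
    """Hypothetical helper: rank samples by max class probability and split them.

    Returns (easy_indices, hard_indices, pseudo_labels), where pseudo_labels maps
    each easy index to its predicted class. The hard split is left unlabeled.
    """
    confidence = [max(p) for p in probs]
    ranked = sorted(range(len(probs)), key=lambda i: confidence[i], reverse=True)
    n_easy = int(easy_fraction * len(probs))
    easy = sorted(ranked[:n_easy])
    hard = sorted(ranked[n_easy:])
    pseudo_labels = {i: probs[i].index(confidence[i]) for i in easy}
    return easy, hard, pseudo_labels

# Made-up softmax outputs for four target samples and two classes.
target_probs = [[0.95, 0.05], [0.55, 0.45], [0.10, 0.90], [0.48, 0.52]]
easy, hard, pseudo_labels = split_by_confidence(target_probs)
print(easy)           # [0, 2]
print(hard)           # [1, 3]
print(pseudo_labels)  # {0: 0, 2: 1}
```

A semi-supervised learner would then consume the easy indices with their pseudo-labels as the labeled set and the hard indices as the unlabeled set.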

Empirical Evaluation

SHOT is evaluated on a diverse set of benchmarks: digit classification (MNIST, SVHN, USPS), object recognition (Office, Office-Home), and VisDA-C, a challenging large-scale synthetic-to-real adaptation task. Across these benchmarks, in both closed-set and partial-set settings, SHOT and SHOT++ achieve results that surpass or are comparable to state-of-the-art UDA methods, despite relying only on source models rather than source data. On some tasks, SHOT reaches results comparable to target-supervised baselines even though no target labels are used.

Theoretical and Practical Implications

Theoretically, SHOT provides a flexible yet robust framework for tackling UDA problems under source data absence, underscoring the potential of hypothesis-based transfer approaches. Practically, SHOT's decoupling from source data access aligns with contemporary data protection regulations, such as the General Data Protection Regulation (GDPR), broadening its applicability across various privacy-sensitive data domains.

Future Directions

The adaptive nature of SHOT paves the way for exploring its integration with additional self-supervised learning tasks and assessing its scalability with more complex tasks like semantic segmentation and object detection. Furthermore, meticulously analyzing the impact of model architecture on SHOT’s performance could provide insights into optimizing network design for varied adaptation scenarios.

In conclusion, SHOT is a well-founded and empirically supported advance in unsupervised domain adaptation, showing that a source hypothesis can stand in for the source data during adaptation.