CodeFormer++: Blind Face Restoration Using Deformable Registration and Deep Metric Learning

Published 6 Oct 2025 in cs.CV | (2510.04410v1)

Abstract: Blind face restoration (BFR) has attracted increasing attention with the rise of generative methods. Most existing approaches integrate generative priors into the restoration process, aiming to jointly address facial detail generation and identity preservation. However, these methods often suffer from a trade-off between visual quality and identity fidelity, leading to either identity distortion or suboptimal degradation removal. In this paper, we present CodeFormer++, a novel framework that maximizes the utility of generative priors for high-quality face restoration while preserving identity. We decompose BFR into three sub-tasks: (i) identity-preserving face restoration, (ii) high-quality face generation, and (iii) dynamic fusion of identity features with realistic texture details. Our method makes three key contributions: (1) a learning-based deformable face registration module that semantically aligns generated and restored faces; (2) a texture-guided restoration network to dynamically extract and transfer the texture of the generated face to boost the quality of the identity-preserving restored face; and (3) the integration of deep metric learning for BFR with the generation of informative positive and hard negative samples to better fuse identity-preserving and generative features. Extensive experiments on real-world and synthetic datasets demonstrate that the proposed CodeFormer++ achieves superior performance in terms of both visual fidelity and identity consistency.


Summary

The paper presents a novel CodeFormer++ framework that integrates deformable image alignment and deep metric learning to restore high-quality facial images.
It introduces a Texture-Prior Guided Restoration Network using a U-Net architecture with dynamic fusion weights to preserve both texture details and identity features.
Experimental results on datasets such as CelebA-Test and LFW-Test validate state-of-the-art performance in visual quality and identity preservation using metrics like FID, NIQE, and LMD.

This essay examines the CodeFormer++ framework for blind face restoration (BFR). The framework combines deformable registration with deep metric learning so that generative priors can be used to restore high-quality facial images while maintaining identity fidelity.

Blind Face Restoration (BFR) is a critical challenge in computer vision, involving the reconstruction of high-quality face images from low-quality inputs affected by complex degradations such as noise, blur, and compression artifacts. Traditional methods often face a trade-off between visual quality and identity fidelity, typically resulting in either identity distortion or suboptimal restoration.

CodeFormer++, as depicted below, aims to address these limitations by dynamically aligning and fusing identity-preserving features with generative priors.

Figure 1: Given a degraded face image, our method is able to reconstruct a high-fidelity, texture-rich image. In contrast, CodeFormer fails to completely remove the degradation and tends to produce overly smoothed results.

Methodology and Framework Overview

The CodeFormer++ framework is devised to reconstruct high-quality faces through a multi-stage process that includes deformable image alignment and texture-prior guided restoration networks. Below, we outline the key components of this architecture:

Deformable Image Alignment Module (DAM):

DAM serves to semantically align images containing identity features and generative textures. By predicting a dense deformation field, it ensures that the generative prior aligns with identity-preserving images, allowing for effective feature fusion in subsequent stages.

Figure 2: Overview of our CodeFormer++ framework. In stage-1, the Deformable Image Alignment Module (DAM) predicts deformation fields.
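
To make the idea of a dense deformation field concrete, the following is a minimal sketch of how such a field can be applied to an image by bilinear resampling. It is not the paper's DAM: the field here is supplied by hand rather than predicted by a network, and the function name `warp_bilinear`, the `(dx, dy)` displacement convention, and the border clamping are all assumptions made for illustration.

```python
import math

def warp_bilinear(image, flow):
    """Warp a 2-D grayscale image (list of rows) by a dense deformation field.

    Assumed convention (hypothetical, not taken from the paper):
    flow[y][x] = (dx, dy), and output pixel (x, y) samples the source
    at (x + dx, y + dy), clamped to the image border.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            # Source coordinates, clamped so sampling stays inside the image.
            sx = min(max(x + dx, 0.0), w - 1.0)
            sy = min(max(y + dy, 0.0), h - 1.0)
            x0, y0 = int(math.floor(sx)), int(math.floor(sy))
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            fx, fy = sx - x0, sy - y0
            # Bilinear interpolation between the four neighbouring pixels.
            top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
            bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
            out[y][x] = (1 - fy) * top + fy * bottom
    return out

image = [[0.0, 1.0, 2.0],
         [3.0, 4.0, 5.0],
         [6.0, 7.0, 8.0]]
identity_flow = [[(0.0, 0.0)] * 3 for _ in range(3)]  # zero displacement
shift_flow = [[(1.0, 0.0)] * 3 for _ in range(3)]     # sample one pixel to the right
half_flow = [[(0.5, 0.0)] * 3 for _ in range(3)]      # sub-pixel displacement
```

A zero field returns the image unchanged, while a non-zero field moves content by the stated displacement. In a learned registration module the field would be chosen so that the generated face lines up with the identity-preserving one before fusion.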

Texture-Prior Guided Restoration Network (TGRN):

TGRN is structured around a U-Net architecture, where a Texture Attention Module (Figure 3) facilitates adaptive fusion of texture-rich and identity-preserving features. Dynamic fusion weights are computed to adjust the incorporation of identity and texture features, ensuring accurate restoration.

Figure 3: The architecture of texture attention module.
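
The summary does not give the exact form of the dynamic fusion weights, so the following is only a sketch of one common gating pattern: a sigmoid gate produces a weight in (0, 1) per feature element, and the output is a convex blend of texture and identity features. The function `fuse`, the scalar gate parameters `gate_weights` and `gate_bias`, and the use of plain lists instead of feature maps are all hypothetical simplifications.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fuse(identity_feat, texture_feat, gate_weights, gate_bias=0.0):
    """Blend two feature vectors with a per-element weight in (0, 1).

    gate_weights is a hypothetical (w_id, w_tex) pair standing in for the
    learned layers that would compute the weights in a real network.
    Returns the fused features and the weights that were used.
    """
    w_id, w_tex = gate_weights
    fused, weights = [], []
    for f_id, f_tex in zip(identity_feat, texture_feat):
        # The weight depends on both inputs, so it changes per element.
        alpha = sigmoid(w_id * f_id + w_tex * f_tex + gate_bias)
        weights.append(alpha)
        # alpha near 1 favours texture; alpha near 0 favours identity.
        fused.append(alpha * f_tex + (1.0 - alpha) * f_id)
    return fused, weights

identity_feat = [0.0, 2.0]
texture_feat = [2.0, 4.0]
balanced, balanced_w = fuse(identity_feat, texture_feat, (0.0, 0.0))
texture_heavy, _ = fuse(identity_feat, texture_feat, (0.0, 0.0), gate_bias=20.0)
```

With a zero gate the blend is an even average of the two inputs; a strongly positive bias pushes the output towards the texture features. Because each weight is computed from the features themselves, the blend can differ from one location to the next, which is the sense in which such fusion is dynamic.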

Deep Metric Learning:

A sampling strategy that generates informative positive and hard negative samples lets the network learn a discriminative feature space via a cosine triplet loss, better aligning semantically rich features without compromising identity integrity.
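
As a rough illustration of a cosine triplet loss, the sketch below uses the standard hinge form on cosine distances, max(0, d(a, p) - d(a, n) + m) with d = 1 - cosine similarity. The paper's exact formulation, margin, and sampling strategy are not given in this summary, so the margin value of 0.2 and the function names are assumptions.

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss on cosine distances (margin is an assumed value).

    The loss is zero once the anchor is closer to the positive than to
    the negative by at least `margin` in cosine distance.
    """
    d_pos = 1.0 - cosine_similarity(anchor, positive)
    d_neg = 1.0 - cosine_similarity(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

anchor = [1.0, 0.0]
same_direction = [2.0, 0.0]   # cosine distance 0 from the anchor
orthogonal = [0.0, 3.0]       # cosine distance 1 from the anchor

easy = cosine_triplet_loss(anchor, same_direction, orthogonal)
violated = cosine_triplet_loss(anchor, orthogonal, same_direction)
```

The first triplet is already well separated and contributes no loss, whereas the second has its positive farther away than its negative and is penalised. Hard negatives are those that sit close to the anchor, so they keep the hinge active and continue to provide a training signal.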

Experimental Results

Extensive experiments on synthetic and real-world datasets such as CelebA-Test, LFW-Test, WebPhoto-Test, and WIDER-Test show CodeFormer++ outperforming state-of-the-art methods, with enhanced visual realism and identity preservation.

Quantitative Analysis: The framework achieves top scores in metrics like FID, NIQE, and LMD, indicating both high perceptual quality and identity preservation.
Qualitative Observations: As shown in Figures 4 to 9, CodeFormer++ excels in preserving identity across varying levels of degradation while enhancing textural richness.

Figure 4: Qualitative comparisons on CelebA-Test dataset. Zoom in for best view.

Figure 5: Qualitative comparisons on LFW-Test, WebPhoto-Test, and WIDER-Test datasets. Zoom in for best view.
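
Of the metrics named above, LMD is the simplest to write down. Assuming it denotes landmark distance as commonly used in the BFR literature, it can be sketched as the mean Euclidean distance between corresponding facial landmarks of the restored and reference images, with lower values being better. The landmark detector is out of scope here, so the coordinates below are made-up placeholders and `landmark_distance` is a hypothetical helper, not the paper's evaluation code.

```python
import math

def landmark_distance(pred, target):
    """Mean Euclidean distance between corresponding (x, y) landmarks."""
    if not pred or len(pred) != len(target):
        raise ValueError("landmark lists must be non-empty and equal in length")
    total = sum(math.hypot(px - tx, py - ty)
                for (px, py), (tx, ty) in zip(pred, target))
    return total / len(pred)

# Placeholder coordinates, not real detector output.
restored_landmarks = [(0.0, 0.0), (3.0, 4.0)]
reference_landmarks = [(0.0, 0.0), (0.0, 0.0)]
lmd = landmark_distance(restored_landmarks, reference_landmarks)
```

A value of zero means every landmark of the restored face coincides with its counterpart in the reference, which is why the metric is read as a proxy for geometric identity fidelity.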

Conclusion

CodeFormer++ represents a refined approach to BFR, overcoming previous limitations related to identity and quality. The modular structure allows for dynamic adaptation of identity features and generative textures, leading to state-of-the-art performance in both synthetic and real-world conditions. The novel use of deformable alignment and metric learning offers a promising direction for future advancements in image restoration tasks.
