
Relightable Holoported Character (RHC)

Updated 30 January 2026
  • Relightable Holoported Characters are digital avatars that are fully animatable, relightable, and capable of dynamic full-body rendering from minimal RGB camera inputs.
  • They employ advanced neural representations, 3D Gaussian splatting, and transformer-based relighting architectures to deliver high-fidelity shading, specular highlights, and geometric accuracy.
  • Practical systems integrate scalable capture setups and real-time body tracking with physics-informed loss functions to enable interactive telepresence and adaptive lighting estimations.

A Relightable Holoported Character (RHC) is a person-specific, animatable, relightable digital avatar capable of full-body dynamic rendering under arbitrary viewpoints and novel lighting conditions, suitable for real-time telepresence. RHC systems employ advanced neural representations, physically based reflectance models, and sparse-view capture pipelines to achieve high-fidelity lighting reproduction and geometric accuracy from limited RGB camera input. The technology builds on key advances in neural rendering, 3D Gaussian splatting, articulated mesh priors, and transformer-based relighting architectures, resulting in avatars that exhibit realistic specular highlights, subsurface scattering, dynamic self-shadowing, and cloth deformation—without the need for laborious one-light-at-a-time (OLAT) light-stage acquisition.

1. Capture Setup and Data Acquisition

RHC systems are designed to function under practical capture constraints, eschewing intensive OLAT protocols in favor of scalable, sparse multi-view setups. Modern approaches utilize programmable light stages with 40 synchronized high-resolution cameras and 331 independently controlled LEDs, alternating between "random environment map" illumination (simulating 1,015 real HDR environment maps) and uniformly lit sequences for robust mesh tracking (Singh et al., 29 Nov 2025). Such data collection enables simultaneous learning of subject motion, surface normals, and appearance under diverse lighting at each time frame.

Earlier methods required monocular or multi-view camera arrays (8–16 channels) with natural or static scene illumination and relied on SMPL/SMPL-X or FLAME shape-pose models for initial mesh estimation (Chen et al., 2022, Zhang et al., 11 Mar 2025). Preprocessing includes background removal, camera pose calibration (e.g., via COLMAP), and per-frame body fitting, enabling downstream neural field queries or Gaussian initialization.

2. Model Representations and Neural Architectures

Mesh and Latent Conditioning

All RHC variants leverage coarse mesh proxies (6,890–10,475 vertices for SMPL/SMPL-X or 5,143 for FLAME), augmented by per-vertex (body) or per-point (head) latent codes that encode dynamic appearance attributes. These latent feature volumes (e.g., $Z \in \mathbb{R}^{N \times 16}$) are interpolated using trilinear schemes and passed as input to downstream network modules (Chen et al., 2022, Iqbal et al., 2022, Zhang et al., 11 Mar 2025).
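The trilinear interpolation of such latent feature volumes can be sketched as follows; this is a minimal illustration, and the grid size, feature dimension, and query values are placeholders rather than parameters from any cited system.

```python
# Trilinear interpolation of per-corner latent features, a minimal sketch.

def trilinear_interp(corners, u, v, w):
    """Interpolate an 8-corner feature cell at fractional coords (u, v, w).

    corners[i][j][k] is the feature vector at cell corner (i, j, k),
    i, j, k in {0, 1}; u, v, w lie in [0, 1].
    """
    dim = len(corners[0][0][0])
    out = [0.0] * dim
    for i in (0, 1):
        for j in (0, 1):
            for k in (0, 1):
                # Each corner's weight is the product of 1D hat weights.
                weight = ((u if i else 1 - u)
                          * (v if j else 1 - v)
                          * (w if k else 1 - w))
                for d in range(dim):
                    out[d] += weight * corners[i][j][k][d]
    return out

# A 2x2x2 cell of 2-D latent codes (published systems use 16-D codes).
cell = [[[[float(i), float(j + k)] for k in (0, 1)]
         for j in (0, 1)] for i in (0, 1)]
```

Querying at a corner returns that corner's code exactly, while the cell center returns the mean of all eight codes, which is the expected behavior of trilinear weighting.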

Neural Fields and Gaussian Splatting

Fully relightable full-body avatars are implemented as neural fields—MLPs parameterize density, normal, occlusion, diffuse albedo, and specular lobe maps—with 4–5 layers of 256 channels each, optionally conditioned on position, viewing direction, and latent features (Chen et al., 2022). Head avatars utilize explicit 3D Gaussian splatting representations, where each Gaussian carries learnable blendshape, skinning, position, scale, orientation, opacity, and color attributes. HRAvatar and RelightAnyone refine this paradigm by inferring physical appearance maps (albedo, normal, roughness, reflectance) and employing multi-stage decoders for lighting code disentanglement (Zhang et al., 11 Mar 2025, Xu et al., 6 Jan 2026).
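A minimal sketch of the per-Gaussian attributes named above, assuming an axis-aligned (diagonal) covariance for simplicity; published 3D Gaussian splatting systems additionally carry a full orientation (e.g., a quaternion) and view-dependent color, which are omitted here.

```python
import math
from dataclasses import dataclass

# Illustrative per-Gaussian record; attribute set is a simplified
# assumption, not the exact parameterization of any cited method.

@dataclass
class Gaussian:
    position: tuple   # (x, y, z) center
    scale: tuple      # per-axis standard deviations (diagonal covariance)
    opacity: float    # base opacity in [0, 1]
    color: tuple      # RGB

    def alpha_at(self, p):
        """Opacity contribution at point p: opacity * exp(-0.5 * m),
        where m is the squared Mahalanobis distance to the center."""
        m = sum(((p[i] - self.position[i]) / self.scale[i]) ** 2
                for i in range(3))
        return self.opacity * math.exp(-0.5 * m)

g = Gaussian(position=(0.0, 0.0, 0.0), scale=(1.0, 1.0, 1.0),
             opacity=0.8, color=(1.0, 0.5, 0.2))
```

At the Gaussian's center the contribution equals the base opacity, and it falls off smoothly with distance, which is what the splatting rasterizer accumulates during blending.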

Relighting Networks

Recent models such as RelightNet (Singh et al., 29 Nov 2025) employ U-Net backbones with self- and cross-attention mechanisms, consuming physics-informed feature stacks (mesh normals, high-frequency image normals, position maps, refined albedo, pre-integrated shading, view encodings) and HDR environment maps embedded via sinusoidal positional encodings. Output is rendered as per-texel 3D Gaussian splats, posed into world space and efficiently composited via sorted alpha blending.
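The sorted alpha blending step mentioned above can be sketched as front-to-back compositing of depth-sorted fragments; the fragment values below are illustrative.

```python
# Front-to-back alpha compositing of depth-sorted splat fragments,
# a minimal sketch of sorted alpha blending.

def composite(fragments):
    """fragments: list of (depth, rgb, alpha); returns the blended color.

    Fragments are sorted near-to-far and accumulated as
    C = sum_i rgb_i * alpha_i * prod_{j<i} (1 - alpha_j).
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light still unblocked
    for _, rgb, alpha in sorted(fragments, key=lambda f: f[0]):
        w = alpha * transmittance
        for c in range(3):
            color[c] += w * rgb[c]
        transmittance *= 1.0 - alpha
    return color

frags = [(2.0, (0.0, 0.0, 1.0), 0.5),   # far fragment, blue
         (1.0, (1.0, 0.0, 0.0), 0.5)]   # near fragment, red
```

Here the near red fragment contributes at full transmittance and the far blue fragment only through the remaining 50%, yielding a blend dominated by the front splat.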

3. Reflectance Modeling and Rendering Equations

RHC frameworks are grounded in the rendering equation:

$$L_o(x, \omega_o) = \int_{\Omega} f_r(x, \omega_i, \omega_o)\, L_i(x, \omega_i)\, (n \cdot \omega_i)\, d\omega_i$$

where $f_r$ is the BRDF, $L_i$ is the incident radiance, $n$ is the surface normal, and $(n \cdot \omega_i)$ is the foreshortening term (Singh et al., 29 Nov 2025, Chen et al., 2022). Practical systems approximate this integral via per-pixel discrete summation or spherical harmonics.
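The per-pixel discrete summation can be illustrated with a Riemann-sum approximation of this integral for a Lambertian BRDF ($f_r = \rho/\pi$) under constant incident radiance; in that special case the exact outgoing radiance is $\rho \cdot L_i$, which the sum should recover.

```python
import math

def lambertian_radiance(albedo, incident, n_theta=64, n_phi=128):
    """Riemann-sum approximation of the rendering equation over the
    hemisphere around normal n = (0, 0, 1), with Lambertian BRDF
    f_r = albedo / pi and spatially constant incident radiance L_i."""
    d_theta = (math.pi / 2) / n_theta
    d_phi = (2 * math.pi) / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta  # midpoint rule in elevation
        # cos(theta) is the foreshortening (n . w_i);
        # sin(theta) is the spherical solid-angle Jacobian.
        total += math.cos(theta) * math.sin(theta) * n_phi * d_theta * d_phi
    return (albedo / math.pi) * incident * total
```

Since the hemispherical cosine integral evaluates to pi, an albedo of 0.7 under unit-strength radiance 2.0 should give approximately 1.4, and the discrete sum converges to that value as the sample counts grow.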

RelightNet leverages transformer cross-attention between feature tokens and environment map embeddings to implicitly reproduce the high-dimensional illumination integral per texel (Singh et al., 29 Nov 2025).
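A sinusoidal positional encoding of the kind used to embed environment-map inputs can be sketched as follows; the number of frequency bands here is an illustrative choice, not a RelightNet parameter.

```python
import math

def positional_encoding(x, num_bands=4):
    """Encode a scalar x as [sin(2^k x), cos(2^k x)] for k = 0..num_bands-1,
    giving the network access to multiple frequency scales of the input."""
    feats = []
    for k in range(num_bands):
        freq = 2.0 ** k
        feats.append(math.sin(freq * x))
        feats.append(math.cos(freq * x))
    return feats
```

In practice each coordinate of an environment-map direction would be encoded this way and the resulting features concatenated into the token fed to the attention layers.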

4. Learning Strategy, Losses, and Optimization

RHC training is multi-stage and subject-specific, combining photometric reconstruction losses with perceptual and physics-informed regularization terms.

Typical optimization uses Adam with progressive learning rates (e.g., decaying from $5\times10^{-4}$ to $1\times10^{-5}$), 260K–360K iterations, batch sizes of 4–16, and GPU acceleration (e.g., H100 or V100) (Singh et al., 29 Nov 2025, Chen et al., 2022). Synthetic pretraining on large human datasets accelerates personalized adaptation and regularizes texture/lighting disentanglement (Iqbal et al., 2022).
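A progressive learning-rate schedule between the quoted endpoints might look like the following exponential decay; the decay shape and step count are assumptions for illustration, not the schedule of any cited paper.

```python
# Exponential learning-rate decay from lr_start to lr_end over a run,
# an illustrative sketch of a "progressive learning rate" schedule.

def lr_at(step, total_steps, lr_start=5e-4, lr_end=1e-5):
    """Exponentially interpolate from lr_start at step 0 to lr_end
    at total_steps; steps beyond the range are clamped."""
    t = min(max(step / total_steps, 0.0), 1.0)
    return lr_start * (lr_end / lr_start) ** t
```

The schedule starts at 5e-4, reaches 1e-5 at the final step, and decreases monotonically in between, which is the qualitative behavior progressive schedules aim for.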

5. Experimental Evaluation and Performance Metrics

Comprehensive benchmarks across synthetic (BlenderHuman, RenderPeople, INSTA, HDTF) and real (People-Snapshot, ZJU-Mocap, Ava-256, SDFM) datasets demonstrate the superiority of RHC approaches over prior methods. Key metrics include PSNR, SSIM, LPIPS (image/albedo/normal), and angular error in degrees. Table 1 below summarizes full-body and head avatar results.

Method             PSNR (Relight)   SSIM    LPIPS    Normal Err (°)   PSNR (Albedo)
Relighting4D       26.15            0.912   0.164    32.18            28.95
RHC (RelightNet)   ~32.00           >0.92   ~0.05    —                —
HRAvatar           30.36            0.948   0.0569   —                —
RelightAnyone      30.06            0.87    0.2358   —                —
RANA               22.34            0.842   0.173    62.82            24.72

(— indicates a metric not reported for that method.)

RHC and RelightNet achieve qualitative and quantitative improvements in realistic shading (cloth wrinkles, skin specularities, self-shadowing) and generalize to unseen environment maps and non-Lambertian materials (Singh et al., 29 Nov 2025, Chen et al., 2022). HRAvatar and RelightAnyone demonstrate high-fidelity relighting from minimal (even single image) input (Zhang et al., 11 Mar 2025, Xu et al., 6 Jan 2026).
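PSNR, the primary fidelity metric in the table above, can be computed as follows; the tiny grayscale "images" are illustrative.

```python
import math

def psnr(img_a, img_b, peak=1.0):
    """Peak signal-to-noise ratio in dB between two same-length
    pixel lists, with intensities assumed to lie in [0, peak]."""
    n = len(img_a)
    mse = sum((a - b) ** 2 for a, b in zip(img_a, img_b)) / n
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

clean = [0.0, 0.5, 0.5, 1.0]
noisy = [0.1, 0.5, 0.5, 1.0]
```

A single 0.1 error over four pixels gives an MSE of 0.0025 and thus roughly 26 dB, while identical inputs yield infinite PSNR; scores above 30 dB, as in the table, indicate close reconstruction.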

6. Real-Time Holoportation Pipeline Adaptation

Deploying RHC for live telepresence requires adapting the pipeline for instant inference and dynamic lighting estimation; practical systems combine scalable capture setups with real-time body tracking and adaptive estimation of the ambient lighting to support interactive use.

7. Limitations, Ablations, and Research Directions

Current RHC technologies remain subject to several constraints, including dependence on subject-specific training and studio capture, and limited handling of layered clothing, translucent materials, and high-frequency reflectance.

Ablation studies confirm that each physics-informed input feature (geometry, albedo, shading, view encoding, cross-attention) is essential: removing any one correlates with marked drops in PSNR and increases in LPIPS (Singh et al., 29 Nov 2025). Proposed future directions include universal subject-agnostic pretraining, explicit layered clothing, translucent material priors, and higher-frequency BRDF modeling (Singh et al., 29 Nov 2025, Xu et al., 6 Jan 2026). Ethical deployment, particularly concerning identity protection in telepresence, remains a key concern (Zhang et al., 11 Mar 2025).
