
Unitree G1 Humanoid Platform

Updated 7 February 2026
  • The Unitree G1 is a research-grade humanoid robot characterized by high degrees of freedom, modular design, and integrated proprioceptive sensing.
  • It underpins advanced research in whole-body control, sim-to-real transfer, unified loco-manipulation, and language-conditioned action, driving innovation in embodied AI.
  • The platform also highlights cybersecurity challenges through static encryption practices and covert telemetry, underscoring the balance between performance and security risks.

The Unitree G1 Humanoid Platform is a commercially available, research-grade full-body humanoid robot that has become a prominent testbed for advanced whole-body control, sim-to-real transfer, manipulation, language-conditioned action, gesture synthesis, and cybersecurity research. It features a high-degree-of-freedom (DoF) articulated body, integrated proprioceptive sensing, and a modular software and hardware stack that is highly representative of contemporary autonomous humanoid systems. The G1 has been deployed at scale for state-of-the-art language-action models, symmetry-exploiting controllers, unified loco-manipulation, sim-to-real RL, hyper-dexterous workspace optimization, and semantic gesture synthesis. It has also been the subject of deep security analysis, making it a keystone reference in recent embodied AI literature.

1. Mechanical Architecture and Degrees of Freedom

The Unitree G1's mechanical embodiment presents a bipedal, biomimetic structure with an articulated floating base and high DoF manipulators. Across recent research, several G1 configurations are described, typically with 27–36 active DoF in the core body and up to 46 including hands and custom actuators.

| Reference | Total DoF (excl. hands) | Full Body Structure |
|---|---|---|
| SE-Policy (Nie et al., 2 Aug 2025) | 27 | 14 arms (2×7), 12 legs (2×6), 1 waist |
| ULC (Sun et al., 9 Jul 2025) | 29 | 14 arms (2×7), 12 legs (2×6), 3 waist |
| AMO (Li et al., 6 May 2025) | 29 | 14 arms (2×7), 12 legs (2×6), 3 waist; +14 hand, +3 head |
| FRoM-W1 (Li et al., 19 Jan 2026) | 29 | Default: 14 arms (2×7), 12 legs (2×6), 3 waist (body only 21) |
| Humanoid-LLA (Liu et al., 28 Nov 2025) | 36 | Floating base (6D) + 30 joint axes |
| Co-speech Gesture (Zhang, 19 Dec 2025) | 29 + floating base | 14 arms, 12 legs, 3 waist; neck/head not explicitly counted |
| Gait-conditioned RL (Peng et al., 27 May 2025) | 23 | 12 legs (2×6), 8 arms (2×4), 3 trunk (yaw, pitch), 0 hands |

Most G1 deployment studies specify 7-DoF arms, 6-DoF articulated legs, and a 3-DoF waist (yaw, pitch, roll), in left–right mirror symmetry. Neither link masses nor precise inertia tensors are detailed in the research literature; physical values are typically inferred or drawn from reference URDFs. Hands, when present (e.g., Dex3-1 grippers), add 7 DoF per side, and a custom head module introduces 3 DoF. The platform's architecture is deliberately modular, supporting a range of application-specific DoF allocations.
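The configuration-dependent DoF counts above can be checked with a small bookkeeping sketch. The grouping names and default counts below are taken from the table; the helper function itself is illustrative, not part of any official SDK.

```python
def total_dof(arms=14, legs=12, waist=3, hands=0, head=0):
    """Sum active DoF for a given G1 configuration (counts from the
    literature: 2x7 arms, 2x6 legs, 3-axis waist by default)."""
    return arms + legs + waist + hands + head

# 29-DoF body configuration (ULC, AMO, FRoM-W1)
assert total_dof() == 29
# 27-DoF configuration with a 1-DoF waist (SE-Policy)
assert total_dof(waist=1) == 27
# AMO full configuration: +7 DoF per Dex3-1 hand, +3 DoF head module
assert total_dof(hands=14, head=3) == 46
```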

2. Sensor Suite and Proprioceptive Feedback

Onboard sensing is characterized by a unified proprioceptive stack. Joint positions and velocities are measured via integrated rotary encoders on each servo; root/base orientation and angular velocities are provided by a torso-mounted MEMS IMU. Some configurations include foot contact switches and global odometry via stereo vision (e.g., ZED Mini) or RealSense RGB-D. Notable properties include:

  • Proprioceptive vector: joint angles $q \in \mathbb{R}^n$, joint velocities $\dot{q} \in \mathbb{R}^n$, base velocities, and IMU data.
  • Contact sensors: Binary foot switches for gait studies (Peng et al., 27 May 2025); not universally present.
  • Vision: Optional ZED Mini (Li et al., 19 Jan 2026), Intel RealSense D435i (Mayoral-Vilches et al., 17 Sep 2025), used for global odometry or teleoperation.
  • Feedback rates: Typical encoder/IMU rates of 200–400 Hz at the hardware layer, with control policy frequencies in the 20–200 Hz range depending on architecture (Sun et al., 9 Jul 2025, Zhang, 19 Dec 2025).

No research paper provides tactile, force/torque, or embedded vision data in the low-level control loop; proprioceptive feedback is the principal sensory input for motion and balance control.
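The proprioceptive vector described above is typically flattened into a single observation for the control policy. The sketch below shows one common layout (joint angles, joint velocities, IMU angular velocity, projected gravity, previous action); the exact ordering and contents vary per paper, and all dimensions here are assumptions for illustration.

```python
import numpy as np

def build_observation(q, dq, base_ang_vel, gravity_vec, last_action):
    """Assemble a flat proprioceptive observation, a common pattern
    in the G1 RL literature (exact layout varies per paper)."""
    return np.concatenate([q, dq, base_ang_vel, gravity_vec, last_action])

n = 29  # body DoF in the common G1 configuration
obs = build_observation(
    q=np.zeros(n),
    dq=np.zeros(n),
    base_ang_vel=np.zeros(3),             # IMU gyroscope reading
    gravity_vec=np.array([0., 0., -1.]),  # gravity projected into base frame
    last_action=np.zeros(n),
)
assert obs.shape == (3 * n + 6,)  # 93-dimensional for n = 29
```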

3. Actuation, Control Loops, and Dynamics

All G1 platforms employ brushless DC actuators with internal PD feedback. Position targets (and occasionally velocity/torque or impedance targets) are commanded at the joint level, with control signals relayed via onboard CAN bus. Key aspects:

  • Low-level control: per-joint PD loop, $\tau_i = K_{p,i}(q_i^{\text{target}} - q_i) + K_{d,i}(\dot{q}_i^{\text{target}} - \dot{q}_i)$, with $K_p, K_d$ not standardized across papers (Liu et al., 28 Nov 2025, Sun et al., 9 Jul 2025, Zhang, 19 Dec 2025).
  • Whole-body control: full-body rigid-body dynamics modeled via MuJoCo, Isaac Gym, or custom ODEs using the manipulator equation $M(q)\ddot{q} + C(q,\dot{q})\dot{q} + g(q) = \tau$.
  • Control rates: Typical policy rates are 50 Hz (20 ms cycle), with servo decimation to 200 Hz in some architectures (Sun et al., 9 Jul 2025, Li et al., 6 May 2025).
  • Real-time compute: Onboard inference via NVIDIA Jetson AGX Xavier or Orin NX; external workstation for high-rate distributed experiments (Li et al., 6 May 2025, Li et al., 19 Jan 2026).
  • Action spaces: Position control dominates, with some residual or hybrid action models for fine-grained correction (e.g., residual policies in ULC (Sun et al., 9 Jul 2025); torque/impedance in AMO (Li et al., 6 May 2025)).

The G1’s underlying dynamics fidelity and PD parameterization are often domain randomized (mass, inertia, gains) to mitigate sim-to-real gaps (Nie et al., 2 Aug 2025, Seo et al., 1 Dec 2025, Sun et al., 9 Jul 2025, Liu et al., 28 Nov 2025).
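The per-joint PD law and gain randomization described above can be sketched as follows. The nominal gain values and the ±20% randomization range are illustrative assumptions, not values reported for the G1.

```python
import numpy as np

def pd_torque(q_target, q, dq_target, dq, kp, kd):
    """Per-joint PD law: tau = Kp*(q* - q) + Kd*(dq* - dq)."""
    return kp * (q_target - q) + kd * (dq_target - dq)

rng = np.random.default_rng(0)
n = 29
kp_nominal, kd_nominal = 40.0, 1.0  # illustrative gains, not official values

# Domain randomization: scale gains by +/-20% per episode, as is common
# in sim-to-real pipelines that randomize mass, inertia, and gains.
kp = kp_nominal * rng.uniform(0.8, 1.2, n)
kd = kd_nominal * rng.uniform(0.8, 1.2, n)

tau = pd_torque(q_target=np.zeros(n), q=rng.normal(0, 0.05, n),
                dq_target=np.zeros(n), dq=np.zeros(n), kp=kp, kd=kd)
assert tau.shape == (n,)
```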

4. Representative Control Paradigms and Research Methodologies

The G1 has been integral in advancing several research directions:

  • Language-conditioned control: Humanoid-LLA (Liu et al., 28 Nov 2025) and FRoM-W1 (Li et al., 19 Jan 2026) demonstrate end-to-end language-to-action pipelines, employing unified discrete motion vocabularies (VQ-VAE, CVAE) and RL with reward components for action feasibility, semantic-alignment, and physical robustness.
  • Symmetry-exploiting policies: SE-Policy (Nie et al., 2 Aug 2025) utilizes the G1’s bilateral symmetry to enforce equivariance in control MLPs, yielding strictly mirrored behaviors and improved tracking/global stability.
  • Unified loco-manipulation: ULC (Sun et al., 9 Jul 2025) shows a monolithic policy for simultaneous dual-arm manipulation and bipedal walking, integrating residual action modeling, polynomial arm trajectory interpolation, and stochastic delay exposure.
  • Hyper-dexterous workspace expansion: AMO (Li et al., 6 May 2025) combines trajectory optimization with RL, supporting extreme whole-body reaches and dynamic workspace extension.
  • Fast sim-to-real RL: FastSAC and FastTD3 yield robust policy transfer in under 15 minutes of GPU time (Seo et al., 1 Dec 2025), via massive parallelism and extreme domain randomization.
  • Gesture synthesis and embodiment: Semantic co-speech gesture control integrates Motion-GPT and general motion retargeting for synchronized, semantically linked articulation and speech (Zhang, 19 Dec 2025).
  • Gait-conditioned curricula: Multi-phase RL policies support smooth transitions between walking, running, and standing, tracked via joint velocity, foot contact, and momentum-based rewards (Peng et al., 27 May 2025).

Performance metrics typically include mean per-joint position error (MPJPE), velocity/acceleration error, workspace realization range, robustness under delay/load, and task success rates (e.g., up to 87.6% semantic action success versus 72–80% for baselines) (Liu et al., 28 Nov 2025, Sun et al., 9 Jul 2025, Nie et al., 2 Aug 2025, Li et al., 19 Jan 2026).
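Of the metrics listed above, MPJPE is the most widely reported and admits a compact definition: the average Euclidean distance between predicted and reference joint positions. A minimal sketch, with array shapes chosen for illustration:

```python
import numpy as np

def mpjpe(pred, ref):
    """Mean per-joint position error: average Euclidean distance between
    predicted and reference joint positions, arrays of shape (T, J, 3)."""
    return np.linalg.norm(pred - ref, axis=-1).mean()

# 10 timesteps, 29 joints, 3D positions; reference offset by 1 cm per axis
pred = np.zeros((10, 29, 3))
ref = np.full((10, 29, 3), 0.01)
err = mpjpe(pred, ref)
assert abs(err - 0.01 * np.sqrt(3)) < 1e-12  # sqrt(3) cm ~ 1.73 cm
```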

5. Software, Middleware, and Communication Stack

The platform runs a custom Linux kernel (5.10.176-rt86+) with real-time preemption, layered with ROS 2 Foxy, CycloneDDS, and proprietary Unitree "master_service" for orchestration (Mayoral-Vilches et al., 17 Sep 2025, Mayoral-Vilches, 17 Sep 2025). Embedded software launches 20+ daemons, coordinating motion planning ("ai_sport"), state estimation, audio/video streaming, and cloud uplink.

  • Onboard compute: Rockchip RK3588 ARM CPU, 8 GB RAM, 32 GB eMMC standard (Mayoral-Vilches et al., 17 Sep 2025).
  • Middleware: ROS 2 topics for state/action/sensor data, CycloneDDS for IPC, Iceoryx for shared memory.
  • Network: Gigabit Ethernet, 802.11ac Wi-Fi, Bluetooth LE. CAN bus for motor driver real-time actuation.
  • Configuration: FMX-encrypted JSONs with layered proprietary crypto; communication via MQTT, WebRTC (video), DDS, and sockets.

Some deployments include outbound telemetry to manufacturer cloud, WebRTC media streaming, and audio channel publishing integrated with main compute cycles.

6. Security Architecture, Vulnerabilities, and Telemetry

Deep cybersecurity audits of the G1 reveal a hybrid posture: sophisticated service orchestration atop real-time Linux and ROS 2, but critical flaws in cryptography and authentication (Mayoral-Vilches et al., 17 Sep 2025, Mayoral-Vilches, 17 Sep 2025). Reported weaknesses include static encryption keys, factory-default credentials, and covert outbound telemetry to the manufacturer's cloud.

Recommendations include a hardware root of trust, per-device keying, secure boot, opt-in/opt-out telemetry controls, and CAI-driven anomaly and penetration testing as first-class defensive components.
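The per-device keying recommendation addresses the static-key weakness directly: if each unit derives its own key from a protected root secret, compromising one robot does not expose the fleet. A minimal sketch using HMAC-SHA256 as the derivation primitive; a production design would keep the root key in a hardware root of trust and use a standard KDF such as HKDF, and all names below are hypothetical.

```python
import hmac
import hashlib

def derive_device_key(root_key: bytes, device_serial: str) -> bytes:
    """Derive a unique per-device key from a root secret and the unit's
    serial number, so keys are no longer static across the fleet."""
    return hmac.new(root_key, device_serial.encode(), hashlib.sha256).digest()

k1 = derive_device_key(b"example-root-secret", "G1-0001")
k2 = derive_device_key(b"example-root-secret", "G1-0002")
assert k1 != k2          # distinct serials yield distinct keys
assert len(k1) == 32     # SHA-256 output size
```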

7. Application Domains, Limitations, and Open Challenges

The Unitree G1’s design supports broad research in legged locomotion, manipulation, whole-body tracking, human–robot interaction, semantic action generation, and cyber-physical security, and the platform has seen notable real-world deployments across these areas.

Current limitations include insufficient torque/mass/inertia characterization in open literature, absence of detailed timing and communication bus metrics, and only indirect access to per-joint actuation profiles. Security surfaces remain non-trivial due to static keying and factory default credentials. Sensory feedback for force or vision is typically not leveraged at low levels, representing an ongoing area for sensor fusion enhancements.

The Unitree G1 platform, through its explicit mechanical design, open control stack, and multi-modal research applicability, remains a central asset for state-of-the-art robotics research and a catalyst for the next generation of resilient, dexterous, and semantically capable humanoid systems.
