PPF-Tracker: Articulated SE(3) Pose Tracking

Updated 15 November 2025

PPF-Tracker is a category-level articulated object pose tracking framework operating in SE(3), utilizing dynamic keyframes and point-pair features.
It integrates quasi-canonicalization, SE(3)-invariant learning, and tangent space voting to achieve robust tracking under complex kinematic conditions.
Its design supports real-time applications in robotics and augmented reality through efficient drift management and Gauss–Newton kinematic refinement.

PPF-Tracker is a category-level articulated object pose tracking framework that operates on the Lie group SE(3) and targets multi-part object pose tracking under complex, real-world kinematic conditions. Building on quasi-canonicalization and point-pair feature representations, it integrates SE(3)-invariant learning, pose voting in tangent space, and explicit part-joint kinematic constraints. The full pipeline is aimed at robust tracking of articulated structures in robotics, augmented reality, and embodied intelligence scenarios.

1. Quasi-Canonicalization on SE(3) Manifolds

PPF-Tracker defines a systematic quasi-canonicalization procedure for articulated objects comprising $K$ rigid parts. At each frame $t$, part-wise point clouds $\mathcal P_t^k$ and predicted poses $T_t^k\in\mathrm{SE}(3)$ are processed in reference to dynamic keyframes. Frames are partitioned into segments indexed by $i$, where each segment runs between successive keyframes.

A keyframe inverse is constructed for each part: $\mathcal K_i^k = (T_{n}^k)^{-1}$, with $n$ marking the first frame of segment $i$. Canonicalization transforms incoming clouds within segment $i$ via

$\bar{\mathcal P}_t^k = \mathcal K_i^k(\mathcal P_t^k) = (T_{n}^k)^{-1}T_t^k(\mathcal P_t^k),\quad t\in[n_i,\,n_{i+1})$

where $n_i = n$ and $n_{i+1}$ are the first frames of segments $i$ and $i+1$. In practice, $\bar{\mathcal P}_t^k\approx \Delta T_t^k\cdot \mathcal P_c^k$ using previous estimates, with $\mathcal P_c^k$ the canonical cloud of part $k$. Relative pose is expressed as

$\Delta T_t^k = (T_{n}^k)^{-1}\,T_t^k$

with absolute pose accumulation:

$T_t^k = T_{n}^k\,\Delta T_t^k$
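The keyframe bookkeeping above can be sketched with 4x4 homogeneous matrices. This is a minimal NumPy sketch, assuming the relative pose is $\Delta T_t^k=(T_n^k)^{-1}T_t^k$ and the absolute pose is recovered as $T_n^k\,\Delta T_t^k$. The helper names `make_pose`, `invert_pose`, `relative_pose`, and `absolute_pose` are illustrative and not taken from the PPF-Tracker code.

```python
import numpy as np

def make_pose(rotation_z_rad, translation):
    """Build a 4x4 SE(3) matrix from a rotation about z and a translation."""
    c, s = np.cos(rotation_z_rad), np.sin(rotation_z_rad)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = translation
    return T

def invert_pose(T):
    """Closed-form rigid inverse: (R, t) -> (R^T, -R^T t)."""
    R, t = T[:3, :3], T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ t
    return T_inv

def relative_pose(T_key, T_cur):
    """Pose of the current frame expressed relative to the keyframe."""
    return invert_pose(T_key) @ T_cur

def absolute_pose(T_key, delta):
    """Accumulate a relative pose back onto the keyframe pose."""
    return T_key @ delta

T_key = make_pose(0.3, [0.1, 0.0, 0.5])   # keyframe pose (segment start)
T_cur = make_pose(0.7, [0.2, -0.1, 0.6])  # pose at a later frame
delta = relative_pose(T_key, T_cur)
T_rec = absolute_pose(T_key, delta)       # should reproduce T_cur
```

Working with the relative pose keeps each estimate close to the identity within a segment, which is the usual motivation for keyframe-relative tracking.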

Dynamic Keyframe Selection (DKS) handles drift management: after each prediction, an energy $\mathcal E_t$ is computed from $d_{\rm CD}$ and $d_{\rm HD}$, the Chamfer and Hausdorff distances. A new keyframe is triggered when $\mathcal E_t$ exceeds a preset threshold $\tau$. This mechanism regulates reference-frame updates to limit drift and adapt to motion.
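A keyframe trigger built on these two distances might look like the sketch below. The way the distances are combined is an assumption: the weighted sum, the weights `w_cd` and `w_hd`, and the threshold `tau` are hypothetical placeholders, and the Chamfer distance uses one common convention (the sum of the two mean nearest-neighbour distances).

```python
import numpy as np

def pairwise_distances(A, B):
    """Euclidean distance between every point of A (N,3) and B (M,3)."""
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)

def chamfer_distance(A, B):
    """Symmetric Chamfer distance: sum of mean nearest-neighbour distances."""
    D = pairwise_distances(A, B)
    return D.min(axis=1).mean() + D.min(axis=0).mean()

def hausdorff_distance(A, B):
    """Symmetric Hausdorff distance: worst-case nearest-neighbour distance."""
    D = pairwise_distances(A, B)
    return max(D.min(axis=1).max(), D.min(axis=0).max())

def needs_new_keyframe(A, B, w_cd=1.0, w_hd=1.0, tau=0.05):
    """Hypothetical energy: weighted sum of both distances against tau."""
    energy = w_cd * chamfer_distance(A, B) + w_hd * hausdorff_distance(A, B)
    return bool(energy > tau), energy

# Unit-cube corners and a copy shifted by 0.1 along x.
A = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)], dtype=float)
B = A + np.array([0.1, 0.0, 0.0])
trigger, energy = needs_new_keyframe(A, B)
same_trigger, same_energy = needs_new_keyframe(A, A)
```

The Hausdorff term reacts to a single badly aligned point, while the Chamfer term averages over the cloud, so the two respond to different kinds of drift.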

2. Point-Pair Feature Representation for Articulated Objects

PPF-Tracker utilizes rigidity-invariant point-pair features. For points $p_i, p_j$ with normals $n_i, n_j$, the directional vector is

$v_{ij} = p_j - p_i$

and the canonical 4-D PPF encoding is

$\mathrm{PPF}_{ij} = \big(\lVert v_{ij}\rVert_2,\ \angle(n_i, v_{ij}),\ \angle(n_j, v_{ij}),\ \angle(n_i, n_j)\big)$

which is invariant under any rigid transformation $T\in\mathrm{SE}(3)$, since distances and angles are preserved.

A learned pair-wise weight $w_{ij}$, based on the normal angle $\theta_{ij}=\angle(n_i,n_j)$, is introduced. Biasing against nearly parallel pairs enhances voting contrast in subsequent network heads. A set of sampled point pairs, each with its weighted PPF and optionally its joint coordinates, is propagated through a PointNet++ backbone that captures the relevant geometric relationships.
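The invariance claim can be checked numerically. The sketch below assumes the 4-D encoding is the classic point-pair feature (pair distance plus three angles); the function names are illustrative, and the learned pair weighting is not modelled.

```python
import numpy as np

def angle(u, v):
    """Angle in radians between two non-zero vectors."""
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def point_pair_feature(p_i, n_i, p_j, n_j):
    """Distance between the points plus the three pairwise angles."""
    v = p_j - p_i
    return np.array([np.linalg.norm(v), angle(n_i, v), angle(n_j, v), angle(n_i, n_j)])

def rotation_about_axis(axis, angle_rad):
    """Rodrigues' formula for a rotation about the given axis."""
    a = np.asarray(axis, dtype=float) / np.linalg.norm(axis)
    K = np.array([[0.0, -a[2], a[1]], [a[2], 0.0, -a[0]], [-a[1], a[0], 0.0]])
    return np.eye(3) + np.sin(angle_rad) * K + (1.0 - np.cos(angle_rad)) * (K @ K)

p_i, n_i = np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])
p_j, n_j = np.array([1.0, 0.0, 0.0]), np.array([1.0, 0.0, 1.0]) / np.sqrt(2.0)

R = rotation_about_axis([1.0, 2.0, 3.0], 0.8)
t = np.array([0.5, -1.0, 2.0])

f_before = point_pair_feature(p_i, n_i, p_j, n_j)
# Points transform as R p + t; normals are directions and only rotate.
f_after = point_pair_feature(R @ p_i + t, R @ n_i, R @ p_j + t, R @ n_j)
```

Both feature vectors agree up to floating-point error, which is why the network can consume these features without seeing the absolute pose.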

3. SE(3)-Tangent Pose Voting with Explicit Parameterization

Following feature extraction, the network splits into five prediction heads per part $k$:

Translation votes, cast by each point pair for the part center
Orientation votes, cast by each point pair for the canonical rotation
Scale regressor for the part scale

Let $o$ denote the canonical part center and consider the canonical axes together with the point-pair direction vector $v_{ij}$ of Section 2. The translation parameters locate $o$ relative to the point pair. Because a single pair fixes $o$ only up to a rotation about $v_{ij}$, each pair describes a circle of possible part centers. The orientation parameters relate $v_{ij}$ to the canonical axes and vote for the canonical rotation.
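The circle of candidate centers can be made concrete under one assumed parameterization. The sketch below takes the translation parameters to be an offset `mu` along the unit pair direction and a radius `nu` perpendicular to it, measured from the first point of the pair. This follows a common point-pair voting convention and is an assumption, not a confirmed detail of PPF-Tracker.

```python
import numpy as np

def translation_params(o, p_i, p_j):
    """Assumed parameters: offset along the pair direction and radius around it."""
    v_hat = (p_j - p_i) / np.linalg.norm(p_j - p_i)
    mu = np.dot(o - p_i, v_hat)
    nu = np.linalg.norm((o - p_i) - mu * v_hat)
    return mu, nu

def candidate_centers(p_i, p_j, mu, nu, num_samples=360):
    """Sample the circle of centers consistent with (mu, nu) for this pair."""
    v_hat = (p_j - p_i) / np.linalg.norm(p_j - p_i)
    # Any vector not parallel to v_hat yields an orthonormal basis (a, b).
    helper = np.array([1.0, 0.0, 0.0]) if abs(v_hat[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    a = np.cross(v_hat, helper)
    a /= np.linalg.norm(a)
    b = np.cross(v_hat, a)
    phi = np.linspace(0.0, 2.0 * np.pi, num_samples, endpoint=False)
    ring = np.cos(phi)[:, None] * a + np.sin(phi)[:, None] * b
    return p_i + mu * v_hat + nu * ring

p_i = np.array([0.0, 0.0, 0.0])
p_j = np.array([0.0, 0.0, 1.0])
o = np.array([0.3, 0.4, 0.5])          # ground-truth center for the example

mu, nu = translation_params(o, p_i, p_j)
centers = candidate_centers(p_i, p_j, mu, nu)
closest = np.linalg.norm(centers - o, axis=1).min()
```

Every sampled candidate is equally consistent with this one pair. Accumulating the circles from many pairs in a voting grid is what singles out the true center.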

Each PPF casts soft votes, via a small MLP, into discretized translation and orientation histograms, the latter binned on a Fibonacci sphere. Histogram maxima yield continuous translation and rotation estimates, and the part scale is regressed with an MSE loss.
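Fibonacci-sphere binning and histogram voting can be illustrated as follows. This is a simplified sketch that assigns each vote to its nearest bin (hard voting) rather than the soft, MLP-predicted votes described above. The bin count of 200 and the function names are hypothetical.

```python
import numpy as np

def fibonacci_sphere(num_bins):
    """Near-uniform unit directions on the sphere via the golden-angle spiral."""
    i = np.arange(num_bins) + 0.5
    z = 1.0 - 2.0 * i / num_bins
    r = np.sqrt(1.0 - z * z)
    theta = np.pi * (3.0 - np.sqrt(5.0)) * i   # golden angle increments
    return np.stack([r * np.cos(theta), r * np.sin(theta), z], axis=1)

def vote_direction(directions, weights, bins):
    """Accumulate weighted votes in the nearest bin and return the peak bin."""
    nearest = np.argmax(directions @ bins.T, axis=1)   # max cosine similarity
    hist = np.bincount(nearest, weights=weights, minlength=len(bins))
    return bins[np.argmax(hist)], hist

rng = np.random.default_rng(0)
bins = fibonacci_sphere(200)

# 50 votes clustered around +z plus 5 outliers pointing along +x.
inliers = np.array([0.0, 0.0, 1.0]) + 0.002 * rng.standard_normal((50, 3))
inliers /= np.linalg.norm(inliers, axis=1, keepdims=True)
outliers = np.tile([1.0, 0.0, 0.0], (5, 1))
votes = np.vstack([inliers, outliers])

best_direction, hist = vote_direction(votes, np.ones(len(votes)), bins)
```

The peak bin is only as accurate as the bin spacing, so a refinement step around the maximum is a natural follow-up when continuous estimates are needed.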

From the voted translation and rotation, a twist $\xi\in\mathfrak{se}(3)$ is constructed, where analytical mappings follow Eade (2013). Pose updates are performed in tangent space and mapped back to $\mathrm{SE}(3)$ through the exponential map, which ensures rotation matrix orthogonality.
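The exponential map from a twist to a rigid transform has a closed form. The sketch below uses the standard formula with the twist ordered as (translation part, rotation part). That ordering, and the right-multiplicative update shown at the end, are assumptions made for illustration.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix such that hat(w) @ x == np.cross(w, x)."""
    return np.array([[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]])

def se3_exp(xi):
    """Map a twist xi = (u, w) in se(3) to a 4x4 matrix in SE(3)."""
    u, w = np.asarray(xi[:3], dtype=float), np.asarray(xi[3:], dtype=float)
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-8:
        # Leading Taylor terms avoid division by zero near the identity.
        A, B, C = 1.0, 0.5, 1.0 / 6.0
    else:
        A = np.sin(theta) / theta
        B = (1.0 - np.cos(theta)) / theta**2
        C = (1.0 - A) / theta**2
    R = np.eye(3) + A * W + B * (W @ W)    # Rodrigues rotation
    V = np.eye(3) + B * W + C * (W @ W)    # couples rotation into translation
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = V @ u
    return T

xi = np.array([0.1, 0.2, 0.3, 0.0, 0.0, np.pi / 2])   # quarter turn about z
T_step = se3_exp(xi)

T_prev = np.eye(4)
T_new = T_prev @ T_step    # assumed right-multiplicative tangent-space update
```

Because the rotation block is produced by the exponential map, it stays orthogonal with unit determinant without any re-normalization step.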

4. Kinematic Constraints and Joint-Axis Optimization

The framework incorporates kinematic-constraint refinement for articulated joints. The $K$ parts are interconnected by joints: revolute joints rotate about an axis, and prismatic joints slide along it. Each joint is characterized by a reference point and an axis direction.

Two energy terms define the optimization:

Geometric alignment, evaluated per part
Kinematic coupling, evaluated per joint, with axis and translation constraints depending on joint type

The total objective $\mathcal E_{\rm comp}$, which combines the per-part and per-joint terms, is minimized, typically via Gauss–Newton, to yield refined pose estimates. This step enforces consistency of joint articulation across parts and frames.
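The interplay of the two energy terms can be shown on a toy problem. The sketch below is not the PPF-Tracker objective: it refines only the translations of two parts, uses a generic Gauss–Newton loop with a finite-difference Jacobian, and couples the parts through a shared joint anchor. The anchors, observations, and coupling weight are all hypothetical.

```python
import numpy as np

def gauss_newton(residual_fn, x0, num_iters=10, eps=1e-6):
    """Generic Gauss-Newton with a forward-difference Jacobian."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(num_iters):
        r = residual_fn(x)
        J = np.empty((r.size, x.size))
        for j in range(x.size):
            step = np.zeros_like(x)
            step[j] = eps
            J[:, j] = (residual_fn(x + step) - r) / eps
        dx = np.linalg.lstsq(J, -r, rcond=None)[0]   # solves J dx = -r
        x = x + dx
        if np.linalg.norm(dx) < 1e-10:
            break
    return x

# Per-part translation estimates from the tracker (toy values).
obs_1 = np.array([0.0, 0.0, 0.0])
obs_2 = np.array([1.05, 0.0, 0.0])
# Joint anchor expressed in each part's frame; both must meet in the world.
anchor_1 = np.array([0.5, 0.0, 0.0])
anchor_2 = np.array([-0.5, 0.0, 0.0])
weight = 10.0    # hypothetical weight of the kinematic term

def residuals(x):
    t1, t2 = x[:3], x[3:]
    geometric = np.concatenate([t1 - obs_1, t2 - obs_2])    # stay near estimates
    kinematic = weight * ((t1 + anchor_1) - (t2 + anchor_2))  # close the joint gap
    return np.concatenate([geometric, kinematic])

x_init = np.concatenate([obs_1, obs_2])
x_ref = gauss_newton(residuals, x_init)

gap_before = np.linalg.norm((obs_1 + anchor_1) - (obs_2 + anchor_2))
gap_after = np.linalg.norm((x_ref[:3] + anchor_1) - (x_ref[3:] + anchor_2))
```

The refined translations move toward each other until the joint anchors nearly coincide, trading a small alignment error for kinematic consistency. A full implementation would optimize rotations as well, typically through tangent-space increments.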

5. Pipeline Overview and Implementation Pseudocode

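The PPF-Tracker process operates as a stream over input clouds and initial poses. The core steps per frame, drawn from Sections 1 to 4, are: canonicalize each incoming part cloud with the current keyframe inverse; sample weighted point-pair features; vote for the relative pose and update it in tangent space; refine all part poses under the joint constraints; and evaluate the keyframe energy to decide whether a new segment starts.

This single-stream design supports online operation and naturally accommodates dynamic keyframe selection and kinematic refinement.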
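A skeleton of such a streaming loop for a single part is sketched below. It is a structural illustration only: `estimate_delta` and `drift_energy` are hypothetical callbacks standing in for the PPF voting network and the Chamfer/Hausdorff energy, kinematic refinement is omitted, and the toy stand-ins handle pure translation.

```python
import numpy as np

def track_part(clouds, T_init, estimate_delta, drift_energy, tau):
    """Keyframe-relative tracking loop for a single part."""
    T_key = T_init.copy()
    poses, keyframes = [], [0]
    for t, cloud in enumerate(clouds):
        # Quasi-canonicalize the cloud with the keyframe inverse.
        T_key_inv = np.linalg.inv(T_key)
        canon = cloud @ T_key_inv[:3, :3].T + T_key_inv[:3, 3]
        # Estimate the pose relative to the keyframe (stand-in for voting).
        delta = estimate_delta(canon)
        T_t = T_key @ delta                 # absolute pose accumulation
        poses.append(T_t)
        # (Joint-constrained refinement across parts would go here.)
        # Dynamic keyframe selection: start a new segment on high energy.
        if drift_energy(canon, delta) > tau:
            T_key = T_t
            keyframes.append(t)
    return poses, keyframes

# Toy stand-ins: a known model cloud that translates 0.1 per frame along x.
model = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)

def toy_estimate_delta(canon):
    delta = np.eye(4)
    delta[:3, 3] = canon.mean(axis=0) - model.mean(axis=0)
    return delta

def toy_drift_energy(canon, delta):
    return np.linalg.norm(delta[:3, 3])     # grows with distance from keyframe

clouds = [model + np.array([0.1 * t, 0.0, 0.0]) for t in range(6)]
poses, keyframes = track_part(clouds, np.eye(4), toy_estimate_delta, toy_drift_energy, tau=0.25)
```

In this toy run the relative motion exceeds the threshold at frame 3, so that frame becomes the new reference and later relative poses are small again.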

6. Network Architecture, Loss Functions, and Training Protocols

PPF-Tracker deploys a PointNet++ backbone for feature learning over weighted point-pair features. Four heads operate in parallel:

Translation: Predicts softmax histograms over the translation bins
Orientation: Predicts a softmax histogram over the orientation bins
Scale: Regression for the part scale
Mask: Optional part segmentation via binary prediction

Loss functions are constructed as follows:

Translation and orientation: KL-divergence on softmax voting outputs
Scale: Mean squared error (MSE)
Mask: Binary cross-entropy (BCE)
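The three loss types listed above can be written down in a few lines. This is a NumPy sketch for illustration: the KL direction (target against prediction), the unit weights in `total_loss`, and all function names are assumptions, since the weighting used by PPF-Tracker is not given here.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(target, pred, eps=1e-12):
    """Mean KL(target || pred) over a batch of histograms."""
    terms = target * (np.log(target + eps) - np.log(pred + eps))
    return float(np.mean(np.sum(terms, axis=-1)))

def mse(pred, target):
    """Mean squared error, used here for scale regression."""
    return float(np.mean((pred - target) ** 2))

def bce(prob, label, eps=1e-12):
    """Binary cross-entropy on probabilities, used here for the mask."""
    prob = np.clip(prob, eps, 1.0 - eps)
    return float(np.mean(-(label * np.log(prob) + (1.0 - label) * np.log(1.0 - prob))))

def total_loss(l_trans, l_rot, l_scale, l_mask, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the components; unit weights are a placeholder."""
    return sum(w * l for w, l in zip(weights, (l_trans, l_rot, l_scale, l_mask)))

target_hist = np.array([[0.7, 0.3]])
pred_hist = softmax(np.array([[0.0, 0.0]]))      # uniform prediction
l_trans = kl_divergence(target_hist, pred_hist)
l_rot = kl_divergence(target_hist, target_hist)  # perfect prediction
l_scale = mse(np.array([1.0, 2.0]), np.array([1.0, 4.0]))
l_mask = bce(np.array([0.5]), np.array([1.0]))
loss = total_loss(l_trans, l_rot, l_scale, l_mask)
```

KL divergence compares whole vote distributions rather than single bin indices, which suits soft histogram targets.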

The final loss combines all components. Training runs for 200 epochs using the Adam optimizer, with the initial learning rate decayed by a factor of 0.1 every 10 epochs and input clouds downsampled to 3072 points. Inference is performed per frame on RTX 4090-class hardware, at a speed suited to real-time robotic or AR scenarios.
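The step-decay schedule described above reduces to a one-line formula. In the sketch below the initial learning rate is left as a parameter because its value is not stated here, and the `1e-3` used in the example is purely a placeholder.

```python
def step_decay_lr(initial_lr, epoch, gamma=0.1, step_size=10):
    """Learning rate after multiplying by gamma once every step_size epochs."""
    return initial_lr * gamma ** (epoch // step_size)

# Placeholder initial rate; the actual value is an unknown here.
lr_start = step_decay_lr(1e-3, epoch=0)
lr_epoch_9 = step_decay_lr(1e-3, epoch=9)
lr_epoch_10 = step_decay_lr(1e-3, epoch=10)
```

The rate stays constant within each block of 10 epochs and drops by one order of magnitude at every block boundary.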

A plausible implication is that PPF-Tracker's dynamic keyframe mechanism can adapt to unpredictable motion patterns and maintain low drift even in long sequences.

7. Applications and Implementation Considerations

PPF-Tracker is applicable to pose tracking in multi-part robotic manipulators, articulated AR objects, and category-level scene understanding, wherever rigid part motion is constrained by physically plausible kinematic joints. The framework supports extension to broader categories given annotation of joint axes.

Resource requirements are compatible with real-time deployment given modern GPUs, and the modular pipeline with explicit keyframing and refinement facilitates integration with higher-level control, mapping, or semantic segmentation subsystems.

Its empirical generalization across synthetic and real-world scenarios suggests strong domain robustness. For full implementation details, the code and pretrained models are available at https://github.com/mengxh20/PPFTracker. Lie group background follows Eade (2013).

Below is a concise summary of design choices:

Component	Key Method	Key Quantities
Feature Backbone	PointNet++	$(v_{ij},\mathrm{PPF}_{ij})$
Voting	Softmax + MLP heads	Histograms, MSE
Kinematic Refinement	Gauss–Newton	$\mathcal E_{\rm comp}$
Keyframe Policy	Dynamic, energy-based	Chamfer, Hausdorff

This synthesis represents the current canonical implementation and research status of PPF-Tracker for articulated pose tracking in SE(3).
