- The paper introduces a stem-centric localization method that uses inter-tree geometric matching to overcome challenges in forest environments.
- It employs a dual-stage descriptor pipeline combining a global Tree Distribution Histogram (TDH) for coarse retrieval and a triangle-based descriptor for fine matching.
- Experimental results show centimeter-level translation and sub-degree rotation accuracy, outperforming conventional dense point cloud and learned methods.
TreeLoc: 6-DoF LiDAR Global Localization in Forests via Inter-Tree Geometric Matching
Introduction and Motivation
Forested environments pose significant challenges for autonomous navigation and mapping: GPS reliability is degraded, and the repetitive, occluded, and structurally complex scenes undermine the assumptions built into urban-centric localization pipelines. Traditional LiDAR localization methods that rely on dense point clouds or on descriptors originally designed for environments with salient, distinctive features struggle to provide unique cues in forests, particularly under seasonal variation and limited computational resources.
TreeLoc addresses these limitations by leveraging trees as persistent, parametric, and repeatable primitives, extracting spatial geometry that remains stable over long time spans and across environmental shifts. The central premise is to represent environments via explicit tree stem models with diameter-at-breast-height (DBH), axes, and base locations, which are more interpretable and storage-efficient than point cloud-based approaches.
Figure 1: (a) GPS bias leads to misalignment in LiDAR-SLAM trajectories. (b) TreeLoc's TDH descriptor encodes tree count by spatial bin and DBH. (c) Fine retrieval refines candidates using triangle-based geometric descriptors among tree centers. (d) Precise 6-DoF pose enables robust cross-session map alignment.
Methodology
The TreeLoc pipeline comprises three key stages: (1) learning-free parametric tree reconstruction and scene alignment, (2) two-stage place recognition using domain-specific descriptors, and (3) geometric verification to estimate 6-DoF poses.
Tree Instance Extraction and Scene Alignment:
LiDAR scans are aggregated to ensure sufficient density, and RealtimeTrees is deployed for robust stem segmentation. Each detected tree is parameterized by its 3D axis, base position, and DBH. Local scenes are aligned using the axes of the reconstructed trees (roll and pitch correction), ensuring consistent 2D projections independent of local terrain variation or sensor inclination. This step is crucial for reducing descriptor ambiguity and maximizing geometric repeatability.
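The roll and pitch correction can be illustrated with a small sketch. It assumes the correction is obtained by averaging the unit stem axes and rotating that mean direction onto the vertical (+z) axis with Rodrigues' formula; the averaging step and the function names `mean_axis`, `rotation_to_vertical`, and `apply` are illustrative assumptions, since the summary only states that the tree axes are used.

```python
import math

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def mean_axis(axes):
    """Average unit stem axes, flipping any that point downward (assumed convention)."""
    acc = [0.0, 0.0, 0.0]
    for a in axes:
        u = normalize(a)
        if u[2] < 0:
            u = [-c for c in u]
        acc = [s + c for s, c in zip(acc, u)]
    return normalize(acc)

def rotation_to_vertical(axis):
    """Rotation matrix R with R @ axis = [0, 0, 1], for an upward-pointing axis."""
    ax, ay, az = normalize(axis)
    # Rotation axis k = axis x z = (ay, -ax, 0); sin(theta) = |k|, cos(theta) = az.
    kx, ky = ay, -ax
    s = math.hypot(kx, ky)
    if s < 1e-12:
        # Axis is already vertical (upward assumed), so no correction is needed.
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky = kx / s, ky / s
    c = az
    # Rodrigues' formula specialised to a rotation axis lying in the xy-plane.
    return [
        [c + (1 - c) * kx * kx, (1 - c) * kx * ky, s * ky],
        [(1 - c) * kx * ky, c + (1 - c) * ky * ky, -s * kx],
        [-s * ky, s * kx, c],
    ]

def apply(R, v):
    """Matrix-vector product for a 3x3 list-of-lists matrix."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]
```

Applying `R` to every tree base and axis removes sensor roll and pitch while leaving yaw untouched, so the subsequent 2D projection depends only on the tree layout.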
Figure 2: TreeLoc's pipeline converts LiDAR scans into tree instances, executes dual-stage place recognition, then performs 6-DoF pose estimation and tight registration.
Dual-Descriptor Place Recognition:
A Tree Distribution Histogram (TDH) is constructed in the aligned frame, encoding counts of trees by radial distance and DBH bin. This global descriptor enables rapid coarse candidate retrieval (top-100 in the database). For fine retrieval, a 2D triangle descriptor is formed by hashing triangles among projected tree centers, providing permutation-, translation-, and rotation-invariant geometric signatures. Geometric similarity is determined by the count of shared hash-keys between query and candidate scenes. The top-10 candidates are passed to the next stage.
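The two descriptors can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the bin counts, `max_range`, `dbh_max`, the quantization `resolution`, and the use of sorted, quantized side lengths as the triangle hash key are all hypothetical choices made here for clarity.

```python
import math
from itertools import combinations

def tree_distribution_histogram(trees, max_range=30.0, n_radial=6,
                                dbh_max=1.0, n_dbh=4):
    """Count trees per (radial distance, DBH) bin.

    trees: iterable of (x, y, dbh) in the gravity-aligned sensor frame.
    All bin parameters are illustrative assumptions.
    """
    hist = [[0] * n_dbh for _ in range(n_radial)]
    for x, y, dbh in trees:
        r = math.hypot(x, y)
        if r >= max_range:
            continue  # ignore trees beyond the histogram range
        i = min(int(r / max_range * n_radial), n_radial - 1)
        j = min(int(dbh / dbh_max * n_dbh), n_dbh - 1)
        hist[i][j] += 1
    return hist

def triangle_keys(centers, resolution=0.5, max_side=20.0):
    """Hash keys from sorted, quantized side lengths of every tree-center triangle.

    Sorted side lengths do not change under point reordering, translation,
    or in-plane rotation of the 2D tree centers.
    """
    keys = set()
    for a, b, c in combinations(centers, 3):
        sides = sorted((math.dist(a, b), math.dist(b, c), math.dist(c, a)))
        if sides[2] > max_side:
            continue  # skip very large triangles
        keys.add(tuple(int(s / resolution) for s in sides))
    return keys

def shared_key_count(query_keys, candidate_keys):
    """Fine-retrieval similarity: number of hash keys present in both scenes."""
    return len(query_keys & candidate_keys)
```

In a retrieval loop, the histogram would be compared against the database to shortlist candidates, and `shared_key_count` would then rank that shortlist. Note that side-length keys are also invariant to reflection, which a real system may need to disambiguate in the verification stage.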
6-DoF Pose Estimation by Geometric Verification:
For each candidate, geometric consistency is established through a two-step SVD-based alignment: first using triangle centroids for an initial in-plane transform, then refining the match using tree centers and base heights to obtain a full 4-DoF planar transform (x, y, yaw, and z-offset). The highest-overlap candidate provides the final 6-DoF pose by combining the planar transform with the tree-axis-induced roll and pitch alignment, forgoing dense iterative schemes such as ICP.
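The 4-DoF refinement step can be sketched for already-matched tree pairs. To stay dependency-free, this sketch uses the closed-form 2D least-squares rotation (an `atan2` of summed cross and dot products) instead of an explicit SVD call; in the plane it yields the same optimum as the SVD-based solution. The function names, the `tol` threshold, and the simple nearest-tree overlap count are assumptions for illustration.

```python
import math

def estimate_4dof(query, candidate):
    """Least-squares (yaw, tx, ty, dz) from matched (x, y, z_base) pairs.

    The result satisfies candidate ~= Rz(yaw) @ query + (tx, ty, dz).
    """
    n = len(query)
    qx = sum(p[0] for p in query) / n
    qy = sum(p[1] for p in query) / n
    cx = sum(p[0] for p in candidate) / n
    cy = sum(p[1] for p in candidate) / n
    s_dot = s_cross = 0.0
    for (x, y, _), (u, v, _) in zip(query, candidate):
        ax, ay = x - qx, y - qy      # centered query point
        bx, by = u - cx, v - cy      # centered candidate point
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    yaw = math.atan2(s_cross, s_dot)
    c, s = math.cos(yaw), math.sin(yaw)
    tx = cx - (c * qx - s * qy)
    ty = cy - (s * qx + c * qy)
    # Vertical offset from the mean difference of stem base heights.
    dz = sum(b[2] - a[2] for a, b in zip(query, candidate)) / n
    return yaw, tx, ty, dz

def apply_4dof(pose, p):
    """Transform one (x, y, z) point by a (yaw, tx, ty, dz) pose."""
    yaw, tx, ty, dz = pose
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * p[0] - s * p[1] + tx, s * p[0] + c * p[1] + ty, p[2] + dz)

def overlap(pose, query, candidate, tol=0.3):
    """Count query trees landing within tol (xy distance) of any candidate tree."""
    hits = 0
    for p in query:
        x, y, _ = apply_4dof(pose, p)
        if any(math.hypot(x - u, y - v) <= tol for u, v, _ in candidate):
            hits += 1
    return hits
```

Running `estimate_4dof` for each shortlisted candidate and keeping the one with the largest `overlap` mirrors the highest-overlap selection described above.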
Figure 3: Two-stage geometric verification: initial planar alignment via triangle centroids, followed by refinement using tree centers and stem base heights for a precise 4-DoF transform.
Experimental Evaluation
TreeLoc is evaluated on multiple forest benchmarks: Oxford Forest Place Recognition (including sequences with strong sensor roll/pitch), Wild-Places (spanning different seasons and years), and BotanicGarden (urban park). Datasets include both same-session and cross-session pairs, with strong viewpoint, appearance, and hardware heterogeneity.
Intra-Session Place Recognition:
TreeLoc outperforms all algorithmic baselines, both global (Scan Context++, RING++) and local (BTC, MapClosure), in recall-at-1, F1, and AUC across all sequences except highly open areas (K-04), where it remains competitive. The tree-based descriptors provide more distinctive and less ambiguous place signatures than point-cloud-based or BEV approaches.
Figure 4: Similarity maps in Evo: TreeLoc (a-c) produces compact, ground-truth-aligned matches; baselines (d-e) are diffuse and prone to false positives.
Figure 5: Precision-Recall curves: TreeLoc maintains high precision at high recall, outperforming both algorithmic and learned baselines.
Inter-Session and Long-Term Robustness:
On temporally separated traversals (6 and 14 months apart), TreeLoc demonstrates both higher mean scores and lower variance in place recognition than all baselines. Scan Context++ shows performance drops as the gap between sessions increases, while TreeLoc's reliance on persistent geometry results in consistently high recall and precision.
Figure 6: Inter-session performance: TreeLoc yields superior mean and lower standard deviation in recall/F1/AUC across all evaluation pairs.
Localization Accuracy:
TreeLoc provides superior localization performance, with median translation/rotation errors at the centimeter and sub-degree scale (for 3-DoF and full 6-DoF poses), outperforming both 3D keypoint-based descriptors (BTC) and BEV-based global descriptors (RING++), even in sparse or open-tree configurations.
Figure 7: Localization errors: TreeLoc achieves lowest and most stable errors for both 3-DoF and 6-DoF pose estimation.
Comparison with Learning-Based Methods:
TreeLoc surpasses state-of-the-art learned descriptors (TransLoc3D, LoGG3D-Net, MinkLoc3Dv2, ForestLPR), especially in cross-domain/deployment scenarios, highlighting strong generalization of the learning-free, interpretable, and domain-aligned approach.
Ablation and Scalability Analyses
Ablation confirms the necessity of each design component. TDH dramatically reduces retrieval set size and runtime; incorporating DBH as a feature removes ambiguities among geometrically similar trees. Tree axis-based alignment outperforms both ground-plane-based alignment and raw (unaligned) input, particularly in scenes with considerable terrain variation.
Storage and Computational Efficiencies:
TreeLoc's global tree database is two to three orders of magnitude smaller than point-cloud or keypoint-descriptor archives (tens of KB vs. GB scale), enabling efficient, updatable, and long-term forest map management. Place recognition and 6-DoF pose inference are achieved within 50 ms.
Applications: Lightweight Multi-Session Alignment
A key extension of TreeLoc is lightweight multi-session alignment across years and sensors. By aggregating all unique trees into a compact database, TreeLoc enables robust loop closure and global map maintenance over repeated traversals, supporting incremental updates and minimizing storage burden. Experiments show that the tree-based representation yields more accurate multi-session alignment than keypoint-based methods, even when deployed with different LiDAR sensors and after substantial environmental change.
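One way to picture the compact tree database is as a list of per-tree records that is updated session by session. The sketch below assumes that a newly observed tree, already transformed into the map frame, is treated as the same tree when it falls within a hypothetical `merge_radius` of an existing entry, in which case its parameters are averaged; the summary does not specify how unique trees are identified, so this rule and the record layout are illustrative only.

```python
import math

def merge_session(database, session_trees, merge_radius=0.5):
    """Merge map-frame trees (x, y, z_base, dbh) into the database in place.

    database: list of dicts with keys x, y, z, dbh, n (observation count).
    merge_radius: assumed xy distance below which two stems are the same tree.
    """
    for x, y, z, dbh in session_trees:
        best, best_d = None, merge_radius
        for entry in database:
            d = math.hypot(entry["x"] - x, entry["y"] - y)
            if d <= best_d:
                best, best_d = entry, d
        if best is None:
            # No nearby stem: register a new tree.
            database.append({"x": x, "y": y, "z": z, "dbh": dbh, "n": 1})
        else:
            # Nearby stem found: update its running mean.
            n = best["n"]
            for key, val in (("x", x), ("y", y), ("z", z), ("dbh", dbh)):
                best[key] = (best[key] * n + val) / (n + 1)
            best["n"] = n + 1
    return database
```

Because each record holds only a handful of numbers per tree, the database grows with the number of distinct trees rather than with the number of scans or sessions.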
Figure 8: (a) Pre-optimization trajectories. (b) Comparison of TreeLoc and BTC alignments—TreeLoc shows tighter, more accurate alignment. (c-d) Post-optimization, TreeLoc yields lower errors throughout the combined map.
Theoretical and Practical Implications
TreeLoc demonstrates that semantically meaningful, structure-aware primitives can provide highly robust, interpretable, and scalable solutions for localization in challenging environments where appearance, occlusion, and sensor conditions vary substantially over time. The focus on geometric descriptors tied to domain-specific invariants (tree stems) enables compact data representations, robust long-term place recognition, and precise pose recovery without learned models or dense point cloud processing.
On the practical side, this approach opens avenues for efficient autonomous robotic operations in forestry, long-term ecological monitoring, precision forest management, and scalable map fusion for multi-session and multi-agent deployments. The lightweight, interpretable nature further enables easy integration with existing forest inventory practices and simplifies downstream map processing.
Future Directions
The paper notes that TreeLoc's performance may degrade in regions with low-density or absent canopy (open areas). Extending the framework to incorporate additional geometric or appearance-based primitives—e.g., BEV contours, undergrowth models, or semantic class features—can further improve robustness across broader natural environments. Additional relaxation or adaptation of descriptors may support more general use in non-forested settings.
Integration with learning-based post-processing, multi-modal semantics, or continuous environmental change modeling could further expand applicability while retaining the compactness and generalization benefits of the core approach.
Conclusion
TreeLoc establishes a novel paradigm for global LiDAR localization in forests, demonstrating that stem-centric geometric matching provides clear advantages over global and local descriptor-based methods as well as learned methods. The approach balances algorithmic efficiency, storage compactness, and real-world deployment constraints, enabling persistent, scalable, and robust localization essential for autonomous operation in natural environments (2602.01501).