
Marker-Based Autonomous Landing

Updated 23 January 2026
  • Marker-based autonomous landing is a technique that uses artificial fiducial markers and multi-scale designs to achieve precise UAV pose estimation and controlled descent.
  • It integrates advanced visual detection algorithms, geometric marker design, and sensor fusion to reliably operate under dynamic and adverse environmental conditions.
  • The approach employs closed-loop control strategies and deep reinforcement learning to adapt in real time, ensuring accurate landing even with occlusions and variable lighting.

Marker-based autonomous landing refers to the use of artificial visual landmarks—fiducial markers—deployed in the landing zone to enable precise, robust, and repeatable autonomous landing of aerial vehicles via onboard or external visual sensing. The paradigm spans non-cooperative UAV landings, dynamic moving-platform scenarios, and integrated SLAM-based approaches, employing marker geometries and detection pipelines optimized for accuracy, detection range, and environmental robustness. Both classical and learning-based control architectures are used, incorporating sensor fusion, event-driven state machines, and, increasingly, deep reinforcement learning. The design of the markers, associated detection and pose estimation algorithms, sensor and computational stack, and closed-loop control strategies together define system performance limits under real-world operational disturbances.

1. Marker Design Principles and Geometric Scaling

A central challenge in marker-based landing is sustaining high-fidelity detection and accurate pose recovery over the vehicle's full descent profile. Conventional single-scale planar fiducials (e.g., AprilTag, ArUco, WhyCon) yield reliable detection only within a restricted band—set by marker size $L$, camera focal length $f$, and required minimum image-side length $s_{\min}$: $d_{\max} = fL / s_{\min}$. Multi-scale marker arrangements—spatially nested fiducials of successively smaller $L_i$—yield a composite detection band, with long-range acquisition by the largest $L_1$ and short-range accuracy from the smallest $L_M$. For instance, $L_1 = 1.0$ m, $L_2 = 0.5$ m, $L_3 = 0.25$ m enables reliable pose recovery from $d_{\max} \approx 20$ m down to touchdown (Lee et al., 2023). Marker topologies may be augmented with dual-shape (squares, triangles), hierarchical area-ratio codes, or dynamic displays for spatio-temporal adaptation (Wu et al., 2019, Acuna et al., 2017).
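As a concrete illustration of the scaling relation, the composite detection band of a nested marker set can be tabulated directly from $d_{\max} = fL/s_{\min}$. The focal length, pixel thresholds, and near-range bound below are illustrative assumptions, not values from the cited systems:

```python
# Sketch: detection range bands for nested multi-scale fiducials,
# using the pinhole relation d_max = f * L / s_min from the text.
# f (px), s_min/s_max (px), and the scale list are hypothetical.

def detection_band(focal_px, side_m, s_min_px, s_max_px):
    """Return (d_min, d_max) in metres for one marker scale.

    A marker of side L is decodable while its image-side length s
    satisfies s_min <= s <= s_max (s_max bounds the near range where
    the marker overflows the field of view or decoder grid)."""
    return (focal_px * side_m / s_max_px, focal_px * side_m / s_min_px)

focal_px = 800.0                 # hypothetical calibrated focal length
scales_m = [1.0, 0.5, 0.25]      # nested marker side lengths L_1..L_3
s_min_px, s_max_px = 30.0, 400.0

bands = [detection_band(focal_px, L, s_min_px, s_max_px) for L in scales_m]
for L, (d_near, d_far) in zip(scales_m, bands):
    print(f"L = {L:5.2f} m: usable from {d_near:5.2f} m to {d_far:5.2f} m")

# Composite band: nearest bound of the smallest marker out to the
# farthest bound of the largest one.
composite = (min(b[0] for b in bands), max(b[1] for b in bands))
```

With these assumed numbers the three scales overlap, so at least one marker is decodable everywhere inside the composite band.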

2. Visual Detection and Pose Recovery Algorithms

The detection pipeline is structured into RGB/IR image acquisition, local adaptive thresholding, contour/region extraction, marker decoding (grid sampling for code-bits, pattern type identification), and geometric pose estimation. For multi-scale tags, detection runs in parallel per scale $L_k$, targeting expected image-side lengths $s_k \approx f L_k / d$. Pose is recovered by solving the Perspective-$n$-Point problem: $$\min_{R,t} \sum_{i=1}^n \| x_i - \pi(R X_i + t) \|^2,$$ where $X_i$ are known 3D marker points, $x_i$ the image detections, and $\pi(\cdot)$ the calibrated projective mapping. For planar markers, direct homography estimation followed by projection and SVD-based decomposition is used. Robustness to partial occlusion, lighting variation, or blur is enforced via adaptive thresholding, RANSAC-PnP, and IMU-informed pose priors (Lee et al., 2023, Schroder et al., 18 May 2025). For IR markers, detection leverages spectral pre-filtering, with passive/active variants distinguished by preprocessing steps (inversion, local background subtraction, morphological closing) and marker design (heated emissive or reflective structures) (Springer et al., 2024). Real-time pipelines are commonly realized on embedded ARM/FPGA (e.g., Zynq SoC) (Blachut et al., 2020) or onboard Jetson-class CPUs/GPUs (Schroder et al., 18 May 2025).
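The homography route for planar markers can be sketched end to end on synthetic data. The intrinsics, marker size, and test pose below are arbitrary assumptions, and a production pipeline would add coordinate normalization and the RANSAC layer discussed above:

```python
import numpy as np

# Sketch: planar-marker pose recovery via direct homography estimation
# (DLT) and decomposition, on synthetic correspondences.

def estimate_homography(obj_xy, img_xy):
    """DLT: solve for H mapping (X, Y, 1) to image points (>= 4 matches)."""
    rows = []
    for (X, Y), (u, v) in zip(obj_xy, img_xy):
        rows.append([-X, -Y, -1, 0, 0, 0, u * X, u * Y, u])
        rows.append([0, 0, 0, -X, -Y, -1, v * X, v * Y, v])
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    return Vt[-1].reshape(3, 3)      # null vector of the stacked system

def pose_from_homography(H, K):
    """Recover (R, t) from H ~ K [r1 r2 t] for a z = 0 planar marker."""
    B = np.linalg.solve(K, H)
    lam = 1.0 / np.linalg.norm(B[:, 0])
    if lam * B[2, 2] < 0:            # marker must lie in front of the camera
        lam = -lam
    r1, r2, t = lam * B[:, 0], lam * B[:, 1], lam * B[:, 2]
    R = np.column_stack([r1, r2, np.cross(r1, r2)])
    U, _, Vt = np.linalg.svd(R)      # re-project onto SO(3)
    return U @ Vt, t

# Synthetic check: a 0.5 m marker seen from about 2 m (assumed values).
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1.0]])
corners = np.array([[-0.25, -0.25], [0.25, -0.25], [0.25, 0.25], [-0.25, 0.25]])
ang = 0.1
R_true = np.array([[1, 0, 0],
                   [0, np.cos(ang), -np.sin(ang)],
                   [0, np.sin(ang), np.cos(ang)]])
t_true = np.array([0.1, -0.05, 2.0])
pts_cam = corners @ R_true[:, :2].T + t_true   # (X, Y, 0) into camera frame
img = pts_cam @ K.T
img = img[:, :2] / img[:, 2:3]                 # perspective division

H = estimate_homography(corners, img)
R_est, t_est = pose_from_homography(H, K)
```

On noise-free points the recovered pose matches the synthesized one to numerical precision; real detections would feed the same decomposition after RANSAC filtering.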

3. Visual-Inertial Fusion, SLAM, and State Estimation

High-precision landing, especially in GPS-denied or urban-canyon environments, is enabled by tight integration of marker-derived absolute pose constraints with inertial navigation and SLAM backends. Factor-graph SLAM encodes IMU pre-integration constraints between vehicle states $x_k$ and fuses marker observations as "absolute" pose factors: $$\phi_{\mathrm{fid}}(x_k, m_j) = \| z_{k,j} - h(x_k, m_j) \|_\Sigma^2,$$ where $m_j$ is the world marker pose, $z_{k,j}$ is the observed relative pose, and $h(\cdot)$ models the forward measurement process (Lee et al., 2023). Loop closure is achieved by repeated marker observation across the trajectory, constraining global drift. On moving platforms, ground-camera tracking of UAV-mounted LED or geometric patterns, fused with an Iterated Extended Kalman Filter (IEKF) on the manifold $\mathbb{R}^3 \times SO(3) \times \mathbb{R}^3$, achieves offboard, non-robocentric landing with real-time update rates and minimal onboard computation (Lo et al., 2024). Particle filtering is exploited for long-range infrared beacon arrays, fusing dot detections with IMU data to estimate full-approach pose under adverse visibility (Khithov et al., 2017).
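A deliberately reduced, one-axis sketch of the fusion pattern (IMU-driven prediction corrected by intermittent absolute marker fixes) can be written as a linear Kalman filter. All noise levels, rates, and the trajectory are illustrative assumptions, far simpler than the cited factor-graph and IEKF formulations:

```python
import numpy as np

# 1-D toy: dead-reckon altitude from noisy IMU accelerations, correct
# with an absolute marker fix every 10th frame (dropouts in between).

dt = 0.01
F = np.array([[1.0, dt], [0.0, 1.0]])   # state: [altitude, vertical rate]
B = np.array([0.5 * dt**2, dt])         # acceleration input matrix
Q = np.diag([1e-6, 1e-4])               # process noise (assumed)
H = np.array([[1.0, 0.0]])              # marker measures altitude directly
R_meas = np.array([[1e-4]])             # marker fix noise (assumed)

x = np.array([10.0, 0.0])               # true state: 10 m, at rest
x_hat = np.array([10.5, 0.0])           # biased initial estimate
P = np.eye(2)

rng = np.random.default_rng(0)
for k in range(500):
    a = -0.2                            # true commanded descent accel
    x = F @ x + B * a                   # truth propagation
    a_meas = a + 0.05 * rng.standard_normal()
    x_hat = F @ x_hat + B * a_meas      # predict with noisy IMU
    P = F @ P @ F.T + Q
    if k % 10 == 0:                     # marker pose fix available
        z = H @ x + 1e-2 * rng.standard_normal(1)
        S = H @ P @ H.T + R_meas
        K_gain = P @ H.T @ np.linalg.inv(S)
        x_hat = x_hat + K_gain @ (z - H @ x_hat)
        P = (np.eye(2) - K_gain @ H) @ P

err = abs(x_hat[0] - x[0])
```

Between fixes the estimate drifts with the IMU noise; each marker observation pulls it back, which is the same role the absolute pose factors play in the factor graph.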

4. Closed-Loop Landing Control and Planning Architectures

Closed-loop control strategies integrate marker pose observations via multi-layer guidance and low-level control. Finite-state machines coordinate phases: search, acquisition, approach, alignment, descent, and commit, with transitions keyed to image metrics (centroid error, confidence, loss-of-marker) (Springer et al., 2024). Position-based visual servoing (PBVS) uses proportional velocity or acceleration laws derived from the 3D pose error $e = s - s^*$; e.g., $v_c = -\lambda R^T (\hat{t}_c^*)$, with optional decoupling of vertical and planar motions (Acuna et al., 2017). PID and LQR forms are implemented for cascaded position/attitude control, with the outer loop targeting the marker-relative trajectory (Schroder et al., 18 May 2025, Salagame et al., 2022). For moving-platform and obstacle-dense environments, real-time B-spline or gradient-based planners generate collision-free trajectories subject to dynamic feasibility and visual-frustum bounds; constraints and cost functions incorporate marker update latency and field-of-view restrictions (Wang et al., 2022).
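The proportional PBVS law can be exercised in a purely kinematic toy loop, including a minimal commit transition of the kind the state machines above coordinate. The gain, poses, and threshold are illustrative assumptions:

```python
import numpy as np

# Kinematic PBVS sketch: a proportional velocity command v_c = lambda * e,
# with e = s - s*, closes the camera on a hover point above the marker.

lam, dt = 0.8, 0.05
t_cm = np.array([0.4, -0.3, 2.5])    # marker position in camera frame (m)
t_des = np.array([0.0, 0.0, 0.5])    # desired offset: hover 0.5 m above
state = "APPROACH"

for _ in range(300):
    e = t_cm - t_des                 # 3D pose error
    v_c = lam * e                    # proportional velocity command
    t_cm = t_cm - v_c * dt           # camera motion shrinks the relative pose
    if state == "APPROACH" and np.linalg.norm(e) < 0.05:
        state = "COMMIT"             # flag hand-over to final touchdown phase

final_err = np.linalg.norm(t_cm - t_des)
```

The error decays geometrically by a factor $(1 - \lambda\,\Delta t)$ per step, so gain and rate jointly set the approach time constant; a real controller would saturate $v_c$ and decouple the vertical channel as the text notes.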

5. Reinforcement Learning for Marker-Guided Landing

Reinforcement learning (RL) has been adopted for vision-based, marker-guided landing, reframing the sequence as a (hierarchical) MDP over raw or reduced images. Deep Q-Networks (DQN) orchestrate high-level navigation ("marker-detection" at fixed altitude, followed by "vertical-descent" and power-down) (Polvara et al., 2017). Partitioned replay and Double-DQN targets mitigate reward sparsity and overfitting, while domain randomization ensures robustness to background textures and lighting. RL agents can directly regress depth and offset from monocular RGB, leveraging marker design features (color segmentation, geometric deformation) to encode slant and range (Houichime et al., 11 May 2025). In search-and-exploration, RL-trained policies (Proximal Policy Optimization, curriculum learning) are compared against heuristic 2D/3D coverage patterns (spiral, zigzag), with 3D-aware search yielding improved collision avoidance but hybrid policies needed for comprehensive marker acquisition, particularly in urban, high-occlusion settings (Yao et al., 16 Jan 2026).
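The hierarchical structure (acquire the marker, then descend) can be mimicked with a tiny tabular Q-learning toy. The state space, rewards, and hyperparameters below are invented for illustration and stand in for, rather than reproduce, the image-based DQN and PPO agents cited:

```python
import numpy as np

# Toy MDP: the agent must centre on the marker ("marker-detection" phase)
# before each "vertical-descent" step pays off; blind descent is penalised.

rng = np.random.default_rng(1)
N_ALT = 9                          # altitude bins; 0 = touchdown
Q = np.zeros((N_ALT + 1, 2, 2))    # [altitude, centred?, action]
alpha, gamma, eps = 0.2, 0.95, 0.2

def step(alt, centred, a):
    """a = 0: servo onto the marker; a = 1: descend one altitude bin."""
    if a == 0:
        return alt, 1, -0.1, False
    if centred:
        alt -= 1
        drift = rng.random() < 0.3           # descent may lose the lock
        return alt, (0 if drift else 1), (10.0 if alt == 0 else 1.0), alt == 0
    return alt, 0, -1.0, False               # blind descent: penalty, no motion

for _ in range(3000):                        # training episodes
    alt, centred = N_ALT, 0
    for _ in range(100):
        a = int(rng.integers(2)) if rng.random() < eps else int(np.argmax(Q[alt, centred]))
        alt2, c2, r, done = step(alt, centred, a)
        target = r + (0.0 if done else gamma * Q[alt2, c2].max())
        Q[alt, centred, a] += alpha * (target - Q[alt, centred, a])
        alt, centred = alt2, c2
        if done:
            break

# Greedy rollout with the learned policy.
alt, centred, landed = N_ALT, 0, False
for _ in range(200):
    a = int(np.argmax(Q[alt, centred]))
    alt, centred, _, landed = step(alt, centred, a)
    if landed:
        break
```

The learned policy alternates centring and descending, echoing the two-phase decomposition; the deep variants replace the tabular state with convolutional features over raw images.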

6. System Performance Metrics and Empirical Evaluation

System evaluation focuses on trajectory error, pose availability, landing accuracy, computational throughput, and robustness to environmental disturbances. Key metrics include:

  • Absolute Trajectory Error (ATE):

$$\mathrm{ATE} = \sqrt{\frac{1}{n}\sum_{i=1}^n \left\| T^{\mathrm{est}}_i - T^{\mathrm{gt}}_i \right\|^2}$$
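For translation-only trajectories already expressed in a common frame, the ATE above reduces to a root-mean-square position error. The short synthetic trajectory below is an illustrative assumption:

```python
import numpy as np

# Translation-only ATE on synthetic estimated vs. ground-truth positions.

def ate(traj_est, traj_gt):
    """Root-mean-square translational error between aligned trajectories."""
    diff = np.asarray(traj_est) - np.asarray(traj_gt)
    return np.sqrt(np.mean(np.sum(diff**2, axis=1)))

gt = np.array([[0.0, 0, 10], [0, 0, 8], [0, 0, 6], [0, 0, 4]])
est = gt + np.array([[0.1, 0, 0], [0, 0.1, 0], [-0.1, 0, 0], [0, -0.1, 0]])
print(f"ATE = {ate(est, gt):.3f} m")   # constant 0.1 m offset -> ATE of 0.1 m
```

Full SE(3) evaluation would first align the trajectories (e.g., Umeyama alignment) and use a pose metric in place of the Euclidean norm.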

7. Practical Considerations, Limitations, and Operational Best Practices

Best practices reflect empirical findings:

  • Prefer 3–4 scale nested markers covering target descent range; high-contrast, durable patterns (AprilTag, ArUco, designed area-code patterns) recommended (Lee et al., 2023).
  • Tune camera $f,\,s$ so that the smallest marker projects to $s \gtrsim 30$ px at $d_{\max}$; use global-shutter sensors to avoid motion blur (Schroder et al., 18 May 2025).
  • Adaptive thresholding, RANSAC-PnP, IMU-based pose priors for detection robustness; multi-frame validation against ID jitter and false positives (Lee et al., 2023, Schroder et al., 18 May 2025).
  • Factor-graph or EKF/particle-filter fusion for state estimation under missed frames or occlusion; delayed updates rely on IMU dead-reckoning (Khithov et al., 2017, Lee et al., 2023).
  • Pre-survey marker positions (RTK GNSS), account for range and environmental variation, monitor pose availability in real time, and trigger fallback (e.g., GPS) below $\rho = 90\%$ (Lee et al., 2023).
  • For high-reliability deployments: modular, real-time architectures, edge-optimized deep models (e.g., TensorRT), and hot-swappable detection/planning nodes (Schroder et al., 18 May 2025).
  • Systematic adversarial testing (simulation, real-world) to uncover rare failures; hybrid genetic algorithm-RL test generation accelerates the discovery of violation cases, especially with dynamic obstacles and environmental perturbations (Liang et al., 2023).
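The pose-availability fallback in the bullets above can be sketched as a sliding-window monitor; the window length, trace, and mode names are illustrative assumptions:

```python
from collections import deque

# Sliding-window monitor: track the fraction of recent frames with a valid
# marker pose and switch to a fallback mode when it drops below rho = 90%.

class PoseAvailabilityMonitor:
    def __init__(self, window=50, rho=0.9):
        self.frames = deque(maxlen=window)
        self.rho = rho

    def update(self, pose_valid: bool) -> str:
        self.frames.append(pose_valid)
        availability = sum(self.frames) / len(self.frames)
        return "MARKER" if availability >= self.rho else "FALLBACK_GNSS"

mon = PoseAvailabilityMonitor(window=10)
trace = [True] * 10 + [True, False, False, True, False]  # occlusion burst
modes = [mon.update(v) for v in trace]
```

A single dropped frame in the assumed 10-frame window keeps availability at exactly 90% and stays in marker mode; a second drop crosses the threshold and holds the fallback until the window recovers.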

Marker-based autonomous landing occupies a central role in modern aerial robotics, affording high-precision, infrastructure-minimal landings. Advances in marker geometry, sensor fusion, and closed-loop controls continue to extend operational reliability, with rigorous empirical practice and simulation-based adversarial testing guiding the field (Lee et al., 2023, Schroder et al., 18 May 2025, Liang et al., 2023).
