Multi-Target Multi-Camera Tracking of Vehicles using Metadata-Aided Re-ID and Trajectory-Based Camera Link Model

Published 3 May 2021 in cs.CV | (2105.01213v1)

Abstract: In this paper, we propose a novel framework for multi-target multi-camera tracking (MTMCT) of vehicles based on metadata-aided re-identification (MA-ReID) and the trajectory-based camera link model (TCLM). Given a video sequence and the corresponding frame-by-frame vehicle detections, we first address the isolated tracklets issue from single camera tracking (SCT) by the proposed traffic-aware single-camera tracking (TSCT). Then, after automatically constructing the TCLM, we solve MTMCT by the MA-ReID. The TCLM is generated from camera topological configuration to obtain the spatial and temporal information to improve the performance of MTMCT by reducing the candidate search of ReID. We also use the temporal attention model to create more discriminative embeddings of trajectories from each camera to achieve robust distance measures for vehicle ReID. Moreover, we train a metadata classifier for MTMCT to obtain the metadata feature, which is concatenated with the temporal attention based embeddings. Finally, the TCLM and hierarchical clustering are jointly applied for global ID assignment. The proposed method is evaluated on the CityFlow dataset, achieving IDF1 76.77%, which outperforms the state-of-the-art MTMCT methods.



Summary

The paper presents a novel MTMCT framework that integrates metadata-aided ReID and trajectory-based camera link models for robust vehicle tracking.
It employs Traffic-Aware Single Camera Tracking to merge fragmented trajectories and uses hierarchical clustering for consistent global ID assignment.
Experimental evaluation on the CityFlow dataset demonstrates an IDF1 score of 76.77%, validating the approach's effectiveness in complex surveillance scenarios.


Introduction to MTMCT of Vehicles

The paper presents a novel framework for Multi-Target Multi-Camera Tracking (MTMCT) of vehicles, which leverages metadata-aided re-identification (ReID) and trajectory-based camera link models (TCLM) to address the challenges inherent in tracking vehicles across multiple camera views. The central objective of MTMCT is to robustly track vehicle identities in a network of surveillance cameras, despite variations in viewpoints, vehicle occlusions, and lighting conditions.

Figure 1: Illustration for MTMCT of vehicles.

Framework and Methodology

The proposed MTMCT framework follows a structured approach that consists of four main components:

Traffic-Aware Single Camera Tracking (TSCT): The TSCT algorithm improves single-camera tracking results by considering traffic patterns to merge fragmented trajectories that typically result from occlusions or sudden stops. This is achieved by generating zones of entry, exit, and traffic awareness, allowing isolated tracklets to be merged effectively (Figure 2).
Figure 2: Traffic-aware zone generation.
Metadata-Aided ReID (MA-ReID): This component enriches traditional ReID models by incorporating metadata features such as vehicle type, color, and brand. The MA-ReID model enhances the discriminative power of embeddings by combining temporal attention-augmented appearance features with metadata-informed features, thereby improving cross-camera ReID robustness.
Figure 3: Examples of vehicle keypoints detection and visibility estimation.
Trajectory-Based Camera Link Model (TCLM): The TCLM automatically establishes spatial and temporal constraints between linked cameras by analyzing paths that vehicles are likely to take. This model reduces the identification search space by considering plausible trajectory transitions between cameras, improving computational efficiency and accuracy (Figure 4).
Figure 4: Illustration for trajectory-based camera link model.
Hierarchical Clustering for Global ID Assignment: The framework applies hierarchical clustering on the combined embeddings from MA-ReID and constraints from TCLM to assign global IDs across all camera views, resulting in coherent multi-view vehicle tracking (Figure 5).
Figure 5: The procedure of hierarchical clustering.
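
The last three components can be illustrated together with a minimal sketch. It is not the authors' implementation: the function names (`combined_embedding`, `link_allowed`, `assign_global_ids`), the representation of camera links as a dictionary of transit-time windows, the Euclidean distance, the complete-linkage rule, and the distance threshold are all assumptions made for illustration. The sketch concatenates an appearance embedding with a metadata feature, uses camera links to rule out implausible pairs, and then merges trajectories bottom-up into global IDs.

```python
import math
from itertools import combinations


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def combined_embedding(appearance, metadata):
    """Concatenate the appearance embedding with the metadata feature."""
    return l2_normalize(list(appearance)) + l2_normalize(list(metadata))


def link_allowed(a, b, links):
    """Hypothetical camera-link check: the two trajectories must come from
    different cameras joined by a link, and the transit time must fall
    inside that link's (min, max) window."""
    if a["cam"] == b["cam"]:
        return False
    first, second = (a, b) if a["t_exit"] <= b["t_enter"] else (b, a)
    window = links.get((first["cam"], second["cam"]))
    if window is None:
        return False
    transit = second["t_enter"] - first["t_exit"]
    return window[0] <= transit <= window[1]


def euclidean(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def assign_global_ids(trajs, links, threshold):
    """Agglomerative clustering (complete linkage, assumed) in which pairs
    rejected by the camera links have infinite distance."""
    feats = [combined_embedding(t["appearance"], t["metadata"]) for t in trajs]
    n = len(trajs)
    dist = [[math.inf] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if link_allowed(trajs[i], trajs[j], links):
            dist[i][j] = dist[j][i] = euclidean(feats[i], feats[j])

    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        best, pair = math.inf, None
        for a, b in combinations(range(len(clusters)), 2):
            # Complete linkage: the worst pair decides the cluster distance.
            d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
            if d < best:
                best, pair = d, (a, b)
        if pair is None or best > threshold:
            break
        a, b = pair
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)

    ids = [0] * n
    for gid, members in enumerate(clusters):
        for i in members:
            ids[i] = gid
    return ids


# Toy data (made up): two cameras, one link with a 5 to 40 second window.
links = {("c1", "c2"): (5, 40)}
trajs = [
    {"cam": "c1", "t_enter": 0, "t_exit": 10, "appearance": [1.0, 0.0], "metadata": [1, 0, 0]},
    {"cam": "c2", "t_enter": 25, "t_exit": 35, "appearance": [0.9, 0.1], "metadata": [1, 0, 0]},
    {"cam": "c2", "t_enter": 26, "t_exit": 36, "appearance": [0.0, 1.0], "metadata": [0, 1, 0]},
    {"cam": "c2", "t_enter": 200, "t_exit": 210, "appearance": [1.0, 0.0], "metadata": [1, 0, 0]},
]
ids = assign_global_ids(trajs, links, threshold=0.5)
print(ids)
```

In this toy example the fourth trajectory has the same appearance and metadata as the first, but its transit time lies outside the link window, so it is never compared with the first. This is the sense in which camera-link constraints shrink the ReID candidate set before any appearance matching takes place.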

Experimental Evaluation

The proposed MTMCT system was evaluated on the CityFlow dataset, where it achieved an IDF1 score of 76.77%, which the authors report as outperforming state-of-the-art MTMCT methods. Two components are credited with the gain: TSCT, which mitigates the isolated-tracklet problem in single-camera tracking, and MA-ReID, which combines appearance embeddings with metadata features. The result supports the use of contextual traffic information and vehicle metadata for MTMCT.
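
For readers unfamiliar with the metric, IDF1 is the harmonic mean of identification precision and recall, computed from identity-level true positives (IDTP), false positives (IDFP), and false negatives (IDFN). The short sketch below shows the standard formula; the function name `id_scores` and the counts in the usage line are made up for illustration and are not taken from the paper.

```python
def id_scores(idtp, idfp, idfn):
    """Return (IDP, IDR, IDF1) from identity-level match counts."""
    idp = idtp / (idtp + idfp) if (idtp + idfp) else 0.0
    idr = idtp / (idtp + idfn) if (idtp + idfn) else 0.0
    denom = 2 * idtp + idfp + idfn
    idf1 = 2 * idtp / denom if denom else 0.0
    return idp, idr, idf1


# Hypothetical counts, chosen only to exercise the formula.
idp, idr, idf1 = id_scores(idtp=80, idfp=20, idfn=30)
print(f"IDP={idp:.4f} IDR={idr:.4f} IDF1={idf1:.4f}")
```

Because IDF1 rewards keeping one consistent identity per vehicle over the whole sequence, it is sensitive to the cross-camera ID switches that MTMCT methods aim to avoid.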

Implications and Future Directions

The paper's methodology highlights the value of integrating domain-specific metadata with machine learning models to address real-world challenges in intelligent surveillance systems. Future work could explore scaling such systems to larger, more complex camera networks, adding real-time processing, and extending the model to diverse weather and lighting conditions.

In summary, the paper presents an end-to-end approach to vehicle tracking across multi-camera setups that combines metadata integration with traffic modeling, with potential applications in urban traffic management and security operations.