- The paper introduces Multi-Camera Trajectory Forecasting (MCTF), a proactive framework predicting pedestrians' future transitions across multiple cameras.
- It presents the Warwick-NTU Multi-camera Forecasting Database featuring 600 hours of video from 15 cameras, annotated with a semi-automated process for high accuracy.
- Experimental results demonstrate that recurrent models like GRUs achieve 75.1% top-1 and 94.9% top-3 accuracy, outperforming traditional heuristic methods.
Multi-Camera Trajectory Forecasting: Pedestrian Trajectory Prediction in a Network of Cameras
The paper "Multi-Camera Trajectory Forecasting: Pedestrian Trajectory Prediction in a Network of Cameras" presents a novel approach to trajectory forecasting in camera networks, addressing limitations of current single-camera methodologies. The authors introduce Multi-Camera Trajectory Forecasting (MCTF), the task of predicting an object's future trajectory not within a single camera view but across multiple non-overlapping cameras. This work has implications for tasks such as person re-identification and surveillance, which depend on tracking individuals effectively over the larger spatial areas covered by multiple camera feeds.
Core Contributions
To enable research in MCTF, the authors have compiled the Warwick-NTU Multi-camera Forecasting Database (WNMF). This comprehensive dataset includes 600 hours of video captured from 15 synchronized cameras, specifically designed for multi-camera trajectory scenarios. An innovative semi-automated annotation process was employed to label this dataset, utilizing automated methods for detection and person re-identification (RE-ID) supplemented by manual verification. This procedure not only ensures high data accuracy but also significantly reduces manual labor compared to fully manual annotation strategies.
A standout feature of the proposed approach is the shift from reactive to proactive prediction. Traditional methods in the field, such as RE-ID and tracking, respond to detected trajectories only after an object has been observed across multiple views. In contrast, the MCTF task anticipates future trajectory points, including the next camera that will capture the object after it leaves the current one. This prospective capability improves surveillance efficiency by narrowing detection to a small subset of candidate cameras.
Experimentation and Results
The paper's experimental design evaluates several models for predicting the next camera of appearance. Baseline methods include predicting the camera at the shortest real-world distance, the most frequent camera transition, and the most similar past trajectory, each serving as a heuristic against which more sophisticated techniques are compared. Learned classifiers, including fully connected networks, LSTMs, and GRUs, are also assessed using normalized bounding box coordinates as input.
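The most-frequent-transition baseline mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation; the camera IDs and trajectory histories below are invented for the example.

```python
from collections import Counter, defaultdict

def build_transition_table(trajectories):
    """Count observed camera-to-camera transitions.

    trajectories: list of camera-ID sequences, e.g. [[1, 3, 5], [1, 3, 2]].
    Returns a dict mapping each camera to a Counter of next cameras.
    """
    table = defaultdict(Counter)
    for seq in trajectories:
        for cur, nxt in zip(seq, seq[1:]):
            table[cur][nxt] += 1
    return table

def predict_next_cameras(table, current_camera, k=3):
    """Return the k most frequently observed next cameras after current_camera."""
    return [cam for cam, _ in table[current_camera].most_common(k)]

# Illustrative usage with made-up trajectories
history = [[1, 3, 5], [1, 3, 2], [1, 3, 5], [2, 3, 5]]
table = build_transition_table(history)
print(predict_next_cameras(table, current_camera=3, k=2))  # [5, 2]
```

A learned model such as a GRU replaces this lookup with a classifier over the same candidate cameras, conditioned on the full bounding-box sequence rather than just the current camera ID.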
The results, reported as top-1 and top-3 accuracy, show that learned models, particularly those with recurrent architectures, outperform the simpler heuristics. In particular, the GRU model achieves 75.1% top-1 accuracy and 94.9% top-3 accuracy, highlighting its ability to model sequential observations across the multi-camera setup.
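The top-k accuracy metric used here is standard; a generic implementation (not the paper's evaluation code) looks like the following, where each sample carries one score per candidate camera:

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scored classes.

    scores: list of per-sample score lists (one score per candidate camera).
    labels: list of true next-camera indices.
    """
    hits = 0
    for s, y in zip(scores, labels):
        # Indices of the k highest-scoring candidate cameras
        topk = sorted(range(len(s)), key=lambda i: s[i], reverse=True)[:k]
        hits += y in topk
    return hits / len(labels)

# Illustrative scores for 3 samples over 3 candidate cameras
scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
labels = [1, 2, 0]
print(top_k_accuracy(scores, labels, k=1))  # 0.333...
print(top_k_accuracy(scores, labels, k=3))  # 1.0
```

Top-3 accuracy is the operationally relevant number when a system only needs to wake up a handful of cameras rather than identify the single correct one.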
Implications and Future Directions
This research has significant implications for multi-camera monitoring systems, potentially reducing computational costs by preemptively focusing resources on the most relevant cameras. Preemptive trajectory forecasting narrows the search space for detection, leading to more efficient surveillance systems. Furthermore, while the current study focuses on human subjects, the methodology generalizes to other moving objects, opening avenues for broader applications such as traffic monitoring and automated logistics.
The introduction of the WNMF dataset is pivotal. As an open resource, it serves as a valuable tool for advancing research in this domain. Future extensions of this work could integrate these models into full-scale surveillance systems, including real-time detection and monitoring in dynamic environments. Further research could also strengthen cross-camera feature learning, possibly leveraging more complex deep learning architectures to improve on current trajectory-prediction accuracy.
In conclusion, this paper contributes a rigorous framework, robust dataset, and promising baseline performances for forecasting across camera networks, providing a substantial step forward in trajectory prediction fields. The convergence of machine learning with distributed camera systems continues to underscore the evolving landscape of intelligent surveillance solutions.