Object Localization under Single Coarse Point Supervision

Published 17 Mar 2022 in cs.CV | (2203.09338v1)

Abstract: Point-based object localization (POL), which pursues high-performance object sensing under low-cost data annotation, has attracted increased attention. However, the point annotation mode inevitably introduces semantic variance for the inconsistency of annotated points. Existing POL methods heavily rely on accurate key-point annotations which are difficult to define. In this study, we propose a POL method using coarse point annotations, relaxing the supervision signals from accurate key points to freely spotted points. To this end, we propose a coarse point refinement (CPR) approach, which to our best knowledge is the first attempt to alleviate semantic variance from the perspective of algorithm. CPR constructs point bags, selects semantic-correlated points, and produces semantic center points through multiple instance learning (MIL). In this way, CPR defines a weakly supervised evolution procedure, which ensures training high-performance object localizer under coarse point supervision. Experimental results on COCO, DOTA and our proposed SeaPerson dataset validate the effectiveness of the CPR approach. The dataset and code will be available at https://github.com/ucas-vg/PointTinyBenchmark/.



Summary

The paper presents a novel coarse point annotation strategy paired with a CPR algorithm to reduce annotation burden.
It leverages multiple instance learning to refine annotations by identifying semantic center points, minimizing semantic variance.
Experimental results on COCO, DOTA, and SeaPerson datasets demonstrate CPR's effectiveness and improved performance over baselines.

Object Localization under Single Coarse Point Supervision: An Overview

The paper presents a method for Point-based Object Localization (POL) using coarse point annotations, aiming to mitigate the semantic variance associated with the inconsistency of annotated points. The proposed method, Coarse Point Refinement (CPR), leverages multiple instance learning (MIL) to identify semantic center points, thus facilitating high-performing object localization with minimal annotation burden.

Key Contributions

Coarse Point Annotation Strategy: The study introduces a relaxed annotation scheme by allowing any point on an object to serve as a coarse annotation. This approach reduces the complexity and burden of defining accurate key-point annotations, which are often challenging for diverse object categories.
Coarse Point Refinement Algorithm: CPR identifies semantic points near the annotated location through MIL and defines a weakly supervised process for refining these points to serve as effective training signals. It constructs point bags, selects semantically correlated points, and determines a semantic center, reducing semantic variance.
Experimental Validation: Extensive experiments on datasets such as COCO, DOTA, and the newly proposed SeaPerson validate the effectiveness of the CPR approach. The method not only achieves comparable results with center-point-based localization but also shows significant performance improvements over baseline methods.
SeaPerson Dataset: The introduction of the SeaPerson dataset, containing over 600,000 annotations for tiny person detection and localization, provides a significant contribution to the community, enabling further research in low-resolution object detection.
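The refinement step described under Coarse Point Refinement Algorithm (select semantically correlated points, then produce a semantic center) can be illustrated with a minimal sketch. It assumes that a trained model has already assigned each point in a bag a score in [0, 1], that points are kept when their score reaches a fraction of the best score in the bag, and that the semantic center is the score-weighted mean of the kept points. The function name `refine_point`, the parameter `rel_threshold`, and the selection rule itself are illustrative assumptions, not details taken from the paper.

```python
def refine_point(points, scores, rel_threshold=0.6):
    """Refine a coarse point into a semantic center (illustrative sketch).

    points: list of (x, y) candidates sampled around one annotated point.
    scores: per-point scores in [0, 1], assumed to come from a trained model.
    rel_threshold: hypothetical fraction of the best score a point must reach.
    """
    if not points or len(points) != len(scores):
        raise ValueError("points and scores must be non-empty and equal length")

    best = max(scores)
    # Keep only points whose score is close enough to the best score in the bag.
    kept = [(p, s) for p, s in zip(points, scores) if s >= rel_threshold * best]

    total = sum(s for _, s in kept)
    if total <= 0.0:
        # Degenerate case (all scores zero): fall back to the unweighted mean.
        n = len(kept)
        return (sum(p[0] for p, _ in kept) / n, sum(p[1] for p, _ in kept) / n)

    # Score-weighted mean of the kept points serves as the refined center.
    x = sum(p[0] * s for p, s in kept) / total
    y = sum(p[1] * s for p, s in kept) / total
    return (x, y)


if __name__ == "__main__":
    bag = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0)]
    bag_scores = [0.9, 0.9, 0.1]
    print(refine_point(bag, bag_scores))
```

In this example the low-scoring outlier at (10, 10) is discarded, so the refined point lies between the two high-scoring candidates rather than being pulled toward the background.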

Detailed Methodology

CPR addresses the semantic variance problem by replacing strict key-point annotations with freely placed points on the object. It uses a MIL formulation to find the points around each annotation that best represent the object, refining the initial coarse annotation into a semantic center.

Point Sampling: Points are sampled around each annotated location to form a point bag, which serves as the bag for MIL. Within each bag, the method looks for points that are semantically correlated with the annotated object.
MIL Loss and Additional Supervision: The refinement process incorporates three types of losses: MIL loss for semantic identification, annotation loss for direct supervision using annotated points, and negative loss to mitigate background noise by utilizing non-object areas during training.
Algorithm Efficiency: Through ablation studies, including comparisons across feature map levels, the paper shows that CPR is robust across settings and architectures, which supports its use for diverse POL tasks.
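The point sampling and the three loss terms listed above can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: points are sampled on concentric circles around the annotation, the bag score is a softmax-weighted average of per-point scores, and each loss term is a plain binary cross-entropy. The names `sample_point_bag`, `bag_score`, and `total_loss`, the default radius and point counts, and the loss weights `w_ann` and `w_neg` are all hypothetical.

```python
import math


def sample_point_bag(cx, cy, max_radius=3, points_per_unit_radius=8):
    """Return the annotated point plus points on concentric circles around it.

    The circle layout and default sizes are assumptions for illustration.
    """
    bag = [(float(cx), float(cy))]
    for r in range(1, max_radius + 1):
        n = points_per_unit_radius * r  # larger circles get more points
        for i in range(n):
            angle = 2.0 * math.pi * i / n
            bag.append((cx + r * math.cos(angle), cy + r * math.sin(angle)))
    return bag


def bag_score(instance_scores):
    """Aggregate per-point scores into one bag score with softmax weights."""
    m = max(instance_scores)
    exps = [math.exp(s - m) for s in instance_scores]
    total = sum(exps)
    return sum(s * e / total for s, e in zip(instance_scores, exps))


def binary_cross_entropy(p, target, eps=1e-7):
    """Binary cross-entropy for a single probability p and a 0/1 target."""
    p = min(max(p, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def total_loss(bag_scores, annotated_score, negative_scores, w_ann=1.0, w_neg=1.0):
    """Sum of the three terms: MIL loss, annotation loss, negative loss."""
    # MIL loss: the bag as a whole should be recognized as containing the object.
    mil = binary_cross_entropy(bag_score(bag_scores), 1.0)
    # Annotation loss: the annotated point itself is a known positive.
    ann = binary_cross_entropy(annotated_score, 1.0)
    # Negative loss: background points should score low.
    neg = sum(binary_cross_entropy(s, 0.0) for s in negative_scores)
    neg /= max(len(negative_scores), 1)
    return mil + w_ann * ann + w_neg * neg


if __name__ == "__main__":
    bag = sample_point_bag(10, 10)
    print(len(bag))
    print(total_loss([0.9, 0.8, 0.2], 0.9, [0.1, 0.05]))
```

With the default arguments, each bag holds the annotated point plus 8, 16, and 24 points on circles of radius 1, 2, and 3. In practice the per-point scores would come from a network evaluated at the sampled locations on a feature map; that part is omitted here.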

Implications and Future Work

The CPR approach reduces annotation cost for weakly supervised localization while keeping performance comparable to center-point supervision. It also opens the way to further work on multi-class and multi-scale datasets, where large intra-class variance and occlusion remain challenging. Future research may explore making the sampling radius adaptive to improve scalability, and extending CPR to other vision tasks where annotation efficiency is crucial.

In conclusion, the CPR method in POL, supported by extensive empirical evidence, presents a substantial advancement in object localization while balancing efficiency and performance. The insights from this paper will likely inspire further developments in weakly supervised learning paradigms in computer vision.
