Pointwise Convolutional Neural Networks

Published 14 Dec 2017 in cs.CV and cs.LG | (1712.05245v2)

Abstract: Deep learning with 3D data such as reconstructed point clouds and CAD models has received great research interest recently. However, the capability of using point clouds with convolutional neural networks has so far not been fully explored. In this paper, we present a convolutional neural network for semantic segmentation and object recognition with 3D point clouds. At the core of our network is pointwise convolution, a new convolution operator that can be applied at each point of a point cloud. Our fully convolutional network design, while being surprisingly simple to implement, can yield competitive accuracy in both semantic segmentation and object recognition tasks.


Citations (473)


Summary

The paper introduces a novel pointwise convolution operator that computes features for each point in 3D point clouds.
The authors design two fully convolutional network architectures tailored for semantic segmentation and object recognition.
Experimental results demonstrate competitive performance, with an 81.5% accuracy on the S3DIS dataset.

Pointwise Convolutional Neural Networks: An Overview

The paper "Pointwise Convolutional Neural Networks" by Binh-Son Hua, Minh-Khoi Tran, and Sai-Kit Yeung presents an approach to processing 3D point clouds for semantic segmentation and object recognition, built on a new pointwise convolution operator. The method addresses the challenge of applying convolutional neural networks (CNNs) to point cloud data, which is difficult because the points do not lie on a regular grid.

Core Contributions

At the heart of this research is the pointwise convolution operator, a mechanism that applies a convolution at each point of a point cloud. The technique is simple to implement and competitive in accuracy with established methods. The authors highlight two main contributions:

Pointwise Convolution Operator: This operator effectively outputs features for each point in a cloud, facilitating its integration into fully convolutional network architectures.
Network Architectures for 3D Tasks: Two distinctive pointwise convolutional neural networks (PCNNs) have been devised for semantic segmentation and object recognition of 3D data.

Technical Overview

The pointwise convolution operator works on the local neighborhood of each point. The neighborhood is divided into kernel cells, and the operator learns weights over those cells to produce a feature for every point. The proposed networks are fully convolutional and avoid pooling operations, which can reduce spatial precision; this matters most in semantic segmentation, where every point needs its own label.
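To make the idea of kernel cells concrete, here is a minimal brute-force sketch in NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function name pointwise_conv, the cubic kernel split into k x k x k cells, the choice to average the features that fall in each cell, and the one-weight-matrix-per-cell layout are all assumptions made for this example.

```python
import numpy as np

def pointwise_conv(points, feats, weights, radius):
    """Brute-force pointwise convolution sketch (hypothetical helper).

    points:  (N, 3) xyz coordinates
    feats:   (N, C_in) input feature per point
    weights: (k, k, k, C_in, C_out), one weight matrix per kernel cell
    radius:  half-width of the cubic kernel centred on each point
    """
    n = points.shape[0]
    k = weights.shape[0]
    c_out = weights.shape[-1]
    out = np.zeros((n, c_out))
    cell = 2.0 * radius / k  # edge length of one kernel cell

    for i in range(n):
        # Neighbours of point i that fall inside the cubic kernel.
        offset = points - points[i]
        inside = np.all(np.abs(offset) < radius, axis=1)

        # Integer cell index (0..k-1 per axis) of each neighbour.
        idx = np.floor((offset[inside] + radius) / cell).astype(int)
        idx = np.clip(idx, 0, k - 1)
        f = feats[inside]

        # Assumption: average the features in each occupied cell,
        # then apply that cell's weight matrix and sum over cells.
        for cx, cy, cz in np.unique(idx, axis=0):
            mask = np.all(idx == (cx, cy, cz), axis=1)
            out[i] += f[mask].mean(axis=0) @ weights[cx, cy, cz]
    return out

# Usage: 5 random points with 3 input channels, 4 output channels.
rng = np.random.default_rng(0)
pts = rng.random((5, 3))
x = rng.random((5, 3))
w = rng.standard_normal((3, 3, 3, 3, 4))
print(pointwise_conv(pts, x, w, radius=0.5).shape)  # (5, 4)
```

The output has one feature row per input point, so layers of this kind can be stacked without pooling and the per-point resolution is preserved. A practical version would replace the O(N^2) neighbor search with a spatial index.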

Experimental Results

The PCNNs were evaluated on datasets such as S3DIS, SceneNN, and ModelNet40, demonstrating competitive accuracy. For instance, the method achieved an 81.5% overall accuracy on the S3DIS dataset. Furthermore, it displayed robust performance on point clouds sorted by Morton curves, indicating the operator's adaptability to point ordering.
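For readers unfamiliar with Morton (Z-order) curves, the sketch below shows one common way to sort a point cloud along such a curve: quantize the coordinates to an integer grid, interleave the bits of the three indices, and sort by the resulting code. The helper names morton_code and morton_order, the 10-bit grid resolution, and the bounding-box normalization are assumptions for illustration; the summary does not say how the paper computes the ordering.

```python
import numpy as np

def morton_code(ix, iy, iz, bits=10):
    """Interleave the low `bits` bits of three grid indices (x lowest)."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (3 * b)
        code |= ((iy >> b) & 1) << (3 * b + 1)
        code |= ((iz >> b) & 1) << (3 * b + 2)
    return code

def morton_order(points, bits=10):
    """Return indices that sort `points` (N, 3) along a Morton curve."""
    points = np.asarray(points, dtype=float)
    lo = points.min(axis=0)
    # Guard against a zero-width bounding box on any axis.
    span = np.maximum(points.max(axis=0) - lo, 1e-12)
    # Quantize each coordinate to an integer in [0, 2**bits - 1].
    grid = ((points - lo) / span * (2**bits - 1)).astype(int)
    codes = [morton_code(int(x), int(y), int(z), bits) for x, y, z in grid]
    return np.argsort(codes, kind="stable")

# Usage: reorder a cloud so that the same set of points always
# arrives in the same order, however it was stored.
cloud = np.random.default_rng(1).random((8, 3))
sorted_cloud = cloud[morton_order(cloud)]
```

Because the code depends only on coordinates, shuffling the input rows and sorting again yields the same sequence of points (ties aside), which is what makes the ordering useful as a canonical input order.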

Implications and Future Directions

The introduction of pointwise convolution enables better handling of 3D data within deep learning frameworks, allowing for more efficient scene segmentation and object recognition. The inherent simplicity of the method renders it suitable for integration into existing architectures, potentially extending its application to other domains such as RGB-D reconstruction and CAD modeling.

Future work may explore scaling PCNNs to larger point clouds and optimizing the pointwise convolution operation for network training. The method could also be extended to other areas of geometric deep learning, such as upsampling and tensor voting.

Overall, this paper presents a significant advancement in the processing of 3D point clouds, contributing meaningful insights into the ongoing development of deep learning methodologies in the context of three-dimensional data.
