Deep neural networks can efficiently process 3D point clouds. At each point convolution layer, local features are learned from local neighborhoods of the point cloud. These features are then combined for further processing to extract the semantic information encoded in the point cloud. Previous networks adopt the same local neighborhoods at all layers, because they apply the same metric to fixed input point coordinates to define neighborhoods. This is easy to implement but not necessarily optimal. Ideally, local neighborhoods should differ across layers so as to adapt to layer dynamics for efficient feature learning. One way to achieve this is to learn transformations of the input point cloud at each layer and extract features from local neighborhoods defined on the transformed coordinates. We propose a novel approach that learns different transformations of the input point cloud for different neighborhoods at each layer, with both linear and non-linear spatial transformers for point clouds. The proposed methods outperform state-of-the-art methods on several point cloud processing tasks (classification, segmentation and detection). Visualizations show that the transformers can learn features more efficiently by dynamically altering neighborhoods according to the geometric and semantic information of 3D shapes, regardless of intra-class variations.
Description
Spatial Transformer for 3D Point Clouds | IEEE Journals & Magazine | IEEE Xplore
%0 Journal Article
%1 9393615
%A Wang, Jiayun
%A Chakraborty, Rudrasis
%A Yu, Stella X.
%D 2021
%J IEEE Transactions on Pattern Analysis and Machine Intelligence
%K 2021 3D journal point-cloud tpami transformer
%P 1-1
%R 10.1109/TPAMI.2021.3070341
%T Spatial Transformer for 3D Point Clouds
%U https://ieeexplore.ieee.org/document/9393615
%X Deep neural networks can efficiently process 3D point clouds. At each point convolution layer, local features are learned from local neighborhoods of the point cloud. These features are then combined for further processing to extract the semantic information encoded in the point cloud. Previous networks adopt the same local neighborhoods at all layers, because they apply the same metric to fixed input point coordinates to define neighborhoods. This is easy to implement but not necessarily optimal. Ideally, local neighborhoods should differ across layers so as to adapt to layer dynamics for efficient feature learning. One way to achieve this is to learn transformations of the input point cloud at each layer and extract features from local neighborhoods defined on the transformed coordinates. We propose a novel approach that learns different transformations of the input point cloud for different neighborhoods at each layer, with both linear and non-linear spatial transformers for point clouds. The proposed methods outperform state-of-the-art methods on several point cloud processing tasks (classification, segmentation and detection). Visualizations show that the transformers can learn features more efficiently by dynamically altering neighborhoods according to the geometric and semantic information of 3D shapes, regardless of intra-class variations.
@article{9393615,
abstract = {Deep neural networks can efficiently process 3D point clouds. At each point convolution layer, local features are learned from local neighborhoods of the point cloud. These features are then combined for further processing to extract the semantic information encoded in the point cloud. Previous networks adopt the same local neighborhoods at all layers, because they apply the same metric to fixed input point coordinates to define neighborhoods. This is easy to implement but not necessarily optimal. Ideally, local neighborhoods should differ across layers so as to adapt to layer dynamics for efficient feature learning. One way to achieve this is to learn transformations of the input point cloud at each layer and extract features from local neighborhoods defined on the transformed coordinates. We propose a novel approach that learns different transformations of the input point cloud for different neighborhoods at each layer, with both linear and non-linear spatial transformers for point clouds. The proposed methods outperform state-of-the-art methods on several point cloud processing tasks (classification, segmentation and detection). Visualizations show that the transformers can learn features more efficiently by dynamically altering neighborhoods according to the geometric and semantic information of 3D shapes, regardless of intra-class variations.},
added-at = {2021-06-02T09:20:37.000+0200},
author = {Wang, Jiayun and Chakraborty, Rudrasis and Yu, Stella X.},
biburl = {https://www.bibsonomy.org/bibtex/26aaae0d001dc5f0254ae96f01d61d4bb/analyst},
description = {Spatial Transformer for 3D Point Clouds | IEEE Journals \& Magazine | IEEE Xplore},
doi = {10.1109/TPAMI.2021.3070341},
interhash = {bf824fd1f747e6f2ce2f65304b8925e7},
intrahash = {6aaae0d001dc5f0254ae96f01d61d4bb},
issn = {1939-3539},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
keywords = {2021 3D journal point-cloud tpami transformer},
pages = {1-1},
timestamp = {2021-06-02T09:20:37.000+0200},
title = {Spatial Transformer for 3D Point Clouds},
url = {https://ieeexplore.ieee.org/document/9393615},
year = 2021
}