Abstract
We introduce the new task of 3D object localization in RGB-D scans using
natural language descriptions. As input, we assume a point cloud of a scanned
3D scene along with a free-form description of a specified target object. To
address this task, we propose ScanRefer, where the core idea is to learn a
fused descriptor from 3D object proposals and encoded sentence embeddings. This
learned descriptor then correlates the language expressions with the underlying
geometric features of the 3D scan and facilitates the regression of the 3D
bounding box of the target object. To train and benchmark our method, we
introduce the ScanRefer dataset, containing 46,173 descriptions of 9,943
objects from 703 ScanNet scenes. ScanRefer is the first large-scale effort to
perform object localization via natural language expression directly in 3D.
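The core fusion-and-regression idea can be sketched in a few lines of NumPy. The dimensions, the plain concatenation, and the single linear box head below are illustrative assumptions, not the paper's actual architecture, which learns the fused descriptor end to end:

```python
import numpy as np

def fuse_and_regress(proposal_feats, sent_emb, W, b):
    """Fuse per-proposal 3D features with a sentence embedding and
    regress one 3D box (center xyz + size xyz) per proposal.

    proposal_feats: (P, Dp) features for P object proposals (assumed shape)
    sent_emb:       (Ds,)   encoded free-form description (assumed shape)
    W, b:           hypothetical linear head, W: (Dp + Ds, 6), b: (6,)
    """
    P = proposal_feats.shape[0]
    # Broadcast the sentence embedding to every proposal, then concatenate
    # to form the fused per-proposal descriptor.
    fused = np.concatenate(
        [proposal_feats, np.tile(sent_emb, (P, 1))], axis=1)  # (P, Dp + Ds)
    # A linear head stands in for the learned box-regression branch.
    return fused @ W + b  # (P, 6) candidate boxes

rng = np.random.default_rng(0)
P, Dp, Ds = 4, 128, 256
boxes = fuse_and_regress(
    rng.standard_normal((P, Dp)),
    rng.standard_normal(Ds),
    rng.standard_normal((Dp + Ds, 6)),
    np.zeros(6))
print(boxes.shape)  # (4, 6)
```

At inference time, the proposal whose fused descriptor best matches the description would be selected and its regressed box returned as the localization result.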