3D point cloud semantic segmentation: state of the art and challenges

王艺娴; 胡雨凡; 孔庆群; 曾慧; 张利欣; 樊彬

doi:10.13374/j.issn2095-9389.2022.12.17.004

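Summary: As the cost of acquiring point cloud data falls and GPU computing power rises, 3D vision applications such as autonomous driving, industrial control, and MR/XR increasingly demand 3D semantic segmentation, which in turn drives the development of deep learning models for 3D point cloud semantic segmentation. Recent deep learning models, such as RandLA-Net and Point Transformer, continue to innovate in network architecture and have raised segmentation accuracy at lower computational cost. However, existing surveys of 3D point cloud semantic segmentation cover many early or since-abandoned methods, do not systematically organize these newer, more efficient methods, and therefore do not reflect the current state of research well. Moreover, these surveys classify point cloud semantic segmentation methods by the type of data fed to the network, which neither conveys how the methods evolved from one another nor makes it easy to compare their segmentation performance. To address these problems, this paper focuses on research from the past three years and the latest advances, summarizes methods based on different network architectures, the challenges they face, and potential research directions, and presents a systematic review of 3D point cloud semantic segmentation at three levels. From this paper, readers can gain a systematic understanding of the data acquisition approaches, common datasets, and model evaluation metrics for 3D point cloud semantic segmentation; compare the development, segmentation performance, strengths, and weaknesses of methods based on different network architectures; and better understand the remaining challenges and potential research directions.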

Abstract: The falling cost of acquiring 3D point cloud data, coupled with rapid advances in GPU computing power, has increased the demand for 3D point cloud semantic segmentation in numerous 3D vision applications, including autonomous driving, industrial control, and MR/XR, which in turn drives the development of deep learning methods for the task. Recently, many novel deep learning network architectures, such as RandLA-Net and Point Transformer, have been proposed and have achieved notable improvements in semantic segmentation accuracy while reducing the computational load. However, previous reviews of 3D point cloud semantic segmentation have focused primarily on relatively early works, many of whose approaches have been gradually abandoned over the years, and therefore cannot accurately reflect the current state of research. Moreover, these reviews categorize methods by input data type, which makes it difficult to compare the segmentation performance of different techniques and does not give a comprehensive view of the relationships between methods built on different network architectures. Therefore, this paper reviews the mainstream 3D semantic segmentation methods developed in the last three years using different deep learning network architectures and is organized into three levels. First, the two principal 3D point cloud data acquisition methods are introduced, along with commonly used datasets and the metrics used to evaluate model performance. Second, 3D semantic segmentation methods based on different network architectures are systematically reviewed, followed by a statistical comparison of model performance on two commonly used 3D segmentation datasets, S3DIS and ScanNet; the analysis covers the relevance of model structure as well as the strengths and limitations of each model. Finally, the remaining methodological and application challenges and potential research directions are discussed. Overall, this paper offers an extensive overview of the last three years of research progress in 3D point cloud semantic segmentation: it summarizes various network architecture pipelines, elucidates their fundamental operations, compares model performance across multiple architectures, discusses their notable strengths and limitations, and, most importantly, summarizes the current challenges and promising research directions for future investigation. Furthermore, the analyses presented help researchers identify relevant work and research hotspots among different 3D point cloud semantic segmentation methods. The paper aims to update existing reviews of the field, highlighting the key properties and contributions of the methods covered and pointing to promising research directions for the main challenges.
