Graph-feature-enhanced point cloud sampling for object detection

  • Abstract: In LiDAR-acquired point clouds, foreground object points account for only a small fraction of the data, and traditional unsupervised sampling methods struggle to selectively retain enough of them; part of the object information is therefore lost, degrading the performance of point-cloud-based object-detection networks. This paper proposes a graph-feature-enhanced parallel point cloud sampling method that uses foreground/background classification labels for supervision, significantly raising the proportion of foreground points among the sampled points. Compared with methods supervised directly on point features, the proposed graph-feature-based method better captures the local geometric information of the point cloud and is well suited to the shallow sampling stages of detection networks. Experimental results on the KITTI and nuScenes autonomous driving datasets show that the proportion of foreground points sampled by our method reaches up to 99%, and that the method effectively extracts feature information from sparse regions of the point cloud, such as occluded and distant objects, thereby improving detection performance. With the proposed method, the average detection precision for cars, pedestrians, and cyclists under the hard setting improves by 8.58%, 2.27%, and 3.12%, respectively. Moreover, the method is flexible by design and can be easily integrated into various 3D point cloud tasks that rely on point cloud sampling.


    Abstract: Light detection and ranging (LiDAR)-acquired point-cloud data are extensive and characterized by their non-uniform density, with points typically denser near the sensor and sparser at greater distances. Efficient sampling of point clouds is crucial for reducing computational complexity while preserving critical environmental information. However, classical sampling methods, such as farthest point sampling (FPS) and random sampling, fail to adequately address the challenges posed by the imbalanced distribution of foreground and background points. Oversampling of background points or insufficient coverage of foreground regions can result in the loss of essential target information, particularly for small or distant objects, thus ultimately degrading the performance of three-dimensional (3D) object-detection networks. Although FPS has been widely adopted in many point-based detection frameworks, its sequential nature limits its efficiency and effectiveness in complex scenarios. Hence, we propose a novel graph feature augmentation sampling (GFAS) method, which leverages graph convolutional networks and supervised learning to enhance sampling efficiency and detection performance. The proposed method introduces a graph-feature-generation module that aggregates local and global features of point clouds using multilayer graph convolutions, thus enabling the extraction of rich geometric and spatial information. Additionally, it incorporates a parallel sampling mechanism that selects foreground points based on their feature scores, thereby significantly improving sampling efficiency. By utilizing foreground–background classification labels as supervision signals, GFAS ensures a higher proportion of foreground points in the sampling process, which is particularly beneficial for detecting small or distant objects. Extensive experiments are conducted on two large-scale autonomous driving datasets, i.e., KITTI and nuScenes, to validate the effectiveness of GFAS. On the KITTI dataset, GFAS achieves significant improvements in terms of average precision for car detection, with gains of 6.2%, 6.89%, and 8.58% under easy, moderate, and hard levels, respectively. Similar improvements are observed for pedestrian and cyclist detection, thus demonstrating the robustness of the proposed method across different object categories. On the nuScenes dataset, the proposed method significantly improves car- and pedestrian-detection performance, with precision gains of 4.2% and 8.3%, respectively, compared with the baseline model. These results highlight the strong generalizability of GFAS in large-scale and complex driving scenarios. Ablation studies reveal that GFAS significantly increases the proportion of foreground points in the sampling process, with the ratio approaching 99% in the final layers. Visualization results show that GFAS effectively concentrates sampling points on foreground objects, thus avoiding the uniform-distribution issue of classical FPS methods. Additional experiments on other 3D object-detection models, such as the 3D single-stage object detector (3DSSD) and PointVoxel-RCNN (PV-RCNN), further validate the flexibility and scalability of the proposed method. In conclusion, this paper proposes an efficient and parallel point-cloud-sampling method. By integrating graph-feature extraction and supervised learning, GFAS not only improves sampling efficiency but also enhances detection performance, particularly in challenging scenarios. The proposed method can be easily integrated into existing point-cloud-based detection frameworks. Its ability to retain a high proportion of foreground points while maintaining computational efficiency highlights its practicality.
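The pipeline the abstract describes — graph-based feature aggregation followed by parallel, score-driven point selection, contrasted with sequential FPS — can be sketched in miniature as follows. This is an illustrative NumPy sketch under simplifying assumptions (a single EdgeConv-style layer stands in for the paper's multilayer graph-convolution module, and plain top-k selection stands in for the learned scoring network); it is not the authors' implementation, and all function names are hypothetical.

```python
import numpy as np

def knn_graph_feature(points, feats, k=4):
    """One graph-convolution-like layer: for each point, max-pool the
    feature differences to its k nearest neighbours and concatenate the
    result with the point's own feature (EdgeConv-style aggregation)."""
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    nn = np.argsort(d, axis=1)[:, 1:k + 1]        # k nearest, skipping self
    edge = feats[nn] - feats[:, None, :]          # (N, k, C) neighbour diffs
    return np.concatenate([feats, edge.max(axis=1)], axis=-1)

def farthest_point_sampling(points, m):
    """Classical FPS: inherently sequential, since each pick depends on
    the distance field left by all previous picks."""
    n = points.shape[0]
    chosen = [0]
    dist = np.full(n, np.inf)
    for _ in range(m - 1):
        d = np.linalg.norm(points - points[chosen[-1]], axis=1)
        dist = np.minimum(dist, d)
        chosen.append(int(np.argmax(dist)))
    return np.array(chosen)

def score_based_sampling(scores, m):
    """Parallel alternative: keep the m points with the highest
    (supervised) foreground scores in a single top-k step."""
    return np.argsort(scores)[::-1][:m]
```

With supervised foreground scores, the top-k step concentrates the sampled set on foreground points in one shot, which is where the parallelism and the high foreground ratio reported above come from.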
