A detector based on multi-scale fusion pyramid focus network for catenary support components

  • Abstract: Detecting catenary support components (CSCs) on high-speed railways is an important part of train safety maintenance, but it is complicated by the large number of target categories, the wide variation in target scale, and the small size of some components. Conventional deep-learning object detectors tend to fuse features insufficiently under these conditions. This paper proposes a detector based on a multi-scale fusion pyramid focus network that detects all CSC components across scales. First, the proposed separable residual pyramid aggregation module (SRPAM) and deformable convolution (DCN) are introduced into the backbone network to improve its multi-scale feature extraction and its adaptability to targets of different scales. Specifically, dense connections, deformable convolution modules, residual structures, and attention mechanisms are incorporated into the atrous spatial pyramid pooling (ASPP) module so that it enriches feature information and enlarges the receptive field without significantly increasing computation. Second, a cross-layer feature balancing module is introduced into the path aggregation feature pyramid network (PA-FPN) to improve cross-layer feature fusion, suppress background information, and improve the detection of small-scale targets. Finally, deformable convolution is introduced into the FCOS head to further improve overall detection performance. Comparative experiments show that the proposed fusion pyramid focus network outperforms most of the compared methods. On the CSCs dataset, the proposed framework achieves a detection accuracy (mAP) of 48.6% while maintaining low computational complexity (FLOPs = 38.6). The proposed method can therefore be applied effectively to CSC detection.
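
The claim that dense connections in an ASPP-style module expand the receptive field can be illustrated with a short calculation. The sketch below is not the authors' SRPAM; it only compares the one-axis receptive field of parallel atrous branches with that of densely connected ones. The dilation rates (3, 6, 12), the 3×3 kernel, and the helper names `dilated_rf` and `stacked_rf` are assumptions chosen for illustration, since the abstract does not state the actual configuration.

```python
def dilated_rf(kernel_size: int, dilation: int) -> int:
    """Receptive field (one axis, in pixels) of a single dilated convolution."""
    return (kernel_size - 1) * dilation + 1


def stacked_rf(rfs) -> int:
    """Receptive field of stride-1 layers applied one after another."""
    total = 1
    for rf in rfs:
        total += rf - 1
    return total


# Hypothetical settings; the abstract does not give the rates actually used.
RATES = (3, 6, 12)
KERNEL = 3

# Plain ASPP: branches run in parallel on the same input,
# so the widest single branch sets the receptive field.
parallel_rf = max(dilated_rf(KERNEL, d) for d in RATES)

# Dense connections: each branch also receives the outputs of earlier
# branches, so the longest path chains every branch in sequence.
dense_rf = stacked_rf([dilated_rf(KERNEL, d) for d in RATES])

print(parallel_rf, dense_rf)  # prints "25 43"
```

Under these assumed rates the densely connected variant covers 43 pixels per axis against 25 for the parallel one, with the same convolutions. This matches the abstract's argument that the receptive field can grow without a significant increase in computation.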

     
