Occluded-traffic-sign detection algorithm based on improved RT-DETR

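  • Summary: Traffic sign detection suffers from small target sizes and low detection accuracy, and traditional detection algorithms often fail to recognize traffic signs accurately, especially when the signs are captured at long range or are heavily occluded. This paper proposes a traffic sign detection algorithm based on an improved RT-DETR. First, because datasets of occluded traffic signs are scarce, we build our own dataset of traffic signs under occlusion. Next, we introduce a dilated reparameterization block into the inverted residual mobile block to construct a lightweight composite dilated residual block, which replaces the BasicBlock in the original backbone feature-extraction network and strengthens the model's feature extraction. Finally, we optimize the RT-DETR loss function and propose the DS-IoU joint loss to accelerate model convergence. Experimental results show that the improved algorithm reaches an mAP of 94.2% on the self-built dataset, 4.7% higher than the original algorithm, and mAPs of 92.8% and 91.7% on the public TT100K and CCTSDB2021 datasets, 3.1% and 2.4% higher than the original algorithm, respectively. Params and GFLOPs are 26.0% and 12.5% lower than those of the original algorithm. The proposed improvements reduce both the computational cost and the parameter count while effectively improving detection accuracy for occluded traffic signs.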

     

    Abstract: Accurate traffic-sign detection is a foundational capability for intelligent transportation systems and autonomous driving, yet it remains difficult in real-world environments characterized by small object scales, severe occlusion, highly variable lighting, and complex backgrounds. Traditional convolutional neural network (CNN)-based detectors often struggle to maintain reliable performance when traffic signs appear at long distances or are partially hidden by vehicles, foliage, or roadside infrastructure, owing to inherent limitations in feature extraction, scale sensitivity, and model robustness. To overcome these limitations, this paper presents an enhanced RT-DETR-based approach tailored to occluded-traffic-sign detection under resource-constrained conditions. First, recognizing the scarcity of publicly available data that accurately reflect occlusion scenarios, we curated the traffic sign dataset under occlusion conditions (TSDOC), which comprises 4698 high-resolution images annotated across eight common traffic sign categories (including prohibitory, warning, and indicative signs), with 3572 images allocated for training and 1126 for testing. TSDOC systematically simulates real driving environments by incorporating diverse occlusion types, such as partial masking by other vehicles, foreign object attachment, dynamic shadows, and varying degrees of weather-induced visibility reduction. This enables a rigorous evaluation of detection methods under complex, safety-critical scenarios that closely mirror roadside conditions. Second, to improve the representation of small and occluded objects without incurring excessive computational overhead, we redesigned the RT-DETR backbone by replacing the standard ResNet-18 BasicBlock with a novel composite dilated residual block (CDRB). Each CDRB integrates a dilated reparameterization block (DRB) into an inverted residual mobile block (iRMB). It thereby combines multi-scale dilated convolutions, which capture the long-range pixel dependencies needed to reconstruct partially visible sign features, with structural reparameterization, which streamlines the inference graph to reduce latency. Consequently, the modified backbone achieves a 26.0% reduction in parameter count and a 12.5% decrease in floating-point operations (GFLOPs) compared to the baseline RT-DETR-R18, while maintaining or improving feature discrimination for occluded targets. Third, for faster convergence and more precise localization, particularly for small and partially occluded signs, we introduce the dynamic scaled IoU loss (DS-IoU), a novel joint loss function that integrates Inner-IoU's auxiliary bounding box strategy with a dynamically adjustable scaling factor, Ratio, and incorporates the minimal point distance metric from MPDIoU. This adaptive loss formulation emphasizes interior region overlap and geometric consistency during training; it replaces the conventional GIoU loss and lets the model focus on the most informative spatial regions under challenging conditions. Comprehensive experiments demonstrate the effectiveness of the proposed approach. On the TSDOC, TT100K, and CCTSDB2021 benchmarks, the proposed model achieved a mean average precision (mAP) of 94.2%, 92.8%, and 91.7%, respectively (gains of 4.7%, 3.1%, and 2.4% over RT-DETR). The real-time inference speed reached 112.8 frames per second (s⁻¹), an 18.5% improvement over RT-DETR. Ablation studies show that replacing the backbone with CDRB yields a 2.8% mAP increase, while DS-IoU further boosts recall under occlusion by 3.7%. This lightweight architecture and optimized loss function deliver higher detection accuracy and efficiency in occluded-traffic-sign scenarios, making the model well suited for deployment in resource-constrained embedded systems.
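The abstract does not give the DS-IoU formula, so the following is only a rough sketch of the two ingredients it names, under stated assumptions. It assumes the loss takes the form 1 − IoU_inner + d1²/(w² + h²) + d2²/(w² + h²), where IoU_inner is computed on Inner-IoU auxiliary boxes (both boxes scaled about their centres by a factor `ratio`), d1 and d2 are the MPDIoU distances between the top-left and bottom-right corners, and w, h are the image dimensions. The function names are hypothetical, and how Ratio is adjusted dynamically during training is not specified here, so `ratio` is left as a plain argument.

```python
def scaled_box(box, ratio):
    """Scale an (x1, y1, x2, y2) box about its centre by `ratio`
    (the Inner-IoU auxiliary box)."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * ratio / 2.0
    half_h = (y2 - y1) * ratio / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def iou(a, b):
    """Plain intersection-over-union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def inner_mpd_iou_loss(pred, target, img_w, img_h, ratio=1.0):
    """Assumed combination of Inner-IoU and MPDIoU (not the paper's exact DS-IoU).

    The IoU term uses auxiliary boxes scaled by `ratio`; the penalty terms are
    the squared corner distances normalised by the squared image diagonal.
    """
    inner = iou(scaled_box(pred, ratio), scaled_box(target, ratio))
    diag_sq = img_w ** 2 + img_h ** 2
    d1_sq = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2  # top-left corners
    d2_sq = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2  # bottom-right corners
    return 1.0 - inner + d1_sq / diag_sq + d2_sq / diag_sq


if __name__ == "__main__":
    pred, target = (10, 10, 30, 30), (20, 10, 40, 30)
    for r in (0.5, 1.0, 1.5):
        print(r, inner_mpd_iou_loss(pred, target, 100, 100, ratio=r))
```

In this sketch, a ratio below 1 shrinks the auxiliary boxes, so a partially overlapping prediction is penalized more heavily, while a ratio above 1 enlarges them and softens the penalty. That is the lever a dynamically adjusted Ratio would act on.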

     
