Lightweight bridge crack image detection algorithm based on improved YOLOv11n
SUN Wei1, LIU Wenjiang2*
1. School of Rail Transportation, Shandong Jiaotong University, Jinan 250357, China;
2. School of Aeronautics, Shandong Jiaotong University, Jinan 250357, China
Abstract: Current bridge crack image detection methods suffer from low accuracy, and their models are too large to deploy conveniently on resource-constrained edge devices. To address these problems, a lightweight bridge crack image detection algorithm based on an improved YOLOv11n (you only look once version 11 nano) is proposed. The ShuffleNetV2 module is combined with the cross-scale fusion module (CCFM) to build a Shuffle-CCFM structure, which strengthens multi-scale feature fusion while reducing the number of parameters. An inverted residual mobile block with attention (iRMB) is introduced into the cross-channel partial spatial attention (C2PSA) module to form the C2PSA-iRMB module, which improves recognition of complex crack details and strengthens the modeling of long-range spatial dependencies within a single crack structure. Wavelet transform convolution (WTConv) is integrated into the C3k2 module to form the C3k2-WTConv module, which improves feature extraction at different scales. The dynamic upsampler DySample replaces the conventional upsampling module and adapts its sampling positions to the feature map content, improving spatial resolution and detail recovery during upsampling. Ablation experiments, comparative experiments, and visualized detection experiments are conducted to verify the detection performance of the improved YOLOv11n algorithm. The results show that, compared with YOLOv11n, the improved YOLOv11n with the Shuffle-CCFM structure, C2PSA-iRMB module, C3k2-WTConv module, and DySample module reduces the number of parameters NP, the computational cost Nf, and the weight file size T by 27.5%, 23.8%, and 32.7%, respectively, while the mean average precision at an intersection over union (IoU) threshold of 50% (EmAP50), the mean average precision over IoU thresholds from 50% to 95% (EmAP50-95), and the recall R increase by 1.6%, 3.8%, and 0.4%, respectively, so the algorithm becomes both lighter and more accurate. On bridge crack images, the improved YOLOv11n clearly outperforms lightweight algorithms such as YOLOv5n, YOLOv6n, YOLOv8n, and YOLOv10n in detection accuracy and the other performance indicators, which makes it suitable for deployment on edge devices with limited computing resources. In the visualized detection experiments, the improved YOLOv11n assigns higher confidence scores to its detections, captures the details of small and morphologically complex cracks more reliably, and is more robust to interference from complex backgrounds.
Keywords: bridge crack image detection; YOLOv11n; ShuffleNetV2; CCFM; iRMB; WTConv; DySample
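
The two accuracy metrics reported in the abstract, EmAP50 and EmAP50-95, differ only in how strictly a predicted box must overlap the ground-truth box before it counts as a correct detection. The following is a minimal sketch of that overlap test, not the authors' evaluation code. It assumes the common COCO-style convention that the 50-to-95 range is sampled in steps of 0.05 (the abstract does not state the step), and the function names and box coordinates are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed thresholds 0.50, 0.55, ..., 0.95 (COCO-style step of 0.05).
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_matched(pred_box, true_box):
    """Return the IoU thresholds at which pred_box would count as a true positive."""
    overlap = iou(pred_box, true_box)
    return [t for t in IOU_THRESHOLDS if overlap >= t]


# Hypothetical boxes for a long, thin crack; the prediction is shifted by 2 pixels.
true_box = (10, 10, 110, 20)
pred_box = (10, 12, 110, 22)

if __name__ == "__main__":
    print(f"IoU = {iou(pred_box, true_box):.3f}")
    print("matched at thresholds:", thresholds_matched(pred_box, true_box))
```

For these hypothetical boxes the intersection is 800 and the union is 1200, so the IoU is about 0.67. The prediction is accepted at thresholds 0.50 to 0.65 and rejected from 0.70 upward. A small localization error on a thin crack therefore barely affects EmAP50 but lowers EmAP50-95, which is why the two metrics can improve by different amounts.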
