An Integrated Deep Learning Framework for Road Distress Detection, Segmentation, and Quantitative Evaluation

Meng Xu; Wei Gao

doi:10.54097/hm8h2y49

Keywords:

Road distress detection, Deep learning, YOLOv7, Image segmentation, Quantitative evaluation

Abstract

Road distress inspection plays a critical role in pavement-condition assessment and maintenance planning. However, existing studies often address detection, segmentation, or measurement separately, which limits their practical applicability. This paper proposes an integrated deep learning–based framework for road distress detection, pixel-level segmentation, and quantitative evaluation. First, an improved YOLOv7 detector is developed by introducing squeeze-and-excitation (SE) attention, CARAFE-based content-aware upsampling, and a Dynamic Head to enhance multi-scale feature representation and robustness under complex road backgrounds. Second, a multi-scale encoder–decoder network termed MIResU-Net is designed to extract crack and pothole regions with improved structural continuity and boundary precision. Finally, a calibration-based measurement strategy converts the segmentation results into physically meaningful geometric parameters, such as crack length and pothole area. A real-world road-distress dataset, collected with a vehicle-mounted system, is constructed for evaluation. Experimental results show that the proposed framework outperforms mainstream methods in both detection and segmentation and provides metrically reliable quantitative indicators for pavement-condition assessment. The proposed approach offers an effective and practical solution for intelligent road inspection.
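The abstract does not describe the calibration procedure in detail, so the following is only a minimal sketch of how a segmentation mask might be converted into physical quantities. It assumes a single uniform scale factor in millimetres per pixel, obtained from a reference object of known length (which in turn assumes a roughly planar road viewed without perspective distortion), and it assumes the crack mask has already been thinned to a one-pixel-wide skeleton. All function names (scale_from_reference, pothole_area_mm2, crack_length_mm) are hypothetical and are not taken from the paper.

```python
import numpy as np


def scale_from_reference(known_length_mm, measured_length_px):
    """Assumed calibration: uniform mm-per-pixel scale from one reference object."""
    return known_length_mm / measured_length_px


def pothole_area_mm2(mask, mm_per_px):
    """Area of a binary region: pixel count times the area of one pixel."""
    return float(np.count_nonzero(mask)) * mm_per_px ** 2


def crack_length_mm(skeleton, mm_per_px):
    """Approximate length of a one-pixel-wide skeleton.

    Each link between horizontally or vertically adjacent skeleton pixels
    counts as 1 pixel, each diagonal link as sqrt(2) pixels. This simple
    count can overestimate the length at corners and junctions.
    """
    s = np.asarray(skeleton, dtype=bool)
    horizontal = np.count_nonzero(s[:, :-1] & s[:, 1:])
    vertical = np.count_nonzero(s[:-1, :] & s[1:, :])
    diagonal = (np.count_nonzero(s[:-1, :-1] & s[1:, 1:])
                + np.count_nonzero(s[:-1, 1:] & s[1:, :-1]))
    return float(horizontal + vertical + np.sqrt(2.0) * diagonal) * mm_per_px


# Illustrative values only: a 100 mm reference marker spanning 50 pixels.
demo_scale = scale_from_reference(100.0, 50.0)

# Synthetic pothole mask: a 3 x 4 block of foreground pixels.
demo_pothole = np.zeros((6, 6), dtype=bool)
demo_pothole[1:4, 1:5] = True
demo_area = pothole_area_mm2(demo_pothole, demo_scale)

# Synthetic crack skeleton: a straight horizontal run of 5 pixels.
demo_crack = np.zeros((5, 7), dtype=bool)
demo_crack[2, 1:6] = True
demo_length = crack_length_mm(demo_crack, demo_scale)
```

A vehicle-mounted camera usually views the road obliquely, so a practical pipeline would likely rectify the image (for example with a homography) before applying a single scale factor; that step is omitted here.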
