YOLOv7-BW: An Efficient Detector for Dense Small Objects Based on Remote Sensing Images

Authors

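  • Xudong Ge (葛旭东), School of Computer and Artificial Intelligence, Beijing Technology and Business University, Beijing 100048
  • Xuebo Jin (金学波), School of Computer and Artificial Intelligence, Beijing Technology and Business University, Beijing 100048
  • Huijun Ma (马慧鋆), School of Computer and Artificial Intelligence, Beijing Technology and Business University, Beijing 100048
  • Tianchang Zou (邹天畅), Food Science, University College Cork, Cork, Ireland, T12 HY8E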

DOI:

https://doi.org/10.52810/JIR.2024.004

Keywords:

Remote sensing images, YOLO, object detection, mAP

Abstract

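In recent years, deep learning has been applied ever more widely to object detection in remote sensing images. Objects in such images, however, vary greatly in size and are densely distributed, which places high demands on detection algorithms. Current detection methods are generally inefficient and prone to missed detections and inaccurate bounding boxes. To address this, this paper proposes YOLOv7-BW, an improved algorithm based on YOLOv7 that detects objects in remote sensing images efficiently and promotes the application and development of object detection in the remote sensing industry. YOLOv7-BW adds a Bi-level Routing Attention module to the original SPPCSPC spatial pyramid pooling network so that the network concentrates on regions where objects cluster, which improves its feature extraction ability. It also replaces the original CIoU loss with the dynamic, non-monotonic WIoUv3, so that at every moment the loss function applies the gradient gain allocation strategy best suited to the current state, which improves the focus on detection targets. Comparative experiments on the DIOR remote sensing dataset show that YOLOv7-BW reaches an mAP@0.5 of 85.63% and an mAP@0.5:0.95 of 65.93%, which are 1.93 and 2.03 percentage points higher than the 83.7% and 63.9% obtained with the original YOLOv7 source code. YOLOv7-BW also outperforms the commonly used algorithms it was compared against, which indicates that the proposed algorithm is feasible and better suited to remote sensing image detection.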
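The abstract describes WIoUv3 as a dynamic, non-monotonic replacement for the CIoU loss. The following is a minimal sketch of that idea for a single pair of boxes, assuming the formulation commonly given for Wise-IoU v3: an IoU loss scaled by a centre-distance term and by a focusing gain r = β / (δ · α^(β − δ)), where β is the ratio of the current IoU loss to its running mean. The function names, the (x1, y1, x2, y2) box format, and the defaults α = 1.9 and δ = 3 are assumptions made for illustration; this is not the authors' implementation, and it omits the gradient detachment and running-mean update that a training framework would need.

```python
import math


def iou_xyxy(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def focusing_gain(beta, alpha=1.9, delta=3.0):
    """Non-monotonic gain: small for very easy and very hard boxes.

    alpha and delta are assumed hyperparameter values, not taken from
    the abstract. The gain equals 1 when beta == delta.
    """
    return beta / (delta * alpha ** (beta - delta))


def wiou_v3_loss(pred, gt, mean_iou_loss, alpha=1.9, delta=3.0):
    """Sketch of a WIoU-v3-style loss for one predicted/ground-truth pair.

    mean_iou_loss stands in for the running mean of the IoU loss that a
    trainer would maintain; it must be positive. Boxes are assumed to
    have positive width and height.
    """
    l_iou = 1.0 - iou_xyxy(pred, gt)

    # Width and height of the smallest box enclosing both boxes.
    wg = max(pred[2], gt[2]) - min(pred[0], gt[0])
    hg = max(pred[3], gt[3]) - min(pred[1], gt[1])

    # Offset between the two box centres.
    dx = (pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2
    dy = (pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2

    # Distance term: grows as the centres move apart.
    r_wiou = math.exp((dx * dx + dy * dy) / (wg * wg + hg * hg))

    # Outlier degree of this box relative to the running mean.
    beta = l_iou / mean_iou_loss
    return focusing_gain(beta, alpha, delta) * r_wiou * l_iou
```

Under these assumptions the gain peaks at a moderate outlier degree and falls off on both sides, which is one way to read the abstract's statement that the gradient gain is allocated according to the current state of training.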

Author Biographies

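  • Xudong Ge (葛旭东), School of Computer and Artificial Intelligence, Beijing Technology and Business University, Beijing 100048
    Xudong Ge graduated from Beijing Technology and Business University in 2024 with a master's degree in control engineering. Research interests include image detection, pattern recognition and information fusion, and machine learning.
  • Xuebo Jin (金学波), School of Computer and Artificial Intelligence, Beijing Technology and Business University, Beijing 100048
    Xuebo Jin is a professor and doctoral supervisor. Jin received a bachelor's degree in 1994 and a master's degree in 1997 from Jilin University (formerly Jilin University of Technology), and a Ph.D. in control science and engineering from Zhejiang University in 2004 under the supervision of Academician Sun Youxian. Research interests include information fusion, pattern recognition and prediction, big data analysis, and deep learning. In recent years Jin has led multiple research projects in related fields, including one National Science and Technology Support Program project and four General Program projects of the National Natural Science Foundation of China, and received the First Prize of the 2021 Science and Technology Award of the Chinese Cereals and Oils Association. In time-series signal pattern recognition and image object detection and recognition, Jin has published 159 academic papers indexed by SCI and EI, of which 7 are ESI highly cited papers (top 1%) and 3 are ESI hot papers (top 0.1%), holds more than 20 granted national invention patents, and has published 3 academic monographs on sensor signal recognition and state estimation and on multi-sensor information fusion. Jin serves on the editorial board of the SCI-indexed journal Sensors and reviews for SCI journals in the Chinese Academy of Sciences Q1 tier, such as IEEE/CAA Journal of Automatica Sinica and Knowledge-Based Systems.
  • Huijun Ma (马慧鋆), School of Computer and Artificial Intelligence, Beijing Technology and Business University, Beijing 100048
    Huijun Ma graduated from the Changchun Institute of Optics and Fine Mechanics in 2010 with a master's degree in atomic and molecular physics, and is currently an in-service doctoral student in systems science at Beijing Technology and Business University. Research interests include complex system modeling, pattern recognition and information fusion, and machine learning.
  • Tianchang Zou (邹天畅), Food Science, University College Cork, Cork, Ireland, T12 HY8E
    Tianchang Zou is studying food science at University College Cork, Ireland. Research interests include food sensory evaluation and product formulation improvement.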

Published

2024-05-30

Issue

Section

Articles

How to Cite

Ge, X., Jin, X., Ma, H., & Zou, T. (2024). YOLOv7-BW: An Efficient Detector for Dense Small Objects Based on Remote Sensing Images. 智能机器人, 1(1), 39-54. https://doi.org/10.52810/JIR.2024.004