YOLOv7-BW: An Efficient Detector for Dense Small Objects in Remote Sensing Images
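DOI:
https://doi.org/10.52810/JIR.2024.004
Keywords:
Remote sensing images, YOLO, object detection, mAP
Abstract
In recent years, deep learning has been applied ever more widely to detection in remote sensing images. However, objects in remote sensing images typically vary greatly in size and are densely distributed, which places high demands on detection algorithms. Current detection methods are generally inefficient and prone to missed detections and inaccurate bounding boxes. To address this, this paper proposes YOLOv7-BW, an improved algorithm based on YOLOv7, which achieves efficient detection in remote sensing images and promotes the application and development of object detection in the remote sensing industry. YOLOv7-BW adds a Bi-level Routing Attention module to the original SPPCSPC pyramid pooling network, so that the network concentrates on regions where objects cluster and extracts features more effectively. It also replaces the original CIoU loss function with the dynamic non-monotonic WIoUv3, so that at every moment the loss function applies the gradient gain allocation strategy best suited to the current situation, improving the model's ability to focus on the objects being detected. In comparative experiments on the DIOR remote sensing image dataset, YOLOv7-BW achieves an mAP@0.5 of 85.63% and an mAP@0.5:0.95 of 65.93%, which are 1.93 and 2.03 percentage points higher than the 83.7% and 63.9% obtained with the original YOLOv7 code. YOLOv7-BW also outperforms the commonly used algorithms it is compared against, which indicates that the proposed algorithm is feasible and better suited to remote sensing image detection.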
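The "dynamic non-monotonic" behaviour of WIoUv3 mentioned in the abstract can be illustrated with a minimal pure-Python sketch. It follows the formulation in the Wise-IoU paper as commonly described, not the authors' training code. The function names (`iou`, `focusing_gain`, `wiou_v3`), the `(x1, y1, x2, y2)` box format, and the hyperparameter values `alpha=1.9` and `delta=3.0` are assumptions made for illustration. In a real training loop the enclosing-box size and the outlier degree would be detached from the gradient graph, and `mean_iou_loss` would be a running average over training; here it is passed in as a plain number.

```python
import math


def iou(box_a, box_b):
    """IoU of two axis-aligned boxes (x1, y1, x2, y2) with positive area."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


def focusing_gain(beta, alpha=1.9, delta=3.0):
    """Non-monotonic gain r = beta / (delta * alpha ** (beta - delta)).

    beta is the outlier degree: this sample's IoU loss divided by the
    average IoU loss. The gain is small for very easy samples (beta near 0)
    and for likely outliers (large beta), and larger in between.
    """
    return beta / (delta * alpha ** (beta - delta))


def wiou_v3(pred, target, mean_iou_loss, alpha=1.9, delta=3.0):
    """Sketch of the WIoU v3 loss for one predicted box and one target box."""
    l_iou = 1.0 - iou(pred, target)

    # Centre points of both boxes.
    cx_p, cy_p = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cx_t, cy_t = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2

    # Width and height of the smallest box enclosing both boxes.
    wg = max(pred[2], target[2]) - min(pred[0], target[0])
    hg = max(pred[3], target[3]) - min(pred[1], target[1])

    # Distance-based attention term (WIoU v1): grows with centre offset.
    r_wiou = math.exp(((cx_p - cx_t) ** 2 + (cy_p - cy_t) ** 2) / (wg ** 2 + hg ** 2))

    # Outlier degree and the resulting dynamic gain (WIoU v3).
    beta = l_iou / mean_iou_loss
    return focusing_gain(beta, alpha, delta) * r_wiou * l_iou


if __name__ == "__main__":
    for beta in (0.5, 1.5, 3.0, 6.0):
        print(f"beta={beta}: gain={focusing_gain(beta):.3f}")
    print(wiou_v3((0, 0, 2, 2), (1, 1, 3, 3), mean_iou_loss=0.5))
```

Under these assumed hyperparameters, the gain rises and then falls as the outlier degree grows, so ordinary-quality boxes receive more gradient than both well-fitted boxes and likely outliers. This is one way to read the abstract's "gradient gain allocation strategy".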