Sponsors: China Coal Research Institute Co., Ltd.; Academic Journal Working Committee of the China Coal Society
  • Title

    Recognition of unsafe behaviors of underground personnel based on multi modal feature fusion

  • Authors

    WANG Yu; YU Chunhua; CHEN Xiaoqing; SONG Jiawei

  • Organization
    School of Mining Engineering, University of Science and Technology Liaoning
    Lingang Group Beipiao Baoguo Iron Mining Co., Ltd.
  • Abstract

    Real-time recognition of underground personnel's behavior using artificial intelligence is of great significance for ensuring safe mine production. Behavior recognition methods based on the RGB modality are susceptible to background noise in video images, while methods based on the skeleton modality lack appearance information about people and objects. To address these problems, the two methods are combined, and a method for recognizing unsafe behaviors of underground personnel based on multimodal feature fusion is proposed. The SlowOnly network extracts RGB modality features. The YOLOX and Lite-HRNet networks obtain skeleton modality data, and the PoseC3D network extracts skeleton modality features. Early fusion and late fusion are performed on the RGB and skeleton modality features to obtain the final recognition results for unsafe behaviors of underground personnel. Experiments on the public NTU60 RGB+D dataset under the X-Sub benchmark show that, among behavior recognition models based on the skeleton modality alone, PoseC3D achieves higher recognition accuracy than GCN (graph convolutional network) methods, reaching 93.1%, and that the model based on multimodal feature fusion achieves higher recognition accuracy than the model based on the skeleton modality alone, reaching 95.4%. Experiments on a self-built dataset of underground unsafe behaviors show that the multimodal feature fusion model still achieves the highest recognition accuracy in complex underground environments, reaching 93.3%, and accurately recognizes both similar unsafe behaviors and multi-person unsafe behaviors.

  • Keywords

    intelligent mine; behavior recognition; object detection; pose estimation; multimodal feature fusion; RGB modality; skeleton modality; YOLOX

  • Foundation
    National Natural Science Foundation of China (51174110).
  • DOI
  • Citation
    WANG Yu, YU Chunhua, CHEN Xiaoqing, et al. Recognition of unsafe behaviors of underground personnel based on multi modal feature fusion[J]. Journal of Mine Automation, 2023, 49(11): 138-144.
  • Figures and tables
    • Framework of the behavior recognition model based on multimodal feature fusion

    Figures (9) / Tables (0)
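The late-fusion step named in the abstract can be illustrated with a minimal sketch. The abstract does not state the fusion rule or its weights, so the weighted average of per-stream softmax scores used here is an assumption, not the authors' confirmed method. The function names, the `rgb_weight` parameter, the class labels, and the logit values are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(rgb_logits, skeleton_logits, rgb_weight=0.5):
    """Fuse two streams by a weighted average of their class probabilities.

    Assumption: weighted score averaging stands in for the paper's
    late-fusion rule, which the abstract does not specify.
    """
    if len(rgb_logits) != len(skeleton_logits):
        raise ValueError("both streams must score the same classes")
    if not 0.0 <= rgb_weight <= 1.0:
        raise ValueError("rgb_weight must lie in [0, 1]")
    rgb_probs = softmax(rgb_logits)
    skeleton_probs = softmax(skeleton_logits)
    return [
        rgb_weight * r + (1.0 - rgb_weight) * s
        for r, s in zip(rgb_probs, skeleton_probs)
    ]


def predict(fused_probs, class_names):
    """Return the class name with the highest fused probability."""
    best = max(range(len(fused_probs)), key=fused_probs.__getitem__)
    return class_names[best]


# Hypothetical labels and logits, for illustration only.
class_names = ["class_a", "class_b", "class_c"]
rgb_logits = [2.0, 1.0, 0.1]        # e.g. output of an RGB-stream classifier
skeleton_logits = [0.5, 2.5, 0.3]   # e.g. output of a skeleton-stream classifier

fused = late_fusion(rgb_logits, skeleton_logits, rgb_weight=0.5)
print(predict(fused, class_names))
```

Setting `rgb_weight` to 1.0 or 0.0 reduces the result to a single stream, which gives a simple way to compare single-modality and fused predictions on the same inputs.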
