Cite this article
  • ZHANG Yu, ZHANG Lei. Deep learning action recognition with attention mechanism[J]. 电讯技术, 2021, 61(10): - .
Deep learning action recognition with attention mechanism
ZHANG Yu, ZHANG Lei
(School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China)
Abstract:
Existing deep learning methods for human action recognition are prone to overfitting, are easily affected by interfering information, and have insufficient feature representation ability. To address these problems, a deep learning action recognition method incorporating an attention mechanism is proposed. In data preprocessing, a video data augmentation algorithm is proposed to reduce the risk of model overfitting. In video frame sampling, the existing sampling algorithm is improved to suppress the influence of interfering information. In feature extraction, a residual network with attention is proposed to strengthen the model's feature extraction ability. A long short-term memory (LSTM) network is then used to model the temporal correlation of the spatial features, and Softmax finally classifies the corresponding actions. Experimental results show that the proposed method achieves recognition rates of 96.72%, 98.06%, and 64.81% on the UCF YouTube, KTH, and HMDB51 datasets, respectively.
Key words:  action recognition  deep learning  residual network  attention mechanism  long short-term memory network
Funding: Construction Project of the Beijing Advanced Innovation Center for Intelligent Robots and Systems (00921917001); Beijing Key Laboratory Project (BZ0337)
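The abstract names the stages of the pipeline but does not give the authors' algorithms, so what follows is only a minimal Python sketch of two generic steps it mentions: picking frames from a video and turning class scores into probabilities with Softmax. The segment-based sampler is a common baseline, not the improved sampling algorithm proposed in the paper. The function names, parameters, and example values are assumptions made for illustration.

```python
import math
import random


def sample_frame_indices(num_frames, num_segments, rng=None):
    """Split a video into equal segments and pick one frame index per segment.

    Hypothetical baseline sampler (not the paper's improved algorithm).
    With rng=None the centre frame of each segment is returned; otherwise
    a random frame inside each segment is drawn from rng.
    """
    if num_frames <= 0 or num_segments <= 0:
        raise ValueError("num_frames and num_segments must be positive")
    indices = []
    for s in range(num_segments):
        start = s * num_frames // num_segments
        # Guarantee at least one frame per segment, even for very short videos.
        end = max(start + 1, (s + 1) * num_frames // num_segments)
        if rng is None:
            indices.append((start + end - 1) // 2)
        else:
            indices.append(rng.randrange(start, end))
    return indices


def softmax(scores):
    """Numerically stable Softmax over a list of class scores."""
    m = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(x - m) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Deterministic sampling: 4 frames from a 100-frame clip.
print(sample_frame_indices(100, 4))  # [12, 37, 62, 87]

# Random sampling inside each segment, as is often done during training.
print(sample_frame_indices(100, 4, rng=random.Random(0)))

# Made-up scores for three action classes; the largest score gets the
# largest probability and the probabilities sum to 1.
print(softmax([2.0, 1.0, 0.1]))
```

In a full system of the kind the abstract describes, the sampled frames would pass through the attention-augmented residual network and the LSTM before the final scores reach Softmax. Those stages are omitted here because the abstract does not specify their structure.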