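Abstract: |
To address the limited accuracy and speed of multi-target detection in complex traffic environments, this paper proposes a multi-layer fusion algorithm for multi-target detection and recognition, built on the Feature Pyramid Network (FPN), to improve detection accuracy and network generalization. First, the five-layer architecture of ResNet101 is used: spatial resolution is upsampled by a factor of 2 to build the top-down feature maps, each upsampled map is merged with the corresponding bottom-up feature map by element-wise addition, and the result is a feature layer that fuses high-level semantic information with low-level geometric information. Second, because bounding-box (BBox) regression suffers from imbalanced training samples, the Efficient IOU Loss is combined with Focal Loss to form an improved Focal EIOU Loss. Finally, to reflect real conditions in complex traffic environments, the model is trained on a manually annotated mixed dataset. Compared with FPN, the model improves mean detection accuracy and speed by 2.4% and 5 frame/s on the KITTI test set, and by 1.9% and 4 frame/s on the Cityscapes test set. |
Keywords: complex traffic environment; multi-target detection; multi-target recognition; feature pyramid network (FPN); multi-layer intersection fusion |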
DOI:10.20079/j.issn.1001-893x.220330003 |
|
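Funding: General Program of the Natural Science Foundation of Chongqing (cstc2021jcyj-msxmX0941); Science and Technology Youth Project of the Chongqing Municipal Education Commission (KJQN202101907, KJQN201901907); Intramural Research Project of Chongqing Institute of Engineering (2020xzky04)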
|
Multi-layer Intersection Fusion and Multi-target Detection in Complex Traffic Environment |
LI Cuijin, QU Zhong
(1. College of Electronic Information, Chongqing Institute of Engineering, Chongqing 400056, China; 2. School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China)
Abstract: |
To address the low accuracy and speed of multi-target detection in complex traffic environments, a multi-layer fusion algorithm for multi-target detection and recognition based on the Feature Pyramid Network (FPN) is proposed to improve detection accuracy and network generalization ability. Firstly, the five-layer architecture of ResNet101 is adopted to construct top-down feature maps by 2× up-sampling of the spatial resolution, and each up-sampled map is merged with the corresponding bottom-up feature map by element-wise addition to construct a feature layer that integrates high-level semantic information and low-level geometric information. Secondly, to address the imbalance of training samples in bounding-box regression, an improved Focal Efficient Intersection over Union (EIOU) Loss function is proposed by combining the EIOU Loss function with the Focal Loss function. Finally, a manually annotated mixed data set is used for training to reflect the actual conditions of complex traffic environments. The average detection accuracy and speed of the model are 2.4% and 5 frames per second (FPS) higher than those of FPN on the KITTI test set, and 1.9% and 4 FPS higher on the Cityscapes test set. |
Key words: complex traffic environment; multi-target detection; multi-target recognition; feature pyramid network (FPN); multi-layer intersection fusion |
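
The top-down merge described in the abstract (2× up-sampling followed by element-wise addition with the bottom-up map) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes single-channel 2-D feature maps stored as nested lists, nearest-neighbour up-sampling, and levels whose sizes differ by exactly a factor of 2. The function names `upsample2x`, `merge_top_down` and `build_top_down` are hypothetical, and any lateral convolution or channel matching used in a real FPN is omitted.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x up-sampling of a 2-D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def merge_top_down(coarse, fine):
    """Up-sample the coarser map by 2 and add it element-wise to the finer map."""
    up = upsample2x(coarse)
    if len(up) != len(fine) or len(up[0]) != len(fine[0]):
        raise ValueError("fine map must be exactly twice the size of the coarse map")
    return [[u + f for u, f in zip(up_row, fine_row)]
            for up_row, fine_row in zip(up, fine)]


def build_top_down(bottom_up):
    """Fuse a pyramid given as a list of maps ordered from finest to coarsest.

    Returns the fused maps in the same finest-to-coarsest order.
    """
    fused = [bottom_up[-1]]                  # the coarsest level is used as-is
    for fine in reversed(bottom_up[:-1]):    # walk towards finer levels
        fused.append(merge_top_down(fused[-1], fine))
    return fused[::-1]
```

For example, `build_top_down([level_4x4, level_2x2, level_1x1])` propagates the coarsest value down through every finer level, so each fused map carries information from all levels above it.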
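
The abstract does not give the form of the improved Focal EIOU Loss, so the sketch below only illustrates the commonly published baseline it builds on: the EIOU loss (IoU term plus centre-distance, width and height penalties normalised by the smallest enclosing box) re-weighted by IoU raised to a power `gamma`. The box format `(x1, y1, x2, y2)`, the default `gamma=0.5`, the `eps` stabiliser and the function names are assumptions made for illustration, not details taken from the paper.

```python
def eiou_loss(pred, target, eps=1e-9):
    """EIOU loss for one pair of boxes given as (x1, y1, x2, y2).

    Returns (loss, iou) so callers can reuse the IoU value.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared centre distance, normalised by the enclosing-box diagonal
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, normalised by the enclosing-box sides
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term, iou


def focal_eiou_loss(pred, target, gamma=0.5):
    """Baseline Focal-EIOU: the EIOU loss weighted by IoU ** gamma."""
    loss, iou = eiou_loss(pred, target)
    return (iou ** gamma) * loss
```

With this weighting, boxes that overlap the target well contribute more to the regression loss than poorly overlapping ones, and a prediction with no overlap at all receives zero weight. That is one way to counter the sample imbalance mentioned in the abstract.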