[1]王洪元,徐志晨,陈海琴,等.基于金字塔分割和时空注意力的视频行人重识别[J].常州大学学报(自然科学版),2023,35(02):66-76.[doi:10.3969/j.issn.2095-0411.2023.02.008 ]
 WANG Hongyuan,XU Zhichen,CHEN Haiqin,et al.Video-based person re-identification based on pyramid segmentation and spatial-temporal attention[J].Journal of Changzhou University(Natural Science Edition),2023,35(02):66-76.[doi:10.3969/j.issn.2095-0411.2023.02.008 ]

基于金字塔分割和时空注意力的视频行人重识别

Journal of Changzhou University (Natural Science Edition) [ISSN: 2095-0411 / CN: 32-1822/N]

Volume: Vol. 35
Issue: No. 02, 2023
Pages: 66-76
Section: Computer and Information Engineering
Publication date: 2023-03-28

文章信息/Info

Title:
Video-based person re-identification based on pyramid segmentation and spatial-temporal attention
Article number:
2095-0411(2023)02-0066-11
作者:
王洪元 徐志晨 陈海琴 丁宗元 李鹏辉
(常州大学 计算机与人工智能学院, 江苏 常州213164)
Author(s):
WANG Hongyuan XU Zhichen CHEN Haiqin DING Zongyuan LI Penghui
(School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou 213164, China)
关键词:
视频行人重识别; 深度学习; 图模型; 注意力机制; 加权损失策略
Keywords:
video-based person re-identification; deep learning; graph model; attention mechanism; weighted loss strategy
CLC number:
TP 391.4
DOI:
10.3969/j.issn.2095-0411.2023.02.008
Document code:
A
摘要:
针对视频行人重识别任务中存在的行人外观、遮挡等问题,研究并设计了一个基于金字塔分割和注意力机制的视频行人重识别模型。首先,为了增强图模型对行人局部特征的识别能力,提出了多个尺度的水平金字塔分割方法,将各特征分别分割成不同大小的区域,并池化成统一尺寸后输入图模型。另外,鉴于简单的时空注意模块容易因遮挡破坏行人特征,因此使用时空相关注意力方法改进时空注意模块,逐步学习并聚合空间局部信息,同时在时序上相互作用,抑制行人干扰特征并增强判别特征。将模型在Mars和DukeMTMC-VideoReID两个数据集上进行了评估,实验结果证实了文中提出方法的有效性。
Abstract:
Aiming at the problems of similar pedestrian appearance and occlusion in video-based person re-identification, a video person re-identification model based on pyramid segmentation and an attention mechanism was studied and designed. First, to enhance the graph model's ability to recognize local pedestrian features, a multi-scale horizontal pyramid segmentation method was proposed, which divides each feature map into regions of different sizes and pools them to a uniform size before feeding them into the graph model. In addition, since a simple spatial-temporal attention module is prone to corrupting pedestrian features under occlusion, the module was improved with a spatial-temporal correlation attention method that progressively learns and aggregates local spatial information while interacting along the temporal dimension, thereby suppressing distracting features and enhancing discriminative ones. The model was evaluated on the MARS and DukeMTMC-VideoReID datasets, and the experimental results confirm the effectiveness of the proposed method.
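The multi-scale horizontal pyramid segmentation summarized in the abstract can be pictured with the short sketch below. This is a minimal illustration only, assuming a PyTorch backbone feature map; the scale set (1, 2, 4), the pooled output size, and the function name horizontal_pyramid_pool are illustrative assumptions rather than the authors' implementation.

# Minimal sketch (assumption-based, not the paper's code): split a frame-level
# feature map into horizontal strips at several scales, pool each strip to a
# uniform size, and stack the results as node features for a downstream graph model.
import torch
import torch.nn.functional as F


def horizontal_pyramid_pool(feat: torch.Tensor, scales=(1, 2, 4), out_hw=(1, 1)):
    """feat: (batch, channels, height, width) backbone output."""
    _, _, h, _ = feat.shape
    strips = []
    for s in scales:                          # e.g. whole map, 2 halves, 4 quarters
        step = h // s
        for i in range(s):
            strip = feat[:, :, i * step:(i + 1) * step, :]
            # pool every strip to the same spatial size so all scales align
            strips.append(F.adaptive_avg_pool2d(strip, out_hw))
    # one pooled vector per strip -> (batch, num_strips, channels * out_h * out_w)
    return torch.stack([p.flatten(1) for p in strips], dim=1)


if __name__ == "__main__":
    frame_feat = torch.randn(8, 2048, 16, 8)  # e.g. a ResNet-50 feature map
    nodes = horizontal_pyramid_pool(frame_feat)
    print(nodes.shape)                        # torch.Size([8, 7, 2048])

Pooling every strip to a common size is what allows regions from different pyramid levels to serve as uniform node inputs to the graph model, which is the role the abstract assigns to this step.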

参考文献/References:

[1] NI T G, DING Z Y, CHEN F H, et al. Relative distance metric leaning based on clustering centralization and projection vectors learning for person re-identification[J]. IEEE Access, 2018, 6: 11405-11411.
[2] NI T G, GU X Q, WANG H Y, et al. Discriminative deep transfer metric learning for cross-scenario person re-identification[J]. Journal of Electronic Imaging, 2018, 27(4): 043026.
[3] WANG H Y, ZHANG W W, SUN J Y, et al. A sparse dimension-reduction based person re-identification algorithm[C]//SPIE Commercial + Scientific Sensing and Imaging. Orlando: SPIE, 2018: 190-202.
[4] DING Z Y, WANG H Y, CHEN F H, et al. Person re-identification by semi-supervised dictionary rectification learning[C]//SPIE Commercial + Scientific Sensing and Imaging. Orlando: SPIE, 2018: 172-181.
[5] WANG H Y, WU L Y, CHEN F H, et al. Common-covariance based person re-identification model[J]. Pattern Recognition Letters, 2021, 146: 77-82.
[6] XIAO Y, CAO L, WANG H Y, et al. Unsupervised video-based person re-identification based on the joint global-local metrics[C]//2021 IEEE 7th International Conference on Cloud Computing and Intelligent Systems(CCIS). Xi'an: IEEE, 2022: 176-182.
[7] 张云鹏, 王洪元, 张继, 等. 近邻中心迭代策略的单标注视频行人重识别[J]. 软件学报, 2021, 32(12): 4025-4035.
[8] 丁宗元, 王洪元, 陈付华, 等. 基于距离中心化与投影向量学习的行人重识别[J]. 计算机研究与发展, 2017, 54(8): 1785-1794.
[9] 戴臣超, 王洪元, 倪彤光, 等. 基于深度卷积生成对抗网络和拓展近邻重排序的行人重识别[J]. 计算机研究与发展, 2019, 56(8): 1632-1641.
[10] 陈莉, 王洪元, 张云鹏, 等. 联合均等采样随机擦除和全局时间特征池化的视频行人重识别方法[J]. 计算机应用, 2021, 41(1): 164-169.
[11] 徐志晨, 王洪元, 齐鹏宇, 等. 基于图模型与加权损失策略的视频行人重识别研究[J]. 计算机应用研究, 2022, 39(2): 598-603.
[12] LI J N, ZHANG S L, WANG J D, et al. Global-local temporal representations for video person re-identification[C]//2019 IEEE/CVF International Conference on Computer Vision(ICCV). Seoul: IEEE, 2020: 3957-3966.
[13] WU X H, AN W Z, YU S Q, et al. Spatial-temporal graph attention network for video-based gait recognition[M]//Lecture Notes in Computer Science. Cham: Springer International Publishing, 2020: 274-286.
[14] WU Y M, EL FAROUK BOURAHLA O, LI X, et al. Adaptive graph representation learning for video person re-identification[J]. IEEE Transactions on Image Processing, 2020, 29: 8821-8830.
[15] YANG J R, ZHENG W S, YANG Q Z, et al. Spatial-temporal graph convolutional network for video-based person re-identification[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR). Seattle: IEEE, 2020: 3286-3296.
[16] LIU J W, ZHA Z J, WU W, et al. Spatial-temporal correlation and topology learning for person re-identification in videos[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR). Nashville: IEEE, 2021: 4368-4377.
[17] CHEN L, YANG H, GAO Z Y. Joint attentive spatial-temporal feature aggregation for video-based person re-identification[J]. IEEE Access, 2019, 7: 41230-41240.
[18] ZHU X R, LIU J W, WU H Z, et al. ASTA-net: adaptive spatio-temporal attention network for person re-identification in videos[C]//Proceedings of the 28th ACM International Conference on Multimedia. New York: ACM, 2020: 1706-1715.
[19] ZHANG R M, LI J Y, SUN H B, et al. SCAN: self-and-collaborative attention network for video person re-identification[J]. IEEE Transactions on Image Processing, 2019, 28(10): 4870-4882.
[20] WANG Y Q, ZHANG P P, GAO S, et al. Pyramid spatial-temporal aggregation for video-based person re-identification[C]//2021 IEEE/CVF International Conference on Computer Vision(ICCV). Montreal: IEEE, 2022: 12006-12015.
[21] HOU R B, CHANG H, MA B P, et al. BiCnet-TKS: learning efficient spatial-temporal representation for video person re-identification[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR). Nashville: IEEE, 2021: 2014-2023.
[22] HE K M, ZHANG X Y, REN S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916.
[23] ZHAO H S, SHI J P, QI X J, et al. Pyramid scene parsing network[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition(CVPR). Honolulu: IEEE, 2017: 6230-6239.
[24] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition(CVPR). Las Vegas: IEEE, 2016: 770-778.
[25] LI J N, ZHANG S L, HUANG T J. Multi-scale 3D convolution network for video based person re-identification[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2019, 33(1): 8618-8625.
[26] ZHANG Z Z, LAN C L, ZENG W J, et al. Relation-aware global attention for person re-identification[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR). Seattle: IEEE, 2020: 3183-3192.
[27] HE T Y, JIN X, SHEN X, et al. Dense interaction learning for video-based person re-identification[C]//2021 IEEE/CVF International Conference on Computer Vision(ICCV). Montreal: IEEE, 2022: 1470-1481.
[28] DENG J, DONG W, SOCHER R, et al. ImageNet: a large-scale hierarchical image database[C]//2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami: IEEE, 2009: 248-255.
[29] LUO H, GU Y Z, LIAO X Y, et al. Bag of tricks and a strong baseline for deep person re-identification[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops(CVPRW). Long Beach: IEEE, 2020: 1487-1495.
[30] SUBRAMANIAM A, NAMBIAR A, MITTAL A. Co-segmentation inspired attention networks for video-based person re-identification[C]//2019 IEEE/CVF International Conference on Computer Vision(ICCV). Seoul: IEEE, 2020: 562-572.
[31] HOU R B, CHANG H, MA B P, et al. Temporal complementary learning for video person re-identification[M]//Computer Vision-ECCV 2020. Cham: Springer International Publishing, 2020: 388-405.
[32] CHEN G Y, RAO Y M, LU J W, et al. Temporal coherence or temporal motion: which is more critical for video-based person re-identification?[M]//Computer Vision-ECCV 2020. Cham: Springer International Publishing, 2020: 660-676.
[33] YAN Y C, QIN J, CHEN J X, et al. Learning multi-granular hypergraphs for video-based person re-identification[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR). Seattle: IEEE, 2020: 2896-2905.
[34] ZHANG Z Z, LAN C L, ZENG W J, et al. Multi-granularity reference-aided attentive feature aggregation for video-based person re-identification[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR). Seattle: IEEE, 2020: 10404-10413.

相似文献/Similar references:

[1]宋志理,胡胜利,王峰.基于深度学习特征表示协同过滤算法[J].常州大学学报(自然科学版),2021,33(01):62.[doi:10.3969/j.issn.2095-0411.2021.01.010]
 SONG Zhili,HU Shengli,WANG Feng.Research on Cooperative Filtering Algorithm Based on Deep Learning Feature Representation[J].Journal of Changzhou University(Natural Science Edition),2021,33(01):62.[doi:10.3969/j.issn.2095-0411.2021.01.010]
[2]吴鹏,陈信华,马宇超,等.基于优化深度学习的电动桥铸件表面瑕疵识别方法[J].常州大学学报(自然科学版),2022,34(05):65.[doi:10.3969/j.issn.2095-0411.2022.05.009]
 WU Peng,CHEN Xinhua,MA Yuchao,et al.Research on Casting Surface Defects of Electric Bridge Identification Method Based on Optimal Deep Learning[J].Journal of Changzhou University(Natural Science Edition),2022,34(05):65.[doi:10.3969/j.issn.2095-0411.2022.05.009]
[3]罗俊如,丁言瑞,徐明华,等.基于深度AUC最大化算法的井漏风险预测[J].常州大学学报(自然科学版),2024,36(03):34.[doi:10.3969/j.issn.2095-0411.2024.03.005]
 LUO Junru,DING Yanrui,XU Minghua,et al.Lost circulation prediction based on deep AUC maximization[J].Journal of Changzhou University(Natural Science Edition),2024,36(03):34.[doi:10.3969/j.issn.2095-0411.2024.03.005]

备注/Memo

Received: 2022-10-29.
Funding: National Natural Science Foundation of China (61976028, 61572085, 61070121).
About the first author: WANG Hongyuan (born in 1960), male, a native of Changshu, Jiangsu; PhD, professor. E-mail: hywang@cczu.edu.cn
