The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations; e.g., some numerals may appear as "XNUMX".
Copyright notice
Video-based person re-identification (re-ID) aims at retrieving persons across non-overlapping cameras and has achieved promising results owing to deep convolutional neural networks. Due to the dynamic properties of video, the problems of background clutter and occlusion are more serious than in image-based person re-ID. In this letter, we present a triple attention network (TriANet) that simultaneously utilizes temporal, spatial, and channel context information by employing the self-attention mechanism to obtain robust and discriminative features. Specifically, the network has two parts: the first part introduces a residual attention subnetwork, which contains a channel attention module to capture cross-dimension dependencies using rotation and transformation, and a spatial attention module to focus on pedestrian features. In the second part, a temporal attention module is designed to judge the quality score of each pedestrian image and to reduce the weight of incomplete pedestrian images to alleviate the occlusion problem. We evaluate the proposed architecture on three datasets: iLIDS-VID, PRID2011, and MARS. Extensive comparative experimental results show that our proposed method achieves state-of-the-art results.
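The abstract says the channel attention module captures cross-dimension dependencies "by using rotation and transformation", but the paper's exact formulation is not reproduced here. A minimal NumPy sketch of the general idea, under that assumption (rotate the feature tensor so that pooling mixes a different axis, derive a gate, and rotate back; the function name and pooling choices are hypothetical, not from the paper), might look like:

```python
import numpy as np

def rotated_channel_attention(x):
    """Illustrative sketch (not the paper's implementation).
    x: feature map of shape (C, H, W)."""
    # "Rotation": bring H to the leading position -> shape (H, C, W),
    # so pooling below interacts channel and spatial dimensions.
    x_rot = np.transpose(x, (1, 0, 2))
    # Pool over the rotated leading axis (max + mean), shape (2, C, W)
    pooled = np.stack([x_rot.max(axis=0), x_rot.mean(axis=0)])
    # Collapse to one attention map and squash to (0, 1) with a sigmoid
    attn = 1.0 / (1.0 + np.exp(-pooled.mean(axis=0)))  # shape (C, W)
    # Gate the rotated tensor, then rotate back to (C, H, W)
    y_rot = x_rot * attn[None, :, :]
    return np.transpose(y_rot, (1, 0, 2))
```

In the published method the gating would be produced by learned convolutions rather than a fixed pooling rule; the sketch only shows how rotating axes lets a cheap 2-D attention map couple channel and spatial information.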
Rui SUN
Hefei University of Technology
Qili LIANG
Hefei University of Technology
Zi YANG
Hefei University of Technology
Zhenghui ZHAO
Hefei University of Technology
Xudong ZHANG
Hefei University of Technology
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Rui SUN, Qili LIANG, Zi YANG, Zhenghui ZHAO, Xudong ZHANG, "Triplet Attention Network for Video-Based Person Re-Identification" in IEICE TRANSACTIONS on Information,
vol. E104-D, no. 10, pp. 1775-1779, October 2021, doi: 10.1587/transinf.2021EDL8037.
Abstract: Video-based person re-identification (re-ID) aims at retrieving persons across non-overlapping cameras and has achieved promising results owing to deep convolutional neural networks. Due to the dynamic properties of video, the problems of background clutter and occlusion are more serious than in image-based person re-ID. In this letter, we present a novel triple attention network (TriANet) that simultaneously utilizes temporal, spatial, and channel context information by employing the self-attention mechanism to obtain robust and discriminative features. Specifically, the network has two parts: the first part introduces a residual attention subnetwork, which contains a channel attention module to capture cross-dimension dependencies using rotation and transformation, and a spatial attention module to focus on pedestrian features. In the second part, a temporal attention module is designed to judge the quality score of each pedestrian image and to reduce the weight of incomplete pedestrian images to alleviate the occlusion problem. We evaluate our proposed architecture on three datasets: iLIDS-VID, PRID2011, and MARS. Extensive comparative experimental results show that our proposed method achieves state-of-the-art results.
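The temporal attention module described above assigns each frame a quality score and down-weights incomplete (occluded) pedestrian images before aggregation. In the paper the scores would come from a learned subnetwork; the sketch below, which is an assumption rather than the authors' code, takes the scores as given and shows only the weighting step (softmax over time, then a weighted sum of per-frame features):

```python
import numpy as np

def temporal_attention(frame_feats, quality_scores):
    """Illustrative sketch (not the paper's implementation).
    frame_feats: (T, D) per-frame feature vectors.
    quality_scores: (T,) raw scores, higher = better frame quality."""
    # Softmax over the time axis turns raw scores into attention weights
    e = np.exp(quality_scores - quality_scores.max())
    w = e / e.sum()
    # Weighted temporal aggregation into one clip-level feature, shape (D,)
    return w @ frame_feats
```

With equal scores this reduces to average pooling over frames; a low score on an occluded frame shrinks its contribution to the clip-level feature, which is the effect the abstract describes.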
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2021EDL8037/_p
@ARTICLE{e104-d_10_1775,
author={Rui SUN and Qili LIANG and Zi YANG and Zhenghui ZHAO and Xudong ZHANG},
journal={IEICE TRANSACTIONS on Information},
title={Triplet Attention Network for Video-Based Person Re-Identification},
year={2021},
volume={E104-D},
number={10},
pages={1775-1779},
abstract={Video-based person re-identification (re-ID) aims at retrieving persons across non-overlapping cameras and has achieved promising results owing to deep convolutional neural networks. Due to the dynamic properties of video, the problems of background clutter and occlusion are more serious than in image-based person re-ID. In this letter, we present a novel triple attention network (TriANet) that simultaneously utilizes temporal, spatial, and channel context information by employing the self-attention mechanism to obtain robust and discriminative features. Specifically, the network has two parts: the first part introduces a residual attention subnetwork, which contains a channel attention module to capture cross-dimension dependencies using rotation and transformation, and a spatial attention module to focus on pedestrian features. In the second part, a temporal attention module is designed to judge the quality score of each pedestrian image and to reduce the weight of incomplete pedestrian images to alleviate the occlusion problem. We evaluate our proposed architecture on three datasets: iLIDS-VID, PRID2011, and MARS. Extensive comparative experimental results show that our proposed method achieves state-of-the-art results.},
keywords={},
doi={10.1587/transinf.2021EDL8037},
ISSN={1745-1361},
month={October},}
TY - JOUR
TI - Triplet Attention Network for Video-Based Person Re-Identification
T2 - IEICE TRANSACTIONS on Information
SP - 1775
EP - 1779
AU - Rui SUN
AU - Qili LIANG
AU - Zi YANG
AU - Zhenghui ZHAO
AU - Xudong ZHANG
PY - 2021
DO - 10.1587/transinf.2021EDL8037
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E104-D
IS - 10
JA - IEICE TRANSACTIONS on Information
Y1 - October 2021
AB - Video-based person re-identification (re-ID) aims at retrieving persons across non-overlapping cameras and has achieved promising results owing to deep convolutional neural networks. Due to the dynamic properties of video, the problems of background clutter and occlusion are more serious than in image-based person re-ID. In this letter, we present a novel triple attention network (TriANet) that simultaneously utilizes temporal, spatial, and channel context information by employing the self-attention mechanism to obtain robust and discriminative features. Specifically, the network has two parts: the first part introduces a residual attention subnetwork, which contains a channel attention module to capture cross-dimension dependencies using rotation and transformation, and a spatial attention module to focus on pedestrian features. In the second part, a temporal attention module is designed to judge the quality score of each pedestrian image and to reduce the weight of incomplete pedestrian images to alleviate the occlusion problem. We evaluate our proposed architecture on three datasets: iLIDS-VID, PRID2011, and MARS. Extensive comparative experimental results show that our proposed method achieves state-of-the-art results.
ER -