Lianqiang LI, Kangbo SUN, and Jie ZHU
Shanghai Jiao Tong University (SJTU)
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Lianqiang LI, Kangbo SUN, Jie ZHU, "A Novel Multi-Knowledge Distillation Approach" in IEICE TRANSACTIONS on Information and Systems,
vol. E104-D, no. 1, pp. 216-219, January 2021, doi: 10.1587/transinf.2020EDL8080.
Abstract: Knowledge distillation approaches can transfer information from a large network (the teacher network) to a small network (the student network) to compress and accelerate deep neural networks. This paper proposes a novel knowledge distillation approach called multi-knowledge distillation (MKD). MKD consists of two stages. In the first stage, it employs autoencoders to learn compact and precise representations of the feature maps (FM) from the teacher network and the student network; these representations can be treated as the essential FM, i.e., EFM. In the second stage, MKD utilizes multiple kinds of knowledge, i.e., the magnitude of an individual sample's EFM and the similarity relationships among several samples' EFM, to enhance the generalization ability of the student network. Compared with previous approaches that employ the FM or handcrafted features derived from the FM, the EFM learned by autoencoders can be transferred more efficiently and reliably. Furthermore, the rich information provided by the multiple kinds of knowledge guarantees that the student network mimics the teacher network as closely as possible. Experimental results also show that MKD is superior to the state of the art.
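The abstract's two-stage pipeline can be made concrete with a short sketch. The following PyTorch code is an illustrative assumption, not the authors' implementation: the names (FMAutoencoder, mkd_loss), the fully connected encoder and decoder, and the use of mean-squared error and batch-wise cosine similarity for the two knowledge terms are all choices made here only to mirror the description above.

import torch
import torch.nn as nn
import torch.nn.functional as F

class FMAutoencoder(nn.Module):
    """Stage 1 (sketch): learn a compact code (the EFM) for a network's
    feature maps. fm_dim is the flattened feature-map size
    (channels * height * width); both layer choices are assumptions."""
    def __init__(self, fm_dim: int, efm_dim: int):
        super().__init__()
        self.encoder = nn.Linear(fm_dim, efm_dim)  # FM -> EFM
        self.decoder = nn.Linear(efm_dim, fm_dim)  # EFM -> reconstructed FM

    def forward(self, fm):
        efm = self.encoder(fm.flatten(1))  # (batch, fm_dim) -> (batch, efm_dim)
        return efm, self.decoder(efm)

def mkd_loss(student_efm, teacher_efm, alpha=1.0, beta=1.0):
    """Stage 2 (sketch): combine the two kinds of knowledge over a batch.

    Magnitude term: match each sample's student EFM to the teacher's EFM.
    Similarity term: match the pairwise cosine-similarity structure
    among the samples in the batch."""
    magnitude = F.mse_loss(student_efm, teacher_efm)
    s = F.normalize(student_efm, dim=1)
    t = F.normalize(teacher_efm, dim=1)
    similarity = F.mse_loss(s @ s.t(), t @ t.t())
    return alpha * magnitude + beta * similarity

Under these assumptions, training would first fit one autoencoder per network to reconstruct its feature maps, then freeze the teacher's side and add mkd_loss (weighted by the hypothetical alpha and beta) to the student's task loss.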
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2020EDL8080/_p
@ARTICLE{e104-d_1_216,
author={Lianqiang LI and Kangbo SUN and Jie ZHU},
journal={IEICE TRANSACTIONS on Information and Systems},
title={A Novel Multi-Knowledge Distillation Approach},
year={2021},
volume={E104-D},
number={1},
pages={216-219},
abstract={Knowledge distillation approaches can transfer information from a large network (the teacher network) to a small network (the student network) to compress and accelerate deep neural networks. This paper proposes a novel knowledge distillation approach called multi-knowledge distillation (MKD). MKD consists of two stages. In the first stage, it employs autoencoders to learn compact and precise representations of the feature maps (FM) from the teacher network and the student network; these representations can be treated as the essential FM, i.e., EFM. In the second stage, MKD utilizes multiple kinds of knowledge, i.e., the magnitude of an individual sample's EFM and the similarity relationships among several samples' EFM, to enhance the generalization ability of the student network. Compared with previous approaches that employ the FM or handcrafted features derived from the FM, the EFM learned by autoencoders can be transferred more efficiently and reliably. Furthermore, the rich information provided by the multiple kinds of knowledge guarantees that the student network mimics the teacher network as closely as possible. Experimental results also show that MKD is superior to the state of the art.},
keywords={},
doi={10.1587/transinf.2020EDL8080},
ISSN={1745-1361},
month={January},
}
TY - JOUR
TI - A Novel Multi-Knowledge Distillation Approach
T2 - IEICE TRANSACTIONS on Information and Systems
SP - 216
EP - 219
AU - Lianqiang LI
AU - Kangbo SUN
AU - Jie ZHU
PY - 2021
DO - 10.1587/transinf.2020EDL8080
JO - IEICE TRANSACTIONS on Information and Systems
SN - 1745-1361
VL - E104-D
IS - 1
JA - IEICE TRANSACTIONS on Information and Systems
Y1 - January 2021
AB - Knowledge distillation approaches can transfer information from a large network (the teacher network) to a small network (the student network) to compress and accelerate deep neural networks. This paper proposes a novel knowledge distillation approach called multi-knowledge distillation (MKD). MKD consists of two stages. In the first stage, it employs autoencoders to learn compact and precise representations of the feature maps (FM) from the teacher network and the student network; these representations can be treated as the essential FM, i.e., EFM. In the second stage, MKD utilizes multiple kinds of knowledge, i.e., the magnitude of an individual sample's EFM and the similarity relationships among several samples' EFM, to enhance the generalization ability of the student network. Compared with previous approaches that employ the FM or handcrafted features derived from the FM, the EFM learned by autoencoders can be transferred more efficiently and reliably. Furthermore, the rich information provided by the multiple kinds of knowledge guarantees that the student network mimics the teacher network as closely as possible. Experimental results also show that MKD is superior to the state of the art.
ER -