In this paper, we present a discriminative word-character hybrid model for joint Chinese word segmentation and POS tagging. Our word-character hybrid model offers high performance since it can handle both known and unknown words. We describe our strategies that yield good balance for learning the characteristics of known and unknown words and propose an error-driven policy that delivers such balance by acquiring examples of unknown words from particular errors in a training corpus. We describe an efficient framework for training our model based on the Margin Infused Relaxed Algorithm (MIRA), evaluate our approach on the Penn Chinese Treebank, and show that it achieves superior performance compared to the state-of-the-art approaches reported in the literature.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Canasai KRUENGKRAI, Kiyotaka UCHIMOTO, Jun'ichi KAZAMA, Yiou WANG, Kentaro TORISAWA, Hitoshi ISAHARA, "Joint Chinese Word Segmentation and POS Tagging Using an Error-Driven Word-Character Hybrid Model" in IEICE TRANSACTIONS on Information,
vol. E92-D, no. 12, pp. 2298-2305, December 2009, doi: 10.1587/transinf.E92.D.2298.
Abstract: In this paper, we present a discriminative word-character hybrid model for joint Chinese word segmentation and POS tagging. Our word-character hybrid model offers high performance since it can handle both known and unknown words. We describe our strategies that yield good balance for learning the characteristics of known and unknown words and propose an error-driven policy that delivers such balance by acquiring examples of unknown words from particular errors in a training corpus. We describe an efficient framework for training our model based on the Margin Infused Relaxed Algorithm (MIRA), evaluate our approach on the Penn Chinese Treebank, and show that it achieves superior performance compared to the state-of-the-art approaches reported in the literature.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E92.D.2298/_p
@ARTICLE{e92-d_12_2298,
author={Canasai KRUENGKRAI and Kiyotaka UCHIMOTO and Jun'ichi KAZAMA and Yiou WANG and Kentaro TORISAWA and Hitoshi ISAHARA},
journal={IEICE TRANSACTIONS on Information},
title={Joint Chinese Word Segmentation and POS Tagging Using an Error-Driven Word-Character Hybrid Model},
year={2009},
volume={E92-D},
number={12},
pages={2298-2305},
abstract={In this paper, we present a discriminative word-character hybrid model for joint Chinese word segmentation and POS tagging. Our word-character hybrid model offers high performance since it can handle both known and unknown words. We describe our strategies that yield good balance for learning the characteristics of known and unknown words and propose an error-driven policy that delivers such balance by acquiring examples of unknown words from particular errors in a training corpus. We describe an efficient framework for training our model based on the Margin Infused Relaxed Algorithm (MIRA), evaluate our approach on the Penn Chinese Treebank, and show that it achieves superior performance compared to the state-of-the-art approaches reported in the literature.},
keywords={},
doi={10.1587/transinf.E92.D.2298},
ISSN={1745-1361},
month={December},}
TY - JOUR
TI - Joint Chinese Word Segmentation and POS Tagging Using an Error-Driven Word-Character Hybrid Model
T2 - IEICE TRANSACTIONS on Information
SP - 2298
EP - 2305
AU - Canasai KRUENGKRAI
AU - Kiyotaka UCHIMOTO
AU - Jun'ichi KAZAMA
AU - Yiou WANG
AU - Kentaro TORISAWA
AU - Hitoshi ISAHARA
PY - 2009
DO - 10.1587/transinf.E92.D.2298
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E92-D
IS - 12
JA - IEICE TRANSACTIONS on Information
Y1 - December 2009
AB - In this paper, we present a discriminative word-character hybrid model for joint Chinese word segmentation and POS tagging. Our word-character hybrid model offers high performance since it can handle both known and unknown words. We describe our strategies that yield good balance for learning the characteristics of known and unknown words and propose an error-driven policy that delivers such balance by acquiring examples of unknown words from particular errors in a training corpus. We describe an efficient framework for training our model based on the Margin Infused Relaxed Algorithm (MIRA), evaluate our approach on the Penn Chinese Treebank, and show that it achieves superior performance compared to the state-of-the-art approaches reported in the literature.
ER -