The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
In current statistical machine translation (SMT), erroneous word reordering is one of the most serious problems. To address it, many word-reordering constraint techniques have been proposed, and inversion transduction grammar (ITG) constraints are among them. Under ITG constraints, the target-side word order is obtained by rotating nodes of a source-side binary tree, but these rotations do not take a specific instance of the source binary tree into account. Stronger reordering constraints can therefore be obtained by imposing further constraints derived from the source tree on the ITG constraints. For example, for the source word sequence { a b c d }, ITG constraints allow a total of twenty-two target word orderings. When the source binary tree instance ((a b) (c d)) is given, however, our proposed "imposing source tree on ITG" (IST-ITG) constraints allow only eight. This reduction in the number of word-order permutations efficiently suppresses erroneous word orderings. In our experiments with IST-ITG on the NIST MT08 English-to-Chinese translation track data, the proposed method yielded a 1.8-point improvement in character BLEU-4 (35.2 to 37.0) and a 6.2-point reduction in CER (74.1% to 67.9%) compared with our baseline condition.
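The two counts quoted in the abstract (twenty-two orderings under ITG, eight under IST-ITG) can be checked with a small enumeration. The following is a minimal sketch, not the authors' decoder: the function names `itg_orderings` and `ist_itg_orderings` and the nested-tuple tree encoding are assumptions made for illustration. It treats a "node rotation" as the choice between keeping a node's two children in source order and swapping them.

```python
def itg_orderings(words):
    """Target orders reachable under plain ITG constraints.

    Every binary bracketing of the source span is tried, and each
    node may be kept straight or inverted (illustrative sketch).
    """
    words = tuple(words)
    if len(words) == 1:
        return {words}
    results = set()
    for split in range(1, len(words)):
        for left in itg_orderings(words[:split]):
            for right in itg_orderings(words[split:]):
                results.add(left + right)  # straight: children in source order
                results.add(right + left)  # inverted: children swapped
    return results


def ist_itg_orderings(tree):
    """Target orders reachable when one source binary tree is fixed.

    `tree` is a leaf (a string) or a pair (left_subtree, right_subtree);
    this encoding is an assumption of the sketch.
    """
    if isinstance(tree, str):
        return {(tree,)}
    left_tree, right_tree = tree
    results = set()
    for left in ist_itg_orderings(left_tree):
        for right in ist_itg_orderings(right_tree):
            results.add(left + right)  # straight
            results.add(right + left)  # inverted
    return results


itg = itg_orderings("abcd")
ist = ist_itg_orderings((("a", "b"), ("c", "d")))
print(len(itg), len(ist))
```

With the tree fixed, only the three internal nodes can be rotated, which gives 2 × 2 × 2 = 8 orderings; without a fixed tree, all bracketings contribute and only the two orderings "b d a c" and "c a d b" are excluded from the 24 permutations.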
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Hirofumi YAMAMOTO, Hideo OKUMA, Eiichiro SUMITA, "Imposing Constraints from the Source Tree on ITG Constraints for SMT" in IEICE TRANSACTIONS on Information, vol. E92-D, no. 9, pp. 1762-1770, September 2009, doi: 10.1587/transinf.E92.D.1762.
Abstract: In the current statistical machine translation (SMT), erroneous word reordering is one of the most serious problems. To resolve this problem, many word-reordering constraint techniques have been proposed. Inversion transduction grammar (ITG) is one of these constraints. In ITG constraints, target-side word order is obtained by rotating nodes of the source-side binary tree. In these node rotations, the source binary tree instance is not considered. Therefore, stronger constraints for word reordering can be obtained by imposing further constraints derived from the source tree on the ITG constraints. For example, for the source word sequence { a b c d }, ITG constraints allow a total of twenty-two target word orderings. However, when the source binary tree instance ((a b) (c d)) is given, our proposed "imposing source tree on ITG" (IST-ITG) constraints allow only eight word orderings. The reduction in the number of word-order permutations by our proposed stronger constraints efficiently suppresses erroneous word orderings. In our experiments with IST-ITG using the NIST MT08 English-to-Chinese translation track's data, the proposed method resulted in a 1.8-points improvement in character BLEU-4 (35.2 to 37.0) and a 6.2% lower CER (74.1 to 67.9%) compared with our baseline condition.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E92.D.1762/_p
@ARTICLE{e92-d_9_1762,
author={Hirofumi YAMAMOTO and Hideo OKUMA and Eiichiro SUMITA},
journal={IEICE TRANSACTIONS on Information},
title={Imposing Constraints from the Source Tree on ITG Constraints for SMT},
year={2009},
volume={E92-D},
number={9},
pages={1762-1770},
abstract={In the current statistical machine translation (SMT), erroneous word reordering is one of the most serious problems. To resolve this problem, many word-reordering constraint techniques have been proposed. Inversion transduction grammar (ITG) is one of these constraints. In ITG constraints, target-side word order is obtained by rotating nodes of the source-side binary tree. In these node rotations, the source binary tree instance is not considered. Therefore, stronger constraints for word reordering can be obtained by imposing further constraints derived from the source tree on the ITG constraints. For example, for the source word sequence { a b c d }, ITG constraints allow a total of twenty-two target word orderings. However, when the source binary tree instance ((a b) (c d)) is given, our proposed "imposing source tree on ITG" (IST-ITG) constraints allow only eight word orderings. The reduction in the number of word-order permutations by our proposed stronger constraints efficiently suppresses erroneous word orderings. In our experiments with IST-ITG using the NIST MT08 English-to-Chinese translation track's data, the proposed method resulted in a 1.8-points improvement in character BLEU-4 (35.2 to 37.0) and a 6.2% lower CER (74.1 to 67.9%) compared with our baseline condition.},
keywords={},
doi={10.1587/transinf.E92.D.1762},
ISSN={1745-1361},
month={September},}
TY - JOUR
TI - Imposing Constraints from the Source Tree on ITG Constraints for SMT
T2 - IEICE TRANSACTIONS on Information
SP - 1762
EP - 1770
AU - Hirofumi YAMAMOTO
AU - Hideo OKUMA
AU - Eiichiro SUMITA
PY - 2009
DO - 10.1587/transinf.E92.D.1762
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E92-D
IS - 9
JA - IEICE TRANSACTIONS on Information
Y1 - September 2009
AB - In the current statistical machine translation (SMT), erroneous word reordering is one of the most serious problems. To resolve this problem, many word-reordering constraint techniques have been proposed. Inversion transduction grammar (ITG) is one of these constraints. In ITG constraints, target-side word order is obtained by rotating nodes of the source-side binary tree. In these node rotations, the source binary tree instance is not considered. Therefore, stronger constraints for word reordering can be obtained by imposing further constraints derived from the source tree on the ITG constraints. For example, for the source word sequence { a b c d }, ITG constraints allow a total of twenty-two target word orderings. However, when the source binary tree instance ((a b) (c d)) is given, our proposed "imposing source tree on ITG" (IST-ITG) constraints allow only eight word orderings. The reduction in the number of word-order permutations by our proposed stronger constraints efficiently suppresses erroneous word orderings. In our experiments with IST-ITG using the NIST MT08 English-to-Chinese translation track's data, the proposed method resulted in a 1.8-points improvement in character BLEU-4 (35.2 to 37.0) and a 6.2% lower CER (74.1 to 67.9%) compared with our baseline condition.
ER -