The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
This paper describes a distinctive phonetic feature (DPF) extraction method for use in a phoneme recognition system; our method has a low computation cost. This method comprises three stages. The first stage uses two multilayer neural networks (MLNs): MLNLF-DPF, which maps continuous acoustic features, or local features (LFs), onto discrete DPF features, and MLNDyn, which constrains the DPF context at the phoneme boundaries. The second stage incorporates inhibition/enhancement (In/En) functionalities to discriminate whether the DPF dynamic patterns of trajectories are convex or concave, where convex patterns are enhanced and concave patterns are inhibited. The third stage decorrelates the DPF vectors using the Gram-Schmidt orthogonalization procedure before feeding them into a hidden Markov model (HMM)-based classifier. In an experiment on Japanese Newspaper Article Sentences (JNAS) utterances, the proposed feature extractor, which incorporates two MLNs and an In/En network, was found to provide a higher phoneme correct rate with fewer mixture components in the HMMs.
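The third stage of the method decorrelates DPF vectors with Gram-Schmidt orthogonalization. The abstract does not give the exact formulation, so the following is only a minimal sketch of the general technique, not the authors' implementation. It assumes that each feature dimension is treated as a vector over frames, that the dimensions are mean-centered first (so that orthogonal dimensions are also uncorrelated), and that the modified Gram-Schmidt variant is acceptable. The helper names (`dot`, `center`, `gram_schmidt`) and the synthetic data are hypothetical and stand in for real DPF trajectories.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def center(col):
    """Subtract the mean so that orthogonality implies zero correlation."""
    mean = sum(col) / len(col)
    return [c - mean for c in col]


def gram_schmidt(columns, eps=1e-12):
    """Orthonormalize a list of vectors with modified Gram-Schmidt.

    Each vector has the components along the previously accepted basis
    vectors removed, then is scaled to unit length. Vectors that are
    (nearly) linearly dependent on earlier ones are skipped.
    """
    basis = []
    for col in columns:
        v = list(col)
        for q in basis:
            proj = dot(q, v)
            v = [vi - proj * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(dot(v, v))
        if norm > eps:
            basis.append([vi / norm for vi in v])
    return basis


# Synthetic stand-in for three feature dimensions over 200 frames
# (assumption: real DPF trajectories would be used here instead).
random.seed(0)
n_frames = 200
a = [random.gauss(0, 1) for _ in range(n_frames)]
b = [random.gauss(0, 1) for _ in range(n_frames)]
c = [0.8 * x + 0.2 * random.gauss(0, 1) for x in a]  # correlated with a

dims = [center(d) for d in (a, b, c)]
ortho = gram_schmidt(dims)
```

After this step the returned dimensions are mutually orthogonal and of unit length, which is the property a diagonal-covariance HMM front end typically benefits from; whether the paper applies the procedure per frame, per utterance, or with a fixed basis estimated from training data is not stated in the abstract.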
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Mohammad Nurul HUDA, Hiroaki KAWASHIMA, Tsuneo NITTA, "Distinctive Phonetic Feature (DPF) Extraction Based on MLNs and Inhibition/Enhancement Network," in IEICE TRANSACTIONS on Information and Systems, vol. E92-D, no. 4, pp. 671-680, April 2009, doi: 10.1587/transinf.E92.D.671.
Abstract: This paper describes a distinctive phonetic feature (DPF) extraction method for use in a phoneme recognition system; our method has a low computation cost. This method comprises three stages. The first stage uses two multilayer neural networks (MLNs): MLNLF-DPF, which maps continuous acoustic features, or local features (LFs), onto discrete DPF features, and MLNDyn, which constrains the DPF context at the phoneme boundaries. The second stage incorporates inhibition/enhancement (In/En) functionalities to discriminate whether the DPF dynamic patterns of trajectories are convex or concave, where convex patterns are enhanced and concave patterns are inhibited. The third stage decorrelates the DPF vectors using the Gram-Schmidt orthogonalization procedure before feeding them into a hidden Markov model (HMM)-based classifier. In an experiment on Japanese Newspaper Article Sentences (JNAS) utterances, the proposed feature extractor, which incorporates two MLNs and an In/En network, was found to provide a higher phoneme correct rate with fewer mixture components in the HMMs.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E92.D.671/_p
@ARTICLE{e92-d_4_671,
author={Mohammad Nurul HUDA and Hiroaki KAWASHIMA and Tsuneo NITTA},
journal={IEICE TRANSACTIONS on Information and Systems},
title={Distinctive Phonetic Feature (DPF) Extraction Based on MLNs and Inhibition/Enhancement Network},
year={2009},
volume={E92-D},
number={4},
pages={671--680},
abstract={This paper describes a distinctive phonetic feature (DPF) extraction method for use in a phoneme recognition system; our method has a low computation cost. This method comprises three stages. The first stage uses two multilayer neural networks (MLNs): MLNLF-DPF, which maps continuous acoustic features, or local features (LFs), onto discrete DPF features, and MLNDyn, which constrains the DPF context at the phoneme boundaries. The second stage incorporates inhibition/enhancement (In/En) functionalities to discriminate whether the DPF dynamic patterns of trajectories are convex or concave, where convex patterns are enhanced and concave patterns are inhibited. The third stage decorrelates the DPF vectors using the Gram-Schmidt orthogonalization procedure before feeding them into a hidden Markov model (HMM)-based classifier. In an experiment on Japanese Newspaper Article Sentences (JNAS) utterances, the proposed feature extractor, which incorporates two MLNs and an In/En network, was found to provide a higher phoneme correct rate with fewer mixture components in the HMMs.},
keywords={},
doi={10.1587/transinf.E92.D.671},
ISSN={1745-1361},
month={April}
}
TY - JOUR
TI - Distinctive Phonetic Feature (DPF) Extraction Based on MLNs and Inhibition/Enhancement Network
T2 - IEICE TRANSACTIONS on Information and Systems
SP - 671
EP - 680
AU - Mohammad Nurul HUDA
AU - Hiroaki KAWASHIMA
AU - Tsuneo NITTA
PY - 2009
DO - 10.1587/transinf.E92.D.671
JO - IEICE TRANSACTIONS on Information and Systems
SN - 1745-1361
VL - E92-D
IS - 4
JA - IEICE TRANSACTIONS on Information and Systems
Y1 - April 2009
AB - This paper describes a distinctive phonetic feature (DPF) extraction method for use in a phoneme recognition system; our method has a low computation cost. This method comprises three stages. The first stage uses two multilayer neural networks (MLNs): MLNLF-DPF, which maps continuous acoustic features, or local features (LFs), onto discrete DPF features, and MLNDyn, which constrains the DPF context at the phoneme boundaries. The second stage incorporates inhibition/enhancement (In/En) functionalities to discriminate whether the DPF dynamic patterns of trajectories are convex or concave, where convex patterns are enhanced and concave patterns are inhibited. The third stage decorrelates the DPF vectors using the Gram-Schmidt orthogonalization procedure before feeding them into a hidden Markov model (HMM)-based classifier. In an experiment on Japanese Newspaper Article Sentences (JNAS) utterances, the proposed feature extractor, which incorporates two MLNs and an In/En network, was found to provide a higher phoneme correct rate with fewer mixture components in the HMMs.
ER -