The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
Copyrights notice
We describe a statistical parametric speech synthesis system developed by a joint group from the Nagoya Institute of Technology (Nitech) and the Nara Institute of Science and Technology (NAIST) for the annual open evaluation of text-to-speech synthesis systems named Blizzard Challenge 2006. To improve our 2005 system (Nitech-HTS 2005), we investigated new features such as mel-generalized cepstrum-based line spectral pairs (MGC-LSPs), maximum likelihood linear transform (MLLT), and a full covariance global variance (GV) probability density function (pdf). A combination of mel-cepstral coefficients, MLLT, and full covariance GV pdf scored highest in subjective listening tests, and the 2006 system performed significantly better than the 2005 system. The Blizzard Challenge 2006 evaluations show that Nitech-NAIST-HTS 2006 is competitive even when working with relatively large speech databases.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Heiga ZEN, Tomoki TODA, Keiichi TOKUDA, "The Nitech-NAIST HMM-Based Speech Synthesis System for the Blizzard Challenge 2006" in IEICE TRANSACTIONS on Information,
vol. E91-D, no. 6, pp. 1764-1773, June 2008, doi: 10.1093/ietisy/e91-d.6.1764.
Abstract: We describe a statistical parametric speech synthesis system developed by a joint group from the Nagoya Institute of Technology (Nitech) and the Nara Institute of Science and Technology (NAIST) for the annual open evaluation of text-to-speech synthesis systems named Blizzard Challenge 2006. To improve our 2005 system (Nitech-HTS 2005), we investigated new features such as mel-generalized cepstrum-based line spectral pairs (MGC-LSPs), maximum likelihood linear transform (MLLT), and a full covariance global variance (GV) probability density function (pdf). A combination of mel-cepstral coefficients, MLLT, and full covariance GV pdf scored highest in subjective listening tests, and the 2006 system performed significantly better than the 2005 system. The Blizzard Challenge 2006 evaluations show that Nitech-NAIST-HTS 2006 is competitive even when working with relatively large speech databases.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e91-d.6.1764/_p
@ARTICLE{e91-d_6_1764,
author={Heiga ZEN and Tomoki TODA and Keiichi TOKUDA},
journal={IEICE TRANSACTIONS on Information},
title={The Nitech-NAIST HMM-Based Speech Synthesis System for the Blizzard Challenge 2006},
year={2008},
volume={E91-D},
number={6},
pages={1764-1773},
abstract={We describe a statistical parametric speech synthesis system developed by a joint group from the Nagoya Institute of Technology (Nitech) and the Nara Institute of Science and Technology (NAIST) for the annual open evaluation of text-to-speech synthesis systems named Blizzard Challenge 2006. To improve our 2005 system (Nitech-HTS 2005), we investigated new features such as mel-generalized cepstrum-based line spectral pairs (MGC-LSPs), maximum likelihood linear transform (MLLT), and a full covariance global variance (GV) probability density function (pdf). A combination of mel-cepstral coefficients, MLLT, and full covariance GV pdf scored highest in subjective listening tests, and the 2006 system performed significantly better than the 2005 system. The Blizzard Challenge 2006 evaluations show that Nitech-NAIST-HTS 2006 is competitive even when working with relatively large speech databases.},
keywords={},
doi={10.1093/ietisy/e91-d.6.1764},
ISSN={1745-1361},
month={June},}
TY - JOUR
TI - The Nitech-NAIST HMM-Based Speech Synthesis System for the Blizzard Challenge 2006
T2 - IEICE TRANSACTIONS on Information
SP - 1764
EP - 1773
AU - Heiga ZEN
AU - Tomoki TODA
AU - Keiichi TOKUDA
PY - 2008
DO - 10.1093/ietisy/e91-d.6.1764
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E91-D
IS - 6
JA - IEICE TRANSACTIONS on Information
Y1 - June 2008
AB - We describe a statistical parametric speech synthesis system developed by a joint group from the Nagoya Institute of Technology (Nitech) and the Nara Institute of Science and Technology (NAIST) for the annual open evaluation of text-to-speech synthesis systems named Blizzard Challenge 2006. To improve our 2005 system (Nitech-HTS 2005), we investigated new features such as mel-generalized cepstrum-based line spectral pairs (MGC-LSPs), maximum likelihood linear transform (MLLT), and a full covariance global variance (GV) probability density function (pdf). A combination of mel-cepstral coefficients, MLLT, and full covariance GV pdf scored highest in subjective listening tests, and the 2006 system performed significantly better than the 2005 system. The Blizzard Challenge 2006 evaluations show that Nitech-NAIST-HTS 2006 is competitive even when working with relatively large speech databases.
ER -