The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
As interaction between humans and computers moves toward its most natural form, incorporating emotion becomes increasingly urgent. This paper describes a step toward extending research on emotion recognition to Indonesian. The field continues to develop, yet work on Indonesian is still lacking. In particular, this paper highlights two contributions: (1) the construction of the first emotional audio-visual database in Indonesian, and (2) the first multimodal emotion recognizer in Indonesian, built from that corpus. In constructing the corpus, we aim for natural emotions that correspond to real-life occurrences. However, collecting emotional corpora is labor-intensive and expensive. To reduce the cost, we collect emotional data from recordings of television programs, eliminating the need for an elaborate recording setup and experienced participants. Specifically, we choose television talk shows because their natural conversational content yields spontaneous occurrences of emotion. To cover a broad range of emotions, we collected three episodes in different genres: politics, humanity, and entertainment. In this paper, we report points of analysis of the data and annotations. The acquisition of the emotion corpus serves as a foundation for further research on emotion. Subsequently, in the experiment, we employ the support vector machine (SVM) algorithm to model the emotions in the collected data. We perform multimodal emotion recognition using the predictions of three modalities: acoustic, semantic, and visual. Compared with the unimodal results, the multimodal feature combination attains identical accuracy for arousal at 92.6% and a significant improvement for the valence classification task at 93.8%. We hope to continue this work and move toward a finer-grained, more precise quantification of emotion.
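The abstract states that the recognizer combines the predictions of three modalities but does not say how they are combined. As a rough illustration only, the sketch below fuses per-modality binary labels by majority vote. The voting rule, the function names, and all label values are assumptions made for this example and are not taken from the paper. In practice each label list would come from a classifier, such as an SVM, trained on one modality.

```python
from collections import Counter


def fuse_by_majority(predictions):
    """Combine per-modality label lists by majority vote (assumed rule).

    `predictions` maps a modality name to a list of labels, one label per
    segment. With three modalities and binary labels there are no ties.
    """
    lengths = {len(labels) for labels in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("every modality needs one prediction per segment")
    fused = []
    for votes in zip(*predictions.values()):
        # most_common(1) returns [(label, count)] for the top-voted label.
        label, _count = Counter(votes).most_common(1)[0]
        fused.append(label)
    return fused


def accuracy(predicted, reference):
    """Fraction of segments whose predicted label equals the reference."""
    if len(predicted) != len(reference):
        raise ValueError("length mismatch between prediction and reference")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


# Made-up labels for five segments (1 = high arousal, 0 = low arousal).
example_predictions = {
    "acoustic": [1, 0, 1, 1, 0],
    "semantic": [1, 1, 0, 1, 0],
    "visual": [0, 0, 1, 1, 1],
}
example_reference = [1, 0, 1, 0, 0]

fused_labels = fuse_by_majority(example_predictions)
print(fused_labels)
print(accuracy(fused_labels, example_reference))
```

Majority voting is only one way to merge unimodal outputs. Another is to train a second-stage classifier on the per-modality predictions, which needs held-out data to fit the fusion step.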
Nurul LUBIS
Nara Institute of Science and Technology
Dessi LESTARI
Institut Teknologi Bandung
Sakriani SAKTI
Nara Institute of Science and Technology, RIKEN Center for Advanced Intelligence Project (AIP)
Ayu PURWARIANTI
Institut Teknologi Bandung
Satoshi NAKAMURA
Nara Institute of Science and Technology, RIKEN Center for Advanced Intelligence Project (AIP)
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Nurul LUBIS, Dessi LESTARI, Sakriani SAKTI, Ayu PURWARIANTI, Satoshi NAKAMURA, "Construction of Spontaneous Emotion Corpus from Indonesian TV Talk Shows and Its Application on Multimodal Emotion Recognition" in IEICE TRANSACTIONS on Information, vol. E101-D, no. 8, pp. 2092-2100, August 2018, doi: 10.1587/transinf.2017EDP7362.
Abstract: As interaction between human and computer continues to develop to the most natural form possible, it becomes increasingly urgent to incorporate emotion in the equation. This paper describes a step toward extending the research on emotion recognition to Indonesian. The field continues to develop, yet exploration of the subject in Indonesian is still lacking. In particular, this paper highlights two contributions: (1) the construction of the first emotional audio-visual database in Indonesian, and (2) the first multimodal emotion recognizer in Indonesian, built from the aforementioned corpus. In constructing the corpus, we aim at natural emotions that are corresponding to real life occurrences. However, the collection of emotional corpora is notably labor intensive and expensive. To diminish the cost, we collect the emotional data from television programs recordings, eliminating the need of an elaborate recording set up and experienced participants. In particular, we choose television talk shows due to its natural conversational content, yielding spontaneous emotion occurrences. To cover a broad range of emotions, we collected three episodes in different genres: politics, humanity, and entertainment. In this paper, we report points of analysis of the data and annotations. The acquisition of the emotion corpus serves as a foundation in further research on emotion. Subsequently, in the experiment, we employ the support vector machine (SVM) algorithm to model the emotions in the collected data. We perform multimodal emotion recognition utilizing the predictions of three modalities: acoustic, semantic, and visual. When compared to the unimodal result, in the multimodal feature combination, we attain identical accuracy for the arousal at 92.6%, and a significant improvement for the valence classification task at 93.8%. We hope to continue this work and move towards a finer-grain, more precise quantification of emotion.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2017EDP7362/_p
@ARTICLE{e101-d_8_2092,
author={Nurul LUBIS and Dessi LESTARI and Sakriani SAKTI and Ayu PURWARIANTI and Satoshi NAKAMURA},
journal={IEICE TRANSACTIONS on Information},
title={Construction of Spontaneous Emotion Corpus from Indonesian TV Talk Shows and Its Application on Multimodal Emotion Recognition},
year={2018},
volume={E101-D},
number={8},
pages={2092--2100},
abstract={As interaction between human and computer continues to develop to the most natural form possible, it becomes increasingly urgent to incorporate emotion in the equation. This paper describes a step toward extending the research on emotion recognition to Indonesian. The field continues to develop, yet exploration of the subject in Indonesian is still lacking. In particular, this paper highlights two contributions: (1) the construction of the first emotional audio-visual database in Indonesian, and (2) the first multimodal emotion recognizer in Indonesian, built from the aforementioned corpus. In constructing the corpus, we aim at natural emotions that are corresponding to real life occurrences. However, the collection of emotional corpora is notably labor intensive and expensive. To diminish the cost, we collect the emotional data from television programs recordings, eliminating the need of an elaborate recording set up and experienced participants. In particular, we choose television talk shows due to its natural conversational content, yielding spontaneous emotion occurrences. To cover a broad range of emotions, we collected three episodes in different genres: politics, humanity, and entertainment. In this paper, we report points of analysis of the data and annotations. The acquisition of the emotion corpus serves as a foundation in further research on emotion. Subsequently, in the experiment, we employ the support vector machine (SVM) algorithm to model the emotions in the collected data. We perform multimodal emotion recognition utilizing the predictions of three modalities: acoustic, semantic, and visual. When compared to the unimodal result, in the multimodal feature combination, we attain identical accuracy for the arousal at 92.6%, and a significant improvement for the valence classification task at 93.8%. We hope to continue this work and move towards a finer-grain, more precise quantification of emotion.},
keywords={},
doi={10.1587/transinf.2017EDP7362},
ISSN={1745-1361},
month={August},
}
TY - JOUR
TI - Construction of Spontaneous Emotion Corpus from Indonesian TV Talk Shows and Its Application on Multimodal Emotion Recognition
T2 - IEICE TRANSACTIONS on Information
SP - 2092
EP - 2100
AU - Nurul LUBIS
AU - Dessi LESTARI
AU - Sakriani SAKTI
AU - Ayu PURWARIANTI
AU - Satoshi NAKAMURA
PY - 2018
DO - 10.1587/transinf.2017EDP7362
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E101-D
IS - 8
JA - IEICE TRANSACTIONS on Information
Y1 - August 2018
AB - As interaction between human and computer continues to develop to the most natural form possible, it becomes increasingly urgent to incorporate emotion in the equation. This paper describes a step toward extending the research on emotion recognition to Indonesian. The field continues to develop, yet exploration of the subject in Indonesian is still lacking. In particular, this paper highlights two contributions: (1) the construction of the first emotional audio-visual database in Indonesian, and (2) the first multimodal emotion recognizer in Indonesian, built from the aforementioned corpus. In constructing the corpus, we aim at natural emotions that are corresponding to real life occurrences. However, the collection of emotional corpora is notably labor intensive and expensive. To diminish the cost, we collect the emotional data from television programs recordings, eliminating the need of an elaborate recording set up and experienced participants. In particular, we choose television talk shows due to its natural conversational content, yielding spontaneous emotion occurrences. To cover a broad range of emotions, we collected three episodes in different genres: politics, humanity, and entertainment. In this paper, we report points of analysis of the data and annotations. The acquisition of the emotion corpus serves as a foundation in further research on emotion. Subsequently, in the experiment, we employ the support vector machine (SVM) algorithm to model the emotions in the collected data. We perform multimodal emotion recognition utilizing the predictions of three modalities: acoustic, semantic, and visual. When compared to the unimodal result, in the multimodal feature combination, we attain identical accuracy for the arousal at 92.6%, and a significant improvement for the valence classification task at 93.8%. We hope to continue this work and move towards a finer-grain, more precise quantification of emotion.
ER -