The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
Bag-of-Visual-Words representation has recently become popular for scene classification. However, learning visual words in an unsupervised manner runs into trouble when patches with similar appearance correspond to different semantic concepts. This paper proposes a novel supervised learning framework that aims to take full advantage of label information to address this problem. Specifically, Gaussian Mixture Modeling (GMM) is first applied to obtain a "semantic interpretation" of patches using scene labels: each scene induces a probability density on the low-level visual feature space, and each patch is represented as a vector of posterior probabilities over scene semantic concepts. The Information Bottleneck (IB) algorithm is then introduced to cluster the patches into "visual words" in a supervised manner, from the perspective of these semantic interpretations. Such an operation can maximize the semantic information of the visual words. Once the visual words are obtained, the frequencies with which they occur in a given image form a histogram, which can then be used for scene categorization with a Support Vector Machine (SVM) classifier. Experiments on a challenging dataset show that the proposed visual words perform scene classification better than most existing methods.
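The pipeline summarized above (per-scene densities, posterior vectors, IB clustering, word histograms) can be illustrated with a small sketch. This is not the authors' implementation. It assumes a single diagonal Gaussian per scene in place of a full GMM, uses the greedy agglomerative variant of the Information Bottleneck, and runs on made-up 2-D "patch features". All function names, parameters, and data here are hypothetical.

```python
import numpy as np

def scene_posteriors(patches, means, variances, priors):
    """p(scene | patch), assuming ONE diagonal Gaussian per scene (a 1-component GMM)."""
    diff = patches[:, None, :] - means[None, :, :]            # (n_patches, n_scenes, dim)
    log_lik = -0.5 * np.sum(diff**2 / variances + np.log(2 * np.pi * variances), axis=2)
    log_post = log_lik + np.log(priors)
    log_post -= log_post.max(axis=1, keepdims=True)           # stabilise before exp
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

def kl(p, q):
    """Kullback-Leibler divergence, skipping zero entries of p."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def merge_cost(w_i, p_i, w_j, p_j):
    """Loss of I(word; scene) when merging two clusters: (w_i + w_j) * weighted JS divergence."""
    total = w_i + w_j
    mix = (w_i * p_i + w_j * p_j) / total
    js = (w_i / total) * kl(p_i, mix) + (w_j / total) * kl(p_j, mix)
    return total * js

def agglomerative_ib(posteriors, n_words):
    """Greedy agglomerative IB: start with one cluster per patch, merge the cheapest pair.

    The pairwise search is cubic overall, so this is only suitable for toy inputs.
    """
    n = len(posteriors)
    weights = [1.0 / n] * n                 # p(cluster)
    dists = [row.copy() for row in posteriors]  # p(scene | cluster)
    members = [[i] for i in range(n)]
    while len(dists) > n_words:
        best = None
        for a in range(len(dists)):
            for b in range(a + 1, len(dists)):
                cost = merge_cost(weights[a], dists[a], weights[b], dists[b])
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        total = weights[a] + weights[b]
        dists[a] = (weights[a] * dists[a] + weights[b] * dists[b]) / total
        weights[a] = total
        members[a] += members[b]
        del weights[b], dists[b], members[b]
    labels = np.empty(n, dtype=int)
    for word, idx in enumerate(members):
        labels[idx] = word
    return labels

def word_histogram(labels, n_words):
    """Normalised visual-word frequency histogram for the patches of one image."""
    counts = np.bincount(labels, minlength=n_words).astype(float)
    return counts / counts.sum()

# Toy data (hypothetical): four patches near each of two scene means.
patches = np.array([[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0], [0.1, 0.2],
                    [5.0, 5.1], [5.2, 4.9], [4.9, 5.0], [5.1, 5.2]])
means = np.array([[0.0, 0.0], [5.0, 5.0]])
variances = np.ones((2, 2))
priors = np.array([0.5, 0.5])

posteriors = scene_posteriors(patches, means, variances, priors)
labels = agglomerative_ib(posteriors, n_words=2)
hist = word_histogram(labels[:5], n_words=2)   # an "image" made of the first five patches
print(labels, hist)
```

In a fuller version, one such histogram per image would be passed to an SVM classifier (for example scikit-learn's `sklearn.svm.SVC`), which is omitted here to keep the sketch dependent on NumPy only.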
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Shuoyan LIU, De XU, Songhe FENG, "Discriminating Semantic Visual Words for Scene Classification" in IEICE TRANSACTIONS on Information,
vol. E93-D, no. 6, pp. 1580-1588, June 2010, doi: 10.1587/transinf.E93.D.1580.
Abstract: Bag-of-Visual-Words representation has recently become popular for scene classification. However, learning the visual words in an unsupervised manner suffers when patches with similar appearances correspond to distinct semantic concepts. This paper proposes a novel supervised learning framework, which aims at taking full advantage of label information to address the problem. Specifically, Gaussian Mixture Modeling (GMM) is first applied to obtain a "semantic interpretation" of patches using scene labels. Each scene induces a probability density on the low-level visual feature space, and patches are represented as vectors of posterior scene semantic concept probabilities. The Information Bottleneck (IB) algorithm is then introduced to cluster the patches into "visual words" in a supervised manner, from the perspective of semantic interpretations. Such an operation can maximize the semantic information of the visual words. Once the visual words are obtained, the frequency of occurrence of the corresponding visual words in a given image forms a histogram, which can subsequently be used in the scene categorization task via the Support Vector Machine (SVM) classifier. Experiments on a challenging dataset show that the proposed visual words perform the scene classification task better than most existing methods.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E93.D.1580/_p
@ARTICLE{e93-d_6_1580,
author={Shuoyan LIU and De XU and Songhe FENG},
journal={IEICE TRANSACTIONS on Information},
title={Discriminating Semantic Visual Words for Scene Classification},
year={2010},
volume={E93-D},
number={6},
pages={1580-1588},
abstract={Bag-of-Visual-Words representation has recently become popular for scene classification. However, learning the visual words in an unsupervised manner suffers when patches with similar appearances correspond to distinct semantic concepts. This paper proposes a novel supervised learning framework, which aims at taking full advantage of label information to address the problem. Specifically, Gaussian Mixture Modeling (GMM) is first applied to obtain a "semantic interpretation" of patches using scene labels. Each scene induces a probability density on the low-level visual feature space, and patches are represented as vectors of posterior scene semantic concept probabilities. The Information Bottleneck (IB) algorithm is then introduced to cluster the patches into "visual words" in a supervised manner, from the perspective of semantic interpretations. Such an operation can maximize the semantic information of the visual words. Once the visual words are obtained, the frequency of occurrence of the corresponding visual words in a given image forms a histogram, which can subsequently be used in the scene categorization task via the Support Vector Machine (SVM) classifier. Experiments on a challenging dataset show that the proposed visual words perform the scene classification task better than most existing methods.},
keywords={},
doi={10.1587/transinf.E93.D.1580},
ISSN={1745-1361},
month={June},
}
TY - JOUR
TI - Discriminating Semantic Visual Words for Scene Classification
T2 - IEICE TRANSACTIONS on Information
SP - 1580
EP - 1588
AU - Shuoyan LIU
AU - De XU
AU - Songhe FENG
PY - 2010
DO - 10.1587/transinf.E93.D.1580
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E93-D
IS - 6
JA - IEICE TRANSACTIONS on Information
Y1 - June 2010
AB - Bag-of-Visual-Words representation has recently become popular for scene classification. However, learning the visual words in an unsupervised manner suffers when patches with similar appearances correspond to distinct semantic concepts. This paper proposes a novel supervised learning framework, which aims at taking full advantage of label information to address the problem. Specifically, Gaussian Mixture Modeling (GMM) is first applied to obtain a "semantic interpretation" of patches using scene labels. Each scene induces a probability density on the low-level visual feature space, and patches are represented as vectors of posterior scene semantic concept probabilities. The Information Bottleneck (IB) algorithm is then introduced to cluster the patches into "visual words" in a supervised manner, from the perspective of semantic interpretations. Such an operation can maximize the semantic information of the visual words. Once the visual words are obtained, the frequency of occurrence of the corresponding visual words in a given image forms a histogram, which can subsequently be used in the scene categorization task via the Support Vector Machine (SVM) classifier. Experiments on a challenging dataset show that the proposed visual words perform the scene classification task better than most existing methods.
ER -