The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
As one type of machine-learning model, a "decision-tree ensemble model" (DTEM) is represented by a set of decision trees. DTEMs are known to work well mainly on structured data; however, as with other machine-learning models, it is difficult to train one so that it returns the correct output value (the "prediction value") for every input value (the "attribute value"). Accordingly, when a DTEM is used in a system that requires reliability, it is important to comprehensively detect, during development, the attribute values that lead to system malfunctions (failures) and to take appropriate countermeasures. One conceivable solution is to install an input filter that controls the input to the DTEM and to use separate software to process attribute values that may lead to failures. Developing such an input filter requires specifying the filtering condition for the attribute values that lead to a malfunction. To meet that need, we propose a method that formally verifies a DTEM and, if the verification finds an attribute value leading to a failure, extracts the range in which such attribute values exist. Because the proposed method extracts these ranges comprehensively, an input filter built from them can prevent the failure. To demonstrate the feasibility of the proposed method, we performed a case study using a dataset of house prices. Through the case study, we also evaluated the method's scalability and show that the number and depth of the decision trees are important factors that determine its applicability.
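The idea of extracting violation ranges and turning them into an input filter can be illustrated with a toy example. The sketch below is not the verification procedure from the paper, whose details the abstract does not give. It assumes an additive ensemble and exhaustively enumerates the input regions induced by the leaves of each tree. The tree encoding, the two features (floor area and number of rooms), the price limit of 600, and all function names are hypothetical choices made for illustration.

```python
# A tree is either a leaf (a float prediction) or a tuple
# (feature_index, threshold, left_subtree, right_subtree);
# the left branch is taken when x[feature_index] <= threshold.

def leaf_boxes(tree, box):
    """Yield (sub_box, leaf_value) for every leaf reachable inside `box`.

    `box` holds one half-open interval (lo, hi] per feature.
    """
    if not isinstance(tree, tuple):
        yield box, tree
        return
    feat, thr, left, right = tree
    lo, hi = box[feat]
    if lo < thr:  # some points in the box satisfy x[feat] <= thr
        sub = list(box)
        sub[feat] = (lo, min(hi, thr))
        yield from leaf_boxes(left, tuple(sub))
    if hi > thr:  # some points in the box satisfy x[feat] > thr
        sub = list(box)
        sub[feat] = (max(lo, thr), hi)
        yield from leaf_boxes(right, tuple(sub))

def violation_ranges(trees, domain, is_violation):
    """Return every box in `domain` whose summed prediction is a violation."""
    regions = [(domain, 0.0)]
    for tree in trees:
        # Refine each region by the leaves of the next tree.
        regions = [(sub, total + value)
                   for box, total in regions
                   for sub, value in leaf_boxes(tree, box)]
    return [box for box, total in regions if is_violation(total)]

def make_input_filter(ranges):
    """Build a predicate that is True for inputs inside any violation box."""
    def rejects(x):
        return any(all(lo < v <= hi for v, (lo, hi) in zip(x, box))
                   for box in ranges)
    return rejects

def predict(trees, x):
    """Evaluate the additive ensemble on a single input."""
    total = 0.0
    for tree in trees:
        while isinstance(tree, tuple):
            feat, thr, left, right = tree
            tree = left if x[feat] <= thr else right
        total += tree
    return total

# Toy ensemble over (floor area, number of rooms); values are made up.
trees = [
    (0, 100.0, 100.0, (1, 3.0, 200.0, 400.0)),
    (0, 200.0, 50.0, 300.0),
]
domain = ((0.0, 300.0), (0.0, 10.0))

# Hypothetical property: the predicted price must not exceed 600.
ranges = violation_ranges(trees, domain, lambda price: price > 600.0)
rejects = make_input_filter(ranges)

print(ranges)                # boxes of (lo, hi] intervals per feature
print(rejects((250.0, 5.0)))  # input inside a violation range
print(rejects((150.0, 5.0)))  # input outside every violation range
```

In this sketch the number of regions can grow multiplicatively with each added tree and with tree depth, which is in line with the abstract's observation that the number and depth of the decision trees determine how far the approach scales.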
Naoto SATO
Hitachi, Ltd.
Hironobu KURUMA
Hitachi, Ltd.
Yuichiroh NAKAGAWA
Hitachi, Ltd.
Hideto OGAWA
Hitachi, Ltd.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Naoto SATO, Hironobu KURUMA, Yuichiroh NAKAGAWA, Hideto OGAWA, "Formal Verification of a Decision-Tree Ensemble Model and Detection of Its Violation Ranges" in IEICE TRANSACTIONS on Information, vol. E103-D, no. 2, pp. 363-378, February 2020, doi: 10.1587/transinf.2019EDP7120.
Abstract: As one type of machine-learning model, a “decision-tree ensemble model” (DTEM) is represented by a set of decision trees. A DTEM is mainly known to be valid for structured data; however, like other machine-learning models, it is difficult to train so that it returns the correct output value (called “prediction value”) for any input value (called “attribute value”). Accordingly, when a DTEM is used in regard to a system that requires reliability, it is important to comprehensively detect attribute values that lead to malfunctions of a system (failures) during development and take appropriate countermeasures. One conceivable solution is to install an input filter that controls the input to the DTEM and to use separate software to process attribute values that may lead to failures. To develop the input filter, it is necessary to specify the filtering condition for the attribute value that leads to the malfunction of the system. In consideration of that necessity, we propose a method for formally verifying a DTEM and, according to the result of the verification, if an attribute value leading to a failure is found, extracting the range in which such an attribute value exists. The proposed method can comprehensively extract the range in which the attribute value leading to the failure exists; therefore, by creating an input filter based on that range, it is possible to prevent the failure. To demonstrate the feasibility of the proposed method, we performed a case study using a dataset of house prices. Through the case study, we also evaluated its scalability and it is shown that the number and depth of decision trees are important factors that determines the applicability of the proposed method.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2019EDP7120/_p
@ARTICLE{e103-d_2_363,
author={Naoto SATO and Hironobu KURUMA and Yuichiroh NAKAGAWA and Hideto OGAWA},
journal={IEICE TRANSACTIONS on Information},
title={Formal Verification of a Decision-Tree Ensemble Model and Detection of Its Violation Ranges},
year={2020},
volume={E103-D},
number={2},
pages={363-378},
abstract={As one type of machine-learning model, a “decision-tree ensemble model” (DTEM) is represented by a set of decision trees. A DTEM is mainly known to be valid for structured data; however, like other machine-learning models, it is difficult to train so that it returns the correct output value (called “prediction value”) for any input value (called “attribute value”). Accordingly, when a DTEM is used in regard to a system that requires reliability, it is important to comprehensively detect attribute values that lead to malfunctions of a system (failures) during development and take appropriate countermeasures. One conceivable solution is to install an input filter that controls the input to the DTEM and to use separate software to process attribute values that may lead to failures. To develop the input filter, it is necessary to specify the filtering condition for the attribute value that leads to the malfunction of the system. In consideration of that necessity, we propose a method for formally verifying a DTEM and, according to the result of the verification, if an attribute value leading to a failure is found, extracting the range in which such an attribute value exists. The proposed method can comprehensively extract the range in which the attribute value leading to the failure exists; therefore, by creating an input filter based on that range, it is possible to prevent the failure. To demonstrate the feasibility of the proposed method, we performed a case study using a dataset of house prices. Through the case study, we also evaluated its scalability and it is shown that the number and depth of decision trees are important factors that determines the applicability of the proposed method.},
keywords={},
doi={10.1587/transinf.2019EDP7120},
ISSN={1745-1361},
month={February},}
TY - JOUR
TI - Formal Verification of a Decision-Tree Ensemble Model and Detection of Its Violation Ranges
T2 - IEICE TRANSACTIONS on Information
SP - 363
EP - 378
AU - Naoto SATO
AU - Hironobu KURUMA
AU - Yuichiroh NAKAGAWA
AU - Hideto OGAWA
PY - 2020
DO - 10.1587/transinf.2019EDP7120
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E103-D
IS - 2
JA - IEICE TRANSACTIONS on Information
Y1 - February 2020
AB - As one type of machine-learning model, a “decision-tree ensemble model” (DTEM) is represented by a set of decision trees. A DTEM is mainly known to be valid for structured data; however, like other machine-learning models, it is difficult to train so that it returns the correct output value (called “prediction value”) for any input value (called “attribute value”). Accordingly, when a DTEM is used in regard to a system that requires reliability, it is important to comprehensively detect attribute values that lead to malfunctions of a system (failures) during development and take appropriate countermeasures. One conceivable solution is to install an input filter that controls the input to the DTEM and to use separate software to process attribute values that may lead to failures. To develop the input filter, it is necessary to specify the filtering condition for the attribute value that leads to the malfunction of the system. In consideration of that necessity, we propose a method for formally verifying a DTEM and, according to the result of the verification, if an attribute value leading to a failure is found, extracting the range in which such an attribute value exists. The proposed method can comprehensively extract the range in which the attribute value leading to the failure exists; therefore, by creating an input filter based on that range, it is possible to prevent the failure. To demonstrate the feasibility of the proposed method, we performed a case study using a dataset of house prices. Through the case study, we also evaluated its scalability and it is shown that the number and depth of decision trees are important factors that determines the applicability of the proposed method.
ER -