The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
One of the most well-studied problems in data mining is computing the collection of frequent itemsets in large transactional databases. Since the introduction of the famous Apriori algorithm [14], many others have been proposed to find the frequent itemsets. Among such algorithms, the approach of mining closed itemsets has raised much interest in the data mining community. The algorithms taking this approach include TITANIC [8], CLOSET+ [6], DCI-Closed [4], FCI-Stream [3], GC-Tree [5], TGC-Tree [16], etc. Among these algorithms, FCI-Stream, GC-Tree and TGC-Tree are online algorithms that work under sliding window environments. By the performance evaluation in [16], GC-Tree [15] is the fastest one. In this paper, an improved algorithm based on GC-Tree is proposed, the computational complexity of which is proved to be a linear combination of the average transaction size and the average closed itemset size. The algorithm is based on the essential theorem presented in Sect. 4.2. Empirically, the new algorithm is several orders of magnitude faster than the state-of-the-art algorithm, GC-Tree.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Junbo CHEN, Bo ZHOU, Lu CHEN, Xinyu WANG, Yiqun DING, "Finding Frequent Closed Itemsets in Sliding Window in Linear Time" in IEICE TRANSACTIONS on Information,
vol. E91-D, no. 10, pp. 2406-2418, October 2008, doi: 10.1093/ietisy/e91-d.10.2406.
Abstract: One of the most well-studied problems in data mining is computing the collection of frequent itemsets in large transactional databases. Since the introduction of the famous Apriori algorithm [14], many others have been proposed to find the frequent itemsets. Among such algorithms, the approach of mining closed itemsets has raised much interest in the data mining community. The algorithms taking this approach include TITANIC [8], CLOSET+ [6], DCI-Closed [4], FCI-Stream [3], GC-Tree [5], TGC-Tree [16], etc. Among these algorithms, FCI-Stream, GC-Tree and TGC-Tree are online algorithms that work under sliding window environments. By the performance evaluation in [16], GC-Tree [15] is the fastest one. In this paper, an improved algorithm based on GC-Tree is proposed, the computational complexity of which is proved to be a linear combination of the average transaction size and the average closed itemset size. The algorithm is based on the essential theorem presented in Sect. 4.2. Empirically, the new algorithm is several orders of magnitude faster than the state-of-the-art algorithm, GC-Tree.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e91-d.10.2406/_p
@ARTICLE{e91-d_10_2406,
author={Junbo CHEN and Bo ZHOU and Lu CHEN and Xinyu WANG and Yiqun DING},
journal={IEICE TRANSACTIONS on Information},
title={Finding Frequent Closed Itemsets in Sliding Window in Linear Time},
year={2008},
volume={E91-D},
number={10},
pages={2406--2418},
abstract={One of the most well-studied problems in data mining is computing the collection of frequent itemsets in large transactional databases. Since the introduction of the famous Apriori algorithm [14], many others have been proposed to find the frequent itemsets. Among such algorithms, the approach of mining closed itemsets has raised much interest in the data mining community. The algorithms taking this approach include TITANIC [8], CLOSET+ [6], DCI-Closed [4], FCI-Stream [3], GC-Tree [5], TGC-Tree [16], etc. Among these algorithms, FCI-Stream, GC-Tree and TGC-Tree are online algorithms that work under sliding window environments. By the performance evaluation in [16], GC-Tree [15] is the fastest one. In this paper, an improved algorithm based on GC-Tree is proposed, the computational complexity of which is proved to be a linear combination of the average transaction size and the average closed itemset size. The algorithm is based on the essential theorem presented in Sect. 4.2. Empirically, the new algorithm is several orders of magnitude faster than the state-of-the-art algorithm, GC-Tree.},
keywords={},
doi={10.1093/ietisy/e91-d.10.2406},
ISSN={1745-1361},
month={October},}
TY - JOUR
TI - Finding Frequent Closed Itemsets in Sliding Window in Linear Time
T2 - IEICE TRANSACTIONS on Information
SP - 2406
EP - 2418
AU - Junbo CHEN
AU - Bo ZHOU
AU - Lu CHEN
AU - Xinyu WANG
AU - Yiqun DING
PY - 2008
DO - 10.1093/ietisy/e91-d.10.2406
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E91-D
IS - 10
JA - IEICE TRANSACTIONS on Information
Y1 - October 2008
AB - One of the most well-studied problems in data mining is computing the collection of frequent itemsets in large transactional databases. Since the introduction of the famous Apriori algorithm [14], many others have been proposed to find the frequent itemsets. Among such algorithms, the approach of mining closed itemsets has raised much interest in the data mining community. The algorithms taking this approach include TITANIC [8], CLOSET+ [6], DCI-Closed [4], FCI-Stream [3], GC-Tree [5], TGC-Tree [16], etc. Among these algorithms, FCI-Stream, GC-Tree and TGC-Tree are online algorithms that work under sliding window environments. By the performance evaluation in [16], GC-Tree [15] is the fastest one. In this paper, an improved algorithm based on GC-Tree is proposed, the computational complexity of which is proved to be a linear combination of the average transaction size and the average closed itemset size. The algorithm is based on the essential theorem presented in Sect. 4.2. Empirically, the new algorithm is several orders of magnitude faster than the state-of-the-art algorithm, GC-Tree.
ER -