The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
Copyrights notice
The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. Copyrights notice
Kami mengkaji masalah carian persamaan berterusan untuk pertanyaan yang berkembang yang baru-baru ini dirumuskan. Memandangkan aliran data dan pangkalan data yang terdiri daripada n set item, tujuan masalah ini adalah untuk mengekalkank kebanyakan set serupa dengan pertanyaan yang berkembang dari semasa ke semasa dan terdiri daripada yang terkini W item dalam aliran data. Untuk masalah ini, algoritma tepat sebelum ini menggunakan strategi pemangkasan yang, pada masa ini T, memutuskan calon-calon tertinggik kebanyakan set serupa daripada nilai persamaan masa lalu dan mengira nilai persamaan hanya untuk mereka. Makalah ini mencadangkan algoritma tepat baharu yang memendekkan masa pelaksanaan dengan mengira nilai persamaan hanya untuk set yang nilai persamaannya pada T boleh berubah mengikut masa T-1. Kami mengenal pasti set sedemikian dengan sangat pantas dengan senarai songsang berasaskan frekuensi (FIL). Selain itu, kami memperoleh nilai persamaan di T in O(1) masa dengan mengemas kini nilai sebelumnya yang dikira pada masa T-1. Secara eksperimen, algoritma tepat kami berjalan lebih pantas daripada algoritma tepat sebelumnya dengan satu susunan magnitud dan sepantas algoritma penghampiran sebelumnya.
Tomohiro YAMAZAKI
Engineering, the Univeristy of Electro-Communications
Hisashi KOGA
Engineering, the Univeristy of Electro-Communications
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Salinan
Tomohiro YAMAZAKI, Hisashi KOGA, "Exact Algorithm to Solve Continuous Similarity Search for Evolving Queries and Its Variant" in IEICE TRANSACTIONS on Information,
vol. E105-D, no. 5, pp. 898-908, May 2022, doi: 10.1587/transinf.2021DAP0003.
Abstract: We study the continuous similarity search problem for evolving queries which has recently been formulated. Given a data stream and a database composed of n sets of items, the purpose of this problem is to maintain the top-k most similar sets to the query which evolves over time and consists of the latest W items in the data stream. For this problem, the previous exact algorithm adopts a pruning strategy which, at the present time T, decides the candidates of the top-k most similar sets from past similarity values and computes the similarity values only for them. This paper proposes a new exact algorithm which shortens the execution time by computing the similarity values only for sets whose similarity values at T can change from time T-1. We identify such sets very fast with frequency-based inverted lists (FIL). Moreover, we derive the similarity values at T in O(1) time by updating the previous values computed at time T-1. Experimentally, our exact algorithm runs faster than the previous exact algorithm by one order of magnitude and as fast as the previous approximation algorithm.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2021DAP0003/_p
Salinan
@ARTICLE{e105-d_5_898,
author={Tomohiro YAMAZAKI, Hisashi KOGA, },
journal={IEICE TRANSACTIONS on Information},
title={Exact Algorithm to Solve Continuous Similarity Search for Evolving Queries and Its Variant},
year={2022},
volume={E105-D},
number={5},
pages={898-908},
abstract={We study the continuous similarity search problem for evolving queries which has recently been formulated. Given a data stream and a database composed of n sets of items, the purpose of this problem is to maintain the top-k most similar sets to the query which evolves over time and consists of the latest W items in the data stream. For this problem, the previous exact algorithm adopts a pruning strategy which, at the present time T, decides the candidates of the top-k most similar sets from past similarity values and computes the similarity values only for them. This paper proposes a new exact algorithm which shortens the execution time by computing the similarity values only for sets whose similarity values at T can change from time T-1. We identify such sets very fast with frequency-based inverted lists (FIL). Moreover, we derive the similarity values at T in O(1) time by updating the previous values computed at time T-1. Experimentally, our exact algorithm runs faster than the previous exact algorithm by one order of magnitude and as fast as the previous approximation algorithm.},
keywords={},
doi={10.1587/transinf.2021DAP0003},
ISSN={1745-1361},
month={May},}
Salinan
TY - JOUR
TI - Exact Algorithm to Solve Continuous Similarity Search for Evolving Queries and Its Variant
T2 - IEICE TRANSACTIONS on Information
SP - 898
EP - 908
AU - Tomohiro YAMAZAKI
AU - Hisashi KOGA
PY - 2022
DO - 10.1587/transinf.2021DAP0003
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E105-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2022
AB - We study the continuous similarity search problem for evolving queries which has recently been formulated. Given a data stream and a database composed of n sets of items, the purpose of this problem is to maintain the top-k most similar sets to the query which evolves over time and consists of the latest W items in the data stream. For this problem, the previous exact algorithm adopts a pruning strategy which, at the present time T, decides the candidates of the top-k most similar sets from past similarity values and computes the similarity values only for them. This paper proposes a new exact algorithm which shortens the execution time by computing the similarity values only for sets whose similarity values at T can change from time T-1. We identify such sets very fast with frequency-based inverted lists (FIL). Moreover, we derive the similarity values at T in O(1) time by updating the previous values computed at time T-1. Experimentally, our exact algorithm runs faster than the previous exact algorithm by one order of magnitude and as fast as the previous approximation algorithm.
ER -