Efficient Incremental Sequence Mining for Temporal Databases

Title: Efficient Incremental Sequence Mining for Temporal Databases

Abstract: This research proposes an efficient algorithm called IUS (Incrementally Updating Sequential Patterns) for mining sequential patterns in a temporal database. The algorithm reuses the frequent sequences and the negative border sequences of the original database to perform incremental sequence mining, and a companion algorithm called DUS (Decreasingly Updating Sequential Patterns) maintains the discovered patterns when old data are deleted. The study defines a threshold called Min_nbd_supp to control the number of negative border sequences that must be kept, which bounds the memory and computational costs. To generate candidates for the updated database, the IUS algorithm extends both the prefix and the suffix of the sequences that were frequent in the original database. Experiments on GSM alarm data compare the IUS algorithm with the Robust_search algorithm and show speedups of between 1 and 9, making IUS an effective solution for incremental sequence mining in temporal databases.
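The candidate generation step described above can be illustrated with a short sketch. The Python code below is not the authors' implementation: the tuple representation of sequences, the helper count_support, and the parameter names min_supp and min_nbd_supp are assumptions made for illustration. It shows how sequences that were frequent in the original database are extended at both the prefix and the suffix, and how the retained negative border is re-checked against the updated database.

```python
from collections import defaultdict

# Illustrative sketch only (not the authors' code): sequences are tuples of
# items, supports are absolute counts, and min_supp / min_nbd_supp stand in
# for the frequency threshold and the Min_nbd_supp threshold of the paper.

def count_support(candidates, database):
    """Count, for each candidate, how many database sequences contain it
    as an (order-preserving, not necessarily contiguous) subsequence."""
    def contains(seq, cand):
        it = iter(seq)
        return all(item in it for item in cand)

    support = defaultdict(int)
    for seq in database:
        for cand in candidates:
            if contains(seq, cand):
                support[cand] += 1
    return support

def ius_candidates(frequent, negative_border, items):
    """Candidates for the updated database: previously frequent sequences,
    retained negative-border sequences, and prefix/suffix extensions of the
    previously frequent sequences."""
    candidates = set(frequent) | set(negative_border)
    for seq in frequent:
        for item in items:
            candidates.add((item,) + seq)  # extend the prefix
            candidates.add(seq + (item,))  # extend the suffix
    return candidates

def ius_update(old_frequent, old_negative_border, updated_db,
               min_supp, min_nbd_supp):
    """One incremental update: recount the candidates on the updated
    database, then split them into frequent sequences and the new
    negative border kept for later updates."""
    items = {item for seq in old_frequent for item in seq}
    candidates = ius_candidates(old_frequent, old_negative_border, items)
    support = count_support(candidates, updated_db)
    frequent = {c for c in candidates if support[c] >= min_supp}
    negative_border = {c for c in candidates
                       if min_nbd_supp <= support[c] < min_supp}
    return frequent, negative_border

if __name__ == "__main__":
    updated_db = [("a", "b", "c"), ("a", "c"), ("b", "c"), ("a", "b", "c", "d")]
    freq, border = ius_update({("a",), ("c",), ("a", "c")}, {("b", "c")},
                              updated_db, min_supp=3, min_nbd_supp=2)
    print(freq)
    print(border)
```

In this sketch, candidates whose support falls between Min_nbd_supp and the frequency threshold form the new negative border, so the information needed for the next incremental update is preserved.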

Main Research Question: How can we efficiently update sequential patterns in a temporal database when new data are added or old data are deleted?

Methodology: The research proposes the IUS algorithm, which reuses the frequent and negative border sequences of the original database, avoiding mining the updated database from scratch and thereby reducing computational cost. A companion algorithm, DUS, maintains the patterns when old data are deleted from the database. The study defines the Min_nbd_supp threshold to control the number of negative border sequences that are retained, which keeps memory and computational costs manageable.
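The deletion case handled by DUS can be sketched in the same illustrative style. The code below is an assumption about the overall bookkeeping, not the paper's exact procedure: it supposes absolute support counts, reuses count_support from the sketch above, and simply subtracts each maintained sequence's count in the deleted part of the database before re-deriving the frequent set and the negative border.

```python
def dus_update(frequent, negative_border, old_support, deleted_db,
               min_supp, min_nbd_supp):
    """Illustrative sketch of the deletion case (assumed behaviour, not the
    paper's exact DUS procedure): decrease the supports of the maintained
    sequences by their counts in the deleted part of the database, then
    re-derive the frequent sequences and the negative border."""
    maintained = set(frequent) | set(negative_border)
    removed = count_support(maintained, deleted_db)  # counts in deleted part
    support = {s: old_support[s] - removed[s] for s in maintained}
    new_frequent = {s for s in maintained if support[s] >= min_supp}
    new_border = {s for s in maintained
                  if min_nbd_supp <= support[s] < min_supp}
    return new_frequent, new_border, support
```

Sequences whose adjusted support drops below Min_nbd_supp are discarded in this sketch, which reflects how the threshold keeps the maintained negative border from growing without bound.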

Results: Experiments on GSM alarm data show that the speedup of the IUS algorithm over the Robust_search algorithm ranges from 1 to 9, indicating that IUS is an effective solution for incremental sequence mining in temporal databases.

Implications: The IUS algorithm provides an efficient method for updating sequential patterns in temporal databases, which is crucial for data mining applications that deal with evolving data. By using frequent and negative border sequences from the original database, the algorithm saves computational resources and memory, making it suitable for large-scale databases.

Link to Article: https://arxiv.org/abs/0203027v1
Authors:
arXiv ID: 0203027v1