Query Reorganization Algorithm for Efficient Filtering System

Abstract

In the information filtering paradigm, clients subscribe to a server with continuous queries that express their information needs and are notified every time relevant information is published. To perform this task efficiently, servers employ indexing schemes that support fast matching of incoming information against the query database. Such indexing schemes involve (i) main-memory trie-based data structures that cluster similar queries by capturing their common elements and (ii) efficient filtering mechanisms that exploit this clustering to achieve high throughput and low filtering times. However, state-of-the-art indexing schemes are sensitive to the query insertion order and cannot adapt to an evolving query workload, so filtering performance degrades over time. In this paper, we present an adaptive trie-based algorithm that outperforms current methods by relying on query statistics to reorganise the query database. Contrary to previous approaches, we show that the nature of the constructed tries, rather than their compactness, is the determining factor for efficient filtering performance. Our algorithm does not depend on the order in which queries are inserted into the database, manages to cluster queries even when clustering possibilities are limited, and achieves more than 96 percent filtering time improvement over its state-of-the-art competitors. Finally, we demonstrate that our solution is easily extensible to multi-core machines.
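To make the idea of trie-based query clustering concrete, the following is a minimal sketch and not the algorithm presented in the paper. It assumes that each continuous query is a conjunction of keywords and that the only statistic used is how often a term appears across the query database. The names `TrieNode`, `build_index`, `match`, and `count_nodes` are hypothetical. Because the term order inside each query is derived from global frequencies, the resulting trie does not depend on the order in which queries are inserted.

```python
from collections import Counter


class TrieNode:
    def __init__(self):
        self.children = {}   # term -> TrieNode
        self.query_ids = []  # queries whose last term ends at this node


def build_index(queries):
    """Build a trie from {query_id: set_of_terms} (conjunctive queries).

    Assumption for illustration: terms are ordered by their frequency
    across all queries (most frequent first, ties broken alphabetically),
    so queries sharing common terms share a path near the root.
    """
    freq = Counter(t for terms in queries.values() for t in terms)
    root = TrieNode()
    for qid, terms in queries.items():
        node = root
        for term in sorted(terms, key=lambda t: (-freq[t], t)):
            node = node.children.setdefault(term, TrieNode())
        node.query_ids.append(qid)
    return root


def match(root, doc_terms):
    """Return the sorted ids of all queries whose terms all occur in the document."""
    doc_terms = set(doc_terms)
    matched, stack = [], [root]
    while stack:
        node = stack.pop()
        matched.extend(node.query_ids)
        # Descend only into children whose term appears in the document.
        for term, child in node.children.items():
            if term in doc_terms:
                stack.append(child)
    return sorted(matched)


def count_nodes(root):
    """Count trie nodes, including the root."""
    return 1 + sum(count_nodes(c) for c in root.children.values())


queries = {
    "q1": {"trie", "index"},
    "q2": {"trie", "filter"},
    "q3": {"index", "filter", "trie"},
    "q4": {"multicore"},
}
index = build_index(queries)
print(match(index, ["trie", "index", "filter", "news"]))
```

In this sketch an incoming document is matched by a single traversal that prunes every subtree whose term is missing from the document, which is where the clustering pays off. A real system would recompute the statistics and rebuild or restructure the tries as the query workload evolves; that step is omitted here.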

