Frequent Item Computation on a Chip
Publication Details
Title
Frequent Item Computation on a Chip
Authors
Jens Teubner, René Müller, and Gustavo Alonso
Published
IEEE Transactions on Knowledge and Data Engineering (TKDE). vol. 23, no. 8 (Special Issue on Best Papers of ICDE 2010), August 2011.
Download
DOI
http://doi.ieeecomputersociety.org/10.1109/TKDE.2010.216
Abstract
Computing frequent items is an important problem by itself and as a sub-routine in several data mining algorithms. In this paper we explore how to accelerate the computation of frequent items using field-programmable gate arrays (FPGAs) with a threefold goal: increase performance over existing solutions, reduce energy consumption over CPU-based systems, and explore the design space in detail as the constraints on FPGAs are very different from those of traditional software-based systems.
We discuss three design alternatives, each one of them exploiting different FPGA features and each one providing different performance/scalability trade-offs. An important result of the paper is to demonstrate how the inherent massive parallelism of FPGAs can improve performance of existing algorithms but only after a fundamental redesign of the algorithms. Our experimental results show that, e.g., the pipelined solution we introduce can reach more than 100 million tuples per second of sustained throughput (four times the best available results to date) by making use of techniques that are not available to CPU-based solutions. Moreover, and unlike in software approaches, the high throughput is independent of the skew of the Zipf distribution of the input and at a far lower energy cost.
Publication Log
June 2010
acceptance to TKDE and submission of final version
March 2010
submission to TKDE (accept with no further changes)
January 2010
invitation to TKDE
June 2009
submission to ICDE 2010 (title: “FPGA Acceleration for the Frequent Item Problem”; accepted as full paper)
- See FPGA Acceleration for the Frequent Item Problem for details.