Reviews for paper Efficient Frequent Item Counting in Multi-Core Hardware, submitted to KDD 2012.
Overall rating: accept
The paper presents clever optimizations that leverage the characteristics of modern multi-core processors to speed up the approximate frequent item counting problem. The filter and the space-saving structure run on two separate cores and communicate via asynchronous messages.
Overall, impressive results are obtained over space-saving, especially for skewed input.
Limitations of the work include that it targets only two cores, and it is not immediately clear that it can scale to 8, 12, 16, or even more cores. There is also no comparison with other "parallel" approaches.
The PC feels that the work is valuable but, at the same time, somewhat limited.
1 (well accepted paper)
A well established problem
Substantial improvement over state-of-the-art methods
Acceptable, but there is room for improvement
Yes
Nice presentation, easy to follow
Not really
Rule and Pattern Mining
I will argue to accept
Strong points:
S1. Clever, highly-tuned data structures
S2. Convincing performance results
Weak points:
W1. Very specialized problem, low potential impact
W2. Some experimental comparisons missing
W3. Somewhat oversold; refocus on streaming setting needed
The paper proposes an efficient implementation of the SpaceSavings algorithm for (approximately) finding frequent items and their counts. The authors use simple but clever data structures and exploit low-level SIMD instructions to push the performance from 20M items/second to up to 200M items/second, which I think is impressive and convincing.
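For concreteness, the baseline the paper builds on can be sketched as follows. This is a minimal, unoptimized rendering of the Space-Saving counting logic (the table size k and the toy stream are illustrative choices, not values from the paper, and the paper's SIMD-tuned data structures are not reflected here):

```python
def space_saving(stream, k):
    """Maintain at most k (item -> count) counters; approximate the
    most frequent items in the stream. Minimal sketch only."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k:
            counters[item] = 1
        else:
            # Evict the minimum counter and inherit its count + 1;
            # this overestimates, but the error is bounded by the
            # evicted (minimum) count.
            victim = min(counters, key=counters.get)
            min_count = counters.pop(victim)
            counters[item] = min_count + 1
    return counters

# Skewed toy stream: 'a' dominates, so it should surface as frequent.
stream = ['a'] * 50 + ['b'] * 10 + ['c', 'd', 'e'] * 3
summary = space_saving(stream, k=3)
```

On skewed input like this, the dominant item keeps an exact count while the tail items churn through the remaining counters, which is the behavior the paper's fast path exploits.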
My main concern with this paper is that its potential impact is limited. The ideas in this paper appear to be very specific to SpaceSavings and it is unclear to me whether the techniques can be generalized to other problems / algorithms. Moreover, the paper makes a number of (implicit) assumptions, which limit its applicability: the items are 32-bit integers, approximation is OK, queries are posed frequently while the algorithm is running, and the increase in throughput is relevant in the overall system.
The experimental comparison to alternative approaches is incomplete. The authors mention an alternative merge-based approach, which splits the input data stream across multiple threads and merges at the end / when a query is posed. The authors argue that such an approach does not work well when the stream is queried frequently. I think that this needs more justification. First, it appears that merging can be done (potentially on a separate core) while the independent threads are still running; the performance of such a strawman approach is not clear. Second, the authors should experimentally compare their approach with the merge-based approach, ideally in settings where there are (1) no queries, (2) infrequent queries, (3) frequent queries. Finally, a comparison to the parallel techniques of [6] is missing.
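The merge-based strawman in question could look roughly like the following. The merge rule shown (sum matching counters, keep the k largest) is one simple convention stated here as an assumption; the paper does not define the merge, and more careful mergeable-summary rules exist:

```python
def merge_summaries(s1, s2, k):
    """Merge two per-thread Space-Saving summaries (item -> count),
    keeping the k largest combined counters. Illustrative sketch of
    one possible merge rule, not the paper's definition."""
    merged = dict(s1)
    for item, count in s2.items():
        merged[item] = merged.get(item, 0) + count
    top = sorted(merged.items(), key=lambda kv: -kv[1])[:k]
    return dict(top)

# Two per-thread summaries over halves of a stream:
left = {'a': 30, 'b': 5, 'c': 2}
right = {'a': 25, 'd': 7, 'b': 3}
combined = merge_summaries(left, right, k=3)
```

The cost of this merge grows with the number of counters and with query frequency, which is presumably why the authors dismiss it for frequently queried streams; measuring it across the three query regimes above would settle the question.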
The authors do not particularly focus on a streaming setting, although this is clearly suggested by the assumptions listed above. The abstract and introduction do not mention that the paper is concerned with computing approximate frequent items (some false positives) with approximate counts. The paper thus appears oversold and should be refocused appropriately.
Minor comments:
I feel that Section 3.1 would be easier to follow if the invariants maintained by the algorithm (properties of X and Y) were stated explicitly.
The analysis in Section 3.3 about the probability of filtering an item assumes that the data structure is "warmed up" and/or that items arrive in random order.
The proposed 2-core approach seems to require some message queue between the two processors. How would such a queue be implemented effectively (so that queue maintenance does not destroy any potential performance gains)?
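To make the queue concern concrete: one standard design is a single-producer/single-consumer ring buffer, sketched below. A real implementation would use a fixed-size lock-free buffer in shared memory with appropriate memory barriers; this Python version (names and capacity are illustrative assumptions) only shows the head/tail protocol whose maintenance cost is at issue:

```python
class SpscQueue:
    """Single-producer/single-consumer ring buffer sketch. One slot
    is sacrificed to distinguish full from empty, so capacity-1
    items are usable."""

    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.capacity = capacity
        self.head = 0  # next slot to read (consumer-owned)
        self.tail = 0  # next slot to write (producer-owned)

    def push(self, item):
        nxt = (self.tail + 1) % self.capacity
        if nxt == self.head:
            return False          # full: producer must retry or drop
        self.buf[self.tail] = item
        self.tail = nxt           # publish only after the slot is written
        return True

    def pop(self):
        if self.head == self.tail:
            return None           # empty
        item = self.buf[self.head]
        self.head = (self.head + 1) % self.capacity
        return item

q = SpscQueue(capacity=4)
q.push('update-x')
msg = q.pop()
```

Because each index is written by exactly one side, such a queue needs no locks; the open question the review raises is whether the per-item enqueue/dequeue cost stays small enough to preserve the reported throughput.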
A well established problem
Substantial improvement over state-of-the-art methods
Thorough, including systematic comparison with proper state-of-the-art methods and novel case studies (if applicable)
Yes
Nice presentation, easy to follow
Yes
Rule and Pattern Mining
Leave for the senior PC to decide
A novel algorithm is presented for parallel computation of frequent item counts.
The algorithm exploits skew in the input and uses a prefiltering stage that can be implemented efficiently with SIMD instructions.
The contribution of this paper is significant.
A well established problem
The technical development is highly incremental without principled or fundamental contributions
Thorough, including systematic comparison with proper state-of-the-art methods and novel case studies (if applicable)
Yes
Nice presentation, easy to follow
Not really
c) Efficient Algorithms
Leave for the senior PC to decide
The paper describes an optimized implementation for counting frequent items in a dataset. It is based on an existing algorithm, called space-saving (SS), that returns an approximate answer by keeping counters only for the most frequent items. The paper improves SS in the following way: it splits the frequent counters into two sets, the few very frequent ones called X, and the rest, called Y. X is small enough to be kept in the L1 cache of the processor, and operations on X can be done very fast using SIMD instructions. A filtering step decides if an incoming item belongs to X, in which case processing will be done very fast, or to Y, in which case processing will be done by the normal SS algorithm.
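As I understand it, the dispatch described above can be sketched as follows. The names X and Y follow the paper; the X capacity and the plain linear scan are illustrative assumptions (in the paper, X is a small fixed-size array probed with SIMD compare instructions, not a Python loop):

```python
def process(item, X, update_Y):
    """Dispatch one incoming item: fast path if it is in the small hot
    set X (kept in L1 cache and probed with SIMD in the paper; a
    linear scan here), else fall back to the regular Space-Saving
    structure via update_Y. Sketch only."""
    for i, (key, count) in enumerate(X):
        if key == item:
            X[i] = (key, count + 1)  # fast path: hot item
            return 'fast'
    update_Y(item)                    # slow path: ordinary SS update
    return 'slow'

# Illustrative usage with a dictionary standing in for Y:
X = [('a', 10), ('b', 3)]
Y = {}
def update_Y(item):
    Y[item] = Y.get(item, 0) + 1

path1 = process('a', X, update_Y)
path2 = process('z', X, update_Y)
```

Under skewed input, most items hit the fast path, which is where the reported speedup comes from; the slow path and X/Y maintenance can then run on the second core.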
The paper is well written, deals with an important operator which is at the heart of many data mining algorithms, and achieves impressive speedup, up to 10x compared to the existing SS algorithm.
A drawback of this paper is that the good performance is achieved by careful reimplementation of an existing algorithm, but without any innovative ideas.
Another issue is that, although the two parts of the method can run in parallel on two processors, one for X and one for Y, the algorithm is not parallel in the traditional sense, i.e., it does not scale to more than two processors.