Jump label

Service navigation

Main navigation

You are here:

Main content

Shifter Lists - A Data Structure for Massive Parallelism – VLDB 2012 Reviews

Reviews for paper Shifter Lists - A Data Structure for Massive Parallelism, submitted to VLDB 2012.

Overall rating: reject

Reviewer 1

Overall Recommendation

Weak Accept (no second-chance option in this track)

Summary of the paper (what is being proposed and in what context; brief justification of your overall recommendation). One paragraph

This paper introduces a new data structure, shifter lists, for parallel computation using FPGAs. Shifter lists are characterized by two phases: execution and shift. The two phases are general enough for a large class of tasks.

Three (or more) strong points about the paper (Please be precise and explicit; clearly explain the value and nature of the contribution).

1. Well written paper.
2. Idea is neat.

Three (or more) weak points about the paper (Please indicate clearly whether the paper has any mistakes, missing related work, or results that cannot be considered a contribution; write it so that he authors can understand what is seen as negative aspects

1. Need to characterize better what class of applications can use shift lists.
2. Load balance issues were not addressed.
3. Experimental evaluation could be improved.

Relevant for VLDB2012

YES

Novelty (Please give a high novelty ranking to papers on new topics, opening new fields, or proposing truly new ideas; assign medium ratings for delta papers and papers on well known topics but still with some valuable contribution; low novelty ratings are

With some new ideas

Significance

Improvement over existing work

Technical Depth and Quality of Content

Syntacticaly complete but with little contribution

Experiments, Repeatability

Ok, but certain claims are not covered by the experiments

Presentation

Excellent: careful, logical, elegant

If accepted, would you recommend a short or long talk?

Long

Would you "champion" for acceptance of the paper in a discussion with the peer reviewers?

Well, not really

Detailed Evaluation (Contribution, Pros/Cons, Errors)

The idea is neat and the paper is well written! The paper uses skyline problem as an example to demonstrate the new data structure. It indeed makes the paper much easier to follow. However, I somehow feel the skyline part is too heavy comparing to the main theme of the paper (given that skyline problem is a well known problem in database community). I would suggest to make some space to characterize the scope of the applications that can be solved by shifter lists. Giving a few examples is not good enough.

The authors may also consider to re-structure the paper. I would expect authors to put Section 7 before Section 6 (experiment), and show experimental results for other applications as well. This way, it is more convincing that the shifter lists are useful for a wide range of applications.

It would be nice to discuss load balance of the proposed algorithm. Is it possible that the nodes on the left side having higher loads than those on the right side?

Reviewer 2

Overall Recommendation

Weak Reject (no second-chance option in this track)

Summary of the paper (what is being proposed and in what context; brief justification of your overall recommendation). One paragraph

This paper presents an interesting data structure, called shifter list, to enable the parallel computation of skyline. Specifically, the proposed approach is a variant of the previous technique, BNL algorithm, for skyline computation. That is, blocks in the centralized algorithm become nodes (containing working set) in a shift list in the distributed environment. This paper is well presented, and the experiments show good performance. However, from the aspect of technical novelty, the paper does not propose any new skyline algorithm or data structure (shifter list is a linked list or array). Moreover, I would like to see more expiremental comparisons with other centralized algorithms such as SFS, DC, bitmap and index, NN, and BBS.

Three (or more) strong points about the paper (Please be precise and explicit; clearly explain the value and nature of the contribution).

S1. This paper is well presented.

S2. The proposed technique in parallel hardware is correct.

S3. Experiments are conducted to show the good performance.

Three (or more) weak points about the paper (Please indicate clearly whether the paper has any mistakes, missing related work, or results that cannot be considered a contribution; write it so that he authors can understand what is seen as negative aspects

W1. The paper is somewhat lacking of novelty such as new algorithms/structures for the skyline computation.

W2. The proposed approach is not compared with many existing skyline algorithms (some related works are missing).

W3. There might be some minor improvements that need to be explored. Please refer to my comments below.

Relevant for VLDB2012

YES

Novelty (Please give a high novelty ranking to papers on new topics, opening new fields, or proposing truly new ideas; assign medium ratings for delta papers and papers on well known topics but still with some valuable contribution; low novelty ratings are

Novelty unclear

Rationale for novelty rating

The paper does not propose any new skyline algorithm or data structure (shifter list is a linked list or array).

Significance

Improvement over existing work

Technical Depth and Quality of Content

Syntacticaly complete but with little contribution

Experiments, Repeatability

Ok, but certain claims are not covered by the experiments

Presentation

Excellent: careful, logical, elegant

If accepted, would you recommend a short or long talk?

Long

Would you "champion" for acceptance of the paper in a discussion with the peer reviewers?

No

Detailed Evaluation (Contribution, Pros/Cons, Errors)

1. This paper is well written, and the technique to compute skyline using the shifter list in parallel hardware is technically correct. However, my main concern is the lacking of technical novelty, regarding the skyline algorithm or data structure. The proposed FPGA-based algorithm uses BNL variant, which replaces the block concept with the shifter lists. As mentioned in Section 2.3, similar structure is used in [25, 29]. Thus, these algorithms are not that new.

2. In the experiments, authors only compare the proposed approach with one skyline algorithm, BNL [22]. However, many more memory-based or disk- based skyline algorithms were proposed, such as SFS, DC, bitmap and index, NN, and BBS, which may give better performance. It would be better to compare the proposed approach with these algorithms. Furthermore, these algorithms are missing in the related work.

3. In Section 3.1, please discuss how to set the number of working set items in each node.

4. In Figure 1, authors mentioned the effect of communication patterns on the communication cost. The shifter list is a linear path (like a pipeline in Figure 1b), which may have high communication cost. Please discuss this.

5. In Section 4.4, the proposed BNL approach with shifter lists may have some Lemming skyline candidates idle at the beginning of the algorithm. That is, initially, w candidate Lemmings are on the bridge, and then the first challenger shift to fight with the first candidate (while other candidates are idle). One small improvement might consider a shifter circle, that is, w challengers are assigned to w candidates, respectively, and then shiftered in a circle. This can improve the idle state of some candidate Lemmings at the beginning, though the improvement may be small.

6. One interesting thing for dominance checking is that, the time for dominance checking can be uneven for different Lemming pairs. While checking only a few (e.g., 2) dimensions of p and q, if p0 is smaller than q0 and p1 is larger than q1, one can immediately decide p and q are not dominating each other. On the other hand, to determine p is dominating q (or q dominates p), one has to check all dimensions. Therefore, one possible technical contribution is to explore such uneven time for dominance checking in parallel hardware. In this case, some positions in the shifter list may become idle earlier than others.

7. In experiments, please show the comparisons in terms of the total time to obtain skylines, rather than the throughput. The reason for this is that the total time may reflect the communication cost, which is not discussed in the paper .

Reviewer 3

Overall Recommendation

Weak Reject (no second-chance option in this track)

Summary of the paper (what is being proposed and in what context; brief justification of your overall recommendation). One paragraph

This paper describes a nested-loops based parallel algorithm applied to skyline computation, with implementation on an FPGA. The underlying algorithm is also shown to be applicable to other tasks like frequent-item computation.

Nested-loops style FPGA algorithms have been presented before, so the main contribution here is (a) more formalization of the appproach in the shifter-list paradigm, and (b) application to skyline queries. The novelty in this is not that strong and the experimental results need more work, so I don't think this paper is ready for publication.

Three (or more) strong points about the paper (Please be precise and explicit; clearly explain the value and nature of the contribution).

S1. Shifter lists parallelize well and are a useful contrast to the more mainstream scatter-gather approach.

S2. Shifter lists allow a simple algorithm such as BNL to parallelize and perform well.

Three (or more) weak points about the paper (Please indicate clearly whether the paper has any mistakes, missing related work, or results that cannot be considered a contribution; write it so that he authors can understand what is seen as negative aspects

W1. For the near term at least, CPU trend is towards multicore. So it is not fair to compare an FPGA with a single-theaded CPU implementation. I think SSSkyline is designed for multicore, so why run it single-threaded? Also, is BNL the best skyline algorithm on CPUs?

W2. The shifter-list based BNL looks mostly like a direct application of handshake join to BNL. You are flowing all the candidate lemmings on the queue through the lemmings on the bridge to check if any of the candidate lemmings is undominatd by all those on the bridge: isn't this a join of the queue with the bridge on the isDominated predicate, with goal being to to find ones on the queue that find no match, and ones on the bridge that find at least one match?

W3. Skyline computation is very much like a self-join, so it not surprising that nested-loops is a good idea. It is not clear to me that shifter lists are more generally applicable. In particular, take frequent items (Section 7.1). If the CPU implementation uses a hash table, then the work for each element is only O(1). So why will the shifter-list approach outperform it?

Relevant for VLDB2012

YES

Novelty (Please give a high novelty ranking to papers on new topics, opening new fields, or proposing truly new ideas; assign medium ratings for delta papers and papers on well known topics but still with some valuable contribution; low novelty ratings are

With some new ideas

Significance

Improvement over existing work

Technical Depth and Quality of Content

Solid work

Experiments, Repeatability

Ok, but certain claims are not covered by the experiments

Presentation

Reasonable: improvements needed

If accepted, would you recommend a short or long talk?

Would you "champion" for acceptance of the paper in a discussion with the peer reviewers?

Detailed Evaluation (Contribution, Pros/Cons, Errors)

Long

Would you "champion" for acceptance of the paper in a discussion with the peer reviewers?

No

Detailed Evaluation (Contribution, Pros/Cons, Errors)

.

Related Information



Sub content

Contact

Prof. Dr. Jens Teubner
Tel.: 0231 755-6481