You are here:

Sub navigation

Publications+

Main content

A Spinning Join That Does Not Get Dizzy — ICDCS 2010 Reviews

Reviews for paper A Spinning Join That Does Not Get Dizzy, submitted to ICDCS 2010.

Overall rating: accept

Reviewer 1

Reviewer's Scores

Originality: 4

Comments

This is an interesting and well written paper exploring how to structure distributed joins if high speed and efficient network transports were used between nodes. The result is a simple, elegant, and efficient algorithm/implementation called cyclo-join that takes advantage of high speed RDMA networks. The idea is to have the data to visit each node in a logical ring as the communication is very fast. The transport layer, data roundabout, eliminates all costs associated with network layer and integrates nicely with the join processing. Experiments indicate that the implementation is fast and can saturate both the CPU and the network at the same time.

Pros:

Interesting fusion of RDMA with in-memory join processing
Good and comprehensive design, implementation, and evaluation
Well written manuscript

Cons:

talks about joins and nothing but the joins...
results are impressive but not that surprising

Comments:

Spinning data in a ring across nodes works fine for things that require all to all communication, fine, but what about others? Is it not too early to scrap the tcp layer entirely? Should we give up on point to point communication? Are there no cases where traditional networking left to apply? A discussion about where data roundabout may not help would be useful.
A subsection on handling failures would improve the paper a lot. If a node in the logical ring goes down, how big an impact would that be on the join performance? How is cyclo-join to interface with recovery and failure handling in distributed environments? A big part of distributed data analysis is about handling inevitable failures as the system scale increases...
What about disks and their high speed variants? Query optimizers and processing of multiple queries at the same time? Partitioning of data in a large cluster? A discussion that considers the cyclo-join in the larger system setup is lacking in the paper...

Reviewer 2

Reviewer's Scores

Originality: 3

Comments

This paper is very well written, well motivated and the contribution is clear even for non-experts. It proposes the use of a ring shaped architecture with hosts endowed with RDMA network interfaces to allow a very fast transmission of data along the ring so as to realize distributed join operations according to an original paradigm. As the authors admit the idea is not completely novel as it inherits something from the systolic system approach.

The authors highlight how nowadays architectures for distributed databases are specifically designed with the purpose of reducing the amount of data transfer. These architectures are not able to fully exploit the throughput made available by new communication technologies.

By contrast, the authors propose to perform query execution locally and let the data flow through the ring, traversing all the hosts which can pick the necessary pieces of data from the flow when necessary.

The use of this mechanism is made possible thanks to the use of RDMA technology and brings several benefits: CPU overhead due to data copy are significantly reduced (NIC to Memory transfer are executed directly) as well the number of bus contentions and context switches. Furthermore the use of RDMA allows the access to the distributed memory system as if it was a single piece of memory (with negligible overhead) withouth the addressing limitations of a single host.

The paper treats the topic at a high level hence I remained with some doubts. The authors claim that one of the strengths of this approach is its simplicity and the fact that it allows trivial replacement, addition or removal of hosts. Nevertheless it seems that the necessary memory pinning and buffer registration is not such an easy task. The authors should better clarify the complexity of adding, removing hosts. Does the proposed system really scale to a cloud size? The experiments are run on no more than six hosts. Nevertheless in a very large architecture the overhead of a ring can be meaningful, especially for non balanced tables (a "reasonably" even distribution of data among the hosts is indeed among the assumptions for the applicability of the proposed approach, the authors should comment more on this aspect).

The experiments are not run on a real system and do not address comparisons with previous approaches to the realization of distributed join. The authors only performed comparisons with non-RDMA approaches also based on the same ring topology.

Overall I found this paper very interesting and original, and I think it could create lively discussions at the conference. Nevertheless I found that the cyclic join approach (which is the very novelty of this paper) was already proposed in a previous paper of the same authors in DaMoN 2008. With different experiments and comparisons that paper proposes the same ideas of this one, hence I gave only a 3 to the originality of this paper.

Reviewer 3

Reviewer's Scores

Originality: 4

Comments

This paper presented an implementation of distributed data processing to perform joins. The authors made the interesting point that there exist scenarios in which the network can be embraced instead of avoided. They showed the importance of RDMA and developed a system which keeps data moving around in a ring to allow distributed processing.

I think generally that this type of system will work well when the speed of network is faster than the speed of local disks but the authors did not make this point.

The paper is easy to read and presents an interesting idea. But I do have a very large concern about the evaluation section. For all experiments, the authors compare their system using multiple nodes to the performance of a single node doing the default behavior. This is very unfair and a complete straw man analysis. The authors need to redo the evaluation and compare the performance of their system to the default behavior when the same amount of resource is used. I think they can still show a win for some scenarios. Imagine a case where the bandwidth over the network is faster than the local disk bandwidth and the working data set is larger than the memory on a single node but small enough to fit into distributed memory. In this case, imagine that each node has a full copy of the data and does the join in some sort of map-reduce fashion. It might be the case that each node having to pull data from the disk will force the entire distributed operation to be slower than the case where the data is passed through the ring. Without this evaluation comparing apples to apples, the paper cannot present a convincing argument to the reader that they have made a contribution.

Reviewer 4

Reviewer's Scores

Originality: 4

Comments

The proposes using a ring-based multicomputer architecture and system for high performance join processing.

The paper certainly wins an award for the most intriguing title.

One cannot help feeling that the approach is a bit of a throw back to the days of database machines. Today the approach that is the dominant paradigm is one based on column indexes and map-reduce processing to union and intersect key-value sets.

Yet this paper focuses more on the performance issues of machine-to-machine and memory-to-remote memory access than the data parallel programming model suggested by map-reduce.

The paper is interesting simply because of its "retro" approach, namely multicomputer algorithms for parallel joins. Since it subverts the dominant paradigm, it is fun to read.

The evaluation, however, is not a strength of the paper. The testbed is tiny (six machines -- people have run queries via MR on 1000 node clusters with petabyte scale datasets!). Does the approach actually scale? The data is synthetic and approach is essentially a sensitivity analysis with respect to key distribution (skewed vs. non-skewed). This seems inadequate compared with current research practice. It is unfortunate that the authors provide no comparison against more modern database approaches, such as BigTable or Hive.

Related Information

submission (PDF)
final paper (PDF) — published at SIGMOD 2010

Sub content

Contact

Prof. Dr. Jens Teubner
Tel.: 0231 755-6481

Jump label

Service navigation

Main navigation

Sub navigation

Main content

A Spinning Join That Does Not Get Dizzy — ICDCS 2010 Reviews

Reviewer 1

Reviewer's Scores

Comments

Reviewer 2

Reviewer's Scores

Comments

Reviewer 3

Reviewer's Scores

Comments

Reviewer 4

Reviewer's Scores

Comments

Related Information

Sub content

Contact