
Streams on Wires—A Query Compiler for FPGAs — VLDB 2009 Author Feedback
Feedback to reviews for paper submission Streams on Wires—A Query Compiler for FPGAs to VLDB 2009.


Dear Reviewers:

We would like to thank you for your comments and insights. They focus on two general points: relevance in practice and support for joins.


The project is a collaboration with a bank that needs to process/filter incoming streams of financial data at very high input rates. Data rates in the OPRA options market, for example, are currently around 2M messages/sec (http://www.opradata.com/specs/Traffic_Projections_2009_2010.pdf). On the one hand, these data rates hit the limitations of current system architectures (as also identified by Reviewer 1); on the other hand, the scenario places fairly low demands on query expressiveness. We will add clarifying text and references to the problem statement.


FPGAs introduce no additional difficulty for implementing joins (or other operators with unbounded space demands) beyond the one intrinsic to stream processing. They are merely more involved to implement because they require use of the memory and/or embedded cores available on the FPGA (for instance, joins can be implemented using FPGA memory as a content-addressable memory). Explaining this would have taken far more space than was available in the paper. A join implementation would fit seamlessly into the compositional Glacier setup. For our use cases and our target application, joins are not the primary operator. Nevertheless, it is important to emphasize that joins can be done on FPGAs; they are simply not the part of the problem that we tackle in this paper.
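To illustrate the idea only (this is not the paper's Glacier implementation, and the lockstep arrival of tuples is a deliberate simplification), a software analogue of a CAM-backed stream join is a symmetric join over bounded windows, where each arriving tuple probes the opposite window. On the FPGA, the probe would be a single-cycle content-addressable-memory lookup in block RAM:

```python
from collections import deque

def symmetric_window_join(left, right, key, window=4):
    """Symmetric window join over two finite streams (software sketch).

    On an FPGA, a content-addressable memory would answer the
    'find all matching keys' probe in one cycle; here we scan a
    bounded deque to play that role.
    """
    lbuf, rbuf = deque(maxlen=window), deque(maxlen=window)
    results = []
    # simplification: tuples arrive on both streams in lockstep
    for l, r in zip(left, right):
        lbuf.append(l)
        rbuf.append(r)
        # probe the opposite window, CAM-style
        for x in rbuf:
            if key(l) == key(x):
                results.append((l, x))
        for x in lbuf:
            # 'x is not l' avoids emitting the simultaneous pair twice
            if key(r) == key(x) and x is not l:
                results.append((x, r))
    return results
```

The bounded windows are what keeps the state (and hence the required FPGA memory) finite; the difficulty is intrinsic to stream processing, not to the hardware.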


OPERATOR SHARING (Reviewer 2, D3): Yes, this can be done using the same composition model as for software operators. The implementation of the queries described in the paper already uses sharing and composition of basic data-processing elements.
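As a purely illustrative software analogue (not Glacier syntax; the ticker symbols and operator names are made up for this sketch), one shared selection operator can feed several downstream consumers, much as a single circuit's output wire on the FPGA can drive multiple downstream circuits:

```python
import itertools

def selection(pred, stream):
    # a basic data-processing element: filter tuples by a predicate
    for t in stream:
        if pred(t):
            yield t

def fan_out(stream, n=2):
    # replicate one operator's output to n consumers,
    # mirroring how one output wire can feed several circuits
    return itertools.tee(stream, n)

# one shared selection feeds two downstream consumers
ticks = [("UBS", 17.0), ("UBS", 17.2), ("CS", 31.0)]
shared = selection(lambda t: t[0] == "UBS", ticks)
a, b = fan_out(shared)
count = sum(1 for _ in a)                 # consumer 1: tuple count
prices = [p for _, p in b]                # consumer 2: average price
avg = sum(prices) / len(prices)
```

In hardware the fan-out is free in the sense that no tuples are copied; the selection circuit is instantiated once and its result is wired to both consumers.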

SCALABILITY (Reviewer 3, W2): The reviewer is correct that the real estate available on an FPGA is limited; the example with k=1000 and l=1 would require 1001 replicas. Resource consumption can be reduced by using internal memory and embedded CPU cores. As with joins, this can be achieved with known stream-processing techniques, but introducing them is beyond what we could present in this paper. These techniques would not affect the FPGA's ability to process data at higher speed than CPUs. We note that we did not observe scalability problems in our industry collaboration.
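To sketch why memory helps (an illustration under our own assumptions, not a technique from the paper): a sliding-window aggregate over a window of k tuples can be computed with a single adder/subtractor plus a k-deep buffer (block RAM on the FPGA), instead of k parallel operator replicas:

```python
from collections import deque

def sliding_avg(stream, k=1000, l=1):
    """Sliding-window average: window size k, advance l.

    One running sum plus a k-deep circular buffer replaces
    k replicated operator instances.
    """
    buf = deque()
    s = 0.0
    out = []
    for i, x in enumerate(stream):
        buf.append(x)
        s += x
        if len(buf) > k:
            s -= buf.popleft()   # retire the tuple leaving the window
        if len(buf) == k and (i - k + 1) % l == 0:
            out.append(s / k)
    return out
```

The trade is logic for memory: resource consumption in lookup tables drops from O(k) to O(1), at the cost of a k-deep buffer, which FPGAs provide cheaply as block RAM.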

OTHERS (Reviewer 1, W1): The use cases we have are based on simple queries that need to be applied to streams at "wire speed". By running on in-memory data, the CPU- and FPGA-based implementations would essentially both measure the cost of accessing main memory, a performance characteristic that does not necessarily translate into end-to-end application performance. The bottleneck, as Reviewer 1 indicates, is indeed the NIC-to-memory interface, but that is precisely what the paper shows: how FPGAs can be used effectively to bypass this bottleneck. We note that commercial systems available today often use a similar setup (e.g., FPGAs connected directly to the disk to process/filter the data before it hits memory, so as to minimize the data that eventually reaches the CPU).

Prof. Dr. Jens Teubner
Tel.: 0231 755-6481