Like Water and Oil: With a Proper Emulsifier, Query Compilation and Data Parallelism Will Mix Well
Henning Funke and Jens Teubner
Demonstration Paper at VLDB 2020
In response to physical limitations, hardware has changed significantly during the past two decades. As the database community, we have no choice but to adapt to those changes in order to benefit from these and further hardware advances.
Two strategies to deal with the change have proven particularly successful. To avoid hitting the memory wall, modern engines compile queries into native machine code; this way, data can be kept longer in registers and performance-limiting memory I/Os can be avoided. To escape the power wall, the use of heterogeneous and massively parallel architectures has been proposed; graphics processors (GPUs) in particular can deliver spectacular compute performance at a very attractive power footprint. But while both these strategies are very successful and well understood, it is surprisingly difficult to bring both together without losing much of their benefit.
In this demo, we showcase DogQC, the query compiler that we develop at TU Dortmund University. DogQC includes the Lane Refill and Push-Down Parallelism techniques to combat the divergence effects that are the root cause of the above-mentioned difficulty. The two techniques very effectively avoid resource under-utilization on graphics processors while leveraging the bandwidth efficiency of compiled code. In practice, DogQC's anti-divergence measures can improve query performance severalfold.
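To make the divergence problem concrete, the following is a minimal sketch of a toy cost model. It is not DogQC code and not the paper's algorithm. It assumes a warp of 32 lanes, a selective filter followed by one expensive downstream operator, and a hypothetical buffer that collects surviving tuples until a full warp of work is available. The names `downstream_steps_naive`, `downstream_steps_refill`, and `utilization` are illustrative only.

```python
WARP_SIZE = 32  # assumed number of lanes that execute in lockstep


def downstream_steps_naive(pass_mask):
    """Toy model without refill: the downstream operator runs once per
    warp iteration that has at least one surviving tuple, even if most
    lanes are idle."""
    steps = 0
    busy_lanes = 0
    for start in range(0, len(pass_mask), WARP_SIZE):
        active = sum(pass_mask[start:start + WARP_SIZE])
        if active > 0:
            steps += 1
            busy_lanes += active
    return steps, busy_lanes


def downstream_steps_refill(pass_mask):
    """Toy model with a refill buffer (an assumption of this sketch):
    surviving tuples are buffered, and the downstream operator only runs
    once a full warp of tuples is available."""
    steps = 0
    busy_lanes = 0
    buffered = 0
    for start in range(0, len(pass_mask), WARP_SIZE):
        buffered += sum(pass_mask[start:start + WARP_SIZE])
        while buffered >= WARP_SIZE:
            steps += 1
            busy_lanes += WARP_SIZE
            buffered -= WARP_SIZE
    if buffered > 0:
        # Flush the remainder at the end of the input.
        steps += 1
        busy_lanes += buffered
    return steps, busy_lanes


def utilization(steps, busy_lanes):
    """Fraction of lane slots that did useful work in the downstream operator."""
    return busy_lanes / (steps * WARP_SIZE) if steps else 1.0


# Example input: every fourth tuple passes the filter (25% selectivity).
mask = [i % 4 == 0 for i in range(1024)]
naive = downstream_steps_naive(mask)
refill = downstream_steps_refill(mask)
print("naive :", naive, utilization(*naive))
print("refill:", refill, utilization(*refill))
```

In this example the naive model runs the downstream operator 32 times at 25% lane utilization, while the buffered model runs it 8 times at full utilization. A real GPU implementation also has to pay for moving tuples into and out of the buffer, which this sketch ignores.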
Energy Awareness in Database Algorithms and Systems (SFB 876, A2)
The source code of the DogQC query compiler is available for download here.
It is also available on GitHub.
Submission to VLDB 2020 (result: Accept)