Main-Memory Hash Joins on Multi-Core CPUs: Tuning to the Underlying Hardware

Publication Details




Cagri Balkesen, Jens Teubner, Gustavo Alonso, and M. Tamer Özsu


Proceedings of the 29th Int'l Conference on Data Engineering (ICDE), Brisbane, Australia, April 2013.




The architectural changes introduced with multi-core CPUs have triggered a redesign of main-memory join algorithms. In the last few years, two diverging views have appeared. One approach advocates careful tailoring of the algorithm to the architectural parameters (cache sizes, TLB, and memory bandwidth). The other approach argues that modern hardware is good enough at hiding cache and TLB miss latencies and, consequently, the careful tailoring can be omitted without sacrificing performance.

In this paper we demonstrate through experimental analysis of different algorithms and architectures that hardware still matters. Join algorithms that are hardware-conscious perform better than hardware-oblivious approaches. The analyses and comparisons in the paper show that many of the claims regarding the behavior of join algorithms that have appeared in the literature are due to selection effects (relative table sizes, tuple sizes, the underlying architecture, using sorted data, etc.) and are not supported by experiments run under different parameter settings. Through the analysis, we shed light on how modern hardware affects the implementation of data operators and provide the fastest implementation of radix join to date, reaching close to 200 million tuples per second.
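To make the two views in the abstract concrete, the following minimal Python sketch contrasts a hash join over a single hash table (the hardware-oblivious style) with a radix-partitioned join (the hardware-conscious style). It is an illustration only and not the paper's implementation: the function names, the tuple layout `(key, payload)`, and the `bits` fan-out parameter are assumptions made here, and Python will not reproduce the cache and TLB effects the paper measures. A tuned implementation would presumably derive the fan-out from the cache and TLB sizes of the target machine.

```python
from collections import defaultdict


def hash_join(build, probe):
    """Hardware-oblivious style: one hash table over the entire build side."""
    table = defaultdict(list)
    for key, payload in build:
        table[key].append(payload)
    # Probe every tuple of the other relation against the single table.
    return [(key, b, p) for key, p in probe for b in table.get(key, ())]


def radix_partition(rel, bits):
    """Split a relation into 2**bits partitions on the low bits of the key."""
    mask = (1 << bits) - 1
    parts = [[] for _ in range(1 << bits)]
    for key, payload in rel:
        parts[key & mask].append((key, payload))
    return parts


def radix_join(build, probe, bits=4):
    """Hardware-conscious style: partition both inputs, then join pairwise.

    Tuples with equal keys always land in the same partition index, so each
    partition pair can be joined independently with a small hash table.
    `bits` is a hypothetical tuning knob standing in for a hardware-derived
    fan-out.
    """
    out = []
    pairs = zip(radix_partition(build, bits), radix_partition(probe, bits))
    for b_part, p_part in pairs:
        out.extend(hash_join(b_part, p_part))
    return out


if __name__ == "__main__":
    r = [(i, f"r{i}") for i in range(100)]
    s = [(i % 150, f"s{i}") for i in range(300)]
    # Both strategies produce the same join result, possibly in a different order.
    print(sorted(radix_join(r, s, bits=3)) == sorted(hash_join(r, s)))
```

Both functions compute the same result; the difference lies only in how the working set is organized, which is the design choice the paper evaluates experimentally.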

Publication Log

December 2012

camera-ready for ICDE 2013

July 2012

submission to ICDE 2013 (accepted)

May 2012

submission to Experimental Track of PVLDB (rejected)
