Department of Computer Science

Efficient Stream Processing of Scientific Data

Authors

Thomas Lindemann, Jonas Kauke, and Jens Teubner

Published

Joint Workshop of HardBD and Active, collocated with ICDE 2018

Download

via DOI (10.1109/ICDEW.2018.00029)

Abstract

Modern particle physics produces volumes of experimental data that challenge any data processing system. To illustrate, the trigger system of the LHCb experiment at CERN must sustain a data rate of 4 TB/s while maintaining real-time characteristics. In this work, we report on ELPACO, a distributed event processing platform for scientific data. Its key characteristics are excellent scalability and high resource efficiency. ELPACO inherits its favorable scalability from Apache Storm, which we used as the basis for our platform. For resource efficiency, we tailored ELPACO to Eriador, a parallel, ARM-based hardware substrate with excellent energy/performance characteristics. In experiments on realistic data, we confirm linear scalability (throughput vs. core count) and a 2.5x improvement in energy efficiency compared to existing solutions.

Project

Real-Time Analysis and Storage for High-Volume Data in Particle Physics (SFB 876, C5)