Efficient Stream Processing of Scientific Data


Thomas Lindemann, Jonas Kauke, and Jens Teubner


Joint Workshop of HardBD and Active, collocated with ICDE 2018


DOI: 10.1109/ICDEW.2018.00029


Modern particle physics produces volumes of experimental data that challenge any data processing system. To illustrate, the trigger system of the LHCb experiment at CERN must sustain a data rate of 4 TB/s, yet maintain real-time characteristics. In this work, we report on ELPACO, a distributed event processing platform for scientific data. Its key characteristics are excellent scalability and high resource efficiency. ELPACO inherits its favorable scalability from Apache Storm, which we used as the basis for our platform. For resource efficiency, we tailored ELPACO to Eriador, a parallel, ARM-based hardware substrate with excellent energy/performance characteristics. With experiments on realistic data, we confirm linear scalability (throughput vs. core count) and a 2.5x improvement in energy efficiency compared to existing solutions.


Real-Time Analysis and Storage for High-Volume Data in Particle Physics (SFB 876, C5)