High-Volume Data in Particle Physics
Project
Sub-project C5 of SFB 876, Realtime Analysis and Storage for High-Volume Data in Particle Physics, addresses the very large data volumes that arise in scientific applications such as the LHC particle accelerator at CERN. We want to develop techniques that pre-filter the data volume of approximately 30 gigabytes per second, or 400 PB per year, close to the data source and with maximum performance. We will use hardware/software co-design methods to build this pre-filter, and we want to refine and optimize these co-design methods based on specific examples.
For the remaining data volume (we expect about 20 PB per year), we want to develop distributed storage mechanisms. They should not only be capable of storing such high volumes; more importantly, we want to equip them with efficient access mechanisms that help accelerate the physicists' analyses. To this end, we plan to base our work on database indexing techniques, so that measurement data can be placed and distributed in the network according to semantic criteria. The performance of the new methods can be evaluated by comparing them with standard techniques on specific physics analyses.
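As a rough sanity check, the figures quoted above can be related to each other with a few lines of arithmetic. This is only a sketch under two assumptions that the project description does not state: decimal units (1 PB = 10^6 GB) and a constant rate whenever data is being recorded. The function and constant names are illustrative.

```python
# Figures quoted in the project description.
RAW_RATE_GB_PER_S = 30          # data rate at the source, gigabytes per second
RAW_VOLUME_PB_PER_YEAR = 400    # raw data volume per year, petabytes
STORED_VOLUME_PB_PER_YEAR = 20  # expected volume remaining after pre-filtering

GB_PER_PB = 1_000_000  # assumption: decimal units (1 PB = 10^6 GB)


def implied_live_seconds(volume_pb: float, rate_gb_per_s: float) -> float:
    """Seconds of data taking needed to accumulate volume_pb at a constant rate."""
    return volume_pb * GB_PER_PB / rate_gb_per_s


def reduction_factor(raw_pb: float, stored_pb: float) -> float:
    """Factor by which the pre-filter must shrink the raw volume."""
    return raw_pb / stored_pb


live_seconds = implied_live_seconds(RAW_VOLUME_PB_PER_YEAR, RAW_RATE_GB_PER_S)
live_days = live_seconds / 86_400
factor = reduction_factor(RAW_VOLUME_PB_PER_YEAR, STORED_VOLUME_PB_PER_YEAR)

print(f"implied data-taking time: {live_seconds:.2e} s (~{live_days:.0f} days)")
print(f"required reduction factor: {factor:.0f}x")
```

Under these assumptions, 400 PB per year at 30 gigabytes per second corresponds to roughly 1.3 × 10^7 seconds (about 154 days) of data taking per year, and the pre-filter has to reduce the volume by a factor of about 20 to reach the expected 20 PB per year.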
People
- Jens Teubner (faculty)
- Thomas Lindemann (PhD student)
- Maximilian Berens (PhD student)
This is a joint project with the Particle Physics Group at TU Dortmund.
Publications
- Thomas Lindemann, Jonas Kauke, and Jens Teubner. Efficient Stream Processing of Scientific Data. HardBD 2018.
- Sebastian Dorok, Sebastian Breß, Jens Teubner, Horstfried Läpple, Gunter Saake, and Volker Markl. Efficient Storage and Analysis of Genome Data in Databases. Datenbank-Spektrum, Springer Verlag, June 2017 (Online First).
- Michael Kußmann, Maximilian Berens, Ulrich Eitschberger, Ayse Kilic, Thomas Lindemann, Frank Meier, Ramon Niet, Margarete Schellenberg, Holger Stevens, Julian Wishahi, Bernhard Spaan, and Jens Teubner. DeLorean: A Storage Layer to Analyze Physical Data at Scale. BTW 2017.
- Sebastian Dorok, Sebastian Breß, Jens Teubner, Horstfried Läpple, Gunter Saake, and Volker Markl. Efficient Storage and Analysis of Genome Data in Databases. BTW 2017.
- Sebastian Breß, Henning Funke, and Jens Teubner. Robust Query Processing in Co-Processor-accelerated Databases. SIGMOD 2016.