The Dresden CCoE kindly invites you to our second Workshop. This year’s workshop is focused on high data rate processing.
The workshop takes place on Thursday, May 15, 2014 at TU Dresden, Willersbau Room A317 (Zellescher Weg 12-14, 01069 Dresden).
8:30-8:45 Prof. Dr. Wolfgang E. Nagel (ZIH) – Welcome
8:45-9:30 Keynote: L. Royer / M. Weigert (MPI-CBG) – Fast Imaging of Biological Processes with GPU Accelerated Smart Microscopes (for details, contact the authors: http://www.mpi-cbg.de/nc/research/research-groups/gene-myers/contact.html )
9:30-10:00 Coffee Break
10:00-10:30 M. Steuwer (U Münster) – High-Level Programming of Multi-GPU Systems (slides: skelcl-dopencl-dresden)
10:30-11:00 A. Herten (FZJ) – Enabling the Next Generation of Particle Physics Experiments: GPUs for Online Track Reconstruction (slides: aherten-ccoe-dresden)
11:00-11:30 M. Vogelsang (KIT) – Real-time X-Ray image reconstruction at ANKA
11:30-13:00 Lunch Break
13:00-14:30 “Bring your own GPU Challenge” – Panel
14:30-15:00 Coffee Break
15:00-15:30 J. Köster (U Duisburg-Essen) – Massively parallel mapping of DNA reads
15:30-16:00 P. Karas (U Brno) – (De)convolving huge biomedical images on GPUs (cancelled)
16:00-16:30 E. Siragusa (FU Berlin) – Generic sequence analysis on GPUs (cancelled)
L. Royer / M. Weigert (MPI-CBG) “Fast Imaging of Biological Processes with GPU Accelerated Smart Microscopes”
Light-sheet microscopy recently emerged as the technology of choice for in-vivo and in-toto imaging of developing organisms. Living embryos can be imaged with exquisite temporal and spatial resolution, generating terabytes of volumetric data. The highest imaging quality is attained by smart microscopes capable of adapting their imaging parameters to the developing sample. This adaptivity, together with the ever-increasing amounts of data generated, poses great challenges for real-time image processing tasks such as denoising, deconvolution, fusion, and super-resolution decoding. We demonstrate how GPUs are an important ingredient for building smart high-resolution microscopes.
M. Steuwer (U Münster) “High-Level Programming of Multi-GPU Systems”
Application development for modern high-performance systems with Graphics Processing Units (GPUs) currently relies on low-level programming approaches like CUDA and OpenCL, which leads to complex, lengthy and error-prone programs. In this talk we present SkelCL – a high-level programming approach for systems with multiple GPUs. SkelCL makes three main enhancements to the OpenCL standard:
1) memory management is simplified using parallel container data types;
2) an automatic data (re)distribution mechanism allows for implicit data movements between GPUs;
3) computations are conveniently expressed using parallel algorithmic patterns (skeletons).
SkelCL can be used together with dOpenCL, a distributed implementation of the OpenCL standard, to target distributed systems with GPUs. Our dOpenCL runtime system allows OpenCL applications to be executed transparently on a distributed system by making remote GPUs available to them. In the talk we present both SkelCL and dOpenCL and show how they can be used together to simplify the programming of distributed systems with GPUs.
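As a rough illustration of the skeleton idea named in point 3, the sketch below composes a dot product from generic patterns. It is plain sequential Python, not SkelCL's API: the names map_skeleton, zip_skeleton and reduce_skeleton are hypothetical and chosen only for this example. In a skeleton library the same composition would be executed in parallel on one or more GPUs.

```python
from functools import reduce
from typing import Callable, Sequence


def map_skeleton(f: Callable, xs: Sequence) -> list:
    """Apply f independently to every element (hypothetical helper)."""
    return [f(x) for x in xs]


def zip_skeleton(f: Callable, xs: Sequence, ys: Sequence) -> list:
    """Combine two equally long sequences element-wise with f."""
    if len(xs) != len(ys):
        raise ValueError("zip_skeleton requires sequences of equal length")
    return [f(x, y) for x, y in zip(xs, ys)]


def reduce_skeleton(f: Callable, xs: Sequence, identity):
    """Fold a sequence into one value with a binary operator f."""
    return reduce(f, xs, identity)


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product expressed as zip (multiply) followed by reduce (add)."""
    products = zip_skeleton(lambda x, y: x * y, a, b)
    return reduce_skeleton(lambda x, y: x + y, products, 0.0)
```

The point of the pattern is that the user supplies only the small element-wise functions, while the library decides how data is distributed and how the work is scheduled.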
A. Herten (FZJ) “Enabling the Next Generation of Particle Physics Experiments: GPUs for Online Track Reconstruction”
The PANDA experiment is a hadron physics experiment involving a novel data acquisition mechanism. Commonly, particle physics experiments read out the full detector response of particle collisions only when a fast hardware-level trigger fires. In contrast, PANDA uses a sophisticated event filtering scheme which involves reconstruction of the whole incoming data stream in real time to distinguish signal from background events. At an event rate of 20 million per second, a massive amount of computing power is needed to reduce the incoming data rate of 200 GB/s by three orders of magnitude for permanent storage. We explore the feasibility of using GPUs for this task. The talk outlines the challenges PANDA faces with data acquisition and presents the status of the GPU investigations. Different tracking algorithms running on GPUs are shown and their features and performance highlighted.
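The figures quoted in the abstract imply two derived quantities that are easy to check. The short sketch below computes them, assuming decimal units (1 GB = 10^9 bytes) and reading "three orders of magnitude" as a factor of exactly 1000; both are assumptions of this example, and the function names are hypothetical.

```python
# Figures taken from the abstract.
EVENT_RATE_PER_S = 20_000_000          # 20 million events per second
INPUT_RATE_BYTES_PER_S = 200 * 10**9   # 200 GB/s, assuming 1 GB = 10^9 bytes
REDUCTION_FACTOR = 1000                # three orders of magnitude (assumed exact)


def average_event_size(input_rate: int, event_rate: int) -> float:
    """Average number of bytes per event in the incoming stream."""
    return input_rate / event_rate


def storage_rate(input_rate: int, reduction: int) -> float:
    """Data rate left for permanent storage after online filtering."""
    return input_rate / reduction


if __name__ == "__main__":
    size = average_event_size(INPUT_RATE_BYTES_PER_S, EVENT_RATE_PER_S)
    stored = storage_rate(INPUT_RATE_BYTES_PER_S, REDUCTION_FACTOR)
    print(f"average event size: {size / 1e3:.0f} kB")
    print(f"rate to storage:    {stored / 1e6:.0f} MB/s")
```

Under these assumptions an average event carries about 10 kB, and roughly 200 MB/s remains to be written to storage.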
M. Vogelsang (KIT) “Real-time X-Ray image reconstruction at ANKA”
Due to their massively parallel and graphics-focused hardware architecture, GPUs are well suited for image processing tasks such as tomographic and laminographic reconstruction of 3-D and 4-D X-ray data acquired at synchrotron beamlines. In this talk we will present our current efforts towards a near real-time streaming reconstruction setup used at the Ångströmquelle Karlsruhe (ANKA), KIT’s synchrotron for, among other things, X-ray microtomography. The system is based on a multi-core and multi-GPU architecture that uses task graphs as an algorithmic description, which are mapped to OpenCL devices at run-time. For further performance increases, local mapping is extended to available cluster nodes. With the system integrated into our recently developed control system, a fast DAQ and a Python-to-OpenCL compiler, we are able to reconstruct data from any X-ray tomography and laminography experiment in a fast and flexible way.
J. Köster (U Duisburg-Essen) “Massively parallel mapping of DNA reads”
We present PEANUT (ParallEl AligNment UTility), a highly parallel GPU-based read mapper for DNA reads obtained with next-generation sequencing. PEANUT has several distinguishing features, including a novel q-gram index (called the q-group index) with a small memory footprint that is built on-the-fly over the reads, and the option to output either the best hits or all hits of a read. By designing the algorithm specifically for the GPU architecture, we were able to reach maximum core occupancy for several key steps. Our benchmarks show that PEANUT outperforms other state-of-the-art mappers in terms of speed and sensitivity.
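For readers unfamiliar with q-gram indexing, the following is a minimal sequential sketch of a textbook q-gram index built over the reads. It is not PEANUT's q-group index and not its GPU implementation; the function names and the voting scheme are assumptions made for this example only.

```python
from collections import defaultdict


def build_qgram_index(reads, q):
    """Map every length-q substring to the (read_id, position) pairs where it occurs."""
    index = defaultdict(list)
    for read_id, read in enumerate(reads):
        for pos in range(len(read) - q + 1):
            index[read[pos:pos + q]].append((read_id, pos))
    return index


def candidate_hits(index, reference, q):
    """Count shared q-grams per (read_id, diagonal).

    The diagonal is the reference position minus the read position, so a
    high count suggests that the read aligns starting at that offset.
    """
    votes = defaultdict(int)
    for ref_pos in range(len(reference) - q + 1):
        qgram = reference[ref_pos:ref_pos + q]
        for read_id, read_pos in index.get(qgram, ()):
            votes[(read_id, ref_pos - read_pos)] += 1
    return votes


if __name__ == "__main__":
    reads = ["ACGT", "GTAC"]
    index = build_qgram_index(reads, q=3)
    for (read_id, offset), count in sorted(candidate_hits(index, "TTACGTAC", q=3).items()):
        print(f"read {read_id} at reference offset {offset}: {count} shared q-grams")
```

Candidates with many shared q-grams would then be passed to a verification step that computes the actual alignment; that step is omitted here.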