Invitation: Informatik-Oberseminar, Simon Schwitanski

+**********************************************************************
*
*                            Invitation
*
*                      Informatik-Oberseminar
*
+**********************************************************************

Time:     Wednesday, February 5, 2025, 13:00
Location: Seminar room 003, IT Center, Kopernikusstraße 6
Speaker:  Simon Schwitanski, M.Sc.
          Chair for High Performance Computing (Informatik 12)
Topic:    Modeling Synchronization and Consistency for Data Race
          Detection in Remote Memory Access Programs

Abstract:

The increasing parallelism in today's supercomputers requires scalable parallel programming methods with efficient communication models. The traditional communication model in scientific computing is message passing: both the sending and the receiving process participate actively in the data exchange via messages. Remote Memory Access (RMA) models provide an alternative communication method where processes can access the memory of other processes directly. RMA models avoid unnecessary synchronization between processes and outperform the traditional message-passing model in modern supercomputers. However, they require users to explicitly ensure the synchronization and consistency of memory accesses through corresponding API calls. Otherwise, concurrent conflicting memory accesses lead to data races with undefined behavior. The non-deterministic nature of data races, commonly known from shared-memory programming, makes them difficult to detect manually.

This thesis investigates data races in RMA programs and provides novel scalable methods to detect them at runtime. A classification of data races in the RMA models MPI RMA, OpenSHMEM, and GASPI shows that synchronization and consistency are the two key properties that a correctness tool must capture to identify RMA races. This thesis provides formal models that allow analyzing both properties in RMA programs. The synchronization model analyzes the happened-before relation of events using a vector clock exchange. It captures the synchronization state of processes at runtime. The consistency model formalizes a relation defining when a remote memory access is guaranteed to be completed. Both models are combined in a generalized on-the-fly race detection model that can detect RMA data races independent of the concrete RMA model used in an application.

The developed race detection model is implemented in a tool named RMASanitizer. It combines the shared-memory race detector ThreadSanitizer with the correctness checking tool MUST to detect RMA data races in MPI RMA, OpenSHMEM, and GASPI at runtime. For the evaluation, this thesis provides RMARaceBench, a classification quality benchmark suite designed to quantify the detection accuracy of RMA race detection tools. The evaluation with RMARaceBench shows that RMASanitizer has the highest detection accuracy compared to other state-of-the-art RMA race detectors. An overhead study with RMA proxy applications running with up to 700 processes shows that RMASanitizer is applicable to large-scale workloads.

You are invited by: the lecturers of Computer Science