Hybrid Memory Hierarchies for Next Generation HPC Systems

Talk
Richard Johnson
Time: 07.12.2024 10:00 to 12:00
Location: IRB IRB-5165

Demands on computational performance, power efficiency, data transfer, resource capacity, and resilience for next-generation high-performance computing (HPC) systems present a host of new challenges. There is a growing disparity between computational performance and the throughput of network and storage devices, as well as among the energy costs of computational, memory, and communication operations. Chapel is a high-level parallel language based on the partitioned global address space (PGAS) model. It is designed to streamline development by reducing code complexity, and it presents a shared-memory view of large distributed-memory systems. We propose to extend Chapel with support for persistent memory through both intrinsic and programmatic features for HPC systems. In our approach, we will explore the efficacy of persistent memory in a hybrid-PGAS environment through three activities: latency-hiding analysis via cache monitoring, identification and mitigation of performance bottlenecks via data-centric analysis, and hardware profiling to assess performance costs versus benefits and the energy footprint. To manage persistence and ensure resilience, we propose to develop a transaction system with ACID properties that supports hybrid-PGAS virtual addressing, together with a distributed checkpoint system.