Presentation

Itoyori: Reconciling Global Address Space and Global Fork-Join Task Parallelism
Description
This paper introduces Itoyori, a task-parallel runtime system designed to tackle the challenge of scaling task parallelism (more specifically, nested fork-join parallelism) beyond a single node. The partitioned global address space (PGAS) model is often employed in task-parallel systems, but naively combining PGAS with task parallelism can lead to poor performance due to fine-grained and redundant remote memory accesses. Itoyori addresses this issue by automatically caching global memory accesses at runtime, enabling efficient cache sharing among parallel tasks running on the same processor. As a real-world case study, we ported an existing task-parallel implementation of the Fast Multipole Method (FMM) to distributed memory with Itoyori, achieving a 7.5x speedup when scaled from a single node to 12 nodes and up to 6.0x faster performance than without caching. This study demonstrates that global-view fork-join programming can be made practical and scalable while requiring minimal changes to the shared-memory code.
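To illustrate why caching helps with fine-grained remote accesses, here is a minimal Python sketch. It is not Itoyori's API or implementation: `RemoteMemory`, `BlockCache`, `tree_sum`, and `run` are hypothetical names, the "remote node" is simulated by a list with a fetch counter, the divide-and-conquer recursion runs sequentially rather than as parallel tasks, and cache coherence and eviction are ignored entirely.

```python
class RemoteMemory:
    """Hypothetical stand-in for memory on another node; counts simulated fetches."""

    def __init__(self, data):
        self._data = list(data)
        self.fetches = 0

    def get(self, start, count):
        # Each call models one remote memory access (one network round trip).
        self.fetches += 1
        return self._data[start:start + count]


class BlockCache:
    """Block-granular software cache shared by all tasks on one (simulated) process."""

    def __init__(self, remote, block_size):
        self.remote = remote
        self.block_size = block_size
        self._blocks = {}

    def read(self, index):
        block_id = index // self.block_size
        if block_id not in self._blocks:
            # Cache miss: fetch the whole block once; later reads hit locally.
            start = block_id * self.block_size
            self._blocks[block_id] = self.remote.get(start, self.block_size)
        return self._blocks[block_id][index % self.block_size]


def tree_sum(read, lo, hi):
    """Nested fork-join style reduction over [lo, hi), executed sequentially here."""
    if hi - lo == 1:
        return read(lo)
    mid = (lo + hi) // 2
    # In a real task-parallel runtime the two halves would be forked and joined.
    return tree_sum(read, lo, mid) + tree_sum(read, mid, hi)


def run(n=64, block_size=16):
    # Without caching: every leaf read is its own remote access.
    uncached = RemoteMemory(range(n))
    total_uncached = tree_sum(lambda i: uncached.get(i, 1)[0], 0, n)

    # With caching: reads go through a cache shared across the recursion.
    backing = RemoteMemory(range(n))
    cache = BlockCache(backing, block_size)
    total_cached = tree_sum(cache.read, 0, n)

    return total_uncached, uncached.fetches, total_cached, backing.fetches
```

Calling `run()` returns the two sums together with the number of simulated remote fetches in each case; with the default parameters the cached run issues one fetch per 16-element block instead of one per element.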
Event Type
Paper
Time
Tuesday, 14 November 2023, 11am - 11:30am MST
Location
301-302-303
Tags
Heterogeneous Computing
Programming Frameworks and System Software
Task Parallelism
Registration Categories
TP
Reproducibility Badges