Presentation

Runtime Composition of Iterations for Fusing Loop-Carried Sparse Dependence
Description
Dependence between iterations in sparse computations causes inefficient use of memory and computation resources. This paper proposes sparse fusion, a technique that generates efficient parallel code for the combination of two sparse matrix kernels, where at least one of the kernels has loop-carried dependencies. Existing implementations optimize individual sparse kernels separately. However, this approach leads to synchronization overheads and load imbalance due to the irregular dependence patterns of sparse kernels, as well as inefficient cache usage due to their irregular memory access patterns. Sparse fusion uses a novel inspection strategy and code transformation to generate parallel fused code optimized for data locality and load balance. Sparse fusion outperforms the best of unfused implementations using ParSy and MKL by an average of 4.2× and is faster than the best of fused implementations using existing scheduling algorithms, such as LBC, DAGP, and wavefront, by an average of 4× for various kernel combinations.
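To make the terms in the description concrete, here is a minimal Python sketch of a kernel pair of the kind the abstract refers to: a sparse triangular solve (which has loop-carried dependence) followed by a sparse matrix-vector product (which does not), plus the classic wavefront (level-set) inspection that the abstract names as one of the existing scheduling baselines. This is not the paper's sparse fusion algorithm. The function names, the CSR layout, and the example matrix are all assumptions made for illustration.

```python
def sptrsv_csr(indptr, indices, data, b):
    """Solve L x = b by forward substitution; L is lower triangular in CSR.

    Row i reads x[j] for every off-diagonal nonzero L[i, j], so iteration i
    depends on earlier iterations: a loop-carried dependence.
    """
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        acc = b[i]
        diag = None
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            if j == i:
                diag = data[k]
            else:
                acc -= data[k] * x[j]
        x[i] = acc / diag
    return x


def spmv_csr(indptr, indices, data, x):
    """Compute y = A x; rows are independent, so there is no loop-carried dependence."""
    n = len(indptr) - 1
    return [
        sum(data[k] * x[indices[k]] for k in range(indptr[i], indptr[i + 1]))
        for i in range(n)
    ]


def wavefront_levels(indptr, indices):
    """Inspector: group rows of a lower-triangular CSR matrix into wavefronts.

    Rows in the same wavefront have no dependence on each other and could run
    in parallel; consecutive wavefronts need a synchronization between them.
    """
    n = len(indptr) - 1
    level = [0] * n
    for i in range(n):
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            if j < i:
                level[i] = max(level[i], level[j] + 1)
    waves = [[] for _ in range(max(level, default=-1) + 1)]
    for row, lvl in enumerate(level):
        waves[lvl].append(row)
    return waves


# Hypothetical 4x4 lower-triangular example matrix in CSR form.
indptr = [0, 1, 2, 4, 7]
indices = [0, 1, 0, 2, 1, 2, 3]
data = [2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0]
b = [2.0, 3.0, 4.0, 9.0]

waves = wavefront_levels(indptr, indices)  # rows 0 and 1 are independent
x = sptrsv_csr(indptr, indices, data, b)   # kernel 1: carries dependence
y = spmv_csr(indptr, indices, data, x)     # kernel 2: consumes x
```

In this sketch the two kernels run back to back, so `x` is fully written before the second kernel reads it. That separation is the unfused approach the description contrasts against; a fused schedule would instead interleave iterations of both kernels so that entries of `x` are reused while still in cache.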
Event Type
Paper
Time
Thursday, 16 November 2023, 4pm - 4:30pm MST
Location
405-406-407
Tags
Compilers
Performance Measurement, Modeling, and Tools
Performance Optimization
Programming Frameworks and System Software
Registration Categories
TP
Reproducibility Badges