Close

Presentation

This content is available for: Tech Program Reg Pass. Upgrade Registration
Choosing the Best Parallelization and Implementation Styles for Graph Analytics Codes: Lessons Learned from 1106 Programs
DescriptionGraph analytics has become a major workload in recent years. The underlying core algorithms tend to be irregular and data dependent, making them challenging to parallelize. Yet, these algorithms can be implemented and parallelized in many ways for CPUs and even more ways for GPUs. We took 6 key graph algorithms and created hundreds of parallel CUDA, OpenMP, and parallel C++ versions of each of them, most of which have never been described or studied. To determine which parallelization and implementation styles work well and under what circumstances, we evaluated the resulting 1106 programs on 2 GPUs and 2 CPUs using 5 input graphs. Our results show which styles and combinations thereof work well and which ones should be avoided. We found that choosing the wrong implementation style can yield over a 10x performance loss on average. The worst combinations of styles can cost 6 orders of magnitude in performance.
Event Type
Paper
TimeThursday, 16 November 20234pm - 4:30pm MST
Location403-404
Tags
Architecture and Networks
Data Movement and Memory
Graph Algorithms and Frameworks
Performance Measurement, Modeling, and Tools
Programming Frameworks and System Software
Registration Categories
TP
Reproducibility Badges