Presentation

Mixed-Precision S/DGEMM Using the TF32 and TF64 Frameworks on Low-Precision AI Tensor Cores
Description
Using NVIDIA Tensor Cores has enabled significant acceleration of general matrix multiplication for applications in AI and in high-performance computing. Such specialized accelerators can provide a performance increase of between 8x and 20x, albeit at a loss of precision. However, many applications require higher precision. Fortunately, mixed-precision methods can be employed to maintain high precision while still taking advantage of the performance of lower-precision AI cores. We extend the state of the art by using NVIDIA's new TF32 framework, which not only lifts some of the constraints of previous frameworks but also provides equivalent precision and performance with a much simpler approach. We also propose a new framework called TF64 that emulates double-precision arithmetic on low-precision Tensor Cores. Although this framework does not yet exist in hardware, we validated the correctness of the idea and achieved the equivalent of 64-bit precision on 32-bit hardware.
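For readers unfamiliar with these error-correction schemes, the following is a minimal NumPy sketch (ours, not the presenters') of the split-and-correct idea that underlies such mixed-precision GEMMs: each FP32 operand is split into a part representable in the lower-precision format plus a residual, and a few low-precision products are combined to recover near-full accuracy. The `round_to_tf32` helper and the matrix sizes are illustrative assumptions; a real implementation would issue the partial products as Tensor Core MMAs (e.g., via cuBLAS or WMMA) rather than CPU matmuls.

```python
import numpy as np

def round_to_tf32(x: np.ndarray) -> np.ndarray:
    """Keep only 10 explicit mantissa bits of an FP32 value,
    mimicking the TF32 input format of NVIDIA Tensor Cores
    (truncation rather than hardware round-to-nearest, for simplicity)."""
    bits = x.astype(np.float32).view(np.uint32)
    return (bits & np.uint32(0xFFFFE000)).view(np.float32)

def split_gemm_tf32(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Error-corrected GEMM: split each FP32 operand into a
    TF32-representable high part and a residual low part, then
    combine three low-precision products with FP32 accumulation."""
    a_hi = round_to_tf32(a)
    a_lo = (a - a_hi).astype(np.float32)
    b_hi = round_to_tf32(b)
    b_lo = (b - b_hi).astype(np.float32)
    # The a_lo @ b_lo term is negligible at FP32 resolution and is dropped.
    return a_hi @ b_hi + a_hi @ b_lo + a_lo @ b_hi

rng = np.random.default_rng(0)
a = rng.standard_normal((256, 256)).astype(np.float32)
b = rng.standard_normal((256, 256)).astype(np.float32)

naive = round_to_tf32(a) @ round_to_tf32(b)      # plain TF32 product
corrected = split_gemm_tf32(a, b)                # split-and-correct
exact = a.astype(np.float64) @ b.astype(np.float64)

for name, c in [("tf32", naive), ("corrected", corrected)]:
    err = np.abs(c - exact).max() / np.abs(exact).max()
    print(f"{name:10s} max relative error: {err:.2e}")
```

The same splitting idea extends upward: representing each FP64 operand as a pair of FP32 values and combining the partial products is, in spirit, how one can approach 64-bit accuracy on 32-bit hardware, which is the direction the proposed TF64 framework explores.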
Event Type
Workshop
Time
Sunday, 12 November 2023, 4:10pm - 4:30pm MST
Location
708
Tags
Applications
Software Engineering
Registration Categories
W