Easton Man's Channel
19:47 · Jan 19, 2025 · Sun
Beating cuBLAS in Single-Precision General Matrix Multiplication
https://salykova.github.io/sgemm-gpu
salykova
Advanced Matrix Multiplication Optimization on NVIDIA GPUs
This blog post focuses on a GPU implementation of the SGEMM (Single-precision GEneral Matrix Multiply) operation, defined as C := alpha*A*B + beta*C. We’ll review the algorithm’s design and discuss optimization techniques such as inlined PTX, asynchronous memory…
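For reference, the SGEMM definition in the preview can be written as a plain CPU loop. This is a minimal sketch in C, not the linked post's GPU kernel: the function name `sgemm_ref`, the argument order, and the row-major layout are assumptions made for illustration only.

```c
#include <stddef.h>

/* Hypothetical reference routine (not from the linked post):
 * computes C := alpha * A * B + beta * C on the CPU.
 * Assumed layout: row-major, A is M x K, B is K x N, C is M x N. */
void sgemm_ref(int M, int N, int K,
               float alpha, const float *A, const float *B,
               float beta, float *C)
{
    for (int i = 0; i < M; ++i) {
        for (int j = 0; j < N; ++j) {
            /* Dot product of row i of A with column j of B. */
            float acc = 0.0f;
            for (int k = 0; k < K; ++k) {
                acc += A[(size_t)i * K + k] * B[(size_t)k * N + j];
            }
            /* Scale the product and blend it with the existing C value. */
            C[(size_t)i * N + j] = alpha * acc + beta * C[(size_t)i * N + j];
        }
    }
}
```

A loop like this is useful mainly as a correctness baseline: the output of an optimized GPU kernel can be compared against it on small inputs, within a floating-point tolerance.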