hgemm

Here are 7 public repositories matching this topic...

xlite-dev / LeetCUDA

📚LeetCUDA: Modern CUDA Learning Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

cuda cuda-kernels cuda-demo cuda-toolkit cuda-library cuda-kernel learn-cuda cuda-cpp hgemm flash-attention leet-cuda cuda-12

Updated Mar 23, 2026
Cuda

Bruce-Lee-LY / cuda_hgemm

Several optimization methods for half-precision general matrix multiplication (HGEMM) on Tensor Cores, using the WMMA API and MMA PTX instructions.

gpu cuda cublas nvidia gemm matrix-multiply tensor-core hgemm

Updated Sep 8, 2024
Cuda

xlite-dev / HGEMM

⚡️Write HGEMM from scratch using Tensor Cores with the WMMA, MMA, and CuTe APIs to achieve peak⚡️ performance.

cuda tensor-cores hgemm

Updated May 10, 2025
Cuda

Bruce-Lee-LY / cuda_hgemv

Several optimization methods for half-precision general matrix-vector multiplication (HGEMV) on CUDA cores.

gpu cuda cublas nvidia gemm gemv matrix-multiply tensor-core hgemm cuda-core hgemv

Updated Sep 8, 2024
Cuda

loveSunning / FastCuda

FastCuda is a handwritten CUDA operator library featuring progressive GEMM and Reduce kernels, cuBLAS benchmarking, and C/C++/Python interfaces for learning, profiling, and performance optimization.

reduce spmv sgemm spmm cudac sgemv tensor-core hgemm flash-attention wmma

Updated Mar 18, 2026
Cuda

Bruce-Lee-LY / cuda_back2back_hgemm

Computes back-to-back HGEMM (half-precision general matrix multiplication) on Tensor Cores using MMA PTX instructions.

gpu cuda cublas nvidia gemm matrix-multiply tensor-core hgemm back2back-hgemm fused-hgemm back2back-gemm fused-gemm

Updated Nov 3, 2023
Cuda

CelexK / cuda-course-remotion

Generate narrated CUDA course videos with animated slides and AI avatars using Remotion, Gemini, and ElevenLabs TTS for automated production.

nlp security course deep-learning linear-algebra slides parallel-computing high-performance-computing cuda-opengl cusparse hgemm nppcublas cudss cutenros nvcomp nvjpeg2000 nvtiff cuda-12

Updated Apr 3, 2026
TypeScript
