news 2026/6/24 14:10:43

CANN/catlass Sparse Matrix Multiplication Example

张小明

Front-end Development Engineer


SparseMatmulTla Example Readme

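catlass is the operator template library of CANN. It provides template examples of high-performance matrix multiplication and related fused operators on NPUs. Project address: https://gitcode.com/cann/catlass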

Code Organization

    ├── 41_sparse_matmul_tla
    │   ├── CMakeLists.txt          # CMake build file
    │   ├── README.md
    │   ├── sparse_gen_data.py
    │   └── sparse_matmul_tla.cpp   # Main file

Example

  • After obtaining the code, compile the operator executable file. For details, see Template Library Quick Start.

  • Run sparse_gen_data.py to generate a test sample. The shape of the sample (m, n, and k) is passed on the command line. After the command is executed, the input and output directories are generated in the specified path. They contain the input data of the operator and the golden data used for precision verification.

  • Then, execute the operator. The input shape of the operator must match the shape of the data generated in the previous step. In addition, this sample supports only the int8_t data type for input matrices A and B.
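
Because the operator reads the data that the generator writes, it can help to confirm that both directories exist before running the operator. The following is a minimal sketch of such a check. The function name check_sample_dirs is hypothetical and is not part of catlass; the directory names input and output are the ones described above.

```shell
# Hypothetical helper (not part of catlass): verify that the data generator
# left both the input and output directories behind before the operator runs.
check_sample_dirs() {
    local base="${1:-.}"   # directory in which sparse_gen_data.py was run
    local dir
    for dir in input output; do
        if [ ! -d "$base/$dir" ]; then
            echo "Missing directory: $base/$dir" >&2
            return 1
        fi
    done
    echo "Sample directories found in $base"
}

# Example use, from the project directory:
#   check_sample_dirs examples/41_sparse_matmul_tla || exit 1
```

The check only confirms that the directories are present. It does not confirm that the files inside them were generated for the same m, n, and k that are later passed to the operator.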

The following is a complete shell script example (run in the project directory):

    m=160
    n=320
    k=64
    device=0

    function build() {
        bash scripts/build.sh 41_sparse_matmul_tla
    }

    function gen_data() {
        cd examples/41_sparse_matmul_tla
        python3 sparse_gen_data.py $m $n $k
        echo "Data gen finished"
    }

    function run_kernel {
        echo 'Case: m=' $m ' k=' $k ' n=' $n
        cd ../../output/bin/
        cp -r ../../examples/41_sparse_matmul_tla/input .
        cp -r ../../examples/41_sparse_matmul_tla/output .
        ./41_sparse_matmul_tla $m $n $k $device
    }

    build
    gen_data
    run_kernel

If the following result is displayed, precision verification is successful.

Compare success.
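
When the sample runs inside a larger script, the success message can be checked automatically instead of by eye. The sketch below assumes that the executable writes "Compare success." to standard output; the source only says that the message is displayed. The function name verify_compare is hypothetical and is not part of catlass.

```shell
# Hypothetical helper (not part of catlass): succeed only if the text on
# standard input contains the message the sample prints on success.
# Assumption: the sample writes "Compare success." to standard output.
verify_compare() {
    if grep -qF "Compare success."; then
        echo "Precision verification passed"
    else
        echo "Precision verification failed" >&2
        return 1
    fi
}

# Example use, from output/bin, with the same arguments as the script above:
#   ./41_sparse_matmul_tla "$m" "$n" "$k" "$device" | verify_compare
```

In a pipeline the exit status is that of verify_compare, so a calling script can stop on a failed comparison with a plain `|| exit 1`.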
