CODA: A Computation-Driven Paradigm for Sparse DNN Acceleration
Document type: Journal article
| Authors | Liu, Yanhuan (1,2); Li, Wenming (1,2); Zhang, Kunming (1,2); Liu, Tianyu (1,2); Ye, Xiaochun (1,2); An, Xuejun (1,2) |
| Journal | IEEE COMPUTER ARCHITECTURE LETTERS |
| Publication date | 2025-07-01 |
| Volume, issue, pages | Volume 24, Issue 2, Pages 381-384 |
| Keywords | Software; Hardware; Computational modeling; Sparse matrices; Pipelines; Indexes; Data models; Spatial databases; Computational efficiency; Vectors; Computation-driven architecture; sparse DNN acceleration; dataflow paradigm; unstructured sparsity; work tokenizer; dynamic execution core; asynchronous execution |
| ISSN | 1556-6056 |
| DOI | 10.1109/LCA.2025.3637756 |
| Abstract | The spatial dataflow paradigm, while effective for dense workloads, collapses on unstructured sparsity, manifesting as a profound Twin Crisis: a high Latency Overhead (termed the Latency Crisis) driven by slow, dominant offline software preprocessing for data reordering and formatting, and an Inefficiency Crisis characterized by massive pipeline stalls that leave hardware underutilized. This paper introduces CODA, a novel computation-driven architecture that challenges this data-centric model by shifting the focus from "scheduling data" to "discovering work." CODA materializes this new paradigm through two core innovations. First, a hardware Work Tokenizer eliminates the preprocessing bottleneck on-the-fly, directly resolving the Latency Crisis. Second, a scalable Dynamic Execution Core performs asynchronous, pull-based dispatch of computable work packets to idle resources, which eradicates stalls and solves the Inefficiency Crisis. By directly resolving the Twin Crisis, CODA demonstrates superior performance over state-of-the-art accelerators like Mentor and Hirac. On the BERT-BASE benchmark, it achieves up to 3.95x higher speed, 3.4x greater energy efficiency, and a 3.1x reduction in end-to-end latency, establishing a necessary new architectural blueprint for sparse acceleration. |
| Funding | Beijing Natural Science Foundation [L234078] |
| WOS research area | Computer Science |
| Language | English |
| WOS accession number | WOS:001641467300003 |
| Publisher | IEEE COMPUTER SOC |
| Source URL | [http://119.78.100.204/handle/2XEOYT63/42943] |
| Collection | Institute of Computing Technology, Chinese Academy of Sciences |
| Corresponding author | Li, Wenming |
| Author affiliations | 1. Chinese Acad Sci, Inst Comp Technol, State Key Lab Processors, Beijing 100045, Peoples R China; 2. UCAS, Sch Comp Sci & Technol, Beijing 101408, Peoples R China |
| Recommended citation (GB/T 7714) | Liu, Yanhuan,Li, Wenming,Zhang, Kunming,et al. CODA: A Computation-Driven Paradigm for Sparse DNN Acceleration[J]. IEEE COMPUTER ARCHITECTURE LETTERS,2025,24(2):381-384. |
| APA | Liu, Yanhuan,Li, Wenming,Zhang, Kunming,Liu, Tianyu,Ye, Xiaochun,&An, Xuejun.(2025).CODA: A Computation-Driven Paradigm for Sparse DNN Acceleration.IEEE COMPUTER ARCHITECTURE LETTERS,24(2),381-384. |
| MLA | Liu, Yanhuan,et al."CODA: A Computation-Driven Paradigm for Sparse DNN Acceleration".IEEE COMPUTER ARCHITECTURE LETTERS 24.2(2025):381-384. |
Ingest method: OAI harvesting
Source: Institute of Computing Technology

