Thread Similarity Matrix: Visualizing Branch Divergence in GPGPU programs
Document type: Conference paper
Authors | Zhibin Yu; Lieven Eeckhout; Chengzhong Xu |
Publication date | 2016 |
Conference name | The 45th International Conference on Parallel Processing (ICPP-2016) |
Conference location | Philadelphia, USA |
Abstract (English) | Graphics processing units (GPUs) have recently evolved into popular accelerators for general-purpose parallel programs -- so-called GPGPU computing. Although programming models such as CUDA and OpenCL significantly improve GPGPU programmability, optimizing GPGPU programs is still far from trivial. Branch divergence is one of the root causes of reduced GPGPU performance. Existing approaches can calculate the branch divergence rate but cannot reveal how the branches diverge in a GPGPU program. In this paper, we propose the Thread Similarity Matrix (TSM) to visualize how branches diverge and, in turn, help find optimization opportunities. TSM contains an element for each pair of threads, representing the difference in the code executed by that pair of threads. The darker the element, the more similar the threads are; the lighter the element, the more dissimilar. TSM therefore allows GPGPU programmers to easily understand an application's branch divergence behavior and pinpoint performance anomalies. We present a case study to demonstrate how TSM can help optimize GPGPU programs: we improve the performance of a highly optimized GPGPU kernel by 35% by reorganizing its threads to reduce its branch divergence rate. |
Indexed by | EI |
Language | English |
Source URL | [http://ir.siat.ac.cn:8080/handle/172644/10320] |
Collection | Shenzhen Institutes of Advanced Technology_Digital Institute (数字所) |
Author affiliation | 2016 |
Recommended citation (GB/T 7714) | Zhibin Yu, Lieven Eeckhout, Chengzhong Xu. Thread Similarity Matrix: Visualizing Branch Divergence in GPGPU programs[C]. In: The 45th International Conference on Parallel Processing (ICPP-2016). Philadelphia, USA. |
Ingestion method: OAI harvesting
Source: Shenzhen Institutes of Advanced Technology
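
Illustrative note: the abstract describes TSM as a matrix with one element per pair of threads, where each element encodes how different the executed code of the two threads is and darker elements mean more similar threads. The sketch below shows one way such a matrix could be built and displayed. It is not the authors' implementation: the per-thread basic-block traces, the half-Manhattan-distance metric, the ASCII shading, and all function names (`block_profile`, `difference`, `thread_similarity_matrix`, `render`) are assumptions made for illustration only.

```python
from collections import Counter


def block_profile(trace):
    """Turn a thread's basic-block trace into execution frequencies (assumed input format)."""
    counts = Counter(trace)
    total = sum(counts.values())
    return {block: n / total for block, n in counts.items()} if total else {}


def difference(p, q):
    """Half the Manhattan distance between two profiles: 0 = identical, 1 = disjoint.

    This metric is an assumption; the abstract does not specify one.
    """
    blocks = set(p) | set(q)
    return 0.5 * sum(abs(p.get(b, 0.0) - q.get(b, 0.0)) for b in blocks)


def thread_similarity_matrix(traces):
    """Build an n x n matrix with one difference value per pair of threads."""
    profiles = [block_profile(t) for t in traces]
    n = len(profiles)
    return [[difference(profiles[i], profiles[j]) for j in range(n)] for i in range(n)]


def render(matrix, shades="#+. "):
    """Render the matrix as text; denser glyphs stand in for darker (more similar) cells."""
    last = len(shades) - 1
    return "\n".join(
        "".join(shades[min(int(d * len(shades)), last)] for d in row)
        for row in matrix
    )


# Hypothetical traces: threads 0-1 take one side of a branch (block B),
# threads 2-3 take the other side (block C).
traces = [
    ["A", "B", "D"],
    ["A", "B", "D"],
    ["A", "C", "D"],
    ["A", "C", "D"],
]
tsm = thread_similarity_matrix(traces)
print(render(tsm))
```

In this toy input the two pairs of threads that follow the same path form dark blocks along the diagonal, while the off-diagonal blocks are lighter. Under these assumptions, a pattern like this inside a single warp would suggest regrouping threads so that those taking the same branch run together.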