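Chinese Academy of Sciences Institutional Repositories Grid: Machine-Learning-Based Extraction of High-Intensity Test Cases for Processor Verification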

Chinese Academy of Sciences Institutional Repositories Grid

Machine-Learning-Based Extraction of High-Intensity Test Cases for Processor Verification (基于机器学习的处理器验证高强度用例提取)

Document type: Thesis


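Author	Qi Hang (戚航)
Degree	Master's
Defense date	2022-05-25
Degree-granting institution	Graduate University of Chinese Academy of Sciences (中国科学院研究生院)
Location	Beijing
Advisor	Yang Qiusong (杨秋松)
Keywords	machine learning; functional verification; test case generation
Major	Computer Software and Theory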
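Abstract (translated from the Chinese)	The CPU (central processing unit, or processor) is the core component of a modern computer, and its design scale and complexity keep growing as functionality is enhanced and process technology advances. Defects are inevitably introduced during processor design and implementation. Because the cost of "trial and error" for a processor is extremely high, defects must be found and fixed through a variety of verification techniques before tape-out, so that as few as possible remain in the chip. Coverage-driven test generation is currently a common approach to processor functional verification: a large number of test cases are used to cover the defined functional points of the processor until coverage converges. Some coverage-driven test generation studies based on machine learning can greatly reduce the time cost of verification. However, these methods still take a long time to go through a complete convergence process. If general rules could be discovered from the test cases that cover functional points, so that functional coverage of the processor becomes more complete, verification efficiency could be improved further. Extracting features directly from the instruction fragments of test cases runs into several problems: the continuity between instructions is broken, and the features are hard to generalize and quantify. This thesis therefore introduces the concept of a program pattern. Test cases are generated by producing prescribed instruction templates in the riscv-dv-x86 tool; the features of program patterns are learned, and rules that help functional coverage are discovered from them, so that high-intensity test cases, which cover more functional points, can be obtained more effectively.
The main work and contributions of this thesis are: (1) The concept of a program pattern is introduced. Constraining the program pattern makes it possible to generate test cases with the corresponding characteristics, and the two kinds of attributes it contains that can be used for learning, configuration parameters and fields, are described. A test case pool based on single program patterns is built with the riscv-dv-x86 test generation tool, and the thesis explains how to obtain the functional coverage of the cases for a single program pattern after regression verification. (2) The functional coverage of the cases for a single program pattern is used as the criterion for intensity, and the relation between coverage and program pattern features is explored. Data sets are formed from the coverage and each of the two kinds of program pattern features. Depending on whether supervision labels are available and on the label type, three machine learning methods are applied and improved: linear regression with M5P, clustering with adaptive feature weights, and decision-tree-based classification. The corresponding models are built and their accuracy is evaluated, so that high-intensity cases can be extracted from the case pool. (3) For the cases extracted by each method, the set coverage reached through regression verification, the degree of case aggregation, and the average degree to which the selection reproduces the whole case pool are evaluated and analyzed, and the practical effect and characteristics of each method in selecting case sets are compared and summarized. Each method yields the parameters of a model that can select high-intensity cases, and the thesis explains how to use these models to judge the intensity of untested cases. The three methods are combined to improve the discovery of high-intensity rules. Experiments show that the approach discovers and extracts high-intensity test cases effectively, and that applying the machine learning methods greatly reduces redundant cases. Some of the methods reveal salient feature rules; constraining certain program pattern features according to these rules can guide the convergence of coverage in processor functional verification and thus speed up the verification process.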
English abstract	The CPU is a key component of the modern computer, and its design scale and complexity keep increasing as functionality is enhanced and process technology advances. Defects are inevitably introduced during processor design and implementation. Because the cost of "trial and error" for a processor is high, defects must be found and repaired before tape-out so that as few as possible are left in the chip. At present, coverage-driven test generation is a widely used method for processor verification. In this method, a large number of test cases are used to cover the defined processor function points until coverage converges. Coverage-driven test generation based on machine learning can significantly reduce the time cost of verification. However, these methods still need a long time to go through a complete convergence process. Forming general rules for selecting test cases can therefore be important for improving verification coverage. Analyzing the exact behavior of test cases can be difficult. Although each test case can be partitioned, the features extracted from fragments of a case can deviate from the intrinsic features of the case. In this thesis, test cases are described through templates, from which the patterns contained in each case can be extracted. As a result, the features of test cases can be obtained accurately in a straightforward way. We present a set of machine learning approaches for finding test cases that have the potential to improve verification coverage.
The contributions of this thesis include: (1) We introduce the concept of a program pattern, generate test cases with the corresponding characteristics by restricting the program pattern, and introduce the configuration parameters and fields contained in the test cases, which can be used for learning; we generate test cases and construct a case pool according to program patterns provided by the riscv-dv-x86 tool, and explain how to obtain the functional coverage of test cases generated from program patterns after regression verification; (2) We use the functional coverage of the cases generated from program patterns as the criterion of intensity and explore the relation between verification coverage and the features of program patterns; we form data sets based on the coverage and the features of program patterns; depending on whether labels exist and on their type, we propose three improved machine learning approaches for selecting test cases that can efficiently improve verification coverage; (3) We evaluate the set coverage, aggregation degree, and average reduction degree relative to the whole case pool of the test cases selected by the presented approaches, and summarize the practical effect and characteristics of the three methods; we obtain, for each method, the parameters of the corresponding model that can select high-intensity cases, and illustrate how to use these machine learning models to judge the intensity of untested cases; we combine the three methods to improve the discovery of high-intensity rules. The experimental results show that the selected test cases can be leveraged to improve verification coverage. With some of the methods, we can find significant rules; restricting some program pattern features according to these rules can guide the convergence of coverage in processor functional verification, which effectively speeds up the verification process.
Subject	Computer System Design
Language	Chinese
Source URL	[http://ir.iscas.ac.cn/handle/311060/19514]
Collection	General Department_Theses (总体部_学位论文)
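Recommended citation (GB/T 7714)	Qi Hang (戚航). 基于机器学习的处理器验证高强度用例提取 [Machine-Learning-Based Extraction of High-Intensity Test Cases for Processor Verification][D]. Beijing. Graduate University of Chinese Academy of Sciences. 2022.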

Ingest method: OAI harvesting

Source: Institute of Software

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.