Chinese Academy of Sciences Institutional Repositories Grid: Computer-Aided Gastroscopic Diagnosis Based on Machine Learning

Chinese Academy of Sciences Institutional Repositories Grid

Computer-Aided Gastroscopic Diagnosis Based on Machine Learning (基于机器学习的胃镜计算机辅助诊断算法研究)

Document type: Dissertation


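Author	Wang Shuai (王帅)
Degree	Doctorate
Defense date	2016-11-30
Degree-granting institution	Shenyang Institute of Automation, Chinese Academy of Sciences
Advisors	Tang Yandong (唐延东); Cong Yang (丛杨)
Keywords	image quality assessment; video summarization; sparse representation; feature selection; lesion detection
Alternative title	Computer-Aided Gastroscopic Diagnosis Based on Machine Learning
Major	Pattern Recognition and Intelligent Systems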
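Abstract (translated from Chinese)	Computer-aided diagnosis (CAD) has been called the doctor's "third eye". It combines imaging, image processing, and other auxiliary techniques with the analytical power of computers to help doctors reach a diagnosis, and it has helped improve diagnostic accuracy, raise working efficiency, and support patient recovery. Although CAD research has made considerable progress, most studies are limited to brain CT, brain MRI, and chest CT data, and few address gastroscopic images. Meanwhile, with environmental pollution, a faster pace of life, and growing life pressure, the incidence of stomach disease is rising year by year; in China alone, nearly 400,000 patients are diagnosed with gastric cancer each year. Fast and accurate diagnosis and treatment of stomach disease from gastroscopic image data therefore has significant theoretical and practical value. This dissertation studies gastroscopic image quality assessment, gastroscopic video summarization, image feature selection, and multiple-instance classification based on gastroscopic diagnostic reports, and proposes novel models and algorithms that draw on the learning and generalization capabilities of machine learning. The main contents and contributions are as follows:
(1) Because gastroscopic images are easily affected by food residue, digestive fluid, and uneven illumination inside the digestive tract, a machine-learning-based gastroscopic image quality assessment algorithm is proposed. It aims to detect low-quality images that carry no diagnostic information, which reduces the complexity of CAD learning algorithms and improves model accuracy. According to the cause of degradation, images are divided into four classes: clear, oversaturated, low-illumination, and out-of-focus; the latter three are low quality. For classification, a random forest is used to build a stable, efficient classifier that needs no prior information, and it effectively detects and removes low-quality images.
(2) A gastroscopic video summarization algorithm based on similarity-inhibited dictionary selection is proposed to enable fast browsing of gastroscopic videos. Video summarization is recast as a sparse dictionary selection problem, and a similarity-inhibition constraint is added to the conventional model: it restricts the likelihood that images with similar content are selected as key frames at the same time, which ensures diversity among the selected key frames. In addition, based on the characteristics of human vision and the conditions under which gastroscopic images are acquired, an attention prior is proposed and incorporated into the sparse dictionary selection model so that the selected key frames better match human viewing habits and give doctors more diagnostic information. The algorithm summarizes gastroscopic videos automatically and allows the number of key frames to be set arbitrarily; it can speed up doctors' review of videos and can also be used in practical settings such as doctor training and diagnostic report generation.
(3) In medical image feature representation, features are high-dimensional and contain redundant and irrelevant information across dimensions, which interferes with diagnosis and lowers diagnostic accuracy. To address this, a deep feature selection algorithm based on a group sparsity constraint is proposed. To broaden its applicability, supervised and unsupervised deep feature selection models are proposed for training data with and without labels, respectively. Unlike traditional feature selection models, the proposed models assign a reasonable weight to each feature dimension while selecting an effective feature subset, which improves scalability. The algorithm removes interfering features effectively and also improves execution efficiency.
(4) An online lesion detection algorithm based on multiple instance learning is proposed. Gastroscopic diagnostic reports serve as the data source, and a text matching algorithm extracts label information from the diagnostic text. Because a report contains multiple images while the extracted label applies to the report as a whole, the task is cast as a multiple instance learning problem. To build an effective feature representation of a report, the proposed multi-view voting model first selects the most likely lesion image from each positive bag, and each report is then mapped into the feature space formed by the selected images. Finally, an online metric model with a low-rank constraint is used to build the classifier that classifies reports. The algorithm removes the dependence on manually annotated samples, which eases the collection of large-scale training data; the online model can also be updated continually with new samples, which improves generalization and better fits practical needs.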
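The interplay of the similarity-inhibition constraint and the attention prior in contribution (2) can be illustrated with a small sketch. This is not the dissertation's sparse dictionary selection model: it swaps in a simple greedy selection, and the function names, the cosine similarity measure, the `attention` weights, and the `inhibition` parameter are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_key_frames(features, attention, k, inhibition=1.0):
    """Greedily pick up to k frame indices (hypothetical stand-in method).

    Each candidate is scored by its attention weight minus a penalty
    proportional to its highest similarity to any frame already chosen,
    so near-duplicate frames are unlikely to be selected together.
    """
    chosen = []
    candidates = set(range(len(features)))
    while candidates and len(chosen) < k:
        def score(i):
            penalty = max(
                (cosine(features[i], features[j]) for j in chosen),
                default=0.0,
            )
            return attention[i] - inhibition * penalty

        # Iterate in index order so ties resolve to the earliest frame.
        best = max(sorted(candidates), key=score)
        chosen.append(best)
        candidates.remove(best)
    return chosen
```

With three frames whose features are `[1.0, 0.0]`, `[0.99, 0.1]`, and `[0.0, 1.0]` and attention weights `[1.0, 0.9, 0.5]`, asking for two key frames returns indices `[0, 2]`: the second frame is nearly identical to the first and is suppressed. Setting `inhibition=0.0` returns `[0, 1]` instead, which shows what the constraint contributes.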
English abstract	Computer-aided diagnosis (CAD), which builds on imaging technology, image processing, and computer science, has been called the doctor's "third eye". It helps clinicians reach a diagnosis and has played an important role in improving diagnostic accuracy, raising work efficiency, and supporting patient recovery. Although CAD research has made considerable progress, most studies are limited to brain CT, brain MRI, and chest CT data, and research based on gastroscopic imaging data is only beginning. In addition, with environmental pollution and an accelerated pace of life, the incidence of stomach diseases is increasing year by year, and nearly 400,000 patients are newly diagnosed with gastric cancer in China each year. The rapid diagnosis and treatment of stomach diseases with the aid of gastroscopic imaging data is therefore of substantial theoretical and practical value. This dissertation addresses gastroscopic image quality assessment, gastroscopic video summarization, feature selection, and multiple instance classification based on gastroscopic medical reports, and proposes novel models and algorithms for each. The main contributions are as follows:
(1) Because gastroscopic images are easily affected by food residue, digestive juice, and uneven illumination, a machine-learning-based gastroscopic image quality assessment algorithm is proposed to detect low-quality images, which reduces the complexity and improves the accuracy of the corresponding CAD system. According to the cause of degradation, images are divided into four classes: clear, oversaturated, dark, and out-of-focus. The latter three are low quality. For classification, a random forest is used to build a stable and efficient classifier that detects and removes low-quality images effectively.
(2) A scalable gastroscopic video summarization algorithm based on similar-inhibition dictionary selection is proposed, which enables fast browsing of the original video. Gastroscopic video summarization is formulated as a dictionary selection problem, and a similar-inhibition constraint is introduced to reinforce the diversity of the selected key frames. In addition, according to human visual characteristics, an attention prior cue is calculated to help select frames with more high-level semantic information. The algorithm automatically extracts any given number of key frames, which can improve the clinician's reading speed and can be adapted to practical applications such as the training of young clinicians and medical report generation.
(3) In medical image feature representation, the extracted high-dimensional feature vectors often contain redundant and irrelevant information, which interferes with recognition and reduces diagnostic accuracy. To solve this problem, a deep sparse feature selection algorithm is proposed, with two separate models for labeled and unlabeled training data. Unlike traditional models, the proposed models assign a suitable weight to each feature dimension and select feature units and feature dimensions simultaneously. Moreover, the models eliminate the impact of interfering features on the diagnostic results and improve the efficiency of the system.
(4) An online lesion detection algorithm based on multiple instance learning is proposed. A text matching algorithm extracts label information for the whole report from the diagnostic text. Since a diagnostic report contains multiple images, the original problem is transformed into a multiple instance learning problem. To represent a diagnostic report effectively, a multi-view voting algorithm is first proposed to select the most suspicious lesion images from each positive diagnostic report, and each report is then mapped into the feature space formed by the selected images. Finally, an online metric learning method is used to optimize the classification. In comparison with most computer-aided diagnosis systems, the proposed algorithm frees clinicians from the laborious work of manually labeling pixel-wise ground truth, which makes it easier to collect large-scale training data and leads to a more robust model. Moreover, the proposed model can update itself, which is more in line with the practical demands on CAD.
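Contribution (4) maps each report into the feature space formed by the selected images. The abstract does not say how that mapping is computed, so the sketch below is one common reading from multiple instance learning rather than the dissertation's method: each report (a bag of image feature vectors) is described by its closest distance to each selected image. The names `embed_bag` and `prototypes`, and the choice of Euclidean distance, are assumptions; the multi-view voting step that picks the prototypes is not shown.

```python
import math


def distance(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def embed_bag(bag, prototypes):
    """Represent a bag (one report's images) relative to the prototypes.

    Coordinate j is the smallest distance from any image in the bag to
    prototype j, so the output length equals the number of prototypes
    regardless of how many images the report contains.
    """
    return [min(distance(x, p) for x in bag) for p in prototypes]
```

Because every report becomes a fixed-length vector, reports with different numbers of images can be passed to a single classifier, which is the property the report-level labels require.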
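Language	Chinese
Ownership ranking	1
Pages	105
Source URL	[http://ir.sia.cn/handle/173321/19457]
Collection	Shenyang Institute of Automation_Robotics Laboratory
Recommended citation (GB/T 7714)	Wang Shuai. Computer-Aided Gastroscopic Diagnosis Based on Machine Learning [D]. Shenyang Institute of Automation, Chinese Academy of Sciences. 2016.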

Ingest method: OAI harvesting

Source: Shenyang Institute of Automation

Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.