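Chinese Academy of Sciences Institutional Repositories Grid: Several Learning Algorithms and Their Application to Galaxy Spectra Classification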

Chinese Academy of Sciences Institutional Repositories Grid

Several Learning Algorithms and Their Application to Galaxy Spectra Classification (几个学习算法及其在星系光谱分类中的应用)

Document type: Dissertation


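Author	Li Xiangru (李乡儒)
Degree	Doctor of Engineering
Defense date	2006-05-27
Degree-granting institution	Graduate University of the Chinese Academy of Sciences (中国科学院研究生院)
Place of conferral	Institute of Automation, Chinese Academy of Sciences
Advisors	Hu Zhanyi (胡占义); Zhao Yongheng (赵永恒)
Keywords	active galactic nuclei (AGN); quasar (quasi-stellar object, QSO); automated spectra classification; preprocessing; feature extraction; log-wavelength; flux standardization; mean shift; relevance vector machine (RVM); coherence measure
Alternative title	Study on Several learning algorithms and its Application in Galaxy Classification
Discipline	Control Theory and Control Engineering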
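Abstract (translated from Chinese)	Once completed, LAMOST is expected to observe 4,000 target objects simultaneously and will produce a large volume of spectral data. The speed and quality with which these massive data can be processed automatically are among the main bottlenecks in exploiting the telescope's potential and achieving its scientific goals. The main goal of this thesis is to provide feasible algorithms and techniques for recognizing galaxy spectra observed by LAMOST. Centered on the galaxy spectra classification problem, we study data preprocessing, feature extraction, and classifier design. The main contributions are: 1. The concept of a coherence measure is proposed. The measure characterizes how well a sample coheres with the training set, and a classification method based on it is given. The main feature of this method is that it handles spectral recognition, discovery of unusual celestial objects, and suppression of error accumulation together. 2. The mean shift algorithm is generalized, with a rigorous proof. The generalized algorithm reflects more accurately the intrinsic spatial structure of the data and the differences in reliability among samples, which lays a theoretical foundation for broader and deeper application of mean shift. 3. Standardization of data format and flux in automatic spectral classification is studied. First, the effect of different data formats on spectra and the need for format standardization are analyzed. Then, by analyzing the uncertainty in the order of magnitude of spectral flux and its characteristics, a basic model of flux magnitude variation is proposed, together with corresponding standardization methods. The study finds that the log-wavelength data format is more favorable for automatic spectral classification, and that the flux standardization method commonly used in the literature actually performs worse than some other methods. 4. Supervised feature extraction in automatic spectral classification is studied, in particular the application of Fisher linear discriminant analysis and the relevance vector machine to galaxy spectra recognition. The study shows that these methods effectively incorporate the class information in the training data and extract features according to their discriminative power.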
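Item 2 of the abstract refers to a mean shift variant that accounts for differences in reliability among samples. The record does not give the thesis's formulation, so the following is only a minimal sketch of one common way to express that idea: a Gaussian-kernel mean shift in which each sample carries a scalar weight. The function name `weighted_mean_shift`, the scalar `bandwidth`, and the synthetic data are assumptions made for illustration, not the author's algorithm.

```python
import math
import random


def weighted_mean_shift(points, weights, start, bandwidth=1.0, tol=1e-8, max_iter=1000):
    """Climb to a mode of a weighted Gaussian kernel density estimate.

    points  : list of equal-length coordinate lists
    weights : one non-negative reliability weight per point (illustrative assumption)
    start   : initial position; it should lie near the data, otherwise every
              kernel value can underflow to zero
    """
    x = list(start)
    for _ in range(max_iter):
        num = [0.0] * len(x)
        den = 0.0
        for p, w in zip(points, weights):
            sq_dist = sum((pi - xi) ** 2 for pi, xi in zip(p, x))
            k = w * math.exp(-0.5 * sq_dist / bandwidth ** 2)
            den += k
            for d, pi in enumerate(p):
                num[d] += k * pi
        # The mean shift update: move to the kernel-weighted mean of the samples.
        x_new = [n / den for n in num]
        if math.dist(x_new, x) < tol:
            return x_new
        x = x_new
    return x


# Synthetic two-cluster data (hypothetical, for demonstration only).
rng = random.Random(0)
cluster_a = [[rng.gauss(0.0, 0.3), rng.gauss(0.0, 0.3)] for _ in range(100)]
cluster_b = [[rng.gauss(5.0, 0.3), rng.gauss(5.0, 0.3)] for _ in range(100)]
points = cluster_a + cluster_b
weights = [1.0] * len(points)

mode = weighted_mean_shift(points, weights, start=[0.5, 0.5], bandwidth=0.5)
```

Starting near the first cluster, the iteration settles on that cluster's density peak. Lowering the weight of a sample reduces its pull on the update, which is the sense in which weights can encode reliability in this sketch.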
English abstract	LAMOST, one of the key scientific projects of China, is expected to be completed in 2006, after which about 20,000 celestial spectra will be collected on each observation night. Such voluminous data demand automatic processing, and this thesis focuses on the automatic classification of galaxy spectra, a key unit of the overall data processing system. The work covers preprocessing, feature extraction, and classifier design. The original contributions are summarized as follows: 1. A novel coherence measure is introduced that effectively measures how well a new spectrum of unknown type coheres with the training samples, and a novel classifier is designed accordingly. The resulting classifier carries out spectral classification and recognizes new types of celestial objects simultaneously, with satisfactory performance. 2. The traditional mean shift algorithm is extended to adequately account for the local structure and relative reliability of samples. In addition, a rigorous convergence proof is provided under these extended conditions. The results contribute substantially to a sound theoretical foundation for the mean shift algorithm, a technique widely used in vision applications. 3. Data representation and flux normalization are studied. First, the necessity of proper data representation and flux normalization is clarified by analyzing their influence on spectral classification performance. A flux variation model is proposed and several normalization schemes are assessed. Results show that the log-wavelength representation performs better than the wavelength representation and, surprisingly, that the commonly used normalization scheme performs worse than others, for example a median-based one. 4. Spectral feature extraction by supervised methods is explored; in particular, Fisher Discriminant Analysis and the Relevance Vector Machine are investigated. Experiments show that supervised methods are in general more suitable for classification because they take class information into account during feature extraction.
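Item 3 of the abstract names a log-wavelength representation and a median-based flux normalization without giving their details. The sketch below illustrates what such preprocessing could look like, assuming linear interpolation onto a grid uniform in log10(wavelength) and division by the median flux. The helper names `resample_log_wavelength` and `normalize_by_median` and the synthetic spectrum are hypothetical, and the thesis's actual schemes may differ.

```python
import math
from statistics import median


def resample_log_wavelength(wavelengths, fluxes, n_bins):
    """Linearly interpolate flux onto a grid uniform in log10(wavelength).

    Assumes wavelengths are strictly increasing, with at least two samples,
    and n_bins >= 2.
    """
    log_lo = math.log10(wavelengths[0])
    log_hi = math.log10(wavelengths[-1])
    step = (log_hi - log_lo) / (n_bins - 1)
    grid, resampled = [], []
    j = 0
    for i in range(n_bins):
        w = 10 ** (log_lo + i * step)
        # Clamp to the observed range to guard against rounding at the ends.
        w = min(max(w, wavelengths[0]), wavelengths[-1])
        while j < len(wavelengths) - 2 and wavelengths[j + 1] < w:
            j += 1
        w0, w1 = wavelengths[j], wavelengths[j + 1]
        t = (w - w0) / (w1 - w0)
        grid.append(w)
        resampled.append(fluxes[j] + t * (fluxes[j + 1] - fluxes[j]))
    return grid, resampled


def normalize_by_median(fluxes):
    """Divide by the median flux so that an unknown overall scale cancels."""
    m = median(fluxes)
    if m == 0:
        raise ValueError("median flux is zero; cannot normalize")
    return [f / m for f in fluxes]


# Synthetic spectrum (hypothetical): 4000 to 9000 angstroms in 10-angstrom steps.
wavelengths = [4000.0 + 10.0 * i for i in range(501)]
fluxes = [2.0 + math.sin(w / 300.0) for w in wavelengths]
scaled = [1000.0 * f for f in fluxes]  # same spectrum, different flux scale

grid, flux_log = resample_log_wavelength(wavelengths, fluxes, n_bins=256)
_, scaled_log = resample_log_wavelength(wavelengths, scaled, n_bins=256)
norm_a = normalize_by_median(flux_log)
norm_b = normalize_by_median(scaled_log)
```

Here `norm_a` and `norm_b` agree up to floating-point rounding, so the overall flux scale no longer matters after normalization. One known property of a log-wavelength grid is that a redshift multiplies every wavelength by (1 + z), which becomes a uniform shift along the log axis.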
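Language	Chinese
Other identifier	200318014603012
Source URL	[http://ir.ia.ac.cn/handle/173211/5903]
Collection	Graduates_Doctoral Dissertations
Recommended citation (GB/T 7714)	Li Xiangru. Several Learning Algorithms and Their Application to Galaxy Spectra Classification (几个学习算法及其在星系光谱分类中的应用)[D]. Institute of Automation, Chinese Academy of Sciences. Graduate University of the Chinese Academy of Sciences. 2006.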

Ingest method: OAI harvesting

Source: Institute of Automation

Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.