Chinese Academy of Sciences Institutional Repositories Grid
Design and Implementation of Data Integration for the Chemistry Subject Database (化学主题数据库数据整合的设计与实施)

Document type: Thesis

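Author: Niu Wendou (牛文斗)
Degree type: Master's
Defense date: 2010-06-01
Degree-granting institution: Graduate University of Chinese Academy of Sciences (中国科学院研究生院)
Place conferred: Beijing
Advisors: Wen Hao (温浩); Zhao Yuehong (赵月红)
Keywords: federated database integration architecture; data integration; concept tree; chemistry subject database
Alternative title: Design and implementation of chemistry subject database
Major: Applied Chemistry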
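Abstract (translated from Chinese): The chemistry and chemical engineering sub-databases currently in the Scientific Database System of the Chinese Academy of Sciences comprise the Engineering Chemistry Database, the Chemistry Database, and the Applied Chemistry Database, which are maintained by the Institute of Process Engineering, the Shanghai Institute of Organic Chemistry, and the Changchun Institute of Applied Chemistry, respectively. Each independently provides data services centered on retrieval, so a user who wants several kinds of data on one compound has to jump between the specialized databases. It is therefore necessary to build a data platform with a unified framework that integrates the chemistry and chemical engineering sub-databases of the Chinese Academy of Sciences. This thesis addresses the low degree of integration of these data resources by designing and implementing the Chemistry Subject Database. After a comparative analysis, the thesis selects the federated database integration architecture as the integration method and, in view of the characteristics of the sub-databases, extends the traditional federated architecture with a data integration model, so that the data resources are organized into an interlinked data collection based on a unique compound identifier. Two models are designed, one with the subject classification as root node and one with the compound as root node; comparison shows that the compound-rooted concept tree model (the data integration model) markedly simplifies users' retrieval steps and benefits the integration and presentation of the chemistry and chemical engineering databases. On the user interface side, the thesis focuses on a unified search entry and a visual user interface, both based on the concept tree model: the former removes the need to jump between specialized databases, and the latter presents results from the different data sources to the user by level and by node, following the integration model. Validation of the Chemistry Subject Database with different retrieval modes shows that it integrates distributed, heterogeneous chemistry and chemical engineering data resources on the basis of a unique compound identifier, while simplifying users' retrieval steps and saving users' time.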
English abstract: The current chemistry databases of the Chinese Academy of Sciences include ECDB, the Chemistry Database, and the Applied Chemistry Database, which are managed and maintained by the Institute of Process Engineering, the Shanghai Institute of Organic Chemistry, and the Changchun Institute of Applied Chemistry, respectively. The biggest problem these databases face is a lack of integration: each independently provides data retrieval as its main service. As a result, users have to jump between different specialized databases when they want detailed information about a compound. It is therefore necessary to establish a unified framework as a data platform that integrates all the chemistry databases of the Chinese Academy of Sciences. This thesis pursues that goal by establishing the Chemistry Subject Database system.
In choosing a data integration method, the thesis compares three mainstream approaches (Data Warehouse, Middleware Model, and Federated Database Architecture) and finds that the Federated Database Architecture achieves data integration while keeping each sub-database separate and preserving its characteristics. This makes it well suited to the chemistry databases of the Chinese Academy of Sciences, whose data vary widely in type and are updated very quickly. The thesis further extends the traditional method by adding a concept tree as the data integration model and building the Federated Database Architecture framework on that model. For the model design, the thesis compares a subject-oriented method with a compound-oriented method and finds that the latter significantly simplifies the user's search process and is therefore more feasible and reliable. For the user interface, the thesis focuses on two components, both based on the concept tree model: a unified search entry and a visual results display. The former improves the user experience when acquiring large amounts of data, while the latter presents search results from different sources classified by level. Both the fuzzy and the exact search examples in the thesis show that the Chemistry Subject Database system integrates heterogeneous chemical data resources, simplifies the user's search process, and provides integrated data publishing.
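The compound-rooted concept tree described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the adapter functions, the identifier "CPD-1", the source names, and the property names are all hypothetical placeholders, and the three member databases are stood in for by in-memory dictionaries. The sketch only shows the idea of one search entry that queries each independent source by a shared compound identifier and hangs the results under a single compound root, grouped by level.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# A source adapter maps a compound identifier to a flat record,
# or returns None when the source has no data for that compound.
SourceAdapter = Callable[[str], Optional[Dict[str, str]]]


@dataclass
class ConceptNode:
    """One node of the concept tree: a label, an optional value, and children."""
    label: str
    value: Optional[str] = None
    children: List["ConceptNode"] = field(default_factory=list)


def make_adapter(table: Dict[str, Dict[str, str]]) -> SourceAdapter:
    """Wrap an in-memory dict so it can stand in for an independent member database."""
    return lambda compound_id: table.get(compound_id)


def build_concept_tree(compound_id: str, sources: Dict[str, SourceAdapter]) -> ConceptNode:
    """Query every source by the shared identifier and attach results under one root."""
    root = ConceptNode(label=compound_id)
    for source_name, lookup in sources.items():
        record = lookup(compound_id)
        if record is None:
            continue  # this source holds nothing for the compound
        branch = ConceptNode(label=source_name)
        for key, val in record.items():
            branch.children.append(ConceptNode(label=key, value=val))
        root.children.append(branch)
    return root


def render(node: ConceptNode, depth: int = 0) -> List[str]:
    """Flatten the tree into indented lines, one per node, level by level."""
    text = node.label if node.value is None else f"{node.label}: {node.value}"
    lines = ["  " * depth + text]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical toy data; names and values are placeholders, not real records.
SOURCES: Dict[str, SourceAdapter] = {
    "engineering_chemistry": make_adapter({"CPD-1": {"property_a": "value_a"}}),
    "chemistry": make_adapter({"CPD-1": {"property_b": "value_b"}}),
    "applied_chemistry": make_adapter({"CPD-2": {"property_c": "value_c"}}),
}

tree = build_concept_tree("CPD-1", SOURCES)
print("\n".join(render(tree)))
```

In this sketch each source stays separate and is only consulted at query time, which mirrors the federated approach the abstract contrasts with a data warehouse; a real system would replace the dictionary adapters with calls to the member databases.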
Date made public: 2013-09-17
Pages: 69
Source URL: http://ir.ipe.ac.cn/handle/122111/1517
Collection: Institute of Process Engineering_Institute (batch import)
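Recommended citation
GB/T 7714
Niu Wendou. Design and Implementation of Data Integration for the Chemistry Subject Database (化学主题数据库数据整合的设计与实施) [D]. Beijing. Graduate University of Chinese Academy of Sciences. 2010.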

Ingest method: OAI harvesting

Source: Institute of Process Engineering


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.