Chinese Academy of Sciences Institutional Repositories Grid (中国科学院机构知识库网格)
Research on Data Allocation Based on Workload Control in Cloud Environments (云环境中基于负载控制的数据分配研究)

Document type: Thesis

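Author: Sun Xiling (孙熙领)
Degree: Master's
Defense date: 2012-05-31
Degree-granting institution: Graduate University of Chinese Academy of Sciences (中国科学院研究生院)
Place of conferral: Beijing
Advisor: Ding Zhiming (丁治明)
Keywords: cloud computing; data allocation; data adjustment; workload control; data dependency
Major: Computer Software and Theory
Chinese Abstract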

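Large volumes of large-scale intensive data must be stored across multiple servers, and the increasingly widespread cloud computing environment handles the scale problems that arise when allocating such data. As cloud computing has developed, data allocation in the cloud has become an important component of cloud computing and plays a crucial role in resource utilization, load balancing, and related areas.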

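At the same time, allocating data across multiple servers in a cloud environment poses many challenges. First, large numbers of inexpensive servers hinder the real-time responsiveness of cloud services, and users cannot quickly and accurately learn the status of every server in the environment. Second, allocating intensive data causes data to be transferred between servers, which lowers data access efficiency. Finally, fluctuations in data load from one unit of time to the next create access bottlenecks on servers, resulting in lower resource utilization and service quality.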

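This thesis first models the data flows within large-scale intensive data and proposes a data allocation algorithm that accounts for dependencies among intensive data while keeping server load balanced. Experiments show that, compared with similar data allocation algorithms, the proposed algorithm performs better overall, and the improvement is especially pronounced in keeping server load balanced. To address load fluctuation or server overload after data allocation, this thesis also proposes a data adjustment strategy based on load control. The strategy uses an R-tree index to capture the global state of the cloud environment, applies a two-step pruning strategy to filter servers, and uses a server selection strategy to find target servers that receive the hot data. Experiments show that, compared with similar data adjustment strategies, the proposed strategy regulates server load through data adjustment more quickly and effectively.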
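
The abstract does not give the allocation algorithm itself, so the following is only a minimal sketch of the general idea of trading off load balance against data dependencies. It is not the thesis's method. The function name `allocate`, the weight `alpha`, and the greedy heaviest-first ordering are all assumptions made for illustration.

```python
from collections import defaultdict


def allocate(items, deps, num_servers, alpha=1.0):
    """Greedy placement sketch (hypothetical, not the thesis's algorithm).

    items: {name: load}, deps: {(a, b): weight} for dependent data pairs.
    Returns (placement, loads), where placement maps name -> server index.
    """
    # Build a symmetric adjacency map of dependency weights.
    neighbors = defaultdict(dict)
    for (a, b), w in deps.items():
        neighbors[a][b] = w
        neighbors[b][a] = w

    loads = [0.0] * num_servers
    placement = {}
    # Place heavy items first so lighter ones can even out the loads.
    for name in sorted(items, key=lambda n: (-items[n], n)):

        def cost(s):
            # Dependency weight to already-placed items on OTHER servers.
            cut = sum(w for other, w in neighbors[name].items()
                      if other in placement and placement[other] != s)
            # Resulting server load plus a penalty for split dependencies.
            return loads[s] + items[name] + alpha * cut

        best = min(range(num_servers), key=lambda s: (cost(s), s))
        placement[name] = best
        loads[best] += items[name]
    return placement, loads


placement, loads = allocate(
    {"a": 4, "b": 3, "c": 3, "d": 2},
    {("a", "d"): 5, ("b", "c"): 5},
    num_servers=2,
)
print(placement, loads)
```

In this toy input the dependent pairs end up on the same server and both servers carry equal load. A larger `alpha` favors co-locating dependent data over balance, and a smaller one favors balance.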

English Abstract

A huge volume of large-scale intensive data has to be stored on distributed servers, and cloud environments now support large-scale data storage well. With the development of cloud computing, data allocation in the cloud has become a significant part of cloud computing and plays an important role in resource utilization, workload balance, and related areas.

Data allocation among multiple servers in the cloud faces several challenges. First, it is difficult for cloud services to guarantee real-time performance across a large number of cheap servers, and clients cannot quickly learn the actual state of every single server. Second, allocating intensive data in the cloud leads to data transmission between servers, which may reduce data access efficiency. Third, access bottlenecks on servers may arise from workload that is imbalanced across unit time intervals, which degrades resource utilization and service quality.

In this thesis, we first propose a model based on the data flow among large-scale intensive data. We then present a data allocation algorithm that guarantees workload balance across servers while considering dependencies among intensive data. Extensive experiments confirm that our solution performs better than conventional approaches, particularly in workload balance. To handle workload fluctuation or server overload after data allocation, we propose an efficient data adjustment strategy based on workload control. Crucial server status information is recorded and indexed by an R-tree to provide a global view for data movement. Based on this index, a two-step filtering approach eliminates unsuitable server candidates. A server selection algorithm that considers workload patterns is then applied to achieve load balancing. Extensive experiments confirm that our data adjustment strategy resolves server workload problems quickly and efficiently.
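
The filter-then-select pipeline described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis's implementation. A plain linear scan stands in for the R-tree range query, the selection rule is simple least-loaded rather than a workload-pattern-aware one, and the `Server` fields, the `threshold` parameter, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    load: float        # current workload as a fraction of capacity (assumed)
    free_space: float  # free storage in GB (assumed unit)


def in_box(server, max_load, min_space):
    # Linear-scan stand-in for an R-tree range query over
    # (load, free_space) points; a real index would avoid a full scan.
    return server.load <= max_load and server.free_space >= min_space


def pick_target(servers, hot_load, hot_size, threshold=0.8):
    """Choose a server to receive hot data, or None if none qualifies."""
    # Step 1 (coarse filter): drop servers that are already past the
    # load threshold or lack room for the hot data.
    coarse = [s for s in servers if in_box(s, threshold, hot_size)]
    # Step 2 (fine filter): drop servers that would exceed the
    # threshold once they take on the hot data's load.
    fine = [s for s in coarse if s.load + hot_load <= threshold]
    if not fine:
        return None
    # Selection: least resulting load, with the name as a tie-breaker.
    return min(fine, key=lambda s: (s.load + hot_load, s.name))


servers = [
    Server("s1", 0.9, 100.0),
    Server("s2", 0.7, 100.0),
    Server("s3", 0.4, 5.0),
    Server("s4", 0.5, 50.0),
]
target = pick_target(servers, hot_load=0.2, hot_size=10.0)
print(target.name if target else None)
```

Running the cheap filter first keeps the more specific check in step 2 to a small candidate set, which is the usual motivation for a two-stage filter.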

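Subject: Distributed processing systems
Language: English
Date available: 2012-06-01
Source URL: http://ir.iscas.ac.cn/handle/311060/14501
Collection: Institute of Software_National Engineering Research Center of Fundamental Software_Theses
Recommended citation
GB/T 7714
Sun Xiling. Research on Data Allocation Based on Workload Control in Cloud Environments (云环境中基于负载控制的数据分配研究) [D]. Beijing. Graduate University of Chinese Academy of Sciences. 2012.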

Ingest method: OAI harvesting

Source: Institute of Software


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.