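Research on Key Technologies for Scalable Web Cluster Systems (可扩展Web集群系统关键技术研究)
Document type: Dissertation
Author | Tang Dibin (汤迪斌) |
Degree | Doctorate |
Defense date | 2008-06-02 |
Degree-granting institution | Institute of Acoustics, Chinese Academy of Sciences |
Place conferred | Institute of Acoustics |
Keywords | Web server cluster; data partitioning; request scheduling policy; request routing mechanism; admission control |
Alternative title | Research on Key Technologies for Scalable Web Cluster Systems |
Degree discipline | Signal and Information Processing |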
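Abstract (translated from Chinese) | The rapid growth and spread of the Internet, the advance of informatization, and the steady increase in converged network services place higher demands on Web server technologies and systems. On one hand, the widespread use of dynamic pages and HTTPS requires more server resources; on the other, as network bandwidth grows, users expect ever faster responses from pages and sites. Web cluster systems are commonly used to expand the processing capacity of Web server systems. Several Web cluster architectures have been proposed, of which the front-end-based architecture is the most widely used and is regarded as the best option. For front-end-based Web clusters, this thesis studies the following key technologies: 1) how to distribute database UDI queries (Update, Delete, Insert) across multiple databases to improve database scalability; 2) how to balance load when the load produced by dynamic requests varies widely and is hard to predict; 3) how to add support for persistent connections and pipelined requests to content-based request routing mechanisms; 4) how to classify requests accurately so that important requests receive better service. The main contributions of this thesis are as follows: 1. A user-based data partitioning and storage scheme is proposed: for forum and blog service provider sites, data is partitioned by owning user and stored on different database systems. The scheme distributes UDI queries across multiple database systems and improves scalability. 2. For dynamic requests, a classification-based request scheduling policy is proposed. Dynamic requests are classified by URL pattern; requests in the same class have the same load characteristics, so load can be balanced without estimating the load of individual requests. Experiments show that this policy improves performance by more than 50% over a user-session-based scheduling method. 3. An adaptive segregation scheduling policy, ASSP, is proposed. Earlier research shows that serving static and dynamic requests on different servers improves system performance. At run time, ASSP automatically adjusts the number of servers serving static requests and dynamic requests according to load. In all experimental environments, ASSP outperformed the best static allocation of servers. 4. The content-based request routing mechanism TCPHA is analyzed, and a WAITRESPONSE state is added to each connection to ensure that migration takes place only after the reply to the previous request has been fully sent. This adds support for HTTP/1.1 persistent connections and pipelined requests and improves TCPHA performance by at least 61.1% in our experimental environment. 5. A general request classification method is proposed: the site's target pages are defined, and the probability that each page leads to a target page (the target probability) is computed from the site's log files; the larger a page's target probability, the more important the page. Simulation experiments show that applying this classification to admission control, dropping the requests with the smallest target probability when the system is overloaded, increases the number of users who reach the target pages; for e-commerce sites, this means more successful transactions. |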
English abstract | The rapid development and wide use of Internet technology, the evolution of information technology, and the steady growth of converged network services place ever higher demands on web server technologies and systems. More and more websites use dynamic pages and HTTPS, which require more server resources. At the same time, as network bandwidth increases, users expect faster responses. Web server clusters are commonly used to increase system processing capacity. Several web cluster architectures exist, and the front-end-based architecture is the most widely used and is regarded as the best scheme. This thesis studies the following key technologies in front-end-based web clusters: 1) how to distribute database UDI queries (Update, Delete, Insert) across different databases to increase system scalability; 2) how to balance load among back-end servers when the load produced by dynamic requests varies and is hard to predict; 3) how to make current content-based request routing mechanisms support persistent connections and pipelined requests in HTTP/1.1; 4) how to classify requests so that more important requests receive better service. The contributions of this thesis are summarized as follows: 1. A user-based data partitioning scheme is proposed. For websites that host personal publishing, such as forum and weblog service providers, it partitions data according to the users the data belongs to and distributes the parts to different database systems. This scheme distributes UDI queries across different databases and improves the scalability of the storage system. 2. A classification-based dynamic request scheduling policy, CBDRS, is proposed. It groups requests into classes based on URL patterns so that requests in each class have the same load characteristics, which allows load to be balanced without any assumption about the load of each request. Experiments show that CBDRS improves performance by at least 50% compared with a session-based strategy. 3. An adaptive segregation scheduling policy, ASSP, is proposed. Research shows that serving static requests and dynamic requests on different servers improves system performance. ASSP adjusts the number of servers that serve static requests according to the state of the system at run time. ASSP outperformed the static allocation scheme in every experimental setting. 4. The content-based request routing mechanism TCPHA is extended to support persistent connections and pipelined requests in HTTP/1.1: a WAITRESPONSE state is added to every connection to ensure that handoff happens only after the response to the previous request has been fully transferred. An additional performance gain of 61.1% is achieved in our experiments. 5. A general request classification mechanism is proposed: it defines the target pages of a website and calculates the target probability of every page. The higher the target probability, the more important the page. Simulation experiments show that more users reach the target pages when this method is combined with an admission control mechanism, which means more successful transactions on e-commerce websites. |
Language | Chinese |
Publication date | 2011-05-07 |
Pages | 139 |
Source URL | [http://159.226.59.140/handle/311008/328] |
Collection | Institute of Acoustics_IOA Doctoral and Master's Theses_1981-2009 Doctoral and Master's Theses |
Recommended citation (GB/T 7714) | 汤迪斌. 可扩展Web集群系统关键技术研究[D]. 声学研究所. 中国科学院声学研究所. 2008. |
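As an illustration of the request classification in contribution 5 of the abstract, the following is a minimal sketch only, not the thesis's implementation. The abstract does not give the exact formula, so the sketch assumes that a page's target probability is the fraction of logged sessions visiting that page which reach a target page at or after that visit. The names `target_probabilities` and `admit`, the session format, and the fixed threshold are all hypothetical.

```python
from collections import defaultdict


def target_probabilities(sessions, targets):
    """Estimate each page's target probability from logged sessions.

    Assumption (not from the thesis): the probability is the share of
    sessions visiting a page that reach a target page at or after the
    first visit to that page.
    """
    visits = defaultdict(int)
    hits = defaultdict(int)
    for pages in sessions:
        # Index of the last target page in this session, -1 if none.
        last_target = max(
            (i for i, p in enumerate(pages) if p in targets), default=-1
        )
        seen = set()
        for i, page in enumerate(pages):
            if page in seen:
                continue  # count each page once per session
            seen.add(page)
            visits[page] += 1
            if i <= last_target:
                hits[page] += 1
    return {page: hits[page] / visits[page] for page in visits}


def admit(page, probabilities, threshold):
    """Hypothetical overload rule: drop pages below the threshold."""
    return probabilities.get(page, 0.0) >= threshold


sessions = [
    ["/home", "/cart", "/checkout"],
    ["/home", "/about"],
    ["/cart", "/checkout"],
]
probs = target_probabilities(sessions, targets={"/checkout"})
```

With these three made-up sessions, `/cart` gets a higher estimate than `/home`, and `/about` gets zero, so an overloaded system using `admit` with a small positive threshold would drop requests for `/about` first. The abstract describes dropping the requests with the smallest target probability; a fixed threshold is a simplification chosen here for brevity.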
Ingest method: OAI harvesting
Source: Institute of Acoustics