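Research on Some Key Technologies for Network Traffic Classification and Identification (网络流量分类与识别的若干关键技术研究)
Document type: Dissertation
Author | Xu Peng (徐鹏) |
Degree | Doctoral |
Defense date | 2008-05-29 |
Degree-granting institution | Graduate University of Chinese Academy of Sciences (中国科学院研究生院) |
Place of conferral | Institute of Software, Chinese Academy of Sciences (中国科学院软件研究所) |
Supervisor | Wu Zhimei (吴志美) |
Keywords | network measurement; traffic classification; traffic identification; transport layer; machine learning; network flow attributes |
Alternative title | Research on Some Key Issues for Traffic Identification and Classification |
Degree discipline | Computer Application Technology |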
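Chinese abstract | Network traffic classification is a prerequisite and foundation for many lines of network research and has long been a hot topic in the field. In recent years, as Internet technology has developed, new network applications have kept emerging, posing a series of challenges to existing traffic classification techniques. This thesis studies key problems that urgently need solving in current traffic classification research. Its results and main contributions are as follows: 1. To address the shortcomings of existing transport-layer P2P traffic identification methods in the domestic network environment, three improvement strategies are proposed. Experiments on real network traffic traces verify these strategies: the improved identification method adapts well to the domestic network environment, and the relevant accuracy metrics are all around 95%. 2. A traffic classification method based on the C4.5 decision tree is proposed. The method uses the information entropy of the training data set to build a classification model, and classifies network flow samples by a simple lookup in that model. Both theoretical analysis and experimental results show that the C4.5 decision tree approach has clear advantages in classification stability and data-processing efficiency. 3. A P2P traffic identification method based on time-independent attributes is proposed. The method uses only 16 time-independent attributes, which effectively prevents abnormal network conditions from affecting classification stability. In a comparison with classification models that include time-related attributes, the experimental results show that a model using only packet-count-related and packet-length-related attributes can already effectively distinguish P2P traffic from non-P2P traffic. 4. A real-time traffic monitoring system is designed and implemented. The system improves the signature-field-based traffic classification method, achieves real-time online classification of network traffic, and provides a practical traffic monitoring solution for a community broadband integrated services access system. |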
English abstract | Network traffic classification is the foundation of much network research and has long been a hot topic in the field. In recent years, with the development of Internet technology, the continuous emergence of new network applications has raised many challenges for traffic classification. This thesis focuses on some key issues in traffic classification and identification. Its main contributions are as follows. Firstly, the existing transport-layer P2P traffic identification method has shortcomings when used in the domestic network environment. To improve this method, three proposals are offered in this thesis and validated on domestic traces. The experimental results indicate that the accuracy of the improved method is approximately 95%. Secondly, a new traffic classification method based on the C4.5 decision tree is proposed. This method builds a classification model using the information entropy of the training data and classifies flows by simply looking up the decision tree. Theoretical analysis and experimental results show that the C4.5 decision tree method has clear advantages in classification stability and efficiency when used to classify Internet traffic. Thirdly, a novel P2P traffic identification method using 16 kinds of flow attributes is proposed. Because it uses only time-independent attributes, this method avoids the negative effect of network pathologies on the stability of the classification model. A comparison with traffic classification using time-related attributes is given. The results show that P2P and non-P2P traffic can be effectively distinguished using only packet-number-related and packet-length-related attributes. Fourthly, a real-time traffic monitoring system is designed and implemented. This system implements online traffic identification by improving the classification method based on signature strings. It provides a practical solution for a community broadband integrated services access system. |
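Release date | 2011-03-17 |
Source URL | [http://124.16.136.157/handle/311060/6204] |
Collection | Institute of Software_Multimedia Communication and Network Engineering Research Center_Dissertations |
Recommended citation (GB/T 7714) | Xu Peng (徐鹏). 网络流量分类与识别的若干关键技术研究 [Research on Some Key Issues for Traffic Identification and Classification] [D]. Institute of Software, Chinese Academy of Sciences; Graduate University of Chinese Academy of Sciences, 2008. |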
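The abstract's second contribution relies on C4.5, which chooses each tree split by an entropy-based gain ratio. The following is a minimal sketch of that criterion for a single numeric attribute, not the thesis's implementation: the attribute name `mean_packet_length`, the labels, and the toy values are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """C4.5-style gain ratio for the binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    total = len(labels)
    weights = [len(left) / total, len(right) / total]
    # Information gain: entropy before the split minus weighted entropy after.
    info_gain = entropy(labels) - (weights[0] * entropy(left)
                                   + weights[1] * entropy(right))
    # Split information penalizes very unbalanced partitions.
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info


# Hypothetical toy data: one time-independent attribute per flow and its label.
mean_packet_length = [120, 140, 900, 1100, 1300, 150]
flow_label = ["non-P2P", "non-P2P", "P2P", "P2P", "P2P", "non-P2P"]

# Candidate thresholds are the observed values except the largest one.
candidates = sorted(set(mean_packet_length))[:-1]
best = max(candidates,
           key=lambda t: gain_ratio(mean_packet_length, flow_label, t))
```

A full C4.5 learner would apply this selection recursively over all attributes and then prune the tree; classifying a new flow is then a walk from the root to a leaf, which is the "simple lookup" the abstract refers to.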
Ingest method: OAI harvesting
Source: Institute of Software