基于轨迹数据的路网构造与行为模式挖掘研究
文献类型:学位论文
作者 | 吴俊伟 |
学位类别 | 博士 |
答辩日期 | 2014-05-16 |
授予单位 | 中国科学院沈阳自动化研究所 |
导师 | 朱云龙 ; 库涛 |
关键词 | 轨迹数据挖掘 路网构造 热点路径 轨迹模式 热点区域 |
其他题名 | Extracting Road Network and Mining Behavior Patterns from Trajectories |
学位专业 | 机械电子工程 |
中文摘要 | 随着智能移动终端、移动互联网的快速发展,大量移动设备中嵌入了定位模块,而众多与LBS(Location Based Services)相关的移动应用则产生了海量的轨迹数据。这些数据中隐藏着诸多与人类行为紧密相关的知识,如社会结构、行为模式、交通状况、路网结构等等,这很大程度地拓展了我们理解和管理人类自身社会的手段,然而也同时让我们面临着缺乏有效数据挖掘技术的困境。由于这一研究领域具有很强的现实意义和理论价值,它正越来越多地吸引着学术界和工业界的关注和投入。目前这一领域已经涌现出了一些有价值的理论成果和产品。然而,依然存在着宽广的研究空间和很多难以解决的问题,比如:如何对轨迹进行有效地过滤、压缩;如何利用海量的移动对象轨迹提高路网地图的制作效率和覆盖率;如何利用轨迹数据探测出城市中的热点区域和热点路径;以及如何有效地从轨迹中挖掘出移动对象的行为规律和移动模式等等。本文针对这些问题进行了深入地研究,提出了一系列相应的解决方案,设计了基于道路匹配的高效实时压缩算法、基于聚类和图像处理的路网构造算法、基于网格划分和路网构建的热点路径探测算法、基于路网构建的轨迹模式挖掘算法、基于网格相对密度的热点区域探测算法、以及高效出租车载客策略推荐算法等等,并于最后设计了一套用于可视化挖掘的工具。具体的研究内容和创新性成果概括如下:1) 实时轨迹压缩针对已有矢量数据压缩算法压缩率低、速度慢、不能保持轨迹上下文信息等问题,提出了基于路网匹配的实时轨迹压缩算法。该算法将处于某一路段的所有轨迹点用一个路段标识符号来表示,既能极大地压缩轨迹点数量,又能利用道路的形状、长度、位置等属性保留轨迹的上下文信息。实验表明,该算法能在保留轨迹的环境上下文信息的条件下,达到较高的压缩率。2) 路网构建针对已有路网构造方法效率低、投入高、更新慢等问题,提出了利用轨迹数据快速构造路网的方法。设计了两种能适应不同路网环境和精度要求的自动构建算法,分别为基于聚类的算法和基于数学形态学的算法。其中前一算法先利用聚类从车辆转弯点中探测道路出交叉口,而后再连接各交叉口形成路网。实验表明,该算法能够准确探测出绝大多数道路交叉口的位置,且能构造出拓扑性良好的路网结构,但易受数据分布特性的影响。后一算法将区域划分成网格,将轨迹点划分到各个网格中形成位图,而后基于数学形态学中的膨胀、腐蚀、骨架化、剪枝等操作从位图中提取路网结构。因算法以网格为基本处理单元,所以处理速度较快,适合大规模数据处理。也正因网格粒度较大的原因,使得所构造的路网精度稍差于前者。经不同数据量的实验验证,该算法能够快速构造大规模路网结构,且算法对数据的分布不敏感,所构造的路网覆盖率高。3) 热点路径探测针对现有热点路径探测算法需要路网地图的支持、不能识别热点路径的复杂耦合现象等问题,提出了基于网格划分的热点路径探测算法。该算法借助密度聚类的概念,通过计算网格间的公共轨迹量定义网格的密度可达性,以及路径的密度可达性,将网格转化为树的结构之后使用深度优先搜索方法探测热点路径。经实验验证,该算法能够快速有效地探测出“粗粒度”的热点轨迹。为解决网格划分方法导致的“热点路径丢失”问题,又提出了基于路网构建的热点路径探测方法。利用前一研究成果“形态学方法构造路网”从轨迹中抽取出路网结构,然后将轨迹点都匹配到道路上,从而避免了同一路径上的轨迹被划分到多个网格中的问题,也就很大程度地避免了“热点路径丢失”现象的发生。实验表明,该方法能够有效探测出粒度较细的热点路径。且经不同数据量的验证,表明该算法的执行效率很高。4) 轨迹模式挖掘针对现有的轨迹离散化方法效率低、不直观、易丢失移动模式等问题,提出了一种先进行路网结构探测,再基于道路匹配对轨迹进行离散化的方法。算法同样利用之前的路网构建方法从轨迹中提取出路网结构,而后基于道路匹配算法将轨迹离散化为“道路网格”序列,进而使用最大序列模式挖掘方法进行轨迹模式挖掘。实验结果表明,该算法能够快速有效地对轨迹进行离散化,且能比其它算法挖掘出更多更细致的轨迹模式。5) 热点区域探测及行车方案推荐针对大部分出租车效率低下的问题,提出推广高效出租车的寻客经验以提高低效出租车效率的方法。然而载客经验的有效表达和挖掘问题都难以处理。针对此问题,提出了一种基于相对密度的网格聚类算法GRDClu用以探测“热点区域”的分布,基于此对司机的寻客策略进行了总结和分类。而后使用支持向量机方法从高效车辆的历史轨迹中提取出了具有高区分度的寻客策略。实验证明, GRDClu算法能够有效处理密度分布不均的聚类问题,且基于机器学习算法所筛选的寻客策略具有更高的预测准确率。 |
索取号 | TP311.13/W82/2014 |
英文摘要 | With the rapid development of smart mobile terminals and mobile Internet, a large number of mobile devices are embedded positioning devices, and the mobile applications related to LBS(Location Based Services)generate vast amounts of trajectory data. There is a lot of human behavior knowledge hiding in these data, such as social structure, behavior patterns, traffic conditions, road network and ect., which largely extend our means to understand and manage the human society, but also make us facing the dilemma that lacking the effective data mining technology. Due to its significant practical and theoretical value in this research area, it is increasingly attracting the attention of academia and industry. And now there have been some valuable theoretical results and products in this area.However, there still exist a broad research space and a lot of difficulties in this area, such as: how to effectively filtered, compressed the raw trajectory; how to improve the production efficiency and coverage rate of the road map using massive trajectory data; how to detect hot spots and hot routes in city from the trajectories; and how to mining the behavior patterns or movement patterns of the moving objects. To solve these problems, we conducted in-depth research, and made a series of corresponding solutions. We proposed an effective and efficient algorithm for real-time trajectory compression, a clustering-based and an image processing-based road network construction algorithm, a meshing-based and a road network constructing-based hot routes detection algorithm, a road network constructing-based trajectory pattern mining algorithm, a relative grid density-based ROI (Region of Interest) detection algorithm, and a passenger hunting strategy recommendation algorithm. In the final we designed a set of tools for visualization mining. The specific researches and innovative results are summarized as follows:1) Real-time trajectory compressionFor the shortcomings that low efficiency, slow, and cannot keeping the trajectory context information of the existing vector data compression algorithms, this paper proposed a real-time trajectory compression algorithm based on road map matching. In this algorithm many trajectory points located in one road segment was compressed as one point marked by the road segment identification, such that can greatly improve the compression rate, and reserve the context information of the trajectory by the means of road shape, length, and other attributes. Experimental results show that the algorithm can meet higher compression rate than other compression algorithm under keeping the trajectory context information after compression.2) Road network constructionAs the existing road network construction methods have some problems such as low efficiency, high cost, and slow update, we proposed a quick road network construction method using trajectory data. Specifically, we designed two different road network construction algorithms that are able to adapt different real road environment and different accuracy requirements. The one is clustering-based algorithm, which using clustering algorithm to detect road intersections from vehicular turning points, and then connecting these intersections to form road network. This algorithm is more applicable to the regular road network environment, and able to construct high precision road network with good topology structure. The other one, named IM_RoadNetwork, is an image processing-based algorithm, which discretizing the geographical range into gridded space with cells of small size, and the mapping trajectory points into the grids, such that the geographical region can be converted a bitmap. Then the road network can be extracted from the bitmap by some mathematical morphology operations such as: dilation, tinning, and pruning. Compared to the previous algorithms, this algorithm is applicable to any complex road network environment, and suitable for large-scale data processing due to its faster processing speed, but less precision in somewhat.3) Hot routes detectionTo solve the problems of the existing hot route detection algorithms such as: requiring the support of road network map, or cannot recognizing the complex coupling phenomenon, we proposed a meshing-based detection algorithm GridGrowth, which defining the traffic-density reachability and the route-density reachability by the means the concepts in density clustering algorithms. This algorithm detects the hot routes using by depth-first search after converts the grids into octree structure. The experiments show that this algorithm can quickly and efficiently detect the "coarse-grained" hot routes. However, it is prone to cause the "hot route lost" problem. For this issue, we proposed a new detection algorithm based on road network construction which was solved by previous work. We construct a road network using previous image processing method, and then matching trajectory to roads, thereby avoiding the problem that one trajectory is divided into a plurality of grids, and also indeed avoiding the "hot route lost" problem. Experimental results show that this algorithm can effectively detect "fine-grained" hot routes, and has a very high efficiency.4) Trajectory pattern miningAs the existing trajectory discretization methods have some defects such as: low efficiency, not intuitive, and easy to lose moving pattern. We present road network construction-based discretization algorithm, which construct road network using previous image processing method, and then converting the trajectory into "road grid" sequence by matching trajectory to roads, and in the final mining the trajectory pattern using the maximum sequential pattern mining methods. Experimental results show that this algorithm can quickly and efficiently perform trajectory discretization, and can mine more detailed trajectory patterns than other algorithms.5) Region of interest detection and driving strategy recommendationLearning the experiences of high-profit taxi drivers can improve the pick-up rate of low-profit taxi drivers, while the problem of how to express and mine the experiences of high-profit taxi drivers is very difficult. For this issue, this paper proposes a relative density-based grid clustering algorithm to detect the Regions of Interest (ROI), according to which, then classifies the strategies of hunting passengers definitely. Finally, we use Support Vector Machine (SVM) to extract efficient passenger hunting strategies from the historical trajectories of high-profit taxis. Experimental results show that, GRDClu algorithm can effectively identify the ROIs that the densities of which are uneven distributed, and the selected passenger hunting strategies based on SVM algorithm have higher prediction accuracy rate, which shows effectiveness of the proposed method for mining the efficient passenger hunting strategies. |
语种 | 中文 |
产权排序 | 1 |
页码 | 109页 |
分类号 | TP311.13 |
源URL | [http://ir.sia.ac.cn/handle/173321/14804] ![]() |
专题 | 沈阳自动化研究所_信息服务与智能控制技术研究室 |
推荐引用方式 GB/T 7714 | 吴俊伟. 基于轨迹数据的路网构造与行为模式挖掘研究[D]. 中国科学院沈阳自动化研究所. 2014. |
入库方式: OAI收割
来源:沈阳自动化研究所
浏览0
下载0
收藏0
其他版本
除非特别说明,本系统中所有内容都受版权保护,并保留所有权利。