Regional hotspot path identification based on grid clustering optimization
WENG Xuyan, ZHENG Shuni
Zhejiang Scientific Research Institute of Transport, Hangzhou 310023, Zhejiang, China
Abstract: Trajectory clustering methods have difficulty accurately identifying hotspot paths that are highly similar to one another. To address this, a hotspot path identification method is proposed that can distinguish paths differing in their origin and destination points or in local road sections. Travel trajectories are mapped and compressed into moving grid sequences, the spatial similarity between sequences is measured separately at the boundary and in the interior, and the two measures are combined and converted into a distance that is used for grid sequence density-based spatial clustering of applications with noise (GS-DBSCAN). Using trajectory data from a subset of taxis in Shinan District, Qingdao, as an example, the method is compared against clustering methods that consider only interior similarity and that take, respectively, the shorter and the longer of the two compared sequences as the baseline. The results show that the GS-DBSCAN algorithm, which considers both boundary and interior similarity and takes the longer sequence as the baseline, correctly distinguishes diverging, merging, and crossing coupled hotspot paths of widely differing lengths under multiple origin-destination distributions. Differences in variables such as path length and grid size affect the results by less than 2%, and the clustering is computationally efficient.
Keywords: hotspot path; boundary; interior; trajectory clustering
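
The abstract does not give the formulas behind the grid mapping or the similarity measures, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes planar coordinates in metres, a square grid with a hypothetical `cell_size` and `origin`, and a purely illustrative interior measure (the share of distinct cells two sequences have in common). The function names `to_grid_sequence` and `interior_overlap` are invented for this example. The sketch shows why the choice of baseline matters: a short path fully contained in a long one looks identical to it when the shorter sequence is the baseline, but not when the longer one is.

```python
import math


def to_grid_sequence(points, origin=(0.0, 0.0), cell_size=100.0):
    """Map (x, y) points to grid cells and drop consecutive repeats.

    Assumption: planar coordinates and a square grid; the paper's actual
    mapping and compression rules are not stated in the abstract.
    """
    x0, y0 = origin
    seq = []
    for x, y in points:
        cell = (math.floor((x - x0) / cell_size),
                math.floor((y - y0) / cell_size))
        # Compression step: keep a cell only when the trajectory enters it.
        if not seq or seq[-1] != cell:
            seq.append(cell)
    return seq


def interior_overlap(seq_a, seq_b, baseline="longer"):
    """Illustrative interior similarity: shared distinct cells / baseline size.

    Hypothetical measure for this sketch only; `baseline` selects whether the
    longer or the shorter sequence normalises the overlap.
    """
    cells_a, cells_b = set(seq_a), set(seq_b)
    shared = len(cells_a & cells_b)
    sizes = (len(cells_a), len(cells_b))
    denom = max(sizes) if baseline == "longer" else min(sizes)
    return shared / denom if denom else 0.0


if __name__ == "__main__":
    track = [(10, 10), (40, 20), (120, 30), (130, 90), (260, 110)]
    print(to_grid_sequence(track))

    short_path = [(0, 0), (1, 0)]
    long_path = [(0, 0), (1, 0), (2, 0), (3, 0)]
    print(interior_overlap(short_path, long_path, baseline="shorter"))
    print(interior_overlap(short_path, long_path, baseline="longer"))
```

In a full pipeline, a similarity of this kind would be combined with a boundary term, converted to a distance, and passed to a density-based clusterer as a precomputed distance matrix; those steps are omitted here because the abstract does not specify them.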