Cause analysis of road traffic accidents based on data mining
LI Hongyan, ZHU Longbo, REN Xiantong, XU Wenwen
School of Transportation and Logistics Engineering, Shandong Jiaotong University, Jinan 250357, China
Abstract: To reduce the occurrence of road traffic accidents, the patterns and causes of road traffic accidents are analyzed using the traffic accident data of a city in 2020. Nine factors, including lighting conditions, visibility, and weather, are taken as independent variables, and four levels of traffic accident severity (no injury, minor injury, serious injury, and death) are taken as the dependent variable. A multiple logistic regression model and an ordered multi-class logistic regression model are used to identify the important factors affecting traffic accident severity. The two models are validated on the traffic accident data of a city for the first quarter of 2021, and the results show that the multiple logistic regression model and the ordered multi-class logistic regression model correctly predict traffic accident severity in 75.1% and 75.0% of cases, respectively. Cause analysis of road traffic accidents based on data mining can provide a basis for traffic management authorities to improve the traffic environment and reduce traffic accidents.
Keywords: data mining; accident causes; accident severity; multiple logistic regression; ordered multi-class logistic regression
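
As an illustration of how the two model families named in the abstract turn a set of factors into severity probabilities, the following is a minimal sketch rather than the authors' implementation. It assumes that the "multiple" model is the usual multinomial logit with "no injury" as the reference class and that the ordered model is the proportional-odds form. All function names, coefficients, and thresholds are hypothetical values chosen for illustration, not estimates from the paper.

```python
import math

# Severity levels in increasing order, as listed in the abstract.
SEVERITY = ["no injury", "minor injury", "serious injury", "death"]


def multinomial_probs(x, coefs):
    """Multinomial logit: one (intercept, weights) pair per non-reference class.

    Class 0 ("no injury") is the reference, so its linear score is fixed at 0.
    """
    scores = [0.0]
    for intercept, weights in coefs:
        scores.append(intercept + sum(w * xi for w, xi in zip(weights, x)))
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ordered_probs(x, weights, thresholds):
    """Proportional-odds model: P(Y <= j) = sigmoid(threshold_j - w.x).

    A single weight vector is shared by all levels; thresholds must increase.
    """
    eta = sum(w * xi for w, xi in zip(weights, x))
    cumulative = [1.0 / (1.0 + math.exp(-(t - eta))) for t in thresholds]
    cumulative.append(1.0)  # P(Y <= highest level) is always 1
    probs = [cumulative[0]]
    for j in range(1, len(cumulative)):
        probs.append(cumulative[j] - cumulative[j - 1])
    return probs


def correct_prediction_rate(predicted, observed):
    """Share of cases whose predicted severity level equals the observed one."""
    hits = sum(1 for p, o in zip(predicted, observed) if p == o)
    return hits / len(observed)


if __name__ == "__main__":
    # Hypothetical case with two coded factors (e.g. lighting, visibility).
    x = [1.0, 0.0]
    # Hypothetical coefficients, NOT taken from the paper.
    mnl_coefs = [(-0.5, [0.3, 0.1]), (-1.2, [0.6, 0.2]), (-2.0, [0.9, 0.4])]
    p_mnl = multinomial_probs(x, mnl_coefs)
    p_ord = ordered_probs(x, [0.8, 0.3], [-1.0, 0.5, 2.0])
    print("multinomial:", SEVERITY[p_mnl.index(max(p_mnl))])
    print("ordered:", SEVERITY[p_ord.index(max(p_ord))])
```

The multinomial form estimates a separate coefficient vector for each severity level relative to the reference, while the ordered form shares one coefficient vector and separates the levels only through thresholds. Comparing predicted and observed levels on held-out data with `correct_prediction_rate` gives the kind of correct prediction rate reported in the abstract.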
