Title:
Automatic Acquisition of Real-Time Traffic Data Based on Web Crawlers
Authors:
闫文豪, 舒娱琴, 黄植钦
Keywords:
Web Crawler, Automatic Acquisition, Real-Time Traffic Data, Python Programming Language, Tornado Web Framework
Journal:
Advances in Geosciences, Vol. 6, No. 3, 2016-06-20
Abstract:
Real-time traffic data is an essential data source for research on low-carbon travel, intelligent transportation, and road network optimization. However, such data is currently neither free nor publicly available. To address this problem, this paper uses the Python programming language and the Tornado web framework to design a stable, efficient, and timely web crawler. Taking Guangzhou as an example, the crawler collected basic information for 1723 roads, together with their real-time traffic data (updated every 5 minutes), from the 四维交通指数 (Siwei Traffic Index) web page and saved the results to a local MySQL database. The results show that web crawling is a feasible and effective way to acquire real-time traffic data.
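The crawl-parse-store cycle summarized in the abstract could look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' program: the URL, the JSON payload shape (`id`, `name`, `index`), the table layout, and all function names are hypothetical, and `sqlite3` stands in for MySQL so the sketch stays self-contained. Only the 5-minute interval and the use of Python with Tornado come from the abstract.

```python
import asyncio
import json
import sqlite3
import time

# Hypothetical endpoint: the abstract does not give the page URL or its format.
TRAFFIC_URL = "https://example.com/traffic/guangzhou.json"
POLL_INTERVAL_MS = 5 * 60 * 1000  # the abstract states a 5-minute update cycle


def parse_roads(payload):
    """Convert an assumed JSON array into (road_id, name, congestion_index) tuples."""
    records = json.loads(payload)
    return [(r["id"], r["name"], float(r["index"])) for r in records]


def init_db(conn):
    """Create the (assumed) snapshot table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS road_traffic ("
        "road_id TEXT, name TEXT, congestion_index REAL, fetched_at INTEGER)"
    )


def save_snapshot(conn, rows, fetched_at):
    """Insert one timestamped snapshot and return the number of rows written."""
    conn.executemany(
        "INSERT INTO road_traffic VALUES (?, ?, ?, ?)",
        [(rid, name, idx, fetched_at) for rid, name, idx in rows],
    )
    conn.commit()
    return len(rows)


async def crawl_once(conn):
    """Fetch the page once with Tornado's async HTTP client and store the result."""
    # Imported lazily so the parsing and storage helpers work without Tornado.
    from tornado.httpclient import AsyncHTTPClient

    response = await AsyncHTTPClient().fetch(TRAFFIC_URL)
    save_snapshot(conn, parse_roads(response.body), int(time.time()))


async def main():
    """Crawl immediately, then repeat every POLL_INTERVAL_MS milliseconds."""
    from tornado.ioloop import PeriodicCallback

    conn = sqlite3.connect("traffic.db")  # stand-in for the MySQL database
    init_db(conn)
    await crawl_once(conn)
    PeriodicCallback(lambda: crawl_once(conn), POLL_INTERVAL_MS).start()
    await asyncio.Event().wait()  # block forever so the periodic callback keeps firing
```

With Tornado installed, `asyncio.run(main())` would start the polling loop. Keeping `parse_roads` and `save_snapshot` free of network code is a deliberate choice in this sketch, since it lets the parsing and storage steps be checked offline against sample payloads.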