This tutorial is written as Jupyter notebooks in Python, so a basic Python environment needs to be configured first. The chapters, with the main package or topic of each, are:
1 Basic processing of taxi GPS data: generating OD for trips - pandas
2 Plotting figures of data aggregation - matplotlib
3 GIS processing of taxi GPS data - geopandas
4 Matching grid OD to administrative units and plotting the map background - plot_map
5 Scatter and heatmap plots of data distribution - contourf
6 How to deal with a complicated data processing task? - theoretical
7 Using folium to visualize maps - folium
With the arrival of the big data era and the gradual opening of datasets, data workers are, to a greater or lesser extent, running into spatiotemporal data.
Xiaoxu once said: The data is data since it's data.
Yes! Data processing is an art! When dealing with spatiotemporal data, you need not only data processing skills but also GIS skills. Most important of all, it's an art: you need to use your sense of aesthetics to produce beautiful figures!
Before I learned Python, I used a SQL database to process data, exported the results to Excel to draw charts, and then exported them to ArcGIS to draw maps. A single workflow spanned several pieces of software, and the work was very inefficient.
Once, the teacher said to me: Xiao Xu, please give me 800 figures for this project. I need them by this Friday. Actually, the figures themselves are of no real use. They are mainly there to show our workload in the appendix of the project report.
I flipped the table and said: Draw them yourself, I quit!
No, that was a dream. The reality: I smiled and said, OK, I will draw all night.
Now we have Python. With the pandas, geopandas, and Matplotlib packages, you can use Python to batch-process data, batch-produce GIS maps, and so on.
WOW! That is great! This is an essential skill if you are a graduate student keen on data analysis. Never mind 800 figures: I can generate as many figures as my disk has room for!
Through this tutorial, you will learn spatiotemporal data processing skills from scratch, such as data cleaning, data collection, data integration, and visualization with Python! After that, the tutorial introduces several commonly used Python visualization packages, and finally works through a real project.
Demos (all plotted with Python):
Data sources for this tutorial (open datasets can be downloaded directly):
Taxi data in Shenzhen
Urban Data Release V2
Taxi GPS Data Format: 22223,2013-10-22 08:49:25,114.116631,22.582466,0
Taxi ID, Time, Longitude, Latitude, Occupancy Status, Speed; Occupancy Status: 1-with passengers & 0-without passengers;
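As a rough illustration of the record format, here is a minimal sketch that reads the sample record with pandas. The column names (`taxi_id`, `time`, `lon`, `lat`, `occupied`) are made up for this example and are not taken from the dataset. The sample value 114.116631 falls in Shenzhen's longitude range, so the sketch treats the third field as longitude and the fourth as latitude. Only the five values shown in the sample are named; if your file also carries a speed field, you would add one more column name.

```python
import io

import pandas as pd

# Hypothetical column names for the five values shown in the sample record.
COLUMNS = ["taxi_id", "time", "lon", "lat", "occupied"]

# The sample record from the text, wrapped in a file-like object.
# For a real file, pass its path to read_csv instead.
sample = io.StringIO("22223,2013-10-22 08:49:25,114.116631,22.582466,0\n")

# The raw records have no header row, so we supply the names ourselves
# and ask pandas to parse the time column into timestamps.
df = pd.read_csv(sample, header=None, names=COLUMNS, parse_dates=["time"])

# Occupancy status: 1 means carrying passengers, 0 means empty.
df["has_passengers"] = df["occupied"] == 1

print(df)
```

Parsing the time column up front makes it easier to sort each taxi's records chronologically later on, for example when extracting trips.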
Before starting this tutorial, it is strongly recommended that you have learned the basics of Python programming. If you haven't, here are some suggestions:
In addition, some recommended courses: