Time Series Traffic Speed Prediction Using k-Nearest Neighbour Based on Similar Traffic Data

In recent years, time series models and neural network models have been widely used to predict traffic flow and traffic congestion from historical data. Historical traffic data from sensors is often fed to time series models or to various neural network models. Recent research shows that traffic flow patterns differ between weekdays and weekends. We conducted a time series prediction of traffic flow on Monday using weekday data and using whole-week data. Short-term prediction of Monday traffic flow using the k-NN method gives better results when based on weekday data than when based on data from all days. We also compared the results obtained with the k-NN and neural network methods. From this study, we observed that using similar traffic data for time series prediction generally gives better results than using the whole data set.


Introduction
Due to population growth and the increasing number of private cars, traffic congestion has become significantly worse, leading not only to economic losses but also to environmental damage, human stress and pollution [1]. Drivers need traffic information, which can influence their driving choices [2] and thus lead to chain changes in the traffic flow state upstream and downstream of the road section concerned and on other road segments in the network. Some studies show that a road tends to have a similar traffic state at the same time interval on the same working day or on weekends [5]. Adjacent roads have similar histories of traffic conditions during workdays or weekends [6]. Investigating similar traffic condition patterns can therefore help to increase accuracy and reduce the computational cost of traffic flow prediction, by using only similar traffic data rather than the whole data set.

Related Work and Background
Time series models [3] and neural network models [4] [5] are often applied to predict traffic flow and traffic congestion based on historical data on vehicle speeds, weather, incidents and special days. Some studies use a Bayes classifier to predict traffic congestion [6] [7]. Others use non-parametric k-NN regression to predict short-term traffic flow [8] [9] [10] [11].
Traffic flow is uncertain, nonlinear and complex, so it is difficult to predict effectively and accurately with methods based on traditional mathematical and physical models [12]. Although a neural network can be trained on historical data to model the complex relationships of traffic flow, it unavoidably requires a lengthy training time [8]. The k-nearest neighbour non-parametric regression method [10] is a reliable method for short-term traffic flow forecasting that can reflect the nonlinear characteristics of traffic flow.
Several studies have predicted short-term traffic flow using k-NN variants: three-layer k-NN [10], k-NN with a composite method [8], weighted k-NN [12], k-NN with tensors for calculating traffic similarity [11], and k-NN with connected upstream or downstream links [9].
Previously, we predicted traffic flow under normal conditions based on data from all days using a multiple regression method [13]. Other research shows that congestion patterns differ between weekdays and weekends [14] and between regular and irregular days [15]. Based on this information, we expect the traffic flow pattern on a road to differ between weekdays and weekends. We therefore assume that time series prediction for weekdays using the k-NN method gives better results with weekday data only than with data from all days.

Dataset
For our experiments, we used a data set from IoT traffic sensors in Aarhus, Denmark [16] [17]. There are 449 IoT sensors in total, and their locations are shown in Figure 1.

Methodology
Our main research objective is time series traffic speed prediction using k-NN based on similar traffic data. To achieve it, we conduct our experiment as follows.
First, we observe the average traffic speed pattern on road 158415 and road 158595, on weekdays and over whole weeks. Second, we conduct time series prediction on road 158415 and road 158595 using weekday data and using whole-week data. Third, we calculate the prediction error using MAD, MAPE and RMSE.
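The three error measures in the last step can be sketched as follows. This is a minimal illustration, assuming that MAD denotes the mean absolute deviation between observed and predicted speeds and that MAPE is reported in percent; the paper does not spell out the formulas, and the sample speeds below are hypothetical rather than taken from the tables.

```python
import math


def mad(actual, predicted):
    """Mean absolute deviation: the average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes that no observed value is zero.
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical observed and predicted average speeds (km/h).
actual = [50.0, 40.0, 60.0, 80.0]
predicted = [45.0, 44.0, 60.0, 70.0]

print(mad(actual, predicted), mape(actual, predicted), rmse(actual, predicted))
```

MAD and RMSE are expressed in the unit of the speed values, whereas MAPE is relative, which makes it easier to compare errors across roads with different typical speeds.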

Observe traffic average speed pattern
The aim of our research is to predict average speed using the k-NN method based on similar traffic data. We observed the traffic flow patterns on two roads and found that they differ between weekdays and whole weeks. Weekday data is traffic data from Monday to Thursday. Whole-week data is traffic data from Monday to Sunday. We observed that the traffic pattern on weekdays looks sharper than the pattern over whole weeks, as seen in Figure 2 and Figure 3.
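The split between the two data sets can be illustrated with a short sketch. It assumes that each observation is a (timestamp, average speed) pair; the record layout, the helper name select_days and the sample values are hypothetical and only serve to show the Monday-to-Thursday selection described above.

```python
from datetime import datetime

# datetime.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAY_SUBSET = {0, 1, 2, 3}        # Monday to Thursday, as defined in the text
WHOLE_WEEK = {0, 1, 2, 3, 4, 5, 6}   # Monday to Sunday


def select_days(records, allowed_days):
    """Keep only the records whose timestamp falls on an allowed day."""
    return [(ts, speed) for ts, speed in records if ts.weekday() in allowed_days]


# Hypothetical observations: (timestamp, average speed in km/h).
records = [
    (datetime(2014, 8, 4, 8, 0), 42.0),   # Monday
    (datetime(2014, 8, 7, 8, 0), 44.0),   # Thursday
    (datetime(2014, 8, 8, 8, 0), 51.0),   # Friday
    (datetime(2014, 8, 10, 8, 0), 63.0),  # Sunday
]

weekday_data = select_days(records, WEEKDAY_SUBSET)
whole_week_data = select_days(records, WHOLE_WEEK)
print(len(weekday_data), len(whole_week_data))
```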

Time series prediction
For time series prediction using the k-NN method, we set lags = 10 and k = 20. For time series prediction using the ELM neural network, we set lags = 4 and layers = 3 (input, hidden and output), with 4 input neurons, 100 hidden neurons and one output neuron.
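To make the role of the lags and k parameters concrete, the following is a minimal sketch of k-NN time series forecasting. It is not the implementation used in the experiments: the Euclidean distance, the unweighted mean of the neighbours' next values and the recursive multi-step strategy are assumptions, as are the function names. The number of steps that covers a given horizon depends on the sampling interval, so it is left as a parameter.

```python
import math


def knn_forecast(series, lags=10, k=20):
    """One-step-ahead forecast from the k most similar historical windows.

    The last `lags` values form the query. Every earlier window of the same
    length that has a known next value is a candidate. The forecast is the
    mean of the next values of the k closest candidates.
    """
    if len(series) <= lags:
        raise ValueError("series must be longer than the number of lags")
    query = series[-lags:]
    candidates = []
    for start in range(len(series) - lags):
        window = series[start:start + lags]
        target = series[start + lags]  # the value that followed this window
        dist = math.sqrt(sum((w - q) ** 2 for w, q in zip(window, query)))
        candidates.append((dist, target))
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]  # fewer than k if the history is short
    return sum(target for _, target in nearest) / len(nearest)


def knn_forecast_horizon(series, steps, lags=10, k=20):
    """Recursive multi-step forecast: each prediction is fed back as input."""
    history = list(series)
    forecasts = []
    for _ in range(steps):
        nxt = knn_forecast(history, lags, k)
        forecasts.append(nxt)
        history.append(nxt)
    return forecasts


# Synthetic periodic series standing in for a repeating speed pattern.
series = [float(i % 12) for i in range(360)]
print(knn_forecast(series), knn_forecast_horizon(series, steps=3))
```

Restricting the series to similar days before calling such a function changes which windows are available as neighbours, which is where the choice between weekday data and whole-week data enters.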

Results
The results of the experiment for short-term prediction (25 minutes) on road 158415 are shown in Table 1, and the errors are given in Table 2. The results for short-term prediction (25 minutes) on road 158595 are shown in Table 3, and the errors are given in Table 4.

Discussion
From our observation of traffic speed, we found that the traffic flow pattern on weekdays looks sharper than the pattern over whole weeks. This shows that traffic flow on weekdays follows a similar pattern. In the experiment, time series prediction for Monday using the k-NN method gives better results with weekday data than with whole-week data, as seen in Figure 4 and Figure 5. This shows that time series prediction using the k-NN method based on similar traffic flow data produces better results than prediction based on the whole data set.

Conclusions
Overall, our results show that time series prediction for Monday using the k-NN method based on weekday data, that is, on similar traffic data, produced better results than prediction using data from all days. In our experiment, time series prediction using the neural network also produced better results than the k-NN method. Further investigation is needed to understand the results more clearly. We propose clustering the traffic conditions of each day, on weekdays and weekends, to obtain better results.

Fig. 1. Map of the 449 IoT traffic sensors in the city of Aarhus, Denmark.

Fig. 4. Bar chart of error of short time series prediction on road 158415
Fig. 5.

Table 1. Results of short time prediction using k-NN and ELM neural network on road 158415

Table 2. The error of short time prediction on road 158415

Table 3. Results of short time series prediction using k-NN and ELM neural network on road 158595

Table 4. The error of short time series prediction on road 158595