Daily Electricity Consumption Forecasting Based on Lazy Learning

Daily electricity consumption varies randomly. To improve forecasting accuracy, a Lazy Learning (LL) model is proposed. LL builds the regression forecasting model on vectors chosen by the K-vector nearest neighbors (K-VNN) method, which can overcome the overfitting problem and ensure high accuracy. Since many factors are related to electricity consumption, Grey T's correlation degree is used to determine the key indexes and thus improve the running efficiency of the model. In addition, fuzzy C-means (FCM) clustering is applied to find similar scenarios, which reduces the search scope of LL. A case study of one building in Shanghai shows that the proposed method can enhance the accuracy and efficiency of electricity consumption forecasting.


Introduction
Accurate forecasting of daily electricity consumption is of great significance to power network planning and to users' management decisions. Establishing an efficient forecasting model based on the big data of the power system is an important topic in electricity consumption forecasting. Time series methods are adopted to identify future electricity consumption based on historical data [1][2][3]. Besides historical consumption data, the external environment, such as temperature and weather characteristics, is closely related to the future load, so multivariable coupling methods have been put forward [4,5]. To further improve forecasting accuracy, integrated methods have been proposed [6,7], in which different weights are used to combine the forecasting results of various algorithms. The literature above provides important references for load forecasting, but the following issues need further research:
1) Unified parameter models are adopted in present electricity consumption forecasting. However, a unified parameter model incorporates every error, which results in the overfitting problem.
2) The randomness of electricity consumption is gradually increasing. The forecasting accuracy can only be improved by selecting similar electricity consumption scenarios.
For this purpose, a Lazy Learning (LL) method [8,9] is proposed to forecast electricity consumption. The LL algorithm is based on the principle that "similar inputs produce similar outputs". Similar scenarios are chosen by K-vector nearest neighbors (K-VNN), so the overfitting problem is overcome and high accuracy can be ensured.
To improve the running efficiency of the model, Grey T's correlation degree [10] is used to find the key indexes. In addition, fuzzy C-means (FCM) clustering [11] is applied to explore similar scenarios and reduce the search scope of modeling. The results demonstrate that the proposed method can enhance the accuracy and efficiency of electricity consumption forecasting.

K-VNN
The K-vector nearest neighbors (K-VNN) method is employed to choose similar input data from the database for forecasting day f. The input vector of day f, $X_f = [x_{f1}, x_{f2}, \cdots, x_{fN}]^T$, is given, where N is the dimension of the input vector.
To choose the nearest K neighbors, the distance and angle between $X_f$ and $X_t$ are calculated. $X_t$ is sample t in the database, $t \in T$, where T is the time set.
The Euclidean distance $d_t(X_t, X_f)$ and angle $\theta_t(X_t, X_f)$ are defined as follows:
$$d_t(X_t, X_f) = \|X_t - X_f\| = \sqrt{\sum_{n=1}^{N} (x_{tn} - x_{fn})^2}$$
$$\cos\theta_t(X_t, X_f) = \frac{X_t^T X_f}{\|X_t\| \, \|X_f\|}$$
Once $d_t(X_t, X_f)$ and $\theta_t(X_t, X_f)$ are obtained, the K nearest neighbors are selected as follows:
1) $\cos(X_t, X_f) < 0$ indicates that $X_t$ and $X_f$ are in opposition, and $X_t$ should be abandoned.
2) When $\cos(X_t, X_f) \ge 0$, $D(X_t, X_f)$ is further used to describe both the distance and the angle between $X_f$ and its neighbors. The larger $D(X_t, X_f)$ is, the nearer the two vectors are. $D(X_t, X_f)$ combines the two measures through a weight $\alpha$, $0 \le \alpha \le 1$. A logistic function is used to map $d_t(X_t, X_f)$ to the interval [0, 1], which is consistent with the range of $\cos(X_t, X_f)$.
Furthermore, the K nearest data $X_{fk}$ (k = 1, 2, …, K) are selected by sorting $D(X_t, X_f)$ in descending order. To enhance forecasting accuracy, K is set to 10% of the samples in the historical database.
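The selection rule above can be sketched in Python as an illustration only. The exact expression of $D(X_t, X_f)$ is not reproduced in the text, so the sketch assumes a convex combination $D = \alpha\,s(d_t) + (1-\alpha)\cos\theta_t$ with the logistic mapping $s(d) = 2e^{-d}/(1+e^{-d})$, which sends $d = 0$ to 1 and decreases toward 0. The function name `kvnn_select` and the parameters `alpha` and `ratio` are hypothetical.

```python
import numpy as np

def kvnn_select(X, x_f, alpha=0.5, ratio=0.10):
    """Return the indices of the K nearest vectors to x_f and their scores D.

    X     : (T, N) array of historical input vectors X_t
    x_f   : (N,) input vector of the forecasting day
    alpha : weight between distance and angle, 0 <= alpha <= 1
    ratio : share of the database kept as neighbors (10% in the text)
    """
    X = np.asarray(X, dtype=float)
    x_f = np.asarray(x_f, dtype=float)

    d = np.linalg.norm(X - x_f, axis=1)                 # Euclidean distance d_t
    norms = np.linalg.norm(X, axis=1) * np.linalg.norm(x_f)
    cos = (X @ x_f) / np.where(norms == 0, 1.0, norms)  # cos(theta_t); zero vectors give 0

    # ASSUMPTION: logistic mapping of d_t to (0, 1]; d = 0 maps to 1.
    closeness = 2.0 * np.exp(-d) / (1.0 + np.exp(-d))
    # ASSUMPTION: D is a convex combination of closeness and cosine.
    D = alpha * closeness + (1.0 - alpha) * cos

    keep = np.flatnonzero(cos >= 0)                     # rule 1: drop opposed vectors
    K = max(1, int(round(ratio * len(X))))
    order = keep[np.argsort(-D[keep], kind="stable")]   # rule 2: sort D in descending order
    return order[:K], D[order[:K]]
```

The returned scores can be reused as the residual weights of the local regression model, since they are already computed for the selected neighbors.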

Prediction model
According to the nearest neighbors $X_F = \{X_{f1}, X_{f2}, \dots, X_{fK}\}$ and the corresponding electricity consumption $Y_F = [y_{f1}, y_{f2}, \dots, y_{fK}]^T$, a first-order linear regression model is established:
$$\hat{y}_{fk} = X_{fk}^T \beta, \quad k = 1, 2, \dots, K$$
The weighted sum-squared residual is taken as the evaluation criterion for the accuracy of the model:
$$J = \sum_{k=1}^{K} D(X_{fk}, X_f)\,\left(y_{fk} - X_{fk}^T \beta\right)^2$$
where D(·) is the residual weight. The model parameter $\beta$ can be obtained by minimizing J.
The output for $X_f$ is calculated by formula (6).
$$\hat{y}_f = X_f^T \beta \qquad (6)$$
The LL algorithm selects the nearest neighbors and builds a model for each forecasting day, which not only solves the problem of overfitting but also increases forecasting accuracy.
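A minimal sketch of this local model is given below, assuming that J is the sum of squared residuals weighted by the neighbor scores D. Minimizing such a J is equivalent to an ordinary least-squares fit on rows scaled by the square roots of the weights. The function name `local_weighted_forecast` is hypothetical, and the optional intercept column is an assumption that the text does not state.

```python
import numpy as np

def local_weighted_forecast(X_nb, y_nb, weights, x_f, add_intercept=True):
    """Fit beta by weighted least squares on the K neighbors and forecast x_f.

    X_nb    : (K, N) neighbor input vectors X_fk
    y_nb    : (K,) electricity consumption of the neighbors
    weights : (K,) non-negative residual weights, e.g. the scores D
    x_f     : (N,) input vector of the forecasting day
    """
    X_nb = np.asarray(X_nb, dtype=float)
    y_nb = np.asarray(y_nb, dtype=float)
    x_f = np.asarray(x_f, dtype=float)
    sw = np.sqrt(np.asarray(weights, dtype=float))

    if add_intercept:
        # ASSUMPTION: a constant column is appended so the model has an intercept.
        X_nb = np.hstack([np.ones((len(X_nb), 1)), X_nb])
        x_f = np.concatenate([[1.0], x_f])

    # min sum_k w_k (y_k - x_k^T beta)^2  ==  least squares on sqrt(w)-scaled rows
    beta, *_ = np.linalg.lstsq(X_nb * sw[:, None], y_nb * sw, rcond=None)
    return float(x_f @ beta), beta
```

Because the model is refitted for every forecasting day, only K rows enter each fit, which keeps the cost of a single forecast small.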

Data standardization
Data at different scales should be standardized first. The Z-score method is adopted for standardization:
$$x_{tn} = \frac{x_{tn,I} - M_n}{S_n}$$
where $x_{tn,I}$ is the initial input, $x_{tn}$ is the standardized input, n = 1, 2, …, N, $M_n$ is the mean value and $S_n$ is the standard deviation. The standardized vector at time t is $X_t = [x_{t1}, x_{t2}, \cdots, x_{tN}]^T$.
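The standardization step can be illustrated with a short Python sketch. The function names `zscore_fit` and `zscore_apply` are hypothetical, and the use of the population standard deviation (divisor T rather than T−1) is an assumption, since the text does not specify which one is used.

```python
import numpy as np

def zscore_fit(X_raw):
    """Return the column means M_n and standard deviations S_n of the raw inputs."""
    X_raw = np.asarray(X_raw, dtype=float)
    mean = X_raw.mean(axis=0)
    std = X_raw.std(axis=0)             # ASSUMPTION: population standard deviation
    std = np.where(std == 0, 1.0, std)  # constant columns are left unscaled
    return mean, std

def zscore_apply(X_raw, mean, std):
    """Standardize raw inputs with previously fitted M_n and S_n."""
    return (np.asarray(X_raw, dtype=float) - mean) / std
```

Fitting the statistics on the historical database and reusing them for the forecasting day keeps $X_f$ and $X_t$ on the same scale.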

Inputs selection
Daily load is related to historical daily load, temperature, weather factors and holidays. To identify the factors that have an obvious impact on daily load, Grey T's correlation degree [8] is used to measure the degree of relation.
In the Grey T's correlation degree calculation, the daily load $[y_1, y_2, \dots, y_T]$ is taken as the parent factor, and each input $[x_{1n}, x_{2n}, \dots, x_{Tn}]$ is taken as a sub factor, from which the grey correlation degree $r_n$ of the nth factor is calculated. The correlation degree satisfies $|r_n| \le 1$; the greater $|r_n|$ is, the higher the degree of correlation. $r_n > 0$ indicates that the parent factor and the sub factor are positively correlated; otherwise they are negatively correlated.
Since the influence of temperature on daily load follows a different trend in each season, the correlation degrees are determined separately for the four seasons.
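For illustration, one commonly published formulation of the T-type grey correlation degree is sketched below. It is an assumption that this matches the formula used here: each sequence is scaled by the mean absolute value of its own increments, and the degree is the average over time of the signed ratio between the smaller and the larger scaled increment. This formulation satisfies $|r_n| \le 1$ and gives a positive value when both sequences move in the same direction. The function name `t_type_grey_correlation` is hypothetical.

```python
import numpy as np

def t_type_grey_correlation(parent, sub):
    """T-type grey correlation degree between a parent and a sub sequence.

    ASSUMPTION: increment-based formulation; the result lies in [-1, 1].
    """
    y = np.asarray(parent, dtype=float)
    x = np.asarray(sub, dtype=float)
    if y.ndim != 1 or y.shape != x.shape or y.size < 2:
        raise ValueError("sequences must be 1-D, of equal length, with at least 2 points")

    def scaled_increments(seq):
        inc = np.diff(seq)
        scale = np.mean(np.abs(inc))       # mean absolute increment of the sequence
        return inc / scale if scale > 0 else np.zeros_like(inc)

    a = scaled_increments(y)
    b = scaled_increments(x)
    hi = np.maximum(np.abs(a), np.abs(b))
    lo = np.minimum(np.abs(a), np.abs(b))
    ratio = np.divide(lo, hi, out=np.zeros_like(hi), where=hi > 0)
    xi = np.sign(a * b) * ratio            # 0 wherever either increment is 0
    return float(xi.mean())
```

Applying the function to each candidate input over one season's samples gives the seasonal degrees used to rank the inputs.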

Fuzzy c-means clustering
The LL algorithm determines the K neighbors by searching the whole database, which takes a long time. Therefore, FCM is used to classify the database into small clusters.
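To realize the classification, the membership-weighted distance between the input vectors and the cluster centers should be minimized:
$$\min J(U, c_1, \dots, c_C) = \sum_{t=1}^{T} \sum_{i=1}^{C} u_{ti}^{m} d_{ti}^{2} \qquad (12)$$
$$\text{s.t.} \quad \sum_{i=1}^{C} u_{ti} = 1, \quad t = 1, 2, \dots, T \qquad (13)$$
where C is the number of clusters, $c_i$ (i = 1, 2, …, C) is the ith cluster center, U is the T×C membership matrix whose element $u_{ti}$ indicates the degree to which $X_t$ belongs to the fuzzy set $c_i$, $0 \le u_{ti} \le 1$, $d_{ti} = \|X_t - c_i\|$ is the Euclidean distance between the tth input vector and the ith cluster center, and $m \in [1, \infty)$ is a weighted index, set to 2 here.
The necessary conditions for equations (12)~(13) are obtained as follows by constructing the Lagrangian function.
$$c_i = \frac{\sum_{t=1}^{T} u_{ti}^{m} X_t}{\sum_{t=1}^{T} u_{ti}^{m}} \qquad (14)$$
$$u_{ti} = \frac{1}{\sum_{j=1}^{C} \left( d_{ti} / d_{tj} \right)^{2/(m-1)}} \qquad (15)$$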
The optimal classification is determined by iteration. The evaluation index L(C) used to choose the number of clusters is given by (16). When L(C-1)<L(C-2) and L(C-1)<L(C) are satisfied, C-1 is the optimal number of clusters.
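The iteration can be illustrated with the standard FCM alternating updates of cluster centers and memberships. The sketch below is a generic implementation under assumed settings (random initial memberships, a tolerance on the membership change as the stopping rule). It does not implement the evaluation index L(C), whose expression is not reproduced here, so the number of clusters C is passed in by the caller. The function name `fuzzy_c_means` and the parameters `tol`, `max_iter` and `seed` are hypothetical.

```python
import numpy as np

def fuzzy_c_means(X, C, m=2.0, tol=1e-6, max_iter=300, seed=0):
    """Cluster the rows of X into C fuzzy clusters; return (centers, U)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), C))
    U /= U.sum(axis=1, keepdims=True)          # each row of U sums to 1

    for _ in range(max_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]      # membership-weighted means
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                  # avoid division by zero at a center
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)      # standard membership update
        shift = np.abs(U_new - U).max()
        U = U_new
        if shift < tol:                        # stop when memberships stabilize
            break
    return centers, U
```

Assigning each sample to the cluster with the largest membership, for example with `U.argmax(axis=1)`, gives the reduced database in which the neighbors of the forecasting day are searched.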

Selection of input factors
The proposed FCM-LL algorithm is used to forecast the daily electricity consumption of one building in Shanghai. The data cover the period from January 1, 2017 to December 31, 2017.
The candidate inputs include the highest temperature $x_{t1,I}$, the lowest temperature $x_{t2,I}$, the weather factor $x_{t3,I}$, the holiday indicator $x_{t4,I}$, the average temperature $x_{t5,I}$, and the daily loads of the previous week $x_{t6,I}$ to $x_{t12,I}$. For $x_{t4,I}$, 0 represents holidays and 1 represents weekdays. Weather factors are quantified on the interval [0, 1] [12]. For example, 0.6 represents cloudy to moderate rain and 1 represents sunny. Data from 45 days each in summer, autumn and winter are taken for correlation analysis, and the correlation degrees are denoted by $r_{n,S}$, $r_{n,F}$ and $r_{n,W}$. As shown in Table 1, the historical daily loads have a weak correlation with the future load and are therefore not used as inputs. In contrast, daily electricity consumption is related to the highest temperature, the lowest temperature, holiday and weather factors. Therefore, a 365×5 database $(X_{t,I}, y_t)$ is set up.

Data clustering
To further reduce the search scope of the LL algorithm, the number of data in each cluster is limited to 50; Table 2 gives an example.
There are 20 data in this cluster, whose clustering center is c={21.14, 11.53, 0.71, 0.85, 320.93}. The data in this class have similar characteristics.
The 16th datum has a 21.93% relative offset from the class and is therefore judged as bad data.

Algorithm performance
Algorithm performance is described by $E_{MAPE}$ and E. $E_{MAPE}$ is the mean absolute percentage error. E is the ratio of the number of forecast days whose error $e_i$ is less than the preset error to the total number of forecast days m. Leave-one-out cross validation and sampling validation are adopted to verify the performance of the algorithm. Leave-one-out cross validation selects each sample in turn as the test sample. Sampling validation randomly selects 20 data.
Performance parameters of the three models are shown in Table 3, where $t_{Fi}$ and $t_{Te}$ are the test times.
As shown in Table 3, weather factors improve the forecasting accuracy. Compared with the traditional model I, model III has higher forecasting accuracy and a shorter running time. Figure 1 further demonstrates that model III can track the daily load and can meet the needs of practical engineering.
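The two indexes and the leave-one-out procedure can be sketched as follows. It is assumed here that $e_i$ is the absolute percentage error of day i, and the preset error is left as a parameter because its value is not stated. The names `mape`, `hit_ratio`, `leave_one_out` and the `forecast` callable are hypothetical.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error E_MAPE, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_pred - y_true) / y_true)) * 100.0)

def hit_ratio(y_true, y_pred, preset_error):
    """Share of forecast days whose error e_i (in percent) is below preset_error.

    ASSUMPTION: e_i is the absolute percentage error of day i.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    e = np.abs((y_pred - y_true) / y_true) * 100.0
    return float(np.mean(e < preset_error))

def leave_one_out(X, y, forecast):
    """Forecast every sample from all the other samples.

    forecast(X_train, y_train, x_test) is any callable returning one prediction.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    preds = []
    for i in range(len(X)):
        mask = np.arange(len(X)) != i      # hold out sample i
        preds.append(forecast(X[mask], y[mask], X[i]))
    return np.array(preds, dtype=float)
```

Sampling validation follows the same pattern, with the loop restricted to a random subset of test indices instead of every sample.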

Error analysis
To analyze the error differences among the three models, the electricity consumption of November 6, 2017 is taken as an example, and the selected modeling data are shown in Table 4. The following can be summarized from Table 4:
1) K-VNN can effectively select neighbors for modeling the forecasting day, and weather factors have an important impact on clustering and neighbor selection. {17,14,1,436.48} and {17,13,1,324.15} cannot be identified in Model II, which does not use weather factors, and this leads to the extra error of Model II.
2) In Model III, data clustering and bad data identification filter out noisy data, so the database is reduced and the credibility of the data is improved. For example, {19,11,0.8,1,391.41} in Model I is filtered out as bad data and is not included in Model III, which ensures accuracy.

Conclusions
In summary, the FCM-LL model proposed in this paper is an effective method for forecasting electricity consumption. The following conclusions can be drawn from the case study:
1) Grey T's correlation degree can select effective input vectors.
2) The FCM algorithm can obviously reduce the search scope, which improves running efficiency.
3) FCM-LL builds the forecasting model from selected similar scenarios, so the problem of overfitting is overcome and forecasting accuracy is further improved.