Forecasting Analysis of Shanghai Stock Index Based on ARIMA Model

Forecasting the Shanghai Composite Index gives investors a useful reference for decisions in the stock market. This paper uses the monthly closing prices of the Shanghai Composite Index from January 2005 to October 2016 to construct an ARIMA model. The model is first used to forecast the last three monthly closing prices that have already occurred, and the forecasts are compared with the actual values to test the accuracy and feasibility of the model for short-term forecasting of the index. Finally, the ARIMA model is used to forecast the closing prices of the Shanghai Composite Index for the last two months of 2016.


Introduction
The stock market reflects the economic strength of the real economy, and the Shanghai Composite Index reflects the overall level of the country's real economy. In some respects, stock prices reflect national economic strength, so the rise and fall of a country's real economy can be observed through share prices. The Shanghai Composite Index is among the most direct indicators of the overall economy.
Forecasting the Shanghai Composite Index is therefore important for a country's subsequent regulation and control of the real economy and for the development of regional policy. However, it is difficult to analyze and predict the index with traditional structural methods, and the resulting errors are quite large. Choosing a suitable ARIMA model to predict the Shanghai Composite Index can give satisfactory results.

ARIMA(p,d,q) model
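Let $\{X_t\}$ be a non-stationary time series. After differencing $X_t$ $d$ times, we obtain

$$Y_t = \nabla^d X_t,$$

where $\nabla$ is the difference operator ($\nabla X_t = X_t - X_{t-1}$) and $d$ is the order of differencing [1]. After differencing, the sequence $Y_t$ has the following model structure:

$$Y_t = \varphi_1 Y_{t-1} + \cdots + \varphi_p Y_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q}.$$

This autoregressive integrated moving average model is referred to as the ARIMA(p, d, q) model. Here p is the autoregressive order, d is the order of differencing, and q is the moving average order; $\varepsilon_t$ is a white noise sequence; $\varphi_i$ ($i = 1, \dots, p$) are the autoregressive coefficients; and $\theta_j$ ($j = 1, \dots, q$) are the moving average coefficients.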
The autoregressive moving average model, first proposed by the statisticians G. E. P. Box and G. M. Jenkins in 1970, is widely used in time series analysis [2]. It is an effective prediction method with high precision in the short term.
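The differencing step that turns the original series into the sequence modeled by the ARMA part can be sketched in a few lines. This is only an illustration: the function names `difference` and `undifference` and the price values are assumptions made here, not part of the paper or of EViews.

```python
def difference(series, d=1):
    """Apply the difference operator d times: (nabla x)_t = x_t - x_{t-1}."""
    values = list(series)
    for _ in range(d):
        values = [curr - prev for prev, curr in zip(values, values[1:])]
    return values


def undifference(diffs, start):
    """Invert one round of differencing, given the first level of the series."""
    levels = [start]
    for step in diffs:
        levels.append(levels[-1] + step)
    return levels


# Hypothetical closing prices, for illustration only (not the paper's data).
prices = [3000.0, 3050.0, 2980.0, 3100.0, 3120.0]
first_diff = difference(prices, d=1)    # month-to-month changes
second_diff = difference(prices, d=2)   # changes of the changes
restored = undifference(first_diff, prices[0])
```

Each round of differencing shortens the series by one observation, and forecasts made on the differenced scale have to be cumulated back to price levels, which is what `undifference` does for d = 1.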

Data selection and processing
This paper uses the monthly closing prices of the Shanghai Composite Index from January 2005 to October 2016. All data are from the Shanghai Stock Exchange [3]. All data processing and forecasting are carried out using EViews 7.2.

Stationarity test
We test the stationarity of the Shanghai Composite Index monthly closing prices with the augmented Dickey-Fuller (ADF) test. The test results are shown in Table 1. They indicate that the original time series is non-stationary, but the first-order differenced series is stationary.
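The idea behind the unit-root test can be illustrated with a simplified sketch. It computes the basic Dickey-Fuller t-statistic (intercept, no trend, and no lagged difference terms, so it is not the full ADF test that EViews runs) on a simulated random walk and on its first difference. The function name `dickey_fuller_t`, the simulated data, and the rounded asymptotic 5% critical value of about -2.86 are assumptions of this sketch, not results from the paper.

```python
import math
import random


def dickey_fuller_t(series):
    """t-statistic of gamma in: diff(y)_t = alpha + gamma * y_{t-1} + e_t."""
    lagged = series[:-1]
    change = [b - a for a, b in zip(series, series[1:])]
    n = len(change)
    mean_x = sum(lagged) / n
    mean_z = sum(change) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(lagged, change))
    gamma = sxz / sxx
    alpha = mean_z - gamma * mean_x
    rss = sum((z - alpha - gamma * x) ** 2 for x, z in zip(lagged, change))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return gamma / std_err


# Simulated random walk standing in for the index level (not the paper's data).
rng = random.Random(0)
walk = [0.0]
for _ in range(300):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))
steps = [b - a for a, b in zip(walk, walk[1:])]

CRITICAL_5PCT = -2.86  # approximate asymptotic 5% value (intercept, no trend)
t_level = dickey_fuller_t(walk)   # statistic for the level series
t_diff = dickey_fuller_t(steps)   # statistic for the first difference
diff_is_stationary = t_diff < CRITICAL_5PCT
```

A statistic below the critical value rejects the unit-root hypothesis. The statistic does not follow the usual t distribution, which is why a dedicated critical value is used instead of a standard t table.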

Parameter determination
There are three important parameters in the ARIMA(p, d, q) model: the autoregressive order p, the differencing order d, and the moving average order q. As shown above, the series becomes stationary after one difference, so d = 1. Next, the values of p and q must be determined in order to specify the final ARIMA(p, d, q) model. They can be determined by analyzing the autocorrelation and partial autocorrelation plots of the differenced series $Y_t$. The autocorrelation and partial autocorrelation results for the series are shown in Figure 1.
From Figure 1, we can see that the partial autocorrelation function cuts off clearly, while the autocorrelation coefficients at lags 2 and 4 fall on the edge of the two-standard-deviation band [4]. In this case, it is very difficult to determine the model order with the traditional Box-Jenkins approach (autocorrelation and partial autocorrelation functions, residual variance, F test, criterion functions). For this special case, this paper repeatedly compares candidate models by the significance of their parameters. The model ARIMA(4, 1, 4) is established.
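The quantities plotted in a correlogram such as Figure 1 can be computed as in the sketch below. It uses the standard sample autocorrelation, the Durbin-Levinson recursion for the partial autocorrelation, and the approximate band of two standard errors, 2/sqrt(n). The function names and the toy series are assumptions for illustration; this is not the EViews routine and not the paper's data.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelations rho_0 .. rho_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(v * v for v in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
            for k in range(max_lag + 1)]


def pacf(rho):
    """Partial autocorrelations for lags 1 .. len(rho) - 1 (Durbin-Levinson)."""
    partial = []
    prev = []  # prev[j - 1] holds phi_{k-1, j}
    for k in range(1, len(rho)):
        if k == 1:
            phi_kk = rho[1]
            curr = [phi_kk]
        else:
            num = rho[k] - sum(prev[j - 1] * rho[k - j] for j in range(1, k))
            den = 1.0 - sum(prev[j - 1] * rho[j] for j in range(1, k))
            phi_kk = num / den
            curr = [prev[j - 1] - phi_kk * prev[k - j - 1] for j in range(1, k)]
            curr.append(phi_kk)
        partial.append(phi_kk)
        prev = curr
    return partial


# Toy differenced series, for illustration only.
toy = [1.0, 2.0, 3.0, 4.0, 5.0]
rho = acf(toy, max_lag=2)
partial = pacf(rho)
band = 2.0 / math.sqrt(len(toy))  # approximate two-standard-error band
significant_lags = [k for k in range(1, len(rho)) if abs(rho[k]) > band]
```

Coefficients that sit close to the band, as the text reports for lags 2 and 4, give no clear cut-off, which is why candidate orders are then compared by parameter significance.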

ARIMA(p,d,q) model checking
After the model is constructed, its adequacy must be tested. For an ARIMA(p, d, q) model, adequacy is judged by testing whether the residual sequence is white noise. If the residual sequence is not white noise, information remains in the residuals; the fitted model cannot be the final model because its parameters do not fully represent the series [5], and the model must be re-estimated. If the residual sequence is white noise, the proposed model is adequate.
The white noise test results (omitted for reasons of space) show that the autocorrelation coefficients of the residual sequence lie within the random interval, that there is no residual serial correlation, and that the autocorrelation and partial autocorrelation values are close to zero at every lag. None of the Q-statistics is significant: the p-values of the Q-statistics at each lag are greater than 0.05. The residual sequence therefore passes the white noise test, and the ARIMA(4,1,4) model can be used for further analysis and prediction.
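The Q-statistic used in this kind of white noise check is commonly the Ljung-Box statistic, Q = n(n+2) * sum over k = 1..m of rho_k^2 / (n - k). The sketch below assumes that form; the function name `ljung_box_q` and the simulated residuals are illustrative assumptions, not the paper's residuals. The constant 9.488 is the 95% chi-square quantile for 4 degrees of freedom, which applies if m = 12 lags are tested and the degrees of freedom are reduced by p + q = 8.

```python
import random


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q-statistic over lags 1 .. max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    c0 = sum(v * v for v in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        rho_k = sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


# Simulated residuals standing in for the fitted model's residuals.
rng = random.Random(1)
simulated_residuals = [rng.gauss(0.0, 1.0) for _ in range(140)]

CHI2_95_DF4 = 9.488  # 95% chi-square quantile with 12 - 4 - 4 = 4 d.o.f.
q_stat = ljung_box_q(simulated_residuals, max_lag=12)
looks_like_white_noise = q_stat < CHI2_95_DF4
```

A Q-statistic below the critical value (equivalently, a p-value above 0.05) means the hypothesis of no residual autocorrelation up to lag m is not rejected.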

Model estimation
From the previous analysis, the final model is ARIMA(4,1,4). The coefficients of the ARIMA(4,1,4) model are estimated with EViews 7.2. The results are shown in Table 4.
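Once coefficients are available, forecasts from an ARIMA(p,1,q) model follow from a simple recursion on the differenced series, cumulated back to levels. The sketch below is a minimal illustration under stated assumptions: moving average terms enter with a minus sign (software packages differ on this convention), pre-sample values and pre-sample shocks are set to zero, and future shocks are replaced by their expectation of zero. The function name `arima_d1_forecast` and all inputs are hypothetical; they are not the Table 4 estimates, and this is not the algorithm EViews uses internally.

```python
def arima_d1_forecast(levels, phi, theta, steps, const=0.0):
    """Forecast an ARIMA(p,1,q) model `steps` periods ahead.

    Assumed convention for the differenced series y:
        y_t = const + sum_i phi[i-1] * y_{t-i} + e_t - sum_j theta[j-1] * e_{t-j}
    """
    y = [b - a for a, b in zip(levels, levels[1:])]
    p, q = len(phi), len(theta)

    def predict(t, ys, es):
        # One-step prediction of y_t from values and shocks before time t.
        ar = sum(phi[i] * ys[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(theta[j] * es[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        return const + ar - ma

    # In-sample shocks: e_t = y_t minus its one-step prediction.
    e = []
    for t in range(len(y)):
        e.append(y[t] - predict(t, y, e))

    # Out-of-sample: future shocks are zero, forecasts feed the AR terms.
    ys, es = list(y), list(e)
    forecasts = []
    last = levels[-1]
    for _ in range(steps):
        y_hat = predict(len(ys), ys, es)
        ys.append(y_hat)
        es.append(0.0)
        last += y_hat              # cumulate the change back to a level
        forecasts.append(last)
    return forecasts


# Hypothetical inputs, for illustration only.
ar_only = arima_d1_forecast([0.0, 1.0, 3.0], phi=[0.5], theta=[], steps=2)
ma_only = arima_d1_forecast([0.0, 1.0, 3.0], phi=[], theta=[0.5], steps=2)
```

With empty `phi` and `theta` and a zero constant, the recursion reduces to a random walk and every forecast equals the last observed level. For ARIMA(4,1,4), `phi` and `theta` would each hold four coefficients.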

Forecasting analysis of Shanghai Composite Index
First, we use the ARIMA(4,1,4) model to forecast the closing price of the Shanghai Composite Index from July 2016 to October 2016. The relative errors between the predicted and actual values are calculated to assess the feasibility and accuracy of the model [6]. The relative error is calculated by the following formula:

relative error = (actual value - predicted value) / actual value

The predicted closing prices of the Shanghai Composite Index and the relative errors are shown in Table 5 and Figure 2.
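The relative error calculation can be written out as a short sketch. The numbers below are hypothetical and serve only to show the arithmetic; the paper's actual and predicted values are those reported in Table 5.

```python
def relative_error(actual, predicted):
    """Relative error = (actual value - predicted value) / actual value."""
    return (actual - predicted) / actual


# Hypothetical values, for illustration only (not the Table 5 figures).
actual_values = [3000.0, 3100.0, 3050.0]
predicted_values = [2970.0, 3131.0, 3050.0]
errors = [relative_error(a, p)
          for a, p in zip(actual_values, predicted_values)]
```

With this sign convention, a positive relative error means the model under-predicted the closing price and a negative one means it over-predicted.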

Fig. 1. Autocorrelation and partial autocorrelation of the first-order differenced sequence.
For the sequence Y, we try several different models, such as ARMA(2,2) and ARMA(4,4). The t-test results (p-values) for the parameters of the different models are used to select the best model. The regression results for the ARMA(2,2) model are shown in Table 2.
Comparing the significance of the parameter t-tests (p-values) and the R-squared goodness of fit of the ARMA(2,2) and ARMA(4,4) models shows that the t-tests are significant and R-squared is best when p = 4 and q = 4.

Table 5. Predicted values and error analysis.