Stock prices forecasting based on wavelet neural networks with PSO

This research examines the forecasting performance of a wavelet neural network (WNN) model using published stock data from the Financial Times Stock Exchange (FTSE) Taiwan Stock Exchange (TWSE) 50 index, also known as the Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX), hereinafter referred to as the Taiwan 50. Our WNN model uses particle swarm optimization (PSO) to choose appropriate initial network values for different companies. The approach offers two advantages. First, the initial network values are selected automatically instead of being fixed constants. Second, the threshold and training data percentage can be held constant, because PSO provides self-adjustment. We achieve a success rate of over 73% without the need to manually adjust parameters or build a separate mathematical model.


Introduction
The wavelet neural network (WNN) was first proposed by Zhang and Benveniste in 1992 [1]. It is a special feed-forward network grounded in wavelet theory. Wavelet neural networks are not confined to analysis of the stock market; they are also widely used in the financial derivatives market, the gold market, and so on [2,3]. Zhang Qian and Gao Liqun (2002) used a wavelet noise filtering technique and the least squares method to predict the share price of Denghai Seed Industry (stock code: 002041) in China [4].
In Taiwan, most investors prefer holding mutual funds over other assets in their portfolios. Since the majority of Taiwan's mutual funds are stock-based, there is value in developing a model based on opening and closing stock prices. We believe such a model should possess qualities such as universality and ease of use.
Neural networks have been widely adopted because of their distinctive ability to simulate the thought processes of the human brain, learning and solving problems. Through artificial neural networks, the parallel computation of biological neural systems can be emulated, allowing high-volume data analysis within a short period of time. They are now used across many disciplines, stock price prediction being one of them. Universality is the goal for such models. In our earlier research [5], we combined a second-order autoregressive model (AR(2)) with wavelet analysis and related techniques. Unfortunately, that method was severely limited by concavity and the concavity distribution. The WNN architecture, by contrast, is constructed as a weighted combination of many wavelons and is therefore naturally unaffected by concavity.
The WNN consists of three parts: the parameters, the network architecture, and the forecasting method. Generally, a WNN has five parameters: weight, translation, dilation, threshold, and adjustment. We constructed a model that combines the WNN with PSO. This method has two advantages. First, the initial network values are selected automatically instead of being constants. Second, the threshold and training data percentage can be held constant, because after applying PSO the remaining three of the five parameters self-adjust within the algorithm. Thus, the model presented in this paper is easy to operate. In 2000, Y. Oussar's research indicated that the weight, translation, and dilation parameters should be decided by the stock price's minimum and maximum values. Our experiments showed that when the initial values were chosen by Y. Oussar's method, not only did the calculation often fall into an infinite loop, but the forecasted stock price also exceeded the government-enforced limit-up and limit-down bounds, which is inconsistent with Taiwanese regulations.

Methodology
We collected data on 41 stocks from the Taiwan 50 (Table 1) between 2010/07/15 and 2010/12/16, a span of 100 trading days. The daily closing price was chosen as input because the closing price reflects all the activities of the index on that day [6]. This means that for each company, a total of 100 stock prices, one from each trading day, was input for calculation. During experimentation, 10% of the data points from each stock price series were used as the training set; the remaining 90% were used for testing. This 10% was set as the training data percentage parameter. The threshold parameter was held constant at 0.01. Figure 1 shows the main procedures of our research. The inputs and outputs of each block are further detailed in the material that follows.
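The data split described above can be sketched as follows; this is a minimal sketch, and the placeholder price list and all variable names are illustrative assumptions, not taken from the paper.

```python
# Sketch of the 10%/90% train/test split described above.
# The price list below is a placeholder standing in for 100 daily closes.
closing_prices = [float(p) for p in range(100, 200)]

train_fraction = 0.10   # training data percentage parameter
threshold = 0.01        # threshold parameter (held constant in the paper)

n_train = int(len(closing_prices) * train_fraction)
train_set = closing_prices[:n_train]   # first 10% of the series
test_set = closing_prices[n_train:]    # remaining 90% used for testing
```

With 100 data points this yields a 10-point training set and a 90-point test set.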

Particle swarm optimization
In 1995, J. Kennedy and R. C. Eberhart were the first to introduce PSO [7]. Originally, it was an algorithm for simulating social behaviors, modeled on the foraging behavior of bird flocks. Later, it was applied to solving optimization problems. The PSO calculation process and the parameters used in this research are as follows.
Assume there are D stock prices, so each candidate solution is a point in D-dimensional space. The location of the i-th particle is X_i = (x_{i1}, x_{i2}, x_{i3}, ..., x_{iD}). The best location the i-th particle has passed through is recorded as P_i = (p_{i1}, p_{i2}, p_{i3}, ..., p_{iD}). The component closest to the target is found by examining P_1 through P_D and is recorded as p_{gd}.
The i-th particle's velocity is recorded as V_i = (v_{i1}, v_{i2}, v_{i3}, ..., v_{iD}). The two formulas below simulate the behaviour pattern of the bird flock:

v_{id} = w * v_{id} + c1 * r1 * (p_{id} - x_{id}) + c2 * r2 * (p_{gd} - x_{id})   (1)
x_{id} = x_{id} + v_{id}   (2)

where r1 and r2 are random numbers in [0, 1] and c1, c2 are learning factors. Based on the data above, (1) was used to update the particle velocity; (2) was then iterated to obtain the new location.
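The velocity and position updates in (1) and (2) can be sketched for a single particle as below; the function name and the default inertia/learning-factor values are illustrative assumptions, not values reported in the paper.

```python
import random

def pso_step(x, v, p_best, g_best, w=0.7, c1=2.0, c2=2.0):
    """One PSO iteration for a single particle:
    (1) v_id = w*v_id + c1*r1*(p_id - x_id) + c2*r2*(p_gd - x_id)
    (2) x_id = x_id + v_id
    """
    new_x, new_v = [], []
    for xi, vi, pi, gi in zip(x, v, p_best, g_best):
        r1, r2 = random.random(), random.random()  # random numbers in [0, 1]
        vi_new = w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
        new_v.append(vi_new)
        new_x.append(xi + vi_new)  # move the particle by its new velocity
    return new_x, new_v
```

When the particle already sits at both its personal and the global best, the attraction terms vanish and only the inertia term remains.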

Using PSO to optimize the initial values of WNN
Five network parameters were assigned for this WNN: the weight parameter w, the translation parameter t, the dilation parameter d, the adjustment parameter y, and the threshold value. The first three, w, t, and d, were the parameters to be optimized. The method of implementation was as follows. Step 1: The initial values of w, t, and d were assigned according to the parameters developed by Y. Oussar [8]: w was set to 0, while t and d were derived from the original data's minimum value m and maximum value M. There were 100 initial values for each parameter.
Step 3: A range must be specified when using PSO to optimize the three initial value parameters w, t, and d. Step 4: To obtain the optimized values of w, t, and d, we ran the PSO program loop using the statistics gathered in Steps 1 through 3, with the error function E serving as the objective function.

Wavelet neural networks
There are many types of artificial neural networks (ANNs). An ANN is called a wavelet neural network (WNN) when its activation function is a wavelet [12]. To construct a conventional WNN, the number of hidden layers and the number of neurons in those layers must be provided. The advantage of the WNN structure proposed in this paper is that neither of these parameters is required (Figure 2). This model uses the Morlet wavelet as the activation function, for two main reasons. First, using a wavelet is simpler than using a Fourier basis. To understand why, one must first consider the differences between sine/cosine waves and wavelets. Because sine/cosine waves are infinitely supported in the time domain, Fourier bases generally cause high-frequency oscillation near jump points and the endpoints of the interval. This high oscillation incurs more time complexity than a wavelet algorithm does, especially when the signal is a stock price series.
Fortunately, wavelets are localized in both the frequency domain and the time domain. They can be designed to have desirable properties such as near-optimal time-frequency localization, compact support, and regularity. A stock signal normally appears as highly localized oscillation in the time domain. Since wavelets can be designed with finitely supported, localized oscillation in the time domain, using wavelets to represent "bad" signals such as stocks is more advantageous than using Fourier bases. These are the main reasons why our previous algorithm based on Fourier bases was abandoned.
Second, the Morlet wavelet is very useful for detecting system nonlinearities because of its good support in both the frequency and time domains. In practice, the Morlet wavelet is the product of a complex exponential wave and a Gaussian envelope. As Figure 3 shows, the Morlet wavelet enjoys not only the advantages of Fourier bases but also, being a wavelet, localization in both the time and frequency domains. When the input X = (x_1, x_2, x_3, ..., x_n) is fed to the network, it is translated using T = (t_1, t_2, t_3, ..., t_n) and dilated using D = (d_1, d_2, d_3, ..., d_n) according to the parameters.
Equation (6) and the adjustment parameter y are then used to obtain the output data y, as shown in (7); adding the adjustment parameter is required in (7).
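The forward pass described above can be sketched as follows, assuming a common real-valued form of the Morlet wavelet, cos(1.75u)*exp(-u^2/2); the function names and the specific carrier frequency are illustrative assumptions, not taken from the paper.

```python
import math

def morlet(u):
    # Real-valued Morlet wavelet: a cosine carrier under a Gaussian envelope.
    return math.cos(1.75 * u) * math.exp(-u * u / 2.0)

def wnn_output(x, w, t, d, y_adj):
    """Network output for a scalar input x: a weighted sum of wavelons,
    each translated by t_j and dilated by d_j, plus the adjustment term."""
    return y_adj + sum(wj * morlet((x - tj) / dj) for wj, tj, dj in zip(w, t, d))
```

At u = 0 the wavelet equals 1, so a single wavelon with weight 1 centered on the input contributes exactly its weight plus the adjustment term.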

WNN training process
The WNN architecture is composed of weighted linear combinations of many wavelons. A supervised learning method was used to train our WNN, and the backpropagation algorithm was used to correct the network parameters [6,13,14].
The WNN training process is as follows: Step 1: Input data to the WNN.
Step 2: Five settings were assigned to the WNN, namely w, t, and d after PSO optimization, together with the adjustment parameter y and the threshold value.
Step 3: Input all data and network parameters and use the WNN to calculate the network output values.
Step 4: Calculate the error between the network output value and x using the error function E.
Step 5: Determine whether the error is within the given threshold value: 1. If yes, proceed to the next step.
2. If not, use the Levenberg-Marquardt (L-M) algorithm to adjust the network parameters and return to Step 3. Repeat the network output calculation until the error falls within the given threshold value.
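Steps 3 through 5 form a loop that can be sketched as below. Note that plain gradient descent stands in here for the L-M update used in the paper, and all names are illustrative assumptions.

```python
def train_wnn(data, params, error_fn, grad_fn, threshold=0.01, lr=0.01, max_iter=1000):
    """Training loop of Steps 3-5. The paper adjusts parameters with the
    Levenberg-Marquardt algorithm; a plain gradient-descent update is used
    here as a simplified stand-in."""
    for _ in range(max_iter):
        err = error_fn(data, params)      # Step 4: error function E
        if err <= threshold:              # Step 5: error within threshold?
            break
        grads = grad_fn(data, params)     # otherwise adjust the parameters
        params = [p - lr * g for p, g in zip(params, grads)]
    return params
```

For example, minimizing a simple quadratic error drives the parameter toward its optimum until the threshold test stops the loop.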

WNN forecasting process
Our model uses each stock's 100 closing prices to forecast the following 10 days' prices. This research used 10% of the training result data for calculation: the last 10% of the w, t, and d training results produced. With these parameters, the stock prices x_101 through x_110, i.e. the following 10 days, were forecasted.

Results and discussions
In Y. Oussar's research, although the initial parameters change with stock prices taken from different time intervals, the parameters w, t, and d remain the same for every data point within the same set of training data. This is not very logical, because it is unlikely that all data points share the same best initial values. One consequence is the infinite loop that can occur when using the backpropagation algorithm during training. Our assumption was therefore that if more suitable initial parameters could be found for each data point, the problem would be resolved. The resolution we found was PSO: before the initial parameters are assigned for the WNN training process, PSO analyzes each data point and optimizes the corresponding three initial parameters. When this is done, not only does the training speed increase, but the RMSE becomes smaller and the forecast accuracy improves.
In Table 1, we compared our algorithm, which uses PSO optimization, with Y. Oussar's algorithm. The comparison method was to run the program 100 times and record each instance in which the RMSE was less than the original RMSE. When the RMSE was smaller in more than 65 of the 100 runs, we considered the WNN with PSO to perform better than Y. Oussar's algorithm. For example, when we used China Steel Corporation (2002) as a test forecast, the RMSE was consistently below 0.9933, so it was considered better performing. After running the program with B=C=2, we found 28 companies whose RMSE was smaller in more than 65 runs. An even smaller parameter range, B=C=1.8, yielded 30 companies, raising the success rate from 68% to 73%, as shown in the last row of Table 1.
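The counting rule used in this comparison can be sketched as follows; the function and variable names are illustrative, not from the paper.

```python
def better_than_baseline(rmse_runs, baseline_rmse, required_wins=65):
    """A company counts as 'better performing' when the PSO-WNN RMSE beats
    the baseline RMSE in more than `required_wins` of the recorded runs."""
    wins = sum(1 for r in rmse_runs if r < baseline_rmse)
    return wins > required_wins
```

A company with, say, 70 winning runs out of 100 passes the test, while one with 60 does not.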
This algorithm can be used to forecast not only Taiwanese stocks but also other nations' stocks, commodities, and other financial instruments. For testing, we randomly selected three gold index companies: Barrick Gold Corp (ABX), Agnico-Eagle Mines Ltd (AEU), and Anglogold Ashanti Ltd (AU). The data covered March 6, 2014 through August 11, 2014. Table 3 shows a success rate of 67%. Then, three large American corporations were randomly chosen: IBM, Facebook, and Apple, with data from February 25, 2014 through July 31, 2014 (Table 4). For both algorithms, there were no problems running the IBM and Facebook data. We also identified two major limitations of the algorithm. First, because this model was built for higher adaptability, its accuracy rate is lower. Compared with Ayodele Ariyo Adebiyi's model [15], which was designed specifically to forecast Dell stock, our model was less accurate over a ten-day interval. In ANN vs. WNN, our model had five lower error values; in ARIMA vs. WNN, only four. However, if Adebiyi's model were used to forecast the stock price of any other company, the algorithm or the model itself might need a complete redesign. Since their model was targeted while ours was not, this difference in error values was acceptable and expected (Table 5). The other difficulty is that the range for the PSO parameter is difficult to choose. When the range is too small, the optimal solution may be excluded; when it is too large, more time is wasted searching for the optimal solution. Finding a better method of choosing a proper range is therefore an area that needs further study.
Another limitation of our research was choosing the time interval for the training data. Dates on which significant events dramatically affected the stock or the stock market must be avoided. Natural disasters, stock market crashes, or other significant events such as the Apple stock split will cause the model to fail, and we did not have the manpower to search for all possible events during the 100 days in order to avoid certain time intervals.

Conclusion
Our experimental results indicated that using a smaller range for the translation and dilation parameters improved forecasting performance. In some older methods of WNN operation [10,11,13], the network threshold parameters were set as constants, and a proper forecast was obtained by adjusting the threshold and the training data percentage. Our WNN adds PSO and resolves this somewhat illogical approach. The initial network values of every training data point can be configured by PSO in coordination with the WNN, which means the initial values are selected "automatically" instead of being constants, preventing infinite loops during calculation. The threshold and training data percentage can be held constant from the user's perspective, because PSO provides the self-adjustment internally. Given these two advantages, our model is easier to operate than the older WNN models, because end users do not have to set any parameters; the necessary threshold and training data parameters are chosen for them. If an end user wants to know the following 10 days' stock prices, the only inputs needed are the company name and symbol, and the forecast prices will be presented for consideration. In the future, we hope to improve our results by revising the existing training methods or adding additional steps.

Disclosure
The authors declare that there is no conflict of interests regarding the publication of this article.