Research on Financial Market Risks Based on VaR Model

With the continuous development of the Internet finance industry, its impact on people's daily lives is growing. Under the influence of information technology and financial innovation, the volatility of the financial market has increased significantly, and financial risk has become increasingly serious. The COVID-19 pandemic that began in 2020 still has a profound impact on the global financial market. Reasonable measurement of financial risk is the most important step in financial risk management. Based on a sample of the Standard & Poor's 500 Index from January 3, 2013 to June 30, 2020, this paper measures risk by calculating the value at risk (VaR) of the sample data with different methods. It establishes an ARMA model, a GARCH model based on Bayesian statistics, and a POT model based on extreme value theory, and compares the VaR values obtained by the three models. Finally, the effectiveness of VaR is verified by backtesting, and the advantages and disadvantages of the methods are compared and evaluated.


Introduction
Since the beginning of the 21st century, human beings have entered the era of the Internet economy. Governments around the world have relaxed regulation of the financial industry, and the degree of financial liberalization has increased. Under the influence of information technology and financial innovation, the volatility of the financial market has increased significantly and the stability of the financial system has declined. As a result, the financial risks facing financial institutions, industrial and commercial enterprises, households, and even countries have become increasingly serious. Financial risks not only affect the normal operation of financial institutions and enterprises, but also pose a serious threat to the financial and economic stability of countries and even the world. The frequent occurrence of financial crises has brought a series of serious consequences.
Since 2020, COVID-19 has swept the world, the world economic situation has changed profoundly, and the fluctuation of the global financial market has intensified. Facing the impact of the epidemic, major economies launched a new round of monetary easing. At present, interest rates in the United States, the United Kingdom, Canada and Australia are close to zero, and the total economic output of regions with negative interest rate policies, such as the euro zone and Japan, exceeds 20% of global GDP. The extremely loose monetary environment has changed investors' expectations and intensified financial market volatility. Since 2020, the US dollar index has fluctuated significantly and commodity prices have undergone deep adjustments. The global debt stock has exceeded $260 trillion, a record high, and the stock markets of major economies have fluctuated at high levels that seriously deviate from the weak performance of the real economy. In view of the wide and far-reaching influence of financial risks, financial circles, business circles, government authorities and international financial organizations all over the world pay close attention to them. Countries and major financial institutions have also issued new policies that put forward strict requirements for the risk management of financial institutions.
The movement of financial markets produces a large amount of data. How to obtain from these data a quantitative index that reasonably measures market risk is an important part of financial risk management. Underestimating risk can cause huge problems (such as bankruptcy) for banks and other parts of the financial markets. On the other hand, overestimating risk may make investors too sensitive to risk and miss investment opportunities, thus reducing returns. Therefore, it is very important to estimate risk correctly. Through the continuous research of many scholars, many methods for measuring financial risk now exist. Among them, the Value at Risk (VaR) method is very popular because it is intuitive, concise and scientific: it can integrate the market risks of different trading and business departments into a single number, so it is easy to understand and manage. The VaR method can not only clearly describe the size of market risk, but is also supported by rigorous probability and statistics theory. At present, it has become the most widely used and important method in modern quantitative risk management. VaR is an analytical measure with several advantages. It is easy to estimate and easy to understand for non-economics professionals. Ten-day (two-week) VaR estimates at the 99% quantile are accepted by the Basel Committee for the internal models of regulated banks (Basel Committee on Banking Supervision, 1995). This stems from the concern that a financial institution might be unable to liquidate its positions during a two-week liquidity crisis. Jorion (1996) gave the most authoritative definition of VaR: "VaR measures the possible or potential loss of a financial asset or portfolio under future asset price fluctuations.
Under normal market conditions and at a given confidence level, it is the worst expected loss of a financial asset or portfolio over a future holding period." The mathematical expression of VaR is as follows: at confidence level 1 - α,

P(ΔP ≤ -VaR) = α,

where ΔP is the change in value of the asset or portfolio over the holding period.

The definition of VaR
As a measure of risk, the calculation of VaR depends on the time interval t, the confidence level 1 - α, and the density function f(x) of the expected return rate.
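For intuition, the VaR at confidence level 1 - α is just the α-quantile of the return distribution, reported as a positive loss. A minimal sketch on simulated returns (the function name and the numbers are ours, not the paper's):

```python
import numpy as np

def historical_var(returns, alpha=0.05):
    """Empirical VaR: the alpha-quantile of the return distribution,
    reported as a positive loss figure."""
    return -np.quantile(returns, alpha)

# simulated daily returns standing in for real data
rng = np.random.default_rng(0)
returns = rng.normal(0.0005, 0.01, 2000)
var95 = historical_var(returns, alpha=0.05)   # 95% one-day VaR
var99 = historical_var(returns, alpha=0.01)   # 99% one-day VaR
```

A smaller α (higher confidence) moves the quantile further into the loss tail, so the 99% VaR always exceeds the 95% VaR.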

VaR research methods
Before calculating VaR, we must have a certain understanding of financial market data; only on this basis can we correctly model and measure VaR. Since the research of Mandelbrot (1963) and Fama (1965), it has been found that financial market data have the following characteristics:
(1) The distribution of financial returns is peaked and heavy-tailed: compared with the normal distribution, it has thicker tails and a higher peak.
(2) Returns typically show negative skewness.
(3) Squared returns show typical autocorrelation; that is, the volatility of market factors tends to cluster.
All the traditional VaR models are based on these empirical laws. Theoretically, VaR research methods fall into three categories (McNeil and Frey, 2000, p. 272):
Parametric: modeling and calculation based on the full sample data, such as RiskMetrics and GARCH models.
Semi-parametric: modeling and calculation based mainly on the tail of the sample, such as extreme value theory and CAViaR.
Nonparametric: simulation-based calculation, such as historical simulation and Monte Carlo simulation.

Features of this paper
This paper selects the daily closing price of the Standard & Poor's 500 Index from January 3, 2013 to June 30, 2020 as the sample for statistical modeling and analysis. The sample is relatively large, which avoids the impact of an insufficient sample size on model estimation. Secondly, this paper selects VaR as the index to measure financial risk, which has a corresponding theoretical basis as support; at the same time, the definition of the VaR index is popular and can be accepted by most people. In addition, this paper is not limited to one method, but involves two classes of methods, parametric and semi-parametric, which cover a wide range. For the estimation of the parametric models, this paper not only uses the common maximum likelihood estimation method, but also uses Bayesian statistical methods that make full use of the information in the data to estimate the model parameters. First, an ARMA(1, 1) model is established to find a functional model that can adequately describe the rate of return. Then, based on the ARMA(1, 1) model and considering the conditional heteroscedasticity of the data, a GARCH(1, 1) model is established. For the semi-parametric model, the POT model from extreme value theory is selected, and the sample is studied by analyzing the tail data exceeding a threshold. Finally, the VaR values calculated by the three models are compared and analyzed, reasonable explanations are given, and the results are further discussed. The calculated VaR values are also backtested to verify their accuracy.

Data preprocessing
Since China's stock market has only about 20 years of history, regularities in its market data are not easy to find. Therefore, this paper selects 1825 observations of the daily closing price of the Standard & Poor's 500 Index from January 3, 2013 to June 30, 2020 as the sample for analysis.
The Standard & Poor's 500 Index, an index of large freely traded stocks, was launched in 1957. The stocks included in the S & P 500 Index are shares of large public companies traded on the New York Stock Exchange and Nasdaq, the largest stock exchanges in the United States. The index focuses mainly on companies in the United States, but it also includes some top companies or multinational companies from other countries.
The S & P 500 Index is the most widely referenced stock index after the Dow Jones Industrial Average (DJIA). It is considered to be the leader of the US economy and is included in the leading economic indicators index. Many mutual funds, exchange traded funds and other funds, such as pension funds, aim to follow the trajectory of the S & P 500 Index. Hundreds of billions of dollars have been invested in this way.
McGraw Hill called the index the best of the many indicators owned and maintained by S & P. The Standard & Poor's 500 Index refers both to the index itself and to the common stock of the 500 companies it contains. The S & P 500 Index is quoted under several ticker symbols, for example "GSPC," "INX," and "$SPX". Stocks covered by the S & P 500 Index are also included in the S & P 1500 and S & P Global 1200 stock indexes.
First of all, we plot the closing price of the index. Figure 1 shows the daily closing price of the Standard & Poor's 500 Index from January 3, 2013 to June 30, 2020. From the chart we can see a rapid decline at a certain stage, which falls in 2020; during this period the whole world was in a financial crisis, and the daily closing price of the index clearly reflects this change. Figure 1 also shows that the price series is not stationary, and what this paper studies is the return rate of the index. Therefore, it is necessary to convert the sample data into a stationary time series. The log return at time t is defined as

r_t = ln(P_t) - ln(P_{t-1}),

where P_t is the closing price on day t. It can be seen from Figure 2 that the transformed return series conforms to a stationary stochastic process. A preliminary descriptive statistical analysis of the data gives Table 1. From the skewness value in Table 1, the return data is clearly left-skewed, and the kurtosis is clearly greater than 3, so the data has obvious peakedness, consistent with the peaked, heavy-tailed character of financial data.
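The preprocessing steps above (log returns, then skewness and kurtosis) can be sketched as follows; the price path is simulated and merely stands in for the S & P 500 closing prices:

```python
import numpy as np
from scipy import stats

def log_returns(prices):
    """Log return series: r_t = ln(P_t) - ln(P_{t-1})."""
    p = np.asarray(prices, dtype=float)
    return np.diff(np.log(p))

# toy geometric random-walk price path standing in for real closing prices
rng = np.random.default_rng(1)
prices = 100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, 1000)))

r = log_returns(prices)
skew = stats.skew(r)
kurt = stats.kurtosis(r, fisher=False)  # equals 3 for a normal distribution
```

On the real index data, the paper reports negative skewness and kurtosis above 3; this normal toy sample naturally shows neither feature, which is exactly the contrast that motivates the heavy-tailed models below.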

Model 1: ARMA model

Model concept and principle
Financial time series data generally have autocorrelation and lag effects, so the ARMA model is often used for modeling and analysis. The ARMA model is sometimes called the Box-Jenkins model. Given a time series X_t, the ARMA model is used to understand or predict the future values of the series. The ARMA model consists of two parts: autoregression (AR) and moving average (MA). The model is usually called the ARMA(p, q) model, where p is the lag order of the autoregressive part and q is the lag order of the moving average part. AR(p) stands for the autoregressive model with p-order lag. The AR(p) model is written as:

X_t = c + φ_1 X_{t-1} + ... + φ_p X_{t-p} + ε_t

where ε_t is a white noise error term. Similarly, MA(q) is the moving average model of order q. Therefore, the ARMA(p, q) model is:

X_t = c + φ_1 X_{t-1} + ... + φ_p X_{t-p} + ε_t + θ_1 ε_{t-1} + ... + θ_q ε_{t-q}

Establishment of the model
The most important step in establishing an ARMA(p, q) model is to select the lag orders p and q. First we draw the autocorrelation function (ACF) plot and the partial autocorrelation function (PACF) plot. It can be seen from Figure 3 that the low-order autocorrelations and partial autocorrelations of the data are significant, so an ARMA(1, 1) model can be preliminarily selected. The residuals of the fitted model are then tested, with the result:

Box-Ljung test
data: model$residuals
X-squared = 12.1953, df = 8, p-value = 0.1427

The null hypothesis cannot be rejected, which means the residuals conform to white noise.
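The Box-Ljung statistic reported above can be computed directly. A self-contained sketch (our own implementation, checked on simulated white noise and on a strongly autocorrelated series):

```python
import numpy as np
from scipy import stats

def ljung_box(x, lags=8):
    """Box-Ljung statistic Q = n(n+2) * sum_{k=1}^{m} rho_k^2 / (n-k),
    compared against a chi-square distribution with m degrees of freedom."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    denom = np.dot(x, x)
    q = 0.0
    for k in range(1, lags + 1):
        rho_k = np.dot(x[:-k], x[k:]) / denom   # lag-k sample autocorrelation
        q += rho_k ** 2 / (n - k)
    q *= n * (n + 2)
    return q, stats.chi2.sf(q, df=lags)

rng = np.random.default_rng(0)
white = rng.normal(size=1500)            # should look like white noise
q_wn, p_wn = ljung_box(white)

ar = np.empty(1500)                      # strongly autocorrelated AR(1) series
ar[0] = 0.0
for t in range(1, 1500):
    ar[t] = 0.9 * ar[t - 1] + rng.normal()
q_ar, p_ar = ljung_box(ar)
```

A large p-value (as for the paper's residuals, p = 0.1427) means the white-noise hypothesis is not rejected; the AR(1) series yields a tiny p-value and is rejected.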

Model 2: GARCH (1, 1) model based on Bayesian estimation

Model concept and principle
The traditional ARMA(p, q) model assumes that the fluctuation amplitude (variance) of the time series is fixed, but this does not accord with reality, which makes traditional time series analysis ineffective for practical problems. Robert Engle (1982) proposed the ARCH model, which successfully addressed the volatility of time series. In economics, the Auto-Regressive Conditional Heteroscedasticity (ARCH) model is used to analyze and simulate a time series whenever there is reason to believe that its variance changes over time. The ARCH model assumes that the variance of the current error term (innovation) is a function of the error terms in previous periods; the variance is usually related to the squares of the previous innovations.
Let us describe a time series following the ARCH process. The random variable z_t is a strong white noise process, and the model for the conditional variance h_t is:

ε_t = z_t √h_t,  h_t = α_0 + α_1 ε²_{t-1} + ... + α_q ε²_{t-q},

where α_0 > 0 and α_i ≥ 0 for i = 1, ..., q. The ARCH(q) model can be estimated by the least squares method. If the conditional variance itself is given an autoregressive moving average (ARMA) structure, the model becomes the generalized autoregressive conditional heteroscedasticity model, namely the GARCH model (Bollerslev, 1986).
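The ARCH(q) process just described can be simulated directly for q = 1; a sketch with illustrative parameter values of our own choosing:

```python
import numpy as np

def simulate_arch1(n, a0=0.2, a1=0.3, seed=0):
    """Simulate eps_t = z_t * sqrt(h_t) with h_t = a0 + a1 * eps_{t-1}^2
    and z_t standard normal white noise (requires a0 > 0, 0 <= a1 < 1)."""
    rng = np.random.default_rng(seed)
    eps = np.zeros(n)
    h = np.zeros(n)
    h[0] = a0 / (1.0 - a1)               # start at the unconditional variance
    eps[0] = rng.normal() * np.sqrt(h[0])
    for t in range(1, n):
        h[t] = a0 + a1 * eps[t - 1] ** 2
        eps[t] = rng.normal() * np.sqrt(h[t])
    return eps, h

eps, h = simulate_arch1(5000)
m2 = np.mean(eps ** 2)
m4 = np.mean(eps ** 4)
sample_kurtosis = m4 / m2 ** 2           # > 3 signals heavy tails
```

Even though each z_t is normal, the simulated ε_t series is heavy-tailed and shows volatility clustering, which is exactly why ARCH-type models suit financial returns.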
The GARCH(p, q) model can be expressed as:

h_t = α_0 + Σ_{i=1}^{q} α_i ε²_{t-i} + Σ_{j=1}^{p} β_j h_{t-j}.   (10)

Because the variance equation involves lagged residual terms, parameter estimation is not straightforward. For the determination of the GARCH model parameters, this paper chooses Gibbs sampling based on Bayesian statistics.
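For p = q = 1 the GARCH recursion reduces to h_t = α_0 + α_1 ε²_{t-1} + β_1 h_{t-1}. A minimal sketch of this variance filter (the residual series and parameter values are illustrative, not the paper's estimates):

```python
import numpy as np

def garch11_variance(eps, a0, a1, b1):
    """GARCH(1,1) conditional variance recursion:
    h_t = a0 + a1 * eps_{t-1}^2 + b1 * h_{t-1},
    started at the unconditional variance a0 / (1 - a1 - b1)."""
    n = len(eps)
    h = np.empty(n)
    h[0] = a0 / (1.0 - a1 - b1)
    for t in range(1, n):
        h[t] = a0 + a1 * eps[t - 1] ** 2 + b1 * h[t - 1]
    return h

rng = np.random.default_rng(0)
eps = rng.normal(0.0, 0.01, 1000)        # toy residual series
h = garch11_variance(eps, a0=5e-6, a1=0.08, b1=0.9)
```

Positivity of h is guaranteed by α_0 > 0, α_1, β_1 ≥ 0, and stationarity requires α_1 + β_1 < 1; the square root of the last h value is the one-step volatility forecast used later for VaR.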
Bayesian statistics is a systematic statistical inference method named after Thomas Bayes (1763). Later generations extended and developed Bayesian statistics on his basis, eventually forming an influential school, namely the Bayesian school. Different from classical statistics, Bayesian statistics has its own systematic theoretical framework.
The basic view of the Bayesian school is that in any statistical inference problem about a parameter θ, in addition to the information provided by the sample, prior information about θ should also be used. From the Q-Q plot of the data, we can see that the data does not conform to the normal distribution and has obvious peaked, heavy-tailed characteristics, so it is not suitable to fit the data with a GARCH model under the normal distribution. Considering that the t distribution is a common heavy-tailed distribution, a GARCH model with t-distributed innovations is used for fitting. The specific model is as follows:

Establishment of the model
r_t = μ + ε_t,  ε_t = σ_t z_t,  z_t ~ t_v,
σ²_t = α_0 + α_1 ε²_{t-1} + β_1 σ²_{t-1}.   (12)

As for the prior distributions of the parameters, the degrees of freedom v are given the prior π(v) ∝ 1/(1 + v)², and the other parameters are given non-informative uniform priors on [0, 1].
Let the observed values be Y = (y_1, y_2, ..., y_n); the likelihood function of the parameters is then the product of the conditional t densities, and the full conditional distribution of each parameter given the others follows from this likelihood and the priors. For the data in this paper, the simplest GARCH(1, 1) model is selected for modeling, and its parameters are estimated by Gibbs sampling based on Bayesian statistics.
In the simulation, 1000 burn-in Gibbs iterations are performed to ensure convergence of the parameters; these pre-iterations are then discarded and 10000 further iterations are performed. This paper implements the Gibbs sampling through the OpenBUGS program.
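OpenBUGS constructs the required conditional distributions automatically. As an illustrative stand-in (not the paper's sampler, priors, or data), a random-walk Metropolis chain targeting the GARCH(1, 1)-t likelihood with flat priors on the stationarity region can be sketched as:

```python
import numpy as np
from scipy import stats

def log_post(params, eps):
    """Log posterior of a GARCH(1,1)-t model under flat priors restricted
    to the stationarity region (a hypothetical stand-in for the paper's priors)."""
    a0, a1, b1, v = params
    if a0 <= 0 or a1 < 0 or b1 < 0 or a1 + b1 >= 1 or v <= 2.1:
        return -np.inf                       # outside the admissible region
    n = len(eps)
    h = np.empty(n)
    h[0] = a0 / (1.0 - a1 - b1)              # unconditional variance start
    for t in range(1, n):
        h[t] = a0 + a1 * eps[t - 1] ** 2 + b1 * h[t - 1]
    scale = np.sqrt(h * (v - 2.0) / v)       # so that Var(eps_t | h_t) = h_t
    return np.sum(stats.t.logpdf(eps, df=v, scale=scale))

def rw_metropolis(eps, start, n_iter=500, seed=0):
    """Random-walk Metropolis chain over (a0, a1, b1, v)."""
    rng = np.random.default_rng(seed)
    step = np.array([2e-5, 0.02, 0.02, 0.5])  # per-parameter proposal scales
    x = np.array(start, dtype=float)
    lp = log_post(x, eps)
    chain = np.empty((n_iter, 4))
    for i in range(n_iter):
        prop = x + rng.normal(0.0, step)
        lp_prop = log_post(prop, eps)
        if np.log(rng.uniform()) < lp_prop - lp:  # accept/reject step
            x, lp = prop, lp_prop
        chain[i] = x
    return chain

rng = np.random.default_rng(1)
eps = rng.standard_t(df=8, size=400) * 0.01     # toy return innovations
chain = rw_metropolis(eps, start=(5e-5, 0.1, 0.8, 8.0))
```

Every retained draw lies in the admissible region, so posterior means and credible intervals (like those in Table 2) can be read off the chain after burn-in.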
The sampling results are shown in Table 2, from which the fitted model is obtained. From the results of Table 2, it can be found that the parameters of each variable of the model are significant at the 95% confidence level.

Model 3: POT model based on the extreme value theory

Model concept and principle
Extreme value theory, or extreme value analysis (EVA), is a branch of statistics that deals with extreme deviations from the median of probability distributions. It seeks to assess, from an ordered sample of a given random variable, the probability of events more extreme than any previously observed. Extreme value theory is widely used in many fields, including structural engineering, finance, geography, traffic prediction and geological engineering.
In this paper, we mainly use the Peaks over Threshold (POT) method to analyze the returns. Let X_1, X_2, ..., X_n be an independent and identically distributed sample from the asset return population X with distribution function F(x), and select a threshold u. If X > u, we call this an exceedance, and Y = X - u is the excess over the threshold. By the conditional probability formula, the distribution function of Y is:

F_u(y) = P(X - u ≤ y | X > u) = (F(u + y) - F(u)) / (1 - F(u)).

Thus, we can write

F(x) = (1 - F(u)) F_u(y) + F(u),  where x = u + y.

The key to solving for F(x) is the expression for F_u(y). According to the Pickands–Balkema–de Haan theorem, for a sufficiently large threshold u, the excess distribution F_u(y) can be approximated by the generalized Pareto distribution (GPD), whose expression is:

G_{ξ,β}(y) = 1 - (1 + ξy/β)^(-1/ξ),  ξ ≠ 0,

with G_{ξ,β}(y) = 1 - exp(-y/β) for ξ = 0, where β > 0 is the scale parameter and ξ the shape parameter.
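The POT procedure (choose u, fit a GPD to the excesses, invert the tail estimate for a quantile) can be sketched with SciPy on a simulated heavy-tailed loss sample; the sample, threshold choice, and confidence levels here are illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
# heavy-tailed loss sample standing in for negated index returns
losses = 0.01 * stats.t.rvs(df=4, size=10000, random_state=rng)

u = np.quantile(losses, 0.95)                  # threshold: 95th loss percentile
excesses = losses[losses > u] - u
xi, _, beta = stats.genpareto.fit(excesses, floc=0.0)  # GPD fit to excesses

def pot_var(q, u, xi, beta, n, n_u):
    """POT quantile estimator (n observations, n_u exceedances):
    VaR_q = u + (beta/xi) * (((n / n_u) * (1 - q)) ** (-xi) - 1)."""
    return u + (beta / xi) * (((n / n_u) * (1.0 - q)) ** (-xi) - 1.0)

var99 = pot_var(0.99, u, xi, beta, len(losses), len(excesses))
```

Because the GPD is fitted only to the tail, the resulting quantile estimate extrapolates beyond the threshold and should sit close to the empirical 99% loss quantile here.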

Establishment of the model
In order to establish the POT model, we first need to determine the threshold. To make this determination more scientific, we use the mean residual life plot in combination with the Hill plot. It can be seen from Figure 7 that the fitting result of the model is very good.

Calculation of Value at Risk (VaR) under each model

Calculation of VaR
Recall the definition of VaR. Let F be the cumulative distribution function of future returns; then at confidence level 1 - α,

VaR = -F^(-1)(α).

Therefore, VaR is essentially a quantile of the future return distribution.

Parameter method
The parameter method estimates the parameters of a distribution on the assumption that future returns satisfy that distribution (usually the normal distribution), determines the whole distribution, and then calculates its quantile. It is reasonable to assume that future returns satisfy the normal distribution because: (1) the short-term behavior of stock returns can be approximated by a joint normal distribution; (2) most assets can be expressed as a linear combination of risk factors; (3) a linear combination of any number of normal distributions is still normal, so the expected return distribution of a linear portfolio is still normal, and its distribution parameters are unique and fixed.
Taking Model 2 as an example, the parameter estimates are obtained from the GARCH model, and the conditional variance of the model gives the square of the stock's ex-ante volatility. According to the obtained volatility, the corresponding VaR is calculated: for example, at the 95% confidence level, VaR is the normal 5% quantile multiplied by the estimated volatility.
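The normal-quantile calculation just described can be sketched as follows (the volatility and mean values are illustrative, not the paper's estimates):

```python
import numpy as np
from scipy import stats

def parametric_var(sigma, mu=0.0, alpha=0.05):
    """Normal VaR at confidence 1 - alpha, reported as a positive loss:
    VaR = -(mu + z_alpha * sigma), where z_alpha is the alpha-quantile
    of the standard normal distribution."""
    return -(mu + stats.norm.ppf(alpha) * sigma)

# e.g. a one-day forecast volatility of 1.2% and a mean return of 0.03%
var95 = parametric_var(sigma=0.012, mu=0.0003, alpha=0.05)
var99 = parametric_var(sigma=0.012, mu=0.0003, alpha=0.01)
```

In a GARCH setting, sigma would be the one-step conditional volatility forecast; with z_0.05 ≈ -1.645 the 95% VaR here is about 1.9% of portfolio value.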
Therefore, according to the above three models, we obtain three VaR estimates. It can be seen from Table 3 that the VaR values calculated by the different estimation methods differ considerably. According to the sample data, we can conclude that the VaR coverage of the results obtained with the linear time series model is wider and reflects the risks of the sample better.

VaR backtracking test
If the estimated value of an asset does not fully take into account all the fluctuations affecting its valuation, the risk will be underestimated. In day trading, the simulated profit and loss against VaR may pass the backtest while the actual profit and loss does not. This is because the calculation of VaR is usually based on daily closing prices and does not describe the volatility generated during intraday trading. In this case, in order to make VaR reflect the risks of the asset's profit and loss more accurately, a shorter time interval may be needed; alternatively, VaR may not be the most suitable risk description, and other indicators should be used for risk measurement. The basic method of the backtest is to take VaR as a specific quantile of profit and loss, call a loss exceeding VaR a "breakthrough" event, and test whether the breakthroughs are independent events occurring with probability 5%.
In T samples, N is defined as the number of VaR breakthrough events, and N/T is the breakthrough rate. In the ideal case, N/T should tend to p = 0.05 as the sample size increases. According to the requirements of the Basel framework (BIS), the sample should cover at least one year of data to make the test scientific. The Basel framework has clear requirements for backtesting: it considers only 99% daily VaR with a one-year sample. The average number of breakthroughs in a year is then about 2.5, and the Basel Committee accepts up to four breakthroughs. With five or more breakthroughs, the financial institution falls into the yellow-light or red-light zone, which requires it to hold more capital, reflected in an increased multiplication factor. Kupiec (1995) constructed a statistic to test the breakthrough rate of VaR:

LR = -2 ln[(1 - p)^(T-N) p^N] + 2 ln[(1 - N/T)^(T-N) (N/T)^N].

When T is large, the statistic follows a χ² distribution with 1 degree of freedom, so the corresponding hypothesis test can be carried out: if LR > 3.84, the null hypothesis is rejected at the 5% level. From the data in Table 5 and Table 6, it can be seen that the risk coverage of the parametric models passes the backtest, but the POT model based on extreme value theory has a very low coverage rate for VaR.
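Kupiec's (1995) likelihood-ratio test can be implemented directly; a sketch with illustrative breakthrough counts of our choosing:

```python
import numpy as np
from scipy import stats

def kupiec_lr(n_breaks, n_obs, p=0.05):
    """Kupiec proportion-of-failures test:
    LR = -2 ln[(1-p)^{T-N} p^N] + 2 ln[(1-N/T)^{T-N} (N/T)^N],
    asymptotically chi-square with 1 degree of freedom (requires 0 < N < T)."""
    N, T = n_breaks, n_obs
    phat = N / T
    ll_null = (T - N) * np.log(1.0 - p) + N * np.log(p)       # H0: rate = p
    ll_alt = (T - N) * np.log(1.0 - phat) + N * np.log(phat)  # MLE rate
    lr = -2.0 * (ll_null - ll_alt)
    return lr, stats.chi2.sf(lr, df=1)

# 9 breakthroughs in 250 trading days is consistent with p = 0.05 ...
lr_ok, p_ok = kupiec_lr(9, 250, p=0.05)
# ... while 25 breakthroughs in 250 days is not
lr_bad, p_bad = kupiec_lr(25, 250, p=0.05)
```

The first case stays below the 3.84 critical value (null not rejected); the second exceeds it, so the VaR model would be rejected as underestimating risk.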

Result analysis
According to the results in Table 2 and Table 4, we find that under the parametric models the S & P 500 Index in 2019 is in the yellow-light zone defined by the Basel framework, and under the POT model it is in the red-light zone. It can be seen that the VaR values obtained by the models clearly underestimate the risk. There are three conjectures: (1) The models are based on ideal market conditions. For the rate of return, only its lag effects and random factors are considered. In reality, the financial industry is also affected by politics, industry, and commerce, and these influences are not quantifiable. Although they are all treated as random factors in the models, these factors are interrelated, so they cannot simply be ignored and should be considered comprehensively.
(2) Since 2017, the international financial market has been in an unstable pattern. Although some markets have shown vitality, the overall pattern of the financial market is still not optimistic. Most of the enterprises covered by the Standard & Poor's 500 Index are large United States enterprises. These enterprises were affected by the financial crisis of 2020, and some have not yet extricated themselves from the crisis, so their asset risks remain very large; the model estimates are consistent with this actual situation.
(3) Since the sample data used in the model is the daily trading closing price from January 3, 2013 to June 30, 2020, in which the abnormal fluctuation of financial market in 2020 is also used in the model calculation, it may have some impact on the estimation of the model.
Comparing the models established by the three methods, although the parametric models may better reflect the overall risk of the market and have higher coverage for VaR, the low R value obtained suggests that the parametric models may not fully describe the real situation of the data, while the POT model fits the tail data well. We conjecture that if we only study the overall risk of the market, the parametric models can be adopted, but if we study the tail behavior of the whole market, the POT model may be more appropriate.

Follow-up discussion
As an indicator for financial risk measurement, VaR also has some disadvantages. First, VaR cannot measure the size of any potential loss that exceeds the VaR level. This leads investment decision-making to optimize expected profit subject only to VaR not exceeding a certain threshold, even though the potential "rare case" losses beyond VaR may be extreme.
Second, VaR is not a coherent risk measure, as described by Artzner et al. (1999): in particular, it is not sub-additive and not continuous in general.
Another common risk measure is Expected Shortfall (ES). ES has obvious advantages over VaR: it takes into account the losses beyond the VaR level, and it is a sub-additive, continuous measure. Sub-additivity means that the ES of a portfolio cannot exceed the sum of the ES values of its sub-portfolios, so summing the individual ES values yields a conservative risk measure for the entire portfolio.
In addition, the incremental VaR, the marginal VaR and the component VaR can also be used to estimate and predict the financial risks.