Data assimilation experiments with MPIESM climate model

Further development of data assimilation technique and its application in numerical experiments with stateof-the art Max Plank Institute Earth System model have been carried out. In particularly, the stability problem of assimilation is posed and discussed In the experiments the sea surface height data from archive Archiving, Validating and Interpolating Satellite Ocean have been used. All computations have been realized on cluster system of German Climate Computing Center. The results of numerical experiments with and without assimilation were recorded and analyzed. A special attention has been focused on the Arctic zone. It is shown that there is a good coincidence of model tendencies and independent data.


Introduction
Data assimilation methodology and the numerical climate modeling are among the most popular research directions over last 20 years.One of the frequently used versions of the numerical climate models is the Max Plank Institute Earth System model (MPIESM), created in Max Plank Institute for Meteorology in Hamburg, Germany.This model version is realized on cluster system in German Climate Computing Center (DKRZ) and consists of several modules including ocean block MPIOM, atmosphere block ECHAM6 and others which are executed in parallel.Details of the MPIESM and its realization can be found in particularly in [1].
Actual climate modeling research also often combines with data assimilation approaches.Data assimilation methods allow correcting the model output by data and obtaining the new initial conditions for future better forecast.The examples of data assimilation methods and their applications to the climate and weather predictions can be found, for instance, in [2].An example of the MPIESM model in conjunction with data assimilation has been considered in [3].
Several issues are resolving in these types of research.First, the numerical model is turning on the real natural observations and eliminates the unrealistic and completely non-relevant scenarios which could be possible without assimilations.Secondly, the observed variables can correct not only the corresponding model components but also non observed model characteristics.Finally, the regions with badly data coverage can involve the observed information from more data richer domains through model connection and evaluate both local and global temporal and spatial variability.
These and others potential opportunities of correct combinations of models and assimilation methods require further developments all components which produce this progress.The model themselves, their numerical realization including modern parallel schemes and technologies, further information advance, data bases, data assimilation techniques and their mathematical realization lead to this progress.It is important to estimate the correction of data assimilation application for specific model and data base, and to prove that the resulted model fields are physically relevant.

Data assimilation method
In the current study the original data assimilation technique, developed earlier in [4,5] is applied.The method has been applied earlier in conjunction with ocean circulation model HYCOM in [5].In the current study the same technique is applied together with MPIESM.Along with the data assimilation block the entire model scheme is presented in Figure 1.Let the dynamic numerical model be integrated on the time interval [0,T] on the gridded spatial domain and X be the model state vector.Let the number of grid points be denoted as g N and the number of unknown model variables be mv N .Then the dimension of X will be g . Let Y be the observational vector, o N is the number of observations and each with ov N variables.Then the total dimension of the observational vector state is since not all model variables are observed.As it is usual in data assimilation theory two state vectors are introduced, namely the analysis and background states, , a b X X , respectively.They are linked by the equality ( ) where, matrix K is the so-called Kalman gain with dimension r×N while matrix H with dimension N×r is the observation operator that projects the model space into the observational space.
On time interval [0,T ] the discretization Model forecast can be written as a primitive for the function ȁ where t w w The assimilation problem may be formulated as follows: find out the gain matrix K which will give minimum of the variance state vector X by minimizing (in the sense of the matrix norm) the diffusion matrix c KQK under the conditions that the main part of the correction is given by ( ) I KH ȁ & .According to this formulation vector C is an r-dimensional vector given at all grid points and having certain value for each model variable.
In order to mathematically solve this optimization problem, the theory of conditional extremes will be applied.
The matrix Q is the symmetric and positively defined matrix with the dimension N×N as it follows from its definition.In the phase-space of rectangular matrices of the dimension r×N the scalar product is defined as Now the minimization problem of the functional is considered under the condition The matrix c KQ K is symmetric and non-negatively defined, its trace is equalled to the sum of its diagonal elements, and also, it is equalled to the sum of eigenvalues of the matrix c KQK .Function J(K) is the convex, quadratic and smooth function of variable K , its derivative is easily calculated.Hence, problem ( 5),( 6) may be solved with the help of classical Lagrangian Multipliers Method.
The Lagrange function is introduced as follows Set up it to zero we can find the matrix K giving the minimum of norm c __ __ KQK under conditions (6), as the solution of equations 1 0, 2 ( ) .c

Ȝ +ȁ
Equations ( 7) are matrix equations which are equivalent to (r+1)×N linear scalar equations containing the same number of unknown variables.Since the matrix Q is invertible this system of linear equations always has a unique solution.
The solution of system (7) can be obtained explicitly.It is the following Thus, the solution of the problem on the (n +1)th step is determined by formulas (2), where ) , P is the operator that interpolates the values of the vector of observed parameters for all spatial grid area.The projection operator H and the covariance matrix Q can be defined by its own way for each specific task.Description of the method has been completed.In the proposed method the calculation of the gain matrix takes into account the dynamics of the model alternatively to the classical Kalman filter [6].

Results of numerical simulation
The model experiments have been carried out as follows: first the spin up procedure during 250 years has been performed only for the ocean model MPIOM with climate NCEP forcing.Then entire ocean-landatmosphere coupled model was running for 2 years started from initial conditions gotten after spin up.This data is set up as 2000 year.After that the numerical experiments with data assimilation (analysis) according to aforementioned method and without assimilation (control) have been executed as twin experiments started from 2001.Assimilations are executed from 2008 until 2010 each 3 months.Their results were compared in quantity and in quality between themselves and with the satellite observations data specifically for Arctic region.
In the current work data from archive AVISO (Archiving, Validating and Interpolating Satellite Ocean) dataset has been utilized.These data were downloaded from site www.ifremer.ocean.fr.Data were preprocessed and passed the quality control.Badly recognized or wrongly located data were rejected.Ice concentration data were downloaded from the site http://nsidc.orgin Arctic zone (MASAM2: Daily 4-Km Arctic Sea Ice Concentration).Some examples on 2016 year of model run are presented below.In assimilation experiments the analysis has been calculated each 3 months during full integration period.The difference of Surface Snow Thickness (SSTH) values between analysis and control run is shown on Figure 2.This Figure shows that after assimilation the SSTH valuable decreases which confirms the observed tendency as well as the tendencies given from other models.This decreasing is not uniform, its maximum is observed in Canadian shore and East Siberian.More weak signals are seen in Bare Sea.In general this result fits well to other calculations.
Figure 3 shows the difference between the assimilation and control run relatively the Sea Ice Thickness (SIT) The ice coverage thickness and ice concentration are considered as an important characteristics of Arctic zone.The Figure 3 confirms the tendencies that SIT mostly decreases everywhere in Arctic except relatively narrow zones near Canada shores.On a contrary, in Antarctic the SIT grows everywhere.This result also occurs in a good match with observed data and tendencies given by other models.

Conclusions
The performed numerical experiments allow making a conclusion that the developed data assimilation method reduces the model error and leads to the results corresponding to the observed data.
multipliers.According to the theory of Lagrangian Multipliers Method matrix K minimizes(5) and Lagrangian multipliers satisfy the equation (