Resilience assessment of dynamic engineering systems

Resilience indicators are a convenient tool to assess the resilience of engineering systems. They are often used in preliminary designs or in the assessment of complex systems. This paper introduces a novel approach to assess the time-dependent resilience of engineering systems using resilience indicators. The temporal dimension is handled using the Dynamic Bayesian Network (DBN), which extends the classical Bayesian Network (BN) by adding the time dimension and permits interaction among variables at different time steps. A DBN can track the evolution of a system's performance given evidence recorded at a previous time step, which allows predicting the resilience state of a system given its initial condition. A probabilistic mathematical framework based on the DBN is developed to model the resilience of dynamic engineering systems. A case study is presented to demonstrate the applicability of the framework.


Introduction
Research on disaster resilience has recently been fostered by the noticeable increase in the number of natural and human-caused disasters. Resilience has been defined differently depending on the field of study [1][2][3]. In engineering, resilience is the ability to withstand a disturbance caused by an external agent and to recover quickly if damage occurs [4]. Resilience can be an outcome (static) or a process (dynamic) [5]. While most research has analyzed engineering resilience from a static point of view [6][7][8][9][10][11][12], there is a significant gap in assessing the dynamic nature of resilience through quantitative approaches. This paper proposes a dynamic framework to quantitatively assess the resilience of systems of a dynamic nature (e.g., critical infrastructures, buildings, and communities) [13]. The framework can be used to assess the resilience of multiple systems at once, and it adopts the DBN as an inference tool. A DBN model can be obtained from expert knowledge, from a database using a combination of machine-learning techniques, or both. These properties make the DBN formalism very useful in the disaster resilience domain, which has an abundance of both expert knowledge and database records. Moreover, a DBN allows performing a transient analysis of the system from the occurrence of the disruption until the system has recovered from its disrupted states. The transient analysis is useful for modeling the restoration process of the damaged system. The proposed resilience framework is presented as a mathematical formulation that integrates the probability distributions of all variables' states. A case study of a transportation network is used to demonstrate the proposed methodology. Results show the ability of the framework to dynamically model complex systems, even when data is scarce.

* Corresponding author: gianpaolo.cimellaro@polito.it

Dynamic Bayesian Networks
Conventional Bayesian Networks (BNs) are used when the analyzed system is in a static state. This is often not the case in a dynamic, continuously changing world, which raises the need for a tool capable of accounting for system changes, such as the Dynamic Bayesian Network. A DBN is a Bayesian network extended with additional mechanisms that are capable of modeling influences over time [14]; that is, it extends the classical BN by adding the time dimension. It is suitable for describing dynamic systems whose performance fluctuates (e.g., before and after a disaster). Like the BN, the DBN is a directed acyclic graphical model used for stochastic processes. A DBN consists of multiple BNs (often referred to as time-slices or time steps), each with its own variables. The variables within a single time-slice or in successive time-slices are connected using links. A DBN can be defined as a pair (B1, B→), where B1 is a BN that specifies the initial distribution of the variable states P(Z1) [14], and Zt = (Ut, Xt, Yt) collects the input, hidden, and output variables of the model at time step t. B→ is called a "two-slice temporal Bayesian network" (2TBN) and defines the transition model P(Zt|Zt-1), as in Equation (1). The nodes in the first slice of the 2TBN do not have parameters associated with them, while conditional probability tables (CPTs) are required for the nodes in the second slice.

$$P(Z_t \mid Z_{t-1}) = \prod_{i=1}^{N} P\left(Z_t^i \mid \mathrm{Pa}(Z_t^i)\right) \tag{1}$$

where $Z_t^i$ is the i-th node at time t, which can be a component of Xt, Yt, or Ut, N is the number of nodes in a time-slice, and $\mathrm{Pa}(Z_t^i)$ are the parents of $Z_t^i$, which can be in the same or the previous time-slice. The process in a DBN is stationary and the structure repeats after the second time-slice, so the variables for the slices t = 2, 3, ..., T remain unchanged. This allows expressing the system using only two slices (i.e., the first and the second time-slices). Therefore, a sequence of unbounded length can be modeled using a finite number of parameters. The probability distribution for a sequence of time-slices can be obtained by unrolling the 2TBN, as follows:

$$P(Z_{1:T}) = \prod_{t=1}^{T} \prod_{i=1}^{N} P\left(Z_t^i \mid \mathrm{Pa}(Z_t^i)\right) \tag{2}$$

The DBN is often seen as a generalization of other temporal reasoning models, such as the hidden Markov model (HMM) and the Kalman filter model (KFM) [15]. These models, which can be expressed in a compact form, are popular for their fast learning and inference techniques. DBNs generalize HMMs by representing the state space in a factored form rather than as a single discrete random variable.
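The unrolling of a 2TBN can be illustrated with a minimal Python sketch. It assumes a single two-state node per slice, and the prior and transition probabilities are hypothetical values chosen only for illustration; they are not taken from the paper. The same transition CPT is reused for every slice t >= 2, which is what allows a sequence of any length to be described by a fixed set of parameters.

```python
# Minimal sketch of unrolling a 2TBN with one node per slice.
# All numbers are hypothetical; states: 0 = "functional", 1 = "damaged".

PRIOR = [0.9, 0.1]            # B1: initial distribution P(Z_1)
TRANSITION = [
    [0.8, 0.2],               # P(Z_t | Z_{t-1} = functional)
    [0.3, 0.7],               # P(Z_t | Z_{t-1} = damaged)
]


def sequence_probability(states):
    """Joint probability of one state sequence z_1, ..., z_T."""
    prob = PRIOR[states[0]]
    for prev, cur in zip(states, states[1:]):
        # The same transition CPT is applied at every step (stationarity).
        prob *= TRANSITION[prev][cur]
    return prob


def marginal_at(t):
    """Marginal distribution P(Z_t) for a time step t >= 1."""
    dist = list(PRIOR)
    for _ in range(t - 1):
        dist = [
            sum(dist[i] * TRANSITION[i][j] for i in range(len(dist)))
            for j in range(len(dist))
        ]
    return dist


if __name__ == "__main__":
    print(sequence_probability([0, 0, 1]))
    print(marginal_at(2))
```

With several nodes per slice, the single factor per time step would be replaced by a product over the nodes, each conditioned on its own parents.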

Methodology: Time-dependent resilience analysis using Dynamic Bayesian Networks
In general, the resilience of a system tends to be a process rather than a state; thus, accounting for the performance variation of a system is important. Ordinary Bayesian Networks are unable to account for the time dimension in the analysis as they are limited to static systems. In this section, we propose a new methodology to assess the resilience of engineering systems in a dynamic manner.

Dynamic resilience model
The resilience model used in the dynamic resilience analysis is based on the resilience definition by Bruneau and Reinhorn [16], who describe the resilience of a system using four components, also called the four R's of resilience (4R's):
• Robustness (R1): the ability of a system to withstand a certain level of stress while preserving its functionality;
• Redundancy (R2): the alternative resources available in the recovery stage when the primary ones are inadequate;
• Rapidity (R3): the capacity to contain losses and avoid future disruption. It represents the slope of the functionality curve during the recovery phase;
• Resourcefulness (R4): the human factor and the capacity to move needed resources.
As shown in Fig 1, the first two resilience components (R1 and R2) define the damage level the system may encounter if exposed to a certain hazard. Robust and redundant systems would most likely experience less damage and function almost normally after the disaster. Once damage occurs, the system's recovery starts. The recovery process is governed by the recovery capacity and the availability of resources, such as human resources. Thus, the other two components (R3 and R4) come into play during the recovery stage, as they are the main drivers of the system's recovery.

Network structure and elements connectivity
Consider a system described by seven indicators (X1, X2, …, X7). The indicators are connected to the three resilience pillars according to their relevance. Such connections can be obtained from past experience or expert knowledge. One indicator can contribute to multiple pillars, as shown in the Bayesian network in Fig 2, where one indicator is connected to R1 and R2 while X7 is connected to R2 and R3. The final output (the resilience index) represents a combination of all factors that contribute to the resilience pillars.

Fig. 2. Bayesian network to compute the resilience index of a static system
A DBN is a series of Bayesian networks with changing conditions. One main characteristic of a DBN is that elements are connected across different time-steps. For example, element At can be linked to element Bt+1 using a temporal link if Bt+1 depends on At, where t is the time step. The connections between elements at different time steps are defined using expert knowledge or past data. Fig 3 shows a DBN where the individual networks at the different time steps are connected with one another. In our methodology, an element in a BN at time-step t can only affect itself at time-step t+1 (i.e., At affects At+1 and Bt affects Bt+1).
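The connectivity rule above, in which an element only affects its own copy at the next time-step, can be sketched as a small Python function that lists the edges of an unrolled network. The function name `unroll_dbn_edges`, the node names, and the single intra-slice link are hypothetical and serve only to illustrate the rule.

```python
def unroll_dbn_edges(nodes, intra_slice_edges, num_steps):
    """Return the directed edges of a DBN unrolled over num_steps time-steps.

    Nodes are represented as (name, t) pairs with t starting at 1.
    Temporal links only join a node to its own copy in the next time-step.
    """
    edges = []
    for t in range(1, num_steps + 1):
        # Links inside the Bayesian network of time-step t.
        edges.extend(((parent, t), (child, t)) for parent, child in intra_slice_edges)
        # Temporal links from time-step t to t + 1 (none after the last step).
        if t < num_steps:
            edges.extend(((name, t), (name, t + 1)) for name in nodes)
    return edges


# Hypothetical two-element network unrolled over three time-steps.
edges = unroll_dbn_edges(nodes=["A", "B"], intra_slice_edges=[("A", "B")], num_steps=3)
```

A more general DBN would also allow cross links such as At to Bt+1; restricting temporal links to self-links keeps the number of CPTs that must be elicited small.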
The four resilience components (4R's) are incorporated in the network at different time-steps. In Fig 3, the first step (t=1) corresponds to the initial state of the system (i.e., before the hazard occurs). At this stage, none of the 4R's is involved, as the aim is to assess the initial performance of the system. The second step (t=2) is dedicated to assessing the damage that would be incurred if a hazard of a certain magnitude occurs. The level of damage, or the drop in functionality, can be determined from information about the hazard (H) and the system's characteristics (i.e., R1 and R2). Together, the parameters H, R1, and R2 indicate how a system with a predefined initial state would behave. Thus, the two resilience components R1 and R2 are connected to the DBN at the second time-step (t=2).
Once the drop in the serviceability is determined, the recovery needs to be evaluated. Since recovery is not an instantaneous action, several Bayesian networks are needed here. The recovery period is divided into a finite number of time-steps, each with a Bayesian network. Information about the rapidity and the resourcefulness (R3 and R4) of the system is integrated at all recovery time-steps as they will define how the variables (i.e., the indicators) will evolve from one step to another. Therefore, the same Bayesian network is copied from time-step t=3 until time step t=T.
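The three phases described above (initial state at t=1, damage at t=2, recovery from t=3 to t=T) can be sketched in Python for a single indicator. This is a simplified illustration under stated assumptions, not the authors' implementation: the three states, their performance values, and both CPTs are hypothetical, and each CPT is assumed to be already conditioned on fixed evidence (H, R1, R2 for the damage step; R3, R4 for the recovery steps).

```python
# Hypothetical indicator with three states: 0 = low, 1 = medium, 2 = high.
STATE_VALUES = [0.0, 0.5, 1.0]    # assumed performance value of each state

# Assumed CPT for t=2, i.e. P(Z_2 | Z_1) for given evidence on H, R1, R2.
DAMAGE_CPT = [
    [1.0, 0.0, 0.0],
    [0.6, 0.4, 0.0],
    [0.3, 0.5, 0.2],
]

# Assumed CPT for t>=3, i.e. P(Z_t | Z_{t-1}) for given evidence on R3, R4.
RECOVERY_CPT = [
    [0.5, 0.4, 0.1],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]


def step(dist, cpt):
    """Propagate a state distribution through one temporal link."""
    n = len(dist)
    return [sum(dist[i] * cpt[i][j] for i in range(n)) for j in range(n)]


def expected_performance(dist):
    """Collapse a state distribution into a single performance point."""
    return sum(p * v for p, v in zip(dist, STATE_VALUES))


def resilience_function(initial, num_steps):
    """Performance points for t = 1, ..., num_steps (num_steps >= 2)."""
    dists = [list(initial)]                        # t = 1: initial state
    dists.append(step(dists[-1], DAMAGE_CPT))      # t = 2: damage
    for _ in range(3, num_steps + 1):              # t = 3..T: recovery
        dists.append(step(dists[-1], RECOVERY_CPT))
    return [expected_performance(d) for d in dists]


curve = resilience_function(initial=[0.0, 0.0, 1.0], num_steps=8)
```

Because the same recovery CPT is reused from t=3 onward, lengthening the recovery horizon only adds time-steps, not parameters.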
The result of each BN is a performance point. The collection of the performance points creates a resilience function that shows the changes in the system's performance, starting from a stable state (the first uniform part of the function in Fig 3) and ending with a stable state, when the system is fully recovered (the second uniform part of the function). Once obtained, the resilience function can be used to derive a resilience index. One method uses the area above the resilience curve and links it to the notion of "loss of resilience" [16][17], while other methods use different metrics to quantify resilience [18].
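As an illustration of the area-based metric, the following Python sketch computes the area above a resilience curve from discrete performance points. The trapezoidal rule, the target performance of 1.0, and the normalization used in `resilience_index` are assumptions made for this example; the cited references may define the quantities differently.

```python
def loss_of_resilience(times, performance, target=1.0):
    """Area between the target performance and the resilience curve.

    Uses the trapezoidal rule on the performance points (an assumption;
    any other numerical integration scheme could be used).
    """
    loss = 0.0
    for k in range(len(times) - 1):
        dt = times[k + 1] - times[k]
        gap_start = target - performance[k]
        gap_end = target - performance[k + 1]
        loss += 0.5 * (gap_start + gap_end) * dt
    return loss


def resilience_index(times, performance, target=1.0):
    """One possible normalization: 1 minus the loss divided by the full area."""
    duration = times[-1] - times[0]
    return 1.0 - loss_of_resilience(times, performance, target) / (target * duration)


# Hypothetical performance points: stable, drop, partial recovery, recovered.
times = [0, 1, 2, 3]
points = [1.0, 0.4, 0.7, 1.0]
loss = loss_of_resilience(times, points)
index = resilience_index(times, points)
```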

Joint probability distribution
The proposed dynamic resilience analysis using the DBN approach can be written in probabilistic terms as Equation (3), where C is the set of all static variables (contemporal variables), Z is the set of all dynamic variables (temporal variables), P(C) is the joint probability of the static variables, Pa(Zi) is the set of variables that are parents of Zi, H is the hazard variable, R1 is the Robustness variable, R2 is the Redundancy variable, R3 is the Rapidity variable, R4 is the Resourcefulness variable, N is the number of dynamic indicators, and T is the total number of time steps.
The first term on the right-hand side of Equation (3) refers to the joint probability of the variables at the first time-step, the second term refers to the joint probability of the variables at the second time-step, and the third term covers the remaining time steps.
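For orientation, one factorization that is consistent with the definitions above and with the three-term structure just described can be written as follows. This is a sketch under stated assumptions, not necessarily the exact form of Equation (3): it assumes that H and R1 to R4 belong to the static set C, that H, R1, and R2 condition the indicators only at t=2, and that R3 and R4 condition them at every recovery step t=3, ..., T.

```latex
P(C, Z_{1:T}) =
  \underbrace{P(C)\prod_{i=1}^{N} P\left(Z_1^i \mid \mathrm{Pa}(Z_1^i)\right)}_{t = 1}
  \;\times\;
  \underbrace{\prod_{i=1}^{N} P\left(Z_2^i \mid \mathrm{Pa}(Z_2^i),\, H,\, R_1,\, R_2\right)}_{t = 2}
  \;\times\;
  \underbrace{\prod_{t=3}^{T}\prod_{i=1}^{N} P\left(Z_t^i \mid \mathrm{Pa}(Z_t^i),\, R_3,\, R_4\right)}_{t = 3,\dots,T}
```

Grouping P(C) with the first term is also an assumption; it could equally be read as a separate prefactor. Apart from P(C) and the explicit conditioning on H and the 4R's, this is the unrolled per-node product of a standard DBN.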

Conclusions
Unlike static resilience analysis, which assumes a constant state of a system and measures resilience by a static quantity, dynamic resilience analysis additionally models the evolution of the system over time. This paper introduced a probabilistic resilience assessment and prediction framework using Dynamic Bayesian Networks (DBNs). The framework employs resilience indicators to make it more usable by decision makers in industry. The methodology can handle both static and dynamic engineering systems using quantitative and/or qualitative data. The uncertainty in the inputs and in the relationships among the variables is accounted for and propagated throughout the model; hence, the output is probabilistic in nature. Quantitative tools that are readily available to system designers for modeling and quantifying engineering resilience are still underdeveloped. This paper aims to motivate the resilience community to agree on the proposed universal resilience framework. The presented framework provides a tool for decision makers to systematically learn about the state of their systems given a specific event. It also allows them to improve the systems' performance using the backward analysis feature of BNs: a desirable resilience state is set, and the input variable states that lead to that state are inferred. Future work will be oriented towards building detailed networks for the damage and recovery variables, as this would allow expressing the system in more detail. In addition, a procedure to evaluate the interdependency among the variables as well as their weighting factors will be addressed.
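The backward analysis mentioned above can be illustrated with a minimal Python sketch based on Bayes' rule for a single input variable. The state names, the prior, and the conditional probabilities are hypothetical, and a real network would require inference over all variables rather than this one-parent case.

```python
def posterior_given_target(prior, likelihood, target):
    """P(input state | resilience = target) obtained with Bayes' rule.

    prior:      {input_state: P(input_state)}
    likelihood: {input_state: {resilience_state: P(resilience_state | input_state)}}
    """
    joint = {x: prior[x] * likelihood[x][target] for x in prior}
    evidence = sum(joint.values())
    return {x: p / evidence for x, p in joint.items()}


# Hypothetical input indicator with two states and a two-state resilience node.
prior = {"poor": 0.5, "good": 0.5}
likelihood = {
    "poor": {"low": 0.8, "high": 0.2},
    "good": {"low": 0.3, "high": 0.7},
}

# Set the desired resilience state and infer the input states that explain it.
posterior = posterior_given_target(prior, likelihood, target="high")
```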