A Systemic Risk Management Model to Manage the Equipment Maintenance System in Oil and Gas Companies

Risk management is essential when managing the equipment maintenance system (EMS), which keeps equipment operating and is fundamental to achieving business objectives. With the advent of risk-based thinking in industry, a need arose to introduce a risk culture within the organization, including maintenance, in order to reduce business losses. Analysis of equipment failure data showed a relation between failure types and their consequences, and their interaction with the components of the maintenance system. An ineffective maintenance system may cause multiple losses for the organization and therefore affect the whole business. This paper introduces a systemic risk management model to manage undesired events in the maintenance system and to control their impact on the organization and their consequences for the business. Using the systemic risk management model, the maintenance professional can manage the whole maintenance system through risk analysis, assessment, and management, creating different risk scenarios in order to develop proper types of control.


Introduction and Background
Equipment is a major concern in production and service companies: it should be capable of delivering the required operations under the required conditions. Maintenance is therefore the main means of keeping equipment working at its specified capabilities and of minimizing system losses. A number of maintenance techniques have been developed, including risk-based techniques [3,4,5,6]. All of these techniques focus on the equipment itself and do not address the wider maintenance system, the related losses, or the impact of ineffective maintenance on business [1,2]. The maintenance system comprises Equipment, People, Work Process, and Environment [1], so it is not limited to equipment as the sole component. The interaction between these components may lead to an effective maintenance system, which improves operations, or to an ineffective one, which causes downtime that harms the business [1].
Studying the complete maintenance system in terms of risk management reveals the relations between the equipment and the work processes, human activities, and the effect of the environment [1].

Systemic Risk Management Model
The main objective of using the risk management model is to follow an approach that reduces system losses [7]. As a systemic model, it considers the interaction between all system components. The adopted model aims to understand the expected undesired events or hazards in the maintenance system and how they affect the organization's business [1]. Considering the consequences, a risk control strategy is developed to reduce the impact to the accepted risk limit. The model includes seven steps to achieve a complete risk management process [1]. First, identify the system, its components, and the relations between all system components. Second, identify the undesired events or hazards for each component. Third, estimate the probability of each undesired event, create the risk scenarios with event tree analysis, and then validate the created scenarios. Fourth, estimate the resulting severity that may affect the maintenance system or the business of the organization. Fifth, calculate the risk and evaluate it against the reference risk of the organization. Risk assessment comprises the second to the fifth steps. Sixth, develop the risk control strategy and calculate the residual risk. Seventh, apply cost-benefit analysis with different probabilities. The model was then applied to reciprocating pumps in an oil and gas service company.

Maintenance System Identification
Industry best practices indicate that the equipment maintenance system in any organization comprises multiple components. Using the fundamentals of Systems Thinking, we formulate the components of any maintenance system as follows:
1. Equipment: the productive machines within the organization.
2. People: those who work in the maintenance system, and others within the organization who affect maintenance activities.
3. Process and organization culture: all work instructions and procedures, whether written or simply well known to everyone, and the organization culture that drives how people act regarding the maintenance system.
4. Environment: the physical environment, the work environment, or other uncontrollable factors that may influence the system components.
The research approach aims to study all system components and the relations between them. These relations define the influence of each system component on the others, and their effect on the maintenance system and the overall business of the organization.
The system components interact with each other and cause undesired events, namely equipment failures or major equipment losses; if not controlled, these undesired events lead to feared events [1].
Each component in the system was found to have specific factors that drive the functional performance of the maintenance system. The failure probability increases as the adverse impact of these factors increases [1].

Hazards Identification for Equipment
-Equipment Operations: Severe operating conditions may lead to more frequent failures of some equipment parts.
-Manufacturer: Each manufacturer has a different equipment design and different components. Improper design may affect failure frequency.
-Equipment Age: The failure rate is affected by equipment age. As the equipment accumulates working years, failures may increase [8].

Hazards Identification for People
-Experience & Training: Lack of know-how may lead to undesired events.
-Behaviour: Reflects how individuals think and act regarding equipment operations and the maintenance system. Improper performance may lead to damage or loss.
-Commitment: Reflects adherence to system requirements and company procedures. Violation of system rules may cause an incident or accident [9].

Hazards Identification for Process and Organization Culture
-Work Procedures: Describe the written and unwritten activities that maintain system compliance. Inflexible and routine internal procedures may lead to time loss.
-Organization Culture: Reflects how the organization thinks about the system. A reactive culture regarding maintenance puts production and operations ahead of maintenance, which badly affects maintenance performance.
-Management: Defines the effect of management performance on the system. Improper planning will result in multiple problems during execution [8].

Hazards Identification for Environment
-Physical Environment: The surroundings, such as weather, roads, air, and water. Bad weather with dusty wind may harm equipment parts.
-Work Environment: The work activities that affect human health, such as workload and stress. Overload makes personnel act improperly, which may cause asset loss or an accident.
-Uncontrollable: Unexpected events that may be considered in risk scenarios with low probability, for example an unexpected fire that leads to asset loss. Others, such as an explosion or earthquake, may also be considered [1].

Defining Probabilities
Based on the case study, the probabilities for the maintenance system were estimated. Event probabilities are grouped into four ranks [10]. Table 1 illustrates the probability ranges from very low to high; the High rank runs from 0.6 to less than 1.

Creating Risk Scenarios
After estimating the probability of defects for the system components, we developed the different risk scenarios that may affect the system output and cause undesired and feared events. A model was established to create the risk scenarios using Event Tree software. The software computes the final probability of a sequence of events from the probability of defect of each event. Both its input and its output were in Excel form. After validation, the resulting probability of each scenario is multiplied by the severity to calculate the risk values. Sorting these risk values from largest to smallest facilitates risk calculation and evaluation.
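The scenario-building step can be illustrated with a minimal Python sketch. It is not the Event Tree software used in the study: it assumes a simple binary tree in which each event either fails or holds, assumes the events are independent, and uses hypothetical defect probabilities and the hypothetical function name `enumerate_scenarios`.

```python
from itertools import product
from math import prod


def enumerate_scenarios(defect_probs):
    """Enumerate every branch combination of a simple binary event tree.

    defect_probs maps an event name to its probability of defect.
    Each event either fails (probability p) or holds (probability 1 - p).
    Independence between events is an assumption of this sketch.
    """
    names = list(defect_probs)
    scenarios = []
    for outcome in product((True, False), repeat=len(names)):
        # Probability of each branch taken along this path of the tree.
        branch_probs = [
            defect_probs[n] if failed else 1.0 - defect_probs[n]
            for n, failed in zip(names, outcome)
        ]
        scenarios.append({
            "failed": tuple(n for n, f in zip(names, outcome) if f),
            "probability": prod(branch_probs),
        })
    return scenarios


# Hypothetical defect probabilities, not values from the case study.
example = {"equipment": 0.3, "people": 0.2, "process": 0.1}
scenarios = enumerate_scenarios(example)
for s in scenarios:
    print(s["failed"], round(s["probability"], 4))
```

Because every path through the tree is enumerated, the scenario probabilities sum to 1, which gives a simple consistency check before the values are multiplied by severity.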

Severity Estimation
A cost iceberg model was designed to include hidden costs when calculating the actual cost of severity of each risk scenario. When equipment failures occur during operations and are not controlled, one or more consequences may follow. These consequences may have a direct impact on business, such as job downtime, or an indirect impact, such as image loss due to bad performance (a feared event). The hidden cost of these consequences should be taken into consideration. The maintenance cost of the failure and the major loss together represent the undesired event cost, which is added to the feared event cost to obtain the final severity value (Table 2).
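The severity sum described above can be written out as a short Python sketch. The class name `ScenarioCosts`, its field names, and the figures are hypothetical and serve only to show how the cost terms combine; they are not taken from Table 2.

```python
from dataclasses import dataclass


@dataclass
class ScenarioCosts:
    maintenance_cost: float   # cost of repairing the failure
    major_loss: float         # major equipment loss, if any
    feared_event_cost: float  # cost of the feared event, hidden costs included

    def undesired_event_cost(self) -> float:
        # Undesired event cost = maintenance cost of failure + major loss.
        return self.maintenance_cost + self.major_loss

    def severity(self) -> float:
        # Final severity = undesired event cost + feared event cost.
        return self.undesired_event_cost() + self.feared_event_cost


# Hypothetical figures in USD, not values from the case study.
costs = ScenarioCosts(
    maintenance_cost=2_000.0,
    major_loss=15_000.0,
    feared_event_cost=40_000.0,
)
print(costs.severity())
```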

Calculating Risk
The risk is calculated by multiplying the probabilities resulting from the scenarios by the severity values. The following table represents the final risk ranking; its upper levels are Major (from 1,000 to less than 50,000) and Catastrophic, rank 4 (greater than 50,000).
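As a minimal sketch of this calculation, the Python below multiplies probability by severity and sorts the scenarios from largest to smallest risk. The scenario data and the function names `score_and_sort` and `risk_level` are hypothetical. Only the Major and Catastrophic ranges are stated in the text, so the sketch returns `None` for any other value instead of guessing the remaining levels.

```python
def score_and_sort(scenarios):
    """Attach risk = probability * severity and sort largest first."""
    scored = [dict(s, risk=s["probability"] * s["severity"]) for s in scenarios]
    return sorted(scored, key=lambda s: s["risk"], reverse=True)


def risk_level(risk):
    """Map a risk value to the two levels named in the text.

    Levels below Major, and the exact value 50,000, are not specified
    in the text, so None is returned for them rather than a guess.
    """
    if risk > 50_000:
        return "Catastrophic"
    if 1_000 <= risk < 50_000:
        return "Major"
    return None


# Hypothetical scenarios (severity in USD), not case-study data.
scenarios = [
    {"name": "A", "probability": 0.125, "severity": 400.0},
    {"name": "B", "probability": 0.5, "severity": 120_000.0},
    {"name": "C", "probability": 0.25, "severity": 8_000.0},
]
ranked = score_and_sort(scenarios)
for s in ranked:
    print(s["name"], s["risk"], risk_level(s["risk"]))
```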

Evaluating Risk
In this step, the risk scenarios are evaluated by comparing the resulting risk values with the accepted risk criteria of the organization. For the 7533 validated scenarios, the lowest risk value is zero and the largest is $127,381. Data analysis is performed by selecting effective ranges for the bins. The bins represent ranges of risk values, and the frequency represents how many scenarios fall within each range. Distributing the data in this way makes it easier for the organization to decide on risk control by selecting the most significant scenarios to mitigate. Another approach is to use Pareto analysis and the 80/20 rule: calculate the total risk value for all scenarios, then control the minimum number of scenarios (possibly 20%) that eliminates 80% of the risk [1].
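Both evaluation approaches can be sketched in a few lines of Python. The risk values, the bin edges, and the function names `bin_frequencies` and `pareto_selection` are hypothetical; the study's actual bins are not given in the text.

```python
def bin_frequencies(risks, edges):
    """Count how many risk values fall in each [edges[i], edges[i+1]) bin."""
    counts = [0] * (len(edges) - 1)
    for r in risks:
        for i in range(len(edges) - 1):
            if edges[i] <= r < edges[i + 1]:
                counts[i] += 1
                break
    return counts


def pareto_selection(risks, share=0.8):
    """Return the fewest largest risks whose sum reaches `share` of the total."""
    ordered = sorted(risks, reverse=True)
    target = share * sum(ordered)
    selected, running = [], 0.0
    for r in ordered:
        if running >= target:
            break
        selected.append(r)
        running += r
    return selected


# Hypothetical risk values in USD, not case-study data.
risks = [600.0, 250.0, 100.0, 30.0, 20.0]
print(bin_frequencies(risks, edges=[0, 100, 500, 1000]))
print(pareto_selection(risks))
```

In this toy data set, two of the five scenarios carry 85% of the total risk, so controlling those two would meet an 80% target.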

Controlling Risk
The traditional method of controlling consequences in maintenance was to develop a maintenance plan to reduce the failure probability. In this model, control covers all hazards within the system components that cause the final equipment failures and the other undesired and feared events. Bow Tie software was used to present the complete risk assessment and control for the model. The following table summarizes the types of control [1,11,12].

Sensitivity Analysis
The control strategy assumed that the probability would be reduced by only 25%. But what if the probability were reduced by 50% or more? What would the situation be if more controls were used to reduce the probabilities to a minimum? What if a risk reference were set as the organization's accepted risk criterion for controlling undesired events, based on the organization's plan? A sensitivity analysis was performed using Monte Carlo simulation to show how the final output changes as the probability changes [1].
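One way such a sensitivity analysis could be set up is sketched below in Python. This is an illustrative assumption, not the simulation used in the study: the probability reduction achieved by control is drawn uniformly between a lower and an upper bound for each scenario, and the distribution of total residual risk is summarized. The scenario data, the bounds, and the function name `simulate_residual_risk` are hypothetical.

```python
import random


def simulate_residual_risk(scenarios, low=0.25, high=0.75, trials=10_000, seed=0):
    """Monte Carlo estimate of total residual risk.

    For each trial, every scenario's probability is reduced by a fraction
    drawn uniformly from [low, high] (an assumption of this sketch).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    totals = []
    for _ in range(trials):
        total = 0.0
        for s in scenarios:
            reduction = rng.uniform(low, high)
            total += s["probability"] * (1.0 - reduction) * s["severity"]
        totals.append(total)
    totals.sort()
    return {
        "mean": sum(totals) / trials,
        "p05": totals[int(0.05 * trials)],
        "p95": totals[int(0.95 * trials)],
    }


# Hypothetical scenarios (severity in USD), not case-study data.
scenarios = [
    {"probability": 0.5, "severity": 120_000.0},
    {"probability": 0.25, "severity": 8_000.0},
]
print(simulate_residual_risk(scenarios))
```

Setting `low` and `high` to the same value reproduces a fixed reduction, such as the 25% assumed by the control strategy.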

Analysing Cost Benefit
After calculating the total cost of the risk scenarios and the total cost required for control, the benefit-cost ratio is calculated to determine whether control yields a benefit or whether the risk can be accepted when compared with the organization's acceptance limit. The cost of the risk scenarios after control represents the residual risk. The calculation proceeds as follows:
*Total Cost of control = 233,150 USD
*Total Risk before control = 6,579,486 USD
*Residual Risk = 1,306,946 USD
*Benefit (Risk - Residual Risk) = 5,272,540 USD
*(Benefit/Cost) Ratio = 22.6 (which is greater than 1, so the risk control should be applied)
A sensitivity analysis was done through multiple scenarios by changing the cost of control for each system component [1].
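The arithmetic above can be reproduced with a short Python sketch. The function name `cost_benefit` is hypothetical; the input figures are the ones reported in this section.

```python
def cost_benefit(total_risk, residual_risk, control_cost):
    """Return (benefit, benefit/cost ratio) for a risk control strategy."""
    benefit = total_risk - residual_risk
    return benefit, benefit / control_cost


# Figures reported in the text, in USD.
benefit, ratio = cost_benefit(
    total_risk=6_579_486,
    residual_risk=1_306_946,
    control_cost=233_150,
)
print(benefit, round(ratio, 1))
# A ratio greater than 1 indicates that control is worth applying.
apply_control = ratio > 1
```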

Conclusion
This research introduced a risk management solution for organizations that have problems with their equipment maintenance systems, and for organizations that need to improve their maintenance system or avoid losses that may affect the whole business. If the maintenance system is not effectively controlled, one or more major consequences will occur. To avoid the system's impact on business, the risk management model can be applied to assess the system's undesired events and select the proper type of control. The model is valid for any maintenance system in oil and gas companies and other industries.