The Research of Autonomous In-orbit Health Management System for China’s Manned Spacecrafts

The development trend for manned spacecraft is toward long-term in-orbit flight and long-term astronaut residency. With limited TT&C resources, autonomous health management is a critical and mandatory capability for ensuring the safety of the astronauts and the platform. Building on a discussion of the objectives and management modes of autonomous health management for manned spacecraft, and taking into account the distributed, hierarchical information structure of China's manned spacecraft, this paper describes an autonomous health management system. The system adopts a by-level safety threshold management method, establishing a mapping database between the safety thresholds of different critical risks and the system safety modes. When key parameters reach the different preset thresholds, the system autonomously degrades and reconstructs the spacecraft's functions in the event of failure, which optimizes autonomous processing and resolves the conflict between the randomness of autonomous processing and task continuity. The research has been applied to China's Space Station.


PREFACE
Since the beginning of the 21st century, systematic vehicle health management has emerged as a specialized technology in the aviation and aerospace fields. Representative approaches include Complex System Integrated Health Management (CSIHM), Prognostics and Health Management (PHM), Integrated Vehicle Health Management (IVHM), and Integrated System Health Management (ISHM) [1][2]. Domestic research on vehicle health management began more recently and mainly focuses on spacecraft fault diagnosis management, autonomous operation management, and integrated health management of complex systems. Some subsystems of the Resource 21 satellite, the Resource 22 satellite, and the ShenZhou spacecraft have adopted intelligent technologies such as fault isolation, diagnosis, and system reconstruction, and these have performed well.
The development trend for manned spacecraft is toward long-term in-orbit flight and long-term astronaut residency. With limited TT&C resources, autonomous health management is a critical and mandatory capability for ensuring the safety of the astronauts and the platform [3]. Based on the information architecture of China's manned spacecraft, this paper describes an autonomous health management system that adopts a by-level safety threshold management method to achieve functional degradation and reconstruction of the spacecraft in the event of failure.

MANNED SPACECRAFT HEALTH MANAGEMENT
Manned spacecraft health management comprehensively evaluates the spacecraft's health status from telemetry data and images and predicts changes in spacecraft performance, so that emergent faults can be detected, isolated, and recovered from, and possible failures can be prevented. The aim is to keep the spacecraft operating normally so that it meets its mission objectives. It generally comprises two parts: autonomous health management on the spacecraft and ground health management.
(1) Ground health management uses spacecraft telemetry, images, and astronaut voice to evaluate system health with a ground-based flight control expert system, to take ground action in response to anomalies, or to predict possible anomalies and implement preventive measures. Its shortcoming is that the TT&C resources of land-based stations, ocean-based stations, and relays cannot be occupied exclusively by a manned spacecraft for long periods, so the ground cannot receive real-time spacecraft data over a long period of time.
The ground health management system evaluates the health status of the entire spacecraft, the reliability of its sub-units and systems, and its in-orbit status in order to forecast the lifecycle. The evaluation provides decision support for spacecraft replenishment, maintenance, and life extension, and generally includes the following functions:
1) Store telemetry data and ground test data and perform big data processing on the resulting mass of data, using various intelligent inference algorithms (such as physical models, neural networks, data integration, fuzzy logic, and expert systems) [4] to extract reliability data and to extract and save lifecycle prediction data.
2) Import the extracted reliability data into the assessment software for reliability assessment.
3) Collect actual in-orbit flight data and ground life test data, and use them to correct and fit the lifetime trend forecast curve in order to predict in-orbit lifetime.
(2) The autonomous management system monitors and senses its own status and autonomously detects, isolates, and recovers from failures [5][6][7]. Given the limits of on-board computer processing capability, its management scope mainly covers the platform safety and health status that requires highly real-time processing. In addition, astronauts on board can observe and perceive the real state of the spacecraft and take appropriate action, which is also within the scope of autonomous management. In other words, health monitoring, diagnosis, and processing performed by the manned spacecraft itself or by its astronauts all fall within the scope of in-orbit autonomous health management. The shortcoming of autonomous health management is that it cannot judge and process all conditions. In particular, a fault that was not planned for can only be resolved by the ground health management system.
Autonomous health management is mainly used to isolate major emergency faults, while the ground system carries out the final solution.
The characteristics of ground health management and autonomous health management are as follows:

MANNED SPACECRAFT AUTONOMOUS HEALTH MANAGEMENT GOALS
Manned spacecraft autonomous health management is mainly designed to achieve the following goals:
(1) Secure the safety of the astronauts and the platform: for faults that affect astronaut safety or cause platform failure, health management promptly detects, responds to, and disposes of the fault, ensuring the safety of the astronauts and the platform and gaining time for ground handling. In the event of a major emergency failure, it provides timely response and disposal so that the spacecraft can quickly enter the corresponding safety mode for energy, attitude, propulsion, or thermal control. It alerts the astronauts and displays the fault information so that they can act promptly on that information according to the prepared fault solution.
(2) Support long-term operation: in the long-term operation stage, TT&C resources are limited. Outside the TT&C area, the autonomous health management system detects and diagnoses faults in time and disposes of them in real time. Within the TT&C area, the fault information can be received by the ground for further analysis and disposal.
(3) Relieve part of the ground flight control workload: the design moves from the traditional ground-monitoring mode, which relies on people, to autonomous health management that detects and diagnoses failures autonomously, reducing human misjudgments and missed judgments. This relieves part of the ground flight control workload and hence reduces operating costs.

AUTONOMOUS HEALTH MANAGEMENT MODE
A well-designed autonomous health management system can greatly enhance a spacecraft's in-orbit independence and safety and hence reduce its need for TT&C resources. For example, China's Tiangong-1 target spacecraft has autonomous management designs for its energy module, in-orbit attitude control module, and thermal control module [8]. During the long-term in-orbit period, the demand for TT&C resources was reduced from 80% to 10% compared with the rendezvous and docking tasks, which increased the efficiency of ground-side monitoring and reduced in-orbit operating costs.
Considering computer capability and design complexity, the system normally brings into the scope of autonomous management mainly the platform safety related status that requires high real-time performance. China's Space Station has numerous in-orbit failure modes of many types and with complex states. In the space station assembly phase, the number of nodes plus visiting spacecraft can reach 7, and information, power supply, and control are cross-connected, which leads to tighter system coupling and more complex coupled faults. To define the management scope of such a large-scale spacecraft autonomous management system accurately, FMEA (Failure Mode and Effect Analysis) and FTA (Fault Tree Analysis) are normally used jointly, combining the static functional modules of the spacecraft with the dynamic flight mission profile to ensure comprehensive and accurate identification. Generally, the modes in which manned spacecraft require autonomous health management are shown in the following table:

AUTONOMOUS HEALTH MANAGEMENT SYSTEM ARCHITECTURE
The main functions of the autonomous health management system are spacecraft health status acquisition, health management data transmission, health management data storage, fault detection, fault diagnosis, and failure decision.
Achieving these functions generally requires a central processor, data acquisition equipment, commanding equipment, data storage, and functional terminals [9][10][11]. The central processor receives data from the data acquisition equipment and judges whether a fault has occurred. At the same time, it directs the commanding equipment to send orders to the functional terminals to perform fault isolation, and it controls the data storage to record the fault condition. The data acquisition equipment receives the spacecraft status data and sends it to the central processor for analysis. The functional terminals receive and execute the fault handling instructions from the commanding equipment to make sure that the spacecraft stays in a safe mode in orbit. Under scheduling by the central processor, the data storage equipment stores the status data collected by the data acquisition equipment, together with the autonomous health management processing and result data produced outside the flight control area, to support later fault analysis by ground experts. Figure 2 is a block diagram of the spacecraft autonomous health management system.
The information system of a manned spacecraft is a distributed embedded system. To increase the speed of response to failures and reduce the computing burden on the core processor, a by-level health management system architecture is adopted. The autonomous health management functions are carried out by numerous computers distributed across different subsystems. Because each computer sits at a different level of the information system, the health management tasks it undertakes differ, and they are generally divided into three levels. The top layer is the core processing unit, the second layer is the control computer or regional controller of each subsystem, and the third layer is the equipment on each subsystem's internal subnet and other terminals.
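The division of roles among the equipment described above can be illustrated with a minimal Python sketch. It is not the flight implementation: the class names mirror the equipment names in the text, but the single-limit fault test, the "isolate" instruction, and the `limits` mapping are assumptions introduced purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DataStorage:
    """Records fault conditions for later analysis by ground experts."""
    records: list = field(default_factory=list)

    def record(self, entry: dict) -> None:
        self.records.append(entry)


@dataclass
class FunctionalTerminal:
    """Executes the fault handling instructions it receives."""
    name: str
    isolated: bool = False

    def execute(self, instruction: str) -> None:
        # Hypothetical instruction set: only "isolate" is modelled here.
        if instruction == "isolate":
            self.isolated = True


@dataclass
class CommandingEquipment:
    """Forwards orders from the central processor to functional terminals."""
    terminals: dict  # terminal name -> FunctionalTerminal

    def send(self, terminal_name: str, instruction: str) -> None:
        self.terminals[terminal_name].execute(instruction)


@dataclass
class CentralProcessor:
    """Judges faults from acquired data, then orders isolation and logging."""
    commanding: CommandingEquipment
    storage: DataStorage
    limits: dict  # hypothetical: parameter -> (terminal name, maximum value)

    def process(self, sample: dict) -> list:
        """Check one sample of acquired status data; return faulty parameters."""
        faults = []
        for parameter, value in sample.items():
            if parameter not in self.limits:
                continue  # parameter is outside the autonomous management scope
            terminal_name, maximum = self.limits[parameter]
            if value > maximum:
                self.commanding.send(terminal_name, "isolate")
                self.storage.record(
                    {"parameter": parameter, "value": value, "action": "isolate"}
                )
                faults.append(parameter)
        return faults
```

In this sketch the sample passed to `process` stands in for the output of the data acquisition equipment. Keeping the commanding path and the storage path as separate objects reflects the text's separation between issuing isolation orders and recording the fault condition.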

OPERATING MODE
In orbit, the autonomous health management system repeatedly checks the health status of the basic functional systems, including the information system, energy system, attitude & orbit control system, human environment control system, and cabin structure detection system. For major emergency failures, the space station directly enters the safe mode for emergency disposal and then waits for further investigation and disposal by the ground. For normal failures, it raises an alarm according to established strategies or performs fault handling such as cutting off or resetting equipment. The autonomous health management system informs the astronauts, through images, alarms, and other means, of the detected platform health status, fault diagnosis, and disposal information, and simultaneously sends this information to the ground.
Supported by testability design, the autonomous health management system collects status information step by step and layer by layer and performs fault detection and diagnosis.
(1) Equipment level
Through testability design features such as the device's BIT, hardware monitoring circuits, and test points, the health management system obtains device status information, carries out fault detection and diagnosis, and reports the fault information to the subsystem controller. For devices that are directly connected to the main bus of the system network, such as digital control devices and communication control devices, the fault information is reported directly to the core processing unit.
(2) Subsystem level
The subsystem controller collects the health status of the equipment in the subsystem. According to the configured fault modes and detection & diagnosis programs, it executes fault detection and diagnosis and reports the fault information to the core processing unit.
If the subsystem controller judges that the safe mode should be triggered, it either performs emergency handling itself and notifies the core processing unit at the same time, or notifies the core processing unit directly and leaves the emergency handling to it.
(3) System level
The core processing unit integrates the fault information of the related subsystems to determine whether the safe mode entry condition is fulfilled, and performs emergency processing if it is. All fault information is sent to the ground as well as to the panel subsystem for display. The astronauts carry out further actions based on the pre-established plan.
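The three-level reporting path described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the on-board software: the fault codes, the `safe_mode_faults` set used as the safe mode entry condition, and the `built_in_test` interface are all hypothetical, and only the variant in which the subsystem controller leaves emergency handling to the core processing unit is modelled.

```python
from dataclasses import dataclass, field


@dataclass
class CoreProcessingUnit:
    """System level: integrates fault reports and decides on safe mode entry."""
    safe_mode_faults: set  # hypothetical fault codes that trigger safe mode
    received: list = field(default_factory=list)
    safe_mode: bool = False

    def report(self, source: str, fault: str) -> None:
        self.received.append((source, fault))
        if fault in self.safe_mode_faults:
            self.safe_mode = True  # stands in for emergency processing


@dataclass
class SubsystemController:
    """Subsystem level: collects equipment faults and forwards them upward."""
    name: str
    core: CoreProcessingUnit

    def report(self, source: str, fault: str) -> None:
        # Prefix the source with the subsystem name so the core can trace it.
        self.core.report(f"{self.name}/{source}", fault)


@dataclass
class Device:
    """Equipment level: reports the result of its built-in test (BIT)."""
    name: str
    parent: object  # a SubsystemController, or the core for main-bus devices

    def built_in_test(self, healthy: bool, fault: str = "bit_failure") -> None:
        if not healthy:
            self.parent.report(self.name, fault)
```

A device attached to a subsystem controller reports through that controller, while a device on the main bus is constructed with the core processing unit as its `parent` and reports to it directly, matching the two reporting routes described for the equipment level.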

SECURITY THRESHOLD MANAGEMENT AND TROUBLESHOOTING
A one-size-fits-all disposal inevitably disables some of the spacecraft's functions, so a by-level safety threshold management method is adopted, which establishes a mapping database between the safety thresholds of different critical risks and the system safety modes. When critical parameters reach the different preset thresholds, spacecraft functions are degraded level by level: sending alarms, shutting down high-power devices, and shutting down payloads. Through functional degradation and reconstruction of the spacecraft, the result of autonomous disposal is optimized, which resolves the conflict between the randomness of autonomous processing and task continuity. The specific analysis process is as follows:
(1) Follow the method described in chapter 3 to identify the safety related fault modes.
(2) Classify the disposals of the different fault modes and determine the corresponding system safety mode after each disposal. Keep the number of processing steps to a minimum, so that where possible one disposal solution covers multiple fault modes; this makes implementation more convenient and improves system reliability. For example, the disposal for both reduced power supply capacity and reduced heat dissipation capability is to shut down devices, so a single device shut-down method can be considered for both scenarios.
Taking a decrease in spacecraft power supply capacity as an example, the mapping between the different safety thresholds and the system safety modes is shown in the following table:
(5) The spacecraft automatically performs functional degradation to the corresponding safety mode: the spacecraft judges faults according to the preset thresholds, and when a fault occurs it automatically performs functional degradation processing, proceeding level by level from safety mode n down to the minimum safety mode. Data is collected in real time during disposal to determine whether the fault has disappeared and to set the spacecraft to the corresponding safety mode.
If the fault disappears, the current functional degradation processing module is closed and the subsequent processing program is stopped. Figure 6 shows the specific disposal process.
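The by-level degradation logic can be illustrated with a short Python sketch. The mapping table from the paper is not reproduced in this text, so the percentage thresholds and the safety mode names below are hypothetical placeholders; only the three disposal actions (alarm, shutting down high-power devices, shutting down payloads) come from the description above. The monitored parameter is assumed to be the remaining power supply capacity as a percentage of nominal.

```python
from typing import Callable, List, Tuple

# Hypothetical mapping, ordered from mildest to most severe level:
# (power supply capacity threshold in percent, safety mode, disposal action)
THRESHOLD_TABLE: List[Tuple[float, str, str]] = [
    (80.0, "safety mode 1", "send alarm"),
    (60.0, "safety mode 2", "shut down high-power devices"),
    (40.0, "minimum safety mode", "shut down payloads"),
]


def degrade(
    read_capacity: Callable[[], float],
    apply_action: Callable[[str], None],
) -> Tuple[str, List[str]]:
    """Degrade level by level; stop as soon as a level's threshold is not reached.

    read_capacity is sampled again before every level, which models the
    real-time data collection used to decide whether the fault has disappeared.
    """
    mode = "normal"
    applied: List[str] = []
    for threshold, safety_mode, action in THRESHOLD_TABLE:
        if read_capacity() > threshold:
            break  # no fault at this level: stop the subsequent processing
        apply_action(action)
        applied.append(action)
        mode = safety_mode
    return mode, applied
```

For instance, calling `degrade(lambda: 55.0, print)` with a constant reading of 55 applies the first two disposals and stops before the payloads are shut down, because the reading has not reached the third threshold. Keeping the thresholds, modes, and actions in one table means that a single disposal routine can serve several fault modes, in the spirit of step (2).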

PMAE 2018

CONCLUSIONS
Based on the information architecture of China's manned spacecraft, this paper proposes an autonomous health management system that adopts a by-level safety threshold management method. In the event of failure, the system degrades and reconstructs spacecraft functions, which optimizes autonomous processing.