Fault tolerant area coverage control for multi-agent systems

The fault tolerance characteristics of a distributed multi-agent coverage algorithm are examined. A team of sensor-equipped mobile agents is tasked with covering a planar region of interest. A distributed, gradient-based control scheme is utilized for this purpose. The agents are assumed to consist of three subsystems, each of which may fail: the actuation, sensing and communication subsystems. Both partial and catastrophic faults are examined. Several simulation studies are conducted, highlighting the robustness of the distributed control scheme to these classes of faults, even when several of them occur at the same time.


Introduction
The problem of multi-agent area coverage can be categorized as blanket [1], sweep [2] or barrier [3] coverage. In all categories the common theme is the use of a team of sensor-equipped agents for a sensor coverage task. The agents are able to communicate with their neighbors in order to coordinate their movements and attain their common objective in a collaborative manner.
There have been varied approaches to the problem of blanket area coverage, which is the one examined in this article. These include game-theoretic methods [4], event-triggered control [5] and more classical methods such as distributed geometric optimization [6] and optimal control theory [7]. Several special cases of the blanket coverage problem have been examined, including coverage for non-convex domains [8,9], taking into account heterogeneous or anisotropic sensing patterns [10,11], accounting for more realistic agent dynamics [12] and ensuring communication among the agents [13].
In this work the agents are assumed to be dimensionless and are governed by a single integrator kinematic model. Each agent is equipped with an omnidirectional, limited-range sensor, allowing it to sense a circular region around it, while the effective sensor range is assumed to be common among agents. Agents are also equipped with transceivers in order to communicate with nearby agents and exchange the information necessary for coordinating their movement. The goal of the mobile agent team is to cover as much of a given convex, planar region of interest as possible using their on-board sensors. To that end the power diagram [14] is used in order to assign disjoint regions of responsibility to the agents. The computation of the coverage objective can then be distributed over the power diagram cells, allowing the use of geometric optimization techniques in order to derive a distributed control law [15].
In order to evaluate the robustness of the resulting control law, each agent is assumed to be composed of a set of subsystems, each of which can potentially fail. The actuation subsystem of an agent is responsible for its movement inside the region of interest. The sensing subsystem is tasked with sensing the region of interest, which is the goal of the team deployment. Finally, the communication subsystem is tasked with transferring the information required for implementing the control law among the agents. Simulation studies are used to show that the resulting control law is robust to faults in the three subsystems described previously.
The first class of faults considered in this article is actuation faults. These can be categorized into partial or complete actuation failures. In a partial actuation failure, one or more of the mobile agent's actuators operate at lower than full capacity, which results in the agent moving with reduced speed in the respective direction. In a complete actuation failure, on the other hand, one or more of the agent's actuators stop functioning completely, in which case the agent may be unable to move at all. Both cases are examined and it is shown through simulations that the agents are able to reconfigure themselves and achieve coverage despite some of them losing part or all of their actuation capabilities.
Another type of fault considered in this article is the loss of sensing performance. In this case a faulty sensor is modeled as having a reduced effective range. Although the Voronoi diagram could be used in the case of identical sensing radii, it cannot take into account the loss of sensing performance due to a faulty sensor, since it depends only on the agents' positions. The power diagram, on the other hand, takes into account the agents' sensing performance in addition to their positions. Thus agents whose sensors have degraded performance are assigned smaller regions of responsibility, resulting in the other agents covering the region the faulty ones no longer can. This behavior is also verified through simulation studies.
Another fault that may arise during the operation of a multi-agent system is a failure of the communication subsystem. In such a case the faulty agent will be unable to receive information from the other agents, as well as send them the information they require in order to implement the control law, essentially becoming completely isolated from the team. Since the neighbors of the faulty agent cannot distinguish a failure of the communication subsystem from a total agent failure, there are two approaches to handling this fault. The safe approach is to always assume that the faulty agent is unable to perform its assigned sensing task, thus the other agents should implement the control scheme as if it no longer existed. The alternative approach is to always assume that the faulty agent can still complete its sensing task. In this case, the other agents should compute their control law assuming the faulty agent is located at its last reported position. In both cases the faulty agent should remain immobile since the others have no means of learning its new position. The performance of the agent team for both cases is shown through simulation studies.
It is shown that the examined control scheme, based on the power diagram and distributed optimization techniques, is robust to multiple classes of faults. In all cases examined the agent team is able to reconfigure and adapt despite some of its members having severe faults.

Problem statement
We assume a convex region of interest $\Omega \subset \mathbb{R}^2$ and a space density function $\phi \colon \Omega \to \mathbb{R}_+$ which encodes all a priori knowledge about the importance of points inside $\Omega$. A team of $n$ mobile ground agents is deployed inside the region of interest. Each agent $i \in \{1, 2, \dots, n\}$ is approximated by a point mass located at $q_i \in \Omega$ and is governed by the following kinematic model

$$\dot{q}_i = a_i u_i, \quad q_i \in \Omega, \quad u_i \in \mathbb{R}^2,$$

where $u_i$ is the control input and $a_i \in [0, 1]$ is the actuator efficiency parameter. Actuator efficiency values of $1$, $(0, 1)$ and $0$ correspond to normal operation, partial actuator fault and catastrophic actuator fault respectively.
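The effect of the actuator efficiency parameter on the single integrator model can be illustrated with a forward-Euler step. This is only a minimal sketch: the function name `integrate_agent`, the time step `dt` and the use of Euler integration are assumptions made for illustration and are not taken from the article.

```python
import numpy as np

def integrate_agent(q, u, a, dt):
    """One forward-Euler step of q_dot = a * u for a single agent.

    q  : current position (2-vector)
    u  : commanded velocity (2-vector)
    a  : actuator efficiency in [0, 1]
    dt : integration step (hypothetical, chosen by the user)
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("actuator efficiency must lie in [0, 1]")
    return np.asarray(q, dtype=float) + dt * a * np.asarray(u, dtype=float)

q0 = np.array([0.0, 0.0])
u = np.array([1.0, 0.0])
healthy = integrate_agent(q0, u, a=1.0, dt=0.1)  # full commanded displacement
partial = integrate_agent(q0, u, a=0.5, dt=0.1)  # half the displacement
stuck = integrate_agent(q0, u, a=0.0, dt=0.1)    # agent stays where it is
```

With the same commanded velocity, a partial fault scales the displacement by the efficiency value, while a catastrophic fault leaves the agent in place regardless of the control input.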
All agents are equipped with identical omnidirectional sensors with limited sensing radius $R$, resulting in a circular sensing pattern

$$S_i(q_i, s_i R) = \{ q \in \Omega \colon \| q - q_i \| \le s_i R \},$$

where $s_i \in [0, 1]$ is the sensor efficiency parameter of agent $i$. Sensor efficiency values of $1$, $(0, 1)$ and $0$ correspond to normal operation, partial sensor failure and complete sensor failure respectively.
Agents are also equipped with identical wireless transceivers in order to exchange information with their neighbors. Since the sensing radius of every agent is at most $R$, agents more than $2R$ apart will not have any meaningful interaction with each other. Thus the communication radius of each agent $i$ is $2 c_i R$, where $c_i \in \{0, 1\}$ is used to represent normal operation ($c_i = 1$) and communication failure ($c_i = 0$) for agent $i$.

The goal of the mobile agent team is to maximize the sensor coverage of the region of interest while simultaneously accounting for the space density function. Thus the coverage objective is defined as

$$H = \int_{\Omega} \max_{i \in \{1, \dots, n\}} \mathbf{1}_{S_i}(q) \, \phi(q) \, dq, \tag{4}$$

where $\mathbf{1}_{S_i}$ is the indicator function for the set $S_i$. The control objective is the design of a distributed control law for the mobile agents so that a monotonic increase of the coverage objective $H$ over time is guaranteed.
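The coverage objective can be approximated numerically by sampling the region on a grid and summing the density over every cell sensed by at least one agent. The sketch below assumes a rectangular region and a midpoint rule; the function name `coverage_objective` and its arguments are hypothetical, and the grid resolution is an arbitrary choice.

```python
import numpy as np

def coverage_objective(agents, s, R, phi, xs, ys):
    """Grid approximation of the integral of max_i 1_{S_i}(q) * phi(q).

    agents : (n, 2) array of agent positions
    s      : (n,) array of sensor efficiencies in [0, 1]
    R      : nominal sensing radius
    phi    : callable phi(X, Y) returning the density on the grid
    xs, ys : 1-D arrays of cell-centre coordinates of a uniform grid
             over a rectangular region (an assumption of this sketch)
    """
    X, Y = np.meshgrid(xs, ys)
    covered = np.zeros(X.shape, dtype=bool)
    for (qx, qy), si in zip(agents, s):
        # A point counts once, no matter how many sensing discs contain it.
        covered |= (X - qx) ** 2 + (Y - qy) ** 2 <= (si * R) ** 2
    cell_area = (xs[1] - xs[0]) * (ys[1] - ys[0])
    return float(np.sum(phi(X, Y) * covered) * cell_area)

# Example: uniform density on the square [0, 2] x [0, 2].
xs = (np.arange(400) + 0.5) * 2.0 / 400
uniform = lambda X, Y: np.ones_like(X)
one_agent = np.array([[1.0, 1.0]])
H_healthy = coverage_objective(one_agent, np.array([1.0]), 0.5, uniform, xs, xs)
H_faulty = coverage_objective(one_agent, np.array([0.5]), 0.5, uniform, xs, xs)
```

For a single agent well inside the region, `H_healthy` should be close to the area of a disc of radius `R`, and `H_faulty` close to a quarter of that, since halving the radius quarters the area.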

Distributed control law
MATEC Web of Conferences 188, 05010 (2018), https://doi.org/10.1051/matecconf/201818805010, ICEAF-V 2018

The first step in designing a distributed control law is the distribution of the computation of the coverage objective among the agents. To that end, a power partitioning of the region $\Omega$ using the agent positions $q_i$ as generators and their sensing radii $s_i R$ as weights will be used. The power cell of agent $i$ is defined as

$$V_i = \{ q \in \Omega \colon \| q - q_i \|^2 - (s_i R)^2 \le \| q - q_j \|^2 - (s_j R)^2, \; \forall j \in N_i \},$$

where $N_i$ is the set of neighbors of agent $i$. It should be noted that in the absence of sensor faults the power diagram reduces to the Voronoi diagram, thus we can define the neighbor set $N_i$ as the Delaunay neighbors of agent $i$. In the case of sensor faults, for the new set of neighbors $\hat{N}_i$ it will hold that $\hat{N}_i \subseteq N_i$, thus for this particular use case of the power diagram it is sufficient to consider only the Delaunay neighbors of the generators. It is also useful to define the R-limited power cell as

$$V_i^R = V_i \cap S_i,$$

which is the intersection of each agent's power cell with its sensed region. Using the power partitioning and the R-limited cell definition, the coverage objective (4) can be written as

$$H = \sum_{i=1}^{n} \int_{V_i^R} \phi(q) \, dq. \tag{6}$$

It is now possible to design a gradient-based, distributed control law so that the coverage objective function (6) increases monotonically over time. The complete control law design procedure can be found in [15] and results in

$$u_i = g_i \int_{\partial V_i^R \cap \partial S_i} n_i \, \phi(q) \, dq, \tag{7}$$

where $\partial S$ denotes the boundary of a set $S$, $n_i$ is the outward unit normal vector on $\partial V_i^R$ and $g_i$ is a positive constant.
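The partitioning and the gradient-type law of this section can be sketched numerically. The code below assigns points to power cells by comparing power distances and approximates the control input as a density-weighted sum of outward normals over the part of an agent's sensing circle that lies inside the region and inside its own power cell. The boundary-integral form used here is an assumption based on the description in the text (the full derivation is in [15]); the names `power_owner`, `control_input`, `in_region` and the sample count `m` are hypothetical. For simplicity all agents are compared against each other rather than only Delaunay neighbors.

```python
import numpy as np

def power_owner(points, agents, radii):
    """Index of the agent whose power cell contains each point.

    The power distance of q to agent i is ||q - q_i||^2 - r_i^2,
    with r_i = s_i * R the (possibly degraded) sensing radius.
    """
    d2 = np.sum((points[:, None, :] - agents[None, :, :]) ** 2, axis=2)
    return np.argmin(d2 - radii[None, :] ** 2, axis=1)

def control_input(i, agents, radii, gain, phi, in_region, m=720):
    """Approximate g_i * integral of n_i * phi over the arc of agent i's
    sensing circle that bounds its R-limited power cell."""
    r = radii[i]
    if r <= 0.0:
        return np.zeros(2)  # a dead sensor has no sensing boundary
    theta = 2.0 * np.pi * (np.arange(m) + 0.5) / m
    normals = np.stack([np.cos(theta), np.sin(theta)], axis=1)
    pts = agents[i] + r * normals
    # Keep samples inside the region and inside the agent's own power cell.
    keep = in_region(pts) & (power_owner(pts, agents, radii) == i)
    ds = 2.0 * np.pi * r / m  # arc length represented by each sample
    return gain * ds * np.sum(normals[keep] * phi(pts[keep])[:, None], axis=0)

def in_unit_square(p):
    return (p[:, 0] >= 0) & (p[:, 0] <= 1) & (p[:, 1] >= 0) & (p[:, 1] <= 1)

uniform = lambda p: np.ones(len(p))

# A single agent near the left edge of the unit square is pushed inwards.
agents = np.array([[0.2, 0.5]])
radii = np.array([0.5])
u = control_input(0, agents, radii, gain=1.0, phi=uniform, in_region=in_unit_square)
```

In this example part of the sensing disc falls outside the region, so only the arc inside the square contributes and the resulting input points towards the interior. When two agents with overlapping discs are simulated, the arcs that fall in the other agent's cell are discarded and the agents move apart.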

Operation under faults
In this section three kinds of faults are examined through simulation studies and for each one the agent team's performance is compared to its performance under normal operation.

Simulation 1: Normal Operation
The normal operation of the agent team is examined in this section in order to better evaluate the effect of the various faults examined. A team of 8 agents with sensing radius $R = 0.5$ and control law gains $g_i$ operating inside a convex region of interest is simulated. The region $\Omega$ is the same as in [15] for consistency. The agents' initial and final configurations can be seen in Fig. 1 [Left] and [Right] respectively, where the agent positions are marked by black dots, the boundaries of their power cells are shown in blue and the boundaries of their sensed regions are shown in red. The agent trajectories are presented in Fig. 1 in red, while the initial and final positions are shown as dots and circles respectively. This simulation serves as the baseline for evaluating the agent team's performance under faults. The initial conditions of all following simulation studies are the same as in this one.

Simulation 2: Actuation faults
Actuation faults reduce the efficiency of an agent's actuators, leading to decreased movement speed or complete immobilization. This simulation study is used to examine the effects of actuation faults on the control law (7). Agent 8 suffers from a partial actuation fault $a_8 = 0.5$ at $t = 2$ seconds, essentially reducing its speed by half. Agent 7 suffers from a catastrophic actuation fault $a_7 = 0$ at $t = 3$ seconds, leaving it immobilized for the remainder of the simulation. The agent trajectories and final configuration are shown in Fig. 2 [Left] and [Center] respectively, using the same colors and notation as in Fig. 1. The points on the trajectories of agents 7 and 8 at which the actuation faults occur are marked by a black cross and a black triangle respectively in Fig. 2 [Left]. The evolution of the coverage objective (6) over time is shown in Fig. 2 [Right] in solid blue for this simulation and in dashed red for normal operation. The times at which the failures occur are marked by dashed black vertical lines.
It is observed that despite the actuation faults, the agents manage to converge to a locally optimal configuration, although the final value of the coverage objective is 3.6% smaller than the one achieved under normal operation. It is also observed that the agent team needs more time to converge to its final configuration, which can be attributed to the reduced speed of agent 8 and the fact that the immobile agent 7 obstructs the movement of other agents. It should be noted that since the control scheme converges to local optima, even if only partial actuator faults occurred there would be no guarantee that the final agent configuration would be the same as the one under normal operation. In the static area coverage problem, however, it is the value of the coverage objective that matters and not the agent positions themselves.

Simulation 3: Sensor faults
Sensor faults limit the coverage performance of an agent by reducing its sensing radius, thus decreasing the area it is able to cover. This simulation study serves to examine the effect of sensor faults on the performance of the agent team. Agent 8 suffers from a partial sensor fault $s_8 = 0.5$ at $t = 5$, reducing its sensed area to 25% of its initial value. Agent 4 suffers from a complete sensor fault $s_4 = 0$ at $t = 10$, reducing its sensing performance to zero. The agent trajectories and final configuration are shown in Fig. 3 [Left] and [Center] respectively, using the same colors and notation as in Fig. 1. The points on the trajectories of agents 4 and 8 at which the sensor faults occur are marked by a black cross and a black triangle respectively in Fig. 3 [Left]. The evolution of the coverage objective (6) over time is shown in Fig. 3 [Right] in solid blue for this simulation and in dashed red for normal operation. The times at which the failures occur are marked by dashed black vertical lines. Sudden drops of the coverage objective are observed at these times due to the effective sensing radius of the faulty agents decreasing instantly.
It is observed that despite the sensor faults the agents are able to converge to a locally optimal configuration, although the final value of the coverage objective is 12.3% smaller than the one achieved under normal operation. Due to the use of the power diagram, agents are assigned cells according to their sensing capabilities, thus agents with sensor faults are not assigned disproportionately large cells.
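The way a sensor fault shrinks an agent's cell can be checked with a short calculation. Equating the power distances of two agents a distance $d$ apart, with sensing radii $r_i$ and $r_j$, places their common cell boundary at a distance $(d^2 + r_i^2 - r_j^2) / (2d)$ from agent $i$ along the segment joining them. The sketch below evaluates this expression; the function name `bisector_offset` and the separation `d = 0.8` are hypothetical values chosen only for illustration.

```python
def bisector_offset(d, r_i, r_j):
    """Distance from agent i to the power-cell boundary shared with agent j,
    measured along the segment joining the two agents.

    Derived from ||q - q_i||^2 - r_i^2 = ||q - q_j||^2 - r_j^2
    for a point q on that segment.
    """
    if d <= 0.0:
        raise ValueError("agents must be at distinct positions")
    return (d * d + r_i * r_i - r_j * r_j) / (2.0 * d)

R = 0.5   # nominal sensing radius used in the simulations
d = 0.8   # hypothetical separation between the two agents

healthy = bisector_offset(d, R, R)        # equal radii: midpoint, as in a Voronoi diagram
faulty = bisector_offset(d, 0.5 * R, R)   # agent i has sensor efficiency 0.5
area_ratio = (0.5 * R) ** 2 / R ** 2      # sensed area relative to a healthy sensor
```

With equal radii the boundary sits at the midpoint, whereas a sensor efficiency of 0.5 moves it towards the faulty agent, so the healthy neighbor takes over a larger share of the region. The area ratio reproduces the 25% figure quoted for agent 8.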

Simulation 4: Communication faults
The effects of communication faults on the agent team performance are examined in this section. Agents with communication faults are unable to either transmit or receive messages, which effectively isolates them from all other agents. It is assumed that each agent is able to detect a fault in its own communication system and that in such a case it will remain stationary indefinitely. The first assumption is needed so that an agent will not interpret a failure of its own communication system as a failure of all of its neighbors, while the second assumption is needed so that the other agents know what to expect from an agent with a communication failure. From the perspective of the agents operating normally, it is not possible to distinguish a failure in some other agent's communication system from a total failure of that agent, since in both cases they will simply be unable to communicate with it.
In case of a communication failure it might be desirable to take the malfunctioning agent into account, given the fact that its sensor is still functioning. In that case all other agents assume it is located at its last known position. Since, control-wise, this is equivalent to the malfunctioning agent having a catastrophic actuation failure, this case is not examined further, as the results are the same as those presented in Simulation 2 for agent 7. On the other hand, it might be desirable for the agents operating correctly to assume that every lack of communication from some agent is due to total agent failure, meaning the malfunctioning agent is unable to move, sense and communicate. In that case the malfunctioning agent is essentially removed from the agent team, which is the case examined in the following simulation study.
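The two ways of handling a silent agent can be expressed as a choice of which generators the healthy agents feed into their partitioning. The following sketch is illustrative only: the function name `believed_team`, its arguments and the boolean flag are hypothetical, and the communication range limit of $2R$ is ignored for brevity.

```python
import numpy as np

def believed_team(positions, last_reported, radii, comm_ok, assume_total_failure):
    """Positions and sensing radii that healthy agents use for partitioning.

    positions            : (n, 2) current positions of the agents
    last_reported        : (n, 2) last position each agent broadcast
    radii                : (n,) sensing radii s_i * R
    comm_ok              : (n,) booleans, False for a communication fault
    assume_total_failure : True  -> drop silent agents from the team
                           False -> keep them at their last reported position
    """
    positions = np.asarray(positions, dtype=float)
    last_reported = np.asarray(last_reported, dtype=float)
    radii = np.asarray(radii, dtype=float)
    comm_ok = np.asarray(comm_ok, dtype=bool)
    if assume_total_failure:
        return positions[comm_ok], radii[comm_ok]
    believed = np.where(comm_ok[:, None], positions, last_reported)
    return believed, radii

positions = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
last_reported = [[0.0, 0.0], [0.9, 0.0], [2.0, 0.0]]
radii = [0.5, 0.5, 0.5]
comm_ok = [True, False, True]

removed_pos, removed_r = believed_team(positions, last_reported, radii, comm_ok, True)
frozen_pos, frozen_r = believed_team(positions, last_reported, radii, comm_ok, False)
```

In the first call the silent agent disappears from the generator set, so its neighbors' cells grow to absorb its region. In the second it stays in the set at its last reported position, which is why that case behaves like a catastrophic actuation failure from the point of view of the control law.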
This simulation study examines the effect of communication failures under the assumption that they correspond to total agent failures. Agents 4 and 8 suffer communication failures $c_4 = 0$ and $c_8 = 0$ at times $t = 5$ and $t = 10$ respectively. The agent trajectories and final configuration are shown in Fig. 4 [Left] and [Center] respectively, using the same colors and notation as in Fig. 1. The points on the trajectories of agents 4 and 8 at which the communication faults occur are marked by black crosses in Fig. 4 [Left]. The evolution of the coverage objective (6) over time is shown in Fig. 4 [Right] in solid blue for this simulation and in dashed red for normal operation. The times at which the failures occur are marked by dashed black vertical lines. Sudden drops of the coverage objective are observed at these times due to the malfunctioning agents essentially being removed from the team. It is observed that despite the communication faults the agents are able to converge to a locally optimal configuration, although the final value of the coverage objective is 13.6% smaller than the one achieved under normal operation. After each malfunctioning agent is removed from the team, the others are able to adapt and move to cover the region that it was responsible for.

Conclusions
The robustness of a distributed control scheme for area coverage by a mobile agent team was examined in this article. Several classes of faults were considered and their effects examined through simulation studies. It has been shown that the control law is inherently robust and is able to adapt in real time to various types of agent faults.