Method for determining the reliability parameters of software for complex decision support systems

The testing stage creates great opportunities to verify and shape software reliability, but it also significantly increases the cost of software production. The effectiveness of the work related to testing, expressed by the interdependence between the reliability level of the program product and the cost of testing it, strongly depends on the adopted testing strategy, which specifies the organization and scope of the work performed. There is therefore a need to define the conditions for a compromise between the reliability and cost requirements set for the software. Finding this compromise in practice is greatly facilitated if the level of software quality and the cost of testing can be formally assessed using appropriate indicators. The paper describes a method of determining a program testing strategy as the solution of a two-criteria optimization problem, with the program reliability coefficient and the cost of testing as component criteria. The paper consists of a description of the program testing process and a mathematical model of this process, a formulation of the problem of two-criteria optimization of the program testing strategy, and remarks on the method of solving the formulated problem. The proposed method of finding an optimal testing strategy is illustrated by a numerical example.


Introduction
The work presents a formal description of the logical structure of the program under consideration, defined by the set of its component modules and the interconnections existing between them. The logical structure of the program is reflected in the structure of the program reliability coefficient, whose value depends on the values of the reliability measures of its component modules. The problem of two-criteria optimization of the program testing strategy is formulated, with the program reliability measure and the cost of testing as component criteria. The program testing strategy determined by solving this problem ensures simultaneous maximization of the adopted reliability coefficient and minimization of the program testing cost. The presented method is illustrated by a numerical example.
The number of program errors encountered during the testing process depends on many factors, such as the organization and technology of the testing process (which define the manner in which testing is carried out), the duration of the testing, the testers' qualifications and professional experience, and the reliability level of the program at the beginning of the testing process. The duration of the program testing process can be determined by the predicted time spent on testing activities or by the predicted cardinality of the set of input data that should be used for the testing.
Execution of the program under test with one input data set (test case) will be called a run in this paper. A run is successful if the program execution did not encounter any program errors, and unsuccessful if the program execution was incorrect, i.e. some errors were encountered.

Description of the program testing process
The behavior of a computer program, which depends on the manner in which its modules affect each other, is defined by the module structure of the program. The interactions between different modules, in the form of mutual control transfer, are present only during the execution of the program. The program structure may also contain some information on the frequency of activation of the individual modules. When the program is working, some emergency situations related to software errors may occur. A software error may be connected with the implementation of any module or with the transfer of control between two modules. The possibility of such emergency situations occurring during the use of the software is described by various software reliability coefficients. Knowledge of the component (module) structure of the examined software allows the use of coefficients whose values depend on this structure and on the reliability coefficients of its particular components.
Depending on the method applied in that respect, two approaches to the reliability modeling process can be distinguished: models based on the structure set by the source code instructions, and models based on logical paths [5].
MATEC Web of Conferences 210, 04009 (2018), CSCC 2018, https://doi.org/10.1051/matecconf/201821004009
The models based on the architecture set by the source code describe the software architecture using the so-called control transfer graph, which illustrates the possible transfers of control between the blocks of the source code. The nodes of this structure are determined by the so-called program control instructions. It is often assumed in models of this class that the control transfer between the blocks may be described by a Markov chain. This assumption means that the transfer of control from a given module to other modules, after its execution, does not depend on the previous activations of the program blocks. The software architecture is then modeled by a Markov chain, e.g. one that is discrete in states and in time. Representative examples of such models may be found in studies [2, 6, 10].
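The Markov assumption can be illustrated with a minimal sketch. The three-module program, the transfer probabilities, and the function name `expected_visits` below are hypothetical and serve only to show how a discrete-time chain over modules yields the expected number of executions of each module per program run; they are not taken from the cited models.

```python
def expected_visits(transfer, start, steps=1000):
    """Expected number of executions of each module during one program run.

    transfer[i][j] is the (assumed) probability that control passes from
    module i to module j; any probability mass missing from a row is the
    probability that the program terminates after module i.
    """
    n = len(transfer)
    # dist[i] = probability that module i is executing at the current step
    dist = [1.0 if i == start else 0.0 for i in range(n)]
    visits = [0.0] * n
    for _ in range(steps):
        visits = [v + d for v, d in zip(visits, dist)]
        # Markov property: the next module depends only on the current one
        dist = [sum(dist[i] * transfer[i][j] for i in range(n)) for j in range(n)]
    return visits


# Hypothetical control-transfer probabilities for a three-module program
TRANSFER = [
    [0.0, 0.7, 0.3],  # module 0 passes control to module 1 or module 2
    [0.0, 0.0, 1.0],  # module 1 always passes control to module 2
    [0.0, 0.0, 0.0],  # module 2 is the exit module
]

visits = expected_visits(TRANSFER, start=0)
```

In this sketch the activation frequencies of the modules follow directly from the transfer probabilities, which is the kind of structural information the paragraph above refers to.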
In the models based on the logical path concept, the software architecture is also formed by separate blocks of the source code, e.g. by modules. These are not, however, analyzed as independent components, but as sequences of components executed one by one when a given path is activated by an appropriate set of input data. Knowing the values of the reliability indicators of the components that form a given logical path, it is possible to determine the value of the reliability indicator of the whole software, using the knowledge of its architecture. Representative examples of models based on logical paths may be found in [9, 16].
The program being tested will be characterized by a directed graph G whose set of vertices I corresponds to the set of module numbers of the tested program. An arc leads from vertex i to vertex j if, during program execution, the j-th module can be executed next after the execution of the i-th module.
Without loss of generality, it can be assumed that the graph of the considered program is a unigraph with one input module and one output module, numbered $i_{WE}$ and $i_{WY}$ respectively. In the program graph G we can distinguish a certain set of paths connecting the initial node with the final node, where the term "path" is understood as in the theory of graphs and networks [4]. Because the analyzed graph G is a directed unigraph, each path connecting the vertices $i_{WE}, i_{WY} \in I$ can be unambiguously determined by giving the numbers of the vertices through which it passes. Any such path for which there is at least one set of program input data that activates it will be called a logical path; activating a specific logical path means executing, in sequence, the modules that make it up.
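The paths connecting the input module with the output module can be enumerated with a simple depth-first search. The sketch below uses a hypothetical five-module graph (not the graph of Fig. 1), and the names `all_paths` and `GRAPH` are illustrative. It returns candidate paths only: whether a path is logical, i.e. activated by at least one input data set, cannot be decided from the graph alone.

```python
def all_paths(successors, source, target):
    """Enumerate all simple paths from source to target in a directed graph.

    successors maps a module number to the list of modules that can be
    executed directly after it.
    """
    paths = []

    def extend(path):
        node = path[-1]
        if node == target:
            paths.append(list(path))
            return
        for nxt in successors.get(node, ()):
            if nxt not in path:  # do not revisit a module within one path
                path.append(nxt)
                extend(path)
                path.pop()

    extend([source])
    return paths


# Hypothetical program graph: input module 1, output module 5
GRAPH = {1: [2, 3], 2: [4], 3: [4], 4: [5]}

candidate_paths = all_paths(GRAPH, 1, 5)
```

Each returned list gives the module numbers in execution order, which matches the way a path is identified by the vertices it passes through.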
Let $I_q$ denote the set of module numbers of the q-th logical path, listed e.g. in the order in which the modules forming the path are executed, so that its k-th element is the number of the k-th module of the path and its cardinality is the number of modules forming the path.
It is assumed that the process of testing the considered program consists in independent testing of its component modules, and that the testing of the i-th module consists of so-called testing cycles, each of which includes:
• executing the module with a number of previously prepared test data sets;
• evaluating the obtained results, then locating and removing any errors found.
The adopted organization of the program testing process corresponds to the actual stage of the so-called autonomous testing of program components, followed by the so-called integration testing of the program [1, 8, 11-14].
Let S denote the testing strategy of the program under consideration, defined as follows:
$$S = (S_1, \dots, S_i, \dots, S_{|I|}) \quad (1)$$
where $S_i$ is the testing strategy of the i-th module, given by the numbers of test cases used in the successive cycles of its testing process:
$$S_i = (L_{i1}, \dots, L_{ik}, \dots, L_{iK_i})$$
Here $L_{ik}$ is the number of test cases (tests) on which the i-th module is tested in the k-th cycle of its testing process, and $K_i$ is the number of cycles of that process.
Let $p^i_{n,m}$ denote the probability of the event that n new errors are detected in a testing cycle of the i-th module, provided that m tests give incorrect results in this cycle. These probabilities form the matrix $P_i = [p^i_{n,m}]$ (definition (3)), with n as the row index and m as the column index.
Assuming that each test that gives an incorrect result leads to the detection of the error that caused it, and that each test can detect at most one error, no more than m new errors can be detected in a cycle in which m tests give incorrect results, i.e. $p^i_{n,m} = 0$ for $n > m$. The quantities $p^i_{n,m}$ are probabilities, and therefore $0 \le p^i_{n,m} \le 1$ and
$$\sum_{n} p^i_{n,m} = 1 \quad \text{for every } m \quad (4)$$
Condition (4) means that the sum of the elements in each column of the matrix $P_i$ is equal to 1.
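The conditions imposed on the probabilities can be checked mechanically. The sketch below assumes that the matrix is stored as a square list of rows indexed by n (new errors detected) with columns indexed by m (tests giving incorrect results), both starting from 0. The function name and the example values are hypothetical.

```python
def is_characteristic_matrix(matrix, tol=1e-9):
    """Check the conditions described in the text for one module's matrix.

    matrix[n][m] is read as the probability that n new errors are detected
    in a cycle in which m tests give incorrect results.
    """
    size = len(matrix)
    if any(len(row) != size for row in matrix):
        return False
    for n in range(size):
        for m in range(size):
            p = matrix[n][m]
            if p < -tol or p > 1.0 + tol:
                return False  # every entry must be a probability
            if n > m and abs(p) > tol:
                return False  # no more new errors than failing tests
    # each column must sum to 1 (condition (4))
    return all(
        abs(sum(matrix[n][m] for n in range(size)) - 1.0) <= tol
        for m in range(size)
    )


# Hypothetical matrix for a cycle with at most two tests
VALID = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.3],
    [0.0, 0.0, 0.7],
]
# Same matrix with the last column no longer summing to 1
INVALID = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.3],
    [0.0, 0.0, 0.6],
]
```

Under this reading the non-zero entries lie on or above the main diagonal, so a valid matrix is upper triangular.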
For a fixed method of designing the test data sets, the values of the probabilities $p^i_{n,m}$ that form the matrix $P_i$ depend on the logical structure of the source code of the tested program and on the level of its reliability. Assuming that the structure of the program under test is described by a program control graph, it can be concluded that the probabilities $p^i_{n,m}$ depend primarily on:
• the number of logical paths connecting the input node with the output node;
• the length of these paths, measured for example by the number of program instructions executed when they are activated;
• the level of mutual overlapping of the individual logical paths, measured by the number of program instructions included in two or more paths.
For example, if each instruction of the tested program (except the instructions forming the input and output nodes) belongs to only one logical path, i.e. the program under test consists of a set of disjoint paths, then the values of the probabilities $p^i_{n,m}$, $n \le m$, can be expected to concentrate on the main diagonal of the matrix $P_i$ or near it (above the main diagonal). If, in turn, the individual paths of the tested program overlap, i.e. many logical paths pass through the same instructions or groups of instructions, then the values of the probabilities $p^i_{n,m}$, $n \le m$, can be expected to concentrate in the initial rows of the matrix $P_i$ (subject to conditions (2)-(4)).
For the above reasons, in further considerations the matrix $P_i$ will be called the characteristic matrix of the i-th module of the program being tested. The tested program will then be characterized by a vector of characteristic matrices P, defined as follows:
$$P = (P_1, \dots, P_i, \dots, P_{|I|})$$
where $P_i$ is the characteristic matrix of the i-th module of the tested program.

Formulation of the problem of two-criteria optimization of the program testing strategy
In further considerations, the reliability coefficient of the tested program after the testing process implemented according to strategy S has finished will be defined as follows [15]:
$$r(S, P) = \sum_{q} d_q \prod_{i \in I_q} r_i(S_i, P_i) \quad (5)$$
where the sum runs over the logical paths of the program, $r_i(S_i, P_i)$ is the reliability coefficient of the i-th module with the characteristic matrix $P_i$ after its testing process, carried out according to the strategy $S_i$, has finished, and $d_q$ is the probability of activating the q-th logical path by a single test case. The coefficient $r_i(S_i, P_i)$ is given by relation (6), defined in [17], which involves the value of the i-th module reliability coefficient before the testing process begins.
According to relation (5), the reliability coefficient r(S, P) of the tested program is understood as the probability of its correct execution for a single test case. The values of the probabilities $d_q$ depend on the nature and frequency of the specific data sets for which the program under consideration is executed. In practice they are determined with the methods used for determining the operational profile of the program [15].
Let C(S, P) be the cost of the testing process of the program with the characteristic matrix vector P, implemented according to strategy S. The cost C(S, P) is the sum of the costs of testing the program modules (relation (9)), and for each module i it comprises two components:
• the average cost of preparing the set of test cases used in the testing process of the i-th module, implemented according to the testing strategy $S_i$, equal to $c_i^T L_i(S_i)$, where $c_i^T$ is the average cost of preparing and running one test case in the i-th module testing process and $L_i(S_i)$ is the total number of tests used in that process, i.e. the sum of the numbers of tests over all its cycles;
• the average cost of locating and removing the errors detected in the testing process of the i-th module with the characteristic matrix $P_i$, implemented according to the strategy $S_i$, equal to the product of $c_i^E$, the average cost of locating and removing one error in the i-th module testing process, and the expected number of errors whose detection is expected in that process.

The expected number of errors detected in the testing process of the i-th module can be specified on the basis of relationship (6). For practical reasons, the following limitations are imposed on the reliability coefficient r(S, P) and the cost C(S, P): the reliability coefficient must not be lower than $r_{MIN}$, the minimum acceptable level of reliability of the tested program, and the cost must not exceed the maximum acceptable level of costs incurred for testing.
Based on the introduced designations and the obtained relationships, the two-criteria optimization problem (15) of the program testing strategy can be formulated. The problem is determined by: the set of admissible solutions (strategies), i.e. the strategies S that satisfy the above limitations; the criteria r(S, P) and C(S, P), defined by (5) and (9) respectively; and the dominance relation Φ, defined on the so-called criterion space Y, i.e. the set of pairs (r(S, P), C(S, P)) obtained for the admissible strategies. In line with the aim of simultaneously maximizing reliability and minimizing cost, one pair of criterion values is preferred to another if its reliability is not lower and its cost is not higher.
Problem (15) is a non-linear integer two-criteria optimization problem. Its solution can be determined in accordance with the generally accepted methodology of solving polyoptimization problems [3]. According to this methodology, solving problem (15) may consist in particular in determining:
• a set of dominating strategies,
• a set of non-dominated strategies,
• a set of compromise strategies.
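Determining the non-dominated set can be sketched as a simple filter over the criterion space. The sketch assumes the usual Pareto dominance for this setting (reliability not lower, cost not higher, and the two points not identical). The function names and the sample points are hypothetical.

```python
def dominates(a, b):
    """True if point a = (reliability, cost) dominates point b.

    Reliability is maximized and cost is minimized, so a dominates b when
    it is at least as reliable, at most as costly, and not identical to b.
    """
    return a[0] >= b[0] and a[1] <= b[1] and a != b


def non_dominated(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (reliability, cost) pairs of admissible strategies
POINTS = [
    (0.965, 10.0),
    (0.970, 12.0),
    (0.965, 11.0),  # same reliability as the first point, higher cost
    (0.968, 15.0),  # less reliable and more costly than the second point
    (0.975, 20.0),
]

front = non_dominated(POINTS)
```

The filter compares every pair of points, which is adequate for the small strategy sets considered here.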
Due to the nature of dependencies (5) and (9), which determine the components of the criterion function of the problem in question, one should expect that the set of dominating solutions will be empty. In this situation, what matters in practice is determining the set of non-dominated solutions and possibly narrowing it, e.g. by defining a set of compromise solutions.

Numerical example
To illustrate the considerations presented in previous sections, a simple numerical example will be considered.
Let the graph of the tested program, characterizing its modular structure, be determined as in Fig. 1.
Assuming that all paths connecting the beginning node $i_{WE} = 1$ with the end node $i_{WY} = 7$ are logical paths, we obtain the set of logical paths of the program, where the sets of module numbers that make up the particular logical paths follow from the graph in Fig. 1. The values of the probabilities of activating the individual logical paths by a single test case are specified in Table 1. The remaining numerical data for which the calculations will be carried out are contained in Table 2.
The calculations will be carried out for the test scheme in which each cycle of testing a single module involves performing only one test, followed by evaluation of the test results and location and removal of errors, if the test showed their occurrence. According to the definition of the testing strategy (1), in this scheme the strategy of the i-th module is determined by $K_i$, the number of cycles of the i-th module testing process, within each of which, in accordance with the above assumption, exactly one test is performed.
A direct consequence of the assumed testing scheme is that, in the successive cycles of testing the individual modules, it is impossible to detect so-called repeated errors, i.e. errors detected by different tests. According to definition (3), the characteristic matrices of the individual modules take a correspondingly simple form. In the considered case, expressions (5)-(6), defining the program reliability coefficient after the end of the testing process implemented according to strategy S, are simplified and take the form
$$r(S) = \sum_{q} d_q \prod_{i \in I_q} r_i(S_i) \quad (20)$$
where $d_q$ is the probability of activating the q-th logical path, $I_q$ is the set of its module numbers, and the quantities $r_i(S_i)$ are defined as in [17].
The cost of the program testing process, according to dependence (9), is the sum over the modules of the cost of preparing and running the tests and the cost of locating and removing errors, where the expected number of errors whose detection is expected in the testing process of the i-th module, implemented according to the strategy $S_i$, is determined as in [16]. The values of coefficients (20) and (22), despite their relatively complex analytical form, can be easily computed numerically.
The initial value of the reliability coefficient of the analyzed program, i.e. its value before the start of the testing process, computed for the data from Tables 1 and 2 from the initial module reliability coefficients and the path activation probabilities, is r = 0.9609.
In the considered example, it is assumed that solving the two-criteria optimization problem (15) consists in determining the set of non-dominated solutions (strategies), i.e. the Pareto set [3].
The polyoptimization problem (15) will be solved for a set of constraints that includes the minimum reliability level $r_{MIN} = 0.9650$. The solution of problem (15) under these constraints, i.e. the set of non-dominated strategies, is presented in Table 3; the calculations were made for the data contained in Tables 1 and 2. This set was determined by exhaustive search (the full review method) using a computer. In the example under consideration, the set of non-dominated strategies contains 6 elements. According to dependence (1), each row in Table 3 defines an optimal testing strategy of the considered program, i.e. a strategy for which the reliability index cannot be increased without increasing the cost of testing, and the cost cannot be reduced without lowering the reliability index. The optimal values of these coefficients for the individual non-dominated strategies are also presented in Table 3.
The value of the reliability coefficient of the program under consideration, determined by dependence (20), depends not only on the values of the reliability coefficients of its component modules, but also on the so-called operational profile of the program, i.e. on the probabilities of activating the individual logical paths of the program by a single set of input data. For the second set of path activation probabilities (Table 4), the set of non-dominated strategies is presented in Table 5. In this case the set of non-dominated strategies, defined by the particular rows of Table 5, consists of 18 elements. Each strategy S presented in this table, determined in accordance with (1), has the property that it is a strategy with the minimum cost C(S) for the corresponding reliability coefficient r(S) of the program (or, respectively, a strategy with the maximum value of r(S) for the corresponding cost of testing C(S)). The values of the component functions r(S) and C(S) corresponding to the individual non-dominated strategies are also presented in Table 5. When the set of non-dominated solutions is so large that it is difficult to decide on one of them, an additional criterion can be used, called in polyoptimization theory the criterion of compromise; the resulting solution of the polyoptimization problem is called a compromise solution [3].
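One possible criterion of compromise is sketched below. It is an assumed choice, not one specified in the paper: the non-dominated point closest to the ideal point (highest reliability, lowest cost) is selected, after both criteria are scaled to the range 0-1. The function name and the sample front are hypothetical.

```python
def compromise(front):
    """Pick the (reliability, cost) point nearest to the ideal point.

    Both criteria are normalized to [0, 1] over the front so that neither
    dominates the distance merely because of its units.
    """
    r_best = max(r for r, _ in front)
    r_worst = min(r for r, _ in front)
    c_best = min(c for _, c in front)
    c_worst = max(c for _, c in front)

    def distance(point):
        r, c = point
        dr = (r_best - r) / (r_best - r_worst) if r_best > r_worst else 0.0
        dc = (c - c_best) / (c_worst - c_best) if c_worst > c_best else 0.0
        return (dr ** 2 + dc ** 2) ** 0.5

    return min(front, key=distance)


# Hypothetical non-dominated set of (reliability, cost) pairs
FRONT = [(0.965, 10.0), (0.970, 12.0), (0.975, 20.0)]

chosen = compromise(FRONT)
```

Other compromise criteria, for example a weighted sum of the normalized criteria, would fit the same interface and may select a different strategy.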
The analysis of the strategies presented in Tables 3 and 5 shows that, within the accepted limitations (including the limit on the total cost of the program testing process), optimal testing strategies give priority to testing those modules whose increase in reliability has the greatest impact on the increase in the reliability coefficient of the entire program, and those whose cost of testing is the lowest.

Summary
The final level of reliability of a software product is formed during all stages of the development process. The constantly increasing reliability requirements set for modern software systems, especially systems with responsible software, make it necessary to improve the design and implementation methods used in all stages of the process. Despite the continuous development and improvement of these methods, their current level does not fully guarantee the creation of a complex software product that is completely free of errors.
Software engineering practice shows that the largest share of the errors committed throughout the software production process is made in its initial stages, i.e. during requirements specification and design. For this reason, it is highly desirable to develop methods for verifying the correctness of the results of the initial stages of the software development process. Such methods would significantly increase the share of verification in error detection and thus reduce the number of errors detected at the testing stage. This would create a real basis for significantly shortening the software production cycle and reducing production costs.
The testing stage creates great opportunities to verify and shape software reliability, but it also significantly increases the cost of software production. The effectiveness of the work related to testing, expressed by the interdependence between the reliability level of the program product and the cost of testing it, strongly depends on the adopted testing strategy, which specifies the organization and scope of the work performed. There is therefore a need to define the conditions for a compromise between the reliability and cost requirements set for the software. Finding this compromise in practice is greatly facilitated if the level of software quality and the cost of testing can be formally assessed using appropriate indicators.
The paper has presented a method of determining a program testing strategy as the solution of a two-criteria optimization problem, with the program reliability coefficient and the cost of testing as component criteria. The paper consists of a description of the program testing process and a mathematical model of this process, a formulation of the problem of two-criteria optimization of the program testing strategy, and remarks on the method of solving the formulated problem. The proposed method of finding an optimal testing strategy has been illustrated by a numerical example.

Fig. 1. Graph of the example program

The last of the above constraints is a technical one and serves only to reduce the size of the two-criteria optimization problem.
It is assumed in the model that each test can detect at most one error. A description of methods useful for determining the operational profile of the program can be found in [15].

Table 1. Probability values for the activation of logical paths (set 1)

Table 2. Numerical data characterizing the modules of the program

Table 3. Set of non-dominated solutions for $r_{MIN}$ = 0.9650 and

Table 4. Probability values for the activation of logical paths (set 2)

Table 5. Set of non-dominated solutions for $r_{MIN}$ = 0.9690 and