Estimation of total time on test for large samples

The aim of the present paper is to identify total time on test (TTT) estimators which require only partial information on sampling units. This allows an important simplification of the testing or observation procedure. The approach is based on dividing the testing/observation period into intervals, so that the TTT estimators can be determined using only numerical attributes of the classes. The division is based either on failure rank or on time. A case study based on real field data is carried out in order to assess the effectiveness of the TTT estimators.


Introduction
The total time on test (TTT) is a concept introduced in the second half of the last century, developed and popularized in the seventies especially by Barlow and Campo [1]. Since then, it has been a subject of interest to many researchers, and various new applications of the concept have been defined and studied (see, e.g., [2][3][4][5]). Many of these developments have proven very useful in reliability analysis, particularly those concerning the identification of the mathematical reliability model and the characterisation of the failure rate.
Although the TTT concept is mainly used in reliability analysis, it has recently also proved useful in other areas such as economics, risk assessment, maintenance scheduling, etc. [3,5].
When several units are simultaneously put under test (or observed in real operation), the units fail at different times. Therefore, at each moment during the test there will be a number of units still in operation and a number of units that have failed. The TTT statistic is given by the sum of the life lengths of all units, either completed (units that have failed) or incomplete (units that have not failed).
The most common use of TTT is to obtain information about the failure rate, based on the shape of the so-called TTT transform plot. An important property of this plot is that its slope at any point (i.e. at any moment of time) is the inverse of the failure rate at that point. This type of analysis can be performed for both complete and incomplete data and it typically requires evaluating the TTT for each time-to-failure, i.e. for the failure time of every tested/observed unit. Since it is increasingly difficult to obtain complete information about the lifetime of products, the present paper investigates the estimation of TTT in the case of large samples, assuming that not all the units' times-to-failure are known.

The total time on test concept
Given a non-negative random variable X with distribution function F(x), the total time on test (TTT) transform of X is defined as

$$T(u) = \int_0^{F^{-1}(u)} \left[1 - F(x)\right]\,dx, \quad 0 \le u \le 1, \qquad (1)$$

where

$$F^{-1}(u) = \inf\{x : F(x) \ge u\} \qquad (2)$$

is the quantile function of X.
The scaled TTT transform is given by the ratio

$$T_s(u) = \frac{T(u)}{T(1)} \qquad (3)$$

and the plot of $T_s(u)$ versus u is the scaled TTT transform curve, on the basis of which the failure rate can be characterised. The shape of the curve makes it possible to decide whether the failure rate is constant, increasing or decreasing with time.
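As a brief worked illustration of how the curve's shape reflects the failure rate, consider the exponential distribution with constant failure rate $\lambda$. The derivation below assumes the usual form of the transform, $T(u) = \int_0^{F^{-1}(u)} [1 - F(x)]\,dx$, scaled by $T(1)$; the symbol $\lambda$ is introduced here only for the example.

```latex
\begin{align*}
F(x) &= 1 - e^{-\lambda x}, \qquad F^{-1}(u) = -\frac{\ln(1-u)}{\lambda},\\
T(u) &= \int_0^{F^{-1}(u)} e^{-\lambda x}\,dx
      = \frac{1 - e^{-\lambda F^{-1}(u)}}{\lambda}
      = \frac{u}{\lambda},\\
T_s(u) &= \frac{T(u)}{T(1)} = \frac{u/\lambda}{1/\lambda} = u .
\end{align*}
```

Under these assumptions a constant failure rate gives the diagonal of the unit square, which is the reference against which a curved scaled TTT plot is usually judged.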
In most cases, the random variable X is the lifetime (time of operation without failure) of the units whose reliability is analysed. In this case, the TTT transform can be expressed as [6]:

$$T(u) = \int_0^{t_u} R(t)\,dt, \qquad (4)$$

where R(t) denotes the reliability function and $t_u = F^{-1}(u)$. If $t_1 \le t_2 \le \dots \le t_n$ are the successive failure times of a sample of n units put on test at time 0, the empirical TTT up to the i-th order statistic is given by [6]

$$TTT(n,i) = \sum_{j=1}^{i} t_j + (n - i)\,t_i, \qquad (5)$$

which represents the total time on test of the n simultaneously tested units accumulated up to the time of the i-th failure.
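The empirical TTT up to each failure can be sketched in Python as follows. This is a minimal illustration, assuming the form "sum of the first i failure times plus (n - i) times the i-th failure time"; the function name `empirical_ttt` and the sample failure times are hypothetical and not taken from the paper.

```python
from itertools import accumulate


def empirical_ttt(failure_times, n=None):
    """Return [TTT(n,1), ..., TTT(n,k)] for the k observed failure times.

    n is the number of units put on test; it defaults to k (complete data).
    """
    t = sorted(failure_times)
    k = len(t)
    if n is None:
        n = k
    if n < k:
        raise ValueError("n cannot be smaller than the number of failures")
    cumulative = list(accumulate(t))  # running sum of the ordered failure times
    # TTT(n, i) = sum_{j<=i} t_j + (n - i) * t_i, for i = 1..k
    return [cumulative[i - 1] + (n - i) * t[i - 1] for i in range(1, k + 1)]


# Hypothetical failure times, for illustration only.
complete = empirical_ttt([2.0, 5.0, 9.0])          # all 3 units failed
censored = empirical_ttt([9.0, 2.0, 5.0], n=5)     # 5 units on test, 3 failures
```

Note that every individual failure time is needed here, which is exactly the requirement the estimators discussed later try to relax.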
The TTT transform plot is obtained by plotting the points (i/n, TTT(n,i)/TTT(n,n)) in the case of complete data, or (i/k, TTT(n,i)/TTT(n,k)) in the case of incomplete data, when the test ends after the k-th failure (with i = 1, …, k).
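A minimal Python sketch of how the plot points might be produced is given below. It covers both cases: with `n` omitted the data are treated as complete, otherwise the test is assumed to stop at the last observed failure. The function name `ttt_plot_points` and the sample values are hypothetical.

```python
def ttt_plot_points(failure_times, n=None):
    """Return the points (i/k, TTT(n,i)/TTT(n,k)) for i = 1..k.

    k is the number of observed failures; n defaults to k (complete data).
    """
    t = sorted(failure_times)
    k = len(t)
    n = k if n is None else n
    running_sum = 0.0
    ttt = []
    for i, t_i in enumerate(t, start=1):
        running_sum += t_i
        # total time accumulated by all n units up to the i-th failure
        ttt.append(running_sum + (n - i) * t_i)
    total = ttt[-1]  # TTT(n, k), used as the scaling constant
    return [(i / k, value / total) for i, value in enumerate(ttt, start=1)]


# Hypothetical failure times, for illustration only.
points_complete = ttt_plot_points([2.0, 5.0, 9.0])
points_censored = ttt_plot_points([2.0, 5.0, 9.0], n=5)
```

The returned pairs can be passed directly to any plotting routine; the last point is always (1, 1) by construction.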

Estimation of TTT for large samples
As seen above, plotting the TTT transform curve requires the TTT value at each failure time, so the moment of failure must be known for every tested or observed unit. This is inconvenient for units observed in real operation, especially for large samples. For this reason, the aim of the present paper is to investigate algorithms for estimating TTT in the case of large data samples.
The idea is to obtain an estimate of TTT without needing to know the failure times of all the units observed or tested. This would both reduce the amount of calculation required and, above all, simplify the procedure of obtaining data during the entire testing or observation period.
The natural solution to simplify the procedure is to divide the testing period into intervals and to evaluate TTT only at the limits of these intervals instead of calculating it for each failure time of the tested units. Under these circumstances, the questions that must be answered are:
- On what basis should the observation period be divided into classes?
- How (i.e. using what formula) should the TTT be estimated for each class?
Regarding the first question, two solutions may be considered for distributing the data into classes:
- the quantiles method: each class contains an equal number of failed units;
- the equal intervals method: each class has the same duration.
In other words, the first method is based on failure rank while the second is based on time. Note that if the first method is used, the points of the TTT transform plot will have equidistant abscissae.
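The two ways of forming classes can be sketched in Python as below. This is only an illustration: the function names and the sample data are hypothetical, the upper limit of a quantile class is assumed to be the time of its last failure, and an equal-interval class j is assumed to cover the half-open interval ((j-1)·t_c, j·t_c].

```python
import math


def upper_limits_by_quantiles(failure_times, failures_per_class):
    """Quantiles method: equal number of failures per class.

    Returns the upper limit of each class, assumed here to be the time
    of the last failure falling in that class.
    """
    t = sorted(failure_times)
    if len(t) % failures_per_class != 0:
        raise ValueError("number of failures must be a multiple of failures_per_class")
    return [t[end - 1] for end in range(failures_per_class, len(t) + 1, failures_per_class)]


def counts_by_equal_intervals(failure_times, interval_length, num_classes):
    """Equal intervals method: every class has the same duration.

    Returns the number of failures observed in each class.
    """
    counts = [0] * num_classes
    for t in failure_times:
        j = max(1, math.ceil(t / interval_length))  # 1-based class index
        if j > num_classes:
            raise ValueError("failure time beyond the observation period")
        counts[j - 1] += 1
    return counts


# Hypothetical failure times (days), for illustration only.
times = [3, 8, 12, 19, 27, 30]
limits = upper_limits_by_quantiles(times, failures_per_class=2)
counts = counts_by_equal_intervals(times, interval_length=10, num_classes=3)
```

In the first case only the class limits have to be recorded (the count per class is fixed in advance); in the second only the counts have to be recorded (the limits are fixed in advance).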
Regarding the second question, two solutions may be considered for estimating TTT for each interval:
- to evaluate TTT using the upper limit of the class interval;
- to evaluate TTT using the midpoint of the class interval.
In either case, equation (5) can be used to estimate TTT, with the observation that this time the $t_j$ are no longer failure times of individual units but class interval parameters (upper limit or midpoint, as the case may be).
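If the upper limit of the intervals is used, the TTT estimate is given by

$$\widehat{TTT}_s(n,i) = \sum_{j=1}^{i} r_j\,t_{sj} + \Big(n - \sum_{j=1}^{i} r_j\Big)\,t_{si}, \qquad (6)$$

where $r_j$ is the number of failures in class j and $t_{sj}$ is the upper limit of class interval j.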
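A minimal Python sketch of the upper-limit estimator follows. It assumes that every failure in class j is assigned the upper limit of that class and that the units still in operation have accumulated the upper limit of the current class; the function name `ttt_upper_limit` and the sample values are hypothetical.

```python
def ttt_upper_limit(n, failures, upper_limits):
    """Estimate TTT at the end of each class using class upper limits.

    n            -- number of units on test
    failures     -- r_j, number of failures in class j
    upper_limits -- t_sj, upper limit of class j
    """
    estimates = []
    weighted_sum = 0.0   # sum of r_j * t_sj over the classes seen so far
    failed = 0           # cumulative number of failed units
    for r_j, t_sj in zip(failures, upper_limits):
        weighted_sum += r_j * t_sj
        failed += r_j
        # surviving units have each accumulated t_sj at the end of the class
        estimates.append(weighted_sum + (n - failed) * t_sj)
    return estimates


# Hypothetical grouped data, for illustration only.
upper_estimates = ttt_upper_limit(n=6, failures=[2, 2, 2], upper_limits=[8, 19, 30])
```

Because each failure is dated at the end of its class, this estimator cannot fall below the exact TTT evaluated at the same instants.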
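If the midpoint of the intervals is used, TTT can be estimated by a similar equation, except that the first term uses the class interval midpoint $t_{mj}$ instead of its upper limit $t_{sj}$:

$$\widehat{TTT}_m(n,i) = \sum_{j=1}^{i} r_j\,t_{mj} + \Big(n - \sum_{j=1}^{i} r_j\Big)\,t_{si} \qquad (7)$$

Equations (6) and (7) can be further particularized according to each considered case. If the quantiles method is used, equations (6) and (7) become, respectively:

$$\widehat{TTT}_s(n,i) = r_c \sum_{j=1}^{i} t_{sj} + (n - i\,r_c)\,t_{si} \qquad (8)$$

$$\widehat{TTT}_m(n,i) = r_c \sum_{j=1}^{i} t_{mj} + (n - i\,r_c)\,t_{si} \qquad (9)$$

where $r_c$ denotes the number of failures in each class (the same for all classes). Furthermore, for a complete test, when $n = p\,r_c$, where p is the number of classes, equations (8) and (9) can be written in the form

$$\widehat{TTT}_s(n,i) = r_c \Big[\sum_{j=1}^{i} t_{sj} + (p - i)\,t_{si}\Big] \qquad (10)$$

and

$$\widehat{TTT}_m(n,i) = r_c \Big[\sum_{j=1}^{i} t_{mj} + (p - i)\,t_{si}\Big] \qquad (11)$$

It can be seen from equations (10) and (11) that, whether the upper limits or the midpoints of the class intervals are used, the only data necessary to estimate TTT (given that the number of failed units per class is set in advance and assuming that the total number of classes is known) are the limits of the time intervals.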
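The midpoint variant can be sketched in Python as follows. It assumes that the first class starts at time 0, that each midpoint is the average of the lower and upper limits of its class, and that surviving units are still credited with the upper limit of the current class; the function name `ttt_midpoint` and the sample values are hypothetical.

```python
def ttt_midpoint(n, failures, upper_limits):
    """Estimate TTT at the end of each class using class midpoints.

    Failed units in class j are credited with the midpoint t_mj of the
    class; units still operating are credited with the upper limit t_si.
    """
    estimates = []
    weighted_sum = 0.0   # sum of r_j * t_mj over the classes seen so far
    failed = 0
    lower = 0.0          # assumed start of the first class
    for r_j, t_sj in zip(failures, upper_limits):
        t_mj = (lower + t_sj) / 2
        weighted_sum += r_j * t_mj
        failed += r_j
        estimates.append(weighted_sum + (n - failed) * t_sj)
        lower = t_sj     # the next class starts where this one ends
    return estimates


# Hypothetical quantile classes: r_c = 2 failures per class, p = 3 classes.
midpoint_estimates = ttt_midpoint(n=6, failures=[2, 2, 2], upper_limits=[8, 19, 30])
```

With an equal number of failures per class and a complete test, the same numbers follow from the factored form r_c·[Σ t_mj + (p − i)·t_si], so only the class limits need to be recorded.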
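If the "equal intervals" method is used, equations (6) and (7) become, respectively

$$\widehat{TTT}_s(n,i) = t_c \Big[\sum_{j=1}^{i} j\,r_j + i\,\Big(n - \sum_{j=1}^{i} r_j\Big)\Big] \qquad (12)$$

and

$$\widehat{TTT}_m(n,i) = t_c \Big[\sum_{j=1}^{i} \Big(j - \tfrac{1}{2}\Big)\,r_j + i\,\Big(n - \sum_{j=1}^{i} r_j\Big)\Big] \qquad (13)$$

where $t_c$ denotes the length of each class interval (the same duration for all intervals), hence $t_{sj} = j\,t_c$ for j = 1, …, i. It can be seen from equations (12) and (13) that, in the case of the "equal intervals" method as well, whether the upper limits or the midpoints of the class intervals are used, the only data necessary to estimate TTT (given that the size of the class interval $t_c$ is set in advance) are the numbers of failed units in each class.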
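For the "equal intervals" case, a minimal Python sketch is given below. It assumes class j ends at j·t_c and has its midpoint at (j − 1/2)·t_c; the function name `ttt_equal_intervals`, the `use_midpoint` flag and the sample counts are hypothetical.

```python
def ttt_equal_intervals(n, failures, interval_length, use_midpoint=False):
    """Estimate TTT at the end of each class for classes of equal duration.

    Only the failure counts r_j per class are needed; class j is assumed
    to end at j * t_c and to have its midpoint at (j - 0.5) * t_c.
    """
    offset = 0.5 if use_midpoint else 0.0
    estimates = []
    weighted = 0.0   # sum of (j - offset) * r_j over the classes seen so far
    failed = 0
    for j, r_j in enumerate(failures, start=1):
        weighted += (j - offset) * r_j
        failed += r_j
        # surviving units have each accumulated j * t_c
        estimates.append(interval_length * (weighted + j * (n - failed)))
    return estimates


# Hypothetical counts per 10-day class, for illustration only.
by_upper = ttt_equal_intervals(n=6, failures=[2, 2, 2], interval_length=10)
by_midpoint = ttt_equal_intervals(n=6, failures=[2, 2, 2], interval_length=10,
                                  use_midpoint=True)
```

For these toy counts the two variants differ by half an interval per failed unit, which is the whole difference between the upper-limit and midpoint estimators in this setting.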
The estimators given by equations (10) … (13) cover the four possible combinations of the considered cases, concerning the division into classes and the data used to evaluate TTT. A plan for testing a set of sampling units, or for observing them in real operation, can be set up on the basis of any of them. The choice of a suitable plan depends on the specific conditions related to the features of the statistical units, the possible testing/observing procedures, the available resources, etc.

Case study
A case study is carried out in this section in order to assess the effectiveness of the estimators presented above. The study is based on real field data, observed and collected during the actual operation of freight wagons [7]. The sample comprises 1802 units, of which n = 190 recorded failures over the observation period of 350 days.
As the 190 failure times are precisely known, the TTT value can be accurately calculated at any time. This (real) value can be used as a basis of comparison for the estimated values of TTT, obtained using the expressions derived in the previous section.
Tables 1 and 2 present the actual and the estimated values of TTT under the two hypotheses concerning the distribution of data into classes: equal time and equal number of failed units, respectively. In each scenario, the calculation was made using both the upper limit and the midpoint of the class interval. TTT was calculated as the total operation time accumulated by the 190 units that failed during the observation period, which was divided into 10 intervals (35 days or 19 failed units each).
For an overall evaluation, the root mean square error (RMSE) and the coefficient of variation (CV) were used:

$$RMSE = \sqrt{\frac{1}{p}\sum_{j=1}^{p}\left(\widehat{TTT}_j - TTT_j\right)^2}, \qquad CV = \frac{RMSE}{m(TTT)}$$

where $\widehat{TTT}_j$ and $TTT_j$ are the estimated and the actual TTT values for class j, p is the number of classes and m(TTT) is the mean of the actual TTT values. Note that CV is expressed as a percentage in tables 1 and 2. The values of CV and RMSE in tables 1 and 2 show that using the midpoint of the class interval is clearly preferable to using the upper limit. Also, division into classes according to the quantiles method (equal number of failures per class) seems to be preferable.
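The two summary measures can be sketched in Python as follows. The sketch assumes that RMSE is the root of the mean squared difference between estimated and actual TTT over the p classes, and that CV is RMSE divided by the mean of the actual values, expressed as a percentage; the function name `rmse_and_cv` and the numbers are hypothetical and are not the values from tables 1 and 2.

```python
import math


def rmse_and_cv(actual, estimated):
    """Return (RMSE, CV in percent) for paired actual/estimated TTT values."""
    if len(actual) != len(estimated) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    p = len(actual)  # number of classes
    squared_errors = sum((e - a) ** 2 for a, e in zip(actual, estimated))
    rmse = math.sqrt(squared_errors / p)
    mean_actual = sum(actual) / p  # m(TTT)
    return rmse, 100.0 * rmse / mean_actual


# Hypothetical values, for illustration only.
rmse, cv = rmse_and_cv(actual=[10, 20, 30, 40], estimated=[12, 18, 32, 38])
```

Dividing by the mean of the actual values makes the error comparable across scenarios whose TTT values differ in magnitude.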
An additional analysis (table not included) with 19 classes of 10 failed units each resulted in the following values of CV: 4.153 % (using the upper limit) and 0.684 % (using the midpoint). Compared with the previous results for 10 classes (see table 2), 8.90 % and 0.337 %, respectively, this shows that increasing the number of intervals (a more detailed analysis) does not necessarily have a positive impact in every situation: the CV decreased for the upper-limit estimator but increased for the midpoint estimator.

Conclusion
Given the difficulty of obtaining complete information about the lifetime of products, the aim of the present paper was to identify TTT estimators which require only partial information. This is especially convenient for large samples, as it implies a major simplification of the procedure of obtaining data during the entire testing or observation period.
The approach was based on dividing the testing period into intervals, so that the TTT estimators can be determined using only numerical attributes of the classes. The division was based either on failure rank or on time, and TTT was estimated using either the class upper limit or the class midpoint. The TTT estimators were obtained for each considered case, the only necessary data being the number of failed units in each class or the limits of the time intervals (as the case may be). Based on the data required, plans can be established for testing a set of sampling units or for observing them in real operation.
A case study based on real field data was carried out in order to assess the effectiveness of the TTT estimators. The root mean square error (RMSE) and the coefficient of variation (CV) were used for an overall evaluation. The numerical results showed that using the midpoint of the class interval is more suitable and that division into classes according to the quantiles method seems to be advantageous.
However, the choice of a suitable testing plan depends on the specific conditions related to the features of the statistical units, the possible testing/observing procedures, the available resources, etc.