The statistical learning problem of an artificial neural network for roofing process control

Software developed on the basis of artificial neural networks (ANNs) is now being actively adopted by construction companies to support decision-making in the organization and management of construction processes. Learning is the main stage of ANN development. A key question in supervised learning is how many training examples are needed to approximate the true relationship between network inputs and outputs with the desired accuracy. The design of an ANN architecture is also related to the learning problem known as the "curse of dimensionality". This problem is important for the study of construction process management because of the difficulty of obtaining training data from construction sites. In previous studies the authors designed a 4-layer feedforward ANN with a 12-5-4-1 unit model to estimate and predict the roofing process. This paper presents the statistical learning side of the created ANN with a simple-error-minimization (SEM) algorithm. The sample size sufficient for efficient training and the confidence interval of the network outputs are defined. In conclusion, the authors predict that the ANN can be successfully trained in a large construction company within a short space of time.


Introduction
In recent years, neural networks developed and tested for construction project organization and management have supported decision-making in construction process safety [1] and project management [2]; they are also able to predict and evaluate specific construction processes such as building underground structures [3], road building [4], earthwork [5], and others. The authors are conducting research into the organization and management of the construction process of roof structures at the planning and operation stages, assisted by the ANN methodology.

Previously, the network architecture was designed (Fig. 1) and the functioning of each layer was determined. The main parameters of the system are located on the second layer: $Q_w$ (qualification of workers), $C_s$ (construction supervision), $I_t$ (information technology), $T_s$ (team structure), and $W_c$ (weather condition). These parameters are divided into basic organizational and technological factors (for example, $g_1$, working experience and competencies; $g_4$, foreman). The third layer includes four evaluation criteria of the construction process ($T$, time; $S$, safety; $Q$, quality; $C$, cost), and the last layer merges all information passed through the network into an integrated indicator $P$ (construction process potential). To measure the ANN inputs, the authors used three linguistic variables represented by fuzzy sets [6]; the set of these scores over all factors forms the input data of the network.

ANN learning is a process whereby the weight coefficients of the synapses are tuned so that the network produces reliable results. A number of approaches to ANN learning exist. For a multilayer perceptron, supervised learning by means of the backward propagation of errors is one tried and trusted approach. This algorithm needs a training sample comprising an input signal vector and a corresponding output signal vector. Forming this sample in real construction conditions is a laborious process that requires open access of researchers to all stages and participants of the construction process. The organization of that access is impossible without the direct interest of the decision maker of the construction project. Therefore, it is very important to estimate in advance the training sample size that must be obtained from the construction site for successful learning of the created ANN. Providing enough training examples to make the learning algorithm statistically reliable is also a key objective in machine learning theory [7], and we discuss a solution to it for the developed ANN.
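The 12-5-4-1 unit model described above can be sketched as a plain feedforward pass. This is a minimal illustration only: the layer sizes are taken from the paper, while the random weights and the sigmoid activation are assumptions standing in for the authors' trained network.

```python
import numpy as np

# Layer sizes from the paper's unit model: 12 fuzzified factor inputs,
# 5 main parameters, 4 evaluation criteria (T, S, Q, C), and 1 integrated
# indicator P. The weights and the sigmoid activation are illustrative
# assumptions, not the authors' trained values.
rng = np.random.default_rng(0)
sizes = [12, 5, 4, 1]
weights = [rng.normal(scale=0.5, size=(m, n))
           for n, m in zip(sizes[:-1], sizes[1:])]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights):
    """Propagate a fuzzy-scaled input vector (values in [0, 1]) through the net."""
    a = np.asarray(x, dtype=float)
    for W in weights:
        a = sigmoid(W @ a)
    return a  # shape (1,): the integrated process potential P

x = rng.uniform(0, 1, size=12)   # fuzzified factor scores g_1 ... g_12
P = forward(x, weights)
print(float(P[0]))               # lies in (0, 1) by construction of the sigmoid
```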

Methods
ANN learning can be described as a search for a function $F(x, w)$ in the space of all possible functions $\mathcal{F}$ that approximates the behaviour of a real-life system with a permissible error and sufficient reliability. According to statistical learning theory [8], the objective is to minimize the risk functional over the parameter $w$:

$$R(w) = \int Q(x, y, w)\, dP(x, y), \qquad (1)$$

where $Q(x, y, w)$ is the loss function and $P(x, y)$ is the unknown joint probability distribution of the network's inputs and outputs.
Because recovering the density of the probability distribution from the sample is a labor-intensive task, all ANN learning algorithms based on precedents, backpropagation in particular, seek to minimize the empirical risk

$$R_{emp}(w) = \frac{1}{N}\sum_{i=1}^{N} Q(x_i, y_i, w). \qquad (2)$$

Since a training sample size is always finite, the main issue in reaching the stopping criterion is how well the function $F(x, w_{emp})$ obtained from experimental data can minimize the risk functional (1); in other words, how the error after learning will differ from the network's error in real-life operation. The solution to this problem is related to the condition of uniform convergence of empirical averages to expected values [9]:

$$\lim_{N \to \infty} P\left\{ \sup_w \left| R(w) - R_{emp}(w) \right| > \varepsilon \right\} = 0, \qquad (3)$$

which means that, with probability $1 - p$, the inequality $|R(w) - R_{emp}(w)| \le H$ holds simultaneously for all input values, where $H$ is the bound of the confidence interval, i.e., the error boundary.
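The convergence of the empirical risk to the expected risk can be shown with a small Monte-Carlo sketch. The "system" (a noisy linear relation) and the quadratic loss are assumptions made only for this demo; they play the roles of $P(x, y)$ and $Q(x, y, w)$ in the text.

```python
import numpy as np

# Monte-Carlo illustration of the empirical risk approaching the expected
# risk as the sample grows. The linear model F(x, w) = w * x, the noisy
# system y = x + noise, and the quadratic loss are illustrative assumptions.
rng = np.random.default_rng(1)

def loss(y_pred, y_true):
    return (y_pred - y_true) ** 2

def empirical_risk(w, xs, ys):
    # R_emp(w) = (1/N) * sum_i Q(x_i, y_i, w)
    return float(np.mean(loss(w * xs, ys)))

w = 0.8
risks = []
for N in (10, 100, 10_000):
    xs = rng.uniform(0, 1, size=N)
    ys = xs + rng.normal(scale=0.1, size=N)   # the "real-life system"
    risks.append(empirical_risk(w, xs, ys))
print(risks)  # fluctuates for small N, stabilises near the expected risk as N grows
```

For $w = 0.8$ the expected risk is $E[(0.8x - y)^2] = 0.04\,E[x^2] + 0.01 \approx 0.023$, and the largest sample lands close to that value.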
Analyzing the conditions of convergence (3) for training samples of finite length $N$, and generalizing the Glivenko theorem to the classification of images, the Soviet researchers V. Vapnik and A. Chervonenkis introduced the notion of the capacity $h$ of a function set $\mathcal{F}$, known internationally as the Vapnik-Chervonenkis dimension, or VC-dimension [10]. We used this parameter to find permissible conditions for uniform convergence for sample sizes $N > h$, from which one can isolate the classical expression for the error boundary:

$$H = \sqrt{\frac{h\left(\ln\frac{2N}{h} + 1\right) - \ln\frac{p}{4}}{N}}. \qquad (5)$$

Considering (4), (5) and the condition $N > h$, we find that the training sample size is bounded by the values of (6).
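The error boundary can be evaluated numerically. The bound form $H = \sqrt{(h(\ln(2N/h) + 1) - \ln(p/4))/N}$ used below is the classical Vapnik-Chervonenkis result; the values $h = 32$ (the paper's weight count) and $p = 0.05$ are illustrative choices.

```python
import math

# Numerical sketch of the classical VC error boundary and of the sample
# size needed to push it below a target. h = 32 and p = 0.05 are
# illustrative values, not results from the paper.
def vc_error_bound(N, h, p):
    return math.sqrt((h * (math.log(2 * N / h) + 1) - math.log(p / 4)) / N)

def sample_size_for(H_target, h, p):
    """Smallest power-of-two multiple of h whose error boundary is below H_target."""
    N = h  # regular convergence requires N > h
    while vc_error_bound(N, h, p) > H_target:
        N *= 2
    return N

h, p = 32, 0.05
for H_target in (0.5, 0.25, 0.1):
    print(H_target, sample_size_for(H_target, h, p))
```

The required $N$ grows roughly like $h / H^2$, which is why the purely theoretical estimate in the text comes out so conservative.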
The resulting theoretical training sample size sufficient to ensure statistical convergence of the ANN has very wide boundaries; one reason is that this is a conservative estimate for a scenario with random initial weight values, zero a priori information about the system except the number of links, and no consideration of the specifics of different training algorithms. Subsequently, known results from practical runs of different neural networks demonstrated that, to achieve statistical convergence in real-life tasks, probably approximately correct learning (PAC-learning) [13, 14] is preferable; it was also found that, for the classification problem, the complexity of the training sample depends more on the magnitudes of the weights than on their number [15]. For this reason, to evaluate the capacity of the function set $\mathcal{F}$, we proposed using the fat-shattering dimension $\mathrm{fat}_{\mathcal{F}}(H)$, which means that, with probability $1 - p$, the approximating function obtained from the learning sample gives a quadratic error not exceeding $H$ compared with the best function within the set $\mathcal{F}$. The estimate applies to a neural network with inputs from the interval $[0, 1]$, outputs bounded by $A$, and $l$ computing layers, each neuron of which has an activation function satisfying the Lipschitz condition (7). We used $\mathrm{fat}_{\mathcal{F}}(H)$ to find a sufficient training sample size meeting the condition of convergence (8), of the order given in (9).
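The practical advantage of the capacity-based refinement can be shown as an order-of-magnitude comparison: a conservative VC-style requirement $N \sim h / H^2$ against a PAC-style requirement driven by a smaller effective capacity $d$ (here standing in for the fat-shattering dimension at scale $H$). The constants and the value $d = 8$ are illustrative assumptions, not the paper's results.

```python
import math

# Order-of-magnitude comparison of two sample-size requirements.
# order_vc uses the full parameter count h; order_pac uses a smaller
# effective capacity d (a stand-in for the fat-shattering dimension).
# Constants are dropped; d = 8 and delta = 0.05 are assumptions.
def order_vc(h, H):
    return math.ceil(h / H**2)

def order_pac(d, H, delta=0.05):
    return math.ceil((d + math.log(1 / delta)) / H**2)

for H in (0.5, 0.25, 0.1):
    print(H, order_vc(32, H), order_pac(8, H))
```

Whenever the effective capacity is smaller than the raw weight count, the PAC-style requirement is proportionally smaller, which is the point of the weight-magnitude-based bounds cited above.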

Results
The ANN built by the authors has three computing layers ($l = 3$), and its activation function meets the Lipschitz condition. Because any high-rise construction process, including one related to roofing, is a complex system with a long life cycle, a considerable human factor, and unpredictable random disturbances, the construction developer's main objective is to keep the key process variables, which combined constitute the potential $P$, within specific preset boundaries that determine the system's stable equilibrium. The authors identified such boundaries, within which only one linguistic term can be assigned to the measured component (Fig. 2). The confidence interval (12) of the output values of the ANN is narrower than the stable equilibrium interval; therefore, we can use the resulting interval to successfully evaluate and monitor the dynamics of both the integrated construction process potential $P$ and its values by any one of the criteria ($T$, $S$, $Q$, $C$). As an example, to evaluate $P$ based on the quality criterion, we assign the value $w_{QP} = 1$ to the quality criterion $Q_P$ and set the remaining weights of neuron $P$ to zero. The resulting confidence interval is again narrower than (12), considering the statistical aspects of learning discussed above.
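Mapping a measured potential value to exactly one linguistic term, and checking that it stays inside a confidence interval, can be sketched as follows. The boundary values are illustrative assumptions standing in for the stable-equilibrium intervals the authors identified (Fig. 2).

```python
# Assigning a single linguistic term to a measured potential value.
# The three intervals below are illustrative placeholders for the
# stable-equilibrium boundaries of Fig. 2, not the authors' values.
TERMS = [
    ("low",    0.00, 0.33),
    ("medium", 0.33, 0.66),
    ("high",   0.66, 1.00),
]

def linguistic_term(p):
    """Return the single term whose interval contains p (1.0 falls in the last term)."""
    for name, lo, hi in TERMS:
        if lo <= p < hi or (hi == 1.00 and p == 1.00):
            return name
    raise ValueError(f"potential {p} outside [0, 1]")

def within_interval(p, center, half_width):
    """True if p lies in the confidence interval [center - hw, center + hw]."""
    return center - half_width <= p <= center + half_width

print(linguistic_term(0.7))              # a value in the "high" interval
print(within_interval(0.7, 0.72, 0.05))  # the measurement stays inside the interval
```

Because each interval admits only one term, a confidence interval narrower than the interval width guarantees an unambiguous linguistic assessment of $P$.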

Discussion
The final stage of learning for the created ANN is its test run on real-life construction projects. If the construction company wants the ANN to be efficiently trained and its project managers get actively involved, up to 2-3 training examples can be collected in a week of construction activities. Since the average life cycle of a roofing process is 2 to 3 months, a single project can realistically yield a sample size of N = 20-30. A large construction business, particularly one focused on roofing, or a specialized construction auditor may handle up to ten construction projects at the same time, so the average training sample size can reach approximately N = 800-1000 in a year. The theoretical results on sufficient training sample size presented in this paper, combined with the SEM algorithm, suggest that after a year or two of training in a major construction business the created ANN will be able to produce results with the accuracy of (12). Meanwhile, our estimates (17), (22)-(23) lead us to believe that growth of the sample size tends to make the confidence interval of the ANN outputs (12) narrower, and thus more accurate. Provided the stability of the learning algorithm is kept under control, this will ensure continuous progress of the created ANN.
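The accumulation rates quoted above can be checked with simple arithmetic. The collection rate (2-3 examples per week) and project counts come from the text; the weeks-per-month constant is an assumption of this sketch.

```python
# Back-of-the-envelope check of the sample accumulation rates in the text:
# 2-3 examples/week per project over a 2-3 month roofing life cycle, and
# up to ten simultaneous projects over a year. WEEKS_PER_MONTH is an
# averaging assumption for this sketch.
WEEKS_PER_MONTH = 4.33

def per_project(examples_per_week, months):
    return examples_per_week * WEEKS_PER_MONTH * months

def per_company(examples_per_week, projects, weeks_per_year=52):
    return examples_per_week * projects * weeks_per_year

print(per_project(2.5, 2.5))   # a single project: within the quoted 20-30
print(per_company(2, 10))      # ten projects over a full year
```

The full-year figure slightly exceeds the quoted 800-1000, which is consistent with projects not all running year-round.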

This expression means that the sample size depends on $h$ alone. For a neural network with a sigmoid activation function, the established bounds on the VC-dimension [11, 12] are polynomial in $W$, where $W$ is the number of free network parameters, i.e., synaptic weights. Training of the created network stops at layer 3, and the synaptic weights of neuron $P$ are excluded from the training array; then $W = 32$, and the VC-dimension has the corresponding orders. With a bounded sum of weights $\sum |w| \le V$, the fat-shattering dimension yields the sample-size estimate used above.
If we use a probably approximately correct learning model with the SEM algorithm, where the convergence condition (3) is described through the fat-shattering dimension $\mathrm{fat}_{\mathcal{F}}(H)$ of the loss-function class, one can successfully determine the complexity of the training set required for convergence of a neural network in problems of complex classification and in the recovery of regression.