Application of selected computational intelligence methods to sound level modelling based on traffic intensity in thoroughfare

The aim of the paper was to build models of sound pressure level as a function of traffic intensity in a thoroughfare. The models were built using either analytical models or regression trees. The former included the Nordic Prediction Method. The latter were represented by Random Forest and Cubist. The accuracy of all obtained models was analysed. The best models can be used in the reconstruction of equivalent sound level data.


Traffic intensity and sound level measurement results
The level of road traffic noise depends on many parameters. Some of them remain constant over long periods of time (road geometry, type and condition of the road surface), while others, which are connected with traffic intensity, can change very quickly: the number of trucks, the number of other vehicles, and the average vehicle speed.
In this paper, measurement data recorded at Krakowska Street in Kielce (Poland) were used to build and evaluate models of sound pressure level as a function of traffic intensity in a thoroughfare. The number of vehicles, their category and speed, as well as equivalent sound pressure level values, were measured and recorded in a monitoring station comprising a sound level meter (SVAN 958A, a digital, four-channel, class 1 vibration and sound meter), a weather station, and a road radar [2]. A-weighted equivalent sound pressure level values, $L_{Aeq}$, were recorded every minute. Next, using equation (1), $L_{Aeq}$ values were calculated for longer periods of time, namely for the following sub-intervals of the 24-hour day: day (6-18), evening (18-22), and night (22-6). For the same time periods, vehicle speed values were averaged and the numbers of vehicles were summed.
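Equation (1) is not reproduced in this section. The sketch below assumes it is the standard energy (logarithmic) average of equal-duration one-minute levels, $L_{Aeq,T} = 10 \log_{10}\left(\frac{1}{n}\sum_{i=1}^{n} 10^{0.1 L_i}\right)$. The function names are illustrative and do not come from the monitoring software.

```python
import math


def equivalent_level(levels_db):
    """Energy-average equal-duration sound levels (dB) into one equivalent level.

    Assumption: every input value covers the same time span (e.g. one minute).
    """
    if not levels_db:
        raise ValueError("at least one level is required")
    # Convert each level to relative sound energy, average, convert back to dB.
    mean_energy = sum(10 ** (0.1 * level) for level in levels_db) / len(levels_db)
    return 10 * math.log10(mean_energy)


def subinterval(hour):
    """Map an hour of day (0-23) to the sub-interval named in the text."""
    if 6 <= hour < 18:
        return "day"
    if 18 <= hour < 22:
        return "evening"
    return "night"
```

Because the average is taken over energies rather than decibel values, a few loud minutes dominate the result: one minute at 70 dB and one at 60 dB give roughly 67.4 dB, not 65 dB.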
During the year 2013, there were only 296 complete records for day sub-intervals, i.e. records (days) without missing measurement values, as shown in Fig. 1. Each record contains the values of the A-weighted equivalent sound level, the number of light vehicles, the number of heavy vehicles, and the average speed of all vehicles. In incomplete records, the most frequently missing value is $L_{Aeq}$, while the traffic intensity parameters (average speed, number of light and heavy vehicles) are available for almost all days in 2013. A model that describes the relationship between sound level and traffic intensity parameters can therefore help to reconstruct missing equivalent sound level data. Selected plot matrices are shown in Fig. 3. For average speed values above 50 km/h, $L_{Aeq}$ decreases as the average vehicle speed increases, which may seem counter-intuitive but is easy to explain: a higher average speed means that traffic density is lower and fewer vehicles pass the monitoring station. Fig. 3b shows that $L_{Aeq}$ increases with an increasing number of light vehicles.
For lower vehicle speeds, $30 \le v_1 < 50$ and $30 \le v_2 < 40$, equations (5) and (6) are used instead of (1) and (2).
When the Nordic Prediction Method model was applied to the calculation of $L_{Aeq}$ using the $N_1$, $N_2$, and $v$ values (assuming $v_1 = v_2$) taken from each of the 296 records mentioned in Section 1, the values calculated by the model were on average 2.3 dB higher than the recorded values.
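The 2.3 dB figure is a mean signed difference between modelled and recorded levels. A minimal sketch of that comparison is given below. It uses made-up numbers rather than the measured data, and the function name is hypothetical.

```python
def mean_bias(predicted_db, measured_db):
    """Mean signed difference (predicted - measured) in dB.

    A positive result means the model overestimates on average.
    """
    if len(predicted_db) != len(measured_db) or not predicted_db:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p - m for p, m in zip(predicted_db, measured_db)) / len(predicted_db)


# Illustrative values only, not data from the monitoring station.
predicted = [70.0, 71.5, 69.0]
measured = [68.0, 69.0, 67.0]
bias = mean_bias(predicted, measured)
```

Unlike the mean absolute error, the signed mean keeps the direction of the error, so a systematic overestimate is not hidden by averaging magnitudes.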
Another model, described in [6], requires vehicle speed data separately for each of four vehicle categories, as well as calculations of sound propagation. For these reasons, the author decided to propose a new model (Section 3), inspired by the Nordic Prediction Method. Other studies showed that regression as well as artificial neural networks can be used for modelling $L_{Aeq}$ as a function of traffic intensity parameters [7], or even without direct use of vehicle volume and speed [8,9].

Cubist and Random Forest models of sound level
Computational intelligence methods such as artificial neural networks, fuzzy systems, random forest regression, or regression trees can be used to construct a model that describes the relation between input and output attributes in a given dataset. Two approaches from the field of regression forests and regression trees were employed in this study, namely the Random Forest algorithm implemented in WEKA [3] and the Cubist software [4].
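To illustrate the idea of a regression tree, the sketch below grows a small tree that splits the records on one input attribute at a time and predicts the mean output value in each leaf. It is a toy illustration under simplifying assumptions, not the WEKA Random Forest (which averages many randomised trees) and not Cubist (which builds rule-based models with linear models in the leaves). All names and data values are hypothetical.

```python
def sse(values):
    """Sum of squared deviations of the values from their mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def build_tree(rows, targets, min_leaf=2, max_depth=3, depth=0):
    """Grow a regression tree; each leaf stores the mean target of its records."""
    leaf = {"value": sum(targets) / len(targets)}
    if depth >= max_depth or len(rows) < 2 * min_leaf:
        return leaf
    best = None  # (cost, feature index, threshold)
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [t for row, t in zip(rows, targets) if row[feature] <= threshold]
            right = [t for row, t in zip(rows, targets) if row[feature] > threshold]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                best = (cost, feature, threshold)
    # Stop when no admissible split reduces the squared error.
    if best is None or best[0] >= sse(targets):
        return leaf
    _, feature, threshold = best
    go_left = [row[feature] <= threshold for row in rows]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree(
            [r for r, g in zip(rows, go_left) if g],
            [t for t, g in zip(targets, go_left) if g],
            min_leaf, max_depth, depth + 1),
        "right": build_tree(
            [r for r, g in zip(rows, go_left) if not g],
            [t for t, g in zip(targets, go_left) if not g],
            min_leaf, max_depth, depth + 1),
    }


def predict(tree, row):
    """Follow the splits from the root down to a leaf and return its value."""
    while "feature" in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["value"]


# Illustrative values only: one input attribute and a sound level in dB.
rows = [[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]]
targets = [60.0, 60.0, 60.0, 70.0, 70.0, 70.0]
tree = build_tree(rows, targets)
```

A single tree of this kind yields a piecewise-constant prediction. Averaging many trees grown on resampled data, or fitting linear models in the leaves, are two ways of smoothing that behaviour.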
The training dataset consisted of 296 records, each describing one day sub-interval (6-18) in the year 2013. The dataset contains only records for which the sound level, the number of vehicles, and their speed were recorded (there are no missing values). Each record contained one output attribute, db_A ($L_{Aeq}$, the A-weighted equivalent sound level), and three input attributes, log_N_light, log_N_heavy, and log_avg_speed, which are the base-10 logarithms of the number of light vehicles, the number of heavy vehicles, and the average speed of all vehicles, respectively. These attributes were selected according to equations (2)-(4), which suggest that there is a correlation between these attribute values and the $L_{Aeq}$ value. Several methods were applied to the training set in order to obtain a model of $L_{Aeq}$. The scatter plot for this model is shown in Fig. 4. Other models, created by the Random Forest method [10], also using the 296 records from the training dataset, were much more complicated (100 large regression trees, i.e. one 'random forest', in each model). The Random Forest method was run with random selection of k = 0, 1, 2, and 3 attributes. The accuracy of all elaborated models, expressed by mean absolute error (MAE), root mean square error (RMSE), and correlation coefficient, is shown in Table 1. Values of $L_{Aeq}$, measured and predicted by two selected models, are shown in Fig. 5. Table 1 shows that the Cubist 2.09 model is quite accurate on its training data (first 200 records) and test data (next 96 records). Random Forest model with