Acoustic Road Monitoring

This paper presents the performance of an automatic acoustic road monitoring system proposed by the authors. The main goal of the study is to describe road traffic by means of an acoustic representation and to test the effectiveness of traffic flow sensors. Evaluation metrics of road conditions, such as the velocity of the traffic flow, its structure and weather conditions, are presented along with acoustic descriptors derived from audio signal analysis. The accuracy of detecting emergency vehicle pass-bys based on acoustic monitoring is also briefly described.

* Corresponding author: karmarci@pg.edu.pl
© The Authors, published by EDP Sciences. This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0 (http://creativecommons.org/licenses/by/4.0/).
MATEC Web of Conferences 231, 05002 (2018) https://doi.org/10.1051/matecconf/201823105002 GAMBIT 2018


Road acoustic systems
Because the possibilities for introducing new road facilities in urban and intercity areas are limited, maximizing the efficiency of existing transportation networks is crucial for road administration. Intelligent transportation systems (ITS) make it possible to increase road capacity and safety through traffic analysis and control, along with communication with road users. Traffic flow sensors rely on various physical phenomena. The most commercially exploited ones use electromagnetic induction, i.e. inductive loops installed across road lanes. Others register images of a vehicle for optical and video processing, or exploit the reflection of transmitted energy using infrared and ultrasonic sensors. The appearance of a vehicle in the chosen detection area triggers counting by the sensor and/or allows distance determination [1]. In contrast to the commonly used technologies, acoustic monitoring can cover a wider detection area. Traffic sound analysis is a passive, non-intrusive method that, by employing sound propagation principles, may provide additional useful information about the nature and location of a road event.
The first traffic detector system was developed by Charles Adler Jr. in the 1920s. The sensor was used to change traffic lights and was activated by the horn sound produced by a vehicle [2]. In the following years acoustic sensors evolved into hollow steel boxes: a vehicle passing over them produced noise that was picked up by a microphone. Eventually, however, the most common technology in early ITS was the inductive loop detector. Although acoustic detectors have been known since the creation of ITS, the technology behind current acoustic systems is much more mature, complicated and expensive.
During the 1990s a few sound-based systems were tested. There was also one hardware implementation, namely the SAS-1 PA detector from SmarTek. The system is built from an array of sound sensors, can operate on multilane roads, and can determine vehicle speed and type [3]. In 2011 an implementation of a Traffic Monitoring Wireless Sensor Network based on acoustic arrays was presented. Each sensor consisted of two microphones, and the difference in the time of arrival between two sensors was translated into vehicle speed. Other projects examine the usefulness of Artificial Neural Networks in audio-based ITS, but without a standalone hardware implementation [4,5].
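The time-difference principle mentioned above can be illustrated with a minimal sketch. It assumes two sensors a known distance apart along the road and a short-time level envelope from each; the function names, the brute-force cross-correlation and the example numbers are illustrative assumptions, not details of the cited system.

```python
def estimate_delay(env_a, env_b, frame_rate):
    """Delay (s) by which env_b lags env_a, found by brute-force cross-correlation."""
    n = len(env_a)
    best_lag, best_score = 0, float("-inf")
    for lag in range(n):
        score = sum(env_a[i] * env_b[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / frame_rate


def speed_kmh(spacing_m, delay_s):
    """Vehicle speed from sensor spacing (m) and time of arrival difference (s)."""
    return 3.6 * spacing_m / delay_s


# Hypothetical envelopes sampled at 10 frames per second: the same pass-by
# peak reaches the second sensor 5 frames later.
env_a = [0.0] * 10 + [1.0, 3.0, 1.0] + [0.0] * 17
env_b = [0.0] * 15 + [1.0, 3.0, 1.0] + [0.0] * 12
delay = estimate_delay(env_a, env_b, frame_rate=10)
speed = speed_kmh(spacing_m=10.0, delay_s=delay)
```

With the sensors assumed to be 10 m apart, a 0.5 s delay corresponds to 72 km/h.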

Project background
The main goal of the study is to create an acoustic traffic representation and to test the effectiveness of traffic flow sensors. Besides predicting vehicle volume and traffic structure for a given interval, it is possible to check the road pavement condition as it changes with the weather. Moreover, alerts can be generated upon detection of car crash sounds, honks and other sounds unrelated to regular traffic.
Sources of road noise can be divided into three main groups: rolling noise, engine noise and aerodynamic noise. Their share in the overall noise depends on the speed of the vehicle. For example, rolling noise is most significant at speeds over 30-40 km/h. Electric cars, owing to the different operating principle of their engines, tend to be much quieter at low speeds. This phenomenon can be dangerous to people with disabilities (e.g. people with low vision, partial sight or blindness, as well as hard of hearing people) and to other unprotected road users. According to the U.S. Department of Transportation's National Highway Traffic Safety Administration (NHTSA), there are 2,400 pedestrian injuries yearly due to slow-moving electric and hybrid cars. For safety reasons, by September 1, 2019 all electric vehicle manufacturers are obliged to install a special sound alarm or synthetic audible engine noise for slow-moving vehicles [6].

Assumptions
The proposed acoustic road monitoring system is based on a few assumptions. The first group of requirements relates to hardware: the system should be cheap, easy to produce and resistant to weather conditions. From the software point of view, the algorithms must execute quickly (depending on the priority of the information) and should work online. Using audio processing we obtain quantitative metrics that describe qualitative responses such as "passenger vehicle", "truck" or "wet/dry surface".
In this paper we focus on algorithmic audio processing aimed at extracting the variables that describe motion. The acoustic requirements are based on three assumptions [7]:
1. The vehicle is a point source.
2. It has a stationary characteristic (Autoregressive Moving Average (ARMA) model).
3. It produces a pure tone.
Visual differences between a noise source such as a collision of a light vehicle (e.g. a motorcycle rollover) and a typical vehicle pass-by are easy to recognize in both the time and time-frequency domains. An example is presented in Fig. 1: the left side shows vehicles involved in a collision, and the right side shows a passenger vehicle pass-by.
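The point-source assumption has a simple practical consequence: in a free field the sound pressure level of such a source falls by 20 log10(r/r0) dB between distances r0 and r. The sketch below refers a measured level back to a reference distance; the function name and the 1 m default are illustrative assumptions, and reflections and ground effects are ignored.

```python
import math


def level_at_reference(level_db, distance_m, reference_m=1.0):
    """Refer a level measured at distance_m back to reference_m.

    Assumes a free-field point source (spherical spreading).
    """
    return level_db + 20.0 * math.log10(distance_m / reference_m)


# A level of 70 dB measured 10 m from the source corresponds to 90 dB at 1 m.
compensated = level_at_reference(70.0, 10.0)
```

Doubling the distance changes the result by about 6 dB, which is the usual rule of thumb for a point source.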

Processing tree
The basic task of the system is to collect information about the presence of a vehicle, which is achieved by tracking changes in the audio signal. The scheme of operation is shown in Figure 2. The output detection threshold is selected depending on the analyzed parameter, i.e. the noise level, the RMS of the signal or the sharpness level [8,9]. The initial threshold value was established experimentally and depended on the average intensity per hour and the maximum sound level, compensated for the distance of the microphone from the road. The threshold adaptation formula took into account the statistical parameters in the buffered analysis frame (a 20-second window) [10].

Fig. 2. The diagram of data analysis for vehicle detection.
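The adaptation formula itself is not given in the text, so the following is only a sketch of one plausible scheme: the threshold is the mean plus k standard deviations of the frame levels held in a sliding buffer. The factor k, the frame length and the function names are assumptions made for illustration.

```python
import math
from collections import deque
from statistics import mean, pstdev


def rms_db(frame):
    """RMS level of one frame of samples, in dB relative to full scale 1.0."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    return 20.0 * math.log10(max(rms, 1e-12))


def detect_active(levels_db, frames_per_window, k=1.5):
    """Flag frames whose level exceeds mean + k * std of the preceding window."""
    history = deque(maxlen=frames_per_window)  # buffered analysis window
    flags = []
    for level in levels_db:
        if len(history) < 2:
            flags.append(False)  # not enough statistics yet
        else:
            threshold = mean(history) + k * pstdev(history)
            flags.append(level > threshold)
        history.append(level)
    return flags


# Hypothetical frame levels: steady background with a single louder frame.
levels = [-40.0] * 10 + [-20.0] + [-40.0] * 5
flags = detect_active(levels, frames_per_window=10)
```

With 0.1 s frames, for example, a 20-second buffer would correspond to `frames_per_window=200`.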

Data acquisition
Road traffic was recorded using two microphones with different directional characteristics, i.e. omnidirectional and hyper-cardioid. In addition, a ground-truth video recording was made during the whole session. The audio recordings were processed in order to extract traffic sound parameters. The sound is analyzed not only in terms of acoustic levels, but also to discern distinctive spectral features. These operations were used to determine the feature vector that allows classifiers to be trained. In addition, the use of noise monitoring systems for estimating traffic on roads is presented. A methodology called reverse engineering is employed to extract the approximate traffic volume from the noise level in a given area [11].

Event detection
The first step of the system operation is the detection of acoustic activity in the audio recording. The sound level recorded during the day varies with traffic volume and human activity. To prevent false positive and false negative detections, we propose adaptive threshold detection.
The recorded sessions were divided into files 5 minutes long. The number of pass-bys in each file was checked manually against the video recording. Then calculations with the proposed adaptive threshold method were conducted. An example of a small set of data with threshold values and event detections is presented in Fig. 3.
Frames marked as "active" are forwarded to the classification system. Every detection is tagged with a timestamp and with the vehicle type, weather condition and estimated speed. Using the full detailed data we can calculate the mean volume, the distance between vehicles and other traffic information. The performance of the detection algorithm is presented in Table 1.
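As an illustration of how timestamped detections turn into traffic statistics, the sketch below computes the hourly volume and the mean time gap between consecutive vehicles for one recording interval. The function name and the example timestamps are hypothetical; multiplying a time gap by the estimated speed would give the distance between vehicles.

```python
def traffic_summary(timestamps_s, interval_s):
    """Return (vehicles per hour, mean time gap in seconds) for one interval."""
    ordered = sorted(timestamps_s)
    volume_per_hour = len(ordered) * 3600.0 / interval_s
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    mean_gap = sum(gaps) / len(gaps) if gaps else None  # undefined for < 2 vehicles
    return volume_per_hour, mean_gap


# Four hypothetical detections within one 5-minute file.
volume, gap = traffic_summary([10.0, 40.0, 100.0, 250.0], interval_s=300.0)
```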

Event classification
Frames marked as "active" by the system are passed to the classification system. Figure 4 shows the stages of the traffic classification algorithm. Again using the video information, audio traffic events were tagged with the vehicle type, instantaneous speed and weather condition. A vector of 48 parameters was obtained for each event file. Next, in a pre-processing step, outliers were removed, i.e. atypical vehicles that deviate acoustically from their assigned class and thus introduce additional noise into the data. Because each parameter was obtained on a different scale, the data were normalized using the z-score method.
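Z-score normalization rescales every parameter to zero mean and unit standard deviation, z = (x - mean) / std, so that parameters with large numeric ranges do not dominate the others. A minimal sketch follows; it uses the population standard deviation, which is an assumption, since the text does not say which variant was used.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Normalize each feature (column) to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    means = [mean(column) for column in columns]
    # A constant feature has zero spread; divide by 1.0 instead of 0.0.
    stds = [pstdev(column) or 1.0 for column in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Two hypothetical parameters on very different scales end up comparable.
normalized = zscore_columns([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
```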
Finally, to reduce the number of parameters a statistical analysis was conducted using Fisher Linear Discriminant Analysis (LDA) [12].
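For two classes, Fisher's criterion yields the projection direction w = Sw^-1 (m_b - m_a), where Sw is the within-class scatter matrix and m_a, m_b are the class means. The sketch below is restricted to two features so that the matrix inverse can be written in closed form; reducing the full 48-parameter vector would require a general linear solver. All names and data are illustrative.

```python
def fisher_direction(class_a, class_b):
    """Fisher discriminant direction w = Sw^-1 (mean_b - mean_a) for 2-D features."""
    def mean2(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    def scatter(points, m):
        sxx = sum((p[0] - m[0]) ** 2 for p in points)
        syy = sum((p[1] - m[1]) ** 2 for p in points)
        sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
        return sxx, sxy, syy

    ma, mb = mean2(class_a), mean2(class_b)
    axx, axy, ayy = scatter(class_a, ma)
    bxx, bxy, byy = scatter(class_b, mb)
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy  # within-class scatter Sw
    det = sxx * syy - sxy * sxy
    dx, dy = mb[0] - ma[0], mb[1] - ma[1]
    # Closed-form inverse of the 2x2 matrix Sw applied to the mean difference.
    return ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)


def project(point, w):
    """Project a 2-D feature vector onto the discriminant direction."""
    return point[0] * w[0] + point[1] * w[1]


# Two hypothetical, well-separated classes.
class_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
class_b = [(4.0, 0.0), (5.0, 0.0), (4.0, 1.0), (5.0, 1.0)]
w = fisher_direction(class_a, class_b)
```

Projecting every event onto `w` gives a single discriminant value per event, and the two example classes no longer overlap along that axis.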

Vehicle body type
Using the above scenario, a few classification sessions were conducted with the Support Vector Machine (SVM) method [13]. It should be noted that the energy of tire noise is concentrated near 1 kHz. Moreover, for a specific vehicle body type the energy at higher frequencies may differ between vehicles. In contrast, in the 4-6 kHz band we can extract the main fluctuation characteristic of a given vehicle, which can lead to an automatic classification decision. Figure 5 shows an example of the scatter charts for two vehicle classes.

Fig. 4. Differences between vehicle classes in the context of spectral content (left), and the effect of LDA for two chosen vehicle classes with data not assigned to any vehicle class, causing overlapping in groups (right).
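The band observations above suggest simple spectral descriptors: the signal energy near 1 kHz and in the 4-6 kHz band. The sketch below computes band energy with a plain discrete Fourier transform; it is not the authors' feature extractor, and the band edges around 1 kHz, the sampling rate and the test tone are assumptions chosen for illustration.

```python
import cmath
import math


def band_energy(samples, sample_rate, f_low, f_high):
    """Sum of squared DFT magnitudes for bins with frequency in [f_low, f_high]."""
    n = len(samples)
    energy = 0.0
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        if f_low <= freq <= f_high:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(samples))
            energy += abs(coeff) ** 2
    return energy


# Hypothetical test signal: a 1 kHz tone sampled at 16 kHz (160 samples).
fs = 16000
tone = [math.sin(2 * math.pi * 1000 * i / fs) for i in range(160)]
tire_band = band_energy(tone, fs, 800, 1200)
high_band = band_energy(tone, fs, 4000, 6000)
```

For real recordings an FFT library would be the practical choice; the direct DFT is used here only to keep the example dependency-free.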

Weather conditions
Pavement/road conditions were classified in the same way as the body type, but using a different reduced feature vector. The surface condition was understood as the presence of water or other atmospheric precipitation on the ground. The results of several observations carried out at the selected location for three weather conditions are shown in Figure 5.

Fig. 5. Differences between three pavement conditions, understood as the presence or absence of water or snow on the road surface.

Emergency vehicles
In Poland there are three main emergency signals: le-on, wolf, dog [14]. Sounds are not subject to significant adjustments. Polish norms and guidelines mainly refer to the document PN-75/S-76006 "Acoustic properties of dual-signal signals of emergency vehicles in motion. Requirements and tests." However, it only standardizes the older two-tone alarm sounds and not the modulated or polynomial types that are commonly used nowadays. The technical requirements set for devices installed on privileged vehicles are also properties that can be used for their automatic detection in the audio signal. Because the signal has to overcome the sound isolation between the outside and the inside of a vehicle cabin, the main emergency signal characteristics should be as follows:
- the difference between the signal and the noise level should be at least 15 dB,
- the frequency response should be between 300 and 3000 Hz,
- most of the energy should be placed above 1500 Hz (audible through most hearing protection equipment),
- the sound should differ explicitly from other sound sources in the working area, i.e. in sound level and frequency components, and be variable in time.
In low-urbanized areas the direction of arrival can be easily detected thanks to the specific way the sound is emitted. In places with high reflectivity, such as street canyons, detection of an emergency vehicle is more complex, mostly due to reflections and the sound isolation of the vehicle cabin.
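The measurable requirements listed above (level difference, frequency range, energy distribution) can be turned into a simple screening check. The sketch below is one possible interpretation rather than a normative test: it assumes band energies are already available, reads "frequency response between 300 and 3000 Hz" as a minimum share of energy inside that range (the 0.9 default is hypothetical), and reads "most of the energy" as more than half.

```python
def check_siren_requirements(signal_db, noise_db, energy_by_band, in_band_share=0.9):
    """Screen a candidate signal against the level and spectrum requirements.

    energy_by_band maps (f_low_hz, f_high_hz) tuples to linear band energy.
    """
    total = sum(energy_by_band.values())
    in_band = sum(e for (f_lo, f_hi), e in energy_by_band.items()
                  if f_lo >= 300 and f_hi <= 3000)
    above_1500 = sum(e for (f_lo, f_hi), e in energy_by_band.items()
                     if f_lo >= 1500)
    return {
        "snr_ok": signal_db - noise_db >= 15.0,
        "band_ok": total > 0 and in_band / total >= in_band_share,
        "high_energy_ok": total > 0 and above_1500 > 0.5 * total,
    }


# Hypothetical band energies for a siren-like signal.
bands = {(300, 1500): 2.0, (1500, 3000): 7.0, (3000, 8000): 0.5}
result = check_siren_requirements(signal_db=95.0, noise_db=75.0, energy_by_band=bands)
```

The fourth requirement, variability in time, is not covered here and would need a separate modulation measure.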
Based on the time-frequency analysis, a vector of parameters describing the alarm siren can be selected. The parameters utilized in the music information retrieval area [15] prove useful for that purpose. Alarm signals differ significantly from other acoustic signals encountered on the road (e.g. passenger cars used as moving billboards). Spectrograms of two different moments of an emergency vehicle passing by are presented in Fig. 6.

Fig. 6. Clearly modulated sound samples recorded during the passage of a vehicle with the siren alarm. Detection can be conducted based on the sound level and signal modulation.

Conclusion
The tests performed have confirmed the effectiveness of the classifiers trained to recognize vehicle type and weather conditions and to count vehicles. The developed algorithms, used for the analysis and classification of acoustic signals, detect various types of impulsive sounds, which may increase traffic safety.
Acoustic analysis of road traffic provides various types of traffic-oriented data related to vehicle velocity and safety issues. One of the crucial points in an emergency situation is to provide help in the shortest possible time. Faster and more accurate localization of a car crash may improve road safety management.
Acoustic analysis and detection systems are used in many countries, but mostly to detect explosions or gunshots in large urban agglomerations. Acoustic systems are still rarely used for the analysis of road conditions and road traffic; hence, given the high effectiveness of such solutions, further work supporting their conceptual development and practical applications is justified.
The project was financed by the Polish National Centre for Research and Development (NCBR) from the European Regional Development Fund under the Operational Programme Innovative Economy, No. POIR.04.01.04-0089/16, entitled INZNAK: "Intelligent road signs with V2X interface for adaptive traffic controlling".