Comparison of M5 Model Tree and Nonlinear Autoregressive with eXogenous inputs (NARX) Neural Network for urban stormwater discharge modelling

This paper presents a comparative study of two data-driven modelling techniques for forecasting urban drainage stormwater discharge from rainfall predictions. An M5 model tree (M5T) and a Nonlinear Autoregressive with eXogenous inputs (NARX) neural network are both used to forecast stormwater discharge 30 minutes ahead. Data were collected from a 3315 ha watershed located in the city of Casablanca, Morocco. The results show that both models perform well, with the NARX model performing better overall.


Introduction
Major cities are facing urban flood issues due to urbanization and climate change. Real time control (RTC) systems improve the performance of sewer networks by controlling the different network components and optimizing storage within retention ponds and main collectors (Beeneken et al. 2014). The essential inputs of RTC systems are rain and flow forecasts. Artificial intelligence techniques have shown good performance in modelling nonlinear systems, with a considerable reduction in computing time compared to hydrologic and hydraulic models (Abou Rjeily et al. 2017). Two such techniques, neural networks and the M5 model tree, have been compared in several applications, such as sediment transport modelling (Bhattacharya et al. 2007), river flow forecasting (Solomatine and Khada 2003) and evaporation estimation (Rahimikhoob 2014). This paper extends that comparison to an urban drainage application.

Artificial Neural Networks
Neural networks are among the most commonly used artificial intelligence techniques. They have shown good performance in nonlinear system simulation and time series prediction, for example stock forecasting (Pang et al. 2018) and urban flood prediction (Berkhahn et al. 2019). They are treated as black boxes that hold the learned information. Initially, the architecture of the neural network consists of layers and nodes without any information or knowledge of the simulated phenomenon. During the learning stage, the weights of the connections between the nodes are determined by an optimization algorithm that minimizes the error between the output of the neural network and the measured data. The Nonlinear Autoregressive with eXogenous inputs (NARX) neural network, given in its usual form by equation (1), is used in this research:

$$y(t) = f\big(y(t-1), \dots, y(t-n_y),\; u(t-1), \dots, u(t-n_u)\big) \qquad (1)$$

where y(t) is the output time series, u(t) is the input time series, and n_y and n_u are the time delays representing the dynamic behavior of the phenomenon.
Two NARX architectures are generally used. The first, called series-parallel or open-loop, is efficient for forecasting a time series value one time step ahead. The second, called parallel or closed-loop, is efficient for multistep-ahead prediction.
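The difference between the two architectures can be illustrated with a minimal sketch. It assumes the regressor layout of equation (1) and, to stay short and exact, replaces the neural network with a linear least-squares map fitted on a synthetic linear system. The function names, the delays (ny = nu = 2) and the synthetic data are illustrative assumptions, not the configuration used in this study.

```python
import numpy as np

def narx_regressors(y, u, ny, nu):
    """Stack [y(t-1..t-ny), u(t-1..t-nu)] for every t with enough history."""
    start = max(ny, nu)
    rows = []
    for t in range(start, len(y)):
        past_y = y[t - ny:t][::-1]  # y(t-1), ..., y(t-ny)
        past_u = u[t - nu:t][::-1]  # u(t-1), ..., u(t-nu)
        rows.append(np.concatenate([past_y, past_u]))
    return np.array(rows), y[start:]

def fit_linear_f(X, target):
    """Stand-in for the trained network (assumption): linear map with bias."""
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return lambda x: float(np.append(x, 1.0) @ coef)

def predict_open_loop(f, y, u, ny, nu):
    """Series-parallel: every one-step forecast uses measured past outputs."""
    X, _ = narx_regressors(y, u, ny, nu)
    return np.array([f(x) for x in X])

def predict_closed_loop(f, y_init, u, ny, nu, steps):
    """Parallel: forecasts are fed back in place of measured outputs."""
    start = max(ny, nu)
    y_hat = list(y_init[:start])  # only the initial conditions are measured
    for t in range(start, start + steps):
        past_y = np.array(y_hat[t - ny:t][::-1])
        past_u = u[t - nu:t][::-1]
        y_hat.append(f(np.concatenate([past_y, past_u])))
    return np.array(y_hat[start:])

# Synthetic, stable second-order system (illustrative data only).
rng = np.random.default_rng(0)
u = rng.random(200)
y = np.zeros(200)
for t in range(2, 200):
    y[t] = 0.5 * y[t - 1] + 0.2 * y[t - 2] + 0.3 * u[t - 1] + 0.1 * u[t - 2]

ny, nu = 2, 2
X, target = narx_regressors(y, u, ny, nu)
f = fit_linear_f(X, target)
open_pred = predict_open_loop(f, y, u, ny, nu)
closed_pred = predict_closed_loop(f, y, u, ny, nu, steps=len(y) - max(ny, nu))
```

Because the synthetic system is linear and noise-free, both modes reproduce the series here. With a real network and measured data, closed-loop errors accumulate over the forecast horizon, which is why the open-loop form is generally used for training and one-step forecasts.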

M5T
The M5 model is based on the principle of information theory (Quinlan 1992, Solomatine and Khada 2003). It can handle a nonlinear problem by splitting it into multiple linear ones. The M5T is similar to conventional decision trees, but instead of predicting classes through a classifier it predicts continuous variables through linear regression functions at the leaves. The construction of the M5 model tree is done in two stages (Solomatine and Xue 2004, Rahimikhoob et al. 2013). First, the M5 model tree is constructed by a divide-and-conquer method that splits the multidimensional data space into several subspaces based on a splitting criterion (Figure 1). This criterion treats the standard deviation of the class values that reach a node as a measure of the error at that node and computes the expected reduction in this error by testing each attribute at that node. Splitting stops when the class values of the instances that reach a node vary only slightly or when only a few instances remain (Goyal and Ojha 2014). The standard deviation reduction (SDR), in the form defined by Quinlan (1992), is given by equation (2):

$$SDR = sd(T) - \sum_i \frac{|T_i|}{|T|}\, sd(T_i) \qquad (2)$$

where T is the set of instances that reach the node, T_i is the subset of instances that have the ith outcome of the potential test, and sd is the standard deviation.

The result of the splitting is a tree with splitting rules at the nodes and expert linear models (LM) at the leaves (Figure 2). The combination of linear models can be seen as a committee machine (Haykin 1999). The generated tree is often large, difficult to analyze, and prone to overfitting. In the second stage, a pruning operation overcomes this problem by replacing a subtree with a leaf. In addition, smoothing is performed to compensate for the discontinuities that can occur between adjacent linear models at the pruned leaves of the tree.
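The splitting criterion can be made concrete with a short sketch that computes the standard deviation reduction for a candidate split and scans the thresholds of a single attribute. The function names and toy data are illustrative assumptions, and the population standard deviation is used here as a simplification; actual M5 implementations may differ in such details.

```python
import numpy as np

def sdr(parent, subsets):
    """Standard deviation reduction obtained by splitting `parent` into `subsets`."""
    parent = np.asarray(parent, dtype=float)
    weighted = sum(len(s) / len(parent) * np.std(s) for s in subsets)
    return np.std(parent) - weighted

def best_split(x, y):
    """Scan the thresholds of one attribute; return (threshold, SDR) of the best."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)
    x, y = x[order], y[order]
    best = (None, -np.inf)
    for i in range(1, len(x)):
        if x[i] == x[i - 1]:
            continue  # no threshold can separate identical attribute values
        threshold = 0.5 * (x[i] + x[i - 1])
        gain = sdr(y, [y[:i], y[i:]])
        if gain > best[1]:
            best = (threshold, gain)
    return best

# Toy data: the target jumps between x = 3 and x = 4.
x_demo = [1, 2, 3, 4, 5, 6]
y_demo = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8]
threshold, gain = best_split(x_demo, y_demo)
```

On this toy data the selected threshold falls midway between x = 3 and x = 4, where the two resulting subsets are each nearly constant. A full model tree would apply the same search recursively over all attributes and then fit a linear model at each leaf.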

Site description
The study area is a watershed of 3315 ha located in the city of Casablanca in Morocco. The urban drainage system (UDS) in this area is composed of 485 km of both combined and separate networks (Figure 3). The UDS is equipped with a depth meter and a flow meter at the outlet of the watershed. The UDS measurements are recorded at a 15-minute time interval. A rain gauge located in the watershed records the rain intensity at a 5-minute time interval.

Data collection
Data were extracted from the database for a period of 4 years (January 2015 to January 2019). Since the main sewer at the outlet of the watershed is a combined one, the dry weather flow was removed to keep only the flows related to rain. The rain dataset used in this work is characterized by the return periods summarized in Table 1, which shows a dominance of low to medium rains in the dataset.

Table 1. Return periods of the rain events in the dataset (columns: Return Period; Percentage of the sample).

Application
Both M5T and NARX were used to predict the stormwater flow 30 minutes ahead. The inputs used to construct the two models are the data that contribute to the stormwater flow Qt+30min: the previous rainfall data (Ret-80min, …, Ret-5min, Ret) and the previous measured discharge (Qt-10min, Qt-5min and Qt). Both models were trained on a data sample of 4 years. To evaluate the performance of both models in predicting Qt+30min under different meteorological conditions, a sample of 4 rain events with various intensities and different return periods was used. The rainfall return periods used for testing the models are: 50YRP (Year Return Period), 2 YRP, 5YRP and 5YRP. Figure 4 shows that the NARX model reproduces the flow hydrograph better but slightly overestimates the peak flow (by about 10%), whereas M5T gives better results in predicting peak flows.
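The input and target layout described above can be sketched as follows. The sketch assumes that rainfall and discharge are available on a common 5-minute grid, which would require resampling the discharge measurements; this preprocessing step, the function name and the placeholder series are assumptions made for illustration only.

```python
import numpy as np

STEP_MIN = 5                      # assumed common time step, in minutes
RAIN_LAGS = 80 // STEP_MIN + 1    # Re(t-80min), ..., Re(t): 17 values
FLOW_LAGS = 10 // STEP_MIN + 1    # Q(t-10min), Q(t-5min), Q(t): 3 values
HORIZON = 30 // STEP_MIN          # Q(t+30min) lies 6 steps ahead

def build_samples(rain, flow):
    """Return the input matrix X and the target vector Q(t+30min)."""
    rain = np.asarray(rain, dtype=float)
    flow = np.asarray(flow, dtype=float)
    first = max(RAIN_LAGS, FLOW_LAGS) - 1  # earliest t with a full history
    last = len(flow) - HORIZON             # exclusive: target must exist
    X, target = [], []
    for t in range(first, last):
        rain_window = rain[t - RAIN_LAGS + 1:t + 1]  # oldest value first
        flow_window = flow[t - FLOW_LAGS + 1:t + 1]
        X.append(np.concatenate([rain_window, flow_window]))
        target.append(flow[t + HORIZON])
    return np.array(X), np.array(target)

# Placeholder series, used only to check the indexing.
rain_demo = np.arange(100.0)
flow_demo = 1000.0 + np.arange(100.0)
X_demo, q_target = build_samples(rain_demo, flow_demo)
```

Each row of the resulting matrix holds 17 rainfall values followed by 3 discharge values, and the same matrix could feed either model, so that any difference in the forecasts comes from the models rather than from the inputs.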

Conclusion
This paper presented the use of two artificial intelligence techniques, NARX and M5T, for forecasting urban drainage stormwater discharge using rainfall predictions. Both models can be easily implemented for practical use. They were tested and compared on data collected from a 3315 ha watershed located in the city of Casablanca. The results show that both models perform well. The NARX model performs well over the whole range of flows but slightly overestimates peak values, whereas the M5T model provides better results in forecasting peak flows.