Statistical analysis of the water pollution indicators pertaining to treated municipal sewage introduced to the river

The rate of self-purification processes in the natural environment, including streams and rivers, can be assessed by means of numerical modelling. However, the reliability of qualitative numerical calculations may be affected by the quality of the collected input data and the efficiency of model calibration. The commonly required input data include pollutant characteristics and coefficients for empirical equations of their transport and decay, while calibration of a model requires series of measurements of selected pollutant concentrations in water. This paper presents studies of pollutant concentrations measured in the treated sanitary wastewater discharged to the Bystrzyca river from a municipal wastewater treatment plant, and of their concentrations measured directly in the river, above and below the discharge point. The qualitative measurements of the treated sewage and of the Bystrzyca river water included determination of COD, BOD5, N-NH4, N-NO2, N-NO3, TKN, TN, TP and TSS concentrations. The presented results are the first step in the development and calibration of a model intended to predict the influence of discharged sanitary wastewater on the quality of water in the river.


Introduction
Discharging wastes, both solid and liquid, to waters used to be a very convenient solution. Wastewater discharged to rivers via gutters was diluted and underwent self-cleaning, while the river regained its quality at a certain distance downstream from the discharge site. At present this is no longer possible because, e.g. due to industrial development, new chemical compounds are being synthesised which enter the water cycle in the natural environment and may have a negative influence on it as well as on human health [1][2]. Therefore, due to the constantly deteriorating quality of available water resources, only treated wastewater which meets the requirements stated in valid legal regulations can be discharged to running waters, i.e. rivers [1][2][3][4].
Monitoring of the receiver's water quality makes it possible to determine the influence of introducing treated wastewater to the environment, as well as the efficiency of the self-cleaning process. Because rivers may transport pollutants over great distances, the control and assessment of running water quality, as well as actions aimed at improving it, are important. Even a highly treated medium may carry significant amounts of biogenic and mineral compounds, which may cause unfavourable processes such as eutrophication or an increase in water salinity. In the case of a treatment plant failure, the treated wastewater may release, among others, heavy metals, persistent organic pollutants, potential carcinogens, as well as bacteria, viruses and parasites into the receiver [5][6]. As a result, numerous publications aimed at developing river condition assessment systems have appeared in recent years [3][7][8][9][10][11][12][13][14][15][16].
The quality of running waters has long been evaluated on the basis of physicochemical analyses. The main parameters determining the quality of wastewater and receiver waters include BOD5 (5-day Biochemical Oxygen Demand), COD (Chemical Oxygen Demand), pH, TSS (Total Suspended Solids), as well as the content of nitrogen and phosphorus compounds [5][6][7][8]. It is also known that a complete assessment of river condition is possible only after including aspects of biological assessment [8][9][10]; the same applies to the assessment of treated wastewater quality and of wastewater treatment plant efficiency.
At present, there are numerous methods for evaluating the condition of the river environment, ranging from organoleptic evaluation of water quality to highly precise physicochemical methods and biotesting. However, as far as the quality assessment of aquatic ecosystems is concerned, systems of environmental indices are increasingly preferred. They make it possible to evaluate the quality of the aquatic environment as a habitat for organisms; various species of protozoa are commonly used for this purpose [8][12][13][14][15][16].
The best results of river condition control can be achieved by simultaneously applying all three groups of assessment criteria (hydro-morphological, physicochemical and biological). Unfortunately, such an approach is relatively rare [8]. Moreover, studies are usually conducted in order to present a new method of evaluation [3,8]. Works aimed at determining the sensitivity of these methods, and the degree of their agreement and complementarity in the final assessment of river conditions, are much rarer [17][18][19][20].
This work presents a slightly different approach, based on numerical modelling of various phenomena occurring inside the water body of streams and rivers, which makes it possible to assess the rate of self-purification processes in the natural environment. However, the reliability of qualitative numerical calculations depends directly on the input data gathered for modelling and on successful calibration of the developed model. The necessary input data usually cover pollutant characteristics and coefficients for empirical equations, while measurements of selected pollutants are required for model calibration.
This work presents the statistical analysis of pollutant concentrations in the treated wastewater discharged to the Bystrzyca river from the Hajdów municipal wastewater treatment plant, as well as the concentrations in the Bystrzyca river water above and below the discharge point. Numerical modelling of various phenomena occurring in the water body of streams and rivers was used to assess the rate of self-cleaning processes in the natural environment.

Materials
The samples were collected from the Bystrzyca river, where water quality changes caused by treated effluent from the wastewater treatment plant could have occurred. The collection was carried out in three locations: (1) a spot above the discharge point, (2) the spot of wastewater discharge to the river, and (3) a spot below the wastewater discharge point. In order to ensure appropriate quality of results, the sampling locations were at a certain distance from one another. The treated effluent was analysed as well. Samples of water from the river were collected 500 m upstream of the sewage discharge point, at the point of discharge and 1000 m downstream of the discharge point. The collection was carried out using a bucket fastened on a cord, from a depth of about 50 cm below the water surface, into properly prepared containers in accordance with the standard described in the Water and Wastewater Sampling Guidelines for WIOŚ studies in Warsaw [21].
The treated wastewater discharge was sampled once a week for three years, whereas the river above and below the municipal wastewater treatment plant was sampled only once a month and only in 2009. All dates of the less frequent measurements were also dates of the more frequent measurements. The samples were transported to the laboratory of the Faculty of Environmental Engineering LUT, where the qualitative analysis of pollutant indicators was performed. The HACH DR2800 spectrophotometer and dedicated methods authorized by the HACH-Lange company were used to determine the nitrogen (cuvette tests LCK 238 LATON, LCK 340, LCK 341, LCK 303) and phosphorus (LCK 348) compounds, as well as COD (LCK 314) and TSS. The BOD5 analyses were conducted according to the Polish standard PN-EN 1899-1:2002.

Methods
The methods used in this paper were divided into two groups. In the first, each pollutant indicator was analysed individually, in terms of the impact of its concentration level on the water quality in the river below the discharge point of the municipal wastewater treatment plant. The relationship was estimated by a linear regression

$$C_{below} = \beta_0 + \beta_1 C_{above} + \beta_2 C_{out} \qquad (1)$$

where C_above and C_below denote the concentration of a particular pollutant measured in the river above and below the discharge point, respectively, and C_out denotes its concentration in the treated sanitary wastewater [22]. The second method was used to judge the influence of the combined pollutant factors on the quality of water in the Bystrzyca river. A hierarchical cluster analysis with the Euclidean metric and Ward's method of agglomeration was conducted in order to compare the groups of observations in view of all nine pollutants [2][23][24][25][26][27][28][29][30].
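As an illustration of equation (1), the following is a minimal sketch of an ordinary least-squares fit written in plain Python rather than R. The estimation method and the t-statistic check are assumptions, since the paper does not state how the coefficients were estimated or tested. All function names and all concentration values are hypothetical and are not the measured data.

```python
import math


def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def normal_matrix(c_above, c_out):
    """Design rows [1, C_above, C_out] and the matrix X^T X."""
    rows = [[1.0, a, o] for a, o in zip(c_above, c_out)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    return rows, xtx


def fit_regression(c_above, c_out, c_below):
    """Least-squares estimates [b0, b1, b2] of equation (1)."""
    rows, xtx = normal_matrix(c_above, c_out)
    xty = [sum(r[i] * y for r, y in zip(rows, c_below)) for i in range(3)]
    return solve(xtx, xty)


def t_statistics(c_above, c_out, c_below):
    """t-statistic of each coefficient (estimate divided by standard error)."""
    rows, xtx = normal_matrix(c_above, c_out)
    beta = fit_regression(c_above, c_out, c_below)
    residuals = [y - sum(b * x for b, x in zip(beta, r))
                 for r, y in zip(rows, c_below)]
    sigma2 = sum(e * e for e in residuals) / (len(rows) - 3)
    t_values = []
    for i in range(3):
        unit = [1.0 if j == i else 0.0 for j in range(3)]
        variance = sigma2 * solve(xtx, unit)[i]  # diagonal of (X^T X)^-1
        t_values.append(beta[i] / math.sqrt(variance))
    return t_values


# Hypothetical concentrations (mg/L), for illustration only.
c_above = [20.0, 22.0, 25.0, 19.0, 30.0, 27.0, 24.0, 21.0]
c_out = [35.0, 40.0, 33.0, 45.0, 38.0, 42.0, 36.0, 44.0]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.1]
c_below = [1.0 + 0.9 * a + 0.05 * o + e
           for a, o, e in zip(c_above, c_out, noise)]

print("coefficients:", fit_regression(c_above, c_out, c_below))
print("t-statistics:", t_statistics(c_above, c_out, c_below))
```

Under this sketch, the t-statistic of the C_out coefficient would be compared with a critical value of Student's t distribution with n - 3 degrees of freedom; for twelve observations, for example, the two-sided 5% critical value is about 2.26.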
All calculations were conducted in the statistical programming language R [31]. Cluster analysis was performed using the R package factoextra [25]. Figure 1 shows a strong relationship between the COD concentrations measured in the Bystrzyca river above and below the discharge point, but a much weaker correlation between the COD concentration at the discharge point and that in the river below it. Most of the pollutants behave similarly, but not all; e.g. TN shows a strong correlation among all three measurements (see Figure 2). The size of the effect of the sewage discharge on the river is measured by the multiple regression coefficient of C_out.
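The agglomeration step of the cluster analysis can be sketched as follows. This is a naive plain-Python illustration of Ward's criterion (merge the pair of clusters whose union gives the smallest increase in within-cluster sum of squared Euclidean distances), not the factoextra code used in the paper. The function names and the two-indicator sample values are hypothetical, and no standardisation of variables is applied here.

```python
def centroid(points):
    """Coordinate-wise mean of a list of points."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * dist2


def ward_clusters(points, k):
    """Agglomerate single-point clusters until k clusters remain.

    Returns clusters as sorted lists of row indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost([points[m] for m in clusters[i]],
                                 [points[m] for m in clusters[j]])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Hypothetical observations: rows 0-2 mimic effluent, rows 3-6 mimic river water.
samples = [
    [40.0, 8.0], [42.0, 9.0], [38.0, 7.5],
    [20.0, 2.0], [21.0, 2.2], [19.0, 1.8], [22.0, 2.1],
]
groups = ward_clusters(samples, 2)
print(groups)
```

The sketch recomputes every pairwise cost at each step, which is adequate for a few dozen observations; library implementations use update formulas instead.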

Results and discussion
All models consider the impact of the condition of the river above the discharge point, and of the concentration of the tested compound in the discharge, on the river's pollution below it. If C_out is significant, it can be inferred that the concentration of the pollutant in the discharge is crucial. C_out is significant in only two cases, so these two factors are substantial for assessing C_below. Almost all of the models show that the concentration of the pollutant depends on C_above (see Table 1).

The second approach to evaluating the relationship between C_above, C_below and C_out relies on clustering observations into groups that are homogeneous in terms of pollutant concentrations. On the basis of the hierarchical cluster analysis with Ward's agglomeration method, all observations can be divided into two groups. The division obtained from the analysis is obvious (see Figure 3). The points marked in red come from municipal sewage and the blue ones from the Bystrzyca river. If the impact of the pollutants were significant, there would be one more cluster (observations from the river would be divided into two groups, more and less contaminated). The representation of individuals in two-dimensional space is quite good (79% of total variance explained), so the map on the right in Figure 4 shows the separation between the two groups well.
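The share of total variance captured by a two-dimensional map, such as the 79% quoted above, can be computed as in the following sketch. It assumes the map is built from the first two principal components of standardised variables, which the paper does not state explicitly. The function names and data values are hypothetical, and the eigenvalues are obtained with a plain-Python Jacobi iteration instead of an R routine.

```python
import math


def correlation_matrix(data):
    """Correlation matrix of the columns of `data` (rows are observations)."""
    n, dim = len(data), len(data[0])
    means = [sum(row[d] for row in data) / n for d in range(dim)]
    stds = [math.sqrt(sum((row[d] - means[d]) ** 2 for row in data) / (n - 1))
            for d in range(dim)]
    z = [[(row[d] - means[d]) / stds[d] for d in range(dim)] for row in data]
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(dim)]
            for i in range(dim)]


def symmetric_eigenvalues(matrix, sweeps=50):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, descending."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < 1e-22:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = (math.pi / 4 if diff == 0.0
                         else 0.5 * math.atan(2.0 * a[p][q] / diff))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def explained_share(data, components=2):
    """Fraction of total variance carried by the leading principal components."""
    eigenvalues = symmetric_eigenvalues(correlation_matrix(data))
    return sum(eigenvalues[:components]) / sum(eigenvalues)


# Hypothetical observations of three indicators, for illustration only.
samples = [
    [40.0, 8.0, 12.0], [42.0, 9.0, 11.0], [38.0, 7.5, 13.5],
    [20.0, 2.0, 4.0], [21.0, 2.2, 5.5], [19.0, 1.8, 3.0], [22.0, 2.1, 4.8],
]
print("share explained by two components:", explained_share(samples, 2))
```

A share close to 1 means that the two-dimensional map loses little information; a constant column would need to be removed first, since it cannot be standardised.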

Conclusions
An unsupervised statistical learning method, cluster analysis, gave a two-class division based on all measured water pollution indicators. The result was obtained by hierarchical cluster analysis with Ward's agglomeration method and the Euclidean metric. All observations in cluster 1 came from municipal sewage. The samples from the Bystrzyca river, both above and below the sewage treatment plant, are in cluster 2. There is no significant difference between the two parts of the river. The relationship between municipal sewage and the Bystrzyca river below the point of discharge differs from the relationship between the Bystrzyca river above and below the wastewater treatment plant.
The analysis of the impact of the municipal wastewater treatment plant discharge on the Bystrzyca river generally revealed no such effect. Neither the analysis of individual factors nor the analysis of their groups gave rise to the conclusion that the sewage treatment plant has a negative effect on the river.