The modelling of ground water quality in urban area based on demographics factor and building coverage ratio by using geographically weighted regression approach (case study in Jakarta, Indonesia)

Population density arising from urbanisation can shift land use from natural surfaces to impervious cover. Impervious cover can therefore serve as an indicator of environmental quality, particularly water quality. Jakarta is a case in point: 80% of ground water samples there have been contaminated by E. coli. Indicators that may be correlated with water quality include the building coverage ratio (BCR) and demographic factors such as population density and building density. This study examines the influence of these three factors on ground water quality. Its aim is to formulate a spatial regression model relating the three factors to the main indicator of water quality, namely E. coli. Samples were obtained from densely populated areas of Duren Sawit sub-district, Jakarta. Geographically Weighted Regression (GWR) was used to analyse the spatial data. The results gave an R² of 68.9%, and E. coli values in the model were influenced more by the BCR than by the demographic factors. This outcome may serve as an early recommendation for Jakarta.


Introduction
Land-use conversion in urban areas tends to increase the area sealed by cement, asphalt, and other waterproof (impervious) surfaces [1,2]. In the last few decades, many large cities in Indonesia have faced a similar problem, namely a high population growth rate. This growth drives rising demand for residential areas and city infrastructure, which in turn shifts land use in urban areas. The rise of impervious cover can lead to eco-hydrological problems such as increased surface runoff, declining water quality, and damage to the habitat of aquatic animals [3,4,5]. Nevertheless, impervious land cover is a characteristic of urban areas [2,6] and can be used as an indicator of environmental quality, mainly in relation to the quality of water resources [2]. Previous studies have mostly discussed how strongly population density influences water quality in a given watershed [4,5,6,7]. Such data have been mapped using GIS and analysed with statistical methods including ANOVA, correlation, regression, and spatial regression (Geographically Weighted Regression) [8,9].
Turning to the environmental issues, DKI Jakarta is one of the largest metropolitan cities and has experienced the greatest impact of development, as illustrated by a rate of city growth and development higher than that of other cities [10]. According to the 2015 environmental quality monitoring report for DKI Jakarta, the quality of river water, ground water, and air has declined, and water pollution in Jakarta has reached a harmful level. Water quality monitoring was conducted on well water in 197 urban villages. Samples were taken twice, in September to October and in November to December, and assessed against physical, chemical, and biological (microbiological) parameters. In terms of both quantity and quality, ground water performed better than surface water. Even so, 80% of ground water samples from 75 wells in 75 urban villages were contaminated with Escherichia coli (E. coli) [11]. The spread of E. coli bacteria in the ground is heavily influenced by soil porosity. Its horizontal movement is hard to ascertain because it depends on several factors, such as the soil type, the height of the water table, the ground water flow, the construction of hand-pump wells and the number of people who use them, and the number of people disposing of faeces [12].
At the same time, the rapid development of Jakarta has reduced the quality of space utilisation and unbalanced the function and structure of space. According to Landsat 8 satellite imagery processed with GIS technology in 2017, DKI Jakarta shows a high building density. Building density and the building coverage ratio (BCR) have increased because population density has risen while the land area has remained constant. This study therefore seeks a spatial regression model relating population density, building density, and the building coverage ratio to one main indicator of water quality in urban areas, namely E. coli bacteria. The outcome is expected to make a positive contribution and to inform policy direction related to law enforcement.
https://doi.org/10.1051/matecconf/201819202034 (ICEAST 2018)
The research was conducted in East Jakarta, one of the densely residential municipalities of DKI Jakarta. The specific study location is Duren Sawit, one of the most densely populated sub-districts in East Jakarta and designated a housing zone under DKI Jakarta Local Regulation No. 1 of 2014.
Ground water quality was modelled with a spatial regression approach, Geographically Weighted Regression (GWR), in which geographical location enters the model as a factor that can influence the response variable [13].

Study Site
The study was carried out in an urban area of DKI Jakarta, specifically Duren Sawit sub-district in East Jakarta municipality. The sub-district consists of 7 urban villages.

Sampling Methods
Samples were collected at 30 sites in Duren Sawit sub-district, selected from the population density map. At each site, ground water was sampled from the residents' well and the building coverage ratio of the house was measured, giving 30 houses in total. Proportional probability sampling (PPS) was applied with population density as the guide, so each urban village (study location) holds a different number of samples.

Population density (PD)
Population density data were obtained from the Indonesian Central Bureau of Statistics (2016) and include the land area, population, and population density of the 7 urban villages. With population density as the guide, the sample was drawn using the proportional probability technique, a form of stratified random sampling. The number of samples was set at about 30, distributed according to the population density of all the urban villages.
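The allocation of about 30 samples across the 7 urban villages in proportion to population density can be sketched as follows. This is only an illustration: the village labels and density values are hypothetical placeholders, not the study's data, and the largest-remainder rounding rule is an assumption, since the text does not state how fractional allocations were rounded.

```python
def allocate_samples(densities, total):
    """Split `total` samples across strata in proportion to density.

    Uses largest-remainder rounding (an assumed rule) so that the
    allocations always add up to `total`.
    """
    density_sum = sum(densities.values())
    quotas = {k: total * d / density_sum for k, d in densities.items()}
    allocation = {k: int(q) for k, q in quotas.items()}  # floor of each quota
    leftover = total - sum(allocation.values())
    # Hand the remaining samples to the strata with the largest fractions.
    by_remainder = sorted(quotas, key=lambda k: quotas[k] - allocation[k],
                          reverse=True)
    for k in by_remainder[:leftover]:
        allocation[k] += 1
    return allocation


# Hypothetical densities (people per km^2) for 7 placeholder villages.
densities = {
    "village_A": 24000, "village_B": 18500, "village_C": 15200,
    "village_D": 21000, "village_E": 12800, "village_F": 17300,
    "village_G": 19900,
}
allocation = allocate_samples(densities, 30)
print(allocation)
```

Denser villages receive more sampling sites, which matches the statement that each urban village holds a different number of samples.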

Building density (NDBI)
Building density was estimated with the NDBI (Normalized Difference Built-up Index). The index was derived from Landsat 8 satellite imagery. NDBI is a building density index stored as raster data with pixel values between -1 and 1. A value of -1 indicates low building density, whereas a value of 1 indicates high building density.
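The text does not give the NDBI formula. A minimal sketch, assuming the commonly used definition NDBI = (SWIR - NIR) / (SWIR + NIR), which for Landsat 8 is usually computed from band 6 (SWIR1) and band 5 (NIR), is shown below. The reflectance values are hypothetical, and the function name is chosen here for illustration.

```python
def ndbi(swir, nir):
    """Per-pixel NDBI from two equally sized reflectance grids.

    `swir` and `nir` are lists of rows. Pixels where both bands are
    zero have no defined index and are returned as NaN.
    """
    result = []
    for swir_row, nir_row in zip(swir, nir):
        row = []
        for s, n in zip(swir_row, nir_row):
            total = s + n
            row.append((s - n) / total if total != 0 else float("nan"))
        result.append(row)
    return result


# Hypothetical 2 x 2 reflectance grids (not taken from the study).
swir_band = [[0.30, 0.10], [0.25, 0.20]]
nir_band = [[0.10, 0.30], [0.25, 0.15]]
print(ndbi(swir_band, nir_band))
```

Pixels where SWIR reflectance exceeds NIR reflectance give positive values, which is the built-up end of the -1 to 1 range described above.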

Building Coverage Ratio (BCR)
The BCR is calculated with formula (1):
$$\mathrm{BCR} = \frac{\text{building footprint area}}{\text{land parcel area}} \times 100\% \qquad (1)$$
The BCR is expressed as a percentage (%).
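As a small worked example, the sketch below assumes the usual definition of BCR as the building footprint area divided by the land parcel area, expressed as a percentage. The function name and the areas are hypothetical.

```python
def building_coverage_ratio(footprint_m2, parcel_m2):
    """Return the BCR in percent for one house (assumed definition)."""
    if parcel_m2 <= 0:
        raise ValueError("parcel area must be positive")
    return 100.0 * footprint_m2 / parcel_m2


# Hypothetical house: 90 m^2 footprint on a 120 m^2 parcel.
bcr = building_coverage_ratio(90, 120)
print(bcr)
```

A house that covers its whole parcel would have a BCR of 100%.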

Water Quality
Ground water quality was assessed by testing the samples for E. coli bacteria. Initial interviews at the 30 sites found that ground water was mostly used for drinking and sanitation. For the microbiological analysis, the ground water sampling procedure followed the SNI [14].

Modeling Methods
Geographically Weighted Regression (GWR) fits a regression model at each site. Because every site is different, GWR generates a distinct model for each one. The model includes a spatial factor, the latitude and longitude coordinates of each sample. The GWR model was built in R version 3.2.1. The model is given by formula (2):
$$y_i = \beta_0(u_i, v_i) + \sum_{k=1}^{p} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i \qquad (2)$$
where $(u_i, v_i)$ are the coordinates of site $i$, $x_{ik}$ is the value of predictor $k$ at site $i$, and $\beta_k(u_i, v_i)$ is the local coefficient of predictor $k$.
Different weighting schemes can be used in GWR, such as the Gaussian kernel, which is the most commonly used, and the bisquare weighting function [13]. The Gaussian weight of a neighbouring point is given by formula (3):
$$w_{ij} = \exp\left[-\frac{1}{2}\left(\frac{d_{ij}}{b}\right)^{2}\right] \qquad (3)$$
where:
j = a neighbouring point close to point i
b = the bandwidth (the distance to the closest points)
d_ij = the Euclidean distance between point i and point j:
$$d_{ij} = \sqrt{(u_i - u_j)^2 + (v_i - v_j)^2}$$
The bandwidth is the radius of a circle within which points are still considered influential in forming the parameters of the local model. One method for determining the optimum bandwidth is cross-validation (CV):
$$CV(b) = \sum_{i=1}^{n} \left(y_i - \hat{y}_{\neq i}(b)\right)^2$$
where $\hat{y}_{\neq i}(b)$ is the fitted value at site $i$ when that observation is left out of the estimation.
The regression coefficients in Geographically Weighted Regression (GWR) were estimated by Weighted Least Squares (WLS).
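The steps described here (Gaussian kernel weights, a weighted least squares fit at each site, and bandwidth selection by leave-one-out cross-validation) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the R workflow used in the study. It assumes a fixed Gaussian kernel w = exp(-0.5 (d / b)^2), Euclidean distances on planar coordinates, and a short list of candidate bandwidths. All data and function names are hypothetical.

```python
import math


def gaussian_weight(d, b):
    """Gaussian kernel weight for distance d and bandwidth b."""
    return math.exp(-0.5 * (d / b) ** 2)


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def local_wls(X, y, coords, target, b, skip=None):
    """Local coefficients at `target`: solve (X'WX) beta = X'Wy."""
    p = len(X[0])
    XtWX = [[0.0] * p for _ in range(p)]
    XtWy = [0.0] * p
    for j, (row, yj, (u, v)) in enumerate(zip(X, y, coords)):
        if j == skip:  # leave this observation out (used for CV)
            continue
        w = gaussian_weight(math.hypot(u - target[0], v - target[1]), b)
        for a in range(p):
            XtWy[a] += w * row[a] * yj
            for c in range(p):
                XtWX[a][c] += w * row[a] * row[c]
    return solve(XtWX, XtWy)


def cv_score(X, y, coords, b):
    """Leave-one-out cross-validation score for bandwidth b."""
    total = 0.0
    for i in range(len(y)):
        beta = local_wls(X, y, coords, coords[i], b, skip=i)
        y_hat = sum(bk * xk for bk, xk in zip(beta, X[i]))
        total += (y[i] - y_hat) ** 2
    return total


# Hypothetical demo data: 8 sites, an intercept column plus one predictor.
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (1, 1), (2, 1), (3, 1)]
x_vals = [1.0, 2.5, 0.5, 3.0, 2.0, 1.5, 3.5, 0.8]
X = [[1.0, x] for x in x_vals]
# The slope grows with the easting u, so the relationship varies over space.
y = [2.0 + (1.0 + 0.5 * u) * x for (u, _), x in zip(coords, x_vals)]

candidates = [1.0, 2.0, 4.0]  # assumed candidate bandwidths
best_b = min(candidates, key=lambda b: cv_score(X, y, coords, b))
beta_site0 = local_wls(X, y, coords, coords[0], best_b)
print(best_b, beta_site0)
```

Calling `local_wls` once per site yields one coefficient vector per site, which is how a study with 30 sites ends up with 30 local models.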
This WLS method assigns a different weight to each observation location. Spatial heterogeneity must be present for GWR to be applicable. The spatial effect test (spatial heterogeneity) refers to the variation between locations.
Spatial heterogeneity can be tested with the Breusch-Pagan (BP) statistic:
$$BP = \frac{1}{2}\, f^{T} Z \left(Z^{T} Z\right)^{-1} Z^{T} f, \qquad f_i = \frac{e_i^{2}}{\sigma^{2}} - 1$$
where:
e_i = the least squares residual for observation i
σ² = the variance of the least squares residuals
Z = the n × (k+1) matrix containing the standardised vector (z) for every observation
Decision rule: H0 is rejected if BP > χ²(α; k), which means that there is spatial heterogeneity.
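A minimal sketch of the Breusch-Pagan statistic follows. It assumes the form BP = (1/2) f'Z(Z'Z)^(-1)Z'f with f_i = e_i^2 / sigma^2 - 1, where sigma^2 is the mean squared least squares residual and Z has a leading column of ones. The residuals and the Z matrix are hypothetical.

```python
def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def bp_statistic(residuals, Z):
    """Breusch-Pagan statistic from OLS residuals and the matrix Z."""
    n = len(residuals)
    sigma2 = sum(e * e for e in residuals) / n  # assumed variance estimate
    f = [e * e / sigma2 - 1.0 for e in residuals]
    p = len(Z[0])
    ZtZ = [[sum(r[a] * r[c] for r in Z) for c in range(p)] for a in range(p)]
    Ztf = [sum(r[a] * fi for r, fi in zip(Z, f)) for a in range(p)]
    g = solve(ZtZ, Ztf)  # (Z'Z)^-1 Z'f
    return 0.5 * sum(a * b for a, b in zip(Ztf, g))


# Hypothetical residuals whose spread grows with the predictor z.
Z = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
residuals = [0.1, -0.5, 1.0, -2.0]
bp = bp_statistic(residuals, Z)
print(bp)
```

The statistic is zero when the squared residuals are constant and grows as they vary systematically with the columns of Z. It is then compared with the chi-square critical value for the decision.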

Data and research variable
The response variable (Y) is E. coli bacteria. The predictor variables (X) are population density (X1), building density represented by NDBI (X2), and building coverage ratio (X3).

Regression Model
Before the GWR analysis, an OLS regression model was fitted, giving the equation:
$$\hat{y} = -87.3594 + 0.1088X_1 + 152.9582X_2 + 162.6955X_3 \qquad (8)$$
GWR is frequently used to analyse data that exhibit spatial heterogeneity. In this study, the BP value was 34.217, which exceeds the critical value $\chi^2_{(0.05;3)} = 7.815$, so H0 is rejected. This indicates that there is spatial heterogeneity in the model.
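The decision can be checked numerically. The sketch below uses the closed-form distribution function of a chi-square variable with 3 degrees of freedom, F(x) = erf(sqrt(x/2)) - sqrt(2x/pi) exp(-x/2), so only the Python standard library is needed. The BP value and the critical value are the ones reported in the text. The function name is chosen here for illustration.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 df."""
    cdf = (math.erf(math.sqrt(x / 2))
           - math.sqrt(2 * x / math.pi) * math.exp(-x / 2))
    return 1.0 - cdf


bp = 34.217       # BP value reported in the text
critical = 7.815  # chi-square critical value at alpha = 0.05, 3 df

p_at_critical = chi2_sf_df3(critical)  # should be close to 0.05
p_value = chi2_sf_df3(bp)
reject_h0 = bp > critical
print(p_at_critical, p_value, reject_h0)
```

The tail probability at 7.815 is close to 0.05, which confirms the critical value, and the p-value for 34.217 is far below 0.05, which is consistent with rejecting H0.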

GWR Model
The results of the GWR model, obtained with R version 3.2.1, are as follows. The t test indicates that a rise in the building coverage ratio toward 100% leads to an increase in the E. coli value.

Conclusion
This study was designed to determine the influence of building coverage ratio, population density, and building density on ground water quality. The results show that the GWR approach is more appropriate for modelling ground water quality in urban areas because it accounts for the spatial element required by the research (sampling across a geographic area). GWR generated a local regression model at every site (in every neighbourhood association), with building coverage ratio as the influential variable. In addition, a building coverage ratio approaching 100% is one reason why the septic tank of one house is built close to the neighbouring house. This increases the likelihood that ground water is polluted by E. coli bacteria.
Moreover, a high building coverage ratio reduces the infiltration of rainwater into the ground, so rainwater cannot dilute the E. coli pollutant in the soil. In particular, the local model obtained for each neighbourhood association can support recommendations for enforcing DKI Jakarta Local Regulation No. 1 of 2014 on the building coverage ratio of every house across the area.

Table 1. GWR diagnostic information

Table 1 gives a quasi-R² of 68.53%, which means that the predictor variables in the model explain only 68.53% of the variation in the response variable, while the remaining 31.47% is explained by variables outside the model. The GWR model produced a smaller residual sum of squares (SS residual) than the OLS model, which indicates that GWR is the more appropriate of the two. The GWR model also provides an equation for each sampling location.

Table 2. Estimated local GWR parameters at every point

Table 2 shows that the GWR model produced parameter estimates for every point, which form a separate model at each point and give 30 models in total. Based on these estimates, the water quality variable, measured by the E. coli parameter, is affected by the building coverage ratio at every site. For example, from Table 2, the GWR model for site 1 in Pondok Kopi village is:
$$\hat{y} = -80.12 + 0.0073X_1 + 229.69X_2 + 155.23X_3$$