Demand forecasting at the Small and Medium Enterprise (SME) ED Aluminium Yogyakarta using causal, time series, and combined causal-time series approaches

ED Aluminium is the largest Small and Medium Enterprise (SME) in Daerah Istimewa Yogyakarta (DIY), with 90 workers and an ingot production capacity of 1.5 tons (Isnaini, 2014). Inventory data from December 2015 indicate that some products were overstocked (9%) and others were out of stock (83%). This condition arises because the SME still relies on intuition to predict demand. Inventory fluctuation increases inventory cost when overstock occurs and incurs opportunity cost during stockout. To avoid overstock and stockout, demand needs to be determined with a more rigorous method, and forecasting is one such method. This study aims to find the best demand forecasting method for 2015 among causal, time series, and combined causal-time series approaches, one that performs better than current practice. The result is the best forecasting method for predicting sales in January-November 2015: SARIMA (3,1,1)(0,1,1)12 for WB, SARIMA (1,1,1)(1,0,1)6 for WSD, SARIMA (1,1,1)(1,1,0)6 for DE, SARIMA (2,1,1)(1,1,0)6 for PE, and SARIMA (2,1,3)(0,1,0)12 for PT.


Introduction
ED Aluminium is the largest Small and Medium Enterprise (SME) in Daerah Istimewa Yogyakarta (DIY), with 90 workers and an ingot production capacity of 1.5 tons [1]. It makes 47 types of products, including Wajan Biasa (WB), Wajan Super Dinar (WSD), Dandang Ekonomis (DE), Panci Tasik (PT), ketel, and wajan batik. Inventory data from December 2015 indicate that some products were overstocked (9%) and others were out of stock (83%). Overstock and stockout can result from relying on intuition throughout production planning, from forecasting to production scheduling.
Inventory fluctuation increases inventory cost when overstock occurs and incurs opportunity cost during stockout. The fluctuation therefore shows a mismatch between the production amount and market demand. ED Aluminium currently predicts demand by adding 7% to the total sales of previous months. To avoid overstock and stockout, demand needs to be determined with a more rigorous method, and forecasting is one such method. Forecasting is commonly used in industry to predict quantities such as raw material requirements and demand.
Forecasting is a method that uses historical data to predict the future [2]. According to Sudjana [3], forecasting is the activity of predicting a future value from past and present data or information that are analyzed scientifically using statistical methods. Forecasts can support decisions that affect the periods ahead. Against this background, this study seeks the best demand forecasting method for 2015 among causal, time series, and combined causal-time series approaches, one that performs better than current practice.

Methodology
Because of the limited availability of historical data at the SME ED Aluminium, demand forecasting in this study is confined to 30 types of product. The study starts with a literature review on forecasting and continues with the collection of historical demand data, which are used to forecast demand in the next period. Before forecasting, the collected data are aggregated by product family: Wajan Biasa (WB), Wajan Super Dinar (WSD), Dandang Ekonomis (DE), Panci Ekonomis (PE), and Panci Tasik (PT). Aggregation reduces product variation and simplifies forecasting. The study uses three forecasting approaches to find the best forecasting method: causal, time series, and combined causal-time series. Once the best method is found, its result is validated and then disaggregated. Disaggregation splits the family-level quantity into quantities for individual products. Finally, the intuition-based prediction, the forecasting result, and actual 2015 sales are compared.
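The aggregation and disaggregation steps can be illustrated with a minimal sketch. The product names and quantities are hypothetical, and splitting a family forecast in proportion to each product's historical sales share is an assumption of this sketch, since the study does not state which disaggregation rule was applied.

```python
def aggregate(product_sales, family_of):
    """Sum per-product monthly sales into per-family monthly sales."""
    family_sales = {}
    for product, series in product_sales.items():
        family = family_of[product]
        if family not in family_sales:
            family_sales[family] = [0.0] * len(series)
        for t, units in enumerate(series):
            family_sales[family][t] += units
    return family_sales


def disaggregate(family_forecast, product_sales, family_of):
    """Split a family-level forecast across products by historical sales share.

    Proportional splitting is an assumption of this sketch, not a rule
    taken from the study.
    """
    family_total = {}
    for product, series in product_sales.items():
        family = family_of[product]
        family_total[family] = family_total.get(family, 0.0) + sum(series)
    result = {}
    for product, series in product_sales.items():
        family = family_of[product]
        share = sum(series) / family_total[family]
        result[product] = family_forecast[family] * share
    return result


# Hypothetical products and quantities, for illustration only.
sales = {"WB-16": [100, 120], "WB-18": [50, 30], "PT-20": [40, 60]}
family_of = {"WB-16": "WB", "WB-18": "WB", "PT-20": "PT"}

families = aggregate(sales, family_of)
split = disaggregate({"WB": 200.0, "PT": 80.0}, sales, family_of)
```

By construction, the product-level quantities returned by `disaggregate` add up to the family-level forecast they were split from.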

Results and Discussion
In this study, forecasting is used to predict demand in 2015 for 30 products, chosen because historical data are available for them. Before forecasting, the data are aggregated into their families: Wajan Biasa (WB), Wajan Super Dinar (WSD), Dandang Ekonomis (DE), Panci Ekonomis (PE), and Panci Tasik (PT). Aggregation makes the forecasting process more effective and efficient. To deliver the best result, the study uses three forecasting approaches: causal, time series, and combined causal-time series.

Causal
The causal approach examines the effect of external and internal factors (independent variables) on demand. The independent variables are selected on the basis of previous research. Table 1 shows the variables used in causal forecasting. Before further processing, the independent variables and the historical sales data are normalized to the same range. Variables are selected using the all-possible-regressions method. Conceptually, evaluating all possible regressions is the best way to choose a regression model [4]. In this method, the model is chosen for the best coefficient of determination (R²), at the point where adding further variables no longer increases it. The causal forecasting model is built with multiple linear regression.
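The normalization step can be sketched as follows. Min-max scaling to the range [0, 1] is assumed here, because the text only says that the variables are brought to the same range; the series values are hypothetical.

```python
def min_max_normalize(values):
    """Rescale a sequence to the range [0, 1] (assumed normalization rule)."""
    low, high = min(values), max(values)
    if high == low:
        # A constant series has no spread to rescale; map it to zeros.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Hypothetical monthly series on very different scales.
selling_price = [25000, 26000, 25500, 27000]
inflation_pct = [0.24, 0.36, 0.17, 0.93]

normalized = {
    "selling_price": min_max_normalize(selling_price),
    "inflation": min_max_normalize(inflation_pct),
}
```

After scaling, a price in rupiah and an inflation rate in percent can enter the same regression without one dominating the other purely because of its units.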

Correlation between Variables
Correlation tests between variables are used to find the x variables (external and internal) that are correlated with y (demand). A correlated variable is indicated by a p-value < 0.05.
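A minimal sketch of such a screening test is given below. It computes the Pearson correlation coefficient and, to stay free of external libraries, estimates the p-value with a permutation test; this is a stand-in chosen for the sketch and not necessarily the test used in the study. The data are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value: share of shuffles with |r| at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical normalized selling price (x) and monthly demand (y).
price = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
demand = [95, 91, 84, 81, 74, 71, 66, 59, 56, 49, 46, 40]

r = pearson_r(price, demand)
p_value = permutation_p_value(price, demand)
is_correlated = p_value < 0.05  # threshold taken from the text
```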

Multicollinearity and Autocorrelation
A good regression model contains neither multicollinearity nor autocorrelation. The absence of multicollinearity is indicated by VIF values close to 1: according to Hanke and Wichern [5], a VIF close to 1 for each variable indicates no multicollinearity problem among the independent variables. If a model does have multicollinearity, it can first be removed by differencing. A good model also contains neither positive nor negative autocorrelation. Positive autocorrelation occurs when the Durbin-Watson value d is less than dL (dL = 1.4073), whereas negative autocorrelation occurs when (4 - d) is less than dL. In this model the Durbin-Watson value is 1.77111, so there is no negative autocorrelation, while positive autocorrelation is inconclusive (dL < d < dU). Besides multicollinearity and autocorrelation, unusual R observations (observations with large residuals) can also be detected. They can distort the interpretation of the regression result, so they have to be removed before the process continues.
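The two diagnostics can be sketched as below, assuming the standard definitions of the Durbin-Watson statistic and the variance inflation factor. The value of dU is not reported in the text, so the `d_upper=1.80` used in the example is purely illustrative.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic d computed from a sequence of regression residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def classify_autocorrelation(d, d_lower, d_upper):
    """Apply the usual Durbin-Watson decision rule (assumes d_upper < 2)."""
    if d < d_lower:
        return "positive autocorrelation"
    if d <= d_upper:
        return "inconclusive (positive)"
    if 4 - d < d_lower:
        return "negative autocorrelation"
    if 4 - d <= d_upper:
        return "inconclusive (negative)"
    return "no autocorrelation"


def vif(r_squared_j):
    """VIF of predictor j, given R-squared from regressing it on the other predictors."""
    return 1.0 / (1.0 - r_squared_j)


# d and dL are the values quoted in the text; d_upper is a hypothetical placeholder.
verdict = classify_autocorrelation(1.77111, d_lower=1.4073, d_upper=1.80)
```

With these inputs the verdict is inconclusive on the positive side, which matches the reading dL < d < dU given in the text. A predictor that is unrelated to the others has `vif(0.0) == 1.0`, the ideal value.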

Finding Best Subset
The best subset, or best regression model, is the model with the highest adjusted R² and a random residual plot. Best-subset selection is performed when there is more than one correlated variable.
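A minimal sketch of best-subset selection by adjusted R² follows. It fits ordinary least squares on every non-empty subset of candidate variables through the normal equations. The variable names and data are hypothetical, and the check for a random residual plot is a visual step that the sketch does not cover.

```python
from itertools import combinations


def solve(a, b):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def adjusted_r2(columns, y):
    """Fit y on an intercept plus the given columns; return adjusted R-squared."""
    n, k = len(y), len(columns)
    rows = [[1.0] + [col[t] for col in columns] for t in range(n)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k + 1)]
           for i in range(k + 1)]
    xty = [sum(r[i] * y[t] for t, r in enumerate(rows)) for i in range(k + 1)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((a - f) ** 2 for a, f in zip(y, fitted))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def best_subset(candidates, y):
    """Return (adjusted R-squared, variable names) of the best-scoring subset."""
    names = sorted(candidates)
    scored = []
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            score = adjusted_r2([candidates[name] for name in subset], y)
            scored.append((score, subset))
    return max(scored)


# Hypothetical candidate variables and demand series.
candidates = {
    "price": [1, 2, 3, 4, 5, 6, 7, 8],
    "inflation": [3, 1, 4, 1, 5, 9, 2, 6],
    "holiday": [0, 1, 0, 0, 1, 0, 0, 1],
}
demand = [20.1, 17.9, 16.2, 13.8, 12.1, 9.9, 8.2, 5.8]

best_score, best_vars = best_subset(candidates, demand)
```

Adjusted R² is used rather than plain R² because plain R² never decreases when a variable is added, so it would always favor the largest subset.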

Training Data
Once the best variables for regression are known, the next step is to fit the regression on the training data. The four causal forecasting steps above are carried out for all product families (WB, WSD, DE, PE, and PT) to determine whether causal forecasting is feasible and to obtain R² and MAPE. Table 2 summarizes the causal forecasting results for WB, WSD, DE, PE, and PT. Based on Table 2, causal forecasting can be carried out for both WSD and PE. The training result for PE gives an R² of -34.85 with a MAPE of 153.06%. Neither WSD nor PE yields a good forecasting model, because R² is below 75% and MAPE is still above 10%, so another forecasting approach is needed for a better result. The causal and combined causal-time series approaches cannot be continued for WB and PT because no correlated variable was found, while for DE there are unusual R and X observations that cannot be removed.
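The two accuracy measures and the acceptance thresholds quoted above (R² of at least 75%, MAPE of at most 10%) can be sketched as follows, assuming the usual definitions of MAPE and of R² as one minus the ratio of residual to total sum of squares. The sample numbers are hypothetical.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


def r_squared(actual, forecast):
    """Coefficient of determination; negative when the forecast is worse than the mean."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def is_acceptable(actual, forecast, min_r2=0.75, max_mape=10.0):
    """Apply the thresholds quoted in the text to a set of forecasts."""
    return (r_squared(actual, forecast) >= min_r2
            and mape(actual, forecast) <= max_mape)


# Hypothetical demand and two hypothetical forecasts.
actual = [100, 110, 90, 105]
good_forecast = [98, 112, 91, 104]
poor_forecast = [200, 50, 180, 30]

good_ok = is_acceptable(actual, good_forecast)
poor_r2 = r_squared(actual, poor_forecast)
```

The poor forecast gives a negative `poor_r2`, which shows how a strongly negative R² such as the one reported for PE can arise: the fitted values miss the data by more than a constant mean prediction would.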
Inflation and selling price are correlated with demand in the majority of aggregate units. Inflation has a significant effect on consumption in Indonesia [6]: when inflation decreases, consumption increases. Selling price, on the other hand, directly affects customer preference: customers prefer the lower price for the same quality.

Combined Causal-Time Series
Based on the correlation and regression tests, the only aggregate unit that can proceed to modeling with the combined forecasting approach is PE. The combined approach cannot be continued for WSD because, with up to 19 historical data points, R² stays below 75% and an unusual R observation appears at the 14th observation. For PE, the best R² in combined forecasting is obtained with 19 historical data points (Yt-19) and equals 69.76%. Unusual R observations appear at the 4th and 23rd observations. As explained above, unusual R observations should be removed from the regression because they affect the result. Once the 4th and 23rd observations are removed, combined forecasting can continue: no unusual R observation remains and R² rises to 83.45%. However, when the regression is rerun with the selected variables (p-value < 0.05), an unusual R observation reappears at the 17th observation. Once that is removed, unusual R observations appear at the 16th, 26th, and 30th observations, so this combined forecasting cannot be continued.
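One plausible reading of the combined approach is a regression whose predictors are the causal variables together with past demand values Yt-1, ..., Yt-k. The sketch below builds such a design matrix under that assumption; the function name, the lag structure, and the data are hypothetical and are not taken from the study.

```python
def add_lagged_demand(demand, causal, max_lag):
    """Build rows of [causal variables..., y(t-1), ..., y(t-max_lag)].

    Returns (rows, targets). The first max_lag periods are dropped because
    their lagged values are not available.
    """
    rows, targets = [], []
    for t in range(max_lag, len(demand)):
        lags = [demand[t - k] for k in range(1, max_lag + 1)]
        rows.append(list(causal[t]) + lags)
        targets.append(demand[t])
    return rows, targets


# Hypothetical demand and one causal variable per period.
demand = [10, 12, 11, 15, 14]
causal = [[1.0], [1.1], [1.2], [1.3], [1.4]]

rows, targets = add_lagged_demand(demand, causal, max_lag=2)
```

Each additional lag costs one usable observation, so a model with 19 lags on a short monthly history leaves few rows for fitting, which may help explain why individual unusual observations have such a strong influence on the result.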

Time Series
Based on the results above, none of the five aggregate units can be forecast with the causal or combined approach because MAPE and R² are still unsatisfactory. Another approach, time series forecasting, is therefore needed. There are many time series forecasting models, so the appropriate one has to be chosen on the basis of the data plot, the period, the autocorrelation function (ACF), and the partial autocorrelation function (PACF). How accurately the model is selected affects the resulting error. The plot of historical demand data presents many obstacles to determining the right model: the best model could not be found from the data plot and ACF alone because MAPE remained above 10%. Further analysis of the stationarity of the data was therefore needed. For that reason, the ARIMA model was tried because of its advantages in handling non-stationary data.
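The role of the ACF in spotting seasonality can be illustrated with a minimal sketch that computes the sample autocorrelation and looks for the lag with the highest value. The series is hypothetical, with one demand spike every twelfth month.

```python
def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


# Hypothetical monthly demand: flat, with a spike every 12th month, over 3 years.
one_year = [10] * 11 + [40]
series = one_year * 3

rho = acf(series, max_lag=12)
# Lag (other than 0) with the highest autocorrelation.
seasonal_lag = max(range(1, 13), key=lambda k: rho[k])
```

For this series the peak falls at lag 12, which would point to a yearly seasonal period. On real demand data the peak is rarely this clean, which is why the ACF and PACF are read together.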
The ARIMA method involves several steps: inspecting the ACF and PACF plots, differencing, determining the right order, and running several scenarios to find the best model. Figure 1 shows that the ACF is highest at lag 12, while the PACF shows a seasonal spike at lag 11. This indicates seasonality in the model, which is then called SARIMA. However, the correct seasonal period cannot be concluded directly because the ACF and PACF peak at different lags. Therefore, in addition to combinations of orders, the SARIMA model also requires combinations of seasonal periods, in order to find the best MAPE and R². The form of SARIMA for PE is ARIMA(p,d,q)(P,D,Q)12. Because the data were differenced once, d = 1. Several candidate SARIMA models are then screened against a p-value threshold of 5%: a model with a p-value below 5% cannot be applied because its errors are correlated with previous lags. The remaining models become the SARIMA scenarios, for which R² and MAPE are determined. The same steps are repeated for WB, WSD, DE, and PT. Table 3 shows the best forecasting model for every aggregate unit.
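The differencing step and the enumeration of SARIMA scenarios can be sketched as follows. The upper bounds on p and q and the candidate seasonal periods (6 and 12, the two periods that appear in the selected models) are assumptions of this sketch; the study does not list the exact scenario grid. Fitting each candidate would require a time series library, for example the SARIMAX class in statsmodels, and is not shown here.

```python
from itertools import product


def difference(series, lag=1):
    """Difference a series: lag=1 is regular, lag=s is seasonal differencing."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def candidate_orders(max_p=3, max_q=3, seasonal_periods=(6, 12)):
    """Enumerate (p, d, q), (P, D, Q, s) scenarios with d fixed to 1.

    The bounds are assumptions made for this sketch.
    """
    grid = []
    for p, q, P, D, Q, s in product(range(max_p + 1), range(max_q + 1),
                                    (0, 1), (0, 1), (0, 1), seasonal_periods):
        grid.append(((p, 1, q), (P, D, Q, s)))
    return grid


# Hypothetical series: regular differencing removes the trend (d = 1).
trend = [1, 4, 9, 16]
first_diff = difference(trend)

scenarios = candidate_orders()
```

Each scenario would then be fitted, checked for uncorrelated residuals, and ranked by MAPE and R². The models reported in the study, such as (2,1,1)(1,1,0) with period 6 for PE, are members of this grid.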

Conclusion
Based on the results and analysis, we conclude that the best forecasting methods for predicting demand in January-November 2015 at ED Aluminium Yogyakarta are SARIMA (3,1,1)(0,1,1)12 for WB, SARIMA (1,1,1)(1,0,1)6 for WSD, SARIMA (1,1,1)(1,1,0)6 for DE, SARIMA (2,1,1)(1,1,0)6 for PE, and SARIMA (2,1,3)(0,1,0)12 for PT. Compared with the other methods, SARIMA has the best MAPE and R², not only on the training data but also in validation, because it accounts for non-stationary data and seasonality. Overall, ED Aluminium is advised to use SARIMA to forecast demand. However, inventory levels and work in process (WIP) should also be considered, so that the reduction of overstock and stockout can be demonstrated with data.