FORECASTING THE CONSUMER PRICE INDEX WITH GENERALIZED SPACE-TIME AUTOREGRESSIVE SEEMINGLY UNRELATED REGRESSION (GSTAR-SUR): COMPROMISE REGION AND TIME

ABSTRACT


INTRODUCTION
The Consumer Price Index (CPI) is one of the most important macroeconomic indicators [1], [2]. It is the tool used to measure inflation in a region, and inflation is a key macroeconomic indicator with a broad impact on other economic indicators [3]. According to the Central Statistics Agency (BPS), the CPI is an index that measures the average change in the prices of a set of goods and services consumed by residents or households over a certain period, and it is closely related to inflation. Inflation forecasts are one of the most important inputs to decision-making in the monetary sector. The CPI is also used as a basis for determining the Gross Regional Domestic Product (GRDP), for budget planning, and for other fiscal policies. According to the results of the Cost of Living Survey conducted by BPS, the CPI in Indonesia generally increases every year. In 2018 the national CPI increased by 4.14 from the previous year, and in 2019 it increased again by 4.04 from 2018 [4].
In Indonesia, Central Java Province has a fairly good CPI, as can be seen from the stability of its inflation rate [5], [6]. Over the last four years, inflation in Central Java Province declined every year, falling by 2.1 per cent from 2017 to 2020. Six cities in Central Java Province are targets of the Cost of Living Survey: Cilacap, Purwokerto, Kudus, Surakarta, Semarang, and Tegal. These cities were sampled because their economic development is relatively rapid compared with other cities [7]. For the same reason, these six cities are used as the basis for this study.
CPI data are time series data, so they can be modeled with time series analysis methods. Based on the number of variables studied, time series data can be divided into univariate and multivariate time series [8]. Univariate time series analysis uses a single variable, while multivariate time series analysis uses several variables that are suspected to be interrelated [9]. The CPI of adjacent cities may be linked across locations [5], [6]. This linkage reflects the dependence between regions in meeting their needs for goods and services. Geographical conditions and limited infrastructure affect the availability of goods and services in an area, especially those that cannot be produced locally, and this can affect costs and prices in other locations. Thus, the CPI of a city is related not only to its own values at previous times but also to the CPI of other locations (a spatial relationship) [10].
A model for multivariate time series data that combines dependence on previous times with dependence between locations is called a space-time model [8], [11], [12]. In 1980 Pfeifer and Deutsch introduced a model that combines time and location interdependence, known as Space-Time Autoregressive (STAR) [13]. However, the STAR model has inflexible parameters: it assumes that the locations are homogeneous, so it is not suitable when the locations have heterogeneous characteristics [11], [12], [14]. The Generalized Space-Time Autoregressive (GSTAR) model was developed to address this weakness of the STAR model [15]. The GSTAR model is a generalization of the STAR model that allows the autoregressive (AR) parameter values to vary by location, so it can be applied to heterogeneous locations. However, parameter estimation for GSTAR has mostly been limited to the Ordinary Least Squares (OLS) method. When the residuals are correlated, OLS produces inefficient estimators, which leads to larger errors when the model is used for forecasting. Therefore, the Generalized Least Squares (GLS) method, which is commonly used in Seemingly Unrelated Regression (SUR) models, is used to estimate the parameters when the residuals are correlated [16], [17].
The SUR model is a multivariate linear regression model. It consists of several regression equations whose errors are uncorrelated between observations within an equation but correlated between equations [17]. Habibie compared GSTAR-OLS and GSTAR-SUR in an application to the number of domestic aircraft passengers at Indonesian airports and found that GSTAR-SUR gave better results than GSTAR-OLS [18]. In addition, Septyaningrum forecast the number of tourists at three tourist sites in Pacitan Regency using GSTAR-SUR; the comparison of forecasting accuracy showed that GSTAR-SUR had a smaller RMSE value at all tourist sites [19].
To date, no research has specifically analyzed and forecast the CPI of the six Central Java cities sampled in the Cost of Living Survey (SBH) using the GSTAR-SUR method. Therefore, this article presents the results of modeling and forecasting the CPI of the six SBH cities in Central Java using the GSTAR-SUR time series method.

Time Series Analysis
Time series analysis is a quantitative method for identifying patterns in past data that were collected at regular intervals. It has many purposes, including predicting future values [20]. In terms of the data used, time series analysis is divided into univariate and multivariate time series [8]. A univariate time series consists of observations of one variable that is autocorrelated. In univariate time series analysis, stationarity of the data is one of the conditions that must be considered. A multivariate time series consists of several variables observed over time and recorded sequentially at a fixed time interval [21]. Multivariate time series analysis is used when there is more than one time series, so the model contains many variables. As in the univariate case, multivariate time series analysis also requires attention to the stationarity of the data [22].

Generalized Space-Time Autoregressive Model (GSTAR)
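The GSTAR model is an extension of the STAR model; the main difference lies in the autoregressive parameters. In the STAR model the autoregressive parameters are assumed to be the same for all locations, whereas in the GSTAR model they are allowed to differ between locations. In matrix notation, the GSTAR model with autoregressive order $p$ and spatial orders $\lambda_1, \lambda_2, \ldots, \lambda_p$ is formulated as follows [22], [23]:

$$Z(t) = \sum_{k=1}^{p} \left[ \Phi_{k0} + \sum_{l=1}^{\lambda_k} \Phi_{kl} W^{(l)} \right] Z(t-k) + e(t)$$

where $Z(t)$ is the $N \times 1$ vector of observations at time $t$, $\Phi_{k0}$ and $\Phi_{kl}$ are diagonal matrices of location-specific parameters, $W^{(l)}$ is the spatial weight matrix of order $l$ (zero diagonal, rows summing to one), and $e(t)$ is the error vector.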
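To make the first-order case concrete, the following is a minimal Python sketch of a single GSTAR step with autoregressive order 1 and spatial order 1, assuming the standard form z(t) = Phi_10 z(t-1) + Phi_11 W z(t-1) with diagonal parameter matrices. The function name, the three-location weight matrix, and all coefficient values are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
import numpy as np

def gstar11_step(z_prev, phi0, phi1, W):
    """Conditional mean of z(t) given z(t-1) for a first-order GSTAR model.

    phi0 and phi1 hold one coefficient per location, i.e. the diagonals
    of the (assumed diagonal) matrices Phi_10 and Phi_11.
    """
    z_prev = np.asarray(z_prev, dtype=float)
    # Own-lag term plus the spatially weighted lag of the other locations.
    return phi0 * z_prev + phi1 * (W @ z_prev)

# Hypothetical three-location example (illustrative values only).
W = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])      # zero diagonal, rows sum to one
phi0 = np.array([0.4, 0.3, 0.5])     # own-lag coefficients per location
phi1 = np.array([0.2, 0.1, 0.3])     # spatial-lag coefficients per location
z_prev = np.array([1.0, 2.0, 3.0])

z_hat = gstar11_step(z_prev, phi0, phi1, W)
print(z_hat)
```

Because each location has its own pair of coefficients, the model can describe heterogeneous locations; setting all entries of phi0 equal and all entries of phi1 equal would reduce it to the STAR case.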

Seemingly Unrelated Regression Model (SUR)
The Seemingly Unrelated Regression (SUR) model is a multivariate linear regression model first introduced by Zellner (1962). It consists of several regression equations whose errors are uncorrelated between observations within an equation but correlated between equations. The test used to determine whether the error variance-covariance structure is a SUR structure is the Lagrange Multiplier test, with the following hypotheses [17]:
$H_0: \operatorname{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for all $i \neq j$ (the error variance-covariance structure is heteroskedastic and the errors are not correlated between equations).
$H_1: \operatorname{Cov}(\varepsilon_i, \varepsilon_j) \neq 0$ for at least one pair $i \neq j$ (the error variance-covariance structure is heteroskedastic and the errors are correlated between equations).
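The test can be illustrated with a short Python sketch. The source does not print the formula of the statistic, so the sketch assumes the commonly used Breusch-Pagan form, LM = T times the sum of squared residual correlations over all pairs of equations, compared against a chi-square critical value with N(N-1)/2 degrees of freedom. The residuals are simulated and the function name is hypothetical.

```python
import numpy as np

def lm_test(residuals, critical_value):
    """Breusch-Pagan LM test for cross-equation residual correlation (assumed form).

    residuals: array of shape (T, N), one column per equation (location).
    critical_value: chi-square critical value supplied by the user.
    """
    e = np.asarray(residuals, dtype=float)
    T, N = e.shape
    S = e.T @ e / T                      # residual covariance matrix
    d = np.sqrt(np.diag(S))
    r = S / np.outer(d, d)               # residual correlation matrix
    lower = r[np.tril_indices(N, k=-1)]  # each pair (i > j) counted once
    lm = float(T * np.sum(lower ** 2))
    df = N * (N - 1) // 2
    return lm, df, bool(lm > critical_value)

# Simulated residuals for six locations sharing a common shock (illustrative only).
rng = np.random.default_rng(1)
common = rng.normal(size=(118, 1))
resid = common + 0.5 * rng.normal(size=(118, 6))

lm, df, reject = lm_test(resid, critical_value=24.996)
print(df, reject)
```

With six locations the test has 15 degrees of freedom, for which the 5% chi-square critical value is 24.996.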

Research Data
The data used in this study are secondary data sourced from BPS Central Java Province. The data are quantitative: the monthly Consumer Price Index (CPI) of six cities in Central Java Province (Cilacap, Purwokerto, Kudus, Surakarta, Semarang, and Tegal) from January 2012 to December 2021, giving 120 monthly observations.

Research Steps
The data analysis steps used in this study are as follows:

Descriptive Analysis
A descriptive analysis was carried out to provide an overview of the data used in the study, namely the monthly CPI of six cities in Central Java from January 2012 to December 2021 (120 observations). The summary reports the mean, standard deviation, minimum value, and maximum value, as shown in Table 2.

Correlation of CPI Data in Six Cities
The correlation between regions shows how strongly one region is related to another. The correlation values between the six cities can be seen in Table 3. Table 3 shows that the CPI series of the six cities in Central Java are highly correlated. This indicates that the CPI data are linked at the same points in time and that the CPI values of adjacent locations are strongly related to each other.

Stationarity Test
Time series modeling requires the data to be stationary in both the mean and the variance. Stationarity in the variance can be assessed from Box-Cox results, while stationarity in the mean can be assessed with the Augmented Dickey-Fuller (ADF) test.

Stationarity Test of the Mean
Stationarity of the time series in the mean can be tested with the Augmented Dickey-Fuller (ADF) test. The hypotheses used in the ADF test are as follows:
$H_0: \rho = 1$ (the data are non-stationary)
$H_1: \rho < 1$ (the data are stationary)
H0 is not rejected if the p-value > $\alpha$ = 0.05. Based on Table 4, the CPI data of the six SBH cities in Central Java are not yet stationary: the p-value for each city is greater than 0.05, so H0 is not rejected. Therefore, differencing is needed to make the data stationary. The results of the ADF test after differencing are given in Table 5. After differencing, Table 5 shows that the CPI data of the six cities are stationary: the p-value for each city is less than 0.05, so H0 is rejected and the data can be regarded as stationary.
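The differencing step can be sketched in a few lines of Python. The index values below are hypothetical and serve only to show the operation and how the original levels can be recovered from the differenced series.

```python
import numpy as np

# Hypothetical CPI levels for one city (illustrative values, not the study data).
cpi = np.array([100.0, 100.6, 101.1, 101.9, 102.4])

# First differencing: d(t) = z(t) - z(t-1); the series loses one observation.
d1 = np.diff(cpi)

# The levels can be recovered from the first value and the cumulative differences.
restored = np.concatenate(([cpi[0]], cpi[0] + np.cumsum(d1)))

print(d1)
print(restored)
```

In practice an ADF test would then be applied to the differenced series, for example with `statsmodels.tsa.stattools.adfuller`, assuming that library is available.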

Stationarity Test of the Variance
Stationarity of the time series in the variance can be tested with the Box-Cox test. Data are said to be stationary in the variance if the rounded value of lambda equals 1 (λ = 1). The Box-Cox results for the six locations show that none of them is stationary in the variance, because the rounded value is not equal to 1 and the estimated lambda differs for each location. Any transformation would therefore have to differ by location according to its estimated lambda. For this reason, no transformation is applied and the data are treated as stationary in the variance.
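For reference, the Box-Cox transformation itself can be sketched as follows. This is the standard one-parameter form, y(λ) = (y^λ - 1)/λ for λ ≠ 0 and ln(y) for λ = 0; the function name and the input values are illustrative assumptions.

```python
import numpy as np

def box_cox(y, lam):
    """Standard one-parameter Box-Cox transformation for strictly positive data."""
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return np.log(y)
    return (y ** lam - 1.0) / lam

# With lambda = 1 the transformation only shifts the data by one unit,
# which is why lambda = 1 is read as "no transformation needed".
shifted = box_cox([1.0, 2.0, 3.0], 1)
logged = box_cox([1.0, np.e], 0)
print(shifted, logged)
```

Applying a different lambda to each city would place the six series on different scales, which is one way to read the decision above to leave the data untransformed.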

Test of Residual Correlation between Locations
Before the GSTAR(1_1)-I(1) model (autoregressive order 1, spatial order 1, first-order differencing) is estimated with Seemingly Unrelated Regression (SUR), a Lagrange Multiplier test is carried out to determine whether the residuals of the GSTAR(1_1)-I(1) model are correlated between locations. At a significance level of α = 5%, the chi-square critical value is 24.996 (15 degrees of freedom for six locations). Because the Lagrange Multiplier statistic exceeds this critical value, H0 is rejected, and it can be stated that there is a residual correlation between locations.

GSTAR-SUR Model Parameter Estimation by GLS Method
After the Lagrange Multiplier test, the model with correlated residuals is estimated as a GSTAR-SUR model. The parameters of the GSTAR-SUR model are estimated with the Generalized Least Squares (GLS) method. The significance of the parameter estimates is assessed in the same way as in the GSTAR model: an estimate is significant if |t-value| > t-table. In this study, however, insignificant parameters were not eliminated, so that the weighting from every location is retained in the model. The parameter estimates of the GSTAR-SUR model for the six locations under each weighting scheme are given below.
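The GLS estimation of a SUR system can be sketched as a two-step feasible GLS procedure: OLS residuals from each equation give an estimate of the cross-equation covariance matrix, which is then used to weight the stacked system. The Python sketch below is a generic illustration on simulated data with two equations, not the estimation code used in this study. In the GSTAR setting, the regressors of location i would be its own lagged value and the weighted lag of the other locations.

```python
import numpy as np

def sur_fgls(X_list, y_list):
    """Two-step feasible GLS for a SUR system (illustrative sketch).

    X_list[i] has shape (T, k_i) and y_list[i] has shape (T,).
    Returns stacked coefficients, their t-values, and the estimated
    cross-equation residual covariance matrix.
    """
    N, T = len(y_list), len(y_list[0])
    # Step 1: equation-by-equation OLS residuals.
    resid = np.column_stack([
        y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
        for X, y in zip(X_list, y_list)
    ])
    sigma = resid.T @ resid / T
    # Step 2: stack the equations into one block-diagonal design matrix.
    k = [X.shape[1] for X in X_list]
    X_big = np.zeros((N * T, sum(k)))
    col = 0
    for i, X in enumerate(X_list):
        X_big[i * T:(i + 1) * T, col:col + k[i]] = X
        col += k[i]
    y_big = np.concatenate(y_list)
    # With equation-major stacking, Omega^{-1} = Sigma^{-1} kron I_T.
    omega_inv = np.kron(np.linalg.inv(sigma), np.eye(T))
    xt_oi = X_big.T @ omega_inv
    cov = np.linalg.inv(xt_oi @ X_big)
    beta = cov @ (xt_oi @ y_big)
    t_values = beta / np.sqrt(np.diag(cov))
    return beta, t_values, sigma

# Simulated two-equation system with correlated errors (illustrative only).
rng = np.random.default_rng(0)
T = 200
X1, X2 = rng.normal(size=(T, 2)), rng.normal(size=(T, 2))
common = rng.normal(size=T)
y1 = X1 @ np.array([0.5, 0.2]) + 0.1 * (common + rng.normal(size=T))
y2 = X2 @ np.array([0.3, -0.1]) + 0.1 * (common + rng.normal(size=T))

beta, t_values, sigma = sur_fgls([X1, X2], [y1, y2])
print(beta)
```

The t-values returned here correspond to the |t-value| > t-table check described above. Building the full Kronecker product is fine for small systems such as six cities with about a hundred observations, though it would be wasteful for much larger panels.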

Estimation of GSTAR-SUR(1_1)-I(1) Model Parameters with Uniform Weights
The estimated parameters of the GSTAR-SUR(1_1)-I(1) model with uniform weights are shown in Table 7.
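A uniform weight matrix can be constructed as in the following sketch, which assumes the usual definition in which every other location receives the same weight 1/(N-1) and a location has zero weight on itself. The function name is hypothetical.

```python
import numpy as np

def uniform_weights(n):
    """Uniform spatial weights: w_ij = 1/(n-1) for i != j, and w_ii = 0."""
    W = np.full((n, n), 1.0 / (n - 1))
    np.fill_diagonal(W, 0.0)
    return W

W_uniform = uniform_weights(6)   # six cities, so each neighbour gets 1/5
print(W_uniform[0])
```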

Estimation of GSTAR-SUR(1_1)-I(1) Model Parameters with Inverse Distance Weights
The estimated parameters of the GSTAR-SUR(1_1)-I(1) model with inverse distance weights are shown in Table 8.
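Inverse distance weights can be sketched in the same way, assuming the common definition in which the weight of location j for location i is proportional to 1/d_ij and each row is normalized to sum to one. The distances below are hypothetical and do not correspond to the six cities.

```python
import numpy as np

def inverse_distance_weights(D):
    """Row-normalized inverse distance weights with a zero diagonal."""
    D = np.asarray(D, dtype=float)
    inv = np.zeros_like(D)
    off_diag = ~np.eye(D.shape[0], dtype=bool)
    inv[off_diag] = 1.0 / D[off_diag]          # closer locations get larger weight
    return inv / inv.sum(axis=1, keepdims=True)

# Hypothetical pairwise distances in km between three locations.
D = np.array([[0, 10, 20],
              [10, 0, 40],
              [20, 40, 0]])
W_inv = inverse_distance_weights(D)
print(W_inv)
```

Unlike uniform weights, this matrix is generally not symmetric after row normalization, because each location is normalized by its own total.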

Determination of the Best GSTAR-SUR(1_1)-I(1) Model
After the GSTAR-SUR models have been obtained and their feasibility tested, the accuracy of each model is calculated to select the best one. The best model is determined from the RMSE value of each model: the model with the smallest RMSE is declared the best. The RMSE results for each model are shown in Table 9. The average RMSE of the GSTAR-SUR(1_1)-I(1) model with inverse distance weights, 6.213, is smaller than that of the other weights. It can therefore be concluded that the GSTAR-SUR(1_1)-I(1) model with inverse distance weights is the best model.
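The RMSE comparison can be sketched as follows: an RMSE is computed for each city and then averaged across cities for each weighting scheme. The actual and forecast values below are hypothetical and are not taken from Table 9.

```python
import numpy as np

def rmse_per_city(actual, forecast):
    """RMSE for each column (city) of two arrays of shape (months, cities)."""
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    return np.sqrt(np.mean((a - f) ** 2, axis=0))

# Hypothetical values: three months, two cities.
actual = [[100, 200], [102, 202], [104, 204]]
forecast = [[101, 200], [101, 203], [106, 203]]

per_city = rmse_per_city(actual, forecast)
average_rmse = float(per_city.mean())   # the quantity compared across weights
print(per_city, average_rmse)
```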

Forecasting CPI Data for Each City Using the Best Model
The best model obtained is the GSTAR-SUR(1_1)-I(1) model with inverse distance weights. This model is then used to forecast the Consumer Price Index (CPI) of the six cities in Central Java: Cilacap, Purwokerto, Kudus, Surakarta, Semarang, and Tegal. Forecasts are produced for January 2022 to December 2022. The forecasting results are shown in Table 10. Table 10 shows that the forecast CPI of the six cities increases every month over the next 12 months. However, the increase remains fairly stable, as there is no large spike from one month to the next.
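A 12-month forecast from a first-order GSTAR model fitted to first differences can be sketched as a simple recursion: the model is iterated on the differenced series, and each forecast difference is added to the previous level to return to the CPI scale. The sketch assumes a model without intercept, and the two-location weights, coefficients, and starting values are hypothetical rather than the estimates in Table 8.

```python
import numpy as np

def forecast_levels(last_level, last_diff, phi0, phi1, W, horizon=12):
    """Iterate a first-order GSTAR model on differences and cumulate to levels."""
    level = np.asarray(last_level, dtype=float)
    diff = np.asarray(last_diff, dtype=float)
    path = []
    for _ in range(horizon):
        diff = phi0 * diff + phi1 * (W @ diff)   # forecast of the next difference
        level = level + diff                     # undo the first differencing
        path.append(level)
    return np.array(path)                        # shape (horizon, locations)

# Hypothetical two-location setup (illustrative values only).
W = np.array([[0.0, 1.0],
              [1.0, 0.0]])
phi0 = np.array([0.5, 0.5])
phi1 = np.array([0.0, 0.0])
path = forecast_levels([100.0, 200.0], [1.0, 2.0], phi0, phi1, W)
print(path[:2])
```

With positive last differences and coefficients between 0 and 1, the forecast levels rise every month by shrinking amounts, which is the kind of steady increase without spikes described above.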

CONCLUSIONS
Based on the analysis and discussion, the GSTAR-SUR(1_1)-I(1) model with inverse distance weights is the best model: its RMSE of 6.213 is the smallest among the weighting schemes compared. The CPI forecasts from this model for the next 12 months, January 2022 to December 2022, indicate that the CPI of the six cities will increase every month.