FORECASTING THE VALUE OF INDONESIA'S OIL AND GAS IMPORTS USING SEASONAL AUTOREGRESSIVE INTEGRATED MOVING AVERAGE MODEL



INTRODUCTION
The 1945 Constitution, in Article 33 paragraph (3), states that "Earth, water, and the natural resources contained therein are controlled by the state and used as much as possible for the prosperity of the people" [1]. Natural wealth (commonly called natural resources) is the most important asset in creating and sustaining the life of a country. Ideally, the more natural resources a country has, the easier it is for that country to make its people prosperous. In reality, however, this is not always the case, because many factors influence how successfully a country exploits its natural resources.
Natural resources are the wealth that exists on earth, in the form of both inanimate objects and living things, that can be used to meet the needs of human life [2]. The natural resources of a country are among its greatest potentials for progress. Moreover, natural resources such as oil and gas are a main component in meeting human needs. Oil and gas consist of several main components; in terms of value, oil and gas is the combination of the value of crude oil (petroleum), the value of oil products, and the value of gas [3].
The value of Indonesia's oil and gas exports throughout 2021 reached US$ 12.28 billion, equivalent to 184.2 trillion rupiah (at an assumed rate of US$ 1 = Rp 15,000), while the value of Indonesia's oil and gas imports reached US$ 25.53 billion, equivalent to 382.95 trillion rupiah at the same rate [4]. In addition, Indonesia ranks 24th out of 61 oil-producing countries in the world, with production of 692 thousand barrels per day, while its petroleum demand reaches 1.47 million barrels per day [5]. Viewed from the export value, these statistics give Indonesia the opportunity to make petroleum an asset for increasing the country's valuation. On the other hand, Indonesia's import value is higher than its export value. This creates a dilemma that requires the attention of the Indonesian government.
The high demand for petroleum in Indonesia arises because petroleum is the main source of energy in everyday life [6]. This raises concern that the value of Indonesia's oil and gas imports will increase and thereby affect the prices of cooking fuel, vehicle fuel, and other daily necessities. The historical time series shows that the value of oil and gas imports has risen and fallen repeatedly over several periods, which gives the data series a seasonal pattern. Therefore, a step is needed to anticipate such price increases, namely estimating the future value of oil and gas imports. Because the data are seasonal, the forecasting can be done with a time series method, the Seasonal Autoregressive Integrated Moving Average (SARIMA) model.

SARIMA is a time series model that uses the behavior of past observations in data with a seasonal pattern [7][8][9]. Several previous studies have discussed oil values and the SARIMA model. One study forecast the number of passengers of the PT. Angkasa Pura I (Persero) Yogyakarta Adisutjipto Airport branch office [10] and showed that the SARIMA model gives more accurate forecasts, with a smaller Mean Squared Deviation (MSD), than Winter's Exponential Smoothing method. Another study, which forecast the number of flood disasters in Indonesia using the SARIMA model, obtained a Mean Absolute Percentage Error (MAPE) of 8.7%; because this is below the 10% accuracy criterion, the forecast is of very good quality [11]. In addition, research on forecasting the Kelayang drought index using the SARIMA and SPI methods obtained a mean squared value of 4,111.51 [12]. The research in [13] applied SARIMA to forecast the production of distillate fuel oil refinery and propane blender net, which also shows that a time series algorithm such as SARIMA can be used to forecast the amount of fuel oil production. Based on these studies, the purpose of this study is to forecast the value of Indonesia's oil and gas imports using the SARIMA model, with spectral analysis used to find the periodicity in the data.

Research Data
The data used in this research are secondary data: 216 monthly observations of the value of Indonesia's oil and gas imports from January 2005 to December 2022, collected from the official website of the Central Statistics Agency (BPS) [14].

Research Stages
The stages of research conducted in this article are as follows:
1. Collect data from the official website of the BPS.
2. Input the oil and gas import value data with the help of R software.
3. Display a plot of the oil and gas import value against time.
4. Look for the seasonal period contained in the data with spectral analysis. Spectral analysis is one of the methods for analyzing time series in the frequency domain. This method is part of a statistical analysis based on the concept of frequency, which is described in the form of a spectrum (range) [15][16]. Spectral analysis is also a form of Fourier transform, namely one that converts a time series into a set of sine and cosine waves at various frequencies [17][18]. It can be used to find hidden periodicity in time series data. The basic concept and purpose of spectral analysis is to calculate the periodogram and display the power spectral lines.
The periodogram is a function of the power spectrum against frequency. In time series data, the periodogram is used to reveal hidden periodicity by looking at the frequencies that are paired with the peaks of the spectral lines. The period obtained in this way is then used as the period of the time series model [19]. The periodogram is calculated from the Fourier coefficients $a_k$ and $b_k$ of the series.
5. Test the stationarity of the data with respect to the variance and the mean with the Box-Cox transformation and the Augmented Dickey-Fuller test.
6. Carry out the Box-Cox transformation and the differencing process (if the data are not stationary) and re-test the stationarity of the data.
7. Display a plot of the oil and gas import value data that are already stationary.
8. Specify temporary SARIMA models with the general model SARIMA$(p, d, q)(P, D, Q)^s$.
9. Estimate the parameters of the SARIMA models using maximum likelihood estimation.
10. Test the significance of the AR and MA parameters of the model. The hypotheses of the parameter significance test are $H_0$: the parameter is not significant (equal to zero) and $H_1$: the parameter is significant (not equal to zero).
11. Carry out residual diagnostics with the Ljung-Box test to check whether the model meets the white noise assumption.
12. Carry out residual diagnostics with a normal distribution test.
13. Determine the best model from the several SARIMA models, then forecast with the best model.
14. Test the accuracy of the forecasting results using the MAPE value.
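The accuracy check in step 14 can be illustrated with a minimal sketch of the MAPE calculation. The function name `mape` and the numbers are hypothetical and are not the study's data; the definition used is the common one, the mean of absolute errors relative to the observed values, expressed in percent.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal in length")
    # Each term is the absolute error relative to the observed value.
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical observed values and forecasts (illustrative numbers only).
actual = [2000.0, 2500.0, 4000.0]
forecast = [1900.0, 2600.0, 4200.0]
score = mape(actual, forecast)
print(f"MAPE = {score:.2f}%")
```

Under the accuracy criterion cited in the introduction, a MAPE below 10% would be read as a forecast of very good quality.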

Data on Indonesia's Oil and Gas Import Value
The oil and gas import values used in this article cover January 2005 to December 2022. The data are monthly, with 216 observations in total. The values are displayed as a time series plot in order to analyze the characteristics of the data. Figure 1 shows repeated increases and decreases in the value of oil and gas imports. The highest value was in July 2022 and the lowest value was in May 2020. The repeated increases and decreases indicate that the data contain a seasonal pattern. In addition, the data are not stationary, because they do not fluctuate around a fixed mean. Therefore, spectral analysis is needed to determine the seasonal period in the data, and stationarity tests are needed to test the stationarity of the data.

Spectral Analysis and Periodogram
Using the periodogram, the periodogram values of the data on Indonesia's oil and gas import values are obtained as shown in Table 1. The highest periodogram value occurs at the 5th frequency, $\omega_5 = 0.145$, so the seasonal period is $2\pi/\omega_5 \approx 43$. Based on this, the seasonal period in the value of Indonesia's oil and gas imports is 43 months, or $s = 43$.
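The search for the seasonal period can be sketched as follows. The exact periodogram formula of the original is not preserved here, so this sketch assumes one common convention: Fourier frequencies $\omega_k = 2\pi k/n$, coefficients $a_k = \frac{2}{n}\sum_t Z_t\cos(\omega_k t)$ and $b_k = \frac{2}{n}\sum_t Z_t\sin(\omega_k t)$, and $I(\omega_k) = \frac{n}{2}(a_k^2 + b_k^2)$. The input series is synthetic, not the BPS data.

```python
import math


def periodogram(z):
    """Return a list of (k, omega_k, I(omega_k)) for k = 1 .. floor(n/2)."""
    n = len(z)
    rows = []
    for k in range(1, n // 2 + 1):
        omega = 2.0 * math.pi * k / n
        # Fourier coefficients at frequency omega_k (assumed convention).
        a = (2.0 / n) * sum(x * math.cos(omega * t) for t, x in enumerate(z, start=1))
        b = (2.0 / n) * sum(x * math.sin(omega * t) for t, x in enumerate(z, start=1))
        rows.append((k, omega, (n / 2.0) * (a * a + b * b)))
    return rows


# Synthetic series with n = 216 and exactly 5 cycles over the sample.
n = 216
z = [10.0 + 3.0 * math.sin(2.0 * math.pi * 5 * t / n) for t in range(1, n + 1)]

k_peak, omega_peak, i_peak = max(periodogram(z), key=lambda row: row[2])
period = 2.0 * math.pi / omega_peak  # equivalently n / k_peak
print(k_peak, round(omega_peak, 3), round(period))
```

With $n = 216$, a peak at the 5th Fourier frequency gives $\omega_5 = 2\pi \cdot 5/216 \approx 0.145$ and a period of $216/5 = 43.2$, which rounds to the $s = 43$ reported in the text.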

Stationary Test of Variance
Stationarity with respect to the variance means that the time series has a constant variance over time. It is tested here with the Box-Cox transformation. The results of the Box-Cox test are as follows. Table 2 shows that the data on the value of oil and gas imports are not stationary with respect to the variance, because the rounded value of lambda is 0.689, whereas data are said to be stationary if lambda is 1. Therefore, the data need to be transformed using the formula $Z_t^{0.689}$. After the transformation, the data are tested again for stationarity with Box-Cox. Table 3 shows that the transformed data are stationary with respect to the variance because lambda is 1. Next, the data are tested for stationarity with respect to the mean.
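The transformation step can be sketched as a simple power transform. This assumes the convention in which the series is raised to the power lambda when lambda is not zero and the natural logarithm is taken when lambda is zero; the function name `power_transform` and the input values are hypothetical.

```python
import math


def power_transform(z, lam):
    """Apply Z_t ** lam (or ln Z_t when lam == 0) to a positive series."""
    if any(x <= 0 for x in z):
        raise ValueError("the power transform requires strictly positive values")
    if lam == 0:
        return [math.log(x) for x in z]
    return [x ** lam for x in z]


# Hypothetical import values (illustrative numbers only), with lambda = 0.689.
values = [1200.0, 2500.0, 4100.0]
transformed = power_transform(values, 0.689)
print([round(v, 2) for v in transformed])
```

A lambda of 1 leaves the series unchanged, which is why a rounded value of 1 is read as "no transformation needed".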

Stationary Test of Mean
Stationarity with respect to the mean can be assessed from the p-value of the Augmented Dickey-Fuller (ADF) test. In the ADF test, if the p-value < 0.05, the data are stationary with respect to the mean. The ADF results are given in Table 4. Table 4 shows that the p-value is 0.317 > 0.05, so the data are not yet stationary with respect to the mean. Differencing is therefore carried out, by calculating the difference between $Z_t$ and $Z_{t-1}$. The differenced data are then tested again with the ADF test. Table 5 shows that the p-value is 0.01 < 0.05, so the differenced data are stationary with respect to the mean. In the next stage, the ACF values are calculated and plotted to find the lags that fall outside the confidence interval, which are used to define a provisional model. In Figure 2, the ACF after differencing does not appear to decrease continuously. In addition, fewer than 3 lags fall outside the confidence interval. Thus, the data are stationary with respect to the mean.
In addition, the ACF graph can be used to check stationarity in the seasonal part of the data. Based on Table 1, the seasonal period is $s = 43$, so the ACF graph in Figure 2 is inspected at the seasonal lags for values significantly different from zero. On the ACF graph, no seasonal lag appears to fall outside the confidence interval, so the seasonal model cannot be determined. Therefore, one seasonal differencing is carried out, that is, at lag 43.
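The two differencing operations described above can be sketched with one helper. The function name `difference` is hypothetical and the series is synthetic: a linear trend plus a pattern that repeats every 43 observations, chosen so that one regular difference followed by one seasonal difference at lag 43 removes both.

```python
def difference(z, lag=1):
    """Return the lag-differenced series Z_t - Z_{t-lag}."""
    return [z[t] - z[t - lag] for t in range(lag, len(z))]


# Synthetic series of 216 points: linear trend plus a period-43 pattern.
z = [0.5 * t + (t % 43) for t in range(216)]

first = difference(z)               # regular differencing, d = 1
seasonal = difference(first, 43)    # seasonal differencing at lag 43, D = 1
print(len(first), len(seasonal))
```

Each differencing shortens the series by its lag, so 216 observations leave 215 after the regular difference and 172 after the seasonal difference at lag 43.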

Figure 3. ACF Value of Indonesia's Oil and Gas Imports is Stationary to Seasonality
Based on Figure 3, the ACF after seasonal differencing does not appear to decrease continuously. In addition, there is a seasonal lag outside the confidence interval. Thus, the data are said to be stationary with respect to the mean in the seasonal model.
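Reading lags "outside the confidence interval" from an ACF graph can be sketched numerically. The sketch below computes the sample autocorrelation and flags lags whose absolute value exceeds the approximate 95% bound $1.96/\sqrt{n}$; that bound is an assumption, since the limits used in the figures are not stated. The function names and the input series are hypothetical.

```python
import math


def acf(z, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag."""
    n = len(z)
    mean = sum(z) / n
    c0 = sum((x - mean) ** 2 for x in z)
    return [
        sum((z[t] - mean) * (z[t + k] - mean) for t in range(n - k)) / c0
        for k in range(1, max_lag + 1)
    ]


def significant_lags(z, max_lag):
    """Lags whose autocorrelation lies outside +/- 1.96 / sqrt(n)."""
    bound = 1.96 / math.sqrt(len(z))
    return [k for k, r in enumerate(acf(z, max_lag), start=1) if abs(r) > bound]


# Strongly alternating series (illustrative only): every lag is significant.
z = [1.0, -1.0] * 50
r = acf(z, 2)
print([round(v, 2) for v in r], significant_lags(z, 2))
```

For a seasonal check, the same function would be evaluated at multiples of the seasonal period, for example lag 43.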

SARIMA Model Identification
The general form of the SARIMA model is SARIMA$(p, d, q)(P, D, Q)^s$ [20][21][22]. The order $s$ is the period of the seasonality, here $s = 43$. This model combines two parts, non-seasonal and seasonal. The orders of both parts are identified from the lags that fall outside the confidence interval limits: the ACF and PACF lags that are outside the limits, that is, significantly different from zero, are used as the orders of the temporary SARIMA model.
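For reference, the SARIMA$(p, d, q)(P, D, Q)^s$ notation is commonly expanded in backshift-operator form as below. The symbols ($B$, $\phi$, $\Phi$, $\theta$, $\Theta$, $Z_t$, $a_t$) and the sign conventions follow standard Box-Jenkins usage and are assumptions here, since the original equation is not preserved.

```latex
\phi_p(B)\,\Phi_P(B^s)\,(1-B)^d\,(1-B^s)^D Z_t = \theta_q(B)\,\Theta_Q(B^s)\,a_t
% with
% \phi_p(B)     = 1 - \phi_1 B - \dots - \phi_p B^p            (non-seasonal AR)
% \Phi_P(B^s)   = 1 - \Phi_1 B^s - \dots - \Phi_P B^{Ps}       (seasonal AR)
% \theta_q(B)   = 1 - \theta_1 B - \dots - \theta_q B^q        (non-seasonal MA)
% \Theta_Q(B^s) = 1 - \Theta_1 B^s - \dots - \Theta_Q B^{Qs}   (seasonal MA)
% B Z_t = Z_{t-1}, and a_t is the white noise residual.
```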

Non-Seasonal Model
The non-seasonal model is described by the order $(p, d, q)$. The value $p$ is the Autoregressive (AR) order, read from the lags of the non-seasonal PACF graph that fall outside the interval limit; $d$ is the number of non-seasonal differencing operations; and $q$ is the Moving Average (MA) order, read from the lags of the non-seasonal ACF graph that fall outside the interval limit. From Figure 4, the PACF values outside the interval are at lag 1 and lag 2, so 2 lags fall outside the interval limit on the PACF graph and, based on [23], the order is $p = 2$, or AR(2). From Figure 5, the ACF values outside the confidence interval are at lag 1 and lag 10, so 2 lags fall outside the interval limit on the ACF graph and, based on [23], the order is $q = 2$, or MA(2). One differencing was performed in the stationarity test, so $d = 1$.

Seasonal Model
The seasonal model is described by the order $(P, D, Q)$. The value $P$ is the Autoregressive (AR) order, read from the lags of the seasonal PACF graph that fall outside the interval limit; $D$ is the number of seasonal differencing operations; and $Q$ is the Moving Average (MA) order, read from the lags of the seasonal ACF graph that fall outside the interval limit. Based on Figure 6 and Figure 7, the PACF lag outside the interval is lag 43, so the order is $P = 1$. The ACF lag outside the confidence interval is lag 43, so the order is $Q = 1$. One seasonal differencing was performed in the stationarity test, so $D = 1$.

Estimation and Test of Significance of Parameters of the SARIMA Model
At this stage, the parameters of the temporary SARIMA models are estimated. Of the 15 candidate models, only 6 met the significance test, as shown in Table 6.
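The significance decision for a single parameter can be sketched as a comparison of the t statistic with the critical value 1.971 used in this article. The function name `is_significant`, the estimates and the standard errors are hypothetical and are not taken from Table 6.

```python
def is_significant(estimate, std_error, t_critical=1.971):
    """Reject H0 (parameter equals zero) when |estimate / std_error| > t_critical."""
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    return abs(estimate / std_error) > t_critical


# Hypothetical parameter estimates with standard errors (illustrative only).
parameters = {"ar1": (-0.45, 0.07), "ar2": (0.05, 0.07)}
for name, (estimate, std_error) in parameters.items():
    print(name, is_significant(estimate, std_error))
```

A model would pass this stage only if every one of its AR and MA parameters is flagged as significant.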

Test the Residual Assumptions of the SARIMA Model
A model is feasible if its residuals meet the white noise assumption and pass the normal distribution test. At this stage, the Ljung-Box test is carried out to find out whether the model meets the residual white noise assumption, and the Chi-Square test is carried out for the normal distribution test. The 6 models that met the significance test criteria were checked against both assumptions, as shown in the tables below. Based on Table 7, all of the models meet the white noise assumption. Based on Table 8, only 3 models meet the normal distribution test.
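The Ljung-Box statistic behind the white noise check can be sketched as $Q = n(n+2)\sum_{k=1}^{m} r_k^2/(n-k)$, where $r_k$ is the residual autocorrelation at lag $k$. The sketch compares $Q$ with a chi-square critical value supplied by the caller instead of computing a p-value; the residuals are synthetic, and the function name is hypothetical. In practice the degrees of freedom are usually reduced by the number of estimated parameters, which is not handled here.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1 .. max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    c0 = sum((e - mean) ** 2 for e in residuals)
    total = 0.0
    for k in range(1, max_lag + 1):
        # Residual autocorrelation at lag k.
        r_k = sum(
            (residuals[t] - mean) * (residuals[t + k] - mean) for t in range(n - k)
        ) / c0
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total


# Strongly autocorrelated synthetic residuals: clearly not white noise.
residuals = [1.0, -1.0] * 50
q = ljung_box_q(residuals, max_lag=1)
critical = 3.841  # chi-square 5% critical value for 1 degree of freedom
print(round(q, 2), "white noise" if q < critical else "not white noise")
```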

Best Model Selection
After identifying the models, estimating the parameters, and testing the models on the value of Indonesia's oil and gas imports, several models are obtained that are suitable for forecasting. The best model is then selected as the one with the smallest Mean Absolute Error (MAE). The MAE values for the SARIMA models are given below.

Table 9
Model                  MAE
...                    286.15
(0,1,1)(0,1,1)^43      286.71

Table 9 shows that SARIMA$(2,1,0)(0,1,1)^{43}$ is the best model and is feasible to use for forecasting. This model was chosen because it has the smallest MAE value.
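Selecting the model with the smallest MAE can be sketched as follows. The model labels and all numbers are hypothetical placeholders and do not reproduce Table 9.

```python
def mae(actual, fitted):
    """Mean Absolute Error between observed and fitted values."""
    if len(actual) != len(fitted) or not actual:
        raise ValueError("actual and fitted must be non-empty and equal in length")
    return sum(abs(a - f) for a, f in zip(actual, fitted)) / len(actual)


# Hypothetical observations and fitted values from two candidate models.
actual = [10.0, 12.0, 9.0, 11.0]
candidates = {
    "model_a": [9.0, 12.5, 9.5, 10.0],
    "model_b": [10.5, 11.0, 9.0, 13.0],
}

scores = {name: mae(actual, fitted) for name, fitted in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```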

Forecasting Results
After obtaining SARIMA$(2,1,0)(0,1,1)^{43}$ as the best model, the general form of the SARIMA model is written as follows.
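Under the standard backshift-operator convention (an assumption, since the original equation is not preserved), the orders $p = 2$, $d = 1$, $q = 0$, $P = 0$, $D = 1$, $Q = 1$ and $s = 43$ give the form below. Here $Z_t$ denotes the transformed series, $a_t$ the white noise residual, and $\phi_1$, $\phi_2$, $\Theta_1$ the parameters to be estimated; no estimated values are implied.

```latex
(1 - \phi_1 B - \phi_2 B^2)\,(1 - B)\,(1 - B^{43})\,Z_t = (1 - \Theta_1 B^{43})\,a_t
```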

Figure 1. Time Series Plot of Indonesia's Oil and Gas Import Values

Figure 6. PACF graph for seasonal modeling
Figure 7.

For the parameter significance test (step 10), the criterion is to reject $H_0$ if p-value $< \alpha$ or $|t_{\text{calc}}| > t_{\alpha/2;(n-k)}$, where $k = 2$. Parameters with p-value < 0.05 or $|t_{\text{calc}}| > 1.971$ are significant model parameters, so they can be used in the feasibility test at the next stage of the model.
For the Ljung-Box residual diagnostic (step 11), accept $H_0$, that is, the model meets the white noise assumption, if p-value $> \alpha$.
For the normal distribution test of the residuals (step 12), the hypotheses are as follows.
$H_0$: the data are normally distributed
$H_1$: the data are not normally distributed
Accept $H_0$, that is, the model meets the normal distribution assumption, if $\chi^2_{\text{calc}} < \chi^2_{(\alpha,\,df)}$.

From Table 1, it can be seen that the highest periodogram value is at the 5th frequency, with a periodogram value $I(\omega_5)$ of 22,488,816.1 at $\omega_5 = 0.145$.