FORECASTING ARRIVAL OF FOREIGN TOURISTS USING SEASONAL ARIMA BOX-JENKINS

Indonesia's economy is influenced by many factors, including the tourism sector, which brings many foreign tourists to Indonesia. Because so many foreign tourists visit Indonesia, forecasting is needed to estimate the number of arrivals in the coming months from existing data. The method used is the Seasonal Autoregressive Integrated Moving Average (SARIMA) method. Data on foreign tourist arrivals in Indonesia through Soekarno Hatta Airport were taken from the Central Bureau of Statistics (BPS) of Indonesia. These are secondary data covering the period January 2010 to June 2015, and they show a seasonal pattern. The best model, ARIMA(0,1,1)(0,1,1)$^{12}$, is used to predict the values for the next 6 months. The forecasting results show that the number of arrivals in each month increases compared with the previous year. The highest forecast is for July, at 342536 arrivals, compared with 297878 in the previous year.


INTRODUCTION
Indonesia is a rich country with extraordinary natural resource potential, both biological and non-biological. Indonesia also has the second-highest biodiversity in the world, after Brazil. The diversity of its flora, fauna and ecosystems, as well as its cultural diversity, are potential attractions for the development of ecotourism in the country. Ecotourism and nature tourism are recognized as being particularly conducive to enriching and enhancing the tourism sector, on the basis that these forms of tourism respect the natural heritage and local inhabitants and are in accordance with the carrying capacity of the site.
Forecasting is the activity of predicting what will happen in the future. Many studies have applied the ARIMA and Seasonal ARIMA methods or related numerical methods, including: selection of the best model and forecasting of the number of foreign tourist (wisman) visits to Bali in 2014 [1]; the Box-Jenkins model for forecasting the regional gross domestic product of Bali Province [2]; the use of the ARIMA method in predicting inflation movements [3]; the Crank-Nicolson method for calculating the value of a stock loan with dividends reinvested before redemption [4]; ARIMA by the Box-Jenkins methodology for estimation and forecasting models in higher education [5]; forecasting the number of aircraft passengers at Sultan Iskandar Muda Airport using the SARIMA method [6]; predictive modeling of pelagic fish catch in Malaysia using seasonal ARIMA models [7]; the influence of seasonal factors on time series modeling for river discharge forecasting with the SARIMA method [8]; and other research on forecasting using the SARIMA method [9]-[13].
In this paper we use data on foreign tourists arriving in Indonesia by port of entry (Soekarno Hatta). Soekarno Hatta Airport was chosen because it ranks among the Indonesian airports with the most visitors, being located in the capital city. The data used in this study were taken from the Central Bureau of Statistics. Because the arrival data at Soekarno Hatta Airport show a seasonal pattern, the Seasonal Autoregressive Integrated Moving Average (SARIMA) method is used to predict the number of foreign tourists arriving through this entrance in the future.

Literature Review
ARIMA is often also called the Box-Jenkins time series method. ARIMA is very accurate for short-term forecasting, but less accurate for long-term forecasting, where the forecasts usually tend to become flat (horizontal/constant) over a fairly long period. The Autoregressive Integrated Moving Average (ARIMA) model completely ignores independent variables in making forecasts; it uses past and present values of the dependent variable to produce accurate short-term forecasts. ARIMA is suitable if the observations of a time series are statistically related to each other (dependent). It is one of the most popular and frequently used stochastic time series models [14]. Building an ARIMA model consists of three basic steps: identification, estimation and testing, and diagnostic checking. The ARIMA model can then be used for forecasting if the model obtained is adequate.

Stationary and Non-stationary
Most time series are non-stationary, whereas the AR and MA components of the ARIMA model apply only to stationary series. A series is stationary when its mean and variance are constant: there is no growth or decline in the data, which must be roughly horizontal along the time axis. In other words, the data fluctuate around a constant mean, independent of time, and the variance of these fluctuations remains essentially constant over time. A time series that is not stationary in the mean must be converted into a stationary series by differencing, that is, by computing the changes between successive observations. The differenced series is then checked again for stationarity; if it is still not stationary, it is differenced again. If the variance is not stationary, a logarithmic transformation is applied.
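The two operations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `difference` and `log_transform` and both example series are hypothetical and are not taken from the study's data.

```python
import math


def difference(series, lag=1):
    """Return the differenced series y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def log_transform(series):
    """Natural-log transform, used when the variance grows with the level."""
    return [math.log(v) for v in series]


# Hypothetical series with a linear trend (not the paper's data).
trend = [100, 110, 120, 130, 140, 150]
first_diff = difference(trend)        # the trend is removed after one pass
second_diff = difference(first_diff)  # differencing again, if still needed

# Hypothetical series whose fluctuations grow with the level.
growing = [100.0, 200.0, 400.0, 800.0]
log_diff = difference(log_transform(growing))  # equal steps on the log scale

print(first_diff, second_diff, log_diff)
```

In practice each differenced series would be re-examined (for example through its ACF) before deciding whether another pass is needed.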

Classification of ARIMA models
The ARIMA model has three components: the Autoregressive (AR) part, the Integrated (I) part, and the Moving Average (MA) part. The orders of the best model can be identified from the Autocorrelation and Partial Autocorrelation plots [15]. The three components can be combined to form new models, for example the autoregressive moving average (ARMA) model. The general form is written ARIMA(p,d,q), where p denotes the AR order, d the order of integration (the number of differences), and q the MA order. A pure AR model of order 1, for example, is written ARIMA(1,0,0).
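For reference, the ARIMA(p,d,q) notation can be written out in the standard Box-Jenkins backshift form. The symbols below ($B$, $\phi_i$, $\theta_j$, $Z_t$, $a_t$) are assumptions of this sketch, since the text does not define them, and the constant term is omitted.

```latex
% Backshift operator: B Z_t = Z_{t-1}
\phi_p(B)\,(1 - B)^d\, Z_t = \theta_q(B)\, a_t,
\qquad
\phi_p(B) = 1 - \phi_1 B - \dots - \phi_p B^p,
\qquad
\theta_q(B) = 1 - \theta_1 B - \dots - \theta_q B^q .

% Example: ARIMA(1,0,0), i.e. a pure AR(1) model
Z_t = \phi_1 Z_{t-1} + a_t .
```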

Seasonal Autoregressive Integrated Moving Average (SARIMA) Method
The general form of the seasonal moving average process of order $Q$ with seasonal period $s$, or MA$(Q)_s$, is defined as follows:
$$Z_t = a_t - \Theta_1 a_{t-s} - \Theta_2 a_{t-2s} - \dots - \Theta_Q a_{t-Qs},$$
where $a_t$ is independent of $Z_{t-1}, Z_{t-2}, \dots$ and is normally distributed with mean 0 and variance $\sigma_a^2$. The general form of the seasonal autoregressive process of order $P$ with seasonal period $s$, or AR$(P)_s$, is defined as
$$Z_t = \Phi_1 Z_{t-s} + \Phi_2 Z_{t-2s} + \dots + \Phi_P Z_{t-Ps} + a_t,$$
where $a_t$ is independent of $Z_{t-1}, Z_{t-2}, \dots$ and is normally distributed with mean 0 and variance $\sigma_a^2$.
Seasonal data are defined by a pattern that repeats at a fixed time interval. For stationary data, the seasonal factor can be determined by identifying the autocorrelation coefficients at two or three time lags that are significantly different from zero. Autocorrelations that are significantly different from zero indicate a pattern in the data, so a seasonal factor is recognized by high autocorrelations at the seasonal lags. The general notation for a seasonal model is
ARIMA$(p,d,q)(P,D,Q)^s$
with $(p,d,q)$ = non-seasonal part of the model, $(P,D,Q)$ = seasonal part of the model, and $s$ = number of periods per season.
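The seasonal and non-seasonal parts are usually combined multiplicatively. The following is a sketch in standard Box-Jenkins notation, with the backshift operator $B$ (defined by $B Z_t = Z_{t-1}$) and the polynomial symbols taken as assumptions of this illustration rather than as the paper's own notation; the constant term is omitted.

```latex
\phi_p(B)\,\Phi_P(B^s)\,(1 - B)^d\,(1 - B^s)^D\, Z_t
  = \theta_q(B)\,\Theta_Q(B^s)\, a_t,

\Phi_P(B^s) = 1 - \Phi_1 B^{s} - \dots - \Phi_P B^{Ps},
\qquad
\Theta_Q(B^s) = 1 - \Theta_1 B^{s} - \dots - \Theta_Q B^{Qs}.

% Special case (p,d,q)(P,D,Q)^s = (0,1,1)(0,1,1)^{12}:
(1 - B)(1 - B^{12})\, Z_t = (1 - \theta_1 B)(1 - \Theta_1 B^{12})\, a_t .
```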

RESULTS AND DISCUSSION
Data on the number of foreign tourists coming to Indonesia from 2010 to 2015 are as follows: From the data above we obtain the time series plot as follows: Figure 1 shows that arrivals of foreign tourists to Indonesia through Soekarno-Hatta Airport fluctuated from January 2010 to June 2015. Figure 2 shows that the rounded value of lambda (λ) on the Box-Cox plot is 1, with a confidence interval of 95%. A rounded value of 1 indicates that the data on the arrival of foreign tourists to Indonesia through Soekarno-Hatta Airport are stationary in variance.
The ACF and PACF plots, on the other hand, show that the data are not stationary in the mean. The autocorrelation function decays slowly: the autocorrelation at each lag differs little from that at the previous lag. The partial autocorrelation function (PACF) cuts off after the first lag. Both patterns indicate that the data are not stationary in the mean, so the data must be differenced to obtain a stationary series. The stages of identifying the model for the differenced data are as follows: The plot of the once-differenced data above indicates that the series is stationary, so it can be used to build the seasonal ARIMA model. The seasonal ARIMA model is determined from the Autocorrelation and Partial Autocorrelation functions as follows: Based on the Autocorrelation (ACF) diagram, the autocorrelation cuts off after lag 1 while the PACF dies down, so the first candidate model is MA(1), that is, ARIMA(0,1,1). The second candidate model is AR(1), that is, ARIMA(1,1,0), because the partial autocorrelation is significant at lag 1.
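The ACF reading described above can be illustrated with a small pure-Python sketch. It computes only the sample ACF (not the PACF or the Box-Cox λ), uses the common approximate 95% bound of 1.96/√n, and runs on a simulated series. The function name `sample_acf`, the MA coefficient 0.6 and the random seed are all assumptions of this example and have nothing to do with the paper's data or estimates.

```python
import math
import random


def sample_acf(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


# Simulated (hypothetical) data: the changes follow an MA(1) process
# w_t = a_t - 0.6 * a_{t-1}; the level is the running sum of the changes.
random.seed(42)
shocks = [random.gauss(0.0, 1.0) for _ in range(201)]
changes = [shocks[t] - 0.6 * shocks[t - 1] for t in range(1, 201)]
level = []
total = 0.0
for w in changes:
    total += w
    level.append(total)

bound = 1.96 / math.sqrt(len(changes))  # approximate 95% significance bound
acf_level = sample_acf(level, 12)
acf_changes = sample_acf(changes, 12)

print("level ACF, lags 1-3:  ", [round(r, 2) for r in acf_level[1:4]])
print("changes ACF, lags 1-3:", [round(r, 2) for r in acf_changes[1:4]])
print("approximate bound:     +/-", round(bound, 2))
```

For a series of this kind one would typically expect the ACF of the level to decay slowly and the ACF of the differenced series to show a single significant spike at lag 1, which is the pattern that points to an MA(1) term.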
Thus the initial candidate models for the non-seasonal part of the data are ARIMA(0,1,1) and ARIMA(1,1,0), from which the best model is then chosen. From Figure 5 and Figure 6 we obtain the seasonal part, for which the candidate is (0,1,1) with a period of 12. The candidate seasonal models are therefore ARIMA(0,1,1)(0,1,1)$^{12}$ and ARIMA(1,1,0)(0,1,1)$^{12}$.
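To make the first candidate concrete, the sketch below shows how forecasts follow from an ARIMA(0,1,1)(0,1,1) model with period 12 once its two MA parameters are known. It is a minimal illustration, not the estimation procedure used in the study: the function `airline_forecast`, the parameter values 0.4 and 0.6, and the monthly series are all hypothetical, pre-sample shocks are simply set to zero, and no parameter estimation or model comparison is performed.

```python
def airline_forecast(series, theta, seasonal_theta, steps, s=12):
    """Forecast an ARIMA(0,1,1)(0,1,1)_s model for GIVEN MA parameters.

    Expanded model:
      Z_t = Z_{t-1} + Z_{t-s} - Z_{t-s-1}
            + a_t - theta*a_{t-1} - Theta*a_{t-s} + theta*Theta*a_{t-s-1}
    """
    z = list(series)
    n = len(z)
    a = [0.0] * n  # residuals; pre-sample shocks are assumed to be zero
    # Recover the in-sample residuals recursively.
    for t in range(s + 1, n):
        predicted = (z[t - 1] + z[t - s] - z[t - s - 1]
                     - theta * a[t - 1]
                     - seasonal_theta * a[t - s]
                     + theta * seasonal_theta * a[t - s - 1])
        a[t] = z[t] - predicted
    # Forecast recursively; future shocks are replaced by their mean, zero.
    forecasts = []
    for h in range(steps):
        t = n + h
        a.append(0.0)
        value = (z[t - 1] + z[t - s] - z[t - s - 1]
                 - theta * a[t - 1]
                 - seasonal_theta * a[t - s]
                 + theta * seasonal_theta * a[t - s - 1])
        z.append(value)
        forecasts.append(value)
    return forecasts


# Hypothetical monthly series: linear trend plus a fixed seasonal pattern.
season = [5, 3, 4, 6, 8, 12, 15, 14, 9, 7, 6, 10]
history = [2 * t + season[t % 12] for t in range(36)]
forecasts = airline_forecast(history, theta=0.4, seasonal_theta=0.6, steps=6)
print(forecasts)
```

Because this hypothetical series contains no noise, all residuals are zero and the six forecasts simply continue the trend-plus-season pattern. With real data the parameters would first be estimated for each candidate model, and the candidates compared, before forecasting.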