APPLICATION OF NONPARAMETRIC GEOGRAPHICALLY WEIGHTED REGRESSION METHOD ON OPEN UNEMPLOYMENT RATE DATA IN INDONESIA

ABSTRACT


INTRODUCTION
Unemployment is still a significant social and economic problem in many countries, including Indonesia. Unemployment is a situation in which a person who wants to work and is ready to work at the market wage rate cannot find a suitable job. The open unemployment rate is the percentage of the labor force that does not have a job at a certain time and is looking for a suitable job. High open unemployment can harm the economy and lead to social problems such as poverty and instability [1].
An analysis is needed to determine the factors that influence unemployment in Indonesia. One method that can be used is regression, a statistical method that measures the relationship between one or more predictor variables and one response variable in a mathematical model. Regression analysis aims to obtain an estimate of the regression curve [2]. The regression curve can be estimated with three approaches: parametric, nonparametric, and semiparametric regression [3]. Parametric regression assumes that the functional form of the relationship between the variables is known or can be described by a certain mathematical model. Nonparametric regression is used when the functional form of the relationship between the response and predictor variables is unknown. In semiparametric regression, part of the functional form is known and part is unknown. Not all data follow a linear pattern with a known regression curve, so a nonparametric regression approach is needed [4]. The nonparametric regression method is used when the shape and pattern of the regression curve are not known [5]. A truncated spline is a piecewise polynomial that adapts effectively to local patterns in the data while producing a relatively smooth curve [6]. Truncated splines contain knot points, which allow the estimate to follow the data wherever its pattern changes. The Gaussian kernel estimator, by contrast, depends on a bandwidth that controls the smoothness of the regression curve [7].
An ordinary regression model produces the same parameter estimates for all regions, whereas in reality conditions differ from one location to another. A statistical method that can be used to model spatial data is Geographically Weighted Regression (GWR), which produces model parameter estimators that are local to each location [8]. Spatial data whose regression curve has an unknown shape can be modeled with Nonparametric Geographically Weighted Regression (NGWR) with a spline approach. NGWR is a development of nonparametric regression for spatial data in which the parameter estimator is local to each observation location. One advantage of NGWR is that the model tends to follow the data wherever its pattern changes, so it adjusts more effectively to the local characteristics of a function or data [9].
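To illustrate what a "local" estimator means, the following minimal Python sketch fits a weighted least squares regression for a single location, with weights that decide how much each observation counts. The function name `local_coefficients`, the toy data, and the 0/1 weights are hypothetical; the sketch shows the general GWR idea under these assumptions and is not the estimator used in this study.

```python
import numpy as np


def local_coefficients(X, y, weights):
    """Weighted least squares fit for one location.

    X: (n, p) design matrix including an intercept column.
    y: (n,) responses.
    weights: (n,) non-negative geographic weights for this location.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    root_w = np.sqrt(np.asarray(weights, dtype=float))
    # Scaling rows by sqrt(w) turns weighted least squares into ordinary least squares.
    coef, *_ = np.linalg.lstsq(X * root_w[:, None], y * root_w, rcond=None)
    return coef


# Toy data (hypothetical): the relationship differs between two regions.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 5.0, 7.0, 6.0, 5.0]
X = [[1.0, xi] for xi in x]

west = local_coefficients(X, y, weights=[1, 1, 1, 0, 0, 0])  # only the first three points count
east = local_coefficients(X, y, weights=[0, 0, 0, 1, 1, 1])  # only the last three points count
```

In this toy case the two locations receive different intercepts and slopes from the same data, which is the behavior a single global regression cannot reproduce.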
Previous research includes an analysis of the causes of flooding in Samarinda City using the GWR method of spatial statistics, which resulted in a flood control concept for Samarinda City divided into three parts [10]. Another study used a nonparametric regression model to determine the factors that influence increased infant mortality in Kalimantan; nonparametric regression was found useful for developing health and quality of life for people in Kalimantan, as well as for advancing statistical methods [11]. A further study used bi-response nonparametric regression to model and determine the factors affecting dengue hemorrhagic fever cases in AWS Hospital Samarinda. Based on the test results, it identified factors with a significant effect and obtained a large R value [12].
The NGWR model can be used to obtain the best model for open unemployment rate data in Indonesia. The NGWR model is considered capable of handling unknown regression curves and spatially non-stationary data, and it can help researchers identify the variables that contribute to the Open Unemployment Rate (OUR) [13]. In this research, NGWR is used to evaluate the factors that affect the open unemployment rate in Indonesia.

RESEARCH METHODS
The research method contains an explanation of the research design and data analysis.

The Research Data
This research applies NGWR to data on the OUR in Indonesia in 2021 to obtain a model and a mapping of the open unemployment rate in Indonesia based on significant factors. The variables used in this study are as follows.
$y$: open unemployment rate, the percentage of the labor force that does not have a job at a given time and is looking for a suitable job [1].
$x_1$: percentage of population density, the ratio between the number of people in an area and the size of that area, expressed as a percentage [14].
$x_4$: GRDP at current prices, the net value of final goods and services produced by various economic activities in a region in a given period, usually one year [17].
$x_5$: percentage of poor population, the percentage of the population that is below the poverty line [18].

Analysis Technique
Data analysis in this study includes descriptive statistics and modeling with the NGWR model. The analysis steps are as follows:
1. Carry out a descriptive statistical analysis of each variable.
2. Test for spatial heterogeneity with the Breusch-Pagan method.
3. Determine the optimum knot points with the Generalized Cross Validation (GCV) method [6]; the optimum knot points are those that give the minimum GCV value.
4. Determine the best NGWR model criteria based on the order and knot points.
5. Obtain the NGWR model estimation results, in which the relationship between the response variable and the predictor variables at the i-th location is modeled with a truncated spline approach [9].
6. Test the model fit.
7. Carry out simultaneous hypothesis testing for the NGWR model parameters [21].
8. Carry out partial hypothesis testing for the NGWR model parameters.
9. Map the areas based on significant variables.
10. Interpret the NGWR model.
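Step 3 (knot selection by minimum GCV) can be sketched in Python for a single predictor and an order-1 truncated spline. The sketch assumes the commonly used form GCV(k) = MSE(k) / (tr(I - A(k)) / n)^2, where A(k) is the hat matrix for knot k. The function names and the toy data are hypothetical, and the study itself works with five predictors, so this is only an illustration of the criterion.

```python
import numpy as np


def design_matrix(x, knot):
    """Order-1 truncated spline design: intercept, x, and (x - knot)_+."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), x, np.maximum(x - knot, 0.0)])


def gcv(x, y, knot):
    """GCV score of the least squares spline fit with a single knot."""
    X = design_matrix(x, knot)
    y = np.asarray(y, dtype=float)
    n = y.size
    hat = X @ np.linalg.pinv(X)  # hat matrix A(k), so fitted values are hat @ y
    residuals = y - hat @ y
    mse = float(residuals @ residuals) / n
    return mse / ((n - np.trace(hat)) / n) ** 2


def select_knot(x, y, candidates):
    """Return the candidate knot with the smallest GCV score."""
    return min(candidates, key=lambda k: gcv(x, y, k))


# Toy data (hypothetical): slope 2 up to x = 5, slope -1 afterwards.
x_toy = [float(v) for v in range(11)]
y_toy = [1.0 + 2.0 * v - 3.0 * max(v - 5.0, 0.0) for v in x_toy]
best_knot = select_knot(x_toy, y_toy, candidates=[3.0, 5.0, 7.0])
```

Because the toy data change slope exactly at 5, the candidate knot at 5 reproduces the data and therefore has the smallest GCV score.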

Description of the Research Data
The description of the research data covers descriptive statistics in the form of measures of central tendency and dispersion. Based on Table 1, the average OUR in Indonesia is 5.49%, with the lowest OUR of 3.01% in West Nusa Tenggara and Gorontalo and the highest OUR of 9.91% in Riau Islands. The average percentage of population density in Indonesia is 2.94%, with North Kalimantan as the province with the smallest percentage of population density at only 0.26% and West Java as the province with the largest at 17.89%. The average provincial minimum wage in Indonesia is IDR 2,687,724. The lowest minimum wage is IDR 1,765,000 in DI Yogyakarta, while the highest minimum wage is in DKI Jakarta at IDR 4,416,18. The average length of schooling in Indonesia is 8.72 years, with the lowest of 6.76 years in Papua and the highest of 11.17 years in DKI Jakarta. The average GRDP in Indonesia is 498,652 billion rupiah, with Gorontalo as the province with the smallest GRDP of only 43,896 billion rupiah and DKI Jakarta as the province with the largest GRDP of 2,914,581 billion rupiah. The average percentage of poor people in Indonesia is 10.43%. The lowest percentage of poor people is 4.56% in South Kalimantan, while the highest is 27.38% in Papua.

Pattern of Relationship Between Variables
The scatter plots of OUR against each predictor variable are shown in Figure 1. The relationship between each predictor variable and the OUR shows no clear tendency, so the shape of the regression curve is unknown. A nonparametric regression approach can therefore be used for estimation.

Spatial Heterogeneity
Spatial heterogeneity testing aims to find out whether there is a spatial effect on the OUR variable. Based on Table 2, the p-value (0.01) is less than $\alpha$ (0.05), so it can be concluded that there is a spatial heterogeneity effect in the OUR data.
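As an illustration of how such a test statistic can be computed, the Python sketch below assumes the commonly used form BP = (1/2) f'Z(Z'Z)^{-1}Z'f, where f has elements e_i^2 / sigma^2 - 1 (e_i are residuals, sigma^2 is their mean square) and Z holds an intercept plus standardized predictors. The function name and the toy inputs are hypothetical, and whether this matches the exact form used in the study is an assumption.

```python
import numpy as np


def breusch_pagan_statistic(residuals, predictors):
    """Breusch-Pagan statistic from model residuals and an (n, p) predictor array."""
    e = np.asarray(residuals, dtype=float)
    X = np.asarray(predictors, dtype=float)
    n = e.size
    sigma2 = np.sum(e ** 2) / n
    f = e ** 2 / sigma2 - 1.0
    # Standardize each predictor column (columns must not be constant)
    # and prepend an intercept column.
    Z = np.column_stack([np.ones(n), (X - X.mean(axis=0)) / X.std(axis=0)])
    # Z (Z'Z)^{-1} Z' f is the least squares projection of f onto the columns of Z.
    coef, *_ = np.linalg.lstsq(Z, f, rcond=None)
    return 0.5 * float(f @ (Z @ coef))


# Toy inputs (hypothetical): residual spread grows with the predictor.
bp_flat = breusch_pagan_statistic([1.0, -1.0, 1.0, -1.0], [[1.0], [2.0], [3.0], [4.0]])
bp_growing = breusch_pagan_statistic([0.1, -0.5, 1.0, -2.0], [[1.0], [2.0], [3.0], [4.0]])
```

Residuals of constant magnitude give a statistic of zero, while residuals whose spread grows with the predictor give a positive value. In practice the statistic is usually compared with a chi-square critical value, or converted to a p-value as reported in Table 2.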

Optimum Knot Point Selection
The best NGWR model is the combination of order and number of knot points that has the largest X value. For each combination, the knot points are those with the optimal GCV value among the iterations. Based on these criteria, the regression model of order 1 with one knot point is the best NGWR model, so one optimum knot point is needed for each predictor variable on OUR. The optimum knot points obtained are as follows:

Geographic Weighting
To determine the best geographic weighting, an optimum bandwidth value is required. In Table 5, the bisquare kernel function with a bandwidth of 31.84 is the best geographic weighting function because its CV value of 191.80 is smaller than that of the Gaussian kernel function, 249.20. Therefore, the bisquare kernel function is used as the weighting function in NGWR.
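The two weighting functions compared here can be sketched in Python as follows. The formulas are the fixed-bandwidth forms commonly used in GWR and are assumed, not confirmed, to match the ones in this study; the function names are hypothetical, while the bandwidth and CV values are the ones reported in Table 5.

```python
import math


def gaussian_weight(distance, bandwidth):
    """Gaussian kernel: weights decay smoothly and never reach zero."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def bisquare_weight(distance, bandwidth):
    """Bisquare kernel: weights fall to exactly zero at and beyond the bandwidth."""
    if distance >= bandwidth:
        return 0.0
    return (1.0 - (distance / bandwidth) ** 2) ** 2


BANDWIDTH = 31.84  # optimum bisquare bandwidth reported in Table 5

# The kernel with the smaller cross-validation (CV) score is preferred.
cv_scores = {"bisquare": 191.80, "gaussian": 249.20}
best_kernel = min(cv_scores, key=cv_scores.get)
```

The practical difference is that the bisquare kernel ignores every location farther away than the bandwidth, whereas the Gaussian kernel still gives distant locations a small positive weight.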

Figure 2. OUR distribution pattern with NGWR
In Figure 2, the red line is the NGWR estimate of OUR in Indonesia and the blue line is the original OUR data. The figure shows that the estimate is close to the original data.

Simultaneous Parameter Significance Test
The $V^*$ value obtained is 3.36, which is larger than $F_{(0.05;33;22)} = 2.26$, so $H_0$ is rejected. It can be concluded that the percentage of population density, minimum wage, average years of schooling, GRDP, and the percentage of poor people simultaneously affect the OUR.

Partial Parameter Significance Test
The partial parameter significance test for NGWR yields 6 groups of research locations, as follows:
1. The first group is West Sumatra, Riau, Jambi, South Sumatra, Bengkulu, Lampung, Bangka Belitung Islands, Riau Islands, DKI Jakarta, Bali, West Nusa Tenggara, East Nusa Tenggara, East Kalimantan, North Kalimantan, North Sulawesi, Central Sulawesi, South Sulawesi, Southeast Sulawesi, Gorontalo, West Sulawesi, Maluku, North Maluku, and Papua, with the influential variables being $x_1$, $x_2$, $x_3$, $x_4$, and $x_5$.
2. The second group is South Kalimantan, with the influential variables being $x_1$, $x_2$, $x_3$, and $x_5$.
3. The third group is Aceh, with the influential variables being $x_1$, $x_3$, $x_4$, and $x_5$.
4. The fourth group is North Sumatra, West Java, Central Java, Yogyakarta, East Java, West Kalimantan, and Central Kalimantan, with the influential variables being $x_1$, $x_3$, and $x_5$.
5. The fifth group is West Papua, with the influential variable being $x_3$.
6. The sixth group is Banten, where no variables affect the OUR.
Figure 3 maps the OUR in Indonesia based on significant factors with the NGWR.

Test of Model Fit
Model fit testing compares NGWR with GWR. Based on the calculations, the value of $V$ is 5.14, which is greater than $F_{(0.05;12;22)} = 2.23$, so $H_0$ is rejected. It can be concluded that there is a significant difference between NGWR and GWR.

Interpretation of Model
From the results of the analysis, the NGWR model for the 10th area, namely Riau Islands Province, in Equation (9) can be interpreted as follows:
1. If the minimum wage, average years of schooling, GRDP, and the percentage of poor people are considered constant, then when the percentage of population density is less than 9.78 percent, a one percent increase in the percentage of population density will increase OUR in Riau Islands by 0.76 percent.
2. If the percentage of population density, average years of schooling, GRDP, and the percentage of poor people are considered constant, then when the minimum wage is less than 3,196,600 rupiah, a one rupiah increase in the minimum wage will increase OUR in Riau Islands by $9.73 \times 10^{-7}$ percent.
3. If the percentage of population density, minimum wage, GRDP, and the percentage of poor people are considered constant, then when the average length of schooling is less than 9.14 years, a one year increase in the average length of schooling will increase OUR in Riau Islands by 0.47 percent.
4. If the percentage of population density, minimum wage, average years of schooling, and the percentage of poor people are considered constant, then when GRDP is less than IDR 1,594,100 billion, a one billion rupiah increase in GRDP will decrease OUR in Riau Islands by $3.20 \times 10^{-6}$ percent.
5. If the percentage of population density, minimum wage, average years of schooling, and GRDP are considered constant, then when the percentage of poor people is less than 16.88 percent, a one percent increase in the percentage of poor people will decrease OUR in Riau Islands by 0.21 percent.
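Because some of these effects are stated per rupiah or per billion rupiah, they are easier to read after rescaling. The Python sketch below applies the reported below-knot slopes and knot points for Riau Islands to a chosen change in one predictor. The dictionary keys, the function name, and the example inputs are hypothetical, and the function refuses to cross a knot because the slopes above the knots are not reported here.

```python
# Below-knot slopes (change in OUR, in percent, per unit of the predictor)
# and knot points for Riau Islands, taken from the interpretation above.
SLOPES = {
    "population_density_pct": 0.76,
    "minimum_wage_rupiah": 9.73e-7,
    "years_of_schooling": 0.47,
    "grdp_billion_rupiah": -3.20e-6,
    "poor_population_pct": -0.21,
}
KNOTS = {
    "population_density_pct": 9.78,
    "minimum_wage_rupiah": 3_196_600,
    "years_of_schooling": 9.14,
    "grdp_billion_rupiah": 1_594_100,
    "poor_population_pct": 16.88,
}


def our_change(variable, start, delta):
    """Change in OUR when one predictor moves from start to start + delta.

    All other predictors are held constant. Both end points must lie below
    the knot, since only the below-knot slope is known.
    """
    knot = KNOTS[variable]
    if max(start, start + delta) >= knot:
        raise ValueError(f"{variable} must stay below its knot at {knot}")
    return SLOPES[variable] * delta


# Hypothetical example: the minimum wage rises from 3,000,000 to 3,100,000 rupiah.
wage_effect = our_change("minimum_wage_rupiah", 3_000_000, 100_000)
```

Under these assumptions, a rise of 100,000 rupiah in the minimum wage corresponds to an increase in OUR of about 0.0973 percent, which is 100,000 times the per-rupiah effect.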

CONCLUSIONS
From the research that was conducted, it was found that the data on OUR in Indonesia in 2021 contain spatial effects. The best model is the NGWR of order 1 with one knot point and bisquare kernel weights, with an $R^2$ of 83.45 percent. Based on partial parameter significance testing for NGWR, the research locations fall into 6 groups according to their influential factors.

$x_2$: provincial minimum wage, the wage that employers must pay to workers every month, as regulated by the government based on laws and regulations [15].
$x_3$: average years of schooling, which can be measured as the geometric mean of school years spent by individuals [16].

The $Z$ matrix is a normalized matrix for each observation [19].

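Figure 1. Pattern of relationship between variables: (a) scatter plot of $y$ and $x_1$, (b) scatter plot of $y$ and $x_2$, (c) scatter plot of $y$ and $x_3$, (d) scatter plot of $y$ and $x_4$, and (e) scatter plot of $y$ and $x_5$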

Figure 3. Mapping the Estimator of OUR in Indonesia Based on Significant Variables