GEOGRAPHICALLY WEIGHTED PANEL REGRESSION (GWPR) FOR COVID-19 CASE IN INDONESIA

Coronavirus disease 2019 (COVID-19) is a newly emerging infectious disease caused by severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), which was declared a pandemic by the World Health Organization (WHO) on March 11, 2020. The response to this ongoing pandemic requires extensive collaboration across the scientific community to contain its impact and limit further transmission. Cause-and-effect relationships in an event are usually modeled with Multiple Linear Regression estimated by Ordinary Least Squares. In the case of Covid-19, however, the virus spread from one location to another, which indicates a spatial effect on the incidence. This study considered not only the spatial perspective but also time series data, so the method used was Geographically Weighted Panel Regression (GWPR). The study modeled the number of positive cases of Covid-19 in 34 provinces in Indonesia from March 2020 to August 2021 and examined which factors influenced the number of positive cases in each province. GWPR was performed under the assumption of a Fixed Effect Model (FEM), which was chosen because the conditions of each observation unit were different. Based on the results, the best GWPR model was the one with a Fixed Gaussian Kernel. The predictor variables that influenced the number of positive cases of Covid-19 differed between locations and tended to cluster at certain locations.


INTRODUCTION
Coronavirus disease 2019 (COVID-19) is a newly emerging infectious disease caused by severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), which was declared a pandemic by the World Health Organization (WHO) on March 11, 2020. The response to this ongoing pandemic requires extensive collaboration across the scientific community to contain its impact and limit further transmission [1]. Several scientific studies on Covid-19 have been carried out; for example, Sarker et al. modeled Covid-19 in India using SARIIqSq sensitivity analysis. Based on previous studies, population density is commonly suspected to be one of the factors causing the increase in the number of positive cases of Covid-19. Some news and discussion on social media related to Covid-19 in Jakarta have linked overcrowding with the spread of the virus. It was said that because of the high population density in Jakarta, the virus would spread more easily and penetrate faster in areas with higher density [7]. Several other factors are thought to have influenced the number of positive cases of Covid-19, including the number of elderly people, the number of households with access to proper drinking water, the GRDP of business fields based on current prices, and the number of villages with health centers [6].
Cause-and-effect relationships in an event are usually modeled with Multiple Linear Regression estimated by Ordinary Least Squares. In the case of Covid-19, however, the virus spread from one location to another, which indicates a spatial effect on the incidence. Studies by Marhamah and Mindra Jaya [8] and by Mahdy [5] have shown that the number of positive cases of Covid-19 can be modeled using regression with a point-based spatial approach, namely Geographically Weighted Regression (GWR).
In this study, we considered not only the spatial perspective but also time series data, so the method used was Geographically Weighted Panel Regression (GWPR). This study modeled the number of positive cases of Covid-19 in 34 provinces in Indonesia from March 2020 to August 2021 and examined which factors influenced the number of positive cases in each province. GWPR was performed under the assumption of a Fixed Effect Model (FEM), which was chosen because the conditions of each observation unit were different.

Geographically Weighted Panel Regression (GWPR)
The main idea of GWPR is the same as that of GWR analysis. In GWPR, the time series of observations at a geographic location is assumed to be a realization of a smooth spatiotemporal process. This process follows a distribution in which observations that are close (in geographical location or time) are more relevant than distant observations. GWPR analysis aims to combine the cross-section (location) dimension with the observations over time [9]. The GWPR method is a local regression with repeated data at location points for each spatial observation. In other words, GWPR focuses on repeated spatial observations for each location [10].
GWPR is a development model that combines the GWR model with panel regression. In this study, it is assumed that the conditions of each unit of observation are different, so a panel regression with the FEM model is used, which is shown in Equation (1) [11].
One way to estimate β in the FEM is to eliminate the unit-specific effect (the term indexed by i) with a within estimator [12]: average Equation (1) over t = 1, ..., T to obtain the cross-section equation, subtract it from Equation (1), and then apply Ordinary Least Squares (OLS) to the resulting demeaned regression equation to estimate β. The GWPR model is obtained by combining the GWR model with panel regression. Equation (5) combines the GWR equation and the FEM panel regression equation with the within estimator,
where the response is the demeaned value for the i-th observation at the t-th time, and the error term is a random error that is assumed to be independent and identically distributed, following a normal distribution with zero mean and constant variance [14].
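For reference, the steps described above can be written in standard notation as follows. This is a hedged sketch rather than a reproduction of the numbered equations: the symbols α_i (unit-specific effect), x_it (predictor vector), ε_it (error), the bar and tilde marks, and the coordinates (u_i, v_i) are assumed here, and the lines are left unnumbered because they may not map one-to-one onto Equations (1) to (5). Some presentations also keep a local intercept term in the last line.

```latex
\begin{align*}
y_{it} &= \alpha_i + \mathbf{x}_{it}^{\top}\boldsymbol{\beta} + \varepsilon_{it}
  && \text{(fixed effect model)} \\
\bar{y}_{i} &= \alpha_i + \bar{\mathbf{x}}_{i}^{\top}\boldsymbol{\beta} + \bar{\varepsilon}_{i}
  && \text{(average over } t = 1, \dots, T\text{)} \\
y_{it} - \bar{y}_{i} &= (\mathbf{x}_{it} - \bar{\mathbf{x}}_{i})^{\top}\boldsymbol{\beta}
  + (\varepsilon_{it} - \bar{\varepsilon}_{i})
  && \text{(within transformation)} \\
\tilde{y}_{it} &= \sum_{k=1}^{p} \beta_k(u_i, v_i)\,\tilde{x}_{itk} + \tilde{\varepsilon}_{it}
  && \text{(local model on demeaned data)}
\end{align*}
```

Here the tilde denotes a demeaned quantity, for example \tilde{y}_{it} = y_{it} - \bar{y}_{i}, so the unit-specific effect α_i cancels before the local coefficients β_k(u_i, v_i) are estimated.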
As with GWR, bandwidth can be obtained at each location to determine local sample locations [15].
Observations at the local sample location will be weighted with kernel weights. Then weighting is performed for all time periods. At local sample points, it is assumed that panel data can be aggregated into a single geographic space. Then the panel data estimation model can be applied to obtain predictor variable coefficients at certain locations [9].
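The weighting scheme described above can be illustrated with a minimal Python sketch. It assumes a single predictor, a fixed Gaussian kernel, and Euclidean distances between hypothetical coordinates; the function names (`demean`, `gaussian_weight`, `local_slope`) and the toy panel are illustrative and are not taken from the study.

```python
import math

def demean(series):
    """Within transformation for one unit: subtract the unit's time average."""
    mean = sum(series) / len(series)
    return [value - mean for value in series]

def gaussian_weight(distance, bandwidth):
    """Fixed Gaussian kernel weight for a given distance."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def local_slope(target, coords, x_panel, y_panel, bandwidth):
    """Weighted least-squares slope at `target`, pooling every period of every unit."""
    u0, v0 = coords[target]
    num = den = 0.0
    for unit, (u, v) in coords.items():
        # One spatial weight per unit, reused for all of its time periods.
        w = gaussian_weight(math.hypot(u - u0, v - v0), bandwidth)
        x_dm, y_dm = demean(x_panel[unit]), demean(y_panel[unit])
        for x, y in zip(x_dm, y_dm):
            num += w * x * y
            den += w * x * x
    return num / den

# Toy panel (hypothetical): units A and B are close and share slope 2,
# unit C is far away with slope -1; each unit has its own intercept.
coords = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (10.0, 0.0)}
x_panel = {unit: [1.0, 2.0, 3.0, 4.0] for unit in coords}
y_panel = {
    "A": [5.0 + 2.0 * x for x in x_panel["A"]],
    "B": [-3.0 + 2.0 * x for x in x_panel["B"]],
    "C": [10.0 - 1.0 * x for x in x_panel["C"]],
}
slopes = {
    unit: local_slope(unit, coords, x_panel, y_panel, bandwidth=1.0)
    for unit in coords
}
```

Because the kernel weight of the distant unit is negligible at A and B, their local slopes stay close to 2, while the slope at C stays close to -1. This is the sense in which the coefficients differ by location.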

Data and Research Variables
In this study, the data used were secondary data sourced from the Central Statistics Agency (BPS) and the Ministry of Health.

Stages of Data Analysis
Geographically Weighted Panel Regression (GWPR) was performed on the number of positive cases of Covid-19 in Indonesia from early 2020 to early 2021. The following are the stages in implementing the GWPR model.

Data Exploration
The Covid-19 outbreak first emerged in the Wuhan area of China in December 2019, and in March 2020 it was confirmed that two Depok residents in Indonesia had tested positive for Covid-19. The virus then continued to spread across the archipelago. The epidemic is no longer only a national problem but a global one. Figure 1 and Figure 2 show the spread of positive cases of Covid-19 at the beginning and end of 2020 in 34 provinces in Indonesia. To contain the accelerated spread of Covid-19 in Indonesia, the government implemented Large-Scale Social Restrictions, and in January 2021 the Covid-19 vaccination program began. Figure 3 shows that the number of positive cases was still increasing, although the increase was not drastic in several provinces, such as Maluku, Gorontalo, West Sulawesi, and Aceh. One province, North Maluku, even experienced a reduction in the number of positive cases of Covid-19. Meanwhile, provinces with high population density, such as South Sulawesi and several provinces on Java Island, continued to experience a large increase in the number of positive cases. Descriptive statistics for each predictor variable are shown in Table 1 below:

Panel Model Test
The panel model consists of three types of models, namely the Fixed Effect Model, the Random Effect Model, and the Common Effect Model. The first test performed is the Chow test, which compares the common effect model with the fixed effect model. The second is the Hausman test, which compares the random effect model with the fixed effect model. Table 2 shows the results of the two panel model tests, from which the selected panel model is the Fixed Effect Model (FEM). Table 3 shows the results of the FEM panel regression model at a significance level of 5%.
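As an illustration of the first test, the following Python sketch computes a Chow-type F statistic that compares a common effect model with a fixed effect model. It assumes a balanced panel with a single predictor and uses hypothetical toy data; the function names (`pooled_rss`, `within_rss`, `chow_f`) are illustrative, and the sketch does not reproduce the values in Table 2.

```python
def pooled_rss(x, y):
    """Residual sum of squares of y on an intercept and x (common effect model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return sum(((b - my) - slope * (a - mx)) ** 2 for a, b in zip(x, y))

def within_rss(x_panel, y_panel):
    """Residual sum of squares of the fixed effect model via the within estimator."""
    xs, ys = [], []
    for unit in x_panel:
        mx = sum(x_panel[unit]) / len(x_panel[unit])
        my = sum(y_panel[unit]) / len(y_panel[unit])
        xs += [a - mx for a in x_panel[unit]]
        ys += [b - my for b in y_panel[unit]]
    slope = sum(a * b for a, b in zip(xs, ys)) / sum(a * a for a in xs)
    return sum((b - slope * a) ** 2 for a, b in zip(xs, ys))

def chow_f(x_panel, y_panel, n_predictors=1):
    """F statistic for H0: all unit effects are equal (common effect model)."""
    n_units = len(x_panel)
    n_obs = sum(len(values) for values in x_panel.values())
    all_x = [a for unit in x_panel for a in x_panel[unit]]
    all_y = [b for unit in x_panel for b in y_panel[unit]]
    rss_common = pooled_rss(all_x, all_y)
    rss_fixed = within_rss(x_panel, y_panel)
    df1 = n_units - 1                       # restrictions imposed by H0
    df2 = n_obs - n_units - n_predictors    # residual df of the fixed effect model
    return ((rss_common - rss_fixed) / df1) / (rss_fixed / df2)

# Toy balanced panel (hypothetical): common slope 2, different unit intercepts.
noise = [0.1, -0.1, -0.1, 0.1]
x_panel = {unit: [1.0, 2.0, 3.0, 4.0] for unit in ("A", "B", "C")}
intercepts = {"A": 0.0, "B": 5.0, "C": 10.0}
y_panel = {
    unit: [intercepts[unit] + 2.0 * x + e for x, e in zip(x_panel[unit], noise)]
    for unit in x_panel
}
f_stat = chow_f(x_panel, y_panel)
```

A large F statistic relative to the F(df1, df2) critical value leads to rejecting the common effect model in favour of the fixed effect model, which is the direction of the decision reported for this study.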

Panel Model Assumptions Test
After the panel effect model was selected, the residual normality test and the heterogeneity of variance test were carried out on the model. The normality test used the Shapiro-Wilk test and gave W = 0.97522 with p-value = 0.05196. Because the p-value exceeds the 5% significance level, normality is not rejected, which indicates that the model residuals are normally distributed. The heterogeneity of variance test used the Breusch-Pagan test and gave BP = 19.722 with p-value = 0.001409, which is below 5% and indicates heterogeneity of variance in the data. Because the observation units are provinces (locations), it is concluded that there is spatial heterogeneity in the panel data, so the data can be modeled using Geographically Weighted Panel Regression (GWPR).
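The two decisions above can be checked with a short Python sketch. The 5% level is taken from the panel regression step, and the chi-square tail probability assumes 5 degrees of freedom for the Breusch-Pagan statistic; that number is an assumption based on the five candidate factors named in the introduction, not a value stated in the text. The function names are illustrative.

```python
import math

ALPHA = 0.05  # significance level, assumed to match the 5% used for Table 3

def chi2_sf_df5(x):
    """Upper-tail probability of a chi-square variable with 5 degrees of freedom."""
    tail = math.erfc(math.sqrt(x / 2.0))
    series = math.sqrt(x) + x ** 1.5 / 3.0
    return tail + math.sqrt(2.0 / math.pi) * math.exp(-x / 2.0) * series

def rejects_null(p_value, alpha=ALPHA):
    """True when the null hypothesis is rejected at level alpha."""
    return p_value < alpha

shapiro_p = 0.05196            # reported Shapiro-Wilk p-value
bp_stat = 19.722               # reported Breusch-Pagan statistic
bp_p = chi2_sf_df5(bp_stat)    # tail probability under the assumed 5 df

residuals_look_normal = not rejects_null(shapiro_p)
variance_is_heterogeneous = rejects_null(bp_p)
```

Under the 5-degrees-of-freedom assumption the computed tail probability agrees with the reported p-value to the printed precision, and both decisions match the text: normality is not rejected, while constant variance is.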

Geographically Weighted Panel Regression Model
We transformed the data using the within estimator because the selected panel model was the FEM. We then fitted GWPR with four kernel weighting schemes, namely Fixed Gaussian Kernel, Adaptive Gaussian Kernel, Fixed Bisquare Kernel, and Adaptive Bisquare Kernel. Table 4 compares the resulting GWPR models. Based on the AIC and R² values of the four models, the best GWPR model was the one with the Fixed Gaussian Kernel, which had the smallest AIC (-314.721) and the largest R² (0.982). Figure 5 shows that the predictor variables affecting the number of positive cases of Covid-19 differ between locations and tend to cluster at certain locations.
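The four weighting schemes differ in the kernel shape (Gaussian or bisquare) and in how the bandwidth is chosen (one fixed distance, or an adaptive distance to the k-th nearest neighbour). The Python sketch below uses the kernel forms commonly seen in GWR software; the exact formulas used in this study are not shown in the text, so these definitions, the function names, and the toy distances are assumptions.

```python
import math

def gaussian_kernel(distance, bandwidth):
    """Gaussian kernel: weights decay smoothly and never reach zero."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def bisquare_kernel(distance, bandwidth):
    """Bisquare kernel: weights drop to exactly zero at the bandwidth."""
    if distance >= bandwidth:
        return 0.0
    return (1.0 - (distance / bandwidth) ** 2) ** 2

def adaptive_bandwidth(distances_to_others, k):
    """Distance from the regression point to its k-th nearest neighbour."""
    return sorted(distances_to_others)[k - 1]

def kernel_weights(distances_to_others, kernel, bandwidth=None, k=None):
    """Fixed scheme when `bandwidth` is given, adaptive scheme when `k` is given."""
    if bandwidth is None:
        bandwidth = adaptive_bandwidth(distances_to_others, k)
    return [kernel(d, bandwidth) for d in distances_to_others]

# Hypothetical distances from one regression point to three other locations.
distances = [1.0, 2.0, 5.0]
fixed_gaussian = kernel_weights(distances, gaussian_kernel, bandwidth=2.0)
adaptive_bisquare = kernel_weights(distances, bisquare_kernel, k=2)
```

With a fixed Gaussian kernel every location keeps a positive weight, whereas the adaptive bisquare scheme gives zero weight to locations at or beyond the k-th nearest neighbour. Candidate schemes are then compared on fit criteria such as AIC and R², as in Table 4.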

CONCLUSIONS
Geographically Weighted Panel Regression can be used to model spatial data that takes time series into account. The panel model test on the number of positive cases of Covid-19 for 2020-2021 in 34 provinces in Indonesia selected the Fixed Effect Model, and the comparison of GWPR models with several kernel weights showed that the best GWPR model was the one with the Fixed Gaussian Kernel. The predictor variables that affected the number of positive cases of Covid-19 differed between locations and tended to cluster at certain locations; hence the GWPR equation for each location was also different.