OUTLIER IDENTIFICATION ON PENALIZED SPLINE REGRESSION MODELING FOR POVERTY GAP INDEX IN JAVA

  • Anggita Rizky Fadilah Department of Statistics, Faculty of Mathematics and Natural Sciences, IPB University
  • Anwar Fitrianto Department of Statistics, Faculty of Mathematics and Natural Sciences, IPB University
  • I Made Sumertajaya Department of Statistics, Faculty of Mathematics and Natural Sciences, IPB University
Keywords: adjusted boxplot, outlier, penalized spline regression, Poverty Gap Index

Abstract

Java is one of the Indonesian islands with rapid development. Although its economic growth has been good, poverty remains a serious problem. Three of its six provinces, namely DI Yogyakarta, Central Java, and East Java, still had poverty rates above the national rate in March 2020. This indicates an imbalance in poverty across these regions. Several regions exhibit extreme conditions and are known as outliers. In addition, poverty gap data have a complex pattern, so modeling with a non-parametric approach is suitable. This study aims to build an appropriate model to support poverty alleviation in Java and to identify outliers using an adjusted boxplot. The best penalized spline regression model for the Poverty Gap Index in Java was selected by minimum Generalized Cross-Validation (GCV), with an optimum smoothing parameter (λ) of 0.12 and the knot combination (1, 2, 4, 1, 5, 3, and 1) for the seven predictor variables. The results show that the penalized spline regression model has a higher R² than OLS regression. The R² obtained is 69.10%, so the model is suitable for explaining the variability of the poverty gap in Java. Moreover, the outlier identification shows a dependency between outliers in the data and outliers in the residuals, because some districts/cities are identified as outliers in both.
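The adjusted boxplot mentioned in the abstract can be illustrated with a minimal sketch, assuming the fence rule of Hubert and Vandervieren (2008), in which the usual 1.5 × IQR whiskers are rescaled by an exponential function of the medcouple (MC), a robust skewness measure. This is not the authors' code: the function names `medcouple`, `adjusted_fences`, and `flag_outliers` are hypothetical, the medcouple uses a naive O(n²) computation, and pairs tied at the median are simply skipped instead of receiving the special tie kernel of the full definition.

```python
import math
import statistics


def medcouple(values):
    """Naive O(n^2) medcouple (robust skewness).

    Simplification (assumption): pairs where both points equal the
    median are skipped rather than given the special tie kernel.
    """
    xs = sorted(values)
    med = statistics.median(xs)
    upper = [x for x in xs if x >= med]
    lower = [x for x in xs if x <= med]
    kernel = [
        ((hi - med) - (med - lo)) / (hi - lo)
        for hi in upper
        for lo in lower
        if hi != lo  # only possible when both equal the median
    ]
    return statistics.median(kernel) if kernel else 0.0


def adjusted_fences(values):
    """Return (lower, upper) fences of the adjusted boxplot."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    mc = medcouple(values)
    # Exponents depend on the sign of the skewness estimate.
    if mc >= 0:
        lo_k, hi_k = -4.0, 3.0
    else:
        lo_k, hi_k = -3.0, 4.0
    lower = q1 - 1.5 * math.exp(lo_k * mc) * iqr
    upper = q3 + 1.5 * math.exp(hi_k * mc) * iqr
    return lower, upper


def flag_outliers(values):
    """Return the observations falling outside the adjusted fences."""
    lower, upper = adjusted_fences(values)
    return [v for v in values if v < lower or v > upper]


if __name__ == "__main__":
    # Made-up, right-skewed illustrative numbers (not data from the study).
    sample = [1, 2, 2, 3, 3, 3, 4, 5, 8, 20]
    print(adjusted_fences(sample))
    print(flag_outliers(sample))
```

For symmetric data the medcouple is zero and the fences reduce to the classical Q1 − 1.5 × IQR and Q3 + 1.5 × IQR. For right-skewed data the upper fence moves outward, so large values that simply reflect the skewness of the distribution are less likely to be flagged.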

Published
2022-12-15
How to Cite
[1]
A. Fadilah, A. Fitrianto, and I. Sumertajaya, “OUTLIER IDENTIFICATION ON PENALIZED SPLINE REGRESSION MODELING FOR POVERTY GAP INDEX IN JAVA”, BAREKENG: J. Math. & App., vol. 16, no. 4, pp. 1231-1240, Dec. 2022.