KNOT OPTIMIZATION FOR BI-RESPONSE SPLINE NONPARAMETRIC REGRESSION WITH GENERALIZED CROSS-VALIDATION (GCV)

  • Andre Fajry Al Barra Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Sebelas Maret, Indonesia https://orcid.org/0009-0002-9165-6939
  • Dewi Retno Sari Saputro Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Sebelas Maret, Indonesia https://orcid.org/0000-0002-6569-394X
Keywords: Bi-Response, GCV, Knot Point, Nonparametric Regression, Spline

Abstract

Nonparametric regression is a statistical method for modeling relationships between variables without strong assumptions about the functional form of the relationship. Such models are flexible and can capture complex relationships that simple parametric forms may not represent adequately. Splines are one of the approaches used in nonparametric regression. A drawback of splines is that the quality of the fit depends on choosing optimal knot points in the data. This article therefore discusses the selection of optimal knot points using the generalized cross-validation (GCV) method in the bi-response spline nonparametric regression model. The results indicate that GCV is preferable for knot selection to alternatives such as CV, AIC, BIC, RSS, or more explicit validation-based approaches, because it is a development of the cross-validation (CV) method that automatically selects the optimal number of knots by balancing bias and variance. The knot optimization procedure with GCV for bi-response spline nonparametric regression is implemented in Python and is able to identify the optimal knot points. Based on the analysis of the GCV model, it is concluded that GCV can effectively optimize knot points for spline fitting, yielding a balanced and efficient model that captures data patterns without overfitting.
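The knot selection idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a truncated linear spline basis, one predictor shared by both responses, the same knot for both responses, and an unweighted (identity-weight) pooled least-squares fit, whereas a full bi-response model would normally weight by the covariance between the responses. The function names, the synthetic data, and the candidate grid are all hypothetical. The criterion used is the common form GCV = (RSS/N) / (1 - trace(H)/N)^2, where H is the pooled hat matrix and N the total number of observations across both responses.

```python
import numpy as np

def truncated_linear_basis(x, knots):
    """Design matrix [1, x, (x - k1)_+, ..., (x - km)_+]."""
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x), x]
    cols += [np.clip(x - k, 0.0, None) for k in knots]
    return np.column_stack(cols)

def gcv_score(x, Y, knots):
    """GCV for responses pooled under an identity weight (assumption)."""
    X = truncated_linear_basis(x, knots)
    H = X @ np.linalg.pinv(X)          # hat matrix of one response
    resid = Y - H @ Y                  # residuals, one column per response
    n, q = Y.shape
    N = n * q                          # total number of observations
    mse = np.sum(resid ** 2) / N
    # The pooled hat matrix is block diagonal, so its trace is q * trace(H).
    return mse / (1.0 - q * np.trace(H) / N) ** 2

def select_knot(x, Y, candidates):
    """Return the single candidate knot with the smallest GCV score."""
    scores = [gcv_score(x, Y, [k]) for k in candidates]
    best = int(np.argmin(scores))
    return float(candidates[best]), float(scores[best])

# Synthetic bi-response data (hypothetical): both responses change slope at x = 5.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 101)
hinge = np.clip(x - 5.0, 0.0, None)
y1 = 1.0 + 0.5 * x + 2.0 * hinge + rng.normal(0.0, 0.05, x.size)
y2 = 3.0 - 1.0 * x + 1.5 * hinge + rng.normal(0.0, 0.05, x.size)
Y = np.column_stack([y1, y2])

candidates = np.arange(1.0, 10.0)      # candidate knots 1, 2, ..., 9
best_knot, best_gcv = select_knot(x, Y, candidates)
print(best_knot, best_gcv)
```

When every candidate has the same number of knots, as here, trace(H) is constant and GCV ranks candidates in the same order as RSS. The penalty in the denominator starts to matter when candidate sets with different numbers of knots are compared, which is where the bias-variance balance mentioned in the abstract comes into play.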

Published
2025-01-13
How to Cite
[1]
A. F. Al Barra and D. R. S. Saputro, “KNOT OPTIMIZATION FOR BI-RESPONSE SPLINE NONPARAMETRIC REGRESSION WITH GENERALIZED CROSS-VALIDATION (GCV)”, BAREKENG: J. Math. & App., vol. 19, no. 1, pp. 271-280, Jan. 2025.