A COMPARISON OF COX PROPORTIONAL HAZARD AND RANDOM SURVIVAL FOREST MODELS IN PREDICTING CHURN OF THE TELECOMMUNICATION INDUSTRY CUSTOMER

  • Sitti Nurhaliza Department of Statistics, Faculty of Mathematics and Natural Sciences, Bogor Agricultural University
  • Kusman Sadik Department of Statistics, Faculty of Mathematics and Natural Sciences, Bogor Agricultural University
  • Asep Saefuddin Department of Statistics, Faculty of Mathematics and Natural Sciences, Bogor Agricultural University
Keywords: Survival analysis, Right-censored, Cox Proportional Hazard, Random Survival Forest, C-index, Churn

Abstract

The Cox Proportional Hazard (CPH) model is a popular method for analyzing right-censored survival data. It is efficient when the proportional hazard assumption is fulfilled, but it does not give accurate conclusions when that assumption is violated. Non-parametric methods based on machine learning techniques are now being developed to predict the time until an event occurs and to overcome this limitation of CPH. One such method is the Random Survival Forest (RSF), which analyzes right-censored survival data without relying on such assumptions. This paper compares the predictive quality of the two methods, measured by the C-index, on right-censored churn data of telecommunication industry customers with 2P packages consisting of Internet and TV, taken from all customer databases in the Jabodetabek area. The results show that the median C-index of the RSF model is 0.769, which is greater than the median C-index of the CPH model, 0.689. The prediction quality of the RSF model is therefore better than that of the CPH model in predicting the churn of telecommunication industry customers.
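
As a rough illustration of the C-index used to compare the two models, the sketch below computes Harrell's concordance index for right-censored data in plain Python. It is a minimal sketch, not the authors' implementation: the function name `concordance_index`, the toy durations, and the risk scores are all hypothetical, and the abstract does not say which variant of the C-index or which software was used. The index is the share of comparable customer pairs in which the customer who churned earlier was also assigned the higher predicted risk.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times       -- observed durations (time to churn or to censoring)
    events      -- 1 if churn was observed, 0 if the record is censored
    risk_scores -- model output; a higher score means earlier expected churn
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair is comparable only if the shorter duration ended in an
        # observed event; a censored shorter duration tells us nothing
        # about which customer churned first.
        if events[i] != 1:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier churner ranked riskier
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy data: four customers, the third one is censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.5, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported medians of 0.769 (RSF) and 0.689 (CPH) are read.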

Published
2022-12-15
How to Cite
[1]
S. Nurhaliza, K. Sadik, and A. Saefuddin, “A COMPARISON OF COX PROPORTIONAL HAZARD AND RANDOM SURVIVAL FOREST MODELS IN PREDICTING CHURN OF THE TELECOMMUNICATION INDUSTRY CUSTOMER”, BAREKENG: J. Math. & App., vol. 16, no. 4, pp. 1433-1440, Dec. 2022.