BAYESIAN ADDITIVE REGRESSION TREE APPLICATION FOR PREDICTING MATERNITY RECOVERY RATE OF GROUP LONG-TERM DISABILITY INSURANCE

  • Stevanny Budiana, Center for Mathematics and Society, Department of Mathematics, Parahyangan Catholic University, Indonesia
  • Felivia Kusnadi, Center for Mathematics and Society, Department of Mathematics, Parahyangan Catholic University, Indonesia
  • Robyn Irawan, Center for Mathematics and Society, Department of Mathematics, Parahyangan Catholic University, Indonesia
Keywords: Bayesian Additive Regression Tree, Maternity Recovery Rate, Prior, Sum-of-Trees

Abstract

Bayesian Additive Regression Tree (BART) is a sum-of-trees model used for classification and regression problems. The main idea of the method is to place a prior distribution on the trees that keeps each tree small, and to combine that prior with the likelihood of the data to obtain the posterior. Because each tree is kept small, any single tree contributes only a small part of the overall prediction, which is the sum of the outputs of all the trees. In this study, BART is used to predict the maternity recovery rate from the group long-term disability insurance data of the Society of Actuaries (SOA). Four decision tree-based models (Gradient Boosting Machine, Random Forest, Decision Tree, and BART) are compared on mean squared error and program runtime to find the best model. Among the models compared, BART gives the best predictions, with smaller root mean squared error values and a relatively short runtime.
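
The two ideas in the abstract, a prior that favors small trees and a prediction formed by summing many small trees, can be illustrated with a minimal sketch. It assumes the depth-based split prior commonly used for BART, in which a node at depth d is split with probability alpha * (1 + d)^(-beta); the values alpha = 0.95 and beta = 2 are the commonly cited defaults, not values taken from this paper. The helper names and the three hand-written stumps are hypothetical and serve only as illustration; they are not fitted to the SOA data.

```python
def split_probability(depth, alpha=0.95, beta=2.0):
    """Prior probability that a node at this depth is split further.

    alpha and beta are assumed defaults, not values from this paper.
    """
    return alpha * (1.0 + depth) ** (-beta)


def stump(threshold, left_value, right_value):
    """Build a depth-1 tree on a single numeric input (hypothetical helper)."""
    def predict(x):
        return left_value if x < threshold else right_value
    return predict


def sum_of_trees(trees, x):
    """The ensemble prediction is the sum of every tree's output."""
    return sum(tree(x) for tree in trees)


# The split probability falls quickly with depth, so deep trees are unlikely a priori.
probs = [split_probability(d) for d in range(4)]

# Three small trees, each adding only a small adjustment to the total.
trees = [
    stump(0.3, -0.10, 0.10),
    stump(0.5, -0.20, 0.20),
    stump(0.7, -0.05, 0.05),
]
prediction = sum_of_trees(trees, 0.6)

if __name__ == "__main__":
    for d, p in enumerate(probs):
        print(f"depth {d}: split probability {p:.4f}")
    print(f"sum-of-trees prediction at x = 0.6: {prediction:.2f}")
```

Under these assumed values the root splits with probability 0.95, but a node at depth 3 splits with probability below 0.06, which is how the prior keeps individual trees shallow.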

Published
2023-04-15
How to Cite
[1]
S. Budiana, F. Kusnadi, and R. Irawan, “BAYESIAN ADDITIVE REGRESSION TREE APPLICATION FOR PREDICTING MATERNITY RECOVERY RATE OF GROUP LONG-TERM DISABILITY INSURANCE”, BAREKENG: J. Math. & App., vol. 17, no. 1, pp. 0135-0146, Apr. 2023.