BAYESIAN ADDITIVE REGRESSION TREE APPLICATION FOR PREDICTING MATERNITY RECOVERY RATE OF GROUP LONG-TERM DISABILITY INSURANCE
Abstract
Bayesian Additive Regression Trees (BART) is a sum-of-trees model used for classification and regression problems. The main idea of the method is to use a prior distribution that keeps each tree small and to combine it with the likelihood of the data to obtain the posterior. Because each tree is kept small, any single tree contributes only a little to the overall prediction, which is the sum of the outputs of all the trees used. In this work, BART is used to predict the maternity recovery rate of group long-term disability insurance from data provided by the Society of Actuaries (SOA). Four decision-tree-based models, namely Gradient Boosting Machine, Random Forest, Decision Tree, and BART, are compared on mean squared error and program runtime to find the best model. Among the models compared, BART gives the best predictions, with the smallest root mean squared error and a relatively short runtime.
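For readers unfamiliar with the sum-of-trees idea, the model described above can be sketched in the notation commonly used for BART. The symbols below ($m$, $T_j$, $M_j$, $g$, $\sigma^2$, $\alpha$, $\beta$, $d$) are the usual textbook notation and are assumptions here, not taken from this abstract; the second line is the standard depth-based prior that discourages large trees, and the paper may use a different parameterization.

```latex
% Sum-of-trees model: the response is the sum of m small trees plus Gaussian noise.
% T_j is the structure of tree j, M_j its set of leaf values,
% and g(x; T_j, M_j) is the leaf value that x falls into.
Y = \sum_{j=1}^{m} g(x; T_j, M_j) + \varepsilon,
\qquad \varepsilon \sim \mathcal{N}(0, \sigma^2)

% Regularizing tree prior: a node at depth d is split (nonterminal) with probability
P(\text{node at depth } d \text{ is nonterminal}) = \alpha (1 + d)^{-\beta},
\qquad \alpha \in (0, 1), \; \beta \ge 0
```

Since the split probability decays with depth, deep trees receive little prior mass, which is what keeps each individual tree a weak contributor to the sum.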
Copyright (c) 2023 Stevanny Budiana, Felivia Kusnadi, Robyn Irawan
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.