PREDICTION INTERVALS IN MACHINE LEARNING: RESIDUAL BOOTSTRAP AND QUANTILE REGRESSION FOR CASH FLOW ANALYSIS
Abstract
Time series forecasting often struggles to produce reliable predictions because of the inherent uncertainty in dynamic systems. Point predictions are commonly used, but they do not adequately capture this uncertainty, especially in financial systems where forecasting accuracy directly affects decision-making. Prediction intervals address this by providing a range of likely outcomes rather than a single-point estimate. This study implements multivariate time series forecasting using gradient boosting algorithms (XGBoost, CatBoost, and LightGBM) to predict cash flow patterns in banking transactions, focusing on constructing reliable prediction intervals. Using transaction data from Bank Rakyat Indonesia (BRI), the research analyzes both office and e-channel transactions with different lag structures based on Granger causality tests. Model performance was evaluated using RMSLE, MAE, and MAPE, with RMSLE chosen as the primary metric because of its ability to handle skewed distributions. LightGBM achieved the best performance for office cash-in transactions (RMSLE: 0.2395), while CatBoost outperformed the other models for office cash-out (RMSLE: 0.2848), e-channel cash-in (RMSLE: 0.3946), and e-channel cash-out (RMSLE: 0.4221). For prediction intervals, two methods were compared: Residual Bootstrap with 500 samples and Quantile Regression. Residual Bootstrap generally produced coverage probabilities closer to the nominal 80% level (i.e., the 10–90% prediction interval), especially for office transactions, while maintaining narrower interval widths. In contrast, Quantile Regression tended to generate wider intervals and often overestimated uncertainty, resulting in coverage well above the nominal level in some cases. However, both methods showed clear limitations when applied to e-channel transactions, particularly e-channel cash-in, where coverage probabilities fell below 50% due to high volatility and irregular transaction patterns.
Unlike previous work focused only on point forecasts, this study offers insights into forecast uncertainty by evaluating how well each method quantifies it, providing practical guidance for financial institutions aiming to improve risk management through interval-based forecasting.
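As an illustration of the interval methods named in the abstract, the following is a minimal sketch, not the authors' implementation (which is not shown here). It builds 10–90% intervals by resampling residuals around point forecasts, then computes empirical coverage and mean interval width, and also gives the pinball loss that quantile regression minimizes. All function names and the toy numbers are hypothetical; the 500 bootstrap draws and the 10%/90% bounds follow the abstract, and the residuals are assumed to be exchangeable.

```python
import random


def quantile(values, q):
    """Empirical quantile with linear interpolation, q in [0, 1]."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def residual_bootstrap_interval(point_forecasts, residuals,
                                n_boot=500, lower=0.10, upper=0.90, seed=0):
    """Add resampled in-sample residuals to each point forecast and
    take the lower/upper quantiles of the simulated values."""
    rng = random.Random(seed)
    intervals = []
    for f in point_forecasts:
        sims = [f + rng.choice(residuals) for _ in range(n_boot)]
        intervals.append((quantile(sims, lower), quantile(sims, upper)))
    return intervals


def coverage_and_width(actuals, intervals):
    """Share of actuals inside their interval, and mean interval width."""
    hits = sum(lo <= y <= hi for y, (lo, hi) in zip(actuals, intervals))
    mean_width = sum(hi - lo for lo, hi in intervals) / len(intervals)
    return hits / len(actuals), mean_width


def pinball_loss(actuals, preds, q):
    """Average quantile (pinball) loss for quantile level q."""
    total = 0.0
    for y, p in zip(actuals, preds):
        diff = y - p
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(actuals)


if __name__ == "__main__":
    # Synthetic toy values for illustration only (not BRI data).
    residuals = [-12.0, -5.0, -1.0, 0.0, 2.0, 4.0, 9.0, 15.0]
    forecasts = [100.0, 110.0, 95.0]
    actuals = [104.0, 131.0, 93.0]

    intervals = residual_bootstrap_interval(forecasts, residuals)
    coverage, width = coverage_and_width(actuals, intervals)
    print("intervals:", intervals)
    print("coverage:", coverage, "mean width:", width)
```

Coverage close to the nominal 80% together with a small mean width is the pattern the abstract describes as desirable; coverage far above 80% with wide intervals corresponds to overestimated uncertainty.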
Copyright (c) 2025 Wa Ode Rahmalia Safitri, Farit Mochamad Afendi, Budi Susetyo

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this Journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work’s authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal’s published version of the work, with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g. in institutional repositories or on their websites) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published works.