MIXED-EFFECTS MODELS WITH GENERALIZED RANDOM FOREST: IMPROVED FOOD INSECURITY ANALYSIS
Abstract
Food insecurity is a complex issue, and addressing it requires a clear understanding of the factors that influence it. Accurate predictions are crucial for effective interventions. Machine learning is well suited to the large and complex datasets available in the big data era. However, standard machine learning methods generally do not accommodate hierarchical or clustered data structures, which makes such data difficult to model. Mixed-effects models, by contrast, are designed for hierarchical data. This study introduces a novel approach to predicting food insecurity that integrates mixed-effects models with a generalized random forest. The mixed-effects component captures variation in hierarchical or clustered data, such as differences between regions, while the generalized random forest, an extension of the traditional random forest, models the fixed effects and improves prediction accuracy. The empirical data are the 2021 food insecurity data from West Java, Indonesia. The results show that the mixed-effects model with a generalized random forest (GMEGRF) significantly improves prediction accuracy compared with the other models considered. The averaged performance measures indicate that GMEGRF performs well, with a balanced accuracy of 0.6789709, the highest among the methods compared. This methodological advance offers a new, robust model for understanding and potentially mitigating food insecurity, ultimately informing efforts towards SDG 2 (Zero Hunger).
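The abstract reports balanced accuracy as its headline metric but does not give the formula. The sketch below assumes the standard binary definition, the mean of sensitivity and specificity. The function names and the toy labels are hypothetical and are not taken from the West Java data.

```python
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity and specificity for binary labels (1 = positive).

    Assumes the standard definition; the paper's exact computation is not
    stated in the abstract.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return (sensitivity + specificity) / 2


def plain_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of correct predictions, for comparison."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical toy labels: 2 positive households out of 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

print(plain_accuracy(y_true, y_pred))     # 0.8
print(balanced_accuracy(y_true, y_pred))  # 0.6875
```

On these toy labels plain accuracy is 0.8 while balanced accuracy is 0.6875, because balanced accuracy weights both classes equally. That property may be why it is preferred when the positive class is comparatively rare.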
Copyright (c) 2026 Herlin Fransiska, Agus Mohamad Soleh, Khairil Anwar Notodiputro, Erfiani Erfiani

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this Journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work’s authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal’s published version of the work, with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g. in institutional repositories or on their websites) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published works.






