A Spatial Approach to Correlated High-Dimensional Stunting Data in Indonesia Using a Modified Generalized Lasso

Authors

  • Septian Rahardiantoro, Program in Statistics and Data Science, School of Data Science, Mathematics, and Informatics, IPB University, Indonesia
  • Aida Darajati, Program in Statistics and Data Science, School of Data Science, Mathematics, and Informatics, IPB University, Indonesia
  • Hari Wijayanto, Program in Statistics and Data Science, School of Data Science, Mathematics, and Informatics, IPB University, Indonesia
  • Anang Kurnia, Program in Statistics and Data Science, School of Data Science, Mathematics, and Informatics, IPB University, Indonesia

DOI:

https://doi.org/10.15849/ijasca.v18i1.18

Keywords:

ALOCV, high-dimensional data, generalized LASSO, KNN, stunting

Abstract

Stunting remains a significant public health issue in Indonesia. Although the national prevalence declined by 6.1% in 2024, several provinces continue to exhibit alarmingly high rates. This study aims to explore the spatial patterns of stunting across Indonesia, evaluate the performance of the generalized lasso model in identifying potential regional coefficient groupings based on various neighborhood structures, and determine the most influential factors contributing to stunting. The data, sourced from Statistics Indonesia and the Ministry of Home Affairs in 2024, cover 38 provinces and include ten predictor variables. The analysis employs a modified elastic net approach within the generalized lasso framework, incorporating a custom penalty matrix into the L2 regularization term to mitigate multicollinearity among predictors. The proposed models were evaluated against the Spatial Autoregressive (SAR) model, the standard elastic net, and the standard generalized lasso using specified neighborhood adjacency methods and tuning parameters. Optimal tuning parameters were selected using the Approximate Leave-One-Out Cross-Validation (ALOCV) method. The best-performing model used the K-Nearest Neighbors (KNN) neighborhood structure with k=3 and the custom penalty matrix, based on an optimal balance of degrees of freedom, lower AIC and RMSE, and optimum sensitivity criteria. The results reveal that the most influential factors associated with stunting prevalence in 2024 are the poverty rate (particularly in southern Kalimantan, several provinces in Sumatra, and East Nusa Tenggara) and child health insurance coverage, whose influence spans provinces across Indonesia.
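
As a rough illustration of the neighborhood-based penalty described in the abstract, the sketch below builds a KNN (k=3) graph from region coordinates and turns it into an edge-difference penalty matrix D, the usual device for encouraging neighboring regions to share a coefficient in a generalized lasso. It is not the authors' implementation: the coordinates are made-up points rather than province centroids, the function names are hypothetical, and the placement of D in both the L1 and L2 terms of the objective is only one assumed reading of the "custom penalty matrix".

```python
import numpy as np

def knn_edges(coords, k=3):
    """Return the sorted undirected edge list of a k-nearest-neighbor graph."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    # Pairwise Euclidean distances; a region is never its own neighbor.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)
    edges = set()
    for i in range(n):
        for j in np.argsort(dist[i])[:k]:
            edges.add((min(i, int(j)), max(i, int(j))))
    return sorted(edges)

def difference_matrix(edges, n):
    """One row per edge (i, j): +1 in column i and -1 in column j."""
    D = np.zeros((len(edges), n))
    for r, (i, j) in enumerate(edges):
        D[r, i] = 1.0
        D[r, j] = -1.0
    return D

def objective(y, x, beta, D, lam1, lam2):
    """Assumed objective for one predictor with region-specific coefficients:
    0.5 * ||y - x * beta||^2 + lam1 * ||D beta||_1 + lam2 * ||D beta||_2^2."""
    resid = y - x * beta
    d = D @ beta
    return 0.5 * resid @ resid + lam1 * np.abs(d).sum() + lam2 * d @ d

# Hypothetical coordinates: two well-separated clusters of four regions each.
coords = [(0, 0), (1, 0), (0, 1), (1, 1),
          (10, 10), (11, 10), (10, 11), (11, 11)]
edges = knn_edges(coords, k=3)
D = difference_matrix(edges, n=len(coords))

# Coefficients that are constant within each cluster incur no penalty.
beta_grouped = np.array([2.0, 2.0, 2.0, 2.0, 5.0, 5.0, 5.0, 5.0])
l1_penalty = np.abs(D @ beta_grouped).sum()
```

Because every row of D encodes a difference between two neighboring regions, the penalty vanishes exactly when connected regions share the same coefficient, which is what produces the regional coefficient groupings. Choosing the tuning parameters (for example by ALOCV) and actually minimizing the objective are outside the scope of this sketch.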

Published

2026-03-04