Multilevel Determinants of Overweight and Obesity Among U.S. Children Aged 10-17: Comparative Evaluation of Statistical and Machine Learning Approaches Using the 2021 National Survey of Children's Health
arXiv:2602.20303v1
Abstract
Background: Childhood and adolescent overweight and obesity remain major public health concerns in the United States and are shaped by behavioral, household, and community factors. Their joint predictive structure at the population level remains incompletely characterized.
Objectives: The study aims to identify multilevel predictors of overweight and obesity among U.S. adolescents and compare the predictive performance, calibration, and subgroup equity of statistical, machine-learning, and deep-learning models.
Data and Methods: We analyze 18,792 children aged 10-17 years from the 2021 National Survey of Children's Health. Overweight/obesity is defined using BMI categories. Predictors include diet, physical activity, sleep, parental stress, socioeconomic conditions, adverse experiences, and neighborhood characteristics. Models include logistic regression, random forest, gradient boosting, XGBoost, LightGBM, multilayer perceptron (MLP), and TabNet. Performance is evaluated using AUC, accuracy, precision, recall, F1 score, and Brier score.
Results: Discrimination ranges from 0.66 to 0.79. Logistic regression, gradient boosting, and MLP show the most stable balance of discrimination and calibration. Boosting and deep learning modestly improve recall and F1 score. No model is uniformly superior. Performance disparities across race and poverty groups persist across algorithms.
Conclusion: Increased model complexity yields limited gains over logistic regression. Predictors consistently span behavioral, household, and neighborhood domains. Persistent subgroup disparities indicate the need for improved data quality and equity-focused surveillance rather than greater algorithmic complexity.
Executive Summary
This study uses statistical, machine-learning, and deep-learning approaches to identify multilevel predictors of overweight and obesity among U.S. adolescents. Analyzing 18,792 children aged 10-17 from the 2021 National Survey of Children's Health, it compares the predictive performance, calibration, and subgroup equity of seven models, ranging from logistic regression to TabNet. Discrimination ranges from 0.66 to 0.79, and increased model complexity yields limited gains over logistic regression. Predictors consistently span behavioral, household, and neighborhood domains. Performance disparities across race and poverty groups persist across algorithms, which the authors read as a need for improved data quality and equity-focused surveillance rather than greater algorithmic complexity.
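The distinction between discrimination and calibration can be made concrete with the two metrics the abstract names for them, AUC and Brier score. The following is a minimal sketch, not the authors' evaluation code: the labels and the two "models" are made-up toy values, and the function names `roc_auc` and `brier_score` are chosen here for illustration. It shows that two sets of predictions can rank cases identically (same AUC) while differing sharply in calibration (different Brier score).

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random positive is scored above a random negative.

    Ties count as one half. This pairwise form is O(n_pos * n_neg),
    which is fine for a small illustration.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy outcome labels (1 = overweight/obese) and made-up predicted probabilities.
y = [1, 1, 1, 0, 0, 0, 0, 0]
model_a = [0.9, 0.8, 0.3, 0.4, 0.2, 0.2, 0.1, 0.1]
# model_b keeps model_a's ranking but pushes every score toward 1.
model_b = [0.99, 0.98, 0.93, 0.94, 0.92, 0.92, 0.91, 0.91]

for name, probs in [("model_a", model_a), ("model_b", model_b)]:
    print(name, "AUC =", round(roc_auc(y, probs), 3),
          "Brier =", round(brier_score(y, probs), 3))
```

Both toy models share the same AUC because AUC depends only on the ordering of scores, while the Brier score penalizes `model_b` for assigning high probabilities to negative cases. This is why a comparison like the one in the study reports both kinds of metric.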
Key Points
- ▸ The study identifies multilevel predictors of overweight and obesity among U.S. adolescents using machine learning and statistical approaches.
- ▸ Increased model complexity yields limited gains over logistic regression.
- ▸ Predictors consistently span behavioral, household, and neighborhood domains.
- ▸ Persistent subgroup disparities indicate the need for improved data quality and equity-focused surveillance.
Merits
Strength in methodology
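The study compares seven models on the same data (logistic regression, random forest, gradient boosting, XGBoost, LightGBM, multilayer perceptron, and TabNet) and evaluates them on both discrimination and calibration, using AUC, accuracy, precision, recall, F1 score, and Brier score.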
Use of large dataset
The analysis of 18,792 children from the 2021 National Survey of Children's Health provides a robust sample size for the study's findings.
Consideration of subgroup disparities
The study evaluates subgroup equity directly, reporting that performance disparities across race and poverty groups persist regardless of algorithm, and it frames equity-focused surveillance as the appropriate response.
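A subgroup performance check of this kind can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: the abstract does not say which metric was compared across groups, so recall is used here as one plausible choice, and the labels, predictions, and group names "A" and "B" are made up.

```python
from collections import defaultdict
from typing import Dict, Sequence


def recall_by_group(
    y_true: Sequence[int], y_pred: Sequence[int], groups: Sequence[str]
) -> Dict[str, float]:
    """Recall (true positives / actual positives), computed separately per subgroup.

    Groups with no actual positives are omitted, since recall is undefined there.
    """
    true_pos: Dict[str, int] = defaultdict(int)
    actual_pos: Dict[str, int] = defaultdict(int)
    for y, y_hat, g in zip(y_true, y_pred, groups):
        if y == 1:
            actual_pos[g] += 1
            if y_hat == 1:
                true_pos[g] += 1
    return {g: true_pos[g] / actual_pos[g] for g in actual_pos}


# Made-up labels, thresholded predictions, and hypothetical subgroup membership.
y_true = [1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0]
groups = ["A"] * 6 + ["B"] * 6

per_group = recall_by_group(y_true, y_pred, groups)
gap = max(per_group.values()) - min(per_group.values())
print(per_group, "recall gap =", gap)
```

In this toy data the model finds three of four positive cases in group A but only one of four in group B. A gap like this can remain when the algorithm is swapped, which is consistent with the study's conclusion that the remedy lies in data quality and surveillance rather than in model choice.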
Demerits
Limited gains from model complexity
Increased model complexity yields limited gains over logistic regression. This result is specific to the survey predictors and the seven models tested, so it may not generalize to richer or more complex datasets.
Persistent subgroup disparities
The study's findings highlight the need for improved data quality and equity-focused surveillance, but do not provide a clear solution to address these disparities.
Expert Commentary
This study contributes to childhood obesity research by comparing statistical, machine-learning, and deep-learning models on a common set of multilevel predictors. The findings highlight the importance of considering the joint predictive structure of behavioral, household, and neighborhood factors in addressing childhood and adolescent overweight and obesity. However, the modest gains from more complex models and the persistent performance disparities across race and poverty groups underscore the need for further research and for more effective prevention strategies. The study's emphasis on equity-focused surveillance and improved data quality is particularly relevant in the current public health landscape.
Recommendations
- ✓ Future studies should test whether improved data quality and richer predictors, rather than more complex models alone, improve prediction of overweight and obesity, since added model complexity yielded limited gains in this study.
- ✓ Healthcare providers and policymakers should prioritize the development of prevention strategies that address the multilevel predictors of overweight and obesity, particularly in underrepresented subgroups.