Abstract
Background and aims: In Gujarat, the prevalence of anemia among children under 5 years of age (U5C) exceeds the Indian national average. This study therefore aimed to identify risk factors for anemia among U5C in Gujarat using several feature selection methods and to compare the performance of those methods.
Methods: This cross-sectional study used National Family Health Survey-5 (2019–21) data on 8,058 children aged 6–59 months, selected through a stratified two-stage sampling design. Stepwise, backward, forward, correlation-based, and least absolute shrinkage and selection operator (LASSO) methods were applied to identify the factors most strongly associated with anemia. Accuracy, recall, precision, F1-score, deviance, and the area under the curve (AUC) were used to compare the feature selection methods.
Results: Performance varied across the stepwise, backward, forward, and correlation-based methods: accuracy, recall, precision, F1-score, AUC, and deviance ranged over 0.612–0.895, 0.124–0.857, 0.419–0.650, 0.194–0.739, 0.651–0.685, and 10803.0–11221.4, respectively. The LASSO method outperformed all others (accuracy=0.945, recall=0.915, precision=0.783, F1=0.843, AUC=0.747, deviance=7610.4). Among the key variables identified by LASSO, higher maternal education, improved sanitation, breastfeeding, vitamin A supplementation, and antenatal visits were protective factors, whereas unprotected drinking water and diarrhea treatment were associated with higher anemia risk. Wealth index and cooking fuel type were also significantly associated with anemia.
Conclusion: Targeting these modifiable factors could substantially reduce the anemia burden in Gujarat. The findings point to the need for integrated public health and social interventions that address maternal education, environmental health, and nutritional support to combat anemia.
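
As an illustration of how the classification metrics named in the Methods relate to one another, the following minimal sketch computes accuracy, recall, precision, and F1-score from confusion-matrix counts, and recomputes F1 from the precision and recall reported for LASSO. The function names and the example counts are hypothetical and are not taken from the study; only the values 0.783 and 0.915 come from the Results.

```python
def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, recall, precision, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1_from(precision, recall),
    }


# Hypothetical counts, for illustration only (not study data).
example = classification_metrics(tp=90, fp=10, fn=30, tn=70)
print(example)

# Precision and recall reported for LASSO in the Results.
lasso_f1 = f1_from(precision=0.783, recall=0.915)
print(round(lasso_f1, 3))
```

Recomputing F1 from the rounded precision and recall gives a value within rounding error of the reported 0.843, which suggests the reported LASSO metrics are internally consistent.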