Characterizing Infant Mortality Using Data Mining - A Case Study in Two Brazilian States - Santa Catarina and Amapá

Wanderson L Soares; Mark A J Song; Luis E Zárate; Cristiane N Nobre

doi:10.3233/SHTI220183

Stud Health Technol Inform. 2022 Jun 6:290:772-776. doi: 10.3233/SHTI220183.

Authors

Wanderson L Soares¹, Mark A J Song¹, Luis E Zárate¹, Cristiane N Nobre¹

Affiliation

¹ Pontifical Catholic University of Minas Gerais, Institute of Exact Sciences and Informatics, Graduate Program in Informatics, Minas Gerais, Belo Horizonte, Brazil.

PMID: 35673122

Abstract

Infant mortality is the death of children under one year of age, and it affects millions of children worldwide. The objective of this article is to apply concepts of knowledge discovery in databases, specifically machine learning in the data mining phase, to characterize infant mortality in two Brazilian states: Santa Catarina, which has the lowest infant mortality rate among the country's states, and Amapá, which has the highest. The classifiers C4.5, JRip, Random Forest, SVM, and Multilayer Perceptron were used, and their results in both states are briefly compared. The dataset preprocessing, which includes attribute selection and class balancing, is also detailed. The results show that the features APGAR5, WEIGHT, and CONGENITAL ANOMALY stood out the most in the rules generated by the tree-based classifiers.

Keywords: APGAR; DATASUS; Infant mortality.
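
The abstract mentions class balancing as a preprocessing step but does not say which technique was used. As a minimal sketch only, the Python snippet below illustrates one common option, random undersampling of the majority class. The function name `undersample`, the label column `DEATH`, the underscore spelling `CONGENITAL_ANOMALY`, and all record values are hypothetical and synthetic; they are not taken from the study or from DATASUS.

```python
import random
from collections import Counter


def undersample(records, label_key, seed=0):
    """Randomly drop records so every class keeps as many rows as the smallest class.

    Assumption: random undersampling stands in for the (unspecified)
    balancing method used in the study.
    """
    rng = random.Random(seed)

    # Group the records by their class label.
    by_class = {}
    for rec in records:
        by_class.setdefault(rec[label_key], []).append(rec)

    # Every class is reduced to the size of the smallest one.
    target = min(len(group) for group in by_class.values())

    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced


# Synthetic, made-up birth records: 95 survivors and 5 deaths (hypothetical values).
gen = random.Random(42)
records = [
    {"APGAR5": gen.randint(7, 10), "WEIGHT": gen.randint(2500, 4200),
     "CONGENITAL_ANOMALY": 0, "DEATH": 0}
    for _ in range(95)
] + [
    {"APGAR5": gen.randint(0, 6), "WEIGHT": gen.randint(500, 2400),
     "CONGENITAL_ANOMALY": gen.randint(0, 1), "DEATH": 1}
    for _ in range(5)
]

balanced = undersample(records, "DEATH")
counts_before = Counter(rec["DEATH"] for rec in records)
counts_after = Counter(rec["DEATH"] for rec in balanced)
print("before:", dict(counts_before))
print("after:", dict(counts_after))
```

Undersampling discards majority-class records, which is simple but loses information; oversampling the minority class is an alternative with the opposite trade-off. Which of these, if either, matches the study's preprocessing is not stated in the abstract.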

MeSH terms

Brazil / epidemiology
Child
Child, Preschool
Data Mining*
Humans
Infant
Infant Mortality
Machine Learning*
Neural Networks, Computer