Objective: Devices such as mobile phones and smart speakers could be used to remotely identify voice alterations associated with alcohol intoxication and thereby support just-in-time interventions, but data to support such approaches for the English language are lacking. In this controlled laboratory study, we examined how well English spectrographic voice features identify alcohol intoxication.
Method: A total of 18 participants (72% male, ages 21-62 years) read a randomly assigned tongue twister before drinking and each hour for up to 7 hours after drinking a weight-based dose of alcohol. Vocal segments were cleaned and split into 1-second windows. We built support vector machine models to detect alcohol intoxication, defined as breath alcohol concentration > .08%, by comparing the baseline voice spectrographic signature with that at each subsequent timepoint, and we examined accuracy with 95% confidence intervals (CIs).
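As an illustration only, the windowing and labeling steps described in the Method could be sketched as follows. This is a minimal sketch under stated assumptions, not the study's pipeline: the function names, the choice to drop a trailing partial window, and the 0/1 label encoding are all hypothetical. Only the 1-second window length and the > .08% threshold come from the text.

```python
from typing import List, Sequence

# Threshold taken from the abstract: intoxication is BrAC > .08%.
INTOXICATION_THRESHOLD = 0.08


def split_into_windows(
    samples: Sequence[float], sample_rate: int, window_seconds: float = 1.0
) -> List[Sequence[float]]:
    """Split an audio signal into consecutive non-overlapping windows.

    Assumption: a trailing partial window is discarded. The abstract does
    not say how leftover audio was handled.
    """
    size = int(sample_rate * window_seconds)
    if size <= 0:
        raise ValueError("window must contain at least one sample")
    n_full = len(samples) // size
    return [samples[i * size:(i + 1) * size] for i in range(n_full)]


def label_windows(n_windows: int, brac_percent: float) -> List[int]:
    """Give every window of one recording the same label.

    1 = intoxicated (BrAC strictly above the threshold), 0 = not intoxicated.
    The 0/1 encoding is a hypothetical choice for this sketch.
    """
    label = 1 if brac_percent > INTOXICATION_THRESHOLD else 0
    return [label] * n_windows
```

Windows labeled this way would then be converted to spectrographic features and passed to a classifier such as a support vector machine; that feature extraction step is not specified in the abstract and is omitted here.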
Results: Alcohol intoxication was predicted with an accuracy of 98% (95% CI [97.1, 98.6]); mean sensitivity = .98; specificity = .97; positive predictive value = .97; and negative predictive value = .98.
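For readers unfamiliar with the reported metrics, the sketch below shows how accuracy, sensitivity, specificity, and the predictive values follow from confusion-matrix counts, together with one common way to form a 95% CI for a proportion. The counts in the usage lines are made up for illustration and are not the study's data, and the Wilson score interval is an assumption: the abstract does not state which CI method was used.

```python
import math
from typing import Dict, Tuple


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard metrics from true/false positive and negative counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives about 95%).

    Assumption: chosen for illustration only; the study's CI method is unknown.
    """
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Hypothetical counts, NOT taken from the study.
metrics = classification_metrics(tp=90, fp=10, tn=80, fn=20)
low, high = wilson_interval(successes=90 + 80, n=200)
```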
Conclusions: In this small, controlled laboratory study, voice spectrographic signatures collected from brief recorded English segments were useful in identifying alcohol intoxication. Larger studies using varied voice samples are needed to validate and expand models.