Listening for Health and Emotion
Using speech acoustics and ML to decode health and affect
This cluster highlights how detailed acoustic analysis of speech (features such as pitch, intensity, jitter, shimmer, and cepstral measures) is being used to detect and monitor emotional and clinical states. Work ranges from transformer-based, real-time emotion recognition and conditional-GAN (CGAN) data augmentation for Parkinson’s detection to reviews of speech-derived biomarkers for depression and longitudinal voice assessments. Supporting resources include human vocalization libraries, clinical voice-analysis tools, and studies linking specific acoustic features to emotional vocalizations. Together, these efforts position voice as a rich, non-invasive biomarker for both mental health and neurological disease, powered by modern machine learning.
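To make the named features concrete, here is a minimal sketch of how pitch, jitter, and shimmer can be computed from a signal. This is an illustrative toy, not any of the cluster's actual pipelines (which typically rely on dedicated tools such as Praat or openSMILE); the function names, the autocorrelation pitch tracker, and the synthetic test signal are all assumptions made for the example.

```python
import numpy as np

def pitch_autocorr(frame, sr, fmin=75, fmax=400):
    """Estimate F0 of a voiced frame via autocorrelation (a common baseline)."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)  # search lags inside the F0 range
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sr / lag

def jitter_shimmer(periods, amps):
    """Local jitter/shimmer: mean absolute cycle-to-cycle change, normalized."""
    jitter = np.mean(np.abs(np.diff(periods))) / np.mean(periods)
    shimmer = np.mean(np.abs(np.diff(amps))) / np.mean(amps)
    return jitter, shimmer

# Synthetic "voice": a 200 Hz tone, plus simulated per-cycle perturbations.
sr = 16000
t = np.arange(0, 0.5, 1 / sr)
y = np.sin(2 * np.pi * 200 * t)
f0 = pitch_autocorr(y[:1024], sr)

rng = np.random.default_rng(0)
periods = (1 / 200) * (1 + 0.01 * rng.standard_normal(20))  # cycle durations
amps = 1.0 + 0.02 * rng.standard_normal(20)                 # cycle peak amplitudes
jit, shim = jitter_shimmer(periods, amps)
```

On this clean tone the tracker recovers roughly 200 Hz, and the perturbation sizes directly set the jitter and shimmer values; in pathological or emotional speech, elevated jitter and shimmer are among the cues the reviewed systems exploit.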