Voice-based Depression Study

[ Project Summary | Collaborators | Keywords | Project References]

Project Summary

Major Depressive Disorder (MDD) affects almost one in five women and one in twelve men in their lifetime and was recently recognized as the world's leading cause of disability. Yet current pharmacological and psychological therapies have limited efficacy, and only about half of those suffering from MDD are identified and offered treatment. An obstacle preventing effective use of existing therapies, and impeding the discovery of better ones, is the difficulty of diagnosing MDD. Diagnosis is still made on the basis of a clinical interview and mental status examination, a method with relatively low reliability. Early intervention before the onset of severe symptoms can alleviate MDD's worst consequences, including suicide. One possible source of information for improving diagnosis is the characterization of MDD from a person's speech.

Changes in the way people talk reflect alterations in mood, but attempts to use this information have not so far been clinically useful. Our work focuses on building algorithms to enable reliable automated detection of MDD from speech signals, with a special focus on voice quality features.

The major goals of this project are as follows:

(1) To develop a diarization system to automatically annotate patient-doctor recordings.  

(2) To analyse voice samples of patients and model the relationship between speech acoustics and MDD.

(3) To develop classification systems that can predict MDD from voice samples.

(4) To study and improve the robustness of such classification systems to noise, language and other confounding variables.

Students and Collaborators

Prof. Abeer Alwan, SPAPL, UCLA

Amber Afshan, SPAPL, UCLA

Vijay Ravi, SPAPL, UCLA

Jinhan Wang, SPAPL, UCLA

Prof. Jonathan Flint, UCLA School of Medicine, UCLA

Dr. Joel Mefford, Department of Neurology, UCLA School of Medicine, UCLA

Prof. Tingshao Zhu, CCPL, Institute of Psychology, Chinese Academy of Sciences

Yazheng Di, CCPL, Institute of Psychology, Chinese Academy of Sciences

Prof. Eran Halperin

Dr. Elior Rahmani, Postdoc, Electrical Engineering and Computer Sciences Department, UC Berkeley


Keywords

Depression detection, Speaker Diarization, Mental Health.

Project References

Aditya Gorla, Sriram Sankararaman, Esteban Burchard, Jonathan Flint, Noah Zaitlen and Elior Rahmani. "Phenotypic subtyping via contrastive learning". Accepted to RECOMB 2023.

Jinhan Wang, Vijay Ravi, Jonathan Flint, Abeer Alwan, "Unsupervised Instance Discriminative Learning for Depression Detection from Speech Signals," in Interspeech 2022, 2018-2022, doi: 10.21437/Interspeech.2022-10814

Vijay Ravi, Jinhan Wang, Jonathan Flint, Abeer Alwan, "A Step Towards Preserving Speakers' Identity While Detecting Depression Via Speaker Disentanglement," in Interspeech 2022, 3338-3342, doi: 10.21437/Interspeech.2022-10798

Di Y, Wang J, Liu X and Zhu T (2021) Combining Polygenic Risk Score and Voice Features to Detect Major Depressive Disorders. Front. Genet. 12:761141. doi: 10.3389/fgene.2021.761141.

Di, Yazheng, Jingying Wang, Weidong Li, and Tingshao Zhu. "Using i-vectors from voice features to identify major depressive disorder." Journal of Affective Disorders 288 (2021): 161-166. doi: 10.1016/j.jad.2021.04.004.

Afshan, A., Guo, J., Park, S.J., Ravi, V., Flint, J., Alwan, A. (2018) Effectiveness of Voice Quality Features in Detecting Depression. Proc. Interspeech 2018, 1676-1680, DOI: 10.21437/Interspeech.2018-1399.

Vijay Ravi, Jinhan Wang, Jonathan Flint, Abeer Alwan, "FrAUG: A Frame Rate Based Data Augmentation Method for Depression Detection from Speech Signals," in ICASSP 2022.


Abeer Alwan (alwan@seas.ucla.edu)