The Transactions P of the Korean Institute of Electrical Engineers

ISO Journal Title Trans. P of KIEE
  • Indexed by
    Korea Citation Index (KCI)
Title Diagnosis of Depression Based on Transfer Learning Model Using Audio data of Interview-type
Authors 조아현(A-Hyeon Jo) ; 곽근창(Keun-Chang Kwak)
DOI https://doi.org/10.5370/KIEEP.2021.70.4.277
Page pp.277-283
ISSN 1229-800X
Keywords AI technology; Depression diagnosis; Transfer Learning; Interview-type audio data; two-dimensional images
Abstract Depression can lead to serious mental and physical illness, so early detection is important. Systems that use AI technology to support the early detection of depression are being developed in various ways. In particular, research on diagnosing depression from the voice, which is readily available in daily life, is being actively conducted. In this paper, we compare and analyze the depression diagnosis performance of transfer learning models using interview-type audio data. The data come from the DAIC-WOZ Depression Database, which contains audio recordings of interviews. As transfer learning models, we use VGGish and YAMNet, which are built on the Convolutional Neural Network (CNN) and are widely used for audio classification. Speech features are extracted as black-and-white and color two-dimensional images using the Bark spectrogram, Mel spectrogram, and Log Mel-spectrogram methods. Depression diagnosis performance is higher with YAMNet than with VGGish. With black-and-white input images, YAMNet performed best at 94.48% when Mel spectrogram features were used. With color input images, YAMNet performed best at 97.34% when Bark spectrogram features were used, indicating that this combination is the most suitable for diagnosing depression among those compared.
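
The feature-extraction step described in the abstract, turning speech into a two-dimensional spectrogram image, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state frame sizes, filter counts, or sampling rates, so `sample_rate`, `n_fft`, `hop`, and `n_mels` below are illustrative assumptions, as is the HTK-style mel formula. The sketch uses only the Python standard library and a naive DFT for clarity; a practical pipeline would normally rely on an FFT from a numerical library. It produces a black-and-white Log Mel-spectrogram image from a synthetic tone standing in for an interview recording.

```python
import cmath
import math


def hz_to_mel(hz):
    """HTK-style mel scale (one common convention; an assumption here)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum of one real-valued frame (bins 0..N/2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        bank.append([max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
                     for f in bin_hz])
    return bank


def log_mel_image(signal, sample_rate=8000, n_fft=128, hop=64, n_mels=16):
    """Return rows (mel bands x time frames) of 0..255 grayscale values."""
    # Hann window applied to every frame before the DFT.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / n_fft) for t in range(n_fft)]
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    columns = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = [signal[start + t] * window[t] for t in range(n_fft)]
        power = power_spectrum(frame)
        # Log of the mel-band energies; the small offset avoids log(0).
        columns.append([math.log(sum(w * p for w, p in zip(row, power)) + 1e-6)
                        for row in bank])
    lo = min(min(c) for c in columns)
    hi = max(max(c) for c in columns)
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    # Transpose so that rows are mel bands and columns are time frames.
    return [[int(round((c[m] - lo) * scale)) for c in columns]
            for m in range(n_mels)]


# Synthetic stand-in for speech: 0.1 s of a 1000 Hz tone sampled at 8000 Hz.
tone = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(800)]
image = log_mel_image(tone)
print(len(image), "mel bands x", len(image[0]), "frames")
```

With these assumed settings the result is a 16-by-11 grid of grayscale values. A Bark spectrogram would follow the same structure with the mel mapping replaced by a Bark-scale mapping, and a color image could be obtained by passing the grayscale values through a colormap; both are left out here because the abstract does not specify how the authors did either.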