Abstract
Speech emotion recognition is a vital area of research, with applications that range from human-computer interaction to mental health monitoring. This paper surveys the techniques, methods, applications, and challenges in speech emotion recognition. It begins by examining the significance of recognizing emotions from speech and its diverse applications across various fields. It then reviews the methods employed for speech emotion recognition, encompassing traditional machine learning techniques, such as support vector machines and Gaussian mixture models, as well as contemporary approaches, including deep learning and multimodal fusion.
Moreover, it examines benchmark datasets commonly used for training and evaluation in speech emotion recognition research. Speech Emotion Recognition (SER) has a wide range of applications and has attracted considerable research in recent years. However, the entertainment sector remains understudied in this field.
Many studies use Neural Network (NN) and Long Short-Term Memory (LSTM) architectures to categorize the emotions in audio recordings of actors expressing various emotions. Our survey also explores real-world applications of speech emotion recognition, such as virtual assistants, healthcare, marketing, education, and mental health diagnosis. It discusses the challenges associated with speech emotion recognition, including the variability of emotional expressions, cultural influences, humor, and privacy concerns. In the future, this survey may help others deal with noisy datasets and cultural variation in emotional expression.
This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright (c) 2024 African Journal of Biomedical Research