ISSN 2394-5125
 


    Speech Emotion Recognition using Convolutional Neural Networks and Mel Frequency Cepstral Coefficients (2020)


    Medipally Nagasri, Kalpana K, Shirisha
    JCR. 2020: 4858-4866

    Abstract

    In recent years, significant advancements have been made in artificial intelligence, machine learning, and human-machine interaction. Voice interaction and command-based control of machines have become increasingly popular, with virtual assistants like SIRI, Alexa, Cortana, and Google Assistant integrated into various consumer electronics. However, one of the limitations of machines is their inability to interact with humans as empathetic conversational partners. They often struggle to recognize and respond to human emotions. Emotion recognition from speech has emerged as a cutting-edge research area in the field of human-machine interaction, aiming to create more robust man-machine communication systems. Researchers are actively working on speech emotion recognition (SER) to enhance the quality of human-machine interaction. To achieve this goal, computers should be capable of recognizing emotional states and responding to them in ways that mirror human understanding. The effectiveness of SER systems relies on the quality of extracted features and the choice of classifiers. This project focuses on identifying four basic emotions (anger, sadness, neutrality, and happiness) from speech. It employs audio files of short Manipuri speech from movies as training and testing datasets. The methodology utilizes Convolutional Neural Networks (CNN) for emotion recognition, employing Mel Frequency Cepstral Coefficients (MFCC) as the feature extraction technique from speech data.
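    The abstract names MFCC as the feature extraction step but gives no implementation details. As an illustration only, the sketch below implements the standard MFCC pipeline (framing and windowing, power spectrum, mel filterbank, log compression, DCT-II) in plain NumPy; all parameter values (16 kHz sample rate, 25 ms frames, 10 ms hop, 26 filters, 13 coefficients) are common defaults assumed here, not values taken from the paper.

    ```python
    import numpy as np

    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    def mel_filterbank(n_filters, n_fft, sr):
        # Triangular filters spaced evenly on the mel scale.
        mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
        bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
        fb = np.zeros((n_filters, n_fft // 2 + 1))
        for i in range(1, n_filters + 1):
            left, center, right = bins[i - 1], bins[i], bins[i + 1]
            for k in range(left, center):
                fb[i - 1, k] = (k - left) / max(center - left, 1)
            for k in range(center, right):
                fb[i - 1, k] = (right - k) / max(right - center, 1)
        return fb

    def mfcc(signal, sr=16000, frame_len=400, hop=160, n_fft=512,
             n_filters=26, n_coeffs=13):
        # Frame the signal and apply a Hamming window.
        n_frames = 1 + (len(signal) - frame_len) // hop
        frames = np.stack([signal[i * hop: i * hop + frame_len]
                           for i in range(n_frames)])
        frames = frames * np.hamming(frame_len)
        # Power spectrum of each frame.
        power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
        # Log mel filterbank energies (small offset avoids log(0)).
        energies = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
        # DCT-II to decorrelate; keep the first n_coeffs coefficients.
        n = np.arange(n_filters)
        basis = np.cos(np.pi * np.outer(np.arange(n_coeffs),
                                        (2 * n + 1) / (2.0 * n_filters)))
        return energies @ basis.T  # shape: (n_frames, n_coeffs)
    ```

    The resulting (frames x coefficients) matrix is the kind of 2-D feature map that can then be fed to a CNN classifier, as the paper's methodology describes; in practice a library implementation such as `librosa.feature.mfcc` would typically be used instead of hand-rolled code.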


    Volume & Issue

    Volume 7 Issue-11
