Project Details


Multimodal Emotion AI

Category: Machine Learning & AI
Duration: 4 months
Status: Completed

Summary

Multimodal Emotion Recognition AI is a machine learning project designed to classify human emotions by analysing both facial expressions and speech signals. The system improves prediction accuracy by fusing results from two separate models—one for audio, one for visual input—using a confidence-based decision strategy.
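As a concrete illustration of that decision strategy, the sketch below fuses the two models' softmax outputs. This is a minimal sketch rather than the project's actual code: the four-class label set, the function name fuse_predictions, and the audio-wins tie-break are assumptions made for the example.

```python
import numpy as np

# Assumed label set for illustration; the project's actual classes may differ.
EMOTIONS = ["angry", "happy", "neutral", "sad"]

def fuse_predictions(audio_probs, visual_probs):
    """Confidence-based fusion: if the audio and visual models agree, keep the
    shared label; if they disagree, trust the modality whose top prediction
    carries the higher confidence (ties go to audio here by convention)."""
    audio_probs = np.asarray(audio_probs)
    visual_probs = np.asarray(visual_probs)
    a_idx, v_idx = int(audio_probs.argmax()), int(visual_probs.argmax())
    if a_idx == v_idx:
        return EMOTIONS[a_idx]
    if audio_probs[a_idx] >= visual_probs[v_idx]:
        return EMOTIONS[a_idx]
    return EMOTIONS[v_idx]

# The visual model is more confident here, so its label wins -> "happy".
print(fuse_predictions([0.40, 0.30, 0.20, 0.10],
                       [0.05, 0.80, 0.10, 0.05]))
```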

Description

Multimodal Emotion Recognition AI is an advanced machine learning project that identifies human emotions by analysing both facial expressions and speech signals. By combining audio and visual modalities, the system enhances emotion classification accuracy, robustness, and real-world usability.

The project is built with clean, modular code and supports the full pipeline: feature extraction, training, evaluation, and deployment. It integrates pretrained models (e.g., VGG16 for facial data and an MFCC-based CNN/LSTM for speech) and applies a confidence-based fusion strategy to resolve disagreements between the two modalities.
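To make the feature-extraction step concrete, here is a minimal sketch of the two front ends named above, using Librosa for speech MFCCs and OpenCV for image HOG descriptors. The 40-coefficient MFCC setting, the 64x64 HOG window, and the helper names are illustrative assumptions rather than the project's exact configuration.

```python
import cv2
import librosa

def extract_mfcc(audio_path, n_mfcc=40):
    """Load a speech clip and return a fixed-length MFCC feature vector."""
    y, sr = librosa.load(audio_path, sr=None)  # keep the native sample rate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    # Average over time frames so clips of any length map to one vector.
    return mfcc.mean(axis=1)

def extract_hog(image_path, win=(64, 64)):
    """Load a face crop, grayscale and resize it, and return its HOG descriptor."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, win)
    hog = cv2.HOGDescriptor(win, (16, 16), (8, 8), (8, 8), 9)
    return hog.compute(img).flatten()
```

Vectors like these would then feed the speech CNN/LSTM and the image classifier respectively, with VGG16 embeddings slotting in as a richer alternative to HOG.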

Technologies & Skills

Emotion classification using supervised learning
HOG and VGG16 for images
Librosa
Matplotlib
MFCC for speech
Modular coding and reusable functions
NumPy
OpenCV
Pandas
Python (advanced)