Project Details

Machine Learning & AI
Multimodal Emotion AI
Summary
Multimodal Emotion Recognition AI is a machine learning project designed to classify human emotions by analysing both facial expressions and speech signals. The system improves prediction accuracy by fusing results from two separate models—one for audio, one for visual input—using a confidence-based decision strategy.
Description
Multimodal Emotion Recognition AI is an advanced machine learning project that identifies human emotions by analysing both facial expressions and speech signals. By combining audio and visual modalities, the system enhances emotion classification accuracy, robustness, and real-world usability.
The project is built on clean, modular code with full support for feature extraction, training, evaluation, and deployment. It integrates a pretrained VGG16 network for facial data alongside an MFCC-based CNN/LSTM for speech, and applies a confidence-based fusion strategy to resolve disagreements between the two modalities.
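The fusion rule itself is not spelled out here, so the following is a minimal sketch of what a confidence-based strategy can look like: each model emits a softmax distribution over the emotion classes, the top-class confidences are compared, and a near-tie falls back to averaging the two distributions. All names here (fuse_predictions, EMOTIONS, margin) are illustrative placeholders, not the project's actual API.

    import numpy as np

    EMOTIONS = ["angry", "happy", "neutral", "sad"]  # placeholder label set

    def fuse_predictions(audio_probs, visual_probs, margin=0.1):
        """Pick the modality whose top class is more confident.

        audio_probs, visual_probs: softmax outputs over EMOTIONS.
        margin: if the two confidences fall within this margin,
                average the distributions instead (late fusion).
        """
        audio_probs = np.asarray(audio_probs, dtype=float)
        visual_probs = np.asarray(visual_probs, dtype=float)

        audio_conf = audio_probs.max()
        visual_conf = visual_probs.max()

        if abs(audio_conf - visual_conf) < margin:
            fused = (audio_probs + visual_probs) / 2.0  # near-tie: average
        elif audio_conf > visual_conf:
            fused = audio_probs   # trust the audio model
        else:
            fused = visual_probs  # trust the visual model

        return EMOTIONS[int(fused.argmax())], fused

    # The models disagree; the more confident one wins.
    label, probs = fuse_predictions(
        audio_probs=[0.10, 0.75, 0.10, 0.05],   # audio says "happy" (0.75)
        visual_probs=[0.30, 0.25, 0.35, 0.10],  # visual weakly says "neutral" (0.35)
    )
    print(label)  # -> happy

A margin-based tie-break like this keeps one strong modality from being dragged down by a weak one, while still averaging when neither model is clearly more certain.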
Technologies & Skills
Emotion classification using supervised learning
HOG and VGG16 for images (HOG sketched after this list)
Librosa
Matplotlib
MFCC for speech (sketched after this list)
Modular coding and reusable functions
NumPy
OpenCV
Pandas
Python (advanced)
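To make the two feature extractors in the list concrete, here is a minimal sketch of MFCC extraction with Librosa and HOG computation with OpenCV. The file paths, sampling rate, face size, and HOG/MFCC parameters are illustrative assumptions, not the project's documented settings.

    import cv2
    import librosa

    def extract_mfcc(wav_path, sr=16000, n_mfcc=40):
        """Load an audio clip and summarise it as one MFCC vector."""
        y, sr = librosa.load(wav_path, sr=sr)                   # resample to 16 kHz
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # (n_mfcc, frames)
        return mfcc.mean(axis=1)                                # average over time

    def extract_hog(image_path, size=(48, 48)):
        """Load a face crop and return its HOG descriptor."""
        gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        gray = cv2.resize(gray, size)
        # winSize, blockSize, blockStride, cellSize, nbins
        hog = cv2.HOGDescriptor(size, (16, 16), (8, 8), (8, 8), 9)
        return hog.compute(gray).ravel()

    # Hypothetical paths for illustration only.
    audio_vec = extract_mfcc("clip.wav")  # shape (40,)
    face_vec = extract_hog("face.png")    # shape (900,) for a 48x48 window

Per-clip and per-image vectors like these feed the speech CNN/LSTM and can complement the VGG16 features on the image side.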