This project detects emotions from speech using a neural network model. It includes scripts for recording audio, preprocessing the data, and evaluating the model.
The model is a sequential neural network composed of several 1D convolutional layers, activation functions, dropout layers, a max-pooling layer, and a dense layer. Here is a detailed description of each layer and its configuration:
-
Conv1D Layer 1:
- Name:
conv1d_7 - Input Shape:
[None, 216, 1] - Filters: 128
- Kernel Size: 5
- Activation:
linear - Padding:
same - Initializer:
VarianceScaling
- Name:
-
Activation Layer 1:
- Name:
activation_8 - Activation:
relu
- Name:
-
Conv1D Layer 2:
- Name:
conv1d_8 - Filters: 128
- Kernel Size: 5
- Activation:
linear - Padding:
same - Initializer:
VarianceScaling
- Name:
-
Activation Layer 2:
- Name:
activation_9 - Activation:
relu
- Name:
-
Dropout Layer 1:
- Name:
dropout_3 - Rate: 0.1
- Name:
-
MaxPooling1D Layer:
- Name:
max_pooling1d_2 - Pool Size: 8
- Strides: 8
- Name:
-
Conv1D Layer 3:
- Name:
conv1d_9 - Filters: 128
- Kernel Size: 5
- Activation:
linear - Padding:
same - Initializer:
VarianceScaling
- Name:
-
Activation Layer 3:
- Name:
activation_10 - Activation:
relu
- Name:
-
Conv1D Layer 4:
- Name:
conv1d_10 - Filters: 128
- Kernel Size: 5
- Activation:
linear - Padding:
same - Initializer:
VarianceScaling
- Name:
-
Activation Layer 4:
- Name:
activation_11 - Activation:
relu
- Name:
-
Conv1D Layer 5:
- Name:
conv1d_11 - Filters: 128
- Kernel Size: 5
- Activation:
linear - Padding:
same - Initializer:
VarianceScaling
- Name:
-
Activation Layer 5:
- Name:
activation_12 - Activation:
relu
- Name:
-
Dropout Layer 2:
- Name:
dropout_4 - Rate: 0.2
- Name:
-
Conv1D Layer 6:
- Name:
conv1d_12 - Filters: 128
- Kernel Size: 5
- Activation:
linear - Padding:
same - Initializer:
VarianceScaling
- Name:
-
Activation Layer 6:
- Name:
activation_13 - Activation:
relu
- Name:
-
Flatten Layer:
- Name:
flatten_2
- Name:
-
Dense Layer:
- Name:
dense_2 - Units: 10
- Activation:
linear - Initializer:
VarianceScaling
- Name:
-
Activation Layer 7:
- Name:
activation_14 - Activation:
softmax
- Name:
- Clone the repository.
- Install the dependencies using
pip install -r requirements.txt. - Run the project using
python main.py.
preprocess.py: Contains functions for recording and preprocessing audio.evaluate.py: Contains the evaluation logic for the model.model.py: Loads the model architecture from a JSON file.utils.py: Contains utility functions and label encodings.main.py: Main entry point for the project.