This repository contains a Convolutional Neural Network (CNN) model for classifying urban sound events using the UrbanSound8K dataset. The model assigns each audio clip to one of ten environmental sound classes, such as sirens, dog barks, and car horns.
UrbanSound8K is a dataset containing 8,732 labeled audio files across 10 sound classes:
- Air Conditioner
- Car Horn
- Children Playing
- Dog Bark
- Drilling
- Engine Idling
- Gun Shot
- Jackhammer
- Siren
- Street Music
Download: UrbanSound8K Dataset
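
As a small illustration of how the ten classes map to files, the sketch below recovers a label from a clip's file name. It assumes the dataset's standard naming convention `[fsID]-[classID]-[occurrenceID]-[sliceID].wav` and the class IDs 0 to 9 in the order listed above; `label_from_filename` is a hypothetical helper, not part of this repository.

```python
# classID -> label, in the order used by the dataset's metadata (assumed here)
CLASS_NAMES = [
    "air_conditioner", "car_horn", "children_playing", "dog_bark", "drilling",
    "engine_idling", "gun_shot", "jackhammer", "siren", "street_music",
]


def label_from_filename(filename: str) -> str:
    """Return the class label encoded in an UrbanSound8K-style file name.

    Assumes the pattern [fsID]-[classID]-[occurrenceID]-[sliceID].wav,
    so the class ID is the second dash-separated field.
    """
    class_id = int(filename.split("-")[1])
    return CLASS_NAMES[class_id]


# Example file name following the assumed pattern (classID = 3)
print(label_from_filename("100032-3-0-0.wav"))  # dog_bark
```

The same class ID is also available in the dataset's metadata CSV, which is usually the more robust source when building a training set.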
The model is a Convolutional Neural Network (CNN) that takes Mel-Spectrograms as input features. The key layers include:
- Convolutional Layers (Feature Extraction)
- Batch Normalization & Dropout (Regularization)
- Fully Connected Layers (Classification)
- Softmax Activation (Multi-class prediction)
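
The layer types listed above could be combined roughly as in the following sketch. This is not the repository's actual architecture: the framework (PyTorch), the number of layers, channel widths, dropout rate, pooling and ReLU layers, and the input shape are all assumptions chosen for illustration, and a random tensor stands in for a real Mel-Spectrogram batch.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 10  # the ten UrbanSound8K classes


class SketchCNN(nn.Module):
    """Illustrative CNN; layer counts and sizes are assumptions."""

    def __init__(self, num_classes: int = NUM_CLASSES):
        super().__init__()
        # Convolutional layers with batch normalization (feature extraction)
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),  # collapse frequency and time axes
        )
        # Dropout plus fully connected layers (classification)
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(0.3),
            nn.Linear(32, 64),
            nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Returns raw class scores (logits); softmax is applied by the caller
        return self.classifier(self.features(x))


model = SketchCNN().eval()  # eval mode disables dropout for inference

# Random stand-in for a batch of 4 Mel-Spectrograms:
# (batch, channel, mel bands, time frames)
batch = torch.randn(4, 1, 64, 128)

with torch.no_grad():
    probs = torch.softmax(model(batch), dim=1)  # multi-class probabilities

predicted = probs.argmax(dim=1)  # predicted class ID per clip
```

The sketch keeps softmax outside the module because PyTorch's `nn.CrossEntropyLoss` expects logits during training; softmax is then applied only when probabilities are needed at prediction time.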
Possible future improvements:
- Experiment with different CNN architectures (ResNet, EfficientNet)
- Implement attention mechanisms for better feature learning
- Deploy as a real-time classification app
Contributions are welcome! Feel free to open an issue or submit a pull request.
This project is licensed under the MIT License.
Developed by [AmirHosseinSoleymani] | Follow for more AI projects!