Team 3. University of Oxford Microsoft Project 15 and Elephant Listening Project Capstone
Project Title. Gunshot Detection in Tropical African Forests
Introduction
This project is a group exercise undertaken as part of the University of Oxford Artificial Intelligence: Cloud and Edge Implementations course, as a learning challenge from Microsoft Project 15 and in association with the Elephant Listening Project. The objective is to help counter illegal elephant hunting in tropical African forests by enabling sensors to detect gunshot events as they occur, so that poaching attempts can be mitigated.
Project Resources
Meet the Team
Our team is a mix of technologists with a wide range of experience, including software engineers, data scientists, DevOps engineers, and solution architects.
Elephant Listening Project Challenge
As proposed by Dr. Peter Wrege, Director of the Elephant Listening Project at the Center for Conservation Bioacoustics, Cornell Lab of Ornithology, gunshot detection is a challenge in African national parks. Environmental audio data is being collected into a public database, Congo Soundscapes, as part of the Elephant Listening Project.
The current gunshot detection model is very inefficient: fewer than 0.2% of tagged signals are gunshots, and we typically get 10K-15K tagged signals in a four-month deployment at just one of the 50 recording sites. We have about 200 good annotated gunshots, but although poaching is far too frequent, gunshots are still extremely rare in the recordings, and it is extremely time-consuming to create the “truth” logs in which every gunshot in a 24-hour file has been tagged. From our understanding, this makes developing a detector more difficult.
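The scale of this imbalance can be made concrete with a little arithmetic. The sketch below uses only the figures quoted above (an upper bound of 0.2% gunshots, and 10K to 15K tagged signals per four-month deployment at one site); the function name is ours and purely illustrative.

```python
def max_true_gunshots(tagged_signals: int, gunshot_rate: float = 0.002) -> float:
    """Upper bound on true gunshots among the signals tagged by the detector."""
    return tagged_signals * gunshot_rate


for tagged in (10_000, 15_000):
    upper = max_true_gunshots(tagged)
    print(f"{tagged} tagged signals -> at most {upper:.0f} gunshots, "
          f"at least {tagged - upper:.0f} false alarms")
```

In other words, a reviewer would have to listen to thousands of tagged signals per site to confirm at most 20 to 30 real gunshots.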
How we approached the challenge
We began with individual research into the problem and gathered ideas on how to frame gunshot detection as a machine learning problem. The main issue posed by the challenge is the lack of tagged data for model training; in addition, gunshot audio data for such use cases is mostly proprietary. We therefore collected free gunshot samples and other environmental audio samples from various internet sources. All files were converted to .wav format, and basic audio cleaning was applied to remove noise and clip the relevant audio segments before they were added to the dataset. Further audio analysis was carried out using Azure Machine Learning cloud services. For brevity, only part of the dataset is uploaded to the repo, but the examples, notebooks, and dataset are structured to accommodate more data and to scale the model training process as we move ahead.
Many features were analyzed across the two classes, gunshot and environmental audio. The environmental audio contains elephant sounds and other sounds of the tropical African forests.
Other features relevant to this classification problem could be analyzed to improve the model in the future. For the scope of this exercise, the first thirteen Mel-Frequency Cepstral Coefficients (MFCCs) were best suited as input features for audio classification.
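To show what "the first thirteen MFCCs" means in practice, here is a minimal pure-Python sketch of the standard MFCC pipeline: windowing, power spectrum, mel filterbank, log, and DCT-II. It is not the team's extraction script, whose library and settings are not stated here; the frame length, hop size, and number of mel filters are assumptions chosen for illustration, and a real pipeline would normally rely on an audio library with a fast FFT.

```python
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window one frame and return its one-sided power spectrum (naive DFT)."""
    n = len(frame)
    win = [s * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
           for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2.0 * math.pi * k * i / n) for i, x in enumerate(win))
        im = sum(x * math.sin(2.0 * math.pi * k * i / n) for i, x in enumerate(win))
        spectrum.append((re * re + im * im) / n)
    return spectrum


def mel_filterbank(n_filters, frame_len, sample_rate):
    """Triangular filters spaced evenly on the mel scale from 0 Hz to Nyquist."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int((frame_len + 1) * hz / sample_rate) for hz in edges]
    bank = []
    for j in range(1, n_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (frame_len // 2 + 1)
        for k in range(left, center):      # rising edge of the triangle
            filt[k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge, peak of 1.0 at center
            filt[k] = (right - k) / (right - center)
        bank.append(filt)
    return bank


def mfcc(samples, sample_rate, frame_len=256, hop=128, n_filters=26, n_coeffs=13):
    """Return one list of n_coeffs MFCCs per frame (all defaults are assumptions)."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        spec = power_spectrum(samples[start:start + frame_len])
        # Log energy in each mel band; the small constant avoids log(0).
        log_e = [math.log(sum(f * p for f, p in zip(filt, spec)) + 1e-10)
                 for filt in bank]
        # DCT-II decorrelates the log energies; keep the first n_coeffs.
        frames.append([
            sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)
        ])
    return frames


# Usage: a 0.1 s, 1 kHz test tone sampled at 8 kHz.
rate = 8000
tone = [math.sin(2.0 * math.pi * 1000.0 * t / rate) for t in range(800)]
features = mfcc(tone, rate)
print(len(features), "frames x", len(features[0]), "coefficients")
```

Each frame yields a 13-value vector; a clip can then be represented either by the per-frame sequence or by a summary such as the mean over frames, depending on the model.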
The input to the machine learning model is a set of audio features, and the model must classify each input into one of two classes: gunshot or environmental audio. This is a binary classification problem, and a suitable dataset was created for training. A feature extraction script converts the raw .wav audio files into feature data and the corresponding true labels for classification.
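The conversion from raw .wav files to labeled feature rows might look like the sketch below. It is not the repo's actual script: the folder layout (one sub-folder per class), the folder names in `LABELS`, and the 16-bit mono PCM requirement are all assumptions made for illustration, and `extract_features` is a hypothetical callable supplied by the caller.

```python
import os
import struct
import wave

# Hypothetical folder names, one per class: 0 = environmental audio, 1 = gunshot.
LABELS = {"environment": 0, "gunshot": 1}


def read_wav_mono16(path):
    """Read a 16-bit mono PCM .wav file; return (samples in [-1, 1), sample_rate)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError(f"{path}: expected 16-bit mono PCM")
        raw = wav.readframes(wav.getnframes())
        rate = wav.getframerate()
    ints = struct.unpack(f"<{len(raw) // 2}h", raw)
    return [s / 32768.0 for s in ints], rate


def build_dataset(root, extract_features):
    """Walk root/<class folder>/*.wav; return parallel lists of features and labels."""
    features, targets = [], []
    for folder, label in sorted(LABELS.items()):
        class_dir = os.path.join(root, folder)
        for name in sorted(os.listdir(class_dir)):
            if name.lower().endswith(".wav"):
                samples, rate = read_wav_mono16(os.path.join(class_dir, name))
                features.append(extract_features(samples, rate))
                targets.append(label)
    return features, targets
```

Passing the feature extractor in as a parameter keeps the labeling logic independent of whichever features are chosen, so new features can be tried without touching the dataset walk.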
The Azure Machine Learning cloud service was used for supervised training of the classification model. Two model architectures were tested by running scripts from the Makefile, with the following results:
| Neural Network Architecture | Train Accuracy | Test Accuracy |
| Multi-layer Perceptron | 0.9756 | 0.9405 |
| Convolutional Neural Network | 0.9740 | 0.9236 |
Note: These accuracy metrics depend directly on the limited data used for model training and testing. Further improvements in contextual data collection, feature extraction and analysis, and experimentation with model architectures are needed to validate the reliability of these metrics for practical applications.
Solution Architecture - Microsoft Project 15 - Azure Cloud Services
The Microsoft Project 15 Architecture is leveraged to follow the best practices in the deployment of scalable IoT solutions. One of the core goals of the Project 15 Open Platform is to boost innovation with a ready-made platform, allowing the scientific developer to expand into specific use cases.
Data scientists can train the model using the Azure Machine Learning cloud service and deploy it to edge devices using IoT services. This process can be automated with Azure cloud services so that the models on edge devices are upgraded at runtime through a strategic rollout.
For demo purposes, we created an edge device prototype that simulates predictions from audio data and sends telemetry data to IoT Hub. It is a Node.js application that uses TensorFlow as a backend to make predictions with the model created during the training phase.
The model created during the training phase must be converted to the TensorFlow.js graph model format before it can be used in a Node.js application, so we convert the models to the tfjs graph format for the client to use for prediction.
Register an edge device with IoT Hub and copy the IoT device connection string from the Azure portal into the .env file in the edge device source folder. Finally, run the edge device prototype to simulate gunshot predictions randomly and send telemetry data to IoT Hub.
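The logic of the simulated edge device can be sketched as follows. The actual prototype is a Node.js application; this Python version only mirrors the flow for illustration. The telemetry field names, the 0.5 threshold, and the placeholder connection string are assumptions, and the actual transmission to IoT Hub (which would go through an Azure IoT device SDK) is deliberately left out.

```python
import json
import random
import time


def parse_connection_string(conn_str):
    """Split 'HostName=...;DeviceId=...;SharedAccessKey=...' into a dict."""
    # Split on the first '=' only, because base64 keys may end with '=' padding.
    parts = (item.split("=", 1) for item in conn_str.split(";") if item)
    return {key: value for key, value in parts}


def simulate_prediction(rng):
    """Stand-in for the model: a random gunshot probability in [0, 1)."""
    return rng.random()


def build_telemetry(device_id, probability, threshold=0.5):
    """Build the JSON body for one telemetry message (field names are illustrative)."""
    return json.dumps({
        "deviceId": device_id,
        "timestamp": int(time.time()),
        "gunshotProbability": round(probability, 4),
        "gunshotDetected": probability >= threshold,
    })


# Placeholder value; the real string comes from the device page in IoT Hub.
conn = parse_connection_string(
    "HostName=example-hub.azure-devices.net;DeviceId=edge-sim-01;"
    "SharedAccessKey=PLACEHOLDER="
)
rng = random.Random(15)
for _ in range(3):
    print(build_telemetry(conn["DeviceId"], simulate_prediction(rng)))
```

Keeping the payload builder separate from the transport makes it easy to swap the random stand-in for real model output later without changing the message format.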
Live Monitoring and Demo
Future Work
Gunshot detection is still an unsolved problem, and research is ongoing around the world for both military and surveillance purposes. The machine learning classification model in this context can be improved with better data collection from the tropical forests in Africa. The lack of sufficient data remains the major issue with this challenge. Accurate gunshot audio should be recorded using the apposite set of hunting rifles to improve accuracy, and further feature analysis can compare gunshots with other surrounding sounds in the elephant habitat.
Further, many more model architectures can be tested and compared to understand which hyperparameters work best for such remote environments, which should improve the model training process. Better deployment practices, telemetry, alert systems, and model upgrade processes can also be explored with cloud-based IoT solutions to improve the overall efficiency of the solution.
Project GitHub Repo
https://github.com/Oxford-ContEd/project15-elp