We've made some noteworthy progress with our SERVAL sensor, which stands for Sound Events Recognition for Vigilance and Localisation. Because the recognition part is based on deep learning (a convolutional neural network), it can be trained to recognise all kinds of sounds, provided you have sufficient samples.
NEW: With the help of Karol Piczak, a Polish researcher and friend of the project, we have built a complete, real-time sound event recognition system that can send alerts to, for example, rangers through our fast-response coordination app or by SMS.
Sounds relevant to nature conservation that we currently master include gunshots, engines, chainsaws, scooters, motorbikes, cars, trucks, helicopters, footsteps, coughing, sneezing, laughing, talking, breaking, barking, bleating and mooing. All on one sensor. Any other type of sound can be learned and added as well, provided we can collect enough tagged samples of that sound type.
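The post doesn't describe the network's internals, but a typical front end for a conv-net-based sound recogniser of this kind (in the spirit of Karol Piczak's environmental sound classification work) first turns raw audio into a log-mel spectrogram, which the network then classifies like an image. A minimal NumPy sketch, with all window and filterbank parameters chosen purely for illustration:

```python
import numpy as np

def log_mel_spectrogram(audio, sr=22050, n_fft=1024, hop=512, n_mels=60):
    """Crude log-mel spectrogram: the typical conv-net input for
    environmental sound classification. Parameters are illustrative."""
    # Frame the signal and apply a Hann window
    window = np.hanning(n_fft)
    frames = [audio[s:s + n_fft] * window
              for s in range(0, len(audio) - n_fft + 1, hop)]
    # Power spectrum per frame
    spec = np.abs(np.fft.rfft(np.array(frames), axis=1)) ** 2

    # Triangular mel filterbank
    def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * mel_pts / sr).astype(int)
    fb = np.zeros((n_mels, spec.shape[1]))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        if c > l:
            fb[i, l:c] = (np.arange(l, c) - l) / (c - l)   # rising edge
        if r > c:
            fb[i, c:r] = (r - np.arange(c, r)) / (r - c)   # falling edge

    # Log-compressed mel energies, shape (frames, n_mels)
    return np.log(spec @ fb.T + 1e-10)
```

One second of audio at 22,050 Hz yields a small (42, 60) "image" that a lightweight conv-net can score fast enough for real-time use on modest hardware.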
It's not difficult to see how sound events like these can help rangers monitor and defend an area against poaching and encroachment!
We've established an MoU with renowned wildlife researchers Angela Stoger and Shermin de Silva, with sound recognition researcher Matthias Zeppelzauer, and with M+P, a company specialised in acoustics. Together we are working on a sound-based elephant monitoring and early warning system that may help reduce human-elephant conflicts (see also the work of Neil, who is building an image recognition based version; it would be great to combine the two approaches!).
Within a few weeks' time we hope to demonstrate the system on a RPi (and/or BBB). The really tough cookie will be minimising energy consumption and increasing confidence levels.
We estimate that the sensor will run a few hours per day on a solar panel. Deployment on a RPi and smart configuration of the sensor will extend that time. In settings where the sensor does not need to be hidden, we can of course also use larger solar panels and batteries.
At this moment accuracies of 80 to 90% are achieved, meaning that 8 to 9 out of 10 signals are correctly interpreted, although some sounds and elephant vocalisations (not yet tested) may score lower.
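One common way to raise the quality of alerts is to only act on predictions whose confidence clears a threshold: you trade some coverage (fewer detections trigger an alert) for accuracy (more of the alerts are correct). A hypothetical sketch of that trade-off; the function name and numbers are illustrative, not SERVAL's actual code:

```python
import numpy as np

def threshold_report(probs, labels, threshold=0.5):
    """For softmax outputs `probs` (n_samples, n_classes) and true `labels`,
    report the coverage/accuracy trade-off: predictions whose top confidence
    falls below `threshold` are withheld instead of raising an alert."""
    conf = probs.max(axis=1)          # top-class confidence per sample
    pred = probs.argmax(axis=1)       # predicted class per sample
    kept = conf >= threshold          # which samples would alert
    coverage = kept.mean()
    accuracy = (pred[kept] == labels[kept]).mean() if kept.any() else float("nan")
    return coverage, accuracy
```

Sweeping the threshold over a held-out test set shows how strict the sensor must be before, say, 9 out of 10 alerts are correct.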
For connectivity we use either a GSM module or a LoRa module. In both cases we transmit only the code of the recognised class, plus some metadata. Only a limited number of the sound recordings are stored, temporarily and mainly for further research.
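To illustrate how small such a message can be: here is a hypothetical payload layout, sketched with Python's `struct` module. The field choices and sizes are my assumption, not SERVAL's actual protocol; the point is that a class code plus metadata fits in 8 bytes, well within a LoRa payload, so no audio ever needs to leave the sensor.

```python
import struct
import time

# Hypothetical 8-byte big-endian uplink payload (illustrative layout):
#   B  class_id    sound class code (0-255)
#   B  confidence  classifier confidence scaled to 0-255
#   I  timestamp   unix time of the detection (seconds)
#   H  battery_mv  battery voltage in millivolts

def pack_event(class_id, confidence, battery_mv, ts=None):
    """Serialise one detection into a compact payload for GSM/LoRa uplink."""
    ts = int(ts if ts is not None else time.time())
    return struct.pack(">BBIH", class_id,
                       int(round(confidence * 255)), ts, battery_mv)

def unpack_event(payload):
    """Inverse of pack_event: recover the fields on the receiving side."""
    class_id, conf, ts, battery_mv = struct.unpack(">BBIH", payload)
    return class_id, conf / 255.0, ts, battery_mv
```

Even at the tightest LoRaWAN payload budgets this leaves room for extra fields such as a sensor ID or a GPS fix.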
1) More sound samples are better, so anyone with controlled elephant recordings: please help us improve the performance of the SERVAL sensor!
2) The everlasting question is: when is good, good enough to deploy in the field? Indeed, technological tools like these are never finished. To speed up the learning loop, however, I would suggest installing the first prototype a.s.a.p. and starting to experiment and improve right away. The best school is practice ;-) Does anyone have a suggestion for whom we could start with, and where?