Hello everyone!
I am part of a team developing CNNs to detect species in the Peruvian Amazon from soundscapes. We are having trouble finding a ground truth dataset for the region to test our model against. We have been trying to use “A collection of fully-annotated soundscape recordings from the Southwestern Amazon Basin” (https://zenodo.org/record/7079124#.Y7iis-xudhE), which seems to be the best publicly available, strongly labeled ground truth from the region. However, we think there are a few labeling errors that cause our models to score worse than they otherwise would. For instance, in the screenshot below, there seems to be an unlabeled species between 34:38 and 34:30 from 1 to 4 Hz (repeated at 34:35 to 34:38) that is labeled in annotations 15616 and 15622. We have noticed a handful of what we think are similar errors in the dataset, but we don’t necessarily have the expertise to know for sure. Has anyone used this data for training or testing, and if so, what are your thoughts on it?
Thank you for your time!
Sean Perry
3 October 2023 10:47pm
Hi Sean!
Just wanted to mention that Arbimon, Rainforest Connection's ecoacoustic platform, has a number of projects in Peru (here, here, and others if you search by the keyword 'peru'). We have some existing CNNs for that region (mostly from Ecuador and Brazil, but there is likely species overlap). Feel free to DM me here or email me ([email protected]); I'm happy to talk about collaborating!
-Carly
3 October 2023 10:50pm
Also tagging @NickGardner, who works on a similar project (detecting birds from audio in Peruvian flooded forests)!
8 October 2023 4:49pm
Interesting!
Hi Sean, sounds like an excellent project. Definitely talk with the Arbimon folks! As @carlybatist said, I am working with birds in the Peruvian Amazon, but in Loreto. I would like to hear more about your project. As for the labelling issue, it definitely looks like an error. I have not used this dataset, but now I'm curious. To be honest, there is some questionable labelling in that file in general. Bounding boxes can be very subjective...
Carly Batist