discussion / AI for Conservation  / 30 March 2023

AI Animal Identification Models

Hi Everyone,

I've recently joined WILDLABS and I'm getting to know the different groups. I hope I've selected the right ones for this discussion...

I am interested in AI to identify animals using cameras. Ideally, I want to do this on a mobile phone and eventually on an Edge Computing device.

I've been using ml5.js with object detection models (COCO-SSD and YOLO v2) to do real-time detection in a mobile browser, but the detection is very generic, e.g. person, banana, laptop, cup. I would like to swap the model for something more specific to wildlife, but I'm drawing a blank on finding a pre-existing model to use. I think for the model to work with my JavaScript setup it needs to work with TensorFlow.js.

What models are people using or are you all training your own? If you're training your own can it be done using Google Teachable Machine or is something more sophisticated required? Do you use transfer learning? (I am new to Machine Learning so please excuse my lack of knowledge).

Running before I can walk, but what boards/kits/setups do people recommend for deploying AI computer vision animal identification to the Edge? I am familiar with Arduino, but imagine there are good alternatives that might be better.

Thanks for any guidance on this! I'm happy to collaborate on such a project if anyone wants to, I just started a PhD in creative technology & design.

Many thanks, David

Hi David,

I have no specific advice as I am still a sideliner where AI is concerned, but Dan Morris (@dmorris) has a ton of info which could be a great starting point perhaps.

Also, you may want to cross-post to the https://www.wildlabs.net/groups/camera-traps group, as you specifically want to work on AI on images.

Cheers, Lars


I don't know if I can exactly answer your question, but I think I might just be the world champion at talking people out of doing AI on the edge. :)

But seriously, my biggest pro tip for training edge models is to put a really fine point on your requirements before doing any ML.  You can more or less run any model on any hardware if you're willing to wait long enough, and every bit of model-shrinking or model customization that you do will cost you a lot of engineering time and some amount of accuracy.  So it benefits you to do a really thorough analysis of how long you can tolerate per image in terms of inference time... if you are running on a modern mobile phone, you can run just about any model in existence in <60 seconds without having to get into the business of model compression or deep customization.  Is that OK for your scenario?  And what kind of accuracy do you need?  And if you're on a phone, are you *sure* you want to run your model on the device, rather than in the cloud?  E.g. if 75% of your users would have connectivity, can you support just those 75% to start with?  Or does your scenario fundamentally require edge inference, e.g. an app specifically designed for national parks where connectivity is unavailable more than it's available?
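To make the "how long can you tolerate per image" question concrete, here is a minimal Python sketch of how one might time a model against a latency budget. It is only an illustration: `fake_predict` is a hypothetical stand-in for whatever model call you actually use, and the 60-second budget is just the figure mentioned above, not a recommendation.

```python
import statistics
import time


def measure_latency(predict, images, warmup=2):
    """Return per-image inference times in seconds for predict(image)."""
    # Warm-up calls are not timed, because the first few runs are often slower.
    for image in images[:warmup]:
        predict(image)
    timings = []
    for image in images:
        start = time.perf_counter()
        predict(image)
        timings.append(time.perf_counter() - start)
    return timings


def fits_budget(timings, budget_seconds):
    """True if even the slowest measured image fits within the budget."""
    return max(timings) <= budget_seconds


def fake_predict(image):
    # Hypothetical stand-in for a real model call; replace with your own.
    time.sleep(0.01)
    return "animal"


if __name__ == "__main__":
    timings = measure_latency(fake_predict, list(range(10)))
    print(f"median {statistics.median(timings):.3f} s, worst {max(timings):.3f} s")
    print("fits 60 s budget:", fits_budget(timings, 60.0))
```

Running the same measurement on the actual target device, with real images, is what tells you whether you need to shrink the model at all.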

In terms of specific models or training data, the more details you can provide about what you want to monitor, the more folks here will be able to point you toward possible training data sources and existing models.  What wildlife do you want to classify?  Do you need to handle night-time images well?

I would also put some time into tinkering with the iNaturalist and Merlin apps and contrast them against your goals.  I'm not saying they've solved every problem in the universe or that you shouldn't build something new.  But they're both really good at what they do, and understanding how they compare to what you want to build will shed a lot of light on important details.  And they're fun to play with!



Hi David

It appears that you have been looking for existing models. However, most existing models are trained on either COCO or some other very generic dataset, so if you want to identify just animals, you may be better off training your own model. It seems no one in this thread has mentioned yet that it is possible to do transfer learning on existing models, which keeps most of the "visual part" of the model as is and changes only the classification part so it can identify other things. This way you can take an existing model trained on COCO and retrain it for your animals in a fraction of the time it takes to train a full model.
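As a rough illustration of that idea, here is a toy sketch in plain Python (no ML library). Everything in it is an assumption made for the example: `FROZEN_WEIGHTS` is a made-up stand-in for the pretrained "visual part", the four-number "images" are synthetic, and the two classes are hypothetical. In a real setup you would load a pretrained network and retrain only its final layers, but the structure is the same: the feature extractor is never updated, and only the new classification head is trained.

```python
import math
import random

# Hypothetical "pretrained" weights standing in for the frozen visual layers.
FROZEN_WEIGHTS = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
]


def extract_features(pixels):
    """Frozen part of the model: these weights are never updated."""
    return [max(0.0, sum(w * p for w, p in zip(row, pixels)))
            for row in FROZEN_WEIGHTS]


def train_head(samples, labels, epochs=200, lr=0.5):
    """Train only a new logistic-regression head on the frozen features."""
    features = [extract_features(s) for s in samples]
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            z = bias + sum(w * x for w, x in zip(weights, f))
            p = 1.0 / (1.0 + math.exp(-z))
            error = p - y  # gradient of the log loss with respect to z
            weights = [w - lr * error * x for w, x in zip(weights, f)]
            bias -= lr * error
    return weights, bias


def predict(pixels, weights, bias):
    """Return class 0 or 1 for one toy image."""
    f = extract_features(pixels)
    return 1 if bias + sum(w * x for w, x in zip(weights, f)) > 0 else 0


def make_toy_data(n_per_class=10, seed=0):
    """Synthetic 4-value 'images' for two hypothetical animal classes."""
    rng = random.Random(seed)
    samples, labels = [], []
    for _ in range(n_per_class):
        noise = [rng.uniform(-0.1, 0.1) for _ in range(8)]
        samples.append([1 + noise[0], 1 + noise[1], noise[2], noise[3]])
        labels.append(0)
        samples.append([noise[4], noise[5], 1 + noise[6], 1 + noise[7]])
        labels.append(1)
    return samples, labels


if __name__ == "__main__":
    samples, labels = make_toy_data()
    weights, bias = train_head(samples, labels)
    correct = sum(predict(s, weights, bias) == y for s, y in zip(samples, labels))
    print(f"{correct}/{len(samples)} toy images classified correctly")
```

Because only the small head is trained, this needs far less data and compute than training the whole model, which is the main appeal of transfer learning.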

Also have a look at your requirements for the inference stage. Some models take a long time to train but are very fast at inference, while others are slow at both but very accurate, etc. If you want semi-real-time inference, you are probably looking at single shot detectors (SSD), and not R-CNNs.