We are building an app for snake classification. A good comparable app for birds is Merlin.
What is under its hood?
They use deep learning
The Merlin Bird ID app uses a machine learning technique known as deep learning to identify birds in photos and sounds [1][2]. Here’s how it works:
As your phone records sound, Merlin converts the audio into an image called a spectrogram. The spectrogram plots the sound frequencies that appear in the recording as a function of time [3].
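To make the audio-to-image step concrete, here is a minimal sketch of a spectrogram computed with a short-time Fourier transform. It is not Merlin's actual pipeline: the function name `spectrogram`, the frame size, the hop length, and the synthetic 1 kHz tone are all assumptions chosen for illustration, and a naive DFT is used so the example needs only the standard library.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of time frames; each frame lists one magnitude per frequency bin."""
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        # Naive DFT over the non-negative frequencies (bins 0 .. frame_size / 2).
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Synthetic 1 kHz tone sampled at 8 kHz (hypothetical values, not real bird audio).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = spectrogram(tone)

# The strongest bin in the first frame should sit at the tone's frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
```

Each inner list is one column of the spectrogram image: time runs along the outer list and frequency along the inner one. A production system would typically use an FFT library and a mel or log frequency scale instead of this quadratic-time DFT.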
This spectrogram image is then fed into a modern computer vision model called a deep convolutional neural network [3].
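As a rough illustration of what one convolutional layer does to a spectrogram, the sketch below slides a small kernel over a toy image, applies a ReLU, and pools the result into a single score. The kernel is hand-written here purely as an assumption for the example; in a real network such as the one Merlin uses, many kernels are learned from data and stacked in many layers.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with no padding (cross-correlation, as in most DL libraries)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu(feature_map):
    """Keep positive responses and zero out negative ones."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def global_max_pool(feature_map):
    """Summarize the whole feature map by its strongest response."""
    return max(max(row) for row in feature_map)


# Toy 4x5 "spectrogram": rows are frequency bins, columns are time steps.
# The bright second row mimics a steady tone at one frequency.
toy_spec = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]

# Hypothetical hand-written kernel that responds to a horizontal bright line.
horizontal_line = [
    [-1.0, -1.0, -1.0],
    [2.0, 2.0, 2.0],
    [-1.0, -1.0, -1.0],
]

features = relu(conv2d_valid(toy_spec, horizontal_line))
score = global_max_pool(features)
```

The kernel fires strongly where the bright line is centered under it and is suppressed elsewhere, which is the basic mechanism by which convolutional layers pick up pitch contours and other shapes in a spectrogram.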
The model is trained to identify birds using 140 hours of audio containing bird sounds, plus 126 hours of audio containing non-bird background sounds such as whistling and car noises [3].
For each audio clip, a group of sound ID experts from the Macaulay Library and the eBird community found the precise moments when birds were making sounds, and tagged those sounds with the corresponding bird species [3].
The model uses this detailed expert supervision to learn to predict the species that appear in the annotated audio clips. The goal is for that knowledge to generalize, so the model can predict which birds appear in recordings it hasn’t heard before [3].
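The expert annotations described above can be pictured as time-stamped tags that get turned into training targets. The sketch below shows one plausible way to do that, assuming annotations arrive as `(start_seconds, end_seconds, species)` tuples and that the clip is cut into fixed one-second frames. The function name `frame_labels`, the tuple layout, and the example species are hypothetical and not taken from Merlin's tooling.

```python
def frame_labels(annotations, species, clip_seconds, frame_seconds=1.0):
    """Build one multi-hot label vector per frame from (start, end, species) tags."""
    n_frames = int(clip_seconds / frame_seconds)
    index = {name: i for i, name in enumerate(species)}
    labels = [[0] * len(species) for _ in range(n_frames)]
    for start, end, name in annotations:
        for f in range(n_frames):
            frame_start = f * frame_seconds
            frame_end = frame_start + frame_seconds
            # Mark the frame if the tagged interval overlaps it at all.
            if start < frame_end and end > frame_start:
                labels[f][index[name]] = 1
    return labels


# Hypothetical annotations for a 5-second clip.
species = ["American Robin", "Northern Cardinal"]
annotations = [
    (0.5, 1.5, "American Robin"),
    (3.2, 3.8, "Northern Cardinal"),
]
labels = frame_labels(annotations, species, clip_seconds=5)
```

Frames whose vector is all zeros contain no tagged bird, which is how background-only audio can serve as negative examples during training.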
Once the database of sounds is assembled, the Merlin team trains the computer vision model using a gradient descent algorithm [3].
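Gradient descent itself is easy to show on a much smaller model. The sketch below fits a one-feature logistic classifier instead of a deep network, so it only illustrates the update rule (step against the gradient of the loss), not Merlin's training setup. The learning rate, step count, and toy data are assumptions made for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, b, features, targets):
    """Mean cross-entropy loss of the logistic model on the data."""
    total = 0.0
    for x, y in zip(features, targets):
        p = sigmoid(w * x + b)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(features)


def train(features, targets, lr=0.5, steps=200):
    """Fit weight w and bias b by full-batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(features)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(features, targets):
            # Derivative of the cross-entropy loss with respect to the logit.
            error = sigmoid(w * x + b) - y
            grad_w += error * x / n
            grad_b += error / n
        # Move the parameters a small step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data: negative inputs belong to class 0, positive inputs to class 1.
features = [-2.0, -1.0, 1.0, 2.0]
targets = [0, 0, 1, 1]

loss_before = loss(0.0, 0.0, features, targets)
w, b = train(features, targets)
loss_after = loss(w, b, features, targets)
```

A deep network is trained the same way in principle, except that the gradients for millions of weights are computed by backpropagation and the updates usually run on small random batches rather than the full dataset.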
The app also uses machine learning for its photo ID tool, which was made available to users around 2015 [4]. It’s powered by Visipedia and learns to recognize bird species based on training sets of millions of photos and sounds collected by birders at eBird.org, archived in the Macaulay Library at the Cornell Lab of Ornithology [1][2].
They use transfer learning
The Merlin Bird ID app reportedly uses transfer learning for both its sound ID and photo ID features. Transfer learning is a machine learning technique that takes a pre-trained model and adapts it to a new task or domain [1][2]. For example, the app is described as using a deep convolutional neural network that was trained on millions of images and sounds collected by birders at eBird.org, and then fine-tuning it to recognize specific bird species based on the user’s location and date [3][4]. This way, the app can leverage the existing knowledge and data from the eBird community and apply it to new inputs from users.
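The core idea of transfer learning, reusing a pre-trained feature extractor and training only a new output layer, can be sketched in a few lines. Everything here is a stand-in: `FROZEN_WEIGHTS` plays the role of a pretrained network, the inputs are toy two-number vectors, and the training loop is plain stochastic gradient descent. It shows the technique in general, not how Merlin fine-tunes its models, and a snake classifier would start from a real pretrained image network instead.

```python
import math

# Stand-in for a pretrained network: these weights are never updated.
# (Hypothetical values; a real backbone has millions of parameters.)
FROZEN_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]


def frozen_features(x):
    """Pretrained layer with ReLU; kept fixed during fine-tuning."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in FROZEN_WEIGHTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(inputs, targets, lr=0.5, steps=300):
    """Train only the new classification head on top of the frozen features."""
    feats = [frozen_features(x) for x in inputs]
    head = [0.0, 0.0]
    bias = 0.0
    for _ in range(steps):
        for f, y in zip(feats, targets):
            error = sigmoid(sum(h * v for h, v in zip(head, f)) + bias) - y
            head = [h - lr * error * v for h, v in zip(head, f)]
            bias -= lr * error
    return head, bias


def predict(x, head, bias):
    """Probability of class 1 for a raw input, using frozen features plus the new head."""
    f = frozen_features(x)
    return sigmoid(sum(h * v for h, v in zip(head, f)) + bias)


# Toy "new task" data with two classes (hypothetical).
inputs = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
targets = [1, 1, 0, 0]

head, bias = train_head(inputs, targets)
predictions = [predict(x, head, bias) for x in inputs]
```

Because only the small head is trained, far fewer labeled examples are needed than when training a network from scratch, which is the main appeal for a new domain such as snakes where labeled photos are scarcer than bird photos.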