Researchers at Cornell's K. Lisa Yang Center for Conservation Bioacoustics have made a breakthrough in bioacoustic deep learning, a set of techniques for the automated detection of animal sounds. Dr. Shyam Madhusudhana, a postdoctoral researcher at the Laboratory of Ornithology, has built a toolkit that allows bioacousticians to create complex audio recognition models with just a few lines of code.
The toolkit, Koogu, was used in a recent study in which an automated model outperformed experienced marine analysts in detecting blue whale D calls.
Blue whale D calls are calls of varying frequency produced by both male and female whales, unlike the well-known whale song which is only produced by males. While whale songs are often predictable and easily recognizable, D calls are erratic and produced less repetitively.
Although the D calls of blue whales are more difficult to identify, monitoring their presence allows for a much better understanding of the whales' migration patterns and acoustic behaviors.
Acoustic monitoring has long been considered a viable method of recording rare species that lack sufficient visual data. In recent years, machine learning algorithms have demonstrated promising results in the analysis of acoustic monitoring data. In the marine biome, where visual surveys are difficult to conduct, this method becomes all the more relevant in the effort to monitor the movements and habits of different aquatic species.
This is where Koogu comes in.
“As long as someone has their own set of annotated data [of acoustic monitoring], they could take Koogu and build their own model,” Madhusudhana said.
This methodology was adopted by a team of researchers from the Australian Antarctic Division, led by Brian Miller. The researchers used Koogu to build an automated detection model for their study of blue whale calls.
Their study, co-authored by Madhusudhana, is titled “Deep Learning Algorithm Outperforms Experienced Human Observer in Detecting Blue Whale Calls: A Dual Observer Analysis.” It found that human experts detected 70% of D calls, while the model detected 90%. The model also processed recordings considerably faster than the marine analysts, without the fatigue associated with human analysis.
The study is just the first case where Koogu has been used effectively. According to Madhusudhana, Koogu is far from limited to marine audio data.
“Koogu is not a toolkit just for whale calls – [it is] just a convenient way to create machine learning solutions – from whales to birds to insects,” Madhusudhana said.
Koogu has the potential to be an impactful tool in the field of bioacoustics. Although there has been significant development in the field of machine learning, most developments in the field of acoustics relate to human speech. Madhusudhana said Koogu bridges the gap between the two.
“If you look at a visual representation of audio — like a spectrogram — you can process it like an image and apply image classification techniques to it,” Madhusudhana said.
Koogu transforms acoustic data into a form that visual classification machine learning models can use. Madhusudhana designed most of the pipeline to remain configurable: any bioacoustics expert can adjust the settings that control how audio is transformed into images, which are then classified by an image classification model.
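The idea of treating audio as an image can be sketched in a few lines. The snippet below is a minimal, numpy-only illustration (not Koogu's actual code): it computes a magnitude spectrogram of a synthetic downsweeping tone, loosely mimicking a D call's falling pitch, and yields a 2-D array that image classification models could consume. The sample rate, window length and hop size are illustrative choices.

```python
import numpy as np

def spectrogram(signal, win_len=256, hop=128):
    """Magnitude spectrogram via a simple windowed STFT (numpy only)."""
    window = np.hanning(win_len)
    frames = [signal[i:i + win_len] * window
              for i in range(0, len(signal) - win_len + 1, hop)]
    # FFT each frame; rows = time frames, columns = frequency bins
    spec = np.abs(np.fft.rfft(np.array(frames), axis=1))
    return spec.T  # frequency x time, laid out like image height x width

fs = 1000                       # Hz; hypothetical sample rate
t = np.arange(0, 2.0, 1 / fs)   # two seconds of audio
# Synthetic downsweep from 100 Hz to 40 Hz
freq = np.linspace(100, 40, t.size)
phase = 2 * np.pi * np.cumsum(freq) / fs
clip = np.sin(phase)

img = spectrogram(clip)
print(img.shape)  # a 2-D array, ready for image-style processing
```

In a real pipeline, arrays like `img` would be fed to a convolutional network trained on annotated examples of the target call type.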
“If you’re trying to develop a neural network-based solution for bioacoustics, there are probably a few hundred lines of code needed. What I did is [enabled you to] call three or four functions and you’re done,” Madhusudhana said.
The goal was for bio-acousticians and other researchers to be able to use their own data and domain knowledge and combine it with Koogu’s features to effectively analyze sounds. Koogu’s unique relevance lies in its audio-to-image conversion process.
As Madhusudhana explained, each sound is transformed into a colored map that makes one audio signal easy to distinguish from another. Compressing audio into a standard image for classification typically causes significant data loss; Koogu avoids this loss, greatly increasing accuracy.
This benefit is particularly apparent in low to moderate intensity audio recordings. Such recordings make blue whale calls more difficult to detect, especially in the case of human experts.
Koogu, an open-source universal audio recognition toolkit, has greatly streamlined the process of automated acoustic recognition.
But acoustic monitoring of whales is only part of the equation, according to Madhusudhana. “Our goal is to conserve the biodiversity of all species – that was Koogu’s goal – [to] have something very generic that anyone across the world can use.”