When technology companies develop innovative new products in the area of high-performance computing (HPC), it enables life science researchers to do new things they hadn’t imagined before. And when life science researchers make new breakthroughs, it drives information technology to innovate new approaches to support those scientific advances. It’s a relationship that has driven innovation in the life science industry for years.
This mutually beneficial cycle is particularly evident in the work of the Center for Quantitative Life Sciences (CQLS) at Oregon State University. The center supports over 26 different departments, providing laboratory equipment, IT infrastructure, training and access to staff members with extensive experience in genomics, bioinformatics and computational biology.
Christopher Sullivan, the associate director for bioinformatics at CQLS, recently sat down for an interview in which he discussed some of the center’s most exciting new projects. Several of these projects use machine learning models to perform advanced analytics on the center’s HPC clusters.
Tracking owls by sound
In collaboration with the US Forest Service, Oregon State researchers have developed algorithms capable of identifying different species of owls from sound alone. They set up recording stations in the woods to capture audio, then convert the recordings into spectrograms – visual representations of the audio – which they analyze using a machine learning model trained to recognize distinct species.
At first, the models could identify only a handful of species. But the team kept its audio recordings over the years, about 5 PB dating back to the 1990s. Scientists have since developed more models, and today the project can identify more than 50 different species. At the same time, technology has advanced, allowing scientists to reprocess the older data faster than in the past.
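To make the spectrogram step concrete, here is a minimal sketch of turning audio into a time-frequency image with a short-time FFT. It is an illustration only, not the CQLS pipeline: the function name `spectrogram`, the frame and hop sizes, and the synthetic 600 Hz tone standing in for a field recording are all assumptions.

```python
import numpy as np

def spectrogram(signal, sample_rate, frame_size=1024, hop=512):
    """Return (freqs_hz, times_s, power) for a 1-D audio signal."""
    window = np.hanning(frame_size)  # taper each frame to reduce spectral leakage
    n_frames = 1 + (len(signal) - frame_size) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_size] * window
        for i in range(n_frames)
    ])
    # Power spectrum of every frame: shape (n_frames, frame_size // 2 + 1)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame_size, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_size / 2) / sample_rate
    return freqs, times, power.T  # rows = frequency bins, columns = time frames

# Hypothetical input: one second of a 600 Hz tone sampled at 8 kHz
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
audio = np.sin(2 * np.pi * 600 * t)

freqs, times, power = spectrogram(audio, sample_rate)
peak_hz = freqs[power.mean(axis=1).argmax()]  # strongest frequency bin on average
```

The resulting 2-D `power` array is the kind of image-like input a classifier could be trained on; which model architecture the researchers use is not stated in the article.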
“You give me new technology, I’m going to extract as much data as I can from that technology,” Sullivan said. “It’s about building the tools we need alongside the data as it comes to us. And as those tools change, we’re able to go back to the data and do it again.”
This scientific and technological breakthrough has a significant impact on both the economy and the environment. “It helps the public monitor owl populations and all other forest animal populations so groups can properly farm forests for timber and other things without harming the species,” Sullivan explained.
Covid monitoring in wastewater
Much of the work at CQLS involves genetic sequencing. For example, researchers routinely analyze genetic material in sewage to determine which Covid variants are most prevalent at any given time.
This is an area where a scientific breakthrough forced the computing side to advance. Sullivan explained that a new genome sequencing innovation took CQLS from 2,000 sequences per run to 2 million sequences per run, a thousandfold increase. And it happened “literally overnight.”
But the IT infrastructure was not designed for this type of throughput. “So we had to go back and develop a whole new stack because the technology changed,” Sullivan said. “We do it constantly at all times.”
Identifying plankton by laser light
Another CQLS project uses genetic sequencing to monitor ocean health. Oregon State’s Hatfield Marine Science Center is working with the National Oceanic and Atmospheric Administration (NOAA) on a project that analyzes plankton in seawater. A ship trails a laser device behind it to capture ocean footage. As the laser passes through the water, plankton and other organisms cast shadows. The team records video of these shadows, then analyzes it on GPU-powered HPC systems to classify the content.
This effort generates a huge amount of data, on the order of 100 TB per week. This was more than CQLS could affordably store and process in the cloud. However, the center’s HPC environment, built with Dell infrastructure featuring NVIDIA GPUs, provided the right balance of performance and cost.
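A quick back-of-the-envelope calculation shows what 100 TB per week means as a sustained load. This is a rough sketch under two assumptions not stated in the article: that "TB" means decimal terabytes, and that data arrives at a steady rate all year.

```python
TB = 10**12  # assumption: decimal terabytes (the article does not say TB vs. TiB)

weekly_bytes = 100 * TB
seconds_per_week = 7 * 24 * 60 * 60

# Average ingest rate if the data arrived evenly across the week
sustained_mb_per_s = weekly_bytes / seconds_per_week / 10**6

# Yearly volume if collection ran all 52 weeks (an assumption)
yearly_pb = weekly_bytes * 52 / 10**15

print(f"{sustained_mb_per_s:.0f} MB/s sustained, {yearly_pb:.1f} PB per year")
# prints "165 MB/s sustained, 5.2 PB per year"
```

Under those assumptions, a single year of collection would roughly match the 5 PB of owl audio mentioned earlier.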
Projects like these help Oregon State learn more about the world we live in while developing new, innovative computing approaches. These efforts improve lives and health while pushing the boundaries of what is possible with high-performance computing.
Intel® Technologies Advance Analytics
Data analytics is the key to unlocking the maximum value you can extract from your organization’s data. To create a productive, cost-effective analytics strategy that gets results, you need high-performance hardware that’s optimized to work with the software you’re using.
Modern data analytics covers a range of technologies, from dedicated analytics platforms and databases to deep learning and artificial intelligence (AI). New to analytics? Ready to evolve your analytics strategy or improve your data quality? There’s always room to grow, and Intel is ready to help. With a broad ecosystem of analytics technologies and partners, Intel accelerates the efforts of data scientists, analysts, and developers across industries. Learn more about Intel Advanced Analytics.