With hundreds of instruments monitoring Canada’s ocean environment, Ocean Networks Canada (ONC) gathers the same amount of data as the Hubble Telescope. Turning a firehose of high-resolution data into useful knowledge is the challenge of the century. ONC’s robust and sophisticated data management system, Oceans 2.0, is already recognized as a state-of-the-art ocean management tool for marine decision-making, and it’s about to get even better.
Oceans 2.0 is a versatile online tool that allows scientists and the public to access and manipulate data, including audio and video, from ONC’s hundreds of deep ocean and coastal sensors in real time, 24/7. Thanks to renewed funding from CANARIE, whose ongoing support since 2006 has made Oceans 2.0 possible, a new two-phase project is currently underway to accelerate and advance scientific research in Canada.
Phase One of the Research Platform for User-Defined Oceanographic Data Products (known as Empower) will enable researchers to access data products simply and quickly through a specially designed Application Programming Interface (API).
Following the completion of Phase One in April 2017, Phase Two will enable researchers to define, test, use, and share processing code for user-defined datasets in a custom-designed programming environment, known as the Sandbox. In other words, instead of having to manually download and sort a terabyte of data each time, this tool will enable a user to develop a unique algorithm for their research, which will then do the work of filtering, sorting, and presenting the data they need at the push of a button. The completion date for Phase Two is April 2018.
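To make the idea of a user-defined algorithm concrete, here is a minimal sketch in Python of the "filter, sort, present" pattern described above. It is an illustration only, not ONC's Sandbox or API: the `Reading` type, the `run_user_algorithm` function, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Reading:
    """One sensor sample. Field names are hypothetical, not ONC's schema."""
    timestamp: str  # ISO 8601 string, so lexical order is chronological order
    sensor: str
    value: float


def run_user_algorithm(
    readings: Iterable[Reading],
    keep: Callable[[Reading], bool],
) -> List[Reading]:
    """Keep only the readings the user's rule accepts, sorted by time."""
    selected = [r for r in readings if keep(r)]
    return sorted(selected, key=lambda r: r.timestamp)


# Made-up sample data standing in for a much larger archive.
raw = [
    Reading("2017-01-03T00:00:00Z", "temperature", 8.9),
    Reading("2017-01-01T00:00:00Z", "temperature", 9.4),
    Reading("2017-01-02T00:00:00Z", "salinity", 33.1),
    Reading("2017-01-02T00:00:00Z", "temperature", 7.2),
]

# The researcher's rule: temperature readings above 8.0 only.
warm = run_user_algorithm(
    raw, lambda r: r.sensor == "temperature" and r.value > 8.0
)

for r in warm:
    print(r.timestamp, r.value)
```

The point of the pattern is that the researcher supplies only the rule (`keep`), while the filtering and sorting happen wherever the code runs, so only the selected readings need to be delivered.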
“This innovation signals a new user-driven approach to ONC data management,” says ONC Data Manager Marlene Jeffries. In-depth interviews with dozens of ONC science users are informing this exciting project. Based on the responses, 51 personas and 4 distinct use cases will help develop Oceans 2.0’s new custom web services.
“We have this enormous pipe coming in out of the ocean, filling up this enormous tank,” comments Doug Latornell, a Research Software Engineer at the University of British Columbia whose primary focus is to set up and support scientific data automation for his students. “It’s important to keep every molecule in the tank, but a lot of the time what we want is a filtered stream out of the tank. One of the problems right now is that it feels like we can’t have a filtered stream, we can only have the raw stream, one tablespoon at a time.”
“Let’s be blunt about it, there is too much data to be accessed over the internet,” says John Hildebrand, who leads the Scripps Whale Acoustic Lab, which maintains Triton, software that evaluates and processes large sets of acoustic data. Hildebrand is interested in integrating ONC’s hydrophone data into Triton in a way that does not require the data to be downloaded onto a local machine. “It’s impractical to download a terabyte of data, and a terabyte is nothing, it’s one month of data.” Hildebrand wants to download 12 terabytes, which is equivalent to one year of data, the amount needed to analyze seasonal cycles of marine activity. “That is the unit in which you need to analyze these data. If I can only get a few gigabytes of data by pushing a button online, it is never going to happen.”
“The Empower project will open the door for ONC’s international users to access our data in new and creative ways that we haven’t even thought about yet,” says Web Services Specialist Ryan Ross. “One possible example could be an app that would automatically download video footage when a rare bird that is being tracked flies by one of our onshore cameras; this could be of interest to an organization like Bird Studies Canada.”
Another important aspect of making ONC’s datasets available in useful new formats involves standardizing the wide range of measurements and variables of ocean datasets so they align in a global framework. Adam Leadbetter, data management team leader with Ireland’s Marine Institute in Galway—who specializes in ocean data interoperability or compatibility—spent a week with the ONC data team to take Oceans 2.0 to the next level.
“ONC has a huge volume of data that is well looked after and is easily accessible to scientists,” comments Leadbetter. “But the ocean is global, so we need to be able to combine datasets from various global sources to create useful products.” To this end, Leadbetter and the ONC data management team are working to align ONC data with the standards set by the Ocean Data Interoperability Platform, an international consortium that is working towards the effective sharing of data across scientific domains and international boundaries.
The ONC data management team will continue to collaborate with Leadbetter and the Irish Marine Institute through Phase One and Phase Two of the Empower project. Stay tuned for updates in Spring 2017.