Because of the value of water column sonar data to the ocean science and fisheries management communities, the NOAA National Centers for Environmental Information (NCEI), together with NOAA Fisheries and the University of Colorado Cooperative Institute for Research in Environmental Sciences, established a national archive for these data. The archive currently holds 210 TB of freely and publicly available data, and that volume is growing rapidly as sonar technology advances. The spatially and temporally diverse archive is accessible through its dedicated data portal and through Amazon Web Services (AWS). Throughout 2023, we will develop a cloud-optimized data lake of echosounder files representing a ∼100 TB subset of the archive holdings. The echosounder files will be translated from their complex, proprietary binary format into Zarr stores following the Earth Science Information Partners analysis-ready, cloud-optimized standards. The resulting data lake will serve as the foundation for building analytical capabilities that can cost-effectively tap into the archive’s sonar holdings, especially when coupled with compute power. The Zarr stores will subsequently feed into EchoFish, the archive’s AWS-hosted interactive data visualization platform, to facilitate subsetting and prevent the data lake from becoming a data swamp. We will present the progress and potential applications of this project, which is funded by the NOAA Center for Artificial Intelligence.