Loading…

Loading grant details…

Completed STANDARD GRANT National Science Foundation (US)

Collaborative Research: CIBR: Leaping the Specimen Digitization Gap: Connecting Novel Tools, Machine Learning and Public Participation to Label Digitization Efforts

$4.66M USD

Funder National Science Foundation (US)
Recipient Organization Board of Regents, Nshe, Obo University of Nevada, Reno
Country United States
Start Date Jan 15, 2021
End Date Jan 31, 2025
Duration 1,477 days
Number of Grantees 5
Roles Principal Investigator; Former Principal Investigator; Co-Principal Investigator
Data Source National Science Foundation (US)
Grant ID 2027241
Grant Description

National efforts to digitize natural history collections have transformed previously siloed, unstandardized resources into a networked, openly available information nexus usable to meet grand scientific and societal challenges. Despite these enormous strides, major bottlenecks in this digitization process still exist, especially in areas where automation approaches have been most challenging.

In particular, capturing analog specimen data into digital format and converting text descriptions of collecting locations into mappable geocoordinates, have remained boutique efforts. Because of these bottlenecks, as many as 91% of digitized specimens are missing key elements that hamper ability to use these specimen records more effectively. This project will develop key workflows to dramatically increase the speed at which specimen data can be captured and made available broadly to data providers and consumers.

These workflows include novel approaches that use both computer and human intelligence to advance our ability to capture specimen information. One key workflow focuses on the challenge of automated conversion of imaged specimen labels into properly formatted and usable digital text. Critical to the success of this workflow are human validation checkpoints that will be implemented using a popular citizen science platform, Notes from Nature.

A second workflow focuses on new tools that take advantage of previous efforts to assign mappable coordinates based on specimen collection location to automatically add such mapping information for specimens missing those data. Finally, this effort will create tools for easy access to these new data in and out of common use databases, making the data immediately available for museum providers and researchers alike.

This effort will connect public participation in science to these novel tools and technologies. Further, it will train diverse graduate students and undergraduate students in bioinformatics and museum science.

This effort has three design goals that together will dramatically reduce the digitization gap in museum specimen data. The first design goal will combine machine learning methods with public participation in scientific research (PPSR) via the successful Notes from Nature (NfN) project to speed up label digitization and facilitate obtaining locality data.

A key part of the first design goal utilizes supervised machine learning approaches and object character recognition (OCR) when possible but also includes “humans in the loop” using the NfN platform to gather fast quality feedback from human volunteers at key points. This approach also provides a means to create high-quality training datasets needed for improving automation steps, ultimately further reducing human effort.

The second design goal will integrate locality data interpretation through GEOLocate with a Biodiversity Enhanced Locality Service (BELS), which will make it possible to look up pre-existing localities that have been georeferenced using best practices. A third goal is to connect these workflows and services to Symbiota, a community digitization hub, to allow easy inflow and outflow of content back to digitization networks.

Providers will be able to easily access new data along with associated metadata about processing steps, all returned using established standards and best practices. The key to this effort will be engagement with the community, including researchers, collections staff, and Zooniverse volunteers. Engagement will focus on virtual training and working with an advisory committee in order to grow capacity and community involvement.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

All Grantees

Board of Regents, Nshe, Obo University of Nevada, Reno

Advertisement
Apply for grants with GrantFunds
Advertisement
Browse Grants on GrantFunds
Interested in applying for this grant?

Complete our application form to express your interest and we'll guide you through the process.

Apply for This Grant