A BOLD way to crowdsource genetic data

October 26, 2015

The Barcode of Life Data Systems (BOLD) is one of the largest genetic databases in the world, where users can search more than 1.7 million public records. More specifically, BOLD is an informatics workbench that facilitates the acquisition, storage, analysis and publication of DNA barcode records. DNA barcode records use a very short genetic sequence from a standard part of the genome to help experts identify and categorize specimens. As of October 2015, NEON has shared over 3,000 specimen records with BOLD. In 2013, NEON even found a new ground beetle species in the Abacidus subgenus.

Leveraging different science infrastructures to provide high quality, open data

NEON uses publicly available and standardized protocols to guide sampling techniques in the field, such as setting up mosquito traps. Field technicians undergo training on species identification as part of NEON’s quality assurance efforts. However, species identification is a unique area that requires external quality control measures to reduce uncertainty: NEON submits specimens to BOLD so that taxonomic experts can check the accuracy of NEON’s species identifications and, in some cases, help when NEON scientists have trouble identifying a species. In turn, these results inform NEON protocols and training programs. For example, if NEON finds species that are unusual to an area, taxonomic experts at BOLD assist with identification as a quality control measure. NEON Staff Scientist and Insect Ecologist Katie LeVan states, “we want the most precision possible because that is what is valuable”.

While identifying species is an important part of NEON’s science design, providing sampling data in accessible, publicly available databases is critical to NEON’s commitment to open science. Open science is a movement to make scientific research and data accessible to broader audiences. Some fundamental goals of open science include 1) transparency in methodology and data collection; 2) public availability and reusability of data; and 3) use of open source tools to facilitate scientific collaboration. To advance collective science goals and support ongoing efforts of the science community, NEON is connecting with open access databases and initiatives like BOLD.

NEON beetle collection: from the field to species identification

NEON submits specimens and barcode records to BOLD, including specimens and records of beetles and mosquitoes. NEON is projected to provide tens of thousands of new records to BOLD by sampling at unprecedented scales and quantities across the continent. For example, there are currently roughly 24,500 Carabidae (beetle) specimen records and 2,200 beetle barcode records; over the 30-year lifetime of the Project, NEON may contribute over 50,000 new beetle barcode records.

NEON sends approximately 400 beetle specimens per field site to BOLD to analyze the accuracy of NEON species identification efforts. Of these specimens, roughly 40 are DNA barcoded for further quality control. However, these numbers vary greatly depending on the field site: in 2015, an average of 7,000 to 10,000 beetle specimens were collected per site, but 29,000 samples were collected at North Sterling.

BOLD is a resource to store, analyze and publish DNA information

BOLD is maintained by the University of Guelph in Ontario, Canada. It offers researchers a way to collect, manage and analyze DNA barcode data. There are two central DNA barcode databases: BOLD and the International Nucleotide Sequence Database Collaboration (INSDC). BOLD and the INSDC members are connected to other databases of taxonomic names and voucher specimens, such as specimens in museums. These linkages represent international open science efforts to develop integrated, standardized and reproducible methods in the field of genetics. BOLD’s assembly of molecular, morphological and distributional data is already bridging the traditional bioinformatics chasm: according to the BOLD website, “It's the BARCODE data standard that allows the products of bottom-up projects around the world to be integrated into a global initiative”. The integration and crowdsourcing of genetic data expands the scope of traditional biology: for example, the broad geographic representation of genetic data stored in BOLD may be used to develop new theories and answer questions in the area of phylogeny, otherwise known as the evolutionary history of taxonomic groups.

Looking for NEON specimens and genetic data on BOLD?

Explore NEON specimen records on BOLD’s Public Data Portal by searching the phrase “National Ecological Observatory Network, United States”. Learn more about NEON Terrestrial Organismal Sampling Methods.

More about BOLD

BOLD Systems is designed to support the generation and application of DNA barcode data and consists of four main modules: a data portal, a database of barcode clusters, an educational portal, and a data collection workbench. The project behind the database, the International Barcode of Life project (iBOL), is the world’s largest biodiversity genetics initiative:

“Hundreds of biodiversity scientists, genetics specialists, technologists and ethicists from 25 nations are working together to construct a richly parameterized DNA barcode reference library that will be the foundation for a DNA-based identification system for all multi-cellular life.”  
-International Barcode of Life project
