NSF NEON, Operated by Battelle

Data Notification

Coming updates to NEON microbial data

January 29, 2024

Over the last two years, NEON's microbial data products have changed substantially, and data released in the coming year will include many updates and new, improved DNA sequence data. Improvements in sequencing technology and laboratory protocols are producing data of better overall quality and much greater volume. The current status of each data product is detailed below; another update will be provided in May.

The metagenomics data products will see the biggest improvements, across all sample types: soil (DP1.10107.001), benthic (DP1.20279.001), and surface water (DP1.20281.001). For the 2022 collections, DNA sequencing will move to the much higher-capacity Illumina NovaSeq platform. As a result, the average number of sequence reads per sample will increase about ten-fold. For the 2023 and 2024 collections, NEON will begin a collaboration with the Joint Genome Institute (JGI) and the National Microbiome Data Collaborative (NMDC), through which all metagenomic samples will be sequenced by JGI and then analyzed through the NMDC data analysis pipeline. This work is supported in part by a Community Science Proposal grant awarded to NEON this year. The JGI sequencing will result in an approximately 50-fold increase in sequence output over current NEON averages. Another exciting aspect of this collaboration is that all NEON metagenomic samples, past and present, will be incorporated into the NMDC database and run through the NMDC data analysis pipeline.

Sequencing of the 2022 samples will begin in February 2024. These sequencing runs will also include the last remaining samples from the 2020 and 2021 field collections. The first data sets will be released starting in March 2024, and all metagenome sequences from 2020–2022 should be available as provisional data by the end of April 2024.

Sequencing of the 2023 metagenome samples will also begin in February. Because intensive data analysis will accompany the sequencing, the lag between sequencing and release will be longer. The first set of samples should be released in May, with the two subsequent batches released at monthly intervals thereafter. Discussions are underway between NEON and NMDC about how these data will be presented. NEON will continue to provide links to the raw data, as well as links to the metagenome annotations on NMDC.

The marker gene data products (soil: DP1.10108.001, benthic: DP1.20280.001, surface water: DP1.20282.001) have also gone through major changes. Over the past year and a half, the Rush University Genomics and Microbiome Core Facility (GMCF), a laboratory that specializes in environmental DNA, optimized the PCR and sequencing protocols for both ITS and 16S amplicon sequencing. These changes have greatly improved the quality of the sequencing results, especially for the fungal ITS products. The GMCF has also upgraded to the Illumina NovaSeq for sequencing the marker gene products, substantially increasing the number of sequences per sample.

Last year, the Rush GMCF laboratory completed the fungal ITS sequencing of the 2019 samples. This wrapped up the 2019 marker gene sequencing (the bacterial 16S sequencing had already been completed). All the 2019 marker gene data are part of the 2024 official release. The GMCF is currently sequencing the 2020 and 2021 samples, which will be released as provisional data as they become available. The first sets of samples will be available online in February, with all samples from these two years expected to be completed by April. After completing the 2020 and 2021 samples, the GMCF will begin sequencing the 2022 and 2023 samples, with these data expected to be available by October.

Lastly, the microbe community composition data products (soil: DP1.10081.001, benthic: DP1.20086.001, and surface water: DP1.20141.001) are undergoing major modifications, primarily to the data analysis pipeline. These changes are intended to 1) make it easier to compare all samples across years and sites, 2) improve accessibility to the data by, for example, making it easier to import the sample data into popular metabarcoding programs such as phyloseq and QIIME 2 for downstream ecological analysis, and 3) provide a modular pipeline system that is flexible and can be adapted to new programs and analytic methods.

The revised community composition products will begin to be available on the data portal in the spring of 2024. The 2019 samples are targeted for release in April. As the 2020/2021 marker gene sequences become available, they will be run through the community composition pipeline, with the results published as provisional data by the end of May. Likewise, for the 2022/2023 samples, the community composition analysis will be run as those sequences become available, with results expected by October.
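
For readers who script their data access, the nine data product identifiers named in this notification can be gathered into a small lookup table. The Python sketch below is illustrative only: the dictionary layout, the family labels, the `product_id` helper, and the ID pattern are assumptions made for this example (the pattern is inferred solely from the nine IDs listed above) and are not part of any NEON tool.

```python
import re

# Data product IDs exactly as listed in this notification.
# The grouping keys ("metagenomics", "marker gene", ...) are labels
# chosen for this sketch, not official NEON names.
MICROBIAL_PRODUCTS = {
    "metagenomics": {
        "soil": "DP1.10107.001",
        "benthic": "DP1.20279.001",
        "surface water": "DP1.20281.001",
    },
    "marker gene": {
        "soil": "DP1.10108.001",
        "benthic": "DP1.20280.001",
        "surface water": "DP1.20282.001",
    },
    "community composition": {
        "soil": "DP1.10081.001",
        "benthic": "DP1.20086.001",
        "surface water": "DP1.20141.001",
    },
}

# Assumed ID shape, inferred only from the nine IDs above.
DPID_PATTERN = re.compile(r"^DP1\.\d{5}\.\d{3}$")


def product_id(family: str, sample_type: str) -> str:
    """Return the data product ID for a product family and sample type.

    Hypothetical helper for this sketch; raises KeyError if the
    combination is not listed in the notification.
    """
    try:
        return MICROBIAL_PRODUCTS[family][sample_type]
    except KeyError:
        raise KeyError(
            f"no product listed for {family!r} / {sample_type!r}"
        ) from None


def all_product_ids() -> list[str]:
    """Return every listed ID, in family order then sample-type order."""
    return [
        dpid
        for by_sample in MICROBIAL_PRODUCTS.values()
        for dpid in by_sample.values()
    ]


if __name__ == "__main__":
    for family, by_sample in MICROBIAL_PRODUCTS.items():
        for sample_type, dpid in by_sample.items():
            print(f"{family:22s} {sample_type:14s} {dpid}")
```

The identifiers returned by `product_id` could then be passed to whatever download tool is in use. No download call is shown here because the notification does not describe one.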

Copyright © Battelle, 2026

The National Ecological Observatory Network is a major facility fully funded by the U.S. National Science Foundation.

Any opinions, findings and conclusions or recommendations expressed in this material do not necessarily reflect the views of the U.S. National Science Foundation.