NSF NEON, Operated by Battelle

Big Data Part II: Sharing the Challenges and Payoffs of Big Data

September 5, 2012

By Brian Wee, Chief of External Affairs

Big Data is not new to the science world. But to extract as much fundamental insight and predictive power from ecological Big Data as we have from large data sets in disciplines like physics, genomics, and atmospheric science, we need different and more sophisticated tools. One of the biggest challenges and opportunities that NEON continues to grapple with is finding, developing, and implementing the best tools for an ecological Big Data project of continental scale.

Ecosystems are rich with subtle and varied interactions that cannot be teased out from the noise of variability without a large suite of correlated measurements that capture a “snapshot” of the environment. If we want to filter out noise and derive a new understanding of the ecosystem processes underlying our measurements, we need to capture many snapshots over time and space. Weather data typically include automated measurements of about a dozen variables, many of which can be related to each other using fundamental physical laws. NEON, on the other hand, will observe more than 500 variables at each of its 60 sites for 30 years. That breadth and depth are necessary to achieve ecological insight and forecasting across ecosystems when the relationships between such variables are complex and may not yet have been discovered or described.

In addition, data collection in meteorological networks and the Large Hadron Collider is automated, but many ecologically important samples and measurements must still be collected by hand in varying conditions. A great deal of procedural standardization and QA/QC is necessary to ensure that billions of data points collected by thousands of sensors and hundreds of people at 60 sites over 30 years are of consistent and high quality. Furthermore, each data point, whether it be a leaf nitrogen concentration, a spectral reflectance or the taxonomic identification of a phytoplankton, must be documented with time, place and quality control information as well as a link to the protocol that generated that measurement. These metadata greatly increase the dimensionality and complexity of ecological Big Data. But they are utterly necessary; without the protocols for data collection, for instance, it is impossible for a data user to assess the accuracy and usability of the data or quantify its uncertainty in modeling applications.

Making Big Data work for ecology and environmental science requires an enormous investment of effort, resources and ingenuity. But the potential payoff is equally large. Giving researchers large data sets and the tools required to integrate them into novel analyses opens up a world of discoveries and insights that the original data collectors may never have imagined. What’s more, bigger and more integrated data sets make a new type of ecological science possible by enabling the direct quantitative testing of general theory and hypotheses. These long-term, large-scale data sets make it possible to ask and answer simple, broad questions like “what are the biological consequences of environmental change?” rather than “does soil nitrogen appear to be related to nighttime temperatures in the Costa Rican rainforest?” The answers to these broader questions also have broader implications for humanity.

NEON data products, for instance, can be used in models to forecast the responses of ecosystems, and ecosystems provide the human essentials of food, fiber, energy and water. These essentials and other ecosystem services link the nation’s environmental well-being to its economic success. Thus, “government has an essential role to play in the stewardship of environmental capital,” as the President’s Council of Advisors on Science and Technology (PCAST) asserts in its 2011 report, “Sustaining Environmental Capital: Protecting Society and the Economy.” Effective management of environmental capital requires accessible, high-quality information about the current state of the environment and its likely responses to change.

To that end, the 2011 PCAST report recommends improving integration and utilization of existing data and models and filling gaps in the data (EcoINFORMA), as well as increasingly employing ecoinformatics to improve decision-making in natural resources management. The report also identifies NEON, the Long-Term Ecological Research Network (LTER), and observation programs from other Federal agencies as contributors to a body of credible data on the status and trends of the nation’s ecosystems that can be used to inform national assessments and decisions. This large and rapidly expanding body of information calls for enhanced tools to make better use of it. Thus, the U.S. government has invested in the development of technologies and infrastructure to enhance the accessibility and utility of existing and future Big Data.

The $200 million Big Data Research and Development Initiative announced earlier this year funds several such investments, including a joint National Science Foundation (NSF) / National Institutes of Health solicitation for proposals to “advance the core scientific and technological means of managing, analyzing, visualizing, and extracting useful information from large and diverse data sets.” CIF21, a complementary cyberinfrastructure initiative, aims to develop and coordinate efforts across NSF to create a distributed computing and information processing framework to transform data acquisition, storage, management and integration across Federal agencies. CIF21 further spawned the EarthCube Initiative to support the growth of data sharing and analysis across geoscience and biological science disciplines.

Indeed, a fertile ecosystem of cross-pollinating data initiatives is currently buzzing away at key tasks to prepare researchers for the growing demands of large-scale, interdisciplinary environmental science. For instance, the DataONE project, the EarthCube Initiative and the Federation of Earth Science Information Partners (of which NEON is a member) are working to define better ways to publish and discover data. Another hive of activity is clustered around the task of refining semantic links between data points. Such links enable sophisticated, powerful queries that help deal with the complex web of interrelated biological, physical, behavioral and phylogenetic data exemplified by the pika example in the first part of this series. Still other organizations are experimenting with cloud-based, collaborative environments to help manage the workflows of large, multidisciplinary science projects such as Macrosystems Biology research. NEON is not alone in facing the challenges of ecological Big Data; it is one of many projects that are tackling these challenges in parallel and together.
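
To make the metadata requirement discussed earlier more concrete (every data point documented with time, place, quality control information and a link to its protocol), here is a minimal Python sketch. It is not NEON's actual data model: the class, the field names, the site code, the QC vocabulary and the protocol identifier are all hypothetical, and the sample values are invented purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Measurement:
    """One data point plus the metadata that should travel with it.

    All field names are hypothetical and chosen only for illustration.
    """
    variable: str           # what was measured
    value: float            # the measured value
    unit: str               # unit of the value
    collected_at: datetime  # time of collection
    site_id: str            # place of collection (made-up site code)
    qc_flag: str            # quality control information (made-up vocabulary)
    protocol_ref: str       # identifier or link for the collection protocol


def missing_metadata(m: Measurement) -> list[str]:
    """Return the names of required text metadata fields that are empty."""
    required = ("unit", "site_id", "qc_flag", "protocol_ref")
    return [name for name in required if not getattr(m, name)]


# Illustrative record; none of these values come from real NEON data.
sample = Measurement(
    variable="leaf_nitrogen_concentration",
    value=2.1,
    unit="percent_dry_mass",
    collected_at=datetime(2012, 9, 5, 14, 30, tzinfo=timezone.utc),
    site_id="SITE_A",
    qc_flag="pass",
    protocol_ref="protocols/leaf-nitrogen-v1",
)

print(missing_metadata(sample))
```

A record that arrives without its protocol reference would be reported by `missing_metadata`, which mirrors the point made above: without the protocol, a data user cannot judge how far to trust the value.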

Sandra Chung and Dave Schimel contributed to this piece.

The National Ecological Observatory Network is a major facility fully funded by the U.S. National Science Foundation.

Any opinions, findings and conclusions or recommendations expressed in this material do not necessarily reflect the views of the U.S. National Science Foundation.