NSF NEON, Operated by Battelle

NEON-CUAHSI Data Skills Demo: Exploring the Water Cycle at Co-Located Terrestrial-Aquatic Sites

Authors: Zachary L. Nickerson

Last Updated: Mar 25, 2026

This tutorial explores hydrologic data products published by NEON across observational, instrumented, and remote sensing subsystems. It focuses on downloading and exploring various NEON hydrologic data products. It also dives deeper into NEON hydrology data, using co-located terrestrial-aquatic sites to examine the water cycle at multiple levels. The code for this tutorial is available in both Python and R environments.

Learning Objectives

After completing this activity, you will be able to:

  • Download and explore the contents of NEON hydrologic data products from the observational and instrumented subsystems.
  • Navigate to tools and data sources for more derived data products such as geospatial data, bundled eddy-covariance data, or data from the airborne observation platform.
  • Understand the similarities and linkages between different NEON data products.
  • Join and plot hydrologic data sets from the instrumented subsystem across a terrestrial-aquatic gradient at NEON co-located sites.

You can follow either the R or Python code throughout this tutorial.

  • For R users, we recommend using R version 4+ and RStudio.
  • For Python users, we recommend using Python 3.9+.

Set up: Install Packages

Packages only need to be installed once; you can skip this step after the first time:

R

  • neonUtilities: Basic functions for accessing NEON data
  • tidyverse: Collection of R packages designed for data science
  • geosphere: Compute distances between latitude/longitude coordinates
  • plotly: Functions for producing interactive plots
install.packages("neonUtilities")
install.packages("tidyverse")
install.packages("geosphere")
install.packages("plotly")

Python

  • os: Standard-library module allowing interaction with user’s operating system (ships with Python; no installation needed)
  • pandas: Module for working with data frames
  • numpy: Module for working with numerical arrays
  • neonutilities: Basic functions for accessing NEON data
  • matplotlib: Functions for plotting
  • geopy: Compute distances between latitude/longitude coordinates
  • plotly: Functions for producing interactive plots
  • statsmodels: Functions for the estimation of statistical models
pip install pandas
pip install numpy
pip install neonutilities
pip install matplotlib
pip install geopy
pip install plotly
pip install statsmodels

Additional Resources

  • Tutorial for using neonUtilities from both R and Python environments.
  • GitHub repository for neonUtilities
  • neonUtilities cheat sheet. A quick reference guide for users.
  • HydroShare Collection - NEON Hydrologic Data Products: Site-Level Resources.
  • CUAHSI Cyberseminar Series: Introduction to NEON for Hydrology - An Overview of Hydrologic Data Products and Tools.

Set up: Load Packages

R

library(neonUtilities)
library(tidyverse)
library(geosphere)
library(plotly)

Python

import os
import pandas as pd
import neonutilities as nu
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from geopy.distance import geodesic
import plotly.graph_objects as go
import statsmodels.api as sm
import numpy as np

Set NEON Data Portal API Token

We recommend that NEON data users set a NEON Data Portal API token as an environment variable. See this tutorial for instructions on obtaining a NEON API token.

R

Sys.setenv(NEON_PAT="YOUR_API_TOKEN_HERE")

Python

os.environ.setdefault('NEON_PAT',"YOUR_API_TOKEN_HERE")
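
To avoid sending the placeholder string to the API by accident, a small helper can read the token and fall back to None. This is a minimal sketch: `get_neon_token` and `PLACEHOLDER` are hypothetical names introduced here, not part of neonutilities, and it assumes the download functions accept `token=None` for requests made without a token.

```python
import os

# Placeholder text used in this tutorial; it is never a valid token
PLACEHOLDER = "YOUR_API_TOKEN_HERE"

def get_neon_token(env=None):
    """Return the NEON_PAT token, or None if it is unset or still the placeholder."""
    env = os.environ if env is None else env
    token = env.get("NEON_PAT")
    if not token or token == PLACEHOLDER:
        return None
    return token

token = get_neon_token()
print("Token found" if token else "No token set")
```

The returned value could then be passed as the `token` argument in the download calls used throughout this tutorial.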

Download & Explore: Introduction

In this tutorial, we will focus on one pair of co-located NEON sites from Domain 07 - Appalachians & Cumberland Plateau:

  • Oak Ridge National Laboratory (ORNL) - Terrestrial
  • Walker Branch (WALK) - Aquatic

However, the workflow can be replicated for any pair of co-located sites across the observatory that publishes all the data products used in this tutorial. Such a pair meets the following criteria:

  • Aquatic site has published discharge data and has groundwater wells installed.
  • Terrestrial site has published precipitation data.
    • Note: Precipitation data are available from two different data products depending on the collection method at a site. Check the following data products to ensure you are downloading the correct data product for a site:
      • Precipitation - weighing gauge (DP1.00044.001)
      • Precipitation - tipping bucket (DP1.00045.001)
  • Terrestrial and aquatic sites are within 10 km of each other.

The following site pairs meet those criteria:

NEON Domain   Aquatic Site   Terrestrial Site
D02           LEWI           BLAN
D02           POSE           SCBI
D03           FLNT           JERC
D06           KING           KONA
D07           WALK           ORNL
D08           TOMB           LENO
D08           BLWA           DELA
D08           MAYF           TALL
D12           BLDE           YELL
D13           COMO           NIWO
D16           MART           WREF
D17           BIGC           SOAP
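
The 10-km criterion can be checked with a great-circle distance between the two site locations. The sketch below uses the haversine formula from the standard library only; `haversine_km` and `is_colocated` are hypothetical helper names, and the coordinates are arbitrary illustrative values, not real NEON site locations.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def is_colocated(site_a, site_b, max_km=10.0):
    """True if two (lat, lon) points are within max_km of each other."""
    return haversine_km(*site_a, *site_b) <= max_km

# Arbitrary (lat, lon) pairs for illustration only
terrestrial = (40.00, -100.00)
aquatic = (40.05, -100.05)
print(round(haversine_km(*terrestrial, *aquatic), 2),
      is_colocated(terrestrial, aquatic))
```

Real site coordinates would need to come from NEON site metadata.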

In this tutorial, we will focus on one water year of data: Water Year 2024, defined as 2023-10-01 through 2024-09-30. For each data product used in this tutorial, we will download data published in RELEASE-2026. To learn more about the differences between released and provisional data, see the Understanding Releases and Provisional Data tutorial on the NEON website.
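
The water year definition above can be turned into the year-month strings that the download calls in this tutorial take as `startdate` and `enddate`. This is a minimal sketch; `water_year_bounds` and `water_year_months` are hypothetical helper names introduced here.

```python
from datetime import date

def water_year_bounds(water_year):
    """Return (first day, last day) of a water year running Oct 1 through Sep 30."""
    return date(water_year - 1, 10, 1), date(water_year, 9, 30)

def water_year_months(water_year):
    """Return (startdate, enddate) as YYYY-MM strings."""
    start, end = water_year_bounds(water_year)
    return start.strftime("%Y-%m"), end.strftime("%Y-%m")

startdate, enddate = water_year_months(2024)
print(startdate, enddate)  # 2023-10 2024-09
```

Changing the argument to another year gives the matching date range for repeating the workflow on a different water year.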

Download & Explore: Instrumented (IS) Data Products

For this exercise, we will download and explore one data product from the instrumented subsystem: Precipitation - tipping bucket (DP1.00045.001). This data product is published from NEON’s terrestrial sites; thus, we will use the NEON site code ‘ORNL’.

R

# Download precipitation data for a single water year
ptp_r <- neonUtilities::loadByProduct(dpID="DP1.00045.001",
                                      site="ORNL",
                                      startdate="2023-10",
                                      enddate="2024-09",
                                      release="RELEASE-2026",
                                      package='expanded',
                                      check.size = F,
                                      token=Sys.getenv("NEON_PAT"))

Python

# Download precipitation data for a single water year
ptp_py = nu.load_by_product(dpid="DP1.00045.001",
                            site="ORNL",
                            startdate="2023-10",
                            enddate="2024-09",
                            release="RELEASE-2026",
                            package="expanded",
                            check_size=False,
                            token=os.environ.get("NEON_PAT"))

Downloads from the NEON Utilities packages contain multiple files, including data tables, metadata, and data product documentation. Let’s explore each set of files in turn.

Files Associated with IS Downloads

The download is returned as a single object: a named list (R) or dictionary (Python) of tables. Let’s view the contents of the download package.

R

# Get all file names in the download package
names(ptp_r)

Python

# Get all file names in the download package
ptp_py.keys()

In this tutorial, we downloaded the expanded download package. What are the files contained in this download package and why are they useful?

  • TIPPRE_1min and TIPPRE_30min: Includes the primary data tables of the Precipitation - tipping bucket data product. We will dive deeper into data tables in the next section.
  • sensor_positions_00045: Reports the geolocation of each sensor included in the download.
  • science_review_flags_00045: Lists each science review flag (SRF) applied to the data included in this download, with its date range, flag value, and justification.
  • issueLog_00045: Reports issues that may impact data quality, or changes to a data product that affect one or more sites.
  • variables_00045: This file contains all the variables found in the data table(s) included in this download. This includes full definitions, units, and other important information.
  • readme_00045: The readme file provides important information relevant to the data product and the specific instance of downloading the data.
  • citation_00045_RELEASE-2026: Formatted citation and DOI for the data included in this download.
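
One way to separate the data tables from the supporting files is to match on the file-name prefixes described in this tutorial. The sketch below is an illustration only: `split_download_keys` is a hypothetical helper, and the key list is modeled on the file names above rather than taken from a live download.

```python
# Prefixes of the supporting files described in this tutorial
# (the last two occur in observational data products)
METADATA_PREFIXES = (
    "sensor_positions", "science_review_flags", "issueLog",
    "variables", "readme", "citation", "categoricalCodes", "validation",
)

def split_download_keys(keys):
    """Separate data-table names from supporting file names by prefix."""
    tables, metadata = [], []
    for key in keys:
        if key.startswith(METADATA_PREFIXES):
            metadata.append(key)
        else:
            tables.append(key)
    return tables, metadata

# Key names modeled on the list above; a real download may differ
example_keys = [
    "TIPPRE_1min", "TIPPRE_30min", "sensor_positions_00045",
    "science_review_flags_00045", "issueLog_00045", "variables_00045",
    "readme_00045", "citation_00045_RELEASE-2026",
]
tables, metadata = split_download_keys(example_keys)
print(tables)
```

With a real download, `ptp_py.keys()` could be passed in place of `example_keys`.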

Explore IS Data Tables

The expanded download package for DP1.00045.001 contains two data tables that report precipitation time series data, each reporting data at a different temporal resolution:

  • TIPPRE_1min: Tipping bucket precipitation reported at a 1-min resolution
  • TIPPRE_30min: Tipping bucket precipitation aggregated to a 30-min resolution

Below, we will explore the first few rows of TIPPRE_30min. Add to the code below to also view other tables included in the expanded download package.

R

# Print the first 5 records in the time series data
print("First 5 rows of TIPPRE_30min")
head(ptp_r$TIPPRE_30min, 5)

Python

# Print the first 5 records in the time series data
print("First 5 rows of TIPPRE_30min")
print(ptp_py['TIPPRE_30min'].head())

Explore IS Variables

The variables_00045 file provides insight into the structure of each data table and associated variables included in a download package. View the variables file and familiarize yourself with the different fields, data types, and units used in this data product.

R

# View variables file to understand data table structure
View(ptp_r$variables_00045)

Python

# View variables file to understand data table structure
print(ptp_py['variables_00045'])
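
Because the variables file covers every table in the download, it can help to filter it down to one table at a time. The sketch below assumes the variables table has `table`, `fieldName`, and `units` columns (an assumption to check against your own download); `fields_for_table` is a hypothetical helper, and the rows are placeholder values, not real definitions.

```python
import pandas as pd

def fields_for_table(variables, table_name):
    """Return the field names and units documented for one data table."""
    rows = variables[variables["table"] == table_name]
    return rows[["fieldName", "units"]].reset_index(drop=True)

# Illustrative stand-in for ptp_py['variables_00045']; values are placeholders
variables = pd.DataFrame({
    "table": ["TIPPRE_1min", "TIPPRE_30min", "TIPPRE_30min"],
    "fieldName": ["precipBulk", "precipBulk", "finalQF"],
    "units": ["millimeter", "millimeter", None],
})
print(fields_for_table(variables, "TIPPRE_30min"))
```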

Download & Explore: Observational (OS) Data Products

Let us now explore a hydrologic data product from the observational subsystem. For this exercise, we move to the surface component of the hydrologic cycle. We will download and explore hydrologic data derived from surface water grab samples: Stable isotopes in surface water (DP1.20206.001). This data product is published from NEON’s aquatic sites; thus, we will use the NEON site code ‘WALK’.

R

# Download surface water stable isotope data for a single water year
asi_r <- neonUtilities::loadByProduct(dpID="DP1.20206.001",
                                      site="WALK",
                                      startdate="2023-10",
                                      enddate="2024-09",
                                      release="RELEASE-2026",
                                      package='expanded',
                                      check.size = F,
                                      token=Sys.getenv("NEON_PAT"))

Python

# Download surface water stable isotope data for a single water year
asi_py = nu.load_by_product(dpid="DP1.20206.001",
                            site="WALK",
                            startdate="2023-10",
                            enddate="2024-09",
                            release="RELEASE-2026",
                            package="expanded",
                            check_size=False,
                            token=os.environ.get("NEON_PAT"))

Let’s do the same exploration of the download as we did for the instrumented data product and see what is similar and different.

Files Associated with OS Downloads

R

# Get all file names in the download package
names(asi_r)

Python

# Get all file names in the download package
asi_py.keys()

When we view the contents of the observational data product download, we notice similarities and differences relative to the instrumented data product. For example, both data products include citation, variables, issueLog, and readme files. What do we notice that is different?

  • The observational data product does not contain science_review_flags or sensor_positions files. Those files are specific to instrumented data products.
  • Files specific to observational data products are included:
    • categoricalCodes_20206: Some variables in the data tables are published as strings and constrained to a standardized list of values (LOV). This file shows all the LOV options for variables published in this data product.
    • validation_20206: If any fields require validation prior to publication, those validation rules are reported in this table.
  • There are many more data tables published in this observational data product. Let’s explore that in the next section.

Explore OS Data Tables

For this sample-based observational data product, there are many more tables published than for the instrumented data product we explored previously. That is because data are collected and published at each point along the lifetime of a sample, from collection to analysis. Let’s break down the table structure for this stable isotopes data product.

  • asi_fieldSuperParent: Field data associated with the ‘superparent’ water sample, which is a 4-L grab sample that, once subsampled, results in multiple observational data products, including this stable isotopes data product.
  • asi_fieldData: Field data associated with the stable isotopes subsample.
  • asi_externalLabH2OIsotopes: Results of hydrogen-2 and oxygen-18 stable isotope ratio analysis in filtered surface water samples.
  • asi_externalLabSummaryData: Accuracy and precision data for the instrument used in the analysis of H2O stable isotopes.
  • asi_POMExternalLabDataPerSample: Results of carbon-13 and nitrogen-15 stable isotope ratio analysis in particulate organic matter (POM) filtered out of surface water samples.
  • asi_externalLabPOMSummaryData: Accuracy and precision data for the instrument used in the analysis of POM stable isotopes.

For this exercise, let’s just explore the first few rows of asi_externalLabH2OIsotopes. Add to the code below to also view other tables included in the expanded download package.

R

# Print the first 5 records in H2O stable isotope lab data
print("First 5 rows of asi_externalLabH2OIsotopes")
head(asi_r$asi_externalLabH2OIsotopes, 5)

Python

# Print the first 5 records in H2O stable isotope lab data
print("First 5 rows of asi_externalLabH2OIsotopes")
print(asi_py['asi_externalLabH2OIsotopes'].head())

Explore OS Variables

R

# View variables file to understand data table structure
View(asi_r$variables_20206)

Python

# View variables file to understand data table structure
print(asi_py['variables_20206'])

Download & Explore: Higher-Level Hydrologic Data Products

NEON data products are processed at progressive levels. The precipitation and stable isotopes data products are Level 1 data products, which is the lowest level of data processing required for a NEON data product. Higher-level hydrologic data products include additional processing in the form of spatial and/or temporal interpolation, or the incorporation of algorithms or scientific theory to derive higher-order quantities.

More information on NEON data processing levels.
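
The processing level is encoded in the data product ID itself: the digit after "DP" is the level. The sketch below parses an ID into its parts; `parse_dpid` is a hypothetical helper, and reading the last three digits as a revision number is an assumption about NEON's ID convention.

```python
import re

# DP<level>.<5-digit product number>.<3-digit revision>
DPID_PATTERN = re.compile(r"^DP(\d)\.(\d{5})\.(\d{3})$")

def parse_dpid(dpid):
    """Split a NEON data product ID into level, product number, and revision."""
    match = DPID_PATTERN.match(dpid)
    if match is None:
        raise ValueError(f"Not a valid data product ID: {dpid!r}")
    level, number, revision = match.groups()
    return {"level": int(level), "number": number, "revision": revision}

for dpid in ["DP1.00045.001", "DP1.20206.001", "DP4.00131.001"]:
    print(dpid, "-> Level", parse_dpid(dpid)["level"])
```

The five-digit product number is also the suffix on the supporting files in a download, such as variables_00045.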

For this exercise, we will introduce three higher-level hydrologic data products and show how to download them.

Higher-Level Hydrologic Data Products: Stream morphology maps

The Stream morphology maps (DP4.00131.001) data product is a Level 4 aquatic data product published at all NEON stream sites. The data product includes many data tables with post-processed survey data and links to geospatial data and site maps stored in the cloud. Let’s download the data product and fetch the geospatial data from the cloud.

R

# Download stream morphology data for the lifetime of a site
geo_r <- neonUtilities::loadByProduct(dpID="DP4.00131.001",
                                      site="WALK",
                                      release="RELEASE-2026",
                                      package='basic',
                                      check.size = F,
                                      token=Sys.getenv("NEON_PAT"))

# URL to download geospatial data from the cloud is stored in geo_surveySummary
# Get the URL for the most recent geomorphology survey
print(max(geo_r$geo_surveySummary$dataFilePath[
  geo_r$geo_surveySummary$surveyBoutTypeID=="geomorphology"]))

# Copy and paste the URL to your browser to retrieve the data package

Python

# Download stream morphology data for the lifetime of a site
geo_py = nu.load_by_product(dpid="DP4.00131.001",
                            site="WALK",
                            release="RELEASE-2026",
                            package="basic",
                            check_size=False,
                            token=os.environ.get("NEON_PAT"))

# URL to download geospatial data from the cloud is stored in geo_surveySummary
# Get the URL for the most recent geomorphology survey
print(max(geo_py['geo_surveySummary']['dataFilePath'][
    geo_py['geo_surveySummary']['surveyBoutTypeID']=="geomorphology"]))

# Copy and paste the URL to your browser to retrieve the data package

Higher-Level Hydrologic Data Products: Net Surface-Atmosphere Exchange (Eddy Covariance)

The net surface-atmosphere exchange data products are available for all terrestrial sites and are bundled together in a single Level 4 data product: Bundled data products - eddy covariance (DP4.00200.001). The data packages do not contain comma-separated tabular data. Rather, the data and metadata are stored as HDF5 files.

To download and view bundled eddy covariance data, you cannot use the standard loadByProduct() (R) or load_by_product() (Python) function. Instead, combine two other functions from the NEON Utilities packages: one to download the zipped files and one to stack them.

R

# Download bundled eddy covariance data for a single water year
neonUtilities::zipsByProduct(dpID="DP4.00200.001",
                             site="ORNL",
                             startdate="2023-10",
                             enddate="2024-09",
                             release="RELEASE-2026",
                             package='basic',
                             check.size = F,
                             token=Sys.getenv("NEON_PAT"))
# Stack the data download, parse to data frames and read into environment
# defaults to stacking only L4 products
sae_r <- neonUtilities::stackEddy(filepath = "filesToStack00200")

# The data are stored by site name - print the header of the 'ORNL' table
head(sae_r$ORNL)

Python

# Download bundled eddy covariance data for a single water year
nu.zips_by_product(dpid="DP4.00200.001",
                   site="ORNL",
                   startdate="2023-10",
                   enddate="2024-09",
                   release="RELEASE-2026",
                   package="basic",
                   check_size=False,
                   token=os.environ.get("NEON_PAT"))
# Stack the data download, parse to data frames and read into environment
# defaults to stacking only L4 products
sae_py = nu.stack_eddy(filepath = "filesToStack00200")

# The data are stored by site name - print the header of the 'ORNL' table
print(sae_py["ORNL"].head())

Higher-Level Hydrologic Data Products: Canopy Water Indices - Mosaic

The Canopy water indices - mosaic (DP3.30019.001) is a Level 3 (spatially-interpolated) data product published from the Airborne Observation Platform (AOP) subsystem.

Remote sensing data products are large, but NEON has developed many tools to aid users in downloading and interpreting AOP data. We will not download AOP data in this exercise. Rather, follow the links below for guides on downloading AOP data in R, Python, and Google Earth Engine (GEE).

  • Download and Explore NEON Data.
    • Directly links to ‘Download remote sensing data: byFileAOP() and byTileAOP()’ section.
  • Intro to AOP Data in Google Earth Engine (GEE) Tutorial Series.
  • Understanding AOP Data Releases and Best Practices for AOP Data Management.

Merge & Visualize: The Water Cycle at Co-Located Sites

In this exercise, we will merge three hydrologic data products, each representing a different component of the water cycle at NEON co-located sites:

  • Precipitation - tipping bucket (DP1.00045.001).
  • Elevation of groundwater (DP1.20100.001).
  • Continuous discharge (DP4.00130.001).

We will download each data product, identify how each product can be related, then merge the three data streams into a single data frame.

Download Data Products

This time, we will download the basic download packages.

R

# Download precipitation data for a single water year
ptp_r <- neonUtilities::loadByProduct(dpID="DP1.00045.001",
                                      site="ORNL", # Terrestrial data product
                                      startdate="2023-10",
                                      enddate="2024-09",
                                      release="RELEASE-2026",
                                      package='basic',
                                      check.size = F,
                                      token=Sys.getenv("NEON_PAT"))

# Download groundwater elevation data for a single water year
egw_r <- neonUtilities::loadByProduct(dpID="DP1.20100.001",
                                      site="WALK", # Aquatic data product
                                      startdate="2023-10",
                                      enddate="2024-09",
                                      release="RELEASE-2026",
                                      package='basic',
                                      check.size = F,
                                      token=Sys.getenv("NEON_PAT"))

# Download discharge data for a single water year
csd_r <- neonUtilities::loadByProduct(dpID="DP4.00130.001",
                                      site="WALK", # Aquatic data product
                                      startdate="2023-10",
                                      enddate="2024-09",
                                      release="RELEASE-2026",
                                      package='basic',
                                      check.size = F,
                                      token=Sys.getenv("NEON_PAT"))

Python

# Download precipitation data for a single water year
ptp_py = nu.load_by_product(dpid="DP1.00045.001",
                            site="ORNL", # Terrestrial data product
                            startdate="2023-10",
                            enddate="2024-09",
                            release="RELEASE-2026",
                            package='basic',
                            check_size=False,
                            token=os.environ.get("NEON_PAT"))

# Download groundwater elevation data for a single water year
egw_py = nu.load_by_product(dpid="DP1.20100.001",
                            site="WALK", # Aquatic data product
                            startdate="2023-10",
                            enddate="2024-09",
                            release="RELEASE-2026",
                            package='basic',
                            check_size=False,
                            token=os.environ.get("NEON_PAT"))

# Download discharge data for a single water year
csd_py = nu.load_by_product(dpid="DP4.00130.001",
                            site="WALK", # Aquatic data product
                            startdate="2023-10",
                            enddate="2024-09",
                            release="RELEASE-2026",
                            package='basic',
                            check_size=False,
                            token=os.environ.get("NEON_PAT"))

Identify Relational Data & Merge

Because NEON data products follow standardized spatial and temporal designs, these three instrumented data products can be related and merged with little effort.

Temporal Relationships

  • All three data products have the same temporal structure. They all have the columns startDateTime and endDateTime in the data tables, and the columns are all formatted the same in published data: YYYY-MM-DD HH:MM:SS (UTC).
  • All three data products are published at a similar temporal frequency. The precipitation and groundwater elevation products are each published at a 30-min resolution and the discharge product is published at a 15-min resolution.
  • Therefore, merging the three tables by one of the datetime columns will ensure the data will be temporally related.
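
The effect of merging a 30-min series with a 15-min series can be previewed on a few made-up rows. This is a minimal sketch using pandas with invented values: timestamps that exist only in the 15-min series get missing values in the 30-min column after an outer merge.

```python
import pandas as pd

# Made-up values on a 30-min grid (like precipitation)
precip = pd.DataFrame({
    "endDateTime": pd.date_range("2023-10-01 00:30", periods=2,
                                 freq="30min", tz="UTC"),
    "precipBulk": [0.0, 1.2],
})
# Made-up values on a 15-min grid (like discharge)
discharge = pd.DataFrame({
    "endDateTime": pd.date_range("2023-10-01 00:15", periods=4,
                                 freq="15min", tz="UTC"),
    "dischargeContinuous": [10.0, 10.5, 11.0, 12.0],
})

# Outer merge keeps every timestamp from both series
merged = pd.merge(precip, discharge, on="endDateTime", how="outer")
merged = merged.sort_values("endDateTime")
print(merged)

# Keep only the timestamps shared with the 30-min grid
on_30min_grid = merged.dropna(subset=["precipBulk"])
```

Dropping rows with a missing 30-min value is one way to put all series on a common grid before plotting; whether that suits a given analysis is a judgment call.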

Spatial Relationships

  • For this pair of co-located sites, the precipitation and discharge data products are published at one location, but the groundwater elevation data product is published at multiple locations.
  • For this exercise, we will use the groundwater well location that is closest to the discharge location. This can be determined by comparing sensor location coordinates in the sensor_positions file included in each data download.

R

# In this download, there are 3 well locations that publish elevation
# There is only 1 location for discharge
# Use `geosphere` to identify which well location is closest to discharge
egw_coords <- egw_r$sensor_positions_20100%>%
  dplyr::distinct(locationReferenceLatitude,locationReferenceLongitude,
                  .keep_all = T)
csd_coords <- csd_r$sensor_positions_00130
dist <- geosphere::distHaversine(egw_coords[,c('locationReferenceLongitude',
                                               'locationReferenceLatitude')],
                                 csd_coords[,c('locationReferenceLongitude',
                                               'locationReferenceLatitude')])

# Which well is closest to the discharge location (horizontal position - HOR)?
close_loc <- egw_coords$HOR.VER[which.min(dist)]

# Let's use only the data from the closest well (subset by HOR)
egw_df <- egw_r$EOG_30_min[
  egw_r$EOG_30_min$horizontalPosition==substr(close_loc,1,3), # First 3 digits = HOR
]

# Merge 3 data streams into a single data frame
# Keep the relevant data needed to plot timeseries and examine relationships
ptp_df <- ptp_r$TIPPRE_30min%>%
  dplyr::select(endDateTime,precipBulk,finalQF)
egw_df <- egw_df%>%
  dplyr::select(endDateTime,groundwaterElevMean,gWatElevFinalQF)
csd_df <- csd_r$csd_15_min%>%
  dplyr::select(endDateTime,dischargeContinuous,dischargeFinalQF)

# Join on the shared datetime column, then sort chronologically
wc_df <- dplyr::full_join(ptp_df,egw_df,by="endDateTime")
wc_df <- dplyr::full_join(wc_df,csd_df,by="endDateTime")
wc_df <- wc_df[order(wc_df$endDateTime),]

Python

# In this download, there are 3 well locations that publish elevation
# There is only 1 location for discharge
# Use `geopy` to identify which well location is closest to discharge
egw_coords = egw_py['sensor_positions_20100'].drop_duplicates(
    subset=['locationReferenceLatitude', 'locationReferenceLongitude'])
csd_coords = csd_py['sensor_positions_00130']
csd_point = (csd_coords.iloc[0]['locationReferenceLatitude'],
             csd_coords.iloc[0]['locationReferenceLongitude'])
dist = egw_coords.apply(
    lambda row: geodesic((row['locationReferenceLatitude'],
                          row['locationReferenceLongitude']),
                         csd_point).meters,
    axis=1)

# Which well is closest to the discharge location (horizontal position - HOR)?
close_loc = egw_coords.loc[dist.idxmin(), 'HOR.VER']

# Let's use only the data from the closest well (subset by HOR)
egw_df = egw_py['EOG_30_min'][
    egw_py['EOG_30_min']['horizontalPosition'] == str(close_loc)[:3]]  # First 3 digits = HOR

# Merge 3 data streams into a single data frame
# Keep the relevant data needed to plot timeseries and examine relationships
ptp_df = ptp_py['TIPPRE_30min'][['endDateTime', 'precipBulk', 'finalQF']]
egw_df = egw_df[['endDateTime', 'groundwaterElevMean', 'gWatElevFinalQF']]
csd_df = csd_py['csd_15_min'][['endDateTime', 'dischargeContinuous', 'dischargeFinalQF']]

# Join on the shared datetime column, then sort chronologically
wc_df = pd.merge(ptp_df, egw_df, on='endDateTime', how='outer')
wc_df = pd.merge(wc_df, csd_df, on='endDateTime', how='outer')
wc_df = wc_df.sort_values('endDateTime')
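
The quality flag columns kept in the merge can be used to screen out flagged records before examining relationships. The sketch below assumes the NEON convention that a final quality flag of 1 marks a record that failed quality checks and 0 marks a pass; `mask_flagged` is a hypothetical helper and the rows are made up.

```python
import numpy as np
import pandas as pd

def mask_flagged(df, value_col, qf_col):
    """Return a copy of df with value_col set to NaN wherever qf_col equals 1."""
    out = df.copy()
    out.loc[out[qf_col] == 1, value_col] = np.nan
    return out

# Made-up rows using two of the column names selected above
demo = pd.DataFrame({
    "precipBulk": [0.0, 2.5, 0.3],
    "finalQF": [0, 1, 0],
})
clean = mask_flagged(demo, "precipBulk", "finalQF")
print(clean)
```

The same helper could be applied to each value and flag pair in the merged table; whether to mask or keep flagged records depends on the analysis.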

Plot & Download Merged Interactive Timeseries

Now, let’s plot the three data streams in a single plotting field. We will use the plotly package to give us the ability to interact with the plot. Check your current working directory for the HTML file containing the plot.

R

# Format each y-axis
y1 <- list(side='left',
           automargin=T,
           title="Discharge (L s-1)",
           tickfont=list(size=16),
           titlefont=list(size=18),
           showgrid=F,
           zeroline=F)
y2 <- list(side='right',
           overlaying="y",
           automargin=T,
           title="Groundwater Elevation (m)",
           tickfont=list(size=16,color = '#CC79A7'),
           titlefont=list(size=18,color = '#CC79A7'),
           showgrid=F,
           zeroline=F)
y3 <- list(side='right',
           overlaying="y",
           automargin=T,
           title="Precipitation (mm)",
           tickfont=list(size=16,color = "#0072B2"),
           titlefont=list(size=18,color = "#0072B2"),
           showgrid=F,
           zeroline=F,
           anchor="free",
           position=0.98)

# Build plot layout
ts <- plotly::plot_ly(data=wc_df)%>%
  plotly::layout(
    yaxis = y1, yaxis2 = y2, yaxis3 = y3,
    xaxis=list(domain=c(0,.9),
               nticks=14,
               automargin=T,
               title="Date",
               tickfont=list(size=16),
               titlefont=list(size=18)),
    legend=list(orientation = "h",
                y=-0.15,
                font=list(size=14)),
    updatemenus=list(
      list(
        type='buttons',
        showactive=FALSE,
        buttons=base::list(
          list(label='Scale Discharge\n- Linear -',
               method='relayout',
               args=list(list(yaxis=list(type='linear',
                                         title="Discharge (L s-1)",
                                         tickfont=list(size=16),
                                         titlefont=list(size=18),
                                         showgrid=F,
                                         zeroline=F)))),
          list(label='Scale Discharge\n- Log -',
               method='relayout',
               args=list(list(yaxis=list(type='log',
                                         title="Discharge (L s-1) - log",
                                         tickfont=list(size=16),
                                         titlefont=list(size=18),
                                         showgrid=F,
                                         zeroline=F))))))))

# Plot traces
ts <- ts%>%
  # H and Q Series
  plotly::add_trace(x=~endDateTime,y=~dischargeContinuous,
                    name="Discharge",type='scatter',mode='lines',
                    line = list(color = "black"))%>%
  plotly::add_trace(x=~endDateTime,y=~groundwaterElevMean,
                    yaxis="y2",name="GW Elevation",type='scatter',mode='lines',
                    line = list(color = '#CC79A7'))%>%
  plotly::add_trace(x=~endDateTime,y=~precipBulk,
                    yaxis="y3",name="Precipitation",type='scatter',mode='lines',
                    line = list(color = '#0072B2'))

# Save the interactive plot as an HTML file in the working directory
htmlwidgets::saveWidget(plotly::as_widget(ts), "NEON.D07.P.H.Q.WY2024.html")

Python

# Create figure
wc_df_ts = wc_df.dropna(subset=['precipBulk'])
fig = go.Figure()

# Format axes and layout
fig.update_layout(
    margin=dict(l=90, r=110, t=40, b=60),

    xaxis=dict(
        domain=[0, 0.9],
        tickmode='auto',
        nticks=14,
        automargin=True,
        title="Date",
        tickfont=dict(size=16)
    ),

    yaxis=dict(
        title="Discharge (L s-1)",
        tickfont=dict(size=16),
        showgrid=False,
        zeroline=False
    ),

    yaxis2=dict(
        title="Groundwater Elevation (m)",
        tickfont=dict(size=16, color='#CC79A7'),
        showgrid=False,
        zeroline=False,
        overlaying='y',
        side='right'
    ),

    yaxis3=dict(
        title="Precipitation (mm)",
        tickfont=dict(size=16, color='#0072B2'),
        showgrid=False,
        zeroline=False,
        overlaying='y',
        side='right',
        anchor='free',
        position=0.98
    ),

    legend=dict(
        orientation="h",
        y=-0.15,
        font=dict(size=14)
    ),

    # Buttons that switch the discharge axis between linear and log scale
    updatemenus=[
        dict(
            type='buttons',
            showactive=False,
            buttons=[
                dict(
                    label='Scale Discharge\n- Linear -',
                    method='relayout',
                    args=[{
                        'yaxis.type': 'linear',
                        'yaxis.title': "Discharge (L s-1)"
                    }]
                ),
                dict(
                    label='Scale Discharge\n- Log -',
                    method='relayout',
                    args=[{
                        'yaxis.type': 'log',
                        'yaxis.title': "Discharge (L s-1) - log"
                    }]
                )
            ]
        )
    ]
)

# Add traces
fig.add_trace(go.Scatter(
    x=wc_df_ts['endDateTime'],
    y=wc_df_ts['dischargeContinuous'],
    mode='lines',
    name='Discharge',
    line=dict(color='black')
))

fig.add_trace(go.Scatter(
    x=wc_df_ts['endDateTime'],
    y=wc_df_ts['groundwaterElevMean'],
    mode='lines',
    name='GW Elevation',
    line=dict(color='#CC79A7'),
    yaxis='y2'
))

fig.add_trace(go.Scatter(
    x=wc_df_ts['endDateTime'],
    y=wc_df_ts['precipBulk'],
    mode='lines',
    name='Precipitation',
    line=dict(color='#0072B2'),
    yaxis='y3'
))

fig.write_html("NEON.D07.P.H.Q.WY2024.html")
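
To confirm where the plot was saved, a small helper along these lines may be useful. The function name `locate_output` is hypothetical (it is not part of plotly or the tutorial); it builds the expected path from the current working directory and reports whether the file exists.

```python
from pathlib import Path


def locate_output(filename: str) -> tuple[Path, bool]:
    """Return the absolute path of `filename` in the current working
    directory, plus a flag saying whether the file exists there."""
    path = Path.cwd() / filename
    return path, path.exists()


# File name taken from the plotting code; run that code first
path, found = locate_output("NEON.D07.P.H.Q.WY2024.html")
status = "found" if found else "not found (run the plotting code first)"
print(f"{path} -> {status}")
```

If the file is found, opening the printed path in a web browser shows the interactive plot.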

Further Exploration: Cumulative Precipitation & Discharge

R

# Plot cumulative precipitation & discharge together using ggplot with 2 y-axes
wc_df_subset <- wc_df%>%
  filter(!is.na(precipBulk))
wc_df_subset$cumulativeP <- cumsum(wc_df_subset$precipBulk)
wc_df_subset$cumulativeQ <- cumsum(wc_df_subset$dischargeContinuous)
cumsum <- wc_df_subset%>%
  ggplot(aes(x = endDateTime)) +
  geom_smooth(aes(y = cumulativeP), method="loess", color = "#0072B2") +
  geom_smooth(aes(y = cumulativeQ/150), method="loess", color = "black") +
  scale_y_continuous(
    name = "Cumulative Precipitation (mm)",
    sec.axis = sec_axis(~ .*150, name = "Cumulative Discharge (L s-1)")
  ) +
  labs(x = "Date") +
  theme_minimal() +
  theme(
    axis.title.y.left = element_text(color = "#0072B2", size = 14),
    axis.title.y.right = element_text(color = "black", size = 14)
  )
cumsum

Python

# Plot cumulative precipitation & discharge together using matplotlib with 2 y-axes
wc_df_subset = wc_df[~wc_df['precipBulk'].isna()].copy()
wc_df_subset['cumulativeP'] = wc_df_subset['precipBulk'].cumsum()
wc_df_subset['cumulativeQ'] = wc_df_subset['dischargeContinuous'].cumsum()
x = wc_df_subset['endDateTime']
x_numeric = x.astype(np.int64)  # nanoseconds since epoch
lowess = sm.nonparametric.lowess
smoothP = lowess(wc_df_subset['cumulativeP'], x_numeric, frac=0.3)
# No rescaling by 150 here: the twin axis (ax2) has its own y-scale
smoothQ = lowess(wc_df_subset['cumulativeQ'], x_numeric, frac=0.3)
fig, ax1 = plt.subplots(figsize=(10, 6))
ax1.plot(x, smoothP[:,1], color='#0072B2', label='Cumulative Precipitation (mm) – LOESS')
ax1.set_xlabel('Date', fontsize=12)
ax1.set_ylabel('Cumulative Precipitation (mm)', color='#0072B2', fontsize=12)
ax1.tick_params(axis='y', labelcolor='#0072B2')
ax2 = ax1.twinx()
ax2.plot(x, smoothQ[:,1], color='black', label='Cumulative Discharge (L s-1) – LOESS')
ax2.set_ylabel('Cumulative Discharge (L s-1)', color='black', fontsize=12)
ax2.tick_params(axis='y', labelcolor='black')
fig.tight_layout()
plt.show()
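
Note that a running sum of discharge readings in L s-1 is a sum of rates, not a water volume. To express cumulative discharge as a volume, each reading can be multiplied by the number of seconds it represents before summing. The sketch below assumes evenly spaced 30-minute readings with made-up values; the column names `volume_L` and `cumulativeVolume_L` are hypothetical, and the actual time step should be checked against the data.

```python
import pandas as pd

# Hypothetical 30-minute discharge readings (L s-1); illustrative values only
q = pd.DataFrame({
    "endDateTime": pd.date_range("2024-01-01 00:30", periods=4, freq="30min"),
    "dischargeContinuous": [10.0, 20.0, 20.0, 10.0],
})

# Seconds covered by each reading. The first row has no predecessor, so it is
# assumed to span the same interval as the next row (an assumption).
step_s = q["endDateTime"].diff().dt.total_seconds().bfill()

# Volume per interval (L) and running total (L)
q["volume_L"] = q["dischargeContinuous"] * step_s
q["cumulativeVolume_L"] = q["volume_L"].cumsum()
print(q)
```

With a 1800-second step, the four readings contribute 18,000, 36,000, 36,000, and 18,000 L, for a total of 108,000 L.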

Further Exploration: Correlation of Groundwater Elevation & Discharge

R

# Plot scatterplots of one variable to another to assess correlation
# Create a continuous color scale by date to add the time-of-year dimension
corr <- wc_df %>%
  ggplot(aes(x = groundwaterElevMean, y = dischargeContinuous, color = as.integer(endDateTime))) +
  geom_point(aes(color = as.Date(endDateTime))) +
  scale_color_date(low="blue",high="darkorange") +
  labs(x = "Groundwater Elevation (m)", y = "Discharge (L s-1)", color = "Date") +
  theme_minimal()
corr

Python

# Plot scatterplots of one variable to another to assess correlation
# Create a continuous color scale by date to add the time-of-year dimension
dates = pd.to_datetime(wc_df['endDateTime'])
date_nums = mdates.date2num(dates)
fig, ax = plt.subplots(figsize=(10, 6))
scatter = ax.scatter(
    wc_df['groundwaterElevMean'],
    wc_df['dischargeContinuous'],
    c=date_nums,
    cmap='cool',
    alpha=0.6
)
ax.set_xlabel('Groundwater Elevation (m)', fontsize=12)
ax.set_ylabel('Discharge (L s-1)', fontsize=12)
cbar = plt.colorbar(scatter, ax=ax)
cbar.set_label('Date', fontsize=12)
tick_locs = cbar.get_ticks()
cbar.ax.set_yticklabels([mdates.num2date(t).strftime('%Y-%m-%d') for t in tick_locs])
plt.tight_layout()
plt.show()
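
The scatterplot gives a visual impression of the relationship; a correlation coefficient puts a number on it. The sketch below uses made-up values in place of `wc_df` and computes the Pearson coefficient (linear association) and a Spearman coefficient (monotonic association, computed here as the Pearson correlation of the ranks). Rows with a missing value in either column are dropped first.

```python
import pandas as pd

# Hypothetical stand-in for wc_df; illustrative values only
df = pd.DataFrame({
    "groundwaterElevMean": [1.0, 1.1, 1.2, 1.3, None],
    "dischargeContinuous": [10.0, 12.0, 20.0, 50.0, 60.0],
})

# Keep only rows where both variables are present
pair = df[["groundwaterElevMean", "dischargeContinuous"]].dropna()

# Pearson: strength of the linear relationship
pearson = pair["groundwaterElevMean"].corr(pair["dischargeContinuous"])

# Spearman: Pearson correlation of the ranks (monotonic relationship)
spearman = pair["groundwaterElevMean"].rank().corr(
    pair["dischargeContinuous"].rank()
)

print(f"Pearson r = {pearson:.3f}, Spearman rho = {spearman:.3f}")
```

In this toy example discharge rises with groundwater elevation but not linearly, so the Spearman coefficient is higher than the Pearson coefficient.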

Questions?

If you have questions or comments on this content, please contact us.

Copyright © Battelle, 2026

The National Ecological Observatory Network is a major facility fully funded by the U.S. National Science Foundation.

Any opinions, findings and conclusions or recommendations expressed in this material do not necessarily reflect the views of the U.S. National Science Foundation.