Tutorial
Introduction to working with NEON eddy flux data
Authors: Claire K. Lunch
Last Updated: Feb 17, 2026
This data tutorial provides an introduction to working with NEON eddy
flux data, using the neonUtilities R package or the
neonutilities Python package. If you are new to NEON data,
we recommend starting with a more general tutorial, such as the
neonUtilities tutorial or the Download and Explore tutorial.
Some of the functions and techniques described
in those tutorials will be used here, as well as functions and data
formats that are unique to the eddy flux system.
This tutorial assumes general familiarity with eddy flux data and associated concepts.
1. Setup and data download
Start by installing and loading packages and setting options.
R
To work with the NEON flux data, we need the rhdf5
package, which is hosted on Bioconductor, and requires a different
installation process than CRAN packages:
install.packages('BiocManager')
BiocManager::install('rhdf5')
install.packages('neonUtilities')
install.packages('ggplot2')
Now load packages.
library(neonUtilities)
library(ggplot2)
Python
The necessary helper packages should be installed automatically when
we install neonutilities (you may need to install
matplotlib separately for the plots below). We’ll also use
several modules that come with the Python standard library.
pip install neonutilities
Now load packages.
import neonutilities as nu
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from datetime import datetime
import os
Download data
R
Use the zipsByProduct() function from the
neonUtilities package to download flux data from two sites
and two months. The transformations and functions below will work on any
time range and site(s), but two sites and two months allows us to see
all the available functionality while minimizing download size.
Inputs to the zipsByProduct() function for this
tutorial:
- dpID: DP4.00200.001, the bundled eddy covariance product
- package: basic (the expanded package is not covered in this tutorial)
- site: NIWO = Niwot Ridge and HARV = Harvard Forest
- startdate: 2018-06 (both dates are inclusive)
- enddate: 2018-07 (both dates are inclusive)
- savepath: path to download to; here we use the working directory
- check.size: T if you want to see file size before downloading, otherwise F
The download may take a while, especially if you’re on a slow network. For faster downloads, consider using an API token.
zipsByProduct(dpID="DP4.00200.001", package="basic",
site=c("NIWO", "HARV"),
startdate="2018-06", enddate="2018-07",
savepath=getwd(),
check.size=F)
Python
Use the zips_by_product() function from the
neonutilities package to download flux data from two sites
and two months. The transformations and functions below will work on any
time range and site(s), but two sites and two months allows us to see
all the available functionality while minimizing download size.
Inputs to the zips_by_product() function for this
tutorial:
- dpid: DP4.00200.001, the bundled eddy covariance product
- package: basic (the expanded package is not covered in this tutorial)
- site: NIWO = Niwot Ridge and HARV = Harvard Forest
- startdate: 2018-06 (both dates are inclusive)
- enddate: 2018-07 (both dates are inclusive)
- savepath: path to download to; here we use the working directory
- check_size: True if you want to see file size before downloading, otherwise False
The download may take a while, especially if you’re on a slow network. For faster downloads, consider using an API token.
nu.zips_by_product(dpid="DP4.00200.001", package="basic",
site=["NIWO","HARV"],
startdate="2018-06", enddate="2018-07",
savepath=os.getcwd(),
check_size=False)
2. Data Levels
There are five levels of data contained in the eddy flux bundle. For full details, refer to the NEON algorithm document.
Briefly, the data levels are:
- Level 0’ (dp0p): Calibrated raw observations
- Level 1 (dp01): Time-aggregated observations, e.g. 30-minute mean gas concentrations
- Level 2 (dp02): Time-interpolated data, e.g. rate of change of a gas concentration
- Level 3 (dp03): Spatially interpolated data, i.e. vertical profiles
- Level 4 (dp04): Fluxes
The dp0p data are available in the expanded data package and are beyond the scope of this tutorial.
The dp02 and dp03 data are used in storage calculations, and the dp04 data include both the storage and turbulent components. Since many users will want to focus on the net flux data, we’ll start there.
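In the dp04 tables (extracted in the next section), the net surface-atmosphere exchange (nsae) columns are the sum of the storage (stor) and turbulent (turb) columns. A minimal pandas sketch with made-up values of the same shape as the NIWO table:

```python
import pandas as pd

# Hypothetical values, similar in magnitude to the NIWO dp04 CO2 fluxes.
# nsae (net surface-atmosphere exchange) = stor (storage) + turb (turbulent)
dp04 = pd.DataFrame({
    "data.fluxCo2.stor.flux": [-0.0635, 0.0975],
    "data.fluxCo2.turb.flux": [0.2531, 0.8455],
})
dp04["data.fluxCo2.nsae.flux"] = (dp04["data.fluxCo2.stor.flux"]
                                  + dp04["data.fluxCo2.turb.flux"])
print(dp04["data.fluxCo2.nsae.flux"].round(4).tolist())
```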
3. Extract Level 4 data (Fluxes!)
R
To extract the Level 4 data from the HDF5 files and merge them into a
single table, we’ll use the stackEddy() function from the
neonUtilities package.
stackEddy() requires two inputs:
filepath: Path to a file or folder, which can be any one of:
- A zip file of eddy flux data downloaded from the NEON data portal
- A folder of eddy flux data downloaded by the zipsByProduct() function
- The folder of files resulting from unzipping either of the above
- One or more HDF5 files of NEON eddy flux data

level: dp01, dp02, dp03, or dp04
Input the filepath you downloaded to using
zipsByProduct() earlier, including the
filesToStack00200 folder created by the function, and
dp04:
flux <- stackEddy(filepath=paste0(getwd(), "/filesToStack00200"),
level="dp04")
## Extracting data
## Joining data variables by file
## Stacking files by month
## Getting metadata tables
We now have an object called flux. It’s a named list
containing a data table for each site, along with
variables, objDesc, issueLog, and citation tables.
names(flux)
## [1] "HARV" "NIWO"
## [3] "variables" "objDesc"
## [5] "issueLog" "citation_00200_RELEASE-2026"
Let’s look at the contents of one of the site data files:
head(flux$NIWO)
## timeBgn timeEnd data.fluxCo2.nsae.flux
## 1 2018-06-01 00:00:00 2018-06-01 00:29:59 0.1895604
## 2 2018-06-01 00:30:00 2018-06-01 00:59:59 0.9429929
## 3 2018-06-01 01:00:00 2018-06-01 01:29:59 0.5101887
## 4 2018-06-01 01:30:00 2018-06-01 01:59:59 0.7980803
## 5 2018-06-01 02:00:00 2018-06-01 02:29:59 0.4345524
## 6 2018-06-01 02:30:00 2018-06-01 02:59:59 0.8440311
## data.fluxCo2.stor.flux data.fluxCo2.turb.flux data.fluxH2o.nsae.flux
## 1 -0.06349451 0.2530549 16.428558
## 2 0.09749592 0.8454970 7.965026
## 3 0.02525844 0.4849303 5.387009
## 4 0.23341588 0.5646644 9.933795
## 5 0.17415575 0.2603966 2.963931
## 6 -0.01496958 0.8590007 4.488221
## data.fluxH2o.stor.flux data.fluxH2o.turb.flux data.fluxMome.turb.veloFric
## 1 3.34458774 13.083971 0.2025301
## 2 -1.51509573 9.480122 0.1922362
## 3 -4.26233930 9.649348 0.1197608
## 4 0.15083070 9.782964 0.1190342
## 5 -0.01668207 2.980613 0.1589262
## 6 -0.52273792 5.010959 0.1115558
## data.fluxTemp.nsae.flux data.fluxTemp.stor.flux data.fluxTemp.turb.flux
## 1 4.7823169 -1.4575674 6.2398843
## 2 -0.5981044 0.3404014 -0.9385058
## 3 -4.3428949 0.1870749 -4.5299698
## 4 -12.6516137 -2.4904875 -10.1611262
## 5 -5.0544306 -0.7514722 -4.3029584
## 6 -7.6999704 -1.9047428 -5.7952276
## data.foot.stat.angZaxsErth data.foot.stat.distReso
## 1 94.2262 8.34
## 2 355.4252 8.34
## 3 359.8013 8.34
## 4 137.7743 8.34
## 5 188.4800 8.34
## 6 183.1920 8.34
## data.foot.stat.veloYaxsHorSd data.foot.stat.veloZaxsHorSd
## 1 0.7955893 0.2713232
## 2 0.8590177 0.2300000
## 3 1.2601763 0.2300000
## 4 0.7332641 0.2300000
## 5 0.7096286 0.2300000
## 6 0.3789859 0.2300000
## data.foot.stat.veloFric data.foot.stat.distZaxsMeasDisp
## 1 0.2025428 8.34
## 2 0.2000000 8.34
## 3 0.2000000 8.34
## 4 0.2000000 8.34
## 5 0.2000000 8.34
## 6 0.2000000 8.34
## data.foot.stat.distZaxsRgh data.foot.stat.distObkv data.foot.stat.paraStbl
## 1 0.04083968 -72.43385 -0.115139545
## 2 0.29601180 965.24689 0.008640276
## 3 0.23002973 27.47032 0.303600430
## 4 0.83400000 10.29068 0.810442357
## 5 0.83400000 58.07739 0.143601482
## 6 0.83400000 15.03590 0.554672317
## data.foot.stat.distZaxsAbl data.foot.stat.distXaxs90
## 1 1000 325.26
## 2 1000 258.54
## 3 1000 275.22
## 4 1000 208.50
## 5 1000 208.50
## 6 1000 208.50
## data.foot.stat.distXaxsMax data.foot.stat.distYaxs90 qfqm.fluxCo2.nsae.qfFinl
## 1 133.44 25.02 1
## 2 108.42 50.04 1
## 3 108.42 75.06 1
## 4 83.40 75.06 1
## 5 83.40 66.72 1
## 6 83.40 41.70 1
## qfqm.fluxCo2.stor.qfFinl qfqm.fluxCo2.turb.qfFinl qfqm.fluxH2o.nsae.qfFinl
## 1 1 1 1
## 2 1 1 1
## 3 1 1 1
## 4 1 1 1
## 5 1 1 1
## 6 1 1 1
## qfqm.fluxH2o.stor.qfFinl qfqm.fluxH2o.turb.qfFinl qfqm.fluxMome.turb.qfFinl
## 1 1 1 0
## 2 0 1 0
## 3 0 1 1
## 4 0 1 1
## 5 0 1 0
## 6 0 1 0
## qfqm.fluxTemp.nsae.qfFinl qfqm.fluxTemp.stor.qfFinl qfqm.fluxTemp.turb.qfFinl
## 1 1 0 1
## 2 1 0 1
## 3 1 0 1
## 4 1 0 1
## 5 1 0 1
## 6 1 0 1
## qfqm.foot.turb.qfFinl
## 1 0
## 2 0
## 3 0
## 4 0
## 5 0
## 6 0
The variables and objDesc tables can help
you interpret the column headers in the data table. The
objDesc table contains definitions for many of the terms
used in the eddy flux data product, but it isn’t complete. To get the
terms of interest, we’ll break up the column headers into individual
terms and look for them in the objDesc table:
term <- unlist(strsplit(names(flux$NIWO), split=".", fixed=T))
flux$objDesc[which(flux$objDesc$X...Object %in% term),]
## X...Object
## 22 angZaxsErth
## 84 data
## 91 distReso
## 92 distXaxs90
## 93 distXaxsMax
## 94 distYaxs90
## 95 distZaxsAbl
## 100 distZaxsMeasDisp
## 101 distZaxsRgh
## 125 flux
## 126 fluxCo2
## 128 fluxH2o
## 130 fluxMome
## 132 fluxTemp
## 133 foot
## 200 nsae
## 244 qfFinl
## 274 qfqm
## 451 stat
## 453 stor
## 474 timeBgn
## 475 timeEnd
## 477 turb
## 505 veloFric
## 512 veloYaxsHorSd
## 515 veloZaxsHorSd
## Description
## 22 Wind direction
## 84 Represents data fields
## 91 Resolution of vector of distances (units = m)
## 92 Distance of 90% contriubtion of footprint aligned with dominant wind direction
## 93 Max distance of footprint contriubtion aligned with dominant wind direction
## 94 Distance of 90% contriubtion of footprint aligned with direction that is perpendicular to dominant wind direction
## 95 Atmospheric boundary layer height used in flux footprint calculations
## 100 Measurement height minus displacement height
## 101 Roughness length used in flux footprint calculation
## 125 General term for flux
## 126 CO2 flux (can be turbulent, storage, or nsae)
## 128 H2O flux (can be turbulent, storage, or nsae) - relates to latent heat flux
## 130 Momentum flux (can be turbulent, storage, or nsae)
## 132 Temperature flux (can be turbulent, storage, or nsae) - relates to sensible heat flux
## 133 Footprint
## 200 Net surface atmospher exchange (turbulent + storage fluxes)
## 244 The final quality flag indicating if the data are valid for the given aggregation period (1=fail, 0=pass)
## 274 Quality flag and quality metrics, represents quality flags and quality metrics that accompany the provided data
## 451 Statistics
## 453 Storage
## 474 The beginning time of the aggregation period
## 475 The end time of the aggregation period
## 477 Turbulent
## 505 Friction velocity/ustar (units = m s-1)
## 512 Standard deviation of horizontal wind speed in Y direction (units = m s-1)
## 515 Standard deviation of horizontal wind speed in X direction (units = m s-1)
For the terms that aren’t captured here, fluxCo2,
fluxH2o, and fluxTemp are self-explanatory.
The flux components are
- turb: Turbulent flux
- stor: Storage
- nsae: Net surface-atmosphere exchange
The variables table contains the units for each
field:
flux$variables
## category system variable stat units
## 1 data fluxCo2 nsae timeBgn NA
## 2 data fluxCo2 nsae timeEnd NA
## 3 data fluxCo2 nsae flux umolCo2 m-2 s-1
## 4 data fluxCo2 stor timeBgn NA
## 5 data fluxCo2 stor timeEnd NA
## 6 data fluxCo2 stor flux umolCo2 m-2 s-1
## 7 data fluxCo2 turb timeBgn NA
## 8 data fluxCo2 turb timeEnd NA
## 9 data fluxCo2 turb flux umolCo2 m-2 s-1
## 10 data fluxH2o nsae timeBgn NA
## 11 data fluxH2o nsae timeEnd NA
## 12 data fluxH2o nsae flux W m-2
## 13 data fluxH2o stor timeBgn NA
## 14 data fluxH2o stor timeEnd NA
## 15 data fluxH2o stor flux W m-2
## 16 data fluxH2o turb timeBgn NA
## 17 data fluxH2o turb timeEnd NA
## 18 data fluxH2o turb flux W m-2
## 19 data fluxMome turb timeBgn NA
## 20 data fluxMome turb timeEnd NA
## 21 data fluxMome turb veloFric m s-1
## 22 data fluxTemp nsae timeBgn NA
## 23 data fluxTemp nsae timeEnd NA
## 24 data fluxTemp nsae flux W m-2
## 25 data fluxTemp stor timeBgn NA
## 26 data fluxTemp stor timeEnd NA
## 27 data fluxTemp stor flux W m-2
## 28 data fluxTemp turb timeBgn NA
## 29 data fluxTemp turb timeEnd NA
## 30 data fluxTemp turb flux W m-2
## 31 data foot stat timeBgn NA
## 32 data foot stat timeEnd NA
## 33 data foot stat angZaxsErth deg
## 34 data foot stat distReso m
## 35 data foot stat veloYaxsHorSd m s-1
## 36 data foot stat veloZaxsHorSd m s-1
## 37 data foot stat veloFric m s-1
## 38 data foot stat distZaxsMeasDisp m
## 39 data foot stat distZaxsRgh m
## 40 data foot stat distObkv m
## 41 data foot stat paraStbl -
## 42 data foot stat distZaxsAbl m
## 43 data foot stat distXaxs90 m
## 44 data foot stat distXaxsMax m
## 45 data foot stat distYaxs90 m
## 46 qfqm fluxCo2 nsae timeBgn NA
## 47 qfqm fluxCo2 nsae timeEnd NA
## 48 qfqm fluxCo2 nsae qfFinl NA
## 49 qfqm fluxCo2 stor qfFinl NA
## 50 qfqm fluxCo2 stor timeBgn NA
## 51 qfqm fluxCo2 stor timeEnd NA
## 52 qfqm fluxCo2 turb timeBgn NA
## 53 qfqm fluxCo2 turb timeEnd NA
## 54 qfqm fluxCo2 turb qfFinl NA
## 55 qfqm fluxH2o nsae timeBgn NA
## 56 qfqm fluxH2o nsae timeEnd NA
## 57 qfqm fluxH2o nsae qfFinl NA
## 58 qfqm fluxH2o stor qfFinl NA
## 59 qfqm fluxH2o stor timeBgn NA
## 60 qfqm fluxH2o stor timeEnd NA
## 61 qfqm fluxH2o turb timeBgn NA
## 62 qfqm fluxH2o turb timeEnd NA
## 63 qfqm fluxH2o turb qfFinl NA
## 64 qfqm fluxMome turb timeBgn NA
## 65 qfqm fluxMome turb timeEnd NA
## 66 qfqm fluxMome turb qfFinl NA
## 67 qfqm fluxTemp nsae timeBgn NA
## 68 qfqm fluxTemp nsae timeEnd NA
## 69 qfqm fluxTemp nsae qfFinl NA
## 70 qfqm fluxTemp stor qfFinl NA
## 71 qfqm fluxTemp stor timeBgn NA
## 72 qfqm fluxTemp stor timeEnd NA
## 73 qfqm fluxTemp turb timeBgn NA
## 74 qfqm fluxTemp turb timeEnd NA
## 75 qfqm fluxTemp turb qfFinl NA
## 76 qfqm foot turb timeBgn NA
## 77 qfqm foot turb timeEnd NA
## 78 qfqm foot turb qfFinl NA
Python
To extract the Level 4 data from the HDF5 files and merge them into a
single table, we’ll use the stack_eddy() function from the
neonutilities package.
stack_eddy() requires two inputs:
filepath: Path to a file or folder, which can be any one of:
- A zip file of eddy flux data downloaded from the NEON data portal
- A folder of eddy flux data downloaded by the zips_by_product() function
- The folder of files resulting from unzipping either of the above
- One or more HDF5 files of NEON eddy flux data

level: dp01, dp02, dp03, or dp04
Input the filepath you downloaded to using
zips_by_product() earlier, including the
filesToStack00200 folder created by the function, and
dp04:
flux = nu.stack_eddy(filepath=os.getcwd() + "/filesToStack00200",
level="dp04")
We now have an object called flux. It’s a dictionary
containing a data table for each site, along with
variables, objDesc, issueLog, and citation tables.
flux.keys()
## dict_keys(['HARV', 'NIWO', 'variables', 'objDesc', 'issueLog', 'citation_00200_RELEASE-2026'])
Let’s look at the contents of one of the site data files:
flux['NIWO'].head()
##                      timeBgn  ...  qfqm.foot.turb.qfFinl
## 0  2018-06-01 00:00:00+00:00  ...                      0
## 1  2018-06-01 00:30:00+00:00  ...                      0
## 2  2018-06-01 01:00:00+00:00  ...                      0
## 3  2018-06-01 01:30:00+00:00  ...                      0
## 4  2018-06-01 02:00:00+00:00  ...                      0
##
## [5 rows x 36 columns]
The variables and objDesc tables can help
you interpret the column headers in the data table. The
objDesc table contains definitions for many of the terms
used in the eddy flux data product, but it isn’t complete. To get the
terms of interest, we’ll break up the column headers into individual
terms and look for them in the objDesc table:
termlist = [f.split('.') for f in flux['NIWO'].keys()]
term = [t for sublist in termlist for t in sublist]
objDesc = flux['objDesc']
objDesc[objDesc['Object'].isin(term)]
## Object Description
## 21 angZaxsErth Wind direction
## 83 data Represents data fields
## 90 distReso Resolution of vector of distances (units = m)
## 91 distXaxs90 Distance of 90% contriubtion of footprint alig...
## 92 distXaxsMax Max distance of footprint contriubtion aligned...
## 93 distYaxs90 Distance of 90% contriubtion of footprint alig...
## 94 distZaxsAbl Atmospheric boundary layer height used in flux...
## 99 distZaxsMeasDisp Measurement height minus displacement height
## 100 distZaxsRgh Roughness length used in flux footprint calcul...
## 124 flux General term for flux
## 125 fluxCo2 CO2 flux (can be turbulent, storage, or nsae)
## 127 fluxH2o H2O flux (can be turbulent, storage, or nsae) ...
## 129 fluxMome Momentum flux (can be turbulent, storage, or n...
## 131 fluxTemp Temperature flux (can be turbulent, storage, o...
## 132 foot Footprint
## 199 nsae Net surface atmospher exchange (turbulent + st...
## 243 qfFinl The final quality flag indicating if the data ...
## 273 qfqm Quality flag and quality metrics, represents q...
## 450 stat Statistics
## 452 stor Storage
## 473 timeBgn The beginning time of the aggregation period
## 474 timeEnd The end time of the aggregation period
## 476 turb Turbulent
## 504 veloFric Friction velocity/ustar (units = m s-1)
## 511 veloYaxsHorSd Standard deviation of horizontal wind speed in...
## 514 veloZaxsHorSd Standard deviation of horizontal wind speed in...
For the terms that aren’t captured here, fluxCo2,
fluxH2o, and fluxTemp are self-explanatory.
The flux components are
- turb: Turbulent flux
- stor: Storage
- nsae: Net surface-atmosphere exchange
The variables table contains the units for each
field:
flux['variables']
## category system variable stat units
## 0 data fluxCo2 nsae timeBgn NA
## 1 data fluxCo2 nsae timeEnd NA
## 2 data fluxCo2 nsae flux umolCo2 m-2 s-1
## 3 data fluxCo2 stor timeBgn NA
## 4 data fluxCo2 stor timeEnd NA
## .. ... ... ... ... ...
## 73 qfqm fluxTemp turb timeEnd NA
## 74 qfqm fluxTemp turb qfFinl NA
## 75 qfqm foot turb timeBgn NA
## 76 qfqm foot turb timeEnd NA
## 77 qfqm foot turb qfFinl NA
##
## [78 rows x 5 columns]
Plot fluxes
Let’s plot some data! First, a brief aside about time stamps, since these are time series data.
Time stamps
NEON sensor data come with time stamps for both the start and end of the averaging period. Depending on the analysis you’re doing, you may want to use one or the other; for general plotting, re-formatting, and transformations, I prefer to use the start time, because there are some small inconsistencies between data products in a few of the end time stamps.
Note that all NEON data use UTC time, aka Greenwich
Mean Time. This is true across NEON’s instrumented, observational, and
airborne measurements. When working with NEON data, it’s best to keep
everything in UTC as much as possible, otherwise it’s very easy to end
up with data in mismatched times, which can cause insidious and
hard-to-detect problems. In the code below, time stamps and time zones
have been handled by stackEddy() and
loadByProduct(), so we don’t need to do anything
additional. But if you’re writing your own code and need to convert
times, remember that if the time zone isn’t specified, most software
(including R) will default to the local time zone it detects on your
operating system.
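As a quick sketch of why this matters, here is how a UTC time stamp shifts when converted to local time with pandas (Niwot Ridge is in US Mountain time, America/Denver in the IANA time zone database):

```python
import pandas as pd

# The same instant reads very differently once converted out of UTC.
t_utc = pd.Timestamp("2018-07-07 06:00:00", tz="UTC")
t_local = t_utc.tz_convert("America/Denver")
print(t_utc)    # 2018-07-07 06:00:00+00:00
print(t_local)  # 2018-07-07 00:00:00-06:00
```

The two objects represent the same moment, but a plot labeled with one and data indexed by the other will be off by six hours.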
R
plot(flux$NIWO$data.fluxCo2.nsae.flux~flux$NIWO$timeBgn,
pch=".", xlab="Date", ylab="CO2 flux",
ylim=c(-20,20))
There is a clear diurnal pattern, and an increase in daily carbon uptake as the growing season progresses.
Let’s trim down to just two days of data to see a few other details.
plot(flux$NIWO$data.fluxCo2.nsae.flux~flux$NIWO$timeBgn,
pch=20, xlab="Date", ylab="CO2 flux",
xlim=c(as.POSIXct("2018-07-07", tz="GMT"),
as.POSIXct("2018-07-09", tz="GMT")),
ylim=c(-20,20), xaxt="n")
axis.POSIXct(1, x=flux$NIWO$timeBgn,
format="%Y-%m-%d %H:%M:%S")
Python
plt.scatter(flux['NIWO'].timeBgn,
flux['NIWO']['data.fluxCo2.nsae.flux'],
marker='.')
plt.ylim((-20,20))
## (-20.0, 20.0)
plt.show()
There is a clear diurnal pattern, and an increase in daily carbon uptake as the growing season progresses.
Let’s trim down to just two days of data to see a few other details.
plt.scatter(flux['NIWO'].timeBgn,
flux['NIWO']['data.fluxCo2.nsae.flux'],
marker='.')
plt.ylim((-20,20))
## (-20.0, 20.0)
plt.xlim((datetime.strptime('2018-07-07 00:00:00', '%Y-%m-%d %H:%M:%S'),
datetime.strptime('2018-07-09 00:00:00', '%Y-%m-%d %H:%M:%S')))
## (np.float64(17719.0), np.float64(17721.0))
plt.show()
Note the timing of C uptake: because the time stamps are in UTC, uptake occurs at times that, on the clock, appear to be during the night.
4. Merge flux data with other sensor data
Many of the data sets we would use to interpret and model flux data are measured as part of the NEON project, but are not present in the eddy flux data product bundle. In this section, we’ll download PAR data and merge them with the flux data; the steps taken here can be applied to any of the NEON instrumented (IS) data products.
Download PAR data
R
To get NEON PAR data, use the loadByProduct() function
from the neonUtilities package.
loadByProduct() takes the same inputs as
zipsByProduct(), but it loads the downloaded data directly
into the current R environment.
Let’s download PAR data matching the Niwot Ridge flux data. The inputs needed are:
- dpID: DP1.00024.001
- site: NIWO
- startdate: 2018-06
- enddate: 2018-07
- package: basic
- timeIndex: 30
The new input here is timeIndex=30, which downloads only
the 30-minute data. Since the flux data are at a 30-minute resolution,
we can save on download time by disregarding the 1-minute data files
(which are of course 30 times larger). The timeIndex input
can be left off if you want to download all available averaging
intervals.
pr <- loadByProduct("DP1.00024.001", site="NIWO",
timeIndex=30, package="basic",
startdate="2018-06", enddate="2018-07",
check.size=F)
Python
To get NEON PAR data, use the load_by_product() function
from the neonutilities package.
load_by_product() takes the same inputs as
zips_by_product(), but it loads the downloaded data
directly into the current Python environment.
Let’s download PAR data matching the Niwot Ridge flux data. The inputs needed are:
- dpid: DP1.00024.001
- site: NIWO
- startdate: 2018-06
- enddate: 2018-07
- package: basic
- timeindex: 30
The new input here is timeindex=30, which downloads only
the 30-minute data. Since the flux data are at a 30-minute resolution,
we can save on download time by disregarding the 1-minute data files
(which are of course 30 times larger). The timeindex input
can be left off if you want to download all available averaging
intervals.
pr = nu.load_by_product("DP1.00024.001", site="NIWO",
timeindex=30, package="basic",
startdate="2018-06", enddate="2018-07",
check_size=False)
pr is another named list, and again, metadata and units
can be found in the variables table. The
PARPAR_30min table contains a verticalPosition
field. This field indicates the position on the tower, with 010 being
the first tower level, and 020, 030, etc going up the tower.
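A generic way to pick the tower top without hardcoding a level is to take the maximum verticalPosition. A small pandas sketch with made-up values:

```python
import pandas as pd

# Hypothetical PAR values by tower level. verticalPosition is stored as
# zero-padded strings ('010', '020', ...), so the string maximum is also
# the topmost level.
par = pd.DataFrame({"verticalPosition": ["010", "020", "030", "040"],
                    "PARMean": [5.2, 8.7, 12.1, 20.4]})
top = par[par["verticalPosition"] == par["verticalPosition"].max()]
print(top["PARMean"].tolist())  # [20.4]
```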
Join PAR to flux data
We’ll connect PAR data from the tower top to the flux data.
R
pr.top <- pr$PARPAR_30min[which(pr$PARPAR_30min$verticalPosition==
max(pr$PARPAR_30min$verticalPosition)),]
As noted above, loadByProduct() automatically converts
time stamps to a recognized date-time format when it reads the data.
However, the field names for the time stamps differ between the flux
data and the other meteorological data: the start of the averaging
interval is timeBgn in the flux data and
startDateTime in the PAR data.
Let’s create a new variable in the PAR data:
pr.top$timeBgn <- pr.top$startDateTime
And now use the matching time stamp fields to merge the flux and PAR data.
fx.pr <- merge(pr.top, flux$NIWO, by="timeBgn")
And now we can plot net carbon exchange as a function of light availability:
plot(fx.pr$data.fluxCo2.nsae.flux~fx.pr$PARMean,
pch=".", ylim=c(-20,20),
xlab="PAR", ylab="CO2 flux")
Python
As noted above, load_by_product() automatically converts
time stamps to a recognized date-time format when it reads the data.
However, the field names for the time stamps differ between the flux
data and the other meteorological data: the start of the averaging
interval is timeBgn in the flux data and
startDateTime in the PAR data.
Let’s create a new variable in the PAR data and subset to the top of the tower (vertical position 040 at Niwot Ridge).
par30 = pr['PARPAR_30min']
par30['timeBgn'] = par30['startDateTime']
prtop = par30[par30['verticalPosition']=='040']
And now use the matching time stamp fields to merge the flux and PAR data.
fxpr = pd.merge(flux['NIWO'], prtop, how='outer', on='timeBgn')
And now we can plot net carbon exchange as a function of light availability:
plt.scatter(fxpr.PARMean,
fxpr['data.fluxCo2.nsae.flux'],
marker='.')
plt.ylim((-20,20))
## (-20.0, 20.0)
plt.show()
If you’re interested in data in the eddy covariance bundle besides the net flux data, the rest of this tutorial will guide you through how to get those data out of the bundle.
5. Vertical profile data (Level 3)
The Level 3 (dp03) data are the spatially interpolated
profiles of the rates of change of CO2, H2O, and
temperature. Extract the Level 3 data from the HDF5 file using
stackEddy() with the same syntax as for the Level 4
data.
R
prof <- stackEddy(filepath=paste0(getwd(), "/filesToStack00200"),
level="dp03")
As with the Level 4 data, the result is a named list with data tables for each site.
head(prof$NIWO[,1:7])
## timeBgn timeEnd data.co2Stor.rateRtioMoleDryCo2.0.1 m
## 1 2018-06-01 00:00:00 2018-06-01 00:29:59 -2.681938e-04
## 2 2018-06-01 00:30:00 2018-06-01 00:59:59 1.912094e-05
## 3 2018-06-01 01:00:00 2018-06-01 01:29:59 3.709964e-04
## 4 2018-06-01 01:30:00 2018-06-01 01:59:59 2.374054e-03
## 5 2018-06-01 02:00:00 2018-06-01 02:29:59 3.397046e-03
## 6 2018-06-01 02:30:00 2018-06-01 02:59:59 3.156952e-03
## data.co2Stor.rateRtioMoleDryCo2.0.2 m data.co2Stor.rateRtioMoleDryCo2.0.3 m
## 1 -2.681938e-04 -2.681938e-04
## 2 1.912094e-05 1.912094e-05
## 3 3.709964e-04 3.709964e-04
## 4 2.374054e-03 2.374054e-03
## 5 3.397046e-03 3.397046e-03
## 6 3.156952e-03 3.156952e-03
## data.co2Stor.rateRtioMoleDryCo2.0.4 m data.co2Stor.rateRtioMoleDryCo2.0.5 m
## 1 -2.681938e-04 -2.681938e-04
## 2 1.912094e-05 1.912094e-05
## 3 3.709964e-04 3.709964e-04
## 4 2.374054e-03 2.374054e-03
## 5 3.397046e-03 3.397046e-03
## 6 3.156952e-03 3.156952e-03
Python
Note (Feb 17, 2026): There is a bug in the Python implementation of the Level 3 code. This section may not run correctly; we will update as soon as possible.
prof = nu.stack_eddy(filepath=os.getcwd() + "/filesToStack00200",
level="dp03")
As with the Level 4 data, the result is a named list with data tables for each site.
prof['NIWO'].head()
6. Un-interpolated vertical profile data (Level 2)
The Level 2 data are interpolated in time but not in space. They contain the rates of change at each of the measurement heights.
R
Again, they can be extracted from the HDF5 files using
stackEddy() with the same syntax:
prof.l2 <- stackEddy(filepath=paste0(getwd(), "/filesToStack00200"),
level="dp02")
head(prof.l2$HARV)
## horizontalPosition verticalPosition timeBgn timeEnd
## 1 000 010 2018-06-01 00:00:00 2018-06-01 00:29:59
## 2 000 010 2018-06-01 00:30:00 2018-06-01 00:59:59
## 3 000 010 2018-06-01 01:00:00 2018-06-01 01:29:59
## 4 000 010 2018-06-01 01:30:00 2018-06-01 01:59:59
## 5 000 010 2018-06-01 02:00:00 2018-06-01 02:29:59
## 6 000 010 2018-06-01 02:30:00 2018-06-01 02:59:59
## data.co2Stor.rateRtioMoleDryCo2.mean data.h2oStor.rateRtioMoleDryH2o.mean
## 1 NaN NaN
## 2 0.002666576 NaN
## 3 -0.011224223 NaN
## 4 0.006133056 NaN
## 5 -0.019554655 NaN
## 6 -0.007855632 NaN
## data.tempStor.rateTemp.mean qfqm.co2Stor.rateRtioMoleDryCo2.qfFinl
## 1 2.583333e-05 1
## 2 -2.008056e-04 1
## 3 -1.901111e-04 1
## 4 -7.419444e-05 1
## 5 -1.537083e-04 1
## 6 -1.874861e-04 1
## qfqm.h2oStor.rateRtioMoleDryH2o.qfFinl qfqm.tempStor.rateTemp.qfFinl
## 1 1 0
## 2 1 0
## 3 1 0
## 4 1 0
## 5 1 0
## 6 1 0
Python
Again, they can be extracted from the HDF5 files using
stack_eddy() with the same syntax:
profl2 = nu.stack_eddy(filepath=os.getcwd() + "/filesToStack00200",
level="dp02")
profl2['HARV'].head()
## horizontalPosition ... qfqm.tempStor.rateTemp.qfFinl
## 0 000 ... 0
## 1 000 ... 0
## 2 000 ... 0
## 3 000 ... 0
## 4 000 ... 0
##
## [5 rows x 10 columns]
Note that here, as in the PAR data, there is a
verticalPosition field. It has the same meaning as in the
PAR data, indicating the tower level of the measurement.
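If you want to work with these data as vertical profiles, one option is to pivot the table so each tower level becomes a column, giving one profile per time step. A pandas sketch with made-up values:

```python
import pandas as pd

# Hypothetical dp02-shaped data: two time steps, two tower levels.
df = pd.DataFrame({
    "timeBgn": ["2018-06-01 00:00", "2018-06-01 00:00",
                "2018-06-01 00:30", "2018-06-01 00:30"],
    "verticalPosition": ["010", "020", "010", "020"],
    "data.tempStor.rateTemp.mean": [1.0e-4, 2.0e-4, 3.0e-4, 4.0e-4],
})
# One row per time step, one column per tower level
prof = df.pivot(index="timeBgn", columns="verticalPosition",
                values="data.tempStor.rateTemp.mean")
print(prof.shape)  # (2, 2)
```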
7. Calibrated raw data (Level 1)
Level 1 (dp01) data are calibrated, and aggregated in
time, but otherwise untransformed. Use Level 1 data for raw gas
concentrations and atmospheric stable isotopes.
Extract Level 1 data
R
Using stackEddy() to extract Level 1 data requires
additional inputs. The Level 1 files are too large to simply pull out
all the variables by default, and they include multiple averaging
intervals, which can’t be merged. So two additional inputs are
needed:
- avg: The averaging interval to extract
- var: One or more variables to extract
What variables are available, at what averaging intervals? Another
function in the neonUtilities package,
getVarsEddy(), returns a list of HDF5 file contents. It
requires only one input, a filepath to a single NEON HDF5 file:
vars <- getVarsEddy(paste0(getwd(),
"/filesToStack00200/NEON.D01.HARV.DP4.00200.001.nsae.2018-07.basic.20260115T064348Z.h5"))
head(vars)
## site level category system hor ver tmi subsys name otype
## 5 HARV dp01 data amrs 000 060 01m <NA> angNedXaxs H5I_DATASET
## 6 HARV dp01 data amrs 000 060 01m <NA> angNedYaxs H5I_DATASET
## 7 HARV dp01 data amrs 000 060 01m <NA> angNedZaxs H5I_DATASET
## 9 HARV dp01 data amrs 000 060 30m <NA> angNedXaxs H5I_DATASET
## 10 HARV dp01 data amrs 000 060 30m <NA> angNedYaxs H5I_DATASET
## 11 HARV dp01 data amrs 000 060 30m <NA> angNedZaxs H5I_DATASET
## dclass dim oth
## 5 COMPOUND 44640 <NA>
## 6 COMPOUND 44640 <NA>
## 7 COMPOUND 44640 <NA>
## 9 COMPOUND 1488 <NA>
## 10 COMPOUND 1488 <NA>
## 11 COMPOUND 1488 <NA>
Inputs to var can be any values from the
name field in the table returned by
getVarsEddy(). Let’s take a look at CO2 and H2O concentrations,
along with 13C in CO2 and 18O in H2O, at 30-minute
aggregation. We’ll use Harvard Forest
for these data, since deeper canopies generally have more interesting
profiles:
iso <- stackEddy(filepath=paste0(getwd(), "/filesToStack00200"),
level="dp01", var=c("rtioMoleDryCo2","rtioMoleDryH2o",
"dlta13CCo2","dlta18OH2o"), avg=30)
head(iso$HARV[,1:8])
## horizontalPosition verticalPosition timeBgn timeEnd
## 1 000 010 2018-06-01 00:00:00 2018-06-01 00:29:59
## 2 000 010 2018-06-01 00:30:00 2018-06-01 00:59:59
## 3 000 010 2018-06-01 01:00:00 2018-06-01 01:29:59
## 4 000 010 2018-06-01 01:30:00 2018-06-01 01:59:59
## 5 000 010 2018-06-01 02:00:00 2018-06-01 02:29:59
## 6 000 010 2018-06-01 02:30:00 2018-06-01 02:59:59
## data.co2Stor.rtioMoleDryCo2.mean data.co2Stor.rtioMoleDryCo2.min
## 1 509.3375 451.4786
## 2 502.2736 463.5470
## 3 521.6139 442.8649
## 4 469.6317 432.6588
## 5 484.7725 436.2842
## 6 476.8554 443.7055
## data.co2Stor.rtioMoleDryCo2.max data.co2Stor.rtioMoleDryCo2.vari
## 1 579.3518 845.0795
## 2 533.6622 161.3652
## 3 563.0518 547.9924
## 4 508.7463 396.8379
## 5 537.4641 662.9449
## 6 515.6598 246.6969
Python
Using stack_eddy() to extract Level 1 data requires
additional inputs. The Level 1 files are too large to simply pull out
all the variables by default, and they include multiple averaging
intervals, which can’t be merged. So two additional inputs are
needed:
- avg: The averaging interval to extract
- var: One or more variables to extract
Inputs to var can be any values from the
name field in the table returned by
get_vars_eddy(). Let’s take a look at CO2 and H2O concentrations,
along with 13C in CO2 and 18O in H2O, at 30-minute
aggregation. We’ll use Harvard Forest
for these data, since deeper canopies generally have more interesting
profiles:
iso = nu.stack_eddy(filepath=os.getcwd() + "/filesToStack00200",
level="dp01", avg=30,
var=["rtioMoleDryCo2","rtioMoleDryH2o",
"dlta13CCo2","dlta18OH2o"])
iso['HARV'].head()
## horizontalPosition ... ucrt.h2oTurb.rtioMoleDryH2o.se
## 0 000 ... NaN
## 1 000 ... NaN
## 2 000 ... NaN
## 3 000 ... NaN
## 4 000 ... NaN
##
## [5 rows x 85 columns]
Plot vertical profiles
Let’s plot vertical profiles of CO2 and 13C in CO2 on a single day.
Here we’ll use the time stamps to select all of the records for a
single day, and discard the verticalPosition values that aren’t
numeric, since those correspond to the calibration gases.
R
iso.d <- iso$HARV[grep("2018-06-25", iso$HARV$timeBgn, fixed=T),]
iso.d <- iso.d[-which(is.na(as.numeric(iso.d$verticalPosition))),]
ggplot is well suited to these types of data. We can
plot CO2 relative to height on the tower, with separate lines
for each time interval.
g <- ggplot(iso.d, aes(y=verticalPosition)) +
geom_path(aes(x=data.co2Stor.rtioMoleDryCo2.mean,
group=timeBgn, col=timeBgn)) +
theme(legend.position="none") +
xlab("CO2") + ylab("Tower level")
g
And the same plot for 13C in CO2:
g <- ggplot(iso.d, aes(y=verticalPosition)) +
geom_path(aes(x=data.isoCo2.dlta13CCo2.mean,
group=timeBgn, col=timeBgn)) +
theme(legend.position="none") +
xlab("d13C") + ylab("Tower level")
g
Python
isoH = iso['HARV']
isoH6 = isoH[(isoH['timeBgn'] >= '2018-06-25') & (isoH['timeBgn'] < '2018-06-26')]
isod = isoH6[isoH6['horizontalPosition'] == '000']
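Note that the R code above also drops the non-numeric verticalPosition values (the calibration gases). A similar check in pandas, sketched here with hypothetical gas-label strings, might look like:

```python
import pandas as pd

# Keep only verticalPosition values that convert to numbers; the
# non-numeric labels (made-up names here) mark the calibration gases.
vp = pd.Series(["010", "020", "030", "co2Low", "co2High"])
numeric = pd.to_numeric(vp, errors="coerce")  # non-numeric become NaN
print(vp[numeric.notna()].tolist())  # ['010', '020', '030']
```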
Let’s plot CO2 relative to height on the tower, with separate lines for each time interval.
fig, ax = plt.subplots()
groups = isod["timeBgn"].unique()
cmap = plt.cm.viridis
colors = cmap(np.linspace(0, 1, len(groups)))
for color, group_name in zip(colors, groups):
group = isod[isod["timeBgn"] == group_name]
ax.plot(group['data.co2Stor.rtioMoleDryCo2.mean'],
group['verticalPosition'],
color=color)
ax.set_xlabel("CO2")
ax.set_ylabel("Tower level")
plt.show()
And the same plot for 13C in CO2:
fig, ax = plt.subplots()
for color, group_name in zip(colors, groups):
group = isod[isod["timeBgn"] == group_name]
ax.plot(group["data.isoCo2.dlta13CCo2.mean"],
group["verticalPosition"],
color=color)
ax.set_xlabel("13C")
ax.set_ylabel("Tower level")
plt.show()
The legends are omitted for space. See if you can use the concentration and isotope ratio buildup and drawdown below the canopy to work out the times of day the different colors represent.