Use the neonDataStackR Package to Access NEON Data

Authors: 
Megan A. Jones

This tutorial shows how to combine data downloaded from the NEON Data Portal as zipped month-by-site files into individual files containing all data from the selected site(s) and months. Temperature data are used as an example.

Download the Data

To start, you must have your data of interest downloaded from the NEON Data Portal.

The stacking function works only on comma-separated value (.csv) files, not on NEON data stored in other formats (HDF5, etc.).

Your data will download in a single zipped file.

The example data are any single-aspirated air temperature available from 1 January 2015 to 31 December 2016.
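Before stacking, it can help to confirm that the download is where you expect it and to look at what it contains. The sketch below uses only base R. The helper name list_zip_contents() is hypothetical and is not part of neonDataStackR; the example path is the one used later in this tutorial and should be replaced with wherever you saved your own download.

```r
# Hypothetical helper: list the files inside a downloaded zip,
# or return an empty character vector if the file is missing.
list_zip_contents <- function(zip_path) {
  if (!file.exists(zip_path)) {
    message("No file found at: ", zip_path)
    return(character(0))
  }
  # unzip(list = TRUE) reads the archive index without extracting anything
  unzip(zip_path, list = TRUE)$Name
}

# Example path (an assumption; point this at your own download)
contents <- list_zip_contents("data/NEON_temp-air-single.zip")
head(contents)
```

If the vector comes back empty, check your working directory with getwd() before going further.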

neonDataStackR package

This package was written to stack data downloaded in month-by-site files into a full table with all the data of interest from all sites in the downloaded date range.

For more information on the package, see the README in the associated GitHub repo NEONScience/NEON-utilities.

First, install the package from the GitHub repo. You must have the devtools package installed to do this. Then load the package.
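If you are not sure whether devtools is already installed, a quick check like the following can tell you. This is a minimal sketch in base R; the function name missing_packages() is made up for this example, and the code only reports what is missing rather than installing anything.

```r
# Hypothetical helper: return the package names that are not yet installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("devtools"))

if (length(to_install) > 0) {
  # Nothing is installed automatically; this only prints a reminder.
  message("Install first with install.packages(): ",
          paste(to_install, collapse = ", "))
}
```

Running install.packages("devtools") once is enough; after that the check returns an empty vector.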

library(devtools)

## Warning: package 'devtools' was built under R version 3.4.1

install_github("NEONScience/NEON-utilities/neonDataStackR", dependencies=TRUE)

## Downloading GitHub repo NEONScience/NEON-utilities@master
## from URL https://api.github.com/repos/NEONScience/NEON-utilities/zipball/master

## Installing neonDataStackR

## '/Library/Frameworks/R.framework/Resources/bin/R' --no-site-file  \
##   --no-environ --no-save --no-restore --quiet CMD INSTALL  \
##   '/private/var/folders/0p/x8phw1_156511_jqkryx2t8m2vn2t3/T/RtmpPD2fan/devtools52c155f1e8a2/NEONScience-NEON-utilities-5872bcd/neonDataStackR'  \
##   --library='/Users/mjones01/Library/R/3.4/library' --install-tests

## 

library(neonDataStackR)

This package has a single function to run: stackByTable(). Its output is grouped into new files by table name. For example, the single-aspirated air temperature data product contains 1-minute and 30-minute interval data, so the function writes one .csv with the 1-minute data and one .csv with the 30-minute data.
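To make the idea of grouping by table name concrete, here is a toy illustration in base R. It is not the package's implementation: the list names, the naming pattern, and the temperature values are all invented for this example. It only shows the general technique of row-binding pieces that belong to the same table.

```r
# Toy month-by-site pieces; names and values are invented for illustration.
monthly <- list(
  "CPER.SAAT_30min.2016-07" = data.frame(siteID = "CPER", tempSingleMean = c(21.4, 22.0)),
  "CPER.SAAT_30min.2016-08" = data.frame(siteID = "CPER", tempSingleMean = c(20.1, 19.8)),
  "CPER.SAAT_1min.2016-07"  = data.frame(siteID = "CPER", tempSingleMean = c(21.3, 21.5, 21.6))
)

# Pull the table name: the second dot-separated field in these toy names.
table_name <- vapply(strsplit(names(monthly), ".", fixed = TRUE),
                     `[`, character(1), 2)

# Group the pieces by table name, then row-bind each group into one table.
stacked <- lapply(split(monthly, table_name),
                  function(pieces) do.call(rbind, pieces))

names(stacked)
sapply(stacked, nrow)
```

The result is one data frame per table name, which mirrors the one-file-per-table output described above.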

Depending on your file size, this function may run for a while. The 2015 and 2016 single-aspirated air temperature data from two sites that I used for a 2017 workshop took about 25 minutes to complete.

To run the stackByTable() function, pass it the path to the downloaded zipped file, relative to your current working directory.

stackByTable("data/NEON_temp-air-single.zip")


Unpacked  NEON.D10.CPER.DP1.00002.001.2016-07.basic.20171010T230533Z.zip
Unpacked  NEON.D10.CPER.DP1.00002.001.2016-08.basic.20171011T101525Z.zip
Unpacked  NEON.D10.CPER.DP1.00002.001.2016-09.basic.20171010T233829Z.zip
Joining, by = c("domainID", "siteID", "horizontalPosition", "verticalPosition", 
"startDateTime", "endDateTime", "tempSingleMean", "tempSingleMinimum", 
"tempSingleMaximum", "tempSingleVariance", "tempSingleNumPts", "tempSingleExpUncert", 
"tempSingleStdErMean", "finalQF")
Joining, by = c("domainID", "siteID", "horizontalPosition", "verticalPosition", 
"startDateTime", "endDateTime", "tempSingleMean", "tempSingleMinimum", "tempSingleMaximum", 
"tempSingleVariance", "tempSingleNumPts", "tempSingleExpUncert", "tempSingleStdErMean", 
"finalQF")
# Note that I've removed some of the "Joining" output for ease of reading
Finished: All of the data are stacked into  2  tables!
Copied the first available variable definition file to /stackedFiles and renamed as variables.csv
Stacked  SAAT_1min
Stacked  SAAT_30min

The single-aspirated air temperature data yield two final tables: SAAT_1min, with 1-minute intervals, and SAAT_30min, with 30-minute intervals.

These .csv files are now ready for use.
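As a rough sketch of a next step, the stacked files can be read with read.csv(). So that the example runs on its own, it first writes a tiny stand-in file to a temporary folder; the rows are invented, only a few of the real columns are included, and the timestamp format and the reading of finalQF (0 as passing) are assumptions to check against your own files and the variables.csv definitions.

```r
# Stand-in for a stacked file: a tiny CSV written to a temporary folder.
# In practice, point csv_path at the file in your own stackedFiles folder.
stacked_dir <- file.path(tempdir(), "stackedFiles")
dir.create(stacked_dir, showWarnings = FALSE)
csv_path <- file.path(stacked_dir, "SAAT_30min.csv")
writeLines(c(
  "siteID,startDateTime,tempSingleMean,finalQF",
  "CPER,2016-07-01T00:00:00Z,18.2,0",
  "CPER,2016-07-01T00:30:00Z,17.9,0",
  "CPER,2016-07-01T01:00:00Z,NA,1"
), csv_path)

# Read the table, keeping text columns as character
saat30 <- read.csv(csv_path, stringsAsFactors = FALSE)

# Convert the timestamp text to a date-time (format is an assumption)
saat30$startDateTime <- as.POSIXct(saat30$startDateTime,
                                   format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")

# Keep rows whose final quality flag is 0 (assumed to mean "passed")
good <- saat30[saat30$finalQF == 0, ]
mean(good$tempSingleMean)
```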

Get Lesson Code: 

neonDataStackR.R
