Skip to main content
NSF NEON | Open Data to Understand our Ecosystems logo
Sign In

Main navigation

  • About Us
    • Overview
      • Spatial and Temporal Design
      • History
    • Vision and Management
    • Advisory Groups
      • Advisory Committee: STEAC
      • Technical Working Groups (TWGs)
    • FAQ
    • Contact Us
      • Field Offices
    • User Accounts
    • Staff

    About Us

  • Data & Samples
    • Data Portal
      • Explore Data Products
      • Data Availability Charts
      • Spatial Data & Maps
      • Document Library
      • API & GraphQL
      • Prototype Data
      • External Lab Data Ingest (restricted)
    • Samples & Specimens
      • Discover and Use NEON Samples
        • Sample Types
        • Sample Repositories
        • Sample Explorer
        • Megapit and Distributed Initial Characterization Soil Archives
        • Excess Samples
      • Sample Processing
      • Sample Quality
      • Taxonomic Lists
    • Collection Methods
      • Protocols & Standardized Methods
      • AIrborne Remote Sensing
        • Flight Box Design
        • Flight Schedules and Coverage
        • Daily Flight Reports
        • Camera
        • Imaging Spectrometer
        • Lidar
      • Automated Instruments
        • Site Level Sampling Design
        • Sensor Collection Frequency
        • Instrumented Collection Types
          • Meteorology
          • Phenocams
          • Soil Sensors
          • Ground Water
          • Surface Water
      • Observational Sampling
        • Site Level Sampling Design
        • Sampling Schedules
        • Observation Types
          • Aquatic Organisms
            • Aquatic Microbes
            • Fish
            • Macroinvertebrates & Zooplankton
            • Periphyton, Phytoplankton, and Aquatic Plants
          • Terrestrial Organisms
            • Birds
            • Ground Beetles
            • Mosquitoes
            • Small Mammals
            • Soil Microbes
            • Terrestrial Plants
            • Ticks
          • Hydrology & Geomorphology
            • Discharge
            • Geomorphology
          • Biogeochemistry
          • DNA Sequences
          • Pathogens
          • Sediments
          • Soils
            • Soil Descriptions
    • Data Notifications
    • Data Guidelines and Policies
      • Acknowledging and Citing NEON
      • Publishing Research Outputs
      • Usage Policies
    • Data Management
      • Data Availability
      • Data Formats and Conventions
      • Data Processing
      • Data Quality
      • Data Product Revisions and Releases
        • Release 2021
        • Release 2022
      • Externally Hosted Data

    Data & Samples

  • Field Sites
    • About Field Sites and Domains
    • Explore Field Sites
    • Site Management Data Product

    Field Sites

  • Impact
    • Observatory Blog
    • Case Studies
    • Spotlights
    • Papers & Publications
    • Newsroom
      • NEON in the News
      • Newsletter Archive

    Impact

  • Resources
    • Getting Started with NEON Data & Resources
    • Documents and Communication Resources
      • Papers & Publications
      • Document Library
      • Outreach Materials
    • Code Hub
      • Code Resources Guidelines
      • Code Resources Submission
      • NEON's GitHub Organization Homepage
    • Learning Hub
      • Science Videos
      • Tutorials
      • Workshops & Courses
      • Teaching Modules
      • Faculty Mentoring Networks
      • Data Education Fellows
    • Research Support and Assignable Assets
      • Field Site Coordination
      • Letters of Support
      • Mobile Deployment Platforms
      • Permits and Permissions
      • AOP Flight Campaigns
      • Excess Samples
      • Assignable Assets FAQs
    • Funding Opportunities

    Resources

  • Get Involved
    • Advisory Groups
    • Upcoming Events
    • Past Events
    • NEON Ambassador Program
    • Collaborative Works
      • EFI-NEON Ecological Forecasting Challenge
      • NCAR-NEON-Community Collaborations
    • Community Engagement
    • Work Opportunities
      • Careers
      • Seasonal Fieldwork
      • Postdoctoral Fellows
      • Internships
        • Intern Alumni
    • Partners

    Get Involved

  • My Account
  • Search

Search

Learning Hub

  • Science Videos
  • Tutorials
  • Workshops & Courses
  • Teaching Modules
  • Faculty Mentoring Networks
  • Data Education Fellows

Breadcrumb

  1. Resources
  2. Learning Hub
  3. Tutorials
  4. Working With Raster Time Series Data in R

Series

Working With Raster Time Series Data in R

The tutorials in this series cover how to open, work with and plot raster time series data in R. This series includes only the more-advanced, time-series specific tutorials that are also part of the Introduction to Working With Raster Data in R series.

Data used in this series cover NEON Harvard Forest and San Joaquin Experimental Range field sites and are in GeoTIFF and .csv formats.

Series Objectives

After completing the series you will:

  • Raster 05
    • Understand the format of a time series raster dataset.
    • Know how to work with time series rasters.
    • Be able to efficiently import a set of rasters stored in a single directory.
    • Be able to plot and explore time series raster data using the plot() function in R.
  • Raster 06
    • Be able to assign custom names to bands in a RasterStack for prettier plotting.
    • Understand advanced plotting of rasters using the rasterVis package and levelplot.
  • Raster 07
    • Be able to extract summary pixel values from a raster.
    • Know how to save summary values to a .csv file.
    • Be able to plot summary pixel values using ggplot().
    • Have experience comparing NDVI values between two different sites.

Things You’ll Need To Complete This Series

Setup RStudio

To complete the tutorial series you will need an updated version of R and, preferably, RStudio installed on your computer.

R is a programming language that specializes in statistical computing. It is a powerful tool for exploratory data analysis. To interact with R, we strongly recommend RStudio, an interactive development environment (IDE).

Install R Packages

You can chose to install packages with each lesson or you can download all of the necessary R Packages now.

  • raster: install.packages("raster")
  • rgdal: install.packages("rgdal")
  • rasterVis: install.packages("rasterVis")
  • ggplot2: install.packages("ggplot2")

More on Packages in R – Adapted from Software Carpentry.


Working with Raster Time Series Data in R Tutorial Series: This tutorial is part of a series on working with raster data in R. It is part also of a Data Carpentry workshop
on using spatio-temporal in R. Other related series include: intro to spatio-temporal data and data management, working with vector data in R, and working with tabular time series data in R.

Raster 05: Raster Time Series Data in R

Authors: Leah A. Wasser, Megan A. Jones, Zack Brym, Kristina Riemer, Jason Williams, Jeff Hollister, Mike Smorul

Last Updated: Apr 8, 2021

This tutorial covers how to work with and plot a raster time series, using an R RasterStack object. It also covers practical assessment of data quality in remote sensing derived imagery.

Learning Objectives

After completing this tutorial, you will be able to:

  • Understand the format of a time series raster dataset.
  • Know how to work with time series rasters.
  • Be able to efficiently import a set of rasters stored in a single directory.
  • Be able to plot and explore time series raster data using the plot() function in R.

Things You’ll Need To Complete This Tutorial

You will need the most current version of R and, preferably, RStudio loaded on your computer to complete this tutorial.

Install R Packages

  • raster: install.packages("raster")

  • rgdal: install.packages("rgdal")

  • More on Packages in R – Adapted from Software Carpentry.

Data to Download

NEON Teaching Data Subset: Landsat-derived NDVI raster files

The imagery data used to create this raster teaching data subset were collected over the National Ecological Observatory Network's Harvard Forest and San Joaquin Experimental Range field sites.
The imagery was created by the U.S. Geological Survey (USGS) using a multispectral scanner on a Landsat Satellite. The data files are Geographic Tagged Image-File Format (GeoTIFF).

Download Dataset

Set Working Directory: This lesson assumes that you have set your working directory to the location of the downloaded and unzipped data subsets.

An overview of setting the working directory in R can be found here.

R Script & Challenge Code: NEON data lessons often contain challenges that reinforce learned skills. If available, the code for challenge solutions is found in the downloadable R script of the entire lesson, available in the footer of each lesson page.


Additional Resources

Read more about the raster package in R.

About Raster Time Series Data

A raster data file can contain one single band or many bands. If the raster data contains imagery data, each band may represent reflectance for a different wavelength (color or type of light) or set of wavelengths - for example red, green and blue. A multi-band raster may two or more bands or layers of data collected at different times for the same extent (region) and of the same resolution.

Graphic depicting a time series of the greenness over time of the United States A multi-band raster dataset can contain time series data. Source: National Ecological Observatory Network (NEON).

The raster data that we will use in this tutorial are located in the (NEON-DS-Landsat-NDVI\HARV\2011\NDVI) directory and cover part of the NEON Harvard Forest field site.

In this tutorial, we will:

  1. Import NDVI data in GeoTIFF format.
  2. Import, explore and plot NDVI data derived for several dates throughout the year.
  3. View the RGB imagery used to derived the NDVI time series to better understand unusual/outlier values.

NDVI Data

The Normalized Difference Vegetation Index or NDVI is a quantitative index of greenness ranging from 0-1 where 0 represents minimal or no greenness and 1 represents maximum greenness.

NDVI is often used for a quantitative proxy measure of vegetation health, cover and phenology (life cycle stage) over large areas. Our NDVI data are a Landsat derived single band product saved as a GeoTIFF for different times of the year.

NDVI is calculated from the visible and near-infrared light reflected by vegetation. Healthy vegetation absorbs most of the visible light that hits it, and reflects a large portion of near-infrared light. Unhealthy or sparse vegetation reflects more visible light and less near-infrared light. NDVI is calculated from the visible and near-infrared light reflected by vegetation. Healthy vegetation (left) absorbs most of the visible light that hits it, and reflects a large portion of near-infrared light. Unhealthy or sparse vegetation (right) reflects more visible light and less near-infrared light. Image & Caption Source: NASA

More on NDVI from NASA

RGB Data

While the NDVI data are a single band product, the RGB images that contain the red band used to derive NDVI, contain 3 (of the 7) 30m resolution bands available from Landsat data. The RGB directory contains RGB images for each time period that NDVI is available.

A graphic depicting the three different color bands (red, green, and blue) of a satellite image and how they create a basic color image when composited. A "true" color image consists of 3 bands - red, green and blue. When composited or rendered together in a GIS, or even a image-editor like Photoshop the bands create a color image. Source: National Ecological Observatory Network (NEON).

Getting Started

In this tutorial, we will use the raster and rgdal libraries.

# load packages
library(raster)
library(rgdal)

# set working directory to ensure R can find the file we wish to import
wd <- "~/Git/data/" # this will depend on your local environment environment
# be sure that the downloaded file is in this directory
setwd(wd)

To begin, we will create a list of raster files using the list.files() function in R. This list will be used to generate a RasterStack. We will only add files to our list with a .tif extension using the syntax pattern=".tif$".

If we specify full.names=TRUE, the full path for each file will be added to the list.

# Create list of NDVI file paths
# assign path to object = cleaner code
NDVI_HARV_path <- paste0(wd,"NEON-DS-Landsat-NDVI/HARV/2011/NDVI") 
all_NDVI_HARV <- list.files(NDVI_HARV_path,
                            full.names = TRUE,
                            pattern = ".tif$")

# view list - note the full path, relative to our working directory, is included
all_NDVI_HARV

##  [1] "/Users/olearyd/Git/data/NEON-DS-Landsat-NDVI/HARV/2011/NDVI/005_HARV_ndvi_crop.tif"
##  [2] "/Users/olearyd/Git/data/NEON-DS-Landsat-NDVI/HARV/2011/NDVI/037_HARV_ndvi_crop.tif"
##  [3] "/Users/olearyd/Git/data/NEON-DS-Landsat-NDVI/HARV/2011/NDVI/085_HARV_ndvi_crop.tif"
##  [4] "/Users/olearyd/Git/data/NEON-DS-Landsat-NDVI/HARV/2011/NDVI/133_HARV_ndvi_crop.tif"
##  [5] "/Users/olearyd/Git/data/NEON-DS-Landsat-NDVI/HARV/2011/NDVI/181_HARV_ndvi_crop.tif"
##  [6] "/Users/olearyd/Git/data/NEON-DS-Landsat-NDVI/HARV/2011/NDVI/197_HARV_ndvi_crop.tif"
##  [7] "/Users/olearyd/Git/data/NEON-DS-Landsat-NDVI/HARV/2011/NDVI/213_HARV_ndvi_crop.tif"
##  [8] "/Users/olearyd/Git/data/NEON-DS-Landsat-NDVI/HARV/2011/NDVI/229_HARV_ndvi_crop.tif"
##  [9] "/Users/olearyd/Git/data/NEON-DS-Landsat-NDVI/HARV/2011/NDVI/245_HARV_ndvi_crop.tif"
## [10] "/Users/olearyd/Git/data/NEON-DS-Landsat-NDVI/HARV/2011/NDVI/261_HARV_ndvi_crop.tif"
## [11] "/Users/olearyd/Git/data/NEON-DS-Landsat-NDVI/HARV/2011/NDVI/277_HARV_ndvi_crop.tif"
## [12] "/Users/olearyd/Git/data/NEON-DS-Landsat-NDVI/HARV/2011/NDVI/293_HARV_ndvi_crop.tif"
## [13] "/Users/olearyd/Git/data/NEON-DS-Landsat-NDVI/HARV/2011/NDVI/309_HARV_ndvi_crop.tif"

Now we have a list of all GeoTIFF files in the NDVI directory for Harvard Forest. Next, we will create a RasterStack from this list using the stack() function.

# Create a raster stack of the NDVI time series
NDVI_HARV_stack <- stack(all_NDVI_HARV)

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

We can explore the GeoTIFF tags (the embedded metadata) in a stack using the same syntax that we used on single-band raster objects in R including: crs() (coordinate reference system), extent() and res() (resolution; specifically yres() and xres()).

# view crs of rasters
crs(NDVI_HARV_stack)

## CRS arguments:
##  +proj=utm +zone=19 +ellps=WGS84 +units=m +no_defs

# view extent of rasters in stack
extent(NDVI_HARV_stack)

## class      : Extent 
## xmin       : 239415 
## xmax       : 239535 
## ymin       : 4714215 
## ymax       : 4714365

# view the y resolution of our rasters
yres(NDVI_HARV_stack)

## [1] 30

# view the x resolution of our rasters
xres(NDVI_HARV_stack)

## [1] 30

Notice that the CRS is +proj=utm +zone=19 +ellps=WGS84 +units=m +no_defs. The CRS is in UTM Zone 19. If you have completed the previous tutorials in this raster data in R series, you may have noticed that the UTM zone for the NEON collected remote sensing data was in Zone 18 rather than Zone 19. Why are the Landsat data in Zone 19?

Image of the UTM zones of with United States with an inset image of NEON's Harvard Forest tower location demonstrating it is in UTM zone 18. Landsat imagery swaths are over 170 km N-S and 180 km E-W. As a result a given image may overlap two UTM zones. The designated zone is determined by the zone that the majority of the image is in. In this example, our point of interest is in UTM Zone 18 but the Landsat image will be classified as UTM Zone 19. Source: National Ecological Observatory Network (NEON).

The width of a Landsat scene is extremely wide - spanning over 170km north to south and 180km east to west. This means that Landsat data often cover multiple UTM zones. When the data are processed, the zone in which the majority of the data cover, is the zone which is used for the final CRS. Thus, our field site at Harvard Forest is located in UTM Zone 18, but the Landsat data are in a CRS of UTM Zone 19.

### Challenge: Raster Metadata Answer the following questions about our `RasterStack`.
  1. What is the CRS?
  2. What is the x and y resolution of the data?
  3. What units is the above resolution in?

Plotting Time Series Data

Once we have created our RasterStack, we can visualize our data. We can use the plot() command to quickly plot a RasterStack.

# view a plot of all of the rasters
# 'nc' specifies number of columns (we will have 13 plots)
plot(NDVI_HARV_stack, 
     zlim = c(1500, 10000), 
     nc = 4)

Plots of all the NDVI rasters of NEON's site Harvard Forest in the raster stack

Have a look at the range of NDVI values observed in the plot above. We know that the accepted values for NDVI range from 0-1. Why does our data range from 0 - 10,000?

Scale Factors

The metadata for this NDVI data specifies a scale factor: 10,000. A scale factor is sometimes used to maintain smaller file sizes by removing decimal places. Storing data in integer format keeps files sizes smaller.

Let's apply the scale factor before we go any further. Conveniently, we can quickly apply this factor using raster math on the entire stack as follows:

raster_stack_object_name / 10000

**Data Tip:** We can make this plot even prettier by fixing the individual tile names, adding an plot title and by using the (`levelplot`) function. This is covered in the NEON Data Skills *Plot Time Series Rasters in R* tutorial.
# apply scale factor to data
NDVI_HARV_stack <- NDVI_HARV_stack/10000
# plot stack with scale factor applied
# apply scale factor to limits to ensure uniform plottin
plot(NDVI_HARV_stack,
     zlim = c(.15, 1),  
     nc = 4)

Plots of all the NDVI rasters of NEON's site Harvard Forest in the raster stack with a scale factor of 10,000

Take a Closer Look at Our Data

Let's take a closer look at the plots of our data. Note that Massachusetts, where the NEON Harvard Forest Field Site is located has a fairly consistent fall, winter, spring and summer season where vegetation turns green in the spring, continues to grow throughout the summer, and begins to change colors and senesce in the fall through winter. Do you notice anything that seems unusual about the patterns of greening and browning observed in the plots above?

Hint: the number after the "X" in each tile title is the Julian day which in this case represents the number of days into each year. If you are unfamiliar with Julian day, check out the NEON Data Skills Converting to Julian Day tutorial . tutorial.

View Distribution of Raster Values

In the above exercise, we viewed plots of our NDVI time series and noticed a few images seem to be unusually light. However this was only a visual representation of potential issues in our data. What is another way we can look at these data that is quantitative?

Next we will use histograms to explore the distribution of NDVI values stored in each raster.

# create histograms of each raster
hist(NDVI_HARV_stack, 
     xlim = c(0, 1))

Histograms of all the NDVI rasters of NEON's site Harvard Forest in the raster stack

It seems like things get green in the spring and summer like we expect, but the data at Julian days 277 and 293 are unusual. It appears as if the vegetation got green in the spring, but then died back only to get green again towards the end of the year. Is this right?

Explore Unusual Data Patterns

The NDVI data that we are using comes from 2011, perhaps a strong freeze around Julian day 277 could cause a vegetation to senesce early, however in the eastern United States, it seems unusual that it would proceed to green up again shortly thereafter.

Let's next view some temperature data for our field site to see whether there were some unusual fluctuations that may explain this pattern of greening and browning seen in the NDVI data.

Scatterplot of daily mean air temperature at NEON's site Harvard Forest

There are no significant peaks or dips in the temperature during the late summer or early fall time period that might account for patterns seen in the NDVI data.

What is our next step?

Let's have a look at the source Landsat imagery that was partially used used to derive our NDVI rasters to try to understand what appears to be outlier NDVI values.

### Challenge: Examine RGB Raster Files
  1. View the imagery located in the /NEON-DS-Landsat-NDVI/HARV/2011 directory.
  2. Plot the RGB images for the Julian days 277 and 293 then plot and compare those images to jdays 133 and 197.
  3. Does the RGB imagery from these two days explain the low NDVI values observed on these days?

HINT: if you want to plot 4 images in a tiled set, you can use par(mfrow=c(2,2)) to create a 2x2 tiled layout. When you are done, be sure to reset your layout using: par(mfrow=c(1,1)).

Two sets of NDVI images for NEON's site Harvard Forest making a small time series

Explore The Data's Source

The third challenge question, "Does the RGB imagery from these two days explain the low NDVI values observed on these days?" highlights the importance of exploring the source of a derived data product. In this case, the NDVI data product was derived from (created using) Landsat imagery - specifically the red and near-infrared bands.

When we look at the RGB collected at Julian days 277 and 293 we see that most of the image is filled with clouds. The very low NDVI values resulted from cloud cover — a common challenge that we encounter when working with satellite remote sensing imagery.

Get Lesson Code

05-Time-Series-Raster.R

Raster 06: Plot Raster Time Series Data in R Using RasterVis and Levelplot

Authors: Leah A. Wasser, Megan A. Jones, Zack Brym, Kristina Riemer, Jason Williams, Jeff Hollister, Mike Smorul

Last Updated: Apr 8, 2021

This tutorial covers how to improve plotting output using the rasterVis package in R. Specifically it covers using levelplot() and adding meaningful custom names to bands within a RasterStack.

Learning Objectives

After completing this tutorial, you will be able to:

  • Be able to assign custom names to bands in a RasterStack for prettier plotting.
  • Understand advanced plotting of rasters using the rasterVis package and levelplot.

Things You’ll Need To Complete This Tutorial

You will need the most current version of R and, preferably, RStudio loaded on your computer to complete this tutorial.

Install R Packages

  • raster: install.packages("raster")

  • rgdal: install.packages("rgdal")

  • rasterVis: install.packages("rasterVis")

  • RColorBrewer: install.packages("RColorBrewer")

  • More on Packages in R – Adapted from Software Carpentry.

Data to Download

NEON Teaching Data Subset: Landsat-derived NDVI raster files

The imagery data used to create this raster teaching data subset were collected over the National Ecological Observatory Network's Harvard Forest and San Joaquin Experimental Range field sites.
The imagery was created by the U.S. Geological Survey (USGS) using a multispectral scanner on a Landsat Satellite. The data files are Geographic Tagged Image-File Format (GeoTIFF).

Download Dataset

Set Working Directory: This lesson assumes that you have set your working directory to the location of the downloaded and unzipped data subsets.

An overview of setting the working directory in R can be found here.

R Script & Challenge Code: NEON data lessons often contain challenges that reinforce learned skills. If available, the code for challenge solutions is found in the downloadable R script of the entire lesson, available in the footer of each lesson page.

Get Started

In this tutorial, we are working with the same set of rasters used in the Raster Time Series Data in R tutorial. These data are derived from the Landsat satellite and stored in GeoTIFF format. Each raster covers the NEON Harvard Forest field site.

If you have not already created the RasterStack, originally created in Raster Time Series Data in R , please create it now.

# import libraries
library(raster)
library(rgdal)
library(rasterVis)

## Loading required package: terra

## terra version 1.1.4

## 
## Attaching package: 'terra'

## The following objects are masked from 'package:tidyr':
## 
##     expand, fill, pack, separate

## The following object is masked from 'package:zoo':
## 
##     time<-

## The following object is masked from 'package:grid':
## 
##     depth

## The following object is masked from 'package:scales':
## 
##     rescale

## The following object is masked from 'package:ggmap':
## 
##     inset

## The following object is masked from 'package:rgdal':
## 
##     project

## The following objects are masked from 'package:dplyr':
## 
##     collapse, desc, near

## The following object is masked from 'package:knitr':
## 
##     spin

## Loading required package: lattice

## Loading required package: latticeExtra

## 
## Attaching package: 'latticeExtra'

## The following object is masked from 'package:ggplot2':
## 
##     layer

library(RColorBrewer)

# set working directory to ensure R can find the file we wish to import
wd <- "~/Git/data/" # this will depend on your local environment environment
# be sure that the downloaded file is in this directory
setwd(wd)

# Create list of NDVI file paths
all_NDVI_HARV <- list.files(paste0(wd,"NEON-DS-Landsat-NDVI/HARV/2011/NDVI"), full.names = TRUE, pattern = ".tif$")

# Create a time series raster stack
NDVI_HARV_stack <- stack(all_NDVI_HARV)

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

# apply scale factor
NDVI_HARV_stack <- NDVI_HARV_stack/10000

Plot Raster Time Series Data

We can use the plot function to plot our raster time series data.

# view a plot of all of the rasters
# nc specifies number of columns
plot(NDVI_HARV_stack, nc = 4)

Plot of all the NDVI rasters for NEON's site Harvard Forest

**Data Tip:** The range of values for NDVI is 0-1. However, the data stored in our raster ranges from 0 - 10,000. If we view the metadata for the original .tif files, we will see a scale factor of 10,000 is defined. Multiplying values with decimal places by a factor of 10, allows the data to be stored in integer format (no decimals) rather than a floating point format (containing decimals). This keeps the file size smaller.

Our plot is nice however, it's missing some key elements including, easily readable titles. It also contains a legend that is repeated for each image. We can use levelplot from the rasterVis package to make our plot prettier!

  • More on the rasterVis package

The syntax for the levelplot() function is similar to that for the plot() function. We use main="TITLE" to add a title to the entire plot series.

# create a `levelplot` plot
levelplot(NDVI_HARV_stack,
          main="Landsat NDVI\nNEON Harvard Forest")

Levelplot of all the NDVI rasters for NEON's site Harvard Forest

Adjust the Color Ramp

Next, let's adjust the color ramp used to render the rasters. First, we can change the red color ramp to a green one that is more visually suited to our NDVI (greenness) data using the colorRampPalette() function in combination with colorBrewer.

# use colorbrewer which loads with the rasterVis package to generate
# a color ramp of yellow to green
cols <- colorRampPalette(brewer.pal(9,"YlGn"))
# create a level plot - plot
levelplot(NDVI_HARV_stack,
        main="Landsat NDVI -- Improved Colors \nNEON Harvard Forest Field Site",
        col.regions=cols)

Levelplot of all the NDVI rasters for NEON's site Harvard Forest with a new color palette

The yellow to green color ramp visually represents NDVI well given it's a measure of greenness. Someone looking at the plot can quickly understand that pixels that are more green, have a higher NDVI value.

  • For all of the brewer_pal ramp names see the
brewerpal page.
**Data Tip:** Cynthia Brewer, the creator of ColorBrewer, offers an online tool to help choose suitable color ramps, or to create your own. ColorBrewer 2.0; Color Advise for Cartography

Refine Plot & Tile Labels

Next, let's label each raster in our plot with the Julian day that the raster represents. The current names come from the band (layer names) stored in the RasterStack and first the part each name is the Julian day.

To create a more meaningful label we can remove the "x" and replace it with "day" using the gsub() function in R. The syntax is as follows: gsub("StringToReplace","TextToReplaceIt", Robject).

First let's remove "_HARV_NDVI_crop" from each label.

# view names for each raster layer
names(NDVI_HARV_stack)

##  [1] "X005_HARV_ndvi_crop" "X037_HARV_ndvi_crop" "X085_HARV_ndvi_crop"
##  [4] "X133_HARV_ndvi_crop" "X181_HARV_ndvi_crop" "X197_HARV_ndvi_crop"
##  [7] "X213_HARV_ndvi_crop" "X229_HARV_ndvi_crop" "X245_HARV_ndvi_crop"
## [10] "X261_HARV_ndvi_crop" "X277_HARV_ndvi_crop" "X293_HARV_ndvi_crop"
## [13] "X309_HARV_ndvi_crop"

# use gsub to modify label names.that we'll use for the plot 
rasterNames  <- gsub("X","Day ", names(NDVI_HARV_stack))

# view Names
rasterNames

##  [1] "Day 005_HARV_ndvi_crop" "Day 037_HARV_ndvi_crop"
##  [3] "Day 085_HARV_ndvi_crop" "Day 133_HARV_ndvi_crop"
##  [5] "Day 181_HARV_ndvi_crop" "Day 197_HARV_ndvi_crop"
##  [7] "Day 213_HARV_ndvi_crop" "Day 229_HARV_ndvi_crop"
##  [9] "Day 245_HARV_ndvi_crop" "Day 261_HARV_ndvi_crop"
## [11] "Day 277_HARV_ndvi_crop" "Day 293_HARV_ndvi_crop"
## [13] "Day 309_HARV_ndvi_crop"

# Remove HARV_NDVI_crop from the second part of the string 
rasterNames  <- gsub("_HARV_ndvi_crop","",rasterNames)

# view names for each raster layer
rasterNames

##  [1] "Day 005" "Day 037" "Day 085" "Day 133" "Day 181" "Day 197"
##  [7] "Day 213" "Day 229" "Day 245" "Day 261" "Day 277" "Day 293"
## [13] "Day 309"
**Data Tip:** Instead of substituting "x" and "_HARV_NDVI_crop" separately, we could have used use the vertical bar character ( | ) to replace more than one element. For example "X|_HARV" tells R to replace all instances of both "X" and "_HARV" in the string. Example code to remove "x" an "_HARV...": `gsub("X|_HARV_NDVI_crop"," | ","rasterNames")`

Once the names for each band have been reassigned, we can render our plot with the new labels using names.attr=rasterNames.

# use level plot to create a nice plot with one legend and a 4x4 layout.
levelplot(NDVI_HARV_stack,
          layout=c(4, 4), # create a 4x4 layout for the data
          col.regions=cols, # add a color ramp
          main="Landsat NDVI - Julian Days \nHarvard Forest 2011",
          names.attr=rasterNames)

Levelplot of all the NDVI rasters for NEON's site Harvard Forest with a legend, a 4x4layout, and each raster labeled with the Julian Day

We can adjust the columns of our plot too using layout=c(cols,rows). Below we adjust the layout to be a matrix of 5 columns and 3 rows.

# use level plot to create a nice plot with one legend and a 4x4 layout.
levelplot(NDVI_HARV_stack,
          layout=c(5, 3), # create a 5x3 layout for the data
          col.regions=cols, # add a color ramp
          main="Landsat NDVI - Julian Days \nHarvard Forest 2011",
          names.attr=rasterNames)

Levelplot of all the NDVI rasters for NEON's site Harvard Forest with a legend, a 5x3 layout, and each raster labeled with the Julian Day

Finally, scales allows us to modify the x and y-axis scale. Let's simply remove the axis ticks from the plot with scales =list(draw=FALSE).

# use level plot to create a nice plot with one legend and a 4x4 layout.
levelplot(NDVI_HARV_stack,
          layout=c(5, 3), # create a 5x3 layout for the data
          col.regions=cols, # add a color ramp
          main="Landsat NDVI - Julian Days \nHarvard Forest 2011",
          names.attr=rasterNames,
          scales=list(draw=FALSE )) # remove axes labels & ticks

Levelplot of all the NDVI rasters for NEON's site Harvard Forest with a legend, no axes, and each raster labeled with the Julian Day

### Challenge: Divergent Color Ramps When we used `gsub` to modify the tile labels we replaced the beginning of each tile title with "Day". A more descriptive name could be "Julian Day".
  1. Create a plot and label each tile "Julian Day" with the julian day value following.
  2. Change the colorramp to a divergent brown to green color ramp to represent the data. Hint: Use the brewerpal page to help choose a color ramp.

Questions: Does having a divergent color ramp represent the data better than a sequential color ramp (like "YlGn")? Can you think of other data sets where a divergent color ramp may be best?

Levelplot of all the NDVI rasters for NEON's site Harvard Forest with a legend, a 5x3 layout, a divergent color palette, and each raster labeled with the Julian Day

Get Lesson Code

06-Plotting-Time-Series-Rasters-in-R.R

Raster 07: Extract NDVI Summary Values from a Raster Time Series

Authors: Leah A. Wasser, Megan A. Jones, Zack Brym, Kristina Riemer, Jason Williams, Jeff Hollister, Mike Smorul

Last Updated: Apr 8, 2021

In this tutorial, we will extract NDVI values from a raster time series dataset in R and plot them using ggplot.

Learning Objectives

After completing this tutorial, you will be able to:

  • Be able to extract summary pixel values from a raster.
  • Know how to save summary values to a .csv file.
  • Be able to plot summary pixel values using ggplot().
  • Have experience comparing NDVI values between two different sites.

Things You'll Need To Complete This Tutorial

You will need the most current version of R and, preferably, RStudio loaded on your computer to complete this tutorial.

Install R Packages

  • raster: install.packages("raster")

  • rgdal: install.packages("rgdal")

  • ggplot2: install.packages("ggplot2")

  • More on Packages in R – Adapted from Software Carpentry.

Data to Download

NEON Teaching Data Subset: Landsat-derived NDVI raster files

The imagery data used to create this raster teaching data subset were collected over the National Ecological Observatory Network's Harvard Forest and San Joaquin Experimental Range field sites.
The imagery was created by the U.S. Geological Survey (USGS) using a multispectral scanner on a Landsat Satellite. The data files are Geographic Tagged Image-File Format (GeoTIFF).

Download Dataset

Set Working Directory: This lesson assumes that you have set your working directory to the location of the downloaded and unzipped data subsets.

An overview of setting the working directory in R can be found here.

R Script & Challenge Code: NEON data lessons often contain challenges that reinforce learned skills. If available, the code for challenge solutions is found in the downloadable R script of the entire lesson, available in the footer of each lesson page.

Extract Summary Statistics From Raster Data

In science, we often want to extract summary values from raster data. For example, we might want to understand overall greeness across a field site or at each plot within a field site. These values can then be compared between different field sites and combined with other related metrics to support modeling and further analysis.

Get Started

In this tutorial, we will work with the same set of rasters used in the Raster Time Series Data in R and Plot Raster Time Series Data in R Using RasterVis and Levelplot tutorials. To begin, we will create a raster stack (also created in the previous tutorials so you may be able to skip this first step!).

library(raster)
library(rgdal)
library(ggplot2)

# set working directory to ensure R can find the file we wish to import
wd <- "~/Git/data/" # this will depend on your local environment
# be sure that the downloaded file is in this directory
setwd(wd)

# Create list of NDVI file paths
all_HARV_NDVI <- list.files(paste0(wd,"NEON-DS-Landsat-NDVI/HARV/2011/NDVI"),
                            full.names = TRUE,
                            pattern = ".tif$")

# Create a time series raster stack
NDVI_HARV_stack <- stack(all_HARV_NDVI)

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

# apply scale factor
NDVI_HARV_stack <- NDVI_HARV_stack/10000

Calculate Average NDVI

Our goal in this tutorial, is to create a data.frame that contains a single, mean NDVI value for each raster in our time series. This value represents the mean NDVI value for this area on a given day.

We can calculate the mean for each raster using the cellStats function. The cellStats function produces a numeric array of values. We can then convert our array format output to a data.frame using as.data.frame().

# calculate mean NDVI for each raster
avg_NDVI_HARV <- cellStats(NDVI_HARV_stack,mean)

# convert output array to data.frame
avg_NDVI_HARV <- as.data.frame(avg_NDVI_HARV)

# To be more efficient we could do the above two steps with one line of code
# avg_NDVI_HARV <- as.data.frame(cellStats(NDVI_stack_HARV,mean))

# view data
avg_NDVI_HARV

##                     avg_NDVI_HARV
## X005_HARV_ndvi_crop      0.365150
## X037_HARV_ndvi_crop      0.242645
## X085_HARV_ndvi_crop      0.251390
## X133_HARV_ndvi_crop      0.599300
## X181_HARV_ndvi_crop      0.878725
## X197_HARV_ndvi_crop      0.893250
## X213_HARV_ndvi_crop      0.878395
## X229_HARV_ndvi_crop      0.881505
## X245_HARV_ndvi_crop      0.850120
## X261_HARV_ndvi_crop      0.796360
## X277_HARV_ndvi_crop      0.033050
## X293_HARV_ndvi_crop      0.056895
## X309_HARV_ndvi_crop      0.541130

# view only the value in row 1, column 1 of the data frame
avg_NDVI_HARV[1,1]

## [1] 0.36515

We now have a data.frame with row.names based on the original file name and a mean NDVI value for each file. Next, let's clean up the column names in our data.frame to make it easier for colleagues to work with our code.

It is a bit confusing to have duplicate object & column names (e.g. avg_NDVI_HARV), additionally the "avg" does not clearly what the value in that particular column is. Let's change the NDVI column name to MeanNDVI.

# view column name slot
names(avg_NDVI_HARV)

## [1] "avg_NDVI_HARV"

# rename the NDVI column
names(avg_NDVI_HARV) <- "meanNDVI"

# view cleaned column names
names(avg_NDVI_HARV)

## [1] "meanNDVI"

By renaming the column, we lose the "HARV" in the header that reminds us what site our data are from. While, we are only working with one site now, we might want to compare several sites worth of data in the future. Let's add a column to our data.frame called "site". We can populate this column with the site name - HARV. Let's also create a year column and populate it with 2011 - the year our data were collected.

# add a site column to our data
avg_NDVI_HARV$site <- "HARV"

# add a "year" column to our data
avg_NDVI_HARV$year <- "2011"

# view data
head(avg_NDVI_HARV)

##                     meanNDVI site year
## X005_HARV_ndvi_crop 0.365150 HARV 2011
## X037_HARV_ndvi_crop 0.242645 HARV 2011
## X085_HARV_ndvi_crop 0.251390 HARV 2011
## X133_HARV_ndvi_crop 0.599300 HARV 2011
## X181_HARV_ndvi_crop 0.878725 HARV 2011
## X197_HARV_ndvi_crop 0.893250 HARV 2011

We now have data frame that contains a row for each raster file processed, and a column for meanNDVI, site and year.

Extract Julian Day from row.names

We'd like to produce a plot where Julian days (the numeric day of the year, 0 - 365/366) is on the x-axis and NDVI is on the y-axis. To create this plot, we'll need a column that contains the Julian day value.

One way to create a Julian day column is to use gsub on the file name in each row. We can replace both the X and the _HARV_NDVI_crop to extract the Julian Day value:

X005_HARV_NDVI_crop

# note the use of the vertical bar character ( | ) is equivalent to "or". This
# allows us to search for more than one pattern in our text strings.
julianDays <- gsub(pattern = "X|_HARV_ndvi_crop", #the pattern to find 
            x = row.names(avg_NDVI_HARV), #the object containing the strings
            replacement = "") #what to replace each instance of the pattern with

# alternately you can include the above code on one single line
# julianDays <- gsub("X|_HARV_NDVI_crop", "", row.names(avg_NDVI_HARV))

# make sure output looks ok
head(julianDays)

## [1] "005" "037" "085" "133" "181" "197"

# add julianDay values as a column in the data frame
avg_NDVI_HARV$julianDay <- julianDays

# what class is the new column
class(avg_NDVI_HARV$julianDay)

## [1] "character"

What class is our julianDay column?

**Data Tip:** To be efficient, we substituted two elements in one line of code using the "|". You can often combine commands in R to improve code efficiency. `avg_NDVI_HARV$julianDay <- gsub("X|_HARV_NDVI_crop", "", row.names(avg_NDVI_HARV))`.

Convert Julian Day to Date Class

Currently, the values in the Julian day column are stored as a character class. Storing this data as a date object is better - for plotting, data subsetting and working with our data. Let's convert.

For more information on date-time classes, see the NEON Data Skills tutorial Convert Date & Time Data from Character Class to Date-Time Class (POSIX) in R.

To convert a Julian Day number to a date class, we need to set the origin of the day which "counting" Julian Days began. Our data are from 2011, and we know that the USGS Landsat Team created Julian Day values for this year. Therefore, the first day or "origin" for our Julian day count is 01 January 2011. Once we set the Julian Day origin, we can add the Julian Day value (as an integer) to the origin date.

Since the origin date was originally set as a Date class object, the new Date column is also stored as class Date.

# set the origin for the julian date (1 Jan 2011)
origin <- as.Date("2011-01-01")

# convert "julianDay" from class character to integer
avg_NDVI_HARV$julianDay <- as.integer(avg_NDVI_HARV$julianDay)

# create a date column; -1 added because origin is the 1st. 
# If not -1, 01/01/2011 + 5 = 01/06/2011 which is Julian day 6, not 5.
avg_NDVI_HARV$Date<- origin + (avg_NDVI_HARV$julianDay-1)

# did it work? 
head(avg_NDVI_HARV$Date)

## [1] "2011-01-05" "2011-02-06" "2011-03-26" "2011-05-13" "2011-06-30"
## [6] "2011-07-16"

# What are the classes of the two columns now? 
class(avg_NDVI_HARV$Date)

## [1] "Date"

class(avg_NDVI_HARV$julianDay)

## [1] "integer"

Note that when we convert our integer class julianDay values to dates, we subtracted 1 as follows: avg_NDVI_HARV$Date <- origin + (avg_NDVI_HARV$julianDay-1) This is because the origin day is 01 January 2011, so the extracted day is 01. The Julian Day (or year day) for this is also 01. When we convert from the integer 05 julianDay value (indicating 5th of January), we cannot simply add origin + julianDay because 01 + 05 = 06 or 06 January 2011. To correct, this error we then subtract 1 to get the correct day, January 05 2011.

### Challenge: NDVI for the San Joaquin Experimental Range We often want to compare two different sites. The National Ecological Observatory Network (NEON) also has a field site in Southern California at the San Joaquin Experimental Range (SJER) .

For this challenge, compare NDVI values for the NEON Harvard Forest and San Joaquin Experimental Range field sites. NDVI data for SJER are located in the NEON-DS-Landsat-NDVI/SJER/2011/NDVI directory.

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
## prefer_proj = prefer_proj): Discarded datum Unknown based on WGS84
## ellipsoid in CRS definition

Plot NDVI Using ggplot

We now have a clean data.frame with properly scaled NDVI and Julian days. Let's plot our data.

We will use the ggplot() function within the ggplot2 package for this plot. If you are unfamiliar with ggplot() or would like to learn more about plotting in ggplot() see the tutorial on Plotting Time Series with ggplot in R .

# plot NDVI
ggplot(avg_NDVI_HARV, aes(julianDay, meanNDVI), na.rm=TRUE) +
  geom_point(size=4,colour = "PeachPuff4") + 
  ggtitle("Landsat Derived NDVI - 2011\n NEON Harvard Forest Field Site") +
  xlab("Julian Days") + ylab("Mean NDVI") +
  theme(text = element_text(size=20))

Scatterplot of mean NDVI for NEON's site Harvard Forest in 2011

Challenge: Plot San Joaquin Experimental Range Data

Create a complementary plot for the SJER data. Plot the data points in a different color.

Scatterplot of mean NDVI for NEON's site San Joaquin Experimental Range in 2011

Compare NDVI from Two Different Sites in One Plot

Comparison of plots is often easiest when both plots are side by side. Or, even better, if both sets of data are plotted in the same plot. We can do this by binding the two datasets together. The date frames must have the same number of columns and exact same column names to be bound.

# Merge Data Frames
NDVI_HARV_SJER <- rbind(avg_NDVI_HARV,avg_NDVI_SJER)  
  
# plot NDVI values for both sites
ggplot(NDVI_HARV_SJER, aes(julianDay, meanNDVI, colour=site)) +
  geom_point(size=4,aes(group=site)) + 
  geom_line(aes(group=site)) +
  ggtitle("Landsat Derived NDVI - 2011\n Harvard Forest vs San Joaquin \n NEON Field Sites") +
  xlab("Julian Day") + ylab("Mean NDVI") +
  scale_colour_manual(values=c("PeachPuff4", "SpringGreen4"))+
	# scale_colour : match previous plots
  theme(text = element_text(size=20))

Scatterplot comparing mean NDVI for NEON's sites Harvard Forest and San Joaquin Experimental Range in 2011

### Challenge: Plot NDVI with Date Plot the SJER and HARV data in one plot but use date, rather than Julian day, on the x-axis.

Scatterplot comparing mean NDVI for NEON's sites Harvard Forest and San Joaquin Experimental Range in 2011 with the date on the x-axis

Remove Outlier Data

As we look at these plots we see variation in greenness across the year. However, the pattern is interrupted by a few points where NDVI quickly drops towards 0 during a time period when we might expect the vegetation to have a larger greenness value. Is the vegetation truly senescent or gone or are these outlier values that should be removed from the data?

Let's look at the RGB images from Harvard Forest.

NOTE: the code below uses loops which we will not teach in this tutorial. However the code demonstrates one way to plot multiple RGB rasters in a grid.

# open up RGB imagery

rgb.allCropped <-  list.files(paste0(wd,"NEON-DS-Landsat-NDVI/HARV/2011/RGB/"), 
                              full.names=TRUE, 
                              pattern = ".tif$")
# create a layout
par(mfrow=c(4,4))

# super efficient code
for (aFile in rgb.allCropped){
  NDVI.rastStack <- stack(aFile)
  plotRGB(NDVI.rastStack, stretch="lin")
  }

# reset layout
par(mfrow=c(1,1))

Time series of RGB images showing greenness over time for NEON's site Harvard Forest

Notice that the data points with very low NDVI values can be associated with images that are filled with clouds. Thus, we can attribute the low NDVI values to high levels of cloud cover.

Is the same thing happening at SJER?

# open up the cropped files
rgb.allCropped.SJER <-  list.files(paste0(wd,"NEON-DS-Landsat-NDVI/SJER/2011/RGB/"), 
                              full.names=TRUE, 
                              pattern = ".tif$")
# create a layout
par(mfrow=c(5,4))

# Super efficient code
# note that there is an issue with one of the rasters
# NEON-DS-Landsat-NDVI/SJER/2011/RGB/254_SJER_landRGB.tif has a blue band with no range
# thus you can't apply a stretch to it. The code below skips the stretch for
# that one image. You could automate this by testing the range of each band in each image

for (aFile in rgb.allCropped.SJER)
  {NDVI.rastStack <- stack(aFile)
  if (aFile ==paste0(wd,"NEON-DS-Landsat-NDVI/SJER/2011/RGB//254_SJER_landRGB.tif"))
    {plotRGB(NDVI.rastStack) }
  else { plotRGB(NDVI.rastStack, stretch="lin") }
}

## Error in grDevices::rgb(RGB[, 1], RGB[, 2], RGB[, 3], alpha = alpha, max = scale): color intensity NA, not in 0:255

# reset layout
par(mfrow=c(1,1))

Time series of RGB images showing greenness over time for NEON's site San Joaquin Experimental Range

Without significant additional processing, we will not be able to retrieve a strong reflection from vegetation, from a remotely sensed image that is predominantly cloud covered. Thus, these points are likely bad data points. Let's remove them.

First, we will identify the good data points - that should be retained. One way to do this is by identifying a threhold value. All values below that threshold will be removed from our analysis. We will use 0.1 as an example for this tutorials. We can then use the subset function to remove outlier datapoints (below our identified threshold).

**Data Tip:** Thresholding, or removing outlier data, can be tricky business. In this case, we can be confident that some of our NDVI values are not valid due to cloud cover. However, a threshold value may not always be sufficient given 0.1 could be a valid NDVI value in some areas. This is where decision making should be fueled by practical scientific knowledge of the data and the desired outcomes!
# retain only rows with meanNDVI>0.1
avg_NDVI_HARV_clean<-subset(avg_NDVI_HARV, meanNDVI>0.1)

# Did it work?
avg_NDVI_HARV_clean$meanNDVI<0.1

##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

Now we can create another plot without the suspect data.

# plot without questionable data

ggplot(avg_NDVI_HARV_clean, aes(julianDay, meanNDVI)) +
  geom_point(size=4,colour = "SpringGreen4") + 
  ggtitle("Landsat Derived NDVI - 2011\n NEON Harvard Forest Field Site") +
  xlab("Julian Days") + ylab("Mean NDVI") +
  theme(text = element_text(size=20))

Scatterplot of mean NDVI with outliers removed for NEON's site Harvard Forest in 2011

Now our outlier data points are removed and the pattern of "green-up" and "brown-down" makes a bit more sense.

Write NDVI data to a .csv File

We can write our final NDVI data.frame out to a text format, to quickly share with a colleague or to resuse for analysis or visualization purposes. We will export in Comma Separated Value (.csv) file format given it is usable in many different tools and across platforms (MAC, PC, etc).

We will use write.csv() to write a specified data.frame to a .csv file. Unless you designate a different directory, the output file will be saved in your working directory.

Before saving our file, let's quickly view the format to make sure it is what we want as an output format.

# confirm data frame is the way we want it

head(avg_NDVI_HARV_clean)

##                     meanNDVI site year julianDay       Date
## X005_HARV_ndvi_crop 0.365150 HARV 2011         5 2011-01-05
## X037_HARV_ndvi_crop 0.242645 HARV 2011        37 2011-02-06
## X085_HARV_ndvi_crop 0.251390 HARV 2011        85 2011-03-26
## X133_HARV_ndvi_crop 0.599300 HARV 2011       133 2011-05-13
## X181_HARV_ndvi_crop 0.878725 HARV 2011       181 2011-06-30
## X197_HARV_ndvi_crop 0.893250 HARV 2011       197 2011-07-16

It looks like we have a series of row.names that we do not need given we have this information stored in individual columns in our data.frame. Let's remove the row names.

# create new df to prevent changes to avg_NDVI_HARV
NDVI_HARV_toWrite<-avg_NDVI_HARV_clean

# drop the row.names column 
row.names(NDVI_HARV_toWrite)<-NULL

# check data frame
head(NDVI_HARV_toWrite)

##   meanNDVI site year julianDay       Date
## 1 0.365150 HARV 2011         5 2011-01-05
## 2 0.242645 HARV 2011        37 2011-02-06
## 3 0.251390 HARV 2011        85 2011-03-26
## 4 0.599300 HARV 2011       133 2011-05-13
## 5 0.878725 HARV 2011       181 2011-06-30
## 6 0.893250 HARV 2011       197 2011-07-16


# create a .csv of mean NDVI values being sure to give descriptive name
# write.csv(DateFrameName, file="NewFileName")
write.csv(NDVI_HARV_toWrite, file=paste0(wd,"meanNDVI_HARV_2011.csv"))
### Challenge: Write to .csv
  1. Create a NDVI .csv file for the NEON SJER field site that is comparable with the one we just created for the Harvard Forest. Be sure to inspect for questionable values before writing any data to a .csv file.
  2. Create a NDVI .csv file that stacks data from both field sites.

Get Lesson Code

07-Extract-NDVI-From-Rasters-in-R.R
NEON Logo

Follow Us:

Join Our Newsletter

Get updates on events, opportunities, and how NEON is being used today.

Subscribe Now

Footer

  • My Account
  • About Us
  • Newsroom
  • Contact Us
  • Terms & Conditions
  • Careers

Copyright © Battelle, 2019-2020

The National Ecological Observatory Network is a major facility fully funded by the National Science Foundation.

Any opinions, findings and conclusions or recommendations expressed in this material do not necessarily reflect the views of the National Science Foundation.