Series
Introduction to Hyperspectral Remote Sensing Data
In this series, we cover the basics of working with NEON hyperspectral remote sensing data. We cover the principles of hyperspectral data, how to open hyperspectral data stored in HDF5 format in R and how to extract bands and create rasters in GeoTIFF format. Finally we explore extracting a hyperspectral-spectral signature from a single pixel using R.
Data used in this series are from the National Ecological Observatory Network (NEON) and are in HDF5 format.
Series Objectives
After completing the series you will:
- Understand the collection of hyperspectral remote sensing data and how they can be used
- Understand how HDF5 data can be used to store spatial data and the associated benefits of this format when working with large spatial data cubes
- Know how to extract metadata from HDF5 files
- Know how to plot a matrix as an image and a raster
- Understand how to extract and plot spectra from an HDF5 file
- Know how to work with groups and datasets within an HDF5 file
- Know how to export a spatially projected GeoTIFF
- Create a rasterstack in R which can then be used to create RGB images from bands in a hyperspectral data cube
- Plot data spatially on a map
- Create basic vegetation indices, like NDVI, using raster-based calculations in R
Things You’ll Need To Complete This Series
Setup RStudio
To complete some of the tutorials in this series, you will need an updated version of R
and, preferably, RStudio installed on your computer.
R is a programming language that specializes in statistical computing. It is a powerful tool for exploratory data analysis. To interact with R, we strongly recommend RStudio, an interactive development environment (IDE).
Download Data
Data is available for download in each tutorial that it is needed in.
About Hyperspectral Remote Sensing Data
Authors: Leah A. Wasser
Last Updated: Sep 27, 2023
Learning Objectives
After completing this tutorial, you will be able to:
- Define hyperspectral remote sensing.
- Explain the fundamental principles of hyperspectral remote sensing data.
- Describe the key attributes that are required to effectively work with hyperspectral remote sensing data in tools like R or Python.
- Describe what a "band" is.
Mapping the Invisible
About Hyperspectral Remote Sensing Data
The electromagnetic spectrum is composed of thousands of bands representing different types of light energy. Imaging spectrometers (instruments that collect hyperspectral data) break the electromagnetic spectrum into groups of bands that support classification of objects by their spectral properties on the earth's surface. Hyperspectral data consists of many bands -- up to hundreds of bands -- that cover the electromagnetic spectrum.
The NEON imaging spectrometer collects data within the 380nm to 2510nm portions of the electromagnetic spectrum within bands that are approximately 5nm in width. This results in a hyperspectral data cube that contains approximately 426 bands - which means big, big data.
Key Metadata for Hyperspectral Data
Bands and Wavelengths
A band represents a group of wavelengths. For example, the wavelength values between 695nm and 700nm might be one band as captured by an imaging spectrometer. The imaging spectrometer collects reflected light energy in a pixel for light in that band. Often when you work with a multi or hyperspectral dataset, the band information is reported as the center wavelength value. This value represents the center point value of the wavelengths represented in that band. Thus in a band spanning 695-700 nm, the center would be 697.5).

Spectral Resolution
The spectral resolution of a dataset that has more than one band, refers to the width of each band in the dataset. In the example above, a band was defined as spanning 695-700nm. The width or spatial resolution of the band is thus 5 nanometers. To see an example of this, check out the band widths for the Landsat sensors.
Full Width Half Max (FWHM)
The full width half max (FWHM) will also often be reported in a multi or hyperspectral dataset. This value represents the spread of the band around that center point.

In the illustration above, the band that covers 695-700nm has a FWHM of 5 nm. While a general spectral resolution of the sensor is often provided, not all sensors create bands of uniform widths. For instance bands 1-9 of Landsat 8 are listed below (Courtesy of USGS)
Band | Wavelength range (microns) | Spatial Resolution (m) | Spectral Width (microns) |
---|---|---|---|
Band 1 - Coastal aerosol | 0.43 - 0.45 | 30 | 0.02 |
Band 2 - Blue | 0.45 - 0.51 | 30 | 0.06 |
Band 3 - Green | 0.53 - 0.59 | 30 | 0.06 |
Band 4 - Red | 0.64 - 0.67 | 30 | 0.03 |
Band 5 - Near Infrared (NIR) | 0.85 - 0.88 | 30 | 0.03 |
Band 6 - SWIR 1 | 1.57 - 1.65 | 30 | 0.08 |
Band 7 - SWIR 2 | 2.11 - 2.29 | 30 | 0.18 |
Band 8 - Panchromatic | 0.50 - 0.68 | 15 | 0.18 |
Band 9 - Cirrus | 1.36 - 1.38 | 30 | 0.02 |
Data Product:
DP3.30006.001 | Spectrometer orthorectified surface directional reflectance - mosaicExplore Related Data Products
These data products are used in this tutorial:
Intro to Working with Hyperspectral Remote Sensing Data in HDF5 Format in R
Authors: Leah A. Wasser, Edmund Hart, Donal O'Leary
Last Updated: Nov 23, 2020
In this tutorial, we will explore reading and extracting spatial raster data stored within a HDF5 file using R.
Learning Objectives
After completing this tutorial, you will be able to:
- Explain how HDF5 data can be used to store spatial data and the associated benefits of this format when working with large spatial data cubes.
- Extract metadata from HDF5 files.
- Slice or subset HDF5 data. You will extract one band of pixels.
- Plot a matrix as an image and a raster.
- Export a final GeoTIFF (spatially projected) that can be used both in further analysis and in common GIS tools like QGIS.
Things You’ll Need To Complete This Tutorial
To complete this tutorial you will need the most current version of R and, preferably, RStudio loaded on your computer.
R Libraries to Install:
-
rhdf5:
install.packages("BiocManager")
,BiocManager::install("rhdf5")
-
raster:
install.packages('raster')
-
rgdal:
install.packages('rgdal')
More on Packages in R - Adapted from Software Carpentry.
Data to Download
Download NEON Teaching Data Subset: Imaging Spectrometer Data - HDF5
These hyperspectral remote sensing data provide information on the National Ecological Observatory Network's San Joaquin Exerimental Range field site in March of 2019. The data were collected over the San Joaquin field site located in California (Domain 17) and processed at NEON headquarters. This data subset is derived from the mosaic tile named NEON_D17_SJER_DP3_257000_4112000_reflectance.h5. The entire dataset can be accessed by request from the NEON Data Portal.
Download DatasetRemember that the example dataset linked here only has 1 out of every 4 bands included in a full NEON hyperspectral dataset (this substantially reduces the file size!). When we refer to bands in this tutorial, we will note the band numbers for this example dataset, which are different from NEON production data. To convert a band number (b) from this example data subset to the equivalent band in a full NEON hyperspectral file (b'), use the following equation: b' = 1+4*(b-1).
Set Working Directory: This lesson assumes that you have set your working directory to the location of the downloaded and unzipped data subsets.
An overview of setting the working directory in R can be found here.
R Script & Challenge Code: NEON data lessons often contain challenges that reinforce learned skills. If available, the code for challenge solutions is found in the downloadable R script of the entire lesson, available in the footer of each lesson page.
About Hyperspectral Remote Sensing Data
The electromagnetic spectrum is composed of thousands of bands representing different types of light energy. Imaging spectrometers (instruments that collect hyperspectral data) break the electromagnetic spectrum into groups of bands that support classification of objects by their spectral properties on the Earth's surface. Hyperspectral data consists of many bands - up to hundreds of bands - that cover the electromagnetic spectrum.
The NEON imaging spectrometer (NIS) collects data within the 380 nm to 2510 nm portions of the electromagnetic spectrum within bands that are approximately 5 nm in width. This results in a hyperspectral data cube that contains approximately 428 bands - which means BIG DATA. Remember that the example dataset used here only has 1 out of every 4 bands included in a full NEON hyperspectral dataset (this substantially reduces size!). When we refer to bands in this tutorial, we will note the band numbers for this example dataset, which may be different from NEON production data.

The HDF5 data model natively compresses data stored within it (makes it smaller) and supports data slicing (extracting only the portions of the data that you need to work with rather than reading the entire dataset into memory). These features in addition to the ability to support spatial data and associated metadata make it ideal for working with large data cubes such as those generated by imaging spectrometers.
In this tutorial we will explore reading and extracting spatial raster data stored within a HDF5 file using R.
Read HDF5 data into R
We will use the raster
and rhdf5
packages to read in the HDF5 file that
contains hyperspectral data for the
NEON San Joaquin (SJER) field site.
Let's start by calling the needed packages and reading in our NEON HDF5 file.
Please be sure that you have at least version 2.10 of rhdf5
installed. Use:
packageVersion("rhdf5")
to check the package version.
# Load `raster` and `rhdf5` packages and read NIS data into R
library(raster)
library(rhdf5)
library(rgdal)
# set working directory to ensure R can find the file we wish to import and where
# we want to save our files. Be sure to move the download into your working directory!
wd <- "~/Documents/data/" #This will depend on your local environment
setwd(wd)
# Define the file name to be opened
f <- paste0(wd,"NEON_hyperspectral_tutorial_example_subset.h5")
# look at the HDF5 file structure
View(h5ls(f,all=T))
When you look at the structure of the data, take note of the "map info" dataset, the "Coordinate_System" group, and the "wavelength" and "Reflectance" datasets. The "Coordinate_System" folder contains the spatial attributes of the data including its EPSG Code, which is easily converted to a Coordinate Reference System (CRS). The CRS documents how the data are physically located on the Earth. The "wavelength" dataset contains the middle wavelength values for each band in the data. The "Reflectance" dataset contains the image data that we will use for both data processing and visualization.
More Information on raster metadata:
- The Basics - this tutorial explains more about how rasters work in R and their associated metadata.
- About Hyperspectral Remote Sensing Data -this tutorial explains more about metadata and important concepts associated with multi-band (multi and hyperspectral) rasters.
We can use the h5readAttributes()
function to read and extract metadata from the
HDF5 file. Let's start by learning about the wavelengths described within this file.
# get information about the wavelengths of this dataset
wavelengthInfo <- h5readAttributes(f,"/SJER/Reflectance/Metadata/Spectral_Data/Wavelength")
wavelengthInfo
## $Description
## [1] "Central wavelength of the reflectance bands."
##
## $Units
## [1] "nanometers"
Next, we can use the h5read
function to read the data contained within the
HDF5 file. Let's read in the wavelengths of the band centers:
# read in the wavelength information from the HDF5 file
wavelengths <- h5read(f,"/SJER/Reflectance/Metadata/Spectral_Data/Wavelength")
head(wavelengths)
## [1] 381.5437 401.5756 421.6075 441.6394 461.6713 481.7032
tail(wavelengths)
## [1] 2404.764 2424.796 2444.828 2464.860 2484.892 2504.924
Which wavelength is band 6 associated with?
(Hint: look at the wavelengths
vector that we just imported and check out the data located at index 6 -
wavelengths[6]
).

Band 6 has a associate wavelength center of 481.7032 nanometers (nm) which is in the blue portion of the visible electromagnetic spectrum (~ 400-700 nm).
Bands and Wavelengths
A band represents a group of wavelengths. For example, the wavelength values between 695 nm and 700 nm might be one band as captured by an imaging spectrometer. The imaging spectrometer collects reflected light energy in a pixel for light in that band. Often when you work with a multi or hyperspectral dataset, the band information is reported as the center wavelength value. This value represents the center point value of the wavelengths represented in that band. Thus in a band spanning 695-700 nm, the center would be 697.5 nm). The full width half max (FWHM) will also be reported. This value represents the spread of the band around that center point. So, a band that covers 800 nm-805 nm might have a FWHM of 5 nm and a wavelength value of 802.5 nm.

The HDF5 dataset that we are working with in this activity may contain more information than we need to work with. For example, we don't necessarily need to process all 107 bands available in this example dataset (or all 426 bands available in a full NEON hyperspectral reflectance file, for that matter)
- if we are interested in creating a product like NDVI which only uses bands in the near infra-red and red portions of the spectrum. Or we might only be interested in a spatial subset of the data - perhaps a region where we have plots in the field.
The HDF5 format allows us to slice (or subset) the data - quickly extracting the subset that we need to process. Let's extract one of the green bands in our dataset - band 9.
By the way - what is the center wavelength value associated with band 9?
Hint: wavelengths[9]
.
How do we know this band is a green band in the visible portion of the spectrum?
In order to effectively subset our data, let's first read the important reflectance metadata stored as attributes in the "Reflectance_Data" dataset.
# First, we need to extract the reflectance metadata:
reflInfo <- h5readAttributes(f, "/SJER/Reflectance/Reflectance_Data")
reflInfo
## $Cloud_conditions
## [1] "For cloud conditions information see Weather Quality Index dataset."
##
## $Cloud_type
## [1] "Cloud type may have been selected from multiple flight trajectories."
##
## $Data_Ignore_Value
## [1] -9999
##
## $Description
## [1] "Atmospherically corrected reflectance."
##
## $Dimension_Labels
## [1] "Line, Sample, Wavelength"
##
## $Dimensions
## [1] 500 500 107
##
## $Interleave
## [1] "BSQ"
##
## $Scale_Factor
## [1] 10000
##
## $Spatial_Extent_meters
## [1] 257500 258000 4112500 4113000
##
## $Spatial_Resolution_X_Y
## [1] 1 1
##
## $Units
## [1] "Unitless."
##
## $Units_Valid_range
## [1] 0 10000
##
## $dim
## [1] 107 500 500
# Next, we read the different dimensions
nRows <- reflInfo$Dimensions[1]
nCols <- reflInfo$Dimensions[2]
nBands <- reflInfo$Dimensions[3]
nRows
## [1] 500
nCols
## [1] 500
nBands
## [1] 107
The HDF5 read function reads data in the order: Bands, Cols, Rows. This is different from how R reads data. We'll adjust for this later.
# Extract or "slice" data for band 9 from the HDF5 file
b9 <- h5read(f,"/SJER/Reflectance/Reflectance_Data",index=list(9,1:nCols,1:nRows))
# what type of object is b9?
class(b9)
## [1] "array"
A Note About Data Slicing in HDF5
Data slicing allows us to extract and work with subsets of the data rather than reading in the entire dataset into memory. Thus, in this case, we can extract and plot the green band without reading in all 107 bands of information. The ability to slice large datasets makes HDF5 ideal for working with big data.
Next, let's convert our data from an array (more than 2 dimensions) to a matrix (just 2 dimensions). We need to have our data in a matrix format to plot it.
# convert from array to matrix by selecting only the first band
b9 <- b9[1,,]
# check it
class(b9)
## [1] "matrix"
Arrays vs. Matrices
Arrays are matrices with more than 2 dimensions. When we say dimension, we are talking about the "z" associated with the data (imagine a series of tabs in a spreadsheet). Put the other way: matrices are arrays with only 2 dimensions. Arrays can have any number of dimensions one, two, ten or more.
Here is a matrix that is 4 x 3 in size (4 rows and 3 columns):
Metric | species 1 | species 2 |
---|---|---|
total number | 23 | 45 |
average weight | 14 | 5 |
average length | 2.4 | 3.5 |
average height | 32 | 12 |
Dimensions in Arrays
An array contains 1 or more dimensions in the "z" direction. For example, let's say that we collected the same set of species data for every day in a 30 day month. We might then have a matrix like the one above for each day for a total of 30 days making a 4 x 3 x 30 array (this dataset has more than 2 dimensions). More on R object types here (links to external site, DataCamp).


Next, let's look at the metadata for the reflectance data. When we do this, take note of 1) the scale factor and 2) the data ignore value. Then we can plot the band 9 data. Plotting spatial data as a visual "data check" is a good idea to make sure processing is being performed correctly and all is well with the image.
# look at the metadata for the reflectance dataset
h5readAttributes(f,"/SJER/Reflectance/Reflectance_Data")
## $Cloud_conditions
## [1] "For cloud conditions information see Weather Quality Index dataset."
##
## $Cloud_type
## [1] "Cloud type may have been selected from multiple flight trajectories."
##
## $Data_Ignore_Value
## [1] -9999
##
## $Description
## [1] "Atmospherically corrected reflectance."
##
## $Dimension_Labels
## [1] "Line, Sample, Wavelength"
##
## $Dimensions
## [1] 500 500 107
##
## $Interleave
## [1] "BSQ"
##
## $Scale_Factor
## [1] 10000
##
## $Spatial_Extent_meters
## [1] 257500 258000 4112500 4113000
##
## $Spatial_Resolution_X_Y
## [1] 1 1
##
## $Units
## [1] "Unitless."
##
## $Units_Valid_range
## [1] 0 10000
##
## $dim
## [1] 107 500 500
# plot the image
image(b9)
# oh, that is hard to visually interpret.
# what happens if we plot a log of the data?
image(log(b9))
What do you notice about the first image? It's washed out and lacking any detail. What could be causing this? It got better when plotting the log of the values, but still not great.
Let's look at the distribution of reflectance values in our data to figure out what is going on.
# Plot range of reflectance values as a histogram to view range
# and distribution of values.
hist(b9,breaks=40,col="darkmagenta")
# View values between 0 and 5000
hist(b9,breaks=40,col="darkmagenta",xlim = c(0, 5000))
# View higher values
hist(b9, breaks=40,col="darkmagenta",xlim = c(5000, 15000),ylim=c(0,100))
As you're examining the histograms above, keep in mind that reflectance values range between 0-1. The data scale factor in the metadata tells us to divide all reflectance values by 10,000. Thus, a value of 5,000 equates to a reflectance value of 0.50. Storing data as integers (without decimal places) compared to floating points (with decimal places) creates a smaller file. You will see this done often when working with remote sensing data.
Notice in the data that there are some larger reflectance values (>5,000) that represent a smaller number of pixels. These pixels are skewing how the image renders.
Data Ignore Value
Image data in raster format will often contain a data ignore value and a scale factor. The data ignore value represents pixels where there are no data. Among other causes, no data values may be attributed to the sensor not collecting data in that area of the image or to processing results which yield null values.
Remember that the metadata for the Reflectance
dataset designated -9999 as
data ignore value
. Thus, let's set all pixels with a value == -9999 to NA
(no value). If we do this, R won't try to render these pixels.
# there is a no data value in our raster - let's define it
myNoDataValue <- as.numeric(reflInfo$Data_Ignore_Value)
myNoDataValue
## [1] -9999
# set all values equal to -9999 to NA
b9[b9 == myNoDataValue] <- NA
# plot the image now
image(b9)
Reflectance Values and Image Stretch
Our image still looks dark because R is trying to render all reflectance values between 0 and 14999 as if they were distributed equally in the histogram. However we know they are not distributed equally. There are many more values between 0-5000 then there are values >5000.
Images have a distribution of reflectance values. A typical image viewing program
will render the values by distributing the entire range of reflectance values
across a range of "shades" that the monitor can render - between 0 and 255.
However, often the distribution of reflectance values is not linear. For example,
in the case of our data, most of the reflectance values fall between 0 and 0.5.
Yet there are a few values >0.8 that are heavily impacting the way the image is
drawn on our monitor. Imaging processing programs like ENVI, QGIS and ArcGIS (and
even Adobe Photoshop) allow you to adjust the stretch of the image. This is similar
to adjusting the contrast and brightness in Photoshop.
The proper way to adjust our data would be
what's called an image stretch
. We will learn how to stretch our image data,
later. For now, let's plot the values as the log function on the pixel
reflectance values to factor out those larger values.
image(log(b9))
The log applied to our image increases the contrast making it look more like an image. However, look at the images below. The top one is what our log adjusted image looks like when plotted. The bottom on is an RGB version of the same image. Notice a difference?


Transpose Image
Notice that there are three data dimensions for this file: Bands x Rows x
Columns. However, when R reads in the dataset, it reads them as: Columns x
Bands x Rows. The data are flipped. We can quickly transpose the data to correct
for this using the t
or transpose
command in R.
The orientation is rotated in our log adjusted image. This is because R reads in matrices starting from the upper left hand corner. Whereas, most rasters read pixels starting from the lower left hand corner. In the next section, we will deal with this issue by creating a proper georeferenced (spatially located) raster in R. The raster format will read in pixels following the same methods as other GIS and imaging processing software like QGIS and ENVI do.
# We need to transpose x and y values in order for our
# final image to plot properly
b9 <- t(b9)
image(log(b9), main="Transposed Image")
Create a Georeferenced Raster
Next, we will create a proper raster using the b9
matrix. The raster
format will allow us to define and manage:
- Image stretch
- Coordinate reference system & spatial reference
- Resolution
- and other raster attributes...
It will also account for the orientation issue discussed above.
To create a raster in R, we need a few pieces of information, including:
- The coordinate reference system (CRS)
- The spatial extent of the image
Define Raster CRS
First, we need to define the Coordinate reference system (CRS) of the raster. To do that, we can first grab the EPSG code from the HDF5 attributes, and covert the EPSG to a CRS string. Then we can assign that CRS to the raster object.
# Extract the EPSG from the h5 dataset
myEPSG <- h5read(f, "/SJER/Reflectance/Metadata/Coordinate_System/EPSG Code")
# convert the EPSG code to a CRS string
myCRS <- crs(paste0("+init=epsg:",myEPSG))
# define final raster with projection info
# note that capitalization will throw errors on a MAC.
# if UTM is all caps it might cause an error!
b9r <- raster(b9,
crs=myCRS)
# view the raster attributes
b9r
## class : RasterLayer
## dimensions : 500, 500, 250000 (nrow, ncol, ncell)
## resolution : 0.002, 0.002 (x, y)
## extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
## crs : +init=epsg:32611 +proj=utm +zone=11 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
## source : memory
## names : layer
## values : 0, 9210 (min, max)
# let's have a look at our properly oriented raster. Take note of the
# coordinates on the x and y axis.
image(log(b9r),
xlab = "UTM Easting",
ylab = "UTM Northing",
main = "Properly Oriented Raster")
Next we define the extents of our raster. The extents will be used to calculate
the raster's resolution. Fortunately, the spatial extent is provided in the
HDF5 file "Reflectance_Data" group attributes that we saved before as reflInfo
.
# Grab the UTM coordinates of the spatial extent
xMin <- reflInfo$Spatial_Extent_meters[1]
xMax <- reflInfo$Spatial_Extent_meters[2]
yMin <- reflInfo$Spatial_Extent_meters[3]
yMax <- reflInfo$Spatial_Extent_meters[4]
# define the extent (left, right, top, bottom)
rasExt <- extent(xMin,xMax,yMin,yMax)
rasExt
## class : Extent
## xmin : 257500
## xmax : 258000
## ymin : 4112500
## ymax : 4113000
# assign the spatial extent to the raster
extent(b9r) <- rasExt
# look at raster attributes
b9r
## class : RasterLayer
## dimensions : 500, 500, 250000 (nrow, ncol, ncell)
## resolution : 1, 1 (x, y)
## extent : 257500, 258000, 4112500, 4113000 (xmin, xmax, ymin, ymax)
## crs : +init=epsg:32611 +proj=utm +zone=11 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
## source : memory
## names : layer
## values : 0, 9210 (min, max)

Learn more about raster attributes including extent, and coordinate reference systems here.
We can adjust the colors of our raster too if we want.
# let's change the colors of our raster and adjust the zlims
col <- terrain.colors(25)
image(b9r,
xlab = "UTM Easting",
ylab = "UTM Northing",
main= "Raster w Custom Colors",
col=col,
zlim=c(0,3000))
We've now created a raster from band 9 reflectance data. We can export the data
as a raster, using the writeRaster
command.
# write out the raster as a geotiff
writeRaster(b9r,
file=paste0(wd,"band9.tif"),
format="GTiff",
overwrite=TRUE)
# It's always good practice to close the H5 connection before moving on!
# close the H5 file
H5close()
Try these three extensions on your own:
-
Create rasters using other bands in the dataset.
-
Vary the distribution of values in the image to mimic an image stretch. e.g.
b9[b9 > 6000 ] <- 6000
-
Use what you know to extract ALL of the reflectance values for ONE pixel rather than for an entire band. HINT: this will require you to pick an x and y value and then all values in the z dimension:
aPixel<- h5read(f,"Reflectance",index=list(NULL,100,35))
. Plot the spectra output.
Get Lesson Code
Creating a Raster Stack from Hyperspectral Imagery in HDF5 Format in R
Authors: Edmund Hart, Leah A. Wasser, Donal O'Leary
Last Updated: Nov 23, 2020
In this tutorial, we will learn how to create multi (3) band images from hyperspectral data. We will also learn how to perform some basic raster calculations (known as raster math in the GIS world).
Learning Objectives
After completing this activity, you will be able to:
- Extract a "slice" of data from a hyperspectral data cube.
- Create a rasterstack in R which can then be used to create RGB images from bands in a hyperspectral data cube.
- Plot data spatially on a map.
- Create basic vegetation indices like NDVI using raster-based calculations in R.
Things You’ll Need To Complete This Tutorial
To complete this tutorial you will need the most current version of R and, preferably, RStudio loaded on your computer.
R Libraries to Install:
-
rhdf5:
install.packages("BiocManager")
,BiocManager::install("rhdf5")
-
raster:
install.packages('raster')
-
rgdal:
install.packages('rgdal')
-
maps:
install.packages('maps')
More on Packages in R - Adapted from Software Carpentry.
Data to Download
Download NEON Teaching Data Subset: Imaging Spectrometer Data - HDF5
These hyperspectral remote sensing data provide information on the National Ecological Observatory Network's San Joaquin Exerimental Range field site in March of 2019. The data were collected over the San Joaquin field site located in California (Domain 17) and processed at NEON headquarters. This data subset is derived from the mosaic tile named NEON_D17_SJER_DP3_257000_4112000_reflectance.h5. The entire dataset can be accessed by request from the NEON Data Portal.
Download DatasetRemember that the example dataset linked here only has 1 out of every 4 bands included in a full NEON hyperspectral dataset (this substantially reduces the file size!). When we refer to bands in this tutorial, we will note the band numbers for this example dataset, which are different from NEON production data. To convert a band number (b) from this example data subset to the equivalent band in a full NEON hyperspectral file (b'), use the following equation: b' = 1+4*(b-1).
Set Working Directory: This lesson assumes that you have set your working directory to the location of the downloaded and unzipped data subsets.
An overview of setting the working directory in R can be found here.
R Script & Challenge Code: NEON data lessons often contain challenges that reinforce learned skills. If available, the code for challenge solutions is found in the downloadable R script of the entire lesson, available in the footer of each lesson page.
Recommended Skills
For this tutorial you should be comfortable working with HDF5 files that contain hyperspectral data, including reading in reflectance values and associated metadata and attributes.
If you aren't familiar with these steps already, we highly recommend you work through the Introduction to Working with Hyperspectral Data in HDF5 Format in R tutorial before moving on to this tutorial.
About Hyperspectral Data
We often want to generate a 3 band image from multi or hyperspectral data. The most commonly recognized band combination is RGB which stands for Red, Green and Blue. RGB images are just like the images that your camera takes. But there are other band combinations that are useful too. For example, near infrared images emphasize vegetation and help us classify or identify where vegetation is located on the ground.


Create a Raster Stack in R
In the previous activity, we exported a single band of the NEON Reflectance data from a HDF5 file. In this activity, we will create a full color image using 3 (red, green and blue - RGB) bands. We will follow many of the steps we followed in the Intro to Working with Hyperspectral Remote Sensing Data in HDF5 Format in R tutorial. These steps included loading required packages, reading in our file and viewing the file structure.
# Load required packages
library(raster)
library(rhdf5)
# set working directory to ensure R can find the file we wish to import and where
# we want to save our files. Be sure to move the download into your working directory!
wd <- "~/Documents/data/" # This will depend on your local environment
setwd(wd)
# create path to file name
f <- paste0(wd,"NEON_hyperspectral_tutorial_example_subset.h5")
# View HDF5 file structure
View(h5ls(f,all=T))
To spatially locate our raster data, we need a few key attributes:
- The coordinate reference system
- The spatial extent of the raster
We'll begin by grabbing these key attributes from the H5 file.
# define coordinate reference system from the EPSG code provided in the HDF5 file
myEPSG <- h5read(f,"/SJER/Reflectance/Metadata/Coordinate_System/EPSG Code" )
myCRS <- crs(paste0("+init=epsg:",myEPSG))
# get the Reflectance_Data attributes
reflInfo <- h5readAttributes(f,"/SJER/Reflectance/Reflectance_Data" )
# Grab the UTM coordinates of the spatial extent
xMin <- reflInfo$Spatial_Extent_meters[1]
xMax <- reflInfo$Spatial_Extent_meters[2]
yMin <- reflInfo$Spatial_Extent_meters[3]
yMax <- reflInfo$Spatial_Extent_meters[4]
# define the extent (left, right, top, bottom)
rasExt <- extent(xMin,xMax,yMin,yMax)
# view the extent to make sure that it looks right
rasExt
## class : Extent
## xmin : 257500
## xmax : 258000
## ymin : 4112500
## ymax : 4113000
# Finally, define the no data value for later
myNoDataValue <- as.integer(reflInfo$Data_Ignore_Value)
myNoDataValue
## [1] -9999
Next, we'll write a function that will perform the processing that we did step by step in the Intro to Working with Hyperspectral Remote Sensing Data in HDF5 Format in R. This will allow us to process multiple bands in bulk.
The function band2Rast
slices a band of data from the HDF5 file, and
extracts the reflectance. It them converts the data to a matrix, converts it to
a raster and returns a spatially corrected raster for the specified band.
The function requires the following variables:
- file: the file
- band: the band number we wish to extract
- noDataValue: the noDataValue for the raster
- extent: a raster
Extent
object . - crs: the Coordinate Reference System for the raster
The function output is a spatially referenced, R raster object.
# file: the hdf file
# band: the band you want to process
# returns: a matrix containing the reflectance data for the specific band
band2Raster <- function(file, band, noDataValue, extent, CRS){
# first, read in the raster
out <- h5read(file,"/SJER/Reflectance/Reflectance_Data",index=list(band,NULL,NULL))
# Convert from array to matrix
out <- (out[1,,])
# transpose data to fix flipped row and column order
# depending upon how your data are formatted you might not have to perform this
# step.
out <- t(out)
# assign data ignore values to NA
# note, you might chose to assign values of 15000 to NA
out[out == myNoDataValue] <- NA
# turn the out object into a raster
outr <- raster(out,crs=CRS)
# assign the extents to the raster
extent(outr) <- extent
# return the raster object
return(outr)
}
Now that the function is created, we can create our list of rasters. The list specifies which bands (or dimensions in our hyperspectral dataset) we want to include in our raster stack. Let's start with a typical RGB (red, green, blue) combination. We will use bands 14, 9, and 4 (bands 58, 34, and 19 in a full NEON hyperspectral dataset).
# create a list of the bands we want in our stack
rgb <- list(14,9,4) #list(58,34,19) when using full NEON hyperspectral dataset
# lapply tells R to apply the function to each element in the list
rgb_rast <- lapply(rgb,FUN=band2Raster, file = f,
noDataValue=myNoDataValue,
extent=rasExt,
CRS=myCRS)
# check out the properties or rgb_rast
# note that it displays properties of 3 rasters.
rgb_rast
## [[1]]
## class : RasterLayer
## dimensions : 500, 500, 250000 (nrow, ncol, ncell)
## resolution : 1, 1 (x, y)
## extent : 257500, 258000, 4112500, 4113000 (xmin, xmax, ymin, ymax)
## crs : +init=epsg:32611 +proj=utm +zone=11 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
## source : memory
## names : layer
## values : 0, 9418 (min, max)
##
##
## [[2]]
## class : RasterLayer
## dimensions : 500, 500, 250000 (nrow, ncol, ncell)
## resolution : 1, 1 (x, y)
## extent : 257500, 258000, 4112500, 4113000 (xmin, xmax, ymin, ymax)
## crs : +init=epsg:32611 +proj=utm +zone=11 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
## source : memory
## names : layer
## values : 0, 9210 (min, max)
##
##
## [[3]]
## class : RasterLayer
## dimensions : 500, 500, 250000 (nrow, ncol, ncell)
## resolution : 1, 1 (x, y)
## extent : 257500, 258000, 4112500, 4113000 (xmin, xmax, ymin, ymax)
## crs : +init=epsg:32611 +proj=utm +zone=11 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
## source : memory
## names : layer
## values : 0, 9704 (min, max)
# finally, create a raster stack from our list of rasters
rgbStack <- stack(rgb_rast)
In the code chunk above, we used the lapply()
function, which is a powerful,
flexible way to apply a function (in this case, our band2Raster()
fucntion)
multiple times. You can learn more about lapply() here.
NOTE: We are using the raster stack
object in R to store several rasters that
are of the same CRS and extent. This is a popular and convenient way to organize
co-incident rasters.
Next, add the names of the bands to the raster so we can easily keep track of the bands in the list.
# Create a list of band names
bandNames <- paste("Band_",unlist(rgb),sep="")
# set the rasterStack's names equal to the list of bandNames created above
names(rgbStack) <- bandNames
# check properties of the raster list - note the band names
rgbStack
## class : RasterStack
## dimensions : 500, 500, 250000, 3 (nrow, ncol, ncell, nlayers)
## resolution : 1, 1 (x, y)
## extent : 257500, 258000, 4112500, 4113000 (xmin, xmax, ymin, ymax)
## crs : +init=epsg:32611 +proj=utm +zone=11 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
## names : Band_14, Band_9, Band_4
## min values : 0, 0, 0
## max values : 9418, 9210, 9704
# scale the data as specified in the reflInfo$Scale Factor
rgbStack <- rgbStack/as.integer(reflInfo$Scale_Factor)
# plot one raster in the stack to make sure things look OK.
plot(rgbStack$Band_14, main="Band 14")
We can play with the color ramps too if we want:
# change the colors of our raster
myCol <- terrain.colors(25)
image(rgbStack$Band_14, main="Band 14", col=myCol)
# adjust the zlims or the stretch of the image
myCol <- terrain.colors(25)
image(rgbStack$Band_14, main="Band 14", col=myCol, zlim = c(0,.5))
# try a different color palette
myCol <- topo.colors(15, alpha = 1)
image(rgbStack$Band_14, main="Band 14", col=myCol, zlim=c(0,.5))
The plotRGB
function allows you to combine three bands to create an image.
# create a 3 band RGB image
plotRGB(rgbStack,
r=1,g=2,b=3,
stretch = "lin")
A note about image stretching: Notice that we use the argument
stretch="lin"
in this plotting function, which automatically stretches the
brightness values for us to produce a natural-looking image.
Once you've created your raster, you can export it as a GeoTIFF. You can bring this GeoTIFF into any GIS program!
# write out final raster
# note: if you set overwrite to TRUE, then you will overwite or lose the older
# version of the tif file! Keep this in mind.
writeRaster(rgbStack, file=paste0(wd,"NEON_hyperspectral_tutorial_example_RGB_stack_image.tif"), format="GTiff", overwrite=TRUE)
Use different band combinations to create other "RGB" images. Suggested band combinations are below for use with the full NEON hyperspectral reflectance datasets (for this example dataset, divide the band number by 4 and round to the nearest whole number):
- Color Infrared/False Color: rgb (90,34,19)
- SWIR, NIR, Red Band: rgb (152,90,58)
- False Color: rgb (363,246,55)
Raster Math - Creating NDVI and other Vegetation Indices in R
In this last part, we will calculate some vegetation indices using raster math in R! We will start by creating NDVI or Normalized Difference Vegetation Index.
About NDVI
NDVI is a ratio between the near infrared (NIR) portion of the electromagnetic spectrum and the red portion of the spectrum. Please keep in mind that there are different ways to aggregate bands when using hyperspectral data. This example is using individual bands to perform the NDVI calculation. Using individual bands is not necessarily the best way to calculate NDVI from hyperspectral data!
# Calculate NDVI
# select bands to use in calculation (red, NIR)
ndvi_bands <- c(16,24) #bands c(58,90) in full NEON hyperspectral dataset
# create raster list and then a stack using those two bands
ndvi_rast <- lapply(ndvi_bands,FUN=band2Raster, file = f,
noDataValue=myNoDataValue,
extent=rasExt, CRS=myCRS)
ndvi_stack <- stack(ndvi_rast)
# make the names pretty
bandNDVINames <- paste("Band_",unlist(ndvi_bands),sep="")
names(ndvi_stack) <- bandNDVINames
# view the properties of the new raster stack
ndvi_stack
## class : RasterStack
## dimensions : 500, 500, 250000, 2 (nrow, ncol, ncell, nlayers)
## resolution : 1, 1 (x, y)
## extent : 257500, 258000, 4112500, 4113000 (xmin, xmax, ymin, ymax)
## crs : +init=epsg:32611 +proj=utm +zone=11 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
## names : Band_16, Band_24
## min values : 0, 0
## max values : 9386, 9424
#calculate NDVI
NDVI <- function(x) {
(x[,2]-x[,1])/(x[,2]+x[,1])
}
ndvi_calc <- calc(ndvi_stack,NDVI)
plot(ndvi_calc, main="NDVI for the NEON SJER Field Site")
# Now, play with breaks and colors to create a meaningful map
# add a color map with 4 colors
myCol <- rev(terrain.colors(4)) # use the 'rev()' function to put green as the highest NDVI value
# add breaks to the colormap, including lowest and highest values (4 breaks = 3 segments)
brk <- c(0, .25, .5, .75, 1)
# plot the image using breaks
plot(ndvi_calc, main="NDVI for the NEON SJER Field Site", col=myCol, breaks=brk)
Try the following:
-
Calculate EVI using the following formula : EVI<- 2.5 * ((b4-b3) / (b4 + 6 * b3- 7.5*b1 + 1))
-
Calculate Normalized Difference Nitrogen Index (NDNI) using the following equation: log(1/p1510)-log(1/p1680)/ log(1/p1510)+log(1/p1680)
-
Explore the bands in the hyperspectral data. What happens if you average reflectance values across multiple red and NIR bands and then calculate NDVI?
Get Lesson Code
Plot Spectral Signatures Derived from Hyperspectral Remote Sensing Data in HDF5 Format in R
Authors: Leah A. Wasser, Donal O'Leary
Last Updated: Nov 23, 2020
Learning Objectives
After completing this tutorial, you will be able to:
- Extract and plot a single spectral signature from an HDF5 file.
- Work with groups and datasets within an HDF5 file.
Things You’ll Need To Complete This Tutorial
To complete this tutorial you will need the most current version of R and, preferably, RStudio loaded on your computer.
R Libraries to Install:
-
rhdf5:
install.packages("BiocManager")
,BiocManager::install("rhdf5")
-
plyr:
install.packages('plyr')
-
ggplot2:
install.packages('ggplot2')
More on Packages in R - Adapted from Software Carpentry.
Data to Download
Download NEON Teaching Data Subset: Imaging Spectrometer Data - HDF5
These hyperspectral remote sensing data provide information on the National Ecological Observatory Network's San Joaquin Exerimental Range field site in March of 2019. The data were collected over the San Joaquin field site located in California (Domain 17) and processed at NEON headquarters. This data subset is derived from the mosaic tile named NEON_D17_SJER_DP3_257000_4112000_reflectance.h5. The entire dataset can be accessed by request from the NEON Data Portal.
Download DatasetRemember that the example dataset linked here only has 1 out of every 4 bands included in a full NEON hyperspectral dataset (this substantially reduces the file size!). When we refer to bands in this tutorial, we will note the band numbers for this example dataset, which are different from NEON production data. To convert a band number (b) from this example data subset to the equivalent band in a full NEON hyperspectral file (b'), use the following equation: b' = 1+4*(b-1).
Set Working Directory: This lesson assumes that you have set your working directory to the location of the downloaded and unzipped data subsets.
An overview of setting the working directory in R can be found here.
R Script & Challenge Code: NEON data lessons often contain challenges that reinforce learned skills. If available, the code for challenge solutions is found in the downloadable R script of the entire lesson, available in the footer of each lesson page.
Recommended Skills
For this tutorial, you should be comfortable reading data from a HDF5 file and have a general familiarity with hyperspectral data. If you aren't familiar with these steps already, we highly recommend you work through the Introduction to Working with Hyperspectral Data in HDF5 Format in R tutorial before moving on to this tutorial.
Everything on our planet reflects electromagnetic radiation from the Sun, and different types of land cover often have dramatically different refelectance properties across the spectrum. One of the most powerful aspects of the NEON Imaging Spectrometer (a.k.a. NEON's hyperspectral imager) is that it can accurately measure these reflectance properties at a very high spectral resolution. When you plot the reflectance values across the observed spectrum, you will see that different land cover types (vegetation, pavement, bare soils, etc.) have distinct patterns in their reflectance values, a feature that we call the 'spectral signature' of a particular land cover class.
In this tutorial, we will extract a single pixel's worth of reflectance values to plot a spectral signature for that pixel. In order to plot the spectral signature for a given pixel in this hyperspectral dataset, we will need to extract the reflectance values for that pixel, and pair those with the wavelengths that are represented in those measurements. We will also need to adjust the reflectance values by the scaling factor that is saved as an 'attribute' in the HDF5 file. First, let's start by defining the working directory and reading in the example dataset.
# Call required packages
library(rhdf5)
library(plyr)
library(ggplot2)
# set working directory to ensure R can find the file we wish to import and where
# we want to save our files. Be sure to move the download into your working directory!
wd <- "~/Documents/data/" #This will depend on your local environment
setwd(wd)
Now, we need to access the H5 file.
# Define the file name to be opened
f <- paste0(wd,"NEON_hyperspectral_tutorial_example_subset.h5")
# look at the HDF5 file structure
h5ls(f,all=T)
## group name ltype
## 0 / SJER H5L_TYPE_HARD
## 1 /SJER Reflectance H5L_TYPE_HARD
## 2 /SJER/Reflectance Metadata H5L_TYPE_HARD
## 3 /SJER/Reflectance/Metadata Coordinate_System H5L_TYPE_HARD
## 4 /SJER/Reflectance/Metadata/Coordinate_System Coordinate_System_String H5L_TYPE_HARD
## 5 /SJER/Reflectance/Metadata/Coordinate_System EPSG Code H5L_TYPE_HARD
## 6 /SJER/Reflectance/Metadata/Coordinate_System Map_Info H5L_TYPE_HARD
## 7 /SJER/Reflectance/Metadata/Coordinate_System Proj4 H5L_TYPE_HARD
## 8 /SJER/Reflectance/Metadata Spectral_Data H5L_TYPE_HARD
## 9 /SJER/Reflectance/Metadata/Spectral_Data Wavelength H5L_TYPE_HARD
## 10 /SJER/Reflectance Reflectance_Data H5L_TYPE_HARD
## corder_valid corder cset otype num_attrs dclass dtype stype rank
## 0 FALSE 0 0 H5I_GROUP 0 0
## 1 FALSE 0 0 H5I_GROUP 5 0
## 2 FALSE 0 0 H5I_GROUP 0 0
## 3 FALSE 0 0 H5I_GROUP 0 0
## 4 FALSE 0 0 H5I_DATASET 0 STRING H5T_STRING SIMPLE 1
## 5 FALSE 0 0 H5I_DATASET 0 STRING H5T_STRING SIMPLE 1
## 6 FALSE 0 0 H5I_DATASET 1 STRING H5T_STRING SIMPLE 1
## 7 FALSE 0 0 H5I_DATASET 0 STRING H5T_STRING SIMPLE 1
## 8 FALSE 0 0 H5I_GROUP 0 0
## 9 FALSE 0 0 H5I_DATASET 3 FLOAT H5T_IEEE_F64LE SIMPLE 1
## 10 FALSE 0 0 H5I_DATASET 13 INTEGER H5T_STD_I32LE SIMPLE 3
## dim maxdim
## 0
## 1
## 2
## 3
## 4 1 1
## 5 1 1
## 6 1 1
## 7 1 1
## 8
## 9 107 107
## 10 107 x 500 x 500 107 x 500 x 500
Read Wavelength Values
Next, let's read in the wavelength center associated with each band in the HDF5 file. We will later match these with the reflectance values and show both in our final spectral signature plot.
# read in the wavelength information from the HDF5 file
wavelengths <- h5read(f,"/SJER/Reflectance/Metadata/Spectral_Data/Wavelength")
Extract Z-dimension data slice
Next, we will extract all reflectance values for one pixel. This makes up the
spectral signature or profile of the pixel. To do that, we'll use the h5read()
function. Here we pick an arbitrary pixel at (100,35)
, and use the NULL
value to select all bands from that location.
# extract all bands from a single pixel
aPixel <- h5read(f,"/SJER/Reflectance/Reflectance_Data",index=list(NULL,100,35))
# The line above generates a vector of reflectance values.
# Next, we reshape the data and turn them into a dataframe
b <- adply(aPixel,c(1))
# create clean data frame
aPixeldf <- b[2]
# add wavelength data to matrix
aPixeldf$Wavelength <- wavelengths
head(aPixeldf)
## V1 Wavelength
## 1 720 381.5437
## 2 337 401.5756
## 3 336 421.6075
## 4 399 441.6394
## 5 406 461.6713
## 6 426 481.7032
Scale Factor
Then, we can pull the spatial attributes that we'll need to adjust the reflectance
values. Often, large raster data contain floating point (values with decimals) information.
However, floating point data consume more space (yield a larger file size) compared
to integer values. Thus, to keep the file sizes smaller, the data will be scaled
by a factor of 10, 100, 10000, etc. This scale factor
will be noted in the data attributes.
# grab scale factor from the Reflectance attributes
reflectanceAttr <- h5readAttributes(f,"/SJER/Reflectance/Reflectance_Data" )
scaleFact <- reflectanceAttr$Scale_Factor
# add scaled data column to DF
aPixeldf$scaled <- (aPixeldf$V1/as.vector(scaleFact))
# make nice column names
names(aPixeldf) <- c('Reflectance','Wavelength','ScaledReflectance')
head(aPixeldf)
## Reflectance Wavelength ScaledReflectance
## 1 720 381.5437 0.0720
## 2 337 401.5756 0.0337
## 3 336 421.6075 0.0336
## 4 399 441.6394 0.0399
## 5 406 461.6713 0.0406
## 6 426 481.7032 0.0426
Plot Spectral Signature
Now we're ready to plot our spectral signature!
ggplot(data=aPixeldf)+
geom_line(aes(x=Wavelength, y=ScaledReflectance))+
xlab("Wavelength (nm)")+
ylab("Reflectance")
Get Lesson Code
Select pixels and compare spectral signatures in R
Authors: Donal O'Leary
Last Updated: May 13, 2021
In this tutorial, we will learn how to plot spectral signatures of several
different land cover types using an interactive clicking feature of the
raster
package.
Learning Objectives
After completing this activity, you will be able to:
- Extract and plot spectra from an HDF5 file.
- Work with groups and datasets within an HDF5 file.
- Use the
raster::click()
function to interact with an RGB raster image
Things You’ll Need To Complete This Tutorial
To complete this tutorial you will need the most current version of R and, preferably, RStudio loaded on your computer.
R Libraries to Install:
-
rhdf5:
install.packages("BiocManager")
,BiocManager::install("rhdf5")
-
raster:
install.packages('raster')
-
rgdal:
install.packages('rgdal')
-
plyr:
install.packages('plyr')
-
reshape2:
install.packages('rehape2')
-
ggplot2:
install.packages('ggplot2')
More on Packages in R - Adapted from Software Carpentry.
Data to Download
Download NEON Teaching Data Subset: Imaging Spectrometer Data - HDF5
These hyperspectral remote sensing data provide information on the National Ecological Observatory Network's San Joaquin Exerimental Range field site in March of 2019. The data were collected over the San Joaquin field site located in California (Domain 17) and processed at NEON headquarters. This data subset is derived from the mosaic tile named NEON_D17_SJER_DP3_257000_4112000_reflectance.h5. The entire dataset can be accessed by request from the NEON Data Portal.
Download DatasetRemember that the example dataset linked here only has 1 out of every 4 bands included in a full NEON hyperspectral dataset (this substantially reduces the file size!). When we refer to bands in this tutorial, we will note the band numbers for this example dataset, which are different from NEON production data. To convert a band number (b) from this example data subset to the equivalent band in a full NEON hyperspectral file (b'), use the following equation: b' = 1+4*(b-1).
Download NEON Teaching Data Subset: RGB Image of SJER
This RGB image is derived from hyperspectral remote sensing data collected on the National Ecological Observatory Network's San Joaquin Exerimental Range field site in March of 2019. The data were collected over the San Joaquin field site located in California (Domain 17) and processed at NEON headquarters. The entire dataset can be accessed by request from the NEON Data Portal.
Download DatasetSet Working Directory: This lesson assumes that you have set your working directory to the location of the downloaded and unzipped data subsets.
An overview of setting the working directory in R can be found here.
R Script & Challenge Code: NEON data lessons often contain challenges that reinforce learned skills. If available, the code for challenge solutions is found in the downloadable R script of the entire lesson, available in the footer of each lesson page.
Recommended Skills
This tutorial will require that you be comfortable navigating HDF5 files, and have an understanding of what spectral signatures are. For additional information on these topics, we highly recommend you work through the Introduction to Hyperspectral Remote Sensing Data series before moving on to this tutorial.
Getting Started
First, we need to load our required packages, and import the hyperspectral data (in HDF5 format). We will also collect a few other important pieces of information (band wavelengths and scaling factor) while we're at it.
# Load required packages
library(rhdf5)
library(reshape2)
library(raster)
library(plyr)
library(ggplot2)
# set working directory to ensure R can find the file we wish to import and where
# we want to save our files. Be sure to move the download into your working directory!
wd <- "~/Documents/data/" #This will depend on your local environment
setwd(wd)
# define filepath to the hyperspectral dataset
fhs <- paste0(wd,"NEON_hyperspectral_tutorial_example_subset.h5")
# read in the wavelength information from the HDF5 file
wavelengths <- h5read(fhs,"/SJER/Reflectance/Metadata/Spectral_Data/Wavelength")
# grab scale factor from the Reflectance attributes
reflInfo <- h5readAttributes(fhs,"/SJER/Reflectance/Reflectance_Data" )
scaleFact <- reflInfo$Scale_Factor
Now, we read in the RGB image that we created in an earlier tutorial and plot it. If you didn't make this image before, you can download it from the link at the top of this page.
# Read in RGB image as a 'stack' rather than a plain 'raster'
rgbStack <- stack(paste0(wd,"NEON_hyperspectral_tutorial_example_RGB_stack_image.tif"))
# Plot as RGB image
plotRGB(rgbStack,
r=1,g=2,b=3, scale=300,
stretch = "lin")
Interactive click
Function from raster
Package
Next, we use an interactive clicking function to identify the pixels that we want to extract spectral signatures for. To follow along best with this tutorial, we suggest the following five cover types (exact location shown below).
- Irrigated grass
- Tree canopy (avoid the shaded northwestern side of the tree)
- Roof
- Bare soil (baseball diamond infield)
- Open water
As shown here:

# change plotting parameters to better see the points and numbers generated from clicking
par(col="red", cex=3)
# use the 'click' function
c <- click(rgbStack, id=T, xy=T, cell=T, type="p", pch=16, col="magenta", col.lab="red")
Once you have clicked your five points, press the ESC
key to save your
clicked points and close the function before moving on to the next step. If
you make a mistake in the step, run the plotRGB()
function again to start over.
The click()
function identifies the cell number that you clicked, but in order
to extract spectral signatures, we need to convert that cell number into a row
and column, as shown here:
# convert raster cell number into row and column (used to extract spectral signature below)
c$row <- c$cell%/%nrow(rgbStack)+1 # add 1 because R is 1-indexed
c$col <- c$cell%%ncol(rgbStack)
Extract Spectral Signatures from HDF5 file
Next, we loop through each of the cells that we selected to use the h5read()
function to etract the reflectance values of all bands from the given row and
column.
# create a new dataframe from the band wavelengths so that we can add
# the reflectance values for each cover type
Pixel_df <- as.data.frame(wavelengths)
# loop through each of the cells that we selected
for(i in 1:length(c$cell)){
# extract Spectra from a single pixel
aPixel <- h5read(fhs,"/SJER/Reflectance/Reflectance_Data",
index=list(NULL,c$col[i],c$row[i]))
# scale reflectance values from 0-1
aPixel <- aPixel/as.vector(scaleFact)
# reshape the data and turn into dataframe
b <- adply(aPixel,c(1))
# rename the column that we just created
names(b)[2] <- paste0("Point_",i)
# add reflectance values for this pixel to our combined data.frame called Pixel_df
Pixel_df <- cbind(Pixel_df,b[2])
}
Plot Spectral signatures using ggplot2
Finally, we have everything that we need to plot the spectral signatures for
each of the pixels that we clicked. In order to color our lines by the different
land cover types, we will first reshape our data using the melt()
function, then
plot the spectral signatures.
# Use the melt() function to reshape the dataframe into a format that ggplot prefers
Pixel.melt <- reshape2::melt(Pixel_df, id.vars = "wavelengths", value.name = "Reflectance")
# Now, let's plot some spectral signatures!
ggplot()+
geom_line(data = Pixel.melt, mapping = aes(x=wavelengths, y=Reflectance, color=variable), lwd=1.5)+
scale_colour_manual(values = c("green2", "green4", "grey50","tan4","blue3"),
labels = c("Field", "Tree", "Roof","Soil","Water"))+
labs(color = "Cover Type")+
ggtitle("Land cover spectral signatures")+
theme(plot.title = element_text(hjust = 0.5, size=20))+
xlab("Wavelength")
Nice! However, there seems to be something weird going on in the wavelengths near 1400nm and 1850 nm...
Atmospheric Absorbtion Bands
Those irregularities around 1400nm and 1850 nm are two major atmospheric absorbtion bands - regions where gasses in the atmosphere (primarily carbon dioxide and water vapor) absorb radiation, and therefore, obscure the reflected radiation that the imaging spectrometer measures. Fortunately, the lower and upper bound of each of those atmopheric absopbtion bands is specified in the HDF5 file. Let's read those bands and plot rectangles where the reflectance measurements are obscured by atmospheric absorbtion.
# grab Reflectance metadata (which contains absorption band limits)
reflMetadata <- h5readAttributes(fhs,"/SJER/Reflectance" )
ab1 <- reflMetadata$Band_Window_1_Nanometers
ab2 <- reflMetadata$Band_Window_2_Nanometers
# Plot spectral signatures again with rectangles showing the absorption bands
ggplot()+
geom_line(data = Pixel.melt, mapping = aes(x=wavelengths, y=Reflectance, color=variable), lwd=1.5)+
geom_rect(mapping=aes(ymin=min(Pixel.melt$Reflectance),ymax=max(Pixel.melt$Reflectance), xmin=ab1[1], xmax=ab1[2]), color="black", fill="grey40", alpha=0.8)+
geom_rect(mapping=aes(ymin=min(Pixel.melt$Reflectance),ymax=max(Pixel.melt$Reflectance), xmin=ab2[1], xmax=ab2[2]), color="black", fill="grey40", alpha=0.8)+
scale_colour_manual(values = c("green2", "green4", "grey50","tan4","blue3"),
labels = c("Field", "Tree", "Roof","Soil","Water"))+
labs(color = "Cover Type")+
ggtitle("Land cover spectral signatures")+
theme(plot.title = element_text(hjust = 0.5, size=20))+
xlab("Wavelength")
Now we can clearly see that the noisy sections of each spectral signature are
within the atmospheric absorbtion bands. For our final step, let's take all
reflectance values from within each absorbtion band and set them to NA
to
remove the noisy sections from the plot.
# Duplicate the spectral signatures into a new data.frame
Pixel.melt.masked <- Pixel.melt
# Mask out all values within each of the two atmospheric absorbtion bands
Pixel.melt.masked[Pixel.melt.masked$wavelengths>ab1[1]&Pixel.melt.masked$wavelengths<ab1[2],]$Reflectance <- NA
Pixel.melt.masked[Pixel.melt.masked$wavelengths>ab2[1]&Pixel.melt.masked$wavelengths<ab2[2],]$Reflectance <- NA
# Plot the masked spectral signatures
ggplot()+
geom_line(data = Pixel.melt.masked, mapping = aes(x=wavelengths, y=Reflectance, color=variable), lwd=1.5)+
scale_colour_manual(values = c("green2", "green4", "grey50","tan4","blue3"),
labels = c("Field", "Tree", "Roof", "Soil", "Water"))+
labs(color = "Cover Type")+
ggtitle("Land cover spectral signatures")+
theme(plot.title = element_text(hjust = 0.5, size=20))+
xlab("Wavelength")
There you have it, spectral signatures for five different land cover types, with the readings from the atmospheric absorbtion bands removed.
There are many interesting comparisons to make with spectral signatures. Try these challenges to explore hyperspectral data further:
-
Compare five different types of vegetation, and pick an appropriate color for each of their lines. A nice guide to the many different color options in R can be found here.
-
What happens if you only click four points? What about six? How does this change the spectral signature plots, and can you fix any errors that occur?
-
Does shallow water have a different spectral signature than deep water?
Get Lesson Code
Spatial Data Tutorial Series Capstone Challenges
Authors: Leah A. Wasser, Claire Lunch, Kate Thibault, Natalie Robinson
Last Updated: Oct 7, 2020
These capstone challenges utilize the skills that you learned in the previous tutorials in the:
- Primer on Raster Data in R series,
- Introduction to Hyperspectral Remote Sensing Data - in R series, or
- Introduction to the Hierarchical Data Format (HDF5) - Using HDFView & R series.
Things You’ll Need To Complete This Tutorial
You will need the most current version of R and, preferably, RStudio
loaded
on your computer to complete this tutorial.
Install R Packages
-
raster:
install.packages("raster")
-
rgdal:
install.packages("rgdal")
-
sp:
install.packages("sp")
More on Packages in R – Adapted from Software Carpentry.
Download Data
NEON Teaching Data Subset: Field Site Spatial Data
These remote sensing data files provide information on the vegetation at the National Ecological Observatory Network's San Joaquin Experimental Range and Soaproot Saddle field sites. The entire dataset can be accessed by request from the NEON Data Portal.
Download DatasetCapstone One: Calculate NDVI for the SJER field sites
The Normalized Difference Vegetation Index (NDVI) is calculated using the equation:
(NIR - Red) / (NIR + Red)
where NIR is the near infrared band in an image and Red is the red band in an image.
Use the Red (Band 58 in the GeoTIFF files) and the NIR (band 90 in the GeoTIFF files) GeoTIFF files to
- Calculate NDVI in R.
- Plot NDVI. Make sure your plot has a title and a legend.
- Assign a colormap to the plot and specify the breaks for the colors to represent NDVI values that make sense to you. For instance, you might chose to color the data into quartiles using breaks at .25,.5, .75 and 1.
- Expore your final NDVI dataset as a GeoTIFF. Make sure the CRS is correct.
- To test your work, bring it into QGIS. Does it line up with the other GeoTIFFs (for example the band 19 tiff). Did it import properly?
Capstone Two: Create an HDF5 file
If you have some of your own data that you'd like to explore for this activity, feel free to do so. Otherwise, use the vegetation structure data that we've provided in the data downloads for this workshop.
- Create a new HDF5 file using the vegetation structure data in
D17_2013_vegStr.csv
andD17_2013_SOAP_vegStr.csv
. (Note that previously the working directory was set to SJER. You'll have to change this to easily access the SOAP vegetation data). - Create two groups within a
California
group:- one for the San Joaquin (SJER) field site
- one for the Soaproot Saddle (SOAP) field site.
- Attribute each of the above groups with information about the field sites. HINT: you can explore the NEON field sites page for more information about each site.
- Extract the vegetation structure data for San Joaquin and add it as a dataset to the San Joaquin group. Do the same for the Soaproot Saddle dataset.
- Add the plot centroids data to the SJER group. Include relevant attributes for this dataset including the CRS string and any other metadata with the dataset.
- Open the metadata file for the vegetation structure data. Attribute the
structure dataset as you see fit to make it usable. As you do this, think about
the following:
- Is there a better way to provide or store these metadata?
- Is there a way to automate adding the metadata to the H5 file?