In this tutorial, we will open and plot point, line and polygon vector data stored in shapefile format in R.
After completing this tutorial, you will be able to:
- Explain the difference between point, line, and polygon vector elements.
- Describe the differences between opening point, line and polygon shapefiles in R.
- Describe the components of a spatial object in R.
- Read a shapefile into R.
Things You’ll Need To Complete This Tutorial
You will need the most current version of R and, preferably, RStudio loaded
on your computer to complete this tutorial.
Install R Packages
More on Packages in R – Adapted from Software Carpentry.
These vector data provide information on the site characterization and infrastructure at the National Ecological Observatory Network's Harvard Forest field site. The Harvard Forest shapefiles are from the Harvard Forest GIS & Map archives. US Country and State Boundary layers are from the US Census Bureau.
Set Working Directory: This lesson assumes that you have set your working directory to the location of the downloaded and unzipped data subsets.
R Script & Challenge Code: NEON data lessons often contain challenges that reinforce learned skills. If available, the code for challenge solutions is found in the downloadable R script of the entire lesson, available in the footer of each lesson page.
About Vector Data
Vector data are composed of discrete geometric locations (x,y values) known as vertices that define the "shape" of the spatial object. The organization of the vertices determines the type of vector that we are working with: point, line or polygon.
There are 3 types of vector objects: points, lines or polygons. Each object type has a different structure. Image Source: National Ecological Observatory Network (NEON)
- Points: Each individual point is defined by a single x, y coordinate. There can be many points in a vector point file. Examples of point data include: sampling locations, the location of individual trees or the location of plots.
- Lines: Lines are composed of many (at least 2) vertices, or points, that are connected. For instance, a road or a stream may be represented by a line. This line is composed of a series of segments; each "bend" in the road or stream represents a vertex that has a defined x, y location.
- Polygons: A polygon consists of 3 or more vertices that are connected and "closed". Thus the outlines of plot boundaries, lakes, oceans, and states or countries are often represented by polygons. Occasionally, a polygon can have a hole in the middle of it (like a doughnut). This is something to be aware of, but not an issue we will deal with in this tutorial.
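The three vector types can also be built by hand, which makes the role of the vertices concrete. The following is a minimal sketch using the sp package only; the coordinates and the IDs ("line1", "poly1") are made up for illustration and are not taken from the Harvard Forest data.

```r
library(sp)

# Point: a single x, y coordinate (hypothetical values)
pt <- SpatialPoints(cbind(x = 2, y = 3))

# Line: two or more connected vertices
line_xy <- cbind(x = c(0, 1, 2), y = c(0, 1, 0))
ln <- SpatialLines(list(Lines(list(Line(line_xy)), ID = "line1")))

# Polygon: three or more vertices, closed by repeating the first vertex
poly_xy <- cbind(x = c(0, 4, 4, 0, 0), y = c(0, 0, 3, 3, 0))
pg <- SpatialPolygons(list(Polygons(list(Polygon(poly_xy)), ID = "poly1")))

# Each vector type is stored in its own class
class(pt)
class(ln)
class(pg)
```

Note that the polygon's coordinate matrix ends on the same vertex it starts with; that repeated vertex is what "closes" the shape.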
Shapefiles: Points, Lines, and Polygons
Geospatial data in vector format are often stored in a shapefile format.
Because the structures of points, lines, and polygons are different, each
individual shapefile can only contain one vector type (all points, all lines
or all polygons). You will not find a mixture of point, line and polygon
objects in a single shapefile.
Objects stored in a shapefile often have a set of associated attributes that
describe the data. For example, a line shapefile that contains the locations of
streams might contain the associated stream name, stream "order" and other
information about each stream line object.
- More about shapefiles can be found on Wikipedia.
We will use the
rgdal package to work with vector data in R. Notice that the
sp package automatically loads when
rgdal is loaded. We will also load the
raster package so we can explore raster and vector spatial metadata using similar commands.
    # load required libraries
    # for vector work; sp package will load with rgdal.
    library(rgdal)
    # for metadata/attributes- vectors or rasters
    library(raster)

    # set working directory to the directory location on your computer where
    # you downloaded and unzipped the data files for the tutorial
    # setwd("pathToDirHere")
The shapefiles that we will import are:
- A polygon shapefile representing our field site boundary,
- A line shapefile representing roads, and
- A point shapefile representing the location of the Fisher
flux tower located at the NEON Harvard Forest field site.
The first shapefile that we will open contains the boundary of our study area
(or our Area Of Interest or AOI, hence the name aoiBoundary). To import
shapefiles we use the R function readOGR(). readOGR() requires two components:
- The directory where our shapefile lives: NEON-DS-Site-Layout-Files/HARV
- The name of the shapefile (without the extension): HarClip_UTMZ18
Let's import our AOI.
    # Import a polygon shapefile: readOGR("path","fileName")
    # for shapefiles, 'layer' is the file name without its extension
    aoiBoundary_HARV <- readOGR(dsn=path.expand("NEON-DS-Site-Layout-Files/HARV"),
                                layer="HarClip_UTMZ18")
    ## OGR data source with driver: ESRI Shapefile
    ## Source: "/Users/olearyd/Git/data/NEON-DS-Site-Layout-Files/HARV", layer: "HarClip_UTMZ18"
    ## with 1 features
    ## It has 1 fields
    ## Integer64 fields read as strings:  id
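If the import fails, a common cause is that the working directory or layer name does not match the files on disk. A shapefile is really a set of companion files that share one base name (.shp, .shx and .dbf, plus an optional .prj). The sketch below uses base R only; shapefile_parts() is a hypothetical helper written for this example, not a function from rgdal.

```r
# Hypothetical helper: report which companion files exist for a layer.
# 'dsn' is the directory and 'layer' is the base name without extension.
shapefile_parts <- function(dsn, layer) {
  exts <- c("shp", "shx", "dbf", "prj")
  paths <- file.path(dsn, paste0(layer, ".", exts))
  setNames(file.exists(paths), exts)
}

# Returns a named logical vector; every value is FALSE if the path is wrong
shapefile_parts("NEON-DS-Site-Layout-Files/HARV", "HarClip_UTMZ18")
```

If every entry comes back FALSE, check the working directory with getwd() before calling readOGR().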
Shapefile Metadata & Attributes
When we import the
HarClip_UTMZ18 shapefile layer into R (as our
aoiBoundary_HARV object), the
readOGR() function automatically stores
information about the data. We are particularly interested in the geospatial
metadata, describing the format,
extent, and other components of
the vector data, and the attributes which describe properties associated
with each individual vector object.
Key metadata for all shapefiles include:
- Object Type: the class of the imported object.
- Coordinate Reference System (CRS): the projection of the data.
- Extent: the spatial extent (geographic area that the shapefile covers) of the shapefile. Note that the spatial extent for a shapefile represents the extent for ALL spatial objects in the shapefile.
We can view shapefile metadata using the class(), crs() and extent() functions:
    # view just the class for the shapefile
    class(aoiBoundary_HARV)
    ## [1] "SpatialPolygonsDataFrame"
    ## attr(,"package")
    ## [1] "sp"

    # view just the crs for the shapefile
    crs(aoiBoundary_HARV)
    ## CRS arguments:
    ##  +proj=utm +zone=18 +datum=WGS84 +units=m +no_defs

    # view just the extent for the shapefile
    extent(aoiBoundary_HARV)
    ## class : Extent
    ## xmin  : 732128
    ## xmax  : 732251.1
    ## ymin  : 4713209
    ## ymax  : 4713359

    # view all metadata at same time
    aoiBoundary_HARV
    ## class     : SpatialPolygonsDataFrame
    ## features  : 1
    ## extent    : 732128, 732251.1, 4713209, 4713359 (xmin, xmax, ymin, ymax)
    ## crs       : +proj=utm +zone=18 +datum=WGS84 +units=m +no_defs
    ## variables : 1
    ## names     : id
    ## value     : 1
Our aoiBoundary_HARV object is a polygon of class SpatialPolygonsDataFrame,
in the CRS UTM zone 18N. The CRS is critical to interpreting the object
extent values as it specifies units.
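To see why the units matter, the extent values can be turned into a width and height. This is a rough sketch under stated assumptions: it rebuilds a plain rectangle from the four extent values printed in this lesson (not the real boundary geometry) and uses sp's bbox() rather than raster's extent() so that it runs with sp alone. The object name aoi_box is made up for this example.

```r
library(sp)

# Rectangle built from the extent values printed in this lesson.
# This only approximates the AOI's bounding box, not its true outline.
xmin <- 732128;  xmax <- 732251.1
ymin <- 4713209; ymax <- 4713359
corners <- cbind(x = c(xmin, xmax, xmax, xmin, xmin),
                 y = c(ymin, ymin, ymax, ymax, ymin))
aoi_box <- SpatialPolygons(list(Polygons(list(Polygon(corners)), ID = "aoi")))

# bbox() returns a 2 x 2 matrix: row 1 is x, row 2 is y; columns are min, max
bb <- bbox(aoi_box)
bb

# Width and height in CRS units (metres, because the CRS has +units=m)
width_m  <- unname(bb[1, 2] - bb[1, 1])
height_m <- unname(bb[2, 2] - bb[2, 1])
c(width = width_m, height = height_m)
```

Because the CRS declares +units=m, these differences are distances in metres. The same subtraction on data stored in a geographic CRS would give degrees instead.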
Spatial Data Attributes
Each object in a shapefile has one or more attributes associated with it. Shapefile attributes are similar to fields or columns in a spreadsheet. Each row in the spreadsheet has a set of columns associated with it that describe the row element. In the case of a shapefile, each row represents a spatial object. For example, a road, represented as a line in a line shapefile, will have one "row" of attributes associated with it. These attributes can include different types of information that describe objects stored within a shapefile. Thus, our road may have a name, length, number of lanes, speed limit, type of road and other attributes stored with it.
Each spatial feature in an R spatial object has the same set of associated attributes that describe or characterize the feature. Attribute data are stored in a separate *.dbf file. Attribute data can be compared to a spreadsheet. Each row in a spreadsheet represents one feature in the spatial object. Image Source: National Ecological Observatory Network (NEON)
We view the attributes of a SpatialPolygonsDataFrame through its data slot, objectName@data (e.g., aoiBoundary_HARV@data).
    # alternate way to view attributes
    aoiBoundary_HARV@data
    ##   id
    ## 0  1
In this case, our polygon object only has one attribute: id.
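The spreadsheet analogy is easier to see on an object with more than one feature and more than one attribute. The sketch below builds a small point layer with sp; the coordinates, the plot_type column and the object name plots_demo are all invented for illustration.

```r
library(sp)

# Three hypothetical point locations with made-up attributes
coords <- cbind(x = c(1, 2, 3), y = c(4, 5, 6))
attrs <- data.frame(id = 1:3,
                    plot_type = c("tower", "soil", "soil"),
                    stringsAsFactors = FALSE)
plots_demo <- SpatialPointsDataFrame(coords, data = attrs)

plots_demo@data        # the attribute table: one row per feature
names(plots_demo)      # attribute (column) names
plots_demo$plot_type   # values of a single attribute
length(plots_demo)     # number of features
```

The attribute table has one row per feature, so nrow(plots_demo@data) and length(plots_demo) agree.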
Metadata & Attribute Summary
We can view a metadata & attribute summary of each shapefile by entering
the name of the R object in the console. Note that the metadata output
includes the class, the number of features, the extent, and the
coordinate reference system (
crs) of the R object. The last two lines of
summary show a preview of the R object attributes.
    # view a summary of metadata & attributes associated with the spatial object
    summary(aoiBoundary_HARV)
    ## Object of class SpatialPolygonsDataFrame
    ## Coordinates:
    ##       min       max
    ## x  732128  732251.1
    ## y 4713209 4713359.2
    ## Is projected: TRUE
    ## proj4string :
    ## [+proj=utm +zone=18 +datum=WGS84 +units=m +no_defs]
    ## Data attributes:
    ##       id
    ##  Length:1
    ##  Class :character
    ##  Mode  :character
Plot a Shapefile
Next, let's visualize the data in our R
SpatialPolygonsDataFrame object using plot().
    # create a plot of the shapefile
    # 'lwd' sets the line width
    # 'col' sets internal color
    # 'border' sets line color
    plot(aoiBoundary_HARV, col="cyan1", border="black", lwd=3,
         main="AOI Boundary Plot")
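Using the steps above, import the roads (line) and tower (point) shapefiles into R, naming the objects lines_HARV and point_HARV (the names used in the plotting code later in this lesson). Then answer the following questions:
- What type of R spatial object is created when you import each layer?
- What is the extent for each object?
- Do the files contain points, lines or polygons?
- How many spatial objects are in each file?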
Plot Multiple Shapefiles
The plot() function can be used for basic plotting of spatial objects.
We use the
add = TRUE argument to overlay shapefiles on top of each other, as
we would when creating a map in a typical GIS application like QGIS.
We can use
main="" to give our plot a title. If we want the title to span two
lines, we use
\n where the line should break.
    # Plot multiple shapefiles
    plot(aoiBoundary_HARV, col = "lightgreen",
         main="NEON Harvard Forest\nField Site")
    plot(lines_HARV, add = TRUE)

    # use the pch element to adjust the symbology of the points
    plot(point_HARV, add = TRUE, pch = 19, col = "purple")
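The layering pattern can be tried without the lesson data. The following sketch uses small stand-in objects built with sp; the shapes, coordinates and names (aoi_demo, road_demo, tower_demo) are hypothetical and only mimic the boundary, roads and tower layers.

```r
library(sp)

# Hypothetical stand-ins for the site boundary, a road, and a tower
sq <- cbind(x = c(0, 10, 10, 0, 0), y = c(0, 0, 10, 10, 0))
aoi_demo <- SpatialPolygons(list(Polygons(list(Polygon(sq)), ID = "aoi")))
road_xy <- cbind(x = c(0, 5, 10), y = c(2, 6, 7))
road_demo <- SpatialLines(list(Lines(list(Line(road_xy)), ID = "road")))
tower_demo <- SpatialPoints(cbind(x = 4, y = 5))

# The first call sets up the plot region; later calls draw on top of it
plot(aoi_demo, col = "lightgreen")
plot(road_demo, add = TRUE)
plot(tower_demo, add = TRUE, pch = 19, col = "purple")

# "\n" breaks the title over two lines
title(main = "Demo Site\nLayered Plot")
```

Order matters here: the first layer fixes the plotting region, so anything in a later layer that falls outside that region will not be visible.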
You can plot vector data layered on top of raster data using the
add = TRUE plot argument. Create a plot that uses the NEON AOP Canopy Height Model
NEON_RemoteSensing/HARV/CHM/HARV_chmCrop.tif as a base layer. On top of the
CHM, please add:
- The study site AOI.
- The tower location.
Be sure to give your plot a meaningful title.
Additional Resources: Plot Parameter Options
For more on parameter options in the base R
plot() function, see the built-in help pages ?plot and ?par.