Tutorial

Classification of Hyperspectral Data with Support Vector Machine (SVM) Using SciKit in Python

Authors: Paul Gader

Last Updated: Apr 1, 2021

In this tutorial, we will learn to classify spectral data using the Support Vector Machine (SVM) method.

Objectives

After completing this tutorial, you will be able to:

  • Classify spectral remote sensing data using Support Vector Machine (SVM).

Install Python Packages

  • numpy
  • gdal
  • matplotlib
  • scipy
  • scikit-learn

Download Data

Download the spectral classification teaching data subset.

Additional Materials

This tutorial was prepared in conjunction with a presentation on spectral classification that can be downloaded.

  • Download Dr. Paul Gader's Classification 1 PPT
  • Download Dr. Paul Gader's Classification 2 PPT
  • Download Dr. Paul Gader's Classification 3 PPT

import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from scipy import linalg
from scipy import io
from sklearn import linear_model as lmd

Note that you will need to update these file paths to match your local machine.

InFile1          = '/Users/olearyd/Git/data/RSDI2017-Data-SpecClass/LinSepC1.mat'
InFile2          = '/Users/olearyd/Git/data/RSDI2017-Data-SpecClass/LinSepC2.mat'
C1Dict           = io.loadmat(InFile1)
C2Dict           = io.loadmat(InFile2)
C1               = C1Dict['LinSepC1']
C2               = C2Dict['LinSepC2']
NSampsClass    = 200
NSamps         = 2*NSampsClass
### Set Target Outputs ###
TargetOutputs                     =  np.ones((NSamps,1))
TargetOutputs[NSampsClass:NSamps] = -TargetOutputs[NSampsClass:NSamps]
AllSamps     = np.concatenate((C1,C2),axis=0)
AllSamps.shape
(400, 2)
#import sklearn
#sklearn.__version__
#lmd.LinearRegression.fit?
M = lmd.LinearRegression()
print(M)
LinearRegression()
LinMod = M.fit(AllSamps, TargetOutputs)
R = LinMod.score(AllSamps, TargetOutputs)
print(R)
0.9112691769822485
LinMod
LinearRegression()
w = LinMod.coef_
w
array([[0.81592447, 0.94178188]])
w0 = LinMod.intercept_
w0
array([-0.01663028])
### Question:  How would we compute the outputs of the regression model?
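One way to answer the question above, as a minimal sketch: the output of a linear regression model is the weighted sum of the inputs plus the intercept, which is also what `predict` returns. The data here are synthetic stand-ins generated with NumPy (an assumption, since the `LinSepC1`/`LinSepC2` files are not loaded), and names such as `samples` and `manual_outputs` are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, well-separated 2-D classes (assumption: stand-in for LinSepC1/LinSepC2)
rng = np.random.default_rng(0)
class1 = rng.normal(loc=[1.0, 1.0], scale=0.3, size=(200, 2))
class2 = rng.normal(loc=[-1.0, -1.0], scale=0.3, size=(200, 2))
samples = np.concatenate((class1, class2), axis=0)

# Targets: +1 for the first class, -1 for the second, as in the tutorial
targets = np.ones((400, 1))
targets[200:] = -1.0

model = LinearRegression().fit(samples, targets)
w = model.coef_        # shape (1, 2)
w0 = model.intercept_  # shape (1,)

# Output computed by hand: y = X w^T + w0
manual_outputs = samples @ w.T + w0

# The same output through the scikit-learn API
api_outputs = model.predict(samples)
print(np.allclose(manual_outputs, api_outputs))

# Thresholding the regression output at zero turns it into a classifier
predicted_labels = np.where(manual_outputs >= 0, 1.0, -1.0)
accuracy = np.mean(predicted_labels == targets)
print(accuracy)
```

With the tutorial's variables, the equivalent calls would be `LinMod.predict(AllSamps)` or `AllSamps @ w.T + w0`.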

Kernels

Now we'll use support vector machines (SVMs) for classification.

from sklearn.svm import SVC
### SVC wants a 1d array, not a column vector
Targets = np.ravel(TargetOutputs)
InitSVM = SVC()
InitSVM
SVC()
TrainedSVM = InitSVM.fit(AllSamps, Targets)
y = TrainedSVM.predict(AllSamps)
plt.figure(1)
plt.plot(y)
plt.show()

[Figure: predicted class labels y for the 400 samples]

d = TrainedSVM.decision_function(AllSamps)
plt.figure(1)
plt.plot(d)
plt.show()

[Figure: SVM decision function values d for the 400 samples]
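The two plots are closely related. For a two-class `SVC`, the predicted label follows the sign of the decision function, so `d` carries the same class information as `y` plus a measure of how far each sample lies from the boundary. Note that `SVC()` with no arguments uses the RBF kernel by default. The sketch below checks the sign relationship on synthetic data (an assumption, not the tutorial's `.mat` files); the variable names are illustrative.

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic two-class data (assumption: stand-in for AllSamps and Targets)
rng = np.random.default_rng(1)
class1 = rng.normal(loc=[1.0, 1.0], scale=0.3, size=(200, 2))
class2 = rng.normal(loc=[-1.0, -1.0], scale=0.3, size=(200, 2))
samples = np.concatenate((class1, class2), axis=0)
targets = np.concatenate((np.ones(200), -np.ones(200)))  # 1-D, as SVC expects

svm = SVC().fit(samples, targets)   # default kernel is 'rbf'
d = svm.decision_function(samples)  # signed score, one value per sample
y = svm.predict(samples)            # hard labels, -1 or +1

# A positive score maps to the class +1, a negative score to the class -1
labels_from_d = np.where(d > 0, 1.0, -1.0)
agree = np.array_equal(labels_from_d, y)
print(agree)
```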

Include Outliers

We can also try it with outliers.

Let's start by looking at some spectra.

### Look at some pine and oak spectra from
### NEON Site D03, Ordway-Swisher Biological Station (UF):
###   Pines: Pinus palustris
###   Oaks:  Quercus virginiana
InFile1 = '/Users/olearyd/Git/data/RSDI2017-Data-SpecClass/Pines.mat'
InFile2 = '/Users/olearyd/Git/data/RSDI2017-Data-SpecClass/Oaks.mat'
C1Dict  = io.loadmat(InFile1)
C2Dict  = io.loadmat(InFile2)
Pines   = C1Dict['Pines']
Oaks    = C2Dict['Oaks']
WvFile  = '/Users/olearyd/Git/data/RSDI2017-Data-SpecClass/NEONWvsNBB.mat'
WvDict  = io.loadmat(WvFile)
Wv      = WvDict['NEONWvsNBB']
Pines.shape
(809, 346)
Oaks.shape
(1731, 346)
NBands=Wv.shape[0]
print(NBands)
346

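Notice that these data sets are unbalanced: there are 809 pine spectra but 1731 oak spectra. To balance them, we take the same number of samples from each class for training and for testing.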

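NTrainSampsClass = 600
NTestSampsClass  = 200
### Pines (first 600 samples) get target -1, Oaks (last 600) get target +1
Targets          = np.ones((2*NTrainSampsClass,1))
Targets[:NTrainSampsClass] = -Targets[:NTrainSampsClass]
Targets          = np.ravel(Targets)
print(Targets.shape)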
(1200,)
plt.figure(111)
plt.plot(Targets)
plt.show()

[Figure: target labels for the 1200 training samples]

TrainPines = Pines[0:600,:]
TrainOaks  = Oaks[0:600,:]
TrainSet   = np.concatenate((TrainPines, TrainOaks), axis=0)
print(TrainSet.shape)
(1200, 346)
plt.figure(3)
### Plot Pine Training Spectra ###
plt.subplot(121)
plt.plot(Wv, TrainPines.T)
plt.ylim((0.0,0.8))
### Wv is a column vector, so index out scalar limits (first to last band)
plt.xlim((Wv[0,0], Wv[NBands-1,0]))
### Plot Oak Training Spectra ###
plt.subplot(122)
plt.plot(Wv, TrainOaks.T)
plt.ylim((0.0,0.8))
plt.xlim((Wv[0,0], Wv[NBands-1,0]))
plt.show()

[Figure: pine (left) and oak (right) training spectra versus wavelength]

InitSVM = SVC()
TrainedSVM = InitSVM.fit(TrainSet, Targets)
d = TrainedSVM.decision_function(TrainSet)
print(d)
[-0.26050536 -0.45009774 -0.4508219  ...  1.70930028  1.79781222
  1.66711708]
plt.figure(4)
plt.plot(d)
plt.show()

[Figure: decision function values for the 1200 training samples]

Does this seem to be too good to be true?

TestPines = Pines[600:800,:]
TestOaks  = Oaks[600:800,:]
TestSet = np.concatenate((TestPines, TestOaks), axis=0)
print(TestSet.shape)
(400, 346)
dtest = TrainedSVM.decision_function(TestSet)
plt.figure(5)
plt.plot(dtest)
plt.show()

[Figure: decision function values for the 400 test samples]

Yeah, too good to be true. What can we do?

Error Analysis

Error analysis can be used to identify characteristics of the errors: which samples are misclassified, and whether the mistakes concentrate in one class.
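As a starting point, here is a minimal sketch of an error analysis on a held-out test set: compare training and test accuracy, build a confusion matrix, and list the indices of the misclassified test samples. The "spectra" are synthetic (an assumption, not the Pines and Oaks data), and the class means, band count, and variable names are hypothetical values chosen only for illustration.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

# Synthetic "spectra" for two overlapping classes (assumption: illustrative only)
rng = np.random.default_rng(2)
n_bands = 20
class_a = rng.normal(0.30, 0.05, size=(800, n_bands))
class_b = rng.normal(0.33, 0.05, size=(800, n_bands))

# 600 training and 200 test samples per class, mirroring the tutorial's split
train_set = np.concatenate((class_a[:600], class_b[:600]), axis=0)
test_set = np.concatenate((class_a[600:], class_b[600:]), axis=0)
train_targets = np.concatenate((-np.ones(600), np.ones(600)))
test_targets = np.concatenate((-np.ones(200), np.ones(200)))

svm = SVC().fit(train_set, train_targets)

# A large gap between these two numbers suggests overfitting
train_acc = svm.score(train_set, train_targets)
test_acc = svm.score(test_set, test_targets)
print(train_acc, test_acc)

# Rows are true classes, columns are predicted classes, ordered (-1, +1)
test_pred = svm.predict(test_set)
cm = confusion_matrix(test_targets, test_pred, labels=[-1.0, 1.0])
print(cm)

# Indices of misclassified test samples, for closer inspection of their spectra
misclassified = np.flatnonzero(test_pred != test_targets)
print(len(misclassified))
```

With the tutorial's variables, the same calls would apply to `TestSet` with test targets built the same way as `Targets`.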

You could also try different "magic numbers" (model hyperparameters, such as the SVM's C and gamma) using cross-validation. Stay tuned for a tutorial on this topic.
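Until then, the sketch below shows one common way to search over such values with scikit-learn's `GridSearchCV`, which scores each parameter combination by k-fold cross-validation. This is not necessarily the approach the follow-up tutorial will take. The data are synthetic, and the grid values for `C` and `gamma` are arbitrary choices made for illustration.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

# Synthetic two-class data (assumption: stand-in for TrainSet and Targets)
rng = np.random.default_rng(3)
class_a = rng.normal(0.30, 0.05, size=(100, 10))
class_b = rng.normal(0.33, 0.05, size=(100, 10))
samples = np.concatenate((class_a, class_b), axis=0)
targets = np.concatenate((-np.ones(100), np.ones(100)))

# Candidate "magic numbers": regularization strength C and RBF kernel width gamma
param_grid = {
    "C": [0.1, 1.0, 10.0],
    "gamma": ["scale", 0.01, 0.1],
}

# 5-fold cross-validation over all 9 combinations; folds are stratified by class
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(samples, targets)

print(search.best_params_)  # combination with the best mean validation accuracy
print(search.best_score_)   # that mean accuracy
```

Because the score is computed on folds held out from fitting, it gives a more honest estimate than the training-set decision values plotted earlier. A final check on a separate test set is still advisable.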


Copyright © Battelle, 2019-2020

The National Ecological Observatory Network is a major facility fully funded by the National Science Foundation.

Any opinions, findings and conclusions or recommendations expressed in this material do not necessarily reflect the views of the National Science Foundation.