Open HDF5 files with Python Sample Code

David Hulslander, Josh Elliot, Leah A. Wasser, Tristan Goulden

At the end of this tutorial you will be able to

  • open an HDF5 file with Python.

Data to Download

Download NEON Teaching Data Subset: Imaging Spectrometer Data - HDF5

These hyperspectral remote sensing data provide information on the National Ecological Observatory Network's San Joaquin Experimental Range field site. The data were collected over the San Joaquin field site located in California (Domain 17) and processed at NEON headquarters. The entire dataset can be accessed by request from the NEON Data Portal.

Download Dataset

In this tutorial, we'll work with temperature data collected using sensors on a flux tower by the National Ecological Observatory Network (NEON). Here the data are provided in HDF5 format to allow for the exploration of this format. More on NEON temperature data can be found on the NEON Data Portal. Please note that NEON distributes temperature data as a flat .csv file, not as an HDF5 file. NEON data products such as eddy covariance data and remote sensing data are, however, released in the HDF5 format.

Python Code to Open HDF5 files

The code below is starter code to open and read an H5 file in Python, plot the spectrum of one pixel, and write that spectrum to a new H5 file.

if __name__ == '__main__':
    # import required libraries
    import h5py as h5
    import numpy as np
    import matplotlib.pyplot as plt

    # Read H5 file
    f = h5.File("NEON-DS-Imaging-Spectrometer-Data.h5", "r")
    # Get and print list of datasets within the H5 file
    datasetNames = [n for n in f.keys()]
    for n in datasetNames:
        print(n)

    # extract reflectance data from the H5 file
    reflectance = f['Reflectance']
    # extract one pixel from the data
    reflectanceData = reflectance[:,49,392]
    reflectanceData = reflectanceData.astype(float)

    # divide the data by the scale factor
    # note: this information would be accessed from the metadata
    scaleFactor = 10000.0
    reflectanceData /= scaleFactor
    wavelength = f['wavelength']
    wavelengthData = wavelength[:]
    # flatten the wavelength values into a 1-D array
    wavelengthData = np.reshape(wavelengthData, -1)

    # Print the attributes (metadata):
    print("Data Description : ", reflectance.attrs['Description'])
    print("Data dimensions : ", reflectance.shape, reflectance.attrs['DIMENSION_LABELS'])
    # print a list of attributes in the H5 file
    for n in reflectance.attrs:
        print(n)
    # close the h5 file
    f.close()

    # Plot
    plt.plot(wavelengthData, reflectanceData)
    plt.title("Vegetation Spectra")
    # raw string so that \mu is passed to matplotlib unchanged
    plt.xlabel(r'Wavelength [$\mu m$]')
    plt.ylabel('Reflectance')
    plt.show()

    # Write a new HDF file containing this spectrum
    f = h5.File("VegetationSpectra.h5", "w")
    rdata = f.create_dataset("VegetationSpectra", data=reflectanceData)
    attrs = rdata.attrs
    attrs.create("Wavelengths", data=wavelengthData)
    # close the new file so the data are flushed to disk
    f.close()
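
To check that the spectrum was written as intended, the new file can be reopened and read back. The sketch below is one way to do that, not part of the original lesson code. It uses synthetic arrays (assumed values, not NEON data) in place of `reflectanceData` and `wavelengthData`, writes to a temporary directory, and opens the file in `with` blocks so it is closed automatically.

```python
import os
import tempfile

import h5py
import numpy as np

# Synthetic stand-ins for the spectrum and wavelengths used in the script
# above (assumed values for illustration only, not NEON data).
wavelengths = np.linspace(0.38, 2.51, 426)
spectrum = np.random.default_rng(0).random(426)

# Write to a temporary directory so the example leaves no files behind
# in the working directory.
path = os.path.join(tempfile.mkdtemp(), "VegetationSpectra.h5")

# Write the dataset and attach the wavelengths as an attribute.
# The "with" block closes the file even if an error occurs.
with h5py.File(path, "w") as f:
    dset = f.create_dataset("VegetationSpectra", data=spectrum)
    dset.attrs.create("Wavelengths", data=wavelengths)

# Reopen read-only and copy the values into NumPy arrays
# before the file is closed.
with h5py.File(path, "r") as f:
    dset = f["VegetationSpectra"]
    spectrum_back = dset[:]
    wavelengths_back = dset.attrs["Wavelengths"]

print(spectrum_back.shape, wavelengths_back.shape)
```

Slicing with `[:]` copies the dataset into memory, so `spectrum_back` remains usable after the file is closed.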
