Open HDF5 files with Python Sample Code

David Hulslander, Josh Elliot, Leah A. Wasser, Tristan Goulden


At the end of this tutorial you will be able to

  • open an HDF5 file with Python.

Data to Download

Download NEON Teaching Data Subset: Imaging Spectrometer Data - HDF5

These hyperspectral remote sensing data provide information on the National Ecological Observatory Network's San Joaquin Experimental Range field site. The data were collected over the San Joaquin field site located in California (Domain 17) and processed at NEON headquarters. The entire dataset can be accessed by request from the NEON Data Portal.


In this tutorial, we'll work with temperature data collected using sensors on a flux tower by the National Ecological Observatory Network (NEON). Here the data are provided in an HDF5 format to allow for the exploration of this format. More on NEON temperature data can be found on the NEON Data Portal. Please note that temperature data are distributed as a flat .csv file and not as an HDF5 file. NEON data products including eddy covariance data and remote sensing data are, however, released in the HDF5 format.

Python Code to Open HDF5 files

The code below is starter code that opens an H5 file in Python, extracts and plots the spectrum of a single pixel, and writes that spectrum to a new H5 file.

if __name__ == '__main__':
    # import required libraries
    import h5py as h5
    import numpy as np
    import matplotlib.pyplot as plt

    # Read H5 file
    f = h5.File("NEON-DS-Imaging-Spectrometer-Data.h5", "r")
    # Get and print list of datasets within the H5 file
    datasetNames = [n for n in f.keys()]
    for n in datasetNames:
        print(n)

    # extract reflectance data from the H5 file
    reflectance = f['Reflectance']
    # extract one pixel from the data
    reflectanceData = reflectance[:,49,392]
    reflectanceData = reflectanceData.astype(float)

    # divide the data by the scale factor
    # note: this information would be accessed from the metadata
    scaleFactor = 10000.0
    reflectanceData /= scaleFactor
    wavelength = f['wavelength']
    wavelengthData = wavelength[:]
    # flatten the wavelength values into a 1-D array (426 bands)
    wavelengthData = np.reshape(wavelengthData, 426)

    # Print the attributes (metadata):
    print("Data Description : ", reflectance.attrs['Description'])
    print("Data dimensions : ", reflectance.shape, reflectance.attrs['DIMENSION_LABELS'])
    # print a list of attributes in the H5 file
    for n in reflectance.attrs:
        print(n)
    # close the h5 file
    f.close()

    # Plot
    plt.plot(wavelengthData, reflectanceData)
    plt.title("Vegetation Spectra")
    # raw string so that \mu is passed to matplotlib rather than read as a Python escape
    plt.xlabel(r'Wavelength [$\mu m$]')
    plt.show()

    # Write a new HDF file containing this spectrum
    f = h5.File("VegetationSpectra.h5", "w")
    rdata = f.create_dataset("VegetationSpectra", data=reflectanceData)
    attrs = rdata.attrs
    attrs.create("Wavelengths", data=wavelengthData)
    # close the new file so the data are flushed to disk
    f.close()
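
The starter code hard-codes the scale factor and notes that it would normally come from the metadata. As a minimal sketch of that idea, the example below reads the value from a dataset attribute and uses a `with` block so the file is closed automatically. It builds a small stand-in file so it runs without the NEON download; the attribute name `Scale Factor` and the file name `demo_reflectance.h5` are assumptions made for illustration, so check the attribute list printed by the starter code for the name used in the real file.

```python
import os
import tempfile

import h5py
import numpy as np

# Build a tiny stand-in file so the sketch runs without the NEON download.
# Assumptions: the attribute is called "Scale Factor" and the array is (bands, rows, cols).
path = os.path.join(tempfile.mkdtemp(), "demo_reflectance.h5")
with h5py.File(path, "w") as f:
    cube = np.full((5, 4, 3), 2500, dtype=np.int16)
    dset = f.create_dataset("Reflectance", data=cube)
    dset.attrs["Scale Factor"] = 10000.0

# Reopen read-only; the with-block closes the file even if an error occurs.
with h5py.File(path, "r") as f:
    reflectance = f["Reflectance"]
    # Fall back to 10000.0 (the value hard-coded in the starter code) if the attribute is missing.
    scale_factor = float(reflectance.attrs.get("Scale Factor", 10000.0))
    # Slicing copies the pixel into a NumPy array, so it stays usable after the file closes.
    spectrum = reflectance[:, 1, 2].astype(float) / scale_factor

print(scale_factor)
print(spectrum)
```

Because the slice is copied into memory before the block ends, `spectrum` can be plotted or written to a new file after the source file has been closed.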




What Python code can be used to open the H5 eddy covariance data provided by NEON?

Adolpho - great question. At this point, NEON does not have Python code specific to working with the H5 (HDF5) eddy covariance data. However, providing this is in the works and, when available, you'd be able to find it on the NEONScience GitHub organization or listed on the NEON Code Resources page. In general, the h5py Python package is pretty good for working with H5 data. There are other tutorials in Python using this package with the NEON hyperspectral imaging data if you want to check them out.
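
As a rough sketch of how h5py can be used to explore an H5 file whose layout you don't know in advance, the example below walks every group and dataset with `visititems` and records each dataset's path and shape. This is not NEON-provided code, and the nested group names (`SITE/dp01/...`) are made up for illustration; a real eddy covariance file will have its own structure.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical nested layout, invented for this sketch; real NEON files will differ.
path = os.path.join(tempfile.mkdtemp(), "demo_nested.h5")
with h5py.File(path, "w") as f:
    # Intermediate groups in the path are created automatically.
    f.create_dataset("SITE/dp01/data/temp", data=np.arange(3.0))
    f.create_dataset("SITE/dp01/qfqm/temp", data=np.zeros(3, dtype=np.int8))

found = []


def record(name, obj):
    """Collect the path and shape of every dataset; groups are skipped."""
    if isinstance(obj, h5py.Dataset):
        found.append((name, obj.shape))


with h5py.File(path, "r") as f:
    # visititems calls record(name, obj) for every group and dataset below the root.
    f.visititems(record)

for name, shape in found:
    print(name, shape)
```

Once you know the path of a dataset, you can read it the same way as in the tutorial, for example `f["SITE/dp01/data/temp"][:]` for the stand-in file above.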
