Open HDF5 files with Python Sample Code
At the end of this tutorial you will be able to
- open an HDF5 file with Python.
Data to Download
These temperature data were collected by the National Ecological Observatory Network's flux towers at field sites across the US. The entire dataset can be accessed by request from the NEON Data Portal.
These hyperspectral remote sensing data provide information on the National Ecological Observatory Network's San Joaquin Experimental Range field site. The data were collected over the San Joaquin field site located in California (Domain 17) and processed at NEON headquarters. The entire dataset can be accessed by request from the NEON Data Portal.
In this tutorial, we'll work with temperature data collected using sensors on a flux tower by the National Ecological Observatory Network (NEON). Here the data are provided in HDF5 format so that you can explore the format. More on NEON temperature data can be found on the NEON Data Portal. Please note that NEON distributes temperature data as a flat .csv file and not as an HDF5 file. NEON data products including eddy covariance data and remote sensing data are, however, released in the HDF5 format.
Python Code to Open HDF5 files
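The code below is starter code that opens an H5 file in Python, lists its datasets, extracts and plots the reflectance spectrum of one pixel, and writes that spectrum to a new H5 file.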
# import required libraries
import h5py as h5
import numpy as np
import matplotlib.pyplot as plt

if __name__ == '__main__':
    # Read H5 file
    f = h5.File("NEON-DS-Imaging-Spectrometer-Data.h5", "r")

    # Get and print list of datasets within the H5 file
    datasetNames = [n for n in f.keys()]
    for n in datasetNames:
        print(n)

    # extract reflectance data from the H5 file
    reflectance = f['Reflectance']
    # extract one pixel from the data (all bands at row 49, column 392)
    reflectanceData = reflectance[:, 49, 392]
    reflectanceData = reflectanceData.astype(float)

    # divide the data by the scale factor
    # note: this information would be accessed from the metadata
    scaleFactor = 10000.0
    reflectanceData /= scaleFactor

    wavelength = f['wavelength']
    wavelengthData = wavelength[:]
    # reshape the data so the 426 wavelength values form one 1-D array
    wavelengthData = np.reshape(wavelengthData, 426)

    # Print the attributes (metadata):
    print("Data Description : ", reflectance.attrs['Description'])
    print("Data dimensions : ", reflectance.shape, reflectance.attrs['DIMENSION_LABELS'])

    # print a list of attributes in the H5 file
    for n in reflectance.attrs:
        print(n)

    # close the h5 file
    f.close()

    # Plot (raw string so that \mu is passed to matplotlib unchanged)
    plt.plot(wavelengthData, reflectanceData)
    plt.title("Vegetation Spectra")
    plt.ylabel('Reflectance')
    plt.ylim((0, 1))
    plt.xlabel(r'Wavelength [$\mu m$]')
    plt.show()

    # Write a new HDF file containing this spectrum
    f = h5.File("VegetationSpectra.h5", "w")
    rdata = f.create_dataset("VegetationSpectra", data=reflectanceData)
    attrs = rdata.attrs
    attrs.create("Wavelengths", data=wavelengthData)
    f.close()
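To check that the spectrum was written as intended, the new file can be read back. The sketch below is one way to do this, not part of the original starter code. It uses the same dataset name ("VegetationSpectra") and attribute name ("Wavelengths") as above, but the helper functions write_spectrum and read_spectrum are hypothetical, and the arrays are synthetic stand-ins so that the example runs without the NEON file. It opens files with a `with` block, which closes them automatically even if an error occurs.

```python
import os
import tempfile

import h5py
import numpy as np


def write_spectrum(path, reflectance, wavelengths):
    """Write a 1-D spectrum and store its wavelengths as an attribute."""
    with h5py.File(path, "w") as f:
        dset = f.create_dataset("VegetationSpectra", data=reflectance)
        dset.attrs.create("Wavelengths", data=wavelengths)


def read_spectrum(path):
    """Read the spectrum and its wavelength attribute back into memory."""
    with h5py.File(path, "r") as f:
        dset = f["VegetationSpectra"]
        reflectance = dset[:]                    # copy the data out of the file
        wavelengths = dset.attrs["Wavelengths"]  # attributes are read as NumPy arrays
    return reflectance, wavelengths


# Synthetic stand-in data (hypothetical values, not NEON measurements)
wavelengths = np.linspace(0.38, 2.51, 426)
reflectance = np.linspace(0.0, 0.5, 426)

# Write to a temporary directory so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "VegetationSpectra.h5")
    write_spectrum(path, reflectance, wavelengths)
    reflectance_back, wavelengths_back = read_spectrum(path)

print(reflectance_back.shape)
print(np.allclose(wavelengths_back, wavelengths))
```

To inspect the file produced by the starter code instead, call read_spectrum("VegetationSpectra.h5") after running it.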