Looking to climate science for ways to deal with data overload

November 04, 2011

Large science conferences are usually a science smorgasbord for a knowledge junkie like myself. But I showed up at the World Climate Research Program Open Science Meeting in Denver with a time limit and tunnel vision. I focused on how scientists, policymakers and educators might make more and better use of large data sets – and even participate in generating them. You see, my colleagues and I in the education/outreach and cyberinfrastructure departments at NEON are fixing to develop a web portal for NEON data that makes the data accessible in more than one sense of the word. We are trying to design tools that make it relatively easy for you to find both the data you want and the meaning in the data (if there is any). To my surprise and delight, I found at WCRP that related efforts are taking place at institutions right in Boulder. I dove into a broad, deep pool of international climate science and cultural diversity at WCRP and ended up talking almost entirely with people who work more or less down the street from me.

For example, the first poster in the Tuesday morning session that caught my eye was about the Data Rods project from the National Snow and Ice Data Center at the University of Colorado Boulder. A huge quantity of remote sensing data exists that was collected by different instruments in different formats at different spatial scales over several decades. Some satellite data exists only as image files. It’s incredibly time-consuming to search and analyze those files across both space and time using relational databases. Converting spatially referenced data into data rods, as David Gallaher explained to me, makes it many times faster to assess changes in a specific location over time. NEON will be monitoring the same locations over 30 years with georeferenced sampling and remote sensing, and much of NEON’s data could be converted into data rods for simultaneous analysis across both space and time. For example, with satellite images, the data rod system takes whatever spatial reference information is available about each image and lines up the image pixels inside a spatial grid, where each grid cell corresponds to a specific location on Earth. Each gridded pixel is like a page in a flipbook: as you add pages from other images taken at different times, you end up with a story about how that location changed over time – a data rod.
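For readers who think in code, here is a minimal Python sketch of the flipbook idea. It is not the Data Rods project's actual implementation: the function name `build_data_rods`, the input layout (a list of year and grid pairs) and the pixel values are all made up for illustration, and it assumes every image has already been lined up on the same spatial grid.

```python
from collections import defaultdict


def build_data_rods(images):
    """Regroup a stack of gridded images into one time series per grid cell.

    `images` is a list of (timestamp, grid) pairs. Each grid is a list of
    rows, and all grids are assumed to share the same spatial alignment
    (a hypothetical simplification). Returns a dict that maps
    (row, col) -> [(timestamp, value), ...] in time order.
    """
    rods = defaultdict(list)
    # Sort by timestamp so each rod reads oldest to newest.
    for timestamp, grid in sorted(images, key=lambda item: item[0]):
        for r, row in enumerate(grid):
            for c, value in enumerate(row):
                rods[(r, c)].append((timestamp, value))
    return dict(rods)


# Three tiny 2x2 "images" with invented brightness values, out of order.
images = [
    (2001, [[0.9, 0.2], [0.8, 0.1]]),
    (1979, [[0.9, 0.9], [0.8, 0.7]]),
    (1990, [[0.9, 0.5], [0.8, 0.4]]),
]

rods = build_data_rods(images)
print(rods[(0, 1)])  # [(1979, 0.9), (1990, 0.5), (2001, 0.2)]
```

Once the data is stored this way, asking how one location changed over time is a single lookup instead of a search through every image file.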

Convert old satellite images of the Arctic into data rods, and it’s a relatively quick computational task to sort out clouds from ice sheets, Gallaher told me. Clouds and ice are both white, but clouds move faster, and data rods will conveniently animate those different rates of movement in a way that’s easy to detect with a computer algorithm. Differentiating between the ice and clouds is key to estimating historical ice cover and describing the impacts of climate change. Scientists, what could you do faster and more efficiently with old or modern environmental data stored this way?
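To make the cloud-versus-ice idea concrete, here is a rough Python sketch of one way a program could tell them apart from a single cell's time series. It is my own illustration, not the algorithm Gallaher's team uses: the function name `classify_cell`, the brightness threshold and the persistence cutoff are all hypothetical, and the time series is assumed to be a list of (timestamp, brightness) pairs for one grid cell.

```python
def classify_cell(rod, bright=0.7, persistent_fraction=0.8):
    """Label one grid cell from its brightness time series.

    `rod` is a list of (timestamp, brightness) pairs for a single location.
    Both thresholds are illustrative assumptions, not published values.
    A cell that is bright in nearly every frame behaves like ice; a cell
    that is bright only now and then behaves like passing cloud.
    """
    values = [value for _, value in rod]
    bright_share = sum(value >= bright for value in values) / len(values)
    if bright_share >= persistent_fraction:
        return "persistent (ice-like)"
    if bright_share > 0:
        return "transient (cloud-like)"
    return "dark"


# Invented hourly brightness values for three cells.
ice_cell = [(hour, 0.9) for hour in range(5)]
cloudy_cell = [(0, 0.1), (1, 0.9), (2, 0.1), (3, 0.1), (4, 0.2)]
dark_cell = [(hour, 0.1) for hour in range(5)]

print(classify_cell(ice_cell))     # persistent (ice-like)
print(classify_cell(cloudy_cell))  # transient (cloud-like)
print(classify_cell(dark_cell))    # dark
```

A real classifier would have to cope with seasonal melt, sensor differences and slow-moving cloud banks, but the core test is the same: white that stays put versus white that passes through.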

Speaking of teaching old data new tricks, I’m a big fan of the Old Weather project, a cleverly designed citizen science effort to transcribe a trove of historical weather and ice information embedded in old weather logs. A core problem with this data is that it was not all collected in the same way, and can’t be compared directly with modern data without correcting for the biases introduced by collection methods. Because these data were collected so long ago, scientists don’t know exactly how they’re biased. But some researchers and high school students in New York and Alaska have followed some very old directions to re-create the thermometer shelter used by the HMS Plover to take hourly air temperature measurements in Point Barrow, Alaska back in the mid-19th century. The replica thermometer shelter is now at the NOAA Barrow Observatory, where it will collect a year’s worth of data that the students will use to estimate the temperature biases introduced by the shelter into the temperature record. I’m looking forward to seeing these students present their results at a future conference, and I’m filing away this project in my brain for retrieval when NEON is ready to develop new programs in its citizen science arm.
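I don’t know exactly how the students plan to crunch their numbers, but the simplest version of a bias estimate might look like the Python sketch below. It assumes the replica shelter’s readings can be paired with readings taken at the same times by a modern reference instrument, which is my assumption rather than a detail from the project. The function names and every temperature here are invented for illustration.

```python
from statistics import mean


def estimate_bias(replica_readings, reference_readings):
    """Return the mean difference (replica minus reference) in degrees C.

    Both lists are assumed to hold readings taken at the same times,
    one from the replica shelter and one from a modern reference.
    """
    differences = [
        replica - reference
        for replica, reference in zip(replica_readings, reference_readings)
    ]
    return mean(differences)


def correct(historical_readings, bias):
    """Subtract the estimated shelter bias from historical readings."""
    return [reading - bias for reading in historical_readings]


# Invented paired readings in degrees C, not real measurements.
replica = [-20.0, -18.5, -15.0, -10.5]
reference = [-21.0, -19.0, -16.0, -11.5]

bias = estimate_bias(replica, reference)
print(bias)                          # 0.875
print(correct([-25.0, -22.0], bias))  # [-25.875, -22.875]
```

A single average is the crudest possible correction. A shelter’s bias could easily change with season, wind or time of day, which is presumably one reason the replica will collect a full year of data.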

Finally, the exception to my tour of local science was a poster by a fellow named Keith Cunningham, who hails from the University of Alaska Fairbanks. His poster was about the use of remote sensing data to audit carbon markets. For example, the United Nations Reducing Emissions from Deforestation and forest Degradation program has set up a carbon market in developing countries by assigning value to the carbon stored in forests and giving the countries financial incentives to manage forests and development in ways that don’t release huge amounts of carbon into the atmosphere and contribute to global warming. UN-REDD is having countries measure their own carbon storage. But Cunningham argues that, as in other markets, a third-party auditor needs to step in to help keep countries accountable for the carbon they are or aren’t storing. That’s where remote sensing science comes in. NEON will collect much of the kind of biomass data that could be used to audit carbon markets. None of that data will be in areas where a carbon market currently applies, but some of the knowledge and methodology NEON generates about collecting and analyzing biomass data at different spatial scales may end up influencing carbon accounting sooner or later. In the thick of getting the observatory up and running, it’s easy to forget that NEON data products and tools could affect decisions in the policy and regulation world that will affect the lives and livelihoods of millions or even billions of people, decades into the future.
