Kepler Workflow Project

Objective

We aim to lower barriers to data integration by seamlessly combining workflows using the CyVerse Discovery Environment, and to support reproducible research and publication by linking to the Research Object infrastructure (ROHub).



Overview

The US NSF-funded Science Across Virtual Institutes (SAVI) project developed a proof of concept that uses NEON and UNAVCO site data to lower these barriers to the use of environmental data. The work was done in collaboration with Expert System, a technical partner in the EU H2020 EVER-EST project, which is building a virtual research environment for earth sciences around the notion of Research Objects (ROs). We will create containerized applications and incorporate Kepler workflows that bring diverse data together for specific applications, with an emphasis on lightweight visual analytics. To assure success, our approach is informed by stakeholder-based use cases and requirements that catalyze future research and actionable science and that integrate readily into business and decision-making workflows. Creating a requirements-based framework from all the identified use cases will inform future applications and scalability to other projects, and will allow us to fully demonstrate the functionality of our development efforts.

https://firemap.sdsc.edu/savi/map.html

http://www.researchobject.org/


Current Status

COOPEUS is an EU and US coordination and support action that brought together Europe’s major environmental research infrastructure projects (EISCAT, EPOS, LifeWatch, EMSO, and ICOS) with their US counterparts, including AMISR, EARTHSCOPE (UNAVCO and IRIS), OOI, and NEON. The aim of COOPEUS is to provide a platform for initiating collaborative cross-infrastructure data sharing and research. The activities within COOPEUS are ongoing in the US (https://www.neonscience.org/observatory/strategic-development/coopeus-pr...) and continue in the EU under the projects COOP+ (http://www.coop-plus.eu) and ENVRI+. As part of COOPEUS’s Strategic Roadmap, these infrastructures are collaborating with the EU project EVER-EST (http://ever-est.eu), whose mission is to build a Virtual Research Environment based on research objects. The main focus is to enhance research reusability and reproducibility through workflow-centric research objects, and to lower the barriers for end users attempting to access ‘Big Data’.

A prototype has been developed that wraps research objects around Kepler workflows, which are in turn linked to several RESTful web services that access time series data from UNAVCO and NEON. Within the workflow, web service output is standardized to the GeoWS time series format, developed as part of the US EarthCube program. A prototype web-based map interface (https://firemap.sdsc.edu/savi/map.html) demonstrates the utility of the Kepler workflow and harmonizes access to time series plots and data from UNAVCO/EarthScope and NEON. A Research Object bundle (including the workflow, input data slice, output data, and provenance trace) can be uploaded to ROHub (ROHub.org) for preservation, reuse, and version management. Once stored in ROHub, the RO can be inspected for reproducibility by other scientists using the provenance trace and the data slice stored in it, without invoking the web services again. ROHub can also assign a DOI to the Research Object, making it citable. The results demonstrate enhanced interoperability or interworkability among global environmental and geophysical data, extending the focus on data and metadata standards to the exchange of formalized scientific concepts relevant to data and to their modeling and analysis.
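The harmonize-then-bundle idea described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two provider payloads, the field names, the common record layout, and the manifest structure are hypothetical. It does not reproduce the GeoWS time series format, the UNAVCO or NEON service responses, the Kepler workflow, or the ROHub API. It only shows two differently shaped time series being mapped onto one layout, and a manifest that records the bundle parts named in the text (workflow, input data slice, output data, provenance trace) with checksums, so that a later re-run on the stored slice can be compared against the stored output.

```python
import hashlib
import json
from datetime import datetime, timezone

# Hypothetical provider payloads. The real UNAVCO and NEON services return
# structures that are not described in this document.
provider_a_rows = [
    {"time": "2016-05-01T00:00:00Z", "val": "1.25"},
    {"time": "2016-05-01T00:30:00Z", "val": "1.31"},
]
provider_b_rows = [
    {"epoch_s": 1462060800, "measurement": 14.2},
    {"epoch_s": 1462062600, "measurement": 14.6},
]


def to_common(timestamp, value, source, variable):
    """Return one observation in an illustrative common layout (not GeoWS)."""
    return {
        "timestamp": timestamp.astimezone(timezone.utc).isoformat(),
        "value": float(value),
        "source": source,
        "variable": variable,
    }


def harmonize(rows_a, rows_b):
    """Map both hypothetical payloads onto the common layout, sorted by time."""
    records = []
    for row in rows_a:
        ts = datetime.strptime(row["time"], "%Y-%m-%dT%H:%M:%SZ")
        ts = ts.replace(tzinfo=timezone.utc)
        records.append(to_common(ts, row["val"], "provider_a", "displacement"))
    for row in rows_b:
        ts = datetime.fromtimestamp(row["epoch_s"], tz=timezone.utc)
        records.append(to_common(ts, row["measurement"], "provider_b", "temperature"))
    return sorted(records, key=lambda r: (r["timestamp"], r["source"]))


def digest(obj):
    """SHA-256 of a canonical JSON encoding, used to compare re-run results."""
    encoded = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def build_manifest(workflow_name, input_slice, output):
    """Describe the bundle parts named in the text; the layout is hypothetical."""
    return {
        "workflow": workflow_name,
        "input_data_slice": {"sha256": digest(input_slice)},
        "output_data": {"sha256": digest(output), "n_records": len(output)},
        "provenance": [
            {"step": "harmonize", "inputs": ["provider_a", "provider_b"]},
        ],
    }


input_slice = {"provider_a": provider_a_rows, "provider_b": provider_b_rows}
records = harmonize(provider_a_rows, provider_b_rows)
manifest = build_manifest("example_workflow", input_slice, records)
print(json.dumps(manifest, indent=2))
```

Because the manifest stores a checksum of the output alongside the input slice, a second scientist could re-run the same step on the stored slice and compare digests without contacting the original web services, which is the reproducibility check the paragraph describes.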

We created a prototype and a proof of concept. We are exploring what works, what doesn’t, what’s scalable, and what’s not. Moving forward, we plan to demonstrate scalability, provide use case examples, and design the scope of the project with the next phase in mind.

 
 