The approach

Semantics-first user workflows

Integrated modeling is an approach to reusing and connecting scientific information, with the aim of making open knowledge available for better decisions.

Over the last ten years, our work has steadily advanced toward a full implementation of the semantic web, with great promise for both science and society. Our approach builds on recent advances in open data and data integration, described in brief below and in detail in the rest of this site.

We view integrated modeling as an implementation of the semantic web, a subset of the World Wide Web where data and models exist as first-class research objects that can be found online and read and understood by both humans and computers.

We begin by conceptualizing integrated modeling worldviews. A worldview contains shared semantics that can address large cross-domain problem areas – for instance, linked socio-environmental systems.

The semantics that accompany a worldview are then used to annotate integrated modeling resources, i.e., data and models. Semantics are by design highly modular, parsimonious, and logically consistent, and allow linking to established authorities by wrapping vocabularies and thesauri that are already well accepted in a particular scientific field or domain. As the number of individuals annotating data and models grows, more and more knowledge becomes available, which the system can use to simulate new phenomena with greater accuracy.
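As a rough illustration of what annotation makes possible, the sketch below tags data and models with a shared concept and then looks them up by that concept. It is written in Python purely for illustration and is not k.LAB code: the names Concept, Resource, and Catalog, the identifiers, and the "example-thesaurus" authority are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Concept:
    """A term in the shared worldview, optionally linked to an external authority."""
    name: str
    authority: Optional[str] = None     # hypothetical vocabulary name
    authority_id: Optional[str] = None  # hypothetical identifier in that vocabulary


@dataclass(frozen=True)
class Resource:
    """A dataset or model annotated with the concept it observes."""
    identifier: str
    kind: str  # "data" or "model"
    observes: Concept


class Catalog:
    """Indexes annotated resources by concept name so they can be found later."""

    def __init__(self) -> None:
        self._by_concept: Dict[str, List[Resource]] = {}

    def annotate(self, resource: Resource) -> None:
        self._by_concept.setdefault(resource.observes.name, []).append(resource)

    def find(self, concept_name: str) -> List[Resource]:
        return list(self._by_concept.get(concept_name, []))


# Two contributors annotate different resources with the same concept.
elevation = Concept("elevation", authority="example-thesaurus", authority_id="T-001")
catalog = Catalog()
catalog.annotate(Resource("dem-global", "data", elevation))
catalog.annotate(Resource("dem-from-lidar", "model", elevation))
```

Because both resources carry the same concept, a request for "elevation" finds both without the requester knowing either identifier in advance, which is the sense in which each new annotation adds to the knowledge available to the system.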

Using a web browser or integrated development environment, the user performs a single action: observing a concept (e.g., elevation, streamflow, or human migration) within a specified spatiotemporal context. k.LAB, the underlying software stack, assembles the data and models needed to observe the concept, choosing those that best match the context of interest.

Data and models, which have been made interoperable through semantic annotation, are assembled using artificial intelligence, i.e., machine reasoning over the semantics combined with ranking of the available options according to user-encoded decision rules, to select the best data and models for the context.
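The selection step can be pictured with a small sketch, again in Python and again hypothetical: it is not how k.LAB is implemented, and the Context and Candidate types, the coverage test, and the ranking rule (prefer finer resolution, then smaller extent) are assumptions chosen only to show the idea of filtering candidates by context and ranking them by a rule.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

BBox = Tuple[float, float, float, float]  # (west, south, east, north)


@dataclass(frozen=True)
class Context:
    """The spatiotemporal context in which a concept is observed."""
    bbox: BBox
    year: int


@dataclass(frozen=True)
class Candidate:
    """An annotated dataset or model that could supply the observation."""
    name: str
    concept: str
    bbox: BBox
    years: Tuple[int, int]  # (first, last), inclusive
    resolution_m: float


def covers(c: Candidate, ctx: Context) -> bool:
    """True if the candidate spans the whole context in space and time."""
    w, s, e, n = c.bbox
    cw, cs, ce, cn = ctx.bbox
    in_space = w <= cw and s <= cs and e >= ce and n >= cn
    in_time = c.years[0] <= ctx.year <= c.years[1]
    return in_space and in_time


def rank_key(c: Candidate) -> Tuple[float, float]:
    """Assumed decision rule: finer resolution first, then smaller extent."""
    area = (c.bbox[2] - c.bbox[0]) * (c.bbox[3] - c.bbox[1])
    return (c.resolution_m, area)


def resolve(concept: str, ctx: Context, candidates: List[Candidate]) -> Optional[Candidate]:
    """Pick the best-ranked candidate that observes the concept and covers the context."""
    eligible = [c for c in candidates if c.concept == concept and covers(c, ctx)]
    return min(eligible, key=rank_key) if eligible else None


candidates = [
    Candidate("global-dem", "elevation", (-180.0, -90.0, 180.0, 90.0), (2000, 2030), 90.0),
    Candidate("regional-dem", "elevation", (-10.0, 35.0, 5.0, 45.0), (2010, 2025), 30.0),
    Candidate("other-region-dem", "elevation", (10.0, 45.0, 15.0, 50.0), (2010, 2025), 10.0),
]
ctx = Context(bbox=(-3.0, 42.0, -2.0, 43.0), year=2020)
best = resolve("elevation", ctx, candidates)
```

In this example the finest dataset is rejected because it does not cover the requested area, and the regional dataset is preferred over the global one because it is finer. A different rule, for example one that prefers more recent data, would only require changing rank_key.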

The results, including model inputs and outputs, are then delivered to the user in well-known file formats, together with a printable report documenting the data and models selected and run. Results are secure, with appropriate data access provided through a user certificate system, and traceable, with full provenance information provided to the user.
