A novel way to model
This page contains a brief outline of its final contents.
- When dealing with scientific information, semantics are usually “stated” in metadata – an annotation approach. While there is recognition of the need for good semantics in order to make data and models FAIR, the majority of modeling workflows are still monolithic and hard or impossible to integrate.
- We propose a semantics-first approach, in which information stays linked to its semantics throughout its lifecycle and across computations. The approach is based on recognizing specific types of observables – such as subjects, processes, qualities, events, and relationships. These types constrain how an observable can be conceptualized and modeled, and they specify how the scientific observations (data) that describe it in a given context are created and stored, so that all observables in a category can be treated uniformly.
- We have developed a language, called k.IM, that facilitates the specification of semantically rich data and models. It transparently incorporates the types of observables and the constraints of their composition. In this way, the language’s syntax assists modelers in conceptualizing and developing their observations and models. The language is patterned on natural English and can be read and understood by persons unfamiliar with it.
- Models consist of individual statements that specify how to create an observation of a concept. Each model can be contextualized in terms of space, time, representation etc., using concepts from a shared modeling worldview that ensures complete compatibility of concepts from different domains. After models are specified in k.IM, the k.LAB software, querying the IM semantic web, chooses the most appropriate model for the context. Because a given observable is always observed in the same way, we make no assumptions about the specific data sources and annotate data and models in completely reusable ways.
- Users need only specify a context of observation and a conceptual description of what they want to observe. The context can be chosen interactively (for example, a spatial context can be chosen by zooming in on a region using a map) and the concept can be chosen from the worldview using search tools. The simple action “observe concept in context” (which can be implemented as a drag-and-drop operation in a user interface) builds the best possible model and runs it, returning a finished observation (dataset) to the user.
- We have paid much attention to scale, how it affects semantics, and how it can be mediated to ensure compatibility of observations. Models can be written in a scale-agnostic way, although certain observations can be constrained to particular scales or representations.
- In k.IM, statements are typically very short because only one concept is modeled at a time, and if others are necessary to observe it, these are also specified only conceptually. This allows modelers to leave to the implementation the details that usually make the activity of modeling cumbersome and complex, such as input/output management, visualization, rescaling/reprojection etc.
- Semantics are traditionally represented in the OWL language. We use one simple language for both expressing semantics and writing models. All the knowledge in worldviews is written and can be inspected as k.IM code. All k.IM statements compile to OWL2, so that compatibility with OWL is guaranteed.
- Examples of k.IM code for semantic specification
- The adoption of a semantics-first data stewardship and modeling approach has the potential to completely change the scientific landscape, enabling the linking and discovery of model workflows that were previously impractical. Model paradigms can be merged into a seamless whole. The purpose of our initiative is to use the benefits of semantic modeling to promote a new, more efficient way of answering complex questions and addressing complex problems.
- Examples of k.IM code for model specification
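
The "observe concept in context" step described in this outline (choosing the most appropriate model for a context, with dependencies named only conceptually) can be illustrated with a minimal Python sketch. This is neither k.IM code nor the k.LAB API: the `Context`, `Model`, and `resolve` names, the region-based coverage test, and the rule that region-specific models outrank global ones are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical, simplified context of observation (space only)."""
    region: str


@dataclass(frozen=True)
class Model:
    """One statement describing how to observe a single concept."""
    name: str
    observable: str                   # concept this model observes
    dependencies: tuple = ()          # other concepts, named only conceptually
    regions: frozenset = frozenset()  # empty set means "valid anywhere"

    def covers(self, ctx: Context) -> bool:
        return not self.regions or ctx.region in self.regions


def resolve(concept, ctx, models, _path=()):
    """Return a plan (nested dict) for observing `concept` in `ctx`.

    Assumed ranking rule: models restricted to specific regions are tried
    before global ones. Dependencies are resolved recursively by concept.
    """
    if concept in _path:
        raise LookupError(f"circular dependency on {concept!r}")
    candidates = [m for m in models if m.observable == concept and m.covers(ctx)]
    # Region-specific models sort first (False < True); the sort is stable.
    for model in sorted(candidates, key=lambda m: not m.regions):
        try:
            inputs = [resolve(dep, ctx, models, _path + (concept,))
                      for dep in model.dependencies]
        except LookupError:
            continue  # a dependency cannot be observed here; try the next model
        return {"model": model.name, "observable": concept, "inputs": inputs}
    raise LookupError(f"no model can observe {concept!r} in {ctx}")


# Hypothetical model base: names and concepts are illustrative only.
MODELS = [
    Model("global_elevation", "Elevation"),
    Model("alps_elevation", "Elevation", regions=frozenset({"Alps"})),
    Model("slope_from_elevation", "Slope", dependencies=("Elevation",)),
]

plan = resolve("Slope", Context("Alps"), MODELS)
```

In this sketch the user supplies only a concept (`"Slope"`) and a context; the slope model never names a data source, so the resolver is free to pair it with the Alps-specific elevation model inside the Alps and with the global one elsewhere.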