Object Model
Todo
Need to revisit this section to clarify what it is for and to have some form of completeness and consistency. Implementation-specific details need to be removed, as the SoS architecture needs to balance the abstract view of an architecture vs. the details of a specific implementation. Subsections that group and list things are preferred over a disconnected appendix. There is also the balance between describing terms here, which will not be indexed in the glossary, vs. in the glossary.
- Entity/Object
An entity or object is used to represent resources within INTERSECT. They are tuples consisting of a unique identifier (i.e., uri/iri, uuid, etc.) and a (+blob), containing the (encoded) representation of the entity. Objects can link to other objects, creating some relationship between them.
- Entity Types
All objects within INTERSECT have a type. Types have to be unique. Some examples of object types are:
intersect.orchestrator.controller
intersect.orchestrator.controller.campaign
intersect.orchestrator.controller.experiment
intersect.data.catalog
intersect.data.set
intersect.data.asset
Generic Objects
- Resource
Entities that are a distinct logical or physical component of a System. Resources most often represent components such as computers or scientific instruments that have particular sharing or partitioning constraints.
- Capability
A capability is the ability to perform an activity. It expresses the ability of an agent to perform an activity of a certain type. All capabilities must have a unique name. An agent can have multiple capabilities, and there multiple agents can provide the same capability.
Examples for INTERSECT capabilities are:
(:intersect:compute:create)
(:intersect:compute:submit)
(:intersect:orchestrator:create)
(:intersect:orchestrator:submit)
(:intersect:orchestrator:start)
(:intersect:data:create)
(:intersect:data:get)
(:intersect:data:list)
(:intersect:infrastructure:create)
(:intersect:infrastructure:register)
(:intersect:infrastructure:find)
Note
Some agent has the capability to perform the activity described by an entity of a certain type.
The agent ORNL::Users::Some_User wants to execute a bash script.
The agent Summit::Controller::Bash has the capability to perform the activity intersect::compute::execute::script::bash
The agent Summit::Controller::Bash performs activity execute::script::bash on behalf of agent ORNL::Users::Doe::Some_User
The bash script is an entity of type script::bash. To execute the bash script a controller with the capability intersect::compute::execute::script::bash is required. After a suitable agent is selected, it can perform the activity intersect::compute::execute::script::bash.
Fig. 111 Representation of the capability object
- Approval Point
A task which requires input of a specific type of agent. The execution of the task is suspended until the approval is either granted or denied.
- Decision Point
A task whose outcome depends on the output of a previous one.
- Event
The occurrence of an activity.
- Properties
Properties are used to describe INTERSECT objects in more detail. For example, an infrastructure system providing a compute service must have a property for the processor architecture. Additionally, properties can describe other aspects of the infrastructure system, like memory, network, etc.
- Constraint
Constraints are used to limit the resource selection to resources with certain properties. Examples of constraints include:
processor architecture
memory
accelerator
number of CPUs / hardware threads
number of nodes
Agents
- Agent
An agent is a type of performer. An agent has a set of capabilities. It can perform the activities indicated by its capabilities.
A Controller is a type of agent and provides capabilities to interact (control or query) with Resources.
Important
An agent provides a capability
Fig. 112 Representation of the Agent Object
- User
A user is an entity having access to a given resource. Users can perform activities. Users can be assigned roles.
- Organization
An organization is an entity providing a given resource. Users can perform activities on behalf of Organizations.
- SoftwareAgent
A software agent is an entity performing certain activities on behalf of users or organizations.
- Controller
An controller is a specialization of a software agent. Controllers integrate infrastructure systems into the INTERSECT ecosystem.
- Adapter
An adapter is a specialization of a software agent. Adapters integrate software systems into the INTERSECT ecosystem.
Activities
- Activity
An activity is a unit of work. It is an event, typically with a start and end, which is performed for a purpose. Activities can be repeated, and while the activity is the same, a separate event is generated each time a activity is performed.
Activities can have inputs and produce outputs. Inputs and outputs can be files. Inputs can additionally be comprised of parameters.
Activities can have dependencies on other activities. They can also depend on the presence of a resource (software/data asset). Additionally, activities can depend on the availability of a certain type of resource (e.g., a GPU, or a X86-64 CPU architecture, etc.)
Fig. 113 Entity-Relationship Diagram for Activities
- Campaign
A campaign is a specialization of an activity. A campaign manages the state of, the data flow between and the dependencies of a non-empty set of experiments. A campaign has a controller. A campaign has a template.
Fig. 114 A (not so complete) example of the relationships of a Campaign.
- Experiment
An experiment is a collection of Tasks with the goal to achieve a specific outcome. It contains a description of the data flow between the different Tasks and the dependencies between them. An experiment has at least one controller, which coordinates the execution of the experiment. An experiment has one template. An experiment can have a dependency on:
Task / Activity
Performer / Agent / Controller - Types of Performers
Event
Capability
Campaign
Experiment
Data Set
Data Asset
Approval Point
Decision Point
Template
Note
An experiment can depend on instruments or resources. The example experiment depends on a microscope, and can utilize its capabilities. Additionally, it contain a task that takes the output of the microscope to do some post-processing. The task itself has a dependency on a compute service. To run the experiment, the microscope needs to be configured. The microscope creates data for the output.
- Task
While an activity is a unit of work, a task represents the intend to perform an activity when certain conditions are met. The conditions can be dependencies and constraints.
A task has a set of input entities, blocks or consumes a resource, and produces a set of output entities. The input for a task can vary greatly and range from configuration parameters to physical samples. Resources that can be blocked are instruments that are time shared and in most cases consume some type allocations. Other types of consumables can be reactants or solvents. The ‘output’ of a task can vary greatly as well and range from reaction products, metadata about the reaction, to a publishable dataset.
A task has multiple states. It is either ready to be performed, assigned to a performer (as in scheduled), a performer is active**ly performing the work, or the task has been performed either **successful or unsuccessful. Additionally, it can also be suspended, canceled or aborted.
A task depends on the capabilities required by the associated activity. For example, assume the activity is to acquire the spectrum of a given sample, then the task depends on a spectrometer that provides this capability.
Todo
A task can have pre- and post conditions.
A task can have constraints. Constraints are used to express certain requirements of a given task. For example, a computational task can require a certain amount of memory, and constrains the task to systems with at least the required amount of memory. Other possible constrains are the processor architecture, a certain accelerator, number of nodes, etc.
Fig. 115 Task states
Important
A task requires an agent with the capability to execute it. In other words, the task itself has a type, which requires a certain type of controller to execute it. Each task has a controller associated with it.
Note
A task can itself provide a capability
Note
A user wants a bash script executed. To do so a task is created. The task will depend on:
the bash script
the capability to execute the script (i.e. an infrastructure system with a bash shell)
any additional constraints or dependencies - dependencies like access to a particular resource (i.e. some input) - constraints like a minimum amount of memory or number of CPU cores
Once an agent that satisfies all dependencies and constraints (and policies?) is available, it will perform the [activity](#intersect:model:activity) of executing the bash script.
Note
The Microscope has capability measure. The Instrument Controller has capability perform.measure. The Instrument Controller has capability configure.instrument. A Compute Controller has capability perform.analyze. The Experiment Controller has the capability perform.experiment. The Experiment requires the capability perform.experiment. The first task in the experiment is perform.measure. This task is performed by the Controller with the capability perform.measure. The second task is to analyze (perform.analyze) the output of the first task. This task is performed by some Controller with the capability perform.analyze.
Data
- Data Asset
Is a representation of a singular Object / Entity.
as in files (.jpg,.svg, .docx,.sh)
JSON Object
Database, Table, Row, etc.
- Data Set
Is a collection of [Data Assets](#intersect:model:data-asset) that belong together.
- Metadata
Metadata objects are used to store additional information about objects. For example, a data asset can have metadata to provide more context.