Data Management System
Note
See the INTERSECT Scientific Data Layer documentation for details.
The main purpose of the INTERSECT federated ecosystem is to enable science breakthroughs with autonomous experiments, self-driving laboratories, smart manufacturing, and AI-driven design, discovery, and evaluation. The Data Management System (DMS) is responsible for collecting, tracking, transferring, storing, curating, and archiving the corresponding scientific data. The services of the DMS are organized in the following four tiers (Fig. 29):
Tier 0 (infrastructure) abstracts access to different storage backends, such as file systems, databases, and object stores. Its services provide a create, read, update, and delete (CRUD)-like interface to abstract objects. Each object represents an asset of a particular class (e.g., a file, a metadata entry, or a catalog item).
Tier I provides data transport and storage management services.
Tier II provides data registration and deployment services.
Tier III provides data repository and data catalog services.
Fig. 29 The Data Management System tiers.
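The CRUD-like abstraction described for Tier 0 can be illustrated with a minimal sketch. The names `StorageBackend` and `InMemoryBackend` and the method signatures are assumptions made for illustration only; they are not taken from the INTERSECT specification or any INTERSECT software.

```python
from abc import ABC, abstractmethod
from typing import Dict


class StorageBackend(ABC):
    """Hypothetical Tier 0 interface: CRUD operations on opaque objects."""

    @abstractmethod
    def create(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def read(self, key: str) -> bytes: ...

    @abstractmethod
    def update(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def delete(self, key: str) -> None: ...


class InMemoryBackend(StorageBackend):
    """Toy backend standing in for a file system, database, or object store."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def create(self, key: str, data: bytes) -> None:
        # Refuse to overwrite: creation and update are distinct operations.
        if key in self._objects:
            raise KeyError(f"object already exists: {key}")
        self._objects[key] = data

    def read(self, key: str) -> bytes:
        return self._objects[key]

    def update(self, key: str, data: bytes) -> None:
        # Only existing objects can be updated.
        if key not in self._objects:
            raise KeyError(f"no such object: {key}")
        self._objects[key] = data

    def delete(self, key: str) -> None:
        del self._objects[key]
```

Under this assumption, higher tiers would depend only on the abstract interface, so a file system, database, or object store could be swapped in without changing the calling code.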
The DMS has the following services and microservice capabilities (mapping the System-of-Systems Architecture to the Microservices Architecture):
Data Transport Service
Moves data from one storage backend to another.
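As a rough illustration of moving a data asset between two storage backends, the sketch below copies one object and verifies it with a checksum. `DictBackend` and `transfer` are hypothetical names, and the checksum step is an assumption of this sketch rather than a documented feature of the Data Transport Service.

```python
import hashlib
from typing import Dict


class DictBackend:
    """Hypothetical stand-in for a storage backend (not an INTERSECT class)."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def create(self, key: str, data: bytes) -> None:
        self.objects[key] = data

    def read(self, key: str) -> bytes:
        return self.objects[key]


def transfer(source: DictBackend, destination: DictBackend, key: str) -> str:
    """Copy one object to the destination and return its SHA-256 hex digest."""
    data = source.read(key)
    expected = hashlib.sha256(data).hexdigest()
    destination.create(key, data)
    # Read the object back and compare digests to detect a corrupted copy.
    actual = hashlib.sha256(destination.read(key)).hexdigest()
    if actual != expected:
        raise IOError(f"checksum mismatch while transferring {key}")
    return expected


source = DictBackend()
destination = DictBackend()
source.create("run-1", b"detector counts")
digest = transfer(source, destination, "run-1")
```

The sketch leaves the source copy in place; whether a move also removes the original is a policy decision that the text above does not specify.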
Fig. 30 Overview of systems in the task of moving a data asset from sns.ornl.gov/data to data.olcf.ornl.gov
Data Transport Endpoint Service
Storage Management Service
Data Registration Service
Data Deployment Service
Data Repository Service
Fig. 31 The Data Repository Service relationships
Data Catalog Service
Minimum requirement
At minimum, there must be one and only one DMS in an INTERSECT federated ecosystem, as the DMS spans the infrastructure systems within the same INTERSECT federated ecosystem. Individual services of the DMS may be distributed across infrastructure systems as needed, although some services may exist only once.
Optional requirement
Optionally, multiple INTERSECT federated ecosystems may exist that either operate completely independently of each other or collaborate with each other, but each INTERSECT federated ecosystem has only one (its own) DMS.
Note
Asset classes are loosely defined concepts here. In general, an asset class is a Binary Large Object (BLOB). However, in the context in which they are used, i.e., on a higher abstraction layer, these BLOBs are well defined. Asset classes can also be defined based on other constraints, such as object size or frequency of access. A data asset can be used as an abstraction of domain-specific data, and it has a unique identifier.
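One way to picture a data asset is as an opaque payload tagged with an asset class and a unique identifier. The sketch below is an assumption for illustration: the `AssetClass` members mirror the examples given for Tier 0, and the use of a UUID as the identifier is a choice made here, not something the text prescribes.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum


class AssetClass(Enum):
    """Hypothetical asset classes, taken from the Tier 0 examples."""

    FILE = "file"
    METADATA_ENTRY = "metadata_entry"
    CATALOG_ITEM = "catalog_item"


@dataclass(frozen=True)
class DataAsset:
    """An opaque BLOB plus the information needed to identify and classify it."""

    asset_class: AssetClass
    payload: bytes
    # Assumption: a random UUID serves as the unique identifier.
    asset_id: str = field(default_factory=lambda: str(uuid.uuid4()))


first = DataAsset(AssetClass.FILE, b"raw measurement data")
second = DataAsset(AssetClass.FILE, b"raw measurement data")
```

Both assets carry the same payload and class, yet each receives its own identifier, which is what lets services track them separately.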