Open Metadata Repository Cohort Operation¶
An Open Metadata Repository Cohort (or more simply, just a cohort) is a collection of servers sharing metadata using a peer-to-peer exchange protocol. Once a server becomes a member of the cohort, it can share metadata with, and receive metadata from, any other member either through events, or through federated queries.
The following types of servers can become a member of one or more cohorts.
Their configuration lists the cohorts that they are to join.
Configuring registration to an Open Metadata Repository Cohort¶
An OMAG Server that is capable of being a Cohort Member can register with one or more open metadata repository cohorts.
Each cohort has a memorable name, for example cocoCohort. This name needs to be used in the configuration of each member. At the heart of a cohort are 1-4 cohort topics. These are topics on an event bus that the members use to exchange information.
There is a choice of topic structure for the cohort.
- A single topic is used for all types of events.
- Three topics are used, each dedicated to a specific type of cohort event:
  - Registration events that exchange information about the members of the cohort.
  - Type verification events that ensure consistency of the open metadata types used by the members of the cohort.
  - Instance events that enable members of the cohort to share metadata elements.
The use of a single topic comes from the original implementation of Egeria. The three dedicated topics were added later, in version 2.11, to reduce the latency of cohort registration and to allow tuning of each topic's configuration. Dedicated topics are essential when multiple instances of an OMAG server are running in a cluster, because the registration and type verification events need to be received by all server instances, whereas the instance events need only be received by one of them.
Typically, all members of the cohort should be configured to use the same topic structure. However, if one of the members is back level and can only support the single topic then the other members can be set up to operate both topic structures. This is less efficient as these servers will process most instance events twice. However, it does provide a workaround until the back-level member can be upgraded.
The choices of topic structure are summarized in Figure 1.
Figure 1: Choices of cohort topic structures referred to as SINGLE_TOPIC, DEDICATED_TOPICS and BOTH_SINGLE_AND_DEDICATED_TOPICS reading left to right
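The relationship between the three topic structures and the "1-4 cohort topics" mentioned earlier can be sketched in a few lines of Python. This is an illustrative model only: the enum values are the names used in this document, while the function name and the topic labels are hypothetical and not part of Egeria.

```python
from enum import Enum
from typing import List


class TopicStructure(Enum):
    SINGLE_TOPIC = "SINGLE_TOPIC"
    DEDICATED_TOPICS = "DEDICATED_TOPICS"
    BOTH_SINGLE_AND_DEDICATED_TOPICS = "BOTH_SINGLE_AND_DEDICATED_TOPICS"


# Hypothetical labels for the kinds of topic; these are not Egeria identifiers.
SINGLE_LABEL = "single"
DEDICATED_LABELS = ("registration", "types", "instances")


def topics_in_use(structure: TopicStructure) -> List[str]:
    """Return the kinds of cohort topic a member connects to for a structure."""
    topics: List[str] = []
    if structure in (TopicStructure.SINGLE_TOPIC,
                     TopicStructure.BOTH_SINGLE_AND_DEDICATED_TOPICS):
        topics.append(SINGLE_LABEL)
    if structure in (TopicStructure.DEDICATED_TOPICS,
                     TopicStructure.BOTH_SINGLE_AND_DEDICATED_TOPICS):
        topics.extend(DEDICATED_LABELS)
    return topics
```

A member using BOTH_SINGLE_AND_DEDICATED_TOPICS connects to all four topics, which is why it is the least efficient of the three choices.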
Configuration commands¶
The commands for configuring a server as a member of a cohort are shown below. Before calling these commands, make sure that the default settings for the event bus are configured, and you know the name of the cohort and the topic structure it is using.
Add access to a cohort
The following command registers the server with a cohort using the default settings. This includes the default cohort topic structure, which is SINGLE_TOPIC before version 3.0 and DEDICATED_TOPICS for version 3.0 and above.
POST {platformURLRoot}/open-metadata/admin-services/users/{adminUserId}/servers/{serverName}/cohorts/{cohortName}
Alternatively it is possible to explicitly specify the cohort topic structure. The example below sets it to DEDICATED_TOPICS. The other options are SINGLE_TOPIC and BOTH_SINGLE_AND_DEDICATED_TOPICS.
POST {platformURLRoot}/open-metadata/admin-services/users/{adminUserId}/servers/{serverName}/cohorts/{cohortName}/topic-structure/DEDICATED_TOPICS
Both of these commands optionally support passing a map of name-value pairs in the request body. These properties are added to the additionalProperties attribute of the Connection objects for each of the cohort topics. The additional properties supported are specific to the topic connector implementation. For example, see the Apache Kafka Topic Connector Documentation.
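When scripting these calls, the two URL forms can be assembled with a small helper. The following Python sketch only builds the URL; it does not send the request. The function name and the example values (platform URL root, admin user id) are assumptions made for illustration.

```python
from typing import Optional
from urllib.parse import quote

# The three topic structure names described in this document.
TOPIC_STRUCTURES = (
    "SINGLE_TOPIC",
    "DEDICATED_TOPICS",
    "BOTH_SINGLE_AND_DEDICATED_TOPICS",
)


def cohort_registration_url(platform_url_root: str,
                            admin_user_id: str,
                            server_name: str,
                            cohort_name: str,
                            topic_structure: Optional[str] = None) -> str:
    """Build the URL for the POST request that adds a cohort to a server."""
    url = (
        f"{platform_url_root.rstrip('/')}/open-metadata/admin-services"
        f"/users/{quote(admin_user_id, safe='')}"
        f"/servers/{quote(server_name, safe='')}"
        f"/cohorts/{quote(cohort_name, safe='')}"
    )
    if topic_structure is not None:
        if topic_structure not in TOPIC_STRUCTURES:
            raise ValueError(f"unknown topic structure: {topic_structure}")
        url += f"/topic-structure/{topic_structure}"
    return url


# Example values are hypothetical.
print(cohort_registration_url("https://localhost:9443", "garygeeke",
                              "cocoMDS1", "cocoCohort", "DEDICATED_TOPICS"))
```

Omitting the last argument produces the first form of the command, which leaves the choice of topic structure to the server's default.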
The result of the cohort configuration call fills out an entry in the cohort list of the server's configuration document. The fields in a cohort list entry are shown in Figure 2.
Figure 2: Fields in an entry in a server's cohort list
It is possible to update any of these fields directly using the following command:
POST {platformURLRoot}/open-metadata/admin-services/users/{adminUserId}/servers/{serverName}/cohorts/{cohortName}/configuration
JSON structure for a member that is using DEDICATED_TOPICS
{
  "class": "CohortConfig",
  "cohortName": "cocoCohort",
  "cohortRegistryConnection": {
    "class": "Connection",
    "headerVersion": 0,
    "connectorType": {
      "class": "ConnectorType",
      "headerVersion": 0,
      "type": {
        "class": "ElementType",
        "headerVersion": 0,
        "elementOrigin": "LOCAL_COHORT",
        "elementVersion": 0,
        "elementTypeId": "954421eb-33a6-462d-a8ca-b5709a1bd0d4",
        "elementTypeName": "ConnectorType",
        "elementTypeVersion": 1,
        "elementTypeDescription": "A set of properties describing a type of connector."
      },
      "guid": "108b85fe-d7a8-45c3-9f88-742ac4e4fd14",
      "qualifiedName": "File Based Cohort Registry Store Connector",
      "displayName": "File Based Cohort Registry Store Connector",
      "description": "Connector supports storing of the open metadata cohort registry in a file.",
      "connectorProviderClassName": "org.odpi.openmetadata.adapters.repositoryservices.cohortregistrystore.file.FileBasedRegistryStoreProvider"
    },
    "endpoint": {
      "class": "Endpoint",
      "headerVersion": 0,
      "address": "./data/servers/cocoMDS4/cohorts/cocoCohort.registrystore"
    }
  },
  "cohortOMRSRegistrationTopicConnection": {
    "class": "VirtualConnection",
    "headerVersion": 0,
    "connectorType": {
      "class": "ConnectorType",
      "headerVersion": 0,
      "connectorProviderClassName": "org.odpi.openmetadata.repositoryservices.connectors.omrstopic.OMRSTopicProvider"
    },
    "embeddedConnections": [
      {
        "class": "EmbeddedConnection",
        "headerVersion": 0,
        "position": 0,
        "displayName": "cocoCohort OMRS Topic for registrations",
        "embeddedConnection": {
          "class": "Connection",
          "headerVersion": 0,
          "connectorType": {
            "class": "ConnectorType",
            "headerVersion": 0,
            "type": {
              "class": "ElementType",
              "headerVersion": 0,
              "elementOrigin": "LOCAL_COHORT",
              "elementVersion": 0,
              "elementTypeId": "954421eb-33a6-462d-a8ca-b5709a1bd0d4",
              "elementTypeName": "ConnectorType",
              "elementTypeVersion": 1,
              "elementTypeDescription": "A set of properties describing a type of connector."
            },
            "guid": "3851e8d0-e343-400c-82cb-3918fed81da6",
            "qualifiedName": "Kafka Open Metadata Topic Connector",
            "displayName": "Kafka Open Metadata Topic Connector",
            "description": "Kafka Open Metadata Topic Connector supports string based events over an Apache Kafka event bus.",
            "connectorProviderClassName": "org.odpi.openmetadata.adapters.eventbus.topic.kafka.KafkaOpenMetadataTopicProvider",
            "recognizedConfigurationProperties": [
              "producer",
              "consumer",
              "local.server.id",
              "sleepTime"
            ]
          },
          "endpoint": {
            "class": "Endpoint",
            "headerVersion": 0,
            "address": "egeria.omag.openmetadata.repositoryservices.cohort.cocoCohort.OMRSTopic.registration"
          },
          "configurationProperties": {
            "producer": {
              "bootstrap.servers": "localhost:9092"
            },
            "local.server.id": "73955db6-026c-4ba5-a180-1355dbf166cf",
            "consumer": {
              "bootstrap.servers": "localhost:9092"
            }
          }
        }
      }
    ]
  },
  "cohortOMRSTypesTopicConnection": {
    "class": "VirtualConnection",
    "headerVersion": 0,
    "connectorType": {
      "class": "ConnectorType",
      "headerVersion": 0,
      "connectorProviderClassName": "org.odpi.openmetadata.repositoryservices.connectors.omrstopic.OMRSTopicProvider"
    },
    "embeddedConnections": [
      {
        "class": "EmbeddedConnection",
        "headerVersion": 0,
        "position": 0,
        "displayName": "cocoCohort OMRS Topic for types",
        "embeddedConnection": {
          "class": "Connection",
          "headerVersion": 0,
          "connectorType": {
            "class": "ConnectorType",
            "headerVersion": 0,
            "type": {
              "class": "ElementType",
              "headerVersion": 0,
              "elementOrigin": "LOCAL_COHORT",
              "elementVersion": 0,
              "elementTypeId": "954421eb-33a6-462d-a8ca-b5709a1bd0d4",
              "elementTypeName": "ConnectorType",
              "elementTypeVersion": 1,
              "elementTypeDescription": "A set of properties describing a type of connector."
            },
            "guid": "3851e8d0-e343-400c-82cb-3918fed81da6",
            "qualifiedName": "Kafka Open Metadata Topic Connector",
            "displayName": "Kafka Open Metadata Topic Connector",
            "description": "Kafka Open Metadata Topic Connector supports string based events over an Apache Kafka event bus.",
            "connectorProviderClassName": "org.odpi.openmetadata.adapters.eventbus.topic.kafka.KafkaOpenMetadataTopicProvider",
            "recognizedConfigurationProperties": [
              "producer",
              "consumer",
              "local.server.id",
              "sleepTime"
            ]
          },
          "endpoint": {
            "class": "Endpoint",
            "headerVersion": 0,
            "address": "egeria.omag.openmetadata.repositoryservices.cohort.cocoCohort.OMRSTopic.types"
          },
          "configurationProperties": {
            "producer": {
              "bootstrap.servers": "localhost:9092"
            },
            "local.server.id": "73955db6-026c-4ba5-a180-1355dbf166cf",
            "consumer": {
              "bootstrap.servers": "localhost:9092"
            }
          }
        }
      }
    ]
  },
  "cohortOMRSInstancesTopicConnection": {
    "class": "VirtualConnection",
    "headerVersion": 0,
    "connectorType": {
      "class": "ConnectorType",
      "headerVersion": 0,
      "connectorProviderClassName": "org.odpi.openmetadata.repositoryservices.connectors.omrstopic.OMRSTopicProvider"
    },
    "embeddedConnections": [
      {
        "class": "EmbeddedConnection",
        "headerVersion": 0,
        "position": 0,
        "displayName": "cocoCohort OMRS Topic for instances",
        "embeddedConnection": {
          "class": "Connection",
          "headerVersion": 0,
          "connectorType": {
            "class": "ConnectorType",
            "headerVersion": 0,
            "type": {
              "class": "ElementType",
              "headerVersion": 0,
              "elementOrigin": "LOCAL_COHORT",
              "elementVersion": 0,
              "elementTypeId": "954421eb-33a6-462d-a8ca-b5709a1bd0d4",
              "elementTypeName": "ConnectorType",
              "elementTypeVersion": 1,
              "elementTypeDescription": "A set of properties describing a type of connector."
            },
            "guid": "3851e8d0-e343-400c-82cb-3918fed81da6",
            "qualifiedName": "Kafka Open Metadata Topic Connector",
            "displayName": "Kafka Open Metadata Topic Connector",
            "description": "Kafka Open Metadata Topic Connector supports string based events over an Apache Kafka event bus.",
            "connectorProviderClassName": "org.odpi.openmetadata.adapters.eventbus.topic.kafka.KafkaOpenMetadataTopicProvider",
            "recognizedConfigurationProperties": [
              "producer",
              "consumer",
              "local.server.id",
              "sleepTime"
            ]
          },
          "endpoint": {
            "class": "Endpoint",
            "headerVersion": 0,
            "address": "egeria.omag.openmetadata.repositoryservices.cohort.cocoCohort.OMRSTopic.instances"
          },
          "configurationProperties": {
            "producer": {
              "bootstrap.servers": "localhost:9092"
            },
            "local.server.id": "73955db6-026c-4ba5-a180-1355dbf166cf",
            "consumer": {
              "bootstrap.servers": "localhost:9092"
            }
          }
        }
      }
    ]
  },
  "cohortOMRSTopicProtocolVersion": "V1",
  "eventsToProcessRule": "ALL"
}
Controlling the name of the cohort topic(s)
Typically, a production deployment of an event bus requires the topics to be explicitly defined in its configuration. In addition, many organizations have naming standards for topics. Therefore, Egeria provides commands to query the topic names from the configuration for easy automation and the ability to override the topic names.
The default single topic name is egeria.omag.openmetadata.repositoryservices.cohort.{cohortName}.OMRSTopic
and the default dedicated topic names are:
- For registration events -
egeria.omag.openmetadata.repositoryservices.cohort.{cohortName}.OMRSTopic.registration
- For type verification events -
egeria.omag.openmetadata.repositoryservices.cohort.{cohortName}.OMRSTopic.types
- For instance events -
egeria.omag.openmetadata.repositoryservices.cohort.{cohortName}.OMRSTopic.instances
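For automation that pre-creates topics on the event bus, the default names listed above can be derived from the cohort name. A minimal Python sketch follows; the function name and the dictionary keys are hypothetical, while the name patterns are the ones listed above.

```python
from typing import Dict


def default_topic_names(cohort_name: str) -> Dict[str, str]:
    """Return the default single and dedicated topic names for a cohort."""
    base = f"egeria.omag.openmetadata.repositoryservices.cohort.{cohort_name}.OMRSTopic"
    return {
        "single": base,
        "registration": f"{base}.registration",
        "types": f"{base}.types",
        "instances": f"{base}.instances",
    }


for kind, name in default_topic_names("cocoCohort").items():
    print(f"{kind}: {name}")
```

If a topic name has been overridden in a member's configuration, query the configured names from the server rather than deriving them.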
This is the command to query the single topic name.
GET {platformURLRoot}/open-metadata/admin-services/users/{adminUserId}/servers/{serverName}/cohorts/{cohortName}/topic-name
{
  "class": "StringResponse",
  "relatedHTTPCode": 200,
  "resultString": "egeria.openmetadata.repositoryservices.cohort.cocoCohort.OMRSTopic"
}
{
  "class": "StringResponse",
  "relatedHTTPCode": 200
}
{
  "class": "StringResponse",
  "relatedHTTPCode": 400,
  "exceptionClassName": "org.odpi.openmetadata.adminservices.ffdc.exception.OMAGInvalidParameterException",
  "exceptionErrorMessage": "OMAG-ADMIN-400-033 The OMAG server cocoMDS1 is unable to override the cohort topic until the cocoCohortXXX cohort is set up",
  "exceptionSystemAction": "No change has occurred in this server's configuration document.",
  "exceptionUserAction": "Add the cohort configuration using the administration services and retry the request."
}
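A script calling these commands needs to distinguish the three shapes of StringResponse shown above: a result string, a successful response with no result string, and a failure. The Python sketch below handles each case using only the field names visible in those samples. The function and exception names are hypothetical, and treating any relatedHTTPCode other than 200 as a failure is an assumption.

```python
import json
from typing import Optional


class AdminRequestError(Exception):
    """Raised when the response reports a failure (hypothetical helper class)."""


def topic_name_from_response(body: str) -> Optional[str]:
    """Return resultString, or None if the response carries no result string."""
    response = json.loads(body)
    code = response.get("relatedHTTPCode")
    if code != 200:
        # Assumption: anything other than 200 is reported to the caller as an error.
        message = response.get("exceptionErrorMessage", "no message supplied")
        raise AdminRequestError(f"{code}: {message}")
    return response.get("resultString")
```

The exceptionUserAction field of a failed response describes how to correct the problem, so it is worth logging alongside the error message.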
This is the command to retrieve the dedicated topics:
GET {platformURLRoot}/open-metadata/admin-services/users/{adminUserId}/servers/{serverName}/cohorts/{cohortName}/dedicated-topic-names
The result looks like this with the registration topic showing first, then the type verification topic and lastly the "instances topic":
{
  "class": "DedicatedTopicListResponse",
  "relatedHTTPCode": 200,
  "dedicatedTopicList": {
    "registrationTopicName": "egeria.omag.openmetadata.repositoryservices.cohort.cocoCohort.OMRSTopic.registration",
    "typesTopicName": "egeria.omag.openmetadata.repositoryservices.cohort.cocoCohort.OMRSTopic.types",
    "instancesTopicName": "egeria.omag.openmetadata.repositoryservices.cohort.cocoCohort.OMRSTopic.instances"
  }
}
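To feed these names into a topic provisioning script, the response can be reduced to an ordered list. This is a minimal Python sketch based on the field names in the sample response above; the function name is hypothetical.

```python
import json
from typing import List


def dedicated_topic_names(response_body: str) -> List[str]:
    """Return the registration, types and instances topic names, in that order."""
    topics = json.loads(response_body)["dedicatedTopicList"]
    return [
        topics["registrationTopicName"],
        topics["typesTopicName"],
        topics["instancesTopicName"],
    ]
```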
Override the value for the cohort topic
It is also possible to change the name of the topics used by a cohort. Any changes must be issued against each member of the cohort so that they are all connecting to the same cohort topic(s). The new value takes effect the next time the server is started.
Changing the single topic name is done with the following command:
POST {platformURLRoot}/open-metadata/admin-services/users/{adminUserId}/servers/{serverName}/cohorts/{cohortName}/topic-name-override
{newTopicName}
The {newTopicName} flows in the request body as raw text.
This is the command for changing the registration topic name:
POST {platformURLRoot}/open-metadata/admin-services/users/{adminUserId}/servers/{serverName}/cohorts/{cohortName}/topic-name-override/registration
{newTopicName}
This is the command for changing the type verification topic name:
POST {platformURLRoot}/open-metadata/admin-services/users/{adminUserId}/servers/{serverName}/cohorts/{cohortName}/topic-name-override/types
{newTopicName}
This is the command for changing the "instances topic" name:
POST {platformURLRoot}/open-metadata/admin-services/users/{adminUserId}/servers/{serverName}/cohorts/{cohortName}/topic-name-override/instances
{newTopicName}
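The four override commands differ only in the final path segment, so they can be prepared with one helper. The Python sketch below builds a urllib Request object without sending it. The function name is hypothetical, and the text/plain content type is an assumption based on the body being described as raw text.

```python
from typing import Optional
from urllib.request import Request

# Path suffixes for the three dedicated topics, as used in the commands above.
DEDICATED_TOPICS = ("registration", "types", "instances")


def topic_override_request(cohort_url: str,
                           new_topic_name: str,
                           dedicated_topic: Optional[str] = None) -> Request:
    """Prepare the POST request that overrides a cohort topic name.

    cohort_url is the member's cohort URL, ending in /cohorts/{cohortName}.
    With dedicated_topic=None the single topic name is overridden.
    """
    path = "/topic-name-override"
    if dedicated_topic is not None:
        if dedicated_topic not in DEDICATED_TOPICS:
            raise ValueError(f"unknown dedicated topic: {dedicated_topic}")
        path += f"/{dedicated_topic}"
    return Request(
        cohort_url.rstrip("/") + path,
        data=new_topic_name.encode("utf-8"),      # raw text body
        headers={"Content-Type": "text/plain"},   # assumed content type
        method="POST",
    )
```

Because the override must be issued against every member of the cohort, a script would typically loop over the cohort URL of each member and pass each prepared request to urllib.request.urlopen.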
Disconnect from a cohort
This command unregisters a server from a cohort.
DELETE {platformURLRoot}/open-metadata/admin-services/users/{adminUserId}/servers/{serverName}/cohorts/{cohortName}
Formation of a cohort¶
Cohort membership is established dynamically at server start up, through the cohort topic(s) defined in the configuration.
First server¶
To join an open metadata repository cohort, a server first adds a registration event to the cohort topic(s). This event identifies the server, its metadata repository (if any) and its capabilities.
Figure 1: The first server to join the cohort issues a registration request and waits for others to join.
Subsequent servers¶
When another server joins the cohort, it also adds its registration event to the cohort topic(s) and begins to receive the registration events from other members. The other members respond with re-registration events to ensure the new member has the latest information about the originator's capabilities. The exchange of registration information causes all members to verify that they have the latest information about their peers. This is maintained in their own cohort registry store so that they can reconfigure themselves on restart without needing the other members to resend their registration information.
Figure 2: When another server joins the cohort they exchange registration information.
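The registration exchange can be modelled as a toy in-memory simulation. The Python sketch below is not Egeria code: the class names are hypothetical, the event bus is replaced by direct method calls, and each registry records only the URL root of a peer.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Member:
    name: str
    url_root: str
    # Stand-in for the cohort registry store: peer name -> peer URL root.
    registry: Dict[str, str] = field(default_factory=dict)


class RegistrationTopic:
    """Stand-in for the cohort topic that carries registration events."""

    def __init__(self) -> None:
        self.members: List[Member] = []

    def join(self, new_member: Member) -> None:
        for peer in self.members:
            # Existing members receive the newcomer's registration event...
            peer.registry[new_member.name] = new_member.url_root
            # ...and answer with a re-registration event of their own.
            new_member.registry[peer.name] = peer.url_root
        self.members.append(new_member)


topic = RegistrationTopic()
first = Member("cocoMDS1", "https://host1:9443")
second = Member("cocoMDS2", "https://host2:9443")
topic.join(first)    # first server: nobody answers yet
topic.join(second)   # second server: both registries are now populated
print(first.registry, second.registry)
```

After the second join each member holds the other's details, which is the state the cohort registry store preserves across restarts.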
Peer-to-peer operation¶
Once the registration information is exchanged and stored in each member's cohort registry store, each member is ready to issue federated queries across the cohort and to respond to metadata requests from other members. The registration information includes the URL Root and server name of the member. The federation capability in each member allows it to issue metadata create, update, delete and search requests to each and every member of the cohort.
Figure 3: Once the registration is complete the cohort members can issue federated queries.
Primary mechanism for accessing metadata
This peer-to-peer operation and federated queries are the primary mechanism for accessing metadata, because the access services use federated queries for every request they make for metadata.
Metadata exchange¶
Once the cohort membership is established, the server begins publishing information using instance events about changes to its home metadata instances in its local repository. These events can be used by other members to maintain a cache of reference copies of this metadata to improve availability of the metadata and retrieval performance. Updates to this metadata will, however, be automatically routed to the home repository by the enterprise repository services:
Figure 4: Metadata can also be replicated through the cohort to allow caching for availability and performance.
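The idea of maintaining reference copies from instance events can be sketched as a small cache. This Python model is an illustration only: the class and field names are hypothetical, and using a version number to decide whether an incoming event is newer than the cached copy is an assumption.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class InstanceEvent:
    guid: str
    home_collection_id: str   # metadata collection of the home repository
    version: int              # assumed ordering of successive updates


class ReferenceCopyCache:
    def __init__(self, local_collection_id: str) -> None:
        self.local_collection_id = local_collection_id
        self.copies: Dict[str, InstanceEvent] = {}

    def on_instance_event(self, event: InstanceEvent) -> bool:
        """Store a reference copy; return True if the cache changed."""
        if event.home_collection_id == self.local_collection_id:
            return False  # our own home instance, not a reference copy
        cached = self.copies.get(event.guid)
        if cached is not None and cached.version >= event.version:
            return False  # stale or duplicate event
        self.copies[event.guid] = event
        return True
```

The cache never originates changes: an update to a reference copy is routed to the home repository, and the resulting event refreshes the cached copy.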
Metadata refresh
A member may also request that metadata is "refreshed" across the cohort. The originator of the requested metadata then sends the latest version of this metadata to the rest of the cohort through the cohort topic. This mechanism is useful to seed the cache in a new member of the cohort and is invoked as a result of a federated query issued from any cohort member.
Dynamic changes to types¶
Finally, as type definitions (TypeDefs) are added and updated, the cohort members send out events to allow the other members to verify that the new or updated type does not conflict with any of their types. Any conflicts in the types cause audit log messages to be logged in all members, prompting action to resolve the conflicts.
Figure 5: TypeDef validation.
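The verification step can be illustrated with a simplified check. In the Python sketch below, the rule for what counts as a conflict (the same name with a different guid, or the same guid with a different name) is an assumption made for illustration, and the names are hypothetical; the checks performed by the repository services are not described in this document.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TypeDefSummary:
    guid: str
    name: str
    version: int


def find_conflicts(local_types: List[TypeDefSummary],
                   incoming: TypeDefSummary) -> List[str]:
    """Return audit-style messages for each conflict with a local type."""
    conflicts: List[str] = []
    for known in local_types:
        if known.name == incoming.name and known.guid != incoming.guid:
            conflicts.append(
                f"type {incoming.name} has guid {incoming.guid} remotely "
                f"but {known.guid} locally")
        elif known.guid == incoming.guid and known.name != incoming.name:
            conflicts.append(
                f"guid {incoming.guid} is named {incoming.name} remotely "
                f"but {known.name} locally")
    return conflicts
```

An empty list means the incoming type is compatible under this simplified rule; each message otherwise corresponds to an entry that would be written to the audit log.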
Leaving the cohort¶
When an OMAG Server permanently leaves the cohort, it sends an unregistration request. This enables the other members to remove the parting member from their registries.
Security¶
The server's metadata security connector provides fine-grained control on which metadata is sent, received and/or stored by the server. This level of control is necessary for metadata repositories that are managing specific collections of valuable metadata such as Assets that have sensitive attributes that need to be removed.
Explore hands-on
The administration hands-on lab called "Understanding Cohort Configuration Lab" provides an opportunity to query the cohort registries of cohort members as they exchange metadata for Coco Pharmaceuticals.
Federated queries¶
A federated query combines metadata retrieved from all members of the connected cohorts.
Federated query visiting the local repository and then calling all other servers connected via the cohort(s).
The list of servers that are called by a federated query is built dynamically from the cohort registration request events. These events take information from the configuration document for the server.
Configuring the local repository for federated queries¶
In the Local Repository section of the configuration document are two connections:
- LocalRepositoryLocalConnection is the connector to the metadata repository for this local server.
- LocalRepositoryRemoteConnection is the connector that remote servers should use in their federated queries to retrieve information from this local repository.
Configuration document showing both the local and remote connections for the local repository in a metadata access store or repository proxy. In a metadata access point, both connections are null. In a metadata access store that does not support federated queries, LocalRepositoryRemoteConnection is null.
The LocalRepositoryRemoteConnection is sent to the other cohort members in the registration request events. The default value specifies the OMRS REST Repository Connector as the remote repository connector.
The default remote-repository connector is a REST API client for the Repository REST API supported by Egeria OMAG Server Platform.
When the registration request is accepted, the receiving system uses the LocalRepositoryRemoteConnection to configure the remote repository connector in the Enterprise Repository Connector from the enterprise repository services that is responsible for executing the federated queries.
The remote repository connectors are established dynamically in the enterprise repository connector using information from the registration events.
Making federated queries¶
Whenever an Open Metadata Access Service (OMAS) is called, it uses the enterprise repository connector to create, retrieve, update and delete metadata.
The operation of the enterprise repository connector depends on the type of request. When metadata is retrieved, the request is passed to all connected repositories and the results are combined.
When metadata is retrieved, the enterprise repository connector calls the local repository and each of the registered remote repositories. The metadata returned is combined and passed to the caller.
Updates and deletes are targeted to the home repository of the instance. The metadata collection identifier of the home repository is encoded in the header of the metadata instance. This identifier is also known to each of the remote repository connectors, so it is possible to match the instance with its home repository.
The home metadata collection is identified in the header of each metadata element.
When the home repository makes the change to the instance, it sends an event with the latest version of the instance to the rest of the cohort.
Create requests are passed first to the local repository. If it cannot support the requested type of metadata, the enterprise repository connector tries each of the remote repository connectors until one of them reports that the new instance is created.
Whichever repository created the instance becomes that instance's home repository, and it sends out an event to the rest of the cohort to announce that the new instance is available.
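The three routing rules described in this section (retrieval from all repositories, updates to the home repository, creates to the first repository that supports the type) can be modelled in memory. The Python sketch below is a toy illustration with hypothetical class names; it omits events and does not remove duplicate reference copies from combined results.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Instance:
    guid: str
    type_name: str
    home_collection_id: str   # identifies the home repository
    properties: Dict[str, str]


class Repository:
    def __init__(self, collection_id: str, supported_types: Set[str]) -> None:
        self.collection_id = collection_id
        self.supported_types = supported_types
        self.instances: Dict[str, Instance] = {}

    def create(self, guid: str, type_name: str,
               properties: Dict[str, str]) -> Optional[Instance]:
        if type_name not in self.supported_types:
            return None  # this repository cannot store the requested type
        instance = Instance(guid, type_name, self.collection_id, dict(properties))
        self.instances[guid] = instance
        return instance

    def find_by_type(self, type_name: str) -> List[Instance]:
        return [i for i in self.instances.values() if i.type_name == type_name]


class EnterpriseConnector:
    def __init__(self, local: Repository, remotes: List[Repository]) -> None:
        self.repositories = [local] + list(remotes)  # local repository first

    def find_by_type(self, type_name: str) -> List[Instance]:
        results: List[Instance] = []
        for repository in self.repositories:   # ask every repository
            results.extend(repository.find_by_type(type_name))
        return results

    def update(self, instance: Instance, changes: Dict[str, str]) -> Instance:
        for repository in self.repositories:   # route to the home repository
            if repository.collection_id == instance.home_collection_id:
                home_copy = repository.instances[instance.guid]
                home_copy.properties.update(changes)
                return home_copy
        raise LookupError("home repository is not connected")

    def create(self, guid: str, type_name: str,
               properties: Dict[str, str]) -> Instance:
        for repository in self.repositories:   # local first, then remotes
            created = repository.create(guid, type_name, properties)
            if created is not None:
                return created
        raise ValueError(f"no repository supports type {type_name}")
```

In this model the home repository of a new instance is simply the first repository in the list that accepts the type, which mirrors the order of attempts described above.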