Egeria has a growing collection of connectors to third party technologies. These connectors help to accelerate the rollout of your open metadata ecosystem since they can be used to automate the extraction and distribution of metadata to the third party technologies.
A connector is a client to a third party technology. It supports a standard API that Egeria calls and it then translates these calls into requests to the third party technology. Some connectors are also able to listen for notifications from the third party technology. When a notification is received, the connector converts its content into a call to Egeria to distribute the information to the open metadata ecosystem.
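The two directions described above can be sketched in a few lines. This is an illustrative sketch only: `Connector`, `FakeCatalogTool`, `FakeCatalogConnector` and all of the method names are hypothetical and are not part of Egeria's connector API. The upper/lower-casing of names simply stands in for whatever translation a real third party technology would need.

```python
from abc import ABC, abstractmethod


class FakeCatalogTool:
    """Hypothetical third party technology with its own request format."""

    def __init__(self):
        self.records = {}
        self.subscribers = []

    def put_record(self, key, payload):
        self.records[key] = payload

    def emit(self, notification):
        # Push a notification to every registered callback.
        for callback in self.subscribers:
            callback(notification)


class Connector(ABC):
    """Hypothetical standard API that the caller (Egeria's role here) uses."""

    @abstractmethod
    def publish_metadata(self, name, description):
        ...


class FakeCatalogConnector(Connector):
    def __init__(self, tool, on_metadata_received):
        self.tool = tool
        self.on_metadata_received = on_metadata_received
        # Listen for notifications from the third party technology.
        tool.subscribers.append(self._handle_notification)

    def publish_metadata(self, name, description):
        # Outbound: translate the standard call into the tool's own request.
        self.tool.put_record(name.upper(), {"desc": description})

    def _handle_notification(self, notification):
        # Inbound: convert the tool's notification into a call back to the caller.
        self.on_metadata_received(notification["key"].lower(), notification["desc"])


received = []
tool = FakeCatalogTool()
connector = FakeCatalogConnector(tool, lambda name, desc: received.append((name, desc)))

connector.publish_metadata("customers", "Customer master table")
tool.emit({"key": "ORDERS", "desc": "Order history"})
```

The caller only ever sees `publish_metadata` and the callback, so swapping in a different third party technology would mean writing a different connector class rather than changing the caller.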
Connectors enable Egeria to operate in many environments and with many types of third party technologies, just by managing the configuration of the OMAG servers. The Connector Catalog lists the connector implementations supplied by the Egeria community. There are three broad categories of connectors and the connector catalog is organized accordingly:
- Connectors that support the exchange and maintenance of metadata. This includes the integration connectors, repository connectors, discovery services and governance action services.
- Connectors that support Egeria’s runtime. This includes the event bus connectors, cohort registry stores, configuration stores, audit log destination connectors, open metadata archive stores, REST client connectors and the cohort member remote repository connectors.
- Connectors that provide access to digital resources and their metadata that is stored in the open metadata ecosystem.
Metadata exchange and maintenance connectors¶
The connectors that support the exchange and maintenance of metadata help to accelerate the rollout of your open metadata ecosystem since they can be used to automate the extraction and distribution of metadata to the third party technologies.
|Type of Connector||Description|
|Integration connectors||manage the metadata exchange to a third party technology through an integration service.|
|Repository and Event Mapper connectors||integrate metadata repositories into the open metadata ecosystem so that they can interact with one or more open metadata repository cohorts.|
|Open Discovery Services||analyze the content of resources in the digital landscape and create annotations that are attached to the resource's asset metadata element in the open metadata repositories in the form of a discovery analysis report.|
|Governance Action Services||perform monitoring of metadata changes, validation of metadata, triage of issues, assessment and/or remediation activities as required.|
|Governance Daemon Connectors||contain specialist connectors for the governance servers that make active use of open metadata.|
The integration connectors support the exchange of metadata with third party technologies. This exchange may be inbound and/or outbound; synchronous, polling or event-driven.
An integration connector runs in an Open Metadata Integration Service (OMIS) which is in turn hosted in an Integration Daemon server. Each integration service provides a specialist interface designed to aid the integration with a specific type of technology. The integration connector implementation is therefore dependent on a specific OMIS.
An integration connector is shown deployed in an integration service running in an integration daemon. The connector is linking to a third party technology and also calling the open metadata APIs of Egeria to manage the exchange of metadata.
The files integration connectors run in the Files Integrator Open Metadata Integration Service (OMIS) hosted in the integration daemon.
|Files Integration Connectors||Description|
|Data files monitor||maintains a
|Data folder monitor||maintains a
Cataloguing Databases and their Schemas¶
The database integration connectors run in the Database Integrator Open Metadata Integration Service (OMIS) hosted in the integration daemon.
|Database Integration Connectors||Description|
|PostgreSQL database connector||automatically maintains the open metadata instances for the databases hosted on a PostgreSQL server. This includes the database schemas, tables, columns, primary keys and foreign keys.|
Cataloguing event topics and the structure of their events¶
The topic integration connectors run in the Topic Integrator Open Metadata Integration Service (OMIS) hosted in the integration daemon.
|Topic Integration Connector||Description|
|Kafka Monitor topic integration connector||automatically maintains the open metadata instances for the topics hosted on an Apache Kafka server.|
|Kafka Audit topic integration connector||Validates that topics that are active in an Apache Kafka server are also catalogued in open metadata. Creates an audit log record for each topic that is not catalogued.|
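The check performed by the Kafka Audit connector amounts to a set difference between the active topics and the catalogued topics. The sketch below shows that comparison under simplifying assumptions: both topic lists are supplied as plain Python lists (no Kafka client or Egeria API is called), and the function name, the record layout and the `ACTION` severity label are hypothetical.

```python
def audit_topics(active_topics, catalogued_topics):
    """Return one audit record per active topic that is missing from the catalog."""
    missing = sorted(set(active_topics) - set(catalogued_topics))
    return [
        {"severity": "ACTION", "message": f"Topic '{name}' is active but not catalogued"}
        for name in missing
    ]


records = audit_topics(
    active_topics=["orders", "payments", "clickstream"],
    catalogued_topics=["orders", "payments"],
)
for record in records:
    print(record["message"])
```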
The API integration connectors run in the API Integrator Open Metadata Integration Service (OMIS) hosted in the integration daemon.
|API Integration Connectors||Description|
|Open API Monitor integration connector||automatically maintains the open metadata instances for the APIs described in the Open API Specification extracted from an application.|
Populating security enforcement engines¶
The security integration connectors run in the Security Integrator Open Metadata Integration Service (OMIS) hosted in the integration daemon.
Capturing and publishing Lineage¶
|Lineage Integration Connectors||Description|
|Open Lineage Event Receiver integration connector||Connector to receive open lineage events from an event topic and publish them to lineage integration connectors with listeners registered in the same instance of the Lineage Integrator OMIS.|
|Governance Action to Open Lineage integration connector||Connector to listen for governance actions executing in the open metadata ecosystem, generate open lineage events for them and publish them to the integration connectors running in the same instance of Lineage Integrator OMIS that are listening for OpenLineage events.|
|API-based Open Lineage Log Store integration connector||Connector that calls an OpenLineage compliant API to store the open lineage events that are passed to it through the OpenLineage listener that is registered with the Lineage Integrator OMIS.|
|File-based Open Lineage Log Store integration connector||Connector that stores the open lineage events that are passed to it through the OpenLineage listener that is registered with the Lineage Integrator OMIS. Each OpenLineage event is stored in its own file in JSON format. These files are organized according to the namespace and job name in the event.|
|Open Lineage Cataloguer integration connector||Connector to register an OpenLineage listener with the Lineage Integrator OMIS and to catalog any processes that are not already known to the open metadata ecosystem.|
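The file-based log store row above describes one JSON file per event, organized by namespace and job name. A minimal sketch of such a layout follows. The `job.namespace` and `job.name` fields follow the OpenLineage event structure, but the `store_event` function, the sequence-number file name and the sample values are assumptions made for illustration. The connector's real naming scheme is not stated in this catalog, and the sketch does not sanitize namespaces that contain path separators.

```python
import json
import tempfile
from pathlib import Path


def store_event(root, event, sequence):
    """Write one event to <root>/<namespace>/<job name>/<sequence>.json."""
    job = event["job"]
    folder = Path(root) / job["namespace"] / job["name"]
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / f"{sequence:06d}.json"
    target.write_text(json.dumps(event, indent=2), encoding="utf-8")
    return target


root = tempfile.mkdtemp()
event = {
    "eventType": "COMPLETE",
    "job": {"namespace": "clinical-trials", "name": "load-weekly-measurements"},
    "run": {"runId": "run-0001"},
}
path = store_event(root, event, sequence=1)
print(path.relative_to(root))
```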
Further information relating to integration connectors
Repository and Event Mapper Connectors¶
The repository connector, and its optional event mapper connector, provide the ability to integrate a metadata repository into the open metadata ecosystem. These connectors have direct access to the connected open metadata repository cohorts. There are two patterns of use for these connectors.
In the first pattern, called the native repository connector, the repository connector delegates all of its methods to a particular type of persistence store. Metadata is only accessible through the Egeria APIs and it is stored as entities, relationships and classifications enabling it to support any valid type of open metadata. This type of repository connector runs as the local repository within an Egeria Metadata Access Store server.
Repository connector supporting a native open metadata repository
In the second pattern, called the adapter repository connector, the repository connector, and an optional event mapper connector, provide an adapter for a third party metadata repository so it can be a part of the open metadata ecosystem. These connectors run in a Repository Proxy server.
Repository connector and optional event mapper connector supporting an adapter to a third party metadata repository
The table below lists the repository connectors supporting the native open metadata repositories.
|Native Repository Connector||Description|
|JanusGraph OMRS Repository Connector||provides a native repository for a metadata server using JanusGraph as the backend.|
|XTDB OMRS Repository Connector||provides a native repository for a metadata server that supports historical queries, using XTDB as the persistent store.|
|In-memory OMRS Repository Connector||provides a simple native repository implementation that "stores" metadata in HashMaps within the JVM; it is used for testing, or for environments where metadata maintained in other repositories needs to be cached locally for performance/scalability reasons.|
|Read-only OMRS Repository Connector||provides a native repository implementation that does not support the interfaces for create, update, delete; however, it does support the search interfaces and is able to cache metadata -- this means it can be loaded with open metadata archives to provide standard metadata definitions.|
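The difference between the in-memory and read-only repositories can be pictured with a small sketch. The class and method names below are hypothetical and far simpler than the OMRS repository connector interface. The sketch only shows the idea of a map-backed store whose read-only variant still loads archives and answers searches but rejects maintenance calls.

```python
class InMemoryRepository:
    """Keeps entities in a dictionary keyed by unique identifier."""

    def __init__(self):
        self._entities = {}

    def load_archive(self, archive):
        # Archive content is cached directly, without using the create call.
        for entity in archive:
            self._entities[entity["guid"]] = dict(entity)

    def create_entity(self, guid, type_name, properties):
        self._entities[guid] = {"guid": guid, "type": type_name, **properties}

    def find_by_type(self, type_name):
        return [e for e in self._entities.values() if e["type"] == type_name]


class ReadOnlyRepository(InMemoryRepository):
    """Supports search and archive loading but rejects maintenance calls."""

    def create_entity(self, guid, type_name, properties):
        raise PermissionError("create is not supported by a read-only repository")


repository = ReadOnlyRepository()
repository.load_archive([{"guid": "g-1", "type": "GlossaryTerm", "name": "Patient"}])
terms = repository.find_by_type("GlossaryTerm")
try:
    repository.create_entity("g-2", "GlossaryTerm", {"name": "Trial"})
    rejected = False
except PermissionError:
    rejected = True
```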
The table below lists the repository connectors that act as an adapter for third party metadata repositories.
|Adapter Repository Connectors||Description|
|Apache Atlas OMRS Repository Connector||implements read-only connectivity to the Apache Atlas metadata repository|
|IBM Information Governance Catalog (IGC) OMRS Repository Connector||implements read-only connectivity to the metadata repository within the IBM InfoSphere Information Server suite|
|SAS Viya OMRS Repository Connector||implements metadata exchange to the metadata repository within the SAS Viya Platform|
|Sample Repository proxy (adapter) using polling to access files||implements metadata exchange to a file system using a polling pattern and an embedded OMRS repository.|
Further information relating to Repository and Event Mapper connectors
- Configuring a native repository connector to understand how to set up a repository connector in a Metadata Access Store.
- Configuring an adapter repository connector to understand how to set up a repository connector in a Repository Proxy.
- Writing repository and event mapper connectors for more information on writing new repository and event mapper connectors.
Open Discovery Services¶
An open discovery service is a component that performs specific analysis of the contents of a digital resource on request. The aim of the open discovery service is to enable a detailed picture of the properties of a resource to be built up.
Each time an open discovery service runs to analyse a digital resource, a new discovery analysis report is created and attached to the resource's asset. If the open discovery service is run regularly, it is possible to track how the contents are changing over time.
The discovery analysis report contains one or more sets of related properties that the discovery service has discovered about the resource, its metadata, structure and/or content. These are stored in a set of discovery annotations linked off of the discovery analysis report.
An open discovery service is designed to run at regular intervals to gather a detailed perspective on the contents of the digital resource and how they are changing over time. Each time it runs, it is given access to the results of previously run open discovery services, along with a review of these findings made by individuals responsible for the digital resource (such as stewards, owners, custodians).
Operation of an open discovery service
- Each time an open discovery service runs, Egeria creates a discovery analysis report to describe the status and results of the open discovery service's execution. The open discovery service is passed a discovery context that provides access to metadata.
- The discovery context is able to supply metadata about the asset and create a connector to the digital resource using the connection information linked to the asset. The discovery service uses the connector to access the digital resource's contents in order to perform the analysis.
- The discovery service creates discovery annotations to record the results of its analysis. It adds them to the discovery context which stores them in open metadata attached to the discovery analysis report.
- The discovery annotations can be reviewed and commented on through an external stewardship process. This means choices from, for example, a list of potential options proposed by the discovery services, can be verified and the best one selected by an individual expert. The resulting choices are added to annotation reviews attached to the appropriate annotations.
- The next time the open discovery service runs, a new discovery analysis report is created to link new attachments.
- The discovery context provides access to the existing attachments for that asset along with any annotation reviews. The discovery service is able to link its new annotations to the existing annotations as an annotation extension. This means that the stewards can see the history associated with the new information.
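The relationship between successive reports, annotations, reviews and annotation extensions in the steps above can be sketched as a small data model. All names here (`Annotation`, `DiscoveryReport`, `run_discovery`) are hypothetical simplifications for illustration and do not reflect the types that Egeria defines for discovery services.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Annotation:
    summary: str
    extends: Optional["Annotation"] = None   # link to an earlier annotation
    review: Optional[str] = None             # steward's comment, if any


@dataclass
class DiscoveryReport:
    asset: str
    annotations: List[Annotation] = field(default_factory=list)


def run_discovery(asset, record_count, previous_report=None):
    """Create a new report; extend the latest earlier annotation when one exists."""
    report = DiscoveryReport(asset)
    earlier = None
    if previous_report and previous_report.annotations:
        earlier = previous_report.annotations[-1]
    report.annotations.append(Annotation(f"{record_count} records", extends=earlier))
    return report


first = run_discovery("week1.csv", record_count=120)
first.annotations[0].review = "Count confirmed by data steward"
second = run_discovery("week1.csv", record_count=150, previous_report=first)

latest = second.annotations[0]
print(latest.summary, "<-", latest.extends.summary, "/", latest.extends.review)
```

Each run produces its own report, while the `extends` link lets a steward trace the new annotation back to the earlier finding and its review.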
Runtime for an open discovery service
|Sequential Discovery Pipeline||runs nested discovery services in a sequence (more information on discovery pipelines).|
|CSV Discovery Service||extracts the column names from the first line of the file, counts up the number of records in the file and extracts its last modified time.|
|Validate Drop Foot Weekly Measurements Discovery Service||runs nested discovery services in a sequence (more information on discovery pipelines).|
|Validate Patient Records||runs nested discovery services in a sequence (more information on discovery pipelines).|
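The analysis attributed to the CSV Discovery Service in the table above (column names from the first line, a record count and the last modified time) can be approximated with the Python standard library. This is a sketch rather than the service's implementation: the function name and result keys are hypothetical, and it assumes that the header line is not counted as a record.

```python
import csv
import os
import tempfile
from datetime import datetime, timezone


def analyze_csv(path):
    """Return column names, record count and last modified time for a CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        columns = next(reader, [])               # first line holds the column names
        record_count = sum(1 for _ in reader)    # remaining lines are records
    modified = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
    return {"columns": columns, "recordCount": record_count, "lastModified": modified}


# Build a small sample file so that the sketch is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline="", encoding="utf-8"
) as tmp:
    tmp.write("patientId,date,angle\n")
    tmp.write("P001,2024-01-01,12\n")
    tmp.write("P002,2024-01-01,15\n")

result = analyze_csv(tmp.name)
os.remove(tmp.name)
print(result["columns"], result["recordCount"])
```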
Further information relating to Open Discovery Services
- Configuring an engine host to understand how to set up the Engine Host server where the open discovery services run.
- Setting up a governance engine content pack to create an open discovery engine definition to load into a Metadata Access Store.
- Writing an open discovery service for information on writing new open discovery services.
Governance Action Services¶
A governance action service is a specialized connector that performs monitoring of metadata changes, validation of metadata, triage of issues, assessment and/or remediation activities on request. Some governance action services invoke functions in external engines that are working with data and related assets.
A governance action service runs in the Governance Action Open Metadata Engine Service (OMES) hosted by the Engine Host OMAG Server.
Governance action services implement interfaces defined by the Governance Action Framework (GAF). The GAF offers embeddable functions and APIs to simplify the implementation of governance action services, and their integration into the broader digital landscape, whilst being resilient and with good performance.
It is possible to implement complex governance actions in a single governance action service. Alternatively there are five specialized types of governance action services that help you to break down your governance function into reusable components that can be choreographed by governance action processes to maximise the flexibility of your governance automation. When a governance action service completes, it produces guards that define what needs to be done next along with a list of action targets.
Verification Governance Action Service validates that a rule or policy is being followed. This is often a test that the metadata elements, relationships and classification are set up as they should be. For example, it may check that a new asset has an owner, is set up with governance zones and includes a connection and a schema where possible. Verification governance action services
Triage Governance Action Service runs triage rules to determine how to manage a situation or request, such as a request for action from an open discovery service. Often this involves a human decision maker. It may initiate an external workflow, wait for manual decision or create a ToDo for a specific person.
Remediation Governance Action Service makes updates to metadata elements, relationships between them and classifications. Examples of remediation governance action services include:
- Classification and linking of metadata elements such as adding owners, governance zones and origin classifications to assets.
- Duplicate detection, linking and consolidating.
Provisioning Governance Action Service invokes a provisioning service whenever a provisioning request is made. Typically, the provisioning service is an external service. It may also create lineage metadata to describe the work of the provisioning service if the provisioning service is not able to create lineage itself.
The interfaces for governance action services are defined in the governance-action-framework module.
Governance action service example - data onboarding process
The governance action services are best understood through examples. Consider an onboarding process where new files are being copied into a landing area. They need to be catalogued in open metadata and moved into the data lake folder.
Operation of the data onboarding process
At the start, there is an integration connector called Data Files Monitoring Integration Connector that will detect new files in the landing area folder and create an Asset metadata element to describe the file.
There is also a governance action service called New Asset Watchdog that has registered a listener for new Asset metadata elements.
New files arrive
When a new file arrives, Data Files Monitoring Integration Connector detects it and catalogues it as an Asset in open metadata. This triggers a call to New Asset Watchdog which then creates a governance action to initiate a provisioning governance action service.
Provisioning to the data lake
The governance action identifies the governance action service called Clinical Trial Provisioning and so it is started in the engine host. It moves the file to the data lake folder and adds lineage metadata to describe the data movement and the new Asset for the file in the data lake. The original asset is still in the metadata repository since it is needed to show the source of the data movement.
Archive the asset for the landing area
The result of the provisioning removes the file from the landing area folder. This is detected by Data Files Monitoring Integration Connector which then archives the corresponding Asset. This adds a Memento classification to the Asset which means it is only retrievable on lineage requests.
This is a summary of the flow:
- New file detected by the Integration Connector.
- An Asset describing the file is created in the Metadata Access Server.
- New Asset event passed to Watchdog Governance Action Service.
- New Governance Action created that results in notification to Engine Host.
- Engine Host claims Governance Action and activates Provisioning Governance Action Service.
- Provisioning Governance Action Service moves file and writes lineage.
- Deleted file is detected by the Integration Connector.
- File's Asset is archived (adding a Memento classification to the Asset).
Since the watchdog governance action service calls the provisioning governance action service explicitly via the governance action, their implementations are somewhat tied together. The alternative is for the watchdog governance action service to invoke a governance action process that choreographs the execution of one or more governance action services based on a flow definition managed in open metadata. The governance action process separates the implementation of the watchdog governance action service from the follow-on governance actions, since changes to the follow-on processing are made through open metadata rather than requiring changes to the watchdog governance action service code.
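To illustrate why holding the flow as data decouples the services, the sketch below drives three stand-in services from a table that maps a completed step and the guard it produced to the next step. Everything here is hypothetical: the step names, guard names, `PROCESS` table and `run_process` function are not Egeria types, and a real governance action process is defined in open metadata rather than in a Python dictionary.

```python
def verify_owner(asset):
    # Verification: report whether the asset has an owner.
    return ["owner-present"] if asset.get("owner") else ["owner-missing"]


def assign_default_owner(asset):
    # Remediation: fill in the missing owner.
    asset["owner"] = "data-steward"
    return ["remediated"]


def provision(asset):
    # Provisioning: stand-in for moving the file into the data lake.
    asset["location"] = "data-lake"
    return ["provisioned"]


SERVICES = {"verify": verify_owner, "remediate": assign_default_owner, "provision": provision}

# Flow definition held as data: (completed step, guard) -> next step.
PROCESS = {
    ("verify", "owner-missing"): "remediate",
    ("verify", "owner-present"): "provision",
    ("remediate", "remediated"): "provision",
}


def run_process(first_step, asset):
    """Run steps until no guard produced by a step maps to a follow-on step."""
    trail, pending = [], [first_step]
    while pending:
        step = pending.pop(0)
        trail.append(step)
        for guard in SERVICES[step](asset):
            next_step = PROCESS.get((step, guard))
            if next_step:
                pending.append(next_step)
    return trail


asset = {"name": "week1.csv"}
trail = run_process("verify", asset)
print(trail)
```

Changing the order of the follow-on steps only requires editing the `PROCESS` table; none of the service functions refer to each other.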
|Generic Element Watchdog Governance Action Service||listens for changing metadata elements and initiates governance action processes when certain events occur.|
|Generic Folder Watchdog Governance Action Service||listens for changing assets linked to a
|Move/Copy File Provisioning Governance Action Service||moves or copies files from one location to another and maintains the lineage of the action.|
|Origin Seeker Remediation Governance Action Service||walks backwards through the lineage mappings to discover the origin of the data|
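Walking backwards through lineage mappings, as the Origin Seeker row describes, is a graph traversal from an asset towards the assets that have no upstream source. The sketch below assumes that the lineage is available as a simple dictionary from each asset to its sources. The function name, the dictionary layout and the asset names are hypothetical.

```python
def find_origins(asset, lineage_sources):
    """Walk backwards through lineage mappings and return the assets with no source."""
    origins, seen, stack = set(), set(), [asset]
    while stack:
        current = stack.pop()
        if current in seen:
            continue          # guard against cycles in the lineage graph
        seen.add(current)
        sources = lineage_sources.get(current, [])
        if sources:
            stack.extend(sources)
        else:
            origins.add(current)
    return origins


# Maps each asset to the assets its data came from (hypothetical names).
lineage_sources = {
    "data-lake/week1.csv": ["landing-area/week1.csv"],
    "landing-area/week1.csv": ["hospital-feed"],
}
origins = find_origins("data-lake/week1.csv", lineage_sources)
print(origins)
```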
Further information relating to Governance Action Services
- Configuring an engine host to understand how to set up the Engine Host server where the governance action services run.
- Setting up a governance engine content pack to create a governance action engine definition to load into a Metadata Access Store.
- Writing a governance action service for information on writing new governance action services.
Governance Daemon Connectors¶
The governance daemon connectors contain specialist connectors for the governance servers that make active use of open metadata.
|Open Lineage Janus Connector||The Open Lineage connectors provide plugins to the Open Lineage Server that allow the Open Lineage Services to connect with databases.|
Repository Governance Services¶
A repository governance service is a specialized connector that performs governance on open metadata repositories, such as maintaining an open metadata archive. It is hosted in the Repository Governance OMES which is, in turn, running in an engine host OMAG server.
Figure 1: Repository Governance Services
A repository governance service can:
- Register a listener with the Enterprise OMAS Topic to receive notifications from any of the repositories connected via Open Metadata Repository Cohorts.
- Issue requests to find and retrieve metadata instances from any of the repositories connected via Open Metadata Repository Cohorts.
- Incrementally build and store an open metadata archive.
There are currently no repository governance services supplied by Egeria.
Further information relating to Repository Governance Services
- Configuring an engine host to understand how to set up the Engine Host server where the repository governance services run.
- Setting up a governance engine content pack to create a repository governance engine definition to load into a Metadata Access Store.
- Writing a repository governance service to understand how to write a repository governance service.
Runtime connectors enable Egeria's OMAG Server Platform and its hosted OMAG Servers to operate in many environments by providing plug-in points for the runtime services it needs to operate. Most of the runtime connectors relate to persistent storage, or connections to distributed services.
|Platform Metadata Security Connectors||manage authorization requests for the OMAG Server Platform's services.|
|Server Metadata Security Connectors||manage authorization requests for the OMAG Server's services.|
|Configuration Document Store Connectors||manage the persistence and retrieval of configuration documents.|
|Cohort Registry Store Connectors||store the open metadata repository cohort membership details in the cohort registry store.|
|Open Metadata Archive Store Connectors||read and write open metadata archives.|
|Audit Log Destination Connectors||support different destinations for audit log records.|
|REST Client Connectors||issue REST API calls to Egeria's deployed platforms and third party technologies.|
|Cohort Member Client Connector||supports repository service calls to remote cohort members.|
|Open Metadata Topic Connectors||send and receive events.|
Platform Metadata Security Connectors¶
The platform metadata security connector provides authorization support for requests to the OMAG Server Platform. There is one platform metadata security connector defined for each OMAG Server Platform.
There is one implementation of the platform metadata security connector provided by Egeria. It is a sample that encodes information from the Coco Pharmaceuticals scenarios.
Further information relating to Platform Metadata Security Connectors
- Configuring a Platform Metadata Security Connector in the OMAG Server Platform
- Metadata Security to understand the platform metadata security connector in the context of all of the security features.
- Writing a Platform Metadata Security Connector.
Server Metadata Security Connectors¶
The server metadata security connector provides authorization support for requests to an OMAG Server. There is one server metadata security connector configured for each OMAG Server.
This connector is called each time a request is made to the server. It is called to:
- Validate the user has access to the server.
- Validate the user has access to the called operation.
- Select which zones to associate with the requests based on the user.
- If the request involves an asset, determine whether the user has access to the asset and which properties they are allowed to see.
- If the request is for a connection for an asset, then select which one linked to the asset should be returned to the user.
- Determine which events to send to the cohort.
- Determine what actions can be made on the metadata repository.
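The first few checks in the list above can be sketched as a chain of validations that run before any asset is returned. The `SampleServerSecurity` class, its method names and the user, operation and zone values are all hypothetical. They are not the interface of Egeria's server metadata security connector, and the zone filter is only one plausible reading of how zones restrict visibility.

```python
class SampleServerSecurity:
    """Hypothetical rules: which users may call the server and which zones they see."""

    def __init__(self, server_users, operation_users, user_zones):
        self.server_users = server_users
        self.operation_users = operation_users
        self.user_zones = user_zones

    def validate_user_for_server(self, user):
        if user not in self.server_users:
            raise PermissionError(f"{user} may not call this server")

    def validate_user_for_operation(self, user, operation):
        if user not in self.operation_users.get(operation, set()):
            raise PermissionError(f"{user} may not call {operation}")

    def zones_for_user(self, user):
        return self.user_zones.get(user, [])

    def visible_assets(self, user, operation, assets):
        # Checks run in order: server access, operation access, then zone filtering.
        self.validate_user_for_server(user)
        self.validate_user_for_operation(user, operation)
        zones = set(self.zones_for_user(user))
        return [a["name"] for a in assets if zones & set(a["zones"])]


security = SampleServerSecurity(
    server_users={"analyst", "steward"},
    operation_users={"findAssets": {"analyst", "steward"}},
    user_zones={"analyst": ["data-lake"], "steward": ["data-lake", "quarantine"]},
)
assets = [
    {"name": "week1.csv", "zones": ["data-lake"]},
    {"name": "incoming.csv", "zones": ["quarantine"]},
]
analyst_view = security.visible_assets("analyst", "findAssets", assets)
steward_view = security.visible_assets("steward", "findAssets", assets)
try:
    security.visible_assets("visitor", "findAssets", assets)
    denied = False
except PermissionError:
    denied = True
```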
There is one implementation of the server metadata security connector provided by Egeria. It is a sample that encodes information from the Coco Pharmaceuticals scenarios.
Further information relating to Server Metadata Security Connectors
- Metadata security service provides the interface for the server metadata security connector and manages the calls to it.
- Writing a server metadata security connector
- Configuring the server metadata security connector in a server.
Configuration Document Store Connectors¶
There is one configuration document store connector defined for each OMAG Server Platform.
There are two implementations of the configuration document store connector provided by Egeria: one for an encrypted store (default) and the other for a plain text store.
Encrypted File Configuration Store Connector stores each configuration document as an encrypted JSON file.
File Configuration Store stores each configuration document as a clear text JSON file.
Further information relating to Configuration Document Store Connectors
Cohort Registry Store Connectors¶
The cohort registry store maintains information about the servers registered in an open metadata repository cohort. It resides in each cohort member and represents that member's view of the cohort membership. It contains the registration information sent by this member and the responses received from the other members.
Inside the cohort registry store there is one local registration record describing the information sent to the other members of the cohort and a list of remote registration records received from the other members of the cohort.
A cohort registry store connector manages the persistence of the cohort registry store. Egeria uses a connector to allow different storage methods for different deployment environments. Each member may choose their own implementation of the cohort registry store connector.
Egeria provides a single implementation of a cohort registry store connector:
- Cohort Registry File Store Connector provides the means to store the cohort registry membership details as a JSON file.
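The structure described above (one local registration record plus a list of remote registration records, persisted as JSON) can be sketched as follows. The class name, the JSON keys (`localRegistration`, `remoteRegistrations`, `memberId`, `serverName`) and the file name are assumptions for illustration and may differ from the format that the Cohort Registry File Store Connector actually writes.

```python
import json
import tempfile
from pathlib import Path


class FileCohortRegistryStore:
    """Persists one local registration and a list of remote registrations as JSON."""

    def __init__(self, path):
        self.path = Path(path)

    def _read(self):
        if self.path.exists():
            return json.loads(self.path.read_text(encoding="utf-8"))
        return {"localRegistration": None, "remoteRegistrations": []}

    def _write(self, content):
        self.path.write_text(json.dumps(content, indent=2), encoding="utf-8")

    def save_local_registration(self, registration):
        content = self._read()
        content["localRegistration"] = registration
        self._write(content)

    def save_remote_registration(self, registration):
        content = self._read()
        # Replace any earlier record from the same member.
        others = [r for r in content["remoteRegistrations"]
                  if r["memberId"] != registration["memberId"]]
        content["remoteRegistrations"] = others + [registration]
        self._write(content)


store = FileCohortRegistryStore(Path(tempfile.mkdtemp()) / "cohort-registry.json")
store.save_local_registration({"memberId": "m-1", "serverName": "server-a"})
store.save_remote_registration({"memberId": "m-2", "serverName": "server-b"})
store.save_remote_registration({"memberId": "m-2", "serverName": "server-b-renamed"})
saved = json.loads(store.path.read_text(encoding="utf-8"))
```

Because each member keeps its own store, a different member could back the same three methods with a database or object store without affecting the rest of the cohort.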
Further information relating to Cohort Registry Store Connectors
Open Metadata Archive Store Connectors¶
The open archive store connector manages the storage of an Open Metadata Archive. It is used in a utility or archive service that is maintaining an open metadata archive, and it is called in a Metadata Access Store when it is loading metadata from the archive.
Egeria provides two implementations of the open metadata archive store connector:
File-based Open Metadata Archive Store Connector stores an open metadata archive as a plain text JSON file.
Directory-based Open Metadata Archive File Store Connector stores an open metadata archive in a directory structure where each type definition and metadata instance is stored in JSON format in its own file.
Further information relating to Open Metadata Archive Store Connectors
- Metadata Archiving to understand the different mechanisms that use open metadata archives.
- Open Metadata Archives to understand structure of an open metadata archive.
- Writing an Open Metadata Archive Store Connector.
- Loading an Open Metadata Archive at server startup
- Loading an Open Metadata Archive in a running server
Audit Log Destination Connectors¶
An audit log destination connector provides support for a specific audit log destination. At least one audit log destination connector is configured in every OMAG Server's configuration document and used by its audit log component when the server runs.
An audit log destination's purpose may be either to store, process or distribute audit log records to diagnostic systems. Its associated configuration controls which severities of audit log record it receives. The implementation for the audit log destination connector can make further choices about how each log record is processed.
Below are the connector implementations provided by Egeria:
Console Audit Log Connector writes selected parts of each audit log record to stdout.
slf4j Audit Log Connector writes full log records to the slf4j ecosystem.
File Audit Log Connector creates log records as JSON files in a shared directory.
Event Topic Audit Log Connector sends each log record as an event on the supplied event topic.
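The way configuration controls which severities each destination receives can be sketched as a simple filter applied before a record is handed to a destination. The `ListDestination` and `AuditLog` classes and the severity names used here are illustrative assumptions and are not the audit log destination connector interface.

```python
class ListDestination:
    """Hypothetical destination that keeps the records it receives in a list."""

    def __init__(self, supported_severities=None):
        # None means "accept every severity".
        self.supported_severities = supported_severities
        self.records = []

    def accepts(self, severity):
        return self.supported_severities is None or severity in self.supported_severities

    def store(self, record):
        self.records.append(record)


class AuditLog:
    def __init__(self, destinations):
        self.destinations = destinations

    def log(self, severity, message):
        record = {"severity": severity, "message": message}
        # Each destination only receives the severities it is configured for.
        for destination in self.destinations:
            if destination.accepts(severity):
                destination.store(record)


console = ListDestination()
errors_only = ListDestination({"Error", "Exception"})
audit_log = AuditLog([console, errors_only])
audit_log.log("Information", "Server started")
audit_log.log("Error", "Unable to reach event broker")
```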
Further information relating to Audit Log Destination Connectors
REST Client Connectors¶
Egeria makes extensive use of REST API calls for synchronous (request-response) communication with its own deployed platforms and third party technologies. The REST client connectors are used to issue the REST API calls.
Egeria provides a single implementation for Spring.
- Spring REST Client Connector uses the Spring RESTClient to issue REST API calls.
This is embedded in Egeria's Java clients. See:
- Egeria's [Platform API clients](/guides/developer/#working-with-the-platform-apis).
- Egeria's [OMAS clients](/guides/developer/#working-with-the-open-metadata-and-governance-apis).
Cohort Member Client Connectors¶
Members of an Open Metadata Repository Cohort provide the other cohort members with a Connection to a connector that supports the OMRSRepositoryConnector interface during the cohort registration process. This connector translates calls to retrieve and maintain metadata in the member's repository into remote calls to the real repository.
Egeria's Open Metadata Repository Services (OMRS) provides a default REST API implementation and a corresponding client:
- REST Cohort Client Connector supports remote calls to the OMRS REST API.
The connection for this connector is configured in the LocalRepositoryRemoteConnection property of the cohort member's Local Repository Configuration.
Digital resource connectors¶
The digital resource connectors provide access to digital resources and their metadata that is stored in the open metadata ecosystem. These connectors are for use by external applications and tools to connect with resources and services in the digital landscape. These connectors also supply the Asset metadata from Egeria that describes these resources.
Instances of these connectors are created through the Asset Consumer OMAS, Asset Owner OMAS and Discovery Engine OMAS interfaces. They use the Connection linked to the corresponding Asset in the open metadata ecosystem. If there is more than one connection associated with the asset, then a selection is made by the server metadata security connector running in the OMASs' server.
The Avro file connector provides access to an Avro file that has been catalogued using open metadata.
The basic file connector provides support to read and write to a file using the Java File object.
The CSV file connector is able to retrieve data from a Comma Separated Values (CSV) file where the contents are stored in logical columns with a special character delimiter between the columns.
The data folder connector is for accessing data that is stored as a number of files within a folder (directory).
More coming ...
Open Metadata Topic Connectors¶
The Open Metadata Topic Connectors are used by Egeria to read and write events to a topic managed by an event broker. These events contain notifications relating to changes in metadata and the topic provides an asynchronous event exchange service hosted in the event broker. It is typically wrapped in a connector that supports a specific event type. For example, the Open Metadata Topic Connectors connect servers into an open metadata repository cohort and exchange notifications through the Open Metadata Access Services (OMAS)'s topics called the InTopic and OutTopic. In all of these cases, an open metadata topic connector is nested inside of the specific topic connector. The use of the open metadata topic connector in this way means that only one connector need be implemented for each type of event bus - rather than one for each type of event that Egeria supports.
Egeria provides a single implementation of an open metadata topic connector for Apache Kafka that it uses by default.
- The Kafka Open Metadata Topic Connector implements an Apache Kafka connector for a topic that exchanges Java Objects as JSON payloads.
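The nesting described in this section, where a typed topic connector wraps a generic event bus connector, can be sketched as follows. Both classes are hypothetical: `InMemoryTopicConnector` stands in for an event bus connector such as the Kafka one and simply delivers strings to listeners in the same process, while `AssetEventTopicConnector` stands in for a connector that understands one specific event type.

```python
import json


class InMemoryTopicConnector:
    """Stand-in for an event bus connector: it only moves strings."""

    def __init__(self):
        self.listeners = []

    def register_listener(self, listener):
        self.listeners.append(listener)

    def send_event(self, payload):
        for listener in self.listeners:
            listener(payload)


class AssetEventTopicConnector:
    """Typed wrapper: converts asset events to and from JSON strings."""

    def __init__(self, embedded_connector):
        self.embedded = embedded_connector

    def send_asset_event(self, event_type, asset_name):
        self.embedded.send_event(json.dumps({"eventType": event_type, "asset": asset_name}))

    def register_asset_listener(self, listener):
        # Decode the JSON payload before passing it to the typed listener.
        self.embedded.register_listener(lambda payload: listener(json.loads(payload)))


bus = InMemoryTopicConnector()
topic = AssetEventTopicConnector(bus)
received = []
topic.register_asset_listener(received.append)
topic.send_asset_event("NEW_ASSET", "week1.csv")
```

Supporting a different event bus would mean replacing only the embedded connector; the typed wrapper and its callers stay the same.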