Connector catalog

Egeria has a growing collection of connectors to third party technologies. These connectors help to accelerate the rollout of your open metadata ecosystem since they can be used to automate the extraction and distribution of metadata to the third party technologies.

A connector is a client to a third party technology. It supports a standard API that Egeria calls and it then translates these calls into requests to the third party technology. Some connectors are also able to listen for notifications from the third party technology. When a notification is received, the connector converts its content into a call to Egeria to distribute the information to the open metadata ecosystem.

Connectors enable Egeria to operate in many environments and with many types of third party technologies, just by managing the configuration of the OMAG servers. The Connector Catalog lists the connector implementations supplied by the Egeria community. There are three broad categories of connectors and the connector catalog is organized accordingly:

  • Connectors that support the exchange and maintenance of metadata. This includes the integration connectors, repository connectors, discovery services and governance action services.

  • Connectors that support Egeria's runtime. This includes the event bus connectors, cohort registry stores, configuration stores, audit log destination connectors, open metadata archive stores, REST client connectors and the cohort member remote repository connectors.

  • Connectors that provide access to digital resources and their metadata that is stored in the open metadata ecosystem.

Metadata exchange and maintenance connectors

The connectors that support the exchange and maintenance of metadata help to accelerate the rollout of your open metadata ecosystem since they can be used to automate the extraction and distribution of metadata to the third party technologies.

| Type of Connector | Description |
|---|---|
| Integration connectors | manage the metadata exchange to a third party technology through an integration service. |
| Repository and Event Mapper connectors | integrate metadata repositories into the open metadata ecosystem so that they can interact with one or more open metadata repository cohorts. |
| Open Discovery Services | analyze the content of resources in the digital landscape and create annotations that are attached to the resource's asset metadata element in the open metadata repositories in the form of a discovery analysis report. |
| Governance Action Services | perform monitoring of metadata changes, validation of metadata, triage of issues, assessment and/or remediation activities as required. |
| Governance Daemon Connectors | contain specialist connectors for the governance servers that make active use of open metadata. |

Integration Connectors

The integration connectors support the exchange of metadata with third party technologies. This exchange may be inbound and/or outbound; synchronous, polling or event-driven.

An integration connector runs in an Open Metadata Integration Service (OMIS) which is in turn hosted in an Integration Daemon server. Each integration service provides a specialist interface designed to aid the integration with a specific type of technology. The integration connector implementation is therefore dependent on a specific OMIS.

Deployed Integration Connector

An integration connector is shown deployed in an integration service running in an integration daemon. The connector is linking to a third party technology and also calling the open metadata APIs of Egeria to manage the exchange of metadata.

Cataloguing Files

The files integration connectors run in the Files Integrator Open Metadata Integration Service (OMIS) hosted in the integration daemon.

| Files Integration Connectors | Description |
|---|---|
| Data files monitor | maintains a DataFile asset for each file in the directory (or any subdirectory). When a new file is created, a new DataFile asset is created. If a file is modified, the lastModified property of the corresponding DataFile asset is updated. When a file is deleted, its corresponding DataFile asset is also deleted (or archived if it is still needed for lineage). |
| Data folder monitor | maintains a DataFolder asset for the directory. The files and directories underneath it are assumed to be elements/records in the DataFolder asset and so each time there is a change to the files and directories under the monitored directory, it results in an update to the lastModified property of the corresponding DataFolder asset. |
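
The create, update and delete-or-archive behaviour described for the data files monitor can be pictured as a comparison between the catalog and a snapshot of the directory. The sketch below is only an illustration of that idea, not Egeria's implementation: the `sync_data_files` function, the dictionary standing in for the metadata catalog, and the `needed_for_lineage` set are all hypothetical.

```python
def sync_data_files(catalog, snapshot, needed_for_lineage=frozenset()):
    """Bring a hypothetical catalog in line with a directory snapshot.

    catalog:  dict mapping file path -> {"lastModified": float, "archived": bool}
    snapshot: dict mapping file path -> last modified time of the files on disk
    """
    for path, modified in snapshot.items():
        asset = catalog.get(path)
        if asset is None or asset["archived"]:
            # New (or reappearing) file: create a fresh DataFile asset.
            catalog[path] = {"lastModified": modified, "archived": False}
        elif asset["lastModified"] != modified:
            # Modified file: update the lastModified property.
            asset["lastModified"] = modified

    for path in list(catalog):
        if path not in snapshot and not catalog[path]["archived"]:
            if path in needed_for_lineage:
                catalog[path]["archived"] = True   # keep it for lineage
            else:
                del catalog[path]                  # no longer needed
    return catalog


catalog = {}
sync_data_files(catalog, {"landing/a.csv": 1.0, "landing/b.csv": 1.0})
sync_data_files(catalog, {"landing/a.csv": 2.0}, needed_for_lineage={"landing/b.csv"})
```

After the second call, the asset for `landing/a.csv` carries the new modification time and the asset for `landing/b.csv` is archived rather than deleted.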

Cataloguing Databases and their Schemas

The database integration connectors run in the Database Integrator Open Metadata Integration Service (OMIS) hosted in the integration daemon.

| Database Integration Connectors | Description |
|---|---|
| PostgreSQL database connector | automatically maintains the open metadata instances for the databases hosted on a PostgreSQL server. This includes the database schemas, tables, columns, primary keys and foreign keys. |

Cataloguing event topics and the structure of their events

The topic integration connectors run in the Topic Integrator Open Metadata Integration Service (OMIS) hosted in the integration daemon.

| Topic Integration Connector | Description |
|---|---|
| Kafka Monitor topic integration connector | automatically maintains the open metadata instances for the topics hosted on an Apache Kafka server. |
| Kafka Audit topic integration connector | validates that topics that are active in an Apache Kafka server are also catalogued in open metadata. It creates an audit log record for each topic that is not catalogued. |
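
The check performed by the Kafka Audit connector amounts to a set difference between the active topics and the catalogued topics. The following sketch shows that comparison in isolation. It does not call Kafka or Egeria; the `audit_topics` function and the shape of the audit record are assumptions made for illustration.

```python
def audit_topics(active_topics, catalogued_topics):
    """Return one (hypothetical) audit log record per uncatalogued active topic."""
    missing = sorted(set(active_topics) - set(catalogued_topics))
    return [
        {"topic": name, "message": f"Topic {name} is active but not catalogued"}
        for name in missing
    ]


records = audit_topics(
    active_topics=["orders", "payments", "returns"],
    catalogued_topics=["orders"],
)
```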

Cataloguing APIs

The API integration connectors run in the API Integrator Open Metadata Integration Service (OMIS) hosted in the integration daemon.

| API Integration Connectors | Description |
|---|---|
| Open API Monitor integration connector | automatically maintains the open metadata instances for the APIs described in the Open API Specification extracted from an application. |

Populating security enforcement engines

The security integration connectors run in the Security Integrator Open Metadata Integration Service (OMIS) hosted in the integration daemon.

Capturing and publishing Lineage

The lineage integration connectors run in the Lineage Integrator OMIS hosted in the integration daemon. They support Lineage Management.

| Lineage Integration Connectors | Description |
|---|---|
| Open Lineage Event Receiver integration connector | Connector to receive open lineage events from an event topic and publish them to lineage integration connectors with listeners registered in the same instance of the Lineage Integrator OMIS. |
| Governance Action to Open Lineage integration connector | Connector to listen for governance actions executing in the open metadata ecosystem, generate open lineage events for them and publish them to the integration connectors running in the same instance of Lineage Integrator OMIS that are listening for OpenLineage events. |
| API-based Open Lineage Log Store integration connector | Connector that calls an OpenLineage compliant API to store the open lineage events that are passed to it through the OpenLineage listener that is registered with the Lineage Integrator OMIS. |
| File-based Open Lineage Log Store integration connector | Connector that stores the open lineage events that are passed to it through the OpenLineage listener that is registered with the Lineage Integrator OMIS. Each OpenLineage event is stored in its own file in JSON format. These files are organized according to the namespace and job name in the event. |
| Open Lineage Cataloguer integration connector | Connector to register an OpenLineage listener with the Lineage Integrator OMIS and to catalog any processes that are not already known to the open metadata ecosystem. |
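
To make the file-based log store's layout concrete, here is a rough sketch of writing one JSON file per event, grouped by namespace and job name. The directory layout, the file naming scheme and the `store_event` function are assumptions for illustration; the connector's real layout may differ. The sketch also assumes the namespace and job name are safe to use as directory names.

```python
import json
import tempfile
from pathlib import Path


def store_event(root, event, sequence):
    """Write one event to its own JSON file under <root>/<namespace>/<job name>/."""
    job = event["job"]
    folder = Path(root) / job["namespace"] / job["name"]
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / f"event-{sequence:06d}.json"   # hypothetical naming scheme
    target.write_text(json.dumps(event, indent=2), encoding="utf-8")
    return target


root = tempfile.mkdtemp()
event = {
    "eventType": "COMPLETE",
    "job": {"namespace": "clinical-trials", "name": "onboard-file"},
}
path = store_event(root, event, sequence=1)
```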

Repository and Event Mapper Connectors

The repository connector and its optional event mapper connector provide the ability to integrate a metadata repository into the open metadata ecosystem. These connectors have direct access to the connected open metadata repository cohorts. There are two patterns of use for these connectors.

In the first pattern, called the native repository connector, the repository connector delegates all of its methods to a particular type of persistence store. Metadata is only accessible through the Egeria APIs and it is stored as entities, relationships and classifications enabling it to support any valid type of open metadata. This type of repository connector runs as the local repository within an Egeria Metadata Access Store server.

Native open metadata repository

Repository connector supporting a native open metadata repository

In the second pattern, called the adapter repository connector, the repository connector, and an optional event mapper connector, provide an adapter for a third party metadata repository so it can be a part of the open metadata ecosystem. These connectors run in a Repository Proxy server.

Adapter repository connectors

Repository connector and optional event mapper connector supporting an adapter to a third party metadata repository

The table below lists the repository connectors supporting the native open metadata repositories.

| Native Repository Connector | Description |
|---|---|
| JanusGraph OMRS Repository Connector | provides a native repository for a metadata server using JanusGraph as the backend. |
| XTDB OMRS Repository Connector | provides a native repository for a metadata server that supports historical queries, using XTDB as the persistent store. |
| In-memory OMRS Repository Connector | provides a simple native repository implementation that "stores" metadata in HashMaps within the JVM; it is used for testing, or for environments where metadata maintained in other repositories needs to be cached locally for performance/scalability reasons. |
| Read-only OMRS Repository Connector | provides a native repository implementation that does not support the interfaces for create, update, delete; however, it does support the search interfaces and is able to cache metadata. This means it can be loaded with open metadata archives to provide standard metadata definitions. |

The table below lists the repository connectors that act as an adapter for third party metadata repositories.

| Adapter Repository Connectors | Description |
|---|---|
| Apache Atlas OMRS Repository Connector | implements read-only connectivity to the Apache Atlas metadata repository. |
| IBM Information Governance Catalog (IGC) OMRS Repository Connector | implements read-only connectivity to the metadata repository within the IBM InfoSphere Information Server suite. |
| SAS Viya OMRS Repository Connector | implements metadata exchange to the metadata repository within the SAS Viya Platform. |
| Sample Repository proxy (adapter) using polling to access files | implements metadata exchange to a file system using a polling pattern and an embedded OMRS repository. |

Open Discovery Services

An open discovery service is a component that performs specific analysis of the contents of a digital resource on request. The aim of the open discovery service is to enable a detailed picture of the properties of a resource to be built up.

Each time an open discovery service runs, it creates a new discovery analysis report linked off of the digital resource's Asset metadata element that records the results of the analysis.

Asset with discovery analysis reports

Each time an open discovery service runs to analyse a digital resource, a new discovery analysis report is created and attached to the resource's asset. If the open discovery service is run regularly, it is possible to track how the contents are changing over time.

The discovery analysis report contains one or more sets of related properties that the discovery service has discovered about the resource, its metadata, structure and/or content. These are stored in a set of discovery annotations linked off of the discovery analysis report.

An open discovery service is designed to run at regular intervals to gather a detailed perspective on the contents of the digital resource and how they are changing over time. Each time it runs, it is given access to the results of previously run open discovery services, along with a review of these findings made by individuals responsible for the digital resource (such as stewards, owners, custodians).

Operation of an open discovery service

  1. Each time an open discovery service runs, Egeria creates a discovery analysis report to describe the status and results of the open discovery service's execution. The open discovery service is passed a discovery context that provides access to metadata.
  2. The discovery context is able to supply metadata about the asset and create a connector to the digital resource using the connection information linked to the asset. The discovery service uses the connector to access the digital resource's contents in order to perform the analysis.
  3. The discovery service creates discovery annotations to record the results of its analysis. It adds them to the discovery context which stores them in open metadata attached to the discovery analysis report.
  4. The discovery annotations can be reviewed and commented on through an external stewardship process. This means choices from, for example, a list of potential options proposed by the discovery services, can be verified and the best one selected by an individual expert. The resulting choices are added to annotation reviews attached to the appropriate annotations.
  5. The next time the open discovery service runs, a new discovery analysis report is created to link new attachments.
  6. The discovery context provides access to the existing attachments for that asset along with any annotation reviews. The discovery service is able to link its new annotations to the existing annotations as an annotation extension. This means that the stewards can see the history associated with the new information.
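
The steps above can be sketched with a few small data structures. This is a simplified model for illustration only: the `Annotation` and `DiscoveryContext` classes and the `run_discovery` function are hypothetical stand-ins, not the interfaces Egeria defines.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Annotation:
    summary: str
    extends: Optional["Annotation"] = None        # annotation extension link
    reviews: list = field(default_factory=list)   # steward annotation reviews


@dataclass
class DiscoveryContext:
    asset_name: str
    existing_annotations: list                    # results from earlier reports
    new_annotations: list = field(default_factory=list)

    def add_annotation(self, summary, extends=None):
        annotation = Annotation(summary, extends)
        self.new_annotations.append(annotation)
        return annotation


def run_discovery(asset_name, reports, analyse):
    """Create a new report for this run and give the service access to history."""
    history = [annotation for report in reports for annotation in report]
    context = DiscoveryContext(asset_name, history)
    analyse(context)
    reports.append(context.new_annotations)       # the new discovery analysis report
    return context.new_annotations


def count_rows(context):
    """Toy discovery service that extends its most recent earlier annotation."""
    history = context.existing_annotations
    previous = history[-1] if history else None
    context.add_annotation("row count recorded", extends=previous)


reports = []
first = run_discovery("patients.csv", reports, count_rows)
first[0].reviews.append("confirmed by steward")   # external stewardship review
second = run_discovery("patients.csv", reports, count_rows)
```

The second run produces a separate report whose annotation links back to the reviewed annotation from the first run, which is how the history stays visible to stewards in this model.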
Runtime for an open discovery service

Open discovery services are packaged into Open Discovery Engines that run in the Asset Analysis OMES hosted in an Engine Host.

The metadata repository interface for metadata discovery tools is implemented by the Discovery Engine OMAS that runs in a Metadata Access Server.

An open discovery service may be triggered by a REST call to the Asset Analysis OMES, via a Governance Action or as part of a governance action process.

Open Discovery Service

| Connector | Description |
|---|---|
| Sequential Discovery Pipeline | runs nested discovery services in a sequence (more information on discovery pipelines). |
| CSV Discovery Service | extracts the column names from the first line of the file, counts up the number of records in the file and extracts its last modified time. |
| Validate Drop Foot Weekly Measurements Discovery Service | runs nested discovery services in a sequence (more information on discovery pipelines). |
| Validate Patient Records | runs nested discovery services in a sequence (more information on discovery pipelines). |
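
The analysis attributed to the CSV Discovery Service (column names, record count, last modified time) is easy to picture with the Python standard library. The sketch below is a stand-alone approximation, not the connector's code. The `profile_csv` function and the result keys are hypothetical, and it assumes the first line is a header that is not counted as a record.

```python
import csv
import os
import tempfile
from datetime import datetime, timezone


def profile_csv(path):
    """Collect column names, record count and last modified time for a CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        column_names = next(reader, [])           # first line holds the column names
        record_count = sum(1 for _ in reader)     # remaining lines are records
    modified = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
    return {
        "columnNames": column_names,
        "recordCount": record_count,
        "lastModified": modified,
    }


with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline="", encoding="utf-8"
) as sample:
    sample.write("patientId,date,angle\n1,2024-01-01,12\n2,2024-01-08,14\n")

profile = profile_csv(sample.name)
```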

Governance Action Services

A governance action service is a specialized connector that performs monitoring of metadata changes, validation of metadata, triage of issues, assessment and/or remediation activities on request. Some governance action services invoke functions in external engines that are working with data and related assets.

A governance action service runs in the Governance Action Open Metadata Engine Service (OMES) hosted by the Engine Host OMAG Server.

Governance Action Service

Governance action services implement interfaces defined by the Governance Action Framework (GAF). The GAF offers embeddable functions and APIs to simplify the implementation of governance action services, and their integration into the broader digital landscape, while remaining resilient and performing well.

It is possible to implement complex governance actions in a single governance action service. Alternatively there are five specialized types of governance action services that help you to break down your governance function into reusable components that can be choreographed by governance action processes to maximise the flexibility of your governance automation. When a governance action service completes, it produces guards that define what needs to be done next along with a list of action targets.

  • Watchdog Governance Action Service listens for changes to metadata and initiates new governance actions, governance action processes or an incident report.

  • Verification Governance Action Service validates that a rule or policy is being followed. This is often a test that the metadata elements, relationships and classifications are set up as they should be. For example, it may check that a new asset has an owner, is set up with governance zones and includes a connection and a schema where possible.

  • Triage Governance Action Service runs triage rules to determine how to manage a situation or request, such as a request for action from an open discovery service. Often this involves a human decision maker. It may initiate an external workflow, wait for a manual decision or create a ToDo for a specific person.

  • Remediation Governance Action Service makes updates to metadata elements, relationships between them and classifications. Examples of remediation governance action services include:

    • Classification and linking of metadata elements such as adding owners, governance zones and origin classifications to assets.
    • Duplicate detection, linking and consolidating.
  • Provisioning Governance Action Service invokes a provisioning service whenever a provisioning request is made. Typically, the provisioning service is an external service. It may also create lineage metadata to describe the work of the provisioning service if the provisioning service is not able to create lineage itself.

The interfaces for governance action services are defined in the governance-action-framework module.

Governance action service example - data onboarding process

The governance action services are best understood through examples. Consider an onboarding process where new files are being copied into a landing area. They need to be catalogued in open metadata and moved into the data lake folder.

Data onboarding scenario

Operation of the data onboarding process

Initialization

At the start, there is an integration connector called Data Files Monitoring Integration Connector that will detect new files in the landing area folder and create an Asset metadata element to describe the file.

There is also a governance action service called New Asset Watchdog that has registered a listener for new Asset metadata elements.

Data onboarding startup

New files arrive

When a new file arrives, Data Files Monitoring Integration Connector detects it and catalogues it as an Asset in open metadata. This triggers a call to New Asset Watchdog which then creates a governance action to initiate a provisioning governance action service.

Data onboarding startup

Provisioning to the data lake

The governance action identifies the governance action service called Clinical Trial Provisioning and so it is started in the engine host. It moves the file to the data lake folder and adds lineage metadata to describe the data movement and the new Asset for the file in the data lake. The original asset is still in the metadata repository since it is needed to show the source of the data movement.

Data onboarding startup

Archive the asset for the landing area

The provisioning removes the file from the landing area folder. This is detected by Data Files Monitoring Integration Connector which then archives the corresponding Asset. This adds a Memento classification to the Asset which means it is only retrievable on lineage requests.

Data onboarding startup

This is a summary of the flow:

Data onboarding overview

  1. New file detected by the Integration Connector.
  2. An Asset describing the file is created in the Metadata Access Server.
  3. New Asset event passed to Watchdog Governance Action Service.
  4. New Governance Action created that results in notification to Engine Host.
  5. Engine Host claims Governance Action and activates Provisioning Governance Action Service.
  6. Provisioning Governance Action Service moves file and writes lineage.
  7. Deleted file is detected by the Integration Connector.
  8. File's Asset is archived (adding a Memento classification to the Asset).

Since the watchdog governance action service calls the provisioning governance action service explicitly via the governance action, their implementations are somewhat tied together. The alternative is for the watchdog governance action service to invoke a governance action process that choreographs the execution of one or more governance action services based on a flow definition managed in open metadata. The governance action process separates the implementation of the watchdog governance action service from the follow-on governance actions, since changes to the follow-on processing are maintained through open metadata rather than requiring changes to the watchdog governance action service code.
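
The idea of a flow definition that picks the next service from the guard produced by the previous one can be sketched as a simple lookup table. This is an illustrative model only: the `run_process` function, the service names and the guard strings are invented for the example and are not part of Egeria's API.

```python
def run_process(flow, services, first_step, target):
    """Run services in the order dictated by the guards each one returns.

    flow:     dict mapping (service name, guard) -> next service name
    services: dict mapping service name -> callable returning a guard string
    """
    executed = []
    step = first_step
    while step is not None:
        guard = services[step](target)
        executed.append((step, guard))
        step = flow.get((step, guard))    # no matching entry ends the process
    return executed


def verify_asset(asset):
    return "verified" if asset.get("owner") else "missing-owner"


def assign_owner(asset):
    asset["owner"] = "steward"
    return "owner-assigned"


def provision_file(asset):
    return "provisioned"


services = {
    "verify-asset": verify_asset,
    "assign-owner": assign_owner,
    "provision-file": provision_file,
}

# The flow lives in data, so it can change without touching the service code.
flow = {
    ("verify-asset", "verified"): "provision-file",
    ("verify-asset", "missing-owner"): "assign-owner",
    ("assign-owner", "owner-assigned"): "verify-asset",
}

trace = run_process(flow, services, "verify-asset", {"name": "week1.csv"})
```

Because the routing is held in `flow` rather than in the services, changing the follow-on processing means editing the table, which mirrors the decoupling described above.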

| Connector | Description |
|---|---|
| Generic Element Watchdog Governance Action Service | listens for changing metadata elements and initiates governance action processes when certain events occur. |
| Generic Folder Watchdog Governance Action Service | listens for changing assets linked to a DataFolder element and initiates governance actions when specific events occur. This may be for files directly linked to the folder or located in sub-folders. |
| Move/Copy File Provisioning Governance Action Service | moves or copies files from one location to another and maintains the lineage of the action. |
| Origin Seeker Remediation Governance Action Service | walks backwards through the lineage mappings to discover the origin of the data. |

Governance Daemon Connectors

The governance daemon connectors contain specialist connectors for the governance servers that make active use of open metadata.

| Connector | Description |
|---|---|
| Open Lineage Janus Connector | The Open Lineage connectors provide plugins to the Open Lineage Server that allow the Open Lineage Services to connect with databases. |

Repository Governance Services

A repository governance service is a specialized connector that performs governance on open metadata repositories, such as maintaining an open metadata archive. It is hosted in the Repository Governance OMES which is, in turn, running in an engine host OMAG server.

Repository Governance Services

Figure 1: Repository Governance Services

A repository governance service can:

  • Register a listener with the Enterprise OMAS Topic to receive notifications from any of the repositories connected via Open Metadata Repository Cohorts.
  • Issue requests to find and retrieve metadata instances from any of the repositories connected via Open Metadata Repository Cohorts.
  • Incrementally build and store an open metadata archive.

There are currently no repository governance services supplied by Egeria.

Runtime connectors

Runtime connectors enable Egeria's OMAG Server Platform and its hosted OMAG Servers to operate in many environments by providing plug-in points for the runtime services they need to operate. Most of the runtime connectors relate to persistent storage, or connections to distributed services.

| Type | Description |
|---|---|
| Platform Metadata Security Connectors | manage authorization requests for the OMAG Server Platform's services. |
| Server Metadata Security Connectors | manage authorization requests for the OMAG Server's services. |
| Configuration Document Store Connectors | manage the persistence and retrieval of configuration documents. |
| Cohort Registry Store Connectors | store the open metadata repository cohort membership details in the cohort registry store. |
| Open Metadata Archive Store Connectors | read and write open metadata archives. |
| Audit Log Destination Connectors | support different destinations for audit log records. |
| REST Client Connectors | issue REST API calls to Egeria's deployed platforms and third party technologies. |
| Cohort Member Client Connector | supports repository service calls to remote cohort members. |
| Open Metadata Topic Connectors | send and receive events. |

Platform Metadata Security Connectors

The platform metadata security connector provides authorization support for requests to the OMAG Server Platform. There is one platform metadata security connector defined for each OMAG Server Platform.

Platform Metadata Security Connector

There is one implementation of the platform metadata security connector provided by Egeria. It is a sample that encodes information from the Coco Pharmaceuticals scenarios.

Server Metadata Security Connectors

The server metadata security connector provides authorization support for requests to an OMAG Server. There is one server metadata security connector configured for each OMAG Server.

Server Metadata Security Connector

This connector is called each time a request is made to the server. It is called to:

  • Validate the user has access to the server.
  • Validate the user has access to the called operation.
  • Select which zones to associate with the requests based on the user.
  • If the request involves an asset, determine whether the user has access to the asset and which properties they are allowed to see.
  • If the request is for a connection for an asset, then select which one linked to the asset should be returned to the user.
  • Determine which events to send to the cohort.
  • Determine what actions can be made on the metadata repository.
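
A few of the checks in this list can be sketched as methods on a small class. This is a hypothetical illustration only: the `SampleServerSecurity` class, its method names, and the users, operations and zones are all invented here and do not reflect Egeria's connector interface.

```python
class SampleServerSecurity:
    """Hypothetical authorization checks driven by simple lookup tables."""

    def __init__(self, server_users, operation_users, user_zones):
        self.server_users = server_users          # users allowed on the server
        self.operation_users = operation_users    # operation -> allowed users
        self.user_zones = user_zones              # user -> governance zones

    def validate_user_for_server(self, user):
        if user not in self.server_users:
            raise PermissionError(f"{user} may not access this server")

    def validate_user_for_operation(self, user, operation):
        if user not in self.operation_users.get(operation, set()):
            raise PermissionError(f"{user} may not call {operation}")

    def zones_for_user(self, user):
        return self.user_zones.get(user, [])

    def visible_assets(self, user, assets):
        # An asset is visible if it shares at least one zone with the user.
        zones = set(self.zones_for_user(user))
        return [asset for asset in assets if zones & set(asset["zones"])]


security = SampleServerSecurity(
    server_users={"erin", "guest"},
    operation_users={"getAsset": {"erin", "guest"}, "deleteAsset": {"erin"}},
    user_zones={"erin": ["clinical-trials"], "guest": ["public"]},
)
assets = [
    {"name": "trial-results", "zones": ["clinical-trials"]},
    {"name": "press-release", "zones": ["public"]},
]
security.validate_user_for_server("erin")
security.validate_user_for_operation("erin", "deleteAsset")
visible_to_guest = security.visible_assets("guest", assets)
```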

There is one implementation of the server metadata security connector provided by Egeria. It is a sample that encodes information from the Coco Pharmaceuticals scenarios.

Configuration Document Store Connectors

The configuration store connectors contain the connector implementations that manage the storage of Configuration Documents for OMAG Servers.

Configuration Document Store Connector

There is one configuration document store connector defined for each OMAG Server Platform.

There are two implementations of the configuration document store connector provided by Egeria: one for an encrypted store (default) and the other for a plain text store.

Cohort Registry Store Connectors

The cohort registry store maintains information about the servers registered in an open metadata repository cohort. It resides in each cohort member and represents that member's view of the cohort membership. It contains the registration information sent by this member and the responses received from the other members.

Cohort registry store connector

Inside the cohort registry store there is one local registration record describing the information sent to the other members of the cohort and a list of remote registration records received from the other members of the cohort.

Internal structure for the information stored inside a single cohort registry store

A cohort registry store connector manages the persistence of the cohort registry store. Egeria uses a connector to allow different storage methods for different deployment environments. Each member may choose their own implementation of the cohort registry store connector.
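
The structure described above (one local registration record plus a list of remote registration records, persisted by a connector) can be sketched as a small file-backed store. The `FileCohortRegistryStore` class, the JSON layout and the field names such as `memberId` are assumptions made for this example, not Egeria's actual storage format.

```python
import json
import tempfile
from pathlib import Path


class FileCohortRegistryStore:
    """Hypothetical file-backed store: one local record plus remote records."""

    def __init__(self, path):
        self.path = Path(path)
        if self.path.exists():
            self.contents = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.contents = {"localRegistration": None, "remoteRegistrations": []}

    def save_local_registration(self, registration):
        self.contents["localRegistration"] = registration
        self._write()

    def save_remote_registration(self, registration):
        # Replace any earlier record from the same member, then add the new one.
        others = [
            record for record in self.contents["remoteRegistrations"]
            if record["memberId"] != registration["memberId"]
        ]
        self.contents["remoteRegistrations"] = others + [registration]
        self._write()

    def _write(self):
        self.path.write_text(json.dumps(self.contents, indent=2), encoding="utf-8")


store_path = Path(tempfile.mkdtemp()) / "cohort-registry.json"
store = FileCohortRegistryStore(store_path)
store.save_local_registration({"memberId": "member-a", "memberName": "serverA"})
store.save_remote_registration({"memberId": "member-b", "memberName": "serverB"})
store.save_remote_registration({"memberId": "member-b", "memberName": "serverB-renamed"})

# A new instance reading the same file sees this member's view of the cohort.
reloaded = FileCohortRegistryStore(store_path)
```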

Egeria provides a single implementation of a cohort registry store connector:

Open Metadata Archive Store Connectors

The open metadata archive store connector manages the storage of an Open Metadata Archive. It is used in a utility or archive service that is maintaining an open metadata archive, and it is called in a Metadata Access Store when it is loading metadata from the archive.

Open Metadata Archive Store Connector

Egeria provides two implementations of the open metadata archive store connector:

Audit Log Destination Connectors

An audit log destination connector provides support for a specific audit log destination. At least one audit log destination connector is configured in every OMAG Server's configuration document and used by its audit log component when the server runs.

Audit Log Destination Connector

An audit log destination's purpose may be to store, process or distribute audit log records to diagnostic systems. Its associated configuration controls which severities of audit log record it receives. The implementation for the audit log destination connector can make further choices about how each log record is processed.
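
The severity-based routing described above can be sketched as a filter applied before each destination receives a record. The `ListDestination` class, the `log_record` function and the severity names used here are illustrative assumptions, not Egeria's connector interface or configuration format.

```python
class ListDestination:
    """Hypothetical destination that keeps the records it receives in a list."""

    def __init__(self, name, severities=None):
        self.name = name
        self.severities = severities      # None means "receive every severity"
        self.records = []

    def accepts(self, severity):
        return self.severities is None or severity in self.severities

    def store(self, record):
        self.records.append(record)


def log_record(destinations, severity, message):
    """Pass a record to each destination configured for its severity."""
    record = {"severity": severity, "message": message}
    for destination in destinations:
        if destination.accepts(severity):
            destination.store(record)


console = ListDestination("console")
alerts = ListDestination("alerts", severities={"Error", "Exception"})

log_record([console, alerts], "Information", "Server started")
log_record([console, alerts], "Error", "Unable to reach event bus")
```

In this sketch the `console` destination receives both records while `alerts` receives only the error.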

Below are the connector implementations provided by Egeria

REST Client Connectors

Egeria makes extensive use of REST API calls for synchronous (request-response) communication with its own deployed platforms and third party technologies. The REST client connectors are used to issue the REST API calls.

REST Client Connector

Egeria provides a single implementation for Spring.

This is embedded in Egeria's Java clients. See:

  • Egeria's Platform API clients (/guides/developer/#working-with-the-platform-apis).

  • Egeria's OMAS clients (/guides/developer/#working-with-the-open-metadata-and-governance-apis).

Cohort Member Client Connectors

Members of an Open Metadata Repository Cohort provide the other cohort members with a Connection to a connector that supports the OMRSRepositoryConnector interface during the cohort registration process. This connector translates calls to retrieve and maintain metadata in the member's repository into remote calls to the real repository.

Cohort Member Client Connector

Egeria's Open Metadata Repository Services (OMRS) provides a default REST API implementation and a corresponding client:

The connection for this connector is configured in the LocalRepositoryRemoteConnection property of the cohort member's Local Repository Configuration.

Digital resource connectors

The digital resource connectors provide access to digital resources and their metadata that is stored in the open metadata ecosystem. These connectors are for use by external applications and tools to connect with resources and services in the digital landscape. These connectors also supply the Asset metadata from Egeria that describes these resources.

Instances of these connectors are created through the Asset Consumer OMAS, Asset Owner OMAS and Discovery Engine OMAS interfaces. They use the Connection linked to the corresponding Asset in the open metadata ecosystem. If there is more than one connection associated with the asset, then a selection is made by the server metadata security connector running in the OMAS's server.

Connection objects are associated with assets in the metadata catalog using the Asset Owner OMAS, Data Manager OMAS and Asset Manager OMAS.

Digital Resource Connector

Files

  • The Avro file connector provides access to an Avro file that has been catalogued using open metadata.

  • The basic file connector provides support to read and write to a file using the Java File object.

  • The CSV file connector is able to retrieve data from a Comma Separated Values (CSV) file where the contents are stored in logical columns with a special character delimiter between the columns.

  • The data folder connector is for accessing data that is stored as a number of files within a folder (directory).

Databases

More coming ...

Open Metadata Topic Connectors

The open metadata topic connector provides a topic interface for a generic string event. It is the type of connector implemented by specific event buses.

The Open Metadata Topic Connectors are used by Egeria to read and write events to a topic managed by an event broker. These events contain notifications relating to changes in metadata, and the topic provides an asynchronous event exchange service hosted in the event broker. An open metadata topic connector is typically wrapped in a connector that supports a specific event type. For example, open metadata topic connectors are used to connect servers into an open metadata repository cohort and to exchange notifications through the Open Metadata Access Services (OMAS) topics called the InTopic and OutTopic. In all of these cases, an open metadata topic connector is nested inside the specific topic connector. Using the open metadata topic connector in this way means that only one connector needs to be implemented for each type of event bus, rather than one for each type of event that Egeria supports.
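
The nesting described above can be sketched with two small classes: a generic connector that only moves strings, and a typed connector that wraps it and handles one event format. Both classes are hypothetical, and the in-memory topic stands in for a real event bus such as Apache Kafka.

```python
import json


class InMemoryStringTopic:
    """Stands in for an event-bus-specific connector that moves plain strings."""

    def __init__(self):
        self.listeners = []

    def register_listener(self, listener):
        self.listeners.append(listener)

    def send_event(self, event_text):
        for listener in self.listeners:
            listener(event_text)


class TypedTopicConnector:
    """Wraps a string topic and converts one event type to and from JSON."""

    def __init__(self, embedded_topic):
        self.embedded_topic = embedded_topic

    def send(self, event):
        self.embedded_topic.send_event(json.dumps(event))

    def register_listener(self, listener):
        self.embedded_topic.register_listener(
            lambda event_text: listener(json.loads(event_text))
        )


received = []
topic = TypedTopicConnector(InMemoryStringTopic())
topic.register_listener(received.append)
topic.send({"eventType": "NEW_ASSET", "assetName": "week1.csv"})
```

Swapping the event bus in this model only requires a different class in place of `InMemoryStringTopic`; the typed connector is unchanged.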

Open Metadata Topic Connector

Egeria provides a single implementation of an open metadata topic connector for Apache Kafka that it uses by default.

It is configured in the Egeria OMAG Servers through the Event Bus Configuration.