Building Integration Connectors

The integration connectors support the exchange of metadata with third party technologies. This exchange may be inbound and/or outbound; synchronous, polling or event-driven.

An integration connector runs in an Open Metadata Integration Service (OMIS), which is in turn hosted in an Integration Daemon server. Each integration service provides a specialist interface designed to aid the integration with a specific type of technology. The integration connector implementation is therefore dependent on a specific OMIS.

Deployed Integration Connector

An integration connector is shown deployed in an integration service running in an integration daemon. The connector is linking to a third party technology and also calling the open metadata APIs of Egeria to manage the exchange of metadata.

The purpose of the integration daemon and its integration services is to minimise the effort required to integrate a third party technology into the open metadata ecosystem. They handle:

  • Management of configuration - including user security information.
  • Starting and stopping of your integration logic.
  • Thread management and polling.
  • Access to the open metadata repositories for query and maintenance of open metadata.
  • Ability to write to audit log and maintain measurements for performance metrics.
  • Metadata provenance.

This means you can focus on interacting with the third party technology and mapping its metadata to open metadata in your integration connector.

Integration connector interface

An integration connector can:

  • Listen on a blocking call, waiting for the third party technology to send a notification.
  • Register with an external notification service that sends notifications on its own thread.
  • Register a listener with its context to act on notifications from the partner OMAS's Out Topic.
  • Poll the third party technology each time that the integration daemon calls your integration connector's refresh() method.
  • Issue queries and maintenance (create, update, delete) requests to the open metadata repositories.

Access to open metadata is provided via a context object. Each Open Metadata Integration Service (OMIS) provides a context object that is specialized for a particular category of third party technology, so that your integration connector has the most appropriate interface to open metadata. This typically includes:

  • The ability to register a listener to receive events from the OMAS's Out Topic, or send events to the OMAS's In Topic.
  • The ability to create and update metadata instances.
  • For assets, the ability to change an asset's visibility by changing its zone membership using the publish and withdraw methods.
  • The ability to delete metadata.
  • Various retrieval methods to help when comparing the metadata in the open metadata repositories with the metadata in the third party technology.

Each integration service defines the base class that an integration connector must extend if it is to run under that service. The base classes differ only in the type of context object that they support. Select the integration service, and hence the base class, to use for your integration connector from the table below.

| Integration Service | Type of technology supported | Integration connector base class |
|---|---|---|
| Analytics Integrator OMIS | Data Assets and Glossary Terms for analytics tools | AnalyticsIntegratorConnector class |
| API Integrator OMIS | API Schemas | APIIntegratorConnector class |
| Catalog Integrator OMIS | Assets and related metadata found in an Asset Catalog | CatalogIntegratorConnector class |
| Database Integrator OMIS | Databases and their schema | DatabaseIntegratorConnector class |
| Display Integrator OMIS | Forms, reports and the queries they depend on | DisplayIntegratorConnector class |
| Files Integrator OMIS | Files and their internal structure | FilesIntegratorConnector class |
| Infrastructure Integrator OMIS | IT infrastructure landscape such as hosts, platforms and servers | InfrastructureIntegratorConnector class |
| Lineage Integrator OMIS | Processes and their execution flow | LineageIntegratorConnector class |
| Organization Integrator OMIS | People, teams, roles and user identities | OrganizationIntegratorConnector class |
| Search Integrator OMIS | Content for search indexes relating to assets | SearchIntegratorConnector class |
| Security Integrator OMIS | Publishing information about users and resources | SecurityIntegratorConnector class |
| Topic Integrator OMIS | Event topics and the structure of the events they share | TopicIntegratorConnector class |

The context object is a wrapper around the client of an Open Metadata Access Service (OMAS). The OMAS supplies the properties and event structures for the API.

OMIS OMAS Pair

Therefore you need to add dependencies for your selected OMIS's API module and the API module of its partner OMAS. This is shown in the table below:

| Integration Service | Partner OMAS | Dependencies |
|---|---|---|
| Analytics Integrator OMIS | Analytics Modeling OMAS | analytics-integrator-api, analytics-modeling-api |
| API Integrator OMIS | Data Manager OMAS | api-integrator-api, data-manager-api |
| Catalog Integrator OMIS | Asset Manager OMAS | catalog-integrator-api, asset-manager-api |
| Database Integrator OMIS | Data Manager OMAS | database-integrator-api, data-manager-api |
| Display Integrator OMIS | Data Manager OMAS | display-integrator-api, data-manager-api |
| Files Integrator OMIS | Data Manager OMAS | files-integrator-api, data-manager-api |
| Infrastructure Integrator OMIS | IT Infrastructure OMAS | infrastructure-integrator-api, it-infrastructure-api |
| Lineage Integrator OMIS | Asset Manager OMAS | lineage-integrator-api, asset-manager-api |
| Organization Integrator OMIS | Community Profile OMAS | organization-integrator-api, community-profile-api |
| Search Integrator OMIS | Asset Catalog OMAS | search-integrator-api, asset-catalog-api |
| Security Integrator OMIS | Security Manager OMAS | security-integrator-api, security-manager-api |
| Topic Integrator OMIS | Data Manager OMAS | topic-integrator-api, data-manager-api |

These dependencies are in addition to the standard dependencies for an integration connector:

Example of the Maven dependencies for an integration connector ...
        <dependency>
            <groupId>org.odpi.egeria</groupId>
            <artifactId>topic-integrator-api</artifactId>
            <scope>provided</scope>
            <version>${open-metadata.version}</version>
        </dependency>

        <dependency>
            <groupId>org.odpi.egeria</groupId>
            <artifactId>data-manager-api</artifactId>
            <scope>provided</scope>
            <version>${open-metadata.version}</version>
        </dependency>

        <dependency>
            <groupId>org.odpi.egeria</groupId>
            <artifactId>audit-log-framework</artifactId>
            <scope>provided</scope>
            <version>${open-metadata.version}</version>
        </dependency>

        <dependency>
            <groupId>org.odpi.egeria</groupId>
            <artifactId>open-connector-framework</artifactId>
            <scope>provided</scope>
            <version>${open-metadata.version}</version>
        </dependency>

        <dependency>
            <groupId>org.odpi.egeria</groupId>
            <artifactId>repository-services-apis</artifactId>
            <scope>provided</scope>
            <version>${open-metadata.version}</version>
        </dependency>

        <dependency>
            <groupId>org.odpi.egeria</groupId>
            <artifactId>integration-daemon-services-api</artifactId>
            <scope>provided</scope>
            <version>${open-metadata.version}</version>
        </dependency>

Use provided scope ...

Notice the <scope>provided</scope> setting for the Egeria libraries. This prevents the Egeria libraries from being included in your connector jar file. By using the provided scope, your connector can run with any level of Egeria that supports this type of connector. Without it, duplicate Egeria classes would be loaded into your OMAG Server Platform and, if the platform was running at a different level, it is not certain which version of the classes would run. (It "may" be ok but experience, as we know, teaches us that "if it can go wrong it will go wrong", so avoiding problems is always preferable.)

You will also need to add the dependencies for the third party technology that your connector is calling.

All of the integration connector base classes inherit from (extend) the IntegrationConnectorBase class. This class defines the lifecycle methods of the integration connector.

Methods implemented by an integration connector

Methods implemented by an integration connector. The base class implements the initialize, setAuditLog, setConnectorName, and setContext methods. Your integration connector only needs to supply the start, refresh and disconnect methods. It implements the engage method only if it needs to issue a blocking call.

  • initialize is a standard method for all connectors that is called by the connector broker when a request is made to create an instance of the connector. The connector broker uses the initialize method to pass the connection object used to create the connector instance and a unique identifier for this instance of the connector. This method is provided by the integration connector's base class. Your code can access the connection properties via the connectionProperties variable and the connector's unique identifier via the connectorInstanceId variable.

  • setAuditLog provides an Audit Log Framework (ALF) compatible logging destination. This method is provided by the integration connector's base class. Your code can access the audit log via the auditLog variable.

  • setConnectorName provides the name of the connector from the configuration so it can be used for logging. This method is provided by the integration connector's base class. Your code can access your integration connector's name via the connectorName variable.

  • initializeEmbeddedConnectors saves the optional list of embedded connectors that were defined in the connection object for your integration connector when it was configured. These connectors are digital resource connectors for use by your integration connector to call the third party technology. This method is provided by the integration connector's base class. Your code can access the embedded connectors via the embeddedConnectors variable.

  • setContext sets up the integration service specific context object. This method is also provided by the integration connector's base class. Your code can access the context via the context variable. However, because this variable is set to null after the disconnect method (described below) is called, it is recommended that your connector uses the super.getContext() method to access the context, particularly if your connector operates in multiple threads.

  • start indicates that the connector is completely configured (that is all of the methods listed above have been called) and it can begin processing. This call is where the configuration properties are extracted from the connection object. It can also be used to register with non-blocking services. For example, it can register a listener for events from the OMAS Out Topic through the context.

  • engage is used when the connector needs to issue blocking calls to wait for new metadata. It is called from its own thread. It is recommended that the engage() method returns when each blocking call completes. The integration daemon will pause for a second and then call engage() again. This pattern enables the calling thread to detect the shutdown of its hosting integration daemon server. This method is implemented by the integration connector's base class to do nothing. You only need to override it if your integration connector is issuing blocking calls.

  • refresh requests that the connector compares the metadata in the third party technology with the metadata in the open metadata repositories. Refresh is called:

    1. when the integration connector first starts,
    2. at intervals defined in the connector's configuration, and
    3. whenever an external REST API call explicitly requests a refresh of the connector.
  • disconnect is called when the server is shutting down. The connector should free up any resources that it holds since it is not needed any more. Once disconnect has been called the context is no longer valid.
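
The engage() pattern described in the list above (make one blocking call, return, let the caller pause and check for shutdown) can be sketched as follows. This is only an illustration that runs without the Egeria libraries: SketchBlockingConnector and SketchEngageLoop are hypothetical stand-ins for your connector and for the integration daemon's thread, and the pause length is a parameter here rather than the daemon's own interval.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/* Hypothetical connector whose engage() makes exactly one bounded blocking call. */
class SketchBlockingConnector {
    private final BlockingQueue<String> notifications;
    final List<String> processed = new CopyOnWriteArrayList<>();

    SketchBlockingConnector(BlockingQueue<String> notifications) {
        this.notifications = notifications;
    }

    void engage() throws InterruptedException {
        // Wait for the third party technology, but give up after a short time
        // so that control returns to the caller regularly.
        String event = notifications.poll(50, TimeUnit.MILLISECONDS);
        if (event != null) {
            processed.add(event);
        }
    }
}

/* Stand-in for the daemon's dedicated thread: engage, pause, check for shutdown, repeat. */
class SketchEngageLoop {
    static Thread startLoop(SketchBlockingConnector connector, AtomicBoolean shutdown, long pauseMillis) {
        Thread thread = new Thread(() -> {
            try {
                while (!shutdown.get()) {
                    connector.engage();
                    Thread.sleep(pauseMillis);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        thread.start();
        return thread;
    }

    /* Small demonstration: returns the events processed before shutdown was requested. */
    static List<String> demo() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(List.of("created", "updated"));
        SketchBlockingConnector connector = new SketchBlockingConnector(queue);
        AtomicBoolean shutdown = new AtomicBoolean(false);
        Thread loop = startLoop(connector, shutdown, 5);
        try {
            long deadline = System.currentTimeMillis() + 5000;
            while (connector.processed.size() < 2 && System.currentTimeMillis() < deadline) {
                Thread.sleep(5);
            }
            shutdown.set(true);   // the loop notices this after the current engage() returns
            loop.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return List.copyOf(connector.processed);
    }
}
```

Because engage() never blocks indefinitely, the loop can observe the shutdown flag between calls, which is the reason the pattern asks each call to return promptly.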

Therefore you are looking to implement the start, refresh and disconnect methods in your integration connector, and optionally to override the engage method if your connector issues blocking calls.
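
As a rough sketch of that division of labour, the example below shows a connector that supplies only start, refresh and disconnect. It is self-contained so that it compiles without the Egeria libraries: SketchConnectorBase and SketchContext are hypothetical stand-ins for the real base class and context object, and the real method signatures (including their checked exceptions) are defined by the base class you select.

```java
import java.util.ArrayList;
import java.util.List;

/* Hypothetical stand-in for an integration service's context object. */
class SketchContext {
    final List<String> catalogued = new ArrayList<>();

    void createElement(String qualifiedName) {
        catalogued.add(qualifiedName);
    }
}

/* Hypothetical stand-in for the base class: it owns the context and provides
 * a do-nothing engage(), leaving start, refresh and disconnect to the subclass. */
abstract class SketchConnectorBase {
    private volatile SketchContext context;

    void setContext(SketchContext context) { this.context = context; }
    SketchContext getContext() { return context; }

    void start() { }                          // subclass extracts its configuration here
    abstract void refresh();                  // subclass compares and synchronizes metadata
    void engage() { }                         // only overridden for blocking calls
    void disconnect() { context = null; }     // the context is not valid after this point
}

/* Example connector that catalogues topic names from a third party technology. */
class SketchTopicConnector extends SketchConnectorBase {
    private final List<String> thirdPartyTopics;   // stands in for calls to the technology
    private boolean started = false;

    SketchTopicConnector(List<String> thirdPartyTopics) {
        this.thirdPartyTopics = thirdPartyTopics;
    }

    @Override
    void start() {
        started = true;
    }

    @Override
    void refresh() {
        // Always fetch the context through the getter: it is null after disconnect.
        SketchContext ctx = getContext();
        if (!started || ctx == null) {
            return;
        }
        for (String topic : thirdPartyTopics) {
            if (!ctx.catalogued.contains(topic)) {
                ctx.createElement(topic);
            }
        }
    }

    @Override
    void disconnect() {
        started = false;
        super.disconnect();
    }
}
```

Calling refresh repeatedly is harmless in this sketch because it only creates elements that are missing, which is the behaviour you want when refresh is driven by a timer.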

Designing your integration connector

There are four main design decisions to make before you start coding:

  • How is the work of the connector triggered - explicitly through the connection object contents, or by listening for events from either the third party technology or open metadata?
  • In which direction is the metadata synchronization going? Is the third party technology the source of metadata, or is open metadata?
  • How are elements from the third party technology correlated with the elements in open metadata?
  • If the third party technology is the source, should the metadata created in the open metadata ecosystem be read-only so that it cannot be changed by other tools? This is achieved using External source metadata provenance.

Three patterns for connections

Your integration connector is created and initialized with a connection object. This connection object should contain all of the configuration needed by your integration connector. For example, it may contain configuration properties that can control the behavior of your connector. When connecting to the third party technology, optional userId and password for the third party technology may be stored in the connection along with endpoint information that defines the network address of its deployment.
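
As an illustration of reading such configuration properties defensively, the sketch below treats them as a plain map of names to values. The class SketchConfigurationReader and the property name maxElementsPerRefresh are hypothetical; the way your connector actually obtains the properties depends on the connection object passed to it.

```java
import java.util.Map;

/* Hypothetical helper: read an integer configuration property, falling back to a default. */
class SketchConfigurationReader {
    static int intProperty(Map<String, Object> configurationProperties, String name, int defaultValue) {
        Object value = (configurationProperties == null) ? null : configurationProperties.get(name);
        if (value == null) {
            return defaultValue;          // property not supplied by the administrator
        }
        try {
            return Integer.parseInt(value.toString().trim());
        } catch (NumberFormatException e) {
            return defaultValue;          // supplied, but not a usable number
        }
    }
}
```

A connector might call intProperty(properties, "maxElementsPerRefresh", 100) from its start method so that a missing or malformed value never prevents it from starting.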

Connection object with an explicit endpoint

An explicit endpoint is added to the integration connector's connection in its configuration to provide information on the network location of the third party technology. This is used to initialize the client libraries needed to call the third party technology.

A connection with no endpoint

If no endpoint is configured in the integration connector's connection, the endpoint information can be retrieved from open metadata by calling the context object and/or listening for notifications from the partner OMAS.
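
The choice between these two patterns can be sketched as a simple fallback. This is a minimal illustration, assuming the configured address is available as a string and that the lookup in open metadata is wrapped in a supplier; SketchEndpointResolver and its parameters are hypothetical names, not part of any Egeria API.

```java
import java.util.Optional;
import java.util.function.Supplier;

/* Hypothetical helper: prefer the endpoint configured on the connection,
 * otherwise fall back to whatever a lookup in open metadata returns. */
class SketchEndpointResolver {
    static Optional<String> resolve(String configuredAddress,
                                    Supplier<Optional<String>> openMetadataLookup) {
        if (configuredAddress != null && !configuredAddress.trim().isEmpty()) {
            return Optional.of(configuredAddress.trim());
        }
        // No explicit endpoint: ask open metadata (for example through the context).
        return openMetadataLookup.get();
    }
}
```

The lookup is passed as a supplier so that open metadata is only queried when no explicit endpoint has been configured.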

An alternative approach to calling the third party technology directly in your integration connector is to use one or more appropriate digital resource connectors to call the third party technology. The connection objects for these digital resource connectors are nested in the connection object for the integration connector.

A virtual connection including embedded connections

A Virtual Connection is a special type of connection that allows connections for different connectors to be embedded. This style of connection can be used by an integration connector that is making use of digital resource connectors to call its third party technology. Typically there is only one embedded connection, but multiple embedded connections can be used. Also, the embedded connections themselves may be virtual connections.
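
The nesting described here can be pictured as a small tree. The sketch below models it with a hypothetical SketchConnection class (not Egeria's connection bean) and walks the tree to list the ordinary connections that would be used to create digital resource connectors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/* Hypothetical model: a connection is "virtual" when it embeds other connections. */
class SketchConnection {
    final String name;
    final List<SketchConnection> embeddedConnections;

    SketchConnection(String name, SketchConnection... embeddedConnections) {
        this.name = name;
        this.embeddedConnections = Arrays.asList(embeddedConnections);
    }

    boolean isVirtual() {
        return !embeddedConnections.isEmpty();
    }
}

class SketchConnectionWalker {
    /* Depth-first list of the non-virtual connections reachable from the given connection. */
    static List<String> resourceConnectionNames(SketchConnection connection) {
        List<String> names = new ArrayList<>();
        collect(connection, names);
        return names;
    }

    private static void collect(SketchConnection connection, List<String> names) {
        if (!connection.isVirtual()) {
            names.add(connection.name);
            return;
        }
        for (SketchConnection embedded : connection.embeddedConnections) {
            collect(embedded, names);   // embedded connections may themselves be virtual
        }
    }
}
```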

Metadata flow for your connector

The refresh method of your connector is called periodically to ensure the metadata in the third party technology is consistent with the metadata in the open metadata ecosystem. It operates in two phases:

  1. Retrieving metadata from the source and ensuring the equivalent metadata is present in the metadata destination.

  2. Retrieving metadata from the destination and deleting any elements that are not present in the source.
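
The two phases can be sketched with plain maps standing in for the metadata source and destination. This is an illustration only: SketchRefresh is a hypothetical name, each map entry represents one element (name to description), and a real connector would make calls to the third party technology and the context object instead of map operations.

```java
import java.util.Map;
import java.util.Objects;

/* Hypothetical two-phase synchronization between a source and a destination. */
class SketchRefresh {
    static void synchronize(Map<String, String> source, Map<String, String> destination) {
        // Phase 1: every element in the source is present, and up to date, in the destination.
        for (Map.Entry<String, String> element : source.entrySet()) {
            String existing = destination.get(element.getKey());
            if (!Objects.equals(element.getValue(), existing)) {
                destination.put(element.getKey(), element.getValue());
            }
        }
        // Phase 2: anything in the destination that is no longer in the source is deleted.
        destination.keySet().removeIf(name -> !source.containsKey(name));
    }
}
```

After synchronize returns, the destination holds exactly the elements of the source. In practice phase 2 would normally be limited to the elements that this connector is responsible for, so that metadata from other sources is left alone.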

Third party technology is the metadata source

When the third party technology is the metadata source (for example, it is a relational database or a file system) the refresh method ensures that the open metadata in Egeria is exactly the same as the metadata in the third party technology.

Third party technology is the metadata destination

When the open metadata ecosystem is the metadata source and the integration connector is responsible for distributing a subset of the open metadata to the third party technology, the refresh method ensures this subset (and no more) is present in the third party technology.

Mapping the third party technology to open metadata

Your integration connector needs to be able to map between the elements in the third party technology and in the open metadata ecosystem. Each uses its own unique identifiers, which you are unlikely to be able to control. Design the qualifiedName of the open metadata elements to be constructable from the identifier of the equivalent metadata element in the third party technology.
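
One possible way to make the qualifiedName constructable is sketched below. The format (type, deployment name and external identifier joined by "::") is purely an assumption for illustration, as is the SketchQualifiedNames class; choose whatever scheme is unique and stable for your technology.

```java
/* Hypothetical naming scheme: <type>::<deployment>::<identifier in the third party technology>. */
class SketchQualifiedNames {
    private static final String SEPARATOR = "::";

    static String toQualifiedName(String technologyType, String deployment, String externalId) {
        // Keeping ':' out of the first two parts means the name can be split unambiguously.
        if (technologyType.indexOf(':') >= 0 || deployment.indexOf(':') >= 0) {
            throw new IllegalArgumentException("Type and deployment names must not contain ':'");
        }
        return technologyType + SEPARATOR + deployment + SEPARATOR + externalId;
    }

    static String toExternalId(String qualifiedName) {
        // Limit of 3 keeps any separators inside the external identifier intact.
        String[] parts = qualifiedName.split(SEPARATOR, 3);
        if (parts.length != 3) {
            throw new IllegalArgumentException("Not a qualified name created by this connector: " + qualifiedName);
        }
        return parts[2];
    }
}
```

With a scheme like this, refresh can derive the expected qualifiedName from each third party element and look it up directly, and can recover the third party identifier from any element it retrieves from open metadata.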

What if there is not a one-to-one correspondence between elements?

The Catalog Integrator OMIS supports external identifiers which can help to correlate complex relationships between the third party technology and open metadata.

Controlling external source metadata provenance

The integration services allow you to control whether external source metadata provenance is enabled using a toggle switch. If it is set to true, external source metadata provenance is used, otherwise it is local cohort metadata provenance.

| Integration Service | Method to control external source metadata provenance |
|---|---|
| Analytics Integrator OMIS | Call setAnalyticsToolIsHome() method to set toggle. Default is true. |
| API Integrator OMIS | Call setAPIManagerIsHome() method to set toggle. Default is true. |
| Catalog Integrator OMIS | Use assetManagerIsHome property on method calls. |
| Database Integrator OMIS | External source metadata provenance is always enabled. |
| Display Integrator OMIS | Call setApplicationIsHome() method to set toggle. Default is true. |
| Files Integrator OMIS | Local cohort metadata provenance is always enabled. |
| Infrastructure Integrator OMIS | Call setInfrastructureManagerIsHome() method to set toggle. Default is true. |
| Lineage Integrator OMIS | Use assetManagerIsHome property on method calls. |
| Organization Integrator OMIS | Local cohort metadata provenance is always enabled. |
| Search Integrator OMIS | Not applicable - outbound only. |
| Security Integrator OMIS | Local cohort metadata provenance is always enabled. |
| Topic Integrator OMIS | Call setEventBrokerIsHome() method to set toggle. Default is true. |

Writing the connector provider

The purpose of the connector provider is to provide information on how to configure and initialize a particular connector. It is the factory class used to construct an instance of the connector at runtime using a connection object constructed as follows:

Connection object structure

The connection object contains properties needed by the connector to operate. It includes a connector type object that is used when constructing the connector and an endpoint object that defines where the corresponding digital resource is located.

The connector provider also returns the ConnectorType object for the connector. The connector type describes the capabilities of the connector such as:

  • the Java class of this connector provider. A connector provider is the factory for its Connector. It is typically called from the Connector Broker. The connector broker uses the connectorProviderClassName in the connector type to create an instance of the connector provider.

  • the configurationProperties that can be added to the connector's connection object to adapt its behavior. The administrator who is configuring the connector uses the recognizedConfigurationProperties from the connector type to determine the properties that can be set.

The connector type is included

If the connector provider implements

Return a new instance of the connector based on the properties in a supplied Connection object. The Connection object has all of the properties needed to create and configure the instance of the connector. This includes the connector type described above.
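
The factory role of a connector provider can be sketched as follows. Everything in this example is a hypothetical stand-in written so that it compiles without the Egeria libraries: SketchMonitorProvider, SketchMonitorConnector and SketchConnectorType are not Egeria classes, the connection is reduced to a map, and topicPrefix is an invented configuration property. A real provider would normally extend the provider base class supplied by Egeria rather than implement this logic itself.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/* Hypothetical, much reduced connector type. */
class SketchConnectorType {
    final String connectorProviderClassName;
    final List<String> recognizedConfigurationProperties;

    SketchConnectorType(String connectorProviderClassName, List<String> recognizedConfigurationProperties) {
        this.connectorProviderClassName = connectorProviderClassName;
        this.recognizedConfigurationProperties = recognizedConfigurationProperties;
    }
}

/* Hypothetical connector that remembers what it was initialized with. */
class SketchMonitorConnector {
    String connectorInstanceId;
    Map<String, Object> connectionProperties;

    void initialize(String connectorInstanceId, Map<String, Object> connectionProperties) {
        this.connectorInstanceId = connectorInstanceId;
        this.connectionProperties = connectionProperties;
    }
}

/* Hypothetical provider: describes the connector and acts as its factory. */
class SketchMonitorProvider {
    SketchConnectorType getConnectorType() {
        // "topicPrefix" is an invented property used only for this illustration.
        return new SketchConnectorType(getClass().getName(), Arrays.asList("topicPrefix"));
    }

    SketchMonitorConnector getConnector(Map<String, Object> connection) {
        // Each call yields a fresh instance with its own unique identifier.
        SketchMonitorConnector connector = new SketchMonitorConnector();
        connector.initialize(UUID.randomUUID().toString(), connection);
        return connector;
    }
}
```

The provider holds no state about the connectors it creates, so the same provider can be asked for any number of independent connector instances.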

Example: connector provider for the Kafka Monitor Integration Connector

For example, the KafkaMonitorIntegrationProvider is used to instantiate connectors that are monitoring an Apache Kafka broker. Therefore, its name and description refer to Kafka, and the connectors it instantiates are of type KafkaMonitorIntegrationConnector.

Writing the connector

Accessing configuration properties

Accessing endpoint

Accessing context

Registering a listener

Setting metadata provenance

Locating elements created for this third party technology

Testing your connector

Your integration connector implementation should be built and packaged in a jar file. This jar file contains your connector provider and connector implementation. It may optionally contain any client libraries for the third party technology that are called directly by your integration connector. This is necessary if these client libraries are not available in their own jar file.

The connector jar file (and any jar files for the dependent third party client libraries not included in your connector's jar file) needs to be added to the OMAG Server Platform's class path. The easiest way to do this is to copy the jar files into the lib directory of your OMAG Server Platform's install directory.

Once you have installed the connector, configure it in the integration daemon, connected to a metadata access store.

Figure 6

Your connector is then able to start and exchange metadata.

Figure 7

Further information