Building Integration Connectors¶
The integration connectors support the exchange of metadata with third party technologies. This exchange may be inbound and/or outbound; synchronous, polling or event-driven.
An integration connector runs in an Open Metadata Integration Service (OMIS), which is in turn hosted in an Integration Daemon server. Each integration service provides a specialist interface designed to aid the integration with a specific type of technology. The integration connector implementation is therefore dependent on a specific OMIS.
An integration connector is shown deployed in an integration service running in an integration daemon. The connector is linking to a third party technology and also calling the open metadata APIs of Egeria to manage the exchange of metadata.
The purpose of the integration daemon and its integration services is to minimise the effort required to integrate a third party technology into the open metadata ecosystem. They handle:
- Management of configuration - including user security information.
- Starting and stopping of your integration logic.
- Thread management and polling.
- Access to the open metadata repositories for query and maintenance of open metadata.
- Ability to write to the audit log and maintain measurements for performance metrics.
- Metadata provenance.
This means you can focus on interacting with the third party technology and mapping its metadata to open metadata in your integration connector.
Integration connector interface¶
An integration connector can:
- Listen on a blocking call, waiting for the third party technology to send a notification.
- Register with an external notification service that sends notifications on its own thread.
- Register a listener with its context to act on notifications from the partner OMAS's Out Topic.
- Poll the third party technology each time that the integration daemon calls your integration connector's `refresh()` method.
- Issue queries and maintenance (create, update, delete) requests to the open metadata repositories.
Access to open metadata is provided via a context object. Each Open Metadata Integration Service (OMIS) provides a context object that is specialized for a particular category of third party technology, so that your integration connector has the most suitable interface to open metadata. This typically includes:
- The ability to register a listener to receive events from the OMAS's Out Topic, or send events to the OMAS's In Topic.
- The ability to create and update metadata instances.
- For assets, the ability to change an asset's visibility by changing its zone membership using the `publish` and `withdraw` methods.
- The ability to delete metadata.
- Various retrieval methods to help when comparing the metadata in the open metadata repositories with the metadata in the third party technology.
Each integration service defines the base class that an integration connector must extend if it is to run under that service. The base classes differ only in the type of context object that they support. Select the integration service, and hence the base class, to use for your integration connector from the table below.
Integration Service | Type of technology supported | Link to integration connector base class |
---|---|---|
Analytics Integrator OMIS | Data Assets and Glossary Terms for analytics tools. | AnalyticsIntegratorConnector class. |
API Integrator OMIS | API Schemas | APIIntegratorConnector class. |
Catalog Integrator OMIS | Assets and related metadata found in an Asset Catalog | CatalogIntegratorConnector class. |
Database Integrator OMIS | Databases and their schema | DatabaseIntegratorConnector class. |
Display Integrator OMIS | Forms, reports and the queries they depend on | DisplayIntegratorConnector class. |
Files Integrator OMIS | Files and their internal structure | FilesIntegratorConnector class. |
Infrastructure Integrator OMIS | IT infrastructure landscape such as hosts, platforms and servers | InfrastructureIntegratorConnector class. |
Lineage Integrator OMIS | Processes and their execution flow | LineageIntegratorConnector class. |
Organization Integrator OMIS | People, teams, roles and user identities | OrganizationIntegratorConnector class. |
Search Integrator OMIS | Content for search indexes relating to assets. | SearchIntegratorConnector class. |
Security Integrator OMIS | Publishing information about users and resources. | SecurityIntegratorConnector class. |
Topic Integrator OMIS | Event topics and the structure of the events they share. | TopicIntegratorConnector class. |
The context object is a wrapper around the client of an Open Metadata Access Service (OMAS). The OMAS supplies the properties and event structures for the API.
Therefore, you need to add dependencies for your selected OMIS's API module and the API module of its partner OMAS. This is shown in the table below:
Integration Service | Partner OMAS | Dependencies |
---|---|---|
Analytics Integrator OMIS | Analytics Modeling OMAS | analytics-integrator-api, analytics-modeling-api |
API Integrator OMIS | Data Manager OMAS | api-integrator-api, data-manager-api |
Catalog Integrator OMIS | Asset Manager OMAS | catalog-integrator-api, asset-manager-api |
Database Integrator OMIS | Data Manager OMAS | database-integrator-api, data-manager-api |
Display Integrator OMIS | Data Manager OMAS | display-integrator-api, data-manager-api |
Files Integrator OMIS | Data Manager OMAS | files-integrator-api, data-manager-api |
Infrastructure Integrator OMIS | IT Infrastructure OMAS | infrastructure-integrator-api, it-infrastructure-api |
Lineage Integrator OMIS | Asset Manager OMAS | lineage-integrator-api, asset-manager-api |
Organization Integrator OMIS | Community Profile OMAS | organization-integrator-api, community-profile-api |
Search Integrator OMIS | Asset Catalog OMAS | search-integrator-api, asset-catalog-api |
Security Integrator OMIS | Security Manager OMAS | security-integrator-api, security-manager-api |
Topic Integrator OMIS | Data Manager OMAS | topic-integrator-api, data-manager-api |
These dependencies are in addition to the standard dependencies for an integration connector:
- Audit log framework - for logging audit log messages.
- Open Connector Framework - basic connector interfaces.
- Integration Daemon API - for the integration connector base classes.
- Repository Services APIs - for audit log message severities.
Example of the Maven dependencies for an integration connector ...
<dependency>
    <groupId>org.odpi.egeria</groupId>
    <artifactId>topic-integrator-api</artifactId>
    <scope>provided</scope>
    <version>${open-metadata.version}</version>
</dependency>
<dependency>
    <groupId>org.odpi.egeria</groupId>
    <artifactId>data-manager-api</artifactId>
    <scope>provided</scope>
    <version>${open-metadata.version}</version>
</dependency>
<dependency>
    <groupId>org.odpi.egeria</groupId>
    <artifactId>audit-log-framework</artifactId>
    <scope>provided</scope>
    <version>${open-metadata.version}</version>
</dependency>
<dependency>
    <groupId>org.odpi.egeria</groupId>
    <artifactId>open-connector-framework</artifactId>
    <scope>provided</scope>
    <version>${open-metadata.version}</version>
</dependency>
<dependency>
    <groupId>org.odpi.egeria</groupId>
    <artifactId>repository-services-apis</artifactId>
    <scope>provided</scope>
    <version>${open-metadata.version}</version>
</dependency>
<dependency>
    <groupId>org.odpi.egeria</groupId>
    <artifactId>integration-daemon-services-api</artifactId>
    <scope>provided</scope>
    <version>${open-metadata.version}</version>
</dependency>
Use provided scope ...
Notice the `<scope>provided</scope>` setting for the Egeria libraries. This prevents the Egeria libraries from being included in your connector jar file. By using the provided scope, your connector can run with any level of Egeria that supports this type of connector. Without it, duplicate Egeria classes would be loaded into your OMAG Server Platform and, if the platform was running at a different level, it would not be certain which version of the classes would run. (It may be OK, but experience teaches us that "if it can go wrong it will go wrong", so avoiding problems is always preferable.)
You will also need to add the dependencies for the third party technology that your connector is calling.
All of the integration connector base classes inherit from (extend) the `IntegrationConnectorBase` class. This class defines the lifecycle methods of the integration connector.
Methods implemented by an integration connector. The base class implements the initialize, setAuditLog, setConnectorName, and setContext methods. Your integration connector only needs to supply the start, refresh and disconnect methods. It implements the engage method only if it needs to issue a blocking call.
- `initialize` is a standard method for all connectors that is called by the connector broker when a request is made to create an instance of the connector. The connector broker uses the initialize method to pass the connection object used to create the connector instance and a unique identifier for this instance of the connector. This method is provided by the integration connector's base class. Your code can access the connection properties via the `connectionProperties` variable and the connector's unique identifier via the `connectorInstanceId` variable.
- `setAuditLog` provides an Audit Log Framework (ALF) compatible logging destination. This method is provided by the integration connector's base class. Your code can access the audit log via the `auditLog` variable.
- `setConnectorName` provides the name of the connector from the configuration so it can be used for logging. This method is provided by the integration connector's base class. Your code can access your integration connector's name via the `connectorName` variable.
- `initializeEmbeddedConnectors` saves the optional list of embedded connectors that were defined in the connection object for your integration connector when it was configured. These connectors are digital resource connectors for use by your integration connector to call the third party technology. This method is provided by the integration connector's base class. Your code can access the embedded connectors via the `embeddedConnectors` variable.
- `setContext` sets up the integration service specific context object. This method is also provided by the integration connector's base class. Your code can access the context via the `context` variable. However, because this variable is set to null after the `disconnect` method (described below) is called, it is recommended that your connector uses the `super.getContext()` method to access the context, particularly if your connector operates in multiple threads.
- `start` indicates that the connector is completely configured (that is, all of the methods listed above have been called) and it can begin processing. This call is where the configuration properties are extracted from the connection object. It can also be used to register with non-blocking services. For example, it can register a listener for events from the OMAS Out Topic through the context.
- `engage` is used when the connector needs to issue blocking calls to wait for new metadata. It is called from its own thread. It is recommended that the `engage()` method returns when each blocking call completes. The integration daemon will pause a second and then call `engage()` again. This pattern enables the calling thread to detect the shutdown of its hosting integration daemon server. This method is implemented by the integration connector's base class to do nothing. You only need to override it if your integration connector is issuing blocking calls.
- `refresh` requests that the connector compares the metadata in the third party technology with the metadata in the open metadata repositories. Refresh is called:
  - when the integration connector first starts,
  - at intervals defined in the connector's configuration, and
  - whenever an external REST API call explicitly requests a refresh of the connector.
- `disconnect` is called when the server is shutting down. The connector should free up any resources that it holds since it is not needed any more. Once disconnect has been called, the context is no longer valid.
Therefore, you need to implement the `start`, `refresh` and `disconnect` methods in your integration connector, and optionally override the `engage` method if your connector issues blocking calls.
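The shape of such a connector can be sketched as follows. This is a minimal, self-contained illustration rather than Egeria code: `SketchConnectorBase` is a hypothetical stand-in for the base class of your chosen integration service, and `ThirdPartyClient` is a hypothetical client for the third party technology. Only the method names `start`, `refresh`, `engage` and `disconnect` come from the description above. The real base classes also supply the context, audit log and connection properties, which are omitted here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an integration service base class; NOT the Egeria API. */
abstract class SketchConnectorBase {
    public void start() { /* the real base class does its own set up here */ }
    public abstract void refresh();
    public void engage() { /* does nothing by default, as described above */ }
    public void disconnect() { /* the real base class releases the context here */ }
}

/** Hypothetical client for the third party technology. */
interface ThirdPartyClient {
    List<String> listTopicNames();
    void close();
}

/** Sketch of a polling connector: it supplies start, refresh and disconnect only. */
class SketchTopicConnector extends SketchConnectorBase {
    private final ThirdPartyClient client;
    final List<String> seenTopics = new ArrayList<>();
    private boolean started = false;

    SketchTopicConnector(ThirdPartyClient client) {
        this.client = client;
    }

    @Override
    public void start() {
        super.start();          // let the base class finish its set up first
        started = true;         // configuration properties would be read here
    }

    @Override
    public void refresh() {
        if (!started) {
            throw new IllegalStateException("refresh called before start");
        }
        // Poll the third party technology; a real connector would now compare
        // these names with open metadata through its context object.
        seenTopics.clear();
        seenTopics.addAll(client.listTopicNames());
    }

    @Override
    public void disconnect() {
        client.close();         // free resources held for the third party technology
        started = false;
        super.disconnect();     // the base class clean up runs last
    }
}
```

Calling `super.start()` first and `super.disconnect()` last is a design choice in this sketch, so that the connector's own resources exist only while the base class is active.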
Designing your integration connector¶
There are four main design decisions to make before you start coding:
- How is the work of the connector triggered - explicitly through the connection object contents or by listening for events from either the third party technology or open metadata?
- In which direction does the metadata synchronization flow? Is the third party technology the source of metadata, or is the open metadata ecosystem?
- How are elements from the third party technology correlated with the elements in open metadata?
- If the third party technology is the source, should the metadata created in the open metadata ecosystem be read-only so that it cannot be changed by other tools? This is achieved using External source metadata provenance.
Three patterns for connections¶
Your integration connector is created and initialized with a connection object. This connection object should contain all of the configuration needed by your integration connector. For example, it may contain configuration properties that can control the behavior of your connector. When connecting to the third party technology, optional userId and password for the third party technology may be stored in the connection along with endpoint information that defines the network address of its deployment.
An explicit endpoint is added to the integration connector's connection in its configuration to provide information on the network location of the third party technology. This is used to initialize the client libraries needed to call the third party technology.
If no endpoint is configured in the integration connector's connection, the endpoint information can be retrieved from open metadata by calling the context object and/or listening for notifications from the partner OMAS.
An alternative approach to calling the third party technology directly in your integration connector is to use one or more appropriate digital resource connectors to call the third party technology. The connection objects for these digital resource connectors are nested in the connection object for the integration connector.
A Virtual Connection is a special type of connection that allows connections for different connectors to be embedded. This style of connection can be used by an integration connector that is making use of digital resource connectors to call its third party technology. Typically there is only one embedded connection, but multiple embedded connections can be used. Also, the embedded connections themselves may be virtual connections.
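The nesting described above can be pictured with a small data structure. The sketch below is an assumption made for illustration: `SketchConnection` is a hypothetical, simplified class and not the Egeria connection bean. It shows how an integration connector could walk a virtual connection, including embedded connections that are themselves virtual, to find the endpoints of the digital resource connectors.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified connection; NOT the Egeria connection bean. */
final class SketchConnection {
    final String displayName;
    final String endpointAddress;           // null when no explicit endpoint is configured
    final List<SketchConnection> embedded;  // non-empty only for a virtual connection

    SketchConnection(String displayName, String endpointAddress, List<SketchConnection> embedded) {
        this.displayName = displayName;
        this.endpointAddress = endpointAddress;
        this.embedded = List.copyOf(embedded);
    }

    boolean isVirtual() {
        return !embedded.isEmpty();
    }

    /** Collects the endpoint address of every non-virtual connection nested here, depth first. */
    List<String> leafEndpoints() {
        List<String> result = new ArrayList<>();
        collect(this, result);
        return result;
    }

    private static void collect(SketchConnection connection, List<String> result) {
        if (!connection.isVirtual()) {
            if (connection.endpointAddress != null) {
                result.add(connection.endpointAddress);
            }
            return;
        }
        for (SketchConnection child : connection.embedded) {
            collect(child, result);   // embedded connections may be virtual too
        }
    }
}
```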
Metadata flow for your connector¶
The `refresh` method of your connector is called periodically to ensure the metadata in the third party technology is consistent with the metadata in the open metadata ecosystem. It operates in two phases:
- Retrieving metadata from the source and ensuring the equivalent metadata is present in the metadata destination.
- Retrieving metadata from the destination and deleting any elements that are not present in the source.
When the third party technology is the metadata source (for example, it is a relational database or a file system) the refresh method ensures that the open metadata in Egeria is exactly the same as the metadata in the third party technology.
When the open metadata ecosystem is the metadata source and the integration connector is responsible for distributing a subset of the open metadata to the third party technology, the refresh method ensures this subset (and no more) is present in the third party technology.
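The two phases of a refresh can be sketched with plain Java maps. This is an illustration under simplifying assumptions, not the Egeria API: the hypothetical `TwoPhaseRefresh` class treats both the source and the destination as in-memory maps keyed by a shared identifier, whereas a real connector would read and write the open metadata side through its context object.

```java
import java.util.Map;
import java.util.Objects;

/** Hypothetical helper illustrating the two refresh phases with in-memory maps. */
final class TwoPhaseRefresh {
    private TwoPhaseRefresh() {
    }

    /**
     * Makes the destination match the source. Keys are identifiers shared by
     * both sides; values stand in for the metadata properties of each element.
     * The destination map must be modifiable.
     */
    static void synchronize(Map<String, String> source, Map<String, String> destination) {
        // Phase 1: ensure every source element is present and current in the destination.
        for (Map.Entry<String, String> element : source.entrySet()) {
            String existing = destination.get(element.getKey());
            if (!destination.containsKey(element.getKey())
                    || !Objects.equals(existing, element.getValue())) {
                destination.put(element.getKey(), element.getValue());
            }
        }
        // Phase 2: delete destination elements that are no longer in the source.
        destination.keySet().removeIf(id -> !source.containsKey(id));
    }
}
```

The same method covers both directions described above: pass the third party technology's metadata as the source when it is the origin, or the selected subset of open metadata as the source when the connector is distributing metadata outwards.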
Mapping the third party technology to open metadata¶
Your integration connector needs to be able to map between the elements in the third party technology and the elements in the open metadata ecosystem. Each side uses its own unique identifiers, which you are unlikely to be able to control. Design the `qualifiedName` of the open metadata elements so that it can be constructed from the identifier of the equivalent metadata element in the third party technology.
What if there is not a one-to-one correspondence between elements?
The Catalog Integrator OMIS supports external identifiers which can help to correlate complex relationships between the third party technology and open metadata.
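For the simple one-to-one case, a constructable `qualifiedName` might look like the sketch below. The `KafkaTopic:` prefix, the colon-separated layout and the `QualifiedNames` helper are assumptions chosen for illustration, not an Egeria naming standard. The point is that the name can be built from the third party identifier and mapped back again without a lookup table.

```java
import java.util.Optional;

/** Hypothetical naming helper; the format is an assumption, not an Egeria convention. */
final class QualifiedNames {
    private static final String PREFIX = "KafkaTopic:";

    private QualifiedNames() {
    }

    /** Builds the qualified name for a topic from identifiers known to the third party technology. */
    static String forTopic(String brokerAddress, String topicName) {
        return PREFIX + brokerAddress + ":" + topicName;
    }

    /** Recovers the topic name if the qualified name was built by forTopic for this broker. */
    static Optional<String> topicNameFrom(String qualifiedName, String brokerAddress) {
        String expectedStart = PREFIX + brokerAddress + ":";
        if (qualifiedName != null && qualifiedName.startsWith(expectedStart)) {
            return Optional.of(qualifiedName.substring(expectedStart.length()));
        }
        return Optional.empty();   // element belongs to another broker or another tool
    }
}
```

Including the broker address in the name keeps elements from two deployments of the same technology distinct, and lets the connector recognize which open metadata elements correspond to its own third party technology.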
Controlling external source metadata provenance¶
The integration services allow you to control whether external source metadata provenance is enabled using a toggle switch. If it is set to true, external source metadata provenance is used; otherwise, local cohort metadata provenance is used.
Integration Service | Method to control external source metadata provenance |
---|---|
Analytics Integrator OMIS | Call `setAnalyticsToolIsHome()` method to set toggle. Default is `true`. |
API Integrator OMIS | Call `setAPIManagerIsHome()` method to set toggle. Default is `true`. |
Catalog Integrator OMIS | Use `assetManagerIsHome` property on method calls. |
Database Integrator OMIS | External source metadata provenance is always enabled. |
Display Integrator OMIS | Call `setApplicationIsHome()` method to set toggle. Default is `true`. |
Files Integrator OMIS | Local cohort metadata provenance is always enabled. |
Infrastructure Integrator OMIS | Call `setInfrastructureManagerIsHome()` method to set toggle. Default is `true`. |
Lineage Integrator OMIS | Use `assetManagerIsHome` property on method calls. |
Organization Integrator OMIS | Local cohort metadata provenance is always enabled. |
Search Integrator OMIS | Not applicable - outbound only. |
Security Integrator OMIS | Local cohort metadata provenance is always enabled. |
Topic Integrator OMIS | Call `setEventBrokerIsHome()` method to set toggle. Default is `true`. |
Writing the connector provider¶
The purpose of the connector provider is to provide information on how to configure and initialize a particular connector. It is the factory class used to construct an instance of the connector at runtime using a connection object constructed as follows:
The connection object contains properties needed by the connector to operate. It includes a connector type object that is used when constructing the connector and an endpoint object that defines where the corresponding digital resource is located.
The connector provider also returns the ConnectorType object for the connector. The connector type describes the capabilities of the connector such as:
- the Java class of this connector provider. A connector provider is the factory for its connector. It is typically called from the Connector Broker. The connector broker uses the `connectorProviderClassName` in the connector type to create an instance of the connector provider.
- the `configurationProperties` that can be added to the connector's connection object to adapt its behavior. The administrator who is configuring the connector uses the `recognizedConfigurationProperties` from the connector type to determine which properties can be set.
The connector type is included
If the connector provider implements
The connector provider returns a new instance of the connector based on the properties in a supplied Connection object. The Connection object has all of the properties needed to create and configure the instance of the connector. This includes the connector type described above.
Example: connector provider for the Kafka Monitor Integration Connector
For example, the `KafkaMonitorIntegrationProvider` is used to instantiate connectors that are monitoring an Apache Kafka broker. Therefore, its name and description refer to Kafka, and the connectors it instantiates are of type `KafkaMonitorIntegrationConnector`.
Writing the connector¶
Accessing configuration properties¶
Accessing endpoint¶
Accessing context¶
Registering a listener¶
Setting metadata provenance¶
Locating elements created for this third party technology¶
Testing your connector¶
Your integration connector implementation should be built and packaged in a jar file. This jar file contains your connector provider and connector implementation. It may optionally contain any dependent client libraries for the third party technology that are called directly by your integration connector. This is necessary if these client libraries are not available in their own jar file.
The connector jar file (and any jar files for the dependent third party client libraries not included in your connector's jar file) needs to be added to the OMAG Server Platform class path. The easiest way to do this is to copy the jar files into the `lib` directory of your OMAG Server Platform's install directory.
Once you have installed the connector, configure it in the integration daemon, connected to a metadata access store.
Your connector is then able to start and exchange metadata.
Further information
- Open Connector Framework (OCF) that defines the behavior of all connectors.
- Configuring an integration daemon to understand how to set up an integration connector.
- Developer guide for more information on writing connectors.