Open Discovery Service

An open discovery service is a component that performs specific analysis of the contents of a digital resource on request. The aim of the open discovery service is to enable a detailed picture of the properties of a resource to be built up.

Each time an open discovery service runs, it creates a new discovery analysis report that records the results of the analysis. This report is linked off of the digital resource's Asset metadata element.

Asset with discovery analysis reports

Each time an open discovery service runs to analyse a digital resource, a new discovery analysis report is created and attached to the resource's asset. If the open discovery service is run regularly, it is possible to track how the contents are changing over time.

The discovery analysis report contains one or more sets of related properties that the discovery service has discovered about the resource, its metadata, structure and/or content. These are stored in a set of discovery annotations linked off of the discovery analysis report.
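
The relationship between an asset, its discovery analysis reports and their annotations can be pictured with a small data model. The following is a minimal sketch only: the class names, field names, the `row-counter` service and the `customer-data.csv` asset are assumptions made for illustration and are not Egeria's open metadata types or API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Annotation:
    """One set of related properties discovered about the resource (hypothetical)."""
    annotation_type: str
    properties: dict


@dataclass
class DiscoveryAnalysisReport:
    """Results of a single run of a discovery service (hypothetical)."""
    service_name: str
    created: datetime
    annotations: list = field(default_factory=list)


@dataclass
class Asset:
    """Metadata element describing a digital resource (hypothetical)."""
    name: str
    reports: list = field(default_factory=list)

    def new_report(self, service_name, created=None):
        # Every run gets its own report; earlier reports are never overwritten.
        report = DiscoveryAnalysisReport(
            service_name, created or datetime.now(timezone.utc)
        )
        self.reports.append(report)
        return report


# Two runs of the same (made-up) service on different days.
asset = Asset("customer-data.csv")
for day, rows in [(1, 1000), (2, 1250)]:
    report = asset.new_report(
        "row-counter", datetime(2024, 1, day, tzinfo=timezone.utc)
    )
    report.annotations.append(Annotation("row-count", {"rows": rows}))

# Reading one property across the reports shows how the resource is changing.
history = [r.annotations[0].properties["rows"] for r in asset.reports]
```

Because each run appends a new report rather than updating the previous one, the list of reports on the asset doubles as a history of the resource.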

An open discovery service is designed to run at regular intervals to gather a detailed perspective on the contents of the digital resource and how they are changing over time. Each time it runs, it is given access to the results of previously run open discovery services, along with a review of these findings made by individuals responsible for the digital resource (such as stewards, owners, custodians).

Operation of an open discovery service

  1. Each time an open discovery service runs, Egeria creates a discovery analysis report to describe the status and results of the open discovery service's execution. The open discovery service is passed a discovery context that provides access to metadata.
  2. The discovery context is able to supply metadata about the asset and create a connector to the digital resource using the connection information linked to the asset. The discovery service uses the connector to access the digital resource's contents in order to perform the analysis.
  3. The discovery service creates discovery annotations to record the results of its analysis. It adds them to the discovery context which stores them in open metadata attached to the discovery analysis report.
  4. The discovery annotations can be reviewed and commented on through an external stewardship process. For example, where a discovery service proposes a list of potential options, an individual expert can verify them and select the best one. The resulting choices are added to annotation reviews attached to the appropriate annotations.
  5. The next time the open discovery service runs, a new discovery analysis report is created, and the annotations from this run are attached to it.
  6. The discovery context provides access to the existing attachments for that asset along with any annotation reviews. The discovery service is able to link its new annotations to the existing annotations as an annotation extension. This means that the stewards can see the history associated with the new information.
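
The six steps above can be sketched as a toy model. This is an illustration under stated assumptions, not the Open Discovery Framework interfaces: `DiscoveryContext`, `run_discovery_service`, `column_type_service` and every field name are hypothetical, and the "connector" is simply a list of strings standing in for the digital resource.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    annotation_type: str
    properties: dict
    reviews: list = field(default_factory=list)     # choices recorded by stewards
    extensions: list = field(default_factory=list)  # later annotations that extend this one


@dataclass
class Report:
    annotations: list = field(default_factory=list)


@dataclass
class Asset:
    name: str
    contents: list  # stands in for the digital resource
    reports: list = field(default_factory=list)


class DiscoveryContext:
    """Hypothetical context handed to a discovery service for one run."""

    def __init__(self, asset, report):
        self._asset = asset
        self._report = report

    def get_connector(self):
        # Step 2: in this toy model the connector is just the content list.
        return self._asset.contents

    def existing_annotations(self):
        # Step 6: annotations from earlier reports, with any reviews attached.
        return [a for r in self._asset.reports
                if r is not self._report for a in r.annotations]

    def add_annotation(self, annotation, extends=None):
        # Step 3: store the result in the report for this run.
        self._report.annotations.append(annotation)
        if extends is not None:
            extends.extensions.append(annotation)


def run_discovery_service(asset, service):
    # Steps 1 and 5: every run gets a fresh report attached to the asset.
    report = Report()
    asset.reports.append(report)
    service(DiscoveryContext(asset, report))
    return report


def column_type_service(context):
    values = context.get_connector()
    previous = [a for a in context.existing_annotations()
                if a.annotation_type == "candidate-types"]
    if previous and previous[-1].reviews:
        # A steward has already chosen, so keep that choice.
        candidates = [previous[-1].reviews[-1]]
    elif all(v.isdigit() for v in values):
        candidates = ["integer", "string"]
    else:
        candidates = ["string"]
    context.add_annotation(
        Annotation("candidate-types", {"candidates": candidates}),
        extends=previous[-1] if previous else None,
    )


asset = Asset("postcode-column", ["90210", "10001"])
first = run_discovery_service(asset, column_type_service)
first.annotations[0].reviews.append("string")  # step 4: the steward picks one option
second = run_discovery_service(asset, column_type_service)
```

In this sketch the second run narrows its candidates to the steward's choice and records its annotation as an extension of the first, so the earlier proposal and its review stay visible as history.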

Runtime for an open discovery service

Open discovery services are packaged into Open Discovery Engines that run in the Asset Analysis OMES hosted in an Engine Host.

The metadata repository interface for metadata discovery tools is implemented by the Discovery Engine OMAS that runs in a Metadata Access Server.

An open discovery service may be triggered by a REST call to the Asset Analysis OMES, via a Governance Action or as part of a governance action process.

Open Discovery Pipeline

Many common functions are used repeatedly during the discovery process.

An open discovery pipeline is a specialized implementation of an open discovery service that runs a set of open discovery services against a single digital resource. The implementation of the open discovery pipeline determines the order in which these open discovery services are run.

Open discovery pipeline example

Each open discovery service in the pipeline is able to access the results of the open discovery services that have run before it through the discovery context. The combined results of the open discovery pipeline are grouped into a single discovery analysis report linked off of the asset.
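
A pipeline of this kind can be sketched as a function that runs each service in turn against one shared context. This is a minimal illustration, assuming a hypothetical `DiscoveryContext` and two made-up services; it is not the Open Discovery Framework's pipeline class.

```python
class DiscoveryContext:
    """Hypothetical context shared by every service in one pipeline run."""

    def __init__(self, resource):
        self.resource = resource
        self.annotations = []  # the contents of the single combined report


def count_rows(context):
    context.annotations.append(("row-count", len(context.resource)))


def average_row_length(context):
    # Reuses the result of the service that ran before it.
    rows = dict(context.annotations)["row-count"]
    total = sum(len(row) for row in context.resource)
    context.annotations.append(
        ("average-row-length", total / rows if rows else 0.0)
    )


def run_pipeline(resource, services):
    # The pipeline implementation decides the order: here, list order.
    context = DiscoveryContext(resource)
    for service in services:
        service(context)
    return context.annotations


report = run_pipeline(["ab", "abcd"], [count_rows, average_row_length])
```

Here `average_row_length` depends on the `row-count` annotation, so it must be listed after `count_rows`. A real pipeline could choose the order in other ways, since that decision belongs to the pipeline implementation.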

The aim of the open discovery pipeline is to enable reusable open discovery service implementations to be choreographed together for different types of digital resource.