Latest Release
Release 5.0 (July 2024)¶
Release 5.0 is a major functional release focused on usability, both from the point of view of the first-time user of Egeria, and developers of the code. It has six themes:
- Simplified up and running - improvements to the OMAG Server Platform and administration services to make it easier to set up and run, plus new sample server configurations that mean you can get Egeria up and running without needing to understand the administration services.
- A python client - pyegeria provides a python API covering Egeria's administration and operational services plus many of the Open Metadata View Services (OMVSs) provides access to Egeria through this popular programming language for administrators and data professionals.
- Content pack of connector metadata - metadata to pre-configure the connectors supplied with Egeria, and provide valid value sets describing technology types and file types.
- Digital Product Support - new APIs for creating a digital product catalog.
- Survey Support - new framework, services and report utility to support the surveying of digital resources to understand its contents before embarking on wholesale cataloguing and governance.
- Reduction of technical debt - there has been internal code changes to remove deprecated code, remove the duplication of type information, and many bug fixes :).
Open Metadata Types
- The Anchors classification is able to store the type of the anchor as well as its GUID.
- A new classification called RootCollection can be added to a collection entity to indicate that it is the root of collection hierarchy.
- The Collection entity has a new attribute called collectionType that can be used to identify the concept that the collection represents.
- A new supertype called Action has been added to the ToDo and EngineAction.
- The ActionSponsor relationship now links an Action to a Referenceable.
- The ToDo entity has a new attribute called lastReviewTime.
- The EngineAction entity has new attributes called requesterUserId and requestedStartDate.
- A new classification called PersonalProject can be added to a project entity to indicate that it is an informal project that have been created by an individual to help them organize their work.
- A new classification called StudyProject can be added to a project entity to indicate that it is a focused analysis of a topic, person, object or situation.
- New properties called priority, projectHealth and projectPhase has been added to the Project entity to represent the status of the project.
- A new classification called DataScope can be added to a referenceable entity (typically a DataStore or DataSet) to define the scope of the associated data resource in space and time.
- Although not yet implemented in the Engine Host, there are types for a new type of governance engine called the ContextEventEngine and its corresponding governance service called ContextEventService. This new engine is for managing context events, scheduling and associated actions.
- A new entity called a ContextEvent plus associated elements has been added to provide a means to capture significant events that impact users, data, services, etcetera, to be recorded and used to explain why blips occurred in the past, or plan and take action to mitigate against blips in the future.
- There is a new schema extraction relationship called DiscoveredLinkedDataField that can be used to describe associations between data fields. The DataField entity has new attributes to align it more closely with SchemaAttribute.
- There are new lineage relationships called UltimateSource and UltimateDestination can be used to capture the size and edges of an element's lineage graph have been added.
- The GroupedMedia relationship is deprecated in favour of DataContentForDataSet.
- The DataFile type has a attribute called fileExtension. New types that inherit from DataFile have been defined for a wide range of file types: SpreadsheetFile, XMLFile, AudioFile, VideoFile, 3DImageFile, RasterFile, VectorFile, SourceCodeFile, ExecutableFile, ScriptFile, BuildInstructionFile, YAMLFile, PropertiesFile, and ArchiveFile. There is also a new relationship called ArchiveFileContents to identify the contents of an archive file.
- There are new attributes called isCaseSensitive and category for ValidValueDefinition that provides additional information used to match valid values. In addition, there are two new types of valid value relationship: ValidValueAssociation and ConsistentValidValues.
- The ConnectorType entity has a new attribute called deployedImplementationType to allow a connector writer to record the deployed implementation type that the connector works with.
- The Template classification has a new property called versionIdentifier. This model also includes a new relationship called CatalogTemplate to identify a template for a particular type of technology.
- The CatalogTarget relationship has two new properties: configurationProperties and metadataSourceQualifiedName.
- The Topic entity how has a topicName attribute.
- ConsistentValidValue is a new relationship that links valid values that represent different properties, and are typically seen together in the same data item.
- SpecificationPropertyAssignment is a new relationship that links a valid value that represents a type of configurable property with the element that represents the associated implementation.
XTDB repository connector is now in egeria.git
The XTDB repository connector is now located in the core Egeria git repository which means it is always available in the containers and distributions. This change is in recognition that XTDB is our premier metadata repository. There are three new administration commands to simplify the configuration of an XTDB repository.
- Configure XTDB with an in-memory repository
- Configure XTDB with a local file-based key-value store
- Configure XTDB with a connection, allowing more complex configurations to be set up
The java package name of the integrated version of the connector has changed from org.odpi.egeria.connectors.juxt.xtdb.repositoryconnector.XtdbOMRSRepositoryConnectorProvider
to org.odpi.openmetadata.adapters.repositoryservices.xtdb.repositoryconnector.XTDBOMRSRepositoryConnectorProvider
so it is possible to continue to run the original version of the XTDB connector in V5.0.
XTDB hang, or platform exit due to mismatch crypto library
On the latest level of the MacOs operating system we have experienced hangs during server startup, or the OMAG Server Platform exiting with a message complaining about the unsafe load of a cryptographic library. A question has been raised with the XTDB community to find out more about this library. In the meantime, there is a workaround if you see this issue: set the following environment variables before starting your OMAG Server Platform. This switches XTDB to using its own cryptography libraries.
XTDB_DISABLE_LIBCRYPTO=True;XTDB_ENABLE_BYTEUTILS_SHA1=True
Improved/changed terminology
The following concepts have been renamed to improve the clarity of their function, particularly in the documentation.
Open Lineage Services and Open Lineage Server renamed to Lineage Warehouse Services and Lineage Warehouse respectively
Our use of the term "open lineage" for our lineage governance server has been causing confusion with the newer Open Lineage standard and project. As a result, the Egeria Open Lineage Services are now called the Lineage Warehouse Services and the OMAG Server they run is called the Lineage Warehouse. The new name is more descriptive and avoiding clashing names is desirable. The change has been made cleanly both to the code and the document site (ie no backward compatible deprecated methods) to ensure the name clashing is completely removed. Every should work as before, except that when migrating a open lineage server to version 5.0, it is necessary to reconfigure what was the open lineage services section to the new lineage warehouse services section. The structure is the same, but the class name has changed from OpenLineageConfig
to LineageWarehouseConfig
, and the connectorProviderClassName
of the JanusGraph connector has changed from org.odpi.openmetadata.openconnectors.governancedaemonconnectors.openlineageconnectors.janusconnector.graph.LineageGraphConnectorProvider
to org.odpi.openmetadata.openconnectors.governancedaemonconnectors.lineagewarenouseconnectors.janusconnector.graph.LineageGraphConnectorProvider
. Full details of this change are describes in the Administration Guide.
Governance Action is renamed to Engine Action
If you have been using Egeria for a while, you may be familiar with the term Governance Action as the mechanism used to control the execution of automated actions in the Engine Hosts. In release 5.0, Governance Action is renamed to Engine Action to create a greater name differentiation between the concepts that are used to define the governance behaviour and those used to control the execution of this behaviour. The term Governance Action is now used as a general term for a Governance Action Process or a Governance Action Type. There is also a differentiation between a Governance Action Type that is used to initiate a standalone Engine Action and a specialized Governance Action Type called a Governance Action Process Step that describes a single step in a Governance Action Process.
New Python client called pyegeria
A new python client called pyegeria
for Egeria's administrative, platform and server operational services is available on pypi
. It also includes coverage of many of the view services. Future versions will add the metadata highway and governance server operational services.
Integration Daemon runs each integration connector in its own thread
Integration connectors running in an integration daemon are now each given their own thread to run in so that a busy integration connector does not prevent other integration connectors from running.
Improved administration
New administration commands for basic server properties
The administration services support new commands to set up, retrieve and clear the basic server properties in a server's configuration document in one operation. These properties are serverDescription, organizationName, localServerUserId, localServerPassword and maxPageSize. The new calls are described in the Administration Guide.
New administration command to retrieve the metadata collection name
The administration services support a new command to retrieve the metadata collection name set up for a repository in a cohort member server. This complements the calls to set up the metadata collection name and metadata collection id, and retrieve the metadata collection id.
New administration commands for the Engine Host
The administration services support new commands to simplify the configuration of an engine host. It is no longer necessary to match the type of the engines you are configuring with the associated engine service. You can now specify a single list of governance engine that you want to run in the engine host and Egeria will work out which engine service to manage each of them. See the Administration Guide for more information.
New administration commands for the Integration Daemon
The administration services support new commands to simplify the configuration of an integration daemon. It is now possible to configure all the integration services in a single call. There are also new methods to work with all the configuration in the integration daemon services as a single request. See the Administration Guide for more information.
New administration commands for the View Server
The administration services support new commands to simplify the configuration of an view server. It is now possible to configure all the view services in a single call. See the Administration Guide for more information.
Placeholder variables supported in the configuration document
It is possible to add placeholders in string properties of an OMAG Server's configuration document. The replacement values are configured either by a REST API call (OMAG Server Platform only), or, more typically in the application.properties
for either the OMAG Server Platform or OMAG Server Runtime. The placeholders are replaced by these values each time the server starts. This means that the server can be run in different environments without reconfiguring the server. Full details can be found in the Configuration Document documentation.
Support for default configuration documents
There is a new application.properties
property supported by the OMAG Server Platform that allows a default configuration document to be set up for the configuration document store. This provides the initial content for each new server configuration created on the platform. The new property is called platform.default.config.document
and is described in the Administration Guide.
Placeholder properties supported in metadata templates
It is possible to add placeholders to properties in metadata elements that are used as templates when creating new metadata elements. The replacement values are supplied on the request to create the new element from the template.
Improved framework services
The framework services provide common services to the Open Metadata Access Services (OMASs). The Governance Action Framework Service includes a new set of services called the Open Governance Service
that provides services to define governance actions such as governance action processes and governance action types, and the initiation of engine actions either directly or via a governance action process or governance action type.
All OMASs now have clients for the Connected Asset Service, Open Metadata Store, Open Governance Service and the Open Integration Service.
New Survey Action Framework (SAF)
The Survey Action Framework (SAF) supports a new type of connector called a survey action service that implements a survey of the contents of a particular digital resource such as a directory (folder) on the filesystem, a database or a file. The results of its analysis are stored in open metadata as a survey report linked off of the asset that describes the surveyed resource. The purpose of the survey report is to help in the initial understanding of a deployed IT landscape to help decide where to focus future effort in cataloguing and governing key resources. They can also be scheduled to run at regular intervals, creating a history of how the digital resource is changing over time.
There are implementations of three survey action services: one for a directory (folder, once for a file and another, more specialised service, for CSV files. There is also a SurveyReport utility to print the resulting report.
The survey action services run in the Engine Host. There is a new Open Metadata Engine Service (OMES) to support survey action services. It is called the Survey Action OMES.
Open Discovery Framework (ODF) is now deprecated
The function of the Survey Action Framework (SAF) is based on the Open Discovery Framework (ODF), which has now been deprecated. The reason for this is two-fold. Firstly the use of the term "discovery" in the framework's name is unfortunate since it is not the only mechanism that Egeria has for discovering metadata. This makes the descriptions about metadata discovery confusing. Secondly, the Open Discovery Framework has some usability issues relating to the discovery of schemas that have now been fixed in the Survey Action Framework. The Open Discovery Framework will remain in the codebase for at least 6 months to allow time for the existing open discovery services to move to the Survey Action Framework. It is recommended that all new implementations use the Survey Action Framework.
New Apache Atlas Survey Action Service
There is a new survey action service that provides a report on the content of an Apache Atlas server. It is called the Apache Atlas Survey Service. This survey action services complements the Apache Atlas Integration Connector that exchanges metadata between the open metadata ecosystem and an Apache Atlas server.
Extended support for file system monitoring
The context for all the Open Metadata Integration Services (OMIS) has been updated to support a file system listener. This means that integration connectors can easily monitor changing files and directories. This can be useful for situations where new information is being exchanged with open metadata using files such as CSV files and spreadsheets.
New View Service: Action Author OMVS
Action Author OMVS is a new view service for managing the definitions of governance action types and and governance action processes.
New View Service: Automated Curation OMVS
Automated Curation OMVS is a new view service that enable the user to set up integration connectors in an integration daemon, run engine actions and governance action processes.
New View Service: Feedback Manager OMVS
Feedback Manager OMVS is a new view service that supports the management of feedback (comments, likes, informal tags, notelogs and reviews) from users. The feedback support in Glossary Browser OMVS has been removed.
New View Service: Project Manager OMVS
Project Manager OMVS supports the definition and management of projects. Projects can themselves be organized hierarchically.
New View Service: Collection Manager OMVS
Collection Manager OMVS supports the organizing and browsing of metadata in collections. Collections can themselves be organized hierarchically. An example of its use is in managing the collections of assets that make up digital products.
New View Service: Valid Metadata OMVS
Valid Metadata OMVS supports the retrieval and maintenance of valid values for open metadata.
Improved anchor support
Anchor classifications now include the type name and domain of the anchor entity. This allows the Anchor classification to be used to search for string values in any element that is anchored to a particular type of element.
Removal of JavaScript User Interfaces
The original intention with the UIs was to consolidate their backend services onto the OMAG Server Platform/Runtime. Unfortunately, neither UI code based is building and the security scanners are reporting serious CVEs with the JavaScript code bases. There is no-one left in the community that is able to work with the JavaScript and so, for now these UIs have been deprecated. The runtime services that support the UIs have also been deprecated and removed from the runtime build. The code is temporarily still in the repository and will be cleaned up in a future release if no progress is made on the JavaScript code.
Deprecation of Egeria UI Application
Due to the addition of the new View Services and support for tokens in the OMAG Server Platform and OMAG Server runtimes, the Egeria UI Application is now deprecated. New UI components should use the Open Metadata View Services (OMVS). The Egeria UI Application assembly, and the open metadata assembly have also been deprecated.
Asset Catalog, OMAS Subject Area OMAS and Glossary View OMAS
The function of these services has now moved to Glossary Manager OMAS, Glossary Browser OMVS and Asset Catalog OMVS.
Repository Explorer OMVS, Type Explorer OMVS, Dynamic Infrastructure Explorer OMVS, Server Author OMVS, Glossary Author OMVS are now deprecated
These services have been removed from Egeria's build, but their code is temporariliy still in the Egeria.git repository.
New file extensions to Egeria' standard files
Files produced by Egeria supplied connectors now have specific file extensions to help identify them in the file system. Previously they used the file extension for their format 9such as .json
.
File Extension | File Type Name |
---|---|
omarchive | Open Metadata Archive File |
config | OMAG Server Configuration Document |
cohortregistry | OMAG Server Cohort Registry |
omalrecord | Open Metadata Audit Log Record File |
auditlog | Open Metadata Audit Log Folder |
openlineageevent | Open Lineage Event File |
ConnectorTypes can now identify the deployed implementation type for their connector
The developer for a connector can provide information about their connector's implementation in a connector type object that can be retrieved from its connector provider. The Open Connector Framework (OCF) has been extended to allow the connector type to include the deployed implementation type in addition to the type name of the asset subtype that this connector represents. Assets of this type are linked to connection objects via the ConnectionToAsset relationship.
Integration Connector Providers can now list catalog target types
The developer of an integration connector can provide information about the third party technology it exchanges metadata with through catalog target types.
Overriding configuration properties and metadataSourceQualifiedName for each catalog target
An integration connector's behaviour is controlled through a map of configuration properties passed in the configurationProperties attribute of the integration connector's connection object. It is now possible to override the settings of these properties for each catalog target that the integration connector is processing. These overriding configuration properties are stored in the CatalogTarget relationship's properties. Similarly, the metadata collection that elements catalogued by an integration connector is controlled by the metadataSourceQualifiedName attribute of the RegisteredIntegrationConnector relationship. It is now possible to override the metadata source qualified name for each catalog target that the integration connector is processing. The overriding value is also stored in the CatalogTarget relationship's properties.
Governance Service Providers can now export their specification
The developer of a governance service can provide information about the request types, request parameters and action target types it consumes and produces making it easier for consumers creating governance engine definitions, governance action types and governance action processes to incorporate the governance service into their work.
Extension to the OpenConnectorArchive
The OpenMetadataArchive.omarchive
(was OpenMetadataArchive.json
) now contains reference data for many file types and deployed implementation types. The deployed implementation types include details of Egeria's runtimes and services and how they link together. The deployed implementation types are also linked to details of the connectors that work with the associated technology. This is to assist someone setting up new automated curation and governance activities in the open metadata ecosystem.
IntelliJ HTTP Client
The omag-server-platform
assembly created during an egeria build now includes script files that exercise Egeria's REST API services using IntelliJ's HTTP Client. These mirror the Postman collections that are also found in the assembly.
Default Audit Log console output has changed
The default audit log output to the console for newly configured servers will no longer include the Event
log entries and the Types
log entries. Existing servers will continue to log as before. The audit log entries displayed on the console by a server can be changed using the Administration Services.
Management of open metadata type definitions
As the Egeria code base evolved, sets of definitions for the unique identifiers and type names of the open metadata types appeared in multiple modules. There was a set in the open metadata archive helper, the generic handlers, multiple OMASs and the Governance Action Framework. This was in addition to the original definitions in the open metadata type archive itself. In release 5.0, these copies of the open metadata type identifiers have been consolidated into one set found in the Governance Action Framework. There is still work to change the type definitions in the open metadata type archive to use the definitions from the GAF. This change removes the inconsistencies and gaps in the definitions. It also makes it easier to discover where a particular open metadata type is used.
Deprecation of Data Engine Proxy and Data Engine OMAS
These services have been deprecated since they are specialized for IBM's Information Server product. IF support for this product is required in the future then it will be reimplemented in the Integration Daemon.
Removal of the Anonymous Repository Services
The anonymous versions to the OMRS Metadata Collection REST APIs have been removed from Egeria's runtimes since they do not include caller information, and therefore provide opportunities for a denial of service attack. These services were deprecated in version 2 of Egeria.
- getMetadataCollectionId
- getEntity
- getRelationship