In development
A component that is in development is one that the Egeria community is still building. The code is added continuously in small pieces to help the review and socialization process. It may not run or do anything useful; it only promises not to break other functions. Expect to find git issues describing the end state.
Asset Lineage Open Metadata Access Service (OMAS)
Overview
Asset Lineage is a metadata access service (OMAS) that consolidates and exports lineage metadata for assets in a cohort. It does this by actively watching the cohort and discovering specific asset types and their relationships to other assets or metadata elements.
The following asset types are considered:
- DataStore (subtypes)
- Process
- DataSet
Asset Lineage OMAS then builds complex graph structures (sometimes called the asset context) that are sent to an out topic address for further preservation and use with the Open Lineage Server.
This works well for the scenario where metadata is actively shared on the cohort as it is created. In a different scenario, an additional repository that is already populated with existing metadata can join the cohort. Asset Lineage OMAS offers endpoints to handle this as well, allowing an external system to request (or actively poll for) and extract the metadata relevant for building the lineage graph.
In all cases, Asset Lineage OMAS relies on the underlying Enterprise Repository Services (OMRS) subsystem to find and consolidate metadata by combining the different elements available across the cohort.
User Guide
Most interaction with the Asset Lineage OMAS is driven by the external tools used to build lineage, either through Data Engine OMAS and Data Engine Proxy or through integrated lineage via Integration Services (OMIS) such as the Database Integrator or Files Integrator.
When enabled in an OMAG Metadata Access Server, it subscribes to the enterprise cohort topic and uses the following event triggers:
Relationship events that build the lineage graph:
- Lineage Mappings between Assets
- Lineage Mappings between Schema Elements
- Semantic Assignments between Glossary Terms and Schema Elements
Entity events that feed changes for assets that are created or updated:
- Data Stores
- Processes
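The trigger rules above can be pictured as a simple type filter. The following is only an illustrative sketch, not the service's implementation: the function name, the `kind` argument, and the `super_types` argument are hypothetical, and the type names `SemanticAssignment`, `DataStore`, and `Process` are assumed to match the open metadata type names in use (`LineageMapping` appears in the event samples on this page).

```python
# Assumed type names; verify them against the open metadata types in your cohort.
LINEAGE_RELATIONSHIP_TYPES = {"LineageMapping", "SemanticAssignment"}
LINEAGE_ENTITY_TYPES = {"DataStore", "Process"}


def is_lineage_trigger(kind, type_name, super_types=()):
    """Return True if an event of this kind and type would feed lineage.

    kind        -- "relationship" or "entity" (hypothetical labels)
    type_name   -- type of the element carried by the event
    super_types -- supertypes of type_name, so that subtypes such as a
                   data file can match DataStore
    """
    names = {type_name, *super_types}
    if kind == "relationship":
        return bool(names & LINEAGE_RELATIONSHIP_TYPES)
    if kind == "entity":
        return bool(names & LINEAGE_ENTITY_TYPES)
    return False
```

For example, `is_lineage_trigger("entity", "DataFile", ("DataStore",))` is true because the supertype matches, while an entity whose types match neither set is ignored.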
Interface choices
- Java client to integrate within Java programs,
- REST API to interact with external/remote systems,
- Out Topic Events to publish lineage related events.
Out Topic Events
LineageEntityEvent
{
  "class": "LineageEntityEvent",
  "eventVersionId": 1,
  "assetLineageEventType": "UPDATE_ENTITY_EVENT",
  "lineageEntity": {
    "guid": "a8f71cfc-bd59-440e-afb9-9719d60a8fe3",
    "typeDefName": "Process",
    "createdBy": "cocoETLnpa",
    "updatedBy": "cocoETLnpa",
    "createTime": 1636326254855,
    "updateTime": 1636326255079,
    "version": 2,
    "metadataCollectionId": "9beaa80a-50d9-44ba-b2ae-18f2379c9aa4",
    "properties": {
      "displayName": "ConvertFileToCSV",
      "qualifiedName": "ConvertFileToCSV@CocoPharma/DataEngine/CocoETL",
      "description": "Process named 'ConvertFileToCSV' representing high level processing activity performed by CocoETL tool."
    }
  }
}
LineageRelationshipEvent
{
  "class": "LineageRelationshipEvent",
  "eventVersionId": 1,
  "assetLineageEventType": "NEW_RELATIONSHIP_EVENT",
  "lineageRelationship": {
    "guid": "aa2b43c5-eb9a-49bf-93fd-aeddf55812f3",
    "typeDefName": "LineageMapping",
    "createdBy": "cocoETLnpa",
    "updatedBy": null,
    "createTime": 1636326312659,
    "updateTime": null,
    "version": 1,
    "metadataCollectionId": "9beaa80a-50d9-44ba-b2ae-18f2379c9aa4",
    "properties": {},
    "sourceEntity": {
      "guid": "b117f8a5-deaf-47d1-80fe-15425b5b7f1f",
      "typeDefName": "DataFile",
      "createdBy": "cocoETLnpa",
      "updatedBy": null,
      "createTime": 1636326233314,
      "updateTime": null,
      "version": 1,
      "metadataCollectionId": null,
      "properties": {
        "qualifiedName": "file://secured/research/previous-clinical-trials/old-archive.dat@CocoPharma/DataEngine/CocoETL"
      }
    },
    "targetEntity": {
      "guid": "a8f71cfc-bd59-440e-afb9-9719d60a8fe3",
      "typeDefName": "Process",
      "createdBy": "cocoETLnpa",
      "updatedBy": "cocoETLnpa",
      "createTime": 1636326254855,
      "updateTime": 1636326255079,
      "version": 2,
      "metadataCollectionId": null,
      "properties": {
        "qualifiedName": "ConvertFileToCSV@CocoPharma/DataEngine/CocoETL"
      }
    }
  }
}
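A consumer of the out topic needs to tell the two event shapes apart before reading them. The sketch below dispatches on the `class` field and extracts a few fields shown in the samples. It is a minimal illustration rather than a supported client: the function names are hypothetical, and it assumes that `createTime` and `updateTime` are epoch milliseconds, which the sample values suggest but this page does not state.

```python
import json
from datetime import datetime, timezone


def to_utc(millis):
    """Convert an assumed epoch-millisecond timestamp to a UTC datetime."""
    if millis is None:
        return None
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


def summarize_event(payload):
    """Parse one out topic event (a JSON string) into a small summary dict."""
    event = json.loads(payload)
    event_class = event.get("class")

    if event_class == "LineageEntityEvent":
        entity = event["lineageEntity"]
        return {
            "kind": "entity",
            "eventType": event["assetLineageEventType"],
            "guid": entity["guid"],
            "typeDefName": entity["typeDefName"],
            "updated": to_utc(entity.get("updateTime")),
        }

    if event_class == "LineageRelationshipEvent":
        relationship = event["lineageRelationship"]
        return {
            "kind": "relationship",
            "eventType": event["assetLineageEventType"],
            "typeDefName": relationship["typeDefName"],
            # A relationship event carries both ends, i.e. one graph edge.
            "source": relationship["sourceEntity"]["guid"],
            "target": relationship["targetEntity"]["guid"],
        }

    raise ValueError(f"Unsupported event class: {event_class!r}")
```

Each relationship summary gives one directed edge (`source` to `target`) that a downstream tool could add to its own lineage graph.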
REST API
GET - Request publication of lineage for a single entity (asset or term)
{{platformURLRoot}}/servers/{{serverName}}/open-metadata/access-services/asset-lineage/users/{{userId}}/publish-entity/{{entityTypeName}}/{{guid}}
GET - Request publication of lineage for all entities of a type (asset or term)
{{platformURLRoot}}/servers/{{serverName}}/open-metadata/access-services/asset-lineage/users/{{userId}}/publish-entities/{{entityTypeName}}
GET - Request asset context
{{platformURLRoot}}/servers/{{serverName}}/open-metadata/access-services/asset-lineage/users/{{userId}}/publish-context/{{entityTypeName}}/{{guid}}
GET - Retrieve the out topic OCF connection
{{platformURLRoot}}/servers/{{serverName}}/open-metadata/access-services/asset-lineage/users/{{userId}}/topics/out-topic-connection/{{callerId}}
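The URL templates above share a common prefix, so a small helper can fill in the placeholders. This is a sketch under stated assumptions: `asset_lineage_url` is a hypothetical helper name, and the host, server, and user values in the usage note are placeholders. It only builds the URL; sending the GET request, TLS settings, and authentication are left to whichever HTTP client you use.

```python
from urllib.parse import quote


def asset_lineage_url(platform_url_root, server_name, user_id, *path_segments):
    """Build an Asset Lineage OMAS URL from the templates on this page.

    path_segments are the parts after .../users/{{userId}}/, for example
    ("publish-entity", "Process", "<guid>").
    """
    fixed = [
        "servers", server_name,
        "open-metadata", "access-services", "asset-lineage",
        "users", user_id,
    ]
    # Percent-encode every segment so unusual names cannot break the path.
    encoded = [quote(str(segment), safe="") for segment in [*fixed, *path_segments]]
    return platform_url_root.rstrip("/") + "/" + "/".join(encoded)
```

For instance, `asset_lineage_url("https://localhost:9443", "myServer", "myUser", "publish-entities", "Process")` yields the second template with `Process` as the entity type name.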
Configuration
POST - Enable Asset Lineage OMAS with accessServiceOptions
{{platformURLRoot}}/open-metadata/admin-services/users/{{adminUserId}}/servers/{{serverName}}/access-services/asset-lineage
{
  "LineagePublisherBatchSize": 100,
  "LineageClassificationTypes": [
    "PrimaryCategory",
    "Confidentiality",
    "AssetZoneMembership",
    "SubjectArea",
    "AssetOwnership"
  ]
}
Detailed description of the properties
| Property | Description |
|---|---|
| LineagePublisherBatchSize | Number of elements to be sent in a single event. This parameter is used to optimize event payload size and allows multiple elements to be grouped in a batch. Default is 1. |
| LineageClassificationTypes | List of classification types considered while producing lineage events. The access service is always preconfigured with the default set listed in the example request body. Additional types can be added when necessary; they are always merged with the default set. |
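To make the two options concrete, the sketch below builds the `accessServiceOptions` body and illustrates the documented semantics: additional classification types are merged with the default set, and elements are grouped into batches of the configured size. The helper names are hypothetical, the merge order and de-duplication are assumptions made for illustration, and the code mimics the described behavior rather than reproducing the service's own logic.

```python
import json

# Default set as listed in the example request body on this page.
DEFAULT_CLASSIFICATION_TYPES = [
    "PrimaryCategory",
    "Confidentiality",
    "AssetZoneMembership",
    "SubjectArea",
    "AssetOwnership",
]


def build_options(batch_size=1, extra_types=()):
    """Return the accessServiceOptions body as a JSON string."""
    if batch_size < 1:
        raise ValueError("LineagePublisherBatchSize must be at least 1")
    return json.dumps({
        "LineagePublisherBatchSize": batch_size,
        "LineageClassificationTypes": list(extra_types),
    })


def effective_classification_types(configured):
    """Illustrate the merge: defaults first, then any new configured types."""
    merged = list(DEFAULT_CLASSIFICATION_TYPES)
    for name in configured:
        if name not in merged:
            merged.append(name)
    return merged


def batches(elements, batch_size):
    """Group elements into consecutive batches of at most batch_size."""
    for start in range(0, len(elements), batch_size):
        yield elements[start:start + batch_size]
```

With these helpers, configuring one extra type (here the placeholder name `"Retention"`) leaves the five defaults in place and appends the new one, and a batch size of 100 means 250 elements would be grouped into batches of 100, 100, and 50.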