Discovery Analysis Report
A discovery analysis report contains one or more sets of related properties that an open discovery service has discovered about a digital resource, its metadata, structure and/or content. These properties are stored in a set of discovery annotations linked to the discovery analysis report.
The report is attached to the asset for the digital resource that was analysed. Over time, the discovery analysis reports show how the digital resource's contents are changing.
The discovery analysis report is created in the open metadata repository by the Asset Analysis OMES when it creates a new open discovery service instance. The open discovery service can retrieve information about the discovery analysis report through the discovery analysis report store client. This client is accessed through the discovery annotation store.
The discovery analysis report store also enables a long-running discovery service (typically an open discovery pipeline) to record its current analysis step.
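The relationship between a report, its annotations and the recorded analysis step can be sketched as follows. This is a minimal in-memory illustration, not the ODF client API: the names `DiscoveryAnalysisReport`, `AnnotationSummary`, `set_analysis_step`, `add_annotation` and `run_pipeline` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass
class AnnotationSummary:
    """Hypothetical stand-in for a discovery annotation linked to a report."""
    annotation_type: str
    summary: str


@dataclass
class DiscoveryAnalysisReport:
    """Hypothetical in-memory model of a discovery analysis report."""
    asset_guid: str                       # asset for the analysed digital resource
    discovery_service_name: str           # service instance that owns the report
    analysis_step: Optional[str] = None   # current step of a long-running service
    annotations: List[AnnotationSummary] = field(default_factory=list)

    def set_analysis_step(self, step: str) -> None:
        # Record where a long-running discovery service has got to.
        self.analysis_step = step

    def add_annotation(self, annotation: AnnotationSummary) -> None:
        # Link a new annotation to this report.
        self.annotations.append(annotation)


def run_pipeline(report: DiscoveryAnalysisReport, steps: Iterable[str]) -> None:
    """Simulate a pipeline that records each step before producing an annotation."""
    for step in steps:
        report.set_analysis_step(step)
        report.add_annotation(
            AnnotationSummary(annotation_type="example", summary=f"result of {step}")
        )


if __name__ == "__main__":
    report = DiscoveryAnalysisReport(asset_guid="asset-1", discovery_service_name="demo")
    run_pipeline(report, ["profile-data", "analyse-schema"])
    print(report.analysis_step, len(report.annotations))
```

Recording the step before doing the work means that anyone inspecting the report while the pipeline is still running can see which stage is in progress.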
Discovery annotations
A discovery annotation describes one or more related properties about a digital resource. Some annotations refer to the entire digital resource and others refer to a data field within the digital resource. The annotations that describe a single data field are called data field annotations.
The annotation types defined in the Open Discovery Framework (ODF) are as follows:
- Classification Annotation - Captures a recommendation of which classifications to attach to this asset. It can be made at the asset or data field level.
- Data Class Annotation - Captures a recommendation of which data class this data field closely represents.
- Data Profile Annotation - Captures the characteristics of the data values stored in a specific data field in a data source.
- Data Profile Log Annotation - Captures the names of the log files that hold the profile characteristics of the data values stored in a specific data field. This is used when the profile results are too large to store in open metadata.
- Data Source Measurement Annotation - Collects arbitrary properties about a digital resource.
- Data Source Physical Status Annotation - Documents the physical characteristics of a data source asset.
- Fingerprint Annotation - Captures the characteristics of the data values stored in a specific data field, or in the whole digital resource, and expresses them as a single value.
- Request for Action Annotation - Triggers governance and stewardship actions.
- Relationship Advice Annotation - Documents a recommended relationship that should be established with the asset.
- Quality Annotation - Documents calculated quality scores on different dimensions.
- Schema Analysis Annotation - Documents the structure of the data (schema) inside the asset.
- Semantic Annotation - Documents suggested meanings for this data based on the values and name of the field.
Open Metadata Types for Discovery Annotations
The open metadata types for discovery annotations are described in Area 6.
The main entity type is called Annotation. It is extended by DataFieldAnnotation to distinguish annotations that refer primarily to a data field. Other, more specialist annotations extend these two basic annotation types.
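The two-level type structure can be illustrated with a small class hierarchy. This is a sketch only: the property names (`explanation`, `data_field_name`, `schema_name`, `candidate_data_classes`) and the helper `split_by_scope` are hypothetical, and the choice of which specialist type extends which base type is an assumption made for illustration based on the descriptions in the list above. Area 6 holds the authoritative definitions.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Annotation:
    """Base type: an annotation that may describe the whole digital resource."""
    annotation_type: str
    explanation: str = ""          # hypothetical property


@dataclass
class DataFieldAnnotation(Annotation):
    """Base type for annotations that refer primarily to a data field."""
    data_field_name: str = ""      # hypothetical property


@dataclass
class SchemaAnalysisAnnotation(Annotation):
    """Assumed here to extend Annotation, as it describes the asset's schema."""
    schema_name: str = ""          # hypothetical property


@dataclass
class DataClassAnnotation(DataFieldAnnotation):
    """Assumed here to extend DataFieldAnnotation, as it describes one field."""
    candidate_data_classes: Tuple[str, ...] = ()   # hypothetical property


def split_by_scope(
    annotations: Sequence[Annotation],
) -> Tuple[List[Annotation], List[DataFieldAnnotation]]:
    """Separate resource-level annotations from data field annotations."""
    resource_level: List[Annotation] = []
    field_level: List[DataFieldAnnotation] = []
    for annotation in annotations:
        if isinstance(annotation, DataFieldAnnotation):
            field_level.append(annotation)
        else:
            resource_level.append(annotation)
    return resource_level, field_level


if __name__ == "__main__":
    found = [
        SchemaAnalysisAnnotation(annotation_type="schema", schema_name="customers"),
        DataClassAnnotation(
            annotation_type="data-class",
            data_field_name="postcode",
            candidate_data_classes=("PostalCode",),
        ),
    ]
    resource_level, field_level = split_by_scope(found)
    print(len(resource_level), len(field_level))
```

Because every specialist type inherits from one of the two base types, a consumer can tell whether an annotation applies to a data field with a single type check, without knowing every specialist type in advance.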