Open Metadata Archives

Open metadata archives provide pre-canned content (open metadata types and instances) to load into an open metadata repository. There are two main types of open metadata archive:

  • Content packs - metadata types and instances that are reusable in many organizations. The type definitions for the Open Metadata Types are managed in a content pack. Similarly, bespoke types are managed in content packs. Content packs are also used for distributing standard glossaries or other types of definitions from expert groups and organizations. The elements in the content pack belong to the archive's metadata collection irrespective of their originating metadata repository.

  • Metadata exports - metadata exported from a specific open metadata repository that can act as a snapshot or backup of specific types and instances. The elements in the metadata export remain part of the metadata collection of the originating metadata repository.

By the rules of metadata provenance, the elements in an open metadata archive are read-only when loaded into an open metadata repository unless the repository has the same metadata collection id as the element.
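The provenance rule above can be illustrated with a minimal sketch. The `Element` class, the `is_read_only` function and the identifiers are hypothetical names chosen for this illustration; they are not part of any Egeria API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """Hypothetical, simplified view of a metadata element."""
    guid: str
    metadata_collection_id: str  # the collection the element belongs to


def is_read_only(element: Element, repository_collection_id: str) -> bool:
    """An element can be updated only by the repository whose metadata
    collection id matches the element's own collection id."""
    return element.metadata_collection_id != repository_collection_id


# Illustrative identifiers only.
content_pack_term = Element("term-1", "archive-collection")  # owned by the archive
exported_asset = Element("asset-1", "repo-A")                # owned by repo-A

read_only_in_a = is_read_only(content_pack_term, "repo-A")
editable_in_a = not is_read_only(exported_asset, "repo-A")
read_only_in_b = is_read_only(exported_asset, "repo-B")
```

Under these assumptions, an element from a content pack is read-only everywhere, while an element from a metadata export is updatable only when it is loaded back into its originating repository.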

Figure 1 shows a content pack being loaded into a server. When an element from an open metadata archive is loaded, it is compared against the content of the local repository. If it is a new element, or a later version than the one in the local repository, the element is stored and then distributed to any connected cohorts.

Figure 1: Loading a content pack

Because this metadata is distributed across the cohorts, the archive only needs to be loaded into one of the servers.
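The load-time comparison can be sketched as follows. This is a simplified illustration, not the repository's actual implementation: the `ElementVersion` type, the dictionary standing in for the local repository, and the `distribute` callback standing in for cohort distribution are all assumptions.

```python
from typing import Callable, Dict, NamedTuple


class ElementVersion(NamedTuple):
    """Hypothetical element: a unique id, a version number and some content."""
    guid: str
    version: int
    payload: str


def load_element(
    repository: Dict[str, ElementVersion],
    incoming: ElementVersion,
    distribute: Callable[[ElementVersion], None],
) -> bool:
    """Store and distribute the element only if it is new or a later version."""
    local = repository.get(incoming.guid)
    if local is not None and local.version >= incoming.version:
        return False  # the local copy is the same or newer, so nothing changes
    repository[incoming.guid] = incoming
    distribute(incoming)  # stand-in for sending the element to connected cohorts
    return True


repository: Dict[str, ElementVersion] = {}
sent = []  # records what would have been distributed
first = load_element(repository, ElementVersion("g1", 1, "original"), sent.append)
repeat = load_element(repository, ElementVersion("g1", 1, "original"), sent.append)
newer = load_element(repository, ElementVersion("g1", 2, "updated"), sent.append)
```

In this sketch, reloading an element at the same version is a no-op, so loading the same archive repeatedly leaves a single copy of each element.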

When data and other types of assets are being transported between organizations, a metadata export archive can be used to pass the related metadata as well. This is shown in Figure 2.

Figure 2: Exporting and reimporting metadata between unconnected repositories

Figure 3 shows a metadata export archive being used to create a backup of selected metadata. The backup can be used to recover the metadata repository content after a bad load or other operational error.

Figure 3: Selective backup of metadata elements

Creating open metadata archives

There are two approaches to creating an open metadata archive:

  • Assemble the contents in memory and push to the open metadata archive store when the archive is assembled.
  • Push the elements in the archive one-by-one as they are built.

The first approach works well for small archives such as content packs; the second suits large archives such as backups, where holding the whole archive in memory is impractical.

There are three supporting components used in the construction process:

  • Helper - logic to build the different types of elements for the archive.
  • Builder - logic to assemble the elements into the archive structure.
  • Writer - logic to store the contents of the archive on disk.

They are driven by specific archive logic that knows what content to add to the archive and an open metadata archive store connector that is responsible for the storage of the archive.
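As an illustration of how these pieces could fit together for the in-memory approach, here is a minimal sketch. The `Helper`, `Builder` and `Writer` classes below are hypothetical stand-ins written for this example, not Egeria's classes, and the `entities` key, the `GlossaryTerm` type name and the `displayName` property are used purely for illustration.

```python
import json
import tempfile
import uuid
from pathlib import Path
from typing import Any, Dict, List


class Helper:
    """Builds individual elements (here, simplified entity dictionaries)."""

    def entity(self, type_name: str, properties: Dict[str, Any]) -> Dict[str, Any]:
        return {"guid": str(uuid.uuid4()), "type": type_name, "properties": properties}


class Builder:
    """Assembles elements into the archive structure, held in memory."""

    def __init__(self, name: str, archive_type: str) -> None:
        self._archive: Dict[str, Any] = {
            "archiveProperties": {"archiveName": name, "archiveType": archive_type},
            "archiveTypeStore": {},
            "archiveInstanceStore": {"entities": []},
        }

    def add_entity(self, entity: Dict[str, Any]) -> None:
        self._archive["archiveInstanceStore"]["entities"].append(entity)

    def build(self) -> Dict[str, Any]:
        return self._archive


class Writer:
    """Stores the completed archive on disk as a single JSON file."""

    def write(self, archive: Dict[str, Any], path: Path) -> None:
        path.write_text(json.dumps(archive, indent=2), encoding="utf-8")


def build_glossary_archive(path: Path, terms: List[str]) -> Dict[str, Any]:
    """The 'archive logic': decides what goes into the archive."""
    helper, builder = Helper(), Builder("Example glossary", "CONTENT_PACK")
    for term in terms:
        builder.add_entity(helper.entity("GlossaryTerm", {"displayName": term}))
    archive = builder.build()
    Writer().write(archive, path)  # written once, after assembly is complete
    return archive


with tempfile.TemporaryDirectory() as tmp:
    archive_path = Path(tmp) / "example-archive.json"
    archive = build_glossary_archive(archive_path, ["Customer", "Order"])
    reloaded = json.loads(archive_path.read_text(encoding="utf-8"))
```

For the second approach, the same archive logic would hand each element to the writer as soon as the helper has built it, rather than accumulating everything in the builder first.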

Figure 4: Assembling an open metadata archive in memory and then writing it out to disk once it is complete

Figure 5: Assembling an open metadata archive directly on disk

The archive logic can either be an offline utility or an archive service running in an archive engine.

Inside an Open Metadata Archive

The open metadata archive has three parts, as shown in Figure 6. The header defines the type of archive and its properties. The type store contains new attribute type definitions, new type definitions and updates to type definitions (patches). The instance store contains new instances (entities, relationships and classifications).

Figure 6: Inside an Open Metadata Archive

Example of the header from the Cloud Information Model archive
{
  "class": "OpenMetadataArchive",
  "archiveProperties": {
    "class": "OpenMetadataArchiveProperties",
    "archiveGUID": "9dc75637-92a7-4926-b47b-a3d407546f89",
    "archiveName": "Cloud Information Model (CIM) glossary and concept model",
    "archiveDescription": "Data types for commerce focused cloud applications.",
    "archiveType": "CONTENT_PACK",
    "originatorName": "The Cloud Information Model",
    "originatorLicense": "Apache 2.0",
    "creationDate": 1570383385107,
    "dependsOnArchives": ["bce3b0a0-662a-4f87-b8dc-844078a11a6e"]
  },
  "archiveTypeStore": {},
  "archiveInstanceStore": {}
}
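A header like the one above can be inspected with any JSON parser. The following sketch reads the archive properties and lists the archives it depends on. It assumes that `creationDate` is a count of milliseconds since the Unix epoch, which is a common JSON serialization of a Java date but is not stated in the example itself.

```python
import json
from datetime import datetime, timezone

# The header from the example above, abbreviated to the fields used here.
HEADER_JSON = """
{
  "class": "OpenMetadataArchive",
  "archiveProperties": {
    "class": "OpenMetadataArchiveProperties",
    "archiveGUID": "9dc75637-92a7-4926-b47b-a3d407546f89",
    "archiveName": "Cloud Information Model (CIM) glossary and concept model",
    "archiveType": "CONTENT_PACK",
    "creationDate": 1570383385107,
    "dependsOnArchives": ["bce3b0a0-662a-4f87-b8dc-844078a11a6e"]
  },
  "archiveTypeStore": {},
  "archiveInstanceStore": {}
}
"""

archive = json.loads(HEADER_JSON)
props = archive["archiveProperties"]

# Assumption: creationDate is in epoch milliseconds.
created = datetime.fromtimestamp(props["creationDate"] / 1000, tz=timezone.utc)

# GUIDs of the archives this archive depends on (empty if the field is absent).
prerequisites = props.get("dependsOnArchives", [])

print(f"{props['archiveName']} ({props['archiveType']}), created {created:%Y-%m-%d}")
print("Depends on:", ", ".join(prerequisites) or "nothing")
```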

Storage structures

Figure 7: Storing an open metadata archive as a single file

Figure 8: Storing an open metadata archive in a directory structure

Loading open metadata archives

A metadata server's configuration document can list the archives to load each time the server is started. This is useful if the server does not retain metadata through a server restart (like the in-memory metadata repository). Open metadata archives may also be loaded while the server is running using a REST API call.

These articles describe how to load open metadata archives into a server:

The archive loads in the following order:

  • Attribute Type Definitions (AttributeTypeDefs) from the type store:
      • PrimitiveDefs
      • CollectionDefs
      • EnumDefs

  • New Type Definitions (TypeDefs) from the type store:
      • EntityDefs
      • RelationshipDefs
      • ClassificationDefs

  • Updates to type definitions (TypeDefPatches)

  • New Instances:
      • Entities
      • Relationships
      • Classifications
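The ordering above means that types are always in place before the instances that use them. A minimal sketch of such an ordering follows. The category labels and the `(category, name)` tuples are shorthand invented for this example and do not reflect how a repository represents archive content.

```python
from typing import List, Tuple

# Categories in the order listed above; labels are illustrative shorthand.
LOAD_ORDER = [
    "PrimitiveDef", "CollectionDef", "EnumDef",             # attribute type definitions
    "EntityDef", "RelationshipDef", "ClassificationDef",    # new type definitions
    "TypeDefPatch",                                         # updates to type definitions
    "Entity", "Relationship", "Classification",             # new instances
]
RANK = {category: position for position, category in enumerate(LOAD_ORDER)}


def in_load_order(elements: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Sort (category, name) pairs into load order.

    sorted() is stable, so elements in the same category keep their
    original relative order.
    """
    return sorted(elements, key=lambda element: RANK[element[0]])


unordered = [
    ("Entity", "e1"),
    ("EnumDef", "enum1"),
    ("TypeDefPatch", "p1"),
    ("EntityDef", "t1"),
    ("Relationship", "r1"),
    ("PrimitiveDef", "prim1"),
]
ordered = in_load_order(unordered)
```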

The archive is loaded once and its content is immediately available. If the repository persists metadata over a server restart then this archive content continues to be available after the server restarts.

No matter how many times an archive is loaded, only one copy of its content is added to the repository.

Supported utilities for open metadata archives

Egeria supports the following open metadata archives. Associated with each archive are utilities that help you build additional archives of your own content.