Downloading and Building Egeria Tutorial¶
Egeria is an open source project that is delivered both as source code and as libraries in the Maven Central Repository.
This tutorial will guide you through the process of downloading the core Egeria source code from GitHub and building it so that you can run it on your local machine.
Alternatively you can also use Kubernetes to run Egeria. This uses the published builds of Egeria and does not require you to build Egeria on your machine.
Prerequisite technology for building Egeria¶
Installing Java¶
Java is a relatively mature object-oriented programming language that was originally designed to be able to easily run programs across a number of different computer systems.
The Egeria project itself is primarily written in Java, and therefore a Java Runtime Environment (JRE) is the most basic component needed in order to run Egeria.
You will need a Java Development Kit (JDK) installed on your machine in order to build Egeria. (A JDK will include a JRE.)
There are various JREs/JDKs available, and you may even have one pre-installed on your system. You can check if Java is already installed by running the command java -version from the command line.
Java can be installed by:
- Downloading the OpenJDK 17 (LTS) HotSpot JVM from Adoptium.
- Running the installer that is downloaded.
Alternatively, you may wish to install Java from your package manager, such as Homebrew on macOS.
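As a minimal sketch of the check described above, the script below reports which Java version, if any, is on the PATH. The helper name java_major is hypothetical, and the parsing assumes the first line of java -version quotes the version string (for example "17.0.8").

```shell
#!/bin/sh
# Print the major version for a Java version string such as 17.0.8 or 1.8.0_381.
java_major() {
  case "$1" in
    1.*) v="${1#1.}"; echo "${v%%.*}" ;;   # legacy scheme: 1.8.0_381 -> 8
    *)   echo "${1%%[.+-]*}" ;;            # modern scheme: 17.0.8 -> 17
  esac
}

if command -v java >/dev/null 2>&1; then
  # java -version writes to stderr; the first line is assumed to quote the version.
  raw=$(java -version 2>&1 | sed -n '1s/.*"\([^"]*\)".*/\1/p')
  echo "Found Java $raw (major version $(java_major "$raw"))"
else
  echo "No java command found on the PATH"
fi
```

A result of 17 would match the OpenJDK 17 download suggested in this section.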
Installing Maven¶
Apache Maven is a build tool that is being phased out in the Egeria project, but is still required by some repositories. It is capable of compiling code, running unit tests, validating dependencies and Javadoc, and building our distribution archive.
Egeria 4.0 and above cannot be built using Maven.
Where it is used, Egeria requires Maven 3.5 or higher; 3.6.x or above is recommended.
Check if Maven is installed
mvn --version
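The version requirement can also be checked by script. This is a sketch only: version_at_least is a hypothetical helper, and the parsing assumes the first line of the mvn --version output has the form "Apache Maven 3.x.y ...".

```shell
# Succeed when version $1 (e.g. 3.6.3) is at least major $2, minor $3.
version_at_least() {
  major="${1%%.*}"
  rest="${1#*.}"
  minor="${rest%%.*}"
  [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

if command -v mvn >/dev/null 2>&1; then
  # Assumed format of the first output line: "Apache Maven 3.9.6 (...)"
  found=$(mvn --version 2>/dev/null | awk 'NR == 1 { print $3 }')
  if version_at_least "$found" 3 5; then
    echo "Maven $found meets the 3.5 minimum"
  else
    echo "Maven $found is older than 3.5"
  fi
else
  echo "Maven is not installed (only needed for repositories that still use it)"
fi
```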
Maven can be installed by downloading the software from the Apache Maven website and unpacking it into a directory that is included in your PATH. Alternatively these methods are available:
Install Maven through HomeBrew
brew install maven
Install through yum
yum install maven
Install through apt-get
apt-get install maven
On Windows, you should use Windows Subsystem for Linux Version 2 or above, install an appropriate Linux distribution, and follow the instructions for that Linux distribution.
Installing Git on your local machine¶
Git is an open source version control system used to store and manage Egeria's files. You need it installed on your machine to work with Egeria's git repositories stored on GitHub.
You can check whether it is installed on your system by running git --version from the command line.
Git can be installed:
- On macOS, as part of the Xcode suite (running git --version will prompt you to install it if it is not already installed).
- On Linux operating systems, by using your distribution's package manager (yum install git, apt-get install git, etc).
- On Windows, you should use Windows Subsystem for Linux Version 2 or above, install an appropriate Linux distribution, and follow the instructions for Linux.
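Before moving on to the tutorial tasks, it can help to confirm that the prerequisites are all present in one step. The following sketch uses a hypothetical helper, missing_tools, that lists any named command not found on the PATH.

```shell
# Print each named command that cannot be found on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing=$(missing_tools java git)
if [ -z "$missing" ]; then
  echo "java and git are both available"
else
  echo "Missing prerequisites:" $missing
fi
```

Add mvn to the list only if you are building one of the repositories that still requires Maven.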
Tutorial tasks¶
- Downloading the Egeria source from GitHub
- Building the Egeria source with Apache Maven
- Installing Egeria
Downloading the Egeria Source from GitHub¶
The code for Egeria is downloaded from each git repository one at a time. The commands shown in each tab below create a clone (copy) of the egeria git repositories for your own use. If you want to make a contribution to Egeria, you need to clone your own fork of a repository rather than the main repository itself.
Create a new directory for Egeria's main libraries. In the example below it is called egeria-main-libraries:
mkdir egeria-main-libraries
Change to your new directory.
cd egeria-main-libraries
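Egeria's source is extracted from GitHub using one of the following git commands.
To retrieve the code for a specific release of Egeria, enter the command below, replacing {release-number} with the release you need (release 3.14 is shown as an example):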
git clone -b egeria-release-{release-number} --single-branch https://github.com/odpi/egeria.git
git clone -b egeria-release-3.14 --single-branch https://github.com/odpi/egeria.git
To retrieve the latest "SNAPSHOT" code from the main branch of Egeria enter:
git clone https://github.com/odpi/egeria.git
A new directory has been created with the core Egeria source code. Change to the egeria directory and you are ready to build the source.
cd egeria
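The two clone variants above differ only in the branch options. As an illustrative sketch (clone_command is a hypothetical helper, and it prints the command rather than running it), the choice can be expressed as:

```shell
# Print (rather than run) the clone command for an optional release number.
clone_command() {
  release="$1"
  if [ -n "$release" ]; then
    echo "git clone -b egeria-release-$release --single-branch https://github.com/odpi/egeria.git"
  else
    echo "git clone https://github.com/odpi/egeria.git"
  fi
}

clone_command 3.14   # a specific release
clone_command        # latest code from the main branch
```

Printing the command keeps the sketch safe to run anywhere; copy the output into your terminal to perform the clone.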
Create a new directory for Egeria's samples. In the example below it is called egeria-samples-source:
mkdir egeria-samples-source
Change to your new directory.
cd egeria-samples-source
Egeria's samples source is extracted from GitHub using the following git command:
git clone https://github.com/odpi/egeria-samples.git
A new directory has been created with the samples' source code. Change to the egeria-samples directory and you are ready to build the source.
cd egeria-samples
Create a new directory for Egeria's developer projects. In the example below it is called egeria-dev-projects-source:
mkdir egeria-dev-projects-source
Change to your new directory.
cd egeria-dev-projects-source
Egeria's source is extracted from GitHub using the following git command:
git clone https://github.com/odpi/egeria-dev-projects.git
A new directory has been created with the developer projects source code. Change to the egeria-dev-projects directory and you are ready to build the source.
cd egeria-dev-projects
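The download steps follow the same pattern for each of the three repositories. The sketch below prints that pattern for each working directory and repository pair named above; download_steps is a hypothetical helper and nothing is cloned when it runs.

```shell
# Print the download steps for each working directory / repository pair.
download_steps() {
  while read -r dir repo; do
    echo "mkdir $dir && cd $dir && git clone https://github.com/odpi/$repo.git && cd $repo"
  done <<'EOF'
egeria-main-libraries egeria
egeria-samples-source egeria-samples
egeria-dev-projects-source egeria-dev-projects
EOF
}

download_steps
```

Each printed line is intended to be run from the same parent directory, since every line ends by changing into the newly cloned repository.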
The ls command allows you to list the files from the repository:
ls
It should be the same as the contents of the git repository on GitHub.
You are now ready to build the egeria source.
Building the Egeria Source¶
The build process takes the source files from the git repository and creates executable libraries needed to run Egeria.
Egeria currently supports building on Linux and Unix-like operating systems such as macOS.
Our official build pipelines are based on x86_64 architecture, but it is expected the build will run on other architectures, subject to the availability of the required tools and interpreters/jvms/runtimes on that platform (for example Java, Python, Docker/containerd/k8s etc).
Currently, the Egeria team does not regularly test or use Windows, so there may be areas that are not documented as well, or that do not work. We would welcome any interested developers who use Windows on a daily basis to join us and help improve this area!
On Windows, you should use Windows Subsystem for Linux Version 2 or above, and install a Linux distribution such as Ubuntu. This avoids issues we have seen with path separators, symbolic links, slow I/O performance and long path names. WSL version 2 should be used, not version 1, due to differences in file I/O (emulation). The docs above explain how to switch from v1 to v2.
From the command line everything should work just as for macOS and Linux, including building and running Egeria, since a full Linux distribution with a Linux kernel is being used.
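To see from inside the Linux environment whether it is running under WSL, the kernel release string can be inspected. This is a heuristic sketch, not an official check: wsl_flavour is a hypothetical helper, and it assumes that WSL kernels mention "microsoft" in their release string and that WSL 2 kernels also include "WSL2". Custom kernels may not follow this pattern.

```shell
# Classify a kernel release string as reported by uname -r.
# Assumption: WSL kernels contain "microsoft"; WSL 2 kernels also contain "WSL2".
wsl_flavour() {
  case "$1" in
    *[Mm]icrosoft*WSL2*) echo "wsl2" ;;
    *[Mm]icrosoft*)      echo "wsl1" ;;
    *)                   echo "not-wsl" ;;
  esac
}

echo "Kernel $(uname -r) looks like: $(wsl_flavour "$(uname -r)")"
```

From a Windows prompt, wsl -l -v lists each installed distribution with its WSL version, which is the more direct way to confirm it.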
However, IDE use may be a little different. Some IDEs can run the GUI in Windows natively, and then use the WSL environment to perform build and execution.
With IntelliJ the following process is most likely to work:
- Ensure an Ubuntu environment is set up using WSL2.
- Install a Java SDK and Maven as for macOS/Linux.
- Ensure a build at the command line works.
- Install IntelliJ Community Edition on Windows. Using the latest version (2022.1 at time of writing) is recommended as WSL support is a new area.
- Create a new project 'from existing sources' and ensure you point to //wsl$/..... (path in the Linux environment).
- After a few warnings as IntelliJ detects the code, your SDK should be set automatically to the Linux Java version.
JetBrains has a WSL2 support article that covers these instructions in more detail.
Another option would be to run the IDE itself directly within the Linux environment, and share the display via X11, VNC, or another form of remote desktop. This is likely to work, but could perform sluggishly. Microsoft is improving this area with WSLg, but this requires very new software and dedicated graphics to work well. It's also outside the scope of this summary.
Egeria provides both Maven and Gradle build scripts. On Windows we've seen issues with Maven which can cause IntelliJ to be busy or unresponsive for hours. If this happens you could try to use the Gradle build instead. To do this in IntelliJ:
- Navigate to your Maven tool window, click the top level Maven project 'Egeria' and 'Unlink Maven Projects', then confirm.
- In the left project tree, right-click on the top level build.gradle and choose 'Link gradle project'.
Yet another option to use IntelliJ is to make use of Remote Development. With this configuration you would use a separate Linux system, and connect remotely. This is beyond the scope of these docs.
Feedback on Windows, offers to help and requests for clarification of the steps can be directed to odpi/egeria-docs#335.
Running the build¶
When you download (clone) the contents of a git repository from GitHub, a new directory is created that is named after the repository that you cloned. For example, the directory created when the main egeria.git repository is cloned is called egeria. This directory contains all the source and the build scripts.
The project uses three main build technologies:
- Gradle is the primary build tool for the Egeria repositories.
- Apache Maven is an alternative build tool to Gradle and is being phased out.
- npm is used for Javascript repositories associated with the User Interfaces.
The build scripts that use these technologies ensure the software is built in the correct order.
Building with Gradle¶
Gradle is used to build the following repositories:
- egeria.git - main Egeria libraries.
- egeria-dev-projects.git - utilities and connectors for developers to use and develop further.
The Gradle processing works through the project modules. Each module has a build.gradle file that defines the artifact, its dependencies and any special processing that the module builds. The top-level build.gradle file at the root of the repository's source code directory structure controls the overall process.
Gradle runs the build in parallel threads so be sure any test cases are independent of one another.
Maven repositories
This processing includes locating and downloading external libraries and dependencies, typically from an online open source repository called Maven Central and our snapshot repository on https://oss.sonatype.org, so make sure you are online when you run the build.
No Gradle installation is required, as we use the 'gradle wrapper', which will automatically install Gradle if needed. This reduces the setup steps, and ensures everyone runs the same version of Gradle.
This is a regular incremental build, but will also run all tests and generate javadoc.
./gradlew build
The quick build skips generation of javadoc, and tests
./gradlew build -x test -x javadoc
This build avoids any use of the build cache and ensures a full clean build. It may be needed when you want to recheck something that has no changed sources but needs a rebuild, for example to review compiler warning messages (not errors).
./gradlew clean build --no-build-cache
This build option creates an OMAG Server Platform where the registered services are optional. The OMAG Server Platform loads the registered services it finds on the loader path specified with the -Dloader.path={directoryName} option of its startup command. Use this option if you want to remove the registered services that you are not using, or you would like to introduce your own registered services.
./gradlew -PadminChassisOnly build
The build will typically take from seconds to 10 minutes depending on the speed of your machine and the number of projects that need to be built.
BUILD SUCCESSFUL in 4m 51s
3290 actionable tasks: 3172 executed, 118 up-to-date
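The four Gradle invocations described above can be summarised as a simple lookup. The sketch below is illustrative only: gradle_command and its mode names (full, quick, clean, chassis) are hypothetical labels, while the commands themselves are the ones listed in this section.

```shell
# Map a build mode to the Gradle wrapper command described above.
gradle_command() {
  case "$1" in
    full)    echo "./gradlew build" ;;
    quick)   echo "./gradlew build -x test -x javadoc" ;;
    clean)   echo "./gradlew clean build --no-build-cache" ;;
    chassis) echo "./gradlew -PadminChassisOnly build" ;;
    *)       echo "usage: gradle_command full|quick|clean|chassis" >&2; return 1 ;;
  esac
}

gradle_command quick
```

Run the printed command from the top-level directory of the cloned repository, where the gradlew script is located.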
Gradle development
For Egeria, Gradle is a replacement build tool for Maven and offers:
- better support for parallel builds
- more flexibility for build tasks
- breaking the link between directory structure and maven artifacts
- extremely fast incremental builds
As of version 4, Egeria can only be built using gradle.
Building with Maven¶
If building a version of Egeria prior to version 4, the maven instructions can be found below:
Prior to V4.0 Maven is used to build the following repositories:
- egeria.git - main Egeria libraries.
- egeria-samples.git - coded samples of using Egeria.
- egeria-dev-projects.git - utilities and connectors for developers to use and develop further.
The Maven processing organizes the modules into a hierarchy. Each module has a pom.xml file (called the pom file) that defines the artifact, its parent / children, dependencies and any special processing that the module builds. The top-level pom file is the pom.xml file at the root of the repository's source code directory structure.
When the Maven command is run, it passes through the hierarchy of modules multiple times. Each pass processes a particular lifecycle phase of the build (to ensure, for example, Java source files are compiled before the resulting object files are packaged into a jar file).
Maven repositories
This processing includes locating and downloading external libraries and dependencies, typically from an online open source repository called Maven Central. The directory where these external dependencies are stored locally is called .m2.
Rebuild a module with Maven: from the module's directory, issue the command:
mvn clean install
The egeria.git repository has a top-level pom file so all of the modules can be built using one mvn clean install command from the top-level egeria directory. There is also a quick build option for people just wishing to use Egeria rather than make changes: enter mvn clean install -P quick -D skipFVT
The egeria-samples.git repository does not have a top-level pom file. Each sample is built separately. When you want to build a sample, change to the sample's directory where the pom.xml file is located and issue mvn clean install.
The egeria-dev-projects.git repository has a top-level pom file so all of the modules can be built using one mvn clean install command from the top-level egeria-dev-projects directory.
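Because egeria-samples.git has no top-level pom file, each sample directory needs its own Maven run. The sketch below lists one command per directory that contains a pom.xml; sample_build_commands is a hypothetical helper, it only prints the commands, and it does not distinguish a sample's parent module from its child modules.

```shell
# Print a build command for every directory under $1 that contains a pom.xml.
sample_build_commands() {
  find "$1" -name pom.xml -exec dirname {} \; | sort | while read -r dir; do
    echo "(cd $dir && mvn clean install)"
  done
}

# Only list commands when a local egeria-samples clone is present.
if [ -d egeria-samples ]; then
  sample_build_commands egeria-samples
fi
```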
The build can take 15 minutes to over an hour depending on the repository and on the speed/load on your machine. However eventually you will see the message:
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 54:54 min
[INFO] Finished at: 2020-01-29T09:33:17Z
[INFO] Final Memory: 171M/3510M
[INFO] ------------------------------------------------------------------------
Process finished with exit code 0
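For long builds it can be convenient to save the output to a file and check for the success banner afterwards. The sketch below assumes the log was saved to a file called build.log (a hypothetical name); build_succeeded is also hypothetical. The exit code of the build command remains the more reliable indicator.

```shell
# Succeed when a saved build log contains a success banner.
# The pattern "BUILD SUCCESS" also matches Gradle's "BUILD SUCCESSFUL" line.
build_succeeded() {
  grep -q "BUILD SUCCESS" "$1"
}

# Typical use (not run here): mvn clean install 2>&1 | tee build.log
if [ -f build.log ]; then
  if build_succeeded build.log; then
    echo "build ok"
  else
    echo "build failed or incomplete"
  fi
fi
```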
Installing Egeria¶
Change to the top-level egeria directory where your local copy of egeria.git was downloaded.
The Egeria build process creates the distribution files for Egeria in the open-metadata-distribution/open-metadata-assemblies project. To see its contents, after a full Gradle build completes, use the following cd command to change to its build/distributions directory:
cd open-metadata-distribution/open-metadata-assemblies/build/distributions
ls
The directory contains the distribution archive, a file named {{release}}-distribution.tar.gz, or egeria-4.1-distribution.tar.gz in this example.
egeria-4.1-distribution.tar.gz
Create a directory for the install and copy the tar file into it. The two commands shown below create an install directory in your home directory and then copy the Egeria distribution file into it.
mkdir ~/egeria-install
cp egeria*-distribution.tar.gz ~/egeria-install
The next command changes to the new directory.
cd ~/egeria-install
It is now possible to unpack the tar file with the following steps.
gunzip egeria*-distribution.tar.gz
tar -xf egeria*-distribution.tar
This creates a new directory named {{release}}-distribution.tar.gz, or egeria-4.1-distribution.tar.gz in this example. Change to this new directory and list its contents as shown below.
cd egeria*gz
ls
LICENSE content-packs samples user-interface
NOTICE keystore.p12 server utilities
conformance-suite sample-data truststore.p12
As before, you may notice different files as Egeria evolves.
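The copy and unpack steps above can be collected into one function. This is a sketch under stated assumptions: install_distribution is a hypothetical helper, and it expects exactly one egeria*-distribution.tar.gz file in the source directory. It decompresses before extracting, as the steps above do, because the unpacked directory has the same name as the compressed archive.

```shell
# Copy the distribution archive from $1 into install directory $2 and unpack it.
# Assumes exactly one egeria*-distribution.tar.gz file exists in $1.
install_distribution() {
  src="$1"
  dest="$2"
  mkdir -p "$dest" || return 1
  cp "$src"/egeria*-distribution.tar.gz "$dest"/ || return 1
  (
    cd "$dest" || exit 1
    # Decompress first, then extract, mirroring the two steps above.
    gunzip egeria*-distribution.tar.gz && tar -xf egeria*-distribution.tar
  )
}

# Example call, guarded so that nothing happens outside a built egeria tree.
dist=open-metadata-distribution/open-metadata-assemblies/build/distributions
if [ -d "$dist" ]; then
  install_distribution "$dist" "$HOME/egeria-install"
fi
```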
Under server is a directory for the OMAG Server Platform that is used to run open metadata and governance services. This is the server-chassis-spring-4.1.jar.
ls server
lib server-chassis-spring-4.1.jar
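This tutorial does not show the platform's startup command, only that it accepts a -Dloader.path={directoryName} option. As a hedged sketch, the function below prints one plausible form of that command, on the assumption that the jar is started with java -jar and that the lib directory is used as the loader path. platform_start_command is a hypothetical helper; see the OMAG Server Platform documentation for the supported way to start the platform.

```shell
# Print a possible startup command for the platform jar in server directory $1.
# Assumption: the jar is started with java -jar, with lib as the loader path.
platform_start_command() {
  for jar in "$1"/server-chassis-spring-*.jar; do
    if [ -f "$jar" ]; then
      echo "java -Dloader.path=$1/lib -jar $jar"
      return 0
    fi
  done
  echo "no server-chassis-spring jar found in $1" >&2
  return 1
}

# Only print a command when run from an unpacked distribution directory.
if [ -d server ]; then
  platform_start_command server
fi
```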
The lib directory is where the jar files for connectors, samples and new registered services are installed. The initial list includes the connectors that are located in the egeria.git repository.
ls server/lib
audit-log-console-connector-4.1.jar
audit-log-event-topic-connector-4.1.jar
audit-log-file-connector-4.1.jar
audit-log-slf4j-connector-4.1.jar
avro-file-connector-4.1.jar
basic-file-connector-4.1.jar
cohort-registry-file-store-connector-4.1.jar
configuration-encrypted-file-store-connector-4.1.jar
configuration-file-store-connector-4.1.jar
csv-file-connector-4.1.jar
data-folder-connector-4.1.jar
discovery-service-connectors-4.1.jar
dynamic-archiver-connectors-4.1.jar
elasticsearch-integration-connector-4.1.jar
files-integration-connectors-4.1.jar
governance-action-connectors-4.1.jar
governance-services-sample-4.1.jar
graph-repository-connector-jar-with-dependencies-4.1.jar
inmemory-open-metadata-topic-connector-4.1.jar
inmemory-repository-connector-4.1.jar
kafka-integration-connector-4.1.jar
kafka-open-metadata-topic-connector-4.1.jar
omrs-rest-repository-connector-4.1.jar
open-lineage-janus-connector-4.1.jar
open-metadata-archive-directory-connector-4.1.jar
open-metadata-archive-file-connector-4.1.jar
open-metadata-security-samples-4.1.jar
openapi-integration-connector-4.1.jar
openlineage-integration-connectors-4.1.jar
spring-rest-client-connector-4.1.jar
Copy the jar files for any additional connectors you want to use into the lib directory. The connectors available for Egeria are listed in the Connector Catalog.
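Copying a connector into place can be wrapped with a couple of sanity checks. In this sketch, add_connector is a hypothetical helper and my-connector.jar is a placeholder file name, not a real Egeria connector.

```shell
# Copy a connector jar ($1) into the lib directory of an unpacked install ($2).
add_connector() {
  jar="$1"
  lib_dir="$2/server/lib"
  [ -f "$jar" ] || { echo "not a file: $jar" >&2; return 1; }
  [ -d "$lib_dir" ] || { echo "no lib directory at $lib_dir" >&2; return 1; }
  cp "$jar" "$lib_dir"/
}

# Example (placeholder names), run from the directory holding the connector jar:
#   add_connector my-connector.jar ~/egeria-install/egeria-4.1-distribution.tar.gz
```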
The content-packs directory contains Open Metadata Archives that provide sample open metadata content. The README.md describes their content.
ls content-packs
CloudInformationModel.json DataStoreConnectorTypes.json
CocoBusinessSystemsArchive.json OpenConnectorsArchive.json
CocoClinicalTrialsTemplatesArchive.json OpenMetadataTypes.json
CocoComboArchive.json README.md
CocoGovernanceEngineDefinitionsArchive.json SimpleAPICatalog.json
CocoGovernanceProgramArchive.json SimpleDataCatalog.json
CocoOrganizationArchive.json SimpleEventCatalog.json
CocoSustainabilityArchive.json SimpleGovernanceCatalog.json
CocoTypesArchive.json
The sample-data directory contains sample data that is used in various labs and samples.
ls sample-data/*
sample-data/oak-dene-drop-foot-weekly-measurements:
week1.csv week3.csv week5.csv week7.csv week9.csv
week2.csv week4.csv week6.csv week8.csv
sample-data/old-market-drop-foot-weekly-measurements:
week1.csv week3.csv week5.csv week7.csv week9.csv
week2.csv week4.csv week6.csv week8.csv
What next?¶
This is the end of the Downloading and Building Egeria Tutorial. You are now ready to learn about the OMAG Server Platform.
Alternatively ...
- Run the open metadata labs to get experience with using Egeria or
- Learn about developing extensions to Egeria or
- Learn how to make a contribution to Egeria