0220 Files and Folders¶
A metadata catalog typically contains information about the data files that can be processed and their location. Files and folders describe physical files and how they are organized on the file system.
DataFile¶
DataFile catalogs a physical file. It inherits from DataStore to declare that it is a physical artifact. There are subtypes of DataFile that identify the format of the file:
CSVFile contains comma-separated values.
AvroFile is organized according to the Apache Avro specification.
JSONFile is encoded using JavaScript Object Notation (JSON).
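The subtype relationship can be pictured as a small class hierarchy. The sketch below is illustrative only: the Python classes mirror the type names above, but the extension-to-type mapping and the classify helper are assumptions made for this example and are not part of the type definitions.

```python
from pathlib import PurePosixPath


class DataStore:
    """A physical artifact that holds data."""

    def __init__(self, path_name: str):
        self.path_name = path_name


class DataFile(DataStore):
    """A physical file; used when no more specific format is known."""


class CSVFile(DataFile):
    """A file containing comma-separated values."""


class AvroFile(DataFile):
    """A file organized according to the Apache Avro specification."""


class JSONFile(DataFile):
    """A file encoded using JavaScript Object Notation (JSON)."""


# Hypothetical mapping from file extension to subtype (an assumption of this sketch).
SUBTYPE_BY_EXTENSION = {".csv": CSVFile, ".avro": AvroFile, ".json": JSONFile}


def classify(path_name: str) -> DataFile:
    """Return the most specific DataFile subtype for the path, falling back to DataFile."""
    suffix = PurePosixPath(path_name).suffix.lower()
    return SUBTYPE_BY_EXTENSION.get(suffix, DataFile)(path_name)
```

Because every subtype is still a DataFile (and therefore a DataStore), code that only needs to know that something is a physical file can ignore the format, while format-aware code can test for the specific subtype.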
FileFolder¶
A FileFolder represents a folder or directory used to group related files together.
FolderHierarchy¶
FolderHierarchy links FileFolder elements together to show a hierarchical organization.
NestedFile¶
NestedFile links a file to a folder.
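To see how FolderHierarchy and NestedFile fit together, the following sketch derives the relationships implied by a single file path. It is a minimal illustration, assuming POSIX-style path names; the relationships_for function and the (type, parent, child) triple layout are inventions of this example rather than an existing API.

```python
from pathlib import PurePosixPath


def relationships_for(file_path: str) -> list[tuple[str, str, str]]:
    """Return (relationship type, parent, child) triples for one file path."""
    path = PurePosixPath(file_path)
    # Folders from outermost to innermost, skipping the root "/" (its name is empty).
    folders = [str(p) for p in list(path.parents)[::-1] if p.name]
    # Each folder is linked to the folder directly inside it.
    triples = [
        ("FolderHierarchy", parent, child)
        for parent, child in zip(folders, folders[1:])
    ]
    # The innermost folder is linked to the file itself.
    if folders:
        triples.append(("NestedFile", folders[-1], str(path)))
    return triples
```

For the path /data/landing/orders.csv this yields one FolderHierarchy triple from /data to /data/landing and one NestedFile triple from /data/landing to the file.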
LinkedFile¶
Files can also have a symbolic link (LinkedFile) to a folder to show that they logically belong with the other content in that folder.
DataFolder¶
DataFolder is a special case of FileFolder for cataloguing directories that contain a collection of data. The files and nested folders within it collectively make up the data content. They are not individually catalogued.
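The effect of treating a directory as a DataFolder can be sketched as a cataloguing rule: once a path passes through a DataFolder, nothing beneath it gets its own catalog entry. The example below is a rough illustration under stated assumptions. The catalog_entries function is hypothetical, paths are POSIX-style, and the caller decides which directories count as data folders.

```python
from pathlib import PurePosixPath


def catalog_entries(file_paths, data_folders):
    """Return sorted (type name, path) pairs to catalog for the given files.

    data_folders is the set of directories to treat as DataFolder; choosing
    them is left to the caller in this sketch.
    """
    data_folder_paths = {PurePosixPath(d) for d in data_folders}
    entries = set()
    for file_path in file_paths:
        path = PurePosixPath(file_path)
        # Ancestors from outermost to innermost, skipping the root "/".
        ancestors = [p for p in list(path.parents)[::-1] if p.name]
        for folder in ancestors:
            if folder in data_folder_paths:
                # Stop here: content inside a DataFolder is not catalogued individually.
                entries.add(("DataFolder", str(folder)))
                break
            entries.add(("FileFolder", str(folder)))
        else:
            # No DataFolder on the way down, so the file gets its own entry.
            entries.add(("DataFile", str(path)))
    return sorted(entries)
```

With /lake/sales.parquet declared as a data folder, the part files inside it produce a single DataFolder entry, while /lake/raw/orders.csv is catalogued as a DataFile beneath ordinary FileFolder entries.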
Hierarchical file structures¶
The diagram below illustrates the structure of a file system.
The FileSystem is typically a Software Capability. The root folders (of type FileFolder) are connected to it using the ServerAssetUse relationship. Beneath that are FileFolder elements with DataFile elements nested beneath them.
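The structure described here can be approximated as a list of relationships that is walked from the file system downwards. This is a simplified sketch: the element names (nfs01, /data and so on) are made up, every relationship is reduced to a parent-to-child triple, and the render function is a hypothetical helper rather than part of any library.

```python
from collections import defaultdict

# (relationship type, parent, child) triples; all element names are invented for this sketch.
RELATIONSHIPS = [
    ("ServerAssetUse", "FileSystem: nfs01", "FileFolder: /data"),
    ("FolderHierarchy", "FileFolder: /data", "FileFolder: /data/landing"),
    ("NestedFile", "FileFolder: /data/landing", "DataFile: orders.csv"),
    ("NestedFile", "FileFolder: /data", "DataFile: readme.json"),
]


def render(root, relationships):
    """Return the structure beneath root as indented lines, depth first."""
    children = defaultdict(list)
    for rel_type, parent, child in relationships:
        children[parent].append((rel_type, child))

    lines = [root]

    def walk(element, depth):
        # Visit children in the order their relationships were listed.
        for rel_type, child in children.get(element, []):
            lines.append(f"{'  ' * depth}{child} (via {rel_type})")
            walk(child, depth + 1)

    walk(root, 1)
    return lines
```

Calling render("FileSystem: nfs01", RELATIONSHIPS) lists the root folder first (reached through ServerAssetUse), then the nested folder and files, each indented one level deeper than its parent.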