Upgrade the role of the Manifest
Status
- Initiated
- Proposed
- Trialing
- Under review
- Approved
- Retired
Context & Scope
There is a lot of debate about the necessity and role of the ingestion Manifest. Most Manifest examples seen in R1 and R2 carry information that can be extracted from the data files being ingested. This has led to the perception that Manifests are largely redundant and in many cases can be replaced by parsing the data during ingestion.
However, while the existing examples seem superfluous, there is a real need to automate ingestion as much as possible. To do this, we should upgrade the purpose of the Manifest to be a well-formed description of an ingestion job, providing:
- A complete list of all elements that will be involved in the ingestion workflow
- Semantic relationships among these elements that:
  - inform the ingestion workflow itself; and/or
  - are captured in the metadata to support referential integrity
- Additional metadata to supplement the content that can be extracted from these elements:
  - business properties such as company, contract, location, etc.
  - system properties such as entitlement
- Guidance on dealing with exceptions
By doing so, we can remove a good deal of the ambiguity and complexity from the ingestion process itself.
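As an illustration only, the contents listed above could be modeled roughly as follows. This is a minimal sketch, not a proposed schema: every class, field, and function name here (`Element`, `Relationship`, `Manifest`, `on_error`, `dangling_references`) and every sample value is a hypothetical chosen for the example. It also shows one way declared relationships could support referential integrity, by checking that every relationship points at an element the Manifest actually lists.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Element:
    element_id: str  # hypothetical identifier, unique within the manifest
    path: str        # hypothetical location of the item to ingest


@dataclass
class Relationship:
    source: str  # element_id of the referring element
    target: str  # element_id of the referenced element
    kind: str    # hypothetical label for the semantic relationship


@dataclass
class Manifest:
    elements: List[Element]
    relationships: List[Relationship] = field(default_factory=list)
    business_properties: Dict[str, str] = field(default_factory=dict)
    system_properties: Dict[str, str] = field(default_factory=dict)
    on_error: str = "abort"  # hypothetical exception guidance


def dangling_references(manifest: Manifest) -> List[str]:
    """Return ids used in relationships that are missing from the element list."""
    known = {e.element_id for e in manifest.elements}
    missing: List[str] = []
    for rel in manifest.relationships:
        for ref in (rel.source, rel.target):
            if ref not in known and ref not in missing:
                missing.append(ref)
    return missing


# Example: the second relationship refers to an element that is not listed.
manifest = Manifest(
    elements=[Element("log-1", "wells/log-1.las"), Element("well-1", "wells/well-1.json")],
    relationships=[
        Relationship("log-1", "well-1", "belongs_to"),
        Relationship("log-1", "traj-9", "uses"),
    ],
    business_properties={"company": "ExampleCo"},
    system_properties={"entitlement": "viewers-group"},
)
print(dangling_references(manifest))  # prints ['traj-9']
```

A check like this would run before ingestion starts, so an incomplete job description is rejected up front rather than discovered partway through the workflow.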
Decision
Invest in the Manifest definition and support in the ingestion framework to achieve the goals defined above.
Rationale
Providing structured information to the ingestion framework reduces complexity in the services and framework itself. The ingestion process no longer has to infer intention. This allows us to allocate data loading requirements to pre-ingestion (completing the manifest) and ingestion (interpreting and honoring the manifest).
The Manifest itself does not have to exist as a file; it can be generated and passed directly to the Ingestion Service as well-formed JSON.
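A minimal sketch of the file-less case, assuming a hypothetical payload layout: the Manifest is assembled in memory, serialized with the standard `json` module, and handed to a caller-supplied transport. The key names (`elements`, `businessProperties`), the function names, and the `send` callable are assumptions for illustration; the actual Ingestion Service interface is not described here.

```python
import json


def build_manifest_json(element_paths, company):
    """Assemble a manifest in memory and serialize it; no file is written."""
    manifest = {
        "elements": [
            {"id": f"element-{i}", "path": path}
            for i, path in enumerate(element_paths, start=1)
        ],
        "businessProperties": {"company": company},  # hypothetical key name
    }
    return json.dumps(manifest, sort_keys=True)


def submit(manifest_json, send):
    """Confirm the payload parses as JSON, then pass it to the transport."""
    json.loads(manifest_json)  # raises json.JSONDecodeError if malformed
    return send(manifest_json)


# Example: a list stands in for the real service call.
payload = build_manifest_json(["a.csv", "b.csv"], "ExampleCo")
received = []
submit(payload, received.append)
```

Passing the transport in as a parameter keeps the sketch independent of how the service is actually reached.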
Consequences
We need to be cautious about loading too many requirements into the Manifest, since its purpose is to ease ingestion rather than complicate it.