For business context, videos, slides, OSDU schemas, and FAQs about the RAFS DDMS see [this Member Gitlab wiki](https://gitlab.opengroup.org/osdu/subcommittees/data-def/projects/RAFSDDMSDEV/home/-/wikis/home).
For context for this code repository, technical documentation, and technical tutorials see [this README](https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/rock-and-fluid-sample/rafs-ddms-services/-/blob/main/README.md).
## Rock and Fluid Sample DDMS Development Home Wiki
## Page Overview
This wiki article provides an overview of and context for the Rock and Fluid Sample (RAFS) DDMS in the OSDU Data Platform, and answers frequently asked questions.
It includes information and links for technical, business, and data management stakeholders.
Full Milestone release notes, covering all of OSDU and not just the RAFS DDMS, are [linked here](https://community.opengroup.org/osdu/governance/project-management-committee/-/wikis/Release-Strategy).
## Helpful Links and Documents
### Data Definitions - RAFS Rock/Petrophysics and RAFS Fluids
- [Rock and Fluid Data Definitions Issue Backlog](https://gitlab.opengroup.org/osdu/subcommittees/data-def/projects/Petrophysics/home/-/boards)
- [Samples and Sample Analysis Worked Examples and data model explanation](https://community.opengroup.org/osdu/data/data-definitions/-/tree/master/Examples/WorkedExamples/SampleAnalyses?ref_type=heads)
- [DDMS Architectural Decision Record (ADR) #2](https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/home/-/issues/2)
### RAFS DDMS Data Model - Content Schemas
#### Modeling the RAFS DDMS Content Schemas
In data modeling, there are various options: normalized vs. denormalized, hard-typed vs. soft-typed, and more.
The RAFS DDMS content schemas have been designed intentionally to support the goals of OSDU. Rather than taking an extremely soft-typed approach, where each data loader and app can define its own attributes, the schemas follow these principles:
1. Each expected attribute in a given dataset is explicitly identified and has defined qualifiers. This is expected to help dramatically with interoperability -- different apps and companies have the same way to find the same data.
2. Each attribute that expects values that are common in the industry has an OSDU-governed reference list, to achieve the same goals as the catalog-schema reference lists -- different apps and companies have the same semantics to describe the same data.
3. Content schemas are fairly normalized, storing each value only one time. This reduces storage and reduces the probability of inconsistent duplicated data.
While there is some sacrifice in flexibility, extensibility, and ease of ingestion, the benefit of this approach is structured data that is better governed, better defined, higher quality, and more interoperable.
While the following review is not required by OSDU Forum governance in 2024, the DDMS content schemas have been intentionally reviewed with subsurface SMEs in OSDU Data Definitions meetings to validate them.
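The three design principles above can be illustrated with a minimal sketch. It is not the RAFS DDMS implementation: the attribute names, the unit qualifier, and the reference values below are all hypothetical, chosen only to show what "hard-typed with governed reference lists" means in practice.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

# Hypothetical reference list; real lists are governed by OSDU Data Definitions.
POROSITY_METHODS = frozenset({"HeliumInjection", "MercuryInjection", "NMR"})


@dataclass(frozen=True)
class AttributeSpec:
    """One explicitly declared attribute of a (hypothetical) content schema."""
    name: str
    unit: Optional[str] = None                 # qualifier: expected unit of measure
    allowed: Optional[FrozenSet[str]] = None   # governed reference values, if any


# Hard-typed schema: every expected attribute is declared up front.
SCHEMA = (
    AttributeSpec("SampleID"),
    AttributeSpec("Porosity", unit="%"),
    AttributeSpec("PorosityMethod", allowed=POROSITY_METHODS),
)


def validate(row: dict) -> List[str]:
    """Return a list of problems; an empty list means the row conforms."""
    known = {spec.name for spec in SCHEMA}
    # Soft-typed extras are rejected: loaders cannot invent their own attributes.
    problems = [f"undeclared attribute: {key}" for key in row if key not in known]
    for spec in SCHEMA:
        if spec.name not in row:
            problems.append(f"missing attribute: {spec.name}")
        elif spec.allowed is not None and row[spec.name] not in spec.allowed:
            problems.append(f"{spec.name}: value not in reference list")
    return problems
```

Because every loader is held to the same declared attributes and reference values, two applications reading the same dataset can rely on the same names and semantics.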
#### Samples and Sample Analysis Content Schemas
- **RAFS DDMS Content Schemas and Reference Values (live and draft)**
  * **JSON DDMS content schemas - to be used:** https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/rock-and-fluid-sample/rafs-ddms-services/-/tree/main/app/models/data_schemas/jsonschema
  * **DDMS content schemas - Excel format for review:** https://gitlab.opengroup.org/osdu/subcommittees/data-def/projects/RAFSDDMSDEV/docs/-/tree/main/Design%20Documents/ProposalSheets?ref_type=heads
  * **JSON DDMS reference value manifests - drafts** (not yet approved by Data Definitions): https://gitlab.opengroup.org/osdu/subcommittees/data-def/projects/RAFSDDMSDEV/docs/-/tree/main/Design%20Documents/ReferenceValues/Manifests/reference-data
#### Fluid Model-related Content Schemas (not yet added as of Jan. 2025)
* **EoS/PVT Model WPCs - drafts -** [Proposed WPCs](https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/rock-and-fluid-sample/rafs-ddms-services/-/tree/main/deployments/shared-schemas/rafsddms/work-product-component/PVTModel?ref_type=heads)
  * Black Oil Table
  * Component Scenario
  * MPFM Calibration _PVT Model / Equations of State_
* **EoS/PVT Model content - drafts -** [Proposed content schemas](https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/rock-and-fluid-sample/rafs-ddms-services/-/tree/main/app/models/data_schemas/pvt_model/jsonschema)
- **Temporary JSON WKS/Catalog Schemas (e.g. WPCs and Reference Data) \[note: These are now replaced by published WKSs\]:** https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/rock-and-fluid-sample/rafs-ddms-services/-/tree/main/deployments/shared-schemas/rafsddms
### General OSDU DDMS Documentation - What is a DDMS?
Please join us on the [#cap-rock_and_fluid_sample_ddms_development](https://opensdu.slack.com/archives/C059AP6NX7T) Slack channel.
## The Rock and Fluid Sample Business Domain
Rock or Fluid samples can just as easily be called "specimens". They are physical specimens extracted from their natural location and then observed, analyzed, and/or described by various methods in order to gain insight about the reservoir. Sometimes the analysis results are further analyzed, abstracted, and modelled to answer certain business questions.
Rock Sample Data and Fluid Sample Data can be characterized as:
- Highly complex with multiple dimensions
- Highly valuable to reservoir characterization
- Highly variable in data structure and naming across the industry
- Integrated – different disciplines use these samples and this data for different but complementary purposes
Rock Sample data includes, but is not limited to:
- Core physical sample data (conventional core and sidewall core)
- Rock cuttings physical sample data
- Rock sample analysis reports of various kinds (Routine/RCA, Special/SCAL, Geomechanics, etc.)
- Petrography (core descriptions, etc.)
- Depth shifts
- Logs (e.g. gamma) run on core
- Specialized photography, much of which is depth calibrated (core box photography, core image logs, thin section photos)
Fluid Sample data includes, but is not limited to:
- Physical sample information
- PVT (pressure, volume, temperature)
- SRA (source rock analysis)
- SARA (saturates, aromatics, resins, and asphaltenes)
- Gas Composition
- Bulk Oil Properties
- Gas Chromatography
- Mass Spectroscopy
To view the actual content schemas available at the time of reading, please see the "RAFS DDMS Data Model - Content Schemas" section above.
## The Need for and Role of the RAFS DDMS
The OSDU well-known/canonical schemas (WKSs) capture metadata for general search and discovery – the “data context”. Work-product-components (WPCs) are designed to capture metadata and point to a source file, not to capture and structure all the “data content”. WPCs, like other WKSs, are indexed by the Indexer Service, which uses open-source Elasticsearch. This is good for some practical applications, but there are more performant options for managing array data.
At the time of the initial proposal (Dec. 2022), numerous OSDU members independently recognized the need to additionally structure the data content/array data, and a DDMS is the OSDU Forum-endorsed solution for this domain-specific need.
## Capabilities Enabled by the RAFS DDMS
The Rock and Fluid Sample (RAFS) DDMS enables the OSDU platform to:
- Store the array data content in a way that is optimized for indexing, search, filtering, and visualization
- Store/structure the array data content in a standard format and definition
- Link structured data content back to its appropriate Master and WPC records (“data context”)
## DDMS Expected Users
### Technical Users of the DDMS include:
- Data Analysts
- Developers
- Solution Architects
- Data Engineers
- Data Managers/Stewards
### Business Users (with their software applications) who benefit from the DDMS include:
- Data Managers/Stewards
- Petrophysicists
- Petrologists
- Petrographers
- Geochemists
- Geologists
- Reservoir Modelers
- Reservoir Engineers
- Flow Assurance Engineers
- Facilities Engineers
- Lab Inventory Professionals
- Lab Analysis Professionals
## RAFS DDMS Architecture
The preliminary/prototype architecture:
- is based on the Wellbore DDMS architecture, including lessons learned that led to the upgrade to the Wellbore DDMS v3, such as flexible filtering and range-querying
- uses Apache Parquet (like the Wellbore DDMS) for optimized storing and indexing
Note, however, some of the differences from the Wellbore DDMS (as of Aug. 2023):
1. ~~The RAFS DDMS uses the "Dataset" group-type to manage the content~~ (This approach was changed in Jan. 2025.)
2. The RAFS DDMS uses the Register Service to resolve the link to the content from the WPC to the DDMS
Note also that, in principle, RAFS DDMS architects as of Aug. 2023 are open to any storage backend format that is justified for the requirements of the data being handled, not only Parquet.
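To give a feel for why a columnar layout suits filtering and range-querying, here is a minimal sketch in plain Python. Lists stand in for Parquet column chunks; the column names, values, and the `range_query` helper are hypothetical and do not reflect the RAFS DDMS API or its actual storage code.

```python
from typing import Dict, List

# Hypothetical analysis dataset stored column by column, the way a columnar
# format such as Parquet lays data out (plain lists stand in for it here).
columns: Dict[str, List[float]] = {
    "Depth": [2500.0, 2500.5, 2501.0, 2501.5],
    "Porosity": [11.2, 14.8, 9.6, 15.3],
}


def range_query(cols: Dict[str, List[float]], column: str,
                low: float, high: float) -> List[Dict[str, float]]:
    """Return the rows whose `column` value lies in the closed range [low, high]."""
    # Only the filtered column is scanned to find the matching row indices.
    hits = [i for i, value in enumerate(cols[column]) if low <= value <= high]
    # The other columns are then read only at those indices.
    return [{name: values[i] for name, values in cols.items()} for i in hits]
```

For example, `range_query(columns, "Porosity", 12.0, 16.0)` returns the two rows at depths 2500.5 and 2501.5. A real columnar store adds per-chunk statistics and compression on top of this basic idea.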
For more technical and architectural documentation, please follow relevant links above.
## Original Contribution and Development in the Forum
In order to progress this solution, ExxonMobil, with EPAM, have:
- Set up a PMC project
- Submitted prototype code and architectural documentation for review and approval by the PMC and EA, as appropriate (including [DDMS ADR #2](https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/home/-/issues/2), now approved)
- Collaborated with the appropriate Data Concepts/Definitions teams to align on which data modeling activity is solved by which OSDU solution and/or workstream, and to get SMEs aware of these efforts
Subsequent to this, as with any forum initiative, multiple companies and developers can contribute, coordinated through the appropriate OSDU PMC team. As of Aug. 2023, there are **recurring forum calls for this PMC Project** ([Thursdays, bi-weekly](https://www.opengroup.org/og_sduaggregatedcalendar)).
As of January 2025, the RAFS DDMS is expected to be fully released as a mature DDMS in the Milestone 25 (M25) OSDU release.
## Current Team Contacts (as of Dec. 2024)
- **PMC Project Lead** = Hadley Cooper (ExxonMobil)
## Frequently Asked Questions (FAQs)
### 1. Why are Rock and Fluid Samples combined to be treated in a single DDMS?
Subject matter experts in the OSDU Forum, including petrophysicists and geochemists, have observed that these domains overlap and recommended (in 2022-2023) that they be modeled together. For example, certain analysis types that are applied to rocks are also applied to fluids. Similarly, fluids are sometimes extracted from rocks for the purpose of analysis.
### 2. Why are Rock and Fluid Samples not incorporated into the Wellbore DDMS?
1. [In OSDU](https://community.opengroup.org/osdu/documentation/-/wikis/OSDU-(C)/Reference-Architecture/Functional-Architecture/Data), a DMS (aka CSS), which is defined by the shape of data, is different from a DDMS, which is optimized for a specific business domain
2. Even if the "shape of data" (e.g. array data, at least partially solved by Parquet) is similar to well logs and even re-usable, a separate DDMS is needed for a separate business domain
3. Rock and Fluid Sample Data have been identified as a separate domain from Wellbore by both the SMEs in the Samples Data Definitions teams (an [output of a January workshop, slide 6](https://opensdu.slack.com/archives/C028RLMJVJP/p1676385721498969)), and by the Domain Driven Design team ([spreadsheet](https://gitlab.opengroup.org/osdu/subcommittees/data-def/docs/-/blob/master/Design%20Documents/WIP/DomainDefinitions/Domains-and-datatypes-Registry.xls)). Reasons include: Not all rock or fluid samples come from wellbores, and the Rock and Fluid Domain deals mostly with multi-faceted analysis on physical specimens rather than measurements along a wellbore
4. It is broadly recognized that Domains have inter-dependencies. OSDU must develop a methodology for domain-and-DDMS interactions rather than merging domains into a single DDMS when inter-dependencies are observed.
### 3. Does the RAFS DDMS support search across analysis datasets, such as "find all porosity, whether it came from Routine Core Analysis, NMR, or Mercury Injection"?
In its early MVP state, no: the RAFS DDMS does not support this search across datasets. It supports only optimized content capture, search, filtering, and retrieval _by_ analysis set. However, as of Aug. 2023, this has been recognized as a key future capability of the DDMS, and some proposals have already been made for new endpoints that achieve this kind of performant search on the data content in the RAFS DDMS. Some of these approaches are discussed in the forum presentations (see Helpful Links above), but specific designs and execution will need to be worked out through the PMC project team's recurring calls (see mention above).
### 4. Are the OSDU Data Definitions Schemas (Well Known Schemas / WKS) ready, in order to support the RAFS DDMS?
As of December 2024, most of the WKSs are ready. There is, however, some clean-up to be done to deprecate old schemas, and consolidate certain reference lists. That clean-up is captured in [this Gitlab Issue (Petrophysics 62)](https://gitlab.opengroup.org/osdu/subcommittees/data-def/projects/Petrophysics/home/-/issues/62).
More explanation can be found in Data Definitions Documentation: [SampleAnalyses Worked Examples](https://gitlab.opengroup.org/osdu/subcommittees/data-def/work-products/schema/-/tree/master/Examples/WorkedExamples/SampleAnalyses?ref_type=heads)
**M25** M25 will be released in January 2025.
The forum plan is both to have the WKSs cleaned up, and the RAFS DDMS graduated in M25, fully using the OSDU WKS schemas and mature content schemas.
**M21 History** The following schemas were [shipped with M21](https://community.opengroup.org/osdu/governance/project-management-committee/-/wikis/M21-Release-Notes):
- Sample (master) (note: This is a placeholder in-progress. Will be matured in M22)
- SamplesAnalysesReport (WPC)
- SamplesAnalysis (WPC)
- SampleAnalysisType (reference) (note: These 3 reference lists are only partial; Will be matured with M22)
- SampleAnalysisFamily (reference)
- SampleAnalysisSubFamily (reference)
- SampleAnalysisCategoryTag (reference - no forum-published values)
The [proposed ERD is here](https://gitlab.opengroup.org/osdu/subcommittees/data-def/projects/Petrophysics/docs/-/blob/master/Design%20Documents/UnifiedSample_brainstorming_proposals/RockFluidSampleUnifiedWKS_OSDU_ERD_2024Jan25.drawio.pdf?ref_type=heads) (updated Jan. 2024), which the Data Definitions Petrophysics and Fluids teams worked toward.
The "unified sample model" represented by that ERD will deprecate large portions of the previously published Rock-centric schemas. In 2022, OSDU published several Rock Sample-based schemas, including RockSample, Coring, RockSampleAnalysis, and RockImage. In late 2022 and early 2023, the Samples-and-Petrophysics and Fluid Samples data definitions workgroups decided to create a unified WKS model that accommodates both Rock and Fluids. In addition to the M21 components mentioned, they hope to have most of this model released in OSDU release M22.
In the absence of published, unified WKSs, RAFS DDMS work did proceed with custom schemas and reference lists as needed. (See Temporary Catalog Schemas link above.) After Data Definitions publishes the official data model, the RAFS DDMS development team will quickly adopt it.
### 5. What is the difference between version 1 and version 2 of the RAFS DDMS endpoints?
In version 1, each analysis-centric endpoint represented one analysis type (e.g. RCA, CCE). Version 2 instead follows the Data Definitions approach: the analysis type is determined by a reference value (SampleAnalysisType), and the DDMS endpoint is simply SampleAnalysis. Depending on which SampleAnalysisType reference value is provided, a specific content schema and content validation are applied.
As of April 2024, version 1 is being deprecated.
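The version 2 idea, a single entry point that selects its content validation from the SampleAnalysisType value, can be sketched as follows. This is an illustration only: the type names, required columns, and function names are hypothetical and are not taken from the RAFS DDMS code base.

```python
from typing import Callable, Dict, List

Validator = Callable[[dict], List[str]]


def _require(columns: List[str]) -> Validator:
    """Build a validator that reports any required column missing from the content."""
    def check(content: dict) -> List[str]:
        return [f"missing column: {c}" for c in columns if c not in content]
    return check


# Hypothetical mapping from a SampleAnalysisType reference value to the
# content validation used for that type.
CONTENT_VALIDATORS: Dict[str, Validator] = {
    "RoutineCoreAnalysis": _require(["SampleID", "Porosity", "Permeability"]),
    "ConstantCompositionExpansion": _require(["SampleID", "Pressure", "RelativeVolume"]),
}


def validate_sample_analysis(analysis_type: str, content: dict) -> List[str]:
    """Single entry point: the reference value decides which validation runs."""
    try:
        validator = CONTENT_VALIDATORS[analysis_type]
    except KeyError:
        raise ValueError(f"unsupported SampleAnalysisType: {analysis_type}") from None
    return validator(content)
```

With this shape, supporting a new analysis type means adding one entry to the mapping rather than adding a new endpoint, which is the practical difference from the version 1 approach.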
### 6. How does Energistics PRODML relate to this RAFS DDMS?
PRODML was used in several ways in the Rock and Fluid Sample domain in OSDU. Some pertain to the RAFS DDMS content schemas; others pertain to broader OSDU Data Definitions (well known schemas). These two categories overlap, of course.
**PRODML inspiration for PVT DDMS Content Schemas**
PRODML "FluidAnalysis" was used as the basis for the content schemas of PVT-types (CCE, DiffLib, etc.) in the RAFS DDMS.
The assumption is that PRODML would be parsed, and the data would be loaded in the RAFS DDMS API in a very similar format to its source PRODML format.
There are some key differences to be aware of:
1. PRODML is in XML format, whereas OSDU schemas, including the RAFS DDMS content schemas, are JSON-based
2. The two data types, SARA and High Temperature GC (HTGC), are _not_ under the "PVT" Family in OSDU data definitions, even though they were in the PVT category in PRODML. Therefore their content schemas in the DDMS might vary from PRODML as well.
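The "parse the XML, then load JSON-like content" assumption can be sketched with the Python standard library. The XML below is a hypothetical, heavily simplified stand-in; real PRODML FluidAnalysis documents use namespaces and a far richer structure, and the output keys are illustrative rather than the actual RAFS DDMS content schema.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, simplified XML; element and attribute names are NOT real PRODML.
XML = """
<FluidAnalysis>
  <Test kind="CCE">
    <Step><Pressure uom="psi">5000</Pressure><RelativeVolume>0.95</RelativeVolume></Step>
    <Step><Pressure uom="psi">4000</Pressure><RelativeVolume>0.98</RelativeVolume></Step>
  </Test>
</FluidAnalysis>
"""


def xml_to_content(xml_text: str) -> dict:
    """Convert the simplified XML into a JSON-serializable dict, step by step."""
    root = ET.fromstring(xml_text.strip())
    test = root.find("Test")
    steps = []
    for step in test.findall("Step"):
        pressure = step.find("Pressure")
        steps.append({
            # Keep the unit of measure alongside the value, as the source does.
            "Pressure": {"Value": float(pressure.text), "UnitOfMeasure": pressure.get("uom")},
            "RelativeVolume": float(step.find("RelativeVolume").text),
        })
    return {"AnalysisKind": test.get("kind"), "Steps": steps}


if __name__ == "__main__":
    print(json.dumps(xml_to_content(XML), indent=2))
```

The structure of the output mirrors the structure of the input, which is the point of basing the PVT content schemas on PRODML: a loader mostly changes the serialization, not the shape of the data.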
**PRODML inspiration for Well Known Schemas and reference lists**
The OSDU Data Definitions team has used elements of PRODML, such as AcquisitionJob, Acquisition, and Chain-of-Custody as inspirations for schemas in the unified sample model. Of course, in order to accommodate rock data in addition to fluid data, the schemas have been reviewed by the OSDU teams and modified from PRODML. For more on well known schemas, see FAQ #4.
### 7. What data or analysis types are currently supported by the RAFS DDMS?
A list of analysis types currently supported by the RAFS DDMS is available here: [Supported Analysis Types](https://gitlab.opengroup.org/osdu/subcommittees/data-def/projects/RAFSDDMSDEV/home/-/wikis/Supported-Analysis-Types)
### 8. Many Sample Analysis Types have a content structure that is distinct from others. How do I know how the WKS "SampleAnalysisType" correlates with the RAFS DDMS content schemas, which are governed by the DDMS itself?
There is a cross-reference here: [Supported Analysis Types](https://gitlab.opengroup.org/osdu/subcommittees/data-def/projects/RAFSDDMSDEV/home/-/wikis/Supported-Analysis-Types)
### 9. The RAFS DDMS has internal content schemas. How do those relate to the forum's Data Definitions "Well Known Schemas" (WKS)?
#### Reference lists
_Reference lists_ used by the RAFS DDMS are published in the OSDU Data Definitions repository. This is for several reasons:
1. Certain lists in the forum can and should be re-used
2. It allows clear and consistent governance, like the categories of Fixed, Open, and Local
3. It allows more visibility of those lists _without_ the risk of them being used purely independently of the RAFS DDMS
#### Group-type "content"?
There is a group-type designation of "content" in Data Definitions. Will the RAFS DDMS content schemas be published as that group-type?
No, RAFS DDMS content schemas will not be published under group-type "content". This is for several reasons:
1. Group-type "content" is intended for scenarios where (a) the forum wants to structure the content for datasets (in a way similar to the LAS semi-structured format for well logs) and (b) where there is no DDMS to manage the content schemas. Since the RAFS DDMS exists, using group-type "content" is not appropriate here.
2. Publishing the content schemas as group-type "content" would allow or encourage the use of these content schemas to store data content/bulk data for rock and fluid sample analysis _independently_ of the RAFS DDMS. But that is not the intention. In order to use the content schemas, the OSDU user should go through the RAFS DDMS.