Enterprise Deployment Concerns
Architecturally Significant Use Cases
Irrespective of the size or global spread of an enterprise, there are a few typical use cases for an OSDU deployment:
- Interactive web and mobile applications: Web applications in the early adoption phase are usually data query, navigation, and visualization apps. They require the ability to search and visualize data from the OSDU platform. Examples of visualization include map-based views, 2D and 3D seismic, wells, well logs and trajectories, data dashboards, and documents. Depending on whether the client is implemented as a thin or a fat client, the required data bandwidth to the client can differ. Overall, a user session could consume from tens of MBs to tens of GBs of data.
- Interactive desktop applications: These applications are consumed in two ways, either as a conventional desktop app or through a remote desktop connection to a VM running the app. In both cases, the available memory and processing power (CPU and GPU) are not constrained the way they are for web and mobile apps. What typically becomes an issue is network bandwidth and egress cost when the app runs on the user's machine, and latency when the app runs via remote desktop. Best practice is to run data-heavy desktop apps via remote desktop in the data center where the data resides. A meaningful user experience requires latency below 100 ms, which means that for this use case, data and VM should be colocated in a data center that is as close to the user as network latency allows.
- High-performance computing workflows: A typical high-performance application with a parallel HPC workload requires high-throughput shared data stores (file and object) and thousands of compute nodes. The application uses the local disk or file system on the nodes for scratch operations. Results from processing are persisted in a shared data store. Cost and performance mandate that the data and the compute are colocated.
- Analytics and machine learning workflows: OSDU data is used for analytics and machine learning workflows, ranging from visual analytics to training ML models and performing inference. Analytics and ML workflows represent a different data consumption pattern than an operational workflow. The data available in such analytics consumption backends is a wide-view, fit-for-purpose, normalized data set. Typically, one would not globally replicate the same footprint.
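The colocation rule for interactive desktop applications can be sketched as a small selection function. This is an illustration only: the region names and latency figures are hypothetical, and the sketch assumes the 100 ms figure from the text is treated as a hard limit.

```python
from typing import Optional

# Latency budget for a usable remote desktop session, taken from the text above.
MAX_LATENCY_MS = 100.0


def pick_region(latency_ms: dict[str, float], data_regions: set[str]) -> Optional[str]:
    """Return the lowest-latency region that holds the data and meets the budget.

    latency_ms   -- hypothetical measured round-trip latency from the user to each region
    data_regions -- regions where the required data (and therefore the VM) is available
    """
    candidates = {
        region: ms
        for region, ms in latency_ms.items()
        if region in data_regions and ms < MAX_LATENCY_MS
    }
    if not candidates:
        return None  # no colocated region is close enough for remote desktop
    return min(candidates, key=candidates.get)


if __name__ == "__main__":
    # Hypothetical measurements for a single user.
    measured = {"eu-west": 25.0, "us-east": 95.0, "ap-south": 180.0}
    print(pick_region(measured, {"us-east", "ap-south"}))
```

A `None` result signals that no region holding the data is close enough, which is the situation that motivates replicating data to a region nearer the user.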
Architecturally Significant Concerns
- Multi-regional support: Since many OSDU consumers are global, multinational companies, OSDU can satisfy the above use cases only if data is available in multiple locations where data and compute are colocated. Interactive workflows impose the additional requirement that the colocated data and compute be as close to the end users as possible.
- Multi-organization support: Many operator corporations consist of multiple legal entities. At the same time, SaaS providers offer OSDU to multiple corporations. In both cases, OSDU as a multi-tenant data platform offers cost efficiency through economies of scale. It also opens opportunities for joint ventures (another option being federated OSDU deployments).
- Multi-cloud support: Cloud providers compete on different aspects of their offerings (e.g., technology stack, available regions, availability of managed services). For an OSDU customer to leverage these competitive advantages, OSDU must support multi-cloud deployments.
- In-country support: Many OSDU members operate in countries with data residency constraints. For them, an in-country deployment is a necessity. This can be achieved by leveraging an available public cloud region, an appliance, a cloud edge, or an on-prem deployment. The same applies to operators whose policies require on-prem deployment.
- OSDU federation: A corporation can acquire another one, start a joint venture, or require a view over the data of its multiple legal entities. For this to be possible, data from one OSDU instance must be consumable in other instances. Exporting data from one instance and loading it into another is a possibility, but it disrupts the data lifecycle.
Any of the above concerns can independently influence the deployment topology of OSDU. For example, a federation could link two OSDU instances, one a single-region, on-prem deployment by an operator and the other a multi-region, multi-organization, multi-cloud deployment by a SaaS provider.
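Because the concerns vary independently, a topology can be described as a combination of separate dimensions. The following is a minimal sketch of that idea using the federation example; the `Deployment` and `Federation` classes and all names in it are hypothetical and are not part of any OSDU API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """One OSDU instance described along the concerns listed above (hypothetical model)."""
    name: str
    regions: tuple[str, ...]
    clouds: tuple[str, ...]          # ("on-prem",) stands for an on-premises deployment
    organizations: tuple[str, ...]

    @property
    def multi_region(self) -> bool:
        return len(self.regions) > 1

    @property
    def multi_cloud(self) -> bool:
        return len(self.clouds) > 1

    @property
    def multi_organization(self) -> bool:
        return len(self.organizations) > 1


@dataclass(frozen=True)
class Federation:
    """A set of instances whose data is consumable across instance boundaries."""
    members: tuple[Deployment, ...]


# The example from the text: an operator's single-region, on-prem instance
# federated with a SaaS provider's multi-region, multi-organization, multi-cloud instance.
operator = Deployment("operator", ("region-a",), ("on-prem",), ("operator-co",))
saas = Deployment(
    "saas",
    ("region-a", "region-b"),
    ("cloud-x", "cloud-y"),
    ("operator-co", "other-co"),
)
federation = Federation((operator, saas))
```

Each dimension is set independently, so any combination of single or multiple regions, clouds, and organizations can be expressed, and federation is layered on top of whole instances.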
Usual Deployment Topologies
Global Deployment (Single Organization, Single Public Cloud, Multi-Region)
From Alex's notes <- I feel this is more implementation than architecture. Will rephrase to reflect the requirements and then push implementation options to the design section.
OSDU Requirements as we understand them:
- Metadata, reference data, and master data are all replicated. Each region manages its own metadata and reference data, while master data is centrally managed and pushed to the regions.
- The search index is always local, since the other data is always replicated and locally available.
- Since search is local, it can be optimized to show “region-owned” data in context rather than showing all world data all the time. This helps the local business by surfacing the most relevant and important data.
- Actual data (raw data + files) is replicated as needed. Replication is a concern deferred to the cloud vendors, who are best suited to manage such purely infrastructure-oriented tasks.
See design and implementation considerations at Single Customer Multi-Region Design
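The requirements above can be read as a per-category replication policy plus a region-aware search ordering. The sketch below is one possible interpretation, not the OSDU implementation: the category keys, the `Hit` type, and the choice to rank region-owned records first (rather than filter out the rest) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Replication(Enum):
    ALWAYS = "always"        # replicated to every region
    ON_DEMAND = "on_demand"  # replicated as needed, delegated to the cloud vendor
    NEVER = "never"          # built and kept local to each region


# Policy per data category, as read from the requirements above (keys are hypothetical).
POLICY = {
    "metadata": Replication.ALWAYS,
    "reference": Replication.ALWAYS,
    "master": Replication.ALWAYS,
    "search_index": Replication.NEVER,
    "raw_data": Replication.ON_DEMAND,
    "files": Replication.ON_DEMAND,
}


@dataclass(frozen=True)
class Hit:
    """A search hit with the region that owns the record (hypothetical shape)."""
    record_id: str
    owner_region: str
    score: float


def rank_for_region(hits: list[Hit], local_region: str) -> list[Hit]:
    """Order hits so region-owned records come first, then by descending score."""
    # False sorts before True, so records owned by the local region lead the list.
    return sorted(hits, key=lambda h: (h.owner_region != local_region, -h.score))
```

For example, with hits owned by `"us"` and `"eu"`, calling `rank_for_region(hits, "eu")` lists all `"eu"` records first even when a `"us"` record has a higher score.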
SaaS Deployment Concerns
Multi-Organization (Multi-Customer / Single-Region)
The current OpenDES architecture supports multiple organizations with strong isolation at the data and resource level, without having to redeploy services for each one. This supports the following requirements:
- Supporting multiple customers with a single set of stateless service instances, which
  - simplifies deployment of new customers
  - simplifies updating the services, giving consistent behaviour across multiple customers
- Interesting explicit cross-organization sharing scenarios, such as
  - Data brokers
  - Joint ventures
See design comments at Isolated Multi Organization (Multi Customer) Support
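The idea of one stateless service set serving many isolated organizations can be sketched as a registry lookup on every request. This is a minimal illustration under stated assumptions: the `TenantRegistry` class, the resource fields, and the `x-tenant-id` header name are all hypothetical and are not taken from OpenDES.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantResources:
    """Isolated resources owned by one organization (all names are hypothetical)."""
    storage_bucket: str
    search_index: str


class TenantRegistry:
    """Maps a tenant identifier to that tenant's isolated resources."""

    def __init__(self) -> None:
        self._tenants: dict[str, TenantResources] = {}

    def onboard(self, tenant_id: str, resources: TenantResources) -> None:
        # Onboarding a customer is a registry update; the services are not redeployed.
        self._tenants[tenant_id] = resources

    def resolve(self, tenant_id: str) -> TenantResources:
        try:
            return self._tenants[tenant_id]
        except KeyError:
            raise PermissionError(f"unknown tenant: {tenant_id!r}") from None


def handle_request(registry: TenantRegistry, headers: dict[str, str]) -> str:
    """Stateless handler: everything tenant-specific is derived from the request."""
    tenant_id = headers.get("x-tenant-id", "")  # hypothetical header name
    resources = registry.resolve(tenant_id)
    return resources.storage_bucket
```

Because the handler keeps no per-tenant state, the same deployed service version serves every customer, and a request that names no known tenant is rejected before it touches any data.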
Multi-Cloud/Region - Single Control Plane
The proposed OpenDES architecture supports multiple deployments across different clouds and regions, managed under a single control plane.
Note: The common control plane intersects with the Global Deployment (Single Organization, Single Public Cloud, Multi-Region) topology.
See design comments at Multi-Cloud & Region - Single Control Plane
The data ecosystem architecture can be deployed to individual customers. The simplest deployment is to use one project that contains both the data (one partition) and the services.
In Country (On-Prem, Appliance or Cloud Edge)
Dell's Place to write an opinion