Policy Based Entitlements
Architectural principles stay at a fairly high and abstract level, and teams are often left to bridge the gap between those principles and the actual implementation detail. This is an attempt to bridge that gap with a set of Lightweight Architecture Decision Records (LADRs) that are simple to follow and that developers can implement within a given team or project.
Decision Title
Status
- Proposed
- Trialing
- Under review
- Approved
- Retired
Context & Scope
- Policies: Contextual grants of permission (often legal: contractual, residency, trade compliance). These are captured in Entitlements User Stories.
- Access Control: Explicit grants of permissions
- Identity: Consider related questions around Authentication
Scale of Problem
- For a single, large, multinational, enterprise-scale company
- Duration: tens of years
- Users: tens of thousands of users working within potentially thousands of organizational scopes
- Content: hundreds of millions of business objects, with new types being introduced on a continuous basis
Decision
- We will support complex policies that are evaluated dynamically.
- Work needs to be done to collect examples of these policies.
- We will review how to make these policies “expressible” from the perspective of an admin. The language of policy engines is powerful, but complex policies can look complex.
- We could use some help on the usability study because a LOT of policies will be unmanageable unless we think about the UI and how to organize them.
- We will look for an Open Source policy engine and replace the bespoke one that is in OpenDES.
- The evaluation will look at Usability and Expressiveness to make sure the engine meets the needs.
- The evaluation will look at Performance and Latency to determine:
- where in the architecture / services it should be called
- where we need to collect the contextual information (user, location, contract, …) so that this does not become a bottleneck.
- We will not remove the OpenDES support for storage level ACLs
- Doing so would break every deployment of the data ecosystem and force reloading customer data we don’t have access to.
- We will continue to support storage level ACLs for existing deployments and for OSDU customers who want additional security at the storage level.
- If companies don’t want to use ACLs as additional security, then they can use a single service account for all queries and do all the entitlement management through policy.
Rationale
Consequences
When to revisit
Tradeoff Analysis - Input to decision
Alternatives and implications
There are two broad classes of entitlements: precalculated and dynamically evaluated.
Precalculated entitlements are fast to evaluate and require the least context. A good example of a precalculated entitlement is assigning a role to an object and then matching a user's membership in that role. The attachment of the role to the object and the assignment of the user to the role happen in advance of determining access. So at the time of access, a simple matching process can determine with almost 100% accuracy, in more-or-less constant time whether the access should be granted or not. Speed and accuracy are the obvious advantages. The primary disadvantage is that the mechanism for determining who belongs in which group and why is left as an exercise for the operator. It is not obvious how an operator translates its needs for policy, compliance, business roles, or need-to-know into a set of roles/groups. The tools for organising users / groups / roles, etc. live outside OSDU.
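The role-matching example above can be sketched in a few lines. This is a minimal illustration only, not the OSDU implementation: the class names, field names, and sample roles are all hypothetical. It assumes roles are attached to objects and assigned to users ahead of time, so the access check reduces to a set intersection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessObject:
    object_id: str
    # Roles attached to the object in advance (hypothetical field name).
    allowed_roles: frozenset


@dataclass(frozen=True)
class User:
    user_id: str
    # Roles assigned to the user in advance (hypothetical field name).
    roles: frozenset


def can_access(user: User, obj: BusinessObject) -> bool:
    """Grant access when the user holds at least one role attached to the object."""
    return not user.roles.isdisjoint(obj.allowed_roles)


# Hypothetical sample data.
well_log = BusinessObject("well-log-1", frozenset({"geoscientist", "data-admin"}))
alice = User("alice", frozenset({"geoscientist"}))
bob = User("bob", frozenset({"finance"}))

print(can_access(alice, well_log))
print(can_access(bob, well_log))
```

Nothing in the check explains why alice is a geoscientist or why that role is attached to the object. That knowledge stays with whoever maintains the role assignments, which is the disadvantage described above.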
Dynamic evaluations encode the rights of users in descriptive policies or rules. At the time of access, the relevant set of rules is determined and evaluated. This requires translating the operator's requirements for policy, compliance, business roles, need-to-know, etc. into a machine-readable policy language/set of rules. The primary advantage is the flexibility to describe access controls generally, even in advance of data loading. E.g., data with attribute X cannot be read by users with attribute Y. The primary disadvantage is that the more flexibility that is possible in the rules, the slower the system is likely to perform. The tools for writing rules that implement operators' needs will be inside OSDU, but such tools have to be created for OSDU.
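As a rough sketch of the "data with attribute X cannot be read by users with attribute Y" example, the rule can be held as a pair of predicates that are evaluated at the time of access. The attribute names (residency, employment), the rule, and the function names are assumptions made for illustration. A real policy engine would use its own policy language rather than Python lambdas.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Sequence

Attributes = Mapping[str, str]


@dataclass(frozen=True)
class DenyRule:
    """Deny reads when both the data predicate and the user predicate hold."""
    name: str
    data_matches: Callable[[Attributes], bool]
    user_matches: Callable[[Attributes], bool]


def is_read_allowed(user: Attributes, data: Attributes, rules: Sequence[DenyRule]) -> bool:
    # Rules are evaluated at access time; any matching deny rule blocks the read.
    return not any(r.data_matches(data) and r.user_matches(user) for r in rules)


# Hypothetical rule: data with residency "NO" cannot be read by contractors.
RULES = [
    DenyRule(
        name="no-contractor-access-to-NO-data",
        data_matches=lambda d: d.get("residency") == "NO",
        user_matches=lambda u: u.get("employment") == "contractor",
    )
]

print(is_read_allowed({"employment": "contractor"}, {"residency": "NO"}, RULES))
print(is_read_allowed({"employment": "employee"}, {"residency": "NO"}, RULES))
```

The rule refers only to attributes, so it can be written before any matching data or user exists. The cost is that every applicable rule runs on every access.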
The three proposals below differ mainly in how much of each of these two approaches they use. To understand the three proposals more deeply, look at three worked entitlement examples.
Policies and ACLs
Many systems separate design and implementation between contextual grants and explicit access permissions. There are multiple reasons for this and they include: historical (requirements for contextual grants come later), different risk exposure, different performance vs freshness trade-offs, incompatible use cases, etc.
The policy/ACL approach is a hybrid approach applying precalculated access control assignments (e.g. owner, rights, permissions) with dynamically-calculated policy entitlements. Access decisions that can be decided purely by precalculated rights are decided that way. This means that in some situations, only ACLs are considered, whereas in other situations policy/compliance requirements and ACLs are considered.
Depending on the requirements, contextual grants can require dynamic rule evaluation or be implemented as separate consumption zones/patterns. There are different ways to determine whether policy/requirement rules are evaluated. One is per service/API: e.g., calling service X, API Y requires policy evaluation, but calling service X, API Z will succeed or fail based only on ACLs. Another mechanism is through consumption zones/patterns: e.g., exporting data to certain environments (such as non-OSDU environments) might require policy evaluation, whereas exporting data to other environments (where OSDU entitlements are actively enforced) might require only an ACL check.
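The per-service/API variant of the hybrid approach might look like the following sketch. The table of (service, API) pairs, the function name, and the callback are hypothetical. The sketch assumes the ACL result is already known and that policy is consulted only for the configured pairs.

```python
from typing import Callable, Set, Tuple

# Hypothetical configuration: (service, API) pairs that require policy evaluation.
POLICY_REQUIRED: Set[Tuple[str, str]] = {("X", "Y")}


def decide(service: str, api: str, acl_allows: bool,
           evaluate_policy: Callable[[], bool]) -> bool:
    """ACLs are always checked; policy is evaluated only where configured."""
    if not acl_allows:
        return False
    if (service, api) in POLICY_REQUIRED:
        # The dynamic (slower) evaluation happens only on this path.
        return evaluate_policy()
    return True


# Service X, API Z: decided by ACLs alone, the policy callback is never invoked.
print(decide("X", "Z", acl_allows=True, evaluate_policy=lambda: False))
# Service X, API Y: the ACL passes, but policy still has the final say.
print(decide("X", "Y", acl_allows=True, evaluate_policy=lambda: False))
```

Passing the policy check as a callback keeps the dynamic evaluation off the ACL-only path, so it costs nothing for requests that do not need it.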
Everything as policy
With everything as policy, all access decisions are dynamically evaluated. No distinction is made between contextual policies and those where the context is not important. Context is considered in every access request, and a uniform set of rules is evaluated at each time of access.
In this way policies can be implemented independent of specific data or specific users. The access control policies can resemble the original license terms, regulatory rules, or organizational policies that the operator adheres to.
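One way to picture a uniform rule set is a list of policy functions that all run against the full request context on every access. The sketch below is an illustration under stated assumptions: the context fields, the two sample policies, and the combining rule (deny overrides, default deny) are hypothetical choices and are not taken from OpenDES or from any particular policy engine.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Sequence


class Effect(Enum):
    PERMIT = "permit"
    DENY = "deny"
    NOT_APPLICABLE = "not_applicable"


@dataclass(frozen=True)
class Request:
    # Context gathered for every request (field names are hypothetical).
    user_country: str
    user_org: str
    data_license: str
    action: str


Policy = Callable[[Request], Effect]


def licensed_org_may_read(req: Request) -> Effect:
    # Hypothetical license term: only org "acme" may read data under license "L-1".
    if req.data_license != "L-1" or req.action != "read":
        return Effect.NOT_APPLICABLE
    return Effect.PERMIT if req.user_org == "acme" else Effect.DENY


def export_control(req: Request) -> Effect:
    # Hypothetical trade-compliance rule: no access from country "ZZ".
    return Effect.DENY if req.user_country == "ZZ" else Effect.NOT_APPLICABLE


def evaluate(req: Request, policies: Sequence[Policy]) -> bool:
    """Run the same policy set on every request; deny overrides, default is deny."""
    effects = [policy(req) for policy in policies]
    if Effect.DENY in effects:
        return False
    return Effect.PERMIT in effects


POLICIES = [licensed_org_may_read, export_control]

print(evaluate(Request("US", "acme", "L-1", "read"), POLICIES))
print(evaluate(Request("ZZ", "acme", "L-1", "read"), POLICIES))
```

Each function reads like the license term or regulation it encodes and names no individual user or object.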
Constrained policies
The constrained policy approach attempts to define a generic entitlements model that can express grants in the form of rules, but that remains usable in a big data space by limiting the impact on performance and cost. In this approach, rules are fundamentally dynamic, but the expressiveness of the rules / patterns / language for defining policies is constrained to a very small set. The goal is to achieve many of the benefits of dynamic entitlement while limiting complexity. The limited complexity should allow a more performant system implementation.
Possible constraints:
- Service-level entitlement only: whereas the everything-as-policy approach would regulate every service and every API, one possible simplification would be to entitle only at the service level. E.g., users can have all APIs on a service or none. Or, more likely, any regulation of what a user can do inside a service is left to the service itself to implement. This simplifies and speeds up the policy engine evaluation by pushing complex evaluations to individual OSDU services.
- Limited operations: define just three levels of entitlement for all services: read, write, owner. Force all services to align their activities along these levels. Another similar set of operations would be CRUD, forcing all OSDU services to entitle along those four operations.
- Lack of wildcards / patterns: some of the most time-consuming operations are string operations matching against patterns like *@example.com or urn:foo:bar:baz/*. One way to constrain policy would be to forbid patterns and force enumeration of exact user and resource names.
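The three constraints above can be combined in one small sketch: grants are keyed by exact user and service names (no wildcards, no per-API entries) and carry one of three levels. The grant table, the names in it, and the assumption that owner implies write and write implies read are hypothetical choices made for illustration.

```python
from enum import IntEnum
from typing import Dict, Tuple


class Level(IntEnum):
    # Three levels only; the ordering owner > write > read is an assumption.
    READ = 1
    WRITE = 2
    OWNER = 3


# Hypothetical grants keyed by exact (user, service) names.
GRANTS: Dict[Tuple[str, str], Level] = {
    ("alice@example.com", "storage"): Level.OWNER,
    ("bob@example.com", "storage"): Level.READ,
}


def is_allowed(user: str, service: str, needed: Level) -> bool:
    """One dictionary lookup plus an integer comparison; no pattern matching."""
    granted = GRANTS.get((user, service))
    return granted is not None and granted >= needed


print(is_allowed("alice@example.com", "storage", Level.WRITE))
print(is_allowed("bob@example.com", "storage", Level.WRITE))
# A pattern is treated as a literal name, so it matches nothing.
print(is_allowed("*@example.com", "storage", Level.READ))
```

The evaluation is cheap and predictable, but every user has to be enumerated explicitly, and anything finer than a service-wide level has to be enforced inside the service itself.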
Decision criteria and tradeoffs
Trade-off Points
- Granularity of authorization
- The end-user perspective (what the user wants)
- The system perspective (what the system inherently provides)
- Basic security concerns
- Does it require services with elevated privileges?
- Water-tightness of approach - how easy is it to circumvent?
- Does it satisfy common security requirements around basic operations like CRUD?
- Performance
- How long to decide a given access request?
- Time it takes for a policy or an access control change to propagate through the system
- Scalability to thousands of users, thousands of groups, trillions of objects, thousands of policies
- Scalability/synchronizability across multiple installations in disparate geographic regions
- Usability
- Ability to manage permissions at scale
- Ability to understand / audit who actually has access under what conditions
- User-friendliness of the approach
- Susceptibility to user errors
- Technical feasibility and maintainability
- Can it be implemented?
- Is it portable across cloud providers?
- Is it portable across technologies?
- Does it follow a known/existing standard?
- Are there existing technical solutions that we can leverage?
- How much of it leverages cloud-native, established frameworks versus how much is purpose-built by OSDU?
- Cost of Change
- Impact of migrating existing data currently in deployed instances
- Likelihood that the approach would lead to future migrations of data
- Likelihood that the approach can be amended / adjusted without provoking substantial change in the future
- Correctness
- Likelihood that the entitlement decision is "correct" at the moment it is made
- Ability to describe and enforce the majority of all business, legal, and regulatory requirements. Easy things are easy, exceptional things look exceptional.