[DRAFT] ADR: Enabling User Context in Ingestion
Problem Statement
Currently, ingestion jobs in OSDU, such as the CSV Parser and Manifest Ingestion, use Service Account tokens when calling OSDU service APIs such as Storage and Dataset. As a result, every authorization check, whether for API access or for data-level access (the ACL checks in the Storage service), is evaluated against the permissions of the Service Account rather than those of the user who initiated the ingestion.
Scenario to Understand this Issue
In CSV Ingestion, the IDs of the generated records are predetermined from Natural Keys [Code Link].
Consider a case where User A invokes CSV Ingestion and the job updates an existing record that was created by a different user, User B, with xyz ACLs. User A may not have access to the ACLs associated with the existing record, but because the ingestion job uses Service Account tokens, the ACL validation in the Storage service Create/Update Record flow succeeds (Service Accounts are members of the users.data.root group, which gives them access to all data in the system).
As a result, User A overwrites records created by User B, causing data loss for the original user. This is not the expected behavior and is a major gap in authorization.
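The scenario can be illustrated with a minimal sketch. The `can_update` function is a hypothetical stand-in for the Storage service ACL check, not its actual implementation, and the group names other than users.data.root are invented for illustration.

```python
from dataclasses import dataclass, field

# Group named in the text; membership grants access to all data.
DATA_ROOT_GROUP = "users.data.root"


@dataclass
class Record:
    record_id: str
    owners: set  # ACL owner groups attached to the record


@dataclass
class Caller:
    name: str
    groups: set = field(default_factory=set)


def can_update(caller: Caller, record: Record) -> bool:
    """Hypothetical stand-in for the ACL check in the Create/Update Record flow."""
    if DATA_ROOT_GROUP in caller.groups:
        return True  # root members bypass record-level ACLs
    # Otherwise the caller must share at least one owner group with the record.
    return bool(caller.groups & record.owners)


# Record created by User B (illustrative group names).
record = Record("rec-1", owners={"data.team-b.owners"})
user_a = Caller("user-a", {"data.team-a.owners"})
service_account = Caller("ingestion-sa", {DATA_ROOT_GROUP})

print(can_update(user_a, record))           # False: User A lacks the ACL
print(can_update(service_account, record))  # True: the job succeeds anyway
```

Because the ingestion job presents the Service Account identity, the check that would have rejected User A never runs against User A's groups.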
Questions
- Why can't the ingestion jobs rely on the user tokens passed in the request headers of the Workflow API? - Ingestion jobs are long running, so the tokens will eventually expire and need renewal. The system cannot renew user tokens because it does not have access to user-specific credentials and auth codes.
Proposed Solution
We can leverage the SPI layer (Service Mesh/API Gateway) that is responsible for authentication and identity resolution. As part of the Entitlements V2 onboarding, authentication and identity resolution were extracted from the Entitlements service, and the service now expects the identity to be provided in each request. The x-user-id header is an expected parameter on requests into the service. This header carries the identity of the user making the request and is set by the SPI layer (Service Mesh/Gateway).
Following changes are proposed
- A new header, x-on-behalf-of, will be introduced to store the user identity (context). Its value will be set only by ingestion jobs (CSV/Manifest).
- Workflow Service will add a new field, user identity (present in DPS Headers), to the Airflow Conf when triggering the DAG run.
- Ingestion jobs (CSV/Manifest) will extract the newly added user identity property and set the x-on-behalf-of header on requests before calling any downstream services.
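The propagation steps above can be sketched as two small helpers. This is a sketch under assumptions, not the actual Workflow Service or DAG code: the conf key `user_identity`, both function names, and the choice of `x-user-id` as the incoming header that carries the resolved identity are hypothetical.

```python
from typing import Any, Dict, Mapping

ON_BEHALF_OF_HEADER = "x-on-behalf-of"  # header proposed in this ADR
USER_ID_HEADER = "x-user-id"            # assumed source of the resolved identity
CONF_USER_KEY = "user_identity"         # hypothetical Airflow conf key


def build_dag_run_conf(request_headers: Mapping[str, str],
                       execution_context: Mapping[str, Any]) -> Dict[str, Any]:
    """Workflow Service side: copy the resolved user identity into the DAG run conf."""
    conf: Dict[str, Any] = {"execution_context": dict(execution_context)}
    user_id = request_headers.get(USER_ID_HEADER)
    if user_id:
        conf[CONF_USER_KEY] = user_id
    return conf


def build_downstream_headers(conf: Mapping[str, Any],
                             service_account_token: str,
                             data_partition_id: str) -> Dict[str, str]:
    """Ingestion job side: authenticate as the Service Account, act on behalf of the user."""
    headers = {
        "Authorization": f"Bearer {service_account_token}",
        "data-partition-id": data_partition_id,
    }
    user_id = conf.get(CONF_USER_KEY)
    if user_id:
        headers[ON_BEHALF_OF_HEADER] = user_id
    return headers


conf = build_dag_run_conf({"x-user-id": "user-a@example.com"}, {"id": "file-1"})
headers = build_downstream_headers(conf, "sa-token", "opendes")
print(headers["x-on-behalf-of"])  # user-a@example.com
```

Only the identity string travels through the conf. The user token itself is never stored, which sidesteps the expiry problem raised in the Questions section.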
Change in SPI Layer (Service Mesh)
- If the request contains a Service Account token and the x-on-behalf-of header is not null or empty, the x-user-id header will be set to the value of the x-on-behalf-of header.
- Otherwise, the x-user-id header is set by following the existing logic.
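The rule above can be expressed as a short function. This is an illustrative sketch only: a real Service Mesh would implement it in its own filter or policy language, and `resolve_user_id`, `token_subject`, and `service_account_ids` are hypothetical names. Returning the token subject stands in for the "existing logic", which this ADR does not describe.

```python
from typing import Collection, Mapping

ON_BEHALF_OF_HEADER = "x-on-behalf-of"


def resolve_user_id(headers: Mapping[str, str],
                    token_subject: str,
                    service_account_ids: Collection[str]) -> str:
    """Return the value the SPI layer would place in the x-user-id header."""
    is_service_account = token_subject in service_account_ids
    on_behalf_of = (headers.get(ON_BEHALF_OF_HEADER) or "").strip()
    if is_service_account and on_behalf_of:
        # Service Account acting for a user: authorize as that user.
        return on_behalf_of
    # Placeholder for the existing identity resolution logic.
    return token_subject


service_accounts = {"ingestion-sa"}

# Ingestion job calling with a Service Account token on behalf of User A.
print(resolve_user_id({"x-on-behalf-of": "user-a"}, "ingestion-sa", service_accounts))
# A regular user cannot impersonate someone else by sending the header.
print(resolve_user_id({"x-on-behalf-of": "user-b"}, "user-a", service_accounts))
```

Both calls resolve to user-a. Honoring the header only when the token belongs to a Service Account is what keeps ordinary users from impersonating one another.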
This preserves the user identity (context), so all API-level and data-level authorization checks are performed against the Entitlement groups of the user rather than those of the Service Account.
Authentication can still be carried out using Service Account tokens, because the user was already authenticated when they called the Workflow API.
Advantages of Above Approach
- ACL validations will be performed based on the user ID instead of the Service Account, which resolves the elevated-permissions issue.
- No service-side code changes are required, and the change will be scoped only to the Service Mesh.
- Hardens the authorization checks, as it also ensures that only users with the appropriate API permissions will be able to trigger ingestion.
- No change in service behavior is expected.
Flow Diagram
Security Enhancement
This is a proposed enhancement that extends the above approach to further harden the security of the system. Currently, in OSDU deployments, Service Account tokens can be used directly by clients to invoke OSDU APIs, which removes the distinction between a flow initiated by an actual user and an internal, system-driven flow.
Service Account tokens should be restricted to internal services, and external users shouldn’t be permitted to call OSDU APIs with them. This is also a security enhancement: if a Service Account token is accidentally leaked, a malicious user cannot use it to gain the highest level of access over all data and APIs in the system.
Changes required
- A mechanism to distinguish external calls from internal calls. Each CSP can handle this in its own Service Mesh implementation.
- Block any external calls made using Service Account tokens.
- Details on response codes can be fleshed out later in a separate ADR.
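The blocking rule can be sketched as a single admission check. How a call is classified as internal or external is left to each CSP, so the `Origin` value is taken as an input here. The names `Origin` and `admit` are hypothetical, and no response code is chosen because that is deferred to a separate ADR.

```python
from enum import Enum


class Origin(Enum):
    INTERNAL = "internal"  # call originated inside the mesh
    EXTERNAL = "external"  # call entered through the public gateway


def admit(origin: Origin, is_service_account_token: bool) -> bool:
    """Return False only for external calls that present a Service Account token."""
    return not (origin is Origin.EXTERNAL and is_service_account_token)


print(admit(Origin.INTERNAL, True))   # True: internal services may use the token
print(admit(Origin.EXTERNAL, True))   # False: a leaked token is useless from outside
print(admit(Origin.EXTERNAL, False))  # True: ordinary user tokens are unaffected
```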