File issues — https://community.opengroup.org/osdu/platform/system/file/-/issues

https://community.opengroup.org/osdu/platform/system/file/-/issues/6
Missing Integration tests — ethiraj krishnamanaidu, 2020-09-16 (assignee: Dmitriy Rudko; due 2020-09-11)
Missing integration tests.

https://community.opengroup.org/osdu/platform/system/file/-/issues/1
[Data flow/Ingestion] Ingestion code sync from GitHub to ADO — ethiraj krishnamanaidu, 2023-04-25
Google's team is working on the Ingestion services, and their internal process is to push the code to GitHub, which creates some challenges for the R2 Development team.
As discussed and agreed last week, we need to make sure that we push the Ingestion, File, and Delivery Service code from GitHub to ADO for the R2 Development team so that all cloud providers can contribute/develop SPIs.
@Stephen Henderson volunteered to work with the Google team (@fargyle) to set up the process to move the code from GitHub to ADO. The initial code was pushed to GitHub on Feb 10th.
* GitHub code sync with ADO: we need a process in place to sync every day; this is not a one-time task.
* Mono-repo structure: we need to follow the agreed-upon core services structure where each service is a different repo. We are not going to discuss mono-repo vs. multi-repo in R2. Again, it does not matter how it's managed in GitHub, but when we push to ADO we need to follow the standards.
* We noticed an os-core-common library within osdu-r2, and it is a duplicate. We need to make sure we use the core-common library that we have created for the core services.
* I don't see SPIs in the provider's folder; we will have to follow the core service package structure.
* Integration Tests
(cc: Ferris Argyle, Joe; due 2020-02-21)

https://community.opengroup.org/osdu/platform/system/file/-/issues/90
Fixing sonar quality issues — Gauri Chitale, 2023-07-28
SonarQube analysis has reported multiple code smells in the File service code.

https://community.opengroup.org/osdu/platform/system/file/-/issues/88
[ADR] Dataset service security enhancements — Om Prakash Gupta, 2023-07-10

# Decision Title
Security Enhancements for Dataset Service's Signed URL APIs
## Status
- [X] Proposed
- [ ] Trialing
- [ ] Under review
- [ ] Approved
- [ ] Retired
## Context & Scope
A customer has voiced a security concern about the File Service's `POST GetStorageInstructions` and `POST GetReterievalInstructions` APIs: a malicious user could get hold of the generated signed URLs and use them to access files from storage. Where Private Link is not a desirable mitigation for the customer, due to policy and deployment-complexity reasons, the following enhancements to the two existing APIs, plus a new API, are proposed to alleviate the customer's security concerns.
## Decision
### Proposed Changes
1. For the `POST GetStorageInstructions` API: change the default TTL from 7 days to 1 hour and make the TTL configurable through a query parameter `expiryTime`, with time units of minutes, hours, or days. The expiry time is capped at 7 days if the value provided by the user exceeds the cap. In the absence of this parameter, the signed URL is valid for 1 hour by default.
2. For the `POST GetReterievalInstructions` API: change the default TTL from 7 days to 1 hour and make the TTL configurable through the same `expiryTime` query parameter, capped at 7 days in the same way.
These two changes also make the two APIs behave consistently.
3. New API to revoke all signed URLs generated for a specified storage account. The storage account is specified through a query parameter `storageAccount`; the user can take the storageAccount value from the `GetReterievalInstructions` or `GetStorageInstructions` response.
POST api/Dataset/v1/revokeURLs
This API will use `StorageAccountRevokeUserDelegationKeys` to revoke all the user delegation keys for the storage account; that revokes all the user delegation SAS tokens and thus invalidates all the signed URLs.
4. Start using user delegation keys for storage accounts rather than storage account keys.
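The TTL rule in items 1-2 (default of 1 hour, units of minutes/hours/days, 7-day cap) can be sketched as a small helper. This is a hypothetical illustration: the single-letter unit suffix format and the class/method names are assumptions, not the ADR's wire format.

```java
import java.time.Duration;

// Hypothetical sketch of the TTL rule from items 1-2: parse an optional
// `expiryTime` query parameter like "30M", "2H" or "3D", default to 1 hour
// when absent, and cap the result at 7 days.
public final class ExpiryTime {

    private static final Duration DEFAULT_TTL = Duration.ofHours(1);
    private static final Duration MAX_TTL = Duration.ofDays(7);

    public static Duration resolve(String expiryTime) {
        if (expiryTime == null || expiryTime.isBlank()) {
            return DEFAULT_TTL; // absent parameter -> 1 hour default
        }
        String value = expiryTime.trim().toUpperCase();
        long amount = Long.parseLong(value.substring(0, value.length() - 1));
        Duration requested;
        switch (value.charAt(value.length() - 1)) {
            case 'M': requested = Duration.ofMinutes(amount); break;
            case 'H': requested = Duration.ofHours(amount); break;
            case 'D': requested = Duration.ofDays(amount); break;
            default: throw new IllegalArgumentException("unit must be M, H or D");
        }
        // Cap at 7 days if the requested value exceeds the cap.
        return requested.compareTo(MAX_TTL) > 0 ? MAX_TTL : requested;
    }
}
```

For example, `resolve(null)` yields 1 hour and `resolve("10D")` is clamped to 7 days.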
## Rationale
A shortened TTL for the signed URLs decreases the window of opportunity for a malicious user to use the signed URLs to access sensitive information; the additional revoke API gives customers a way to mitigate the risk at the earliest moment if a signed URL leak is detected.
## Consequences
**Caution**: A SAS token in a signed URL cannot be individually revoked. This API will revoke all SAS tokens generated and invalidate all signed URLs for that storage account. A user needs to send `GET uploadURL` and `GET downloadURL` requests again to generate new URLs. It should only be used when the customer knows for sure a signed URL has been compromised.
**Caution**: User delegation keys are cached by Azure Storage, so there may be a delay between when the user initiates the revocation and when an existing user delegation SAS becomes invalid. So after calling `POST revokeURLs`, wait for some time and verify that the compromised URL no longer works before sending `GET uploadURL` and `GET downloadURL` requests again.
These cautions need to be included in the File service OpenAPI spec and be communicated clearly to customers.
## Backward Compatibility
This is NOT a breaking change.

https://community.opengroup.org/osdu/platform/system/file/-/issues/87
Apply role-based access to File V2 endpoints. — Rustam Lotsmanenko (EPAM), 2023-08-07
The File V2/DMS API doesn't use authorization filters (`@PreAuthorize`) and doesn't evaluate the roles of the requester, which could lead to data leaks.
Also, it was marked as Hidden, but this rule was not applied at the Infra level automatically.
https://community.opengroup.org/osdu/platform/system/file/-/blob/master/file-core/src/main/java/org/opengroup/osdu/file/api/FileDmsApi.java#L57
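The fix the issue asks for is a Spring `@PreAuthorize` guard on these endpoints. As a framework-free illustration, the evaluation such a filter performs boils down to a role membership check; the role names below follow the usual OSDU `service.file.*` naming convention but are assumptions here, not taken from the service code.

```java
import java.util.Set;

// Illustrative stand-in for the missing authorization filter: the real fix
// would annotate the FileDmsApi endpoints with Spring's @PreAuthorize; this
// helper only shows the role evaluation such a filter performs.
// The role names are assumptions, not the service's actual configuration.
public final class RoleCheck {

    private static final Set<String> ALLOWED_ROLES =
            Set.of("service.file.viewers", "service.file.editors");

    // Returns true only when the requester holds at least one allowed role.
    public static boolean canAccess(Set<String> requesterRoles) {
        for (String role : requesterRoles) {
            if (ALLOWED_ROLES.contains(role)) {
                return true;
            }
        }
        return false;
    }
}
```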
Potential issues:
- If not closed from Istio, data leaks are possible.
- Even if closed from the outside, authorization of internal requests will not be evaluated.

(Milestone: M19 - Release 0.22; Oleksandr Kosse (EPAM), Riabokon Stanislav (EPAM), Andrei Dalhikh (EPAM/GC))

https://community.opengroup.org/osdu/platform/system/file/-/issues/82
File Services Context Path — Thulasi Dass Subramanian, 2023-03-28
Current File service settings:
- Context path: `server.servlet.contextPath=/api/file/`
- Endpoints, e.g. for the downloadURL operation: `/v2/files/{id}/downloadURL`
1. All API endpoints have **/v2/** prefixed to the **RequestMapping** path (_screenshot attached below_).
1. To be consistent with the **swagger ui** and **api-docs** path customization, and to make versioning of the API easier:
_Can we add '**v2**' to the context path and remove it from all endpoints?_
- context path: `server.servlet.contextPath=/api/file/v2`
- Endpoints: e.g. for the downloadURL operation: `/files/{id}/downloadURL`
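In `application.properties` terms, the proposal amounts to moving the version from the request mappings into the context path. This is a sketch of the before/after, not the actual configuration files:

```properties
# Current: version lives in each @RequestMapping
server.servlet.contextPath=/api/file/
# -> effective endpoint: /api/file/v2/files/{id}/downloadURL

# Proposed: version lives in the context path
server.servlet.contextPath=/api/file/v2
# -> effective endpoint: /api/file/v2/files/{id}/downloadURL (unchanged)
```

Since the effective URL is identical either way, consumers of the API see no difference.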
**Consequences:**
- There will not be any changes with respect to consumers of the API.
- It is internal refactoring of how the base path and version are maintained.
CSPs can provide their inputs if they see any breaking changes or any settings that need to be updated.
![image](/uploads/9878bb676457816652e6760868d73002/image.png)

(Milestone: M17 - Release 0.20; Thulasi Dass Subramanian)

https://community.opengroup.org/osdu/platform/system/file/-/issues/79
File CI/CD pipelines do not use file-test-core-bdd with vital test cases. — Rustam Lotsmanenko (EPAM), 2022-11-04
There are BDD tests defined in the File testing module: <br/>
https://community.opengroup.org/osdu/platform/system/file/-/tree/master/testing/file-test-core-bdd <br/>
They get test case updates with new feature introductions, for example: <br/>
https://community.opengroup.org/osdu/platform/system/file/-/merge_requests/138/diffs#d67c53013c6814c8d874d0daf0cffc9179ad1d00 <br/>
But they are not used in the CI/CD pipelines, which leaves those features not covered. <br/>
And it looks like, because they have been ignored for a long time, the tests have some compatibility issues which lead to runtime errors like:
~~~
java.lang.NoClassDefFoundError: Could not initialize class io.restassured.RestAssured
at org.opengroup.osdu.file.util.test.RestAssuredClient.<init>(RestAssuredClient.java:30)
at org.opengroup.osdu.file.util.test.HttpClientFactory.getInstance(HttpClientFactory.java:8)
at org.opengroup.osdu.file.stepdefs.FileStepDef_GET.lambda$new$1(FileStepDef_GET.java:76)
~~~
Keeping them ignored may cause issues with feature introduction and verification. <br/>
There are several possible solutions: <br/>
- Fix and enable file-test-core-bdd tests in the integration step
- Copy missing tests from file-test-core-bdd to file-test-CSP_PROVIDER_MODULE

(Rustam Lotsmanenko (EPAM))

https://community.opengroup.org/osdu/platform/system/file/-/issues/78
ADR: Security Enhancements for File Service's Signed URL APIs — Lucy Liu, 2023-11-29
# Decision Title
Security Enhancements for File Service's Signed URL APIs
## Status
- [X] Proposed
- [x] Trialing
- [x] Under review
- [x] Approved
- [ ] Retired
## Context & Scope
A customer has voiced a security concern about the File Service's `GET uploadURL` and `GET downloadURL` APIs: a malicious user could get hold of the generated signed URLs and use them to access files from storage. Where Private Link is not a desirable mitigation for the customer, due to policy and deployment-complexity reasons, the following enhancements to the two existing APIs, plus a new API, are proposed to alleviate the customer's security concerns.
## Decision
### Proposed Changes
1. For the `GET uploadURL` API: change the default TTL from 7 days to 1 hour and make the TTL configurable through a query parameter `expiryTime`, with time units of minutes, hours, or days. The expiry time is capped at 7 days if the value provided by the user exceeds the cap. In the absence of this parameter, the signed URL is valid for 1 hour by default.
2. For the `GET downloadURL` API: change the default TTL from 7 days to 1 hour. The TTL is already configurable through the `expiryTime` query parameter.
These two changes also make the two APIs behave consistently.
3. New API to revoke all signed URLs generated for a specified storage account. The storage account is specified through a query parameter `storageAccount`; the user can take the storageAccount value from the `GET uploadURL` or `GET downloadURL` response.
POST api/file/v2/files/revokeURLs
This API will use `StorageAccountRevokeUserDelegationKeys` to revoke all the user delegation keys for the storage account; that revokes all the user delegation SAS tokens and thus invalidates all the signed URLs.
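As a sketch of how a client might construct a call to the proposed endpoint: only the path and the `storageAccount` query parameter come from the ADR text above; the host, token, and class name are placeholders.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side construction of the proposed revoke call.
// Only the path and query parameter are taken from the ADR; everything
// else (host, auth scheme) is a placeholder.
public final class RevokeUrlsRequest {

    public static HttpRequest build(String host, String storageAccount, String token) {
        URI uri = URI.create(host + "/api/file/v2/files/revokeURLs?storageAccount=" + storageAccount);
        return HttpRequest.newBuilder(uri)
                .header("Authorization", "Bearer " + token)
                .POST(HttpRequest.BodyPublishers.noBody()) // no request body needed
                .build();
    }
}
```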
## Rationale
A shortened TTL for the signed URLs decreases the window of opportunity for a malicious user to use the signed URLs to access sensitive information; the additional revoke API gives customers a way to mitigate the risk at the earliest moment if a signed URL leak is detected.
## Consequences
**Caution**: A SAS token in a signed URL cannot be individually revoked. This API will revoke all SAS tokens generated and invalidate all signed URLs for that storage account. A user needs to send `GET uploadURL` and `GET downloadURL` requests again to generate new URLs. It should only be used when the customer knows for sure a signed URL has been compromised.
**Caution**: User delegation keys are cached by Azure Storage, so there may be a delay between when the user initiates the revocation and when an existing user delegation SAS becomes invalid. So after calling `POST revokeURLs`, wait for some time and verify that the compromised URL no longer works before sending `GET uploadURL` and `GET downloadURL` requests again.
These cautions need to be included in the File service OpenAPI spec and be communicated clearly to customers.
## Backward Compatibility
This is NOT a breaking change.

(Milestone: M18 - Release 0.21; Om Prakash Gupta)

https://community.opengroup.org/osdu/platform/system/file/-/issues/76
POST files/metadata Re-try Failure due to Staging File being Deleted Prematurely — Lucy Liu, 2023-03-09
An issue was observed in POST files/metadata retries during MSFT use of this API in M12: retries of a failed POST files/metadata call are likely to result in 400 errors no matter how many times the retry is performed. Further investigation shows the root cause: when metadata creation fails, the staging file is also deleted, so subsequent retries with the same source file ID that mapped to the deleted staging file result in failure. The staging file should not be deleted prematurely if metadata creation failed.
Current workaround is to perform the extra two steps to upload the file to staging again and then retry POST files/metadata:
1. Get a signed URL by calling File location API
2. Upload File to blob storage using signed url
3. Create the metadata using POST Metadata API
Suggested fix:
1. In FileMetadataService::saveMetadata, move the deleteStagingFile step to the last step, right before the successful return, so that the staging file is only deleted when everything succeeds.
2. Check staging file existence before deleting. Catch and ignore exceptions thrown from the staging file delete. Staging file deletion failure is rare but can happen under special concurrency situations: simultaneous Metadata create calls with the same payload lead one caller's delete to fail because the file was already deleted by the other caller. A failed staging file deletion should not invalidate a successful metadata creation.

(Milestone: M17 - Release 0.20; Chad Leong)

https://community.opengroup.org/osdu/platform/system/file/-/issues/73
Storage and retrieval instructions for file collection in AWS file service missing critical fields — Morris Estepa, 2022-07-01
The storage and retrieval instructions for file collections returned by the AWS file service currently return only a pre-signed URL. However, pre-signed URLs only work with single S3 objects. Consumers of file collection instructions need the following information in order to store and retrieve objects in S3:
* Unsigned URL
* Temporary credentials
* Region
The AWS file service needs to change to return the required information above.

(Milestone: M12 - Release 0.15; Morris Estepa)

https://community.opengroup.org/osdu/platform/system/file/-/issues/71
File core module JUnits are not getting executed — Abhishek Kumar (SLB), 2022-06-17

https://community.opengroup.org/osdu/platform/system/file/-/issues/69
File Service: Requests to POST metadata are taking long time — Sachin Jaiswal, 2022-08-23

### Problem Statement
A request to POST metadata takes a long time when it tries to calculate the checksum for larger files.
### Solution
We can overcome this problem by reading bytes from the input stream in fixed-size chunks into a buffer array and updating the checksum incrementally.

https://community.opengroup.org/osdu/platform/system/file/-/issues/65
The postman collection that executes successfully on Azure, GCP and IBM but fails on AWS — Kamlesh Todai, 2022-08-24
The following collection works successfully in GCP and IBM environments in the Platform Validation project, but it fails to run in Azure and AWS environments.
[FileAPI_UploadDownload_CI-CD_v2.0.postman_collection.json](/uploads/9e3038f1bd76c63807ce29ff53fae533/FileAPI_UploadDownload_CI-CD_v2.0.postman_collection.json)
In the Azure environment (@krveduru, @ankurrawat):
Request 2 (UploadFile by SignedURL) fails with response code 400: "An HTTP header that's mandatory for this request is not specified".
The API doc is not clear as to what needs to be specified, and the same request works in the GCP and IBM environments.
Response:
~~~xml
<?xml version="1.0" encoding="utf-8"?>
<Error>
  <Code>MissingRequiredHeader</Code>
  <Message>An HTTP header that's mandatory for this request is not specified.
RequestId:3d3902d8-501e-0092-3544-50f68a000000
Time:2022-04-14T21:15:09.4287412Z</Message>
  <HeaderName>x-ms-blob-type</HeaderName>
</Error>
~~~
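For reference, Azure Blob Storage requires the `x-ms-blob-type` header on a blob upload (Put Blob) request, which is the header named in the error above. A hedged sketch of building such an upload request; the class name is illustrative and the signed URL is a placeholder:

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch of the upload request Azure expects: a PUT to the signed URL with
// the x-ms-blob-type header set; its absence triggers the
// MissingRequiredHeader error shown above. The URL is a placeholder.
public final class SignedUrlUpload {

    public static HttpRequest build(String signedUrl, byte[] content) {
        return HttpRequest.newBuilder(URI.create(signedUrl))
                .header("x-ms-blob-type", "BlockBlob") // required by Azure Put Blob
                .PUT(HttpRequest.BodyPublishers.ofByteArray(content))
                .build();
    }
}
```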
In the AWS environment (@fhoueto.amz):
Request 3 (Create File Metadata R3) fails with response code 500 Internal Server Error.
Response:
~~~json
{
  "error": {
    "code": 500,
    "message": "Internal server error",
    "errors": [
      {
        "domain": "global",
        "reason": "internalError",
        "message": "Internal server error"
      }
    ]
  }
}
~~~
The file being uploaded is a LAS file (7004_a1501_1978_comp.las) from the TNO data.
FYI - @dzmitry_malkevich @anujgupta @debasisc @ChrisZhang

(Milestone: M11 - Release 0.14; Okoun-Ola Fabien Houeto)

https://community.opengroup.org/osdu/platform/system/file/-/issues/64
ADR: Calculate Checksum before saving metadata — Paresh Behede, 2023-07-05

# Decision Title
Calculate checksum of uploaded file before creating its metadata
## Status
- [X] Proposed
- [X] Trialing
- [X] Under review
- [X] Approved
- [ ] Retired
## Context & Scope
We support a dataset--File.Generic entity record being created in the data platform when the user hits the /metadata endpoint of the File Service. This schema has a couple of useful attributes which we don't use as of now: checksum and checksum algorithm.
These attributes would be very useful to detect any duplicate file uploads in the data platform.
## Mechanism for calculating checksum
I propose to implement a new method in the core module (let's say generateChecksum()) which can be implemented by every CSP in its provider module, called before we make the call to the Storage service to save the file's metadata.
This method can be implemented in various ways and with various algorithms, as per each CSP's choice. For example, in Azure we don't really need to generate the checksum explicitly, as it is calculated by the blob store automatically, so the implementation of generateChecksum() can just fetch the blob's metadata. It can be implemented similarly by other providers if their storage solution also supports calculating a checksum while storing the blob.
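For a CSP whose storage does not precompute a checksum, an explicit streaming implementation of the proposed generateChecksum() could look like this. A minimal sketch: the method name comes from the proposal above, while the MD5 choice, class name, and buffer size are assumptions for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative provider-side implementation of the proposed generateChecksum():
// stream the file in fixed-size chunks so large files never have to be held
// in memory, updating the digest incrementally.
public final class StreamingChecksum {

    private static final int BUFFER_SIZE = 8 * 1024; // 8 KiB chunks

    public static String generateChecksum(InputStream in)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("MD5");
        byte[] buffer = new byte[BUFFER_SIZE];
        int read;
        while ((read = in.read(buffer)) != -1) {
            digest.update(buffer, 0, read); // only the bytes actually read
        }
        // Render the digest as lowercase hex.
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

Streaming in chunks is also the approach suggested in issue #69 above for the slow POST files/metadata checksum calculation.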
## Decision
We should generate the checksum of a single file before creating its metadata in the data platform, so that we can provide that checksum value in the metadata record (an instance of the dataset--File.Generic entity).

(Milestone: M12 - Release 0.15; Paresh Behede)

https://community.opengroup.org/osdu/platform/system/file/-/issues/63
Preloadfilepath & ExtensionProperties removed from file Metadata API — ivar Soerheim, 2022-11-28
During ingestion of file metadata under /files/metadata using the POST command, the Preloadfilepath and ExtensionProperties are not persisted when the record is returned post-ingest.
This seems like strange behaviour to me. I would like to either understand why this happens, or extend the file metadata API so these properties are not removed.
This is the workflow:
1. Get Signed URL for upload
2. Upload file using signed URL
3. Upload file metadata using file api (this returns ID of created record and can be searched in storage)
4. Refer to this ID when creating well log record or any other record
The problem with this workflow is that:
- PreloadFilePath and ExtensionProperties are removed from the record during metadata upload.

https://community.opengroup.org/osdu/platform/system/file/-/issues/59
Using Publisher Facade to publish status messages — Tsvetelina Ivanova, 2022-02-24
The Azure core lib introduces a publisher facade which can be used across services to publish messages to message brokers (Service Bus / Event Grid). It helps to manage and update publishing at a single source instead of each service doing it individually. The pub/sub configuration can be used to configure publishing for Event Grid and Service Bus.
Link of related issue:
https://community.opengroup.org/osdu/platform/system/notification/-/issues/25

https://community.opengroup.org/osdu/platform/system/file/-/issues/55
Update Swagger documentation for end point - File uploadURL — Debasis Chatterjee, 2022-09-29
See this below:
![API-File-service](/uploads/fb1508499115056f6ccd8abbb10c9a01/API-File-service.PNG)
Should be lowercase "**uploadURL**"
{{FILE_HOST}}/files/uploadURL

(Milestone: M14 - Release 0.17; Shrikant Garg)

https://community.opengroup.org/osdu/platform/system/file/-/issues/54
Upgrade to Log4J 2.17 — David Diederich, 2021-12-21
The Apache Foundation released another Log4j2 update, version 2.17, which addresses a denial-of-service vulnerability.
This issue tracks progress to upgrade this dependency for this project.

https://community.opengroup.org/osdu/platform/system/file/-/issues/53
Log4J Expedient Updates and Patches — David Diederich, 2021-12-17
This issue associates MRs that were applied to this project quickly to get a patched version ready as soon as possible. The intent is to provide a reference point for later, more thoughtful analysis.

https://community.opengroup.org/osdu/platform/system/file/-/issues/52
Log4J CVE-2021-44228 — Tsvetelina Ivanova, 2021-12-17
Apache Log4j2 <=2.14.1 JNDI features used in configuration, log messages, and parameters do not protect against attacker-controlled LDAP and other JNDI-related endpoints. An attacker who can control log messages or log message parameters can execute arbitrary code loaded from LDAP servers when message lookup substitution is enabled. From Log4j 2.15.0, this behavior has been disabled by default. In previous releases (>2.10) this behavior can be mitigated by setting the system property "log4j2.formatMsgNoLookups" to "true"; in prior releases (<2.10) it can be mitigated by removing the JndiLookup class from the classpath (example: zip -q -d log4j-core-\*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class).