Dataset service available as core service for all CSPs
Updated on Jan 15, 2021.
Now that the ADR for Dataset as a core service has been approved, this issue tracks the work needed to make the dataset service available on all cloud platforms for all CSPs. This includes the SPI implementation, the pipeline, and the integration test setup in GitLab.
The current code is in the data flow project (https://community.opengroup.org/osdu/platform/data-flow/data-workflow-framework/dataset-registry) and should be moved to the system project.
Original issue entry:
Title: File service to support use cases of multiple files
The File service design ADR has the limited scope of supporting single-file ingestion. An endpoint was added to get an uploadURL for this use case. Other use cases, such as VDS support, require support for multiple files or a location of files. This item tracks the related work needed.
I quote Joe's comments in the previous ADR on this: "If we can change the endpoint to getLocation, where this endpoint is partition and entitlement aware, and where this endpoint returns a JSON object that informs the data producer on all aspects required to execute a cloud SDK for their respective object storage, then this will support all cloud providers. I cannot support an approach that does not support all cloud providers. This endpoint would need to return things like 1) base URL (based on partition) 2) namespace within the base URL (aka folders) 3) all other information required to get an access token from the SDK versus generally providing a signedURL. This would also eliminate the need for a FileID. If you are stuck on providing a signedURL, then, that should be an optional attribute within the object that I have described. That way, a producing application can use the signedURL if there is one, and be able to do something else if the signedURL is not provided. As it is, the design does not leave an options."
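To make the proposal in the quote concrete, here is a minimal sketch of what such a location object and the producer-side fallback could look like. It is not the agreed design: the class name `StorageLocation`, its field names, and the helper `choose_upload_strategy` are all hypothetical and only mirror the three items listed in the quote plus the optional signedURL.

```python
from dataclasses import dataclass, field, asdict
from typing import Dict, Optional


@dataclass
class StorageLocation:
    """Hypothetical shape of a getLocation response (names are assumptions)."""
    base_url: str                      # 1) base URL, derived from the partition
    namespace: str                     # 2) namespace within the base URL (folders)
    # 3) whatever a cloud SDK needs to obtain an access token; keys are provider specific
    access_info: Dict[str, str] = field(default_factory=dict)
    signed_url: Optional[str] = None   # optional: present only if the provider issues one

    def to_payload(self) -> dict:
        """Return a JSON-serializable dict, omitting signed_url when it is absent."""
        payload = asdict(self)
        if payload["signed_url"] is None:
            del payload["signed_url"]
        return payload


def choose_upload_strategy(location: StorageLocation) -> str:
    """Use the signed URL if one was provided, otherwise fall back to the cloud SDK."""
    if location.signed_url:
        return "signed_url"
    return "cloud_sdk"
```

A producing application would call something like `choose_upload_strategy` on the returned object and branch on the result, so a provider that cannot or does not want to issue a signedURL is still supported, and multi-file use cases can write under the returned namespace.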