@shivani_karipe I noticed that in commit 4187481d the deployment task for WDMS main service was removed.
In that case, please make sure the worker service test sets here are executed during the build.
CC @Vernet
@shivani_karipe After the deployment step, the WDMS e2e tests should be executed as well.
Nvm, saw line 84
Is this correct? Should it be devops/azure/wellbore-ddms-worker.values.yaml instead?
@omprakash_epam The Azure libraries have been updated and already include the required implementation to enable the worker service in Azure.
For the benchmark results, see my reply to Nur below.
What is still pending on the Azure team side is updating the GitLab pipelines to enable deployment of the worker service in the Azure environment used by the OSDU Forum.
Could you coordinate that with the MSFT team?
CC @Vernet
@nursheikh In the absence of a comprehensive performance benchmark that could be shared, the recommendation is to use the existing e2e test set to validate the expected reduction in CPU and memory usage when the worker service is adopted for bulk data access in place of the Dask-based implementation.
The AWS team has shared here the observed changes in CPU and memory usage when running the existing e2e tests in their environment against the Dask-based implementation versus the new worker service for bulk data access.
Those values are consistent with the latest measurements collected in our internal Azure test environment as well.
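Not a substitute for the shared numbers, but here is a minimal local sketch of how such a comparison can be sampled, assuming the third-party psutil package and the PID of the locally running service process (run it once against the Dask-based deployment and once against the worker service while the e2e set executes):

```python
import time

import psutil  # third-party: pip install psutil

def sample_usage(pid: int, duration_s: int = 300, interval_s: float = 1.0) -> None:
    """Sample CPU and RSS memory of the service process while the e2e set runs."""
    proc = psutil.Process(pid)
    proc.cpu_percent(None)  # prime the counter; the first reading is not meaningful
    cpu, rss = [], []
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        cpu.append(proc.cpu_percent(interval=interval_s))  # blocks for interval_s
        rss.append(proc.memory_info().rss)
    print(f"cpu%  avg={sum(cpu) / len(cpu):.1f}  peak={max(cpu):.1f}")
    print(f"rss MiB  avg={sum(rss) / len(rss) / 2**20:.1f}  peak={max(rss) / 2**20:.1f}")
```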
@kogliny Could you provide the full log of the e2e tests run?
@kogliny Could you file a ticket with more details?
Be aware that delays are expected, as it's holiday season for us; the team is back after August 16th.
@ydzeng The latest worker service version contains:
Dask retirement will come next.
Performance benchmark executions do not need to wait on that to start.
Code drop on July 28th from commit f3ee8433304fadca8c22d0a46db2ea4326b2ee8d.
It includes major changes regarding bulk data writing:
@carl.godkin, @deepapathak To better understand the expected format of data.WellID (or of any id referenced in the data block), we need to refer back to the schema definition. The regular expression used is
"pattern": "^[\\w\\-\\.]+:master-data\\-\\-Well:[\\w\\-\\.\\:\\%]+:[0-9]*$"
which, translated to a more readable form, is {namespace}:master-data--Well:{id}:{optional version}.
That means valid data.WellID values are:
{namespace}:master-data--Well:{id}:{version}
{namespace}:master-data--Well:{id}: --> in this case a trailing colon will be present
Example of an invalid data.WellID value:
{namespace}:master-data--Well:{id}
Using a regex utility to illustrate:
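Equivalently, a quick check with Python's re module (the sample id values below are hypothetical):

```python
import re

# Pattern copied from the schema definition quoted above.
WELL_ID = re.compile(r"^[\w\-\.]+:master-data\-\-Well:[\w\-\.\:\%]+:[0-9]*$")

samples = [
    "opendes:master-data--Well:well-123:12345",  # valid: {id}:{version}
    "opendes:master-data--Well:well-123:",       # valid: trailing colon, version omitted
    "opendes:master-data--Well:well-123",        # invalid: no trailing colon
]
for s in samples:
    print(s, "->", "match" if WELL_ID.match(s) else "no match")
```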
Given the id pairing (e.g. opendes:dataset--ConnectedSource.Generic:test123 -> opendes:dataset--File.Generic:test123), how will id collision be handled? When creating a ConnectedSource.Generic:{id} record, is there a way to check whether the pairing File.Generic:{id} already exists and, at the same time, reserve it if available? (See the illustrative sketch below.)
In the sequence diagram, what are the Inputs for the eds_wellbore_ddms endpoint? Will the eds_wellbore_ddms endpoint be synchronous or asynchronous?
Regarding the requirements, are there requirements to
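Purely to illustrate the id pairing in question (pair_file_generic_id is a hypothetical helper, not an existing API; the reservation/collision behavior is exactly what's being asked above):

```python
def pair_file_generic_id(connected_source_id: str) -> str:
    """Hypothetical helper: derive the paired File.Generic id from a ConnectedSource.Generic id."""
    kind = "dataset--ConnectedSource.Generic"
    if f":{kind}:" not in connected_source_id:
        raise ValueError(f"not a {kind} id: {connected_source_id}")
    return connected_source_id.replace(kind, "dataset--File.Generic", 1)

assert (
    pair_file_generic_id("opendes:dataset--ConnectedSource.Generic:test123")
    == "opendes:dataset--File.Generic:test123"
)
```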
@carl.godkin For reference, this is how the record can be edited to pass schema validation:
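(The edited record from the original comment is not reproduced in this thread; a hypothetical before/after, assuming the failure is the data.WellID pattern discussed above:)

```python
# Hypothetical illustration only; the record attached to the original comment is
# not reproduced here. Assuming the failure is the data.WellID pattern above:
record = {"data": {"WellID": "opendes:master-data--Well:well-123"}}  # rejected: no trailing colon
record["data"]["WellID"] = "opendes:master-data--Well:well-123:"     # accepted: version left empty
```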
@debasisc, @carl.godkin That is correct: it is not mandatory to create WellboreTrajectory records with Wellbore DDMS.
The only requirement is that the record is valid according to the schema definition.
One can use an external JSON validator to validate the record content against the OSDU schema; it will fail at the same place.
That said, from the WDDMS error message it does indeed look like the location is missing.
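A minimal sketch of that external check, assuming the Python jsonschema package and locally saved copies of the schema and record (file names below are placeholders, and the OSDU schemas use $ref to shared definitions, so those need to be resolvable for the validation to run):

```python
import json

import jsonschema  # third-party: pip install jsonschema

# Placeholder file names; use the schema version the record declares in its kind.
with open("WellboreTrajectory.json") as f:
    schema = json.load(f)
with open("record.json") as f:
    record = json.load(f)

try:
    jsonschema.validate(instance=record, schema=schema)
    print("record passes schema validation")
except jsonschema.ValidationError as err:
    # Should point at the same failing path the WDDMS error message reports.
    print("validation failed at", list(err.absolute_path), "-", err.message)
```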
@dacarpen Could you share more information on
See comment above
So far it looks CSP-specific and infra-related, rather than an underlying issue with WDMS as a service.
@carl.godkin, @debasisc Does Osdu_Ingest use the WDMS API to create the WellboreTrajectory WPC record?
If it's using the Storage Service instead, it's likely to be inadvertently creating records that fail schema validation.
@kogliny As the fix was merged, can this ticket be closed?