infra-azure-provisioning issues
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues

---

SEGY to VDS DAG and SEGY to ZGY DAG deployment failing
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/323
2023-10-12, Diya Alias

Hi, I am deploying OSDU M20 using the manual deployment. When I try to deploy the SEGY to VDS DAG and the SEGY to ZGY DAG with the commands below, the deployment fails. It looks like the Docker image tag specified in the manual deployment documentation is a very old one. Can anyone help find the newest tag for deploying these DAGs?
Images used to deploy as per documentation:
- **msosdu.azurecr.io/segy-to-zgy-conversion-dag:0.11.0**
- **msosdu.azurecr.io/segy-to-vds-conversion-dag:0.11.0**
- Documentation Link: https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/blob/release/0.23/docs/service-data.md?ref_type=heads
Error received for both DAG deployments:
Error while registering DAG:
```
{"code":403,"reason":"Non empty dag content obtained","message":"Setting dag content not allowed"}
```
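If you have read access to the registry, one way to check which tags are currently published might be the following (access to msosdu.azurecr.io may be restricted, so this is only a suggestion):

```bash
# List recent tags of the DAG images (assumes 'az acr' read access to the msosdu registry)
az acr repository show-tags --name msosdu --repository segy-to-zgy-conversion-dag --orderby time_desc --top 10 -o table
az acr repository show-tags --name msosdu --repository segy-to-vds-conversion-dag --orderby time_desc --top 10 -o table
```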
---

[critical?] Service Schema Loading Fails with `InaccessibleObjectException`
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/322
2023-09-25, Paweł Grudzień

**Description:**
When executing the Service Schema Loading step from the Load Service Data section during OSDU installation, the provided Docker setup repeatedly returns "Internal server error". These errors are preventing the successful addition of schemas to OSDU.
**Error Details:**
Received multiple `500 Internal Server Error` responses for different schemas including:
```
Error with kind osdu:wks:AbstractAccessControlList:1.0.0: Message: Internal server error
Error with kind osdu:wks:AbstractActivityParameter:1.1.0: Message: Internal server error
Error with kind osdu:wks:AbstractActivityParameter:1.0.0: Message: Internal server error
Error with kind osdu:wks:AbstractActivityState:1.0.0: Message: Internal server error
Error with kind osdu:wks:AbstractAliasNames:1.0.0: Message: Internal server error
Error with kind osdu:wks:AbstractAnyCrsFeatureCollection:1.1.0: Message: Internal server error
Error with kind osdu:wks:AbstractAnyCrsFeatureCollection:1.0.0: Message: Internal server error
Error with kind osdu:wks:AbstractCoordinates:1.0.0: Message: Internal server error
...
Error with kind osdu:wks:work-product--WorkProduct:1.0.0: Message: Internal server error
Error with kind osdu:wks:reference-data--WorkflowPersonaType:1.0.1: Message: Internal server error
Error with kind osdu:wks:reference-data--WorkflowPersonaType:1.0.0: Message: Internal server error
Error with kind osdu:wks:reference-data--WorkflowUsageType:1.0.1: Message: Internal server error
Error with kind osdu:wks:reference-data--WorkflowUsageType:1.0.0: Message: Internal server error
```
Each resulting in an error like this:
```
Error with kind osdu:wks:master-data--WellboreOpening:1.0.0: Message: Internal server error
Try PUT for id: osdu:wks:reference-data--WellboreOpeningStateType:1.0.0
{"error":{"code":500,"message":"Internal server error","errors":[{"domain":"global","reason":"internalError","message":"Internal server error"}]}}
https://osdu-pl2-srpl2-k8q2-istio-gw.centralus.cloudapp.azure.com/api/schema-service/v1/schemas/system
500
```
Logs for the schema service ([schama.zip](/uploads/ba19c442c8b488ecb22d79deaa48b1aa/schama.zip)) report the following exception:
```
java.lang.reflect.InaccessibleObjectException: Unable to make field private static final long java.util.ArrayList.serialVersionUID accessible: module java.base does not "opens java.util" to unnamed module @61e86192
```
(full logs in the attachment)
As a result, a full installation of OSDU is impossible.
**Expected Behavior:**
Schemas should be successfully added to OSDU without any errors.
**Actual Behavior:**
Repeated "Internal server error" prevents the addition of schemas.
**Steps to Reproduce:**
1. Proceed to the Service Schema Loading step from the Load Service Data section of OSDU installation instructions.
2. Execute the provided commands.
3. Observe the repeated "Internal server error" and check logs for details.
**Suggested Fix:**
Research suggests that the Java runtime environment might be causing the `InaccessibleObjectException` due to module restrictions in more recent Java versions. Consider revisiting the implementation to ensure compatibility with the Java version in use, or adjust the runtime environment to a version that doesn't enforce these module boundaries. I'm not sure whether this service changed its Java version, but it may be something to consider.
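As a possible workaround (not verified against this service), the usual JVM-level escape hatch for this class of error is to open the affected package to unnamed modules with `--add-opens`; how the option is injected depends on how the schema service container starts Java, so the variable and jar name below are assumptions:

```bash
# Hypothetical workaround: open java.util to unnamed modules so legacy reflection keeps working.
# JAVA_TOOL_OPTIONS is picked up by any JVM started in this environment; the jar name is illustrative.
export JAVA_TOOL_OPTIONS="--add-opens java.base/java.util=ALL-UNNAMED"
java -jar schema-service.jar
```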
**Environment:**
- OSDU version: 0.23
EDIT: added some formatting and got a spam update error.

---

[IMPROVEMENT] Lack of Automated Script in Cosmos DB Firewall Update Instructions
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/321
2023-09-20, Paweł Grudzień
**Description:**
The current instructions for updating the Cosmos DB firewall settings do not include an automated method to add the user's current public IP address. Users often encounter issues when their requests originate from an IP that is blocked by the Cosmos DB firewall, as indicated by the error:
```
azure.cosmos.exceptions.CosmosHttpResponseError: (Forbidden) Request originated from IP xx.xx.xx.xx through public internet. This is blocked by your Cosmos DB account firewall settings.
```
**Details:**
When users attempt to connect to Cosmos DB from an unlisted IP address, they receive the above error. This requires them to manually check their public IP and then update the Cosmos DB firewall settings, a process that can be tedious and error-prone.
**Expected Behavior:**
Users should have a seamless way to add their current public IP address to the Cosmos DB firewall settings without having to manually determine their IP and update the settings.
**Actual Behavior:**
Users need to manually determine their public IP and update the Cosmos DB firewall settings, resulting in possible human errors and inefficiencies.
**Steps to Reproduce:**
1. Access Cosmos DB from an IP not listed in the firewall settings.
2. Observe the aforementioned error.
3. Manually determine the public IP.
4. Manually update the Cosmos DB firewall settings to include the new IP.
**Suggested Fix:**
Provide an automated bash script that:
1. Determines the user's public IP.
2. Fetches the existing allowed IPs from the Cosmos DB firewall settings.
3. Adds the new IP to the list if not already present.
4. Updates the firewall settings with the new list.
Here's the script:
```bash
#!/bin/bash
# Ensure required environment variables are set
if [[ -z "$COSMOS_ENDPOINT" || -z "$GROUP" ]]; then
echo "Please make sure the COSMOS_ENDPOINT and GROUP environment variables are set."
exit 1
fi
# Extract Cosmos DB account name from the endpoint URL
COSMOS_DB_ACCOUNT_NAME=$(echo $COSMOS_ENDPOINT | awk -F'://' '{print $2}' | awk -F'.' '{print $1}')
# Fetch the public IP address
MY_IP=$(curl -s ifconfig.me)
# Fetch existing allowed IPs from Cosmos DB
EXISTING_IPS=$(az cosmosdb show --name $COSMOS_DB_ACCOUNT_NAME --resource-group $GROUP --query "ipRangeFilter" -o tsv)
# Check if your IP is already in the list
if [[ $EXISTING_IPS == *$MY_IP* ]]; then
echo "Your IP ($MY_IP) is already in the list."
exit 0
fi
# Combine your IP with the existing IPs
if [ -z "$EXISTING_IPS" ]; then
NEW_IPS=$MY_IP
else
NEW_IPS="$EXISTING_IPS,$MY_IP"
fi
# Update the firewall rules
az cosmosdb update --name $COSMOS_DB_ACCOUNT_NAME --resource-group $GROUP --ip-range-filter "$NEW_IPS"
echo "Firewall rules updated successfully."
```
Users only need to have the `COSMOS_ENDPOINT` and `GROUP` environment variables set (the Cosmos DB account name is derived from the endpoint) and then run the script.
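For example, a typical invocation might look like this (the script filename is illustrative):

```bash
# Assumed usage; the endpoint and resource group values are placeholders.
export COSMOS_ENDPOINT="https://<cosmos-account>.documents.azure.com:443/"
export GROUP="<resource-group-name>"
bash update-cosmos-firewall.sh
```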
**Environment:**
- Azure Cosmos DB SDK version: (e.g., 2.14.0, or the version you are referring to)
- Azure CLI version: (e.g., 2.x.x)

---

Authentication hangs with AzureRM provider version 2.64.0 in Monitoring Resources terraform and needs update
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/320
2023-09-18, Paweł Grudzień
**Description:**
When using the AzureRM Terraform provider at version `2.64.0`, the Monitoring Resources `terraform apply` hangs indefinitely without providing any error message. However, after updating the AzureRM provider to a newer version, the problem is resolved, suggesting an authentication issue with version `2.64.0`. Unfortunately, I did not capture the logs.
**Details:**
In a Terraform script with the below configuration:
```
terraform {
required_version = ">= 1.3"
backend "azurerm" {
key = "terraform.tfstate"
}
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = "=2.64.0"
}
random = {
source = "hashicorp/random"
version = "=2.3.1"
}
}
}
```
The terraform apply command hangs indefinitely during execution. Although no error message was shown in standard logs, verbose logs indicated an authentication error to Azure.
**Expected Behavior:**
The `terraform apply` command should either execute successfully or fail with a clear error message.
**Actual Behavior:**
The script hangs indefinitely without any feedback to the user.
**Steps to Reproduce:**
1. Use the AzureRM provider at version `2.64.0` in a Terraform script.
2. Execute the script.
3. Observe that it hangs without any clear error message.
**Workaround:**
Upgrading the AzureRM provider to a newer version (e.g., `3.73.0`) resolves the problem.
**Suggested Fix:**
Update the documentation to pin the latest version of the provider.
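For illustration, the `required_providers` block above could be pinned to the version that was verified to work in the workaround (version number taken from this report, not an official recommendation):

```
required_providers {
  azurerm = {
    source  = "hashicorp/azurerm"
    # 3.73.0 is the version reported to resolve the hang in the workaround above
    version = "=3.73.0"
  }
}
```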
**Environment:**
- Terraform version: (e.g., 1.5.1)
- AzureRM provider version where the issue was observed: `2.64.0`

---

Incorrect usage of trim function leads to malformed resource names in monitoring resources terraform
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/319
2023-09-18, Paweł Grudzień

In the Terraform module, the trim function is being used to remove specific suffixes from strings. However, the current usage can lead to the unintended removal of characters, causing malformed resource names in Azure resources.
Description:
In the Monitoring Resources main.tf Terraform module, the trim function is being used to remove specific suffixes from strings. However, the current usage can lead to the unintended removal of characters, causing malformed resource names in Azure resources.
Details:
The specific instance observed is in the trimming of the -rg suffix from resource group names. The current code uses:
```
central_group_prefix = trim(data.terraform_remote_state.central_resources.outputs.central_resource_group_name, "-rg")
```
The intention is to remove the -rg suffix, but due to the behavior of trim, it also removes any individual -, r, and g characters from the ends of the string, leading to unexpected results.
For instance, a name like "osdu-pl2-crpl2-583g-rg" is trimmed to "osdu-pl2-crpl2-583" instead of the expected "osdu-pl2-crpl2-583g".
Expected Behavior:
The -rg suffix should be removed without affecting other characters in the string.
Actual Behavior:
Characters within the -rg suffix are being removed individually if they are at the ends of the string, leading to unexpected results.
Steps to Reproduce:
1. Use a resource group name like "osdu-pl2-crpl2-583g-rg".
2. Apply the Terraform module.
3. Observe that resources dependent on the central_group_prefix variable have the g character missing.
Suggested Fix:
Replace the trim function with the trimsuffix function, which will only remove the exact -rg suffix:
```
central_group_prefix = trimsuffix(data.terraform_remote_state.central_resources.outputs.central_resource_group_name, "-rg")
```
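For illustration, the difference can be checked quickly in `terraform console` using the value from the example above:

```
> trim("osdu-pl2-crpl2-583g-rg", "-rg")
"osdu-pl2-crpl2-583"
> trimsuffix("osdu-pl2-crpl2-583g-rg", "-rg")
"osdu-pl2-crpl2-583g"
```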
This change should be applied wherever the trim function is used in a similar context.

---

Error: Plugin did not respond
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/317
2023-09-11, Yukuo Wang
We captured several terraform plan failures recently.
```
│ Error: Plugin did not respond
│
│ with module.system_storage_account.azurerm_storage_account.main,
│ on ../../../modules/providers/azure/storage-account/main.tf line 19, in resource "azurerm_storage_account" "main":
│ 19: resource "azurerm_storage_account" "main" {
│
│ The plugin encountered an error, and failed to respond to the
│ plugin.(*GRPCProvider).ReadResource call. The plugin logs may contain more
│ details.
╷
│ Error: Request cancelled
│
│ with module.keyvault_policy.azurerm_key_vault_access_policy.keyvault[0],
│ on ../../../modules/providers/azure/keyvault-policy/main.tf line 15, in resource "azurerm_key_vault_access_policy" "keyvault":
│ 15: resource "azurerm_key_vault_access_policy" "keyvault" {
│
│ The plugin.(*GRPCProvider).UpgradeResourceState request was cancelled.
╵
```
Also with stack trace logs:
Stack trace from the terraform-provider-azurerm_v3.39.1_x5 plugin:
```
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x4c12582]
goroutine 1950 [running]:
github.com/hashicorp/terraform-provider-azurerm/internal/services/containers.resourceKubernetesClusterRead(0xc001d94480, {0x5d01ea0?, 0xc000737000})
github.com/hashicorp/terraform-provider-azurerm/internal/services/containers/kubernetes_cluster_resource.go:2060 +0x9c2
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).read(0x6f8e340?, {0x6f8e340?, 0xc001fd32c0?}, 0xd?, {0x5d01ea0?, 0xc000737000?})
github.com/hashicorp/terraform-plugin-sdk/v2@v2.24.1/helper/schema/resource.go:712 +0x178
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).RefreshWithoutUpgrade(0xc000b56b60, {0x6f8e340, 0xc001fd32c0}, 0xc001f90750, {0x5d01ea0, 0xc000737000})
github.com/hashicorp/terraform-plugin-sdk/v2@v2.24.1/helper/schema/resource.go:1015 +0x585
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*GRPCProviderServer).ReadResource(0xc00152f980, {0x6f8e340?, 0xc001fd2ea0?}, 0xc001c5a100)
github.com/hashicorp/terraform-plugin-sdk/v2@v2.24.1/helper/schema/grpc_provider.go:613 +0x4a5
github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server.(*server).ReadResource(0xc001930320, {0x6f8e340?, 0xc001fd2780?}, 0xc001127140)
github.com/hashicorp/terraform-plugin-go@v0.14.1/tfprotov5/tf5server/server.go:748 +0x4b1
github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_ReadResource_Handler({0x63c6d80?, 0xc001930320}, {0x6f8e340, 0xc001fd2780}, 0xc001347b20, 0x0)
github.com/hashicorp/terraform-plugin-go@v0.14.1/tfprotov5/internal/tfplugin5/tfplugin5_grpc.pb.go:349 +0x170
google.golang.org/grpc.(*Server).processUnaryRPC(0xc00027a000, {0x6f9e380, 0xc000f9e000}, 0xc002595b00, 0xc001993530, 0xb246a90, 0x0)
google.golang.org/grpc@v1.50.1/server.go:1340 +0xd23
google.golang.org/grpc.(*Server).handleStream(0xc00027a000, {0x6f9e380, 0xc000f9e000}, 0xc002595b00, 0x0)
google.golang.org/grpc@v1.50.1/server.go:1713 +0xa2f
google.golang.org/grpc.(*Server).serveStreams.func1.2()
google.golang.org/grpc@v1.50.1/server.go:965 +0x98
created by google.golang.org/grpc.(*Server).serveStreams.func1
google.golang.org/grpc@v1.50.1/server.go:963 +0x28a
Error: The terraform-provider-azurerm_v3.39.1_x5 plugin crashed!
This is always indicative of a bug within the plugin. It would be immensely
helpful if you could report the crash with the plugin's maintainers so that it
can be fixed. The output above should help diagnose the issue.
```
While troubleshooting this, we noticed that there is a bug fix:
Fix nil panic by correcting nil check expression: https://github.com/hashicorp/terraform-provider-azurerm/pull/21850
This fix is included in terraform-provider-azurerm v3.57.0 (May 19, 2023):
https://github.com/hashicorp/terraform-provider-azurerm/blob/v3.57.0/CHANGELOG.md
BUG FIXES:
data.azurerm_kubernetes_cluster - prevent a panic when some values returned are nil (#21850)

---

[Feature] Airflow2 stage with private endpoints
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/315
2023-08-02, Arturo Hernandez [EPAM]
# Airflow2 stage
---
By default, airflow2 is deployed at the service resources stage, and one airflow instance is configured for the whole OSDU.
It looks like a single airflow2 is not enough for service resources in a multi-partition environment; therefore, airflow2 is deployed externally per data partition, in a separate network and subnet (brand new airflow2 resources will be created).
In order to secure and improve performance when using an external airflow2, we need to set up private endpoints for those resources, including a private endpoint for the partition-airflow2 application gateway from the main AKS cluster.
Airflow2 interacts mostly with the storage accounts, so I guess private endpoints for those will be needed as well.
## Airflow2 independent stage
---
I have a strong opinion that airflow2 should be segregated from the partition resources. If there is a need for a new external airflow, it should be created as a separate stage (like data-partition, service, central): an "airflow" resources stage that provides a configured airflow out of the box.
## ADF Replacement
---
To achieve convergence between ADME and community we might want to start thinking about Azure Data Factory, which is already available in [terraform - AzureRM](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/data_factory.html). We could start this migration smoothly at the data partition level and then at the service resources level.
## Action items
---
@lucynliu @nursheikh I would like to start this discussion here in the forum; it would be nice to work on convergence between community and ADME. I have the feeling we should get rid of the per-partition Airflow2 resources (including the AKS for airflow) and, as a first stage, consider using ADF per partition as an optional feature, then move forward with ADF at Service Resources. Alternatively, it may already be fine to start considering ADF per partition now (I don't know if this is really convenient).
We should also include the optional feature of private endpoints from AKS to ADF/AKS-Airflow in any case.
cc. @lucynliu @vleskiv

---

Feature - multiregion private endpoints
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/262
2023-05-10, Arturo Hernandez [EPAM]
As of now, since private endpoints were introduced, the main restriction is that private endpoints are restricted to the same region:
* https://learn.microsoft.com/en-us/azure/private-link/private-endpoint-overview#private-endpoint-properties
> The private endpoint must be deployed in the same region and subscription as the virtual network.
This means that if we plan to deploy another partition in a different region, it will not be possible due to this limitation.
Ideally, I think this would be the best approach:
* CR and SR should be deployed in the same region (No need for new virtual network)
* SR will need a new subnet for network peering.
* DP should have its own network and subnet, which will peer with the SR subnet. All the private endpoints will be created in the DP, attached to the dedicated virtual network of the DP resources.
I think this is not a priority to be implemented in a specific milestone; I guess there are few use cases in which we may want to have partitions in a different region than the control plane.

---

objectId field is not present
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/261
2023-04-26, Dmytro Komisar
Here the [README](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/blob/master/infra/templates/osdu-r3-mvp/central_resources/README.md?plain=1#L40) says
```bash
az ad sp list --display-name $NAME --query [].objectId -ojson
```
but the output JSON does not have an ".objectId" field. I assume just ".id" is what is needed, but it definitely needs to be corrected.
Also, [line 48](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/blob/master/infra/templates/osdu-r3-mvp/central_resources/README.md?plain=1#L48) says:
```bash
az ad app permission admin-consent --id $appId
```
where $appId was not set. Again, I assume this should be the "appId" from line 22, but I am not sure about this.
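For reference, a possible corrected sequence is sketched below; the Microsoft Graph based Azure CLI exposes `id` rather than `objectId`, and the way `$appId` is obtained here is only an assumption about what the README intends:

```bash
# Assumed correction: query ".id" (Microsoft Graph) instead of the removed ".objectId"
az ad sp list --display-name "$NAME" --query "[].id" -o json

# Assumed source of $appId: the application (client) ID of the same app registration
appId=$(az ad app list --display-name "$NAME" --query "[0].appId" -o tsv)
az ad app permission admin-consent --id "$appId"
```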
Could these please be fixed?

---

Automation gaps in Release Process - Phase 3
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/244
2022-12-01, Krishna Nikhil Vedurumudi
Automate post-deployment activities.
- [ ] Schema bootstrapping - load standard schemas to the system.
- [ ] DAG upload - Upload DAGs to the airflow storage account.
- [ ] Data loading
- [ ] Standard references.
- [ ] TNO dataset
- [ ] DDMSs to use helm-charts-azure in community migrate to standard-ddms
- [ ] Implement helm-chart-azure pipeline on demand in preship and demo envs
- [ ] azure_code_coverage changes for all java services

---

Upgrade AKS/Istio in Flux based model
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/243
2022-09-29, Vasyl Leskiv [SLB]
- Flux based model - Istio v1.11.3 (max supported AKS version: 1.22)
- Helm based model - Istio v1.14.0 (max supported AKS version: 1.24)
As we decided to continue supporting the Flux based model, it would be good to sync the Istio version with the Helm based model, to be able to upgrade AKS to the latest version according to client requirements.

---

Azure AD for authentication to be used to connect to PostgresDB
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/242
2022-09-02, devesh bajpai
As of today, Airflow uses credentials stored in KeyVault to connect to Postgres via the PG bouncer service.
Customer has raised a concern regarding how the Postgres DB is being used by Airflow. As a recommended best practice, Azure AD should be used for authentication (see https://docs.microsoft.com/en-us/azure/postgresql/single-server/concepts-azure-ad-authentication and https://docs.microsoft.com/en-us/azure/postgresql/single-server/how-to-configure-sign-in-azure-ad-authentication).

---

Drop support of keda_v2_enabled flag on services side
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/241
2022-08-30, Vasyl Leskiv [SLB]

Since the feature flag has been added to the service repos beyond the infrastructure repo (for example helm-charts-azure), we need to clean up on the services side and drop the file [infra-azure-provisioning/docs/keda-upgrade.md](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/blob/master/docs/keda-upgrade.md), as it doesn't make sense to support keda v1 anymore.

---

Airflow Logs getting truncated in log Analytics
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/234
2022-08-02, devesh bajpai
Airflow logs created in the blob store are sent to log analytics;
refer: https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/tree/master/source/airflow-function
but it is observed that when airflow logs have multiple lines, those logs are truncated in log analytics.
> e.g.
airflow logs in blob store
<pre>
[2022-06-10, 06:23:09 UTC] {validate_schema.py:322} ERROR - Schema validation error. Data field.
[2022-06-10, 06:23:09 UTC] {validate_schema.py:323} ERROR - Manifest kind: osdu:wks:work-product-component--WellboreTrajectory:1.1.0
[2022-06-10, 06:23:09 UTC] {validate_schema.py:324} ERROR - Error: None is not of type 'string'
Failed validating 'type' in schema['properties']['data']['allOf'][3]['properties']['AppliedOperations']['items']:
{'type': 'string'}
On instance['data']['AppliedOperations'][0]:
None
</pre>
> export from Log Analytics
<pre>
--------------------------------------------------------------------------------",INFO
"2022-06-10 06:23:09,305","Error: None is not of type 'string'",ERROR
"2022-06-10 06:23:09,305","Manifest kind: osdu:wks:work-product-component--WellboreTrajectory:1.1.0",ERROR
"2022-06-10 06:23:09,304","Schema validation error. Data field.",ERROR
"2022-06-10 06:23:09,026","Exporting the following env vars:
</pre>

---

[BUG] Swagger page could not be displayed
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/228
2022-06-09, Rostislav Vatolin

Swagger page could not be displayed after the recent upgrade of springfox-boot-starter to 3.0.0 for the partition service. The AuthorizationPolicy for the partition service requires a fix.

---

[AKS Policies] Fix volume types policy to comply with least privilege principle
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/227
2022-06-07, Arturo Hernandez [EPAM]
Currently the policy applied for "Allowed volume types" is allowing `*`:
```json
{
"effect": { "value": "audit"},
"excludedNamespaces": {"value": ["kube-system", "gatekeeper-system", "azure-arc"]},
"allowedVolumeTypes": {"value": ["*"]}
}
```
To support the keyvault and csi providers, we need to adopt the least privilege principle and get rid of the "all" expression.
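For illustration, a tightened parameter set might look like the following; the exact volume types needed depend on the workloads, so this list is only an assumption covering the CSI/Key Vault provider plus the common built-in types:

```json
{
  "effect": { "value": "audit" },
  "excludedNamespaces": { "value": ["kube-system", "gatekeeper-system", "azure-arc"] },
  "allowedVolumeTypes": { "value": ["configMap", "secret", "emptyDir", "projected", "downwardAPI", "csi", "persistentVolumeClaim"] }
}
```

The volume type names follow the Kubernetes pod security policy vocabulary and would need to be verified against the actual Azure Policy definition before applying.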
Related to #218

---

Disable Registry Scan feature for flux
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/225
2022-04-12, Dzmitry_Paulouski (slb)
There are a lot of error messages in the Flux pod.
They are caused by Flux checking for new images, but access to container registry is not provided:
https://fluxcd.io/legacy/flux/faq/#how-do-i-give-flux-access-to-an-image-registry
_Flux transparently looks at the image pull secrets that you attach to workloads and service accounts, and thereby uses the same credentials that Kubernetes uses for pulling each image. In general, if your pods are running, then Kubernetes has pulled the images, and Flux should be able to access them too._
Since we do not use this feature, it can be disabled: https://fluxcd.io/legacy/flux/faq/#can-i-disable-flux-registry-scanning
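A sketch of what that could look like, assuming the legacy Flux daemon is deployed as a Kubernetes Deployment and supports the `--registry-disable-scanning` flag described in the linked FAQ (the surrounding manifest excerpt is illustrative):

```
# Illustrative excerpt of the flux Deployment; only the added flag is the point here
spec:
  template:
    spec:
      containers:
        - name: flux
          args:
            - --git-url=git@example.com:org/repo.git   # existing args kept as-is (placeholder URL)
            - --registry-disable-scanning              # turn off image registry scanning
```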
---

Standardize the Environment variable naming for Airflow Variables
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/219
2022-01-19, harshit aggarwal

As we see, some variables use HOST as a suffix while others use URL; we should standardize:
```
AIRFLOW_VAR_CORE__SERVICE__SCHEMA__URL: "https://#{OSDU_SVC_ENDPOINT}#/api/schema-service/v1"
AIRFLOW_VAR_CORE__SERVICE__SEARCH__URL: "https://#{OSDU_SVC_ENDPOINT}#/api/search/v2"
AIRFLOW_VAR_CORE__SERVICE__STORAGE__URL: "https://#{OSDU_SVC_ENDPOINT}#/api/storage/v2"
AIRFLOW_VAR_CORE__SERVICE__FILE__HOST: "https://#{OSDU_SVC_ENDPOINT}#/api/file/v2"
AIRFLOW_VAR_CORE__SERVICE__WORKFLOW__HOST: "https://#{OSDU_SVC_ENDPOINT}#/api/workflow"
AIRFLOW_VAR_CORE__SERVICE__WORKFLOW__HOST: "https://#{OSDU_SVC_ENDPOINT}#/api/workflow/v1"
AIRFLOW_VAR_CORE__SERVICE__SEARCH_WITH_CURSOR__URL: "https://#{OSDU_SVC_ENDPOINT}#/api/search/v2/query_with_cursor"
AIRFLOW_VAR_CORE__SERVICE__PARTITION__URL: "https://#{OSDU_SVC_ENDPOINT}#/api/partition/v1"
AIRFLOW_VAR_CORE__SERVICE__LEGAL__HOST: "https://#{OSDU_SVC_ENDPOINT}#/api/legal/v1"
AIRFLOW_VAR_CORE__SERVICE__ENTITLEMENTS__URL: "https://#{OSDU_SVC_ENDPOINT}#/api/entitlements/v2"
```
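For illustration, one possible standardized form (hypothetical renames, with every endpoint using the `__URL` suffix) could be:

```
AIRFLOW_VAR_CORE__SERVICE__FILE__URL: "https://#{OSDU_SVC_ENDPOINT}#/api/file/v2"
AIRFLOW_VAR_CORE__SERVICE__WORKFLOW__URL: "https://#{OSDU_SVC_ENDPOINT}#/api/workflow/v1"
AIRFLOW_VAR_CORE__SERVICE__LEGAL__URL: "https://#{OSDU_SVC_ENDPOINT}#/api/legal/v1"
```

Whether the DAGs and services consuming these variables expect the `__HOST` names would need to be checked before any rename.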
---

Azure Ad Application URIIdentifiers new restrictions added causing deployment failure
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/217
2022-01-25, Vivek Ojha

Creating the central resources of an OSDU Azure instance gives the following error:
```
Error: graphrbac.ApplicationsClient#Create: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="Unknown" Message="Unknown service error" Details=[{"odata.error":{"code":"Request_BadRequest","date":"2021-12-15T14:49:28","message":{"lang":"en","value":"Values of identifierUris property must use a verified domain of the organization or its subdomain: 'http://osdu-mvp-cr022-0bsd-app'"},"requestId":"84ba8e6f-224b-4b88-9a0c-587a52afc283","values":[{"item":"PropertyName","value":"identifierUris"},{"item":"PropertyErrorCode","value":"HostNameNotOnVerifiedDomain"},{"item":"HostName","value":"http://osdu-mvp-cr022-0bsd-app"}]}}]
on ../../../modules/providers/azure/ad-application/main.tf line 20, in resource "azuread_application" "main":
20: resource "azuread_application" "main" {https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/215Required secrets for postgresql in keyvault2021-12-10T06:24:37ZAalekh JainRequired secrets for postgresql in keyvaultWorkflow ingestion service needs to connect to the postgresql database (that is primarily used by airflow). This is required in order to implement the feature where we have to query the postgresql dataset.
Workflow ingestion service needs to connect to the postgresql database (that is primarily used by airflow). This is required in order to implement the feature where we have to query the postgresql dataset.
As of now, there's no clear way to obtain the hostname and username (for the db) that will allow us to connect to the postgresql for running the custom queries.
These changes are added as part of the following MR in workflow ingestion service -
https://community.opengroup.org/osdu/platform/data-flow/ingestion/ingestion-workflow/-/merge_requests/199
The corresponding MR (in infra azure provisioning) that adds these changes is - !549
cc: @kibattul