infra-azure-provisioning issues
https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues
2021-09-06T05:34:10Z

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/198
[Airflow Alert] Web host count alert triggered even when host count is above threshold value
2021-09-06T05:34:10Z
Bhakti Thakkar
1. Attached is the graph which shows host count is 3.
Threshold value is 2.
![image](/uploads/e76a1b0815d3e12d7112219529c6b54e/image.png)
Screenshot for aks
![image](/uploads/792944ddebec49578526a4c48ba2997d/image.png)

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/197
Enable BYOAD by adding feature flag for ad application in central resources
2023-08-16T10:40:37Z
Vivek Ojha
Vivek Ojha

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/196
Close Release 0.11
2023-08-16T10:40:37Z
MANISH KUMAR
Vivek Ojha

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/195
terraform init failed with L Error: Unreadable module directory Unable to evaluate directory symlink: lstat ../../../modules: no such file or directory
2021-08-26T22:28:04Z
Asraful Chowdhury
The command below fails (**Terraform v0.14.4**):
` terraform init -backend-config "storage_account_name=${TF_VAR_remote_state_account}" -backend-config "container_name=${TF_VAR_remote_state_container}"`
Initializing modules...
- ad_application in
- app_insights in
- container_registry in
- graph_account in
- keyvault in
- keyvault_policy in
- log_analytics in
- service_principal in
- storage_account in
```
Error: Unreadable module directory
Unable to evaluate directory symlink: lstat ../../../modules: no such file or
directory
```
Any thoughts about this?

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/194
Update airflow with latest OSDU python SDK
2021-07-30T07:54:11Z
Kishore Battula
Update airflow with latest version of OSDU python SDK
M7 - Release 0.10.0 - remove

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/193
Onboard Seismic Dms File Metadata Service
2021-08-30T17:16:12Z
Vladimir Moiseev
**Service name**: `Seismic Dms File Metadata Service`
The following steps must be completed for a service to onboard with OSDU on Azure. Additionally, please add the `Service Onboarding` tag to this issue when it is created.
For more information, visit our service onboarding documentation [here](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/blob/master/docs/service-onboarding.md).
## Steps:
**Infrastructure and Initial Requirements**
- [x] Add any additional Azure cloud infrastructure (Cosmos containers, Storage containers, fileshares, etc.) to the Terraform template. [Link](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/tree/master/infra/templates/osdu-r3-mvp). Note that if the infrastructure is a part of the data-partition template, you may need to add secrets to the keyvault that are partition specific; if doing so, update the createPartition REST request to include the keys that you have added so they are accessible in service code. [Link](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/blob/master/tools/rest/partition.http#L48)
- [x] Create an ingress point for the service. [Link](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/blob/master/charts/osdu-common/templates/appgw-ingress.yaml)
- [x] Add any test data that is required for the service integration tests. [Link](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/tree/master/tools/test_data)
- [x] Update `upload-data.py` to upload any new test data files you created. [Link](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/blob/master/tools/test_data/upload-data.py).
- [x] Update the integration tester with any entitlements required to test the service. [Link](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/blob/master/tools/test_data/user_info_1.json)
- [x] Add in any new secrets that the service needs to run. [Link](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/blob/master/charts/osdu-common/templates/kv-secrets.yaml)
- [x] Create environment variable script to generate .yaml files to be used with Intellij [EnvFile](https://plugins.jetbrains.com/plugin/7861-envfile) plugin and .envrc files to be used with [direnv](https://direnv.net/). [Link](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/tree/master/tools/variables)
**Gitlab Code and Documentation**
- [x] Complete the service code such that it passes all integration tests locally. There is some documentation on starting off implementing an Azure provider. [Link](./gitlab-service-readme-template.md)
- [x] Create helm charts for service. The charts for each service are located in the `devops/azure` directory. You can look at charts from other services as a model. The charts will be nearly identical except for the different environment variables, values, etc each service needs to run. [Link](./gitlab-service-guide.md)
- [x] Implement Istio for the service if this has not already been done. Here is an example MR that shows what steps are required. [Link](https://community.opengroup.org/osdu/platform/system/storage/-/merge_requests/64)
- [x] Create an Istio auth policy in the `devops/azure/chart/templates` directory. Here is an example of an Istio auth policy that is generic and can be used by other services. [Link](https://community.opengroup.org/osdu/platform/system/storage/-/blob/master/devops/azure/chart/templates/azure-istio-auth-policy.yaml)
- [x] Add any variables that are required for the service integration tests to the Azure CI-CD file. [Link](https://community.opengroup.org/osdu/platform/ci-cd-pipelines/-/blob/master/cloud-providers/azure.yml)
- [x] Verify that the README for the Azure provider correctly and clearly describes how to run and test the service. There is a README template to help. [Link](./gitlab-service-readme-template.md)
- [x] Push any changes and verify that the Gitlab pipeline is passing in master.
**Development and Demo Azure Devops Pipelines**
- [ ] Create development ADO pipeline at `devops/azure/development-pipeline.yml` in the service repo.
- [ ] Verify development pipeline passes in ADO.
- [ ] Create Demo ADO pipeline at `devops/azure/pipeline.yml` in the service repo.
- [ ] Verify demo pipeline is passing in ADO.
**User Documentation**
- [ ] Add the service to the mirror pipeline instructions. [Link](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/blob/master/docs/code-mirroring.md)
- [ ] Add the service to the manual deployment instructions. [Link](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/tree/master/charts)
- [ ] Add any required variables to the already existing variable group instructions for automated deployment. You should know if any variables need to be added to existing variable groups from creating the development and demo pipelines. [Link](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/blob/master/docs/service-automation.md#create-osdu-service-libraries)
- [ ] Add a variable group `Azure Service Release - $SERVICE_NAME` to the documentation. You should know what values to set for this variable group from creating the development and demo pipelines. [Link](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/blob/master/docs/service-automation.md#create-osdu-service-libraries)
- [ ] Add a step for creating the service pipeline at the bottom of the service-automation page. [Link](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/blob/master/docs/service-automation.md#create-osdu-service-libraries)
- [ ] Create a REST script with sample calls to the service for users. [Link](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/tree/master/tools/rest)
Vladimir Moiseev

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/192
Release 0.10.0
2021-08-09T16:03:14Z
MANISH KUMAR
**Release name**: `0.10.0`
The following steps must be completed for an OSDU Azure release.
For more information, visit our release documentation [here](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/blob/master/CHANGELOG.md).
## Steps:
**Infra Board closure**
- [ ] Mark all issues closed which have been completed in the release. [Link](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/boards). Note that creating an issue titled "Close Release - <version>" helps mark the release.
**Deploy terraform scripts**
- [ ] Central Resources
- [ ] Service Resources
- [ ] Data Partitions
**Create service release images**
- [ ] Upload release service images to ACR.
- [ ] Create and Update service charts
- [ ] Upload service charts to ACR.
**Create Data seeding**
- [ ] Update documentation for the following seeding data: Config, Manifest DAG, CSV Parser DAG, Schema, Entitlements, Policy, ZGY DAG and VDS DAG.
- [ ] Create and upload a versioned image for the following seeding data: Config, Manifest DAG, CSV Parser DAG, Schema, Entitlements, Policy, ZGY DAG and VDS DAG.
- [ ] Update scripts for data seeding.
**Upload Data prior to service deployment**
- [ ] Upload Config Data
- [ ] Upload Manifest Ingest DAG
**Service deployment**
- [ ] Partition Service
- [ ] Security Services
- [ ] Core Services
- [ ] Reference Services
- [ ] Ingest Services
- [ ] Seismic Services
- [ ] Wellbore Services
**Upload Data after service deployment**
- [ ] Upload Entitlements Data
- [ ] Upload Schema
- [ ] Upload Policies
- [ ] Upload CSV Parser DAG
- [ ] Upload SEGY to ZGY DAG Conversion
- [ ] Upload SEGY to VDS DAG Conversion
**Validation**
Test services and DAGs using the REST scripts ([Link](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/tree/master/tools/rest)).
- [ ] Test services
- [ ] Test Manifest DAG
- [ ] Test CSV DAG
- [ ] Test ZGY DAG
- [ ] Test VDS DAG
M7 - Release 0.10.0 - remove
MANISH KUMAR

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/191
Airflow host count Alerts not triggered
2021-08-11T12:15:39Z
Abhishek Kumar
I am setting the replica count in the Airflow helm config as shown below to trigger the following alerts, but they are not triggered.
airflow-web-host-count-alert
replica - 1
AIRFLOW__WEBSERVER__WORKERS: 0
Example log query result for the web host count, which is 1, yet no alert is triggered.
![image](/uploads/a651cde67ba778943a044677439161c5/image.png)
Similarly for worker and scheduler too, I don't see any alerts even if the query returns results.
Can you please check if there's something missing here?
Mayank Saggar [Microsoft]

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/190
Close Release 0.10.0
2021-07-28T05:59:15Z
MANISH KUMAR
M7 - Release 0.10.0 - remove
MANISH KUMAR, Vivek Ojha

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/189
Azure postgresql storage setup
2021-08-04T08:37:45Z
Oleksandr Stetskiv-SLB
The [storage auto-grow feature](https://docs.microsoft.com/en-us/azure/postgresql/concepts-pricing-tiers#storage-auto-grow) is enabled in Postgres, so it increases the storage size by 5 GB each time the server runs out of capacity.
This feature is not compatible with the current Terraform scripts because the Postgres storage size is hardcoded in Terraform; we need to use [lifecycle ignore_changes](https://www.terraform.io/docs/language/meta-arguments/lifecycle.html#ignore_changes) for this attribute.

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/188
OSDU Azure infra setup
2021-11-18T15:13:03Z
Arvind Bhoj
Completed the OSDU Azure infra setup, but the unit test for service_resources fails with the following error:
unit.go:143: Plan unexpectedly had 79 resources instead of 92
I followed the steps at the following link, and all the Terraform templates completed successfully:
[infra/templates/osdu-r3-mvp](https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/tree/master/infra/templates/osdu-r3-mvp)

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/187
Use latest secure TLS version for Storage accounts
2021-06-29T08:33:55Z
Vasyl Leskiv [SLB]
Motivation:
* Use the most recent secure TLS version in the storage account module by default.
* Some client subscriptions have Azure policies configured that prevent OSDU Terraform deployment with TLS version 1.0 (used [by default when not explicitly set](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/storage_account#min_tls_version)), which is an outdated, insecure version.
Vasyl Leskiv [SLB]

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/186
Adding Airflow Multipartition Partition Changes
2022-08-23T10:47:33Z
harshit aggarwal
To enable multi-partition support for Airflow, the following infrastructure changes are required:
**New AKS Cluster needs to be created in dp resource group**
- Same configuration as what we have in service resources group
- Autoscaling needs to be enabled for AKS cluster
- The virtual network used by the node pool should accommodate at least 2,500 IP addresses
- AKS should have access to the node resource group in which the node pools exist
- AKS needs access to create and remove VMs in the node resource group
- AKS should have access to only pull images from central resources ACR as well as ACR created in data partition
- AKS should have access to the data partition specific pod identity
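As an illustration of the AKS requirements above, a minimal Terraform sketch might look like the following. It assumes the azurerm provider 2.x/3.x attribute names (`enable_auto_scaling` was renamed in later major versions). All resource names, the address range, the VM size, and the node counts are hypothetical placeholders, not values from this repository.

```hcl
# Hypothetical names and values throughout; not taken from the repository.
variable "resource_group_name" { type = string }
variable "location" { type = string }
variable "vnet_name" { type = string }
variable "central_acr_id" { type = string } # ACR in central resources

# A /20 subnet holds 4,096 addresses (4,091 usable after the 5 Azure reserves),
# which covers the 2,500 address requirement; a /21 (2,048) would be too small.
resource "azurerm_subnet" "dp_aks" {
  name                 = "dp-aks-subnet"
  resource_group_name  = var.resource_group_name
  virtual_network_name = var.vnet_name
  address_prefixes     = ["10.10.0.0/20"]
}

resource "azurerm_kubernetes_cluster" "dp_aks" {
  name                = "dp-aks"
  location            = var.location
  resource_group_name = var.resource_group_name
  dns_prefix          = "dp-aks"

  default_node_pool {
    name                = "default"
    vm_size             = "Standard_D4s_v3"
    vnet_subnet_id      = azurerm_subnet.dp_aks.id
    enable_auto_scaling = true # autoscaling requirement
    min_count           = 3
    max_count           = 10
  }

  identity {
    type = "SystemAssigned"
  }
}

# Pull-only access to the central resources ACR for the kubelet identity.
resource "azurerm_role_assignment" "acr_pull" {
  scope                = var.central_acr_id
  role_definition_name = "AcrPull"
  principal_id         = azurerm_kubernetes_cluster.dp_aks.kubelet_identity[0].object_id
}
```

A second `AcrPull` assignment scoped to the data partition ACR would follow the same pattern.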
**New managed identity needs to be created in dp resource group**
- Need read access to keyvault which is present in dp resource group
- Need access to fileshares/ blob storage for the storage account used by other osdu services
- Need access to storage queue to read and process
**New Postgresql server needs to be created in dp resource group**
- Same configuration steps as what we have in service resources group
- The only difference is that any secrets related to Postgres need to be stored in the data partition.
**Use existing storage account used by other osdu services**
- Create fileshares and directories internally similar to service resource group
- Create storage container similar to service resource group
- Create storage queue similar to service resource group
- Adding storage account secrets in dp keyvault
**Create event grid subscription to push logs to log analytics**
**New container registry needs to be created in dp resource group**
**New keyvault needs to be created in dp resource group**
**New redis cluster needs to be created in dp resource group**
- Same configuration steps as what we have in service resource group
- The only difference is that any secrets related to Redis need to be stored in the data partition.
**New log analytics workspace needs to be created in data partition to store task logs**
**Kubernetes changes needed**
- Install KEDA helm chart version 2.1.0
- Install Cert manager helm chart
- Install Kvsecrets helm chart
- Install aad-pod-identity helm chart
- Create OSDU namespace with istio injection enabled
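The namespace and KEDA steps above could be expressed with the Terraform `kubernetes` and `helm` providers, as in this sketch. The namespace name `osdu`, the `keda` namespace, and the release name are assumptions; only the KEDA chart version comes from the list above.

```hcl
# Namespace with automatic Istio sidecar injection enabled.
resource "kubernetes_namespace" "osdu" {
  metadata {
    name = "osdu" # assumed namespace name
    labels = {
      "istio-injection" = "enabled"
    }
  }
}

# KEDA pinned to the chart version named in the list.
resource "helm_release" "keda" {
  name             = "keda" # assumed release name
  repository       = "https://kedacore.github.io/charts"
  chart            = "keda"
  version          = "2.1.0"
  namespace        = "keda" # assumed
  create_namespace = true
}
```

The cert-manager, Kvsecrets, and aad-pod-identity charts would each need a similar `helm_release`; their chart versions are not stated here, so they are left out.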
Create Airflow-specific secrets and store them in the dp-specific keyvault
**AKS, Postgres, Redis, Virtual network diagnostics**
**New keyvault to be created in central resources which will have app insights key which is shared across all data partitions**
- The pod identity in data partition should have get access to this keyvault.
**Create NSG for aks subnet in data partition AKS cluster**
- Whitelist sr aks egress ip in this NSG
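One possible shape for the NSG requirement, as a standalone sketch: an inbound allow rule for the service resources AKS egress IP, attached to the data partition AKS subnet. The variable names, rule priority, and destination port are assumptions for illustration only.

```hcl
variable "resource_group_name" { type = string }
variable "location" { type = string }
variable "dp_aks_subnet_id" { type = string }
variable "sr_aks_egress_ip" { type = string } # egress IP of the service resources AKS

resource "azurerm_network_security_group" "dp_aks" {
  name                = "dp-aks-nsg"
  location            = var.location
  resource_group_name = var.resource_group_name
}

resource "azurerm_network_security_rule" "allow_sr_aks" {
  name                        = "allow-sr-aks-egress"
  priority                    = 100
  direction                   = "Inbound"
  access                      = "Allow"
  protocol                    = "Tcp"
  source_port_range           = "*"
  destination_port_range      = "443" # assumed port
  source_address_prefix       = var.sr_aks_egress_ip
  destination_address_prefix  = "*"
  resource_group_name         = var.resource_group_name
  network_security_group_name = azurerm_network_security_group.dp_aks.name
}

resource "azurerm_subnet_network_security_group_association" "dp_aks" {
  subnet_id                 = var.dp_aks_subnet_id
  network_security_group_id = azurerm_network_security_group.dp_aks.id
}
```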
**All the resources created should be feature flagged**

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/185
[Feature] Airflow Monitoring Dashboards.
2021-07-28T06:23:16Z
Mayank Saggar [Microsoft]
To monitor Airflow and its services, three dashboards (one for Airflow infra, one for the Airflow service, and one for Airflow DAGs) will be deployed as part of the Monitoring resources. The infra and service dashboards will be viewable at the data-partition level, whereas the DAGs dashboard will be viewable at the data-partition and DAG level.
M7 - Release 0.10
Mayank Saggar [Microsoft]

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/184
Fix istio chart to deploy it through helm install/upgrade
2022-08-23T10:47:30Z
Vineeth Guna [Microsoft]
The istio chart needs to be fixed so that it can be deployed with helm install/upgrade.
This will be used for enabling multi-partition support for Airflow.
Vineeth Guna [Microsoft]

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/183
Create new pipeline for deploying airflow in data partitions
2022-08-23T10:47:29Z
Vineeth Guna [Microsoft]
Deployment of Airflow in data partitions needs to be automated to enable multi-partition support for Airflow.
To accomplish this, we need to create an ADO pipeline that customers can use to automate the deployment.
Vineeth Guna [Microsoft]

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/182
Fix airflow chart to deploy it through helm install/upgrade
2022-08-23T10:47:28Z
Vineeth Guna [Microsoft]
The Airflow charts need to be templatized to be used with helm install/upgrade.
This is needed for enabling multi-partition support for Airflow.
These changes should not affect the existing flux deployment.
Vineeth Guna [Microsoft]

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/181
[Breaking Change] Zonal Redundancy for Airflow
2022-08-23T10:47:30Z
Abhishek Chowdhry
Point Airflow to the newly created zone-redundant Redis instance going forward. This will break the existing Airflow runs, and they will need to be triggered again.
**Consuming this Change**:
This change will break the existing Airflow Runs. If they can be retriggered without losing any data, just retrigger the Airflow Runs once this change is merged.
If retriggering end to end runs is not possible due to any reason and we don't want to lose the existing runs, there are 2 suggested methods:
1) Drain the entire queue by not sending any new requests to Airflow. Once the queue is drained, take the changes that point to the new queue (new Redis instance) and resume traffic to Airflow.
2) Stop sending any new requests to Airflow and take the changes that point to the new queue (new Redis instance). Then requeue all the tasks from the old queue into the new queue and resume traffic to Airflow.
Prefer the first option, as the second has a large requeuing overhead and may still result in data loss.
M7 - Release 0.10.0 - remove
Abhishek Chowdhry

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/180
Zonal Redundancy for Redis
2022-08-23T10:47:29Z
Abhishek Chowdhry
Enable zone redundancy for Redis by creating a new Redis Premium instance with zone redundancy enabled.
## Acceptance Criteria
* [X] Infra changes to add new Redis
* [X] Premerge pipeline
* [ ] Changes for Glab/dev/demo
* [ ] Changes for Manual
M7 - Release 0.10.0 - remove
Abhishek Chowdhry

https://community.opengroup.org/osdu/platform/deployment-and-operations/infra-azure-provisioning/-/issues/179
Zonal redundancy for AKS and App Gateway
2021-07-27T07:50:44Z
Vivek Ojha
Enable zonal redundancy for AKS and App Gateway
M7 - Release 0.10.0 - remove
Vivek Ojha