
Adding Airflow Multipartition Partition Changes

harshit aggarwal requested to merge airflow_do_changes_master_mr3 into master

Closes #186 (closed)

To enable multi-partition support for Airflow, the following infrastructure changes are required:

New AKS Cluster needs to be created in dp resource group

  • Same configuration as what we have in service resources group

  • Autoscaling needs to be enabled for AKS cluster

  • The virtual network used by the node pool should accommodate at least 2,500 IP addresses

  • AKS should have access to node resource group in which node pools exist

  • AKS should have access to create and remove VMs in the node resource group

  • AKS should have pull-only access to images from the central resources ACR as well as the ACR created in the data partition

  • AKS should have access to the data partition specific pod identity
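The 2,500-address requirement above translates into a minimum subnet size. The following is a minimal sketch of that arithmetic, not part of this merge request. It assumes an IPv4 subnet and relies on Azure reserving 5 addresses in every subnet; the function names and the example CIDR `10.10.0.0/20` are hypothetical.

```python
import ipaddress

# Azure reserves 5 addresses in every subnet (network, gateway, 2 DNS, broadcast).
AZURE_RESERVED_PER_SUBNET = 5


def usable_ips(cidr: str) -> int:
    """Return the number of addresses left for nodes and pods in an Azure subnet."""
    network = ipaddress.ip_network(cidr, strict=True)
    return max(network.num_addresses - AZURE_RESERVED_PER_SUBNET, 0)


def smallest_prefix_for(required_ips: int) -> int:
    """Return the longest IPv4 prefix length whose subnet still fits required_ips."""
    # /29 is the smallest IPv4 subnet Azure allows, so start there and widen.
    for prefix in range(29, 0, -1):
        if 2 ** (32 - prefix) - AZURE_RESERVED_PER_SUBNET >= required_ips:
            return prefix
    raise ValueError("required_ips does not fit in any IPv4 subnet")


if __name__ == "__main__":
    print(smallest_prefix_for(2500))   # prefix length needed for 2,500 addresses
    print(usable_ips("10.10.0.0/20"))  # hypothetical subnet used for illustration
```

Under these assumptions a /21 subnet leaves 2,043 usable addresses, which is too few, while a /20 leaves 4,091, so the node pool subnet would need to be /20 or larger.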

New managed identity needs to be created in dp resource group

  • Needs read access to the keyvault in the dp resource group

  • Needs access to fileshares/blob storage for the storage account used by other osdu services

  • Needs access to the storage queue to read and process messages

New PostgreSQL server needs to be created in dp resource group

  • Same configuration steps as what we have in service resources group

  • The only difference is that any secrets related to Postgres need to be stored in the data partition.

Use existing storage account used by other osdu services

  • Create fileshares and directories internally similar to service resource group

  • Create storage container similar to service resource group

  • Create storage queue similar to service resource group

  • Add storage account secrets to the dp keyvault

Create event grid subscription to push logs to log analytics

New container registry needs to be created in dp resource group

New keyvault needs to be created in dp resource group

New redis cluster needs to be created in dp resource group

  • Same configuration steps as what we have in service resource group

  • The only difference is that any secrets related to Redis need to be stored in the data partition.

New log analytics workspace needs to be created in data partition to store task logs

Kubernetes changes needed

  • Install KEDA helm chart version 2.1.0

  • Install Cert manager helm chart

  • Install Kvsecrets helm chart

  • Install aad-pod-identity helm chart

  • Create OSDU namespace with istio injection enabled
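The last item above, a namespace with Istio injection enabled, comes down to one label on the Namespace object. The following is a minimal sketch, not the manifest used by this merge request. It assumes the namespace is named `osdu` and that Istio's standard `istio-injection: enabled` label is used to turn on automatic sidecar injection; the helper name `namespace_manifest` is hypothetical.

```python
import json


def namespace_manifest(name: str) -> dict:
    """Build a Kubernetes Namespace manifest labeled for Istio sidecar injection."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            # Istio's injector watches for this label on namespaces.
            "labels": {"istio-injection": "enabled"},
        },
    }


if __name__ == "__main__":
    # "osdu" is an assumed name; the merge request only says "OSDU namespace".
    print(json.dumps(namespace_manifest("osdu"), indent=2))
```

Since kubectl accepts JSON as well as YAML, the printed manifest could be piped to `kubectl apply -f -` on a cluster where Istio is already installed.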

Create Airflow-specific secrets and store them in the dp-specific keyvault

Enable diagnostics for AKS, Postgres, Redis, and the virtual network

New keyvault needs to be created in central resources to hold the app insights key, which is shared across all data partitions

  • The pod identity in the data partition should have get access to this keyvault.

Create an NSG for the AKS subnet of the data partition AKS cluster

  • Whitelist the service resources (sr) AKS egress IP in this NSG

All the resources created should be feature flagged


Links for terraform plan outputs

When flag false in dp resources [Link]

When flag true in dp resources [Link]

When flag false in CR resources [Link]

When flag true in CR resources [Link]

Edited by harshit aggarwal
