Open Subsurface Data Universe Software / Platform / Deployment and Operations / infra-azure-provisioning / Merge requests / !444

Fixed bugs in the Airflow Monitoring and Alerts

Merged: Mayank Saggar [Microsoft] requested to merge airflow-monitoring-alerts-fixes into master on Aug 11, 2021

Infrastructure Submissions:


  • [YES] Have you added an explanation of what your changes do and why you'd like us to include them?
  • [NA] I have updated the documentation accordingly.
  • [NA] I have added tests to cover my changes.
  • [YES] All new and existing tests passed.
  • [YES] I have formatted the terraform code. (terraform fmt -recursive && go fmt ./...)

Current Behavior or Linked Issues


A few noted bugs in the Airflow dashboards:

  • The granularity of data points was fixed at 15 minutes, so applying a time period of 1 hour returned only 3-4 data points instead of 60.
  • In all 3 dashboards, the DatapartitionId filter name is changed to ClusterName.
  • The charts on the service dashboard were split by Metric Name instead of Cluster Name; this is corrected.
  • In the DAGs dashboard, the data points were split by Metric Name; this is corrected to dagName/TaskId where applicable.

Noted bugs in the Airflow alerts:

  • The granularity of metrics in alert queries was set to 5 min in some alerts that expected more data points, so it is changed to 1 min, and to 30 sec in the Host-count alerts.
  • Changed the aggregation type of the import-error alert to Max.
  • Changed the aggregation type of the Error Rate alert to Sum.
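
The granularity and aggregation points above can be illustrated with a minimal Python sketch. It is not part of this MR. The helper name `points_in_window` and the sample values are hypothetical, and the sketch assumes a query returns one data point per granularity bucket.

```python
from datetime import timedelta


def points_in_window(window: timedelta, granularity: timedelta) -> int:
    """Count the whole granularity buckets that fit in a time window."""
    if granularity <= timedelta(0):
        raise ValueError("granularity must be positive")
    return window // granularity


one_hour = timedelta(hours=1)

# Fixed 15-minute granularity: a 1-hour view has very few points.
print(points_in_window(one_hour, timedelta(minutes=15)))  # 4

# 1-minute granularity: the same view has one point per minute.
print(points_in_window(one_hour, timedelta(minutes=1)))  # 60

# Hypothetical per-sample error counts inside one evaluation window.
# The aggregation type decides which single value the alert compares
# against its threshold.
samples = [0, 3, 1]
print(max(samples))  # Max aggregation: 3
print(sum(samples))  # Sum aggregation: 4
```

With a coarse granularity, an alert that expects several data points per evaluation window may see too few of them, which appears to be the motivation for moving some alert queries to 1 min and 30 sec.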

Does this introduce a breaking change?


  • [NO]

MR Guidelines

  • Paste the TF plan for the MR.
  • The pre-merge pipeline should be run before merging. (Azure team)
  • Does a module exist for the new resource?
  • Is a new variable added in the MR? (Don't use library variables; use locals.)

Other information


Source branch: airflow-monitoring-alerts-fixes