Improve Airflow logs

Airflow logs are currently hard to read, and it is hard to tell which records were stored in the Storage service and which were not. Several comments:

  1. The DAG execution status may be green even though some records were not stored. This is somewhat expected behavior: we expect that some entities may fail validation, and when they do, we skip them and process the remaining entities. That is why the DAG is displayed as green in Airflow, but it may be confusing to the user.
  2. The osdu_ingest DAG now has many tasks, and validation happens at different stages, so the logs are spread across different tasks. As a result, it is hard for a user to know which log to check.
  3. Sometimes the Airflow logs don't display the skipped ids at all. This is a critical issue that has to be fixed.

Ideally, the DAG would produce a report at the end of its execution. The report should list the processed ids and the unprocessed ids, each unprocessed id with the error that caused it to be skipped.
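
A minimal sketch of what such a report could look like, for discussion only. It assumes each task can hand over the ids it stored and the ids it skipped together with a reason (for example through XCom). `IngestReport` and its methods are hypothetical names; they are not part of the osdu_ingest DAG or of Airflow.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class IngestReport:
    """Hypothetical collector of per-record outcomes for one DAG run."""

    stored: List[str] = field(default_factory=list)
    # Maps a skipped record id to the reason it was skipped.
    skipped: Dict[str, str] = field(default_factory=dict)

    def add_stored(self, record_id: str) -> None:
        self.stored.append(record_id)

    def add_skipped(self, record_id: str, reason: str) -> None:
        self.skipped[record_id] = reason

    def merge(self, other: "IngestReport") -> None:
        """Fold in the outcomes reported by another task."""
        self.stored.extend(other.stored)
        self.skipped.update(other.skipped)

    def render(self) -> str:
        """Return a plain-text summary suitable for a single log entry."""
        lines = [f"Stored records: {len(self.stored)}"]
        lines += [f"  {record_id}" for record_id in self.stored]
        lines.append(f"Skipped records: {len(self.skipped)}")
        lines += [
            f"  {record_id}: {reason}"
            for record_id, reason in self.skipped.items()
        ]
        return "\n".join(lines)
```

A final task could merge the partial reports from the earlier tasks and log `render()` once, so the user has a single place to check even when the DAG is green.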
