Loading of 50,000 records using Osdu_ingest DAG fails for Azure platform
A test manifest ingestion run (Osdu_ingest DAG) with 50,000 organization records fails. The same test succeeded with 500 and 1,000 records, but not with 50,000.
Checking the status of the DAG run returns a failed status:
>>> azure_client.get_workflow('Osdu_ingest', '2c032948-3a8f-441f-9578-7a356709aa64')
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): osdu-ship.msft-osdu-test.org:443
DEBUG:urllib3.connectionpool:https://osdu-ship.msft-osdu-test.org:443 "GET /api/workflow/v1/workflow/Osdu_ingest/workflowRun/2c032948-3a8f-441f-9578-7a356709aa64 HTTP/1.1" 200 None
DEBUG:root:HTTP GET https://osdu-ship.msft-osdu-test.org/api/workflow/v1/workflow/Osdu_ingest/workflowRun/2c032948-3a8f-441f-9578-7a356709aa64 ...
DEBUG:root:Response: 200
DEBUG:root:json = {"workflowId": "Osdu_ingest", "runId": "2c032948-3a8f-441f-9578-7a356709aa64", "startTimeStamp": 1633056515159, "endTimeStamp": 1633069037479, "status": "failed", "submittedBy": "preshipping@azureglobal1.onmicrosoft.com"}
<osdu_client.OsduClient object at 0x00000261C7BC2B80>
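To see how long the run lasted before it was marked failed, the timestamps in the response can be converted to UTC. The following is a minimal sketch, assuming `startTimeStamp` and `endTimeStamp` are epoch milliseconds (their magnitude suggests this, but the report does not state it). The helper name `summarize_run` is hypothetical and not part of any OSDU client.

```python
import json
from datetime import datetime, timedelta, timezone

# Response body copied from the debug output of get_workflow().
RESPONSE_BODY = (
    '{"workflowId": "Osdu_ingest", '
    '"runId": "2c032948-3a8f-441f-9578-7a356709aa64", '
    '"startTimeStamp": 1633056515159, "endTimeStamp": 1633069037479, '
    '"status": "failed", '
    '"submittedBy": "preshipping@azureglobal1.onmicrosoft.com"}'
)


def _from_epoch_ms(ms):
    """Convert epoch milliseconds to an aware UTC datetime using integer math."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(milliseconds=ms)


def summarize_run(body):
    """Hypothetical helper: extract status, start, end, and duration of a run."""
    run = json.loads(body)
    start = _from_epoch_ms(run["startTimeStamp"])  # assumed epoch milliseconds
    end = _from_epoch_ms(run["endTimeStamp"])      # assumed epoch milliseconds
    return {
        "status": run["status"],
        "start": start,
        "end": end,
        "duration": end - start,
    }


summary = summarize_run(RESPONSE_BODY)
print(summary["status"], summary["start"], summary["end"], summary["duration"])
```

Under that assumption the run started at about 02:48:35 UTC and ended at about 06:17:17 UTC on 2021-10-01, roughly 3 hours 29 minutes in total, which lines up with the execution date in the Airflow log URL and the timestamps at the end of the log.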
The Airflow log is at https://osdu-ship.msft-osdu-test.org/airflow/log?task_id=update_status_finished_task&dag_id=Osdu_ingest&execution_date=2021-10-01T02%3A48%3A37.320308%2B00%3A00
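The log URL encodes which task, DAG, and execution date it refers to. The sketch below decodes those query parameters and rebuilds the same URL for a different task, which may help when looking at tasks that ran before `update_status_finished_task`. The helper names `log_params` and `log_url_for_task` are hypothetical, and any task ID passed in must be one that actually exists in the DAG; none other than `update_status_finished_task` is named in this report.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

# Log URL copied from the report.
LOG_URL = (
    "https://osdu-ship.msft-osdu-test.org/airflow/log"
    "?task_id=update_status_finished_task"
    "&dag_id=Osdu_ingest"
    "&execution_date=2021-10-01T02%3A48%3A37.320308%2B00%3A00"
)


def log_params(url):
    """Return the percent-decoded query parameters of an Airflow log URL."""
    query = urlsplit(url).query
    return {key: values[0] for key, values in parse_qs(query).items()}


def log_url_for_task(url, task_id):
    """Return the same log URL with only the task_id parameter replaced."""
    parts = urlsplit(url)
    params = log_params(url)
    params["task_id"] = task_id
    return urlunsplit(parts._replace(query=urlencode(params)))


print(log_params(LOG_URL))
```

The decoded `execution_date` is `2021-10-01T02:48:37.320308+00:00`, which matches the `execution_date=20211001T024837` shown when the task is marked as failed.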
The tail of the log is posted here:
DAG: Osdu_ingest R3 manifest processing with providing integrity
[2021-10-01 06:17:16,034] {decorators.py:28} INFO - ManagedIdentityCredential.get_token succeeded
[2021-10-01 06:17:16,034] {chained.py:68} INFO - DefaultAzureCredential acquired a token from ManagedIdentityCredential
[2021-10-01 06:17:16,034] {_universal.py:474} INFO - Request URL: 'https://osdu-mvp-crship-9vns-kv.vault.azure.net/secrets/app-dev-sp-username/?api-version=REDACTED'/nRequest method: 'GET'/nRequest headers:/n 'Accept': 'application/json'/n 'x-ms-client-request-id': '2a6154b8-227f-11ec-8125-3644e0c71593'/n 'User-Agent': 'azsdk-python-keyvault-secrets/4.2.0 Python/3.6.12 (Linux-5.4.0-1051-azure-x86_64-with-debian-10.5)'/n 'Authorization': 'REDACTED'/nNo body was attached to the request
[2021-10-01 06:17:16,064] {connectionpool.py:442} DEBUG - https://osdu-mvp-crship-9vns-kv.vault.azure.net:443 "GET /secrets/app-dev-sp-username/?api-version=7.1 HTTP/1.1" 200 324
[2021-10-01 06:17:16,065] {_universal.py:502} INFO - Response status: 200/nResponse headers:/n 'Cache-Control': 'no-cache'/n 'Pragma': 'no-cache'/n 'Content-Type': 'application/json; charset=utf-8'/n 'Expires': '-1'/n 'x-ms-keyvault-region': 'centralus'/n 'x-ms-client-request-id': '2a6154b8-227f-11ec-8125-3644e0c71593'/n 'x-ms-request-id': 'fb03f535-9856-47e6-83af-8f35058b8bb1'/n 'x-ms-keyvault-service-version': '1.9.79.2'/n 'x-ms-keyvault-network-info': 'conn_type=Subnet;addr=10.10.2.151;act_addr_fam=InterNetworkV6;'/n 'X-Powered-By': 'REDACTED'/n 'Strict-Transport-Security': 'REDACTED'/n 'X-Content-Type-Options': 'REDACTED'/n 'Date': 'Fri, 01 Oct 2021 06:17:15 GMT'/n 'Content-Length': '324'
[2021-10-01 06:17:16,066] {_universal.py:474} INFO - Request URL: 'https://osdu-mvp-crship-9vns-kv.vault.azure.net/secrets/app-dev-sp-password/?api-version=REDACTED'/nRequest method: 'GET'/nRequest headers:/n 'Accept': 'application/json'/n 'x-ms-client-request-id': '39639520-227f-11ec-8125-3644e0c71593'/n 'User-Agent': 'azsdk-python-keyvault-secrets/4.2.0 Python/3.6.12 (Linux-5.4.0-1051-azure-x86_64-with-debian-10.5)'/n 'Authorization': 'REDACTED'/nNo body was attached to the request
[2021-10-01 06:17:16,083] {connectionpool.py:442} DEBUG - https://osdu-mvp-crship-9vns-kv.vault.azure.net:443 "GET /secrets/app-dev-sp-password/?api-version=7.1 HTTP/1.1" 200 322
[2021-10-01 06:17:16,084] {_universal.py:502} INFO - Response status: 200/nResponse headers:/n 'Cache-Control': 'no-cache'/n 'Pragma': 'no-cache'/n 'Content-Type': 'application/json; charset=utf-8'/n 'Expires': '-1'/n 'x-ms-keyvault-region': 'centralus'/n 'x-ms-client-request-id': '39639520-227f-11ec-8125-3644e0c71593'/n 'x-ms-request-id': 'd8fd8cfa-d69a-4721-89f7-7afda24a9e30'/n 'x-ms-keyvault-service-version': '1.9.79.2'/n 'x-ms-keyvault-network-info': 'conn_type=Subnet;addr=10.10.2.151;act_addr_fam=InterNetworkV6;'/n 'X-Powered-By': 'REDACTED'/n 'Strict-Transport-Security': 'REDACTED'/n 'X-Content-Type-Options': 'REDACTED'/n 'Date': 'Fri, 01 Oct 2021 06:17:15 GMT'/n 'Content-Length': '322'
[2021-10-01 06:17:16,085] {_universal.py:474} INFO - Request URL: 'https://osdu-mvp-crship-9vns-kv.vault.azure.net/secrets/app-dev-sp-tenant-id/?api-version=REDACTED'/nRequest method: 'GET'/nRequest headers:/n 'Accept': 'application/json'/n 'x-ms-client-request-id': '39668c30-227f-11ec-8125-3644e0c71593'/n 'User-Agent': 'azsdk-python-keyvault-secrets/4.2.0 Python/3.6.12 (Linux-5.4.0-1051-azure-x86_64-with-debian-10.5)'/n 'Authorization': 'REDACTED'/nNo body was attached to the request
[2021-10-01 06:17:16,136] {connectionpool.py:442} DEBUG - https://osdu-mvp-crship-9vns-kv.vault.azure.net:443 "GET /secrets/app-dev-sp-tenant-id/?api-version=7.1 HTTP/1.1" 200 325
[2021-10-01 06:17:16,137] {_universal.py:502} INFO - Response status: 200/nResponse headers:/n 'Cache-Control': 'no-cache'/n 'Pragma': 'no-cache'/n 'Content-Type': 'application/json; charset=utf-8'/n 'Expires': '-1'/n 'x-ms-keyvault-region': 'centralus'/n 'x-ms-client-request-id': '39668c30-227f-11ec-8125-3644e0c71593'/n 'x-ms-request-id': '5955db9c-8001-411d-9b62-ffdfd53b142d'/n 'x-ms-keyvault-service-version': '1.9.79.2'/n 'x-ms-keyvault-network-info': 'conn_type=Subnet;addr=10.10.2.151;act_addr_fam=InterNetworkV6;'/n 'X-Powered-By': 'REDACTED'/n 'Strict-Transport-Security': 'REDACTED'/n 'X-Content-Type-Options': 'REDACTED'/n 'Date': 'Fri, 01 Oct 2021 06:17:15 GMT'/n 'Content-Length': '325'
[2021-10-01 06:17:16,139] {_universal.py:474} INFO - Request URL: 'https://osdu-mvp-crship-9vns-kv.vault.azure.net/secrets/aad-client-id/?api-version=REDACTED'/nRequest method: 'GET'/nRequest headers:/n 'Accept': 'application/json'/n 'x-ms-client-request-id': '396eb9d2-227f-11ec-8125-3644e0c71593'/n 'User-Agent': 'azsdk-python-keyvault-secrets/4.2.0 Python/3.6.12 (Linux-5.4.0-1051-azure-x86_64-with-debian-10.5)'/n 'Authorization': 'REDACTED'/nNo body was attached to the request
[2021-10-01 06:17:16,162] {connectionpool.py:442} DEBUG - https://osdu-mvp-crship-9vns-kv.vault.azure.net:443 "GET /secrets/aad-client-id/?api-version=7.1 HTTP/1.1" 200 318
[2021-10-01 06:17:16,163] {_universal.py:502} INFO - Response status: 200/nResponse headers:/n 'Cache-Control': 'no-cache'/n 'Pragma': 'no-cache'/n 'Content-Type': 'application/json; charset=utf-8'/n 'Expires': '-1'/n 'x-ms-keyvault-region': 'centralus'/n 'x-ms-client-request-id': '396eb9d2-227f-11ec-8125-3644e0c71593'/n 'x-ms-request-id': 'bbc4c81b-e04b-414b-a1a8-2473db395859'/n 'x-ms-keyvault-service-version': '1.9.79.2'/n 'x-ms-keyvault-network-info': 'conn_type=Subnet;addr=10.10.2.151;act_addr_fam=InterNetworkV6;'/n 'X-Powered-By': 'REDACTED'/n 'Strict-Transport-Security': 'REDACTED'/n 'X-Content-Type-Options': 'REDACTED'/n 'Date': 'Fri, 01 Oct 2021 06:17:15 GMT'/n 'Content-Length': '318'
[2021-10-01 06:17:16,167] {connectionpool.py:943} DEBUG - Starting new HTTPS connection (1): login.microsoftonline.com:443
[2021-10-01 06:17:16,232] {connectionpool.py:442} DEBUG - https://login.microsoftonline.com:443 "GET /58975fd3-4977-44d0-bea8-37af0baac100/v2.0/.well-known/openid-configuration HTTP/1.1" 200 1753
[2021-10-01 06:17:16,234] {authority.py:92} DEBUG - openid_config = {'token_endpoint': 'https://login.microsoftonline.com/58975fd3-4977-44d0-bea8-37af0baac100/oauth2/v2.0/token', 'token_endpoint_auth_methods_supported': ['client_secret_post', 'private_key_jwt', 'client_secret_basic'], 'jwks_uri': 'https://login.microsoftonline.com/58975fd3-4977-44d0-bea8-37af0baac100/discovery/v2.0/keys', 'response_modes_supported': ['query', 'fragment', 'form_post'], 'subject_types_supported': ['pairwise'], 'id_token_signing_alg_values_supported': ['RS256'], 'response_types_supported': ['code', 'id_token', 'code id_token', 'id_token token'], 'scopes_supported': ['openid', 'profile', 'email', 'offline_access'], 'issuer': 'https://login.microsoftonline.com/58975fd3-4977-44d0-bea8-37af0baac100/v2.0', 'request_uri_parameter_supported': False, 'userinfo_endpoint': 'https://graph.microsoft.com/oidc/userinfo', 'authorization_endpoint': 'https://login.microsoftonline.com/58975fd3-4977-44d0-bea8-37af0baac100/oauth2/v2.0/authorize', 'device_authorization_endpoint': 'https://login.microsoftonline.com/58975fd3-4977-44d0-bea8-37af0baac100/oauth2/v2.0/devicecode', 'http_logout_supported': True, 'frontchannel_logout_supported': True, 'end_session_endpoint': 'https://login.microsoftonline.com/58975fd3-4977-44d0-bea8-37af0baac100/oauth2/v2.0/logout', 'claims_supported': ['sub', 'iss', 'cloud_instance_name', 'cloud_instance_host_name', 'cloud_graph_host_name', 'msgraph_host', 'aud', 'exp', 'iat', 'auth_time', 'acr', 'nonce', 'preferred_username', 'name', 'tid', 'ver', 'at_hash', 'c_hash', 'email'], 'kerberos_endpoint': 'https://login.microsoftonline.com/58975fd3-4977-44d0-bea8-37af0baac100/kerberos', 'tenant_region_scope': 'NA', 'cloud_instance_name': 'microsoftonline.com', 'cloud_graph_host_name': 'graph.windows.net', 'msgraph_host': 'graph.microsoft.com', 'rbac_url': 'https://pas.windows.net'}
[2021-10-01 06:17:16,235] {application.py:60} DEBUG - Generates correlation_id: 5cdce50a-b115-4ce8-81c5-abef3222005e
[2021-10-01 06:17:16,306] {connectionpool.py:442} DEBUG - https://login.microsoftonline.com:443 "POST /58975fd3-4977-44d0-bea8-37af0baac100/oauth2/v2.0/token HTTP/1.1" 200 1331
[2021-10-01 06:17:16,307] {token_cache.py:120} DEBUG - event={
    "client_id": "60c4b736-2aa4-4889-88a0-d50503d63de7",
    "data": {
        "claims": null,
        "scope": [
            "ab320ed3-9cdd-4798-8e3c-2a657800183b/.default"
        ]
    },
    "environment": "login.microsoftonline.com",
    "grant_type": "client_credentials",
    "params": null,
    "response": {
        "access_token": "********",
        "expires_in": 3599,
        "ext_expires_in": 3599,
        "token_type": "Bearer"
    },
    "scope": [
        "ab320ed3-9cdd-4798-8e3c-2a657800183b/.default"
    ],
    "token_endpoint": "https://login.microsoftonline.com/58975fd3-4977-44d0-bea8-37af0baac100/oauth2/v2.0/token"
}
[2021-10-01 06:17:16,560] {update_status.py:82} DEBUG - Sending request '{"status": "failed"}'
[2021-10-01 06:17:16,560] {update_status.py:84} DEBUG - Workflow URL: http://workflow.osdu-azure.svc.cluster.local/api/workflow/v1/workflow/Osdu_ingest/workflowRun/2c032948-3a8f-441f-9578-7a356709aa64
[2021-10-01 06:17:16,563] {connectionpool.py:230} DEBUG - Starting new HTTP connection (1): workflow.osdu-azure.svc.cluster.local:80
[2021-10-01 06:17:18,754] {connectionpool.py:442} DEBUG - http://workflow.osdu-azure.svc.cluster.local:80 "PUT /api/workflow/v1/workflow/Osdu_ingest/workflowRun/2c032948-3a8f-441f-9578-7a356709aa64 HTTP/1.1" 200 None
[2021-10-01 06:17:18,755] {taskinstance.py:1150} ERROR - Dag failed
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 984, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/opt/airflow/dags/osdu_manifest/operators/update_status.py", line 133, in execute
    raise PipelineFailedError("Dag failed")
osdu_api.libs.exceptions.PipelineFailedError: Dag failed
[2021-10-01 06:17:18,795] {taskinstance.py:1194} INFO - Marking task as FAILED. dag_id=Osdu_ingest, task_id=update_status_finished_task, execution_date=20211001T024837, start_date=20211001T061641, end_date=20211001T061718
[2021-10-01 06:17:18,983] {cli_action_loggers.py:86} DEBUG - Calling callbacks: []
[2021-10-01 06:17:20,064] {base_job.py:197} DEBUG - [heartbeat]
[2021-10-01 06:17:20,064] {local_task_job.py:102} INFO - Task exited with return code 1
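Most of the tail above is DEBUG and INFO output from Key Vault and token requests, and the only ERROR entry comes from `update_status.py`, which sends `{"status": "failed"}` and raises `PipelineFailedError`. That suggests this task is reporting a failure rather than causing it, so the underlying error is likely in the log of an earlier task. A minimal sketch for filtering a saved task log down to its ERROR entries follows; it assumes the `[timestamp] {file:line} LEVEL - message` layout seen in the lines above, and the helper name `entries_at_level` is hypothetical.

```python
import re

# Matches the layout seen in the log above:
# [2021-10-01 06:17:18,755] {taskinstance.py:1150} ERROR - Dag failed
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] \{(?P<source>[^}]+)\} "
    r"(?P<level>[A-Z]+) - (?P<message>.*)$"
)


def entries_at_level(lines, levels=("ERROR", "CRITICAL")):
    """Hypothetical helper: return (timestamp, source, message) for matching levels.

    Continuation lines (tracebacks, multi-line JSON) do not match the pattern
    and are skipped.
    """
    found = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match and match.group("level") in levels:
            found.append(
                (match.group("timestamp"), match.group("source"), match.group("message"))
            )
    return found


# Three lines copied from the log tail above.
sample = [
    "[2021-10-01 06:17:16,560] {update_status.py:82} DEBUG - Sending request '{\"status\": \"failed\"}'",
    "[2021-10-01 06:17:18,755] {taskinstance.py:1150} ERROR - Dag failed",
    "[2021-10-01 06:17:20,064] {local_task_job.py:102} INFO - Task exited with return code 1",
]
errors = entries_at_level(sample)
print(errors)
```

Applied to the logs of the earlier tasks in the same run, the same filter should surface whatever error caused the final status to be set to failed.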