Search - Policy Integration "400 Request Header Or Cookie Too Large"

Background:

We are observing an intermittent issue after enabling Policy in the Search Service. Search requests fail with the following error response:

{
    "code": 400,
    "reason": "Bad Request",
    "message": "Failed to derive xcontent"
}

Analysis:

Based on our localhost analysis (logs), the current user (preshipping@azureglobal1.onmicrosoft.com) is a member of more than 2000 data groups.

When Search calls the Policy translate API, the 'X-Data-Groups' request header carries more than 2000 groups for this user. The Azure Application Gateway in front of Policy rejects the request with '400 Request Header Or Cookie Too Large':

HttpResponse(headers={null=[HTTP/1.1 400 Bad Request], Server=[Microsoft-Azure-Application-Gateway/v2], Connection=[close], Content-Length=[259], Date=[Wed, 02 Nov 2022 07:06:04 GMT], Content-Type=[text/html]}, body=<html><head><title>400 Request Header Or Cookie Too Large</title></head><body><center><h1>400 Bad Request</h1></center><center>Request Header Or Cookie Too Large</center><hr><center>Microsoft-Azure-Application-Gateway/v2</center></body></html>, contentType=text/html, responseCode=400, exception=null, request=https://osdu-ship.msft-osdu-test.org/api/policy/v1/translate, httpMethod=POST, latency=1623
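
To get a feel for how large the header can grow, the following is a rough sketch that estimates the byte size of an 'X-Data-Groups' value. It assumes the groups are sent as a single comma-separated string; the group names, the class and method names, and the 32 KB limit are all hypothetical placeholders and should be replaced with real values from the environment and the gateway configuration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class DataGroupsHeaderSize {

    // Size in bytes of the header value, assuming the groups are
    // joined with commas (an assumption about the wire format).
    static int headerValueBytes(List<String> groups) {
        return String.join(",", groups).getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Hypothetical group names; real data group emails will differ in length.
        List<String> groups = new ArrayList<>();
        for (int i = 0; i < 2000; i++) {
            groups.add("data.test-group-" + i + ".viewers@tenant.example.com");
        }

        // Assumed limit for illustration only; check the actual gateway setting.
        int assumedLimitBytes = 32 * 1024;

        int size = headerValueBytes(groups);
        System.out.println("X-Data-Groups value size: " + size + " bytes");
        System.out.println("Exceeds assumed limit: " + (size > assumedLimitBytes));
    }
}
```

Running a check like this against the real group list of an affected user would show how far over the limit the header is, and how many groups would have to be removed for the workaround to hold.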

Search then uses this HTML error body as the translated input query for ElasticSearch, which results in the following ElasticSearch exception:

{
    "code": 400,
    "reason": "Bad Request",
    "message": "Failed to derive xcontent"
}
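
One possible mitigation on the Search side, independent of the header size problem, would be to validate the translate response before using its body as a query. The sketch below is not the actual Search Service code: `HttpResult`, `TranslateResponseGuard`, and `queryFromTranslateResponse` are hypothetical names, with fields modeled loosely on those visible in the log line above.

```java
public class TranslateResponseGuard {

    // Hypothetical holder for the fields seen in the logged HttpResponse.
    static final class HttpResult {
        final int responseCode;
        final String contentType;
        final String body;

        HttpResult(int responseCode, String contentType, String body) {
            this.responseCode = responseCode;
            this.contentType = contentType;
            this.body = body;
        }
    }

    // Returns the body only if the response looks like a successful JSON reply;
    // otherwise fails fast so an HTML error page never reaches ElasticSearch.
    static String queryFromTranslateResponse(HttpResult result) {
        if (result.responseCode < 200 || result.responseCode >= 300) {
            throw new IllegalStateException(
                "Policy translate call failed with HTTP " + result.responseCode);
        }
        String type = result.contentType == null ? "" : result.contentType.toLowerCase();
        if (!type.startsWith("application/json")) {
            throw new IllegalStateException(
                "Unexpected content type from Policy translate: " + result.contentType);
        }
        return result.body;
    }

    public static void main(String[] args) {
        HttpResult gatewayError = new HttpResult(400, "text/html", "<html>...</html>");
        try {
            queryFromTranslateResponse(gatewayError);
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

With a guard like this, the caller would surface the real cause (the 400 from the gateway) instead of the misleading "Failed to derive xcontent" message.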

Workaround:

  • We deleted the stale/test groups that were being sent in X-Data-Groups for the user, using the Entitlements API.

Need Inputs:

The above workaround is not an ideal or permanent solution, so we are looking for input on how to remediate this issue across all environments.

Edited by Thulasi Dass Subramanian