Azure: Fixing reindex
Fixing Azure reindex so that it is able to interact with entitlements service of http within the k8s cluster. Related to #14 (closed)
From initial investigation, it seems like the issue was in part that the x-functions-key
header was not being set properly, and so the value was coming through as null when indexer-service was checking authorization with entitlements. This is because Istio was preventing the request from even hitting the service container for entitlements. When we turn off Istio, we see some informative error logs from tomcat in the entitlements pod:
2021-03-17 13:11:46.830 INFO entitlements-azure-57764d67f8-6jthb --- [p-nio-80-exec-4] o.a.c.h.Http11Processor correlation-id= data-partition-id=: Error parsing HTTP request header
Note: further occurrences of HTTP request parsing errors will be logged at DEBUG level.
java.lang.IllegalArgumentException: The HTTP header line [x-functions-key0x0d] does not conform to RFC 7230 and has been ignored.
at org.apache.coyote.http11.Http11InputBuffer.skipLine(Http11InputBuffer.java:1041)
at org.apache.coyote.http11.Http11InputBuffer.parseHeader(Http11InputBuffer.java:893)
at org.apache.coyote.http11.Http11InputBuffer.parseHeaders(Http11InputBuffer.java:593)
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:284)
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:880)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1601)
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
This is interesting for two reason: (1) it seems that Istio and Tomcat perform some header validation. (2) For whatever reason, the validation is only behaving this way over http and not https. I did a bit of reading on the Tomcat code that is throwing the error. I think the relevant part is here. So since there is this illegal character in the header, Tomcat throws a 400 on behalf of entitlements service for the request. It seems Istio must do something similar since the request does not even reach the service container when Istio is on.
Interestingly, both Istio and Tomcat only seem to be doing this for http and not https. We were able to successfully make the call (even with the null header) before when we referred to entitlements service by its DNS name rather than the internal access point within the cluster (http://entitlements-azure/entitlements/v1
).
This MR resolves this issue by directly hardcoding the value for the x-functions-key
header. We were already hardcoding the value to be abcd
in application.properties
. However, it was not being set properly because the code block the defines what the header value should be set to:
@Value("${indexer.queue.key}")
private String queueKey;
was running after the variable was used, so it was coming through as null.
Merge request reports
Activity
46 43 .list(request.getHeaderNames()) 47 44 .stream() 48 45 .collect(Collectors.toMap(h -> h, request::getHeader)); 49 50 headers.put(INDEXER_QUEUE_KEY,queueKey); 46 headers.put(INDEXER_QUEUE_KEY, "abcd"); @jsangiamo can you rename "abcd" -> "DUMMY_KEY"
Edited by MANISH KUMAR@madhurtanwani https://community.opengroup.org/osdu/platform/system/indexer-service/-/blob/master/provider/indexer-azure/src/main/java/org/opengroup/osdu/indexer/azure/util/RequestInfoImpl.java#L84
IMO, anything is set doesn't effect at all. This method always returns
false
.Same is for all other CSPs, always return
false
Edited by MANISH KUMARI am going to set the value to
NOT_USED
for clarity for now. We need to do some additional investigation as to if this header is needed and if we can safely remove it. I worry it is somehow used when indexer is interacting with indexer-queue in the reindex process. Let me know if anyone has an issue with thischanged this line in version 4 of the diff
42 42 @Value("${MAX_CACHE_VALUE_SIZE}") @jsangiamo, thanks for fixing it! Also, can you please elaborate the fix :D ? Maybe add it in the MR description or in the Issue itself. We've spend significant time in investigating this and would love to hear how you arrived at the solution :) @kiveerap / @manishk watchout!
+1 - great to see the fix and curious on the debugging process. Thank you @jsangiamo
added 31 commits
-
f00b7f46...acc92cfb - 14 commits from branch
master
- 62fe3592 - adding debugging print statements
- d712b704 - updating corelib azure version
- f2bda23c - Swapping out entitlements factory to a lazt requestscope component class
- 39692e3b - switching to debugger core common version
- f34f82ea - updating debug dependency
- d211829b - bumping dependency'
- 00b5948c - bumping dep v
- 97dfcbbc - adding more prints'
- 14437265 - changing azure application config, setting sprint boot logs to debug
- f1ecd9e1 - updating core version
- aa8f58fd - fixing pom
- 6d36d387 - hardcoding queue key so it isn't null
- 490232b9 - removing debugging
- 64b9eca7 - updating core common
- 67257c09 - hard coding in queue key again
- 1835026c - making hardcoded value more clear
- da1f18e1 - fixing notice
Toggle commit list-
f00b7f46...acc92cfb - 14 commits from branch
added 24 commits
-
8c5b6fe3...a7fbb7b2 - 4 commits from branch
master
- 201f84aa - adding debugging print statements
- dd5c01b3 - updating corelib azure version
- e5798d9b - Swapping out entitlements factory to a lazt requestscope component class
- 25ce9a6c - switching to debugger core common version
- ead74829 - updating debug dependency
- 8028cd98 - bumping dependency'
- 580f0235 - bumping dep v
- cd578b55 - adding more prints'
- ed3ac5b1 - changing azure application config, setting sprint boot logs to debug
- 62519a2f - updating core version
- 118e313a - fixing pom
- 90343417 - hardcoding queue key so it isn't null
- aba3c00f - removing debugging
- e59727aa - updating core common
- aece6a20 - hard coding in queue key again
- 1504bc8c - making hardcoded value more clear
- 5692503c - fixing notice
- 8c0646fa - fixing notice
- add108e1 - fixing topic name for reindex
- 99501500 - removing unused os-core-common variable
Toggle commit list-
8c5b6fe3...a7fbb7b2 - 4 commits from branch
mentioned in commit 98c2a941
mentioned in issue #14 (closed)
mentioned in merge request !140 (merged)
added Azure label
changed milestone to %M5 - Release 0.8