Draft: Fix Thread Exhaustion and ES Connection Reliability Improvements (!896) · Merge requests · OSDU / OSDU Data Platform / System / Indexer

Implemented Fixes

Replaced the Elasticsearch client cache with a concurrent hash map
Added connection pool limits
Added client cleanup methods to remove stale or unhealthy connections
Implemented a retry mechanism to handle ES connection interruptions
Queries now execute with health checks
Moved configuration of ES connection settings to application.properties
Added unit tests
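
The sketch below illustrates how the cache, cleanup, health check, and retry items above could fit together. It is not the code from this merge request: the class `ClientCache`, the `Query` interface, and every parameter name are hypothetical, and a generic `AutoCloseable` stands in for the real Elasticsearch client so the example runs with the JDK alone.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;

/** Hypothetical sketch: thread-safe client cache with health checks, cleanup, and retries. */
class ClientCache<C extends AutoCloseable> {

    /** A call against a client that may fail with a checked exception. */
    @FunctionalInterface
    interface Query<K, T> {
        T run(K client) throws Exception;
    }

    private final Map<String, C> clients = new ConcurrentHashMap<>();
    private final Function<String, C> factory; // builds a client for a partition (assumed)
    private final Predicate<C> healthy;        // e.g. a ping against the cluster (assumed)

    ClientCache(Function<String, C> factory, Predicate<C> healthy) {
        this.factory = factory;
        this.healthy = healthy;
    }

    /** Returns the cached client, replacing it first if the health check fails. */
    C get(String partitionId) {
        // computeIfAbsent on ConcurrentHashMap runs the factory at most once per missing key.
        C client = clients.computeIfAbsent(partitionId, factory);
        if (!healthy.test(client)) {
            evict(partitionId, client);
            client = clients.computeIfAbsent(partitionId, factory);
        }
        return client;
    }

    /** Removes and closes a client, but only if it is still the cached instance. */
    void evict(String partitionId, C client) {
        if (clients.remove(partitionId, client)) {
            try {
                client.close();
            } catch (Exception ignored) {
                // The client is already unusable; nothing more to clean up.
            }
        }
    }

    /** Runs the query, evicting the client and retrying with linear backoff on failure. */
    <T> T execute(String partitionId, int maxAttempts, long backoffMillis, Query<C, T> query)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            C client = get(partitionId);
            try {
                return query.run(client);
            } catch (Exception e) {
                last = e;
                evict(partitionId, client); // drop the possibly broken connection
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis * attempt);
                }
            }
        }
        throw last;
    }
}
```

Closing a client on eviction matters for the thread count: each client usually owns its own I/O threads, so a cache that replaces clients without closing them leaks threads. In a real service the retry would likely be limited to connection-level exceptions rather than every `Exception`, and the attempt count and backoff would come from configuration.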

Testing

Pod resource usage after indexing 100k+ records:

Executing top (resource usage):

  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
    7     1 appuser  S    3108m   2%  29   0% java -Xms1000M -Xmx1000M --add-opens java.base/java.lang=ALL-UNNAMED --add-opens java.base/java.lang.reflect=ALL-UNNAMED -jar /app.jar
 1262     0 appuser  S     2984   0%  25   0% bash
    1     0 appuser  S     1712   0%  21   0% /bin/sh -c . /entrypoint.sh
 1280  1262 appuser  R     1700   0%  24   0% top

Executing ps -lfT | wc -l (thread count):

Edited Apr 23, 2025 by Marc Burnie [AWS]
