This MR addresses the performance issue for the index augmenter: "Poor performance for index augmenter".
- To address issue #1, we aggregate the searches for the related (parent) objects. In the test case, 3 queries were aggregated into 1 query. We also removed the unnecessary last query when using a cursor query. Together, these changes reduced the total to 5 queries when indexing one record.
- To address issue #2 (moved), we found a bug in the implementation of the caches. They were annotated as RequestScope, which means the cached information could only be used within a single call covering at most 50 records. After we removed the RequestScope annotation and applied enhancement #1, the average number of queries per record was reduced to 2.
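The two fixes above can be illustrated with a minimal sketch. It is not the code in this MR: `ParentLookupSketch`, `searchByIds`, and `resolveParents` are hypothetical names, and the search backend is replaced by an in-memory stand-in that only counts calls. It assumes the backend can accept several ids in one query, and it shows how a cache that outlives a single request lets later records skip the query entirely.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ParentLookupSketch {
    // Counts calls to the stand-in search backend so the effect is visible.
    static int queryCount = 0;

    // Cache that lives longer than one request (hypothetical, not the MR's cache class).
    static final Map<String, String> CACHE = new ConcurrentHashMap<>();

    // Hypothetical stand-in for one search call that accepts many ids at once.
    static Map<String, String> searchByIds(Collection<String> ids) {
        queryCount++;
        Map<String, String> found = new HashMap<>();
        for (String id : ids) {
            found.put(id, "doc:" + id);
        }
        return found;
    }

    // Resolves all parents with at most one query: cached ids are served
    // locally and the remaining ids are fetched together in a single call.
    static Map<String, String> resolveParents(List<String> parentIds) {
        Map<String, String> resolved = new HashMap<>();
        Set<String> misses = new LinkedHashSet<>();
        for (String id : parentIds) {
            String cached = CACHE.get(id);
            if (cached != null) {
                resolved.put(id, cached);
            } else {
                misses.add(id);
            }
        }
        if (!misses.isEmpty()) {
            Map<String, String> fetched = searchByIds(misses);
            CACHE.putAll(fetched);
            resolved.putAll(fetched);
        }
        return resolved;
    }

    public static void main(String[] args) {
        // First record: three parents are fetched with one aggregated query.
        resolveParents(List.of("well-1", "wellbore-1", "field-1"));
        System.out.println("queries after first record: " + queryCount);
        // Second record with the same parents: served from the cache.
        resolveParents(List.of("well-1", "wellbore-1", "field-1"));
        System.out.println("queries after second record: " + queryCount);
    }
}
```

With a request-scoped cache, the second call would start from an empty cache whenever it belonged to a different request, so the query would be repeated.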
In this MR, we also include the following enhancements:
- Reduce the use of cursor queries in places where we are sure that the search result will not reach the limit of a normal query (no more than 10,000 search results), in order to reduce the overhead on ElasticSearch.
- Fix a data-dependent defect that causes a NullPointerException.
- Re-organize the cache package and ensure that keys do not conflict among the different cache solutions. We found that RelatedObjectCache and RecordChangeInfoCache shared the same key (record id) in the original implementation, which would cause problems in the Azure implementation with Redis.
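The key conflict in the last item can be sketched as follows. This is an illustration under stated assumptions, not the change made in this MR: `NamespacedCacheSketch`, `PrefixedCache`, and the prefix strings are hypothetical, and a `ConcurrentHashMap` stands in for a shared store such as a single Redis instance. Prefixing each cache's keys is one common way to keep two caches apart when both are keyed by record id.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class NamespacedCacheSketch {
    // Stand-in for one shared key-value store used by several caches.
    static final Map<String, String> SHARED_STORE = new ConcurrentHashMap<>();

    // A cache view that prepends its own prefix to every key (hypothetical helper).
    static class PrefixedCache {
        private final String prefix;

        PrefixedCache(String prefix) {
            this.prefix = prefix;
        }

        void put(String recordId, String value) {
            SHARED_STORE.put(prefix + ":" + recordId, value);
        }

        String get(String recordId) {
            return SHARED_STORE.get(prefix + ":" + recordId);
        }
    }

    public static void main(String[] args) {
        // Prefix names are made up for this example.
        PrefixedCache relatedObjects = new PrefixedCache("relatedObject");
        PrefixedCache changeInfo = new PrefixedCache("recordChangeInfo");

        // Both caches use the same record id as the logical key.
        relatedObjects.put("record-1", "parent data");
        changeInfo.put("record-1", "change info");

        // Each cache still reads back its own value.
        System.out.println(relatedObjects.get("record-1"));
        System.out.println(changeInfo.get("record-1"));
    }
}
```

Without the prefix, the second `put` would overwrite the first entry in the shared store, and one cache would read back the other cache's value.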
With all the changes, we re-tested the re-index with the augmenter enabled. Performance with the augmenter enabled is still about 4 times slower than with the augmenter disabled (roughly 3.75 times when the batch times in the runs below are summed).
Here are the test results for a re-index of WellLog from two random runs. The indexer created 6 batches for 291 records, and the batches ran in parallel:
- With augmenter enabled:
  - 41 records in 28,887 ms
  - 50 records in 30,006 ms
  - 50 records in 33,261 ms
  - 50 records in 33,540 ms
  - 50 records in 35,794 ms
  - 50 records in 35,581 ms
- With augmenter disabled:
  - 41 records in 7,885 ms
  - 50 records in 8,018 ms
  - 50 records in 9,024 ms
  - 50 records in 9,207 ms
  - 50 records in 9,034 ms
  - 50 records in 9,358 ms
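
As a quick check of the slowdown factor, the batch times above can be summed and compared. The class and method names in this sketch are made up for illustration. Because the batches ran in parallel, the sums are not wall-clock times. They only compare the aggregate batch time of the two runs.

```java
public class AugmenterOverheadSketch {
    // Batch durations in milliseconds, copied from the two runs above.
    static final long[] ENABLED_MS = {28_887, 30_006, 33_261, 33_540, 35_794, 35_581};
    static final long[] DISABLED_MS = {7_885, 8_018, 9_024, 9_207, 9_034, 9_358};

    // Adds up all batch durations of one run.
    static long sum(long[] values) {
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    // Ratio of aggregate batch time with the augmenter enabled vs. disabled.
    static double slowdown() {
        return (double) sum(ENABLED_MS) / sum(DISABLED_MS);
    }

    public static void main(String[] args) {
        System.out.printf("enabled total:  %d ms%n", sum(ENABLED_MS));
        System.out.printf("disabled total: %d ms%n", sum(DISABLED_MS));
        System.out.printf("slowdown:       %.2fx%n", slowdown());
    }
}
```

The totals are 197,069 ms and 52,526 ms, a ratio of about 3.75, which is consistent with the "about 4 times" figure.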