Fix NullPointerException in indexer when processing duplicate schemas
Summary
This change resolves a potential NullPointerException (reported in #257, closed) that occurs when the indexer processes multiple records with the same schema kind. The issue arises when duplicate IndexSchema objects are added to the schemas list during payload preparation.
Changes Made
- Modified IndexerServiceImpl.getIndexerPayload(): changed from List<IndexSchema> to LinkedHashSet<IndexSchema> to automatically deduplicate schemas while preserving insertion order
- Added comprehensive test case: testGetIndexerPayload_ShouldDeduplicateSchemas() verifies that multiple records with the same kind result in only one unique schema in the payload
- Updated imports: added LinkedHashSet and Set imports
Technical Details
The root cause was that when multiple records shared the same kind (schema type), duplicate IndexSchema objects were being added to the schemas list. This could lead to downstream processing issues and potential NullPointerExceptions.
Before:
List<IndexSchema> schemas = new ArrayList<>();
// ... processing loop
schemas.add(schema); // Could add duplicates
After:
Set<IndexSchema> schemasSet = new LinkedHashSet<>();
// ... processing loop
schemasSet.add(schema); // Automatically deduplicates
return RecordIndexerPayload.builder()
    .schemas(new ArrayList<>(schemasSet))
    .build();
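The deduplication above can be sketched in isolation. This is a minimal illustration, not the actual indexer code: the Schema class is a hypothetical stand-in for IndexSchema, and it assumes equality is based on the kind alone. LinkedHashSet only deduplicates elements that are equal according to equals() and hashCode(), so the real fix relies on IndexSchema providing value-based equality.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

public class SchemaDedupSketch {

    // Hypothetical stand-in for IndexSchema; the real class has more fields.
    static final class Schema {
        final String kind;

        Schema(String kind) {
            this.kind = kind;
        }

        // Assumption: two schemas are equal when their kinds are equal.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Schema)) return false;
            return kind.equals(((Schema) o).kind);
        }

        @Override
        public int hashCode() {
            return Objects.hash(kind);
        }
    }

    // Collects one schema per distinct kind, in first-seen order.
    static List<Schema> uniqueSchemas(List<String> recordKinds) {
        Set<Schema> schemas = new LinkedHashSet<>();
        for (String kind : recordKinds) {
            // add() leaves the set unchanged if an equal element is already present
            schemas.add(new Schema(kind));
        }
        return new ArrayList<>(schemas);
    }

    public static void main(String[] args) {
        List<Schema> result = uniqueSchemas(Arrays.asList("kind-a", "kind-b", "kind-a"));
        for (Schema s : result) {
            System.out.println(s.kind);
        }
    }
}
```

Three records with two distinct kinds yield two schemas, with kind-a ahead of kind-b because it was seen first. A plain HashSet would also deduplicate, but would not guarantee that order.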
Test Coverage
Added unit test that:
- Creates multiple records with the same schema kind
- Verifies that only one unique schema is included in the final payload
- Confirms that all records are still processed correctly
- Uses reflection to test the private getIndexerPayload() method
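The reflection approach mentioned in the list above can be sketched roughly as follows. The Service class, its countDistinct method, and the invokePrivate helper are hypothetical names used for illustration only. The actual test targets getIndexerPayload(), whose signature is not shown here.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

public class PrivateMethodReflectionSketch {

    // Hypothetical class with a private method, standing in for the service under test.
    static class Service {
        private int countDistinct(List<String> kinds) {
            return new LinkedHashSet<>(kinds).size();
        }
    }

    // Looks up a declared (possibly private) method by name and parameter types,
    // makes it accessible, and invokes it on the target.
    static Object invokePrivate(Object target, String name, Class<?>[] paramTypes, Object... args) {
        try {
            Method method = target.getClass().getDeclaredMethod(name, paramTypes);
            method.setAccessible(true);
            return method.invoke(target, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not invoke " + name, e);
        }
    }

    public static void main(String[] args) {
        Object distinct = invokePrivate(
                new Service(),
                "countDistinct",
                new Class<?>[] {List.class},
                Arrays.asList("kind-a", "kind-a", "kind-b"));
        System.out.println(distinct);
    }
}
```

Testing through reflection keeps the method private in production code, at the cost of a test that breaks silently at compile time if the method is renamed or its parameters change.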
Backward Compatibility
Potential Future Improvements
TypeMapper.getDataAttributeIndexerMapping mutates the object that is passed into it. This contributed to the problem when the same schema was processed a second time. The method should be changed so that it does not modify the original object. Doing so will require more extensive testing to ensure that other operations do not depend on the method's current behavior of mutating the IndexSchema object.
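One possible shape for a non-mutating version is sketched below. It is only an illustration of the idea: the toIndexerMapping method, the map-based schema representation, and the "type" key are assumptions made for this example and do not reflect the real TypeMapper implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NonMutatingMappingSketch {

    // Hypothetical mapping step: builds a new map from the input
    // instead of replacing values inside the map it was given.
    static Map<String, Object> toIndexerMapping(Map<String, Object> dataSchema) {
        Map<String, Object> mapping = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : dataSchema.entrySet()) {
            Map<String, Object> typeEntry = new LinkedHashMap<>();
            typeEntry.put("type", entry.getValue());
            mapping.put(entry.getKey(), typeEntry);
        }
        return mapping;
    }

    public static void main(String[] args) {
        Map<String, Object> dataSchema = new LinkedHashMap<>();
        dataSchema.put("name", "text");

        // Processing the same input twice gives the same result,
        // because the input is never modified.
        Map<String, Object> first = toIndexerMapping(dataSchema);
        Map<String, Object> second = toIndexerMapping(dataSchema);

        System.out.println(first.equals(second));
        System.out.println(dataSchema.get("name"));
    }
}
```

Because the input is left untouched, a schema that is processed more than once produces the same mapping each time, which removes the dependence on deduplication for correctness.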