Type of change
Please provide a link to the GitLab issue or ADR (Architecture Decision Record)
Does this introduce a change in the core logic?
- [YES/NO] No
Does this introduce a change in the cloud provider implementation, if so which cloud?
IBM (added an additional field, hash, to the document DB; it is auto-populated for all other providers).
Does this introduce a breaking change?
- [YES/NO] No
What is the current behavior?
- skipdupes only works for changes in RecordData and ignores changes in RecordMetadata.
- Record modification events do not indicate which section of the record was updated.
What is the new/expected behavior?
- skipdupes=true now detects changes in any RecordData or RecordMetadata field.
- This also implements ADR 92, which populates a recordBlocks field with the fields that were updated.
- skipdupes now uses the recordBlocks field to check whether any updates were made to the record.
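The decision described above can be sketched roughly as follows. This is an illustration only, not the code in this change: the block names "data" and "meta", the list representation of recordBlocks, and the function names are all assumptions, and the actual values defined by ADR 92 may differ.

```python
from typing import Dict, List

# Hypothetical block names; the actual values defined by ADR 92 may differ.
BLOCKS = ("data", "meta")


def changed_blocks(stored: Dict[str, str], incoming: Dict[str, str]) -> List[str]:
    """Return the blocks whose hash differs between the stored and incoming record."""
    return [block for block in BLOCKS if stored.get(block) != incoming.get(block)]


def should_skip(record_blocks: List[str], skipdupes: bool) -> bool:
    """With skipdupes=true, an update that changes no block is treated as a duplicate."""
    return skipdupes and not record_blocks
```

In this sketch the same list serves both purposes: an empty list means the update can be skipped, and a non-empty list is what would be reported in the modification event.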
Have you added/updated Unit Tests and Integration Tests?
Unit tests with 100% coverage have been added for all new files and new lines.
Any other useful information
The hashes of the data and meta fields are now saved in the metadata. They are used to check whether any changes were made, which avoids an extra call to blob storage for each record. This is used both in the skipdupes=true scenario and to populate the recordBlocks field. It brings roughly a 50% performance improvement in skipdupes=true scenarios.
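A minimal sketch of the hash comparison, for reviewers who want to see the idea in isolation. It assumes SHA-256 over canonical JSON and a "hash" entry in the stored metadata holding one digest per block; the algorithm, the key names, and the function names are assumptions for illustration and are not taken from this change.

```python
import hashlib
import json
from typing import Any, Dict


def block_hash(block: Dict[str, Any]) -> str:
    # Canonical JSON (sorted keys, fixed separators) so key order does not affect the hash.
    canonical = json.dumps(block, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_hashes(data: Dict[str, Any], meta: Dict[str, Any]) -> Dict[str, str]:
    # Hypothetical layout of the value stored under the "hash" key.
    return {"data": block_hash(data), "meta": block_hash(meta)}


def is_unchanged(stored_metadata: Dict[str, Any], data: Dict[str, Any], meta: Dict[str, Any]) -> bool:
    # Compares against the hashes kept in metadata; the stored blob is never read.
    return stored_metadata.get("hash") == build_hashes(data, meta)
```

Because the comparison needs only the metadata document, an unchanged record is identified without fetching its previous version from blob storage.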