Transformer - Refactor to implement full-scale batching - reduce peak memory usage
In today's backlog refinement meeting, we discussed a possible solution to the Transformer's memory consumption issue: redesigning the ingest->process->store workflow into a batched, repeatable cycle.
Currently, the Transformer performs all ingestion procedures, then all processing procedures, then all storage procedures. This linear lifecycle runs once per cache update/initialization and creates a significant memory dependency: every ingested record must stay in memory until the processing stage completes, and every processed record must stay in memory until the storage stage completes, so peak memory usage scales with the full record count.
@bgunter42 suggested reducing peak memory usage by performing the work in batches: the ingest->process->store cycle would repeat until all records have been ingested, processed, and stored, so only one batch of records is held in memory at a time.
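A minimal sketch of the proposed batched cycle, assuming a generator-based pipeline. The stage names and signatures here (`ingest_batches`, `process`, `store`, `run_pipeline`) are illustrative placeholders, not the real Transformer API:

```python
from typing import Iterable, Iterator, List

def ingest_batches(source: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield records from the source in fixed-size batches (hypothetical ingest stage)."""
    batch: List[dict] = []
    for record in source:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final partial batch
        yield batch

def process(batch: List[dict]) -> List[dict]:
    # Placeholder transformation; the real processing logic goes here.
    return [{**record, "processed": True} for record in batch]

def store(batch: List[dict], sink: list) -> None:
    # Placeholder storage step; the real cache write goes here.
    sink.extend(batch)

def run_pipeline(source: Iterable[dict], sink: list, batch_size: int = 100) -> None:
    # Each cycle ingests, processes, and stores one batch before starting
    # the next, so peak memory is bounded by batch_size rather than the
    # total number of records.
    for batch in ingest_batches(source, batch_size):
        store(process(batch), sink)
```

With a lazy source (e.g. a generator reading from disk), this keeps at most `batch_size` records in memory per cycle, which is the intent of the refactor.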
Acceptance Criteria:
- Transformer refactored to run the ingest->process->store cycle in batches
- Batch size configurable by the user
- Peak memory usage measured before and after the refactor, demonstrating the reduction