
Load save catalog async storage

Yannick requested to merge load_save_catalog_async_storage into master

Main goal: move the save and load of the bulk catalog outside of the Dask worker and perform the I/O asynchronously. Note that the JSON dumps and loads are done in the main thread. In testing this is quick enough, even with many columns (tested with 150k columns); it is actually no heavier than the current dump/load done for the metadata.
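To illustrate the idea, here is a minimal sketch of the pattern, not the code in this merge request: JSON serialization runs synchronously in the calling thread, and only the storage read/write is awaited. `InMemoryAsyncStorage`, `save_catalog`, `load_catalog`, and the path used are hypothetical names chosen for the example; the real storage client and catalog layout will differ.

```python
import asyncio
import json


class InMemoryAsyncStorage:
    """Hypothetical stand-in for an async blob storage client."""

    def __init__(self):
        self._blobs = {}

    async def write(self, path: str, data: bytes) -> None:
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        self._blobs[path] = data

    async def read(self, path: str) -> bytes:
        await asyncio.sleep(0)
        return self._blobs[path]


async def save_catalog(storage, path: str, catalog: dict) -> None:
    # Serialization happens here, in the calling thread (assumed cheap).
    payload = json.dumps(catalog).encode("utf-8")
    # Only the storage write is asynchronous.
    await storage.write(path, payload)


async def load_catalog(storage, path: str) -> dict:
    payload = await storage.read(path)
    # Deserialization also happens in the calling thread.
    return json.loads(payload.decode("utf-8"))


async def roundtrip_demo(n_columns: int = 1000) -> dict:
    storage = InMemoryAsyncStorage()
    catalog = {"columns": [f"col_{i}" for i in range(n_columns)]}
    await save_catalog(storage, "tenant/bulk/catalog.json", catalog)
    return await load_catalog(storage, "tenant/bulk/catalog.json")
```

Calling `asyncio.run(roundtrip_demo())` saves a catalog and reads it back through the stand-in storage; no Dask worker is involved in either step.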

To do so, some side changes:

  • [not mandatory] tenant is resolved in the middleware and then put in the context
  • join builder reviewed to always use '/' as the separator, to be compatible with all known cases so far (including Windows filesystems or the Azurite emulator)
  • some test fixtures updated
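
As a rough sketch of the join builder change, assuming it behaves like a simple path joiner: the hypothetical `join_path` helper below always joins with '/' regardless of the host OS. Converting backslashes inside the parts is an extra assumption of this example and is not confirmed by the merge request.

```python
def join_path(*parts: str) -> str:
    """Join path parts with '/' as the only separator (illustrative helper)."""
    cleaned = []
    for index, part in enumerate(parts):
        # Assumption: backslashes in the input are treated as separators.
        part = part.replace("\\", "/")
        # Keep any leading '/' on the first part; trim both ends on the others.
        part = part.rstrip("/") if index == 0 else part.strip("/")
        if part:
            cleaned.append(part)
    return "/".join(cleaned)
```

For example, `join_path("container/", "/bulk/", "catalog.json")` gives `container/bulk/catalog.json` on any platform, unlike `os.path.join`, which uses the host separator.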
