
feat: sdutil chunking upload

Varunkumar Manohar requested to merge slb/vm/sdutil-chunking-upload into master

This MR adds explicit, upfront chunking of input datasets on upload. At this point, only Azure Blob Storage uploads are supported.

  1. If the chunk-size parameter is specified, the file is uploaded as multiple chunks, each of the given size. The value is in MiB.

    python sdutil.py cp {{dataset}} sd://datapartition/subproject/dataset --chunk-size=30

  2. If no chunk-size parameter is specified, the file is uploaded in chunks of 32 MiB.

  3. If the chunk-size parameter is set to 0, the file is uploaded as a single object.

    python sdutil.py cp {{dataset}} sd://datapartition/subproject/dataset --chunk-size=0
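The three cases above can be sketched as a small planning function. This is only an illustration of the described behaviour, not the code in this MR: the names `plan_chunks`, `DEFAULT_CHUNK_SIZE_MIB`, and `MIB` are hypothetical, and it assumes the last chunk is simply whatever remains of the file (so it may be smaller than the requested size).

```python
from typing import List, Optional, Tuple

MIB = 1024 * 1024
DEFAULT_CHUNK_SIZE_MIB = 32  # default chunk size stated in the description


def plan_chunks(file_size: int, chunk_size_mib: Optional[int] = None) -> List[Tuple[int, int]]:
    """Return (offset, length) byte ranges for uploading a file of file_size bytes.

    Hypothetical helper, not part of sdutil:
      - chunk_size_mib=None -> use the 32 MiB default
      - chunk_size_mib=0    -> one range covering the whole file (single object)
      - chunk_size_mib=N    -> ranges of N MiB; the final range holds the remainder
    """
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    if chunk_size_mib is None:
        chunk_size_mib = DEFAULT_CHUNK_SIZE_MIB
    if chunk_size_mib < 0:
        raise ValueError("chunk size must be non-negative")

    # chunk-size=0 disables chunking: the file goes up as a single object.
    if chunk_size_mib == 0:
        return [(0, file_size)]

    chunk_bytes = chunk_size_mib * MIB
    ranges = []
    offset = 0
    while offset < file_size:
        # Assumption: the last chunk is truncated to the bytes that remain.
        length = min(chunk_bytes, file_size - offset)
        ranges.append((offset, length))
        offset += length
    return ranges


if __name__ == "__main__":
    size = 100 * MIB
    print(len(plan_chunks(size, 30)))  # --chunk-size=30
    print(len(plan_chunks(size)))      # no --chunk-size given
    print(len(plan_chunks(size, 0)))   # --chunk-size=0
```

Under these assumptions, a 100 MiB file with `--chunk-size=30` splits into three 30 MiB chunks plus one 10 MiB chunk, the default splits it into three 32 MiB chunks plus one 4 MiB chunk, and `--chunk-size=0` produces a single range covering the whole file.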

Edited by Varunkumar Manohar
