Fix: Allow NaN values in optional object fields during bulk validation

Summary

Fixes issue #667, where bulk data ingestion fails when optional object fields contain null values.

  • Root cause: When JSON payloads have null for optional object fields, pandas converts the nulls to NaN (float64), and a float value then fails schema validation for an object field
  • Solution: Add bulk_mode parameter to DataValidator.validate() that filters NaN errors only for optional (non-required) fields
  • Safety: Required fields with NaN still fail validation, ensuring data integrity

Changes

  • Add bulk_mode parameter to DataValidator.validate() for relaxed validation
  • Implement schema-aware NaN filtering that distinguishes between required and optional fields
  • Enable bulk_mode=True in bulk data ingestion endpoint (app/api/routes/data/api.py)
  • Add comprehensive tests for nullable field validation scenarios
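
The schema-aware filtering described above can be illustrated with a minimal, standard-library-only sketch. It is not the actual DataValidator code: the SCHEMA layout, the validate() function, and the error message format are all assumptions made for illustration. Only the bulk_mode flag and the required/optional distinction come from this merge request.

```python
import math
from typing import Any

# Hypothetical schema, loosely modeled on JSON Schema (not the project's real one).
SCHEMA = {
    "required": ["id"],
    "properties": {
        "id": {"type": "integer"},
        "metadata": {"type": "object"},
    },
}

# Minimal type checks for the two types used in the hypothetical schema.
TYPE_CHECKS = {
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "object": lambda v: isinstance(v, dict),
}


def is_nan(value: Any) -> bool:
    """Return True only for float NaN, the value a null turns into after conversion."""
    return isinstance(value, float) and math.isnan(value)


def validate(record: dict, schema: dict, bulk_mode: bool = False) -> list:
    """Return a list of error messages; an empty list means the record is valid."""
    required = set(schema.get("required", []))
    errors = []
    for name, spec in schema["properties"].items():
        if name not in record:
            # An absent optional field is fine; an absent required field is not.
            if name in required:
                errors.append(f"{name}: missing required field")
            continue
        value = record[name]
        if TYPE_CHECKS[spec["type"]](value):
            continue
        # Relaxed path: ignore NaN only when the field is optional.
        if bulk_mode and is_nan(value) and name not in required:
            continue
        errors.append(f"{name}: expected {spec['type']}, got {type(value).__name__}")
    return errors


nan = float("nan")
strict_errors = validate({"id": 1, "metadata": nan}, SCHEMA)
bulk_errors = validate({"id": 1, "metadata": nan}, SCHEMA, bulk_mode=True)
required_errors = validate({"id": nan, "metadata": nan}, SCHEMA, bulk_mode=True)
```

In this sketch the NaN check sits after the type check, so a valid value never reaches the relaxed path, and a NaN in a required field still falls through to the error branch even with bulk_mode=True.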

Test plan

  • Single record without optional field passes strict validation
  • Records with null optional fields fail strict validation (documents the bug)
  • Records with null optional fields pass with bulk_mode=True
  • Required fields with NaN still fail even in bulk_mode=True
  • Existing unit tests pass
  • Linting passes (WPS110 variable naming fixed)

Closes #667
