Fix: Allow NaN values in optional object fields during bulk validation
Summary
Fixes issue #667, where bulk data ingestion fails when optional object fields contain null values.
- Root cause: When JSON payloads have `null` for optional object fields, pandas converts them to `NaN` (float64), which fails schema validation.
- Solution: Add `bulk_mode` parameter to `DataValidator.validate()` that filters NaN errors only for optional (non-required) fields.
- Safety: Required fields with NaN still fail validation, ensuring data integrity.
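To illustrate the root cause, here is a minimal sketch using only the standard library. The helper `is_valid_optional_object` and the field name `metadata` are hypothetical and not taken from this codebase, and the pandas conversion itself is not reproduced: a plain `float("nan")` stands in for the value pandas is described as producing.

```python
import json
import math


def is_valid_optional_object(value):
    """Hypothetical check: an optional object field may be a dict or None."""
    return value is None or isinstance(value, dict)


# Straight from JSON, a null optional field is None and passes the check.
record = json.loads('{"id": 1, "metadata": null}')
assert record["metadata"] is None
assert is_valid_optional_object(record["metadata"])

# Stand-in for the same field after the conversion described above:
# NaN is a float, so it is neither None nor a dict and the check rejects it.
nan_value = float("nan")
assert isinstance(nan_value, float) and math.isnan(nan_value)
assert not is_valid_optional_object(nan_value)
```

The point of the sketch is that declaring a field nullable does not help once the null has become NaN, because NaN is a float rather than `None`.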
Changes
- Add `bulk_mode` parameter to `DataValidator.validate()` for relaxed validation
- Implement schema-aware NaN filtering that distinguishes between required and optional fields
- Enable `bulk_mode=True` in bulk data ingestion endpoint (`app/api/routes/data/api.py`)
- Add comprehensive tests for nullable field validation scenarios
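The schema-aware filtering listed above could look roughly like the following. This is a sketch under stated assumptions, not the actual `DataValidator` implementation: the standalone `validate` function, the `_is_nan` helper, the schema layout (`required` and `properties` keys mapping to Python types), and the error message format are all hypothetical.

```python
import math
from typing import Any


def _is_nan(value: Any) -> bool:
    """True only for float NaN (hypothetical helper)."""
    return isinstance(value, float) and math.isnan(value)


def validate(record: dict, schema: dict, bulk_mode: bool = False) -> list:
    """Return a list of error messages; an empty list means the record is valid.

    Assumed schema shape: {"required": [...], "properties": {name: type}}.
    """
    required = set(schema.get("required", []))
    properties = schema.get("properties", {})
    errors = []

    # Missing required fields are always errors.
    for name in sorted(required):
        if name not in record:
            errors.append(f"{name}: missing required field")

    for name, value in record.items():
        expected = properties.get(name)
        if expected is None or isinstance(value, expected):
            continue
        # In bulk mode, ignore NaN only when the field is optional.
        if bulk_mode and _is_nan(value) and name not in required:
            continue
        errors.append(f"{name}: expected {expected.__name__}, got {value!r}")

    return errors


schema = {"required": ["id"], "properties": {"id": int, "metadata": dict}}
nan = float("nan")

strict_errors = validate({"id": 1, "metadata": nan}, schema)
bulk_errors = validate({"id": 1, "metadata": nan}, schema, bulk_mode=True)
required_nan_errors = validate({"id": nan, "metadata": nan}, schema, bulk_mode=True)
```

Under these assumptions, `strict_errors` contains one error for `metadata`, `bulk_errors` is empty, and `required_nan_errors` still reports `id`, since the filter applies only to fields outside `required`.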
Test plan
- Single record without optional field passes strict validation
- Records with null optional fields fail strict validation (documents the bug)
- Records with null optional fields pass with `bulk_mode=True`
- Required fields with NaN still fail even in `bulk_mode=True`
- Existing unit tests pass
- Linting passes (WPS110 variable naming fixed)
Closes #667