ADR: Static code analysis for Python libraries
Context
Python is a dynamically typed language, so developers are not required to declare types. This works well while a project is small and only a few developers work on it.
However, once a project grows and involves many engineers, understanding how the code works becomes essential to further development. Python provides type annotations to help developers understand code. At present, the type annotations in our Python libraries serve only as hints for developers and their IDEs: nothing enforces them, so they can simply be ignored.
As a result, methods are sometimes called with arguments of the wrong type, and the resulting bugs surface unexpectedly at runtime, only under certain conditions.
It is not uncommon to get a runtime error such as: AttributeError: 'dict' object has no attribute 'to_JSON'
However, a static type checker could easily catch such bugs before the code runs.
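The following is a minimal sketch of how such a bug arises. The `Record` class and the `serialize` and `demo` functions are hypothetical names invented for this illustration; they are not taken from any of our libraries. The call inside `demo` deliberately violates the annotation, which Python accepts at runtime but which a checker such as mypy or pytype would be expected to report without executing the code.

```python
import json
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical payload type, used only for this illustration."""
    id: str
    kind: str

    def to_JSON(self) -> str:
        return json.dumps({"id": self.id, "kind": self.kind})


def serialize(record: Record) -> str:
    """Annotated to accept a Record; Python does not enforce this at runtime."""
    return record.to_JSON()


def demo() -> str:
    """Pass a dict where a Record is expected and return the resulting error text."""
    try:
        # Deliberate type error: the annotation says Record, the argument is a dict.
        # The failure appears only when this line actually executes.
        return serialize({"id": "1", "kind": "well"})
    except AttributeError as exc:
        return str(exc)
```

Calling `demo()` returns the same message as the runtime error quoted above, while `serialize(Record("1", "well"))` works as intended. The point of the sketch is that the mistake is visible from the annotations alone, so it does not need a unit test that happens to exercise this call path.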
Decision
Add a static analysis step for type checking to the CI/CD pipelines, immediately before the unit tests. The step will run in a container with Python static analysis tools preinstalled (e.g., pytype or mypy).
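As a rough sketch of what such a step could execute, the script below builds and runs a mypy invocation and returns its exit code, so that a non-zero result can fail the job before the unit tests start. It assumes mypy is the chosen tool, that it is installed in the container, and that its version accepts the `--exclude` option; the function names `build_command` and `run_type_check` are hypothetical and do not exist in our pipelines.

```python
import subprocess
import sys
from typing import List, Sequence


def build_command(targets: Sequence[str], excludes: Sequence[str] = ()) -> List[str]:
    """Build a mypy invocation that runs as a module of the current interpreter."""
    command = [sys.executable, "-m", "mypy"]
    for pattern in excludes:
        # Assumption: the installed mypy supports --exclude (a path regex).
        command += ["--exclude", pattern]
    command += list(targets)
    return command


def run_type_check(targets: Sequence[str], excludes: Sequence[str] = ()) -> int:
    """Run the checker and return its exit code; non-zero should fail the CI job."""
    completed = subprocess.run(build_command(targets, excludes), check=False)
    return completed.returncode
```

A pipeline job could then call something like `sys.exit(run_type_check(["osdu_api"], ["osdu_api/providers"]))`, which would check the package while skipping the CSP-specific code. The same structure would apply to pytype, with a different command line.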
Initially, we will add static analysis to the following libraries:
- https://community.opengroup.org/osdu/platform/system/sdks/common-python-sdk/-/tree/master/osdu_api (excluding CSP-specific code from osdu_api/providers);
- https://community.opengroup.org/osdu/platform/data-flow/ingestion/osdu-ingestion-lib;
- https://community.opengroup.org/osdu/platform/data-flow/ingestion/osdu-airflow-lib.
Later, static analysis can be extended to other Python libraries.
Consequences
Pros:
- Subtle bugs will be much easier to catch without writing extra unit tests;
- Developers will have to follow the type annotations, which will make the code more readable and understandable.
Cons:
- Existing code will need to be refactored to pass static analysis;
- Some developers might find the new rules too strict.