ADR: Process Changes to drive OSDU vulnerabilities to zero
Status
- Proposed
- Trialing
- Under review
- Approved
- Retired
Context & Scope
We are at a key inflection point within the OSDU community, where we have switched from experimentation and small-scale deployments to full-scale adoption of the OSDU Data Platform by operators for real-world workflows. To enable this, operators share a common need for enterprise readiness, with application and data security as a fundamental and non-negotiable pillar. Enforcing a security-first culture is a common challenge across industry applications and platforms, which are increasingly built on open-source code and libraries. Following recent attacks by nation-state actors on critical infrastructure that exploited vulnerabilities in open-source libraries, such as Log4Shell in Log4j, the security of open-source software has become a top priority for the Cybersecurity and Infrastructure Security Agency (CISA), which recently published its Secure by Design initiative. Secure coding needs to be incorporated into the development process of the open-source software that drives the world’s most important industries, and OSDU is no exception as more and more energy production depends on it.
As we stand today, the OSDU release process lets milestone releases happen even when there are active vulnerabilities in the codebase. As a first step, we have extended the milestone release period to provide extra sprints for improving overall code quality. Thanks to the efforts of the PMC, Critical-class vulnerabilities have declined steadily over time (see image below from OSDU Security Dashboard); however, the overall counts for other vulnerabilities have remained relatively flat.
As part of this ADR, we are proposing key changes to our release policies and processes to drive active vulnerabilities in the OSDU Data Platform codebase to recommended levels (zero Critical and High severity vulnerabilities, and as few Medium severity vulnerabilities as possible). The ADR is scoped to security vulnerabilities only and does not cover other engineering fundamentals such as scalability, reliability, or performance.
Goal
The key goal is to introduce a security-first culture that helps operators and partners meet their enterprise security SLAs when they use OSDU in production deployments. OSDU milestone releases should be secure by default, leading to faster adoption and delivery of new milestone releases for scalable production deployments in commercial and non-commercial setups.
Current Status
Across all CSP codebases (including the code for CI), we have approximately 4500 vulnerabilities spanning severity classes from Critical to Low. Over 40% of these active vulnerabilities are Critical or High severity. With 4 CSPs involved, plus vulnerabilities in common code, this translates to roughly 1K vulnerabilities on any given CSP implementation. Note: today’s assessment is based primarily on Dependency Scanning, but that should be a good starting point.
While security is a key focus area in the OSDU community today, it is not treated as a blocker to releasing code. Any new feature work, new core service, or DDMS is integrated into the Main branch and even tagged with milestone releases, with no gatekeeping on vulnerability impact.
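The estimate above can be reproduced with a few lines of arithmetic. This is a rough sketch that assumes an even split across the four CSPs and does not count common-code findings separately; the variable names are illustrative, and the 40% figure is treated as a lower bound.

```python
# Rough reproduction of the estimate above. All inputs are the approximate
# figures quoted in this ADR, not fresh scan results.
TOTAL_VULNERABILITIES = 4500   # approximate count across all CSP codebases
CSP_COUNT = 4                  # number of CSP implementations
CRITICAL_HIGH_SHARE = 0.40     # "over 40%", so treated as a lower bound

# Assumption: findings are spread evenly across CSPs and common-code
# findings are not counted separately.
per_csp = TOTAL_VULNERABILITIES / CSP_COUNT
critical_high_min = round(TOTAL_VULNERABILITIES * CRITICAL_HIGH_SHARE)

print(f"Per-CSP estimate: ~{per_csp:.0f}")
print(f"Critical + High (at least): {critical_high_min}")
```

Under these assumptions the even split gives about 1,125 findings per CSP, consistent with the "roughly 1K" figure, and at least 1,800 Critical and High findings overall.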
Proposed solution
To drive security vulnerabilities to zero, at least for the Critical, High, and Medium classes, we first want to align on the core principles and processes we must adopt in the OSDU community:
- We emphasize in the community that we need to make OSDU Secure by Design and Secure by Default. At a high level:
- Secure by Design means that we design and architect components, features, and services so that they follow Zero Trust principles and cover security basics such as using secure identity, not storing any passwords or access strings, and being vigilant about which open-source libraries we take a dependency on.
- Secure by Default means making the stricter configuration the default. Default implementations and samples should therefore ship with configurations that limit access, work on a private network, have rate limits in place, and so on.
- We make Security not just a focus area but a blocker.
- Features, components, or new services should be proactively blocked from entering the OSDU codebase if they do not pass the vulnerability benchmarks.
- This should include existing code which may or may not have been modified in a while but has new vulnerabilities – we should treat those as blockers as well.
- OSDU Milestone releases should not happen with active Critical or High severity vulnerabilities.
- We make clear in the community that all developers are responsible for fixing vulnerabilities: it is not a task only for pre-shipping or the CSPs, and security is a joint priority.
Once we all align on these principles, below are some of the immediate steps that we should take to drive towards a secure OSDU Data Platform:
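The release-blocking principle above can be expressed as a simple check over a list of active findings. The sketch below is illustrative only: the `Finding` fields, the identifiers, and the service names are hypothetical placeholders, not the schema of any OSDU scanning tool.

```python
from collections import Counter
from dataclasses import dataclass

# Severities that block a milestone release under the proposed principle.
BLOCKING_SEVERITIES = {"critical", "high"}


@dataclass(frozen=True)
class Finding:
    """One active vulnerability; the field names are hypothetical."""
    identifier: str
    severity: str  # e.g. "critical", "high", "medium", "low"
    service: str


def release_blockers(findings):
    """Return the findings that would block a milestone release."""
    return [f for f in findings if f.severity.lower() in BLOCKING_SEVERITIES]


def can_release(findings):
    """A milestone may ship only when no blocking findings remain."""
    return not release_blockers(findings)


# Placeholder data for illustration; not real scan output.
findings = [
    Finding("VULN-0001", "High", "storage"),
    Finding("VULN-0002", "medium", "search"),
    Finding("VULN-0003", "low", "storage"),
]

blockers = release_blockers(findings)
print("Release allowed:", can_release(findings))
print("Blockers by service:", dict(Counter(f.service for f in blockers)))
```

Because the check looks at all active findings rather than only recently changed code, it also covers existing code that has picked up new vulnerabilities.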
- No New Milestone Releases: We will not release any new OSDU milestones until all Critical, High, and prioritized Medium severity vulnerabilities (prioritization to be determined based on impact) are fixed or otherwise addressed.
- Since we are very close to M23 release, we should enforce this from M24 onwards.
- Challenge: If we hold off new milestone releases, the vulnerability fixes being made might take a very long time to become available for consumption.
- Mitigation: In the meantime, we should plan and support hotfixes or minor version releases for all security and critical bug fixes. This means rolling out interim updates to the N-1 release for vulnerabilities and critical bugs (for example, triaging the list and continuing to push vulnerability and critical bug fixes to M23).
- New Features are not merged into Main with vulnerabilities: Today, there are no restrictions on merging new features into main branches and tagging them for new milestone releases. No new code with Critical, High, or important Medium severity vulnerabilities (importance to be determined based on impact) should be merged into the main branch.
- PR and ADR Gates: We implement and enforce PR (Pull Request) gates so that no new merges happen in the Main branches until security vulnerabilities are brought to zero, unless the PMC approves an exception for a critical bug. In addition, we should block any new ADRs from approval unless they are tied to fixing security issues or critical bugs.
- CI/CD Pipelines: We develop and enforce CI/CD pipelines that check each repo for vulnerabilities, and merges do not happen if they would result in additional vulnerabilities.
- All Hands on Deck: All developers across all projects/workstreams pivot to security fixes. When we enforce the PR gates and CI/CD pipelines to block merges with vulnerabilities, it would become relatively straightforward to push for all developers to prioritize fixing security vulnerabilities.
- Tooling and Reporting: We already have some good dashboards and tools, but we will invest in new ones as needed to improve the visibility and tracking of vulnerabilities. Using the sustainability funding to acquire tools should be considered if needed.
- CISA Engagement: We should work with CISA to evaluate whether they can add the OSDU Data Platform to their list of OSS projects, and seek their help in fixing the vulnerabilities.
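The PR gate and CI/CD steps above amount to comparing a scan of the target branch with a scan of the merge result and blocking the merge when it introduces new blocking findings. The following is a minimal sketch under stated assumptions: scan results are modelled as plain dictionaries from a vulnerability ID to its severity, the function names and IDs are hypothetical, and the set of prioritized Medium findings is passed in explicitly because that prioritization is still to be determined.

```python
# Severities that always block a merge. Which Medium findings count is still
# to be decided by impact, so they are passed in as an explicit set.
ALWAYS_BLOCKING = {"critical", "high"}


def new_blocking_findings(baseline, candidate, prioritized_medium=frozenset()):
    """Return sorted IDs of blocking vulnerabilities a merge would introduce.

    baseline and candidate map a vulnerability ID to its severity for the
    target branch and for the merge result respectively (assumed format).
    """
    introduced = set(candidate) - set(baseline)
    blocking = []
    for vuln_id in sorted(introduced):
        severity = candidate[vuln_id].lower()
        if severity in ALWAYS_BLOCKING:
            blocking.append(vuln_id)
        elif severity == "medium" and vuln_id in prioritized_medium:
            blocking.append(vuln_id)
    return blocking


def merge_allowed(baseline, candidate, prioritized_medium=frozenset(),
                  pmc_exception=False):
    """Allow the merge if it adds no blocking findings or the PMC approved it."""
    if pmc_exception:
        return True
    return not new_blocking_findings(baseline, candidate, prioritized_medium)


# Placeholder scan results for illustration; not real scanner output.
baseline = {"VULN-0001": "high", "VULN-0002": "medium"}
candidate = {**baseline, "VULN-0003": "critical", "VULN-0004": "low"}

introduced = new_blocking_findings(baseline, candidate)
print("Merge allowed:", merge_allowed(baseline, candidate))
print("Newly introduced blockers:", introduced)
```

Comparing against a baseline keeps the gate focused on what a merge adds, so a fix that removes findings still passes while the existing backlog is worked down separately. A stricter variant that blocks all merges until the backlog reaches zero would check the candidate scan alone.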