Experimental FOSSA caching
This MR adds some experimental FOSSA logic, intended to ease the pain of using FOSSA. I'd like to enable it on one service to get some "real-world" testing, then apply it globally once we see that it works reasonably well. I'm very interested in feedback on how this is working and on ways it could be improved before we make it standard.
Note that this uses the image fossa-with-cache/incremental:latest, meaning it grabs the latest build rather than a stable release. This is to make it easier to iterate and refine during this testing phase. Before it is moved to the ci-cd-pipelines project, it will be changed to follow a release tag.
Caching of the NOTICE Files
The primary feature of the fossa-with-cache tool is that it caches the NOTICE that FOSSA generates and associates it with a list of project dependencies. If the dependencies do not change, the tool re-uses the previously generated NOTICE rather than asking FOSSA to create a new one. This should cut down on instances of small, jittery changes in the NOTICE from one generation to the next.
The cache ages out after a period of time (default = 1 week), so the first pipeline run after that age will regenerate the NOTICE even if a cached version is available. This avoids going too long without reconsulting FOSSA: its database is constantly evolving, and it may have new information that the previous run didn't.
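To make the caching and age-out behavior concrete, here is a minimal sketch of the idea. It is not the tool's actual code: the names dependency_key, get_notice, and CachedNotice are hypothetical, the in-memory dict stands in for whatever storage the tool really uses, and hashing the sorted dependency list is an assumption about how the NOTICE is associated with the dependencies.

```python
import hashlib
import time
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional

ONE_WEEK_SECONDS = 7 * 24 * 60 * 60  # default age-out period from the description


@dataclass
class CachedNotice:
    text: str
    created_at: float  # seconds since the epoch


def dependency_key(dependencies: Iterable[str]) -> str:
    """Hash the sorted, de-duplicated dependency list so ordering does not matter."""
    joined = "\n".join(sorted(set(dependencies)))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def get_notice(
    dependencies: Iterable[str],
    cache: Dict[str, CachedNotice],
    generate: Callable[[], str],
    now: Optional[float] = None,
    max_age: float = ONE_WEEK_SECONDS,
) -> str:
    """Return the cached NOTICE if it is fresh enough, else regenerate and store it."""
    now = time.time() if now is None else now
    key = dependency_key(dependencies)
    entry = cache.get(key)
    if entry is not None and now - entry.created_at < max_age:
        return entry.text  # dependencies unchanged and cache still fresh
    text = generate()  # stand-in for asking FOSSA for a new NOTICE
    cache[key] = CachedNotice(text, now)
    return text
```

In this sketch a changed dependency list simply produces a different key, so it misses the cache and triggers a regeneration, while an expired entry is overwritten in place.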
Single package names, list of URLs
When multiple different package names refer to the same package (like spring-security & Spring Security), a single canonical name is used instead. When multiple URLs are used, they are concatenated into a comma-separated list. The configuration for this is hard-coded and will need to be updated over time. But, at least for the known cases, this should cut down on NOTICE differences that are only typographical.
This does mean that every NOTICE file will change the first time the pipeline uses this logic. This is unavoidable, but should only happen once on each project.
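As a rough illustration of the normalization described above, the sketch below folds aliases into one name and joins the URLs. The alias table, the function names, and the example.com URLs are all made up for illustration; the real hard-coded configuration is not shown in this MR. Which spelling counts as canonical, and sorting the URLs so the output is stable, are assumptions as well.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical alias table; the tool's real hard-coded configuration may differ.
CANONICAL_NAMES: Dict[str, str] = {
    "spring-security": "Spring Security",
}


def canonical_name(name: str) -> str:
    """Map a known alias to its canonical name; unknown names pass through unchanged."""
    return CANONICAL_NAMES.get(name, name)


def merge_packages(entries: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Group (name, url) pairs by canonical name and join the distinct URLs with commas."""
    urls_by_name: Dict[str, List[str]] = {}
    for name, url in entries:
        urls = urls_by_name.setdefault(canonical_name(name), [])
        if url not in urls:
            urls.append(url)
    # Sorting is an assumption here, chosen so the output does not depend on input order.
    return {name: ", ".join(sorted(urls)) for name, urls in urls_by_name.items()}
```

With a table like this, two entries that differ only in spelling collapse into a single NOTICE entry, which is the kind of typographical difference the change is meant to remove.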
NOTICE files can be updated without re-running the pipeline
If there are new commits to a branch, and the only changes are to the NOTICE file, the fossa-check-notice stage will compare the generated (or cached) NOTICE against the latest committed version, rather than the one associated with the pipeline. This allows developers to fix a NOTICE file without having to re-run the entire pipeline.
This is a bit counterintuitive from a CI perspective. Normally, you expect that pipelines only operate on the specific commit that generated them. However, in this case, I think the non-standard technique is useful to reduce the cost associated with a NOTICE difference.
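The decision of which commit's NOTICE to compare against might look something like the sketch below. This is only my reading of the rule, not the actual job logic: notice_ref_to_compare is a hypothetical name, the list of changed files is assumed to have been computed already (for example, from a diff between the pipeline commit and the branch head), and falling back to the pipeline's own commit when anything other than a NOTICE file changed is an assumption.

```python
from typing import Sequence

NOTICE_FILENAME = "NOTICE"  # assumed file name


def is_notice_path(path: str) -> bool:
    """True for a NOTICE file at the repository root or in any subdirectory."""
    return path == NOTICE_FILENAME or path.endswith("/" + NOTICE_FILENAME)


def notice_ref_to_compare(
    pipeline_sha: str,
    branch_head_sha: str,
    changed_files: Sequence[str],
) -> str:
    """Pick the commit whose NOTICE the generated (or cached) NOTICE is checked against.

    changed_files lists the paths that differ between the pipeline's commit
    and the current head of the branch.
    """
    if pipeline_sha == branch_head_sha:
        return pipeline_sha  # no newer commits on the branch
    only_notice = len(changed_files) > 0 and all(is_notice_path(p) for p in changed_files)
    # Newer commits that touch only NOTICE files: use the latest committed version.
    # Anything else: stay on the pipeline's own commit (assumed fallback).
    return branch_head_sha if only_notice else pipeline_sha
```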
The skipped pipeline for the NOTICE-only commit also shows up in the list of pipelines, including the MR list. We'll need to know to look for previous pipelines if we see a skipped pipeline with a commit message of "Update NOTICE" or similar.
In the case of trusted branches, the NOTICE should be updated on the regular development branch first. Then, move the trusted branch to match (git branch -f trusted-devBranch devBranch or similar) and push it up. Trusted branches already suppress pipelines, so there's no need to push with -o ci.skip on that one. Finally, re-run the fossa-check-notice job on the protected pipeline.
If you only update the trusted branch, the pipeline will pass but the NOTICE files won't get merged in. If you only update the development branch, the pipeline won't see them and will continue to fail. I may add some additional commentary to the pipeline failure message to help with this, or perhaps add special logic around the trusted branches to make this easier. Interested in feedback / ideas on this.
Example: This MR
For pipeline 79827, the original fossa-check-notice job (747962) failed with differences in the NOTICE. I followed the steps to download the cached version, and re-ran the job (748097). The pipeline (79837) from the changed NOTICE was skipped, and that shows up in this MR as the latest pipeline.