ADR: Optional Audit Logs per Partition
- Under review
Context & Scope
As an operator of OSDU, we are seeing logging costs rise over time. For instance, in a single day we saw over 500 GB of logs created per environment, which can cost $X000 a day.
A large percentage of these logs are audit logs, especially for READ actions such as list groups in Entitlements, which can be called millions of times a day.
We have deployments and partitions of varying degrees of sensitivity. For example, some deployments exist strictly for development purposes, and even in our production systems some partitions are used solely for development by their clients.
As a client of the system, I would like to control which partitions have audit logs enabled and, where they are enabled, whether they cover write operations only (i.e. create, update and delete actions) or all events in the system.
This lets me decide whether the benefit of running audit logs in a particular partition is worth the cost, and gives the client control based on the sensitivity of the partition.
This has precedent in other systems. For example, in Google Cloud or Azure I can choose, for most resources I deploy, whether to turn on and retain audit logs, even though those resources could hold sensitive information. The control and the choice rest with me as the client.
I could choose to have all audit logs turned on at all times. This is the current state. However this has potentially high cost implications. If the partition or deployment of OSDU does not contain sensitive information then I would like the choice to turn off audit logs.
I could make the option to turn logs on or off part of the build configuration. However, this means I need to redeploy whenever I want to switch logs on or off. It also means end users may never be given this control.
I could make the configuration for the entire environment rather than per partition. However a strong use case for partitions is to separate them based on data sensitivity e.g. I may have a partition for development purposes and another partition in the same deployment with production data.
We are proposing the ability to have audit logs turned on or off by choice of the operator and/or client.
One implementation would be a feature flag served by the partition service, which would allow the setting to be changed at runtime and per partition.
We would like two separate feature flags: one to turn the READ action audit logs on or off, and another to turn the CREATE/UPDATE/DELETE/PUBLISH/JOB_RUN action audit logs on or off.
By default both flags should be turned on, so the system is 'secure' by default. The operator and/or end user can then choose to turn individual flags off.
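The two-flag proposal above could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual OSDU implementation: the flag names `audit.read.enabled` and `audit.write.enabled`, the `PartitionFlags` lookup interface and the `AuditLogGate` class are all hypothetical, and the in-memory map stands in for whatever the partition service would return for each partition.

```java
import java.util.Map;

// Sketch of a per-partition audit gate driven by two feature flags.
// All names here are hypothetical; none come from an existing OSDU API.
final class AuditLogGate {
    // Assumed flag names, one for READ actions and one for all write-type actions.
    static final String READ_FLAG = "audit.read.enabled";
    static final String WRITE_FLAG = "audit.write.enabled";

    enum Action { READ, CREATE, UPDATE, DELETE, PUBLISH, JOB_RUN }

    // Stand-in for a runtime lookup against the partition service.
    // Returns null when the flag has not been set for the partition.
    interface PartitionFlags {
        Boolean get(String partitionId, String flagName);
    }

    private final PartitionFlags flags;

    AuditLogGate(PartitionFlags flags) {
        this.flags = flags;
    }

    // READ is controlled by one flag; every other action shares the second flag.
    // An unset flag counts as enabled, so audit logging stays on by default.
    boolean shouldAudit(String partitionId, Action action) {
        String flagName = (action == Action.READ) ? READ_FLAG : WRITE_FLAG;
        Boolean value = flags.get(partitionId, flagName);
        return value == null || value;
    }

    public static void main(String[] args) {
        // In-memory flag store: the dev partition has opted out of READ audit logs,
        // the prod partition has set nothing and therefore keeps both defaults.
        Map<String, Map<String, Boolean>> store = Map.of(
                "dev-partition", Map.of(READ_FLAG, false),
                "prod-partition", Map.<String, Boolean>of());
        PartitionFlags lookup =
                (partition, flag) -> store.getOrDefault(partition, Map.of()).get(flag);

        AuditLogGate gate = new AuditLogGate(lookup);
        System.out.println(gate.shouldAudit("dev-partition", Action.READ));   // false
        System.out.println(gate.shouldAudit("dev-partition", Action.UPDATE)); // true
        System.out.println(gate.shouldAudit("prod-partition", Action.READ));  // true
    }
}
```

Because the flag is resolved on each call, a change made in the partition service would take effect without a redeploy. A real service would probably cache the lookup for a short period so that it does not add a partition service call to every audited request.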