
Policy-based validation for Kubernetes (and more!)
Getting started with Conftest to build policies for Kubernetes
While the Kubernetes API server will provide basic schema validation for Kubernetes manifests, we often want to be able to apply our own validation rules to the actual contents of the manifests.

But sadly not the same OPA…
Instead of ad-hoc, imperative checks, we would like to be able to define policies that we can reuse time-after-time, across many resources, and in a way that can embed into our existing software & operational lifecycle.
Enter Conftest, an policy-as-code tool built upon Open Policy Agent - a general purpose policy framework - and its policy langage, Rego language.
Why would we need this?
Validating our resources doesn’t just come down to making them work, it means providing add a whole extra layer of security and confidence. This ranges from unit-testing our resources to full end-to-end integration and regression testing our running infrastructure.
Some use-cases include…
🛠️ Running a unit test suite of policies locally when developing new resources, ensuring the resources adhere to best practices
- Ensuring team labels are applied to resources
- Warning when using
:latest
tag - Ensuring a
Service
has an appropriateselector
configured - Requiring credentials to be provided in the event a private image registry is used
🐳 Running unit & integration tests in a CI/CD pipeline that validate expected changes
- Ensuring that a PVC reference will not change
- Ensuring that Pod
resources
are within an acceptable range for an environment - Ensuring relevant labels & annotations are present as required by Prometheus
✅ Running infrastructure validation/acceptance tests to ensure that newly-provisioned infrastructure was deployed correctly
- Ensuring that a
Service
has liveEndpoint
s - Validating all expected resources are deployed as part of a Helm release
- Validating a blue-green deploy successfully completed, leaving only the appropriate colour running
🔒 Protecting against regressions when changing existing resources, or reporting infomation between releases
- Warning when using a deprecated application env var in a
Pod
spec - Ensuring volume names are accidentally changed between versions
- Diffing container images between releases to detect changes
- Ensuring that version information is updated in annotations
And that’s just for starters!
Since OPA natively works with JSON, and Conftest extends this by supporting
.yaml
, .env
, .dockerignore
, Dockerfile
, .hcl
, and more - we can
see how these practices can be extended across the entire stack.
How does it work?
First of all, we write our policies in the Rego language, then run these policies against one or more documents.
These documents can be local Kubernetes manifests; Kubernetes resources provided via. stdin; a custom data structure; or even a mix of multiple documents at once - Conftest is completely data-agnostic!
Let’s start simple with the following Secret
manifest:
apiVersion: v1kind: Secrettype: Opaquemetadata: name: tokens namespace: foo annotations: meta.helm.sh/release-name: tokens meta.helm.sh/release-namespace: foo labels: app.kubernetes.io/managed-by: Helmdata: token-a: "dG9rZW4tYQ==" token-c: "dG9rZW4tYw==" password: "aHVudGVyNDI=" private-key-1: "LS0tLS1CRUdJTiBQUklWQVRFIEtFWQ==" private-key-2: "LS0tLS1CRUdJTiBQUklWQVRFIEtFWQ=="
A basic policy for this could look like:
package secrets.example
import rego.v1
deny contains reason if { input.kind == "Secret" not input.data["token-b"] reason := "token-b is required"}
This policy lives in the namespace secrets.example
and contains a single rule,
which is comprised of a rule head - deny contains reason if
- and the body,
which contains expressions to evalute.
The expressions in the rule body are implicitly combined with a logical and
,
meaning that if all of the expressions are truthy, the entire rule passes, and
if any of them are falsey, the rule evaluation is stopped and the rule fails.
As you can see, we can access to the input
object, which is the file we provided.
Since our rule head is deny
, then if the rule passes (all expressions are true),
then we get a failure reported in Conftest.
Trying it out
Now, if we run this with:
conftest test resources/secret.yaml
We should see the following output, with the message being denoted by the
rule’s reason
variable:
FAIL - - secrets.example - token-b is required
This is because the provided Secret
does not have a token-b
data entry
and as such, will cause all of the expressions to evaluate as true
and
therefore the entire rule will pass, reporting the failure.
However, if we run this rule against a Deployment
manifest, the first expression will
evaluate as false:
input.kind == "Secret"
And the rule will fail, stopping processing any more expressions and nothing is reported in Conftest’s output.
🎉 The beauty of this, is that we can apply an entire suite of tests against a resource and only applicable ones will report failures!
Alternatively, you can define a rule’s head using violation
or warn
, and
optionally suffix these terms with an underscored identifier.
-
violation
- This allows you to return an object, ie. to provide extra metadata or context -
warn
- Prints a warning instead of a failure in the console output
package secrets.example
token_name := "token-b"
violation_token_required contains reason if { input.kind == "Secret" not input.data[token_name]
reason := { "msg": sprintf("%s is required", [token_name]), "namespace": input.metadata.namespace, }}
And if we run this with -o json
, we can see that the other keys in the
reason
object are exposed as the metadata
object.
[ { "filename": "", "namespace": "secrets.example_1", "successes": 0, "failures": [ { "msg": "token-b is required", "metadata": { "namespace": "foo" } } ] }]
Conftest uses OPA under the hood, which works by building a JSON object that is the result of evaluating each rule against the input data.
If you run this directly in the Rego playground, you will see the following output:
{ "deny": [ { "msg": "token-b is required", "namespace": "foo" } ]}
This is because each passing rule is adding to the deny
array.
Then this data structure is parsed by Conftest to provide more relevant console
output or structured data (when using -o json
)
Building more complex rules
Now that we’ve grasped the fundamentals, let’s build some interesting rules.
Evaluating rules over multiple items
If any expressions in a rule evaluate to multiple items, then the rule will return multiple results. This essentially ‘forks’ the execution of the rule for each item. While this can take some getting used to, it’s also incredibly powerful.
package secrets.example
import rego.v1
violation contains reason if { input.kind == "Secret" input.metadata.name == "tokens" some key, _ in input.data # this iterates over each entry, therefore evaluating the # rule against each individually
contains(key, "private-key") reason := sprintf("No private keys allowed - found '%s'", [key])}
conftest test --namespace secrets.example ./resources/secret.yamlFAIL - ./resources/secret.yaml - secrets.example - No private keys - found 'private-key-1'FAIL - ./resources/secret.yaml - secrets.example - No private keys - found 'private-key-2'
Comprehensions
If you instead wanted to collect multiple values, you can use use a comprehension. This can be used to build an array, a set or an object.
keys := [key | some key, value in input.metadata.annotations]# ^- key is the yielded value# ^- keys is an array of the annotation keys
The body inside the comprehension can also contain additional expressions, allowing us to filter or transform elements:
package secrets.example
pkey_header := "-----BEGIN PRIVATE KEY"
violation contains reason if { input.kind == "Secret" private_keys := [result | # `result` is the yielded value some name, value in input.data # iterate over key-value pairs decoded := base64.decode(value) contains(decoded, pkey_header) # filter item out if it isn't a private key result := { # build an object to yield "name": name, "key": decoded, } ] # `private_keys` is a filtered array of objects count(private_keys) > 1 # continue if multiple items reason := { "msg": sprintf("Multiple private keys provided in secret '%s'", [input.metadata.name]), "private_keys": private_keys, }}
conftest test ./resources/secret.yaml -o json[ { "filename": "./resources/secret.yaml", "namespace": "secrets.example_1", "successes": 0, "failures": [ { "msg": "Multiple private keys provided in secret 'tokens'", "metadata": { "private_keys": [ { "key": "-----BEGIN PRIVATE KEY", "name": "private_key_1" }, { "key": "-----BEGIN PRIVATE KEY", "name": "private_key_2" } ] } } ] }]
🗒️ A note on syntax changes
To preserve backwards compatibility, new syntax is being introduced as part of opt-in package imports. The newer syntax is (in my opinion) more expressive, but you may still find lots of examples online using the old syntax.
For example, iterating over an array can be written as following when importing
future.keywords.in
/rego.v1
:
import future.keywords.in
some value in some_array
Where the older syntax (which is still usable) is:
input.metadata.annotations[value]
You can either selectively opt-in to new keywords using the future.keywords
package
(See Future keywords
for more info), or you can opt in to some major breaking changes by importing the
rego.v1
package, which notably changes rule heads:
# beforeviolation[reason] { ... }
# afterimport rego.v1violation contains reason if { ... }
Helper rules & functions
Helper rules
We can also define helper rules which allow us to abstract some of our logic, but won’t produce a failure in Conftest.
# Helper rule that simply returns true or falseis_tokens_secret := true if { input.kind == "Secret" contains(input.metadata.name, "tokens")}
# Helper rule that simply returns true or false# (This is short-hand for the previous example)is_tokens_secret if { input.kind == "Secret" contains(input.metadata.name, "tokens")}
# This helper rule returns `input.metadata.name` if it passesis_tokens_secret := input.metadata.name if { input.kind == "Secret" contains(input.metadata.name, "tokens")}
We can then leverage these helper rules in our rules:
is_tokens_secret := input.metadata.name if { input.kind == "Secret" contains(input.metadata.name, "tokens")}
violation contains reason if { name := is_tokens_secret # if the helper rule fails, this rule will fail # if the helper rule passes, we get the returned value reason := sprintf("%s is a tokens secret", [name])}
Functions
As well as this, we can define functions which allow us to parameterise the input:
# no return value, so implicitly true or falseis_cert(value) if { contains(value, "---BEGIN")}
# or alternatively, we could write it like so:# is_cert(value) := contains(value, "---BEGIN")
# v-- the object to search# v-- the return value,has_private_keys(secret_data) := result if { input.kind == "Secret" # We can still access `input`
# v-- comprehension used to build up the `result` variable, which we return result := [result | some name, value in secret_data decoded := base64.decode(value) is_cert(decoded) result := { "name": name, "key": decoded, } ]}
violation contains reason if { private_keys := has_private_keys(input.data) count(private_keys) > 1
reason := { "msg": sprintf("Multiple private keys provided in secret '%s'", [input.metadata.name]), "private_keys": private_keys, }}
Other use-cases
It’s also worth noting that you can do something very interesting things with helper rules & functions.
For example, this blog post by Styra explains how to model ‘or’ logic, by definining the same helper rule/function multiple times.
package secrets.example# (A contrived example)
has_token if input.data["token-a"]
has_token if { input.kind == "Secret" input.data["token-b"]}
has_token if { value := input.data["token-c"]}
violation contains reason if { has_token # ^- resolves to true as at least _one_ of `has_token` # helper rule implementations passes reason := "Secret contains a token"}
FAIL - ./resources/secret.yaml - secrets.example - Secret contains a token
Partial rules
Building upon the previous example, if we were to try and return different values from this helper rule:
has_token := input.data["token-a"]
has_token := value if { input.kind == "Secret" value := input.data["token-b"]}
# these evaluate to different values...
We get the following error:
eval_conflict_error: complete rules must not produce multiple outputs
This is because it has_token
is a ‘complete rule’.
However, we can also define ‘partial rules’ which are allowed to return multiple values as a set:
package secrets.example# (Another contrived example)
has_token contains value if { input.kind == "Secret" value := input.data["token-a"]}
has_token contains value if { input.kind == "Secret" value := input.data["token-b"]}
has_token contains value if { input.kind == "Secret" value := input.data["token-c"]}
violation contains reason if { tokens := has_token reason := sprintf("Secret contains tokens: %s", [tokens])}
FAIL - ./resources/secret.yaml - secrets.example - Secret contains tokens: {"dG9rZW4tYQ==", "dG9rZW4tYw=="}
These helers should look very familiar to the Conftest rules we’ve writing!
Sharing rules & functions
If you have a number of helper rules or functions you would like to share
between policies, you can also import
their namespace and re-use them
between files.
Writing dynamic rules using ‘data’
So far, we’ve written some fairly static rules. Sure, we can add more flexibility by making them more data-driven:
bad_keys := {"token-a", "token-b", "token-b"}
violation contains reason if { input.kind == "Secret" some key, _ in input.data
key in bad_keys reason := sprintf("Bad key detected: %s", [key])}
But this hard-coded approach doesn’t scale, or allow us to re-use these policies accross our estate.
However, Conftest provides the means to dynamically provide data in yaml files,
which we then provide to the program via. the --data
flag.
bad_keys: - token-a - token-b - token-c
This is then merged into the data
object.
package secrets.example
violation contains reason if { input.kind == "Secret" some key, _ in input.data
key in data.bad_keys reason := sprintf("Bad key detected: %s", [key])}
conftest test --data data/data.yaml ./resources/secret.yaml
FAIL - ./resources/secret.yaml - secrets.example - Bad key detected: token-aFAIL - ./resources/secret.yaml - secrets.example - Bad key detected: token-c
If you are requiring data to be provided, I advise to always add separate rules that verify the existence & shape of the data.
This safeguards against the event you forget to pass any data, or you provide the wrong data.
violation contains "No bad_keys provided" if not data.bad_keys
Without these rules to indicate bad data, the above rule would fail when
trying to access data.bad_keys
, meaning we do not see a failure in the Conftest
report
Merging data for more complex policies
But we don’t have to stop there. As already mentioned, Conftest isn’t limited to parsing Kubernetes yaml manifests.
For example, we might want to compare two versions of the same resource, or
validate multiple resources together (ie. a Service
and a Deployment
).
While we can build this data into a single document ourselves (ie. with some
jq
/yq
magic), Conftest actually provides a --combine
flag, which will merge
each provided document into a single input
array.
Then, the input
object is the following shape:
[ { "path": "resources/resource-a.yaml", "contents": { ... }, }, { "path": "resources/resource-b.yaml", "contents": { ... }, }]
This extends the use-case of Conftest massively!
Some examples use cases might be…
-
Checking that a
Service
has a valid selector and is targetting the rightPod
s -
Auditing where specific
ConfigMap
values are used, across multipleDeployments
-
Validating that a Helm release has deployed the expected resources
-
Extracting env vars from a
Dockerfile
and ensuring these env vars are declared in aPod
spec
An example using combine
For example, if we also validate the following Pod
spec as well as our
existing Secret
:
apiVersion: v1kind: Podmetadata: name: nginxspec: containers: - name: nginx image: nginx:1.14.2 volumeMounts: - name: tokens mountPath: /var/tokens
volumes: - name: tokens secret: secretName: tokenz items: - key: token-a path: token-a - key: token-c path: token-c - key: private-key-1 path: private-key-1 - key: private-key-2 path: private-key-2 - key: token-b path: token-b
With the following policy:
package secrets.example
import rego.v1
# Helper rules
secret := resource.contents if { some resource in input resource.contents.kind == "Secret"}
pod := resource.contents if { some resource in input resource.contents.kind == "Pod"}
nginx_container := container if { some container in pod.spec.containers container.name == "nginx"}
pod_volume := volume if { some volume in pod.spec.volumes volume.name == "tokenz"}
# The meat and potatoes
violation contains "No Secret provided" if not secretviolation contains "No Pod provided" if not pod
violation_container_not_mounting_volume contains reason if { volume_mount_names := [volume_mount.name | some volume_mount in nginx_container.volumeMounts] not pod_volume.name in volume_mount_names reason := sprintf("volume '%s' is not mounted by container '%s'", [pod_volume.name, nginx_container.name])}
violation_no_volume_for_secret contains reason if { some volume in pod.spec.volumes not volume.secret.secretName == secret.metadata.name
reason := sprintf("secret '%s' has no associated volume in Pod spec", [secret.metadata.name])}
violation_secret_data_not_explicitly_mapped contains reason if { some secret_datum_key, _ in secret.data
volume_items := [item.key | some item in pod_volume.secret.items]
not secret_datum_key in volume_items reason := sprintf("secret datum '%s' not explicitly mapped in to Pod volume", [secret_datum_key])}
violation_only_valid_secret_data_mapped contains reason if { some volume_item in pod_volume.secret.items not(volume_item.key in [secret_datum_key | some secret_datum_key, _ in secret.data])
reason := sprintf("volume is attempting to map invalid data key '%s'", [volume_item.key])}
By running:
conftest test ./resources/pod.yaml ./resources.secret.yaml
FAIL - Combined - secrets.example - volume 'tokenz' is not mounted by container 'nginx'FAIL - Combined - secrets.example - secret 'tokens' has no associated volume in Pod specFAIL - Combined - secrets.example - secret datum 'password' not explicitly mapped in to Pod volumeFAIL - Combined - secrets.example - volume is attempting to map invalid data key 'token-b'
Conclusion
We’ve covered a lot of ground in this post, but this only scratches the surface as to how you can use Conftest.
Hopefully you can see the power provided by Conftest, Rego and OPA, and how you might use it in your workflow.
I’ll add some more specific examples in a an upcoming cheatsheet, so stay tuned!