Best Practices for Kubernetes Security
Introduction
Kubernetes has become the de facto standard for container orchestration, enabling developers to deploy and manage scalable applications efficiently. However, the flexibility and complexity of Kubernetes can introduce security challenges, making clusters and applications vulnerable to attacks.
This guide outlines best practices for securing Kubernetes environments, helping developers and DevOps teams protect their clusters and applications from potential threats.
The Kubernetes Security Attack Surface
Before diving into individual controls, it helps to understand the scope of what you are protecting. A Kubernetes cluster has a much larger attack surface than a traditional server deployment — and many of those surfaces are not immediately obvious when you first start using the platform.
At the infrastructure level, the control plane includes the API server, etcd, the controller manager, and the scheduler. Every action taken in Kubernetes — whether by a human operator, an automation tool, or an application — passes through the API server. If the API server is misconfigured (for example, if anonymous authentication is enabled or network access is unrestricted), an attacker can gain cluster-admin privileges without needing to compromise a single application pod.
The etcd database is particularly sensitive. It stores every Kubernetes object — including Secrets — and if an attacker gains direct read access to etcd, encryption at rest is the only control standing between them and every credential in the cluster. Similarly, the kubelet on each node exposes an HTTP API that, if reachable, can be used to execute commands in any pod running on that node, completely bypassing Kubernetes RBAC.
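As a concrete illustration of closing down that kubelet surface, the kubelet can be configured to reject anonymous requests, require client certificates, delegate authorization to the API server, and disable the legacy read-only port. A minimal sketch of the relevant KubeletConfiguration fields (the file and CA paths are assumptions; adjust for your distribution):

```yaml
# Fragment of the kubelet config file (commonly /var/lib/kubelet/config.yaml)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false              # reject unauthenticated requests to the kubelet API
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook                 # delegate kubelet API authorization to the API server
readOnlyPort: 0                 # disable the legacy unauthenticated read-only port
```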
At the workload level, the threats shift to container images, pod configurations, and service accounts. Container images can carry known CVEs, outdated OS packages, or malicious code injected via a supply chain compromise. Pod specs that grant overly broad Linux capabilities, enable privileged mode, or mount sensitive host paths give attackers a path from a compromised application directly to the host kernel. Service accounts that are auto-mounted into pods and carry excessive RBAC permissions turn any application-level vulnerability into a cluster-level incident.
At the network level, the default allow-all networking model means that a compromised frontend pod can directly reach database pods, internal metadata services (a common vector for credential theft on cloud-hosted clusters), and the API server itself. Without NetworkPolicies in place, lateral movement through a Kubernetes cluster is trivially easy for an attacker who has breached any single pod.
Understanding this layered attack surface is the foundation for prioritizing your security investments. The sections that follow address each layer in turn — but the overarching theme is the same: apply least-privilege at every boundary, verify at admission time, and monitor continuously at runtime.
Why Kubernetes Security Matters
Kubernetes manages critical resources, including containers, storage, and network configurations. A single misconfiguration or vulnerability can compromise the entire infrastructure, leading to data breaches, unauthorized access, or disrupted services.
Key Risks:
- Misconfigurations: Improperly configured Kubernetes components can expose sensitive resources.
- Insecure Secrets Management: Storing unencrypted secrets increases the risk of data leakage.
- Privilege Escalation: Mismanaged role-based access control (RBAC) can allow unauthorized access.
- Container Vulnerabilities: Vulnerabilities in container images can be exploited by attackers.
- Network Exploits: Poorly secured networks can enable unauthorized communication between pods.
Best Practices for Kubernetes Security
1. Secure Kubernetes Clusters
Use Role-Based Access Control (RBAC)
Implement RBAC to enforce least privilege access. Define roles with specific permissions and bind them to users or service accounts.
Example (RBAC Policy):
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: ['']
  resources: ['pods']
  verbs: ['get', 'watch', 'list']
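A Role grants nothing on its own until it is bound to a subject. A companion RoleBinding might look like this (the user name jane is illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: User
  name: jane                     # hypothetical user; replace with a real subject
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```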
Enable Audit Logging
Use Kubernetes audit logs to monitor and investigate cluster activity.
Example (API Server Flag):
--audit-log-path=/var/log/kubernetes/audit.log
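The log path alone produces nothing useful without an audit policy, supplied via the --audit-policy-file flag. A minimal policy sketch that records who touched Secrets without ever logging the secret payloads:

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Log metadata (who, what, when) for Secret access, never request bodies
- level: Metadata
  resources:
  - group: ''
    resources: ['secrets']
# Everything else at the least verbose level
- level: Metadata
```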
Use Secure Configuration Management
- Disable anonymous access to the Kubernetes API.
- Use strong authentication methods, such as certificates or OIDC.
2. Secure Container Images
Scan Images for Vulnerabilities
Use image scanning tools to detect vulnerabilities in container images.
Tools:
- Trivy: Scans container images for known vulnerabilities.
- Clair: Provides static analysis of vulnerabilities.
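For example, Trivy can be run directly against an image locally or in CI, failing the build when serious findings exist (the image name is illustrative):

```shell
# Scan an image and fail (non-zero exit) on HIGH or CRITICAL vulnerabilities
trivy image --severity HIGH,CRITICAL --exit-code 1 myregistry.io/myapp:1.2.3
```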
Use Minimal Base Images
Reduce the attack surface by using lightweight images, such as alpine.
Example (Dockerfile):
FROM python:3.9-alpine
Avoid Hardcoded Secrets
Store sensitive information, like API keys, in Kubernetes Secrets.
Example (Kubernetes Secret):
apiVersion: v1
kind: Secret
metadata:
  name: db-password
type: Opaque
data:
  password: cGFzc3dvcmQ=
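The Secret is then consumed by a pod at runtime, so the value never appears in the image or the pod manifest. A sketch using an environment variable (the image name is illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: myregistry.io/myapp:1.2.3   # hypothetical image
    env:
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: db-password            # the Secret defined above
          key: password
```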
3. Harden Network Security
Use Network Policies
Control communication between pods using network policies.
Example (Network Policy):
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: frontend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: backend
Isolate Namespaces
Separate workloads into different namespaces to improve security and manage access.
Example:
kubectl create namespace production
kubectl create namespace staging
Enable Encryption
Enable encryption for data in transit using TLS and for data at rest using Kubernetes encryption providers.
4. Monitor and Log Activity
Implement Logging
Use centralized logging solutions, such as:
- Fluentd: For collecting and forwarding logs.
- Elasticsearch/Kibana: For log storage and visualization.
Enable Runtime Monitoring
Monitor container behavior to detect and respond to anomalies.
Tools:
- Falco: Monitors container runtime activity for malicious behavior.
- Sysdig: Provides visibility into runtime security.
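As a sketch of what a Falco detection looks like, a custom rule flagging interactive shells spawned inside containers (a common post-exploitation signal) could be written as follows. This follows standard Falco rule syntax, though the exact fields and default macros available depend on your Falco version and ruleset:

```yaml
- rule: Shell Spawned in Container
  desc: Detect an interactive shell started inside a container
  condition: container.id != host and proc.name in (bash, sh, zsh)
  output: "Shell in container (user=%user.name container=%container.name cmd=%proc.cmdline)"
  priority: WARNING
```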
5. Enforce Pod Security Standards
Use Pod Security Admission (PSA)
Set pod security standards using Pod Security Admission namespace labels. (PodSecurityPolicy, the older mechanism, was removed in Kubernetes v1.25 — see the deep dive later in this guide.)
Example (namespace labels enforcing the restricted profile):
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
Run Non-Root Containers
Avoid running containers as the root user.
Example (Dockerfile):
USER 1001
6. Secure CI/CD Pipelines
Integrate Kubernetes security checks into your CI/CD pipelines to catch vulnerabilities before deployment.
Example (GitHub Actions with Kubesec):
jobs:
  security-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4   # check out the repo so the manifest is available
      - name: Run Kubesec
        run: kubesec scan deployment.yaml
Tools for Kubernetes Security
Kubernetes-Native Tools
- Kube-bench: Checks Kubernetes clusters for compliance with CIS benchmarks.
- Kube-hunter: Identifies potential security issues in Kubernetes clusters.
Third-Party Tools
- Aqua Security: Comprehensive container security platform.
- Twistlock (now part of Palo Alto Networks Prisma Cloud): Provides runtime protection and compliance enforcement.
Challenges and Solutions
Challenge: Complexity of Kubernetes Configurations
Solution:
- Use tools like Kubesec and OPA Gatekeeper to enforce security policies.
Challenge: Managing Secrets Securely
Solution:
- Use third-party secret management tools like HashiCorp Vault or native solutions like Kubernetes Secrets.
Challenge: Keeping Up with Updates
Solution:
- Automate updates using tools like Rancher or monitor releases for patches.
Deep Dive: RBAC Implementation
Role-Based Access Control is not just about attaching a handful of permissions to a user — it requires thoughtful design to prevent privilege escalation while still letting teams work efficiently. The challenge with RBAC is that the Kubernetes permission model has several non-obvious behaviors that can lead to unintentionally broad access.
The most important conceptual shift is recognizing that RBAC permissions are additive only — there is no way to explicitly deny a permission that has been granted by another binding. If a user or service account is bound to multiple roles, their effective permissions are the union of all role rules. This means the only way to take away a permission is to remove the binding or modify the role itself.
Another common source of confusion is the relationship between Role/ClusterRole and RoleBinding/ClusterRoleBinding. Many teams assume that using a ClusterRole means cluster-wide access, but that is only true when it is bound via a ClusterRoleBinding. A ClusterRole bound via a namespace-scoped RoleBinding grants those permissions only within that namespace — which is actually the preferred pattern for granting commonly-needed permissions across many namespaces from a single reusable role definition.
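That reusable pattern — one ClusterRole definition, bound per namespace with a RoleBinding — looks like this (the role, group, and namespace names are illustrative):

```yaml
# Defined once, cluster-wide
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: log-reader
rules:
- apiGroups: ['']
  resources: ['pods/log']
  verbs: ['get', 'list']
---
# Bound per namespace: grants log-reader only inside team-a
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-log-readers
  namespace: team-a
subjects:
- kind: Group
  name: team-a-devs              # hypothetical group
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: log-reader
  apiGroup: rbac.authorization.k8s.io
```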
The four core RBAC objects in Kubernetes are:
| Object | Scope | Purpose |
|---|---|---|
| Role | Namespace | Grants permissions within one namespace |
| ClusterRole | Cluster-wide | Grants cluster-scoped or cross-namespace permissions |
| RoleBinding | Namespace | Binds a Role or ClusterRole to a subject within a namespace |
| ClusterRoleBinding | Cluster-wide | Binds a ClusterRole to a subject cluster-wide |
The cardinal rule: prefer RoleBinding over ClusterRoleBinding wherever possible. Even when reusing a ClusterRole definition, a RoleBinding confines that permission to a specific namespace.
Service Account Hardening
Applications running inside pods communicate with the Kubernetes API through a ServiceAccount. By default, Kubernetes automatically mounts a service account token into every pod — even when the pod never talks to the API server. Disable auto-mounting and create dedicated ServiceAccounts for workloads that genuinely need API access:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: backend-sa
  namespace: production
automountServiceAccountToken: false
For workloads that do need API access, bind granular permissions and restrict to named resources:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: production
  name: configmap-reader
rules:
- apiGroups: ['']
  resources: ['configmaps']
  verbs: ['get', 'list']
  resourceNames: ['app-config'] # restrict to one specific ConfigMap
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: backend-reads-config
  namespace: production
subjects:
- kind: ServiceAccount
  name: backend-sa
  namespace: production
roleRef:
  kind: Role
  apiGroup: rbac.authorization.k8s.io
  name: configmap-reader
RBAC for Human Operators via OIDC
For human operators, avoid long-lived certificate credentials. Integrate an OIDC provider (Dex, Okta, Google Workspace) so tokens expire and can be revoked centrally:
# kube-apiserver flags
--oidc-issuer-url=https://accounts.google.com
--oidc-client-id=your-client-id
--oidc-username-claim=email
--oidc-groups-claim=groups
Then bind roles to OIDC groups rather than individual users:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dev-team-namespace-access
  namespace: production
subjects:
- kind: Group
  name: [email protected]
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: edit
  apiGroup: rbac.authorization.k8s.io
Auditing RBAC Configuration
Regularly audit your RBAC configuration to catch over-permissioned accounts. The following kubectl commands are useful for periodic reviews:
# Check whether a specific service account can create pods
kubectl auth can-i create pods \
--as system:serviceaccount:production:backend-sa \
-n production
# List all subjects bound to cluster-admin
kubectl get clusterrolebindings -o json | \
jq '.items[] | select(.roleRef.name=="cluster-admin") | {name:.metadata.name, subjects:.subjects}'
# See all roles and bindings in a namespace at a glance
kubectl get roles,rolebindings -n production -o wide
# Enumerate all actions a service account can take (requires kubectl-access-matrix plugin)
kubectl access-matrix --sa production:backend-sa
The kubectl-who-can plugin is also invaluable: kubectl who-can delete secrets -n production shows every subject that holds that permission.
RBAC Privilege Escalation Risks to Know
Several RBAC verbs are especially dangerous and should be granted with extreme caution. The Kubernetes RBAC documentation calls these out explicitly, but they are worth emphasizing because they are frequently overlooked during initial cluster setup.
The escalate verb on roles lets a user create roles with broader permissions than they currently hold — in other words, they can grant themselves arbitrary permissions through a new role definition. The bind verb is similarly dangerous: it lets a user create RoleBinding or ClusterRoleBinding objects that attach any role (including roles with more permissions than the user holds) to any subject. Together, these two verbs effectively bypass the Kubernetes RBAC safety checks on privilege escalation, so they should only ever be granted to trusted cluster administrators.
The impersonate verb deserves particular attention in shared cluster environments. A user with this permission can act as any other user, group, or service account — including system:masters, which bypasses all RBAC checks entirely. This is a legitimate need for some automation tools, but the scope should be tightly restricted using resourceNames to limit impersonation to specific subjects.
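A tightly-scoped impersonation grant, restricted via resourceNames to one named service account, might look like this (the role and account names are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: impersonate-ci-bot-only
rules:
- apiGroups: ['']
  resources: ['serviceaccounts']
  verbs: ['impersonate']
  resourceNames: ['ci-bot']   # hypothetical: may impersonate only this one account
```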
Wildcard access to Secrets via list or watch is a subtler risk. When Kubernetes returns a List response for Secrets, it includes the full data field for every Secret — the base64-encoded values are plain in the response body. Treating list on Secrets as equivalent to get (and therefore restricting it the same way) is essential for preventing bulk secret exfiltration.
In summary:
- escalate: Allows a user to create Roles with broader permissions than they currently hold.
- bind: Allows a user to bind any ClusterRole to themselves, bypassing least-privilege.
- impersonate: Allows a user to act as any other user or service account in the cluster.
- create on serviceaccounts/token: Allows minting new tokens for any service account — effectively inheriting that account’s permissions.
- list on secrets: A list verb on Secrets returns all secret contents in the body, not just metadata — treat it the same as get.
Regularly scan for bindings that grant these verbs using kube-bench, rbac-lookup, or simply using kubectl get clusterrolebindings combined with jq filtering.
NetworkPolicy Patterns and Implementation
By default, every pod in a Kubernetes cluster can communicate with every other pod across every namespace — a significant attack surface. NetworkPolicy resources let you define fine-grained ingress and egress rules at the pod level, using label selectors.
The mental model to internalize is that a NetworkPolicy selects pods (via podSelector) and describes which traffic is permitted for those pods. Pods that are not selected by any policy are fully open; pods that are selected by at least one policy are subject to the union of all matching policies. This is important: policies are additive, so a deny-all policy plus an allow-specific policy together permit only what the allow policy explicitly opens.
Designing NetworkPolicies requires a clear understanding of your application’s data flows. Before writing a single policy, map out every service-to-service communication that your application requires: which pods call which other pods, on which ports, and in which direction. Tools like Cilium’s Hubble, Calico’s flow logs, and Weave’s traffic visualizer can help you discover existing traffic patterns in a running cluster before you start enforcing policies — preventing accidental outages when you first apply a deny-all rule.
One important caveat: NetworkPolicies affect pod-to-pod traffic at the network layer, but they do not replace application-level authentication. A NetworkPolicy allowing traffic from pod A to pod B does not verify that the request actually comes from pod A — IP spoofing aside, any pod that ends up sharing the same IP (e.g., after a restart) would also be permitted. For stronger workload-to-workload identity guarantees, consider a service mesh like Istio or Linkerd, which adds mutual TLS (mTLS) and cryptographic identity to every service call.
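If you do adopt a mesh, enforcing mutual TLS for a namespace is typically a one-object change. A sketch using Istio's PeerAuthentication resource (Linkerd achieves the equivalent with its own configuration):

```yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: production
spec:
  mtls:
    mode: STRICT   # reject any plaintext, non-mesh traffic to pods in this namespace
```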
Important prerequisite: NetworkPolicies are enforced by the Container Network Interface (CNI) plugin. Your cluster must run a CNI that supports NetworkPolicy — such as Calico, Cilium, or Weave Net. Clusters using kubenet (common on cloud-managed Kubernetes) do not enforce NetworkPolicies without installing a compatible CNI.
Start With a Default Deny-All
The single most impactful pattern: apply a deny-all baseline to each namespace, then add explicit allow rules for required traffic. This inverts the default allow-all posture:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {} # empty selector matches ALL pods in the namespace
  policyTypes:
  - Ingress
  - Egress
Once this is applied, no pod in production can send or receive traffic until you explicitly allow it.
Allow Service-to-Service Traffic
After locking down with a deny-all, open only the connections your application requires:
# Allow the backend to receive traffic from the frontend only on port 8080
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
Allow DNS and External Egress
Pods need DNS resolution, and some need to reach external APIs. Add targeted egress rules:
# Allow all pods to resolve DNS via kube-dns
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - ports:
    - port: 53
      protocol: UDP
    - port: 53
      protocol: TCP
---
# Allow the backend to call an external payment API
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-backend-external-egress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 203.0.113.0/24 # payment provider CIDR
    ports:
    - protocol: TCP
      port: 443
Cross-Namespace Traffic for Monitoring
A monitoring stack in a separate namespace must be able to scrape metrics without granting it unrestricted access:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-prometheus-scrape
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: monitoring
      podSelector:
        matchLabels:
          app: prometheus
    ports:
    - protocol: TCP
      port: 9090
Note the use of both namespaceSelector and podSelector in the same from entry — they are ANDed together, meaning only Prometheus pods inside the monitoring namespace are permitted.
Network Segmentation Architecture
graph TD
  subgraph Internet
    EXT[External Traffic]
  end
  subgraph production namespace
    ING[Ingress Controller]
    FE[Frontend Pods]
    BE[Backend Pods]
    DB[Database Pods]
  end
  subgraph monitoring namespace
    PROM[Prometheus]
  end
  EXT -->|HTTPS :443| ING
  ING -->|HTTP :80| FE
  FE -->|TCP :8080| BE
  BE -->|TCP :5432| DB
  PROM -->|TCP :9090| BE
  style DB fill:#ff9999
  style BE fill:#ffcc99
  style FE fill:#ccffcc
Pod Security Standards: A Practical Guide
Kubernetes deprecated PodSecurityPolicy in v1.21 and removed it entirely in v1.25, replacing it with Pod Security Admission (PSA) — a built-in admission controller that enforces three standard security profiles at the namespace level via namespace labels.
Understanding why PSP was replaced is informative for applying PSA correctly. PodSecurityPolicy was a cluster-scoped resource that had confusing interaction semantics: a policy only applied to a pod if the creating user or the pod’s service account had use permission on that policy. This meant that granting a user the ability to create pods implicitly required also granting them access to a PSP — and teams routinely granted permissive PSPs just to make things work, defeating the purpose of the control entirely. PSA solves this by moving policy enforcement to the namespace level with simple labels, removing the confusing RBAC interaction entirely.
The three profiles are designed as a graduated hierarchy. privileged is the baseline for trusted system workloads like CNI plugins or node monitoring agents that genuinely need direct host access. baseline prevents the most common and most dangerous privilege escalation techniques — running as root, using privileged containers, sharing host namespaces — while remaining compatible with the vast majority of containerized applications that simply haven’t been hardened. restricted adds the remaining defenses from current hardening best practices: mandatory non-root execution, capability dropping, seccomp enforcement, and read-only root filesystems.
The key operational insight is that you do not have to choose a single profile for your entire cluster. Apply privileged to system namespaces (kube-system, CNI namespaces), baseline as the default for application namespaces during initial adoption, and restricted for namespaces running public-facing or elevated-trust workloads. This graduated approach lets you harden consistently without a big-bang migration.
The three profiles are:
| Profile | Restriction Level | Typical Use Case |
|---|---|---|
| privileged | None — fully unrestricted | System workloads, infrastructure components |
| baseline | Prevents known exploits; allows common defaults | Most applications that don’t need elevated access |
| restricted | Enforces current Pod hardening best practices | Security-critical or public-facing services |
Each profile can be applied in three modes per namespace:
- enforce — Pods violating the policy are rejected.
- audit — Violations are recorded in API server audit logs but not blocked.
- warn — The API client receives a warning, but the pod is admitted.
The recommended rollout strategy is to start with warn + audit, fix any violations, then graduate to enforce.
Labeling Namespaces for PSA
# Phase 1: observe violations without blocking
kubectl label namespace production \
pod-security.kubernetes.io/warn=restricted \
pod-security.kubernetes.io/warn-version=latest \
pod-security.kubernetes.io/audit=restricted \
pod-security.kubernetes.io/audit-version=latest
# Check audit log for violations, then enforce
kubectl label namespace production \
pod-security.kubernetes.io/enforce=restricted \
pod-security.kubernetes.io/enforce-version=latest
Writing a Compliant Pod Spec (Restricted Profile)
A pod satisfying the restricted profile must:
- Set runAsNonRoot: true at the pod or container level
- Drop ALL Linux capabilities (and add back only what is truly needed)
- Set allowPrivilegeEscalation: false
- Use a RuntimeDefault or Localhost seccomp profile
- Use only permitted volume types (no hostPath)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-app
  namespace: production
spec:
  replicas: 2
  selector:
    matchLabels:
      app: secure-app
  template:
    metadata:
      labels:
        app: secure-app
    spec:
      automountServiceAccountToken: false
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        runAsGroup: 3000
        fsGroup: 2000
        seccompProfile:
          type: RuntimeDefault
      containers:
      - name: app
        image: myregistry.io/myapp:1.2.3
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities:
            drop:
            - ALL
        resources:
          requests:
            cpu: '100m'
            memory: '64Mi'
          limits:
            cpu: '500m'
            memory: '256Mi'
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: var-run
          mountPath: /var/run
      volumes:
      - name: tmp
        emptyDir: {}
      - name: var-run
        emptyDir: {}
PSA Profile Comparison Table
| Control | Privileged | Baseline | Restricted |
|---|---|---|---|
| privileged: true allowed | ✅ | ❌ | ❌ |
| Host namespaces (hostPID, hostNetwork, hostIPC) | ✅ | ❌ | ❌ |
| HostPath volumes | ✅ | ❌ | ❌ |
| runAsNonRoot required | ❌ | ❌ | ✅ |
| Must drop all capabilities | ❌ | ❌ | ✅ |
| Seccomp profile required | ❌ | ❌ | ✅ |
| allowPrivilegeEscalation must be false | ❌ | ❌ | ✅ |
| Suitable for | System/infra | Most apps | Critical/public services |
Migrating from PodSecurityPolicy
If you are migrating a cluster from pre-v1.25, the process is:
1. Audit existing PSPs with kubectl get psp to understand current policy boundaries.
2. Map them to PSA profiles — most PSPs map to either baseline or restricted.
3. Label namespaces in warn mode and deploy the new PSA configuration alongside existing PSPs.
4. Fix violations reported by the warn and audit modes.
5. Remove PSPs and enforce PSA.
Policy Enforcement with OPA/Gatekeeper and Kyverno
Pod Security Admission enforces pod-level controls, but production clusters often need custom organizational policies: “all Deployments must declare resource limits”, “container images must come from our internal registry”, “every namespace must carry a team label”, or “no Service of type LoadBalancer in staging”. Policy engines let you encode these rules as code — giving you version-controlled, reviewable, and testable security policies that live alongside your application manifests.
The broader category of tooling here is called Policy as Code, and it applies the same engineering discipline to security rules that good development teams apply to application code: policies are committed to source control, reviewed in pull requests, tested against sample inputs, and rolled out systematically to clusters. This approach means that policy drift — where the live cluster state diverges from the intended security posture — becomes visible and reversible.
Both Gatekeeper and Kyverno integrate with the Kubernetes admission webhook mechanism. When a resource is submitted to the API server, the server forwards a copy to the admission webhook, which evaluates it against all active policies and returns either an admit or deny decision. This happens synchronously before the resource is persisted in etcd, so violations are blocked at the source rather than detected after the fact.
Choosing between Gatekeeper and Kyverno involves trade-offs. Rego (Gatekeeper’s policy language) is a powerful general-purpose logic language that can express arbitrarily complex rules, but it has a steep learning curve. Kyverno’s YAML-based approach maps closely to the Kubernetes resource model and is approachable for developers who already understand Kubernetes manifests. For many teams, Kyverno is the faster path to a functioning policy library, while Gatekeeper is preferred in environments that already have Rego expertise or need to share policy definitions with other OPA deployments (such as API gateways or CI pipelines).
OPA Gatekeeper
Gatekeeper uses ConstraintTemplate (a Rego policy) and Constraint (a policy instance with parameters). Install it with:
kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper/release-3.14/deploy/gatekeeper.yaml
Example: Require resource limits on all containers
# 1 — The ConstraintTemplate defines the Rego logic
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredresources
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredResources
  targets:
  - target: admission.k8s.gatekeeper.sh
    rego: |
      package k8srequiredresources
      # Workload objects (Deployment/StatefulSet/DaemonSet) nest the pod spec
      # under spec.template.spec
      violation[{"msg": msg}] {
        container := input.review.object.spec.template.spec.containers[_]
        not container.resources.limits.cpu
        msg := sprintf("Container '%v' is missing a CPU limit", [container.name])
      }
      violation[{"msg": msg}] {
        container := input.review.object.spec.template.spec.containers[_]
        not container.resources.limits.memory
        msg := sprintf("Container '%v' is missing a memory limit", [container.name])
      }
---
# 2 — The Constraint activates the template for specific namespaces
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredResources
metadata:
  name: require-resource-limits
spec:
  match:
    kinds:
    - apiGroups: ['apps']
      kinds: ['Deployment', 'StatefulSet', 'DaemonSet']
    namespaces:
    - production
    - staging
Kyverno: YAML-Native Policies
Kyverno is a CNCF alternative to Gatekeeper that uses plain Kubernetes YAML — no Rego required — making it accessible to teams without OPA experience:
# Block pods that use the 'latest' image tag
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag
spec:
  validationFailureAction: Enforce
  rules:
  - name: require-explicit-tag
    match:
      any:
      - resources:
          kinds: ['Pod']
    validate:
      message: "Using ':latest' tag is not allowed. Pin to a specific version."
      pattern:
        spec:
          containers:
          - image: '!*:latest'
---
# Mutate: automatically add a label to every new namespace
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-security-label
spec:
  rules:
  - name: add-team-label
    match:
      any:
      - resources:
          kinds: ['Namespace']
    mutate:
      patchStrategicMerge:
        metadata:
          labels:
            +(security-reviewed): 'false'
OPA Gatekeeper vs. Kyverno Comparison
| Feature | OPA Gatekeeper | Kyverno |
|---|---|---|
| Policy language | Rego | YAML / JMESPath |
| Learning curve | High | Low |
| Validate policies | ✅ | ✅ |
| Mutate policies | Via separate webhook | ✅ Built-in |
| Generate resources (e.g., default NetworkPolicy) | ❌ | ✅ |
| Image verification (Sigstore/Cosign) | Via external tooling | ✅ Built-in |
| Audit mode | ✅ | ✅ |
| Policy reporting (PolicyReport CRD) | ✅ | ✅ |
| CNCF status | Graduated | Incubating |
Secrets Management in Kubernetes
Native Kubernetes Secrets are base64-encoded, not encrypted — they sit as plaintext in etcd unless you explicitly enable encryption at rest. A user with etcd access, or with list permission on Secrets, can read every secret in the cluster. This is a frequent surprise for teams who assume that Kubernetes Secrets provide confidentiality by default — the name “Secret” does not imply encryption, only a marginally more restricted access path than a ConfigMap.
The practical implication is that Kubernetes Secrets alone are not a secrets management solution. They are a runtime injection mechanism — a way to get a value into a container without baking it into the image — but they do not provide the lifecycle management, rotation, auditing, or access controls that production secrets require. To manage secrets properly, you need to address three concerns: how secrets are stored at rest, how they are accessed at runtime, and how they are rotated when they are compromised or expire.
For storage at rest, enabling encryption at the API server level is the baseline step — it ensures that anyone who gains direct access to the etcd data files cannot read Secret values. Going further, some organizations use envelope encryption with a KMS provider (AWS KMS, GCP KMS, Azure Key Vault), where the API server holds only an encrypted data encryption key and calls out to the KMS for the actual decryption — so even an API server compromise does not give access to secrets without also compromising the KMS.
For runtime access, the preferred modern approach is to keep secrets out of Kubernetes entirely and pull them from a dedicated secrets backend at pod startup. The External Secrets Operator and the CSI Secrets Store Driver are the two predominant patterns: ESO syncs secrets into short-lived Kubernetes Secret objects that it manages and rotates, while the CSI driver mounts secrets directly as files in the pod’s filesystem without ever creating a Kubernetes Secret object.
For rotation, the key is to avoid long-lived static credentials wherever possible. Database passwords, API keys, and certificates should have short TTLs and be rotated automatically — tools like Vault’s dynamic secrets generate unique, short-lived credentials per workload request, so there are no static passwords to exfiltrate and rotate.
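As a sketch of what dynamic secrets look like in practice, Vault's database secrets engine can be configured to mint per-request PostgreSQL credentials with a short TTL. The names (`appdb`, `backend-app`) and connection details below are illustrative, not prescriptive:

```shell
# Enable the database secrets engine and register the target database
vault secrets enable database
vault write database/config/appdb \
    plugin_name=postgresql-database-plugin \
    connection_url="postgresql://{{username}}:{{password}}@db.internal:5432/appdb" \
    allowed_roles="backend-app" \
    username="vault-admin" \
    password="rotate-me-immediately"

# Define a role: each read of database/creds/backend-app creates a fresh
# one-hour database user, which Vault revokes automatically at expiry
vault write database/roles/backend-app \
    db_name=appdb \
    creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}';" \
    default_ttl=1h \
    max_ttl=4h
```

A workload then reads `database/creds/backend-app` at startup and receives credentials that expire on their own; there is no static password to leak or rotate.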
Enable Encryption at Rest
Configure the API server to encrypt Secrets (and optionally other resource types) before they are written to etcd:
# /etc/kubernetes/encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
- secrets
providers:
- aescbc:
keys:
- name: key1
secret: <base64-encoded-32-byte-randomly-generated-key>
- identity: {} # fallback: allows reading unencrypted existing data
Add the flag to the API server manifest:
--encryption-provider-config=/etc/kubernetes/encryption-config.yaml
After enabling, force all existing Secrets through the encryption provider:
kubectl get secrets --all-namespaces -o json | kubectl replace -f -
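To move from local aescbc keys to the envelope encryption described earlier, the provider list can reference a KMS plugin instead. This is a sketch assuming a KMS v2 plugin is already deployed and listening on the socket path shown; the provider name is illustrative:

```yaml
# encryption-config.yaml variant using envelope encryption via a KMS plugin
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - kms:
          apiVersion: v2
          name: cloud-kms-provider          # illustrative name
          endpoint: unix:///var/run/kms-plugin/socket.sock
          timeout: 3s
      - identity: {}   # fallback for reading data written before KMS was enabled
```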
External Secrets Operator (ESO)
The preferred pattern for production is to not store sensitive values in Kubernetes at all — instead pull them from a dedicated secrets backend. The External Secrets Operator synchronizes values from AWS Secrets Manager, GCP Secret Manager, Azure Key Vault, or HashiCorp Vault into short-lived Kubernetes Secrets:
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
name: db-credentials
namespace: production
spec:
refreshInterval: 1h
secretStoreRef:
name: aws-secretsmanager-store
kind: ClusterSecretStore
target:
name: db-credentials-k8s
creationPolicy: Owner
data:
- secretKey: DB_PASSWORD
remoteRef:
key: production/database
property: password
- secretKey: DB_USERNAME
remoteRef:
key: production/database
property: username
When the upstream secret changes, ESO automatically reconciles the Kubernetes Secret within the refreshInterval.
HashiCorp Vault Agent Injector
For teams running Vault, the Vault Agent Injector sidecar injects secrets as files or environment variables at pod startup — the secrets are never persisted as Kubernetes Secret objects at all:
spec:
template:
metadata:
annotations:
vault.hashicorp.com/agent-inject: 'true'
vault.hashicorp.com/role: 'backend-app'
vault.hashicorp.com/agent-inject-secret-db-creds: 'secret/data/production/db'
vault.hashicorp.com/agent-inject-template-db-creds: |
{{- with secret "secret/data/production/db" -}}
DB_USER="{{ .Data.data.username }}"
DB_PASS="{{ .Data.data.password }}"
{{- end }}
Secrets Management Approach Comparison
| Approach | Encrypted at Rest | Dynamic Rotation | Native to K8s | Complexity |
|---|---|---|---|---|
| Plain Kubernetes Secrets | Only if configured | ❌ | ✅ | Low |
| Bitnami Sealed Secrets | ✅ (asymmetric) | ❌ | ✅ | Low–Medium |
| External Secrets Operator | Source-dependent | ✅ | ✅ (syncs to K8s secret) | Medium |
| Vault Agent Injector | ✅ (never hits K8s) | ✅ | ❌ (sidecar) | High |
| Secrets Store CSI Driver | ✅ | ✅ | ✅ (mounted as volume) | Medium |
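For completeness, the CSI driver row in the table corresponds to a pattern like the following. This is a sketch assuming the AWS provider plugin is installed; the class and object names are illustrative:

```yaml
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: app-secrets
  namespace: production
spec:
  provider: aws
  parameters:
    objects: |
      - objectName: "production/database"
        objectType: "secretsmanager"
```

The pod then mounts a CSI volume with `driver: secrets-store.csi.k8s.io` and `secretProviderClass: app-secrets`; the secret material appears as files in the container and never becomes a Kubernetes Secret object unless syncing is explicitly enabled.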
Supply Chain Security and Image Verification
The Kubernetes supply chain is a prime attack vector. A single vulnerable base image or a compromised build step can affect every workload in your cluster. Supply chain security spans image scanning, signing, and enforcement at admission time.
Supply chain attacks have moved squarely into the mainstream threat landscape. The SolarWinds compromise in 2020 and the 3CX supply chain attack in 2023 demonstrated that attackers are increasingly targeting the build and distribution pipeline rather than the deployed application. In a Kubernetes context, this means that the container image you deploy may carry malicious code injected during the build process, or may pull a compromised base image from a public registry that had a dependency poisoned upstream.
The SLSA (Supply-chain Levels for Software Artifacts) framework, originally developed at Google and now maintained under the OpenSSF, provides a maturity model for supply chain security. At the lowest levels, you have basic version control and build integrity. At the highest levels, every artifact has a cryptographically verifiable provenance record tracing it back to a specific source commit, built by a specific verified builder, with a tamper-evident audit log of the entire pipeline. Kubernetes tooling like Cosign and the policy engines described above are the practical mechanisms for implementing SLSA controls at the deployment gate.
The practical starting point for most teams is a two-step approach: scan at build time to catch known vulnerabilities before they reach production, and enforce allowed registries and image signatures at admission time to ensure that only images that passed your quality gates can actually run in the cluster. Neither control alone is sufficient — scanning without enforcement means developers can bypass it, and enforcement without scanning means you know where the image came from but not whether it is safe.
Image Vulnerability Scanning in CI/CD
Scan images for known CVEs (Common Vulnerabilities and Exposures) before they ever reach production. Run the scan as a required step that breaks the build on critical findings:
# GitHub Actions: scan image with Trivy
- name: Build image
run: docker build -t myregistry.io/myapp:${{ github.sha }} .
- name: Scan with Trivy
uses: aquasecurity/trivy-action@master
with:
image-ref: myregistry.io/myapp:${{ github.sha }}
format: sarif
output: trivy-results.sarif
severity: CRITICAL,HIGH
exit-code: '1' # fail the pipeline on critical/high CVEs
- name: Upload SARIF results
uses: github/codeql-action/upload-sarif@v3
with:
sarif_file: trivy-results.sarif
Image Signing with Cosign (Sigstore)
Cosign lets you cryptographically sign container images and verify those signatures before deployment:
# In CI: sign the image after pushing to the registry
cosign sign --key cosign.key myregistry.io/myapp:1.2.3
# In a verification step: confirm authenticity
cosign verify --key cosign.pub myregistry.io/myapp:1.2.3
For keyless signing (using Fulcio — Sigstore’s certificate authority), CI pipelines can sign with their OIDC token, producing a short-lived certificate tied to the pipeline identity.
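A keyless flow in a CI job might look like the following sketch. The identity regexp and issuer are assumptions to adapt to your pipeline; in recent Cosign releases, keyless mode is the default when no key is supplied:

```shell
# Sign using the CI job's OIDC identity (no long-lived key material)
cosign sign myregistry.io/myapp:1.2.3

# Verify by pinning the expected CI identity and OIDC issuer
cosign verify \
  --certificate-identity-regexp 'https://github.com/my-org/my-repo/\.github/workflows/.*' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  myregistry.io/myapp:1.2.3
```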
Enforcing Image Signatures with Kyverno
Kyverno’s built-in image verification blocks unsigned images at admission time:
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: verify-image-signatures
spec:
validationFailureAction: Enforce
rules:
- name: check-signature
match:
any:
- resources:
kinds: ['Pod']
verifyImages:
- imageReferences:
- 'myregistry.io/myapp:*'
attestors:
- entries:
- keys:
publicKeys: |-
-----BEGIN PUBLIC KEY-----
<your Cosign public key here>
-----END PUBLIC KEY-----
Restricting Image Registries
Prevent pods from pulling images from untrusted registries by enforcing an allowlist:
# Gatekeeper ConstraintTemplate: only allow images from approved registries
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: allowedregistries
spec:
  crd:
    spec:
      names:
        kind: AllowedRegistries
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package allowedregistries
        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          not startswith(container.image, "myregistry.io/")
          not startswith(container.image, "gcr.io/my-project/")
          msg := sprintf("Image '%v' is not from an approved registry", [container.image])
        }
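The template on its own does nothing until a Constraint selects resources for it. Assuming the template's CRD kind is named AllowedRegistries (an illustrative choice), a matching Constraint looks like:

```yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: AllowedRegistries
metadata:
  name: approved-registries-only
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
```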
Testing and Auditing Kubernetes Security
A documented security policy is only as strong as your ability to verify and continuously validate it. The discipline of continuous security validation — sometimes called “security assurance” — treats your cluster’s security posture as something to be tested as rigorously as application functionality. Just as automated tests catch regressions in your application code, automated compliance checks catch regressions in your security posture before they become vulnerabilities.
The controls described throughout this guide fall into three temporal categories, each with different testing approaches. Pre-deployment controls — image scanning, manifest linting, policy-as-code validation — are tested in CI/CD pipelines before anything reaches the cluster. Admission-time controls — PSA, Gatekeeper, Kyverno — are validated by attempting to deploy non-compliant resources and confirming they are rejected. Runtime monitoring controls — Falco, audit logging, anomaly detection — are validated through red team exercises, chaos engineering, or by deliberately triggering known-bad behaviors and confirming detection.
Combining all three layers gives you defense in depth: pre-deployment catches the bulk of issues cheaply (failing a CI job is far cheaper than responding to an incident), admission-time acts as a final safety net against policy drift, and runtime monitoring handles the cases that slipped through or represent novel attack patterns that your static policies didn’t anticipate.
kube-bench: CIS Benchmark Compliance
kube-bench runs checks against CIS Kubernetes Benchmark controls — a widely accepted baseline covering control-plane, node, and policy configuration:
# Deploy as a Kubernetes Job on each node type
kubectl apply -f https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job.yaml
# View results
kubectl logs -l app=kube-bench --tail=100
Example output:
[PASS] 1.2.9 Ensure that the --authorization-mode argument includes RBAC
[FAIL] 1.2.6 Ensure that the --kubelet-certificate-authority argument is set
[WARN] 4.2.3 Ensure that the --client-ca-file argument is set as appropriate
[INFO] 5.1.1 Ensure that the cluster-admin role is only used where required
Integrate kube-bench into your CI pipeline using its JSON output mode for automated pass/fail gating.
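A minimal CI gate might look like this sketch. The `Totals.total_fail` field name is an assumption about kube-bench's JSON schema, so verify it against the version you run:

```shell
# Run the benchmark and fail the job if any check failed
kube-bench run --json > kube-bench.json
fails=$(jq -r '.Totals.total_fail' kube-bench.json)
if [ "$fails" -gt 0 ]; then
  echo "kube-bench reported $fails failing checks" >&2
  exit 1
fi
```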
kubesec: Manifest Scoring
kubesec scores YAML manifests against a security best-practice ruleset before you even apply them to the cluster — ideal as a pre-commit check or CI gate:
kubesec scan deployment.yaml
# JSON output excerpt
{
"score": 4,
"scoring": {
"passed": [
{ "id": "ReadOnlyRootFilesystem", "points": 1 }
],
"advise": [
{
"id": "CapDropAny",
"points": 1,
"reason": "Drop all capabilities and add only those required to reduce syscall attack surface"
},
{
"id": "SeccompAny",
"points": 1,
"reason": "Seccomp profiles restrict the system calls available to a container"
}
]
}
}
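As a CI gate, the score can be thresholded. This sketch assumes kubesec's JSON output is an array with a top-level `score` field per scanned resource; the minimum score is an illustrative team choice:

```shell
# Fail the pipeline when the manifest scores below an agreed minimum
score=$(kubesec scan deployment.yaml | jq '.[0].score')
min_score=5   # illustrative threshold; tune per team
if [ "$score" -lt "$min_score" ]; then
  echo "kubesec score $score is below minimum $min_score" >&2
  exit 1
fi
```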
Configuring API Server Audit Logging
Kubernetes audit logs capture every API request with the requesting user, resource, and action. Configure a granular policy to capture the events that matter for security:
# audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Full request/response bodies for Secret operations (detect exfiltration)
- level: RequestResponse
resources:
- group: ''
resources: ['secrets']
# Full request for pod creation/deletion (detect rogue workloads)
- level: Request
resources:
- group: ''
resources: ['pods']
verbs: ['create', 'delete', 'patch', 'update']
# Metadata only for read-heavy paths (reduce log volume)
- level: Metadata
verbs: ['get', 'list', 'watch']
# Don't log routine healthcheck paths
- level: None
users: ['system:kube-proxy']
verbs: ['watch']
resources:
- group: ''
resources: ['endpoints', 'services']
Reference the policy from the API server:
--audit-log-path=/var/log/kubernetes/audit.log
--audit-policy-file=/etc/kubernetes/audit-policy.yaml
--audit-log-maxage=30
--audit-log-maxbackup=10
--audit-log-maxsize=100
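Once the log exists, answering "who touched Secrets?" is a one-liner over the JSON-lines file. The field paths below follow the audit.k8s.io/v1 Event schema:

```shell
# List every user who read a Secret, with namespace and name
jq -r 'select(.objectRef.resource == "secrets" and .verb == "get")
       | "\(.user.username) \(.verb) \(.objectRef.namespace)/\(.objectRef.name)"' \
  /var/log/kubernetes/audit.log
```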
Falco: Runtime Threat Detection
Falco monitors system calls in real time and alerts on behavioral anomalies — a shell spawned inside a pod, a sensitive file read, a binary executed from a writable path:
# Custom Falco rule
- rule: Shell spawned in container
desc: A shell was started inside a running container
condition: >
spawned_process and container
and shell_procs
and not container.image.repository in (trusted_shell_images)
output: >
Shell spawned in container (user=%user.name container=%container.name
image=%container.image.repository shell=%proc.name cmdline=%proc.cmdline)
priority: WARNING
tags: [container, shell]
Deploy Falco as a DaemonSet and forward its JSON output to your SIEM (Splunk, Datadog, Elasticsearch) for correlation and alerting.
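A typical installation uses the official Helm chart; this sketch enables JSON output so events can be shipped to a SIEM (the value shown is a standard option in the falcosecurity chart):

```shell
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update
helm install falco falcosecurity/falco \
  --namespace falco --create-namespace \
  --set falco.json_output=true
```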
End-to-End Security Pipeline
A mature Kubernetes security pipeline applies controls at every stage:
flowchart LR
A[Code Commit] --> B[SAST Scan]
B --> C[Build Image]
C --> D[Trivy CVE Scan]
D --> E[Cosign Sign]
E --> F[Push to Registry]
F --> G[kubesec Manifest Scan]
G --> H[Deploy to Staging]
H --> I[PSA + Kyverno Admission]
I --> J[Deploy to Production]
J --> K[Falco Runtime Monitoring]
K --> L[kube-bench Periodic Audit]
style D fill:#ff9999,color:#000
style I fill:#ff9999,color:#000
style K fill:#ff9999,color:#000
Common Mistakes and Anti-Patterns
Even experienced teams fall into predictable security traps with Kubernetes. What makes these patterns particularly dangerous is that they often work perfectly well from a functionality standpoint — applications run, features ship, tests pass — and the security weakness is only visible when you specifically look for it or, worse, when an attacker finds it first.
Many of these anti-patterns trace back to two root causes: the pressure to make things work quickly and the assumption that the underlying platform provides more security than it actually does. Kubernetes is designed to be flexible and powerful, which means it does not enforce strong defaults. It is your responsibility to explicitly configure every security control you need. The absence of a NetworkPolicy is not a neutral state — it is an active decision to allow all traffic. Running without PSA enforcement is not just “not configured yet” — it is an environment where any developer who can create pods can potentially escalate to host access.
The good news is that most of these anti-patterns are straightforward to fix once identified. The harder challenge is building awareness across the entire team so that the pattern is recognized in code review, not just in a security audit. Embedding tools like kubesec and Kyverno into the development workflow — where violations appear in the same pull request as the problematic code — is the most effective way to shift the feedback loop left.
Anti-Pattern 1: Running Containers as Root
The most widespread mistake: container images that default to root (UID 0). A process running as root inside the container is, in the common case of no user-namespace remapping, also root from the host kernel's perspective, so a container escape lands the attacker as root on the node:
# BAD: implicit root user
FROM ubuntu:22.04
COPY app /app
CMD ["/app"]
# GOOD: explicitly create and switch to a non-root user
FROM ubuntu:22.04
RUN groupadd -r appgroup && useradd --no-log-init -r -g appgroup appuser
COPY --chown=appuser:appgroup app /app
USER appuser
CMD ["/app"]
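The Dockerfile fix is best paired with an explicit runtime contract in the pod spec, so the cluster rejects the workload even if someone later reverts the image to root. A minimal container-level sketch (the UID is illustrative):

```yaml
securityContext:
  runAsNonRoot: true        # kubelet refuses to start the container as UID 0
  runAsUser: 10001          # any dedicated non-zero UID
  allowPrivilegeEscalation: false
```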
Anti-Pattern 2: Wildcard Permissions in RBAC
Granting * on verbs or resources is a shortcut that makes the entire RBAC model meaningless:
# BAD: omnipotent service account
rules:
- apiGroups: ["*"]
resources: ["*"]
verbs: ["*"]
# GOOD: only the verbs and resources actually needed
rules:
- apiGroups: ["apps"]
resources: ["deployments"]
verbs: ["get", "list"]
- apiGroups: [""]
resources: ["pods/log"]
verbs: ["get"]
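After tightening a role, verify the effective permissions rather than trusting the YAML. The service account name below is illustrative:

```shell
# Should print "yes": the access the workload actually needs
kubectl auth can-i list deployments --as=system:serviceaccount:production:backend-sa
# Should print "no": confirms the wildcard grant is gone
kubectl auth can-i '*' '*' --as=system:serviceaccount:production:backend-sa
```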
Anti-Pattern 3: Sensitive Values in ConfigMaps
ConfigMaps are not encrypted and are visible to anyone who can read the namespace. Never put passwords, tokens, or private keys there:
# BAD: password in a ConfigMap
apiVersion: v1
kind: ConfigMap
data:
DB_PASSWORD: "hunter2"
# GOOD: use a Secret (ideally backed by an external secrets manager)
apiVersion: v1
kind: Secret
type: Opaque
stringData:
DB_PASSWORD: "hunter2"
Anti-Pattern 4: No Resource Limits
Containers without resource limits can monopolize CPU and memory, causing a denial of service for co-located workloads, and they hand attackers free compute for cryptomining or brute-forcing:
# BAD
containers:
- name: app
image: myapp:1.0
# GOOD: always specify both requests and limits
containers:
- name: app
image: myapp:1.0
resources:
requests:
cpu: "100m"
memory: "64Mi"
limits:
cpu: "500m"
memory: "256Mi"
Anti-Pattern 5: Using the latest Image Tag
The `latest` tag is mutable — a new image pushed over `latest` silently changes runtime behavior with no audit trail and no reproducibility:
# BAD: mutable, non-reproducible
image: myapp:latest
# GOOD: pin to a specific semantic version
image: myapp:1.4.2
# BEST: pin to an immutable digest
image: myapp@sha256:e3b0c44298fc1c149afbf4c8996fb924...
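Resolving a tag to its immutable digest is a one-step operation with standard registry tooling, so pinning does not have to be manual. Either of these prints the digest to pin (crane is part of go-containerregistry):

```shell
# With crane (from go-containerregistry)
crane digest myregistry.io/myapp:1.4.2
# Or with Docker's buildx tooling
docker buildx imagetools inspect myregistry.io/myapp:1.4.2
```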
Anti-Pattern 6: Overly Broad NetworkPolicy Selectors
A common shortcut that opens more than intended:
# BAD: allows all pods in the namespace to talk to the database
ingress:
- from:
- podSelector: {} # matches every pod
# GOOD: only the api-server pod
ingress:
- from:
- podSelector:
matchLabels:
app: api-server
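A targeted allow rule like the one above only means something on top of a default-deny baseline; without one, traffic not matched by any policy is still permitted. A namespace-wide default deny is short:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}          # selects every pod in the namespace
  policyTypes: ["Ingress", "Egress"]
```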
Anti-Pattern 7: Ignoring PodDisruptionBudgets
Availability is a security property just as much as confidentiality. Without a PodDisruptionBudget, a node drain or rolling update can evict all replicas simultaneously, creating an outage window that attackers can exploit:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: backend-pdb
namespace: production
spec:
minAvailable: 2
selector:
matchLabels:
app: backend
Anti-Pattern 8: Skipping Namespace Isolation for Multi-Tenant Clusters
Namespaces are not a security boundary by themselves — without NetworkPolicies, PSA enforcement, and resource quotas, tenants can interfere with each other. Always combine namespaces with:
# Enforce restricted PSA on every tenant namespace
kubectl label namespace tenant-a pod-security.kubernetes.io/enforce=restricted
# Apply resource quotas to prevent noisy-neighbor exhaustion
kubectl apply -f - <<EOF
apiVersion: v1
kind: ResourceQuota
metadata:
name: tenant-quota
namespace: tenant-a
spec:
hard:
pods: "20"
requests.cpu: "4"
requests.memory: "8Gi"
limits.cpu: "8"
limits.memory: "16Gi"
EOF
Multi-Tenancy and Namespace Isolation
When multiple teams, applications, or customers share a single Kubernetes cluster, namespace isolation becomes critical. Kubernetes namespaces provide logical grouping and access scoping, but they are not a hard isolation boundary like separate virtual machines. A pod in namespace team-a can, by default, reach pods in namespace team-b over the network; a service account with workload creation privileges in one namespace can indirectly access ConfigMaps and Secrets in that same namespace by injecting them into pod specs.
True multi-tenancy in Kubernetes requires combining several controls simultaneously: RBAC to limit which API resources each team can touch, NetworkPolicies to enforce network isolation between namespaces, Pod Security Admission to prevent workloads from escalating out of their namespace, and ResourceQuotas to prevent one tenant from starving others of compute.
The LimitRange resource complements ResourceQuota by enforcing per-pod and per-container defaults — so even if a developer forgets to specify resource requests, the cluster applies sane defaults automatically:
apiVersion: v1
kind: LimitRange
metadata:
name: default-limits
namespace: team-a
spec:
limits:
- type: Container
default:
cpu: '200m'
memory: '128Mi'
defaultRequest:
cpu: '100m'
memory: '64Mi'
max:
cpu: '2'
memory: '2Gi'
For stronger isolation requirements — for example, when running workloads from different customers — consider virtual clusters (using tools like vcluster) or separate physical clusters. Virtual clusters run a full Kubernetes control plane as a workload inside the host cluster, giving each tenant a complete, isolated API server experience with their own admission policies and RBAC, while sharing the underlying node pool.
Namespace-Level Security Checklist
When provisioning a new namespace for a team, apply this standard set of controls consistently:
- PSA labels — Set `enforce`, `audit`, and `warn` to the appropriate profile for the workload.
- Default deny-all NetworkPolicy — Block all traffic by default; add allow rules as needed.
- ResourceQuota — Cap total CPU, memory, and object counts to prevent runaway resource consumption.
- LimitRange — Set default resource requests and limits so pods without explicit limits still behave predictably.
- RBAC bindings — Bind team members to `edit` (not `admin`) by default; require justification for broader access.
- Service account — Create a dedicated service account with `automountServiceAccountToken: false` for each workload.
Automating this provisioning through a Kubernetes operator or a GitOps workflow (ArgoCD ApplicationSet, Flux) ensures that every namespace starts with the same security baseline and that the baseline itself is version-controlled and auditable.
Kubernetes Security Maturity Model
Organizations typically adopt Kubernetes security controls in stages. Trying to implement everything at once often leads to either friction that blocks development velocity or half-implemented controls that provide false confidence. A maturity model helps teams identify where they are and what to prioritize next.
Level 0 — Default Configuration: Clusters are deployed with minimal changes to defaults. Anonymous access may be enabled. No NetworkPolicies. No PSA enforcement. Secrets stored in ConfigMaps or environment variables. No image scanning.
Level 1 — Basic Hardening: Anonymous API access is disabled. Authentication is configured (certificates or OIDC). Basic RBAC roles are defined. Image scanning is integrated into CI. Secrets are stored as Kubernetes Secret objects with encryption at rest.
Level 2 — Policy Enforcement: Pod Security Admission is enforced on production namespaces (at least baseline). Default deny NetworkPolicies are applied to all application namespaces. Resource limits are required on all containers via a policy engine. Audit logging is enabled and forwarded to a SIEM.
Level 3 — Defense in Depth: PSA restricted is enforced on security-critical namespaces. Image signatures are verified at admission time. Runtime threat detection (Falco) is deployed and alerting is configured. Secrets are managed via an external secrets backend with automatic rotation. Regular kube-bench audits are automated and results tracked.
Level 4 — Continuous Assurance: Security posture is continuously validated through automated compliance checks. Policy drift triggers automated remediation or alerts. Supply chain provenance (SLSA) is verified for all deployed artifacts. Security controls are tested as part of the same pipeline as application functionality. Incident response runbooks are documented and regularly exercised.
Most teams should target Level 2 as a baseline before going to production, and work toward Level 3 for their most critical workloads. Level 4 represents a mature security engineering investment appropriate for regulated industries or high-value targets.
Conclusion
Kubernetes security is not a checkbox you complete once — it is an ongoing engineering discipline that spans your entire software delivery lifecycle. From the moment a developer writes a Dockerfile to the moment a container is running in production and handling user traffic, there are dozens of security decisions being made, often implicitly. The practices described in this guide make those decisions explicit and defensible.
Starting with the fundamentals — RBAC least-privilege, NetworkPolicy segmentation, Pod Security Standards, and Secrets encryption — gives you a solid foundation. Building up through policy-as-code enforcement, supply chain verification, and runtime monitoring gives you the defense-in-depth posture that modern threat actors require. The security maturity model gives you a roadmap for prioritizing that investment based on your current state and your risk profile.
The most important mindset shift is to treat security controls the same way you treat tests: they belong in the development workflow, not in a separate security review at the end of the release cycle. When kubesec runs in a pre-commit hook, when Kyverno blocks a misconfigured Deployment in staging before it ever reaches production, when Falco alerts you to an unexpected shell spawned in a container within seconds of it happening — that is when security becomes a property of the system rather than a process bolted on beside it.
Start with the highest-impact controls for your current state (usually RBAC, PSA baseline enforcement, and image scanning), measure your baseline using kube-bench, and build from there. A cluster that improves its security posture consistently over time is more resilient than one that aims for perfection on day one and stalls. Integrating these strategies into your Kubernetes workflows will not only protect your clusters from known threats — it will build the muscle memory across your team to think about security as an inherent part of how you build and operate software.