Admission controllers act as Kubernetes’ built-in gatekeepers that intercept API requests after authentication/authorization but before they're persisted to etcd. They can validate or mutate incoming objects, ensuring everything that enters your cluster meets defined policies. We strengthen this mechanism with OPA Gatekeeper (policy-as-code, integrated with Azure Policy on AKS), Kyverno (YAML-based policy engine), and custom admission webhooks that uphold Zero Trust rules.
By implementing admission controls, security policies become automated and proactive. Every deployment or change is evaluated in real time against your rules, preventing misconfigurations or risky settings from ever reaching the cluster. This dynamic enforcement greatly reduces the chance of human error opening a security gap. (Refer to Admission Control in Kubernetes for more details.)
Embracing Zero-Trust Principles in Kubernetes
In our security strategy, “Never trust, always verify” is a guiding philosophy. In a Kubernetes context, adopting a Zero-Trust model means no component or request is inherently trusted, even if already inside the cluster perimeter. Every action must be authenticated, authorized, and within policy.
Here are a few Zero-Trust enforcement rules for Kubernetes:
- Enforce Least-Privilege Access
Grant only the minimum required permissions using Kubernetes RBAC. Give every workload its own ServiceAccount with only the permissions it needs, and avoid cluster-admin roles.
- Restrict to Trusted Container Images
Permit images only from approved internal registries or signed sources. Block unverified images from public hubs using admission controllers or Azure Policy.
- Deny Privileged Containers and Host Access
Prevent pods from running in privileged mode or mounting sensitive host paths such as /etc or /var/run/docker.sock.
- Default-Deny Network Policies
Apply a default deny-all ingress/egress posture per namespace and allow traffic only where explicitly required. This eliminates lateral movement.
- Enable Mutual TLS (mTLS) for Pod Communication
Use a service mesh (Istio/Linkerd) to enforce encrypted and authenticated workload communication.
- Continuous Policy Auditing and Drift Detection
Run admission controllers like OPA Gatekeeper or Kyverno in audit mode to detect policy violations in existing resources.
- Enforce Runtime Security Controls
Integrate tools like Falco or Azure Defender for Kubernetes to monitor runtime behavior and detect anomalies such as unexpected system calls or privilege escalations.
- Secure API Server Access
Restrict access to the Kubernetes API server using IP allowlisting, Azure AD integration, and role-based access.
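The "Deny Privileged Containers and Host Access" rule above can be expressed as a small pure-Python check on a Pod manifest dict. This is a minimal sketch: the function name and the list of sensitive paths are illustrative, not exhaustive.

```python
# Sketch: check a Pod manifest dict for privileged containers and
# sensitive hostPath mounts. SENSITIVE_HOST_PATHS is illustrative only.
SENSITIVE_HOST_PATHS = ("/etc", "/var/run/docker.sock", "/proc", "/sys")

def pod_violations(pod: dict) -> list[str]:
    violations = []
    spec = pod.get("spec", {})
    for c in spec.get("containers", []):
        sc = c.get("securityContext") or {}
        if sc.get("privileged"):
            violations.append(f"Container '{c.get('name')}' runs privileged.")
    for vol in spec.get("volumes", []):
        path = (vol.get("hostPath") or {}).get("path", "")
        if any(path == p or path.startswith(p + "/") for p in SENSITIVE_HOST_PATHS):
            violations.append(f"Volume '{vol.get('name')}' mounts sensitive host path {path}.")
    return violations

bad_pod = {"spec": {
    "containers": [{"name": "app", "securityContext": {"privileged": True}}],
    "volumes": [{"name": "docker", "hostPath": {"path": "/var/run/docker.sock"}}]}}
print(pod_violations(bad_pod))
```

The same check can run inside an admission webhook or as an audit script over existing workloads.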
By enforcing these Zero-Trust controls, the attack surface is drastically reduced. Even if an attacker gains initial access, layered guardrails prevent privilege escalation and block any lateral movement within the cluster.
The following sample scenario demonstrates how a custom admission controller can apply Zero-Trust rules to Pods.
In this example, the webhook enforces:
- Images must originate from testtech.azurecr.io
- Pod must include the label environment
Implementation Steps
Refer to the sample code here: Kubernetes Custom Admission Controller
Step 1 — Build the Flask-based webhook
webhook.py processes AdmissionReview requests, evaluates the Pod spec against security rules, and returns the admission decision (allow/deny).
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/validate", methods=["POST"])
def validate():
    request_info = request.get_json()
    uid = request_info["request"]["uid"]
    pod = request_info["request"]["object"]
    violations = []

    # --- Rule 1: Allow only images from trusted registries ---
    trusted_registries = ["testtech.azurecr.io"]
    for container in pod.get("spec", {}).get("containers", []):
        image = container.get("image", "")
        if not any(image.startswith(reg) for reg in trusted_registries):
            violations.append(f"Image {image} not from trusted registry.")

    # --- Rule 2: Require 'environment' label ---
    labels = pod.get("metadata", {}).get("labels", {})
    if "environment" not in labels:
        violations.append("Pod missing required label: environment")

    # Return the AdmissionReview decision (allow only when no violations)
    response = {"uid": uid, "allowed": not violations}
    if violations:
        response["status"] = {"code": 403, "message": "; ".join(violations)}
    return jsonify({"apiVersion": "admission.k8s.io/v1",
                    "kind": "AdmissionReview",
                    "response": response})
This blocks Pods that pull images from public registries such as docker.io and ensures Pods are deployed with the required labels.
Step 2 — Create and Mount TLS Certificates
The Kubernetes API server communicates with webhooks over HTTPS only. We can generate certificates as self-signed or via cert-manager, but the key point is that the certificate must include the Kubernetes Service DNS name as a SAN (Subject Alternative Name):
openssl req -x509 -newkey rsa:4096 -sha256 -days 365 -nodes -keyout tls.key -out tls.crt -subj "/CN=ztac-webhook.ztac-system.svc" -addext "subjectAltName = DNS:ztac-webhook.ztac-system.svc"
Then we store the cert in a Kubernetes secret:
kubectl create secret tls ztac-tls --cert=tls.crt --key=tls.key -n ztac-system
Step 3 — Deploy Webhook + Service
The Deployment (refer to deployment.yaml and service.yaml in the sample code) runs the Docker image, mounts the TLS certificates from the ztac-tls Secret, and exposes port 8443. A ClusterIP Service exposes the webhook inside the cluster.
kubectl apply -f manifests/deployment.yaml
kubectl apply -f manifests/service.yaml
Step 4 — Register ValidatingWebhookConfiguration
This tells the Kubernetes API server to call your webhook for every Pod request (refer to validatingwebhook.yaml). The caBundle field ensures the API server trusts your webhook's TLS certificate.
webhooks:
  - name: ztac.security.example.com
    clientConfig:
      service:
        name: ztac-webhook
        namespace: ztac-system
        path: /validate
      caBundle: <CA-BUNDLE-HERE> # Base64-encoded CA cert
    admissionReviewVersions: ["v1"]
    sideEffects: None
    timeoutSeconds: 5

kubectl apply -f manifests/validatingwebhook.yaml
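The caBundle value is simply the base64-encoded PEM of the CA that signed tls.crt (with the self-signed certificate from Step 2, tls.crt itself acts as the CA). A minimal sketch for producing it; the function name and file path are illustrative:

```python
# Sketch: compute the base64 value for the caBundle field from a PEM cert.
import base64

def ca_bundle_from_pem(pem_text: str) -> str:
    """Base64-encode a PEM CA certificate for ValidatingWebhookConfiguration."""
    return base64.b64encode(pem_text.encode()).decode()

# e.g. ca_bundle_from_pem(open("tls.crt").read())
print(ca_bundle_from_pem("hi"))
```

An equivalent shell one-liner over the cert file works just as well; the point is that the field holds base64, not raw PEM.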
Step 5 — Test the Webhook
#Case1: In this example, the Pod pulls the image from a trusted registry, but since the required label is missing, the admission webhook rejects the Pod. (See the sample-testing folder.)
apiVersion: v1
kind: Pod
metadata:
  name: pod-allow
  namespace: ztac-system
spec:
  containers:
    - name: nginx
      image: testtech.azurecr.io/nginx:latest
The API server rejects the request and surfaces the webhook's violation message in the error output.
#Case2: Likewise, when a Pod references an image from an untrusted registry, the admission webhook blocks its creation. Refer to pod-deny-image.yaml in the sample folder.
#Case3: The Pod creation is permitted only when it complies with all defined Zero-Trust enforcement rules.
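The three cases can also be sanity-checked locally without a cluster. This sketch mirrors the Step 1 rules; the Pod dicts for Cases 2 and 3 are illustrative stand-ins for the sample manifests:

```python
# Sketch: simulate the webhook's two rules locally against the three cases.
TRUSTED_REGISTRIES = ("testtech.azurecr.io",)

def check_pod(pod: dict) -> list[str]:
    violations = []
    for c in pod.get("spec", {}).get("containers", []):
        image = c.get("image", "")
        if not any(image.startswith(r) for r in TRUSTED_REGISTRIES):
            violations.append(f"Image {image} not from trusted registry.")
    if "environment" not in pod.get("metadata", {}).get("labels", {}):
        violations.append("Pod missing required label: environment")
    return violations

# Case 1: trusted image but missing 'environment' label -> denied
case1 = {"metadata": {"name": "pod-allow"},
         "spec": {"containers": [{"image": "testtech.azurecr.io/nginx:latest"}]}}
# Case 2: image from an untrusted registry -> denied
case2 = {"metadata": {"name": "pod-deny-image", "labels": {"environment": "dev"}},
         "spec": {"containers": [{"image": "docker.io/nginx:latest"}]}}
# Case 3: compliant with both rules -> allowed
case3 = {"metadata": {"name": "pod-ok", "labels": {"environment": "dev"}},
         "spec": {"containers": [{"image": "testtech.azurecr.io/nginx:latest"}]}}

for pod in (case1, case2, case3):
    print(pod["metadata"]["name"], check_pod(pod) or "allowed")
```

Keeping the rule logic in a pure function like this makes the webhook easy to unit-test before deploying it behind TLS.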
Securing Multi-Tenant & Shared Environments (AKS)
In shared AKS clusters, tenant isolation is critical to prevent cross-team compromise. Key strategies include:
- Namespace Isolation: Assign separate namespaces per team, enforce RBAC and NetworkPolicies at namespace level.
- Tenant-Specific RBAC: Scope roles to namespaces, integrate Azure AD for identity-based access control.
- Network Fencing: Apply default-deny NetworkPolicies, restrict inter-namespace traffic and use Azure VNet segmentation.
- Resource Quotas: Limit CPU, memory, and storage per namespace to prevent resource exhaustion.
- Admission Controls: Use OPA Gatekeeper to enforce namespace-specific policies.
- Ingress/Egress Security: Isolate ingress with TLS and SANs, restrict egress traffic per namespace to prevent data exfiltration.
Extending the example for multi-tenancy, the webhook can check that an actual NetworkPolicy object exists in the namespace and that it enforces isolation:
from kubernetes import client

def namespace_policy_is_secure(namespace):
    api = client.NetworkingV1Api()
    policies = api.list_namespaced_network_policy(namespace)
    found_secure_policy = False
    for policy in policies.items:
        has_ingress = bool(policy.spec.ingress)
        has_egress = bool(policy.spec.egress)
        # Require both ingress and egress rules
        if not (has_ingress and has_egress):
            continue
        # Validate ingress rules (must not allow open/any traffic)
        for rule in policy.spec.ingress:
            if not rule._from or rule._from == [{}]:  # empty means allow all
                return False
        # Validate egress rules (must not allow open/any traffic)
        for rule in policy.spec.egress:
            if not rule.to or rule.to == [{}]:  # empty means allow all
                return False
        found_secure_policy = True
    return found_secure_policy
# --- Rule: Enforce secure NetworkPolicy for multi-tenant isolation ---
if not namespace_policy_is_secure(namespace):
    violations.append(
        f"Namespace '{namespace}' does not enforce secure network isolation "
        "(requires NetworkPolicy with ingress + egress + deny-all default rules)."
    )
The second example introduces "Dynamic Resource Quota Enforcement": the admission controller checks how much CPU and memory a tenant (namespace) has already consumed and rejects any Pod that would exceed the remaining quota.
# --- Multi-tenant ResourceQuota enforcement ---
# parse_cpu / convert_to_mi are helpers that parse Kubernetes quantities
# ("500m" -> 0.5 cores, "512Mi" -> 512 Mi); requested_cpu / requested_mem
# hold the totals requested by the incoming Pod.
quotas = core_api.list_namespaced_resource_quota(namespace).items
for quota in quotas:
    hard = quota.status.hard or {}
    used = quota.status.used or {}
    limit_cpu = parse_cpu(hard.get("requests.cpu", "0"))
    limit_mem = convert_to_mi(hard.get("requests.memory", "0Mi"))
    used_cpu = parse_cpu(used.get("requests.cpu", "0"))
    used_mem = convert_to_mi(used.get("requests.memory", "0Mi"))
    # Calculate remaining quota capacity
    remaining_cpu = limit_cpu - used_cpu
    remaining_mem = limit_mem - used_mem
    # Compare requested pod resources vs remaining namespace quota
    if requested_cpu > remaining_cpu or requested_mem > remaining_mem:
        violations.append(
            f"ResourceQuota exceeded in namespace '{namespace}'. "
            f"Remaining CPU={remaining_cpu}, Memory={remaining_mem}Mi | "
            f"Requested CPU={requested_cpu}, Memory={requested_mem}Mi"
        )
Together, these controls allow AKS to function as a secure multi-tenant platform. Each namespace (tenant) is treated under Zero-Trust, no workload is trusted by default, and no communication occurs without explicit policy. Teams can share infrastructure while maintaining strong isolation, ensuring that risks in one environment can’t propagate into another.
Additional Best Practices and Conclusion
Beyond the core focus areas, here are a few additional advanced security practices worth highlighting:
- Secure the Supply Chain: Integrate image-scanning tools like Trivy or Clair into CI/CD to detect vulnerabilities early. Enforce that only signed, verified, and trusted images from approved registries can be deployed.
- Detect Runtime Threats: Use runtime security tools such as Falco to monitor container behavior (e.g., unexpected exec shells, privilege escalations, or unusual network activity) and trigger alerts on anomalies in real time.
- Enable Unified Observability & Visibility: Use Prometheus/Grafana for metrics and centralized logging via Elasticsearch or Microsoft Sentinel to quickly spot unauthorized access and policy violations across workloads and namespaces.
- Be Incident-Ready: Maintain tested incident response playbooks, perform regular etcd backups, and define clear processes for isolating risky workloads, rotating secrets, and restoring cluster operations without downtime.
In summary, securing Kubernetes requires a multi-layered, Zero-Trust approach — especially in environments where multiple teams or tenants share the same cluster. While tools like OPA Gatekeeper and Kyverno provide strong policy enforcement frameworks, custom admission controllers unlock deeper control and flexibility. They enable enforcement of context-aware, organization-specific rules such as tenant-based isolation, dynamic validations driven by external systems, and security decisions based on real-time signals. By combining custom admission logic with Zero-Trust principles (“never trust, always verify”), every pod deployment becomes a security checkpoint, ensuring that only compliant, authorized, and safe workloads are allowed into the cluster. This shifts security from reactive monitoring to proactive enforcement, reducing risk and strengthening compliance in complex Kubernetes environments.