Standard Kubernetes DNS forces every pod to traverse the network fabric to a centralized CoreDNS service, a design that becomes a scaling and latency bottleneck as clusters grow. By default, pods send DNS queries to the kube-dns Service IP, which kube-proxy translates to CoreDNS endpoints via iptables rules. NodeLocal DNSCache removes this network hop by resolving queries locally on each node.
Why Adopt NodeLocal DNSCache?
The primary drivers for adoption are usually:
- Eliminating Conntrack Pressure: In high-QPS UDP DNS scenarios, conntrack contention on UDP flows can cause intermittent DNS response loss and retries; depending on resolver retry and timeout settings, this surfaces as multi-second lookup delays and occasionally much longer tails.
- Reducing Latency: By placing a cache on every node, you remove the network hop to the CoreDNS service. Responses are practically instantaneous for cached records.
- Offloading CoreDNS: A DaemonSet architecture effectively shards the DNS query load across the entire cluster, preventing the central CoreDNS deployment from becoming a single point of congestion during bursty scaling events.
Who needs this?
You should prioritize this architecture if you run:
- Large clusters (hundreds of nodes or thousands of pods), where scaling the central CoreDNS deployment becomes difficult to manage.
- High-churn environments, such as spot instances or frequent auto-scaling jobs, which trigger large waves of DNS queries.
- Real-time applications where multi-second (and occasionally longer) DNS lookup delays are unacceptable.
The Challenge with Cilium
Deploying NodeLocal DNSCache on a cluster managed by Cilium (as the CNI) requires a specific approach. The standard NodeLocal DNSCache setup relies on a node-level dummy interface and iptables rules. In Cilium environments, you can instead implement the interception via a Cilium Local Redirect Policy (LRP), which redirects traffic destined for the kube-dns ClusterIP Service to a node-local backend pod.
This post details a production-ready deployment strategy aligned with Cilium’s Local Redirect Policy model. It covers necessary configuration tweaks to avoid conflicts and explains how to maintain security filtering.
Architecture Overview
In a standard Kubernetes deployment, NodeLocal DNSCache creates a dummy network interface and uses extensive iptables rules to hijack traffic destined for the Cluster DNS IP.
When using Cilium, we can achieve this more elegantly and efficiently using Local Redirect Policies.
- DaemonSet: Runs node-local-dns on every node.
- Configuration: Configured to skip interface creation and iptables manipulation.
- Redirection: Cilium LRP intercepts traffic to the kube-dns Service IP and redirects it to the local pod on the same node.
1. The NodeLocal DNSCache DaemonSet
The critical difference in this manifest is the arguments passed to the node-local-dns binary. We must explicitly disable its networking setup functions to let Cilium handle the traffic.
The NodeLocal DNSCache deployment also requires the node-local-dns ConfigMap and the kube-dns-upstream Service (plus RBAC/ServiceAccount). For brevity, the snippet below shows only the DaemonSet arguments that differ in the Cilium/LRP approach. The node-cache reads the template Corefile (/etc/coredns/Corefile.base) and generates the active Corefile (/etc/Corefile). The -conf flag points CoreDNS at the active Corefile it should load.
The node-cache binary accepts -localip as an IP list; 0.0.0.0 is a valid value and makes it listen on all interfaces, appropriate for the LRP-based redirection model.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    k8s-app: node-local-dns
spec:
  selector:
    matchLabels:
      k8s-app: node-local-dns
  template:
    metadata:
      labels:
        k8s-app: node-local-dns
      annotations:
        # Optional: policy.cilium.io/no-track-port can be used to bypass conntrack for DNS.
        # Validate the impact on your Cilium version and your observability/troubleshooting needs.
        policy.cilium.io/no-track-port: "53"
    spec:
      # IMPORTANT for the "LRP + listen broadly" approach:
      # keep hostNetwork off so you don't hijack node-wide :53
      hostNetwork: false
      dnsPolicy: ClusterFirst
      containers:
        - name: node-cache
          image: registry.k8s.io/dns/k8s-dns-node-cache:1.15.16
          args:
            - "-localip"
            # Use a bind-all approach. Ensure server blocks bind broadly in your Corefile.
            - "0.0.0.0"
            - "-conf"
            - "/etc/Corefile"
            - "-upstreamsvc"
            - "kube-dns-upstream"
            # CRITICAL: Disable internal setup
            - "-skipteardown=true"
            - "-setupinterface=false"
            - "-setupiptables=false"
          ports:
            - containerPort: 53
              name: dns
              protocol: UDP
            - containerPort: 53
              name: dns-tcp
              protocol: TCP
          # Ensure your Corefile includes health :8080 so the liveness probe works
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 60
            timeoutSeconds: 5
          volumeMounts:
            - name: config-volume
              mountPath: /etc/coredns
            - name: kube-dns-config
              mountPath: /etc/kube-dns
      volumes:
        - name: kube-dns-config
          configMap:
            name: kube-dns
            optional: true
        - name: config-volume
          configMap:
            name: node-local-dns
            items:
              - key: Corefile
                path: Corefile.base
2. The Cilium Local Redirect Policy (LRP)
Instead of iptables, we define a CRD that tells Cilium: "When you see traffic for `kube-dns`, send it to the `node-local-dns` pod on this same node."
apiVersion: "cilium.io/v2"
kind: CiliumLocalRedirectPolicy
metadata:
name: "nodelocaldns"
namespace: kube-system
spec:
redirectFrontend:
# ServiceMatcher mode is for ClusterIP services
serviceMatcher:
serviceName: kube-dns
namespace: kube-system
redirectBackend:
# The backend pods selected by localEndpointSelector must be in the same namespace as the LRP
localEndpointSelector:
matchLabels:
k8s-app: node-local-dns
toPorts:
- port: "53"
name: dns
protocol: UDP
- port: "53"
name: dns-tcp
protocol: TCP
This is an LRP-based NodeLocal DNSCache deployment: we disable node-cache’s iptables/interface setup and let Cilium LRP handle local redirection. This differs from the upstream NodeLocal DNSCache manifest, which uses hostNetwork + dummy interface + iptables.
LRP must be enabled in Cilium (e.g., localRedirectPolicies.enabled=true) before applying the CRD; see the official Cilium LRP documentation for details.
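With a Helm-based install, that boils down to a values toggle. A minimal sketch follows; the key name mirrors the localRedirectPolicies.enabled form above, while older Cilium charts exposed a single localRedirectPolicy flag, so check the chart version you run:
# Helm values excerpt (sketch): enables the Local Redirect Policy feature in the agent
localRedirectPolicies:
  enabled: true
The agents must restart with the new setting (a normal helm upgrade rollout handles this) before the CRD is honored.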
The Network Policy "Gotcha"
If you use CiliumNetworkPolicy to restrict egress traffic, specifically for FQDN filtering, you typically allow access to CoreDNS like this:
- toEndpoints:
    - matchLabels:
        k8s:io.kubernetes.pod.namespace: kube-system
        k8s:k8s-app: kube-dns
  toPorts:
    - ports:
        - port: "53"
          protocol: ANY
This will break with local redirection.
Why? Because LRP redirects the DNS request to the node-local-dns backend endpoint; strict egress policies must therefore allow both kube-dns (upstream) and node-local-dns (the redirected destination).
The Repro Setup
To demonstrate this failure, the cluster is configured with:
- NodeLocal DNSCache: Deployed as a DaemonSet (node-local-dns) to cache DNS requests locally on every node.
- Local Redirect Policy (LRP): An active LRP intercepts traffic destined for the kube-dns Service IP and redirects it to the local node-local-dns pod.
- Incomplete Network Policy: A strict CiliumNetworkPolicy (CNP) is enforced on the client pod. While it explicitly allows egress to kube-dns, it misses the corresponding rule for node-local-dns.
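For completeness, the client can be as simple as the sketch below; the pod name matches the scenario, while the image and the app label are illustrative assumptions:
apiVersion: v1
kind: Pod
metadata:
  name: dns-client
  namespace: default
  labels:
    app: dns-client        # illustrative label; the strict CNP in this repro selects it
spec:
  containers:
    - name: client
      image: busybox:1.36  # any image with nslookup works
      command: ["sleep", "86400"]
Exec into the pod and run nslookup github.com while watching Hubble to reproduce the drop.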
Revealing the issue with Hubble:
In this scenario, the client pod dns-client is attempting to resolve the external domain github.com.
When inspecting the traffic flows, you will see EGRESS DENIED verdicts. Crucially, notice the destination pod in the Hubble output: kube-system/node-local-dns, not kube-dns.
Although the application originally sent the packet to the ClusterIP of CoreDNS, Cilium's Local Redirect Policy rewrote the destination to the local node cache. Because the strict Network Policy only allows traffic to the kube-dns identity, the redirected traffic falls outside the allowed rules and is dropped by the default-deny stance.
The Fix: You must allow egress to both labels.
- toEndpoints:
    - matchLabels:
        k8s:io.kubernetes.pod.namespace: kube-system
        k8s:k8s-app: kube-dns
    # Add this selector for the local cache
    - matchLabels:
        k8s:io.kubernetes.pod.namespace: kube-system
        k8s:k8s-app: node-local-dns
  toPorts:
    - ports:
        - port: "53"
          protocol: ANY
Without this addition, pods protected by strict egress policies will time out resolving DNS, even though the cache is running.
Verifying the fix with Hubble:
After adding matchLabels: k8s:k8s-app: node-local-dns, the traffic is now allowed. Hubble confirms a policy verdict of EGRESS ALLOWED for UDP traffic on port 53. Because DNS resolution now succeeds, the response populates the Cilium FQDN cache, subsequently allowing the TCP traffic to github.com on port 443 as intended.
Real-World Example: Restricting Egress with FQDN Policies
Here is a complete CiliumNetworkPolicy that locks down a workload to only access api.example.com. Note how the DNS rule explicitly allows traffic to both kube-dns (for upstream) and node-local-dns (for the local cache).
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
name: secure-workload-policy
spec:
endpointSelector:
matchLabels:
app: critical-workload
egress:
# 1. Allow DNS Resolution (REQUIRED for FQDN policies)
- toEndpoints:
- matchLabels:
k8s:io.kubernetes.pod.namespace: kube-system
k8s:k8s-app: kube-dns
# Allow traffic to the local cache redirection target
- matchLabels:
k8s:io.kubernetes.pod.namespace: kube-system
k8s:k8s-app: node-local-dns
toPorts:
- ports:
- port: "53"
protocol: ANY
rules:
dns:
- matchPattern: "*"
# 2. Allow specific FQDN traffic (populated via DNS lookups)
- toFQDNs:
- matchName: "api.example.com"
toPorts:
- ports:
- port: "443"
protocol: TCP
Configuration & Upstream Loops
When configuring the ConfigMap for node-local-dns, use the standard placeholders provided by the image. The binary replaces them at runtime:
- __PILLAR__CLUSTER__DNS__: The Upstream Service IP (kube-dns-upstream).
- __PILLAR__UPSTREAM__SERVERS__: The system resolvers (usually /etc/resolv.conf).
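To see the placeholders in context, here is a trimmed sketch of the node-local-dns ConfigMap adapted for this LRP setup (the upstream template also carries reverse zones such as in-addr.arpa and ip6.arpa, omitted here). Note bind 0.0.0.0, matching -localip, and health :8080, matching the liveness probe above:
apiVersion: v1
kind: ConfigMap
metadata:
  name: node-local-dns
  namespace: kube-system
data:
  Corefile: |
    cluster.local:53 {
        errors
        cache {
            success 9984 30
            denial 9984 5
        }
        reload
        loop
        bind 0.0.0.0
        forward . __PILLAR__CLUSTER__DNS__ {
            force_tcp
        }
        prometheus :9253
        health :8080
    }
    .:53 {
        errors
        cache 30
        reload
        loop
        bind 0.0.0.0
        forward . __PILLAR__UPSTREAM__SERVERS__
        prometheus :9253
    }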
Ensure kube-dns-upstream exists as a Service selecting the CoreDNS pods so cache misses are forwarded to the actual CoreDNS backends.
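A minimal sketch of that Service, assuming your CoreDNS pods carry the standard k8s-app: kube-dns label:
apiVersion: v1
kind: Service
metadata:
  name: kube-dns-upstream
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/name: "KubeDNSUpstream"
spec:
  # Selects the real CoreDNS pods so cache misses reach the upstream backends
  selector:
    k8s-app: kube-dns
  ports:
    - name: dns
      port: 53
      protocol: UDP
      targetPort: 53
    - name: dns-tcp
      port: 53
      protocol: TCP
      targetPort: 53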
Alternative: AKS LocalDNS
LocalDNS is an Azure Kubernetes Service (AKS)-managed node-local DNS proxy and cache.
Pros:
- Managed lifecycle at the node pool level.
- Support for custom configuration via localdnsconfig.json (e.g., custom server blocks, cache tuning).
- No manual DaemonSet management required.
Cons & Limitations:
- Incompatibility with FQDN Policies: As noted in the official documentation, LocalDNS isn’t compatible with applied FQDN filter policies in ACNS/Cilium; if you rely on FQDN enforcement, prefer a DNS path that preserves FQDN learning/enforcement.
- Updating configuration requires reimaging the node pool.
For environments heavily relying on strict Cilium Network Policies and FQDN filtering, the manual deployment method described above (using LRP) can be more reliable and transparent.
AKS recommends not enabling the upstream NodeLocal DNSCache and LocalDNS in the same node pool; since DNS traffic is routed through LocalDNS, the combination can produce unexpected results.