
Troubleshooting Network Issues with Retina

kamilp, Microsoft
Aug 22, 2025

The Kubernetes environment comes with its own unique challenges, particularly at enterprise scale. Hundreds or thousands of pods may exist at any given point in time, and they are continuously being created, deleted, restarted, and rescheduled. On top of that, these pods are distributed across tens, hundreds, or even thousands of nodes.

If you need to troubleshoot connectivity issues within your microservices on such a cluster, manually executing tcpdump against individual containers is an extremely slow and tedious process. It typically requires you to identify the node where the relevant pod is running, run tcpdump (installing it first if needed), and then transfer the .pcap files away for analysis. Rinse and repeat for as many pods as required.
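
For context, a manual capture against a single pod might look something like the sketch below (the pod, namespace, and file names are purely illustrative, and it assumes tcpdump is available inside the container):

# find the node where the pod is running
kubectl get pod my-app-7d4b9c-xyz12 -n my-namespace -o wide

# run tcpdump inside the pod (only works if the image ships tcpdump)
kubectl exec -it my-app-7d4b9c-xyz12 -n my-namespace -- tcpdump -i any -w /tmp/my-app.pcap

# copy the resulting .pcap off the pod for analysis
kubectl cp my-namespace/my-app-7d4b9c-xyz12:/tmp/my-app.pcap ./my-app.pcap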

Installing tools can be restricted in many environments, and the ephemeral nature of pods makes pod-based captures unreliable and difficult to manage.

Enter Retina.

What is Retina? 

Retina is a cloud-agnostic, open-source Kubernetes network observability platform that leverages the power of eBPF for observability and deep network insights.

This post discusses how Retina solves key challenges of performing packet captures in a Kubernetes environment, as well as the additional debugging tools provided within the Retina Shell.

To learn more about Retina, refer to the following blog post: Retina: Bridging Kubernetes Observability and eBPF Across the Clouds, or visit the documentation directly at retina.sh.

Packet Captures with Retina 

Retina captures allow users to perform distributed packet captures across the cluster, based on specified Nodes/Pods and other supported filters. This eliminates the manual toil of installing tools and running captures on individual targets.

The captures are performed on demand, either via the CLI or a CRD, and can be written to persistent storage, including the host filesystem, a PersistentVolumeClaim (PVC), or a storage blob. Whenever a capture is initiated, a Kubernetes Job is created on each relevant Node. The Job's worker Pod runs for the specified duration, performs the capture, and wraps the collected network information into a tarball, which is then copied to the configured output location(s).
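
For example, a capture written to each node's local filesystem might look like the sketch below; the --duration and --host-path flags are assumed here based on the capture CLI documentation, and the follow-up commands simply watch the per-node Jobs:

# start a 30-second capture on all Linux nodes, writing the tarball to each node's filesystem
kubectl retina capture create \
  --name linux-nodes \
  --node-selectors "kubernetes.io/os=linux" \
  --duration 30s \
  --host-path /mnt/capture

# one Job (with a worker Pod) is created per selected node - watch them run to completion
kubectl get jobs
kubectl retina capture list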

The names of the created jobs are suffixed with a random hash to uniquely identify them. For example, a capture named 'hello-world' might result in a job named 'hello-world-bkfzp'.

The names of the tarballs include the capture name, the host name, and a UTC timestamp, e.g. 'hello-world-aks-agentpool-13097696-vmss000000-20250813182832UTC.tar.gz'.

Retina CLI - performing a packet capture and downloading the output

The result of a capture contains more than just a .pcap file. Retina also collects a range of networking metadata, such as iptables rules, socket statistics, kernel network information from /proc/net, and more. Refer to the documentation for the exhaustive list: Tarball Contents.
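
After downloading a capture, you can list everything in the tarball before extracting it; the file name below is the example from above:

# list the contents of a downloaded capture tarball without extracting it
tar -tzf hello-world-aks-agentpool-13097696-vmss000000-20250813182832UTC.tar.gz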

To learn more about Retina captures, check out the documentation.

Quick Start 

# install Retina CLI 
kubectl krew install retina 

# create a packet capture (add target selectors and output flags as needed - see the scenarios below)
kubectl retina capture create

# observe captures until they are done 
kubectl retina capture list 

# download the result of the capture 
kubectl retina capture download --name retina-capture 

# extract the contents of the downloaded tarball 
tar -xvf <name>.tar.gz

Common Scenarios 

Example 1: Capture traffic from a single deployment, e.g. coredns

kubectl retina capture create \ 
  --name coredns \ 
  --pod-selectors "k8s-app=kube-dns" \ 
  --namespace-selectors="kubernetes.io/metadata.name=kube-system" 

Example 2: Capture traffic from a single deployment with specific traffic filters

kubectl retina capture create \ 
  --name coredns \ 
  --pod-selectors "k8s-app=kube-dns" \ 
  --namespace-selectors="kubernetes.io/metadata.name=kube-system" \ 
  --tcpdump-filter "tcp and port 53" 

Example 3: Capture all traffic on a specific node

kubectl retina capture create \ 
  --name node-full \ 
  --node-selectors "kubernetes.io/hostname=aks-nodepool1-21861315-vmss000003" 

Example 4: Capture all HTTPS traffic on all Linux nodes and upload to blob storage

kubectl retina capture create \ 
  --name all-linux-node-443 \ 
  --node-selectors "kubernetes.io/os=linux" \ 
  --tcpdump-filter "tcp and port 443" \  
  --blob-upload <SAS URL>

Advanced Debugging with Retina Shell 

The Retina Shell is still considered an experimental feature; however, it has already proven extremely useful for troubleshooting a variety of networking scenarios.

Note: As of v1.0.0-rc2, Retina Shell only supports Linux-based Nodes and Pods.

By running the 'kubectl retina shell' command, you can start an interactive shell on a Kubernetes Node or Pod. This launches a container image that includes many common networking tools, such as curl, ping, nslookup, iptables, and more.
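
As an illustrative sketch, a troubleshooting session on a Node might look something like the following (the target addresses and service names are hypothetical, and some tools, such as iptables, may require additional capabilities):

# start a shell on a node, granting the capabilities needed by tools like iptables
kubectl retina shell aks-nodepool1-21861315-vmss000003 --capabilities=NET_ADMIN,NET_RAW

# inside the shell: basic connectivity and DNS checks (targets are hypothetical)
ping -c 3 10.0.0.10
nslookup kubernetes.default.svc.cluster.local
curl -v http://my-service.my-namespace.svc.cluster.local:8080

# inspect the node's iptables rules
iptables -L -n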

In the latest Retina release (v0.0.36), both pwru and bpftool have been added to the tool suite, and Inspektor Gadget has been added in the pre-release v1.0.0-rc2.

Retina CLI - using retina shell to connect to a node to troubleshoot with bpftool and pwru

To learn more about Retina Shell, check out the documentation.

Quick Start 

# different ways of invoking the Retina Shell 
kubectl retina shell <node-name> 
kubectl retina shell -n <namespace> pods/<pod-name> 
kubectl retina shell <node-name> --mount-host-filesystem 
kubectl retina shell <node-name> --capabilities=<CAPABILITIES,COMMA,SEPARATED> 

pwru 

An eBPF-based tool for tracing network packets in the Linux kernel with advanced filtering capabilities. It allows fine-grained introspection of the kernel state to facilitate debugging network connectivity issues. 

Example: Debugging HTTP Traffic Between Microservices 

pwru "tcp and (src port 8080 or dst port 8080)" 

bpftool 

A utility that allows users to list, dump, and load BPF programs, among other things. It serves as a reference tool for quickly inspecting and managing BPF objects on your system, manipulating BPF object files, and performing various other BPF-related tasks.

Example 1: List and Inspect Loaded BPF Programs 

bpftool prog show 

Example 2: Dump BPF Maps to Debug Connection Tracking 

bpftool map dump id <map_id> 
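
The <map_id> placeholder refers to a map that is already loaded on the node; you can find the right ID by listing the loaded maps first:

# list all BPF maps loaded on the node, including their IDs, types, and names
bpftool map show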

Inspektor Gadget 

A set of tools and framework for data collection and system inspection on Kubernetes clusters and Linux hosts using eBPF. 

Example 1: Trace DNS Queries and Responses 

ig run trace_dns:latest -n <namespace> -p <pod>

Example 2: Trace OOM Kill Events 

ig run trace_oomkill:latest --namespace <namespace> 

Conclusion 

The best network issues are the ones that never happen. Unfortunately, that is not the reality for most services, and when things inevitably go wrong, you need to be able to count on your tools to help you.

Retina solves many of the challenges of debugging connectivity issues in a Kubernetes environment by providing a streamlined experience out-of-the-box, eliminating the manual toil and operational overhead. Whether you need to run a distributed packet capture across your cluster or connect to a Node and dig deeper with specialized tooling, Retina has you covered. 

To keep up with the development and new releases, star the repository on GitHub - microsoft/retina, or contribute to the project yourself by following the development guide!
