Microsoft Security Community Blog

Fishing for Syslog with Azure Kubernetes and Logstash

kfriedemann
Jul 14, 2025

Traditional syslog collection methods struggle with scale and security in distributed environments. This post shows how to deploy a cloud-native Logstash solution on Azure Kubernetes Service using Terraform for secure, scalable syslog collection with RFC 5425 TLS compliance.

Deploy Secure Syslog Collection on Azure Kubernetes Service with Terraform

Organizations managing distributed infrastructure face a common challenge: collecting syslog data securely and reliably from various sources. Whether you're aggregating logs from network devices, Linux servers, or applications, you need a solution that scales with your environment while maintaining security standards.

This post walks through deploying Logstash on Azure Kubernetes Service (AKS) to collect RFC 5425 syslog messages over TLS. The solution uses Terraform for infrastructure automation and forwards collected logs to Azure Event Hubs for downstream processing. You'll learn how to build a production-ready deployment that integrates with Azure Sentinel, Azure Data Explorer, or other analytics platforms.

Solution Architecture

The deployment consists of several Azure components working together:

  • Azure Kubernetes Service (AKS): Hosts the Logstash deployment with automatic scaling capabilities
  • Internal Load Balancer: Provides a static IP endpoint for syslog sources within your network
  • Azure Key Vault: Stores TLS certificates for secure syslog transmission
  • Azure Event Hubs: Receives processed syslog data using the Kafka protocol
  • Log Analytics Workspace: Monitors the AKS cluster health and performance

Syslog sources send RFC 5425-compliant messages over TLS to the Load Balancer on port 6514. Logstash processes these messages and forwards them to Event Hubs, where they can be consumed by various Azure services or third-party tools.
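RFC 5425 transports each syslog message as an octet-counted frame over TLS: the message length in bytes, a space, then the RFC 5424 message itself. A minimal sketch of the framing (the hostname, app name, and message body are placeholders):

```python
def frame_rfc5425(msg: str) -> bytes:
    """Wrap an RFC 5424 syslog message in RFC 5425 octet-counted framing:
    MSG-LEN SP SYSLOG-MSG, with the length counted in octets (bytes)."""
    payload = msg.encode("utf-8")
    return str(len(payload)).encode("ascii") + b" " + payload

# An RFC 5424 message: <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID SD MSG
msg = "<134>1 2025-07-14T12:00:00Z host01 myapp - - - hello"
frame = frame_rfc5425(msg)
print(frame)  # the frame starts with b"52 ", the message length in bytes
```

This framing is what lets a receiver split messages unambiguously on a TLS stream, since message bodies may themselves contain newlines.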

Prerequisites

Before starting the deployment, ensure you have these tools installed and configured:

  • Terraform: Version 1.5 or later
  • Azure CLI: Authenticated to your Azure subscription
  • kubectl: For managing Kubernetes resources after deployment

Several Azure resources must be created manually before running Terraform, as the configuration references them. This approach provides flexibility in organizing resources across different teams or environments.

Step 1: Create Resource Groups

Create three resource groups to organize the solution components:

az group create --name rg-syslog-prod --location eastus
az group create --name rg-network-prod --location eastus
az group create --name rg-data-prod --location eastus

Each resource group serves a specific purpose:

  • rg-syslog-prod: Contains the AKS cluster, Key Vault, and Log Analytics Workspace
  • rg-network-prod: Holds networking resources (Virtual Network and Subnets)
  • rg-data-prod: Houses the Event Hub Namespace for data ingestion

Step 2: Configure Networking

Create a Virtual Network with dedicated subnets for AKS and the Load Balancer:

az network vnet create \
  --resource-group rg-network-prod \
  --name vnet-syslog-prod \
  --address-prefixes 10.0.0.0/16 \
  --location eastus

az network vnet subnet create \
  --resource-group rg-network-prod \
  --vnet-name vnet-syslog-prod \
  --name snet-aks-prod \
  --address-prefixes 10.0.1.0/24

az network vnet subnet create \
  --resource-group rg-network-prod \
  --vnet-name vnet-syslog-prod \
  --name snet-lb-prod \
  --address-prefixes 10.0.2.0/24

The network design uses non-overlapping CIDR ranges to prevent routing conflicts. The Load Balancer subnet will later be assigned the static IP address 10.0.2.100.
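The address plan above can be sanity-checked before deployment with Python's standard ipaddress module, using the ranges from this walkthrough:

```python
import ipaddress

vnet = ipaddress.ip_network("10.0.0.0/16")         # vnet-syslog-prod
aks_subnet = ipaddress.ip_network("10.0.1.0/24")   # snet-aks-prod
lb_subnet = ipaddress.ip_network("10.0.2.0/24")    # snet-lb-prod
lb_static_ip = ipaddress.ip_address("10.0.2.100")  # Load Balancer frontend

# Both subnets must sit inside the VNet, must not overlap each other,
# and the static IP must fall inside the Load Balancer subnet.
assert aks_subnet.subnet_of(vnet) and lb_subnet.subnet_of(vnet)
assert not aks_subnet.overlaps(lb_subnet)
assert lb_static_ip in lb_subnet
print("address plan is consistent")
```

Running the same checks against any customized CIDR values catches routing conflicts before Terraform ever touches Azure.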

Step 3: Set Up Event Hub Namespace

Create an Event Hub Namespace with a dedicated Event Hub for syslog data:

az eventhubs namespace create \
  --resource-group rg-data-prod \
  --name eh-syslog-prod \
  --location eastus \
  --sku Standard

az eventhubs eventhub create \
  --resource-group rg-data-prod \
  --namespace-name eh-syslog-prod \
  --name syslog

The Standard SKU provides Kafka protocol support, which Logstash uses for reliable message delivery. The namespace automatically includes a RootManageSharedAccessKey for authentication.
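Logstash will later authenticate to this namespace over its Kafka endpoint using the connection string as a SASL/PLAIN password. Both values are assembled mechanically from the namespace name and access key — a sketch, with the key value as a placeholder:

```python
def eventhub_kafka_settings(namespace: str, key_name: str, key: str, event_hub: str):
    """Build the Kafka bootstrap server and SASL JAAS config line for an
    Azure Event Hub Namespace (the Standard SKU exposes Kafka on port 9093)."""
    bootstrap = f"{namespace}.servicebus.windows.net:9093"
    conn_str = (
        f"Endpoint=sb://{namespace}.servicebus.windows.net/;"
        f"SharedAccessKeyName={key_name};SharedAccessKey={key};"
        f"EntityPath={event_hub}"
    )
    jaas = (
        'org.apache.kafka.common.security.plain.PlainLoginModule required '
        'username="$ConnectionString" '
        f'password="{conn_str}";'
    )
    return bootstrap, jaas

bootstrap, jaas = eventhub_kafka_settings(
    "eh-syslog-prod", "RootManageSharedAccessKey", "<key-value>", "syslog")
print(bootstrap)  # eh-syslog-prod.servicebus.windows.net:9093
```

The literal username `$ConnectionString` is Event Hubs' convention for Kafka clients; the entire connection string travels as the password over SASL_SSL.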

Step 4: Configure Key Vault and TLS Certificate

Create a Key Vault to store the TLS certificate:

az keyvault create \
  --resource-group rg-syslog-prod \
  --name kv-syslog-prod \
  --location eastus

For production environments, import a certificate from your Certificate Authority:

az keyvault certificate import \
  --vault-name kv-syslog-prod \
  --name cert-syslog-prod \
  --file certificate.pfx \
  --password <pfx-password>

For testing purposes, you can generate a self-signed certificate:

az keyvault certificate create \
  --vault-name kv-syslog-prod \
  --name cert-syslog-prod \
  --policy "$(az keyvault certificate get-default-policy)"

Important: The certificate's Common Name (CN) or Subject Alternative Name (SAN) must match the DNS name your syslog sources will use to connect to the Load Balancer.
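The matching rule is the usual one for server certificates: an exact name matches itself, and a wildcard such as *.yourdomain.com covers exactly one DNS label. A simplified sketch of the check (real TLS libraries implement the full RFC 6125 rules, so treat this only as an illustration):

```python
def hostname_matches(pattern: str, hostname: str) -> bool:
    """Simplified RFC 6125-style check: case-insensitive exact match, or a
    leading '*.' wildcard that covers exactly one DNS label."""
    pattern, hostname = pattern.lower(), hostname.lower()
    if not pattern.startswith("*."):
        return pattern == hostname
    # '*.example.com' matches 'a.example.com' but not 'a.b.example.com'
    suffix = pattern[1:]  # '.example.com'
    remainder = hostname[: -len(suffix)] if hostname.endswith(suffix) else ""
    return bool(remainder) and "." not in remainder

assert hostname_matches("syslog.yourdomain.com", "syslog.yourdomain.com")
assert hostname_matches("*.yourdomain.com", "syslog.yourdomain.com")
assert not hostname_matches("*.yourdomain.com", "a.b.yourdomain.com")
```

If the name your clients dial does not satisfy this check against the certificate's CN or SAN, TLS handshakes will fail with hostname-verification errors.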

Step 5: Create Log Analytics Workspace

Set up a Log Analytics Workspace for monitoring the AKS cluster:

az monitor log-analytics workspace create \
  --resource-group rg-syslog-prod \
  --workspace-name log-syslog-prod \
  --location eastus

Understanding the Terraform Configuration

With the prerequisites in place, let's examine the Terraform configuration that automates the remaining deployment. The configuration follows a modular approach, making it easy to customize for different environments.

Referencing Existing Resources

The Terraform configuration begins by importing references to the manually created resources:

data "azurerm_client_config" "current" {}

data "azurerm_resource_group" "rg-main" {
  name = "rg-syslog-prod"
}

data "azurerm_resource_group" "rg-network" {
  name = "rg-network-prod"
}

data "azurerm_resource_group" "rg-data" {
  name = "rg-data-prod"
}

data "azurerm_virtual_network" "primary" {
  name                = "vnet-syslog-prod"
  resource_group_name = data.azurerm_resource_group.rg-network.name
}

data "azurerm_subnet" "kube-cluster" {
  name                 = "snet-aks-prod"
  resource_group_name  = data.azurerm_resource_group.rg-network.name
  virtual_network_name = data.azurerm_virtual_network.primary.name
}

data "azurerm_subnet" "kube-lb" {
  name                 = "snet-lb-prod"
  resource_group_name  = data.azurerm_resource_group.rg-network.name
  virtual_network_name = data.azurerm_virtual_network.primary.name
}

data "azurerm_log_analytics_workspace" "logstash" {
  name                = "log-syslog-prod"
  resource_group_name = data.azurerm_resource_group.rg-main.name
}

data "azurerm_key_vault" "primary" {
  name                = "kv-syslog-prod"
  resource_group_name = data.azurerm_resource_group.rg-main.name
}

# Pulls the PEM-encoded certificate and private key created in Step 4
data "azurerm_key_vault_certificate_data" "logstash" {
  name         = "cert-syslog-prod"
  key_vault_id = data.azurerm_key_vault.primary.id
}

These data sources establish connections to existing infrastructure, ensuring the AKS cluster and Load Balancer deploy into the correct network context.

Deploying the AKS Cluster

The AKS cluster configuration balances security, performance, and manageability:

resource "azurerm_kubernetes_cluster" "primary" {
  name                = "aks-syslog-prod"
  location            = data.azurerm_resource_group.rg-main.location
  resource_group_name = data.azurerm_resource_group.rg-main.name
  dns_prefix          = "aks-syslog-prod"

  default_node_pool {
    name           = "default"
    node_count     = 2
    vm_size        = "Standard_DS2_v2"
    vnet_subnet_id = data.azurerm_subnet.kube-cluster.id
  }

  identity {
    type = "SystemAssigned"
  }

  network_profile {
    network_plugin      = "azure"
    load_balancer_sku   = "standard"
    network_plugin_mode = "overlay"
  }

  oms_agent {
    log_analytics_workspace_id = data.azurerm_log_analytics_workspace.logstash.id
  }
}

Key configuration choices:

  • System-assigned managed identity: Eliminates the need for service principal credentials
  • Azure CNI in overlay mode: Provides efficient pod networking without consuming subnet IPs
  • Standard Load Balancer SKU: Enables zone redundancy and higher performance
  • OMS agent integration: Sends cluster metrics to Log Analytics for monitoring

The cluster requires network permissions to create the internal Load Balancer:

resource "azurerm_role_assignment" "aks-netcontrib" {
  scope                = data.azurerm_virtual_network.primary.id
  principal_id         = azurerm_kubernetes_cluster.primary.identity[0].principal_id
  role_definition_name = "Network Contributor"
}

Configuring Logstash Deployment

The Logstash deployment uses Kubernetes resources for reliability and scalability. First, create a dedicated namespace:

resource "kubernetes_namespace" "logstash" {
  metadata {
    name = "logstash"
  }
}

The internal Load Balancer service exposes Logstash on a static IP:

resource "kubernetes_service" "loadbalancer-logstash" {
  metadata {
    name      = "logstash-lb"
    namespace = kubernetes_namespace.logstash.metadata[0].name
    annotations = {
      "service.beta.kubernetes.io/azure-load-balancer-internal"        = "true"
      "service.beta.kubernetes.io/azure-load-balancer-ipv4"            = "10.0.2.100"
      "service.beta.kubernetes.io/azure-load-balancer-internal-subnet" = data.azurerm_subnet.kube-lb.name
      "service.beta.kubernetes.io/azure-load-balancer-resource-group"  = data.azurerm_resource_group.rg-network.name
    }
  }

  spec {
    type = "LoadBalancer"
    selector = {
      app = kubernetes_deployment.logstash.metadata[0].name
    }
    port {
      name        = "logstash-tls"
      protocol    = "TCP"
      port        = 6514
      target_port = 6514
    }
  }
}

The annotations configure Azure-specific Load Balancer behavior, including the static IP assignment and subnet placement.

Securing Logstash with TLS

Kubernetes Secrets store the TLS certificate and Logstash configuration:

resource "kubernetes_secret" "logstash-ssl" {
  metadata {
    name      = "logstash-ssl"
    namespace = kubernetes_namespace.logstash.metadata[0].name
  }

  data = {
    "server.crt" = data.azurerm_key_vault_certificate_data.logstash.pem
    "server.key" = data.azurerm_key_vault_certificate_data.logstash.key
  }

  type = "Opaque"
}

The certificate data comes directly from Key Vault, maintaining a secure chain of custody.

Logstash Container Configuration

The deployment specification defines how Logstash runs in the cluster:

resource "kubernetes_deployment" "logstash" {
  metadata {
    name      = "logstash"
    namespace = kubernetes_namespace.logstash.metadata[0].name
  }

  spec {
    selector {
      match_labels = {
        app = "logstash"
      }
    }

    template {
      metadata {
        labels = {
          app = "logstash"
        }
      }

      spec {
        container {
          name  = "logstash"
          image = "docker.elastic.co/logstash/logstash:8.17.4"

          security_context {
            run_as_user                = 1000
            run_as_non_root            = true
            allow_privilege_escalation = false
          }

          resources {
            requests = {
              cpu    = "500m"
              memory = "1Gi"
            }
            limits = {
              cpu    = "1000m"
              memory = "2Gi"
            }
          }

          volume_mount {
            name       = "logstash-config-volume"
            mount_path = "/usr/share/logstash/pipeline/logstash.conf"
            sub_path   = "logstash.conf"
            read_only  = true
          }

          volume_mount {
            name       = "logstash-ssl-volume"
            mount_path = "/etc/logstash/certs"
            read_only  = true
          }
        }

        volume {
          name = "logstash-config-volume"
          secret {
            # The Secret holding logstash.conf; created alongside the TLS
            # Secret (its definition is not shown in this post)
            secret_name = "logstash-config"
          }
        }

        volume {
          name = "logstash-ssl-volume"
          secret {
            secret_name = kubernetes_secret.logstash-ssl.metadata[0].name
          }
        }
      }
    }
  }
}

Security best practices include:

  • Running as a non-root user (UID 1000)
  • Disabling privilege escalation
  • Mounting configuration and certificates as read-only
  • Setting resource limits to prevent runaway containers

Automatic Scaling Configuration

The Horizontal Pod Autoscaler ensures Logstash scales with demand:

resource "kubernetes_horizontal_pod_autoscaler" "logstash_hpa" {
  metadata {
    name      = "logstash-hpa"
    namespace = kubernetes_namespace.logstash.metadata[0].name
  }

  spec {
    scale_target_ref {
      kind        = "Deployment"
      name        = kubernetes_deployment.logstash.metadata[0].name
      api_version = "apps/v1"
    }

    min_replicas                      = 1
    max_replicas                      = 30
    target_cpu_utilization_percentage = 80
  }
}

This configuration maintains between 1 and 30 replicas, scaling up when CPU usage exceeds 80%.
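The scaling decision follows the standard HPA formula, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to the configured bounds. A quick sketch with the 80% CPU target from this configuration:

```python
import math

def desired_replicas(current_replicas: int, current_cpu_pct: float,
                     target_cpu_pct: float = 80,
                     min_replicas: int = 1, max_replicas: int = 30) -> int:
    """Standard Kubernetes HPA scaling formula, clamped to the replica bounds."""
    desired = math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)
    return max(min_replicas, min(max_replicas, desired))

print(desired_replicas(2, 160))  # 2 pods at 160% average CPU -> scale up to 4
print(desired_replicas(4, 40))   # 4 pods at 40% average CPU  -> scale down to 2
```

Because scale-up is proportional to the observed load, a sudden burst of syslog traffic can add several replicas in one reconciliation rather than stepping up one pod at a time.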

Logstash Pipeline Configuration

The Logstash configuration file defines how to process syslog messages:

input {
  tcp {
    port       => 6514
    type       => "syslog"
    ssl_enable => true
    ssl_cert   => "/etc/logstash/certs/server.crt"
    ssl_key    => "/etc/logstash/certs/server.key"
    ssl_verify => false
  }
}

output {
  stdout {
    codec => rubydebug
  }

  kafka {
    bootstrap_servers => "${name}.servicebus.windows.net:9093"
    topic_id          => "syslog"
    security_protocol => "SASL_SSL"
    sasl_mechanism    => "PLAIN"
    sasl_jaas_config  => 'org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="Endpoint=sb://${name}.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=${primary_key};EntityPath=syslog";'
    codec             => "json"
  }
}

The configuration:

  • Listens on port 6514 for TLS-encrypted syslog messages
  • Outputs to stdout for debugging (visible in container logs)
  • Forwards processed messages to Event Hubs using the Kafka protocol

The ${name} and ${primary_key} placeholders are template variables: Terraform substitutes the Event Hub Namespace name and its RootManageSharedAccessKey (for example, via the templatefile() function) before storing the rendered pipeline in a Kubernetes Secret.

Deploying the Solution

With all components configured, deploy the solution using Terraform:

  1. Initialize Terraform in your project directory:
    terraform init
  2. Review the planned changes:
    terraform plan
  3. Apply the configuration:
    terraform apply
  4. Connect to the AKS cluster:
    az aks get-credentials \
      --resource-group rg-syslog-prod \
      --name aks-syslog-prod
  5. Verify the deployment:
    kubectl -n logstash get pods
    kubectl -n logstash get svc
    kubectl -n logstash get hpa

Configuring Syslog Sources

After deployment, configure your syslog sources to send messages to the Load Balancer:

  1. Create a DNS record pointing to the Load Balancer IP (10.0.2.100). For example: syslog.yourdomain.com
  2. Configure syslog clients to send RFC 5425 messages over TLS to port 6514
  3. Install the certificate chain on syslog clients if using a private CA or self-signed certificate

Example rsyslog configuration for a Linux client (TLS forwarding requires the rsyslog-gnutls package, and the CA file must allow the client to verify the Logstash certificate):

$DefaultNetstreamDriverCAFile /etc/rsyslog.d/ca.pem
$ActionSendStreamDriver gtls
$ActionSendStreamDriverMode 1
$ActionSendStreamDriverAuthMode x509/name
*.* @@syslog.yourdomain.com:6514;RSYSLOG_SyslogProtocol23Format

Note that the @@host:port form on its own sends plain, unencrypted TCP; the stream-driver directives above are what enable TLS.

Monitoring and Troubleshooting

Monitor the deployment using several methods:

View Logstash logs to verify message processing:

kubectl -n logstash logs -l app=logstash --tail=50

Check autoscaling status:

kubectl -n logstash describe hpa logstash-hpa

Monitor in Azure Portal:

  • Navigate to the Log Analytics Workspace to view AKS metrics
  • Check Event Hub metrics to confirm message delivery
  • Review Load Balancer health probes and connection statistics

Security Best Practices

This deployment incorporates several security measures:

  • TLS encryption: All syslog traffic is encrypted using certificates from Key Vault
  • Network isolation: The internal Load Balancer restricts access to the virtual network
  • Managed identities: No credentials are stored in the configuration
  • Container security: Logstash runs as a non-root user with minimal privileges

For production deployments, consider these additional measures:

  • Enable client certificate validation in Logstash for mutual TLS
  • Add Network Security Groups to restrict source IPs
  • Implement Azure Policy for compliance validation
  • Enable Azure Defender for Kubernetes

Integration with Azure Services

Once syslog data flows into Event Hubs, you can integrate with various Azure services:

Azure Sentinel: Configure Data Collection Rules to ingest syslog data for security analytics. See the Azure Sentinel documentation for detailed steps.

Azure Data Explorer: Create a data connection to analyze syslog data with KQL queries.

Azure Stream Analytics: Process syslog streams in real-time for alerting or transformation.

Logic Apps: Trigger workflows based on specific syslog patterns or events.

Cost Optimization

To optimize costs while maintaining performance:

  • Right-size the AKS node pool based on actual syslog volume
  • Use Azure Spot instances for non-critical environments
  • Configure Event Hub retention based on compliance requirements
  • Enable auto-shutdown for development environments

Conclusion

This Terraform-based solution provides a robust foundation for collecting syslog data in Azure. The combination of AKS, Logstash, and Event Hubs creates a scalable pipeline that integrates seamlessly with Azure's security and analytics services.

The modular design allows easy customization for different environments and requirements. Whether you're collecting logs from a handful of devices or thousands, this architecture scales to meet your needs while maintaining security and reliability.

For next steps, consider implementing additional Logstash filters for data enrichment, setting up automated certificate rotation, or expanding the solution to collect other log formats. The flexibility of this approach ensures it can grow with your organization's logging requirements.

Updated Jul 11, 2025
Version 1.0