Troubleshoot intermittent connectivity issues in Azure App Service
Published Dec 11 2023 06:26 PM 2,491 Views
Microsoft

In this article

    Linux 

        nslookup 

        dig 

        NodeJS 

        Python 

        tcpping 

        curl 

        tcpdump 

    Windows 

        nameresolver.exe 

        tcpping.exe 

        curl.exe 

        Webjob 

Applications running on Azure App Service sometimes throw errors when connecting to a target service because of intermittent connectivity issues.
Repeatedly modifying and redeploying the code to identify the problem is inconvenient. Instead, you can use the App Service Kudu site to run quick tests without changing any code.

Linux

Log in to your App Service Kudu site, e.g. https://<appname>.scm.azurewebsites.net, and navigate to the "SSH" menu. You should see an SSH shell.
If you are using a custom image, you need to enable SSH first by following this document: https://learn.microsoft.com/en-us/azure/app-service/configure-custom-container?tabs=debian&pivots=co...

nslookup

nslookup is normally preinstalled on Linux Kudu images. You can use it to check whether an FQDN resolves to the correct IP address.

 

# Test DNS resolution for login.microsoftonline.com; change it to any other destination
nslookup login.microsoftonline.com
nslookup <dbname>.postgres.database.azure.com
# the response looks like the following
Server:         127.0.0.11
Address:        127.0.0.11#53

Non-authoritative answer:
Name:   <dbname>.postgres.database.azure.com
Address: xxx.xxx.xxx.xxx

 

If it is not installed, run the commands below.

 

apt update
apt install dnsutils

 

To run nslookup repeatedly, save the script below to a file, e.g. dnstest.sh, and adjust the parameters to your requirements.

 

fqdn=$1 # get the first parameter
max=$2  # get the second parameter
for ((i = 0 ; i < max ; i++ )); 
do 
   now=$(date +"%Y-%m-%d %T")
   echo "$now Progress $i / $max" # show the progress
   nslookup "$fqdn"          # execute the command
   sleep 1                   # sleep 1 second
done

 

To use it

 

bash dnstest.sh login.microsoftonline.com 10

 

To run it and write the details to a log file:

 

bash dnstest.sh login.microsoftonline.com 10 > output.log

 

dig

dig is also installed by default. You can use it to get detailed information about DNS resolution.

 

dig login.microsoftonline.com

 

To run dig repeatedly, save the script below to a file, e.g. digtest.sh

 

fqdn=$1 # get the first parameter
max=$2  # get the second parameter
for ((i = 0 ; i < max ; i++ )); 
do 
   now=$(date +"%Y-%m-%d %T")
   echo "$now Progress $i / $max" # show the progress
   dig "$fqdn"               # execute the command
   sleep 1                   # sleep 1 second
done

 

To use it

 

bash digtest.sh login.microsoftonline.com 10
bash digtest.sh login.microsoftonline.com 10 > output.log
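
Once the loop output is saved to output.log, the response status and query time of each run can be extracted instead of read by eye. The following is a minimal sketch, assuming the default dig output format (a `;; ->>HEADER<<-` line carrying `status:`, and a `;; Query time: N msec` line) and assuming that an unreachable resolver is reported with the text "no servers could be reached". The function name summarize_dig_log is hypothetical.

```python
import re
from collections import Counter

QUERY_TIME = re.compile(r"^;; Query time: (\d+) msec")
STATUS = re.compile(r"status: ([A-Z]+)")


def summarize_dig_log(text):
    """Summarize saved dig output: statuses, unreachable count, query times."""
    statuses = Counter()
    times_ms = []
    unreachable = 0
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(";; ->>HEADER<<-"):
            # Header line of one response, e.g. "status: NOERROR"
            match = STATUS.search(line)
            if match:
                statuses[match.group(1)] += 1
        elif "no servers could be reached" in line:
            # Assumed wording for a query that got no answer at all
            unreachable += 1
        else:
            match = QUERY_TIME.match(line)
            if match:
                times_ms.append(int(match.group(1)))
    return {
        "statuses": dict(statuses),
        "unreachable": unreachable,
        "max_ms": max(times_ms) if times_ms else None,
        "avg_ms": sum(times_ms) / len(times_ms) if times_ms else None,
    }
```

For example, print(summarize_dig_log(open("output.log").read())) prints one summary for the whole run. A status other than NOERROR, a non-zero unreachable count, or a high max_ms points to the runs worth inspecting in the log.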

 

NodeJS

If you are using NodeJS as the runtime, you can use the command below to measure how long DNS resolution takes.

 

node -e "const dns = require('dns'); console.time('lookup_time'); dns.lookup('login.microsoftonline.com', (err, out) => { console.timeEnd('lookup_time'); console.log(err, out)});"

 

Python

Save the script below to dnstest.py.

 

import sys
import socket
import time
import datetime

fqdn = sys.argv[1]
total = int(sys.argv[2])

def get_ipv4_by_hostname(hostname):
    # see `man getaddrinfo`
    # ask only for IPv4 stream sockets so each address is returned once
    return [
        info[4][0]  # sockaddr tuple -> address
        for info in socket.getaddrinfo(
            hostname,
            0,  # port, required
            family=socket.AF_INET,
            type=socket.SOCK_STREAM,
        )
    ]

for i in range(total):
    now = datetime.datetime.now()
    print(str(now) + " testing " + fqdn)
    try:
        print(get_ipv4_by_hostname(fqdn))
    except socket.gaierror as err:
        # keep looping so an intermittent failure is logged instead of ending the test
        print("DNS resolution failed: " + str(err))
    time.sleep(1)

 

To use it

 

python dnstest.py www.google.com 10
python dnstest.py www.google.com 10 > output.log
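
The NodeJS one-liner above reports how long a lookup takes. The same measurement can be sketched in Python. This is an illustrative variant, not part of the original script: the function names timed_lookup and run are hypothetical, and it only assumes the standard library socket.getaddrinfo call.

```python
import datetime
import socket
import time


def timed_lookup(hostname):
    """Resolve hostname to IPv4 addresses; return (elapsed_ms, addresses, error)."""
    start = time.perf_counter()
    try:
        infos = socket.getaddrinfo(
            hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
        )
        addresses, error = sorted({info[4][0] for info in infos}), None
    except socket.gaierror as exc:
        # Resolution failed; record the reason instead of raising
        addresses, error = [], str(exc)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return elapsed_ms, addresses, error


def run(hostname, total, interval=1.0):
    """Repeat the lookup and return (failure_count, slowest_ms)."""
    failures, slowest_ms = 0, 0.0
    for _ in range(total):
        elapsed_ms, addresses, error = timed_lookup(hostname)
        slowest_ms = max(slowest_ms, elapsed_ms)
        if error:
            failures += 1
        now = datetime.datetime.now().isoformat(sep=" ", timespec="seconds")
        print(f"{now} {hostname} {elapsed_ms:.1f} ms {error or addresses}")
        time.sleep(interval)
    print(f"failures: {failures}/{total}, slowest: {slowest_ms:.1f} ms")
    return failures, slowest_ms
```

Calling run("login.microsoftonline.com", 10) prints one timed line per lookup and a final failure count, which makes an occasional slow or failed resolution easy to spot.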

 

tcpping

tcpping may not be installed. Follow the instructions below to install it.

 

apt-get update
apt-get install bc
apt-get install tcptraceroute  
cd /usr/bin  
wget http://www.vdberg.org/~richard/tcpping
chmod 755 tcpping  

 

Test the connection.

 

tcpping login.microsoftonline.com 443
# sample response
seq 0: tcp response from 20.190.144.161 [open]  67.911 ms
seq 1: tcp response from 20.190.163.20 [open]  2.130 ms
seq 2: tcp response from 20.190.144.161 [open]  70.361 ms
seq 3: tcp response from 20.190.144.137 [open]  68.137 ms
seq 4: tcp response from 20.190.148.163 [open]  69.158 ms
seq 5: tcp response from 40.126.35.19 [open]  1.443 ms
seq 6: tcp response from 20.190.148.162 [open]  69.235 ms
seq 7: tcp response from 20.190.144.139 [open]  69.703 ms

 

By default, tcpping repeats indefinitely. You can limit the number of attempts with the -x parameter.

 

tcpping -x 100 www.google.com 443 
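
If tcpping cannot be installed (for example, when the container has no outbound access to download it) and Python is available in the image, a TCP connection test can be approximated with the standard library. This is a rough sketch rather than a replacement for tcpping: it times a full connect, including DNS resolution, and the function names tcp_probe and probe_loop are hypothetical.

```python
import socket
import time


def tcp_probe(host, port, timeout=5.0):
    """Attempt one TCP connection; return (ok, elapsed_ms, detail)."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            detail = conn.getpeername()[0]  # IP address that accepted the connection
        ok = True
    except OSError as exc:
        # OSError covers DNS failures, refused connections, and timeouts
        ok, detail = False, str(exc)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return ok, elapsed_ms, detail


def probe_loop(host, port, count, interval=1.0):
    """Probe repeatedly, print each result, and return the number of failures."""
    failures = 0
    for seq in range(count):
        ok, elapsed_ms, detail = tcp_probe(host, port)
        if not ok:
            failures += 1
        state = "open" if ok else "failed"
        print(f"seq {seq}: {detail} [{state}] {elapsed_ms:.3f} ms")
        time.sleep(interval)
    return failures
```

For example, probe_loop("login.microsoftonline.com", 443, 10) makes ten attempts one second apart and returns how many of them failed.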

 

curl

The curl command can simulate a client and fetch data from the target service.
You can use it to test a target application, such as an API.
The postman-echo service echoes the request back, so you can first confirm that curl sends the correct data and then switch the URL to the target API service.

 

curl -v https://login.microsoftonline.com

curl "https://postman-echo.com/get?name=value"
# sample response
{
  "args": {
    "name": "value"
  },
  "headers": {
    "x-forwarded-proto": "https",
    "x-forwarded-port": "443",
    "host": "postman-echo.com",
    "x-amzn-trace-id": "Root=1-6508142c-47c6ba1a402278df3911d701",
    "user-agent": "curl/7.74.0",
    "accept": "*/*"
  },
  "url": "https://postman-echo.com/get?name=value"
}

curl -X POST -H "Content-Type: application/json" -d '{"name": "name1", "email": "test@example.com"}' https://postman-echo.com/post
# sample response
{
  "args": {},
  "data": {
    "name": "name1",
    "email": "test@example.com"
  },
  "files": {},
  "form": {},
  "headers": {
    "x-forwarded-proto": "https",
    "x-forwarded-port": "443",
    "host": "postman-echo.com",
    "x-amzn-trace-id": "Root=1-650814f6-5bfb1beb60fbc2e47a4a81ce",
    "content-length": "46",
    "user-agent": "curl/7.74.0",
    "accept": "*/*",
    "content-type": "application/json"
  },
  "json": {
    "name": "name1",
    "email": "test@example.com"
  },
  "url": "https://postman-echo.com/post"
}

 

To run the curl command repeatedly, save the script below to a file, e.g. curltest.sh, and adjust the parameters to your requirements.

 

fqdn=$1 # get the fqdn parameter
port=$2 # get the port parameter
max=$3  # get the loop count

for ((i = 0 ; i < max ; i++ )); 
do 
   now=$(date +"%Y-%m-%d %T")
   echo "$now Progress $i / $max" # show the progress
   curl -v "https://$fqdn:$port"  # execute the command
   sleep 1                   # sleep 1 second
done

 

To use it

 

bash curltest.sh login.microsoftonline.com 443 10
# curl -v writes its verbose output to stderr, so redirect stderr as well
bash curltest.sh login.microsoftonline.com 443 10 > output.log 2>&1
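
When only the status code and response time of each request matter, the same repeated test can be sketched in Python, assuming Python is available in the image. This is an illustrative alternative to the curl loop, not an equivalent of curl -v: it uses urllib from the standard library, follows redirects by default, and the function name http_probe is hypothetical.

```python
import http.client
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=10.0):
    """Send one GET request; return (status, elapsed_ms, error)."""
    start = time.perf_counter()
    status, error = None, None
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()  # read the body so transfer problems surface too
            status = response.status
    except urllib.error.HTTPError as exc:
        # The server answered with an error status, so the network path works
        status = exc.code
    except (urllib.error.URLError, http.client.HTTPException, OSError) as exc:
        # DNS failure, refused connection, timeout, or a broken response
        error = str(exc)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return status, elapsed_ms, error


def probe_loop(url, count, interval=1.0):
    """Probe repeatedly, print each result, and return the number of failures."""
    failures = 0
    for i in range(count):
        status, elapsed_ms, error = http_probe(url)
        if error:
            failures += 1
        print(f"{i + 1}/{count} status={status} {elapsed_ms:.1f} ms {error or ''}")
        time.sleep(interval)
    return failures
```

For example, probe_loop("https://login.microsoftonline.com", 10) prints one line per request. A request that returns an HTTP error status such as 404 is not counted as a failure here, because the connection itself succeeded.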

 

tcpdump

If the connection test fails and the reason is unclear, you can use tcpdump to capture a network trace.

 

# For Debian/Ubuntu based images, run the commands below
apt-get update
apt-get install tcpdump

# For Alpine based images, run the commands below
apk update
apk add tcpdump

# run the tcpdump command
tcpdump -w /home/site/wwwroot/traffic.pcap

 

tcpdump is now capturing network packets. Open another Kudu SSH session (for example, in a new browser tab) and run the command that reproduces the issue.
Then go back to the first Kudu session and press Ctrl+C to stop the capture.
Open https://<appname>.scm.azurewebsites.net/newui/fileManager to download the traffic.pcap file.

After the download completes, you can use Wireshark to analyze the network trace file.

Windows

Log in to the Kudu site https://<appname>.scm.azurewebsites.net/ and navigate to the "Debug console" -> "PowerShell" menu to open a PowerShell window.

nameresolver.exe

nameresolver is similar to nslookup: it performs a DNS lookup against the DNS server.

 

nameresolver.exe login.microsoftonline.com 
# sample response
Server: 168.63.129.16

Non-authoritative answer:
Name: www.tm.ak.prd.aadg.trafficmanager.net
Addresses: 
	40.126.35.87
	20.190.163.18
	40.126.35.128
	40.126.35.144
	40.126.35.129
	40.126.35.134
	40.126.35.19
	40.126.35.21
Aliases: 
	login.mso.msidentity.com
	ak.privatelink.msidentity.com
	www.tm.ak.prd.aadg.trafficmanager.net

# specify the dns server
nameresolver.exe login.microsoftonline.com 8.8.8.8

 

To run nameresolver repeatedly, you can write a loop that continuously resolves the target domain to check for intermittent issues,
e.g. save the script below to nameresolvertest.ps1

 

param (
    [string]$fqdn, 
    [int]$max = 10
)

for ($i = 0; $i -lt $max; $i++) {
    $now = (Get-Date).ToString("yyyy-MM-dd HH:mm:ss")
    Write-Host "$now Progress $i / $max"
    nameresolver.exe $fqdn 
    Start-Sleep 1
}

 

To use it

 

powershell ./nameresolvertest.ps1 www.google.com 10
powershell ./nameresolvertest.ps1 www.google.com 10 > output.log

 

tcpping.exe

tcpping tests whether a connection to the target service works on a specific port.
Note that the server and port are separated by a colon ":", unlike tcpping on Linux, where they are separated by a space " ".

 

tcpping login.microsoftonline.com:443

# sample response
tcpping login.microsoftonline.com:443
Connected to login.microsoftonline.com:443, time taken: 139ms
Connected to login.microsoftonline.com:443, time taken: 74ms
Connected to login.microsoftonline.com:443, time taken: 62ms
Connected to login.microsoftonline.com:443, time taken: 78ms
Complete: 4/4 successful attempts (100%). Average success time: 88.25ms

# specify the tcpping count
tcpping login.microsoftonline.com:443 -n 10

tcpping xxxxxxxxxx.redis.cache.windows.net:6380

# failed response
tcpping xxxxxxxxxx.redis.cache.windows.net:6380
Connection attempt failed: No such host is known
Connection attempt failed: No such host is known
Connection attempt failed: No such host is known
Connection attempt failed: No such host is known
Complete: 0/4 successful attempts (0%). Average success time: 0ms

# OR

tcpping xxxxxxxxxx.redis.cache.windows.net:6380
Connection attempt failed: Connection timed out.
Connection attempt failed: Connection timed out.
Connection attempt failed: Connection timed out.
Connection attempt failed: Connection timed out.
Complete: 0/4 successful attempts (0%). Average success time: 0ms

 

You could use tcpping to test different services based on their ports

 

tcpping <sqlservername>.database.windows.net:1433
tcpping <mysqlname>.mysql.database.azure.com:3306
tcpping <postgresqlname>.postgres.database.azure.com:5432
tcpping <redisname>.redis.cache.windows.net:6380

 

On Windows, tcpping makes 4 attempts by default. Use the -n parameter to ping the target more times.

For example, the command below calls the target 100 times with tcpping:

 

tcpping www.google.com:443 -n 100
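
With a large -n value, the output is easier to review as a summary. The sketch below assumes the tcpping output was saved to a file and downloaded, and that its lines follow the sample responses shown above ("Connected to ..., time taken: Nms" and "Connection attempt failed: ..."). The function name summarize_tcpping_log and the slow_ms threshold are hypothetical.

```python
import re
from collections import Counter

CONNECTED = re.compile(r"^Connected to (\S+), time taken: (\d+)ms")
FAILED = re.compile(r"^Connection attempt failed: (.+)$")


def summarize_tcpping_log(text, slow_ms=1000):
    """Count successes, slow attempts, and failure reasons in tcpping.exe output."""
    times_ms = []
    failures = Counter()
    for line in text.splitlines():
        line = line.strip()
        connected = CONNECTED.match(line)
        if connected:
            times_ms.append(int(connected.group(2)))
            continue
        failed = FAILED.match(line)
        if failed:
            # Group by reason, ignoring a trailing period
            failures[failed.group(1).rstrip(".")] += 1
    return {
        "attempts": len(times_ms) + sum(failures.values()),
        "succeeded": len(times_ms),
        "slow": sum(1 for t in times_ms if t >= slow_ms),
        "max_ms": max(times_ms, default=None),
        "failures": dict(failures),
    }
```

Note that a file created with ">" in Windows PowerShell is typically UTF-16 encoded, so it may need to be read with open("output.log", encoding="utf-16") before being passed to the function.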

 

curl.exe

In the Windows Kudu PowerShell console, curl is an alias for the PowerShell cmdlet Invoke-WebRequest. Use "curl.exe" in scripts to call the curl executable instead of the cmdlet.

 

curl.exe -v https://login.microsoftonline.com
curl.exe  https://postman-echo.com/get?name=value

 

Note that curl.exe sometimes prints the invocation summary in red text; this is expected.
If the correct content is returned, the connection is good.

Webjob

You can also create a PowerShell script WebJob that runs in the background, which is more stable than running the test from the Kudu console.

Save the example below to a local file, e.g. pswebjob.ps1

 

while($true){
    Write-Output (Get-Date).ToString("yyyy-MM-dd HH:mm:ss")
    Write-Output "Hello World from PowerShell WebJob!"
    curl.exe -v https://www.google.com
    Start-Sleep -Seconds 1
}

 

Note that the script must use Write-Output instead of Write-Host; otherwise it throws a handle error.

To create a continuous WebJob, refer to the document https://learn.microsoft.com/en-us/azure/app-service/webjobs-create#CreateContinuous

After the WebJob is created, you should be able to see the script output in the WebJob logs.

References

Azure App Service virtual network integration troubleshooting guide - Azure | Microsoft Learn

Quickly test connectivity from Azure Website to SQL DB - Microsoft Community Hub
Networking Related Commands for Azure App Services - Microsoft Community Hub
Installing TcpPing on Azure App Service Linux - (azureossd.github.io)
https://gist.github.com/cnDelbert/5fb06ccf10c19dbce3a7 

Last update:
‎Dec 04 2023 02:40 AM