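Because of the potential impact on performance and storage costs, Azure Databricks clusters don't capture networking logs by default. Follow the instructions below if you need a tcpdump capture to investigate networking issues on a cluster. The init script starts the capture when the node boots and keeps it running for the entire lifetime of the cluster, copying the capture file to /dbfs/databricks/tcpdump/<cluster-id>/ once a minute. Note that the script as written starts tcpdump only where DB_IS_DRIVER is TRUE, that is, on the driver; that condition would have to be removed to capture on the workers as well.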
IMPORTANT: Remove the tcpdump init script from the cluster once you have captured the dump, to avoid the ongoing performance impact and additional storage cost.
%scala
dbutils.fs.put("dbfs:/databricks/init-scripts/tcpdump_pypi_repo.sh", """#!/bin/bash
sudo apt update && sudo apt install --yes tcpdump
sleep 5s
set -x
if [[ $DB_IS_DRIVER = "TRUE" ]]; then
MYIP=$(hostname -I | sed 's/ *$//g')
echo "initiating tcp dump"
TCPDUMP_FILE="/tmp/trace_$(date +"%Y%m%d_%H%M")_${MYIP}.pcap"
sudo tcpdump -w $TCPDUMP_FILE -W 1000 -G 1800 -K -n -s256 host files.pythonhosted.org and port 443 &
sleep 15s
echo "initiated tcp dump `ls -ltrh $TCPDUMP_FILE`"
cat <<'EOF' > /tmp/copy_stats.sh
#!/bin/bash
SOURCE_FILE=$1
DB_CLUSTER_ID=$(echo $HOSTNAME | awk -F '-' '{print$1"-"$2"-"$3}')
BASEDIR="/dbfs/databricks/tcpdump/${DB_CLUSTER_ID}"
#BASEDIR="/local_disk0/tcpdump/${DB_CLUSTER_ID}"
# mkdir -p is a no-op when the directory already exists
sudo mkdir -p "${BASEDIR}"
FILESIZE=0
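while true; do
# Treat a missing capture file as size 0 so the comparison below stays numeric
CUR_FILESIZE=$(stat -c%s "$SOURCE_FILE" 2>/dev/null || echo 0)
# Copy only when the capture file has grown since the last check
if [ "$CUR_FILESIZE" -gt "$FILESIZE" ]; then
sudo cp -f "$SOURCE_FILE" "${BASEDIR}/."
fi
FILESIZE=$CUR_FILESIZE
sleep 1m
done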
EOF
chmod a+x /tmp/copy_stats.sh
/tmp/copy_stats.sh $TCPDUMP_FILE &>/tmp/copy_stats.log & disown
fi
""",true)
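Add dbfs:/databricks/init-scripts/tcpdump_pypi_repo.sh to the cluster as a cluster-scoped init script and enable the cluster logging path, then restart the cluster so that the script runs.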
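Before restarting the cluster with the init script attached, it can be worth checking that the file actually landed on DBFS and parses as bash. The following is a minimal sketch, assuming it is run in a %sh notebook cell on a cluster where DBFS is mounted at /dbfs; `check_init_script` is a hypothetical helper and not part of the original procedure. `bash -n` only parses the file and does not execute it.

```shell
#!/bin/bash
# Hypothetical helper: returns 0 when the file exists, is non-empty,
# and passes a parse-only syntax check (bash -n never runs the script).
check_init_script() {
  local script="$1"
  if [ ! -s "$script" ]; then
    echo "missing or empty: $script" >&2
    return 1
  fi
  bash -n "$script"
}

# Assumed FUSE path for dbfs:/databricks/init-scripts/tcpdump_pypi_repo.sh
SCRIPT="/dbfs/databricks/init-scripts/tcpdump_pypi_repo.sh"
if check_init_script "$SCRIPT"; then
  echo "init script looks OK"
else
  echo "init script needs attention"
fi
```

A syntax check cannot tell whether tcpdump will install or start correctly on the node; it only catches a script that was truncated or mangled on the way into DBFS.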
NOTE: You can replace the “host” filter to suit your requirements. For example, to test connectivity from the cluster to your Azure SQL server, replace files.pythonhosted.org with the IP address of the Azure SQL DB (and adjust or drop the port 443 clause). The tcpdump line in the script then becomes:
sudo tcpdump -w $TCPDUMP_FILE -W 1000 -G 1800 -K -n -s256 host <IP address of the Azure SQL DB> &
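To make the filter swap less error-prone, the filter expression can be built from a host and an optional port. This is only a sketch: `build_filter` is a hypothetical helper, 10.0.0.4 is a placeholder address, and port 1433 is assumed here as the usual SQL Server port rather than taken from the original instructions. The snippet prints the resulting tcpdump line instead of running it.

```shell
#!/bin/bash
# Hypothetical helper: prints a tcpdump filter expression for a host
# and, when a second argument is given, a port.
build_filter() {
  local host="$1"
  local port="${2:-}"
  if [ -z "$host" ]; then
    echo "usage: build_filter <host> [port]" >&2
    return 1
  fi
  if [ -n "$port" ]; then
    echo "host ${host} and port ${port}"
  else
    echo "host ${host}"
  fi
}

# Placeholder target; substitute the address you are investigating.
FILTER=$(build_filter "10.0.0.4" 1433)

# Print the line that would replace the tcpdump command in the init script.
echo "sudo tcpdump -w \$TCPDUMP_FILE -W 1000 -G 1800 -K -n -s256 ${FILTER} &"
```

Calling `build_filter files.pythonhosted.org 443` yields the same filter the init script uses by default.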
https://learn.microsoft.com/en-us/azure/databricks/init-scripts/cluster-scoped
https://learn.microsoft.com/en-us/azure/databricks/dev-tools/cli/