A Real Example: How to debug Azure Stream Analytics jobs locally using job diagram
Published Sep 16 2020 05:07 PM
Microsoft

Azure Stream Analytics is a fully managed service for real-time analytics. In addition to the Azure portal experience, we offer developer tools that make development and debugging easier. This blog introduces a new debugging feature in the Azure Stream Analytics tools extension for Visual Studio Code.

 

Have you ever faced a situation where your streaming job produces no results or unexpected results, and you don’t know how to troubleshoot which part went wrong? We are happy to announce the rollout of our newest debugging feature: job diagram debugging in the Visual Studio Code extension for Azure Stream Analytics. This feature brings together job diagrams, metrics, diagnostic logs, and intermediate results to help you quickly isolate the source of a problem. You can not only test your query on your local machine but also connect to live sources such as Event Hub and IoT Hub. The job diagram helps you understand how data flows between steps, and you can view the intermediate result set and metrics of each step to debug issues. You can iterate quickly because each local test of the job takes only seconds.

 

This blog uses a real example to show you how to debug an Azure Stream Analytics job using the job diagram in Visual Studio Code.

 

Note 

This job diagram only shows the data and metrics for local testing in a single node on your own machine. It should not be used for performance and scalability tuning. 

 

Step 1 Install the tools 

vscode-signin.png

 

 

 

Step 2 Open your job in Visual Studio Code 

Skip to Step 3 if you already have your Azure Stream Analytics project open in Visual Studio Code.

 

Open the Azure Stream Analytics job you want to debug in the Query Editor in the Azure portal. From the Open in Visual Studio drop-down, choose Visual Studio Code, then choose Open job in Visual Studio Code. The job is exported as an Azure Stream Analytics project in Visual Studio Code.

 

portal-openinvscode.png

 

 

portal-openinvscode2.png 

Step 3 Run job locally 

Since the credentials are auto-populated, the only thing you need to do is open the script and select Run locally. Make sure input data is being sent to your job’s input sources.

 

vscode-localtesting.png

 

Step 4 Debug using job diagram 

The job diagram, shown in the right-hand window of the editor, shows how data flows from input sources such as Event Hub or IoT Hub, through multiple query steps, to output sinks. You can also view the data and metrics of each query step in its intermediate result set to find the source of an issue.

 

Now let’s look at a real example. We have a job that receives quotes for different stocks. The query filters on a few stocks, but one output does not have any data.

 

We run the job locally against a live input stream from Event Hub. The job diagram shows that the step ‘msftquotes’ does not have data flowing in.

vscode-debugging1.png

 

To troubleshoot, let’s zoom in on the diagram and select the upstream step ‘typeconvertedquotes’ to see if it produces any output. The node shows 3135 output events. The Result tab below also shows that this step outputs data with the symbol ‘MSFT’.

vscode-debugging2.png
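
The same check can also be expressed in the query itself. The sketch below is an assumption, not part of the original job: it adds a temporary statement, placed after the WITH clause, that counts the events leaving ‘typeconvertedquotes’ per symbol in 10-second windows. The output alias debugoutput is hypothetical and would need to be defined in your project.

```sql
-- Hypothetical diagnostic statement; remove it once debugging is done.
-- Counts events per symbol coming out of the upstream step.
SELECT
    symbol,
    COUNT(*) AS eventCount
INTO debugoutput                      -- hypothetical output alias
FROM typeconvertedquotes
GROUP BY symbol, TumblingWindow(second, 10)
```

If ‘MSFT’ appears in this output with a non-zero count, the upstream step is healthy and the problem lies in the filter of the downstream step.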

 

Then we select the step ‘msftquotes’ and navigate to the corresponding script to take a closer look.

vscode-debugging3.png 

Now we find the root cause: there is a typo in the script, where ‘%MSFT%’ is mistakenly typed as ‘%MSFA%’.

MSFTQuotes AS (
    SELECT typeconvertedquotes.* FROM typeconvertedquotes
    WHERE symbol like '%MSFA%'
    AND bidSize > 0
),

 

Let’s fix the typo, stop the job, and run it again.
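
For reference, the corrected step would look like the following sketch, assuming the rest of the query stays unchanged. Only the LIKE pattern differs from the script shown earlier.

```sql
MSFTQuotes AS (
    SELECT typeconvertedquotes.* FROM typeconvertedquotes
    -- Corrected pattern: match symbols containing 'MSFT' (was '%MSFA%')
    WHERE symbol LIKE '%MSFT%'
    AND bidSize > 0
),
```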

 

Look, the data is flowing into step ‘msftquotes’ and the other downstream steps. 

vscode-debugging4.png

 

Other than checking the result for each step, you can also view logs and metrics for the job.  

 

Step 5 Submit to Azure

When local testing is done, submit the job to Azure to run in the cloud environment and further validate the job status in a distributed manner. 

 

We hope you find these new features helpful, and please let us know what capabilities you're looking for with regard to job debugging in Azure Stream Analytics!

 

 

Last update: Sep 23 2020 07:48 AM