Image Analysis in Power BI – Using AI Services Vision API
Published Jan 26 2024 03:16 PM 2,814 Views
Microsoft

In the earlier post image-analysis-in-power-bi-using-ai-vision-insights-1-4, we saw how easy it was to analyze images with Power BI Vision AI Insights, which requires Power BI Premium.
This time we will use the AI Services Vision API, which does not require Power BI Premium but does involve some coding. We will use Python scripting to call the Vision API, and then compare the output differences between the AI Insights method and the Vision API method.

 

This article has the following five sections with step-by-step instructions.

A. Configure Vision AI Services

B. Install Python, Visual Studio Code, the Python extension for VS Code, and the Python packages

C. Run, test, and debug the Python scripts

D. Implement the Python scripts in Power BI

E. Compare output differences

 

A. Configure Vision AI Services

1. In the Azure Portal, provision Computer Vision under AI Services (previously called Cognitive Services).

VisionAPI_provision_1.jpg VisionAPI_provision_2.jpg

2. Note down the Endpoint and the Key, as these will be required in the script that calls the Vision API.

VisionAPI_Keys.jpg
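As an aside, rather than pasting the key directly into a script, the endpoint and key can be read from environment variables. The variable names below (VISION_ENDPOINT, VISION_KEY) are illustrative, not part of this post; a minimal sketch:

```python
import os

def vision_credentials():
    """Read the endpoint and key from environment variables, falling back to
    placeholders that make a missing configuration obvious."""
    endpoint = os.environ.get("VISION_ENDPOINT", "<ADD Computer Vision Endpoint>")
    key = os.environ.get("VISION_KEY", "<ADD COMPUTER VISION SUBSCRIPTION KEY HERE>")
    return endpoint, key

endpoint, key = vision_credentials()
```

This keeps the key out of the script file and out of any screenshots or blog posts you may later share.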

 

B. Install Python, Visual Studio Code, the Python extension for VS Code, and the Python packages

1. Install Python 

2. Install Visual Studio Code 

3. Install Python extension for Visual Studio Code 

4. Open Visual Studio Code and create a new Python (.py) file. This will also open a Python terminal section within VS Code.

5. Install the Python packages from the terminal:

pip install pandas 
pip install azure-cognitiveservices-vision-computervision  

For example, pandas can be installed as shown below:

VSCode_py_terminal.jpg
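Whether the installs succeeded can be verified with a small check script; the helper function below is mine, not from this post, and only reports what is importable in the current environment:

```python
import importlib.util

def package_status(names):
    """Return a dict mapping each package name to True if it is importable."""
    status = {}
    for name in names:
        try:
            status[name] = importlib.util.find_spec(name) is not None
        except ModuleNotFoundError:
            # A parent package (e.g. 'azure') is missing entirely.
            status[name] = False
    return status

print(package_status(["pandas", "azure.cognitiveservices.vision.computervision"]))
```

Any package reported as False needs the corresponding `pip install` re-run in the VS Code terminal.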

 

C. Run, test, and debug the Python scripts

  • Power BI provides the Python script an input variable called dataset, a pandas DataFrame containing the rows to be transformed (in our case, by applying the Vision API). This means that if we develop and test the script outside of Power BI, we need to create a DataFrame with some sample data.
  • Important: you must include error handling in your Python script, so that if one row encounters an error, the script continues to process the remaining rows.
  • We will have two separate scripts: one to get the description and another to get the tags identified in the image.
  • It is easier to develop, test, and debug your Python script within Visual Studio Code than in Power BI.

    1. Update the Computer Vision endpoint URL in the code.

    2. Update the Computer Vision key according to your environment.

  • The script below expects the DataFrame to have a column called "ImageURL".

Description Script

#'dataset' holds the PowerBI input data for this script
import pandas as pd
data = {'ImageURL':['https://fastly.picsum.photos/id/1026/367/267.jpg?hmac=prF35YFx9-apfzpe3aJ8ukBG5LdXc7jqoT7g8Xlid1M'
                     #Second image is deliberately set to an invalid URL to test exception handling
                  , 'https://i.picsum.photos/id/1038/367/267.jpg?hmac=WQmi_BZU714k90ytKa6huD9mXXF0Hp7IntkL2yEu5dU'     
                  , 'https://fastly.picsum.photos/id/1035/367/267.jpg?hmac=Bq8Xh0z-VRVWm9RfovGy3iXDBaVyNPVGHKOganDNmwI'
                  , 'https://fastly.picsum.photos/id/1048/367/267.jpg?hmac=1g6RSHAu1AMTkONGSSaHEOUYKH5couNnl5ZxY8ySxQM']
       }
dataset = pd.DataFrame(data)
#above section must be removed when pasted into PowerBI. 
#--------------------------------------------------------------------------------------------
endpoint = '<ADD Computer Vision Endpoint URL here>'  # e.g. https://<your-resource>.cognitiveservices.azure.com/
key =      '<ADD COMPUTER VISION SUBSCRIPTION KEY HERE>'
max_descriptions = 3

from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes
from msrest.authentication import CognitiveServicesCredentials


# Set credentials
credentials = CognitiveServicesCredentials(key)
# Create client
client = ComputerVisionClient(endpoint, credentials)

description_count=[]
description_1    =[]
confidence_1     =[]
errors           =[]
json_list        =[]

for i in dataset.index:
    try:
        json_str = ""
        #call the AI Vision API for image description analysis
        analysis = client.describe_image(dataset['ImageURL'][i], max_descriptions, "en")
        description_count.append(len(analysis.captions))

        #parse and assign the result to the arrays if the analysis returned captions
        if len(analysis.captions) > 0:
            description_1.append(analysis.captions[0].text)
            confidence_1.append(analysis.captions[0].confidence)
            errors.append(None)
        else:
            #keep the arrays in sync even when no captions are returned
            description_1.append(None)
            confidence_1.append(None)
            errors.append(None)
        for caption in analysis.captions:
            json_str += ("," if json_str else "") + '{"description": "' + caption.text + '", "confidence":"' + str(caption.confidence) + '", "error":null}'

    except Exception as e:
        #record the error for this row and continue with the remaining rows
        description_count.append(None)
        description_1.append(None)
        confidence_1.append(None)
        errors.append(str(e))
        json_str += ("," if json_str else "") + '{"description": null, "confidence":null, "error":"' + str(e) + '"}'
    finally:
        json_list.append(json_str)

#assign the arrays as columns of the dataset
dataset['description_count'] = description_count 
dataset['description_1'] = description_1 
dataset['confidence_1'] = confidence_1 
dataset['description_json'] = json_list 
dataset['error'] = errors 
#-------------------------------------------------------
#below section must be removed when pasted into PowerBI. 
print (dataset)

Description script result - notice in the screenshot below the error message returned for the second row, and that the script continues to process the remaining rows.

 

Description_results.jpg
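As a usage note, the description_json column holds a run of comma-separated JSON objects, one per caption; wrapping the string in brackets turns it into a valid JSON array for downstream parsing. The sample value below is hypothetical:

```python
import json

# Hypothetical sample of one row's description_json value: comma-separated
# JSON objects, one per caption returned by the API.
description_json = (
    '{"description": "a castle on a hill", "confidence":"0.92", "error":null}'
    ',{"description": "a large stone building", "confidence":"0.77", "error":null}'
)

# Wrapping the fragments in brackets yields a valid JSON array.
captions = json.loads("[" + description_json + "]")
for c in captions:
    print(c["description"], c["confidence"])
```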

 

Tags Script

#'dataset' holds the PowerBI input data for this script
import pandas as pd
data = {'ImageURL':['https://fastly.picsum.photos/id/1026/367/267.jpg?hmac=prF35YFx9-apfzpe3aJ8ukBG5LdXc7jqoT7g8Xlid1M'
                     #Second image is deliberately set to an invalid URL to test exception handling
                  , 'https://i.picsum.photos/id/1038/367/267.jpg?hmac=WQmi_BZU714k90ytKa6huD9mXXF0Hp7IntkL2yEu5dU'     
                  , 'https://fastly.picsum.photos/id/1035/367/267.jpg?hmac=Bq8Xh0z-VRVWm9RfovGy3iXDBaVyNPVGHKOganDNmwI'
                  , 'https://fastly.picsum.photos/id/1048/367/267.jpg?hmac=1g6RSHAu1AMTkONGSSaHEOUYKH5couNnl5ZxY8ySxQM']
                 # ,'Id':[1, 2, 3, 4]
       }
dataset = pd.DataFrame(data)
#above section must be removed when pasted into PowerBI. 
#--------------------------------------------------------------------------------------------
endpoint = '<ADD Computer Vision Endpoint URL here>'  # e.g. https://<your-resource>.cognitiveservices.azure.com/
key =      '<ADD COMPUTER VISION SUBSCRIPTION KEY HERE>'

from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes
from msrest.authentication import CognitiveServicesCredentials
import pandas as pd 

def Tags(key, endpoint,url):
    df = pd.DataFrame(columns=['tag', 'confidence','error'])
    credentials = CognitiveServicesCredentials(key)
    client = ComputerVisionClient(endpoint, credentials)
    try:
        analysis = client.analyze_image(url,visual_features=[VisualFeatureTypes.tags]) 
        for tag in analysis.tags: 
            df.loc[df.shape[0]]=[tag.name, tag.confidence, None] #add row at end of df
    except Exception as e:
        df.loc[df.shape[0]]=[None, None, str(e)]    
    return (df)

#cross apply url to function results
dataset = pd.concat([pd.concat([Tags(key,endpoint, r['ImageURL']), pd.DataFrame(r).T], axis=1).bfill().ffill() for _, r in dataset.iterrows()], ignore_index=True)


#-------------------------------------------------------
#below section must be removed when pasted into PowerBI. 
print (dataset)

Tags script result - notice that the script can process the remaining rows even if an error is encountered.

Tags_result.jpg
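The single-line "cross apply" at the end of the Tags script is dense. The same pattern can be illustrated with a toy stand-in for the Tags() function; the names below are illustrative, not from the post:

```python
import pandas as pd

# Toy stand-in for Tags(): returns several result rows per input value.
def expand(value):
    return pd.DataFrame({"tag": [f"{value}-a", f"{value}-b"],
                         "confidence": [0.9, 0.8]})

source = pd.DataFrame({"ImageURL": ["img1", "img2"], "ImageId": [1, 2]})

# For each source row: concatenate the function's output side-by-side with the
# row's own columns (pd.DataFrame(r).T is the row as a one-row frame), then
# bfill/ffill so every result row keeps the source row's key columns.
result = pd.concat(
    [pd.concat([expand(r["ImageURL"]), pd.DataFrame(r).T], axis=1).bfill().ffill()
     for _, r in source.iterrows()],
    ignore_index=True,
)
print(result)
```

Each input row fans out into multiple tag rows, each carrying its ImageId, which is exactly what the relationship to the Images table needs later on.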

 

D. Implement the Python script in Power BI

1. Open the Power BI file (.PBIX) in the Power BI Desktop that was developed in the "Image Analysis in Power BI – Using AI Vision Insights" blog series. 

2. Open Power Query Editor by clicking on the "Transform data" button found in the Home ribbon toolbar. 

3. Click on Manage Parameters, add two parameters called VisionAPI_Key and VisionAPI_Endpoint, and set their current values.

Manage_Parameters..jpg

4. Create a new query by referencing the Images query.  

ImageQuery_References.jpg

5. Rename the new query to "Images_Python" 

6. Add Python Transform Script 

Python_Transform.jpg

7. Copy and Paste the Description Python Script 

a) Remove the top and bottom sections as noted in the Python script (section C).

Python_Transform_Script.jpg

b) Click on the Table hyperlink to expand the dataset to rows and columns.

select_table.jpg

c) Rename and reorder the columns as shown below:

"ImageId", "Description", "Confidence", "Descriptions_Count", "Descriptions_json", "Error" 
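If preferred, the same rename and reorder can be done inside the Python script itself rather than in the Power Query UI. A sketch against a hypothetical one-row sample of the script's output:

```python
import pandas as pd

# Hypothetical one-row sample mirroring the description script's output columns.
dataset = pd.DataFrame({
    "ImageId": [1],
    "description_1": ["a castle on a hill"],
    "confidence_1": [0.92],
    "description_count": [3],
    "description_json": ['{"description": "a castle on a hill"}'],
    "error": [None],
})

# Rename to the friendly column names, then reorder to the final layout.
dataset = dataset.rename(columns={
    "description_1": "Description",
    "confidence_1": "Confidence",
    "description_count": "Descriptions_Count",
    "description_json": "Descriptions_json",
    "error": "Error",
})[["ImageId", "Description", "Confidence", "Descriptions_Count",
    "Descriptions_json", "Error"]]
print(list(dataset.columns))
```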

 

8) Modify the Power Query script to use parameters

a) Click on the Advanced Editor under the Home Ribbon. 

b) Replace the hard-coded endpoint and key values with the parameter names.

The resulting Power Query script will be as follows:

 

let
    Source = Images,
    #"Run Python script" = Python.Execute("endpoint = '" & Text.From(VisionAPI_Endpoint) & "'  #(lf)key =      '" & Text.From(VisionAPI_Key) & "'#(lf)max_descriptions = 3#(lf)#(lf)#pip install azure-cognitiveservices-vision-computervision #(lf)#pip install pandas#(lf)from distutils.log import error#(lf)from azure.cognitiveservices.vision.computervision import ComputerVisionClient#(lf)from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes#(lf)from msrest.authentication import CognitiveServicesCredentials#(lf)import pandas as pd#(lf)import json#(lf)#(lf)credentials = CognitiveServicesCredentials(key)#(lf)client = ComputerVisionClient(endpoint, credentials)#(lf)#(lf)description_count=[]#(lf)description_1    =[]#(lf)confidence_1     =[]#(lf)errors           =[]#(lf)json_list        =[]#(lf)#(lf)for i in dataset.index:#(lf)    try:#(lf)        json=""""#(lf)        analysis = client.describe_image(dataset['ImageURL'][i], max_descriptions, ""en"")#(lf)        description_count.append(len(analysis.captions)) #(lf)        if len(analysis.captions)>0: #(lf)            description_1.append(analysis.captions[0].text +""."" )#(lf)            confidence_1.append(analysis.captions[0].confidence)#(lf)            errors.append(None)#(lf)        for caption in analysis.captions:#(lf)            json +=  ('' if len(analysis.captions)==1 else "","")  + '{""description"": ""'+ caption.text +'"", ""confidence"":""'+ str(caption.confidence) +'"", ""error"":null}'   #(lf)    except Exception as e:#(lf)        description_count.append(None) #(lf)        description_1.append(None)#(lf)        confidence_1.append(None)#(lf)        errors.append(str(e))    #(lf)        json +=  ('' if len(analysis.captions)==1 else "","")  + '{""description"": null , ""confidence"":null, ""error"":""'+ str(e) +'""}'   #(lf)    finally:#(lf)        json_list.append(json)        #(lf) #(lf)dataset['Descriptions_Count'] = description_count #(lf)dataset['Description'] = description_1 
#(lf)dataset['Confidence'] = confidence_1 #(lf)dataset['Descriptions_json'] = json_list #(lf)dataset['Error'] = errors #(lf)",[dataset=Source]),
    dataset = #"Run Python script"{[Name="dataset"]}[Value],
    #"Changed Type" = Table.TransformColumnTypes(dataset,{{"ImageId", Int64.Type}, {"ImageURL", type text}, {"WebURL", type text}, {"Descriptions_Count", Int64.Type}}),
    #"Replaced Value" = Table.ReplaceValue(#"Changed Type","",null,Replacer.ReplaceValue,{"Description", "Confidence", "Error"}),
    #"Reordered Columns" = Table.ReorderColumns(#"Replaced Value",{"ImageId", "Description", "Confidence", "Descriptions_Count", "Descriptions_json", "Error"})
in
    #"Reordered Columns"

9) Create a new query by referencing the Images query. This will be used for the Tags script.

a) Rename the query to ImageTags_Python.

b) Click on the Advanced Editor from the Home ribbon and paste the following Power Query script.

let
    Source = Images,
    #"Run Python script" = Python.Execute("endpoint = '" & Text.From(VisionAPI_Endpoint) & "'#(lf)key =      '" & Text.From(VisionAPI_Key) & "'#(lf)max_descriptions = 3#(lf)#(lf)#(lf)from azure.cognitiveservices.vision.computervision import ComputerVisionClient#(lf)from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes#(lf)from msrest.authentication import CognitiveServicesCredentials#(lf)import pandas as pd #(lf)#(lf)def Tags(key, endpoint,url):#(lf)    df = pd.DataFrame(columns=['Tag', 'Confidence','Error'])#(lf)    credentials = CognitiveServicesCredentials(key)#(lf)    client = ComputerVisionClient(endpoint, credentials)#(lf)    try:#(lf)        analysis = client.analyze_image(url,visual_features=[VisualFeatureTypes.tags]) #(lf)        for tag in analysis.tags: #(lf)            df.loc[df.shape[0]]=[tag.name, tag.confidence, None] #add row at end of df#(lf)    except Exception as e:#(lf)        df.loc[df.shape[0]]=[None, None, str(e)]    #(lf)    return (df)#(lf)#(lf)#cross apply url to function results#(lf)dataset = pd.concat([pd.concat([Tags(key,endpoint, r['ImageURL']), pd.DataFrame(r).T], axis=1).bfill().ffill() for _, r in dataset.iterrows()], ignore_index=True)#(lf)#(lf)",[dataset=Source]),
    dataset = #"Run Python script"{[Name="dataset"]}[Value],
    #"Changed Type" = Table.TransformColumnTypes(dataset,{{"Confidence", type number}, {"ImageId", Int64.Type}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"ImageURL", "WebURL"}),
    #"Removed Duplicates" = Table.Distinct(#"Removed Columns"),
    #"Reordered Columns" = Table.ReorderColumns(#"Removed Duplicates",{"ImageId", "Tag", "Confidence","Error"}),
    #"Sorted Rows" = Table.Sort(#"Reordered Columns",{{"ImageId", Order.Ascending},{"Confidence", Order.Descending} })
in
    #"Sorted Rows"

10) Create the following two relationships. The third relationship was already created in the previous blog series.

a) 

python_relationship.jpg

b) python_relationship2.jpg

 

11) Create a Power BI report visual using the Images_Python table.

python_visual.jpg

 

E. Compare output differences between Vision AI Insights and the AI Services Vision API

 

Notice there is a difference in the number of tags and the Confidence % between the two approaches.

compare1.jpg

 

The AI Services Vision API also supports description extraction.

Notice that it was able to identify which castle is in the picture from known landmarks.

compare2.jpg   

The AI Services Vision API also supports two-word tag extraction.

two_word_tags.jpg

 

Conclusion

If you are looking to extract a sentence or description of what an image is about, then the only option is to use the AI Services Vision API within a Python/R transformation script. What I have observed so far is that the description has high accuracy, possibly better than the individual tag accuracy. The Power BI report created during this blog post is attached as a zip file towards the bottom of the post.

 

Feature | Power BI Vision Insights | AI Services (Cognitive Services) Vision API
Is Power BI Premium required? | Yes | No
Separate Computer Vision provisioning required in the Azure Portal? | No | Yes
Script development in Python or R required? | No | Yes
Image tag extraction | Only one-word tag extraction supported | Also supports two-word tag extraction
Description extraction | No | Yes - can identify known landmarks
 

 
