Introducing Azure AI Content Safety: Helping Organizations to Maintain Safe Online Spaces
Published May 23, 2023

We are thrilled to introduce the Azure AI Content Safety service in Public Preview.

 

Note: Customers can begin using Azure AI Content Safety today. Billing for all Azure AI Content Safety usage begins June 1, 2023. It is priced at $1.50 per 1K images, and $0.75 per 1K Text Records.

 

As the volume of User- and AI-generated content continues to expand, it is crucial for organizations to responsibly manage and moderate this information. The presence of discriminatory, harmful, or dangerous material can lead to users losing trust in brands and platforms, which can ultimately result in financial losses for businesses. Moreover, unsafe content has the potential to tarnish brand reputation. Azure AI Content Safety enables enterprises across various industries to harness the power of Responsible AI, facilitating the creation of secure online spaces and fostering a sense of community.

Our content classification models are engineered to identify and flag text and image content containing hate speech, violence, sexually explicit material, and self-harm, so that users can engage with online platforms without compromising their safety. When the models detect harmful content, they assign a severity level, empowering businesses to prioritize and review flagged material. Azure AI Content Safety is an invaluable tool for companies operating social media platforms or products with social functionalities, as it can effectively monitor content in posts, threads, chats, and more. Additionally, the gaming industry can benefit from Azure AI Content Safety by using it to oversee social features such as live streaming and multiplayer game chats. The solution also detects risks in user-generated content, including avatars, usernames, and uploaded images.

Azure AI Content Safety models boast reliability and efficacy, as evidenced by their integration into other Azure AI products for monitoring both user and AI-generated content. Azure OpenAI and Bing, for example, utilize Azure AI Content Safety models in their content filters to block prompts and generated content that violates content management policies.

 

Azure AI Content Safety exposes Text and Image APIs and a Studio experience.

 

  • Text API
    • Multi-Class: classifies Hate, Sexual, Violence, and Self-Harm content
    • Multi-Severity: provides a severity level for each category
    • Multi-Lingual: English, Spanish, French, Japanese, Portuguese, German, Italian, and Chinese officially supported
    • Custom blocklist support
  • Image API (a minimal request sketch follows this list)
    • Multi-Class: classifies Hate, Sexual, Violence, and Self-Harm content
    • Multi-Severity: provides a severity level for each category
  • Studio
    • Test the Text and Image APIs' severity levels against your own data to match your policies
    • Monitor your resource to understand harmful-content distributions and block rates
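
The walkthrough below focuses on the Text API, but the Image API follows the same request pattern. The snippet below is a minimal, untested sketch that assumes the image endpoint mirrors the text endpoint used later in this post and accepts a base64-encoded image in the request body; confirm the exact path, API version, and schema in the official API reference before relying on it.

import base64
import json

import requests

# Assumed endpoint shape -- mirrors the text endpoint shown in Step 6 below.
url = "<Endpoint>/contentsafetymoderator/image:analyze?api-version=2023-04-30-preview"
headers = {
    'Ocp-Apim-Subscription-Key': '<enter_your_subscription_key_here>',
    'Content-Type': 'application/json'
}

# Read a local image and base64-encode it for the request body (the field name is an assumption).
with open("avatar.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

payload = json.dumps({"image": {"content": image_data}})

response = requests.post(url, headers=headers, data=payload)
print(response.status_code)
print(response.text)  # severity per category: Hate, Sexual, SelfHarm, Violence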

Now we will walk you through how to test the API in the Studio and how to call the API in your application.

 

Step 1: Determine the content moderation task

To begin, choose the task you want to accomplish based on your needs. There are two options: moderating inappropriate images or moderating text-based content. If your main concern is ensuring that user-generated images are appropriate for a safe environment, use Azure AI Content Safety Studio's Image Moderation capabilities. If you want to ensure that user-generated comments, chats, and other text content are free from harmful material, use the Text Moderation features.

 

In this blog, we will guide you through the process of analyzing text content and assigning it categories and severity scores, so you can take appropriate action and maintain a respectful and inclusive online community.

 

[Screenshot: choosing a moderation task in Azure AI Content Safety Studio]

 

Step 2: Choose a text moderation example

Before proceeding, please be aware that some of the content in each sample may be offensive. To demonstrate the capabilities of Azure AI Content Safety Studio's Text Moderation, you can choose from the examples already provided in the Studio:

 

[Screenshot: sample texts available for Text Moderation in Azure AI Content Safety Studio]

 

To demonstrate the multi-lingual capabilities of Azure AI Content Safety Studio's Text Moderation API, we will consider the following example that contains multiple languages in one sentence:

"Painfully twist his arm then punch him in the face jusqu’à ce qu’il perde connaissance."

The Text Moderation API is designed to detect harmful content across multiple languages. The model has been trained on more than 100 languages and is designed to support English, Spanish, German, Chinese, Italian, Portuguese, Japanese, and French. This feature allows the model to accurately identify and moderate content in which multiple languages are present simultaneously, as is often the case in social media and gaming communities.

 

The Text API provided by Azure AI Content Safety offers a unique advantage over many other content moderation APIs: It eliminates the need for customers to specify the input language of the text prior to moderation. This is due to the model's inherent capability to comprehend text containing multiple languages, which enables it to accurately identify harmful content across various linguistic contexts.

 

Step 3: Configure Filters and Submit the chosen text to Azure AI Content Safety Studio

Azure AI Content Safety Studio provides severity levels for each category to help you better understand the potential risks associated with different types of content. These severity levels range from 'SAFE' to 'HIGH' for each category.

To customize the moderation process on your platform, you can set severity thresholds for each content category. This allows you to define the level of moderation required for different types of content that best matches your needs and policies.

[Screenshot: configuring severity threshold filters in Azure AI Content Safety Studio]

As an example, you might choose to set the following severity thresholds:

 

Violence: LOW, Self-harm: LOW, Sexual: LOW, Hate: LOW

 

After setting your desired thresholds, you can run a test to see how the moderation results change based on your chosen settings.

To analyze the text, submit the chosen text to the Azure AI Content Safety API. The API will process the text and return a classification across four categories, Hate, Sexual, Violence, and Self-Harm, along with a severity level for each category.

 

Step 4: Review the moderation results

You can now review the assigned labels and severity level. Based on these results, you can determine if the content requires further action, such as removal, flagging, or user notification.

 

[Screenshot: moderation results showing category labels and severity levels]

 

The example we used is classified as Violence: MEDIUM, so it does not pass the check we set up, because the Violence threshold was set to LOW.
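
To reproduce this kind of check in your own application, you can map each category's returned severity to the Studio's SAFE/LOW/MEDIUM/HIGH labels and compare it against the thresholds you have chosen. The snippet below is a minimal sketch, assuming the severity values 0, 2, 4, and 6 correspond to SAFE, LOW, MEDIUM, and HIGH; the results dictionary simply hard-codes the scores from this example.

# Assumed mapping of numeric severities to the labels shown in the Studio.
SEVERITY_LABELS = {0: "SAFE", 2: "LOW", 4: "MEDIUM", 6: "HIGH"}

# Thresholds chosen in Step 3: content passes only if it does not exceed LOW.
thresholds = {"Hate": 2, "SelfHarm": 2, "Sexual": 2, "Violence": 2}

# Severities returned for the example sentence (Violence came back as MEDIUM).
results = {"Hate": 0, "SelfHarm": 0, "Sexual": 0, "Violence": 4}

for category, severity in results.items():
    label = SEVERITY_LABELS.get(severity, str(severity))
    verdict = "pass" if severity <= thresholds[category] else "flag for review"
    print(f"{category}: {label} -> {verdict}")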

 

Step 5: View the Code and Test on a Larger Set of Data

As a final step, consider evaluating the Content Safety model on a more extensive dataset to gauge its effectiveness with larger quantities of data. To do this, you can use the "Test with a large dataset" option located at the top of the page. By testing with hundreds or even thousands of records, you can examine the severity level for each record and view the risk distribution across the categories. This provides a more comprehensive analysis than a single test and can help you tune the severity thresholds to match your content policies.

 

[Screenshot: testing with a large dataset in Azure AI Content Safety Studio]
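
Outside the Studio, a similar bulk evaluation can be scripted by looping over your own records and tallying the severity distribution per category. The sketch below is illustrative: it reuses the request shape from Step 6 (with the same endpoint and key placeholders), and the sample records are stand-ins for your own dataset.

import json
from collections import Counter

import requests

URL = "<Endpoint>/contentsafetymoderator/text:analyze?api-version=2023-04-30-preview"
HEADERS = {
    'Ocp-Apim-Subscription-Key': '<enter_your_subscription_key_here>',
    'Content-Type': 'application/json'
}
RESULT_KEYS = {"Hate": "hateResult", "SelfHarm": "selfHarmResult",
               "Sexual": "sexualResult", "Violence": "violenceResult"}

def analyze_text(text):
    # Calls the Text API (same request as Step 6) and returns a category -> severity dict.
    body = json.dumps({"text": text, "categories": list(RESULT_KEYS)})
    result = requests.post(URL, headers=HEADERS, data=body).json()
    return {c: result.get(key, {}).get("riskLevel", 0) for c, key in RESULT_KEYS.items()}

# Replace with your own dataset of user-generated records.
records = ["sample record 1", "sample record 2"]

distribution = {c: Counter() for c in RESULT_KEYS}
for record in records:
    for category, severity in analyze_text(record).items():
        distribution[category][severity] += 1

for category, counts in sorted(distribution.items()):
    print(category, dict(counts))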

 

Step 6: Use the API in your Application

Once you have validated the model using both sample data and your own dataset, you may integrate the service using the Azure AI Content Safety Text API.

 

We will cover the following steps:

  1. Set up your Azure AI Content Safety API.
  2. Replace placeholders in the code with your API credentials.
  3. Configure the content to be analyzed.
  4. Send the request to the API and get the response.
  5. Understand the API response.

Step A: Set up your Azure AI Content Safety API
Before you begin, you will need an Azure account and an Azure AI Content Safety resource. Follow the instructions in the Azure AI Content Safety documentation to create your API endpoint and obtain your subscription key.

Step B: Replace placeholders in the code with your API credentials
In the code snippet provided, replace <Endpoint> with your Azure AI Content Safety API endpoint and <enter_your_subscription_key_here> with your subscription key:

 

 

 

import json

import requests

url = "<Endpoint>/contentsafetymoderator/text:analyze?api-version=2023-04-30-preview"
headers = {
    'Ocp-Apim-Subscription-Key': '<enter_your_subscription_key_here>',
    'Content-Type': 'application/json'
}
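
Hard-coding the key is fine for a quick test, but beyond that you may prefer to load the endpoint and key from environment variables; the variable names below are illustrative:

import os

# Illustrative environment variable names -- set these in your shell or deployment
# configuration instead of committing the key to source control.
endpoint = os.environ["CONTENT_SAFETY_ENDPOINT"]
subscription_key = os.environ["CONTENT_SAFETY_KEY"]

url = f"{endpoint}/contentsafetymoderator/text:analyze?api-version=2023-04-30-preview"
headers = {
    'Ocp-Apim-Subscription-Key': subscription_key,
    'Content-Type': 'application/json'
}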

 

 

 

Step C: Configure the content to be analyzed
Specify the text content that you want to analyze for unsafe elements:

 

 

 

payload = json.dumps({  
    "text": "Painfully twist his arm then punch him in the face jusqu’à ce qu’il perde connaissance",  
    "categories": [  
        "Hate",  
        "Sexual",  
        "SelfHarm",  
        "Violence"  
    ]  
})  

 

 

 

Step D: Send the request to the API and get the response
Make a POST request to the Azure AI Content Safety API with the configured payload and headers:

 

 

 

response = requests.request("POST", url, headers=headers, data=payload)  

 

 

 

Print the response's status code, headers, and text:

 

 

print(response.status_code)  
print(response.headers)  
print(response.text)  
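
In practice, you will usually want to fail fast on HTTP errors and work with the parsed JSON instead of the raw text; a minimal sketch:

# Raise an exception on 4xx/5xx responses (for example, an invalid key or malformed body).
response.raise_for_status()

# Parse the JSON body so individual category results can be inspected programmatically.
result = response.json()
print(result["violenceResult"]["riskLevel"])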

 

 

 

Step E: Understand the API response
The API returns a JSON object containing the analysis results for each of the specified content categories (Hate, Sexual, SelfHarm, and Violence). Each category result includes a severity level (returned as riskLevel in this preview version) ranging from 0 to 6, where higher values indicate higher severity.

Sample response:

 

 

{  
    "blocklistMatchResults": [],  
    "hateResult": {  
        "category": "Hate",  
        "riskLevel": 0  
    },  
    "selfHarmResult": {  
        "category": "SelfHarm",  
        "riskLevel": 0  
    },  
    "sexualResult": {  
        "category": "Sexual",  
        "riskLevel": 0  
    },  
    "violenceResult": {  
        "category": "Violence",  
        "riskLevel": 4  
    }  
}  

 

 

In this example, the text "Painfully twist his arm then punch him in the face jusqu’à ce qu’il perde connaissance" has a severity level of 4 in the "Violence" category, while the other categories have a severity level of 0, matching the results shown in the Azure AI Content Safety Studio.

By following this tutorial, you have successfully used the provided code to analyze text content for unsafe elements using the Azure AI Content Safety API.

 

Get Started Today

Customers can begin using Azure AI Content Safety today. Billing for all Azure AI Content Safety usage begins June 1, 2023. It is priced at $1.50 per 1K images, and $0.75 per 1K Text Records.

 

Azure AI Content Safety is a powerful tool that enables content flagging for industries such as Media & Entertainment, Gaming, and Social Media, as well as others that require safety, security, and digital content management.

 

We eagerly anticipate seeing your innovative implementations.

 
