Introduction
Co-op Translator is a Python package that streamlines the translation of project documentation into multiple languages. By combining Large Language Models (LLMs) with Azure AI Services, it gives developers and organizations seeking to enhance their global outreach and collaboration a practical way to do so.
A Journey with Co-op Translator to Share Learning Resources Worldwide
In this blog, we will walk you through setting up and using Co-op Translator, a tool that automates the translation of Markdown files and images. I used it to translate my English presentation materials from an MLSA session into Punjabi (pa) and Hindi (hi), so that students can access the content in their preferred language. You can view the translated results in my GitHub repository. This guide covers integrating Azure services for multilingual support, configuring the project environment, and using the Co-op Translator Python package to translate both Markdown files and images.
Here's an example from my GitHub documentation project, showing my presentation materials translated into Hindi.
What is Co-op Translator?
Co-op Translator is an open-source tool that automates the translation of Markdown files, and of images containing embedded text, into multiple languages. Powered by Azure AI Services, it streamlines the traditionally time-consuming translation process, so you can make your projects globally accessible with minimal manual effort.
Key features
- Translates both Markdown files and text embedded in images.
- Supports multiple languages simultaneously.
- Leverages Azure Computer Vision and Azure OpenAI for high-quality translations.
- Integrates easily into your existing workflows.
Purpose of Translating the Repository
The primary purpose of translating the repository is to democratize access to project documentation. By making information available in multiple languages, Co-op Translator aims to facilitate collaboration among developers, educators, and students from diverse linguistic backgrounds. This initiative not only improves accessibility but also fosters an inclusive environment for learning and development.
Overview of the Repository
The Co-op Translator repository serves as a comprehensive resource for users interested in implementing automated translation solutions. It includes:
- Source Code: The core functionality of the translation tool.
- Documentation: Guides and tutorials for setting up and using the translator.
- Examples: Sample projects demonstrating how to integrate translation features into existing applications.
- Tools: Utilities for managing translations and analyzing costs.
Description of the Repository’s Purpose
The repository is designed to empower developers with the tools necessary to automate the translation of their documentation. This includes not only the translation process itself but also best practices for maintaining high-quality translations and managing multilingual content effectively.
Target Audience: Students and Educators
Co-op Translator is particularly beneficial for students and educators who often require access to multilingual documentation for their studies and projects. By providing translated resources, the repository supports educational initiatives and promotes inclusiveness in learning environments.
Setting Up Azure Resources
First, I created the necessary Azure resources for the Co-op Translator:
Creating an Azure Account
If you don't already have an Azure account, you'll need to create one.
- Navigate to the Azure Sign Up page.
- Select Try Azure for free or Pay as you go.
- Follow the on-screen instructions to create your account.
- Provide your details and contact information.
- Verification: You'll need to verify your identity using a credit card or phone number.
Creating an Azure Computer Vision Resource
- Sign in to the Azure Portal.
- Type computer vision in the search bar at the top of the portal page and select Computer vision from the options that appear.
- Select + Create from the navigation menu.
- Perform the following tasks:
  - Select your Azure Subscription.
  - Select the Resource group to use (create a new one if needed).
  - Select the Region you'd like to use.
  - Enter Name. It must be a unique value.
  - Select the Pricing tier you'd like to use.
- Select Review + Create.
- Select Create.
Creating an Azure OpenAI Resource
- Type azure openai in the search bar at the top of the portal page and select Azure OpenAI from the options that appear.
- Select + Create from the navigation menu.
- Perform the following tasks:
  - Select your Azure Subscription.
  - Select the Resource group to use (create a new one if needed).
  - Select the Region you'd like to use.
  - Enter Name. It must be a unique value.
  - Select the Pricing tier you'd like to use.
- Select Next to move to the Network page.
- Select a network security Type you'd like to use.
- Select Next to move to the Tags page.
- Select Next to move to the Review + submit page.
- Select Create.
Deploying Azure OpenAI Models
- Navigate to the Azure OpenAI resource that you created.
- Select Go to Azure OpenAI Studio from the navigation menu.
- Inside Azure OpenAI Studio, select Deployments from the left side tab.
- Select + Deploy model from the navigation menu.
- Select Deploy base model from the navigation menu to create a new gpt-4o deployment.
- Perform the following tasks:
  - On the Select a model page, select gpt-4o.
  - Select Confirm to navigate to the Deploy model gpt-4o page.
  - On the Deploy model gpt-4o page, enter a Deployment name. It must be a unique value. For example, gpt-4o.
  - On the Deploy model gpt-4o page, select the Deployment type you'd like to use.
- Select Deploy.
Creating the .env File in the Root Directory
Then, I created a .env file in the root directory of my project to store my Azure credentials and other environment variables.
In this section, we will guide you through setting up your environment variables for Azure services using a .env file. Environment variables allow you to securely manage sensitive credentials, such as API keys, without hard-coding them into your codebase.
Creating the .env File
In the root directory of your project, create a file named .env. This file will store all your environment variables in a simple format.
Warning
Do not commit your .env file to version control systems like Git. Add .env to your .gitignore file to prevent accidental commits.
- Navigate to the root directory of your project.
- Create an .env file in the root directory of your project.
- Open the .env file and paste the following template:
# Azure Credentials
AZURE_SUBSCRIPTION_KEY="your_azure_subscription_key"
AZURE_AI_SERVICE_ENDPOINT="https://your_azure_ai_service_endpoint"

# Azure OpenAI Credentials
AZURE_OPENAI_API_KEY="your_azure_openai_api_key"
AZURE_OPENAI_ENDPOINT="https://your_azure_openai_endpoint"
AZURE_OPENAI_MODEL_NAME="your_model_name"
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME="your_deployment_name"
AZURE_OPENAI_API_VERSION="your_api_version"
Gathering Your Azure Credentials
You will need the following Azure credentials on hand to configure the environment:
- For Azure AI Service:
  - Azure Subscription Key: Your Azure subscription key, which allows you to access the Azure AI services.
  - Azure AI Service Endpoint: The endpoint URL for your specific Azure AI service.
- For Azure OpenAI Service:
  - Azure OpenAI API Key: The API key for accessing Azure OpenAI services.
  - Azure OpenAI Endpoint: The endpoint URL for your Azure OpenAI service.
  - Azure OpenAI Model Name: The name of the model you will be interacting with.
  - Azure OpenAI Deployment Name: The name of your deployment for Azure OpenAI models.
  - Azure OpenAI API Version: The version of the Azure OpenAI API you are using.
Adding Azure Environment Variables
- Perform the following tasks to add the Azure Subscription key and Azure AI Services Endpoint:
  - Type computer vision in the search bar at the top of the portal page and select Computer vision from the options that appear.
  - Navigate to the Azure Computer Vision resource that you are currently using.
  - Copy and paste your Subscription key and Endpoint into the .env file.
- Perform the following tasks to add the Azure OpenAI API Key and Endpoint:
  - Type azure openai in the search bar at the top of the portal page and select Azure OpenAI from the options that appear.
  - Navigate to the Azure OpenAI resource that you are currently using.
  - Select Keys and Endpoint from the left side tab.
  - Copy and paste your Azure OpenAI API Key and Endpoint into the .env file.
- Perform the following tasks to add the Azure OpenAI Deployment Name and Version:
  - Navigate to the Azure OpenAI resource that you created.
  - Select Go to Azure OpenAI Studio from the navigation menu.
  - Inside Azure OpenAI Studio, select Deployments from the left side tab.
  - Copy and paste your Azure OpenAI deployment Name and model Version into the .env file.
- Save the .env file.
Now, you can access these environment variables to use Co-op Translator with your Azure services.
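Before running a translation, it can help to confirm that every variable from the template has a real value. The following is a minimal sketch using only the Python standard library; it is not part of Co-op Translator, and the helper names `parse_env_text` and `find_missing_settings` are hypothetical. It assumes the simple `KEY="value"` layout of the template above and treats any value still containing `your_` as an unfilled placeholder.

```python
import os

# Variable names copied from the .env template in this post.
REQUIRED_VARS = [
    "AZURE_SUBSCRIPTION_KEY",
    "AZURE_AI_SERVICE_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_MODEL_NAME",
    "AZURE_OPENAI_CHAT_DEPLOYMENT_NAME",
    "AZURE_OPENAI_API_VERSION",
]


def parse_env_text(text):
    """Parse simple KEY="value" lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_missing_settings(values):
    """Return required names that are absent, empty, or still placeholders."""
    missing = []
    for name in REQUIRED_VARS:
        value = values.get(name, "")
        # Assumption: template placeholders all contain "your_".
        if not value or "your_" in value:
            missing.append(name)
    return missing


if __name__ == "__main__":
    if os.path.exists(".env"):
        with open(".env", encoding="utf-8") as handle:
            settings = parse_env_text(handle.read())
        print("Missing or placeholder settings:", find_missing_settings(settings))
```

Run from the project root, this prints an empty list once every credential has been filled in.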
Creating a .gitignore File
To ensure sensitive information is not shared in version control, I created a .gitignore file in the root of my project and added the .env file to it:
# Ignore environment files
.env
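If you want to script this step, a small helper can add the entry only when it is missing. This is a sketch, not part of Co-op Translator: `ensure_ignored` is a hypothetical name, and it only looks for an exact `.env` line, so broader patterns that already cover the file are not detected.

```python
def ensure_ignored(gitignore_text, entry=".env"):
    """Return gitignore_text with entry appended unless a line already lists it."""
    lines = [line.strip() for line in gitignore_text.splitlines()]
    if entry in lines:
        return gitignore_text
    prefix = gitignore_text
    # Make sure the new entry starts on its own line.
    if prefix and not prefix.endswith("\n"):
        prefix += "\n"
    return prefix + entry + "\n"
```

You could read the current .gitignore contents, pass them through this function, and write the result back.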
Adding Multilingual Support Table in README.md
Next, I added a multilingual support section to my README.md file to display the available translations. Here's how I set it up:
## 🌐 Multi-Language Support
> **Note:**
> These translations were automatically generated using the open-source [co-op-translator](https://github.com/Azure/co-op-translator) and may contain errors or inaccuracies. For critical information, it is recommended to refer to the original or consult a professional human translation. If you'd like to add or update a translation, please refer to the [co-op-translator](https://github.com/Azure/co-op-translator) repository, where you can easily contribute using simple commands.
| Language | Code | Link to Translated README | Last Updated |
|----------------------|------|---------------------------------------------------------|--------------|
| Punjabi (Gurmukhi) | pa | [Punjabi Translation](./translations/pa/README.md) | 2024-10-25 |
| Hindi | hi | [Hindi Translation](./translations/hi/README.md) | 2024-10-25 |
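
As more languages are added, the table rows can be generated rather than typed by hand. The sketch below is an illustration only: `translation_table` is a hypothetical helper, and it assumes each translated README lives at `./translations/<code>/README.md`, as in the two rows above.

```python
def translation_table(languages):
    """Build the Markdown table from (name, code, link_text, updated) tuples."""
    lines = [
        "| Language | Code | Link to Translated README | Last Updated |",
        "|----------|------|---------------------------|--------------|",
    ]
    for name, code, link_text, updated in languages:
        link = f"[{link_text}](./translations/{code}/README.md)"
        lines.append(f"| {name} | {code} | {link} | {updated} |")
    return "\n".join(lines)


if __name__ == "__main__":
    print(translation_table([
        ("Punjabi (Gurmukhi)", "pa", "Punjabi Translation", "2024-10-25"),
        ("Hindi", "hi", "Hindi Translation", "2024-10-25"),
    ]))
```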
Installing and Using the Co-op Translator
- I created a virtual environment for the project to keep dependencies isolated:
python -m venv .venv
- Then, I activated the virtual environment:
.venv\Scripts\activate.bat
Note
Your terminal should now show the virtual environment is active, for example:
(.venv) C:\Users\sms79\dev\24-07-22-tech-writing>
- With the environment activated, I installed the Co-op Translator using pip:
pip install co-op-translator
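The activation command above is the Windows one. As a rough sketch of how the command differs by platform, the hypothetical helper below returns the Windows script path used in this post, or the standard `source .venv/bin/activate` form used by bash-style shells on macOS and Linux.

```python
import os


def activation_command(os_name=os.name, venv_dir=".venv"):
    """Return the shell command that activates a virtual environment."""
    if os_name == "nt":
        # Windows: the batch script used in this post.
        return f"{venv_dir}\\Scripts\\activate.bat"
    # macOS / Linux with a POSIX shell such as bash or zsh.
    return f"source {venv_dir}/bin/activate"


if __name__ == "__main__":
    print(activation_command())
```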
Step-by-Step Translation Process
Once installation is complete, translating your project into your desired languages with Co-op Translator is straightforward. From the root directory of your project, simply run the following command:
translate -l "language_codes"
For example, to translate the markdown files and images into both Punjabi and Hindi, you would use:
translate -l "hi pa"
After running this command, Co-op Translator will begin the translation process, with output similar to the following:
(.venv) C:\Users\sms79\dev\PowerPlatformsession>translate -l "hi pa"
Translating images: 100%|█████████████████████████████████████████████████████████████| 32/32 [07:45<00:00, 14.54s/it]
Translating markdown files: 100%|███████████████████████████████████████████████████████| 4/4 [04:15<00:00, 63.75s/it]
This process efficiently translates both images and markdown files, allowing your content to reach a wider audience in multiple languages.
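If you prefer to trigger the translation from a script, for example as part of a build step, the command can be wrapped with Python's `subprocess` module. This is a sketch under stated assumptions: it relies only on the `translate -l "codes"` form shown above, and the helper names `build_translate_command` and `run_translation` are my own, not part of the package.

```python
import shutil
import subprocess


def build_translate_command(language_codes):
    """Build the argument list for: translate -l "hi pa"."""
    # The codes are passed as one space-separated argument, as in the post.
    return ["translate", "-l", " ".join(language_codes)]


def run_translation(language_codes):
    """Run the translate CLI from the current directory (the project root)."""
    if shutil.which("translate") is None:
        raise RuntimeError(
            "translate command not found; activate the virtual environment first"
        )
    # check=True raises CalledProcessError if the CLI exits with an error.
    return subprocess.run(build_translate_command(language_codes), check=True)
```

Calling `run_translation(["hi", "pa"])` from the project root would be equivalent to typing the command shown earlier.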
Appendices
Appendix 1: Original README (English)
Appendix 2: Translated README (Punjabi)
Appendix 3: Translated README (Hindi)
Observations on Translation Quality
The markdown translations were completed smoothly without any issues. However, with image translations, I noticed that densely packed sentences occasionally shrank or stretched excessively after translation, making them challenging to read.
Benefits of Multilingual Content for Students
Using Co-op Translator to provide multilingual content significantly enhances accessibility for students who speak different languages. It enables non-English speakers to engage fully with the material, breaking down language barriers. This is especially valuable in the MLSA program, which spans diverse regions globally and includes many participants whose first language is not English.
Translation Cost Analysis
Translation Time and Cost
Translation Time
For this translation process, we translated into Hindi (hi) and Punjabi (pa). The following time was recorded:
Images: 32 images translated in 7 minutes and 45 seconds
Markdown Files: 4 markdown files translated in 4 minutes and 15 seconds
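To put these timings in per-item terms, the short calculation below divides each recorded duration by the number of items. It uses only the figures reported above. The image result differs slightly from the 14.54s/it in the log, presumably because the durations here are rounded to whole seconds.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds duration to total seconds."""
    return minutes * 60 + seconds


image_seconds = to_seconds(7, 45)      # 32 images
markdown_seconds = to_seconds(4, 15)   # 4 markdown files

per_image = image_seconds / 32
per_markdown = markdown_seconds / 4
total_minutes = (image_seconds + markdown_seconds) / 60

print(f"Average per image: {per_image:.2f} s")
print(f"Average per markdown file: {per_markdown:.2f} s")
print(f"Total time: {total_minutes:.0f} minutes")
```

In this run, each markdown file took roughly four times as long as each image.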
Translation Cost
Using Azure Cost Management, I reviewed the translation costs for both images and markdown files. Note that the Images section includes the cost for extracting and translating text within each image. Here is the detailed breakdown:
- Images (with Text Extraction): 32 images processed, totaling $0.05
- Markdown Files: 4 markdown files translated (includes extracted image text), totaling $0.29
Total Estimated Translation Cost
The total estimated cost for this translation process is $0.34.
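The total and the per-item costs can be checked with a few lines of Python. The sketch uses `Decimal` rather than floats so that the cent values add up exactly; the figures are the ones reported above for this single run and are not Azure list prices.

```python
from decimal import Decimal

image_cost = Decimal("0.05")      # 32 images, including text extraction
markdown_cost = Decimal("0.29")   # 4 markdown files

total_cost = image_cost + markdown_cost
cost_per_image = image_cost / 32
cost_per_markdown = markdown_cost / 4

print(f"Total: ${total_cost}")
print(f"Per image: ${cost_per_image}")
print(f"Per markdown file: ${cost_per_markdown}")
```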
Accessing Cost Management
To verify and monitor these costs, follow these steps to access Azure Cost Management:
- Go to the Azure Portal and log in with your Azure account.
- Type subscriptions in the search bar at the top of the portal page and select Subscriptions from the options that appear.
- Select the subscription you'd like to use.
- Select Cost analysis from the left side menu to view a detailed breakdown of your costs.
- Perform the following tasks to monitor translation costs effectively:
  - Select the calendar icon to adjust the date range, aligning it with the timeframe of your translation project.
  - Select + Add filter.
  - Set the filter item to Resource.
    - If all translation resources are organized within a single resource group, choose Resource Group instead to view the cumulative cost at the group level.
    - To analyze individual costs per service, select Resource for a detailed breakdown by specific resource usage.
  - Select the resources you'd like to analyze, such as Azure Computer Vision and Azure OpenAI, which are used for translation tasks.
Understanding Translation Costs in Azure Portal
Azure provides a suite of cost-management tools that help users monitor and manage their spending on translation services. To effectively navigate these tools:
- Access the Azure Portal: Log into your Azure account and navigate to the Cost Management + Billing section.
- Review Cost Analysis: Use the cost analysis feature to track your spending over time, filter by resource type, and identify trends in usage.
- Set Budgets: Create budgets for your translation services to ensure you stay within financial limits.
Analyzing the Cost of Translation by Content Type
Costs associated with translation services can vary significantly based on the type of content being translated. For instance:
- Textual Content: Generally incurs lower costs than multimedia content.
- Images and Graphics: May require additional processing and therefore cost more.
- Complex Formats: Documents with complex formatting or technical jargon may also lead to increased translation costs.
To analyze costs effectively, users should categorize their content types and estimate translation expenses accordingly.
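One rough way to estimate expenses by content type is to scale the per-item costs observed in a previous run. The sketch below assumes the rates from my Hindi and Punjabi run ($0.05 for 32 images, $0.29 for 4 markdown files) and that cost grows linearly with item count. Both are simplifying assumptions: actual charges depend on text length, model, and pricing tier. The function names are hypothetical.

```python
from decimal import Decimal

# Observed rates from one run in this post; not Azure list prices.
RATE_PER_IMAGE = Decimal("0.05") / 32
RATE_PER_MARKDOWN = Decimal("0.29") / 4


def estimate_cost(image_count, markdown_count):
    """Estimate cost in USD, assuming cost scales linearly with item count."""
    return image_count * RATE_PER_IMAGE + markdown_count * RATE_PER_MARKDOWN


def within_budget(image_count, markdown_count, budget):
    """Return True if the estimate does not exceed the budget (USD)."""
    return estimate_cost(image_count, markdown_count) <= Decimal(str(budget))


if __name__ == "__main__":
    print(f"Estimated cost for 100 images and 10 markdown files: ${estimate_cost(100, 10)}")
```

Treat the result as a starting point for a budget in Azure Cost Management rather than a quote.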
Tips for Managing and Minimizing Costs
- Choose the Right Azure Resources: Select the most cost-effective services based on your needs. Consider using lower-tier translation services for less critical documents.
- Monitor Usage Regularly: Set up alerts in Azure to notify you when you approach your budget limits. Regular monitoring helps avoid unexpected charges.
- Optimize Content: Reduce the volume of content needed by summarizing information or providing key excerpts. This can significantly lower costs.
- Leverage Azure’s Free Tier: If applicable, take advantage of Azure’s free tier offerings for translation services to minimize costs during initial trials or small projects.
Utilizing Translated Content for Students
How Students Can Access and Use the Translated Repository
Students can easily access translated versions of project documentation through the repository. Key steps include:
- Navigating the Repository: Visit the Co-op Translator GitHub repository and look for sections dedicated to translated documents.
- Using Translated Resources: Students can download or clone the repository to access translated documentation, which can be utilized for their projects, assignments, or collaborative work.
Enabling Global Access for Student Ambassadors and Language-Specific Content
Student Ambassadors: Encourage student ambassadors to promote the use of translated content within their networks, ensuring that resources reach a wider audience.
Language-Specific Content: Create dedicated sections in the repository for language-specific resources, allowing users to easily find and utilize materials in their preferred language.
Conclusion
Co-op Translator represents a significant advance in making project documentation multilingual. Automating the translation process not only saves time and resources but also promotes inclusivity and collaboration across linguistic barriers. The benefits extend to developers, educators, and students, enabling them to work together more effectively in a globalized environment.
References
- https://github.com/Azure/co-op-translator
- https://learn.microsoft.com/azure/ai-services/
- https://learn.microsoft.com/azure/ai-services/openai/