azure ai translator
13 Topics

From Foundry to Fine-Tuning: Topics you Need to Know in Azure AI Services
With so many new features from Azure and newer ways of development, especially in generative AI, you may be wondering what you need to know and where to start in Azure AI. Whether you're a developer or IT professional, this guide will help you understand the key features, use cases, and documentation links for each service. Let's explore how Azure AI can transform your projects and drive innovation in your organization. Stay tuned for more details!

| Term | Description | Use Case | Azure Resource |
|---|---|---|---|
| Azure AI Foundry | A comprehensive platform for building, deploying, and managing AI-driven applications. | Customizing, hosting, running, and managing AI applications. | Azure AI Foundry |
| AI Agent | Within Azure AI Foundry, an AI Agent acts as a "smart" microservice that can be used to answer questions (RAG), perform actions, or completely automate workflows. | Can be used in a variety of applications to automate tasks, improve efficiency, and enhance user experiences. | Link |
| AutoGen | An open-source framework designed for building and managing AI agents, supporting workflows with multiple agents. | Developing complex AI applications with multiple agents. | Autogen |
| Multi-Agent AI | Systems where multiple AI agents collaborate to solve complex tasks. | Managing energy in smart grids, coordinating drones. | Link |
| Model as a Platform | A business model leveraging digital infrastructure to facilitate interactions between user groups. | Social media channels, online marketplaces, crowdsourcing websites. | Link |
| Azure OpenAI Service | Provides access to OpenAI’s powerful language models integrated into the Azure platform. | Text generation, summarization, translation, conversational AI. | Azure OpenAI Service |
| Azure AI Services | A suite of APIs and services designed to add AI capabilities like image analysis, speech-to-text, and language understanding to applications. | Image analysis, speech-to-text, language understanding. | Link |
| Azure Machine Learning (Azure ML) | A cloud-based service for building, training, and deploying machine learning models. | Creating models to predict sales, detect fraud. | Azure Machine Learning |
| Azure AI Search | An AI-powered search service that enhances information to facilitate exploration. | Enterprise search, e-commerce search, knowledge mining. | Azure AI Search |
| Azure Bot Service | A platform for developing intelligent, enterprise-grade bots. | Creating chatbots for customer service, virtual assistants. | Azure Bot Service |
| Deep Learning | A subset of ML using neural networks with many layers to analyze complex data. | Image and speech recognition, natural language processing. | Link |
| Multimodal AI | AI that integrates and processes multiple types of data, such as text and images (including input & output). | Describing images, answering questions about pictures. | Azure OpenAI Service, Azure AI Services |
| Unimodal AI | AI that processes a single type of data, such as text or images (including input & output). | Writing text, recognizing objects in photos. | Azure OpenAI Service, Azure AI Services |
| Fine-Tuning Models | Adapting pre-trained models to specific tasks or datasets for improved performance. | Customizing models for specific industries like healthcare. | Azure Foundry |
| Model Catalog | A repository of pre-trained models available for use in AI projects. | Discovering, evaluating, fine-tuning, and deploying models. | Model Catalog |
| Capacity & Quotas | Limits and quotas for using Azure AI services, ensuring optimal resource allocation. | Managing resource usage and scaling AI applications. | Link |
| Tokens | Units of text processed by language models, affecting cost and performance. | Managing and optimizing text processing tasks. | Link |
| TPM (Tokens per Minute) | A measure of the rate at which tokens are processed, impacting throughput and performance. | Allocating and managing processing capacity for AI models. | Link |
| PTU (provisioned throughput) | The provisioned throughput capability allows you to specify the amount of throughput you require in a deployment. | Ensuring predictable performance for AI applications. | Link |

Microsoft Translator Pro is now Generally Available (GA)
In November 2024, we introduced the gated public preview release of Microsoft Translator Pro, our robust solution crafted to help enterprises break down language barriers in the workplace. Today, we are thrilled to announce that Microsoft Translator Pro is now generally available on iOS.

New features of the gated GA release

Below are the latest features in this release. For more information on the core features, please refer to the public preview release announcement.

Customized phrasebook: Upload a phrasebook with your organization’s phrases to facilitate quick and efficient communication in another language.

International availability: The app is now accessible in selected countries outside the United States. To view the complete list of supported countries, please refer to the Microsoft Translator Pro availability by country.

Availability in US Government cloud: Microsoft Translator Pro, which is already available in commercial cloud, is now also available within the US Government cloud. US Government agencies can now operate the app within the US Government cloud. For detailed information on regional availability, please refer to the Microsoft Translator Pro availability by region.

Expanded language coverage: The app now supports additional languages when connected to the internet, enhancing its usability for a broader range of users. For more details, please visit the Microsoft Translator Pro language support.

Join the gated GA

To onboard the GA version of the app, please complete the gating form. Upon meeting the criteria, we will grant your organization access to the paid version of the Microsoft Translator Pro app.

Learn more and get started:

Microsoft Translator Pro documentation
Microsoft Translator Pro FAQ

Enter new era of enterprise communication with Microsoft Translator Pro & document image translation
Microsoft Translator Pro: standalone, native mobile experience

We are thrilled to unveil the gated public preview of Microsoft Translator Pro, our robust solution designed for enterprises seeking to dismantle language barriers in the workplace. Available on iOS, Microsoft Translator Pro offers a standalone, native experience, enabling speech-to-speech translated conversations among coworkers, users, or clients within your enterprise ecosystem.

Watch how Microsoft Translator Pro transforms a hotel check-in experience by breaking down language barriers. In this video, a hotel receptionist speaks in English, and the app translates and plays the message aloud in Chinese for the traveler. The traveler responds in Chinese, and the app translates and plays the message aloud in English for the receptionist.

Key features of the public preview

Our enterprise version of the app is packed with features tailored to meet the stringent demands of enterprises:

Core feature - speech-to-speech translation:

Break language barriers: Real-time speech-to-speech translation allows you to have seamless communication with individuals speaking different languages.

Unified experience: View or hear both transcription and translation simultaneously on a single device, ensuring smooth and efficient conversations.

On-device translation: Harness the app's speech-to-speech translation capability without an internet connection in limited languages, ensuring your productivity remains unhampered.

Full administrator control: Enterprise IT Administrators wield extensive control over the app's deployment and usage within your organization. They can fine-tune settings to manage conversation history, audit, and diagnostic logs, with the ability to disable history or configure automatic exportation of the history to cloud storage.

Uncompromised privacy and security: Microsoft Translator Pro provides enterprises with a high level of translation quality and robust security.
We know that privacy and security are top priorities for you. Once granted access by your organization's admin, you can sign in to the app with your organizational credentials. Your conversational data remains strictly yours, safeguarded within your Azure tenant. Neither Microsoft nor any external entities have access to your data.

Join the Preview

To embark on this journey with us, please complete the gating form. Upon meeting the criteria, we will grant your organization access to the paid version of the Microsoft Translator Pro app, which is now available in the US.

Learn more and get started: Microsoft Translator Pro documentation.

Document translation translates text embedded in images

Our commitment to advancing cross-language communication takes a major step forward with a new enhancement in Azure AI Translator’s Document Translation (DT) feature. Previously, Document Translation supported fully digital documents and scanned PDFs. Starting January 2025, with this latest update, the service can also process mixed-content documents, translating both digital text and text embedded within images.

Sample document translated from English to Spanish (frames in order: source document, translated output document without image translation, translated output document with image translation).

How It Works

To enable this feature, the Document Translation service now leverages the Microsoft Azure AI Vision API to detect, extract, and translate text from images within documents. This capability is especially useful for scenarios where documents contain a mix of digital text and image-based text, ensuring complete translations without manual intervention.

Getting Started

To take advantage of this feature, customers can use the new optional parameter when setting up a translation request:

Request: A new parameter under "options" called "translateTextWithinImage" has been introduced. This parameter is of type Boolean, accepting "true" or "false."
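As a rough illustration of the request parameter described above, the following Python sketch builds a translation request body with the flag switched on. Only the "options" object and the "translateTextWithinImage" name come from the announcement; the "inputs"/"source"/"targets" layout, the function name build_batch_request, and the placeholder URLs are assumptions made for this example, so check the Translator documentation for the exact request shape before relying on it.

```python
import json


def build_batch_request(source_url, target_url, source_lang, target_lang,
                        translate_images=False):
    """Build a request body for a document translation job.

    Only "options" and "translateTextWithinImage" are taken from the
    announcement; the rest of the layout is an assumed, illustrative shape.
    """
    return {
        "inputs": [
            {
                "source": {"sourceUrl": source_url, "language": source_lang},
                "targets": [{"targetUrl": target_url, "language": target_lang}],
            }
        ],
        # Boolean flag: True asks the service to also translate text in images.
        "options": {"translateTextWithinImage": translate_images},
    }


if __name__ == "__main__":
    # Placeholder container URLs (hypothetical, not real storage accounts).
    body = build_batch_request(
        "https://example.invalid/source-container",
        "https://example.invalid/target-container",
        "en",
        "es",
        translate_images=True,
    )
    # json.dumps serializes the Python True as the JSON literal true.
    print(json.dumps(body, indent=2))
```

The sketch stops at building the JSON body on purpose: sending it requires an endpoint, API version, and credentials that the announcement does not specify.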
The default value is "false," so you’ll need to set it to "true" to activate the image text translation capability.

Response: When this feature is enabled, the response will include additional details for transparency on image processing:

totalImageScansSucceeded: The count of successfully translated image scans.

totalImageScansFailed: The count of image scans that encountered processing issues.

Usage and cost

For this feature, customers will need to use the Azure AI Services resource, as this new feature leverages Azure AI Vision services along with Azure AI Translator. The OCR service incurs additional charges based on usage. Pricing details for the OCR service can be found here: Pricing details

Learn more and get started (starting January 2025): Translator Documentation

These new advancements reflect our dedication to pushing boundaries in Document Translation, empowering enterprises to connect and collaborate more effectively, regardless of language. Stay tuned for more innovations as we continue to expand the reach and capabilities of Microsoft Azure AI Translator.

Announcing Azure AI Content Understanding: Transforming Multimodal Data into Insights
Solve Common GenAI Challenges with Content Understanding

As enterprises leverage foundation models to extract insights from multimodal data and develop agentic workflows for automation, it's common to encounter issues like inconsistent output quality, ineffective pre-processing, and difficulties in scaling out the solution. Organizations often find that to handle multiple types of data, the effort is fragmented by modality, increasing the complexity of getting started. Azure AI Content Understanding is designed to eliminate these barriers, accelerating success in Generative AI workflows.

Handling Diverse Data Formats: By providing a unified service for ingesting and transforming data of different modalities, businesses can extract insights from documents, images, videos, and audio seamlessly and simultaneously, streamlining workflows for enterprises.

Improving Output Data Accuracy: Deriving high-quality output for their use-cases requires practitioners to ensure the underlying AI is customized to their needs. Using advanced AI techniques like intent clarification, and a strongly typed schema, Content Understanding can effectively parse large files to extract values accurately.

Reducing Costs and Accelerating Time-to-Value: Using confidence scores to trigger human review only when needed minimizes the total cost of processing the content. Integrating the different modalities into a unified workflow and grounding the content when applicable allows for faster reviews.

Core Features and Advantages

Azure AI Content Understanding offers a range of innovative capabilities that improve efficiency, accuracy, and scalability, enabling businesses to unlock deeper value from their content and deliver a superior experience to their end users.

Multimodal Data Ingestion and Content Extraction: The service ingests a variety of data types such as documents, images, audio, and video, transforming them into a structured format that can be easily processed and analyzed.
It instantly extracts core content from your data including transcriptions, text, faces, and more.

Data Enrichment: Content Understanding offers additional features that enhance content extraction results, such as layout elements, barcodes, and figures in documents, speaker recognition and diarization in audio, and more.

Schema Inferencing: The service offers a set of prebuilt schemas and allows you to build and customize your own to extract exactly what you need from your data. Schemas allow you to extract a variety of results, generating task-specific representations like captions, transcripts, summaries, thumbnails, and highlights. This output can be consumed by downstream applications for advanced reasoning and automation.

Post Processing: Enhances service capabilities with generative AI tools that ensure the accuracy and usability of extracted information. This includes providing confidence scores for minimal human intervention and enabling continuous improvement through user feedback.

Transformative Applications Across Industries

Azure AI Content Understanding is ideal for a wide range of use cases and industries, as it is fully customizable and allows for the input of data from multiple modalities. Here are just a few examples of scenarios Content Understanding is powering today:

Post call analytics: Customers utilize Azure AI Content Understanding to extract analytics on call center or recorded meeting data, allowing you to aggregate data on the sentiment, speakers, and content discussed, including specific names, companies, user data, and more.

Media asset management and content creation assistance: Extract key features from images and videos to better manage media assets and enable search on your data for entities like brands, setting, key products, people, and more.

Insurance claims: Analyze and process insurance claims and other low-latency batch processing scenarios to automate previously time-intensive processes.
Highlight video reel generation: With Content Understanding, you can automatically identify key moments in a video to extract highlights and summarize the full content. For example, automatically generate a first draft of highlight reels from conferences, seminars, or corporate events by identifying key moments and significant announcements.

Retrieval Augmented Generation (RAG): Ingest and enrich content of any modality to effectively find answers to common questions in scenarios like customer service agents, or power content search scenarios across all types of data.

Customer Success with Content Understanding

Customers all over the world are already finding unique and powerful ways to accelerate their inferencing and unlock insights on their data by leveraging the multimodal capabilities of Content Understanding. Here are a few examples of how customers are unlocking greater value from their data:

Philips: Philips Speech Processing Solutions (SPS) is a global leader in dictation and speech-to-text solutions, offering innovative hardware and software products that enhance productivity and efficiency for professionals worldwide. Content Understanding enables Philips to power their speech-to-result solution, allowing customers to use voice to generate accurate, ready-to-use documentation.

“With Azure AI Content Understanding, we're taking Philips SpeechLive, our speech-to-result solution to a whole new level. Imagine speaking, and getting fully generated, accurate documents, ready to use right away, thanks to powerful AI speech analytics that work seamlessly with all the relevant data sources.” – Thomas Wagner, CTO Philips Dictation Services

WPP: WPP, one of the world’s largest advertising and marketing services providers, is revolutionizing website experiences using Azure AI Content Understanding.
SJR, a content tech firm within WPP, is leveraging this technology for SJR Generative Experience Manager (GXM), which extracts data from all types of media on a company's website (including text, audio, video, PDFs, and images) to deliver intelligent, interactive, and personalized web experiences, with the support of WPP's AI technology company, Satalia. This enables them to convert static websites into dynamic, conversational interfaces, unlocking information buried deep within websites and presenting it as if spoken by the company's most knowledgeable salesperson. Through this innovation, WPP's SJR is enhancing customer engagement and driving conversion for their clients.

ASC: ASC Technologies is a global leader in providing software and cloud solutions for omni-channel recording, quality management, and analytics, catering to industries such as contact centers, financial services, and public safety organizations. ASC utilizes Content Understanding to enhance their compliance analytics solution, streamlining processes and improving efficiency.

“ASC expects to significantly reduce the time-to-market for its compliance analytics solutions. By integrating all the required capture modalities into one request, instead of customizing and maintaining various APIs and formats, we can cover a wide range of use cases in a much shorter time.” – Tobias Fengler, Chief Engineering Officer

Numonix: Numonix AI specializes in capturing, analyzing, and managing customer interactions across various communication channels, helping organizations enhance customer experiences and ensure regulatory compliance. They are leveraging Content Understanding to capture insights from recorded call data from both audio and video to transcribe, analyze, and summarize the contents of calls and meetings, allowing them to ensure compliance across all conversations.
“Leveraging Azure AI Content Understanding across multiple modalities has allowed us to supercharge the value of the recorded data Numonix captures on behalf of our customers. Enabling smarter communication compliance and security in the financial industry to fully automating quality management in the world’s largest call centers.” – Evan Kahan, CTO & CPO Numonix

IPV Curator: A leader in media asset management solutions, IPV is leveraging Content Understanding to improve their metadata extraction capabilities to produce stronger industry specific metadata, advanced action and event analysis, and align video segmentation to specific shots in videos. IPV’s clients are now able to accelerate their video production, reduce editing time, and access their content more quickly and easily.

To learn more about how Content Understanding empowers video scenarios as well as how our customers such as IPV are using the service to power their unique media applications, check out Transforming Video Content into Business Value.

Robust Security and Compliance

Built using Azure’s industry-leading enterprise security, data privacy, and Responsible AI guidelines, Azure AI Content Understanding ensures that your data is handled with the utmost care and compliance and generates responses that align with Microsoft’s principles for responsible use of AI.

We are excited to see how Azure AI Content Understanding will empower organizations to unlock their data's full potential, driving efficiency and innovation across various industries. Stay tuned as we continue to develop and enhance this groundbreaking service.

Getting Started

If you are at Microsoft Ignite 2024 or are watching online, check out this breakout session on Content Understanding.

Learn more about the new Azure AI Content Understanding service here.

Build your own Content Understanding solution in the Azure AI Foundry.
For all documentation on Content Understanding, please refer to this page.

Deploy a Gradio Web App on Azure with Azure App Service: a Step-by-Step Guide
This guide provides a detailed walkthrough for deploying a Gradio interface to the cloud using Azure App Service. It is designed for individuals who wish to transition their Gradio applications, such as machine learning model demos or web apps, from local development to a stable, publicly accessible application. The tutorial covers the utilization of Visual Studio Code (VSCode) to set up virtual environments, ensuring a controlled development space. It also addresses the management of sensitive information by demonstrating how to handle secrets securely. The article provides insights and best practices for deploying your Gradio project to the cloud, ensuring a seamless transition from a local prototype to a professional-grade application hosted on Azure.

Explore Azure AI Services: Curated list of prebuilt models and demos
Unlock the potential of AI with Azure's comprehensive suite of prebuilt models and demos. Whether you're looking to enhance speech recognition, analyze text, or process images and documents, Azure AI services offer ready-to-use solutions that make implementation effortless. Explore the diverse range of use cases and discover how these powerful tools can seamlessly integrate into your projects. Dive into the full catalogue of demos and start building smarter, AI-driven applications today.

Announcing new multi-modal capabilities with Azure AI Speech
Customers continue to innovate with Azure OpenAI and Azure AI Speech. They are bringing new efficiencies into their enterprise and building new multimodal experiences for their customers. We are seeing a variety of use cases including call analytics, medical transcription, captioning, chatbots and more. At Azure AI, we continue to work with customers and bring new innovations to the market. Here are all the multimodal innovations, specifically including speech and text, that we are announcing at Microsoft Build this year.

Speech analytics

Today, we are announcing Speech analytics in preview. Speech analytics is a new service in Azure AI Studio that combines Azure AI services and PromptFlow to automatically process and analyze audio data simply by uploading it to cloud storage. With Speech analytics, it is easy to gain insights into call center conversations or to extract a conversation summary, using AI models from Azure OpenAI as well as Azure AI Language to analyze the accurate transcriptions generated by Azure AI Speech. Gaining insights from call center conversations allows businesses to better understand their customer needs, product feedback and support trends and to improve the customer experience.

Using our post-call analytics template, customers can quickly set up common insights like call summaries, customer sentiment, and key topics. Customers that want to go beyond these out-of-the-box insights can easily modify the default prompt to extract additional insights, or modify the full prompt flow to fully customize the analytics and extract a wide range of information, including, for example, discussion highlights and predictions of possible conversation flows. With Speech analytics, it is also easy to customize support for multiple languages, accents, domains and scenarios and to scale to large production use.
Speech analytics is helping our customers gain insights into customer conversations and improve their customer experience, sales, and marketing strategies. It is also a stepping stone for multi-modal data analysis, which will enable richer and deeper insights from different types of data in the future.

Here is an exemplary suite of technologies that Speech Processing Solutions (Philips Dictation) is building using Azure AI services, including Speech analytics:

Speech analytics will be available for developers to try out in June. To learn more, try it out in the Azure AI Studio.

Fast transcription

Today, we are also announcing Fast Transcription API in preview. The API, part of the Azure AI Speech family, provides the means to transcribe audio files of up to 200 MB in seconds through a simple REST call. Customers want to enable scenarios where obtaining the transcript quickly is paramount. They want the transcript as soon as an interview finishes, or a phone call completes, for instance.

This API is a game changer for transcription at large. Using a synchronous REST API call, it can transcribe up to 40x faster than real time, producing, for example, a transcript of a 10-minute audio file in 15 seconds without sacrificing accuracy. The API provides a simple but powerful way to transcribe audio and opens the door to a new set of scenarios, one of which is ‘agent note taking’ within call centers.

Efficient note taking

A typical agent working in a call center spends 3 to 5 minutes after each call creating notes. Fast Transcription API in combination with Azure OpenAI Service can automate this task, giving thousands of hours of work back to the call center. Medical practitioners that record conversations with patients can analyze these recordings in seconds. Similarly, media and content creators can analyze and extract insights from podcasts or interviews as soon as they complete.
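The figures quoted above can be sanity-checked with a little arithmetic. The Python sketch below is only an illustration: the 40x speed-up, the 10-minute file, and the 3 to 5 minutes of note taking per call come from the text, while the function names and the call volume of 100,000 are hypothetical values chosen for the example.

```python
def transcription_seconds(audio_seconds: float, speedup: float) -> float:
    """Time to transcribe audio when processing runs `speedup` times
    faster than real time."""
    return audio_seconds / speedup


def note_taking_hours(calls: int, minutes_per_call: float) -> float:
    """Total agent hours spent writing notes across `calls` calls."""
    return calls * minutes_per_call / 60


if __name__ == "__main__":
    # A 10-minute file at 40x real time: 600 s / 40 = 15 s.
    print(transcription_seconds(10 * 60, 40))  # 15.0

    # Hypothetical volume: 100,000 calls at the low end of 3 minutes each.
    print(note_taking_hours(100_000, 3))  # 5000.0
```

Under that assumed volume, even the lower bound of 3 minutes per call adds up to 5,000 agent hours, which is consistent with the "thousands of hours" claim.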
IntelePeer simplifies communications automation through advanced AI-powered solutions, helping businesses and contact centers reduce costs and enrich the customer experience.

"The performance of Microsoft’s FAST API for offline transcription far supersedes the competition. When comparing the same sample corpus, FAST API performed the best among the alternative services tested. It shined on low quality audio transcription, delivering results 70% better than other vendors." - Sergey Galchenko, CTO, IntelePeer.

Parloa, a software development company building a contact center AI platform for the next generation of customer service in enterprises, has been using the Fast Transcription API in private preview.

"FAST Transcription API provides the fastest, most accurate and most cost-effective option in the Transcription market" - CTO, Parloa

OPPO, a global technology brand known for its innovative smartphones and smart devices, is using Azure AI speech-to-text, Fast Transcription and Azure AI text-to-speech to pilot new customer experiences on their new AI phone. Read this blog to learn more.

Fast Transcription API will be available to developers starting in June 2024. Stay tuned for more.

Video Translation

Today, we are announcing the availability of Video Translation, a groundbreaking service designed to transform the way businesses localize their video content, in preview. The new service offers developers an efficient and seamless solution to address the rising demand for translating video content and overcoming language barriers, allowing content owners to reach a broader audience. Whether it's for educational videos, marketing campaigns, or entertainment content, Video Translation ensures your message is heard in any of the supported languages. The service enables developers to translate content in 10 language pairs with prebuilt neural voices and content editing features, or by using the personal voice capability, which is a limited access feature.
Learn more about Video Translation in the studio and try it out with your own videos.

Vimeo is on a mission to simplify making, managing, and sharing video, all in a single, easy-to-use platform.

"Vimeo has been working closely with Microsoft video translation and is excited about the use cases it will unlock for customers worldwide." - Ashraf Alkarmi, Vimeo Chief Product Officer

Read this blog to learn more about video translation.

Multi-lingual speech translation

We are also announcing new speech translation enhancements in Azure AI Speech. We are introducing multiple language detection with the ability to detect language switches among the supported languages in the same audio stream, automatic language detection eliminating the need for developers to specify input languages, and integrated custom translation to adapt the translation to your domain-specific vocabulary. With these capabilities, developers no longer need to specify the input language, can handle language switches within the same session, and support live streaming translations into target languages.

This capability is especially helpful for captioning use-cases. Captioning is the act of adding text to audio or video content, to make it more accessible and comprehensible for people who have hearing difficulties, or who speak a different language. Captioning is not only a legal obligation in many countries, but also a social duty and a good practice for inclusion. Content creators can now attract a broader and more diverse audience and improve the user experience and engagement effortlessly.

Check out how iTourTranslator has integrated multi-lingual speech translation in their AR glasses. Read this blog to learn more about multi-lingual speech translation.

Announcing general availability of personal voice

Another aspect of our Speech service is the natural voices it offers. Customers use the platform to create realistic and natural-sounding voices for avatars, chatbots and IVRs.
With Azure AI Speech, you can either use an existing voice model, choosing from a wide variety of voices and styles, or create your own custom voice, using your own data and recordings.

Today, we are also announcing the general availability of a new personal voice feature in Azure AI Speech. It is available as limited access to ensure appropriate guardrails and avoid misuse. This feature allows users to create an AI voice in a few seconds by providing just a short speech sample as the audio prompt. This feature can be used for various use cases, such as personalizing voice experience for a chatbot, or translating video content in different languages with the actor’s native voice. Read this blog to learn about customer examples and demos, and the responsible AI practices that are implemented, such as watermarking and usage policies.

In conclusion, our powerful and versatile platform helps customers combine speech input and output as a modality with other AI capabilities. This enables developers to create high-quality workloads for new scenarios. Whether you need insights into human conversations, live or recorded captions, or realistic and natural-sounding voices for your avatars, chatbots, or IVRs, Azure AI helps customers deliver fast, reliable and customizable solutions.

Multilingual Chatbot with Azure AI Studio, Phi-3 Mini, GPT-4 and Azure AI Translator
In today's globalized world, businesses and developers are increasingly seeking to create applications that can interact seamlessly with users across different languages. A multilingual chatbot is a powerful tool for achieving this, allowing for real-time interaction in the users' native languages. This guide will introduce you to creating a sophisticated chatbot using Azure AI Studio, a Phi-3-mini deployment for the chatbot framework, GPT-4 for the assistant framework, and Azure AI Translator for real-time translation to improve Phi-3-mini and GPT-4 response quality for international languages.

Imagine, Integrate, Innovate: Join Microsoft's GenAI Hackathon - LIVE NOW!
Imagine, Integrate, Innovate: Build with Azure AI to revolutionize multimodal experiences in this virtual GenAI hackathon. In the lead-up to Microsoft Build, our flagship developer conference, we’re going big on multimodal building with our developer community by launching Microsoft's GenAI Hackathon on Devpost, live now until May 6th! With Azure AI, you can blend the best of various AI technologies to create more dynamic, versatile, and responsible applications that make a big impact in the world. Whether you’re a pro or just starting out, there’s something for you.