azure ai translator
11 Topics
Announcing Azure AI Content Understanding: Transforming Multimodal Data into Insights
Solve Common GenAI Challenges with Content Understanding As enterprises leverage foundation models to extract insights from multimodal data and develop agentic workflows for automation, it's common to encounter issues like inconsistent output quality, ineffective pre-processing, and difficulties in scaling out the solution. Organizations often find that to handle multiple types of data, the effort is fragmented by modality, increasing the complexity of getting started. Azure AI Content Understanding is designed to eliminate these barriers, accelerating success in Generative AI workflows. Handling Diverse Data Formats: By providing a unified service for ingesting and transforming data of different modalities, businesses can extract insights from documents, images, videos, and audio seamlessly and simultaneously, streamlining workflows for enterprises. Improving Output Data Accuracy: Deriving high-quality output for their use-cases requires practitioners to ensure the underlying AI is customized to their needs. Using advanced AI techniques like intent clarification, and a strongly typed schema, Content Understanding can effectively parse large files to extract values accurately. Reducing Costs and Accelerating Time-to-Value: Using confidence scores to trigger human review only when needed minimizes the total cost of processing the content. Integrating the different modalities into a unified workflow and grounding the content when applicable allows for faster reviews. Core Features and Advantages Azure AI Content Understanding offers a range of innovative capabilities that improve efficiency, accuracy, and scalability, enabling businesses to unlock deeper value from their content and deliver a superior experience to their end users. Multimodal Data Ingestion and Content Extraction: The service ingests a variety of data types such as documents, images, audio, and video, transforming them into a structured format that can be easily processed and analyzed. 
It instantly extracts core content from your data, including transcriptions, text, faces, and more. Data Enrichment: Content Understanding offers additional features that enhance content extraction results, such as layout elements, barcodes, and figures in documents, speaker recognition and diarization in audio, and more. Schema Inferencing: The service offers a set of prebuilt schemas and allows you to build and customize your own to extract exactly what you need from your data. Schemas allow you to extract a variety of results, generating task-specific representations like captions, transcripts, summaries, thumbnails, and highlights. This output can be consumed by downstream applications for advanced reasoning and automation. Post Processing: Enhances service capabilities with generative AI tools that ensure the accuracy and usability of extracted information. This includes providing confidence scores so that human intervention is kept to a minimum, and enabling continuous improvement through user feedback. Transformative Applications Across Industries Azure AI Content Understanding is ideal for a wide range of use cases and industries, as it is fully customizable and allows for the input of data from multiple modalities. Here are just a few examples of scenarios Content Understanding is powering today: Post-call analytics: Customers use Azure AI Content Understanding to extract analytics from call center or recorded meeting data, aggregating data on the sentiment, speakers, and content discussed, including specific names, companies, user data, and more. Media asset management and content creation assistance: Extract key features from images and videos to better manage media assets and enable search on your data for entities like brands, setting, key products, people, and more. Insurance claims: Analyze and process insurance claims and other low-latency batch processing scenarios to automate previously time-intensive processes. 
Highlight video reel generation: With Content Understanding, you can automatically identify key moments in a video to extract highlights and summarize the full content. For example, automatically generate a first draft of highlight reels from conferences, seminars, or corporate events by identifying key moments and significant announcements. Retrieval Augmented Generation (RAG): Ingest and enrich content of any modality to effectively find answers to common questions in scenarios like customer service agents, or power content search scenarios across all types of data. Customer Success with Content Understanding Customers all over the world are already finding unique and powerful ways to accelerate their inferencing and unlock insights on their data by leveraging the multimodal capabilities of Content Understanding. Here are a few examples of how customers are unlocking greater value from their data: Philips: Philips Speech Processing Solutions (SPS) is a global leader in dictation and speech-to-text solutions, offering innovative hardware and software products that enhance productivity and efficiency for professionals worldwide. Content Understanding enables Philips to power their speech-to-result solution, allowing customers to use voice to generate accurate, ready-to-use documentation. “With Azure AI Content Understanding, we're taking Philips SpeechLive, our speech-to-result solution to a whole new level. Imagine speaking, and getting fully generated, accurate documents, ready to use right away, thanks to powerful AI speech analytics that work seamlessly with all the relevant data sources.” – Thomas Wagner, CTO Philips Dictation Services WPP: WPP, one of the world’s largest advertising and marketing services providers, is revolutionizing website experiences using Azure AI Content Understanding. 
SJR, a content tech firm within WPP, is leveraging this technology for SJR Generative Experience Manager (GXM) which extracts data from all types of media on a company's website—including text, audio, video, PDFs, and images—to deliver intelligent, interactive, and personalized web experiences, with the support of WPP's AI technology company, Satalia. This enables them to convert static websites into dynamic, conversational interfaces, unlocking information buried deep within websites and presenting it as if spoken by the company's most knowledgeable salesperson. Through this innovation, WPP's SJR is enhancing customer engagement and driving conversion for their clients. ASC: ASC Technologies is a global leader in providing software and cloud solutions for omni-channel recording, quality management, and analytics, catering to industries such as contact centers, financial services, and public safety organizations. ASC utilizes Content Understanding to enhance their compliance analytics solution, streamlining processes and improving efficiency. "ASC expects to significantly reduce the time-to-market for its compliance analytics solutions. By integrating all the required capture modalities into one request, instead of customizing and maintaining various APIs and formats, we can cover a wide range of use cases in a much shorter time.” - Tobias Fengler, Chief Engineering Officer Numonix: Numonix AI specializes in capturing, analyzing, and managing customer interactions across various communication channels, helping organizations enhance customer experiences and ensure regulatory compliance. They are leveraging Content Understanding to capture insights from recorded call data from both audio and video to transcribe, analyze, and summarize the contents of calls and meetings, allowing them to ensure compliance across all conversations. 
“Leveraging Azure AI Content Understanding across multiple modalities has allowed us to supercharge the value of the recorded data Numonix captures on behalf of our customers. Enabling smarter communication compliance and security in the financial industry to fully automating quality management in the world’s largest call centers.” – Evan Kahan, CTO & CPO Numonix IPV Curator: A leader in media asset management solutions, IPV is leveraging Content Understanding to improve their metadata extraction capabilities to produce stronger industry-specific metadata, advanced action and event analysis, and align video segmentation to specific shots in videos. IPV’s clients are now able to accelerate their video production, reduce editing time, and access their content more quickly and easily. To learn more about how Content Understanding empowers video scenarios as well as how our customers such as IPV are using the service to power their unique media applications, check out Transforming Video Content into Business Value. Robust Security and Compliance Built using Azure’s industry-leading enterprise security, data privacy, and Responsible AI guidelines, Azure AI Content Understanding ensures that your data is handled with the utmost care and compliance and generates responses that align with Microsoft’s principles for responsible use of AI. We are excited to see how Azure AI Content Understanding will empower organizations to unlock their data's full potential, driving efficiency and innovation across various industries. Stay tuned as we continue to develop and enhance this groundbreaking service. Getting Started If you are at Microsoft Ignite 2024 or are watching online, check out this breakout session on Content Understanding. Learn more about the new Azure AI Content Understanding service here. Build your own Content Understanding solution in the Azure AI Foundry. 
For all documentation on Content Understanding, please refer to this page.
Deploy a Gradio Web App on Azure with Azure App Service: a Step-by-Step Guide
This guide provides a detailed walkthrough for deploying a Gradio interface to the cloud using Azure App Service. It is designed for individuals who wish to transition their Gradio applications, such as machine learning model demos or web apps, from local development to a stable, publicly accessible application. The tutorial covers the use of Visual Studio Code (VS Code) to set up virtual environments, ensuring a controlled development space. It also demonstrates how to handle secrets and other sensitive information securely. The article provides insights and best practices for deploying your Gradio project to the cloud, ensuring a seamless transition from a local prototype to a professional-grade application hosted on Azure.
Explore Azure AI Services: Curated list of prebuilt models and demos
Unlock the potential of AI with Azure's comprehensive suite of prebuilt models and demos. Whether you're looking to enhance speech recognition, analyze text, or process images and documents, Azure AI services offer ready-to-use solutions that make implementation effortless. Explore the diverse range of use cases and discover how these powerful tools can seamlessly integrate into your projects. Dive into the full catalogue of demos and start building smarter, AI-driven applications today.
Announcing new multi-modal capabilities with Azure AI Speech
Customers continue to innovate with Azure OpenAI and Azure AI Speech. They are bringing new efficiencies into their enterprise and building new multimodal experiences for their customers. We are seeing a variety of use cases including call analytics, medical transcription, captioning, chatbots and more. At Azure AI, we continue to work with customers and bring new innovations to the market. Here are all the multimodal innovations, specifically including speech and text, that we are announcing at Microsoft Build this year. Speech analytics Today, we are announcing Speech analytics in preview. Speech analytics is a new service in Azure AI Studio that combines Azure AI services and PromptFlow to automatically process and analyze audio data simply by uploading it to cloud storage. With Speech analytics, it is easy to gain insights into call center conversations or to extract a conversation summary, using AI models from Azure OpenAI and Azure AI Language to analyze the accurate transcriptions generated by Azure AI Speech. Gaining insights from call center conversations allows businesses to better understand their customer needs, product feedback and support trends and to improve the customer experience. Using our post-call analytics template, customers can quickly set up common insights like call summaries, customer sentiment, and key topics. Customers that want to go beyond these out-of-the-box insights can easily modify the default prompt to extract additional insights, or modify the full prompt flow to fully customize the analytics and extract a wide range of information including, for example, discussion highlights and even predictions of possible conversation flows. With Speech analytics, it is also easy to customize support for multiple languages, accents, domains and scenarios and to scale to large production use. 
Speech analytics is helping our customers gain insights into customer conversations and improve their customer experience, sales, and marketing strategies. It is also a stepping stone for multi-modal data analysis, which will enable richer and deeper insights from different types of data in the future. Speech Processing Solutions (Philips Dictation) is building an example suite of technologies using Azure AI services, including Speech analytics. Speech analytics will be available for developers to try out in June. To learn more, try it out in the Azure AI Studio. Fast transcription Today, we are also announcing Fast Transcription API in preview. The API, part of the Azure AI Speech family, provides the means to transcribe audio files of up to 200 MB in seconds through a simple REST call. Customers want to enable scenarios where obtaining the transcript quickly is paramount. They want the transcript as soon as an interview finishes, or a phone call completes, for instance. This API is a game changer for transcription at large. Using a synchronous REST API call, it can now transcribe up to 40x faster than real time, producing, for example, a transcript of a 10-minute audio file in 15 seconds without sacrificing accuracy. The API provides a simple but powerful way to transcribe audio and opens the door to a new set of scenarios, one of which is ‘agent note taking’ within call centers. Efficient note taking A typical agent working in a call center spends 3 to 5 minutes after each call creating notes. Fast Transcription API in combination with Azure OpenAI Service can automate this task, giving thousands of hours of work back to the call center. Medical practitioners that record conversations with patients can analyze these recordings in seconds. Similarly, media and content creators can analyze and extract insights from podcasts or interviews as soon as they complete. 
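The figures quoted for Fast Transcription can be checked with a little arithmetic. The sketch below is only illustrative: the function names are made up for this example, the 40x factor is treated as a best-case constant, and the volume of 100,000 calls is a hypothetical number that does not come from the announcement.

```python
def transcription_seconds(audio_seconds: float, speedup: float = 40.0) -> float:
    """Best-case processing time when transcribing `speedup` times faster than real time."""
    return audio_seconds / speedup


def note_taking_hours(calls: int, minutes_per_call: float) -> float:
    """Total agent time spent writing post-call notes, in hours."""
    return calls * minutes_per_call / 60.0


# A 10-minute recording at up to 40x real time: 600 s / 40 = 15 s.
print(transcription_seconds(10 * 60))

# Hypothetical volume: 100,000 calls at 3 to 5 minutes of notes each.
low = note_taking_hours(100_000, 3)
high = note_taking_hours(100_000, 5)
print(f"{low:.0f} to {high:.0f} agent hours")
```

At that assumed volume, automating the notes would return roughly 5,000 to 8,300 agent hours, which is the order of magnitude behind the "thousands of hours" claim.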
IntelePeer simplifies communications automation through advanced AI-powered solutions, helping businesses and contact centers reduce costs and enrich the customer experience. "The performance of Microsoft’s FAST API for offline transcription far supersedes the competition. When comparing the same sample corpus, FAST API performed the best among the alternative services tested. It shined on low quality audio transcription, delivering results 70% better than other vendors." - Sergey Galchenko, CTO, IntelePeer. Parloa, a software development company building a contact center AI platform for the next generation of customer service in enterprises, has been using the Fast Transcription API in private preview. "FAST Transcription API provides the fastest, most accurate and most cost-effective option in the Transcription market" - CTO, Parloa. OPPO, a global technology brand known for its innovative smartphones and smart devices, is using Azure AI speech-to-text, Fast Transcription and Azure AI text-to-speech to pilot new customer experiences on their new AI phone. Read this blog to learn more. Fast Transcription API will be available to developers starting in June 2024. Stay tuned for more. Video Translation Today, we are announcing the availability, in preview, of Video Translation, a groundbreaking service designed to transform the way businesses localize their video content. The new service offers developers an efficient and seamless solution to address the rising demand for translating video content and overcoming language barriers, allowing content owners to reach a broader audience. Whether it's for educational videos, marketing campaigns, or entertainment content, Video Translation ensures your message is heard in any of the supported languages. 
The service enables developers to translate content in 10 language pairs with prebuilt neural voices and content editing features, or by using the personal voice capability, which is a limited access feature. Learn more about Video Translation in the studio and try it out with your own videos. Vimeo is on a mission to simplify making, managing, and sharing video, all in a single, easy-to-use platform. "Vimeo has been working closely with Microsoft video translation and is excited about the use cases it will unlock for customers worldwide." - Ashraf Alkarmi, Vimeo Chief Product Officer. Read this blog to learn more about video translation. Multi-lingual speech translation We are also announcing new speech translation enhancements in Azure AI Speech. We are introducing multiple language detection with the ability to detect language switches among the supported languages in the same audio stream, automatic language detection eliminating the need for developers to specify input languages, and integrated custom translation to adapt the translation to your domain-specific vocabulary. With these capabilities, developers no longer need to specify the input language, can handle language switches within the same session, and support live streaming translations into target languages. This capability is especially helpful for captioning use cases. Captioning is the act of adding text to audio or video content, to make it more accessible and comprehensible for people who have hearing difficulties, or who speak a different language. Captioning is not only a legal obligation in many countries, but also a social duty and a good practice for inclusion. Content creators can now attract a broader and more diverse audience and improve the user experience and engagement effortlessly. Check out how iTourTranslator has integrated multi-lingual speech translation in their AR glasses. Read this blog to learn more about multi-lingual speech translation. 
Announcing general availability of personal voice Another aspect of our Speech service is the natural voices it offers. Customers use the platform to create realistic and natural-sounding voices for avatars, chatbots and IVRs. With Azure AI Speech, you can either use an existing voice model, choosing from a wide variety of voices and styles, or create your own custom voice using your own data and recordings. Today, we are also announcing the general availability of a new personal voice feature in Azure AI Speech. It is available as limited access to ensure appropriate guardrails and avoid misuse. This feature allows users to create an AI voice in a few seconds by providing just a short speech sample as the audio prompt. This feature can be used for various use cases, such as personalizing the voice experience for a chatbot, or translating video content into different languages with the actor’s native voice. Read this blog to learn about customer examples and demos, and the responsible AI practices that are implemented, such as watermarking and usage policies. In conclusion, our powerful and versatile platform helps customers combine speech input and output with other AI capabilities. This enables developers to create high-quality workloads for new scenarios. Whether you need insights into human conversations, live or recorded captions, or realistic and natural-sounding voices for your avatars, chatbots, or IVRs, Azure AI helps customers deliver fast, reliable and customizable solutions.
Multilingual Chatbot with Azure AI Studio, Phi-3 Mini, GPT-4 and Azure AI Translator
In today's globalized world, businesses and developers are increasingly seeking to create applications that can interact seamlessly with users across different languages. A multilingual chatbot is a powerful tool for achieving this, allowing for real-time interaction in the users' native languages. This guide will introduce you to creating a sophisticated chatbot using Azure AI Studio, a Phi-3-mini deployment for the chatbot framework, GPT-4 for the assistant framework, and Azure AI Translator for real-time translation to improve Phi-3-mini and GPT-4 response quality for international languages.
Imagine, Integrate, Innovate: Join Microsoft's GenAI Hackathon - LIVE NOW!
Imagine, Integrate, Innovate: Build with Azure AI to revolutionize multimodal experiences in this virtual GenAI hackathon. In the lead-up to Microsoft Build, our flagship developer conference, we’re going big on multimodal building with our developer community by launching Microsoft's GenAI Hackathon on Devpost, live now until May 6th! With Azure AI, you can blend the best of various AI technologies to create more dynamic, versatile, and responsible applications that make a big impact in the world. Whether you’re a pro or just starting out, there’s something for you.
Azure AI Translator announces new features as container offering.
Seattle, April 17, 2024: Today, we are pleased to announce the release of document translation (preview) and transliteration features for Azure AI Translator containers. All Translator container customers will get these new features automatically as part of the update. Translator containers provide users with the capability to host the Azure AI Translator API on their own infrastructure and include all libraries, tools, and dependencies needed to run the service in any private, public, or personal computing environment. They are isolated, lightweight, portable, and are great for implementing specific security or data governance requirements. As of today’s release, the following operations are now supported when using Azure AI Translator containers: Text translation: Translate text phrases between supported source and target language(s) in real time. Text transliteration: Convert text in a language from one script to another script in real time, e.g. converting Russian language text written in Cyrillic script to Latin script. Document translation (Preview): Translate a document between supported source and target languages while preserving the original document’s content structure and format. When to consider using Azure AI Translator containers? You may want to consider Azure AI Translator containers in cases where: There are strict data residency requirements to ensure that sensitive information remains within the company’s security boundary. You work in industries such as government, military, banking, and security enforcement where the ability to translate data without exposing it to external networks is a must. You require the ability to maintain continuous translation capabilities while operating in disconnected environments or with limited internet access. Optimization, cost management, and the flexibility to run on-premises with existing infrastructure are priorities. Getting started with Translator container. Translator containers are a gated offering. 
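The text translation and transliteration operations listed above can be sketched as plain HTTP requests. This is a minimal illustration rather than the official client: it assumes the container exposes the same version 3.0 `translate` and `transliterate` routes as the cloud Translator API, and that it is listening on `http://localhost:5000`, a host and port chosen here purely for the example. The helper name `build_request` is hypothetical. The requests are only constructed, not sent, so the snippet runs without a container.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: a locally hosted Translator container; adjust host and port to your setup.
BASE_URL = "http://localhost:5000"


def build_request(path: str, params: dict, text: str) -> Request:
    """Build (but do not send) a POST request in the Translator v3.0 request shape."""
    query = urlencode({"api-version": "3.0", **params})
    body = json.dumps([{"Text": text}]).encode("utf-8")
    return Request(
        f"{BASE_URL}{path}?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Text translation: English to French.
translate = build_request("/translate", {"from": "en", "to": "fr"}, "Hello, world")

# Text transliteration: Russian written in Cyrillic script to Latin script.
transliterate = build_request(
    "/transliterate",
    {"language": "ru", "fromScript": "Cyrl", "toScript": "Latn"},
    "Привет",
)

print(translate.full_url)
print(transliterate.full_url)
```

To actually send one of these, pass it to `urllib.request.urlopen` once a container is running and you have confirmed its address.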
You need to request container access and get approved. Refer to the prerequisites for a more detailed breakdown. How do I get charged? The document translation and transliteration features are charged at different rates, similar to the cloud offering. Connected container: You're billed monthly at the pricing tier of the Azure AI Translator resource, based on usage and consumption. Below is an example of document translation billing metadata transmitted by a Translator connected container to Azure for billing.
    {
      "apiType": "texttranslation",
      "id": "f78748d7-b3a4-4aef-8f29-ddb394832219",
      "containerType": "texttranslation",
      "containerVersion": "1.0.0+2d844d094c930dc12326331b3e49515afa3635cb",
      "containerId": "4e2948413cff",
      "meter": {
        "name": "CognitiveServices.TextTranslation.Container.OneDocumentTranslatedCharacters",
        "quantity": 27.0
      },
      "requestTime": 638470710053653614,
      "customerId": "c2ab4101985142b284217b86848ff5db"
    }
Disconnected container: As shown in the usage records example below, the aggregated value of ‘Billed Unit’ corresponding to the meters ‘One Document Translated Characters’ and ‘Translated Characters’ is counted towards the characters you licensed for your disconnected container usage.
    {
      "type": "CommerceUsageResponse",
      "meters": [
        {
          "name": "CognitiveServices.TextTranslation.Container.OneDocumentTranslatedCharacters",
          "quantity": 1250000,
          "billedUnit": 1875000
        },
        {
          "name": "CognitiveServices.TextTranslation.Container.TranslatedCharacters",
          "quantity": 1250000,
          "billedUnit": 1250000
        }
      ],
      "apiType": "texttranslation",
      "serviceName": "texttranslation"
    }
References User documentation Pricing Send your feedback to mtfb@microsoft.com
Conversational Bots 2.0 – Setting a new paradigm
The evolution of AI chatbots is transforming user interactions. Powered by advanced Azure AI, these multi-modal bots can process and respond to various inputs like text, images, and voice. They offer enhanced support and seamless navigation, making them invaluable for improving user experiences.