Microsoft Ignite 2021
Announcing General Availability of Speaker Recognition
Microsoft Launches Speaker Recognition in Limited Access

Today, November 3rd, 2021, at Microsoft Ignite, we are announcing general availability (GA) of speaker recognition, a capability of Azure Cognitive Services for Speech. Speaker recognition can now be used to provide convenient verification and personalization of speakers by their unique voice characteristics in a wide range of solutions. As part of Microsoft's commitment to responsible AI, we are designing and releasing speaker recognition with the intention of protecting the rights of individuals and society and fostering transparent human-computer interaction. For this reason, access to and use of Microsoft's speaker recognition service is limited via a customer application process.

How Does it Work?

Once you have provided audio training data for a single speaker, the speaker recognition service creates an enrollment profile based on the unique characteristics of the speaker's voice (also known as a voice signature). You can then cross-check audio voice samples against this profile to verify that the speaker is the same person (speaker verification), or against a group of enrolled speaker profiles to see if the voice matches any profile in the group (speaker identification).

Speaker Verification

This service can be used to verify speakers for secure, fluid customer engagements in a wide range of use cases, such as call centers or interactive voice response (IVR) systems. Speaker verification can be either text-dependent or text-independent. With text-dependent verification, a speaker says a passphrase to enroll their voice, and must repeat that same passphrase when attempting to be verified by the speaker verification service.
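Conceptually, verification reduces to comparing features extracted from a voice sample against the enrolled voice signature and accepting the speaker when the similarity clears a threshold. The sketch below is purely illustrative: the toy feature vectors, cosine similarity, and threshold are assumptions for the example, since the service performs this matching server-side and does not expose its internal representation.

```python
import math

def cosine_similarity(a, b):
    # Directional similarity between two feature vectors (1.0 = identical).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def verify(sample_features, enrolled_signature, threshold=0.85):
    # Accept only if the sample is close enough to the enrolled signature.
    score = cosine_similarity(sample_features, enrolled_signature)
    return score >= threshold

# Toy voice signature created at enrollment, then two later samples.
signature = [0.9, 0.1, 0.4]
accepted = verify([0.88, 0.12, 0.41], signature)  # same speaker
rejected = verify([0.10, 0.90, 0.20], signature)  # different speaker
```

In practice the threshold trades off false accepts against false rejects; the real service manages this for you.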
Text-independent speaker verification also requires speakers to say an activation passphrase as part of the enrollment process, but after enrollment, speakers can use everyday language when attempting to be verified by the service. For both text-dependent and text-independent verification, the speaker's voice is enrolled by saying a passphrase from a set of predefined phrases. If the passphrase matches, the voice signature is created based on the speaker's unique biometric voice characteristics. To help prevent misuse of this service, Microsoft requires that customers actively involve users in enrollment through this activation step. The activation step indicates the speaker's active participation in creating their voice signature and is intended to help avoid the scenario in which speakers are enrolled without their awareness.

Speaker Identification

Speaker identification is used to determine an unknown speaker's identity within a group of enrolled speakers. It enables you to attribute speech to individual speakers and unlock value from scenarios with multiple speakers. Speaker identification is text-independent: there are no restrictions on what the speaker says in the audio, aside from the initial activation passphrase required to activate enrollment.

Data Security and Privacy

Speaker enrollment data, including the speech audio for enrollment and the voice signature features, is stored in a secured system. The enrollment audio is used only when the algorithm is upgraded and the features need to be extracted again. The service does not retain the speech recording or the extracted voice features that are sent to it during the recognition phase. You control how long data is retained: you can create, update, and delete enrollment data for individual speakers through API calls.
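The speaker identification step described above can be pictured as a closest-match search over the group of enrolled profiles. The sketch below is illustrative only; the toy feature vectors, Euclidean distance, and distance cutoff are assumptions for the example, and the real service performs this matching server-side.

```python
import math

def identify(sample_features, enrolled_profiles, max_distance=0.5):
    # Score the sample against every enrolled voice signature and return
    # the closest profile id, or None when no profile is close enough.
    best_id, best_dist = None, max_distance
    for profile_id, signature in enrolled_profiles.items():
        dist = math.dist(sample_features, signature)
        if dist <= best_dist:
            best_id, best_dist = profile_id, dist
    return best_id

# Toy voice signatures captured at enrollment (hypothetical profile ids).
profiles = {
    "profile-a": [0.9, 0.1, 0.4],
    "profile-b": [0.1, 0.9, 0.2],
}

match = identify([0.88, 0.12, 0.41], profiles)  # closest to profile-a
no_match = identify([0.5, 0.5, 0.5], profiles)  # too far from everyone
```

Returning no match when nothing is close enough matters here: an unknown voice should not be forced onto the nearest enrolled speaker.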
When the subscription is deleted, all the speaker enrollment data associated with the subscription is also deleted. As with all Azure Cognitive Services resources, developers should ensure that they have received the appropriate permissions from users for Speaker Recognition.

Limited Access to Speaker Recognition

Speaker Recognition requires registration, and Microsoft may limit access based on certain eligibility criteria. Customers who wish to use this service are required to submit an intake form. Access to Speaker Recognition is subject to Microsoft's sole discretion based on eligibility criteria and a vetting process, and Microsoft may require customers to reverify this information periodically. Start from here to understand more about responsible use of Speaker Recognition.

Empowering Microsoft Partner [24]7.ai with Speaker Recognition

Microsoft partner [24]7.ai™ makes every customer-brand interaction more satisfying and cost-efficient, driving customer loyalty, sales growth, and agent productivity for the world's leading brands. The company combines deep vertical expertise, human insight, and years of contact center experience to ensure consistent, easy, personalized conversations across channels and time. [24]7.ai is transforming the digital customer experience (CX) through its cloud-based customer engagement platform, agent services, and managed services.
"We partnered with Microsoft in voice biometrics, not just for the company's technology, which, based on our testing, is top notch, but also because Microsoft is known for safeguarding the security and privacy of its customers—a key consideration with speaker recognition software." —John Gaffney, VP, Voice Commerce Product Management, [24]7.ai

[24]7.ai™ incorporates Speaker Recognition technology into its [24]7 Voices™ product, an interactive voice response (IVR) platform that supports natural, intent-based customer interactions, boosts self-serve automation, and blends seamlessly with voice agents and digital channels. [24]7 Voices is itself part of the company's [24]7.ai Engagement Cloud™ platform, a recognized industry leader in conversational AI. By using Speaker Recognition, [24]7.ai provides the following benefits to its [24]7 Voices clients and their customers:

A better customer experience: Voice biometrics enables more secure and streamlined customer journeys. Because it gives organizations confidence that the speaker is who they say they are, [24]7.ai clients avoid having to transfer customers to an agent for additional security screening. That means less hassle and wasted time for callers—and the longer they stay in the IVR system, the greater the opportunity for them to self-serve.

Stronger authentication and increased security: Voice biometrics drastically reduces the risk of the theft and hacks that are prevalent with passwords and PINs. Again, that's a win for [24]7.ai clients and their customers.

Significant cost savings: Voice biometrics reduces operational and fraud costs and, by increasing IVR containment, enables [24]7.ai clients to provide more self-serve customer options within the IVR.
It decreases handling time by agents, minimizes transfers, and reduces the burden of maintaining and resetting passwords—which cuts down the IT time and manpower costs of working on backend systems and integrations. Although [24]7.ai clients access the speaker identification and verification components separately, the company considers them as living under a single voice biometrics umbrella.

It's a Win-Win-Win

In a nutshell: every time [24]7.ai clients reduce agent talk time, it saves them money AND improves their customers' experience. The trick is to do it really well, and safely. As noted in the quote above, [24]7.ai has great confidence in the Speaker Recognition technology. But technology alone isn't enough. Deploying voice biometrics in today's regulatory environment is challenging, to say the least, and in the end [24]7.ai was won over by Microsoft's long-standing and well-known commitment to privacy and security. Learn more about the building blocks of Microsoft's responsible AI program and Azure AI – Cognitive Services, Speaker Recognition.

Introducing Azure Cognitive Service for Language
Azure Cognitive Services has historically had three distinct NLP services that solve different but related customer problems: Text Analytics, LUIS, and QnA Maker. As these services have matured and customers now depend on them for business-critical workloads, we wanted to take a step back and evaluate the most effective path forward over the next several years for delivering our roadmap of a world-class, state-of-the-art NLP platform-as-a-service.

Each service was initially focused on a set of capabilities supporting distinct customer scenarios: LUIS for custom language models, most often supporting bots; Text Analytics for general-purpose pre-built language services; and QnA Maker for knowledge-based question answering. As AI accuracy has improved, the cost of offering more sophisticated models has decreased, and customers have increased their adoption of NLP for business workloads, we are seeing more and more overlapping scenarios where the lines between the three distinct services are blurred. As such, the most effective path forward is a single unified NLP service in Azure Cognitive Services.

Today we are pleased to announce the availability of Azure Cognitive Service for Language. It unifies the capabilities of Text Analytics, LUIS, and the legacy QnA Maker service into a single service. The key benefits include:

- Easier to discover and adopt features.
- Seamlessness between pre-built and custom-trained NLP.
- Easier to build NLP capabilities once and reuse them across application scenarios.
- Access to multilingual state-of-the-art NLP models.
- Simpler to get started through consistency in APIs, documentation, and samples across all features.
- More billing predictability.

The unified Language service will not affect any existing applications. All existing capabilities of the three services will continue to be supported until we announce a deprecation timeline for the existing services (which would be no less than one year out).
However, new features and innovation will happen only in the unified Language service. For example, question answering and conversational language understanding (CLU) are available only in the unified service (more details on these features later). As such, customers are encouraged to start making plans to leverage the unified service. More details on migration, including links to resources, are provided below. Here is everything we are announcing today in the unified Language service:

Text Analytics is now the Language service: All existing features of Text Analytics are included in the Language service. Specifically, Sentiment Analysis and Opinion Mining, Named Entity Recognition (NER), Entity Linking, Key Phrase Extraction, Language Detection, Text Analytics for health, and Text Summarization are all part of the Language service as they exist today. Text Analytics customers don't need to make any migrations or updates to their in-production or in-development apps; the unified service is backward compatible with all existing Text Analytics features. The key difference is that when creating a new resource in the Azure portal UI, you will now see the resource labeled "Language" rather than "Text Analytics".

Introducing conversational language understanding (preview) - the next generation of LUIS: Language Understanding (LUIS) has been one of our fastest-growing Cognitive Services, with customers deploying custom language models to production for scenarios ranging from command-and-control IoT devices and chat bots to contact center agent-assist scenarios. The next phase in the evolution of LUIS is conversational language understanding (CLU), which we are announcing today as a preview feature of the new Language service. CLU introduces multilingual transformer-based models as the underlying model architecture, resulting in significant accuracy improvements over LUIS.
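At its core, a CLU project maps an utterance to intents with confidence scores. The toy sketch below shows only the shape of that prediction step; the keyword-overlap scoring, intent names, and result keys are invented for illustration and bear no relation to the transformer models the service actually uses.

```python
def predict_intent(utterance, intent_keywords):
    # Score each intent by the fraction of its keywords found in the
    # utterance; a stand-in for a real model's learned scoring.
    words = set(utterance.lower().split())
    scores = {
        intent: len(words & set(keywords)) / len(keywords)
        for intent, keywords in intent_keywords.items()
    }
    top = max(scores, key=scores.get)
    return {"topIntent": top, "confidence": scores[top]}

# Hypothetical intents for a command-and-control scenario.
intents = {
    "TurnOnLight": ["turn", "on", "light"],
    "SetAlarm": ["set", "alarm", "wake"],
}
result = predict_intent("please turn on the light", intents)
```

An application typically branches on the top intent and falls back to a default handler when the confidence is low.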
Also new as part of CLU is the ability to create orchestration projects, which let you configure a project to route requests to multiple customizable language services, like question answering knowledge bases, other CLU projects, and even classic LUIS applications. Visit here to learn more. If you are an existing LUIS customer, we are not requiring you to migrate your application to CLU today. However, as CLU represents the evolution of LUIS, we encourage you to start experimenting with CLU in preview and give us feedback on your experience. You can import a LUIS JSON application directly into CLU to get started.

GA of question answering: In May 2021, we launched the preview of custom question answering. Today we are announcing the general availability (GA) of question answering as part of the new Language service. If you are just getting started with building knowledge bases that are queryable with natural language, visit here to get started. If you want to know more about migrating legacy QnA Maker knowledge bases to the Language service, see here. Your existing QnA Maker knowledge bases will continue to work; we are not requiring customers to migrate from QnA Maker at this time. However, question answering represents the evolution of QnA Maker, and new features will be developed only for the unified service. As such, we encourage you to plan a migration from legacy QnA Maker if this applies to you.

Introducing custom named entity recognition (preview): Documents contain an abundance of valuable information, and enterprises rely on extracting that information to easily filter and search through those documents. Using the standard Text Analytics NER, they could extract known types like person names, geographical locations, datetimes, and organizations. However, much information of interest is more specific than the standard types. To unlock these scenarios, we're happy to announce custom NER as a preview capability of the new Language service.
The capability allows you to build your own custom entity extractors by providing labeled examples of text to train models. Securely upload your data in your own storage accounts, and label your data in Language Studio. Then deploy and query the custom models to obtain entity predictions on new text. Visit here to learn more.

Introducing custom text classification (preview): While many pieces of information can exist in any given document, the text as a whole can belong to one or more categories. Organizing and categorizing documents is key to data-reliant enterprises. We're excited to announce custom text classification, a preview feature of the Language service, where you can create custom classification models with your own defined classes. Securely upload your data in your own storage accounts, and label your data in Language Studio. Choose between single-label classification, where you label and predict one class for every document, or multi-label classification, which allows you to assign or predict several classes per document. This capability enables automation for incoming text such as support tickets, customer email complaints, or organizational reports. Visit here to learn more.

Language Studio: This is the single destination for experimentation, evaluation, and training of Language AI / NLP in Cognitive Services. With Language Studio you can now try any of our capabilities with a few button clicks. For example, you can upload medical documents and instantly get back all the extracted entities and relations, and you can easily integrate the API into your solution using the Language SDK. You can take it further by training your own custom NER model and deploying it through the easy-to-use interface. Try it out now yourself here.

Several customers are already using Azure Cognitive Service for Language to transform their businesses.
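The distinction between the two custom text classification modes described above (single-label versus multi-label) can be sketched in a few lines. The class names, scores, and threshold below are invented for illustration; in the real service the scores come from the model you train.

```python
def single_label(scores):
    # Single-label classification: exactly one class per document.
    return max(scores, key=scores.get)

def multi_label(scores, threshold=0.5):
    # Multi-label classification: assign every class whose score
    # clears the threshold, so a document can get several classes.
    return sorted(c for c, s in scores.items() if s >= threshold)

# Hypothetical model scores for one support ticket.
scores = {"billing": 0.8, "outage": 0.6, "feedback": 0.2}
one = single_label(scores)
many = multi_label(scores)
```

A ticket that is both a billing complaint and an outage report is the kind of case where multi-label classification is the better fit.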
Here's what two of them had to say:

"We used Azure Cognitive Services and Bot Service to deliver an instantly responsive, personal expert into our customers' pockets. Providing this constant access to help is key to our customer care strategy." —Paul Jacobs, Group Head of Operations Transformation, Vodafone

"Sellers might have up to 100,000 documents associated with a deal, so the time savings can be absolutely massive. Now that we've added Azure Cognitive Service for Language to our tool, customers can potentially compress weeks of work into days." —Thomas Fredell, Chief Product Officer, Datasite

To learn more directly from customers, see the following customer stories:

- Vodafone transforms its customer care strategy with digital assistant built on Azure Cognitive Services
- Progressive Insurance levels up its chatbot journey and boosts customer experience with Azure AI
- Kepro improves healthcare outcomes with fast and accurate insights from Text Analytics for health

On behalf of the entire Cognitive Services Language team at Microsoft, we can't wait to see how Azure Cognitive Service for Language benefits your business!