Text Analytics
21 Topics

Introducing new task-optimized summarization capabilities powered by fine-tuned large language models
We are expanding our use of LLMs to GPT-3.5 Turbo, alongside our proprietary Z-Code++ models, to offer task-optimized summarization that balances output accuracy and cost.

Document Redaction and Sanitization with ChatGPT on Azure
Privacy and security are top priorities for consumers and businesses. Bad actors can steal information while it is in transit between systems, while it is being processed, or even after it has been archived. Large language models have opened the door to more effective ways of sanitizing documents. In this article I will provide some examples of how to sanitize documents using ChatGPT on Azure.

What is Document Redaction and Sanitization?

It is the process of removing or replacing any information that is considered sensitive, private, or confidential.

What types of information need redaction?

The following are some sensitive information types that need redaction:

Personally Identifiable Information (PII): Any information that helps identify a person. Examples include people's names, addresses, social security numbers, and driver's license numbers.

Protected Health Information (PHI): Health-related information such as a patient's medical records, insurance group numbers, and benefit information.

Business Confidential Information: Organizational information such as employee records, biometric records, business-related secrets, contractual information, financial documents, and judicial records.

Using ChatGPT on Azure for Redacting and Sanitizing Documents

Example 1: Sanitize the sensitive information in an invoice

The sample invoice I am going to use is in scanned format. Because ChatGPT currently accepts only plain text as input, we first have to digitize the invoice. The Form Recognizer service is a very effective way of digitizing scanned documents.
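As a rough sketch of this two-step pipeline (extract text with Form Recognizer, then send it to ChatGPT with a redaction instruction) — assuming the azure-ai-formrecognizer and openai Python packages, and hypothetical endpoint/key environment variable names:

```python
import os


def build_redaction_messages(instruction: str, document_text: str) -> list[dict]:
    """Compose a chat prompt that applies a redaction instruction to extracted text."""
    return [
        {"role": "system", "content": "You redact and sanitize sensitive information in documents."},
        {"role": "user", "content": f"{instruction}\n\n{document_text}"},
    ]


def extract_text(path: str) -> str:
    """Digitize a scanned document with Form Recognizer and return its plain text."""
    from azure.ai.formrecognizer import DocumentAnalysisClient
    from azure.core.credentials import AzureKeyCredential

    client = DocumentAnalysisClient(
        endpoint=os.environ["FORM_RECOGNIZER_ENDPOINT"],  # hypothetical variable names
        credential=AzureKeyCredential(os.environ["FORM_RECOGNIZER_KEY"]),
    )
    with open(path, "rb") as f:
        poller = client.begin_analyze_document("prebuilt-invoice", document=f)
    return poller.result().content


if __name__ == "__main__":
    from openai import AzureOpenAI

    invoice_text = extract_text("invoice.png")
    client = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_KEY"],
        api_version="2024-02-01",
    )
    response = client.chat.completions.create(
        model="gpt-35-turbo",  # your Azure OpenAI deployment name
        messages=build_redaction_messages("Redact all references of PII data below:", invoice_text),
    )
    print(response.choices[0].message.content)
```

Any of the prompts shown below can be passed as the instruction; the rest of the pipeline stays the same.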
Below is a screenshot from Form Recognizer with the invoice example. After extracting the data from the invoice using Form Recognizer and doing some cleansing, we will have plain text similar to the example below. You will find ChatGPT's responses to my redaction and sanitization instructions below:

Prompt: Show me all the references of PII below:

Prompt: Redact all references of PII data below:

Prompt: Replace all occurrences of Contoso with LinkedIn:

Prompt: Convert all dates to MON-DD-YYYY format. Show me only dates:

Example 2: Sanitize information in a health insurance card for PHI data

Similar to the example above, I used the Form Recognizer service to extract the information from a health insurance card. Below are some examples of how to sanitize the PHI information.

Prompt: Show me all references of PHI:

Prompt: Redact all references of people names below:

You can continue with various redaction and sanitization activities in the document, such as replacing text, removing text, translating text, converting currencies, etc. All the best!

Note 1: The responses may vary depending on hyperparameters like Temperature, Top Probabilities, etc.

Note 2: I also encourage you to try the same prompts on the text-davinci-003 model as well.

Introducing Native Document Support for PII Detection (Public Preview)
This capability can now identify, categorize, and redact sensitive information (PII - Personally Identifiable Information) in unstructured text directly from complex documents, allowing users to ensure data privacy compliance within a streamlined workflow. It effortlessly detects and safeguards crucial information, adhering to the highest standards of data privacy and security.

Integrating AI: Best Practices and Resources to Get Started
What are the problems you can solve with AI? How do you experiment and prototype? This article aims to help you decide if and how to integrate AI into your applications, get you started with Azure's ready-to-use AI solutions, Cognitive Services, and answer your most frequent questions when getting started.

What's new in Azure AI Language | BUILD 2023
We are announcing more new features and capabilities: General Availability (GA) of document and conversation summarization, and public preview of the following features: interactive custom summarization powered by Z-Code++ and Azure OpenAI Service, custom sentiment analysis, enhanced Text Analytics for health, Named Entity Recognition (NER) and Personally Identifiable Information (PII) detection, increased language coverage, enhanced integration capability, and expanded container support.

Supercharge your AI skills with #30DaysOfAzureAI, launching today with daily posts throughout April
#30DaysOfAzureAI launches today, with daily posts throughout April. Topics include building intelligent apps with Azure OpenAI and the Azure AI SDKs, Machine Learning, MLOps, AI for Accessibility, Responsible AI, and more.

Introducing Azure Cognitive Service for Language
Azure Cognitive Services has historically had three distinct NLP services that solve different but related customer problems: Text Analytics, LUIS, and QnA Maker. As these services have matured and customers now depend on them for business-critical workloads, we wanted to take a step back and evaluate the most effective path forward over the next several years for delivering our roadmap of a world-class, state-of-the-art NLP platform-as-a-service.

Each service was initially focused on a set of capabilities supporting distinct customer scenarios: for example, LUIS for custom language models most often supporting bots, Text Analytics for general-purpose pre-built language services, and QnA Maker for knowledge base question answering. As AI accuracy has improved, the cost of offering more sophisticated models has decreased, and customers have increased their adoption of NLP for business workloads, we are seeing more and more overlapping scenarios where the lines are blurred between the three distinct services. As such, the most effective path forward is a single unified NLP service in Azure Cognitive Services.

Today we are pleased to announce the availability of Azure Cognitive Service for Language. It unifies the capabilities of Text Analytics, LUIS, and the legacy QnA Maker service into a single service. The key benefits include:

Easier to discover and adopt features.
Seamlessness between pre-built and custom-trained NLP.
Easier to build NLP capabilities once and reuse them across application scenarios.
Access to multilingual state-of-the-art NLP models.
Simpler to get started through consistency in APIs, documentation, and samples across all features.
More billing predictability.

The unified Language service will not affect any existing applications. All existing capabilities of the three services will continue to be supported until we announce a deprecation timeline for the existing services (which would be no less than one year out).
However, new features and innovation will happen only in the unified Language service. For example, question answering and conversational language understanding (CLU) are available only in the unified service (more details on these features later). As such, customers are encouraged to start making plans to leverage the unified service. More details on migration, including links to resources, are provided below. Here is everything we are announcing today in the unified Language service:

Text Analytics is now the Language service: All existing features of Text Analytics are included in the Language service. Specifically, Sentiment Analysis and Opinion Mining, Named Entity Recognition (NER), Entity Linking, Key Phrase Extraction, Language Detection, Text Analytics for health, and Text Summarization are all part of the Language service as they exist today. Text Analytics customers don't need to do any migrations or updates to their in-production or in-development apps. The unified service is backward compatible with all existing Text Analytics features. The key difference is that when creating a new resource in the Azure portal UI, you will now see the resource labeled "Language" rather than "Text Analytics".

Introducing conversational language understanding (preview), the next generation of LUIS: Language Understanding (LUIS) has been one of our fastest-growing Cognitive Services, with customers deploying custom language models to production for scenarios ranging from command-and-control IoT devices and chat bots to contact center agent-assist scenarios. The next phase in the evolution of LUIS is conversational language understanding (CLU), which we are announcing today as a preview feature of the new Language service. CLU introduces multilingual transformer-based models as the underlying model architecture and delivers significant accuracy improvements over LUIS.
Also new as part of CLU is the ability to create orchestration projects, which allow you to configure a project to route to multiple customizable language services, like question answering knowledge bases, other CLU projects, and even classic LUIS applications. Visit here to learn more. If you are an existing LUIS customer, we are not requiring you to migrate your application to CLU today. However, as CLU represents the evolution of LUIS, we encourage you to start experimenting with CLU in preview and give us feedback on your experience. You can import a LUIS JSON application directly into CLU to get started.

GA of question answering: In May 2021, we launched the preview of custom question answering. Today we are announcing the General Availability (GA) of question answering as part of the new Language service. If you are just getting started with building knowledge bases that are queryable with natural language, visit here to get started. If you want to know more about migrating legacy QnA Maker knowledge bases to the Language service, see here. Your existing QnA Maker knowledge bases will continue to work. We are not requiring customers to migrate from QnA Maker at this time. However, question answering represents the evolution of QnA Maker, and new features will be developed only for the unified service. As such, we encourage you to plan a migration from legacy QnA Maker if this applies to you.

Introducing custom named entity recognition (preview): Documents contain an abundance of valuable information, and enterprises rely on pulling out that information to easily filter and search through those documents. Using the standard Text Analytics NER, they could extract known types like person names, geographical locations, datetimes, and organizations. However, much information of interest is more specific than the standard types. To unlock these scenarios, we're happy to announce custom NER as a preview capability of the new Language service.
This capability allows you to build your own custom entity extractors by providing labeled examples of text to train models. Securely upload your data in your own storage accounts and label it in Language Studio. Then deploy and query the custom models to obtain entity predictions on new text. Visit here to learn more.

Introducing custom text classification (preview): While many pieces of information can exist in any given document, the whole piece of text can belong to one or more categories. Organizing and categorizing documents is key for data-reliant enterprises. We're excited to announce custom text classification, a preview feature of the Language service, where you can create custom classification models with your own defined classes. Securely upload your data in your own storage accounts and label it in Language Studio. Choose between single-label classification, where you label and predict one class for every document, and multi-label classification, which allows you to assign or predict several classes per document. This feature enables automation over incoming pieces of text such as support tickets, customer email complaints, or organizational reports. Visit here to learn more.

Language Studio: This is the single destination for experimentation, evaluation, and training of Language AI / NLP in Cognitive Services. With Language Studio you can now try any of our capabilities with a few button clicks. For example, you can upload medical documents and instantly get back all the extracted entities and relations, and you can easily integrate the API into your solution using the Language SDK. You can take it further by training your own custom NER model and deploying it through the easy-to-use interface. Try it out for yourself here.

Several customers are already using Azure Cognitive Service for Language to transform their businesses.
Here's what two of them had to say:

"We used Azure Cognitive Services and Bot Service to deliver an instantly responsive, personal expert into our customers' pockets. Providing this constant access to help is key to our customer care strategy." - Paul Jacobs, Group Head of Operations Transformation, Vodafone

"Sellers might have up to 100,000 documents associated with a deal, so the time savings can be absolutely massive. Now that we've added Azure Cognitive Service for Language to our tool, customers can potentially compress weeks of work into days." - Thomas Fredell, Chief Product Officer, Datasite

To learn more directly from customers, see the following customer stories:

Vodafone transforms its customer care strategy with digital assistant built on Azure Cognitive Services
Progressive Insurance levels up its chatbot journey and boosts customer experience with Azure AI
Kepro improves healthcare outcomes with fast and accurate insights from Text Analytics for health

On behalf of the entire Cognitive Services Language team at Microsoft, we can't wait to see how Azure Cognitive Service for Language benefits your business!
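To get a first feel for the unified service's SDK, here is a minimal sketch of calling its pre-built PII detection — assuming the azure-ai-textanalytics Python package and hypothetical endpoint/key environment variable names:

```python
import os


def format_pii_report(entities) -> str:
    """Render (text, category) pairs from a PII result as a readable report."""
    return "\n".join(f"{category}: {text}" for text, category in entities)


def detect_pii(documents: list[str]) -> None:
    """Call the unified Language service's pre-built PII detection."""
    from azure.ai.textanalytics import TextAnalyticsClient
    from azure.core.credentials import AzureKeyCredential

    client = TextAnalyticsClient(
        endpoint=os.environ["LANGUAGE_ENDPOINT"],  # hypothetical variable names
        credential=AzureKeyCredential(os.environ["LANGUAGE_KEY"]),
    )
    for result in client.recognize_pii_entities(documents):
        if not result.is_error:
            print(result.redacted_text)  # the input with PII spans masked
            print(format_pii_report((e.text, e.category) for e in result.entities))


if __name__ == "__main__":
    detect_pii(["Contact John Doe at 555-0100 about claim #12345."])
```

Because the service is unified, swapping in sentiment analysis or key phrase extraction is just a different method on the same client.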