azure ai services
298 Topics

Extracting Handwritten Corrections with Azure AI Foundry's Latest Tools
In document processing, documents that mix handwritten and typed text present a unique challenge. These documents often also contain handwritten corrections, where certain sections are crossed out and replaced with corrected text. Ensuring that the final extracted content accurately reflects these corrections is crucial for maintaining data accuracy and usability. In our recent work, we explored various tools to tackle this issue, with a particular focus on Document Intelligence Studio and Azure AI Foundry's new Field Extraction Preview feature.

The Challenge

Documents with mixed content types (handwritten and typed) can be particularly troublesome for traditional OCR (Optical Character Recognition) systems. These systems often struggle to recognize handwritten text accurately, especially when it coexists with typed text. Handwritten corrections add another layer of complexity: the model must distinguish crossed-out text from the corrected text, and it is often unclear which value or values it should pick.

Our Approach

Initial Experiments with Pre-built Models

To address this challenge, we initially turned to Document Intelligence Studio's pre-built invoice model, which provided a solid starting point. However, it often extracted both the crossed-out value and the new handwritten value under the same field. In addition, it did not always match the correct key to the field value.

Custom Neural Model Training

Next, we trained a custom neural model in Document Intelligence Studio. Custom neural models use deep learning to predict key document elements and allow for further adjustments and refinements. It is recommended to use between 100 and 1,000 sample files to achieve more accurate and consistent results. When training models, it is crucial to use text-based PDFs (PDFs with selectable text), as they provide better data for training.
The model's accuracy improves with more varied training data, including different types of handwritten edits. Without enough training data or variance, the model may overgeneralize. We therefore uploaded approximately 100 text-based PDFs to Azure AI Foundry and manually corrected the column containing handwritten text. After training on a subset of these files, we built and tested our custom neural model on the training data. The model performed impressively, achieving a 92% confidence score in identifying the correct values. The main drawbacks were the manual effort required for data labeling and the 30 minutes needed to build the model.

During our experiments, we noticed that when extracting fields from a table, labeling and extracting every column, rather than just a few, resulted in higher accuracy. The model predicted better when it had a complete view of the table.

Breakthrough with Document Field Extraction (Preview)

Finally, the breakthrough came when we used the new Document Field Extraction Preview feature from Azure AI Foundry. This feature handled mixed content significantly better and provided a more seamless experience in extracting the necessary information.

Field Description Modification: One of the key steps in our process was modifying the field descriptions within the Field Extraction Preview feature. By providing detailed descriptions of the fields we wanted to extract, we helped the AI better understand the context and nuances of our documents. Specifically, we wanted to make sure that the value extracted for FOB_COST was the handwritten correction, so we wrote in the Field Description: "Ignore strikethrough or 'x'-ed out text at all costs, for example: do not extract red / black pen or marks through text. Do not use stray marks. This field only has numbers."
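As a rough illustration of the field description step, the sketch below shows one way such a field definition could be kept in code and serialized for review. The FOB_COST name and the description text come from the post; the schema layout (the "fields", "type", and "description" keys) and the build_field_schema helper are hypothetical and are not taken from the Field Extraction Preview feature itself.

```python
import json

# Description text as quoted in the post for the FOB_COST field.
FOB_COST_DESCRIPTION = (
    "Ignore strikethrough or 'x'-ed out text at all costs, for example: "
    "do not extract red / black pen or marks through text. "
    "Do not use stray marks. This field only has numbers."
)


def build_field_schema(fields):
    """Map each field name to its type and description.

    The key names used here are assumptions for illustration only,
    not a confirmed schema of the service.
    """
    return {
        "fields": {
            name: {"type": field_type, "description": description}
            for name, (field_type, description) in fields.items()
        }
    }


# One numeric field whose description tells the extractor to skip
# crossed-out values and keep the handwritten correction.
schema = build_field_schema({"FOB_COST": ("number", FOB_COST_DESCRIPTION)})
print(json.dumps(schema, indent=2))
```

Keeping descriptions in one place like this makes it easier to iterate on the wording and compare extraction results between versions.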
Correction Handling: During the extraction process, the AI was able to distinguish between crossed-out text and the handwritten corrections. Whenever a correction was detected, the AI prioritized the corrected text over the crossed-out content, ensuring that the final extracted data was accurate and up to date.

Performance Evaluation: After configuring the settings and field descriptions, we ran several tests to evaluate the performance of the extraction process. The results were impressive, with the AI accurately extracting the corrected text and ignoring the crossed-out sections. This significantly reduced the need for manual post-processing and corrections.

Results

The new Field Extraction Preview feature in Azure AI Foundry exceeded our expectations. The modifications we made to the field descriptions, coupled with the AI's advanced capabilities, resulted in a highly efficient and accurate document extraction process. The AI's ability to handle mixed-content documents and prioritize handwritten corrections over crossed-out text has been a game-changer for our workflow.

Conclusion

For anyone dealing with documents that contain a mix of handwritten and typed text, and where handwritten corrections are present, we highly recommend exploring Azure AI Foundry's Field Extraction Preview feature. The improvements in accuracy and efficiency can save significant time and effort, ensuring that your extracted data is both reliable and usable. As we continue to refine our processes, we look forward to even more advancements in document intelligence technologies.

70 Views, 0 likes, 0 Comments

Introducing Azure AI Agent Service
Introducing Azure AI Agent Service at Microsoft Ignite 2024. Discover how Azure AI Agent Service is revolutionizing the development and deployment of AI agents. This service empowers developers to build, deploy, and scale high-quality AI agents tailored to business needs within hours. With features like rapid development, extensive data connections, flexible model selection, and enterprise-grade security, Azure AI Agent Service sets a new standard in AI automation.

20K Views, 6 likes, 1 Comment

Unlock the Power of AI with Azure AI Foundry!
Imagine having a giant box of LEGO bricks. Each piece is a tool, ready to help you build something incredible, whether it's a robot or a skyscraper. Azure AI Foundry is like that LEGO box, but for creating powerful AI solutions!

All-in-One Place: Just like having all your LEGO pieces in one big box, Azure AI Foundry puts all the tools you need to build AI in one place. This makes it easier and faster to create awesome projects.

Azure AI Foundry is an all-in-one platform that combines the capabilities of various AI tools, including the rebranded Azure AI Studio. It offers everything you need to build, deploy, and manage AI solutions, from machine learning to generative AI models.

Get Started with Azure AI Foundry today

🔍 Azure AI: Which Tool Should You Use? A Quick Guide! 🚀

🤔 When Should You Use Foundry?
✅ Large-Scale Projects: Manage complex, collaborative AI projects involving multiple teams.
✅ Integration Needs: Combine various Azure services, models, and data sources.
✅ Enterprise Governance: Ensure compliance, security, and orchestration for AI initiatives.

🚀 When to Use Azure OpenAI Studio (Instead of Foundry)?
✅ Rapid Prototyping: Quickly test and fine-tune OpenAI models like GPT or DALL-E.
✅ Specific Tasks: Ideal for focused applications like text generation or image classification.
✅ Streamlined Interface: Access an easy-to-use environment for working with OpenAI models directly.

Key Advantages of Using AI Foundry as a One-Stop Shop:

Centralized Hub: Combines various AI tools and services, enabling seamless integration and project management. Supports collaboration across data science, engineering, and business teams.
Model Diversity: Supports different types of models: custom machine learning, OpenAI models, and Azure Cognitive Services. Provides templates and pipelines for different AI scenarios (vision, language, structured data).
End-to-End Lifecycle Management: Handles the entire AI lifecycle: data preparation, model training, deployment, and monitoring. Supports responsible AI practices (fairness, compliance, and transparency).
Scalability and Governance: Designed for enterprise-scale projects with robust security, compliance, and access controls. Facilitates team collaboration and workflow orchestration across departments.

Examples of What You Can Do with Azure AI Foundry

Customer Service Chatbots: Create chatbots that can answer customer questions 24/7, helping businesses provide better service without needing a human to be available all the time.
Image Recognition: Develop programs that can look at pictures and tell you what they see, like identifying objects in a photo or recognizing faces.
Language Translation: Build tools that can translate speech or text from one language to another, making it easier for people from different countries to communicate.
Predictive Maintenance: Create systems that can predict when machines need maintenance before they break down, helping companies save money and avoid downtime.
Personalized Recommendations: Develop AI that can suggest products or content based on what a person likes, similar to how streaming services recommend movies or shows you might enjoy.

🔗 Bottom Line: Azure AI Foundry is your one-stop shop for enterprise-level AI projects, while Azure OpenAI Studio offers simplicity and speed for targeted, smaller tasks. Choose based on your project size, complexity, and goals! 🎯

201 Views, 0 likes, 0 Comments

Phi-3 Vision – Catalyzing Multimodal Innovation
Microsoft's Phi-3 Vision is a new AI model that combines text and image data to deliver smart and efficient solutions. With just 4.2 billion parameters, it offers high performance and can run on devices with limited computing power. From describing images to analyzing documents, Phi-3 Vision is designed to make advanced AI accessible and practical for everyday use. Explore how this model is set to change the way we interact with AI, offering powerful capabilities in a small and efficient package.

30K Views, 5 likes, 2 Comments

Introducing AI-generated voices for Azure neural text to speech service
In this blog, we introduce two new voices created using the latest controllable voice generation technology, a masculine voice named AIGenerate1 and a feminine voice named AIGenerate2, and take a deeper look at the technology behind them.

12K Views, 4 likes, 9 Comments

New technical research is advancing Azure’s Neural Text-to-Speech service
Our latest research innovation, code-named NaturalSpeech, marks a new milestone for neural TTS: for the first time, it shows no significant difference from natural human recordings on a popular TTS dataset (LJSpeech), measured by side-by-side CMOS (comparative mean opinion score).

7.3K Views, 0 likes, 0 Comments

Make your voice chatbots more engaging with new text to speech features
Today we're thrilled to announce Azure AI Speech's latest updates, enhancing text to speech capabilities for a more engaging and lifelike chatbot experience. These updates include:
A wider range of multilingual voices for natural and authentic interactions;
More prebuilt avatar options, with the latest sample code for seamless GPT-4o integration; and
A new text stream API that significantly reduces latency for ChatGPT integration, ensuring smoother and faster responses.

7.5K Views, 2 likes, 1 Comment

Azure AI Speech launches new zero-shot TTS models for Personal Voice
Azure AI Speech Service has upgraded its Personal Voice feature with new zero-shot TTS models. Compared to the initial model, these new models improve the naturalness of synthesized voices and better resemble the speech characteristics of the voice in the prompt.

14K Views, 2 likes, 2 Comments

Creating a branded AI voice that conveys emotions and speaks multiple languages
Today at Microsoft Inspire 2023, we're excited to announce the general availability (GA) of the new multi-style and multi-lingual custom neural voice (CNV) features inside Text to Speech, part of the Azure AI Speech capability.

10K Views, 0 likes, 0 Comments