document processing
eSignature now available worldwide
We're excited to share that eSignature for Microsoft 365 is now available worldwide* on Microsoft 365 public clouds, allowing signatures to be requested on all PDFs and Word documents stored in SharePoint. Previously, this service was limited to the US, UK, EU, Canada, and Australia-Pacific regions. Worldwide availability also includes the integration of Adobe Acrobat Sign and Docusign eSignature for PDFs on SharePoint.

Signatures made simple

We're also excited to announce several updates to eSignature that make requesting and managing signed documents easier:

- eSignature for Word (Desktop) is now available to Word users on the Microsoft 365 Current Channel.
- eSignature for Word and PDFs is now available worldwide (previously it was limited to certain regions).
- We're adding a new free-text field capability for Word electronic signature requests.
- We've enhanced the automatic save-back of signed PDFs with the Adobe and Docusign signature services.
- Lastly, if you've been using "SharePoint eSignature", you'll notice it will soon have a new name in the UI, "Microsoft 365 eSignature", reflecting that it's a core part of the Microsoft 365 experience.

Request electronic signatures from Word - Launched

Microsoft Word now has built-in eSignature support for everyone on the Microsoft 365 Current Channel. This means you can initiate the signing process without ever leaving Word: you can add signature fields and send documents out for signature directly from Word, with no more toggling between multiple applications. Watch the eSignature for Word video to learn more.

How it works:

1. In Word (Desktop), open a document that's stored in a SharePoint site with eSignature enabled.
2. Once the feature is enabled by your administrator, you'll find an "eSignature fields" option on the Insert tab of the ribbon. Click it to insert signature fields where needed in your document.
3. Once you've placed the fields, enter the recipients' email addresses (and optionally include a custom message), then hit "Create request". Word automatically converts the document to PDF for signing and sends it out, all in one go.

The signed PDF is saved right back into the SharePoint library next to your original Word file. In other words, your document never leaves your Microsoft 365 tenant during the process. And because your document and the signed copies remain in Microsoft 365, you retain full control over security and compliance. There's no need to send files through external channels or as email attachments.

New free-text field for Word documents – rolling out in October

We've also added a new field type to make your signature requests more powerful: free-text fields. These let you include additional input fields in your documents for signers to fill out when they sign, beyond just their signature or initials. A free-text field is a blank text field you can place in a document (just like a signature line or date field). When your signer is completing the signature request, they will see this field and can type in the information you requested. You can use it to gather any extra details you need. For example, you might ask the signer to provide their job title, an ID number, a mailing address, or any other relevant information while signing the document. This feature is great for scenarios where a signature alone isn't enough and you previously had to chase down extra information via email or separate forms. Now you can capture that information in one step, during the signing process itself.
Automatic save to original location – now available

If you already use Adobe Acrobat Sign or Docusign eSignature for electronic signatures, you can use the eSignature for Microsoft 365 experience to initiate the request and have a copy of the signed PDF automatically saved back to the originating SharePoint library, saving you time in creating and finding signed PDFs. Users can create PDF signature requests from a SharePoint document library and choose one of those providers (if enabled by an admin) as the signing service. The PDF is automatically uploaded to the provider to create a signing request. When signed, a copy of the signed PDF is automatically saved back into the original SharePoint library (previously, the signed PDF was saved back to a single dedicated provider folder).

Next steps

These updates (a new name, global availability, Word integration, custom fields, and third-party updates) all aim to make eSignature more convenient for you. If you're an admin or an interested user, make sure eSignature is enabled in your tenant; it's found under Pay-as-you-go services in the Microsoft 365 Admin Center. Check that your Office apps are updated so you can see the Word integration. And give these new features a try!

To learn more about eSignature configuration and usage, check out the Overview of eSignature page on Microsoft Learn. For previous updates, see: Announcing SharePoint eSignature for Microsoft Word | Microsoft Community Hub and SharePoint eSignature product updates! | Microsoft Community Hub.

*Worldwide availability excludes Indonesia and Türkiye.
Model Mondays S2E10: Automating Document Processing with AI

1. Weekly Highlights

We kicked off with the top news and updates in the Azure AI ecosystem:

- Agent Factory blog series: A new six-part blog series on designing reliable, agentic AI—exploring multi-step, collaborative agents that reflect, plan, and adapt using tool integrations and design patterns.
- Text PII preview in Azure AI Language: Now redacts PII (like dates of birth and license plates) in major European languages, with better accuracy for UK bank entities.
- Claude Opus 4.1 in Copilot Pro & Enterprise: Public preview brings smarter summaries, tool-assisted thinking, and "Ask Mode" in VS Code.
- Table parsing upgrade: Stronger computer-vision algorithms now achieve 94-97% accuracy across Latin, Chinese, Japanese, and Korean scripts, with sub-10ms latency.
- Mistral Document AI in Azure AI Foundry: Instantly turn PDFs, contracts, and scanned docs into structured JSON with tables, headings, and LaTeX support. Serverless, multilingual, secure, and perfect for regulated industries.

2. Spotlight On: Document Intelligence with Azure & Mistral

This week's spotlight was a hands-on exploration of document processing, featuring both Microsoft and Mistral AI experts.

Why document processing? Unstructured data—receipts, forms, handwritten notes—is everywhere. Modern document AI can extract, structure, and even annotate this data, fueling everything from search to RAG pipelines.

Azure Document Intelligence: State-of-the-art OCR and table extraction with very high accuracy and speed. Handles multiple languages and complex layouts, and returns structured outputs ready for programmatic use.

Mistral Document AI: Transforms PDFs and scanned docs into JSON, retaining complex formatting, tables, images, and even LaTeX. Supports custom schema extraction and image/document annotations, and returns everything in one API call. Integrates seamlessly with Azure AI Foundry and developer workflows.

Demo highlights (a short code sketch of the receipt scenario follows this list):

- Extracting receipts: OCR accurately pulls out store, date, and transaction details from photos.
- Handwriting recognition: Even historical documents (like Thomas Jefferson's letters) are parsed with surprising accuracy.
- Tables & structured data: Financial statements and reports are converted into structured Markdown and JSON—ready for downstream apps.
- Advanced annotations: Define your own schema (via JSON Schema or Pydantic), extract custom fields, classify images, summarize documents, and even translate summaries—all in a single call.
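To make the receipt demo concrete, here is a minimal sketch of the same extraction pattern using the GA Azure Document Intelligence (Form Recognizer) Python SDK. This is an illustration rather than the exact code shown on stream; the endpoint, key, and file name are placeholders, and the newer azure-ai-documentintelligence package has a slightly different surface.

```python
# pip install azure-ai-formrecognizer
# Minimal receipt extraction sketch; endpoint, key, and file are placeholders.
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

endpoint = "https://<your-resource>.cognitiveservices.azure.com/"  # placeholder
key = "<your-key>"  # placeholder

client = DocumentAnalysisClient(endpoint=endpoint, credential=AzureKeyCredential(key))

# Analyze a receipt photo with the prebuilt receipt model.
with open("receipt.jpg", "rb") as f:
    poller = client.begin_analyze_document("prebuilt-receipt", document=f)
result = poller.result()

# Each extracted field carries a typed value plus a confidence score.
for doc in result.documents:
    merchant = doc.fields.get("MerchantName")
    date = doc.fields.get("TransactionDate")
    total = doc.fields.get("Total")
    if merchant:
        print(f"Store: {merchant.value} (confidence {merchant.confidence:.2f})")
    if date:
        print(f"Date: {date.value}")
    if total:
        print(f"Total: {total.value}")
```

Because every field comes back with a confidence score, a downstream app can route low-confidence extractions to a human reviewer instead of accepting them blindly.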
3. Customer Story: Oracle Health

Oracle Health shared how agentic AI and fine-tuned models are revolutionizing clinical workflows.

Problem: Clinicians spend hours on documentation, searching records, and manual data entry—reducing time for patient care.

Solution: Oracle's clinical AI agents automate chart reviews, data extraction, and even conversational Q&A—while keeping humans in the loop for safety.

Technical highlights:

- A multi-agent architecture understands provider specialty and context.
- An orchestrator model "routes" requests to the right agent or plugin, extracting the needed arguments from context.
- Fine-tuning was key: for low latency, Oracle used lightweight models (like GPT-4 mini) fine-tuned on their own data—achieving sub-800ms responses with accuracy matching larger models.
- Fine-tuning also allowed for nuanced tool selection, argument extraction, and rule-based orchestration—better than prompt engineering alone.
- Oracle used LoRA for efficient, targeted fine-tuning without erasing base-model knowledge.

Live demo: The agent summarizes patient history, retrieves lab results, filters for abnormal values, and answers follow-up questions—all conversationally. The fine-tuned orchestrator chooses the right tool and context for each doctor's workflow.

Result: 1-2 hours saved per day, more time for patients, and happier doctors!

4. Key Takeaways

Here are the key learnings from this episode:

- Document AI is production-ready: Azure Document Intelligence and Mistral Document AI offer fast, accurate, and customizable document parsing for real enterprise needs.
- Schema-driven extraction and annotation: Define your own schemas and extract exactly what you want—no more one-size-fits-all.
- Fine-tuning unlocks performance: For low latency and high accuracy, fine-tuning lightweight models beats prompt engineering in complex, rule-based agent workflows.
- Agentic workflows in action: Multi-agent systems can automate complex tasks, route requests, and keep humans in control, especially in regulated domains like healthcare.
- Community and support: Join the Discord and Forum to ask questions, share use cases, and connect with the team.

Sharda's Tips: How I Wrote This Blog

Writing this recap is all about sharing what I learned and making it practical for the community! I start by organizing the key highlights, then walk through the customer stories and demos, using simple language and real-world examples. Copilot helps me structure and clarify my notes, especially when summarizing technical sections. Here's the prompt I used for Copilot this week:

"Generate a technical blog post for Model Mondays S2E10 based on the transcript and episode details. Focus on document processing with Azure AI and Mistral, include customer demos, and highlight practical workflows and fine-tuning. Make it clear and approachable for developers and students."

Every episode inspires me to try these tools myself, and I hope this blog makes it easy for you to start, too. If you have questions or want to share your own experience, I'd love to hear from you!

Coming Up Next Week

Next week: Text & Speech AI Playgrounds! Learn how to build and test language and speech models, with live demos and expert guests.

Register For The Livestream – Aug 25, 2025 | Register For The AMA – Aug 29, 2025 | Ask Questions & View Recaps – Discussion Forum

About Model Mondays

Model Mondays is a weekly series to build your Azure AI IQ with:

- 5-minute highlights: news and updates on Mondays
- 15-minute spotlight: deep dives into new features, models, and protocols
- 30-minute AMA Fridays: live Q&A with product teams and experts

Get started: Register For Livestreams | Watch Past Replays | Register For AMA | Recap Past AMAs

Join The Community

Don't build alone! Join the Azure AI Developer Community for real-time chats, events, support, and more: Join the Discord | Explore the Forum

About Me

I'm Sharda, a Gold Microsoft Learn Student Ambassador focused on cloud and AI. Find me on GitHub, Dev.to, Tech Community, and LinkedIn. In this blog series, I share takeaways from each week's Model Mondays livestream.
Syntex creates only Single line text fields for each piece of information

I have deployed and applied a "Structured document processing" model to one of my libraries in SharePoint. I chose two pieces of information to extract:

- Date (type: date)
- A table with 6 columns (types: text, number, date)

The training of the model and the extraction itself seem fine, but after applying the model to the library, all the columns that Syntex creates in the library for the new content type are "Single line of text". No number, no date...

Is there something I am missing, or is this just the current state of Syntex? There is a similar problem with longer strings, where I would like Syntex to grab the whole string, but since it cannot create a multi-line column, it just cuts the string off once the character limit is reached.

Any advice on how to proceed would be highly appreciated. Thanks!
Issues with Microsoft Syntex Document Processing Model: Incomplete Extraction for Multi-Page PDFs

I'm facing several challenges with the Microsoft Syntex document processing model, particularly when dealing with multi-page PDFs and large tables. I'd appreciate any insights or suggestions from fellow users or Microsoft experts who may have encountered similar issues. Below are the specific problems I'm experiencing:

Unsupported PDF formats and multiple tables on a single page: Some PDFs that I try to process seem to be in unsupported formats. In addition, pages containing multiple tables often result in extraction errors or incomplete data. Has anyone else encountered this with complex table layouts in PDFs, and what approaches have you used to resolve it?

Data extraction from multi-page PDFs: When processing PDFs longer than two pages (e.g., six-page PDFs), the model often extracts data correctly from only the first two or three pages. The remaining pages, particularly those with tables spanning multiple pages, are either incomplete or entirely missing. Additionally, large tables (100+ rows) in multi-page PDFs tend to result in inaccurate extraction. Are there any best practices for handling these multi-page table scenarios?

Automatic processing issues: Sometimes the Syntex model doesn't process files automatically; I have to manually select the file and click "Classify" to trigger processing. Is this a known issue, or is there something I might be missing in my setup?

Model publishing delays: After publishing changes to the model, it often takes an extended period (up to 30 minutes or more) for the new model to start processing files. In some cases, the files aren't processed at all. Has anyone experienced similar delays after publishing a model, and what could be causing them?

Low confidence scores for multi-page PDFs: When processing multi-page PDFs with tables, the model returns low confidence scores (below 60%). What steps can I take to improve these accuracy scores, particularly for documents with complex table structures?
Unable to train Syntex Structured Document model

Hi all,

I am trying to create and train a Syntex model for the first time in my organization. I am going through the Content Center and, while a model is created when I hit the New Model button, when it goes to start the training process I get the following error: [screenshot of the training error]

I have confirmed that there is an entry for SharePointFormProcessing in a table called Packages in Dataverse. The same thing happens when trying to create a model locally in another site, and regardless of whether I'm using the freeform selection or layout method. Does anyone have an idea what the problem might be?
High costs for MS Syntex - although restricted to some sites - maybe open for a short time

Hi everyone,

We are experiencing high costs for MS Syntex. We only wanted to use it for some sites, so I restricted it to two sites. But I did not realize that each service needs to be restricted individually, so some users seem to be "using" it anyway.

How can I identify these sites and stop them from using any of the services? And how can I completely disable Syntex for our tenant if nothing else works?

BR Stephan
Become a Microsoft advanced content management advocate! [Opportunity]

The Microsoft Next Gen Content Services team is launching an advanced content management and experiences Advocates Initiative. We are looking for thought leaders who can amplify product awareness for the latest innovations from SharePoint Premium, SharePoint Embedded, Microsoft 365 Backup, and Microsoft 365 Archive.

As advocates, you can help create content to share with your networks, leveraging Microsoft marketing material and updates. If you or anyone you know would be interested in posting about these solutions, please fill out our interest form. As a member of this initiative, you'll have opportunities to cross-promote on Microsoft channels, receive marketing updates on the products, and enhance your personal brand.
Regd. Issue with data extraction when file is dropped from Power Automate into the primed library

Hello all,

I am facing an issue where a file is dropped into the primed library using a flow. The flow action account has a Syntex license, and the document looks fine (the PDF opens) once it is dropped into the primed library, but the extraction does not happen. When I click "Classify and Extract" manually, it works fine. Is anyone facing a similar scenario?
Files getting classified in Syntex training files but not in SharePoint library

Hi,

I'm having some trouble with some files in SharePoint not getting classified, with some appearing to not even be put through the classification process. The files are scanned in batches, uploaded, run through Adobe's OCR, and then classified. However, random files just never get classified, for one of two reasons: either they are never classified at all (they have no classification date whatsoever), or they are "classified" but remain as the default content type ('Document').

In either case, if I then copy the file into the training files for the model that should be classifying it (there are multiple models running on the same library in our case), the model classifies it perfectly as a positive example. I can then save/sync the model and re-upload the file, and it will still present the same problem.

Does anyone know what may be causing this? To be clear, I have had this happen across multiple files uploaded in batches, with 80-90% of the files in a batch being classified just fine but some simply refusing to be. I can have files 1, 2, and 3 in a batch work fine, but suddenly file 4 will fail. There is no difference between the files in terms of scanning, OCR, or the upload process, so I have tentatively ruled those out as the issue. I'm assuming it's a Syntex issue, at least for the second category, because those files are being run through the process, just not being classified properly.

Any help would be much appreciated!