Microsoft Syntex AMA
Event details
Microsoft Syntex is Content AI integrated in the flow of work. Syntex automatically reads, tags, and indexes high volumes of content and connects it where it’s needed—in search, in applications, and as reusable knowledge. Syntex automates workflows at scale, whether you’re processing invoices, writing a contract that requires a signature, or struggling to understand the flood of unstructured content.
An AMA is a live text-based online event similar to a “YamJam” on Yammer or an “Ask Me Anything” on Reddit. This AMA gives you the opportunity to connect with Microsoft product experts who will be on hand to answer your questions and listen to feedback.
Feel free to post your questions about Microsoft Syntex in the comments below at any time before the event if that suits your schedule or time zone better. Questions will not be answered until the live hour.
83 Comments
- mpjjonker
Brass Contributor
LinkedEntities: in some documents we can annotate parties (persons, organizations), and each party has its own role in the document: seller, buyer, witness, employer, employee, etc. Another example would be a person with an address and other properties (date of birth, hobby, job title). It would be nice if we could group or collect those in one annotation with features or properties.
- JamesEccles
Microsoft
Not something we have in Syntex today, but this is great feedback for us to look at. Thanks!
- mpjjonker
Brass Contributor
Pre-annotation possibilities: For domain experts (whom we need to label our examples), it is often easier to start with pre-annotated documents. That way they have something to (dis)agree with. One approach we have used in the past is a simple keyword (list) match to perform machine annotation, followed by manual review by domain experts. And about these domain experts: sometimes the result depends on which person annotates the content; in other environments there is detection of 'disagreement' between annotators. Is that still needed today?
- IanStory
Microsoft
Hi Michel! One of the great benefits of Syntex is that we provide a set of prebuilt models in addition to allowing you to create your own models. I would say in the case of prebuilt models, this isn't needed today: you just take the model (like for invoices or receipts, with more coming soon), apply it, and let it do its magic. In the case of building your own models using unstructured document processing, structured document processing, or freeform document processing, it is absolutely helpful to have examples to (dis)agree with, and so I think that's still useful as a concept. However, you don't have to "pre-annotate" them; you just need a small training set ready that whoever is building the model is familiar with (and I get that perhaps you'd have a separate set of documents from a domain expert, perhaps even graphically annotated, to help you if you're building the model but are not the expert yourself). One of the grand things we're trying to do with Syntex, though, is make it so that the domain expert can build the model themselves, without needing to split this into two separate roles!
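The keyword-list pre-annotation mentioned in the question above can be sketched in Python as follows. This is a minimal illustration of the general idea, not a Syntex feature: the `Annotation` type, the `pre_annotate` function, and the `KEYWORDS` labels are all hypothetical names chosen for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    start: int   # character offset where the match begins
    end: int     # character offset just past the match
    label: str   # label suggested by the keyword list
    text: str    # matched text, kept so a reviewer can read it directly


def pre_annotate(text, keywords):
    """Suggest annotations by whole-word, case-insensitive keyword matching.

    `keywords` maps a label to a list of keyword strings (hypothetical format).
    """
    annotations = []
    for label, words in keywords.items():
        for word in words:
            pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
            for match in pattern.finditer(text):
                annotations.append(
                    Annotation(match.start(), match.end(), label, match.group())
                )
    # Sort by position so reviewers see suggestions in reading order.
    return sorted(annotations, key=lambda a: (a.start, a.end))


# Hypothetical keyword lists, loosely based on the roles named in the thread.
KEYWORDS = {
    "party_role": ["seller", "buyer", "witness"],
    "document_type": ["invoice", "contract"],
}

sample = "This contract is between the Seller and the Buyer."
suggestions = pre_annotate(sample, KEYWORDS)
for suggestion in suggestions:
    print(suggestion)
```

A domain expert would then accept, correct, or reject each suggestion rather than label the document from scratch.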
- mpjjonker
Brass Contributor
Question about language support. The support page mentions: "This model supports all of the Latin-based languages." How can we leverage the language understanding of our native language? For example: tokenization (of composite words), negation, SVO vs SOV, part of speech. Is the https://turing.microsoft.com/ project being used 'under the hood'?
- JamesEccles
Microsoft
The Unstructured model type supports any Latin character set language. This model type is not using Turing under the hood; it ultimately has no semantic understanding of content. Rather, it understands the text in a document as a series of tokens. It's the patterns within, between, and around these tokens that the model builds on. https://learn.microsoft.com/en-gb/microsoft-365/contentunderstanding/explanation-types-overview#what-are-tokens
- mpjjonker
Brass Contributor
In an earlier stage of NLP we used UIMA, where tokenization and sentence segmentation were an important part of the understanding. Should I compare this with that?
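To make the "series of tokens" idea above concrete, here is a rough Python sketch. It assumes a simplified rule in which each run of letters or digits is one token and each punctuation mark is its own token; the actual Syntex tokenizer may behave differently, and `tokenize` is a hypothetical helper, not a Syntex API.

```python
import re

# Assumed rule: a run of word characters is one token, and every
# non-space punctuation character is a token of its own.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Split text into a flat list of tokens, discarding whitespace."""
    return TOKEN_PATTERN.findall(text)


print(tokenize("Dog"))
print(tokenize("425-555-5555"))
print(tokenize("Total: 100"))
```

Under this rule a composite word stays a single token, so there is no language-specific splitting, which is consistent with the reply that the model has no semantic understanding of the content.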
- DebraCurrieAustralia
Copper Contributor
Whilst I understand pricing rates will vary by country/region, I would like to understand the pricing mechanisms. Is Syntex priced per user, and if so, what constitutes a user: is it the person creating the 'sets', running the models, uploading a document to a library that will be "syntexed" 🙂, or something else? Also, what permission levels do I need to implement Syntex? Our IT is pretty locked down and I'd like to do some POCs before building a case. Thank you.
- Chris McNulty
Silver Contributor
Most of Microsoft Syntex will require no upfront seat licensing and will be available to almost all M365 commercial plans. Once activated, you will be able to use classification and extraction, content assembly, eSignature, OCR, image tagging, translation, summarization, etc., priced per page or per document, generally on a pay-as-you-go basis. Unstructured document processing and prebuilt models will be launched this month as metered services in preview with an initial cost of free (pricing to be disclosed for GA).
- JamesEccles
Microsoft
This article explains the way the licensing works, including which specific actions would require a user to have a license: https://learn.microsoft.com/en-gb/microsoft-365/contentunderstanding/syntex-licensing We also announced at Ignite that we will be moving many services in Syntex from a user-based license to a pay-as-you-go model over the coming months. This means you wouldn't need a user license, and instead would pay for what you consume. To do some testing, the minimum you would need is for your IT to create a Content Center site and give you permissions to it: https://learn.microsoft.com/en-gb/microsoft-365/contentunderstanding/create-a-content-center Better yet, have them start a free trial so you can get hands-on with more of the capabilities: https://learn.microsoft.com/en-gb/microsoft-365/contentunderstanding/trial-syntex
- Mario_Fulan
Iron Contributor
Can you explain a bit more about the differences in functionality for Freeform documents (using AI Builder) and Unstructured Documents (using doc understanding models)? I know the training is different, freeform doesn't do classification, and a few other things. One question I have is whether both can do the "deskew" of PDF or scanned images before processing. Freeform seems to handle rotated documents, but unstructured documents have trouble with OCR text extraction positioning if the image is rotated.
- JamesEccles
Microsoft
This article gives a good overview of the different model types: https://learn.microsoft.com/en-gb/microsoft-365/contentunderstanding/difference-between-document-understanding-and-form-processing-model Both Unstructured and Freeform models can be used to tackle similar use cases. Limitations on file format and language may push you in one direction or the other. But assuming both are possibilities for your use case, I would start with Freeform, which has a lower bar for effort during training. If that doesn't get at the right data, then shift to Unstructured, which has more of a teaching element and more need for human training. Both model types use the same OCR engine, so they should behave broadly the same for skewed docs. One thing to note, though, is that Freeform factors layout into the model, whereas Unstructured has to restructure the document into linear text.
- Mario_Fulan
Iron Contributor
Question: If I have content composition, is there a way to have the finished document land in a different folder or document library than the template? I can see using content routing, but that feature is not yet available.
- Wayne_Addison
Copper Contributor
Hi Mario, I trigger the content creation via a 'For a selected item' Power Automate trigger, then a 'Generate document using SharePoint Syntex (preview)' action. It still places the composed doc in the template folder, but I use downstream actions to move the file and convert it to PDF. * I'm assuming 'content composition' is the same as 'content assembly'.
- Mario_Fulan
Iron Contributor
This is the action I was referring to. I know I can use a downstream action after the document has been generated. I was wondering about the addition of a redirect target location so that the doc wasn't generated (even briefly) in the source location. There is a concern about permissions and directing to the target with permissions intact. Thanks for the fast reply. I'll continue to use this approach for now.
- Mario_Fulan
Iron Contributor
Question: For content composition with Syntex, are there APIs that can be called to initiate a composition using a template, and if not, do we have an ETA for those?
- JamesEccles
Microsoft
We currently have a Power Automate action in the SharePoint connector for automated content assembly. It's called "Generate documents using Syntex (preview)". We're working on having APIs give the same experience programmatically, likely to be released next year.
- 4BobRandall
Brass Contributor
Is this a service that will be made available to those of us in the Government Community Cloud as well as the commercial cloud?
- JamesEccles
Microsoft
Syntex is available today in GCC. Right now Syntex is not available in GCCH or DoD. There's no timeline we can share on these clouds right now.
- Wayne_Addison
Copper Contributor
When classifying documents, is there a way to tell if a classification attempt has been made but none of the models (deployed to the library) succeeded in classifying some of the docs? Something like setting the classification attempt time/date and a content type of 'Unknown' would be good. Thanks.
- JamesEccles
Microsoft
Great feedback. We're looking at something along these lines to give you better indications of where Syntex has processed a document. Look out for a change in this direction coming soon.