Microsoft Syntex AMA
Event Ended
Tuesday, Nov 15, 2022, 10:00 AM PST
Event details
Microsoft Syntex is Content AI integrated in the flow of work. Syntex automatically reads, tags, and indexes high volumes of content and connects it where it’s needed—in search, in applications, and ...
EmilyPerina
Updated Nov 15, 2022
mpjjonker
Nov 03, 2022
Brass Contributor
Question about language support.
The support page mentions: "This model supports all of the Latin-based languages."
How can we leverage the language understanding of our native language?
For example: tokenization (of composite words), negation, SVO vs SOV, Part of Speech.
Is the https://turing.microsoft.com/ project being used 'under the hood'?
JamesEccles
Microsoft
Nov 15, 2022
The Unstructured model type supports any Latin character set language. This model type is not using Turing under the hood; it ultimately has no semantic understanding of content. Rather, it understands the text in a document as a series of tokens. It is the patterns within, between, and around these tokens that the model builds on.
https://learn.microsoft.com/en-gb/microsoft-365/contentunderstanding/explanation-types-overview#what-are-tokens
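To make the "series of tokens" idea concrete, here is a minimal Python sketch. It is not Syntex's implementation: the tokenization rule (runs of letters and digits are one token, each punctuation mark is its own token) is an assumption for illustration, and the helper names `tokenize`, `find_phrase`, and `tokens_after` are hypothetical. It shows how a pattern can be expressed purely in terms of which tokens appear near a phrase, with no understanding of what the words mean.

```python
import re

def tokenize(text):
    """Split text into word/number runs and single punctuation marks."""
    # \w+ matches a run of letters, digits, or underscores (including
    # accented Latin letters); [^\w\s] matches one punctuation character.
    # Whitespace is skipped and never becomes a token.
    return re.findall(r"\w+|[^\w\s]", text)

def find_phrase(tokens, phrase):
    """Return start indexes where the phrase tokens occur, ignoring case."""
    lowered = [t.lower() for t in tokens]
    target = [p.lower() for p in phrase]
    n = len(target)
    return [i for i in range(len(lowered) - n + 1) if lowered[i:i + n] == target]

def tokens_after(tokens, phrase, window):
    """Return up to `window` tokens that follow each phrase occurrence."""
    n = len(phrase)
    return [tokens[i + n:i + n + window] for i in find_phrase(tokens, phrase)]

# "Factuurdatum" is a Dutch compound ("invoice date") used as an illustration.
tokens = tokenize("Factuurdatum: 15-11-2022")
print(tokens)
print(tokens_after(tokens, ["Factuurdatum"], 4))
```

Under this assumed rule the compound word stays a single token, so nothing here splits composite words, handles negation, or knows about word order; a pattern only says "these tokens appear within so many tokens of that phrase".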
- mpjjonker
Nov 15, 2022
Brass Contributor
In an earlier stage of NLP we used UIMA, where tokenization and sentence segmentation were an important part of the understanding. Should I compare this with that?