This blog post was authored by Vinod Kurpad (Principal PM, Azure Cognitive Search) and Prachi Jain (PMM, Azure AI).
Azure Cognitive Search is a cloud service that gives developers APIs and tools to build rich search experiences over a variety of content in web, mobile, and enterprise applications. This week at //build, we announced new capabilities in Azure Cognitive Search that make it easier for developers to build search experiences and customize them with new skills, so they can deliver more relevant results to their users.
Improved development experience
Debug sessions is a new portal preview feature in Cognitive Search that gives you a rich, IDE-like experience for refining a skillset and fixing issues within an AI enrichment pipeline. As visualized in the Skills Graph, you can explore enrichments across all nodes within an enrichment tree and evaluate each skill invocation.
Debug sessions allow you to modify skills as well as inspect the inputs and outputs of each step.
“Debug session skill graph”
Debug sessions help you with three kinds of issues:
Skillset issues: Problems with expressions, paths, and type mismatches within your skillset.
Skill failures: These apply mostly to custom skills; a debug session can generate a request that you can debug locally.
Data inconsistencies: Scenarios where a specific document fails, for example when your data source now contains a document in a language other than the one you configured, which another skill does not recognize.
Debug sessions also help you start small and incrementally build more complex skillsets.
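For skill failures in a custom skill, a captured request can be replayed against your skill code on your own machine. The sketch below assumes the custom Web API skill contract as we understand it from the documentation: a `values` array of records, each with a `recordId` and a `data` object, and a response that echoes every `recordId` with `data`, `errors`, and `warnings`. The enrichment itself (upper-casing the text), the `upperText` output name, and the sample payload are hypothetical.

```python
import json


def run_custom_skill(request_body: dict) -> dict:
    """Process a custom-skill style request and return a response of the same shape.

    Assumed contract: {"values": [{"recordId": ..., "data": {...}}, ...]}.
    The enrichment (upper-casing the text) is a hypothetical stand-in.
    """
    results = []
    for record in request_body.get("values", []):
        output = {
            "recordId": record.get("recordId"),
            "data": {},
            "errors": [],
            "warnings": [],
        }
        text = record.get("data", {}).get("text")
        if not isinstance(text, str):
            # Report the problem per record instead of failing the whole batch.
            output["errors"].append(
                {"message": "Input 'text' is missing or not a string."}
            )
        else:
            output["data"]["upperText"] = text.upper()
        results.append(output)
    return {"values": results}


# Hypothetical payload, shaped like a request captured for local debugging.
sample_request = {
    "values": [
        {"recordId": "1", "data": {"text": "hello"}},
        {"recordId": "2", "data": {}},
    ]
}

if __name__ == "__main__":
    print(json.dumps(run_custom_skill(sample_request), indent=2))
```

Returning an error for the individual record, rather than raising an exception, keeps one bad document from hiding the results for the rest of the batch while you debug.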
We have continued to grow our catalog of skills, with new additions:
• PII skill identifies and redacts personally identifiable information, such as social security numbers, email addresses, credit card numbers, driver's license numbers, and many more entities. The documentation lists all supported entities and languages.
• Translation skill identifies the language of the text in the document and translates it into a target language. The translation skill supports a wide variety of languages, ensuring coverage over most scenarios.
• Document extraction skill can now be inserted at any point within the skillset. In the past, document extraction was implicit at the beginning of the enrichment pipeline; now it can be configured within the pipeline. This enables scenarios like working with encrypted files.
• Azure Machine Learning (AML) skill makes discovering and consuming a model built within AML simple and intuitive. Endpoint discovery, authentication, and schema validation are some of the key benefits. The AML skill is a preview feature that you can sign up to use.
Now your Azure Machine Learning skills can be automatically detected when you edit your skills from the portal.
“AML skill: Adding the AML skill to a skillset”
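To make the new skills more concrete, here is a rough sketch of a skillset that chains the PII skill into the translation skill, written as a Python dictionary that could be serialized to JSON. The `@odata.type` values, parameter names (`maskingMode`, `maskingCharacter`, `defaultToLanguageCode`), and input/output names are assumptions based on our reading of the skill reference documentation, and accepted values may differ between API versions. The skillset name and the `targetName` values are hypothetical.

```python
import json

# Assumed shape of a PII skill definition; verify names against the skill reference.
pii_skill = {
    "@odata.type": "#Microsoft.Skills.Text.PIIDetectionSkill",
    "context": "/document",
    "maskingMode": "replace",  # assumption: accepted value may vary by API version
    "maskingCharacter": "*",
    "inputs": [{"name": "text", "source": "/document/content"}],
    "outputs": [
        {"name": "piiEntities", "targetName": "pii_entities"},
        {"name": "maskedText", "targetName": "masked_text"},
    ],
}

# Assumed shape of a translation skill that consumes the masked text.
translation_skill = {
    "@odata.type": "#Microsoft.Skills.Text.TranslationSkill",
    "context": "/document",
    "defaultToLanguageCode": "en",
    "inputs": [{"name": "text", "source": "/document/masked_text"}],
    "outputs": [{"name": "translatedText", "targetName": "translated_text"}],
}

skillset = {
    "name": "demo-skillset",  # hypothetical name
    "description": "Redact PII, then translate the redacted text.",
    "skills": [pii_skill, translation_skill],
}


def produced_paths(skill: dict) -> list:
    """Enrichment-tree paths a skill writes, derived from its context and targetNames."""
    return [f"{skill['context']}/{out['targetName']}" for out in skill["outputs"]]


if __name__ == "__main__":
    # Path mismatches between skills are a common skillset issue, so check
    # that the translation input is actually produced by the PII skill.
    source = translation_skill["inputs"][0]["source"]
    print("input wired correctly:", source in produced_paths(pii_skill))
    print(json.dumps(skillset, indent=2))
```

Feeding the masked text into translation, rather than the raw content, means the translated output never contains the original sensitive values.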
More relevant search results
We are bringing new capabilities that help you deliver better search results.
We are introducing a new BM25-based ranking algorithm that, in our tests, increased Normalized Discounted Cumulative Gain (NDCG) by about 5 points! It generates more intuitive results that align with user expectations. You can test this algorithm today.
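For readers who want a feel for what BM25 and NDCG measure, here is a minimal sketch of both. It uses the textbook Okapi BM25 formula with a commonly used smoothed IDF and the customary defaults `k1=1.2` and `b=0.75`; it is not the service's implementation, and the tiny tokenized corpus is made up for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_terms, documents, k1=1.2, b=0.75):
    """Score each tokenized document against the query with textbook Okapi BM25.

    Assumes `documents` is a non-empty list of token lists.
    """
    n_docs = len(documents)
    avg_len = sum(len(doc) for doc in documents) / n_docs
    # Document frequency: number of documents containing each query term.
    doc_freq = {t: sum(1 for doc in documents if t in doc) for t in set(query_terms)}
    scores = []
    for doc in documents:
        term_counts = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_counts[term]
            if tf == 0:
                continue
            # Smoothed inverse document frequency: rarer terms weigh more.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Longer-than-average documents are penalized, controlled by b.
            length_norm = 1 - b + b * len(doc) / avg_len
            # Term frequency saturates as tf grows, controlled by k1.
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


def ndcg(relevances, k=None):
    """NDCG for graded relevance labels listed in ranked order (higher is better)."""
    ranked = relevances if k is None else relevances[:k]
    ideal = sorted(relevances, reverse=True)[: len(ranked)]

    def dcg(rels):
        return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(rels))

    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


if __name__ == "__main__":
    docs = [
        ["azure", "search", "service"],
        ["cooking", "recipes"],
        ["search", "search", "engine", "ranking"],
    ]
    print(bm25_scores(["search"], docs))
    print(ndcg([1, 2, 3]))
```

NDCG compares the ranking you produced against the ideal ordering of the same results, so a perfectly ordered list scores 1.0 and any misordering scores lower.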
We have added mechanisms that provide more consistent results to users, even in a world of shards and replicas that are constantly changing. You can now specify a query session (to reduce changes across the session) and request that scoring statistics be computed from global statistics (across shards). Learn more about scoring statistics.
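As a rough illustration, both options can be expressed as query parameters on a search request. The sketch below only builds the URL; the parameter names `sessionId` and `scoringStatistics`, the URL shape, and the `api-version` value reflect our understanding of the REST API and should be checked against the reference. The service name, index name, and session value are hypothetical, and a real request would also need an API key header, which is not shown.

```python
from urllib.parse import urlencode


def build_search_url(service, index, search_text, session_id=None,
                     global_stats=False, api_version="2020-06-30"):
    """Build a search URL that opts in to session affinity and global scoring statistics.

    Parameter names and the api-version are assumptions; verify them against
    the REST API reference before use.
    """
    params = {"api-version": api_version, "search": search_text}
    if global_stats:
        # Ask for term statistics computed across all shards, not per shard.
        params["scoringStatistics"] = "global"
    if session_id:
        # Reuse the same value for every query in one user session.
        params["sessionId"] = session_id
    return f"https://{service}.search.windows.net/indexes/{index}/docs?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("my-service", "hotels", "beach view",
                           session_id="user-42", global_stats=True))
```

Keeping the session value stable for a user is what reduces ranking changes between consecutive queries; global statistics address a different problem, namely scores that vary depending on which shard holds a document.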
For those of you who are more data-science inclined and want more granular control over ranking, we have exposed some of the statistics computed for indexing purposes. You can use them as input to a “Learning to Rank” model that you create, and then invoke this model to re-rank your search results, overriding the default ranking. Get index statistics as part of your query.
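A minimal sketch of the re-ranking step might look like the following. It assumes you have already collected per-document features from the query response; the feature names (`similarityScore`, `uniqueTokenMatches`, `termFrequency`) mirror what we believe the preview returns but are treated here as hypothetical, and the fixed linear weights stand in for a model you would actually train.

```python
def rerank(results, weights, bias=0.0):
    """Re-order search results by a linear model over per-document features."""

    def model_score(result):
        features = result["features"]
        # Missing features contribute nothing to the score.
        return bias + sum(w * features.get(name, 0.0) for name, w in weights.items())

    return sorted(results, key=model_score, reverse=True)


# Hypothetical results in the order the service returned them.
sample_results = [
    {"id": "a", "features": {"similarityScore": 0.9, "uniqueTokenMatches": 1, "termFrequency": 1}},
    {"id": "b", "features": {"similarityScore": 0.5, "uniqueTokenMatches": 3, "termFrequency": 4}},
    {"id": "c", "features": {"similarityScore": 0.2, "uniqueTokenMatches": 1, "termFrequency": 1}},
]

# Hypothetical weights; a real Learning to Rank model would learn these from labeled data.
sample_weights = {"similarityScore": 1.0, "uniqueTokenMatches": 0.5, "termFrequency": 0.1}

if __name__ == "__main__":
    for result in rerank(sample_results, sample_weights):
        print(result["id"])
```

In practice you would re-rank only the top results returned by the service, since the model can only reorder documents that the initial ranking already retrieved.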
More secure than ever
Now that we have encryption for data in transit and at rest, our next security advancements are in the areas of endpoint protection and access control. The latest security features include support for accessing a search service over a private endpoint, limiting access to specific IP ranges, and limiting access to only clients in a virtual network. In addition, we are announcing preview support for AD managed identities. You can register a search service with Active Directory, and then grant that identity read access to the Azure data sources you index from. Watch to learn more.
Now you can assign Azure Cognitive Search a managed identity that can be given “rights” to read a data source
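With a managed identity, the data source definition no longer needs to carry an account key. The sketch below builds such a definition for a blob container as a Python dictionary. The `ResourceId=...;` connection string form and the field names are assumptions based on our reading of the managed identity documentation, and the subscription ID, resource group, storage account, and container names are hypothetical placeholders.

```python
import json


def blob_data_source(name, subscription_id, resource_group, storage_account, container):
    """Build a data source definition that points at a storage account by resource ID.

    Assumption: the search service's managed identity has been granted read
    access to the storage account, so no account key appears here.
    """
    resource_id = (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{storage_account}"
    )
    return {
        "name": name,
        "type": "azureblob",
        "credentials": {"connectionString": f"ResourceId={resource_id}/;"},
        "container": {"name": container},
    }


if __name__ == "__main__":
    definition = blob_data_source(
        "docs-datasource", "00000000-0000-0000-0000-000000000000",
        "my-resource-group", "mystorageaccount", "documents",
    )
    print(json.dumps(definition, indent=2))
```

Because the definition holds only a resource identifier, rotating storage keys does not require updating the data source.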
It is great to see how customers like PwC have built a capability on Azure Cognitive Search to automatically identify obligations for US consumer financial regulations via their Regulatory Obligation Identifier Skill, saving significant manual effort for their customers searching for documents that contain regulation content. Learn more about their solution.