First published on TECHNET on Apr 12, 2018
As you start to work with Search, you will notice that its architecture has not changed dramatically, so the lessons you learned in SharePoint 2013 should still help you.
Moving parts in Search
To retrieve information, the crawl component connects to the content sources by invoking the appropriate indexing connector or protocol handler. After retrieving the content, the crawl component passes crawled items to the content processing component.
For more information about crawling content sources, see Plan crawling and federation in SharePoint Server.
The content processing component transforms crawled items into artifacts that are included in the search index. It also writes information about links and URLs to the link database.
For more information about content processing, see Plan crawling and federation in SharePoint Server.
The analytics processing component performs two types of analyses: search analytics and usage analytics.
The results from the analyses are added to the items in the search index. In addition, results from usage analytics are stored in the analytics reporting database.
For more information, see Overview of analytics processing in SharePoint Server.
The search index can be divided into discrete portions, called index partitions. The search index is the aggregation of all index partitions. Each index partition holds one or more index replicas that contain the same information.
The index component:
For more information about the search schema and the search index, see Overview of the search schema in SharePoint Server.
The query component analyzes and processes queries and results.
For more information, see Plan to transform queries and order results in SharePoint Server.
The search administration component runs the system processes for search.
The crawl database stores tracking information and historical information about crawled items.
The link database stores information extracted by the content processing component. In addition, it stores information about search clicks, that is, the number of times people click a search result on the search results page. This information is stored unprocessed, to be analyzed by the analytics processing component.
The analytics reporting database stores the results of usage analytics. In addition, it stores statistics information from the analyses. SharePoint Server uses this information to create Excel reports that show different statistics.
The search administration database stores search configuration data, such as the topology, crawl rules, query rules, and the mappings between crawled and managed properties. It also stores the access control list (ACL) for the crawl component.
NOTE: If you combine any of these components (Analytics, Crawl, Content Processing, Query Processing, or Search Admin) on one server, allow 8 GB of RAM for each component. For example, a server that hosts 3 components needs 24 GB of RAM.
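The sizing rule in the note can be written as a one-line calculation. This is a minimal sketch, not a sizing tool: the function name `ram_needed_gb` and the component labels are made up for illustration, and the only input taken from the article is the figure of 8 GB per co-located component.

```python
# Rule of thumb from the note above: 8 GB of RAM per search component on a server.
GB_PER_COMPONENT = 8


def ram_needed_gb(components):
    """Return the RAM (in GB) suggested for a server hosting these components.

    `components` is any iterable of component names; duplicates are counted once.
    """
    return GB_PER_COMPONENT * len(set(components))


# The example from the note: three components on one server.
print(ram_needed_gb(["Crawl", "Content Processing", "Analytics"]))  # 24
```

Treat the result as a starting point and validate it against the actual load on the server.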
Based on the expectation of having over 20 million items and high availability across all components, I would recommend the following base architecture:
Search Server 1: Query Processing Component, Index Component (Partition 0)
Search Server 2: Index Component (Partition 0)
Search Server 3: Query Processing Component, Index Component (Partition 1)
Search Server 4: Index Component (Partition 1)
Search Server 5: Analytics Component, Content Processing Component, Crawl Component, Admin Component
Search Server 6: Analytics Component, Content Processing Component, Crawl Component, Admin Component
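One way to sanity-check a layout like this for high availability is to confirm that every component, and every index partition, lives on at least two servers. The sketch below does that in plain Python. The dictionary format and the function names are illustrative assumptions, not a SharePoint API; only the server-to-component mapping comes from the list above.

```python
from collections import defaultdict

# Layout copied from the list above; the dict format is illustrative only.
TOPOLOGY = {
    "Search Server 1": ["Query Processing", "Index (Partition 0)"],
    "Search Server 2": ["Index (Partition 0)"],
    "Search Server 3": ["Query Processing", "Index (Partition 1)"],
    "Search Server 4": ["Index (Partition 1)"],
    "Search Server 5": ["Analytics", "Content Processing", "Crawl", "Admin"],
    "Search Server 6": ["Analytics", "Content Processing", "Crawl", "Admin"],
}


def hosts_per_component(topology):
    """Map each component name to the set of servers that host it."""
    hosts = defaultdict(set)
    for server, components in topology.items():
        for component in components:
            hosts[component].add(server)
    return hosts


def single_points_of_failure(topology):
    """Return the components that run on fewer than two servers."""
    hosts = hosts_per_component(topology)
    return sorted(name for name, servers in hosts.items() if len(servers) < 2)


print(single_points_of_failure(TOPOLOGY))  # [] (every component has a second host)
```

Removing any one server from `TOPOLOGY` and rerunning the check shows which components would lose their redundancy.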
To improve full crawl time and result freshness – Add more crawl databases and content processing components. Use the Crawl health report to identify any bottlenecks.
To improve query latency – Add more index replicas so that the query load is distributed. Use the Query health report to identify any bottlenecks.
To improve query latency and throughput – Split the index into multiple partitions. Use the Query health report to identify any bottlenecks.
For redundant crawling and query processing, it is not necessary to have a redundant analytics processing component. However, if the non-redundant analytics processing component fails, the search results will not have optimal relevance until the failure is recovered.
Best Practices for End Users