Crawling
Group Site Collections Columns not Crawled
Hi everyone, I added some custom columns in a group site collection, but I can't find these columns as crawled properties in the search schema. All columns have content and the content is searchable. If I do the same with a classic team site, the columns are there as usual and I can map the properties in the search schema. Has anyone seen the same behavior?
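One way to check whether the crawled property was created at all is to export the search configuration and search it for the column name. A minimal sketch, assuming the PnP PowerShell module is installed; the site URL, output path, and the column internal name MyCustomColumn are hypothetical:

=== START ===
# Connect to the group site whose search schema we want to inspect
Connect-PnPOnline -Url "https://contoso.sharepoint.com/sites/mygroupsite" -Interactive

# Export the search configuration (crawled/managed property mappings) to a file;
# use -Scope Subscription against the tenant admin site to check the tenant-level schema
Get-PnPSearchConfiguration -Scope Site -Path "C:\temp\searchconfig.xml"

# Look for the crawled property generated from the custom column,
# e.g. ows_MyCustomColumn (internal name assumed for illustration)
Select-String -Path "C:\temp\searchconfig.xml" -Pattern "MyCustomColumn"
=== END ===

It can also be worth checking the tenant-level search schema in the SharePoint admin center, since crawled properties are not always visible at every scope.

SharePoint 2013 - Crawl Error - The content processing pipeline failed to process the item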
SharePoint 2013 was recently updated to the May 2017 CU. I am receiving the errors below when performing a full crawl:

"The content processing pipeline failed to process the item. ( Object reference not set to an instance of an object.; ; SearchID = XXXX....."

I have also tried resetting the index and starting a new full crawl, but the issue persists. Does anyone know a resolution for this error? Thanks for your help :)
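To narrow down which items the pipeline is failing on, the crawl log can be queried from the server object model. A minimal sketch, assuming it runs in the SharePoint Management Shell on a farm server and that the Search service application is named "Search Service Application" (name assumed):

=== START ===
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$ssa = Get-SPEnterpriseSearchServiceApplication "Search Service Application"
$crawlLog = New-Object Microsoft.Office.Server.Search.Administration.CrawlLog($ssa)

# GetCrawledUrls(getCountOnly, maxRows, urlQueryString, isLike,
#                contentSourceID, errorLevel, errorID, fromTime, toTime)
# -1 means "no filter" for the numeric arguments; the returned table
# includes error level and description columns that can be filtered further.
$results = $crawlLog.GetCrawledUrls($false, 100, "", $false, -1, -1, -1,
    [DateTime]::MinValue, [DateTime]::MaxValue)

$results | Format-Table -AutoSize
=== END ===

The ULS logs from the time of the failure usually carry the full stack trace for the "Object reference not set" exception, which helps identify the failing pipeline stage.

Schedule multiple content source crawls: SharePoint 2013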
The maximum content source boundary for a SharePoint 2013 Search service application is 500. That is a lot of individual content source objects. Still, if you are crawling file shares and you want to show specific units as content sources in a refiner, you may quickly grow a large set of content sources.

One shortcoming that has always bothered me is the limitation of "scheduling" crawls. Each schedule is independent, relevant only to the content source it is scheduled to crawl. This means you could easily end up with some complex mapping to figure out what is crawling when. Without unlimited resources, with 20 million or more items to crawl and more than a handful of content source locations, you will likely run into some frustrations.

One solution to keep your crawling better managed is a schedule that takes the entire farm SSA into account. You can do this through PowerShell scripting and a Windows task (see the sketch after the script below). In this example we will assume we don't want more than 10 crawls running simultaneously. We'll exclude "Local SharePoint sites" from being crawled by this process: it has continuous crawl enabled and its own aggressive full and incremental crawl schedule. We're also going to exclude a few content sources from the schedule; maybe they are extremely large or sit on very slow disk, so we want a completely independent choice of when to crawl them. Our maximum limit of ten will include all non-idle content sources. Yes, this code can be cleaned up, made into a function, etc. The point is to provide a functional tool to review and use if desired.

=== START ===
<#
Purpose: Start an incremental crawl on the oldest idle content sources,
         up to $MaxNonIdle concurrent instances.
         Check how many non-idle content sources there are; if fewer than
         $MaxNonIdle, start the remainder as incremental crawls, oldest
         completed crawl first.
#>
$ErrorActionPreference = "Stop";

# Load SharePoint snap-in
If ((Get-PSSnapin -Name Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue) -eq $null) {
    Add-PSSnapin -Name Microsoft.SharePoint.PowerShell
}

$NameSSA = "Your Search Service Application Name";
$MaxNonIdle = 10;
$NonIdleCount = 0;

# Count the content sources that are currently crawling
$sources = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $NameSSA;
ForEach ($source in $sources) {
    if ($source.CrawlStatus -ne "Idle") { $NonIdleCount++; }
}

if ($NonIdleCount -lt $MaxNonIdle) {
    # Oldest completed crawl first
    $sources = $sources | Sort-Object -Property CrawlCompleted;
    ForEach ($source in $sources) {
        if ($NonIdleCount -lt $MaxNonIdle) {
            # Skip the excluded content sources ("Local SharePoint sites" and the
            # ones scheduled separately); -notlike is required for wildcard matching
            if ($source.CrawlStatus -eq "Idle" -and
                $source.Name -notlike "Promo*" -and
                $source.Name -notlike "Blue*" -and
                $source.Name -notlike "Local*") {
                $source.StartIncrementalCrawl();
                $NonIdleCount++;
            }
        }
        if ($NonIdleCount -ge $MaxNonIdle) { break; }
    }
}
=== END ===
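To run this on a schedule, register the script as a Windows scheduled task. A minimal sketch, assuming the script was saved as C:\Scripts\Start-CrawlBatch.ps1 and that CONTOSO\svc-search has rights to administer the SSA (path, task name, and account are hypothetical):

=== START ===
$action  = New-ScheduledTaskAction -Execute "powershell.exe" `
    -Argument "-NoProfile -ExecutionPolicy Bypass -File C:\Scripts\Start-CrawlBatch.ps1"

# Fire every 15 minutes, indefinitely; the script itself enforces the 10-crawl ceiling
$trigger = New-ScheduledTaskTrigger -Once -At (Get-Date) `
    -RepetitionInterval (New-TimeSpan -Minutes 15) `
    -RepetitionDuration ([TimeSpan]::MaxValue)

# Replace the placeholder password with the real service account credential
Register-ScheduledTask -TaskName "Start crawl batch" -Action $action `
    -Trigger $trigger -User "CONTOSO\svc-search" -Password "********" `
    -RunLevel Highest
=== END ===

These cmdlets require Windows Server 2012 or later; on older hosts, schtasks.exe achieves the same result.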