Forum Discussion
Teams site showing up in public Bing results
Thx David Rosenthal for looping us on this. Started internal investigation right away to avoid confusion.
Thanks for looping us in.
If a search crawler discovers a link to an authenticated SharePoint Online site, it may add the link to the index. Because the site requires authentication, the site title and contents will not be indexed, only the presence of the URL. This should only occur when there is a link to the site collection somewhere on the public internet (e.g. someone might have used anonymous link sharing and posted that link somewhere an internet search crawler could find it).
The timing of this particular thread is an interesting coincidence, as we are doing work in this area. Just this month, to mitigate this, we have added a default robots.txt file to every site collection which instructs search crawlers not to index this URL. This will prevent search crawlers from adding new sites to their index. Existing sites should be removed from the search results next time the site is indexed by the search crawler.
You can actually see the robots.txt file for your domain at https://<tenantURL>/robots.txt. The change is rolling out. It should be mostly complete, but there might be a few deployments that do not yet have it.
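As a rough illustration of how a well-behaved crawler would read such a file, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The robots.txt body shown (`Disallow: /` for all user agents), the `contoso.sharepoint.com` URL, and the `is_crawl_allowed` helper are assumptions for illustration only; the file actually served for your tenant may look different.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt body for illustration; the real file served at
# https://<tenantURL>/robots.txt may contain different rules.
ASSUMED_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_crawl_allowed(robots_txt: str, url: str, user_agent: str = "bingbot") -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical site URL, used only as an example.
    site_url = "https://contoso.sharepoint.com/sites/example"
    print(is_crawl_allowed(ASSUMED_ROBOTS_TXT, site_url))
```

To check a live tenant instead, `RobotFileParser.set_url()` followed by `read()` would fetch the file over the network, assuming it is reachable without signing in.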
Hope this helps!
- BradleyGeldenhuys, Jan 17, 2018, Copper Contributor
That doesn't make sense, as the robots.txt file will not be accessible as the site is locked down.
- Dale Wilson, Apr 21, 2017, Copper Contributor
Just want to say thanks to everyone who responded on the thread. Must say this shook us a bit. We will monitor the results over time based on the robots.txt file.
- Apr 21, 2017
AdamHarmetz,
This afternoon, when I searched for random URLs of our client sites, they appeared in the search results. Sites used by teams also appeared with their URLs. I don't think this was down to sharing sites. Or are you saying that if any anonymous link was ever shared, the whole tenant would have been crawled by Bing?
- AdamHarmetz, Apr 21, 2017
Microsoft
Thanks. There could be more ways in which an internet search crawler might have picked up sites in the past that I do not know about. I will ask around and see.
We do believe that with the work to update robots.txt, these will age out upon the next crawl or over time. To ensure we have a good understanding of the issue, it would be helpful to us to know if you have any sites created in the past day or two that appear in search results. We would be looking for a site that still appears there even after the robots.txt change.
- David Rosenthal, Apr 21, 2017
Microsoft
Great to hear of the work to better protect even the URLs. There seem to be other ways these are getting surfaced though, as my tenant does not and has never had anonymous sharing turned on. I can't rule it out, but I would also be very surprised to find that someone from my team had put one of these URLs out on the public internet.
Is this affected by Guest Access in Office 365 Groups or anything like that? Does that URL have to become visible to the world somehow in order for guests to get to it?
- Apr 21, 2017
Hey Adam: may I ask why this change/issue was not announced in the Message Center?
- AdamHarmetz, Apr 21, 2017
Microsoft
I think this is good feedback and I will share it with the team responsible for this work. I think there should have been a post.