First published on TECHNET on Sep 15, 2018
The topic of this article seems to be pure dynamite. It has been the reason for (emotional) discussions at most customers I have worked with whenever exclusions for SharePoint servers are recommended.
During SharePoint on-prem farm operation, you will face different occasions that require a closer look at the software that safeguards your operating system. Files and folders, processes and services: all of them are constantly inspected by security software. Over the past decades we learned to live with it. We learned to accept and learned to respect anti-virus (AV) scanners. The risk is real: we need modern virus protection.
Although we are threatened by malware and viruses 24/7, operating complex server software requires an advanced understanding of the impact any third-party software might have. Again and again I witness product misbehaviour and performance degradation caused by anti-virus software running on server products. Awareness of this topic is essential. I want to encourage you to really get to know the protection software that you run side by side with SharePoint, and the optimizations that are available for it.
This blog post was written to illustrate the performance part of this discussion. I tested the impact anti-virus real-time scanning has on SharePoint performance and want to share the results. I want to promote the importance of the correct configuration of folder (and/or process) exclusions.
Assessing, discussing and improving server security is part of my daily job at the largest German enterprise customers. Writing this article exposes me to all kinds of complaints, yes, I know. That is why I want to clarify some things upfront:
I am writing this article from an operational excellence perspective. As an accredited Field Engineer for SharePoint Security Risk Assessments, I know the necessity of virus protection and the risks that any real-time scan folder exclusions (aka safe havens for malware) might cause. The highest level of security can only be achieved by implementing holistic approaches that include more than just the use of virus and malware protection without allowing any folder exclusions.
Let me hit the nail on the head by naming the mitigation for one widespread critical operational security issue: limiting and auditing administrative access, as well as implementing secure management of service and administrative accounts. This could help to create a fair balance between a high level of security and operational excellence, even with folder exclusions for real-time protection software in place.
The highest level of security comes with a price, as IT consultants have always preached. If this highest level of security must be met, this article can help you find out what it costs from the SharePoint performance point of view.
Site Data (Content) Export Test
This test was inspired by an actual customer scenario: exporting a huge amount of content to the file system as part of a SharePoint content migration. The customer is running complex export scripts to dump all content to disk. The exported content is then imported to a target farm using PowerShell. These scripts run not just for hours but for days, and they are executed in multiple cycles. I wanted to find out what impact real-time virus scanning has on these kinds of operations and what delays it might cause.
I created a simple PowerShell script (Measure-SiteExport-Duration.ps1) that measures the duration of three consecutive site collection export operations. My test script exports the Central Administration site collection three times in a row (to generate some I/O traffic on the disks), and the duration of each export is measured and tracked.
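The script itself is not reproduced in this post. Purely as an illustration of the measurement pattern, here is a minimal Python sketch that times a number of consecutive runs of an operation. The names `measure_runs` and `fake_export` are hypothetical and not taken from the original script; in the actual test the operation was a site collection export driven from PowerShell.

```python
import time
from typing import Callable, List


def measure_runs(operation: Callable[[], None], runs: int = 3) -> List[float]:
    """Run `operation` `runs` times in a row and return each duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        durations.append(time.perf_counter() - start)
    return durations


def fake_export() -> None:
    # Hypothetical stand-in for the real export; it only burns a little time.
    time.sleep(0.01)


if __name__ == "__main__":
    for i, seconds in enumerate(measure_runs(fake_export), start=1):
        print(f"Run {i}: {seconds:.3f} s")
```

Keeping the individual durations, rather than only a total, makes it possible to spot outliers such as a slow first run.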
Search Full Crawl Test
SharePoint Search is a crucial service at most customers. Huge numbers of items must be crawled as often as possible, and keeping crawl times as short as possible is one piece of the puzzle to ensure maximum crawl performance. Search is a complex and performance-intensive service, both on the SharePoint side and on the SQL backend, so it makes sense to check the impact of anti-virus real-time scanning. This test only covers the SharePoint part of the story. The results surprised me.
In my test, I measured the duration of 5 consecutive full crawls of one single content source containing one web application. I apologize for the small number of items crawled in each full crawl (2300), but a lab environment is what it is: a lab environment (and this allowed me to test in a reasonable amount of time).
Client Performance Test
How big is the direct impact of server-side real-time scanning on people using SharePoint? Interesting question!
Though it is difficult to quantify end user performance, I made use of a scripted toolset I have used in the past to measure the time for uploads and downloads. This won't exactly represent all kinds of user interaction (aspx site requests are not covered by this test), but it gives an idea and a reason for further experiments in production-like environments.
The utilized script was run by a SharePoint user account from a client computer.
The following steps were involved, and the overall duration of all steps was measured:
Downloading Office (Excel) Document 1 MB (5 times)
Downloading Office (Excel) Document 2 MB (5 times)
Downloading Office (Excel) Document 3 MB (5 times)
Downloading Office (Excel) Document 4 MB (5 times)
Downloading Office (Excel) Document 5 MB (5 times)
Downloading Office (Excel) Document 10 MB (5 times)
Downloading Office (Excel) Document 20 MB (5 times)
Uploading Office (Excel) Document 1 MB (5 times)
Uploading Office (Excel) Document 2 MB (5 times)
Uploading Office (Excel) Document 3 MB (5 times)
Uploading Office (Excel) Document 4 MB (5 times)
Uploading Office (Excel) Document 5 MB (5 times)
Uploading Office (Excel) Document 10 MB (5 times)
Uploading Office (Excel) Document 20 MB (5 times)
I tried to get as much insight as possible for all three test types. I did not just repeat the tests with anti-virus scanning turned on or off. I tried to get as close to real life as possible and tested multiple scenarios, including real-time protection configured according to best practices and real-time scanning without folder exclusions.
I repeated all tests, eliminating the lowest and highest measured values and then calculated averages.
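The averaging described above (drop the lowest and the highest value, then average the rest) can be sketched as follows. This is a minimal illustration, not the code used for the tests; the function name `trimmed_average` and the sample durations are made up for the example.

```python
from statistics import mean
from typing import Sequence


def trimmed_average(durations: Sequence[float]) -> float:
    """Average the measurements after dropping one lowest and one highest value."""
    if len(durations) < 3:
        raise ValueError("need at least three measurements to drop min and max")
    ordered = sorted(durations)
    return mean(ordered[1:-1])


if __name__ == "__main__":
    # Made-up durations in seconds, for illustration only.
    sample = [61.0, 58.0, 95.0, 60.0, 59.0]
    print(trimmed_average(sample))
```

Dropping the extremes keeps a single unusually slow or fast run from skewing the average.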
For all script-based testing, I always started the PowerShell scripts in a new process and closed open sessions before starting the next test, to avoid any kind of caching.
Site Data Export Test Results
The test results highlight the importance of applying a folder exclusion for the destination path of your site export if you are using a real-time scanning solution. Running the exports without following best practices increased the duration of the content export by up to 74%.
Search Full Crawl Test Results
The results of this test setup surprised me a bit, as I expected more activity on the local disks (search index) during crawling. The main finding and good news is that real-time virus scans do not seem to slow down crawling notably in my test setup.
I am assuming this might differ from farm to farm, though, due to the nature of the content. Huge amounts of Office documents that can be fully indexed with out-of-the-box iFilters would cause more I/O traffic on servers with index components. Content sources with crawled file types like scanned PDFs or images have a smaller impact on the SharePoint index in the file system.
The amount and type of content in my test lab is not very representative. If you are constantly looking to improve crawl performance, like some of my customers, I'd recommend testing in your lab environments with a copy of your production content.
Client Performance Test Results
This test gathers data that is directly related to end user experience by measuring the duration of downloads and uploads. Anything that directly degrades SharePoint end user performance usually gets immediate management attention. The results were what I expected. Compared with not using any virus scanning, real-time protection following the best practices increased the measured durations by about 10%, while real-time scanning without folder exclusions increased them by almost 20%.
On highly utilized web front-end servers an even higher impact is likely, due to the increased traffic in the IIS blob cache, IIS and ASP.NET related folders. The test results in my lab were produced with just one user consuming the farm.
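The percentages quoted in the results are increases relative to the run without any virus scanning. A minimal sketch of that calculation is shown below. The function name `percent_increase` is hypothetical, and the durations in the example are placeholder values chosen to show the arithmetic, not the measurements from my lab.

```python
def percent_increase(baseline: float, measured: float) -> float:
    """Return how much longer `measured` is than `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline duration must be positive")
    return (measured - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Placeholder durations in seconds, NOT real test results.
    no_av = 100.0
    av_with_exclusions = 110.0
    av_without_exclusions = 120.0

    print(f"With exclusions:    +{percent_increase(no_av, av_with_exclusions):.0f}%")
    print(f"Without exclusions: +{percent_increase(no_av, av_without_exclusions):.0f}%")
```

Expressing each scenario against the same baseline makes the results comparable across the export, crawl and client tests.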
The conclusion of this experiment is evident: if you are looking for another performance tweak, double check your current AV settings.
I was running all of my testing on a local, virtual SharePoint environment:
Please share your thoughts & experiences in the comments. Any feedback to this post is highly welcome.
Thank you for rating this article if you liked it or if it was helpful for you. Feel free to share this post on social media.