Secondary Access to File Stream(s) File, In Open Transaction, May Hang
Published Jan 15 2019 01:50 PM 151 Views
Microsoft
First published on MSDN on May 04, 2011

This post is directly from an issue I have been working on.  The behavior is very difficult to simulate because of the flags and timings involved.   In this post I will attempt to describe the scenario and provide you with a simple workaround for your applications.

Reproduction Steps

1. Begin a SQL Server transaction

2. Obtain the file stream context and logical path

3. Open the file stream file for Write but allow caching (do not enable the WriteThrough flag)   Ex:   using (var LobjSqlStream = new SqlFileStream(LstrStringPath, LobjBuffer, FileAccess.Write))

4. Write data to the file.  This is an important step because a dirty file is required to trigger the behavior.  Data must be held in the NTFS file system cache.

5. Close the file but leave the SQL Server transaction active.    This is another important facet as you can only trigger the behavior for files that were opened cached, dirtied and are still part of an active SQL Server transaction.

Now a secondary access to the file is required to trigger the problem.

6. While the SQL Server transaction is still active a second request to open the file must be made.   Not only is the open required but it must specify the file should be opened WITHOUT file system caching.  Ex: using (var LobjSqlStream = new SqlFileStream(LstrStringPath, LobjBuffer, FileAccess.Write, FileOptions.WriteThrough))

Workaround

When opening the file stream file for writing be sure to always specify the WriteThrough flag to avoid file system cache and the associated behaviors described in this post.

using (var LobjSqlStream = new SqlFileStream(LstrStringPath, LobjBuffer, FileAccess.Write, FileOptions.WriteThrough))

More Information

I indicated that this is a difficult issue to simulate because it took me 2 days using the kernel debugger and breakpoints to trigger the behavior and develop a reliable and repeatable, reproduction of the problem.   What is happening is a maintenance operation by NTFS during the second open request.

The first open and close sequence results in cached, write data in the file system cache.  When you are allowing NTFS to cache the data the Close operation from the client does not force the data out of NTFS cache.  The cache uses its own algorithms for maintenance.   However, the second open request indicates it wants a NON-CACHED access path.   The CreateFile call to NTFS does a lookup in the cache and finds data that needs to be flushed before this NON-CACHED access path can be honored.   When NTFS finds this situation a close it triggered on the original stream data ($DATA stream of the file) and the Rsfx filter is invoked.   The file stream filter driver (Rsfx) sees that the SQL Server transaction is still active and attempts to flush the data to honor the transactional requirements.    In doing so it causes a deadlock on the thread with itself as NTFS has already determined it needs to flush the data as well.

This results in the CreateFile call from the client application hanging, waiting for the System.exe thread to complete the create but the system thread has deadlocked itself.

If the SQL Server transaction would have been committed or rolled back the RsFx driver would have made sure the cached data was flushed to disk as part of the transactional logic.   Why didn’t I suggest a workaround of making sure to commit the transaction before a second access?   As always keeping the transaction open for a short period helps concurrency so this is a good practice.   However, I have been able to trigger the behavior with virus scanners, log backup and a simple test application that just opens the file at the right time.   Keeping the time between the close of the file and the commit of the transaction short is important but it is not a completely safe workaround.    The safe workaround is opening the file with WriteThrough to avoid NTFS cache in the first place.

You can restart the client application that is stuck in the CreateFile call to ‘un-hang’ you client.   Most of the time it will also require a reboot of the server to clear the system thread deadlock as well.

Fix: A bug has been filed with the SQL Server team and is being evaluated for inclusion in an upcoming CU release.

Bob Dorr - Principal SQL Server Escalation Engineer


Version history
Last update:
‎Jan 15 2019 01:50 PM
Updated by: