Emerging Issue - Remote Access Is Disabled by External Access Policy and NTLM Is Disabled
If legacy authentication methods are turned off externally by following https://docs.microsoft.com/en-us/skypeforbusiness/plan-your-deployment/modern-authentication/turn-on-modern-auth, and remote access for the user is also disabled by External Access Policy, a bug causes clients on the external network to enter an infinite loop in which they repeatedly try to authenticate and receive a 403 Forbidden error. This generally happens whenever the client is not connected to VPN.
The bug manifests in many ways, some of which are listed below:
- The size of the LCSCDR database can increase considerably, especially the dbo.Registration table
- CDR/QoE reports may be delayed
- In rare cases, replication shows a single secondary instead of both active secondaries and auto-corrects after several hours
The LYSS database can also grow in size, and you will notice the following events:
Event ID | Event ID text | Notes
32056 | Space Used by LYSS DB is within normal range | DB utilization > 0% and < 40%
32057 | Space Used by LYSS DB is at or above the Warning Threshold. | DB utilization >= 40% and < 60%
32059 | Space Used by LYSS DB is at or above the Critical Threshold | DB utilization >= 60%
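As a rough illustration of the thresholds in the table above, the following sketch maps a utilization percentage to the event ID you would expect to see. The function name Get-LyssUtilizationEventId is hypothetical and is not a Skype for Business Server cmdlet; it only encodes the ranges listed above, and it treats exactly 0% as normal, which is an assumption.

```powershell
# Hypothetical helper (not a Skype for Business Server cmdlet): maps a LYSS DB
# utilization percentage to the event ID listed in the table above.
function Get-LyssUtilizationEventId {
    param(
        [Parameter(Mandatory = $true)]
        [ValidateRange(0, 100)]
        [double]$UtilizationPercent
    )

    if ($UtilizationPercent -ge 60) { return 32059 }  # at or above critical threshold
    if ($UtilizationPercent -ge 40) { return 32057 }  # at or above warning threshold
    return 32056                                      # normal range (0% assumed normal)
}

Get-LyssUtilizationEventId -UtilizationPercent 45    # returns 32057
```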
Depending on how long the issue has been occurring, the size of the environment, and other factors such as user behavior, the following event IDs may also be seen:
Event ID | Event ID text | Notes
32075 | A full flush of all queue items for LYSS DB has started. |
32076 | A full flush of all queue items for LYSS DB has completed. |
32089 | A flush of queue items from the LYSS DB was initiated, and items were exported to the file system. |
32090 | Flushed queue Items from the LYSS DB have been left unattended to for some amount of time and require attention to be imported back. |
32103 | Fabric service id 'ROUTING GROUP GUID' is running with a reduced replication set. | Get-CsPoolFabricState will show routing groups with missing secondaries
You can run a SQL query against the LYSS database to confirm whether you have been experiencing issues with diagnostic ID 4003 (the MsDiagId value stored in the queued item headers):
USE lyss;
-- Count queued items per MsDiagId value found in the item header, most frequent first
SELECT
    SUBSTRING(
        CONVERT(VARCHAR(MAX), CONVERT(VARBINARY(MAX), [ItemHeader])),
        CHARINDEX('<MsDiagId>', CONVERT(VARCHAR(MAX), CONVERT(VARBINARY(MAX), [ItemHeader]))) + 10,
        CHARINDEX('</MsDiagId>', CONVERT(VARCHAR(MAX), CONVERT(VARBINARY(MAX), [ItemHeader])))
            - (10 + CHARINDEX('<MsDiagId>', CONVERT(VARCHAR(MAX), CONVERT(VARBINARY(MAX), [ItemHeader]))))
    ) 'MsDiag',
    COUNT(1) 'Count'
FROM [lyss].[dbo].[ItemQueue]
WHERE CHARINDEX('<MsDiagId>', CONVERT(VARCHAR(MAX), CONVERT(VARBINARY(MAX), [ItemHeader]))) > 0
GROUP BY
    SUBSTRING(
        CONVERT(VARCHAR(MAX), CONVERT(VARBINARY(MAX), [ItemHeader])),
        CHARINDEX('<MsDiagId>', CONVERT(VARCHAR(MAX), CONVERT(VARBINARY(MAX), [ItemHeader]))) + 10,
        CHARINDEX('</MsDiagId>', CONVERT(VARCHAR(MAX), CONVERT(VARBINARY(MAX), [ItemHeader])))
            - (10 + CHARINDEX('<MsDiagId>', CONVERT(VARCHAR(MAX), CONVERT(VARBINARY(MAX), [ItemHeader]))))
    )
ORDER BY 2 DESC;
The output lists each MsDiag value with its count, highest count first; a large count for 4003 suggests that you are affected.
This issue has been fixed in a client update with version 16.0.11901.10000, but the default behavior hasn't been changed. To remediate the issue, you need a client policy (or a GPO) along with an updated client.
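To check whether a given client build already contains the fix, a version comparison along the following lines may help. This is a minimal sketch: the function name Test-SfbClientHasFix is hypothetical, the sample version strings are illustrative only, and it assumes that every build at or above 16.0.11901.10000 retains the fix.

```powershell
# Hypothetical helper: returns $true when a client build is at or above the
# build named in this article as containing the fix.
function Test-SfbClientHasFix {
    param(
        [Parameter(Mandatory = $true)]
        [string]$ClientVersion
    )

    $fixedBuild = [version]'16.0.11901.10000'   # build number taken from this article
    # [version] compares Major, Minor, Build and Revision numerically, in that order.
    return ([version]$ClientVersion -ge $fixedBuild)
}

Test-SfbClientHasFix -ClientVersion '16.0.11328.20158'   # returns $false (sample value)
Test-SfbClientHasFix -ClientVersion '16.0.12026.20320'   # returns $true  (sample value)
```

Casting to [version] avoids the pitfalls of comparing version strings alphabetically, where '16.0.9126.2000' would sort after '16.0.11901.10000'.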
For the fix to take effect, the registry value ForbiddenRemoteAccessIsPermanentError must be set as shown below.
Path: HKEY_CURRENT_USER\Software\Policies\Microsoft\Office\16.0\Lync
KeyName: ForbiddenRemoteAccessIsPermanentError
Value: 1
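For a single machine, for example while testing, the value above could be set with PowerShell roughly as follows. This is a sketch under one assumption that the article does not state: that the value is a DWORD. Confirm the expected type before rolling it out through a GPO.

```powershell
# Sketch: set the per-user policy value described above on the local machine.
# ASSUMPTION: the value type is DWORD; the article only gives the name and the value 1.
$path = 'HKCU:\Software\Policies\Microsoft\Office\16.0\Lync'

# Create the policy key if it does not exist yet.
if (-not (Test-Path -Path $path)) {
    New-Item -Path $path -Force | Out-Null
}

# Create or overwrite the value.
New-ItemProperty -Path $path -Name 'ForbiddenRemoteAccessIsPermanentError' `
    -Value 1 -PropertyType DWord -Force | Out-Null

# Read the value back to confirm it was written.
Get-ItemProperty -Path $path -Name 'ForbiddenRemoteAccessIsPermanentError'
```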
The setting can also be pushed through a client policy entry, for example by adding the policy entry to the global client policy:
$a = New-CsClientPolicyEntry -name ForbiddenRemoteAccessIsPermanentError -value "True"
Set-CsClientPolicy -Identity Global -PolicyEntry @{Add=$a}
For the client policy to be applied, a successful logon is required, so users need to sign in at least once so that the data is cached and used for subsequent failures.
Once the appropriate changes have been made, users will see an error message when logging on remotely.
Please note: at this time, only the Skype for Business 2016 client has a fix, and there are no planned changes for the Skype for Business 2015 client.
We understand that updating the clients may take some time, and that organizations may want a workaround to prevent work disruptions while the clients are being updated. At this time, we recommend the following:
- Disable the Storage Service Auto Import functionality, so that data is imported in a controlled manner and potential issues are avoided, by running
Set-CsStorageServiceConfiguration -EnableAutoImportFlushedData $false
- Perform a full flush of the Storage Service before the start of the day, so that the automatic export, which is resource-intensive (CPU, memory, disk, network), does not happen under load during business hours, by running
Invoke-CsStorageServiceFlush -FlushType FullFlush -PoolFqdn POOLFQDN
This may also prevent FabricReplicationSetReduction from happening in your organization, if it was previously occurring.
Finally, it's possible that XML files have been written to your file share that may need to be imported for regulatory and/or compliance purposes. Please reach out to Microsoft Support for help determining how and when the data can be imported safely.