Azure PaaS Blog

Azure Service Fabric FabricDCA 403 forbidden issue

JerryZhangMS
Former Employee
Feb 17, 2022

When you use a Service Fabric cluster, a component called FabricDCA (Data Collection Agent) is responsible for sending diagnostic data from the underlying VMSS to a specified storage account. Sometimes a node shows up as unhealthy, and Service Fabric Explorer displays the following error message:

 

'FabricDCA' reported Error for property 'DataCollectionAgent.Blob_WindowsFabric_AzureBlobServiceFabricEtw_BlobInitializer'.

 

The Data Collection Agent (DCA) encountered an exception when trying to initialize Azure Storage. Diagnostics information will be left uncollected if this continues to happen. Failed trying to access storage account. Please verify if the connection string provided is correct. AccountName: xxxx  ContainerName : fabriclogs-xxxx-xxxx-xxxx-xxxx-d9bf911e5ade. The remote server returned an error: (403) Forbidden.

 

Error message in SF explorer

This error will not bring the whole cluster down, but it does leave the nodes in an unhealthy state and may affect some operations. For example, if an upgrade runs in Monitored mode, the health check cannot pass in this situation and the upgrade fails.

 

The nature of the issue

The issue is not complicated. As explained above, the FabricDCA component periodically sends data to the target storage account. As the last part of the error message shows, FabricDCA is still running and sending data, but its request receives a 403 Forbidden response instead of the expected 200 or 202 response code.

 

Possible root causes

The possible root causes are the same as for this more general question:

What causes a request sent to an Azure storage account to be returned with a 403 Forbidden error?

 

Generally speaking, it can be summarized into two parts:

  1. Authentication failure
  2. Firewall validation failure

 

For an Azure Storage account, a failure of either of these two validations always returns a 403 Forbidden error.
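
To narrow down which of the two validations is failing, it can help to repeat a similar request by hand. The following is a minimal sketch, assuming the Az.Storage PowerShell module is installed; the account name and key are placeholders to fill in. It lists the fabriclogs containers using only the given key. Keep in mind that it runs from your own machine, so a firewall rule that blocks the cluster's subnet but allows your IP address will not show up here, and a failure may equally mean that the firewall blocks your machine.

```powershell
# Placeholders (assumptions): replace with your diagnostics storage account
# and the access key you want to test.
$accountName = "<storageAccountName>"
$accountKey  = "<accessKey>"

# Build a storage context that authenticates with the account key only.
$ctx = New-AzStorageContext -StorageAccountName $accountName -StorageAccountKey $accountKey

try {
    # The DCA containers start with "fabriclogs", as in the error message above.
    Get-AzStorageContainer -Context $ctx -Prefix "fabriclogs" -ErrorAction Stop |
        Select-Object Name, LastModified
}
catch {
    # A 403 here means the key was rejected or the firewall blocked this client.
    Write-Warning "Storage request failed: $($_.Exception.Message)"
}
```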

 

Possible solutions:

Let's go through the solutions for these two situations.

1. To resolve the authentication failure, we need to make sure that the storage account access key saved in the Service Fabric cluster settings is the current one. This kind of issue is normally caused by regenerating the access key of the storage account.

a. Log in to the Resource Explorer site, switch to Read/Write mode and locate the Service Fabric cluster.

Resource Explorer page

b. Click the blue Edit button and scroll down to the section called diagnosticsStorageAccountConfig.

Edit button of Resource Explorer page

diagnosticsStorageAccountConfig part

c. Open the Access keys page of the storage account in Azure Portal and click the Show keys button at the top. (The storage account is the one named in storageAccountName above. In this example it’s x5bhwuld4hrs42.)

Access Keys page of storage account in Azure Portal

d. Copy the key1 and key2 values into the primaryAccessKey and secondaryAccessKey fields of diagnosticsStorageAccountConfig. If protectedAccountKeyName or protectedAccountKeyName2 contains a value, remove it. The expected result looks like this:

Modified diagnosticsStorageAccountConfig part

e. Click the green PUT button to apply the new access key to the Service Fabric cluster, and monitor the cluster until provisioningState becomes Succeeded.
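
The values used in steps c to e can also be read from PowerShell. This is a sketch rather than a replacement for the Resource Explorer steps. It assumes the Az module is installed, that you have signed in with Connect-AzAccount, and that the cluster and the storage account live in the same resource group; all three names are placeholders.

```powershell
# Placeholders (assumptions): replace with your own names.
$resourceGroup  = "<resourceGroupName>"
$clusterName    = "<clusterName>"
$storageAccount = "<storageAccountName>"

# Current key1/key2 values of the diagnostics storage account.
Get-AzStorageAccountKey -ResourceGroupName $resourceGroup -Name $storageAccount |
    Select-Object KeyName, Value

# Read the cluster resource together with its properties.
$cluster = Get-AzResource -ResourceGroupName $resourceGroup `
    -ResourceType "Microsoft.ServiceFabric/clusters" `
    -Name $clusterName -ExpandProperties

# Should read "Succeeded" once the update from step e has finished.
$cluster.Properties.provisioningState

# The section edited in steps b to d.
$cluster.Properties.diagnosticsStorageAccountConfig
```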

 

 

2. To resolve the firewall validation failure, we can follow these steps.

i. Follow steps 1.a and 1.b to find the name of the diagnostics storage account.

ii. Visit the Networking page of this storage account in Azure Portal.

Networking page of storage account in Azure Portal

iii. Here we have two choices:

  • We can simply change the configuration of Firewall to All networks. This will allow the traffic from the Internet to this storage account directly.

Allow all networks traffic to Storage Account

  • We can also keep the setting as Selected networks and add the Virtual Network subnet used by the cluster, for example:

Allow traffic from specified subnet of Virtual Network to storage account

This allows traffic from the selected Virtual Network subnet to pass through the firewall. It works like an allow list of Virtual Network subnets.
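
The same two choices can be sketched in PowerShell. This assumes the Az.Storage module and a session signed in with Connect-AzAccount; the names, the subnet resource ID and the $allowAllNetworks switch are placeholders introduced for illustration. For the second choice, the subnet normally needs the Microsoft.Storage service endpoint enabled before the rule can be added.

```powershell
# Placeholders (assumptions): replace with your own values.
$resourceGroup    = "<resourceGroupName>"
$storageAccount   = "<storageAccountName>"
$subnetId         = "<resource ID of the subnet used by the cluster's VMSS>"
$allowAllNetworks = $false   # hypothetical switch: $true = first choice, $false = second choice

# Show the current default action and the existing virtual network rules.
Get-AzStorageAccountNetworkRuleSet -ResourceGroupName $resourceGroup -Name $storageAccount

if ($allowAllNetworks) {
    # First choice: allow traffic from all networks.
    Update-AzStorageAccountNetworkRuleSet -ResourceGroupName $resourceGroup `
        -Name $storageAccount -DefaultAction Allow
}
else {
    # Second choice: keep Selected networks and allow the cluster subnet.
    Add-AzStorageAccountNetworkRule -ResourceGroupName $resourceGroup `
        -Name $storageAccount -VirtualNetworkResourceId $subnetId
}
```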

 

 

On the other hand, when you follow the first solution, the resulting cluster upgrade may itself be blocked, since the default upgrade mode is Monitored. For example, in a three-node cluster, once the correct access key has been applied to the first node, the health check still finds the second and third nodes unhealthy. The upgrade then fails its health check and is rolled back.

 

To bypass this problem, please follow these steps:

1. We need a computer where both the Service Fabric SDK and the cluster certificate are installed. The cluster certificate should be installed in the CurrentUser\My certificate store.

2. Open a PowerShell window and use the following commands to connect to your cluster. Remember to replace the values of ClusterName and CertThumbprint with your own cluster endpoint and the thumbprint of your cluster certificate.

# Client connection endpoint of the cluster (port 19000) and the cluster certificate thumbprint
$ClusterName = "sfhttpjerry.eastus.cloudapp.azure.com:19000"
$CertThumbprint = "F7EE27A0E063B681DD95EE3AE98F2F93EC4BB7C0"

Connect-ServiceFabricCluster -ConnectionEndpoint $ClusterName -KeepAliveIntervalInSec 10 -X509Credential -ServerCertThumbprint $CertThumbprint -FindType FindByThumbprint -FindValue $CertThumbprint -StoreLocation CurrentUser -StoreName My

3. Open Service Fabric Explorer. The link can be found on the Overview page of the Service Fabric cluster in Azure Portal.

4. Follow the same steps as in solution 1, but this time, once we click the PUT button, switch to Service Fabric Explorer and keep refreshing the cluster data. We can do this by clicking the refresh button shown in the following screenshot instead of reloading the whole page.

Refresh button of Service Fabric Explorer

5. Once Service Fabric Explorer shows the upgrade as in progress, switch back to the PowerShell window that is connected to the cluster and run the following command:

Update-ServiceFabricClusterUpgrade -UpgradeMode UnmonitoredAuto

This command forcibly changes the mode of the ongoing upgrade to UnmonitoredAuto, which means that no health checks are performed for the rest of the upgrade.
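
Steps 4 and 5 depend on timing, so one option is to let PowerShell wait for the upgrade instead of refreshing Service Fabric Explorer by hand. The loop below is only a sketch: it assumes the session is already connected with Connect-ServiceFabricCluster, that the PUT request has just been sent, and that the upgrade reports the state RollingForwardInProgress once it starts. The polling intervals are arbitrary. If the upgrade never starts, the first loop does not end by itself and has to be stopped with Ctrl+C.

```powershell
# Assumes the session is already connected with Connect-ServiceFabricCluster.
# Wait until the cluster upgrade triggered by the PUT request is rolling forward.
do {
    Start-Sleep -Seconds 5
    $upgrade = Get-ServiceFabricClusterUpgrade
    Write-Host "$(Get-Date -Format T)  UpgradeState: $($upgrade.UpgradeState)"
} until ("$($upgrade.UpgradeState)" -eq "RollingForwardInProgress")

# Switch the ongoing upgrade to UnmonitoredAuto so health checks no longer block it.
Update-ServiceFabricClusterUpgrade -UpgradeMode UnmonitoredAuto

# Keep polling until the upgrade has completed or failed.
do {
    Start-Sleep -Seconds 15
    $upgrade = Get-ServiceFabricClusterUpgrade
    Write-Host "$(Get-Date -Format T)  UpgradeState: $($upgrade.UpgradeState)"
} until ("$($upgrade.UpgradeState)" -match "Completed|Failed")
```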

 

 

Result:

After the access key in the Service Fabric cluster configuration and the firewall setting of the storage account have been corrected, the FabricDCA 403 issue should normally be resolved. If the same error message still appears in Service Fabric Explorer, please open a support ticket with Microsoft for more detailed assistance.
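
Besides looking at Service Fabric Explorer, the health events reported by FabricDCA can be listed per node from PowerShell. This is a sketch that assumes the session is already connected with Connect-ServiceFabricCluster and that the events are reported on the node entities with the source ID FabricDCA, as the error message at the top suggests. Once the issue is resolved, the listed events should no longer show the Error state.

```powershell
# Assumes the session is already connected with Connect-ServiceFabricCluster.
foreach ($node in Get-ServiceFabricNode) {
    $health = Get-ServiceFabricNodeHealth -NodeName $node.NodeName

    # Keep only the health events reported by the Data Collection Agent.
    $health.HealthEvents |
        Where-Object { $_.HealthInformation.SourceId -eq "FabricDCA" } |
        ForEach-Object {
            [pscustomobject]@{
                Node     = $node.NodeName
                State    = $_.HealthInformation.HealthState
                Property = $_.HealthInformation.Property
            }
        }
}
```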

Published Feb 17, 2022
Version 1.0