Azure Service Fabric
Common causes of SSL/TLS connection issues and solutions
In the TLS connection common causes and troubleshooting guide (microsoft.com), the mechanism of establishing SSL/TLS connections and the tools to troubleshoot them were introduced. In this article, I would like to introduce three common issues that may occur when establishing an SSL/TLS connection, and the corresponding solutions for Windows, Linux, .NET and Java:

1. TLS version mismatch
2. Cipher suite mismatch
3. TLS certificate is not trusted

TLS version mismatch

Before we jump into solutions, let me introduce how the TLS version is determined. As described in the dataflow in the first section (https://techcommunity.microsoft.com/t5/azure-paas-blog/ssl-tls-connection-issue-troubleshooting-guide/ba-p/2108065), a TLS connection is always started from the client end: the client proposes a TLS version, and the server only checks whether it supports that version. If the server supports the proposed version, the conversation continues; if not, the conversation ends.

Detection

You may test with the tools introduced in the TLS connection common causes and troubleshooting guide (microsoft.com) to verify whether a TLS connection issue was caused by a TLS version mismatch. If you capture network packets, you can also view the TLS version specified in the Client Hello. If the connection is terminated without a Server Hello, it could be either a TLS version mismatch or a cipher suite mismatch.

Solution

Different types of clients have their own mechanisms to determine the TLS version. For example, web browsers (IE, Edge, Chrome, Firefox) each have their own set of supported TLS versions; applications have their own libraries that define the TLS version; and the operating system, such as Windows, also supports defining the TLS version at the OS level.

Web browser

In the latest Edge and Chrome, TLS 1.0 and TLS 1.1 are deprecated. TLS 1.2 is the default TLS version for these two browsers.
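Before changing any client settings, it can help to confirm what the local TLS stack can actually offer. The Python sketch below is illustrative only (it is not part of the original guide): it prints which protocol versions the machine's OpenSSL build supports and the minimum version the default client context will propose.

```python
import ssl

# Which protocol versions can this machine's TLS stack offer in a
# Client Hello? (Availability depends on the local OpenSSL build.)
print("OpenSSL build:", ssl.OPENSSL_VERSION)
print("TLS 1.0 available:", ssl.HAS_TLSv1)
print("TLS 1.1 available:", ssl.HAS_TLSv1_1)
print("TLS 1.2 available:", ssl.HAS_TLSv1_2)
print("TLS 1.3 available:", ssl.HAS_TLSv1_3)

# The default client context also carries a minimum-version policy.
ctx = ssl.create_default_context()
print("Default minimum version:", ctx.minimum_version)
```

If, for example, TLS 1.2 shows as unavailable here, no application-level setting on this machine can make the client propose it.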
Below are the steps for setting the TLS version in Internet Explorer and Firefox; they work on Windows 10.

Internet Explorer

Search for Internet Options, then find the setting in the Advanced tab.

Firefox

Open Firefox and type about:config in the address bar. Type tls in the search bar and find the settings security.tls.version.min and security.tls.version.max. The values define the range of supported TLS versions: 1 is TLS 1.0, 2 is TLS 1.1, 3 is TLS 1.2, 4 is TLS 1.3.

Windows system

Different Windows OS versions have different default TLS versions. The default TLS version can be overridden by adding/editing the DWORD registry values 'Enabled' and 'DisabledByDefault'. These registry values are configured separately for the protocol client and server roles, under registry subkeys named using the following format:

<SSL/TLS/DTLS> <major version number>.<minor version number>\<Client|Server>

For example, below is a registry path with a version-specific subkey:

Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.2\Client

For the details, please refer to Transport Layer Security (TLS) registry settings | Microsoft Learn.

Applications running on the .NET Framework

Such applications use the OS-level configuration by default. For a quick test of HTTP requests, you can add the line below to specify the TLS version in your application before the TLS connection is established. To be safe, you may define it at the beginning of the project.

ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12

The line above can be used as a quick test to verify the problem; it is always recommended to follow the document below for best practices:

https://docs.microsoft.com/en-us/dotnet/framework/network-programming/tls

Java applications

For a Java application that uses Apache HttpClient to communicate with an HTTP server, you may check How to Set TLS Version in Apache HttpClient | Baeldung for how to set the TLS version in code.
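The same kind of version pinning can be sketched in Python as an illustrative analogue of the .NET one-liner above (this is not from the original article): restricting the client context to a single version reproduces a version mismatch against any server that does not support it.

```python
import ssl

# Restrict the client side to offer exactly TLS 1.2, mirroring
# SecurityProtocolType.Tls12 in the .NET example above.
ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_2
ctx.maximum_version = ssl.TLSVersion.TLSv1_2

# Pass ctx to http.client / urllib when opening connections; a server
# that cannot negotiate TLS 1.2 will fail the handshake, i.e. the
# capture shows a Client Hello with no Server Hello.
```

As with the .NET snippet, treat this as a diagnostic aid rather than a permanent setting; pinning a single version removes forward compatibility with newer TLS versions.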
Cipher suite mismatch

Like a TLS version mismatch, a cipher suite mismatch can also be tested with the tools introduced in the previous article.

Detection

In the network packets, the connection is terminated after the Client Hello, so if you do not see a Server Hello packet, that indicates either a TLS version mismatch or a cipher suite mismatch. If the server supports public access, you can also test it with SSL Labs (https://www.ssllabs.com/ssltest/analyze.html) to detect all supported cipher suites.

Solution

In the process of establishing an SSL/TLS connection, the server makes the final decision on which cipher suite is used in the communication. Different Windows OS versions support different TLS cipher suites and priority orders. For the supported cipher suites, please refer to Cipher Suites in TLS/SSL (Schannel SSP) - Win32 apps | Microsoft Learn for details.

If a service is hosted on Windows, the default order can be overridden by the group policy below to affect the logic of choosing a cipher suite. The steps work on Windows Server 2019:

Edit group policy -> Computer Configuration -> Administrative Templates -> Network -> SSL Configuration Settings -> SSL Cipher Suite Order. Enable it and configure the priority list with all the cipher suites you want.

The cipher suites can be manipulated by command as well; please refer to TLS Module | Microsoft Learn for details.

TLS certificate is not trusted

Detection

Access the URL from a web browser. It does not matter whether the page can be loaded or not: before loading anything from the remote server, the web browser tries to establish the TLS connection. If you see the error below returned, it means the certificate is not trusted on the current machine.

Solution

To resolve this issue, we need to add the CA certificate into the client's trusted root store. The CA certificate can be obtained from the web browser: click the warning icon showing 'isn't secure' in the browser, click the 'show certificate' button, and export the certificate.
Import the exported crt file into the client system.

Windows

Open 'Manage computer certificates'. Under Trusted Root Certification Authorities -> Certificates, choose All Tasks -> Import, and select the exported crt file, leaving the other settings at their defaults.

Ubuntu

The command below lists the CAs currently trusted by the system:

awk -v cmd='openssl x509 -noout -subject' ' /BEGIN/{close(cmd)};{print | cmd}' < /etc/ssl/certs/ca-certificates.crt

If you do not see the desired CA in the result, the commands below add a new CA certificate:

$ sudo cp <exported crt file> /usr/local/share/ca-certificates
$ sudo update-ca-certificates

RedHat/CentOS

The command below lists the CAs currently trusted by the system:

awk -v cmd='openssl x509 -noout -subject' ' /BEGIN/{close(cmd)};{print | cmd}' < /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem

If you do not see the desired CA in the result, the commands below add a new CA certificate:

sudo cp <exported crt file> /etc/pki/ca-trust/source/anchors/
sudo update-ca-trust

Java

The JVM uses a trust store which contains certificates of well-known certification authorities. The trust store on the machine may not contain the new certificates that we recently started using. If this is the case, the Java application will receive SSL failures when trying to access the storage endpoint.
The errors would look like the following:

Exception in thread "main" java.lang.RuntimeException: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at org.example.App.main(App.java:54)
Caused by: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:130)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:371)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:314)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:309)

Run the command below to import the crt file into the JVM certificate store. The command works on JDK 19.0.2:

keytool -importcert -alias <alias> -keystore "<JAVA_HOME>/lib/security/cacerts" -storepass changeit -file <crt_file>

The command below exports information about the certificates currently in the JVM certificate store:

keytool -keystore "<JAVA_HOME>\lib\security\cacerts" -list -storepass changeit > cert.txt

The certificate will be listed in the cert.txt file if it was imported successfully.

How to enable IPv4+IPv6 dual-stack feature on Service Fabric cluster
As IPv4 addresses are already exhausted, more and more service providers and website hosts have started using IPv6 addresses on their servers. Although IPv6 is the successor of IPv4, their packet headers and address formats are completely different; for this reason, users can consider IPv6 a different protocol from IPv4. In order to be able to communicate with a server that uses the IPv6 protocol only, we'll need to enable the IPv6 protocol on the Service Fabric cluster and its related resources. This blog will mainly talk about how to enable this feature on a Service Fabric cluster by ARM template.

Prerequisite:

We should be familiar with how to deploy a Service Fabric cluster and related resources by ARM template. This is not only about downloading the ARM template from the official document, but also includes preparing a certificate, creating a Key Vault resource and uploading the certificate into the Key Vault. For detailed instructions, please check this official document.

Abbreviations:

VMSS - Virtual Machine Scale Set
SF - Service Fabric
NIC - Network Interface Configuration
OS - Operating System
VNet - Virtual Network

Limitation:

Currently, none of the Windows OS images with container support (that is, images whose names end with -Containers, such as WindowsServer 2016-Datacenter-with-Containers) support enabling the IPv4+IPv6 dual-stack feature.

The design changes before and after enabling this feature:

Before talking about the ARM template changes, it's better to have a full picture of the changes to each resource type. Before enabling the IPv6 dual-stack feature, the traffic flow of the Service Fabric cluster is as follows:

1. Client sends a request to the public IP address, which is in IPv4 format. The protocol used is IPv4.
2. Load Balancer listens on its frontend public IP address and decides which VMSS instance to route the traffic to according to the 5-tuple rule and the load balancing rules.
3. Load Balancer forwards the request to the NIC of the VMSS.
4. The NIC of the VMSS forwards the request to the specific VMSS node with the IPv4 protocol. The internal IP addresses of the VMSS nodes are also in IPv4 format.

After enabling the IPv6 dual-stack feature, the traffic flow of the Service Fabric cluster is as follows:

1. Client sends a request to one of the public IP addresses associated with the Load Balancer. One of them is an IPv4 address and the other is an IPv6 address. The protocol used can be IPv4 or IPv6, depending on which public IP address the client sends the request to.
2. Load Balancer listens on its frontend public IP addresses and decides which VMSS instance to route the traffic to according to the 5-tuple rule. Then, according to the load balancing rules and the incoming request protocol, Load Balancer decides which protocol to use to forward the request.
3. Load Balancer forwards the request to the NIC of the VMSS.
4. The NIC of the VMSS forwards the request to the specific VMSS node. The protocol here is the one the Load Balancer decided on in step 2. The VMSS nodes have both IPv4 and IPv6 internal IP addresses.

By comparing the traffic before and after enabling the dual-stack feature, it's not difficult to find the differences, which are also the configurations to be changed:

1. Users need to create a second public IP address with an IPv6 address and set the type of the first public IP address to IPv4.
2. Users need to add the IPv6 address range to the VNet and subnet used by the VMSS.
3. Users need to add an additional load balancing rule in the Load Balancer for IPv6 traffic.
4. Users need to modify the NIC of the VMSS to accept both the IPv4 and IPv6 protocols.
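To make the NIC change concrete: a dual-stack NIC ends up with two IP configurations on the VMSS network profile. The fragment below is only an illustrative sketch (property names follow the Microsoft.Network VMSS schema as I understand it; the names and resource IDs are placeholders, not values from this article's template):

```json
"ipConfigurations": [
  {
    "name": "ipconfig-v4",
    "properties": {
      "primary": true,
      "privateIPAddressVersion": "IPv4",
      "subnet": { "id": "<subnet resource id>" },
      "loadBalancerBackendAddressPools": [ { "id": "<IPv4 backend pool id>" } ]
    }
  },
  {
    "name": "ipconfig-v6",
    "properties": {
      "privateIPAddressVersion": "IPv6",
      "subnet": { "id": "<subnet resource id>" },
      "loadBalancerBackendAddressPools": [ { "id": "<IPv6 backend pool id>" } ]
    }
  }
]
```

The existing IPv4 configuration is marked primary and explicitly versioned, while the new IPv6 configuration is routed to the IPv6 backend pool added on the Load Balancer.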
In addition to the above four points, users also need to remove the nicPrefixOverride setting from the VMSS SF node extension, because this override setting doesn't support the IPv4+IPv6 dual stack yet. There is no risk in removing this setting, because it only takes effect when the SF cluster works with containers, which won't happen in this scenario due to the limitation on the OS image.

Changes to the ARM template:

After understanding the design changes, the next part is about the changes in the ARM template. If users are going to deploy a new SF cluster with this feature, here are the ARM template and parameter files. After downloading these files:

1. Modify the parameter values in the parameter file.
2. Decide whether to change the value of some variables, for example the IP ranges of the VNet and subnet, the DNS name, the Load Balancer resource and load balancing rule names, etc. These values can be customized according to users' own requirements. If this is for test purposes only, this step can be skipped.
3. Use the preferred way, such as the Azure Portal, Azure PowerShell or the Azure CLI, to deploy the template.

If users are going to upgrade an existing SF cluster and its resources to enable this feature, here are the points which users will need to modify in their own ARM template. The template in this blog is modified based on the official example ARM template. To understand the changes more easily, the templates before and after the change are both provided here. Please follow the explanation below and compare the two templates to see what change is needed.

In the variables part, multiple variables should be added; these variables will be used in the next parts.

Tip: As documented here, the IP range of the subnet with IPv6 (subnet0PrefixIPv6) must end with "/64".
Variable name - Explanation:

dnsIPv6Name - DNS name to be used on the public IP address resource with the IPv6 address
addressPrefixIPv6 - IPv6 address range of the VNet
lbIPv6Name - IPv6 public IP address resource name
subnet0PrefixIPv6 - IPv6 address range of the subnet
lbIPv6IPConfig0 - IPv6 IP config name in the Load Balancer
lbIPv6PoolID0 - IPv6 backend pool name in the Load Balancer

For the VNet:

Add the IPv6 address ranges to the VNet address space and the subnet address range.

For the public IP address:

1. Set the publicIPAddressVersion property of the existing public IP address to IPv4.
2. Create a new public IP address resource with an IPv6 address.

For the Load Balancer (referenced document):

a. Add an IPv6 frontend IP configuration.
b. Add an IPv6 backend IP address pool.
c. Duplicate every existing load balancing rule into an IPv6 version (only one is shown as an example here). Note: as documented here, the idle timeout setting of an IPv6 load balancing rule cannot be modified yet; the default timeout is 4 minutes.
d. Modify the depending resources.

For the VMSS (referenced document):

a. Remove the nicPrefixOverride setting from the SF node extension.
b. Set the primary and privateIPAddressVersion properties of the existing IP configuration and add the IPv6-related configuration to the NIC part.

For SF:

Modify the depending resources.

Tips: This upgrade operation can take a long time; my test took about 3 hours to finish the upgrade. Traffic to the IPv4 endpoint will not be blocked during this process.

The result of the upgrade is shown in the before and after screenshots.

The way to test IPv6 communication to the SF cluster:

To verify whether communication to the SF cluster over the IPv6 protocol is enabled, we can do it this way. For Windows VMSS: simply open the Service Fabric Explorer website with your IPv6 IP address or domain name.
The domain names of the IPv4 and IPv6 public IP addresses can be found on the public IP address overview pages (the IPv6 public IP address is used as the example here).

The result with the IPv4 domain URL: https://sfjerryipv6.eastus.cloudapp.azure.com:19080/Explorer

For IPv6, we only need to replace the part between https:// and :19080/Explorer with the domain name from the IPv6 public IP address page, for example: https://sfjerryipv6-ipv6.eastus.cloudapp.azure.com:19080/Explorer

If both tests return the SF Explorer page correctly, then the IPv4+IPv6 dual-stack feature of the SF cluster is verified as working.

For Linux VMSS: by design, SF Explorer currently doesn't work over IPv6 for a Linux SF cluster. To verify IPv6 traffic to a Linux SF cluster, please deploy a webpage application which listens on port 80 and use the IPv6 domain name with port 80 to visit the website. If everything works well, it will return the same page as when visiting the IPv4 domain name with port 80.

(Optional) The way to test IPv6 communication from the SF cluster's backend VMs:

As explained in the first part, communication from the SF cluster to other remote servers over IPv6 will also be an increasing requirement. After enabling the IPv4+IPv6 dual-stack feature, not only can users reach the SF cluster over IPv6, but communication from the SF cluster to other servers over IPv6 is enabled as well. To verify this, we can do it this way.

For Windows VMSS: RDP into any node, install and open the Edge browser, and visit https://ipv6.google.com. If it returns the normal Google homepage, then it's working well.

For Linux VMSS: SSH into any node and run the command:

curl -6 ipv6.google.com

If the result starts with <!doctype html> and you can find data like <meta content="Search the world's information, ...>, then it's working well.

Summary

By following this step-by-step guideline, enabling the IPv4+IPv6 dual-stack feature should not be a blocker for the usage of the SF cluster in the future.
As the SF cluster itself has a complicated design, and this process involves changes to multiple resources, if there is any difficulty, please do not hesitate to reach out to Azure customer support for help.

Unable to Load Service Fabric Explorer
Service Fabric Explorer (SFX) is an open-source tool for inspecting and managing Azure Service Fabric clusters. Service Fabric Explorer is also available as a desktop application for Windows, macOS and Linux. To launch SFX in a web browser, browse to the cluster's HTTP management endpoint from any browser - for example https://clusterFQDN:19080.

Service Fabric Explorer may fail to load for numerous reasons. The most frequent are access being denied or being unable to choose the right certificate. The following steps provide some useful insights on investigation steps and mitigations to be followed in such scenarios.

1. Check the status of the cluster and the certificate being used to access the cluster. If the cluster state is "Upgrade service unreachable", then most likely the certificate is expired. If the certificate has a warning stating it is not under a trusted root and is issued by a third-party certificate issuer, then add it to the trusted root certificates and exclude the certificate issuer from any security rules that would block access to the site. Furthermore, if the cluster is healthy and the certificate is not expired, verify whether the certificate provided to the cluster is the wrong one.

2. If the issue persists after verifying correct certificate usage and validity, clear the browser session and cache to get it to prompt again. Additionally, try to access from incognito mode or a private window.

3. SFX can fail to load due to certificate issues. Initially, to identify certificate-related issues at the first level, verify whether a pop-up comes up on screen to choose the certificate before accessing Service Fabric Explorer. Note: download the certificate onto the machine that is used to access Service Fabric Explorer so that the certificate appears in the pop-up while accessing SFX.

4. When loading the admin Service Fabric Explorer, use F12 (or any network traffic analyzer) to look at the call failures.
If there are call failures with 403 as shown below, it means the Fabric Upgrade Service is not able to talk to the gateway. This indicates an issue with the certificate or with the HTTP gateway. For certificate issues, check whether the certificate is ACL'd correctly to 'Network Service' and has full permissions.

5. If a screen similar to the one below is visible, verify whether inbound connectivity is blocked from the Azure portal. Check whether ports 19000 and 19080 are open and accessible in the Azure NSG, and whether the user's machine IP is whitelisted when accessing from a local machine. In case of any blockage of inbound connectivity to 19080 via a network/firewall/proxy issue at the client end, it must be unblocked by the client. To help identify network issues, the Network Monitor Tool can capture traces that can be analysed further. Furthermore, one can even use a ServiceTag to allow network traffic to/from the SFRP endpoint.

6. In case of AAD-based authentication to access Service Fabric Explorer, verify that the correct permissions to access and modify from Service Fabric Explorer are present, as per Set up Azure Active Directory for client authentication for Azure Service Fabric.

7. Further, to isolate the issue, RDP into any one of the VMs that is part of the Service Fabric cluster. Try to access localhost:19080 and see whether Service Fabric Explorer is visible. If yes, then check your Load Balancer's rules and allow HTTPS connections to 19080. Reference link: https://github.com/Azure/Service-Fabric-Troubleshooting-Guides/blob/master/Security/NSG%20configuration%20for%20Service%20Fabric%20clusters%20Applied%20at%20VNET%20level.md

8. Try connecting to the cluster over PowerShell using Connect-ServiceFabricCluster. If this succeeds, FabricGateway is up and the TCP management endpoint is fine. As a next step, please reach out to the Service Fabric support team to investigate traces for HttpGateway issues.
If the connection to the cluster fails, FabricGateway is having issues, and not just the HTTP endpoint. The next step is to share the traces located in D:\SvcFab\Log\Traces with the Service Fabric support team to investigate further for FabricGateway issues.

Service Fabric Explorer (SFX) web client CVE-2023-23383 spoofing vulnerability
Service Fabric Explorer (SFX) is the web client used when accessing a Service Fabric (SF) cluster from a web browser. The version of SFX used is determined by the version of your SF cluster. We are providing this blog to make customers aware that clusters running Service Fabric versions 9.1.1436.9590 and below are affected. These versions could potentially allow unwanted code execution in the cluster if an attacker successfully convinces a victim to click a malicious link and perform additional actions in the Service Fabric Explorer interface. This issue has been resolved in Service Fabric 9.1.1583.9589, released on March 14th, 2023, as CVE-2023-23383, which had a CVSS score of 8.2 / 7.1. See the Technical Details section for more information.

Installing AzureMonitoringAgent and linking it to your Log Analytics Workspace
Service Fabric clusters currently come with the MicrosoftMonitoringAgent (MMA) installed by default. However, it is essential to note that MMA will be deprecated in August 2024; for more details, refer to We're retiring the Log Analytics agent in Azure Monitor on 31 August 2024 | Azure updates | Microsoft Azure. Therefore, if you are currently utilizing MMA, it is imperative to initiate the migration process to AzureMonitoringAgent (AMA).

Installation and linking of AzureMonitoringAgent to a Log Analytics Workspace:

1. Create a Log Analytics Workspace (if not already established):
- Access the Azure portal and search for "Log Analytics Workspace".
- Proceed to create a new Log Analytics Workspace. Ensure that you select the identical resource group and geographical region where your cluster is located.
- Detailed explanation: Create Log Analytics workspaces - Azure Monitor | Microsoft Learn

2. Create Data Collection Rules:
- Access the Azure portal and search for "Data Collection Rules (DCR)".
- Select the same resource group and region as your cluster.
- In Platform type, select the type of instances you have: Windows, Linux or both. You can leave the data collection endpoint blank.
- In the resources section, add the Virtual Machine Scale Set (VMSS) resource which is attached to the Service Fabric cluster.
- In the "Collect and deliver" section, click on Add data source and add both Performance Counters and Windows Event Logs, one by one.
- Choose the destination for both data sources as Azure Monitor Logs, and in the Account or namespace dropdown, select the name of the Log Analytics workspace created in step 1, then click on Add data source.
- Next, click on review and create.
Note: for a more detailed explanation of how to create a DCR and the various ways of creating it, you can follow Collect events and performance counters from virtual machines with Azure Monitor Agent - Azure Monitor | Microsoft Learn.

3. Add the VMSS instances resource to the DCR:
- Once the DCR is created, in the left panel, click on Resources. Check whether you can see the VMSS resource that was added while creating the DCR.
- If not, click on "Add", navigate to the VMSS attached to the Service Fabric cluster and click on Apply.
- Refresh the resources tab to see whether the VMSS appears in the resources section. If not, try adding it a couple of times if needed.

4. Query logs and verify the AzureMonitoringAgent setup:
- Please allow a waiting period of 10-15 minutes before proceeding. After this time has elapsed, navigate to your Log Analytics workspace and access the 'Logs' section by scrolling through the left panel.
- Run your queries to see the logs. For example, a query to check the heartbeat of all instances:

Heartbeat
| where Category contains "Azure Monitor Agent"
| where OSType contains "Windows"

- The logs appear in the bottom panel as shown in the above screenshot. You can also modify the query as per your requirements. For more details on Log Analytics queries, you can refer to Log Analytics tutorial - Azure Monitor | Microsoft Learn.

5. Uninstall the MicrosoftMonitoringAgent (MMA):
- Once you have verified that the logs are being generated, go to the Virtual Machine Scale Set, then to the "Extensions + applications" section, and delete the old MMA extension from the VMSS.

Deploying an application with Azure CI/CD pipeline to a Service Fabric cluster
Prerequisites

Before you begin this tutorial:

- Install Visual Studio 2019 with the Azure development and ASP.NET and web development workloads.
- Install the Service Fabric SDK.
- Create a Windows Service Fabric cluster in Azure, for example by following this tutorial.
- Create an Azure DevOps organization. This allows you to create projects in Azure DevOps and use Azure Pipelines.

Configure the application in Visual Studio 2019

1. Clone the Voting application from https://github.com/Azure-Samples/service-fabric-dotnet-quickstart. Enter the link to clone the Voting application; once you click on the Clone button, the application is ready to open in Solution Explorer.
2. Build the solution so that all the dependency DLLs are downloaded to the packages folder from the NuGet store.
3. Cross-check the NuGet package solution to find whether any DLL is deprecated. If so, update all older versions of the DLLs.
4. After correcting the DLL versions, check the application file 'voting.sfproj'. Note: for Visual Studio 2022, ToolsVersion will be 16.0 and the MSBuild version has to be updated everywhere in the 'voting.sfproj' file. The MSBuild version can be found in the packages.config file.
5. Cross-check the .NET version in the 'packages.config' of the application and also at the service level. For example, 'packages.config' may have net40 while the service's .NET version is net472. In that case, manually add the reference to MSBuild in the service project file; otherwise an error can be expected from the above changes.
6. Push the changes to the repo. Prior to that, take care not to push the changes to the master branch: create a new branch and push the changes to that branch. In Visual Studio, this can be done from Team Explorer; after that, sync the local branch to the DevOps repo.
7. Now create a build pipeline: click on New Pipeline, then click on Use Classic Editor and select the repository.
Select the template: search for the Service Fabric template. After that, all the tasks will be generated.

- In Agent Specification, select the same version as your Visual Studio version; here 2019 is selected because the project was built with VS 2019.
- Use the latest stable NuGet version. At the time of writing this blog, the NuGet version is 5.5.1. Also uncheck the checkbox "Always download the latest matching version".
- In the Build solution task, select 2019, matching the Visual Studio version.
- In the "Update Service Fabric Manifest" task, the version can be changed directly in the manifest.
- In Copy files, the data can be gathered from the application manifest and application parameters file.
- Enable the continuous integration checkbox, so that whenever any commit is made to the repo, the build pipeline is triggered automatically.
- Static values can be passed while executing the pipeline by putting them in variables. Build success and build failure notification mails will then report the outcome of each run.

Release pipeline

1. The release pipeline is the final step, where the application is deployed to the cluster.
2. Click on "New Release Pipeline", then again select the template for Service Fabric.
3. Add the artifact by selecting the correct build pipeline.
4. Click on 1 job, 1 task.
5. Click on stages, then select the cluster connection. If no cluster connection has been created, click on "New".
6. Create a service connection as shown in the image below. Note: for Azure Active Directory credentials, add the server certificate thumbprint of the server certificate used to create the cluster, and the credentials you want to use to connect to the cluster in the Username and Password fields.
7. To generate the client certificate value: open PowerShell ISE with admin access and paste the command

[System.Convert]::ToBase64String([System.IO.File]::ReadAllBytes("C:\Users\pritamsinha\Downloads\certi\certestuskv.pfx"))
Paste the output into the same PowerShell workspace area and remove all the spaces from the beginning and end.

8. In case of an error with the base64 value, the deployment will fail. Enable "Grant access permission to all pipelines". Note: when the cluster certificate has expired and has been updated, the thumbprint and client certificate value need to be updated as well.

In the Deploy Service Fabric application section:

- In Application Parameter, select the target location of the application parameter file.
- Enable the compressed package so that the application package is converted to a zip file.
- CopyPackageTimeoutSec: timeout in seconds for copying the application package to the image store. If specified, this overrides the value in the published profile.
- RegisterPackageTimeoutSec: timeout in seconds for registering or un-registering the application package.
- Enable "Skip upgrade for same Type and Version" (indicates whether an upgrade will be skipped if the same application type and version already exists in the cluster; otherwise the upgrade fails during validation. If enabled, re-deployments are idempotent).
- Enable "Unregister Unused Versions" (indicates whether all unused versions of the application type will be removed after an upgrade).

Configure the "Continuous deployment trigger", then save the config and run the release pipeline and check the expected output.

References:

Azure Pipelines reference link: https://learn.microsoft.com/en-us/azure/devops/pipelines/get-started/what-is-azure-pipelines?view=az...
Service Fabric Azure CI/CD pipeline doc: https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-tutorial-deploy-app-with-cicd-...

Unable to load Service Fabric Explorer
Service Fabric Explorer (SFX) is an open-source tool for inspecting and managing Azure Service Fabric clusters. To launch SFX in a web browser, browse to the cluster's HTTP management endpoint from any browser – for example, https://<clusterfqdn>:19080.

Service Fabric Explorer may fail to load for numerous reasons. The most frequent are an access denied error when connecting, or being unable to choose the right certificate. The following steps provide useful insights and mitigations for such scenarios.

1. Check the status of the cluster and of the certificate being used to access it. If the cluster state is "UpgradeServiceUnreachable", the certificate has most likely expired. If the certificate shows a warning that it is not under a trusted root and is issued by a third-party certificate issuer, add it to the trusted root certificates and exclude the certificate issuer from any security rules that block access to the site. Furthermore, if the cluster is healthy and the certificate is not expired, verify whether the wrong certificate was provided to the cluster.

2. If the issue persists after verifying correct certificate usage and validity, clear the browser session and cache to make the browser prompt again. Additionally, try to access from incognito mode or a private window.

3. SFX can fail to load due to certificate issues. To identify certificate-related issues at the first level, verify whether a pop-up appears on screen to choose the certificate before accessing Service Fabric Explorer. Note: install the certificate on the machine used to access Service Fabric Explorer so that it appears in the pop-up when accessing SFX.

4. When loading Service Fabric Explorer, use F12 (or any network traffic analyzer) to look at the call failures. If there are call failures with 403 as shown below, the Fabric Upgrade Service is not able to talk to the Fabric Gateway.
This indicates an issue with the certificate or with the HTTP gateway. For certificate issues, check that the certificate is ACL'd correctly to 'Network Service' and has full permissions.

5. If a screen similar to the one below is visible, verify whether inbound connectivity is blocked from the Azure portal. Check that ports 19000 and 19080 are open and accessible in the Azure NSG, and that the user's machine IP is whitelisted when accessing from a local machine. If inbound connectivity to 19080 is blocked by a network/firewall/proxy issue at the client end, it must be unblocked by the client. To help identify network issues, the Network Monitor tool can capture traces for further analysis. Furthermore, a ServiceTag can be used to allow network traffic to/from the SFRP endpoint. Please refer to https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-best-practices-networking for more details.

6. In case of AAD-based authentication for Service Fabric Explorer, verify that the correct permissions to access and modify from Service Fabric Explorer are present, as per "Set up Azure Active Directory for client authentication for Azure Service Fabric".

7. To further isolate the issue, RDP into any one of the VMs that is part of the Service Fabric cluster. Try to access localhost:19080 and see if Service Fabric Explorer loads. If yes, check your load balancer's rules and allow HTTPS connections to 19080.

8. Try connecting to the cluster over PowerShell using Connect-ServiceFabricCluster. If this succeeds, FabricGateway is up and the TCP management endpoint is fine; as a next step, please reach out to the Service Fabric support team to investigate traces for HttpGateway issues. If the connection to the cluster fails, FabricGateway itself is having issues, not just the HTTP endpoint.
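The port checks in step 5 can be done without any special tooling. The sketch below is a simple TCP reachability probe in Python (not from the original article; the cluster FQDN in the comment is a placeholder) that reports whether a connection to the management ports succeeds:

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds
    within the timeout - a quick reachability check for the
    cluster management ports (19000, 19080)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage (placeholder cluster FQDN):
# port_open("mycluster.eastus.cloudapp.azure.com", 19080)
```

Note that this only proves TCP reachability; a successful connection does not rule out TLS or authentication problems further up the stack.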
The next step is to share the traces located in D:\SvcFab\Log\Traces with the Service Fabric support team to investigate FabricGateway issues further.

Manually roll over a Common Name based Service Fabric cluster certificate using resources.azure.com
Applies to: Azure Service Fabric clusters secured with a common name-based certificate. If you are trying to roll over a thumbprint-based certificate, please refer to this article.

A certificate is an instrument meant to bind information about an entity (the subject) to its possession of a pair of asymmetric cryptographic keys, and so constitutes a core construct of public key cryptography. The keys represented by a certificate can be used for protecting data; the client and server use certificates to ensure the privacy and integrity of their communication and to conduct mutual authentication. In Service Fabric, certificates are used to provide security and authentication.

When a Service Fabric cluster certificate is close to expiry, you need to update the certificate. Certificate rollover is simple if the cluster was set up to use certificates based on common name (instead of thumbprint): get a new certificate from a certificate authority with a new expiration date. Self-signed certificates are not supported for production Service Fabric clusters, including certificates generated during the Azure portal cluster creation workflow. The new certificate must have the same common name as the older certificate and be issued by the same certificate authority. When more than one valid certificate is installed on the virtual machine scale set, the Service Fabric cluster will automatically use the declared certificate with the later expiration date. You need to upload the new certificate to a Key Vault and then install the certificate on the virtual machine scale set.

Add the new certificate to Key Vault: get a new certificate from a certificate authority (e.g., DigiCert, GeoTrust, Comodo) with a later expiration date and upload the certificate to Azure Key Vault. *Note: we are not promoting any certificate authority; these names are just for reference.
Install the certificate on the virtual machine scale set: before starting the process of installing the certificate on the virtual machine scale set, check the certificate issuer thumbprints of the old and new certificates. *Note: the issuer thumbprint is the thumbprint of the intermediate certificate in the certification path, not of the leaf (the certificate itself). Please refer to the screenshot below for more clarity.

Finding the issuer thumbprint of the old certificate: you can check the old certificate's issuer thumbprint from Resource Explorer (azure.com). Follow these steps: in the Microsoft.ServiceFabric/clusters resource, navigate to the certificateCommonNames property. In the commonNames setting you will see certificateIssuerThumbprint. Below is a snippet from Resource Explorer:
"certificateCommonNames": {
  "commonNames": [
    {
      "certificateCommonName": "[parameters('certificateCommonName')]",
      "certificateIssuerThumbprint": "[parameters('certificateIssuerThumbprintList')]"
    }
  ],
  "x509StoreName": "[parameters('certificateStoreValue')]"
}

Finding the issuer thumbprint of the new certificate: install the new certificate on your machine for the current user, then check the issuer thumbprint from the certification path. Follow these steps to check the issuer thumbprint (intermediate thumbprint) of the new certificate: open "Manage user certificates" by searching in the Windows search bar. A window like the screenshot below will open. Expand "Personal" and click on "Certificates"; it will show all the certificates installed for the current user. Double-click the certificate you want to install in your cluster to open it, click on Certification Path, double-click the intermediate certificate (the middle one in the list), navigate to Details, scroll to the bottom, and check the Thumbprint property.
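The thumbprint that Windows shows in the certificate Details tab is simply the SHA-1 hash of the certificate's DER-encoded bytes. If you have a certificate exported as a .cer (DER) file, a small Python sketch (our helper, not part of the article) can compute the same value for comparison:

```python
import hashlib


def cert_thumbprint(der_bytes: bytes) -> str:
    """Compute a certificate thumbprint the way Windows displays it:
    the SHA-1 hash of the DER-encoded certificate, as uppercase hex."""
    return hashlib.sha1(der_bytes).hexdigest().upper()


# Example usage (path is a placeholder):
# with open("intermediate.cer", "rb") as f:
#     print(cert_thumbprint(f.read()))
```

Remember that for this rollover scenario you want the thumbprint of the intermediate (issuer) certificate in the chain, not of the leaf.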
You can refer to the snippet below as a reference:

Based on both issuer thumbprints:
If both thumbprints are the same: no cluster upgrade is needed; you can go directly to installing the certificate on the VMSS.
If the thumbprints are different: you need to add the new issuer thumbprint to the cluster resource. Follow these steps: go to Resource Explorer and navigate to the cluster (refer to the screenshot below for the complete path). In the Microsoft.ServiceFabric/clusters resource, navigate to the certificateCommonNames property; in the commonNames setting you will see certificateIssuerThumbprint. Choose Read/Write mode at the top and click "Edit" to add a new value.
"certificateCommonNames": {
  "commonNames": [
    {
      "certificateCommonName": "[parameters('certificateCommonName')]",
      "certificateIssuerThumbprint": "[parameters('certificateIssuerThumbprintList')]"
    }
  ],
  "x509StoreName": "[parameters('certificateStoreValue')]"
}
In the certificateIssuerThumbprintList, add the new issuer thumbprint, comma-separated. For example: "certificateIssuerThumbprint": "thumbprintOld,thumbprintNew"
After making the changes, click "PUT" at the top and wait for "provisioningState" to change from "Updating" to "Succeeded".

Installing the certificate on the VMSS: now you need to install the certificate on the virtual machine scale set. Follow these steps: go to Resource Explorer (azure.com) and navigate to the virtual machine scale set configured for the cluster.
subscriptions
└───%subscription name%
    └───resourceGroups
        └───%resource group name%
            └───providers
                └───Microsoft.Compute
                    └───virtualMachineScaleSets
                        └───%virtual machine scale set name%
Make changes to all the Microsoft.Compute/virtualMachineScaleSets resource definitions: locate the Microsoft.Compute/virtualMachineScaleSets resource definition, choose Read/Write mode at the top, and click "Edit" to add a new value.
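The comma-separated certificateIssuerThumbprint edit described above is easy to get wrong by hand (stray spaces, duplicates). A small Python sketch of the string manipulation (a hypothetical helper, not part of the article) shows the intended result:

```python
def add_issuer_thumbprint(existing: str, new: str) -> str:
    """Append a new issuer thumbprint to a comma-separated
    certificateIssuerThumbprint value, trimming whitespace and
    skipping duplicates."""
    thumbprints = [t.strip() for t in existing.split(",") if t.strip()]
    if new not in thumbprints:
        thumbprints.append(new)
    return ",".join(thumbprints)
```

For example, `add_issuer_thumbprint("thumbprintOld", "thumbprintNew")` yields `"thumbprintOld,thumbprintNew"`, matching the format shown in the article.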
Scroll to "vaultCertificates" under "osProfile". Add the certificateUrl and certificateStore of the new certificate:
"vaultCertificates": [
  {
    "certificateUrl": "[parameters('oldCertificateUrlValue')]",
    "certificateStore": "[parameters('oldCertificateValue')]"
  },
  {
    "certificateUrl": "[parameters('newCertificateUrlValue')]",
    "certificateStore": "[parameters('newCertificateValue')]"
  }
]
After the above changes, click the "PUT" button and wait for "provisioningState" to become "Succeeded".
VMSS provisioning state: Updating
VMSS provisioning state: Succeeded
NOTE: Make sure you repeat the above step for all node type (Microsoft.Compute/virtualMachineScaleSets) resource definitions in your template. If you miss one of them, the certificate will not get installed on that virtual machine scale set and you will have unpredictable results in your cluster, including the cluster going down. So double-check before proceeding further.

To check whether the new certificate was deployed successfully, navigate to Service Fabric Explorer (SFX), expand any node, and in the Essentials section expand Health Evaluations -> All and check the certificate expiry; it will show the later expiry of the new certificate. Refer to the screenshot below.
*Note: Don't be confused by the thumbprint shown in the screenshot or in Service Fabric Explorer. Even when you are using a common name-based certificate, that certificate still has a thumbprint, and Service Fabric Explorer shows that thumbprint in this section.
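The vaultCertificates edit above appends a second entry to the existing array inside osProfile.secrets. As a sanity check of the intended shape, here is a minimal Python sketch (our helper, assuming both certificates come from the same Key Vault, so they share one secrets entry) that performs the equivalent edit on the resource JSON loaded as a dict:

```python
def add_vault_certificate(os_profile: dict, cert_url: str, cert_store: str) -> dict:
    """Append a new Key Vault certificate entry to the first secrets
    entry of a VMSS osProfile, mirroring the vaultCertificates edit
    made in Resource Explorer. Assumes old and new certificates live
    in the same Key Vault (one secrets entry)."""
    vault_certs = os_profile["secrets"][0]["vaultCertificates"]
    entry = {"certificateUrl": cert_url, "certificateStore": cert_store}
    if entry not in vault_certs:  # keep the edit idempotent
        vault_certs.append(entry)
    return os_profile
```

If the new certificate is stored in a different Key Vault, it needs its own secrets entry with its own sourceVault id instead.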