First published on MSDN on Apr 26, 2018
Update on 6.2 status
Because of the bugs we have found in 6.2, we have taken the binaries down from our download center; the current 6.2 release does not meet our quality bar. Links to the 6.2 release now point to the latest 6.1 release, and the direct download links in the 6.2 release notes are broken as of this post. A new set of release notes and download links will follow. SDK download via WebPI has also been reverted to 6.1.
We advise against installing 6.2 in a production environment at this time.
Due to the process we have for upgrading clusters in Azure, we were able to stop the roll-out of 6.2 to clusters in Azure before any customers were impacted.
If you downloaded the 6.2 package and used it to upgrade your Windows standalone clusters, you may be affected. The same issues are present in the SDK release, so any developer machine whose SDK was upgraded to 6.2 also carries these bugs.
To check the version of your Service Fabric cluster, open Service Fabric Explorer and look at the version of one of the system services. That version corresponds to the version of the runtime.
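Once you have read the version string from Service Fabric Explorer, the check itself is a comparison of the first two components. The snippet below is a minimal sketch of that comparison, not an official tool: the function name `is_6_2_runtime` is hypothetical, and the version strings are placeholders rather than real build numbers.

```python
def is_6_2_runtime(version: str) -> bool:
    """Return True if a dotted runtime version string belongs to the 6.2 line."""
    parts = version.strip().split(".")
    if len(parts) < 2:
        raise ValueError(f"not a dotted version string: {version!r}")
    # Compare numerically so that, for example, "6.20" is not mistaken for "6.2".
    major, minor = int(parts[0]), int(parts[1])
    return (major, minor) == (6, 2)


# Placeholder version strings (assumed for illustration, not real builds).
print(is_6_2_runtime("6.2.0.0"))  # True
print(is_6_2_runtime("6.1.0.0"))  # False
```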
What's wrong with 6.2?
We have encountered three bugs in the 6.2 release that cause issues during and after upgrade. The first bug causes the Fault Analysis Service to crash on systems that do not use the US locale for date and time. The second bug is a race condition between backup and checkpoint of stateful services which can cause the services to fail (services that do not use backup are unaffected, as are those that happen to checkpoint after the upgrade and before the backup takes place). The third issue affects the upgrade of .NET Core services from 6.1 to 6.2: the services do not start after the upgrade because of an assembly load issue.
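To illustrate the general class of problem behind the date and time locale bug, here is a small sketch in Python. It is not the Fault Analysis Service code, and the actual root cause is not described in this post; the helper names `parse_us_date` and `parse_iso_date` are made up for the example. It only shows how a parser that assumes month-first dates fails on a day-first string, while a fixed, locale-independent format round-trips.

```python
from datetime import datetime

US_FORMAT = "%m/%d/%Y"   # month first, as in US-style dates
ISO_FORMAT = "%Y-%m-%d"  # fixed format that does not depend on regional settings


def parse_us_date(text: str) -> datetime:
    """Parse a date assuming US month/day/year ordering."""
    return datetime.strptime(text, US_FORMAT)


def parse_iso_date(text: str) -> datetime:
    """Parse a date written in the fixed year-month-day format."""
    return datetime.strptime(text, ISO_FORMAT)


april_26 = datetime(2018, 4, 26)

# A US-formatted string parses as expected.
print(parse_us_date(april_26.strftime("%m/%d/%Y")))

# The same date written day-first ("26/04/2018") is rejected by the US parser.
try:
    parse_us_date(april_26.strftime("%d/%m/%Y"))
except ValueError as exc:
    print("parse failed:", exc)

# Writing and reading with one fixed format avoids the mismatch.
print(parse_iso_date(april_26.strftime(ISO_FORMAT)) == april_26)
```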
Learning and improvement
We are taking measures to extend our automation testing to include these and other scenarios. One gap already identified is that localization runs only on a subset of tests and should be expanded. Another improvement is in our test infrastructure to allow more granular sequencing of events, rather than relying on random tests to catch certain combinations of events.
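As a rough illustration of what explicit sequencing means compared with random testing, the sketch below enumerates every ordering of three events instead of hoping a random run hits the bad one. It is a toy model, not our test infrastructure: the event names and the `is_risky` rule are assumptions derived only from the description of the backup and checkpoint race above (a backup that follows an upgrade with no checkpoint in between).

```python
from itertools import permutations

EVENTS = ("upgrade", "checkpoint", "backup")


def is_risky(order):
    """Toy rule: backup happens after upgrade with no checkpoint in between."""
    u, c, b = (order.index(event) for event in EVENTS)
    return u < b and not (u < c < b)


# Walk through every possible ordering deterministically.
for order in permutations(EVENTS):
    label = "RISKY" if is_risky(order) else "ok"
    print(" -> ".join(order), label)
```

With only three events there are six orderings, so exhaustive enumeration is cheap; for longer event lists the same idea would need pruning or targeted sequences.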
As you all know, publishing a release that contains bugs is always disappointing and causes frustration. There is no excuse for not doing the due diligence of validating the binaries we ship. We do, however, have processes in place that helped us find these issues quickly, before the majority of customer clusters received the bits.
We will be back soon with more news on getting a 6.2 release out.