Understanding Cloud Connector Edition Auto Update
Starting with Skype for Business Cloud Connector (Cloud Connector) 1.4.1, we introduced an automated update process: Cloud Connector updates automatically based on the update schedule that administrators have configured for their Cloud Connector Hybrid PSTN Sites. This article goes into the details of this automatic update.

Note: Cloud Connector should be viewed as a part of the Skype for Business service. We are constantly improving the service and making changes in both the service and Cloud Connector. If you do not update Cloud Connector to the latest release, it might stop working properly. For example, if Microsoft updates the service and changes the Cloud Connector code to work with the new version of the service, but you don’t update Cloud Connector, your telephony will not work. Because of this, Microsoft supports only the latest version of the Cloud Connector software. To accommodate the update window, we also support the N-1 version for 60 days after a new version is released.

Auto Update Requirements

- Outbound internet access to install, manage, and update Cloud Connector on the Host Appliance.
- Outbound internet access on all Cloud Connector VMs to download Windows updates, or access to a WSUS server as defined in the Cloud Connector configuration file.
- Skype for Business Online PowerShell Module installed on the Host Appliance.
- CCE Management Service running on the Host Appliance.
- Group Policy to prevent forcefully unloading the user registry at log off (required for 1.4.1).
- Skype for Business Tenant Admin account.
Initial Setup

When the first Cloud Connector appliance is registered, a PSTN site is configured in the Office 365 Skype for Business tenant, with the name of the site as defined in the Cloud Connector configuration file. Auto update is enabled, and default time windows are configured for both the software (bits) and the operating system (windows), with corresponding Tenant Update Time Windows that are configured in the tenant as part of this registration process. See the image below.

Confirm or Modify the Update Schedule for Hybrid PSTN Site(s)

To confirm the schedule for updating Cloud Connector software, administrators can view the Hybrid PSTN site update schedule in the Skype for Business Admin Center, in the on premises PSTN tab of the Voice section. Be sure to confirm that auto update is enabled and that a bits update time window is set to correspond to the maintenance window in which you want the updates to run.

Updates run based on the local time of the Cloud Connector host appliance. For example, if the update time window is set for 11PM and the Cloud Connector appliance is in Amsterdam, the updates will occur at 11PM CET (UTC+1).

For details on how to configure update time windows, please refer to Modify the configuration of an existing Cloud Connector deployment.

Cloud Connector Update Process

The CCE Management service uses cached update time window information, stored in the root of the CCE Site Directory\Tenant_<EdgeFQDN> file, when checking whether updates should run. This file is refreshed from the O365 tenant every 30 minutes. Therefore, if you modify the schedule online, it can take up to 30 minutes for the change to take effect on the host appliance.
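To make the "local host time" rule concrete, here is a minimal sketch of how a daily window check could work. Test-InUpdateWindow and its parameters are hypothetical names invented for illustration; this is not the logic the CCE Management service actually uses, and it assumes a window shorter than 24 hours.

```powershell
# Hypothetical helper (not part of Cloud Connector): returns $true when $Now,
# expressed in the appliance's local time, falls inside a daily window that
# opens at $Start (time of day) and stays open for $Duration.
function Test-InUpdateWindow {
    param(
        [datetime]$Now,
        [timespan]$Start,
        [timespan]$Duration
    )
    # Time elapsed since the most recent window opening, wrapping at midnight.
    $elapsed = $Now.TimeOfDay - $Start
    if ($elapsed -lt [timespan]::Zero) { $elapsed += [timespan]::FromDays(1) }
    return ($elapsed -lt $Duration)
}

# Example: a window that opens at 11PM local time and lasts two hours.
$window = @{ Start = [timespan]'23:00'; Duration = [timespan]'02:00' }
Test-InUpdateWindow -Now ([datetime]'2017-03-01T23:30:00') @window   # inside the window
Test-InUpdateWindow -Now ([datetime]'2017-03-02T01:30:00') @window   # after the window closed
```

Because the check only looks at the time of day on the host, the same 11PM setting starts at a different absolute time on appliances in different time zones, which is why the host clock and time zone are worth verifying before you rely on a maintenance window.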
Overall Auto Update Process

1. The auto update process runs on the schedule set by the tenant administrator, based on the local host appliance clock.
2. Update detection continues to run for the duration specified in the update time window.
3. If an update is detected, the update process is invoked.
4. The appliance is put in maintenance mode. Only one appliance per site can enter maintenance mode at a time. The maintenance mode lock is written to the root of the CCE Appliance Directory\CceSevicePersistent file:
   {"AutoMaintenanceStatus":#,"IsInManualMaintenance":false}
   0=None, 1=Bits Update, 2=OS Update, 3=RecoveryMode
5. Update tasks run.
6. Once updates are completed and all services are confirmed running, the appliance is taken out of maintenance mode.
7. The steps are repeated for the next appliance in the site.

Monitor Update Process

The Cloud Connector management service logs events to the Windows Application log with a source of CCEManagementService, and detailed information is written to "C:\Program Files\Skype for Business Cloud Connector Edition\ManagementService\CceManagementService.log".

Note: CceManagementService.log can grow quite large, so you might want to stop the CCE Management Service and rename this log periodically. There are plans to modify logging in the future to prevent log growth. If the log file becomes too large to open in a text editor, use a text file splitter to break it into smaller segments.

You can also see the status of the appliance by running Get-CsHybridPSTNAppliance in Remote PowerShell, or by viewing it in the on premises PSTN tab in the Voice section of the Skype for Business Admin Center.

Bits Update Process

During this process, the running version remains in service, and an interim switch is used to connect to the new VMs. Once the new version installation is complete and services are confirmed to be running, the old version is drain-stopped and the network connections are switched to the new version.
1. Bits update is detected based on the scheduled time window.
2. Bits update task is triggered. The Cloud Connector download site is queried, and if a new build is detected, the update proceeds.
3. The appliance is put in maintenance mode, and the appliance status is updated in the tenant, showing Status of Maintenance and DeploymentStatus of Upgrading, with the new version and the time that the update began.
4. Cloud Connector bits are downloaded.
5. The CCE management service is stopped.
6. The Skype for Business Online Cloud Connector Edition software is updated, which requires uninstalling the old version and installing the new version.
7. New virtual machines are built from the existing VHDX file. If the VHDX is detected to be older than 90 days, the Install Instance script logs the following warning:
   SFBServer.vhdx was generated more than 90 days before. Use Convert-CcIsoToVhdx to generate it again and apply windows updates.
   Note: It is recommended that a new VHDX be built periodically to reduce the amount of time needed to perform Windows updates for new and updated Cloud Connector machines. Updating the VHDX with Windows Update and re-running Sysprep is not supported, as there is a limited number of times that Sysprep can run on a computer.
8. Once the deployment of the new Cloud Connector is completed and services are confirmed running, the switch to the new version occurs as follows:
   - Change virtual network connections to the new Cloud Connector virtual machines.
   - Shut down the N+1 version.
   - Remove the N+2 version and delete the virtual disks.
9. The appliance is taken out of maintenance mode, and the appliance status is updated in the tenant to reflect Status of Running, Version number of the new build, and DeploymentStatus of Upgraded.

Detailed logs for the download, the upgrade of the Cloud Connector software, the new version installation, and the switch to the new build are written to the Logs folder located in the root of the Appliance directory.
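Both update processes set and clear the maintenance lock described under Overall Auto Update Process, so it can help to decode that file when you want to know what an appliance is doing. The following is a minimal sketch that assumes the JSON shape and status codes shown in this article; ConvertFrom-CceMaintenanceLock is a hypothetical helper name, and the sample string stands in for the real file content.

```powershell
# Names for the AutoMaintenanceStatus codes listed in this article.
$maintenanceStatusNames = @{
    0 = 'None'
    1 = 'Bits Update'
    2 = 'OS Update'
    3 = 'RecoveryMode'
}

# Hypothetical helper: turns the JSON text of the lock file into a readable summary.
function ConvertFrom-CceMaintenanceLock {
    param([string]$Json)
    $lock = $Json | ConvertFrom-Json
    $code = [int]$lock.AutoMaintenanceStatus
    $name = $maintenanceStatusNames[$code]
    if (-not $name) { $name = "Unknown ($code)" }
    [pscustomobject]@{
        AutoMaintenance   = $name
        ManualMaintenance = [bool]$lock.IsInManualMaintenance
    }
}

# Sample content in the documented format. On an appliance you would pass the
# text of the real CceSevicePersistent file instead (for example, read with
# Get-Content -Raw).
$sample = '{"AutoMaintenanceStatus":1,"IsInManualMaintenance":false}'
ConvertFrom-CceMaintenanceLock -Json $sample
```

Codes outside the documented range are reported as Unknown rather than guessed, since only the four values above are described here.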
Windows Update Process

The Windows update process is performed on the active running version. Therefore, when a Windows update is detected, the appliance is drain-stopped and put in maintenance mode.

1. OS update is detected during the scheduled time window.
2. OS update task is triggered.
3. The appliance is put in maintenance mode, and the appliance status is updated in the tenant, showing Status of Maintenance and OsUpdateStatus of Upgrading, with the start time.
4. The RTCSRV service on the Edge server and the RTCSRV and RTCMEDSRV services on the Mediation server are drain-stopped.
5. The OS update PowerShell script is copied to the root of the system drive on all CCE VMs.
6. The local Windows Update service is triggered to check for updates, either against the Windows Update internet service or against the local WSUS server defined in the Cloud Connector configuration file.
7. Updates are installed and a check for virtual machine restart is run. If a restart is required, all Cloud Connector virtual machines are restarted, and then a second check for restart is run.
8. Once updates are completed on all virtual machines, the updates are run on the host appliance and it is restarted.
9. Once the host has been restarted and it is confirmed that no additional restarts are needed, the appliance is taken out of maintenance mode, and the appliance status is updated in the tenant to reflect Status of Running and OSUpdateStatus of Updated.

Troubleshooting Auto Update

CCE Management Service Logging Level: If you need more diagnostic logging, you can change the logging level to Verbose for the following two settings in Microsoft.Rtc.CCE.ManagementService.exe.config, located in the "C:\Program Files\Skype for Business Cloud Connector Edition\ManagementService" folder. (This will cause rapid log growth.)

    <add name="serviceSwitch" value="Information"/>
    <add name="powershellSwitch" value="Warning"/>

If updates are not running because another maintenance task is detected, check the CCE Appliance Directory\CceSevicePersistent file to determine which task is running:

    {"AutoMaintenanceStatus":#,"IsInManualMaintenance":false}
    0=None, 1=Bits Update, 2=OS Update, 3=RecoveryMode

If the bits update fails to switch versions, the CCE Management Service logs the following error:

    CceService Error: 20003: Bits update failed to switch version. Appliance running status: Running, error detail: Failed to drain services with exception: [192.168.213.4] Connecting to remote server 192.168.213.4 failed with the following error message: Access is denied

Check the networking status on the virtual machines and be sure there are no duplicate IPs configured.

Troubleshoot individual calls in Skype for Business Online
I received an email recently:

‘Hey Martin, I wanted to call Bob on his mobile phone from Skype for Business this morning. The call didn’t complete, gave me an error. I was at my hotel at that point in time, can you tell me why this call didn’t go through? I’m travelling right now with no coverage, love to hear from you what was going on. Thanks, Joe.’

So far so good. Now I need to explain that Joe is the CIO of that company and wanted to call Bob, the CEO. He’s travelling, so I won’t be able to get any further details from him, nor will I have access to his machine to get any client-side logs for troubleshooting. It would be just too easy to go to the EventLog of that machine and find an event with Source=Lync, EventID=11 highlighting the exact reason. *(see end of blog post)

To start troubleshooting I need log files to understand what is going on. I have no chance of getting client-side logs as I cannot contact him, so what about server-side logs? In an on-premises environment we could leverage the Centralized Logging Service (CLS) to retrieve call details from the Always On scenario. Remember, this scenario is designed to run always, so that you have access to the log files after an issue has occurred without the need to reproduce it. See https://technet.microsoft.com/en-us/library/jj687958.aspx for more details on this.

Joe is hosted online, leveraging Cloud Connector Edition for PSTN connectivity, so the CLS logger won’t help here. But there’s something similar that I can do: I can retrieve logs from the service by leveraging the Get-CsUserSession cmdlet. Let’s see what I have. I know the user (Joe) and the time (this morning), and I am a Skype for Business admin in that tenant. That’s all I need.

Collect Sessions

To collect logs, I need to start PowerShell and connect to the tenant.
This article has all the details if you haven’t seen this already: https://technet.microsoft.com/library/dn362831.aspx

    $creds = Get-Credential
    $s = New-CsOnlineSession -Credential $creds
    Import-PSSession $s -AllowClobber

I can now run the Get-CsUserSession cmdlet to retrieve Joe’s logs from this morning:

    Get-CsUserSession -User joe@my-uclab.de -StartTime (Get-Date).AddHours(-4)

Instead of specifying a specific date, I used the Get-Date cmdlet and subtracted four hours, as Joe’s failed call was less than four hours ago. I did not specify the -EndTime parameter, which would limit the amount of data further. This cmdlet returns a lot of SIP signalling that Joe initiated.

Analyse Session

Now that is great, but it is a pretty long list. So let’s store everything in a variable for easier processing.

    $sessions = Get-CsUserSession -User joe@my-uclab.de -StartTime (Get-Date).AddHours(-4)

To find the event I am interested in, I need to find the call that Joe was trying to start. To do so, I use the MediaTypesDescription property. Let’s see what media types are available for my set of logs:

    $sessions.MediaTypesDescription

Joe reported that he wanted to place an audio call, so let’s get details for the audio call:

    $sessions | where {$_.MediaTypesDescription -eq "[Audio]"}

Okay, that’s better. I can see that Joe tried to place a call to +49 151 4406 3xxx. But I don’t know why this call failed, so let’s check the ErrorReports for details:

    $sessions | where {$_.MediaTypesDescription -eq "[Audio]"} | select -ExpandProperty ErrorReports

Okay, so the call was processed, but the gateway responded with a 403 Forbidden. After checking the gateway configuration associated with his Cloud Connector Edition, it turned out that outbound call routing wasn’t set up correctly to support numbers in Germany (+49). After fixing this configuration issue, Joe can place outbound calls to Germany.
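If you want to try the filtering steps above without a tenant connection, the pipeline can be rehearsed on stand-in objects. This is only a sketch: the sample objects below are invented for illustration and model just the two properties used in this post (MediaTypesDescription and ErrorReports); real Get-CsUserSession output carries many more properties, and the '[Other]' value is a placeholder rather than a documented media type.

```powershell
# Hypothetical stand-ins for objects returned by Get-CsUserSession.
# All values are invented sample data.
$sessions = @(
    [pscustomobject]@{ MediaTypesDescription = '[Other]'; ErrorReports = @() }
    [pscustomobject]@{ MediaTypesDescription = '[Audio]'; ErrorReports = @('sample error report') }
    [pscustomobject]@{ MediaTypesDescription = '[Audio]'; ErrorReports = @() }
)

# How many sessions exist per media type?
$sessions | Group-Object MediaTypesDescription | Select-Object Name, Count

# Audio sessions that carry at least one error report.
$failedAudio = $sessions | Where-Object {
    $_.MediaTypesDescription -eq '[Audio]' -and $_.ErrorReports.Count -gt 0
}

# Show the error reports of those sessions only.
$failedAudio | Select-Object -ExpandProperty ErrorReports
```

Grouping first gives a quick overview of what the user did in the time range, and filtering on a non-empty ErrorReports list narrows a long session list down to the calls worth reading in detail.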
Other things you can do with Get-CsUserSession

The other interesting thing is that this also allows you to access the QoEReport (aka VQReport) that is sent after every call: just look at the QoEReport property of a completed audio call. This is useful in a scenario where you need to troubleshoot quality or call reliability issues for a specific user.

Things to be aware of

The Get-CsUserSession cmdlet is a very powerful way to troubleshoot issues of an individual user. Some PII data (like the phone numbers in this example) is masked, and you can only query the data of a single user at a time. But this cmdlet does expose quite sensitive information, so only Skype for Business admins have access to it. This is very similar to access to CLS or to the CDR/QoE databases in an on-premises environment. There’s only a short lag, usually less than 5 minutes, before the data is available, and data is available for up to 365 days (starting from August 2016 onwards).

Where can I find more?

Please see the Skype Operations Framework website for this and more information on troubleshooting, along with practical guidance for successful deployments: https://www.skypeoperationsframework.com/Offers/?pageState=Supportthesolution

Get-CsUserSession on Technet: https://technet.microsoft.com/library/mt715516.aspx

*Shortcut

Here’s the shortcut if I had access to Joe’s machine with ‘Also collect troubleshooting info using Windows Event Logging’ turned on in the Skype for Business General settings. Please note that Joe has a PC with the German language installed, but as you can see, the error message itself is still in English.

This readiness content is presented by Skype Academy www.skypeoperatiosnframework.com/academy