Connect with experts and redefine what’s possible at work – join us at the Microsoft 365 Community Conference May 6-8. Learn more >

SQL 2017

18 Topics

Effectively troubleshoot latency in SQL Server Transactional replication: Part 1
Are you struggling with latency issues in SQL Server Transactional replication? This article provides clear, step-by-step instructions to effectively troubleshoot and resolve these challenges. Dive into proven techniques and best practices that will help you enhance your SQL Server's performance and ensure seamless data replication. Don't let latency slow you down—master the art of SQL Server troubleshooting today! I hope you find this teaser engaging! If you need any adjustments or additional content, feel free to let me know. Thanks to Collin Benkler, Senior Escalation Engineer in Microsoft for SQL Server for his valuable feedbacks.
Arslanov
Dec 11, 2024 Place SQL Server Support Blog
5.1KViews
4likes
4Comments
SQL Server R Services またはSQL Server Machine Learning Servicesをオフラインでインストールまたは更新プログラムを適用するための方法
こんにちは SQL Server サポートチームです。今回は、SQL Server R Services またはSQL Server Machine Learning Servicesをオフラインでインストールまたは更新プログラムを適用するための方法について紹介します。日本語版のSQL Server をご利用いただいているお客様より、オフライン環境でSQL Server R Services またはSQL Server Machine Learning Servicesをインストールまたは更新プログラムを適用した際に、失敗するというお問い合わせをいただくことがあります。これはSQL Server の不具合が原因となりますが、修正目途がたっていない状況となりますため、ブログで回避策を紹介させていただきます。作業の流れは下記の通りとなります。１．SQL Server R Services またはSQL Server Machine Learning Servicesをオフラインでインストールする場合　1) インターネットに接続できる環境で、リリースバージョンにあったcab ファイルをダウンロードします。　　各リリースバージョンのcabファイルのダウンロード URL は、次のページをご確認ください。　　SQL Server 2019 の場合：　　　https://learn.microsoft.com/ja-jp/sql/machine-learning/install/sql-ml-component-install-without-internet-access?view=sql-server-ver15#1---download-2019-cabs 　　SQL Server 2017 の場合：　　　https://learn.microsoft.com/ja-jp/sql/machine-learning/install/sql-ml-component-install-without-internet-access?view=sql-server-2017#1---download-2017-cabs 　　SQL Server 2016 の場合：　　　https://learn.microsoft.com/ja-jp/sql/machine-learning/install/sql-ml-component-install-without-internet-access?view=sql-server-2016#1---download-2016-cabs 　2) 適用対象のサーバーに、cabをコピーします。　　※ 任意のパスで結構です。ここでは例として、C:\Temp にコピーするとします。次からの手順は、C:\Temp 配下にコピーした場合の手順となります。　3) 各cabファイルの末尾が1033となっているため、1041にコピーします。　　例）SQL Server 2019 の場合　　　SPO_4.5.12.120_1033.cab -> SPO_4.5.12.120_1041.cab 　　　SPS_9.4.7.25_1033.cab -> SPS_9.4.7.25_1041.cab 　　　SRO_3.5.2.125_1033.cab -> SRO_3.5.2.125_1041.cab 　　　SRS_9.4.7.25_103　3.cab -> SRS_9.4.7.25_1041.cab 　　この結果C:\Temp 配下に、次のファイルが存在することになります。　　　SPO_4.5.12.120_1033.cab 　　　SPO_4.5.12.120_1041.cab 　　　SPS_9.4.7.25_1033.cab 　　　SPS_9.4.7.25_1041.cab 　　　SRO_3.5.2.125_1033.cab 　　　SRO_3.5.2.125_1041.cab 　　　SRS_9.4.7.25_1033.cab 　　　SRS_9.4.7.25_1041.cab 　4) インストールを行います。　　※　[Microsoft Machine Learning Server コンポーネントのオフラインインストール] では [インストールパス] に、「C:\Temp」を指定します。　　4-1) インストールメディアを開き、setup.exe を右クリックし、管理者として実行します。　　4-2) セットアップウィザードで、オープンソースの R または Python コンポーネントに対するライセンスのページが表示されたら、 [同意する] をクリックします。ライセンス条項に同意すると、次の手順に進むことができます。　　4-3) [オフラインインストール] ページが表示されたら、 [インストールパス] に、以前にコピーした CAB ファイルが格納されているフォルダーを指定します。　　　図1．Microsoft Machine Learning Server コンポーネントのオフラインインストール画面　　　　　　4-4) 画面の指示に従って続行し、インストールを完了します。　　　参考情報セットアップの実行　　　https://docs.microsoft.com/ja-jp/sql/machine-learning/install/sql-ml-component-install-without-internet-access?view=sql-server-ver15#run-setup 2．SQL Server R Services またはSQL Server Machine Learning Servicesの更新プログラムをオフラインで適用する場合　1) インターネットに接続できる環境で、更新プログラムにあったcab ファイルをダウンロードします。　　各更新プログラム用のcabファイルのダウンロード URL は、次のページをご確認ください。　　SQL Server 2019 の場合：　　　https://docs.microsoft.com/ja-jp/sql/machine-learning/install/sql-ml-cab-downloads?view=sql-server-ver15#sql-server-2019-cabs 　　SQL Server 2017 の場合：　　　https://docs.microsoft.com/ja-jp/sql/machine-learning/install/sql-ml-cab-downloads?view=sql-server-2017#sql-server-2017-cabs 　　SQL Server 2016 の場合：　　　https://docs.microsoft.com/ja-jp/sql/machine-learning/install/sql-ml-cab-downloads?view=sql-server-2016#sql-server-2016-cabs 　2) 適用対象のサーバーに、cabをコピーします。　　※ 任意のパスで結構です。ここでは例として、C:\Temp にコピーするとします。次からの手順は、C:\Temp 配下にコピーした場合の手順となります。　3) 各cabファイルの末尾が1033となっているため、1041にコピーします。　　例）SQL Server 2019 CU8 の場合　　　SPO_4.5.12.479_1033.cab -> SPO_4.5.12.479_1041.cab 　　　SPS_9.4.7.958_1033.cab -> SPS_9.4.7.958_1041.cab 　　　SRO_3.5.2.777_1033.cab -> SRO_3.5.2.777_1041.cab 　　　SRS_9.4.7.958_1033.cab -> SRS_9.4.7.958_1041.cab 　　この結果C:\Temp 配下に、次のファイルが存在することになります。　　　SPO_4.5.12.479_1033.cab 　　　SPO_4.5.12.479_1041.cab 　　　SPS_9.4.7.958_1033.cab 　　　SPS_9.4.7.958_1041.cab 　　　SRO_3.5.2.777_1033.cab 　　　SRO_3.5.2.777_1041.cab 　　　SRS_9.4.7.958_1033.cab 　　　SRS_9.4.7.958_1041.cab 　4) インストールを行います。　　※　[Microsoft Machine Learning Server コンポーネントのオフラインインストール] では [インストールパス] に、「C:\Temp」を指定します。　　4-1) 更新プログラムのインストーラーを実行します。ライセンス条項に同意し、[機能の選択] ページで、累積的な更新プログラムが適用される機能を確認します。　　　機械学習機能を含む、現在のインスタンスにインストールされているすべての機能が表示されます。　　　図2. 機能の選択画面　　　　　4-2) ウィザードを続行し、R および Python ディストリビューションのライセンス条項に同意します。　　[オフラインインストール] ページが表示されたら、 [インストールパス] に、以前にコピーした CAB ファイルが格納されているフォルダーを指定します。　　　図3. Microsoft Machine Learning Server コンポーネントのオフラインインストール画面　　　　　　参考情報累積的な更新プログラムの適用 https://docs.microsoft.com/ja-jp/sql/machine-learning/install/sql-ml-component-install-without-internet-access?view=sql-server-ver15#apply-cumulative-updates ※ 本Blogの内容は、2021年5月現在の内容となっております。
Masafumi_Hori
Sep 16, 2024 Place SQL Server Support Blog
13KViews
0likes
0Comments
Azure 仮想マシン上に作成した SQL Server の日本語化手順（Windows OS 版 SQL Server 2019 / 2022 対応手順）
Azure ギャラリーイメージより、SQL Server が含まれた仮想マシンをデプロイすることが可能ですが、現在は英語版のみの提供となっています。Azure 仮想マシン上に作成した SQL Server 2019 / 2022 (Windows OS 版) の日本語化手順について紹介します。
moematsuo
Sep 16, 2024 Place SQL Server Support Blog
28KViews
0likes
0Comments
イメージのランダム化を強制する(必須ASLR) に「既定でオンにする」を設定した場合、Express Edition の日本語版のインストーラの起動に失敗します
こんにちは。日本マイクロソフト SQL Server サポートチームです。事象：イメージのランダム化を強制する(必須ASLR) に「既定でオンにする」を設定した場合、SQL Server 2016 ならびにSQL Server 2019 のExpress Edition の日本語版のインストーラ(SQLEXPR_x64_JPN.exe) の起動に失敗します。なお、イメージのランダム化を強制する(必須ASLR) の規定値は「既定でオフにする」であり、既定値では発生しません。また、SQL Server 2014、SQL Server 2017、SQL Server 2022 のExpress Edition の日本語版のインストーラー(SQLEXPR_x64_JPN.exe) では発生しません。 ※ 公開時点イメージのランダム化を強制する(必須ASLR)の設定手順は以下となります。スタートメニューより[設定] -> [更新とセキュリティ] -> [Windows セキュリティ] -> [アプリとブラウザーの制御] -> [Exploit Protection の設定] -> [システム設定] -> [イメージのランダム化を強制する (必須ASLR)] 　図．Windows セキュリティ画面(ご参考) 　原因： SQL Server のセットアップは、イメージのランダム化を強制する(必須ASLR) を想定しておらず、SQL Server 2016 ならびにSQL Server 2019 のExpress Edition の日本語版のインストーラ(SQLEXPR_x64_JPN.exe) では、対応が行われていません。イメージのランダム化を強制する(必須ASLR) に「既定でオンにする」を設定した環境で、インストーラ(SQLEXPR_x64_JPN.exe) の起動に失敗する以外の問題はありません。回避策：恐れ入りますが、SQL Server 2016 ならびにSQL Server 2019 のExpress Edition の日本語版のインストーラ(SQLEXPR_x64_JPN.exe) を変更し、再リリースする予定はありません。そのため、イメージのランダム化を強制する(必須ASLR) に「既定でオンにする」を設定している環境では、次のように一時的に「既定でオフにする」に変更して展開し、SQL Server をインストールします。　1) イメージのランダム化を強制する(必須ASLR)を、「既定でオフにする」に変更します。　2) SQL Server 2016 またはSQL Server 2019 のExpress Edition の日本語版のインストーラー(SQLEXPR_x64_JPN.exe) を起動してファイル群を展開します。　3) イメージのランダム化を強制する(必須ASLR)を、「既定でオンにする」に変更します。　4) SQL Server のインストールを進めます。なお、SQL Server Express を再配布されている場合、現時点でパッケージを展開して再配布することは、モジュールの変更とはみなされないことを確認しています。
Masafumi_Hori
Sep 16, 2024 Place SQL Server Support Blog
9.3KViews
0likes
0Comments
Changes to hashing algorithm for self-signed certificate in SQL Server 2017
First published on MSDN on Nov 08, 2017 Starting with SQL Server 2005, a self-signed certificate is created automatically during the startup to be used for channel encryption.
Venu Cherukupalli
Feb 02, 2024 Place SQL Server Support Blog
19KViews
0likes
2Comments
SQL Backup using Azure File Share
SQL Server limits the maximum backup size supported using a page blob to 1 TB. The maximum backup size supported using block blobs is limited to approximately 200 GB (50,000 blocks * 4 MB MAXTRANSFERSIZE). In order to get over the above limitation, we can opt for Azure File Share. SQL Backup is more than 12TiB, you can choose to backup using Azure File Share Standard file shares can span up to 100 TiB, however this feature is not enabled by default. If you need a file share that is larger than 5 TiB, you will need to enable the large file share feature for your storage account. Premium file shares can span up to 100 TiB without any special setting, however premium file shares are provisioned, rather than pay as you go like standard file shares. This means that provisioning file share much larger than what you need will increase the total cost of storage. In local and zone redundant storage accounts, Azure file shares can span up to 100 TiB, however in geo- and geo-zone redundant storage accounts, Azure file shares can span only up to 5 TiB. Prerequisites Storage Account Azure PowerShell Storage Module https://www.powershellgallery.com/packages/Az.Storage/3.7.0 Steps 1. Storage Account with Premium FileShare Enable Large File Share 2. Register for the SMB Multichannel preview with the following commands. Connect-AzAccount # Setting your active subscription to the one you want to register for the preview. # Replace the <subscription-id> placeholder with your subscription id. $context = Get-AzSubscription -SubscriptionId <your-subscription-id> Set-AzContext $context Register-AzProviderFeature -FeatureName AllowSMBMultichannel -ProviderNamespace Microsoft.Storage Register-AzResourceProvider -ProviderNamespace Microsoft.Storage You can also verify if the feature registration is complete Get-AzProviderFeature -FeatureName AllowSMBMultichannel -ProviderNamespace Microsoft.Storage 3. Enable SMB Multichannel Once you have created a File Storage account, you can follow the instructions to update SMB Multichannel settings for your storage account Note: If the SMB Multichannel option is not visible under File share settings or you get a failed to update setting error while updating the configuration, please make sure that your subscription is registered, and your account is in one of the supported regions with supported account type and replication. 4. Create a file share You can set max capacity up to 100TB 5. Connect to FileShare from Window either using Active Directory or Storage Account Connecting to a share using the storage account key is only appropriate for admin access. But mounting the Azure file share with the Active Directory identity of the user is preferred. $connectTestResult = Test-NetConnection -ComputerName testsmbfileshare.file.core.windows.net -Port 445 if ($connectTestResult.TcpTestSucceeded) { # Save the password so the drive will persist on reboot cmd.exe /C "cmdkey /add:`"testsmbfileshare.file.core.windows.net`" /user:`"localhost\testsmbfileshare`" /pass:`"SxbRsNuwc1*******/8lk1TyUkqC+2+************==`"" # Mount the drive New-PSDrive -Name Z -PSProvider FileSystem -Root "\\testsmbfileshare.file.core.windows.net\sqlbackup" -Persist } else { Write-Error -Message "Unable to reach the Azure storage account via port 445. Check to make sure your organization or ISP is not blocking port 445, or use Azure P2S VPN, Azure S2S VPN, or Express Route to tunnel SMB traffic over a different port." } Copy this script and run this in PowerShell to map this as a network drive locally. This script will check to see if this storage account is accessible via TCP port 445, which is the port SMB uses. If port 445 is available, your Azure file share will be persistently mounted. Note: The script will only work on Windows Server 2012 and above Once we had the above script is executed, we could see the Z drive as network Drive in My computer / This PC 6. On the SQL Server you need to first enable XP_cmdshell so we can configure backups to this file share. Enable the Advance SQL Configuration EXECUTE sp_configure 'show advanced options', 1; GO -- To update the currently configured value for advanced options. RECONFIGURE; GO -- To enable the feature. EXECUTE sp_configure 'xp_cmdshell', 1; GO -- To update the currently configured value for this feature. RECONFIGURE; GO We must mount the Z Drive in SQL server and opt for it to be available for backups. Therefore, we map it using the script below xp_cmdshell 'net use Z: \\testsmbfileshare.file.core.windows.net\sqlbackup /u:localhost\testsmbfileshare SxbRsNuwc1*******/8lk1TyUkqC+2+************==`' Get the storage account, username and access key from Step 5 7. Backup the database now to the file share subdirectory using the below command BACKUP DATABASE [AdventureWorks2019] TO DISK = 'Z:\AdventureWorks2019.bak' with stats = 5 Reference (optional) SMB file shares in Azure Files | Microsoft Docs Create an Azure file share - Azure Files | Microsoft Docs https://technet.microsoft.com/en-us/library/dn435916(v=sql.120).aspx#limitations
SaniyaSamreen
Dec 05, 2023 Place SQL Server Support Blog
20KViews
6likes
4Comments
Configure multiple-subnet AlwaysOn Availability Group by modifying CIB
In the Windows world, a Windows Server Failover Cluster (WSFC) natively supports multiple subnets and handles multiple IP addresses via an OR dependency on the IP address. On Linux, there is no OR dependency, but there is a way to achieve a proper multi-subnet natively with Pacemaker, as shown by the following. You cannot do this by simply using the normal Pacemaker command line to modify a resource. You need to modify the cluster information base (CIB). The CIB is an XML file with the Pacemaker configuration. Here is an example to create a SQL Server Linux Availability group in 4 nodes in 3 subnets in RHEL 7.6 If you are already familiar with the AG Group setup process, please just jump to step 16. 1.Register your subscription on for all servers (red1,red2,red3 and red4 in this case) subscription-manager register 2.List all available subscription, pick the one with High Availabiilty , notedown the pool id subscription-manager list --available --all 3.Register the subscription for all nodes (red1,red2,red3 and red4 in this case) sudo subscription-manager attach --pool=xxxxx 4.Enable the repository(red1,red2,red3 and red4 in this case) sudo subscription-manager repos --enable=rhel-ha-for-rhel-7-server-rpms 5.Install Pacemaker packages on all nodes. (red1,red2,red3 and red4 in this case) sudo yum install pacemaker pcs fence-agents-all resource-agents 6.Install SQL Server resource agent (red1,red2,red3 and red4 in this case) sudo yum install mssql-server-ha 7.Set the password for the default user that is created when installing Pacemaker and Corosync packages. All the password should be exactly same (red1,red2,red3 and red4 in this case) sudo passwd hacluster 8.Update /etc/hosts file in all servers, add IP and node name. All the servers should have the same entries. 192.168.2.103 red1 192.168.2.104 red2 192.168.4.100 red3 192.168.5.101 red4 9.Run following commands to Enable and start pcsd service and Pacemaker in all nodes. (red1,red2 and red3 and red4 in this case) sudo systemctl enable pcsd sudo systemctl start pcsd sudo systemctl enable pacemaker 10.Run following commands to Create Cluster in primary replica node (red1 in this case) sudo pcs cluster auth red1 red2 red3 red4 -u hacluster -p YouPasswordUsedinStep7 sudo pcs cluster setup --name sqlcluster1 red1 red2 red3 red4 sudo pcs cluster start --all sudo pcs cluster enable --all 11.Run following command to Enable cluster feature in all nodes(red1,red2 , red3 and red4 in this case) sudo /opt/mssql/bin/mssql-conf set hadr.hadrenabled 1 sudo systemctl restart mssql-server Create AG and Listener 1.Run following queries in red1 to create certificate use master go CREATE MASTER KEY ENCRYPTION BY PASSWORD = '**<Master_Key_Password>**'; go CREATE CERTIFICATE dbm_certificate WITH SUBJECT = 'dbm'; go BACKUP CERTIFICATE dbm_certificate TO FILE = '/var/opt/mssql/data/dbm_certificate.cer' WITH PRIVATE KEY ( FILE = '/var/opt/mssql/data/dbm_certificate.pvk', ENCRYPTION BY PASSWORD = '**<Private_Key_Password>**' ); 2.Run following commands in red1 to copy the certificate to rest of the servers(red2,red3 and red4 in this case) cd /var/opt/mssql/data scp dbm_certificate.* root@red2:/var/opt/mssql/data/ scp dbm_certificate.* root@red3:/var/opt/mssql/data/ scp dbm_certificate.* root@red4:/var/opt/mssql/data/ 3.Give permission to the mssql user to access the certificate files in rest of the servers(red2,red3 and red4 in this case) cd /var/opt/mssql/data chown mssql:mssql dbm_certificate.* 4.Run following T-SQL queries to create the certificate in rest of the nodes by restoring the certificate backup file (red2,red3 and red4 in this case) use master go CREATE MASTER KEY ENCRYPTION BY PASSWORD = '**<Master_Key_Password>**' go CREATE CERTIFICATE dbm_certificate FROM FILE = '/var/opt/mssql/data/dbm_certificate.cer' WITH PRIVATE KEY ( FILE = '/var/opt/mssql/data/dbm_certificate.pvk', DECRYPTION BY PASSWORD = '**<Private_Key_Password>**' ) 5.Create endpoint in all servers (red1,red2,red3 and red4) CREATE ENDPOINT [Hadr_endpoint] AS TCP (LISTENER_PORT = 5022) FOR DATABASE_MIRRORING ( ROLE = ALL, AUTHENTICATION = CERTIFICATE dbm_certificate, ENCRYPTION = REQUIRED ALGORITHM AES ); ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED; 6.Run following query in primary replica (red1) to create Availability group(Please note, it works for SQL 2019. If you are using SQL 2017, you need to change AVAILABILITY_MODE of one the replica to ASYNCHRONOUS_COMMIT) CREATE AVAILABILITY GROUP [ag1] WITH (DB_FAILOVER = ON, CLUSTER_TYPE = EXTERNAL) FOR REPLICA ON N'red1' WITH ( ENDPOINT_URL = N'tcp://red1:5022', AVAILABILITY_MODE = SYNCHRONOUS_COMMIT, FAILOVER_MODE = EXTERNAL, SEEDING_MODE = AUTOMATIC) , N'red2' WITH ( ENDPOINT_URL = N'tcp://red2:5022', AVAILABILITY_MODE = SYNCHRONOUS_COMMIT, FAILOVER_MODE = EXTERNAL, SEEDING_MODE = AUTOMATIC), N'red3' WITH ( ENDPOINT_URL = N'tcp://red3:5022', AVAILABILITY_MODE = SYNCHRONOUS_COMMIT, FAILOVER_MODE = EXTERNAL, SEEDING_MODE = AUTOMATIC), N'red4' WITH ( ENDPOINT_URL = N'tcp://red4:5022', AVAILABILITY_MODE = SYNCHRONOUS_COMMIT, FAILOVER_MODE = EXTERNAL, SEEDING_MODE = AUTOMATIC) ALTER AVAILABILITY GROUP [ag1] GRANT CREATE ANY DATABASE;--grant create any database permission Join the AG group, run the following T-SQL queries in all the secondary servers (red2,red3 and red4 in this case) ALTER AVAILABILITY GROUP [ag1] JOIN WITH (CLUSTER_TYPE = EXTERNAL); ALTER AVAILABILITY GROUP [ag1] GRANT CREATE ANY DATABASE 8.Run following T-SQL Queries to create database and add it to AG group in primary replica (red1 in this case). CREATE DATABASE [db1]; ALTER DATABASE [db1] SET RECOVERY FULL; BACKUP DATABASE [db1] TO DISK = N'/var/opt/mssql/data/db1.bak'; BACKUP log [db1] TO DISK = N'/var/opt/mssql/data/db1.trn'; GO ALTER AVAILABILITY GROUP [ag1] ADD DATABASE [db1]; 9.Create SQL login pacemaker in all servers (red1,red2,red3 and red4 in this case). CREATE LOGIN [pacemakerLogin] with PASSWORD= N'ComplexP@$$w0rd!' GO ALTER SERVER ROLE [sysadmin] ADD MEMBER [pacemakerLogin] 10.Run following bash command in red1 sudo pcs property set stonith-enabled=false In all SQL Server Linux servers , run following bash commands to save the credentials for the SQL Server login.(red1,red2,red3 and red4) (The password is as same as the one used in step 9) echo 'pacemakerLogin' >> ~/pacemaker-passwd echo 'ComplexP@$$w0rd!' >> ~/pacemaker-passwd sudo mv ~/pacemaker-passwd /var/opt/mssql/secrets/passwd sudo chown root:root /var/opt/mssql/secrets/passwd sudo chmod 400 /var/opt/mssql/secrets/passwd # Only readable by root 12.Create availability group resource at cluster level, run following command on any one of the nodes (just in one server and run just one time). sudo pcs resource create ag_cluster1 ocf:mssql:ag ag_name=ag1 meta failure-timeout=60s master notify=true ##check the status 13.Run following bash command in primary replica red1 to create one virtual IP resources. The resource name is 'vip1', and IP address is 192.168.2.111 sudo pcs resource create vip1 ocf:heartbeat:IPaddr2 ip=192.168.2.111 ##check the status Create Availability group listener for Availability group ag1. Run following T-SQL query in primary replica (red1 in this case). ALTER AVAILABILITY GROUP [ag1] ADD LISTENER 'aglistener' (WITH IP ( ('192.168.2.111','255.255.255.0'), ('192.168.4.111','255.255.255.0'), ('192.168.5.111','255.255.255.0') ),PORT = 1433); Run following bash commands to create constraints: sudo pcs constraint colocation add vip1 ag_cluster1-master INFINITY with-rsc-role=Master sudo pcs constraint order promote ag_cluster1-master then start vip1 16.Run following bash command to export the CIB.(you can run the command in any node) sudo pcs cluster cib <filename> 17.You will find following similar entries <primitive class="ocf" id="vip1" provider="heartbeat" type="IPaddr2"> <instance_attributes id="vip1-instance_attributes"> <nvpair id="vip1-instance_attributes-ip" name="ip" value="192.168.2.111"/> </instance_attributes> <operations> <op id="vip1-monitor-interval-10s" interval="10s" name="monitor" timeout="20s"/> <op id="vip1-start-interval-0s" interval="0s" name="start" timeout="20s"/> <op id="vip1-stop-interval-0s" interval="0s" name="stop" timeout="20s"/> </operations> </primitive> 18.Here is the modified version <primitive class="ocf" id="vip1" provider="heartbeat" type="IPaddr2"> <instance_attributes id="vip1-instance_attributes"> <rule id="Subnet1-IP" score="INFINITY" boolean-op="or"> <expression id="Subnet1-Node1" attribute="#uname" operation="eq" value="red1"/> <expression id="Subnet1-Node2" attribute="#uname" operation="eq" value="red2"/> </rule> <nvpair id="vip1-instance_attributes-ip" name="ip" value="192.168.2.111"/> </instance_attributes> <instance_attributes id="vip1-instance_attributes2"> <rule id="Subnet2-IP" score="INFINITY"> <expression id="Subnet2-Node1" attribute="#uname" operation="eq" value="red3"/> </rule> <nvpair id="vip1-instance_attributes-ip2" name="ip" value="192.168.4.111"/> </instance_attributes> <instance_attributes id="vip1-instance_attributes3"> <rule id="Subnet3-IP" score="INFINITY"> <expression id="Subnet3-Node1" attribute="#uname" operation="eq" value="red4"/> </rule> <nvpair id="vip1-instance_attributes-ip3" name="ip" value="192.168.5.111"/> </instance_attributes> <operations> <op id="vip1-monitor-interval-10s" interval="10s" name="monitor" timeout="20s"/> <op id="vip1-start-interval-0s" interval="0s" name="start" timeout="20s"/> <op id="vip1-stop-interval-0s" interval="0s" name="stop" timeout="20s"/> </operations> </primitive> Run following command to import the modified CIB and reconfigure Pacemaker. sudo pcs cluster cib-push <filename> Here are the takeaway points: 1).All nodes in same subnet should be in the same <Instance_attributes> 2).If there are more than one servers in the subnet, the keyword ‘boolean-op="or"’ is a must 3).The IP address of Alwayson Listener is addressed in <nvpair> . 4).The value of id property does not matter, you can specify any value as long as the value is unique. Optional, you can create three entries for the three IP addresses in the DNS server. Here is an screenshot of using SQLCMD to connect the AGListener
Liwei
Sep 11, 2023 Place SQL Server Support Blog
10KViews
3likes
1Comment
SQL Server Database Mail Failure Troubleshooting & Two Common Issues
This blog discusses troubleshooting steps to investigate the failure of SQL Server Database Mail service on sending email. It also discusses 2 special scenarios with SQL Server Database Mail that commonly cause mail sending failure General Troubleshooting Steps to Narrow Down RCA Special Scenarios: Scenario 1: Implicit SSL/TLS mode is not supported for SQL Server DB Mail. If your SMTP server requires implicit TLS, then SQL Server DB Mail will not be able to send email using this SMTP server. Scenario 2: SQL DB Mail failure caused by SMTP mail server requesting TLS 1.2. In this case, TLS 1.2 needs to be enabled at DB Mail server. Commonly Used Protocols and Port ======================== protocol No encryption TLS/SSL TLS/SSL Plain port Explicit port Implicit port FTP 21 21 990 SMTP 25 or 587 25 or 587 465 IMAP 143 143 993 POP3 110 110 995 Telnet 23 23 992 HTTP 80 - 443 General Troubleshooting Steps to Narrow Down RCA ===================================== Check SQL DB Mail errors from DMV and logs /*List all DB Mail event log*/ SELECT * FROM msdb.dbo.sysmail_event_log ORDER BY log_date DESC /*List all failed items */ SELECT er.log_id AS [LogID], er.event_type AS [EventType], er.log_date AS [LogDate], er.description AS [Description], er.process_id AS [ProcessID], er.mailitem_id AS [MailItemID], er.account_id AS [AccountID], er.last_mod_date AS [LastModifiedDate], er.last_mod_user AS [LastModifiedUser], fi.send_request_user, fi.send_request_date, fi.recipients, fi.subject, fi.body FROM msdb.dbo.sysmail_event_log er LEFT JOIN msdb.dbo.sysmail_faileditems fi ON er.mailitem_id = fi.mailitem_id ORDER BY [LogDate] DESC Check and ensure DB Mail has been enabled sp_configure 'show advanced options', 1; GO RECONFIGURE; GO sp_configure 'Database Mail XPs', 1; GO RECONFIGURE GO Check and ensure the SQL Server service account has permission to call the Database mail’s executable file (DatabaseMail.exe) Example: Install Path:\Program Files\Microsoft SQL Server\MSSQL1.SQL2019\MSSQL\Binn\DatabaseMail.exe Check DB Mail account profile’s security configuration options and port number you have chosen for further troubleshooting purpose The Port Number you have chosen for this account Whether or not you have checked the box “this server requires a secure connection(SSL)” The type of SMTP authentication you have chosen Steps: Open SSMS, select Management, right-click Database Mail, and select Configure Database Mail -> Manage Database Mail accounts and profiles > Next. Telnet Port and IP of SMTP Server. In SQL DB Mail server, telnet the SMTP server to see if SQL machine can communicate with the IP and port of your SMTP (the port number configured in your profile). Most SMTP servers use port 25. If the telnet is not successful, it means the network communication from DB Mail server to SMTP server has caused the problem. If the DB Mail server can communicate with other server without issue, then the problem lies on the network traffic with SMTP server Open CMD as administrator and run below command to telnet your target server IP and port telnet [domain name or ip] [port] For example, to verify connection to 192.168.0.10 on port 25, issue the command: telnet 192.168.0.10 25 Configure TELNET in your server: Windows 7, 8. 10: Open Windows Start menu > Type "Control Panel" > Press Enter > “Programs” > "Programs and Features" > Turn Windows features on or off > Select "Telnet Client" > Press “OK" Windows Server 2012, 2016: Open “Server Manager” > “Add roles and features” > click “Next” until reaching the “Features” step > tick “Telnet Client” > click “Install” > when the feature installation finishes, click “Close”. Run Testing Script to Send email. If telnet to the IP and port from DB Mail server to SMTP server is successful, it means the network from DB server to SMTP server is good. We will need to test and see if we can send DB Mail via PowerShell or other scripts (rule out impact of DB Mail service and see if we could directly send email to your SMTP server to isolate the issue further). Also, we suggest to test and see if we can switch to a public SMTP server for the same DB Mail profile as well as switch to different DB Mail profile to narrow down whether it is profiler issue, SMTP server issue, or SQL Server DB Mail service issue. Below is a sample PowerShell Script to send DB mail $UserCredential = Get-Credential enter this: user:test@chimex.onmicrosoft.com password:yourpassword Send-MailMessage -to "youremail" -from "test@chimex.onmicrosoft.com" -subject "powershell" -body "powershell" -BodyAsHtml -smtpserver smtp.office365.com -usessl -Credential $UserCredential If step 6 is not able to be directly tested in your environment, kindly collect a network monitor trace when you reproduce the issue (failed to send DB Mail) If you do not see clear evidence to find RCA in the network trace for client or SMTP server reset, based on the symptom, you may need to collect a TTT debug trace for dbmail.exe and work with Microsoft support. Special Scenarios: =============== Scenario 1: Implicit TLS mode is not supported for SQL Server DB Mail. If your SMTP server requires implicit TLS, then SQL Server DB Mail will not be able to send email with this SMTP server. You can consider the options of switching to a different SMTP Server (such as a public one) that support STARTTLS or making modification in your SMTP server to support STARTTLS If your DB Mail profile has enabled SSL encryption, then STARTTLS would be required in the secured communication between your DB Mail server and SMTP Server. For SQL Server Database Mail architecture, as our service relies on .Net System.Net.Mail (SmtpClient class) and System.Net.Mail does not support “Implicit SSL/TLS“mode, SQL Server Database Mail does not support it as well. In the past, port 465 can be used to support “Implicit SSL/TLS“mode (SMTP over SSL). Now it is no longer supported. (more details , kindly refer SmtpClient.EnableSsl Property (System.Net.Mail) | Microsoft Docs ) More Details with STARTTLS and “Implicit SSL/TLS” --------------------------------------------------------------------- “STARTTLS” is an email protocol command that would turn an insecure network connection into a secured one. If email client server has enabled/request SSL or TLS encryption, then STARTTLS would be required for SQL Server DB Mail service to successfully send email. “Implicit SSL/TLS” is another mode for secured client to server communication. The major two differences between “STARTTLS” and “Implicit SSL/TLS” are summarized as below (while there are a couple of other differences) : With the “Implicit SSL/TLS” mode, email client server connects to the SMTP server and TLS/SSL encryption is switched on implicitly as soon as the connection is established while under “STARTTLS” mode, client explicitly requests TLS/SSL encryption to be switched on after initial TCP handshake. With “Implicit SSL/TLS” mode, if the connection is not able to be built with encrypted security mode, the email will be prevented from being sent. However, with “STARTTLS”, if the SMTP mail server not able to support TLS encryption as requested by email client server, the email client server will negotiate with the SMTP server and agree to downgrade to an unencrypted connection. Thus, with “STARTTLS”, you can use the same port for encrypted and plain text mail. Scenario 2: SQL DB Mail failure caused by SMTP mail server requesting TLS 1.2. In this case, TLS 1.2 needs to be enabled at DB Mail server. Multiple DB Mail cases have seen its failure being caused by client DB Mail server not supporting TLS 1.2 while the SMTP mail server is requesting TLS 1.2. Unfortunately, most often the error observed in Database Mail Log Viewer (sysmail_event_log) is very generic as below. From network monitor trace, you may not see any STARTTLS traffic as the connection can be reset by DB mail server after initial handshake. The mail could not be sent to the recipients because of the mail server failure. (Sending Mail using Account 2 (2022-03-14T16:58:54). Exception Message: Cannot send mails to mail server. (Failure sending mail.). ) If Schannel errors are detected in Windows system event logs around the same time when DB Mail failure happens, you are suggested to use below troubleshooting steps to see if the failure is related to TLS 1.2 requested by SMTP server. Use the first 7 steps mentioned earlier to narrow down other issues that can play a role in the scene. Check system event log for any Schannel errors (or any errors) from the same time as the database mail failure occurred Below are a few examples but there could be others. Any Schannel error occurring at the same time as the database mail failure should be cautiously checked. {timestamp},Error,{servername},36887,Schannel,A fatal alert was received from the remote endpoint. The TLS protocol defined fatal alert code is 70. {timestamp},Error,0,36871,Schannel,{servername},A fatal error occurred while creating a TLS client credential. The internal error state is 10013. If Schannel errors are spotted at the same time when DB Mail fails which can easy to be reproduced, check and ensure your SQL Server and Windows has the necessary hotfixes to support TLS 1.2 (refer link https://support.microsoft.com/en-us/topic/kb3135244-tls-1-2-support-for-microsoft-sql-server-e4472ef8-90a9-13c1-e4d8-44aad198cdbe ) Check registry settings in DB Mail server and make sure the needful ones are present. Reboot is needed for the change to take effects. (Please first take a backup of your registry key and store them in another machine before you make any change! ) [HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\.NETFramework\v2.0.50727] "SystemDefaultTlsVersions"=dword:00000001 "SchUseStrongCrypto"=dword:00000001 [HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\.NETFramework\v4.0.30319] "SystemDefaultTlsVersions"=dword:00000001 "SchUseStrongCrypto"=dword:00000001 [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\.NETFramework\v2.0.50727] "SystemDefaultTlsVersions"=dword:00000001 "SchUseStrongCrypto"=dword:00000001 [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\.NETFramework\v4.0.30319] "SystemDefaultTlsVersions"=dword:00000001 "SchUseStrongCrypto"=dword:00000001 [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.2] [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.2\Client] "DisabledByDefault"=dword:00000000 "Enabled"=dword:00000001 [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.2\Server] "DisabledByDefault"=dword:00000000 "Enabled"=dword:00000001 DISCLAIMER : THIS CODE AND INFORMATION IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
EstherXinMSFT
Mar 16, 2022 Place SQL Server Support Blog
43KViews
2likes
0Comments
TSQL Scripts for Troubleshooting Common Issues with SQL Server
This blog provides TSQL samples for troubleshooting common SQL Server issues as listed below. You can modify the parameters (i.e. database name, table name, keyword name, duration, etc.) based on customized environment settings and requirements. (Test these scripts before implementing in PROD environment. Please be aware of all potential risks for implementing the script in your PROD environment.) 1. Performance Troubleshooting(blocking, high CPU, memory, idle, query execution) 2. Deadlock and Database Object ID mapping 3. HA (Clustered SQL Server) 4. Backup & Transaction Log Related Issue 5. Query Store (QDS) 6. Database Encryption (TDE) 7. Tool (Profiler trace and X-event) Performance Troubleshooting ========================= (blocking, high CPU, memory, idle, query execution) 1.List all active sessions and its queries that contain your target table name or specific TSQL structure (input your target key word) SELECT sqltext.TEXT, req.session_id, req.status, req.command, req.cpu_time,req.database_id, req.total_elapsed_time FROM sys.dm_exec_requests req CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS sqltext where sqltext.text like '%Your Target Key Word%' 2.List all sleeping user sessions that have been idle for over 15 minutes with detailed queries (You can customize the session’s status and idle time) SELECT CURRENT_TIMESTAMP as currenttime, datediff(minute,last_batch,GETDATE()) as 'idletime_in_minute' ,sp.status,sp.spid,sp.login_time,sp.program_name,sp.hostprocess,sp.loginame,text FROM sys.sysprocesses sp CROSS APPLY sys.dm_exec_sql_text(sp.sql_handle) AS QT where sp.status = 'sleeping' and datediff(minute,last_batch,GETDATE()) >15 and spid>50 3.List top 10 high CPU queries that currently running in this SQL instance SELECT s.session_id,r.status,r.blocking_session_id 'Blk by',r.wait_type,wait_resource,r.wait_time / (1000 * 60) 'Wait M',r.cpu_time,r.logical_reads,r.reads,r.writes,r.total_elapsed_time / (1000 * 60) 'Elaps M',Substring(st.TEXT,(r.statement_start_offset / 2) + 1, ((CASE r.statement_end_offset WHEN -1 THEN Datalength(st.TEXT) ELSE r.statement_end_offset END - r.statement_start_offset) / 2) + 1) AS statement_text, Coalesce(Quotename(Db_name(st.dbid)) + N'.' + Quotename(Object_schema_name(st.objectid, st.dbid)) + N'.' + Quotename(Object_name(st.objectid, st.dbid)), '') AS command_text,r.command,s.login_name, s.host_name,s.program_name,s.last_request_end_time,s.login_time, r.open_transaction_count FROM sys.dm_exec_sessions AS s JOIN sys.dm_exec_requests AS r ON r.session_id = s.session_id CROSS APPLY sys.Dm_exec_sql_text(r.sql_handle) AS st WHERE r.session_id != @@SPID ORDER BY r.cpu_time desc 4. List top 10 high memory usage queries that currently running in this SQL instance SELECT mg.session_id,mg.granted_memory_kb,mg.requested_memory_kb,mg.ideal_memory_kb,mg.request_time,mg.grant_time,mg.query_cost,mg.dop,st.[TEXT],qp.query_plan FROM sys.dm_exec_query_memory_grants AS mg CROSS APPLY sys.dm_exec_sql_text(mg.plan_handle) AS st CROSS APPLY sys.dm_exec_query_plan(mg.plan_handle) AS qp ORDER BY mg.required_memory_kb DESC 5. List detailed memory usage for each memory clerk DBCC MEMORYSTATUS 6. List memory usage for plan cache and its maximum size based on current setting. By removing the WHERE condition, you will get a full list for all memory cache clerks’ information select name, type, buckets_count from sys.dm_os_memory_cache_hash_tables where name IN ( 'SQL Plans' , 'Object Plans' , 'Bound Trees' ,'Extended Stored Procedures') select name, type, pages_kb, entries_count from sys.dm_os_memory_cache_counters where name IN ( 'SQL Plans' , 'Object Plans' , 'Bound Trees' ,'Extended Stored Procedures') 7. List progressive (on-going) execution plan for a specific session (Starting with SQL 2016) For SQL 2016 and 2017, please first run below TSQL in the query session of which the execution plan you wish to extract later set statistics profile on Run below query to extract the on-going execution plan for your target session (input the SPID in the bracket) SELECT * FROM sys.dm_exec_query_statistics_xml(59); 8.List all block header queries that currently detected in this SQL instance declare @blocker varchar(100),@sql varchar(100) print convert(varchar(20), getdate(),120) select distinct blocked into #blocker from sysprocesses where blocked <> 0 DECLARE blocker CURSOR FOR select spid from sysprocesses where spid in (select * from #blocker) and blocked = 0 OPEN blocker FETCH NEXT FROM blocker INTO @blocker WHILE @@FETCH_STATUS = 0 BEGIN set @sql='DBCC inputbuffer(' + @blocker + ')' execute (@sql) set @sql= 'select * from sysprocesses where spid=' + @blocker execute (@sql) FETCH NEXT FROM blocker INTO @blocker END CLOSE blocker DEALLOCATE blocker drop table #blocker 9.Kill all sleeping sessions that has been idle over 1 hour (You can customize the idle duration) DECLARE user_spid INT DECLARE CurSPID CURSOR FAST_FORWARD FOR SELECT SPID FROM master.dbo.sysprocesses (NOLOCK) WHERE spid>50 AND status='sleeping' -- only sleeping threads AND DATEDIFF(HOUR,last_batch,GETDATE())>=1 -- thread sleeping for 1 hours AND spid<>@@spid -- ignore current spid OPEN CurSPID FETCH NEXT FROM CurSPID INTO user_spid WHILE (@@FETCH_STATUS=0) BEGIN PRINT 'Killing '+CONVERT(VARCHAR,@user_spid) EXEC('KILL '+@user_spid) FETCH NEXT FROM CurSPID INTO user_spid END CLOSE CurSPID DEALLOCATE CurSPID GO 10.Kill all block headers. This script will continuously scan every 5 seconds for all block header sessions and kill all block headers Use master go while 1=1 Begin declare @blocker varchar(100),@sql varchar(100) print convert(varchar(20), getdate(),120) select distinct blocked into #blocker from sysprocesses where blocked <> 0 DECLARE blocker CURSOR FOR select spid from sysprocesses where spid in (select * from #blocker) and status='sleeping' OPEN blocker FETCH NEXT FROM blocker INTO @blocker WHILE @@FETCH_STATUS = 0 BEGIN set @sql='DBCC inputbuffer(' + @blocker + ')' execute (@sql) set @sql= 'kill ' + @blocker execute (@sql) FETCH NEXT FROM blocker INTO @blocker END CLOSE blocker DEALLOCATE blocker drop table #blocker waitfor delay '0:0:05' End Deadlock and Database Object ID mapping =================================== Below are 3 examples to map the issued object based on the ID to its source database, index or schema. 1. Key type wait resource waitresource=KEY: 6:12345678990 (987654321a9b) database_id = 6 hobt_id = 12345678990 hash value = (987654321a9b) We can use below TSQL to check the database name based on DATABASE ID select db_name(6) Use below TSQL to check the schema, object, and index details related to this key USE DatabaseName GO SELECT sc.name as schema_name, so.name as object_name, si.name as index_name FROM sys.partitions AS p JOIN sys.objects as so on p.object_id=so.object_id JOIN sys.indexes as si on p.index_id=si.index_id and p.object_id=si.object_id JOIN sys.schemas AS sc on so.schema_id=sc.schema_id WHERE hobt_id = 12345678990 2. Object Wait resource type waitresource=OBJECT: 6:1234567890:4 database_id = 6 Object ID = 1234567890 We can use below TSQL to check the specific object name based on the waitresource details select OBJECT_NAME(1234567890,6) 3. Page Wait Resource Type waitresource=“PAGE: 6:3:70133 ” = Database_Id : FileId : PageNumber database_id=6 data_file_id = 3 page_number = 70133 HA (Clustered SQL Server) =================== List the latest error encountered by local AG replica for connection timeout select r.replica_server_name, r.endpoint_url, rs.connected_state_desc, rs.last_connect_error_description, rs.last_connect_error_number, rs.last_connect_error_timestamp from sys.dm_hadr_availability_replica_states rs join sys.availability_replicas r on rs.replica_id=r.replica_id where rs.is_local=1 List the current redo rate, redo queue size, log send rate, log send queue size for current AG replica SELECT CURRENT_TIMESTAMP as currenttime,drs.last_commit_time,ar.replica_server_name, adc.database_name, ag.name AS ag_name, drs.is_local, drs.is_primary_replica, drs.log_send_queue_size, drs.log_send_rate, drs.redo_queue_size, drs.redo_rate FROM sys.dm_hadr_database_replica_states AS drs INNER JOIN sys.availability_databases_cluster AS adc ON drs.group_id = adc.group_id AND drs.group_database_id = adc.group_database_id INNER JOIN sys.availability_groups AS ag ON ag.group_id = drs.group_id INNER JOIN sys.availability_replicas AS ar ON drs.group_id = ar.group_id AND drs.replica_id = ar.replica_id Restart endpoint Check SQL AG endpoint name SELECT * FROM sys.endpoints Restart AG endpoint on this local replica ALTER ENDPOINT <Your AG Endpoint Name> STATE=STOPPED ALTER ENDPOINT <Your AG Endpoint Name> STATE=STARTED Backup & Transaction Log Related Issue ================================ List all backup history and backup file details for the past 7 days (You can modify the date) SELECT CONVERT(CHAR(100), SERVERPROPERTY('Servername')) AS Server, msdb.dbo.backupset.database_name, msdb.dbo.backupset.backup_start_date, msdb.dbo.backupset.backup_finish_date, msdb.dbo.backupset.expiration_date, CASE msdb..backupset.type WHEN 'D' THEN 'Database' WHEN 'L' THEN 'Log' END AS backup_type, msdb.dbo.backupset.backup_size, msdb.dbo.backupmediafamily.logical_device_name, msdb.dbo.backupmediafamily.physical_device_name, msdb.dbo.backupset.name AS backupset_name, msdb.dbo.backupset.description FROM msdb.dbo.backupmediafamily INNER JOIN msdb.dbo.backupset ON msdb.dbo.backupmediafamily.media_set_id = msdb.dbo.backupset.media_set_id WHERE (CONVERT(datetime, msdb.dbo.backupset.backup_start_date, 102) >= GETDATE() - 7) ORDER BY msdb.dbo.backupset.database_name, msdb.dbo.backupset.backup_finish_date List all databases’ data and log file path and logical name USE master; SELECT name 'Logical Name', physical_name 'File Location' FROM sys.master_files; List each database’ transaction log size usage and space DBCC SQLPERF(LOGSPACE) List VLF size and counts for all databases as well as detailed size for your target database select * from sys.dm_db_log_info(5); /*input here your DB ID to get detailed size for existing VLFs on this database*/ SELECT [name], s.database_id, COUNT(l.database_id) AS 'VLF Count', SUM(vlf_size_mb) AS 'VLF Size (MB)', SUM(CAST(vlf_active AS INT)) AS 'Active VLF', SUM(vlf_active*vlf_size_mb) AS 'Active VLF Size (MB)', COUNT(l.database_id)-SUM(CAST(vlf_active AS INT)) AS 'In-active VLF', SUM(vlf_size_mb)-SUM(vlf_active*vlf_size_mb) AS 'In-active VLF Size (MB)' FROM sys.databases s CROSS APPLY sys.dm_db_log_info(s.database_id) l GROUP BY [name], s.database_id ORDER BY 'VLF Count' DESC GO For the purpose of shrinking transaction log file, check the log_reuse_wait_desc type for all database select name,database_id,log_reuse_wait, log_reuse_wait_desc from sys.databases Please refer this link for how to fix each of the log_reuse_wait_desc type when it is not “NOTHING” https://docs.microsoft.com/en-us/sql/relational-databases/system-catalog-views/sys-databases-transact-sql?view=sql-server-ver15 Database Corruption Issue Force repair of database with REPAIR_ALLOW_DATA_LOSS option (data loss expected and sometimes can cause more damage. Please refrain from using this method unless as for a last resort. More details for fixing database corruption, please refer this link https://docs.microsoft.com/en-us/sql/t-sql/database-console-commands/dbcc-checkdb-transact-sql?view=sql-server-ver15) Alter database YourDBName set single_user with rollback immediate; DBCC CHECKDB('YourDBName', REPAIR_ALLOW_DATA_LOSS); Alter database YourDBName set multi_user with rollback immediate; Query Store (QDS) ================= 1.List a database’s query store status select * from sys.database_query_store_options 2. List all databases that have query store configured select name as 'DATABASE NAME', CASE is_query_store_on when 1 then 'ENABLED' else 'OTHER' END AS 'QUERY STORE STATE' from sys.databases where is_query_store_on = 1 order by 1 ; Database Encryption (TDE) ====================== Check if the DMK exist in master database USE master GO SELECT * FROM sys.symmetric_keys WHERE name = '##MS_DatabaseMasterKey##' Check if the certificate is created in master database. A certificate is equivalent to an asymmetric key USE master GO select * from sys.certificates Check if the database is encrypted/progress (encryption_state = 3 encrypted; =2 in progress) USE master GO SELECT db_name(database_id) [TDE Encrypted DB Name], c.name as CertName, encryptor_thumbprint , dek.* FROM sys.dm_database_encryption_keys dek INNER JOIN sys.certificates c on dek.encryptor_thumbprint = c.thumbprint Check if the DMK in the master database is now encrypted by the SMK (is_master_key_encrypted_by_server = 1) select is_master_key_encrypted_by_server, * from sys.databases where database_id = db_id(N'master') Tool (Profiler trace and X-event) ======================= Import SQL Profiler trace into SQL Server database tables USE DatabaseName GO select * into Sample from fn_trace_gettable('c:\trace\YourFolderToStoreTheTrace\YourTraceFile.trc',default) where eventclass in (10, 12) For event class ID, check the list in this link https://docs.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-trace-setevent-transact-sql?view=sql-server-ver15 Use XELoader to process X-event and import the data into SQL database table for data analysis purpose. XELoader can be downloaded from this opensource link https://github.com/suresh-kandoth/XELoader Use CMD script to load the X-event files into SQL Server database Sample script: C:\XELoader\6.2>XELoader.exe -D"C:\Users\yixin\Desktop\xeloader\xevent" -SYourServerName\InstanceName -dYourDBName Use TSQL query to check aggregated data result Sample TSQL: SELECT sum(c_duration) as SUM_DURAION,sum(c_signal_duration) as SUM_SIGNAL_DURATION,c_wait_type from [xel].[wait_info] group by c_wait_type order by SUM_DURAION desc DISCLAIMER : THIS CODE AND INFORMATION IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
EstherXinMSFT
Mar 16, 2022 Place SQL Server Support Blog
13KViews
1like
0Comments
Common Causes and Troubleshooting Solutions for SQL AG Data Synchronization Latency
This article summarizes the common causes , solutions and troubleshooting mechanism for SQL Availability Group (AG) data synchronization latency between primary and secondary for both synchronous-commit and asynchronous-commit mode. The latency commonly happens at log harden phase or log redo phase. And sometimes, both. If it happens at log harden phase, you will likely to see HADR_SYNC_COMMIT wait type in your primary that waiting for your synchronous-commit secondary replica's acknowledgement. Also, your primary is likely to encounter latency on committing transactions sent from application because of this wait. If latency happens purely at log redo phase, you are likely to see a relatively low redo rate in your issued secondary replica with high value of redo queue. The key point is to first narrow down the actual phase of workflow (attached at the bottom of this article as also mentioned in a few other blogs thanks to great work of Simon Su )in data synchronization that induces the latency. For experienced DBA, if you have already narrowed down the latency to be caused by extensive HADR_SYNC_COMMIT wait type in your primary for your synchronous-commit secondary replica, you can refer to this article for resolving this bottleneck : https://techcommunity.microsoft.com/t5/sql-server/troubleshooting-high-hadr-sync-commit-wait-type-with-always-on/ba-p/385369 If you wish to capture detailed traces to narrow down the bottleneck, except for performance monitor counters where you can check log send and redo efficiency, blocking chain and its block header on primary replica, SQL AG dedicated X-events in both primary and issued secondary are also required to see the data synchronization workflow latency. We have an automatic tool "AGLatency Report Tool" to help you analyze the AG X-event data thanks to the great work of Simon Su . You can find this tool in this link with its user manual and the AG X-event script in this link Below are the common causes and its solution/troubleshooting mechanism for SQL AG Data Synchronization Latency: Long-running active transactions in read-only secondary replica ---- > When there is a long-running transaction in readable secondary replica, there can be blocking for the local redo thread in secondary replica when accessing the same data. High network latency / low network throughput ----> When the network transmission between primary and the issued secondary is slow or instable, the log send rate will be slow and impact the amount of data being transmitted to secondary(refer doc https://docs.microsoft.com/en-us/previous-versions/sql/sql-server-2012/dn135335(v=sql.110)?redirectedfrom=MSDN) Flow control induced high log send queue size and low log send rate ---- > When your SQL AG's performance triggered flow control gate for specific SQL AG databases or at SQL AG replica level, there would be temporary termination and limitation of sending log data from primary to secondary. Flow control can be commonly caused by server overloading or slow network. We need to resolve the conditions in your environment that cause the flow control if we have constantly observed flow control gate being entered by your AG. You can check below performance monitor counters to see if your SQL AG Database or SQL AG replica has ever triggered flow control gate: AG Replica level Availability Replica : Flow Control Time , Availability Replica : Flow Control/sec , AG Database Level Database Replica: Database Flow Control Delay Database Replica: Database Flow Controls/Sec (For Flow Control Gate details as discussed below, please refer this link Monitor Performance for AlwaysOn Availability Groups | Microsoft Docs) AlwaysOn Availability Groups is designed with flow control gates on the primary replica to avoid excessive resource consumption, such as network and memory resources, on all availability replicas. These flow control gates do not affect the synchronization health state of the availability replicas, but they can affect the overall performance of your availability databases, including RPO. After the logs have been captured on the primary replica, they are subject to two levels of flow controls, as shown in the table below. Level Number of Gates Number of messages Useful Metrics Transport 1 per availabiltiy replica 8192 Extended event database_transport_flow_control_action Database 1 per availability database 11200 (x64) 1600 (x86) DBMIRROR_SEND Extended event hadron_database_flow_control_action FLOW CONTROL GATES Once the message threshold of either gate is reached, log messages are no longer sent to a specific replica or for a specific database. These can be sent once acknowledgement messages are received for the sent messages to bring the number of sent messages below the threshold. In addition to the flow control gates, there is another factor that can prevent the log messages from being sent. The synchronization of replicas ensures that the messages are sent and applied in the order of the log sequence numbers (LSN). Before a log message is sent, its LSN also checked against the lowest acknowledged LSN number to make sure that it is less than one of thresholds (depending on the message type). If the gap between the two LSN numbers is larger than the threshold, the messages are not sent. Once the gap is below the threshold again, the messages are sent. Blocked REDO thread ---- When there are other issues or performance bottlenecks in secondary replica, the REDO thread can be blocked and cause latency of data synchronization. Refer document https://techcommunity.microsoft.com/t5/sql-server/alwayson-minimizing-blocking-of-redo-thread-when-running/ba-p/383963 Shared REDO Target ----- This issue only happens when you have multiple secondary replicas. When one secondary replica is very slow in finishing the redo process, it will limit the other secondary replicas in proceeding further on the maximum allowable LSN with the redo process. Trace flag 9559 can be used to mitigate the issue. (refer doc https://docs.microsoft.com/en-us/archive/blogs/alwaysonpro/recovery-on-secondary-lagging-shared-redo-target) Running out of parallel redo threads in secondary replica --- When secondary replica is running out of parallel redo threads, you can use TF3478 to allow maximal number of parallel redo thread to increase with total number of CPUs. By default, A SQL Server instance uses up to 100 threads for parallel redo for secondary replicas. Each database uses up to one-half of the total number of CPU cores, but not more than 16 threads per database. If the total number of required threads for a single instance exceeds 100, SQL Server uses a single redo thread for every remaining database. Disk or I/O subsystem latency -- > when the disk or I/O subsystem in the secondary replica has critical bottleneck, the redo rate will be impacted and quite small in secondary replica. We are less likely to see it in your secondary before it hit on primary if your secondary replica is not readable and has comparable hardware for SQL database files' storage as that in primary and dedicated for this SQL AG's usage. However, if your secondary replica has high volume of read-only workload, it can bring extra IO overhead to your IO subsystem. Frequent database backup and VM snapshot can be another potential cause of extra I/O overhead.
EstherXinMSFT
Nov 14, 2021 Place SQL Server Support Blog
23KViews
3likes
0Comments