high availability
What is causing the Always On AG issue? Is it cluster, AD, DNS, or SQL?
AG (Always On Availability Groups) is a logical container that holds one or more user databases that fail over together. So, if there is an issue with an AG, where do you look? This blog post is an index map of the resources that are currently available and the ones to come. It also provides pointers on which support topic to select if you choose to open a support case with CSS. That said, why open a CSS case at all if you can quickly resolve the issue yourself, with minimal business impact, using the helpful resources included here?
Troubleshooting data movement latency between synchronous-commit AlwaysOn Availability Groups
First published on MSDN on Apr 05, 2018. Writer: Simon Su. Technical reviewers: Pam Lahoud, Sourabh Agarwal, Tejas Shah. Applies to: SQL Server 2014 SP2, SQL Server 2016 SP1, SQL Server 2017 RTM.
On availability group replicas running in synchronous-commit mode, you may sometimes observe transactions pending on HADR_SYNC_COMMIT waits.
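As a quick first check on the primary, a query like the following (a minimal sketch using the standard sys.dm_os_wait_stats DMV, not taken from the original article) shows how much time the instance has accumulated on this wait since the last restart or wait-statistics clear:

-- Accumulated HADR_SYNC_COMMIT waits on the primary replica
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       wait_time_ms / NULLIF(waiting_tasks_count, 0) AS avg_wait_ms,
       max_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = N'HADR_SYNC_COMMIT';

A consistently high average wait per commit points at the log-block hardening path between the primary and its synchronous secondary, which is the workflow examined in the next article.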
Common Causes and Troubleshooting Solutions for SQL AG Data Synchronization Latency
This article summarizes the common causes, solutions, and troubleshooting mechanisms for SQL Availability Group (AG) data synchronization latency between the primary and a secondary, for both synchronous-commit and asynchronous-commit mode. The latency commonly happens at the log harden phase, the log redo phase, or sometimes both. If it happens at the log harden phase, you are likely to see the HADR_SYNC_COMMIT wait type on your primary while it waits for the synchronous-commit secondary replica's acknowledgement, and the primary is likely to see latency committing transactions sent by the application because of this wait. If the latency happens purely at the log redo phase, you are likely to see a relatively low redo rate on the affected secondary replica together with a high redo queue value.
The key point is to first narrow down the actual phase of the data synchronization workflow (attached at the bottom of this article, and also covered in a few other blogs thanks to the great work of Simon Su) that induces the latency. For experienced DBAs: if you have already narrowed the latency down to extensive HADR_SYNC_COMMIT waits on the primary for your synchronous-commit secondary replica, refer to this article for resolving that bottleneck: https://techcommunity.microsoft.com/t5/sql-server/troubleshooting-high-hadr-sync-commit-wait-type-with-always-on/ba-p/385369
If you wish to capture detailed traces to narrow down the bottleneck, then in addition to performance monitor counters (where you can check log send and redo efficiency) and the blocking chain with its head blocker on the primary replica, the AG-dedicated extended events on both the primary and the affected secondary are also required to see the latency in the data synchronization workflow. We have an automatic tool, the AGLatency Report Tool, to help you analyze the AG extended event data, thanks to the great work of Simon Su. You can find the tool and its user manual in this link, and the AG extended event collection script in this link.
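Before collecting traces, a quick look at the per-database replica state often tells you which phase is lagging. The following is a minimal sketch (not part of the original article) using the standard sys.dm_hadr_database_replica_states DMV; run it on the primary replica:

-- Log send and redo queue sizes/rates per database and replica (sizes are in KB)
SELECT ar.replica_server_name,
       DB_NAME(drs.database_id)        AS database_name,
       drs.synchronization_state_desc,
       drs.log_send_queue_size,        -- log waiting to be sent to the secondary
       drs.log_send_rate,
       drs.redo_queue_size,            -- log received but not yet redone on the secondary
       drs.redo_rate
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
  ON ar.replica_id = drs.replica_id
ORDER BY ar.replica_server_name, database_name;

A large, growing log send queue points at the send/harden side (network or flow control), while a large redo queue with a low redo rate points at the redo side on the secondary.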
Below are the common causes of SQL AG data synchronization latency, with their solutions and troubleshooting mechanisms:
Long-running active transactions in a read-only secondary replica ----> When there is a long-running transaction on a readable secondary replica, the local redo thread on that secondary can be blocked when it accesses the same data.
High network latency / low network throughput ----> When network transmission between the primary and the affected secondary is slow or unstable, the log send rate will be low, limiting the amount of data transmitted to the secondary (refer to https://docs.microsoft.com/en-us/previous-versions/sql/sql-server-2012/dn135335(v=sql.110)?redirectedfrom=MSDN).
Flow-control-induced high log send queue size and low log send rate ----> When your AG's activity triggers a flow control gate, either for specific AG databases or at the AG replica level, sending log data from the primary to the secondary is temporarily stopped or throttled. Flow control is commonly caused by server overload or a slow network. If you constantly observe the flow control gate being entered by your AG, you need to resolve the conditions in your environment that cause it.
You can check the following performance monitor counters to see whether your AG database or AG replica has ever triggered a flow control gate:
AG replica level: Availability Replica: Flow Control Time, Availability Replica: Flow Control/sec
AG database level: Database Replica: Database Flow Control Delay, Database Replica: Database Flow Controls/Sec
(For details about the flow control gates discussed below, refer to Monitor Performance for AlwaysOn Availability Groups | Microsoft Docs.)
AlwaysOn Availability Groups is designed with flow control gates on the primary replica to avoid excessive resource consumption, such as network and memory resources, on all availability replicas. These flow control gates do not affect the synchronization health state of the availability replicas, but they can affect the overall performance of your availability databases, including RPO. After the logs have been captured on the primary replica, they are subject to two levels of flow control:
FLOW CONTROL GATES
Transport level: 1 gate per availability replica; message threshold 8192; useful metric: extended event database_transport_flow_control_action.
Database level: 1 gate per availability database; message threshold 11200 (x64) or 1600 (x86); useful metrics: DBMIRROR_SEND wait type, extended event hadron_database_flow_control_action.
Once the message threshold of either gate is reached, log messages are no longer sent to that replica or for that database. They can be sent again once acknowledgement messages are received for the sent messages, bringing the number of outstanding messages below the threshold. In addition to the flow control gates, there is another factor that can prevent log messages from being sent: the synchronization of replicas ensures that messages are sent and applied in the order of the log sequence numbers (LSN). Before a log message is sent, its LSN is also checked against the lowest acknowledged LSN to make sure the gap is less than a threshold (which depends on the message type). If the gap between the two LSNs is larger than the threshold, the messages are not sent; once the gap falls below the threshold again, the messages are sent.
Blocked REDO thread ----> When there are other issues or performance bottlenecks on the secondary replica, the REDO thread can be blocked, causing data synchronization latency. Refer to https://techcommunity.microsoft.com/t5/sql-server/alwayson-minimizing-blocking-of-redo-thread-when-running/ba-p/383963
Shared REDO target ----> This issue only happens when you have multiple secondary replicas. When one secondary replica is very slow to finish the redo process, it limits how far the other secondary replicas can advance the maximum allowable LSN for their own redo. Trace flag 9559 can be used to mitigate the issue (refer to https://docs.microsoft.com/en-us/archive/blogs/alwaysonpro/recovery-on-secondary-lagging-shared-redo-target).
Running out of parallel redo threads on the secondary replica ----> When the secondary replica runs out of parallel redo threads, you can use trace flag 3478 to allow the maximum number of parallel redo threads to scale with the total number of CPUs. By default, a SQL Server instance uses up to 100 threads for parallel redo for secondary replicas. Each database uses up to one-half of the total number of CPU cores, but not more than 16 threads per database. If the total number of required threads for a single instance exceeds 100, SQL Server uses a single redo thread for every remaining database.
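For the redo-related causes above, you can check whether the redo workers exist on the secondary and whether anything is blocking them. This is a minimal sketch, not from the original article; the command names shown ('DB STARTUP' for the serial redo thread, 'PARALLEL REDO TASK' and 'PARALLEL REDO HELP TASK' for parallel redo workers) can vary between SQL Server versions:

-- Run on the secondary replica: redo worker sessions and any blockers
SELECT r.session_id,
       r.command,
       DB_NAME(r.database_id) AS database_name,
       r.wait_type,
       r.wait_time,
       r.blocking_session_id
FROM sys.dm_exec_requests AS r
WHERE r.command IN (N'DB STARTUP', N'PARALLEL REDO TASK', N'PARALLEL REDO HELP TASK');

A non-zero blocking_session_id on one of these requests is the pattern described under "Blocked REDO thread" above.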
Disk or I/O subsystem latency ----> When the disk or I/O subsystem on the secondary replica has a serious bottleneck, the redo rate on that secondary will be impacted and can drop to a very low value. If the secondary replica is not readable, is dedicated to this AG, and has storage for the SQL database files comparable to the primary's, an I/O bottleneck is unlikely to show up on the secondary before it shows up on the primary. However, if your secondary replica carries a high volume of read-only workload, that workload can add extra I/O overhead to the I/O subsystem. Frequent database backups and VM snapshots can be another source of extra I/O overhead.
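To put numbers on the I/O latency, per-file stall times are a useful first measure. The following is a minimal sketch (not from the original article) using the standard sys.dm_io_virtual_file_stats DMV; run it on the secondary replica:

-- Average read/write latency per database file since instance startup
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.num_of_reads,
       vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_latency_ms,
       vfs.num_of_writes,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id     = vfs.file_id
ORDER BY avg_write_latency_ms DESC;

Sustained latencies well above what your storage normally delivers on the log and data files of the availability databases point at this cause.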
Configure multiple-subnet AlwaysOn Availability Group by modifying CIB
In the Windows world, a Windows Server Failover Cluster (WSFC) natively supports multiple subnets and handles multiple IP addresses via an OR dependency on the IP address resource. On Linux, there is no OR dependency, but there is a way to achieve a proper multi-subnet configuration natively with Pacemaker, as shown below. You cannot do this simply by using the normal Pacemaker command line to modify a resource; you need to modify the cluster information base (CIB), which is an XML file holding the Pacemaker configuration. Here is an example that creates a SQL Server on Linux availability group across 4 nodes in 3 subnets on RHEL 7.6. If you are already familiar with the AG setup process, jump to step 16.
1. Register your subscription on all servers (red1, red2, red3 and red4 in this case)
subscription-manager register
2. List all available subscriptions, pick the one with High Availability, and note down the pool id
subscription-manager list --available --all
3. Attach the subscription on all nodes (red1, red2, red3 and red4 in this case)
sudo subscription-manager attach --pool=xxxxx
4. Enable the repository on all nodes (red1, red2, red3 and red4 in this case)
sudo subscription-manager repos --enable=rhel-ha-for-rhel-7-server-rpms
5. Install the Pacemaker packages on all nodes (red1, red2, red3 and red4 in this case)
sudo yum install pacemaker pcs fence-agents-all resource-agents
6. Install the SQL Server resource agent on all nodes (red1, red2, red3 and red4 in this case)
sudo yum install mssql-server-ha
7. Set the password for the default user that is created when installing the Pacemaker and Corosync packages. The password should be exactly the same on all nodes (red1, red2, red3 and red4 in this case)
sudo passwd hacluster
8. Update the /etc/hosts file on all servers, adding the IP address and name of every node. All the servers should have the same entries.
192.168.2.103 red1
192.168.2.104 red2
192.168.4.100 red3
192.168.5.101 red4
9. Run the following commands to enable and start the pcsd service and Pacemaker on all nodes
(red1, red2, red3 and red4 in this case)
sudo systemctl enable pcsd
sudo systemctl start pcsd
sudo systemctl enable pacemaker
10. Run the following commands to create the cluster on the primary replica node (red1 in this case)
sudo pcs cluster auth red1 red2 red3 red4 -u hacluster -p YouPasswordUsedinStep7
sudo pcs cluster setup --name sqlcluster1 red1 red2 red3 red4
sudo pcs cluster start --all
sudo pcs cluster enable --all
11. Run the following commands to enable the Always On availability groups feature on all nodes (red1, red2, red3 and red4 in this case)
sudo /opt/mssql/bin/mssql-conf set hadr.hadrenabled 1
sudo systemctl restart mssql-server
Create AG and Listener
1. Run the following queries on red1 to create a certificate
use master
go
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '**<Master_Key_Password>**';
go
CREATE CERTIFICATE dbm_certificate WITH SUBJECT = 'dbm';
go
BACKUP CERTIFICATE dbm_certificate TO FILE = '/var/opt/mssql/data/dbm_certificate.cer'
WITH PRIVATE KEY (
    FILE = '/var/opt/mssql/data/dbm_certificate.pvk',
    ENCRYPTION BY PASSWORD = '**<Private_Key_Password>**'
);
2. Run the following commands on red1 to copy the certificate to the rest of the servers (red2, red3 and red4 in this case)
cd /var/opt/mssql/data
scp dbm_certificate.* root@red2:/var/opt/mssql/data/
scp dbm_certificate.* root@red3:/var/opt/mssql/data/
scp dbm_certificate.* root@red4:/var/opt/mssql/data/
3. Give the mssql user permission to access the certificate files on the rest of the servers (red2, red3 and red4 in this case)
cd /var/opt/mssql/data
chown mssql:mssql dbm_certificate.*
4. Run the following T-SQL queries to create the certificate on the rest of the nodes by restoring the certificate backup file (red2, red3 and red4 in this case)
use master
go
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '**<Master_Key_Password>**'
go
CREATE CERTIFICATE dbm_certificate
FROM FILE = '/var/opt/mssql/data/dbm_certificate.cer'
WITH PRIVATE KEY (
    FILE = '/var/opt/mssql/data/dbm_certificate.pvk',
    DECRYPTION BY PASSWORD = '**<Private_Key_Password>**'
)
5. Create the endpoint on all servers (red1, red2, red3 and red4)
CREATE ENDPOINT [Hadr_endpoint]
    AS TCP (LISTENER_PORT = 5022)
    FOR DATABASE_MIRRORING (
        ROLE = ALL,
        AUTHENTICATION = CERTIFICATE dbm_certificate,
        ENCRYPTION = REQUIRED ALGORITHM AES
    );
ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED;
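(Optional) Before creating the availability group, you can confirm on each node that the endpoint exists and is started. This is a small verification sketch, not part of the original walkthrough, using standard catalog views:

-- Verify the database mirroring endpoint used by the availability group
SELECT e.name, e.state_desc, e.role_desc, t.port
FROM sys.database_mirroring_endpoints AS e
JOIN sys.tcp_endpoints AS t
  ON t.endpoint_id = e.endpoint_id;

Each node should report Hadr_endpoint as STARTED on port 5022.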
6. Run the following query on the primary replica (red1) to create the availability group. (Please note: this works for SQL Server 2019. If you are using SQL Server 2017, you need to change the AVAILABILITY_MODE of one of the replicas to ASYNCHRONOUS_COMMIT.)
CREATE AVAILABILITY GROUP [ag1]
WITH (DB_FAILOVER = ON, CLUSTER_TYPE = EXTERNAL)
FOR REPLICA ON
N'red1' WITH (
    ENDPOINT_URL = N'tcp://red1:5022',
    AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
    FAILOVER_MODE = EXTERNAL,
    SEEDING_MODE = AUTOMATIC),
N'red2' WITH (
    ENDPOINT_URL = N'tcp://red2:5022',
    AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
    FAILOVER_MODE = EXTERNAL,
    SEEDING_MODE = AUTOMATIC),
N'red3' WITH (
    ENDPOINT_URL = N'tcp://red3:5022',
    AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
    FAILOVER_MODE = EXTERNAL,
    SEEDING_MODE = AUTOMATIC),
N'red4' WITH (
    ENDPOINT_URL = N'tcp://red4:5022',
    AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
    FAILOVER_MODE = EXTERNAL,
    SEEDING_MODE = AUTOMATIC)
ALTER AVAILABILITY GROUP [ag1] GRANT CREATE ANY DATABASE; --grant create any database permission
7. Join the AG. Run the following T-SQL queries on all the secondary servers (red2, red3 and red4 in this case)
ALTER AVAILABILITY GROUP [ag1] JOIN WITH (CLUSTER_TYPE = EXTERNAL);
ALTER AVAILABILITY GROUP [ag1] GRANT CREATE ANY DATABASE
8. Run the following T-SQL queries on the primary replica (red1 in this case) to create a database and add it to the AG.
CREATE DATABASE [db1];
ALTER DATABASE [db1] SET RECOVERY FULL;
BACKUP DATABASE [db1] TO DISK = N'/var/opt/mssql/data/db1.bak';
BACKUP LOG [db1] TO DISK = N'/var/opt/mssql/data/db1.trn';
GO
ALTER AVAILABILITY GROUP [ag1] ADD DATABASE [db1];
9. Create the SQL login used by Pacemaker on all servers (red1, red2, red3 and red4 in this case).
CREATE LOGIN [pacemakerLogin] WITH PASSWORD = N'ComplexP@$$w0rd!'
GO
ALTER SERVER ROLE [sysadmin] ADD MEMBER [pacemakerLogin]
10. Run the following bash command on red1
sudo pcs property set stonith-enabled=false
11. On all SQL Server on Linux servers (red1, red2, red3 and red4), run the following bash commands to save the credentials for the SQL Server login. (The password is the same as the one used in step 9.)
echo 'pacemakerLogin' >> ~/pacemaker-passwd
echo 'ComplexP@$$w0rd!' >> ~/pacemaker-passwd
sudo mv ~/pacemaker-passwd /var/opt/mssql/secrets/passwd
sudo chown root:root /var/opt/mssql/secrets/passwd
sudo chmod 400 /var/opt/mssql/secrets/passwd # Only readable by root
12. Create the availability group resource at the cluster level. Run the following command on any one of the nodes (on just one server, and just one time).
sudo pcs resource create ag_cluster1 ocf:mssql:ag ag_name=ag1 meta failure-timeout=60s master notify=true
##check the status
13. Run the following bash command on the primary replica red1 to create the virtual IP resource. The resource name is 'vip1', and the IP address is 192.168.2.111
sudo pcs resource create vip1 ocf:heartbeat:IPaddr2 ip=192.168.2.111
##check the status
14. Create the availability group listener for availability group ag1. Run the following T-SQL query on the primary replica (red1 in this case).
ALTER AVAILABILITY GROUP [ag1] ADD LISTENER 'aglistener' (
    WITH IP (
        ('192.168.2.111','255.255.255.0'),
        ('192.168.4.111','255.255.255.0'),
        ('192.168.5.111','255.255.255.0')
    ),
    PORT = 1433);
15. Run the following bash commands to create the constraints:
sudo pcs constraint colocation add vip1 ag_cluster1-master INFINITY with-rsc-role=Master
sudo pcs constraint order promote ag_cluster1-master then start vip1
16. Run the following bash command to export the CIB (you can run the command on any node).
sudo pcs cluster cib <filename>
17. You will find entries similar to the following:
<primitive class="ocf" id="vip1" provider="heartbeat" type="IPaddr2">
  <instance_attributes id="vip1-instance_attributes">
    <nvpair id="vip1-instance_attributes-ip" name="ip" value="192.168.2.111"/>
  </instance_attributes>
  <operations>
    <op id="vip1-monitor-interval-10s" interval="10s" name="monitor" timeout="20s"/>
    <op id="vip1-start-interval-0s" interval="0s" name="start" timeout="20s"/>
    <op id="vip1-stop-interval-0s" interval="0s" name="stop" timeout="20s"/>
  </operations>
</primitive>
18. Here is the modified version:
<primitive class="ocf" id="vip1" provider="heartbeat" type="IPaddr2">
  <instance_attributes id="vip1-instance_attributes">
    <rule id="Subnet1-IP" score="INFINITY" boolean-op="or">
      <expression id="Subnet1-Node1" attribute="#uname" operation="eq" value="red1"/>
      <expression id="Subnet1-Node2" attribute="#uname" operation="eq" value="red2"/>
    </rule>
    <nvpair id="vip1-instance_attributes-ip" name="ip" value="192.168.2.111"/>
  </instance_attributes>
  <instance_attributes id="vip1-instance_attributes2">
    <rule id="Subnet2-IP" score="INFINITY">
      <expression id="Subnet2-Node1" attribute="#uname" operation="eq" value="red3"/>
    </rule>
    <nvpair id="vip1-instance_attributes-ip2" name="ip" value="192.168.4.111"/>
  </instance_attributes>
  <instance_attributes id="vip1-instance_attributes3">
    <rule id="Subnet3-IP" score="INFINITY">
      <expression id="Subnet3-Node1" attribute="#uname" operation="eq" value="red4"/>
    </rule>
    <nvpair id="vip1-instance_attributes-ip3" name="ip" value="192.168.5.111"/>
  </instance_attributes>
  <operations>
    <op id="vip1-monitor-interval-10s" interval="10s" name="monitor" timeout="20s"/>
    <op id="vip1-start-interval-0s" interval="0s" name="start" timeout="20s"/>
    <op id="vip1-stop-interval-0s" interval="0s" name="stop" timeout="20s"/>
  </operations>
</primitive>
19. Run the following command to import the modified CIB and reconfigure Pacemaker.
sudo pcs cluster cib-push <filename>
Here are the takeaway points:
1) All nodes in the same subnet should be in the same <instance_attributes> element.
2) If there is more than one server in a subnet, the keyword boolean-op="or" is a must.
3) The IP address of the Always On listener is specified in the <nvpair> element.
4) The value of the id property does not matter; you can specify any value as long as it is unique.
Optionally, you can create three entries for the three IP addresses on the DNS server, and then use sqlcmd to connect to the AG listener.
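Once the modified CIB has been pushed, you can check the listener definition that SQL Server has registered. This is a small verification sketch (not part of the original post) using standard catalog views; note that sys.availability_group_listener_ip_addresses may report an OFFLINE state on Linux even when the IP resource is running, a known issue mentioned in the next article.

-- Listener name, port, and the IP addresses registered for it
SELECT l.dns_name,
       l.port,
       ip.ip_address,
       ip.ip_subnet_mask,
       ip.state_desc
FROM sys.availability_group_listeners AS l
JOIN sys.availability_group_listener_ip_addresses AS ip
  ON ip.listener_id = l.listener_id;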
Create a hybrid Distributed Availability Group Between SQL Windows and SQL Linux
This article describes how to create a hybrid SQL Server distributed availability group (DAG) between SQL Server on Linux and SQL Server on Windows. Here is the topology of the DAG environment:
Data Center 1: Red1: 192.168.3.102 (primary), Red2: 192.168.3.103; availability group: AGLinux; listener: AGLinux-Lis, IP 192.168.3.111; SQL Server 2019 CU2 on RHEL 8.
Data Center 2: Node1: 192.168.2.101 (forwarder), Node2: 192.168.2.102; availability group: AGWindows; listener: AGWindows-Lis, IP 192.168.2.111; SQL Server 2019 CU2 on Windows Server 2016.
Distributed availability group: DAGHybrid.
Please make sure the following ports are open on all the servers in both data centers: TCP 5022, 2224, 3121, 21064; UDP 5405. For RHEL, you need to have the High Availability subscription enabled. Please refer to https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-availability-group-cluster-rhel?view=sql... for details.
Section 1: Create the availability group in datacenter1
1. Update /etc/hosts on all the nodes in datacenter1, adding the IPs, replica names, and AG listener names. All servers in datacenter1 should have the same entries. (Although the IP resources for the listeners are created in later steps, it is fine to add them at the very beginning.)
2. Run the following bash commands on all servers in datacenter1 to enable the Always On feature.
sudo /opt/mssql/bin/mssql-conf set hadr.hadrenabled 1
sudo systemctl restart mssql-server
3. Run the following T-SQL to enable the Always On health check event session on all the servers in datacenter1.
ALTER EVENT SESSION AlwaysOn_health ON SERVER WITH (STARTUP_STATE=ON);
4. Create a certificate. Run the query on the primary replica in datacenter1 (red1 in this case).
use master
go
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '**<Master_Key_Password>**';
go
CREATE CERTIFICATE dbm_certificate WITH SUBJECT = 'dbm';
go
BACKUP CERTIFICATE dbm_certificate TO FILE = '/var/opt/mssql/data/dbm_certificate.cer'
WITH PRIVATE KEY (
    FILE = '/var/opt/mssql/data/dbm_certificate.pvk',
    ENCRYPTION BY PASSWORD = '**<Private_Key_Password>**'
);
5. Copy the certificate to the rest of the servers in datacenter1 (red2 in this case).
cd /var/opt/mssql/data
scp dbm_certificate.* root@**<nodeName>**:/var/opt/mssql/data/
6. Give the mssql user permission to access the certificate files on the rest of the servers (red2 in this case). Run the following bash commands on red2.
cd /var/opt/mssql/data
chown mssql:mssql dbm_certificate.*
7. Create the certificate on the rest of the servers by restoring the certificate backup (red2 in this case).
use master
go
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '**<Master_Key_Password>**'
go
CREATE CERTIFICATE dbm_certificate
FROM FILE = '/var/opt/mssql/data/dbm_certificate.cer'
WITH PRIVATE KEY (
    FILE = '/var/opt/mssql/data/dbm_certificate.pvk',
    DECRYPTION BY PASSWORD = '**<Private_Key_Password>**'
)
8. Create the endpoint on all servers in datacenter1. Port 5022 is used.
CREATE ENDPOINT [Hadr_endpoint]
    AS TCP (LISTENER_PORT = 5022)
    FOR DATABASE_MIRRORING (
        ROLE = ALL,
        AUTHENTICATION = CERTIFICATE dbm_certificate,
        ENCRYPTION = REQUIRED ALGORITHM AES
    );
ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED;
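(Optional) Before creating the availability group in the next step, you can confirm that the certificate was restored correctly on each replica. This is a small verification sketch, not part of the original walkthrough:

-- Confirm the certificate used for endpoint authentication exists on this replica
SELECT name, subject, start_date, expiry_date
FROM master.sys.certificates
WHERE name = N'dbm_certificate';

The query should return one row with subject 'dbm' on every replica.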
9. Create an AG with two synchronous replicas in datacenter1. Run the following query on the primary replica (red1 in this case).
CREATE AVAILABILITY GROUP [AGLinux]
WITH (DB_FAILOVER = ON, CLUSTER_TYPE = EXTERNAL)
FOR REPLICA ON
N'red1' WITH (
    ENDPOINT_URL = N'tcp://red1:5022',
    AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
    FAILOVER_MODE = EXTERNAL,
    SEEDING_MODE = AUTOMATIC),
N'red2' WITH (
    ENDPOINT_URL = N'tcp://red2:5022',
    AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
    FAILOVER_MODE = EXTERNAL,
    SEEDING_MODE = AUTOMATIC)
go
ALTER AVAILABILITY GROUP [AGLinux] GRANT CREATE ANY DATABASE; --grant create any database permission
10. Join the AG. Run the following T-SQL queries on all the secondary servers in datacenter1 (red2 in this case).
ALTER AVAILABILITY GROUP [AGLinux] JOIN WITH (CLUSTER_TYPE = EXTERNAL);
ALTER AVAILABILITY GROUP [AGLinux] GRANT CREATE ANY DATABASE
11. Create a database and add it to the AG on the primary replica (red1 in this case).
CREATE DATABASE [db2];
ALTER DATABASE [db2] SET RECOVERY FULL;
BACKUP DATABASE [db2] TO DISK = N'db2.bak';
BACKUP LOG [db2] TO DISK = N'db2.trn';
GO
ALTER AVAILABILITY GROUP [AGLinux] ADD DATABASE [db2];
12. Create the SQL login used by Pacemaker on all servers in datacenter1.
CREATE LOGIN [pacemakerLogin] WITH PASSWORD = N'ComplexP@$$w0rd!'
GO
ALTER SERVER ROLE [sysadmin] ADD MEMBER [pacemakerLogin]
GO
13. Install Pacemaker. Run the following bash command on all servers in datacenter1.
sudo yum install pacemaker pcs fence-agents-all resource-agents
14. Install the SQL Server resource agent for SQL Server. Run the following bash command on all servers in datacenter1.
sudo yum install mssql-server-ha
15. Set the password for the default user that is created when installing the Pacemaker and Corosync packages. Run the following command on all servers in datacenter1. (This command asks for a password; the password should be the same on all servers in the same datacenter.)
sudo passwd hacluster
16. To allow nodes to rejoin the cluster after a reboot, enable and start the pcsd service and Pacemaker. Run the following commands on all servers in datacenter1.
sudo systemctl enable pcsd
sudo systemctl start pcsd
sudo systemctl enable pacemaker
17. Create a cluster in datacenter1. Run the following commands on any one of the nodes in datacenter1 (on just one server, and just one time). You will be asked to provide the username and password you specified in step 15.
sudo pcs host auth red1 red2
sudo pcs cluster setup cluster1 red1 red2
sudo pcs cluster start --all
sudo pcs cluster enable --all
18. Because the node-level fencing configuration depends heavily on your environment, disable it for this tutorial (it can be configured later; in my test, if this value is not set to false, you cannot fail over, and the IP resource may fail). The following script disables node-level fencing (run it on any node in datacenter1).
sudo pcs property set stonith-enabled=false
20. On all SQL Server on Linux servers in datacenter1, save the credentials for the SQL Server login. (Use the same user name and password created in step 12.)
echo 'pacemakerLogin' >> ~/pacemaker-passwd
echo 'ComplexP@$$w0rd!' >> ~/pacemaker-passwd
sudo mv ~/pacemaker-passwd /var/opt/mssql/secrets/passwd
sudo chown root:root /var/opt/mssql/secrets/passwd
sudo chmod 400 /var/opt/mssql/secrets/passwd # Only readable by root
21. Create the availability group resource at the cluster level in datacenter1. Run the following command on any one of the nodes in datacenter1 (on just one node, and just one time). Before creating the resource, there are no resources or constraints.
sudo pcs resource create ag_cluster ocf:mssql:ag ag_name=AGLinux meta failure-timeout=60s promotable notify=true
22. Create the virtual IP resource. The resource name is 'virtualip1', and the IP address is 192.168.3.111 in this demonstration. This is the IP address for the AG listener.
sudo pcs resource create virtualip1 ocf:heartbeat:IPaddr2 ip=192.168.3.111
23. Create the availability group listener for availability group AGLinux. Run the following T-SQL query on the primary replica (red1 in this case). The IP address should be the same as the IP in step 22.
ALTER AVAILABILITY GROUP [AGLinux] ADD LISTENER 'AGLinux-Lis' (WITH IP(('192.168.3.111','255.255.255.0')), PORT = 1433);
a) It is fine to run step 23 before step 22; however, you cannot use the IP address to connect to SQL Server until the resource in Pacemaker is created.
b) There is a bug in the DMV sys.availability_group_listener_ip_addresses that shows the state as offline.
24. Add a colocation constraint. Please note: without the constraint, the IP resource does not fail over when the AG resource fails over, so we need a constraint to guarantee that the IP fails over along with the cluster resource. Run the following command on any server in datacenter1 (on just one server, and just one time).
sudo pcs constraint colocation add virtualip1 with master ag_cluster-clone INFINITY with-rsc-role=Master
25. Add an ordering constraint.
sudo pcs constraint order promote ag_cluster-clone then start virtualip1
26. Update the cluster recheck interval property to 2 minutes. Run the following bash commands on any node (just one time).
sudo pcs property set cluster-recheck-interval=2min
sudo pcs property set start-failure-is-fatal=true
pcs resource update ag_cluster meta failure-timeout=60s
Section 2: Create the availability group in datacenter2 (you can actually run this section first, as long as the certificate is the same as the one used in Section 1). The steps in Section 2 are exactly the same as for a regular SQL Server AG on Windows; the only thing I would like to highlight is that you have to create the endpoint authenticated by certificate, using the same certificate you used on SQL Server on Linux.
1. Copy the certificate backup from SQL Server on Linux to the two Windows servers.
2. Run the following queries to create the master key, certificate, and endpoint.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '**<Master_Key_Password>**'
go
CREATE CERTIFICATE dbm_certificate
FROM FILE = 'C:\Backup\dbm_certificate.cer'
WITH PRIVATE KEY (
    FILE = 'C:\Backup\dbm_certificate.pvk',
    DECRYPTION BY PASSWORD = '**<Private_Key_Password>**'
)
go
CREATE ENDPOINT [Hadr_endpoint]
    AS TCP (LISTENER_PORT = 5022)
    FOR DATABASE_MIRRORING (
        ROLE = ALL,
        AUTHENTICATION = CERTIFICATE dbm_certificate,
        ENCRYPTION = REQUIRED ALGORITHM AES
    );
ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED;
3. Create a login and grant it permission to access the endpoint.
CREATE LOGIN InstanceA_Login WITH PASSWORD = 'Strong Passworsdfg1#d';
CREATE USER InstanceA_User FOR LOGIN InstanceA_Login;
go
ALTER AUTHORIZATION ON CERTIFICATE::dbm_certificate TO InstanceA_User
GO
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO InstanceA_Login;
4. Create the availability group.
CREATE AVAILABILITY GROUP [AGWindows]
FOR REPLICA ON
N'node1' WITH (
    ENDPOINT_URL = N'TCP://node1:5022',
    FAILOVER_MODE = AUTOMATIC,
    AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
    SEEDING_MODE = AUTOMATIC),
N'node2' WITH (
    ENDPOINT_URL = N'TCP://node2:5022',
    FAILOVER_MODE = AUTOMATIC,
    AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
    SEEDING_MODE = AUTOMATIC)
5. Create the listener.
ALTER AVAILABILITY GROUP [AGWindows] ADD LISTENER 'AGWindows-Lis' (WITH IP(('192.168.2.111','255.255.255.0')), PORT = 1433);
6. Run the following T-SQL to join the availability group on all secondary replicas (node2 in this case).
ALTER AVAILABILITY GROUP [AGWindows] JOIN
GO
ALTER AVAILABILITY GROUP [AGWindows] GRANT CREATE ANY DATABASE;
7. Edit the hosts file on the two Windows servers and make sure they are able to resolve the listener AGLinux-Lis.
Section 3: Create the hybrid distributed availability group
1. Create the distributed availability group in datacenter1. Run the following query on the primary replica in datacenter1 (red1 in this case).
CREATE AVAILABILITY GROUP [DAGHybrid]
WITH (DISTRIBUTED)
AVAILABILITY GROUP ON
'AGLinux' WITH (
    LISTENER_URL = 'tcp://AGLinux-lis:5022',
    AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
    FAILOVER_MODE = MANUAL,
    SEEDING_MODE = AUTOMATIC),
'AGWindows' WITH (
    LISTENER_URL = 'tcp://AGWindows-Lis:5022',
    AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
    FAILOVER_MODE = MANUAL,
    SEEDING_MODE = AUTOMATIC);
2. Run the following query on the forwarder to join the distributed availability group.
ALTER AVAILABILITY GROUP [DAGHybrid]
JOIN
AVAILABILITY GROUP ON
'AGLinux' WITH (
    LISTENER_URL = 'tcp://AGLinux-lis:5022',
    AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
    FAILOVER_MODE = MANUAL,
    SEEDING_MODE = AUTOMATIC),
'AGWindows' WITH (
    LISTENER_URL = 'tcp://AGWindows-Lis:5022',
    AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
    FAILOVER_MODE = MANUAL,
    SEEDING_MODE = AUTOMATIC);
The article for RHEL 7.6 is here: https://sqlserver.code.blog/2020/02/14/create-a-hybrid-distributed-availability-group-between-sql-windows-and-sql-linux/
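After both joins complete, you can sanity-check the regular and distributed availability groups from the global primary (red1 in this case). This is a minimal verification sketch (not part of the original article) using standard catalog views and DMVs:

-- List regular and distributed AGs with their primary and synchronization health
SELECT ag.name,
       ag.is_distributed,
       ags.primary_replica,
       ags.synchronization_health_desc
FROM sys.availability_groups AS ag
LEFT JOIN sys.dm_hadr_availability_group_states AS ags
  ON ags.group_id = ag.group_id;

DAGHybrid should appear with is_distributed = 1, and its synchronization health should become healthy once automatic seeding to AGWindows has finished.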
AlwaysOn Readable Secondaries Can Display Misleading Data & Log File Paths
First published on MSDN on Jun 29, 2016. Written by: Grant Carter, Senior Premier Field Engineer. Reviewed by: Mark Weber, Principal Premier Field Engineer; Norm Eberly, Senior Premier Field Engineer; Charles Allard, Senior Premier Field Engineer; Nick Schueler, Senior Premier Field Engineer; Curt Matthews, Senior Escalation Engineer.
Problem: You may discover on a readable secondary database that is part of an AlwaysOn Availability Group that the data and transaction log file locations reported in sp_helpdb and sys.
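As background on why the reported paths can differ (a hedged sketch, not taken from the article above): database-scoped metadata on a readable secondary is redone from the primary's transaction log, so views such as sys.database_files can show the primary replica's file paths, while the server-scoped sys.master_files shows the paths actually in use on the local instance. Comparing the two in the context of the availability database on the secondary makes any discrepancy visible:

-- Run in the context of the availability database on the readable secondary
SELECT N'sys.database_files' AS source, name, physical_name
FROM sys.database_files
UNION ALL
SELECT N'sys.master_files', name, physical_name
FROM sys.master_files
WHERE database_id = DB_ID();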
Tutorial: Create SQL Cluster (FCI) on RHEL
In the Windows world, SQL Server integrates natively with Windows Server Failover Cluster (WSFC), and there is a dedicated installation option for a SQL Server cluster. On Linux, however, you need to install a standalone SQL Server instance on each node first, and then configure the instance as a SQL Server failover cluster instance. I use SQL Server 2019 with RHEL 8.x in this tutorial, but it is possible to use SQL Server 2017 on RHEL 7.x or RHEL 8 to configure an FCI. Here is the step-by-step video.
Topology
1. iSCSI target server configuration
The two highlighted disks will be used as shared storage: sdc is for the database files in /var/opt/mssql/data, and sdd is for the user database files. If all your databases are stored in /var/opt/mssql/data, feel free to ignore all the steps that involve device sdd.
2. Run fdisk to create the partitions.
fdisk /dev/sdc
fdisk /dev/sdd
Run lsblk again.
3. Install the targetcli package.
yum -y install targetcli
4. Run targetcli to create the iSCSI target server. I created two ACL entries, for node1 and node2:
iqn.2020-08.com.contoso:node1
iqn.2020-08.com.contoso:node2
I am not going to dig into the targetcli command; please review this article for details. I ended up with the following iSCSI settings.
5. Run the following bash commands to expose TCP port 3260.
firewall-cmd --add-port=3260/tcp --permanent
firewall-cmd --reload
Enable and restart the target service.
systemctl enable target.service
systemctl restart target.service
2. iSCSI initiator configuration
Choose one of the servers that will participate in the FCI configuration; it does not matter which one. I use node1 in this tutorial. Please note: all the steps in this section are performed on node1, unless stated otherwise.
1. Install iscsi-initiator-utils on all nodes.
sudo yum install iscsi-initiator-utils -y
2. Edit /etc/iscsi/initiatorname.iscsi and replace the existing value with the following, the name used in step 4 of the [iSCSI target server configuration] section.
InitiatorName=iqn.2020-08.com.contoso:node1
3. Discover the iSCSI disks of the target.
iscsiadm -m discovery -t st -p <ip of iscsi target server>
Here is the command in this tutorial:
iscsiadm -m discovery -t st -p 10.1.0.8
(Optional step) After the discovery, the database below is updated.
ls -l /var/lib/iscsi/nodes
6. Make the connection to the iSCSI target.
iscsiadm --mode node --targetname iqn.2020-08.com.contoso:servers --login
(Optional step) After logging in, a session with the iSCSI target is established.
iscsiadm --mode node -P 1
(Optional step) If you review the messages file, you will see the following keywords.
sudo grep "Attached SCSI" /var/log/messages
Aug 6 01:38:21 localhost kernel: sd 3:0:1:0: [sdb] Attached SCSI disk
Aug 6 01:38:21 localhost kernel: sd 2:0:0:0: [sda] Attached SCSI disk
Aug 6 04:26:01 localhost kernel: sd 6:0:0:0: [sdc] Attached SCSI disk
Aug 6 04:26:01 localhost kernel: sd 6:0:0:1: [sdd] Attached SCSI disk
9. Create physical volumes on the iSCSI disks.
sudo pvcreate /dev/sdc
sudo pvcreate /dev/sdd
10. Create the volume groups 'FCIDataVG1' and 'FCIDataVG2' on the iSCSI disks.
sudo vgcreate FCIDataVG1 /dev/sdc
sudo vgcreate FCIDataVG2 /dev/sdd
11. Create logical volumes for the two groups. Run the following commands on node1.
sudo lvcreate -L599G -n FCIDataLV1 FCIDataVG1
sudo lvcreate -L499G -n FCIDataLV2 FCIDataVG2
Check the devices.
12. Format the logical volumes with a supported filesystem.
sudo mkfs.xfs /dev/FCIDataVG1/FCIDataLV1
sudo mkfs.xfs /dev/FCIDataVG2/FCIDataLV2
13. Repeat steps 1~6 on the rest of the nodes.
Please note:
1) Do not perform step 6 on the rest of the nodes before step 12 is completed on node1; otherwise you may not be able to fail over.
2) All the steps except step 2 are exactly the same. Here is the value for node2.
3) After step 6 is executed on the rest of the nodes, you will see the same devices as on node1. Here is a screenshot of node2 after step 6 is executed.
3. SQL Server configuration
Please note: all the steps in this section are performed on node1, unless stated otherwise.
1. Run the following queries to create the login used by Pacemaker.
CREATE LOGIN [sqlpackmaker] WITH PASSWORD = N'YourStrongP@ssword1'
ALTER SERVER ROLE [sysadmin] ADD MEMBER [sqlpackmaker]
2. Drop the default server name and create a new server name. The new server name is the SQL virtual name.
exec sp_dropserver node1
go
exec sp_addserver 'sqlvirtualname1','local'
3. Restart SQL Server for the change to take effect.
sudo systemctl stop mssql-server
sudo systemctl restart mssql-server
4. Run the following query to check the change.
select @@servername, SERVERPROPERTY('ComputernamephysicalNetBIOS')
5. Stop SQL Server on all nodes (node1, node2).
sudo systemctl stop mssql-server
6. Copy /var/opt/mssql/secrets/machine-key from node1 to node2.
7. Create temporary directories to store the SQL Server data and log files.
mkdir /var/opt/mssql/tempdir
mkdir /var/opt/mssql/tempuserdata
8. Copy the SQL Server data and log files to the temporary directories.
cp /var/opt/mssql/data/* /var/opt/mssql/tempdir/
cp /var/opt/mssql/userdata/* /var/opt/mssql/tempuserdata/
9. Delete the files from the existing SQL Server data directories on node1.
rm -f /var/opt/mssql/data/*
rm -f /var/opt/mssql/userdata/*
10. Mount the iSCSI logical volumes in the SQL Server data folders.
mount /dev/<VolumeGroupName>/<LogicalVolumeName> <FolderName>
Here are the commands in this tutorial:
mount /dev/FCIDataVG1/FCIDataLV1 /var/opt/mssql/data
mount /dev/FCIDataVG2/FCIDataLV2 /var/opt/mssql/userdata
11. Change the owner of the mounts to mssql.
chown mssql:mssql /var/opt/mssql/data
chown mssql:mssql /var/opt/mssql/userdata
12. Change the group ownership of the mounts to mssql.
chgrp mssql /var/opt/mssql/data
chgrp mssql /var/opt/mssql/userdata
13. Copy the files from the temp folders back to /var/opt/mssql/data and /var/opt/mssql/userdata.
cp /var/opt/mssql/tempdir/* /var/opt/mssql/data
cp /var/opt/mssql/tempuserdata/* /var/opt/mssql/userdata
14. Check the two temp folders and make sure the files were copied.
15. Change the ownership of the files to mssql.
chown mssql:mssql /var/opt/mssql/data/*
chown mssql:mssql /var/opt/mssql/userdata/*
16. Configure auto mount so that the OS mounts the devices automatically.
1) Get the UUIDs. Please note down the UUID, TYPE, and directory.
blkid /dev/FCIDataVG1/FCIDataLV1
blkid /dev/FCIDataVG2/FCIDataLV2
2) Edit /etc/fstab to configure auto mount on node1 and node2. Here is a screenshot from node1. Please review this article for more detail.
4. Cluster configuration
1. Edit /etc/hosts to specify the node names and IPs. Do the same thing on node1 and node2.
2. Create a file to store the SQL Server username and password for the Pacemaker login. Run the following commands on node1 and node2 (use the same login name and password specified in step 1 of the [SQL Server configuration] section).
sudo touch /var/opt/mssql/secrets/passwd
sudo echo 'sqlpackmaker' >> /var/opt/mssql/secrets/passwd
sudo echo 'YourStrongP@ssword1' >> /var/opt/mssql/secrets/passwd
sudo chown root:root /var/opt/mssql/secrets/passwd
sudo chmod 600 /var/opt/mssql/secrets/passwd
3. On both cluster nodes, open the Pacemaker firewall ports.
To open these ports with firewalld, run the following commands on node1 and node2.
sudo firewall-cmd --permanent --add-service=high-availability
sudo firewall-cmd --reload
4. Install the Pacemaker packages on node1 and node2.
sudo yum install pacemaker pcs fence-agents-all resource-agents
5. Set the password for the default user that is created when installing the Pacemaker and Corosync packages. Use the same password on node1 and node2.
sudo passwd hacluster
6. To allow nodes to rejoin the cluster after a reboot, enable and start the pcsd service and Pacemaker. Run the following commands on node1 and node2.
sudo systemctl enable pcsd
sudo systemctl start pcsd
sudo systemctl enable pacemaker
7. Create the cluster. Run the following commands on node1. The password should be the same as the one in step 5.
sudo pcs host auth node1 node2
sudo pcs cluster setup sqlcluster node1 node2
sudo pcs cluster start --all
sudo pcs cluster enable --all
8. Disable stonith for test purposes. Run the following command on node1.
sudo pcs property set stonith-enabled=false
9. Install the FCI resource agent for SQL Server. Run the following command on node1 and node2.
sudo yum install mssql-server-ha
10. Create the disk resources; they belong to a resource group (fci in this demo). Run the following commands on node1.
sudo pcs resource create iSCSIDisk1 Filesystem device="/dev/FCIDataVG1/FCIDataLV1" directory="/var/opt/mssql/data" fstype="xfs" --group fci
sudo pcs resource create iSCSIDisk2 Filesystem device="/dev/FCIDataVG2/FCIDataLV2" directory="/var/opt/mssql/userdata" fstype="xfs" --group fci
11. Create the IP resource that will be used by the FCI; it belongs to the same resource group created in the previous step.
sudo pcs resource create vip2 ocf:heartbeat:IPaddr2 ip=10.1.0.111 nic=eth0 cidr_netmask=24 --group fci
12. Create the FCI resource. The resource name should be exactly the same as the SQL virtual name created in step 2 of the [SQL Server configuration] section.
sudo pcs resource create sqlvirtualname1 ocf:mssql:fci --group fci
Failover
sudo pcs resource move sqlvirtualname1 <NodeName>
More: Add a node to an existing cluster