Creating alerts for custom errors with auditing and Log Analytics
Errors are an inherent part of any application. As a database professional managing an Azure SQL Managed Instance, you may want to know when specific errors occur and how to use user-generated errors to respond quickly when particular scenarios arise. In this post we will see how to set up alerts for scenarios such as query blocking and long open transactions. This extends what is described in How to setup alerts for deadlocks using Log Analytics | Microsoft Community Hub.

Step 1 - Set up Log Analytics and auditing

Follow steps 1, 2, 3 and 4 described in How to setup alerts for deadlocks using Log Analytics | Microsoft Community Hub.

Step 2 - Create a table to save the details of the blocking chain / long open transaction

Getting an alert by itself is not very useful if you don't have a way of capturing details for later analysis. Create the table in a database of your choice.

If you are interested in blocking:

CREATE TABLE [dbo].[blocking_report]
(
    [DateTime] [datetime] NULL
    ,[HeadBlocker] [varchar](1) NOT NULL
    ,[SessionID] [smallint] NOT NULL
    ,[Login] [nvarchar](128) NOT NULL
    ,[Database] [nvarchar](128) NULL
    ,[BlockedBy] [smallint] NULL
    ,[OpenTransactions] [int] NULL
    ,[Status] [nvarchar](30) NOT NULL
    ,[WaitType] [nvarchar](60) NULL
    ,[WaitTime_ms] [bigint] NULL
    ,[WaitResource] [nvarchar](256) NULL
    ,[WaitResourceDesc] [nvarchar](3072) NULL
    ,[Command] [nvarchar](32) NULL
    ,[Application] [nvarchar](128) NULL
    ,[TotalCPU_ms] [int] NOT NULL
    ,[TotalPhysicalIO_MB] [bigint] NULL
    ,[MemoryUse_KB] [int] NULL
    ,[LoginTime] [datetime] NOT NULL
    ,[LastRequestStartTime] [datetime] NOT NULL
    ,[HostName] [nvarchar](128) NULL
    ,[QueryHash] [binary](8) NULL
    ,[BlockerQuery_or_MostRecentQuery] [nvarchar](max) NULL
)

If you are interested in open transactions:

CREATE TABLE [dbo].[opentransactions]
(
    [CapturedTime] [datetime] NOT NULL,
    [tran_elapsed_time_seconds] [int] NULL,
    [transaction_begin_time] [datetime] NOT NULL,
    [session_id] [int] NOT NULL,
    [database_name] [nvarchar](128) NULL,
    [open_transaction_count] [int] NOT NULL,
    [host_name] [nvarchar](128) NULL,
    [program_name] [nvarchar](128) NULL,
    [login_name] [nvarchar](128) NULL,
    [status] [nvarchar](30) NULL,
    [text] [nvarchar](max) NULL
)

Step 3 - Create a SQL Agent job

Schedule the two SQL Agent jobs using the queries below. You will probably want a schedule that runs every X seconds or minutes. Briefly, what each query does:

- if a query has been blocked for more than 60 seconds, save the blocking chain to a table and raise an error
- if a transaction has been open for more than 120 seconds, save the query details to a table and raise an error

Make the necessary adjustments according to your goals.
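If you prefer to create the job through T-SQL rather than the SSMS wizard, the sketch below is one possible way to wire a job step to a one-minute schedule using the msdb stored procedures. It is a minimal example and not part of the original post: the job name, step name, schedule name and database name are placeholders, and the step command is where you would paste one of the queries that follow.

USE msdb;
GO
-- Placeholder job for the blocking check; adjust names and database to your environment
EXEC dbo.sp_add_job
    @job_name = N'Check blocking';

EXEC dbo.sp_add_jobstep
    @job_name = N'Check blocking',
    @step_name = N'Detect blocking and log it',
    @subsystem = N'TSQL',
    @database_name = N'database_name',  -- database that holds dbo.blocking_report
    @command = N'/* paste the blocking query from Step 3 here */';

EXEC dbo.sp_add_jobschedule
    @job_name = N'Check blocking',
    @name = N'Every minute',
    @freq_type = 4,               -- daily
    @freq_interval = 1,
    @freq_subday_type = 4,        -- repeat in minutes
    @freq_subday_interval = 1;    -- every 1 minute

EXEC dbo.sp_add_jobserver
    @job_name = N'Check blocking';
GO

A second job (or a second step on a different schedule) can be created the same way for the open transactions query.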
Query for blocking:

IF (
        SELECT count(*)
        FROM sys.dm_exec_requests
        WHERE wait_type LIKE 'LCK%'
            AND wait_time > 60000
        ) > 0 -- checks for queries waiting to obtain a lock for more than 60 seconds
BEGIN
    INSERT INTO database_name.dbo.Blocking_report -- make sure that you change the database_name value
    SELECT (
            SELECT getdate()
            )
        ,[HeadBlocker] = CASE 
            WHEN r2.session_id IS NOT NULL
                AND (
                    r.blocking_session_id = 0
                    OR r.session_id IS NULL
                    )
                THEN '1'
            ELSE ''
            END
        ,[SessionID] = s.session_id
        ,[Login] = s.login_name
        ,[Database] = db_name(p.dbid)
        ,[BlockedBy] = w.blocking_session_id
        ,[OpenTransactions] = r.open_transaction_count
        ,[Status] = s.STATUS
        ,[WaitType] = w.wait_type
        ,[WaitTime_ms] = w.wait_duration_ms
        ,[WaitResource] = r.wait_resource
        ,[WaitResourceDesc] = w.resource_description
        ,[Command] = r.command
        ,[Application] = s.program_name
        ,[TotalCPU_ms] = s.cpu_time
        ,[TotalPhysicalIO_MB] = (s.reads + s.writes) * 8 / 1024
        ,[MemoryUse_KB] = s.memory_usage * 8192 / 1024
        ,[LoginTime] = s.login_time
        ,[LastRequestStartTime] = s.last_request_start_time
        ,[HostName] = s.host_name
        ,[QueryHash] = r.query_hash
        ,[BlockerQuery_or_MostRecentQuery] = txt.TEXT
    FROM sys.dm_exec_sessions s
    LEFT OUTER JOIN sys.dm_exec_connections c ON (s.session_id = c.session_id)
    LEFT OUTER JOIN sys.dm_exec_requests r ON (s.session_id = r.session_id)
    LEFT OUTER JOIN sys.dm_os_tasks t ON (
            r.session_id = t.session_id
            AND r.request_id = t.request_id
            )
    LEFT OUTER JOIN (
        SELECT *
            ,ROW_NUMBER() OVER (
                PARTITION BY waiting_task_address ORDER BY wait_duration_ms DESC
                ) AS row_num
        FROM sys.dm_os_waiting_tasks
        ) w ON (t.task_address = w.waiting_task_address)
        AND w.row_num = 1
    LEFT OUTER JOIN sys.dm_exec_requests r2 ON (s.session_id = r2.blocking_session_id)
    LEFT OUTER JOIN sys.sysprocesses p ON (s.session_id = p.spid)
    OUTER APPLY sys.dm_exec_sql_text(ISNULL(r.[sql_handle], c.most_recent_sql_handle)) AS txt
    WHERE s.is_user_process = 1
        AND (
            r2.session_id IS NOT NULL
            AND (
                r.blocking_session_id = 0
                OR r.session_id IS NULL
                )
            )
        OR blocked > 0
    ORDER BY [HeadBlocker] DESC
        ,s.session_id;

    THROW 50000
        ,'There are queries being blocked for more than 60 seconds'
        ,1;
END

Query for open transactions:

DECLARE @count INT = (
        SELECT count(*)
        FROM sys.dm_tran_active_transactions at
        INNER JOIN sys.dm_tran_session_transactions st ON st.transaction_id = at.transaction_id
        LEFT OUTER JOIN sys.dm_exec_sessions sess ON st.session_id = sess.session_id
        LEFT OUTER JOIN sys.dm_exec_connections conn ON conn.session_id = sess.session_id
        OUTER APPLY sys.dm_exec_sql_text(conn.most_recent_sql_handle) AS txt
        WHERE DATEDIFF(SECOND, transaction_begin_time, GETDATE()) > 120 -- 120 seconds
        )

IF @count > 0
BEGIN
    INSERT INTO database_name.dbo.opentransactions -- change database_name to where the table was created
    SELECT GETDATE() AS CapturedTime
        ,DATEDIFF(SECOND, transaction_begin_time, GETDATE()) AS tran_elapsed_time_seconds
        ,at.transaction_begin_time
        ,st.session_id
        ,DB_NAME(sess.database_id) AS database_name
        ,st.open_transaction_count
        ,sess.host_name
        ,sess.program_name
        ,sess.login_name
        ,sess.STATUS
        ,txt.TEXT
    FROM sys.dm_tran_active_transactions at
    INNER JOIN sys.dm_tran_session_transactions st ON st.transaction_id = at.transaction_id
    LEFT OUTER JOIN sys.dm_exec_sessions sess ON st.session_id = sess.session_id
    LEFT OUTER JOIN sys.dm_exec_connections conn ON conn.session_id = sess.session_id
    OUTER APPLY sys.dm_exec_sql_text(conn.most_recent_sql_handle) AS txt
    WHERE DATEDIFF(SECOND, transaction_begin_time, GETDATE()) > 120 -- 120 seconds
    ORDER BY tran_elapsed_time_seconds DESC;
    THROW 50000
        ,'There are open transactions for more than 120 seconds'
        ,1;
END

Step 4 - Create the alert based on a Log Analytics query

Just as described in steps 6 and 7 of How to setup alerts for deadlocks using Log Analytics | Microsoft Community Hub, you can create the two separate alerts using the queries below.

For blocking:

AzureDiagnostics
| where TimeGenerated > ago(15m) //last 15 minutes
| where LogicalServerName_s == "server_name" //server name
| where Category == "SQLSecurityAuditEvents"
| where additional_information_s contains "There are queries being blocked for more"
| project TimeGenerated, LogicalServerName_s, database_name_s, additional_information_s

For open transactions:

AzureDiagnostics
| where TimeGenerated > ago(15m) //last 15 minutes
| where LogicalServerName_s == "server_name" //server name
| where Category == "SQLSecurityAuditEvents"
| where additional_information_s contains "There are open transactions for more than"
| project TimeGenerated, LogicalServerName_s, database_name_s, additional_information_s
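When one of these alerts fires, the details that triggered it are already persisted in the tables from Step 2. As a quick follow-up (a simple sketch, assuming the tables live in the database you are connected to), you can pull the most recent captures for analysis:

-- Review the most recent blocking chains captured by the job
SELECT TOP (100) *
FROM dbo.blocking_report
ORDER BY [DateTime] DESC;

-- Review the most recent long-running open transactions
SELECT TOP (100) *
FROM dbo.opentransactions
ORDER BY CapturedTime DESC;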
How to setup alerts for deadlocks using Log Analytics

Managed Instance diagnostic events do not support sending deadlock information to Log Analytics. However, through auditing, it is possible to query failed queries along with their reported error messages, though this does not include the deadlock XML. We will see how to send this information to Log Analytics and set up an alert for when a deadlock occurs.

Step 1 - Deploy Log Analytics

Create a Log Analytics workspace if you currently don't have one: Create a Log Analytics workspace

Step 2 - Add diagnostic setting

In the Azure Portal, open the Diagnostic settings of your Azure SQL Managed Instance and choose Add diagnostic setting. Select SQL Security Audit Event and choose your Log Analytics workspace as the destination.

Step 3 - Create a server audit on the Azure SQL Managed Instance

Run the query below on the Managed Instance. Rename the server audit and server audit specification to a name of your choice.

CREATE SERVER AUDIT [audittest] TO EXTERNAL_MONITOR
GO

-- we are adding login auditing, but only BATCH_COMPLETED_GROUP is necessary for query execution
CREATE SERVER AUDIT SPECIFICATION audit_server
FOR SERVER AUDIT audittest
    ADD (SUCCESSFUL_LOGIN_GROUP),
    ADD (BATCH_COMPLETED_GROUP),
    ADD (FAILED_LOGIN_GROUP)
WITH (STATE = ON)
GO

ALTER SERVER AUDIT [audittest] WITH (STATE = ON)
GO

Step 4 - Check events on Log Analytics

It may take some time for records to begin appearing in Log Analytics. Open your Log Analytics workspace and choose Logs. To verify that data is being ingested, run the following query and wait until you start getting the first results. Make sure that you change servername to your Azure SQL Managed Instance name.

AzureDiagnostics
| where LogicalServerName_s == "servername"
| where Category == "SQLSecurityAuditEvents"
| take 10

Step 5 - (Optional) Create a deadlock event for testing

Create a deadlock scenario so you can see a record in Log Analytics. For example, open SSMS and a new query window in the context of a user database (you can create a test database just for this test).

Create a table in a user database and insert 10 records:

create table tb1 (id int identity(1,1) primary key clustered, col1 varchar(30))
go
insert into tb1 values ('aaaaaaa')
go 10

You can close the query window or reuse it for the next step.

Open a new query window (or reuse the first query window) and run the following, leaving the query window open after executing:

begin transaction
update tb1 set col1 = 'bbbb' where id = 1

Open a second query window and run the following, leaving the query window open after executing:

begin transaction
update tb1 set col1 = 'bbbb' where id = 2

Go back to the first query window and run the following (the query will be blocked and stay executing):

update tb1 set col1 = 'bbbb' where id = 2

Go back to the second query window and run the following (this transaction will be the deadlock victim):

update tb1 set col1 = 'bbbb' where id = 1

You can roll back and close all windows after the deadlock exception.
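If the audit records never show up in Log Analytics, it can help to confirm on the instance itself that the audit and audit specification from Step 3 are actually enabled. A quick check, not part of the original walkthrough, could look like this:

-- Server audit status (expect STARTED for the audit created in Step 3)
SELECT name, status_desc, status_time
FROM sys.dm_server_audit_status;

-- Server audit specification state (expect is_state_enabled = 1)
SELECT name, is_state_enabled
FROM sys.server_audit_specifications;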
Step 6 - (Optional) Check the deadlock exception on Log Analytics

Note: the record can take a few minutes to appear in Log Analytics. Use the query below to obtain the deadlock events for the last hour (we are looking for error 1205). Make sure that you change servername to your Azure SQL Managed Instance name.

AzureDiagnostics
| where TimeGenerated > ago(1h)
| where LogicalServerName_s == "servername"
| where Category == "SQLSecurityAuditEvents"
| where succeeded_s == "false"
| where additional_information_s contains "Err 1205, Level 13"

Step 7 - Use the query to create an alert

Use the query below to create an alert in Azure Log Analytics. Make sure that you change servername to your Azure SQL Managed Instance name. The query checks for deadlocks that occurred in the previous hour.

AzureDiagnostics
| where TimeGenerated > ago(1h)
| where LogicalServerName_s == "servername"
| where Category == "SQLSecurityAuditEvents"
| where succeeded_s == "false"
| where additional_information_s contains "Err 1205, Level 13"

Run the query and click New alert rule. Create the alert with the desired settings.
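Depending on how you configure the alert rule, you may prefer a query that returns a count to compare against a threshold rather than individual rows. One possible variation (a sketch, not from the original post, assuming the same database_name_s column used in the blocking alerts) aggregates the deadlock events per database:

AzureDiagnostics
| where TimeGenerated > ago(1h)
| where LogicalServerName_s == "servername"
| where Category == "SQLSecurityAuditEvents"
| where succeeded_s == "false"
| where additional_information_s contains "Err 1205, Level 13"
| summarize DeadlockCount = count() by database_name_s, bin(TimeGenerated, 1h)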
Monitor your data using Azure Monitor log search alerts and the Azure Data Explorer (ADX) Database

Have you ever thought about how to keep track of the health and performance of your workloads and business? Have you ever encountered an issue where you wanted to be notified once something went wrong or needed your attention? That's where alerts come in handy.