sql server
Windows Server 2025 SQL patching cluster problem
Dear Team, I have a problem when patching Windows Server 2025 with KB5091157. After patching, the node is not able to rejoin the cluster; it shows a credentials error. The log error is: "Cannot connect sqlxxxxxxx. You do not have administrative privileges on the cluster. Contact your network administrator to request access." Note: The server is not in a different VLAN network.

Documentation contradictory
Hi all, the page https://learn.microsoft.com/en-us/sql/t-sql/statements/create-database-transact-sql?view=sql-server-ver17&tabs=sqlpool states: "SIZE, MAXSIZE, and FILEGROWTH parameters can be set when a UNC path is specified for the file." However, later on that same page it states: "SIZE can't be specified when the os_file_name is specified as a UNC path." I think those two sentences contradict each other.

Regex support for LOB types in T-SQL: available in Azure SQL & SQL Server 2025
At a glance: Native regular expression (regex) functions in T-SQL now accept varchar(max) and nvarchar(max) inputs of up to 2 MB across all seven regex functions, including the two table-valued functions (REGEXP_MATCHES and REGEXP_SPLIT_TO_TABLE). This capability ships in SQL Server 2025 CU5 and is already available in Azure SQL Database, SQL Database in Fabric, and Azure SQL Managed Instance configured with the Always-up-to-date update policy. It will reach Managed Instances on the SQL Server 2025 update policy as part of the CU5 rollout. You no longer need to split log files, HTML documents, or large JSON payloads into 8,000-byte chunks just to run a pattern match.

1. Introduction

Regular expressions have long been a cornerstone of modern data processing, used for validation, parsing, transformation, and extracting structured insights from unstructured text. With SQL Server 2025 and Azure SQL, regex is now a first-class T-SQL capability, removing the historical need to rely on SQLCLR functions or application-tier processing. While the initial release made native regex broadly available, large-object (LOB) inputs were not yet supported on every function. CU5 closes that gap.

Under the hood, T-SQL regex implements POSIX Extended Regular Expression (ERE) semantics, augmented by a curated set of Perl-style features, and is powered by the RE2 engine. RE2 is a linear-time, non-backtracking implementation, which means it is not susceptible to catastrophic backtracking (a class of denial-of-service issue commonly known as ReDoS). That guarantee becomes far more important when the input is a 1.8 MB log blob than when it is an 8,000-byte string.

Release timeline

Milestone | What shipped
Ignite 2025, General Availability | Regex went GA in SQL Server 2025 and Azure SQL. LOB inputs were initially supported only on REGEXP_LIKE, REGEXP_COUNT, and REGEXP_INSTR. LOB support on REGEXP_REPLACE and REGEXP_SUBSTR was deferred, and the two table-valued functions (TVFs) accepted only non-LOB string types.
Azure SQL (post-GA service updates) | LOB inputs enabled across all seven functions.
SQL Server 2025 CU5 | LOB inputs up to 2 MB enabled on all seven functions in SQL Server.

What's new in CU5

varchar(max) and nvarchar(max) inputs are accepted on every regex function.
The input string is capped at 2 MB per function call.
The pattern is still capped at 8,000 bytes, which is far larger than any maintainable regular expression should ever need.
Behavior is consistent between Azure SQL and SQL Server, so code you write today is fully portable.

Note: The 2 MB limit applies to the input passed to a single function call, not to the column or row. A single value in a varchar(max) column can still store up to 2 GB; the constraint is that no single regex evaluation can consume more than 2 MB of that value.

Prerequisites

SQL Server 2025 CU5 or later, or Azure SQL Database, or SQL Database in Fabric, or Azure SQL Managed Instance configured with the SQL Server 2025 / Always-up-to-date update policy.
The two table-valued functions (REGEXP_MATCHES and REGEXP_SPLIT_TO_TABLE) require database compatibility level 170, unless the database-scoped configuration ALLOW_BUILTIN_TVF_IN_ALL_COMPAT_LEVELS (preview) is enabled.

Note: On Azure SQL Managed Instance (Always-up-to-date), this capability is rolling out region by region. It is already live in regions where the rollout has completed and will light up in the remaining regions as the deployment finishes. Instances on the SQL Server 2025 update policy will receive it as part of the CU5 rollout, which is coming soon.

-- Verify compatibility level (170 required for the TVFs)
SELECT name, compatibility_level
FROM sys.databases
WHERE name = DB_NAME();
-- If necessary:
-- ALTER DATABASE [<your-database>] SET COMPATIBILITY_LEVEL = 170;

2. 
Working with LOB Data This section demonstrates the CU5 capabilities against a realistic LOB data. We build a LogEntries table whose RawPayload column holds multi-KB to multi-MB chunks of web server and application output, plus an HtmlPages table for HTML cleansing examples. 2.1 Create the sample schema and data IF OBJECT_ID('dbo.LogEntries', 'U') IS NOT NULL DROP TABLE dbo.LogEntries; IF OBJECT_ID('dbo.HtmlPages', 'U') IS NOT NULL DROP TABLE dbo.HtmlPages; CREATE TABLE dbo.LogEntries ( LogId BIGINT IDENTITY(1,1) PRIMARY KEY, Source SYSNAME NOT NULL, IngestedAt DATETIME2(3) NOT NULL DEFAULT SYSUTCDATETIME(), RawPayload VARCHAR(MAX) NOT NULL -- LOB column ); CREATE TABLE dbo.HtmlPages ( PageId INT IDENTITY(1,1) PRIMARY KEY, Url NVARCHAR(2048) NOT NULL, Body NVARCHAR(MAX) NOT NULL -- LOB column (Unicode) ); Now generate realistically large rows. The REPLICATE(CAST(... AS varchar(max)), n) pattern is required because REPLICATE returns NULL when the result would exceed 8,000 bytes unless its first argument is a max type. -- Synthetic web access-log payload (~252 KB in row 1, plus a separate ~586 KB row). DECLARE @logLine VARCHAR(500) = '127.0.0.1 - alice [21/May/2026:10:15:32 +0000] "GET /api/orders/42 HTTP/1.1" 200 1532 ' + 'user-agent="Mozilla/5.0" ip=10.0.0.7 email=alice@contoso.com card=4111-1111-1111-1234' + CHAR(10); DECLARE @bigLog VARCHAR(MAX) = REPLICATE(CAST(@logLine AS VARCHAR(MAX)), 1500) -- ~252 KB + '127.0.0.1 - mallory [21/May/2026:10:16:01 +0000] "POST /login HTTP/1.1" 500 0 ' + 'ip=203.0.113.99 ssn=123-45-6789' + CHAR(10); INSERT INTO dbo.LogEntries (Source, RawPayload) VALUES ('web-01', @bigLog), -- ~252 KB ('web-02', REPLICATE(CAST('OK ' AS VARCHAR(MAX)), 200000)); -- ~586 KB -- Synthetic HTML page (~775 KB / ~396,000 characters). DECLARE @htmlChunk NVARCHAR(MAX) = N'<div class="row"><p>Hello <b>world</b>! 
Contact <a href="mailto:bob@contoso.com">bob</a>.</p></div>'; INSERT INTO dbo.HtmlPages (Url, Body) VALUES (N'https://contoso.example/page-1', N'<html><head><title>Big Page</title></head><body>' + REPLICATE(@htmlChunk, 4000) + N'</body></html>'); -- Confirm payload sizes in bytes. SELECT LogId, Source, DATALENGTH(RawPayload) AS PayloadBytes FROM dbo.LogEntries; SELECT PageId, DATALENGTH(Body) AS BodyBytes, LEN(Body) AS BodyChars FROM dbo.HtmlPages; Results: LogId Source PayloadBytes 1 web-01 258,110 2 web-02 600,000 PageId BodyBytes BodyChars 1 792,124 396,062 Before CU5, feeding any of these payloads into REGEXP_REPLACE, REGEXP_SUBSTR, REGEXP_MATCHES, or REGEXP_SPLIT_TO_TABLE would have failed with a type-mismatch error or required a LEFT(RawPayload, 8000)-style truncation. The same queries now run end-to-end. 2.2 REGEXP_LIKE — Filter rows by LOB content -- Find logs that contain at least one HTTP 5xx response. SELECT LogId, Source, DATALENGTH(RawPayload) AS PayloadBytes FROM dbo.LogEntries WHERE REGEXP_LIKE(RawPayload, '"[A-Z]+\s[^"]+\sHTTP/1\.[01]"\s5[0-9]{2}\s'); REGEXP_LIKE is a Boolean predicate: it evaluates to true when the pattern matches anywhere in the input and false otherwise. Because it returns a Boolean rather than a bit, use it directly in WHERE, CASE WHEN, IIF, or CHECK constraint contexts — do not compare it with = 1 or = 0 (the parser rejects that syntax). Note — REGEXP_LIKE itself requires database compatibility level 170. The other scalar regex functions (REGEXP_COUNT, REGEXP_INSTR, REGEXP_REPLACE, REGEXP_SUBSTR) are available at all compatibility levels. Results: LogId Source PayloadBytes 1 web-01 258,110 2.3 REGEXP_COUNT — Counting at scale -- Per-row tally of GET requests, POST requests, and 5xx responses -- across the entire LOB payload. 
SELECT LogId, Source, REGEXP_COUNT(RawPayload, '"GET\s') AS Gets, REGEXP_COUNT(RawPayload, '"POST\s') AS Posts, REGEXP_COUNT(RawPayload, '\s5[0-9]{2}\s') AS ServerErrors FROM dbo.LogEntries; Results: LogId Source Gets Posts ServerErrors 1 web-01 1,500 1 1 2 web-02 0 0 0 2.4 REGEXP_INSTR — Locate the first error -- 1-based character position (or 0 if no match) of the FIRST 5xx response in each payload. SELECT LogId, Source, REGEXP_INSTR(RawPayload, '\s5[0-9]{2}\s', 1, 1, 0) AS FirstErrorPos FROM dbo.LogEntries; Parameter recap: REGEXP_INSTR(string, pattern, start, occurrence, return_option [, flags [, group ]]). A return_option of 0 returns the starting position of the match; 1 returns the position immediately after the last character of the match. Results: LogId Source FirstErrorPos 1 web-01 258,072 2 web-02 0 2.5 REGEXP_REPLACE — Redact sensitive data in place PII redaction over LOB payloads was one of the most-requested CU5 scenarios. Before CU5, it required a custom chunked-replace routine; it is now a single expression. -- Redact credit-card-shaped tokens, U.S. SSN-shaped tokens, and email addresses -- across the entire payload. SELECT LogId, REGEXP_REPLACE( REGEXP_REPLACE( REGEXP_REPLACE( RawPayload, '\b[0-9]{4}[- ]?[0-9]{4}[- ]?[0-9]{4}[- ]?[0-9]{4}\b', '****-****-****-****'), '\b[0-9]{3}-[0-9]{2}-[0-9]{4}\b', '***-**-****'), '\b[A-Za-z0-9._%+\-]+@[A-Za-z0-9.\-]+\.[A-Za-z]{2,}\b', '[redacted-email]' ) AS RedactedPayload FROM dbo.LogEntries; Or strip every HTML tag from an nvarchar(max) page in a single call: SELECT PageId, LEN(Body) AS OriginalLen, LEN(REGEXP_REPLACE(Body, N'<[^>]+>', N'')) AS TextOnlyLen FROM dbo.HtmlPages; Results — the ~775 KB HTML document collapses from 396,062 to 100,008 characters of plain text in a single call: PageId OriginalLen TextOnlyLen 1 396,062 100,008 2.6 REGEXP_SUBSTR — Extract a single value -- Pull the first IPv4 address out of each log payload. 
SELECT LogId, REGEXP_SUBSTR(RawPayload, '\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b', 1, -- start position 1, -- occurrence 'c', -- flags: case-sensitive 0 -- group: 0 returns the whole match ) AS FirstIp FROM dbo.LogEntries; To return the contents of a specific capture group instead of the entire match, pass its 1-based group number as the final argument. Results: LogId FirstIp 1 127.0.0.1 2 NULL 2.7 REGEXP_MATCHES — Every match, set-based This is where the combination of TVF and LOB delivers the largest productivity gain: extract every structured value from a megabyte of unstructured text in a single set-based query, with no client round-trips. REGEXP_MATCHES returns one row per match with these columns: Column Type Description match_id bigint Sequence number of the match (1-based). start_position int 1-based start index of the match. end_position int 1-based end index of the match. match_value same type as string_expression The entire matched substring. substring_matches json JSON array describing each capture group, with the shape [{"value":"…","start":N,"length":N}, …]. -- Every email address in every log payload, alongside its row of origin. SELECT l.LogId, m.match_id, m.match_value AS EmailFound FROM dbo.LogEntries AS l CROSS APPLY REGEXP_MATCHES( l.RawPayload, '\b[A-Za-z0-9._%+\-]+@[A-Za-z0-9.\-]+\.[A-Za-z]{2,}\b' ) AS m ORDER BY l.LogId, m.match_id; Capture groups are even more useful — you can project the parts of every log line as columns by reading from the substring_matches JSON document: -- Parse Common-Log-Format-ish entries into ip, user, status, and bytes columns. -- The pattern has four capture groups, accessed below as $[0] through $[3]. 
SELECT l.LogId, m.match_id, JSON_VALUE(m.substring_matches, '$[0].value') AS Ip, JSON_VALUE(m.substring_matches, '$[1].value') AS UserName, JSON_VALUE(m.substring_matches, '$[2].value') AS Status, JSON_VALUE(m.substring_matches, '$[3].value') AS Bytes FROM dbo.LogEntries AS l CROSS APPLY REGEXP_MATCHES( l.RawPayload, '^([0-9.]+)\s-\s(\S+)\s\[[^\]]+\]\s"[^"]+"\s([0-9]{3})\s([0-9]+)', 'm' -- multi-line: ^ and $ anchor to each line, not just the whole input ) AS m ORDER BY l.LogId, m.match_id; Important — Without the 'm' flag, the ^ anchor matches only at the start of the entire 250 KB input, so you would receive exactly one match for the first line. The multi-line flag is what unlocks per-line extraction. Results (first two parsed rows): LogId match_id Ip UserName Status Bytes 1 1 127.0.0.1 alice 200 1532 1 2 127.0.0.1 alice 200 1532 2.8 REGEXP_SPLIT_TO_TABLE — Shred a LOB into rows -- Project the entire log payload as one row per non-empty line. SELECT l.LogId, s.ordinal AS [LineNo], s.value AS LineText FROM dbo.LogEntries AS l CROSS APPLY REGEXP_SPLIT_TO_TABLE(l.RawPayload, '\r?\n') AS s WHERE l.LogId = 1 AND s.value <> '' ORDER BY s.ordinal; You now have a tabular projection of a multi-megabyte text blob without leaving the engine. You can feed it into a CTE, aggregate it, join it to dimension tables, or materialize it into a staging table — all set-based. Results (first three rows): LogId ordinal LineText (first 80 chars) 1 1 127.0.0.1 - alice [21/May/2026:10:15:32 +0000] "GET /api/orders/42 HTTP/1.1" 200 1 2 127.0.0.1 - alice [21/May/2026:10:15:32 +0000] "GET /api/orders/42 HTTP/1.1" 200 1 3 127.0.0.1 - alice [21/May/2026:10:15:32 +0000] "GET /api/orders/42 HTTP/1.1" 200 Tip — composing LOB regex pipelines — CROSS APPLY (and OUTER APPLY when you need to preserve rows that produce no matches) is the primary composition primitive. 
You can stack REGEXP_SPLIT_TO_TABLE (lines) feeding REGEXP_MATCHES (fields per line) feeding ordinary aggregates, all within a single query plan. 2.9 The 2 MB ceiling — strategies for larger inputs The 2 MB limit applies to the input string of a single regex call. If the value passed to a regex function exceeds 2 MB, the call raises an error (error number 19311, severity 16) rather than silently truncating. That is the intended behavior — silent truncation would hide correctness bugs. In practice, 2 MB is a generous ceiling: a single log file or HTML document of that size is already unusual, and most real-world LOB data sit comfortably below it. When individual values do exceed the limit, the most reliable approach is to split them into smaller logical units before they land in the column you want to query — for example, by writing one log line, one document section, or one record per row at ingestion time. Because every regex function (including the two TVFs) shares the same 2 MB ceiling, sharding at query time is not generally feasible; doing it at the load path keeps every regex call well under the limit and avoids per-query workarounds. Bytes vs. characters — The 2 MB limit is measured in bytes, not characters, and the byte count is based on the UTF-8 encoding of the input regardless of the column’s declared type. ASCII characters take 1 byte each, so plain ASCII text can run to roughly two million characters; non-ASCII characters take 2–4 bytes in UTF-8, so fewer characters fit. Keep in mind that DATALENGTH() reports storage size in the column’s own encoding, which may differ from the UTF-8 byte count used by the limit, and LEN() (which counts characters) is best avoided as a sizing check here. 
To measure the UTF-8 byte length that the limit actually checks, cast the value to varchar(max) under a UTF-8 collation and take its DATALENGTH: SELECT DATALENGTH( CONVERT(varchar(max), Body COLLATE Latin1_General_100_CI_AS_SC_UTF8) ) AS Utf8Bytes FROM dbo.HtmlPages; Anything above 2 * 1024 * 1024 (2,097,152) bytes will be rejected by a regex call on that value. Have a scenario that genuinely needs more than 2 MB? If your workload requires regex evaluation on individual values larger than the current 2 MB ceiling, we would like to hear about it. Please share the details — data shape, payload size, pattern, and business need — on the Azure SQL feedback portal. Customer feedback directly informs how we prioritize future limit changes. 2.10 Cleanup DROP TABLE IF EXISTS dbo.LogEntries; DROP TABLE IF EXISTS dbo.HtmlPages; 3. Summary What changed in CU5 Before CU5 — LOB inputs were accepted on REGEXP_LIKE, REGEXP_COUNT, and REGEXP_INSTR. The remaining functions — REGEXP_REPLACE, REGEXP_SUBSTR, and the two TVFs (REGEXP_MATCHES, REGEXP_SPLIT_TO_TABLE) — required non-LOB string inputs, which often meant truncating with LEFT(..., 8000) or chunking in the application tier. After CU5 (and already in Azure SQL) — All seven functions accept varchar(max) and nvarchar(max) inputs of up to 2 MB. The pattern remains capped at 8,000 bytes. 
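Because a call raises an error rather than truncating once its input passes 2 MB, a batch job may want to measure a value before handing it to a regex function. The following is a minimal sketch under stated assumptions: the variable names and the sample text are made up for illustration, and the size check reuses the UTF-8 measurement from section 2.9.

```sql
-- Hypothetical sample value; in practice this would come from an nvarchar(max) column.
DECLARE @body nvarchar(max) =
    REPLICATE(CAST(N'<p>héllo wörld</p>' AS nvarchar(max)), 1000);

-- The 2 MB ceiling on the input of a single regex call, as described in the post.
DECLARE @limitBytes bigint = 2 * 1024 * 1024;

-- UTF-8 byte length of the value, using the conversion shown in section 2.9.
DECLARE @utf8Bytes bigint = DATALENGTH(
    CONVERT(varchar(max), @body COLLATE Latin1_General_100_CI_AS_SC_UTF8));

IF @utf8Bytes <= @limitBytes
    -- Small enough: run the regex call.
    SELECT @utf8Bytes AS Utf8Bytes,
           REGEXP_COUNT(@body, N'<p>') AS ParagraphTags;
ELSE
    -- Too large: report the size and skip the regex call instead of failing the batch.
    SELECT @utf8Bytes AS Utf8Bytes,
           CAST(NULL AS int) AS ParagraphTags;
```

Whether to skip, log, or split an oversized value is a design choice for the load path; the sketch only shows where the check would sit.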
Quick reference

Function | Returns | LOB input (CU5) | Common use case
REGEXP_LIKE | Boolean (predicate) | Yes | Filter rows in WHERE / CASE / CHECK predicates
REGEXP_COUNT | int | Yes | Count occurrences of a pattern
REGEXP_INSTR | int | Yes | Position of the nth match
REGEXP_REPLACE | string | Yes | Redact, cleanse, or normalize text
REGEXP_SUBSTR | string | Yes | Extract a single value
REGEXP_MATCHES (TVF) | (match_id, start_position, end_position, match_value, substring_matches) | Yes | Extract every match plus capture groups (via JSON), set-based
REGEXP_SPLIT_TO_TABLE (TVF) | (value, ordinal) | Yes | Split a LOB into rows by a regex delimiter

Further reading

Official documentation: REGEXP_LIKE, REGEXP_COUNT, REGEXP_INSTR, REGEXP_REPLACE, REGEXP_SUBSTR, REGEXP_MATCHES, REGEXP_SPLIT_TO_TABLE.
Regular expressions overview.
SQL Server 2025 CU5 release notes.

Closing thought. Native regex was already a significant quality-of-life improvement when it became generally available. CU5 completes the picture: every function, every input size up to 2 MB, every shape, scalar or table-valued. The next time you are tempted to export a column out of the database in order to grep it, try one of the seven regex functions first. Happy matching.

MS ODBC and OLE DB failed
Hello, in SQL Server 2022 (16.0.4250.1), two failures are shown and I can't continue (see screenshot). These versions of ODBC and OLE DB are installed on the system. The system was working previously (it did not stop at this window with a failure). We repaired both installations and restarted the PC, but that did not help. What needs to be repaired, and how? Thank you.

Architecture Risk Brief: Silent Data Integrity Failures in Distributed Criminal Justice Systems
Why Modernized Public Safety Environments Need Stronger Data Integrity Controls In criminal justice information services systems, the most dangerous failures are often the ones you cannot see. A system may appear fully operational—dashboards green, services responsive, transactions flowing—while critical data is incomplete, inconsistent, or out of sync across connected platforms. In these environments, the absence of alerts does not necessarily mean the absence of problems. Instead, it can signal that data integrity issues are developing silently beneath normal system behavior. As agencies modernize criminal justice information services (CJIS) systems, adopt cloud platforms, and expand data sharing across jurisdictions, the challenge is not only keeping systems online; it is ensuring the data moving between them remains accurate, consistent, and trustworthy. Why This Risk Is Growing Criminal justice agencies are going through rapid modernization, and with that comes a level of complexity that simply didn’t exist in earlier, more isolated systems. In many environments, legacy applications are still running alongside newer cloud-based platforms, which creates gaps in how data is processed and interpreted. At the same time, transaction volumes have increased significantly, and under heavy load it’s not uncommon to see partial commits, retry behavior, or subtle inconsistencies that are hard to detect. There’s also a growing expectation for near real-time synchronization across systems, even when those systems weren’t originally designed to stay perfectly in sync. As more agencies begin sharing data across jurisdictions, the number of integration points increases, and each one introduces its own risk. None of these changes are inherently problematic, but together they create conditions where data integrity issues can develop quietly without triggering any obvious system failures. 
These changes improve capability but also create new failure modes that traditional monitoring does not detect. System uptime alone is no longer a reliable indicator of operational health. The CJIS Security Policy reinforces this requirement by mandating that criminal justice information (CJI) remain accurate, complete, and protected from unauthorized alteration throughout its lifecycle. What Silent Data Integrity Failures Look Like Silent failures almost never show up as outages. Most of the time, everything looks fine on the surface—systems are up, jobs are running, dashboards are green. The problems usually come to light much later, often when someone is preparing for an audit, reconciling data between agencies, or digging into a case where something just doesn’t add up. In one scenario, a transaction completed successfully in the source system but never made it to a downstream platform. There were no errors, no retries flagged—just missing data. In another case, records looked perfectly valid within each system, but when compared across environments, they didn’t match. These kinds of discrepancies tend to surface during reporting or compliance checks, not during normal operations. That’s what makes them difficult to catch. From an operational standpoint, everything appears healthy. There are no alerts or obvious failures, but underneath that, the data has slowly drifted out of sync. Database Corruption: The Most Silent Failure of All Beyond synchronization gaps, database corruption represents an even more dangerous and often invisible threat. Corruption can arise from: Storage subsystem issues Hardware degradation Incomplete writes under high load Failover anomalies Legacy-to-cloud interactions Low-severity corruption may go unnoticed for weeks but eventually impacts multiple agency systems. Because corruption directly threatens the accuracy and integrity of CJI, it poses a significant CJIS compliance risk. 
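To make "corruption that goes unnoticed" more concrete: SQL Server records pages that failed with an 823 or 824 error, a bad checksum, or a torn page in the msdb.dbo.suspect_pages table, and each database has a page-verify setting in sys.databases. The queries below are a minimal sketch of how those two sources could be polled; they are an illustration only and not the alerting system described in this article.

```sql
-- Pages SQL Server has flagged as suspect.
-- event_type 1 = 823/824 error, 2 = bad checksum, 3 = torn page.
SELECT DB_NAME(sp.database_id) AS DatabaseName,
       sp.file_id,
       sp.page_id,
       sp.event_type,
       sp.error_count,
       sp.last_update_date
FROM msdb.dbo.suspect_pages AS sp
WHERE sp.event_type IN (1, 2, 3)
ORDER BY sp.last_update_date DESC;

-- Databases that are not using CHECKSUM page verification,
-- where damaged pages are less likely to be detected on read.
SELECT name, page_verify_option_desc
FROM sys.databases
WHERE page_verify_option_desc <> 'CHECKSUM';
```

An empty result from the first query only means no damaged page has been read yet; pages that are never touched are not checked until something such as DBCC CHECKDB reads them.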
My Implementation: Automated Corruption Alerts

To deal with this, I implemented a simple automated alerting system that monitors corruption indicators and notifies me as soon as something looks off. Instead of waiting for issues to surface during audits or downstream failures, this provides an early signal that something isn't right. In practice, it means I can react quickly, investigate the issue before it spreads, and avoid situations where bad data propagates into other systems. In CJIS environments, even a single corrupted record can have real consequences, so early visibility makes a meaningful difference.

[Figure: Flow Diagram to Detect Integrity]

Root Causes of Silent Data Drift

In most cases, these data integrity issues don't come from obvious failures; they build up during normal day-to-day operations. In high-volume systems, retries and partial commits under load can leave data in an inconsistent state without triggering any errors. During modernization or cloud migrations, subtle differences in schema behavior or transformation logic can cause data to drift between systems over time. Another common gap is monitoring. Most setups track uptime and performance, but very few validate whether the data itself remains consistent across platforms. And once data moves across multiple systems and integrations, each handoff becomes a potential point where something can go slightly wrong. None of these issues stand out individually, but together they create conditions where inconsistencies quietly accumulate.

Next Steps for Agencies

Criminal justice organizations don't need to overhaul their entire technology stack to strengthen data integrity. Instead, they can take practical, incremental steps that build resilience into existing systems while preparing for future modernization.

Establish a Baseline for Data Integrity: Map where data originates, how it moves, and where it is stored across multiple agency systems.
Implement Routine Cross-System Validation: Use Azure Data Factory, Azure SQL Data Sync, and Log Analytics queries to automate comparisons between operational and reporting systems.
Monitor for Corruption and Synchronization Failures: Enable corruption detection and configure automated notifications, similar to the low-to-critical corruption alerts I implemented.
Treat Failover and Migration as Integrity Events: Use Azure SQL Failover Groups and ADF pipelines to verify data consistency before and after transitions.
Strengthen Governance and Documentation: Use Microsoft Purview to track lineage, schema changes, and data ownership.
Build a Culture of Data Integrity: Encourage teams to treat data correctness as a shared responsibility across the organization.

Final Thoughts

Criminal justice information systems have made significant progress in availability, scalability, and security. But as these systems become more distributed and interconnected, data integrity, including corruption detection, is emerging as one of the most critical and least visible operational risks. The challenge is no longer simply ensuring systems stay online. It is ensuring that the data moving through them remains correct, consistent, and trustworthy across every system, agency, and workflow that depends on it. In environments where data directly impacts investigations, reporting, and compliance decisions, integrity must be engineered, validated, and continuously enforced with the same rigor applied to system availability and security.

Dynamic Data Masking – What it is, What it isn’t, and How to use it effectively
In this post, we'll explain the core purpose of Dynamic Data Masking (to ease application development), how it works, and its proper use cases, as well as its limitations. If you're considering using Dynamic Data Masking or reviewing your data security strategy, this information will help you make informed decisions.

What Dynamic Data Masking is designed for

Dynamic Data Masking (see "Dynamic Data Masking - SQL Server | Microsoft Learn") is a database feature that can be used to alter how certain data elements are presented in query results for users who do not have privileged access or the required permission. For example, a query on an email column may return a masked value such as jXXX@XXXX.com rather than the full address, depending on user permissions, while the original data remains unchanged in storage. Masking rules are defined within the database schema and are applied to query results for applicable users at runtime. This approach can simplify the application developer's job and reduce the need for application-level logic that modifies how sensitive values are displayed across different applications or reports. DDM can help prevent accidental or casual exposure of sensitive information.

How does DDM differ from other security features?

Dynamic Data Masking affects only what users see in query results; it does not protect the underlying data. Unlike encryption (see "Always Encrypted - SQL Server | Microsoft Learn") or Row-Level Security (see "Row-Level Security - SQL Server | Microsoft Learn"), DDM does not encrypt data, filter rows, or override SQL permissions. Users with elevated privileges (such as UNMASK, db_owner, or sysadmin) always see unmasked data or can modify or remove masking rules.
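The behavior described above can be sketched in a few statements. This is a minimal illustration, not a recommended configuration: the table dbo.DdmDemoCustomers and the user DdmDemoReader are hypothetical names invented for the example.

```sql
-- Hypothetical demo table with two masked columns.
CREATE TABLE dbo.DdmDemoCustomers (
    CustomerId int IDENTITY(1,1) PRIMARY KEY,
    Email      varchar(100) MASKED WITH (FUNCTION = 'email()') NOT NULL,
    Phone      varchar(20)  MASKED WITH (FUNCTION = 'partial(0,"XXX-XXX-",4)') NULL
);

INSERT INTO dbo.DdmDemoCustomers (Email, Phone)
VALUES ('jane@contoso.com', '555-123-4567');

-- Hypothetical low-privilege user that can read the table but lacks UNMASK.
CREATE USER DdmDemoReader WITHOUT LOGIN;
GRANT SELECT ON dbo.DdmDemoCustomers TO DdmDemoReader;

-- As the low-privilege user, the masking rules are applied to the result set.
EXECUTE AS USER = 'DdmDemoReader';
SELECT CustomerId, Email, Phone FROM dbo.DdmDemoCustomers;
REVERT;

-- Granting UNMASK lets the same user see the stored values.
GRANT UNMASK TO DdmDemoReader;
EXECUTE AS USER = 'DdmDemoReader';
SELECT CustomerId, Email, Phone FROM dbo.DdmDemoCustomers;
REVERT;

-- Cleanup.
DROP USER DdmDemoReader;
DROP TABLE dbo.DdmDemoCustomers;
```

The stored rows are identical in both queries; only the presentation changes with the caller's permissions, which is why DDM is a display control rather than a protection for the data itself.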
What DDM doesn't protect against

Because Dynamic Data Masking is applied when query results are returned, there are several considerations to be aware of:

Inference through queries: In some scenarios, users with database access may be able to make inferences about masked values by applying query filters or conditions that rely on the underlying stored data. The database still compares the real values under the hood, so these queries work. This is expected behavior given DDM's design.
Privileged users: Users who are granted sufficient database permissions, such as the ability to alter table schemas, can directly disable or remove masking. Users with sysadmin, db_owner, or CONTROL permission can view unmasked data. Controlling and auditing who holds such privileges is therefore vital.
Metadata visibility: Masking rules and the associated columns can be discovered through system metadata.
Data movement: Because masking is defined at the schema level in a given database instance, backups or exported datasets may contain unmasked values, depending on permissions and configuration.

Understanding these design characteristics is important when incorporating DDM into a broader data governance or privacy strategy.

Proper use and best practices for DDM

Organizations may consider using Dynamic Data Masking in scenarios where consistent display of sensitive values is needed across applications or reporting environments. Some implementation considerations include:

Using DDM to help standardize how sensitive fields are displayed in query results and reduce the development effort for data masking
Combining DDM with other database or access-control features as part of a layered data protection strategy
Reviewing which users are granted permissions to view unmasked data or alter masking configurations
Implementing auditing or monitoring of database activity as part of broader governance practices
Educating internal stakeholders on how masking operates at the query-result level
Testing masking configurations in non-production environments prior to deployment

Conclusion

Dynamic Data Masking can be useful in scenarios where organizations want to manage how sensitive data is displayed in application outputs without modifying stored values. It is designed to operate as part of a broader data access or governance approach rather than as a standalone protection mechanism for stored data. When implemented alongside complementary database features and appropriate access controls, DDM may help support more consistent handling of sensitive values across environments.

Stream data in near real time from SQL to Azure Event Hubs - Public preview
If near-real time integration is something you are looking to implement and you were looking for a simpler way to get the data out of SQL, keep reading. SQL is making it easier to integrate and Change Event Streaming is a feature continuing this trend. Modern applications and analytics platforms increasingly rely on event-driven architectures and real-time data pipelines. As the businesses speed up, real time decisioning is becoming especially important. Traditionally, capturing changes from a relational database requires complex ETL jobs, periodic polling, or third-party tools. These approaches often consume significant cycles of the data source, introduce operational overhead, and pose challenges with scalability, especially if you need one data source to feed into multiple destinations. In this context, we are happy to release Change Event Streaming ("CES") feature into Public Preview for Azure SQL Database. This feature enables you to stream row-level changes - inserts, updates, and deletes - from your database directly to Azure Event Hubs in near real time. Change Event Streaming addresses the above challenges by: Reducing latency: Changes are streamed (pushed by SQL) as they happen. This is in contrast with traditional CDC (change data capture) or CT (change tracking) based approaches, where an external component needs to poll SQL at regular intervals. Traditional approaches allow you to increase polling frequency, but it gets difficult to find a sweet spot between minimal latency and minimal overhead due to too frequent polls. Simplifying architecture: No need for Change Data Capture (CDC), Change Tracking, custom polling or external connectors - SQL streams directly to configured destination. This means simpler security profile (fewer authentication points), fewer failure points, easier monitoring, lower skill bar to deploy and run the service. No need to worry about cleanup jobs, etc. 
SQL keeps track of which changes are successfully received by the destination, handles the retry logic and releases the log truncation point. Finally, with CES you have fewer components to procure and get approved for production use. Decoupling: The integration is done at the database level. This eliminates the problem of dual writes - the changes are streamed at transaction boundaries, once your source of truth (the database) has saved the changes. You do not need to modify your app workloads to get the data streamed - you tap right into the data layer - which is useful if your apps are dated and do not possess real-time integration capabilities. In the case of some 3rd party apps, you may not even have an option to do anything other than database-level integration, and CES makes it simpler. Also, the publishing database does not concern itself with the final destination for the data - stream the data once to the common message bus, and it can be consumed by multiple downstream systems, irrespective of their number or capacity - the number of consumers does not affect the publishing load on the SQL side. Serving consumers is handled by the message bus, Azure Event Hubs, which is purpose-built for high-throughput data transfers. Figure: Conceptual view of the data flow from SQL Server, with an arrow towards Azure Event Hubs, from where a number of arrows point to different final destinations. Key Scenarios for CES Event-driven microservices: They need to exchange data, typically through a common message bus. With CES, you can have automated data publishing from each of the microservices. This allows you to trigger business processes immediately when data changes. Real-time analytics: Stream operational data into platforms like Fabric Real Time Intelligence or Azure Stream Analytics for quick insights. 
Breaking down the monoliths: Typical monolithic systems with complex schemas, sitting on top of a single database, can be broken down one piece at a time: create a new component (typically a microservice), set up streaming from the relevant tables in the monolith database, and tap into the stream from the new component. You can then test run the component, validate the results against the original monolith, and cut over once you are confident that the new component is stable. Cache and search index updates: Keep distributed caches and search indexes in sync without custom triggers. Data lake ingestion: Capture changes continuously into storage for incremental processing. Data availability: This is not a scenario per se, but the amount of data you can tap into for business process mining or intelligence in general goes up whenever you plug another database into the message bus. For example, you plug your eCommerce system into the message bus to integrate with shipping providers, and consequently the same data stream is immediately available for any other system to tap into. How It Works CES uses transaction log-based capture to stream changes with minimal impact on your workload. Events are published in a structured JSON format following the CloudEvents standard, including operation type, primary key, and before/after values. You can configure CES to target Azure Event Hubs via AMQP or Kafka protocols. For details on configuration, message format, and FAQs, see the official documentation: Feature Overview CES: Frequently Asked Questions Get Started Public preview CES is available today in public preview for Azure SQL Database and as a preview feature in SQL Server 2025. [update 20-mar-2026] Change Event Streaming is now in public preview for Azure SQL Managed Instance. Read more here. 
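To make the event format described under How It Works more concrete, here is a minimal Python sketch of how a downstream consumer might unpack one change event. Only the envelope attributes (specversion, id, source, type, time, datacontenttype, data) come from the CloudEvents specification. The names inside "data" (operation, table, primary_key, before, after) and the sample values are assumptions made for illustration, not the actual CES message schema, so check the official message format documentation before relying on any of them.

```python
import json

# Hypothetical sample event. The top-level attributes follow the CloudEvents
# spec; every field inside "data" is an ASSUMED shape for illustration only
# and is not taken from the CES documentation.
SAMPLE_EVENT = json.dumps({
    "specversion": "1.0",
    "type": "example.sql.rowchange",
    "source": "/example/server/db",
    "id": "evt-0001",
    "time": "2026-01-01T00:00:00Z",
    "datacontenttype": "application/json",
    "data": {
        "operation": "UPDATE",
        "table": "dbo.Orders",
        "primary_key": {"OrderId": 42},
        "before": {"Status": "Pending"},
        "after": {"Status": "Shipped"},
    },
})

# Attributes that the CloudEvents specification marks as required.
REQUIRED_ATTRIBUTES = ("specversion", "id", "source", "type")


def parse_change_event(raw):
    """Validate the CloudEvents envelope and return a flat summary dict."""
    event = json.loads(raw)
    missing = [name for name in REQUIRED_ATTRIBUTES if name not in event]
    if missing:
        raise ValueError(f"missing CloudEvents attributes: {missing}")
    data = event.get("data") or {}
    before = data.get("before") or {}
    after = data.get("after") or {}
    # Columns whose value differs between the before and after images.
    changed = sorted(
        col for col in set(before) | set(after)
        if before.get(col) != after.get(col)
    )
    return {
        "id": event["id"],
        "operation": data.get("operation"),
        "table": data.get("table"),
        "primary_key": data.get("primary_key"),
        "changed_columns": changed,
    }


summary = parse_change_event(SAMPLE_EVENT)
print(summary)
```

In a real consumer the raw string would come from an Azure Event Hubs client rather than a constant. The sketch leaves that part out on purpose so it stays runnable without any Azure dependency.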
Private preview CES is also available as a private preview for Azure SQL Managed Instance and Fabric SQL database: you can request to join the private preview by signing up here: https://aka.ms/sql-ces-signup We encourage you to try the feature out and start building real-time integrations on top of your existing data. We welcome your feedback. Please share your experience through the Azure Feedback portal or support channels. The comments below this blog post will also be monitored, if you want to engage with us. Finally, the CES team can be reached via email: sqlcesfeedback [at] microsoft [dot] com. Useful resources Free Azure SQL Database. Free Azure SQL Managed Instance.1.3KViews0likes0CommentsHow does GitHub Copilot in SSMS 22 handle database context collection before generating a response?
Hello, I am trying to better understand the internal workflow of GitHub Copilot in SSMS 22, especially for database-specific questions. From the product descriptions, it seems that Copilot can use the context of the currently connected database, such as schema, tables, columns, and possibly other metadata, when answering questions or generating T-SQL. However, I could not find clear official documentation about the actual sequence of operations. My main questions are: Before generating a response, does Copilot first collect database context/metadata from the active connection and then send that context to the LLM as grounding information? Or does it first use the LLM to interpret the user’s request, decide what information is needed, and then retrieve database metadata before generating the final answer? In some explanations, I have seen the phrase "Core SQL Copilot Infrastructure", but I cannot find any official documentation for that term. Is this an official component name? If so, what does it specifically refer to in the SSMS Copilot architecture? When Copilot answers schema-related or data-related questions, what information is retrieved automatically from the connected database, and is any SQL executed as part of that process? Is there any official architectural documentation that explains: context collection, prompt grounding, LLM invocation order, and whether query execution can occur before the final response is generated? I am asking because I want to understand the feature from both an architecture and data governance/security perspective. Any clarification from the product team or documentation links would be greatly appreciated. Thank you.47Views0likes0CommentsExpanding Azure Arc SQL Migration with a New Target: SQL Server on Azure Virtual Machines
Modernizing a SQL Server estate is rarely a single-step effort. It typically involves multiple phases, from discovery and assessment to migration and optimization, often spanning on-premises, hybrid, and cloud environments. SQL Server enabled by Azure Arc simplifies this process by bringing all migration steps into a single, cohesive experience in the Azure portal. With the March 2026 release, this integrated experience is extended by adding SQL Server on Azure Virtual Machines as a new migration target in Azure Arc. Arc-enabled SQL Server instances can now be migrated not only to Azure SQL Managed Instance, but also to SQL Server running on Azure infrastructure, using the same unified workflow. Expanding Choice Without Adding Complexity By introducing SQL Server on Azure Virtual Machines as a migration target, Azure Arc now supports a broader range of migration strategies while preserving a single operational model. It becomes possible to choose between Azure SQL Managed Instance and SQL Server on Azure VMs without fragmenting migration tooling or processes. The result is a flexible, scalable, and consistent migration experience that supports hybrid environments, reduces operational overhead, and enables modernization at a controlled and predictable pace. One Integrated Migration Journey A core value of SQL Server migration in Azure Arc is that the entire migration lifecycle is managed from one place. Once a SQL Server instance is enabled by Azure Arc, readiness can be assessed, a migration target selected, a migration method chosen, progress monitored, and cutover completed directly in the Azure portal. This approach removes the need for disconnected tools or custom orchestration. The only prerequisite remains unchanged: the source SQL Server needs to be enabled by Azure Arc. From there, migration is fully integrated into the Azure Arc SQL experience. 
A Consistent Experience Across Migration Targets The migration experience for SQL Server on Azure Virtual Machines follows the same model already available for Azure SQL Managed Instance migrations in Azure Arc. The same guided workflow, migration dashboard, and monitoring capabilities are used regardless of the selected target. This consistency is intentional. It allows teams to choose the destination that best fits their technical, operational, or regulatory requirements without having to learn a new migration process. Whether migrating to a fully managed PaaS service or to SQL Server on Azure infrastructure, the experience remains predictable and familiar. Backup Log Shipping Migration to SQL Server in Azure VM Migration to SQL Server on Azure Virtual Machines is based on backup and restore, specifically using a log shipping mechanism. This is a well-established approach for online migrations that minimizes downtime while maintaining control over the cutover window. In this model, database backups need to be uploaded from the source SQL Server to Azure Blob Storage. The migration engine restores the initial full backup, followed by ongoing transaction log and differential backups. Azure Blob Storage acts as the intermediary staging location between the source and the target. The Azure Blob Storage account and the target SQL Server running on an Azure Virtual Machine must be co-located in the same Azure region. This regional alignment is required to ensure efficient data transfer, reliable restore operations, and predictable migration performance. Within the Azure Arc migration experience, a simple and guided UX is used to select the Azure Blob Storage container that holds the backup files. Both the selected storage account and the Azure VM hosting SQL Server must reside in the same Azure region. Once the migration job is started, Azure Arc automatically restores the backup files to SQL Server on the Azure VM. 
As new log backups are uploaded to Blob Storage, they are continuously detected and applied to the target database, keeping it closely synchronized with the source. Controlled Cutover on Your Terms This automated restore process continues until the final cutover is initiated. When the cutover command is issued, Azure Arc applies the final backup to the target SQL Server on the Azure Virtual Machine and completes the migration. The target database is then brought online, and applications can be redirected to the new environment. This controlled cutover model allows downtime to be planned precisely, rather than being dictated by long-running restore operations. Getting started To get started, Arc-enable your SQL Server. Then, in the Azure portal, navigate to your Arc-enabled SQL Server and select Database migration under the Migration menu on the left. For more information, see the SQL Server migration in Azure Arc documentation.1.1KViews5likes0CommentsUnable to install SQL Server 2022 Express (installer glitch + SSMS error)
Hi, I recently purchased a new Lenovo laptop, and I am trying to install Microsoft SQL Server 2022 Express along with SSMS. SSMS installed successfully, but SQL Server installation fails, and sometimes the installer UI glitches or does not load properly. Because of this, I am getting connection errors in SSMS like "server not found" and "error 40". I am not very familiar with technical troubleshooting. Can someone guide me step-by-step in a simple way to install SQL Server correctly? Thank you.124Views0likes0Comments