Welcome to the Azure Synapse January 2022 update! Our first blog of the year includes newly added database templates, a security whitepaper, and data integration updates. For the first time, we will also feature a companion video that you can watch for a quick rundown of the key updates.
Help us improve this monthly blog: we would love to hear from you about how best to engage and inform you! Leave a comment below.
We’ve seen so much enthusiasm for and adoption of the 11 Synapse database templates during public preview that 4 additional templates have recently been added. You can now access Automotive, Genomics, Manufacturing, and Pharmaceuticals templates in Azure Synapse. See them either in the gallery or by creating a new lake database from the tab and selecting + Table and then From template.
Learn more by reading Four Additional Azure Synapse Database Templates Now Available in Public Preview
The release of the Synapse ML library v0.9.5 (previously called MMLSpark) simplifies the creation of massively scalable machine learning pipelines with Apache Spark. It unifies several existing ML frameworks and new Microsoft algorithms in a single, scalable API that’s usable across Python, R, Scala, and Java. This update includes support for the following new capabilities:
Learn more by reading the full release notes or visit the SynapseML homepage to get started.
We just published a white paper that provides a comprehensive overview of the enterprise-grade, industry-leading security capabilities of Azure Synapse Analytics and how they address common security concerns. The whitepaper covers the five layers of security: Authentication, Access Control, Data Protection, Network Security, and Threat Protection. Use this reference document to understand each security feature and to implement an industry-standard security baseline to protect your data in the cloud.
Learn more by reading Azure Synapse Analytics security white paper: Introduction
Starting in December 2021, TLS 1.2 is required for newly created Synapse workspaces. TLS 1.2 provides enhanced security to safeguard against exploits. Login attempts to newly created Synapse workspaces from connections using TLS versions lower than 1.2 will fail.
Learn more by reading Azure Synapse Analytics connectivity settings
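If your client library exposes TLS settings, you can make the same requirement explicit on your side. As an illustrative sketch (not Synapse-specific), Python’s standard ssl module can enforce a TLS 1.2 floor on outbound connections:

```python
import ssl

# Build a client-side TLS context that refuses anything older than TLS 1.2,
# mirroring the minimum version enforced by newly created Synapse workspaces.
context = ssl.create_default_context()
context.minimum_version = ssl.TLSVersion.TLSv1_2

# Any socket wrapped with this context will fail the handshake if the server
# (or an interposed proxy) only offers TLS 1.0 or 1.1.
```

Most Azure client SDKs already negotiate TLS 1.2 or later by default; this sketch only matters for custom or legacy connection code.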
You can now easily add data quality, data validation, and schema validation to your Synapse ETL jobs by using the Assert transformation in Synapse data flows. Add expectations to your data streams that execute from the pipeline data flow activity to evaluate whether each row or column in your data meets your assertion. Tag rows as pass or fail and add row-level details about how a constraint was breached. This is a critical addition to an already effective ETL framework, helping to ensure that you are loading and processing quality data for your analytical solutions.
Learn more by reading Assert transformation in mapping data flow
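The Assert transformation itself is configured visually in the Synapse data flow designer, but the underlying idea — evaluate a per-row expectation, tag each row pass/fail, and attach a breach description — can be sketched in plain Python. The rows and the constraint below are hypothetical:

```python
# Hypothetical input rows; the second violates the constraint below.
rows = [
    {"order_id": 1, "amount": 250.0},
    {"order_id": 2, "amount": -40.0},
]

def assert_non_negative_amount(row):
    """Tag a row pass/fail against the expectation amount >= 0,
    keeping row-level detail about how the constraint was breached."""
    ok = row["amount"] >= 0
    return {
        **row,
        "assert_passed": ok,
        "assert_detail": None if ok else f"amount {row['amount']} is negative",
    }

tagged = [assert_non_negative_amount(r) for r in rows]
failures = [r for r in tagged if not r["assert_passed"]]
```

In a real data flow you would route the failed rows to a separate sink or fail the pipeline activity, rather than filtering them in application code.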
Synapse data flows can now read and write data directly to Dynamics through the new data flow Dynamics connector. Create datasets in data flows to read, transform, aggregate, and join data, and then write it back into Dynamics using the built-in Synapse Spark compute.
Learn more by reading Native data flow connector for Dynamics
It’s here! This much anticipated update adds IntelliSense to expression editing, making it super easy for you to create new expressions, check your expression syntax, find functions, and add code to your pipelines.
Learn more by reading IntelliSense support in Expression Builder for more productive pipeline authoring experiences
Automatic schema discovery, along with the auto-table creation process, makes it easy to automatically map and load complex data types from Parquet files, such as arrays and maps, into dedicated SQL pools in Synapse. Rowgroup compression is automatically enabled when you enable the auto-create table option within the COPY command. Start taking advantage of these features today to simplify data ingestion with Azure Synapse Analytics!
Learn more by reading how GitHub used this functionality in Introducing Automatic Schema Discovery with auto table creation for complex datatypes
SQL pools now support the HASHBYTES function! HASHBYTES is a T-SQL function that hashes values, and you can now use it in queries that read data using external tables and the OPENROWSET function:
SELECT TOP 100
    HASHBYTES('sha2_256', vendorID) AS hashedVendorID,
    vendorID
FROM OPENROWSET(
    BULK 'https://azureopendatastorage.blob.core.windows.net/nyctlc/yellow/puYear=2019/puMonth=1/*.parquet',
    FORMAT = 'parquet'
) AS [result];
HASHBYTES returns the hash of its input value and now supports the following algorithms: MD2, MD4, MD5, SHA, SHA1, and SHA2.
Learn more by reading about HASHBYTES
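Because HASHBYTES('sha2_256', ...) computes a standard SHA-256 digest, you can reproduce the same hash outside SQL when validating results. A minimal sketch using Python’s hashlib on a sample string value (the value and encoding here are illustrative; T-SQL hashes the raw bytes of its input, so an nvarchar column would be UTF-16LE encoded rather than UTF-8):

```python
import hashlib

# SHA-256 of a sample string value, analogous to
# HASHBYTES('sha2_256', ...) applied to the same bytes in T-SQL.
value = "1"  # e.g. a hypothetical vendorID value
digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
```

Matching digests across SQL and application code is a quick way to confirm both sides hashed identical bytes.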