Welcome to the June 2022 Azure Synapse Analytics update! This month, you will find information about the Azure Synapse Success by Design playbook and rerunning pipelines with new parameters. Additional general updates and new features in SQL, Synapse Data Explorer, Data Integration, and Machine Learning are also below.
Don’t forget to check out our companion video!
We now offer an Azure Orbital analytics sample solution showing an end-to-end implementation of extracting, loading, transforming, and analyzing spaceborne data by using geospatial libraries and AI models with Azure Synapse Analytics. The sample solution also demonstrates how to integrate geospatial-specific Azure Cognitive Services models, AI models from partners, and bring-your-own-data models.
Spaceborne data originating from remote sensors is characterized by high volume, high velocity, and a high degree of variety. Democratizing such a unique type of spatial big data calls for a highly scalable, cloud-based solution that generates insights with artificial intelligence. The solution leverages cloud serverless compute, cost-efficient storage, and high availability across regions to enable an integrated workflow.
Architecture diagram: the reference architecture for this workflow.
Combining Azure Orbital analytics with Synapse Analytics in this workflow is ideal for the aerospace and aircraft industries, simplifying several common geospatial analysis tasks.
We’ll explore and walk through some of these scenarios in a future blog series. Stay tuned.
To learn more about the Azure Orbital analytics workflow, read Spaceborne data analysis with Azure Synapse Analytics.
To learn more about the team behind this solution, visit Azure Space.
Project success is no accident and requires careful planning and execution. The Azure Synapse Success by Design playbooks are now available on Microsoft Docs. The Azure Synapse proof of concept playbook provides a guide to scope, design, execute, and evaluate a proof of concept for SQL or Spark workloads. These guides capture best practices from the most challenging and complex solution implementations incorporating Azure Synapse.
If you're working on implementing a Synapse Analytics solution, plug the Implementation Success method right into your existing project plan or use it as a guide. You will find key checkpoints at every step of the implementation process to validate the end-to-end design for workspaces, security, data integration, SQL, and Spark pools.
To learn more about the Azure Synapse proof of concept playbook, read Success by Design.
We know that you turn to Azure Synapse Analytics to work with large amounts of data. With that in mind, the maximum size of query result sets in serverless SQL pools has been increased from 200 GB to 400 GB. This limit is shared between concurrent queries.
To learn more about this size limit increase and other constraints, read Self-help for serverless SQL pool.
The new Synapse Web Explorer homepage makes it even easier to get started with Synapse Web Explorer by organizing resources into several new sections.
A great way to learn about a product is to see how others use it. The Web Explorer sample gallery provides end-to-end samples of how customers leverage Synapse Data Explorer for popular use cases such as logs data, metrics data, IoT data, and basic big data examples. Each sample includes the dataset, well-documented queries, and a sample dashboard.
Navigate to the sample gallery from the Web Explorer homepage. From the homepage’s Get Started section, go to Explore sample data with KQL to access the sample queries or go to Explore sample dashboards to open the sample dashboards.
To access the sample gallery, you need either an Azure Active Directory (Azure AD) user identity or a Microsoft account (MSA).
To learn more about the sample gallery, read Azure Data Explorer in 60 minutes with the new samples gallery.
You can now add drill through capabilities to your Synapse Web Explorer dashboards. The new drill through capabilities allow you to easily jump back and forth between dashboard pages. This is made possible by using a contextual filter to connect your dashboards. Defining these contextual drill throughs is done by editing the visual interactions of the selected tile in your dashboard.
To learn more about drill through capabilities, read Use drillthroughs as dashboard parameters.
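Conceptually, a drill through carries the value you selected on one dashboard page into another page as a contextual filter. The following Python sketch is purely illustrative (the function and page names are made up, not the dashboard's real API); it shows the idea of a target page parameterized by the selected context:

```python
def drill_through(target_page: str, context: dict) -> str:
    """Build a link to a target dashboard page that carries the
    selected values along as contextual filters (illustrative only)."""
    filters = "&".join(f"{key}={value}" for key, value in context.items())
    return f"{target_page}?{filters}"

# Jumping from an overview tile to a details page, filtered to the
# device the user clicked on.
link = drill_through("device-details", {"DeviceId": "sensor-42"})
print(link)  # device-details?DeviceId=sensor-42
```

In the Web Explorer itself, you define this connection by editing the visual interactions of the selected tile rather than writing any code.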
Being able to display data in different time zones is very powerful. You can now view data in UTC, your local time zone, or the time zone of the monitored device or machine. The Time Zone settings of the Web Explorer now apply to both query results and dashboards. When you change the time zone, dashboards are automatically refreshed to present the data in the selected time zone.
For more information on time zone settings, read Change datetime to specific time zone.
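The Web Explorer applies this conversion for you; under the hood it is the same operation as converting a UTC timestamp into a target zone. A minimal Python sketch using the standard-library `zoneinfo` module (the timestamp and zone names are example values):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# A sample UTC timestamp, as telemetry is typically stored.
utc_ts = datetime(2022, 6, 15, 12, 0, tzinfo=timezone.utc)

# The same instant rendered in a viewer's local zone and in the
# zone of the monitored device.
viewer_local = utc_ts.astimezone(ZoneInfo("America/Los_Angeles"))
device_time = utc_ts.astimezone(ZoneInfo("Asia/Tokyo"))

print(viewer_local.isoformat())  # 2022-06-15T05:00:00-07:00
print(device_time.isoformat())   # 2022-06-15T21:00:00+09:00
```

Note that only the display changes; the underlying instant in time is identical in all three views.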
Fuzzy matching with a sliding similarity score option has been added to the Join transformation in Mapping Data Flows. You can create inner and outer joins on data values that are similar rather than exact matches! Previously, you would have had to use an exact match. The sliding scale value goes from 60% to 100%, making it easy to adjust the similarity threshold of the match.
To learn more about fuzzy joins, read Join transformation in mapping data flow.
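To see what a similarity-threshold join means in practice, here is a small Python sketch using the standard-library `difflib` ratio as the similarity score. This is an illustration of the concept, not the algorithm Mapping Data Flows uses internally, and all names and sample values are made up:

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Similarity of two strings as a ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def fuzzy_inner_join(left, right, threshold=0.6):
    """Pair rows whose join keys are at least `threshold` similar,
    instead of requiring an exact match."""
    return [
        (l, r)
        for l in left
        for r in right
        if similarity(l, r) >= threshold
    ]

matches = fuzzy_inner_join(
    ["Contoso Ltd", "Fabrikam Inc"],
    ["Contoso Limited", "Northwind Traders"],
    threshold=0.6,
)
print(matches)  # [('Contoso Ltd', 'Contoso Limited')]
```

Raising the threshold toward 1.0 approaches an exact-match join; lowering it toward 0.6 admits looser matches, which mirrors the 60%-to-100% sliding scale in the transformation.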
We’re excited to announce that the Map Data tool is now Generally Available. The Map Data tool is a guided process to help you create ETL mappings and mapping data flows from your source data to Synapse without writing code.
To learn more about Map Data, read Map Data in Azure Synapse Analytics.
You can now change pipeline parameters when re-running a pipeline from the Monitoring page without having to return to the pipeline editor. After running a pipeline with new parameters, you can easily monitor the new run against the old ones without having to toggle between pages.
Note that a pipeline re-run with new parameters is treated as a new pipeline run and will not appear under re-run groupings.
To learn more about rerunning pipelines with new parameters, read Rerun pipelines and activities.
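Conceptually, a re-run with new parameters starts from the original run's parameter set and applies only the values you changed in the Monitoring page. A Python sketch of that merge (illustrative only; the function and parameter names are invented, and Synapse performs this for you in the UI):

```python
def build_rerun(original_params: dict, overrides: dict) -> dict:
    """A re-run keeps the original run's parameters and applies
    any overridden values on top (illustrative only)."""
    return {**original_params, **overrides}

first_run = {"source_folder": "raw/2022-06", "batch_size": 500}
rerun = build_rerun(first_run, {"batch_size": 1000})
print(rerun)  # {'source_folder': 'raw/2022-06', 'batch_size': 1000}
```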
We’re excited to announce that user defined functions (UDFs) are now Generally Available. With user-defined functions, you can create customized expressions that can be reused across multiple mapping data flows. You no longer have to use the same string manipulation, math calculations, or other complex logic several times. User-defined functions will be grouped in libraries to help developers group common sets of functions.
To learn more about user defined functions, read User defined functions in mapping data flows.
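The payoff of UDFs is the same as factoring repeated expressions into named, reusable functions in any language. The following Python sketch mirrors the idea (the function names and sample data are illustrative, not mapping data flow syntax):

```python
# A small "library" of reusable expressions, analogous to grouping
# mapping data flow UDFs into a library shared across flows.

def clean_name(raw: str) -> str:
    """String manipulation written once: collapse whitespace,
    normalize casing."""
    return " ".join(raw.split()).title()

def pct_change(old: float, new: float) -> float:
    """Math expression written once: percentage change from old to new."""
    return (new - old) / old * 100.0

# Reused wherever needed, instead of duplicating the logic per flow.
print(clean_name("  ada   LOVELACE "))       # Ada Lovelace
print(round(pct_change(200.0, 250.0), 1))    # 25.0
```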
To simplify the process for creating and managing GPU-accelerated pools, Azure Synapse takes care of pre-installing low-level libraries and setting up all the complex networking requirements between compute nodes. This integration allows users to get started with GPU-accelerated pools within just a few minutes.
Now, Azure Synapse Analytics provides built-in support for deep learning infrastructure. The Azure Synapse Analytics runtime for Apache Spark 3.1 and 3.2 now includes support for the most common deep learning libraries like TensorFlow and PyTorch. The Azure Synapse runtime also includes supporting libraries like Petastorm and Horovod, which are commonly used for distributed training. This feature is currently available in Public Preview.
To learn more about how to leverage these libraries within your Azure Synapse Analytics GPU-accelerated pools, read the Deep learning tutorials.
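One core idea behind distributed training libraries like Horovod is that each worker process trains on its own shard of the data while gradients are averaged across workers. The partitioning step can be sketched in plain Python (a conceptual illustration only, with invented names, not Horovod's actual API):

```python
def shard(dataset, rank: int, world_size: int):
    """Return the slice of `dataset` assigned to worker `rank`
    out of `world_size` workers, using a strided partition."""
    return dataset[rank::world_size]

data = list(range(10))
world_size = 4
shards = [shard(data, r, world_size) for r in range(world_size)]
print(shards)  # [[0, 4, 8], [1, 5, 9], [2, 6], [3, 7]]
```

Every record lands in exactly one shard, so no two workers train on the same example in an epoch; frameworks such as Horovod and Petastorm handle this partitioning, plus gradient averaging and efficient Parquet data loading, for you.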