Synapse CSE
31 Topics

Synapse – Data Lake vs. Delta Lake vs. Data Lakehouse
As data engineers, we often hear terms like Data Lake, Delta Lake, and Data Lakehouse, which can be confusing at times. In this blog we'll demystify these terms and discuss the differences between these technologies and concepts, along with usage scenarios for each.
52K Views, 14 likes, 0 Comments

Data mesh: A perspective on using Azure Synapse Analytics to build data products
This multi-part blog series discusses various aspects of implementing a data mesh architecture on Azure. This part focuses on the data-as-a-product principle and presents a perspective on using Azure Synapse Analytics as a data product. We discuss, at a high level, data product functions and capabilities and apply that lens to Synapse Analytics. We also discuss how workspaces can be partitioned to give domains the scale and agility to build data products.
18K Views, 11 likes, 4 Comments

Azure Synapse analytics (dedicated SQL pool) data modelling best practices
In this article, I will discuss how to physically model an Azure Synapse Analytics data warehouse when migrating from an existing on-premises MPP (Massively Parallel Processing) data warehouse solution such as Teradata or Netezza.
13K Views, 10 likes, 6 Comments

Synapse Spark - Encryption, Decryption and Data Masking
As data engineers, we often get requirements to encrypt, decrypt, mask, or anonymize certain columns of data in files sitting in the data lake when preparing and transforming data with Apache Spark. Spark's extensibility allows us to use libraries that are not native to Spark. One such library is Microsoft Presidio, which provides fast identification and anonymization modules for private entities in text such as credit card numbers, names, locations, social security numbers, bitcoin wallets, US phone numbers, financial data, and more. It facilitates both fully automated and semi-automated PII (Personally Identifiable Information) de-identification and anonymization flows on multiple platforms.
9.4K Views, 7 likes, 2 Comments

Creating a custom disaster recovery plan for your Synapse workspace Part 1
Many of our customers have been asking about creating a disaster recovery plan for their Synapse workspace. In a new blog series, we will cover the basics of disaster recovery and business continuity, discussing available options and custom solutions.
15K Views, 6 likes, 1 Comment

Introduction to Kusto Query Language (KQL)
Kusto Query Language (KQL) is a powerful query language for analysing large volumes of structured, semi-structured, and unstructured (free-text) data. Its built-in operators and functions let you analyse data to find trends, patterns, and anomalies, create forecasts, and apply machine learning.
29K Views, 5 likes, 3 Comments