Data Science
117 Topics

AI-900: Microsoft Azure AI Fundamentals Study Guide
This comprehensive study guide provides a thorough overview of the topics covered in the Microsoft Azure AI Fundamentals (AI-900) exam, including Artificial Intelligence workloads, fundamental principles of machine learning, and computer vision and natural language processing workloads. Learn about the exam's intended audience, how to earn the certification, and the skills measured as of April 2022. Discover the important considerations for responsible AI, the capabilities of Azure Machine Learning Studio, and more. Get ready to demonstrate your knowledge of AI and ML concepts and related Microsoft Azure services with this helpful study guide.
38K views, 11 likes, 3 comments

Introducing Advanced KQL Free Online Course
Do you analyze data in ADX? User Analytics? Geographic? Machine Learning? Time Series Analysis? Anomaly Detection? You have come to the right place! You are welcome to take our new Advanced KQL course, which will boost your knowledge and skills for performing advanced analytics in ADX.
6.8K views, 7 likes, 0 comments

Bring Vision to Life with Three Horizons, Data Mesh, Data Lakehouse, and Azure Cloud Scale Analytics
Bring Vision to Life with Three Horizons, Data Mesh, Data Lakehouse, and Azure Cloud Scale Analytics – Plus some bonus concepts! I have not posted in a while, so this post is loaded with ideas and concepts to think about. I hope you enjoy it! The post follows four recent events in my life in chronological order: 1) camping on the Olympic Peninsula in WA state, 2) installing new windows and external doors in my house, 3) injuring my back (includes a metaphor for how things change over time), and 4) camping at Kayak Point in Stanwood, WA (where I finished writing this). Along with this series of events bookended by camping trips, I also wanted to mention May 1st, which was International Workers' Day (celebrated as Labor Day in September in the US and Canada).
To reach the vision of digital transformation through cloud scale analytics, we need many more workers (Architects, Developers, DBAs, Data Engineers, Data Scientists, Data Analysts, Data Consumers) and the support of many managers and leaders. Leadership is required so that analytical systems can become more distributed and properly staffed to scale, in contrast to centralized, small specialist teams that do not scale. Analytics could be a catalyst for employment as the building and operating of analytical systems accelerates. There is evidence that the teams working on these analytical systems will need to be more distributed to scale to the level of growth required. In data management, Data Mesh strives to be more distributed, and Data Lakehouse supports distributed architectures better than the analytical systems of the past. I am optimistic that cloud-based analytical systems supported by these distributed concepts can scale and progress to meet the data management, data engineering, data science, data analysis, and data consumer needs of many organizations.
22K views, 6 likes, 1 comment

Introducing Data Science for Beginners
Our team of Azure Cloud Advocates, Program Managers, and Student Ambassadors is pleased to bring a new addition to the For Beginners Curriculum series: Data Science for Beginners. Data Science for Beginners is a free, MIT-licensed open-source curriculum of 20 lessons that focuses on the foundations of Data Science and requires no prior knowledge to get started.
14K views, 6 likes, 2 comments

Discover the Future of Data Engineering with Microsoft Fabric for Technical Students & Entrepreneurs
Microsoft Fabric is an all-in-one analytics solution for enterprises that covers everything from data movement to data science, Real-Time Analytics, and business intelligence. It offers a comprehensive suite of services, including data lake, data engineering, and data integration, all in one place. This makes it an ideal platform for technical students and entrepreneurial developers looking to streamline their data engineering and analytics workflows.
6.1K views, 4 likes, 1 comment

Automated Continuous integration and delivery – CICD in Azure Data Factory
In Azure Data Factory, continuous integration and delivery (CI/CD) means moving Data Factory pipelines between environments such as development, test, UAT, and production. The process uses Azure Resource Manager (ARM) templates to store the configuration of ADF entities, including pipelines, datasets, and data flows. This article provides a step-by-step guide to automating deployments using the integration between Data Factory and Azure Pipelines.

Prerequisites
Azure Data Factory: multiple ADF environments set up for the different stages of development and deployment.
Azure DevOps: the platform for managing code repositories, pipelines, and releases.
Git integration: ADF connected to a Git repository (Azure Repos or GitHub).
Permissions: the ADF contributor and Azure DevOps build administrator permissions are required.

Step 1
Establish a dedicated Azure DevOps Git repository for Azure Data Factory within the designated Azure DevOps project.

Step 2
Integrate Azure Data Factory (ADF) with the Azure DevOps Git repository created in Step 1.

Step 3
Create a developer feature branch in the Azure DevOps Git repository created in Step 1. Select that feature branch in ADF to start development.

Step 4
Begin the development process. For this example, I create a test pipeline “pl_adf_cicd_deployment_testing” and save all.

Step 5
Submit a pull request from the developer feature branch to main.

Step 6
Once the pull request is merged from the developer's feature branch into the main branch, publish the changes from the main branch to the ADF publish branch. The ARM templates (JSON files) are then brought up to date and are available in the adf_publish branch of the Azure DevOps ADF repository.

Step 7
ARM templates can be customized to accommodate different configurations for Development, Testing, and Production environments.
This customization is typically done through the ARMTemplateParametersForFactory.json file, where you specify environment-specific values such as linked services, environment variables, and managed links. For example, in a Testing environment the storage account might be named teststorageaccount, whereas in a Production environment it could be prodstorageaccount.

To create an environment-specific parameters file:
In the Azure DevOps ADF Git repo, go to main branch > linkedTemplates folder and copy “ARMTemplateParametersForFactory.json”.
Create a parameters_files folder under the root path.
Paste ARMTemplateParametersForFactory.json into the parameters_files folder and rename it to identify the environment, for example prod-adf-parameters.json.
Update the parameter values for each environment.

Step 8
To create an Azure DevOps CI/CD pipeline, use the following code, and update the variables to match your environment before running it. This lets you deploy from one ADF environment to another, such as from Test to Production.
name: Release-$(rev:r)

trigger:
  branches:
    include:
      - adf_publish

variables:
  azureSubscription: <Your subscription>
  SourceDataFactoryName: <Test ADF>
  DeployDataFactoryName: <PROD ADF>
  DeploymentResourceGroupName: <PROD ADF RG>

stages:
  - stage: Release
    displayName: Release Stage
    jobs:
      - job: Release
        displayName: Release Job
        pool:
          vmImage: 'windows-2019'
        steps:
          - checkout: self

          # Stop ADF triggers in the target factory before deploying
          - task: AzurePowerShell@5
            displayName: Stop Triggers
            inputs:
              azureSubscription: '$(azureSubscription)'
              ScriptType: 'InlineScript'
              Inline: |
                $triggersADF = Get-AzDataFactoryV2Trigger -DataFactoryName "$(DeployDataFactoryName)" -ResourceGroupName "$(DeploymentResourceGroupName)"
                if ($triggersADF.Count -gt 0) {
                  $triggersADF | ForEach-Object { Stop-AzDataFactoryV2Trigger -ResourceGroupName "$(DeploymentResourceGroupName)" -DataFactoryName "$(DeployDataFactoryName)" -Name $_.name -Force }
                }
              azurePowerShellVersion: 'LatestVersion'

          # Deploy ADF using the ARM template and the environment-specific JSON parameters
          - task: AzurePowerShell@5
            displayName: Deploy ADF
            inputs:
              azureSubscription: '$(azureSubscription)'
              ScriptType: 'InlineScript'
              Inline: |
                New-AzResourceGroupDeployment `
                  -ResourceGroupName "$(DeploymentResourceGroupName)" `
                  -TemplateFile "$(System.DefaultWorkingDirectory)/$(SourceDataFactoryName)/ARMTemplateForFactory.json" `
                  -TemplateParameterFile "$(System.DefaultWorkingDirectory)/parameters_files/prod-adf-parameters.json" `
                  -Mode "Incremental"
              azurePowerShellVersion: 'LatestVersion'

          # Restart ADF triggers in the target factory after deploying
          - task: AzurePowerShell@5
            displayName: Restart Triggers
            inputs:
              azureSubscription: '$(azureSubscription)'
              ScriptType: 'InlineScript'
              Inline: |
                $triggersADF = Get-AzDataFactoryV2Trigger -DataFactoryName "$(DeployDataFactoryName)" -ResourceGroupName "$(DeploymentResourceGroupName)"
                if ($triggersADF.Count -gt 0) {
                  $triggersADF | ForEach-Object { Start-AzDataFactoryV2Trigger -ResourceGroupName "$(DeploymentResourceGroupName)" -DataFactoryName "$(DeployDataFactoryName)" -Name $_.name -Force }
                }
              azurePowerShellVersion: 'LatestVersion'

Triggering the Pipeline
The Azure DevOps CI/CD pipeline triggers automatically whenever changes reach the adf_publish branch, which happens when changes merged into the main branch are published from ADF (Step 6). It can also be started manually or set to run on a schedule for periodic deployments, so that updates are deployed consistently.

Monitoring and Rollback
To monitor pipeline execution, use the Azure DevOps pipeline dashboards. If a rollback is necessary, you can revert to previous versions of the ARM templates or pipelines using Azure DevOps and redeploy the changes.
1.6K views, 3 likes, 1 comment
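
The environment-specific parameter files described in Step 7 of the ADF CI/CD article can also be produced by a small script instead of by hand. The following is a minimal Python sketch, not part of the original article. It assumes the copied ARMTemplateParametersForFactory.json follows the usual ARM parameters layout (a top-level "parameters" object whose entries each hold a "value"). The function names and the storageAccountName parameter are hypothetical and used only for illustration.

```python
import copy
import json
from pathlib import Path


def build_env_parameters(base: dict, overrides: dict) -> dict:
    """Return a copy of an ARM parameters document with selected values replaced.

    `base` is the parsed content of ARMTemplateParametersForFactory.json.
    `overrides` maps parameter names to their environment-specific values.
    """
    result = copy.deepcopy(base)  # leave the original document untouched
    params = result.setdefault("parameters", {})
    for name, value in overrides.items():
        if name not in params:
            # Fail loudly on typos instead of silently adding a new parameter.
            raise KeyError(f"Unknown parameter: {name}")
        params[name]["value"] = value
    return result


def write_env_file(base_path: Path, out_dir: Path, env: str, overrides: dict) -> Path:
    """Write <env>-adf-parameters.json into out_dir and return its path."""
    base = json.loads(base_path.read_text(encoding="utf-8"))
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{env}-adf-parameters.json"
    document = build_env_parameters(base, overrides)
    out_path.write_text(json.dumps(document, indent=2), encoding="utf-8")
    return out_path


if __name__ == "__main__":
    # Hypothetical sample content; real files contain the parameters ADF generated.
    sample = {
        "contentVersion": "1.0.0.0",
        "parameters": {
            "factoryName": {"value": "test-adf"},
            "storageAccountName": {"value": "teststorageaccount"},
        },
    }
    prod = build_env_parameters(sample, {"storageAccountName": "prodstorageaccount"})
    print(json.dumps(prod, indent=2))
```

Calling write_env_file(Path("ARMTemplateParametersForFactory.json"), Path("parameters_files"), "prod", {...}) would create parameters_files/prod-adf-parameters.json, the file name the deployment step in the pipeline expects. Rejecting unknown parameter names is a deliberate choice here: a misspelled override would otherwise leave the Test value in place and only surface after deployment.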