#14DaysOfDataScience: A Developer Tools & AI Workshop
Published Apr 01 2024 12:00 AM
Microsoft

In previous posts, we introduced our #14DaysOfDataScience content collection and celebrated Data Science Day, a full day of talks and learning sessions from Microsoft and community experts. Today, we'll continue the journey with an actionable roadmap and an associated workshop that put those concepts into practice using the right developer tools and AI. Let's dive in!


 

Week 1:  Data Science Fundamentals

We started the journey in Week 1 by exploring the foundations of data science with a quick review of core concepts like Responsible AI, Machine Learning (Supervised and Unsupervised) and the Data Science Lifecycle. And we ended that week with a note on Data Science Developer Experience and the importance of using the right tools for coding and analysis, to streamline your development workflow. Here are the three core resources you need to know:


Mar 14: Data Science Day

We spent Pi Day with a full day of talks and tutorials from community and industry experts, exploring a wide range of topics from data analysis to machine learning and AI. I gave a talk on one of my favorite topics - "Simplifying Data Analysis & Visualization with Developer Tools and AI" - which sets the stage perfectly for this Developer Tools Week on #14DaysOfDataScience. My goal was to provide a learning roadmap with actionable labs to help you go from concepts to code, and learn by doing. Here are the three links to know:

 

 

Mindset: Goal-Oriented Learning

The goal for Week 2 of #14DaysOfDataScience was to follow up on the talk with a week-long tour of relevant Developer Tools and AI capabilities in a hands-on workshop format. Before we dive into details, let's talk about the audience and learning mindset for this week. My target is someone familiar with coding and development workflows but new to data science or Python programming. You are a self-driven learner, so all you need are resources and a roadmap to get you going. My mindset is one of goal-oriented learning - focusing on learning just what I need to learn, for what I need to do next, so I get that sense of accomplishment. Don't boil the ocean. Instead, stay focused and keep making steady progress, and your expertise will grow organically while providing immediate value to your team.

 

Motivation: Why Developer Tools & AI?

From a goal-oriented learning mindset, we can see the learning journey as having three challenges:

  1. Is it Frictionless? How quickly can I become productive without wasting time on "setup" that does not actually build my knowledge in that domain? How many times have we heard others say "It works on my machine" when we are struggling? We need setup options that are fast, painless, and work the same for everyone.
  2. Am I Focused? How can I stay on task without getting distracted? How many times have you thought "let me just google that..." and lost hours web-surfing, only to realize you forgot the original question and still have no solution? We need inline tools that give us answers without breaking our state of flow.
  3. Is it Friendly? How can I get help from others when I am stuck and need debugging insights? How many times have you said "here's a screenshot of my error," knowing it's hard for others to figure it out without being able to reproduce your actions or see the context? We need tools that help us capture, share, and replicate our work for seamless collaboration.

This is where Developer Tools become important - from getting consistent and fast setup to having shareable and reproducible development environments with inline support for keeping our attention on task. But where does AI fit into this picture? Let's talk about "gaps" in our learning.

  • Knowledge Gaps - where we know what we don't know - and just need to learn it.
  • Intuition Gaps - where we don't know what we don't know - so how can we even start?

Developer Tools (with an actionable roadmap) can help reduce the knowledge gap - just plan and execute. But how do I expand my awareness of a topic while staying goal-oriented in my learning? This is where AI can really help, by offering contextually relevant suggestions and follow-up recommendations that grow our expertise along the way.

 

Week 2: Developer Tools & AI

So, what can you learn in Developer Tools Week on #14DaysOfDataScience? Let's review the 7-part series that will take you from setting up your development environment to using AI for intuitive data visualization with natural language.

  1. GitHub Codespaces | Learn how to use dev containers with GitHub Codespaces to get fast and consistent development environments using pre-defined configurations. Use the codespaces-jupyter template for an interactive and reproducible notebook experience.
  2. Visual Studio Code | Create Data Science profiles for Visual Studio Code, enhancing your productivity and making your editor configuration shareable with others. Learn about core Visual Studio Code extensions like Data Wrangler, which enhances data cleaning with a low-code interface that generates Python code you can import into notebooks for code-first use.
  3. GitHub Copilot | Think of this as an AI coding assistant that can improve your productivity with proactive suggestions (auto-complete) and reactive responses (interactive chat) for an inline experience that keeps you focused on task. Ask GitHub Copilot to explain unfamiliar concepts or click a suggested follow-up question to organically build your intuition on a topic without leaving the editor!
  4. Open Datasets | Want to practice your growing data analysis skills with these tools? You are going to need datasets and inspiration. Look no further than this post where we explore open datasets - curated, public resources - and see how the related communities can guide your exploration. 
  5. Responsible AI | Data analysis leads to machine learning models that drive AI algorithms, which may be used for decision-making at scale. It's important to make sure that the resulting actions cause no harm and behave in a consistent and fair manner for all users. Learn Responsible AI principles and how to use related tooling for model debugging and decision-making in your workflows.
  6. Project LIDA | Till now we've used known developer tools and practices to explore data analysis and visualization. But what if you could just "chat" with your data and ask for insights and visualizations without having to write a single line of code? We'll explore Microsoft LIDA, a project from Microsoft Research that can help you build your intuition for data visualization.
  7. Azure AI Platform | We've talked about Data Science and operationalizing it with MLOps (Data Science lifecycle workflows) for predictive AI applications. But what about generative AI driven by natural language prompts and pre-trained models? In this final post, we explore the paradigm shift to LLM Ops, and how the Azure AI platform streamlines the workflow.
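To make the first item a little more concrete, here is a minimal sketch of the kind of pre-defined configuration a dev container relies on, built as a Python dictionary and printed as JSON. The container name, image tag, extension list, and post-create command are illustrative assumptions, not the actual contents of the codespaces-jupyter template.

```python
import json


def build_devcontainer_config() -> dict:
    """Return an illustrative dev container configuration (all values assumed)."""
    return {
        # Hypothetical display name for the environment.
        "name": "data-science-sandbox",
        # Assumed base image; pick whichever Python image suits your project.
        "image": "mcr.microsoft.com/devcontainers/python:3.11",
        "customizations": {
            "vscode": {
                # Assumed extension list: Python, Jupyter, and Data Wrangler.
                "extensions": [
                    "ms-python.python",
                    "ms-toolsai.jupyter",
                    "ms-toolsai.datawrangler",
                ]
            }
        },
        # Assumed setup step, run once after the container is created.
        "postCreateCommand": "pip install pandas matplotlib",
    }


if __name__ == "__main__":
    # Print the JSON you could save as .devcontainer/devcontainer.json
    print(json.dumps(build_devcontainer_config(), indent=2))
```

Saving the printed JSON as `.devcontainer/devcontainer.json` in a repository is what lets everyone who opens it in a codespace start from the same tools and extensions.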

The posts by themselves tell a story that will help you navigate the developer workflow through the lens of tools and AI. But doing hands-on labs may help you "see" how these concepts work in practice and reinforce your conceptual understanding with applied usage.

 

Workshop: Learn By Doing!

The roadmap slide from my talk links to a related Python Data Analysis Workshop repo that I am maintaining, and will continue to evolve with more examples. Simply fork the repo and explore the pre-existing exercises in the context of these posts. Then, create an exercise notebook in each folder, and use it as a sandbox for trying other ideas and extending your intuition and knowledge organically. Share what you learned, or leave issues on the repo with feedback or suggestions for other resources or exercises you want to see.
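If you are not sure what to put in a first sandbox notebook, a cell like the sketch below is one possible starting point. It is not taken from the workshop repo: the inline sample data is made up, the `clean_rows` helper is hypothetical, and it uses only the Python standard library so it runs before you install anything.

```python
import csv
import io
from statistics import mean

# Made-up sample data with one missing value and one duplicate row.
RAW_CSV = """city,temp_c
Seattle,11.5
Austin,
Boston,8.0
Seattle,11.5
"""


def clean_rows(text: str) -> list[dict]:
    """Drop rows with a missing temperature and exact duplicate rows."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        if not row["temp_c"]:
            continue  # skip rows where the temperature is missing
        key = (row["city"], row["temp_c"])
        if key in seen:
            continue  # skip rows we have already kept
        seen.add(key)
        cleaned.append({"city": row["city"], "temp_c": float(row["temp_c"])})
    return cleaned


rows = clean_rows(RAW_CSV)
print(rows)
print("mean temp_c:", mean(r["temp_c"] for r in rows))
```

From here you can swap in one of the open datasets, try the same steps with a dataframe library, or ask GitHub Copilot to explain or extend each step.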


 

Relevant Resources

We covered a lot today. Want to continue your learning journey? Here are three curated resource collections, which I will continue to maintain, that cover the journey from data science to AI. Happy hacking, and remember to build that goal-oriented learning mindset!

 

Last update: Apr 01 2024 07:35 AM