Use GitHub Codespaces to learn Machine Learning
Published Aug 15 2022 11:46 AM
Microsoft

Have you read the theory of AI and ML over and over but feel the only way to understand it is by running and tinkering with the code?

 

Well, I was stuck in this loop, so I reached out to my teammates to get over the hurdle. They suggested using GitHub Codespaces and walked me through the setup. It was easy and quick, and I would never have done it if I had had to figure it out myself. So hopefully this article helps all of you combine practical and theoretical knowledge and take one step towards using ML in real life.

 

First, let’s cover some quick basics like what and why.

 

What is ML and AI?

Machine learning is a subfield of artificial intelligence, which is broadly defined as the capability of a machine to imitate intelligent human behavior. Artificial intelligence systems are used to perform complex tasks in a way that resembles how humans solve problems.

 

Then why don’t we just have humans do it?

The goal of ML or AI is to predict results based on incoming data plus the relationships and patterns behind the data. In general, the more diverse and representative the data, the better the result.

Now, humans need something called working memory to store and integrate all information “at hand”. Working memory is a form of memory that allows a person to temporarily hold a limited amount of information at the ready for immediate mental use. It is considered essential for learning, problem-solving, and other mental processes. Now imagine holding on to 10,000 images in your memory while trying to solve an overly complex math problem.

 

This is where machine learning helps: when we are limited by our working memory, we can offload predictions and decision-making to a machine. Simply put, ML/AI can be thought of as an enhanced version of human working memory.
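To make "predicting results from patterns in the data" a little more concrete, here is a minimal sketch in plain Python. It is only an illustration: the data is made up, the function names `fit_line` and `predict` are hypothetical, and a straight-line fit is far simpler than the neural network used later in this article.

```python
# Toy illustration: learn the pattern y ~ slope * x + intercept from examples,
# then use it to predict a value for an input the model has not seen.

def fit_line(xs, ys):
    """Ordinary least-squares fit of a straight line to paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned line to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Made-up training data (an assumption for this example): hours studied -> exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

model = fit_line(hours, scores)
print(predict(model, 6))  # prediction for an input that was not in the training data
```

The "learning" here is just computing two numbers from the examples; the same idea scales up to models with many more parameters and much more data than a person could hold in mind at once.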

 

If you want a gentle introduction to image analysis using neural networks and want to build your own, here is a blog that links to all the resources.

 

Now that we have the theoretical knowledge, let’s see which tools we can use to apply it, and how.

What is GitHub Codespaces?

Codespaces is an instant development environment hosted in the cloud that lets you develop, run, test, debug, and push code without any local machine setup.

 

What is Codespaces for?

Codespaces give developers a consistent development environment in minutes. You can customize this entire space, not only for yourself, but for everyone else that comes to that GitHub repo.

As a beginner, it took me 10 minutes to set up the environment in Codespaces, versus 60-120 minutes to set it up locally, even with a Python environment already in place. You can connect to your codespace from the browser or from VS Code, and you can create a codespace from any branch or commit in your repo.

For our project, we created a new repository and performed the following steps to customize our codespace, run, and test.

 

Steps -

  1. Create a new repository.
  2. Create a new codespace by following the steps in the documentation. It comes with a default image.
  3. Run the codespace and add dev container configurations: press Ctrl+Shift+P to open the Command Palette, search for the command that adds dev container configuration files, then search for and select the Jupyter Data Science Notebooks image.
  4. Rebuild the dev container
  5. Download the requirements file linked here.
  6. Create a new file, name it requirements.txt, then paste in the text from the previous step.
  7. Open the Dockerfile and uncomment lines 18 to 20.
  8. Rebuild the container so that it installs everything listed in requirements.txt.
  9. Download the mnist-demo-mlp file linked here, then drag and drop it into the codespace to upload it.
  10. Add a new file and call it .gitignore
  11. Add the line /Data to it (so that the data is not uploaded to the GitHub repo).
  12. Run the notebook cells one by one to see the results/output.
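After the rebuild in step 8, it can be reassuring to confirm that the packages from requirements.txt actually made it into the container before running the notebook. The following is a minimal sketch, not part of the original walkthrough: the helper names `requirement_names` and `missing_packages` are hypothetical, and the parsing only handles simple lines such as `numpy>=1.21`, not URLs or editable installs.

```python
import re
from importlib import metadata
from pathlib import Path


def requirement_names(lines):
    """Extract bare package names from requirements.txt-style lines."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options like -r
            continue
        # The name is everything before a version specifier, extras, or marker.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Only runs the check if a requirements.txt exists in the current folder.
requirements_file = Path("requirements.txt")
if requirements_file.exists():
    names = requirement_names(requirements_file.read_text().splitlines())
    print("Missing packages:", missing_packages(names) or "none")
```

Running this in a terminal or a notebook cell inside the codespace should print "none" when the rebuild picked up every requirement; any names it lists are a hint that the Dockerfile lines were not uncommented or the container was not rebuilt.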

That’s it! You have successfully trained and tested a neural network, and visualized its test results, in less than 40 minutes. Feel free to tinker with your code and see how it changes the performance.
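If you are wondering what kind of computation you are tinkering with, here is a rough sketch of the forward pass of a small multilayer perceptron (MLP) on an MNIST-sized input. This is not the code from the linked notebook, which is not reproduced here; it assumes, based on the file name, that the notebook trains an MLP. The hidden layer size of 32, the random untrained weights, and the fake image are all assumptions made for illustration.

```python
import math
import random


def relu(values):
    """Keep positive values, replace negatives with zero."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def random_layer(n_in, n_out, rng):
    """Untrained layer with small random weights and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def mlp_forward(pixels, hidden_layer, output_layer):
    """Input -> hidden layer with ReLU -> output layer with softmax."""
    hidden = relu(dense(pixels, *hidden_layer))
    return softmax(dense(hidden, *output_layer))


rng = random.Random(0)
hidden_layer = random_layer(784, 32, rng)  # MNIST images are 28 x 28 = 784 pixels
output_layer = random_layer(32, 10, rng)   # one output per digit class, 0 to 9

fake_image = [rng.random() for _ in range(784)]  # stand-in for a real MNIST image
probabilities = mlp_forward(fake_image, hidden_layer, output_layer)
predicted_digit = probabilities.index(max(probabilities))
print("Predicted digit:", predicted_digit)
```

With untrained weights the prediction is meaningless; training is what adjusts the weights so that the highest probability lands on the correct digit. Changing things like the hidden layer size is the sort of tinkering that shifts the performance you see in the notebook.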

If you made a few changes and want to commit and push them back to the remote, follow the GitHub documentation.

Last update: Aug 15 2022 11:50 AM