Blog Post

Azure SQL Blog
3 MIN READ

GitHub Copilot for SQL Developers: Turbocharge your SQL Development Workflow

subhojitbasak's avatar
subhojitbasak
Icon for Microsoft rankMicrosoft
Jul 18, 2023

Blog 1

Congratulations on starting your journey as a database developer in a new organization! As you embark on this exciting chapter of your career, you'll encounter various challenges and opportunities to enhance your SQL development skills. One tool that can revolutionize your workflow is GitHub Copilot. This is the first blog of our series where we will explore how GitHub Copilot can assist you in your SQL development tasks, providing intelligent code suggestions and speeding up your learning curve.

 

What is GitHub Copilot?

GitHub Copilot is an AI-powered code completion tool developed by GitHub and powered by OpenAI. It leverages machine learning models trained on vast amounts of code to generate context-aware code suggestions directly within your integrated development environment (IDE). You can use GitHub Copilot as an extension in Azure Data Studio, SQL Server Data Tools (SSDT) and Visual Studio Code.

 

In our previous blog on Copilot, we talked about a plethora of uses, here we would take you through a specific journey where the same prompt given to Copilot would generate different output based on the context that is given to Copilot.

 

Copilot can take comments in natural language, the same we are used to writing before our code as a best practice, to give a context about our code. This comment, which serves as a token to Copilot goes through checks to identify PII (Personally Identifiable Information) and Toxicity (any hate speech or unwanted content) before allowed content is fed into the AI model. The output of the AI model then goes through Duplicate detection and the top result is presented as a few lines of code. Once this output is generated, and the user accepts or rejects it, Copilot forgets the token and the information via which it has reached that output.

 

In the example below, Copilot generates the code based on a comment that is posted.

 

Once you press the tab key, the code is drafted in the editor.

 

This process works perfectly fine, and Copilot can generate the right queries if they are basic and simple. When the prompt for a more complex question is given, Copilot generates a query that is referencing columns that don’t exist on the database.

 

The reason why Copilot cannot return this query right is because it lacks the context of the tables and columns present in the database. It tries to give the best guess but that is not enough. To make the same prompt works, we need to provide context of the objects that it is expected to know about, before returning the output.

The “Brain Dump” Approach

In this approach, you give context to Copilot to be able to return better results for the prompts that are given. Expanding on the previous example, you would want to provide context on the table object Sales.OrderLines and Sales.Orders. To do that you can right-click on the tables to get the Script for Create. You then copy the objects and paste them at the beginning of the query. This is done for all tables that you might be using in the query.

 

 

 

Once all the objects are there in the query, if you run the same prompt, is what you are expected to get:

 

 

Learn More

If you have reached this far and have enjoyed your experience with GitHub Copilot, you can take it one step further by reading the second blog in the series. In Blog 2, we talk about some more complex queries and different scenarios. We deep dive into query optimization techniques using GitHub Copilot. 

 

Get started with Copilot

You can get started today with GitHub Copilot in Azure Data Studio by installing the latest Azure Data Studio release and installing GitHub Copilot from the extension marketplace. More information on GitHub Copilot is available in the documentation for Azure Data Studio and GitHub Copilot

 

 

 

 

Updated Sep 01, 2023
Version 2.0
  • NomanTee's avatar
    NomanTee
    Copper Contributor

    This seems like making SQL developers dumber. I dont understand why would I write all that comment and copy and paste the table definition when I can just write the whole query a lot quicker? 

    As for the junior developer I fear they would start relying on this instead of learning the actual query language.

  • ShawnDevin's avatar
    ShawnDevin
    Copper Contributor

    some might think that this is making developers dumber, but when I first started learning SQL back in 2000, they had a built in query builder which now is used to build views (select new view to see what I mean).  I didn't know the first thing about SQL and was just given the manual and MSDN disks and told I was now in charge of report queries, to learn it and fix a bug in a report.  I was fortunately shown how to use the old query designer (right click on a view folder and choose new view to see it) and I used this to construct queries for the first year.  It gave an easy way to pick tables, and showed me what the various types of joins looked like, how to filter data and how to use aliases.  I slowly used it less and less once I knew how the code worked and it became faster just to write the code myself as I got comfortable.  I think generative AI will be something like that on steroids for new developers.  Allowing them to quickly create and experiment with custom code without having to do hours or days of research through expired online help articles and should be a great boon to teach them how to code projects they've never tried before.

  • elisseudev's avatar
    elisseudev
    Copper Contributor

    I don't know why a developer would use this kind of tool, just to get dumber. 

  • RaulChiarella's avatar
    RaulChiarella
    Copper Contributor

    NomanTee I completelly disagree with you. I don't think you will write SQL queries quicker than the suggestions Copilot gives you.

    I had to recently create some tables using some inner joins from multiple tables that used a lot of data, and the keys were really annoying, with complicated names and all... Doing it maunally I was risking running into typos, writing everything maually, which would take a time I did'nt want to waste.

     

    Using Copilot, I literally copied pasted a documentation that I created some time ago using UML, and it did the entire thing by itself, with no errors whatsoever.

    What's even best, using Copilot does not make you dumber, instead, it makes you more agile, and even give you some insights on different approachs to a same business logic. Why limit your capacitiy when you can be even better?

  • odeddror2075's avatar
    odeddror2075
    Copper Contributor

    Hi there,

     

    -- I have table with two column ID and datetime can you make co pilot produce this code?
    --The query grabs the first time for each ID (the first SELECT, before the UNION ALL)
    --and then recurses, finding the next row for each ID which is at least 15 minutes appart.
    WITH times AS (
    SELECT e.ID, MIN(e.DTTM) AS DTTM
    FROM example e
    GROUP BY e.ID

    UNION ALL

    SELECT r.ID, r.DTTM
    FROM (
    SELECT e.ID, e.DTTM,
    ROW_NUMBER() OVER (PARTITION BY e.ID ORDER BY e.DTTM) AS rn
    FROM example e
    INNER JOIN times t
    ON e.ID = t.ID AND e.DTTM >= DATEADD(mi, 15, t.DTTM)
    ) r
    WHERE r.rn = 1
    )
    SELECT top 100 percent ID,DTTM
    FROM times
    ORDER BY ID, DTTM

  • odeddror2075's avatar
    odeddror2075
    Copper Contributor

    I think we the developer need to learn how to ask good questions in order to get right answers. This is classic (for me) example I stuck with the question.

     

  • COBOL_84's avatar
    COBOL_84
    Copper Contributor

    It appears that copilot is acting like a 4GL (4th generation language). And like all 4GLs the instructions are more related to human language but complex instructions are difficult to convey. Think of trying to conveying in English a recursive query involving a dozen or more joins, unions, and subsets (and make that query efficient). What would really be beneficial is a self tuning database, one that would optimize performance based on past queries.  

     

  • tailinnosoft's avatar
    tailinnosoft
    Copper Contributor

    Late to the party, but…

     

    I spend most of my time in sql writing random queries. At most a few times a year I add a new table, alter a table, or write a sproc. When I do, I often have to stop by W3schools to refresh my memory on some bit of syntax.

     

    my hope is copilot will speed this up. The suggestion may not be perfect, but it’ll get me closer to what I need without having to hit the browser.

     

    I’ve tried the vscode mssql plugin over using SSMS.. I need to spend more time in it next time I have a bunch of sql work to do… the biggest pain point was not being able to run highlighted code. Maybe this has been fixed, or maybe I just need to get in the habit of commenting out queries I’m not interested in executing this run!

     

    I ended up here to see if Microsoft has anything similar to PlanetScale’s Schema Recommendations. I look forward to Microsoft taking some inspiration from it 😉 

  • Ken_Hadden's avatar
    Ken_Hadden
    Brass Contributor

    The fact that I need to clutter my editor with table definitions in order for copilot to be able to get the context it needs is frankly ridiculous. What if I am working on a very complex database that requires a dozen joins for many queries. That is not unusual where I work. And it would be quite time consuming to extract all of those DDL statements.