Note from the editor: This blog is guest-authored by Jeremy Chapman, former IT professional and long-time technical expert on the Microsoft 365 product team.
2023 will go down as the year the world was broadly introduced to generative AI. Experiences from OpenAI’s ChatGPT and Microsoft Copilot were significant in both technology and even non-technology conversations. Beyond the user experiences, many of the core concepts were more deeply explained, including how generative AI can offer more personalized responses by incorporating your files and data as part of the orchestration process to respond to your prompts.
Before I go any deeper into the mechanics of generative AI and retrieval augmented generation, since this is about Modern Work skilling overall, let me start by sharing a few highlights spanning Microsoft 365, Windows, and others that you may have missed. The good news is that you can get to everything right now on demand and for free. These resources span online and in-person events, videos, documentation, and structured learning, so whatever modality you prefer, we’ve got your covered!
Here are 7 recommended resources
- Microsoft Technical Takeoff with deep Windows, Windows 365, and Microsoft Intune went live in November and hosted more than 30 deeper-dive 300-level sessions, which didn’t make it into the agenda of Microsoft Ignite.
- Microsoft Ignite’s Modern Work track included 37 technical sessions spanning Microsoft 365, Teams, Copilot and more – with exclusive and deep dive updates across products and services – so you can find information on your specific areas of focus and interest.
- Microsoft Mechanics went deep across 10 Modern Work topics since June, not just for Microsoft Copilot, but also announcements like the new Windows App, Microsoft Loop general availability, and more.
- New Copilot for Microsoft 365 Learning Paths, starting with basic concepts and going all the way to what you and your organization can do to prepare for generative AI
- Microsoft Copilot end user skilling site with even more of the basics covered for how to work with Copilot, tips for authoring prompts, demonstrations and more.
- If instructor-led training is your preference, there are 7 unique Virtual Training Days that have been running for months and you can now sign up for future deliveries.
- My final recommendation is to use Microsoft Copilot to find and summarize the skilling content you want – and because it’s grounded using current web content, those recommendations will always be up to date.
Sharing one of my 2023 skilling journeys: Learning the core mechanics of generative AI apps and orchestration
So now, let me get back to the topic of how generative AI experiences can safely work with your data to create more personalized responses. You may know me from Microsoft Mechanics, where we dig into trending technical topics at Microsoft and as the name suggests, explain the mechanics behind them. And because we cover topics on Mechanics from several areas across Microsoft, including Azure AI, developer tools, and data security, as well as Modern Work, I have a unique cross-product perspective. This extends to generative AI solutions with retrieval-augmented generation, and how their underlying orchestration patterns work. I’d like to share some highlights from my own personal skilling journey to demystify a few of the core concepts. And to be clear, you will likely need a basic understanding of generative AI concepts, to follow everything I’ll write and show below.
The mechanics of copilot-style apps
First, you may have seen this architecture representing Copilot for Microsoft 365, either in product documentation or a Microsoft Ignite session that I helped present in November:
If you follow the lines and numbered steps in the diagram, it walks through the mechanics of what happens behind the scenes when you submit a prompt. I won’t describe every step in the architecture, but there are a few concepts in there that I learned a bit more deeply about while building recent shows like our deep dive on building custom apps using Azure AI Studio, along with the many topics before that in our broader generative AI video playlist. And unless you are a developer, there’s a good chance that you might have missed these details, so let me break down at a very high level a few of the shared core concepts you’d use to build your own copilot-style app. I’ll use the Azure AI Studio as visual representations for a few of the steps that I initially had not fully understood.
Pre-processing. This refers to a broader set of steps occurring immediately after you submit a prompt. In Azure AI Studio, as well as most common generative AI architectures, one of the first concepts used for pre-processing is called a System Message, but it’s sometimes also referred to as a “System Prompt” or “Meta Prompt” as highlighted below. These are appended to all user prompts behind the scenes to provide additional context for how a large language model should respond. The example is simple system message, and you will see these used to ensure for example that responses are friendly, helpful, voiced in a certain tone, factual, and cite information sources used for a response.
Grounding with data. This refers to data that can later be presented to a large language model along with your prompt and system message to help generate an informed response. This process uses information that the large language model did not have access to as part of its training set. Any data used for retrieval is limited to what the user has permission to access.
Below, the integrated data is represented in my example by a dataset of owner manuals for outdoor products from a fictitious online retailer. In Copilot for Microsoft 365, this would include the Microsoft Graph and the information it can programmatically access in places like SharePoint, OneDrive, email, calendar, and others. This can also include any information you have extended your organization’s instance of the Microsoft Graph with using Graph connectors or plugins to connect programmatically and in real-time via APIs to read information external to Microsoft 365. Information on the web can also optionally be used to retrieve up-to-date public information as part of the grounding process.
These retrieval steps can optionally use semantic search – or what’s often referred to as vector search – in addition to keyword search. Semantic search retrieves similar information based on the derived intent of the user prompt, without matching keywords, keyword synonyms, or keyword misspellings. An example of semantic search would be if you described “the app with numbers, tables, and formulas” in a prompt; semantic search could derive that you are referring to Microsoft Excel, even though neither “Microsoft” or “Excel” were used to describe it as keywords. To find out more about how semantic search works with keyword search and the benefits of combining the two concepts, I’d encourage you to watch this show about vector and keyword search on Mechanics.
Orchestration. All of the steps shown in the architecture diagram are referring to orchestration, or the things that happen between submitting your prompt and receiving the response. There is a lot that goes on there. In the Azure AI Studio, the orchestration used is called a prompt flow. This does everything described in the high-level steps 1-6 in the architecture diagram above. And using Azure AI Studio and prompt flows, can get a little higher fidelity on what can happen, such as determining and extracting intent, formatting the retrieved information so that it can be efficiently presented to the large language model, and later formatting the reply so that it matches the interface and expectation of the user in your app.
Responsible AI. Responsible AI is often abbreviated as RAI, like in the architecture diagram above, but what does it mean programmatically? Microsoft has an entire site devoted to describing what responsible AI means. Azure AI Studio also has the concept of Content Safety using content filters for both user inputs and response outputs (model completions) using a few high-level categories. These are only a few components used as part of orchestration and inferencing with generative AI, and you can get an idea of the types of things that are filtered.
Post-processing. Again, the post processing comprises a larger number of steps, including formatting the output of a response for a particular app or interface. This is also where data security and compliance controls can come into play, as described in the architecture diagram. For example, the response here can include data labels of referenced content for confidentiality or automatically apply sensitivity labels of generated content if sensitive information was retrieved to generate the response.
Then, once all of these and other steps are completed, the orchestration returns its informed response back to the user in the app they are using with generative AI.
Hopefully, this helps demystify a few of the core concepts for these types of apps. This is by no means an exhaustive description of every step in the process (or for that matter, everything I’ve learned in 2023 🙂). For me, with a background in deployment and process automation, these steps were very helpful in my personal skilling journey to understand how everything works, and even helped me to build a few simple copilot style apps using the Azure AI Studio.
Of course, there is a lot more that you can explore. I’d recommend starting with a few of the items listed in the beginning. If you’re looking for something not on my list, that’s where Microsoft Copilot can help. Just remember to be specific with your prompt for best results.
Thanks for reading this far and best wishes for 2024!