Modern Work App Consult Blog

Bringing OpenAI into an Outlook add-in: a business mail generator

Microsoft

Feb 15, 2023

[Updated on 7th March 2023]

OpenAI has made the ChatGPT model available through its APIs. The model is called gpt-3.5-turbo and the good news is that it's 10 times cheaper than the Davinci one. It is priced at $0.002 / 1K tokens, while the Davinci model we used in the original version of the project is priced at $0.0200 / 1K tokens.
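To make the price difference concrete, here is a small sketch that estimates the cost of a request from its token count. The two prices come from the paragraph above; the helper name estimateCostUsd and the 300-token sample are illustrative assumptions, not part of the OpenAI library.

```typescript
// Prices in USD per 1,000 tokens, as quoted above (March 2023).
const PRICE_PER_1K_GPT35_TURBO = 0.002;
const PRICE_PER_1K_DAVINCI = 0.02;

// Hypothetical helper: estimates the cost of processing `tokens` tokens.
function estimateCostUsd(tokens: number, pricePer1K: number): number {
  return (tokens / 1000) * pricePer1K;
}

// Example: a 300-token mail generated with each model.
const turboCost = estimateCostUsd(300, PRICE_PER_1K_GPT35_TURBO);
const davinciCost = estimateCostUsd(300, PRICE_PER_1K_DAVINCI);
console.log(turboCost.toFixed(4), davinciCost.toFixed(4));
```

Both prompt and generated tokens count toward the total, so the real cost of a call depends on the length of the input text as well.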

However, the way you use this model in your applications is slightly different from the approach we used in the post with the Davinci model. Let's look at how the generateText() function must be changed to use ChatGPT. First, though, we must open a terminal in the folder that contains our project and upgrade the OpenAI library with the following command:

npm update openai

This command will install a newer version of the library (to my knowledge, createChatCompletion() was introduced in version 3.2.0), which includes a new API to interact with the ChatGPT model. Let's look at the complete code:

generateText = async () => {
  var current = this;
  const configuration = new Configuration({
    apiKey: "your-API-key",
  });
  const openai = new OpenAIApi(configuration);
  current.setState({ isLoading: true });
  const response = await openai.createChatCompletion({
    model: "gpt-3.5-turbo",
    messages: [
      {
        role: "system",
        content: "You are a helpful assistant that can help users to create professional business content.",
      },
      { role: "user", content: "Turn the following text into a professional business mail: " + this.state.startText },
    ],
  });
  current.setState({ isLoading: false });
  current.setState({ generatedText: response.data.choices[0].message.content });
};

The first difference is that we have a new API to interact with ChatGPT, called createChatCompletion(), which we must use instead of createCompletion().

The second difference, even if it's a small one, is that we have to change the model name to gpt-3.5-turbo.

The third difference is the most important one. The prompt you pass to the model to generate text is no longer a single string, but a collection of message objects. Each of them has two properties:

role, which is used to define who generated the prompt. It could be the user, the system or the assistant itself.
content, which is the actual text.

In the code sample, you can see two types of messages: one with system as role, which we're using to instruct the model on the type of outcome we want to achieve (you are an assistant that can generate business content); one with user as role, which is instead the actual ask (turn the following text into a business mail).
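As a minimal sketch of this structure, the two messages can be built by a small typed helper. The ChatRole and ChatMessage types and the buildMailMessages name are assumptions made for illustration; the library ships its own types, which you would normally use instead.

```typescript
// Minimal local types mirroring the message shape described above.
type ChatRole = "system" | "user" | "assistant";
interface ChatMessage {
  role: ChatRole;
  content: string;
}

// Hypothetical helper: builds the two messages used by the add-in.
function buildMailMessages(startText: string): ChatMessage[] {
  return [
    {
      role: "system",
      content: "You are a helpful assistant that can help users to create professional business content.",
    },
    {
      role: "user",
      content: "Turn the following text into a professional business mail: " + startText.trim(),
    },
  ];
}

const messages = buildMailMessages("  I'll be late tomorrow ");
console.log(messages.map((m) => m.role).join(", ")); // prints "system, user"
```

Keeping the instruction (system) separate from the ask (user) makes it easier to change the tone of the generated mail without touching the user's text.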

There's another reason why this model requires a collection of messages, instead of a single prompt like with Davinci. ChatGPT was designed to support conversational scenarios, so that the user can interact with the AI without specifying the full context each time, like you would do with a human. However, the conversation history isn't managed by OpenAI: it must be managed by the developer. In our scenario, we don't have to do this: we aren't using ChatGPT to have a conversation, but to generate text to include in our mail, so we don't have to change the way the Outlook add-in works. In a conversational scenario, however, we would need to use the messages collection to provide the whole history, so that the model can infer the context. The following example from the official documentation will help you to better understand this concept:

const response = await openai.createChatCompletion({
  model: "gpt-3.5-turbo",
  messages: [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who won the world series in 2020?"},
    {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
    {"role": "user", "content": "Where was it played?"}
  ],
});

In addition to the messages generated by the user, we also pass back the messages generated by ChatGPT, using assistant as role. This way, the model is able to get the whole context and provide an answer to the last question (Where was it played?).
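Since the history must be managed by the developer, a conversational client needs somewhere to store it. The following is a minimal sketch of that idea, not code from the add-in: the Conversation class and its method names are hypothetical, and the actual call to createChatCompletion() is left out so that the example stays self-contained.

```typescript
type HistoryRole = "system" | "user" | "assistant";
interface HistoryMessage {
  role: HistoryRole;
  content: string;
}

// Hypothetical helper class: stores the conversation on the client side.
class Conversation {
  private history: HistoryMessage[];

  constructor(systemPrompt: string) {
    this.history = [{ role: "system", content: systemPrompt }];
  }

  // Records a user question and returns the full list to send to the API.
  ask(question: string): HistoryMessage[] {
    this.history.push({ role: "user", content: question });
    return this.history.slice();
  }

  // Records the model's answer so that the next question keeps its context.
  recordAnswer(answer: string): void {
    this.history.push({ role: "assistant", content: answer });
  }
}

const conversation = new Conversation("You are a helpful assistant.");
conversation.ask("Who won the world series in 2020?");
conversation.recordAnswer("The Los Angeles Dodgers won the World Series in 2020.");
const nextRequest = conversation.ask("Where was it played?");
console.log(nextRequest.length); // prints 4
```

The array returned by ask() is what you would pass as the messages property; every token in it is billed again on each call, so long conversations may need trimming.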

The source code on GitHub has been updated to use the new model.

The original post starts here

The tech world has always shown a lot of interest in Artificial Intelligence, especially in the last few years. Recently, however, interest has started to spread outside the tech enthusiast bubble as well. Dall-E, the model developed by OpenAI to create images, started to give a new meaning to "generative AI", showing the capabilities of these new powerful AI models. But it's ChatGPT that really ignited the interest, by providing a conversational model that is very close to the human one and that, most of all, can help you accomplish many tasks: it can make searches, it can relate content together, it can generate summaries or lists, it can create stories, etc.

A few days ago, Microsoft demonstrated how ChatGPT isn't just a "toy" to play with, by announcing a new shift of the search experience with the integration of a conversational and content generation experience into Bing and Edge. AI becomes a copilot, that can assist you during your daily tasks and help you to be more productive and, as the new Microsoft mission states, "do more with less". And I'm sure, in the coming months, we'll see even more integrations; some of them have already been announced, like the ones in Teams and Viva Sales.

So, why don't we get ahead of the game and start to play with the possibilities of using AI to improve productivity? In this blog post we'll combine the best of both worlds: the productivity offered by the Microsoft 365 ecosystem and the content generation capabilities of the latest AI models. We're going to build an Outlook add-in that, through OpenAI APIs, will help people to craft professional business mails from one or two sentences. We aren't going to use the real ChatGPT model, since it isn't available for public consumption yet, but we're going to use one of the many powerful GPT models offered by OpenAI. The goal of this blog post is to help you understand the capabilities of these models and the basic steps to integrate them. Once ChatGPT is available (directly through OpenAI or using another provider, like Azure OpenAI), you will just need to swap the model (or the API implementation), but the basic architecture will stay the same.

Let's start!

Create the Outlook add-in

We're going to create the Outlook add-in using the new web-based model, so you will need the following tools installed on your machine:

The latest LTS version of Node.js.
Visual Studio Code
The Yeoman generator for Office add-ins.

Once you have all the requirements, open a terminal and run the following command:

yo office

Please note: since it's web-based development, you might be tempted to work on this project using the Windows Subsystem for Linux, which typically delivers better performance, especially on the file system. However, if you want to have a good debugging experience, it's better to create the project in Windows, so that you'll be able to debug the add-in using Outlook for desktop.

You will be guided through a series of steps to scaffold the starting template for the project. Use the following settings:

Choose a project type: Office Add-in Task Pane project using React framework (here you can actually pick the framework you prefer, but since React is the one I know best I'm going to use this template for this post).
Choose a script type: TypeScript (also in this case you can pick the language you prefer, but I highly suggest TypeScript for anything that is more than a "hello world" in the web ecosystem).
What do you want to name your add-in?: Give it a meaningful name, like "Outlook AI Mail generator".
Which Office client application would you like to support?: Outlook

Now the tool will scaffold the basic template and run npm install to restore all the dependencies. At the end of the process, you will find in the current folder a subfolder with the same name as the add-in you chose during the wizard. Just open it with Visual Studio Code to start the coding experience.

The Outlook add-in template contains two basic implementations of the two scenarios supported by Office add-ins:

Task pane. The task pane is an HTML page that is displayed inside a panel placed on the right of the screen inside the application. The user can interact with the page and, through the Office SDK, perform operations that can interact with the context.
Commands. These are operations that don't require any UI interaction. You click on a button and an operation is performed. Also in this case, through the Office SDK you can retrieve the context and operate on it. For example, you can select some text in the mail body and the command is able to read it.

We're going to support both approaches, so that you can pick the one you like better.

For the taskpane, this is the final look & feel we're going to achieve:

The user will specify the text they want to turn into a business mail in the first box. By pressing the Generate Text button, we're going to invoke the OpenAI APIs, passing a prompt followed by the input text. The result returned by the API will be displayed in the second box. Users will have the chance to adjust it and then, once they're done, they can click the Insert into mail button, which will use the Office APIs to include the text in the mail's body.

For the command, instead, we don't really have a user interface, just a button available in our ribbon:

The logic, however, is the same as the taskpane. The only difference is that the input text to process through Open AI won't be typed by the user in a dedicated box, but directly in the mail's body. Using the Office APIs, we'll retrieve the body and pass it to the Open AI APIs. Then, the result will be automatically pasted back into the body.

Now that we have a better idea of the result, let's start to work!

Configuring the manifest

An Office add-in includes a file called manifest.xml, which describes it: the name, the publisher, the supported actions, etc. Before we start working on the code, we must make some changes. Some of them are purely cosmetic. For example, you can use the DisplayName property to change the name of the add-in that will be displayed to the user inside Outlook; or the IconUrl one to change the icon.

A particularly important section, however, is the one called MailHost, which describes where and how the add-in will be integrated inside the Outlook surface. By default, the template includes the following extension point:

<ExtensionPoint xsi:type="MessageReadCommandSurface">

This means that the add-in will be integrated in the reading experience: you'll be able to invoke it when you're reading a mail. This isn't our scenario, however. We want this add-in to help us in writing a new mail, so we must change this value as in the following snippet:

<ExtensionPoint xsi:type="MessageComposeCommandSurface">

Finally, we can customize the section inside the ShortStrings element to change the labels that are associated to the buttons:

<bt:ShortStrings>
  <bt:String id="GroupLabel" DefaultValue="AI Generator Add-in"/>
  <bt:String id="TaskpaneButton.Label" DefaultValue="Business mail AI generator"/>
  <bt:String id="ActionButton.Label" DefaultValue="Generate business mail"/>
</bt:ShortStrings>

Now we can move to the code.

Building the task pane

We're going to focus on the taskpane folder of our solution, which includes the files that are used to render the web content displayed in the panel. Since I've picked up the React template, the taskpane.html page doesn't contain much. It includes only a div, called container, which is used by the index.tsx file to load the React application and render it into the div placeholder. The real application is stored inside the components folder: App.tsx is the main component, which defines the UI and includes the interaction logic. We also have some smaller components, which are used to render specific UI elements (like the header or the progress indicator).

Let's start to build the various elements we need step by step. Since we're going to make multiple changes compared to the default template, I won't go into the details on what you need to change. Just replace the existing components and functions with the ones I'm going to describe in the rest of the article.

Getting the initial text and passing it to Open AI

As the first step, we need to define the state of our main component. To support our scenario, we need the state to store the initial text and the generated one. As such, we must update the AppState interface as follows:

export interface AppState {
  generatedText: string;
  startText: string;
}

Let's initialize them as well in the constructor of the component with an empty string:

export default class App extends React.Component<AppProps, AppState> {
  constructor(props) {
    super(props);
    this.state = {
      generatedText: "",
      startText: "",
    };
  }
}

Now let's look at how we can define the UI, through JSX, the markup language (sort of) used by React, and the render() function of the component:

render() {
  return (
    <div>
      <main>
        <h2>Open AI business e-mail generator</h2>
        <p>Briefly describe what you want to communicate in the mail:</p>
        {/* Every keystroke is saved in the component's state as startText */}
        <textarea
          onChange={(e) => this.setState({ startText: e.target.value })}
          rows={10}
          cols={40}
        />
        <p>
          <DefaultButton onClick={this.generateText}>
            Generate text
          </DefaultButton>
        </p>
      </main>
    </div>
  );
}

We have added a textarea, which is where the user is going to type the starting text, and a button, which will invoke the Open AI APIs. As for the textarea, we subscribe to the onChange event (which is triggered every time the user types something) and we use it to store the typed text inside the state, through the startText property.

Before implementing the onClick event of the button, however, we must take a step back and register with OpenAI so that we can get an API key. Just go to https://openai.com/api/ and click on Sign Up (or Log In if you already have an account). Once you're in, click on your profile and choose View API keys. From there, click on Create new secret key and copy the generated key somewhere safe. You won't be able to retrieve it again, so do it immediately. The free plan gives you $18 of credits to be used within 3 months, which is more than enough for the POC we're building.

Now that we have an API key, we can start using the Open AI APIs. The easiest way to do it in our add-in is through the official JavaScript library. Open a terminal on the folder which includes your project and type:

npm install openai

Then, at the top of the App.tsx file, add the following statement to import the objects we need from the library:

import { Configuration, OpenAIApi } from "openai";

Now we can implement the onClick event that is triggered when the user clicks on the Generate text button:

generateText = async () => {
  var current = this;
  const configuration = new Configuration({
    apiKey: "add-your-api-key",
  });
  const openai = new OpenAIApi(configuration);
  const response = await openai.createCompletion({
    model: "text-davinci-003",
    prompt: "Turn the following text into a professional business mail: " + this.state.startText,
    temperature: 0.7,
    max_tokens: 300,
  });
  current.setState({ generatedText: response.data.choices[0].text });
};

First, we create a Configuration object, passing in the apiKey property the API key we have just generated.

Please note: we're doing this only for testing purposes, but this isn't a suggested approach for a production scenario. Since the add-in runs entirely client side, it's extremely easy for anyone to extract the API key from the code. You must use a more suitable approach, like having a server-side middleware between the client and the API (like an Azure Function) and storing the key in a service like Azure Key Vault.
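As a rough sketch of what the client side of such an approach could look like, the add-in would send only the user's text to your own backend, which holds the key and calls OpenAI. Everything specific here is an assumption made for illustration: the /api/generate-mail endpoint, the startText and generatedText payload fields, and the HttpPost abstraction (in the add-in it would typically wrap fetch).

```typescript
// Minimal abstraction over an HTTP POST, so the sketch doesn't depend on a specific client.
type HttpPost = (url: string, body: string) => Promise<string>;

// Hypothetical endpoint exposed by your own backend (for example, an Azure Function).
const GENERATE_ENDPOINT = "/api/generate-mail";

// Sends only the user's text; the backend is expected to add the API key and call OpenAI.
async function generateViaBackend(startText: string, post: HttpPost): Promise<string> {
  const payload = JSON.stringify({ startText });
  const raw = await post(GENERATE_ENDPOINT, payload);
  const parsed = JSON.parse(raw) as { generatedText?: string };
  if (typeof parsed.generatedText !== "string") {
    throw new Error("Unexpected response from the backend");
  }
  return parsed.generatedText;
}
```

With this shape, the API key never appears in the JavaScript bundle that Outlook downloads, and the backend can also enforce authentication and usage limits.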

Then we create a new OpenAIApi object, passing as parameter the configuration we have just created. Through this object, we can interact with the various models exposed by Open AI. The one related to text is accessible through the createCompletion() method, which requires as parameter an object with the following properties:

model, which is the name of the model to use. In this sample, we're using text-davinci-003, which is the most advanced GPT model available through the APIs at the time of writing this article. Soon, Open AI will make available also ChatGPT as one of the available options.
prompt, which is the text we want to process. We use a prompt that describes what we want to achieve (turn the following text into a professional business mail), followed by the text typed by the user (which we have previously stored in the component's state).
temperature, which is a value between 0 and 2 that controls how much randomness is in the output. As explained in this good article, the lower the temperature, the more likely GPT-3 will choose words with a higher probability of occurrence. In our case, we set 0.7, which is a good balance between "too flat" and "too creative".
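max_tokens, which is the maximum number of tokens that will be generated by the API. A token is a fragment of text rather than a whole word, so 300 tokens correspond to fewer than 300 words.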
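Since max_tokens is expressed in tokens rather than words, it can help to translate between the two when picking a value. The sketch below uses the common rule of thumb for English text (one token is roughly 4 characters, or about three quarters of a word); it is only an approximation and not a real tokenizer, and both helper names are hypothetical.

```typescript
// Rule of thumb for English text: one token is roughly 4 characters (about 3/4 of a word).
const APPROX_CHARS_PER_TOKEN = 4;
const APPROX_WORDS_PER_TOKEN = 0.75;

// Hypothetical helper: rough token estimate for a prompt. Not a real tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / APPROX_CHARS_PER_TOKEN);
}

// Hypothetical helper: roughly how many words fit in a max_tokens budget.
function approxWordsForTokens(maxTokens: number): number {
  return Math.floor(maxTokens * APPROX_WORDS_PER_TOKEN);
}

console.log(approxWordsForTokens(300)); // prints 225
```

With the value used in this post, the generated mail is capped at roughly 225 words, which is plenty for a short business message.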

This method is asynchronous and based on JavaScript promises, so we can use the async / await pattern to invoke it. This means that the final line of our snippet will be called only when the API has returned a response. Specifically, the text generated by Open AI will be included in the text property of the first element of the data.choices collection. We store it in the generatedText property inside the component's state, so that we can later use it.
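As a defensive variation on reading data.choices[0].text directly, the sketch below returns an empty string when no choice comes back and trims surrounding whitespace, since completions may start with blank lines. The CompletionData and CompletionChoice types are simplified local stand-ins for the library's response types, and firstChoiceText is a hypothetical name.

```typescript
// Minimal local types covering only the fields used here.
interface CompletionChoice {
  text?: string;
}
interface CompletionData {
  choices: CompletionChoice[];
}

// Hypothetical helper: returns the first choice's text, trimmed, or "" if nothing came back.
function firstChoiceText(data: CompletionData): string {
  const first = data.choices.length > 0 ? data.choices[0] : undefined;
  return first && first.text ? first.text.trim() : "";
}

console.log(firstChoiceText({ choices: [{ text: "\n\nDear David," }] })); // prints "Dear David,"
```

In generateText(), the result of firstChoiceText(response.data) could be stored in generatedText in place of the raw property.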

Use the generated text to craft our mail

Now that Open AI has generated the text of our business mail for us, we must display it to the user, give them the option to edit it and then use it as body of the mail. In order to do that, we need to add a new property in our state to store the final text to include in the mail, which might have some differences compared to the one generated by Open AI since we're giving the user the option to edit it. Here is the updated definition of the AppState interface:

export interface AppState {
  generatedText: string;
  startText: string;
  finalMailText: string;
}

Let's not forget to initialize it as well in the component's constructor:

export default class App extends React.Component<AppProps, AppState> {
  constructor(props) {
    super(props);
    this.state = {
      generatedText: "",
      startText: "",
      finalMailText: ""
    };
  }
}

Now that we have the property we need, let's add a new box and a new button to our application, by adding the following elements in JSX right below the ones we've added in the previous section:

<textarea
  defaultValue={this.state.generatedText}
  onChange={(e) => this.setState({ finalMailText: e.target.value })}
  rows={10}
  cols={40}
/>
<p>
  <DefaultButton onClick={this.insertIntoMail}>
    Insert into mail
  </DefaultButton>
</p>

The textarea control in React has a property called defaultValue, which we can set with a text that we want to display when the component is rendered. We connect it to the generatedText property available in the component's state. This way, once the OpenAI API call has returned the generated text and stored it into the state, the box will automatically update itself to show it. Then, like we did with the previous box, we handle the onChange event, by saving the text typed by the user inside the finalMailText property of the component's state.

Finally, we have another button, which invokes a function called insertIntoMail(), which is described below:

insertIntoMail = () => {
  const finalText = this.state.finalMailText.length === 0 ? this.state.generatedText : this.state.finalMailText;
  Office.context.mailbox.item.body.setSelectedDataAsync(finalText, {
    coercionType: Office.CoercionType.Html,
  });
};

Here we can see the Office SDK in action. First, we determine if the user has made any change to the text generated by OpenAI: if finalMailText is empty, we fall back to generatedText. Then, we call the Office.context.mailbox.item.body.setSelectedDataAsync() method, passing as parameter the final text (the one generated by the OpenAI API plus any edit the user might have made). This method will take care of adding the text into the body of the mail, specifically where the text cursor is placed.
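One detail to keep in mind: the text is inserted with Office.CoercionType.Html, while the model returns plain text with line breaks. HTML normally collapses those line breaks, so you may want to convert the text first. The following is a minimal sketch of such a conversion; escapeHtml and plainTextToHtml are hypothetical helpers, not part of the Office SDK.

```typescript
// Escapes characters that have a special meaning in HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: keeps line breaks visible when plain text is inserted as HTML.
function plainTextToHtml(text: string): string {
  return escapeHtml(text).replace(/\r?\n/g, "<br>");
}

console.log(plainTextToHtml("Dear David,\n\nThanks & regards"));
// prints "Dear David,<br><br>Thanks &amp; regards"
```

You could pass plainTextToHtml(finalText) to setSelectedDataAsync(), or alternatively switch the coercion type to Office.CoercionType.Text, as the command implementation does.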

Building the command

Building the command requires less effort than the taskpane, since we don't have any UI. We just need to intercept the click on the button in the ribbon and manage it. The default commands.ts file inside the commands folder includes a function called action(), which we can use for this purpose.

First, let's create a new function that takes the body of the mail and processes it using Open AI:

function getSelectedText(): Promise<any> {
  return new Office.Promise(function (resolve, reject) {
    Office.context.mailbox.item.body.getAsync(Office.CoercionType.Text, async function (asyncResult) {
      // The try/catch lives inside the callback so that it also covers the awaited Open AI call.
      try {
        // Stop early if Outlook couldn't read the body.
        if (asyncResult.status === Office.AsyncResultStatus.Failed) {
          throw asyncResult.error;
        }
        const configuration = new Configuration({
          apiKey: "your-api-key",
        });
        const openai = new OpenAIApi(configuration);
        const response = await openai.createCompletion({
          model: "text-davinci-003",
          prompt: "Turn the following text into a professional business mail: " + asyncResult.value,
          temperature: 0.7,
          max_tokens: 300,
        });

        resolve(response.data.choices[0].text);
      } catch (error) {
        reject(error);
      }
    });
  });
}

The main difference with the taskpane is that, in this case, we are getting the text to turn into a business mail directly from the body of the mail. To do it, we use the Office SDK and, specifically, the Office.context.mailbox.item.body.getAsync() method. Since the method is asynchronous, we receive the body in a callback, in which we implement the OpenAI integration, which is the same we have seen for the taskpane. Using the OpenAI library, we send a prompt followed by the text typed by the user, through the createCompletion() function and the text-davinci-003 GPT model. Once we get a response, we return to the caller the text processed by OpenAI, which is stored inside the text property of the first element of the data.choices collection.
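The general pattern at work here, wrapping a callback-based API in a promise, can be sketched in isolation. The types below are simplified local stand-ins: the real Office API reports its outcome through Office.AsyncResultStatus, while this sketch uses plain strings, and toPromise and fakeGetBody are hypothetical names used only for illustration.

```typescript
// Minimal local stand-ins for the shape of an Office async result.
interface AsyncResultLike<T> {
  status: "succeeded" | "failed";
  value: T;
  error?: Error;
}
type CallbackApi<T> = (callback: (result: AsyncResultLike<T>) => void) => void;

// Hypothetical helper: turns a callback-based call into a promise.
function toPromise<T>(api: CallbackApi<T>): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    api((result) => {
      if (result.status === "failed") {
        reject(result.error || new Error("Async call failed"));
      } else {
        resolve(result.value);
      }
    });
  });
}

// Fake API standing in for body.getAsync(), used only to exercise the helper.
const fakeGetBody: CallbackApi<string> = (callback) =>
  callback({ status: "succeeded", value: "I'll be late tomorrow" });

toPromise(fakeGetBody).then((body) => console.log(body)); // prints "I'll be late tomorrow"
```

Separating the promise wrapper from the OpenAI call keeps each step small and lets failures from either side reach the same catch handler.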

Now we can implement the action() function:

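function action(event: Office.AddinCommands.Event) {
  getSelectedText()
    .then(function (selectedText) {
      // Complete the event only after Outlook has finished inserting the text.
      Office.context.mailbox.item.setSelectedDataAsync(selectedText, { coercionType: Office.CoercionType.Text }, () =>
        event.completed()
      );
    })
    .catch(() => event.completed()); // Release the command even if the request fails.
}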

In this case too, we're using code we have already seen in the taskpane implementation. We call the getSelectedText() function we have just created and, once we have the generated business mail, we use the Office.context.mailbox.item.setSelectedDataAsync() method to copy it into the mail's body. Finally, we call event.completed() to let Office know that the command execution is completed.

Testing and debugging the add-in

Visual Studio Code makes testing the add-in easy, thanks to a series of debug profiles which are created by Yeoman. If you move to the Debug tab of Visual Studio Code, you will find different profiles, one for each Office application. The one we're interested in is called Outlook Desktop (Edge Chromium). If you select it and you press the Play button, two things will happen:

A terminal prompt will be launched. Inside it, Visual Studio Code will run the local server (which uses Webpack to bundle all the JavaScript) that serves the add-in content to Outlook.
Outlook will start and you will see a security prompt asking if you want to sideload the add-in.

Now click on the button to compose a new mail and based on the size of your screen, you should see your add-in available in the ribbon (or, in case it doesn't fit, you'll see it by clicking on the three dots at the end of the ribbon).

When you click on it, a panel will open on the right, like the one we have seen at the beginning of the post. You will also be asked if you want to connect to the debugger; make sure to click Yes to confirm. Now type a simple sentence that you want to turn into a mail. For example, something like:

David, I'm planning to work on your garden tomorrow at 3 PM, but I might be a bit late.

Then click on Generate text and, after a few seconds, you should see a more carefully crafted text being displayed in the box below:

Dear David,

I hope this message finds you well. I wanted to let you know that I am planning to work on your garden tomorrow at 3 PM, however, I may be running a bit late. I apologize for any inconvenience this may cause.

Thank you for your understanding.

Sincerely,
[Your Name]

Now you can make the changes you need, then click on Insert into mail. You will find the text included inside the body of the mail, ready to be sent to your customer. In the background, Visual Studio Code has attached a debugger to the WebView which is rendering the panel inside Outlook. This means that you can set breakpoints in your TypeScript code and do step-by-step debugging whenever it's needed. For example, you can set breakpoints inside the generateText() function and monitor the interaction with the OpenAI APIs. The local server also supports live reload, so whenever you make a change you won't have to redeploy the add-in: the change will be applied in real time.

The same testing can also be done for the command. The difference is that you must type the text to turn into a business mail directly in the mail's body, then click on the Generate business mail button in the ribbon. A loading indicator will be displayed at the top of the window, and you'll be notified once the operation is completed. In this case too, the Visual Studio Code debugger will be attached, so you can set breakpoints in your code if needed.

Deploying the add-in

Once you're done testing, you can choose to make your add-in available to a broader audience. The publishing story for Office add-ins is similar to the one for Teams apps:

You can share the add-in for manual sideloading, which is great for testing and limited distribution.
You can publish the add-in in your organization, so that all the employees can pick it up from the add-in store.
You can publish the add-in in the Office store, to make it available to every Office customer around the globe.

Regardless of your choice, Office is just the "interface" for your add-in, but it doesn't host it. This is why the only required component to deploy for an Office add-in is the manifest, which includes all the information on where the web app is hosted. If you explore the manifest.xml file we have previously edited, you will find the following tag, which defines the entry point of our taskpane:

<SourceLocation DefaultValue="https://localhost:3000/taskpane.html"/>

Ideally, once you're ready to deploy, this URL (and the other localhost references included in the manifest) will be replaced by the real URL of the web application.
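The replacement itself is a plain text substitution over the manifest. Here is a minimal sketch of it; rewriteManifestUrls is a hypothetical helper, and https://contoso.example is a placeholder rather than a real deployment URL. Your build setup may already perform an equivalent step for you.

```typescript
// Hypothetical helper: replaces every occurrence of the development URL in the manifest text.
function rewriteManifestUrls(manifestXml: string, devUrl: string, prodUrl: string): string {
  return manifestXml.split(devUrl).join(prodUrl);
}

const sample = '<SourceLocation DefaultValue="https://localhost:3000/taskpane.html"/>';
// "https://contoso.example" is a placeholder, not a real deployment URL.
console.log(rewriteManifestUrls(sample, "https://localhost:3000", "https://contoso.example"));
// prints <SourceLocation DefaultValue="https://contoso.example/taskpane.html"/>
```

Using split and join replaces all occurrences, which matters because the manifest references the local server in several places (icons, commands page, task pane).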

If you want to start the process to make your add-in available to a broader audience, Visual Studio Code is still your best friend. The following document will guide you on how to leverage the Azure extension for Visual Studio Code to generate the distributable version of the add-in and to publish it on Azure Storage, which is a perfect match for our scenario since the add-in is made up only of static web content.

Once you have published your add-in, you can open any Outlook version (desktop, web, etc.), click on Get add-ins, move to the My add-ins section and, under Custom addins, click on Add a custom add-in. From there, you can either pick up the manifest.xml file of your project or specify the URL of the manifest published on Azure Storage. This way, people will be able to add it to Outlook without needing you to share any file.

Wrapping up

In this blog post we have learned how we can help users to be more productive by infusing software with the latest powerful AI models created by Open AI. Specifically, we focused on the Microsoft 365 ecosystem, by providing an Outlook "copilot" to help you write more professional business mails. And this is just the beginning! We know that new and powerful models are already available (like ChatGPT) and that Microsoft will directly offer new integrations. It's an exciting time to work in the tech space 😃

You can find the definitive version of the add-in on GitHub, with a few improvements that we didn't discuss in this blog post, since they help to deliver a better UX but aren't strictly connected to the AI integration (like showing a progress indicator in the taskpane while OpenAI processes the input text).

Happy coding!

Updated Mar 07, 2023

Version 6.0

artificial intelligence

Matteo Pagani
