Using Microsoft Graph API to convert the format of your documents
Published May 01 2019 04:17 PM
Microsoft

First published on TECHNET on Oct 26, 2018
This post is a contribution from Jing Wang, an engineer with the SharePoint Developer Support team

Many SharePoint Online (SPO) customers want to convert Word documents and other files stored in SPO to PDF programmatically.
In the SharePoint Online user interface you can convert documents manually, one at a time. Users often want to convert many documents automatically instead, without opening each document library, locating each document, and clicking through the options for every file.

With the Microsoft Graph API endpoints listed below, this requirement can be automated with custom code written in, for example, C# or JavaScript.


GET /drive/items/{item-id}/content?format={format}
GET /drive/root:/{path and filename}:/content?format={format}
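As a minimal sketch of how these two endpoint shapes can be assembled in C#, the helper below builds both URLs. The class name ConversionUrls, its method names, and the choice of the /me prefix (the signed-in user's OneDrive, as used in the sample later in this post) are assumptions made for illustration.

```csharp
using System;

public static class ConversionUrls
{
    // Assumed prefix: the signed-in user's OneDrive, as in the sample in this post.
    private const string Root = "https://graph.microsoft.com/v1.0/me";

    // GET /drive/items/{item-id}/content?format={format}
    public static string ByItemId(string itemId, string format = "pdf")
    {
        return Root + "/drive/items/" + itemId + "/content?format=" + format;
    }

    // GET /drive/root:/{path and filename}:/content?format={format}
    public static string ByPath(string path, string format = "pdf")
    {
        // Escape each path segment separately so the "/" separators survive.
        string[] segments = path.Trim('/').Split('/');
        for (int i = 0; i < segments.Length; i++)
        {
            segments[i] = Uri.EscapeDataString(segments[i]);
        }

        return Root + "/drive/root:/" + string.Join("/", segments)
            + ":/content?format=" + format;
    }
}
```

For example, ConversionUrls.ByPath("orange.docx") yields the same URL that the sample below requests.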


Format options
The following values are valid for the format parameter:
Format value: pdf
Description: Converts the item into PDF format.
Supported source extensions: csv, doc, docx, odp, ods, odt, pot, potm, potx, pps, ppsx, ppsxm, ppt, pptm, pptx, rtf, xls, xlsx
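When converting many files in one run, it can help to skip files whose extension is not in the supported list before calling the endpoint. The sketch below is one way to do that. The class and method names are hypothetical, and the extension list is copied from the values above, so it may lag behind the service.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class PdfConversionSupport
{
    // Source extensions copied from the list above (case-insensitive match).
    private static readonly HashSet<string> Supported = new HashSet<string>(
        new[]
        {
            "csv", "doc", "docx", "odp", "ods", "odt", "pot", "potm", "potx",
            "pps", "ppsx", "ppsxm", "ppt", "pptm", "pptx", "rtf", "xls", "xlsx"
        },
        StringComparer.OrdinalIgnoreCase);

    // Returns true when the file name has an extension from the supported list.
    public static bool CanConvertToPdf(string fileName)
    {
        // GetExtension returns the extension with its leading dot, or "" if none.
        string extension = Path.GetExtension(fileName);
        return extension.Length > 1 && Supported.Contains(extension.Substring(1));
    }
}
```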


See details of the endpoint here:
https://developer.microsoft.com/en-us/graph/docs/api-reference/v1.0/api/driveitem_get_content_forma...

Sample - Complete solution in C#:

Step I. Create a native app in the Azure portal and grant it permissions for the Graph API.







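Step II. Create a Console Application and add references to the following DLLs. They can be installed as NuGet packages.
Microsoft.IdentityModel.Clients.ActiveDirectory.dll (NuGet package Microsoft.IdentityModel.Clients.ActiveDirectory)
Newtonsoft.Json.dll (NuGet package Newtonsoft.Json)
The CSOM upload at the end of the sample also needs Microsoft.SharePoint.Client.dll (NuGet package Microsoft.SharePointOnline.CSOM).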

Add the following using statements:


using Microsoft.IdentityModel.Clients.ActiveDirectory;
using Newtonsoft.Json;


Step III. Implement the code that uses the Graph API to convert the document, downloads the converted file locally, and uploads it back to an SPO site.
Note: The ADAL library is used for authentication.

Source code:
-------------------------------


using Microsoft.IdentityModel.Clients.ActiveDirectory;
using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Security.Cryptography.X509Certificates;
using System.Text;
using System.Threading.Tasks;
using System.Net;
using System.Security.Claims;
using System.IO;
using Microsoft.SharePoint.Client;
using System.Security;


namespace ConsoleApp1
{
    public static class StreamExtensions
    {
        // Helper that reads an entire stream into a byte array.
        public static byte[] ReadAllBytes(this Stream instream)
        {
            if (instream is MemoryStream)
                return ((MemoryStream)instream).ToArray();

            using (var memoryStream = new MemoryStream())
            {
                instream.CopyTo(memoryStream);
                return memoryStream.ToArray();
            }
        }
    }

    class Program
    {
        private static string TENANT_NAME = "mycompany.onmicrosoft.com";
        private static string resource = "https://graph.microsoft.com";
        private static string loginname = "user@mycompany.onmicrosoft.com";
        private static string loginpassword = "*********";
        private static string AzureTenantID = "********-f247-4d48-a45d-************";
        private static string spositeUrl = "https://mycompany.sharepoint.com/*********";
        private static string destinationDocumentLibrary = "dl1";

        static void Main(string[] args)
        {
            // Acquire a user token with ADAL (username/password flow).
            // The second argument of AcquireTokenAsync is the client (application) ID of the Azure AD app.
            UserPasswordCredential userPasswordCredential = new UserPasswordCredential(loginname, loginpassword);
            var graphauthority = "https://login.microsoftonline.com/" + AzureTenantID;
            AuthenticationContext authContext = new AuthenticationContext(graphauthority);
            var token = authContext.AcquireTokenAsync(resource, "94b1544c-35e8-4d45-a941-c3dbaab283dc", userPasswordCredential).Result.AccessToken;

            // Request the conversion. Auto-redirect is disabled so that the Location header
            // of the response (the download URL of the converted file) can be read.
            HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create("https://graph.microsoft.com/v1.0/me/drive/root:/orange.docx:/content?format=pdf");
            //HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create("https://graph.microsoft.com/v1.0/drives/b!zMNDej1sNEG0SanRDltXfAVTYcdt1pdIggMBPYZYp9Wgdi3ir9sFQJXof6...");
            myHttpWebRequest.AllowAutoRedirect = false;
            myHttpWebRequest.Headers.Set("Authorization", ("Bearer " + token));
            HttpWebResponse myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse();
            string downloadPath = myHttpWebResponse.GetResponseHeader("Location");
            myHttpWebResponse.Close();
            Console.WriteLine("Download PDF file from here:\n " + downloadPath);

            // Get the file stream from the Location URL and save a local copy.
            HttpWebRequest HttpWebRequest_download = (HttpWebRequest)WebRequest.Create(downloadPath);
            HttpWebRequest_download.Accept = "blob";

            var response = (HttpWebResponse)HttpWebRequest_download.GetResponse();
            Stream myStream = response.GetResponseStream();
            // targetFile stays open because it is reused below as the upload stream.
            FileStream targetFile = new FileStream("C:\\temp\\orange_converted_localcopy.pdf", FileMode.Create);
            myStream.CopyTo(targetFile);
            myStream.Close();
            response.Close();

            // You can continue to use the Graph API to upload the document back to OneDrive or another SPO site.
            // Since we already used loginname/password above, we use simple CSOM to upload the file to another SPO site as a quick demo.
            using (var clientContext = new ClientContext(spositeUrl))
            {
                SecureString passWord = new SecureString();
                foreach (char c in loginpassword.ToCharArray()) passWord.AppendChar(c);

                clientContext.Credentials = new SharePointOnlineCredentials(loginname, passWord);
                var web = clientContext.Web;
                clientContext.Load(web);
                clientContext.ExecuteQuery();

                List dl = web.Lists.GetByTitle(destinationDocumentLibrary);
                clientContext.Load(dl);
                clientContext.ExecuteQuery();

                // Upload the converted file to the SPO site.
                // Rewind the local file so the upload reads it from the beginning.
                targetFile.Position = 0;
                var fci = new FileCreationInformation
                {
                    Url = "orange_converted_spocopy.pdf",
                    ContentStream = targetFile,
                    Overwrite = true
                };
                Folder folder = dl.RootFolder;
                FileCollection files = folder.Files;
                Microsoft.SharePoint.Client.File file = files.Add(fci);
                clientContext.Load(files);
                clientContext.Load(file);
                clientContext.ExecuteQuery();

                targetFile.Close();

                Console.WriteLine("Converted file is uploaded to SPO site - orange_converted_spocopy.pdf");
                Console.ReadKey();
            }
        }
    }
}
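The same two-step flow (request the conversion without following the redirect, then download from the Location URL) can also be sketched with HttpClient and async calls. This is an illustrative alternative rather than the code used above. The class PdfDownloader and its method names are hypothetical, and the access token is assumed to have been acquired already.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public static class PdfDownloader
{
    // Creates a client that does not follow redirects, so the Location header can be read.
    public static HttpClient CreateClient()
    {
        return new HttpClient(new HttpClientHandler { AllowAutoRedirect = false });
    }

    // Extracts the download URL from the 302 response returned by the conversion endpoint.
    public static Uri ReadDownloadUrl(HttpResponseMessage response)
    {
        if (response.StatusCode != HttpStatusCode.Found || response.Headers.Location == null)
        {
            throw new InvalidOperationException(
                "Expected a 302 redirect with a Location header, got " + (int)response.StatusCode);
        }
        return response.Headers.Location;
    }

    // Requests the conversion, then downloads the converted file to localPath.
    public static async Task ConvertAndDownloadAsync(
        HttpClient client, string conversionUrl, string accessToken, string localPath)
    {
        Uri downloadUrl;
        using (var request = new HttpRequestMessage(HttpMethod.Get, conversionUrl))
        {
            request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", accessToken);
            using (HttpResponseMessage redirect = await client.SendAsync(request))
            {
                downloadUrl = ReadDownloadUrl(redirect);
            }
        }

        // As in the sample above, no Authorization header is sent to the download URL.
        using (HttpResponseMessage download = await client.GetAsync(downloadUrl))
        {
            download.EnsureSuccessStatusCode();
            using (Stream source = await download.Content.ReadAsStreamAsync())
            using (var target = new FileStream(localPath, FileMode.Create))
            {
                await source.CopyToAsync(target);
            }
        }
    }
}
```

Keeping ReadDownloadUrl separate makes the redirect check easy to exercise without a network call.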


Here is the converted file downloaded locally.



The converted pdf is also uploaded to this SPO site:
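The sample uploads with CSOM, but its comments mention that the Graph API could be used for the upload as well. A rough sketch of that option follows, assuming the simple path-based upload (PUT .../root:/{filename}:/content), which is intended for small files. GraphUpload, BuildUploadRequest, and the parameter names are hypothetical, and the drive ID is inserted into the URL as-is.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;

public static class GraphUpload
{
    // Builds a PUT request that uploads a PDF to the root folder of the given drive.
    public static HttpRequestMessage BuildUploadRequest(
        string driveId, string fileName, Stream content, string accessToken)
    {
        string url = "https://graph.microsoft.com/v1.0/drives/" + driveId
            + "/root:/" + Uri.EscapeDataString(fileName) + ":/content";

        var request = new HttpRequestMessage(HttpMethod.Put, url);
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", accessToken);
        request.Content = new StreamContent(content);
        request.Content.Headers.ContentType = new MediaTypeHeaderValue("application/pdf");
        return request;
    }
}
```

The request would then be sent with HttpClient.SendAsync, and the response status checked before the local stream is closed.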



While generating the URL for the Graph API endpoint, I found it tricky to identify the drive ID of a specific SPO site, so here is the approach I used.

First, use the following URL format in Graph Explorer to retrieve the drive ID:
https://graph.microsoft.com/v1.0/sites/[spositehostname]:/[sites/pub]:/drive

For example, if the SPO site URL is:
https://mycompany.sharepoint.com/sites/testsite

The URL to put in Graph Explorer is:
https://graph.microsoft.com/v1.0/sites/mycompany.sharepoint.com:/sites/testsite:/drive
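The mapping from a site URL to the Graph Explorer URL can be expressed as a small helper. This is a sketch under the assumption that the site URL always includes a server-relative path such as /sites/testsite. The names DriveLookup and BuildDefaultDriveUrl are made up for illustration.

```csharp
using System;

public static class DriveLookup
{
    // Turns https://{hostname}/{site path} into
    // https://graph.microsoft.com/v1.0/sites/{hostname}:/{site path}:/drive
    public static string BuildDefaultDriveUrl(string siteUrl)
    {
        var site = new Uri(siteUrl);
        string relativePath = site.AbsolutePath.Trim('/');
        if (relativePath.Length == 0)
        {
            // This helper only covers sites addressed by a server-relative path.
            throw new ArgumentException(
                "Site URL must include a path such as /sites/testsite.", "siteUrl");
        }

        return "https://graph.microsoft.com/v1.0/sites/" + site.Host
            + ":/" + relativePath + ":/drive";
    }
}
```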



The output contains the drive ID in the "id" property:


{
  "@odata.context": "https://graph.microsoft.com/beta/$metadata#drives/$entity",
  "createdDateTime": "2017-12-04T19:48:25Z",
  "description": "This system library was created by the Publishing feature to store documents that are used on pages in this site.",
  "id": "b!zMNDej1sNEG0SanRDltXfAVTYcdt1pdIggMBPYZYp9Wgdi3ir9sFQJXof6j8GNUD",
  "lastModifiedDateTime": "2018-10-11T04:08:37Z",
  "name": "Documents",
  "webUrl": "https://mycompany.sharepoint.com/sites/****/Documents",
  "driveType": "documentLibrary",
  "createdBy": {
    "user": {
      "displayName": "System Account"
    }
  }
}
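If the lookup is done from code instead of Graph Explorer, the drive ID can be read from the JSON response with Newtonsoft.Json, which this sample already references. A minimal sketch follows, where DriveResponse and ReadDriveId are hypothetical names.

```csharp
using Newtonsoft.Json.Linq;

public static class DriveResponse
{
    // Returns the value of the top-level "id" property, or null if it is missing.
    public static string ReadDriveId(string json)
    {
        JObject drive = JObject.Parse(json);
        return (string)drive["id"];
    }
}
```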


I have a file named “Repro.docx” in the root of the above drive:



So the file’s conversion endpoint is:
https://graph.microsoft.com/v1.0/drives/b!zMNDej1sNEG0SanRDltXfAVTYcdt1pdIggMBPYZYp9Wgdi3ir9sFQJXof6...
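The URL above is cut off in this copy of the post. As a sketch only, the helper below assumes it follows the path-based conversion pattern from the start of the post under a /drives/{drive-id} prefix. DriveConversionUrl and its parameters are hypothetical, and the drive ID is inserted as-is.

```csharp
using System;

public static class DriveConversionUrl
{
    // Assumed pattern: /drives/{drive-id}/root:/{path and filename}:/content?format={format}
    public static string Build(string driveId, string pathInDrive, string format)
    {
        // Escape each path segment separately so the "/" separators survive.
        string[] segments = pathInDrive.Trim('/').Split('/');
        for (int i = 0; i < segments.Length; i++)
        {
            segments[i] = Uri.EscapeDataString(segments[i]);
        }

        return "https://graph.microsoft.com/v1.0/drives/" + driveId
            + "/root:/" + string.Join("/", segments)
            + ":/content?format=" + format;
    }
}
```

Calling DriveConversionUrl.Build with the drive ID from the lookup output, "Repro.docx", and "pdf" gives a URL that can be passed to WebRequest.Create in the sample.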

2 Comments
Copper Contributor

Hello, is this feature down? It seems to not be working any more. I am getting an internal error from Office services stating HttpCodeNotfound for https://excelcs.officeapps.live.com/document/export/pdf

Copper Contributor

Hi,

docx to PDF works for me with this code, but the conversion doesn't seem to update field references (e.g. page number of a bookmark).  Is there some way for the field references to be updated during the conversion?

Thanks

Version history
Last update: Sep 01 2020 02:27 PM