Microsoft Foundry Blog

How to build a personal finance app using Azure

Microsoft

Jan 25, 2021

AI allows you to deliver breakthrough experiences in your apps. With Azure Cognitive Services, you can easily customize and deploy the same AI models that power Microsoft’s products, such as Xbox and Bing, using the tools and languages of your choice.

In this blog we will walk through an exercise that you can complete in under an hour, building an application that can be useful to you while exploring a set of Azure services. If you have ever wanted to get your financial transactions in order, look no further. With this exercise, we’ll explore how to quickly snap a picture of a receipt with your phone and upload it so that it can be categorized, included in expense reports, and used to gain insight into your spending. Remember, even though we’ll walk you through each step, you can always explore the sample code and get creative with your own unique solution!

Features of the application:

Snap a picture of your receipt and upload it using your smartphone
Extract relevant data from the images: Who issued the receipt? What was the total amount? What was purchased? All of this information can be effortlessly stored for exploration
Query the data: bring your receipts to life by extracting relevant and insightful information

Prerequisites

If you don't have an Azure subscription, create a free account before you begin. If you have a subscription, log in to the Azure Portal.
You will need to have Python installed locally to run some of the samples.

Key Azure technologies:

Azure Form Recognizer scans image documents with optical character recognition and extracts text, key/value pairs, and tables from documents, receipts, and forms.
Form Recognizer’s prebuilt receipt model specifically extracts receipt data
Azure Blob Storage is used to store data
Azure Cognitive Search enriches the data by making it easily identifiable

Solution Architecture

App Architecture Description:

User uploads a receipt image from their mobile device
The uploaded image is verified and then sent to the Azure Form Recognizer to extract information
The image is analysed by the REST API within the Form Recognizer prebuilt receipt model
A JSON is returned that has both the text information and bounding box coordinates of the extracted receipt data
The resulting JSON is parsed and a simpler JSON is formed, saving only the relevant information needed
This receipt JSON is then stored in Azure Blob Storage
Azure Cognitive Search points directly to Azure Blob Storage and is used to index the data
The application queries this search index to extract relevant information from the receipts

Another visual of the flow of data within the solution architecture is shown below.

Now that we’ve explored the technology and services we’ll be using, let’s dive into building our app!

Implementation

To get started, data from receipts must be extracted; this is done by setting up the Form Recognizer service in Azure and connecting to the service to use the relevant API for receipts. A JSON is returned that contains the information extracted from receipts and is stored in Azure Blob Storage to be used by Azure Cognitive Search. Cognitive Search is then utilized to index the receipt data, and to search for relevant information.

High-level overview of the steps, along with sample code snippets for illustration:

Go to the Azure portal and create a new Form Recognizer resource. In the Create pane, provide the following information:

Name	A descriptive name for your resource.
Subscription	Select the Azure subscription which has been granted access.
Location	The location of your cognitive service instance. Different locations may introduce latency, but have no impact on the runtime availability of your resource.
Pricing Tier	The cost of your resource depends on the pricing tier you choose and your usage. For more information, see the API pricing details.
Resource Group	The Azure resource group that will contain your resource. You can create a new group or add it to a pre-existing group.

After Form Recognizer deploys, go to All Resources and locate the newly deployed resource. Save the key and endpoint from the resource’s key and endpoint page somewhere so you can access them later.

You can use the following Analyze Receipt API to start analyzing the receipt. Remember to replace <endpoint url> and <subscription key> with the values you saved earlier, and replace <path to your receipt> with the local path to your scanned receipt image.

# Analyse script

import json
import time
from requests import get, post

# Endpoint URL
endpoint = r"<endpoint url>"
apim_key = "<subscription key>"
post_url = endpoint + "/formrecognizer/v2.0/prebuilt/receipt/analyze"
source = r"<path to your receipt>"

headers = {
    # Request headers
    'Content-Type': 'image/jpeg',
    'Ocp-Apim-Subscription-Key': apim_key,
}

params = {
    "includeTextDetails": True
}

with open(source, "rb") as f:
    data_bytes = f.read()

try:
    resp = post(url=post_url, data=data_bytes, headers=headers, params=params)
    if resp.status_code != 202:
        print("POST analyze failed:\n%s" % resp.text)
        quit()
    print("POST analyze succeeded:\n%s" % resp.headers)
    get_url = resp.headers["operation-location"]
except Exception as e:
    print("POST analyze failed:\n%s" % str(e))
    quit()

If you run this code and everything works, you'll receive a 202 (Accepted) response that includes an Operation-Location header, which the script prints to the console along with the other response headers. This header contains an operation ID that you can use to query the status of the asynchronous operation and get the results. In the following example value, the string after operations/ is the operation ID.

https://cognitiveservice/formrecognizer/v2.0/prebuilt/receipt/operations/54f0b076-4e38-43e5-81bd-b85b8835fdfb
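
If you want to log or store the operation ID on its own, it can be pulled out of the header value with the standard library. This is a minimal sketch; the helper name operation_id_from_location is hypothetical and not part of the original sample, and it simply assumes the ID is the last path segment, as in the example URL above.

```python
from urllib.parse import urlparse


def operation_id_from_location(operation_location: str) -> str:
    """Return the last path segment of an Operation-Location URL.

    Hypothetical helper: assumes the operation ID is the final path segment.
    """
    path = urlparse(operation_location).path
    return path.rstrip("/").rsplit("/", 1)[-1]


# The example header value shown in this post.
example = (
    "https://cognitiveservice/formrecognizer/v2.0/prebuilt/receipt/"
    "operations/54f0b076-4e38-43e5-81bd-b85b8835fdfb"
)
print(operation_id_from_location(example))
```

The polling script does not need the ID by itself, since it calls the full URL from the header directly.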

Now you can call the Get Analyze Receipt Result API to get the Extracted Data.

# Get results.
n_tries = 10
n_try = 0
wait_sec = 6
while n_try < n_tries:
    try:
        resp = get(url = get_url, headers = {"Ocp-Apim-Subscription-Key": apim_key})
        resp_json = json.loads(resp.text)
        if resp.status_code != 200:
            print("GET Receipt results failed:\n%s" % resp_json)
            quit()
        status = resp_json["status"]
        if status == "succeeded":
            print("Receipt Analysis succeeded:\n%s" % resp_json)
            quit()
        if status == "failed":
            print("Analysis failed:\n%s" % resp_json)
            quit()
        # Analysis still running. Wait and retry.
        time.sleep(wait_sec)
        n_try += 1
    except Exception as e:
        msg = "GET analyze results failed:\n%s" % str(e)
        print(msg)
        quit()

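This code calls the URL from the Operation-Location header, which contains the operation ID, and polls it every few seconds until the analysis succeeds, fails, or the retry limit is reached.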

The JSON that is returned can be examined to get the required information: the ‘readResults’ field contains all lines of text that were decipherable, and the ‘documentResults’ field contains key/value information for the most relevant parts of the receipt (e.g. the merchant, total, line items, etc.).
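
As a rough illustration of reading the ‘documentResults’ field, the sketch below pulls a few key/value pairs out of a response. It assumes the v2.0 response nests results under analyzeResult and that each field carries a typed value such as valueString or valueNumber next to its raw text; the helper names and the hand-written sample are illustrative, not real service output.

```python
def field_value(field):
    """Return the typed value of a field, falling back to its raw text."""
    if field is None:
        return None
    for key in ("valueString", "valueDate", "valueTime", "valueNumber"):
        if key in field:
            return field[key]
    return field.get("text")


def extract_receipt_fields(resp_json):
    """Collect selected key/value pairs from the first analyzed document."""
    documents = resp_json.get("analyzeResult", {}).get("documentResults", [])
    if not documents:
        return {}
    fields = documents[0].get("fields", {})
    names = ("MerchantName", "TransactionDate", "TransactionTime", "Total")
    return {name: field_value(fields.get(name)) for name in names}


# Hand-written sample in the assumed response shape (not real output).
sample = {
    "status": "succeeded",
    "analyzeResult": {
        "documentResults": [{
            "docType": "prebuilt:receipt",
            "fields": {
                "MerchantName": {"type": "string", "valueString": "THE MAD HUNTER"},
                "TransactionDate": {"type": "date", "valueDate": "2020-08-23"},
                "TransactionTime": {"type": "time", "valueTime": "22:07:00"},
                "Total": {"type": "number", "valueNumber": 107.10, "text": "£107.10"},
            },
        }]
    },
}
print(extract_receipt_fields(sample))
```

Fields missing from a receipt come back as None, so downstream code should be ready for gaps.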

The receipt image below resulted in JSON from which we extracted the following details:
```
 MerchantName: THE MAD HUNTER 
 TransactionDate: 2020-08-23 
 TransactionTime: 22:07:00 
 Total: £107.10 
```

We will now create a JSON from all the data extracted from the analysed receipt. The structure of the JSON is shown below:

{
   "id":"INV001",
   "user":"Sujith Kumar",
   "createdDateTime":"2020-10-23T17:16:32Z",
   "MerchantName":"THE MAD HUNTER",
   "TransactionDate":"2020-10-23",
   "TransactionTime":"22:07:00",
   "currency":"GBP",
   "Category":"Entertainment",
   "Total":"107.10",
   "Items":[	]
}
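
One way to assemble this simplified JSON in Python is sketched below. The function name build_receipt_document and its parameters are hypothetical; the sketch assumes the extracted total is numeric (or a plain numeric string) and that the category and currency are supplied by your own application logic.

```python
import json
from datetime import datetime, timezone


def build_receipt_document(receipt_id, user, extracted, category, currency, items=None):
    """Build the simplified receipt JSON in the structure shown above."""
    created = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    total = extracted.get("Total")
    return {
        "id": receipt_id,
        "user": user,
        "createdDateTime": created,
        "MerchantName": extracted.get("MerchantName"),
        "TransactionDate": extracted.get("TransactionDate"),
        "TransactionTime": extracted.get("TransactionTime"),
        "currency": currency,
        "Category": category,
        # Stored as a two-decimal string, matching the sample structure.
        "Total": "" if total is None else f"{float(total):.2f}",
        "Items": list(items or []),
    }


# Illustrative input values only.
extracted = {
    "MerchantName": "THE MAD HUNTER",
    "TransactionDate": "2020-10-23",
    "TransactionTime": "22:07:00",
    "Total": 107.10,
}
document = build_receipt_document(
    "INV001", "Sujith Kumar", extracted, "Entertainment", "GBP"
)
print(json.dumps(document, indent=3))
```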

We can now save this JSON and build a search service to extract the information we want from it.

Before continuing to the next step, you must have an Azure Storage Account with Blob storage.

We will now save the JSON files in an Azure Blob Storage container and use it as a source for the Azure Cognitive Search Service Index that we will create.
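
Saving a receipt JSON to the container could be sketched as follows. This is one possible approach rather than the method used by the sample app: it assumes you have generated a container-level SAS URL with write permission and uses the Blob Storage Put Blob REST operation through Python's standard library. The function names and the placeholder URL are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote


def build_put_blob_request(container_sas_url, blob_name, document):
    """Build a Put Blob request that uploads `document` as a JSON blob."""
    base, _, sas_token = container_sas_url.partition("?")
    url = f"{base.rstrip('/')}/{quote(blob_name)}?{sas_token}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "x-ms-blob-type": "BlockBlob",  # required by Put Blob
            "Content-Type": "application/json",
        },
    )


def upload_receipt(container_sas_url, blob_name, document):
    """Send the request; Put Blob answers 201 (Created) on success."""
    request = build_put_blob_request(container_sas_url, blob_name, document)
    with urllib.request.urlopen(request) as response:
        return response.status


# Placeholder SAS URL: building the request makes no network call.
request = build_put_blob_request(
    "https://example.blob.core.windows.net/receipts?sv=placeholder",
    "INV001.json",
    {"id": "INV001", "Total": "107.10"},
)
print(request.get_method(), request.full_url)
```

Call upload_receipt with your real SAS URL to perform the upload. The azure-storage-blob SDK is an alternative if you prefer not to work with the REST API directly.
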
Sign in to the Azure Portal and search for "Azure Cognitive Search" or navigate to the resource through Web > Azure Cognitive Search. Follow the steps to:

Choose a subscription
Set a resource group
Name the service appropriately
Choose a location
Choose a pricing tier for this service
Create your service
Get a key and URL endpoint

We will use the free tier of the Azure Cognitive Search service, which means you can create three indexes, three data sources and three indexers. The dashboard will show you how many of each you have left. For this exercise you will create one of each.

In the portal, find the search service you created above and click Import data on the command bar to start the wizard. In the wizard, click on Connect to your data and specify the name, type, and connection information. Skip the ‘Enrich Content’ page and go to Customize Target Index.
For this exercise, we will use the wizard to generate a basic index for our receipt data. Minimally, an index requires a name and a fields collection; one of the fields should be marked as the document key to uniquely identify each document.

Fields have data types and attributes. The check boxes across the top are index attributes controlling how the field is used.

Retrievable means that the field shows up in the search results list. You can mark individual fields as off limits for search results by clearing this checkbox.
Key is the unique document identifier. It's always a string, and it is required.
Filterable, Sortable, and Facetable determine whether fields are used in a filter, sort, or faceted navigation structure.
Searchable means that a field is included in full text search. Only string fields are searchable.

Make sure you choose the following fields:

id
user
createdDateTime
MerchantName
TransactionDate
TransactionTime
currency
Category
Total
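
For reference, the index that the wizard generates corresponds roughly to a definition like the one built below. This is a sketch under assumptions: the index name receipts-index is hypothetical, field names follow the sample receipt JSON, every field is typed as Edm.String because the sample JSON stores all values as strings, and all attributes are switched on for simplicity.

```python
import json

FIELD_NAMES = [
    "id", "user", "createdDateTime", "MerchantName", "TransactionDate",
    "TransactionTime", "currency", "Category", "Total",
]


def build_index_definition(index_name, field_names, key_field="id"):
    """Build an index definition with one key field and string fields."""
    fields = []
    for name in field_names:
        is_key = name == key_field
        fields.append({
            "name": name,
            "type": "Edm.String",
            "key": is_key,
            "retrievable": True,
            "searchable": True,
            "filterable": True,
            "sortable": True,
            "facetable": not is_key,  # faceting on a unique key is not useful
        })
    return {"name": index_name, "fields": fields}


index_definition = build_index_definition("receipts-index", FIELD_NAMES)
print(json.dumps(index_definition, indent=2))
```

If you plan to sum or range-filter amounts inside the search service itself, a numeric type for Total would be a better fit, but that also requires storing the value as a number in the JSON.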

Still in the Import data wizard, click Indexer > Name, and type a name for the indexer.

This object defines an executable process. For now, use the default option (Once) to run the indexer once, immediately.

Click Submit to create and simultaneously run the indexer.

Soon you should see the newly created indexer in the list, with status indicating "in progress" or success, along with the number of documents indexed.

The main service page provides links to the resources created in your Azure Cognitive Search service. To view the index you just created, click Indexes from the list of links.

Click on the index (azureblob-index in this case) from the list of links and view the index schema.

Now you should have a search index that you can use to query the receipt data that’s been extracted from the uploaded receipts.

Click Search explorer.

From the index drop down choose the relevant index. Choose the default API Version (2020-06-30) for this exercise.

In the search bar, paste a query string (for example, category='Entertainment').

You will get results as verbose JSON documents as shown below:

Now that you have built a search index and an indexer and pointed them at your data, you can build queries programmatically and extract information to answer questions such as the following:

How much did I spend last Thursday?
How much have I spent on entertainment over the last quarter?
Did I spend anything at ‘The Crown and Pepper’ last month?
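
A question such as how much was spent on entertainment could be answered programmatically along these lines. The sketch assumes the Search Documents REST endpoint with API version 2020-06-30, a Category field marked Filterable, and totals stored as strings as in the sample JSON; the service URL, index name, key, and function names are placeholders.

```python
import json
import urllib.request
from decimal import Decimal

API_VERSION = "2020-06-30"


def build_search_request(service_url, index_name, api_key, category):
    """Build a POST search request filtered to a single category."""
    url = (
        f"{service_url.rstrip('/')}/indexes/{index_name}"
        f"/docs/search?api-version={API_VERSION}"
    )
    escaped = category.replace("'", "''")  # OData escapes ' by doubling it
    body = {
        "search": "*",
        "filter": f"Category eq '{escaped}'",
        "select": "MerchantName,TransactionDate,Total",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json", "api-key": api_key},
    )


def total_spent(search_response):
    """Sum the Total field over the documents in a search response."""
    documents = search_response.get("value", [])
    return sum((Decimal(doc["Total"]) for doc in documents), Decimal("0"))


# Hand-written response in the assumed shape (not real service output).
sample_response = {"value": [{"Total": "107.10"}, {"Total": "12.50"}]}
print(total_spent(sample_response))
```

To run a real query, pass the request to urllib.request.urlopen and parse the body with json.load before calling total_spent. Date-based questions would need an additional filter on TransactionDate.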

Additional Ideas

In addition to the services and functionalities used throughout this exercise, there are numerous other ways you can use Azure AI to build in support for all kinds of receipts or invoices. For example, the logo extractor can be used to identify logos of popular restaurants or hotel chains, and the business card model can ingest business contact information just as easily as we saw with receipts.

We encourage you to explore some of the following ideas to enrich your application:

Search invoices for specific line items
Train the models to recognize different expense categories such as entertainment, supplies, etc.
Add Language Understanding (LUIS) to ask your app questions in natural language and extract formatted reports
Add Azure QnA Maker to your app and get insights such as how much you spent on entertainment last month, or other categories of insights you’d like to explore

Updated Jan 25, 2024

Version 7.0

azure ai document intelligence
