FinOps Framework
26 TopicsNews and updates from FinOps X 2024: How Microsoft is empowering organizations
Last year, I shared a broad set of updates that showcased how Microsoft is embracing FinOps practitioners through education, product improvements, and innovative solutions that help organizations achieve more. with AI-powered experiences like Copilot and Microsoft Fabric. Whether you’re an engineer working in the Azure portal or part of a business or finance team collaborating in Microsoft 365 or analyzing data in Power BI, Microsoft Cloud has the tools you need to accelerate business value for your cloud investments.11KViews8likes0CommentsMicrosoft Cost Management: Billing & trust relationships explained
As a Microsoft Cloud Solution Architect supporting our global enterprise customers on FinOps and Microsoft Cost Management topics I am often involved in conversations with customers explaining the different relationships existing in Microsoft Cost Management. In this article I will shed some light on key relationships for commercial enterprise customers to provide clarity. The knowledge provided in this article can be especially helpful for those customers currently planning to transition from a Microsoft Enterprise Agreement contract towards a Microsoft Customer Agreement. Building blocks for Cost Management relationships Before we start looking into the individual relationships, we need to understand the building blocks of these Cost Management relationships: Billing contracts – Enterprise Agreement (EA) and Microsoft Customer Agreement (MCA) providing legal and commercial terms as well as certain technical capabilities like Billing Roles. Microsoft Azure Consumption Commitment (MACC) as a contractual agreement where an organization commits to spending a certain amount on Azure services over a specified period and might gets a certain discount on their usage. Billing roles provided by each contract like Enterprise Administrator or Billing Account Owner to manage the contract. Tenants based on Microsoft Entra ID providing entities access to Billing Roles and Azure Resources Azure subscriptions as containers to deploy and manage Azure Resources at specific cost. Azure customer price sheet (CPS) determining the current Customer prices for specific Azure Resources. Key Cost Management relationships in EA and MCA contracts Customer to contract Customer Organizations can sign multiple contracts with Microsoft. Even though it is generally advised to have a 1:1 relationship between Customer and Microsoft contract, Customer can theoretically use multiple contracts at the same time. Examples Best example for this is during a transition period from EA to MCA, where the customer has an active EA and an active MCA contract. During M&A activities a customer organization might end up with multiple EA or MCA contracts. Contract to price sheet Every contract is associated with a customer specific price sheet determining the individual customer prices in the agreed billing currency. In MCA the default associated price sheet basically equals the Azure Retail price list in USD. The price sheet can be accessed in the Azure portal. Contract to Microsoft Azure Consumption Commitment (MACC) Customers can sign a MACC. There is usually a 1:1 relationship between a contract and a MACC. As a benefit of this commitment, Customers might get discounted pricing on Azure Resources (usage). This potential discount will be reflected in the price sheet associated with the contract. Contract to billing roles Both EA and MCA provide Customers with roles, that can manage certain aspects of the contract. These roles are called billing roles. Billing roles differ from EA to MCA and are described in detail here for EA and here for MCA. A key difference between EA and MCA is, that Customers can associate any valid work, school, or Microsoft account to EA billing roles but only work accounts of an approved tenant for MCA. Contract to tenant To manage billing roles customers must associate exactly one Entra ID tenant with the contract. This happens at contract setup. Identities within this Tenant can be assigned to billing roles within the contract. Example User Dirk from the contoso.com tenant can be assigned to the EA admin role in Contoso’s EA contract. In MCA this tenant is called the primary billing tenant. Only users from this tenant can be assigned to billing roles within the MCA contract. CAUTION: The tenant global administrator role is above the billing account administrator. Global administrators in a Microsoft Entra ID tenant can add or remove themselves as billing account administrators at any time to the Microsoft Customer Agreement. If you want to assign identities from tenants other than the primary billing tenant, you can add associated billing tenants. NOTE: Even though customers should strive for a single tenant, there is no restriction in how many tenants a customer can create within a contract. Contract to subscription An Azure subscription is a logical container used to provision and manage Azure resources. Resource access is managed by a trust relationship to an Entra ID tenant and billing is managed by a billing relationship to a Microsoft contract (EA or MCA). Every subscription can only have one billing relationship to one contract. This billing relationship can be moved to a different contract based on certain conditions (e.g. in EA/MCA transition scenarios or M&A scenarios). Every contract can manage 5000 subscriptions by default. NOTE: The billing relationship determines the prices for the consumed resources within the subscription. If you have a subscription that is associated with a contract that uses Azure retail prices you pay the retail price. If the associated contract has customer specific prices (e.g. by signing a MACC with applicable discounts), the resources within this subscription will be charged at these prices. Subscription to tenant Every subscription has a 1:1 trust relationship to an Entra ID tenant. This trust relationship determines, which identities can manage the resources within the subscription. Tenant to subscription Every tenant can manage trust relationships with virtually unlimited number of subscriptions. These subscriptions can use billing relationships to multiple contracts which might lead to different prices for resources deployed to these subscriptions. Example Contoso is using a single Entra ID tenant, has just signed a new MACC agreement and is migrating to MCA. At signing the MCA contract, the MACC is associated with the MCA account and not with the EA contract. During an EA to MCA transition period, some subscriptions still have a billing relationship to the old EA contract and others are already moved to the MCA contract. In this situation the same VM in the same region might be charged differently, depending on the subscription the VM is deployed to. If the subscription is still using the EA billing relationship, the price might be different (e.g. due to lack of applied discounts). Summary & key takeaways Relationships on a page Let’s summarize the described relationships: Fully understanding the described relationships can be very helpful in a variety of scenarios, especially when you are currently discussing the transition from an Enterprise Agreement contract towards a Microsoft Customer Agreement. Key takeaways Three key takeaways: Tenants do not determine the price of an Azure resource! Only the billing relationship between a subscription and a contract determines the price of a resource! Customers can use multiple tenants to manage their Azure resources. Only for managing billing roles there is usually a 1:1 relationship between the contract and an Entra ID tenant. Next steps If the provided information was helpful for you: Share this article within your FinOps, Finance and Cloud Competence Center teams. If you need in-depth support for planning the proper relationships and billing structures for your organization, please reach out to your local Microsoft representative. If you have a unified support contract, you can ask your Customer Success Account Manager (CSAM) to reach out to our global team of FinOps experts from the Culture & Cloud Experience (CCX) practice. This article lays the foundation for an upcoming post about how to manage the transition from EA to MCA from a FinOps perspective.1.3KViews4likes1CommentWhat’s new in FinOps toolkit 0.4 – July 2024
In July, the FinOps toolkit 0.4 added support for FOCUS 1.0, updated tools and resources to align with the FinOps Framework 2024 updates, introduced a new tool for cloud optimization recommendations called Azure Optimization Engine, and more!3.7KViews4likes1CommentGetting started with FinOps hubs: Multicloud cost reporting with Azure and Google Cloud
Microsoft’s FinOps hubs offer a powerful and trusted foundation for managing, analyzing, and optimizing cloud costs. Built on Azure Data Explorer (ADX) and Azure Data Lake Storage (ADLS), FinOps hubs provide a scalable platform to unify billing data across providers leveraging FOCUS datasets. In this post, we’ll take a hands-on technical walkthrough of how to connect Microsoft FinOps hubs to Google Cloud, enabling you to export and analyze Google Cloud billing data with Azure billing data directly within your FinOps hub instance. This walk through will focus on using only storage in the hub for accessing data and is designed to get you started and understand multicloud connection and reporting from FinOps hubs. For large datasets Azure Data Explorer or Microsoft Fabric is recommended. With the introduction of the FinOps Open Cost and Usage Specification (FOCUS), normalizing billing data and reporting it through a single-pane-of-glass experience has never been easier. As a long-time contributor to the FOCUS working group, I’ve spent the past few years helping define standards to make multicloud reporting simpler and more actionable. FOCUS enables side-by-side views of cost and usage data—such as compute hours across cloud providers—helping organizations make better decisions around workload placement and right-sizing. Before FOCUS, this kind of unified analysis was incredibly challenging due to the differing data models used by each provider. To complete this technical walk through, you’ll need access to both Azure and Google Cloud—and approximately 2 to 3 hours of your time. Getting started: The basics you’ll need Before diving in, ensure you have the following prerequisites in place: ✅ Access to your Google Cloud billing account. ✅ A Google Cloud project with BigQuery and other APIs enabled and linked to your billing account. ✅ All required IAM roles and permissions for working with BigQuery, Cloud Functions, Storage, and billing data (detailed below). ✅ Detailed billing export and pricing export configured in Google Cloud. ✅ An existing deployment of FinOps hubs. What you’ll be building In this walk-through, you’ll set up Google billing exports and an Azure Data Factory pipeline to fetch the exported data, convert to parquet and ingest into your FinOps hub storage, following the FOCUS 1.0 standard for normalization. Through the process you should end up creating: Configure a BigQuery view to convert detailed billing exports into the FOCUS 1.0 schema. Create a metadata table in BigQuery to track export timestamps to enable incremental data exports reducing file export sizes and avoiding duplicate data. Set up a GCS bucket to export FOCUS-formatted data in CSV format. Deploy a Google Cloud Function that performs incremental exports from BigQuery to GCS. Create a Google Cloud Schedule to automate the export of your billing data to Google Cloud Storage. Build a pipeline in Azure Data Factory to: Fetch the CSV billing exports from Google Cloud Storage. Convert them to Parquet. Ingest the transformed data into your FinOps hub ingestion container. Let's get started. We're going to start in Google Cloud before we jump back into Azure to setup the Data Factory pipeline. Enabling FOCUS exports in Google Prerequisite: Enable detailed billing & pricing exports ⚠️ You cannot complete this guide without billing data enabled in BigQuery if you have not enabled detailed billing exports and pricing exports, do this now and come back to the walk-through in 24 hours. Steps to enabled detailed billing exports and pricing exports: Navigate to Billing > Billing Export. Enable Detailed Cost Export to BigQuery. Select the billing project and a dataset - if you have not created a project for billing do so now. Enable Pricing Export to the same dataset. 🔊 “This enables daily cost and usage data to be streamed to BigQuery for granular analysis.” Detailed guidance and information on billing exports in Google can be found here: Google Billing Exports. What you'll create: Service category table Metadata table FOCUS View - you must have detailed billing export and pricing exports enabled to create this view Cloud Function Cloud Schedule What you'll need An active GCP Billing Account (this is your payment method). A GCP Project (new or existing) linked to your GCP billing account. All required APIs enabled. All required IAM roles enabled. Enable Required APIs: BigQuery API Cloud Billing API Cloud Build API Cloud Scheduler Cloud Functions Cloud Storage Cloud Run Admin Pub/Sub (optional, for event triggers) Required IAM Roles Assign the following roles to your user account: roles/billing.viewer or billing.admin roles/bigquery.dataEditor roles/storage.admin roles/cloudfunctions.admin roles/cloudscheduler.admin roles/iam.serviceAccountTokenCreator roles/cloudfunctions.invoker roles/run.admin (if using Cloud Run) roles/project.editor Create a service account (e.g., svc-bq-focus) Assign the following roles to your service account: roles/bigquery.dataOwner roles/storage.objectAdmin roles/cloudfunctions.invoker roles/cloudscheduler.admin roles/serviceAccount.tokenCreator roles/run.admin Your default compute service account will also require access to run cloud build services. Ensure you apply the cloud build role to your default compute service account in your project, it may look like this: projectID-compute@developer.gserviceaccount.com → roles/cloudbuild.builds.builder Create FOCUS data structure and View This section will create two new tables in Big Query, one for service category mappings and one for metadata related to export times. These are important to get in before we create the FOCUS view to extract billing data in FOCUS format. Create a service category mapping table I removed service category from the original google FOCUS view to reduce the size of the SQL Query, therefore, to ensure we mapped Service category properly, I created a new service category table and joined it to the FOCUS view. In this step we will create a new table using open data to map GCP services to service category. Doing this helps reduce the size of the SQL query and simplifies management of Service Category mapping. Leveraging open source data we can easily update service category mappings if they ever change or new categories are added without impacting the FOCUS view query. Process: Download the latest service_category_mapping.csv from the FOCUS Converter repo Go to BigQuery > Your Dataset > Create Table Upload the CSV Table name: service_category Schema: Auto-detect Create a metadata table This table will be used to track the last time detailed billing data was added to your detailed billing export, we use this to enable incremental exports of billing data through the FOCUS view to ensure we only export the latest set of data and not everything all the time. Process: Go to BigQuery > Create Table Table name: metadata_focus_export Schema: Field Name : Format last_export_time: TIMESTAMP export_message: STRING Enter your field name and then choose field format, do not add : 🔊 “Ensures each export only includes new data since the last timestamp.” Create the FOCUS-aligned view Creating a view in BigQuery allows us to un-nest the detailed billing export tables into the format of FOCUS 1.0*. To use Power BI we must un-nest the tables so this step is required. It also ensures we map the right columns in Google Cloud detailed billing export to the right columns in FOCUS. ⚠️ The FOCUS SQL code provided in this walk-through has been altered from the original Google provided code. I believe this new code is better formatted for FOCUS 1.0 than the original however it does contain some nuances that suit my personal views. Please evaluate this carefully before using this in a production system and adjust the code accordingly to your needs. Steps: Navigate to BigQuery > New Query Paste and update the FOCUS view SQL query which is provided below Replace: yourexporttable with detailed export dataset ID and table name that will look like " yourpricingexporttable with pricing export and table name your_billing_dataset with your detailed export dataset ID and table name FOCUS SQL Query: WITH usage_export AS ( SELECT *, ( SELECT AS STRUCT type, id, full_name, amount, name FROM UNNEST(credits) LIMIT 1 ) AS cud, FROM "Your-Detailed-Billing-Export-ID" -- replace with your detailed usage export table path ), prices AS ( SELECT export_time, sku.id AS sku_id, sku.description AS sku_description, service.id AS service_id, service.description AS service_description, tier.* FROM "your_pricing_export_id", UNNEST(list_price.tiered_rates) AS tier ) SELECT "111111-222222-333333" AS BillingAccountId, "Your Company Name" AS BillingAccountName, COALESCE((SELECT SUM(x.amount) FROM UNNEST(usage_export.credits) x),0) + cost as BilledCost, usage_export.currency AS BillingCurrency, DATETIME(PARSE_DATE("%Y%m", invoice.month)) AS BillingPeriodStart, DATETIME(DATE_SUB(DATE_ADD(PARSE_DATE("%Y%m", invoice.month), INTERVAL 1 MONTH), INTERVAL 1 DAY)) AS BillingPeriodEnd, CASE WHEN usage_export.adjustment_info.type IS NOT NULL and usage_export.adjustment_info.type !='' THEN 'Adjustment' WHEN usage_export.cud.type = 'PROMOTION' AND usage_export.cost_type = 'regular' AND usage_export.cud.id IS NOT NULL THEN 'Credit' WHEN usage_export.sku.description LIKE '%Commitment - 3 years - dollar based VCF%' or usage_export.sku.description LIKE '%Prepay Commitment%' THEN 'Purchase' WHEN usage_export.cud.id IS NOT NULL AND usage_export.cud.id != '' THEN 'Credit' WHEN usage_export.cost_type = 'regular' THEN 'Usage' WHEN usage_export.cost_type = 'tax' THEN 'Tax' WHEN usage_export.cost_type = 'adjustment' THEN 'Adjustment' WHEN usage_export.cost_type = 'rounding_error' THEN 'Adjustment' ELSE usage_export.cost_type END AS ChargeCategory, IF(COALESCE( usage_export.adjustment_info.id, usage_export.adjustment_info.description, usage_export.adjustment_info.type, usage_export.adjustment_info.mode) IS NOT NULL, "correction", NULL) AS ChargeClass, CASE WHEN usage_export.adjustment_info.type IS NOT NULL AND usage_export.adjustment_info.type != '' THEN usage_export.adjustment_info.type ELSE usage_export.sku.description END AS ChargeDescription, CAST(usage_export.usage_start_time AS DATETIME) AS ChargePeriodStart, CAST(usage_export.usage_end_time AS DATETIME) AS ChargePeriodEnd, CASE usage_export.cud.type WHEN "COMMITTED_USAGE_DISCOUNT_DOLLAR_BASE" THEN "Spend" WHEN "COMMITTED_USAGE_DISCOUNT" THEN "Usage" END AS CommitmentDiscountCategory, usage_export.cud.id AS CommitmentDiscountId, COALESCE (usage_export.cud.full_name, usage_export.cud.name) AS CommitmentDiscountName, usage_export.cud.type AS CommitmentDiscountType, CAST(usage_export.usage.amount_in_pricing_units AS numeric) AS ConsumedQuantity, usage_export.usage.pricing_unit AS ConsumedUnit, -- review CAST( CASE WHEN usage_export.cost_type = "regular" THEN usage_export.price.effective_price * usage_export.price.pricing_unit_quantity ELSE 0 END AS NUMERIC ) AS ContractedCost, -- CAST(usage_export.price.effective_price AS numeric) AS ContractedUnitPrice, usage_export.seller_name AS InvoiceIssuerName, COALESCE((SELECT SUM(x.amount) FROM UNNEST(usage_export.credits) x),0) + cost as EffectiveCost, CAST(usage_export.cost_at_list AS numeric) AS ListCost, prices.account_currency_amount AS ListUnitPrice, IF( usage_export.cost_type = "regular", IF( LOWER(usage_export.sku.description) LIKE "commitment%" OR usage_export.cud IS NOT NULL, "committed", "standard"), null) AS PricingCategory, IF(usage_export.cost_type = 'regular', usage_export.price.pricing_unit_quantity, NULL) AS PricingQuantity, IF(usage_export.cost_type = 'regular', usage_export.price.unit, NULL) AS PricingUnit, 'Google'AS ProviderName, usage_export.transaction_type AS PublisherName, usage_export.location.region AS RegionId, usage_export.location.region AS RegionName, usage_export.service.id AS ResourceId, REGEXP_EXTRACT (usage_export.resource.global_name, r'[^/]+$') AS ResourceName, usage_export.sku.description AS ResourceType, COALESCE(servicemapping.string_field_1, 'Other') AS ServiceCategory, usage_export.service.description AS ServiceName, usage_export.sku.id AS SkuId, CONCAT("SKU ID:", usage_export.sku.id, ", Price Tier Start Amount: ", price.tier_start_amount) AS SkuPriceId, usage_export.project.id AS SubAccountId, usage_export.project.name AS SubAccountName, (SELECT CONCAT('{', STRING_AGG(FORMAT('%s:%s', kv.key, kv.value), ', '), '}') FROM ( SELECT key, value FROM UNNEST(usage_export.project.labels) UNION ALL SELECT key, value FROM UNNEST(usage_export.tags) UNION ALL SELECT key, value FROM UNNEST(usage_export.labels) ) AS kv) AS Tags, FORMAT_DATE('%B', PARSE_DATE('%Y%m', invoice.month)) AS Month, usage_export.project.name AS x_ResourceGroupName, CAST(usage_export.export_time AS TIMESTAMP) AS export_time, FROM usage_export LEFT JOIN "Your-Service-Category-Id".ServiceCategory AS servicemapping ON usage_export.service.description = servicemapping.string_field_0 LEFT JOIN prices ON usage_export.sku.id = prices.sku_id AND usage_export.price.tier_start_amount = prices.start_usage_amount AND DATE(usage_export.export_time) = DATE(prices.export_time); 🔊 "This creates a FOCUS-aligned view of your billing data using FOCUS 1.0 specification, this view does not 100% conform to FOCUS 1.0. Create GCS Bucket for CSV Exports Your GCS bucket is where you will place your incremental exports to be exported to Azure. Once your data is exported to Azure you may elect to delete the files in the bucket, the metaata table keeps a record of the last export time. Steps: Go to Cloud Storage > Create Bucket Name: focus-cost-export (or any name you would like) Region: Match your dataset region Storage Class: Nearline (cheaper to use Earline but standard will also work just fine) Enable Interoperability settings > create access/secret key The access key and secret are tied to your user account, if you want to be able to use this with multiple people, create an access key for your service account - recommended for Prod, for this guide purpose, an access key linked to your account is fine. Save the access key and secret to a secure location to use later as part of the ADF pipeline setup. 🔊 “This bucket will store daily CSVs. Consider enabling lifecycle cleanup policies.” Create a Cloud Function for incremental export A cloud function is used here to enable incremental exports of your billing data in FOCUS format on a regular schedule. At present there is no known supported on demand export service from Google, so we came up with this little workaround. The function is designed to evaluate your billing data for last export time that is equals to or greater than the last time the function ran. To do this we look at the export_time column and the metadata table for the last time we ran this. This ensure we only export the most recent billing data which aids in reducing data export costs to Azure. This process is done through the GCP GUI using an inline editor to create the cloud function in the cloud run service. Steps: Go to Cloud Run > Write Function Select > Use Inline editor to create a function Service Name: daily_focus_export Region, the same region as your dataset - in our demo case us-central1 Use settings: Runtime: Python 3.11 (you cannot use anything later than 3.11) Trigger: Optional Authentication: Require Authentication Billing: Request based Service Scaling: Auto-scaling set to 0 Ingress: All Containers: leave all settings as set Volumes: Leave all settings as set Networking: Leave all settings as set Security: Choose the service account you created earlier Save Create main.py file and requirements.txt files through the inline editor For requirements.txt copy and paste the below: functions-framework==3.* google-cloud-bigquery Google-cloud-storage For main.py your function entry point is: export_focus_data Paste the code below into your main.py inline editor window import logging import time import json from google.cloud import bigquery, storage logging.basicConfig(level=logging.INFO) def export_focus_data(request): bq_client = bigquery.Client() storage_client = storage.Client() project_id = "YOUR-ProjectID" dataset = "YOUR-Detailed-Export-Dataset" view = "YOUR-FOCUS-View-Name" metadata_table = "Your-Metadata-Table" job_name = "The name you want to call the job for the export" bucket_base = "gs://<your bucketname>/<your foldername>" bucket_name = "Your Bucket Name" metadata_file_path = "Your-Bucket-name/export_metadata.json" try: logging.info("🔁 Starting incremental export based on export_time...") # Step 1: Get last export_time from metadata metadata_query = f""" SELECT last_export_time FROM `{project_id}.{dataset}.{metadata_table}` WHERE job_name = '{job_name}' LIMIT 1 """ metadata_result = list(bq_client.query(metadata_query).result()) if not metadata_result: return "No metadata row found. Please seed export_metadata.", 400 last_export_time = metadata_result[0].last_export_time logging.info(f"📌 Last export_time from metadata: {last_export_time}") # Step 2: Check for new data check_data_query = f""" SELECT COUNT(*) AS row_count FROM `{project_id}.{dataset}.{view}` WHERE export_time >= TIMESTAMP('{last_export_time}') """ row_count = list(bq_client.query(check_data_query).result())[0].row_count if row_count == 0: logging.info("✅ No new data to export.") return "No new data to export.", 204 # Step 3: Get distinct export months folder_query = f""" SELECT DISTINCT FORMAT_DATETIME('%Y%m', BillingPeriodStart) AS export_month FROM `{project_id}.{dataset}.{view}` WHERE export_time >= TIMESTAMP('{last_export_time}') AND BillingPeriodStart IS NOT NULL """ export_months = [row.export_month for row in bq_client.query(folder_query).result()] logging.info(f"📁 Exporting rows from months: {export_months}") # Step 4: Export data for each month for export_month in export_months: export_path = f"{bucket_base}/{export_month}/export_{int(time.time())}_*.csv" export_query = f""" EXPORT DATA OPTIONS( uri='{export_path}', format='CSV', overwrite=true, header=true, field_delimiter=';' ) AS SELECT * FROM `{project_id}.{dataset}.{view}` WHERE export_time >= TIMESTAMP('{last_export_time}') AND FORMAT_DATETIME('%Y%m', BillingPeriodStart) = '{export_month}' """ bq_client.query(export_query).result() # Step 5: Get latest export_time from exported rows max_export_time_query = f""" SELECT MAX(export_time) AS new_export_time FROM `{project_id}.{dataset}.{view}` WHERE export_time >= TIMESTAMP('{last_export_time}') """ new_export_time = list(bq_client.query(max_export_time_query).result())[0].new_export_time # Step 6: Update BigQuery metadata table update_query = f""" MERGE `{project_id}.{dataset}.{metadata_table}` T USING ( SELECT '{job_name}' AS job_name, TIMESTAMP('{new_export_time}') AS new_export_time ) S ON T.job_name = S.job_name WHEN MATCHED THEN UPDATE SET last_export_time = S.new_export_time WHEN NOT MATCHED THEN INSERT (job_name, last_export_time) VALUES (S.job_name, S.new_export_time) """ bq_client.query(update_query).result() # Step 7: Write metadata JSON to GCS blob = storage_client.bucket(bucket_name).blob(metadata_file_path) blob.upload_from_string( json.dumps({"last_export_time": new_export_time.isoformat()}), content_type="application/json" ) logging.info(f"📄 Metadata file written to GCS: gs://{bucket_name}/{metadata_file_path}") return f"✅ Export complete. Metadata updated to {new_export_time}", 200 except Exception as e: logging.exception("❌ Incremental export failed:") return f"Function failed: {str(e)}", 500 Before saving, update variables like project_id, bucket_name, etc. 🔊 “This function exports new billing data based on the last timestamp in your metadata table.” Test the Cloud Function Deploy and click Test Function Ensure a seed export_metadata.json file is uploaded to your GCS bucket (if you have not uploaded a seed export file the function will not run. Example file is below { "last_export_time": "2024-12-31T23:59:59.000000+00:00" } Confirm new CSVs appear in your target folder Automate with Cloud Scheduler This step will set up an automated schedule to run your cloud function on a daily or hourly pattern, you may adjust the frequency to your desired schedule, this demo uses the same time each day at 11pm. Steps: Go to Cloud Scheduler > Create Job Region: Same region as your Function - This Demo us-central1 Name: daily-focus-export Frequency: 0 */6 * * * (every 6 hours) or Frequency: 0 23 * * * (daily at 11 PM) Time zone: your desired time zone Target: HTTP Auth: OIDC Token → Use service account Click CREATE 🔊 "This step automates the entire pipeline to run daily and keep downstream platforms updated with the latest billing data." Wrapping up Google setup Wow—there was a lot to get through! But now that you’ve successfully enabled FOCUS-formatted exports in Google Cloud, you're ready for the next step: connecting Azure to your Google Cloud Storage bucket and ingesting the data into your FinOps hub instance. This is where everything comes together—enabling unified, multi-cloud reporting in your Hub across both Azure and Google Cloud billing data. Let’s dive into building the Data Factory pipeline and the associated datasets needed to fetch, transform, and load your Google billing exports. Connecting Azure to Google billing data With your Google Cloud billing data now exporting in FOCUS format, the next step is to bring it into your Azure environment for centralized FinOps reporting. Using Data Factory, we'll build a pipeline to fetch the CSVs from your Google Cloud Storage bucket, convert them to Parquet, and land them in your FinOps Hub storage account. Azure access and prerequisites Access to the Azure Portal Access to the resource group your hub has been deployed to Contributor access to your resource group At a minimum storage account contributor and storage blob data owner roles An existing deployment of the Microsoft FinOps Hub Toolkit Admin rights on Azure Data Factory (or at least contributor role on ADF) Services and tools required Make sure the following Azure resources are in place: Azure Data Factory instance A linked Azure Storage Account (this is where FinOps Hub is expecting data) (Optional) Azure Key Vault for secret management Pipeline Overview We’ll be creating the following components in Azure Data Factory Component Purpose GCS Linked Service Connects ADF to your Google Cloud Storage bucket Azure Blob Linked Service Connects ADF to your Hub’s ingestion container Source Dataset Reads CSV files from GCS Sink Dataset Writes Parquet files to Azure Data Lake (or Blob) in Hub's expected format Pipeline Logic Orchestrates the copy activity, format conversion, and metadata updates Secrets and authentication To connect securely from ADF to GCS, you will need: The access key and secret from GCS Interoperability settings (created earlier) Store them securely in Azure Key Vault or in ADF pipeline parameters (if short-lived) 💡 Best Practice Tip: Use Azure Key Vault to securely store GCS access credentials and reference them from ADF linked services. This improves security and manageability over hardcoding values in JSON. Create Google Cloud storage billing data dataset Now that we're in Azure, it's time to connect to your Google Cloud Storage bucket and begin building your Azure Data Factory pipeline. Steps Launch Azure Data Factory Log in to the Azure Portal Navigate to your deployed Azure Data Factory instance Click “Author” from the left-hand menu to open the pipeline editor In the ADF Author pane, click the "+" (plus) icon next to Datasets Select “New dataset” Choose Google Cloud Storage as the data source Select CSV as the file format Set the Dataset Name, for example: gcs_focus_export_dataset Click “+ New” next to the Linked Service dropdown Enter a name like: GCS-Billing-LinkedService Under Authentication Type, choose: Access Key (for Interoperability access key) Or Azure Key Vault (recommended for secure credential storage) Fill in the following fields: Access Key: Your GCS interoperability key Secret: Your GCS interoperability secret Bucket Name: focus-cost-export (Optional) Point to a folder path like: focus-billing-data/ Click Test Connection to validate access Click Create Adjust Dataset Properties Now that the dataset has been created, make the following modifications to ensure proper parsing of the CSVs: SettingValue Column delimiter; (semicolon) Escape character" (double quote) You can find these under the “Connection” and “Schema” tabs of the dataset editor. If your connection fails you may need to enable public access on your GCS bucket or check your firewall restrictions from azure to the internet! Create a Google Cloud Storage metadata dataset To support incremental loading, we need a dedicated dataset to read the export_metadata.json file that your Cloud Function writes to Google Cloud Storage. This dataset will be used by the Lookup activity in your pipeline to get the latest export timestamp from Google. Steps: In the Author pane of ADF, click "+" > Dataset > New dataset Select Google Cloud Storage as the source Choose JSON as the format Click Continue Configure the Dataset Setting Value Name gcs_export_metadata_dataset Linked Service Use your existing GCS linked service File path e.g., focus-cost-export/metadata/export_metadata.json Import schema Set to From connection/store or manually define if needed File pattern Set to single file (not folder) Create the sink dataset – "ingestion_gcp" Now that we’ve connected to Google Cloud Storage and defined the source dataset, it’s time to create the sink dataset. This will land your transformed billing data in Parquet format into your Azure FinOps Hub’s ingestion container. Steps: In Azure Data Factory, go to the Author section Under Datasets, click the "+" (plus) icon and select “New dataset” Choose Azure Data Lake Storage Gen2 (or Blob Storage, depending on your Hub setup) Select Parquet as the file format Click Continue Configure dataset properties Name: ingestion_gcp Linked Service: Select your existing linked service that connects to your FinOps Hub’s storage account. (this will have been created when you deployed the hub) File path: Point to the container and folder path where you want to store the ingested files (e.g., ingestion-gcp/cost/billing-provider/gcp/focus/) Optional Settings: Option Recommended Value Compression type snappy Once configured, click Publish All again to save your new sink dataset. Create the sink dataset – "adls_last_import_metadata" The adls_last_import_metadata dataset (sinks dataset) is the location you use to copy the export time json file from google to azure, this location is sued for the pipeline to check the last time the import of data was run by reading the json file you coped from google and comparing it to the new json file from google Steps: In Azure Data Factory, go to the Author section Under Datasets, click the "+" (plus) icon and select “New dataset” Choose Azure Data Lake Storage Gen2 (or Blob Storage, depending on your Hub setup) Select JSON as the file format Click Continue Configure dataset properties Name: adsl_last_import_metadata Linked Service: Select your existing linked service that connects to your FinOps Hub’s storage account. (this will have been created when you deployed the hub) File path: Point to the container and folder path where you want to store the ingested files (e.g., ingestion/cost/metadata) Build the ADF pipeline for incremental import With your datasets created, the final step is building the pipeline logic that orchestrates the data flow. The goal is to only import newly exported billing data from Google, avoiding duplicates by comparing timestamps from both clouds. In this pipeline, we’ll: Fetch Google’s export timestamp from the JSON metadata file Fetch the last successful import time from the Hub’s metadata file in Azure Compare timestamps to determine whether there is new data to ingest If new data exists: Run the Copy Activity to fetch GCS CSVs, convert to Parquet, and write to the ingestion_gcp container Write an updated metadata file in the hub, with the latest import timestamp Pipeline components: Activity Purpose Lookup - GCS Metadata Reads the last export time from Google's metadata JSON in GCS Lookup - Hub Metadata Reads the last import time from Azure’s metadata JSON in ADLS If Condition Compares timestamps to decide whether to continue with copy Copy Data Transfers files from GCS (CSV) to ADLS (Parquet) Set Variable Captures the latest import timestamp Web/Copy Activity Writes updated import timestamp JSON file to ingestion_gcp container What your pipeline will look like: Step-by-Step Create two lookup activities: GCS Metadata Hub Metadata Lookup1 – GCS Metadata Add an activity called Lookup to your new pipeline Select the Source Dataset: gcs_export_metadata_dataset or whatever you named it earlier This lookup reads the the export_metadata.json file created by your Cloud Function in GCS for the last export time available. Configuration view of lookup for GCS metadata file Lookup2 – Hub Metadata Add an activity called lookup and name it Hub Metadata Select the source dataset: adls_last_import_metadata This lookup reads the last import time data from the hub metadata file to compare it to the last export time from GCS Configuration view of Metadata lookup activity Add conditional logic In this step we will add the condition logic Add activity called If Condition Add the expression below to the condition Go to the Expression tab and paste the following into Dynamic Content: @greater(activity('Lookup1').output.last_export_time, activity('Lookup2').output.last_import_time) Configuration view of If Condition Activity Next Add Copy Activity (If Condition = True) next to True condition select edit and add activity Copy Data Configure the activity with the following details Setting Value Source Dataset gcs_focus_export_dataset (CSV from GCS) Sink Dataset ingestion_gcp (Parquet to ADLS) Merge Files Enabled (reduce file count) Filter Expression @activity('Lookup1').output.firstRow.last_export_time Ensure you add filter expression for filter by last modified. This is important, if you do not add the filter by last modified expression your pipeline will not function properly. Finally we create an activity to update the metadata file in the hub Add another copy activity to your if condition and ensure it is linked to the previous copy activity, this ensure it runs after the import activity is completed. Copy metadata activity settings: Source gcs_export_metadata_dataset Sink adls_last_import_dataset Destination Path ingestion_gcp/metadata/last_import.json This step ensures the next run of your pipeline uses the updated import time to decide whether new data exists. Your pipeline now ingests only new billing data from Google and records each successful import to prevent duplicate processing. Wrapping up – A unified FinOps reporting solution Congratulations — you’ve just built a fully functional multi-cloud FinOps data pipeline using Microsoft’s FinOps Hub Toolkit and Google Cloud billing data, normalized with the FOCUS 1.0 standard. By following this guide, you’ve: ✅ Enabled FOCUS billing exports in Google Cloud using BigQuery, GCS, and Cloud Functions ✅ Created normalized, FOCUS-aligned views to unify your GCP billing data ✅ Automated metadata tracking to support incremental exports ✅ Built an Azure Data Factory pipeline to fetch, transform, and ingest GCP data into your Hub ✅ Established a reliable foundation for centralized, multi-cloud cost reporting This solution brings side-by-side visibility into Azure and Google Cloud costs, enabling informed decision-making, workload optimization, and true multi-cloud FinOps maturity. Next steps 🔁 Schedule the ADF pipeline to run daily or hourly using triggers 📈 Build Power BI dashboards or use templates from the Microsoft FinOps Toolkit visualise unified cloud spend 🧠 Extend to AWS by applying the same principles using the AWS FOCUS export and AWS S3 storage Feedback? Have questions or want to see more deep dives like this? Let me know — or connect with me if you’re working on FinOps, FOCUS, or multi-cloud reporting. This blog is just the beginning of what's possible.1.9KViews3likes1CommentManaging Azure OpenAI costs with the FinOps toolkit and FOCUS: Turning tokens into unit economics
By Robb Dilallo Introduction As organizations rapidly adopt generative AI, Azure OpenAI usage is growing—and so are the complexities of managing its costs. Unlike traditional cloud services billed per compute hour or storage GB, Azure OpenAI charges based on token usage. For FinOps practitioners, this introduces a new frontier: understanding AI unit economics and managing costs where the consumed unit is a token. This article explains how to leverage the Microsoft FinOps toolkit and the FinOps Open Cost and Usage Specification (FOCUS) to gain visibility, allocate costs, and calculate unit economics for Azure OpenAI workloads. Why Azure OpenAI cost management is different AI services break many traditional cost management assumptions: Billed by token usage (input + output tokens). Model choices matter (e.g., GPT-3.5 vs. GPT-4 Turbo vs. GPT-4o). Prompt engineering impacts cost (longer context = more tokens). Bursty usage patterns complicate forecasting. Without proper visibility and unit cost tracking, it's difficult to optimize spend or align costs to business value. Step 1: Get visibility with the FinOps toolkit The Microsoft FinOps toolkit provides pre-built modules and patterns for analyzing Azure cost data. Key tools include: Microsoft Cost Management exports Export daily usage and cost data in a FOCUS-aligned format. FinOps hubs Infrastructure-as-Code solution to ingest, transform, and serve cost data. Power BI templates Pre-built reports conformed to FOCUS for easy analysis. Pro tip: Start by connecting your Microsoft Cost Management exports to a FinOps hub. Then, use the toolkit’s Power BI FOCUS templates to begin reporting. Learn more about the FinOps toolkit Step 2: Normalize data with FOCUS The FinOps Open Cost and Usage Specification (FOCUS) standardizes billing data across providers—including Azure OpenAI. FOCUS Column Purpose Azure Cost Management Field ServiceName Cloud service (e.g., Azure OpenAI Service) ServiceName ConsumedQuantity Number of tokens consumed Quantity PricingUnit Unit type, should align to "tokens" DistinctUnits BilledCost Actual cost billed CostInBillingCurrency ChargeCategory Identifies consumption vs. reservation ChargeType ResourceId Links to specific deployments or apps ResourceId Tags Maps usage to teams, projects, or environments Tags UsageType / Usage Details Further SKU-level detail Sku Meter Subcategory, Sku Meter Name Why it matters: Azure’s native billing schema can vary across services and time. FOCUS ensures consistency and enables cross-cloud comparisons. Tip: If you use custom deployment IDs or user metadata, apply them as tags to improve allocation and unit economics. Review the FOCUS specification Step 3: Calculate unit economics Unit cost per token = BilledCost ÷ ConsumedQuantity Real-world example: Calculating unit cost in Power BI A recent Power BI report breaks down Azure OpenAI usage by: SKU Meter Category → e.g., Azure OpenAI SKU Meter Subcategory → e.g., gpt 4o 0513 Input global Tokens SKU Meter Name → detailed SKU info (input/output, model version, etc.) GPT Model Usage Type Effective Cost gpt 4o 0513 Input global Tokens Input $292.77 gpt 4o 0513 Output global Tokens Output $23.40 Unit Cost Formula: Unit Cost = EffectiveCost ÷ ConsumedQuantity Power BI Measure Example: Unit Cost = SUM(EffectiveCost) / SUM(ConsumedQuantity) Pro tip: Break out input and output token costs by model version to: Track which workloads are driving spend. Benchmark cost per token across GPT models. Attribute costs back to teams or product features using Tags or ResourceId. Power BI tip: Building a GPT cost breakdown matrix To easily calculate token unit costs by GPT model and usage type, build a Matrix visual in Power BI using this hierarchy: Rows: SKU Meter Category SKU Meter Subcategory SKU Meter Name Values: EffectiveCost (sum) ConsumedQuantity (sum) Unit Cost (calculated measure) Unit Cost = SUM(‘Costs’[EffectiveCost]) / SUM(‘Costs’[ConsumedQuantity]) Hierarchy Example: Azure OpenAI ├── GPT 4o Input global Tokens ├── GPT 4o Output global Tokens ├── GPT 4.5 Input global Tokens └── etc. Power BI Matrix visual showing Azure OpenAI token usage and costs by SKU Meter Category, Subcategory, and Name. This breakdown enables calculation of unit cost per token across GPT models and usage types, supporting FinOps allocation and unit economics analysis. What you can see at the token level Metric Description Data Source Token Volume Total tokens consumed Consumed Quantity Effective Cost Actual billed cost BilledCost / Cost Unit Cost per Token Cost divided by token quantity Effective Unit Price SKU Category & Subcategory Model, version, and token type (input/output) Sku Meter Category, Subcategory, Meter Name Resource Group / Business Unit Logical or organizational grouping Resource Group, Business Unit Application Application or workload responsible for usage Application (tag) This visibility allows teams to: Benchmark cost efficiency across GPT models. Track token costs over time. Allocate AI costs to business units or features. Detect usage anomalies and optimize workload design. Tip: Apply consistent tagging (Cost Center, Application, Environment) to Azure OpenAI resources to enhance allocation and unit economics reporting. How the FinOps Foundation’s AI working group informs this approach The FinOps for AI overview, developed by the FinOps Foundation’s AI working group, highlights unique challenges in managing AI-related cloud costs, including: Complex cost drivers (tokens, models, compute hours, data transfer). Cross-functional collaboration between Finance, Engineering, and ML Ops teams. The importance of tracking AI unit economics to connect spend with value. By combining the FinOps toolkit, FOCUS-conformed data, and Power BI reporting, practitioners can implement many of the AI Working Group’s recommendations: Establish token-level unit cost metrics. Allocate costs to teams, models, and AI features. Detect cost anomalies specific to AI usage patterns. Improve forecasting accuracy despite AI workload variability. Tip: Applying consistent tagging to AI workloads (model version, environment, business unit, and experiment ID) significantly improves cost allocation and reporting maturity. Step 4: Allocate and Report Costs With FOCUS + FinOps toolkit: Allocate costs to teams, projects, or business units using Tags, ResourceId, or custom dimensions. Showback/Chargeback AI usage costs to stakeholders. Detect anomalies using the Toolkit’s patterns or integrate with Azure Monitor. Tagging tip: Add metadata to Azure OpenAI deployments for easier allocation and unit cost reporting. Example: tags: CostCenter: AI-Research Environment: Production Feature: Chatbot Step 5: Iterate Using FinOps Best Practices FinOps capability Relevance Reporting & analytics Visualize token costs and trends Allocation Assign costs to teams or workloads Unit economics Track cost per token or business output Forecasting Predict future AI costs Anomaly management Identify unexpected usage spikes Start small (Crawl), expand as you mature (Walk → Run). Learn about the FinOps Framework Next steps Ready to take control of your Azure OpenAI costs? Deploy the Microsoft FinOps toolkit Start ingesting and analyzing your Azure billing data. Get started Adopt FOCUS Normalize your cost data for clarity and cross-cloud consistency. Explore FOCUS Calculate AI unit economics Track token consumption and unit costs using Power BI. Customize Power BI reports Extend toolkit templates to include token-based unit economics. Join the conversation Share insights or questions with the FinOps community on TechCommunity or in the FinOps Foundation Slack. Advance Your Skills Consider the FinOps Certified FOCUS Analyst certification. Further Reading Managing the cost of AI: Understanding AI workload cost considerations Microsoft FinOps toolkit Learn about FOCUS Microsoft Cost Management + Billing FinOps Foundation Appendix: FOCUS column glossary ConsumedQuantity: The number of tokens or units consumed for a given SKU. This is the key measure of usage. ConsumedUnit: The type of unit being consumed, such as 'tokens', 'GB', or 'vCPU hours'. Often appears as 'Units' in Azure exports for OpenAI workloads. PricingUnit: The unit of measure used for pricing. Should match 'ConsumedUnit', e.g., 'tokens'. EffectiveCost: Final cost after amortization of reservations, discounts, and prepaid credits. Often derived from billing data. BilledCost: The invoiced charge before applying commitment discounts or amortization. PricingQuantity: The volume of usage after applying pricing rules such as tiered or block pricing. Used to calculate cost when multiplied by unit price.692Views2likes0CommentsMaximize efficiency by managing and exchanging your Azure OpenAI Service provisioned reservations
When it comes to AI, businesses confront unprecedented challenges in efficiently managing computational resources. That’s why Azure OpenAI Service is a critical platform for organizations seeking to leverage cutting-edge AI capabilities, and it makes provisioned reservations an essential strategy for intelligent cost savings. Business needs change, of course, and flexibility in managing these reservations is vital. In this blog, we’ll not only explore what makes Azure OpenAI Service provisioned reservations indispensable for organizations seeking resilience and cost efficiency in their AI operations, but also follow a fictional company, Contoso, to illustrate real-world scenarios where exchanging reservations enhances scalability and budget control. The crucial role of provisioned reservations in modern AI infrastructure Azure OpenAI Service provisioned reservations help organizations save money by committing to a month- or yearlong provisioned throughput unit reservation for AI model usage, ensuring guaranteed availability and predictable costs. As mentioned in this article, purchasing a reservation and choosing coverage for an Azure region, quantity, and deployment type, reduces costs as compared to being charged at hourly rates. Actively managing and monitoring these reservations is paramount to unlocking their full potential. Here's why: Optimizing utilization: Regular monitoring ensures that your reservations align with your actual usage, preventing wasted resources. Adapting to business changes: As business needs shift, reservations can be adjusted to accommodate evolving requirements. Avoiding over-commitment: Proactive management helps prevent over-purchasing reservations, which can lead to unnecessary expenses. Enhancing cost control and accountability: By tracking reservation usage and costs, organizations can maintain better control over their AI budgets. Leveraging AI usage insights: Analyzing reservation utilization provides valuable insights into AI application performance and usage patterns. The value of exchanging provisioned reservations One of the most powerful aspects of provisioned reservations is the ability to exchange them. This flexibility allows businesses to adapt their commitments to better align with their evolving needs. Exchanges can be initiated through the Azure Portal or via the Azure Reservation API, offering seamless adjustments. Consider Contoso, a global technology firm leveraging Azure OpenAI Service for customer support chatbots and content generation tools. Initially, Contoso’s needs were straightforward, but as their business expanded, their AI requirements changed. This is where the exchange feature proved invaluable. Types of provisioned reservations exchanges Contoso leveraged several types of exchanges to optimize their Azure OpenAI Service usage: Region exchange: Contoso initially committed to a reservation in the East US region. However, as their operations expanded into Western Europe, they needed to shift their AI workloads. By exchanging reservations, they were able to apply their discounted billing to the West Europe region, ensuring optimal performance for their growing user base. Deployment type exchange: There are three types of deployment: Global, Azure geography (or regional), and Microsoft specified data zone. Contoso initially reserved regional deployments for their inference operations, but because of growing demand they switched to global deployment. This means their Azure OpenAI Service prompts and responses will now be processed anywhere that the relevant model is deployed. By exchanging reservations from regional to global, they were able to apply their reservation savings ensuring seamless cost savings for their critical application. Term exchange: Contoso initially committed to a one-month reservation. However, they soon realized their need for ongoing service and wanted to allocate resources more efficiently. By exchanging reservations, they switched to a one-year term, allowing them to budget more effectively. Payment exchange: Contoso started with an upfront payment model. However, for better cash flow management, they transitioned to a monthly payment plan through a payment exchange. Changing the scope of provisioned reservations As Contoso’s use of Azure OpenAI Service expanded across multiple departments, they needed to modify their reservation scope. Azure offers the ability to scope reservations to individual resource groups or subscriptions, to subscriptions within a management group, or to all subscriptions within a billing account or billing profile. Contoso used Microsoft Cost Management to modify the scope of their reservations, ensuring that each department had the necessary resources. Setting up automatic renewals for provisioned reservations To prevent service disruptions and maintain budget predictability, Contoso enabled automatic renewal for their reservations. Automatic renewals offer several benefits: Continuous service: Ensures uninterrupted billing for Azure OpenAI Service. Budget predictability: Maintains consistent costs over time. Reduced administrative overhead: Eliminates the need for manual renewal processes. Enabling auto-renewal in the Azure Portal is a straightforward process, ensuring that Contoso’s AI operations continue uninterrupted. Reviewing the provisioned reservation utilization report Contoso’s finance and IT teams regularly review their provisioned reservation utilization report to ensure they are getting the best value from their investment. These reports, accessible through Azure Cost Management, provide insights into reservation usage and help identify areas for optimization. Analyzing utilization reports allows Contoso to: Identify underutilized resources. Adjust reservations to match actual usage. Optimize costs and improve efficiency. Setting up utilization alerts To proactively monitor their reservation usage, Contoso configured reservation utilization alerts in Microsoft Cost Management. These alerts notify them if usage drops below a set threshold, allowing them to take timely action. By setting up utilization alerts, Contoso can: Receive real-time notifications of usage changes. Adjust reservations to avoid waste. Maintain optimal resource utilization. Best practices for managing Azure OpenAI Service provisioned reservations Azure OpenAI Service provisioned reservations offer a powerful way to control costs, but proactive management is essential for maximizing their value. As we have seen, Contoso implemented several best practices to maximize the benefits of provisioned reservations: Regular usage monitoring: Continuously tracking usage to identify trends and optimize resource allocation. Strategic adjustments and exchanges: Adapting reservations to match evolving business needs. Implementing governance policies: Establishing clear policies for reservation management and usage. Automating alerts and reporting: Configuring alerts and reports to proactively monitor reservation usage. By leveraging the flexibility of reservation exchanges and implementing best practices, any business can optimize their AI investments and drive long-term efficiency. Embracing these strategies will empower your organization to fully capitalize on the transformative potential of Azure OpenAI Service. Find out more by completing the Azure OpenAI Service provisioned reservation learn module. Additional Resources: What are Azure reservations? Save costs with Microsoft Azure OpenAI Service provisioned reservations Azure OpenAI Service provisioned throughput units (PTU) onboarding Azure pricing overview304Views2likes0CommentsManaging the cost of AI: Leveraging the FinOps Framework
Artificial intelligence is well and truly here, though the rate of adoption varies across consumers and businesses. When we think of AI in relation to FinOps, we look at it from two different angles: how are we using FinOps to optimize our AI costs, and how are we using AI to optimize all of our costs. The second angle includes the intersection of our cost management tools and AI capabilities, including things like Azure Copilot answering cost and forecasting questions. The first angle is only just starting to be whispered about in corporate hallways. Many organizations are still curious about AI and are exploring its capabilities without too much concern about cost or return on investment. They want to see if they can apply this new component to new or existing applications or to business process, to prove whether it’s beneficial. But early adopters, and savvy organizations that have a mature FinOps practice, are looking at how their cloud cost management practices wrap around this new type of cloud workload. Let’s explore some of the foundational FinOps elements and how they apply to AI. The FinOps Framework During its establishment and subsequent updates, the FinOps Foundation’s FinOps Framework was always intended to expand beyond just consumption-based cloud workloads. We’re seeing the rise of this with the addition of Scopes to the Framework, including Software-as-a-Service and Data Centers. And while AI isn’t a new scope because it fits into Public Cloud (or Data Center if you are hosting your own models locally), AI does benefit from the original forward-thinking approach. It provides an acid-test to see if the Principles, Domains, etc. are applicable to this new kind of workload, both for AI capabilities that purely consumption based (e.g. Azure OpenAI language models) or are SKU based with a combination of size limitations plus scale unit hours (e.g. Azure AI Search). Principles, Core Personas, Allied Persona, Phases and Maturity all ring true when looked at from the perspective of your AI workloads. You may add a few specialized personas when you go down to a greater level of detail under Engineers – like data and prompt engineers. Learn more in the FinOps Foundation’s FinOps for AI Overview. AI does not negate the need for teams to collaborate, everyone to take ownership of their cloud usage, and for business value to drive decision-making. The Domains and Capabilities are where we drill down into the next level of detail and examine if our existing tools and processes are up to the task of managing AI workloads. Microsoft takes a holistic approach to adopting new workloads, reflected in our Cloud Adoption Framework and Well Architected Framework. These have been updated to include AI adoption and AI Workload Documentation. For AI adoption, our Azure Essentials guidance introduces stages of Readiness and foundation, Design & govern, and Manage and optimize – combining detailed best practices and tools relevant to your AI adoption journey. FinOps capabilities for adopting AI on Azure Combine Microsoft's guidance and the FinOps Framework, and you can drill down into many of the FinOps capabilities during the stages of adopting AI on Azure: Readiness & foundation Planning & estimating – Planning for and estimating the cost of an AI workload requires an understanding of the entire application architecture, driven by the business requirements. Like every other application you manage, your organization will have requirements regarding security, resiliency & redundancy, recovery time and recovery points. These may be less rigorous if your application is stateless and not holding long-term data, but the architecture may need to be resilient if the application will become mission-critical or customer facing. & browser frontend, PostgreSQL & Redis, Event Bus and OpenAI. Next, it’s important to understand how AI services are priced. Most text-based AI services using Large Language Models (LLMs), including the Azure OpenAI Service, are priced per 1,000 tokens (where a token is a common sequence of characters found in the text). OpenAI’s Tokenizer website can demonstrate how text is broken into tokens, with efficiency improvements already being seen from GPT-3 to GPT-4o. Calculating the number of tokens in a sample conversation (inputs) & multiplying that by the number of website visitors x the percentage that may interact with a chatbot, can give us an indication of token usage and predicted cost. Workload optimization – As well as ensuring that all the components of the application are right-sized, consider whether you are using the right AI models, pre-built models or more efficient newer models. Also investigate ways that your application design can optimize token usage in how it caches responses (including semantic caching) and sends summaries instead of entire conversation history. Budgeting & forecasting – Good budgeting & forecasting is built on understanding current usage and adjusting for future expectations. In a future blog post, we’ll dive into how to measure and analyze AI workload usage. Once you understand the specifics of AI in your cost management tooling, the regular habit of budget reviews and forecasting adjustments is part of your overall FinOps processes. Empower teams & foster collaboration – Just as you’re learning how AI services generate cost and how to analyze that data, ensure your teams understand this at the right level for them. Engineers might use these insights to make application design changes, while Finance teams might care less about tokens but could equate that to customer usage and hopefully a parallel increase in sales. Design & govern Establish AI policy & governance – It’s easy to think of this one in terms of privacy & security, but what other aspects of AI do you need to put rules around, especially in relation to cost? Are all engineers allowed to deploy all AI workloads? Is AI restricted to development subscriptions until token usage is proven with a proof of concept? Are there any relevant in-built or customer Azure Policies that you need to implement? Are self-hosted open source LLMs allowed in your organization? Integrate intersecting disciplines like LLMOps – While we think of Engineers as the traditional personas who will develop and maintain an application, AI introduces specialized disciplines like LLMOps. Machine Learning Ops for Large Language Models (LLMOps) includes the automation of repetitive tasks, such as model building, testing, deployment, and monitoring, which improves efficiency. Though LLMs are pre-trained, MLOps can be leveraged to tune the LLMs, operationalize and monitor them effectively in production. These improvements can lead to cost reduction. Take advantage of the variable cost model of the cloud – A key part of this is understanding how your AI service is charged, and whether run times will impact cost or not. Fine tuning models have an hourly cost in addition to a token usage rate, charged from deployment onwards, so be mindful of when you deploy them and delete them when they are no longer needed. Implement the “crawl, walk, run” approach for continuous improvement – This one is self-explanatory and even more applicable as your organization adopts AI workloads. You will start out with a basic understanding and rudimentary decisions, which should be reviewed as your AI maturity increases. Manage and optimize Leverage reporting and analytics – Identify how AI usage and costs are surfaced in your existing cost management tools. We’ll cover this from a Microsoft Azure perspective in a future blog post. Anomaly management – As for any application, understand your process or tools for detecting anomalies in usage, and what steps should be taken next. Drive accountability – Emphasize to technical and engineering teams that their design decisions have cost implications, and that they have also have the power to help identify and implement cost efficiencies. Rate optimization – This responsibility should lie with your FinOps team and includes both the pricing rates for your organization (for example, via an Enterprise Agreement) as well as rate discounts using Azure Reservations, which also apply to AI services using Provisioned Throughput Units. Sustainability – An important consideration for corporate responsibility, especially if your organization has Environmental, Social & Governance (ESG) goals or reporting requirements. Leverage the Azure Carbon Optimization reports for emissions data and more. Unit economics – Unit economics breaks down into understanding the true cost of an application and correlating that with the business metric of the desired outcome (e.g. revenue per visit). From an application cost perspective, consider capabilities like Azure API Gateway’s Azure OpenAI Token Metric policy, which collects token usage data and facilitates accurate cross-charging based on token consumption. Then explore the outcome of your AI solution to identify business value indicators. These may be easier to measure if you’ve integrated AI with your e-commerce site, but a little trickier if it’s an internal chatbot that is improving business productivity. Conclusion If you have an established FinOps practice, you’ll find many of the process and capabilities are applicable to adopting and managing AI workloads on Azure. If you haven’t explored FinOps in depth yet, new AI workloads may be the catalyst for establishing a FinOps practice, but these foundational principles will also benefit your entire cloud, SaaS and data center estate. Learn how to navigate the financial landscape for successful AI adoption, including security funding, establishing your organizational readiness and managing your AI investments.1.1KViews2likes0Comments