Azure Database for PostgreSQL Blog

Real-time predictions with the azure_ai extension (Preview)

denzilr
Mar 27, 2024

The azure_ai extension on Azure Database for PostgreSQL now introduces, in preview, the ability to invoke machine learning models hosted on Azure Machine Learning online endpoints in real time.

Azure Machine Learning supports the end-to-end machine learning lifecycle, from building and training models to generative AI workflows with prompt flow, and these models or pipelines can be hosted on fully managed endpoints for inferencing. Online inference endpoints can host pre-trained or custom models that are trained and registered in the Azure Machine Learning workspace. They are highly scalable, can handle a large number of requests, serve low-latency requirements, and have rich traffic monitoring and debugging infrastructure. The ability to invoke these models easily from within the database, right where your data resides, helps in building scenarios such as:

  • Fraud detection in banking.
  • Product recommendations in retail.
  • Equipment failure or maintenance in manufacturing.
  • Routing in transportation.
  • Patient predictions in healthcare.
  • Data validation, standardization, and classification.

For generative AI models, rich vector support right within the database lets you store embeddings and perform efficient similarity searches.
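
As a minimal sketch of that pattern: with the vector extension enabled alongside azure_ai, embeddings can be stored in a vector column and queried with a nearest-neighbor search. The listings table, the 1536-dimension column, and the 'text-embedding-ada-002' deployment name below are illustrative assumptions, and the query presumes the extension's Azure OpenAI settings are already configured.

-- Illustrative sketch: store embeddings and run a similarity search.
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE listings (
    id bigint PRIMARY KEY,
    description text,
    embedding vector(1536)   -- dimension depends on the embedding model
);

-- Embed the search text and return the five most similar listings.
SELECT id, description
FROM listings
ORDER BY embedding <=> azure_openai.create_embeddings('text-embedding-ada-002', 'two bedroom house near the bay')::vector
LIMIT 5;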

How to use it

To use the machine learning inference feature:

  1. Allow-list the azure_ai extension.
  2. Create the extension in the database from which you want to invoke inferencing:
     CREATE EXTENSION azure_ai;
  3. Deploy your model to an online inference endpoint and test it.
  4. Configure the extension to point to the Azure Machine Learning inferencing endpoint (see the configuration sketch after this list).
  5. Invoke the azure_ml.inference user-defined function from within SQL.
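
A minimal configuration sketch for step 4, using the extension's azure_ai.set_setting interface; the scoring URI and key below are placeholders to replace with the values from your Azure Machine Learning deployment:

-- Point the extension at the Azure Machine Learning online inference endpoint
-- (placeholder values; use your endpoint's scoring URI and key).
SELECT azure_ai.set_setting('azure_ml.scoring_endpoint', 'https://<your-endpoint>.<region>.inference.ml.azure.com/score');
SELECT azure_ai.set_setting('azure_ml.endpoint_key', '<your-endpoint-key>');

-- Verify the configured endpoint.
SELECT azure_ai.get_setting('azure_ml.scoring_endpoint');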

Invoking a housing price prediction model:

We invoke the azure_ml.inference function, passing input_data as a jsonb payload for the HTTP request. The online inference endpoint returns jsonb, which can be expanded into a set of values depending on the semantics of what the model returns. In this case, the model returns housing price predictions in the hundreds of thousands.

-- Invoke the model; the input data payload in jsonb depends on the model.
SELECT jsonb_array_elements(inference.inference) AS MedianHousePrediction
FROM azure_ml.inference('
{
  "input_data": [
    [5.03, 5.542, 0.95, 2.97, 37.28, -121.9, 67.4, 8]
  ],
  "params": {}
}', deployment_name => 'Housingprediction');

You could create a user-defined function for each of the models being invoked, which would simplify the application interface and allow better visibility into how often each model is called. For example, for the house price prediction model invoked above:

CREATE OR REPLACE FUNCTION predict_house_pricing (IN listingid INT, OUT prediction real)
AS $$
  WITH listing_features AS
  (
    -- Build the feature array for the requested listing.
    SELECT json_build_array(MedInc, HouseAge, AveRooms, AveBedrms, Population, AveOccup, Latitude, Longitude) AS data
    FROM cal_housing
    WHERE id = listingid
  ),
  modeldata AS
  (
    -- Wrap the features in the input_data payload expected by the endpoint.
    SELECT json_build_object('input_data', json_agg(data)) AS inputdata
    FROM listing_features
  )
  SELECT jsonb_array_elements(inference.inference)::real AS MedianHousePrediction
  FROM azure_ml.inference(
                          -- Input data to pass to the model in jsonb
                          (SELECT inputdata::jsonb FROM modeldata)
                          -- Azure Machine Learning online inference endpoint deployment
                          , deployment_name => 'Housingprediction')
$$ LANGUAGE SQL PARALLEL SAFE COST 5000;

-- Invoke the function
SELECT * FROM predict_house_pricing(18000);
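
The function can also be applied to a set of rows; marking it PARALLEL SAFE with a high COST signals to the planner that each call is expensive and safe to run in parallel workers. A hypothetical batch example against the same cal_housing table:

-- Hypothetical batch usage: predict prices for a sample of listings.
SELECT id, predict_house_pricing(id) AS predicted_price
FROM cal_housing
LIMIT 10;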

Getting Started

To learn more about the azure_ai extension, real-time machine learning inferencing, and how it can simplify building applications on Azure Database for PostgreSQL, visit our documentation.

If you have any questions or need guidance, don't hesitate to reach out to us at Ask Azure DB for PostgreSQL.

Updated May 20, 2024