Recently, I'm planning to train a custom Open AI model which based Logic App cases in real-life. As per the situation, we have following challenges need to be resolved.
We need huge amounts of records which cannot be provided from a single person. So we need to have a way to collect data from different people.
Generate Jsonl files which requested by Open AI custom model.
The training data is growing, so we need to automate the model training and deployment in schedule.
After some research and test, I found we can use Microsoft Form + Azure Storage Account + Logic App as resolution.
Microsoft Form: It is a very easy using services which can collect data from different teammates, here the sample form:
Azure Storage Account: I'm using Storage Table and Blob container in this scenario, the Storage Table maintains the raw data which collected from Microsoft Form and blob stores the Jsonl files which generated by Logic App.
Logic App: provide main data collection and automate deployment flows.
Three Logic App Consumption with Managed Identity enabled which assigned "Storage Table Data Contributor", "Storage Blob Data Contributor" and "Cognitive Services OpenAI Contributor" role (Logic App Standard also can be a choice, then you need to have 3 workflows).
Filter for deployments which have the same provided "Deployment Name", delete the existing deployments and re-deploy with new model via API ([ManagementUrl]/deployments/[DeploymentName]?api-version=2023-10-01-preview)
In Azure Storage Table connector, I don't find a place to fill in "Next page marker" for pagination in "Get Entities" action. So in "Training Data Generator" workflow, I have to use Http action to query Storage Table directly.
Logic App ARM template is not available yet, so you need to prepare API connection yourselves.
Based on the dataset size and load of backend, the custom model might need to take sometime to generate, default timeout for "Until" loop of waiting model creation is 12 hours, you might need to change to longer.
In my scenario, there's no request during weekend, so I can safely delete deprecated deployment. You may need to change this behavior as per your requirement.