Forum Discussion

parulpaul01
Copper Contributor
Feb 13, 2026

Title: Synthetic Dataset Format from AI Foundry Not Compatible with Evaluation Schema

Current Situation

The synthetic dataset created in AI Foundry (Data > Synthetic Data) is generated in the following messages format:

{
  "messages": [
    { "role": "system", "content": "You are a helpful assistant" },
    { "role": "user", "content": "What is the primary purpose?" },
    { "role": "assistant", "content": "The primary purpose is..." }
  ]
}

Challenge

When attempting an evaluation, especially a RAG evaluation, the documentation indicates that the dataset must contain structured fields such as:

question - The query being asked

ground_truth - The expected answer

 

Recommended additional fields:

reference_context

metadata

Example of the required format:

{
  "question": "",
  "ground_truth": "",
  "reference_context": "",
  "metadata": { "document": "" }
}

Because the synthetic dataset is in messages format, I am unable to directly map it to the required evaluation schema.

Question

Is there a recommended or supported way to convert the synthetic dataset generated in AI Foundry messages format into the structured format required for evaluation?

Can the user role be mapped to question?

Can the assistant role be mapped to ground_truth?

Is there any built-in transformation option within AI Foundry?
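
To make the mapping I am asking about concrete, here is a minimal Python sketch of it. This is not a supported AI Foundry feature, only an illustration. It assumes the synthetic dataset is exported as JSON Lines with one messages record per line, that each user message is answered by the next assistant message, and that the system message can be dropped. The function names messages_to_eval_rows and convert_jsonl are hypothetical, and reference_context and metadata are left empty because the messages format carries no source for them.

```python
import json
from typing import Iterable, Iterator


def messages_to_eval_rows(record: dict) -> Iterator[dict]:
    """Yield one evaluation row per user/assistant pair in a messages record."""
    question = None
    for message in record.get("messages", []):
        role = message.get("role")
        content = message.get("content", "")
        if role == "user":
            # Remember the most recent user turn as the candidate question.
            question = content
        elif role == "assistant" and question is not None:
            yield {
                "question": question,
                "ground_truth": content,
                # Assumption: no context or document info exists in the
                # messages format, so these stay empty placeholders.
                "reference_context": "",
                "metadata": {"document": ""},
            }
            question = None
        # System messages (and any other roles) are ignored.


def convert_jsonl(lines: Iterable[str]) -> list[dict]:
    """Convert JSON Lines text (one messages record per line) to evaluation rows."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        rows.extend(messages_to_eval_rows(json.loads(line)))
    return rows


sample = json.dumps({
    "messages": [
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "What is the primary purpose?"},
        {"role": "assistant", "content": "The primary purpose is..."},
    ]
})

rows = convert_jsonl([sample])
print(json.dumps(rows[0], indent=2))
```

With a real export, the list passed to convert_jsonl would be the lines of the dataset file, and each resulting row could be written back out with json.dumps, one row per line. Whether the evaluation accepts empty reference_context and metadata values is part of what I am unsure about.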

 
