Azure AI Foundry Blog

Building Human-in-the-loop AI Workflows with Microsoft Agent Framework

ogkranthi
Microsoft
Oct 10, 2025

A fraud detection example that combines deterministic orchestration with autonomous agents.

Modern enterprise systems face a simple problem: how do you make AI decisions reliable, explainable, and production-ready?

This post walks through how the Microsoft Agent Framework structures AI-driven workflows for fraud detection — combining deterministic orchestration with autonomous agents. You can find the full implementation here.

Why Combine Workflows and Agents?

Across industries — from finance and healthcare to logistics and customer service — AI systems face the same core challenge: how to deliver intelligent reasoning without sacrificing reliability, traceability, and operational control.
Rule-based workflows excel at predictability and compliance, but struggle to adapt when conditions change. Pure AI agents can interpret complex, ambiguous data, yet they often lack the production guarantees teams depend on: reproducibility, observability, and safe recovery from failures.

The Microsoft Agent Framework bridges this gap by combining deterministic workflow graphs — which govern execution, enforce structure, and provide fault tolerance — with autonomous agents that handle reasoning steps best served by LLM intelligence.

In fraud detection, this means the workflow orchestrates the process end-to-end, while specialized agents bring contextual understanding of usage patterns, location anomalies, and billing irregularities. Each part does what it’s best at:

  • Workflow: structure, type safety, and fault tolerance
  • Agents: domain-specific reasoning and contextual insight

System Architecture

The Fraud Detection scenario is built around a fan-out/fan-in pattern, a common orchestration design in distributed systems. It enables concurrent execution of multiple specialized agents, aggregation of their results, and deterministic decision-making at the end.

The architecture consists of four main components: Alert Router, Specialist Agents, Fraud Risk Aggregator, and Deterministic Decision Logic.

Alert Router: The Alert Router is the entry point of the workflow. When a new alert arrives — for example, multiple login attempts from different countries or a sudden spike in transactions — the router initiates the workflow and distributes the work across multiple agents in parallel. It encapsulates:
  • Workflow initialization (alert ingestion and metadata logging)
  • Context creation (customer ID, time window, alert type)
  • Launching parallel agent executions
  • Handling fan-out coordination and error propagation

This step is entirely deterministic — the router doesn’t interpret data, it only ensures the right agents receive it concurrently.

Specialist Agents: The workflow uses three domain-specific agents, each with tightly scoped access to tools and data sources through MCP (Model Context Protocol):

Agent             | Responsibility                                               | Example Data Access
UsagePatternAgent | Analyzes data consumption or behavioral anomalies            | API access to usage metrics
LocationAgent     | Detects geographic inconsistencies or unusual login origins  | GeoIP lookup or login logs
BillingAgent      | Reviews recent transaction trends and abnormal charges       | Billing or payment service APIs

Each agent operates autonomously within the workflow runtime. They:

  • Use LLM reasoning to interpret structured or semi-structured data
  • Generate typed outputs (e.g., anomaly reports with scores and explanations)
  • Write logs and traces to a shared observability channel

All three agents run in parallel, allowing fast, independent analysis without blocking or waiting for one another.
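
For illustration, a specialist agent definition might look roughly like the sketch below. The client class, the create_agent call, and the usage_metrics_tool binding are assumptions chosen to show the shape of the idea, not verbatim code from the repository.

from agent_framework.azure import AzureOpenAIChatClient  # assumed import path

# Hypothetical sketch: a usage-pattern specialist with narrowly scoped instructions and tools
usage_agent = AzureOpenAIChatClient().create_agent(
    name="UsagePatternAgent",
    instructions=(
        "Analyze the customer's recent usage metrics for anomalies. "
        "Return an anomaly score between 0 and 1 with a short explanation."
    ),
    tools=[usage_metrics_tool],  # MCP-scoped tool granting read-only access to usage data
)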

Fraud Risk Aggregator: Once the agents finish, the results flow into the Risk Aggregator, which performs the fan-in stage of the pattern. This component acts as a reasoning layer — an LLM-powered agent that combines structured results from all specialists and produces a unified assessment. It outputs a typed RiskAssessment object containing:

  • risk_score (float)
  • recommendation (e.g., “LOCK_ACCOUNT” or “ALLOW”)
  • justification (a concise reasoning summary)

The aggregator uses prompt templates and schemas to enforce consistency and prevent unpredictable responses. This is where AI reasoning occurs, but within a controlled, schema-validated boundary.
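
As an illustration, that typed output could be modeled with a Pydantic schema along these lines (a sketch consistent with the fields listed above; the repository's exact class definition may differ):

from pydantic import BaseModel, Field

class RiskAssessment(BaseModel):
    """Typed, schema-validated output of the fraud risk aggregator (illustrative)."""
    risk_score: float = Field(ge=0.0, le=1.0)  # 0 = benign, 1 = near-certain fraud
    recommendation: str                        # e.g. "LOCK_ACCOUNT" or "ALLOW"
    justification: str                         # concise reasoning summary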

Deterministic Decision Logic: Finally, the workflow executes a deterministic rule based on the aggregated output.
For example:

if risk_score > 0.6:
    # High risk: checkpoint and route to a human analyst for review
    route_to_human("analyst_review")
else:
    # Low risk: close the alert automatically
    auto_resolve()

This part is entirely non-AI — it provides the predictability, traceability, and governance that enterprise workflows require. If a case is high-risk, the system checkpoints and pauses for human review. Once an analyst approves or rejects the recommendation, the workflow resumes from the checkpoint and completes. The deterministic layer ensures that every decision path is reproducible and logged, making the system both intelligent and auditable.

Execution Features

Checkpointing: 

Every workflow step can save and resume state. When the system waits for human approval, it writes a checkpoint. If the process restarts or the machine crashes, it resumes from the last known state without losing data. Internally, checkpoints serialize the execution context and the outputs of completed steps, stored in a durable backend. 

Checkpointing is built into the framework runtime. The stored state includes:

  • Workflow variables
  • Completed step outputs
  • Pending tasks
  • Execution metadata

This enables fault-tolerant recovery and deterministic replay.
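
A minimal sketch of wiring this up, assuming a file-based checkpoint store; the storage class and builder method names are illustrative of the framework's checkpointing support rather than guaranteed to match the repository:

from agent_framework import WorkflowBuilder, FileCheckpointStorage  # assumed imports

# Illustrative: persist workflow state so a paused or interrupted run can resume
storage = FileCheckpointStorage(storage_path="./checkpoints")  # swap for a durable backend in production

workflow = (
    WorkflowBuilder()
    # ... fan-out / fan-in edges added here, as shown later in this post ...
    .with_checkpointing(checkpoint_storage=storage)  # each completed step writes a resumable checkpoint
    .build()
)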

Parallelism:

Since the three specialist agents run independently, the workflow leverages parallel execution instead of sequential blocking. The runtime manages concurrency safely, collecting results as they finish (fan-in). This pattern scales linearly with available compute resources.

Fault Tolerance and Recovery:

If an agent or executor fails, the runtime restarts that node from its checkpoint without re-running completed steps. This makes recovery deterministic and idempotent — a crucial property for automation in finance or healthcare scenarios.

Observability:

The framework emits:

  • OpenTelemetry traces for each node execution
  • WebSocket event streams for real-time visualizations in dashboards
  • Structured audit logs capturing every agent call and LLM response
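
For example, those traces can be routed to any OTLP-compatible backend with the standard OpenTelemetry SDK; the snippet below is generic OpenTelemetry setup rather than framework-specific code, and the collector endpoint is a placeholder.

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Send node-execution spans to an OTLP collector (Azure Monitor, Grafana, etc. sit behind it)
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:4317")))
trace.set_tracer_provider(provider)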

Human-in-the-Loop Integration:

Not all decisions should be automated. The workflow includes a human review node triggered by high-risk assessments. This node suspends execution and waits for manual approval. When an analyst approves or rejects the recommendation, the workflow resumes from its checkpoint. This makes it possible to blend automation with required human oversight — without complex external process coordination.
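
Conceptually, the application code driving that pause/resume loop might look like the sketch below. The RequestInfoEvent type, the send_responses call, and the get_analyst_decision helper are assumptions used for illustration and may not match the repository's exact API.

async def handle_alert(workflow, alert):
    # Illustrative pause/resume loop around the human review node
    events = await workflow.run(alert)  # runs until the review node requests analyst input
    pending = [e for e in events if isinstance(e, RequestInfoEvent)]  # assumed event type for human requests
    if pending:
        decision = get_analyst_decision(pending[0])  # hypothetical helper: collect approve/reject from a UI
        await workflow.send_responses({pending[0].request_id: decision})  # resume from the saved checkpoint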

Implementation Overview

The Fraud Detection workflow in the Microsoft Agent Framework is implemented as a workflow graph — a set of connected executors (nodes) that define the precise order of operations, data flow, and control logic. This graph combines parallel AI-powered analysis with deterministic orchestration, ensuring both adaptability and reliability.

At a high level, the workflow:

  1. Fans out a suspicious alert to multiple domain-specific agents (usage, location, billing).
  2. Fans in their results to a single AI aggregation agent.
  3. Passes the aggregated risk assessment to deterministic decision logic.
  4. Optionally pauses at a human-in-the-loop checkpoint for high-risk cases.
  5. Resumes to execute the final fraud action and notify the customer.

Fan-Out / Fan-In (Parallel Agents)

This is the core orchestration pattern:

  • Fan-Out:
    One incoming alert is dispatched to multiple executors in parallel.
    In code:
    # Create workflow builder  
    builder = WorkflowBuilder()  
    
    # Fan-out: AlertRouter → Usage, Location, Billing executors  
    builder.add_fan_out_edges(alert_router, [  
        usage_executor,  
        location_executor,  
        billing_executor  
    ])  
    This means the AlertRouterExecutor sends the same alert to three specialized agents, each using scoped MCP tools to analyze different aspects of the case concurrently.
  • Fan-In:
    The workflow waits until all parallel branches complete, then merges results into a single FraudRiskAggregatorExecutor:
    # Fan-in: Usage, Location, Billing → Aggregator  
    builder.add_fan_in_edges(  
        [usage_executor, location_executor, billing_executor],  
        aggregator  
    )  
    The aggregator is an LLM-powered reasoning agent that synthesizes the findings, calculates a unified risk score, and recommends an action — all within a schema‑validated boundary.

Routing and Human-in-the-Loop

Once the aggregator produces a FraudRiskAssessment, the workflow applies deterministic decision logic:

builder.add_switch_case_edge_group(  
    aggregator,  
    [  
        # High risk → Analyst review via ReviewGatewayExecutor  
        Case(condition=lambda assessment: assessment.overall_risk_score >= 0.6,  
             target=review_gateway),  
        # Low risk → Auto clear  
        Default(target=auto_clear),  
    ],  
)  
  • High-risk cases are routed to a human fraud analyst using the RequestInfoExecutor.
    The workflow checkpoints here, allowing safe pause/resume without losing state.
  • Low-risk cases are auto-cleared by the AutoClearExecutor.

Both paths eventually converge into the FinalNotificationExecutor, which sends a customer notification and logs the audit trail.
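
For completeness, that convergence can be expressed with ordinary edges from both branches into the final executor, in the same style as the builder calls above (a sketch, assuming executor variables named review_gateway, auto_clear, and final_notification):

# Both branches converge on the final notification step
builder.add_edge(review_gateway, final_notification)  # analyst-reviewed path
builder.add_edge(auto_clear, final_notification)      # auto-cleared path

workflow = builder.build()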

Why This Matters

This modular, graph‑based design means:

  • Parallel analysis scales linearly with compute resources.
  • Deterministic orchestration ensures reproducibility and fault tolerance.
  • HITL checkpoints blend automation with required human oversight.
  • Extensibility — swapping out agents or adding new ones doesn’t require changing the orchestration pattern.

Application View

Deployment Flexibility

  • Runtime: Built on the Microsoft Agent Framework core, which provides workflow execution, checkpoint persistence, and tracing.
  • Models: Agents can call any LLM accessible through Azure OpenAI, Azure AI Foundry, or others.
  • Storage: Checkpoints and logs can use standard backends (e.g., Azure Blob, PostgreSQL).
  • Security: MCP tools isolate access; credentials and keys stay outside LLM context.
  • Monitoring and Observability: OpenTelemetry data integrates with Azure Monitor, Grafana, or Azure AI Foundry.
  • Deployment: Developers can run the full system locally or deploy it as containerized components on Azure Kubernetes Service.

Extending the Pattern

The same architecture can support other regulated processes:

  • Insurance claim reviews
  • Healthcare authorizations
  • Loan underwriting
  • Document classification with manual verification

Only the agent definitions and tool bindings change; the orchestration logic can remain the same.
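
For example, adapting the pattern to insurance claim reviews might only mean wiring different specialists into the same fan-out call (the executor names below are hypothetical):

# Same orchestration pattern, different specialists (hypothetical executors for a claims scenario)
builder.add_fan_out_edges(claim_router, [
    policy_history_executor,
    damage_assessment_executor,
    provider_billing_executor,
])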

Summary

The Fraud Detection Workflow in the Microsoft Agent Framework shows how to combine structured orchestration with AI-driven reasoning. An incoming alert fans out to three specialized agents — usage, location, and billing — each using scoped MCP tools to analyze data in parallel. Their outputs are aggregated by an AI agent that produces a typed risk assessment, which then drives a deterministic decision: auto-clear low risk or route high risk to a human review checkpoint.

The workflow demonstrates checkpointing, parallel execution, type-safe messaging, and observability with OpenTelemetry and event logs. If interrupted, it resumes from the last saved state, ensuring reliable, auditable, and reproducible execution.

In short, it’s a practical example of how the Agent Framework enables AI reasoning inside predictable, fault-tolerant workflows.

 

Huge thanks to cenyuzhang for enabling the team for this build-out and to JamesN for leading the effort with others.
