A sample for building practical AI apps with Microsoft Agent Framework, Microsoft Foundry, MCP, and Aspire

Building an AI agent has become easier than it used to be. Deploying one as part of a real production application, with multiple services, persistent state management, and production-grade infrastructure, is where things quickly get complicated. Developers in the .NET community have often asked for a cloud-native, production-level sample that runs both locally and in the cloud.

In response, we built an open-source sample app, Interview Coach: an AI chat web app that conducts mock job interviews. The sample shows how the following technologies fit together in a service designed for production:

- Microsoft Agent Framework
- Microsoft Foundry
- Model Context Protocol (MCP)
- Aspire

The app is a working interview simulator. An AI coach asks the user behavioral and technical questions, then gives feedback in the form of a summary of their interview performance. This post introduces the design patterns the app uses and the problems they solve.

You can try the Interview Coach demo app.

Why Microsoft Agent Framework?

If you have been building AI agents in .NET, you have probably used Semantic Kernel, AutoGen, or both. Microsoft Agent Framework is the next step after them. It is developed by the same teams and consolidates both projects into a single framework. Specifically, it combines AutoGen's agent abstractions with Semantic Kernel's enterprise features (state management, type safety, middleware, telemetry, and so on), and adds graph-based workflows for multi-agent orchestration.

For .NET developers, the benefits are:

- One framework: there is no need to choose between Semantic Kernel and AutoGen.
- Familiar patterns: agents use dependency injection, IChatClient, and the same hosting model as ASP.NET apps.
- Designed for production: OpenTelemetry, middleware pipelines, and Aspire integration are available from the start.
- Multi-agent orchestration: sequential workflows, concurrent execution, handoff patterns, and group chat are supported.

Interview Coach puts these capabilities together in a real application rather than a Hello World.

Why Microsoft Foundry?

An AI agent needs more than a model; it also needs infrastructure. Microsoft Foundry is Azure's platform for building and managing AI applications, and it is the recommended backend for Microsoft Agent Framework. Foundry offers the following in a single portal:

- Model access: a catalog of models from OpenAI, Meta, Mistral, and others behind one endpoint.
- Content safety: built-in moderation and PII detection that keep agents from producing problematic output.
- Cost-optimized routing: requests are routed automatically to the model best suited to the task.
- Evaluation and fine-tuning: measure agent quality and improve it over time.
- Enterprise governance: identity, access control, and compliance through Entra ID and Microsoft Defender.

In Interview Coach, Foundry provides the model endpoint that powers the agents. Because the agent code uses the IChatClient interface, Foundry is only one configuration option among several, but it is the most convenient one because it comes with a rich set of tools from the start.

What does Interview Coach do?

Interview Coach is a conversational AI that conducts mock job interviews. You provide a resume and the job description of the role you are applying for, and the agents run the interview from there:

- Intake: collects the resume and the target job description.
- Behavioral interview: asks questions tailored to your experience, based on the STAR method (an answer framework for describing past behavior in a structured way; the name stands for Situation, Task, Action, Result).
- Technical interview: asks technical questions specific to the role.
- Summary: evaluates interview performance and generates a review with concrete feedback.

The user interacts with the system through a Blazor web UI, and the AI's responses stream in real time.

Aside: what is a behavioral interview?

A behavioral interview is an interview technique that digs into a candidate's concrete past actions to judge whether their behavioral traits, skills, and way of thinking match what the company is looking for. Rather than testing knowledge or motivation alone, it predicts future performance from past facts, such as how the candidate coped with a stressful situation.

Architecture overview

The application is split into several services, all orchestrated by Aspire:

- LLM Provider: Microsoft Foundry (recommended), which gives access to a range of models.
- WebUI: a Blazor-based chat interface for the interview conversation.
- Agent: the component that holds the interview logic, built on Microsoft Agent Framework.
- MarkItDown MCP Server: uses Microsoft's MarkItDown (a Python library that converts almost anything to Markdown) to convert resumes (PDF or DOCX) to Markdown for parsing.
- InterviewData MCP Server: an MCP server implemented in .NET that stores interview session data in SQLite.

Aspire manages service discovery, health checks, and telemetry. Each component runs as a separate process, and a single command starts the whole application.

Pattern 1: Multi-agent handoff

What makes this sample particularly interesting is its use of the handoff pattern. Instead of one agent handling everything, the interview process is split across five specialized agents:

| Agent | Role | Tools |
| --- | --- | --- |
| Triage | Routes messages to the appropriate specialist agent | None (routing only) |
| Receptionist | Creates the session; collects the resume and job description | MarkItDown + InterviewData |
| Behavioral Interviewer | Conducts the behavioral interview using the STAR method | InterviewData |
| Technical Interviewer | Asks role-specific technical questions | InterviewData |
| Summarizer | Generates the final interview summary | InterviewData |

In the handoff pattern, one agent hands control of the conversation over to the next agent completely, and the receiving agent owns the conversation from then on. This differs from the "agent-as-tools" pattern, in which a main agent calls other agents as helper tools but keeps control of the conversation itself.

The handoff workflow is configured like this:

    var workflow = AgentWorkflowBuilder
        .CreateHandoffBuilderWith(triageAgent)
        .WithHandoffs(triageAgent, [receptionistAgent, behaviouralAgent, technicalAgent, summariserAgent])
        .WithHandoffs(receptionistAgent, [behaviouralAgent, triageAgent])
        .WithHandoffs(behaviouralAgent, [technicalAgent, triageAgent])
        .WithHandoffs(technicalAgent, [summariserAgent, triageAgent])
        .WithHandoff(summariserAgent, triageAgent)
        .Build();

The happy path runs in this order: Receptionist → Behavioral → Technical → Summarizer. Each specialist hands off directly to the next. If something unexpected happens, the agent returns to Triage, which routes the conversation again.

The sample also includes a single-agent mode for simpler deployments, so you can compare the single-agent and multi-agent approaches.

Pattern 2: MCP for tool integration

In this project, tools are not implemented inside the agents. Each one is provided as an independent MCP (Model Context Protocol) server. The MarkItDown server, for example, can be reused in a completely different agent project, and the team that builds tools can release them independently of the team that builds agents. MCP is also language-agnostic, which is why MarkItDown runs as a Python server in this sample while the agents are implemented in .NET.

At startup, the agent host discovers tools through MCP clients and passes them to the agents that need them:

    var receptionistAgent = new ChatClientAgent(
        chatClient: chatClient,
        name: "receptionist",
        instructions: "You are the Receptionist. Set up sessions and collect documents...",
        tools: [.. markitdownTools, .. interviewDataTools]);

Each agent is given only the tools it needs:

- Triage agent: no tools (routing only)
- Interviewer agents: access to session data
- Receptionist agent: document parsing plus session access

This design follows the principle of least privilege.

Pattern 3: Orchestration with Aspire

Aspire ties the whole application together. The app host defines the service topology: which services exist, how they depend on each other, and which configuration each one receives. This provides:

- Service discovery: services find each other by name rather than by hard-coded URL.
- Health checks: the Aspire dashboard shows the state of every component.
- Distributed tracing: OpenTelemetry is wired in through the shared service defaults.
- One-command startup: aspire run --file ./apphost.cs starts every service.

For deployment, azd up deploys the whole application to Azure Container Apps.

Getting started

Prerequisites

- .NET 10 SDK or later
- Azure subscription
- Microsoft Foundry project
- Docker Desktop or another container runtime

Run locally

    git clone https://github.com/Azure-Samples/interview-coach-agent-framework.git
    cd interview-coach-agent-framework

    # Configure credentials
    dotnet user-secrets --file ./apphost.cs set MicrosoftFoundry:Project:Endpoint "<your-endpoint>"
    dotnet user-secrets --file ./apphost.cs set MicrosoftFoundry:Project:ApiKey "<your-key>"

    # Start all services
    aspire run --file ./apphost.cs

Open the Aspire Dashboard and wait until every service shows Running. Then click the WebUI endpoint to start a mock interview. DevUI visualizes how the handoff pattern runs, and its chat UI lets you talk to the agents as the interview candidate.

Deploy to Azure

    azd auth login
    azd up

That is all it takes. Aspire and azd handle the rest. When you have finished deploying and testing, the following command safely deletes every resource that was created:

    azd down --force --purge

What you can learn from this sample

Working through Interview Coach shows you how to:

- Use Microsoft Foundry as a model backend
- Build single-agent and multi-agent systems with Microsoft Agent Framework
- Split a workflow across specialized agents with handoff orchestration
- Create and consume MCP tool servers that are independent of the agent code
- Orchestrate a multi-service application with Aspire
- Design prompts that produce consistent, structured behavior
- Deploy the whole application with azd up

Try it

The full source code is on GitHub: Azure-Samples/interview-coach-agent-framework

If you are new to Microsoft Agent Framework, start with the framework documentation and the Hello World sample. Then come back to this sample to see how the pieces fit together in a larger project. If you build something with these patterns, please open an issue and let us know.

What's next?

We are currently working on further integration scenarios, including:

- Microsoft Foundry Agent Service
- GitHub Copilot
- A2A

We will keep updating the sample as these capabilities ship.

Building a real-world example with Microsoft Agent Framework, Microsoft Foundry, MCP, and Aspire
Building AI agents keeps getting easier. Deploying them as part of a real application, with multiple services, state management, and production infrastructure, is still complicated. The .NET developer community has been asking for a real-world example that actually runs both on a local machine and, in a cloud-native way, in the cloud.

So we prepared one. Interview Coach is an open-source sample that shows how Microsoft Agent Framework, Microsoft Foundry, MCP (Model Context Protocol), and Aspire can be combined in a production setting. It is an efficient interview simulator in which an AI coach guides you through behavioral and technical interview questions and then provides a summary. This post covers the patterns we used and the problems they solve.

Visit the Interview Coach demo app.

Why use Microsoft Agent Framework?

If you have built AI agents in .NET, you have probably used Semantic Kernel, AutoGen, or both. Microsoft Agent Framework is the next step: it brings what worked in each project into a single framework. It combines AutoGen's agent abstractions with Semantic Kernel's enterprise features (state management, type safety, middleware, telemetry, and so on), and it adds graph-based workflows for multi-agent orchestration.

What does this mean for .NET developers?

- One framework. No more choosing between Semantic Kernel and AutoGen.
- Familiar patterns. Agents use dependency injection, IChatClient, and the same hosting model as ASP.NET apps.
- Designed for production. OpenTelemetry, middleware pipelines, and Aspire integration are included.
- Multi-agent orchestration. Sequential execution, concurrent execution, handoff patterns, group chat, and other orchestration patterns are supported.

Interview Coach applies all of this to a real application rather than a Hello World.

Why use Microsoft Foundry?

An AI agent needs more than a model. To begin with, it needs infrastructure. Microsoft Foundry is the Azure platform for building and managing AI applications, and it is the recommended backend for Microsoft Agent Framework. Foundry provides the following from its own portal:

- Model access. A catalog of models from OpenAI, Meta, Mistral, and others behind a single endpoint.
- Content safety. Built-in content moderation and PII detection that keep agents on track.
- Cost-optimized routing. Agent requests are routed automatically to the best-suited model.
- Evaluation and fine-tuning. Measure agent quality and improve it over time.
- Enterprise governance. Identity, access control, and compliance through Entra ID and Microsoft Defender.

In Interview Coach, Foundry provides the model endpoint that drives the agents. Because the agent code uses the IChatClient interface, Foundry may be no more than a configuration choice for the LLM, but it is the option that provides, out of the box, the most tools an agent needs.

What does Interview Coach do?

Interview Coach is a conversational AI that runs mock interviews. You provide a resume and a job posting, and the agents handle the rest:

- Intake. Collects the resume and the target job description.
- Behavioral interview. Asks STAR-method questions tailored to your experience.
- Technical interview. Asks role-specific technical questions.
- Summary. Generates a performance review with concrete feedback.

The user and the agents interact through a Blazor web UI that streams responses in real time.

Architecture overview

The application orchestrates several services through Aspire:

- LLM provider. Microsoft Foundry (recommended), for access to a range of models.
- WebUI. A Blazor chat interface for the interview conversation.
- Agent. The interview logic, built with Microsoft Agent Framework.
- MarkItDown MCP server. Converts resumes (PDF, DOCX) to Markdown using Microsoft's MarkItDown.
- InterviewData MCP server. A .NET MCP server that stores sessions in SQLite.

Aspire handles service discovery, health checks, and telemetry. Each component runs as a separate process, and a single command starts everything.

Pattern 1: Multi-agent handoff

The multi-agent scenario is built on the handoff pattern, which is also the most interesting part of this sample. Instead of one agent handling everything, the interview is divided among five specialized agents:

| Agent | Role | Tools |
| --- | --- | --- |
| Triage | Routes messages to the appropriate specialist | None (pure routing) |
| Receptionist | Creates the session; collects the resume and job posting | MarkItDown + InterviewData |
| Behavioral Interviewer | Runs behavioral interview questions using the STAR method | InterviewData |
| Technical Interviewer | Runs role-specific technical questions | InterviewData |
| Summarizer | Generates the final interview summary | InterviewData |

In the handoff pattern, one agent passes full control of the conversation to the next agent, and the receiving agent takes over completely. This differs from the "agent-as-tools" approach, in which a primary agent calls other agents as helpers while keeping control.

Here is how the handoff workflow is configured:

    var workflow = AgentWorkflowBuilder
        .CreateHandoffBuilderWith(triageAgent)
        .WithHandoffs(triageAgent, [receptionistAgent, behaviouralAgent, technicalAgent, summariserAgent])
        .WithHandoffs(receptionistAgent, [behaviouralAgent, triageAgent])
        .WithHandoffs(behaviouralAgent, [technicalAgent, triageAgent])
        .WithHandoffs(technicalAgent, [summariserAgent, triageAgent])
        .WithHandoff(summariserAgent, triageAgent)
        .Build();

As in a real interview, the flow is essentially sequential: Receptionist → Behavioral → Technical → Summarizer. Each specialist hands off directly to the next. If something unexpected happens, the agent returns to Triage for rerouting.

The sample also includes a single-agent mode for simpler deployments, so you can compare the two approaches side by side.

Pattern 2: MCP for tool integration

In this project, tools are integrated through MCP (Model Context Protocol) servers instead of being implemented inside the agents. The same MarkItDown server can be used in an entirely different agent project, and the tool team can ship independently of the agent team. MCP is also language-agnostic, so in this sample MarkItDown is a Python-based server while the agents run on .NET.

At startup, the agent host discovers tools through MCP clients and passes them to the appropriate agents:

    var receptionistAgent = new ChatClientAgent(
        chatClient: chatClient,
        name: "receptionist",
        instructions: "You are the Receptionist. Set up sessions and collect documents...",
        tools: [.. markitdownTools, .. interviewDataTools]);

Each agent receives only the tools it needs. Triage receives none (it only routes), the interviewers receive session access, and the Receptionist receives document parsing plus session access. This follows the principle of least privilege.

Pattern 3: Aspire orchestration

Aspire connects everything. The app host defines the service topology: which services exist, how they depend on each other, and which configuration they receive. It provides:

- Service discovery. Services find each other by name rather than by hard-coded URL.
- Health checks. The Aspire dashboard shows the state of every component.
- Distributed tracing. OpenTelemetry is wired in through the shared service defaults.
- Single-command startup. aspire run --file ./apphost.cs starts everything.

At deployment time, azd up pushes the whole application to Azure Container Apps.

Getting started

Prerequisites

- .NET 10 SDK or later
- Azure subscription
- Microsoft Foundry project
- Docker Desktop or another container runtime

Run locally

    git clone https://github.com/Azure-Samples/interview-coach-agent-framework.git
    cd interview-coach-agent-framework

    # Configure credentials
    dotnet user-secrets --file ./apphost.cs set MicrosoftFoundry:Project:Endpoint "<your-endpoint>"
    dotnet user-secrets --file ./apphost.cs set MicrosoftFoundry:Project:ApiKey "<your-key>"

    # Start all services
    aspire run --file ./apphost.cs

Open the Aspire dashboard, wait until every service shows Running, then click the WebUI endpoint to start a mock interview. DevUI visualizes how the handoff pattern runs, and its chat UI lets you interact with the agents as the interview candidate.

Deploy to Azure

    azd auth login
    azd up

That is essentially all there is to deployment. Aspire and azd handle the rest. When you have finished deploying and testing, run the following command to safely delete every resource:

    azd down --force --purge

What you can learn from this sample

Interview Coach gives you hands-on experience with:

- Using Microsoft Foundry as a model backend
- Building single-agent and multi-agent systems with Microsoft Agent Framework
- Splitting a workflow across specialized agents with handoff orchestration
- Creating and using MCP tool servers independently of the agent code
- Orchestrating a multi-service application with Aspire
- Writing prompts that produce consistent, structured behavior
- Deploying everything with azd up

Try it

The full source code is on GitHub: Azure-Samples/interview-coach-agent-framework

If you are new to Microsoft Agent Framework, start with the framework documentation and the Hello World sample. Then come back here to see how the parts fit together in a larger project. If you build something with these patterns, open an issue and let us know.

What's next

We are currently working on the following integration scenarios and will update the sample app as each one is finished:

- Microsoft Foundry Agent Service
- GitHub Copilot
- A2A

Exploring Azure Face API: Facial Landmark Detection and Real-Time Analysis with C#
Applications that understand and respond to human facial cues are no longer science fiction; they are becoming a reality in domains like security, driver monitoring, gaming, and AR/VR. With Azure Face API, developers can use cloud-based face detection and analysis without building machine learning models from scratch.

In this post, we use C# to detect faces, identify key facial landmarks, estimate head pose, track eye and mouth movements, and process real-time video streams. Using OpenCV for visualization, we overlay landmarks, draw bounding boxes, and calculate metrics like Eye Aspect Ratio (EAR) and Mouth Aspect Ratio (MAR), all in real time.

You'll learn to:

- Set up Azure Face API
- Detect 27 facial landmarks
- Estimate head pose (yaw, pitch, roll)
- Calculate eye aspect ratio (EAR) and mouth openness
- Draw bounding boxes around features using OpenCV
- Process real-time video

Prerequisites

- .NET 8 SDK installed
- Azure subscription with Face API resource
- Visual Studio 2022 or later
- Webcam for testing (optional)
- Basic understanding of C# and computer vision concepts

Part 1: Azure Face API Setup

1.1 Install Required NuGet Packages

    dotnet add package Azure.AI.Vision.Face
    dotnet add package OpenCvSharp4
    dotnet add package OpenCvSharp4.runtime.win

1.2 Create Azure Face API Resource

1. Navigate to Azure Portal
2. Search for "Face" and create a new Face API resource
3. Choose your pricing tier (Free tier: 20 calls/min, 30K calls/month)
4. Copy the Endpoint URL and API Key

1.3 Configure in .NET Application

appsettings.json:

    {
      "Azure": {
        "FaceApi": {
          "Endpoint": "https://your-resource.cognitiveservices.azure.com/",
          "ApiKey": "your-api-key-here"
        }
      }
    }

Initialize Face Client:

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Threading.Tasks;
    using Azure;
    using Azure.AI.Vision.Face;
    using Microsoft.Extensions.Configuration;
    using Microsoft.Extensions.Logging;

    public class FaceAnalysisService
    {
        private readonly FaceClient _faceClient;
        private readonly ILogger<FaceAnalysisService> _logger;

        public FaceAnalysisService(ILogger<FaceAnalysisService> logger, IConfiguration configuration)
        {
            _logger = logger;

            // Fail fast with a clear message if either setting is missing.
            string endpoint = configuration["Azure:FaceApi:Endpoint"]
                ?? throw new InvalidOperationException("Azure:FaceApi:Endpoint is not configured.");
            string apiKey = configuration["Azure:FaceApi:ApiKey"]
                ?? throw new InvalidOperationException("Azure:FaceApi:ApiKey is not configured.");

            _faceClient = new FaceClient(new Uri(endpoint), new AzureKeyCredential(apiKey));
            _logger.LogInformation("FaceClient initialized with endpoint: {Endpoint}", endpoint);
        }
    }

Part 2: Understanding Face Detection Models

2.1 Basic Face Detection

    public async Task<List<FaceDetectionResult>> DetectFacesAsync(byte[] imageBytes)
    {
        using var stream = new MemoryStream(imageBytes);

        // Request landmarks and head pose; face IDs are not needed for this analysis.
        var response = await _faceClient.DetectAsync(
            BinaryData.FromStream(stream),
            FaceDetectionModel.Detection03,
            FaceRecognitionModel.Recognition04,
            returnFaceId: false,
            returnFaceAttributes: new FaceAttributeType[] { FaceAttributeType.HeadPose },
            returnFaceLandmarks: true,
            returnRecognitionModel: false
        );

        _logger.LogInformation("Detected {Count} faces", response.Value.Count);
        return response.Value.ToList();
    }

Part 3: Facial Landmarks - The 27 Key Points

3.1 Understanding Facial Landmarks

3.2 Accessing Landmarks in Code

    public void PrintLandmarks(FaceDetectionResult face)
    {
        var landmarks = face.FaceLandmarks;
        if (landmarks == null)
        {
            _logger.LogWarning("No landmarks detected");
            return;
        }

        // Eye landmarks
        Console.WriteLine($"Left Eye Outer: ({landmarks.EyeLeftOuter.X}, {landmarks.EyeLeftOuter.Y})");
        Console.WriteLine($"Left Eye Inner: ({landmarks.EyeLeftInner.X}, {landmarks.EyeLeftInner.Y})");
        Console.WriteLine($"Left Eye Top: ({landmarks.EyeLeftTop.X}, {landmarks.EyeLeftTop.Y})");
        Console.WriteLine($"Left Eye Bottom: ({landmarks.EyeLeftBottom.X}, {landmarks.EyeLeftBottom.Y})");

        // Mouth landmarks
        Console.WriteLine($"Upper Lip Top: ({landmarks.UpperLipTop.X}, {landmarks.UpperLipTop.Y})");
        Console.WriteLine($"Under Lip Bottom: ({landmarks.UnderLipBottom.X}, {landmarks.UnderLipBottom.Y})");

        // Nose landmarks
        Console.WriteLine($"Nose Tip: ({landmarks.NoseTip.X}, {landmarks.NoseTip.Y})");
    }
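The code above reads only a handful of the 27 points. As a rough orientation aid, the sketch below lists all 27 landmark names grouped by facial region and checks the total. The `landmarkGroups` dictionary and the region labels are illustrative assumptions made for this post, not an SDK type; the names are written to mirror the `FaceLandmarks` property names used in the listings here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical grouping for illustration only (not part of the Face SDK).
// Each string mirrors a landmark property name such as landmarks.NoseTip.
var landmarkGroups = new Dictionary<string, string[]>
{
    ["Pupils"]   = new[] { "PupilLeft", "PupilRight" },
    ["Eyebrows"] = new[] { "EyebrowLeftOuter", "EyebrowLeftInner",
                           "EyebrowRightInner", "EyebrowRightOuter" },
    ["LeftEye"]  = new[] { "EyeLeftOuter", "EyeLeftTop", "EyeLeftBottom", "EyeLeftInner" },
    ["RightEye"] = new[] { "EyeRightInner", "EyeRightTop", "EyeRightBottom", "EyeRightOuter" },
    ["Nose"]     = new[] { "NoseTip", "NoseRootLeft", "NoseRootRight",
                           "NoseLeftAlarTop", "NoseRightAlarTop",
                           "NoseLeftAlarOutTip", "NoseRightAlarOutTip" },
    ["Mouth"]    = new[] { "MouthLeft", "MouthRight", "UpperLipTop",
                           "UpperLipBottom", "UnderLipTop", "UnderLipBottom" },
};

// Sum the group sizes to confirm the catalog covers every landmark once.
int total = landmarkGroups.Values.Sum(names => names.Length);

foreach (var (region, names) in landmarkGroups)
{
    Console.WriteLine($"{region} ({names.Length}): {string.Join(", ", names)}");
}
Console.WriteLine($"Total landmarks: {total}");
```

A grouping like this is handy when deciding which points to draw in which color, or when checking that a drawing routine has not skipped a region.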
3.3 Visualizing All Landmarks

    public void DrawAllLandmarks(FaceLandmarks landmarks, Mat frame)
    {
        // Draws one filled circle per landmark; colors are in OpenCV's BGR order.
        void DrawPoint(FaceLandmarkCoordinate point, Scalar color)
        {
            if (point != null)
            {
                Cv2.Circle(frame, new Point((int)point.X, (int)point.Y), radius: 3, color: color, thickness: -1);
            }
        }

        // Eyes (Green)
        DrawPoint(landmarks.EyeLeftOuter, new Scalar(0, 255, 0));
        DrawPoint(landmarks.EyeLeftInner, new Scalar(0, 255, 0));
        DrawPoint(landmarks.EyeLeftTop, new Scalar(0, 255, 0));
        DrawPoint(landmarks.EyeLeftBottom, new Scalar(0, 255, 0));
        DrawPoint(landmarks.EyeRightOuter, new Scalar(0, 255, 0));
        DrawPoint(landmarks.EyeRightInner, new Scalar(0, 255, 0));
        DrawPoint(landmarks.EyeRightTop, new Scalar(0, 255, 0));
        DrawPoint(landmarks.EyeRightBottom, new Scalar(0, 255, 0));

        // Eyebrows (Cyan)
        DrawPoint(landmarks.EyebrowLeftOuter, new Scalar(255, 255, 0));
        DrawPoint(landmarks.EyebrowLeftInner, new Scalar(255, 255, 0));
        DrawPoint(landmarks.EyebrowRightOuter, new Scalar(255, 255, 0));
        DrawPoint(landmarks.EyebrowRightInner, new Scalar(255, 255, 0));

        // Nose (Yellow); the two AlarTop points complete the set of 27
        DrawPoint(landmarks.NoseTip, new Scalar(0, 255, 255));
        DrawPoint(landmarks.NoseRootLeft, new Scalar(0, 255, 255));
        DrawPoint(landmarks.NoseRootRight, new Scalar(0, 255, 255));
        DrawPoint(landmarks.NoseLeftAlarTop, new Scalar(0, 255, 255));
        DrawPoint(landmarks.NoseRightAlarTop, new Scalar(0, 255, 255));
        DrawPoint(landmarks.NoseLeftAlarOutTip, new Scalar(0, 255, 255));
        DrawPoint(landmarks.NoseRightAlarOutTip, new Scalar(0, 255, 255));

        // Mouth (Blue)
        DrawPoint(landmarks.UpperLipTop, new Scalar(255, 0, 0));
        DrawPoint(landmarks.UpperLipBottom, new Scalar(255, 0, 0));
        DrawPoint(landmarks.UnderLipTop, new Scalar(255, 0, 0));
        DrawPoint(landmarks.UnderLipBottom, new Scalar(255, 0, 0));
        DrawPoint(landmarks.MouthLeft, new Scalar(255, 0, 0));
        DrawPoint(landmarks.MouthRight, new Scalar(255, 0, 0));

        // Pupils (Red)
        DrawPoint(landmarks.PupilLeft, new Scalar(0, 0, 255));
        DrawPoint(landmarks.PupilRight, new Scalar(0, 0, 255));
    }

Part 4: Drawing Bounding Boxes Around Features

4.1 Eye Bounding Boxes

    /// <summary>
    /// Draws rectangles around eyes using OpenCV.
/// </summary> public void DrawEyeBoxes(FaceLandmarks landmarks, Mat frame) { int boxWidth = 60; int boxHeight = 35; // Calculate Rectangles var leftEyeRect = new Rect((int)landmarks.EyeLeftOuter.X - boxWidth / 2, (int)landmarks.EyeLeftOuter.Y - boxHeight / 2, boxWidth, boxHeight); var rightEyeRect = new Rect((int)landmarks.EyeRightOuter.X - boxWidth / 2, (int)landmarks.EyeRightOuter.Y - boxHeight / 2, boxWidth, boxHeight); // Draw Rectangles (Green in BGR) Cv2.Rectangle(frame, leftEyeRect, new Scalar(0, 255, 0), 2); Cv2.Rectangle(frame, rightEyeRect, new Scalar(0, 255, 0), 2); // Add Labels Cv2.PutText(frame, "Left Eye", new Point(leftEyeRect.X, leftEyeRect.Y - 5), HersheyFonts.HersheySimplex, 0.4, new Scalar(0, 255, 0), 1); Cv2.PutText(frame, "Right Eye", new Point(rightEyeRect.X, rightEyeRect.Y - 5), HersheyFonts.HersheySimplex, 0.4, new Scalar(0, 255, 0), 1); } 4.2 Mouth Bounding Box /// <summary> /// Draws rectangle around mouth region. /// </summary> public void DrawMouthBox(FaceLandmarks landmarks, Mat frame) { int boxWidth = 80; int boxHeight = 50; // Calculate center based on the vertical lip landmarks int centerX = (int)((landmarks.UpperLipTop.X + landmarks.UnderLipBottom.X) / 2); int centerY = (int)((landmarks.UpperLipTop.Y + landmarks.UnderLipBottom.Y) / 2); var mouthRect = new Rect(centerX - boxWidth / 2, centerY - boxHeight / 2, boxWidth, boxHeight); // Draw Mouth Box (Blue in BGR) Cv2.Rectangle(frame, mouthRect, new Scalar(255, 0, 0), 2); // Add Label Cv2.PutText(frame, "Mouth", new Point(mouthRect.X, mouthRect.Y - 5), HersheyFonts.HersheySimplex, 0.4, new Scalar(255, 0, 0), 1); } 4.3 Face Bounding Box /// <summary> /// Draws rectangle around entire face using the face rectangle from API. 
/// </summary> public void DrawFaceBox(FaceDetectionResult face, Mat frame) { var faceRect = face.FaceRectangle; if (faceRect == null) { return; } var rect = new Rect( faceRect.Left, faceRect.Top, faceRect.Width, faceRect.Height ); // Draw Face Bounding Box (Red in BGR) Cv2.Rectangle(frame, rect, new Scalar(0, 0, 255), 2); // Add Label with dimensions Cv2.PutText(frame, $"Face {faceRect.Width}x{faceRect.Height}", new Point(rect.X, rect.Y - 10), HersheyFonts.HersheySimplex, 0.5, new Scalar(0, 0, 255), 2); } 4.4 Nose Bounding Box /// <summary> /// Draws bounding box around nose using nose landmarks. /// </summary> public void DrawNoseBox(FaceLandmarks landmarks, Mat frame) { // Calculate horizontal bounds from Alar tips int minX = (int)Math.Min(landmarks.NoseLeftAlarOutTip.X, landmarks.NoseRightAlarOutTip.X); int maxX = (int)Math.Max(landmarks.NoseLeftAlarOutTip.X, landmarks.NoseRightAlarOutTip.X); // Calculate vertical bounds from Root to Tip int minY = (int)Math.Min(landmarks.NoseRootLeft.Y, landmarks.NoseTip.Y); int maxY = (int)landmarks.NoseTip.Y; // Create Rect with a 10px padding buffer var noseRect = new Rect( minX - 10, minY - 10, (maxX - minX) + 20, (maxY - minY) + 20 ); // Draw Nose Box (Yellow in BGR) Cv2.Rectangle(frame, noseRect, new Scalar(0, 255, 255), 2); } Part 5: Geometric Calculations with Landmarks 5.1 Calculating Euclidean Distance /// <summary> /// Calculates distance between two landmark points. /// </summary> public static double CalculateDistance(dynamic point1, dynamic point2) { double dx = point1.X - point2.X; double dy = point1.Y - point2.Y; return Math.Sqrt(dx * dx + dy * dy); } 5.2 Eye Aspect Ratio (EAR) Formula /// <summary> /// Calculates the Eye Aspect Ratio (EAR) to detect eye closure. 
/// </summary> public double CalculateEAR( FaceLandmarkCoordinate top1, FaceLandmarkCoordinate top2, FaceLandmarkCoordinate bottom1, FaceLandmarkCoordinate bottom2, FaceLandmarkCoordinate inner, FaceLandmarkCoordinate outer) { // Vertical distances double v1 = CalculateDistance(top1, bottom1); double v2 = CalculateDistance(top2, bottom2); // Horizontal distance double h = CalculateDistance(inner, outer); // EAR formula: (||p2-p6|| + ||p3-p5||) / (2 * ||p1-p4||) return (v1 + v2) / (2.0 * h); } Simplified Implementation: /// <summary> /// Calculates Eye Aspect Ratio (EAR) for a single eye. /// Reference: "Real-Time Eye Blink Detection using Facial Landmarks" (Soukupová & Čech, 2016) /// </summary> public double ComputeEAR(FaceLandmarks landmarks, bool isLeftEye) { var top = isLeftEye ? landmarks.EyeLeftTop : landmarks.EyeRightTop; var bottom = isLeftEye ? landmarks.EyeLeftBottom : landmarks.EyeRightBottom; var inner = isLeftEye ? landmarks.EyeLeftInner : landmarks.EyeRightInner; var outer = isLeftEye ? landmarks.EyeLeftOuter : landmarks.EyeRightOuter; if (top == null || bottom == null || inner == null || outer == null) { _logger.LogWarning("Missing eye landmarks"); return 1.0; // Return 1.0 (open) to prevent false positives for drowsiness } double verticalDist = CalculateDistance(top, bottom); double horizontalDist = CalculateDistance(inner, outer); // Simplified EAR for Azure 27-point model double ear = verticalDist / horizontalDist; _logger.LogDebug( "EAR for {Eye}: {Value:F3}", isLeftEye ? "left" : "right", ear ); return ear; } Usage Example: var leftEAR = ComputeEAR(landmarks, isLeftEye: true); var rightEAR = ComputeEAR(landmarks, isLeftEye: false); var avgEAR = (leftEAR + rightEAR) / 2.0; Console.WriteLine($"Average EAR: {avgEAR:F3}"); // Open eyes: ~0.25-0.30 // Closed eyes: ~0.10-0.15 5.3 Mouth Aspect Ratio (MAR) /// <summary> /// Calculates Mouth Aspect Ratio relative to face height. 
/// </summary> public double CalculateMouthAspectRatio(FaceLandmarks landmarks, FaceRectangle faceRect) { double mouthHeight = landmarks.UnderLipBottom.Y - landmarks.UpperLipTop.Y; double mouthWidth = CalculateDistance(landmarks.MouthLeft, landmarks.MouthRight); double mouthOpenRatio = mouthHeight / faceRect.Height; double mouthWidthRatio = mouthWidth / faceRect.Width; _logger.LogDebug( "Mouth - Height ratio: {HeightRatio:F3}, Width ratio: {WidthRatio:F3}", mouthOpenRatio, mouthWidthRatio ); return mouthOpenRatio; } 5.4 Inter-Eye Distance /// <summary> /// Calculates the distance between pupils (inter-pupillary distance). /// </summary> public double CalculateInterEyeDistance(FaceLandmarks landmarks) { return CalculateDistance(landmarks.PupilLeft, landmarks.PupilRight); } /// <summary> /// Calculates distance between inner eye corners. /// </summary> public double CalculateInnerEyeDistance(FaceLandmarks landmarks) { return CalculateDistance(landmarks.EyeLeftInner, landmarks.EyeRightInner); } 5.5 Face Symmetry Analysis /// <summary> /// Analyzes facial symmetry by comparing left and right sides. 
    /// </summary>
    public FaceSymmetryMetrics AnalyzeFaceSymmetry(FaceLandmarks landmarks)
    {
        // Treat the nose tip as the vertical midline of the face.
        double centerX = landmarks.NoseTip.X;

        // Horizontal distance of each landmark from the midline.
        double leftEyeDistance = CalculateDistance(landmarks.EyeLeftInner, new { X = centerX, Y = landmarks.EyeLeftInner.Y });
        double leftMouthDistance = CalculateDistance(landmarks.MouthLeft, new { X = centerX, Y = landmarks.MouthLeft.Y });
        double rightEyeDistance = CalculateDistance(landmarks.EyeRightInner, new { X = centerX, Y = landmarks.EyeRightInner.Y });
        double rightMouthDistance = CalculateDistance(landmarks.MouthRight, new { X = centerX, Y = landmarks.MouthRight.Y });

        return new FaceSymmetryMetrics
        {
            EyeSymmetryRatio = leftEyeDistance / rightEyeDistance,
            MouthSymmetryRatio = leftMouthDistance / rightMouthDistance,
            IsSymmetric = Math.Abs(leftEyeDistance - rightEyeDistance) < 5.0
        };
    }

    public class FaceSymmetryMetrics
    {
        public double EyeSymmetryRatio { get; set; }
        public double MouthSymmetryRatio { get; set; }
        public bool IsSymmetric { get; set; }
    }

Part 6: Head Pose Estimation

6.1 Understanding Head Pose Angles

Azure Face API provides three Euler angles for head orientation:

- Yaw: turning the head left or right
- Pitch: looking up or down
- Roll: tilting the head toward a shoulder

6.2 Accessing Head Pose Data

    public void AnalyzeHeadPose(FaceDetectionResult face)
    {
        var headPose = face.FaceAttributes?.HeadPose;
        if (headPose == null)
        {
            _logger.LogWarning("Head pose not available");
            return;
        }

        double yaw = headPose.Yaw;
        double pitch = headPose.Pitch;
        double roll = headPose.Roll;

        Console.WriteLine("Head Pose:");
        Console.WriteLine($" Yaw: {yaw:F2}° (Left/Right)");
        Console.WriteLine($" Pitch: {pitch:F2}° (Up/Down)");
        Console.WriteLine($" Roll: {roll:F2}° (Tilt)");

        // Print the textual interpretation instead of discarding it.
        Console.WriteLine($" Summary: {InterpretHeadPose(yaw, pitch, roll)}");
    }

6.3 Interpreting Head Pose

    public string InterpretHeadPose(double yaw, double pitch, double roll)
    {
        var directions = new List<string>();

        // Interpret Yaw (horizontal). Anything outside the ±10° dead zone gets a label,
        // so angles between 10° and 20° are no longer silently dropped.
        if (Math.Abs(yaw) < 10) directions.Add("Looking Forward");
        else if (yaw < 0) directions.Add($"Turned Left ({Math.Abs(yaw):F0}°)");
        else directions.Add($"Turned Right ({yaw:F0}°)");

        // Interpret Pitch (vertical), with the same ±10° dead zone.
        if (Math.Abs(pitch) < 10) directions.Add("Level");
        else if (pitch < 0) directions.Add($"Looking Down ({Math.Abs(pitch):F0}°)");
        else directions.Add($"Looking Up ({pitch:F0}°)");

        // Interpret Roll (tilt)
        if (Math.Abs(roll) > 15)
        {
            string side = roll < 0 ? "Left" : "Right";
            directions.Add($"Tilted {side} ({Math.Abs(roll):F0}°)");
        }

        return string.Join(", ", directions);
    }

6.4 Visualizing Head Pose on Frame

    /// <summary>
    /// Draws head pose information with color-coded indicators.
    /// </summary>
    public void DrawHeadPoseInfo(Mat frame, HeadPose headPose, FaceRectangle faceRect)
    {
        double yaw = headPose.Yaw;
        double pitch = headPose.Pitch;
        double roll = headPose.Roll;

        int centerX = faceRect.Left + faceRect.Width / 2;
        int centerY = faceRect.Top + faceRect.Height / 2;

        // Numeric readout above the face rectangle.
        string poseText = $"Yaw: {yaw:F1}° Pitch: {pitch:F1}° Roll: {roll:F1}°";
        Cv2.PutText(frame, poseText, new Point(faceRect.Left, faceRect.Top - 10), HersheyFonts.HersheySimplex, 0.5, new Scalar(255, 255, 255), 1);

        // Horizontal arrow (green) whose length follows the yaw angle.
        int arrowLength = 50;
        double yawRadians = yaw * Math.PI / 180.0;
        int arrowEndX = centerX + (int)(arrowLength * Math.Sin(yawRadians));
        Cv2.ArrowedLine(frame, new Point(centerX, centerY), new Point(arrowEndX, centerY), new Scalar(0, 255, 0), 2, tipLength: 0.3);

        // Vertical arrow (blue); pitch is negated because image Y grows downward.
        double pitchRadians = -pitch * Math.PI / 180.0;
        int arrowPitchEndY = centerY + (int)(arrowLength * Math.Sin(pitchRadians));
        Cv2.ArrowedLine(frame, new Point(centerX, centerY), new Point(centerX, arrowPitchEndY), new Scalar(255, 0, 0), 2, tipLength: 0.3);
    }

6.5 Detecting Head Orientation States

    public enum HeadOrientation
    {
        Forward, Left, Right, Up, Down,
        TiltedLeft, TiltedRight,
        UpLeft, UpRight, DownLeft, DownRight
    }

    public List<HeadOrientation> DetectHeadOrientation(HeadPose headPose)
    {
        const double THRESHOLD = 15.0;

        bool lookingUp = headPose.Pitch > THRESHOLD;
        bool lookingDown = headPose.Pitch < -THRESHOLD;
        bool lookingLeft = headPose.Yaw < -THRESHOLD;
        bool lookingRight = headPose.Yaw
> THRESHOLD; var orientations = new List<HeadOrientation>(); if (!lookingUp && !lookingDown && !lookingLeft && !lookingRight) orientations.Add(HeadOrientation.Forward); if (lookingUp && !lookingLeft && !lookingRight) orientations.Add(HeadOrientation.Up); if (lookingDown && !lookingLeft && !lookingRight) orientations.Add(HeadOrientation.Down); if (lookingLeft && !lookingUp && !lookingDown) orientations.Add(HeadOrientation.Left); if (lookingRight && !lookingUp && !lookingDown) orientations.Add(HeadOrientation.Right); if (lookingUp && lookingLeft) orientations.Add(HeadOrientation.UpLeft); if (lookingUp && lookingRight) orientations.Add(HeadOrientation.UpRight); if (lookingDown && lookingLeft) orientations.Add(HeadOrientation.DownLeft); if (lookingDown && lookingRight) orientations.Add(HeadOrientation.DownRight); return orientations; } Part 7: Real-Time Video Processing 7.1 Setting Up Video Capture using OpenCvSharp; public class RealTimeFaceAnalyzer : IDisposable { private VideoCapture? _capture; private Mat? _frame; private readonly FaceClient _faceClient; private bool _isRunning; public async Task StartAsync() { _capture = new VideoCapture(0); _frame = new Mat(); _isRunning = true; await Task.Run(() => ProcessVideoLoop()); } private async Task ProcessVideoLoop() { while (_isRunning) { if (_capture == null || !_capture.IsOpened()) break; _capture.Read(_frame); if (_frame == null || _frame.Empty()) { await Task.Delay(1); // Minimal delay to prevent CPU spiking continue; } Cv2.Resize(_frame, _frame, new Size(640, 480)); // Ensure we don't await indefinitely in the rendering loop _ = ProcessFrameAsync(_frame.Clone()); Cv2.ImShow("Face Analysis", _frame); if (Cv2.WaitKey(30) == 'q') break; } Dispose(); } private async Task ProcessFrameAsync(Mat frame) { // This is where your DrawFaceBox, DrawAllLandmarks, and EAR logic will sit. // Remember to use try-catch here to prevent API errors from crashing the loop. 
    }

    public void Dispose()
    {
        _isRunning = false;
        _capture?.Dispose();
        _frame?.Dispose();
        Cv2.DestroyAllWindows();
    }
}

7.2 Optimizing API Calls

Problem: Calling Azure Face API on every frame (30 fps) is expensive and slow.

Solution: Call API once per second, cache results for 30 frames.

private List<FaceDetectionResult> _cachedFaces = new();
private DateTime _lastDetectionTime = DateTime.MinValue;
private readonly object _cacheLock = new();

private async Task ProcessFrameAsync(Mat frame)
{
    // Refresh the cache at most once per second
    if ((DateTime.Now - _lastDetectionTime).TotalSeconds >= 1.0)
    {
        _lastDetectionTime = DateTime.Now;

        byte[] imageBytes;
        Cv2.ImEncode(".jpg", frame, out imageBytes);

        var faces = await DetectFacesAsync(imageBytes);
        lock (_cacheLock)
        {
            _cachedFaces = faces;
        }
    }

    // Draw from a snapshot of the cache so the lock is held only briefly
    List<FaceDetectionResult> facesToProcess;
    lock (_cacheLock)
    {
        facesToProcess = _cachedFaces.ToList();
    }

    foreach (var face in facesToProcess)
    {
        DrawFaceAnnotations(face, frame);
    }
}

Performance Improvement:
- 30x fewer API calls (1/sec instead of 30/sec)
- ~$0.02/hour instead of ~$0.60/hour
- Smooth 30 fps rendering
- < 100ms latency for visual updates

7.3 Drawing Complete Face Annotations

private void DrawFaceAnnotations(FaceDetectionResult face, Mat frame)
{
    DrawFaceBox(face, frame);

    if (face.FaceLandmarks != null)
    {
        DrawAllLandmarks(face.FaceLandmarks, frame);
        DrawEyeBoxes(face.FaceLandmarks, frame);
        DrawMouthBox(face.FaceLandmarks, frame);
        DrawNoseBox(face.FaceLandmarks, frame);

        double leftEAR = ComputeEAR(face.FaceLandmarks, isLeftEye: true);
        double rightEAR = ComputeEAR(face.FaceLandmarks, isLeftEye: false);
        double avgEAR = (leftEAR + rightEAR) / 2.0;

        Cv2.PutText(frame, $"EAR: {avgEAR:F3}", new Point(10, 30),
            HersheyFonts.HersheySimplex, 0.6, new Scalar(0, 255, 0), 2);
    }

    if (face.FaceAttributes?.HeadPose != null)
    {
        DrawHeadPoseInfo(frame, face.FaceAttributes.HeadPose, face.FaceRectangle);

        string orientation = InterpretHeadPose(
            face.FaceAttributes.HeadPose.Yaw,
            face.FaceAttributes.HeadPose.Pitch,
            face.FaceAttributes.HeadPose.Roll);
        Cv2.PutText(frame, orientation, new Point(10, 60),
            HersheyFonts.HersheySimplex, 0.6, new Scalar(255, 255, 0), 2);
    }
}

Part 8: Advanced Features and Use Cases

8.1 Face Tracking Across Frames

public class FaceTracker
{
    private class TrackedFace
    {
        public FaceRectangle Rectangle { get; set; }
        public DateTime LastSeen { get; set; }
        public int TrackId { get; set; }
    }

    private List<TrackedFace> _trackedFaces = new();
    private int _nextTrackId = 1;

    public int TrackFace(FaceRectangle newFace)
    {
        // Maximum top-left corner movement (in pixels) still treated as the same face
        const int MATCH_THRESHOLD = 50;

        var match = _trackedFaces.FirstOrDefault(tf =>
        {
            double distance = Math.Sqrt(
                Math.Pow(tf.Rectangle.Left - newFace.Left, 2) +
                Math.Pow(tf.Rectangle.Top - newFace.Top, 2));
            return distance < MATCH_THRESHOLD;
        });

        if (match != null)
        {
            match.Rectangle = newFace;
            match.LastSeen = DateTime.Now;
            return match.TrackId;
        }

        var newTrack = new TrackedFace
        {
            Rectangle = newFace,
            LastSeen = DateTime.Now,
            TrackId = _nextTrackId++
        };
        _trackedFaces.Add(newTrack);
        return newTrack.TrackId;
    }

    public void RemoveOldTracks(TimeSpan maxAge)
    {
        _trackedFaces.RemoveAll(tf => DateTime.Now - tf.LastSeen > maxAge);
    }
}

8.2 Multi-Face Detection and Analysis

public async Task<FaceAnalysisReport> AnalyzeMultipleFacesAsync(byte[] imageBytes)
{
    var faces = await DetectFacesAsync(imageBytes);

    var report = new FaceAnalysisReport
    {
        TotalFacesDetected = faces.Count,
        Timestamp = DateTime.Now,
        Faces = new List<SingleFaceAnalysis>()
    };

    for (int i = 0; i < faces.Count; i++)
    {
        var face = faces[i];
        var analysis = new SingleFaceAnalysis
        {
            FaceIndex = i,
            FaceLocation = face.FaceRectangle,
            FaceSize = face.FaceRectangle.Width * face.FaceRectangle.Height
        };

        if (face.FaceLandmarks != null)
        {
            analysis.LeftEyeEAR = ComputeEAR(face.FaceLandmarks, true);
            analysis.RightEyeEAR = ComputeEAR(face.FaceLandmarks, false);
            analysis.InterPupillaryDistance = CalculateInterEyeDistance(face.FaceLandmarks);
        }

        if (face.FaceAttributes?.HeadPose != null)
        {
            analysis.HeadYaw = face.FaceAttributes.HeadPose.Yaw;
            analysis.HeadPitch = face.FaceAttributes.HeadPose.Pitch;
            analysis.HeadRoll = face.FaceAttributes.HeadPose.Roll;
        }

        report.Faces.Add(analysis);
    }

    // Largest (usually closest) face first
    report.Faces = report.Faces.OrderByDescending(f => f.FaceSize).ToList();
    return report;
}

public class FaceAnalysisReport
{
    public int TotalFacesDetected { get; set; }
    public DateTime Timestamp { get; set; }
    public List<SingleFaceAnalysis> Faces { get; set; }
}

public class SingleFaceAnalysis
{
    public int FaceIndex { get; set; }
    public FaceRectangle FaceLocation { get; set; }
    public int FaceSize { get; set; }
    public double LeftEyeEAR { get; set; }
    public double RightEyeEAR { get; set; }
    public double InterPupillaryDistance { get; set; }
    public double HeadYaw { get; set; }
    public double HeadPitch { get; set; }
    public double HeadRoll { get; set; }
}

8.3 Exporting Landmark Data to JSON

using System.Text.Json;

public string ExportLandmarksToJson(FaceDetectionResult face)
{
    var landmarks = face.FaceLandmarks;

    var landmarkData = new
    {
        Face = new
        {
            Rectangle = new
            {
                face.FaceRectangle.Left,
                face.FaceRectangle.Top,
                face.FaceRectangle.Width,
                face.FaceRectangle.Height
            }
        },
        Eyes = new
        {
            Left = new
            {
                Outer = new { landmarks.EyeLeftOuter.X, landmarks.EyeLeftOuter.Y },
                Inner = new { landmarks.EyeLeftInner.X, landmarks.EyeLeftInner.Y },
                Top = new { landmarks.EyeLeftTop.X, landmarks.EyeLeftTop.Y },
                Bottom = new { landmarks.EyeLeftBottom.X, landmarks.EyeLeftBottom.Y }
            },
            Right = new
            {
                Outer = new { landmarks.EyeRightOuter.X, landmarks.EyeRightOuter.Y },
                Inner = new { landmarks.EyeRightInner.X, landmarks.EyeRightInner.Y },
                Top = new { landmarks.EyeRightTop.X, landmarks.EyeRightTop.Y },
                Bottom = new { landmarks.EyeRightBottom.X, landmarks.EyeRightBottom.Y }
            }
        },
        Mouth = new
        {
            UpperLipTop = new { landmarks.UpperLipTop.X, landmarks.UpperLipTop.Y },
            UnderLipBottom = new { landmarks.UnderLipBottom.X, landmarks.UnderLipBottom.Y },
            Left = new { landmarks.MouthLeft.X, landmarks.MouthLeft.Y },
            Right = new { landmarks.MouthRight.X,
                landmarks.MouthRight.Y }
        },
        Nose = new
        {
            Tip = new { landmarks.NoseTip.X, landmarks.NoseTip.Y },
            RootLeft = new { landmarks.NoseRootLeft.X, landmarks.NoseRootLeft.Y },
            RootRight = new { landmarks.NoseRootRight.X, landmarks.NoseRootRight.Y }
        },
        HeadPose = face.FaceAttributes?.HeadPose != null
            ? new
            {
                face.FaceAttributes.HeadPose.Yaw,
                face.FaceAttributes.HeadPose.Pitch,
                face.FaceAttributes.HeadPose.Roll
            }
            : null
    };

    return JsonSerializer.Serialize(landmarkData,
        new JsonSerializerOptions { WriteIndented = true });
}

Part 9: Practical Applications

9.1 Gaze Direction Estimation

public enum GazeDirection
{
    Center, Left, Right, Up, Down,
    UpLeft, UpRight, DownLeft, DownRight
}

public GazeDirection EstimateGazeDirection(HeadPose headPose)
{
    const double THRESHOLD = 15.0;

    bool lookingUp = headPose.Pitch > THRESHOLD;
    bool lookingDown = headPose.Pitch < -THRESHOLD;
    bool lookingLeft = headPose.Yaw < -THRESHOLD;
    bool lookingRight = headPose.Yaw > THRESHOLD;

    // Diagonals are checked first so they take priority over single-axis results
    if (lookingUp && lookingLeft) return GazeDirection.UpLeft;
    if (lookingUp && lookingRight) return GazeDirection.UpRight;
    if (lookingDown && lookingLeft) return GazeDirection.DownLeft;
    if (lookingDown && lookingRight) return GazeDirection.DownRight;
    if (lookingUp) return GazeDirection.Up;
    if (lookingDown) return GazeDirection.Down;
    if (lookingLeft) return GazeDirection.Left;
    if (lookingRight) return GazeDirection.Right;

    return GazeDirection.Center;
}

9.2 Expression Analysis Using Landmarks

public class ExpressionAnalyzer
{
    public bool IsSmiling(FaceLandmarks landmarks)
    {
        double mouthCenterY = (landmarks.UpperLipTop.Y + landmarks.UnderLipBottom.Y) / 2;
        double leftCornerY = landmarks.MouthLeft.Y;
        double rightCornerY = landmarks.MouthRight.Y;

        // Smaller Y is higher in the image: both corners sit above the mouth center
        return leftCornerY < mouthCenterY && rightCornerY < mouthCenterY;
    }

    public bool IsMouthOpen(FaceLandmarks landmarks, FaceRectangle faceRect)
    {
        double mouthHeight = landmarks.UnderLipBottom.Y - landmarks.UpperLipTop.Y;
        double mouthOpenRatio = mouthHeight / faceRect.Height;
        return mouthOpenRatio
            > 0.08; // 8% of face height
    }

    public bool AreEyesClosed(FaceLandmarks landmarks)
    {
        double leftEAR = ComputeEAR(landmarks, isLeftEye: true);
        double rightEAR = ComputeEAR(landmarks, isLeftEye: false);
        double avgEAR = (leftEAR + rightEAR) / 2.0;
        return avgEAR < 0.18; // Threshold for closed eyes
    }
}

9.3 Face Orientation for AR/VR Applications

public class FaceOrientationFor3D
{
    public (Vector3 forward, Vector3 up, Vector3 right) GetFaceOrientation(HeadPose headPose)
    {
        // Convert degrees to radians
        double yawRad = headPose.Yaw * Math.PI / 180.0;
        double pitchRad = headPose.Pitch * Math.PI / 180.0;
        double rollRad = headPose.Roll * Math.PI / 180.0;

        var forward = new Vector3(
            (float)(Math.Sin(yawRad) * Math.Cos(pitchRad)),
            (float)(-Math.Sin(pitchRad)),
            (float)(Math.Cos(yawRad) * Math.Cos(pitchRad)));

        var up = new Vector3(
            (float)(Math.Sin(yawRad) * Math.Sin(pitchRad) * Math.Cos(rollRad) - Math.Cos(yawRad) * Math.Sin(rollRad)),
            (float)(Math.Cos(pitchRad) * Math.Cos(rollRad)),
            (float)(Math.Cos(yawRad) * Math.Sin(pitchRad) * Math.Cos(rollRad) + Math.Sin(yawRad) * Math.Sin(rollRad)));

        var right = Vector3.Cross(up, forward);

        return (forward, up, right);
    }
}

public struct Vector3
{
    public float X, Y, Z;

    public Vector3(float x, float y, float z)
    {
        X = x;
        Y = y;
        Z = z;
    }

    public static Vector3 Cross(Vector3 a, Vector3 b) =>
        new Vector3(
            a.Y * b.Z - a.Z * b.Y,
            a.Z * b.X - a.X * b.Z,
            a.X * b.Y - a.Y * b.X);
}

Conclusion

This technical guide has explored the capabilities of Azure Face API for facial analysis in C#.
We've covered:

Key Capabilities Demonstrated
- Facial Landmark Detection: accessing 27 precise points on the face
- Head Pose Estimation: tracking yaw, pitch, and roll angles
- Geometric Calculations: computing EAR, distances, and ratios
- Visual Annotations: drawing bounding boxes with OpenCV
- Real-Time Processing: optimized video stream analysis

Technical Achievements

Computer Vision Math:
- Euclidean distance calculations
- Eye Aspect Ratio (EAR) formula
- Mouth aspect ratio measurements
- Face symmetry analysis

OpenCV Integration:
- Drawing bounding boxes and landmarks
- Color-coded feature highlighting
- Real-time annotation overlays
- Video capture and processing

Practical Applications

This technology enables:
- 👁️ Gaze tracking for UI/UX studies
- 🎮 Head-controlled game interfaces
- 📸 Auto-focus camera systems
- 🎭 Expression analysis for feedback
- 🥽 AR/VR avatar control
- 📊 Attention analytics for presentations
- ♿ Accessibility features for users with disabilities

Performance Metrics
- Detection Accuracy: 95%+ for frontal faces
- Landmark Precision: ±2-3 pixels
- Processing Latency: 200-500ms per API call
- Frame Rate: 30 fps with caching

Further Exploration

Advanced Topics to Explore:
- Face Recognition: identify individuals
- Age/Gender Detection: demographic analysis
- Emotion Detection: facial expression classification
- Face Verification: 1:1 identity confirmation
- Similar Face Search: 1:N face matching
- Face Grouping: cluster similar faces

Call to Action

📌 Explore these resources to get started:

Official Documentation
- Azure Face API Documentation
- Face API REST Reference
- Azure Face SDK for .NET

Related Libraries
- OpenCvSharp: OpenCV wrapper for .NET
- System.Drawing: .NET image processing

Source Code
- GitHub Repository: ravimodi_microsoft/SmartDriver
- Sample Code: Included in this article

Building MCP Apps with Azure Functions MCP Extension
Today, we are thrilled to announce the release of MCP App support in the Azure Functions MCP (Model Context Protocol) extension! You can now build MCP Apps using the Functions MCP Extension in Python, TypeScript, and .NET.

What are MCP Apps

Until now, MCP has primarily been a way for AI agents to “talk” to data and tools. A tool would take an input, perform a task, and return a text response. While powerful, text has limits. For example, it’s easier to see a chart than to read a long list of data points. It’s also more convenient and accurate to provide complex inputs via a form than through a series of text responses.

MCP Apps address these limits by allowing MCP servers to return interactive HTML interfaces that render directly in the conversation. The following scenarios show how the UI capabilities of MCP Apps improve the user experience of MCP tools in ways that text can’t:

- Data exploration: A sales analytics tool returns an interactive dashboard. Users filter by region, drill down into specific accounts, and export reports without leaving the conversation.
- Configuration wizards: A deployment tool presents a form with dependent fields. Selecting “production” reveals additional security options; selecting “staging” shows different defaults.
- Real-time monitoring: A server health tool shows live metrics that update as systems change. No need to re-run the tool to see current status.

Building MCP Apps with Azure Functions MCP Extension

Azure Functions is the ideal platform for hosting remote MCP servers because of its built-in authentication, event-driven scaling from 0 to N, and serverless billing. This ensures your agentic tools are secure, cost-effective, and ready to handle any load.

How It Works: Connecting Tools to Resources

Building an MCP App involves two main components:

- Tools: Tools are executable functions that allow an LLM to interact with external systems (e.g., querying a database or sending an email).
- Resources: Resources are read-only data entities (e.g., log files, API docs, or database schemas) that provide the LLM with information without triggering side effects.

You connect the tools to resources via the tools’ metadata.

1. The Tool with UI Metadata

The following code snippet defines an MCP tool called GetWeather using the McpToolTrigger, and attaches metadata to it using McpMetadata. The McpMetadata declares that the tool has an associated UI, telling AI clients that when this tool is invoked, there’s a specific visual component available to display the results.

Example (Python):

TOOL_METADATA = '{"ui": {"resourceUri": "ui://weather/index.html"}}'

@app.mcp_tool(metadata=TOOL_METADATA)
@app.mcp_tool_property(arg_name="location", description="City name to check weather for (e.g., Seattle, New York, Miami)")
def get_weather(location: str) -> str:
    result = weather_service.get_current_weather(location)
    # The result is serialized, so the tool returns a JSON string
    return json.dumps(result)

Example (C#):

private const string ToolMetadata = """
    {
        "ui": {
            "resourceUri": "ui://weather/index.html"
        }
    }
    """;

[Function(nameof(GetWeather))]
public async Task<object> GetWeather(
    [McpToolTrigger(nameof(GetWeather), "Returns current weather for a location via Open-Meteo.")]
    [McpMetadata(ToolMetadata)]
    ToolInvocationContext context,
    [McpToolProperty("location", "City name to check weather for (e.g., Seattle, New York, Miami)")]
    string location)
{
    var result = await _weatherService.GetCurrentWeatherAsync(location);
    return result;
}

2. The Resource Serving the UI

The following snippet defines an MCP resource called GetWeatherWidget, which serves the bundled HTML at the matching URI. The MimeType is set to text/html;profile=mcp-app. Note that the resource URI (ui://weather/index.html) is the same as the one specified in ToolMetadata above.
Example (Python):

RESOURCE_METADATA = '{"ui": {"prefersBorder": true}}'
WEATHER_WIDGET_URI = "ui://weather/index.html"
WEATHER_WIDGET_NAME = "Weather Widget"
WEATHER_WIDGET_DESCRIPTION = "Interactive weather display for MCP Apps"
WEATHER_WIDGET_MIME_TYPE = "text/html;profile=mcp-app"

@app.mcp_resource_trigger(
    arg_name="context",
    uri=WEATHER_WIDGET_URI,
    resource_name=WEATHER_WIDGET_NAME,
    description=WEATHER_WIDGET_DESCRIPTION,
    mime_type=WEATHER_WIDGET_MIME_TYPE,
    metadata=RESOURCE_METADATA
)
def get_weather_widget(context) -> str:
    # Get the path to the widget HTML file
    current_dir = Path(__file__).parent
    file_path = current_dir / "app" / "dist" / "index.html"
    return file_path.read_text(encoding="utf-8")

Example (C#):

// Optional UI metadata
private const string ResourceMetadata = """
    {
        "ui": {
            "prefersBorder": true
        }
    }
    """;

[Function(nameof(GetWeatherWidget))]
public string GetWeatherWidget(
    [McpResourceTrigger(
        "ui://weather/index.html",
        "Weather Widget",
        MimeType = "text/html;profile=mcp-app",
        Description = "Interactive weather display for MCP Apps")]
    [McpMetadata(ResourceMetadata)]
    ResourceInvocationContext context)
{
    var file = Path.Combine(AppContext.BaseDirectory, "app", "dist", "index.html");
    return File.ReadAllText(file);
}

See the quickstarts in the Get Started section for the full sample code.

3. Putting It All Together

- User asks: “What’s the weather in Seattle?”
- Agent calls the GetWeather tool.
- The tool returns weather data (as a normal tool result).
- The tool also includes ui.resourceUri metadata (ui://weather/index.html), telling the client an interactive UI is available.
- The client fetches the UI resource from ui://weather/index.html and loads it in a sandboxed iframe.
- The client passes the tool result to the UI app.
- User sees an interactive weather widget instead of plain text.

Get Started

You can start building today using our samples.
Each sample demonstrates how to define tools that trigger interactive UI components:

- Python quickstart
- TypeScript quickstart
- .NET quickstart

Documentation

- Learn more about the Azure Functions MCP extension.
- Learn more about MCP Apps.

Next Step: Authentication

The samples above secure the MCP Apps using access keys. Learn how to secure the apps using Microsoft Entra and the built-in MCP auth feature.

Building HIPAA-Compliant Medical Transcription with Local AI
Introduction

Healthcare organizations generate vast amounts of spoken content: patient consultations, research interviews, clinical notes, and medical conferences. Transcribing these recordings traditionally requires either manual typing (time-consuming and expensive) or cloud transcription services (creating immediate HIPAA compliance concerns). Every audio file sent to external APIs exposes Protected Health Information (PHI), requires Business Associate Agreements, creates audit trails on third-party servers, and introduces potential breach vectors.

The solution lies in on-premises voice-to-text systems that process audio entirely locally, never sending PHI beyond organizational boundaries. This article demonstrates building a sample medical transcription application using FLWhisper, ASP.NET Core, C#, and Microsoft Foundry Local with OpenAI Whisper models. You'll learn how to build sample HIPAA-compliant audio processing, integrate Whisper models for medical terminology accuracy, design privacy-first API patterns, and build responsive web UIs for healthcare workflows.

Whether you're developing electronic health record (EHR) integrations, building clinical research platforms, or implementing dictation systems for medical practices, this sample could be a great starting point for privacy-first speech recognition.

Why Local Transcription Is Critical for Healthcare

Healthcare data handling is fundamentally different from general business data due to HIPAA regulations, state privacy laws, and professional ethics obligations. Understanding these requirements explains why cloud transcription services, despite their convenience, create unacceptable risks for medical applications.

HIPAA compliance mandates strict controls over PHI. Every system that touches patient data must implement administrative, physical, and technical safeguards.
Cloud transcription APIs require Business Associate Agreements (BAAs), but even with the paperwork in place, you're entrusting PHI to external systems. Every API call creates logs on vendor servers, potentially in multiple jurisdictions. Data breaches at transcription vendors expose patient information, creating liability for healthcare organizations. On-premises processing eliminates these third-party risks entirely: PHI never leaves your controlled environment.

US state laws increasingly add requirements beyond HIPAA. California's CCPA, New York's SHIELD Act, and similar legislation create additional compliance obligations. International regulations like GDPR restrict transferring health data outside approved jurisdictions. Local processing simplifies compliance by keeping data within organizational boundaries.

Research applications face even stricter requirements. Institutional Review Boards (IRBs) often require explicit consent for data sharing with external parties. Cloud transcription may violate study protocols that promise "no third-party data sharing." Clinical trials in pharmaceutical development handle proprietary information alongside PHI, which doubles the exposure risk. Local transcription maintains research integrity while enabling audio analysis.

Cost considerations favor local deployment at scale. Medical organizations generate substantial audio: thousands of patient encounters monthly. Cloud APIs charge per minute of audio, creating significant recurring costs. Local models have fixed infrastructure costs that scale economically. A modest GPU server can process hundreds of hours monthly at predictable expense.

Latency matters for clinical workflows. Doctors and nurses need transcriptions available immediately after patient encounters to review and edit while details are fresh. Cloud APIs introduce network delays, which are especially problematic in rural health facilities with limited connectivity.
Local inference provides <1 second turnaround for typical consultation lengths.

Application Architecture: ASP.NET Core with Foundry Local

The sample FLWhisper application implements a clean separation between audio handling, AI inference, and state management using modern .NET patterns.

The ASP.NET Core 10 minimal API provides HTTP endpoints for health checks, audio transcription, and sample file streaming. Minimal APIs reduce boilerplate while maintaining full middleware support for error handling, authentication, and CORS. The API design follows OpenAI's transcription endpoint specification, enabling drop-in replacement for existing integrations.

The service layer encapsulates business logic: FoundryModelService manages model loading and lifetime, TranscriptionService handles audio processing and AI inference, and SampleAudioService provides demonstration files for testing. This separation enables easy testing, dependency injection, and service swapping.

Foundry Local integration uses the Microsoft.AI.Foundry.Local.WinML SDK. Unlike cloud APIs requiring authentication and network calls, this SDK communicates directly with the local Foundry service via in-process calls. Models load once at startup and remain resident in memory for sub-second inference on subsequent requests.

The static file frontend delivers vanilla HTML/CSS/JavaScript with no framework overhead. This simplicity aids healthcare IT security audits and enables deployment on locked-down hospital networks. The UI provides file upload, sample selection, audio preview, transcription requests, and result display with copy-to-clipboard functionality.
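Because the model is loaded once at startup, a client may want to confirm the service is ready before sending any audio. The following is a minimal Python sketch of that idea, not part of FLWhisper itself. It assumes the service listens on http://localhost:5192 and exposes the /health endpoint used in the article's own examples; the function names http_probe and wait_until_ready are hypothetical.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

# Assumption: local service address, taken from the article's own API examples.
BASE_URL = "http://localhost:5192"


def http_probe(url: str = BASE_URL + "/health", timeout: float = 2.0) -> bool:
    """Return True when the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Service not reachable yet, or it reported an unhealthy status.
        return False


def wait_until_ready(probe: Callable[[], bool] = http_probe,
                     attempts: int = 10,
                     delay_seconds: float = 1.0) -> bool:
    """Poll the probe until it succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False
```

Passing the probe in as a parameter keeps the polling logic testable without a running server; in practice a caller would simply use `wait_until_ready()` and refuse to upload audio if it returns False.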
Here's the architectural flow for transcription requests:

Web UI (Upload Audio File)
  ↓
POST /v1/audio/transcriptions (Multipart Form Data)
  ↓
ASP.NET Core API Route
  ↓
TranscriptionService.TranscribeAudio(audioStream)
  ↓
Foundry Local Model (Whisper Medium locally)
  ↓
Text Result + Metadata (language, duration)
  ↓
Return JSON/Text Response
  ↓
Display in UI

This architecture embodies several healthcare system design principles:

- Data never leaves the device: All processing occurs on-premises, no external API calls
- No data persistence by default: Audio and transcripts are session-only, never saved unless explicitly configured
- Comprehensive health checks: System readiness verification before accepting PHI
- Audit logging support: Structured logging for compliance documentation
- Graceful degradation: Clear error messages when models are unavailable rather than silent failures

Setting Up Foundry Local with Whisper Models

Foundry Local supports multiple Whisper model sizes, each with different accuracy/speed tradeoffs. For medical transcription, accuracy is paramount: misheard drug names or dosages create patient safety risks.

# Install Foundry Local (Windows)
winget install Microsoft.FoundryLocal

# Verify installation
foundry --version

# Download Whisper Medium model (optimal for medical accuracy)
foundry model add openai-whisper-medium-generic-cpu:1

# Check model availability
foundry model list

Whisper Medium (769M parameters) provides the best balance for medical use. Smaller models (Tiny, Base) miss medical terminology frequently. Larger models (Large) offer marginal accuracy gains at 3x inference time. Medium handles medical vocabulary well (drug names, anatomical terms, procedure names) while processing typical consultation audio (5-10 minutes) in under 30 seconds.
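To make those speed figures concrete, it can help to express them as a real-time factor (processing time divided by audio length). The short Python calculation below uses only the numbers quoted above; applying the "3x" figure for the Large model to the same 30-second bound is an assumption made for illustration, not a measurement.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio length; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Figures quoted above: a 5-10 minute consultation in under 30 seconds on Medium.
medium_short = real_time_factor(30, 5 * 60)    # 30 s for 300 s of audio
medium_long = real_time_factor(30, 10 * 60)    # 30 s for 600 s of audio

# Assumption: "3x inference time" for Large applied to the same 30-second bound.
large_seconds = 3 * 30
large_short = real_time_factor(large_seconds, 5 * 60)

print(f"Medium: {medium_long:.2f} to {medium_short:.2f} of real time")
print(f"Large (assumed 3x): about {large_seconds} s for a 5-minute recording")
```

Read this way, the quoted Medium figures correspond to a real-time factor of roughly 0.05 to 0.1, which is the kind of number worth re-measuring on your own hardware before committing to a model size.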
The application detects and loads the model automatically:

// Services/FoundryModelService.cs
using Microsoft.AI.Foundry.Local.WinML;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;

public class FoundryModelService
{
    private readonly ILogger<FoundryModelService> _logger;
    private readonly FoundryOptions _options;
    private ILocalAIModel? _loadedModel;

    public FoundryModelService(
        ILogger<FoundryModelService> logger,
        IOptions<FoundryOptions> options)
    {
        _logger = logger;
        _options = options.Value;
    }

    // Returns true when the model was loaded successfully
    public async Task<bool> InitializeModelAsync()
    {
        try
        {
            _logger.LogInformation(
                "Loading Foundry model: {ModelAlias}",
                _options.ModelAlias
            );

            // Load model from Foundry Local
            _loadedModel = await FoundryClient.LoadModelAsync(
                modelAlias: _options.ModelAlias,
                cancellationToken: CancellationToken.None
            );

            if (_loadedModel == null)
            {
                _logger.LogWarning("Model loaded but returned null instance");
                return false;
            }

            _logger.LogInformation(
                "Successfully loaded model: {ModelAlias}",
                _options.ModelAlias
            );
            return true;
        }
        catch (Exception ex)
        {
            _logger.LogError(
                ex,
                "Failed to load Foundry model: {ModelAlias}",
                _options.ModelAlias
            );
            return false;
        }
    }

    public ILocalAIModel? GetLoadedModel() => _loadedModel;

    public async Task UnloadModelAsync()
    {
        if (_loadedModel != null)
        {
            await FoundryClient.UnloadModelAsync(_loadedModel);
            _loadedModel = null;
            _logger.LogInformation("Model unloaded");
        }
    }
}

Configuration lives in appsettings.json, enabling easy customization without code changes:

{
  "Foundry": {
    "ModelAlias": "whisper-medium",
    "LogLevel": "Information"
  },
  "Transcription": {
    "MaxAudioDurationSeconds": 300,
    "SupportedFormats": ["wav", "mp3", "m4a", "flac"],
    "DefaultLanguage": "en"
  }
}

Implementing Privacy-First Transcription Service

The transcription service handles audio processing while maintaining strict privacy controls.
No audio or transcript persists beyond the HTTP request lifecycle unless explicitly configured: // Services/TranscriptionService.cs public class TranscriptionService { private readonly FoundryModelService _modelService; private readonly ILogger _logger; public async Task TranscribeAudioAsync( Stream audioStream, string originalFileName, TranscriptionOptions? options = null) { options ??= new TranscriptionOptions(); var startTime = DateTime.UtcNow; try { // Validate audio format ValidateAudioFormat(originalFileName); // Get loaded model var model = _modelService.GetLoadedModel(); if (model == null) { throw new InvalidOperationException("Whisper model not loaded"); } // Create temporary file (automatically deleted after transcription) using var tempFile = new TempAudioFile(audioStream); // Execute transcription _logger.LogInformation( "Starting transcription for file: {FileName}", originalFileName ); var transcription = await model.TranscribeAsync( audioFilePath: tempFile.Path, language: options.Language, cancellationToken: CancellationToken.None ); var duration = (DateTime.UtcNow - startTime).TotalSeconds; _logger.LogInformation( "Transcription completed in {Duration:F2}s", duration ); return new TranscriptionResult { Text = transcription.Text, Language = transcription.Language ?? options.Language, Duration = transcription.AudioDuration, ProcessingTimeSeconds = duration, FileName = originalFileName, Timestamp = DateTime.UtcNow }; } catch (Exception ex) { _logger.LogError( ex, "Transcription failed for file: {FileName}", originalFileName ); throw; } } private void ValidateAudioFormat(string fileName) { var extension = Path.GetExtension(fileName).TrimStart('.'); var supportedFormats = new[] { "wav", "mp3", "m4a", "flac", "ogg" }; if (!supportedFormats.Contains(extension.ToLowerInvariant())) { throw new ArgumentException( $"Unsupported audio format: {extension}. 
" + $"Supported: {string.Join(", ", supportedFormats)}" ); } } } // Temporary file wrapper that auto-deletes internal class TempAudioFile : IDisposable { public string Path { get; } public TempAudioFile(Stream sourceStream) { Path = System.IO.Path.GetTempFileName(); using var fileStream = File.OpenWrite(Path); sourceStream.CopyTo(fileStream); } public void Dispose() { try { if (File.Exists(Path)) { File.Delete(Path); } } catch { // Ignore deletion errors in temp folder } } } This service demonstrates several privacy-first patterns: Temporary file lifecycle management: Audio written to temp storage, automatically deleted after transcription No implicit persistence: Results returned to caller, not saved by service Format validation: Accept only supported audio formats to prevent processing errors Comprehensive logging: Audit trail for compliance without logging PHI content Error isolation: Exceptions contain diagnostic info but no patient data Building the OpenAI-Compatible REST API The API endpoint mirrors OpenAI's transcription API specification, enabling existing integrations to work without modifications: // Program.cs var builder = WebApplication.CreateBuilder(args); // Configure services builder.Services.Configure( builder.Configuration.GetSection("Foundry") ); builder.Services.AddSingleton(); builder.Services.AddScoped(); builder.Services.AddHealthChecks() .AddCheck("foundry-health"); var app = builder.Build(); // Load model at startup var modelService = app.Services.GetRequiredService(); await modelService.InitializeModelAsync(); app.UseHealthChecks("/health"); app.MapHealthChecks("/api/health/status"); // OpenAI-compatible transcription endpoint app.MapPost("/v1/audio/transcriptions", async ( HttpRequest request, TranscriptionService transcriptionService, ILogger logger) => { if (!request.HasFormContentType) { return Results.BadRequest(new { error = "Content-Type must be multipart/form-data" }); } var form = await request.ReadFormAsync(); // Extract audio 
file var audioFile = form.Files.GetFile("file"); if (audioFile == null || audioFile.Length == 0) { return Results.BadRequest(new { error = "Audio file required in 'file' field" }); } // Parse options var format = form["format"].ToString() ?? "text"; var language = form["language"].ToString() ?? "en"; try { // Process transcription using var stream = audioFile.OpenReadStream(); var result = await transcriptionService.TranscribeAudioAsync( audioStream: stream, originalFileName: audioFile.FileName, options: new TranscriptionOptions { Language = language } ); // Return in requested format if (format == "json") { return Results.Json(new { text = result.Text, language = result.Language, duration = result.Duration }); } else { // Default: plain text return Results.Text(result.Text); } } catch (Exception ex) { logger.LogError(ex, "Transcription request failed"); return Results.StatusCode(500); } }) .DisableAntiforgery() // File uploads need CSRF exemption .WithName("TranscribeAudio") .WithOpenApi(); app.Run(); Example API usage: # PowerShell $audioFile = Get-Item "consultation-recording.wav" $response = Invoke-RestMethod ` -Uri "http://localhost:5192/v1/audio/transcriptions" ` -Method Post ` -Form @{ file = $audioFile; format = "json" } Write-Output $response.text # cURL curl -X POST http://localhost:5192/v1/audio/transcriptions \ -F "file=@consultation-recording.wav" \ -F "format=json" Building the Interactive Web Frontend The web UI provides a user-friendly interface for non-technical medical staff to transcribe recordings: SarahCare Medical Transcription The JavaScript handles file uploads and API interactions: // wwwroot/app.js let selectedFile = null; async function checkHealth() { try { const response = await fetch('/health'); const statusEl = document.getElementById('status'); if (response.ok) { statusEl.className = 'status-badge online'; statusEl.textContent = '✓ System Ready'; } else { statusEl.className = 'status-badge offline'; statusEl.textContent = '✗ System 
Unavailable'; } } catch (error) { console.error('Health check failed:', error); } } function handleFileSelect(event) { const file = event.target.files[0]; if (!file) return; selectedFile = file; // Show file info const fileInfo = document.getElementById('fileInfo'); fileInfo.textContent = `Selected: ${file.name} (${formatFileSize(file.size)})`; fileInfo.classList.remove('hidden'); // Enable audio preview const preview = document.getElementById('audioPreview'); preview.src = URL.createObjectURL(file); preview.classList.remove('hidden'); // Enable transcribe button document.getElementById('transcribeBtn').disabled = false; } async function transcribeAudio() { if (!selectedFile) return; const loadingEl = document.getElementById('loadingIndicator'); const resultEl = document.getElementById('resultSection'); const transcribeBtn = document.getElementById('transcribeBtn'); // Show loading state loadingEl.classList.remove('hidden'); resultEl.classList.add('hidden'); transcribeBtn.disabled = true; try { const formData = new FormData(); formData.append('file', selectedFile); formData.append('format', 'json'); const startTime = Date.now(); const response = await fetch('/v1/audio/transcriptions', { method: 'POST', body: formData }); if (!response.ok) { throw new Error(`HTTP ${response.status}: ${response.statusText}`); } const result = await response.json(); const processingTime = ((Date.now() - startTime) / 1000).toFixed(1); // Display results document.getElementById('transcriptionText').value = result.text; document.getElementById('resultDuration').textContent = `Duration: ${result.duration.toFixed(1)}s`; document.getElementById('resultLanguage').textContent = `Language: ${result.language}`; resultEl.classList.remove('hidden'); console.log(`Transcription completed in ${processingTime}s`); } catch (error) { console.error('Transcription failed:', error); alert(`Transcription failed: ${error.message}`); } finally { loadingEl.classList.add('hidden'); transcribeBtn.disabled = 
false; } } function copyToClipboard() { const text = document.getElementById('transcriptionText').value; navigator.clipboard.writeText(text) .then(() => alert('Copied to clipboard')) .catch(err => console.error('Copy failed:', err)); } // Initialize window.addEventListener('load', () => { checkHealth(); loadSamplesList(); }); Key Takeaways and Production Considerations Building HIPAA-compliant voice-to-text systems requires architectural decisions that prioritize data privacy over convenience. The FLWhisper application demonstrates that you can achieve accurate medical transcription, fast processing times, and intuitive user experiences entirely on-premises. Critical lessons for healthcare AI: Privacy by architecture: Design systems where PHI never exists outside controlled environments, not as a configuration option No persistence by default: Audio and transcripts should be ephemeral unless explicitly saved with proper access controls Model selection matters: Whisper Medium provides medical terminology accuracy that smaller models miss Health checks enable reliability: Systems should verify model availability before accepting PHI Audit logging without content logging: Track operations for compliance without storing sensitive data in logs For production deployment in clinical settings, integrate with EHR systems via HL7/FHIR interfaces. Implement role-based access control with Active Directory integration. Add digital signatures for transcript authentication. Configure automatic PHI redaction using clinical NLP models. Deploy on HIPAA-compliant infrastructure with proper physical security. Implement comprehensive audit logging meeting compliance requirements. The complete implementation with ASP.NET Core API, Foundry Local integration, sample audio files, and comprehensive tests is available at github.com/leestott/FLWhisper. Clone the repository and follow the setup guide to experience privacy-first medical transcription. 
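The "audit logging without content logging" lesson can be sketched in a few lines. This is a minimal illustration, not code from the FLWhisper repository: the `BuildAuditEntry` helper and its field names are hypothetical, and it assumes the uploaded file name may itself contain patient identifiers, so only a short hash of it is written to the log.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper (not from the FLWhisper repository): builds an audit line
// from operation metadata only. The transcript text is never passed in, so it
// cannot leak into the log.
static string BuildAuditEntry(string fileName, long sizeBytes, double durationSeconds, bool succeeded)
{
    // File names can contain patient names, so log a short hash instead.
    byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(fileName));
    string fileToken = Convert.ToHexString(digest)[..12];

    return $"{DateTime.UtcNow:O} transcription file={fileToken} bytes={sizeBytes} " +
           $"duration={durationSeconds:F1}s outcome={(succeeded ? "success" : "failure")}";
}

string entry = BuildAuditEntry("jane-doe-consultation.wav", 1_048_576, 42.5, succeeded: true);
Console.WriteLine(entry);
```

The same file always produces the same token, so repeated operations on one recording can still be correlated without the log ever holding the original name or the transcript.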
Resources and Further Reading

- FLWhisper Repository - Complete C# implementation with .NET 10
- Quick Start Guide - Installation and usage instructions
- Microsoft Foundry Local Documentation - SDK reference and model catalog
- OpenAI Whisper Documentation - Model architecture and capabilities
- HIPAA Compliance Guidelines - HHS official guidance
- Testing Guide - Comprehensive test suite documentation

From "Maybe Next Quarter" to "Running Before Lunch" on Container Apps - Modernizing Legacy .NET App
In early 2025, we wanted to modernize Jon Galloway's MVC Music Store, a classic ASP.NET MVC 5 app running on .NET Framework 4.8 with Entity Framework 6. The goal was straightforward: address vulnerabilities, enable managed identity, and deploy to Azure Container Apps and Azure SQL. No more plaintext connection strings. No more passwords in config files.

We hit a wall immediately. Entity Framework on .NET Framework did not support Azure.Identity or DefaultAzureCredential. We could not just add a NuGet package and call it done. We would need EF Core, which means modern .NET, and that in turn meant rewriting the data layer, the identity system, the startup pipeline, and the views. The engineering team estimated one week of dedicated developer work. As a product manager without extensive .NET modernization experience, I wasn't able to complete it quickly on my own, so the project was placed in the backlog.

This was before GitHub Copilot "Agent" mode. GitHub Copilot app modernization (a specialized agent with skills for modernization) existed but only offered assessment: it could tell you what needed to change, but it couldn't make the end-to-end changes for you.

Fast-forward one year. The full modernization agent is available. I sat down with the same app and the same goal. A few hours later, it was running on .NET 10 on Azure Container Apps with managed identity, Key Vault integration, and zero plaintext credentials. Thank you, GitHub Copilot app modernization! And while we were at it, GitHub Copilot helped modernize the experience as well, built more tests, and generated more synthetic data for testing.

Why Azure Container Apps?

Azure Container Apps is an ideal deployment target for this modernized MVC Music Store application because it provides a serverless, fully managed container hosting environment. It abstracts away infrastructure management while natively supporting the key security and operational features this project required.
It pairs naturally with infrastructure-as-code deployments, and its per-second billing on a consumption plan keeps costs minimal for a lightweight web app like this, eliminating the overhead of managing Kubernetes clusters while still giving you the container portability that modern .NET apps benefit from. That is why I asked Copilot to modernize to Azure Container Apps. Here's how it went.

Phase 1: Assessment

GitHub Copilot app modernization started by analyzing the codebase and producing a detailed assessment:

- Framework gap analysis: .NET Framework 4.0 → .NET 10, identifying every breaking change
- Dependency inventory: Entity Framework 6 (not EF Core), MVC 5 references, System.Web dependencies
- Security findings: plaintext SQL connection strings in Web.config, no managed identity support
- API surface changes: Global.asax → Program.cs minimal hosting, System.Web.Mvc → Microsoft.AspNetCore.Mvc

The assessment is not a generic checklist. It reads your code (your controllers, your DbContext, your views) and maps a concrete modernization path. For this app, the key finding was clear: EF 6 on .NET Framework cannot support DefaultAzureCredential. The entire data layer needs to move to EF Core on modern .NET to unlock passwordless authentication.

Phase 2: Code & Dependency Modernization

This is where last year's experience ended and this year's began.
The agent performed the actual modernization:

Project structure:
- .csproj converted from legacy XML format to SDK-style targeting net10.0
- Global.asax replaced with Program.cs using minimal hosting
- packages.config → NuGet PackageReference entries

Data layer (the hard part):
- Entity Framework 6 → EF Core with Microsoft.EntityFrameworkCore.SqlServer
- DbContext rewritten with OnModelCreating fluent configuration
- System.Data.Entity → Microsoft.EntityFrameworkCore namespace throughout
- EF Core migrations generated from scratch
- Database seeding moved to a proper DbSeeder pattern with MigrateAsync()

Identity:
- ASP.NET Membership → ASP.NET Core Identity with ApplicationUser, ApplicationDbContext
- Cookie authentication configured through ConfigureApplicationCookie

Security (the whole trigger for this modernization):
- Azure.Identity + DefaultAzureCredential integrated in Program.cs
- Azure Key Vault configuration provider added via Azure.Extensions.AspNetCore.Configuration.Secrets
- Connection strings use Authentication=Active Directory Default, with no passwords anywhere
- Application Insights wired through OpenTelemetry

Views:
- Razor views updated from MVC 5 helpers to ASP.NET Core Tag Helpers and conventions
- _Layout.cshtml and all partials migrated

The code changes touched every layer of the application. This is not a find-and-replace. It's a structural rewrite that maintains functional equivalence.

Phase 3: Local Testing

After modernization, the app builds, runs locally, and connects to a local SQL Server (or SQL in a container). EF Core migrations apply cleanly, the seed data loads, and you can browse albums, add to cart, and check out. The identity system works. The Key Vault integration gracefully skips when KeyVaultName isn't configured, meaning local dev and Azure use the same Program.cs with no environment-specific code paths.
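The "skip Key Vault when KeyVaultName is absent" behaviour and the passwordless connection string can be sketched with base class library types only. This is an illustrative sketch, not the code the agent generated: `ResolveKeyVaultUri` is a hypothetical helper, `contoso-kv` is a made-up vault name, and the server and database names are placeholders. In a real Program.cs the resulting URI would presumably be handed to `AddAzureKeyVault` from Azure.Extensions.AspNetCore.Configuration.Secrets together with a `DefaultAzureCredential`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: returns the vault URI when KeyVaultName is configured,
// or null so the caller can skip Key Vault during local development.
static Uri? ResolveKeyVaultUri(IReadOnlyDictionary<string, string?> settings)
{
    return settings.TryGetValue("KeyVaultName", out string? name) && !string.IsNullOrWhiteSpace(name)
        ? new Uri($"https://{name}.vault.azure.net/")
        : null;
}

// Shape of a passwordless connection string: placeholders only, and no
// User ID or Password keys. Authentication is resolved at connect time.
const string connectionString =
    "Server=tcp:<servername>.database.windows.net,1433;Database=<database_name>;" +
    "Authentication=Active Directory Default;Encrypt=True;";

// Local development: no KeyVaultName setting at all.
var localSettings = new Dictionary<string, string?>();
// Azure: the setting is injected by the hosting environment (name is made up).
var azureSettings = new Dictionary<string, string?> { ["KeyVaultName"] = "contoso-kv" };

Uri? localVault = ResolveKeyVaultUri(localSettings);
Uri? azureVault = ResolveKeyVaultUri(azureSettings);

Console.WriteLine(localVault is null ? "Key Vault skipped (local dev)" : $"Key Vault: {localVault}");
Console.WriteLine(azureVault is null ? "Key Vault skipped (local dev)" : $"Key Vault: {azureVault}");
Console.WriteLine(connectionString.Contains("Password") ? "has password" : "no password in connection string");
```

The only difference between the two environments is the presence of one configuration value, which is what lets the same startup code run in both places.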
Phase 4: AZD UP and Deployment to Azure

The agent also generates the deployment infrastructure:

- azure.yaml: AZD service definition pointing to the Dockerfile, targeting Azure Container Apps
- Dockerfile: Multi-stage build using mcr.microsoft.com/dotnet/sdk:10.0 and aspnet:10.0
- infra/main.bicep: Full IaC including:
  - Azure Container Apps with system + user-assigned managed identity
  - Azure SQL Server with Azure AD-only authentication (no SQL auth)
  - Azure Key Vault with RBAC, Secrets Officer role for the managed identity
  - Container Registry with ACR Pull role assignment
  - Application Insights + Log Analytics
  - All connection strings injected as Container App secrets, using Active Directory Default, not passwords

One command, azd up, provisions everything, builds the container, pushes to ACR, and deploys to Container Apps. The app starts, runs MigrateAsync() on first boot, seeds the database, and serves traffic. Managed identity handles all auth to SQL and Key Vault. No credentials stored anywhere.

What Changed in a Year

|                              | Early 2025  | Now                          |
|------------------------------|-------------|------------------------------|
| Assessment                   | Available   | Available                    |
| Automated code modernization | Semi-manual | ✅ Full modernization agent  |
| Infrastructure generation    | Semi-manual | ✅ Bicep + AZD generated     |
| Time to complete             | Weeks       | ✅ Hours                     |

The technology didn't just improve incrementally. The gap between "assessment" and "done" collapsed. A year ago, knowing what to do and being able to do it were very different things. Now they're the same step.

Who This Is For

If you have a .NET Framework app sitting on a backlog because "the modernization is too expensive", revisit that assumption. The process changed. GitHub Copilot app modernization helps you rewrite your data layer, generates your infrastructure, and gets you to azd up. It can help you generate tests to increase your code coverage.
If you have feature requests, or if you want to further optimize the code for scale, bring your requirements, logs, or profile traces: you can take care of all of that during the modernization process.

MVC Music Store went from .NET Framework 4.0 with Entity Framework 6 and plaintext SQL credentials to .NET 10 on Azure Container Apps with managed identity, Key Vault, and zero secrets in code. In an afternoon. That backlog item might be a lunch break now 😊. Really. Find your legacy apps and try it yourself.

Next steps

- Modernize your .NET or Java apps with GitHub Copilot app modernization: https://aka.ms/ghcp-appmod
- Open your legacy application in Visual Studio or Visual Studio Code to start the process
- Deploy to Azure Container Apps: https://aka.ms/aca/start

Beyond the Desktop: The Future of Development with Microsoft Dev Box and GitHub Codespaces
The modern developer platform has already moved past the desktop. We're no longer defined by what's installed on our laptops; instead we look at what tooling we can use to move from idea to production. An organisation's developer platform strategy is no longer a nice-to-have. It sets the ceiling for what's possible: an organisation can't iterate its way to developer nirvana if the foundation itself is brittle. A great developer platform shrinks TTFC (time to first commit), accelerates release velocity, and, maybe most importantly, helps alleviate the everyday frictions that lead to developer burnout.

Very few platforms deliver everything an organization needs from a developer platform in one product. Modern development spans multiple dimensions: local tooling, cloud infrastructure, compliance, security, cross-platform builds, collaboration, and rapid onboarding. Organizations are then left to either compromise on one or more of these areas or force developers into rigid environments that slow productivity and innovation.

This is where Microsoft Dev Box and GitHub Codespaces come into play. On their own, each addresses critical parts of the modern developer platform.

Microsoft Dev Box provides a full, managed cloud workstation. Dev Box gives developers a consistent, high-performance environment while letting central IT apply strict governance and control. Internally at Microsoft, we estimate that usage of Dev Box by our development teams delivers savings of 156 hours per year per developer purely on local environment setup and upkeep. We have also seen significant gains in other key SPACE metrics, reducing context-switching friction and improving build/test cycles. Although the benefits of Dev Box are clear in the results demonstrated by our customers, it is not without its challenges. The biggest challenge Dev Box customers often face is its lack of native Linux support.
At the time of writing, and for the foreseeable future, Dev Box does not support native Linux developer workstations. While WSL2 provides partial parity, I know from my own engineering projects that it still does not deliver the full experience. This is where GitHub Codespaces comes into the story.

GitHub Codespaces delivers instant, Linux-native environments spun up directly from your repository. It's lightweight, reproducible, and ephemeral, which makes it ideal for rapid iteration, PR testing, and cross-platform development where you need Linux parity or containerized workflows. Unlike Dev Box, Codespaces can run fully in Linux, giving developers access to native tools, scripts, and runtimes without workarounds. It also removes much of the friction around onboarding: a new developer can open a repository and be coding in minutes, with the exact environment defined by the project's devcontainer.json. That said, Codespaces isn't a complete replacement for a full workstation. While it's perfect for isolated project work or ephemeral testing, it doesn't provide the persistent, policy-controlled environment that enterprise teams often require for heavier workloads or complex toolchains.

Used together, they fill the gaps that neither can cover alone: Dev Box gives the enterprise-grade foundation, while Codespaces provides the agile, cross-platform sandbox. For organizations, this pairing sets a higher ceiling for developer productivity, delivering a truly hybrid, agile, and well-governed developer platform.

Better Together: Dev Box and GitHub Codespaces in action

Together, Microsoft Dev Box and GitHub Codespaces deliver a hybrid developer platform that combines consistency, speed, and flexibility. Teams can spin up full, policy-compliant Dev Box workstations preloaded with enterprise tooling, IDEs, and local testing infrastructure, while Codespaces provides ephemeral, Linux-native environments tailored to each project.
One of my favourite use cases is having local testing setups, like a Docker Swarm cluster, ready to go in either Dev Box or Codespaces. New developers can jump in and start running services or testing microservices immediately, without spending hours on environment setup. Anecdotally, my time to first commit and time to delivering "impact" have been significantly shorter on projects where one or both technologies provide local development services out of the box. Switching between Dev Boxes and Codespaces is seamless: every environment keeps its own libraries, extensions, and settings intact, so developers can jump between projects without reconfiguring or breaking dependencies. The result is a turnkey, ready-to-code experience that maximizes productivity, reduces friction, and lets teams focus entirely on building, testing, and shipping software.

To showcase this value, I thought I would walk through an example scenario that simulates a typical modern developer workflow. Let's look at a day in the life of a developer on this hybrid platform building an IoT project using Python and React.

- Spin up a ready-to-go workstation (Dev Box) for Windows development and heavy builds.
- Launch a Linux-native Codespace for cross-platform services, ephemeral testing, and PR work.
- Run "local" testing like a Docker Swarm cluster, database, and message queue ready to go out of the box.
- Switch seamlessly between environments without losing project-specific configurations, libraries, or extensions.

9:00 AM – Morning Kickoff on Dev Box

I start my day on my Microsoft Dev Box, which gives me a fully configured Windows environment with VS Code, design tools, and Azure integrations. I select my team's project, and the environment is pre-configured for me through the Dev Box catalogue. Fortunately for me, it's already provisioned. I could always self-service another one using the "New Dev Box" button if I wanted to.
I'll connect through the browser, but I could use the desktop app too if I wanted to.

My tasks are:

- Prototype a new dashboard widget for monitoring IoT device temperature.
- Use GUI-based tools to tweak the UI and preview changes live.
- Review my Visio architecture.
- Join my morning stand-up.
- Write documentation notes and plan API interactions for the backend.

In a flash, I have access to my modern work tooling like Teams, this project's files are already preloaded, and all my peripherals work without additional setup. The only downside was that I seemed to be the only person on my stand-up this morning?

Why Dev Box first:

- GUI-heavy tasks are fast and responsive. Dev Box's environment allows me to use a full desktop.
- Great for early-stage design, planning, and visual work.
- Enterprise apps are ready for me to use out of the box (P.S. it also supports my multi-monitor setup).

I use my Dev Box to make a very complicated change to my IoT dashboard: changing the title from "IoT Dashboard" to "Owain's IoT Dashboard". I preview this change live in a browser. (Time for a coffee after this hard work.) The rest of the dashboard isn't loading as my backend isn't running... yet.

10:30 AM – Switching to Linux Codespaces

Once the UI is ready, I push the code to GitHub and spin up a Linux-native GitHub Codespace for backend development.

Tasks:

- Implement FastAPI endpoints to support the new IoT feature.
- Run the service on my Codespace and debug any errors.

Why Codespaces now:

- Linux-native tools ensure compatibility with the production server.
- Docker and containerized testing run natively, avoiding WSL translation overhead.
- The environment is fully reproducible across any device I log in from.

12:30 PM – Midday Testing & Sync

I toggle between Dev Box and Codespaces to test and validate the integration. I do this in my Dev Box Edge browser, viewing my Codespace, and I use the same browser to view my frontend preview. (I use my Codespace in a browser throughout this demo to highlight the difference in environments. In reality I would use the VS Code "Remote Explorer" extension and its GitHub Codespaces integration to work in my Codespace from within my own desktop VS Code, but that is personal preference.)

I update the environment variable for my frontend, which is running locally in my Dev Box, and point it at the port where my API is running in my Codespace. In this case it was a WebSocket connection and HTTPS calls to port 8000. I can make this public by changing the port visibility in my Codespace.

https://fluffy-invention-5x5wp656g4xcp6x9-8000.app.github.dev/api/devices
wss://fluffy-invention-5x5wp656g4xcp6x9-8000.app.github.dev/ws

This allows me to:

- Preview the frontend widget on Dev Box, connecting to the backend running in Codespaces.
- Make small frontend adjustments in Dev Box while monitoring backend logs in Codespaces.
- Commit changes to GitHub, keeping both environments in sync and leveraging my CI/CD for deployment to the next environment.

We can see the Dev Box running the local frontend and the Codespace running the API connected to each other, making requests and displaying the data in the frontend!

Hybrid advantage:

- Dev Box handles GUI previews comfortably and allows me to live-test frontend changes.
- Codespaces handles production-aligned backend testing and Linux-native tools.
- Dev Box allows me to view all of my files on one screen, with potentially multiple Codespaces running in the browser or VS Code Desktop.

Thanks to all of those platform efficiencies, I have completed my day's goals within an hour or two, and now I can spend the rest of my day learning how to enable my developers to inner source using GitHub Copilot and MCP (shameless plug).

The bottom line

There are some additional considerations when architecting a developer platform for an enterprise, such as private networking and security, that are not covered in this post, but these are implementation details in delivering the developer experience described here.
Architecting such a platform is a valuable investment to deliver the developer platform foundations we discussed at the top of the article. In the demo I quickly built, I was working in a mono repository; in real engineering teams it is likely (I hope) that an application is built from many different repositories. The great thing about Dev Box and Codespaces is that this wouldn't slow down the rapid development I can achieve when using both. My Dev Box would be specific to the project or development team, preloaded with all the tools I need and potentially some repos too! When I need to, I can quickly switch over to Codespaces, work in a clean, isolated environment, and push my changes. In both cases, any changes I want to deliver are pushed into GitHub (or ADO) and merged, and my CI/CD ensures that my next step, potentially a staging environment or, who knows, perhaps *whispering* straight into production, is taken care of. Once I'm finished, I delete my Codespace, and potentially my Dev Box if I am done with the project, knowing I can self-service either one of these anytime and be up and running again!

Now, is there overlap in terms of what can be developed in a Codespace vs. what can be developed in Azure Dev Box? Of course. But as organisations prioritise developer experience to ensure release velocity while maintaining organisational standards and governance, providing developers with a Windows-native and a Linux-native service, both of which are primarily charged on the consumption of the compute*, is a no-brainer. There are also gaps that neither fills at the moment: for example, Microsoft Dev Box only provides Windows compute, while GitHub Codespaces only supports VS Code as your chosen IDE. It's not a question of which service to choose for my developers. These two services are better together!

*Changes have been announced to Dev Box pricing. A W365 license is already required today, and dev boxes will continue to be managed through Azure.
For more information please see: Microsoft Dev Box capabilities are coming to Windows 365 - Microsoft Dev Box | Microsoft Learn

Using ClientConnectionId to Correlate .NET Connection Attempts in Azure SQL
Getting Better Diagnostics with ClientConnectionId in .NET

A few days ago, I was working on a customer case involving intermittent connectivity failures to Azure SQL Database from a .NET application. On the surface, nothing looked unusual. Retries were happening.

In this post, I want to share a simple yet effective pattern for producing JDBC-style trace logs in .NET, focusing specifically on the ClientConnectionId property exposed by SqlConnection. This gives you a powerful correlation key that aligns with backend diagnostics and significantly speeds up root cause analysis for connection problems.

Why ClientConnectionId Matters

Azure SQL Database assigns a unique identifier to every connection attempt from the client. In .NET, this identifier is available through the ClientConnectionId property of SqlConnection. According to the official documentation:

The ClientConnectionId property gets the connection ID of the most recent connection attempt, regardless of whether the attempt succeeded or failed.

Source: https://learn.microsoft.com/en-us/dotnet/api/system.data.sqlclient.sqlconnection.clientconnectionid?view=netframework-4.8.1

This GUID is the single most useful piece of telemetry for correlating client connection attempts with server logs and support traces.

What .NET Logging Doesn't Give You by Default

Unlike the JDBC driver, the .NET SQL Client does not produce rich internal logs of every connection handshake or retry. There's no built-in switch to emit gateway and redirect details, attempt counts, or port information. What you do have is:

- Timestamps
- Connection attempt boundaries
- ClientConnectionId values
- Outcome (success or failure)

If you capture and format these consistently, you end up with logs that are as actionable as the JDBC trace output and, importantly, easy to correlate with backend diagnostics and Azure support tooling.
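As a small illustration of "capture and format these consistently", the four fields listed above can be packed into one line per attempt. This is a hedged sketch rather than part of the original sample: `FormatAttempt` is a hypothetical helper, and a random GUID stands in for the value that would normally be read from `SqlConnection.ClientConnectionId`.

```csharp
using System;
using System.Globalization;

// Hypothetical formatter: one line per connection attempt, carrying the four
// fields listed above. In a real app the GUID would come from
// SqlConnection.ClientConnectionId; here a random GUID stands in for it.
static string FormatAttempt(DateTime timestamp, int attemptNo, Guid clientConnectionId, bool succeeded)
{
    string ts = timestamp.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture);
    string outcome = succeeded ? "success" : "failure";
    return $"[{ts}] attempt={attemptNo} clientConnectionId={clientConnectionId} outcome={outcome}";
}

Guid standInId = Guid.NewGuid();
string line = FormatAttempt(new DateTime(2025, 12, 31, 3, 38, 10), 1, standInId, succeeded: false);
Console.WriteLine(line);
```

Keeping the GUID under a fixed key makes it easy to search a log for the exact identifier a support engineer asks about.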
Below is a small console application in C# that produces structured logs in the same timestamped, [FINE] format you might see from a JDBC trace, but for .NET applications:

```csharp
using System;
using Microsoft.Data.SqlClient;

class Program
{
    static int Main()
    {
        // SAMPLE connection string (SQL Authentication)
        // Replace this with your own connection string.
        // This is provided only for demonstration purposes.
        string connectionString =
            "Server=tcp:<servername>.database.windows.net,1433;" +
            "Database=<database_name>;" +
            "User ID=<sql_username>;" +
            "Password=<sql_password>;" +
            "Encrypt=True;" +
            "TrustServerCertificate=False;" +
            "Connection Timeout=30;";

        int connectionId = 1;

        // Log connection creation
        Log($"ConnectionID:{connectionId} created by (SqlConnection)");

        using SqlConnection connection = new SqlConnection(connectionString);
        try
        {
            // Log connection attempt
            Log($"ConnectionID:{connectionId} This attempt No: 0");

            // Open the connection
            connection.Open();

            // Log ClientConnectionId after the connection attempt
            Log($"ConnectionID:{connectionId} ClientConnectionId: {connection.ClientConnectionId}");

            // Execute a simple test query
            using (SqlCommand cmd = new SqlCommand("SELECT 1", connection))
            {
                Log($"SqlCommand:1 created by (ConnectionID:{connectionId})");
                Log("SqlCommand:1 Executing (not server cursor) SELECT 1");
                cmd.ExecuteScalar();
                Log("SqlDataReader:1 created by (SqlCommand:1)");
            }
        }
        catch (SqlException ex)
        {
            // ClientConnectionId is available even on failure
            Log($"ConnectionID:{connectionId} ClientConnectionId: {connection.ClientConnectionId} (failure)");
            Log($"SqlException Number: {ex.Number}");
            Log($"Message: {ex.Message}");
            return 1;
        }

        return 0;
    }

    // Simple logger to match JDBC-style output format
    static void Log(string message)
    {
        Console.WriteLine(
            $"[{DateTime.Now:yyyy-MM-dd HH:mm:ss}] [FINE] {message}"
        );
    }
}
```

Run the above application and you'll get output like:

```text
[2025-12-31 03:38:10] [FINE] ConnectionID:1 This attempt server name: aabeaXXX.trXXXX.northeurope1-a.worker.database.windows.net port: 11002 InstanceName: null useParallel: false
[2025-12-31 03:38:10] [FINE] ConnectionID:1 This attempt endtime: 1767152309272
[2025-12-31 03:38:10] [FINE] ConnectionID:1 This attempt No: 1
[2025-12-31 03:38:10] [FINE] ConnectionID:1 Connecting with server: aabeaXXX.trXXXX.northeurope1-a.worker.database.windows.net port: 11002 Timeout Full: 20
[2025-12-31 03:38:10] [FINE] ConnectionID:1 ClientConnectionID: 6387718b-150d-482a-9731-02d06383d38f Server returned major version: 12
[2025-12-31 03:38:10] [FINE] SqlCommand:1 created by (ConnectionID:1 ClientConnectionID: 6387718b-150d-482a-9731-02d06383d38f)
[2025-12-31 03:38:10] [FINE] SqlCommand:1 Executing (not server cursor) select 1
[2025-12-31 03:38:10] [FINE] SqlDataReader:1 created by (SqlCommand:1)
[2025-12-31 03:38:10] [FINE] ConnectionID:2 created by (SqlConnection)
[2025-12-31 03:38:11] [FINE] ConnectionID:2 ClientConnectionID: 5fdd311e-a219-45bc-a4f6-7ee1cc2f96bf Server returned major version: 12
[2025-12-31 03:38:11] [FINE] sp_executesql SQL: SELECT 1 AS ID, calling sp_executesql
[2025-12-31 03:38:12] [FINE] SqlDataReader:3 created by (sp_executesql SQL: SELECT 1 AS ID)
```

Notice how each line is tagged with:

- A consistent local timestamp (yyyy-MM-dd HH:mm:ss)
- A [FINE] log level
- A structured identifier that mirrors what you'd see in JDBC logging

If a connection fails, you'll still get the ClientConnectionId logged, which is exactly what Azure SQL support teams will ask for when troubleshooting connectivity issues.

Migrating Application Credentials to Azure Key Vault with GitHub Copilot App Modernization
Storing secrets directly in applications or configuration files increases operational risk. Migrating to Azure Key Vault centralizes secret management, supports rotation, and removes embedded credentials from application code. GitHub Copilot app modernization accelerates this process by identifying credential usage areas and generating changes for Key Vault integration.

What This Migration Covers

GitHub Copilot app modernization helps with:

- Detecting secrets hard-coded in source files, config files, or environment variables.
- Recommending retrieval patterns using Azure Key Vault SDKs.
- Updating application code to load secrets from Key Vault.
- Preparing configuration updates to remove stored credentials.
- Surfacing dependency, version, and API adjustments required for Key Vault usage.

Project Analysis

Once the project is opened in Visual Studio Code or IntelliJ IDEA, GitHub Copilot app modernization analyzes:

- Hard-coded credentials: passwords, tokens, client secrets, API keys.
- Legacy configuration patterns using .properties, .yaml, or environment variables.
- Azure SDK usage and required upgrades for Key Vault integration.
- Areas requiring secure retrieval or replacement with a managed identity.

Migration Plan Generation

The tool creates a step-by-step migration plan including:

- Introducing Key Vault client libraries.
- Mapping existing credential variables to Key Vault secrets.
- Updating configuration loading logic to retrieve secrets at runtime.
- Integrating Managed Identity authentication if applicable.
- Removing unused credential fields from code and configuration.

Automated Transformations

GitHub Copilot app modernization applies targeted changes:

- Rewrites code retrieving credentials from files or constants.
- Generates Key Vault retrieval patterns using SecretClient.
- Updates build dependencies to current Azure SDK versions.
- Removes unused configuration entries and environment variables.
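The runtime retrieval pattern named above can be sketched as a small cache in front of the vault. This is a minimal sketch under stated assumptions, not output of the tool: `CreateSecretCache` and the secret name `db-password` are hypothetical, and the `fetch` delegate stands in for a real Key Vault call (with the Azure SDK for .NET that would typically be `SecretClient.GetSecret(name).Value.Value`). An in-memory dictionary plays the vault's role here so the block runs without an Azure subscription.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical wrapper: fetches each secret on first use and caches it, so the
// application reads secrets at runtime instead of from config files.
static Func<string, string> CreateSecretCache(Func<string, string> fetch)
{
    var cache = new ConcurrentDictionary<string, Lazy<string>>();
    return name => cache.GetOrAdd(name, n => new Lazy<string>(() => fetch(n))).Value;
}

// Stand-in for the vault (assumption): a real app would call Key Vault here.
var fakeVault = new Dictionary<string, string> { ["db-password"] = "s3cr3t-from-vault" };
int vaultCalls = 0;

Func<string, string> getSecret = CreateSecretCache(name =>
{
    vaultCalls++;                 // count round trips to the (fake) vault
    return fakeVault[name];
});

string first = getSecret("db-password");
string second = getSecret("db-password");   // served from the cache
Console.WriteLine($"vault calls: {vaultCalls}");
```

Caching trades freshness for fewer round trips: a value cached this way stays in memory until the process restarts, so a rotated secret would only be picked up after the cache is rebuilt or given an expiry.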
Build & Fix Iteration

The project is rebuilt and validated:

- Fixes constructor changes related to updated clients.
- Resolves missing dependency versions.
- Corrects updated method signatures for Key Vault API calls.
- Rebuilds until no actionable errors remain.

Security & Behavior Checks

The tool surfaces:

- CVEs introduced by new or updated libraries.
- Behavior changes tied to lazy loading of secrets at runtime.
- Optional fixes or alternative patterns if Key Vault integration affects existing workflows.

Expected Output

After modernization:

- Credentials removed from source and config files.
- Application retrieves secrets from Azure Key Vault.
- Updated Azure SDK versions aligned with Key Vault.
- A summary file detailing code changes, dependency updates, and review items.

Developer Responsibilities

Developers should:

- Provision Key Vault resources and assign required access policies.
- Validate permissions through Managed Identity or service principals.
- Test application startup, error handling, and rotation scenarios.
- Review semantic impacts on components relying on early secret loading.

Refer to the Microsoft Learn guide on upgrading Java projects with GitHub Copilot app modernization for foundational workflow details.

Learn more

- Predefined tasks for GitHub Copilot app modernization
- Apply a predefined task
- Install GitHub Copilot app modernization for VS Code and IntelliJ IDEA

Modernizing Applications by Migrating Code to Use Managed Identity with Copilot App Modernization
Migrating application code to use Managed Identity removes hard-coded secrets, reduces operational risk, and aligns with modern cloud security practices. Applications authenticate directly with Azure services without storing credentials. GitHub Copilot app modernization streamlines this transition by identifying credential usage patterns, updating code, and aligning dependencies for Managed Identity flows.

Supported Migration Steps
GitHub Copilot app modernization helps accelerate:
- Replacing credential-based authentication with Managed Identity authentication.
- Updating SDK usage to token-based flows.
- Refactoring helper classes that build credential objects.
- Surfacing libraries or APIs that require alternative authentication approaches.
- Preparing build configuration changes needed for managed identity integration.

Migration Analysis
Open the project in Visual Studio Code or IntelliJ IDEA. GitHub Copilot app modernization analyzes:
- Locations where secrets, usernames, passwords, or connection strings are referenced.
- Service clients using credential constructors or static credential factories.
- Environment-variable-based authentication workarounds.
- Dependencies and SDK versions required for Managed Identity authentication.
The analysis outlines upgrade blockers and the required changes for cloud-native authentication.

Migration Plan Generation
GitHub Copilot app modernization produces a migration plan containing:
- Replacement of hard-coded credentials with Managed Identity authentication patterns.
- Version updates for Azure libraries aligning with Managed Identity support.
- Adjustments to application configuration to remove unnecessary secrets.
Developers can review and adjust before applying.

Automated Transformations
GitHub Copilot app modernization applies changes:
- Rewrites code that initializes clients using username/password or connection strings.
- Introduces Managed Identity-friendly constructors and token credential patterns.
- Updates imports, method signatures, and helper utilities.
- Cleans up configuration files referencing outdated credential flows.

Build & Fix Iteration
The tool rebuilds the project, identifies issues, and applies targeted fixes:
- Compilation errors from removed credential classes.
- Incorrect parameter types or constructors.
- Dependencies requiring updates for Managed Identity compatibility.

Security & Behavior Checks
GitHub Copilot app modernization validates:
- CVEs introduced through dependency updates.
- Behavior changes caused by new authentication flows.
- Optional fixes for dependency vulnerabilities.

Expected Output
A migrated codebase using Managed Identity:
- Updated authentication logic.
- Removed credential references.
- Updated SDKs and dependencies.
- A summary file listing code edits, dependency changes, and items requiring manual review.

Developer Responsibilities
Developers should:
- Validate identity access on Azure resources.
- Reconfigure role assignments for system-assigned or user-assigned managed identities.
- Test functional behavior across environments.
- Review integration points dependent on identity scopes and permissions.

Learn full upgrade workflows in the Microsoft Learn guide for upgrading Java projects with GitHub Copilot app modernization.

Learn more
- Predefined tasks for GitHub Copilot app modernization
- Apply a predefined task
- Install GitHub Copilot app modernization for VS Code and IntelliJ IDEA
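
One way to spot-check the "Removed credential references" outcome during manual review is a small scan over configuration values. The sketch below is a hypothetical helper, not part of GitHub Copilot app modernization and not how the tool performs its analysis: the class name `CredentialReferenceCheck`, the sample keys, and the choice of key names in the pattern (`AccountKey`, `SharedAccessKey`, `Password`, `Pwd`) are assumptions chosen to cover common Azure connection-string formats.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical post-migration check: flags config values that still embed credentials. */
final class CredentialReferenceCheck {
    // Matches "key=value" segments whose key names usually carry a secret.
    // "SharedAccessKeyName=..." is not matched because the key must be followed by "=".
    private static final Pattern EMBEDDED_SECRET = Pattern.compile(
            "(?i)(^|;)\\s*(AccountKey|SharedAccessKey|Password|Pwd)\\s*=\\s*[^;\\s]+");

    /** Returns the config keys whose values still contain an embedded secret, in input order. */
    static List<String> findEmbeddedSecrets(Map<String, String> config) {
        List<String> flagged = new ArrayList<>();
        for (Map.Entry<String, String> entry : config.entrySet()) {
            if (EMBEDDED_SECRET.matcher(entry.getValue()).find()) {
                flagged.add(entry.getKey());
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        Map<String, String> config = new LinkedHashMap<>();
        // Legacy style: the storage account key travels inside the connection string.
        config.put("storage.legacy",
                "DefaultEndpointsProtocol=https;AccountName=demo;AccountKey=abc123==;EndpointSuffix=core.windows.net");
        // Managed Identity style: only the endpoint is configured, no secret.
        config.put("storage.endpoint", "https://demo.blob.core.windows.net");
        System.out.println("Still embedding secrets: " + findEmbeddedSecrets(config));
    }
}
```

A regex scan like this only catches well-known key names, so it complements the summary file and manual review rather than replacing them.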