Logic Apps Agentic Workflows with SAP - Part 2: AI Agents
Part 2 focuses on the AI-shaped portion of the destination workflows: how the Logic Apps Agent is configured, how it pulls business rules from SharePoint, and how its outputs are converted into concrete workflow artifacts.

In Destination workflow #1, the agent produces three structured outputs (an HTML validation summary, a CSV list of InvalidOrderIds, and an Invalid CSV payload), which drive (1) a verification email, (2) an optional RFC call to persist failed rows as IDocs, and (3) a filtered dataset used for the separate analysis step that returns only analysis (or errors) back to SAP. In Destination workflow #2, the same approach is applied to inbound IDocs: the workflow reconstructs CSV from the custom segment, runs AI validation against the same SharePoint rules, and safely appends results to an append blob using a lease-based write pattern for concurrency.

1. Introduction

In Part 1, the goal was to make the integration deterministic: stable payload shapes, stable response shapes, and predictable error propagation across SAP and Logic Apps. Concretely, Part 1 established:

- how SAP reaches Logic Apps (Gateway/Program ID plumbing)
- the RFC contracts (IT_CSV, response envelope, RETURN / MESSAGE, EXCEPTIONMSG)
- how the source workflow interprets RFC responses (success vs error)
- how invalid rows can be persisted into SAP as custom IDocs (Z_CREATE_ONLINEORDER_IDOC)
- how the second destination workflow receives those IDocs asynchronously

With that foundation in place, Part 2 narrows in on the part that is not just plumbing: the agent loop, the tool boundaries, and the output schemas that make AI results usable inside a workflow rather than “generated text you still need to interpret.”

The diagram below highlights the portion of the destination workflow where AI is doing real work.
The red-circled section is the validation agent loop (rules in, structured validation outputs out), which then fans out into operational actions like email notification, optional IDoc persistence, and filtering for the analysis step.

What matters here is the shape of the agent outputs and how they are consumed by the rest of the workflow. The agent is not treated as a black box; it is forced to emit typed, workflow-friendly artifacts (summary + invalid IDs + filtered CSV). Those artifacts are then used deterministically: invalid rows are reported (and optionally persisted as IDocs), while valid rows flow into the analysis stage and ultimately back to SAP.

What this post covers

In this post, I focus on five practical topics:

- Agent loop design in Logic Apps: tools, message design, and output schemas that make the agent’s results deterministic enough to automate.
- External rule retrieval: pulling validation rules from SharePoint and applying them consistently to incoming payloads.
- Structured validation outputs → workflow actions: producing InvalidOrderIds and a filtered CSV payload that directly drive notifications and SAP remediation.
- Two-model pattern: a specialized model for validation (agent) and a separate model call for analysis, with a clean handoff between the two.
- Output shaping for consumption: converting AI output into HTML for email and into the SAP response envelope (analysis/errors only).

(Everything else, including SAP plumbing, RFC wiring, and response/exception patterns, was covered in Part 1 and is assumed here.)

Next, I’ll break down the agent loop itself: the tool sequence, the required output fields, and the exact points where the workflow turns AI output into variables, emails, and SAP actions.

Huge thanks to KentWeareMSFT for helping me understand agent loops and design the validation agent structure. And thanks to everyone in 🤖 Agent Loop Demos 🤖 | Microsoft Community Hub for making such great material available.
Note: For the full set of assets used here, see the companion GitHub repository (workflows, schemas, SAP ABAP code, and sample files).

2. Validation Agent Loop

In this solution, the Data Validation Agent runs inside the destination workflow after the inbound SAP payload has been normalized into a single CSV string. The agent is invoked as a single Logic Apps Agent action, configured with an Azure OpenAI deployment and a short set of instructions. Its inputs are deliberately simple at this stage:

- the CSV payload (the dataset to validate), and
- the ValidationRules reference (where the rule document lives), shown in the instructions as a parameter token (ValidationRules is a logic app parameter).

The figure below shows the validation agent configuration used in the destination workflow. The top half is the Agent action configuration (model + instructions), and the bottom half shows the toolset that the agent is allowed to use. The key design choice is that the agent is not “free-form chat”: it’s constrained by a small number of tools and a workflow-friendly output contract.

What matters most in this configuration is the separation between instructions and tools. The instructions tell the agent what to do (“follow business process steps 1–3”), while the tools define how the agent can interact with external systems and workflow state. This keeps the agent modular: you can change rules in SharePoint or refine summarization expectations without rewriting the overall SAP integration mechanics.

Purpose

This agent’s job is narrowly scoped: validate the CSV payload from SAP against externally stored business rules and produce outputs that the workflow can use deterministically. In other words, it turns “validation as reasoning” into workflow artifacts (summary + invalid IDs + invalid payload), instead of leaving validation as unstructured prose.
In Azure Logic Apps terms, this is an agent loop: an iterative process where an LLM follows instructions and selects from available tools to complete a multi-step task. Logic Apps agent workflows explicitly support this “agent chooses tools to complete tasks” model (see Agent Workflows Concepts).

Tools

In Logic Apps agent workflows, a tool is a named sequence that contains one or more actions the agent can invoke to accomplish part of its task (see Agent Workflows Concepts). In the screenshot, the agent is configured with three tools, explicitly labeled Get validation rules, Get CSV payload, and Summarize CSV payload review. These tool names match the business process in the “Instructions for agent” box (steps 1–3). The next sections of the post go deeper into what each tool does internally; at this level, the important point is simply that the agent is constrained to a small, explicit toolset.

Agent execution

The screenshot shows the agent configured with:

- AI model: gpt-5-3 (gpt-5)
- A connection line: “Connected to … (Azure OpenAI)”
- Instructions for agent that define the agent’s role and a 3-step business process:
  - Get validation rules (via the ValidationRules reference)
  - Get CSV payload
  - Summarize the CSV payload review, using the validation document

This pattern is intentional:

- The instructions provide the agent’s “operating procedure” in plain language.
- The tools give the agent controlled ways to fetch the rule document, access the CSV input, and return structured results.

Because the workflow consumes the agent’s outputs downstream, the instruction text is effectively part of your workflow contract (it must remain stable enough that later actions can trust the output shape).

Note: If a reader wants to recreate this pattern, the fastest path is:

- Start with the official overview of agent workflows (Workflows with AI Agents and Models - Azure Logic Apps).
- Follow a hands-on walkthrough for building an agent workflow and connecting it to an Azure OpenAI deployment (Logic Apps Labs, hosted at azure.github.io, is a good step-by-step reference).
- Use the Azure OpenAI connector reference to understand authentication options and operations available in Logic Apps Standard (see Built-in OpenAI Connector).
- If you’re using Foundry for resource management, review how Foundry connections are created and used, especially when multiple resources/tools are involved (see How to connect to AI foundry).

2.1 Tool 1: Get validation rules

The first tool in the validation agent loop is Get validation rules. Its job is to load the business validation rules that will be applied to the incoming CSV payload from SAP. I keep these rules outside the workflow (in a document) so they can be updated without redeploying the Logic App. In this example, the rules are stored in SharePoint, and the tool simply retrieves the document content at runtime.

Get validation rules is implemented as a single action called Get validation document. In the designer, you can see that it:

- uses a SharePoint Online connection (SharePoint icon and connector action)
- calls GetFileContentByPath (shown by the “File Path” input)
- reads the rule file from the configured Site Address
- uses the workflow parameter token ValidationRules for the File Path (so the exact rule file location is configurable per environment)

The output of this tool is the raw rule document content, which the Data Validation Agent uses in the next steps to validate the CSV payload.

The bottom half of the figure shows an excerpt of the rules document. The format is simple and intentionally human-editable: each rule is expressed as FieldName : condition.
For example, the visible rules include:

- PaymentMethod : value must exist
- PaymentMethod : value cannot be “Cash”
- OrderStatus : value must be different from “Cancelled”
- CouponCode : value must have at least 1 character
- OrderID : value must be unique in the CSV array
- A scope note: “Do not validate the Date field.”

These rules are the “source of truth” for validation. The workflow does not hardcode them into expressions; instead, it retrieves them from SharePoint and passes them into the agent loop so the validation logic remains configurable and auditable (you can always point to the exact rule document used for a given run).

A small but intentional rule in the document is “Do not validate the Date field.” That line is there for a practical reason: in an early version of the source workflow, the date column was being corrupted during CSV generation. The validation agent still tried to validate dates (even though date validation wasn’t part of the original intent), and the result was predictable: every row failed validation, leaving nothing to analyze. The upstream issue is fixed now, but I kept this rule in the demo to illustrate an important point: validation is only useful when it’s aligned with the data contract you can actually guarantee at that point in the pipeline.

Note: The rules shown here assume the CSV includes a header row (field names in the first line) so the agent can interpret each column by name. If you want the agent to be schema-agnostic, you can extend the rules with an explicit column mapping, for example:

- Column 1: Order ID
- Column 2: Date
- Column 3: Customer ID
- …

This makes the contract explicit even when headers are missing or unreliable.

With the rules loaded, the next tool provides the second input the agent needs: the CSV payload that will be validated against this document.

2.2 Tool 2: Get CSV payload

The second tool in the validation agent loop is Get CSV payload.
Its purpose is to make the dataset-to-validate explicit: it defines exactly what the agent should treat as “the CSV payload,” rather than relying on implicit workflow context. In this workflow, the CSV is already constructed earlier (as Create_CSV_payload), and this tool acts as the narrow bridge between that prepared string and the agent’s validation step.

Figure: Tool #2 (“Get CSV payload”) defines a single agent parameter and binds it to the workflow’s generated CSV.

The figure shows two important pieces:

- The tool parameter contract (“Agent Parameters”)

On the right, the tool defines an agent parameter named CSV Payload with type String, and the description (highlighted in yellow) makes the intent explicit: “The CSV payload received from SAP and that we validate based on the validation rules.” This parameter is the tool’s interface: it documents what the agent is supposed to provide/consume when using this tool, and it anchors the rest of the validation process to a single, well-defined input. Tools in Logic Apps agent workflows exist specifically to constrain and structure what an agent can do and what data it operates on (see Agent Workflows Concepts).

- Why there is an explicit Compose action (“CSV payload”)

In the lower-right “Code view,” the tool’s internal action is shown as a standard Compose:

    { "type": "Compose", "inputs": "@outputs('Create_CSV_payload')" }

This is intentional. Even though the CSV already exists in the workflow, the tool still needs a concrete action that produces the value it returns to the agent. The Compose step:

- pins the tool output to a single source of truth (Create_CSV_payload), and
- creates a stable boundary: “this is the exact CSV string the agent validates,” independent of other workflow state.

Put simply: the Compose action isn’t there because Logic Apps can’t access the CSV. It’s there to make the agent/tool interface explicit, repeatable, and easy to troubleshoot.
What “tool parameters” are (in practical terms)

In Logic Apps agent workflows, a tool is a named sequence of one or more actions that the agent can invoke while executing its instructions. A tool parameter is the tool’s input/output contract exposed to the agent. In this screenshot, that contract is defined under Agent Parameters, where you specify:

- Name: CSV Payload
- Type: String
- Description: “The CSV payload received from SAP…”

This matters because it clarifies (for both the model and the human reader) what the tool represents and what data it is responsible for supplying.

With Tool #1 providing the rules document and Tool #2 providing the CSV dataset, Tool #3 is where the agent produces workflow-ready outputs (summary + invalid IDs + filtered payload) that the downstream steps can act on.

2.3 Tool 3: Summarize CSV payload review

The third tool, Summarize CSV payload review, is where the agent stops being “an evaluator” and becomes a producer of workflow-ready outputs. It does most of the heavy lifting, so let's go into the details.

Instead of returning one blob of prose, the tool defines three explicit agent parameters, each with a specific format and purpose, so the workflow can reliably consume the results in downstream actions. In Logic Apps agent workflows, tools are explicitly defined tasks the agent can invoke, and each tool can be structured around actions and schemas that keep the loop predictable (see Agent Workflows Concepts).

Figure: Tool #3 (“Summarize CSV payload review”) defines three structured agent outputs.

The parameter description is not just documentation: it’s the contract the model is expected to satisfy, and it strongly shapes what the agent considers “relevant” when generating outputs. The parameters are:

Validation summary (String)

Goal: a human-readable summary that can be dropped straight into email.
In the screenshot, the description is very explicit about shape and content:

- “expected format is an HTML table”
- “create a list of all orderids that have failed”
- “create a CSV document… only for the orderid values that failed… each row on a separate line”
- “include title row only in the email body”

This parameter is designed for presentation: it’s the thing you want humans to read first.

InvalidOrderIds (String, CSV format)

Goal: a machine-friendly list of identifiers the workflow can use deterministically. The key part of the description (highlighted in the image) is: “The format is CSV.” That single sentence is doing a lot of work: it tells the model to emit a comma-separated list, which you then convert into an array in the workflow using split(...).

Invalid CSV payload (String, one row per line)

Goal: the failed rows extracted from the original dataset, in a form that downstream steps can reuse. The description constrains the output tightly:

- “original CSV rows… for the orderid values that failed validation”
- “each row must be on a separate line”
- “keep the title row only for the email body and remove it otherwise”

This parameter is designed for automation: it becomes input to remediation steps (like transforming rows to XML and creating IDocs), not just a report.

What “agent parameters” do here (and why they matter)

A useful way to think about agent parameters is: they are the “typed return values” of a tool. Tools in agent workflows exist to structure work into bounded tasks the agent can perform, and a schema/parameter contract makes the results consumable by the rest of the workflow (see Agent Workflows Concepts).

In this tool, the parameters serve two purposes at once:

- They guide the agent toward salient outputs. The descriptions explicitly name what matters: “failed orderids,” “HTML table,” “CSV format,” “one row per line,” “header row rules.” That phrasing makes it much harder for the model to “wander” into irrelevant commentary.
- They align with how the workflow will parse and use the results. By stating “InvalidOrderIds is CSV,” you make it trivially parseable (split), and by stating “Invalid CSV payload is one row per line,” you make it easy to feed into later transformations.

Why the wording works (and what wording tends to work best)

What’s interesting about the parameter descriptions is that they combine three kinds of constraints:

Output format constraints (make parsing deterministic)

- “expected format is an HTML table”
- “The format is CSV.”
- “each row must be on a separate line”

These format cues help the agent decide what to emit and help you avoid brittle parsing later.

Output selection constraints (force relevance)

- “only for the orderid values that failed validation”
- “Create a list of all orderids that have failed”

This tells the agent what to keep and what to ignore.

Output operational constraints (tie outputs to downstream actions)

- “Include title row only in the email body”
- “remove it otherwise”

This explicitly anticipates downstream usage (email vs remediation), which is exactly the kind of detail models often miss unless you state it.

Rule of thumb: wording works best when it describes what to produce, in what format, with what filtering rules, and why the workflow needs it.

How these parameters tie directly to the downstream actions

The next picture makes the design intent very clear: each parameter is immediately “bound” to a normal workflow value via Compose actions and then used by later steps. This is the pattern we want: agent output → Compose → (optional) normalization → reused by deterministic workflow actions. It’s the opposite of “read the model output and hope.”

This is the reusable pattern:

- Decide the minimal set of outputs the workflow needs.
- Specify formats that are easy to parse.
- Write parameter descriptions that encode both selection and formatting constraints.
- Immediately bind outputs to workflow variables via Compose/SetVariable actions.
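To make the "formats that are easy to parse" point concrete, here is a minimal Python sketch of what the downstream parsing amounts to. It is not part of the workflow, which does this with Logic Apps expressions such as split(...). The function names, the sample values, and the assumption that the order ID is the first CSV column are all hypothetical and used only for illustration.

```python
import csv


def split_invalid_ids(invalid_order_ids: str) -> list[str]:
    """Turn the comma-separated InvalidOrderIds string into a clean list.

    Roughly what split(..., ',') does in the workflow, plus trimming of
    whitespace and dropping of empty entries (an extra safety step).
    """
    return [part.strip() for part in invalid_order_ids.split(",") if part.strip()]


def parse_invalid_rows(invalid_csv_payload: str) -> list[list[str]]:
    """Parse the 'one row per line' invalid CSV payload into rows of fields."""
    lines = [line for line in invalid_csv_payload.splitlines() if line.strip()]
    return list(csv.reader(lines))


def ids_match_rows(ids: list[str], rows: list[list[str]], id_column: int = 0) -> bool:
    """Optional consistency check: every invalid ID has a row, and vice versa.

    Assumes the order ID sits in column `id_column` (hypothetical layout).
    """
    return set(ids) == {row[id_column].strip() for row in rows}


# Hypothetical agent outputs, shaped as the parameter descriptions require.
sample_ids = "1001, 1004,1010"
sample_rows = "1001,2024-01-05,Cash\n1004,2024-01-06,Card\n1010,2024-01-07,\n"

ids = split_invalid_ids(sample_ids)
rows = parse_invalid_rows(sample_rows)
print(ids, len(rows), ids_match_rows(ids, rows))
```

A cross-check like ids_match_rows is not in the original workflow; it is shown only as one way to detect an agent response where the two outputs disagree before acting on them.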
The main takeaway from this tool is that the agent is being forced into a structured contract: three outputs with explicit formats and clear intent. That contract is what makes the rest of the workflow deterministic: Compose actions can safely read @agentParameters(...), the workflow can safely split(...) the invalid IDs, and downstream actions can treat the “invalid payload” as real data rather than narrative. I'll show later how this same “parameter-first” design scales to other agent tools.

2.4 Turning agent outputs into a verification email

Once the agent has produced structured outputs (Validation summary, InvalidOrderIds, and Invalid CSV payload), the next goal is to make those outputs operational: humans need a quick summary of what failed, and the workflow needs machine-friendly values it can reuse downstream.

The design here is intentionally straightforward: the workflow converts each agent parameter into a first-class workflow output (via Compose actions and one variable assignment), then binds those values directly into the Office 365 email body. The result is an email that is both readable and actionable, without anyone needing to open run history.

The figure below shows how the outputs of Summarize CSV payload review are mapped into the verification email. On the left, the tool produces three values via subsequent actions (Summary, Invalid order ids, and Invalid CSV payload), and the workflow also normalizes the invalid IDs into an array (Save invalid order ids). On the right, the Send verification summary action composes the email body using those same values as dynamic content tokens.

Figure: Mapping agent outputs to the verification email

The important point is that the email is not constructed by “re-prompting” or “re-summarizing.” It is assembled from already-structured outputs. This mapping is intentionally direct: each piece of the email corresponds to one explicit output from the agent tool.
The workflow doesn’t interpret or transform the summary beyond basic formatting; its job is to preserve the agent’s structured outputs and present them consistently. The only normalization step happens for InvalidOrderIds, where the workflow also converts the CSV string into an array (ArrayOfInvalidOrderIDs) for later filtering and analysis steps.

The next figure shows a sample verification email produced by this pipeline. It illustrates the three-part structure: an HTML validation summary table, the raw invalid order ID list, and the extracted invalid CSV rows:

Figure: Sample verification email: validation summary table + invalid order IDs + invalid CSV rows.

The extracted artifacts InvalidOrderIds and Invalid CSV payload are used in the downstream actions that persist failed rows as IDocs for later processing, which were presented in Part 1. I will get back to this later to talk about reusing the validation agent. Next, however, I will go over the data analysis part of the AI integration.

3. Analysis Phase: from validated dataset to HTML output

After the validation agent loop finishes, the workflow enters a second AI phase: analysis. The validation phase is deliberately about correctness (what to exclude and why). The analysis phase is about insight, and it runs on the remaining dataset after invalid rows are filtered out.

At a high level, this phase has three steps:

- Call Azure OpenAI to analyze the CSV dataset while explicitly excluding invalid OrderIDs.
- Extract the model’s text output from the OpenAI response object.
- Convert the model’s markdown output into HTML so it renders cleanly in email (and in the SAP response envelope).

3.1 OpenAI component: the “Analyze data” call

The figure below shows the Analyze data action that drives the analysis phase.
This action is executed after the Data Validation Agent completes, and it uses three messages: a system instruction that defines the task, the CSV dataset as input, and a second user message that enumerates the OrderIDs to exclude (the invalid IDs produced by validation).

Figure: Azure OpenAI analysis call.

The analysis call is structured as:

- system: define the task and constraints
- user: provide the dataset
- user: provide exclusions derived from validation

    system: Analyze dataset; provide trends/predictions; exclude specified orderids.
    user: <CSV payload>
    user: Excluded orderids: <comma-separated invalid ids>

Two design choices are doing most of the work here:

- The model is given the dataset and the exclusions separately. This avoids ambiguity: the dataset is one message, and the “do not include these OrderIDs” constraint is another.
- The exclusion list is derived from validation output, not re-discovered during analysis. The analysis step doesn’t re-validate; it consumes the validation phase’s results and focuses purely on trends/predictions.

3.2 Processing the response

The next figure shows how the workflow turns the Azure OpenAI response into a single string that can be reused for email and for the SAP response. The workflow does three things in sequence: it parses the response JSON, extracts the model’s text content, and then passes that text into an HTML formatter.

Figure: Processing the OpenAI response.

This is the only part of the OpenAI response you need to understand for this workflow:

    Analyze_data response
    └─ choices[] (array)
       └─ [0] (object)
          └─ message (object)
             └─ content (string)   <-- analysis text

Everything else in the OpenAI response (filters, indexes, metadata) is useful for auditing but not required to build the final user-facing output.

3.3 Crafting the output to HTML

The model’s output is plain text and often includes lightweight markdown structures (headings, lists, separators).
To make the analysis readable in email (and safe to embed in the SAP response envelope), the workflow converts the markdown into HTML. The script was generated with Copilot. The source code snippet can be found in Part 1.

The next figure shows what the formatted analysis looks like when rendered. Note the explicit reference to the excluded OrderIDs and the summary of the remaining dataset before the trend observations are listed.

Figure: Example analysis output after formatting.

4. Closing the loop: persisting invalid rows as IDocs

In Part 1, I introduced an optional remediation branch: when validation finds bad rows, the workflow can persist them into SAP as custom IDocs for later handling. In Part 2, after unpacking the agent loop, I want to reconnect those pieces and show the “end of the story”: the destination workflow creates IDocs for invalid data, and a second destination workflow receives those IDocs and produces a consolidated audit trail in Blob Storage.

This final section is intentionally pragmatic. It shows:

- where the IDoc creation call happens,
- how the created IDocs arrive downstream,
- and how to safely handle many concurrent workflow instances writing to the same storage artifact (one instance per IDoc).

4.1 From “verification summary” to “Create all IDocs”

The figure below shows the tail end of the verification summary flow. Once the agent produces the structured validation outputs, the workflow first emails the human-readable summary, then converts the invalid CSV rows into an SAP-friendly XML shape, and finally calls the RFC that creates IDocs from those rows.

Figure: End of the validation/remediation branch.

This is deliberately a “handoff point.” After this step, the invalid rows are no longer just text in an email: they become durable SAP artifacts (IDocs) that can be routed, retried, and processed independently of the original workflow run.

4.2 Z_CREATE_ONLINEORDER_IDOC and the downstream receiver

The next figure is the same overview from Part 1.
I’m reusing it here because it captures the full loop: the workflow calls Z_CREATE_ONLINEORDER_IDOC, SAP converts the invalid rows into custom IDocs, and Destination workflow #2 receives those IDocs asynchronously (one workflow run per IDoc).

Figure 2: Invalid rows persisted as custom IDocs.

This pattern is intentionally modular:

- Destination workflow #1 decides which rows are invalid and optionally persists them.
- SAP encapsulates the IDoc creation mechanics behind a stable RFC (Z_CREATE_ONLINEORDER_IDOC).
- Destination workflow #2 processes each incoming IDoc independently, which matches how IDoc-driven integrations typically behave in production.

4.3 Two phases in Destination workflow #2: AI agent + Blob Storage logging

In the receiver workflow, there are two distinct phases:

- AI agent phase (per-IDoc): reconstruct a CSV view from the incoming IDoc payload and (optionally) run the same validation logic.
- Blob storage phase (shared output): append a normalized “verification line” into a shared blob in a concurrency-safe way.

It’s worth calling out: in this demo, the IDocs being received were created from already-validated outputs upstream, so you could argue the second validation is redundant. I keep it anyway for two reasons:

- it demonstrates that the agent tooling is reusable with minimal changes, and
- in a general integration, Destination workflow #2 may receive IDocs from multiple sources, not only from this pipeline, so “validate on receipt” can still be valuable.

4.3.1 AI agent phase

The figure below shows the validation agent used in Destination workflow #2. The key difference from the earlier agent loop is the output format: instead of producing an HTML summary + invalid lists, this agent writes a single “audit line” that includes the IDoc correlation key (DOCNUM) along with the order ID and the failed rules.

Figure: Destination workflow #2 agent configuration.
The reusable part here is the tooling structure: rules still come from the same validation document, the dataset is still supplied as CSV, and the summarization tool outputs a structured value the workflow can consume deterministically. The only meaningful change is “what shape do I want the output to take,” which is exactly what the agent parameter descriptions control.

The next figure zooms in on the summarization tool parameter in Destination workflow #2. Instead of three outputs, this tool uses a single parameter (VerificationInfo) whose description forces a consistent line format anchored on DOCNUM.

Figure 4: VerificationInfo parameter.

This is the same design principle as Tool #3 in the first destination workflow: describe the output as a contract, not as a vague request. The parameter description tells the agent exactly what must be present (DOCNUM + OrderId + failed rules) and therefore makes it straightforward to append the output to a shared log without additional parsing.

Interesting snippets

Extract DOCNUM from the IDoc control record and carry it through the run:

    xpath(xml(triggerBody()?['content']), 'string(/*[local-name()="Receive"] /*[local-name()="idocData"] /*[local-name()="EDI_DC40"] /*[local-name()="DOCNUM"])')

4.3.2 Blob Storage phase

Destination workflow #2 runs one instance per inbound IDoc. That means multiple runs can execute at the same time, all trying to write to the same daily append blob (“ValidationErrorsYYYYMMDD.txt”). The figure below shows the resulting appended output: one line per IDoc, each line beginning with DOCNUM, which becomes the stable correlation key.
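Since DOCNUM is the correlation key for every appended line, it can help to see the extraction from the snippet above outside of Logic Apps. The following Python sketch mirrors the namespace-agnostic local-name() lookup using only the standard library. The sample XML, its namespace URIs, and the DOCNUM value are invented for illustration; only the element names (Receive, idocData, EDI_DC40, DOCNUM) come from the XPath expression in this post.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample payload: namespace URIs and the DOCNUM value are made up.
SAMPLE_IDOC = """\
<Receive xmlns="urn:example:idoc-receive">
  <idocData>
    <EDI_DC40 xmlns="urn:example:idoc-types">
      <DOCNUM>0000000000012345</DOCNUM>
    </EDI_DC40>
  </idocData>
</Receive>
"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts in front of tags."""
    return tag.rsplit("}", 1)[-1]


def extract_docnum(idoc_xml: str) -> str:
    """Return DOCNUM from the control record, or '' when the path is absent.

    Returning '' on a miss mirrors XPath string(), which yields an empty
    string for an empty node set.
    """
    root = ET.fromstring(idoc_xml)
    if local_name(root.tag) != "Receive":
        return ""
    # "{*}" matches any namespace (Python 3.8+), like local-name() in XPath.
    node = root.find("{*}idocData/{*}EDI_DC40/{*}DOCNUM")
    if node is None or node.text is None:
        return ""
    return node.text.strip()


print(extract_docnum(SAMPLE_IDOC))
```

Matching on local names only, as the workflow expression does, keeps the lookup independent of the exact namespace URIs in the received IDoc envelope.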
The figure below shows the concurrency control pattern I used to make those writes safe: a short lease acquisition loop that retries until it owns the blob lease, then appends the verification line(s), and finally releases the lease.

Figure: Concurrency-safe append pattern.

Reading the diagram top-to-bottom, the workflow uses a simple lease → append → release pattern to make concurrent writes safe. Each instance waits briefly (Delay), attempts to acquire a blob lease (Acquire validation errors blob lease), and loops until it succeeds (Set status code → Until lease is acquired). Once a lease is obtained, the workflow stores the lease ID (Save lease id), appends its verification output under that lease (Append verification results), and then releases the lease (Release the lease) so the next workflow instance can write.

Implementation note: the complete configuration for this concurrency pattern (including the HTTP actions, headers, retries, and loop conditions) is included in the attached artifacts, in the workflow JSON for Destination workflow #2.

5. Concluding remarks

Part 2 zoomed in on the AI boundary inside the destination workflows and made it concrete: what the agent sees, what it is allowed to do, what it must return, and how those outputs drive deterministic workflow actions.

The practical outcomes of Part 2 are:

- A tool-driven validation agent that produces workflow artifacts, not prose. The validation loop is constrained by tools and parameter schemas so its outputs are immediately consumable: an email-friendly validation summary, a machine-friendly InvalidOrderIds list, and an invalid-row payload that can be remediated.
- A clean separation between validation and analysis. Validation decides what not to trust (invalid IDs / rows) and analysis focuses on what is interesting in the remaining dataset. The analysis prompt makes the exclusion rule explicit by passing the dataset and excluded IDs as separate messages.
- A repeatable response-processing pipeline.
You extract the model’s text from a stable response path ( choices[0].message.content ), then shape it into HTML once (markdown → HTML) so the same formatted output can be reused for email and the SAP response envelope. A “reuse with minimal changes” pattern across workflows. Destination workflow #2 shows the same agent principles applied to IDoc reception, but with a different output contract optimized for logging: DOCNUM + OrderId + FailedRules . This demonstrates that the real reusable asset is the tool + parameter contract design. Putting It All Together We have a full integration story where SAP, Logic Apps, AI, and IDocs are connected with explicit contracts and predictable behavior: Part 1 established the deterministic integration foundation. SAP ↔ Logic Apps connectivity (gateway/program wiring) RFC payload/response contracts ( IT_CSV , response envelope, error semantics) predictable exception propagation back into SAP an optional remediation branch that persists invalid rows as IDocs via a custom RFC ( Z_CREATE_ONLINEORDER_IDOC ) and the end-to-end response handling pattern in the caller workflow. Part 2 layered AI on top without destabilizing the contracts. Agent loop + tools for rule retrieval and validation output schemas that convert “reasoning” into workflow artifacts a separate analysis step that consumes validated data and produces formatted results and an asynchronous IDoc receiver that logs outcomes safely under concurrency. The reason it works as a two-part series is that the two layers evolve at different speeds: The integration layer (Part 1) should change slowly. It defines interoperability: payload shapes, RFC names, error contracts, and IDoc interfaces. The AI layer (Part 2) is expected to iterate. Prompts, rule documents, output formatting, and agent tool design will evolve as you tune behavior and edge cases. 
References

Logic Apps Agentic Workflows with SAP - Part 1: Infrastructure
🤖 Agent Loop Demos 🤖 | Microsoft Community Hub
Agent Workflows Concepts
Workflows with AI Agents and Models - Azure Logic Apps
Built-in OpenAI Connector
How to connect to AI foundry
Create Autonomous AI Agent Workflows - Azure Logic Apps
Handling Errors in SAP BAPI Transactions
Access SAP from workflows
Create common SAP workflows
Generate Schemas for SAP Artifacts via Workflows
Exception Handling | ABAP Keyword Documentation
Handling and Propagating Exceptions - ABAP Keyword Documentation
SAP .NET Connector 3.1 Overview
SAP .NET Connector 3.1 Programming Guide
Connect to Azure AI services from Workflows

All supporting content for this post may be found in the companion GitHub repository.

AgentCon Hong Kong - Come One Come All for FREE
AgentCon is coming to Hong Kong! 🚀 The AI Agents Developer Conference lands on Saturday, 11 April 2026, at Hong Kong Institute of Information Technology (HKIIT) (VTC Tsing Yi Complex). If you're building with AI agents, automation, or intelligent systems, don't miss this gathering of developers, architects, and AI leaders for a full day of real-world sessions focused on designing, deploying, and scaling AI agents. Secure your spot ➡️ https://aka.ms/AgentconHongKong2026

AgentCon Seoul - Come One Come All for FREE
AgentCon is coming to Seoul! 🚀 The AI Agents Developer Conference lands on Thursday, 16 April 2026, at Seoul National University. If you're building with AI agents, automation, or intelligent systems, don't miss this gathering of developers, architects, and AI leaders for a full day of real-world sessions focused on designing, deploying, and scaling AI agents. Secure your spot ➡️ https://aka.ms/agentconSeoul2026

AgentCon London - Come One Come All for FREE
AgentCon is coming to London! 🚀 The AI Agents Developer Conference lands on Wednesday, 22 April 2026, at London Southbank University. If you're building with AI agents, automation, or intelligent systems, don't miss this gathering of developers, architects, and AI leaders for a full day of real-world sessions focused on designing, deploying, and scaling AI agents. Secure your spot ➡️ https://aka.ms/agentconLondon2026

Microsoft Partners: Accelerate Your AI Journey at AgentCon 2026 (Free Community Event)
Recently, a customer asked me a question many Microsoft partners are hearing right now: “We have Copilot; how do we actually use AI to change the way we work?” That question captures where we are in the AI journey today. Organizations have moved past curiosity. Now they’re looking for trusted partners who can turn AI into real business outcomes. That’s why events like AgentCon 2026 matter.

A free, community-led event built by practitioners

AgentCon is not a traditional conference. It’s a free, community-driven global event organized by the Global AI Community together with Microsoft partners and ecosystem leaders. Simply put: it’s for the community, by the community. Across cities worldwide, developers, consultants, architects, and Microsoft partners come together to share practical experiences building with AI agents, Copilot, and the Microsoft platform. The focus isn’t theory; it’s implementation:

- What worked
- What didn’t
- What partners can apply immediately with customers

This peer learning model reflects how many of us actually grow in the Microsoft ecosystem: by learning from other partners solving real problems.

Why this matters for Microsoft partners

The opportunity for partners is evolving quickly. Customers aren’t just asking about AI tools; they’re asking how to redesign processes, automate work, and unlock productivity using AI-powered solutions. The Microsoft AI Cloud Partner Program emphasizes partner skilling and helping customers realize value from AI investments. Community events like AgentCon accelerate that learning by bringing partners together to exchange proven approaches and practical insights. When partners upskill faster, customers succeed faster.

Why attend

AgentCon is designed to help partners move from AI awareness to AI delivery.
As an attendee, you can expect:

- Practical sessions and demos from practitioners
- Real-world AI and agent scenarios
- Direct conversations with builders and peers
- New collaboration and co-sell opportunities

You’ll leave with ideas and approaches you can bring directly into customer engagements.

Why speak

AgentCon thrives because partners share openly with one another. If you’ve implemented Copilot, explored AI agents, or learned lessons from customer deployments, your experience can help others accelerate their journey. Speaking at AgentCon allows you to:

- Share your expertise with the global partner community
- Build credibility within the Microsoft ecosystem
- Create new partnerships and opportunities
- Contribute to collective partner success

You don’t need a perfect story, just an honest one others can learn from.

Join the global AgentCon community

AgentCon 2026 events take place around the world, including these upcoming events:

March 9 - New York: https://aka.ms/AgentconNYC2026
April 11 - Hong Kong: https://aka.ms/AgentconHongKong2026
April 16 - Seoul: https://aka.ms/agentconSeoul2026
April 22 - London: https://aka.ms/agentconLondon2026

Each event is locally organized, community-led, and free to attend.

Help shape the next phase of AI adoption

AI transformation is happening now, and Microsoft partners play a critical role in guiding customers forward. AgentCon is an opportunity to learn together, share experiences, and strengthen the partner ecosystem driving AI innovation.

👉 Register or apply to speak: https://aka.ms/agentcon2026

We hope you’ll join us and be part of the community helping customers turn AI potential into real impact.

AgentCon New York - Come One Come All for FREE
On March 9, 2026, #AgentCon lands at Nasdaq, Times Square, bringing together developers, engineers, and innovators shaping the future of AI agents. Expect deep‑dive talks, hands‑on learning, practical demos and plenty of networking with the AI community. This isn’t just another AI event; it’s where builders meet to talk real code. ➡️ Register now!

Behind the Build with RSA: Identity Resilience in the Age of AI
Behind the Build is an ongoing series spotlighting standout Microsoft partner collaborations. Each edition dives into the technical and strategic decisions that shape real-world integrations—highlighting engineering excellence, innovation, and the shared customer value created through partnership. RSA and Microsoft share a long, multiyear partnership shaped not by a single product or integration, but by shared customers grappling with some of today’s most complex security challenges, from cloud migration and identity sprawl to AI-driven threats. In this Behind the Build blog, we feature Dave Taku, RSA’s Vice President of Product Management and User Experience, to dive deeper into how that collaboration works at a technical level, how RSA and Microsoft engineers partner to solve real customer problems, and how recent work spanning Microsoft Entra, Microsoft Sentinel, and AI-driven security capabilities is shaping what comes next. Meet Dave Taku Dave Taku has spent nearly 25 years in cybersecurity, working across domains such as telecommunications and network security. But most of that time has been focused squarely on identity in areas like authentication, access management, governance and lifecycle, in particular. He’s been with RSA for two decades. When asked what makes a great VP of product, Dave describes his role as one centered on enablement. “My job is really to provide clarity and empower the team, to help them be successful.” That team-oriented mindset carries through RSA’s broader approach to engineering and partnerships. A Customer Driven Partnership with Microsoft RSA’s collaboration with Microsoft has largely been shaped by shared customers, many of them large, complex enterprises navigating the shift from on premises environments to cloud-first architectures. “These efforts are almost always customer initiated,” Dave notes. 
“Customers want us working together to make their journey successful.” That alignment has led to a wide range of joint initiatives over the years, spanning identity control planes, hybrid and multi cloud scenarios, and more recently, deeper analytics and AI driven security workflows. Identity as the Foundation Identity sits at the center of RSA’s partnership with Microsoft, particularly through integrations with Microsoft Entra. While organizations increasingly adopt Entra for cloud identity, many still operate complex hybrid estates and highly regulated environments. RSA can help in those mixed-use cases by extending identity controls beyond a single platform, providing behavioral analytics and risk-based authentication that complements Entra’s native features. “At RSA, we’re laser focused on answering two questions for our customers,” Dave explains. “Who is this user (can we be absolutely sure)? And is their access appropriate from a zero-trust perspective?” A standout example of Microsoft’s collaboration with RSA is their early adoption of External Authentication Methods (EAM), where they served as a day one launch partner. EAM built on prior generations of integration between RSA and Microsoft identity technologies and has been critical for customers migrating sensitive workloads to the cloud without disrupting existing security postures. At the end of the day, it is customers that drive this kind of innovation. Dave points to large, global, financial institutions as clear bellwethers. As these organizations shift toward cloud first models and embrace Azure and SaaS, they face the challenge of modernizing identity without disrupting environments long secured by RSA or introducing new risks during migration. 
EAM has been critical in enabling that transition, allowing established RSA authentication and policy controls to carry forward into Microsoft Entra so customers can adopt cloud services while preserving the security models and operational consistency they depend on. From Identity Signals to Agentic AI with Sentinel More recently, RSA and Microsoft have collaborated on deeper integrations with Microsoft Sentinel, including work with Sentinel data lake and Security Copilot. These efforts marked the first co-engineered agentic solution from RSA and Microsoft. RSA sees AI influencing identity security across several fronts: improving insights and automation, defending against AI-powered attacks, and securing non-human identities as autonomous agents become more common in enterprise environments. RSA’s approach starts with administrative telemetry from RSA ID Plus. Those events are ingested through a Sentinel connector and stored in the Microsoft Sentinel data lake which enables cost‑effective long‑term retention of identity telemetry, making it available for advanced analytics. Security Copilot agents then assess this data to surface anomalous or risky administrative behavior. “Admin accounts are increasingly a target,” says Dave. “If you don’t know when an admin is behaving unusually, you’re already too late.” This integration enables security teams to analyze identity related activity alongside broader organizational telemetry, helping analysts detect compromised credentials earlier and respond faster. “Human operators can’t keep up anymore,” Dave says. “As identities become more dynamic and more automated, we need AI driven assistance to maintain zero trust at scale.” Looking Ahead As RSA and Microsoft look ahead, their collaboration is increasingly shaped by how identity security must evolve in an AI driven world. Dave outlines three core areas where both teams see significant opportunities for continued innovation. 
AI will play a growing role in helping organizations make sense of increasingly fluid identity environments, enabling better insight, decision making, and, over time, more autonomous responses as manual oversight becomes less viable. At the same time, the rise of AI powered attacks is placing new strain on traditional identity trust models, pushing the industry toward more adaptive, analytics driven signals. Finally, as enterprises adopt AI agents that act independently or on behalf of users, identity security is expanding beyond humans altogether, making the protection of non-human identities an essential frontier for the future of cybersecurity.

Programs like the Microsoft Intelligent Security Association (MISA) help enable this kind of deep technical collaboration, providing a framework for RSA and Microsoft to align on emerging scenarios, validate integrations, and bring new capabilities to market faster. “It’s been a long journey together,” Dave reflects. “And we’re just getting started.”

Partner Case study | Insight
As organizations and leaders race to embrace the potential of AI, sometimes the hardest part of the transformation isn’t choosing the right technology; it’s getting people to use it. Employees may hesitate to modify established workflows, feel resistant to new technology, or simply not know how or where to start. This can stall even the most ambitious AI strategies.

Insight Enterprises, a Fortune 500 solutions integrator and longtime Microsoft partner, deeply understands this challenge. The company helps clients modernize their operations and embrace innovative technologies to accelerate growth. Their expertise and close partnerships with customers have earned them global recognition, and they were named a 2024 Microsoft Americas AI and Copilot Innovation Partner of the Year. As a trusted advisor to thousands of organizations, Insight knew they needed to lead by example. While the company was helping clients accelerate their AI journeys, they also wanted to accelerate their own internal AI adoption. To accomplish that goal, Insight launched Flight Academy, a gamified program designed to make AI usage practical, measurable, and rewarding for every teammate.

Leading from within to stay on the frontier of AI innovation

Insight has built their reputation by supporting AI adoption, modernization, and future-readiness for organizations around the world. But they wanted to apply that same philosophy internally, setting the standard for the AI innovation they bring to customers. To remain on the frontier of AI innovation, they committed to becoming an AI-first organization. That meant empowering 16,000 teammates worldwide to embrace AI tools and integrate them into daily workflows. “We think of each of our teammates as this force multiplier, this little snowball on top of a mountain,” said Stan Lequin, CTO, AI Transformation at Insight.
“If we give it a push from an AI perspective, what kind of momentum and force of nature could we create with 16,000 snowballs, fully AI-enabled, rolling down the hill?” To realize that vision and become an AI-first organization, Insight wanted every teammate to feel confident using AI tools, not just for isolated tasks but as an integral part of daily work. “We had really good conversations with our teammates that AI is just going to make them more powerful,” said Lequin. “They shouldn’t fear it; they should embrace it.”

Initially, AI adoption remained lower than the team hoped. But they quickly came up with an idea that changed the AI landscape, inspiring employees to adopt AI and experiment with solutions at a fast and ambitious rate. They built an environment where AI curiosity was encouraged, success was celebrated, and use cases were visible across the business. Essentially, they gamified AI adoption, incentivizing AI use through light competition and rewards, and in the process they turned initial interest into sustained engagement.

Continue reading here
Explore all case studies or submit your own
Subscribe to case studies tag to follow all new case study posts.
Don't forget to follow this blog to receive email notifications of new stories!

On-Premises Manufacturing Intelligence
Manufacturing facilities face a fundamental dilemma in the AI era: how to harness artificial intelligence for predictive maintenance, equipment diagnostics, and operational insights while keeping sensitive production data entirely on-premises. Industrial environments generate proprietary information (CNC machining parameters, quality control thresholds, equipment performance signatures, maintenance histories) that represents competitive advantage accumulated over decades of process optimization. Sending this data to cloud APIs risks intellectual property exposure, regulatory non-compliance, and operational dependencies that manufacturing operations cannot accept.

Traditional cloud-based AI introduces unacceptable vulnerabilities. Network latency of 100-500ms makes real-time decision support impossible for time-sensitive manufacturing processes. Internet dependency creates single points of failure in environments where connectivity is unreliable or deliberately restricted for security. API pricing models become prohibitively expensive when analyzing thousands of sensor readings and maintenance logs continuously. Most critically, data residency requirements for aerospace, defense, pharmaceutical, and automotive industries make cloud AI architectures non-compliant by design: ITAR, FDA 21 CFR Part 11, and customer-specific mandates require that data never leave facility boundaries.

This article demonstrates a sample solution for manufacturing asset intelligence that runs entirely on-premises using Microsoft Foundry Local, Node.js, and JavaScript. The FoundryLocal-IndJSsample repository provides a production-ready implementation with an Express backend, an HTML/JavaScript frontend, and comprehensive Foundry Local SDK integration. Facilities can deploy sophisticated AI-powered monitoring without external dependencies, cloud costs, data exposure risks, or network requirements. Every inference happens locally on facility hardware with predictable performance and zero data egress.
Why On-Premises AI Matters for Industrial Operations

The case for local AI inference in manufacturing extends beyond simple preference: it addresses fundamental operational, security, and compliance requirements that cloud solutions cannot satisfy. Understanding these constraints shapes architectural decisions that prioritize reliability, data sovereignty, and cost predictability.

Data Sovereignty and Intellectual Property Protection

Manufacturing processes represent years of proprietary research, optimization, and competitive advantage. Equipment configurations, cycle times, quality thresholds, and maintenance schedules contain intelligence that competitors would value highly. Sending this data to third-party cloud services, even with contractual protections, introduces risks that manufacturing operations cannot accept. On-premises AI ensures that production data never leaves the facility network perimeter. Telemetry from CNC machines, hydraulic systems, conveyor networks, and control systems remains within air-gapped environments where physical access controls and network isolation provide demonstrable data protection. This architectural guarantee of data locality satisfies both internal security policies and external audit requirements without relying on contractual assurances or encryption alone.

Operational Resilience and Network Independence

Factory floors frequently operate in environments with limited, unreliable, or intentionally restricted internet connectivity. Remote facilities, secure manufacturing zones, and legacy industrial networks cannot depend on continuous cloud access for critical monitoring functions. When network failures occur, whether from ISP outages, DDoS attacks, or infrastructure damage, AI capabilities must continue operating to prevent production losses. Local inference provides true operational independence. Equipment health monitoring, anomaly detection, and maintenance prioritization continue functioning during network disruptions.
This resilience is essential for 24/7 manufacturing operations where downtime costs can exceed tens of thousands of dollars per hour. By eliminating external dependencies, on-premises AI becomes as reliable as the local power supply and computing infrastructure.

Latency Requirements for Real-Time Decision Making

Manufacturing processes involve precise timing where milliseconds determine quality outcomes. Automated inspection systems must classify defects before products leave the production line. Safety interlocks must respond to hazardous conditions before injuries occur. Predictive maintenance alerts must trigger before catastrophic equipment failures cascade through production lines. Cloud-based AI introduces latency that is incompatible with these requirements. Network round-trips to cloud endpoints typically require 100-500 milliseconds, which in some cases is unacceptable for real-time applications. Local inference with Foundry Local eliminates those network hops, so response time depends only on the local model and hardware, which makes tighter integration with SCADA systems, PLCs, and manufacturing execution systems practical.

Cost Predictability at Industrial Scale

Manufacturing facilities generate enormous volumes of time-series data from thousands of sensors, producing millions of data points daily. Cloud AI services charge per API call or per token processed, creating costs that scale linearly with data volume and are hard to predict. High-throughput industrial applications can quickly accumulate tens of thousands of dollars in monthly API fees. On-premises AI transforms this variable operational expense into fixed capital infrastructure costs. After initial hardware investment, inference costs remain constant regardless of query volume. For facilities analyzing equipment telemetry, maintenance logs, and operator notes continuously, this economic model provides cost certainty and eliminates budget surprises.
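The cost argument is simple arithmetic, and a small helper can make it concrete for a given facility. The sketch below is illustrative only: `estimateMonthlyCloudCost` and `breakEvenMonths` are hypothetical helper names, and every number passed to them is a made-up placeholder, not a price or measurement from this article or from any vendor.

```javascript
// Per-token cloud pricing grows linearly with request volume.
function estimateMonthlyCloudCost(requestsPerDay, tokensPerRequest, pricePerThousandTokens, daysPerMonth = 30) {
  const thousandTokensPerDay = (requestsPerDay * tokensPerRequest) / 1000;
  return thousandTokensPerDay * pricePerThousandTokens * daysPerMonth;
}

// Months until a one-time hardware purchase is paid back by the monthly difference.
function breakEvenMonths(hardwareCost, monthlyOnPremCost, monthlyCloudCost) {
  const monthlySavings = monthlyCloudCost - monthlyOnPremCost;
  if (monthlySavings <= 0) return Infinity; // on-premises never pays back at this volume
  return Math.ceil(hardwareCost / monthlySavings);
}

// Made-up inputs purely for illustration; substitute your own measurements.
const exampleCloudCost = estimateMonthlyCloudCost(50000, 800, 0.002);
console.log(exampleCloudCost, breakEvenMonths(15000, 400, exampleCloudCost));
```

With these placeholder inputs the cloud estimate comes to about $2,400 per month and the hardware pays for itself in 8 months. Power, maintenance, and staffing belong in `monthlyOnPremCost`; leaving them out would overstate the savings.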
Regulatory Compliance and Audit Requirements Regulated industries face strict data handling requirements. Aerospace manufacturers must comply with ITAR controls on technical data. Pharmaceutical facilities must satisfy FDA 21 CFR Part 11 requirements for electronic records. Automotive suppliers must meet customer-specific data residency mandates. Cloud AI services complicate compliance by introducing third-party data processors, cross-border data transfers, and shared infrastructure concerns. Local AI simplifies regulatory compliance by eliminating external data flows. Audit trails remain within the facility. Data handling procedures avoid third-party agreements. Compliance demonstrations become straightforward when AI infrastructure resides entirely within auditable physical and network boundaries. Architecture: Manufacturing Intelligence Without Cloud Dependencies The manufacturing asset intelligence system demonstrates a practical architecture for deploying AI capabilities entirely on-premises. The design prioritizes operational reliability, straightforward integration patterns, and maintainable code structure that facilities can adapt to their specific requirements. System Components and Technology Stack The implementation consists of three primary layers that separate concerns and enable independent scaling: Foundry Local Layer: Provides the local AI inference runtime. Foundry Local manages model loading, execution, and resource allocation. It supports multiple model families (Phi-3.5, Phi-4, Qwen2.5) with automatic hardware acceleration detection for NVIDIA GPUs (CUDA), Intel GPUs (OpenVINO), ARM Qualcomm (QNN) and optimized CPU inference. The service exposes a REST API on localhost that the backend layer consumes for completions. Backend Service Layer: An Express Node.js application that serves as the integration point between the AI runtime and the manufacturing data systems. 
This layer implements business logic for equipment monitoring, maintenance log classification, and conversational interfaces. It formats prompts with equipment context, calls Foundry Local for inference, and structures responses for the frontend. The backend persists chat history and provides RESTful endpoints for all AI operations.

Frontend Interface Layer: A standalone HTML/JavaScript application that provides operator interfaces for equipment monitoring, maintenance management, and AI assistant interactions. The UI fetches data from the backend service and renders dashboards, equipment status views, and chat interfaces. No framework dependencies or build steps are required; the frontend operates as static files that any web server or file system can serve.

Data Flow for Equipment Analysis

Understanding how data moves through the system clarifies integration points and extension opportunities. When an operator requests AI analysis of equipment status, the following sequence occurs:

The frontend collects equipment context including asset ID, current telemetry values, alert status, and recent maintenance history. It constructs an HTTP request to the backend's equipment summary endpoint, passing this context as query parameters or request body. The backend retrieves additional context from the equipment database, including specifications, normal operating ranges, and historical performance patterns.

The backend constructs a detailed prompt that provides the AI model with comprehensive context: equipment specifications, current telemetry with alarming conditions highlighted, recent maintenance notes, and specific questions about operational status. This prompt engineering is critical because the model's accuracy depends entirely on the context provided. Generic prompts produce generic responses; detailed, structured prompts yield actionable insights.
The backend calls Foundry Local's completion API with the formatted prompt, specifying temperature, max tokens, and other generation parameters. Foundry Local loads the configured model (if not already in memory) and generates a response analyzing the equipment's condition. The inference occurs locally with no network traffic leaving the facility. Response time typically ranges from 500ms to 3 seconds depending on prompt complexity and model size.

Foundry Local returns the generated text to the backend, which parses the response for structured information if required (equipment health classifications, priority levels, recommended actions). The backend formats this analysis as JSON and returns it to the frontend. The frontend renders the AI-generated summary in the equipment health dashboard, highlighting critical findings and recommended operator actions.

Prompt Engineering for Maintenance Log Classification

The maintenance log classification feature demonstrates effective prompt engineering for extracting structured decisions from language models. Manufacturing facilities accumulate thousands of maintenance notes, operator observations, technician reports, and automated system logs. Automatically classifying these entries by severity enables priority-based work scheduling without manual review of every log entry. The classification prompt provides the model with clear instructions, classification categories with definitions, and the maintenance note text to analyze:

```javascript
const classificationPrompt = `You are a manufacturing maintenance expert analyzing equipment log entries.
Classify the following maintenance note into one of these categories:

CRITICAL: Immediate safety hazard, equipment failure, or production stoppage
HIGH: Degraded performance, abnormal readings requiring same-shift attention
MEDIUM: Scheduled maintenance items or routine inspections
LOW: Informational notes, normal operations logs

Provide your response in JSON format:
{
  "classification": "CRITICAL|HIGH|MEDIUM|LOW",
  "reasoning": "Brief explanation of classification decision",
  "recommended_action": "Specific next steps for maintenance team"
}

Maintenance Note: ${maintenanceNote}

Classification:`;

// Called from within an async function (for example, an Express route handler)
const response = await foundryClient.chat.completions.create({
  model: currentModelAlias,
  messages: [{ role: 'user', content: classificationPrompt }],
  temperature: 0.1, // Low temperature for consistent classification
  max_tokens: 300
});
```

Key aspects of this prompt design:

- Role definition: Establishing the model as a "manufacturing maintenance expert" steers it toward domain-relevant vocabulary and reasoning.
- Clear categories: Explicit classification options with definitions reduce ambiguous outputs and enable consistent decision-making across thousands of logs.
- Structured output format: Requesting JSON responses with specific fields enables automated parsing and integration with maintenance management systems without fragile text parsing.
- Temperature control: Setting temperature to 0.1 reduces randomness in classifications, making severity assessments for similar maintenance conditions more consistent.
- Context isolation: Separating the maintenance note text from the instructions with clear delimiters reduces the risk of prompt injection, where a malicious log entry attempts to manipulate the classification logic.

This classification runs locally for every maintenance log entry without API costs or network delays.
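Because the prompt asks for JSON but a model may still wrap it in extra text, it can help to validate the output before routing work on it. The sketch below is one possible approach, not code from the sample repository; `parseClassification` and its return shape are hypothetical names introduced for illustration.

```javascript
const ALLOWED_CLASSIFICATIONS = ['CRITICAL', 'HIGH', 'MEDIUM', 'LOW'];

// Extracts and validates the JSON object the classification prompt asks for.
// Returns { ok: true, ... } on success or { ok: false, error } on any problem,
// so the caller can fall back to manual review instead of trusting bad output.
function parseClassification(rawText) {
  // Models sometimes add prose or code fences; keep only the outermost braces.
  const start = rawText.indexOf('{');
  const end = rawText.lastIndexOf('}');
  if (start === -1 || end <= start) {
    return { ok: false, error: 'No JSON object found in model output' };
  }

  let parsed;
  try {
    parsed = JSON.parse(rawText.slice(start, end + 1));
  } catch (err) {
    return { ok: false, error: `Invalid JSON: ${err.message}` };
  }

  const classification = String(parsed.classification || '').trim().toUpperCase();
  if (!ALLOWED_CLASSIFICATIONS.includes(classification)) {
    return { ok: false, error: `Unexpected classification: ${parsed.classification}` };
  }

  return {
    ok: true,
    classification,
    reasoning: String(parsed.reasoning || ''),
    recommendedAction: String(parsed.recommended_action || '')
  };
}
```

A caller would pass the model's message text (for the request above, `response.choices[0].message.content`) and send anything that comes back with `ok: false` to a human review queue instead of guessing a severity.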
Facilities processing hundreds of maintenance notes daily benefit from immediate, consistent classification that routes critical issues to technicians automatically while filtering routine informational logs. Model Selection and Performance Trade-offs Foundry Local supports multiple model families with different memory requirements, inference speeds, and accuracy characteristics. Choosing appropriate models for manufacturing environments requires balancing these trade-offs against hardware constraints and operational requirements: Qwen2.5-0.5b (500MB memory): The smallest available model provides extremely fast inference (100-200ms responses) on limited hardware. Suitable for simple classification tasks, keyword extraction, and high-throughput scenarios where response speed matters more than nuanced understanding. Works well on older servers or edge devices with constrained resources. Phi-3.5-mini (2.1GB memory): The recommended default model balances accuracy with reasonable memory requirements. Provides strong reasoning capabilities for equipment analysis, maintenance prioritization, and conversational assistance. Response times of 1-3 seconds on modern CPUs are acceptable for interactive dashboards. This model handles complex prompts with detailed equipment context effectively. Phi-4-mini (3.6GB memory): Increased model capacity improves understanding of technical terminology and complex equipment relationships. Best choice when analyzing detailed maintenance histories, interpreting sensor correlation patterns, or providing nuanced operational recommendations. Requires more memory but delivers noticeably improved analysis quality for complex scenarios. Qwen2.5-7b (4.7GB memory): The largest supported model provides maximum accuracy and sophisticated reasoning. Ideal for facilities with modern server hardware where best-possible analysis quality justifies longer inference times (3-5 seconds). 
Consider this model for critical applications where operator decisions depend heavily on AI recommendations. Facilities can download all models during initial setup and switch between them based on specific use cases. Use faster models for real-time dashboard updates and automated classification. Deploy larger models for detailed equipment analysis and maintenance planning where operators can wait several seconds for comprehensive insights. Implementation: Equipment Monitoring and AI Analysis The practical implementation reveals how straightforward on-premises AI integration can be with modern JavaScript tooling and proper architectural separation. The backend service manages all AI interactions, shielding the frontend from inference complexity and providing clean REST interfaces. Backend Service Architecture with Express The Node.js backend initializes the Foundry Local SDK client and exposes endpoints for equipment operations: const express = require('express'); const { FoundryLocalClient } = require('foundry-local-sdk'); const cors = require('cors'); const app = express(); const PORT = process.env.PORT || 3000; // Initialize Foundry Local client const foundryClient = new FoundryLocalClient({ baseURL: 'http://localhost:8008', // Default Foundry Local endpoint timeout: 30000 }); // Middleware configuration app.use(cors()); // Enable cross-origin requests from frontend app.use(express.json()); // Parse JSON request bodies // Health check endpoint for monitoring app.get('/api/health', (req, res) => { res.json({ ok: true, service: 'manufacturing-ai-backend' }); }); // Start server app.listen(PORT, () => { console.log(`Manufacturing AI backend running on port ${PORT}`); console.log(`Foundry Local endpoint: http://localhost:8008`); }); This foundational structure establishes the Express application with CORS support for browser-based frontends and JSON request handling. 
The Foundry Local client connects to the local inference service running on port 8008, no external network configuration required. Equipment Summary Generation with Context-Rich Prompts The equipment summary endpoint demonstrates effective context injection for accurate AI analysis: app.get('/api/assets/:id/summary', async (req, res) => { try { const assetId = req.params.id; const asset = equipmentDatabase.find(a => a.id === assetId); if (!asset) { return res.status(404).json({ error: 'Asset not found' }); } // Construct detailed equipment context const contextPrompt = buildEquipmentContext(asset); // Generate AI analysis const completion = await foundryClient.chat.completions.create({ model: 'phi-3.5-mini', messages: [{ role: 'user', content: contextPrompt }], temperature: 0.3, max_tokens: 500 }); const analysis = completion.choices[0].message.content; res.json({ assetId: asset.id, assetName: asset.name, analysis: analysis, generatedAt: new Date().toISOString() }); } catch (error) { console.error('Equipment summary error:', error); res.status(500).json({ error: 'AI analysis failed', details: error.message }); } }); The equipment context builder assembles comprehensive information for accurate analysis: function buildEquipmentContext(asset) { const alerts = asset.alerts.filter(a => a.severity !== 'INFO'); const telemetry = asset.currentTelemetry; return `Analyze the following manufacturing equipment status: Equipment: ${asset.name} (${asset.id}) Type: ${asset.type} Location: ${asset.location} Current Telemetry: - Temperature: ${telemetry.temperature}°C (Normal range: ${asset.specs.tempRange}) - Vibration: ${telemetry.vibration} mm/s (Threshold: ${asset.specs.vibrationThreshold}) - Pressure: ${telemetry.pressure} PSI (Normal: ${asset.specs.pressureRange}) - Runtime: ${telemetry.runHours} hours (Next maintenance due: ${asset.nextMaintenance}) Active Alerts: ${alerts.map(a => `- ${a.severity}: ${a.message}`).join('\n')} Recent Maintenance History: 
${asset.recentMaintenance.slice(0, 3).map(m => `- ${m.date}: ${m.description}`).join('\n')} Provide a concise operational summary focusing on: 1. Current equipment health status 2. Any concerning trends or anomalies 3. Recommended operator actions if applicable 4. Maintenance priority level Summary:`; } This context-rich approach produces accurate, actionable analysis because the model receives equipment specifications, current telemetry with context, alert history, maintenance patterns, and structured output guidance. The model can identify abnormal conditions accurately rather than guessing what values seem unusual. Conversational AI Assistant with Manufacturing Context The chat endpoint enables natural language queries about equipment status and operational questions: app.post('/api/chat', async (req, res) => { try { const { message, conversationId } = req.body; // Retrieve conversation history for context const history = conversationStore.get(conversationId) || []; // Build plant-wide context for the query const plantContext = buildPlantOperationsContext(); // Construct system message with domain knowledge const systemMessage = { role: 'system', content: `You are an AI assistant for a manufacturing facility's operations team. You have access to real-time equipment data and maintenance records. Current Plant Status: ${plantContext} Provide specific, actionable responses based on actual equipment data. If you don't have information to answer a query, clearly state that. 
Never speculate about equipment conditions beyond available data.` }; // Include conversation history for multi-turn context const messages = [ systemMessage, ...history, { role: 'user', content: message } ]; const completion = await foundryClient.chat.completions.create({ model: 'phi-3.5-mini', messages: messages, temperature: 0.4, max_tokens: 600 }); const assistantResponse = completion.choices[0].message.content; // Update conversation history history.push( { role: 'user', content: message }, { role: 'assistant', content: assistantResponse } ); conversationStore.set(conversationId, history); res.json({ response: assistantResponse, conversationId: conversationId, timestamp: new Date().toISOString() }); } catch (error) { console.error('Chat error:', error); res.status(500).json({ error: 'Chat request failed', details: error.message }); } }); The conversational interface enables operators to ask natural language questions and receive grounded responses based on actual equipment data, citing specific asset IDs, current metric values, and alert statuses rather than speculating. Deployment and Production Operations Deploying on-premises AI in industrial settings requires consideration of hardware placement, network architecture, integration patterns, and operational procedures that differ from typical web application deployments. Hardware and Infrastructure Requirements The system runs on standard server hardware without specialized AI accelerators, though GPU availability improves performance significantly. Minimum requirements include 8GB RAM for the Phi-3.5-mini model, 4-core CPU, and 50GB storage for model files and application data. Production deployments benefit from 16GB+ RAM to support larger models and concurrent analysis requests. For facilities with NVIDIA GPUs, Foundry Local automatically utilizes CUDA acceleration, reducing inference times by 3-5x compared to CPU-only execution. 
Deploy the backend service on dedicated server hardware within the factory network. Avoid running AI workloads on the same systems that host critical SCADA or MES applications, because inference would compete with them for CPU and memory.

Network Architecture and SCADA Integration

The AI backend should reside on the manufacturing operations network with firewall rules permitting connections from operator workstations and monitoring systems. Do not expose the backend service directly to the internet; all access should occur through the facility's internal network with authentication via existing directory services.

Integrate with SCADA systems through standard industrial protocols. Configure OPC-UA clients to subscribe to equipment telemetry topics and forward readings to the AI backend via REST API calls. Modbus TCP gateways can bridge legacy PLCs to modern APIs by polling register values and POSTing updates to the backend's telemetry ingestion endpoints.

Security and Compliance Considerations

Many manufacturing facilities operate air-gapped networks where physical separation prevents internet connectivity entirely. Deploy Foundry Local and the AI application in these environments by transferring model files and application packages via removable media during controlled maintenance windows.

Implement role-based access control (RBAC) using Active Directory integration. Configure the backend to validate user credentials against LDAP before serving AI analysis requests. Maintain detailed audit logs of all AI invocations, including user identity, timestamp, equipment queried, and model version used. Store these logs in immutable append-only databases for compliance audits.
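The audit-logging requirement above can be sketched in a few lines. This is a minimal illustration, not the sample repository's implementation: the function names `buildAuditRecord` and `verifyAuditChain`, the `'GENESIS'` seed value, and the hash-chaining technique itself are assumptions added here. Hash chaining is one common way to make tampering with earlier entries detectable when a truly immutable store is not available.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: builds one audit record for an AI invocation and links
// it to the previous record by hash, so edits to earlier entries are detectable.
function buildAuditRecord(prevHash, { userId, assetId, modelVersion, timestamp }) {
  const entry = { userId, assetId, modelVersion, timestamp, prevHash };
  const hash = crypto
    .createHash('sha256')
    .update(JSON.stringify(entry))
    .digest('hex');
  return { ...entry, hash };
}

// Recomputes every hash and checks that each record points at its predecessor.
function verifyAuditChain(records, genesisHash = 'GENESIS') {
  let expectedPrev = genesisHash;
  for (const record of records) {
    const { hash, ...entry } = record;
    const recomputed = crypto
      .createHash('sha256')
      .update(JSON.stringify(entry))
      .digest('hex');
    if (entry.prevHash !== expectedPrev || recomputed !== hash) {
      return false;
    }
    expectedPrev = hash;
  }
  return true;
}

// Usage: append two invocations, then verify the chain.
const first = buildAuditRecord('GENESIS', {
  userId: 'operator-17',
  assetId: 'CNC-L2-M03',
  modelVersion: 'phi-3.5-mini',
  timestamp: '2026-01-30T14:23:18Z',
});
const second = buildAuditRecord(first.hash, {
  userId: 'operator-17',
  assetId: 'PUMP-L1-H01',
  modelVersion: 'phi-3.5-mini',
  timestamp: '2026-01-30T14:35:22Z',
});
console.log(verifyAuditChain([first, second]));
```

In practice each record would be appended to durable storage (for example one JSON object per line) and the chain verified during compliance reviews.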
Key Takeaways

Building production-ready AI systems for industrial environments requires architectural decisions that prioritize operational reliability, data sovereignty, and integration simplicity:

Data locality by architectural design: On-premises AI ensures proprietary production data never leaves facility networks through fundamental architectural guarantees rather than configuration options.
Model selection impacts deployment feasibility: Smaller models (0.5B-2B parameters) enable deployment on commodity hardware without specialized accelerators while maintaining acceptable accuracy.
Fallback logic preserves operational continuity: AI capabilities enhance but don't replace core monitoring functions, ensuring equipment dashboards display raw telemetry even when AI analysis is unavailable.
Context-rich prompts determine accuracy: Effective prompts include equipment specifications, normal operating ranges, alert thresholds, and maintenance history to enable grounded recommendations.
Structured outputs enable automation: JSON response formats allow automated systems to parse classifications and route work orders without fragile text parsing.
Integration patterns bridge legacy systems: OPC-UA and Modbus TCP gateways connect decades-old PLCs and SCADA systems to modern AI without replacing functional control infrastructure.

Resources and Further Exploration

The complete implementation with extensive comments and documentation is available in the GitHub repository. Additional resources help facilities customize and extend the system for their specific requirements.
FoundryLocal-IndJSsample GitHub Repository – Full source code with JavaScript backend, HTML frontend, and sample data files
Quick Start Guide and Documentation – Installation instructions, API documentation, and troubleshooting guidance
Microsoft Foundry Local Documentation – Official SDK reference, model catalog, and deployment guidance
Sample Manufacturing Data – Example equipment telemetry, maintenance logs, and alert structures
Backend Implementation Reference – Express server code with Foundry Local SDK integration patterns
OPC Foundation – Industrial communication standards for SCADA and PLC integration
Edge AI for Beginners – Free online course and resources for learning more about using AI on edge devices

Why On-Premises AI

Cloud AI services offer convenience, but they fundamentally conflict with manufacturing operational requirements. Understanding these conflicts explains why local AI isn't just preferable but mandatory for production environments.

Data privacy and intellectual property protection are paramount. A CNC machining program represents years of optimization: feed rates, tool paths, thermal compensation algorithms. Quality control measurements reveal product specifications competitors would pay millions to access. Sending this data to external APIs, even with encryption, creates unacceptable exposure risk. Every API call generates logs on third-party servers, potentially subject to subpoenas, data breaches, or regulatory compliance failures.

Latency requirements eliminate cloud viability for real-time decisions. When a thermal sensor detects bearing temperature exceeding safe thresholds, the control system needs AI analysis in under 50 milliseconds to prevent catastrophic failure. Cloud APIs introduce 100-500ms baseline latency from network round-trips alone, before queue times and processing. For safety systems, quality inspection, and process control, this latency is operationally unacceptable.
Network dependency creates operational fragility. Factory floors frequently have limited connectivity because of legacy equipment, RF interference, and isolated production cells. Critical AI capabilities cannot fail because internet service drops. Moreover, many defense, aerospace, and pharmaceutical facilities operate air-gapped networks for security compliance. Cloud AI is simply non-operational in these environments.

Regulatory requirements mandate data residency. ITAR (International Traffic in Arms Regulations) prohibits certain manufacturing data from leaving approved facilities. FDA 21 CFR Part 11 requires strict data handling controls for pharmaceutical manufacturing. GDPR restricts transfers of personal data outside approved jurisdictions. On-premises AI simplifies compliance by eliminating cross-border data transfers.

Cost predictability at scale favors local deployment. A high-volume facility generating 10,000 equipment events per day, each requiring AI analysis, would incur significant cloud API costs. Local models have fixed infrastructure costs that scale economically with usage, making AI economically viable for continuous monitoring.

Application Architecture: Web UI + Local AI Backend

The FoundryLocal-IndJSsample implements a clean separation between data presentation and AI inference. This architecture ensures the UI remains responsive while AI operations run independently, enabling real-time dashboard updates without blocking user interactions.

The web frontend serves a single-page application built with vanilla HTML, CSS, and JavaScript, with no frameworks and no build tools. This simplicity is intentional: factory IT teams need to audit code, customize interfaces, and deploy on legacy systems. The UI presents four main interfaces: Plant Asset Overview (real-time health cards for all equipment), Asset Health (AI-generated summaries and trend analysis), Maintenance Logs (classification and priority routing), and AI Assistant (natural language interface for operations queries).
The Node.js backend runs Express as the HTTP server, handling static file serving, API routing, and WebSocket connections for real-time updates. It loads sample manufacturing data (equipment telemetry, maintenance logs, historical events) from JSON files, simulating the data streams that would come from SCADA systems, PLCs, and MES platforms in production.

Foundry Local provides the AI inference layer. The backend uses foundry-local-sdk to communicate with the locally running service. All model loading, prompt processing, and response generation happens on-device. The application detects Foundry Local automatically and falls back to rule-based analysis if it is unavailable, ensuring core functionality persists even when AI is offline.

Here's the architectural flow for asset health analysis:

User Request (Web UI)
↓
Express API Route (/api/assets/:id/summary)
↓
Load Equipment Data (from JSON/database)
↓
Build Analysis Prompt (Equipment ID, telemetry, alerts)
↓
Foundry Local SDK Call (local AI inference)
↓
Parse AI Response (structured insights)
↓
Return JSON Result (with metadata: model, latency, confidence)
↓
Display in UI (formatted health summary)

This architecture demonstrates several industrial system design principles:

Offline-first operation: Core functionality works without internet connectivity, with AI as an enhancement rather than a dependency
Graceful degradation: If AI fails, fall back to rule-based logic rather than crashing operations
Minimal external dependencies: Simple stack reduces attack surface and simplifies air-gapped deployment
Data locality: All processing happens on-premises, with no external API calls
Real-time updates: WebSocket connections enable push-based event streaming for dashboard updates

Setting Up Foundry Local for Industrial Applications

Industrial deployments require careful model selection that balances accuracy, speed, and hardware constraints.
Factory edge devices often run on limited hardware: industrial PCs with modest GPUs or CPU-only configurations. Model choice significantly impacts deployment feasibility.

Install Foundry Local on the industrial edge device:

# Windows (most common for industrial PCs)
winget install Microsoft.FoundryLocal

# Verify installation
foundry --version

For manufacturing asset intelligence, model selection trades off speed versus quality:

# Fast option: Qwen 0.5B (500MB, <100ms inference)
foundry model load qwen2.5-0.5b

# Balanced option: Phi-3.5 Mini (2.1GB, ~200ms inference)
foundry model load phi-3.5-mini

# High quality option: Phi-4 Mini (3.6GB, ~500ms inference)
foundry model load phi-4

# Check which model is currently loaded
foundry model list

For real-time monitoring dashboards where hundreds of assets update continuously, qwen2.5-0.5b provides sufficient quality at speeds that don't bottleneck refresh cycles. For detailed root cause analysis or maintenance report generation where quality matters most, phi-4 justifies the slightly longer inference time.
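The speed-versus-quality guidance above could be captured in a small routing helper. This is a sketch under stated assumptions: the workload names (`dashboard`, `rootCause`, `report`), the `selectModel` function, and the fallback order are hypothetical and not part of the sample repository. The model aliases are the ones used in the commands above.

```javascript
// Hypothetical mapping from workload type to a preferred model alias.
// A Map avoids accidental hits on inherited object properties.
const PREFERRED_MODEL = new Map([
  ['dashboard', 'qwen2.5-0.5b'], // high-frequency refresh: speed first
  ['rootCause', 'phi-4'],        // detailed analysis: quality first
  ['report', 'phi-4'],           // maintenance report generation
]);

const DEFAULT_MODEL = 'phi-3.5-mini'; // balanced option

// Returns the preferred alias if it is available locally, otherwise the
// balanced default, otherwise any available alias, or null if none exist.
function selectModel(workload, availableAliases) {
  const preferred = PREFERRED_MODEL.get(workload) ?? DEFAULT_MODEL;
  if (availableAliases.includes(preferred)) return preferred;
  if (availableAliases.includes(DEFAULT_MODEL)) return DEFAULT_MODEL;
  return availableAliases[0] ?? null;
}

// Usage: only two models were cached on this (hypothetical) edge device.
const cached = ['qwen2.5-0.5b', 'phi-3.5-mini'];
console.log(selectModel('dashboard', cached));
console.log(selectModel('rootCause', cached));
```

Keeping the mapping in one place means a facility can change which model serves which workload without touching route handlers.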
Industrial systems benefit from proactive model caching during downtime: # During maintenance windows, pre-download models foundry model download phi-3.5-mini foundry model download qwen2.5-0.5b # Models cache locally, eliminating runtime downloads The backend automatically detects Foundry Local and selects the loaded model: // backend/services/foundry-service.js import { FoundryLocalClient } from 'foundry-local-sdk'; class FoundryService { constructor() { this.client = null; this.modelAlias = null; this.initializeClient(); } async initializeClient() { try { // Detect Foundry Local endpoint const endpoint = process.env.FOUNDRY_LOCAL_ENDPOINT || 'http://127.0.0.1:5272'; this.client = new FoundryLocalClient({ endpoint }); // Query which model is currently loaded const models = await this.client.models.list(); this.modelAlias = models.data[0]?.id || 'phi-3.5-mini'; console.log(`✅ Foundry Local connected: ${this.modelAlias}`); } catch (error) { console.warn('⚠️ Foundry Local not available, using rule-based fallback'); this.client = null; } } async generateCompletion(prompt, options = {}) { if (!this.client) { // Fallback to rule-based analysis return this.ruleBasedAnalysis(prompt); } try { const startTime = Date.now(); const completion = await this.client.chat.completions.create({ model: this.modelAlias, messages: [ { role: 'system', content: 'You are an industrial asset intelligence assistant analyzing manufacturing equipment.' 
}, { role: 'user', content: prompt } ], temperature: 0.3, // Low temperature for factual analysis max_tokens: 400, ...options }); const latency = Date.now() - startTime; return { content: completion.choices[0].message.content, model: this.modelAlias, latency_ms: latency, tokens: completion.usage?.total_tokens }; } catch (error) { console.error('Foundry inference error:', error); return this.ruleBasedAnalysis(prompt); } } ruleBasedAnalysis(prompt) { // Fallback logic for when AI is unavailable // Pattern matching and heuristics return { content: '(Rule-based analysis) Equipment status: Monitoring...', model: 'rule-based-fallback', latency_ms: 5, tokens: 0 }; } } export default new FoundryService(); This service layer demonstrates critical production patterns: Automatic endpoint detection: Tries environment variable first, falls back to default Model auto-discovery: Queries Foundry Local for currently loaded model rather than hardcoding Robust error handling: Every API call wrapped in try-catch with fallback logic Performance tracking: Latency measurement enables monitoring and capacity planning Conservative temperature: 0.3 temperature reduces hallucination for factual equipment analysis Implementing AI-Powered Asset Health Analysis Equipment health monitoring forms the core use case, synthesizing telemetry from multiple sources into actionable insights. Traditional monitoring systems show raw metrics (temperature, vibration, pressure) but require expert interpretation. AI transforms this into natural language summaries that any operator can understand and act upon. 
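The `ruleBasedAnalysis` fallback in the service above is left as a stub. A minimal sketch of what a threshold-based fallback might look like follows. It assumes the caller passes structured telemetry rather than a prompt string, and the function name `ruleBasedHealthCheck` and the limit field names (`maxTemperatureC`, `maxVibrationMmS`) are hypothetical. The return shape mirrors the object the service returns so that callers would not need to distinguish the two paths.

```javascript
// Hypothetical threshold-based fallback for when AI inference is unavailable.
// Compares raw telemetry against configured limits and reports any breaches.
function ruleBasedHealthCheck(telemetry, limits) {
  const findings = [];

  if (telemetry.temperature > limits.maxTemperatureC) {
    findings.push(
      `temperature ${telemetry.temperature}°C exceeds ${limits.maxTemperatureC}°C`
    );
  }
  if (telemetry.vibration > limits.maxVibrationMmS) {
    findings.push(
      `vibration ${telemetry.vibration} mm/s exceeds ${limits.maxVibrationMmS} mm/s`
    );
  }

  const content = findings.length === 0
    ? '(Rule-based analysis) All monitored values within configured limits.'
    : `(Rule-based analysis) Attention required: ${findings.join('; ')}.`;

  // Same field names as the AI path, plus the raw findings for the UI.
  return { content, model: 'rule-based-fallback', latency_ms: 0, tokens: 0, findings };
}

// Usage with illustrative values.
const result = ruleBasedHealthCheck(
  { temperature: 92, vibration: 2.8 },
  { maxTemperatureC: 80, maxVibrationMmS: 4.0 }
);
console.log(result.content);
```

A fallback like this cannot explain causes the way a model can, but it keeps threshold breaches visible on the dashboard while inference is offline.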
Here's the API endpoint that generates asset health summaries: // backend/routes/assets.js import express from 'express'; import foundryService from '../services/foundry-service.js'; import { getAssetData } from '../data/asset-loader.js'; const router = express.Router(); router.get('/api/assets/:id/summary', async (req, res) => { try { const assetId = req.params.id; // Load equipment data const asset = await getAssetData(assetId); if (!asset) { return res.status(404).json({ error: 'Asset not found' }); } // Build analysis prompt with context const prompt = buildHealthAnalysisPrompt(asset); // Generate AI summary const analysis = await foundryService.generateCompletion(prompt); // Structure response res.json({ asset_id: assetId, asset_name: asset.name, summary: analysis.content, model_used: analysis.model, latency_ms: analysis.latency_ms, timestamp: new Date().toISOString(), telemetry_snapshot: { temperature: asset.telemetry.temperature, vibration: asset.telemetry.vibration, runtime_hours: asset.telemetry.runtime_hours }, active_alerts: asset.alerts.filter(a => a.active).length }); } catch (error) { console.error('Asset summary error:', error); res.status(500).json({ error: 'Analysis failed' }); } }); function buildHealthAnalysisPrompt(asset) { return ` Analyze the health of this manufacturing equipment and provide a concise summary: Equipment: ${asset.name} (${asset.id}) Type: ${asset.type} Location: ${asset.location} Current Telemetry: - Temperature: ${asset.telemetry.temperature}°C (Normal: ${asset.specs.normal_temp_range}) - Vibration: ${asset.telemetry.vibration} mm/s (Threshold: ${asset.specs.vibration_threshold}) - Operating Pressure: ${asset.telemetry.pressure} PSI - Runtime: ${asset.telemetry.runtime_hours} hours - Last Maintenance: ${asset.maintenance.last_service_date} Active Alerts: ${asset.alerts.map(a => `- ${a.severity}: ${a.message}`).join('\n')} Recent Events: ${asset.recent_events.slice(0, 3).map(e => `- ${e.timestamp}: 
${e.description}`).join('\n')} Provide a 3-4 sentence summary covering: 1. Overall equipment health status 2. Any concerning trends or anomalies 3. Recommended actions or monitoring focus Be factual and specific. Do not speculate beyond the provided data. `.trim(); } export default router; This prompt construction demonstrates several best practices for industrial AI: Structured data presentation: Organize telemetry, specs, and alerts in clear sections with labels Context enrichment: Include normal operating ranges so AI can assess abnormality Explicit constraints: Instruction to avoid speculation reduces hallucination risk Output formatting guidance: Request specific structure (3-4 sentences, covering key points) Temporal context: Include recent events so AI understands trend direction Example AI-generated asset summary: { "asset_id": "CNC-L2-M03", "asset_name": "CNC Mill #3", "summary": "Equipment is operating outside normal parameters with elevated temperature at 92°C, significantly above the 75-80°C normal range. Thermal Alert indicates possible coolant flow issue. Vibration levels remain acceptable at 2.8 mm/s. Recommend immediate inspection of coolant system and thermal throttling may impact throughput until resolved.", "model_used": "phi-3.5-mini", "latency_ms": 243, "timestamp": "2026-01-30T14:23:18Z", "telemetry_snapshot": { "temperature": 92, "vibration": 2.8, "runtime_hours": 12847 }, "active_alerts": 2 } This summary transforms raw telemetry into actionable intelligence—operations staff immediately understand the problem, its severity, and the appropriate response, without requiring deep equipment expertise. Maintenance Log Classification with AI Maintenance departments generate hundreds of logs daily, technician notes, operator observations, inspection reports. Manually categorizing and prioritizing these logs consumes significant time. 
AI classification automatically routes logs to appropriate teams, identifies urgent issues, and extracts key information. The classification endpoint processes maintenance notes: // backend/routes/maintenance.js router.post('/api/logs/classify', async (req, res) => { try { const { log_text, equipment_id } = req.body; if (!log_text || log_text.length < 10) { return res.status(400).json({ error: 'Log text required (min 10 chars)' }); } const classificationPrompt = ` Classify this maintenance log entry into appropriate categories and priority: Equipment: ${equipment_id || 'Unknown'} Log Text: "${log_text}" Classify into EXACTLY ONE primary category: - MECHANICAL: Physical components, bearings, belts, motors - ELECTRICAL: Power systems, sensors, controllers, wiring - HYDRAULIC: Pumps, fluid systems, pressure issues - THERMAL: Cooling, heating, temperature control - SOFTWARE: PLC programming, HMI issues, control logic - ROUTINE: Scheduled maintenance, inspections, calibration Assign priority level: - CRITICAL: Immediate action required, safety or production impact - HIGH: Resolve within 24 hours, performance degradation - MEDIUM: Schedule within 1 week, minor issues - LOW: Routine maintenance, cosmetic issues Extract key details: - Symptoms described - Suspected root cause (if mentioned) - Recommended actions Return ONLY a JSON object with this exact structure: { "category": "MECHANICAL", "priority": "HIGH", "symptoms": ["grinding noise", "vibration above 5mm/s"], "suspected_cause": "bearing wear", "recommended_actions": ["inspect bearings", "order replacement parts"] } `.trim(); const analysis = await foundryService.generateCompletion(classificationPrompt); // Parse AI response as JSON let classification; try { // Extract JSON from response (AI might add explanation text) const jsonMatch = analysis.content.match(/\{[\s\S]*\}/); classification = JSON.parse(jsonMatch[0]); } catch (parseError) { // Fallback parsing if JSON extraction fails classification = 
parseClassificationText(analysis.content); } // Validate classification const validCategories = ['MECHANICAL', 'ELECTRICAL', 'HYDRAULIC', 'THERMAL', 'SOFTWARE', 'ROUTINE']; const validPriorities = ['CRITICAL', 'HIGH', 'MEDIUM', 'LOW']; if (!validCategories.includes(classification.category)) { classification.category = 'ROUTINE'; } if (!validPriorities.includes(classification.priority)) { classification.priority = 'MEDIUM'; } res.json({ original_log: log_text, classification, model_used: analysis.model, latency_ms: analysis.latency_ms, timestamp: new Date().toISOString() }); } catch (error) { console.error('Classification error:', error); res.status(500).json({ error: 'Classification failed' }); } }); function parseClassificationText(text) { // Fallback parser for when AI doesn't return valid JSON // Extract category, priority, and details using regex patterns const categoryMatch = text.match(/category[":]\s*(MECHANICAL|ELECTRICAL|HYDRAULIC|THERMAL|SOFTWARE|ROUTINE)/i); const priorityMatch = text.match(/priority[":]\s*(CRITICAL|HIGH|MEDIUM|LOW)/i); return { category: categoryMatch ? categoryMatch[1].toUpperCase() : 'ROUTINE', priority: priorityMatch ? 
priorityMatch[1].toUpperCase() : 'MEDIUM', symptoms: [], suspected_cause: 'Unknown', recommended_actions: [] }; } This implementation demonstrates several critical patterns for structured AI outputs: Explicit output format requirements: Prompt specifies exact JSON structure to encourage parseable responses Defensive parsing: Try JSON extraction first, fall back to text parsing if that fails Validation with sensible defaults: Validate categories and priorities against allowed values, default to safe values on mismatch Constrained classification vocabulary: Limit categories to predefined set rather than open-ended categories Priority inference rules: Guide AI to assess urgency based on safety, production impact, and timeline Example classification output: POST /api/logs/classify { "log_text": "Hydraulic pump PUMP-L1-H01 making grinding noise during startup. Vibration readings spiked to 5.2 mm/s this morning. Possible bearing wear. Recommend inspection.", "equipment_id": "PUMP-L1-H01" } Response: { "original_log": "Hydraulic pump PUMP-L1-H01 making grinding noise...", "classification": { "category": "MECHANICAL", "priority": "HIGH", "symptoms": ["grinding noise during startup", "vibration spike to 5.2 mm/s"], "suspected_cause": "bearing wear", "recommended_actions": ["inspect bearings", "schedule replacement if confirmed worn"] }, "model_used": "phi-3.5-mini", "latency_ms": 187, "timestamp": "2026-01-30T14:35:22Z" } This classification automatically routes the log to the mechanical maintenance team, marks it high priority for same-day attention, and extracts actionable details, all without human intervention. Building the Natural Language Operations Assistant The AI Assistant interface enables operations staff to query equipment status, ask diagnostic questions, and get contextual guidance using natural language. This interface bridges the gap between complex SCADA systems and operators who need quick answers without navigating multiple screens. 
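Before moving on to the chat endpoint, the defensive parsing and validation pattern from the classification route is worth isolating into a pure function, since that makes it easy to unit test without a running model. This is a sketch, not the repository's code: the name `normalizeClassification` is hypothetical, and for brevity it defaults directly to safe values instead of using the regex-based text fallback shown earlier.

```javascript
const VALID_CATEGORIES = ['MECHANICAL', 'ELECTRICAL', 'HYDRAULIC', 'THERMAL', 'SOFTWARE', 'ROUTINE'];
const VALID_PRIORITIES = ['CRITICAL', 'HIGH', 'MEDIUM', 'LOW'];

// Hypothetical helper: extracts the first {...} span from model output,
// parses it, and coerces every field to an allowed value or a safe default.
function normalizeClassification(rawText) {
  let parsed = {};
  const match = rawText.match(/\{[\s\S]*\}/);
  if (match) {
    try {
      parsed = JSON.parse(match[0]);
    } catch {
      parsed = {}; // malformed JSON: fall through to defaults
    }
  }

  return {
    category: VALID_CATEGORIES.includes(parsed.category) ? parsed.category : 'ROUTINE',
    priority: VALID_PRIORITIES.includes(parsed.priority) ? parsed.priority : 'MEDIUM',
    symptoms: Array.isArray(parsed.symptoms) ? parsed.symptoms : [],
    suspected_cause:
      typeof parsed.suspected_cause === 'string' ? parsed.suspected_cause : 'Unknown',
    recommended_actions: Array.isArray(parsed.recommended_actions)
      ? parsed.recommended_actions
      : [],
  };
}

// Usage: the model wrapped its JSON in extra explanation text.
const reply = 'Sure! {"category":"MECHANICAL","priority":"HIGH","symptoms":["grinding noise"]} Hope this helps.';
console.log(normalizeClassification(reply));
```

Because the function never throws and always returns the full shape, the route handler can pass its result straight to `res.json` regardless of what the model produced.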
The chat endpoint implements contextual conversation: // backend/routes/chat.js router.post('/api/chat', async (req, res) => { try { const { message, conversation_id } = req.body; if (!message || message.length < 3) { return res.status(400).json({ error: 'Message required (min 3 chars)' }); } // Load conversation history if exists const history = conversation_id ? await loadConversationHistory(conversation_id) : []; // Build context from current plant state const plantContext = await buildPlantContext(); // Construct system prompt with operational context const systemPrompt = ` You are an operations assistant for a manufacturing facility. Answer questions about equipment status, maintenance, and operational issues. Current Plant Status: ${plantContext} Guidelines: - Provide specific, actionable answers based on current data - Reference specific equipment IDs when relevant - Suggest appropriate next steps for issues - If information is unavailable, say so clearly - Use concise language suitable for busy operators Do not speculate about issues without data to support it. 
`.trim(); // Build message chain with history const messages = [ { role: 'system', content: systemPrompt }, ...history.map(h => ({ role: h.role, content: h.content })), { role: 'user', content: message } ]; // Generate response const response = await foundryService.generateCompletion( message, { messages } // Pass full conversation context ); // Save conversation const newConversationId = conversation_id || generateConversationId(); await saveConversationTurn(newConversationId, message, response.content); res.json({ response: response.content, conversation_id: newConversationId, model_used: response.model, latency_ms: response.latency_ms, timestamp: new Date().toISOString() }); } catch (error) { console.error('Chat error:', error); res.status(500).json({ error: 'Chat failed' }); } }); async function buildPlantContext() { const assets = await getAllAssets(); // Summarize plant-wide status const criticalAlerts = assets.flatMap(a => a.alerts.filter(alert => alert.severity === 'CRITICAL') ); const equipmentStatus = assets.map(a => `${a.id}: ${a.status} (Temp: ${a.telemetry.temperature}°C, Alerts: ${a.alerts.length})` ); return ` Total Equipment: ${assets.length} Critical Alerts: ${criticalAlerts.length} Equipment Status Summary: ${equipmentStatus.join('\n')} `.trim(); } Example conversational interactions: User: "What's wrong with Line 2?" AI: "Line 2 has elevated issues. CNC-L2-M03 is running 12°C above normal temperature at 92°C with a thermal alert active. This has caused 28% below-target throughput as the equipment throttles to prevent damage. The coolant system requires immediate inspection—flow rate dropped to 3.2 L/min from the normal 5.5 L/min, indicating a potential blockage or pump failure. Other Line 2 equipment (CONV-L2-B01, PLC-L2-MAIN) is operating normally." User: "How's the hydraulic pump performing?" AI: "PUMP-L1-H01 has a high-priority maintenance alert. 
Grinding noise was reported during startup with vibration at 5.2 mm/s, exceeding the 4.0 mm/s threshold. This pattern suggests bearing wear. Current operational status shows the pump is running but degraded. Recommend inspection today to prevent catastrophic failure. Bearing replacement typically requires 2-4 hours downtime."

User: "What should I prioritize this shift?"
AI: "Priority order for this shift: 1) CNC-L2-M03 coolant inspection (CRITICAL - impacting production, safety risk), 2) PUMP-L1-H01 bearing check (HIGH - prevent failure progression), 3) Routine inspections on CONV-L1-B02 (scheduled maintenance). Focus immediate attention on Line 2 thermal issues. Line 1 is stable."

The assistant provides grounded, specific responses citing actual equipment IDs, current metrics, and actionable priorities. This data-grounded approach maintains operational credibility: operators quickly learn to trust responses that reference specific equipment conditions they can independently verify on SCADA systems.

Real-World Industrial Scenario: Thermal Event Response

To understand the system's practical value, consider how it supports operators during an actual equipment incident. This scenario simulates realistic manufacturing conditions where multiple issues develop simultaneously and operators need rapid situational awareness.

Scenario Setup: Assembly Line Degradation

During Shift B operations on a Tuesday evening at a precision manufacturing facility, multiple equipment anomalies develop on production lines. The plant overview dashboard begins displaying elevated alert counts. Telemetry monitoring detects concerning patterns across Assembly Line 2 equipment. CNC mill CNC-L2-M03 reports a sustained high temperature of 92°C, compared to its normal operating range of 65-75°C. Coolant flow sensors indicate a reduction of roughly 40% from baseline levels, dropping from 5.5 L/min to 3.2 L/min.
Equipment throughput has decreased 28% as thermal management systems automatically throttle spindle speeds to prevent damage. The mill has accumulated 12,847 runtime hours, with the next scheduled maintenance in 153 hours.

Simultaneously, hydraulic pump PUMP-L1-H01 on Line 1 logged a maintenance note during startup: "Grinding noise observed during pump startup, vibration spike detected." Automated vibration sensors registered 5.2 mm/s compared to the 4.0 mm/s alarm threshold. The pump's condition monitoring system flagged bearing wear as the probable cause based on frequency analysis of vibration signatures.

The shift supervisor, monitoring the plant overview dashboard, sees both alerts flagged as high priority. A traditional response would require accessing SCADA screens, checking telemetry history, reviewing maintenance logs across multiple systems, correlating patterns, and consulting equipment experts, consuming 15-20 minutes of investigation time while equipment conditions potentially worsen.

AI-Assisted Response Workflow

Instead, the supervisor uses the AI Assistant interface to rapidly assess the situation. The workflow demonstrates how local AI transforms incident response:

Step 1 - Initial Assessment Query: The supervisor types "What's the current plant status?" The AI assistant analyzes the complete equipment database and returns:

"Plant status as of 22:14: 14 assets monitored, 2 high-priority alerts active. Critical issues: CNC-L2-M03 thermal event (92°C, coolant flow reduced 40%, throughput down 28%). High-priority: PUMP-L1-H01 mechanical degradation (bearing wear suspected, 5.2 mm/s vibration). All other equipment operating within normal parameters. Recommend immediate attention to Line 2 thermal issue because of production impact and potential equipment damage risk."

Step 2 - Detailed Equipment Analysis: The supervisor navigates to the Asset Health tab and selects CNC-L2-M03 for detailed analysis.
Within 1.8 seconds, the AI summary generates:

"CNC-L2-M03 Health Summary: DEGRADED - IMMEDIATE ACTION REQUIRED

Current Status: Elevated thermal condition with active coolant flow issue. Operating temperature 92°C exceeds the top of the normal range (65-75°C) by about 23%. Coolant flow rate 3.2 L/min indicates 42% reduction from baseline 5.5 L/min.

Impact Assessment: Throughput reduced 28% due to automatic thermal throttling. Spindle speed limited to 65% capacity to prevent thermal damage. Current production rate: 18 parts/hour vs. target 25 parts/hour.

Root Cause Analysis: Coolant flow reduction suggests blockage in cooling circuit or coolant pump degradation. Temperature rise pattern correlates with gradual flow reduction observed over past 6 hours.

Recommended Actions:
1. Immediate: Verify coolant reservoir level and pump operation
2. Inspect: Check cooling circuit for blockages or debris
3. Monitor: Temperature trend every 15 minutes until resolved
4. Consider: Production pause if temperature exceeds 95°C

Maintenance Priority: HIGH - Production impact and equipment damage risk if unresolved. Estimated repair time: 1-2 hours for coolant system service."

Step 3 - Maintenance Log Classification: When the pump maintenance note is entered, the AI classification system analyzes it and returns a structured classification within 240ms:

{
  "classification": {
    "category": "MECHANICAL",
    "priority": "HIGH",
    "symptoms": [
      "grinding noise during startup",
      "vibration spike to 5.2 mm/s",
      "frequency pattern indicates bearing wear"
    ],
    "suspected_cause": "bearing wear - inner race degradation probable",
    "recommended_actions": [
      "schedule bearing inspection within 8 hours",
      "prepare replacement bearings (6208-2RS)",
      "plan maintenance window: 2-4 hours",
      "order backup pump if repair extended"
    ],
    "estimated_downtime": "2-4 hours for bearing replacement"
  }
}

Step 4 - Decision Support and Prioritization: The supervisor asks "What should I prioritize tonight?"
The AI synthesizes all context:

"Priority recommendations for Shift B:

IMMEDIATE (next 30 minutes):
- CNC-L2-M03: Dispatch maintenance to coolant system NOW. Production impacted, equipment damage risk if temperature continues rising. Impact: $2,400/hour revenue loss at current throughput reduction.

HIGH (within 4 hours):
- PUMP-L1-H01: Schedule bearing inspection for line changeover at 02:00. Pump operational but degraded, so plan bearing replacement parts and backup pump. Risk: Catastrophic failure if bearing fully degrades.

STANDARD (this week):
- CONV-L1-B02: Scheduled inspection per maintenance calendar
- Routine: Daily checklist completion for all lines

Resource allocation: Focus maintenance team lead on CNC thermal issue. Assign technician to prepare pump bearing replacement kit during current production run."

Outcome: Rapid Response Enabled by Local AI

The AI-assisted workflow reduced incident assessment time from 15-20 minutes to under 3 minutes. The supervisor immediately dispatched maintenance to investigate CNC-L2-M03's coolant system, which had been identified as the root cause. Technicians found debris blocking a cooling circuit junction; a 5-minute clearance procedure restored coolant flow to 5.3 L/min. Equipment temperature dropped to 73°C within 10 minutes, and full production throughput resumed.

For the hydraulic pump, bearing inspection was scheduled during the planned line changeover at 02:00, preventing an emergency production stoppage. Bearings were replaced preemptively, avoiding the catastrophic pump failure that would have caused 6-8 hours of unplanned downtime.

Total downtime avoided: 8+ hours. Revenue protection: ~$48,000 based on the facility's production value. All decisions were made with AI running entirely on a local edge device, with no cloud dependency, no data exposure, and no network latency impact. The complete incident response workflow operated on facility-controlled infrastructure with full data sovereignty.
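The percentages quoted during the incident (temperature above range, coolant flow reduction, throughput loss) follow directly from the raw readings. The snippet below is a minimal sketch that recomputes them from the figures given in the scenario; the helper names `percentDeviation` and `round1` are hypothetical and not taken from the sample repository.

```javascript
// Hypothetical helpers; the input figures are the ones quoted in the scenario.

// Percentage by which `value` deviates from `reference` (positive = above).
function percentDeviation(value, reference) {
  return ((value - reference) / reference) * 100;
}

// Round to one decimal place for display.
function round1(x) {
  return Math.round(x * 10) / 10;
}

const scenario = {
  temperatureC: 92,        // current CNC-L2-M03 temperature
  normalMaxC: 75,          // top of the normal 65-75°C range
  coolantFlowLpm: 3.2,     // current coolant flow
  coolantBaselineLpm: 5.5, // baseline coolant flow
  partsPerHour: 18,        // current production rate
  targetPartsPerHour: 25,  // target production rate
};

const deviations = {
  temperatureOverMaxPct: round1(
    percentDeviation(scenario.temperatureC, scenario.normalMaxC)
  ),
  coolantFlowPct: round1(
    percentDeviation(scenario.coolantFlowLpm, scenario.coolantBaselineLpm)
  ),
  throughputPct: round1(
    percentDeviation(scenario.partsPerHour, scenario.targetPartsPerHour)
  ),
};

console.log(deviations);
// { temperatureOverMaxPct: 22.7, coolantFlowPct: -41.8, throughputPct: -28 }
```

Computing these values in code and passing them into the prompt, rather than asking a small model to do the arithmetic, is one way to keep the quoted numbers consistent across summaries.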
Key Takeaways for Manufacturing AI Deployment

Building production-ready AI systems for industrial environments requires architectural decisions that prioritize operational reliability, data sovereignty, and integration pragmatism over cutting-edge model sophistication. Several critical lessons emerge from implementing on-premises manufacturing intelligence:

Data locality through architectural guarantee: On-premises AI ensures proprietary production data never leaves facility networks, not through configuration but through fundamental architecture. There are no cloud API calls to misconfigure, no data upload features to accidentally enable, no external endpoints to compromise. This physical data boundary satisfies security audits and competitive protection requirements with demonstrable certainty rather than contractual assurance.

Model selection determines deployment feasibility: Smaller models (0.5B-2B parameters) enable deployment on commodity server hardware without specialized AI accelerators. These models provide sufficient accuracy for industrial classification, summarization, and conversational assistance while maintaining the sub-3-second response times essential for operator acceptance. Larger models improve nuance but require GPU infrastructure and longer inference times that may not justify marginal accuracy gains for operational decision-making.

Graceful degradation preserves operations: AI capabilities enhance but never replace core monitoring functions. Equipment dashboards must display raw telemetry, alert states, and historical trends even when AI analysis is unavailable. This architectural separation ensures operations continue during AI service maintenance, model updates, or system failures. AI becomes value-add intelligence rather than a critical dependency.

Context-rich prompts determine accuracy: Generic prompts produce generic responses unsuitable for operational decisions. Effective industrial prompts include equipment specifications, normal operating ranges, alert thresholds, maintenance history, and temporal context. This structured context enables models to provide grounded, specific recommendations citing actual equipment conditions rather than hallucinated speculation. Prompt engineering matters more than model size for operational accuracy.

Structured outputs enable automation: JSON response formats with predefined fields allow automated systems to parse classifications, severity levels, and recommended actions without fragile natural language parsing. Maintenance management systems can automatically route work orders, trigger alerts, and update dashboards based on AI classification results. This structured integration scales AI beyond human-read summaries into automated workflow systems.

Integration patterns bridge legacy and modern: OPC-UA clients and Modbus TCP gateways connect decades-old PLCs and SCADA systems to modern AI backends without replacing functional control infrastructure. This evolutionary approach enables AI adoption without massive capital equipment replacement. Manufacturing facilities can augment existing investments rather than ripping and replacing proven systems.

Responsible AI through grounding and constraints: Industrial AI must acknowledge limits and avoid speculation beyond available data. System prompts should explicitly instruct models: "If you don't have information to answer, clearly state that" and "Do not speculate about equipment conditions beyond provided data." This reduces hallucination risk and maintains operator trust. Operators must verify AI recommendations against domain expertise; position AI as decision support that augments human judgment, not as a replacement for it.
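The "structured outputs" and "graceful degradation" takeaways can be combined in a small parsing step. The following is a sketch only: the field names follow the classification JSON shown in the scenario, while the allowed priority list, the fallback shape, and the function name `parseClassification` are assumptions made for illustration rather than code from the sample repository.

```javascript
// Assumed set of priorities; only HIGH and CRITICAL appear in the article.
const ALLOWED_PRIORITIES = ['LOW', 'MEDIUM', 'HIGH', 'CRITICAL'];

// Assumed fallback used whenever the model output cannot be trusted.
const FALLBACK = {
  category: 'UNCLASSIFIED',
  priority: 'UNKNOWN',
  recommended_actions: [],
  needs_human_review: true,
};

function parseClassification(rawText) {
  if (typeof rawText !== 'string') return { ...FALLBACK };

  // Models sometimes wrap JSON in extra prose, so take the outermost braces.
  const start = rawText.indexOf('{');
  const end = rawText.lastIndexOf('}');
  if (start === -1 || end <= start) return { ...FALLBACK };

  let parsed;
  try {
    parsed = JSON.parse(rawText.slice(start, end + 1));
  } catch {
    return { ...FALLBACK }; // malformed JSON: degrade instead of throwing
  }

  // Validate the fields that downstream automation relies on.
  const c = parsed.classification;
  const valid =
    c &&
    typeof c.category === 'string' &&
    ALLOWED_PRIORITIES.includes(c.priority) &&
    Array.isArray(c.recommended_actions);

  if (!valid) return { ...FALLBACK };

  return {
    category: c.category,
    priority: c.priority,
    recommended_actions: c.recommended_actions,
    needs_human_review: false,
  };
}
```

With this shape, a work-order router can branch on `needs_human_review` instead of guessing at free text, and a failed parse never blocks the raw maintenance note from being stored.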
Getting Started: Installation and Deployment

Implementing the manufacturing intelligence system requires Foundry Local installation, Node.js backend deployment, and frontend hosting. This is achievable within a few hours for facilities with existing IT infrastructure and server hardware.

Prerequisites and System Requirements

Hardware requirements depend on the selected AI models. The minimum configuration supports the Phi-3.5-mini model (2.1GB):
- 8GB RAM, 4-core CPU (Intel Core i5/AMD Ryzen 5 or better)
- 50GB available storage for model files and application data
- Windows 11 or Windows Server 2025

Recommended production configuration:
- 16GB+ RAM (supports larger models and concurrent requests)
- 8-core CPU or NVIDIA GPU (RTX 3060/4060 or better for 3-5x inference acceleration)
- 100GB SSD storage
- Gigabit network interface for intra-facility communication

Software prerequisites:
- Node.js 18 or newer (download from nodejs.org or install via system package manager)
- Git for repository cloning
- Modern web browser (Chrome, Edge, Firefox) for frontend access
- Windows: PowerShell 5.1+
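Before installing anything, the minimum requirements above can be checked with a short Node.js script. This is a minimal sketch, not part of the sample repository: the function name `checkPrerequisites` is hypothetical, the thresholds are the minimums listed above, and `os.cpus().length` reports logical rather than physical cores, so treat the CPU check as approximate.

```javascript
const os = require('node:os');

// Minimums taken from the prerequisites listed above.
const MIN_NODE_MAJOR = 18;
const MIN_RAM_GB = 8;
const MIN_CPU_CORES = 4;

// Pure function so it can be run against any machine description.
function checkPrerequisites({ nodeVersion, totalMemBytes, cpuCount }) {
  const problems = [];

  const nodeMajor = Number.parseInt(nodeVersion.split('.')[0], 10);
  if (!(nodeMajor >= MIN_NODE_MAJOR)) {
    problems.push(`Node.js ${MIN_NODE_MAJOR}+ required, found ${nodeVersion}`);
  }

  // Machines sold as "8GB" report slightly less usable memory, so round.
  const ramGb = Math.round(totalMemBytes / 1024 ** 3);
  if (ramGb < MIN_RAM_GB) {
    problems.push(`${MIN_RAM_GB}GB RAM required, found about ${ramGb}GB`);
  }

  if (cpuCount < MIN_CPU_CORES) {
    problems.push(`${MIN_CPU_CORES} CPU cores required, found ${cpuCount}`);
  }

  return problems;
}

// Check the machine this script is running on.
const problems = checkPrerequisites({
  nodeVersion: process.versions.node,
  totalMemBytes: os.totalmem(),
  cpuCount: os.cpus().length, // logical cores
});

console.log(
  problems.length === 0 ? 'Prerequisites look OK' : problems.join('\n')
);
```

Disk space and operating system version are left out of the sketch because checking them portably needs platform-specific calls.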
Foundry Local Installation and Model Setup

Install Foundry Local using the package manager appropriate for your system:

# Windows installation via winget
winget install Microsoft.FoundryLocal

# macOS installation via Homebrew
brew install microsoft/foundrylocal/foundrylocal

# Verify installation
foundry --version

Download AI models based on hardware capabilities and accuracy requirements:

# Fast option: Qwen 0.5B (500MB, 100-200ms inference)
foundry model download qwen2.5-0.5b

# Balanced option: Phi-3.5 Mini (2.1GB, 1-3 second inference)
foundry model download phi-3.5-mini

# High quality option: Phi-4 Mini (3.6GB, 2-5 second inference)
foundry model download phi-4-mini

# Check downloaded models
foundry model list

Load a model into the Foundry Local service:

# Load default recommended model
foundry model run phi-3.5-mini

# Verify service is running and model is loaded
foundry service status

The Foundry Local service will start automatically and expose a REST API on localhost:8008 (default port). The backend application connects to this endpoint for all AI inference operations.

Backend Service Deployment

Clone the repository and install dependencies:

# Clone from GitHub
git clone https://github.com/leestott/FoundryLocal-IndJSsample.git
cd FoundryLocal-IndJSsample

# Navigate to backend directory
cd backend

# Install Node.js dependencies
npm install

# Start the backend service
npm start

The backend server will initialize and display startup messages:

Manufacturing AI Backend Starting...
✓ Foundry Local client initialized: http://localhost:8008
✓ Model detected: phi-3.5-mini
✓ Sample data loaded: 6 assets, 12 maintenance logs
✓ Server running on port 3000
✓ Frontend accessible at: http://localhost:3000
Health check: http://localhost:3000/api/health

Verify backend health:

# Test backend API
curl http://localhost:3000/api/health
# Expected response: {"ok":true,"service":"manufacturing-ai-backend"}

# Test Foundry Local integration
curl http://localhost:3000/api/models/status
# Expected response: {"serviceRunning":true,"model":"phi-3.5-mini"}

Frontend Access and Validation

Open the web interface by navigating to web/index.html in a browser or starting from the backend URL:

# Windows: Open frontend directly
start http://localhost:3000

# macOS/Linux: Open frontend
open http://localhost:3000
# or
xdg-open http://localhost:3000

The web interface displays a navigation bar with four main sections:

Overview: Plant-wide dashboard showing all equipment with health status cards, alert counts, and "Load Scenario" button to populate sample data
Asset Health: Equipment selector dropdown, telemetry display, active alerts list, and "Generate AI Summary" button for detailed analysis
Maintenance: Text area for maintenance log entry, "Classify Log" button, and classification result display showing category, priority, and recommendations
AI Assistant: Chat interface with message input, conversation history, and natural language query capabilities

Running the Sample Scenario

Test the complete system with included sample data:

Load scenario data: Click "Load Scenario Inputs" button in Overview tab. This populates equipment database with CNC-L2-M03 thermal event, PUMP-L1-H01 vibration alert, and baseline telemetry for all assets.
Generate asset summary: Navigate to Asset Health tab, select "CNC-L2-M03" from dropdown, click "Generate AI Analysis".
Within 2-3 seconds, a detailed health summary appears explaining the thermal condition, coolant flow issue, impacts, and recommended actions.
Classify maintenance note: Go to the Maintenance tab and enter the text: "Grinding noise on startup, vibration 5.2 mm/s, suspect bearing wear". Click "Classify Log". The AI categorizes it as MECHANICAL/HIGH priority with specific repair recommendations.
Ask operational questions: Open the AI Assistant tab and type "What's wrong with Line 2?" or "Which equipment needs attention?" The AI responds with specific equipment IDs, current conditions, and a prioritized action list.

Production Deployment Considerations

For actual manufacturing facility deployment, several additional configurations apply:

Hardware placement: Deploy the backend service on a dedicated server within the manufacturing network zone. Avoid co-locating AI workloads with critical SCADA/MES systems due to resource contention. Use a physical server or a VM with direct hardware access for GPU acceleration.

Network configuration: The backend should reside behind the facility firewall with access restricted to internal networks. Do not expose the AI service directly to the internet; use a VPN for remote access if required. Implement authentication via Active Directory/LDAP integration. Configure firewall rules permitting connections from operator workstations and monitoring systems only.

Data integration: Replace sample JSON data with connections to actual data sources. Implement an OPC-UA client for SCADA integration, connect to the MES database for production schedules, and integrate with the CMMS for maintenance history. The code includes placeholder functions for external data source integration; customize them for facility-specific systems.

Model selection: Choose an appropriate model based on hardware and accuracy requirements. Start with phi-3.5-mini for production deployment. Upgrade to phi-4-mini if analysis quality needs improvement and the hardware supports it.
Use qwen2.5-0.5b for high-throughput scenarios where speed matters more than nuanced understanding. Test all models against validation scenarios before production promotion.

Monitoring and maintenance: Implement health checks monitoring Foundry Local service status, backend API responsiveness, model inference latency, and error rates. Set up alerting for when inference latency exceeds thresholds or the service is unavailable. Establish procedures for model updates during planned maintenance windows. Keep audit logs of all AI invocations for compliance and troubleshooting.

Resources and Further Learning

The complete implementation with detailed comments, sample data, and documentation provides a foundation for building custom manufacturing intelligence systems. Additional resources support extension and adaptation to specific facility requirements.

FoundryLocal-IndJSsample GitHub Repository - Complete source code with JavaScript backend, HTML/CSS/JS frontend, sample manufacturing data, and comprehensive README
Installation and Configuration Guide - Detailed setup instructions, API documentation, troubleshooting procedures, and deployment guidance
Microsoft Foundry Local Documentation - Official SDK reference, model catalog, hardware requirements, and performance tuning guidance
Sample Manufacturing Data Format - JSON structure examples for equipment telemetry, maintenance logs, alert definitions, and operational events
Backend Implementation Reference - Express server architecture, Foundry Local SDK integration patterns, API endpoint implementations, and error handling
OPC Foundation - Industrial communication standards (OPC-UA, OPC DA) for SCADA system integration and PLC connectivity
ISA Standards - International Society of Automation standards for industrial systems, SCADA architecture, and manufacturing execution systems
EdgeAI for Beginners - Learn more about Edge AI using these course materials

The manufacturing intelligence implementation demonstrates that sophisticated AI capabilities can run entirely on-premises without compromising operational requirements. Facilities gain predictive maintenance insights, natural language operational support, and automated equipment analysis while maintaining complete data sovereignty, zero network dependency, and deterministic performance characteristics essential for production environments.

📣 MSLE Office Hours - Português
Hello, 👋 I hope you are all well! Do you have questions about the MSLE Program? We have the answers! Join our MSLE Office Hours: a space to connect, learn, and receive personalized support. ✅ Get your questions about the program answered ✅ Explore resources and best practices ✅ Connect with other educators and with our MSLE team. Bring your questions, ideas, and curiosity. We are here to help you make the most of your MSLE experience! At the scheduled time, please join using the link: Teams meeting.