Running Phi-4 Locally with Microsoft Foundry Local: A Step-by-Step Guide
In our previous post, we explored how Phi-4 represents a new frontier in AI efficiency, delivering performance comparable to models 5x its size while being small enough to run on your laptop. Today, we're taking the next step: getting Phi-4 up and running locally on your machine using Microsoft Foundry Local. Whether you're a developer building AI-powered applications, an educator exploring AI capabilities, or simply curious about running state-of-the-art models without relying on cloud APIs, this guide will walk you through the entire process. Microsoft Foundry Local brings the power of Azure AI Foundry to your local device without requiring an Azure subscription, making local AI development more accessible than ever.

Why Run Phi-4 Locally?

Before we dive into the setup, let's quickly recap why running models locally matters:

- Privacy and Control: Your data never leaves your machine. This is crucial for sensitive applications in healthcare, finance, or education where data privacy is paramount.
- Cost Efficiency: No API costs, no rate limits. Once you have the model downloaded, inference is completely free.
- Speed and Reliability: No network latency or dependency on external services. Your AI applications work even when you're offline.
- Learning and Experimentation: Full control over model parameters, prompts, and fine-tuning opportunities without restrictions.

With Phi-4's compact size, these benefits are now accessible to anyone with a modern laptop—no expensive GPU required.

What You'll Need

Before we begin, make sure you have:

- Operating System: Windows 10/11, macOS (Intel or Apple Silicon), or Linux
- RAM: Minimum 16GB (32GB recommended for optimal performance)
- Storage: At least 5-10GB of free disk space
- Processor: Any modern CPU (GPU optional but provides faster inference)

Note: Phi-4 works remarkably well even on consumer hardware 😀.
Step 1: Installing Microsoft Foundry Local

Microsoft Foundry Local is designed to make running AI models locally as simple as possible. It handles model downloads, manages memory efficiently, provides OpenAI-compatible APIs, and automatically optimizes for your hardware.

For Windows users, open PowerShell or Command Prompt and run:

```
winget install Microsoft.FoundryLocal
```

For macOS users (Apple Silicon), open Terminal and run:

```
brew install microsoft/foundrylocal/foundrylocal
```

To verify the installation, run the command below. It should return the Microsoft Foundry Local version, confirming the installation:

```
foundry --version
```

Step 2: Downloading Phi-4-Mini

For this tutorial, we'll use Phi-4-mini, the lightweight 3.8-billion-parameter version that's perfect for learning and experimentation. Open your terminal and run:

```
foundry model run phi-4-mini
```

You should see the download begin, with progress reported in the terminal.

Available Phi Models on Foundry Local

While we're using phi-4-mini for this guide, Foundry Local offers several Phi model variants (and other open-source models) optimized for different hardware and use cases:

| Model | Hardware | Type | Size | Best For |
|---|---|---|---|---|
| phi-4-mini | GPU | chat-completion | 3.72 GB | Learning, fast responses, resource-constrained environments with GPU |
| phi-4-mini | CPU | chat-completion | 4.80 GB | Learning, fast responses, CPU-only systems |
| phi-4-mini-reasoning | GPU | chat-completion | 3.15 GB | Reasoning tasks with GPU acceleration |
| phi-4-mini-reasoning | CPU | chat-completion | 4.52 GB | Mathematical proofs, logic puzzles with lower resource requirements |
| phi-4 | GPU | chat-completion | 8.37 GB | Maximum reasoning performance, complex tasks with GPU |
| phi-4 | CPU | chat-completion | 10.16 GB | Maximum reasoning performance, CPU-only systems |
| phi-3.5-mini | GPU | chat-completion | 2.16 GB | Most lightweight option with GPU support |
| phi-3.5-mini | CPU | chat-completion | 2.53 GB | Most lightweight option, CPU-optimized |
| phi-3-mini-128k | GPU | chat-completion | 2.13 GB | Extended context (128k tokens), GPU-optimized |
| phi-3-mini-128k | CPU | chat-completion | 2.54 GB | Extended context (128k tokens), CPU-optimized |
| phi-3-mini-4k | GPU | chat-completion | 2.13 GB | Standard context (4k tokens), GPU-optimized |
| phi-3-mini-4k | CPU | chat-completion | 2.53 GB | Standard context (4k tokens), CPU-optimized |

Note: Foundry Local automatically selects the best variant for your hardware. If you have an NVIDIA GPU, it will use the GPU-optimized version; otherwise, it will use the CPU-optimized version. Run the command below to see the full list of models:

```
foundry model list
```

Step 3: Test It Out

Once the download completes, an interactive session will begin. Let's test Phi-4-mini's capabilities with a few different prompts.

Example 1: Explanation

Phi-4-mini provides a thorough, well-structured explanation. It starts with the basic definition, explains the process in biological systems, and gives real-world examples (plant cells, human blood cells). The response is detailed yet accessible.

Example 2: Mathematical Problem Solving

An excellent step-by-step solution. Phi-4-mini breaks down the problem methodically:

1. Distributes on the left side
2. Isolates the variable terms
3. Simplifies progressively
4. Arrives at the final answer: x = 11

The model shows its work clearly, making it easy to follow the logic, and ideal for educational purposes.

Example 3: Code Generation

The model provides a concise Python function using string slicing ([::-1]), the most Pythonic approach to reversing a string. It includes clear documentation with a docstring explaining the function's purpose, provides example usage demonstrating the output, and even explains how the slicing notation works under the hood. The response shows that the model understands not just how to write the code, but why this approach is preferred, noting that the [::-1] slice notation means "start at the end of the string and end at position 0, move with the step -1, negative one, which means one step backwards."
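A function matching that description would look something like the following (a sketch based on the description above, not the model's verbatim output):

```python
def reverse_string(text):
    """Return the input string reversed.

    Uses slice notation [::-1]: start at the end of the string
    and step backwards one character at a time.
    """
    return text[::-1]


# Example usage
print(reverse_string("hello"))  # prints "olleh"
```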
This showcases the model's ability to generate production-ready code with proper documentation while being educational about Python idioms. To exit the interactive session, type `/bye`.

Step 4: Extending Phi-4 with Real-Time Tools

Understanding Phi-4's Knowledge Cutoff

Like all language models, Phi-4 has a knowledge cutoff date from its training data (typically several months old). This means it won't know about very recent events, current prices, or breaking news. For example, if you ask "Who won the 2024 NBA championship?" it might not have the answer. The good news is that there's a powerful workaround: while Phi-4 is incredibly capable on its own, connecting it to external tools like web search, databases, or APIs transforms it from a static knowledge base into a dynamic reasoning engine. This is where Microsoft Foundry's REST API comes in. Microsoft Foundry provides a simple API that lets you integrate Phi-4 into Python applications and connect it to real-time data sources. Here's a practical example: building a web-enhanced AI assistant.

Web-Enhanced AI Assistant

This simple application combines Phi-4's reasoning with real-time web search, allowing it to answer current questions accurately.

Prerequisites:

```
pip install foundry-local-sdk requests ddgs
```

Create phi4_web_assistant.py:

```python
import requests
from foundry_local import FoundryLocalManager
from ddgs import DDGS
import json


def search_web(query):
    """Search the web and return top results"""
    try:
        results = list(DDGS().text(query, max_results=3))
        if not results:
            return "No search results found."
        search_summary = "\n\n".join([
            f"[Source {i+1}] {r['title']}\n{r['body'][:500]}"
            for i, r in enumerate(results)
        ])
        return search_summary
    except Exception as e:
        return f"Search failed: {e}"


def ask_phi4(endpoint, model_id, prompt):
    """Send a prompt to Phi-4 and stream the response"""
    response = requests.post(
        f"{endpoint}/chat/completions",
        json={
            "model": model_id,
            "messages": [{"role": "user", "content": prompt}],
            "stream": True
        },
        stream=True,
        timeout=180
    )
    full_response = ""
    for line in response.iter_lines():
        if line:
            line_text = line.decode('utf-8')
            if line_text.startswith('data: '):
                line_text = line_text[6:]  # Remove 'data: ' prefix
            if line_text.strip() == '[DONE]':
                break
            try:
                data = json.loads(line_text)
                if 'choices' in data and len(data['choices']) > 0:
                    delta = data['choices'][0].get('delta', {})
                    if 'content' in delta:
                        chunk = delta['content']
                        print(chunk, end="", flush=True)
                        full_response += chunk
            except json.JSONDecodeError:
                continue
    print()
    return full_response


def web_enhanced_query(question):
    """Combine web search with Phi-4 reasoning"""
    # By using an alias, the most suitable model will be downloaded
    # to your device automatically
    alias = "phi-4-mini"

    # Create a FoundryLocalManager instance. This will start the Foundry
    # Local service if it is not already running and load the specified model.
    manager = FoundryLocalManager(alias)
    model_info = manager.get_model_info(alias)

    print("🔍 Searching the web...\n")
    search_results = search_web(question)

    prompt = f"""Here are recent search results:

{search_results}

Question: {question}

Using only the information above, give a clear answer with specific details."""

    print("🤖 Phi-4 Answer:\n")
    return ask_phi4(manager.endpoint, model_info.id, prompt)


if __name__ == "__main__":
    # Try different questions
    question = "Who won the 2024 NBA championship?"
    # question = "What is the latest iPhone model released in 2024?"
    # question = "What is the current price of Bitcoin?"

    print(f"Question: {question}\n")
    print("=" * 60 + "\n")
    web_enhanced_query(question)
    print("\n" + "=" * 60)
```

Run it:

```
python phi4_web_assistant.py
```

What Makes This Powerful

By connecting Phi-4 to external tools, you create an intelligent system that:

- Accesses Real-Time Information: Get news, weather, sports scores, and breaking developments
- Verifies Facts: Cross-reference information with multiple sources
- Extends Capabilities: Connect to databases, APIs, file systems, or any other tool
- Enables Complex Applications: Build research assistants, customer support bots, educational tutors, and personal assistants

This same pattern can be applied to connect Phi-4 to:

- Databases: Query your company's internal data
- APIs: Weather services, stock prices, translation services
- File Systems: Analyze documents and spreadsheets
- IoT Devices: Control smart home systems

The possibilities are endless when you combine local AI reasoning with real-world data access.

Troubleshooting Common Issues

- Service not running: Make sure Foundry Local is properly installed and its service is running; run foundry --version to verify the installation.
- Model downloads slowly: Check your internet connection and ensure you have enough disk space (5-10GB per model).
- Out of memory: Close other applications, or try a smaller model variant such as phi-3.5-mini instead of the full phi-4.
- Connection issues: Verify that no other services are using the same ports. Foundry Local typically runs on http://localhost:5272.
- Model not found: Run foundry model list to see available models, then use foundry model run <model-name> to download and run a specific model.

Your Next Steps with Foundry Local

Congratulations! You now have Phi-4 running locally through Microsoft Foundry Local and understand how to extend it with external tools like web search. This combination of local AI reasoning with real-time data access opens up countless possibilities for building intelligent applications.
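As a closing note, the streaming loop in the ask_phi4 function above boils down to parsing OpenAI-style server-sent events. That per-line logic can be isolated into a small helper that is easy to unit-test offline (a sketch mirroring the parsing already shown; this helper is not part of the Foundry SDK):

```python
import json


def extract_sse_content(line_text):
    """Extract the content delta from one OpenAI-style SSE line.

    Returns the text chunk, '[DONE]' for the end-of-stream marker,
    or None for lines that carry no content.
    """
    if not line_text.startswith('data: '):
        return None
    payload = line_text[6:]  # strip the 'data: ' prefix
    if payload.strip() == '[DONE]':
        return '[DONE]'
    try:
        data = json.loads(payload)
    except json.JSONDecodeError:
        return None
    choices = data.get('choices', [])
    if choices:
        return choices[0].get('delta', {}).get('content')
    return None


# Example: one chunk from a streamed chat completion
chunk = 'data: {"choices": [{"delta": {"content": "Hi"}}]}'
print(extract_sse_content(chunk))  # prints "Hi"
```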
Coming in Future Posts

In the coming weeks, we'll explore advanced topics using Hugging Face:

- Fine-tuning Phi models on your own data for domain-specific applications
- Phi-4-multimodal: Analyze images, process audio, and combine multiple data types
- Advanced deployment patterns: RAG systems and multi-agent orchestration

Resources to Explore

- EdgeAI for Beginners Course: Comprehensive 36-45 hour course covering Edge AI fundamentals, optimization, and production deployment
- Phi-4 Technical Report: Deep dive into architecture and benchmarks
- Phi Cookbook on GitHub: Practical examples and recipes
- Foundry Local Documentation: Complete technical documentation and API reference
- Module 08: Foundry Local Toolkit: 10 comprehensive samples including RAG applications and multi-agent systems

Keep experimenting with Foundry Local, and stay tuned as we unlock the full potential of Edge AI! What will you build with Phi-4? Share your ideas and projects in the comments below!

Ability to Manage or Delete single autocomplete form data in Edge browser (new for 2025)
Someone asked exactly this question back in 2023, requesting the feature when it may already have existed without the OP knowing. Three months after the post, the OP wrote that they'd found the method and marked the thread as solved; it involved a few simple keystrokes. See here: https://techcommunity.microsoft.com/discussions/edgeinsiderdiscussions/ability-to-manager-or-delete-single-autocomplete-form-data-in-edge-browser-/3984284

That old post has attracted a few extra comments since it was marked as solved, because the method described has been removed in more recent builds of Edge. I find the removal of this method frustrating, and I believe it invites security issues. For example, more than once I've accidentally typed a password into a username box. Now when I visit the site and try to log in, it shows my password in plain text on screen. Users have to search through autocomplete data to correct this, instead of resolving the issue with one click (as I believe it works in Chrome) or with the old Edge method of arrowing down to the suggestion and pressing Delete while holding Shift. Could this feature be brought back in a future Edge build? It was a valuable and easy-to-use part of old Edge, and I miss it very much; the sooner the better, please.

Stolen session token from Edge
We can steal the session token from Edge by using tools like Burp Suite or Fiddler to intercept proxy traffic on a mobile phone, even when Edge is MAM-protected by Intune. This makes the Edge browser unsafe to use for enterprise applications on personal mobile devices. Recently I discovered the token protection feature in Conditional Access policy (https://learn.microsoft.com/en-us/entra/identity/conditional-access/concept-token-protection); however, it is only available for Windows. Does anyone know when it will become available for mobile on the Entra roadmap? Also, if you know of any Edge configuration I could use to stop token theft, please let me know! Thank you, everyone.

PWA shortcuts don't stay associated with window - Linux
I recently upgraded to version 140.0.3485.14 (Official build) beta (64-bit) and noticed that my PWAs, even though they open correctly in an independent window, don't show the "running window counter" on the taskbar. So, for example, if I open Outlook, the window opens and loads the app, but if I click the icon again, it opens another window, which is a little annoying. I tried removing the PWA from Edge, rebooting, and installing it again, but that didn't work. Because the only change I made was updating Edge, I think it is related to this build. (Too lazy to go back or install the stable release.)

Go Links on Edge Mobile
Dear community members, we use Intune-managed computers and Zscaler, which delivers the DNS search domain. When a user types https://go/links in the Edge browser, it automatically appends the FQDN in the address bar to become https://go.mycompanydomain.com/links. It is quite common practice for enterprises to provide convenient access to internal shortened URLs this way. With Intune-managed mobile devices (which also have Zscaler), can we achieve the same goal in Edge mobile? For the mobile use case, it is less about typing the go links directly into the browser: a lot of go links are shared in email and chats from communications and newsletters, and when users click them in Outlook or Teams on the phone, they open in Edge. I am hoping that when Edge opens these links, it automatically appends the search domain as on computers. I have looked through all the Intune device and Edge documentation and chatted with three different LLMs, but couldn't figure out a solution. All ideas are welcome! Thanks. Best regards

Edge Dev CreateProcess bug
I'm using Edge Dev Version 143.0.3638.1 (Official build) dev (64-bit), downloaded from https://msedge.sf.dl.delivery.mp.microsoft.com/filestreamingservice/files/3b9b41d3-7038-420a-bb07-66fe54926f0f/MicrosoftEdgeDevEnterpriseX64.msi

Opening a new Edge instance with the following code is no longer working:

```cpp
BOOL bRet = ::CreateProcess(
    (LPCTSTR)L"C:\\Program Files (x86)\\Microsoft\\Edge Dev\\Application\\msedge.exe",
    (LPWSTR)L" --new-window www.bing.com",
    NULL, NULL, FALSE, CREATE_DEFAULT_ERROR_MODE, NULL, NULL, &si, &pi);
```

I can see the process starting and then terminating quickly. It does work with Edge Version 141.0.3537.99 (Official build) (64-bit). Is this the correct place to report and track the bug?

Extension ID: gdndpilddmlahjjcfmknlmindbklnbel Meeting Scheduler
The Meeting Scheduler extension is flagged with a warning that it contains malware. I am not aware of how it got installed. The source of the extension is listed as the Microsoft Edge Add-ons Store, but I'm not able to find it in the store. Can someone help triage how it could have been installed, and what this extension is?

Camera and Mic Site Permissions
edge settings/privacy/sitePermissions/allPermissions/camera
edge settings/privacy/sitePermissions/allPermissions/microphone

I was looking to add some sites/URLs that are automatically permitted to access both the camera and mic, to stop the age-old "I said no and now can't use service X" type of service calls from coming in. I added these two policies to the admin template:

- Sites that can access audio capture devices without requesting permission
- Sites that can access video capture devices without requesting permission

Both succeed in delivery to the device, but the sites don't appear in the Edge site permission list as expected. Have I got the wrong policy? Is it broken?

Microsoft Edge PDF Reader - signature validation warning for official signatures
Since the last Edge update we have been running into the following issue, which triggers support tickets on our side. It seems that the Edge internal PDF reader now supports validation of digital PDF signatures. This new feature, however, is incomplete and causes a lot of confusion for Edge users. "Official" signatures are labeled as "Unknown Signature" with "The validity of the signature is still unknown". "Official signatures" here means qualified signatures from members of the EUTL, the European trust list as defined in ETSI TS 119 612 ("Trusted Lists"): [...] the European Commission publishes a central list with links to the locations where the national trusted lists are published as notified by Member States. This central list, called the List Of Trusted Lists (LOTL), is available both in a human-readable format and in a format suitable for automated (machine) processing (XML). Parts of the Edge PDF reader seem to come from *dobe itself, which adds additional confusion, since in their products (*dobe Reader, Acrobat, etc.) the signatures are correctly verified; they have supported this list for several years now (next to their own AATL list). I tried to find out whether this feature is work in progress or a bug, but the last published roadmap is quite old and doesn't give helpful information on this aspect. So my question is: is full support for PDF signature validation, including support of the EUTL, planned, and if so, when?

Edge PDF Viewer does not support page=Fit (only FitV or FitH works)
Dear community, I found that the Edge PDF viewer does not allow displaying a PDF so that its first page fully fits into the window (no part of the PDF outside the viewport). You can use the following URL to verify: https://creativelab.berkeley.edu/wp-content/uploads/2019/12/1920X1080-HORIZONTAL-template.pdf#page=1&view=Fit

Test case 1: When the window width gets too small, the window displays a horizontal scrollbar instead of scaling the PDF to fit the width.

Test case 2: When the window height gets too small, the window displays a vertical scrollbar instead of scaling the PDF to fit the height.

When I use either FitV or FitH, the respective behaviour is as expected. But I need to combine both behaviours, so Fit would be the right option, which Edge seems to ignore. When I test the same parameters in other browsers that use a Chromium engine, it works.

Microsoft Edge Version 141.0.3537.85 (Official Build) (64-bit), Windows 11.

Thanks for any advice and thoughts! If there is a dedicated place to create a bug report, please let me know.
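For anyone reproducing this, the URL under test just combines PDF open parameters in the fragment; a throwaway helper (hypothetical, only for building test URLs) shows the syntax:

```python
def pdf_open_url(base_url, page=1, view="Fit"):
    """Build a PDF link with open parameters in the URL fragment.

    'view' may be Fit, FitH, or FitV per the PDF open-parameters
    convention; per the report above, Edge honors FitH/FitV but
    appears to ignore Fit.
    """
    return f"{base_url}#page={page}&view={view}"


# Example: the test URL from the report
url = pdf_open_url(
    "https://creativelab.berkeley.edu/wp-content/uploads/2019/12/"
    "1920X1080-HORIZONTAL-template.pdf"
)
print(url)
```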