Blog Post

Educator Developer Blog
1 MIN READ

From Cloud to Edge: Navigating the Future of AI with LLMs, SLMs, and Azure AI Foundry

Lee_Stott
Microsoft
Jun 06, 2025

As AI continues to evolve, the need to understand and deploy the right models, whether large or small, has never been more critical. At the recent Microsoft AI Tour, we explored the latest in generative AI, from Large Language Models (LLMs) to Small Language Models (SLMs), and the tools that make them accessible and impactful.

Use Cases: From Automation to Edge AI


Generative AI is transforming industries through:

  • Content creation, summarization, and translation
  • Customer engagement via chatbots and personalization
  • Edge deployment for low-latency, privacy-sensitive applications
  • Domain-specific tasks like legal, medical, or technical document processing

LLMs vs. SLMs: Choosing the Right Fit

| Feature     | LLMs                                 | SLMs                              |
|-------------|--------------------------------------|-----------------------------------|
| Parameters  | Billions (e.g., GPT-4)               | Millions                          |
| Performance | High accuracy, nuanced understanding | Fast, efficient for simpler tasks |
| Deployment  | Cloud-based, resource-intensive      | Ideal for edge and mobile         |
| Cost        | High compute and energy              | Cost-effective                    |


SLMs are increasingly viable thanks to optimized runtimes and hardware, making them perfect for on-device AI.
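The trade-offs in the table above can be captured as a simple decision helper. This is an illustrative heuristic only; the function name and inputs are assumptions for this sketch, not part of any Azure AI Foundry API.

```python
def choose_model_tier(on_device: bool, needs_nuance: bool, budget_limited: bool) -> str:
    """Map deployment constraints from the LLM/SLM comparison to a model tier.

    Hypothetical helper for illustration; thresholds and names are assumptions.
    """
    if on_device:
        return "SLM"  # edge/mobile: small, efficient, privacy-friendly
    if needs_nuance and not budget_limited:
        return "LLM"  # cloud-scale accuracy at higher compute and energy cost
    return "SLM"      # default to the cost-effective option

print(choose_model_tier(on_device=True, needs_nuance=True, budget_limited=False))   # SLM
print(choose_model_tier(on_device=False, needs_nuance=True, budget_limited=False))  # LLM
```

In practice the choice is rarely binary: many solutions route simple requests to an SLM and escalate harder ones to an LLM.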

Azure AI Foundry: Your AI Launchpad

Azure AI Foundry offers:

  • A model catalogue with open-source and proprietary models
  • Tools for fine-tuning, evaluation, and deployment
  • Integration with GitHub, VS Code, and Azure DevOps
  • Support for both cloud and local inferencing

Local AI: The Edge Advantage

With tools like Foundry Local and Windows AI Foundry, developers can:

  • Run models on-device with ONNX Runtime
  • Use APIs for summarization, translation, and more
  • Optimize for CPU, GPU, and NPU
  • Ensure privacy, low latency, and offline capability
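The on-device flow above can be sketched with the ONNX Runtime Python API. The model file `model.onnx`, its input shape, and the tokenizer step are placeholders for whatever model you export or download; only the `onnxruntime` calls themselves are real API.

```python
# Sketch of local inference with ONNX Runtime; "model.onnx" and the token
# IDs below are placeholder assumptions, not files shipped with Foundry Local.

def pad_or_truncate(token_ids, max_len=8, pad_id=0):
    """Prepare a fixed-length input, as many exported ONNX text models expect."""
    ids = list(token_ids)[:max_len]
    return ids + [pad_id] * (max_len - len(ids))

if __name__ == "__main__":
    import numpy as np
    import onnxruntime as ort  # pip install onnxruntime

    # Execution providers are tried left to right; swap in GPU/NPU providers
    # (e.g. "CUDAExecutionProvider") where the hardware supports them.
    session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
    ids = np.array([pad_or_truncate([101, 2023, 102])], dtype=np.int64)
    outputs = session.run(None, {session.get_inputs()[0].name: ids})
    print(outputs[0].shape)
```

Because inference stays on the device, no prompt data leaves the machine, which is what enables the privacy, latency, and offline benefits listed above.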

Customization: RAG vs. Fine-Tuning

| Feature            | RAG                      | Fine-Tuning           |
|--------------------|--------------------------|-----------------------|
| Knowledge Updates  | Dynamic                  | Static                |
| Interpretability   | High                     | Low                   |
| Latency            | Higher                   | Lower                 |
| Hallucination Risk | Lower                    | Moderate              |
| Use Case           | Real-time, external data | Domain-specific tasks |

Both methods enhance model relevance: RAG by retrieving external data at query time, and fine-tuning by adapting the model's weights.
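The RAG side of the comparison can be sketched in a few lines: retrieve the most relevant document, then ground the prompt in it. The word-overlap scoring and the toy corpus below are assumptions for illustration; production systems use embedding-based retrieval (e.g. Azure AI Search).

```python
# Minimal RAG sketch: retrieval by word overlap, then prompt grounding.
# The corpus and scoring function are toy assumptions, not a Foundry API.

def retrieve(query: str, docs: list[str]) -> str:
    """Return the document sharing the most words with the query."""
    q = set(query.lower().split())
    return max(docs, key=lambda d: len(q & set(d.lower().split())))

def build_prompt(query: str, docs: list[str]) -> str:
    """Ground the model's answer in retrieved context rather than its weights."""
    context = retrieve(query, docs)
    return f"Context: {context}\nQuestion: {query}\nAnswer:"

docs = [
    "Foundry Local runs ONNX models on-device.",
    "Fine-tuning adapts model weights to a domain.",
]
print(build_prompt("How does fine-tuning work?", docs))
```

Because fresh documents can be added to `docs` at any time, knowledge updates are dynamic, which is exactly the advantage the table attributes to RAG.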

Developer Resources

Get started with:

 

Published Jun 06, 2025
Version 1.0