Azure Friday
Excited to share my latest open-source project: KubeCost Guardian
After seeing how many DevOps teams struggle with Kubernetes cost visibility on Azure, I built a full-stack cost optimization platform from scratch.

What it does:
✅ Real-time AKS cluster monitoring via Azure SDK
✅ Cost breakdown per namespace, node, and pod
✅ AI-powered recommendations generated from actual cluster state
✅ One-click optimization actions
✅ JWT-secured dashboard with full REST API

Tech Stack:
- React 18 + TypeScript + Vite
- Tailwind CSS + shadcn/ui + Recharts
- Node.js + Express + TypeScript
- Azure SDK (@azure/arm-containerservice)
- JWT Authentication + Azure Service Principal

What makes it different:
Most cost tools show you generic estimates. KubeCost Guardian reads your actual VM size, node count, and cluster configuration to generate recommendations that are specific to your infrastructure, not averages. For example, if your cluster has only 2 nodes with no autoscaler enabled, it immediately flags the HA risk and calculates exactly how much you'd save by switching to Spot instances, based on your actual VM size.

This project is fully open-source and built for the DevOps community.
⭐ GitHub: https://github.com/HlaliMedAmine/kubecost-guardian

This project represents hours of hard work and passion. I decided to make it open-source so everyone can benefit from it. If you find it useful, I'd really appreciate your support; it motivates me to keep building and sharing more powerful projects. More exciting ideas are coming soon… stay tuned!
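To make the "reads your actual VM size, node count, and cluster configuration" claim concrete, here is a minimal sketch of the kind of check the post describes, written against @azure/arm-containerservice. It is not code from the repository; the subscription, resource group, cluster name, and the two heuristics are illustrative assumptions.

```typescript
// Minimal sketch: read AKS node-pool configuration with the Azure SDK and flag
// the kinds of findings the post describes (few nodes, no autoscaler, Spot candidates).
import { DefaultAzureCredential } from "@azure/identity";
import { ContainerServiceClient } from "@azure/arm-containerservice";

async function reviewCluster(subscriptionId: string, resourceGroup: string, clusterName: string) {
  const client = new ContainerServiceClient(new DefaultAzureCredential(), subscriptionId);

  const findings: string[] = [];
  for await (const pool of client.agentPools.list(resourceGroup, clusterName)) {
    const nodes = pool.count ?? 0;

    // Two nodes (or fewer) without the cluster autoscaler is a high-availability risk.
    if (nodes <= 2 && !pool.enableAutoScaling) {
      findings.push(`${pool.name}: only ${nodes} node(s) and autoscaling disabled - HA risk`);
    }

    // Regular-priority pools are candidates for Spot, sized from the actual VM SKU.
    if (pool.scaleSetPriority !== "Spot") {
      findings.push(`${pool.name}: ${nodes} x ${pool.vmSize} on regular priority - evaluate Spot pricing`);
    }
  }
  return findings;
}

reviewCluster("<subscription-id>", "<resource-group>", "<aks-cluster>")
  .then((f) => f.forEach((line) => console.log(line)))
  .catch(console.error);
```

A real tool would then join the node-pool VM SKU against retail pricing to produce the savings estimate the post mentions.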
Building a Production-Ready Azure Lighthouse Deployment Pipeline with EPAC
Recently I worked on an interesting project: an end-to-end Azure Lighthouse implementation. What really stood out to me was the combination of Azure Lighthouse, EPAC, DevOps, and workload identity federation. The deployment model was so compelling that I decided to build and validate the full solution hands-on in my own personal Azure tenants. The result is a detailed article that documents the entire journey, including pipeline design, implementation steps, and the scripts I prepared along the way. You can read the full article here.
Pipeline Intelligence is live and open-source: real-time Azure DevOps monitoring powered by AI
Every DevOps team I've worked with had the same problem: slow pipelines. Zero visibility. No idea where to start. So I stopped complaining and built the solution.

Pipeline Intelligence is a full-stack Azure DevOps monitoring dashboard that:
✅ Connects to your real Azure DevOps organization via REST API
✅ Detects bottlenecks across all your pipelines automatically
✅ Calculates exactly how much time your team is wasting per month
✅ Uses Gemini AI to generate prioritized fixes with ready-to-paste YAML solutions
✅ Is JWT-secured, Docker-ready, and fully open-source

Tech Stack:
- React 18 + Vite + Tailwind CSS
- Node.js + Express + Azure DevOps API v7
- Google Gemini 1.5 Flash
- JWT Authentication + Docker

What makes it different?
Most tools show you generic estimates. Pipeline Intelligence reads your actual cluster config, node count, and pipeline structure and gives you recommendations specific to your infrastructure.

This year, I set myself a personal challenge: build and open-source a series of production-grade tools exclusively focused on Azure services, tools that solve real problems for real DevOps teams. This project represents weeks of research, architecture decisions, and late-night debugging sessions. I'm sharing it with the community because I believe great tooling should be accessible to everyone, not locked behind enterprise paywalls.

If this resonates with you, I have one simple ask: a like, a comment, or a share takes 3 seconds, but it helps this reach the DevOps engineers who need it most. Your support is what keeps me building.
GitHub: https://github.com/HlaliMedAmine/pipeline-intelligence
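As a rough sketch of the kind of bottleneck detection described above, the snippet below pulls completed runs from the Azure DevOps Builds REST API (v7.0) and averages durations per pipeline definition. The organization, project, and 20-minute threshold are placeholder assumptions, not values from the project.

```typescript
// Minimal sketch: list recent completed builds via the Azure DevOps REST API
// and flag pipeline definitions whose average duration exceeds a threshold.
const org = "<organization>";
const project = "<project>";
const pat = process.env.AZDO_PAT ?? ""; // personal access token, sent as Basic auth with an empty username

interface Build {
  definition: { name: string };
  startTime?: string;
  finishTime?: string;
}

async function findSlowPipelines(thresholdMinutes = 20): Promise<void> {
  const url = `https://dev.azure.com/${org}/${project}/_apis/build/builds?statusFilter=completed&$top=200&api-version=7.0`;
  const res = await fetch(url, {
    headers: { Authorization: `Basic ${Buffer.from(`:${pat}`).toString("base64")}` },
  });
  const { value: builds } = (await res.json()) as { value: Build[] };

  // Average duration per pipeline definition, in minutes.
  const totals = new Map<string, { sum: number; n: number }>();
  for (const b of builds) {
    if (!b.startTime || !b.finishTime) continue;
    const minutes = (Date.parse(b.finishTime) - Date.parse(b.startTime)) / 60000;
    const t = totals.get(b.definition.name) ?? { sum: 0, n: 0 };
    totals.set(b.definition.name, { sum: t.sum + minutes, n: t.n + 1 });
  }

  for (const [name, { sum, n }] of totals) {
    const avg = sum / n;
    if (avg > thresholdMinutes) {
      console.log(`${name}: average ${avg.toFixed(1)} min over ${n} runs - potential bottleneck`);
    }
  }
}

findSlowPipelines().catch(console.error);
```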
Azure password protection
We have a hybrid Azure infrastructure with Azure AD Connect installed on-prem and configured for PTA. We installed the password protection proxy server and registered it with the Azure tenant, then deployed the DC agent on all domain controllers. Both the proxy and agents are operational. We published a few banned words to block in case anyone uses them. For testing, I changed my password to include one of the banned words. To my surprise, I was able to change the password. I checked the corresponding logon server, and the DC event viewer showed that the password was validated, even though it contained a word from the banned password list that Azure is set to enforce. Why is it not blocking the change?
Applying DevOps Principles on Lean Infrastructure: Lessons From Scaling to 102K Users
Hi Azure Community, I'm a Microsoft Certified DevOps Engineer, and I want to share an unusual journey. I have been applying DevOps principles on traditional VPS infrastructure to scale to 102,000 users with 99.2% uptime. Why am I posting this in an Azure community? Because I'm planning a migration to Azure in 2026, and I want to understand: what mistakes am I already making that will bite me during migration?

THE CURRENT SETUP
Platform: Social commerce (West Africa)
Users: 102,000 active
Monthly events: 2 million
Uptime: 99.2%
Infrastructure: Single VPS
Stack: PHP/Laravel, MySQL, Redis
Yes - one VPS. No cloud. No Kubernetes. No microservices.

WHY I HAVEN'T USED AZURE YET
Honest answer: budget constraints in an emerging-market startup ecosystem. At our current scale, fully managed Azure services would significantly increase monthly burn before product-market expansion. The funding we raised needs to last through growth milestones. The trade: I manually optimize what Azure would auto-scale. I debug what Application Insights would catch. I do by hand what Azure Functions would automate.

DEVOPS PRACTICES THAT KEPT US RUNNING
Even on single-server infrastructure, core DevOps principles still apply:
CI/CD Pipeline (GitHub Actions)
• 3-5 deployments weekly
• Zero-downtime deploys
• Automated rollback on health check failures
• Feature flags for gradual rollouts
Monitoring & Observability
• Custom monitoring (would love Application Insights)
• Real-time alerting
• Performance tracking and slow query detection
• Resource usage monitoring
Automation
• Automated backups
• Automated database optimization
• Automated image compression
• Automated security updates
Infrastructure as Code
• Configs in Git
• Deployment scripts
• Environment variables
• Documented procedures
Testing & Quality
• Automated test suite
• Pre-deployment health checks
• Staging environment
• Post-deployment verification

KEY OPTIMIZATIONS
Async Job Processing
• Upload endpoint: 8 seconds → 340ms
• 4x capacity increase
Database Optimization
• Feed loading: 6.4 seconds → 280ms
• Strategic caching
• Batch processing
Image Compression
• 3-8MB → 180KB (94% reduction)
• Critical for mobile users
Caching Strategy (a sketch of this pattern appears at the end of this post)
• Redis for hot data
• Query result caching
• Smart invalidation
Progressive Enhancement
• Server-rendered pages
• 2-3 second loads on 4G

WHAT I'M WORRIED ABOUT FOR AZURE MIGRATION
This is where I need your help:
Architecture Decisions
• App Service vs Functions + managed services?
• MySQL vs Azure SQL?
• When does cost/benefit flip for managed services?
Cost Management
• How do startups manage Azure costs during growth?
• Reserved instances vs pay-as-you-go?
• Which Azure services are worth the premium?
Migration Strategy
• Lift-and-shift first, or re-architect immediately?
• Zero-downtime migration with 102K active users?
• Validation approach before full cutover?
Monitoring & DevOps
• Application Insights - worth it from day one?
• Azure DevOps vs GitHub Actions for Azure deployments?
• Operational burden reduction with managed services?
Development Workflow
• Local development against Azure services?
• Cost-effective staging environments?
• Testing Azure features without constant bills?
MY PLANNED MIGRATION PATH
Phase 1: Hybrid (Q1 2026)
• Azure CDN for static assets
• Azure Blob Storage for images
• Application Insights trial
• Keep compute on VPS
Phase 2: Compute Migration (Q2 2026)
• App Service for API
• Azure Database for MySQL
• Azure Cache for Redis
• VPS for background jobs
Phase 3: Full Azure (Q3 2026)
• Azure Functions for processing
• Full managed services
• Retire VPS

QUESTIONS FOR THIS COMMUNITY
Question 1: Am I making migration harder by waiting? Should I have started with Azure at higher cost to avoid technical debt?
Question 2: What will break when I migrate? What works on VPS but fails in cloud? What assumptions won't hold?
Question 3: How do I validate before cutting over? Parallel infrastructure? Gradual traffic shift? Safe patterns?
Question 4: Cost optimization from day one? What to optimize immediately vs later? Common cost mistakes?
Question 5: DevOps practices that transfer? What stays the same? What needs rethinking for cloud-native?

THE BIGGER QUESTION
Have you migrated from self-hosted to Azure? What surprised you? I know my setup isn't best practice by Azure standards. But it's working, and I've learned optimization, monitoring, and DevOps fundamentals in practice. Will those lessons transfer? Or am I building habits that cloud will expose as problematic? Looking forward to insights from folks who've made similar migrations.

---
About the Author: Microsoft Certified DevOps Engineer and Azure Developer. CTO at a social commerce platform scaling in West Africa. Preparing for a phased Azure migration in 2026.
P.S. I got the Azure certifications to prepare for this migration. Now I need real-world wisdom from people who've actually done it!
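As referenced under Caching Strategy above, here is a minimal sketch of the query-result caching pattern on Redis. The platform itself is PHP/Laravel, so this TypeScript version (using the node-redis client) is only an illustration; the key format, 5-minute TTL, and loader function are assumptions.

```typescript
// Cache-aside for query results: check Redis first, fall back to the database,
// and invalidate the key as soon as the underlying data changes.
import { createClient } from "redis";

const redis = createClient({ url: process.env.REDIS_URL });

// Placeholder for the real (expensive) database query.
async function loadFeedFromDatabase(userId: string): Promise<object[]> {
  return [];
}

async function getFeed(userId: string): Promise<object[]> {
  const key = `feed:${userId}`;

  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached); // cache hit: skip the database entirely

  const feed = await loadFeedFromDatabase(userId); // cache miss: run the query once
  await redis.setEx(key, 300, JSON.stringify(feed)); // keep it hot for 5 minutes
  return feed;
}

// "Smart invalidation": drop the key when the data it caches changes.
async function onNewPost(userId: string): Promise<void> {
  await redis.del(`feed:${userId}`);
}

async function main() {
  await redis.connect();
  console.log(await getFeed("user-123"));
  await onNewPost("user-123");
  await redis.disconnect();
}

main().catch(console.error);
```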
Scaling Smart with Azure: Architecture That Works
Hi Tech Community! I'm Zainab, currently based in Abu Dhabi and serving as Vice President of Finance & HR at Hoddz Trends LLC, a global tech solutions company headquartered in Arkansas, USA. While I lead on strategy, people, and financials, I also roll up my sleeves when it comes to tech innovation. In this discussion, I want to explore the real-world challenges of scaling systems with Microsoft Azure. From choosing the right architecture to optimizing performance and cost, I'll be sharing insights drawn from experience, and I'd love to hear yours too. Whether you're building from scratch, migrating legacy systems, or refining deployments, let's talk about what actually works.
Error Running Script in Runbook with System Assigned Managed Identity
Hello everyone, I could use some assistance, please. I'm encountering an error when trying to run a script within a runbook. I'm using PowerShell 5.1 with a system-assigned managed identity. The script works fine without the managed identity via PowerShell outside of Azure.

Error:
System.Management.Automation.ParameterBindingException: Cannot process command because of one or more missing mandatory parameters: Credential.
at System.Management.Automation.CmdletParameterBinderController.PromptForMissingMandatoryParameters(Collection`1 fieldDescriptionList, Collection`1 missingMandatoryParameters)
at System.Management.Automation.CmdletParameterBinderController.HandleUnboundMandatoryParameters

I am using this script:

Connect-ExchangeOnline -ManagedIdentity -Organization <domain removed for privacy reasons>

# Specify the user's mailbox identity
$mailboxIdentity = "<email address removed for privacy reasons>"

# Get mailbox configuration and statistics for the specified mailbox
$mailboxConfig = Get-Mailbox -Identity $mailboxIdentity
$mailboxStats = Get-MailboxStatistics -Identity $mailboxIdentity

# Check if TotalItemSize and ProhibitSendQuota are not null and extract the sizes
if ($mailboxStats.TotalItemSize -and $mailboxConfig.ProhibitSendQuota) {
    $totalSizeBytes = $mailboxStats.TotalItemSize.Value.ToString().Split("(")[1].Split(" ")[0].Replace(",", "") -as [double]
    $prohibitQuotaBytes = $mailboxConfig.ProhibitSendQuota.ToString().Split("(")[1].Split(" ")[0].Replace(",", "") -as [double]

    # Convert sizes from bytes to gigabytes
    $totalMailboxSize = $totalSizeBytes / 1GB
    $mailboxWarningQuota = $prohibitQuotaBytes / 1GB

    # Check if the mailbox size exceeds 90% of the warning quota
    if ($totalMailboxSize -ge ($mailboxWarningQuota * 0.0)) {
        # Send an email notification
        $emailBody = "The mailbox $($mailboxIdentity) has reached $($totalMailboxSize) GB, which exceeds 90% of the warning quota."
        Send-MailMessage -To "<email address removed for privacy reasons>" -From "<email address removed for privacy reasons>" -Subject "Mailbox Size Warning" -Body $emailBody -SmtpServer "smtp.office365.com" -Port 587 -UseSsl -Credential (Get-Credential)
    }
}
else {
    Write-Host "The required values (TotalItemSize or ProhibitSendQuota) are not available."
}
Comparison of Azure Cloud Sync and Traditional Entra Connect Sync
Introduction
In the evolving landscape of identity management, organizations face a critical decision when integrating their on-premises Active Directory (AD) with Microsoft Entra ID (formerly Azure AD). Two primary tools are available for this synchronization:
- Traditional Entra Connect Sync (formerly Azure AD Connect)
- Azure Cloud Sync
While both serve the same fundamental purpose, bridging on-prem AD with cloud identity, they differ significantly in architecture, capabilities, and ideal use cases.

Architecture & Setup
Entra Connect Sync is a heavyweight solution. It installs a full synchronization engine on a Windows Server, often backed by SQL Server. This setup gives administrators deep control over sync rules, attribute flows, and filtering.
Azure Cloud Sync, on the other hand, is lightweight. It uses a cloud-managed agent installed on-premises, removing the need for SQL Server or complex infrastructure. The agent communicates with Microsoft Entra ID, and most configuration is handled in the cloud portal.

For organizations with complex hybrid setups (e.g., Exchange hybrid, device management), is Cloud Sync too limited?
Building an AI-Powered ESG Consultant Using Azure AI Services: A Case Study
In today's corporate landscape, Environmental, Social, and Governance (ESG) compliance has become increasingly important for stakeholders. To address the challenges of analyzing vast amounts of ESG data efficiently, a comprehensive AI-powered solution called ESGai has been developed. This blog explores how Azure AI services were leveraged to create a sophisticated ESG consultant for publicly listed companies.
https://youtu.be/5-oBdge6Q78?si=Vb9aHx79xk3VGYAh

The Challenge: Making Sense of Complex ESG Data
Organizations face significant challenges when analyzing ESG compliance data. Manual analysis is time-consuming, prone to errors, and difficult to scale. ESGai was designed to address these pain points by creating an AI-powered virtual consultant that provides detailed insights based on publicly available ESG data.

Solution Architecture: The Three-Agent System
ESGai implements a sophisticated three-agent architecture, all powered by Azure's AI capabilities:
Manager Agent: Breaks down complex user queries into manageable sub-questions containing specific keywords that facilitate vector search retrieval. The system prompt includes generalized document headers from the vector database for context.
Worker Agent: Processes the sub-questions generated by the Manager, connects to the vector database to retrieve relevant text chunks, and provides answers to the sub-questions. Results are stored in Cosmos DB for later use.
Director Agent: Consolidates the answers from the Worker agent into a comprehensive final response tailored specifically to the user's original query.
It's important to note that while conceptually there are three agents, the Worker is actually a single agent that gets called multiple times - once for each sub-question generated by the Manager.

Current Implementation State
The current MVP implementation has several limitations that are planned for expansion:
Limited Company Coverage: The vector database currently stores data for only 2 companies, with 3 documents per company (Sustainability Report, XBRL, and BRSR).
Single Model Deployment: Only one GPT-4o model is currently deployed to handle all agent functions.
Basic Storage Structure: The Blob container has a simple structure with a single directory. While Azure Blob storage doesn't natively support hierarchical folders, the team plans to implement virtual folders in the future.
Free Tier Limitations: Due to funding constraints, the AI Search service is using the free tier, which limits vector data storage to 50MB.
Simplified Vector Database: The current index stores all 6 files (3 documents × 2 companies) in a single vector database without filtering capabilities or schema definition.

Azure Services Powering ESGai
The implementation of ESGai leverages multiple Azure services for a robust and scalable architecture:
Azure AI Services: Provides pre-built APIs, SDKs, and services that incorporate AI capabilities without requiring extensive machine learning expertise. This includes access to 62 pre-trained models for chat completions through the AI Foundry portal.
Azure OpenAI: Hosts the GPT-4o model for generating responses and the Ada embedding model for vectorization. The service combines OpenAI's advanced language models with Azure's security and enterprise features.
Azure AI Foundry: Serves as an integrated platform for developing, deploying, and governing generative AI applications. It offers a centralized management centre that consolidates subscription information, connected resources, access privileges, and usage quotas.
Azure AI Search (formerly Cognitive Search): Provides both full-text and vector search capabilities using the OpenAI ada-002 embedding model for vectorization. It's configured with hybrid search algorithms (BM25 RRF) for optimal chunk ranking.
Azure Storage Services: Utilizes Blob Storage for storing PDFs, Business Responsibility Sustainability Reports (BRSRs), and other essential documents. It integrates seamlessly with AI Search using indexers to track database changes.
Cosmos DB: Employs MongoDB APIs within Cosmos DB as a NoSQL database for storing chat history between agents and users.
Azure App Services: Hosts the web application using a B3-tier plan optimized for cost efficiency, with GitHub Actions integrated for continuous deployment.

Project Evolution: From Concept to Deployment
The development of ESGai followed a structured approach through several phases:
Phase 1: Data Cleaning
- Extracted specific KPIs from XML/XBRL datasets and BRSR reports containing ESG data for 1,000 listed companies
- Cleaned and standardized data to ensure consistency and accuracy
Phase 2: RAG Framework Development
- Implemented Retrieval-Augmented Generation (RAG) to enhance responses by dynamically fetching relevant information
- Created a workflow that includes query processing, data retrieval, and response generation
Phase 3: Initial Deployment
- Deployed models locally using Docker and n8n automation tools for testing
- Identified the need for more scalable web services
Phase 4: Transition to Azure Services
- Migrated automation workflows from n8n to Azure AI Foundry services
- Leveraged Azure's comprehensive suite of AI services, storage solutions, and app hosting capabilities

Technical Implementation Details
Model Configurations: The GPT model is configured with:
- Model version: 2024-11-20
- Temperature: 0.7
- Max Response Tokens: 800
- Past Messages: 10
- Top-p: 0.95
- Frequency/Presence Penalties: 0
The embedding model uses OpenAI text-embedding-ada-002 with 1536 dimensions and hybrid semantic search (BM25 RRF) algorithms.

Cost Analysis and Efficiency
A detailed cost breakdown per user query reveals:
- App Server: $390-400
- AI Search: $5 per query
- RAG Query Processing: $4.76 per query
- Agent-specific costs:
  - Manager: $0.05 (30 input tokens, 210 output tokens)
  - Worker: $3.71 (1500 input tokens, 1500 output tokens)
  - Director: $1.00 (600 input tokens, 600 output tokens)
(The three agent line items sum to the $4.76 RAG figure: 0.05 + 3.71 + 1.00 = 4.76.)

Challenges and Solutions
The team faced several challenges during implementation:
Quota Limitations: Initial deployments encountered token quota restrictions, which were resolved through Azure support requests (typically granted within 24 hours).
Cost Optimization: High costs associated with vectorization required careful monitoring. The team addressed this by shutting down unused services and deploying on services with free tiers.
Integration Issues: GitHub Actions raised errors during deployment, which were resolved using GitHub's App Service Build Service.
Azure UI Complexity: The team noted that Azure AI service naming conventions were sometimes confusing, as the same name is used for both parent and child resources.
Free Tier Constraints: The AI Search service's free tier limitation of 50MB for vector data storage restricts the amount of company information that can be included in the current implementation.
Future Roadmap
The current implementation is an MVP with several areas for expansion:
- Expand the database to include more publicly available sustainability reports beyond the current two companies
- Optimize token usage by refining query handling processes
- Research alternative embedding models to reduce costs while maintaining accuracy
- Implement a more structured storage system with virtual folders in Blob storage
- Upgrade from the free tier of AI Search to support larger data volumes
- Develop a proper schema for the vector database to enable filtering and more targeted searches
- Scale to multiple GPT model deployments for improved performance and redundancy

Conclusion
ESGai demonstrates how advanced AI techniques like Retrieval-Augmented Generation can transform data-intensive domains such as ESG consulting. By leveraging Azure's comprehensive suite of AI services alongside a robust agent-based architecture, this solution provides users with actionable insights while maintaining scalability and cost efficiency.
https://youtu.be/5-oBdge6Q78?si=Vb9aHx79xk3VGYAh
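To make the Worker agent's retrieve-then-answer step concrete, here is a minimal sketch under stated assumptions: it queries Azure AI Search with @azure/search-documents and calls the GPT-4o deployment through the openai package's Azure client. The index name, field names, deployment name, and the plain full-text query (the post's hybrid BM25 RRF ranking is omitted) are all placeholders rather than ESGai's actual configuration.

```typescript
// Sketch of one Worker-agent call: retrieve top chunks from Azure AI Search,
// then ask the GPT-4o deployment to answer the sub-question from that context only.
import { SearchClient, AzureKeyCredential } from "@azure/search-documents";
import { AzureOpenAI } from "openai";

interface EsgChunk {
  content: string; // hypothetical field holding the chunk text
  company: string; // hypothetical metadata field
}

const search = new SearchClient<EsgChunk>(
  "https://<search-service>.search.windows.net",
  "<index-name>",
  new AzureKeyCredential(process.env.SEARCH_KEY ?? "")
);

const openai = new AzureOpenAI({
  endpoint: "https://<aoai-resource>.openai.azure.com",
  apiKey: process.env.AOAI_KEY ?? "",
  apiVersion: "2024-06-01",
});

async function answerSubQuestion(subQuestion: string): Promise<string> {
  // Plain full-text retrieval; the post's hybrid BM25 + vector ranking would add vector options here.
  const results = await search.search(subQuestion, { top: 3 });

  const chunks: string[] = [];
  for await (const r of results.results) {
    chunks.push(r.document.content);
  }

  const completion = await openai.chat.completions.create({
    model: "<gpt-4o-deployment>", // deployment name, not the base model id
    temperature: 0.7,
    max_tokens: 800,
    messages: [
      { role: "system", content: "Answer only from the provided ESG report excerpts." },
      { role: "user", content: `Context:\n${chunks.join("\n---\n")}\n\nQuestion: ${subQuestion}` },
    ],
  });

  return completion.choices[0].message.content ?? "";
}

answerSubQuestion("What does the report disclose about Scope 1 emissions?")
  .then(console.log)
  .catch(console.error);
```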
Entra: Lock screen help.
Hi guys, I need some assistance with Entra regarding the lock screen images. We had a previous lock screen which displayed the company logo, and users were not allowed to change the lock screen. We needed it to be disabled, so I deleted the script as well as the policy for the lock screen to try to remove it. However, this hasn't worked: the lock screen is still displaying on all devices, and users cannot change the lock screen. I do not want to perform a reset, because we have so many machines. Any advice on how to enable users to edit the lock screen again, or to load a new policy, would be highly appreciated.

What I have tried:
- Removing the registry key for the lock screen. (The key just reappears after restart.)
- Loading a new script. (It fails to load with no reason given; I suspect it conflicts with the old one.)
- Disconnecting from Entra and trying to edit the lock screen.

Thanks.