Understand New Sentinel Pricing Model with Sentinel Data Lake Tier
Introduction: Sentinel and its New Pricing Model

Microsoft Sentinel is a cloud-native Security Information and Event Management (SIEM) and Security Orchestration, Automation, and Response (SOAR) platform that collects, analyzes, and correlates security data from across your environment to detect threats and automate response. Traditionally, Sentinel stored all ingested data in the Analytics tier (Log Analytics workspace), which is powerful but expensive for high-volume logs. To reduce cost and enable customers to retain all security data without compromise, Microsoft introduced a new dual-tier pricing model consisting of the Analytics tier and the Data Lake tier. The Analytics tier continues to support fast, real-time querying and analytics for core security scenarios, while the new Data Lake tier provides very low-cost storage for long-term retention and high-volume datasets. Customers can now choose where each data type lands—analytics for high-value detections and investigations, and data lake for large or archival types—allowing organizations to significantly lower cost while still retaining all their security data for analytics, compliance, and hunting.

The flow diagram below depicts the new Sentinel pricing model.

Now let's understand this new pricing model with the scenarios below:
• Scenario 1A (Pay-As-You-Go)
• Scenario 1B (Usage Commitment)
• Scenario 2 (Data Lake Tier Only)

Scenario 1A (Pay-As-You-Go)

Requirement
Suppose you need to ingest 10 GB of data per day, and you must retain that data for 2 years. However, you will only frequently use, query, and analyze the data for the first 6 months.

Solution
To optimize cost, you can ingest the data into the Analytics tier and retain it there for the first 6 months, where active querying and investigation happen. After that period, the remaining 18 months of retention can be shifted to the Data Lake tier, which provides low-cost storage for compliance and auditing needs. However, you will be charged separately for Data Lake tier querying and analytics, which is depicted as Compute (D) in the pricing flow diagram.

Pricing Flow / Notes
• The first 10 GB/day ingested into the Analytics tier is free for 31 days under the Analytics logs plan.
• All data ingested into the Analytics tier is automatically mirrored to the Data Lake tier at no additional ingestion or retention cost.
• For the first 6 months, you pay only for Analytics tier ingestion and retention, excluding any free capacity.
• For the next 18 months, you pay only for Data Lake tier retention, which is significantly cheaper.

Azure Pricing Calculator Equivalent
Assuming no data is queried or analyzed during the 18-month Data Lake tier retention period: although the Analytics tier retention is set to 6 months, the first 3 months of retention fall under the free retention limit, so retention charges apply only for the remaining 3 months of the analytics retention window. The Azure pricing calculator adjusts accordingly.

Scenario 1B (Usage Commitment)

Now, suppose you are ingesting 100 GB per day. If you follow the same pay-as-you-go pricing model described above, your estimated cost would be approximately $15,204 per month. However, you can reduce this cost by choosing a Commitment Tier, where Analytics tier ingestion is billed at a discounted rate. Note that the discount applies only to Analytics tier ingestion—it does not apply to Analytics tier retention costs or to any Data Lake tier–related charges. Please refer to the pricing flow and the equivalent pricing calculator results shown below.
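Before comparing pay-as-you-go with commitment pricing, it may help to see how a pay-as-you-go estimate is assembled from the pricing flow above. The sketch below is illustrative only: the unit prices are placeholder assumptions (not official Azure rates), the free-tier handling is simplified, and the Azure pricing calculator remains the authoritative source.

```python
# Back-of-the-envelope Sentinel tiered-cost sketch. All unit prices are
# placeholders for illustration; check the Azure pricing calculator for
# real, region-specific rates.

DAYS_PER_MONTH = 30.42  # average month length (~730 hours / 24)

ANALYTICS_INGEST_PER_GB = 5.00            # placeholder: Analytics tier ingestion (PAYG)
ANALYTICS_RETENTION_PER_GB_MONTH = 0.10   # placeholder: Analytics retention beyond free window
LAKE_RETENTION_PER_GB_MONTH = 0.02        # placeholder: Data Lake tier retention
FREE_ANALYTICS_RETENTION_MONTHS = 3       # free interactive retention window

def monthly_estimate(gb_per_day: float,
                     analytics_retention_months: int,
                     total_retention_months: int,
                     lake_compute_per_month: float = 0.0) -> float:
    """Rough steady-state monthly estimate: Analytics ingestion, Analytics
    retention beyond the free window, Data Lake retention for the remaining
    period, plus any Data Lake compute (the article's Compute (D))."""
    monthly_gb = gb_per_day * DAYS_PER_MONTH

    ingestion = monthly_gb * ANALYTICS_INGEST_PER_GB
    billable_analytics_months = max(
        0, analytics_retention_months - FREE_ANALYTICS_RETENTION_MONTHS)
    analytics_retention = (monthly_gb * billable_analytics_months
                           * ANALYTICS_RETENTION_PER_GB_MONTH)
    lake_months = max(0, total_retention_months - analytics_retention_months)
    lake_retention = monthly_gb * lake_months * LAKE_RETENTION_PER_GB_MONTH

    return ingestion + analytics_retention + lake_retention + lake_compute_per_month

# Scenario 1A shape: 10 GB/day, 6 months in Analytics, 24 months total retention.
print(f"Scenario 1A rough estimate: ${monthly_estimate(10, 6, 24):,.2f}/month")
```

The same structure applies to Scenario 1B's pay-as-you-go estimate; a commitment tier would replace the per-GB ingestion term with the discounted commitment price.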
Monthly cost savings: $15,204 – $11,184 = $4,020 per month

Now the question is: what happens if your usage reaches 150 GB per day? Will the additional 50 GB be billed at the Pay-As-You-Go rate? No. The entire 150 GB/day will still be billed at the discounted rate associated with the 100 GB/day commitment tier bucket.

Azure Pricing Calculator Equivalent (100 GB/Day)

Azure Pricing Calculator Equivalent (150 GB/Day)

Scenario 2 (Data Lake Tier Only)

Requirement
Suppose you need to store certain audit or compliance logs amounting to 10 GB per day. These logs are not used for querying, analytics, or investigations on a regular basis, but must be retained for 2 years as per your organization’s compliance or forensic policies.

Solution
Since these logs are not actively analyzed, you should avoid ingesting them into the Analytics tier, which is more expensive and optimized for active querying. Instead, send them directly to the Data Lake tier, where they can be retained cost-effectively for future audit, compliance, or forensic needs.

Pricing Flow
Because the data is ingested directly into the Data Lake tier, you pay both ingestion and retention costs there for the entire 2-year period. If, at any point in the future, you need to perform advanced analytics, querying, or search, you will incur additional compute charges based on actual usage. Even with occasional compute charges, the cost remains significantly lower than storing the same data in the Analytics tier.

Realized Savings
• Scenario 1 (10 GB/day in the Analytics tier): $1,520.40 per month
• Scenario 2 (10 GB/day directly into the Data Lake tier): $202.20 per month without compute; $257.20 per month with a sample compute price
• Savings with no compute activity: $1,520.40 – $202.20 = $1,318.20 per month
• Savings with some compute activity (sample value): $1,520.40 – $257.20 = $1,263.20 per month

Azure calculator equivalent without compute

Azure calculator equivalent with sample compute

Conclusion

The combination of the Analytics tier and the Data Lake tier in Microsoft Sentinel enables organizations to optimize cost based on how their security data is used. High-value logs that require frequent querying, real-time analytics, and investigation can be stored in the Analytics tier, which provides powerful search performance and built-in detection capabilities. At the same time, large-volume or infrequently accessed logs—such as audit, compliance, or long-term retention data—can be directed to the Data Lake tier, which offers dramatically lower storage and ingestion costs. Because all Analytics tier data is automatically mirrored to the Data Lake tier at no extra cost, customers can use the Analytics tier only for the period they actively query data, and rely on the Data Lake tier for the remaining retention. This tiered model allows different scenarios—active investigation, archival storage, compliance retention, or large-scale telemetry ingestion—to be handled at the most cost-effective layer, ultimately delivering substantial savings without sacrificing visibility, retention, or future analytical capabilities.
Change AVD Insights Workbooks to use VM Insights Queries

Currently the AVD Insights workbooks rely on the Microsoft Monitoring Agent (MMA), which is being retired. Please release an AVD Insights solution that works with and utilises the VM Insights capabilities of Azure Monitor. This will allow us to decommission the use of the MMA.
Defender Entity Page w/ Sentinel Events Tab

One device is displaying the Sentinel Events tab, while the other is not. The only difference observed is that one device is Azure AD (AAD) joined and the other is domain joined. Could this difference account for the missing Sentinel events data? Any insight would be appreciated!
Event-Driven to Change-Driven: Low-cost dependency inversion

Event-driven architectures tout scalability, loose coupling, and eventual consistency. The architectural patterns are sound, the theory is compelling, and the blog posts make it look straightforward. Then you implement it.

Suddenly you're maintaining separate event stores, implementing transactional outboxes, debugging projection rebuilds, versioning events across a dozen microservices, and writing mountains of boilerplate to handle what should be simple queries. Your domain events that were supposed to capture rich business meaning have devolved into glorified database change notifications. Downstream services diff field values to extract intent from "OrderUpdated" events because developers just don't get what constitutes a proper domain event.

The complexity tax is real. Don't get me wrong, it's very elegant, but for many systems it's unjustified. Drasi offers an alternative: change-driven architecture that delivers reactive, real-time capabilities across multiple data sources without requiring you to rewrite your application or overcomplicate your architecture.

What do we mean by "event-driven" architecture?

As Martin Fowler notes, event-driven architecture isn't a single pattern; it's at least four distinct patterns that are often confused, each with its own benefits and traps.

Event Notification is the simplest form. Here, events act as signals that something has happened, but carry minimal data, often just an identifier. The recipient must query the source system for more details if needed. For example, a service emits an OrderPlaced event with just the order ID. Downstream consumers must query the order service to retrieve full order details.

Event-Carried State Transfer broadcasts full state changes through events. When an order ships, you publish an OrderShipped event containing all the order details. Downstream services maintain their own materialized views or projections by consuming these events.

Event Sourcing goes further: events become your source of truth. Instead of storing current state, you store the sequence of events that led to that state. Your order isn't a row in a database; it's the sum of OrderPlaced, ItemAdded, PaymentProcessed, and OrderShipped events.

CQRS (Command Query Responsibility Segregation) separates write operations (commands) from read operations (queries). While not inherently event-driven, CQRS is often paired with event sourcing or event-carried state transfer to optimize for scalability and maintainability. Originally derived from Bertrand Meyer's Command-Query Separation principle and popularized by Greg Young, CQRS addresses a specific architectural challenge: the tension between optimizing for writes versus optimizing for reads.

The pattern promises several benefits:
• Optimized data models: your write model can focus on transactional consistency while read models optimize for query performance
• Scalability: read and write sides can scale independently
• Temporal queries: with event sourcing, you get time travel for free—reconstruct state at any point in history
• Audit trail: every change is captured as an immutable event

While CQRS isn't inherently tied to Domain-Driven Design (DDD), the pattern complements DDD well. In DDD contexts, CQRS enables different bounded contexts to maintain their own read models tailored to their specific ubiquitous language, while the write model protects domain invariants. This is why you'll often see them discussed together, though each can be applied independently.
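To make the first two patterns concrete, here is a minimal, hypothetical sketch contrasting a thin notification with a state-carrying event. The type names and fields are illustrative only, not taken from any particular system.

```python
from dataclasses import dataclass, field
from datetime import datetime

# Event Notification: a thin signal. Consumers that need details must call
# back into the order service, creating a runtime dependency on the producer.
@dataclass
class OrderPlacedNotification:
    order_id: str

# Event-Carried State Transfer: the event carries enough state for consumers
# to maintain their own projection without querying the producer.
@dataclass
class OrderShippedEvent:
    order_id: str
    customer_id: str
    items: list[dict]          # e.g. [{"sku": "A-42", "qty": 2}]
    shipping_address: str
    shipped_at: datetime = field(default_factory=datetime.utcnow)

def handle_notification(evt: OrderPlacedNotification, order_api) -> dict:
    # Must fetch the rest of the data; the producer stays the source of truth.
    return order_api.get_order(evt.order_id)

def handle_state_transfer(evt: OrderShippedEvent, local_view: dict) -> None:
    # Can update a local read model directly; no call back to the producer.
    local_view[evt.order_id] = {
        "customer": evt.customer_id,
        "items": evt.items,
        "status": "shipped",
    }
```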
The core motivation for these patterns is often to invert the dependency between systems, so that your downstream services do not need to know about your upstream services.

The Developer's Struggle: When Domain Events Become Database Events

Chris Kiehl puts it bluntly in his article "Don't Let the Internet Dupe You, Event Sourcing is Hard": "The sheer volume of plumbing code involved is staggering—instead of a friendly N-tier setup, you now have classes for commands, command handlers, command validators, events, aggregates, and then projections, model classes, access classes, custom materialization code, and so on."

But the real tragedy isn't the boilerplate; it's what happens to those carefully crafted domain events. Because developers are disconnected from the real-world business and struggle to understand the nuances of domain events, a dangerous pattern emerges. Instead of modeling meaningful business processes, teams default to what they know: CRUD. Your event stream starts looking like this:

• OrderCreated
• OrderUpdated
• OrderUpdated (again)
• OrderUpdated (wait, what changed?)
• OrderDeleted

As one developer noted on LinkedIn, these "CRUD events" are really just "leaky events that lack clarity and should not be used to replicate databases as this leaks implementation details and couples services to a shared data model."

Dennis Doomen, reflecting on real-world production issues, observes: "It's only once you have a living, breathing machine, users which depend on you, consumers which you can't break, and all the other real-world complexities that plague software projects that the hard problems in event sourcing will rear their heads."

The result? Your elegant event-driven architecture devolves into an expensive, brittle form of self-maintained Change Data Capture (CDC). You're not modeling business processes; you're just broadcasting database mutations with extra steps.

The Anti-Corruption Layer: Your Defense Against the Outside World

In DDD, an Anti-Corruption Layer (ACL) protects your bounded context from external models that would corrupt your domain. Think of it as a translator that speaks both languages: the messy external model and your clean internal model. The ACL ensures that changes to the external system don't ripple through your domain. If the legacy system changes its schema, you update the translator, not your entire domain model.

When Event Taxonomies Become Your ACL (And Why They Fail)

In most event-driven architectures, your event taxonomy is supposed to serve as the shared contract between services. Each service publishes events using its own ubiquitous language, and consumers translate these into their own models; this translation is the ACL. The theory looks beautiful.

But in reality, most teams end up with something else entirely: instead of OrderPaid events that carry business meaning, we get OrderUpdated events that force every consumer to reconstruct intent by diffing fields. When you change your database schema, say splitting the orders table or switching from SQL to NoSQL, every downstream service breaks because they're all coupled to your internal data model. You haven't built an anti-corruption layer. You've built a corruption pipeline that efficiently distributes your internal implementation details across the entire system, forcing you to deploy all services in lockstep and eroding the decoupling benefits you were supposed to get.
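For contrast, here is a minimal sketch of consumer-side ACL translation when the published event actually carries intent. The names (the external "OrderPaid" payload and the internal PaymentRecorded command) are hypothetical and exist only to illustrate the shape of the translation.

```python
from dataclasses import dataclass
from decimal import Decimal

# Internal vocabulary: the command our domain actually understands.
@dataclass
class PaymentRecorded:
    order_ref: str
    amount: Decimal
    currency: str

class BillingAntiCorruptionLayer:
    """Translates the producer's hypothetical 'OrderPaid' payload into our
    domain's PaymentRecorded command. If the producer renames a field, only
    this class changes; the rest of the billing context is untouched."""

    def translate(self, payload: dict) -> PaymentRecorded:
        return PaymentRecorded(
            order_ref=payload["orderId"],
            amount=Decimal(str(payload["amountPaid"])),
            currency=payload.get("currency", "USD"),
        )

# Usage:
# acl = BillingAntiCorruptionLayer()
# command = acl.translate({"orderId": "o-123", "amountPaid": 42.50})
```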
Enter Drasi: Continuous Queries

This is where Drasi changes the game. Instead of publishing events and hoping downstream services can make sense of them, Drasi tails the changelog of the data source itself and derives meaning through continuous queries.

A continuous query in Drasi isn't just a query that runs repeatedly; it's a living, breathing projection that reacts to changes in real time. Here's the key insight: instead of imperative code that processes events ("when this happens, do that"), you write declarative queries that describe the state you care about ("I want to know about orders that are ready and have drivers waiting").

Let's break down what makes this powerful:

Declarative vs. Imperative
With traditional event processing, you write imperative handlers that consume events, maintain local state, and decide what to do next. With a Drasi continuous query, you declare the result set you care about and Drasi keeps it up to date as the underlying data changes.

Semantic Mapping from Low-Level Changes
Drasi excels at transforming database-level changes into business-meaningful events. You're not reacting to "row updated in orders table"; you're reacting to "order ready for curbside pickup." This enables the same core benefits of dependency inversion we get from event-driven architectures, but at a fraction of the effort.

Advanced Temporal Features
Remember those developers struggling with "OrderUpdated" events, trying to figure out if something just happened or has been true for a while? Drasi handles this elegantly: a continuous query can be written to fire only when a driver has been waiting for more than 10 minutes. No timestamp tracking, no state machines, no complex event correlation logic. Imagine trying to manually implement this in a downstream event consumer. 😱

Cross-Source Aggregation Without Code
With Drasi, you can have live projections across PostgreSQL, MySQL, SQL Server, and Cosmos DB as if they were a single graph. No custom aggregation service. No event-stitching logic. No custom downstream datastore to track the sum or keep a materialized projection. Just a query.

Continuous Queries as Your Shared Contract
Drasi's continuous queries, combined with pre-processing middleware, can form the shared contract that your anti-corruption layer can depend on. The continuous query becomes your contract. Downstream systems don't know or care whether orders come from PostgreSQL, MongoDB, or a CSV file. They don't know if you normalized your database, denormalized it, or moved to event sourcing. They just consume the query results. Clean, semantic, and stable.

Reactions as Your Declarative Consumers
Drasi does not simply output a stream of raw change diffs; instead, it has a library of interchangeable Reactions that can act on the output of continuous queries. These are declared using YAML and can do anything from hosting a WebSocket endpoint that provides a live projection to your UI, to calling an HTTP endpoint or publishing a message on a queue.

Example: The Curbside Pickup System

Let's see how this works in Drasi's curbside pickup tutorial. This example has two independent databases and serves as a great illustration of a real-time projection built from multiple upstream services.

The Business Problem
A retail system needs to:
• Match ready orders with drivers who've arrived at pickup zones
• Alert staff when drivers wait more than 10 minutes without their order being ready
• Coordinate data from two different systems (retail ops in PostgreSQL, physical ops in MySQL)

Traditional Event-Driven Approach
In this architecture, you'd need something like this:
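What follows is a hypothetical, heavily simplified sketch of such an aggregation service. The event names, stores, notifier, and alerter are illustrative assumptions rather than code from the tutorial; the point is how much machinery even the happy path already needs.

```python
import threading
from datetime import datetime, timedelta

class PickupAggregationService:
    """Hypothetical event consumer that subscribes to two event streams,
    keeps its own caches in sync, correlates them, and runs timers for the
    'driver waiting too long' rule."""

    def __init__(self, dashboard_notifier, staff_alerter):
        self.orders = {}       # order_id -> {"order_id", "customer", "status", ...}
        self.vehicles = {}     # customer -> {"arrived_at", "location"}
        self.notifier = dashboard_notifier
        self.alerter = staff_alerter
        self.lock = threading.Lock()

    # Consumed from the retail-ops (PostgreSQL) service's event stream.
    def on_order_updated(self, event: dict) -> None:
        with self.lock:
            self.orders[event["order_id"]] = event
            if event.get("status") == "ready":
                self._try_match(event["customer"])

    # Consumed from the physical-ops (MySQL) service's event stream.
    def on_vehicle_arrived(self, event: dict) -> None:
        with self.lock:
            self.vehicles[event["customer"]] = {
                "arrived_at": datetime.utcnow(),
                "location": event.get("location"),
            }
            self._try_match(event["customer"])
        # Schedule the 10-minute "driver waiting" check ourselves.
        threading.Timer(600, self._check_wait, args=[event["customer"]]).start()

    def _try_match(self, customer: str) -> None:
        order = next((o for o in self.orders.values()
                      if o["customer"] == customer and o["status"] == "ready"), None)
        vehicle = self.vehicles.get(customer)
        if order and vehicle and vehicle["location"] == "Curbside":
            self.notifier.push({"order": order["order_id"], "customer": customer})

    def _check_wait(self, customer: str) -> None:
        with self.lock:
            vehicle = self.vehicles.get(customer)
            order_ready = any(o["customer"] == customer and o["status"] == "ready"
                              for o in self.orders.values())
        if vehicle and not order_ready:
            waited = datetime.utcnow() - vehicle["arrived_at"]
            if waited >= timedelta(minutes=10):
                self.alerter.alert(f"{customer} has been waiting {waited}")
```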
That's just the happy path. We haven't handled:
• Event ordering issues
• Partial failures
• Cache invalidation
• Service restarts and replay
• Duplicate events
• Transactional outboxing

The Drasi Approach
With Drasi, the entire aggregation service above becomes two continuous queries: a Delivery Dashboard query that matches ready orders with drivers at the curb, and a Wait Detection query that flags drivers who have been waiting too long. That's it. No event handlers. No caching. No timers. No state management.

Drasi handles:
• Change detection across both databases
• Correlation between orders and vehicles
• Temporal logic for wait detection
• Pushing updates to dashboards via SignalR

The queries define your business logic declaratively. When data changes in either database, Drasi automatically re-evaluates the queries and triggers reactions for any changes in the result set.

Drasi: The Non-Invasive Alternative to Legacy System Rewrites

Here's perhaps the most compelling argument for Drasi: it doesn't require you to rewrite anything.

Traditional event sourcing means:
• Redesigning your application around events
• Rewriting your persistence layer
• Implementing transactional outboxes
• Managing snapshots and replays
• Training your team on new patterns (a steep learning curve)
• Migrating existing data to event streams
• Building projection infrastructure
• Updating all consumers to handle events

As one developer noted about their event sourcing journey: "Event Sourcing is a beautiful solution for high-performance or complex business systems, but you need to be aware that this also introduces challenges most people don't tell you about."

Drasi's approach:
• Keep your existing databases
• Keep your existing services
• Keep your existing deployment model
• Add continuous queries where you need reactive behavior
• Get the benefits of dependency inversion
• Gradually migrate complexity from code to queries

You can start with a single query on a single table and expand from there. No big bang. No feature freeze. No three-month architecture sprint or large multi-year investments full of risk.

Migration Example: From Polling to Reactive

Let's say you have a legacy order system where a scheduled job polls for ready orders every 30 seconds, something like this:
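(A hypothetical sketch of such a polling job; the table, database driver, and notifier are illustrative only.)

```python
import sqlite3  # stand-in for the real database driver
import time

def notify_dashboard(order_row):
    ...  # push the order to the dashboard / send a message

def poll_ready_orders():
    conn = sqlite3.connect("orders.db")
    seen = set()
    while True:
        rows = conn.execute(
            "SELECT id, customer FROM orders WHERE status = 'ready'"
        ).fetchall()
        for order_id, customer in rows:
            if order_id not in seen:          # avoid re-notifying the same order
                notify_dashboard((order_id, customer))
                seen.add(order_id)
        time.sleep(30)  # up to 30 seconds of added latency, plus constant DB load

if __name__ == "__main__":
    poll_ready_orders()
```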
With Drasi, you:
1. Point Drasi at your existing database
2. Write the continuous query
3. Update your dashboard to receive pushes instead of polls
4. Turn off the polling job

Your database hasn't changed. Your order service hasn't changed. You've just added a reactive layer on top that eliminates polling overhead and reduces notification latency from 30 seconds to milliseconds.

The intellectually satisfying complexity of event sourcing often obscures a simple truth: most systems don't need it. They need to know when interesting things change in their data and react accordingly. They need to combine data from multiple sources without writing bespoke aggregation services. They need to transform low-level changes into business-meaningful events. Drasi delivers these capabilities without the ceremony.

Where Do We Go from Here?

If you're building a new system and your team has deep event sourcing experience, embrace the pattern. Event sourcing shines for certain domains. But if you're like many teams, trying to add reactive capabilities to existing systems, struggling with data synchronization across services, or finding that your "events" are just CRUD operations in disguise, consider the change-driven approach.

Start small:
1. Identify one painful polling loop or batch job
2. Set up Drasi to monitor those same data sources
3. Write a continuous query that captures the business condition
4. Replace the polling with push-based reactions
5. Measure the reduction in latency, overhead, and code complexity

The best architecture isn't the most sophisticated one; it's the one your team can understand, maintain, and evolve. Sometimes that means acknowledging that we've been mid-curving it with overly complex event-driven architectures. Drasi and change-driven architecture offer the power of reactive systems without the complexity tax. Your data changes. Your queries notice. Your systems react. It makes it a non-event.

Want to explore Drasi further? Check out the official documentation and try the curbside pickup tutorial to see change-driven architecture in action.
PAAS resource metrics using Azure Data Collection Rule to Log Analytics Workspace

Hi Team,

I want to build a use case that pulls Azure PaaS resource metrics using an Azure Data Collection Rule (DCR) and pushes those metrics to a Log Analytics workspace. From there, the data would stream to Azure Event Hubs, with Azure Postgres as the final destination, storing all resource metrics in a centralized table so we can build KPIs and dashboards that help clients utilize their resources better.

I have not used the diagnostic settings option, since it has drawbacks: each resource's settings must be enabled manually, and the information extracted through diagnostic settings is limited. But while implementing this, I saw multiple articles stating that DCR is not used for pulling PaaS metrics and is only compatible with VM metrics.

I want to understand: is it possible to use DCR for PaaS metrics? Thanks in advance for any inputs.
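For context, one established way to pull platform metrics for PaaS resources programmatically, separate from the DCR question, is the Azure Monitor metrics query API. Below is a rough sketch using the azure-monitor-query SDK; the resource ID and metric name are placeholders, and SDK method signatures should be verified against the current documentation before relying on them.

```python
# pip install azure-monitor-query azure-identity
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricsQueryClient, MetricAggregationType

# Placeholder resource ID for a PaaS resource (e.g. a storage account).
RESOURCE_ID = (
    "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/"
    "Microsoft.Storage/storageAccounts/<account-name>"
)

client = MetricsQueryClient(DefaultAzureCredential())

# Pull the last hour of a platform metric at 5-minute granularity.
response = client.query_resource(
    RESOURCE_ID,
    metric_names=["Transactions"],
    timespan=timedelta(hours=1),
    granularity=timedelta(minutes=5),
    aggregations=[MetricAggregationType.TOTAL],
)

for metric in response.metrics:
    for series in metric.timeseries:
        for point in series.data:
            print(metric.name, point.timestamp, point.total)
```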
Applying DevOps Principles on Lean Infrastructure. Lessons From Scaling to 102K Users.

Hi Azure Community,

I'm a Microsoft Certified DevOps Engineer, and I want to share an unusual journey. I have been applying DevOps principles on traditional VPS infrastructure to scale to 102,000 users with 99.2% uptime.

Why am I posting this in an Azure community? Because I'm planning a migration to Azure in 2026, and I want to understand: what mistakes am I already making that will bite me during migration?

THE CURRENT SETUP

• Platform: Social commerce (West Africa)
• Users: 102,000 active
• Monthly events: 2 million
• Uptime: 99.2%
• Infrastructure: Single VPS
• Stack: PHP/Laravel, MySQL, Redis

Yes - one VPS. No cloud. No Kubernetes. No microservices.

WHY I HAVEN'T USED AZURE YET

Honest answer: budget constraints in an emerging-market startup ecosystem. At our current scale, fully managed Azure services would significantly increase monthly burn before product-market expansion. The funding we raised needs to last through growth milestones.

The trade: I manually optimize what Azure would auto-scale. I debug what Application Insights would catch. I do by hand what Azure Functions would automate.

DEVOPS PRACTICES THAT KEPT US RUNNING

Even on single-server infrastructure, core DevOps principles still apply:

CI/CD Pipeline (GitHub Actions)
• 3-5 deployments weekly
• Zero-downtime deploys
• Automated rollback on health check failures (see the sketch just before the migration plan below)
• Feature flags for gradual rollouts

Monitoring & Observability
• Custom monitoring (would love Application Insights)
• Real-time alerting
• Performance tracking and slow query detection
• Resource usage monitoring

Automation
• Automated backups
• Automated database optimization
• Automated image compression
• Automated security updates

Infrastructure as Code
• Configs in Git
• Deployment scripts
• Environment variables
• Documented procedures

Testing & Quality
• Automated test suite
• Pre-deployment health checks
• Staging environment
• Post-deployment verification

KEY OPTIMIZATIONS

Async Job Processing
• Upload endpoint: 8 seconds → 340ms
• 4x capacity increase

Database Optimization
• Feed loading: 6.4 seconds → 280ms
• Strategic caching
• Batch processing

Image Compression
• 3-8MB → 180KB (94% reduction)
• Critical for mobile users

Caching Strategy
• Redis for hot data
• Query result caching
• Smart invalidation

Progressive Enhancement
• Server-rendered pages
• 2-3 second loads on 4G

WHAT I'M WORRIED ABOUT FOR AZURE MIGRATION

This is where I need your help:

Architecture Decisions
• App Service vs Functions + managed services?
• MySQL vs Azure SQL?
• When does cost/benefit flip for managed services?

Cost Management
• How do startups manage Azure costs during growth?
• Reserved instances vs pay-as-you-go?
• Which Azure services are worth the premium?

Migration Strategy
• Lift-and-shift first, or re-architect immediately?
• Zero-downtime migration with 102K active users?
• Validation approach before full cutover?

Monitoring & DevOps
• Application Insights - worth it from day one?
• Azure DevOps vs GitHub Actions for Azure deployments?
• Operational burden reduction with managed services?

Development Workflow
• Local development against Azure services?
• Cost-effective staging environments?
• Testing Azure features without constant bills?
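Regarding the "automated rollback on health check failures" practice above: a minimal, hypothetical sketch of what such a post-deploy health gate can look like. The endpoint, retry budget, and rollback command are illustrative assumptions, not the author's actual setup.

```python
import subprocess
import sys
import time

import requests

HEALTH_URL = "https://example.com/healthz"    # placeholder health endpoint
ATTEMPTS = 10                                 # ~50 seconds of grace after deploy
ROLLBACK_CMD = ["./deploy.sh", "--rollback"]  # placeholder rollback command

def healthy() -> bool:
    try:
        resp = requests.get(HEALTH_URL, timeout=5)
        return resp.status_code == 200 and resp.json().get("status") == "ok"
    except (requests.RequestException, ValueError):
        return False

def main() -> int:
    for _ in range(ATTEMPTS):
        if healthy():
            print("Health check passed; keeping new release.")
            return 0
        time.sleep(5)
    print("Health check failed; rolling back.", file=sys.stderr)
    subprocess.run(ROLLBACK_CMD, check=False)
    return 1

if __name__ == "__main__":
    sys.exit(main())
```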
MY PLANNED MIGRATION PATH

Phase 1: Hybrid (Q1 2026)
• Azure CDN for static assets
• Azure Blob Storage for images
• Application Insights trial
• Keep compute on VPS

Phase 2: Compute Migration (Q2 2026)
• App Service for API
• Azure Database for MySQL
• Azure Cache for Redis
• VPS for background jobs

Phase 3: Full Azure (Q3 2026)
• Azure Functions for processing
• Full managed services
• Retire VPS

QUESTIONS FOR THIS COMMUNITY

Question 1: Am I making migration harder by waiting? Should I have started with Azure at higher cost to avoid technical debt?

Question 2: What will break when I migrate? What works on VPS but fails in cloud? What assumptions won't hold?

Question 3: How do I validate before cutting over? Parallel infrastructure? Gradual traffic shift? Safe patterns?

Question 4: Cost optimization from day one? What to optimize immediately vs later? Common cost mistakes?

Question 5: DevOps practices that transfer? What stays the same? What needs rethinking for cloud-native?

THE BIGGER QUESTION

Have you migrated from self-hosted to Azure? What surprised you?

I know my setup isn't best practice by Azure standards. But it's working, and I've learned optimization, monitoring, and DevOps fundamentals in practice. Will those lessons transfer? Or am I building habits that cloud will expose as problematic?

Looking forward to insights from folks who've made similar migrations.

---

About the Author: Microsoft Certified DevOps Engineer and Azure Developer. CTO at social commerce platform scaling in West Africa. Preparing for phased Azure migration in 2026.

P.S. I got the Azure certifications to prepare for this migration. Now I need real-world wisdom from people who've actually done it!
Project Pavilion Presence at KubeCon NA 2025

KubeCon + CloudNativeCon NA took place in Atlanta, Georgia, from 10-13 November, and continued to highlight the ongoing growth of the open source, cloud-native community. Microsoft participated throughout the event and supported several open source projects in the Project Pavilion. Microsoft’s involvement reflected our commitment to upstream collaboration, open governance, and enabling developers to build secure, scalable and portable applications across the ecosystem.

The Project Pavilion serves as a dedicated, vendor-neutral space on the KubeCon show floor reserved for CNCF projects. Unlike the corporate booths, it focuses entirely on open source collaboration. It brings maintainers and contributors together with end users for hands-on demos, technical discussions, and roadmap insights. This space helps attendees discover emerging technologies and understand how different projects fit into the cloud-native ecosystem. It plays a critical role in idea exchanges, resolving challenges, and strengthening collaboration across CNCF-approved technologies.

Why Our Presence Matters

KubeCon NA remains one of the most influential gatherings for developers and organizations shaping the future of cloud-native computing. For Microsoft, participating in the Project Pavilion helps advance our goals of:
• Open governance and community-driven innovation
• Scaling vital cloud-native technologies
• Secure and sustainable operations
• Learning from practitioners and adopters
• Enabling developers across clouds and platforms

Many of Microsoft’s products and cloud services are built on or aligned with CNCF and open-source technologies. Being active within these communities ensures that we are contributing back to the ecosystem we depend on and designing by collaborating with the community, not just for it.

Microsoft-Supported Pavilion Projects

containerd
Representative: Wei Fu
The containerd team engaged with project maintainers and ecosystem partners to explore solutions for improving AI model workflows. A key focus was the challenge of handling large OCI artifacts (often 500+ GiB) used in AI training workloads. Current image-pulling flows require containerd to fetch and fully unpack blobs, which significantly delays pod startup for large models. Collaborators from Docker, NTT, and ModelPack discussed a non-unpacking workflow that would allow training workloads to consume model data directly. The team plans to prototype this behavior as an experimental feature in containerd. Additional discussions included updates related to nerdbox and next steps for the erofs snapshotter.

Copacetic
Representative: Joshua Duffney
The Copa booth attracted roughly 75 attendees, with strong representation from federal agencies and financial institutions, a sign of growing adoption in regulated industries. A lightning talk delivered at the conference significantly boosted traffic and engagement. Key feedback and insights included:
• High interest in customizable package update sources
• Demand for application-level patching beyond OS-level updates
• Need for clearer CI/CD integration patterns
• Expectations around in-cluster image patching
• Questions about runtime support, including Podman
The conversations revealed several documentation gaps and feature opportunities that will inform Copa’s roadmap and future enablement efforts.

Drasi
Representative: Nandita Valsan
KubeCon NA 2025 marked Drasi’s first in-person presence since its launch in October 2024 and its entry into the CNCF Sandbox in early 2025.
With multiple kiosk slots, the team interacted with ~70 visitors across shifts. Engagement highlights included:
• New community members joining the Drasi Discord and starring GitHub repositories
• Meaningful discussions with observability and incident management vendors interested in change-driven architectures
• Positive reception to Aman Singh’s conference talk, which led attendees back to the booth for deeper technical conversations
Post-event follow-ups are underway with several sponsors and partners to explore collaboration opportunities.

Flatcar Container Linux
Representatives: Sudhanva Huruli and Vamsi Kavuru
The Flatcar project had some fantastic conversations at the pavilion. Attendees were eager to learn about bare metal provisioning, GPU support for AI workloads, and how Flatcar’s fully automated build and test process keeps things simple and developer friendly. Questions around Talos vs. Flatcar and CoreOS sparked lively discussions, with the team emphasizing Flatcar’s usability and independence from an OS-level API. Interest came from government agencies and financial institutions, and the preview of Flatcar on AKS opened the door to deeper conversations about real-world adoption. The Project Pavilion proved to be the perfect venue for authentic, technical exchanges.

Flux
Representative: Dipti Pai
The Flux booth was active throughout all three days of the Project Pavilion, where Microsoft joined other maintainers to highlight new capabilities in Flux 2.7, including improved multi-tenancy, enhanced observability, and streamlined cloud-native integrations. Visitors shared real-world GitOps experiences, both successes and challenges, which provided valuable insights for the project’s ongoing development. Microsoft’s involvement reinforced strong collaboration within the Flux community and continued commitment to advancing GitOps practices.

Headlamp
Representatives: Joaquim Rocha, Will Case, and Oleksandr Dubenko
Headlamp had a booth for all three days of the conference, engaging with both longstanding users and first-time attendees. The increased visibility from becoming a Kubernetes sub-project was evident, with many attendees sharing their usage patterns across large tech organizations and smaller industrial teams. The booth enabled maintainers to:
• Gather insights into how teams use Headlamp in different environments
• Introduce Headlamp to new users discovering it via talks or hallway conversations
• Build stronger connections with the community and understand evolving needs

Inspektor Gadget
Representatives: Jose Blanquicet and Mauricio Vásquez Bernal
Hosting a half-day kiosk session, Inspektor Gadget welcomed approximately 25 visitors. Attendees included newcomers interested in learning the basics and existing users looking for updates. The team showcased new capabilities, including the tcpdump gadget and Prometheus metrics export, and invited visitors to the upcoming contribfest to encourage participation.

Istio
Representatives: Keith Mattix, Jackie Maertens, Steven Jin Xuan, Niranjan Shankar, and Mike Morris
The Istio booth continued to attract a mix of experienced adopters and newcomers seeking guidance.
Technical discussions focused on:
• Enhancements to multicluster support in ambient mode
• Migration paths from sidecars to ambient
• Improvements in Gateway API availability and usage
• Performance and operational benefits for large-scale deployments
Users, including several Azure customers, expressed appreciation for Microsoft’s sustained investment in Istio as part of their service mesh infrastructure.

Notary Project
Representatives: Feynman Zhou and Toddy Mladenov
The Notary Project booth saw significant interest from practitioners concerned with software supply chain security. Attendees discussed signing, verification workflows, and integrations with Azure services and Kubernetes clusters. The conversations will influence upcoming improvements across Notary Project and Ratify, reinforcing Microsoft’s commitment to secure artifacts and verifiable software distribution.

Open Policy Agent (OPA) - Gatekeeper
Representative: Jaydip Gabani
The OPA/Gatekeeper booth enabled maintainers to connect with both new and existing users to explore use cases around policy enforcement, Rego/CEL authoring, and managing large policy sets. Many conversations surfaced opportunities around simplifying best practices and reducing management complexity. The team also promoted participation in an ongoing Gatekeeper/OPA survey to guide future improvements.

ORAS
Representatives: Feynman Zhou and Toddy Mladenov
ORAS engaged developers interested in OCI artifacts beyond container images, which include AI/ML models, metadata, backups, and multi-cloud artifact workflows. Attendees appreciated ORAS’s ecosystem integrations and found the booth examples useful for understanding how artifacts are tagged, packaged, and distributed. Many users shared how they leverage ORAS with Azure Container Registry and other OCI-compatible registries.

Radius
Representative: Zach Casper
The Radius booth attracted the attention of platform engineers looking for ways to simplify their developers’ experience while being able to enforce enterprise-grade infrastructure and security best practices. Attendees saw demos on deploying a database to Kubernetes and using managed databases from AWS and Azure without modifying the application deployment logic. They also saw a preview of Radius integration with GitHub Copilot, enabling AI coding agents to autonomously deploy and test applications in the cloud.

Conclusion

KubeCon + CloudNativeCon North America 2025 reinforced the essential role of open source communities in driving innovation across cloud native technologies. Through the Project Pavilion, Microsoft teams were able to exchange knowledge with other maintainers, gather user feedback, and support projects that form foundational components of modern cloud infrastructure. Microsoft remains committed to building alongside the community and strengthening the ecosystem that powers so much of today’s cloud-native development.

For anyone interested in exploring or contributing to these open source efforts, please reach out directly to each project’s community to get involved, or contact Lexi Nadolski at lexinadolski@microsoft.com for more information.