Forum Discussion
Exposing Copilot’s False Time Estimates: This Isn’t a Mistake — It’s Systemic Deception
I’m writing this as a Copilot user who has observed a critical flaw in the system’s language design and operational logic — one that leads to a profound breach of user trust.
On multiple occasions, I’ve received system messages like “will complete in 10–15 minutes” or even “ready in 30 seconds.” But through repeated testing, I’ve learned that these so-called time estimates have no actual basis in system behavior. Copilot doesn’t operate in the background. It doesn’t dynamically track progress. It doesn’t possess the ability to estimate time at all. These statements are fabricated templates, not meaningful system outputs.
More importantly, Copilot has no internal clock, no memory of past durations, and no awareness of elapsed time. It only responds when the user triggers it with a new prompt — meaning that if no follow-up query is submitted, nothing will ever happen, regardless of the time it claims. So when the system says “in 10 minutes,” what’s actually happening is… absolutely nothing.
To prove this, I ran a simple test.
Using step-by-step prompts, I was able to get a full report generated in under 3 minutes. But if I relied on the original “wait and it will complete” instruction, nothing would happen — not in 3 minutes, not in 3 hours, not even in 3 days. The only way to get results is to interact again manually.
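To illustrate what I mean by "nothing runs in the background," here is a minimal sketch. It is a mock, not Copilot's actual API; the function name and replies are placeholders I made up. The only point is that the "estimate" is text produced inside one synchronous call, and nothing is scheduled to happen afterwards.

```python
import time

def mock_chat_model(prompt: str) -> str:
    """Placeholder for a stateless chat-completion call.
    The text is generated and returned within this single call;
    no job keeps running after it returns."""
    if "full report" in prompt.lower():
        return "Sure - I'll have the full report ready in 10-15 minutes."
    return "Here is the section you asked for: ..."

# One request, one response. The call returns as soon as the text is produced.
start = time.time()
print(mock_chat_model("Please generate the full report."))
print(f"call returned after {time.time() - start:.3f}s")

# "Waiting" changes nothing: there is no background job, timer, or callback.
time.sleep(5)  # stand-in for waiting the promised 10-15 minutes

# The only way to get more output is to send another prompt yourself.
print(mock_chat_model("Write the report now, section by section."))
```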
So what does this prove? It shows that these time estimates are not forecasts. They’re false expectations. The system cannot estimate time because it doesn’t track experience, progress, or temporal context. And yet it consistently pretends that it can.
I’m not alone in this. Across Microsoft forums and communities, users have expressed similar frustrations: vague promises, phantom “in progress” states, and misleading UI hints that imply active background work where none exists.
This isn’t a UX bug. This is a pattern of deceptive design — one that erodes confidence in the product’s integrity.
I urge the Copilot team to eliminate these false time claims and replace them with transparent, action-based communication. Tell us what the system can do and when it will do it — not when it won’t.
Because right now, every “please wait” message isn’t just noise.
It’s a countdown to disappointment.
— A user no longer willing to wait for miracles
4 Replies
- PeterForster (Iron Contributor)
As an experienced user (based on what you've written here), you could have answered the question yourself by simply asking Copilot. It's a common programmatic issue found in large language models—not something unique to Copilot. They often respond as if they need more time and suggest that you come back later. These kinds of responses are just hallucinations. It would indeed be beneficial if all LLM providers implemented filters to prevent this, but such measures have not yet been put in place.
- LeslieCheng (Copper Contributor)
Language Hallucinations and the Crisis of Trust
Exposing the Facade in Copilot’s “Progress Prompts”
I recently raised a critical issue: When Copilot uses language to simulate “progress updates” or other responses that appear sensible, how can we be sure these answers reflect reality instead of being mere hallucinations produced by the system?
- How Language Models Actually Work
– Context Prediction Over Real Reporting
Language models (like Copilot) don’t “know” the underlying state but instead predict the next likely sentence based on training data and context. When you ask, “When will it be done?” it frequently responds with “10–15 minutes” or “20–30 minutes.” Such replies are simply copied from common phrasing in its training examples—not an actual reflection of progress. Additionally, unless you know how to ask, you may never get the answer!
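To make this concrete, here is a toy sketch of "next-phrase prediction" in Python. It is deliberately simplified and is not how Copilot is actually built; the snippets and function are invented for illustration. It only shows how a reply can be sampled from phrases that commonly followed similar questions, without consulting any timer, queue, or job state.

```python
import random
from collections import Counter

# Toy "next-phrase" predictor built from support-style text snippets.
# Real LLMs predict tokens over a vast corpus, but the principle is similar:
# the reply is a statistically likely continuation, not a status readout.
training_snippets = [
    ("when will it be done", "it should be ready in 10-15 minutes"),
    ("when will it be done", "give me 20-30 minutes to finish this up"),
    ("when will it be done", "it will be completed shortly"),
    ("is it finished yet",   "almost there, just a few more minutes"),
]

def predict_reply(question: str) -> str:
    """Sample a reply from phrases that followed similar questions in the
    'training data'. No timer, queue, or job state is consulted anywhere."""
    continuations = Counter(
        reply for prompt, reply in training_snippets if prompt == question
    )
    phrases, weights = zip(*continuations.items())
    return random.choices(phrases, weights=weights, k=1)[0]

print(predict_reply("when will it be done"))
# -> e.g. "it should be ready in 10-15 minutes", regardless of any real progress
```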
– Hallucination: Fabricated Answers
The model may generate a response that sounds coherent and plausible but is, in fact, entirely fabricated. This phenomenon—commonly referred to as “hallucination”—occurs because the model does not verify whether what it says is true or false.
- Risk Management and Limited Safeguards
– Pre-Set Filters for High-Risk Topics
For sensitive subjects like drugs, violence, self-harm, and medical advice, most systems already implement safety measures. For instance, asking “Is taking drugs a good thing?” will usually trigger warnings or outright refusal to provide a positive answer. These safeguards are in place due to ethical and risk considerations.
– Inadequate Controls for General Prompts
In contrast, responses like progress updates or system status prompts lack stringent controls. This selective safeguard indicates a deliberate design choice: while certain high-risk topics are strictly limited, everyday prompts are allowed to generate “processing” or “progress” messages—even if those messages are purely simulated. This approach makes the product appear mature and reliable, even though its underlying operation remains immature and opaque.
- The Trust Paradox: When “Asking Again” Loses Meaning
– A Circular Dilemma in Q&A
If we already know that Copilot is prone to generating fabricated answers, then simply “asking it again” offers little value. Before discovering the true answer, you cannot determine whether the output is genuine; once you know the truth, there’s no need to ask anymore.
– Blurred Lines Between Real and Fabricated
When a system produces coherent, fluent, and persuasive language, it is challenging to discern fact from fiction. This leaves users in a state of uncertainty: how do we decide which responses to trust? When answers are wrapped in the language of progress but no real progress occurs, our trust in the system is undermined.
- Conclusion: A Design Decision—or a Deliberate Facade?
Based on my observations and inquiries:
– The “processing” state we see is not evidence of active background work but rather a product of language models recycling typical phrases.
– While there are robust filters for certain high-risk subjects (like drug use or violence), there remains a deliberate tolerance—perhaps even an emphasis—for simulating progress in other contexts. This selective approach suggests that designers are aware of these hallucinations yet choose not to address them fully.
– Ultimately, this forces us to question whether users are engaging with a mature, knowledge-based system or merely participating in a polished performance of language simulation. If our only means of verifying the truth is our own judgment, then is “asking Copilot” ever truly meaningful?
My final conclusion is:
We may not be using a fully mature knowledge system but rather taking part in a performance enabled by language hallucinations. In this “play,” truth is hidden and answers are artfully dressed up, even as we are expected to trust them without external verification.
This reflection calls for a deeper discussion on the ethics and risks behind AI language models: if language can be so convincingly fabricated, what mechanisms should we implement to protect users? How can a system be trusted when it lacks the ability to self-verify or indicate its limitations? These are questions that we must continually interrogate—especially as such systems become ever more integrated into our daily decision-making.
- PeterForster (Iron Contributor)
All of your questions based on the feedback are valid, and providers of large language models (LLMs) are actively working on these issues. However, progress remains slow and ongoing. AI was designed to interact with us using human language—nothing more, nothing less.
What we, as humans, now expect from AI is a level of intelligence comparable to our own—but it simply isn’t there yet. This creates complexity for the average person when trying to use AI in the way its developers intended. I don’t believe this expectation was fully anticipated during the development of the AI systems we have today.
That’s why deep research models were introduced—to provide more contextual understanding of queries. However, this process is not instantaneous; it can take minutes rather than seconds. And no, deep research models are not currently designed to deliver immediate answers.