Glossary
Hallucination
Models can sound authoritative while inventing policies, prices, or founders.
Catch these fabrications early with scripted prompt packs and dated snapshots.
Definition
In LLM monitoring, a hallucination is a fluent but false claim about your brand: a wrong SKU, an imaginary promo, misstated geography, or a competitor's attributes assigned to you. Hallucinations differ from silence (no mention) and from negative-but-true sentiment. They are especially risky in regulated industries, where a confident wrong answer can create legal or PR exposure.
Severity matrix
| Type | Frequency | Business risk |
|---|---|---|
| Wrong price | medium | high |
| Wrong integration list | high | medium-high |
| Wrong HQ city | low | medium |
| Competitor confusion | medium | high |
Prioritize by risk first, not by sheer count.
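As a minimal sketch of that ordering, the snippet below sorts the matrix rows by business risk and uses frequency only as a tie-breaker. The numeric ranks assigned to the labels are illustrative assumptions, not values from Getllmspy.

```python
# Ordinal ranks for the matrix labels; the numbers are illustrative assumptions.
RANK = {"low": 1, "medium": 2, "medium-high": 3, "high": 4}

# Rows copied from the severity matrix: (type, frequency, business risk).
MATRIX = [
    ("Wrong price", "medium", "high"),
    ("Wrong integration list", "high", "medium-high"),
    ("Wrong HQ city", "low", "medium"),
    ("Competitor confusion", "medium", "high"),
]


def prioritize(rows):
    """Sort by business risk first, then by frequency, both descending."""
    return sorted(rows, key=lambda r: (RANK[r[2]], RANK[r[1]]), reverse=True)


for kind, freq, risk in prioritize(MATRIX):
    print(f"{kind}: risk={risk}, frequency={freq}")
```

Note that "Wrong integration list" is the most frequent type but lands third, because two high-risk types outrank it.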
How it's computed
Getllmspy compares model answers against structured facts you provide (site, niche, brand string) and flags contradictions or low-grounding patterns. Fanout queries stress-test whether a false story repeats under rephrasing. Pair automated flags with a human pass for nuance.
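To illustrate the general idea of checking an answer against structured facts, here is a small sketch that flags dollar amounts in a model answer that do not match any known price. It is not Getllmspy's implementation: the function name, the regular expression, and the sample prices are all hypothetical, and a real check would also cover non-price facts.

```python
import re

# Matches dollar amounts such as "$9" or "$29.50" (assumed price format).
PRICE_PATTERN = re.compile(r"\$(\d+(?:\.\d{1,2})?)")


def flag_price_contradictions(answer, known_prices):
    """Return dollar amounts quoted in `answer` that are not in `known_prices`."""
    quoted = [float(match) for match in PRICE_PATTERN.findall(answer)]
    return [price for price in quoted if price not in known_prices]


# Hypothetical structured facts: the only real plan prices.
known = {29.0, 99.0}

print(flag_price_contradictions("The entry plan starts at $9 per month.", known))
print(flag_price_contradictions("Plans start at $29.", known))
```

An exact-match rule like this is deliberately strict, so a flagged answer still needs the human pass to rule out cases such as discounts quoted correctly.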
Recurrence metric
Hallucination recurrence = repeated false claims / all false claims
Recurring false claims require source-level fixes, not one-off PR replies.
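The ratio can be computed in a few lines. This sketch assumes one reading of the formula: each flagged answer contributes one normalized claim string, and a claim counts as "repeated" when the same string appears more than once in the snapshot. The function name and sample claims are hypothetical.

```python
from collections import Counter


def hallucination_recurrence(false_claims):
    """Share of false-claim observations whose claim appears more than once.

    `false_claims` holds one normalized claim string per flagged answer.
    Returns 0.0 when there are no false claims at all.
    """
    if not false_claims:
        return 0.0
    counts = Counter(false_claims)
    repeated = sum(n for n in counts.values() if n > 1)
    return repeated / len(false_claims)


snapshot = [
    "starts at $9",       # same false price, seen twice
    "starts at $9",
    "hq in berlin",       # one-off
    "soc 2 certified",    # one-off
]
print(f"{hallucination_recurrence(snapshot):.0%}")
```

Here two of the four false claims are repeats of the same story, so recurrence is 50%.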
Common hallucination patterns
- Pricing: the model quotes a plan that never existed (“starts at $9”) while your real entry tier is higher.
- Capabilities: it claims integrations or certifications you do not offer.
- Positioning: it merges you with a namesake competitor or misstates your market.
- Geography: wrong HQ city, country coverage, or team location.
Pricing and capability errors are the most dangerous because they directly change purchase decisions.
How to read it
Severity scales with reach and persistence: a one-off in a fringe model matters less than the same false price in ChatGPT and Perplexity for a week. Track whether fixes (schema, FAQ, press) reduce recurrence in the next snapshot.
Hallucination response plan
- Create a critical-facts list (pricing, policies, compliance).
- Build dedicated canonical fact pages.
- Add explicit citations and update dates.
- Re-test with fanout prompts.
- Track recurrence for 2-3 weekly runs.
Risk thresholds
| Recurring hallucination rate | Reading |
|---|---|
| <8% | manageable |
| 8-15% | active monitoring required |
| >15% | urgent grounding intervention |
In regulated verticals, even low rates may require immediate action.
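The table maps directly onto a small lookup. In this sketch the boundary values 8% and 15% are both placed in the middle band, following the table's "8-15%" row; the function name is hypothetical, and the stricter handling for regulated verticals is not modeled.

```python
def read_recurrence_rate(rate_percent):
    """Map a recurring hallucination rate (in percent) to the table's reading."""
    if not 0 <= rate_percent <= 100:
        raise ValueError("rate_percent must be between 0 and 100")
    if rate_percent > 15:
        return "urgent grounding intervention"
    if rate_percent >= 8:
        return "active monitoring required"
    return "manageable"


for rate in (5, 8, 15, 22):
    print(f"{rate}% -> {read_recurrence_rate(rate)}")
```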
When to use
- Crisis playbooks when a model invents a policy you do not offer.
- Pre-launch checks for new SKUs or pricing tiers.
- Quarterly audits alongside LLM-Score.