Glossary
Prompt pack
Reproducibility beats ad-hoc chats typed in the browser.
Stable packs make week-over-week deltas meaningful.
Definition
A prompt pack is the curated list of questions your monitoring run executes every cycle. It is the denominator for Share of Voice, the stress surface for fanout queries, and the backbone of LLM-Score. Good packs mix buyer intents, competitor comparisons, and edge cases (pricing, safety, regional nuance) without stuffing brand names into the question text.
Versioning example
| Week | Pack version | LLM-Score | Interpretation |
|---|---|---|---|
| W1 | core-v2 | 46 | baseline |
| W2 | core-v2 | 49 | real improvement |
| W3 | core-v3 | 55 | methodology change, not trend |
Rule: never claim trend continuity across a major pack rewrite.
Trend strip:
How it's computed
Packs are versioned. When you change prompts, start a new baseline window so you do not compare incompatible histories. Getllmspy logs completion metadata so you can prove which pack version produced a dated snapshot.
Pack health math
If this ratio drops after category shifts, refresh the pack instead of forcing old prompts.
How it works in practice
Recommended pack structure
| Segment | Share | Goal |
|---|---|---|
| High-intent commercial | 40% | Revenue-critical comparisons |
| Mid-intent educational | 35% | Category authority checks |
| Risk prompts (pricing/legal) | 25% | Hallucination control |
Balanced packs avoid over-optimizing one funnel stage.
Visual flow:
How to read it
If scores swing after a pack edit, attribute the swing to methodology first. If scores swing with a stable pack, treat it as a market or model drift signal worth investigating.
Build your first production pack
- Collect 100 real buyer questions from sales/support/search logs.
- Cluster by intent and pick the top 50.
- Remove direct brand-injected wording.
- Add 10 risk prompts (pricing, policies, compliance).
- Freeze version for one month before major edits.
- Keep a separate experimental pack for rapid hypothesis testing.
Prompt pack quality benchmarks
| Indicator | Weak | Strong |
|---|---|---|
| Prompt diversity | Mostly one intent | Balanced across funnel stages |
| Version discipline | Ad-hoc edits weekly | Versioned monthly |
| Decision value | Noisy trendline | Comparable week-over-week deltas |
Pack discipline is often the main difference between vanity dashboards and reliable ops monitoring.
For quarterly reporting, freeze one "core" pack and run experiments in a separate sandbox pack.
When to use
- Launching monitoring for a new category.
- Aligning marketing and SEO on the same questions.
- Agency handoffs where reproducibility is contractual.