Prompt pack

A prompt pack is the curated set of questions a brand tracks against LLMs to measure its LLM-Score over time.

Reproducibility beats ad-hoc chats typed in the browser.
Stable packs make week-over-week deltas meaningful.

Definition

A prompt pack is the curated list of questions your monitoring run executes every cycle. It is the denominator for Share of Voice, the stress surface for fanout queries, and the backbone of LLM-Score. Good packs mix buyer intents, competitor comparisons, and edge cases (pricing, safety, regional nuance) without stuffing brand names into the question text.

Versioning example

Week	Pack version	LLM-Score	Interpretation
W1	`core-v2`	46	baseline
W2	`core-v2`	49	real improvement
W3	`core-v3`	55	methodology change, not trend

Rule: never claim trend continuity across a major pack rewrite.
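The rule can be enforced mechanically. The following is a minimal sketch, not part of any real tool: `WeeklyScore` and `comparable_deltas` are hypothetical names, and the data is the three-week table above. It reports a delta only when two consecutive weeks share a pack version and marks every version change as a fresh baseline.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class WeeklyScore:
    # Hypothetical record type; field names mirror the table columns above.
    week: str
    pack_version: str
    llm_score: float


def comparable_deltas(history: List[WeeklyScore]) -> List[Tuple[str, Optional[float]]]:
    """Return (week, delta) pairs; delta is None whenever a new baseline starts."""
    deltas = []
    previous = None
    for row in history:
        if previous is None or previous.pack_version != row.pack_version:
            # First week, or the pack changed: no trend claim is allowed.
            deltas.append((row.week, None))
        else:
            deltas.append((row.week, row.llm_score - previous.llm_score))
        previous = row
    return deltas


history = [
    WeeklyScore("W1", "core-v2", 46),
    WeeklyScore("W2", "core-v2", 49),
    WeeklyScore("W3", "core-v3", 55),
]
deltas = comparable_deltas(history)
```

With the table's data, W2 gets a delta of 3 and W3 gets `None`, so the jump from 49 to 55 is never reported as a trend.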

[Figure: trend strip, prompt pack version trend chart]

How it's computed

Packs are versioned. When you change prompts, start a new baseline window so you do not compare incompatible histories. Getllmspy logs completion metadata so you can prove which pack version produced a dated snapshot.
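One way to make "which pack produced this snapshot" provable is to store a content fingerprint next to the version label. The sketch below is an assumption for illustration only: it is not Getllmspy's actual log format, and `pack_fingerprint` and `snapshot_record` are hypothetical helpers. It also assumes that prompt order, letter case, and extra whitespace do not count as a methodology change.

```python
import hashlib
import json
from datetime import date
from typing import Dict, List


def pack_fingerprint(prompts: List[str]) -> str:
    """Short, order-insensitive SHA-256 fingerprint of the pack's prompt texts."""
    # Collapse whitespace and lowercase so cosmetic edits do not change the hash.
    normalized = sorted(" ".join(p.split()).lower() for p in prompts)
    payload = json.dumps(normalized, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def snapshot_record(pack_version: str, prompts: List[str],
                    llm_score: float, run_date: date) -> Dict[str, object]:
    """Hypothetical dated snapshot that ties a score to the exact pack content."""
    return {
        "date": run_date.isoformat(),
        "pack_version": pack_version,
        "pack_fingerprint": pack_fingerprint(prompts),
        "llm_score": llm_score,
    }


prompts = ["Best CRM for small teams", "CRM pricing comparison"]
record = snapshot_record("core-v2", prompts, 49, date(2024, 1, 8))
```

If two snapshots carry the same version label but different fingerprints, the pack was edited without a version bump, and the histories should not be compared.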

Pack health math

If this ratio drops after category shifts, refresh the pack instead of forcing old prompts.

How it works in practice

Recommended pack structure

Segment	Share	Goal
High-intent commercial	40%	Revenue-critical comparisons
Mid-intent educational	35%	Category authority checks
Risk prompts (pricing/legal)	25%	Hallucination control

Balanced packs avoid over-optimizing one funnel stage.
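A quick balance check can flag a pack that drifts from these shares. This is a rough sketch under stated assumptions: the segment keys, the `(prompt, segment)` pair layout, and the 5-point tolerance are all illustrative choices, not values from the table.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Target shares taken from the table above; the key names are hypothetical.
TARGET_SHARES = {
    "high_intent_commercial": 0.40,
    "mid_intent_educational": 0.35,
    "risk": 0.25,
}


def segment_shares(pack: List[Tuple[str, str]]) -> Dict[str, float]:
    """Share of prompts per target segment; pack is a list of (prompt, segment)."""
    if not pack:
        raise ValueError("pack is empty")
    counts = Counter(segment for _, segment in pack)
    total = sum(counts.values())
    return {seg: counts.get(seg, 0) / total for seg in TARGET_SHARES}


def off_target_segments(pack: List[Tuple[str, str]],
                        tolerance: float = 0.05) -> Dict[str, float]:
    """Segments whose share deviates from target by more than the tolerance."""
    shares = segment_shares(pack)
    return {
        seg: round(shares[seg] - target, 3)
        for seg, target in TARGET_SHARES.items()
        if abs(shares[seg] - target) > tolerance
    }


balanced = ([("q", "high_intent_commercial")] * 8
            + [("q", "mid_intent_educational")] * 7
            + [("q", "risk")] * 5)
skewed = ([("q", "high_intent_commercial")] * 14
          + [("q", "mid_intent_educational")] * 4
          + [("q", "risk")] * 2)
```

Here `off_target_segments(balanced)` returns an empty dict, while the skewed pack is reported as over-weighted on commercial prompts and under-weighted on the other two segments.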

[Figure: visual flow, prompt pack workflow chart]

How to read it

If scores swing after a pack edit, attribute the swing to methodology first. If scores swing with a stable pack, treat it as a market or model drift signal worth investigating.

Build your first production pack

Collect 100 real buyer questions from sales/support/search logs.
Cluster by intent and pick the top 50.
Remove direct brand-injected wording.
Add 10 risk prompts (pricing, policies, compliance).
Freeze version for one month before major edits.
Keep a separate experimental pack for rapid hypothesis testing.
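The steps above can be sketched as a small assembly function. Everything here is illustrative: `build_pack` is a hypothetical helper, the questions are assumed to arrive already labeled with an intent cluster, "top" is read as "belongs to the largest clusters", and brand-injected questions are simply dropped, which is cruder than rewording them. The brand filter runs before the cut so the core set stays full.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def build_pack(questions: Sequence[Tuple[str, str]],
               brand_terms: Sequence[str],
               risk_prompts: Sequence[str],
               top_n: int = 50,
               risk_n: int = 10,
               version: str = "core-v1") -> Dict[str, object]:
    """Assemble a frozen pack from (question, intent_cluster) pairs."""
    brands = [b.lower() for b in brand_terms]
    # Drop questions that name a brand directly (simplification of step 3).
    neutral = [(text, cluster) for text, cluster in questions
               if not any(b in text.lower() for b in brands)]
    # Rank by cluster size; sorted() is stable, so ties keep their input order.
    cluster_size = Counter(cluster for _, cluster in neutral)
    ranked = sorted(neutral, key=lambda q: cluster_size[q[1]], reverse=True)
    core = [text for text, _ in ranked[:top_n]]
    # A tuple signals that the prompt list is frozen for this version.
    return {"version": version, "prompts": tuple(core + list(risk_prompts[:risk_n]))}


questions = [
    ("best crm for startups", "commercial"),
    ("acme crm vs others", "commercial"),
    ("what is a crm", "educational"),
    ("crm for agencies", "commercial"),
]
pack = build_pack(questions, brand_terms=["Acme"],
                  risk_prompts=["crm pricing tiers", "crm gdpr compliance"],
                  top_n=2, risk_n=1)
```

An experimental pack can be built with the same function under a different `version` label, which keeps sandbox runs out of the core history.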

Prompt pack quality benchmarks

Indicator	Weak	Strong
Prompt diversity	Mostly one intent	Balanced across funnel stages
Version discipline	Ad-hoc edits weekly	Versioned monthly
Decision value	Noisy trendline	Comparable week-over-week deltas

Pack discipline is often the main difference between vanity dashboards and reliable ops monitoring.

For quarterly reporting, freeze one "core" pack and run experiments in a separate sandbox pack.

When to use

Launching monitoring for a new category.
Aligning marketing and SEO on the same questions.
Agency handoffs where reproducibility is contractual.

Related terms: LLM-Score, Fanout queries, Organic prompt, Dated snapshot
