llm-cost-report

Name: llm-cost-report
Author: yado2000-maker

Install

mkdir -p .claude/skills/llm-cost-report && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/16472" && unzip -o skill.zip -d .claude/skills/llm-cost-report && rm skill.zip

Installs to .claude/skills/llm-cost-report

Activation

This is the description your AI agent reads to decide when to run this skill — the better it matches your request, the more reliably it fires.

Weekly LLM spend + unit-economics report — cost by stage (Haiku classifier vs Sonnet reply), by product (Solo 1:1 vs Family group), and per household, plotted against the price. Surfaces margin creep before it becomes a loss. Use to report spend, check the margin, or size the cost impact of a feature. Triggers - cost report, spend, margin, unit economics, how much per household.

381 chars; no explicit “when” trigger; longer than Claude Code's old 250-char listing cap (fine on current versions)

About this skill

Turns LLM spend from a surprise into a tracked weekly number. Source of truth: docs/plans/2026-05-09-recovery-hardening-plan.md. Owned by Koren; pairs with token-optimization and Paz's first-try-scoreboard.

Data sources

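ai_usage table via mcp__supabase__execute_sql. Intended as per-call token + cost rows; per the ground-truth section below, the table currently stores only household_id / usage_date / message_count, so confirm the schema before relying on it for $.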
PostHog LLM-analytics MCP: get-llm-total-costs-for-project, exploring-llm-costs.

The weekly cuts to produce

By stage: Haiku classifier (claude-haiku-4-5-20251001) vs Sonnet reply (claude-sonnet-4-20250514) — input/cached/uncached token split each.
By product: Solo 1:1 vs Family group (normalize chat via split_part(group_id,'@',1); @g.us = Family, bare/@s.whatsapp.net = Solo).
Per household: total monthly LLM cost ÷ active households.
Margin line: per-household cost vs Premium ₪14.90/mo (~$4) and ₪149/yr.
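A minimal Python sketch of the product cut and the per-household division described above. It mirrors the rule stated in the list (@g.us = Family, bare or @s.whatsapp.net = Solo); the function names are hypothetical and not part of the skill.

```python
def normalize_chat(chat_id: str) -> str:
    """Strip the WhatsApp domain, like split_part(group_id, '@', 1)."""
    return chat_id.split("@")[0]


def classify_product(chat_id: str) -> str:
    """Return 'Family' for group chats (@g.us), otherwise 'Solo'."""
    parts = chat_id.split("@")
    domain = parts[1] if len(parts) > 1 else ""  # bare IDs have no domain
    return "Family" if domain == "g.us" else "Solo"


def per_household_cost(total_monthly_cost_usd: float, active_households: int) -> float:
    """Total monthly LLM cost divided by active households."""
    if active_households <= 0:
        raise ValueError("active_households must be positive")
    return total_monthly_cost_usd / active_households


# Illustrative IDs and totals only, not real data.
family = classify_product("120363000000000000@g.us")
solo = classify_product("972500000000@s.whatsapp.net")
bare = classify_product("972500000000")
cost = per_household_cost(50.0, 100)
```

With these illustrative inputs, the three chats classify as Family, Solo and Solo, and the per-household cost is $0.50.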

Baselines to compare against

~$0.50/household/month two-stage vs ~$1.62 all-Sonnet.
1:1 ~$27.50 / 1K actions (64% silent-Sonnet) vs group ~$16.90 / 1K.
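To make the baselines concrete, the sketch below works out what they imply: the saving of two-stage over all-Sonnet, and the per-action cost for each product. The figures are the approximate ones quoted above; the variable and function names are hypothetical.

```python
# Baselines quoted above (approximate, USD).
TWO_STAGE_PER_HOUSEHOLD = 0.50
ALL_SONNET_PER_HOUSEHOLD = 1.62
SOLO_PER_1K_ACTIONS = 27.50
GROUP_PER_1K_ACTIONS = 16.90


def relative_change(measured: float, baseline: float) -> float:
    """Fractional deviation of a measured value from its baseline."""
    return (measured - baseline) / baseline


# Two-stage costs about 69% less than all-Sonnet per household.
two_stage_saving = -relative_change(TWO_STAGE_PER_HOUSEHOLD, ALL_SONNET_PER_HOUSEHOLD)

# Per-action cost: divide the per-1K figure by 1000.
solo_per_action = SOLO_PER_1K_ACTIONS / 1000
group_per_action = GROUP_PER_1K_ACTIONS / 1000

# A 1:1 action costs roughly 1.6x a group action.
solo_vs_group = SOLO_PER_1K_ACTIONS / GROUP_PER_1K_ACTIONS
```

The same relative_change helper can be pointed at a week's measured per-household cost to see how far it has drifted from the ~$0.50 baseline.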

Margin-creep watchlist

Flag when any of these increases per-call token counts:

Google-calendar context injection (buildCalendarContextBlock, ≤200 tokens but per qualifying turn).
1:1 conversation-history fetch into the E5 extractor / Sonnet.
SHARED_* / prompt growth.
Cloud API Meta Utility template fees (~$0.005–0.015/conversation) once proactive sends move off the ambient-chatter subsidy (master-plan H0).
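One way to operationalize the watchlist is a week-over-week check on average tokens per call for each stage. The sketch below is an assumption about how that could look: the 10% threshold, the stage labels and the sample numbers are all hypothetical.

```python
def flag_token_growth(prev_avg: dict, curr_avg: dict, threshold: float = 0.10) -> dict:
    """Return {stage: growth} where average per-call tokens grew by more than threshold.

    prev_avg and curr_avg map a stage name to its average tokens per call.
    """
    flagged = {}
    for stage, curr in curr_avg.items():
        prev = prev_avg.get(stage)
        if not prev:
            continue  # new stage or zero baseline: nothing to compare against
        growth = (curr - prev) / prev
        if growth > threshold:
            flagged[stage] = growth
    return flagged


# Hypothetical averages, for illustration only.
last_week = {"haiku_classifier": 1200.0, "sonnet_reply": 3000.0}
this_week = {"haiku_classifier": 1230.0, "sonnet_reply": 3600.0}
flagged = flag_token_growth(last_week, this_week)
```

Here only sonnet_reply is flagged (20% growth); the classifier's 2.5% growth stays under the threshold.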

Output + the unit metric

Produce a one-page weekly summary; the headline number Koren reports up is the margin (price − per-household cost).

Combine with first-try-scoreboard to report cost per correctly-resolved-first-try message — the real unit. A cheaper bot that resolves less is not cheaper.
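The two numbers in this section are simple ratios, sketched below. The price (~$4/month) and the ~$0.50 per-household baseline come from the sections above; the two bots and their spend and resolution counts are made-up numbers used only to illustrate the point that a cheaper bot resolving less is not cheaper.

```python
def margin(price_usd: float, per_household_cost_usd: float) -> float:
    """Headline number: price minus per-household LLM cost."""
    return price_usd - per_household_cost_usd


def cost_per_first_try_resolved(total_cost_usd: float, resolved_first_try: int) -> float:
    """Spend divided by the count of messages resolved correctly on the first try."""
    if resolved_first_try <= 0:
        raise ValueError("resolved_first_try must be positive")
    return total_cost_usd / resolved_first_try


# Premium at ~$4/month against the ~$0.50 two-stage baseline.
headline = margin(4.00, 0.50)

# Hypothetical comparison: bot B spends less in total but resolves fewer messages.
bot_a = cost_per_first_try_resolved(10.00, 800)
bot_b = cost_per_first_try_resolved(8.00, 500)
b_is_actually_cheaper = bot_b < bot_a
```

In this illustration bot B spends 20% less overall but costs more per resolved message ($0.016 vs $0.0125), so b_is_actually_cheaper is False.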

Running the report

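-- Weekly spend by model and product type
-- Assumes ai_usage exposes model, group_id, token, cost_usd and created_at columns;
-- if it only has household_id / usage_date / message_count, use the ground-truth tools below.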
SELECT
  model,
  CASE
    WHEN split_part(group_id, '@', 2) = 'g.us' THEN 'Family'
    ELSE 'Solo'
  END AS product,
  SUM(input_tokens)    AS input_tokens,
  SUM(cached_tokens)   AS cached_tokens,
  SUM(output_tokens)   AS output_tokens,
  SUM(cost_usd)        AS cost_usd
FROM ai_usage
WHERE created_at >= now() - interval '7 days'
GROUP BY 1, 2
ORDER BY cost_usd DESC;

-- Per-household monthly cost
SELECT
  household_id,
  SUM(cost_usd) AS monthly_cost_usd
FROM ai_usage
WHERE created_at >= date_trunc('month', now())
GROUP BY household_id
ORDER BY monthly_cost_usd DESC;

When to run this

When a feature PR lands that touches prompt size or adds an LLM call, run this to size the per-household delta BEFORE it ships at scale.
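Sizing a feature's per-household delta is a short calculation: extra tokens per call, times calls per household per month, times the token price. The sketch below uses the ≤200-token calendar block from the watchlist and the $3 per million input tokens quoted for Sonnet elsewhere on this page; the 300 qualifying turns per household per month is a hypothetical volume, and the function name is not part of the skill.

```python
def monthly_delta_per_household(
    extra_input_tokens_per_call: int,
    calls_per_household_per_month: int,
    usd_per_million_input_tokens: float,
) -> float:
    """Added monthly input-token cost per household for a prompt-size change."""
    extra_tokens = extra_input_tokens_per_call * calls_per_household_per_month
    return extra_tokens * usd_per_million_input_tokens / 1_000_000


# 200 extra tokens on 300 turns (hypothetical volume) at $3 per million input tokens.
delta = monthly_delta_per_household(200, 300, 3.0)

# Express the delta against the ~$0.50/household/month baseline.
share_of_baseline = delta / 0.50
```

Under these assumptions the feature adds $0.18 per household per month, or 36% on top of the ~$0.50 baseline, which is the kind of jump worth catching before it ships at scale.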

Ground-truth tools (the DB has NO tokens/$ — use these instead)

ai_usage only stores household_id / usage_date / message_count, so $ must come from Anthropic, not Supabase. Two scripts cover it:

scripts/analyze_token_csvs.py — the workhorse. Export the per-day token CSV from the Console (Usage → download) to ~/Downloads/claude_api_tokens_YYYY_MM.csv, then py -3.13 scripts/analyze_token_csvs.py. Reconciles to the dashboard total and breaks spend down by month × model, month × api_key, and cache read:write savings — how you tell a model swap from a caching effect from token-bloat.
scripts/anthropic_usage_report.py — same breakdown live via the Admin Usage/Cost API, IF you have an sk-ant-admin... key. NOTE: individual-tier orgs can't create admin keys (the API-keys page only issues workspace sk-ant-api... keys) — fall back to the CSV script.

Validated baseline (2026-06-08; bot = api_key my-new-key ≈ 100% of spend): the Sonnet 4.6 swap was cost-neutral ($3/$15, same as Sonnet 4); prompt caching saves ~$40/mo (read:write ~1.3×, net-positive); the per-message cost is driven by dynamic per-call context bloat (E5-shadow Haiku + calendar/history injection) and falls as volume grows (cache amortization). Watch cost ÷ first-try-resolved count.
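Why a read:write ratio of ~1.3× is net-positive can be sketched with a little arithmetic. The multipliers below (cache writes billed at 1.25× the base input price, cache reads at 0.1×) are assumptions taken from Anthropic's published prompt-caching pricing for the default cache lifetime and should be checked against the current price list; reading "read:write" as cache-read tokens divided by cache-write tokens is also an assumption.

```python
CACHE_WRITE_MULTIPLIER = 1.25  # assumed: cache write vs base input price
CACHE_READ_MULTIPLIER = 0.10   # assumed: cache read vs base input price


def caching_savings_usd(write_tokens: float, read_tokens: float, usd_per_million: float) -> float:
    """Savings from caching versus sending the same tokens uncached."""
    uncached = write_tokens + read_tokens
    cached = write_tokens * CACHE_WRITE_MULTIPLIER + read_tokens * CACHE_READ_MULTIPLIER
    return (uncached - cached) * usd_per_million / 1_000_000


# Break-even read:write ratio: the write premium divided by the read discount.
break_even_ratio = (CACHE_WRITE_MULTIPLIER - 1) / (1 - CACHE_READ_MULTIPLIER)

# 1M cache-write tokens with a 1.3x read:write ratio at $3 per million input tokens.
savings = caching_savings_usd(1_000_000, 1_300_000, 3.0)
```

Under these assumptions caching pays for itself once reads exceed roughly 0.28× writes, so a 1.3× ratio is comfortably net-positive: about $2.76 saved per million tokens written at the $3 input price.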

Safety

No risk patterns found

Automated static scan of the SKILL.md and repo. A flag describes what the skill can do — not a verdict. Always review code before installing.

Source & maintenance

Updated: 26d ago

Loads: ~970 tokens
