claude-stats

Local Claude Code usage analytics. Pulls daily cost & token totals from ccusage and per-conversation detail by walking ~/.claude/projects/**/*.jsonl. Stores in a single DuckDB file. Surfaces everything through one interactive HTML dashboard.

CLI commands

Fish function at ~/.config/fish/functions/claude-stats.fish. All read-only against ~/Documents/claude-stats/claude.duckdb unless noted. Four subcommands plus help.

The display surface (interactive)

claude-stats dashboard [days=30]
The only display command. Calls build-dashboard.sh to regenerate /tmp/claude-stats-dashboard.html, then xdg-opens it. Plotly charts: cost stacked-by-model, daily token mix, cache hit rate, top tools, cost-by-model donut, top projects, context-window distribution. Pass 30 / 90 / 365 for different windows.

Maintenance (non-display)

claude-stats ingest [SINCE]
Manual ingest-daily.sh trigger. SINCE = YYYYMMDD, default = 8 days ago. Idempotent UPSERT — safe to re-run.
claude-stats ingest-sessions
Manual ingest-sessions.py trigger. Walks all JSONLs and refreshes the conversation/tool/project tables.
claude-stats raw <SQL>
Read+write SQL escape hatch. The only command that opens the DB without -readonly. Reserved for one-off debugging.
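The SINCE default for claude-stats ingest above is "8 days ago, formatted YYYYMMDD". A sketch of that computation (the actual default is computed inside ingest-daily.sh, likely in shell; this is just the date math):

```python
from datetime import date, timedelta

# Default SINCE window: 8 days ago, formatted YYYYMMDD
# as ingest-daily.sh expects.
def default_since(today: date) -> str:
    return (today - timedelta(days=8)).strftime("%Y%m%d")

print(default_since(date(2025, 3, 10)))  # → 20250302
```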
Removed in the slim-down
Earlier versions had a dozen terminal-report subcommands (summary, month, year, top, models, cache, compaction, tools, repos, sessions). They were dropped — the dashboard covers all of them with better visual density. If you need a specific number for scripting, use claude-stats raw "<SQL>".

How ingest-daily.sh works

Pulls per-day cost & token totals from ccusage (Anthropic's open-source usage CLI), reshapes via jq, UPSERTs into two tables: daily_usage (date × model) and project_daily_usage (project × date × model, via --instances --breakdown). Runs nightly at 02:00 via systemd timer.

  • source: 2× npx ccusage daily (plain + --instances --breakdown)
  • reshape: 2× jq → CSV (flatten modelBreakdowns)
  • load: DuckDB, 2× INSERT OR REPLACE
  • targets: daily_usage, project_daily_usage
The jq reshape (one row per day × model)
# ccusage emits nested JSON per day with .modelBreakdowns[]
# jq flattens to one CSV row per (day, model)
.daily[] as $d
| $d.modelBreakdowns[]
| [
    $d.date,
    .modelName,
    (.inputTokens // 0),
    (.outputTokens // 0),
    (.cacheCreationTokens // 0),
    (.cacheReadTokens // 0),
    (.cost // 0)
  ]
| @csv
The DuckDB UPSERT (idempotent)
CREATE OR REPLACE TEMP TABLE staging AS
SELECT * FROM read_csv('$CSV',
  header=false,
  columns={
    'date': 'DATE',
    'model': 'VARCHAR',
    'input_tokens': 'BIGINT',
    ...
  });

INSERT OR REPLACE INTO daily_usage
SELECT date, model, input_tokens, output_tokens,
       cache_creation_tokens, cache_read_tokens, cost,
       CURRENT_TIMESTAMP
FROM staging;
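The idempotency rests on INSERT OR REPLACE keying on the table's primary key: re-running the same ingest rewrites rows instead of duplicating them. A stand-in sketch using Python's stdlib sqlite3, which shares the INSERT OR REPLACE syntax with DuckDB (table trimmed to three columns, illustrative values only):

```python
import sqlite3

# Minimal stand-in for the daily_usage UPSERT: INSERT OR REPLACE
# keyed on the (date, model) primary key is a rewrite, not an append.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE daily_usage (
        date  TEXT,
        model TEXT,
        cost  REAL,
        PRIMARY KEY (date, model)
    )""")

row = ("2025-03-01", "claude-opus-4", 1.23)
for _ in range(3):  # re-running the same ingest is a no-op, not a duplicate
    db.execute("INSERT OR REPLACE INTO daily_usage VALUES (?, ?, ?)", row)

print(db.execute("SELECT COUNT(*) FROM daily_usage").fetchone()[0])  # → 1
```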
Cron-safe PATH
The script exports an explicit PATH at the top because cron / systemd timers don't inherit your shell's PATH: ~/.nvm/versions/node/v24.14.1/bin:/usr/bin:/bin:$PATH. Without this, npx isn't found and the ingest silently fails. If you upgrade Node via nvm, update this path.

How ingest-sessions.py works

Walks every .jsonl conversation transcript under ~/.claude/projects/, parses message events & tool-use blocks, then UPSERTs into three tables. Per-project cost now lives in project_daily_usage (populated by ingest-daily.sh); this script no longer touches cost.

  • walk: ~/.claude/projects/**/*.jsonl
  • classify: main vs subagent, by path depth
  • parse: per-line JSON; count msgs, tools, ctx
  • stage: 3 CSVs in /tmp (conv / tools / skills)
  • load: DuckDB, 3× INSERT OR REPLACE

Path classification

Two JSONL shapes exist under ~/.claude/projects/<project>/:

  • Main: <project>/<uuid>.jsonl → kind = 'main'
  • Subagent: <project>/<parent_uuid>/subagents/<agent-id>.jsonl → kind = 'subagent' with parent_session_id set
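The depth check can be sketched as follows (paths relative to ~/.claude/projects/; the real logic lives in ingest-sessions.py):

```python
from pathlib import Path

# Classify a transcript path (relative to ~/.claude/projects/) by depth:
#   <project>/<uuid>.jsonl                       → main
#   <project>/<parent>/subagents/<agent>.jsonl   → subagent
def classify(rel: Path) -> dict:
    parts = rel.parts
    if len(parts) == 2:
        return {"kind": "main", "parent_session_id": None}
    if len(parts) == 4 and parts[2] == "subagents":
        return {"kind": "subagent", "parent_session_id": parts[1]}
    raise ValueError(f"unrecognized transcript layout: {rel}")

print(classify(Path("my-proj/0f3c.jsonl")))
# → {'kind': 'main', 'parent_session_id': None}
```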

Per-line parse

Each JSONL line is a typed event. The parser tracks:

  • started_at / ended_at = min/max timestamp across lines
  • cwd / gitBranch = first non-null occurrence (sticky)
  • type=user → increment n_user_msgs
  • type=assistant → increment n_assistant_msgs; track peak context (input + cache_create + cache_read); count tool-use blocks in content[]; for blocks with name=Skill, also count input.skill into a per-session skill tally
  • type=summary → first one wins, capped at 500 chars
  • Files with zero user+assistant msgs are skipped
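Condensed, the per-line fold looks like this. The usage field names and event shapes are assumptions about the JSONL format, not a transcription of ingest-sessions.py:

```python
import json

# Fold one transcript's JSONL lines into the tracked per-session fields.
def parse_transcript(lines):
    s = {"started_at": None, "ended_at": None, "n_user_msgs": 0,
         "n_assistant_msgs": 0, "max_context_tokens": 0, "summary": None}
    for raw in lines:
        ev = json.loads(raw)
        ts = ev.get("timestamp")  # ISO-8601 strings compare lexicographically
        if ts:
            s["started_at"] = ts if s["started_at"] is None else min(s["started_at"], ts)
            s["ended_at"] = ts if s["ended_at"] is None else max(s["ended_at"], ts)
        kind = ev.get("type")
        if kind == "user":
            s["n_user_msgs"] += 1
        elif kind == "assistant":
            s["n_assistant_msgs"] += 1
            u = ev.get("message", {}).get("usage", {})  # assumed shape
            ctx = (u.get("input_tokens", 0)
                   + u.get("cache_creation_input_tokens", 0)
                   + u.get("cache_read_input_tokens", 0))
            s["max_context_tokens"] = max(s["max_context_tokens"], ctx)
        elif kind == "summary" and s["summary"] is None:  # first one wins
            s["summary"] = ev.get("summary", "")[:500]
    return s
```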

Storage

  • conversations — one row per JSONL (PK: session_id = filename stem)
  • conversation_tool_usage — one row per (session, tool); aggregated call_count
  • conversation_skill_usage — one row per (session, skill); skill names are <plugin>:<name> for plugin-namespaced skills, or bare <name> for personal skills
  • project_daily_usage (v4) — per-(project_path, date, model) cost from ccusage daily --instances --breakdown --json, populated alongside daily_usage in the same ingest-daily.sh run. Same 8-day rolling-rewrite discipline: rows older than 8 days are immutable. Sums to daily_usage exactly for any (date, model) where ccusage could see the JSONLs at ingest time. The dashboard's "Top projects" panel windows over this table.
claude-clean hook
Your claude-clean fish function runs ingest-sessions.py before any per-project deletion, so transcripts that are about to be wiped are still captured. The DuckDB conversations table outlives the JSONL files.

How build-dashboard.sh works

Self-contained HTML generator. Runs 11 DuckDB queries, embeds the results as a JSON literal in the page, ships Plotly via CDN, and lets the browser render everything client-side. No build step, no server.

  1. query: 11× DuckDB, -readonly -json
  2. emit: HEREDOC HTML (CSS + structure)
  3. inject: printf 'key: %s', JSON inline
  4. open: xdg-open; Plotly renders

The 11 queries embedded as DATA

stats
6 KPI numbers (total cost, active days, total tokens, cache hit %, main sessions, tool calls).
dailyCost
Per (date × model) for the window — feeds the stacked-bar chart.
dailyTokens
Per date: input / output / cache_create / cache_read — feeds the token mix stacked bar.
byModel
Cost per model over the window — feeds the donut.
cacheTrend
Cache hit % per date — feeds the spline + area chart.
topTools
Top 15 tools by total calls, with session count.
topSkills
Top 20 skills by total calls, split into plugin + skill for grouped/colored bars.
topRepos
Top 12 projects in the window, from project_daily_usage. Cost is windowed (SUM(cost) WHERE date >= window GROUP BY project_path) and reconciles to the cent against daily_usage for any date that was captured at ingest time.
attribution
Total/attributed/unattributed/% for the window. Surfaces how much of the windowed spend can still be tied to a project (vs. lost when JSONLs were claude-clean'd before being captured).
contextDist
Sessions bucketed by % of model context window used.
sessions
Most recent 25 main sessions — feeds the bottom table.
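The contextDist query amounts to a percentage histogram over conversations.max_context_tokens. A sketch, assuming a 200k-token window and these bucket edges (both assumptions; the real query may derive the window per model):

```python
# Bucket a session by % of an assumed 200k-token context window.
CONTEXT_WINDOW = 200_000           # assumption: window size
EDGES = [0, 25, 50, 75, 90, 100]   # assumption: bucket boundaries

def context_bucket(max_context_tokens: int) -> str:
    pct = 100 * max_context_tokens / CONTEXT_WINDOW
    for lo, hi in zip(EDGES, EDGES[1:]):
        if pct < hi:
            return f"{lo}-{hi}%"
    return f"{EDGES[-2]}-{EDGES[-1]}%"  # clamp anything at/above the window

print(context_bucket(150_000))  # → 75-90%
```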

AMOLED theme

Color tokens used by both the dashboard and this docs site (single source of truth in docs/_styles.css):

--bg:        #000000   /* true black for OLED panels */
--panel:     #0a0a0a   /* card background */
--primary:   #a78bfa   /* electric purple — opus, accents */
--secondary: #22d3ee   /* cyan — sonnet, secondary accent */
--good:      #22c55e   /* green — haiku, healthy state */
--warn:      #f59e0b   /* amber */
--bad:       #ef4444   /* red — compaction risk */
--pink:      #ec4899   /* fourth distinct series color */

Stable per-model coloring is done in JS via colorForModel(m) — opus → primary, sonnet → secondary, haiku → good, others → warn.
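colorForModel is implemented in the dashboard's inline JS; an equivalent Python sketch of the substring mapping described above (the substring-match rule is an assumption from that description):

```python
# Stable per-model coloring: match on model-name substring,
# falling back to the warn color for unknown models.
PALETTE = {
    "opus":   "#a78bfa",  # --primary
    "sonnet": "#22d3ee",  # --secondary
    "haiku":  "#22c55e",  # --good
}

def color_for_model(model: str) -> str:
    m = model.lower()
    for family, color in PALETTE.items():
        if family in m:
            return color
    return "#f59e0b"  # --warn

print(color_for_model("claude-opus-4"))  # → #a78bfa
```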

Schema

Five tables. Source-of-truth at ~/Documents/pulse/claude-stats/schema.sql. Re-applied idempotently on each install.sh run (uses CREATE TABLE IF NOT EXISTS).

daily_usage (populated by ingest-daily.sh)

column                  type       notes
date                    DATE       Day of usage; PK component
model                   VARCHAR    e.g. claude-opus-4-7; PK component
input_tokens            BIGINT     Uncached input tokens
output_tokens           BIGINT     Generated tokens
cache_creation_tokens   BIGINT     Tokens written to prompt cache
cache_read_tokens       BIGINT     Tokens served from cache (cheap)
cost                    DOUBLE     USD, computed by ccusage
ingested_at             TIMESTAMP  Last UPSERT time (UTC)

project_daily_usage (populated by ingest-daily.sh --instances)

column                        type       notes
project_path                  VARCHAR    Slugified project dir; PK component (matches conversations.project_path)
date                          DATE       Day of usage; PK component
model                         VARCHAR    PK component
input/output/cache_*_tokens   BIGINT     Per (project, date, model) totals
cost                          DOUBLE     USD; SUM over project_path = daily_usage.cost for the same (date, model) when ccusage could see the JSONLs at ingest time
ingested_at                   TIMESTAMP  Last UPSERT time (UTC)

conversations (one row per JSONL transcript)

column                                          type       notes
session_id                                      VARCHAR    PK; filename stem (UUID)
project_path                                    VARCHAR    Slugified parent dir
cwd                                             VARCHAR    First-seen CWD from JSONL events
git_branch                                      VARCHAR    First-seen branch
kind                                            VARCHAR    'main' or 'subagent'
parent_session_id                               VARCHAR    NULL for main; UUID for subagent
started_at / ended_at                           TIMESTAMP  UTC; min/max event timestamps
n_user_msgs / n_assistant_msgs / n_tool_calls   INTEGER    Counts
max_context_tokens                              BIGINT     Peak context size seen in any assistant turn
model                                           VARCHAR    Model in use at the peak-context turn
summary                                         VARCHAR    First type=summary event, capped at 500 chars

conversation_tool_usage

column                  type     notes
session_id, tool_name   VARCHAR  Composite PK
call_count              INTEGER  Total invocations of that tool in that session

conversation_skill_usage

column                   type     notes
session_id, skill_name   VARCHAR  Composite PK. skill_name is <plugin>:<name> or bare <name>
call_count               INTEGER  Total invocations of that skill in that session