claude-stats

Local Claude Code usage analytics. Pulls daily cost & token totals from ccusage and per-conversation detail by walking ~/.claude/projects/**/*.jsonl. Stores in a single DuckDB file. Surfaces everything through one interactive HTML dashboard.

CLI commands

Fish function at ~/.config/fish/functions/claude-stats.fish. All read-only against ~/Documents/claude-stats/claude.duckdb unless noted. Four subcommands plus help.

The display surface (interactive)

claude-stats dashboard [days=30]
The only display command. Calls build-dashboard.sh to regenerate /tmp/claude-stats-dashboard.html, then xdg-opens it. Plotly charts: cost stacked-by-model, daily token mix, cache hit rate, top tools, cost-by-model donut, top projects, context-window distribution. Pass 30 / 90 / 365 for different windows.

Maintenance (non-display)

claude-stats ingest [SINCE]
Manual ingest-daily.sh trigger. SINCE = YYYYMMDD, default = 8 days ago. Idempotent UPSERT — safe to re-run.
claude-stats ingest-sessions
Manual ingest-sessions.py trigger. Walks all JSONLs and refreshes the conversation/tool/project tables.
claude-stats raw <SQL>
Read+write SQL escape hatch. The only command that opens the DB without -readonly. Reserved for one-off debugging.
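The SINCE default for claude-stats ingest above is "8 days ago, formatted YYYYMMDD". A sketch of that computation (the actual default is computed inside ingest-daily.sh, likely in shell; this is just the date math):

```python
from datetime import date, timedelta

# Default SINCE window: 8 days ago, formatted YYYYMMDD
# as ingest-daily.sh expects.
def default_since(today: date) -> str:
    return (today - timedelta(days=8)).strftime("%Y%m%d")

print(default_since(date(2025, 3, 10)))  # → 20250302
```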
Removed in the slim-down
Earlier versions had a dozen terminal-report subcommands (summary, month, year, top, models, cache, compaction, tools, repos, sessions). They were dropped — the dashboard covers all of them with better visual density. If you need a specific number for scripting, use claude-stats raw "<SQL>".

How ingest-daily.sh works

Pulls per-day cost & token totals from ccusage (Anthropic's open-source usage CLI), reshapes via jq, UPSERTs into two tables: daily_usage (date × model) and project_daily_usage (project × date × model, via --instances --breakdown). Runs nightly at 02:00 via systemd timer.

  • source: 2× npx ccusage daily (plain + --instances --breakdown)
  • reshape: 2× jq → CSV (flatten modelBreakdowns)
  • load: DuckDB, 2× INSERT OR REPLACE
  • targets: daily_usage, project_daily_usage
The jq reshape (one row per day × model)
# ccusage emits nested JSON per day with .modelBreakdowns[]
# jq flattens to one CSV row per (day, model)
.daily[] as $d
| $d.modelBreakdowns[]
| [
    $d.date,
    .modelName,
    (.inputTokens // 0),
    (.outputTokens // 0),
    (.cacheCreationTokens // 0),
    (.cacheReadTokens // 0),
    (.cost // 0)
  ]
| @csv
The DuckDB UPSERT (idempotent)
CREATE OR REPLACE TEMP TABLE staging AS
SELECT * FROM read_csv('$CSV',
  header=false,
  columns={
    'date': 'DATE',
    'model': 'VARCHAR',
    'input_tokens': 'BIGINT',
    ...
  });

INSERT OR REPLACE INTO daily_usage
SELECT date, model, input_tokens, output_tokens,
       cache_creation_tokens, cache_read_tokens, cost,
       CURRENT_TIMESTAMP
FROM staging;
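The idempotency rests on INSERT OR REPLACE keying on the table's primary key: re-running the same ingest rewrites rows instead of duplicating them. A stand-in sketch using Python's stdlib sqlite3, which shares the INSERT OR REPLACE syntax with DuckDB (table trimmed to three columns, illustrative values only):

```python
import sqlite3

# Minimal stand-in for the daily_usage UPSERT: INSERT OR REPLACE
# keyed on the (date, model) primary key is a rewrite, not an append.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE daily_usage (
        date  TEXT,
        model TEXT,
        cost  REAL,
        PRIMARY KEY (date, model)
    )""")

row = ("2025-03-01", "claude-opus-4", 1.23)
for _ in range(3):  # re-running the same ingest is a no-op, not a duplicate
    db.execute("INSERT OR REPLACE INTO daily_usage VALUES (?, ?, ?)", row)

print(db.execute("SELECT COUNT(*) FROM daily_usage").fetchone()[0])  # → 1
```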
Cron-safe PATH
The script exports an explicit PATH at the top because cron / systemd timers don't inherit your shell's PATH: ~/.nvm/versions/node/v24.14.1/bin:/usr/bin:/bin:$PATH. Without this, npx isn't found and the ingest silently fails. If you upgrade Node via nvm, update this path.

How ingest-sessions.py works

Walks every .jsonl conversation transcript under ~/.claude/projects/, parses message events & tool-use blocks, then UPSERTs into three tables. Per-project cost now lives in project_daily_usage (populated by ingest-daily.sh); this script no longer touches cost.

  • walk: ~/.claude/projects/**/*.jsonl
  • classify: main vs subagent, by path depth
  • parse: per-line JSON; count msgs, tools, ctx
  • stage: 3 CSVs in /tmp (conv / tools / skills)
  • load: DuckDB, 3× INSERT OR REPLACE

Path classification

Two JSONL shapes exist under ~/.claude/projects/<project>/:

  • Main: <project>/<uuid>.jsonl → kind = 'main'
  • Subagent: <project>/<parent_uuid>/subagents/<agent-id>.jsonl → kind = 'subagent' with parent_session_id set
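The depth check can be sketched as follows (paths relative to ~/.claude/projects/; the real logic lives in ingest-sessions.py):

```python
from pathlib import Path

# Classify a transcript path (relative to ~/.claude/projects/) by depth:
#   <project>/<uuid>.jsonl                       → main
#   <project>/<parent>/subagents/<agent>.jsonl   → subagent
def classify(rel: Path) -> dict:
    parts = rel.parts
    if len(parts) == 2:
        return {"kind": "main", "parent_session_id": None}
    if len(parts) == 4 and parts[2] == "subagents":
        return {"kind": "subagent", "parent_session_id": parts[1]}
    raise ValueError(f"unrecognized transcript layout: {rel}")

print(classify(Path("my-proj/0f3c.jsonl")))
# → {'kind': 'main', 'parent_session_id': None}
```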

Per-line parse

Each JSONL line is a typed event. The parser tracks:

  • started_at / ended_at = min/max timestamp across lines
  • cwd / gitBranch = first non-null occurrence (sticky)
  • type=user → increment n_user_msgs
  • type=assistant → increment n_assistant_msgs; track peak context (input + cache_create + cache_read); count tool-use blocks in content[]; for blocks with name=Skill, also count input.skill into a per-session skill tally
  • type=summary → first one wins, capped at 500 chars
  • Files with zero user+assistant msgs are skipped
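Condensed, the per-line fold looks like this. The usage field names and event shapes are assumptions about the JSONL format, not a transcription of ingest-sessions.py:

```python
import json

# Fold one transcript's JSONL lines into the tracked per-session fields.
def parse_transcript(lines):
    s = {"started_at": None, "ended_at": None, "n_user_msgs": 0,
         "n_assistant_msgs": 0, "max_context_tokens": 0, "summary": None}
    for raw in lines:
        ev = json.loads(raw)
        ts = ev.get("timestamp")  # ISO-8601 strings compare lexicographically
        if ts:
            s["started_at"] = ts if s["started_at"] is None else min(s["started_at"], ts)
            s["ended_at"] = ts if s["ended_at"] is None else max(s["ended_at"], ts)
        kind = ev.get("type")
        if kind == "user":
            s["n_user_msgs"] += 1
        elif kind == "assistant":
            s["n_assistant_msgs"] += 1
            u = ev.get("message", {}).get("usage", {})  # assumed shape
            ctx = (u.get("input_tokens", 0)
                   + u.get("cache_creation_input_tokens", 0)
                   + u.get("cache_read_input_tokens", 0))
            s["max_context_tokens"] = max(s["max_context_tokens"], ctx)
        elif kind == "summary" and s["summary"] is None:  # first one wins
            s["summary"] = ev.get("summary", "")[:500]
    return s
```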

Storage

  • conversations — one row per JSONL (PK: session_id = filename stem)
  • conversation_tool_usage — one row per (session, tool); aggregated call_count
  • conversation_skill_usage — one row per (session, skill); skill names are <plugin>:<name> for plugin-namespaced skills, or bare <name> for personal skills
  • project_daily_usage (v4) — per-(project_path, date, model) cost from ccusage daily --instances --breakdown --json, populated alongside daily_usage in the same ingest-daily.sh run. Same 8-day rolling-rewrite discipline: rows older than 8 days are immutable. Sums to daily_usage exactly for any (date, model) where ccusage could see the JSONLs at ingest time. The dashboard's "Top projects" panel windows over this table.
claude-clean hook
Your claude-clean fish function runs ingest-sessions.py before any per-project deletion, so transcripts that are about to be wiped are still captured. The DuckDB conversations table outlives the JSONL files.

How build-dashboard.sh works

Self-contained HTML generator. Runs 11 DuckDB queries, embeds the results as a JSON literal in the page, ships Plotly via CDN, and lets the browser render everything client-side. No build step, no server.

  1. query: 11× DuckDB, -readonly -json
  2. emit: HEREDOC HTML (CSS + structure)
  3. inject: printf 'key: %s', JSON inline
  4. open: xdg-open; Plotly renders

The 11 queries embedded as DATA

stats
6 KPI numbers (total cost, active days, total tokens, cache hit %, main sessions, tool calls).
dailyCost
Per (date × model) for the window — feeds the stacked-bar chart.
dailyTokens
Per date: input / output / cache_create / cache_read — feeds the token mix stacked bar.
byModel
Cost per model over the window — feeds the donut.
cacheTrend
Cache hit % per date — feeds the spline + area chart.
topTools
Top 15 tools by total calls, with session count.
topSkills
Top 20 skills by total calls, split into plugin + skill for grouped/colored bars.
topRepos
Top 12 projects in the window, from project_daily_usage. Cost is windowed (SUM(cost) WHERE date >= window GROUP BY project_path) and reconciles to the cent against daily_usage for any date that was captured at ingest time.
attribution
Total/attributed/unattributed/% for the window. Surfaces how much of the windowed spend can still be tied to a project (vs. lost when JSONLs were claude-clean'd before being captured).
contextDist
Sessions bucketed by % of model context window used.
sessions
Most recent 25 main sessions — feeds the bottom table.
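The contextDist query amounts to a percentage histogram over conversations.max_context_tokens. A sketch, assuming a 200k-token window and these bucket edges (both assumptions; the real query may derive the window per model):

```python
# Bucket a session by % of an assumed 200k-token context window.
CONTEXT_WINDOW = 200_000           # assumption: window size
EDGES = [0, 25, 50, 75, 90, 100]   # assumption: bucket boundaries

def context_bucket(max_context_tokens: int) -> str:
    pct = 100 * max_context_tokens / CONTEXT_WINDOW
    for lo, hi in zip(EDGES, EDGES[1:]):
        if pct < hi:
            return f"{lo}-{hi}%"
    return f"{EDGES[-2]}-{EDGES[-1]}%"  # clamp anything at/above the window

print(context_bucket(150_000))  # → 75-90%
```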

AMOLED theme

Color tokens used by both the dashboard and this docs site (single source of truth in docs/_styles.css):

--bg:        #000000   /* true black for OLED panels */
--panel:     #0a0a0a   /* card background */
--primary:   #a78bfa   /* electric purple — opus, accents */
--secondary: #22d3ee   /* cyan — sonnet, secondary accent */
--good:      #22c55e   /* green — haiku, healthy state */
--warn:      #f59e0b   /* amber */
--bad:       #ef4444   /* red — compaction risk */
--pink:      #ec4899   /* fourth distinct series color */

Stable per-model coloring is done in JS via colorForModel(m) — opus → primary, sonnet → secondary, haiku → good, others → warn.
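colorForModel is implemented in the dashboard's inline JS; an equivalent Python sketch of the substring mapping described above (the substring-match rule is an assumption from that description):

```python
# Stable per-model coloring: match on model-name substring,
# falling back to the warn color for unknown models.
PALETTE = {
    "opus":   "#a78bfa",  # --primary
    "sonnet": "#22d3ee",  # --secondary
    "haiku":  "#22c55e",  # --good
}

def color_for_model(model: str) -> str:
    m = model.lower()
    for family, color in PALETTE.items():
        if family in m:
            return color
    return "#f59e0b"  # --warn

print(color_for_model("claude-opus-4"))  # → #a78bfa
```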

Schema

Five tables. Source-of-truth at ~/Documents/pulse/claude-stats/schema.sql. Re-applied idempotently on each install.sh run (uses CREATE TABLE IF NOT EXISTS).

daily_usage (populated by ingest-daily.sh)

column                  type       notes
date                    DATE       Day of usage; PK component
model                   VARCHAR    e.g. claude-opus-4-7; PK component
input_tokens            BIGINT     Uncached input tokens
output_tokens           BIGINT     Generated tokens
cache_creation_tokens   BIGINT     Tokens written to prompt cache
cache_read_tokens       BIGINT     Tokens served from cache (cheap)
cost                    DOUBLE     USD, computed by ccusage
ingested_at             TIMESTAMP  Last UPSERT time (UTC)

project_daily_usage (populated by ingest-daily.sh --instances)

column                        type       notes
project_path                  VARCHAR    Slugified project dir; PK component (matches conversations.project_path)
date                          DATE       Day of usage; PK component
model                         VARCHAR    PK component
input/output/cache_*_tokens   BIGINT     Per (project, date, model) totals
cost                          DOUBLE     USD; SUM over project_path = daily_usage.cost for the same (date, model) when ccusage could see the JSONLs at ingest time
ingested_at                   TIMESTAMP  Last UPSERT time (UTC)

conversations (one row per JSONL transcript)

column                                          type       notes
session_id                                      VARCHAR    PK; filename stem (UUID)
project_path                                    VARCHAR    Slugified parent dir
cwd                                             VARCHAR    First-seen CWD from JSONL events
git_branch                                      VARCHAR    First-seen branch
kind                                            VARCHAR    'main' or 'subagent'
parent_session_id                               VARCHAR    NULL for main; UUID for subagent
started_at / ended_at                           TIMESTAMP  UTC; min/max event timestamps
n_user_msgs / n_assistant_msgs / n_tool_calls   INTEGER    Counts
max_context_tokens                              BIGINT     Peak context size seen in any assistant turn
model                                           VARCHAR    Model in use at the peak-context turn
summary                                         VARCHAR    First type=summary event, capped at 500 chars

conversation_tool_usage

column                  type     notes
session_id, tool_name   VARCHAR  Composite PK
call_count              INTEGER  Total invocations of that tool in that session

conversation_skill_usage

column                   type     notes
session_id, skill_name   VARCHAR  Composite PK. skill_name is <plugin>:<name> or bare <name>
call_count               INTEGER  Total invocations of that skill in that session