Two deterministic brain tools in one evening — pks brain scan and commit-plan

17 empty feature files in .pks/brain/features/. 5522 session JSONL files in ~/.claude/projects/. And one question I kept asking myself:

which of my 5522 session jsonl files actually edited this file?

There are two ways to answer that. The expensive way: throw AI at all 5522 files, let it read them, and politely ask which sessions matter. That's how all my tooling so far had been answering questions like this — because it's the easy way to write a tool. You have an LLM. You feed things in. You get things out.

The cheap way: jq. Each session is a JSONL file. Tool calls are structured. Edited files live in tool_input.file_path. It's a search. Zero tokens. Zero wait. Zero ambiguity, because the code is deterministic.

This post is about how we built both — pks brain scan filepath for deterministic discovery edges, and pks brain commit-plan for grouping uncommitted files and pulling the prompts that produced them — in the same evening, and how the second one ended up catching the first one's own birth session in its graph. Recursive self-discovery, if you want it fancy.

The bigger point is the role split: AI tokens for synthesis, deterministic code for discovery. I've written that rule in many shapes before, but this was the first time it got concrete enough to actually shape how the tool was built.

The first prompt — "lets just implement the tool in pks cli"

I was deep in a long session about pouring structure onto pks-cli using product-cli. Halfway through, we shifted gears. My message:

i think we should just materialize the 17 empty files with frontmatter and then we can spawn seperate childs for each to find relevant sessions that had tool call that edited those files? this is where i think it make sense that we use deterministisk code where pks brain scan filepath --session-files and it simply scan all the 6000 files for toolcalls that edit that file or shell commands where filename is included so we dont wait ai tokens on parsing all sessions. So we can start constructing the graph of sessions that are relevant for the given file?

The reply suggested doing it with bash and jq. That wasn't what I wanted:

my prompt (tmux child: brain-scan-impl), 1 line, 2026-05-24 21:05

but instead of doing it in bash and such, lets just implement the tool in pks cli so we can use that and then we know it works also when done

Built by a tmux-spawned child — monitored from here, tick by tick.

3 tools, 26K tok, 6 min

That sentence is what makes the difference. A bash one-liner is great for validating the idea. But a tool in pks-cli is something I can trust — it has tests, it has a contract, it behaves the same in every devcontainer I run, and it's a command I can document in a skill and tell an agent: "use this, not bash."

The difference between a bash snippet and a CLI command isn't the size of the code. It's the number of places it can go wrong before I notice.

What `scan filepath` actually does

pks brain scan filepath ./Commands/Brain/ --format text

It runs rg (ripgrep) as a prefilter so it doesn't load 5522 files into memory. For each match it parses the JSONL line by line, and for every tool call that touches the file (Edit, Write, Read, or a Bash whose command mentions the path) it emits an edge: Session → ToolCall → File. Pure structure parsing. No LLM.
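A minimal sketch of that scan in Python, not the pks-cli implementation. It assumes a simplified record shape in which each JSONL line is an object with `tool_name` and `tool_input` fields (the real session schema may nest these differently), and it substitutes a plain substring check for the ripgrep prefilter. The names `Edge` and `scan_lines` are hypothetical.

```python
import json
from typing import Iterable, NamedTuple

FILE_TOOLS = {"Edit", "Write", "Read"}  # tools that name a file directly


class Edge(NamedTuple):
    session: str
    tool: str
    path: str


def scan_lines(session: str, lines: Iterable[str], target: str) -> list[Edge]:
    """Emit one Session -> ToolCall -> File edge per matching tool call."""
    edges = []
    for line in lines:
        # Cheap prefilter (stand-in for ripgrep): skip lines that cannot match.
        if target not in line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate truncated or corrupt lines
        if not isinstance(record, dict):
            continue
        tool = record.get("tool_name")        # assumed field name
        tool_input = record.get("tool_input")  # assumed field name
        if not isinstance(tool_input, dict):
            continue
        file_path = str(tool_input.get("file_path", ""))
        command = str(tool_input.get("command", ""))
        if tool in FILE_TOOLS and target in file_path:
            edges.append(Edge(session, tool, file_path))
        elif tool == "Bash" and target in command:
            edges.append(Edge(session, tool, target))
    return edges
```

The prefilter is what keeps the cost down: most lines never reach `json.loads`. If it is too loose, the parse step still rejects non-matching records, so the result stays correct and only the speed suffers.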

Smoke test from the root folder:

41 edge(s) across 2 session(s) across 5522 JSONL(s) scanned.

Milliseconds. It tells me precisely which sessions have spoken to that file. If the answer is "all 5522," it's because my prefilter is too loose and needs fixing — not because the model got confused.

That's the kind of failure I want in my life. Not the kind where the LLM forgot half my 5522 files.

The second idea showed up during commit

Only after scan filepath was green did the next prompt land:

would we not just commit the relevant files for this feature now and then use it for each file in the submodule that is not staged to see why it was made or changed and then maybe do the command to take a list of files and it group them based on if they was made in the same session files and such?

Same move, an octave up. If I can find sessions from a file, I can flip it: take a pile of uncommitted files, group them by the session that authored them, and use that as the basis for commits.

The algorithm became a greedy set cover. Find the session that covers the most uncommitted files. Claim them. Repeat until everything is covered. Each group = one commit, because the files belong to the same piece of work.
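The greedy loop can be sketched in a few lines of Python. This is an illustration under assumptions, not the shipped planner: `touched` is a hypothetical mapping from session id to the set of files that session touched, and ties are broken by session id so the output is reproducible.

```python
def greedy_groups(
    touched: dict[str, set[str]], uncommitted: set[str]
) -> list[tuple[str, set[str]]]:
    """Greedy set cover: repeatedly claim the session covering the most files."""
    remaining = set(uncommitted)
    groups = []
    while remaining:
        # Sorted iteration makes tie-breaking deterministic (smallest id wins).
        best = max(
            sorted(touched),
            key=lambda s: len(touched[s] & remaining),
            default=None,
        )
        if best is None:
            break  # no sessions at all
        claimed = touched[best] & remaining
        if not claimed:
            break  # leftover files that no session touched
        groups.append((best, claimed))
        remaining -= claimed
    return groups
```

Each returned pair is one proposed commit: the winning session and the files it claimed. Note that a session that touched many files wins the first round regardless of when or why it touched them.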

The first time we dogfooded it against pks-cli's own uncommitted files the result was alarmingly good — and a little too good:

That's gold. Session 02f82b87 is huge (7838 entries) and covers things like cross-stream broadcast, ALP business line import + broadcast toggle, and agent-inbox migration. Those prompts tell us intent; the file diff tells us mechanics. Together → good commit message.

But notice: session 02f82b87 is so big it's likely touched those 5 files in service of OTHER work, not authored the current uncommitted diff. The "most files touched" heuristic overweights long-lived sessions.

This is where I could have stopped and smiled at how clever the tool was. I didn't. I caught exactly this: a giant 7838-entry catch-all session that had touched everything was getting credit for every group, simply because it had touched the most files. Classic set-cover bias: the longest-lived session tends to win the first greedy step.

The code wasn't wrong. The heuristic was wrong.

Patch one: switch from "most touched" to "last edit author"

The follow-up suggestion was clean:

Switch the heuristic from "touched most files" to "last-edit-author per file, aggregated" — more accurate for uncommitted work

The thinking: for uncommitted files what's interesting is "who wrote this version" — not "who has touched the file most over time." Last edit author is the session that performed the most recent write/edit before commit. Per file. Aggregate up. Group.
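As a rough Python sketch of that heuristic (an illustration, not the planner's actual code): assume every write or edit has already been reduced to a hypothetical `(timestamp, session, file)` tuple, with timestamps in one consistent ISO-8601 format so that string comparison matches chronological order.

```python
from collections import defaultdict


def group_by_last_author(
    writes: list[tuple[str, str, str]]
) -> dict[str, list[str]]:
    """Group files by the session that performed the most recent write."""
    last: dict[str, tuple[str, str]] = {}  # file -> (timestamp, session)
    for ts, session, path in writes:
        # Keep only the latest write per file; the first seen wins on ties.
        if path not in last or ts > last[path][0]:
            last[path] = (ts, session)
    groups: dict[str, list[str]] = defaultdict(list)
    for path, (_, session) in sorted(last.items()):
        groups[session].append(path)
    return dict(groups)
```

A long-lived session that touched a file early no longer wins by volume. It only becomes the author if it also made the final write.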

The practical patch was a rewrite of the planner. Tests first, then implementation. When it shipped we ran it again against the same uncommitted files. Result:

| Group | Files | Session (author) | What the session did |
|---|---|---|---|
| 1 | 4 files | `2bf720b7` | Add `--include-prompts` + last-edit heuristic to commit-plan |
| 2 | 4 files (BrainCommitPlan*) | `2bf720b7` (this session) | The just-created commit-plan files |

Note: group 1 contains the commit-plan tool's own code. Session 2bf720b7 is the one that built it. The tool finds its own author. It isn't magic, just deterministic structure, but there's something satisfying about watching the recursion close.

The old heuristic had put 02f82b87 (the 7838-entry catch-all) as primary for everything. The new heuristic correctly relegates it to "Contributing" — it touched the files, but it didn't write them last.

Patch two: "include the prompts too"

I was looking at output and thinking: groups are nice, but they're just lists of files. Where is the intent? That became:

my prompt, 1 line, 2026-05-24 21:39

we would like to do nice commit messages based on the session and prompts, so how does this play along, does the commit-plan also find the prompts that was done prior to the tool calls that edited/wrote teh file? so commit-plan can be used to gather information prior of committing? or whats the recomended step here?

1 tool, 12K tok

The answer shouldn't be "let the orchestrator prompt an agent that reads sessions and comes back with commit messages." It should be: pull user prompts out directly. Deterministic. So the orchestrator has material to work with, and only the final phrasing of the commit message is AI.

The flag is --include-prompts. For each group it pulls the user prompts that came before the relevant tool calls in the winning session's window. Not all 7838 entries — only those belonging to those files. Output looks roughly like:

Group 1 (4 files) — Session 2bf720b7
  Prompts (excerpt):
    • "switch the heuristic from touched-most to last-edit-author"
    • "we would like to do nice commit messages based on the session and prompts"
  Files:
    • BrainCommitPlanCommand.cs
    • BrainCommitPlanner.cs
    • IBrainCommitPlanner.cs
    • BrainCommitPlanTests.cs

The orchestrator gets files + prompts + intent without having read a single session jsonl. The only AI tokens spent are the ones writing the commit message string. And that's cheap — it's one sentence.
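One way the prompt extraction could work, sketched in Python under a simplifying assumption: the session has already been flattened into an ordered list of events, each either a user message (`{"type": "user", "text": ...}`) or a tool call (`{"type": "tool", "path": ...}`). That event shape and the function name are hypothetical, not the real session format.

```python
def prompts_for_files(events: list[dict], files: set[str]) -> list[str]:
    """Collect the user prompts that directly preceded edits to `files`."""
    prompts: list[str] = []
    pending = None  # most recent user prompt seen so far
    for event in events:
        kind = event.get("type")
        if kind == "user":
            pending = event.get("text", "")
        elif kind == "tool" and event.get("path") in files:
            # Attribute the edit to the prompt before it, once per prompt.
            if pending is not None and pending not in prompts:
                prompts.append(pending)
    return prompts
```

Prompts that only led to edits of unrelated files are skipped, which is what keeps a very long session from flooding the output.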

Where the extractor almost went sideways

When we dogfooded against every current uncommitted file we caught a funny bug — or rather, not a bug, an artifact:

Group 5 (GitProxy): 808c2e3d — (auth-prompt content was the most recent user-msg; functional but noisy)

Prompts plausibly explain the file changes in groups 2–4. Group 5's prompt is a captured auth-pause banner that happens to live as a "user" message — a known artifact of how Claude Code records control flow, not a bug in the extractor.

Claude Code sometimes logs auth prompts and similar control-flow as "user" messages. If you blindly take the most recent user message you're vulnerable to ending up with "please complete authentication" as your commit message. We noted it, planned a filter patch for a later release, and left it in — knowing the artifact exists beats hiding it behind premature cleanup.
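The planned filter does not exist yet, so the following Python is only one possible shape for it. The marker list is hypothetical and holds just the example phrase from above; a real list would have to be built from actual logs.

```python
from typing import Optional

# Hypothetical marker list: phrases that identify control-flow banners.
CONTROL_FLOW_MARKERS = ("please complete authentication",)


def looks_like_control_flow(text: str) -> bool:
    """True if a 'user' message is probably a captured banner, not a prompt."""
    lowered = text.casefold()
    return any(marker in lowered for marker in CONTROL_FLOW_MARKERS)


def latest_real_prompt(user_messages: list[str]) -> Optional[str]:
    """Walk backwards and return the newest message that is not a banner."""
    for text in reversed(user_messages):
        if not looks_like_control_flow(text):
            return text
    return None
```

Substring matching is deliberately crude: it stays deterministic, and a false negative only leaves a noisy prompt in the output rather than dropping a real one.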

The deterministic-vs-AI factoring

This is the bigger point left standing after the evening. Two commands, same principle:

Discovery is deterministic. Find files, find sessions, find prompts. It's structure parsing. It should not burn tokens.
Synthesis is AI. Write the commit message. Summarize what a group is about. That's where an LLM adds value.

Every tool in pks-cli is going to have this split. You will not use an LLM to "find files named Brain-something" when find does it in 4 ms. You will not use an LLM to "find sessions that edited this file" when jq can do it. You'll use the LLM where it actually thinks — on language.

It also changes what my subagent orchestrations look like. Before, a typical fan-out was "spawn 17 child claudes, each reads one feature file by scanning sessions." Now it's "run pks brain scan filepath 17 times (deterministic, fast, free), collect the result, then spawn one child claude that gets all the material and writes the feature descriptions."
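The new fan-out could look roughly like this in Python. The command line is the invocation shown earlier in this post; everything else (function names, the bundle layout, passing feature paths as arguments) is an assumption for illustration. The scan function is injectable so the collection step can be exercised without pks installed.

```python
import subprocess
from typing import Callable, Iterable


def run_scan(path: str) -> str:
    """Run the deterministic scan for one path (requires pks on PATH)."""
    result = subprocess.run(
        ["pks", "brain", "scan", "filepath", path, "--format", "text"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def collect_material(
    paths: Iterable[str], scan: Callable[[str], str] = run_scan
) -> str:
    """Scan every path, then bundle the results for a single child agent."""
    # Hypothetical bundle layout: one headed section per scanned path.
    sections = [f"## {path}\n{scan(path)}" for path in paths]
    return "\n\n".join(sections)
```

The bundle is then handed to one child for synthesis, so the only token cost is the writing step.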

Cheaper. Faster. Reproducible — if I run the discovery step again tomorrow I get the same answer. That kind of determinism is something I've been missing from the rest of my agentic setup.

Tests passed first try. That's not a great sign.

dotnet test --filter "FullyQualifiedName~BrainScanFilepathTests": Passed 5/5, 84ms.

I stopped celebrating "tests passed first try" a while ago. Mostly it means I wrote the tests from the same misunderstanding as the code. It's only when I dogfood against the actual 5522-file projects folder that I know it works — and that's where we'd just caught a heuristic bias tests would never have caught, because my test data was tidier than reality.

That's why I insist on building tools inside pks-cli and using them from day one against real data. If you can't dogfood a tool on the machine you're sitting at, it's either the wrong tool or the wrong design.

What comes next

Both commands are on main. They're small. They're fast. They have tests covering what they should, and one known artifact I'll clean up later.

Next step — a separate session and a separate post — is to use commit-plan --include-prompts to actually commit all the uncommitted files in this devcontainer (there are a lot). That's the dogfood that closes the loop: the tool was born here, and is now used to commit its own birth into the history.

If that experience is as clean as today's was, I'll have to revisit how many of my other AI tools should really have been deterministic commands instead.

This is part 2 of the series The pks-cli toolbox (coming soon).

This post has been revised 2 times.

v0 - Original ai-draft (2026-05-24)
First draft; dead link to an unwritten follow-up post and the stray date marker removed in v1.
v1 - Number-first hook + dead link removed (2026-06-10)
Mirrors the Danish v1: number-first opening (17 empty feature files / 5522 session JSONLs), the stray 'May 2026' marker dropped, and the dead link to the unwritten follow-up post removed.
v2 - Sessionstory: the tmux child + include-prompts (2026-06-11, current)
The two pivotal quotes upgraded from blockquotes to sessionstory blocks from session de4200b6 (May 24): the brain-scan-impl tmux child that built scan filepath (monitor ticks verbatim), and the 'nice commit messages' prompt with the commit-plan gap analysis.