Field Notes · 2026-04-20 · 7 min read

Subagents write our summaries now

We moved from a single AI prompt to a three-agent pipeline for board summaries. The quality gap was immediate. Here's the architecture and why decomposition beats one big prompt.

The original BoardSnap summary pipeline was a single prompt. Send the board image, ask the AI to produce a summary plus action items, get the response. Simple, cheap, fast.

The problem: the prompt was doing three different jobs simultaneously, and it was mediocre at all three.

Job 1: reading and transcribing the board content (OCR + understanding)
Job 2: writing a coherent summary of what the board communicates
Job 3: extracting structured action items in the right format

These are genuinely different tasks with different optimal approaches. Shoving them into a single prompt produces output that's okay at each but excellent at none.

### What a multi-agent pipeline looks like

The current BoardSnap pipeline runs three sequential agents:

Agent 1: Board Reader
Input: the board image + VisionKit-processed text layer
Task: produce a raw, literal transcription of everything on the board: text, shapes, arrows, connections, spatial relationships. No interpretation. No summary. Just "here's what's on the board."
Output: structured JSON representing the board content

Agent 2: Summary Writer
Input: the Board Reader JSON + brand context + pinned notes
Task: write a coherent narrative summary of what was discussed, decided, or planned. Use the brand voice. Connect ideas. Provide context for the action items.
Output: the summary prose

Agent 3: Action Item Extractor
Input: the Board Reader JSON + the Summary Writer output
Task: extract action items as a structured list. Start each with a verb. Flag ownership where marked. Flag ambiguous items. Apply the tri-state model.
Output: structured action item list

Three agents, three focused tasks. The total latency is higher than a single pass (by about 2–3 seconds). The quality improvement is significant.

### What changed in the output

Summaries are more coherent. When the Summary Writer isn't simultaneously trying to extract action items, it can focus on the narrative. The summaries read more like meeting notes written by a smart human — connected paragraphs, clear progression — rather than a list of observations.

Action items are more accurate. The Action Item Extractor works with the benefit of the Summary Writer's interpretation. It can see that an item connects to a larger theme and use that context to write a better action item description. It's also less likely to miss items that weren't explicitly written as action items on the board but were clearly decisions requiring follow-up.

Failures are isolated. In a single-prompt pipeline, if the action item extraction goes wrong, the whole response is compromised. In a three-agent pipeline, a failure in one agent doesn't necessarily break the others. The Board Reader is the critical dependency; if it fails, everything else fails. But Summary Writer failures don't corrupt the action item list.
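That dependency structure is easy to express in code. A minimal sketch of the isolation property, assuming hypothetical `read_board`, `write_summary`, and `extract_items` callables:

```python
# Sketch of the failure-isolation property described above. All names are
# illustrative; the point is the dependency structure, not the API.
class BoardReaderError(Exception):
    """The one failure that is fatal to the whole pipeline."""

def run_pipeline(read_board, write_summary, extract_items):
    # Board Reader is the critical dependency: if it fails, everything fails.
    content = read_board()
    if content is None:
        raise BoardReaderError("no transcription produced")
    # A Summary Writer failure degrades the result but does not stop Agent 3,
    # which falls back to the raw transcription alone.
    try:
        summary = write_summary(content)
    except Exception:
        summary = None
    items = extract_items(content, summary)
    return {"summary": summary, "action_items": items}
```

A single-prompt pipeline has no equivalent of that `try`/`except` boundary: one bad section of the response can compromise the whole thing.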

### The cost and the tradeoff

Three agents means three API calls. The cost per snap is higher than a single-prompt approach — roughly 2.5x the token cost.

For our pricing model (flat monthly subscription, unlimited snaps for Pro users), this means our unit economics on heavy Pro users are tighter than they would be with a single-pass approach. I've modeled this and it's manageable at current usage levels, but it's a real cost that will matter at scale.
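To make the tradeoff concrete, a back-of-envelope sketch. The ~2.5x multiplier and the $9.99/mo Pro price come from this post; the per-snap cost and usage figures are invented for illustration.

```python
# Hypothetical unit-economics sketch: every number here is invented for
# illustration except the ~2.5x multiplier and the $9.99 Pro price.
single_pass_cost_per_snap = 0.004   # dollars, assumed
multi_agent_multiplier = 2.5        # from the ~2.5x token cost above
multi_agent_cost_per_snap = single_pass_cost_per_snap * multi_agent_multiplier

snaps_per_month = 400               # a hypothetical heavy Pro user
monthly_model_cost = snaps_per_month * multi_agent_cost_per_snap
margin = 9.99 - monthly_model_cost  # flat Pro subscription price
```

Under these made-up numbers the margin survives, but it shrinks 2.5x faster with usage than a single-pass approach would, which is exactly the scaling risk.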

The quality improvement is worth it. A snap that produces a mediocre summary is an interesting demo. One that produces a consistently excellent summary is a product people pay for.

### When single-pass is still right

Not every BoardSnap use case needs the full three-agent pipeline. A simple board with a short list and clear action items doesn't benefit much from the separation. For these cases, we're experimenting with a routing layer that classifies board complexity and routes simple boards to a single-pass pipeline and complex boards to the three-agent pipeline.

The complexity signal: number of distinct elements, presence of arrows and spatial relationships, presence of abbreviations or shorthand that requires interpretation. High on any of these → three agents. Low on all → single pass.
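That heuristic is simple enough to sketch directly. The field names and the element threshold below are hypothetical, not tuned product values.

```python
from dataclasses import dataclass

# Hypothetical router matching the complexity signals above.
@dataclass
class BoardSignals:
    distinct_elements: int
    has_arrows_or_spatial_structure: bool
    has_shorthand_needing_interpretation: bool

def route(signals: BoardSignals, element_threshold: int = 12) -> str:
    # High on any signal -> three agents; low on all -> single pass.
    if (signals.distinct_elements > element_threshold
            or signals.has_arrows_or_spatial_structure
            or signals.has_shorthand_needing_interpretation):
        return "three_agent"
    return "single_pass"
```

The classifier itself would need to be cheap (a fast model call or pure heuristics on the VisionKit output), or it eats the latency savings it's supposed to buy.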

We haven't shipped this routing in v1. Currently, every board goes through the full pipeline. It's the right default for quality consistency.

### Frequently asked

Does the multi-agent pipeline affect how fast summaries appear?

Yes, slightly — the full pipeline adds about 2–3 seconds compared to a single-pass approach. The summary begins streaming as soon as Agent 2 starts producing output, so the perceived wait is shorter than the total pipeline latency.

Snap your first board today.

See the workflow this post talks about — free on the App Store.
