Pyramid design
Every source in a Distillary brain is organized as a 4-layer pyramid. This isn’t arbitrary — it mirrors how arguments actually work and creates a navigable structure at every zoom level.
Why a pyramid
A book is not a flat list of facts. It’s an argument: a thesis supported by sub-arguments, each supported by evidence. The pyramid makes this structure explicit.
The alternative — a flat list of 300 claims — is unusable. You can’t browse it. You can’t understand the structure. You can’t answer “what does this book argue about pivoting?” without reading everything. The pyramid solves this by giving you entry points at every level of detail.
graph TD
    L3["Layer 3: Thesis (1 note)"] --> L2a["Layer 2: Chapter clusters (~8)"]
    L3 --> L2b["..."]
    L2a --> L1a["Layer 1: Argument groups (~50)"]
    L2a --> L1b["..."]
    L1a --> L0a["Layer 0: Atomic claims (~300)"]
    L1a --> L0b["..."]
    L1a --> L0c["..."]
The four layers
Layer 0: Atoms
An atomic claim has one assertion and one reason. If it contains “because X and Y,” it should be two separate notes. Atomicity matters because it makes each claim independently linkable, filterable, and debatable.
Each atom carries a source_ref pointing to the chapter or section it came from. This is the leaf level: the actual evidence.
A typical 300-page book produces 150-300 atoms.
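As a rough illustration of the atom layer, here is a minimal Python sketch. The `Note` dataclass and the `looks_compound` helper are hypothetical; only `source_ref` comes from the description above, and the other field names are assumptions, not Distillary's actual schema. The heuristic flags only the literal "because X and Y" pattern, so it marks candidates for review and proves nothing about atomicity.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    """Hypothetical in-memory form of one pyramid note (not the real schema)."""
    title: str                  # phrased as a claim, not a topic
    layer: int                  # 0 = atom, 1 = structure, 2 = cluster, 3 = root
    source_ref: str = ""        # chapter or section the claim came from
    tags: list[str] = field(default_factory=list)
    children: list[str] = field(default_factory=list)  # titles of child notes


def looks_compound(claim: str) -> bool:
    """Crude check: does the reason after 'because' join two parts with 'and'?"""
    lowered = claim.lower()
    if "because" not in lowered:
        return False
    reason = lowered.split("because", 1)[1]
    return " and " in reason
```

A claim that trips the check is a candidate for splitting into two notes, each with its own single reason.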
Layer 1: Structure
Structure notes group 3-7 atoms that together build one argument. The structure note’s body is a prose paragraph where each child appears as a [[wikilink]]. Its ## Related section lists the children explicitly.
The title of a structure note is a CLAIM, not a topic. “Markets fail without regulation” — not “Economic factors.” This is enforced in the agent prompts because topic-titled notes lose the argumentative thread.
A typical book produces 30-60 structure notes.
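The shape of a structure note can be sketched as a small rendering function. This is an assumption-laden illustration: `render_structure_note` is a hypothetical helper, and placing the title in a top-level heading is a guess about the file layout. What it does take from the description above is that every child appears as a wikilink in the prose and again under ## Related.

```python
def render_structure_note(title: str, prose: str, children: list[str]) -> str:
    """Render a structure note: prose body plus an explicit '## Related' list."""
    # Every child must appear in the prose as a [[wikilink]].
    missing = [child for child in children if f"[[{child}]]" not in prose]
    if missing:
        raise ValueError(f"prose does not link these children: {missing}")
    related = "\n".join(f"- [[{child}]]" for child in children)
    # Assumption: the claim-style title is rendered as a top-level heading.
    return f"# {title}\n\n{prose}\n\n## Related\n{related}\n"
```

Rejecting prose that omits a child keeps the paragraph and the explicit list from drifting apart.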
Layer 2: Clusters
Clusters group 3-7 structure notes into chapter-level arguments. Each cluster represents a major theme of the source — roughly corresponding to a chapter or section.
Clusters are the most useful navigation level. When someone asks “what does this book say about measurement?”, the answer is one cluster note with ~5 structure notes beneath it.
A typical book produces 6-10 clusters.
Layer 3: Root
One note per source. The root thesis summarizes the entire argument in one paragraph with [[wikilinks]] to every layer-2 cluster. Reading the root tells you what the source argues. Following the wikilinks tells you how.
The root carries type/claim/index in its tags — the only note with this tag, making it easy for tools and agents to find.
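Because the tag is unique, locating the root can be a simple filter. The sketch below assumes notes have already been parsed into dictionaries with a `tags` list; that representation and the `find_root` name are hypothetical, and only the type/claim/index tag comes from the text above.

```python
ROOT_TAG = "type/claim/index"


def find_root(notes: list[dict]) -> dict:
    """Return the single note tagged as the root, or fail loudly."""
    roots = [note for note in notes if ROOT_TAG in note.get("tags", [])]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root note, found {len(roots)}")
    return roots[0]
```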
Why atomic claims
Larger chunks would be easier to produce but harder to use. An atomic claim can be:
- Linked to individually: an entity page can reference this specific claim, not a whole section
- Tagged precisely: this specific assertion is certainty/speculative, even if the surrounding paragraph is certainty/established
- Compared across sources: two atoms with the same proposition are saying the same thing
- Debated: you can annotate “I disagree with THIS claim” without rejecting the whole chapter
Atomicity has a cost — more notes, more links, more processing time. But the navigability gain is worth it. The pyramid compresses the atoms into readable summaries at every level.
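The cross-source comparison is the easiest of these to sketch. Assuming each atom is available as a dictionary with a `proposition` value and a `source` name (both key names are assumptions made for illustration), finding claims that more than one source makes is a single grouping step:

```python
from collections import defaultdict


def shared_propositions(atoms: list[dict]) -> dict[str, list[str]]:
    """Map each proposition to the sources asserting it, keeping only overlaps."""
    sources_by_prop: dict[str, set[str]] = defaultdict(set)
    for atom in atoms:
        sources_by_prop[atom["proposition"]].add(atom["source"])
    return {
        prop: sorted(sources)
        for prop, sources in sources_by_prop.items()
        if len(sources) > 1
    }
```

This only works because each atom holds a single assertion; a paragraph-sized chunk would have no single proposition to match on.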
The grouping invariant
Every grouping pass, starting with layer 0 → layer 1, has a contractual invariant: the output must have strictly fewer parentless notes than the input. Each pass of the grouping agent therefore reduces the number of unattached notes.
Pass 1: 300 atoms → 50 structure notes + 300 atoms (all attached)
Pass 2: 50 structure notes → 8 clusters + 50 structure notes (all attached)
Pass 3: 8 clusters → 1 root + 8 clusters (all attached)
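The invariant guarantees convergence: because the count of parentless notes strictly decreases on every pass, the pyramid must reach a single root after finitely many passes. If a grouping pass fails to reduce that count (for example, the LLM didn’t follow instructions), the pipeline detects it and stops rather than looping forever.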
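The loop and its stopping rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline's actual code: `build_pyramid`, the `GroupPass` signature, and `chunk_grouper` (a deterministic stand-in for the LLM grouping agent) are all hypothetical names.

```python
from itertools import count
from typing import Callable, Dict, List, Optional

# Hypothetical signature: takes parentless titles, returns parent -> children.
GroupPass = Callable[[List[str]], Dict[str, List[str]]]


def build_pyramid(atoms: List[str], group_pass: GroupPass) -> Dict[str, Optional[str]]:
    """Group repeatedly until exactly one parentless note (the root) remains."""
    parent_of: Dict[str, Optional[str]] = {atom: None for atom in atoms}
    while True:
        orphans = [title for title, parent in parent_of.items() if parent is None]
        if len(orphans) == 1:
            return parent_of  # converged: the single orphan is the root
        orphan_set = set(orphans)
        for parent, children in group_pass(orphans).items():
            parent_of.setdefault(parent, None)  # a new parent starts parentless
            for child in children:
                if child in orphan_set and child != parent:
                    parent_of[child] = parent
        remaining = sum(1 for parent in parent_of.values() if parent is None)
        if remaining >= len(orphans):
            # Invariant violated: stop instead of looping forever.
            raise RuntimeError(
                f"grouping pass left {remaining} parentless notes, had {len(orphans)}"
            )


def chunk_grouper(size: int) -> GroupPass:
    """Deterministic stand-in for the grouping agent, for illustration only."""
    ids = count(1)

    def group(orphans: List[str]) -> Dict[str, List[str]]:
        chunks = [orphans[i:i + size] for i in range(0, len(orphans), size)]
        return {f"group-{next(ids)}": chunk for chunk in chunks}

    return group
```

Calling `build_pyramid([f"atom-{i}" for i in range(30)], chunk_grouper(5))` converges to a single root, while a grouper that returns no groups raises `RuntimeError` on its first pass. The check compares counts before and after each pass, so it catches a misbehaving grouper without needing to know why the pass failed.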
Why 3-7 children per parent
This range comes from cognitive load research — humans can hold 4±1 items in working memory. A parent with 3-7 children is browsable: you read the parent’s prose, see the child wikilinks, and decide which 1-2 to follow.
A parent with 15 children is a flat list. A parent with 2 children isn’t grouping anything meaningful. The 3-7 constraint keeps every level of the pyramid navigable.
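A fan-out check along these lines is easy to sketch. The function name and the parent-to-children mapping are assumptions for illustration; the 3 and 7 bounds come from the constraint above.

```python
def fanout_violations(
    children_of: dict[str, list[str]], low: int = 3, high: int = 7
) -> dict[str, int]:
    """Return each parent whose child count falls outside [low, high]."""
    return {
        parent: len(children)
        for parent, children in children_of.items()
        if not low <= len(children) <= high
    }
```

An empty result means every parent in the mapping is within the browsable range.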
The pyramid mirrors how you'd explain the book to someone
If a friend asked “what’s that book about?”, you’d give them the root (30 seconds). If they’re interested, you’d expand one cluster (5 minutes). If they want specifics, you’d tell them an atom (the evidence). The pyramid IS this conversation, frozen in markdown.