Distillary

Turn any knowledge source into a navigable, shareable brain.

Distillary is an open-source tool that takes books, YouTube videos, podcasts, and articles — anything with ideas — and distills them into atomic claims organized as a navigable knowledge graph. Each source gets decomposed into a pyramid of arguments, connected by entities and cross-source bridges, and published as an Obsidian vault or static website that both humans and AI agents can explore.

The entire pipeline runs through Claude agents — haiku for fast parallel extraction, opus for deep reasoning about how ideas connect. A 300-page book becomes a browsable brain in about 15 minutes.

Built with Claude agents. Published with Quartz.

Browse the demo brain

Explore a live brain built from The Art of Money Getting by P.T. Barnum (1880, public domain): brain.distillary.xyz

One book, 79 claims, 23 entities, a 4-layer argument pyramid. Click the root thesis, follow wikilinks into clusters and claims, check entity pages — their backlinks show everything the brain knows about that concept. This is what your brain looks like when published.

Or let an agent query it — 10 questions answered from the live brain

Any AI agent with web access can query a published brain. Get the skill, paste it, ask a question:

curl https://brain.distillary.xyz/static/skill.txt
| # | Question | Strategy | Fetches |
|---|----------|----------|---------|
| 1 | What books are in this brain? | Read manifest | 1 |
| 2 | What are the main ideas? | Fetch clusters | 2 |
| 3 | What does Barnum say about debt? | Concept lookup | 2 |
| 4 | Does luck play a role in success? | Concept lookup | 2 |
| 5 | How important is integrity? | Concept lookup | 2 |
| 6 | Career advice? | Concept lookup | 2 |
| 7 | Role of health in wealth? | Concept lookup | 2 |
| 8 | How to advertise? | Concept lookup | 2 |
| 9 | Who is P.T. Barnum? | Entity lookup | 2 |
| 10 | Root thesis? | Fetch root note | 2 |

Average: 1.9 fetches per question. All answers from brain.distillary.xyz. Read the full answers →

Works with Claude Code, Codex, Gemini CLI, Cursor, and any agent with HTTP access. Setup guide →


How the pipeline works

When you add a source to your brain, it flows through a series of agent-powered steps. First, the text gets split into chunks and processed by 16 parallel haiku agents that extract individual claims. Those claims are deduplicated, entities (people, concepts, companies) are identified, and then opus agents do the deep work: clustering related claims into argumentative groups and building a hierarchy from atoms up to a single root thesis.

After the pyramid is built, haiku agents find lateral connections — tensions between claims, shared patterns, and evidence chains. Finally, a doctor agent fixes any orphaned notes and suggests concepts worth exploring further.

```mermaid
graph LR
    A["Any Source"] --> B["Extract + Dedupe"]
    B --> C["Group + Pyramid"]
    C --> D["Connect + Doctor"]
    D --> E["Brain Vault"]
```

The result is a structured vault where every claim links to its parent argument, every entity links to every claim that mentions it, and every source bridges to related sources through shared concepts.

| Step | What happens | Time |
|------|--------------|------|
| Extract | 16 parallel haiku agents pull atomic claims from text | ~2 min |
| Dedupe + Entities | Remove duplicates, identify people/concepts/companies | ~2 min |
| Group | Opus agents cluster claims by argumentative cohesion | ~5 min |
| Pyramid | Build root thesis → chapters → arguments → evidence | ~3 min |
| Connect | Find tensions, patterns, and evidence between claims | ~2 min |
| Doctor | Fix orphans, discover ghost concepts, suggest explorations | ~1 min |

The full pipeline for a 300-page book takes ~15 minutes.

Most of that time is the opus grouping step, which requires deep reasoning about how claims relate to each other. Extraction is fast because 16 haiku agents work in parallel.
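The pipeline's overall shape can be sketched in a few lines. This is a toy illustration, not the project's real API: `extract_claims` stands in for a haiku extraction agent, the dedupe is a literal string match rather than semantic, and the grouping/pyramid steps are left as placeholders.

```python
from concurrent.futures import ThreadPoolExecutor

def extract_claims(chunk: str) -> list[str]:
    # Stand-in for a haiku extraction agent: one "claim" per sentence.
    return [s.strip() for s in chunk.split(".") if s.strip()]

def distill(text: str, n_agents: int = 16) -> dict:
    # 1. Chunk the source and extract claims with parallel workers,
    #    mirroring the 16 parallel haiku agents.
    chunks = [text[i:i + 200] for i in range(0, len(text), 200)]
    with ThreadPoolExecutor(max_workers=n_agents) as pool:
        claims = [c for out in pool.map(extract_claims, chunks) for c in out]
    # 2. Dedupe (exact-match here; the real step is semantic).
    seen: set[str] = set()
    unique = [c for c in claims if not (c.lower() in seen or seen.add(c.lower()))]
    # 3. Grouping, pyramid, connections, and doctor would follow.
    return {"claims": unique, "clusters": [], "root": None}
```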


What you get

A pyramid of claims

Every source gets decomposed into a 4-layer hierarchy. The root thesis summarizes the entire source in one paragraph with wikilinks to chapter-level clusters. Each cluster links to mid-level structure notes, which link to individual atomic claims — the leaves of the tree, each traceable back to a specific chapter or section via source_ref.

This means you can read the book at any zoom level: the root for a 30-second summary, the clusters for chapter themes, or the atoms for specific evidence.

```mermaid
graph TD
    R["Root Thesis"] --> C1["Cluster: Validation"]
    R --> C2["Cluster: Metrics"]
    R --> C3["Cluster: Pivoting"]
    C1 --> S1["MVP tests assumptions"]
    C1 --> S2["Early adopters first"]
    S1 --> A1["Zappos tested demand with photos"]
    S1 --> A2["Dropbox used a video MVP"]
```
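The "zoom level" idea maps naturally onto tree depth. The sketch below models the example pyramid as a nested dict (the real vault stores these as linked markdown notes) and reads one layer at a time; the `zoom` helper is illustrative, not part of Distillary.

```python
# Toy pyramid mirroring the diagram above.
pyramid = {
    "Root Thesis": {
        "Cluster: Validation": {
            "MVP tests assumptions": {
                "Zappos tested demand with photos": {},
                "Dropbox used a video MVP": {},
            },
            "Early adopters first": {},
        },
        "Cluster: Metrics": {},
        "Cluster: Pivoting": {},
    }
}

def zoom(tree: dict, depth: int) -> list[str]:
    """Return all note titles at a given layer (0 = root thesis)."""
    if depth == 0:
        return list(tree)
    return [t for sub in tree.values() for t in zoom(sub, depth - 1)]
```

Reading `zoom(pyramid, 0)` gives the 30-second summary; depth 1 gives chapter themes; depth 3 gives the atomic evidence.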

Entities as knowledge hubs

Every person, concept, company, and work mentioned across your sources gets its own page. The real power is the backlinks — every claim that references an entity shows up on its page, grouped by source. This turns entity pages into question-answering hubs.

When you want to know what your brain knows about “customer validation,” you don’t search — you open the entity page and read its backlinks. Each backlink is a claim from a specific source, with its own wikilinks to related concepts.
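Mechanically, a backlink index is just an inverted map from each wikilink target to the notes that mention it. This minimal sketch (claim titles and bodies are invented examples; Obsidian and Quartz compute this for you) shows the shape:

```python
import re
from collections import defaultdict

# Invented example claims containing [[wikilinks]] to entities.
claims = {
    "Talk is cheap": "Ignore [[Compliments]]; ask for [[Commitment]].",
    "Numbers can lie": "[[Vanity Metrics]] are not [[Actionable Metrics]].",
}

# entity name -> titles of every claim that links to it
backlinks: dict[str, list[str]] = defaultdict(list)
for title, body in claims.items():
    for entity in re.findall(r"\[\[(.+?)\]\]", body):
        backlinks[entity].append(title)
```

Opening an entity's page then amounts to reading `backlinks["Commitment"]`, already grouped by the source each claim came from.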

Answering questions through backlinks

The “Real Signals” entity page has 36 backlinks from The Lean Startup and 26 from The Mom Test. Each one is a claim about detecting genuine customer interest — from two different perspectives, already organized by source. One page, complete answer.

Bridge concepts across sources

When two sources discuss the same idea under different names, Distillary creates bridge entities that unify them. A bridge page has aliases from both sources, descriptions of each perspective, and backlinks from both — making it the fastest way to get a cross-source answer.

For example, The Lean Startup calls unreliable indicators “Vanity Metrics” (quantitative: gross numbers that look good). The Mom Test calls them “Compliments” (qualitative: praise that costs nothing). Both describe the same problem — misleading signals that create false confidence. The bridge concept “False Signals” captures both perspectives.

| Bridge concept | Lean Startup calls it | Mom Test calls it |
|----------------|----------------------|-------------------|
| Real Signals | Actionable Metrics | Commitment |
| False Signals | Vanity Metrics | Compliments |
| Direct Customer Contact | Genchi Gembutsu | Customer conversation |
| Critical Assumptions | Hypothesis | Important questions |
| Demand Uncertainty | Leap of Faith | Market risk |
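A bridge is essentially an alias table: either source's term resolves to the shared page. A minimal sketch of that resolution, using two of the bridges above (the dict layout is illustrative, not the vault's actual schema):

```python
# Bridge concept -> source-specific aliases it unifies.
bridges = {
    "Real Signals": ["Actionable Metrics", "Commitment"],
    "False Signals": ["Vanity Metrics", "Compliments"],
}

# Invert to alias -> bridge for O(1) lookup.
alias_to_bridge = {
    alias: bridge for bridge, aliases in bridges.items() for alias in aliases
}

def resolve(term: str) -> str:
    """Map a source-specific term to its bridge concept (or itself)."""
    return alias_to_bridge.get(term, term)
```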

Your annotations are part of the graph

The brain isn’t read-only. You add your own reactions, questions, and insights to brain/personal/. Each annotation links to the claim it responds to, carries its own tags (status/agree, insight/aha), and appears in the graph as a connected node. Your voice is first-class data — queryable, filterable, visible in Obsidian’s graph view alongside the distilled sources.
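Because annotations carry tags, they can be filtered like any other data. A small sketch, with invented annotation records (the field names are illustrative, not a fixed schema):

```python
# Hypothetical annotation records from brain/personal/.
annotations = [
    {"responds_to": "Debt is slavery", "tags": ["status/agree"]},
    {"responds_to": "Advertise liberally", "tags": ["insight/aha"]},
]

def with_tag(notes: list[dict], tag: str) -> list[str]:
    """Titles of the claims your tagged annotations respond to."""
    return [n["responds_to"] for n in notes if tag in n["tags"]]
```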


For AI agents

Published brains expose an agent.json manifest — a lightweight entry point that tells any agent what sources exist, what bridge concepts connect them, and how to navigate the content. The agent doesn’t need to download everything; it follows links by relevance, the same way a human clicks through Obsidian.

Entity backlinks are the key mechanism. An agent looking up “validated learning” fetches the entity page, reads the backlinks grouped by source, and follows 2-3 claims for specific evidence. Multi-source answer with citations in under 2,500 tokens — compared to ~50,000 if it tried to read all claims.

```mermaid
graph TD
    A["Fetch agent.json"] --> B{"Question type?"}
    B -->|"What is X?"| C["Entity page → backlinks"]
    B -->|"Summary"| D["Thesis in manifest → root note"]
    B -->|"Do sources agree?"| E["Bridge concept or comparison page"]
    B -->|"Show me evidence"| F["Root → cluster → structure → atom"]
    B -->|"What's related?"| G["Any entity → follow backlinks + wikilinks"]
```
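The routing decision above can be sketched as a small planner. The page paths here are illustrative guesses, not the published URL scheme; the point is that every plan starts from the manifest and follows at most a couple of links:

```python
def plan_fetches(question_type: str, topic: str = "") -> list[str]:
    """Return the ordered pages an agent would GET for a question type."""
    routes = {
        "what-is": [f"/entities/{topic}"],           # entity page -> backlinks
        "summary": ["/root"],                        # thesis -> root note
        "agreement": [f"/bridges/{topic}"],          # bridge concept page
        "evidence": ["/root", "/cluster", "/atom"],  # walk down the pyramid
    }
    # Every plan starts from the manifest entry point.
    return ["/agent.json"] + routes.get(question_type, [])
```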

No MCP server. No authentication. No setup.

Published brains are static websites. The agent makes 2-3 HTTP GET requests to structured markdown pages. The link graph built during distillation IS the search engine — no keyword matching needed.

See the agent retrieval skill for setup instructions by tool, or the demo with full answers.


Source types

Distillary works with any source that contains ideas. The extraction step differs by format, but everything after that — deduplication, entity extraction, grouping, pyramid building, connection finding — is the same pipeline regardless of whether the input was a book, video, or article.

| Type | Input | How text is extracted |
|------|-------|----------------------|
| Book | EPUB, PDF, TXT | Parsed directly |
| YouTube video | URL | Transcript via yt-dlp |
| Podcast | Audio file | Transcribed via Whisper |
| Article | URL | Web fetch + clean HTML |
| Research paper | PDF | Parsed directly |
| Lecture notes | Markdown, PDF | Read or parsed |
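The dispatch in the table above boils down to inspecting the input. A hedged sketch of that routing (the function and the returned strategy labels are illustrative; only the tool names, yt-dlp and Whisper, come from the table):

```python
from pathlib import Path

def extraction_strategy(source: str) -> str:
    """Pick a text-extraction strategy from a URL or file path."""
    if source.startswith(("http://", "https://")):
        if "youtube.com" in source or "youtu.be" in source:
            return "yt-dlp transcript"
        return "web fetch + clean HTML"
    suffix = Path(source).suffix.lower()
    return {
        ".epub": "parse directly", ".pdf": "parse directly",
        ".txt": "parse directly", ".md": "read directly",
        ".mp3": "Whisper transcription", ".wav": "Whisper transcription",
    }.get(suffix, "unsupported")
```

Everything downstream of this step is format-agnostic, which is why one pipeline serves books, videos, and articles alike.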

Community

Distilled knowledge becomes more valuable when it’s shared. When you publish your brain, others can browse it as a website, clone it into their Obsidian, or have their AI agents query it via the API. When multiple people distill books on the same topic, their brains can be compared and combined into field-level understanding.

  1. Process a source → it joins your brain
  2. Publish to GitHub Pages → anyone can browse it
  3. Share the URL → other people’s agents can query it
  4. Combine multiple brains → cross-source synthesis and meta-analysis

Tag your repos distillary-brain on GitHub. Join the Discord to share your brains, request distillations, and discuss cross-source insights.


Getting started

Concepts

Deployment

  • Architecture — agents, skills, Python utilities
  • Publishing — deploy your brain as a website with agent API

For agents

  • Agent retrieval — get the skill, setup by tool (Claude Code, Codex, Gemini CLI, Cursor, and more)
  • Demo: 10 questions — live answers from the demo brain, showing each retrieval strategy

  • README — full project overview with code structure

Acknowledgments

Documentation and brain publishing are powered by Quartz — an excellent open-source static site generator for Obsidian vaults by Jacky Zhao. Quartz renders wikilinks, graph view, backlinks, Mermaid diagrams, and callouts out of the box. We’re grateful for the project and recommend it for anyone publishing Obsidian content.