Introducing Quarex Evaluate

Hi everyone,

A new tool went live on Quarex today: Evaluate. It takes a contested statement — the kind people argue about over coffee, on cable news, or in family group chats — and shows you what the Quarex library actually has to say about each piece of the claim.

Try it: quarex.org/evaluate

A new way to use a library

This isn't a chatbot bolted onto a search box. It's the library itself working as an analysis tool. Quarex now holds 1,037 books, 7,834 chapters, and 50,605 sharp topic-questions — all human-curated, all source-linked. When you submit a statement, every one of those topic-questions becomes a potential probe; the relevant ones get pulled in real time and applied to your claim.

Wikipedia gives you static articles. Search engines give you ranked links. General-purpose AI tools give you opaque answers built on training data you can't inspect. None of them use a curated library as the substrate of the analysis itself. Quarex does — and at this point the library is large enough that the analysis is substantive across most major contested questions, not just a handful of demo topics.

How it works

Pick a category (Science, Economics, History, Public Health, Technology & AI, or Contested Claims), then pick a statement. Quarex searches the library in real time, finds the topic-questions across our books that touch the claim, and writes an evaluation showing what those sources address, where they agree, and where they push back on each other.

Quarex doesn't render a verdict. It surfaces the actual sub-claims a statement implies and shows you what the library knows about each one. You bring the judgment; we bring the receipts.

Everything is footnoted

Every assertion in an evaluation is cited to a specific chapter in a specific book, with a clickable link. Below the evaluation you'll see two source lists: "Sources cited in this evaluation" (the ones the analysis actually leans on) and "Other relevant sources from the library" (sources that matched the claim but weren't cited, useful for going deeper). You can read the evaluation and then read the source material yourself. No black box.

The two-axis read

Every evaluation ends with a structured assessment along two independent axes:

Library coverage — how thoroughly the Quarex library addresses the claim. Values: well-covered, partial, thin.
Where experts stand — whether the broader field of study has settled the underlying question. Values: settled, contested-in-literature, corpus-absent.

This separation matters. A claim can be honestly contested by experts even when the library covers it well; that's a legitimate state, not a failure. And a claim the field has settled can still show up as thin in our library — which tells us exactly where to write next.
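
For readers who think in code, here is one way the two-axis read could be modeled. This is a minimal sketch, not the Evaluate implementation: the axis values come from the list above, but the class names, the `label` format, and the `is_writing_target` helper are assumptions made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Coverage(Enum):
    """Axis 1: how thoroughly the library addresses the claim."""
    WELL_COVERED = "well-covered"
    PARTIAL = "partial"
    THIN = "thin"


class ExpertStanding(Enum):
    """Axis 2: whether the broader field has settled the question."""
    SETTLED = "settled"
    CONTESTED = "contested-in-literature"
    CORPUS_ABSENT = "corpus-absent"


@dataclass(frozen=True)
class Assessment:
    """Hypothetical container; the two axes are stored independently."""
    coverage: Coverage
    standing: ExpertStanding

    def label(self) -> str:
        # Joins the two axis values, e.g. "well-covered + settled".
        return f"{self.coverage.value} + {self.standing.value}"

    def is_writing_target(self) -> bool:
        # Assumed rule of thumb: the field has settled the question
        # but the library is still thin on it.
        return (self.standing is ExpertStanding.SETTLED
                and self.coverage is Coverage.THIN)
```

Because the axes are separate fields, all nine combinations are representable, including "well-covered + contested-in-literature", which is a legitimate state and not an error.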

How Evaluate feeds the library

The same analysis that produces evaluations also tells us where the library is weak. Each evaluation names specific gaps: not just "thin coverage" but the actual missing piece of evidence, time period, or sub-question. We can run hundreds of claims through and see those gaps cluster into themes.
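
As a rough illustration of how gaps can cluster into themes, the sketch below counts gap records by theme and ranks them. The `Gap` fields and the idea of a single theme tag per gap are assumptions for the example; the real gap data may be shaped differently.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Gap:
    """Hypothetical gap record; field names are illustrative only."""
    statement: str  # the claim that was evaluated
    theme: str      # assumed theme tag assigned to the gap
    missing: str    # the specific missing evidence, period, or sub-question


def rank_themes(gaps):
    """Return (theme, count) pairs, most frequent first, ties by name."""
    counts = Counter(g.theme for g in gaps)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

Given three placeholder gaps where two share a theme, `rank_themes` puts the shared theme first, which is the kind of ordering that can turn a pile of individual gaps into a prioritized writing list.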

That's exactly what happened this week. We ran 63 curated statements through the system, captured the gap patterns, and built 17 new books targeting the most actionable ones, from "Vaccines and the Anti-Vaccine Movement" to "Trade, Tariffs, and the Real Economy" to "Church and State in American Founding". Re-running the analysis with those books in place, 7 of 9 priority gaps closed, and 3 flipped to "well-covered + settled", the strongest outcome the system can produce.

So Evaluate isn't just a reader-facing tool. It's the measurement layer for the whole library. As Quarex grows, this feedback loop tells us what to build next, in priority order, with named evidence.

What's next

The current set of 63 statements is a starting point. Over the coming months we'll be opening up more ways to bring your own questions to the system — with the same coverage transparency and citation discipline. We'll also be using accumulated evaluation data to keep targeting the next round of library expansion.

Try a few statements and let me know what surprises you — what reads cleanly, what feels off, and what claims you wish were on the list. The most useful feedback is "I expected X and got Y."

Under the hood: Evaluate uses Claude Sonnet 4.5 with our 50,605 topic-questions as grounding. Searches run via SQLite FTS5 against the catalog database; this week we tuned retrieval to filter out noise from short candidate-listing topics that were leaking into non-political claims. Every evaluation includes machine-readable gap data, embedded as an HTML comment, that feeds the library-expansion analysis.
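
If you're curious what FTS5 retrieval with a short-topic filter might look like, here is a minimal sketch using Python's built-in sqlite3 module. Everything specific in it is an assumption for illustration: the `topics` table and its columns, the demo rows, the word-count threshold, and the way the claim is turned into a query. It also assumes your SQLite build includes FTS5. The actual Quarex schema and tuning are not shown here.

```python
import re
import sqlite3

MIN_TOPIC_WORDS = 4  # assumed threshold for "too short to be a real question"


def build_demo_catalog():
    """Create an in-memory FTS5 table with made-up demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE topics USING fts5(question, book, chapter)")
    conn.executemany(
        "INSERT INTO topics VALUES (?, ?, ?)",
        [
            ("How do tariffs affect consumer prices over time?", "Demo Book A", "Chapter 1"),
            ("Tariffs candidates", "Demo Book B", "Chapter 2"),  # short listing-style topic
            ("What evidence links vaccines to herd immunity?", "Demo Book C", "Chapter 3"),
        ],
    )
    return conn


def to_match_query(claim):
    """Turn a claim into an OR query of quoted terms longer than 3 characters."""
    terms = [w for w in re.findall(r"[A-Za-z0-9]+", claim.lower()) if len(w) > 3]
    # Quoting each term keeps it from being parsed as FTS5 query syntax.
    return " OR ".join(f'"{t}"' for t in terms)


def search_topics(conn, claim, limit=5):
    """Return up to `limit` matching rows, best BM25 match first, short topics dropped."""
    query = to_match_query(claim)
    if not query:
        return []
    rows = conn.execute(
        "SELECT question, book, chapter FROM topics "
        "WHERE topics MATCH ? ORDER BY bm25(topics) LIMIT ?",
        (query, limit * 3),  # over-fetch so the filter below has room to drop rows
    ).fetchall()
    kept = [r for r in rows if len(r[0].split()) >= MIN_TOPIC_WORDS]
    return kept[:limit]
```

Calling `search_topics(build_demo_catalog(), "Tariffs raise prices for consumers")` matches both tariff rows in the index, but the two-word listing-style topic is dropped by the length filter, leaving only the full question from Demo Book A.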

— Peter