Brain in the Fish — MCP Skill Guide
Ontology-grounded AI agents evaluate any document objectively
❌ Evaluating documents manually against multiple criteria is time-consuming, subjective, and prone to inconsistent judgment across different reviewers.
✅ Get consistent, evidence-based document evaluations with verifiable scoring that reduces bias and makes hallucinations detectable.
- ✓ Multi-agent evaluation panel with grounded mental states
- ✓ Evidence Density Scorer detects hallucinations mathematically
- ✓ Works with essays, contracts, reports, policies
- ✓ OWL ontology ensures consistent evaluation criteria
Install in one line
```
$ mfkvault install fabio-rovai-brain-in-the-fish
```
Requires the MFKVault CLI.
Free to install, no account needed. Copy the command above and paste it into your agent.
What you get in 5 minutes
- Full skill code ready to install
- Works with 1 AI agent
- Lifetime updates included
Description
---
name: brain-in-the-fish
description: Universal document evaluation engine — evaluate any document against any criteria using cognitively modelled AI agents with ontology-grounded scoring
version: 0.1.0
---

# Brain in the Fish — MCP Skill Guide

## What This Does

Brain in the Fish evaluates documents (essays, policies, contracts, clinical reports, surveys) against evaluation criteria using a panel of AI agents. Each agent's mental state is represented as an OWL ontology. Scoring is grounded in an Evidence Density Scorer (EDS) that makes hallucination mathematically detectable.

## MCP Tools Available

| Tool | Purpose | When to Call |
|------|---------|--------------|
| `eval_status` | Check server status and session state | First — verify the server is running |
| `eval_ingest` | Ingest a document (PDF/text) | Step 1 |
| `eval_criteria` | Load evaluation framework | Step 2 |
| `eval_align` | Align document sections to criteria | Step 3 |
| `eval_spawn` | Generate evaluator agent panel | Step 4 |
| `eval_scoring_tasks` | Get all scoring prompts for subagents | Step 5 |
| `eval_score_prompt` | Get scoring prompt for one agent/criterion pair | Step 5 (per task) |
| `eval_record_score` | Record a score from an agent | Step 6 |
| `eval_debate_status` | Check disagreements and convergence | Step 7 |
| `eval_challenge_prompt` | Get challenge prompt for debate | Step 7 (per challenge) |
| `eval_report` | Generate final evaluation report | Step 8 |
| `eval_whatif` | "What if" re-scoring with modified text | Optional |

## Evaluation Workflow

### Quick Mode (deterministic, no subagents needed)

```
eval_ingest → eval_criteria → eval_align → eval_spawn → eval_report
```

The server runs evidence scoring internally. `eval_report` produces a complete evaluation with deterministic scores.

### Full Mode (with Claude subagent scoring)

```
1. eval_ingest(path, intent)
2. eval_criteria(framework_or_intent)
3. eval_align()
4. eval_spawn(intent)
5. eval_scoring_tasks() → get all tasks
6. For each task:
   - Read the scoring prompt
   - Evaluate the document content against the criterion as the agent persona
   - eval_record_score(agent_id, criterion_id, score, justification, evidence, gaps)
7. eval_debate_status() → check for disagreements
8. If disagreements:
   - eval_challenge_prompt(challenger, target, criterion)
   - Generate challenge argument
   - eval_record_score() with revised score
   - Repeat until converged
9. eval_report() → final report
```

### Subagent Dispatch Pattern

When orchestrating with multiple Claude subagents:

```
Orchestrator reads eval_scoring_tasks()
→ For each agent in the panel:
    Dispatch subagent with system prompt from eval_scoring_tasks
    Subagent receives: persona, criteria, document sections
    Subagent calls eval_record_score with their assessment
→ After all scores recorded:
    Check eval_debate_status
    If disagreements: dispatch challenge subagents
→ eval_report for final output
```

## Scoring Guidelines for Subagents

When scoring as an agent persona:

1. **Read the document content** provided in the scoring prompt carefully
2. **Reference the rubric levels** — state which level the document meets
3. **Cite specific evidence** from the document text (quote directly)
4. **Identify gaps** — what's missing that would improve the score
5. **Be the persona** — a Subject Expert scores differently from a Writing Specialist
6. **Do not hallucinate** — only reference evidence that appears in the provided text
7. **Use the full scale** — don't cluster all scores at 6–8; use the 1–10 range appropriately

## Response Format for eval_record_score

```json
{
  "agent_id": "from the scoring task",
  "criterion_id": "from the scoring task",
  "score": 7.5,
  "max_score": 10.0,
  "round": 1,
  "justification": "Detailed justification referencing specific document content and rubric levels. This section meets Level 3 (score range 6-8) because it demonstrates [specific evidence]. To reach Level 4, the document would need [specific improvement].",
  "evidence_used": ["Direct quote from document", "Another quote"],
  "gaps_identified": ["Missing topic X", "No counter-argument for claim Y"]
}
```

## Supported Document Types

| Type | Intent Keywords | Framework Auto-Selected |
|------|-----------------|-------------------------|
| Academic essay | "essay", "mark", "grade", "coursework" | Academic Essay Marking |
| Policy document | "policy", "green book", "impact assessment" | HM Treasury Green Book |
| Survey/research | "survey", "methodology", "questionnaire" | Survey Methodology |
| Contract/legal | "contract", "legal", "compliance" | Contract Review |
| Clinical/NHS | "nhs", "clinical", "patient", "governance" | NHS Clinical Governance |
| GCSE English | "gcse", "english language" | GCSE English Language |
| Generic | anything else | Generic Quality |

## Architecture Notes

- **Three ontologies** coexist in one Oxigraph triple store: Document, Criteria, Agent
- **Evidence scorer** provides a deterministic, evidence-grounded scoring baseline
- **Validation signals** (citations, structure, reading level, fallacies, hedging) feed into the scorer as spikes
- **Epistemic state** tracks justified beliefs with empirical/normative/testimonial bases
- **Philosophical analysis** applies Kantian/utilitarian/virtue-ethics lenses
- **Belief dynamics** — Maslow needs update based on findings; trust evolves during debate
- **Cross-evaluation memory** persists results for historical comparison
- **All triples are queryable** via SPARQL through the underlying onto_* tools
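The Full Mode steps above can be sketched as a plain orchestration loop. This is a hedged illustration only: `call` is a hypothetical stand-in for whatever MCP client dispatches tool invocations to the server, and the stubbed task list and convergence result exist purely to make the control flow runnable end to end.

```python
def call(tool, **args):
    """Stub MCP dispatcher; replace with a real MCP client call."""
    if tool == "eval_scoring_tasks":
        # Pretend the server produced two scoring tasks for one criterion.
        return [{"agent_id": "subject_expert", "criterion_id": "argument"},
                {"agent_id": "writing_specialist", "criterion_id": "argument"}]
    if tool == "eval_debate_status":
        return {"converged": True, "disagreements": []}
    return {"ok": True, "tool": tool, "args": args}

def run_full_mode(path, intent):
    # Steps 1-4: ingest, load criteria, align, spawn the agent panel.
    call("eval_ingest", path=path, intent=intent)
    call("eval_criteria", framework_or_intent=intent)
    call("eval_align")
    call("eval_spawn", intent=intent)
    # Steps 5-6: fetch all scoring tasks and record one score per task.
    recorded = []
    for task in call("eval_scoring_tasks"):
        # A real orchestrator dispatches a subagent with the task's scoring
        # prompt; here we record a placeholder score directly.
        recorded.append(call("eval_record_score",
                             agent_id=task["agent_id"],
                             criterion_id=task["criterion_id"],
                             score=7.5,
                             justification="placeholder"))
    # Steps 7-8: debate until converged, then produce the report.
    status = call("eval_debate_status")
    if not status["converged"]:
        pass  # dispatch challenge subagents via eval_challenge_prompt
    return recorded, call("eval_report")

scores, report = run_full_mode("essay.pdf", "mark this essay")
```

Swapping the stub `call` for a real client leaves the loop unchanged, which is the point of the dispatch pattern: the orchestrator only sequences tool calls and never scores anything itself.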
Security Status: Scanned — passed automated security checks.