30-day free campaign

Run this helper free — no credit card

Every helper is free for 30 days. Answer 3 questions and get the full result in 2 minutes.

Start free →
FREE
Verified
Save Money

Model Hierarchy - Cost-Optimize AI Agent Operations

Right-size your AI models, eliminate expensive overkill, maximize ROI

AI agents waste thousands monthly by routing every task to expensive flagship models regardless of complexity.

This skill routes each task to the cheapest model tier that can handle it, reducing operational costs by 60-80% while maintaining quality.

  • Automatic task complexity detection and tier routing logic
  • Pre-built hierarchy spanning five model cost tiers
  • Sub-agent spawning with appropriate model assignment
  • Cost tracking and savings optimization reporting

👁 2 views · 📦 0 installs

Install in one line

mfkvault install zscole-model-hierarchy-skill

Requires the MFKVault CLI. Prefer MCP?

No reviews yet
Works with: Claude Code · Cursor · Codex · OpenClaw · Windsurf
This helper was discovered by MFKVault crawlers from public sources. Original author retains all rights. To request removal: [email protected]
Community helper
MFKVault does not create, maintain, or guarantee the output of this helper. Results are AI-generated and may be incomplete, inaccurate, or outdated. Use at your own risk.

Free to install — no account needed

Copy the command below and paste into your agent.

Instant access • No coding needed • No account needed

What you get in 5 minutes

  • Full skill code ready to install
  • Works with 7 AI agents
  • Lifetime updates included
Ready to run

Run this helper

Answer a few questions and let this helper do the work.

Advanced: use with your AI agent

Description

---
name: model-hierarchy
description: >
  Cost-optimize AI agent operations by routing tasks to appropriate models
  based on complexity. Use this skill when: (1) deciding which model to use
  for a task, (2) spawning sub-agents, (3) considering cost efficiency,
  (4) the current model feels like overkill for the task. Triggers:
  "model routing", "cost optimization", "which model", "too expensive",
  "spawn agent".
---

# Model Hierarchy

Route tasks to the cheapest model that can handle them. Most agent work is routine.

## Core Principle

**80% of agent tasks are janitorial.** File reads, status checks, formatting, simple Q&A. These don't need expensive models. Reserve premium models for problems that actually require deep reasoning.

## Model Tiers

### Tier 1: Cheap ($0.10-0.50/M tokens)

| Model | Input | Output | Best For |
|-------|-------|--------|----------|
| DeepSeek V3 | $0.14 | $0.28 | General routine work |
| GPT-4o-mini | $0.15 | $0.60 | Quick responses |
| Claude Haiku | $0.25 | $1.25 | Fast tool use |
| Gemini Flash | $0.075 | $0.30 | High volume |
| GLM 5 (Zhipu) | (OpenRouter Z.AI) | (OpenRouter Z.AI) | Routine + moderate text; 200K context; **text-only** — do not use for image/vision |
| Kimi K2.5 (Moonshot) | $0.45 | $2.25 | Routine + moderate; 262K context; **multimodal (text + image + video)** |

**Text-only models (e.g. GLM 5):** Do not use for any task that requires image input or vision — no photo analysis, screenshots, image-generation tools, or document/chart vision. Route to a vision-capable model (e.g. Kimi K2.5, GPT-4o, Gemini, Claude with vision, GLM-4.5V/4.6V).

**Vision-capable Tier 1/2 (e.g. Kimi K2.5):** Use for routine or moderate tasks that may involve images — screenshots, photo analysis, docs, image-generation orchestration — without moving to premium vision models.
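The text-only caveat above can be enforced mechanically. A minimal sketch of a Tier 1 lookup follows; the registry names and the `vision` capability flags are assumptions for illustration, not provider metadata, so verify them against current model docs before relying on them.

```python
# Hypothetical Tier 1 registry. Capability flags are assumptions,
# not authoritative provider metadata.
TIER_1 = {
    "deepseek-v3":  {"vision": False},
    "gpt-4o-mini":  {"vision": True},
    "claude-haiku": {"vision": True},
    "gemini-flash": {"vision": True},
    "glm-5":        {"vision": False},  # text-only per the table above
    "kimi-k2.5":    {"vision": True},   # multimodal per the table above
}

def pick_tier1(needs_vision: bool) -> str:
    """Return the first Tier 1 model that satisfies the vision requirement."""
    for name, caps in TIER_1.items():
        if not needs_vision or caps["vision"]:
            return name
    raise LookupError("no Tier 1 model satisfies the requirement")
```

With this guard, a screenshot-analysis task never lands on a text-only model: `pick_tier1(True)` skips `deepseek-v3` and `glm-5` and returns the first vision-capable entry.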
### Tier 2: Mid ($1-5/M tokens)

| Model | Input | Output | Best For |
|-------|-------|--------|----------|
| Claude Sonnet | $3.00 | $15.00 | Balanced performance |
| GPT-4o | $2.50 | $10.00 | Multimodal tasks |
| Gemini Pro | $1.25 | $5.00 | Long context |

### Tier 3: Premium ($10-75/M tokens)

| Model | Input | Output | Best For |
|-------|-------|--------|----------|
| Claude Opus | $15.00 | $75.00 | Complex reasoning |
| GPT-4.5 | $75.00 | $150.00 | Frontier tasks |
| o1 | $15.00 | $60.00 | Multi-step reasoning |
| o3-mini | $1.10 | $4.40 | Reasoning on budget |

*Prices as of Feb 2026. Check provider docs for current rates.*

## Task Classification

Before executing any task, classify it:

### ROUTINE → Use Tier 1

**Requires image/vision** → Do not assign to text-only models (GLM 5, etc.). Use a vision-capable model from Tier 1/2 or 3 (e.g. Kimi K2.5, GPT-4o, Gemini, Claude, GLM-4.5V).

Characteristics:
- Single-step operations
- Clear, unambiguous instructions
- No judgment required
- Deterministic output expected

Examples:
- File read/write operations
- Status checks and health monitoring
- Simple lookups (time, weather, definitions)
- Formatting and restructuring text
- List operations (filter, sort, transform)
- API calls with known parameters
- Heartbeat and cron tasks
- URL fetching and basic parsing

### MODERATE → Use Tier 2

Characteristics:
- Multi-step but well-defined
- Some synthesis required
- Standard patterns apply
- Quality matters but isn't critical

Examples:
- Code generation (standard patterns)
- Summarization and synthesis
- Draft writing (emails, docs, messages)
- Data analysis and transformation
- Multi-file operations
- Tool orchestration
- Code review (non-security)
- Search and research tasks

### COMPLEX → Use Tier 3

Characteristics:
- Novel problem solving required
- Multiple valid approaches
- Nuanced judgment calls
- High stakes or irreversible
- Previous attempts failed

Examples:
- Multi-step debugging
- Architecture and design decisions
- Security-sensitive code review
- Tasks where cheaper model already failed
- Ambiguous requirements needing interpretation
- Long-context reasoning (>50K tokens)
- Creative work requiring originality
- Adversarial or edge-case handling

## Decision Algorithm

```
function selectModel(task):
    # Rule 1: Vision override (Tier 1/2 includes text-only models)
    if task.requiresImageInput or task.requiresVision:
        # e.g. Kimi K2.5, GPT-4o, Gemini, Claude;
        # do not use GLM 5 or other text-only models
        return VISION_CAPABLE_MODEL

    # Rule 2: Escalation override
    if task.previousAttemptFailed:
        return nextTierUp(task.previousModel)

    # Rule 3: Explicit complexity signals
    if task.hasSignal("debug", "architect", "design", "security"):
        return TIER_3
    if task.hasSignal("write", "code", "summarize", "analyze"):
        return TIER_2

    # Rule 4: Default classification
    complexity = classifyTask(task)
    if complexity == ROUTINE:
        return TIER_1
    elif complexity == MODERATE:
        return TIER_2
    else:
        return TIER_3
```

## Behavioral Rules

### For Main Session

1. **Default to Tier 2** for interactive work
2. **Suggest downgrade** when doing routine work: "This is routine - I can handle this on a cheaper model or spawn a sub-agent."
3. **Request upgrade** when stuck: "This needs more reasoning power. Switching to [premium model]."

### For Sub-Agents

1. **Default to Tier 1** unless task is clearly moderate+
2. **Batch similar tasks** to amortize overhead
3. **Report failures** back to parent for escalation

### For Automated Tasks

1. **Heartbeats/monitoring** → Always Tier 1
2. **Scheduled reports** → Tier 1 or 2 based on complexity
3. **Alert responses** → Start Tier 2, escalate if needed

## Communication Patterns

When suggesting model changes, use clear language:

**Downgrade suggestion:**
> "This looks like routine file work. Want me to spawn a sub-agent on DeepSeek for this? Same result, fraction of the cost."

**Upgrade request:**
> "I'm hitting the limits of what I can figure out here. This needs Opus-level reasoning. Switching up."

**Explaining hierarchy:**
> "I'm running the heavy analysis on Sonnet while sub-agents fetch the data on DeepSeek. Keeps costs down without sacrificing quality where it matters."

## Cost Impact

Assuming 100K tokens/day average usage:

| Strategy | Monthly Cost | Notes |
|----------|--------------|-------|
| Pure Opus | ~$225 | Maximum capability, maximum spend |
| Pure Sonnet | ~$45 | Good default for most work |
| Pure DeepSeek | ~$8 | Cheap but limited on hard problems |
| **Hierarchy (80/15/5)** | **~$19** | Best of all worlds |

The 80/15/5 split:
- 80% routine tasks on Tier 1 (~$6)
- 15% moderate tasks on Tier 2 (~$7)
- 5% complex tasks on Tier 3 (~$6)

**Result: 10x cost reduction vs pure premium, with equivalent quality on complex tasks.**

## Integration Examples

### OpenClaw

```yaml
# config.yml - set default model
model: anthropic/claude-sonnet-4

# In session, switch models
/model opus      # upgrade for complex task
/model deepseek  # downgrade for routine

# Spawn sub-agent on cheap model
sessions_spawn:
  task: "Fetch and parse these 50 URLs"
  model: deepseek
```

**OpenRouter (Tier 1 with vision or text-only):**

```yaml
# Tier 1 with vision — Kimi K2.5 (multimodal)
model: openrouter/moonshotai/kimi-k2.5
# Heartbeats, cron, image-involving tasks: K2.5 handles text and vision.

# Tier 1 text-only — GLM 5 (no vision)
# model: openrouter/z-ai/glm-5  # exact ID TBD on OpenRouter Z.AI
# Routine text-only only; for image tasks use Kimi K2.5 or another vision-capable model.
```

### Claude Code

```
# In CLAUDE.md or project instructions
When spawning background agents, use claude-3-haiku for:
- File operations
- Simple searches
- Status checks

Reserve claude-sonnet-4 for:
- Code generation
- Analysis tasks
```

### General Agent Systems

```python
def get_model_for_task(task_description: str) -> str:
    routine_signals = ['read', 'fetch', 'check', 'list', 'format', 'status']
    complex_signals = ['debug', 'architect', 'design', 'security', 'why']

    desc_lower = task_description.lower()
    if any(signal in desc_lower for signal in complex_signals):
        return "claude-opus-4"
    elif any(signal in desc_lower for signal in routine_signals):
        return "deepseek-v3"
    else:
        return "claude-sonnet-4"
```

## Anti-Patterns

**DON'T:**
- Run heartbeats on Opus
- Use premium models for file I/O
- Keep expensive model when task is clearly routine
- Spawn sub-agents on premium models by default
- Use GLM 5 (or any text-only Tier 1/2 model) for image/vision tasks — e.g. photo analysis, screenshot understanding, image-generation skills, or any tool that takes image input

**DO:**
- Start mid-tier, adjust based on task
- Spawn helpers on cheapest viable model
- Escalate explicitly when stuck
- Track cost per task type to optimize further

## Extending This Skill

To customize for your use case:

1. **Adjust tier definitions** based on your provider/budget
2. **Add domain-specific signals** to classification rules
3. **Track actual complexity** vs predicted to improve heuristics
4. **Set budget alerts** to catch runaway premium usage
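Steps 3 and 4 above (tracking cost per task type, alerting on runaway spend) can be sketched in a few lines. The `CostTracker` class below is a hypothetical helper, not part of this skill, and the per-million-token prices are illustrative snapshots, not live rates.

```python
from collections import defaultdict

# Illustrative (input, output) prices in USD per million tokens.
# These are assumptions; check provider docs for current rates.
PRICE_PER_M = {
    "deepseek-v3":     (0.14, 0.28),
    "claude-sonnet-4": (3.00, 15.00),
    "claude-opus-4":   (15.00, 75.00),
}

class CostTracker:
    """Accumulates spend per task type and warns when a budget is exceeded."""

    def __init__(self, monthly_budget_usd: float):
        self.budget = monthly_budget_usd
        self.spend_by_task = defaultdict(float)

    def record(self, task_type: str, model: str,
               tokens_in: int, tokens_out: int) -> float:
        price_in, price_out = PRICE_PER_M[model]
        cost = (tokens_in * price_in + tokens_out * price_out) / 1_000_000
        self.spend_by_task[task_type] += cost
        if sum(self.spend_by_task.values()) > self.budget:
            print(f"budget alert: monthly spend exceeds ${self.budget:.2f}")
        return cost
```

For example, `CostTracker(19).record("routine", "deepseek-v3", 1_000_000, 500_000)` charges 1M input tokens at $0.14/M plus 0.5M output tokens at $0.28/M, i.e. $0.28 total, in line with the ~$19/month hierarchy budget above.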


Security Status

Verified

Manually verified by security team

