
Model Selection & Troubleshooting

The model you assign to a Kin has a massive impact on how well it performs — especially for autonomous tasks. This guide helps you choose the right model and debug common problems.

Models for autonomous Kins

Autonomous Kins run crons, process webhooks, and work without human oversight. They must reliably call tools.

| Model | Provider | Verdict | Notes |
| --- | --- | --- | --- |
| Claude Sonnet 4 | Anthropic | ✅ Best choice | Excellent tool use, follows complex instructions |
| Claude Sonnet 3.5 | Anthropic | ✅ Excellent | Battle-tested, great cost/performance ratio |
| Claude Haiku 3.5 | Anthropic | ✅ Good for simple tasks | Fast and cheap, but less reliable on complex multi-step workflows |
| GPT-4o | OpenAI | ⚠️ Usable with caveats | Sometimes falls into “text mode” — needs stronger prompting |
| GPT-4o-mini | OpenAI | ⚠️ Limited | Struggles with complex tool sequences |
| Gemini 2.5 Pro | Google | ✅ Good | Strong tool use, very large context window |
| Gemini 2.5 Flash | Google | ⚠️ Usable | Fast but sometimes skips tool calls on complex tasks |
| DeepSeek V3 | DeepSeek | ⚠️ Usable | Can work but less consistent on multi-step tool use |
| Llama 3.x (70B+) | Groq/Together/Ollama | ⚠️ Limited | Open models struggle with reliable tool calling |
| Mistral Large | Mistral | ⚠️ Usable | Decent tool use but less consistent than Claude |

Models for conversational Kins

Conversational Kins primarily chat with users and occasionally use tools. Most capable models work fine.

| Model | Provider | Verdict |
| --- | --- | --- |
| Claude Sonnet 4 | Anthropic | ✅ Excellent |
| Claude Haiku 3.5 | Anthropic | ✅ Great for fast responses |
| GPT-4o | OpenAI | ✅ Excellent |
| GPT-4o-mini | OpenAI | ✅ Good and cheap |
| Gemini 2.5 Pro | Google | ✅ Excellent |
| Gemini 2.5 Flash | Google | ✅ Fast and capable |
| Llama 3.x (70B+) | Groq/Together/Ollama | ✅ Good for self-hosted |

The most common issue with autonomous Kins is the model falling into text mode — where it describes what it would do instead of actually calling tools.

Instead of calling web_search("latest AI news"), the model outputs:

I’ll search the web for the latest AI news and compile a summary. Let me start by looking at major tech publications for recent developments in artificial intelligence…

No tool calls appear. The model writes a plausible-sounding response entirely from its training data, without accessing any real information.
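Programmatically, text mode is easy to detect: a healthy agent turn contains tool-call blocks, while a text-mode turn contains only prose. A minimal sketch, assuming responses shaped like Anthropic-style content blocks (a list of dicts with a `type` field; other providers structure this differently):

```python
def used_tools(content_blocks: list[dict]) -> bool:
    """Return True if any content block is a tool call.

    Assumes Anthropic-style blocks, where a tool call has
    type "tool_use" and plain prose has type "text".
    """
    return any(block.get("type") == "tool_use" for block in content_blocks)

# A text-mode response: prose only, no tool calls.
text_mode = [
    {"type": "text", "text": "I'll search the web for the latest AI news..."},
]

# A working response: the model actually invoked web_search.
tool_mode = [
    {"type": "tool_use", "name": "web_search",
     "input": {"query": "latest AI news", "freshness": "pd"}},
]
```

A check like this can run over stored task transcripts to flag runs that produced no tool calls at all.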

Why it happens:

  1. Model capability — Some models aren’t trained for reliable function calling
  2. Prompt ambiguity — If the prompt sounds like a conversation, the model converses instead of acting
  3. Missing instruction — The model doesn’t know it should USE tools rather than DESCRIBE tool usage
  4. Context confusion — Very long contexts can cause the model to “forget” it has tools available

1. Switch to a Claude Sonnet model

Claude Sonnet models are specifically trained for tool use. If you’re experiencing text mode with another model, switch to Claude Sonnet first — this fixes the problem in most cases.

2. Strengthen the system prompt

In your Kin’s system prompt, include:

You ALWAYS use tools to accomplish tasks. You NEVER describe what you would do —
you DO it by calling the appropriate tools.

When you need information, call web_search or browse_url.
When you need to save something, call memorize or write_file.
When you need to process data, call the relevant tools step by step.

WRONG: "I'll search for the latest news about AI..."
RIGHT: [calls web_search("latest AI news", freshness="pd")]
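If you call a model directly rather than through the KinBot UI, the same instruction belongs in the request's system field, alongside the tool definitions. A sketch of a request body in the Anthropic Messages style (the model id and tool schema here are illustrative, not verified values):

```python
TOOL_USE_SYSTEM_PROMPT = (
    "You ALWAYS use tools to accomplish tasks. You NEVER describe what "
    "you would do -- you DO it by calling the appropriate tools."
)

# Illustrative request body; the tool schema follows the JSON Schema
# convention used by most function-calling APIs.
request = {
    "model": "claude-sonnet-4",  # placeholder id; check your provider's model list
    "max_tokens": 1024,
    "system": TOOL_USE_SYSTEM_PROMPT,
    "tools": [
        {
            "name": "web_search",
            "description": "Search the web and return result snippets.",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "freshness": {"type": "string",
                                  "description": "e.g. 'pd' for past day"},
                },
                "required": ["query"],
            },
        }
    ],
    "messages": [{"role": "user", "content": "Summarize today's AI news."}],
}
```

The key point is that the tool-use instruction lives in the system field, where it applies to every turn, rather than in a single user message.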

3. Use the EXEC pattern in task descriptions


For sub-Kin tasks (crons, webhooks), structure the task description as explicit commands:

## Steps — EXECUTE each one using tools
EXEC: web_search("artificial intelligence news", freshness="pd")
EXEC: browse_url on the top 3 results
EXEC: memorize the key findings
EXEC: update_task_status("completed", summary)
Do NOT describe these steps. CALL the tools.

This pattern tells the model unambiguously that it should execute tool calls, not write about them.
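The pattern is mechanical enough to generate. A small helper (hypothetical; not part of KinBot) that renders a list of steps as an EXEC-style task description:

```python
def build_exec_task(steps: list[str]) -> str:
    """Render steps as explicit EXEC commands so the model treats them
    as tool calls to perform, not prose to produce."""
    lines = ["## Steps - EXECUTE each one using tools"]
    lines += [f"EXEC: {step}" for step in steps]
    lines.append("Do NOT describe these steps. CALL the tools.")
    return "\n".join(lines)

task = build_exec_task([
    'web_search("artificial intelligence news", freshness="pd")',
    "browse_url on the top 3 results",
    "memorize the key findings",
])
```

Generating the description from a step list keeps the imperative framing consistent across all your crons and webhooks.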

In the KinBot UI, each message shows whether tool calls were made. Look for the tool call indicators (collapsible sections showing the tool name and parameters). If a response has no tool calls, the Kin operated in text mode.

Anthropic

  1. Get an API key from console.anthropic.com
  2. In KinBot, go to Settings > Providers > Add Provider
  3. Select Anthropic, paste your API key
  4. The connection test will verify models are accessible

Anthropic also supports OAuth via Claude Max — no API key needed if you have a Claude Max subscription.

OpenAI

  1. Get an API key from platform.openai.com
  2. Add as a provider in KinBot
  3. For autonomous Kins, use gpt-4o (not gpt-4o-mini)

Ollama (local)

  1. Install Ollama and pull a model: ollama pull llama3.3:70b
  2. In KinBot, add Ollama as a provider with base URL http://localhost:11434
  3. From Docker, use http://host.docker.internal:11434
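The right base URL depends on where KinBot runs relative to Ollama. A small sketch of that choice (assumes Ollama's default port, 11434):

```python
def ollama_base_url(kinbot_in_docker: bool) -> str:
    """Pick the Ollama base URL to configure as the provider endpoint.

    Inside a container, "localhost" refers to the container itself, so
    Docker's host alias is needed to reach Ollama running on the host.
    """
    host = "host.docker.internal" if kinbot_in_docker else "localhost"
    return f"http://{host}:11434"
```

On Linux, `host.docker.internal` may additionally require the `--add-host=host.docker.internal:host-gateway` flag when starting the container.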

OpenRouter

  1. Get an API key from openrouter.ai
  2. Add as a provider in KinBot
  3. You can access Claude, GPT-4o, Gemini, and many other models through a single provider

OpenRouter is convenient if you want to test different models without setting up multiple providers.
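Because OpenRouter exposes an OpenAI-compatible chat completions API, switching models is just a string change in the request body. A sketch (the model ids shown are illustrative; check openrouter.ai for the current list):

```python
# One endpoint, many models: comparing providers is a string change.
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"

def chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat completion body for OpenRouter."""
    return {
        "model": model,  # e.g. "anthropic/claude-3.5-sonnet" (illustrative id)
        "messages": [{"role": "user", "content": prompt}],
    }

# The same prompt routed to two different providers.
candidates = [
    chat_request("anthropic/claude-3.5-sonnet", "ping"),
    chat_request("openai/gpt-4o", "ping"),
]
```

This makes A/B-testing a Kin's behavior across models straightforward: run the same task description against each candidate and compare tool-call rates.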

After setting up a Kin, verify it’s actually using tools:

Send your Kin a message that requires a tool call:

What’s the current weather in Paris? Use web search to find out.

A working Kin will call web_search and return real, current data. A text-mode Kin will make up a plausible weather report.

  1. Create a simple cron job: “Search the web for ‘KinBot’ and summarize what you find”
  2. Trigger it manually
  3. Check the task result — does it contain actual search results or fabricated content?
  4. Look at the task detail for tool call indicators

What to look for:

  • Tool call sections: Each message shows collapsible tool call blocks. No blocks = no tools were called
  • Task status: Autonomous tasks should end with completed and a meaningful result
  • Cron journal: Check get_cron_journal for execution history — failed runs often indicate tool issues

Autonomous Kins consume more tokens than conversational ones because:

  • Cron jobs run on schedule regardless of whether there’s work to do
  • Webhook tasks process each event individually
  • Sub-tasks each require their own LLM call(s)
  • Tool results are included in the context, adding to input token count

| Tip | Impact |
| --- | --- |
| Use Haiku for simple, single-step crons | 5–10x cheaper than Sonnet |
| Add webhook payload filters | Avoid processing irrelevant events |
| Set concurrency limits on webhook tasks | Prevent burst cost spikes |
| Use concise task descriptions | Fewer input tokens per run |
| Store results in memory instead of long outputs | Keeps future context smaller |
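These tips compound, and the savings are easy to estimate. A back-of-envelope sketch (the per-million-token prices are illustrative placeholders, not real rates; check your provider's pricing page):

```python
def monthly_cost_usd(runs_per_day: int, input_tokens: int, output_tokens: int,
                     price_in_per_mtok: float, price_out_per_mtok: float) -> float:
    """Estimate monthly spend for a scheduled (cron) Kin."""
    per_run = (input_tokens * price_in_per_mtok +
               output_tokens * price_out_per_mtok) / 1_000_000
    return per_run * runs_per_day * 30

# An hourly cron using ~6k input / 1k output tokens per run.
# ILLUSTRATIVE prices per million tokens; not real rates.
sonnet = monthly_cost_usd(24, 6_000, 1_000, 3.00, 15.00)
haiku = monthly_cost_usd(24, 6_000, 1_000, 0.80, 4.00)
```

Running the numbers before deploying a cron makes the "use Haiku for simple crons" tip concrete: at these example rates the same schedule costs several times less.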

Quick reference: model selection flowchart

  1. Is the Kin autonomous? (crons, webhooks, sub-tasks)

    • Yes → Claude Sonnet 4 or Claude Sonnet 3.5
    • No → continue
  2. Does the Kin use tools frequently?

    • Yes → Claude Sonnet 3.5 or GPT-4o
    • No → continue
  3. Is cost the primary concern?

    • Yes → Claude Haiku 3.5 or GPT-4o-mini
    • No → Claude Sonnet 3.5 (best all-rounder)
  4. Must it be self-hosted?

    • Yes → Llama 3.3 70B+ via Ollama (conversational) or Gemini 2.5 Flash via API (agentic)
    • No → Use a cloud provider