OpenClaw + DeepSeek: Self-Hosted Agent with a Powerful Low-Cost Local Model (2026)

Hey fellow AI tinkerers — if you've been running autonomous agents and suddenly stared at your API bill like it personally insulted you, this one's for you.

I'm Hanks. I test workflows the way a mechanic stress-tests brakes: by trying to make them fail first. I've been running OpenClaw since the early Moltbot days, and after watching it explode to 140,000+ GitHub stars in a matter of weeks, I figured it's time to document the DeepSeek setup properly — because the official docs don't tell you everything.

The core question I kept asking myself: Can DeepSeek actually carry real agent tasks in OpenClaw, or does it just survive the demo and fall apart at scale?

Here's what I found.


Why Use DeepSeek with OpenClaw?

OpenClaw is model-agnostic by design. You bring the brain; it handles the execution layer — skills, channels, file access, cron jobs, all of it. The default most people reach for is Claude or GPT-4o. Both are excellent. Both will quietly drain your account when you have agents looping overnight.

That's where DeepSeek changes the math.

DeepSeek's Strengths for Agentic Tasks (Reasoning, Code)

DeepSeek ships two API models right now, both running DeepSeek-V3.2 under the hood:

  • deepseek-chat (non-thinking mode): General tasks, tool calls, summarization. Fast, cheap.
  • deepseek-reasoner (thinking mode): Chain-of-Thought reasoning for math, logic, and code-heavy workflows. Slower, but hits harder on complex problems.

Both have a 128K context window. The reasoner maxes at 64K output tokens — which matters for long code generation tasks.

What surprised me: tool calling works. OpenClaw's agent loop depends on reliable tool invocation, and DeepSeek handles it cleanly enough for most workflows. It occasionally hallucinates argument names and tries to call nonexistent tools, but OpenClaw's error handling catches most of these and retries automatically.
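
The validate-and-retry pattern is worth understanding even if OpenClaw does it for you. Here's a minimal sketch of the idea; the tool registry, function names, and error format below are my own illustration, not OpenClaw's internals:

```python
# Sketch of the validate-and-retry pattern an agent loop can use when a
# model hallucinates a tool name or argument. All names here are
# illustrative stand-ins, not OpenClaw's actual API.

TOOLS = {
    "read_email": {"required": {"folder", "limit"}},
    "send_message": {"required": {"channel", "text"}},
}

def validate_call(name, args):
    """Return a list of problems with a proposed tool call (empty if valid)."""
    if name not in TOOLS:
        return [f"unknown tool: {name}"]
    missing = TOOLS[name]["required"] - set(args)
    return [f"missing args: {sorted(missing)}"] if missing else []

def run_with_retry(propose, execute, max_attempts=3):
    """Ask the model for a tool call, validate it, and feed any errors
    back into the next model turn instead of failing outright."""
    feedback = None
    for _ in range(max_attempts):
        name, args = propose(feedback)      # the actual model call goes here
        problems = validate_call(name, args)
        if not problems:
            return execute(name, args)
        feedback = "; ".join(problems)      # surfaced to the model on retry
    raise RuntimeError(f"tool call failed after {max_attempts} attempts: {feedback}")
```

The key design choice is that the validation error goes back into the model's context as feedback, which is usually enough for DeepSeek to self-correct on the second attempt.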

Two Ways to Use It: Local (Ollama) vs. DeepSeek API

You've got two options, and they suit different use cases:

|                   | Local via Ollama                   | DeepSeek Cloud API            |
|-------------------|------------------------------------|-------------------------------|
| Cost              | $0 (electricity only)              | $0.28/M input, $0.42/M output |
| Privacy           | Full; nothing leaves your machine  | Data sent to DeepSeek servers |
| Speed             | Depends on your hardware           | Fast, consistent              |
| Internet required | No (after model pull)              | Yes                           |
| Best for          | Tinkering, sensitive data, offline | Production tasks, reliability |

Pick local if you're experimenting or have sensitive files. Pick the API if you need consistent performance without VRAM roulette.


Option A — Running DeepSeek Locally via Ollama

This is the "free forever" path. Fun, but physics applies.

Hardware Requirements for DeepSeek-R1 / V3

Before you pull anything, know what you're getting into. I learned this the hard way:

| Model            | VRAM required | RAM (CPU fallback) | Realistic speed           |
|------------------|---------------|--------------------|---------------------------|
| deepseek-r1:1.5b | ~2GB          | 8GB                | Fast                      |
| deepseek-r1:8b   | ~8GB          | 16GB               | Usable                    |
| deepseek-r1:32b  | ~20GB         | 32GB+              | Slow on consumer hardware |
| deepseek-r1:70b  | ~40GB         | Not recommended    | Segfault territory        |

My honest take: on a MacBook Pro M2 with 16GB unified memory, the 8B model is the ceiling for a good experience. Anything larger and you're watching tokens crawl. If you're on a machine with less than 16GB RAM, start with deepseek-r1:1.5b just to validate the setup, then scale up later.
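
If you want to encode that decision rather than eyeball it, the thresholds from the table above reduce to a few lines. This helper is purely illustrative (the function and its cutoffs are mine, taken from the table):

```python
def recommend_model(memory_gb):
    """Pick the largest comfortable deepseek-r1 tag for the available
    (V)RAM in GB, using the thresholds from the table above."""
    tiers = [
        (20, "deepseek-r1:32b"),   # slow on consumer hardware, but runs
        (8,  "deepseek-r1:8b"),    # the sweet spot for 16GB machines
        (2,  "deepseek-r1:1.5b"),  # setup-validation tier
    ]
    for floor, tag in tiers:
        if memory_gb >= floor:
            return tag
    return None  # below the local-inference floor; use the cloud API instead
```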

Pulling the Model and Connecting to OpenClaw

First, install Ollama and pull your model:

# Start small and verify it works
ollama pull deepseek-r1:8b
# Larger option if your hardware can handle it
ollama pull deepseek-r1:32b

Once Ollama is running on port 11434, connect OpenClaw via your config at ~/.openclaw/openclaw.json:

{
  "models": {
    "providers": {
      "ollama": {
        "baseUrl": "http://localhost:11434/v1",
        "apiKey": "ollama-local",
        "api": "openai-completions",
        "models": [
          {
            "id": "ollama/deepseek-r1:8b",
            "name": "deepseek-r1:8b",
            "reasoning": false,
            "input": ["text"],
            "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 },
            "contextWindow": 128000,
            "maxTokens": 8000
          }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "ollama/deepseek-r1:8b"
      }
    }
  }
}

Critical detail from the official OpenClaw docs: If your TUI shows 0/200k tokens and hangs, you may need to swap "api": "openai-completions" for "api": "openai-responses". This tripped me up for longer than I'd like to admit.

After editing, restart the gateway:

openclaw gateway restart
openclaw doctor --fix

You should see [INFO] Connected to Ollama in the logs. No internet required after this point.


Option B — Using the DeepSeek Cloud API with OpenClaw

This is the setup I actually use day-to-day. The local path is great for testing; the API is better when you need things to actually finish.

Getting Your DeepSeek API Key

Sign up at platform.deepseek.com. New accounts get 5 million free tokens on registration — no credit card required. That's enough to run a meaningful amount of agent tasks before you need to top up.

Setting OPENAI_BASE_URL and Model Name in OpenClaw

DeepSeek exposes an OpenAI-compatible API at https://api.deepseek.com/v1. OpenClaw handles this cleanly through the openai-compatible provider block:

{
  "env": {
    "DEEPSEEK_API_KEY": "your-deepseek-api-key"
  },
  "models": {
    "providers": {
      "openai-compatible": {
        "baseUrl": "https://api.deepseek.com/v1",
        "headers": {
          "Authorization": "Bearer $DEEPSEEK_API_KEY"
        }
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "openai-compatible/deepseek-chat"
      }
    }
  }
}

Swap deepseek-chat for deepseek-reasoner if you need chain-of-thought reasoning on complex tasks. Be aware the reasoner burns more tokens on its internal thinking steps — useful for hard problems, overkill for routing and summarization.
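
If you route between the two programmatically, even a crude heuristic beats sending everything to the reasoner. The keyword list below is a hypothetical stand-in for whatever routing logic your agent actually uses:

```python
# Illustrative routing rule: send code- and logic-heavy prompts to the
# reasoner, everything else to the cheaper non-thinking model.

REASONER_HINTS = ("prove", "debug", "refactor", "algorithm", "derive", "optimize")

def pick_deepseek_model(prompt: str) -> str:
    """Choose a DeepSeek model based on a simple keyword heuristic."""
    text = prompt.lower()
    if any(hint in text for hint in REASONER_HINTS):
        return "deepseek-reasoner"
    return "deepseek-chat"
```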

Cost Estimate vs. OpenAI GPT-4o

Here's the part that made me do a double-take. Current DeepSeek V3.2 API pricing (as of February 2026):

| Model                  | Input (cache hit) | Input (cache miss) | Output  |
|------------------------|-------------------|--------------------|---------|
| deepseek-chat          | $0.028/M          | $0.28/M            | $0.42/M |
| deepseek-reasoner      | $0.028/M          | $0.28/M            | $0.42/M |
| GPT-4o (for reference) | ~$1.25/M          | ~$2.50/M           | ~$10/M  |

Context caching is on by default — when your agent reuses the same system prompt or conversation prefix, you're paying the cache-hit rate automatically. For multi-turn agent loops, that adds up to real savings fast.
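
To make that concrete, here's a back-of-envelope cost model using the rates from the table above. The 90% cache-hit ratio is my assumption for a typical agent loop that resends the same system prompt and conversation prefix every turn:

```python
# Back-of-envelope cost model for a multi-turn agent loop, using the
# DeepSeek V3.2 rates above (USD per 1M tokens).

RATES = {"cache_hit": 0.028, "cache_miss": 0.28, "output": 0.42}

def run_cost(input_tokens, output_tokens, cache_hit_ratio):
    """Total USD cost for a workload at a given cache-hit ratio."""
    hit = input_tokens * cache_hit_ratio
    miss = input_tokens - hit
    return (hit * RATES["cache_hit"]
            + miss * RATES["cache_miss"]
            + output_tokens * RATES["output"]) / 1_000_000

# 100 turns at 8K input + 1K output per turn:
with_cache = run_cost(100 * 8_000, 100 * 1_000, cache_hit_ratio=0.9)  # ~$0.085
no_cache   = run_cost(100 * 8_000, 100 * 1_000, cache_hit_ratio=0.0)  # ~$0.266
```

Roughly a 3x difference on the same workload, and you don't have to do anything to get it: caching is automatic on prefix reuse.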


Testing the Setup — Sample Automations

Simple Task Routing Test

Once connected, send this to your OpenClaw agent via Telegram or WhatsApp:

Summarize the last 5 emails in my inbox and flag anything urgent.

With deepseek-chat, this completes in a few seconds. The model handles instruction-following and tool dispatch cleanly. I ran this 20+ times across a week of real email — it flagged genuinely urgent items correctly about 85% of the time, which is good enough for a first pass.

Code-Generation Skill Test

Switch to deepseek-reasoner and try:

Write a Python script that monitors a folder for new CSV files and 
automatically generates a summary report.

This is where the reasoning model earns its keep. The CoT process adds latency (expect 20–40 seconds), but the output quality is noticeably better on multi-step code tasks. The chain-of-thought tokens count toward output billing at $0.42/M, so be deliberate about when you use it.

One thing I always do: add "Be concise. Output only the final answer, not your thinking process." to your system prompt. Otherwise DeepSeek-R1 narrates every reasoning step in the response, which burns tokens and clutters your agent's context window.
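
When the prompt-level instruction isn't enough, a post-processing strip works as a safety net. DeepSeek-R1 typically wraps its reasoning in `<think>...</think>` tags, so a small filter (my own sketch, not an OpenClaw feature) can drop them before the response hits your agent's context:

```python
import re

# DeepSeek-R1 usually wraps chain-of-thought in <think>...</think> tags.
# Strip them so only the final answer reaches the agent's context window.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)

def strip_reasoning(response: str) -> str:
    """Remove reasoning blocks from a model response, keeping the answer."""
    return THINK_BLOCK.sub("", response).strip()
```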


Limitations and When to Switch Models

I want to be straight with you here. DeepSeek with OpenClaw works well for most tasks, but it's not a straight swap for Claude in every scenario.

Where DeepSeek struggles:

  • Complex multi-tool chains with ambiguous instructions (hallucinated tool calls happen more often than with Claude)
  • Multimodal tasks — DeepSeek V3.2 is text-only via API
  • Tasks requiring nuanced judgment or creative writing with a specific voice

Where local Ollama DeepSeek hits a wall:

  • Anything requiring consistent 128K context on consumer hardware — RAM consumption is brutal
  • Agentic loops that run overnight — thermal throttling on laptops kills inference speed
  • If you see segfaults or silent failures, you've hit your VRAM ceiling

My rule: use deepseek-chat via API for the 80% of everyday automation tasks. Keep Claude in your config as a fallback model for the edge cases that need it. OpenClaw's multi-model routing makes this easy — you can specify a fallback model in your agent defaults.
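
In config terms, that looks something like the sketch below. Note the `fallbacks` key and the Claude model ID are my assumptions based on the config style shown earlier; check docs.openclaw.ai for the exact field names your OpenClaw version expects:

```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "openai-compatible/deepseek-chat",
        "fallbacks": ["anthropic/claude-sonnet"]
      }
    }
  }
}
```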

At Macaron, we built our agent to handle exactly this kind of task-to-workflow execution — turning a quick message into a structured, repeatable automation without stitching together five different tools. If you want to test how your workflows run end-to-end without managing your own gateway, try it free at macaron.im and judge the results with a real task.


Frequently Asked Questions

Q: Does DeepSeek support tool calling in OpenClaw? Yes. Both deepseek-chat and deepseek-reasoner support tool calls. There are occasional argument hallucinations, but OpenClaw's retry logic handles most of them. For production workflows, add explicit tool descriptions to your skill definitions.

Q: Can I run DeepSeek locally on a MacBook? Yes, with the 8B model if you have 16GB unified memory. Anything larger will be slow. The OpenClaw + Ollama setup guide covers hardware recommendations in detail.

Q: Is the DeepSeek API safe to use with sensitive data? Your data is sent to DeepSeek servers. For sensitive documents or internal business data, the local Ollama route keeps everything on-device. The CrowdStrike security overview of OpenClaw covers broader security considerations worth reading before any enterprise deployment.

Q: What's the difference between deepseek-chat and deepseek-reasoner? Both now run DeepSeek-V3.2 under the hood. deepseek-chat is non-thinking mode — faster, cheaper for general tasks. deepseek-reasoner activates Chain-of-Thought reasoning with up to 64K output tokens — better for math, logic, and complex code. See the official DeepSeek pricing docs for current rates.

Q: Will OpenClaw keep working now that the creator joined OpenAI? Yes. Peter Steinberger announced the project moves to an open-source foundation with OpenAI sponsorship. The community around it is large enough that it'll keep evolving independently.


Data verified February 2026. DeepSeek pricing sourced from api-docs.deepseek.com. OpenClaw config syntax verified against docs.openclaw.ai and community config examples.

Hey, I’m Hanks — a workflow tinkerer and AI tool obsessive with over a decade of hands-on experience in automation, SaaS, and content creation. I spend my days testing tools so you don’t have to, breaking down complex processes into simple, actionable steps, and digging into the numbers behind “what actually works.”

Apply to become Macaron's first friends