[{"data":1,"prerenderedAt":1683},["ShallowReactive",2],{"blog-openclaw-ollama-local-models-guide":3},{"id":4,"title":5,"author":6,"body":7,"date":1668,"description":1669,"draft":1670,"extension":1671,"image":1672,"meta":1673,"navigation":1674,"path":1675,"seo":1676,"stem":1677,"tags":1678,"__hash__":1682},"blog/blog/openclaw-ollama-local-models-guide.md","How to Cut OpenClaw API Costs with Ollama and Local Models","ClawNest Team",{"type":8,"value":9,"toc":1643},"minimark",[10,15,24,27,42,46,53,58,93,106,110,119,124,127,160,163,179,183,194,450,453,514,518,524,569,572,576,583,663,667,674,694,697,701,710,714,735,757,765,956,963,967,975,988,992,1307,1314,1318,1321,1325,1328,1342,1347,1434,1437,1441,1444,1517,1524,1528,1531,1563,1566,1570,1613,1622,1625,1639],[11,12,14],"h2",{"id":13},"the-cost-problem","The Cost Problem",[16,17,18,19,23],"p",{},"If you're self-hosting OpenClaw with cloud APIs like OpenAI or Gemini, you've probably noticed the bills adding up fast. We've seen users reporting ",[20,21,22],"strong",{},"$20+ per week"," on basic search and browsing tasks — and that's with moderate usage.",[16,25,26],{},"The reason is token volume. OpenClaw's tool-use architecture means a single web search can consume millions of tokens. The agent reasons about which tools to call, processes search results, summarizes content, and formats a response. Each step burns tokens. Multiply that by dozens of queries per day and you're looking at serious monthly spend.",[16,28,29,30,33,34,37,38,41],{},"The good news: you have options. 
OpenClaw supports custom model providers, which means you can run ",[20,31,32],{},"local models for free",", use ",[20,35,36],{},"cheaper cloud alternatives",", or set up a ",[20,39,40],{},"hybrid approach"," that gives you the best of both worlds.",[11,43,45],{"id":44},"important-minimum-model-size","Important: Minimum Model Size",[16,47,48,49,52],{},"Before you get started, a critical caveat: ",[20,50,51],{},"OpenClaw requires models with strong tool-calling and reasoning capabilities",". Its agentic architecture involves multi-step planning, tool selection, structured output parsing, and long context windows. Small models simply can't handle this reliably.",[16,54,55],{},[20,56,57],{},"Minimum recommended sizes:",[59,60,61,68,74],"ul",{},[62,63,64,67],"li",{},[20,65,66],{},"32B+ parameters"," — the practical minimum for reliable OpenClaw usage (e.g. DeepSeek R1 32B, Qwen 2.5 32B)",[62,69,70,73],{},[20,71,72],{},"14B parameters"," — may work for simple tasks but will frequently fail on multi-step workflows",[62,75,76,79,80,83,84,88,89,92],{},[20,77,78],{},"7–8B parameters"," — ",[20,81,82],{},"not recommended",". Models like ",[85,86,87],"code",{},"llama3.1:8b"," or ",[85,90,91],{},"mistral:7b"," lack the reasoning depth for OpenClaw's tool-use chains and will produce errors, hallucinated tool calls, or get stuck in loops",[16,94,95,96,101,102,105],{},"If you don't have the hardware to run 32B+ models locally, consider ",[97,98,100],"a",{"href":99},"#option-2-use-cheaper-cloud-models-via-openrouter","OpenRouter"," or the ",[97,103,40],{"href":104},"#option-3-the-hybrid-approach"," instead.",[11,107,109],{"id":108},"option-1-run-local-models-with-ollama","Option 1: Run Local Models with Ollama",[16,111,112,118],{},[97,113,117],{"href":114,"rel":115},"https://ollama.com",[116],"nofollow","Ollama"," is an open-source tool that lets you run large language models locally on your own hardware. 
It exposes an OpenAI-compatible API, which means OpenClaw can use it as a drop-in replacement for cloud models.",[120,121,123],"h3",{"id":122},"installing-ollama","Installing Ollama",[16,125,126],{},"Install Ollama on the same machine as OpenClaw (or a separate machine — more on that later):",[128,129,134],"pre",{"className":130,"code":131,"language":132,"meta":133,"style":133},"language-bash shiki shiki-themes github-light github-dark","curl -fsSL https://ollama.com/install.sh | sh\n","bash","",[85,135,136],{"__ignoreMap":133},[137,138,141,145,149,153,157],"span",{"class":139,"line":140},"line",1,[137,142,144],{"class":143},"sScJk","curl",[137,146,148],{"class":147},"sj4cs"," -fsSL",[137,150,152],{"class":151},"sZZnC"," https://ollama.com/install.sh",[137,154,156],{"class":155},"szBVR"," |",[137,158,159],{"class":143}," sh\n",[16,161,162],{},"Then pull a model. We recommend starting with a 32B+ model:",[128,164,166],{"className":130,"code":165,"language":132,"meta":133,"style":133},"ollama pull qwen3:32b\n",[85,167,168],{"__ignoreMap":133},[137,169,170,173,176],{"class":139,"line":140},[137,171,172],{"class":143},"ollama",[137,174,175],{"class":151}," pull",[137,177,178],{"class":151}," qwen3:32b\n",[120,180,182],{"id":181},"configuring-openclaw-to-use-ollama","Configuring OpenClaw to Use Ollama",[16,184,185,186,189,190,193],{},"In your ",[85,187,188],{},"openclaw.json"," config, add Ollama as a custom provider under ",[85,191,192],{},"models.providers",":",[128,195,199],{"className":196,"code":197,"language":198,"meta":133,"style":133},"language-json shiki shiki-themes github-light github-dark","{\n  \"models\": {\n    \"providers\": {\n      \"ollama\": {\n        \"baseUrl\": \"http://localhost:11434/v1\",\n        \"api\": \"openai-completions\",\n        \"apiKey\": \"ollama\",\n        \"models\": [\n          {\n            \"id\": \"qwen3:32b\",\n            \"name\": \"Qwen 3 32B\",\n            \"reasoning\": true,\n            \"input\": [\"text\"],\n   
         \"cost\": { \"input\": 0, \"output\": 0, \"cacheRead\": 0, \"cacheWrite\": 0 },\n            \"contextWindow\": 32768,\n            \"maxTokens\": 8192\n          }\n        ]\n      }\n    }\n  }\n}\n","json",[85,200,201,207,216,224,232,247,260,273,282,288,301,314,327,342,390,403,414,420,426,432,438,444],{"__ignoreMap":133},[137,202,203],{"class":139,"line":140},[137,204,206],{"class":205},"sVt8B","{\n",[137,208,210,213],{"class":139,"line":209},2,[137,211,212],{"class":147},"  \"models\"",[137,214,215],{"class":205},": {\n",[137,217,219,222],{"class":139,"line":218},3,[137,220,221],{"class":147},"    \"providers\"",[137,223,215],{"class":205},[137,225,227,230],{"class":139,"line":226},4,[137,228,229],{"class":147},"      \"ollama\"",[137,231,215],{"class":205},[137,233,235,238,241,244],{"class":139,"line":234},5,[137,236,237],{"class":147},"        \"baseUrl\"",[137,239,240],{"class":205},": ",[137,242,243],{"class":151},"\"http://localhost:11434/v1\"",[137,245,246],{"class":205},",\n",[137,248,250,253,255,258],{"class":139,"line":249},6,[137,251,252],{"class":147},"        \"api\"",[137,254,240],{"class":205},[137,256,257],{"class":151},"\"openai-completions\"",[137,259,246],{"class":205},[137,261,263,266,268,271],{"class":139,"line":262},7,[137,264,265],{"class":147},"        \"apiKey\"",[137,267,240],{"class":205},[137,269,270],{"class":151},"\"ollama\"",[137,272,246],{"class":205},[137,274,276,279],{"class":139,"line":275},8,[137,277,278],{"class":147},"        \"models\"",[137,280,281],{"class":205},": [\n",[137,283,285],{"class":139,"line":284},9,[137,286,287],{"class":205},"          {\n",[137,289,291,294,296,299],{"class":139,"line":290},10,[137,292,293],{"class":147},"            \"id\"",[137,295,240],{"class":205},[137,297,298],{"class":151},"\"qwen3:32b\"",[137,300,246],{"class":205},[137,302,304,307,309,312],{"class":139,"line":303},11,[137,305,306],{"class":147},"            
\"name\"",[137,308,240],{"class":205},[137,310,311],{"class":151},"\"Qwen 3 32B\"",[137,313,246],{"class":205},[137,315,317,320,322,325],{"class":139,"line":316},12,[137,318,319],{"class":147},"            \"reasoning\"",[137,321,240],{"class":205},[137,323,324],{"class":147},"true",[137,326,246],{"class":205},[137,328,330,333,336,339],{"class":139,"line":329},13,[137,331,332],{"class":147},"            \"input\"",[137,334,335],{"class":205},": [",[137,337,338],{"class":151},"\"text\"",[137,340,341],{"class":205},"],\n",[137,343,345,348,351,354,356,359,362,365,367,369,371,374,376,378,380,383,385,387],{"class":139,"line":344},14,[137,346,347],{"class":147},"            \"cost\"",[137,349,350],{"class":205},": { ",[137,352,353],{"class":147},"\"input\"",[137,355,240],{"class":205},[137,357,358],{"class":147},"0",[137,360,361],{"class":205},", ",[137,363,364],{"class":147},"\"output\"",[137,366,240],{"class":205},[137,368,358],{"class":147},[137,370,361],{"class":205},[137,372,373],{"class":147},"\"cacheRead\"",[137,375,240],{"class":205},[137,377,358],{"class":147},[137,379,361],{"class":205},[137,381,382],{"class":147},"\"cacheWrite\"",[137,384,240],{"class":205},[137,386,358],{"class":147},[137,388,389],{"class":205}," },\n",[137,391,393,396,398,401],{"class":139,"line":392},15,[137,394,395],{"class":147},"            \"contextWindow\"",[137,397,240],{"class":205},[137,399,400],{"class":147},"32768",[137,402,246],{"class":205},[137,404,406,409,411],{"class":139,"line":405},16,[137,407,408],{"class":147},"            \"maxTokens\"",[137,410,240],{"class":205},[137,412,413],{"class":147},"8192\n",[137,415,417],{"class":139,"line":416},17,[137,418,419],{"class":205},"          }\n",[137,421,423],{"class":139,"line":422},18,[137,424,425],{"class":205},"        ]\n",[137,427,429],{"class":139,"line":428},19,[137,430,431],{"class":205},"      }\n",[137,433,435],{"class":139,"line":434},20,[137,436,437],{"class":205},"    
}\n",[137,439,441],{"class":139,"line":440},21,[137,442,443],{"class":205},"  }\n",[137,445,447],{"class":139,"line":446},22,[137,448,449],{"class":205},"}\n",[16,451,452],{},"The key settings:",[59,454,455,465,474,482],{},[62,456,457,460,461,464],{},[20,458,459],{},"baseUrl"," — points to Ollama's OpenAI-compatible endpoint (port 11434 by default, with ",[85,462,463],{},"/v1"," path)",[62,466,467,470,471,473],{},[20,468,469],{},"api"," — must be ",[85,472,257],{}," (the OpenAI Chat Completions API adapter)",[62,475,476,479,480],{},[20,477,478],{},"apiKey"," — required by OpenClaw even though Ollama doesn't need authentication. Use any placeholder value like ",[85,481,270],{},[62,483,484,487,488,491,492,361,495,361,498,361,501,361,504,361,507,510,511],{},[20,485,486],{},"models"," — list the models you've pulled with ",[85,489,490],{},"ollama pull",". Each model entry requires ",[85,493,494],{},"id",[85,496,497],{},"name",[85,499,500],{},"reasoning",[85,502,503],{},"input",[85,505,506],{},"cost",[85,508,509],{},"contextWindow",", and ",[85,512,513],{},"maxTokens",[120,515,517],{"id":516},"setting-a-local-model-as-default","Setting a Local Model as Default",[16,519,520,521,193],{},"To make OpenClaw use your local model by default instead of a cloud API, configure ",[85,522,523],{},"agents.defaults",[128,525,527],{"className":196,"code":526,"language":198,"meta":133,"style":133},"{\n  \"agents\": {\n    \"defaults\": {\n      \"model\": \"ollama/qwen3:32b\"\n    }\n  }\n}\n",[85,528,529,533,540,547,557,561,565],{"__ignoreMap":133},[137,530,531],{"class":139,"line":140},[137,532,206],{"class":205},[137,534,535,538],{"class":139,"line":209},[137,536,537],{"class":147},"  \"agents\"",[137,539,215],{"class":205},[137,541,542,545],{"class":139,"line":218},[137,543,544],{"class":147},"    \"defaults\"",[137,546,215],{"class":205},[137,548,549,552,554],{"class":139,"line":226},[137,550,551],{"class":147},"      
\"model\"",[137,553,240],{"class":205},[137,555,556],{"class":151},"\"ollama/qwen3:32b\"\n",[137,558,559],{"class":139,"line":234},[137,560,437],{"class":205},[137,562,563],{"class":139,"line":249},[137,564,443],{"class":205},[137,566,567],{"class":139,"line":262},[137,568,449],{"class":205},[16,570,571],{},"Now every new conversation starts with your local model — zero API cost.",[11,573,575],{"id":574},"hardware-requirements-for-local-models","Hardware Requirements for Local Models",[16,577,578,579,582],{},"Local models run on your CPU or GPU. The limiting factor is almost always ",[20,580,581],{},"memory"," — the model needs to fit entirely in RAM (or VRAM for GPU inference).",[584,585,586,605],"table",{},[587,588,589],"thead",{},[590,591,592,596,599,602],"tr",{},[593,594,595],"th",{},"RAM / VRAM",[593,597,598],{},"Model Size",[593,600,601],{},"OpenClaw Compatibility",[593,603,604],{},"Examples",[606,607,608,623,636,649],"tbody",{},[590,609,610,614,617,620],{},[611,612,613],"td",{},"8 GB",[611,615,616],{},"7B parameters",[611,618,619],{},"Not compatible — too small for tool-use",[611,621,622],{},"Llama 3.1 8B, Mistral 7B",[590,624,625,628,630,633],{},[611,626,627],{},"16 GB",[611,629,72],{},[611,631,632],{},"Limited — simple tasks only",[611,634,635],{},"Qwen 2.5 14B",[590,637,638,641,643,646],{},[611,639,640],{},"32 GB+",[611,642,66],{},[611,644,645],{},"Recommended — reliable for most tasks",[611,647,648],{},"DeepSeek R1 32B, Qwen 3 32B",[590,650,651,654,657,660],{},[611,652,653],{},"48 GB+ VRAM (GPU)",[611,655,656],{},"70B+ parameters",[611,658,659],{},"Excellent — comparable to mid-tier cloud models",[611,661,662],{},"Llama 3.1 70B, Mixtral 8x22B",[120,664,666],{"id":665},"gpu-recommendations","GPU Recommendations",[16,668,669,670,673],{},"Running models on a GPU is ",[20,671,672],{},"5–10x faster"," than CPU inference. 
If you're serious about local models:",[59,675,676,682,688],{},[62,677,678,681],{},[20,679,680],{},"RTX 3090 (24 GB VRAM)"," — runs 14B models at full speed, 32B models with quantization",[62,683,684,687],{},[20,685,686],{},"RTX 4090 (24 GB VRAM)"," — same capacity, faster inference",[62,689,690,693],{},[20,691,692],{},"Dual GPUs or 48 GB+ VRAM"," — needed for 70B+ models without heavy quantization",[16,695,696],{},"CPU-only inference works, but expect slower response times: around 5–15 tokens per second depending on your hardware and model size.",[11,698,700],{"id":699},"running-ollama-on-a-separate-machine","Running Ollama on a Separate Machine",[16,702,703,704,709],{},"Your VPS probably doesn't have a GPU, but your desktop at home might. You can run Ollama on a home machine with a GPU and connect your VPS to it securely using ",[97,705,708],{"href":706,"rel":707},"https://tailscale.com",[116],"Tailscale",".",[120,711,713],{"id":712},"setup","Setup",[715,716,717,723,729],"ol",{},[62,718,719,722],{},[20,720,721],{},"Install Tailscale"," on both your VPS and your home machine",[62,724,725,728],{},[20,726,727],{},"Install Ollama"," on your home machine (the one with the GPU)",[62,730,731,734],{},[20,732,733],{},"Start Ollama"," so that it listens on all network interfaces, not just localhost:",[128,736,738],{"className":130,"code":737,"language":132,"meta":133,"style":133},"OLLAMA_HOST=0.0.0.0 ollama serve\n",[85,739,740],{"__ignoreMap":133},[137,741,742,745,748,751,754],{"class":139,"line":140},[137,743,744],{"class":205},"OLLAMA_HOST",[137,746,747],{"class":155},"=",[137,749,750],{"class":151},"0.0.0.0",[137,752,753],{"class":143}," ollama",[137,755,756],{"class":151}," serve\n",[715,758,759],{"start":226},[62,760,761,764],{},[20,762,763],{},"Update OpenClaw config"," on your VPS to point to the Tailscale IP:",[128,766,768],{"className":196,"code":767,"language":198,"meta":133,"style":133},"{\n  \"models\": {\n    \"providers\": {\n      \"ollama\": {\n        \"baseUrl\": 
\"http://100.x.x.x:11434/v1\",\n        \"api\": \"openai-completions\",\n        \"apiKey\": \"ollama\",\n        \"models\": [\n          {\n            \"id\": \"deepseek-r1:32b\",\n            \"name\": \"DeepSeek R1 32B\",\n            \"reasoning\": true,\n            \"input\": [\"text\"],\n            \"cost\": { \"input\": 0, \"output\": 0, \"cacheRead\": 0, \"cacheWrite\": 0 },\n            \"contextWindow\": 65536,\n            \"maxTokens\": 8192\n          }\n        ]\n      }\n    }\n  }\n}\n",[85,769,770,774,780,786,792,803,813,823,829,833,844,855,865,875,913,924,932,936,940,944,948,952],{"__ignoreMap":133},[137,771,772],{"class":139,"line":140},[137,773,206],{"class":205},[137,775,776,778],{"class":139,"line":209},[137,777,212],{"class":147},[137,779,215],{"class":205},[137,781,782,784],{"class":139,"line":218},[137,783,221],{"class":147},[137,785,215],{"class":205},[137,787,788,790],{"class":139,"line":226},[137,789,229],{"class":147},[137,791,215],{"class":205},[137,793,794,796,798,801],{"class":139,"line":234},[137,795,237],{"class":147},[137,797,240],{"class":205},[137,799,800],{"class":151},"\"http://100.x.x.x:11434/v1\"",[137,802,246],{"class":205},[137,804,805,807,809,811],{"class":139,"line":249},[137,806,252],{"class":147},[137,808,240],{"class":205},[137,810,257],{"class":151},[137,812,246],{"class":205},[137,814,815,817,819,821],{"class":139,"line":262},[137,816,265],{"class":147},[137,818,240],{"class":205},[137,820,270],{"class":151},[137,822,246],{"class":205},[137,824,825,827],{"class":139,"line":275},[137,826,278],{"class":147},[137,828,281],{"class":205},[137,830,831],{"class":139,"line":284},[137,832,287],{"class":205},[137,834,835,837,839,842],{"class":139,"line":290},[137,836,293],{"class":147},[137,838,240],{"class":205},[137,840,841],{"class":151},"\"deepseek-r1:32b\"",[137,843,246],{"class":205},[137,845,846,848,850,853],{"class":139,"line":303},[137,847,306],{"class":147},[137,849,240],{"class":205},[137,851,852],{"class":151
},"\"DeepSeek R1 32B\"",[137,854,246],{"class":205},[137,856,857,859,861,863],{"class":139,"line":316},[137,858,319],{"class":147},[137,860,240],{"class":205},[137,862,324],{"class":147},[137,864,246],{"class":205},[137,866,867,869,871,873],{"class":139,"line":329},[137,868,332],{"class":147},[137,870,335],{"class":205},[137,872,338],{"class":151},[137,874,341],{"class":205},[137,876,877,879,881,883,885,887,889,891,893,895,897,899,901,903,905,907,909,911],{"class":139,"line":344},[137,878,347],{"class":147},[137,880,350],{"class":205},[137,882,353],{"class":147},[137,884,240],{"class":205},[137,886,358],{"class":147},[137,888,361],{"class":205},[137,890,364],{"class":147},[137,892,240],{"class":205},[137,894,358],{"class":147},[137,896,361],{"class":205},[137,898,373],{"class":147},[137,900,240],{"class":205},[137,902,358],{"class":147},[137,904,361],{"class":205},[137,906,382],{"class":147},[137,908,240],{"class":205},[137,910,358],{"class":147},[137,912,389],{"class":205},[137,914,915,917,919,922],{"class":139,"line":392},[137,916,395],{"class":147},[137,918,240],{"class":205},[137,920,921],{"class":147},"65536",[137,923,246],{"class":205},[137,925,926,928,930],{"class":139,"line":405},[137,927,408],{"class":147},[137,929,240],{"class":205},[137,931,413],{"class":147},[137,933,934],{"class":139,"line":416},[137,935,419],{"class":205},[137,937,938],{"class":139,"line":422},[137,939,425],{"class":205},[137,941,942],{"class":139,"line":428},[137,943,431],{"class":205},[137,945,946],{"class":139,"line":434},[137,947,437],{"class":205},[137,949,950],{"class":139,"line":440},[137,951,443],{"class":205},[137,953,954],{"class":139,"line":446},[137,955,449],{"class":205},[16,957,958,959,962],{},"Replace ",[85,960,961],{},"100.x.x.x"," with your home machine's Tailscale IP. 
The connection is encrypted and doesn't require opening any ports on your home network.",[11,964,966],{"id":965},"option-2-use-cheaper-cloud-models-via-openrouter","Option 2: Use Cheaper Cloud Models via OpenRouter",[16,968,969,970,974],{},"Not everyone has GPU hardware at home. ",[97,971,100],{"href":972,"rel":973},"https://openrouter.ai",[116]," aggregates dozens of AI models and lets you pay per token — often at a fraction of the cost of direct API access.",[16,976,977,978,361,981,510,984,987],{},"Models like ",[20,979,980],{},"Gemini Flash",[20,982,983],{},"Llama 3.1 70B",[20,985,986],{},"Mistral Large"," are available at significantly lower rates than GPT or Claude, and work well for routine OpenClaw tasks.",[120,989,991],{"id":990},"openrouter-config","OpenRouter Config",[128,993,995],{"className":196,"code":994,"language":198,"meta":133,"style":133},"{\n  \"models\": {\n    \"providers\": {\n      \"openrouter\": {\n        \"baseUrl\": \"https://openrouter.ai/api/v1\",\n        \"api\": \"openai-completions\",\n        \"apiKey\": \"sk-or-your-key-here\",\n        \"models\": [\n          {\n            \"id\": \"google/gemini-2.0-flash-001\",\n            \"name\": \"Gemini 2.0 Flash\",\n            \"reasoning\": false,\n            \"input\": [\"text\"],\n            \"cost\": { \"input\": 0.1, \"output\": 0.4, \"cacheRead\": 0.025, \"cacheWrite\": 0.1 },\n            \"contextWindow\": 1048576,\n            \"maxTokens\": 8192\n          },\n          {\n            \"id\": \"meta-llama/llama-3.1-70b-instruct\",\n            \"name\": \"Llama 3.1 70B\",\n            \"reasoning\": false,\n            \"input\": [\"text\"],\n            \"cost\": { \"input\": 0.39, \"output\": 0.39, \"cacheRead\": 0, \"cacheWrite\": 0 },\n            \"contextWindow\": 131072,\n            \"maxTokens\": 8192\n          }\n        ]\n      }\n    }\n  
}\n}\n",[85,996,997,1001,1007,1013,1020,1031,1041,1052,1058,1062,1073,1084,1095,1105,1146,1157,1165,1170,1174,1185,1196,1206,1216,1256,1268,1277,1282,1287,1292,1297,1302],{"__ignoreMap":133},[137,998,999],{"class":139,"line":140},[137,1000,206],{"class":205},[137,1002,1003,1005],{"class":139,"line":209},[137,1004,212],{"class":147},[137,1006,215],{"class":205},[137,1008,1009,1011],{"class":139,"line":218},[137,1010,221],{"class":147},[137,1012,215],{"class":205},[137,1014,1015,1018],{"class":139,"line":226},[137,1016,1017],{"class":147},"      \"openrouter\"",[137,1019,215],{"class":205},[137,1021,1022,1024,1026,1029],{"class":139,"line":234},[137,1023,237],{"class":147},[137,1025,240],{"class":205},[137,1027,1028],{"class":151},"\"https://openrouter.ai/api/v1\"",[137,1030,246],{"class":205},[137,1032,1033,1035,1037,1039],{"class":139,"line":249},[137,1034,252],{"class":147},[137,1036,240],{"class":205},[137,1038,257],{"class":151},[137,1040,246],{"class":205},[137,1042,1043,1045,1047,1050],{"class":139,"line":262},[137,1044,265],{"class":147},[137,1046,240],{"class":205},[137,1048,1049],{"class":151},"\"sk-or-your-key-here\"",[137,1051,246],{"class":205},[137,1053,1054,1056],{"class":139,"line":275},[137,1055,278],{"class":147},[137,1057,281],{"class":205},[137,1059,1060],{"class":139,"line":284},[137,1061,287],{"class":205},[137,1063,1064,1066,1068,1071],{"class":139,"line":290},[137,1065,293],{"class":147},[137,1067,240],{"class":205},[137,1069,1070],{"class":151},"\"google/gemini-2.0-flash-001\"",[137,1072,246],{"class":205},[137,1074,1075,1077,1079,1082],{"class":139,"line":303},[137,1076,306],{"class":147},[137,1078,240],{"class":205},[137,1080,1081],{"class":151},"\"Gemini 2.0 
Flash\"",[137,1083,246],{"class":205},[137,1085,1086,1088,1090,1093],{"class":139,"line":316},[137,1087,319],{"class":147},[137,1089,240],{"class":205},[137,1091,1092],{"class":147},"false",[137,1094,246],{"class":205},[137,1096,1097,1099,1101,1103],{"class":139,"line":329},[137,1098,332],{"class":147},[137,1100,335],{"class":205},[137,1102,338],{"class":151},[137,1104,341],{"class":205},[137,1106,1107,1109,1111,1113,1115,1118,1120,1122,1124,1127,1129,1131,1133,1136,1138,1140,1142,1144],{"class":139,"line":344},[137,1108,347],{"class":147},[137,1110,350],{"class":205},[137,1112,353],{"class":147},[137,1114,240],{"class":205},[137,1116,1117],{"class":147},"0.1",[137,1119,361],{"class":205},[137,1121,364],{"class":147},[137,1123,240],{"class":205},[137,1125,1126],{"class":147},"0.4",[137,1128,361],{"class":205},[137,1130,373],{"class":147},[137,1132,240],{"class":205},[137,1134,1135],{"class":147},"0.025",[137,1137,361],{"class":205},[137,1139,382],{"class":147},[137,1141,240],{"class":205},[137,1143,1117],{"class":147},[137,1145,389],{"class":205},[137,1147,1148,1150,1152,1155],{"class":139,"line":392},[137,1149,395],{"class":147},[137,1151,240],{"class":205},[137,1153,1154],{"class":147},"1048576",[137,1156,246],{"class":205},[137,1158,1159,1161,1163],{"class":139,"line":405},[137,1160,408],{"class":147},[137,1162,240],{"class":205},[137,1164,413],{"class":147},[137,1166,1167],{"class":139,"line":416},[137,1168,1169],{"class":205},"          },\n",[137,1171,1172],{"class":139,"line":422},[137,1173,287],{"class":205},[137,1175,1176,1178,1180,1183],{"class":139,"line":428},[137,1177,293],{"class":147},[137,1179,240],{"class":205},[137,1181,1182],{"class":151},"\"meta-llama/llama-3.1-70b-instruct\"",[137,1184,246],{"class":205},[137,1186,1187,1189,1191,1194],{"class":139,"line":434},[137,1188,306],{"class":147},[137,1190,240],{"class":205},[137,1192,1193],{"class":151},"\"Llama 3.1 
70B\"",[137,1195,246],{"class":205},[137,1197,1198,1200,1202,1204],{"class":139,"line":440},[137,1199,319],{"class":147},[137,1201,240],{"class":205},[137,1203,1092],{"class":147},[137,1205,246],{"class":205},[137,1207,1208,1210,1212,1214],{"class":139,"line":446},[137,1209,332],{"class":147},[137,1211,335],{"class":205},[137,1213,338],{"class":151},[137,1215,341],{"class":205},[137,1217,1219,1221,1223,1225,1227,1230,1232,1234,1236,1238,1240,1242,1244,1246,1248,1250,1252,1254],{"class":139,"line":1218},23,[137,1220,347],{"class":147},[137,1222,350],{"class":205},[137,1224,353],{"class":147},[137,1226,240],{"class":205},[137,1228,1229],{"class":147},"0.39",[137,1231,361],{"class":205},[137,1233,364],{"class":147},[137,1235,240],{"class":205},[137,1237,1229],{"class":147},[137,1239,361],{"class":205},[137,1241,373],{"class":147},[137,1243,240],{"class":205},[137,1245,358],{"class":147},[137,1247,361],{"class":205},[137,1249,382],{"class":147},[137,1251,240],{"class":205},[137,1253,358],{"class":147},[137,1255,389],{"class":205},[137,1257,1259,1261,1263,1266],{"class":139,"line":1258},24,[137,1260,395],{"class":147},[137,1262,240],{"class":205},[137,1264,1265],{"class":147},"131072",[137,1267,246],{"class":205},[137,1269,1271,1273,1275],{"class":139,"line":1270},25,[137,1272,408],{"class":147},[137,1274,240],{"class":205},[137,1276,413],{"class":147},[137,1278,1280],{"class":139,"line":1279},26,[137,1281,419],{"class":205},[137,1283,1285],{"class":139,"line":1284},27,[137,1286,425],{"class":205},[137,1288,1290],{"class":139,"line":1289},28,[137,1291,431],{"class":205},[137,1293,1295],{"class":139,"line":1294},29,[137,1296,437],{"class":205},[137,1298,1300],{"class":139,"line":1299},30,[137,1301,443],{"class":205},[137,1303,1305],{"class":139,"line":1304},31,[137,1306,449],{"class":205},[16,1308,1309,1310,1313],{},"OpenRouter pricing varies by model, but expect to pay ",[20,1311,1312],{},"50–90% less"," than equivalent OpenAI or Anthropic models for routine 
tasks.",[11,1315,1317],{"id":1316},"option-3-the-hybrid-approach","Option 3: The Hybrid Approach",[16,1319,1320],{},"The smartest strategy combines local and cloud models. Use cheap or free models for everyday tasks, and reserve expensive cloud models for when you actually need them.",[120,1322,1324],{"id":1323},"how-it-works","How It Works",[16,1326,1327],{},"OpenClaw lets you assign different models to different agents. The idea:",[59,1329,1330,1336],{},[62,1331,1332,1335],{},[20,1333,1334],{},"Routine tasks"," (web search, summarization, formatting) → Ollama local model or cheap OpenRouter model",[62,1337,1338,1341],{},[20,1339,1340],{},"Complex reasoning"," (code generation, multi-step analysis, creative writing) → GPT, Claude, or Gemini Pro",[16,1343,1344,1345,193],{},"Configure per-agent model selection in your ",[85,1346,188],{},[128,1348,1350],{"className":196,"code":1349,"language":198,"meta":133,"style":133},"{\n  \"agents\": {\n    \"defaults\": {\n      \"model\": \"ollama/qwen3:32b\"\n    },\n    \"overrides\": {\n      \"coder\": { \"model\": \"anthropic/claude-sonnet-4-5-20250929\" },\n      \"researcher\": { \"model\": \"openrouter/google/gemini-2.0-flash-001\" }\n    }\n  }\n}\n",[85,1351,1352,1356,1362,1368,1376,1381,1388,1405,1422,1426,1430],{"__ignoreMap":133},[137,1353,1354],{"class":139,"line":140},[137,1355,206],{"class":205},[137,1357,1358,1360],{"class":139,"line":209},[137,1359,537],{"class":147},[137,1361,215],{"class":205},[137,1363,1364,1366],{"class":139,"line":218},[137,1365,544],{"class":147},[137,1367,215],{"class":205},[137,1369,1370,1372,1374],{"class":139,"line":226},[137,1371,551],{"class":147},[137,1373,240],{"class":205},[137,1375,556],{"class":151},[137,1377,1378],{"class":139,"line":234},[137,1379,1380],{"class":205},"    },\n",[137,1382,1383,1386],{"class":139,"line":249},[137,1384,1385],{"class":147},"    
\"overrides\"",[137,1387,215],{"class":205},[137,1389,1390,1393,1395,1398,1400,1403],{"class":139,"line":262},[137,1391,1392],{"class":147},"      \"coder\"",[137,1394,350],{"class":205},[137,1396,1397],{"class":147},"\"model\"",[137,1399,240],{"class":205},[137,1401,1402],{"class":151},"\"anthropic/claude-sonnet-4-5-20250929\"",[137,1404,389],{"class":205},[137,1406,1407,1410,1412,1414,1416,1419],{"class":139,"line":275},[137,1408,1409],{"class":147},"      \"researcher\"",[137,1411,350],{"class":205},[137,1413,1397],{"class":147},[137,1415,240],{"class":205},[137,1417,1418],{"class":151},"\"openrouter/google/gemini-2.0-flash-001\"",[137,1420,1421],{"class":205}," }\n",[137,1423,1424],{"class":139,"line":284},[137,1425,437],{"class":205},[137,1427,1428],{"class":139,"line":290},[137,1429,443],{"class":205},[137,1431,1432],{"class":139,"line":303},[137,1433,449],{"class":205},[16,1435,1436],{},"This way, your default agent uses a free local model, but specialized agents like the coder can use a more capable cloud model when the task demands it.",[11,1438,1440],{"id":1439},"cost-comparison","Cost Comparison",[16,1442,1443],{},"Here's what typical monthly costs look like for moderate usage (~50 queries/day) across different setups:",[584,1445,1446,1461],{},[587,1447,1448],{},[590,1449,1450,1452,1455,1458],{},[593,1451,713],{},[593,1453,1454],{},"AI Model Cost",[593,1456,1457],{},"Infrastructure",[593,1459,1460],{},"Total Monthly",[606,1462,1463,1477,1490,1504],{},[590,1464,1465,1468,1471,1474],{},[611,1466,1467],{},"Pure cloud (GPT / Gemini Pro)",[611,1469,1470],{},"$60–120",[611,1472,1473],{},"$5–10 VPS",[611,1475,1476],{},"$65–130",[590,1478,1479,1482,1485,1487],{},[611,1480,1481],{},"OpenRouter (Gemini Flash / Llama)",[611,1483,1484],{},"$10–30",[611,1486,1473],{},[611,1488,1489],{},"$15–40",[590,1491,1492,1495,1498,1501],{},[611,1493,1494],{},"Pure local (Ollama 32B+)",[611,1496,1497],{},"$0",[611,1499,1500],{},"$5–10 VPS + 
electricity",[611,1502,1503],{},"$5–15",[590,1505,1506,1509,1512,1514],{},[611,1507,1508],{},"Hybrid (local default + cloud for complex)",[611,1510,1511],{},"$10–25",[611,1513,1473],{},[611,1515,1516],{},"$15–35",[16,1518,1519,1520,1523],{},"The difference is dramatic. A hybrid setup can cut your monthly AI spend by ",[20,1521,1522],{},"70–85%"," compared to using cloud APIs exclusively.",[120,1525,1527],{"id":1526},"the-trade-offs","The Trade-offs",[16,1529,1530],{},"Local models still have costs; you trade API spend for the following:",[59,1532,1533,1539,1545,1551,1557],{},[62,1534,1535,1538],{},[20,1536,1537],{},"Hardware requirements"," — you need 32 GB+ RAM or a decent GPU for OpenClaw-compatible models",[62,1540,1541,1544],{},[20,1542,1543],{},"Slower responses"," compared to cloud APIs, especially on CPU-only hardware",[62,1546,1547,1550],{},[20,1548,1549],{},"Lower quality"," on complex reasoning tasks compared to frontier cloud models",[62,1552,1553,1556],{},[20,1554,1555],{},"More setup work"," and occasional troubleshooting",[62,1558,1559,1562],{},[20,1560,1561],{},"Power consumption"," if running a GPU 24/7",[16,1564,1565],{},"For many users, the hybrid approach hits the sweet spot: cheap for routine work, high quality when it matters.",[11,1567,1569],{"id":1568},"getting-started","Getting Started",[715,1571,1572,1578,1589,1601,1607],{},[62,1573,1574,1577],{},[20,1575,1576],{},"Check your hardware"," — you need 32 GB+ RAM or a GPU with 24 GB+ VRAM for reliable results",[62,1579,1580,1582,1583,88,1586],{},[20,1581,727],{}," and pull a 32B+ model: ",[85,1584,1585],{},"ollama pull qwen3:32b",[85,1587,1588],{},"ollama pull deepseek-r1:32b",[62,1590,1591,1594,1595,1598,1599],{},[20,1592,1593],{},"Update your config"," — add the provider with ",[85,1596,1597],{},"\"api\": \"openai-completions\""," and a placeholder ",[85,1600,478],{},[62,1602,1603,1606],{},[20,1604,1605],{},"Evaluate quality"," — run your typical tasks and see if the output is good 
enough",[62,1608,1609,1612],{},[20,1610,1611],{},"Go hybrid"," — configure per-agent models once you know which tasks need cloud quality",[16,1614,1615,1616,1621],{},"The OpenClaw community on ",[97,1617,1620],{"href":1618,"rel":1619},"https://discord.gg/wTBAcqC38R",[116],"Discord"," is a great place to share configs and get recommendations for which models work best for specific tasks.",[1623,1624],"hr",{},[16,1626,1627,1628,1633,1634,1638],{},"Want the convenience of managed hosting without the server management? ",[97,1629,1632],{"href":1630,"rel":1631},"https://dash.clawnest.ai/register",[116],"ClawNest"," handles infrastructure, updates, and backups so you can focus on configuring your AI assistants. Start with a ",[97,1635,1637],{"href":1636},"/pricing","free 3-day trial"," — no credit card required.",[1640,1641,1642],"style",{},"html pre.shiki code .sScJk, html code.shiki .sScJk{--shiki-default:#6F42C1;--shiki-dark:#B392F0}html pre.shiki code .sj4cs, html code.shiki .sj4cs{--shiki-default:#005CC5;--shiki-dark:#79B8FF}html pre.shiki code .sZZnC, html code.shiki .sZZnC{--shiki-default:#032F62;--shiki-dark:#9ECBFF}html pre.shiki code .szBVR, html code.shiki .szBVR{--shiki-default:#D73A49;--shiki-dark:#F97583}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: 
var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .sVt8B, html code.shiki .sVt8B{--shiki-default:#24292E;--shiki-dark:#E1E4E8}",{"title":133,"searchDepth":209,"depth":209,"links":1644},[1645,1646,1647,1652,1655,1658,1661,1664,1667],{"id":13,"depth":209,"text":14},{"id":44,"depth":209,"text":45},{"id":108,"depth":209,"text":109,"children":1648},[1649,1650,1651],{"id":122,"depth":218,"text":123},{"id":181,"depth":218,"text":182},{"id":516,"depth":218,"text":517},{"id":574,"depth":209,"text":575,"children":1653},[1654],{"id":665,"depth":218,"text":666},{"id":699,"depth":209,"text":700,"children":1656},[1657],{"id":712,"depth":218,"text":713},{"id":965,"depth":209,"text":966,"children":1659},[1660],{"id":990,"depth":218,"text":991},{"id":1316,"depth":209,"text":1317,"children":1662},[1663],{"id":1323,"depth":218,"text":1324},{"id":1439,"depth":209,"text":1440,"children":1665},[1666],{"id":1526,"depth":218,"text":1527},{"id":1568,"depth":209,"text":1569},"2026-02-15","A practical guide to running local LLMs with OpenClaw using Ollama. Covers setup, config, hardware requirements, OpenRouter, and hybrid strategies to reduce your monthly AI spend.",false,"md","/blog/ollama-clawnest-openclaw.png",{},true,"/blog/openclaw-ollama-local-models-guide",{"title":5,"description":1669},"blog/openclaw-ollama-local-models-guide",[117,1679,1680,1681],"Cost Optimization","Self-Hosting","Guide","dLsc1xXYd6PA5ng5P-sH-yUtLyVpDeKVKrcoD-4J_dU",1774598013034]