FREE: Set Up Claude Code Locally in 3 Simple Steps
Claude Code + Local LLMs: The Zero-Cost Secure Dev Setup
Hello Everyone
Welcome to your AKVAverse. I’m Abhishek Veeramalla, aka the AKVAman, your guide for Cloud, DevOps, and AI.
Today, we will be diving deep into a setup that is going to change the way you code forever. I’m going to show you how to run Claude Code with local models for absolutely zero cost. This isn't just about saving money; it is about security and efficiency.
Why Run Claude Locally?
Before we jump into the commands, you might ask why this is so important. There are two major reasons.
Absolute Security: Because these models run entirely on your own machine, your code never leaves your local environment. If you are working on sensitive office projects or building secure applications, this is the way to go.
Cost-Effectiveness: You don’t need Anthropic API keys. By serving models locally, you bypass API usage fees entirely, which is a massive win for your wallet.
Step 1: Installing Ollama
The foundation of this setup is Ollama, which serves as the engine for running these local models.
curl -fsSL https://ollama.com/install.sh | sh

This command uses curl to fetch the installation script from Ollama's official site and executes it with the shell (sh). The one-liner is the standard path on Linux (and works inside WSL); on macOS and Windows, or if you prefer a manual approach, simply visit Ollama.com and download the installer directly.
Verification:
Once installed, you can verify it’s running by simply typing ollama in your terminal to see the help menu.
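For a quick sanity check beyond the help menu, these standard Ollama commands confirm the install (the exact version output will differ on your machine):

ollama --version   # prints the installed Ollama version
ollama list        # lists the models you have pulled so far (empty on a fresh install)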
Step 2: Pulling a Powerful Model
Next, you need to choose a local Large Language Model (LLM) that is actually capable of complex coding tasks.
ollama pull gpt-oss

The pull command tells Ollama to download the model files to your machine. I highly recommend GPT-OSS 20B (20 billion parameters), as it is incredibly powerful for coding.
Other Recommended Models: You could also try Qwen3 Coder for high performance or GLM 4.7 Flash if you need something a bit more lightweight.
Pro-Tip on Size: Be patient! The GPT OSS model is roughly 13GB, so depending on your internet speed, this can take 20 to 30 minutes.
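Once the pull finishes, a couple of quick checks help confirm everything is in place. The alternative pull below is an assumption on my part; verify the exact model tag on ollama.com/library before running it:

ollama list            # gpt-oss should now show up, roughly 13 GB in size
ollama run gpt-oss     # opens an interactive chat with the model; type /bye to exit

# Possible alternative coding model (check the exact tag in the Ollama library first):
ollama pull qwen3-coder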
Critical Configuration: This is the part most people miss! You must set the Context Length in your Ollama settings widget.
Why it matters: Context length determines how much information (like your entire GitHub repository) the model can remember and process at once.
Recommendation: Set it to at least 16k or 32k. If you have a beast of a machine with multiple GPUs, you can even push it to 128k.
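If you are running Ollama headless or want to script this instead of using the settings UI, recent Ollama versions support a context-length environment variable and the num_ctx model parameter; the snippet below is a sketch based on that assumption, so double-check it against the version you installed:

# Linux/macOS: set a server-wide default context length, then restart the server
export OLLAMA_CONTEXT_LENGTH=32768
ollama serve

# Windows PowerShell equivalent for the variable:
#   $env:OLLAMA_CONTEXT_LENGTH = "32768"

# Per-model alternative: bake num_ctx into a Modelfile
#   FROM gpt-oss
#   PARAMETER num_ctx 32768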
Step 3: Installing and Launching Claude Code
Finally, we need to install the Claude Code interface and link it to our local model.
Installation Command: You will use another simple curl command (provided in the setup gist) to install Claude Code, which should take about 2 to 3 minutes.
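The exact command is in the setup gist, but for reference, Anthropic's quickstart (code.claude.com/docs/en/quickstart) documents installs roughly along these lines; treat them as pointers rather than gospel and confirm against the docs, since install commands change:

# Native installer (macOS / Linux / WSL)
curl -fsSL https://claude.ai/install.sh | bash

# Or via npm, if you already have Node.js on the machine
npm install -g @anthropic-ai/claude-code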
ollama launch claude --model gpt-oss

What this does: This is the most important command. It invokes the Claude Code interface but specifically instructs it to use your local GPT-OSS model instead of reaching out to the cloud.
Putting It to the Test
Once you are in your project folder, you can start asking it to build things. In my test, I asked it to create a simple to-do Golang project, and it cooked the entire application in just 1 minute and 26 seconds. On a more powerful machine, this can happen in under 30 seconds.
You can also use it to explain existing projects. I pointed it at a Python-based application with a database and UI, and it analyzed the README and dependencies to explain the whole project structure in less than 30 seconds.
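To give you a feel for the workflow, a session looks roughly like this (the folder name and prompts are illustrative, not the exact ones from my test):

cd my-todo-app                         # your project folder
ollama launch claude --model gpt-oss

# Then, inside the Claude Code prompt, for example:
#   > Build a simple to-do application in Go with add, list, and complete commands.
#   > Read the README and the dependencies and explain the overall structure of this project.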
A quick heads-up: While this setup is amazing for coding, you might find slightly less accuracy when it comes to specific DevOps or cloud-related tasks compared to pure coding tasks.
Thought to Leave You With
Local LLMs are the future for anyone who cares about privacy and budget. Make sure you try this out, and if you have any questions at all, let me know in the comments section of the newsletter.
Until next time, keep building, keep experimenting, and keep exploring your AKVAverse. 💙
Abhishek Veeramalla, aka the AKVAman

Sample run from my Windows machine (PowerShell):

PS C:\Windows\System32> ollama pull gpt-oss
pulling manifest
pulling e7b273f96360: 100% ▕██████████████████████████████████████████████████████████▏ 13 GB
pulling fa6710a93d78: 100% ▕██████████████████████████████████████████████████████████▏ 7.2 KB
pulling f60356777647: 100% ▕██████████████████████████████████████████████████████████▏ 11 KB
pulling d8ba2f9a17b3: 100% ▕██████████████████████████████████████████████████████████▏ 18 B
pulling 776beb3adb23: 100% ▕██████████████████████████████████████████████████████████▏ 489 B
verifying sha256 digest
writing manifest
success
PS C:\Windows\System32> ollama launch claude --model gpt-oss
Launching Claude Code with gpt-oss...
Error: claude is not installed, install from https://code.claude.com/docs/en/quickstart