Open-source language models have transformed coding and content creation. Running these **open source LLMs** locally offers privacy, control, and freedom from API fees. Whether you need a **code LLM** for programming or a **writing model** for creative work, this guide covers today's top **open source language models**.

This list highlights popular **open-source LLMs for coding and writing**, organized by recency and popularity. Each model entry shows key attributes, use cases, tags, and pros and cons. These models work with tools like **Ollama**, **LM Studio**, **Jan**, and other local AI platforms. Review the [How to Choose the Right Model](#how-to-choose-the-right-open-source-llm) section for selection guidance.

_⚠️ Hardware requirements vary by model size. Smaller models (1B-7B) can run on CPU-only systems with modest RAM, while larger models require significant GPU or CPU resources._

{{< llm-cards data="data/open-source-llms.jsonc" minwidth="300px" />}}

## How to Choose the Right Open Source LLM

Selecting an **open source LLM** depends on hardware capabilities, use case, and performance requirements.
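A quick way to gauge hardware fit is the rule of thumb: memory ≈ parameters × bytes per weight × overhead. The snippet below is a rough sketch, not a precise formula; the ~1.2 overhead factor (covering KV cache and runtime buffers) is an assumption.

```sh
# Rough memory estimate for loading a model:
#   params (billions) x bytes per weight x ~1.2 runtime overhead
# A 7B model quantized to Q4 stores roughly 0.5 bytes per weight:
awk 'BEGIN { printf "%.1f GB\n", 7 * 0.5 * 1.2 }'
```

That comes to about 4.2 GB, so a 7B Q4 model fits comfortably on an 8 GB machine, while the same model at fp16 (2 bytes per weight) would need roughly 17 GB.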
**Hardware Constraints:**

- For limited resources: Consider smaller models like `stable-code` (3B), `codegemma` (2B/7B), `qwen2.5-coder` (0.5B-7B), `phi3` (3.8B), or `llama3.2` (1B/3B)
- For high-end hardware: Larger models like `qwen3-coder` (480B), `deepseek-coder-v2` (236B), `llama3.1` (405B), or `codellama` (70B)

**Use Case - Coding:**

- **General coding**: `qwen2.5-coder`, `codellama`, or `deepseek-coder`
- **SQL-specific**: `sqlcoder`
- **Long context/agentic**: `qwen3-coder`
- **Code completion**: `stable-code`, `codegeex4`
- **Multi-language**: `starcoder` or `starcoder2`
- **Versatile (coding + writing)**: `qwen2.5`, `llama3.1`, `mistral`, `mixtral`

**Use Case - Writing:**

- **Creative writing**: `llama3`, `llama3.1`, `mistral`, `gemma2`
- **Long-form content**: `deepseek-r1`
- **Fiction/roleplay**: `dolphin-llama3`, `dolphin-mistral`, `dolphin3`
- **Conversational**: `vicuna`
- **Lightweight writing**: `phi3`, `gemma3`, `llama3.2`
- **Versatile (writing + coding)**: `qwen2.5`, `llama3.1`, `mistral`, `mixtral`

**Popularity and Reliability:**

- Most tested: `qwen2.5` (12.3M pulls), `qwen2.5-coder` (10.1M pulls), `llama3.1` (8.5M pulls), `llama3` (6.2M pulls), `mistral` (5.8M pulls)
- Newest features: `qwen3-coder` (3 months), `llama3.2` (recent), `gemma3` (recent)

## Benefits of Running Open Source LLMs Locally

Running **open source language models** locally offers several advantages over cloud-based APIs:

* **Privacy**: Your code and conversations never leave your machine
* **Cost**: No per-token API fees or subscription costs
* **Control**: Full control over model versions, parameters, and data
* **Offline Access**: Work without internet connectivity
* **Customization**: Fine-tune models for your specific needs
* **No Rate Limits**: Generate as much content as your hardware allows

## Getting Started with Local LLMs

You can run **open-source LLMs locally** using several tools and platforms:

**GUI-based tools:**

- **LM Studio** - Interface for downloading and chatting with models
- **Jan** - Open-source ChatGPT alternative
- **GPT4All** - General-purpose application with document chat capabilities

**Command-line tools:**

- **Ollama** - Simple command-line tool for running models locally
- **llama.cpp** - Lightweight C++ implementation that runs models efficiently on CPUs
- Direct model loading via Python frameworks like Hugging Face Transformers, PyTorch, or TensorFlow

**Web interfaces:**

If you want a ChatGPT-like experience, you can pair these backends with interfaces like **LobeChat**, **Open WebUI**, or **LibreChat**. **LM Studio** and **Jan** provide model downloads and chat interfaces without command-line work, and they support the same model format (GGUF) that Ollama uses.

## Code LLMs vs Writing Models: What's the Difference?

The differences between **code LLMs** and **writing models** come down to training data and intended tasks.

**Code LLMs** (like `qwen2.5-coder`, `codellama`, `deepseek-coder`) are trained on code repositories and handle:

* Code generation and completion
* Debugging and error fixing
* Code explanation and documentation
* Multi-language programming support
* Understanding code context and syntax

**Writing Models** (like `llama3.1`, `mistral`, `gemma2`) are designed for natural language tasks:

* Creative writing and storytelling
* Content generation and editing
* Conversational AI and chat
* Long-form content creation
* General language understanding

**Versatile Models** (like `qwen2.5`, `llama3.1`, `mistral`) handle both coding and writing tasks.

### Using Ollama for Local LLM Deployment

**Ollama** provides a command-line interface and API for running **open source LLMs** locally.
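Before pulling anything, you can check that the Ollama server is up. The snippet below probes the default port (11434) via the `/api/tags` endpoint, which lists locally installed models:

```sh
# Probe the local Ollama server; /api/tags lists installed models.
# If the check fails, start the server with `ollama serve`.
if curl -sf http://localhost:11434/api/tags > /dev/null; then
  echo "ollama is up"
else
  echo "ollama is not running"
fi
```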
Example usage:

**Pull a model (coding example):**

```sh
ollama pull qwen2.5-coder:7b
```

**Pull a model (writing example):**

```sh
ollama pull llama3.1:8b
```

**Run a model:**

```sh
ollama run qwen2.5-coder:7b
```

```sh
ollama run llama3.1:8b
```

**Or use it in your application (coding):**

```sh
curl -s http://localhost:11434/api/generate -d '{
  "model": "qwen2.5-coder:7b",
  "prompt": "Write a Python function to calculate Fibonacci numbers.",
  "stream": false
}' | jq -r '.response'
```

**Or use it in your application (writing):**

```sh
curl -s http://localhost:11434/api/generate -d '{
  "model": "llama3.1:8b",
  "prompt": "Write a short story about a robot learning to paint.",
  "stream": false
}' | jq -r '.response'
```

## Popular Use Cases for Open Source LLMs

**Open source language models** are being used across various domains:

* **Code Generation**: Automate boilerplate code, generate functions, and complete code snippets
* **Code Review**: Analyze code for bugs, security issues, and best practices
* **Documentation**: Generate API docs, README files, and technical documentation
* **Creative Writing**: Draft stories, articles, and creative content
* **Content Editing**: Improve grammar, style, and clarity of written content
* **Conversational AI**: Build chatbots and virtual assistants
* **Data Analysis**: Generate SQL queries and analyze datasets
* **Learning**: Understand programming concepts and get coding help

## Running Local LLMs

Keep these considerations in mind when running **open source LLMs**:

* **Start Small**: Begin with smaller models (3B-7B parameters) to test your hardware
* **Monitor Resources**: Use system monitoring tools to track GPU/CPU and memory usage
* **Experiment with Quantization**: Use quantized models (Q4, Q5, Q8) to reduce memory requirements
* **Try Multiple Models**: Different **code LLMs** and **writing models** perform differently on various tasks
* **Use Appropriate Context Windows**: Match model context length to your use case
* **Keep Models Updated**: Regularly pull updated versions for bug fixes and improvements

## References and Resources

- [Ollama Library](https://ollama.com/library) - Available **open source LLMs**
- [Ollama Library - Code Models](https://ollama.com/library?sort=newest&q=code) - **Code LLMs**
- [Ollama Documentation](https://ollama.com/docs) - Official setup and API docs
- [LM Studio](https://lmstudio.ai/) - GUI for **local LLM** management
- [Jan](https://jan.ai/) - Open-source ChatGPT alternative
- [GPT4All](https://gpt4all.io/) - **Local AI** application
- [llama.cpp](https://github.com/ggerganov/llama.cpp) - CPU-based **LLM** inference