Open-source language models have transformed coding and content creation. Running these **open source LLMs** locally offers privacy, control, and freedom from API fees. Whether you need a **code LLM** for programming or a **writing model** for creative work, this guide covers today's top **open source language models**.

This list highlights popular **open-source LLMs for coding and writing**, organized by recency and popularity. Each model entry shows key attributes, use cases, tags, and pros and cons. These models work with tools like **Ollama**, **LM Studio**, **Jan**, and other local AI platforms. Review the [How to Choose the Right Model](#how-to-choose-the-right-open-source-llm) section for selection guidance.

_⚠️ Hardware requirements vary by model size. Smaller models (1B-7B) can run on CPU-only systems with modest RAM, while larger models require significant GPU or CPU resources._

{{< llm-cards data="data/open-source-llms.jsonc" minwidth="300px" />}}

## How to Choose the Right Open Source LLM

Selecting an **open source LLM** depends on hardware capabilities, use case, and performance requirements.
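A quick way to gauge hardware fit is the rule of thumb: memory ≈ parameters × bytes per weight × overhead. The snippet below is a rough sketch, not a precise formula; the ~1.2 overhead factor (covering KV cache and runtime buffers) is an assumption.

```sh
# Rough memory estimate for loading a model:
#   params (billions) x bytes per weight x ~1.2 runtime overhead
# A 7B model quantized to Q4 stores roughly 0.5 bytes per weight:
awk 'BEGIN { printf "%.1f GB\n", 7 * 0.5 * 1.2 }'
```

That comes to about 4.2 GB, so a 7B Q4 model fits comfortably on an 8 GB machine, while the same model at fp16 (2 bytes per weight) would need roughly 17 GB.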
**Hardware Constraints:**

- For limited resources: Consider smaller models like `stable-code` (3B), `codegemma` (2B/7B), `qwen2.5-coder` (0.5B-7B), `phi3` (3.8B), or `llama3.2` (1B/3B)
- For high-end hardware: Larger models like `qwen3-coder` (480B), `deepseek-coder-v2` (236B), `llama3.1` (405B), or `codellama` (70B)

**Use Case - Coding:**

- **General coding**: `qwen2.5-coder`, `codellama`, or `deepseek-coder`
- **SQL-specific**: `sqlcoder`
- **Long context/agentic**: `qwen3-coder`
- **Code completion**: `stable-code`, `codegeex4`
- **Multi-language**: `starcoder` or `starcoder2`
- **Versatile (coding + writing)**: `qwen2.5`, `llama3.1`, `mistral`, `mixtral`

**Use Case - Writing:**

- **Creative writing**: `llama3`, `llama3.1`, `mistral`, `gemma2`
- **Long-form content**: `deepseek-r1`
- **Fiction/roleplay**: `dolphin-llama3`, `dolphin-mistral`, `dolphin3`
- **Conversational**: `vicuna`
- **Lightweight writing**: `phi3`, `gemma3`, `llama3.2`
- **Versatile (writing + coding)**: `qwen2.5`, `llama3.1`, `mistral`, `mixtral`

**Popularity and Reliability:**

- Most tested: `qwen2.5` (12.3M pulls), `qwen2.5-coder` (10.1M pulls), `llama3.1` (8.5M pulls), `llama3` (6.2M pulls), `mistral` (5.8M pulls)
- Newest features: `qwen3-coder` (3 months), `llama3.2` (recent), `gemma3` (recent)

## Benefits of Running Open Source LLMs Locally

Running **open source language models** locally offers several advantages over cloud-based APIs:

* **Privacy**: Your code and conversations never leave your machine
* **Cost**: No per-token API fees or subscription costs
* **Control**: Full control over model versions, parameters, and data
* **Offline Access**: Work without internet connectivity
* **Customization**: Fine-tune models for your specific needs
* **No Rate Limits**: Generate as much content as your hardware allows

## Getting Started with Local LLMs

You can run **open-source LLMs locally** using several tools and platforms:

**GUI-based tools:**

- **LM Studio** - Interface for downloading and chatting with models
- **Jan** - Open-source ChatGPT alternative
- **GPT4All** - General-purpose application with document chat capabilities

**Command-line tools:**

- **Ollama** - Simple command-line tool for running models locally
- **llama.cpp** - Lightweight C++ implementation that runs models efficiently on CPUs
- Direct model loading via Python frameworks like Hugging Face Transformers, PyTorch, or TensorFlow

**Web interfaces:**

If you want a ChatGPT-like experience, you can pair these backends with interfaces like **LobeChat**, **Open WebUI**, or **LibreChat**. **LM Studio** and **Jan** provide model downloads and chat interfaces without command-line work, and they support the same model format (GGUF) that Ollama uses.

## Code LLMs vs Writing Models: What's the Difference?

The differences between **code LLMs** and **writing models** come down to training data and intended tasks.

**Code LLMs** (like `qwen2.5-coder`, `codellama`, `deepseek-coder`) are trained on code repositories and handle:

* Code generation and completion
* Debugging and error fixing
* Code explanation and documentation
* Multi-language programming support
* Understanding code context and syntax

**Writing Models** (like `llama3.1`, `mistral`, `gemma2`) are designed for natural language tasks:

* Creative writing and storytelling
* Content generation and editing
* Conversational AI and chat
* Long-form content creation
* General language understanding

**Versatile Models** (like `qwen2.5`, `llama3.1`, `mistral`) handle both coding and writing tasks.

### Using Ollama for Local LLM Deployment

**Ollama** provides a command-line interface and API for running **open source LLMs** locally.
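Before pulling anything, you can check that the Ollama server is up. The snippet below probes the default port (11434) via the `/api/tags` endpoint, which lists locally installed models:

```sh
# Probe the local Ollama server; /api/tags lists installed models.
# If the check fails, start the server with `ollama serve`.
if curl -sf http://localhost:11434/api/tags > /dev/null; then
  echo "ollama is up"
else
  echo "ollama is not running"
fi
```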
Example usage:

**Pull a model (coding example):**

```sh
ollama pull qwen2.5-coder:7b
```

**Pull a model (writing example):**

```sh
ollama pull llama3.1:8b
```

**Run a model:**

```sh
ollama run qwen2.5-coder:7b
```

```sh
ollama run llama3.1:8b
```

**Or use it in your application (coding):**

```sh
curl -s http://localhost:11434/api/generate -d '{
  "model": "qwen2.5-coder:7b",
  "prompt": "Write a Python function to calculate Fibonacci numbers.",
  "stream": false
}' | jq -r '.response'
```

**Or use it in your application (writing):**

```sh
curl -s http://localhost:11434/api/generate -d '{
  "model": "llama3.1:8b",
  "prompt": "Write a short story about a robot learning to paint.",
  "stream": false
}' | jq -r '.response'
```

## Popular Use Cases for Open Source LLMs

**Open source language models** are being used across various domains:

* **Code Generation**: Automate boilerplate code, generate functions, and complete code snippets
* **Code Review**: Analyze code for bugs, security issues, and best practices
* **Documentation**: Generate API docs, README files, and technical documentation
* **Creative Writing**: Draft stories, articles, and creative content
* **Content Editing**: Improve grammar, style, and clarity of written content
* **Conversational AI**: Build chatbots and virtual assistants
* **Data Analysis**: Generate SQL queries and analyze datasets
* **Learning**: Understand programming concepts and get coding help

## Running Local LLMs

Keep these considerations in mind when running **open source LLMs**:

* **Start Small**: Begin with smaller models (3B-7B parameters) to test your hardware
* **Monitor Resources**: Use system monitoring tools to track GPU/CPU and memory usage
* **Experiment with Quantization**: Use quantized models (Q4, Q5, Q8) to reduce memory requirements
* **Try Multiple Models**: Different **code LLMs** and **writing models** perform differently on various tasks
* **Use Appropriate Context Windows**: Match model context length to your use case
* **Keep Models Updated**: Regularly pull updated versions for bug fixes and improvements

## References and Resources

- [Ollama Library](https://ollama.com/library) - Available **open source LLMs**
- [Ollama Library - Code Models](https://ollama.com/library?sort=newest&q=code) - **Code LLMs**
- [Ollama Documentation](https://ollama.com/docs) - Official setup and API docs
- [LM Studio](https://lmstudio.ai/) - GUI for **local LLM** management
- [Jan](https://jan.ai/) - Open-source ChatGPT alternative
- [GPT4All](https://gpt4all.io/) - **Local AI** application
- [llama.cpp](https://github.com/ggerganov/llama.cpp) - CPU-based **LLM** inference