Open-source language models have transformed coding and content creation, and running them locally offers privacy, control, and freedom from API fees. Whether you need a coding model for programming or a writing model for creative work, this guide covers today's top open-source language models.
This list highlights popular open-source LLMs for coding and writing, organized by recency and popularity. Each entry lists key attributes, use cases, tags, and pros and cons. All of these models work with tools like Ollama, LM Studio, Jan, and other local AI platforms.
Review the How to Choose the Right Model section for selection guidance.
⚠️ Hardware requirements vary by model size. Smaller models (1B-7B) can run on CPU-only systems with modest RAM, while larger models require significant GPU or CPU resources.
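For readers new to local deployment, here is a minimal sketch of querying one of these models through Ollama's REST API (served at `http://localhost:11434` by default). The model tag `llama3.2` is just an example; it must be pulled first with `ollama pull llama3.2`:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(model: str, prompt: str) -> dict:
    """Build a non-streaming request for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """Send the prompt to a locally running Ollama server and return the reply text."""
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(OLLAMA_URL, data=data,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    print(build_payload("llama3.2", "Explain list comprehensions in one sentence."))
    # With the Ollama server running and the model pulled:
    # print(generate("llama3.2", "Explain list comprehensions in one sentence."))
```

The same request shape works for every model below once it has been pulled; only the model tag changes.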
gpt-oss
Tags: writing, coding
Owner/Author: OpenAI
Parameters: 20B (3.6B active), 120B (5.1B active)
Tool Support: Yes
Resource Demand: High (20B), Very-High (120B)
Primary Use Cases: Reasoning, agentic tasks, function calling, structured outputs, tool use, general-purpose tasks
Pros:
- OpenAI's open-weight models with state-of-the-art reasoning
- Mixture-of-Experts (MoE) architecture for efficiency
- 128K context window for long-context tasks
- Native support for function calling and structured outputs
- Adjustable reasoning effort (low, medium, high)
- Full chain-of-thought access for debugging
- Apache 2.0 license
- OpenAI-compatible API
Cons:
- 20B model requires ~16GB VRAM (14GB download)
- 120B model requires ~60GB VRAM (65GB download)
- Very new models with less community testing
- Large download sizes
- May be overkill for simple tasks
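Since gpt-oss exposes an OpenAI-compatible API (for example via Ollama's `/v1/chat/completions` route), a request can be sketched as below. Note the assumption here: steering reasoning effort through a "Reasoning: low|medium|high" system message is the convention used in this sketch, and may differ from your runtime's configuration:

```python
import json
import urllib.request

CHAT_URL = "http://localhost:11434/v1/chat/completions"  # Ollama's OpenAI-compatible route

def build_chat_payload(model: str, user_msg: str, effort: str = "medium") -> dict:
    """OpenAI-style chat request. Requesting reasoning effort through a
    'Reasoning: low|medium|high' system message is an assumed convention here."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": f"Reasoning: {effort}"},
            {"role": "user", "content": user_msg},
        ],
    }

def chat(model: str, user_msg: str, effort: str = "medium") -> str:
    """POST the request to a locally running server and return the reply text."""
    data = json.dumps(build_chat_payload(model, user_msg, effort)).encode("utf-8")
    req = urllib.request.Request(CHAT_URL, data=data,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(build_chat_payload("gpt-oss:20b", "Summarize MoE routing in two sentences.", "high"))
    # With the model pulled (`ollama pull gpt-oss:20b`) and the server running:
    # print(chat("gpt-oss:20b", "Summarize MoE routing in two sentences.", "high"))
```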
qwen3-coder
Tags: coding
Owner/Author: Alibaba Cloud (Qwen)
Parameters: 30B, 480B
Tool Support: Yes
Resource Demand: High (30B), Very-High (480B)
Primary Use Cases: Agentic coding tasks, long-context code generation, complex coding scenarios
Pros:
- Excellent performance on long-context coding tasks
- Strong support for agentic workflows
- Cloud deployment options available
- Tools integration support
Cons:
- Very large model sizes (especially 480B) require significant resources
- Relatively new (3 months old) with less community testing
- May be overkill for simple coding tasks
qwen2.5-coder
Tags: coding
Owner/Author: Alibaba Cloud (Qwen)
Parameters: 0.5B, 1.5B, 3B, 7B, 14B, 32B
Hugging Face: ♥ 664 · ↓ 1,680,689
Tool Support: Yes
Resource Demand: Low (0.5B-7B), Medium (14B), High (32B)
Primary Use Cases: Code generation, code reasoning, code fixing, general programming assistance
Pros:
- Wide range of model sizes for different hardware constraints
- Significant improvements in code generation and reasoning
- Most popular code model (10.1M pulls)
- Tools integration support
- Excellent code fixing capabilities
Cons:
- Larger models (32B) still require substantial resources
- Smaller models (0.5B, 1.5B) may lack depth for complex tasks
- Newer model with less long-term reliability data
deepseek-coder-v2
Tags: coding
Owner/Author: DeepSeek
Parameters: 16B, 236B
Hugging Face: ♥ 555 · ↓ 203,627
Tool Support: No
Resource Demand: Medium (16B), Very-High (236B)
Primary Use Cases: Code-specific tasks (performance comparable to GPT-4 Turbo)
Pros:
- Mixture-of-Experts architecture for efficiency
- Performance comparable to GPT-4 Turbo on code tasks
- Strong code generation quality
- Well-optimized for code-specific scenarios
Cons:
- 236B model requires extremely high-end hardware
- Smaller 16B model may not match larger variants
- Less general-purpose than some alternatives
deepseek-coder
Tags: coding
Owner/Author: DeepSeek
Parameters: 1.3B, 6.7B, 33B
Hugging Face: ♥ 480 · ↓ 133,657
Tool Support: No
Resource Demand: Low (1.3B-6.7B), High (33B)
Primary Use Cases: General code generation (trained on 2 trillion tokens)
Pros:
- Extensive training on 2 trillion code and natural language tokens
- Multiple size options for different use cases
- Very popular (2.7M pulls)
- Strong general-purpose coding capabilities
- Well-tested and reliable
Cons:
- Older model (2 years) may lack latest improvements
- 33B model requires significant resources
- Smaller models may lack depth for complex reasoning
codellama
Tags: coding
Owner/Author: Meta
Parameters: 7B, 13B, 34B, 70B
Hugging Face: ♥ 255 · ↓ 93,876
Tool Support: No
Resource Demand: Low (7B), Medium (13B), High (34B-70B)
Primary Use Cases: Text-to-code generation, code discussion, general programming
Pros:
- Extremely popular (4M pulls)
- Multiple size options including very large 70B model
- Strong general-purpose capabilities
- Can discuss and explain code, not just generate
- Well-established and reliable
Cons:
- 70B model requires very high-end hardware
- Older architecture compared to newer models
- May not specialize as well as code-specific models
starcoder2
Tags: coding
Owner/Author: BigCode (Hugging Face)
Parameters: 3B, 7B, 15B
Tool Support: No
Resource Demand: Low (3B-7B), Medium (15B)
Primary Use Cases: General code generation (transparently trained open code LLMs)
Pros:
- Transparent training process (open and reproducible)
- Good range of sizes for different hardware
- Strong code generation capabilities
- Well-documented and community-supported
Cons:
- May not match performance of newer specialized models
- Limited to three size options
- Less specialized than code-specific variants
codegemma
Tags: coding
Owner/Author: Google
Parameters: 2B, 7B
Hugging Face: ♥ 211 · ↓ 13,472
Tool Support: No
Resource Demand: Low (2B-7B)
Primary Use Cases: Fill-in-the-middle completion, code generation, natural language understanding, mathematical reasoning
Pros:
- Lightweight models suitable for resource-constrained environments
- Versatile capabilities beyond just code generation
- Strong fill-in-the-middle completion
- Good for mathematical reasoning tasks
Cons:
- Smaller models may lack depth for complex tasks
- Limited size options
- May not match larger models on complex code generation
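Fill-in-the-middle completion works by marking the code before and after the gap with sentinel tokens and letting the model emit the missing middle. A small helper using the sentinel tokens documented for CodeGemma (exact tokens may vary with other FIM-capable models):

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a fill-in-the-middle prompt using CodeGemma's documented
    sentinel tokens; the model is expected to generate the missing middle."""
    return f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

if __name__ == "__main__":
    # Ask the model to fill in the expression between "return " and " / len(xs)"
    print(build_fim_prompt("def mean(xs):\n    return ", " / len(xs)\n"))
```

This is what powers IDE-style completions: the editor sends the text around the cursor as prefix and suffix rather than a natural-language instruction.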
granite-code
Tags: coding
Owner/Author: IBM
Parameters: 3B, 8B, 20B, 34B
Tool Support: Yes
Resource Demand: Low (3B-8B), Medium (20B), High (34B)
Primary Use Cases: Code intelligence tasks (IBM's open foundation models)
Pros:
- Good range of sizes from small to large
- IBM-backed with enterprise support
- Focused on code intelligence tasks
- Well-maintained foundation models
Cons:
- Less popular than some alternatives
- May have IBM-specific optimizations
- Less community testing compared to more popular models
deepcoder
Tags: coding
Owner/Author: Agentica
Parameters: 1.5B, 14B
Tool Support: No
Resource Demand: Low (1.5B), Medium (14B)
Primary Use Cases: Code generation with O3-mini-level performance
Pros:
- Fully open-source with transparent development
- Performance comparable to O3-mini level
- Good balance with 14B model
- Lightweight 1.5B option available
Cons:
- Limited to two size options
- May not match latest model capabilities
- Less popular than mainstream alternatives
opencoder
Tags: coding
Owner/Author: OpenCoder Team
Parameters: 1.5B, 8B
Tool Support: No
Resource Demand: Low (1.5B-8B)
Primary Use Cases: Code generation in English and Chinese (open and reproducible code LLM)
Pros:
- Open and reproducible training process
- Bilingual support (English and Chinese)
- Good for international development teams
- Lightweight options available
Cons:
- Limited size options
- Less popular than alternatives
- May not match performance of larger models
yi-coder
Tags: coding
Owner/Author: 01-ai (Yi)
Parameters: 1.5B, 9B
Tool Support: No
Resource Demand: Low (1.5B-9B)
Primary Use Cases: State-of-the-art coding performance with fewer parameters
Pros:
- Efficient performance with fewer parameters
- Good coding performance relative to size
- Lightweight options for resource-constrained environments
- Optimized for coding tasks
Cons:
- Limited size options
- May not match larger models on complex tasks
- Less popular than mainstream alternatives
codegeex4
Tags: coding
Owner/Author: Zhipu AI (CodeGeeX)
Parameters: 9B
Tool Support: No
Resource Demand: Medium (9B)
Primary Use Cases: AI software development, code completion
Pros:
- Versatile for various AI software development scenarios
- Strong code completion capabilities
- Single optimized size option
- Good for IDE integration
Cons:
- Only one size option available
- May not match performance of larger models
- Less popular than alternatives
codeqwen
Tags: coding
Owner/Author: Alibaba Cloud (Qwen)
Parameters: 7B
Tool Support: No
Resource Demand: Low (7B)
Primary Use Cases: Code generation and completion (pretrained on extensive code data)
Pros:
- Extensive pretraining on code data
- Good balance of size and performance
- Part of Qwen model family
- Well-optimized for code tasks
Cons:
- Only one size option
- May not match latest Qwen2.5-coder improvements
- Less flexible than multi-size alternatives
dolphincoder
Tags: coding
Owner/Author: Eric Hartford (community)
Parameters: 7B, 15B
Tool Support: No
Resource Demand: Low (7B), Medium (15B)
Primary Use Cases: Unrestricted coding scenarios (uncensored variant based on StarCoder2)
Pros:
- Uncensored variant for unrestricted coding scenarios
- Based on proven StarCoder2 architecture
- Two size options available
- Good for scenarios requiring fewer restrictions
Cons:
- Uncensored nature may not be suitable for all use cases
- Less popular than mainstream alternatives
- May have ethical considerations for some teams
stable-code
Tags: coding
Owner/Author: Stability AI
Parameters: 3B
Hugging Face: ♥ 659 · ↓ 6,046
Tool Support: No
Resource Demand: Low (3B)
Primary Use Cases: Code completion, instruction following, coding tasks
Pros:
- Very lightweight (3B) suitable for most hardware
- Performance comparable to Code Llama 7B despite smaller size
- Good for code completion tasks
- Stable and reliable
Cons:
- Only one size option
- May lack depth for complex reasoning tasks
- Smaller than many alternatives
magicoder
Tags: coding
Owner/Author: iSE-UIUC
Parameters: 7B
Hugging Face: ♥ 205 · ↓ 5,562
Tool Support: No
Resource Demand: Low (7B)
Primary Use Cases: Code generation (trained on 75K synthetic instruction samples)
Pros:
- Novel OSS-Instruct training approach
- Trained on open-source code snippets
- Good for general code generation
- Innovative training methodology
Cons:
- Only one size option
- Less popular than alternatives
- May not match performance of larger or newer models
codebooga
Tags: coding
Owner/Author: oobabooga
Parameters: 34B
Hugging Face: ♥ 147 · ↓ 9,001
Tool Support: No
Resource Demand: High (34B)
Primary Use Cases: Instruction-following code generation (high-performing merged model)
Pros:
- High performance from merged model architecture
- Specialized for instruction following
- Large model size for complex tasks
- Good for detailed coding instructions
Cons:
- Very large model requires high-end hardware
- Only one size option
- Less popular than alternatives
- Merged architecture may have compatibility considerations
starcoder
Tags: coding
Owner/Author: BigCode (Hugging Face)
Parameters: 1B, 3B, 7B, 15B
Hugging Face: ♥ 2,924 · ↓ 6,369
Tool Support: No
Resource Demand: Low (1B-7B), Medium (15B)
Primary Use Cases: Code generation across 80+ programming languages
Pros:
- Trained on 80+ programming languages
- Excellent multi-language support
- Multiple size options
- Well-established and reliable
Cons:
- Older model (2 years) may lack latest improvements
- May not specialize as well as newer models
- Less popular than StarCoder2 successor
sqlcoder
Tags: coding
Owner/Author: Defog.ai
Parameters: 7B, 15B
Hugging Face: ♥ 68 · ↓ 308
Tool Support: No
Resource Demand: Low (7B), Medium (15B)
Primary Use Cases: SQL generation tasks, database query generation
Pros:
- Specialized for SQL generation
- Fine-tuned on StarCoder for SQL tasks
- Two size options available
- Excellent for database-related coding
Cons:
- Specialized only for SQL, less versatile
- May not perform well on non-SQL tasks
- Limited use case compared to general models
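Text-to-SQL models like sqlcoder work best when the prompt grounds the question in the actual database schema. A sketch of that pattern follows; the section layout is illustrative, not Defog's exact prompt template:

```python
def build_sql_prompt(schema: str, question: str) -> str:
    """Compose a schema-grounded text-to-SQL prompt (illustrative layout,
    not Defog's exact template)."""
    return (
        "### Database schema\n"
        f"{schema}\n\n"
        "### Question\n"
        f"{question}\n\n"
        "### SQL\n"
    )

if __name__ == "__main__":
    schema = "CREATE TABLE orders (id INT, customer TEXT, total REAL);"
    print(build_sql_prompt(schema, "What is the total revenue per customer?"))
```

Ending the prompt at the `### SQL` header nudges the model to continue with a query rather than prose.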
wizardcoder
Tags: coding
Owner/Author: WizardLM Team
Parameters: 33B
Hugging Face: ♥ 135 · ↓ 50
Tool Support: No
Resource Demand: High (33B)
Primary Use Cases: State-of-the-art code generation
Pros:
- State-of-the-art code generation capabilities
- Large model size for complex tasks
- Strong performance on code generation benchmarks
- Well-regarded in coding community
Cons:
- Very large model requires high-end hardware
- Only one size option
- Older model (2 years) may lack latest improvements
codeup
Tags: coding
Owner/Author: juyongjiang
Parameters: 13B
Tool Support: No
Resource Demand: Medium (13B)
Primary Use Cases: Code generation based on Llama2
Pros:
- Based on proven Llama2 architecture
- Good balance of size and performance
- Reliable code generation capabilities
- Well-tested foundation
Cons:
- Only one size option
- Older architecture (Llama2-based)
- Less popular than newer alternatives
- May not match latest model improvements
llama3.1
Tags: writing, coding
Owner/Author: Meta
Parameters: 8B, 70B, 405B
Hugging Face: ♥ 5,536 · ↓ 7,354,915
Tool Support: Yes
Resource Demand: Low (8B), High (70B), Very-High (405B)
Primary Use Cases: General-purpose tasks, creative writing, code generation, instruction following
Pros:
- Extremely popular and well-tested (8.5M pulls)
- Versatile for both writing and coding tasks
- Excellent instruction following capabilities
- Strong creative writing performance
- Multiple size options including massive 405B model
- Latest Llama architecture improvements
Cons:
- 405B model requires extremely high-end hardware
- May not specialize as well as dedicated models
- General-purpose nature means less optimization for specific tasks
llama3
Tags: writing, coding
Owner/Author: Meta
Parameters: 8B, 70B
Tool Support: No
Resource Demand: Low (8B), High (70B)
Primary Use Cases: Creative writing, general-purpose tasks, code generation, storytelling
Pros:
- Very popular creative writing model (6.2M pulls)
- Excellent for storytelling and narrative content
- Good understanding of nuance and tone
- Versatile for both writing and coding
- Well-established and reliable
- Strong dialogue generation
Cons:
- Older than Llama 3.1
- 70B model requires significant resources
- May not match specialized models in specific domains
llama3.2
Tags: writing, coding
Owner/Author: Meta
Parameters: 1B, 3B
Hugging Face: ♥ 2,024 · ↓ 4,131,445
Tool Support: Yes
Resource Demand: Low (1B-3B)
Primary Use Cases: Lightweight general-purpose tasks, writing, coding on resource-constrained devices
Pros:
- Very lightweight models suitable for edge devices
- Good performance relative to size
- Versatile for both writing and coding
- Latest Llama architecture in compact form
- Fast inference on limited hardware
Cons:
- Smaller models may lack depth for complex tasks
- Limited to two size options
- May not match larger models on complex reasoning
qwen2.5
Tags: writing, coding
Owner/Author: Alibaba Cloud (Qwen)
Parameters: 0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B
Hugging Face: ♥ 1,117 · ↓ 21,497,102
Tool Support: Yes
Resource Demand: Low (0.5B-7B), Medium (14B), High (32B-72B)
Primary Use Cases: General-purpose tasks, creative writing, code generation, multilingual content
Pros:
- Most popular general-purpose model (12.3M pulls)
- Excellent for both writing and coding tasks
- Wide range of sizes for different hardware
- Strong multilingual capabilities
- Versatile and well-tested
- Good creative writing performance
Cons:
- General-purpose nature means less specialization
- Larger models (72B) require substantial resources
- May not match dedicated models in specific domains
deepseek-r1
Tags: writing
Owner/Author: DeepSeek
Parameters: 1.5B, 7B, 14B, 32B, 70B
Hugging Face: ♥ 1,456 · ↓ 1,465,814
Tool Support: Yes
Resource Demand: Low (1.5B-7B), Medium (14B), High (32B-70B)
Primary Use Cases: Long-form writing, structured content generation, detailed reasoning
Pros:
- Specialized for long-form content generation
- Excellent structured writing capabilities
- Strong reasoning and detailed outputs
- Multiple size options
- Well-optimized for writing tasks
- Good for blog posts, articles, and essays
Cons:
- Primarily focused on writing, less versatile for coding
- Larger models require significant resources
- Relatively new with less community testing
mistral
Tags: writing, coding
Owner/Author: Mistral AI
Parameters: 7B
Hugging Face: ♥ 2,452 · ↓ 1,598,460
Tool Support: Yes
Resource Demand: Low (7B)
Primary Use Cases: Creative writing, instruction following, general-purpose tasks, code generation
Pros:
- Very popular and efficient (5.8M pulls)
- Excellent instruction following
- Good balance of writing and coding capabilities
- Efficient 7B size suitable for most hardware
- Well-tested and reliable
- Strong creative writing performance
Cons:
- Only one size option
- May not match larger models on complex tasks
- Older than newer Mistral variants
mixtral
Tags: writing, coding
Owner/Author: Mistral AI
Parameters: 8x7B (45B effective), 8x22B (141B effective)
Hugging Face: ♥ 4,643 · ↓ 779,925
Tool Support: Yes
Resource Demand: High (8x7B), Very-High (8x22B)
Primary Use Cases: Complex creative writing, advanced code generation, mixture-of-experts efficiency
Pros:
- Mixture-of-Experts architecture for efficiency
- Excellent for complex creative tasks
- Good performance on both writing and coding
- More efficient than traditional 45B models
- Strong for advanced use cases
- Well-regarded in creative writing community
- 8x22B variant available for higher capability needs
Cons:
- Still requires substantial resources
- Limited to two configurations (8x7B and 8x22B)
- May be overkill for simple tasks
- 8x22B requires very high-end hardware (~131GB RAM)
gemma2
Tags: writing, coding
Owner/Author: Google
Parameters: 2B, 9B, 27B
Hugging Face: ♥ 1,299 · ↓ 415,951
Tool Support: No
Resource Demand: Low (2B), Medium (9B), High (27B)
Primary Use Cases: Narrative-driven content, creative writing, general-purpose tasks, code generation
Pros:
- Google-developed with strong narrative capabilities
- Good for upbeat, conversational writing
- Versatile for both writing and coding
- Multiple size options
- Well-optimized for narrative content
- Strong for brainstorming and creative tasks
Cons:
- May not match specialized models in specific domains
- Smaller models may lack depth
- Less popular than some alternatives
gemma3
Tags: writing, coding
Owner/Author: Google
Parameters: 270M, 1B, 4B, 12B, 27B
Hugging Face: ♥ 1,215 · ↓ 2,184,470
Tool Support: No
Resource Demand: Low (270M-12B), Medium (27B)
Primary Use Cases: Lightweight writing tasks, narrative content, general-purpose, code generation
Pros:
- Latest Gemma architecture
- Very lightweight options (270M, 1B) for edge devices
- Good for narrative and creative writing
- Versatile for both writing and coding
- Wide range of sizes
- Fast inference on smaller models
Cons:
- Relatively new with less community testing
- Smaller models may lack depth
- Less popular than Gemma2
dolphin-llama3
Tags: writing
Owner/Author: Eric Hartford (community)
Parameters: 8B, 70B
Tool Support: Yes
Resource Demand: Low (8B), High (70B)
Primary Use Cases: Uncensored creative writing, fiction, roleplay, immersive storytelling
Pros:
- Uncensored variant for unrestricted creative writing
- Excellent for fiction and immersive storytelling
- Based on proven Llama3 architecture
- Popular for roleplay scenarios
- Two size options available
- Strong narrative capabilities
Cons:
- Uncensored nature may not be suitable for all use cases
- Primarily focused on writing, less versatile
- May have ethical considerations for some teams
dolphin-mistral
Tags: writing
Owner/Author: Eric Hartford (community)
Parameters: 7B
Tool Support: Yes
Resource Demand: Low (7B)
Primary Use Cases: Uncensored creative writing, fiction, roleplay, unrestricted content
Pros:
- Uncensored variant based on Mistral
- Efficient 7B size
- Excellent for fiction and creative writing
- Popular for unrestricted scenarios
- Well-tested uncensored model
Cons:
- Uncensored nature may not be suitable for all use cases
- Only one size option
- Primarily focused on writing
dolphin3
Tags: writing
Owner/Author: Eric Hartford (community)
Parameters: 8B
Hugging Face: ♥ 479 · ↓ 15,718
Tool Support: Yes
Resource Demand: Low (8B)
Primary Use Cases: Uncensored creative writing, fiction, roleplay (based on Llama 3.1)
Pros:
- Based on latest Llama 3.1 architecture
- Uncensored for unrestricted creative writing
- Good balance of size and performance
- Popular for fiction and roleplay
- Latest Dolphin variant
Cons:
- Uncensored nature may not be suitable for all use cases
- Only one size option
- Primarily focused on writing
phi3
Tags: writing, coding
Owner/Author: Microsoft
Parameters: 3.8B, 14B
Hugging Face: ♥ 1,395 · ↓ 828,343
Tool Support: No
Resource Demand: Low (3.8B), Medium (14B)
Primary Use Cases: Lightweight writing tasks, structured content, code generation, edge devices
Pros:
- Very lightweight and efficient
- Good for structured writing and rubrics
- Versatile for both writing and coding
- Excellent for resource-constrained environments
- Fast inference
- Microsoft-developed with good documentation
Cons:
- Smaller models may lack depth for complex tasks
- Limited size options
- May not match larger models on complex reasoning
vicuna
Tags: writing
Owner/Author: LMSYS
Parameters: 7B, 13B, 33B
Hugging Face: ♥ 388 · ↓ 896,965
Tool Support: No
Resource Demand: Low (7B), Medium (13B), High (33B)
Primary Use Cases: Natural conversational writing, custom assistants, dialogue generation
Pros:
- Natural, less robotic conversational style
- Excellent for dialogue and conversational content
- Good for custom assistant applications
- Multiple size options
- Well-regarded for natural language generation
Cons:
- Primarily focused on writing, less versatile
- Older model may lack latest improvements
- May not match newer models on complex tasks
ministral-3
Tags: writing, coding
Owner/Author: Mistral AI
Parameters: 3B, 8B, 14B
Tool Support: Yes
Resource Demand: Low (3B-8B), Medium (14B)
Primary Use Cases: Edge deployment, efficient writing and coding, resource-constrained environments
Pros:
- Designed for edge deployment
- Efficient models for limited resources
- Versatile for both writing and coding
- Good performance relative to size
- Fast inference
- Latest Mistral architecture in compact form
Cons:
- Relatively new with less community testing
- Smaller models may lack depth
- Less popular than full-size Mistral
qwen3
Tags: writing, coding
Owner/Author: Alibaba Cloud (Qwen)
Parameters: 0.6B, 1.7B, 4B, 8B, 14B, 30B-A3B, 32B, 235B
Hugging Face: ♥ 916 · ↓ 582,223
Tool Support: Yes
Resource Demand: Low (0.6B-8B), Medium (14B), High (30B-A3B, 32B), Very-High (235B)
Primary Use Cases: General-purpose tasks, reasoning, multilingual support, instruction following, code generation
Pros:
- Latest generation Qwen architecture with major improvements
- Hybrid thinking mode supports both fast and deep reasoning
- MoE variant (30B-A3B) provides high capability with lower compute
- Excellent multilingual support across 100+ languages
- Wide range of sizes from edge to datacenter
- Apache 2.0 license
Cons:
- 235B model requires datacenter-scale hardware
- Newer model with less community testing than Qwen2.5
- Some sizes may not yet have full Ollama quantization support
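Qwen3's hybrid thinking mode can be toggled per message using the `/think` and `/no_think` soft switches described in the Qwen3 documentation. A small helper sketching that convention (exact switch behavior may vary by runtime and version):

```python
def with_thinking(prompt: str, think: bool) -> str:
    """Append Qwen3's documented soft switch to toggle the hybrid
    thinking mode on a per-message basis."""
    return f"{prompt} {'/think' if think else '/no_think'}"

if __name__ == "__main__":
    # Fast answer without an explicit reasoning trace:
    print(with_thinking("Summarize this changelog in two sentences.", think=False))
    # Deep reasoning for a harder problem:
    print(with_thinking("Find the bug in this recursive function.", think=True))
```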
qwen3-coder-next
Tags: coding
Owner/Author: Alibaba Cloud (Qwen)
Parameters: 80B
Tool Support: Yes
Resource Demand: Very-High (80B)
Primary Use Cases: Advanced code generation, large-scale agentic coding, long-context code understanding
Pros:
- Next-generation code model from Qwen team
- 256K context window for very large codebases
- Designed for agentic coding workflows
- Strong performance on code benchmarks
Cons:
- Very large model requiring significant GPU resources (~74GB RAM)
- Brand new with limited community feedback
- Only one size option available
llama2
Tags: writing, coding
Owner/Author: Meta
Parameters: 7B, 13B, 70B
Hugging Face: ♥ 2,275 · ↓ 670,303
Tool Support: No
Resource Demand: Low (7B), Medium (13B), High (70B)
Primary Use Cases: General-purpose tasks, creative writing, conversation, code generation
Pros:
- Foundational open-source LLM that popularized open weights
- Extremely well-tested and widely deployed
- Large ecosystem of fine-tunes and community models
- Multiple size options for different hardware
- Well-documented training methodology
Cons:
- Older architecture superseded by Llama 3.x series
- 4K context window is limited by modern standards
- Performance significantly behind newer models
- Llama 2 Community License has some restrictions
llama3.3
Tags: writing, coding
Owner/Author: Meta
Parameters: 70B
Hugging Face: ♥ 2,674 · ↓ 608,312
Tool Support: Yes
Resource Demand: High (70B)
Primary Use Cases: High-quality instruction following, creative writing, code generation, reasoning
Pros:
- Latest Meta model with improved instruction following
- 128K context window for long-document tasks
- Performance approaching Llama 3.1 405B on many benchmarks
- Strong multilingual support
- Llama 3.3 Community License
Cons:
- Only available in 70B size, requires significant resources (~66GB RAM)
- Single size option limits flexibility
- No smaller variants for lighter hardware
phi-2
Tags: writing, coding
Owner/Author: Microsoft
Parameters: 2.8B
Hugging Face: ♥ 3,429 · ↓ 1,676,842
Tool Support: No
Resource Demand: Low (2.8B)
Primary Use Cases: Lightweight general-purpose tasks, reasoning, code generation on constrained hardware
Pros:
- Remarkably capable for its small size
- Runs efficiently on consumer hardware
- Strong reasoning for a sub-3B model
- Good for experimentation and prototyping
- MIT license
Cons:
- 2K context window is very limited
- Significantly less capable than larger models
- Superseded by Phi-3 and Phi-4
- Limited depth for complex multi-step reasoning
phi-4
Tags: writing, coding
Owner/Author: Microsoft
Parameters: 14B
Hugging Face: ♥ 695 · ↓ 262,109
Tool Support: Yes
Resource Demand: Medium (14B)
Primary Use Cases: Reasoning, STEM tasks, code generation, mathematical problem-solving
Pros:
- State-of-the-art reasoning for its size class
- Excellent STEM and mathematical capabilities
- Strong code generation performance
- 16K context window
- MIT license
- Good balance of quality and resource requirements (~13GB RAM)
Cons:
- Single size option
- Smaller context than some competitors
- May underperform on creative writing compared to larger models
falcon
Tags: writing, coding
Owner/Author: TII (Technology Innovation Institute)
Parameters: 7B, 40B, 180B
Hugging Face: ♥ 1,031 · ↓ 48,493
Tool Support: No
Resource Demand: Low (7B), High (40B), Very-High (180B)
Primary Use Cases: General-purpose text generation, instruction following, multilingual tasks
Pros:
- One of the first high-quality open-source LLMs
- 180B was the largest open model at time of release
- Apache 2.0 license
- Strong multilingual capabilities
- Well-tested and widely used
Cons:
- Older architecture, predates many modern improvements
- Short context windows (2K-4K)
- 180B requires datacenter hardware (~167GB RAM)
- Performance behind newer models like Llama 3 and Qwen2.5
falcon3
Tags: writing, coding
Owner/Author: TII (Technology Innovation Institute)
Parameters: 7B
Tool Support: Yes
Resource Demand: Low (7B)
Primary Use Cases: Instruction following and general-purpose tasks (improved architecture over Falcon 1/2)
Pros:
- Major architectural upgrade over earlier Falcon models
- 32K context window
- Competitive with other 7B-class models
- Apache 2.0 license
- Efficient for its capabilities
Cons:
- Only 7B size available
- Less community adoption than Llama or Qwen
- Relatively new with limited ecosystem
mistral-nemo
Tags: writing, coding
Owner/Author: Mistral AI
Parameters: 12B
Tool Support: Yes
Resource Demand: Medium (12B)
Primary Use Cases: Instruction following, long-context tasks, coding, general-purpose generation
Pros:
- 128K context window in a 12B model
- Developed jointly by Mistral AI and NVIDIA
- Drop-in replacement for Mistral 7B with better performance
- Strong balance of capability and resource requirements (~11GB RAM)
- Good multilingual support
- Apache 2.0 license
Cons:
- Single size option
- Larger than Mistral 7B, may not fit on all devices
- Superseded by newer Mistral models
mistral-small
Tags: writing, coding
Owner/Author: Mistral AI
Parameters: 24B
Tool Support: Yes
Resource Demand: High (24B)
Primary Use Cases: High-quality instruction following, coding, reasoning, general-purpose tasks
Pros:
- Strong performance in the 24B parameter class
- Good balance between Mistral 7B and Mixtral
- 32K context window
- Efficient for its capability level
- Apache 2.0 license
Cons:
- Requires more resources than Mistral 7B (~22GB RAM)
- Single size option
- May be outperformed by newer models in same size class
yi
Tags: writing, coding
Owner/Author: 01-ai (Yi)
Parameters: 6B, 34B
Hugging Face: ♥ 35 · ↓ 9,322
Tool Support: No
Resource Demand: Low (6B), High (34B)
Primary Use Cases: Multilingual chat, Chinese/English bilingual tasks, general-purpose generation
Pros:
- Strong bilingual Chinese/English capabilities
- 34B model competitive with much larger models
- Good for multilingual applications
- Well-optimized for instruction following
Cons:
- 4K context window is limited
- Only two size options
- Less popular in English-only environments
- Older model superseded by Yi 1.5
openchat
Tags: writing
Owner/Author: OpenChat
Parameters: 7B
Tool Support: No
Resource Demand: Low (7B)
Primary Use Cases: Instruction following, conversational AI, general-purpose chat
Pros:
- Competitive with ChatGPT (March 2023) on MT-Bench
- Based on Mistral 7B with improved fine-tuning
- Lightweight and efficient
- Good instruction following capabilities
- Apache 2.0 license
Cons:
- Only one size option
- 8K context window is moderate
- Older model, performance behind newer alternatives
- Less versatile for coding tasks
zephyr
Tags: writing
Owner/Author: Hugging Face
Parameters: 7B
Hugging Face: ♥ 1,835 · ↓ 115,833
Tool Support: No
Resource Demand: Low (7B)
Primary Use Cases: Helpful assistant tasks, instruction following, conversational AI
Pros:
- Aligned with DPO (Direct Preference Optimization)
- Based on Mistral 7B with strong alignment
- 32K context window
- Lightweight and efficient
- Well-documented training process
Cons:
- Only one size option
- Older model with less capability than newer alternatives
- May be overly cautious due to alignment training
hermes
Tags: writing, coding
Owner/Author: NousResearch
Parameters: 8B
Hugging Face: ♥ 499 · ↓ 4,348
Tool Support: Yes
Resource Demand: Low (8B)
Primary Use Cases: General-purpose generation, function calling, structured outputs, agentic tasks
Pros:
- Based on Llama 3.1 with enhanced capabilities
- 128K context window
- Strong function calling and tool use support
- Good for agentic workflows
- Popular community model
- Apache 2.0 license
Cons:
- Currently only 8B size available
- Community-maintained, less corporate backing
- May not match commercial-grade models on all tasks
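Function calling works by sending the model a list of tool definitions alongside the conversation; tool-capable models such as hermes can then reply with a structured call instead of free text. A sketch using the standard OpenAI-style tool schema (the `get_weather` function and its parameters are invented for this example, and the `hermes3` model tag is an assumption):

```python
def weather_tool_spec() -> dict:
    """An OpenAI-style function-calling tool definition (the get_weather
    function and its JSON-schema parameters are invented for this sketch)."""
    return {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }

def build_tool_call_request(model: str, user_msg: str) -> dict:
    """Chat request carrying the tool definition; a tool-capable model may
    answer with a structured tool call rather than plain text."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_msg}],
        "tools": [weather_tool_spec()],
    }

if __name__ == "__main__":
    print(build_tool_call_request("hermes3", "What's the weather in Oslo?"))
```

Your application inspects the response for a tool call, executes the named function locally, and sends the result back as a follow-up message.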
olmo
Tags: writing, coding
Owner/Author: Allen AI (AI2)
Parameters: 1B, 7B, 13B, 32B
Tool Support: Yes
Resource Demand: Low (1B-7B), Medium (13B), High (32B)
Primary Use Cases: General-purpose generation, research, fully open and reproducible LLM development
Pros:
- Fully open: weights, data, training code, and evaluation all public
- Multiple size options including 32B
- Apache 2.0 license
- Backed by Allen AI research institute
- Excellent for reproducible AI research
- Strong instruction-following capabilities
Cons:
- 4K context window is limited
- Less popular than Llama or Qwen
- May not match performance of models trained on larger data
- 32B model requires ~30GB RAM
tinyllama
Tags: writing
Owner/Author: Zhang Peiyuan (community)
Parameters: 1.1B
Hugging Face: ♥ 1,541 · ↓ 1,946,573
Tool Support: No
Resource Demand: Low (1.1B)
Primary Use Cases: Edge deployment, lightweight chat, mobile and embedded applications
Pros:
- Extremely lightweight and fast
- Runs on very limited hardware including mobile devices
- Based on Llama 2 architecture
- Trained on 3 trillion tokens for strong performance at its size
- Apache 2.0 license
- Good for experimentation and prototyping
Cons:
- Very limited capability compared to larger models
- 2K context window is restrictive
- Not suitable for complex reasoning or long-form generation
- Older architecture
nemotron
Tags: writing, coding
Owner/Author: NVIDIA
Parameters: 9B, 30B-A3B
Tool Support: Yes
Resource Demand: Low (9B), High (30B-A3B)
Primary Use Cases: General-purpose generation, reasoning, tool use, agentic workflows
Pros:
- NVIDIA-developed with hardware optimization expertise
- MoE variant (30B-A3B) offers efficiency with only 3B active parameters
- Very large context windows (128K-256K)
- Strong reasoning capabilities
- Good for NVIDIA GPU deployments
Cons:
- Limited size options
- 30B-A3B model still requires significant memory (~29GB RAM)
- Less community adoption than Llama or Qwen
- May be optimized primarily for NVIDIA hardware
glm-4
Tags: writing, coding
Owner/Author: Zhipu AI (THUDM)
Parameters: 9B
Tool Support: Yes
Resource Demand: Medium (9B)
Primary Use Cases: Bilingual Chinese/English generation, instruction following, long-context tasks
Pros:
- Excellent Chinese language capabilities
- 128K context window
- Strong bilingual (Chinese/English) performance
- Good instruction following
- Competitive with Western models in bilingual tasks
Cons:
- Single size option
- Less popular outside Chinese-speaking communities
- May underperform on English-only tasks vs peers
- Requires ~9GB RAM
deepseek-v3
Tags: writing, coding
Owner/Author: DeepSeek
Parameters: 685B (MoE)
Tool Support: Yes
Resource Demand: Very-High (685B)
Primary Use Cases: State-of-the-art general-purpose generation, coding, reasoning, multilingual tasks
Pros:
- Frontier-level performance competitive with GPT-4 and Claude
- Mixture-of-Experts architecture for training efficiency
- 128K+ context window
- Strong coding and reasoning capabilities
- Multiple versions (V3, V3-0324, V3.2) with continued improvements
- Open weights
Cons:
- Extremely large model requiring datacenter hardware (~638GB RAM)
- Not practical for consumer-grade local deployment
- Complex MoE architecture may require special infrastructure
- High inference cost
orca
Tags: writing, coding
Owner/Author: Microsoft
Parameters: 7B, 13B
Hugging Face: ♥ 666 · ↓ 3,547
Tool Support: No
Resource Demand: Low (7B), Medium (13B)
Primary Use Cases: Reasoning, step-by-step problem solving, cautious and accurate responses
Pros:
- Trained specifically for reasoning and step-by-step solutions
- Microsoft-developed with research rigor
- Good at providing cautious, well-reasoned answers
- Two size options
- Effective for educational and analytical tasks
Cons:
- 4K context window is limited
- Based on older Llama 2 architecture
- May be overly cautious for creative tasks
- Superseded by newer Microsoft models (Phi-4)
wizardlm
Tags: writing, coding
Owner/Author: WizardLM Team
Parameters: 13B
Tool Support: No
Resource Demand: Medium (13B)
Primary Use Cases: Complex instruction following, creative writing, code generation
Pros:
- Trained with Evol-Instruct method for complex instructions
- Strong instruction following capabilities
- Good balance of writing and coding tasks
- Based on proven Llama architecture
Cons:
- Only 13B size available
- 4K context window is limited
- Older model with Llama 2 base
- Project development paused
solar
Tags: writing, coding
Owner/Author: Upstage
Parameters: 10.7B
Tool Support: Yes
Resource Demand: Medium (10.7B)
Primary Use Cases: High-performance instruction following, general-purpose tasks
Pros:
- Depth-Upscaled (DUS) architecture for improved performance
- Competitive with larger models at its size class
- Good for instruction following
- Well-optimized single-model architecture (no MoE)
- Apache 2.0 license
Cons:
- 4K context window is limited
- Only one size option
- Older model that has since been surpassed by newer alternatives
- ~10GB RAM requirement
bloom
Tags: writing
Owner/Author: BigScience
Parameters: 560M, 1.1B, 1.7B, 3B, 7.1B, 176B
Tool Support: No
Resource Demand: Low (560M-7.1B), Very-High (176B)
Primary Use Cases: Multilingual text generation across 46 languages and 13 programming languages
Pros:
- Trained collaboratively by 1000+ researchers
- Supports 46 natural languages and 13 programming languages
- Fully open model with transparent training
- Multiple size options from 560M to 176B
- RAIL license promotes responsible use
Cons:
- Older model (2022) with dated performance
- 176B model requires datacenter hardware (~164GB RAM)
- Short context windows (2K-4K)
- Performance well behind modern models
- Large download sizes
exaone
Tags: writing, coding
Owner/Author: LG AI Research
Parameters: 32B
Tool Support: Yes
Resource Demand: High (32B)
Primary Use Cases: General-purpose generation, bilingual Korean/English tasks, long-context understanding
Pros:
- Strong bilingual Korean/English capabilities
- 128K context window
- Competitive performance with larger models
- Backed by LG AI Research
- Good for Korean language applications
Cons:
- Single size option
- Less known outside Korean-speaking markets
- Requires ~30GB RAM
- Limited community ecosystem
ernie
Tags: writing, coding
Owner/Author: Baidu
Parameters: 300B (47B active, MoE)
Tool Support: Yes
Resource Demand: Very-High (300B)
Primary Use Cases: General-purpose generation, Chinese language tasks, multimodal understanding
Pros:
- Baidu's flagship open-source model
- MoE architecture with 47B active parameters for efficiency
- 128K context window
- Excellent Chinese language capabilities
- Strong multimodal understanding
Cons:
- Very large model requiring datacenter hardware (~280GB RAM)
- Primarily optimized for Chinese language
- PaddlePaddle framework dependency
- Less community adoption outside China
stablelm
Tags: writing
Owner/Author: Stability AI
Parameters: 1.6B
Hugging Face: ♥ 208 · ↓ 1,376
Tool Support: Yes
Resource Demand: Low (1.6B)
Primary Use Cases: Lightweight chat, edge deployment, instruction following on constrained devices
Pros:
- Very lightweight and fast
- Good performance for its size
- Suitable for edge and mobile deployment
- Stability AI backing
Cons:
- Very small model with limited capabilities
- 4K context window
- Stability AI has reduced its open-source focus
- Not suitable for complex tasks
How to Choose the Right Open Source LLM
Selecting an open source LLM depends on hardware capabilities, use case, and performance requirements.
Hardware Constraints:
- For limited resources: Consider smaller models like stable-code (3B), codegemma (2B/7B), qwen2.5-coder (0.5B-7B), phi3 (3.8B), or llama3.2 (1B/3B)
- For high-end hardware: Larger models like qwen3-coder (480B), deepseek-coder-v2 (236B), llama3.1 (405B), or codellama (70B)
Use Case - Coding:
- General coding: qwen2.5-coder, codellama, or deepseek-coder
- SQL-specific: sqlcoder
- Long context/agentic: qwen3-coder
- Code completion: stable-code, codegeex4
- Multi-language: starcoder or starcoder2
- Versatile (coding + writing): qwen2.5, llama3.1, mistral, mixtral
Use Case - Writing:
- Creative writing: llama3, llama3.1, mistral, gemma2
- Long-form content: deepseek-r1
- Fiction/roleplay: dolphin-llama3, dolphin-mistral, dolphin3
- Conversational: vicuna
- Lightweight writing: phi3, gemma3, llama3.2
- Versatile (writing + coding): qwen2.5, llama3.1, mistral, mixtral
Popularity and Reliability:
- Most tested: qwen2.5 (12.3M pulls), qwen2.5-coder (10.1M pulls), llama3.1 (8.5M pulls), llama3 (6.2M pulls), mistral (5.8M pulls)
- Newest features: qwen3-coder (3 months), llama3.2 (recent), gemma3 (recent)
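The hardware guidance above can be sketched as a simple selection helper. This is a rough heuristic, not a definitive tool: the RAM thresholds assume 4-bit quantized weights, and the tier assignments are illustrative picks from the models in this list.

```python
def suggest_models(ram_gb: float, task: str = "coding") -> list[str]:
    """Suggest models from this list that plausibly fit in ram_gb of memory.

    Thresholds are rough rules of thumb for 4-bit quantized weights;
    check each model's actual requirements before downloading.
    """
    tiers = {
        "coding": [
            (4,  ["qwen2.5-coder:1.5b", "stable-code:3b"]),
            (8,  ["qwen2.5-coder:7b", "codegemma:7b"]),
            (16, ["qwen2.5-coder:14b", "codellama:13b"]),
            (48, ["qwen2.5-coder:32b", "codellama:34b"]),
        ],
        "writing": [
            (4,  ["llama3.2:3b", "phi3:3.8b"]),
            (8,  ["llama3.1:8b", "mistral:7b"]),
            (16, ["gemma2:9b", "qwen2.5:14b"]),
            (48, ["qwen2.5:32b", "mixtral:8x7b"]),
        ],
    }
    picks: list[str] = []
    for ceiling, models in tiers[task]:
        if ram_gb >= ceiling:
            picks = models  # keep the largest tier that still fits
    return picks
```

For example, `suggest_models(8, "coding")` lands in the 7B tier, which matches the "start small" advice later in this guide.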
Benefits of Running Open Source LLMs Locally
Running open source language models locally offers several advantages over cloud-based APIs:
- Privacy: Your code and conversations never leave your machine
- Cost: No per-token API fees or subscription costs
- Control: Full control over model versions, parameters, and data
- Offline Access: Work without internet connectivity
- Customization: Fine-tune models for your specific needs
- No Rate Limits: Generate as much content as your hardware allows
Getting Started with Local LLMs
You can run open-source LLMs locally using several tools and platforms:
GUI-based tools:
- LM Studio - Interface for downloading and chatting with models
- Jan - Open-source ChatGPT alternative
- GPT4All - General-purpose application with document chat capabilities
Command-line tools:
- Ollama - Simple command-line tool for running models locally
- llama.cpp - Lightweight C++ implementation that runs models efficiently on CPUs
- Direct model loading via Python frameworks like Hugging Face Transformers, PyTorch, or TensorFlow
Web interfaces: If you want a ChatGPT-like experience, you can pair these backends with interfaces like LobeChat, Open WebUI, or LibreChat.
LM Studio and Jan provide model downloads and chat interfaces without any command-line work, and they support the same model format (GGUF) that Ollama uses.
Code LLMs vs Writing Models: What’s the Difference?
Code LLMs and writing models differ in their training data and strengths:
Code LLMs (like qwen2.5-coder, codellama, deepseek-coder) are trained on code repositories and handle:
- Code generation and completion
- Debugging and error fixing
- Code explanation and documentation
- Multi-language programming support
- Understanding code context and syntax
Writing Models (like llama3.1, mistral, gemma2) are designed for natural language tasks:
- Creative writing and storytelling
- Content generation and editing
- Conversational AI and chat
- Long-form content creation
- General language understanding
Versatile Models (like qwen2.5, llama3.1, mistral) handle both coding and writing tasks.
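This split can be sketched as a minimal task router that picks a model family based on the request. The keyword list and default choices are illustrative assumptions, not part of any library; the model names come from this list.

```python
def route_task(prompt: str) -> str:
    """Return a suggested model for a prompt via crude keyword matching.

    The keyword set below is a toy heuristic; a real router would use
    classification or let the user choose explicitly.
    """
    code_keywords = {"function", "bug", "refactor", "sql", "compile", "debug"}
    words = set(prompt.lower().replace(",", " ").split())
    if words & code_keywords:
        return "qwen2.5-coder"   # a dedicated code LLM from the list above
    return "llama3.1"            # a strong general writing model
```

So a prompt mentioning a "function" or "bug" would be routed to a code LLM, while open-ended prose requests fall through to a writing model.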
Using Ollama for Local LLM Deployment
Ollama provides a command-line interface and API for running open source LLMs locally. Example usage:
Pull a model (coding example):
ollama pull qwen2.5-coder:7b
Pull a model (writing example):
ollama pull llama3.1:8b
Run a model:
ollama run qwen2.5-coder:7b
ollama run llama3.1:8b
Or use in your application (coding):
curl -s http://localhost:11434/api/generate -d '{
  "model": "qwen2.5-coder:7b",
  "prompt": "Write a Python function to calculate Fibonacci numbers.",
  "stream": false
}' | jq -r '.response'
Or use in your application (writing):
curl -s http://localhost:11434/api/generate -d '{
  "model": "llama3.1:8b",
  "prompt": "Write a short story about a robot learning to paint.",
  "stream": false
}' | jq -r '.response'
Popular Use Cases for Open Source LLMs
Open source language models are being used across various domains:
- Code Generation: Automate boilerplate code, generate functions, and complete code snippets
- Code Review: Analyze code for bugs, security issues, and best practices
- Documentation: Generate API docs, README files, and technical documentation
- Creative Writing: Draft stories, articles, and creative content
- Content Editing: Improve grammar, style, and clarity of written content
- Conversational AI: Build chatbots and virtual assistants
- Data Analysis: Generate SQL queries and analyze datasets
- Learning: Understand programming concepts and get coding help
Running Local LLMs
Practical considerations when running open source LLMs locally:
- Start Small: Begin with smaller models (3B-7B parameters) to test your hardware
- Monitor Resources: Use system monitoring tools to track GPU/CPU and memory usage
- Experiment with Quantization: Use quantized models (Q4, Q5, Q8) to reduce memory requirements
- Try Multiple Models: Different code LLMs and writing models perform differently on various tasks
- Use Appropriate Context Windows: Match model context length to your use case
- Keep Models Updated: Regularly pull updated versions for bug fixes and improvements
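The quantization tip above can be made concrete with a back-of-the-envelope memory estimate: weight memory is roughly parameters times bits per weight divided by 8. The ~20% overhead factor below is a rough assumption for KV cache and runtime buffers, not a measured figure.

```python
def estimate_memory_gb(params_billion: float, bits_per_weight: int,
                       overhead: float = 1.2) -> float:
    """Ballpark memory needed to load a model at a given quantization level.

    weight_gb = params (in billions) * bits / 8; the overhead multiplier
    loosely accounts for KV cache and runtime buffers. Treat the result
    as a rough guide, not a specification.
    """
    weight_gb = params_billion * bits_per_weight / 8
    return round(weight_gb * overhead, 1)

# A 7B model at common quantization levels:
for bits, name in [(4, "Q4"), (5, "Q5"), (8, "Q8"), (16, "FP16")]:
    print(f"{name}: ~{estimate_memory_gb(7, bits)} GB")
```

This is why a 7B model that needs ~16GB in FP16 can run comfortably in well under half that at Q4, at some cost in output quality.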
References and Resources
- Ollama Library - Available open source LLMs
- Ollama Library - Code Models - Code LLMs
- Ollama Documentation - Ollama documentation
- LM Studio - GUI for local LLM management
- Jan - Open-source ChatGPT alternative
- GPT4All - Local AI application
- llama.cpp - CPU-based LLM inference