Open-source language models have transformed coding and content creation, and running them locally offers privacy, control, and freedom from API fees. Whether you need a code model for programming or a writing model for creative work, this guide covers today's top open-source language models.

This list highlights popular open-source LLMs for coding and writing, organized by recency and popularity. Each model entry shows key attributes, use cases, tags, and pros and cons. These models work with tools like Ollama, LM Studio, Jan, and other local AI platforms.

Review the How to Choose the Right Model section for selection guidance.

⚠️ Hardware requirements vary by model size. Smaller models (1B-7B) can run on CPU-only systems with modest RAM, while larger models require significant GPU or CPU resources.
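
As a rough rule of thumb, a quantized model's memory footprint is about (parameters × bits per weight ÷ 8) bytes, plus overhead for the KV cache and runtime buffers. The sketch below is a hypothetical estimator under those assumptions, not a guarantee; actual usage varies with quantization format, context length, and runtime.

```python
def estimate_ram_gb(params_billion: float, bits_per_weight: float = 4.0,
                    overhead_factor: float = 1.2) -> float:
    """Rough memory estimate for running a quantized model locally.

    params_billion:   total parameter count in billions
    bits_per_weight:  4.0 for Q4 quantization, 8.0 for Q8, 16.0 for FP16
    overhead_factor:  illustrative fudge factor for KV cache and runtime
    """
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead_factor / 1e9  # decimal GB

# A 7B model at Q4 fits in a few GB; a 70B model needs tens of GB.
print(round(estimate_ram_gb(7), 1))    # ≈ 4.2
print(round(estimate_ram_gb(70), 1))   # ≈ 42.0
```

This lines up with the download sizes quoted throughout this list (for example, a 4-bit 20B model landing in the mid-teens of GB).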

gpt-oss

Tags: writing, coding

Owner/Author: OpenAI

Parameters: 20B (3.6B active), 120B (5.1B active)

Tool Support: Yes

Resource Demand: High (20B), Very-High (120B)

Primary Use Cases: Reasoning, agentic tasks, function calling, structured outputs, tool use, general-purpose tasks

Pros:

  • OpenAI's open-weight models with state-of-the-art reasoning
  • Mixture-of-Experts (MoE) architecture for efficiency
  • 128K context window for long-context tasks
  • Native support for function calling and structured outputs
  • Adjustable reasoning effort (low, medium, high)
  • Full chain-of-thought access for debugging
  • Apache 2.0 license
  • OpenAI-compatible API

Cons:

  • 20B model requires ~16GB VRAM (14GB download)
  • 120B model requires ~60GB VRAM (65GB download)
  • Very new models with less community testing
  • Large download sizes
  • May be overkill for simple tasks
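
Because gpt-oss exposes an OpenAI-compatible API with native function calling, requests follow the standard Chat Completions shape. The payload below is an illustrative sketch: the local endpoint (e.g. Ollama's `http://localhost:11434/v1`), the `gpt-oss:20b` tag, and the `get_weather` tool are assumptions, not prescribed by this guide.

```python
import json

# Hypothetical request body for an OpenAI-compatible chat endpoint.
# The model tag and the get_weather tool definition are illustrative.
payload = {
    "model": "gpt-oss:20b",
    "messages": [
        {"role": "user", "content": "What's the weather in Berlin?"}
    ],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
}

# POST this JSON to the server's /v1/chat/completions route.
print(json.dumps(payload, indent=2))
```

The same payload shape works for any other tool-supporting model in this list when served behind an OpenAI-compatible endpoint.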

qwen3-coder

Tags: coding

Owner/Author: Alibaba Cloud (Qwen)

Parameters: 30B, 480B

Tool Support: Yes

Resource Demand: High (30B), Very-High (480B)

Primary Use Cases: Agentic coding tasks, long-context code generation, complex coding scenarios

Pros:

  • Excellent performance on long-context coding tasks
  • Strong support for agentic workflows
  • Cloud deployment options available
  • Tools integration support

Cons:

  • Very large model sizes (especially 480B) require significant resources
  • Relatively new (3 months old) with less community testing
  • May be overkill for simple coding tasks

qwen2.5-coder

Tags: coding

Owner/Author: Alibaba Cloud (Qwen)

Parameters: 0.5B, 1.5B, 3B, 7B, 14B, 32B

Hugging Face: ♥ 664 · ↓ 1,680,689

Tool Support: Yes

Resource Demand: Low (0.5B-7B), Medium (14B), High (32B)

Primary Use Cases: Code generation, code reasoning, code fixing, general programming assistance

Pros:

  • Wide range of model sizes for different hardware constraints
  • Significant improvements in code generation and reasoning
  • Most popular code model (10.1M pulls)
  • Tools integration support
  • Excellent code fixing capabilities

Cons:

  • Larger models (32B) still require substantial resources
  • Smaller models (0.5B, 1.5B) may lack depth for complex tasks
  • Newer model with less long-term reliability data

deepseek-coder-v2

Tags: coding

Owner/Author: DeepSeek

Parameters: 16B, 236B

Hugging Face: ♥ 555 · ↓ 203,627

Tool Support: No

Resource Demand: Medium (16B), Very-High (236B)

Primary Use Cases: Code-specific tasks

Pros:

  • Mixture-of-Experts architecture for efficiency
  • Performance comparable to GPT-4 Turbo on code tasks
  • Strong code generation quality
  • Well-optimized for code-specific scenarios

Cons:

  • 236B model requires extremely high-end hardware
  • Smaller 16B model may not match larger variants
  • Less general-purpose than some alternatives

deepseek-coder

Tags: coding

Owner/Author: DeepSeek

Parameters: 1.3B, 6.7B, 33B

Hugging Face: ♥ 480 · ↓ 133,657

Tool Support: No

Resource Demand: Low (1.3B-6.7B), High (33B)

Primary Use Cases: General code generation

Pros:

  • Extensive training on 2 trillion code and natural language tokens
  • Multiple size options for different use cases
  • Very popular (2.7M pulls)
  • Strong general-purpose coding capabilities
  • Well-tested and reliable

Cons:

  • Older model (2 years) may lack latest improvements
  • 33B model requires significant resources
  • Smaller models may lack depth for complex reasoning

codellama

Tags: coding

Owner/Author: Meta

Parameters: 7B, 13B, 34B, 70B

Hugging Face: ♥ 255 · ↓ 93,876

Tool Support: No

Resource Demand: Low (7B), Medium (13B), High (34B-70B)

Primary Use Cases: Text-to-code generation, code discussion, general programming

Pros:

  • Extremely popular (4M pulls)
  • Multiple size options including very large 70B model
  • Strong general-purpose capabilities
  • Can discuss and explain code, not just generate
  • Well-established and reliable

Cons:

  • 70B model requires very high-end hardware
  • Older architecture compared to newer models
  • May not specialize as well as code-specific models

starcoder2

Tags: coding

Owner/Author: BigCode (Hugging Face)

Parameters: 3B, 7B, 15B

Tool Support: No

Resource Demand: Low (3B-7B), Medium (15B)

Primary Use Cases: General code generation with a transparently trained, open code LLM

Pros:

  • Transparent training process (open and reproducible)
  • Good range of sizes for different hardware
  • Strong code generation capabilities
  • Well-documented and community-supported

Cons:

  • May not match performance of newer specialized models
  • Limited to three size options
  • Less specialized than code-specific variants

codegemma

Tags: coding

Owner/Author: Google

Parameters: 2B, 7B

Hugging Face: ♥ 211 · ↓ 13,472

Tool Support: No

Resource Demand: Low (2B-7B)

Primary Use Cases: Fill-in-the-middle completion, code generation, natural language understanding, mathematical reasoning

Pros:

  • Lightweight models suitable for resource-constrained environments
  • Versatile capabilities beyond just code generation
  • Strong fill-in-the-middle completion
  • Good for mathematical reasoning tasks

Cons:

  • Smaller models may lack depth for complex tasks
  • Limited size options
  • May not match larger models on complex code generation

granite-code

Tags: coding

Owner/Author: IBM

Parameters: 3B, 8B, 20B, 34B

Tool Support: Yes

Resource Demand: Low (3B-8B), Medium (20B), High (34B)

Primary Use Cases: Code intelligence tasks

Pros:

  • Good range of sizes from small to large
  • IBM-backed with enterprise support
  • Focused on code intelligence tasks
  • Well-maintained foundation models

Cons:

  • Less popular than some alternatives
  • May have IBM-specific optimizations
  • Less community testing compared to more popular models

deepcoder

Tags: coding

Owner/Author: Agentica

Parameters: 1.5B, 14B

Tool Support: No

Resource Demand: Low (1.5B), Medium (14B)

Primary Use Cases: Code generation at o3-mini-level performance

Pros:

  • Fully open-source with transparent development
  • Performance comparable to o3-mini
  • Good balance of size and capability in the 14B model
  • Lightweight 1.5B option available

Cons:

  • Limited to two size options
  • May not match latest model capabilities
  • Less popular than mainstream alternatives

opencoder

Tags: coding

Owner/Author: OpenCoder Team

Parameters: 1.5B, 8B

Tool Support: No

Resource Demand: Low (1.5B-8B)

Primary Use Cases: Open and reproducible code LLM, English and Chinese support

Pros:

  • Open and reproducible training process
  • Bilingual support (English and Chinese)
  • Good for international development teams
  • Lightweight options available

Cons:

  • Limited size options
  • Less popular than alternatives
  • May not match performance of larger models

yi-coder

Tags: coding

Owner/Author: 01-ai (Yi)

Parameters: 1.5B, 9B

Tool Support: No

Resource Demand: Low (1.5B-9B)

Primary Use Cases: State-of-the-art coding performance with fewer parameters

Pros:

  • Efficient performance with fewer parameters
  • Good coding performance relative to size
  • Lightweight options for resource-constrained environments
  • Optimized for coding tasks

Cons:

  • Limited size options
  • May not match larger models on complex tasks
  • Less popular than mainstream alternatives

codegeex4

Tags: coding

Owner/Author: Zhipu AI (CodeGeeX)

Parameters: 9B

Tool Support: No

Resource Demand: Medium (9B)

Primary Use Cases: AI software development, code completion

Pros:

  • Versatile for various AI software development scenarios
  • Strong code completion capabilities
  • Single optimized size option
  • Good for IDE integration

Cons:

  • Only one size option available
  • May not match performance of larger models
  • Less popular than alternatives

codeqwen

Tags: coding

Owner/Author: Alibaba Cloud (Qwen)

Parameters: 7B

Tool Support: No

Resource Demand: Low (7B)

Primary Use Cases: Code generation, backed by pretraining on extensive code data

Pros:

  • Extensive pretraining on code data
  • Good balance of size and performance
  • Part of Qwen model family
  • Well-optimized for code tasks

Cons:

  • Only one size option
  • May not match latest Qwen2.5-coder improvements
  • Less flexible than multi-size alternatives

dolphincoder

Tags: coding

Owner/Author: Eric Hartford (community)

Parameters: 7B, 15B

Tool Support: No

Resource Demand: Low (7B), Medium (15B)

Primary Use Cases: Uncensored coding variant, based on StarCoder2

Pros:

  • Uncensored variant for unrestricted coding scenarios
  • Based on proven StarCoder2 architecture
  • Two size options available
  • Good for scenarios requiring fewer restrictions

Cons:

  • Uncensored nature may not be suitable for all use cases
  • Less popular than mainstream alternatives
  • May have ethical considerations for some teams

stable-code

Tags: coding

Owner/Author: Stability AI

Parameters: 3B

Hugging Face: ♥ 659 · ↓ 6,046

Tool Support: No

Resource Demand: Low (3B)

Primary Use Cases: Code completion, instruction following, coding tasks

Pros:

  • Very lightweight (3B) suitable for most hardware
  • Performance comparable to Code Llama 7B despite smaller size
  • Good for code completion tasks
  • Stable and reliable

Cons:

  • Only one size option
  • May lack depth for complex reasoning tasks
  • Smaller than many alternatives

magicoder

Tags: coding

Owner/Author: iSE-UIUC

Parameters: 7B

Hugging Face: ♥ 205 · ↓ 5,562

Tool Support: No

Resource Demand: Low (7B)

Primary Use Cases: Code generation; trained on 75K synthetic instruction samples

Pros:

  • Novel OSS-Instruct training approach
  • Trained on open-source code snippets
  • Good for general code generation
  • Innovative training methodology

Cons:

  • Only one size option
  • Less popular than alternatives
  • May not match performance of larger or newer models

codebooga

Tags: coding

Owner/Author: oobabooga

Parameters: 34B

Hugging Face: ♥ 147 · ↓ 9,001

Tool Support: No

Resource Demand: High (34B)

Primary Use Cases: High-performing code instruct model, merged architecture

Pros:

  • High performance from merged model architecture
  • Specialized for instruction following
  • Large model size for complex tasks
  • Good for detailed coding instructions

Cons:

  • Very large model requires high-end hardware
  • Only one size option
  • Less popular than alternatives
  • Merged architecture may have compatibility considerations

starcoder

Tags: coding

Owner/Author: BigCode (Hugging Face)

Parameters: 1B, 3B, 7B, 15B

Hugging Face: ♥ 2,924 · ↓ 6,369

Tool Support: No

Resource Demand: Low (1B-7B), Medium (15B)

Primary Use Cases: Code generation across 80+ programming languages

Pros:

  • Trained on 80+ programming languages
  • Excellent multi-language support
  • Multiple size options
  • Well-established and reliable

Cons:

  • Older model (2 years) may lack latest improvements
  • May not specialize as well as newer models
  • Less popular than StarCoder2 successor

sqlcoder

Tags: coding

Owner/Author: Defog.ai

Parameters: 7B, 15B

Hugging Face: ♥ 68 · ↓ 308

Tool Support: No

Resource Demand: Low (7B), Medium (15B)

Primary Use Cases: SQL generation tasks, database query generation

Pros:

  • Specialized for SQL generation
  • Fine-tuned on StarCoder for SQL tasks
  • Two size options available
  • Excellent for database-related coding

Cons:

  • Specialized only for SQL, less versatile
  • May not perform well on non-SQL tasks
  • Limited use case compared to general models

wizardcoder

Tags: coding

Owner/Author: WizardLM Team

Parameters: 33B

Hugging Face: ♥ 135 · ↓ 50

Tool Support: No

Resource Demand: High (33B)

Primary Use Cases: State-of-the-art code generation

Pros:

  • State-of-the-art code generation capabilities
  • Large model size for complex tasks
  • Strong performance on code generation benchmarks
  • Well-regarded in coding community

Cons:

  • Very large model requires high-end hardware
  • Only one size option
  • Older model (2 years) may lack latest improvements

codeup

Tags: coding

Owner/Author: juyongjiang

Parameters: 13B

Tool Support: No

Resource Demand: Medium (13B)

Primary Use Cases: Code generation based on Llama2

Pros:

  • Based on proven Llama2 architecture
  • Good balance of size and performance
  • Reliable code generation capabilities
  • Well-tested foundation

Cons:

  • Only one size option
  • Older architecture (Llama2-based)
  • Less popular than newer alternatives
  • May not match latest model improvements

llama3.1

Tags: writing, coding

Owner/Author: Meta

Parameters: 8B, 70B, 405B

Hugging Face: ♥ 5,536 · ↓ 7,354,915

Tool Support: Yes

Resource Demand: Low (8B), High (70B), Very-High (405B)

Primary Use Cases: General-purpose tasks, creative writing, code generation, instruction following

Pros:

  • Extremely popular and well-tested (8.5M pulls)
  • Versatile for both writing and coding tasks
  • Excellent instruction following capabilities
  • Strong creative writing performance
  • Multiple size options including massive 405B model
  • Latest Llama architecture improvements

Cons:

  • 405B model requires extremely high-end hardware
  • May not specialize as well as dedicated models
  • General-purpose nature means less optimization for specific tasks

llama3

Tags: writing, coding

Owner/Author: Meta

Parameters: 8B, 70B

Tool Support: No

Resource Demand: Low (8B), High (70B)

Primary Use Cases: Creative writing, general-purpose tasks, code generation, storytelling

Pros:

  • Very popular creative writing model (6.2M pulls)
  • Excellent for storytelling and narrative content
  • Good understanding of nuance and tone
  • Versatile for both writing and coding
  • Well-established and reliable
  • Strong dialogue generation

Cons:

  • Older than Llama 3.1
  • 70B model requires significant resources
  • May not match specialized models in specific domains

llama3.2

Tags: writing, coding

Owner/Author: Meta

Parameters: 1B, 3B

Hugging Face: ♥ 2,024 · ↓ 4,131,445

Tool Support: Yes

Resource Demand: Low (1B-3B)

Primary Use Cases: Lightweight general-purpose tasks, writing, coding on resource-constrained devices

Pros:

  • Very lightweight models suitable for edge devices
  • Good performance relative to size
  • Versatile for both writing and coding
  • Latest Llama architecture in compact form
  • Fast inference on limited hardware

Cons:

  • Smaller models may lack depth for complex tasks
  • Limited to two size options
  • May not match larger models on complex reasoning

qwen2.5

Tags: writing, coding

Owner/Author: Alibaba Cloud (Qwen)

Parameters: 0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B

Hugging Face: ♥ 1,117 · ↓ 21,497,102

Tool Support: Yes

Resource Demand: Low (0.5B-7B), Medium (14B), High (32B-72B)

Primary Use Cases: General-purpose tasks, creative writing, code generation, multilingual content

Pros:

  • Most popular general-purpose model (12.3M pulls)
  • Excellent for both writing and coding tasks
  • Wide range of sizes for different hardware
  • Strong multilingual capabilities
  • Versatile and well-tested
  • Good creative writing performance

Cons:

  • General-purpose nature means less specialization
  • Larger models (72B) require substantial resources
  • May not match dedicated models in specific domains

deepseek-r1

Tags: writing

Owner/Author: DeepSeek

Parameters: 1.5B, 7B, 14B, 32B, 70B

Hugging Face: ♥ 1,456 · ↓ 1,465,814

Tool Support: Yes

Resource Demand: Low (1.5B-7B), Medium (14B), High (32B-70B)

Primary Use Cases: Long-form writing, structured content generation, detailed reasoning

Pros:

  • Specialized for long-form content generation
  • Excellent structured writing capabilities
  • Strong reasoning and detailed outputs
  • Multiple size options
  • Well-optimized for writing tasks
  • Good for blog posts, articles, and essays

Cons:

  • Primarily focused on writing, less versatile for coding
  • Larger models require significant resources
  • Relatively new with less community testing

mistral

Tags: writing, coding

Owner/Author: Mistral AI

Parameters: 7B

Hugging Face: ♥ 2,452 · ↓ 1,598,460

Tool Support: Yes

Resource Demand: Low (7B)

Primary Use Cases: Creative writing, instruction following, general-purpose tasks, code generation

Pros:

  • Very popular and efficient (5.8M pulls)
  • Excellent instruction following
  • Good balance of writing and coding capabilities
  • Efficient 7B size suitable for most hardware
  • Well-tested and reliable
  • Strong creative writing performance

Cons:

  • Only one size option
  • May not match larger models on complex tasks
  • Older than newer Mistral variants

mixtral

Tags: writing, coding

Owner/Author: Mistral AI

Parameters: 8x7B (45B effective), 8x22B (141B effective)

Hugging Face: ♥ 4,643 · ↓ 779,925

Tool Support: Yes

Resource Demand: High (8x7B), Very-High (8x22B)

Primary Use Cases: Complex creative writing, advanced code generation, mixture-of-experts efficiency

Pros:

  • Mixture-of-Experts architecture for efficiency
  • Excellent for complex creative tasks
  • Good performance on both writing and coding
  • More efficient at inference than a dense 45B model
  • Strong for advanced use cases
  • Well-regarded in creative writing community
  • 8x22B variant available for higher capability needs

Cons:

  • Still requires substantial resources
  • May be overkill for simple tasks
  • 8x22B requires very high-end hardware (~131GB RAM)
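
Mixture-of-Experts efficiency comes down to simple arithmetic: per-token compute scales with the *active* parameters (the experts actually routed to), not the total. The sketch below uses roughly Mixtral-8x7B-shaped numbers (8 experts, 2 active per token); the per-expert and shared parameter counts are illustrative assumptions, not exact figures.

```python
def moe_active_fraction(total_experts: int, active_experts: int,
                        expert_params_b: float, shared_params_b: float) -> float:
    """Fraction of parameters touched per token in a simple MoE layout."""
    total = shared_params_b + total_experts * expert_params_b
    active = shared_params_b + active_experts * expert_params_b
    return active / total

# Illustrative numbers: ~5.5B per expert, ~2B shared (attention etc.).
frac = moe_active_fraction(8, 2, 5.5, 2.0)
print(f"{frac:.0%} of parameters active per token")  # well under half
```

Note that memory demand still scales with the *total* parameters, since every expert must be loaded; MoE mainly buys per-token speed, which is why Mixtral's RAM requirements above remain high.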

gemma2

Tags: writing, coding

Owner/Author: Google

Parameters: 2B, 9B, 27B

Hugging Face: ♥ 1,299 · ↓ 415,951

Tool Support: No

Resource Demand: Low (2B), Medium (9B-27B)

Primary Use Cases: Narrative-driven content, creative writing, general-purpose tasks, code generation

Pros:

  • Google-developed with strong narrative capabilities
  • Good for upbeat, conversational writing
  • Versatile for both writing and coding
  • Multiple size options
  • Well-optimized for narrative content
  • Strong for brainstorming and creative tasks

Cons:

  • May not match specialized models in specific domains
  • Smaller models may lack depth
  • Less popular than some alternatives

gemma3

Tags: writing, coding

Owner/Author: Google

Parameters: 270M, 1B, 4B, 12B, 27B

Hugging Face: ♥ 1,215 · ↓ 2,184,470

Tool Support: No

Resource Demand: Low (270M-12B), Medium (27B)

Primary Use Cases: Lightweight writing tasks, narrative content, general-purpose, code generation

Pros:

  • Latest Gemma architecture
  • Very lightweight options (270M, 1B) for edge devices
  • Good for narrative and creative writing
  • Versatile for both writing and coding
  • Wide range of sizes
  • Fast inference on smaller models

Cons:

  • Relatively new with less community testing
  • Smaller models may lack depth
  • Less popular than Gemma2

dolphin-llama3

Tags: writing

Owner/Author: Eric Hartford (community)

Parameters: 8B, 70B

Tool Support: Yes

Resource Demand: Low (8B), High (70B)

Primary Use Cases: Uncensored creative writing, fiction, roleplay, immersive storytelling

Pros:

  • Uncensored variant for unrestricted creative writing
  • Excellent for fiction and immersive storytelling
  • Based on proven Llama3 architecture
  • Popular for roleplay scenarios
  • Two size options available
  • Strong narrative capabilities

Cons:

  • Uncensored nature may not be suitable for all use cases
  • Primarily focused on writing, less versatile
  • May have ethical considerations for some teams

dolphin-mistral

Tags: writing

Owner/Author: Eric Hartford (community)

Parameters: 7B

Tool Support: Yes

Resource Demand: Low (7B)

Primary Use Cases: Uncensored creative writing, fiction, roleplay, unrestricted content

Pros:

  • Uncensored variant based on Mistral
  • Efficient 7B size
  • Excellent for fiction and creative writing
  • Popular for unrestricted scenarios
  • Well-tested uncensored model

Cons:

  • Uncensored nature may not be suitable for all use cases
  • Only one size option
  • Primarily focused on writing

dolphin3

Tags: writing

Owner/Author: Eric Hartford (community)

Parameters: 8B

Hugging Face: ♥ 479 · ↓ 15,718

Tool Support: Yes

Resource Demand: Low (8B)

Primary Use Cases: Uncensored creative writing, fiction, roleplay

Pros:

  • Based on latest Llama 3.1 architecture
  • Uncensored for unrestricted creative writing
  • Good balance of size and performance
  • Popular for fiction and roleplay
  • Latest Dolphin variant

Cons:

  • Uncensored nature may not be suitable for all use cases
  • Only one size option
  • Primarily focused on writing

phi3

Tags: writing, coding

Owner/Author: Microsoft

Parameters: 3.8B, 14B

Hugging Face: ♥ 1,395 · ↓ 828,343

Tool Support: No

Resource Demand: Low (3.8B), Medium (14B)

Primary Use Cases: Lightweight writing tasks, structured content, code generation, edge devices

Pros:

  • Very lightweight and efficient
  • Good for structured writing and rubrics
  • Versatile for both writing and coding
  • Excellent for resource-constrained environments
  • Fast inference
  • Microsoft-developed with good documentation

Cons:

  • Smaller models may lack depth for complex tasks
  • Limited size options
  • May not match larger models on complex reasoning

vicuna

Tags: writing

Owner/Author: LMSYS

Parameters: 7B, 13B, 33B

Hugging Face: ♥ 388 · ↓ 896,965

Tool Support: No

Resource Demand: Low (7B), Medium (13B), High (33B)

Primary Use Cases: Natural conversational writing, custom assistants, dialogue generation

Pros:

  • Natural, less robotic conversational style
  • Excellent for dialogue and conversational content
  • Good for custom assistant applications
  • Multiple size options
  • Well-regarded for natural language generation

Cons:

  • Primarily focused on writing, less versatile
  • Older model may lack latest improvements
  • May not match newer models on complex tasks

ministral-3

Tags: writing, coding

Owner/Author: Mistral AI

Parameters: 3B, 8B, 14B

Tool Support: Yes

Resource Demand: Low (3B-8B), Medium (14B)

Primary Use Cases: Edge deployment, efficient writing and coding, resource-constrained environments

Pros:

  • Designed for edge deployment
  • Efficient models for limited resources
  • Versatile for both writing and coding
  • Good performance relative to size
  • Fast inference
  • Latest Mistral architecture in compact form

Cons:

  • Relatively new with less community testing
  • Smaller models may lack depth
  • Less popular than full-size Mistral

qwen3

Tags: writing, coding

Owner/Author: Alibaba Cloud (Qwen)

Parameters: 0.6B, 1.7B, 4B, 8B, 14B, 30B-A3B, 32B, 235B

Hugging Face: ♥ 916 · ↓ 582,223

Tool Support: Yes

Resource Demand: Low (0.6B-8B), Medium (14B), High (30B-A3B, 32B), Very-High (235B)

Primary Use Cases: General-purpose tasks, reasoning, multilingual support, instruction following, code generation

Pros:

  • Latest generation Qwen architecture with major improvements
  • Hybrid thinking mode supports both fast and deep reasoning
  • MoE variant (30B-A3B) provides high capability with lower compute
  • Excellent multilingual support across 100+ languages
  • Wide range of sizes from edge to datacenter
  • Apache 2.0 license

Cons:

  • 235B model requires datacenter-scale hardware
  • Newer model with less community testing than Qwen2.5
  • Some sizes may not yet have full Ollama quantization support
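
Qwen3's hybrid thinking mode can reportedly be toggled per turn with soft switches appended to the prompt; the `/no_think` tag below follows the Qwen team's documented convention, though support may vary by runtime and quantization. The helper function here is a hypothetical sketch, not an official API.

```python
def qwen3_message(prompt: str, thinking: bool = True) -> dict:
    """Build a chat message, appending Qwen3's soft switch to skip
    step-by-step 'thinking' when a fast answer is preferred."""
    content = prompt if thinking else f"{prompt} /no_think"
    return {"role": "user", "content": content}

# Deep reasoning (default) vs. a quick low-latency reply:
print(qwen3_message("Prove this identity."))
print(qwen3_message("Summarize this diff.", thinking=False))
```

This mirrors gpt-oss's adjustable reasoning effort: both let you trade answer depth for latency without switching models.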

qwen3-coder-next

Tags: coding

Owner/Author: Alibaba Cloud (Qwen)

Parameters: 80B

Tool Support: Yes

Resource Demand: Very-High (80B)

Primary Use Cases: Advanced code generation, large-scale agentic coding, long-context code understanding

Pros:

  • Next-generation code model from Qwen team
  • 256K context window for very large codebases
  • Designed for agentic coding workflows
  • Strong performance on code benchmarks

Cons:

  • Very large model requiring significant GPU resources (~74GB RAM)
  • Brand new with limited community feedback
  • Only one size option available

llama2

Tags: writing, coding

Owner/Author: Meta

Parameters: 7B, 13B, 70B

Hugging Face: ♥ 2,275 · ↓ 670,303

Tool Support: No

Resource Demand: Low (7B), Medium (13B), High (70B)

Primary Use Cases: General-purpose tasks, creative writing, conversation, code generation

Pros:

  • Foundational open-source LLM that popularized open weights
  • Extremely well-tested and widely deployed
  • Large ecosystem of fine-tunes and community models
  • Multiple size options for different hardware
  • Well-documented training methodology

Cons:

  • Older architecture superseded by Llama 3.x series
  • 4K context window is limited by modern standards
  • Performance significantly behind newer models
  • Llama 2 Community License has some restrictions

llama3.3

Tags: writing, coding

Owner/Author: Meta

Parameters: 70B

Hugging Face: ♥ 2,674 · ↓ 608,312

Tool Support: Yes

Resource Demand: High (70B)

Primary Use Cases: High-quality instruction following, creative writing, code generation, reasoning

Pros:

  • Latest Meta model with improved instruction following
  • 128K context window for long-document tasks
  • Performance approaching Llama 3.1 405B on many benchmarks
  • Strong multilingual support
  • Llama 3.3 Community License

Cons:

  • Only available in 70B size, requires significant resources (~66GB RAM)
  • Single size option limits flexibility
  • No smaller variants for lighter hardware

phi-2

Tags: writing, coding

Owner/Author: Microsoft

Parameters: 2.8B

Hugging Face: ♥ 3,429 · ↓ 1,676,842

Tool Support: No

Resource Demand: Low (2.8B)

Primary Use Cases: Lightweight general-purpose tasks, reasoning, code generation on constrained hardware

Pros:

  • Remarkably capable for its small size
  • Runs efficiently on consumer hardware
  • Strong reasoning for a sub-3B model
  • Good for experimentation and prototyping
  • MIT license

Cons:

  • 2K context window is very limited
  • Significantly less capable than larger models
  • Superseded by Phi-3 and Phi-4
  • Limited depth for complex multi-step reasoning

phi-4

Tags: writing, coding

Owner/Author: Microsoft

Parameters: 14B

Hugging Face: ♥ 695 · ↓ 262,109

Tool Support: Yes

Resource Demand: Medium (14B)

Primary Use Cases: Reasoning, STEM tasks, code generation, mathematical problem-solving

Pros:

  • State-of-the-art reasoning for its size class
  • Excellent STEM and mathematical capabilities
  • Strong code generation performance
  • 16K context window
  • MIT license
  • Good balance of quality and resource requirements (~13GB RAM)

Cons:

  • Single size option
  • Smaller context than some competitors
  • May underperform on creative writing compared to larger models

falcon

Tags: writing, coding

Owner/Author: TII (Technology Innovation Institute)

Parameters: 7B, 40B, 180B

Hugging Face: ♥ 1,031 · ↓ 48,493

Tool Support: No

Resource Demand: Low (7B), High (40B), Very-High (180B)

Primary Use Cases: General-purpose text generation, instruction following, multilingual tasks

Pros:

  • One of the first high-quality open-source LLMs
  • 180B was the largest open model at time of release
  • Apache 2.0 license
  • Strong multilingual capabilities
  • Well-tested and widely used

Cons:

  • Older architecture, predates many modern improvements
  • Short context windows (2K-4K)
  • 180B requires datacenter hardware (~167GB RAM)
  • Performance behind newer models like Llama 3 and Qwen2.5

falcon3

Tags: writing, coding

Owner/Author: TII (Technology Innovation Institute)

Parameters: 7B

Tool Support: Yes

Resource Demand: Low (7B)

Primary Use Cases: Instruction following, general-purpose tasks

Pros:

  • Major architectural upgrade over earlier Falcon models
  • 32K context window
  • Competitive with other 7B-class models
  • Apache 2.0 license
  • Efficient for its capabilities

Cons:

  • Only 7B size available
  • Less community adoption than Llama or Qwen
  • Relatively new with limited ecosystem

mistral-nemo

Tags: writing, coding

Owner/Author: Mistral AI

Parameters: 12B

Tool Support: Yes

Resource Demand: Medium (12B)

Primary Use Cases: Instruction following, long-context tasks, coding, general-purpose generation

Pros:

  • 128K context window in a 12B model
  • Developed jointly by Mistral AI and NVIDIA
  • Drop-in replacement for Mistral 7B with better performance
  • Strong balance of capability and resource requirements (~11GB RAM)
  • Good multilingual support
  • Apache 2.0 license

Cons:

  • Single size option
  • Larger than Mistral 7B, may not fit on all devices
  • Superseded by newer Mistral models

mistral-small

Tags: writing, coding

Owner/Author: Mistral AI

Parameters: 24B

Tool Support: Yes

Resource Demand: High (24B)

Primary Use Cases: High-quality instruction following, coding, reasoning, general-purpose tasks

Pros:

  • Strong performance in the 24B parameter class
  • Good balance between Mistral 7B and Mixtral
  • 32K context window
  • Efficient for its capability level
  • Apache 2.0 license

Cons:

  • Requires more resources than Mistral 7B (~22GB RAM)
  • Single size option
  • May be outperformed by newer models in same size class

yi

Tags: writing, coding

Owner/Author: 01-ai (Yi)

Parameters: 6B, 34B

Hugging Face: ♥ 35 · ↓ 9,322

Tool Support: No

Resource Demand: Low (6B), High (34B)

Primary Use Cases: Multilingual chat, Chinese/English bilingual tasks, general-purpose generation

Pros:

  • Strong bilingual Chinese/English capabilities
  • 34B model competitive with much larger models
  • Good for multilingual applications
  • Well-optimized for instruction following

Cons:

  • 4K context window is limited
  • Only two size options
  • Less popular in English-only environments
  • Older model superseded by Yi 1.5

openchat

Tags: writing

Owner/Author: OpenChat

Parameters: 7B

Tool Support: No

Resource Demand: Low (7B)

Primary Use Cases: Instruction following, conversational AI, general-purpose chat

Pros:

  • Competitive with ChatGPT (March 2023) on MT-Bench
  • Based on Mistral 7B with improved fine-tuning
  • Lightweight and efficient
  • Good instruction following capabilities
  • Apache 2.0 license

Cons:

  • Only one size option
  • 8K context window is moderate
  • Older model, performance behind newer alternatives
  • Less versatile for coding tasks

zephyr

Tags: writing

Owner/Author: Hugging Face

Parameters: 7B

Hugging Face: ♥ 1,835 · ↓ 115,833

Tool Support: No

Resource Demand: Low (7B)

Primary Use Cases: Helpful assistant tasks, instruction following, conversational AI

Pros:

  • Aligned with DPO (Direct Preference Optimization)
  • Based on Mistral 7B with strong alignment
  • 32K context window
  • Lightweight and efficient
  • Well-documented training process

Cons:

  • Only one size option
  • Older model with less capability than newer alternatives
  • May be overly cautious due to alignment training

hermes

Tags: writing, coding

Owner/Author: NousResearch

Parameters: 8B

Hugging Face: ♥ 499 · ↓ 4,348

Tool Support: Yes

Resource Demand: Low (8B)

Primary Use Cases: General-purpose generation, function calling, structured outputs, agentic tasks

Pros:

  • Based on Llama 3.1 with enhanced capabilities
  • 128K context window
  • Strong function calling and tool use support
  • Good for agentic workflows
  • Popular community model
  • Apache 2.0 license

Cons:

  • Currently only 8B size available
  • Community-maintained, less corporate backing
  • May not match commercial-grade models on all tasks

olmo

Tags: writing, coding

Owner/Author: Allen AI (AI2)

Parameters: 1B, 7B, 13B, 32B

Tool Support: Yes

Resource Demand: Low (1B-7B), Medium (13B), High (32B)

Primary Use Cases: General-purpose generation, research, fully open and reproducible LLM development

Pros:

  • Fully open: weights, data, training code, and evaluation all public
  • Multiple size options including 32B
  • Apache 2.0 license
  • Backed by Allen AI research institute
  • Excellent for reproducible AI research
  • Strong instruction-following capabilities

Cons:

  • 4K context window is limited
  • Less popular than Llama or Qwen
  • May not match performance of models trained on larger data
  • 32B model requires ~30GB RAM

tinyllama

Tags: writing

Owner/Author: Zhang Peiyuan (community)

Parameters: 1.1B

Hugging Face: ♥ 1,541 · ↓ 1,946,573

Tool Support: No

Resource Demand: Low (1.1B)

Primary Use Cases: Edge deployment, lightweight chat, mobile and embedded applications

Pros:

  • Extremely lightweight and fast
  • Runs on very limited hardware including mobile devices
  • Based on Llama 2 architecture
  • Trained on 3 trillion tokens for strong performance at its size
  • Apache 2.0 license
  • Good for experimentation and prototyping

Cons:

  • Very limited capability compared to larger models
  • 2K context window is restrictive
  • Not suitable for complex reasoning or long-form generation
  • Older architecture

nemotron

Tags: writing, coding

Owner/Author: NVIDIA

Parameters: 9B, 30B-A3B

Tool Support: Yes

Resource Demand: Low (9B), High (30B-A3B)

Primary Use Cases: General-purpose generation, reasoning, tool use, agentic workflows

Pros:

  • NVIDIA-developed with hardware optimization expertise
  • MoE variant (30B-A3B) offers efficiency with only 3B active parameters
  • Very large context windows (128K-256K)
  • Strong reasoning capabilities
  • Good for NVIDIA GPU deployments

Cons:

  • Limited size options
  • 30B-A3B model still requires significant memory (~29GB RAM)
  • Less community adoption than Llama or Qwen
  • May be optimized primarily for NVIDIA hardware

glm-4

Tags: writing, coding

Owner/Author: Zhipu AI (THUDM)

Parameters: 9B

Tool Support: Yes

Resource Demand: Medium (9B)

Primary Use Cases: Bilingual Chinese/English generation, instruction following, long-context tasks

Pros:

  • Excellent Chinese language capabilities
  • 128K context window
  • Strong bilingual (Chinese/English) performance
  • Good instruction following
  • Competitive with Western models in bilingual tasks

Cons:

  • Single size option
  • Less popular outside Chinese-speaking communities
  • May underperform on English-only tasks vs peers
  • Requires ~9GB RAM

deepseek-v3

Tags: writing, coding

Owner/Author: DeepSeek

Parameters: 685B (MoE)

Tool Support: Yes

Resource Demand: Very-High (685B)

Primary Use Cases: State-of-the-art general-purpose generation, coding, reasoning, multilingual tasks

Pros:

  • Frontier-level performance competitive with GPT-4 and Claude
  • Mixture-of-Experts architecture for training efficiency
  • 128K+ context window
  • Strong coding and reasoning capabilities
  • Multiple versions (V3, V3-0324, V3.2) with continued improvements
  • Open weights

Cons:

  • Extremely large model requiring datacenter hardware (~638GB RAM)
  • Not practical for consumer-grade local deployment
  • Complex MoE architecture may require special infrastructure
  • High inference cost

orca

Tags: writing, coding

Owner/Author: Microsoft

Parameters: 7B, 13B

Hugging Face: ♥ 666 · ↓ 3,547

Tool Support: No

Resource Demand: Low (7B), Medium (13B)

Primary Use Cases: Reasoning, step-by-step problem solving, cautious and accurate responses

Pros:

  • Trained specifically for reasoning and step-by-step solutions
  • Microsoft-developed with research rigor
  • Good at providing cautious, well-reasoned answers
  • Two size options
  • Effective for educational and analytical tasks

Cons:

  • 4K context window is limited
  • Based on older Llama 2 architecture
  • May be overly cautious for creative tasks
  • Superseded by newer Microsoft models (Phi-4)

wizardlm

Tags: writing, coding

Owner/Author: WizardLM Team

Parameters: 13B

Tool Support: No

Resource Demand: Medium (13B)

Primary Use Cases: Complex instruction following, creative writing, code generation

Pros:

  • Trained with Evol-Instruct method for complex instructions
  • Strong instruction following capabilities
  • Good balance of writing and coding tasks
  • Based on proven Llama architecture

Cons:

  • Only 13B size available
  • 4K context window is limited
  • Older model with Llama 2 base
  • Project development paused

solar

Tags: writing, coding

Owner/Author: Upstage

Parameters: 10.7B

Tool Support: Yes

Resource Demand: Medium (10.7B)

Primary Use Cases: High-performance instruction following, general-purpose tasks

Pros:

  • Depth Up-Scaling (DUS) architecture for improved performance
  • Competitive with larger models at its size class
  • Good for instruction following
  • Well-optimized single-model architecture (no MoE)
  • Apache 2.0 license

Cons:

  • 4K context window is limited
  • Only one size option
  • Older model, predates newer alternatives
  • ~10GB RAM requirement

bloom

Tags: writing

Owner/Author: BigScience

Parameters: 560M, 1.1B, 1.7B, 3B, 7.1B, 176B

Tool Support: No

Resource Demand: Low (560M-7.1B), Very-High (176B)

Primary Use Cases: Multilingual text generation across 46 languages and 13 programming languages

Pros:

  • Trained collaboratively by 1000+ researchers
  • Supports 46 natural languages and 13 programming languages
  • Fully open model with transparent training
  • Multiple size options from 560M to 176B
  • RAIL license promotes responsible use

Cons:

  • Older model (2022) with dated performance
  • 176B model requires datacenter hardware (~164GB RAM)
  • Short context windows (2K-4K)
  • Performance well behind modern models
  • Large download sizes

exaone

Tags: writing, coding

Owner/Author: LG AI Research

Parameters: 32B

Tool Support: Yes

Resource Demand: High (32B)

Primary Use Cases: General-purpose generation, bilingual Korean/English tasks, long-context understanding

Pros:

  • Strong bilingual Korean/English capabilities
  • 128K context window
  • Competitive performance with larger models
  • Backed by LG AI Research
  • Good for Korean language applications

Cons:

  • Single size option
  • Less known outside Korean-speaking markets
  • Requires ~30GB RAM
  • Limited community ecosystem

ernie

Tags: writing, coding

Owner/Author: Baidu

Parameters: 300B (47B active, MoE)

Tool Support: Yes

Resource Demand: Very-High (300B)

Primary Use Cases: General-purpose generation, Chinese language tasks, multimodal understanding

Pros:

  • Baidu's flagship open-source model
  • MoE architecture with 47B active parameters for efficiency
  • 128K context window
  • Excellent Chinese language capabilities
  • Strong multimodal understanding

Cons:

  • Very large model requiring datacenter hardware (~280GB RAM)
  • Primarily optimized for Chinese language
  • PaddlePaddle framework dependency
  • Less community adoption outside China

stablelm

Tags: writing

Owner/Author: Stability AI

Parameters: 1.6B

Hugging Face: ♥ 208 · ↓ 1,376

Tool Support: Yes

Resource Demand: Low (1.6B)

Primary Use Cases: Lightweight chat, edge deployment, instruction following on constrained devices

Pros:

  • Very lightweight and fast
  • Good performance for its size
  • Suitable for edge and mobile deployment
  • Stability AI backing

Cons:

  • Very small model with limited capabilities
  • 4K context window
  • Stability AI has reduced its open-source focus
  • Not suitable for complex tasks

How to Choose the Right Open Source LLM

Selecting an open source LLM depends on hardware capabilities, use case, and performance requirements.

Hardware Constraints:

  • For limited resources: Consider smaller models like stable-code (3B), codegemma (2B/7B), qwen2.5-coder (0.5B-7B), phi3 (3.8B), or llama3.2 (1B/3B)
  • For high-end hardware: Consider larger models like qwen3-coder (480B), deepseek-coder-v2 (236B), llama3.1 (405B), or codellama (70B)
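The size tiers above can be turned into a quick sanity check before downloading anything. The sketch below maps available memory to a suggested model class; the thresholds and the `suggest_size_tier` helper are illustrative assumptions, not hard requirements, since actual needs depend on quantization and context length.

```python
def suggest_size_tier(available_gb: float) -> str:
    """Map available RAM/VRAM (in GB) to a rough model size class.

    Thresholds are rule-of-thumb assumptions for quantized models;
    actual requirements vary with quantization level and context length.
    """
    if available_gb < 8:
        return "1B-3B class (e.g. llama3.2, stable-code)"
    if available_gb < 16:
        return "7B class (e.g. qwen2.5-coder:7b, phi3)"
    if available_gb < 48:
        return "13B-34B class"
    return "70B+ class (or large MoE models)"

print(suggest_size_tier(12))  # a 12GB machine lands in the 7B class
```

A machine with 12GB of free memory, for example, is best matched to the 7B class when using quantized weights.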

Use Case - Coding:

  • General coding: qwen2.5-coder, codellama, or deepseek-coder
  • SQL-specific: sqlcoder
  • Long context/agentic: qwen3-coder
  • Code completion: stable-code, codegeex4
  • Multi-language: starcoder or starcoder2
  • Versatile (coding + writing): qwen2.5, llama3.1, mistral, mixtral

Use Case - Writing:

  • Creative writing: llama3, llama3.1, mistral, gemma2
  • Long-form content: deepseek-r1
  • Fiction/roleplay: dolphin-llama3, dolphin-mistral, dolphin3
  • Conversational: vicuna
  • Lightweight writing: phi3, gemma3, llama3.2
  • Versatile (writing + coding): qwen2.5, llama3.1, mistral, mixtral

Popularity and Reliability:

  • Most tested: qwen2.5 (12.3M pulls), qwen2.5-coder (10.1M pulls), llama3.1 (8.5M pulls), llama3 (6.2M pulls), mistral (5.8M pulls)
  • Newest features: qwen3-coder (released ~3 months ago), llama3.2 (recent), gemma3 (recent)

Benefits of Running Open Source LLMs Locally

Running open source language models locally offers several advantages over cloud-based APIs:

  • Privacy: Your code and conversations never leave your machine
  • Cost: No per-token API fees or subscription costs
  • Control: Full control over model versions, parameters, and data
  • Offline Access: Work without internet connectivity
  • Customization: Fine-tune models for your specific needs
  • No Rate Limits: Generate as much content as your hardware allows

Getting Started with Local LLMs

You can run open-source LLMs locally using several tools and platforms:

GUI-based tools:

  • LM Studio - Interface for downloading and chatting with models
  • Jan - Open-source ChatGPT alternative
  • GPT4All - General-purpose application with document chat capabilities

Command-line tools:

  • Ollama - Simple command-line tool for running models locally
  • llama.cpp - Lightweight C++ implementation that runs models efficiently on CPUs
  • Direct model loading via Python frameworks like Hugging Face Transformers, PyTorch, or TensorFlow

Web interfaces: If you want a ChatGPT-like experience, you can pair these backends with interfaces like LobeChat, Open WebUI, or LibreChat.

LM Studio and Jan provide model downloads and chat interfaces without command-line work. They support the same model formats (GGUF) that Ollama uses.

Code LLMs vs Writing Models: What’s the Difference?

Code LLMs and writing models differ mainly in their training data and target tasks:

Code LLMs (like qwen2.5-coder, codellama, deepseek-coder) are trained on code repositories and handle:

  • Code generation and completion
  • Debugging and error fixing
  • Code explanation and documentation
  • Multi-language programming support
  • Understanding code context and syntax

Writing Models (like llama3.1, mistral, gemma2) are designed for natural language tasks:

  • Creative writing and storytelling
  • Content generation and editing
  • Conversational AI and chat
  • Long-form content creation
  • General language understanding

Versatile Models (like qwen2.5, llama3.1, mistral) handle both coding and writing tasks.

Using Ollama for Local LLM Deployment

Ollama provides a command-line interface and API for running open source LLMs locally. Example usage:

Pull a model (coding example):

ollama pull qwen2.5-coder:7b

Pull a model (writing example):

ollama pull llama3.1:8b

Run a model:

ollama run qwen2.5-coder:7b
ollama run llama3.1:8b

Or use in your application (coding):

curl -s http://localhost:11434/api/generate -d '{
  "model": "qwen2.5-coder:7b",
  "prompt": "Write a Python function to calculate Fibonacci numbers.",
  "stream": false
}' | jq -r '.response'

Or use in your application (writing):

curl -s http://localhost:11434/api/generate -d '{
  "model": "llama3.1:8b",
  "prompt": "Write a short story about a robot learning to paint.",
  "stream": false
}' | jq -r '.response'
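The same endpoint can also be called from application code. Below is a minimal Python sketch using only the standard library; it assumes Ollama is running locally on its default port (11434), and the `build_payload` and `generate` helper names are ours, not part of Ollama:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> dict:
    # Mirrors the curl examples above: one non-streaming response.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """POST a prompt to a locally running Ollama server and return the text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

# Example (requires a running Ollama server with the model pulled):
# print(generate("qwen2.5-coder:7b", "Write a Python function to reverse a string."))
```

Setting `"stream": False` returns a single JSON object; with streaming enabled, the endpoint instead emits one JSON object per generated chunk.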

Common Use Cases

Open source language models are used across various domains:

  • Code Generation: Automate boilerplate code, generate functions, and complete code snippets
  • Code Review: Analyze code for bugs, security issues, and best practices
  • Documentation: Generate API docs, README files, and technical documentation
  • Creative Writing: Draft stories, articles, and creative content
  • Content Editing: Improve grammar, style, and clarity of written content
  • Conversational AI: Build chatbots and virtual assistants
  • Data Analysis: Generate SQL queries and analyze datasets
  • Learning: Understand programming concepts and get coding help

Running Local LLMs

Considerations for running open source LLMs:

  • Start Small: Begin with smaller models (3B-7B parameters) to test your hardware
  • Monitor Resources: Use system monitoring tools to track GPU/CPU and memory usage
  • Experiment with Quantization: Use quantized models (Q4, Q5, Q8) to reduce memory requirements
  • Try Multiple Models: Different code LLMs and writing models perform differently on various tasks
  • Use Appropriate Context Windows: Match model context length to your use case
  • Keep Models Updated: Regularly pull updated versions for bug fixes and improvements
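To see why quantization matters, note that the weights of a model occupy roughly parameters × bits-per-weight / 8 bytes. The back-of-the-envelope sketch below uses nominal bit widths (real GGUF formats such as Q4_K_M use slightly more per weight, and runtime overhead like the KV cache comes on top):

```python
def weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB.

    Nominal estimate: excludes KV cache and runtime overhead, and
    real quantization formats carry a small amount of extra metadata.
    """
    return params_billion * bits_per_weight / 8

# Compare precision levels for a 7B model
for name, bits in [("FP16", 16), ("Q8", 8), ("Q5", 5), ("Q4", 4)]:
    print(f"7B at {name}: ~{weight_size_gb(7, bits):.1f} GB")
```

For a 7B model this works out to roughly 14 GB at FP16 versus about 3.5 GB at Q4, which is why quantized builds fit comfortably on consumer hardware.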

References and Resources