Open-source language models have transformed coding and content creation, and running them locally offers privacy, control, and freedom from API fees. Whether you need a code model for programming or a writing model for creative work, this guide covers today's top open-source language models.

This list highlights popular open-source LLMs for coding and writing, organized by recency and popularity. Each model entry shows key attributes, use cases, tags, and pros and cons. These models work with tools like Ollama, LM Studio, Jan, and other local AI platforms.

Review the How to Choose the Right Model section for selection guidance.

⚠️ Hardware requirements vary by model size. Smaller models (1B-7B) can run on CPU-only systems with modest RAM, while larger models require significant GPU or CPU resources.
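
As a rough rule of thumb, a quantized model's memory footprint is about (parameters × bits per weight ÷ 8) bytes, plus overhead for the KV cache and runtime buffers. The sketch below is a hypothetical estimator under those assumptions, not a guarantee; actual usage varies with quantization format, context length, and runtime.

```python
def estimate_ram_gb(params_billion: float, bits_per_weight: float = 4.0,
                    overhead_factor: float = 1.2) -> float:
    """Rough memory estimate for running a quantized model locally.

    params_billion:   total parameter count in billions
    bits_per_weight:  4.0 for Q4 quantization, 8.0 for Q8, 16.0 for FP16
    overhead_factor:  illustrative fudge factor for KV cache and runtime
    """
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead_factor / 1e9  # decimal GB

# A 7B model at Q4 fits in a few GB; a 70B model needs tens of GB.
print(round(estimate_ram_gb(7), 1))    # ≈ 4.2
print(round(estimate_ram_gb(70), 1))   # ≈ 42.0
```

This lines up with the download sizes quoted throughout this list (for example, a 4-bit 20B model landing in the mid-teens of GB).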

gpt-oss

Tags: writing, coding

Owner/Author: OpenAI

Parameters: 20B (3.6B active), 120B (5.1B active)

Tool Support: Yes

Resource Demand: High (20B), Very-High (120B)

Primary Use Cases: Reasoning, agentic tasks, function calling, structured outputs, tool use, general-purpose tasks

Pros:

  • OpenAI's open-weight models with state-of-the-art reasoning
  • Mixture-of-Experts (MoE) architecture for efficiency
  • 128K context window for long-context tasks
  • Native support for function calling and structured outputs
  • Adjustable reasoning effort (low, medium, high)
  • Full chain-of-thought access for debugging
  • Apache 2.0 license
  • OpenAI-compatible API

Cons:

  • 20B model requires ~16GB VRAM (14GB download)
  • 120B model requires ~60GB VRAM (65GB download)
  • Very new models with less community testing
  • Large download sizes
  • May be overkill for simple tasks
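
Because gpt-oss exposes an OpenAI-compatible API with native function calling, requests follow the standard Chat Completions shape. The payload below is an illustrative sketch: the local endpoint (e.g. Ollama's `http://localhost:11434/v1`), the `gpt-oss:20b` tag, and the `get_weather` tool are assumptions, not prescribed by this guide.

```python
import json

# Hypothetical request body for an OpenAI-compatible chat endpoint.
# The model tag and the get_weather tool definition are illustrative.
payload = {
    "model": "gpt-oss:20b",
    "messages": [
        {"role": "user", "content": "What's the weather in Berlin?"}
    ],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
}

# POST this JSON to the server's /v1/chat/completions route.
print(json.dumps(payload, indent=2))
```

The same payload shape works for any other tool-supporting model in this list when served behind an OpenAI-compatible endpoint.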

qwen3-coder

Tags: coding

Owner/Author: Alibaba Cloud (Qwen)

Parameters: 30B, 480B

Tool Support: Yes

Resource Demand: High (30B), Very-High (480B)

Primary Use Cases: Agentic coding tasks, long-context code generation, complex coding scenarios

Pros:

  • Excellent performance on long-context coding tasks
  • Strong support for agentic workflows
  • Cloud deployment options available
  • Tools integration support

Cons:

  • Very large model sizes (especially 480B) require significant resources
  • Relatively new (3 months old) with less community testing
  • May be overkill for simple coding tasks

qwen2.5-coder

Tags: coding

Owner/Author: Alibaba Cloud (Qwen)

Parameters: 0.5B, 1.5B, 3B, 7B, 14B, 32B

Hugging Face: ♥ 664 · ↓ 1,680,689

Tool Support: Yes

Resource Demand: Low (0.5B-7B), Medium (14B), High (32B)

Primary Use Cases: Code generation, code reasoning, code fixing, general programming assistance

Pros:

  • Wide range of model sizes for different hardware constraints
  • Significant improvements in code generation and reasoning
  • Most popular code model (10.1M pulls)
  • Tools integration support
  • Excellent code fixing capabilities

Cons:

  • Larger models (32B) still require substantial resources
  • Smaller models (0.5B, 1.5B) may lack depth for complex tasks
  • Newer model with less long-term reliability data

deepseek-coder-v2

Tags: coding

Owner/Author: DeepSeek

Parameters: 16B, 236B

Hugging Face: ♥ 555 · ↓ 203,627

Tool Support: No

Resource Demand: Medium (16B), Very-High (236B)

Primary Use Cases: Code-specific tasks

Pros:

  • Mixture-of-Experts architecture for efficiency
  • Performance comparable to GPT-4 Turbo on code tasks
  • Strong code generation quality
  • Well-optimized for code-specific scenarios

Cons:

  • 236B model requires extremely high-end hardware
  • Smaller 16B model may not match larger variants
  • Less general-purpose than some alternatives

deepseek-coder

Tags: coding

Owner/Author: DeepSeek

Parameters: 1.3B, 6.7B, 33B

Hugging Face: ♥ 480 · ↓ 133,657

Tool Support: No

Resource Demand: Low (1.3B-6.7B), High (33B)

Primary Use Cases: General code generation

Pros:

  • Extensive training on 2 trillion code and natural language tokens
  • Multiple size options for different use cases
  • Very popular (2.7M pulls)
  • Strong general-purpose coding capabilities
  • Well-tested and reliable

Cons:

  • Older model (2 years) may lack latest improvements
  • 33B model requires significant resources
  • Smaller models may lack depth for complex reasoning

codellama

Tags: coding

Owner/Author: Meta

Parameters: 7B, 13B, 34B, 70B

Hugging Face: ♥ 255 · ↓ 93,876

Tool Support: No

Resource Demand: Low (7B), Medium (13B), High (34B-70B)

Primary Use Cases: Text-to-code generation, code discussion, general programming

Pros:

  • Extremely popular (4M pulls)
  • Multiple size options including very large 70B model
  • Strong general-purpose capabilities
  • Can discuss and explain code, not just generate
  • Well-established and reliable

Cons:

  • 70B model requires very high-end hardware
  • Older architecture compared to newer models
  • May not specialize as well as code-specific models

starcoder2

Tags: coding

Owner/Author: BigCode (Hugging Face)

Parameters: 3B, 7B, 15B

Tool Support: No

Resource Demand: Low (3B-7B), Medium (15B)

Primary Use Cases: General code generation with a transparently trained, open code LLM

Pros:

  • Transparent training process (open and reproducible)
  • Good range of sizes for different hardware
  • Strong code generation capabilities
  • Well-documented and community-supported

Cons:

  • May not match performance of newer specialized models
  • Limited to three size options
  • Less specialized than code-specific variants

codegemma

Tags: coding

Owner/Author: Google

Parameters: 2B, 7B

Hugging Face: ♥ 211 · ↓ 13,472

Tool Support: No

Resource Demand: Low (2B-7B)

Primary Use Cases: Fill-in-the-middle completion, code generation, natural language understanding, mathematical reasoning

Pros:

  • Lightweight models suitable for resource-constrained environments
  • Versatile capabilities beyond just code generation
  • Strong fill-in-the-middle completion
  • Good for mathematical reasoning tasks

Cons:

  • Smaller models may lack depth for complex tasks
  • Limited size options
  • May not match larger models on complex code generation

granite-code

Tags: coding

Owner/Author: IBM

Parameters: 3B, 8B, 20B, 34B

Tool Support: Yes

Resource Demand: Low (3B-8B), Medium (20B), High (34B)

Primary Use Cases: Code intelligence tasks

Pros:

  • Good range of sizes from small to large
  • IBM-backed with enterprise support
  • Focused on code intelligence tasks
  • Well-maintained foundation models

Cons:

  • Less popular than some alternatives
  • May have IBM-specific optimizations
  • Less community testing compared to more popular models

deepcoder

Tags: coding

Owner/Author: Agentica

Parameters: 1.5B, 14B

Tool Support: No

Resource Demand: Low (1.5B), Medium (14B)

Primary Use Cases: Code generation at o3-mini-level performance

Pros:

  • Fully open-source with transparent development
  • Performance comparable to o3-mini
  • Good balance of size and capability in the 14B model
  • Lightweight 1.5B option available

Cons:

  • Limited to two size options
  • May not match latest model capabilities
  • Less popular than mainstream alternatives

opencoder

Tags: coding

Owner/Author: OpenCoder Team

Parameters: 1.5B, 8B

Tool Support: No

Resource Demand: Low (1.5B-8B)

Primary Use Cases: Open and reproducible code LLM, English and Chinese support

Pros:

  • Open and reproducible training process
  • Bilingual support (English and Chinese)
  • Good for international development teams
  • Lightweight options available

Cons:

  • Limited size options
  • Less popular than alternatives
  • May not match performance of larger models

yi-coder

Tags: coding

Owner/Author: 01-ai (Yi)

Parameters: 1.5B, 9B

Tool Support: No

Resource Demand: Low (1.5B-9B)

Primary Use Cases: State-of-the-art coding performance with fewer parameters

Pros:

  • Efficient performance with fewer parameters
  • Good coding performance relative to size
  • Lightweight options for resource-constrained environments
  • Optimized for coding tasks

Cons:

  • Limited size options
  • May not match larger models on complex tasks
  • Less popular than mainstream alternatives

codegeex4

Tags: coding

Owner/Author: Zhipu AI (CodeGeeX)

Parameters: 9B

Tool Support: No

Resource Demand: Medium (9B)

Primary Use Cases: AI software development, code completion

Pros:

  • Versatile for various AI software development scenarios
  • Strong code completion capabilities
  • Single optimized size option
  • Good for IDE integration

Cons:

  • Only one size option available
  • May not match performance of larger models
  • Less popular than alternatives

codeqwen

Tags: coding

Owner/Author: Alibaba Cloud (Qwen)

Parameters: 7B

Tool Support: No

Resource Demand: Low (7B)

Primary Use Cases: Code generation, backed by pretraining on extensive code data

Pros:

  • Extensive pretraining on code data
  • Good balance of size and performance
  • Part of Qwen model family
  • Well-optimized for code tasks

Cons:

  • Only one size option
  • May not match latest Qwen2.5-coder improvements
  • Less flexible than multi-size alternatives

dolphincoder

Tags: coding

Owner/Author: Eric Hartford (community)

Parameters: 7B, 15B

Tool Support: No

Resource Demand: Low (7B), Medium (15B)

Primary Use Cases: Uncensored coding variant, based on StarCoder2

Pros:

  • Uncensored variant for unrestricted coding scenarios
  • Based on proven StarCoder2 architecture
  • Two size options available
  • Good for scenarios requiring fewer restrictions

Cons:

  • Uncensored nature may not be suitable for all use cases
  • Less popular than mainstream alternatives
  • May have ethical considerations for some teams

stable-code

Tags: coding

Owner/Author: Stability AI

Parameters: 3B

Hugging Face: ♥ 659 · ↓ 6,046

Tool Support: No

Resource Demand: Low (3B)

Primary Use Cases: Code completion, instruction following, coding tasks

Pros:

  • Very lightweight (3B) suitable for most hardware
  • Performance comparable to Code Llama 7B despite smaller size
  • Good for code completion tasks
  • Stable and reliable

Cons:

  • Only one size option
  • May lack depth for complex reasoning tasks
  • Smaller than many alternatives

magicoder

Tags: coding

Owner/Author: iSE-UIUC

Parameters: 7B

Hugging Face: ♥ 205 · ↓ 5,562

Tool Support: No

Resource Demand: Low (7B)

Primary Use Cases: Code generation; trained on 75K synthetic instruction samples

Pros:

  • Novel OSS-Instruct training approach
  • Trained on open-source code snippets
  • Good for general code generation
  • Innovative training methodology

Cons:

  • Only one size option
  • Less popular than alternatives
  • May not match performance of larger or newer models

codebooga

Tags: coding

Owner/Author: oobabooga

Parameters: 34B

Hugging Face: ♥ 147 · ↓ 9,001

Tool Support: No

Resource Demand: High (34B)

Primary Use Cases: High-performing code instruct model, merged architecture

Pros:

  • High performance from merged model architecture
  • Specialized for instruction following
  • Large model size for complex tasks
  • Good for detailed coding instructions

Cons:

  • Very large model requires high-end hardware
  • Only one size option
  • Less popular than alternatives
  • Merged architecture may have compatibility considerations

starcoder

Tags: coding

Owner/Author: BigCode (Hugging Face)

Parameters: 1B, 3B, 7B, 15B

Hugging Face: ♥ 2,924 · ↓ 6,369

Tool Support: No

Resource Demand: Low (1B-7B), Medium (15B)

Primary Use Cases: Code generation across 80+ programming languages

Pros:

  • Trained on 80+ programming languages
  • Excellent multi-language support
  • Multiple size options
  • Well-established and reliable

Cons:

  • Older model (2 years) may lack latest improvements
  • May not specialize as well as newer models
  • Less popular than StarCoder2 successor

sqlcoder

Tags: coding

Owner/Author: Defog.ai

Parameters: 7B, 15B

Hugging Face: ♥ 68 · ↓ 308

Tool Support: No

Resource Demand: Low (7B), Medium (15B)

Primary Use Cases: SQL generation tasks, database query generation

Pros:

  • Specialized for SQL generation
  • Fine-tuned on StarCoder for SQL tasks
  • Two size options available
  • Excellent for database-related coding

Cons:

  • Specialized only for SQL, less versatile
  • May not perform well on non-SQL tasks
  • Limited use case compared to general models

wizardcoder

Tags: coding

Owner/Author: WizardLM Team

Parameters: 33B

Hugging Face: ♥ 135 · ↓ 50

Tool Support: No

Resource Demand: High (33B)

Primary Use Cases: State-of-the-art code generation

Pros:

  • State-of-the-art code generation capabilities
  • Large model size for complex tasks
  • Strong performance on code generation benchmarks
  • Well-regarded in coding community

Cons:

  • Very large model requires high-end hardware
  • Only one size option
  • Older model (2 years) may lack latest improvements

codeup

Tags: coding

Owner/Author: juyongjiang

Parameters: 13B

Tool Support: No

Resource Demand: Medium (13B)

Primary Use Cases: Code generation based on Llama2

Pros:

  • Based on proven Llama2 architecture
  • Good balance of size and performance
  • Reliable code generation capabilities
  • Well-tested foundation

Cons:

  • Only one size option
  • Older architecture (Llama2-based)
  • Less popular than newer alternatives
  • May not match latest model improvements

llama3.1

Tags: writing, coding

Owner/Author: Meta

Parameters: 8B, 70B, 405B

Hugging Face: ♥ 5,536 · ↓ 7,354,915

Tool Support: Yes

Resource Demand: Low (8B), High (70B), Very-High (405B)

Primary Use Cases: General-purpose tasks, creative writing, code generation, instruction following

Pros:

  • Extremely popular and well-tested (8.5M pulls)
  • Versatile for both writing and coding tasks
  • Excellent instruction following capabilities
  • Strong creative writing performance
  • Multiple size options including massive 405B model
  • Latest Llama architecture improvements

Cons:

  • 405B model requires extremely high-end hardware
  • May not specialize as well as dedicated models
  • General-purpose nature means less optimization for specific tasks

llama3

Tags: writing, coding

Owner/Author: Meta

Parameters: 8B, 70B

Tool Support: No

Resource Demand: Low (8B), High (70B)

Primary Use Cases: Creative writing, general-purpose tasks, code generation, storytelling

Pros:

  • Very popular creative writing model (6.2M pulls)
  • Excellent for storytelling and narrative content
  • Good understanding of nuance and tone
  • Versatile for both writing and coding
  • Well-established and reliable
  • Strong dialogue generation

Cons:

  • Older than Llama 3.1
  • 70B model requires significant resources
  • May not match specialized models in specific domains

llama3.2

Tags: writing, coding

Owner/Author: Meta

Parameters: 1B, 3B

Hugging Face: ♥ 2,024 · ↓ 4,131,445

Tool Support: Yes

Resource Demand: Low (1B-3B)

Primary Use Cases: Lightweight general-purpose tasks, writing, coding on resource-constrained devices

Pros:

  • Very lightweight models suitable for edge devices
  • Good performance relative to size
  • Versatile for both writing and coding
  • Latest Llama architecture in compact form
  • Fast inference on limited hardware

Cons:

  • Smaller models may lack depth for complex tasks
  • Limited to two size options
  • May not match larger models on complex reasoning

qwen2.5

Tags: writing, coding

Owner/Author: Alibaba Cloud (Qwen)

Parameters: 0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B

Hugging Face: ♥ 1,117 · ↓ 21,497,102

Tool Support: Yes

Resource Demand: Low (0.5B-7B), Medium (14B), High (32B-72B)

Primary Use Cases: General-purpose tasks, creative writing, code generation, multilingual content

Pros:

  • Most popular general-purpose model (12.3M pulls)
  • Excellent for both writing and coding tasks
  • Wide range of sizes for different hardware
  • Strong multilingual capabilities
  • Versatile and well-tested
  • Good creative writing performance

Cons:

  • General-purpose nature means less specialization
  • Larger models (72B) require substantial resources
  • May not match dedicated models in specific domains

deepseek-r1

Tags: writing

Owner/Author: DeepSeek

Parameters: 1.5B, 7B, 14B, 32B, 70B

Hugging Face: ♥ 1,456 · ↓ 1,465,814

Tool Support: Yes

Resource Demand: Low (1.5B-7B), Medium (14B), High (32B-70B)

Primary Use Cases: Long-form writing, structured content generation, detailed reasoning

Pros:

  • Specialized for long-form content generation
  • Excellent structured writing capabilities
  • Strong reasoning and detailed outputs
  • Multiple size options
  • Well-optimized for writing tasks
  • Good for blog posts, articles, and essays

Cons:

  • Primarily focused on writing, less versatile for coding
  • Larger models require significant resources
  • Relatively new with less community testing

mistral

Tags: writing, coding

Owner/Author: Mistral AI

Parameters: 7B

Hugging Face: ♥ 2,452 · ↓ 1,598,460

Tool Support: Yes

Resource Demand: Low (7B)

Primary Use Cases: Creative writing, instruction following, general-purpose tasks, code generation

Pros:

  • Very popular and efficient (5.8M pulls)
  • Excellent instruction following
  • Good balance of writing and coding capabilities
  • Efficient 7B size suitable for most hardware
  • Well-tested and reliable
  • Strong creative writing performance

Cons:

  • Only one size option
  • May not match larger models on complex tasks
  • Older than newer Mistral variants

mixtral

Tags: writing, coding

Owner/Author: Mistral AI

Parameters: 8x7B (45B effective), 8x22B (141B effective)

Hugging Face: ♥ 4,643 · ↓ 779,925

Tool Support: Yes

Resource Demand: High (8x7B), Very-High (8x22B)

Primary Use Cases: Complex creative writing, advanced code generation, mixture-of-experts efficiency

Pros:

  • Mixture-of-Experts architecture for efficiency
  • Excellent for complex creative tasks
  • Good performance on both writing and coding
  • More efficient at inference than a dense 45B model
  • Strong for advanced use cases
  • Well-regarded in creative writing community
  • 8x22B variant available for higher capability needs

Cons:

  • Still requires substantial resources
  • May be overkill for simple tasks
  • 8x22B requires very high-end hardware (~131GB RAM)
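
Mixture-of-Experts efficiency comes down to simple arithmetic: per-token compute scales with the *active* parameters (the experts actually routed to), not the total. The sketch below uses roughly Mixtral-8x7B-shaped numbers (8 experts, 2 active per token); the per-expert and shared parameter counts are illustrative assumptions, not exact figures.

```python
def moe_active_fraction(total_experts: int, active_experts: int,
                        expert_params_b: float, shared_params_b: float) -> float:
    """Fraction of parameters touched per token in a simple MoE layout."""
    total = shared_params_b + total_experts * expert_params_b
    active = shared_params_b + active_experts * expert_params_b
    return active / total

# Illustrative numbers: ~5.5B per expert, ~2B shared (attention etc.).
frac = moe_active_fraction(8, 2, 5.5, 2.0)
print(f"{frac:.0%} of parameters active per token")  # well under half
```

Note that memory demand still scales with the *total* parameters, since every expert must be loaded; MoE mainly buys per-token speed, which is why Mixtral's RAM requirements above remain high.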

gemma2

Tags: writing, coding

Owner/Author: Google

Parameters: 2B, 9B, 27B

Hugging Face: ♥ 1,299 · ↓ 415,951

Tool Support: No

Resource Demand: Low (2B), Medium (9B-27B)

Primary Use Cases: Narrative-driven content, creative writing, general-purpose tasks, code generation

Pros:

  • Google-developed with strong narrative capabilities
  • Good for upbeat, conversational writing
  • Versatile for both writing and coding
  • Multiple size options
  • Well-optimized for narrative content
  • Strong for brainstorming and creative tasks

Cons:

  • May not match specialized models in specific domains
  • Smaller models may lack depth
  • Less popular than some alternatives

gemma3

Tags: writing, coding

Owner/Author: Google

Parameters: 270M, 1B, 4B, 12B, 27B

Hugging Face: ♥ 1,215 · ↓ 2,184,470

Tool Support: No

Resource Demand: Low (270M-12B), Medium (27B)

Primary Use Cases: Lightweight writing tasks, narrative content, general-purpose, code generation

Pros:

  • Latest Gemma architecture
  • Very lightweight options (270M, 1B) for edge devices
  • Good for narrative and creative writing
  • Versatile for both writing and coding
  • Wide range of sizes
  • Fast inference on smaller models

Cons:

  • Relatively new with less community testing
  • Smaller models may lack depth
  • Less popular than Gemma2

dolphin-llama3

Tags: writing

Owner/Author: Eric Hartford (community)

Parameters: 8B, 70B

Tool Support: Yes

Resource Demand: Low (8B), High (70B)

Primary Use Cases: Uncensored creative writing, fiction, roleplay, immersive storytelling

Pros:

  • Uncensored variant for unrestricted creative writing
  • Excellent for fiction and immersive storytelling
  • Based on proven Llama3 architecture
  • Popular for roleplay scenarios
  • Two size options available
  • Strong narrative capabilities

Cons:

  • Uncensored nature may not be suitable for all use cases
  • Primarily focused on writing, less versatile
  • May have ethical considerations for some teams

dolphin-mistral

Tags: writing

Owner/Author: Eric Hartford (community)

Parameters: 7B

Tool Support: Yes

Resource Demand: Low (7B)

Primary Use Cases: Uncensored creative writing, fiction, roleplay, unrestricted content

Pros:

  • Uncensored variant based on Mistral
  • Efficient 7B size
  • Excellent for fiction and creative writing
  • Popular for unrestricted scenarios
  • Well-tested uncensored model

Cons:

  • Uncensored nature may not be suitable for all use cases
  • Only one size option
  • Primarily focused on writing

dolphin3

Tags: writing

Owner/Author: Eric Hartford (community)

Parameters: 8B

Hugging Face: ♥ 479 · ↓ 15,718

Tool Support: Yes

Resource Demand: Low (8B)

Primary Use Cases: Uncensored creative writing, fiction, roleplay

Pros:

  • Based on latest Llama 3.1 architecture
  • Uncensored for unrestricted creative writing
  • Good balance of size and performance
  • Popular for fiction and roleplay
  • Latest Dolphin variant

Cons:

  • Uncensored nature may not be suitable for all use cases
  • Only one size option
  • Primarily focused on writing

phi3

Tags: writing, coding

Owner/Author: Microsoft

Parameters: 3.8B, 14B

Hugging Face: ♥ 1,395 · ↓ 828,343

Tool Support: No

Resource Demand: Low (3.8B), Medium (14B)

Primary Use Cases: Lightweight writing tasks, structured content, code generation, edge devices

Pros:

  • Very lightweight and efficient
  • Good for structured writing and rubrics
  • Versatile for both writing and coding
  • Excellent for resource-constrained environments
  • Fast inference
  • Microsoft-developed with good documentation

Cons:

  • Smaller models may lack depth for complex tasks
  • Limited size options
  • May not match larger models on complex reasoning

vicuna

Tags: writing

Owner/Author: LMSYS

Parameters: 7B, 13B, 33B

Hugging Face: ♥ 388 · ↓ 896,965

Tool Support: No

Resource Demand: Low (7B), Medium (13B), High (33B)

Primary Use Cases: Natural conversational writing, custom assistants, dialogue generation

Pros:

  • Natural, less robotic conversational style
  • Excellent for dialogue and conversational content
  • Good for custom assistant applications
  • Multiple size options
  • Well-regarded for natural language generation

Cons:

  • Primarily focused on writing, less versatile
  • Older model may lack latest improvements
  • May not match newer models on complex tasks

ministral-3

Tags: writing, coding

Owner/Author: Mistral AI

Parameters: 3B, 8B, 14B

Tool Support: Yes

Resource Demand: Low (3B-8B), Medium (14B)

Primary Use Cases: Edge deployment, efficient writing and coding, resource-constrained environments

Pros:

  • Designed for edge deployment
  • Efficient models for limited resources
  • Versatile for both writing and coding
  • Good performance relative to size
  • Fast inference
  • Latest Mistral architecture in compact form

Cons:

  • Relatively new with less community testing
  • Smaller models may lack depth
  • Less popular than full-size Mistral

qwen3

Tags: writing, coding

Owner/Author: Alibaba Cloud (Qwen)

Parameters: 0.6B, 1.7B, 4B, 8B, 14B, 30B-A3B, 32B, 235B

Hugging Face: ♥ 916 · ↓ 582,223

Tool Support: Yes

Resource Demand: Low (0.6B-8B), Medium (14B), High (30B-A3B, 32B), Very-High (235B)

Primary Use Cases: General-purpose tasks, reasoning, multilingual support, instruction following, code generation

Pros:

  • Latest generation Qwen architecture with major improvements
  • Hybrid thinking mode supports both fast and deep reasoning
  • MoE variant (30B-A3B) provides high capability with lower compute
  • Excellent multilingual support across 100+ languages
  • Wide range of sizes from edge to datacenter
  • Apache 2.0 license

Cons:

  • 235B model requires datacenter-scale hardware
  • Newer model with less community testing than Qwen2.5
  • Some sizes may not yet have full Ollama quantization support
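
Qwen3's hybrid thinking mode can reportedly be toggled per turn with soft switches appended to the prompt; the `/no_think` tag below follows the Qwen team's documented convention, though support may vary by runtime and quantization. The helper function here is a hypothetical sketch, not an official API.

```python
def qwen3_message(prompt: str, thinking: bool = True) -> dict:
    """Build a chat message, appending Qwen3's soft switch to skip
    step-by-step 'thinking' when a fast answer is preferred."""
    content = prompt if thinking else f"{prompt} /no_think"
    return {"role": "user", "content": content}

# Deep reasoning (default) vs. a quick low-latency reply:
print(qwen3_message("Prove this identity."))
print(qwen3_message("Summarize this diff.", thinking=False))
```

This mirrors gpt-oss's adjustable reasoning effort: both let you trade answer depth for latency without switching models.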

qwen3-coder-next

Tags: coding

Owner/Author: Alibaba Cloud (Qwen)

Parameters: 80B

Tool Support: Yes

Resource Demand: Very-High (80B)

Primary Use Cases: Advanced code generation, large-scale agentic coding, long-context code understanding

Pros:

  • Next-generation code model from Qwen team
  • 256K context window for very large codebases
  • Designed for agentic coding workflows
  • Strong performance on code benchmarks

Cons:

  • Very large model requiring significant GPU resources (~74GB RAM)
  • Brand new with limited community feedback
  • Only one size option available

llama2

Tags: writing, coding

Owner/Author: Meta

Parameters: 7B, 13B, 70B

Hugging Face: ♥ 2,275 · ↓ 670,303

Tool Support: No

Resource Demand: Low (7B), Medium (13B), High (70B)

Primary Use Cases: General-purpose tasks, creative writing, conversation, code generation

Pros:

  • Foundational open-source LLM that popularized open weights
  • Extremely well-tested and widely deployed
  • Large ecosystem of fine-tunes and community models
  • Multiple size options for different hardware
  • Well-documented training methodology

Cons:

  • Older architecture superseded by Llama 3.x series
  • 4K context window is limited by modern standards
  • Performance significantly behind newer models
  • Llama 2 Community License has some restrictions

llama3.3

Tags: writing, coding

Owner/Author: Meta

Parameters: 70B

Hugging Face: ♥ 2,674 · ↓ 608,312

Tool Support: Yes

Resource Demand: High (70B)

Primary Use Cases: High-quality instruction following, creative writing, code generation, reasoning

Pros:

  • Latest Meta model with improved instruction following
  • 128K context window for long-document tasks
  • Performance approaching Llama 3.1 405B on many benchmarks
  • Strong multilingual support
  • Llama 3.3 Community License

Cons:

  • Only available in 70B size, requires significant resources (~66GB RAM)
  • Single size option limits flexibility
  • No smaller variants for lighter hardware

phi-2

Tags: writing, coding

Owner/Author: Microsoft

Parameters: 2.8B

Hugging Face: ♥ 3,429 · ↓ 1,676,842

Tool Support: No

Resource Demand: Low (2.8B)

Primary Use Cases: Lightweight general-purpose tasks, reasoning, code generation on constrained hardware

Pros:

  • Remarkably capable for its small size
  • Runs efficiently on consumer hardware
  • Strong reasoning for a sub-3B model
  • Good for experimentation and prototyping
  • MIT license

Cons:

  • 2K context window is very limited
  • Significantly less capable than larger models
  • Superseded by Phi-3 and Phi-4
  • Limited depth for complex multi-step reasoning

phi-4

Tags: writing, coding

Owner/Author: Microsoft

Parameters: 14B

Hugging Face: ♥ 695 · ↓ 262,109

Tool Support: Yes

Resource Demand: Medium (14B)

Primary Use Cases: Reasoning, STEM tasks, code generation, mathematical problem-solving

Pros:

  • State-of-the-art reasoning for its size class
  • Excellent STEM and mathematical capabilities
  • Strong code generation performance
  • 16K context window
  • MIT license
  • Good balance of quality and resource requirements (~13GB RAM)

Cons:

  • Single size option
  • Smaller context than some competitors
  • May underperform on creative writing compared to larger models

falcon

Tags: writing, coding

Owner/Author: TII (Technology Innovation Institute)

Parameters: 7B, 40B, 180B

Hugging Face: ♥ 1,031 · ↓ 48,493

Tool Support: No

Resource Demand: Low (7B), High (40B), Very-High (180B)

Primary Use Cases: General-purpose text generation, instruction following, multilingual tasks

Pros:

  • One of the first high-quality open-source LLMs
  • 180B was the largest open model at time of release
  • Apache 2.0 license
  • Strong multilingual capabilities
  • Well-tested and widely used

Cons:

  • Older architecture, predates many modern improvements
  • Short context windows (2K-4K)
  • 180B requires datacenter hardware (~167GB RAM)
  • Performance behind newer models like Llama 3 and Qwen2.5

falcon3

Tags: writing, coding

Owner/Author: TII (Technology Innovation Institute)

Parameters: 7B

Tool Support: Yes

Resource Demand: Low (7B)

Primary Use Cases: Instruction following, general-purpose tasks

Pros:

  • Major architectural upgrade over earlier Falcon models
  • 32K context window
  • Competitive with other 7B-class models
  • Apache 2.0 license
  • Efficient for its capabilities

Cons:

  • Only 7B size available
  • Less community adoption than Llama or Qwen
  • Relatively new with limited ecosystem

mistral-nemo

Tags: writing, coding

Owner/Author: Mistral AI

Parameters: 12B

Tool Support: Yes

Resource Demand: Medium (12B)

Primary Use Cases: Instruction following, long-context tasks, coding, general-purpose generation

Pros:

  • 128K context window in a 12B model
  • Developed jointly by Mistral AI and NVIDIA
  • Drop-in replacement for Mistral 7B with better performance
  • Strong balance of capability and resource requirements (~11GB RAM)
  • Good multilingual support
  • Apache 2.0 license

Cons:

  • Single size option
  • Larger than Mistral 7B, may not fit on all devices
  • Superseded by newer Mistral models

mistral-small

Tags: writing, coding

Owner/Author: Mistral AI

Parameters: 24B

Tool Support: Yes

Resource Demand: High (24B)

Primary Use Cases: High-quality instruction following, coding, reasoning, general-purpose tasks

Pros:

  • Strong performance in the 24B parameter class
  • Good balance between Mistral 7B and Mixtral
  • 32K context window
  • Efficient for its capability level
  • Apache 2.0 license

Cons:

  • Requires more resources than Mistral 7B (~22GB RAM)
  • Single size option
  • May be outperformed by newer models in same size class

yi

Tags: writing, coding

Owner/Author: 01-ai (Yi)

Parameters: 6B, 34B

Hugging Face: ♥ 35 · ↓ 9,322

Tool Support: No

Resource Demand: Low (6B), High (34B)

Primary Use Cases: Multilingual chat, Chinese/English bilingual tasks, general-purpose generation

Pros:

  • Strong bilingual Chinese/English capabilities
  • 34B model competitive with much larger models
  • Good for multilingual applications
  • Well-optimized for instruction following

Cons:

  • 4K context window is limited
  • Only two size options
  • Less popular in English-only environments
  • Older model superseded by Yi 1.5

openchat

Tags: writing

Owner/Author: OpenChat

Parameters: 7B

Tool Support: No

Resource Demand: Low (7B)

Primary Use Cases: Instruction following, conversational AI, general-purpose chat

Pros:

  • Competitive with ChatGPT (March 2023) on MT-Bench
  • Based on Mistral 7B with improved fine-tuning
  • Lightweight and efficient
  • Good instruction following capabilities
  • Apache 2.0 license

Cons:

  • Only one size option
  • 8K context window is moderate
  • Older model, performance behind newer alternatives
  • Less versatile for coding tasks

zephyr

Tags: writing

Owner/Author: Hugging Face

Parameters: 7B

Hugging Face: ♥ 1,835 · ↓ 115,833

Tool Support: No

Resource Demand: Low (7B)

Primary Use Cases: Helpful assistant tasks, instruction following, conversational AI

Pros:

  • Aligned with DPO (Direct Preference Optimization)
  • Based on Mistral 7B with strong alignment
  • 32K context window
  • Lightweight and efficient
  • Well-documented training process

Cons:

  • Only one size option
  • Older model with less capability than newer alternatives
  • May be overly cautious due to alignment training

hermes

Tags: writing, coding

Owner/Author: NousResearch

Parameters: 8B

Hugging Face: ♥ 499 · ↓ 4,348

Tool Support: Yes

Resource Demand: Low (8B)

Primary Use Cases: General-purpose generation, function calling, structured outputs, agentic tasks

Pros:

  • Based on Llama 3.1 with enhanced capabilities
  • 128K context window
  • Strong function calling and tool use support
  • Good for agentic workflows
  • Popular community model
  • Apache 2.0 license

Cons:

  • Currently only 8B size available
  • Community-maintained, less corporate backing
  • May not match commercial-grade models on all tasks

olmo

Tags: writing, coding

Owner/Author: Allen AI (AI2)

Parameters: 1B, 7B, 13B, 32B

Tool Support: Yes

Resource Demand: Low (1B-7B), Medium (13B), High (32B)

Primary Use Cases: General-purpose generation, research, fully open and reproducible LLM development

Pros:

  • Fully open: weights, data, training code, and evaluation all public
  • Multiple size options including 32B
  • Apache 2.0 license
  • Backed by Allen AI research institute
  • Excellent for reproducible AI research
  • Strong instruction-following capabilities

Cons:

  • 4K context window is limited
  • Less popular than Llama or Qwen
  • May not match performance of models trained on larger data
  • 32B model requires ~30GB RAM

tinyllama

Tags: writing

Owner/Author: Zhang Peiyuan (community)

Parameters: 1.1B

Hugging Face: ♥ 1,541 · ↓ 1,946,573

Tool Support: No

Resource Demand: Low (1.1B)

Primary Use Cases: Edge deployment, lightweight chat, mobile and embedded applications

Pros:

  • Extremely lightweight and fast
  • Runs on very limited hardware including mobile devices
  • Based on Llama 2 architecture
  • Trained on 3 trillion tokens for strong performance at its size
  • Apache 2.0 license
  • Good for experimentation and prototyping

Cons:

  • Very limited capability compared to larger models
  • 2K context window is restrictive
  • Not suitable for complex reasoning or long-form generation
  • Older architecture

nemotron

Tags: writing, coding

Owner/Author: NVIDIA

Parameters: 9B, 30B-A3B

Tool Support: Yes

Resource Demand: Low (9B), High (30B-A3B)

Primary Use Cases: General-purpose generation, reasoning, tool use, agentic workflows

Pros:

  • NVIDIA-developed with hardware optimization expertise
  • MoE variant (30B-A3B) offers efficiency with only 3B active parameters
  • Very large context windows (128K-256K)
  • Strong reasoning capabilities
  • Good for NVIDIA GPU deployments

Cons:

  • Limited size options
  • 30B-A3B model still requires significant memory (~29GB RAM)
  • Less community adoption than Llama or Qwen
  • May be optimized primarily for NVIDIA hardware

glm-4

Tags: writing, coding

Owner/Author: Zhipu AI (THUDM)

Parameters: 9B

Tool Support: Yes

Resource Demand: Medium (9B)

Primary Use Cases: Bilingual Chinese/English generation, instruction following, long-context tasks

Pros:

  • Excellent Chinese language capabilities
  • 128K context window
  • Strong bilingual (Chinese/English) performance
  • Good instruction following
  • Competitive with Western models in bilingual tasks

Cons:

  • Single size option
  • Less popular outside Chinese-speaking communities
  • May underperform on English-only tasks vs peers
  • Requires ~9GB RAM

deepseek-v3

Tags: writing, coding

Owner/Author: DeepSeek

Parameters: 685B (MoE)

Tool Support: Yes

Resource Demand: Very-High (685B)

Primary Use Cases: State-of-the-art general-purpose generation, coding, reasoning, multilingual tasks

Pros:

  • Frontier-level performance competitive with GPT-4 and Claude
  • Mixture-of-Experts architecture for training efficiency
  • 128K+ context window
  • Strong coding and reasoning capabilities
  • Multiple versions (V3, V3-0324, V3.2) with continued improvements
  • Open weights

Cons:

  • Extremely large model requiring datacenter hardware (~638GB RAM)
  • Not practical for consumer-grade local deployment
  • Complex MoE architecture may require special infrastructure
  • High inference cost

orca

Tags: writing, coding

Owner/Author: Microsoft

Parameters: 7B, 13B

Hugging Face: ♥ 666 · ↓ 3,547

Tool Support: No

Resource Demand: Low (7B), Medium (13B)

Primary Use Cases: Reasoning, step-by-step problem solving, cautious and accurate responses

Pros:

  • Trained specifically for reasoning and step-by-step solutions
  • Microsoft-developed with research rigor
  • Good at providing cautious, well-reasoned answers
  • Two size options
  • Effective for educational and analytical tasks

Cons:

  • 4K context window is limited
  • Based on older Llama 2 architecture
  • May be overly cautious for creative tasks
  • Superseded by newer Microsoft models (Phi-4)

wizardlm

Tags: writing, coding

Owner/Author: WizardLM Team

Parameters: 13B

Tool Support: No

Resource Demand: Medium (13B)

Primary Use Cases: Complex instruction following, creative writing, code generation

Pros:

  • Trained with Evol-Instruct method for complex instructions
  • Strong instruction following capabilities
  • Good balance of writing and coding tasks
  • Based on proven Llama architecture

Cons:

  • Only 13B size available
  • 4K context window is limited
  • Older model with Llama 2 base
  • Project development paused

solar

Tags: writing, coding

Owner/Author: Upstage

Parameters: 10.7B

Tool Support: Yes

Resource Demand: Medium (10.7B)

Primary Use Cases: High-performance instruction following, general-purpose tasks

Pros:

  • Depth Up-Scaling (DUS) architecture for improved performance
  • Competitive with larger models at its size class
  • Good for instruction following
  • Well-optimized single-model architecture (no MoE)
  • Apache 2.0 license

Cons:

  • 4K context window is limited
  • Only one size option
  • Older model, predates newer alternatives
  • ~10GB RAM requirement

bloom

Tags: writing

Owner/Author: BigScience

Parameters: 560M, 1.1B, 1.7B, 3B, 7.1B, 176B

Tool Support: No

Resource Demand: Low (560M-7.1B), Very-High (176B)

Primary Use Cases: Multilingual text generation across 46 languages and 13 programming languages

Pros:

  • Trained collaboratively by 1000+ researchers
  • Supports 46 natural languages and 13 programming languages
  • Fully open model with transparent training
  • Multiple size options from 560M to 176B
  • RAIL license promotes responsible use

Cons:

  • Older model (2022) with dated performance
  • 176B model requires datacenter hardware (~164GB RAM)
  • Short context windows (2K-4K)
  • Performance well behind modern models
  • Large download sizes

exaone

Tags: writing, coding

Owner/Author: LG AI Research

Parameters: 32B

Tool Support: Yes

Resource Demand: High (32B)

Primary Use Cases: General-purpose generation, bilingual Korean/English tasks, long-context understanding

Pros:

  • Strong bilingual Korean/English capabilities
  • 128K context window
  • Competitive performance with larger models
  • Backed by LG AI Research
  • Good for Korean language applications

Cons:

  • Single size option
  • Less known outside Korean-speaking markets
  • Requires ~30GB RAM
  • Limited community ecosystem

ernie

Tags: writing, coding

Owner/Author: Baidu

Parameters: 300B (47B active, MoE)

Tool Support: Yes

Resource Demand: Very-High (300B)

Primary Use Cases: General-purpose generation, Chinese language tasks, multimodal understanding

Pros:

  • Baidu's flagship open-source model
  • MoE architecture with 47B active parameters for efficiency
  • 128K context window
  • Excellent Chinese language capabilities
  • Strong multimodal understanding

Cons:

  • Very large model requiring datacenter hardware (~280GB RAM)
  • Primarily optimized for Chinese language
  • PaddlePaddle framework dependency
  • Less community adoption outside China

stablelm

Tags: writing

Owner/Author: Stability AI

Parameters: 1.6B

Hugging Face: ♥ 208 · ↓ 1,376

Tool Support: Yes

Resource Demand: Low (1.6B)

Primary Use Cases: Lightweight chat, edge deployment, instruction following on constrained devices

Pros:

  • Very lightweight and fast
  • Good performance for its size
  • Suitable for edge and mobile deployment
  • Stability AI backing

Cons:

  • Very small model with limited capabilities
  • 4K context window
  • Stability AI has reduced its open-source focus
  • Not suitable for complex tasks

How to Choose the Right Open Source LLM

Selecting an open source LLM depends on hardware capabilities, use case, and performance requirements.

Hardware Constraints:

  • For limited resources: Consider smaller models like stable-code (3B), codegemma (2B/7B), qwen2.5-coder (0.5B-7B), phi3 (3.8B), or llama3.2 (1B/3B)
  • For high-end hardware: Consider larger models like qwen3-coder (480B), deepseek-coder-v2 (236B), llama3.1 (405B), or codellama (70B)
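The size tiers above can be turned into a quick sanity check before downloading anything. The sketch below maps available memory to a suggested model class; the thresholds and the `suggest_size_tier` helper are illustrative assumptions, not hard requirements, since actual needs depend on quantization and context length.

```python
def suggest_size_tier(available_gb: float) -> str:
    """Map available RAM/VRAM (in GB) to a rough model size class.

    Thresholds are rule-of-thumb assumptions for quantized models;
    actual requirements vary with quantization level and context length.
    """
    if available_gb < 8:
        return "1B-3B class (e.g. llama3.2, stable-code)"
    if available_gb < 16:
        return "7B class (e.g. qwen2.5-coder:7b, phi3)"
    if available_gb < 48:
        return "13B-34B class"
    return "70B+ class (or large MoE models)"

print(suggest_size_tier(12))  # a 12GB machine lands in the 7B class
```

A machine with 12GB of free memory, for example, is best matched to the 7B class when using quantized weights.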

Use Case - Coding:

  • General coding: qwen2.5-coder, codellama, or deepseek-coder
  • SQL-specific: sqlcoder
  • Long context/agentic: qwen3-coder
  • Code completion: stable-code, codegeex4
  • Multi-language: starcoder or starcoder2
  • Versatile (coding + writing): qwen2.5, llama3.1, mistral, mixtral

Use Case - Writing:

  • Creative writing: llama3, llama3.1, mistral, gemma2
  • Long-form content: deepseek-r1
  • Fiction/roleplay: dolphin-llama3, dolphin-mistral, dolphin3
  • Conversational: vicuna
  • Lightweight writing: phi3, gemma3, llama3.2
  • Versatile (writing + coding): qwen2.5, llama3.1, mistral, mixtral

Popularity and Reliability:

  • Most tested: qwen2.5 (12.3M pulls), qwen2.5-coder (10.1M pulls), llama3.1 (8.5M pulls), llama3 (6.2M pulls), mistral (5.8M pulls)
  • Newest features: qwen3-coder (released ~3 months ago), llama3.2 (recent), gemma3 (recent)

Benefits of Running Open Source LLMs Locally

Running open source language models locally offers several advantages over cloud-based APIs:

  • Privacy: Your code and conversations never leave your machine
  • Cost: No per-token API fees or subscription costs
  • Control: Full control over model versions, parameters, and data
  • Offline Access: Work without internet connectivity
  • Customization: Fine-tune models for your specific needs
  • No Rate Limits: Generate as much content as your hardware allows

Getting Started with Local LLMs

You can run open-source LLMs locally using several tools and platforms:

GUI-based tools:

  • LM Studio - Interface for downloading and chatting with models
  • Jan - Open-source ChatGPT alternative
  • GPT4All - General-purpose application with document chat capabilities

Command-line tools:

  • Ollama - Simple command-line tool for running models locally
  • llama.cpp - Lightweight C++ implementation that runs models efficiently on CPUs
  • Direct model loading via Python frameworks like Hugging Face Transformers, PyTorch, or TensorFlow

Web interfaces: If you want a ChatGPT-like experience, you can pair these backends with interfaces like LobeChat, Open WebUI, or LibreChat.

LM Studio and Jan provide model downloads and chat interfaces without command-line work. They support the same model formats (GGUF) that Ollama uses.

Code LLMs vs Writing Models: What’s the Difference?

Code LLMs and writing models differ mainly in their training data and target tasks:

Code LLMs (like qwen2.5-coder, codellama, deepseek-coder) are trained on code repositories and handle:

  • Code generation and completion
  • Debugging and error fixing
  • Code explanation and documentation
  • Multi-language programming support
  • Understanding code context and syntax

Writing Models (like llama3.1, mistral, gemma2) are designed for natural language tasks:

  • Creative writing and storytelling
  • Content generation and editing
  • Conversational AI and chat
  • Long-form content creation
  • General language understanding

Versatile Models (like qwen2.5, llama3.1, mistral) handle both coding and writing tasks.

Using Ollama for Local LLM Deployment

Ollama provides a command-line interface and API for running open source LLMs locally. Example usage:

Pull a model (coding example):

ollama pull qwen2.5-coder:7b

Pull a model (writing example):

ollama pull llama3.1:8b

Run a model:

ollama run qwen2.5-coder:7b
ollama run llama3.1:8b

Or use in your application (coding):

curl -s http://localhost:11434/api/generate -d '{
  "model": "qwen2.5-coder:7b",
  "prompt": "Write a Python function to calculate Fibonacci numbers.",
  "stream": false
}' | jq -r '.response'

Or use in your application (writing):

curl -s http://localhost:11434/api/generate -d '{
  "model": "llama3.1:8b",
  "prompt": "Write a short story about a robot learning to paint.",
  "stream": false
}' | jq -r '.response'
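The same endpoint can also be called from application code. Below is a minimal Python sketch using only the standard library; it assumes Ollama is running locally on its default port (11434), and the `build_payload` and `generate` helper names are ours, not part of Ollama:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> dict:
    # Mirrors the curl examples above: one non-streaming response.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """POST a prompt to a locally running Ollama server and return the text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

# Example (requires a running Ollama server with the model pulled):
# print(generate("qwen2.5-coder:7b", "Write a Python function to reverse a string."))
```

Setting `"stream": False` returns a single JSON object; with streaming enabled, the endpoint instead emits one JSON object per generated chunk.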

Common Use Cases

Open source language models are used across various domains:

  • Code Generation: Automate boilerplate code, generate functions, and complete code snippets
  • Code Review: Analyze code for bugs, security issues, and best practices
  • Documentation: Generate API docs, README files, and technical documentation
  • Creative Writing: Draft stories, articles, and creative content
  • Content Editing: Improve grammar, style, and clarity of written content
  • Conversational AI: Build chatbots and virtual assistants
  • Data Analysis: Generate SQL queries and analyze datasets
  • Learning: Understand programming concepts and get coding help

Running Local LLMs

Considerations for running open source LLMs:

  • Start Small: Begin with smaller models (3B-7B parameters) to test your hardware
  • Monitor Resources: Use system monitoring tools to track GPU/CPU and memory usage
  • Experiment with Quantization: Use quantized models (Q4, Q5, Q8) to reduce memory requirements
  • Try Multiple Models: Different code LLMs and writing models perform differently on various tasks
  • Use Appropriate Context Windows: Match model context length to your use case
  • Keep Models Updated: Regularly pull updated versions for bug fixes and improvements
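To see why quantization matters, note that the weights of a model occupy roughly parameters × bits-per-weight / 8 bytes. The back-of-the-envelope sketch below uses nominal bit widths (real GGUF formats such as Q4_K_M use slightly more per weight, and runtime overhead like the KV cache comes on top):

```python
def weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB.

    Nominal estimate: excludes KV cache and runtime overhead, and
    real quantization formats carry a small amount of extra metadata.
    """
    return params_billion * bits_per_weight / 8

# Compare precision levels for a 7B model
for name, bits in [("FP16", 16), ("Q8", 8), ("Q5", 5), ("Q4", 4)]:
    print(f"7B at {name}: ~{weight_size_gb(7, bits):.1f} GB")
```

For a 7B model this works out to roughly 14 GB at FP16 versus about 3.5 GB at Q4, which is why quantized builds fit comfortably on consumer hardware.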

References and Resources