Qwen-3: Explore the Next-Generation Open Source Large Model
Experience the flagship Qwen-3 model series developed by Alibaba Cloud, featuring hybrid thinking, multimodal processing, and powerful multilingual capabilities. Open sourced under Apache 2.0.
One-Click Website Integration
Own a website? Instantly add our chat interface with a simple iframe snippet; no registration required.
Free Online Chat - No Registration Required | Fast & Stable | Powered by Qwen-3
Download Tongyi Qianwen APP
Experience Qwen on your mobile device
Core Features
Explore the powerful functions and innovative features of Qwen-3
Hybrid Thinking Mode
Automatically switches between deep thinking and quick response modes based on task complexity, balancing intelligence and efficiency, with flexible control.
- Thinking Mode (Step-by-step reasoning)
- Non-Thinking Mode (Quick response)
- API/Prompt tag control
- Optimized thinking budget
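As a sketch of how the mode switch can be driven programmatically: the helper below toggles thinking either via an `enable_thinking` flag (passed through `chat_template_kwargs` on OpenAI-compatible servers such as vLLM) or via a per-turn `/think` / `/no_think` soft switch appended to the prompt. The model name and exact parameter plumbing vary by serving framework, so treat the payload shape as an assumption to verify against your deployment's docs.

```python
# Sketch: building a chat request that toggles Qwen-3's thinking mode.
# The "enable_thinking" flag and "/think" / "/no_think" soft switches
# follow Qwen's documented usage; parameter names may differ by framework.

def build_chat_request(prompt: str, thinking: bool = True,
                       soft_switch: bool = False) -> dict:
    """Return an OpenAI-style payload for a Qwen-3 chat completion."""
    content = prompt
    if soft_switch:
        # Per-turn override: append /think or /no_think to the user message.
        content += " /think" if thinking else " /no_think"
    return {
        "model": "Qwen3-8B",  # hypothetical deployment name
        "messages": [{"role": "user", "content": content}],
        # Request-level switch, forwarded as chat_template_kwargs by vLLM/SGLang.
        "chat_template_kwargs": {"enable_thinking": thinking},
    }

quick = build_chat_request("What is 2+2?", thinking=False)
deep = build_chat_request("Prove there are infinitely many primes.",
                          thinking=True, soft_switch=True)
```

The request-level flag sets the default for the whole conversation, while the soft switch lets a single turn override it.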
Flagship & Efficient Performance
The flagship MoE model rivals top closed-source models, while the small models punch well above their size, surpassing much larger previous-generation models.
- Leading in coding/math/general ability
- Excellent performance of Qwen3-235B-A22B
- Qwen3-4B matches Qwen2.5-72B
- MoE models activate fewer parameters for high efficiency
Unified Multimodal Processing
Utilizes unified multimodal encoding technology, deeply integrating the processing of text, images, audio, video, and other inputs within a single architecture.
- Text understanding and generation
- Image recognition and analysis
- Audio processing and interaction
- Video content understanding
Broad Multilingual Support
Supports up to 119 languages and dialects, significantly improving cross-lingual task performance and reducing language-switching errors.
- Coverage of 119 languages & dialects
- Pre-trained on 36T tokens
- Reduced language switching errors
- Strong cross-lingual capabilities
MCP Protocol & Agent Capabilities
Natively supports the MCP protocol, standardizing external tool calls for AI Agents. Recommended to build agents using the Qwen-Agent framework.
- Standardized external tool calls
- Improved Agent development compatibility
- Easy to build browser assistants and similar agents
- Qwen-Agent framework recommended
Efficient MoE & Diverse Dense Models
Offers flagship MoE models and a variety of Dense models from 0.6B to 32B, meeting diverse scenario requirements.
- Qwen3-235B-A22B (MoE, 22B activated)
- Qwen3-30B-A3B (MoE, 3B activated)
- Dense models from 0.6B to 32B
- Open sourced under Apache 2.0
Ultra-Long Context Processing
Dense models support up to 128K token context, and MoE models also support long context, efficiently handling long documents and complex dialogues.
- Up to 128K context (8B-32B)
- 32K context (0.6B-4B)
- Optimized attention mechanisms
- Reduced memory usage for long sequences
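The published context tiers can be encoded in a small lookup for quick capacity checks (taking 32K as 32,768 tokens and 128K as 131,072). The listed sizes are the Qwen-3 dense variants; MoE limits should be checked on each model card.

```python
# Quick reference: max context window per Qwen-3 dense model size,
# per the published tiers (32K for 0.6B-4B, 128K for 8B-32B).

DENSE_CONTEXT = {  # model size in billions of params -> max context tokens
    0.6: 32_768, 1.7: 32_768, 4: 32_768,
    8: 131_072, 14: 131_072, 32: 131_072,
}

def max_context(size_b: float) -> int:
    """Max context window (tokens) for a given dense model size."""
    return DENSE_CONTEXT[size_b]

def fits(size_b: float, prompt_tokens: int) -> bool:
    """Does a prompt of this length fit the model's context window?"""
    return prompt_tokens <= max_context(size_b)
```

For example, a 100K-token document fits the 8B model's window but not the 4B model's.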
Advanced Training Techniques
Pre-trained in three stages on nearly 36 trillion tokens of data, then refined with a four-stage post-training pipeline that builds hybrid thinking and general capabilities.
- 36T tokens pre-training data
- Three-stage pre-training process
- Four-stage post-training flow
- Application of high-quality synthetic data
Open Ecosystem & Compatibility
Open sourced under the Apache 2.0 license, seamlessly integrated with mainstream tools like HuggingFace, vLLM, Ollama, SGLang, etc.
- Fully open source (Apache 2.0)
- Supports frameworks like vLLM, SGLang
- Supports local tools like Ollama, LMStudio
- Available on HuggingFace/ModelScope/Kaggle
Qwen-3 in Action
See how Qwen-3 elevates open-source AI capabilities
Qwen-3: Leading the Way in Open Source AI
A deep dive into the capabilities of Qwen-3 and its performance against other leading AI models.
Qwen-3 Performance on Authoritative Benchmarks
General Ability & Language Understanding
Coding Ability
Mathematical Ability
Technical Specifications
Explore the advanced technology, architecture, and capabilities driving Qwen-3
Qwen-3 Architecture Details
Advanced architecture integrating Mixture-of-Experts, diverse dense models, and innovative mechanisms
Qwen-3 Research
Pushing the boundaries of language model capabilities
Innovative Architecture
Integrating hybrid thinking mode, unified multimodal encoding, and efficient MoE architecture.
Training Methodology
Multi-stage pre-training and post-training based on nearly 36 trillion tokens, covering 119 languages.
Technical Blog & Report
Read our blog post to understand the design philosophy and performance details of Qwen-3. A detailed technical report will be released soon.
Read Blog Post
About the Qwen Team
The team behind the Qwen-3 models
Development Background
The Qwen-3 model series is developed by the Alibaba Cloud Tongyi Qianwen team. This team is dedicated to the open-source research and application of large language models, continuously releasing the leading Qwen model series.
Technical Strength
Leveraging Alibaba Cloud's powerful cloud computing infrastructure and extensive experience in large-scale AI model training, the Qwen team can efficiently develop and iterate advanced language models.
Qwen-3 Deployment Options
Efficient Inference Frameworks (vLLM & SGLang)
vLLM (>=0.8.4) or SGLang (>=0.4.6.post1) are recommended for high-performance deployment; both support long context and the Hybrid Thinking Mode.
- High throughput
- Low latency
- Supports Hybrid Thinking Mode
- Compatible with OpenAI API
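A minimal sketch of this deployment path: serve the model with vLLM's OpenAI-compatible server, then query it with the standard `openai` client. The port, model name, and dummy API key below are assumptions for a default local setup; adjust to yours.

```python
# Sketch: querying a locally served Qwen-3 model through vLLM's
# OpenAI-compatible endpoint. Start the server first (in a shell):
#   vllm serve Qwen/Qwen3-8B --port 8000

def client_config(host: str = "localhost", port: int = 8000) -> dict:
    """Connection settings for a local vLLM OpenAI-compatible server."""
    return {
        "base_url": f"http://{host}:{port}/v1",
        "api_key": "EMPTY",  # vLLM accepts any key unless one is configured
    }

def ask(prompt: str, model: str = "Qwen/Qwen3-8B") -> str:
    """Send one chat turn to the running server (requires `pip install openai`)."""
    from openai import OpenAI  # imported lazily: only needed for live calls
    client = OpenAI(**client_config())
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return reply.choices[0].message.content

# Example (with the server running):
#   print(ask("Hello, Qwen!"))
```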
Convenient Local Deployment
Easily run Qwen-3 models locally using tools like Ollama, LMStudio, MLX, llama.cpp, KTransformers, etc.
- Quick start
- Cross-platform support (CPU/GPU)
- Active community
- Support for various quantization formats
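For the Ollama route, the sketch below talks to a local Ollama server (default port 11434) through its `/api/chat` REST endpoint using only the standard library. The model tag `qwen3` is an assumption; run `ollama list` to see which tags you have pulled.

```python
# Sketch: chatting with a Qwen-3 model served by a local Ollama instance.
import json
import urllib.request

def ollama_payload(prompt: str, model: str = "qwen3") -> bytes:
    """Build the JSON body for Ollama's /api/chat endpoint."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for a single JSON response, not a stream
    }).encode("utf-8")

def ask_ollama(prompt: str,
               url: str = "http://localhost:11434/api/chat") -> str:
    """Send one chat turn to a running Ollama server; return the reply text."""
    req = urllib.request.Request(
        url, data=ollama_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

# Example (after `ollama run qwen3` has pulled the model):
#   print(ask_ollama("Summarize the Qwen-3 family in one sentence."))
```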
Cloud API Services
Directly call the Qwen-3 API via Alibaba Cloud Bailian, DashScope, or together.ai without self-deployment.
- Out-of-the-box
- Pay-as-you-go
- Global access
- Enterprise-level support
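The cloud route can look like the sketch below, which targets DashScope's OpenAI-compatible mode. The base URL follows DashScope's published compatible-mode endpoint and the model name is an assumption; verify both against the current API documentation before relying on them.

```python
# Sketch: calling Qwen-3 through Alibaba Cloud's OpenAI-compatible
# DashScope endpoint, with the API key read from the environment.
import os

DASHSCOPE_BASE = "https://dashscope.aliyuncs.com/compatible-mode/v1"

def cloud_settings(model: str = "qwen3-235b-a22b") -> dict:
    """Connection settings for the hosted API; key comes from the environment."""
    return {
        "base_url": DASHSCOPE_BASE,
        "api_key": os.environ.get("DASHSCOPE_API_KEY", ""),
        "model": model,
    }

def ask_cloud(prompt: str) -> str:
    """One chat turn against the hosted endpoint (requires `pip install openai`)."""
    from openai import OpenAI  # imported lazily: only needed for live calls
    cfg = cloud_settings()
    client = OpenAI(base_url=cfg["base_url"], api_key=cfg["api_key"])
    out = client.chat.completions.create(
        model=cfg["model"],
        messages=[{"role": "user", "content": prompt}],
    )
    return out.choices[0].message.content
```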
Model Platforms & Quantization Formats
Model weights are available on Hugging Face, ModelScope, and Kaggle, with quantization formats such as GGUF, AWQ, and GPTQ to reduce resource requirements.
- Multi-platform access
- Apache 2.0 License
- Supports Int4/Int8 quantization
- Suitable for consumer hardware
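A back-of-envelope way to see why quantization matters for consumer hardware: weight storage is roughly parameters times bits-per-weight divided by 8. Real usage adds KV cache and activations on top, so treat these figures as lower bounds.

```python
# Rough weight-memory estimate for a quantized checkpoint:
# params * bits-per-weight / 8 bytes, reported in GB (1 GB = 1e9 bytes).

def weight_memory_gb(params_billions: float, bits: int) -> float:
    """Approximate weight storage in GB; excludes KV cache and activations."""
    return params_billions * 1e9 * bits / 8 / 1e9

int4_8b = weight_memory_gb(8, 4)     # 4.0 GB: fits many consumer GPUs
fp16_32b = weight_memory_gb(32, 16)  # 64.0 GB: needs serious hardware
```

This is why an Int4 8B model runs comfortably on a consumer GPU while a full-precision 32B model does not.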
How to Use Qwen-3
Get started quickly with Qwen-3: Try online, call APIs, or deploy locally
Choose Your Method
Based on your needs, choose to try it online (Qwen Chat), call the API service, or download the model for local deployment.
Access Platform or Download Model
Visit the Qwen Chat website/app, consult API documentation and providers (like Alibaba Cloud Bailian), or go to Hugging Face/ModelScope/Kaggle to download the required model files.
Start Interacting or Integrating
Interact directly with Qwen Chat, integrate it into your application according to the API documentation, or use tools like Ollama, vLLM, SGLang to run and manage the model locally.
Frequently Asked Questions
Learn more about Qwen-3
What makes Qwen-3 unique?
Qwen-3 offers various model sizes from 0.6B to 235B (MoE), open-sourced under Apache 2.0. Key innovations include the Hybrid Thinking Mode (intelligently switching thought depth), unified multimodal processing capabilities, and broad support for 119 languages.
How can I access or use Qwen-3?
You can download model weights from Hugging Face, ModelScope, or Kaggle for local deployment (tools like vLLM, SGLang, Ollama are recommended). You can also call API services via Alibaba Cloud Bailian, DashScope, together.ai, etc., or experience it directly on the Qwen Chat website/app.
What tasks does Qwen-3 excel at?
Qwen-3 demonstrates leading performance in coding, mathematics, and general capability benchmarks, surpassing models like Llama3.1-405B. Its multilingual abilities, long context processing, and Agent functionalities (with MCP protocol) are also very strong.
What is Hybrid Thinking Mode?
This is an innovative feature of Qwen-3. The model can automatically or manually switch between a 'thinking mode' for deep reasoning and a 'non-thinking mode' for quick responses, based on task complexity, to balance effectiveness and efficiency.
How many languages does Qwen-3 support?
Qwen-3 supports up to 119 languages and dialects, significantly enhancing cross-lingual understanding and generation capabilities through large-scale multilingual pre-training data (nearly 36T tokens).
What are the hardware requirements for running Qwen-3?
Requirements depend on the model size. Smaller models (e.g., 0.6B, 1.7B) can run on consumer hardware, especially with Int4/Int8 quantization (like GGUF). Larger models (e.g., 32B, 235B) require more powerful GPU support. It's recommended to check the specific model's documentation and quantization options.
Is Qwen-3 available for commercial use?
Yes, all models in the Qwen-3 series are released under the Apache 2.0 license, allowing for both commercial and research use.
What is the context window size of Qwen-3?
Depending on the model size, Qwen-3 dense models support context lengths of 32K (0.6B-4B) or 128K (8B-32B) tokens. MoE models also support long context (check the model card for specific sizes).
Which deployment frameworks/tools does Qwen-3 support?
vLLM (>=0.8.4) and SGLang (>=0.4.6.post1) are recommended for efficient deployment. For local execution, you can use Ollama, LMStudio, llama.cpp, MLX-LM, KTransformers, etc. It is also compatible with the Hugging Face Transformers library.
Get Started with Qwen-3
Try the Qwen-3 API Service
Access Qwen-3 API functionalities through platforms like Alibaba Cloud Bailian, DashScope, together.ai, etc.
View API Docs
Visit the GitHub Repository
Find Qwen-3 source code, documentation, examples, and community support in the official GitHub repository.
Visit GitHub
Experience Qwen Chat
Directly experience the capabilities of the Qwen-3 model through the official Qwen Chat website or mobile app.
Visit Qwen Chat