Qwen-3

Qwen-3: Explore the Next-Generation Open Source Large Model

Experience the flagship Qwen-3 model series developed by Alibaba Cloud, featuring hybrid thinking, multimodal processing, and powerful multilingual capabilities. Open sourced under Apache 2.0.

235B MoE Parameters
119 Languages & Dialects
Hybrid Thinking Mode

One-Click Website Integration

Own a website? Instantly add our chat interface with a simple iframe snippet - no registration required.

<iframe src="https://qwen-3.com/embed" style="width:100%;height:600px;border:0" title="Qwen-3 Chat"></iframe>

Free Online Chat - No Registration Required | Fast & Stable | Powered by Qwen-3

Download Tongyi Qianwen APP

Experience Qwen on your mobile device

iOS App Store

For iPhone and iPad

Download

Google Play Store

For Android devices

(Play Store download link currently unavailable)

Download

Android Package (Official)

Direct APK download not officially provided yet

(Direct APK download link currently unavailable)

Download

Core Features

Explore the powerful functions and innovative features of Qwen-3

Hybrid Thinking Mode

Automatically switches between deep-thinking and quick-response modes based on task complexity, balancing capability and efficiency while leaving you in control (see the sketch after this list).

  • Thinking Mode (Step-by-step reasoning)
  • Non-Thinking Mode (Quick response)
  • API/Prompt tag control
  • Optimized thinking budget
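
As a minimal sketch of the "API/Prompt tag control" item above: with Hugging Face Transformers, the mode is toggled through the chat template's enable_thinking flag (per the Qwen-3 model cards); the checkpoint name and prompt here are illustrative.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-8B"  # illustrative; any Qwen-3 checkpoint follows the same pattern
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "How many prime numbers are below 100?"}]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # set to False to force the quick, non-thinking mode
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))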

Flagship & Efficient Performance

The flagship MoE model rivals top closed-source models, while even the small models deliver exceptional performance, surpassing much larger previous-generation models.

  • Leading in coding/math/general ability
  • Excellent performance of Qwen3-235B-A22B
  • Qwen3-4B matches Qwen2.5-72B
  • MoE models activate only a fraction of their parameters for high efficiency

Unified Multimodal Processing

Utilizes unified multimodal encoding technology, deeply integrating the processing of text, images, audio, video, and other inputs within a single architecture.

  • Text understanding and generation
  • Image recognition and analysis
  • Audio processing and interaction
  • Video content understanding

Broad Multilingual Support

Supports up to 119 languages and dialects, significantly improving cross-lingual task performance and reducing language-switching errors.

  • Coverage of 119 languages & dialects
  • Pre-trained on 36T tokens
  • Reduced language switching errors
  • Strong cross-lingual capabilities

MCP Protocol & Agent Capabilities

Natively supports MCP (Model Context Protocol), standardizing external tool calls for AI agents. Building agents with the Qwen-Agent framework is recommended (see the sketch after this list).

  • Standardized external tool calls
  • Improved Agent development compatibility
  • Easy to build browser assistants, etc.
  • Qwen-Agent framework recommended
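
A minimal sketch of an MCP-backed agent built with Qwen-Agent, following the pattern in its official examples; the model name, endpoint, and MCP server are assumptions you should adapt.

from qwen_agent.agents import Assistant

llm_cfg = {
    "model": "Qwen3-30B-A3B",                    # illustrative model name
    "model_server": "http://localhost:8000/v1",  # any OpenAI-compatible endpoint
    "api_key": "EMPTY",
}
tools = [
    # MCP servers are declared as tool configs; mcp-server-time is one public example
    {"mcpServers": {"time": {"command": "uvx", "args": ["mcp-server-time"]}}},
    "code_interpreter",  # a built-in Qwen-Agent tool
]
bot = Assistant(llm=llm_cfg, function_list=tools)

messages = [{"role": "user", "content": "What time is it in Hangzhou right now?"}]
for responses in bot.run(messages=messages):  # streams intermediate tool calls and text
    pass
print(responses[-1])  # final assistant message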

Efficient MoE & Diverse Dense Models

Offers flagship MoE models and a variety of Dense models from 0.6B to 32B, meeting diverse scenario requirements.

  • Qwen3-235B (MoE, 22B activated)
  • Qwen3-30B (MoE, 3B activated)
  • 0.6B to 32B Dense models
  • Open sourced under Apache 2.0

Ultra-Long Context Processing

Dense models support up to 128K token context, and MoE models also support long context, efficiently handling long documents and complex dialogues.

  • Up to 128K context (8B-32B)
  • 32K context (0.6B-4B)
  • Optimized attention mechanisms
  • Reduced memory usage for long sequences

Advanced Training Techniques

Pre-trained in three stages on nearly 36 trillion tokens, followed by a four-stage post-training pipeline that develops hybrid thinking and general capabilities.

  • 36T tokens pre-training data
  • Three-stage pre-training process
  • Four-stage post-training flow
  • Application of high-quality synthetic data

Open Ecosystem & Compatibility

Open sourced under the Apache 2.0 license, seamlessly integrated with mainstream tools like HuggingFace, vLLM, Ollama, SGLang, etc.

  • Fully open source (Apache 2.0)
  • Supports frameworks like vLLM, SGLang
  • Supports local tools like Ollama, LMStudio
  • Available on HuggingFace/ModelScope/Kaggle

Qwen-3 in Action

See how Qwen-3 elevates open-source AI capabilities

Qwen-3: Leading the Way in Open Source AI

A deep dive into the capabilities of Qwen-3 and its performance against other leading AI models.

Qwen-3 Performance on Authoritative Benchmarks

General Ability & Language Understanding

  • MMLU: Leading
  • GPQA: Leading
  • Arena Hard: Excellent

Coding Ability

  • LiveCodeBench: SOTA
  • HumanEval: Leading
  • OpenCompass: Leading

Mathematical Ability

  • GSM8K: Excellent
  • AIME: Excellent

Technical Specifications

Explore the advanced technology, architecture, and capabilities driving Qwen-3

Qwen-3 Architecture Details

Advanced architecture integrating Mixture-of-Experts, diverse dense models, and innovative mechanisms

Mixture-of-Experts (MoE) models: Qwen3-235B (22B activated), Qwen3-30B (3B activated)
Diverse Dense models: 0.6B, 1.7B, 4B, 8B, 14B, 32B
Architectural basis for Hybrid Thinking Mode
Unified Multimodal Encoding technology
Native MCP (Model Context Protocol) support
Support for long context (32K to 128K tokens, depending on model size)
Optimized Transformer variant design
Efficient attention mechanisms and chunked prefilling techniques

Qwen-3 Research

Pushing the boundaries of language model capabilities

Innovative Architecture

Integrating hybrid thinking mode, unified multimodal encoding, and efficient MoE architecture.

Training Methodology

Multi-stage pre-training and post-training based on nearly 36 trillion tokens, covering 119 languages.

Technical Blog & Report

Read our blog post to understand the design philosophy and performance details of Qwen-3. A detailed technical report will be released soon.

Read Blog Post

About the Qwen Team

The team behind the Qwen-3 models

Development Background

The Qwen-3 model series is developed by the Alibaba Cloud Tongyi Qianwen team. This team is dedicated to the open-source research and application of large language models, continuously releasing the leading Qwen model series.

Technical Strength

Leveraging Alibaba Cloud's powerful cloud computing infrastructure and extensive experience in large-scale AI model training, the Qwen team can efficiently develop and iterate advanced language models.

Qwen-3 Deployment Options

Efficient Inference Frameworks (vLLM & SGLang)

vLLM (>=0.8.4) or SGLang (>=0.4.6.post1) is recommended for high-performance deployment; both support long context and Hybrid Thinking Mode (see the sketch after this list).

  • High throughput
  • Low latency
  • Supports Hybrid Thinking Mode
  • Compatible with OpenAI API
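
Once a server is running (for example, vllm serve Qwen/Qwen3-8B), it exposes an OpenAI-compatible endpoint, so the standard client works unchanged; this sketch assumes a local server on the default port.

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="Qwen/Qwen3-8B",  # must match the model name the server was launched with
    messages=[{"role": "user", "content": "Summarize Mixture-of-Experts in one sentence."}],
)
print(resp.choices[0].message.content)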

Convenient Local Deployment

Easily run Qwen-3 models locally using tools like Ollama, LMStudio, MLX, llama.cpp, and KTransformers (see the sketch after this list).

  • Quick start
  • Cross-platform support (CPU/GPU)
  • Active community
  • Support for various quantization formats
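
A quick local sketch using the ollama Python package, assuming the model has been pulled first (e.g. ollama pull qwen3:8b; the tag is an assumption, so check the Ollama library for available sizes).

import ollama

response = ollama.chat(
    model="qwen3:8b",  # assumption: substitute whichever Qwen-3 tag you pulled
    messages=[{"role": "user", "content": "Explain Hybrid Thinking Mode in two sentences."}],
)
print(response["message"]["content"])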

Cloud API Services

Directly call the Qwen-3 API via Alibaba Cloud Bailian, DashScope, or together.ai without self-deployment (see the sketch after this list).

  • Out-of-the-box
  • Pay-as-you-go
  • Global access
  • Enterprise-level support
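
DashScope, for instance, exposes an OpenAI-compatible endpoint, so the usual client works; the base URL follows DashScope's compatible mode, while the model name is illustrative (check your provider's catalog).

import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)
resp = client.chat.completions.create(
    model="qwen3-235b-a22b",  # illustrative; use the name listed by your provider
    messages=[{"role": "user", "content": "Hello, Qwen-3!"}],
)
print(resp.choices[0].message.content)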

Model Platforms & Quantization Formats

Model weights are available on Hugging Face, ModelScope, and Kaggle. Quantization formats such as GGUF, AWQ, and GPTQ reduce resource requirements (see the sketch after this list).

  • Multi-platform access
  • Apache 2.0 License
  • Supports Int4/Int8 quantization
  • Suitable for consumer hardware
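
A sketch of loading a GGUF quantization on consumer hardware with llama-cpp-python; the repo id and file pattern are assumptions, so substitute any published Qwen-3 GGUF build.

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Qwen/Qwen3-8B-GGUF",  # assumption: pick the size/quant you need
    filename="*q4_k_m.gguf",       # Int4-class quantization for modest hardware
    n_ctx=8192,
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Say hello in three languages."}]
)
print(out["choices"][0]["message"]["content"])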

How to Use Qwen-3

Get started quickly with Qwen-3: Try online, call APIs, or deploy locally

Step 1

Choose Your Method

Based on your needs, choose to try it online (Qwen Chat), call the API service, or download the model for local deployment.

Step 2

Access Platform or Download Model

Visit the Qwen Chat website/app, consult API documentation and providers (like Alibaba Cloud Bailian), or go to Hugging Face/ModelScope/Kaggle to download the required model files.
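
For the download route, a minimal sketch with huggingface_hub (the repo id is illustrative; substitute the size you need):

from huggingface_hub import snapshot_download

local_dir = snapshot_download("Qwen/Qwen3-4B")  # downloads weights and config files
print("Model files in:", local_dir)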

Step 3

Start Interacting or Integrating

Interact directly with Qwen Chat, integrate it into your application according to the API documentation, or use tools like Ollama, vLLM, SGLang to run and manage the model locally.
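
For the simplest local integration, a sketch using the Transformers pipeline API with an illustrative small checkpoint:

from transformers import pipeline

chat = pipeline("text-generation", model="Qwen/Qwen3-0.6B", device_map="auto")
out = chat(
    [{"role": "user", "content": "Write a haiku about open source."}],
    max_new_tokens=128,
)
print(out[0]["generated_text"][-1]["content"])  # the generated assistant turn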

Frequently Asked Questions

Learn more about Qwen-3

What makes Qwen-3 unique?

Qwen-3 offers various model sizes from 0.6B to 235B (MoE), open-sourced under Apache 2.0. Key innovations include the Hybrid Thinking Mode (intelligently switching thought depth), unified multimodal processing capabilities, and broad support for 119 languages.

How can I access or use Qwen-3?

You can download model weights from Hugging Face, ModelScope, or Kaggle for local deployment (tools like vLLM, SGLang, Ollama are recommended). You can also call API services via Alibaba Cloud Bailian, DashScope, together.ai, etc., or experience it directly on the Qwen Chat website/app.

What tasks does Qwen-3 excel at?

Qwen-3 demonstrates leading performance in coding, mathematics, and general capability benchmarks, surpassing models like Llama3.1-405B. Its multilingual abilities, long context processing, and Agent functionalities (with MCP protocol) are also very strong.

What is Hybrid Thinking Mode?

This is an innovative feature of Qwen-3. The model can automatically or manually switch between a 'thinking mode' for deep reasoning and a 'non-thinking mode' for quick responses, based on task complexity, to balance effectiveness and efficiency.
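
When thinking mode is enabled, the per-turn "soft switch" described in the Qwen-3 usage notes lets you override the default by appending a tag to a user message; a small sketch of the message format:

messages = [
    {"role": "user", "content": "What is 17 * 23? /no_think"},  # force a quick answer
    # ...assistant reply...
    {"role": "user", "content": "Prove there are infinitely many primes. /think"},  # force step-by-step reasoning
]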

How many languages does Qwen-3 support?

Qwen-3 supports up to 119 languages and dialects, significantly enhancing cross-lingual understanding and generation capabilities through large-scale multilingual pre-training data (nearly 36T tokens).

What are the hardware requirements for running Qwen-3?

Requirements depend on the model size. Smaller models (e.g., 0.6B, 1.7B) can run on consumer hardware, especially with Int4/Int8 quantization (like GGUF). Larger models (e.g., 32B, 235B) require more powerful GPU support. It's recommended to check the specific model's documentation and quantization options.

Is Qwen-3 available for commercial use?

Yes, all models in the Qwen-3 series are released under the Apache 2.0 license, allowing for both commercial and research use.

What is the context window size of Qwen-3?

Depending on the model size, Qwen-3 dense models support context lengths of 32K (0.6B-4B) or 128K (8B-32B) tokens. MoE models also support long context (check the model card for specific sizes).

Which deployment frameworks/tools does Qwen-3 support?

vLLM (>=0.8.4) and SGLang (>=0.4.6.post1) are recommended for efficient deployment. For local execution, you can use Ollama, LMStudio, llama.cpp, MLX-LM, KTransformers, etc. It is also compatible with the Hugging Face Transformers library.

Get Started with Qwen-3

Try the Qwen-3 API Service

Access Qwen-3 API functionalities through platforms like Alibaba Cloud Bailian, DashScope, together.ai, etc.

View API Docs

Visit the GitHub Repository

Find Qwen-3 source code, documentation, examples, and community support in the official GitHub repository.

Visit GitHub

Experience Qwen Chat

Directly experience the capabilities of the Qwen-3 model through the official Qwen Chat website or mobile app.

Visit Qwen Chat