TL;DR
Alibaba's Qwen team has released Qwen 3 Coder, a 480-billion-parameter mixture-of-experts model that sets a new bar for open-source coding assistants. With 35 billion active parameters and support for context windows scaling to one million tokens, this model doesn't just compete with proprietary alternatives - it beats them on several key benchmarks.

The numbers tell a clear story. On TerminalBench, Qwen 3 Coder outperforms Claude 4 Sonnet. On SWE-bench Verified, it scores 69.6 against Claude 4's 70.4 - functionally a tie. Agentic browser use is nearly identical between the two models, and while Qwen 3 Coder trails slightly on agentic tool use, it remains within striking distance. Perhaps most telling is the comparison to Kimi K2, which scored 65.4 on SWE-bench: Qwen 3 Coder clears that bar with room to spare.
This represents a dramatic acceleration in capability. Just months ago, DeepSeek R1 was the benchmark everyone discussed. Now an open model matches or exceeds Claude 4 Sonnet across most coding tasks.
Qwen 3 Coder was trained on 7.5 trillion tokens, 70% of which were code. The team used synthetic data generation and rewriting to clean noisy training data, significantly improving overall data quality. The model natively supports a 256,000-token context and extends to one million tokens using YaRN extrapolation - optimized specifically for repository-scale coding and dynamic data like pull requests.
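For context on what YaRN extension looks like in practice: serving stacks that support YaRN typically enable it through a rope-scaling entry in the model config. A minimal sketch below follows Hugging Face transformers conventions for the field names; the specific values are illustrative assumptions, not the official release config.

```python
# Illustrative YaRN rope-scaling entry, in the style of a Hugging Face
# transformers config.json. Values are assumptions for illustration, not
# the official Qwen 3 Coder release configuration.
native_context = 256_000     # context length the model natively supports
target_context = 1_000_000   # extended context reached via YaRN

rope_scaling = {
    "rope_type": "yarn",
    # YaRN stretches RoPE frequencies by this factor beyond the native window
    "factor": target_context / native_context,
    "original_max_position_embeddings": native_context,
}

print(rope_scaling["factor"])  # 3.90625
```

The scaling factor is simply the ratio of the target window to the native one, which is why the jump from 256K to one million tokens needs roughly a 4x stretch.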

Unlike models optimized for competitive programming puzzles, Qwen 3 Coder focuses on real-world software engineering tasks suited for execution-driven reinforcement learning. The team scaled code RL training across a broad spectrum of practical coding scenarios rather than cherry-picking benchmark-friendly problems.
The post-training pipeline introduces long-horizon reinforcement learning to handle multi-turn interactions with development environments. Training an agentic coding model requires massive environmental scale - Alibaba spun up 20,000 independent environments running in parallel across their cloud infrastructure. This setup provided the feedback loops necessary for large-scale RL and supported evaluations at scale. The result: state-of-the-art performance among open-source models on SWE-bench and related benchmarks.
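The execution-driven loop can be pictured as: generate a patch, run the project's tests in an isolated environment, and use pass/fail as the reward. A minimal sketch of that shape, with hypothetical stand-ins for the environment and the model (this is not Alibaba's pipeline, just the general pattern):

```python
import concurrent.futures

# Toy stand-ins so the sketch is self-contained; a real system would invoke
# the model and execute an actual test suite in a sandbox.
def propose_patch(task_id):
    return f"patch-{task_id}"

def run_tests(task_id, patch):
    return task_id % 2 == 0  # pretend even-numbered tasks pass

# One episode of execution-driven RL: apply a candidate patch, run the
# tests, and return a binary reward based on whether they pass.
def run_episode(task_id: int) -> float:
    patch = propose_patch(task_id)
    passed = run_tests(task_id, patch)
    return 1.0 if passed else 0.0

# Many environments run in parallel to keep the RL feedback loop fast -
# the same reason Alibaba scaled to thousands of parallel environments.
with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
    rewards = list(pool.map(run_episode, range(16)))

print(sum(rewards) / len(rewards))  # average reward across the batch: 0.5
```

The key property is that the reward comes from actually executing code, not from a learned judge, which is what makes environment scale the bottleneck.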
While hybrid reasoning and test-time compute dominate headlines, Qwen 3 Coder prioritizes inference speed - a critical factor when running inside AI IDEs or agentic coding tools. Fast feedback loops matter when you're iterating on code.
Alibaba released Qwen Code alongside the model, a CLI tool forked from Gemini CLI but customized with specialized prompts and function-calling protocols designed specifically for Qwen 3 Coder. The tool handles agentic coding tasks out of the box.
Integration extends beyond Alibaba's official tooling; Qwen 3 Coder also plugs into third-party coding tools and OpenAI-compatible clients.
The fastest way to test Qwen 3 Coder is through the official web interface at chat.qwen.ai. The platform offers free access with an artifacts feature that renders generated web applications directly in the browser - useful for quickly prototyping 3D visualizations, physics simulations, or interactive demos.

For local CLI usage:
```bash
npm install -g @qwen-code/qwen-code
```

Then configure your API key from OpenRouter, Alibaba Cloud, or another provider by setting the base URL and model identifier to point at Qwen 3 Coder.
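Providers such as OpenRouter expose Qwen 3 Coder behind an OpenAI-compatible chat endpoint, so configuration boils down to a base URL, a model identifier, and an API key. A sketch of the request shape follows; the model identifier and endpoint path are assumptions, so check your provider's documentation.

```python
import json

# Assumed values for an OpenAI-compatible provider; verify the exact base
# URL and model identifier against your provider's documentation.
BASE_URL = "https://openrouter.ai/api/v1"
MODEL = "qwen/qwen3-coder"

# Standard OpenAI-compatible chat completion payload.
payload = {
    "model": MODEL,
    "messages": [
        {"role": "user", "content": "Write a function that reverses a string."}
    ],
}

# The request would be POSTed to f"{BASE_URL}/chat/completions" with an
# "Authorization: Bearer <API_KEY>" header; shown here as data only.
print(json.dumps(payload, indent=2))
```

Because the endpoint follows the OpenAI wire format, any client that lets you override the base URL and model name can target Qwen 3 Coder without code changes.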
To use with Claude Code, obtain an API key from Alibaba Cloud Model Studio, install Claude Code, and configure the proxy URL and auth token. Cline users can similarly swap in the model through its provider configuration.
Qwen 3 Coder arrives at a moment when open-source models are closing the gap with proprietary alternatives faster than expected. The model's strength on SWE-bench - a benchmark requiring multi-turn planning, tool use, and environment interaction - suggests it handles real software engineering workflows, not just code completion.

The combination of competitive performance, million-token context windows, and permissive open licensing gives teams a viable alternative to closed APIs for agentic coding workflows. Whether you're building automated devtools, running an AI-powered IDE, or experimenting with code generation agents, Qwen 3 Coder deserves evaluation.
The rapid progression from DeepSeek R1 to Kimi K2 to Qwen 3 Coder - each leapfrogging the previous state of the art within months - suggests the pace of improvement in coding models isn't slowing. If anything, it's accelerating.