DeepSeek released preview versions of two new flagship models, DeepSeek-V4-Pro and DeepSeek-V4-Flash, on April 24, 2026, the Chinese startup's most aggressive product launch since the V3 release in early 2025 sent Magnificent Seven valuations into freefall and forced every major U.S. lab to publicly justify its compute spend. Both new models are open-source under the company's permissive license, meaning developers can download, modify, and self-host them.
According to DeepSeek's own published benchmarks, V4-Pro outperforms every other open-source model on math reasoning and coding tasks, and trails only Google's closed Gemini 3.1 Pro on broad world knowledge. The company described the model's overall performance as "marginally short" of OpenAI's GPT-5.4 and Gemini 3.1 Pro, which it argued suggests a developmental trajectory that lags state-of-the-art frontier models by roughly three to six months. That is the smallest gap any open-source family has plausibly claimed since the modern LLM race began.
What V4-Pro and V4-Flash Actually Are
The two models split the workload along the now-familiar quality-versus-cost axis. V4-Pro is the maximum-capability model, optimized for reasoning, coding, and long-context tasks where the cost per token is acceptable in exchange for the best available answer. V4-Flash trades a small amount of quality for substantially faster response times and what DeepSeek calls "highly cost-effective" pricing, putting it in the same product slot as OpenAI's GPT-5.4 Mini, Anthropic's Claude Haiku 4.7, and Google's Gemini 3.1 Flash.
The split matters because the economics of agentic AI applications, which now drive the majority of new enterprise contracts, depend almost entirely on the inference cost of the worker model rather than the orchestrator. A capable Flash-tier model from a hyperscaler can cost roughly half a cent per thousand output tokens. DeepSeek has historically priced its open-source releases dramatically below that, and the V4-Flash announcement is expected to continue the pattern.
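To make the stakes concrete, here is a back-of-envelope sketch of how that per-token gap compounds across an agent fleet. The rates and workload figures below are illustrative assumptions for the sake of arithmetic, not published prices from any provider.

```python
# Back-of-envelope cost comparison for an agentic workload.
# Both per-token rates are illustrative placeholders, not published prices.
HYPERSCALER_FLASH_PER_1K_OUT = 0.005   # ~half a cent per 1K output tokens
SELF_HOSTED_OPEN_PER_1K_OUT = 0.001    # assumed amortized self-hosting cost

def monthly_cost(requests_per_day: int, out_tokens_per_request: int,
                 price_per_1k: float, days: int = 30) -> float:
    """Total output-token spend for one worker model over a month."""
    tokens = requests_per_day * out_tokens_per_request * days
    return tokens / 1_000 * price_per_1k

# A mid-size agent fleet: 200K requests/day, ~1.5K output tokens each.
for label, price in [("hyperscaler Flash tier", HYPERSCALER_FLASH_PER_1K_OUT),
                     ("self-hosted open model", SELF_HOSTED_OPEN_PER_1K_OUT)]:
    print(f"{label}: ${monthly_cost(200_000, 1_500, price):,.0f}/month")
```

At those assumed rates the fleet costs roughly $45,000 a month on a hyperscaler Flash tier versus $9,000 self-hosted, which is why the worker model's price, not the orchestrator's, dominates the contract math.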
"V4-Pro's performance falls only marginally short of OpenAI's GPT-5.4 and Gemini 3.1 Pro, suggesting a developmental trajectory that trails state-of-the-art frontier models by approximately 3 to 6 months."
DeepSeek, official launch announcement, April 24, 2026
| Model | Provider | License | Strength noted by DeepSeek |
|---|---|---|---|
| DeepSeek-V4-Pro | DeepSeek | Open-source | Top open-source for math and coding |
| DeepSeek-V4-Flash | DeepSeek | Open-source | Reasoning at lower cost, faster latency |
| GPT-5.4 | OpenAI | Closed API | Frontier across tasks |
| Gemini 3.1 Pro | Google | Closed API | Frontier on world knowledge |
| Claude Opus 4.7 | Anthropic | Closed API | Frontier on coding and agentic work |
Why a Three-to-Six-Month Gap Is the Headline Number
The most telegraphed line in DeepSeek's announcement is the explicit claim that the company now sits within three to six months of the closed frontier. For developers building agentic systems on open weights, that estimate has direct practical consequences. A six-month frontier lag is the difference between an enterprise being able to self-host a model that handles 90 percent of its workload and being forced to send every request to a U.S. closed-API provider.
If V4-Pro holds up to independent benchmarking, the calculus shifts for cloud security teams, sovereign deployments, and any organization that has been waiting for an open-source option that is good enough to take seriously for production workloads. The model can be deployed in a private cloud, a national cloud, or a fully on-premises environment without sending sensitive data to OpenAI, Anthropic, or Google.
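For teams that want to test that claim, a minimal self-hosting sketch might look like the following, assuming the weights ship in a Hugging Face-compatible format and using vLLM as the serving engine. The repository ID is a hypothetical placeholder until DeepSeek publishes the actual weights.

```python
# Minimal self-hosted inference sketch using vLLM.
# The model ID below is hypothetical; substitute the real
# repository name once DeepSeek publishes the weights.
from vllm import LLM, SamplingParams

llm = LLM(model="deepseek-ai/DeepSeek-V4-Pro")  # hypothetical repo ID
params = SamplingParams(temperature=0.2, max_tokens=512)

# Nothing leaves the host: prompts and completions stay local,
# which is the core of the sovereign-deployment argument.
outputs = llm.generate(
    ["Summarize our Q3 incident report in three bullet points."],
    params,
)
print(outputs[0].outputs[0].text)
```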
That is the same argument DeepSeek made when it released V3 in January 2025, and the response was swift and unsettling. Marc Andreessen, the Silicon Valley venture capitalist with close ties to the Trump administration, called the launch "AI's Sputnik moment," a phrase that captured the unease in U.S. policy circles about the speed at which Chinese open-source had closed the gap. The reaction this time will depend on whether independent labs reproduce the benchmark numbers DeepSeek has published.
The Open-Source Question and Its Critics
DeepSeek has been clear about its open-source positioning since the V2 release, and V4-Pro and V4-Flash are no different. Developers can use the models commercially, modify the source, and redistribute fine-tuned variants. That is more liberal than what Anthropic, OpenAI, or Google offer with their flagship products, and roughly comparable to what Meta provides with Llama and what Mistral provides with its open-weight releases.
The V3 release in 2025 was met with skepticism from some U.S. researchers, who argued that DeepSeek's stated training cost of less than $6 million understated the actual compute and chip access the company likely had. Several analysts at the time pointed to circumstantial evidence that DeepSeek had access to more advanced Nvidia hardware than U.S. export controls would have permitted, either through pre-restriction stockpiles or through gray-market intermediaries. DeepSeek has never publicly confirmed those claims and has continued to push the narrative of cost-efficient training as a core advantage.
The V4 family release does not include detailed training-cost disclosures, but the company has emphasized algorithmic efficiency and architectural improvements over raw compute scale. That framing is consistent with the broader story Chinese AI labs have been telling: that they will compete on the science and engineering of the models, not on the size of the GPU fleet behind them.
Restrictions, Bans, and Sovereign Concerns
DeepSeek's V3 release prompted a broad regulatory response. Multiple U.S. states, Australia, Taiwan, South Korea, Denmark, and Italy introduced bans or other restrictions on DeepSeek-R1 within weeks of its launch, citing privacy and national security concerns related to the company's data handling and the Chinese government's potential access to user inputs. Several of those restrictions remain in force, and others were tightened after subsequent disclosures about how DeepSeek's hosted service routed user data.
The V4-Pro and V4-Flash launch will face similar scrutiny, but the open-source nature of the release complicates the response. A locally hosted DeepSeek-V4-Pro instance does not send data anywhere, which removes the objection regulators cite most often. That is also why European competition regulators have started to look more closely at the open-versus-closed framing, and why several sovereign cloud operators in Europe and the Middle East have already announced plans to offer V4-Pro as a managed deployment option.
For multinational enterprises, the deployment question becomes a procurement question. Legal, compliance, and security teams now have to evaluate whether a self-hosted DeepSeek instance carries less risk than a managed service from a U.S. provider that has its own regulatory exposure. Six months ago, that conversation would have ended quickly with a default to the U.S. provider. With V4-Pro on the table, it becomes a real evaluation.
What Stanford's Index Already Told Us
The Stanford AI Index 2026, released earlier this month, documented what DeepSeek's launch now demonstrates in product form. According to the index, Chinese AI labs have "effectively closed" the performance gap with U.S. rivals on most academic benchmarks, while still trailing on what the report's authors called "higher-impact patents" and the most cited frontier models. China leads in publication volume, citation counts, patent filings, and industrial robot installations.
"Chinese companies have effectively closed the AI performance gap with their U.S. rivals."
Stanford AI Index 2026
The report's framing was that the U.S. retains a slim lead at the very top of the model quality distribution, but the slope of the curve has flattened. DeepSeek's V4 release is a data point in favor of that thesis, and the company's three-to-six-month gap claim is consistent with the index's headline finding. The next data point will come when independent benchmarking efforts, including EleutherAI's lm-eval-harness team and Stanford's HELM project, publish their own evaluations over the next two weeks.
What to Watch Next
Three things will determine whether V4-Pro becomes a genuine inflection point or a strong release that closes the gap without changing the market structure. The first is independent benchmark reproducibility. If the math and coding numbers hold under HELM and lm-eval-harness scoring, the open-source community will quickly build infrastructure around the model. The second is the inference cost when self-hosted on consumer-accessible hardware. The third is the regulatory response in Washington, which has been markedly less coordinated under the second Trump administration than during the V3 cycle.
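That first test, benchmark reproducibility, is largely mechanical once the weights are public. A sketch of what reproduction could look like with EleutherAI's lm-eval-harness Python API follows; the model ID is again a hypothetical placeholder.

```python
# Sketch of independently re-scoring DeepSeek's math claims with
# EleutherAI's lm-evaluation-harness. The model ID is hypothetical
# until the weights are actually published.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=deepseek-ai/DeepSeek-V4-Pro",  # hypothetical ID
    tasks=["gsm8k"],  # math word problems; coding suites run similarly
    batch_size=8,
)
print(results["results"])
```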
For developers, the practical question is simpler. The model is available to download, the license permits commercial use, and the company is releasing inference code alongside the weights. Expect community-trained fine-tunes to appear within days. The competitive dynamic that V3 set in motion has not slowed, and V4-Pro is the strongest argument yet that the open-source frontier is no longer a generation behind.