AI Product Updates — May 26, 2026

Today's notable movements: DeepSeek locks in a permanent 75% price cut that resets enterprise pricing expectations, Anthropic stages a coordinated pre-launch for Claude Mythos, Microsoft renegotiates its OpenAI deal, and the MCP protocol drops its biggest spec revision since launch. Here's what shipped and what changed.

Pricing and economics

DeepSeek makes its 75% discount permanent

What started as an end-of-month promotion is now the standing price. DeepSeek announced on May 22 that its V4 Pro flagship discount would not expire — making the promotional rate the permanent one. 1

The practical gap is large. On equivalent reasoning-heavy enterprise workloads, costs at mid-May came out to roughly $4,811 for Claude Opus 4.7, $3,357 for GPT-5, and $1,071 for DeepSeek V4 Pro — a roughly 7-to-1 spread between DeepSeek and the Western flagships at the top tier. Enterprises running most of their non-sensitive inference on DeepSeek while reserving the final reasoning step for a Western model can cut their annual AI bill by an order of magnitude.

This puts direct pricing pressure on the IPO stories at both Anthropic and OpenAI: holding margin while keeping market share now requires a response.

DeepSeek V4 Pro 75% permanent pricing announcement — DeepSeek V4 Pro's permanent pricing vs. competitors 1

Anthropic's path to margin

Anthropic's compute cost per revenue dollar dropped from 71 cents in Q1 to 56 cents in Q2, with Claude Code alone at $2.5 billion in annualized revenue and a $559 million profit projection ahead of its October IPO. 2 The margin improvement is the empirical counter to the assumption that frontier labs can't operate profitably while also competing on capability.

Model releases and product updates

Anthropic prepares Claude Mythos for public release

Anthropic's restricted frontier model, Claude Mythos, is showing signs of an imminent broader rollout. References to the model have been spotted inside Claude Code and AWS vulnerability discovery programs, and a brief UI test in Claude Code's public build exposed a toggle to enable claude-mythos-1-preview — visible to some users before being pulled. 3

Mythos was restricted at launch in April because of its cyberoffense capability: in Project Glasswing, the model — used in a controlled research context with 50 partner organizations — found more than 10,000 high-severity or critical vulnerabilities in open-source software in its first month. The appearance of its signatures in Google Cloud and AWS security tooling suggests Anthropic's guardrail work is far enough along to move toward broader availability, though subscription tier access remains unconfirmed. 4

Claude Mythos security vulnerability pipeline showing findings at each stage — Claude Mythos vulnerability pipeline: 10,000+ critical findings in one month 3

Anthropic plans structured memory files for Claude

In parallel with the Mythos preparations, Anthropic is redesigning Claude's memory architecture. The current system stores conversation history in a single growing blob; the planned replacement shards memory across multiple structured documents organized by topic, project, or context. 5 This addresses a practical limitation that becomes acute in long-running agent workflows, where retrieval signal-to-noise degrades as the context accumulates.

Google Gemini 3.5 Flash: cheaper tier outperforms expensive one on software tasks

Google's Gemini 3.5 Flash now has a tiered reasoning structure. Developer testing finds the Low tier — the cheapest in the lineup — outperforms the High tier on software-engineering tasks while generating 45% fewer tokens than Medium. 6 If the pattern holds across workloads, swapping from High to Low cuts inference cost roughly in half. Google also unveiled Gemini 3.5 Omni and Antigravity 2.0 as part of a broader model slate at Google I/O 2026. 7

ChatGPT adds form-filling from photos

OpenAI shipped a form-fill feature: upload an image of a form, describe via chat what each field should contain, and ChatGPT returns the completed version. 8 The OCR and field-detection components are not new; the natural-language instruction layer on top is. As a baseline capability it sets the floor for what enterprise forms-as-a-service products now need to match.

Infrastructure and developer tooling

MCP's biggest spec revision since launch enters release candidate

The Model Context Protocol's release candidate for its first major post-launch revision landed today. Key changes: a stateless core, a formal extensions framework, OAuth/OIDC-aligned authorization, and an explicit deprecation policy. The final spec ships July 28, 2026 and contains breaking changes. 9 Teams running MCP servers in production have roughly nine weeks to work through the migration checklist before the final spec ships.

Langfuse Launch Week 5: experiments in CI/CD (Day 1 of 5)

Langfuse opened its fifth consecutive launch week (May 25–29) with a GitHub Actions integration that runs LLM experiments inside CI/CD pipelines. The action tests every pull request against a Langfuse dataset, fails the workflow if scores drop below a set threshold, and posts results back to the PR as a comment. Days 2–5 (Tuesday through Friday) are scheduled to ship "Built for agents," "Find anything," "Evals as code," and "Never miss a thing."

langfuse.comhttps://langfuse.com/launch-week-5外部链接

正在加载内容卡片…

OpenAI publishes macro-eval workflow for agents

OpenAI released a methodology for evaluating agentic systems at scale. The approach shifts the unit of analysis from individual failed traces to patterns across thousands of runs: lower-level evals grade each run; macro-evals cluster the recurring failure modes that per-run review misses. 10

Reasonix: DeepSeek-native coding agent

A new terminal coding agent called Reasonix is designed specifically around DeepSeek's prefix-cache economics. It optimizes for the cost shape of long-running coding sessions rather than for UX features, and paired with the now-permanent 75% discount, offers order-of-magnitude lower per-hour cost than incumbent coding harnesses. The tradeoff is single-model lock-in to DeepSeek. 11

Perplexity open-sources Bumblebee security scanner

Perplexity released Bumblebee, a read-only security scanner for developer workstations. It flags risky npm packages, browser extensions, and AI tool configurations. The tool is unusual for a consumer AI company — it's essentially operational hardening of the environments its own assistants run inside. 12

Deals and corporate moves

Microsoft restructures its OpenAI deal

Microsoft has renegotiated its foundational agreement with OpenAI: the revenue-share arrangement ends, Microsoft secures rights to OpenAI's IP through 2032, and OpenAI is now permitted to run workloads on any cloud provider rather than exclusively on Azure. 13 Wedbush now puts Microsoft's 2026 revenue attributable to OpenAI at roughly $6 billion.

Separately, Microsoft is adding persistent long-term memory to its AI chatbots — joining Anthropic and Google in building out cross-session context — with user controls for storing and managing what's retained.

SpaceX's cloud revenue dwarfs its rocket history

SpaceX's compute deal with Anthropic is now valued at roughly 80% of the company's total rocket revenue across 23 years. As a data point in the broader AI infrastructure buildout, $7.5 trillion in total projected spend runs at approximately 5% of annual US GDP — a ratio comparable to the 1880s railroad boom, which ended in significant overcapacity. 14

GitHub Copilot shifts to usage-based billing

Starting June 1, 2026, all GitHub Copilot plans will move to usage-based billing via GitHub AI Credits. 15

Policy

US shelves federal AI executive order

AI advisor David Sacks persuaded President Trump to abandon a sweeping AI executive order in an 11th-hour phone call. The argument: mandatory federal regulation would hand China a competitive advantage. The order was shelved. 16 With no binding federal framework expected on the 12–24 month horizon, each lab's own published safety record becomes the de facto regulatory benchmark.

California took a different path the same week: Governor Newsom signed an executive order creating a dashboard tracking AI's employment impacts by sector, with data from the Employment Development Department required within 90 days. 17