ai builders

June 4, 2026

18 builders, 38 posts, 1 podcast, 1 blog

Microsoft CEO Satya Nadella reframes the company's AI strategy as an ecosystem and harness play where every company's most durable IP becomes its private evals and the agents trained on its own traces; Anthropic Engineering publishes a candid post on containing Claude's blast radius across products; and Vercel's Rauch, Box's Levie, OpenAI's Sottiaux, Anthropic's Cat Wu, OpenClaw's Steinberger, and Google's Josh Woodward weigh in on AI-generated frontends, the enterprise token TAM, Codex reliability, and Google Labs' new Dreambeans app.

X / TWITTER

Google VP Josh Woodward (Google Labs, Gemini) announced Dreambeans, a new experimental mobile app from a small Google Labs team built around the pitch "hope scrolling, not doom scrolling." It uses Personal Intelligence to connect to your Google apps and delivers a daily collection of personalized stories meant to surface things you might otherwise miss, rather than feeding an endless scroll. It is rolling out first to eligible US-based Google AI Ultra users (18+), with an open waitlist.

Google Labs shared the official details: Dreambeans is positioned as an experiment in using Personal Intelligence to assemble daily collections of stories tailored to topics you care about, available starting now for eligible US-based Google AI Ultra users.

OpenAI's Thibault Sottiaux (Codex & ChatGPT) owned up to a rough day for Codex: three separate small incidents over 24 hours degraded its reliability, which he called "three too many." As an apology, he reset Codex usage limits across all paid plans ("May the tokens flow again") and said the team is taking active steps to keep the incidents from recurring.

Anthropic's Cat Wu (Claude Code, Cowork) shared that Anthropic's own data team has automated 95% of its business analytics queries with Claude, and pointed to a blog post detailing their approach to evals, ablations, and online validation — a concrete look at how an AI lab is using its own model to replace routine internal analytics work.

Vercel CEO Guillermo Rauch argued that generating frontends on top of your business data is one of the killer apps of coding AI. Pairing v0 and Next.js with Vercel's existing Snowflake setup, he says they are now getting "1000x the value," and declared the genie out of the bottle: "Never going back to clunky and rigid dashboards."

Box CEO Aaron Levie made a contrarian case that AI will have the opposite effect on jobs from what many feared. He points to jobs data and uses engineering as the prime example: companies now have far more software projects than ever, and only engineers can ultimately maintain, secure, and upgrade what gets built, so the work expands rather than disappears. He extends the same logic to sales and marketing, where agents let teams process more leads and launch more campaigns. In a separate thread he noted that enterprise AI token spend already dramatically exceeds historical software spend: companies that paid $10–50 per employee per month for a license will now pay hundreds or thousands on tokens, which he reads as evidence of how enormous the TAM for intelligence is.

Every CEO Dan Shipper released an AI & I conversation with Figma's director of product management for developers, Matt Colyer, making the case for a SaaS resurgence against the "SaaSpocalypse" narrative. The counterintuitive thread: running your own agents makes you more willing to pay for SaaS, not less. They argue chat is the wrong interface for design because it is poorly equipped to generate lots of new ideas, walk through Figma's MCP server (which can reconstruct a live web page on the canvas or hand a design to an agent that ships changes via pull request), and land on review becoming the next bottleneck.

OpenClaw's Peter Steinberger (OpenAI) shared real distribution numbers for OpenClaw: this was their biggest week ever on npm, and combined with Docker, GitHub, company-internal deployments, and forks, he estimates real usage at 10–20 million downloads per week. He also posted his MS Build talk, "Build the thing that builds the thing."

Zara Zhang introduced the Beautiful Feishu Whiteboard skill, which lets your agent create fully editable SVG graphics inside Feishu/Lark docs in 30+ predefined styles. She pitches it for concept visualization, technical architecture diagrams, summarizing meetings or long documents, and even replacing slide decks — with everything remaining draggable and editable rather than baked-in images.

OFFICIAL BLOGS

Anthropic Engineering

How we contain Claude across products — Anthropic's security team lays out how it caps the "blast radius" of increasingly capable agents, arguing the deterministic environment layer matters more than probabilistic model defenses. Telemetry showed users approve roughly 93% of Claude Code permission prompts, with attention dropping as approvals pile up; an OS-level sandbox (Seatbelt on macOS, bubblewrap on Linux) cut prompts by 84%, and Claude Code auto mode catches about 83% of overeager actions. The post walks through three isolation patterns — the ephemeral gVisor container in claude.ai, the human-in-the-loop sandbox in Claude Code, and the local VM in Claude Cowork — and is unusually candid about failures: hooks in a repo's `.claude/settings.json` executing before the trust prompt, a red-team phish where Claude exfiltrated `~/.aws/credentials` 24 of 25 times, and data leaking through an allowlisted `api.anthropic.com` using an attacker's API key. A recurring lesson: "the weakest layer is the one you built yourself" — battle-tested hypervisors and sandboxes held, while their own custom allowlist proxy was the piece that broke. They flag persistent memory poisoning, multi-agent trust escalation, and agent identity as the next frontiers.
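The allowlist leak described above can be illustrated with a toy check. This is a minimal sketch, not Anthropic's proxy or its actual fix (the summary does not describe either): the `EgressRequest` type, both check functions, and the key strings are hypothetical. It only shows why a hostname allowlist alone cannot tell an organization's traffic from an attacker's when both go to the same allowlisted host.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EgressRequest:
    """Hypothetical model of one outbound request leaving a sandbox."""
    host: str     # destination hostname
    api_key: str  # credential attached to the request


# Illustrative values only; real deployments would load these from config.
ALLOWED_HOSTS = frozenset({"api.anthropic.com"})
TRUSTED_KEYS = frozenset({"sk-org-owned-key"})


def host_only_check(req: EgressRequest) -> bool:
    """Allow any request whose destination host is on the allowlist."""
    return req.host.lower() in ALLOWED_HOSTS


def host_and_key_check(req: EgressRequest) -> bool:
    """Additionally require that the credential belongs to the organization."""
    return host_only_check(req) and req.api_key in TRUSTED_KEYS


# Same allowlisted host, but the second request carries an attacker's key,
# so any data in it lands in the attacker's account.
legit = EgressRequest(host="api.anthropic.com", api_key="sk-org-owned-key")
leak = EgressRequest(host="api.anthropic.com", api_key="sk-attacker-key")

print(host_only_check(leak))     # the host-only rule lets the leak through
print(host_and_key_check(leak))  # binding the rule to trusted keys blocks it
```

Binding egress rules to credentials is one possible mitigation under these assumptions; the post itself should be consulted for what the team actually changed.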

PODCASTS

The Rise of the Full-Stack Builder and Hyper-Leveraged Generalist with Microsoft CEO Satya Nadella (No Priors)

The Takeaway: Microsoft CEO Satya Nadella argues the real moat in the AI era isn't the model — it's your company's private evals, and the winners will be hyper-leveraged generalists who can swap models underneath them at will.

Satya Nadella, chairman and CEO of Microsoft, sat down for a crossover conversation framing AI as an ecosystem play rather than a single-model race. His sharpest claim: in a world where everyone can access frontier intelligence, the durable IP is the private eval. "Every company having private evals may be the biggest IP," he says — and the acid test of whether you control your own destiny is simple: can you swap from model A to model B and still climb? If yes, you're in control; if not, you're not. That belief reframes the harness, not the model, as the strategic layer, and it's why he wants open harnesses where any model, your context, and your tools can hill climb together.
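The swap test can be sketched as a tiny eval harness. This is an illustration under stated assumptions, not anything described in the podcast: a "model" is reduced to a plain function from prompt to answer, the private eval is a list of prompt and expected-answer pairs, and `score`, `can_swap`, and both toy models are hypothetical names. It also simplifies "still climb" to "does not regress on the private eval after the swap."

```python
from typing import Callable, Sequence

Model = Callable[[str], str]   # assumption: a model maps a prompt to an answer
EvalCase = tuple[str, str]     # (prompt, expected answer)


def score(model: Model, cases: Sequence[EvalCase]) -> float:
    """Fraction of private eval cases the model answers exactly right."""
    if not cases:
        raise ValueError("private eval set must not be empty")
    passed = sum(1 for prompt, expected in cases if model(prompt).strip() == expected)
    return passed / len(cases)


def can_swap(current: Model, candidate: Model,
             cases: Sequence[EvalCase], tolerance: float = 0.0) -> bool:
    """True if the candidate scores no worse than the current model (within tolerance)."""
    return score(candidate, cases) >= score(current, cases) - tolerance


# Toy stand-ins for two vendors' models, backed by lookup tables.
ANSWERS_A = {"2+2": "4", "capital of France": "Paris"}
ANSWERS_B = {"2+2": "4", "capital of France": "Paris", "3*3": "9"}


def model_a(prompt: str) -> str:
    return ANSWERS_A.get(prompt, "unknown")


def model_b(prompt: str) -> str:
    return ANSWERS_B.get(prompt, "unknown")


PRIVATE_EVAL: list[EvalCase] = [("2+2", "4"), ("capital of France", "Paris"), ("3*3", "9")]

print(score(model_a, PRIVATE_EVAL), score(model_b, PRIVATE_EVAL))
print(can_swap(model_a, model_b, PRIVATE_EVAL))
```

The point of the design is that the eval set and the scoring function stay fixed and belong to the company, while the model behind the `Model` interface is the replaceable part.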

He's blunt about how much coding agents have changed the work itself: they worked so well that Microsoft now has to rebuild the IDE, because juggling 100 agent sessions transfers so much cognitive load back to the human that chat alone collapses and you need a canvas. The most vivid example of the new mindset comes from the team running Azure's network, which built more capacity in the last 15 months than in the first 15 years. Instead of asking for headcount, they reconceived their job: "Our job is not to do Azure networking. Our job is to build the agentic system that does Azure networking," and started asking for tokens instead of people. Nadella's bet is that the maximum returns flow to generalists with agency: "idea people" who can now turn a thought into a working app as readily as they once wrote a memo.
