May 30, 2026
Onyx Security CEO Maxim Bar Kogan on training small models to police autonomous agents; Cherny, Levie, and Tan share agentic and founder insights.
X / TWITTER
Boris Cherny — Claude Code, Anthropic
Boris Cherny, who works on Claude Code at Anthropic, spotlighted a detailed Salesforce writeup on going agentic with Claude Code, and the numbers are striking: a migration originally scoped at 231 days shipped in 13, and a single PR delivered 21 endpoints at 100% test coverage. His larger point is that the teams seeing the biggest AI wins aren't speeding up what they already do; they're deleting steps, removing handoffs, and letting an agent own tasks end to end. Crucially, quality rose alongside output: even with more PRs shipping, total incidents dropped 5%, because security guardrails and quality standards were built into the agentic workflow itself rather than bolted on afterward.
Aaron Levie — CEO, Box
Box CEO Aaron Levie argued that when a company spends $500M to build its own version of an app-layer product, that's the best possible advertisement for the app layer — and a reason to be very bullish on software. His take cuts against the fear that foundation labs or deep-pocketed incumbents will commoditize application companies: to him, the willingness to spend that much replicating an app only validates how much value sits at that layer.
Garry Tan — CEO, Y Combinator
Y Combinator CEO Garry Tan pushed back on the founder mindset that funding is the bottleneck: "Money is not the fire. Money is gasoline you pour on a fire that already exists." His point is that if you think you only need money to do X, you don't have a funding problem — you have a "people don't want it yet" problem. Go make the first fire, building real demand and traction, before chasing capital.
PODCASTS
No Priors — Building an AI Guardian for Enterprise with Onyx Security CEO Maxim Bar Kogan
The Takeaway: As enterprises unleash autonomous coding agents at scale, the hardest security problem isn't seeing what agents do — it's having a fast, cheap way to decide, in real time, which of their millions of actions deserve a closer look.
Maxim Bar Kogan is the co-founder and CEO of Onyx Security, an Israel-based startup of researchers and mathematicians building "agents that watch the AI agents." His bet traces back to AutoGPT, which convinced him that once models got good enough, enterprises would run agents that act with human-level permissions and have no way to oversee them. That moment has arrived: he estimates that in a typical enterprise, autonomous coding agents and assistants now account for over 50% of AI usage and are the fastest-growing category, usually deployed with no controls at all.
His sharpest insight is why traditional security fails here. Identity-based controls assume you can scope what a system is allowed to do, but "we kind of want them to have our permissions" so the agent can keep running while you go to lunch. Endpoint and API tools see the action but not the intent: they can't tell whether deleting a database is the task or a hallucinated detour. Onyx's answer is counterintuitive: don't throw a smart agent at every action, because the review would then cost more than the AI itself and add latency to every step. Instead, train very small models that are good at exactly one thing: flagging when a smarter reviewer should step in.
He frames it like a chess player's intuition: most moves are instant, but occasionally you stop to calculate hard. "You don't want to spend too much intelligence where you don't have to, and you want to spend a lot of intelligence, overwhelmingly a lot, in situations where there's high risk." And he's blunt about why an independent party is needed at all: you don't trust the seller of a car to certify it's safe.
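The two-tier idea described above (a cheap check on every action, with an expensive review reserved for the few that look risky) can be sketched in a few lines of Python. This is an illustration only, not Onyx's implementation: `Action`, `cheap_risk_score`, `expensive_review`, `triage`, the keyword list, and the 0.5 threshold are all hypothetical, and the keyword heuristic merely stands in for a small trained model.

```python
from dataclasses import dataclass


@dataclass
class Action:
    agent_id: str
    command: str  # what the agent is about to run
    task: str     # what the agent was asked to do


# Hypothetical stand-in for a small, fast model: a keyword heuristic
# that returns a risk score in [0, 1]. A real system would train this.
RISKY_MARKERS = ("drop database", "rm -rf", "delete from", "push --force")


def cheap_risk_score(action: Action) -> float:
    text = action.command.lower()
    return 1.0 if any(marker in text for marker in RISKY_MARKERS) else 0.05


def expensive_review(action: Action) -> bool:
    """Hypothetical stand-in for a slower, smarter reviewer.

    Approves a risky command only if the stated task plausibly
    calls for something destructive (intent, not just the action).
    """
    destructive_words = ("delete", "drop", "remove", "reset")
    return any(word in action.task.lower() for word in destructive_words)


def triage(actions, threshold=0.5):
    """Run the cheap check on everything; escalate only risky actions."""
    allowed, blocked, escalations = [], [], 0
    for action in actions:
        if cheap_risk_score(action) < threshold:
            allowed.append(action)  # the "instant move": no deep review
            continue
        escalations += 1            # the "stop and calculate" case
        if expensive_review(action):
            allowed.append(action)
        else:
            blocked.append(action)
    return allowed, blocked, escalations


if __name__ == "__main__":
    demo = [
        Action("a1", "git status", "fix failing test"),
        Action("a1", "psql -c 'DROP DATABASE prod'", "fix failing test"),
        Action("a2", "psql -c 'DROP DATABASE staging_tmp'",
               "drop the temporary staging database"),
    ]
    allowed, blocked, escalations = triage(demo)
    print(f"escalated {escalations} of {len(demo)} actions")
    for action in blocked:
        print("blocked:", action.command)
```

The design point is that the expensive reviewer's cost scales with the number of escalations rather than with the total number of actions, which is what makes reviewing millions of actions affordable in this framing.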