Daily Edition

2026-05-12

AI Builders Daily: May 12


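Today's Reflection

AI's paradigm shift: from writing code to verifying code

Garry Tan points to an underrated insight: the value of AI is not only that it lets people write code faster, but that it makes verification cheaper than it has ever been. 90% test coverage used to be something every engineer knew they should reach and very few actually did, because the upkeep cost too much and human willpower ran out. AI has now made reaching that threshold effectively free.

The deeper point is that the bigger change is not easier programming but quality that no longer carries a prohibitive cost. This is a fundamental shift in how software is developed: from "build fast" to "verify continuously", with the engineer's role moving from builder to quality gatekeeper.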


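Products and Launches

TrustClaw: an open-source personal agent service

sarah announced that she is open-sourcing TrustClaw, a project she says was inspired by OpenClaw. According to her post, it deploys a production-ready personal agent service with more than 1,000 app integrations to Vercel in a single command: npx @composio/trustclaw deploy. rauchg quote-posted the announcement with a one-word comment: "Cool".

Set against Garry's point that verification is now free, the timing is notable. As the name suggests, TrustClaw may be aimed at the question of how to keep agent behavior trustworthy in AI-assisted environments, though the announcement itself does not say so.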


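Opinions and Takes

Carmack: the reality of starting a video game company

ID_AA_Carmack said something blunt that lands hard in today's funding climate: the distribution of rewards for starting a video game company is "generally not very good today", the market is well served, and founders should plan to burn through seven figures with a not-great chance of making it back.

His advice is direct: identify your customers first, build the smallest game anyone would pay for, then hill-climb the value. That is the reverse of what many AI founders do today, which is to raise money, build a team, and only then build the product. The implication is that in the wrong order, your burn rate kills you.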


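swyx: /goal is really about redefining evaluation

swyx offered a take on AI coding interfaces: the core of the /goal design pattern is a move from describing what you want done to describing how you want the result evaluated. These are two different modes of thinking: the first is the language of human intuition, the second is the language of system execution.

The shift matters because it moves the AI from following instructions to understanding a standard. Once the evaluation criteria are explicit, the AI's task changes from guessing human intent to optimizing within known constraints.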
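
To make the WHAT-versus-HOW distinction concrete, here is a minimal Python sketch in which a goal is written as a set of named checks rather than as an instruction. It is an illustration only and does not reflect how /goal is implemented in any tool; the slug-generation task, the CHECKS table, and the two candidate functions are all hypothetical.

```python
import re
from typing import Callable

Candidate = Callable[[str], str]
Check = Callable[[Candidate], bool]

# The goal is stated as "how the result will be evaluated":
# each named check takes a candidate implementation and returns pass/fail.
CHECKS: dict[str, Check] = {
    "lowercases": lambda f: f("Hello") == "hello",
    "hyphenates spaces": lambda f: f("a b c") == "a-b-c",
    "drops punctuation": lambda f: f("What's new?") == "whats-new",
}


def evaluate(candidate: Candidate) -> dict[str, bool]:
    """Run every check against a candidate and report the result per check."""
    return {name: check(candidate) for name, check in CHECKS.items()}


def first_attempt(title: str) -> str:
    # Satisfies a vague instruction ("make a slug") but not the stated criteria.
    return title.lower()


def second_attempt(title: str) -> str:
    # Remove everything except letters, digits, whitespace, and hyphens,
    # then collapse runs of whitespace into single hyphens.
    cleaned = re.sub(r"[^a-z0-9\s-]", "", title.lower())
    return re.sub(r"\s+", "-", cleaned.strip())


if __name__ == "__main__":
    for attempt in (first_attempt, second_attempt):
        results = evaluate(attempt)
        print(attempt.__name__, results, "goal met:", all(results.values()))
```

With explicit checks, an agent (or a person) can iterate until every entry in the result is true instead of guessing what "make a slug" was supposed to mean.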


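Technical Updates

GBrain: a 14-PR quality push in 72 hours

Garry Tan shared GBrain's latest progress: 14 PRs merged in 72 hours, +28,746 / -1,173 lines (a net gain of 27,573 lines), with the version moving from v0.31.2 to v0.32.4. That is an unusually intense merge cadence.

Highlights include:

  • Facts Join System-of-Record (#885): +5,682 lines, the facts layer joins the system of record
  • Takes v2 (#795): +5,306 lines, the second version of the takes logic
  • Extract Facts During Sync (#796): +3,418 lines, facts are extracted while syncing
  • Functional-Area Resolvers (#859): +3,166 lines, resolvers organized by functional area
  • 5 New Embedding Recipes (#810): +1,818 lines, new vector embedding recipes
  • MCP Graceful Cleanup (#801): +1,863 lines, graceful cleanup for the Model Context Protocol
  • Thin-Client Auto-Upgrade (#816): +1,608 lines, automatic upgrades for the thin client
  • Multi-Source Threading (#808): +1,571 lines, threading across multiple sources
  • Canonical Model IDs (#844): +1,304 lines, a canonical scheme for model IDs
  • Doctor Fix-Wave (#804): +828 lines, a wave of fixes adapting 5 community PRs

In a separate post, Garry noted that GBrain has no traditional roadmap; instead it has git worktrees in conductor, with several directions developed in parallel. That setup is what makes parallel PRs possible.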
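
As a quick arithmetic check on the figures in this section, the short Python sketch below totals the added-line counts of the ten highlighted PRs and compares them with the overall +28,746 / -1,173 numbers. The per-PR counts are copied from the list; the variable names are illustrative.

```python
# Added-line counts for the ten highlighted PRs, keyed by PR number.
HIGHLIGHTS = {
    885: 5_682,
    795: 5_306,
    796: 3_418,
    859: 3_166,
    810: 1_818,
    801: 1_863,
    816: 1_608,
    808: 1_571,
    844: 1_304,
    804: 828,
}

TOTAL_ADDED = 28_746    # lines added across all 14 PRs
TOTAL_REMOVED = 1_173   # lines removed across all 14 PRs
TOTAL_PRS = 14

highlighted_added = sum(HIGHLIGHTS.values())
net_change = TOTAL_ADDED - TOTAL_REMOVED
other_added = TOTAL_ADDED - highlighted_added
other_prs = TOTAL_PRS - len(HIGHLIGHTS)

print(f"highlighted PRs: +{highlighted_added:,} lines")
print(f"net change:      +{net_change:,} lines")
print(f"share of added lines in highlights: {highlighted_added / TOTAL_ADDED:.1%}")
print(f"remaining {other_prs} PRs: +{other_added:,} lines")
```

The ten highlights account for 26,564 of the 28,746 added lines, so the four PRs not listed contribute the remaining 2,182.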

X / Twitter

31
amasad @amasad
Another day another massive JavaScript supply chain attack. Replit users are safe and protected.

Scott Kennedy ⠕: Number of @Replit users impacted by the Tanstack supply chain attack: 0

It's important to have good defaults set for your package manager. Replit doesn't make you learn that the hard way. We just do it for you.
AmandaAskell @AmandaAskell
You can now listen to me and Joe read out Claude's constitution as an audiobook. Working on adding the option of listening to it on fast mode :)

Anthropic: Claude's Constitution is now an audiobook, read by two of its authors, Amanda Askell and Joe Carlsmith.

It includes a Q&A on the writing process, the philosophies that shaped the document, and how it might change as models become more capable.

Listen at http://anthropic.com/constitution

petergyang @petergyang
Ok what kind of things should I try /goal for? Building 0-1? Refactoring? What context does it need to work well
petergyang @petergyang
Turns out everyone just loves shipping

Gokul Rajaram: TRANSCRIBE: DAY 1 IN THE BOOKS

Transcribe (link in comments) launched this morning.

Day 1 metrics: ~3000 visitors. ~1000 transcripts created.

The best part of the day was the inbox. Real users (still need to pinch myself that this product has users!) wrote in with bugs,
garrytan @garrytan
I guess GBrain doesn't have a roadmap

It has git worktrees in conductor
petergyang @petergyang
Why does SoCal have so much world class kids stuff (Disney, Legoland, Universal, Zoo, etc) but in the Bay Area we can't keep our only amusement park alive
garrytan @garrytan
GBrain update

14 PRs merged in 72 hours. +28,746 / -1,173 lines.

The highlights:

• #885 facts join system-of-record (+5,682) — the hot memory layer
• #795 takes v2 (+5,306) — rewritten from 100K-take production learnings
• #796 extract facts during sync (+3,418) — real-time hot memory
• #859 functional-area resolvers (+3,166) — routing table compression
• #810 5 new embedding recipes (+1,818) — closed a 17-PR cluster in one shot
• #801 MCP graceful cleanup (+1,863)
• #816 thin-client auto-upgrade (+1,608)
• #808 multi-source threading (+1,571)
• #844 canonical model IDs (+1,304)
• #804 doctor fix-wave, adapting 5 community PRs (+828)

Nearly 29K lines of production code merged in 3 days. v0.31.2 → v0.32.4. Eight version bumps.

Been busy.
garrytan @garrytan
Many such cases

Leen: Obsessed with the gstack setup by @garrytan 🤖

It turned my Claude into an entire team

23 specialists that run a full sprint:

Think → Plan → Design → Build → Review → Test → Ship

This changed how i build 👇

http://github.com/garrytan/gstack
garrytan @garrytan
Retweeted
Vijay
Seattle Turns Hostile to the Great Businesses It Made
Starbucks is moving jobs from Washington state to Tennessee, and it isn’t alone in looking elsewhere.
By Howard Schultz
"Washington state has been my home for more than four decades. I arrived in Seattle with dreams and ambition and ended up building Starbucks into a company known around the world. Many Pacific Northwesterners joined me in shaping the culture, benefits and brand of Starbucks—contributing not only to a business, but also the civic and entrepreneurial life of the area.
I am no longer a resident of Washington. My decision to leave had much to do with family choices and my stage of life. Still, I feel a responsibility to speak up about the business and job climate in a city and state that gave me so many opportunities.
Washington’s economic story over the past half century is extraordinary. Microsoft, Amazon, Costco and a host of other new companies transformed the state into a global center of technology, innovation and logistics. Entrepreneurs exported ideas worldwide. Capital flowed. Wages rose. Imported and homegrown talent flourished.
That ecosystem worked because risk‑taking was rewarded, growth was possible, and civic leadership—while imperfect—understood that private enterprise wasn’t the adversary of the public good. It was one engine for improving the public sphere.
That ecosystem is fractured today. Seattle and much of Washington face serious problems: chronic homelessness, disorder in core business districts, persistent budget deficits, declining public-school outcomes and a slowing technology hiring cycle. These challenges aren’t unique to the state—but Washington’s response to them is.
Seattle’s mayor, Katie Wilson, has chosen to cast business as a foil rather than a partner. Her socialist rhetoric vilifies employers, even while she continues to rely on them for revenue. She has encouraged residents who disagree with her policies to leave.
In the state capital, the Legislature and governor have confronted difficult fiscal trade-offs by emphasizing taxation rather than reform or performance management. The theory appears to be that prosperity can be mandated through redistribution rather than generated through growth.
Washington has a broken tax system. The reliance on sales taxes—10.55% in Seattle—is deeply regressive. The state needs to rewrite its tax code across the board in a way that ensures people and businesses alike pay their share.
But instead of reform, those in power have opted to increase the burden on businesses and successful entrepreneurs in ways that discourage them from growing within the state—at a moment when Washington’s economic situation is growing more fragile.
Microsoft and Amazon—once hiring engines—have slowed recruitment and reduced head counts as they race to build data-center capacity and compete globally. Starbucks recently announced it will shift hundreds of corporate roles to Tennessee.
These companies imported global talent at scale for decades, anchoring an interconnected system of suppliers and startups. As those businesses reduce their local role, Seattle has no clear answer to the question of what will provide the next set of jobs and revenue growth.
Cities and states don’t decline overnight. They drift when public safety, fiscal stability and economic vitality deteriorate together. Downtown vacancies reduce foot traffic. Declining foot traffic weakens small businesses. Employment falls. Revenue shrinks. Services erode. Confidence—something that’s hard to build and easy to lose—begins to evaporate.
Entrepreneurs are accustomed to accountability: If we fail to deliver value, we lose customers. If we misallocate capital, we absorb the loss. Government, too, should be judged by results, not intentions. In Washington, steadily increasing government spending hasn’t delivered commensurate results on a range of issues, from addressing homelessness and drug addiction to poor prospects for new high-school graduates.
Entrepreneurs take risks others won’t. We build before certainty exists. We hire before revenue is guaranteed. We invest locally, pay taxes and support civic institutions. When our companies succeed, entire regions benefit. America can’t afford to forget that.
Leaving doesn’t mean abandoning. My family foundation remains invested in Washington’s future, seeking to help the next generation achieve economic mobility and prosperity. But that future is linked to economic growth and job creation. Across the country, other states are competing for capital and talent by simplifying regulation, reforming tax systems and investing in workforce development. One important initiative comes from the bipartisan National Governors Association, helping states craft pro-entrepreneurship policies.
I hope Washington’s leaders will embrace these policies and forge a new compact—one grounded in job creation, sensible taxation and accountable public spending. Washington once embodied the future of the U.S. economy, and it can again. But the current government needs to learn that future entrepreneurs won’t be attracted by ineffective public systems, especially when joined with policy and political rhetoric that demonize businesses.
Mr. Schultz is a former CEO and chairman emeritus of Starbucks."
gdb @gdb
Having an agent in your meeting is such a futuristic experience:

OpenAI Developers: What if your team gave standup updates, and GPT-Realtime-2 moved the tickets?

garrytan @garrytan
http://x.com/i/article/2054062662225440768
garrytan @garrytan
It's not that AI lets you write code faster. Plenty of people have noticed that. It's that AI lets you verify at a level that was previously too expensive to sustain. The 90% testing threshold is magical, but it used to cost too much human willpower to reach.

Now it's free.

Garry Tan: http://x.com/i/article/2054062662225440768
gdb @gdb
AI for helping you build apps powered by AI:

OpenAI Developers: Codex can now help you build AI apps and agents faster with OpenAI APIs using the OpenAI Developers plugin.

swyx @swyx
Retweeted
Abhi
Re https://www.latent.space/p/unsupervised-learning-2026
The amazing episode that inspired this article.
@swyx @jacobeffron
amasad @amasad
This has been going on for 80 years, and heavily documented. The NYTimes finally scraped together enough integrity to publish something.


Nicholas Kristof: This is a hard article to read, but I hope you'll do so. I've spent some time reporting on widespread rape and other sexual violence of Palestinian male and female prisoners by Israeli authorities, and the article is now published. The assault victims were warned not to give
garrytan @garrytan
Retweeted
Vox
this article is long but worth chewing on. tldr:
last week garry merged 14 PRs in 72 hours. 29,000 new lines. he calls this a ratchet: every agent coding session ships tests, docs, and evals alongside the code. each change comes out more solid than the last. the pattern lives in gstack.
i've been seeing this in a few places lately. three weeks ago steipete (behind openclaw) ran 50 codex in parallel and closed 4000 issues. open-sourced it as clawsweeper. same shape, basically.
side by side: agents are loops, humans are events. humans get tired, get distracted, forget. agents don't. or really, the records they leave don't. the systems that actually compound in the agent era are all this shape: every loop ratchets forward one notch, and nobody knows where it stops.
result: faster, more reliable, complexity ceiling pushed up.
if the article felt too technical: just add "and write tests too" to every codex prompt. been doing it this week, the project feels totally different.
Garry Tan: http://x.com/i/article/2054062662225440768
ylecun @ylecun
Retweeted
Harry Sisson
Trump had one of his worst mental health episodes yet last night, posting over 55 times in 3 hours. Here is the list:
10:15 PM - Accuses Obama of attempting a coup in 2016
10:15 PM - Says Obama worked with CIA to overthrow Trump
10:15 PM - Reposts tweet saying Obama is a “traitor” and that he should be arrested
10:22 PM - Attacks dominion voting systems for 2020 election saying they switched votes
10:22 PM - Says Fulton County, GA had their 2020 fraud exposed (there was none)
10:23 PM - Accuses Obama of personally making $120 million from Obamacare (wtf?)
10:23 PM - Cites quack lawyer Sidney Powell on the 2020 election
10:24 PM - Posts fake JFK Jr account that says Obama wiretapped Trump Tower
10:27 PM - Demands Senator Mark Kelly resign
10:29 PM - Claims neither Biden nor Harris were in charge of the Biden admin
10:29 PM - Attacks Fulton County, GA again
10:29 PM - Posts Fox News clip of Rep Ro Khanna
10:30 PM - Demands Jack Smith be arrested
10:30 PM - Accuses Obama, Clinton, and Comey of treason
10:39 PM - Reposts a tweet from a MAGA account saying they have secret intel proving Clinton and Obama committed crimes
10:39 PM - Reposts a MAGA tweet saying Hillary Clinton should be sent to Haiti
10:40 PM - Says the DOJ is “working hard” to arrest his enemies for treason
10:40 PM - Reposts a tweet attacking his own DOJ and Todd Blanche for no arrests of political enemies
10:40 PM - Posts a TikTok video of people stealing from a convenience store
10:41 PM - Posts a TikTok of someone taking a Door Dash order
10:41 PM - accuses Obama, John Brennan, and Clinton of sedition and treason again
10:42 PM - Posts a video of a man on CCTV footage knocking over food a waiter was carrying
10:47 PM - Calls Obama the “most DEMONIC FORCE” in American politics
10:47 PM - Posts a tweet from Mike Flynn saying 2020 election wasn’t fair
10:49 PM - Attacks Dominion again claiming they stole the 2020 election (it wasn’t)
10:51 PM - Reposts a fake Charlie Kirk account that claimed Obama blocked Hillary Clinton from being prosecuted
10:53 PM - Claims Obama was part of Hillary Clinton’s emails in some way
11:28 PM - Claims a senior Democrat just testified under oath that Senator Adam Schiff leaked classified information
1:13 AM - Attacks the New York Times for reporting on the reflecting pool
This man is clearly not well.
garrytan @garrytan
Retweeted
Paul Graham
A friend's startup is growing at 93% a month. I pointed out that her net worth is also growing at 93% a month, and that she can thus feel, in her own life, the falsity of politicians' claim that you have to do bad things to get rich. They're just focusing on making users happy.
ylecun @ylecun
Retweeted
Justin Wolfers
Trump campaigned on bringing down the cost of living "starting on day one," and then: started a trade war; deported much of the farm workforce, bombed Iran, allowed healthcare subsidies to expire, cut food assistance, ran an interest-rate boosting deficit, and attacked fed independence.
garrytan @garrytan
Retweeted
Vapi
We've raised a $50M Series B to keep making everyday calls extraordinary.
After 1,000,000,000 calls and 10x enterprise ARR growth, Vapi's platform approach to voice agents is becoming the standard for businesses building human interfaces.
ID_AA_Carmack @ID_AA_Carmack
My reply to someone considering starting a video game company:

The distribution of possible rewards for starting a video game company are generally not very good today. The market is well served, and gaining a foothold requires strong execution on both business and product issues, along with a substantial amount of luck. Plan to burn through seven figures with a not-great chance of making it back.

If you do go for it, some bits of advice:

Identify your customers clearly before you start. Not just a broad community, but specific people, and imagine them as you make decisions.

Initially, build the smallest, most concise game you can imagine anyone paying for. It will still take much longer than you expect.

Once something exists, hill-climb the value. Hopefully you will have some elements that clearly bring joy to people, which you can magnify. There will inevitably be tons of things that people find confusing, frustrating, or just boring that you will need to fix.
garrytan @garrytan
Retweeted
Keith Humphreys
Very important lesson for Blue Cities here: Every change Lurie has made has been opposed by screaming, bullying activists -- ignoring them has been the path to policy success and extremely high popularity. @skaushik100
https://www.sfchronicle.com/election/article/sf-voters-daniel-lurie-22244870.php
rauchg @rauchg
Cool

sarah: Despite being told no, I'm open-sourcing TrustClaw.

You can now deploy a production-ready personal agent service with over 1000+ app integrations in a single command, straight to @vercel with npx @composio/trustclaw deploy

I was inspired by @openclaw to build a simple web app

garrytan @garrytan
Retweeted
Y Combinator
Superlog (@superlogYC) is the observability tool you're not supposed to open: a wizard sets up your logs, traces, alerts, and dashboards daily, and an agent investigates incidents and posts one mergeable PR per issue into Slack.
Sign up here: https://superlog.sh/
Congrats on the launch, @nicolomagnante and @arseniycodes!
https://www.ycombinator.com/launches/QKt-superlog-the-self-installing-observability-tool-that-fixes-your-bugs
alexalbert__ @alexalbert__
Retweeted
ClaudeDevs
Fast mode for Claude Opus 4.7 is now available in research preview on the API and in Claude Code.
garrytan @garrytan
Retweeted
Akshay 🚀
What actually is GBrain?
(Y Combinator CEO's personal agent brain)
Every agent memory tool you've seen solves a simple problem: store facts, retrieve facts.
GBrain solves a different one. It gives your agent a knowledge system that wires itself, enriches itself, and compounds while you're not even using it.
Here's what makes it fundamentally different from Mem0, Zep, LangMem, or a CLAUDE.md file.
The standard approach to agent memory is vector-based. Your agent stores memories as embeddings, retrieves them by semantic similarity, and that's the loop. Some tools add a knowledge graph on top.
GBrain flips the model entirely. The source of truth is a folder of markdown files. One page per person, one page per company, one page per concept. Every page follows the same two-part structure:
𝗖𝗼𝗺𝗽𝗶𝗹𝗲𝗱 𝘁𝗿𝘂𝘁𝗵 on top: your current best understanding, rewritten as new evidence arrives
𝗧𝗶𝗺𝗲𝗹𝗶𝗻𝗲 on the bottom: an append-only evidence trail that never gets edited
This is not a vector store with a markdown export. The markdown IS the system of record. You can open it in VS Code, edit it by hand, and 𝗴𝗯𝗿𝗮𝗶𝗻 𝘀𝘆𝗻𝗰 picks up the changes.
Now the part that makes this compound.
Every time a page is written, GBrain extracts entity references and creates typed relationship links: 𝘄𝗼𝗿𝗸𝘀_𝗮𝘁, 𝗶𝗻𝘃𝗲𝘀𝘁𝗲𝗱_𝗶𝗻, 𝗳𝗼𝘂𝗻𝗱𝗲𝗱, 𝗮𝘁𝘁𝗲𝗻𝗱𝗲𝗱, 𝗮𝗱𝘃𝗶𝘀𝗲𝘀. All deterministic, all regex-based, zero LLM calls.
The knowledge graph wires itself on every single write, without spending tokens.
So when you ask "who works at Acme AI?" or "what has Bob invested in this quarter?", the agent walks the graph instead of relying on vector similarity (which struggles with relational queries like these).
Search layers ~20 deterministic techniques in concert: intent classification, multi-query expansion, vector search, keyword search, reciprocal rank fusion, cosine re-scoring, compiled-truth boosting, and backlink ranking. Each catches what the others miss.
But the real unlock is the compounding loop.
GBrain has a 𝘀𝗶𝗴𝗻𝗮𝗹 𝗱𝗲𝘁𝗲𝗰𝘁𝗼𝗿 that fires on every message and captures entities in the background. Person mentioned once? They get a stub page. Three mentions across different sources? Web enrichment kicks in. After a meeting? Full pipeline.
The agent runs a 𝗱𝗿𝗲𝗮𝗺 𝗰𝘆𝗰𝗹𝗲 overnight: scans conversations, enriches missing entities, fixes broken citations, consolidates memory. You wake up and the brain is smarter than when you went to bed.
This is fundamentally different from memory systems that only store what you explicitly tell them to store.
Garry Tan (President and CEO of Y Combinator) built this to run his actual AI agents. It ships with 34 skills, runs on embedded PGLite (no server, ready in 2 seconds), and works as an MCP server for Claude Code, Cursor, and Windsurf.
GBrain: http://github.com/garrytan/gbrain
Garry Tan: http://x.com/i/article/2042922188924424198
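
The thread above names reciprocal rank fusion as one of the techniques GBrain's search layers together. As a rough illustration of that general technique (not GBrain's actual code), the Python sketch below merges two ranked result lists by summing 1 / (k + rank) for each document. The constant k = 60 is a commonly used default and an assumption here, and the page paths are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists into one.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. Documents are returned by descending total
    score, with ties broken alphabetically so the output is deterministic.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    # Hypothetical results from two retrievers for the same query.
    vector_hits = ["people/bob.md", "companies/acme-ai.md", "concepts/seed-round.md"]
    keyword_hits = ["companies/acme-ai.md", "people/alice.md", "people/bob.md"]

    for doc_id in reciprocal_rank_fusion([vector_hits, keyword_hits]):
        print(doc_id)
```

A page that ranks well in both lists (here companies/acme-ai.md) rises above one that tops only a single list, which is one way "each catches what the others miss" can play out in practice.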
petergyang @petergyang
Can you fix this @grok


Peter Yang: Why does SoCal have so much world class kids stuff (Disney, Legoland, Universal, Zoo, etc) but in the Bay Area we can't keep our only amusement park alive
petergyang @petergyang
This is pretty BS - @AnthropicAI you should take these guys down for using your assets

clauders.fun: /early_access

http://clauders.fun

live_in = 24hrs

garrytan @garrytan
Retweeted
Paul Graham
It's so weird how difficult it is for the left to understand that all you have to do to decrease housing costs is to let people build more housing.
M. Nolan Gray 🥑: For 45 years, Berkeley built virtually no new housing. By the mid-2010s, it was the most expensive college town in America. Shortly thereafter, YIMBYs took over and kicked off a building boom. Today, nominal rents are below 2018 rates—remarkable progress on affordability.
swyx @swyx
the best way to think about /goal is that you’re moving from prompting WHAT you want done, to HOW you want it evaluated
amasad @amasad
Going live from SaaStr!

Replit ⠕: Live From SaaStr: Kickoff (Day 1) https://x.com/i/broadcasts/1qKVmQePgaWxB

YouTube

0

No recent videos fetched on this date.