When you regain your will to power after a period of burnout or depression.
We want you to have a lot of AI!
Tibo: I don't know what they are doing over there, but Codex will continue to be available in both the FREE and PLUS ($20) plans. We have the compute and efficient models to support it. For important changes, we will engage with the community well ahead of making them.
Transparency
Re OK this one just dropped also
So many bugs to fix
Does anyone from @OpenAI want to share some tips on how to get image 2 to generate good infographics or have it match your brand style?
I haven't had any luck either.
Bojan Tunguz: I tried making an infographic using the GPT-image-2. Lots and lots of visually unacceptable artifacts. :/
Replit testified in support of the BASED Act: Stopping Big Tech from rigging software marketplaces (very unbased).
“Mum, can we have the SpaceX IDE?”
“No we have a space IDE at home”
Space IDE at home:
Kenneth Roth
The Trump administration files concocted charges against the civil-rights group Southern Poverty Law Center, claiming it defrauded donors by supposedly supporting extremist groups when it was paying informants to expose their misdeeds. https://trib.al/TmffK8E
Re It's a bug fix bonanza
The fact that @cursor_ai had to post-train its own model and is now partnering with xAI for compute shows, I assume, how hard it is to succeed in the ultra-competitive AI coding (or just general knowledge work) space without owning the full stack.
benahorowitz.eth
They put my father R.I.P. on a hate group list (insane, because he never hated anybody) and nearly destroyed his non-profit. It turns out that they are the biggest hate group in America. I hope they go to jail forever.
Marc Andreessen 🇺🇸: SPLC was one of the most powerful censorship forces in the country for decades. Lavishly supported by many big American companies for many years. This is astonishing, and deeply concerning.
The GPU NIMBYs are out of control
Dick Lucas 🇺🇸 Running for CA Assembly: Banning AI chat support so a small minority (Communication Workers of America) can keep their jobs at the expense of higher prices for the rest of us. Special interests control CA.
California's anti-business nanny-state reputation is undefeated. @AsmRickZbur
Agarwal for Congress
"In 2018...Khanna told the tech investors that some Silicon Valley engineers were privileged and more concerned about having their dry cleaning done for them, but the people in Ohio were “hungry.”"
Somehow both believable and unbelievable quote from a Congressman about the people in his own district.
Care to comment @RoKhanna ?
https://www.dailykos.com/stories/2026/4/20/800023587/community/c/
Abe Murray
Engineers build the world
(while others debate whether it is possible or preferable)
“Kingsbury implicitly assumes that incomplete understanding means we can't build. The entire history of engineering says otherwise.” -Garry
Love the builder mentality 🦾
Garry Tan: http://x.com/i/article/2045399189606273024
Ryan Mulligan
The face when you, another guy, and the CEO are the only people in the emacs slack channel at work.
Amjad Masad: “Mum, can we have the SpaceX IDE?”
“No we have a space IDE at home”
Space IDE at home:
Rob Henderson
"SF...is paradoxically conservative. People want it to remain just as it is...Even Haight-Ashbury, the epicentre of the hippie culture, adopted strict new zoning rules in the 1970s, one result of which is that its black population fell from 40% to only 5%" https://www.edwest.co.uk/p/the-city-of-luxury-beliefs
Matt Shumer
Opening another 100 alpha access spots. Last ones for a while.
First come, first serve!
(Oh, and you can use your Claude Max subscription so most tokens are free!)
Matt Shumer: Opening 100 more alpha spots for http://agent-s.app. First come, first serve.
This agent is just insanely powerful. And so damn easy to use.
Update on my lawsuit against Congressman Randy Fine.
He tried to weasel out of the case by unblocking me, but this isn’t just about me. It’s about Fine’s DISRESPECT for the 1st Amendment.
We will not stop until he’s forced to unblock all Americans and respect their rights.
Jenin Younes: We just had a hearing in federal court on our emergency motion in our lawsuit against Congressman Randy Fine on behalf of @amasad. I wrote about what happened in court on Substack- link below
ChatGPT Images works great from the mobile app, but when I try to generate images on @ChatGPTapp web, it often forgets it has access to the image tool and starts generating code instead, resulting in "images" like this lol
Seems like a bug, please fix.
LS was the first podcast cursor ever did
listen back to baby @amanrsanger when they were 5 people and pre-PMF
Latent.Space: “Cursor is the best product I've used in a while” - @MacCaw
“It's so elegant and easy.” - @AndrewMcCalip
“Coding with AI is getting insane.” - @MckayWrigley
The Latent Space pod is proud to present: the first podcast with @amanrsanger of @anysphere!
https://www.latent.space/p/cursor
wow
adi: A massive pile of rice, on ONE rice grain there is text reading "wOw"
- images-v2 in 4k
My GBrain is becoming autonomous 👀
Agarwal for Congress
.@RoKhanna is so worried about us he's text blasting people all over the country with these lies.
I'm so tired of having to respond to your usual BS, but truth dies in silence, so let's do it.
1. Nobody recruited me
2. Assuming you're referring to Garry's List, @garrytan is a self-made man who grew up in poverty in Fremont, and now employs tens of thousands of people and has created billions in value for pension plans. Do you want fewer or more Garrys?
3. Yes, we raised $400k in four weeks (!!!). Over half our contributions were under $300 (!!!)
4. There's no April 21st deadline. It's a self-imposed deadline to drive fake urgency. Spare people your bullsh*t.
5. I don't take PAC money, corporate money, or lobbyist money either. How strongly you emphasize it doesn't change anything.
The new color scheme is nice though!
Aaron Levie
If you read this and don't understand why it's happening, it's an opportunity to reset your understanding of how the real world works.
The real world will need a ton of help actually getting agents going in the enterprise. Companies have legacy tech stacks they need to modernize, data in tons of fragmented tools, knowledge that isn’t captured or digitized, and change management needed to actually utilize agents effectively. And they have to do all this while still running their business day-to-day, unlike startups.
This is why there is so much opportunity for companies (software or services) to actually deploy agents in specific domains and workflows. This remains a big opportunity for both existing services providers but also tons of new startups as well. Every new technology wave produces a new era of consulting firms that can deliver on that technology.
It’s also why the FDE model is going to be alive and well for a long time because companies will want to have their vendor actually help drive the change management and implementation for their new workflows.
The people aren’t going away. Far from it.
First Squawk: OPENAI WORKING WITH CONSULTING FIRMS, INCLUDING ACCENTURE, CAPGEMINI AND PWC, TO HELP SELL CODEX TO BUSINESSES- WSJ
JCat
Yours v1.0 code is officially released. You can now choose either OpenClaw or Hermes as the agent framework for your AI companion, and switch freely between them (OpenClaw ↔ Hermes) during use.
All your AI companion configurations will be automatically updated and migrated accordingly.
Remember to upgrade Hermes or GBrain to the latest version to enjoy their brand-new features 😆
@NousResearch @garrytan
JCat: Just released Yours, an AI companion that deeply remembers user preferences and personality, and evolves continuously over time.
Built on the OpenClaw framework @steipete, its personalized memory is powered by GBrain @garrytan, which serves as the foundational memory system for
“The past was alterable. The past never had been altered.” —George Orwell, 1984
T Wolf 🌁: How it started vs. how it's going. The hypocrisy is astounding. @BettyYeeforCA
Re It's a lot of work to get GBrain to instruct your OpenClaw/Hermes to do the right thing, but it's worth it
http://x.com/i/article/2046866228703363072
Paul Graham
The world is healing, and quite rapidly too. Now US universities would only reject Einstein 11% of the time.
David Rozado: DEI Requirements in Faculty Hiring Have Declined
New HxA Report: The share of U.S. full-time faculty job ads requiring applicants to address DEI fell from 25% in 2024 to 11% in 2025, a 56% relative decline.
Thread 🧵
Basically how I'm building all my features these days: Do it once in OpenClaw, then just run /skillify and it does it like that forever
Garry Tan: http://x.com/i/article/2046866228703363072
Vox
fed gbrain years of my email + calendar 10 days ago. same skillify loop has been running on my own agent since. the agent is learning me. i'm learning my own patterns back.
garry's thesis in one line: turn every failure into a skill with tests that run forever.
works outside agent code. you can wire this into your life.
Garry Tan: http://x.com/i/article/2046866228703363072
rewind
AI agent problem nobody talks about:
> no memory of past failures
> deterministic work done in latent space
> prompt tweaks instead of structural fixes
> right tool exists, agent ignores it and chooses cleverness instead
> skills created but never tested
> resolver table not updated
> two skills overlap
> API changes shape
> orphan skills eat context tokens and never run
> no daily health check
Pattern is always the same:
Agent makes mistake → you fix it in conversation → next session same mistake happens again
Full breakdown of how to turn every failure into a permanent structural fix👇
Garry Tan: http://x.com/i/article/2046866228703363072
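The loop the thread describes (conversational fix → persisted skill → regression case → daily health check) can be sketched in a few lines. Everything here is hypothetical scaffolding, not any particular harness's API: the `skills/` JSON layout, `skillify`, `health_check`, and the `run_agent` callable are all made-up names for illustration.

```python
import json
from pathlib import Path

SKILLS = Path("skills")  # hypothetical layout: one JSON file per learned skill

def skillify(name: str, instructions: str, test_input: str, expected: str) -> None:
    """Persist a fix as a skill plus a regression case (the 'tests that run forever')."""
    SKILLS.mkdir(exist_ok=True)
    (SKILLS / f"{name}.json").write_text(json.dumps({
        "instructions": instructions,
        "tests": [{"input": test_input, "expected": expected}],
    }, indent=2))

def health_check(run_agent) -> list[str]:
    """Daily check: replay every skill's regression cases; report the ones that fail."""
    failures = []
    for skill_file in SKILLS.glob("*.json"):
        skill = json.loads(skill_file.read_text())
        for case in skill["tests"]:
            if run_agent(skill["instructions"], case["input"]) != case["expected"]:
                failures.append(skill_file.stem)
    return failures
```

The point of the sketch is the structural move: the fix lives on disk with a test attached, so the next session inherits it instead of relearning it in conversation.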
Mayank Vora
Holy shit…Karpathy dropped autoresearch and the internet rebuilt it 40 different ways in weeks.
Someone just cataloged every single fork, port, and descendant in one place.
Here's what the community built on top of it:
→ A macOS fork for Apple Silicon that runs the full loop on M-series chips
→ A Windows RTX version for consumer NVIDIA GPUs with VRAM floor configs
→ A WebGPU port that runs the entire experiment loop in your browser
→ A multi-GPU version with crash recovery and adaptive search strategy
→ A Colab/Kaggle T4 port for people who want to run it for free with zero local setup
Then it got stranger.
People started applying the loop to completely different domains.
→ A trading agent optimizing prompts against rolling Sharpe ratio instead of model loss
→ A genealogy researcher that iteratively expands and verifies family history
→ A Spring Boot service that grew from 119 lines to 950 in 5 autonomous cycles
The original idea was: give an AI a metric and let it self-improve until it wins.
Turns out that idea works on almost anything.
1.1k stars. 100% open source.
Repo: https://github.com/alvinreal/awesome-autoresearch
Carlos E. Perez
Garry Tan coins a new word: Skillify. Totally on point that skill development is nowhere close to optimal. Skill development is a new kind of UX design where the user is an AI agent.
Garry Tan: http://x.com/i/article/2046866228703363072
Peter Yang
"In the 1950s, we met users at a bank. In the 70s, an ATM. In the 90s and 2000s, a website and a mobile app. Today, it's APIs and MCPs."
Here's my new episode with @rywiggs (Mercury's VP of Product) where he shares:
✅ How to build great APIs + MCPs for agents
✅ How to create a Claude Code second brain to 2x your productivity at work
✅ What @mercury's data reveals about OpenAI and Anthropic's race for the enterprise
Some quotes from Ryan:
"Don't start with the MCP. Start with the foundation. Build great APIs first."
"I pulled 5M words from my last 5 years of PM work into Claude Code (using QMD search). That's the base of my second brain."
"After meetings, Claude tells me when I did something from my performance review. It keeps me accountable daily."
📌 Watch now: https://youtu.be/KzqpK1uCczw
Thanks to our sponsors:
@WisprFlow: Don't type, just speak https://ref.wisprflow.ai/peteryang
@linear: The AI agent platform for modern teams https://linear.app/behind-the-craft
Re Also available on:
Spotify: https://open.spotify.com/episode/0Qd0u6NYXdTKUTb8iaqcgS?si=QDc8dpLqT3S3EJ17S_75AA
Apple: https://podcasts.apple.com/us/podcast/behind-the-craft/id1736359687?ign-itscg=30200&ign-itsct=podtail_podcasts
Newsletter: https://creatoreconomy.so/p/how-to-build-for-ai-agents-and-a-claude-code-second-brain
Forrest Brazeal
“Funny and distressingly realistic...propelled by awesome characters and inventive twists” — @andyweirauthor
Silicon Valley invents the time machine in my upcoming book PARADOX INC, now available for preorder everywhere!
Here's a look inside from @people: https://people.com/paradox-inc-cover-reveal-exclusive-11955668
Big Brain AI
Yann LeCun (AMI Labs Founder): "The AI industry is completely LLM-pilled. Everybody is working on the same thing. They're all digging the same trench."
LeCun explains why no lab dares break from the pack:
"They are stealing each other's engineers. So they can't afford to do something different because if they start going on a tangent, they're going to fall behind the other guys. And so they're all doing the same thing."
This groupthink is exactly what drove him out of Meta.
"Meta also became LLM-pilled with sort of recent reshuffling. And it's fine, a strategic decision that maybe makes sense for them. It's just not what I'm interested in."
For @ylecun, the problem runs deeper than strategy.
LLMs are missing something essential about how intelligence actually works:
"I cannot imagine that we can build agentic systems without those systems having an ability to predict in advance what the consequences of their actions are going to be. The way we act in the world is that we can predict the consequences of our actions and that's what allows us to plan."
His broader critique is that the industry has mistaken fluency for intelligence.
Language turned out to be the easy part. The hard part is the physical world.
It's why we still don't have domestic robots or level-five self-driving cars, even though today's systems can pass the bar exam and write code.
Viv
a bunch here where I’m saying ok Garry’s kinda right?! 👀…in some ways :) we’re making this loop much easier to close out of the box soon
If more people get into evals & traces to ground self-improving agents from Garry’s posts, there’ll be no one happier than me
have written about this at length so will also share some linked materials for anyone (including your Clanker) who wants to dig into more details of building evals & self-improving agent systems:
Traces + Evals are the lifeblood of agent improvement loops
We point compute at traces so we can classify what agents did wrong. Yes, but the hard part is figuring out what the error even was and how to fix it in a way that actually generalizes over time (not playing whack-a-mole with if-else statements all the time). Is our agent a bad long-horizon planner for X tasks? Should we change the model, add better planning instructions, or use subagents to isolate context because these types of tasks bloat the main window?
Evals encode the behavior we want agents to have in production. Generating evals from traces is how we figure out how to measure the changes we’re making over time. This is why we lean so hard into Tracing + Evals tooling with LangSmith (more coming soon on making this loop even easier!).
Skill Learning is ONE great Way to Codify Trace Learnings into Context for your Agent
“skillify”/SkillLearning is great, agreed!! (see our /remember youtube video below + blogs on hill climbing coding agents), love that Garry’s discovering Skill Learning from Traces as a mechanism for fixing agent mistakes. Skills are semantic bundlers so they basically encompass everything needed to accomplish a goal in one folder like instructions and code. This reduces search in aggregating cross-source information. Skills have built-in context engineering with progressive disclosure which helps many users.
Skills are great, I love them and we use them heavily, but just a note that there are other approaches you can use to fix errors in production trace data. We discuss them briefly below!
Things to think about more deeply:
Context Engineering Still Matters even with Skills & Resolvers
We still need good context engineering! If you bloat your context window with TONS of skills that are hard for an agent to disambiguate when to use, then the “Resolver” mechanism will suck and you're back in context-rot world. “Resolvers” are classifiers of intent; you need to protect your context window and make sure the “rules” in the table are self-consistent over time and not massively bloating context.
Good context engineering is often a search problem! We need to find the right context and pass it into the computation boundary → the context window. The better we do that without confusing the agent, the better our results.
Maybe that looks like Skill Search?! Maybe similar skills should get merged or subagents should actually spend more compute doing proper skill research and disambiguation. If we use Skills as the primary agent update mechanism, then we need to think about how this works with context as we use agents across month and year timescales.
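A minimal illustration of what "skill search" could look like: instead of loading every skill into the context window, score each skill against the task and surface only the top-k. The word-overlap scoring below is a deliberately naive stand-in for embedding retrieval; `select_skills` and the skill dict are hypothetical names, not anyone's actual API.

```python
def select_skills(task: str, skills: dict[str, str], k: int = 3) -> list[str]:
    """Naive 'skill search': score each skill description by word overlap with
    the task and return only the top-k skill names, instead of loading every
    skill into context at once."""
    task_words = set(task.lower().split())
    scored = sorted(
        skills.items(),
        key=lambda item: len(task_words & set(item[1].lower().split())),
        reverse=True,
    )
    return [name for name, _ in scored[:k]]
```

The design point is that the resolver stays cheap and the context window only pays for skills that plausibly apply, which is what keeps the table from rotting as the skill count grows.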
Building in Higher-Level Primitives
I love Skill-Learning, but often it's a whack-a-mole solution if not managed properly. For example, if you wanted to build an ultra-long-horizon coding agent (think Factory Missions or something on Frontier-SWE), then you need to think through the harness architecture of how to work backwards from the goal, like how to recursively use subagents & planning, or how to manage & share context in a filesystem. Traces often help you uncover local issues and skills help you solve those, but it's very important today to think about agent architecture and working backwards from big problems to avoid the potential local minima of Skill Learning. It's tbd how much compute you need to use to uncover good agent architecture primitives to solve very hard problems. Skill Learning to fix scoped problems is great in the meantime and maybe can get us much further with smarter models.
Evals Alongside/Beyond LLM as a Judge
The hardest part of all this is by far figuring out what actually went wrong across Traces at scale, plus testing whether the proposed fix works over time! Does it work across models? Does it continue to work if you change something else in the system prompt or add another skill? Evals codify the case so it can be detected in real time (Online Evals/Monitoring). We need to test this stuff, which is why I like using LLM as a Judge as Garry mentions, but there's much more we can do (programmatic evals, multi-turn cases, containerizing the eval environment to faithfully reproduce what went wrong) - great start, happy to help extend to make your agents better :)
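As a toy example of the programmatic-eval idea (as opposed to LLM-as-judge): pair each prompt with a deterministic check, so any model or prompt change can be re-verified against the whole suite. `EvalCase` and `run_evals` are illustrative names I made up, not LangSmith APIs.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    prompt: str
    check: Callable[[str], bool]  # programmatic pass/fail, no judge model needed
    name: str

def run_evals(agent: Callable[[str], str], cases: list[EvalCase]) -> dict[str, bool]:
    """Run every case against the agent and report pass/fail per case, so a fix
    can be re-verified after any model, prompt, or skill change."""
    return {case.name: case.check(agent(case.prompt)) for case in cases}
```

A usage sketch: `run_evals(my_agent, [EvalCase("What is 2+2?", lambda out: "4" in out, "arith")])` returns a per-case dict you can diff across model versions to catch regressions.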
Could write on this for days but I promise you, we’re thinking SUPER hard about primitives for self-improving agents, mining data from Traces, agent-first tooling that makes this possible, and basically any ways we can be helpful to help builders create the best agents in the world.
We have a lot coming soon, reach out if I can help, let’s cook 🚀
Garry Tan: http://x.com/i/article/2046866228703363072
This cycle below is what has replaced 50% of my agentic coding. This is now how I am building GBrain and my own personal mini-AGI with full context on me and the things I care about.
It's not hard. It's quite fun. I do something, anything with OpenClaw, then I say SKILLIFY IT
Garry Tan: http://x.com/i/article/2046866228703363072
Been hearing wild stuff from folks inside big companies lately.
Promotions, firings, and perf reviews are getting decided by tokens consumed and skills/MCPs connected. That’s the metric. That’s how they’re deciding who’s “good at AI.”
It gets worse. People are literally running loops to burn tokens and look productive. Doing nothing, racking up “usage,” getting rewarded for it.
Meanwhile the person actually shipping with 2 skills and 50M tokens looks like a laggard next to the one who burned a billion tokens producing nothing.
These companies are walking into a death spiral and don’t see it.
The funniest part? Measuring actual output is easier than ever. You have AI. Use it.
In 18 months the same execs will announce “AI didn’t deliver ROI” and pull the budget. AI will have worked fine. They just measured the wrong fucking thing and torched millions rewarding theater over output.
Every company should be pushing AI hard. But this is how you guarantee it fails.
Steve Stewart-Williams
Smarter People Are Less Violent
"The prevalence of violent behavior dropped steadily with increasing IQ: 16.3% of individuals with IQs in the 70-79 range reported violent behavior, compared with just 2.9% of those with IQs of 120-129."
https://www.stevestewartwilliams.com/p/smarter-people-are-less-violent
Zain Shah
Imagine every pixel on your screen, streamed live directly from a model. No HTML, no layout engine, no code. Just exactly what you want to see.
@eddiejiao_obj, @drewocarr and I built a prototype to see how this could actually work, and set out to make it real. We're calling it Flipbook. (1/5)
Todd Hanford
I love the idea of "thin harness, fat skills". AI engineers are starting to coalesce around the fact that better outputs and longer running agents are enabled by:
- Skills which use deterministic tools
- Robust regression testing using evals, unit tests, E2E tests, and smoke tests
The new DRY principle is to turn single failures into skills instead of correcting your agent every time it makes the same mistake.
Garry Tan: This cycle below is what has replaced 50% of my agentic coding. This is now how I am building GBrain and my own personal mini-AGI with full context on me and the things I care about.
It's not hard. It's quite fun. I do something, anything with OpenClaw, then I say SKILLIFY IT
Caroline Orr Bueno, Ph.D
“In the end the Party would announce that two and two made five, and you would have to believe it.” -George Orwell, 1984.
Aaron Rupar: RFK Jr: "President Trump has a different way of calculating percentages. If you have a $600 drug and you reduce it to $10, that's a 600% reduction."
Tibo
Team is hard at work together with @steipete to make OpenAI models and the ecosystem the obvious way to enjoy your claw. A lot more to come next week, but a reminder that you can use OpenClaw as part of your ChatGPT subscription today already.
(also still having too much fun with ChatGPT Images 2.0 today)
pash: I've embarked on a new sprint. My mission is to make OpenAI models feel magical in OpenClaw in the next few weeks.
Diving in today, I noticed a bug. When you configured OpenClaw to use the Codex harness with OpenAI models, auth was broken, and the system was silently falling
Build finance apps with Plaid + Replit!
Replit ⠕: Build your own finance app with Replit using @plaid
Plaid is now natively integrated into Replit, giving you secure, real-time access to your financial data.
Connect your accounts → prompt what you want → get a working app.
Spending dashboards, AI financial assistants,
Perplexity
We've published new research on how we post-train models for accurate search-augmented answers.
Our SFT + RL pipeline improves search, citation quality, instruction following, and efficiency.
With Qwen models, we match or beat GPT models on factuality at a lower cost.
These are cool! I think most companies will want to use them.
OpenAI: Introducing workspace agents in ChatGPT—shared agents that can handle complex tasks and long-running workflows across tools and teams.
aacash.eth - Aakash Kumar
Skills and agent definitions are a great primitive to start with for context engineering. Love that @garrytan’s body of work is bringing more folks into the fold of NLAH (natural language agentic harness).
In parallel, for higher-level autonomy and more elaborate tasks (temporal nature + provenance + deeper reasoning interlaced with workflows), engineers at many cos are pushing boundaries on custom orchestration and ‘harnessing’.
Golden days of the agentic era are just starting! LFG 🚀
Garry Tan: http://x.com/i/article/2046866228703363072
Build workspace agents for your team, on top of a cloud-hosted Codex harness. Hook them up to tools, give them recurring tasks, and talk to them from surfaces like Slack.
Easier than ever to bring the power of agents to your computer work.
OpenAI: Introducing workspace agents in ChatGPT—shared agents that can handle complex tasks and long-running workflows across tools and teams.
Republicans against Trump
The American people want Donald Trump, the most corrupt president in history, impeached and removed from office.
According to a new poll published this week, 55% of Americans support impeachment, while just 37% oppose it. Notably, 1 in 5 of Trump’s own voters also support impeachment.
R A W S A L E R T S: 🚨 BREAKING: Democrats now projected to impeach Trump — 66% chance.
Replit ⠕
Introducing Race to Revenue.
Follow real founders around the world for a once-in-a-lifetime opportunity to build and launch products live on camera. But whose app will prove itself with cold, hard revenue?
Out now. Let's race. ⠕
Mikhail Parakhin
Had a great conversation with @swyx on @latentspacepod about what we're building at @Shopify. SimGym, Tangent, our approach to PR review at 30% month-on-month merge growth, and why larger models are cheaper in the long run.
Swyx asks good questions!
https://www.youtube.com/watch?v=RrkGoX3Cw7o&list=PLWEAb1SXhjlfkEF_PxzYHonU_v5LPMI8L&index=1
Junfan Zhu 朱俊帆
🦤 LeWorldModel: Learning Physics from Pixels — Stable World Models with Just Two Losses
World models:
1️⃣ DINO-WM: pretrained ViT encoder (from ImageNet) → features → predictor. But encoder is frozen, so no end-to-end learning. Its “visual genetics” are tuned for coarse classification (cats vs dogs), not physics: hard to resolve mm-level changes (e.g., 2 mm block motion). A powerful predictor on top of a “myopic” encoder = blind physical reasoning.
2️⃣ PLDM: end-to-end, but unstable and collapse-prone. Rely on reward as prediction target, so it only works in environments with explicit rewards (e.g., games).
3️⃣ JEPA (Joint Embedding Predictive Architecture): predict next latent instead of pixels. Two hard problems:
collapse (encoder → constant vector, e.g., all zeros)
achieving pixel-level + end-to-end + stable jointly
💡 LeWM solves:
👉 JEPA that trains stably end-to-end from raw pixels
👉 Single hyperparameter λ:
next-embedding prediction
SIGReg (Gaussian regularization)
🧠 #1: true end-to-end
No frozen encoder. Perception + dynamics co-evolve → representation aligned with fine-grained physics, not ImageNet bias.
🧠 #2: “only” one hyperparameter
PLDM needs ~6. LeWM needs 1 (λ) → weight of SIGReg. Plug-and-play, stable.
⚠️ Collapse problem
Encoder could map all inputs → same vector → trivial prediction → zero loss → useless model.
🧩 SIGReg (Gaussian Integral Signature Regularization)
Core: prevent collapse via distribution constraints.
Sample 1024 random directions
Project embeddings → 1024 1D “shadows”
Each must pass Epps–Pulley test (≈ standard normal)
Loss pushes test statistic → 0
Any failed projection ⇒ penalty
Why it works:
Cramér–Wold theorem → a high-dim distribution is determined by its 1D projections.
👉 Enforcing Gaussianity across 1D projections precludes degenerate collapse under projection constraints
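A simplified sketch of the projection idea as summarized above: project a batch of embeddings onto random unit directions and penalize each 1D "shadow" for deviating from a standard normal. For brevity this matches only the mean and variance of each projection rather than computing the Epps–Pulley statistic the paper uses, so treat it as an assumption-laden toy, not LeWM's actual loss.

```python
import numpy as np

def sigreg_penalty(embeddings: np.ndarray, n_dirs: int = 1024, seed: int = 0) -> float:
    """Toy SIGReg-style anti-collapse penalty: project embeddings onto random
    unit directions and penalize each 1D projection's deviation from N(0, 1).
    Moment matching stands in for the full Epps-Pulley normality test."""
    rng = np.random.default_rng(seed)
    d = embeddings.shape[1]
    dirs = rng.standard_normal((n_dirs, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)  # 1024 unit directions
    shadows = embeddings @ dirs.T                        # shape: (batch, n_dirs)
    mean_dev = shadows.mean(axis=0) ** 2                 # want mean ~ 0
    var_dev = (shadows.var(axis=0) - 1.0) ** 2           # want variance ~ 1
    return float((mean_dev + var_dev).mean())
```

A collapsed encoder (all inputs mapped to the same vector) produces constant projections with zero variance, so every direction is penalized; genuinely Gaussian embeddings score near zero, which is the Cramér–Wold intuition in miniature.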
🧪 Physical probing
Train in PushT (push block to target), then:
Linear probe recovers: block position, angle, end-effector
👉 physics is linearly decodable
🚨 Teleport block (physically impossible):
embedding anomaly spikes sharply
👉 model internalizes constraint: objects cannot teleport
👉 not inferred from pixel surface features, but encoded as latent constraints
📈 Temporal straightness
No smoothness loss, yet trajectories in latent space are ~straight lines
👉 no prior, purely from “predict next embedding”
👉 implies physically consistent motion, not blurry interpolation
⚡ Performance
Planning: 0.98s vs 47s (DINO-WM)
Success: 96% vs 78% (PLDM)
Why faster?
DINO-WM: frozen encoder → info loss → extra online passes
LeWM: end-to-end → representation already task-aligned
👉 0.98s = fast to handle dynamic obstacles & real-time control
⚠️ Limitations
~15M params (“ant-scale”) → fails on OGBench-Cube (complex physics)
not yet tested on real robots
🔥LeWM shows:
👉 JEPA + SIGReg = stable world models
👉 raw pixels → physics-aware latent space
👉 minimal design (2 losses, 1 hyperparameter)
Next step: scale + real-world deployment 🤖
Junfan Zhu 朱俊帆: http://x.com/i/article/2047025326879072256
Karri Saarinen
I might be biased but 90% of my work AI use has moved to @linear recently:
- pull daily report what should I pay attention to
- recent user frustrations or trend needs
- check launch dates on projects
- make fixes on the product with the coding agent
- reflect my specific thoughts against product memo
- writing investor updates based on our progress
- ask about specific features to debug user issues
- pull specific follow ups from meeting transcripts
- write project update based on the meeting we had
- prep for customer call based on the brief I got, and the plans we have
- research new features based on customer requests
- research revenue opportunity based on some of the features and customers we have
- find latest trends on bugs
- write blog post about a feature on my phone
- set up project, docs, milestones and issues from feature research
- create issues to project from our “roast” feedback meeting
I could do a lot of these in other tools as well, but I like that I can work in the context and at-mention specific documents, issues, teams, projects, or files when I'm chatting.
Then actually start making plans or work to make changes or assign people on things.
I also have set up the same writing guidance and skills that I have in other tools but somehow feel Linear understands me better.
I feel like I'm not working in a void but with some structure around me, and I can flip between the agent and the structure. And it's all about work & Linear, not about my personal questions or topics.
Tony Dang
A dream come true for every human anxious about their agents leaking secrets.
Agent Vault aspires to be the portable solution that you can bring anywhere: on-prem, cloud, any container environment.
Front your agent with Agent Vault and let it rip.
https://github.com/Infisical/agent-vault
Infisical: Any secret an agent can read is a secret an attacker can steal.
So we built the fix: Agent Vault, an HTTP credential proxy and vault for AI agents.
Secret managers were built for deterministic services. They return credentials to the caller and trust them to behave.
AI agents
Latent.Space
🆕 Shopify's AI-Native Engineering: 100% adoption, unlimited tokens, Tangle, Tangent, & SimGym
https://www.latent.space/p/shopify
@Shopify CTO @MParakhin explains why near-universal AI adoption is changing how Shopify builds, how SimGym uses Shopify’s scale and historical data to simulate customers, how Tangle and Tangent are changing experimentation inside the company, why the real bottleneck in AI coding is now PR review and CI/CD, and why unlimited tokens and auto-research loops are unlocking gains across every domain.
Sometimes apps you made that are secure might suddenly become vulnerable when there is an exploit in one of their dependencies.
Typically you need engineers on payroll to monitor and handle this.
We just automated that with Auto-Protect. It’s like your security engineer 24x7.
Replit ⠕: Keeping your apps secure has always required constant oversight from you.
Replit Auto-Protect now keeps watch over your apps 24x7.
We'll monitor threats, proactively prepare fixes and notify you to apply those fixes, even when you are away.
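As a rough illustration of the underlying check (my own sketch, not Replit's implementation; the `leftpadx` advisory entry is made up), automated dependency monitoring boils down to comparing pinned versions against a feed of known-vulnerable ranges:

```python
# Illustrative sketch: flag dependencies whose pinned version falls below
# the first fixed release listed in a (here, hypothetical) advisory feed.

ADVISORIES = {
    "leftpadx": {"fixed_in": (1, 2, 0)},  # hypothetical: versions below 1.2.0 are vulnerable
}

def parse_version(v: str) -> tuple[int, ...]:
    return tuple(int(part) for part in v.split("."))

def vulnerable(deps: dict[str, str]) -> list[str]:
    """Return names of dependencies pinned to a known-vulnerable version."""
    hits = []
    for name, version in deps.items():
        advisory = ADVISORIES.get(name)
        if advisory and parse_version(version) < advisory["fixed_in"]:
            hits.append(name)
    return hits
```

A real service would pull the advisory feed continuously and re-run this check on every update, which is the part that otherwise needs an engineer watching.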
Had no idea but there is a lot of simultaneous discovery in agentic engineering these days
Turns out the ideas about SKILLIFY from my post at 2am last night are similar to a Claude Code internal!
JC: this is interesting, was just reading the "claude code" leaked source code yesterday and they have a bundled skill called "skillify" there.
Mostly the same purpose, which I believe is a trend we're going to see pop up more:
Self-Improving Agents
Thin Harness Fat Skills Fat Code = THE NEW DRY
Todd Hanford: I love the idea of "thin harness, fat skills". AI engineers are starting to coalesce around the fact that better outputs and longer running agents are enabled by:
- Skills which use deterministic tools
- Robust regression testing using evals, unit tests, E2E tests, and smoke tests
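A minimal sketch of the "thin harness, fat skills" shape (my own illustration, not any specific framework's API): the harness is just a registry and a dispatcher, while each skill is a deterministic function you can regression-test directly.

```python
# Thin harness: registry plus dispatch, nothing model- or task-specific.
# Fat skills: deterministic functions that do the actual work.

SKILLS: dict = {}

def skill(name: str):
    """Register a function as a named skill."""
    def register(fn):
        SKILLS[name] = fn
        return fn
    return register

@skill("word_count")
def word_count(text: str) -> int:
    # Deterministic tool: same input, same output, trivially unit-testable.
    return len(text.split())

def run_skill(name: str, *args):
    # The harness only looks up and calls; all logic lives in the skills.
    return SKILLS[name](*args)
```

Because each skill is an ordinary function, the evals, unit tests, and E2E tests mentioned above can target skills individually without involving the agent loop at all.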
OK the aspiration is GBrain installs and gives you all this stuff
Right now it does still require you to say a lot of "Help my openclaw take advantage of all the things in the GBrain repo" versus something that is like GStack where AskUserQuestion just keeps you in flow
JZ: Been following Gary's work for a bit
Finally bit the bullet and put gbrain in next to Openclaw yesterday
My thoughts:
It's less an 'install this plugin and everything works' tool and more of a customizable framework
Maybe that's not the best way to describe it but it was super
This diagram is kind of cool except it hallucinated my face
Delia Dou: The images GPT produces look really good!
Many startups launched on Replit are now $100m+ — we now have a startup program with $25k free credits.
Replit ⠕: Replit for Startups is live. Up to $25K in credits to build your product, ship fast, and scale.
If you're building on Replit, this is for you. 🚀 http://replit.com/startups
Interactive charts and diagrams are now in Claude Cowork.
Available in beta on all paid plans.
Claude: Claude can now build interactive charts and diagrams, directly in the chat.
Available today in beta on all plans, including free.
Try it out: http://claude.ai
Francisco Cruz Mendoza
Excited that @Replit is once again teaming up with @stripe for another awesome event in the Bay Area next week!
If you are in town for Stripe Sessions or just want to meet other builders, feel free to join us!
https://stripe.events/stripestartupsarcadenight
Walden
http://x.com/i/article/2046690715657478145
If you liked our Agent 3 documentary, this is next level, and focused on Replit Builders. It will be a series.
Replit ⠕: Introducing Race to Revenue.
Follow real founders around the world for a once-in-a-lifetime opportunity to build and launch products live on camera. But whose app will prove itself with cold, hard revenue?
Out now. Let's race. ⠕
Abhishek Ray
introducing opslane
test your claude code changes in a real browser
Inspired by @garrytan's GStack /qa skill.
- reads the specs to understand the feature
- builds acceptance criteria from them
- runs tests in a real browser against your local dev server
- full report with screenshots
Static code review tools only review the code. Opslane runs it.
Fully open source.
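The first two steps above can be sketched roughly as follows. This is my illustration, not opslane's code, and it assumes (hypothetically) that specs phrase requirements as "should ..." sentences: read the spec, then turn each such sentence into an acceptance criterion for the browser run.

```python
import re

# Hypothetical sketch of spec -> acceptance criteria: each "should ..."
# sentence in the feature spec becomes one testable criterion.

def acceptance_criteria(spec: str) -> list[str]:
    """Extract each 'should ...' sentence from a spec as a criterion."""
    return [s.strip() for s in re.findall(r"[^.]*\bshould\b[^.]*\.", spec)]

spec = ("The login form should reject empty passwords. "
        "It should redirect to the dashboard on success.")
criteria = acceptance_criteria(spec)
```

The remaining steps, driving a real browser against each criterion and attaching screenshots to a report, are what distinguish this from static review, since only a running app can fail a "redirect on success" check.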
ChatGPT plugin now available for Google Sheets:
Ryan Brewer: Excited to announce ChatGPT for Google Sheets! This was a really fun one to work on. Create new sheets, ask questions across tabs and formulas, and make updates directly in your sheets https://chatgpt.com/apps/spreadsheets/
YIMBYLAND
This is not normal discourse.
We don’t have to accept this as an acceptable line of thinking. In fact, you should reject it with every fiber in your being.
These people hate normal Americans and the social norms that make our lives enjoyable and safe.
Team @Shopify brought some fire to this one; add this to the growing list of “WTF happened in Dec 2025” charts.
(This plots token usage across all of Shopify's technical staff. They had an unlimited token budget the whole time, but something cracked recently: the slope is changing and the percentile deltas are widening a concerning amount!!)
Mikhail Parakhin: Had a great conversation with @swyx on @latentspacepod about what we're building at @Shopify. SimGym, Tangent, our approach to PR review at 30% month-on-month merge growth, and why larger models are cheaper in the long run.
Swyx asks good questions!
https://www.youtube.com/watch?v=RrkGoX3Cw7o&list=PLWEAb1SXhjlfkEF_PxzYHonU_v5LPMI8L&index=1
had a great conversation with @shaneparrish, full podcast below
Shane Parrish: My conversation with @OpenAI co-founder @gdb
This is the most detailed first-person account of the 72 hours after Sam Altman was fired.
We also go deep on what comes next: the global race to AGI, why ChatGPT stopped showing reasoning, how much of OpenAI's own code is now
Karan Singhal
Today we’re introducing two big steps for health at OpenAI:
- ChatGPT for Clinicians, a free version of ChatGPT designed for clinical work
- HealthBench Professional, a new benchmark to evaluate real clinician chat tasks
We’re excited about what this can unlock for care. ❤️
This is 100% right. If you know your market, your customer, and what problem you're solving, then you have taste: you know what is good and what sucks.
You can now fly
Gideon Shalwick: Hot take:
Vibe coding doesn’t (fully) replace thinking.
It replaces hardcore, coalface coding.
If you want it to actually work, you still need:
- Deep understanding of what the market wants
- A clear user experience (not just “it works”)
- Real UI design with proper
darkzodchi
This 47-min interview with Boris Cherny (the creator of Claude Code) will teach you more about AI-native development than 6 months of trial and error.
Watch it, bookmark it, share it.
Your entire approach to building with Claude will shift.
bodila: http://x.com/i/article/2034716088756219904