Firm Conviction

Which Software Survives AI? A Scoring System

S&P 500 Software & Services is down 20%+ this month. About $1T in market cap erased. Every SaaS name is selling together — Palantir (+70% YoY revenue growth) dropped the same day as DocuSign (+8% revenue growth). The question everyone is now asking: which software companies survive AI?

Nobody has a systematic answer yet.

Citrini's "2028 Global Intelligence Crisis" laid out the bear case with enough mechanical detail that PMs could hand it to their risk committees. 14M+ views later, the piece isn't just describing a scenario — it's creating one. PMs reposition, the selling confirms the narrative, more selling follows. The scenario doesn't need to be right for the repricing to work. The reflexivity is already running.

But Citrini answered "what happens if AI kills SaaS?" He didn't answer "which SaaS does AI actually kill?" That's the question that matters for capital allocation, and it's the one nobody has a framework for yet.

We built one. It's early, the sample is small, and we're sharing it because we think the question is important enough that being wrong in public is better than being vague in private.

The thermodynamic principle

Start with physics. As inference costs drop 10-20x over the next two years, anything that CAN be done locally WILL be done locally. It's the lower energy state. This isn't a prediction — it's thermodynamics applied to compute. When running Llama locally costs essentially nothing, every software function that can collapse to local inference will collapse to local inference.

Chegg's homework answers. Grammarly's text editing. UiPath's process automation. A local model already does all of this. These companies sold access to a centralized capability that is now available everywhere for free. Their product became ambient.

But CrowdStrike ingests petabytes of real-time threat data across millions of endpoints to detect novel attacks. S&P Global's credit ratings are embedded in regulatory frameworks that legally require them. Snowflake runs multi-tenant data infrastructure at a scale no local model can approximate. These aren't functions — they're systems. Systems that require specialized infrastructure operating at scales that don't collapse to a laptop.

The question "which SaaS survives?" reduces to: what can't go local?

Two independent sources, same answer

We found two frameworks developed independently that converge on the same discriminator.

Steve Yegge's "The Anthropic Hive Mind" (Jan 2026). Yegge spent months talking to ~40 Anthropic employees and came away convinced that "2026 is going to be a year that just about breaks a lot of companies." His piece is impressionistic — vibes from inside the spaceship, not a formula — but the survival dynamics he describes are precise: as AI coding agents improve, buy vs. build shifts permanently. Agents generate small-to-medium SaaS on demand. The ecosystem selects for energy efficiency. We extracted six quantifiable levers from his analysis: crystallized complexity, substrate efficiency, broad utility, agent discoverability, desire paths, and a human connection coefficient.

Nicolas Bustamante, CEO of Fintool (Anthropic-backed). Bustamante built Doctrine (legal SaaS — the threatened category) and now builds Fintool (AI equity research — the threatening category). Ten years of vertical software from both sides of the disruption. His 10-moat taxonomy: moats 1-5 (interfaces, workflows, public data, talent, bundling) are destroyed or weakened by LLMs. Moats 6-9 (proprietary data, regulatory lock-in, network effects, transaction embedding) hold or strengthen. His 3-question screen: Is the data proprietary? Is there regulatory lock-in? Is the software embedded in the transaction? 0/3 = high risk. 2-3/3 = probably safe.
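
Bustamante's 3-question screen is simple enough to write down directly. The sketch below is ours, not his: the class and field names are hypothetical, and the "unclear" label for a 1/3 result is a placeholder, since the screen as described only names the 0/3 and 2-3/3 outcomes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoatScreen:
    """Answers to the three screening questions for one company."""
    proprietary_data: bool
    regulatory_lock_in: bool
    transaction_embedded: bool

    def score(self) -> int:
        # Count how many of the three questions are answered "yes".
        return sum([self.proprietary_data,
                    self.regulatory_lock_in,
                    self.transaction_embedded])

    def verdict(self) -> str:
        s = self.score()
        if s == 0:
            return "high risk"
        if s >= 2:
            return "probably safe"
        # 1/3 is not labeled in the screen as described; this is our placeholder.
        return "unclear"


# Hypothetical inputs, not an assessment of any real company.
print(MoatScreen(False, True, True).verdict())
```

The answers themselves are still judgment calls. The code only makes the counting rule explicit.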

An engineer who talked to Anthropic and an operator who built on both sides of the disruption — independently pointing at the same thing: infrastructure irreducibility. Software survives when it runs systems that can't be replicated locally. Software dies when its function collapses to inference.

The V-score

We formalized this into a quantitative scoring system. Six factors, weighted by discriminative power:

  • C — Crystallized Complexity. How much accumulated logic is baked into the system? Git, compilers, database engines — decades of crystallized cognition that's too expensive for AI to re-derive from scratch.
  • E — Irreducible Infrastructure. Does the system require specialized infrastructure at scale? Petabyte ingestion, real-time global networks, regulated data pipelines. This is the strongest factor by far.
  • U — Broad Utility. Swiss Army Knives amortize their discovery cost. A tool used for one thing competes with a free agent-built alternative. A tool used for twenty things has switching cost baked in.
  • A — Agent Discoverability. If AI agents don't know your tool exists, they'll build a worse version instead of calling yours. API quality, documentation, ecosystem presence.
  • M — Ecosystem Lock-in. How painful is it to switch the whole stack? Even when agents are smart, ripping out Workday's payroll integration from 3,000 enterprises is a multi-year project. We added this factor — neither Yegge nor Bustamante emphasized it, but it showed up strongly in the data.
  • F — Frontier AI Exposure (penalty). Companies directly competing with frontier model capabilities get penalized. If your core product is "we make AI do X" and the model itself starts doing X natively, you're in trouble.

Two gates filter before scoring: E must be above 1 (or you're dead regardless of other factors), and either A must be above 1 or the combination of C+E+U must be exceptional.
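
To make the gate logic concrete, here is a minimal sketch of how the six factors and two gates could fit together. Everything numeric in it is an assumption for illustration: the weights, the size of the F penalty, the weighted-sum form, and the cutoff for an "exceptional" C+E+U are placeholders, not the calibrated values.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical weights (sum to 1.0). The real weights are set by
# discriminative power and are not reproduced here.
WEIGHTS = {"C": 0.25, "E": 0.35, "U": 0.10, "A": 0.10, "M": 0.20}
F_PENALTY_WEIGHT = 0.20   # assumed size of the frontier-exposure penalty
EXCEPTIONAL_CEU = 12.0    # assumed cutoff for "exceptional" C+E+U (max 15)


@dataclass(frozen=True)
class Factors:
    """Factor scores on a 0-5 scale."""
    C: float  # crystallized complexity
    E: float  # irreducible infrastructure
    U: float  # broad utility
    A: float  # agent discoverability
    M: float  # ecosystem lock-in
    F: float  # frontier AI exposure (penalty)


def passes_gates(f: Factors) -> bool:
    # Gate 1: E must be above 1.
    # Gate 2: A above 1, or an exceptional C+E+U total.
    gate_e = f.E > 1
    gate_a = f.A > 1 or (f.C + f.E + f.U) >= EXCEPTIONAL_CEU
    return gate_e and gate_a


def v_score(f: Factors) -> Optional[float]:
    """Return a weighted score, or None if either gate fails."""
    if not passes_gates(f):
        return None
    base = sum(w * getattr(f, name) for name, w in WEIGHTS.items())
    return max(0.0, base - F_PENALTY_WEIGHT * f.F)
```

Returning None for a gated-out company, rather than a low number, keeps "failed the gate" distinct from "scored poorly", which matches the idea that a company with E at or below 1 is dead regardless of its other factors.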

We also found something we didn't expect. Yegge proposed a Human Coefficient — the idea that software with human connection (games, social) would be protected. When we scored it against known outcomes, it ran in the WRONG direction. Dead companies actually scored higher on human premium than survivors. "People love our product" doesn't protect you when an agent can replicate the experience. We removed it.

Calibration against known outcomes

We tested the V-score against about a dozen companies with clear outcomes. The first wave of AI disruption has already produced unambiguous winners and losers. The sample is small and we're honest about that. But the separation was clean.

Dead companies (Chegg, Grammarly, Stack Overflow, Appian, Zendesk) averaged V=1.27. Survivors (Datadog, MongoDB, CrowdStrike, Snowflake, Microsoft/GitHub) averaged V=3.74. Zero overlap in the sample.

E was the strongest discriminator:

Factor               Dead Avg   Alive Avg   Gap
E (Infrastructure)   0.4        4.2         3.8
C (Complexity)       1.6        4.4         2.8
M (Lock-in)          1.8        4.6         2.8

Dead companies averaged E=0.4 — their infrastructure could be replicated locally. Survivors averaged E=4.2 — they run systems at scales that physically can't go local. The gap is 3.8 on a 5-point scale. No other factor comes close.
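
The gap column is just the alive average minus the dead average. A short sketch of that arithmetic, using only the three factor averages reported in the table above (the function names are ours):

```python
# (dead average, alive average) per factor, copied from the table above.
FACTOR_AVERAGES = {
    "E": (0.4, 4.2),
    "C": (1.6, 4.4),
    "M": (1.8, 4.6),
}


def gaps(averages):
    # Round to one decimal to avoid floating-point noise like 3.8000000000000003.
    return {name: round(alive - dead, 1)
            for name, (dead, alive) in averages.items()}


def rank_by_gap(averages):
    # Largest separation between dead and alive groups first.
    return sorted(gaps(averages).items(), key=lambda kv: kv[1], reverse=True)


for name, gap in rank_by_gap(FACTOR_AVERAGES):
    print(f"{name}: {gap}")
```

With a sample this small, a difference in group means is a rough guide to discriminative power, not a statistical test.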

Caveats are real: a dozen companies is a calibration set, not a proper backtest. We haven't run it on a holdout sample. The boundaries between tiers are judgment calls. We could be overfitting to a small, obvious sample. But a 3.8 gap with zero overlap is enough signal to share the framework while we expand the dataset.

Scoring the current selloff

Applied to the SaaS names getting crushed right now:

Tier | Companies | P/ARR | V-score | Signal
Fortress | SPGI, FICO, ICE | N/A | | Regulatory + transaction embedded. E=5. Still selling off with everything else, which means even the strongest names are mispriced if the framework holds.
Infrastructure | CRWD, SNOW, DDOG | 19.8x, 12.2x, 10.6x | High | Petabyte-scale infra, hard to replace. Probably survive but priced for it.
Embedded | WDAY, NOW, MDB | 3.7x, 7.6x, 11.2x | Mid | Workflow lock-in with erosion risk. Potential mispricing: trading at dead-zone multiples with alive-tier infrastructure.
Dead zone | PATH, DOCU, OKTA | 3.5x, 2.7x, 4.4x | Low | Core function replaceable by agents. Cheap for a reason.

Compare the embedded tier with the dead zone. Workday (embedded) trades at 3.7x P/ARR and UiPath (dead zone) at 3.5x. Nearly identical valuations. But Workday runs Fortune 500 payroll systems embedded in every hire, fire, and direct deposit across thousands of enterprises. UiPath does robotic process automation, quite literally the first category LLM agents replaced. The market is pricing them the same because it hasn't finished thinking.

The selloff hasn't discriminated yet. SPGI (-23% 1M) is falling alongside PATH. That's the opportunity — when revenue divergence forces the market to separate survivors from casualties over the next 2-3 quarters, the fortress and embedded tiers snap back while the dead zone stays dead. The embedded tier is where the mispricing is sharpest — real infrastructure trading at dead-zone multiples.
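
One way to make the Workday/UiPath comparison mechanical is to flag any name outside the dead zone that trades at or below the highest dead-zone multiple. The sketch below uses the tiers and P/ARR figures from the table above. The flagging rule itself is an assumption for illustration, not part of the V-score.

```python
# (ticker, tier, P/ARR multiple), copied from the table above.
# Fortress names are omitted because their P/ARR is listed as N/A.
NAMES = [
    ("CRWD", "Infrastructure", 19.8),
    ("SNOW", "Infrastructure", 12.2),
    ("DDOG", "Infrastructure", 10.6),
    ("WDAY", "Embedded", 3.7),
    ("NOW", "Embedded", 7.6),
    ("MDB", "Embedded", 11.2),
    ("PATH", "Dead zone", 3.5),
    ("DOCU", "Dead zone", 2.7),
    ("OKTA", "Dead zone", 4.4),
]


def dead_zone_ceiling(names):
    # Highest multiple the market currently pays for a dead-zone name.
    return max(p for _, tier, p in names if tier == "Dead zone")


def flag_candidates(names):
    # Assumed rule: a higher-tier name priced at or below the dead-zone
    # ceiling is a candidate for mispricing under the framework.
    ceiling = dead_zone_ceiling(names)
    return [ticker for ticker, tier, p in names
            if tier != "Dead zone" and p <= ceiling]


print(flag_candidates(NAMES))
```

A flag here says only that price and tier disagree. Whether the tier assignment is right is the part that carries the real uncertainty.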

For the V-score applied to a specific company with primary source evidence, see the SPGI deep dive at V=4.24.

A note on the 2028 GIC scenario itself

Citrini's chain is well-constructed — SaaS disruption → white-collar displacement → private credit defaults (Zendesk's $5B facility as the smoking gun) → insurance/annuity exposure (Athene, Global Atlantic) → prime mortgage stress → systemic crisis. Each link is built with specific data and mechanical detail. Whether the full chain completes is a macro question we don't have edge on, and it's not the question we're trying to answer.

Our question is narrower: which companies are immune to the FIRST link? Because the first link — AI disrupting SaaS — is already happening. The reflexive selloff is running NOW, driven by narrative as much as reality. Citrini's 14M views means every PM with SaaS exposure has read it or heard the summary. The repricing is happening because the narrative is credible enough to justify reducing exposure, and reducing exposure creates the price action that makes the narrative look prescient. You don't need to believe the chain reaches mortgages to see that the SaaS repricing is real and indiscriminate.

What we don't know

The V-score is a working framework, not a finished product. The sample size is small. We haven't stress-tested it against a holdout set. The embedded tier is the hardest to score: these companies have real lock-in AND real erosion risk, and the relative weight between those forces is genuinely uncertain. We don't know how long the discrimination process will take. Could be weeks, could be quarters. Revenue divergence will force it eventually, but "eventually" is a wide range.

We're sharing the framework because the question — which software survives AI? — is the most important capital allocation question in tech right now, and nobody else has published a systematic approach to answering it. If the V-score is useful, use it. If you find cases where it breaks, we want to know.