The "AI winter is coming" take has been circulating since at least 2023. Each new wave — slower capability improvements, a high-profile product failure, a round of AI startup layoffs — prompts a fresh cycle of winter predictions. The predictions are wrong, but they're wrong in an interesting way: they confuse two very different markets that happen to share a label.
The AI application layer (AI-first startups building products on top of foundation models) is genuinely overcrowded. Many of those companies won't survive the next two years. That's less a forecast than an extrapolation of a visible trend. Unit economics are difficult when your core cost is API inference on a commodity model and your competitive differentiation is surface-level. Some of these companies will be acquired; many will wind down; a few will find defensible niches.
But the infrastructure layer is different. The companies building the systems that process AI inference — the serving frameworks, the optimization tooling, the deployment infrastructure, the fine-tuning platforms — have a fundamentally different risk profile. Calling the application-layer correction "AI winter" conflates them, and that conflation is expensive for investors and founders who let it shape their decisions.
What historical AI winters were actually about
The AI winters of the mid-1970s and the late 1980s shared a specific failure: the techniques of the time (symbolic reasoning in both periods, and expert systems in particular in the 1980s) hit fundamental capability ceilings that no amount of additional compute could overcome. The gap between what the technology could do and what it was being sold to do was unbridgeable with the available methods. Research funding collapsed because the theoretical foundations were wrong.
That failure mode does not apply to the current cycle. Large language models have demonstrated capability improvements that track closely with compute and data scaling. The capability ceiling, if one exists, is not visible at current parameter counts. The applications being built on top of these models are limited by economic viability and product quality, not by fundamental technical impossibility.
The correct analogy for a correction in the AI application layer, one driven by economics rather than by a capability ceiling, is not "AI winter". It's what happened to the internet application layer in 2001. Many companies building on the internet failed, often dramatically. The infrastructure layer (the networking equipment, the CDNs, the data centers, the hosting providers) had a rough few years but was not fundamentally wrong. The applications were wrong, not the infrastructure. Twenty years later, the internet application layer is orders of magnitude larger than it was at the 2000 peak, and the infrastructure underneath it is also orders of magnitude more developed.
Why infrastructure demand doesn't correlate with application-layer valuation
A specific mechanism the winter-predictors miss: infrastructure demand is driven by inference volume, not by application-layer valuations. Even if the AI application startup count contracts by 50% over the next two years, the inference volume processed by surviving companies — and by enterprises integrating AI into existing workflows — will continue growing.
There's a consolidation dynamic here that is actually positive for infrastructure: fewer, more efficient AI applications at higher scale generate more inference volume per company than many small-scale experiments. A 50% contraction in the AI startup count might correspond to a net increase in production inference volume if the surviving companies are the ones with real user adoption and growing deployment scale.
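To make the arithmetic concrete, here is a rough Python sketch of the consolidation scenario. Every number in it is a made-up illustration, not data from our portfolio or the market: the cohort names, company counts, and per-company request volumes are assumptions chosen only to show how company count can halve while total inference volume rises.

```python
from dataclasses import dataclass


@dataclass
class Cohort:
    """A group of AI application companies with similar usage (hypothetical)."""
    name: str
    companies: int
    monthly_requests_per_company: float  # millions of inference requests


def total_companies(cohorts):
    return sum(c.companies for c in cohorts)


def total_volume(cohorts):
    """Total monthly inference volume across cohorts, in millions of requests."""
    return sum(c.companies * c.monthly_requests_per_company for c in cohorts)


# Illustrative "before" market: many small experiments, few production apps.
before = [
    Cohort("small-scale experiments", companies=800, monthly_requests_per_company=0.5),
    Cohort("production apps", companies=200, monthly_requests_per_company=20.0),
]

# Illustrative "after" market: most experiments wind down, while the
# production apps keep their count and double their per-company volume.
after = [
    Cohort("small-scale experiments", companies=300, monthly_requests_per_company=0.5),
    Cohort("production apps", companies=200, monthly_requests_per_company=40.0),
]

company_change = total_companies(after) / total_companies(before) - 1
volume_change = total_volume(after) / total_volume(before) - 1

print(f"Company count change:    {company_change:+.0%}")
print(f"Inference volume change: {volume_change:+.0%}")
```

Under these assumed inputs the company count falls by half while aggregate volume grows, because volume is dominated by the production cohort. Different inputs give different results; the point is only that the two quantities can move in opposite directions.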
The serving infrastructure, optimization tooling, and deployment platforms that serve these workloads are directly exposed to that volume growth, not to startup count. This is why the portfolio companies we back haven't seen demand softening even as the broader AI startup landscape has become more cautious. They're building for production workloads, which are growing.
The categories where winter risk is real
Being honest about the risk profile: not all AI infrastructure is equally insulated from a broader market correction. The categories where I see genuine risk:
Infrastructure built for a specific frontier model dependency. If your product's value is optimization of GPT-4 specifically, or serving Llama 2 specifically, and the next generation of models changes the architecture significantly enough to invalidate that optimization, the business has a real problem. Good infrastructure businesses abstract over model specifics and provide value across model generations.
Training infrastructure assuming continued frontier scaling. The compute requirements for frontier training are enormous and concentrated among a small number of actors. If the scaling hypothesis weakens — and there are credible arguments that it will at some scale — the demand for frontier training compute contracts. For inference infrastructure, which serves deployed models regardless of training compute trends, this is a secondary effect at most.
Developer tooling for AI applications that are themselves precarious. A CI/CD tool for AI applications is only as durable as the applications it serves. If the application market contracts, demand for developer tooling contracts with it. This is more of an application-adjacent risk than a pure infrastructure risk, but worth distinguishing from the serving-layer infrastructure that processes production inference regardless of which developer tools the teams used to build it.
The production layer thesis holds
We started Firntal with the conviction that production deployment (not model capability, not research, not the applications themselves) was the durable layer to back. The teams solving inference cost, latency, and deployment tooling have a structural advantage: demand for their work grows with AI adoption, and their value doesn't depend on any single model architecture or application category succeeding.
Eighteen months into Fund II, that conviction is stronger, not weaker. The market has shaken out many early-stage AI companies that were long on ambition and short on production engineering. The ones that remain, and the ones we're backing at the infrastructure layer, are solving harder problems with more durable demand curves. If anything, the application-layer correction accelerates the shift to production discipline — which is exactly where we invest.