Firntal — Production is the thesis.

Inference. Scale. Production.

We back the engineers solving inference cost, latency, and GPU scheduling — the systems layer that determines which AI applications make it past the prototype.

$68M across two funds, backing 12 infrastructure companies at Seed and Pre-Seed.

// Conviction

The compute layer is not a commodity.

Every generation of AI applications runs on infrastructure assumptions that are wrong within 18 months. Inference cost per token, batching throughput, model routing overhead, cold-start latency — these are the constraints that actually determine whether a model ships to users or stays in a Jupyter notebook.

We invest when a technical insight about these problems is sharp enough to build a company around. Not a product vision. A systems insight — the kind you only form by having built the thing yourself.

Fund I | $28M | Fully deployed | 2022

Fund II | $40M | Actively deploying | 2024

Modal | Serverless GPU Infrastructure | Seed 2022

Cerebrium | Serverless GPU Inference | Pre-Seed 2022

Baseten | ML Model Serving | Seed 2023

Martian | Model Routing & Selection | Seed 2023

Lepton AI | AI Inference Platform | Seed 2023

TensorWave | GPU Cloud for Inference | Seed 2024

Dagger | CI/CD for ML Pipelines | Seed 2024

Shaped | Real-Time ML Inference | Seed 2024

TitanML | Model Compression & Deployment | Pre-Seed 2023

Unsloth | LLM Training Optimization | Seed 2025

Nscale | Sovereign AI Cloud | Seed 2025

Inferless | Serverless Model Serving | Seed 2026

Architecture Office Hours

Sarah holds bi-weekly sessions with portfolio CTOs navigating inference scaling decisions. The agenda: batching strategy, hardware selection, KV cache sizing, serving architecture trade-offs. Founders bring real decisions; Sarah brings the context of having built production serving systems at scale. Not advisory theater — actual technical work.

GPU Capacity & Cloud Access

Lukas's network, built over six years of shipping GPU scheduling infrastructure, spans hyperscale cloud teams and GPU cloud operators. Portfolio companies get introductions to capacity allocation contacts before those relationships are needed, which matters when your inference product depends on hardware that has a waitlist.

Enterprise Buyer Introductions

Niklas maps infrastructure buyers in financial services, healthcare, and manufacturing — sectors where AI inference is moving from pilot to procurement. Introductions go to platform engineering directors and CTO-level buyers who are actively building inference stacks, not generic LinkedIn outreach.

Follow-on Fundraising Preparation

Niklas starts fundraising preparation 9 months before portfolio companies need capital — financial model construction, investor targeting, and narrative positioning. Built with time to iterate, not assembled with two weeks of runway left.

Recruiting Network

The ETH Zurich ML engineering and distributed systems community runs through Firntal. We connect portfolio companies with senior ML engineers, infrastructure leads, and potential founding CPOs — from networks built as practitioners, not as investors.

LP-Backed Distribution Partnerships

Several Firntal LPs are corporate strategics with active AI infrastructure procurement programs. When portfolio companies reach distribution-readiness, we facilitate structured introductions — not warm emails into procurement black holes.

May 22, 2026 | Edge Inference: Emerging Architectures and Where the Value Goes | Sarah Brunner

Oct 14, 2025 | Multi-Modal Inference: The Next Frontier for Infrastructure Builders | Lukas Meier

Jul 22, 2025 | Inference Optimization: State of the Art and Where It Goes Next | Sarah Brunner

12 companies building the production layer.

Former engineers. Not just capital.

From the team.