We invest where inference meets production.

Seed and Pre-Seed investments in AI inference infrastructure and model optimization: the systems that close the gap between a trained model and a deployed one.

Seed · Pre-Seed · $1M–$3.5M initial checks

We invest in startups building the systems layer beneath foundation models.

Stage: Seed · Pre-Seed
Check size: $1M–$3.5M
AUM: $68M
Portfolio: 12 companies
Founded: 2021

The model is not the moat. The infrastructure is.

Foundation models are increasingly commoditized. GPT-4-class capability is available as open weights. The durable value in the AI stack is not who trained the largest model — it is who built the infrastructure that makes models economically deployable at the request volumes that real applications require.

Inference cost per token is still dropping by an order of magnitude every 18 months. Latency constraints are tightening as AI enters real-time workloads — recommendation, code generation, document processing. Fine-tuning is shifting from research experiment to production requirement. Model routing between providers is an active engineering problem with measurable cost consequences, not a future abstraction.

We invest in the founding teams who have built enough of this infrastructure to see its failure modes clearly — and who are building the production primitives that didn't exist when they needed them. GPU scheduling, serverless serving, continuous batching, quantization tooling, model routing, CI/CD for ML pipelines. The companies that solve these problems compound across every application vertical above them.

Inference Efficiency

Continuous batching, INT4/INT8 quantization, speculative decoding, hardware-aware kernel scheduling. Teams driving down cost per token at the throughput levels production workloads actually require.

Model Serving & Deployment

Serverless GPU inference, model compression for edge deployment, multi-cloud serving primitives, cold-start elimination. The gap between a model checkpoint and a production endpoint — these companies close it.

Fine-Tuning Infrastructure

Parameter-efficient fine-tuning (LoRA, QLoRA), training memory optimization, domain adaptation pipelines. Moving custom model training from research scripts into repeatable engineering workflows.

ML Developer Tooling

CI/CD for ML pipelines, intelligent model routing and cost-based selection, observability and latency tracing for inference systems. The operational layer that production ML teams are still building by hand.

Two funds. One thesis.

Firntal Fund I · $28M · Fully deployed · Closed June 2022
8 investments. Conviction formed through the first full inference infrastructure cycle, before the category had a name.

Firntal Fund II · $40M · Actively deploying · Closed September 2024
Currently investing. Check sizes $1M–$3.5M. Seed and Pre-Seed stages.

$68M total assets under management

Technical operators, not just capital.

Firntal's partners built GPU scheduling layers, ML serving platforms, and low-latency execution systems before they managed capital. The support is grounded in that experience — not in a playbook written for a different category.

Technical Diligence as a Two-Way Signal

Sarah reviews system architecture directly with founding teams as part of diligence — serving stack, batching design, hardware assumptions, scaling model. Founders who engage with those questions deeply are typically the ones we back. It is also an accurate preview of how we operate post-investment.

Infrastructure Network

Our network, built through years of practitioner work, spans hyperscale cloud engineering teams, GPU hardware suppliers, and open-source ML communities. We make introductions that matter: to engineers who can become hires, to cloud allocation contacts, and to infrastructure founders who have solved adjacent problems.

Board Participation

Lukas joins as board observer or director on most investments. We arrive with prepared positions on technical strategy, financial trajectory, and fundraising posture. We do not arrive at board meetings to ask what the team has been doing.

Recruiting from the ETH Network

The ETH Zurich distributed systems and ML engineering community runs through Firntal. Introductions to senior ML engineers, inference infrastructure leads, and potential founding CPOs — from networks built as practitioners, not assembled as investors.

Enterprise Buyer Introductions

Niklas maps infrastructure buyers across financial services, healthcare, and manufacturing, sectors where AI inference is entering procurement cycles. These are targeted introductions to engineering directors and CTO-level buyers who are actively evaluating inference infrastructure, not generic warm notes.

Follow-on Fundraising Preparation

We start fundraising preparation nine months before portfolio companies need capital. Financial model construction, investor targeting, and narrative positioning are built with time to iterate, not assembled in the two weeks before a runway conversation becomes urgent.

Building in this space?

We review every inbound from founders working on inference infrastructure, model optimization, or the systems layer beneath foundation models. The fastest path to a conversation is a direct note to Lukas — no deck required to start.
