Taalas Raises $169M to Build Model-Specific AI Chips That Outperform Nvidia H200 at 1/10th the Power
Toronto-based Taalas raises $169 million to etch AI model weights directly into silicon, claiming 73x more output tokens per second than Nvidia's H200 GPU at one-tenth the power consumption for its Llama-optimized inference chip.
Taalas, a Toronto-based AI chip startup, emerged from stealth with a $169 million funding round on February 19, 2026, bringing total capital raised to over $200 million. The company takes a fundamentally different approach to AI inference: rather than running models on general-purpose GPUs, Taalas etches specific model weights directly into silicon, trading flexibility for extraordinary efficiency.
The Technology
Taalas calls its approach "model-specific" chip design. Instead of building a general-purpose processor that can run any model, Taalas customizes only two of a chip's 100+ layers using what the company calls "mask ROM recall fabric" modules: circuit structures that, by the company's description, use a single transistor per 4-bit storage unit and also carry out the matrix multiplication. The effect is that the model's weights are physically encoded in the chip's circuitry, so they do not have to be streamed from separate memory for every token, which removes the memory bandwidth bottleneck that limits GPU-based inference performance.
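The memory bandwidth bottleneck can be made concrete with rough arithmetic. The Python sketch below is an illustration under stated assumptions, not a figure from Taalas: it treats "8B" as exactly 8 billion parameters, assumes a GPU generating one sequence at a time must read every weight once per token, assumes 16-bit weights on the GPU, and uses 4.8 TB/s as the H200's memory bandwidth (Nvidia's published specification, not a number from this article).

```python
def weight_bytes(num_params: float, bits_per_weight: int) -> float:
    """Total bytes needed to hold the model weights."""
    return num_params * bits_per_weight / 8


def bandwidth_bound_tokens_per_sec(
    bandwidth_bytes_per_sec: float, num_params: float, bits_per_weight: int
) -> float:
    """Upper bound on single-sequence decode speed when every weight
    must be read from memory once per generated token."""
    return bandwidth_bytes_per_sec / weight_bytes(num_params, bits_per_weight)


PARAMS = 8e9             # assumption: "8B" taken as exactly 8 billion
HBM_BANDWIDTH = 4.8e12   # assumption: 4.8 TB/s, Nvidia's published H200 spec

# Assumption: the GPU baseline stores weights at 16 bits each.
gpu_bound = bandwidth_bound_tokens_per_sec(HBM_BANDWIDTH, PARAMS, 16)

# Size of the same weights at the 4-bit granularity the article describes.
four_bit_gb = weight_bytes(PARAMS, 4) / 1e9

print(f"Bandwidth-bound GPU decode ceiling: {gpu_bound:.0f} tokens/s per sequence")
print(f"Weights at 4 bits each: {four_bit_gb:.0f} GB")
```

Under these assumptions a single decode stream on the GPU is capped at a few hundred tokens per second regardless of compute, because the weights have to cross the memory bus for each token. Batching many requests raises aggregate GPU throughput, so this bound applies only to per-sequence speed.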
The first product is a chip optimized for Meta's Llama 3.1 8B model. Taalas claims this chip produces 17,000 output tokens per second, 73x more than Nvidia's H200 GPU, while consuming one-tenth the power. If these numbers hold up under independent testing, they would represent a step-change in inference economics: the same workload running on Taalas silicon would cost a fraction of what it costs on Nvidia hardware, with dramatically lower energy consumption.
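The claimed figures imply two further numbers that the article does not state outright: the H200 baseline throughput and the energy used per token. The Python sketch below works them out from the quoted claims only. It assumes the 73x throughput and one-tenth power figures describe the same workload and compare one chip to one chip, which the article does not confirm.

```python
# Figures quoted in the article (vendor claims, not independently verified).
TAALAS_TOKENS_PER_SEC = 17_000   # claimed output tokens per second
SPEEDUP_VS_H200 = 73             # claimed throughput multiple over the H200
POWER_FRACTION = 0.1             # claimed power draw relative to the H200


def implied_baseline_tokens_per_sec(taalas_tps: float, speedup: float) -> float:
    """H200 throughput implied by the claimed speedup."""
    return taalas_tps / speedup


def energy_per_token_advantage(speedup: float, power_fraction: float) -> float:
    """Factor by which energy per token drops if both claims hold together.

    Energy per token is power divided by throughput, so the advantage
    is speedup / power_fraction.
    """
    return speedup / power_fraction


baseline = implied_baseline_tokens_per_sec(TAALAS_TOKENS_PER_SEC, SPEEDUP_VS_H200)
advantage = energy_per_token_advantage(SPEEDUP_VS_H200, POWER_FRACTION)

print(f"Implied H200 baseline: {baseline:.0f} tokens/s")
print(f"Implied energy-per-token advantage: {advantage:.0f}x")
```

Taken at face value, the claims imply an H200 baseline of roughly 233 tokens per second and about 730 times less energy per token. Hardware price, utilization, and batching are not covered by these figures, so the calculation says nothing about total cost.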
The Founders
Taalas was founded by Ljubisa Bajic, who previously founded Tenstorrent, another AI chip company. Co-founders Drago Ignjatovic and Lejla Bajic were early Tenstorrent engineers. The team of 25 employees has deep semiconductor design expertise and a track record of building AI-specific silicon. Investors include Quiet Capital, Fidelity, and Pierre Lamond, a veteran semiconductor investor who was an early backer of National Semiconductor and other foundational chip companies.
Roadmap and Limitations
A chip optimized for a 20-billion-parameter Llama model is expected by mid-2026, and a more advanced "HC2" processor targeting frontier-class models will follow. The fundamental trade-off of Taalas's approach is obvious: a chip optimized for Llama 3.1 8B cannot run GPT-5 or Claude. When a new model version is released, a new chip must be fabricated. Taalas's bet is that the inference cost savings are large enough to justify model-specific silicon for the most widely deployed models, and that the fabrication cycle can keep pace with model release cadences.
For cloud providers and enterprises running specific models at massive scale — where inference compute is a multi-billion-dollar annual expense — model-specific chips could fundamentally change the cost structure of AI deployment. The question is whether the performance claims survive independent verification and whether the model-specific constraint is acceptable for production deployments that may need to upgrade models on shorter timelines than chip fabrication allows.