Recap of TT-Deploy

Tenstorrent Galaxy Blackhole and superclusters are here. View our latest performance benchmarks and see how our AI acceleration architecture, Networked AI, scales out.

Tenstorrent Galaxy™ Blackhole and superclusters

Tenstorrent Galaxy Blackhole is our air-cooled compute server, built with our next-generation Blackhole® chips and a fully open-source software stack for general-purpose AI with native scale-out.

Extend Tenstorrent Galaxy™ Blackhole with superclusters into a multi-server topology, unlocking optimized setups for AI video generation, large-scale LLM inference, and AI infrastructure.

Faster than real-time

In collaboration with Prodia, the industry’s fastest video generation is now 10x faster on a Tenstorrent Galaxy supercluster.

Latency (sec/video)

Wan 2.2 5B: 28.2 sec
Wan 2.2 A14B Lightning, Nvidia x Prodia: 23.2 sec
grok-imagine-video, xAI: 14.8 sec
Wan 2.2 A14B, Tenstorrent x Prodia: 2.4 sec

2x Time to First Token, 4x Output Speed

In Blitz mode, a supercluster of four Tenstorrent Galaxy systems fits the active model and KV cache entirely in on-chip SRAM and executes directly from it, eliminating the HBM round-trips that slow every GPU inference stack.

Time to First Token (sec), DeepSeek V3.2, 100k Context

GPUs: 7.5 sec
Tenstorrent: 4.0 sec
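For readers who want to check the headline multiples against the figures quoted above, here is a minimal Python sketch of the arithmetic. The `speedup` helper and the variable names are illustrative assumptions, not part of any Tenstorrent tooling; the numbers are the latencies shown on this page, and the video comparison simply divides each of the other three quoted latencies by the 2.4-second Tenstorrent result.

```python
# Latency figures as quoted on this page; helper and variable names are
# hypothetical and exist only for this illustration.

def speedup(baseline_sec: float, new_sec: float) -> float:
    """Return how many times faster new_sec is than baseline_sec."""
    if baseline_sec <= 0 or new_sec <= 0:
        raise ValueError("latencies must be positive")
    return baseline_sec / new_sec

# Time to first token, DeepSeek V3.2, 100k context (seconds).
ttft_gpus = 7.5
ttft_tenstorrent = 4.0
ttft_speedup = speedup(ttft_gpus, ttft_tenstorrent)

# Video latency (seconds per video): the Tenstorrent result against the
# other three quoted latencies, in the order they appear on the page.
video_tenstorrent = 2.4
video_others = [28.2, 23.2, 14.8]
video_speedups = [round(speedup(v, video_tenstorrent), 2) for v in video_others]

print(f"Time-to-first-token speedup: {ttft_speedup:.2f}x")
print("Video latency speedups:", video_speedups)
```

The headline "2x" and "10x" figures are rounded; running the sketch shows the unrounded ratios behind them.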

Benchmarks

See exactly how Tenstorrent Galaxy Blackhole performs under real AI workloads. Other solutions require bolting together separate accelerators across fragmented infrastructure. General-purpose means leading performance on every workload defining modern AI, not specializing in one.

Fastest AI Video Gen

AI video generation on Tenstorrent Galaxy is 10x faster than leading GPU systems. Running on a Tenstorrent Galaxy supercluster, the industry’s fastest video generation produces a 720p, 81-frame video in a brisk 2.4 seconds.

Fastest and Most Affordable LLM Inference

Blitz Mode on Tenstorrent Galaxy, optimized for premium, latency-sensitive AI workloads, enables 350+ tokens per second per user (t/s/u) and sub-4-second time-to-first-token on DeepSeek 671B, beating the leading comparable GPU systems.

Networked AI

Tenstorrent Galaxy™ Blackhole is made possible through Networked AI: a new model for AI infrastructure where compute, memory, and networking are unified into a single system optimized for real-world AI workloads. The architecture scales from a single core to thousands of servers networked under one software model.

Get in touch

If you’re ready to discuss how Galaxy Blackhole fits into your solutions, or if you just want to learn more, we’re here to help you scale. Provide a few details, and we’ll reach out!