Tenstorrent Galaxy

The Tenstorrent Galaxy Wormhole Server is our scalable, ultra-dense AI compute solution for corporations and research institutions. Each pre-configured, rack-mounted system is built on an Ethernet-based mesh of 32 Tenstorrent Wormhole™ processors and is engineered for high performance density relative to cost.
| Specification | Tenstorrent Wormhole Tensix Processor | Tenstorrent Galaxy Wormhole Server |
|---|---|---|
| AI Processor(s) | Tenstorrent Wormhole | 32 x Tenstorrent Wormhole |
| Tenstorrent Galaxy Modules | 1 | 32 |
| Tensix Cores | 80 | 2,560 |
| AI Clock | 1 GHz | 1 GHz |
| TeraFLOPs (FP8) | 292 | 9,322 (9.3 PetaFLOPs) |
| SRAM | 120 MB (1.5 MB per Tensix Core) | 3.8 GB (120 MB per Module) |
| Memory | 12 GB GDDR6 (192-bit memory bus, 12 GT/sec) | 384 GB GDDR6, globally addressable |
| Power | 200 W | 7.5 kW |
| System Interface | 3.2 Tbps Ethernet (16 x 200 Gbps) | 41.6 Tbps Ethernet Internal Connectivity |
| Board Management Controller (BMC) | - | IMX8 |
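
Several of the server-level figures above follow directly from multiplying the per-module values by 32. The short Python sketch below reproduces that arithmetic as a sanity check. The names `ModuleSpec` and `server_totals` are hypothetical, and the conversion uses decimal units (1 GB = 1000 MB), which is an assumption. The FLOPs and power rows are left out because the listed server values are not plain 32x multiples of the module values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModuleSpec:
    """Per-module figures copied from the specification table (hypothetical container)."""
    tensix_cores: int = 80
    sram_mb: int = 120
    gddr6_gb: int = 12
    ethernet_links: int = 16
    link_gbps: int = 200


MODULES_PER_SERVER = 32  # one Galaxy Wormhole Server holds 32 modules


def server_totals(module: ModuleSpec, modules: int = MODULES_PER_SERVER) -> dict:
    """Scale per-module figures to a full server; assumes decimal units."""
    return {
        "tensix_cores": module.tensix_cores * modules,
        "sram_gb": module.sram_mb * modules / 1000,
        "gddr6_gb": module.gddr6_gb * modules,
        # Per-module Ethernet bandwidth, not the server-internal total.
        "module_ethernet_tbps": module.ethernet_links * module.link_gbps / 1000,
        "sram_per_core_mb": module.sram_mb / module.tensix_cores,
    }


totals = server_totals(ModuleSpec())
for name, value in totals.items():
    print(f"{name}: {value}")
```

The computed values line up with the table: 2,560 Tensix cores, roughly 3.8 GB of SRAM, 384 GB of GDDR6, 3.2 Tbps of Ethernet per module, and 1.5 MB of SRAM per Tensix core.
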
Supported datatypes

Floating point: FP8, FP16, FP32*
Block floating point: BFP2, BFP4, BFP8
Integer: INT8, INT16, INT32*
Unsigned integer: UINT8
TensorFloat: TF32

* Output only
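
For readers unfamiliar with block floating point, the general idea is that a block of values shares one exponent while each value keeps only a small signed mantissa. The Python sketch below illustrates that generic scheme. It is not a description of the Wormhole bit layout: the block size and bit widths the hardware uses are not given here, so `mantissa_bits` and the function names `bfp_quantize` and `bfp_dequantize` are illustrative assumptions.

```python
import math


def bfp_quantize(block, mantissa_bits):
    """Quantize floats to one shared exponent plus signed integer mantissas.

    Generic block floating point sketch; parameters are illustrative only.
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0] * len(block)
    # frexp returns (f, e) with max_abs == f * 2**e and 0.5 <= f < 1,
    # so every value in the block has magnitude below 2**e.
    _, shared_exp = math.frexp(max_abs)
    scale = 2.0 ** (mantissa_bits - shared_exp)
    limit = 2 ** mantissa_bits - 1  # largest mantissa magnitude
    mantissas = [max(-limit, min(limit, round(x * scale))) for x in block]
    return shared_exp, mantissas


def bfp_dequantize(shared_exp, mantissas, mantissa_bits):
    """Reconstruct approximate floats from the shared exponent and mantissas."""
    scale = 2.0 ** (shared_exp - mantissa_bits)
    return [m * scale for m in mantissas]


block = [1.0, 0.5, -0.25, 0.01]
shared_exp, mantissas = bfp_quantize(block, mantissa_bits=7)
restored = bfp_dequantize(shared_exp, mantissas, mantissa_bits=7)
print(shared_exp, mantissas, restored)
```

In this example the small value 0.01 is restored as 0.015625, which shows the trade-off: values much smaller than the largest element of a block lose precision because they share its exponent.
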

Ultra Dense.

The Tenstorrent Galaxy Wormhole Server is Tenstorrent’s ultra-dense AI compute solution, offering superior performance density for cost. Designed to scale, it can serve as an AI sidecar to a supercomputer for HPC applications or as the supercomputer’s main compute engine. Tenstorrent Galaxy is designed to be subdivided without complicated networking or software layers, so it can switch easily between several small hosts and a single host. One server contains 384 GB of Tensix Processor GDDR6 memory, and performance for cost only increases as you scale. Tenstorrent Galaxy is supported by Tenstorrent’s open-source TT-Metalium™ SDK, giving engineers full access to the metal.
