Galaxy

The Galaxy Wormhole Server is Tenstorrent’s scalable, ultra-dense AI compute solution for corporations and research institutions, offering superior performance density for cost. These pre-configured, rack-mounted systems are built on an Ethernet-based mesh of 32 Tenstorrent Wormhole™ processors.
| Specification | Single Galaxy Module | Galaxy Wormhole Server |
| --- | --- | --- |
| AI Processor(s) | Tenstorrent Wormhole | 32 x Tenstorrent Wormhole |
| Galaxy Modules | 1 | 32 |
| Tensix Cores | 80 | 2,560 |
| AI Clock | 1 GHz | 1 GHz |
| TeraFLOPs (FP8) | 292 | 9,322 (9.3 PetaFLOPs) |
| SRAM | 120MB (1.5MB per Tensix Core) | 3.8GB (120MB per Module) |
| Memory | 12GB GDDR6 (192-bit memory bus, 12 GT/sec) | 384GB GDDR6, globally addressable |
| Power | 200W | 7.5 kW |
| System Interface | 3.2 Tbps Ethernet (16 x 200Gbps) | 41.6 Tbps Ethernet Internal Connectivity |
| Cooling | Passive | 6x 120mm Fan |
| Board Management Controller (BMC) | - | IMX8 |
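
Most of the server figures in the specification above are the per-module figures multiplied by 32. The following is a minimal sketch of that arithmetic, not a Tenstorrent tool; the dictionary keys and the `aggregate` helper are names made up for illustration. Two results differ from the listed values: straight multiplication gives 9,344 FP8 TeraFLOPs against the listed 9,322, and 6.4 kW of module power against the listed 7.5 kW system power. The source does not explain either gap; the power difference presumably covers fans and other system components, but that is an assumption. The per-module GDDR6 bandwidth is also derived here from the bus width and transfer rate rather than quoted from the source.

```python
# Per-module figures copied from the specification above.
MODULES = 32
module = {
    "tensix_cores": 80,
    "sram_mb": 120,
    "gddr6_gb": 12,
    "fp8_tflops": 292,
    "power_w": 200,
}


def aggregate(per_module: dict, count: int) -> dict:
    """Multiply every per-module figure by the module count (hypothetical helper)."""
    return {name: value * count for name, value in per_module.items()}


system = aggregate(module, MODULES)

# Per-module Ethernet: 16 links at 200 Gbps each, expressed in Tbps.
module_ethernet_tbps = 16 * 200 / 1000

# Derived, not stated in the source: peak GDDR6 bandwidth per module is
# the bus width in bytes times the transfer rate in GT/s.
module_bandwidth_gb_s = 192 / 8 * 12

if __name__ == "__main__":
    for name, value in system.items():
        print(f"{name}: {value}")
    print(f"per-module Ethernet: {module_ethernet_tbps} Tbps")
    print(f"per-module GDDR6 bandwidth: {module_bandwidth_gb_s} GB/s")
```

The totals for cores (2,560), SRAM (3,840MB, listed as 3.8GB), and GDDR6 (384GB) match the server column exactly.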
Supported datatypes

| Format | Datatypes |
| --- | --- |
| Floating point | FP8, FP16, FP32* |
| Block floating point | BFP2, BFP4, BFP8 |
| Integer | INT8, INT16, INT32* |
| Unsigned integer | UINT8 |
| TensorFloat | TF32 |

*Output only
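
The choice of datatype determines how many values fit in the server's 384GB of GDDR6. The sketch below is a rough upper bound under stated assumptions: gigabytes are taken as decimal (10^9 bytes), every byte is assumed to be available for data, and only fixed-width types that are not marked output-only are included. Block floating point formats are left out because their per-element storage depends on how the shared exponent is laid out, which the source does not describe. `BYTES_PER_ELEMENT` and `max_elements` are hypothetical names used only for this example.

```python
# Storage width of the fixed-width datatypes listed above, in bytes.
BYTES_PER_ELEMENT = {
    "FP8": 1,
    "INT8": 1,
    "UINT8": 1,
    "FP16": 2,
    "INT16": 2,
}

# Assumption: 384GB means 384 * 10^9 bytes, all usable for data.
SYSTEM_GDDR6_BYTES = 384 * 10**9


def max_elements(dtype: str, capacity_bytes: int = SYSTEM_GDDR6_BYTES) -> int:
    """Upper bound on how many values of `dtype` fit in `capacity_bytes`."""
    return capacity_bytes // BYTES_PER_ELEMENT[dtype]


if __name__ == "__main__":
    for dtype in BYTES_PER_ELEMENT:
        print(f"{dtype}: up to {max_elements(dtype) / 1e9:.0f} billion values")
```

Under these assumptions the bound is 384 billion values at one byte each and 192 billion at two bytes each. Real workloads also need room for activations and buffers, so usable capacity will be lower.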

Ultra Dense.

The Galaxy Wormhole Server is Tenstorrent’s ultra-dense AI compute solution, offering superior performance density for cost. Designed to scale, it can serve as an AI sidecar to a supercomputer for HPC workloads or as the supercomputer’s main compute engine. Galaxy subdivides without complicated networking or software layers, switching easily between several small hosts and a single host. One server contains 384GB of Tensix Processor GDDR6 memory, and performance for cost improves as you scale. Galaxy is supported by Tenstorrent’s open-source TT-Metalium™ SDK, which gives engineers full access to the metal.
