Customers increasingly want to own their silicon and tailor it to their specific use cases and needs, well beyond conventional AI training and inference and into automotive AI, high performance computing (HPC), and more.
Tenstorrent’s RISC-V-based Ascalon architecture has been developed with flexibility in mind, scalable from a massive, highly performant 8-wide implementation down to a minimal 2-wide implementation, with multiple steps in between. The Tensix Cores that power our AI accelerators are equally scalable, designed to form a mesh with as many cores as a workload needs.
Based on system PPA goals, the Ascalon RTL can be parameterized into a 6-, 4-, 3-, or 2-wide superscalar out-of-order processor. This wide PPA range lets Tenstorrent CPU IP address the performance and power-efficiency requirements of edge devices, edge servers, and cloud servers alike.
- 8-wide decode
- 2× 256-bit vector units
- 3 load/store units with large load/store queues
- 6 ALUs / 2 branch units
- 2 FPUs
- 8 cores per cluster
- 230 GB/s CHI coherency bus
- 230 GB/s AXI message-passing bus
- 12 MB shared cluster cache
- Companion CPU cluster for AI
- Inter-cluster coherency
- Directory-based coherency system
- Large memory cache per DDR5-6400 channel
- 4 cc-NUMA 32-core quadrants with hierarchical interconnection
- Ample coherent/non-coherent bandwidth for system scalability
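To put the figures above in context, here is a back-of-the-envelope check relating the 230 GB/s CHI coherency bus to the peak bandwidth of a DDR5-6400 channel. The DDR5 math follows the standard JEDEC convention (6400 MT/s on a 64-bit channel); this is an illustrative sketch, not a vendor-published figure.

```python
def ddr5_channel_gbps(mt_per_s: int, bus_bytes: int = 8) -> float:
    """Peak bandwidth of one DDR channel: transfers/s x bus width.

    A standard DDR5 channel is 64 bits (8 bytes) wide, so
    DDR5-6400 peaks at 6400 MT/s * 8 B = 51.2 GB/s.
    """
    return mt_per_s * bus_bytes / 1000  # MT/s * bytes -> GB/s

per_channel = ddr5_channel_gbps(6400)  # 51.2 GB/s per DDR5-6400 channel
chi_bus_gbps = 230.0                   # CHI coherency bus figure from the spec list

# Roughly 4-5 DDR5-6400 channels' worth of traffic fits on one 230 GB/s bus,
# which is why the large per-channel memory cache matters for sustained bandwidth.
channels_equivalent = chi_bus_gbps / per_channel
print(per_channel, round(channels_equivalent, 1))  # 51.2 4.5
```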
Have questions about Tenstorrent IP?