Developers
Get your models up and running fast on Tenstorrent hardware. With two open-source SDKs, you can get as close to the metal as possible, or let our AI compiler do the work.
Models
Explore models optimized on Tenstorrent hardware.
Don’t see yours listed? Check out TT-Forge, our compiler, to get other models running today.
Blackhole
Wormhole
Purpose-built for semantic search, dense retrieval, and RAG. BAAI, 335M.
Mixture-of-experts reasoning powerhouse rivaling closed frontier models on math and code. DeepSeek, 671B.
Compound-scaled CNN delivering state-of-the-art accuracy at a fraction of the parameters. Google, 66M.
Commercially licensed generalist trained on 1T tokens of curated web data. TII, 40B.
Commercially licensed instruction model — lean resource footprint. TII, 7B.
Flow-matching transformer for photorealistic text-to-image with strong prompt adherence. Black Forest Labs, 12B.
FLUX distilled to 4 steps — full quality, fraction of the compute. Black Forest Labs, 12B.
Tiny and capable — instruction-tuned for edge and on-device use. Google, 1B.
Largest Gemma 3 base — 140-language reasoning, coding, long context. Google, 27B.
Largest Gemma 3, instruction-tuned — 140-language reasoning, coding, long context. Google, 27B.
Compact Gemma 3 base — punches above its weight in reasoning and code. Google, 4B.
Instruction-tuned Gemma 3 — punches above its weight in reasoning and code. Google, 4B.
Open GPT-style model for deep reasoning in self-hosted deployments. OpenAI, 120B.
Open GPT-style generation at a practical self-hosted scale. OpenAI, 20B.
128K context, multilingual, tool-use ready, fully open weights. Meta, 70B.
128K context and multilingual base with a wide fine-tuning ecosystem. Meta, 8B.
Multilingual instruction following, tool use, and function calling. Meta, 8B.
Image + text inputs on Llama's reasoning foundation, 128K context. Meta, 11B.
Charts, image Q&A, visual documents — multimodal on a single accelerator. Meta, 11B.
Sub-gigabyte base model for on-device and embedded inference. Meta, 1B.
Instruction-tuned to run anywhere — on-device at 1B. Meta, 1B.
On-device reasoning with room for language understanding and light tool use. Meta, 3B.
Lightweight agent backbone for low-latency, resource-constrained deployments. Meta, 3B.
Image + text inputs at full scale on Llama's reasoning foundation. Meta, 90B.
Document analysis, chart reading, OCR-level image understanding at full scale. Meta, 90B.
Llama 3.1 refined — better math, code, and multilingual, same 128K context. Meta, 70B.
Stronger structured tasks, tool use, and reasoning than prior Llama generations. Meta, 70B.
Sliding window and grouped-query attention for fast, memory-lean inference. Mistral AI, 7B.
Function-calling and instruction following for production and agentic use. Mistral AI, 7B.
Sparse MoE: 13B active per token out of 47B total — quality at low per-token cost. Mistral AI, 47B.
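The sparse mixture-of-experts idea behind that card — only a few experts fire per token — can be sketched in a few lines. This is a toy illustration, not Mistral's implementation: the experts here are stand-in linear maps, and real MoE layers use gated FFN experts plus load-balancing losses.

```python
import numpy as np

def moe_layer(x, gate_w, experts, k=2):
    """Route each token to its top-k experts and mix their outputs.

    x: (tokens, d) activations; gate_w: (d, n_experts) router weights;
    experts: callables standing in for expert FFNs (toy linear maps here).
    """
    logits = x @ gate_w                          # router scores, (tokens, n_experts)
    topk = np.argsort(logits, axis=-1)[:, -k:]   # indices of the k best experts per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = topk[t]
        weights = np.exp(logits[t, sel])
        weights /= weights.sum()                 # softmax over the selected experts only
        for w, e in zip(weights, sel):
            out[t] += w * experts[e](x[t])       # only k experts run per token
    return out

# Toy usage: 4 experts, only 2 fire per token, so compute scales with k, not n_experts.
rng = np.random.default_rng(0)
d, n_experts = 8, 4
experts = [lambda v, M=rng.standard_normal((d, d)) * 0.1: v @ M
           for _ in range(n_experts)]
x = rng.standard_normal((3, d))
y = moe_layer(x, rng.standard_normal((d, n_experts)), experts)
print(y.shape)  # (3, 8)
```

The payoff is the per-token cost: quality tracks total parameters, while latency tracks only the k active experts.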
Inverted residuals and linear bottlenecks — strong accuracy at near-zero inference cost. Google, 3.4M.
Text-to-video focused on motion quality and temporal coherence. Genmo, 10B.
Preview text-to-video from natural language prompts. Motif, 6B.
Compact instruction model with 128K context, tuned for reasoning on constrained hardware. Microsoft, 3.8B.
General-purpose 32B text model with strong multilingual and code coverage. Alibaba, 32B.
Chinese, English, code, and structured output — 128K context base. Alibaba, 72B.
Instruction-tuned across Chinese, English, coding, and structured output at scale. Alibaba, 72B.
Multilingual base for Chinese, English, coding, and math. Alibaba, 7B.
Tuned for code, math, and structured output at an efficient scale. Alibaba, 7B.
Code-specialist 32B tuned for program synthesis, repair, and structured output. Alibaba, 32B.
Vision-language base — documents, charts, and multi-image Q&A. Alibaba, 32B.
Document understanding, chart analysis, and multi-image Q&A. Alibaba, 32B.
Vision-language for low-latency multimodal deployment on constrained hardware. Alibaba, 3B.
Vision-language base at scale — documents, charts, and scene understanding. Alibaba, 72B.
Vision-language at scale — documents, charts, and scene understanding. Alibaba, 72B.
Visual Q&A, OCR, and document parsing in a practical VLM footprint. Alibaba, 7B.
Toggleable chain-of-thought for on-demand deep reasoning. Alibaba, 32B.
Reasoning depth on demand without a throughput penalty. Alibaba, 8B.
Multilingual embeddings for retrieval and semantic similarity. Alibaba, 4B.
Higher-capacity multilingual embeddings for retrieval and reranking. Alibaba, 8B.
Hybrid thinking + vision — documents, charts, UI, and multi-image tasks. Alibaba, 32B.
Chain-of-thought with self-reflection for math, science, and logic. Alibaba, 32B.
Skip connections that made deep networks trainable — the image classification baseline. Microsoft Research, 25M.
Dual-encoder base for high-resolution synthesis and fine-tuning pipelines. Stability AI, 6.6B.
Mix-transformer segmentation without positional encoding — accurate at low compute. NVIDIA, 3.8M.
Unified encoder-decoder for natural text-to-speech synthesis. Microsoft, 307M.
MMDiT architecture — strong text rendering and prompt control for image generation. Stability AI, 8B.
Skip connections preserve spatial detail for precise segmentation — VGG backbone. 31M.
Image patches + self-attention, no convolutions — the original vision transformer. Google, 86M.
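"Image patches + self-attention" means the image is sliced into fixed-size squares that become the transformer's token sequence. A minimal numpy sketch of that patchification step (the learned linear projection and self-attention layers that follow are omitted):

```python
import numpy as np

def patchify(img, patch=16):
    """Split an image (H, W, C) into flattened non-overlapping patches."""
    H, W, C = img.shape
    gh, gw = H // patch, W // patch
    p = img[:gh * patch, :gw * patch].reshape(gh, patch, gw, patch, C)
    p = p.transpose(0, 2, 1, 3, 4)               # group the two grid axes together
    return p.reshape(gh * gw, patch * patch * C)  # one flat vector per patch

# ViT-Base defaults: 224x224 RGB input, 16x16 patches -> 196 tokens of 768 values.
img = np.zeros((224, 224, 3))
tokens = patchify(img)
print(tokens.shape)  # (196, 768)
```

Each of the 196 flattened patches is then linearly projected and fed to a standard transformer — no convolutions anywhere in the pipeline.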
One-Shot Aggregation avoids DenseNet's redundant paths for better accuracy-per-FLOP. 11.2M.
Causal video transformer for text-to-video with strong motion coherence. Alibaba, 14B.
99 languages, 680K hours of training audio — built for robust speech recognition. OpenAI, 1.5B.