Developers

Get models running quickly on Tenstorrent hardware. Two open-source SDKs let you either work as close to the metal as possible or leave the work to the AI compiler.

Models

Explore models optimized on Tenstorrent hardware.

Don’t see yours listed? Check out TT-Forge, our compiler, to get other models running today.

Models (58)
bge-large-en-v1.5

Purpose-built for semantic search, dense retrieval, and RAG. BAAI, 335M.

Feature Extraction
335M
DeepSeek-R1

Mixture-of-experts reasoning powerhouse rivaling closed frontier models on math and code. DeepSeek, 671B.

Text Generation
671B
efficientnet-b0

Compound-scaled CNN delivering state-of-the-art accuracy at a fraction of the parameters. Google, 5.3M.

Image Classification
5.3M
FLUX.1 [dev]

Flow-matching transformer for photorealistic text-to-image with strong prompt adherence. Black Forest Labs, 12B.

Text-to-Image
12B
FLUX.1 [schnell]

FLUX distilled to 4 steps — full quality, fraction of the compute. Black Forest Labs, 12B.

Text-to-Image
12B
gemma-3-1b-it

Tiny and capable — instruction-tuned for edge and on-device use. Google, 1B.

Text Generation
1B
gemma-3-27b

Largest Gemma 3 base — 140-language reasoning, coding, long context. Google, 27B.

Text Generation
27B
gemma-3-27b-it

Largest Gemma 3 — 140-language reasoning, coding, long context. Google, 27B.

Text Generation
27B
gemma-3-4b

Compact Gemma 3 base — punches above its weight in reasoning and code. Google, 4B.

Text Generation
4B
gemma-3-4b-it

Punches above its weight in reasoning and code. Google, 4B.

Text Generation
4B
gpt-oss-120b

Open-weight GPT-style model for deep reasoning in self-hosted deployments. OpenAI, 120B.

Text Generation
120B
gpt-oss-20b

Open-weight GPT-style generation at a practical self-hosted scale. OpenAI, 20B.

Text Generation
20B
Llama-3.1-8B

128K context and multilingual base with a wide fine-tuning ecosystem. Meta, 8B.

Text Generation
8B
Llama-3.1-8B-Instruct

Multilingual instruction following, tool use, and function calling. Meta, 8B.

Text Generation
8B
Llama-3.2-11B-Vision

Image + text inputs on Llama's reasoning foundation, 128K context. Meta, 11B.

Image-Text-to-Text
11B
Llama-3.2-11B-Vision-Instruct

Charts, image Q&A, visual documents — multimodal on a single accelerator. Meta, 11B.

Image-Text-to-Text
11B
Llama-3.2-1B

Sub-gigabyte base model for on-device and embedded inference. Meta, 1B.

Text Generation
1B
Llama-3.2-1B-Instruct

Instruction-tuned to run anywhere — on-device at 1B. Meta, 1B.

Text Generation
1B
Llama-3.2-3B

On-device reasoning with room for language understanding and light tool use. Meta, 3B.

Text Generation
3B
Llama-3.2-3B-Instruct

Lightweight agent backbone for low-latency, resource-constrained deployments. Meta, 3B.

Text Generation
3B
Llama-3.2-90B-Vision

Image + text inputs at full scale on Llama's reasoning foundation. Meta, 90B.

Image-Text-to-Text
90B
Llama-3.2-90B-Vision-Instruct

Document analysis, chart reading, OCR-level image understanding at full scale. Meta, 90B.

Image-Text-to-Text
90B
Llama-3.3-70B

Llama 3.1 refined — better math, code, and multilingual, same 128K context. Meta, 70B.

Text Generation
70B
Llama-3.3-70B-Instruct

Stronger structured tasks, tool use, and reasoning than prior Llama generations. Meta, 70B.

Text Generation
70B
Mistral-7B

Sliding window and grouped-query attention for fast, memory-lean inference. Mistral AI, 7B.

Text Generation
7B
Mistral-7B-Instruct-v0.3

Function-calling and instruction following for production and agentic use. Mistral AI, 7B.

Text Generation
7B
Mixtral-8x7B

Sparse MoE: 13B active per token out of 45B total — quality at low per-token cost. Mistral AI, 45B.

Text Generation
45B
MobileNet V2

Inverted residuals and linear bottlenecks — strong accuracy at near-zero inference cost. Google, 3.4M.

Image Classification
3.4M
Mochi 1

Text-to-video focused on motion quality and temporal coherence. Genmo, 10B.

Text-to-Video
10B
Motif Vision 6B Preview

Preview text-to-video from natural language prompts. Motif, 6B.

Text-to-Video
6B
Phi-3-Mini-128K-Instruct

Compact 3.8B instruction model with 128K context tuned for reasoning on constrained hardware. Microsoft, 3.8B.

Text Generation
3.8B
Qwen2.5-32B

General-purpose 32B text model with strong multilingual and code coverage. Alibaba, 32B.

Text Generation
32B
Qwen2.5-72B-Instruct

Instruction-tuned across Chinese, English, coding, and structured output at scale. Alibaba, 72B.

Text Generation
72B
Qwen2.5-7B

Multilingual base for Chinese, English, coding, and math. Alibaba, 7B.

Text Generation
7B
Qwen2.5-7B-Instruct

Tuned for code, math, and structured output at an efficient scale. Alibaba, 7B.

Text Generation
7B
Qwen2.5-Coder-32B

Code-specialist 32B tuned for program synthesis, repair, and structured output. Alibaba, 32B.

Text Generation
32B
Qwen2.5-VL-32B

Vision-language base — documents, charts, and multi-image Q&A. Alibaba, 32B.

Image-Text-to-Text
32B
Qwen2.5-VL-32B-Instruct

Document understanding, chart analysis, and multi-image Q&A. Alibaba, 32B.

Image-Text-to-Text
32B
Qwen2.5-VL-3B-Instruct

Vision-language for low-latency multimodal deployment on constrained hardware. Alibaba, 3B.

Image-Text-to-Text
3B
Qwen2.5-VL-72B

Vision-language base at scale — documents, charts, and scene understanding. Alibaba, 72B.

Image-Text-to-Text
72B
Qwen2.5-VL-72B-Instruct

Vision-language at scale — documents, charts, and scene understanding. Alibaba, 72B.

Image-Text-to-Text
72B
Qwen2.5-VL-7B-Instruct

Visual Q&A, OCR, and document parsing in a practical VLM footprint. Alibaba, 7B.

Image-Text-to-Text
7B
Qwen3-32B

Toggleable chain-of-thought for on-demand deep reasoning. Alibaba, 32B.

Text Generation
32B
Qwen3-8B

Reasoning depth on demand without a throughput penalty. Alibaba, 8B.

Text Generation
8B
Qwen3-Embedding-4B

Multilingual embeddings for retrieval and semantic similarity. Alibaba, 4B.

Embedding
4B
Qwen3-Embedding-8B

Higher-capacity multilingual embeddings for retrieval and reranking. Alibaba, 8B.

Embedding
8B
Qwen3-VL-32B-Instruct

Hybrid thinking + vision — documents, charts, UI, and multi-image tasks. Alibaba, 32B.

Image-Text-to-Text
32B
QwQ-32B

Chain-of-thought with self-reflection for math, science, and logic. Alibaba, 32B.

Text Generation
32B
ResNet-50

Skip connections that made deep networks trainable — the image classification baseline. Microsoft Research, 25M.

Image Classification
25M
SegFormer (b0)

Mix-transformer segmentation without positional encoding, accurate at low compute. NVIDIA, 3.8M.

Image Segmentation
3.8M
SpeechT5 (TTS task)

Unified encoder-decoder for natural text-to-speech synthesis. Microsoft, 307M.

Text-to-Speech
307M
Stable Diffusion 3.5 Large

MMDiT architecture with strong text rendering and prompt control for image generation. Stability AI, 8B.

Text-to-Image
8B
SD-XL 1.0-base

Dual-encoder base for high-resolution synthesis and fine-tuning pipelines. Stability AI, 6.6B.

Text-to-Image
6.6B
unet-base-vgg

Skip connections preserve spatial detail for precise segmentation — VGG backbone. 31M.

Image Segmentation
31M
vit-base

Image patches + self-attention, no convolutions — the original vision transformer. Google, 86M.

Image Classification
86M
vovnet-19b-ra

One-Shot Aggregation avoids DenseNet's redundant paths for better accuracy-per-FLOP. 11.2M.

Image Classification
11.2M
Wan2.2

Causal video transformer for text-to-video with strong motion coherence. Alibaba, 14B.

Text-to-Video
14B
whisper-large-v3

99 languages, 680K hours of training audio — built for robust speech recognition. OpenAI, 1.5B.

Speech-to-Text
1.5B

Bounty Program

Build an open future with us. Fix bugs, add features, and earn rewards.

Join the developer community

Get the latest updates, ask questions, and explore the open-source code.