May 22

Performance/precision: atan/asin/acos (fp32)

Join our developer community

Get the latest updates, ask questions, and browse the open source code.

GitHub Discord X

Developers

Models

Try our Hardware Compatibility tool

Bounties

Contribute to our open source software

Community

Join our developer community

Models

Explore models optimized on Tenstorrent hardware.

Don’t see yours listed? Check out TT-Forge, our compiler, to get other models running today.

Purpose

Model size

Software

Hardware

Blackhole

Wormhole

35 models

bge-large-en-v1.5

Purpose-built for semantic search, dense retrieval, and RAG. BAAI, 335M.

Feature Extraction

335M

DeepSeek-R1

Mixture-of-experts reasoning powerhouse rivaling closed frontier models on math and code. DeepSeek, 671B.

Text Generation

671B

efficientnet-b0

Compound-scaled CNN delivering state-of-the-art accuracy at a fraction of the parameters. Google, 5.3M.

Image Classification

5.3M

FLUX.1 [dev]

Flow-matching transformer for photorealistic text-to-image with strong prompt adherence. Black Forest Labs, 12B.

Text-to-Image

12B

FLUX.1 [schnell]

FLUX distilled to 4 steps — full quality, fraction of the compute. Black Forest Labs, 12B.

Text-to-Image

12B

gemma-3-1b-it

Tiny and capable — instruction-tuned for edge and on-device use. Google, 1B.

Text Generation

gemma-3-27b-it

Largest Gemma 3 — 140-language reasoning, coding, long context. Google, 27B.

Text Generation

27B

gemma-3-4b-it

Punches above its weight in reasoning and code. Google, 4B.

Text Generation

Llama-3.1-8B-Instruct

Multilingual instruction following, tool use, and function calling. Meta, 8B.

Text Generation

Llama-3.2-11B-Vision-Instruct

Charts, image Q&A, visual documents — multimodal on a single accelerator. Meta, 11B.

Image-Text-to-Text

11B

Llama-3.2-3B-Instruct

Lightweight agent backbone for low-latency, resource-constrained deployments. Meta, 3B.

Text Generation

Llama-3.2-90B-Vision-Instruct

Document analysis, chart reading, OCR-level image understanding at full scale. Meta, 90B.

Image-Text-to-Text

90B

Llama-3.3-70B-Instruct

Stronger structured tasks, tool use, and reasoning than prior Llama generations. Meta, 70B.

Text Generation

70B

MobileNet V2

Inverted residuals and linear bottlenecks — strong accuracy at near-zero inference cost. Google, 3.4M.

Image Classification

3.4M

Mochi 1

Text-to-video focused on motion quality and temporal coherence. Genmo, 10B.

Text-to-Video

10B

Motif Vision 6B Preview

Preview text-to-video from natural language prompts. Motif, 6B.

Text-to-Video

Qwen2.5-72B-Instruct

Instruction-tuned across Chinese, English, coding, and structured output at scale. Alibaba, 72B.

Text Generation

72B

Qwen2.5-VL-32B-Instruct

Document understanding, chart analysis, and multi-image Q&A. Alibaba, 32B.

Image-Text-to-Text

32B

Qwen2.5-VL-3B-Instruct

Vision-language for low-latency multimodal deployment on constrained hardware. Alibaba, 3B.

Image-Text-to-Text

Qwen2.5-VL-72B-Instruct

Vision-language at scale — documents, charts, and scene understanding. Alibaba, 72B.

Image-Text-to-Text

72B

Qwen2.5-VL-7B-Instruct

Visual Q&A, OCR, and document parsing in a practical VLM footprint. Alibaba, 7B.

Image-Text-to-Text

Qwen3-8B

Reasoning depth on demand without a throughput penalty. Alibaba, 8B.

Text Generation

Qwen3-Embedding-4B

Multilingual embeddings for retrieval and semantic similarity. Alibaba, 4B.

Embedding

Qwen3-Embedding-8B

Higher-capacity multilingual embeddings for retrieval and reranking. Alibaba, 8B.

Embedding

QwQ-32B

Chain-of-thought with self-reflection for math, science, and logic. Alibaba, 32B.

Text Generation

32B

ResNet-50

Skip connections that made deep networks trainable — the image classification baseline. Microsoft Research, 25M.

Image Classification

25M

SegFormer (b0)

Mix-transformer segmentation without positional encoding, accurate at low compute. NVIDIA, 3.8M.

Image Segmentation

3.8M

SpeechT5 (TTS task)

Unified encoder-decoder for natural text-to-speech synthesis. Microsoft, 307M.

Text-to-Speech

307M

Stable Diffusion 3.5 Large

MMDiT architecture with strong text rendering and prompt control for image generation. Stability AI, 8B.

Text-to-Image

SD-XL 1.0-base

Dual-encoder base for high-resolution synthesis and fine-tuning pipelines. Stability AI, 6.6B.

Text-to-Image

6.6B

unet-base-vgg

Skip connections preserve spatial detail for precise segmentation — VGG backbone. 31M.

Image Segmentation

31M

vit-base

Image patches + self-attention, no convolutions — the original vision transformer. Google, 86M.

Image Classification

86M

vovnet-19b-ra

One-Shot Aggregation avoids DenseNet's redundant paths for better accuracy-per-FLOP. 11.2M.

Image Classification

11.2M

Wan2.2

Causal video transformer for text-to-video with strong motion coherence. Alibaba, 14B.

Text-to-Video

14B

whisper-large-v3

99 languages, 680K hours of training audio — built for robust speech recognition. OpenAI, 1.5B.

Speech-to-Text

1.5B

Bounty Program

Build an open future with us. Fix bugs, add features, and earn rewards.

View all

Jul 15

Optimise `deg2rad` / `rad2deg` (fp32/bf16)

medium

Optimise/improve accuracy for pow(x, y) fp32 (non-integer exponent)

Improve tanh accuracy and performance (fp32)

exp: perf regression (WH/BH) and -NaN edge case (WH)

[SFPU] Optimize `atanh`, `asinh`, and `acosh` with Numerically Stable `log1p`-Based Implementations (WH B0/BH)

Jun 12

Support for Compiler Explorer

Jun 8

Optimise exp(x) for fp32

Optimise/improve accuracy for sinh/cosh (fp32/bf16)

May 22