
transformers-js


Install

mkdir -p .claude/skills/transformers-js && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/16132" && unzip -o skill.zip -d .claude/skills/transformers-js && rm skill.zip

Installs to .claude/skills/transformers-js

Activation

This is the description your AI agent reads to decide when to run this skill — the better it matches your request, the more reliably it fires.

Use Transformers.js to run state-of-the-art machine learning models directly in JavaScript/TypeScript. Supports NLP (text classification, translation, summarization), computer vision (image classification, object detection), audio (speech recognition, audio classification), and multimodal tasks. Works in browsers and server-side runtimes (Node.js, Bun, Deno) with WebGPU/WASM using pre-trained models from Hugging Face Hub.

About this skill

Transformers.js - Machine Learning for JavaScript

Transformers.js enables running state-of-the-art machine learning models directly in JavaScript across browsers and server-side runtimes (Node.js, Bun, Deno), with no Python server required.

When to Use This Skill

Use this skill when you need to:

  • Run ML models for text analysis, generation, or translation in JavaScript
  • Perform image classification, object detection, or segmentation
  • Implement speech recognition or audio processing
  • Build multimodal AI applications (text-to-image, image-to-text, etc.)
  • Run models client-side in the browser without a backend

Installation

NPM Installation

npm install @huggingface/transformers

Browser Usage (CDN)

<script type="module">
  import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers';
</script>

Core Concepts

1. Pipeline API

The pipeline API is the easiest way to use models. It groups together preprocessing, model inference, and postprocessing:

import { pipeline } from '@huggingface/transformers';

// Create a pipeline for a specific task
const pipe = await pipeline('sentiment-analysis');

// Use the pipeline
const result = await pipe('I love transformers!');
// Output: [{ label: 'POSITIVE', score: 0.999817686 }]

// IMPORTANT: Always dispose when done to free memory
await pipe.dispose();

⚠️ Memory Management: All pipelines must be disposed with pipe.dispose() when finished to prevent memory leaks. See examples in Code Examples for cleanup patterns across different environments.
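One way to make the disposal rule above hard to forget is to wrap pipeline use in a helper built on try/finally. The sketch below is a minimal illustration, not part of Transformers.js: `withPipeline` and its `createPipeline` argument are hypothetical names, and the only assumption about the pipeline object is that it exposes the `dispose()` method described above.

```javascript
// Hypothetical helper: `createPipeline` is any async factory that returns
// an object with a dispose() method, for example
//   () => pipeline('sentiment-analysis')
async function withPipeline(createPipeline, work) {
  const pipe = await createPipeline();
  try {
    // Run the caller's work and pass its result through.
    return await work(pipe);
  } finally {
    // Runs on success and on error, so the pipeline is always disposed.
    await pipe.dispose();
  }
}

// Possible usage (requires `pipeline` from '@huggingface/transformers'):
//   const result = await withPipeline(
//     () => pipeline('sentiment-analysis'),
//     (pipe) => pipe('I love transformers!')
//   );
```

Passing the factory in as an argument keeps the helper independent of any particular task or model.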

2. Model Selection

You can specify a custom model as the second argument:

const pipe = await pipeline(
  'sentiment-analysis',
  'Xenova/bert-base-multilingual-uncased-sentiment'
);

Finding Models:

Browse available Transformers.js models on the Hugging Face Hub:

https://huggingface.co/models?library=transformers.js&sort=trending

Tip: Filter by task type, sort by trending/downloads, and check model cards for performance metrics and usage examples.

3. Device Selection

Choose where to run the model:

// 'model-id' is a placeholder for a real model ID from the Hub

// Run on CPU (WASM is the default backend in browsers)
const cpuPipe = await pipeline('sentiment-analysis', 'model-id');

// Run on GPU (WebGPU)
const gpuPipe = await pipeline('sentiment-analysis', 'model-id', {
  device: 'webgpu',
});
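WebGPU is not available in every browser or runtime, so it can help to check for it before requesting `device: 'webgpu'`. The sketch below uses the standard `navigator.gpu.requestAdapter()` call from the WebGPU API; the helper name `pickDevice`, the `env` parameter, and the `'wasm'` fallback are assumptions made for illustration, and the right fallback device may differ per runtime.

```javascript
// Hypothetical helper: returns 'webgpu' when a WebGPU adapter can be
// obtained, otherwise `fallback`. `env` defaults to the global object and
// is a parameter only so the check can be exercised with a fake environment.
async function pickDevice(env = globalThis, fallback = 'wasm') {
  const gpu = env.navigator?.gpu;
  if (!gpu) return fallback; // WebGPU API is not exposed at all
  try {
    const adapter = await gpu.requestAdapter();
    // A null adapter means the API exists but no usable GPU was found.
    return adapter ? 'webgpu' : fallback;
  } catch {
    return fallback;
  }
}

// Possible usage:
//   const pipe = await pipeline('sentiment-analysis', 'model-id', {
//     device: await pickDevice(),
//   });
```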

4. Quantization Options

Control model precision vs. performance:

// Use quantized model (faster, smaller)
const pipe = await pipeline('sentiment-analysis', 'model-id', {
  dtype: 'q4',  // Options include 'fp32', 'fp16', 'q8', 'q4'
});
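To get a feel for the size side of this trade-off, the weights of a model can be estimated from its parameter count and the bits stored per weight. The sketch below is back-of-the-envelope arithmetic only: `estimateWeightsMB` is a hypothetical helper, the 100-million-parameter model is made up, and real files differ because of metadata, quantization scales, and layers that are kept at higher precision.

```javascript
// Approximate bytes stored per weight for each dtype listed above.
const BYTES_PER_WEIGHT = { fp32: 4, fp16: 2, q8: 1, q4: 0.5 };

// Rough size of the weights in MiB; ignores all per-file overhead.
function estimateWeightsMB(parameterCount, dtype) {
  const bytes = BYTES_PER_WEIGHT[dtype];
  if (bytes === undefined) throw new Error(`Unknown dtype: ${dtype}`);
  return (parameterCount * bytes) / (1024 * 1024);
}

// A hypothetical 100-million-parameter model:
for (const dtype of Object.keys(BYTES_PER_WEIGHT)) {
  console.log(dtype, Math.round(estimateWeightsMB(100e6, dtype)), 'MB');
}
// fp32 381 MB
// fp16 191 MB
// q8 95 MB
// q4 48 MB
```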

Supported Tasks

Note: All examples below show basic usage and assume pipeline has been imported from '@huggingface/transformers' as shown above.

Natural Language Processing

Text Classification

const classifier = await pipeline('text-classification');
const result = await classifier('This movie was amazing!');

Named Entity Recognition (NER)

const ner = await pipeline('token-classification');
const entities = await ner('My name is John and I live in New York.');

Question Answering

const qa = await pipeline('question-answering');
const answer = await qa({
  question: 'What is the capital of France?',
  context: 'Paris is the capital and largest city of France.'
});

Text Generation

const generator = await pipeline('text-generation', 'onnx-community/gemma-3-270m-it-ONNX');
const text = await generator('Once upon a time', {
  max_new_tokens: 100,
  do_sample: true,  // sampling must be enabled for temperature to take effect
  temperature: 0.7
});

For streaming and chat, see the Text Generation Guide, which covers:

  • Streaming token-by-token output with TextStreamer
  • Chat/conversation format with system/user/assistant roles
  • Generation parameters (temperature, top_k, top_p)
  • Browser and Node.js examples
  • React components and API endpoints
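The chat format mentioned in the list above can be sketched as an array of role/content messages passed to the generator in place of a plain string. Both helpers below are hypothetical, and the output shape that `lastReply` reads (the conversation returned under `generated_text`, with the assistant message last) is an assumption to verify against the guide and the model in use.

```javascript
// Build a conversation in the role/content message format.
function buildChat(systemPrompt, userPrompt) {
  return [
    { role: 'system', content: systemPrompt },
    { role: 'user', content: userPrompt },
  ];
}

// Assumption: for chat input the pipeline returns
//   [{ generated_text: [...inputMessages, { role: 'assistant', content }] }]
function lastReply(output) {
  const messages = output[0].generated_text;
  return messages.at(-1).content;
}

// Possible usage with the generator created earlier:
//   const output = await generator(
//     buildChat('You are a helpful assistant.', 'Tell me a joke.'),
//     { max_new_tokens: 100 }
//   );
//   console.log(lastReply(output));
```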

Translation

const translator = await pipeline('translation', 'Xenova/nllb-200-distilled-600M');
const output = await translator('Hello, how are you?', {
  src_lang: 'eng_Latn',
  tgt_lang: 'fra_Latn'
});

Summarization

const summarizer = await pipeline('summarization');
const summary = await summarizer(longText, {
  max_length: 100,
  min_length: 30
});

Zero-Shot Classification

const classifier = await pipeline('zero-shot-classification');
const result = await classifier('This is a story about sports.', ['politics', 'sports', 'technology']);
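A zero-shot result usually needs a little post-processing before it is useful. The sketch below assumes the result is an object with parallel `labels` and `scores` arrays (worth checking against the model's actual output); `rankLabels` is a hypothetical helper and the sample scores are made up.

```javascript
// Pair each candidate label with its score and sort from best to worst.
// Assumes `result` looks like { sequence, labels: [...], scores: [...] }.
function rankLabels(result) {
  return result.labels
    .map((label, i) => ({ label, score: result.scores[i] }))
    .sort((a, b) => b.score - a.score);
}

// Made-up result, for illustration only:
const sample = {
  sequence: 'This is a story about sports.',
  labels: ['politics', 'sports', 'technology'],
  scores: [0.05, 0.9, 0.05],
};
console.log(rankLabels(sample)[0].label); // sports
```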

Computer Vision

Image Classification

const classifier = await pipeline('image-classification');
// From a URL
const result = await classifier('https://example.com/image.jpg');
// Or from a local file path
const localResult = await classifier('image.jpg');

Object Detection

const detector = await pipeline('object-detection');
const objects = await detector('https://example.com/image.jpg');
// Returns: [{ label: 'person', score: 0.95, box: { xmin, ymin, xmax, ymax } }, ...]
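Detections in the shape shown above are often filtered by confidence before drawing or counting. The sketch below is a hypothetical helper, not a library function: it assumes each detection carries `score` and a `box` with `xmin`, `ymin`, `xmax`, `ymax` in the same units, and the 0.5 default cut-off is an arbitrary choice.

```javascript
// Keep detections at or above `minScore` and add width/height from the box.
function filterDetections(detections, minScore = 0.5) {
  return detections
    .filter((d) => d.score >= minScore)
    .map((d) => ({
      ...d,
      width: d.box.xmax - d.box.xmin,
      height: d.box.ymax - d.box.ymin,
    }));
}

// Made-up detections, for illustration only:
const detections = [
  { label: 'person', score: 0.95, box: { xmin: 10, ymin: 20, xmax: 110, ymax: 220 } },
  { label: 'dog', score: 0.3, box: { xmin: 0, ymin: 0, xmax: 50, ymax: 40 } },
];
console.log(filterDetections(detections).map((d) => d.label)); // [ 'person' ]
```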

Image Segmentation

const segmenter = await pipeline('image-segmentation');
const segments = await segmenter('https://example.com/image.jpg');

Depth Estimation

const depthEstimator = await pipeline('depth-estimation');
const depth = await depthEstimator('https://example.com/image.jpg');

Zero-Shot Image Classification

const classifier = await pipeline('zero-shot-image-classification');
const result = await classifier('image.jpg', ['cat', 'dog', 'bird']);

Audio Processing

Automatic Speech Recognition

const transcriber = await pipeline('automatic-speech-recognition');
const result = await transcriber('audio.wav');
// Returns: { text: 'transcribed text here' }

Audio Classification

const classifier = await pipeline('audio-classification');
const result = await classifier('audio.wav');

Text-to-Speech

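// speakerEmbeddings must be defined beforehand: SpeechT5 needs speaker
// embeddings (for example a Float32Array or a URL to an embeddings file)
const synthesizer = await pipeline('text-to-speech', 'Xenova/speecht5_tts');
const audio = await synthesizer('Hello, this is a test.', {
  speaker_embeddings: speakerEmbeddings
});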

Multimodal

Image-to-Text (Image Captioning)

const captioner = await pipeline('image-to-text');
const caption = await captioner('image.jpg');

Document Question Answering

const docQA = await pipeline('document-question-answering');
const answer = await docQA('document-image.jpg', 'What is the total amount?');

Zero-Shot Object Detection

const detector = await pipeline('zero-shot-object-detection');
const objects = await detector('image.jpg', ['person', 'car', 'tree']);

Feature Extraction (Embeddings)

// Token-level features
const extractor = await pipeline('feature-extraction');
const embeddings = await extractor('This is a sentence to embed.');
// Returns: tensor of shape [1, sequence_length, hidden_size]

// For sentence embeddings (mean pooling)
const sentenceExtractor = await pipeline('feature-extraction', 'onnx-community/all-MiniLM-L6-v2-ONNX');
const sentenceEmbeddings = await sentenceExtractor('Text to embed', { pooling: 'mean', normalize: true });
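Sentence embeddings are typically compared with cosine similarity. The sketch below works on plain number arrays, so it assumes the tensor has already been converted (for example with the tensor's `tolist()` method, which is an assumption to verify for your version); `dot` and `cosineSimilarity` are hypothetical helper names.

```javascript
// Dot product of two equal-length vectors.
function dot(a, b) {
  if (a.length !== b.length) throw new Error('Vectors must have equal length');
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Cosine similarity in [-1, 1]; higher means more similar.
function cosineSimilarity(a, b) {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// With normalize: true the vectors have unit length, so dot(a, b) alone
// already equals the cosine similarity.
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```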

Finding and Choosing Models

Browsing the Hugging Face Hub

Discover compatible Transformers.js models on Hugging Face Hub:

Base URL (all models):

https://huggingface.co/models?library=transformers.js&sort=trending

Filter by task using the pipeline_tag parameter:

Task | URL
Text Generation | https://huggingface.co/models?pipeline_tag=text-generation&library=transformers.js&sort=trending
Text Classification | https://huggingface.co/models?pipeline_tag=text-classification&library=transformers.js&sort=trending
Translation | https://huggingface.co/models?pipeline_tag=translation&library=transformers.js&sort=trending
Summarization | https://huggingface.co/models?pipeline_tag=summarization&library=transformers.js&sort=trending
Question Answering | https://huggingface.co/models?pipeline_tag=question-answering&library=transformers.js&sort=trending
Image Classification | https://huggingface.co/models?pipeline_tag=image-classification&library=transformers.js&sort=trending
Object Detection | https://huggingface.co/models?pipeline_tag=object-detection&library=transformers.js&sort=trending
Image Segmentation | https://huggingface.co/models?pipeline_tag=image-segmentation&library=transformers.js&sort=trending
Speech Recognition | https://huggingface.co/models?pipeline_tag=automatic-speech-recognition&library=transformers.js&sort=trending
Audio Classification | https://huggingface.co/models?pipeline_tag=audio-classification&library=transformers.js&sort=trending
Image-to-Text | https://huggingface.co/models?pipeline_tag=image-to-text&library=transformers.js&sort=trending
Feature Extraction | https://huggingface.co/models?pipeline_tag=feature-extraction&library=transformers.js&sort=trending
Zero-Shot Classification | https://huggingface.co/models?pipeline_tag=zero-shot-classification&library=transformers.js&sort=trending

Sort options:

  • &sort=trending - Most popular recently
  • &sort=downloads - Most downloaded overall
  • &sort=likes - Most liked by community
  • &sort=modified - Recently updated
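The task and sort parameters above combine into a single query string, which can be assembled programmatically. The sketch below uses the standard `URLSearchParams` API; `hubModelsUrl` is a hypothetical helper name, and it only reproduces the URL pattern already shown in this section.

```javascript
// Build a Hub search URL for Transformers.js-compatible models.
// `task` is an optional pipeline_tag value; `sort` is one of the options above.
function hubModelsUrl({ task, sort = 'trending' } = {}) {
  const params = new URLSearchParams();
  if (task) params.set('pipeline_tag', task);
  params.set('library', 'transformers.js');
  params.set('sort', sort);
  return `https://huggingface.co/models?${params}`;
}

console.log(hubModelsUrl({ task: 'text-generation', sort: 'downloads' }));
// https://huggingface.co/models?pipeline_tag=text-generation&library=transformers.js&sort=downloads
```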

Choosing the Right Model

Consider these factors when selecting a model:

1. Model Size

  • Small (< 100MB): Fast, suitable for browsers, limited accuracy
  • Medium (100MB - 500MB): Balanced performance, g

Content truncated.
