How to use the Hugging Face Inference API: a practical guide

If you're here, you probably want working code in five minutes. That's exactly what this guide gives you.

Steps to get started

  1. Create an account at huggingface.co and grab an API token.
  2. Pick a model (e.g. 'dicta-il/dictalm2.0' for Hebrew).
  3. POST to api-inference.huggingface.co/models/<model>.
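
For readers who want to see step 3 without any client library, here is a minimal sketch of the raw request. It assumes the token is sent as a Bearer header and the prompt goes in an "inputs" field of a JSON body; the helper names buildInferenceRequest, queryModel and FetchLike are hypothetical and exist only for illustration.

```typescript
// Base URL from step 3 of this guide.
const API_BASE = "https://api-inference.huggingface.co/models";

interface InferenceRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: assembles the POST request without sending it.
function buildInferenceRequest(
  model: string,
  token: string,
  inputs: string,
): InferenceRequest {
  return {
    url: `${API_BASE}/${model}`,
    init: {
      method: "POST",
      headers: {
        // Assumption: the API token is passed as a Bearer token.
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      // Assumption: the prompt is sent under the "inputs" key.
      body: JSON.stringify({ inputs }),
    },
  };
}

// Minimal shape of a fetch-compatible function, so the sender is easy to stub.
type FetchLike = (
  url: string,
  init: InferenceRequest["init"],
) => Promise<{
  ok: boolean;
  status: number;
  text(): Promise<string>;
  json(): Promise<unknown>;
}>;

// Hypothetical helper: sends the request and returns the parsed JSON body.
async function queryModel(
  fetchImpl: FetchLike,
  model: string,
  token: string,
  inputs: string,
): Promise<unknown> {
  const { url, init } = buildInferenceRequest(model, token, inputs);
  const res = await fetchImpl(url, init);
  if (!res.ok) {
    throw new Error(`Inference request failed: ${res.status} ${await res.text()}`);
  }
  return res.json();
}
```

In a runtime with a global fetch, passing fetch as the first argument to queryModel should be enough. Splitting the request builder from the sender keeps the URL and headers easy to inspect before anything goes over the network.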

Shortest snippet that will actually run

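import { HfInference } from "@huggingface/inference";

// Read the API token from the environment instead of hard-coding it.
const hf = new HfInference(process.env.HF_TOKEN);

// Top-level await: run this as an ES module.
const out = await hf.textGeneration({
  model: "dicta-il/dictalm2.0",
  inputs: "שלום, מה שלומך היום?", // "Hello, how are you today?"
});
console.log(out.generated_text);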

Common mistakes (the honest list)

Picking a model without checking its language support. Filter models by language=hebrew at huggingface.co/models.
Relying on the serverless API in production. Use a dedicated endpoint or self-host instead.
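
One way to make the move off serverless painless is to resolve the target URL in a single place. The sketch below is an assumption about how you might structure that, not part of any Hugging Face library: resolveInferenceUrl is a hypothetical helper, and the dedicated endpoint URL is whatever your own deployment exposes.

```typescript
// Serverless base URL, as used in the steps above.
const SERVERLESS_BASE = "https://api-inference.huggingface.co/models";

// Hypothetical helper: prefer a dedicated endpoint when one is configured,
// and fall back to the serverless URL for the given model otherwise.
function resolveInferenceUrl(model: string, dedicatedUrl?: string): string {
  const trimmed = dedicatedUrl?.trim();
  if (trimmed) {
    // Drop trailing slashes so callers get a consistent URL.
    return trimmed.replace(/\/+$/, "");
  }
  return `${SERVERLESS_BASE}/${model}`;
}
```

A typical call would pass the model name plus an optional value read from configuration, for example an environment variable such as HF_ENDPOINT_URL (a name chosen here for illustration). Leaving it unset keeps development on serverless; setting it in production switches traffic without touching the calling code.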