Maintained by deepset

Integration: Hugging Face Transformers

Run Transformers models locally in your Haystack pipelines

Authors

deepset

Overview

Transformers is Hugging Face’s library for state-of-the-art machine learning models. With this integration, you can run models from the Hugging Face Hub on your own machine as part of your Haystack pipelines.

Haystack supports Hugging Face models in other ways too:

Sentence Transformers for local embedding and ranking models
Hugging Face API to call models via Inference Providers, Inference Endpoints, or self-hosted TGI/TEI
Optimum for high-performance inference with ONNX Runtime

Installation

pip install haystack-ai "transformers[torch,sentencepiece]"

Usage

Components

Haystack provides several components that run Transformers models locally:

HuggingFaceLocalChatGenerator: chat generation with local LLMs.
ExtractiveReader: extracts answers from documents using question answering models.
TransformersTextRouter and TransformersZeroShotTextRouter: route text to different pipeline branches based on classification.
TransformersZeroShotDocumentClassifier: classifies documents with zero-shot classification models.
NamedEntityExtractor: annotates named entities in documents (with the hugging_face backend).
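
As a rough sketch of how a router picks a branch, the snippet below wraps TransformersZeroShotTextRouter. It assumes the router exposes one output per label and emits the input text on the output named after the predicted label. The helper names build_router and chosen_branch are hypothetical, and the router falls back to the component's default model because none is passed.

```python
def build_router(labels):
    """Create a zero-shot text router with one output per label.

    Calling this downloads and loads a model, so it is kept separate
    from the pure helper below.
    """
    # Imported lazily so chosen_branch can be used without Haystack installed.
    from haystack.components.routers import TransformersZeroShotTextRouter

    router = TransformersZeroShotTextRouter(labels=labels)
    router.warm_up()  # load the model before the first run
    return router


def chosen_branch(router_output):
    """Return (label, text) for the single branch a router run produced.

    Assumption: a run yields a dict with exactly one key, the predicted label.
    """
    if len(router_output) != 1:
        raise ValueError(f"expected exactly one active branch, got {list(router_output)}")
    label, text = next(iter(router_output.items()))
    return label, text
```

With Haystack installed, build_router(["query", "passage"]).run(text="What is the capital of Germany?") should return a one-key dictionary that chosen_branch can unpack. In a pipeline, each label becomes an output socket that can be connected to a different downstream component.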

Chat Generation

Use HuggingFaceLocalChatGenerator to run a chat model locally:

from haystack.components.generators.chat import HuggingFaceLocalChatGenerator
from haystack.dataclasses import ChatMessage

generator = HuggingFaceLocalChatGenerator(model="Qwen/Qwen3-0.6B")
# Load the model up front; some Haystack releases require this before run()
generator.warm_up()

messages = [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]
print(generator.run(messages))
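
The generator returns a dictionary rather than a plain string. The sketch below pulls out the generated text, assuming the result has a "replies" list of ChatMessage objects that expose a text attribute (as in recent Haystack 2.x releases). reply_texts is a hypothetical helper name.

```python
def reply_texts(result):
    """Collect the text of every reply in a chat generator result."""
    # Assumption: result looks like {"replies": [ChatMessage, ...]} and each
    # message exposes its content through a `text` attribute.
    return [message.text for message in result.get("replies", [])]
```

For example, reply_texts(generator.run(messages)) would give a list with one string per generated reply.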

Extractive Question Answering

Use ExtractiveReader to extract answers from the relevant context:

from haystack import Document, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.readers import ExtractiveReader

docs = [Document(content="Paris is the capital of France."),
        Document(content="Berlin is the capital of Germany."),
        Document(content="Rome is the capital of Italy."),
        Document(content="Madrid is the capital of Spain.")]
document_store = InMemoryDocumentStore()
document_store.write_documents(docs)

retriever = InMemoryBM25Retriever(document_store=document_store)
reader = ExtractiveReader(model="deepset/roberta-base-squad2-distilled")

extractive_qa_pipeline = Pipeline()
extractive_qa_pipeline.add_component(instance=retriever, name="retriever")
extractive_qa_pipeline.add_component(instance=reader, name="reader")
extractive_qa_pipeline.connect("retriever.documents", "reader.documents")

query = "What is the capital of France?"
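result = extractive_qa_pipeline.run(data={"retriever": {"query": query, "top_k": 3},
                                          "reader": {"query": query, "top_k": 2}})
# The reader's answers are returned under the "reader" component name
print(result["reader"]["answers"])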

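The dictionary returned by the pipeline can be turned into readable lines. This sketch assumes the reader's output sits under result["reader"]["answers"], that each answer exposes data and score attributes, and that an answer whose data is None stands for the reader's "no answer" candidate. format_answers and its min_score parameter are hypothetical.

```python
def format_answers(result, min_score=0.0):
    """Turn a pipeline result into printable lines, best answer first."""
    answers = result.get("reader", {}).get("answers", [])
    lines = []
    for answer in sorted(answers, key=lambda a: a.score, reverse=True):
        if answer.score < min_score:
            continue  # drop low-confidence candidates
        # Assumption: data is None for the "no answer" candidate.
        text = answer.data if answer.data is not None else "<no answer>"
        lines.append(f"{answer.score:.2f}  {text}")
    return lines
```

Pass the value returned by extractive_qa_pipeline.run(...) to format_answers and print each line. Raising min_score is one simple way to hide weak candidates.
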
Zero-Shot Document Classification

Use TransformersZeroShotDocumentClassifier to classify documents with labels of your choice, without fine-tuning:

from haystack import Document
from haystack.components.classifiers import TransformersZeroShotDocumentClassifier

documents = [Document(content="Today was a nice day!"),
             Document(content="Yesterday was a bad day!")]

classifier = TransformersZeroShotDocumentClassifier(
    model="cross-encoder/nli-deberta-v3-xsmall",
    labels=["positive", "negative"],
)
# Load the model up front; some Haystack releases require this before run()
classifier.warm_up()

result = classifier.run(documents=documents)
print([doc.meta["classification"]["label"] for doc in result["documents"]])
# ['positive', 'negative']
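
Each prediction can also be filtered by confidence. The sketch below keeps only confident labels, assuming doc.meta["classification"] carries a "score" key next to the "label" key used above. The "score" key is an assumption based on recent Haystack releases, and confident_labels is a hypothetical helper.

```python
def confident_labels(documents, threshold=0.8, field="classification"):
    """Return (content, label) pairs whose top score reaches the threshold."""
    selected = []
    for doc in documents:
        prediction = doc.meta.get(field)
        if prediction is None:
            continue  # document was never classified
        # Assumption: a missing "score" is treated as zero confidence.
        if prediction.get("score", 0.0) >= threshold:
            selected.append((doc.content, prediction["label"]))
    return selected
```

Calling confident_labels(result["documents"]) after the classifier run would list only the documents whose top label clears the threshold.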

Named Entity Recognition

Use NamedEntityExtractor to annotate named entities in documents:

from haystack import Document
from haystack.components.extractors.named_entity_extractor import NamedEntityExtractor

documents = [
    Document(content="I'm Merlin, the happy pig!"),
    Document(content="My name is Clara and I live in Berkeley, California."),
]
extractor = NamedEntityExtractor(backend="hugging_face", model="dslim/bert-base-NER")
# Load the model up front; some Haystack releases require this before run()
extractor.warm_up()

results = extractor.run(documents=documents)["documents"]
annotations = [NamedEntityExtractor.get_stored_annotations(doc) for doc in results]
print(annotations)
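
The annotations are easier to read next to the text they cover. This sketch assumes each annotation exposes entity, start, and end attributes, with start and end being character offsets into the document content. entity_spans is a hypothetical helper.

```python
def entity_spans(content, annotations):
    """Map each annotation to (entity type, matched text)."""
    # Assumption: start/end are character offsets, end exclusive.
    return [(ann.entity, content[ann.start:ann.end]) for ann in annotations]
```

For each document in results, entity_spans(doc.content, NamedEntityExtractor.get_stored_annotations(doc)) would pair every entity type with the matching substring.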