Integration: Hugging Face Transformers
Run Transformers models locally in your Haystack pipelines
Overview
Transformers is Hugging Face’s library for state-of-the-art machine learning models. With this integration, you can run models from the Hugging Face Hub locally, on your own machine, in your Haystack pipelines.
Haystack supports Hugging Face models in other ways too:
- Sentence Transformers for local embedding and ranking models
- Hugging Face API to call models via Inference Providers, Inference Endpoints, or self-hosted TGI/TEI
- Optimum for high-performance inference with ONNX Runtime
Installation
pip install haystack-ai "transformers[torch,sentencepiece]"
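To confirm the installation before building a pipeline, a quick check along these lines can help. This is a minimal sketch: the helper name `check_install` is hypothetical, and the module names assume that the `haystack-ai` package is imported as `haystack` and that the extras above pull in `torch` and `sentencepiece`.

```python
from importlib.util import find_spec

# Importable module names for the packages installed above.
# Assumption: the pip package "haystack-ai" is imported as "haystack".
REQUIRED_MODULES = ["haystack", "transformers", "torch", "sentencepiece"]


def check_install(modules=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it can be found."""
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in check_install().items():
        print(f"{name}: {'ok' if found else 'missing'}")
```

The check only looks the modules up; it does not import them, so it stays fast even when large packages such as `torch` are present.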
Usage
Components
Haystack provides several components that run Transformers models locally:
- HuggingFaceLocalChatGenerator: chat generation with local LLMs.
- ExtractiveReader: extracts answers from documents using question answering models.
- TransformersTextRouter and TransformersZeroShotTextRouter: route text to different pipeline branches based on classification.
- TransformersZeroShotDocumentClassifier: classifies documents with zero-shot classification models.
- NamedEntityExtractor: annotates named entities in documents (with the hugging_face backend).
Chat Generation
Use HuggingFaceLocalChatGenerator to run a chat model locally:
from haystack.components.generators.chat import HuggingFaceLocalChatGenerator
from haystack.dataclasses import ChatMessage

generator = HuggingFaceLocalChatGenerator(model="Qwen/Qwen3-0.6B")
# Load the model up front; a standalone component may need this explicit call.
generator.warm_up()

messages = [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]
print(generator.run(messages))
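The generator returns a dictionary rather than a plain string. The sketch below shows one way to pull out the generated text, assuming the result holds a `replies` list of `ChatMessage` objects that expose a `text` attribute. The helper `first_reply_text` is hypothetical, and a stand-in object replaces the real `ChatMessage` so that the snippet runs without downloading a model.

```python
from types import SimpleNamespace


def first_reply_text(result):
    """Return the text of the first reply, or None if there are no replies.

    Assumes `result` looks like {"replies": [message, ...]} where each
    message has a `.text` attribute, as a ChatMessage does.
    """
    replies = result.get("replies", [])
    if not replies:
        return None
    return replies[0].text


# Stand-in for the dictionary returned by generator.run(messages);
# the reply text is invented for illustration, not real model output.
fake_result = {
    "replies": [SimpleNamespace(text="NLP helps computers process human language.")]
}
print(first_reply_text(fake_result))
```

With a real generator, pass the output of `generator.run(messages)` to the helper instead of `fake_result`.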
Extractive Question Answering
Use ExtractiveReader to extract answers from the relevant context:
from haystack import Document, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.readers import ExtractiveReader
docs = [
    Document(content="Paris is the capital of France."),
    Document(content="Berlin is the capital of Germany."),
    Document(content="Rome is the capital of Italy."),
    Document(content="Madrid is the capital of Spain."),
]
document_store = InMemoryDocumentStore()
document_store.write_documents(docs)
retriever = InMemoryBM25Retriever(document_store=document_store)
reader = ExtractiveReader(model="deepset/roberta-base-squad2-distilled")
extractive_qa_pipeline = Pipeline()
extractive_qa_pipeline.add_component(instance=retriever, name="retriever")
extractive_qa_pipeline.add_component(instance=reader, name="reader")
extractive_qa_pipeline.connect("retriever.documents", "reader.documents")
query = "What is the capital of France?"
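result = extractive_qa_pipeline.run(data={"retriever": {"query": query, "top_k": 3},
                                          "reader": {"query": query, "top_k": 2}})
# The reader's answers are nested under the component name.
print(result["reader"]["answers"])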
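The pipeline output nests the reader's answers under the `reader` component name. The following sketch shows one way to reduce them to readable pairs, assuming each answer object has a `data` attribute (the extracted text, or `None` for a "no answer" entry) and a numeric `score`. The helper `summarize_answers` is hypothetical, and the stand-in objects and their scores are invented for illustration rather than taken from a real model run.

```python
from types import SimpleNamespace


def summarize_answers(answers, min_score=0.0):
    """Return (text, score) pairs for answers that contain text.

    Skips entries whose `data` is None (assumed to mark "no answer")
    and entries scoring below `min_score`.
    """
    return [
        (answer.data, round(answer.score, 3))
        for answer in answers
        if answer.data is not None and answer.score >= min_score
    ]


# Stand-ins for the answer objects produced by the reader (made-up scores).
fake_answers = [
    SimpleNamespace(data="Paris", score=0.93),
    SimpleNamespace(data=None, score=0.04),
]
print(summarize_answers(fake_answers))
```

With a real pipeline, pass the list stored under the `reader` and `answers` keys of the pipeline output.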
Zero-Shot Document Classification
Use TransformersZeroShotDocumentClassifier to classify documents with labels of your choice, without fine-tuning:
from haystack import Document
from haystack.components.classifiers import TransformersZeroShotDocumentClassifier
documents = [
    Document(content="Today was a nice day!"),
    Document(content="Yesterday was a bad day!"),
]
classifier = TransformersZeroShotDocumentClassifier(
    model="cross-encoder/nli-deberta-v3-xsmall",
    labels=["positive", "negative"],
)
# Load the model up front; a standalone component may need this explicit call.
classifier.warm_up()
result = classifier.run(documents=documents)
print([doc.meta["classification"]["label"] for doc in result["documents"]])
# ['positive', 'negative']
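Once documents carry a classification, they can be grouped or filtered by it. The sketch below assumes each classified document stores a dictionary under `meta["classification"]` with a `label` key (as used above) and a `score` key; the `score` key and the helper `group_by_label` are assumptions made for illustration. Stand-in objects with invented scores replace real `Document` instances so that the snippet runs without a model.

```python
from types import SimpleNamespace


def group_by_label(documents, min_score=0.5):
    """Group document contents by predicted label.

    Documents whose classification score is below `min_score` are dropped.
    """
    groups = {}
    for doc in documents:
        classification = doc.meta["classification"]
        if classification["score"] >= min_score:
            groups.setdefault(classification["label"], []).append(doc.content)
    return groups


# Stand-ins for classified documents; the scores are made up.
fake_docs = [
    SimpleNamespace(
        content="Today was a nice day!",
        meta={"classification": {"label": "positive", "score": 0.97}},
    ),
    SimpleNamespace(
        content="Yesterday was a bad day!",
        meta={"classification": {"label": "negative", "score": 0.42}},
    ),
]
print(group_by_label(fake_docs))
```

A threshold such as `min_score` is one simple way to discard low-confidence predictions; a suitable value depends on the model and the labels.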
Named Entity Recognition
Use NamedEntityExtractor to annotate named entities in documents:
from haystack import Document
from haystack.components.extractors.named_entity_extractor import NamedEntityExtractor
documents = [
    Document(content="I'm Merlin, the happy pig!"),
    Document(content="My name is Clara and I live in Berkeley, California."),
]
extractor = NamedEntityExtractor(backend="hugging_face", model="dslim/bert-base-NER")
# Load the model before the first run.
extractor.warm_up()
results = extractor.run(documents=documents)["documents"]
annotations = [NamedEntityExtractor.get_stored_annotations(doc) for doc in results]
print(annotations)
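The stored annotations describe entities by character offsets rather than by text. The sketch below shows one way to recover the matching substrings, assuming each annotation exposes `entity`, `start`, and `end` attributes. The helper `entity_spans` is hypothetical, and the stand-in annotations (including the `PER` and `LOC` labels) are invented for illustration rather than produced by the model.

```python
from types import SimpleNamespace


def entity_spans(content, annotations):
    """Return (text, entity_type) pairs by slicing `content` with each annotation's offsets."""
    return [(content[a.start:a.end], a.entity) for a in annotations]


text = "My name is Clara and I live in Berkeley, California."


def fake_annotation(entity, word):
    """Build a stand-in annotation that covers the first occurrence of `word` in `text`."""
    start = text.index(word)
    return SimpleNamespace(entity=entity, start=start, end=start + len(word))


fake_annotations = [
    fake_annotation("PER", "Clara"),
    fake_annotation("LOC", "Berkeley"),
]
print(entity_spans(text, fake_annotations))
```

With real results, call the helper once per document, passing `doc.content` together with the annotations retrieved for that document.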
