
granite_embedding

Granite Embedding Client for text-extractor service.

Generates 768-dimensional embeddings using the IBM Granite Embedding 278M model, served through vLLM's OpenAI-compatible API. The output dimension matches the existing pgvector schema.

Classes

GraniteEmbeddingClient

Client for IBM Granite Embedding model via vLLM.

Self-hosted embedding solution using IBM Granite on vLLM. Keeps the 768-dimension output so vectors remain compatible with the existing pgvector schema.

Constructor:

def __init__(self) -> None

Methods

generate_embeddings

def generate_embeddings(self, texts: list[str], batch_size: int = 32) -> list[list[float]]

Generate embeddings for a list of texts.

Args:
    texts: List of text strings to embed
    batch_size: Maximum texts per API request

Returns: List of embedding vectors (768 dimensions each)

Raises: Exception: If embedding generation fails after retries
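The batching and retry behaviour described above might look roughly like the following. This is a minimal sketch, not the client's actual implementation: `embed_in_batches`, `send_with_retries`, `build_payload`, `parse_response`, the `send` callable, the default model name, the retry count, and the backoff delays are all hypothetical. The only parts taken from outside this page are the request and response shapes of an OpenAI-compatible embeddings endpoint (`{"model": ..., "input": [...]}` in, `{"data": [{"index": ..., "embedding": [...]}]}` out).

```python
import time
from typing import Callable

EMBEDDING_DIM = 768  # dimension stated in this module's documentation

# Hypothetical transport: takes a JSON-like request body, returns the parsed response body.
Send = Callable[[dict], dict]


def build_payload(batch: list[str], model: str) -> dict:
    """Request body for an OpenAI-compatible embeddings call."""
    return {"model": model, "input": batch}


def parse_response(body: dict) -> list[list[float]]:
    """Extract vectors from an OpenAI-style response, ordered by 'index'."""
    items = sorted(body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def send_with_retries(send: Send, payload: dict,
                      max_attempts: int = 3, base_delay: float = 0.5) -> dict:
    """Call send(payload), retrying with exponential backoff (assumed policy)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send(payload)
        except Exception:
            if attempt == max_attempts:
                raise  # give up: surface the last error to the caller
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


def embed_in_batches(texts: list[str], send: Send, batch_size: int = 32,
                     model: str = "granite-embedding",  # placeholder model name
                     base_delay: float = 0.5) -> list[list[float]]:
    """Embed texts in chunks of at most batch_size, preserving input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    vectors: list[list[float]] = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        body = send_with_retries(send, build_payload(batch, model),
                                 base_delay=base_delay)
        vectors.extend(parse_response(body))
    # Guard against a model or config change that would break the pgvector column.
    if any(len(vector) != EMBEDDING_DIM for vector in vectors):
        raise ValueError(f"expected {EMBEDDING_DIM}-dimensional embeddings")
    return vectors
```

Passing the transport in as a callable keeps the sketch independent of any particular HTTP library. In a real client, `send` would presumably POST the payload to the vLLM server's embeddings route and return the decoded JSON.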

close

def close(self) -> None

Close the HTTP client.
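Because the documented constructor and methods do not mention context-manager support, one way to make sure `close()` always runs is the standard library's `contextlib.closing`. The sketch below uses a hypothetical `StubEmbeddingClient` in place of the real `GraniteEmbeddingClient` so that it runs without a vLLM server; `embed_and_close` is likewise an illustrative helper, not part of this module.

```python
from contextlib import closing


class StubEmbeddingClient:
    """Hypothetical stand-in with the same method names as the documented client."""

    def __init__(self) -> None:
        self.closed = False

    def generate_embeddings(self, texts: list[str],
                            batch_size: int = 32) -> list[list[float]]:
        # Placeholder vectors; the real client would call the vLLM server.
        return [[0.0] * 768 for _ in texts]

    def close(self) -> None:
        self.closed = True


def embed_and_close(client, texts: list[str]) -> list[list[float]]:
    """Embed texts, then close the client even if embedding raises."""
    with closing(client) as active:
        return active.generate_embeddings(texts)
```

This pattern suits a client created for a one-off job. A client obtained from a shared singleton accessor should usually be left open for other callers and closed once at application shutdown.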

Functions

get_granite_embedding_client

def get_granite_embedding_client() -> GraniteEmbeddingClient

Get or create singleton Granite embedding client.