
services/llmService

Classes

LLMTimeoutError

Defined in: services/llmService.ts:87

Extends

  • Error

Constructors

Constructor

new LLMTimeoutError(message, timeoutMs, provider): LLMTimeoutError

Defined in: services/llmService.ts:88

Parameters
message

string

timeoutMs

number

provider

LLMProviderId

Returns

LLMTimeoutError

Overrides

Error.constructor

Properties

timeoutMs

readonly timeoutMs: number

Defined in: services/llmService.ts:90

provider

readonly provider: LLMProviderId

Defined in: services/llmService.ts:91
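
For orientation, a minimal sketch of constructing and inspecting this error; the import path and message text are assumptions, not part of the documented API:

```ts
import { LLMTimeoutError } from './services/llmService'; // import path is an assumption

// Illustrative only: construct the error and read its fields.
const err = new LLMTimeoutError('Granite request exceeded its deadline', 120_000, 'granite');
console.log(err.timeoutMs);        // 120000
console.log(err.provider);         // "granite"
console.log(err instanceof Error); // true, since LLMTimeoutError extends Error
```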


LLMProviderError

Defined in: services/llmService.ts:98

Extends

  • Error

Constructors

Constructor

new LLMProviderError(message, provider, code?): LLMProviderError

Defined in: services/llmService.ts:99

Parameters
message

string

provider

LLMProviderId

code?

string

Returns

LLMProviderError

Overrides

Error.constructor

Properties

provider

readonly provider: LLMProviderId

Defined in: services/llmService.ts:101

code?

readonly optional code: string

Defined in: services/llmService.ts:102
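
A hedged sketch of distinguishing the two error classes around a provider call; the import path and the surrounding callLLM wrapper are hypothetical:

```ts
import {
  getDefaultLLMProvider,
  LLMTimeoutError,
  LLMProviderError,
} from './services/llmService'; // import path is an assumption

// Hypothetical wrapper that downgrades known LLM failures to a null result.
async function callLLM(prompt: string): Promise<string | null> {
  const provider = getDefaultLLMProvider();
  try {
    return await provider.generateText(prompt, { maxTokens: 512 });
  } catch (err) {
    if (err instanceof LLMTimeoutError) {
      console.warn(`Timed out after ${err.timeoutMs}ms on ${err.provider}`);
    } else if (err instanceof LLMProviderError) {
      console.warn(`Provider ${err.provider} failed (code: ${err.code ?? 'unknown'})`);
    } else {
      throw err; // unexpected failure, re-throw
    }
    return null;
  }
}
```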

Interfaces

LLMGenerateOptions

Defined in: services/llmService.ts:35

Properties

maxTokens?

optional maxTokens: number

Defined in: services/llmService.ts:37

Maximum tokens to generate

temperature?

optional temperature: number

Defined in: services/llmService.ts:39

Temperature for generation (0-1)

topP?

optional topP: number

Defined in: services/llmService.ts:41

Top-p sampling

systemInstruction?

optional systemInstruction: string

Defined in: services/llmService.ts:43

System instruction/prompt

timeoutMs?

optional timeoutMs: number

Defined in: services/llmService.ts:45

Timeout in milliseconds (default: 120000 for non-streaming, 180000 for streaming)

preCalculatedInputTokens?

optional preCalculatedInputTokens: number

Defined in: services/llmService.ts:52

Pre-calculated input token count (optional). If provided, the provider will use this value instead of re-estimating. Useful when the caller has already calculated tokens (e.g., RAG service). Providers may ignore this if they don't support it.
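
A sketch of a fully populated options object; the import path and the concrete values are illustrative assumptions:

```ts
import type { LLMGenerateOptions } from './services/llmService'; // import path is an assumption

const options: LLMGenerateOptions = {
  maxTokens: 1024,               // cap on generated tokens
  temperature: 0.2,              // 0-1, lower = more deterministic
  topP: 0.9,                     // top-p (nucleus) sampling
  systemInstruction: 'You are a concise assistant.',
  timeoutMs: 60_000,             // override the 120s / 180s defaults
  preCalculatedInputTokens: 742, // e.g. a count already computed by a RAG pipeline
};
```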


LLMProviderInfo

Defined in: services/llmService.ts:55

Properties

id

id: LLMProviderId

Defined in: services/llmService.ts:56

modelId

modelId: string

Defined in: services/llmService.ts:57

displayName

displayName: string

Defined in: services/llmService.ts:58

maxContextLength

maxContextLength: number

Defined in: services/llmService.ts:66

Maximum context length (input + output tokens) for the model. Used to cap maxTokens in generation to prevent context overflow errors.

Source: Model documentation or vLLM --max-model-len configuration. Example: Granite 4.0-H-Tiny = 128000 tokens

endpoint

endpoint: string

Defined in: services/llmService.ts:71

Endpoint URL for the model. Used for logging, metrics, and diagnostics.
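
For illustration, a hypothetical object shaped like this interface; the model identifier, endpoint, and context length shown here are assumptions, not the module's actual configuration:

```ts
import type { LLMProviderInfo } from './services/llmService'; // import path is an assumption

const graniteInfo: LLMProviderInfo = {
  id: 'granite',
  modelId: 'ibm-granite/granite-3.2-8b-instruct', // hypothetical model identifier
  displayName: 'IBM Granite (self-hosted)',
  maxContextLength: 128_000,                      // input + output token budget
  endpoint: 'https://llm.example.internal/v1',    // placeholder, used for logging/metrics
};
```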


LLMProvider

Defined in: services/llmService.ts:74

Properties

info

readonly info: LLMProviderInfo

Defined in: services/llmService.ts:75

Methods

generateText()

generateText(prompt, options?): Promise<string>

Defined in: services/llmService.ts:78

Generate text (non-streaming)

Parameters
prompt

string

options?

LLMGenerateOptions

Returns

Promise<string>

generateTextStream()

generateTextStream(prompt, options?): AsyncGenerator<string, void, unknown>

Defined in: services/llmService.ts:81

Generate text with streaming

Parameters
prompt

string

options?

LLMGenerateOptions

Returns

AsyncGenerator<string, void, unknown>

Type Aliases

LLMProviderId

LLMProviderId = "granite" | "gemini" | "claude" | "openai"

Defined in: services/llmService.ts:33

Variables

DEFAULT_TIMEOUT_MS

const DEFAULT_TIMEOUT_MS: 120000 = 120_000

Defined in: services/llmService.ts:21

Default timeout for LLM generation (2 minutes)


DEFAULT_STREAM_TIMEOUT_MS

const DEFAULT_STREAM_TIMEOUT_MS: 180000 = 180_000

Defined in: services/llmService.ts:24

Default timeout for streaming LLM generation (3 minutes - longer for streaming)


INITIAL_RESPONSE_TIMEOUT_MS

const INITIAL_RESPONSE_TIMEOUT_MS: 60000 = 60_000

Defined in: services/llmService.ts:27

Timeout for initial connection/response (60 seconds - allows buffer for retries on 503s)

Functions

withTimeout()

withTimeout<T>(promise, timeoutMs, errorMessage): Promise<T>

Defined in: services/llmService.ts:119

Wrap a promise with a timeout

Type Parameters

T

T

Parameters

promise

Promise<T>

The promise to wrap

timeoutMs

number

Timeout in milliseconds

errorMessage

string

Error message if timeout occurs

Returns

Promise<T>
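
A sketch of wrapping an arbitrary promise, reusing DEFAULT_TIMEOUT_MS from this module; fetchCompletion is a hypothetical placeholder for the real upstream call:

```ts
import { withTimeout, DEFAULT_TIMEOUT_MS } from './services/llmService'; // import path is an assumption

declare function fetchCompletion(prompt: string): Promise<string>; // hypothetical upstream call

const text = await withTimeout(
  fetchCompletion('Explain vector databases in one paragraph.'),
  DEFAULT_TIMEOUT_MS,
  'LLM generation timed out after 120s',
);
console.log(text);
```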


withStreamTimeout()

withStreamTimeout<T>(generator, initialTimeoutMs, providerId): AsyncGenerator<T, void, unknown>

Defined in: services/llmService.ts:146

Wrap an async generator with a timeout for the first chunk. This ensures we fail fast if the LLM doesn't respond.

Type Parameters

T

T

Parameters

generator

AsyncGenerator<T, void, unknown>

initialTimeoutMs

number

providerId

LLMProviderId

Returns

AsyncGenerator<T, void, unknown>
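
A sketch of guarding a provider stream so the first chunk must arrive within INITIAL_RESPONSE_TIMEOUT_MS; the provider choice and prompt are illustrative:

```ts
import {
  withStreamTimeout,
  getLLMProvider,
  INITIAL_RESPONSE_TIMEOUT_MS,
} from './services/llmService'; // import path is an assumption

const provider = getLLMProvider('granite');
const guarded = withStreamTimeout(
  provider.generateTextStream('List three test cases for this endpoint.'),
  INITIAL_RESPONSE_TIMEOUT_MS,
  provider.info.id,
);

for await (const chunk of guarded) {
  process.stdout.write(chunk); // fails fast if no first chunk arrives in time
}
```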


getLLMProvider()

getLLMProvider(providerId): LLMProvider

Defined in: services/llmService.ts:210

Get an LLM provider by ID. Returns a singleton instance for each provider.

Parameters

providerId

LLMProviderId = 'granite'

Returns

LLMProvider
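
A short sketch of selecting providers by id; given the documented singleton behavior, repeated calls with the same id should return the same instance:

```ts
import { getLLMProvider } from './services/llmService'; // import path is an assumption

const granite = getLLMProvider();        // providerId defaults to 'granite'
const gemini = getLLMProvider('gemini');

console.log(granite === getLLMProvider('granite')); // expected true: singleton per provider id
console.log(gemini.info.modelId);
```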


getDefaultLLMProvider()

getDefaultLLMProvider(): LLMProvider

Defined in: services/llmService.ts:240

Get the default LLM provider. Defaults to Granite (self-hosted IBM Granite 3.2-8B-Instruct).

Returns

LLMProvider


getDefaultProviderInfo()

getDefaultProviderInfo(): LLMProviderInfo

Defined in: services/llmService.ts:248

Get provider info for the current default provider

Returns

LLMProviderInfo
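
A sketch combining the two helpers, for example to log which model is active before generating; the prompt and option values are illustrative:

```ts
import { getDefaultLLMProvider, getDefaultProviderInfo } from './services/llmService'; // import path is an assumption

const info = getDefaultProviderInfo();
console.log(`Default provider: ${info.displayName} (${info.modelId}), context window ${info.maxContextLength} tokens`);

const provider = getDefaultLLMProvider();
const answer = await provider.generateText('Draft a short status update.', { temperature: 0.3 });
console.log(answer);
```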