Vendor Catalog

vLLM

Self-hosted high-throughput inference

Vendor ID: vllm · Categories: AI

vLLM — high-throughput, memory-efficient inference engine. Exposes an OpenAI-compatible API surface.

Auth

Credential	Notes
`apiKey` (optional)	If your vLLM deployment is gated. Many self-hosted setups don't enable it.
`none`	If the vLLM server is reachable directly.

Capability	Wire protocol
LLM	`openai-chat-v1`
Embedding	`openai-chat-v1`-compatible

Configured per integration with a baseUrl pointing at the vLLM endpoint.

On this page