Send chat completion request
Sends a request for a model response for the given chat conversation. Supports both streaming and non-streaming modes.
Request
This endpoint expects an object.
| 字段 | 类型 | 必填 | 说明 |
|---|---|---|---|
| messages | list of objects | 是 | List of messages for the conversation |
| provider | object or null | 否 | When multiple model providers are available, optionally indicate your routing preference. |
| plugins | list of objects | 否 | Plugins you want to enable for this request, including their settings. |
| user | string | 否 | Unique user identifier |
| session_id | string | 否 | <=128 characters A unique identifier for grouping related requests (e.g., a conversation or agent workflow) for observability. If provided in both the request body and the x-session-id header, the body value takes precedence. Maximum of 128 characters. |
| trace | object | 否 | Metadata for observability and tracing. Known keys (trace_id, trace_name, span_name, generation_name, parent_span_id) have special handling. Additional keys are passed through as custom metadata to configured broadcast destinations. |
| model | string | 否 | Model to use for completion |
| models | list of objects | 否 | Models to use for completion |
| frequency_penalty | double or null | 否 | -2-2 Frequency penalty (-2.0 to 2.0) |
| logit_bias | map from strings to doubles or null | 否 | Token logit bias adjustments |
| logprobs | boolean or null | 否 | Return log probabilities |
| top_logprobs | double or null | 否 | 0-20 Number of top log probabilities to return (0-20) |
| max_completion_tokens | double or null | 否 | >=1 Maximum tokens in completion |
| max_tokens | double or null | 否 | >=1 Maximum tokens (deprecated, use max_completion_tokens). Note: some providers enforce a minimum of 16. |
| metadata | map from strings to strings | 否 | Key-value pairs for additional object information (max 16 pairs, 64 char keys, 512 char values) |
| presence_penalty | double or null | 否 | -2-2 Presence penalty (-2.0 to 2.0) |
| reasoning | object | 否 | Configuration options for reasoning models |
| response_format | object | 否 | Response format configuration |
| seed | integer or null | 否 | Random seed for deterministic outputs |
| stop | string or list of strings or any | 否 | Stop sequences (up to 4) |
| stream | boolean | 否 | Defaults to false Enable streaming response |
| stream_options | object | 否 | Streaming configuration options |
| temperature | double or null | 否 | 0-2Defaults to 1 Sampling temperature (0-2) |
| parallel_tool_calls | boolean or null | 否 | - |
| tool_choice | enum or object | 否 | Tool choice configuration |
| tools | list of objects | 否 | Available tools for function calling |
| top_p | double or null | 否 | 0-1Defaults to 1 Nucleus sampling parameter (0-1) |
| debug | object | 否 | Debug options for inspecting request transformations (streaming only) |
| image_config | map from strings to strings or doubles or lists of any | 否 | Provider-specific image configuration options. Keys and values vary by model/provider. See https://novapai.ai/docs/guides/overview/multimodal/image-generation for more details. |
| modalities | list of enums | 否 | Output modalities for the response. Supported values are “text”, “image”, and “audio”. |
| cache_control | object | 否 | Enable automatic prompt caching. When set, the system automatically applies cache breakpoints to the last cacheable block in the request. Currently supported for Anthropic Claude models. |
Response
Successful chat completion response
| 字段 | 类型 | 必填 | 说明 |
|---|---|---|---|
| id | string | - | Unique completion identifier |
| choices | list of objects | - | List of completion choices |
| created | double | - | Unix timestamp of creation |
| model | string | - | Model used for completion |
| object | enum | - | - |
| system_fingerprint | string or null | - | System fingerprint |
| usage | object | - | Token usage statistics |