{
  "id": "245c52bc936f53ba90327800c73d1c3e",
  "object": "chat.completion",
  "model": "codestral",
  "usage": {
    "prompt_tokens": 16,
    "completion_tokens": 102,
    "total_tokens": 118
  },
  "created": 1732902806,
  "choices": [
    {
      "index": 0,
      "message": {
        "content": "\n // Use a regular expression to match any non-alphanumeric character and replace it with an empty string\n return str.replace(/[^a-zA-Z0-9]/g, '');\n}\n\n// Test the function\nconst inputString = \"Hello, World! 123\";\nconst outputString = removeSpecialCharactersWithRegex(inputString);\nconsole.log(outputString); // Output: \"HelloWorld123\"",
        "prefix": false,
        "role": "assistant"
      },
      "finish_reason": "stop"
    }
  ]
}
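A response of this shape can be consumed by reading the first choice's message content and the token usage. A minimal sketch (the response below is a trimmed version of the example above, with a shortened content string):

```python
import json

# Trimmed chat.completion-style response, following the documented shape
raw = """
{
  "id": "245c52bc936f53ba90327800c73d1c3e",
  "object": "chat.completion",
  "model": "codestral",
  "usage": {"prompt_tokens": 16, "completion_tokens": 102, "total_tokens": 118},
  "created": 1732902806,
  "choices": [
    {"index": 0,
     "message": {"content": "return str;", "prefix": false, "role": "assistant"},
     "finish_reason": "stop"}
  ]
}
"""
response = json.loads(raw)

# The generated completion lives in the first choice's message content
completion = response["choices"][0]["message"]["content"]
total_tokens = response["usage"]["total_tokens"]
print(completion)
print(total_tokens)  # 118
```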
Rate limits
The rate limit for the FIM Completion endpoint is 500 RPM (requests per minute) and 60,000 TPM (tokens per minute). Rate limits are defined at the workspace level, not per API key. Each model has its own rate limit. If you exceed your rate limit, you will receive a 429 Too Many Requests response.
Please note that the rate limits are subject to change; refer to this documentation for the most up-to-date information. If you need a higher rate limit, please contact us at [email protected].
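A client that hits the 429 limit can back off and retry. A minimal sketch; the `call_api` callable here is a hypothetical stand-in for your actual HTTP request:

```python
import time

def call_with_backoff(call_api, max_retries=5, initial_delay=0.05):
    """Retry a request with exponential backoff while the API returns 429."""
    delay = initial_delay
    for _ in range(max_retries):
        status, body = call_api()
        if status != 429:   # not rate-limited: return the result
            return body
        time.sleep(delay)   # wait before retrying
        delay *= 2          # exponential backoff
    raise RuntimeError("still rate-limited after retries")

# Hypothetical stand-in: returns 429 twice, then succeeds
attempts = {"n": 0}
def fake_call():
    attempts["n"] += 1
    return (429, None) if attempts["n"] < 3 else (200, "ok")

print(call_with_backoff(fake_call))  # ok
```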
Using the Continue AI Code Assistant
Combining the Codestral model with chat completion models from the Langdock API makes it possible to run the open-source AI code assistant Continue (continue.dev) entirely through the Langdock API. Continue is available as a VS Code extension and as a JetBrains extension.
To customize the models used by Continue, edit the configuration file at ~/.continue/config.json (macOS / Linux) or %USERPROFILE%\.continue\config.json (Windows). Example setup using Codestral for autocomplete and other models for chats/edits:
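A sketch of such a config.json. The model names, provider values, and apiBase URLs below are illustrative assumptions (the apiBase paths are inferred from the endpoint path documented on this page); consult Continue's own configuration reference for the full schema:

```json
{
  "models": [
    {
      "title": "Langdock Chat Model",
      "provider": "openai",
      "model": "gpt-4o",
      "apiBase": "https://api.langdock.com/openai/eu/v1",
      "apiKey": "YOUR_API_KEY"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Codestral",
    "provider": "mistral",
    "model": "codestral",
    "apiBase": "https://api.langdock.com/mistral/eu/v1",
    "apiKey": "YOUR_API_KEY"
  }
}
```

With a setup along these lines, Continue uses Codestral for inline tab autocompletion and the chat model for conversations and edits.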
Endpoint
POST /mistral/{region}/v1/fim/completions
Try it with the example cURL shown above.
Headers
Authorization (string) — required
API key as Bearer token. Format: "Bearer YOUR_API_KEY"
Path parameters
region (string, required)
The region of the API to use.
Available options:
eu
Body (application/json)
model (string) — required, default: codestral-2501
ID of the model to use. Currently, the only compatible model is:
codestral-2501
prompt (string) — required
The text/code to complete.
temperature (number)
What sampling temperature to use; recommended between 0.0 and 0.7. Higher values (e.g., 0.7) make output more random; lower values (e.g., 0.2) make it more focused/deterministic. We generally recommend altering this or top_p, but not both. The default value varies by model. Call the /models endpoint to retrieve the appropriate default.
Required range: 0 <= x <= 1.5
top_p (number) — default: 1
Nucleus sampling: the model considers tokens comprising the top top_p probability mass. We generally recommend altering this or temperature, but not both.
Required range: 0 <= x <= 1
max_tokens (integer)
Maximum number of tokens to generate in the completion. The token count of your prompt plus max_tokens cannot exceed the model's context length.
Required range: x >= 0
stream (boolean) — default: false
Whether to stream back partial progress. If set, tokens are sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message. Otherwise, the server returns the full result as JSON once generation is complete.
stop (string | string[])
Stop generation when this token is detected; alternatively, provide an array of tokens.
random_seed (integer)
The seed to use for random sampling. If set, repeated calls with the same seed and inputs will produce deterministic results.
Required range: x >= 0
suffix (string) — default: ""
Optional text/code that provides additional context for the model. When given both a prompt and a suffix, the model fills in the text between them. When no suffix is provided, the model simply completes the text starting from prompt.
min_tokens (integer)
The minimum number of tokens to generate in the completion.
Required range: x >= 0
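Putting the body parameters together, a fill-in-the-middle request body can be assembled like this. The sketch only builds and prints the payload rather than sending it; the prompt/suffix values are illustrative:

```python
import json

# Assemble a fill-in-the-middle request body using the parameters above.
# The model completes the code between `prompt` and `suffix`.
payload = {
    "model": "codestral-2501",
    "prompt": "def fibonacci(n):\n",
    "suffix": "\nprint(fibonacci(10))",
    "temperature": 0.2,  # low temperature for more deterministic code
    "max_tokens": 128,
    "stop": ["\n\n"],    # stop at the first blank line
}
print(json.dumps(payload, indent=2))
```

The serialized payload would then be POSTed to the endpoint above with the Authorization header set.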
Response (200 — application/json)
Successful response fields:
model (string) — Example: "mistral-small-latest"
id (string) — Example: "cmpl-e5cc70bb28c444948073e77776eb30ef"
object (string) — Example: "chat.completion"
usage (object) — required
usage.prompt_tokens (integer) — Example: 16
usage.completion_tokens (integer) — Example: 34
usage.total_tokens (integer) — Example: 50
choices (array of ChatCompletionChoice objects)
index (integer) — Example: 0
message (object) — contains the assistant's generated content
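When stream is true, partial results arrive as data-only server-sent events instead of a single JSON body. A minimal parsing sketch over a hard-coded example stream; the chunk shape used here (choices[0].delta.content, following the common chat.completion.chunk convention) is an assumption, not taken from this page:

```python
import json

# Hypothetical captured SSE stream: data-only events ending with [DONE]
stream_lines = [
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": ", world"}}]}',
    "data: [DONE]",
]

parts = []
for line in stream_lines:
    if not line.startswith("data: "):
        continue  # skip comments and blank keep-alive lines
    data = line[len("data: "):]
    if data == "[DONE]":  # end-of-stream sentinel
        break
    chunk = json.loads(data)
    parts.append(chunk["choices"][0]["delta"]["content"])

print("".join(parts))  # Hello, world
```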