Endpoints
We provide different endpoints with different price/performance tradeoffs. Our endpoints are backed by internal models. Some of these models are open-weight, which allows users to deploy them on their own, on arbitrary infrastructure. See Self-deployment for details.
Generative endpoints
All our generative endpoints can reason over contexts of up to 32k tokens and follow fine-grained instructions.
We only provide chat access through our API. For endpoints that rely on open-weight models, users can access the underlying base models.
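As a sketch of what a chat request might look like: the example below assumes an OpenAI-style JSON schema, a placeholder base URL, and a placeholder environment variable (`API_URL` and `NIDO_API_KEY` are assumptions, not documented values); consult the API reference for the exact endpoint, authentication scheme, and response format. Only the model names (`nido-tiny`, `nido-small`, `nido-medium`) come from this page.

```python
import os
import requests

# Assumed base URL and credential location; replace with the values
# from the API reference.
API_URL = "https://api.example.com/v1/chat/completions"
API_KEY = os.environ["NIDO_API_KEY"]

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "nido-tiny",  # or "nido-small" / "nido-medium"
        "messages": [
            {"role": "user", "content": "Summarize this ticket in one sentence: ..."}
        ],
    },
    timeout=30,
)
response.raise_for_status()

# Assumes an OpenAI-style response body; the actual schema may differ.
print(response.json()["choices"][0]["message"]["content"])
```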
Tiny
This generative endpoint is best used for large batch processing tasks where cost is a significant factor but reasoning capabilities are not crucial.
Currently powered by Mistral-7B-v0.2, an improved fine-tune of the initial Mistral-7B release, inspired by the fantastic work of the community.
API name: nido-tiny
Small
Higher reasoning capabilities and a broader feature set than Tiny.
The endpoint supports English, French, German, Italian, and Spanish and can produce and reason about code.
Currently powered by Mixtral-8x7B-v0.1, a sparse mixture-of-experts model with 12B active parameters.
API name: nido-small
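The "12B active parameters" figure comes from top-2 expert routing: only two of the eight experts in each layer run for a given token, so far fewer parameters are used per token than are stored. A back-of-the-envelope sketch using Mixtral-8x7B-v0.1's publicly documented architecture (the dense-parameter figure below is an approximation, not an official spec):

```python
# Active-parameter arithmetic for a sparse mixture-of-experts model,
# using Mixtral-8x7B-v0.1's public config values.
hidden = 4096    # hidden size
ffn = 14336      # expert feed-forward intermediate size
layers = 32      # transformer layers
experts = 8      # experts per layer
top_k = 2        # experts routed per token

# Each expert is a SwiGLU FFN: gate, up, and down projections.
params_per_expert = 3 * hidden * ffn

# Always-active parameters (attention, embeddings, norms, router);
# rough approximation for illustration.
dense_params = 1.6e9

total = dense_params + layers * experts * params_per_expert
active = dense_params + layers * top_k * params_per_expert

print(f"total:  {total / 1e9:.1f}B")   # ~46.7B parameters stored
print(f"active: {active / 1e9:.1f}B")  # ~12.9B used per token (top-2 routing)
```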
Medium
This endpoint currently relies on an internal prototype model.
API name: nido-medium
Embedding models
Embedding models enable retrieval and retrieval-augmented generation applications.
Our endpoint outputs vectors in 1024 dimensions. It achieves a retrieval score of 55.26 on MTEB.
API name: nido-embed
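A minimal sketch of an embedding request, under the same assumptions as the chat example above (placeholder base URL, assumed OpenAI-style schema); only the `nido-embed` model name and the 1024-dimension output come from this page.

```python
import os
import requests

# Assumed base URL and credential location; replace with the values
# from the API reference.
API_URL = "https://api.example.com/v1/embeddings"
API_KEY = os.environ["NIDO_API_KEY"]

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"model": "nido-embed", "input": ["How do I reset my password?"]},
    timeout=30,
)
response.raise_for_status()

# Assumes an OpenAI-style response body; the actual schema may differ.
embedding = response.json()["data"][0]["embedding"]
print(len(embedding))  # expected: 1024
```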