Embedding Model
The Embedding Model converts given input into numerical vectors (embeddings) that represent the semantic meaning of the input text.
In Autoflow, we use the Embedding Model to vectorize documents and store them in TiDB. This enables us to leverage TiDB’s Vector Search capability to retrieve relevant documents for user queries.
Configure Embedding Model
After logging in with an admin account, you can configure the Embedding Model in the admin panel.
-
Click on the
Models > Embedding Models
tab; -
Click the
New Embedding Model
button, select your preferred embedding model provider, and configure the model parameters.
Supported Providers
Currently Autoflow supports the following embedding model providers:
OpenAI
OpenAI provides a variety of Embedding Models, we recommend using the OpenAI text-embedding-3-small
model due to its performance and compatibility with Autoflow.
Supported Models:
Embedding Model | Vector Dimensions | Max Tokens |
---|---|---|
text-embedding-3-small | 1536 | 8191 |
For more information, see the OpenAI Embedding Models documentation.
OpenAI-Like
Autoflow also supports embedding model providers (such as ZhipuAI) that conform to the OpenAI API specification.
You can also use models deployed on local AI model platforms (such as vLLM and Xinference) that conform to the OpenAI API specification in Autoflow.
To use OpenAI-Like embedding model providers, you need to provide the base URL of the embedding API as the following JSON format in Advanced Settings:
{
"api_base": "{api_base_url}"
}
ZhipuAI BigModel
For example, the embedding API endpoint for ZhipuAI is:
https://open.bigmodel.cn/api/paas/v4/embeddings
You need to set up the base URL in the Advanced Settings as follows:
{
"api_base": "https://open.bigmodel.cn/api/paas/v4/"
}
Supported Models:
Embedding Model | Vector Dimensions | Max Tokens |
---|---|---|
embedding-3 | 2048 | 8192 |
For more information, see the ZhipuAI embedding models documentation.
vLLM
When serving locally, the default embedding API endpoint for vLLM is:
http://localhost:8000/v1/embeddings
You need to set up the base URL in the Advanced Settings as follows:
{
"api_base": "http://localhost:8000/v1/"
}
For more information, see the vLLM documentation.
JinaAI
JinaAI provides multimodal multilingual long-context Embedding Models for RAG applications.
Supported Models:
Embedding Model | Vector Dimensions | Max Tokens |
---|---|---|
jina-clip-v1 | 768 | 8192 |
jina-embeddings-v3 | 1024 | 8192 |
For more information, see the JinaAI embedding models documentation.
Cohere
Cohere provides industry-leading large language models (LLMs) and RAG capabilities tailored to meet the needs of enterprise use cases that solve real-world problems.
Supported Models:
Embedding Model | Vector Dimensions | Max Tokens |
---|---|---|
embed-multilingual-v3.0 | 1024 | 512 |
For more information, see the Cohere Embed documentation.
Amazon Bedrock
Amazon Bedrock is a fully managed foundation models service that provides a range of large language models and embedding models.
Featured Models:
Embedding Model | Vector Dimensions | Max Tokens |
---|---|---|
amazon.titan-embed-text-v2:0 | 1024 | 8192 |
amazon.titan-embed-text-v1 | 1536 | 8192 |
amazon.titan-embed-g1-text-02 | 1536 | 8192 |
cohere.embed-english-v3 | 1024 | 512 |
cohere.embed-multilingual-v3 | 1024 | 512 |
To check all embbeding models supported by Bedrock, go to Bedrock console.
To use Amazon Bedrock, you’ll need to provide a JSON Object of your AWS Credentials, as described in the AWS CLI config global settings:
{
"aws_access_key_id": "****",
"aws_secret_access_key": "****",
"aws_region_name": "us-west-2"
}
For more information, see the Amazon Bedrock documentation.
Ollama
Ollama is a lightweight framework for building and running large language models and embedding models locally.
Supported Models:
Embedding Model | Vector Dimensions | Max Tokens |
---|---|---|
nomic-embed-text | 768 | 8192 |
bge-m3 | 1024 | 8192 |
To use Ollama, you’ll need to configure the API base URL in the Advanced Settings:
{
"api_base": "http://localhost:11434"
}
For more information, see the Ollama embedding models documentation.
Gitee AI
Gitee AI is a third-party model provider that offers ready-to-use cutting-edge model APIs for AI developers.
Supported Models:
Embedding Model | Vector Dimensions | Max Tokens |
---|---|---|
bge-m3 | 1024 | 8192 |
bge-large-zh-v1.5 | 1024 | 512 |
bge-small-zh-v1.5 | 512 | 512 |
For more information, see the Gitee AI embedding models documentation.
Local Embedding Server
Autoflow’s local embedding server is a self-hosted embedding service built upon sentence-transformers and deployed on your own infrastructure.
You can choose from a variety of pre-trained models from Hugging Face, such as:
Embedding Model | Vector Dimensions | Max Tokens |
---|---|---|
BAAI/bge-m3 | 1024 | 8192 |
To configure the Local Embedding Service, set the API URL in the Advanced Settings:
{
"api_url": "http://local-embedding-reranker:5001/api/v1/embedding"
}