# Deployment Requirements
This section covers the requirements for deploying the project.
## LLM (Large Language Model) and Embedding Model
- A SaaS LLM (such as the OpenAI API) or a self-hosted LLM that meets the following requirements:
  - Capability equal to or better than GPT-3.5
  - Provides an OpenAI-compatible API
- Embedding model: AutoFlow needs an embedding model to translate text into vectors. You can use one of the following:
  - An OpenAI-compatible embedding model
  - The Cohere embedding model
  - The ZhipuAI embedding model
  - The Jina AI API, which is free for 1M tokens
- (Optional) Reranker. You can also use the Jina AI API for this purpose; it is free for 1M tokens.
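In practice, "OpenAI-compatible API" means the server accepts the same request shape as OpenAI's chat completions endpoint. The sketch below shows that shape; the base URL and model name are placeholders for your own deployment, not values from this project.

```python
import json

# Hypothetical self-hosted endpoint -- substitute your deployment's URL.
BASE_URL = "http://localhost:8000"

def build_chat_request(model: str, prompt: str) -> tuple[str, dict]:
    """Return the (url, payload) pair for an OpenAI-style chat completion.

    An OpenAI-compatible server exposes POST /v1/chat/completions and
    accepts a JSON body with "model" and "messages" fields, as below.
    """
    url = f"{BASE_URL}/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, payload

url, payload = build_chat_request("my-local-model", "Hello")
print(url)
print(json.dumps(payload))
```

Any LLM server (vLLM, Ollama, etc.) that answers this request shape should work as the "OpenAI-like API" described above.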
## TiDB
- With a TiDB Serverless account, you can set up a TiDB cluster with Vector Search enabled. A free quota of 1M RUs per month is available.
- You can also use a self-hosted TiDB cluster (> v8.4) with Vector Search enabled, but note that Vector Search requires TiFlash to be enabled.
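To illustrate why TiFlash matters, here is a sketch of the SQL involved in using TiDB Vector Search. The table and column names are hypothetical; `VECTOR`, `VEC_COSINE_DISTANCE`, and `SET TIFLASH REPLICA` are TiDB's vector-search syntax.

```python
# Illustrative SQL for a TiDB vector workload. On a self-hosted cluster,
# the TiFlash replica is what serves vector queries -- hence the
# requirement above that TiFlash be enabled.

# A table with a vector column; the dimension (1536 here, an assumption)
# must match the embedding model you chose.
CREATE_TABLE = """
CREATE TABLE doc_chunks (
    id BIGINT PRIMARY KEY AUTO_RANDOM,
    content TEXT,
    embedding VECTOR(1536)
);
"""

# Self-hosted clusters need a TiFlash replica for vector queries.
ADD_TIFLASH_REPLICA = "ALTER TABLE doc_chunks SET TIFLASH REPLICA 1;"

# Nearest-neighbour retrieval by cosine distance.
KNN_QUERY = """
SELECT id, VEC_COSINE_DISTANCE(embedding, %s) AS distance
FROM doc_chunks
ORDER BY distance
LIMIT 5;
"""

for stmt in (CREATE_TABLE, ADD_TIFLASH_REPLICA, KNN_QUERY):
    print(stmt.strip())
```

On TiDB Serverless the TiFlash step is handled for you; on a self-hosted cluster you run it yourself.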
## Hardware
### If you are using cloud TiDB and a SaaS LLM
You can use any of the following hosting options to deploy the project:
- Cloud server providers such as AWS, Google Cloud, or Azure
- Your own server
We suggest the following server configuration:
| Name | Value |
|---|---|
| CPU | 4 vCPUs |
| Memory | 8 GB RAM |
| Disk | 200 GB SSD |
| Number of servers | 1 |
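As a quick sanity check before deploying, you can compare a candidate host against the suggested configuration. The thresholds below come from the table above; the helper function and its name are illustrative, not part of the project.

```python
# Check a host's resources against the suggested configuration for the
# cloud-TiDB + SaaS-LLM setup (4 vCPUs / 8 GB RAM / 200 GB SSD).
REQUIRED = {"cpus": 4, "mem_gb": 8, "disk_gb": 200}

def check_requirements(cpus: int, mem_gb: float, disk_gb: float) -> list[str]:
    """Return a list of human-readable shortfalls (empty if the host qualifies)."""
    problems = []
    if cpus < REQUIRED["cpus"]:
        problems.append(f"CPU: {cpus} vCPUs < {REQUIRED['cpus']} required")
    if mem_gb < REQUIRED["mem_gb"]:
        problems.append(f"Memory: {mem_gb:.1f} GB < {REQUIRED['mem_gb']} GB required")
    if disk_gb < REQUIRED["disk_gb"]:
        problems.append(f"Disk: {disk_gb:.1f} GB < {REQUIRED['disk_gb']} GB required")
    return problems

# Example: a host with only 2 vCPUs fails the CPU requirement.
print(check_requirements(2, 16, 500))
```

On Linux you could feed in real values via `os.cpu_count()` and `shutil.disk_usage("/")`; memory probing is platform-specific, so it is left out of this sketch.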
### If you are using self-hosted TiDB and a self-hosted LLM
Running both a self-hosted TiDB cluster and a self-hosted LLM requires a more powerful server to handle the load. We suggest the following server configuration:
| Name | Value |
|---|---|
| CPU | 32 vCPUs |
| Memory | 64 GB RAM |
| Disk | 500 GB SSD |
| GPU | 1 x NVIDIA A100 |
| Number of servers | 1 |
The GPU here is used to serve the LLM. You can substitute any other GPU model capable of running an LLM with capability beyond GPT-3.5.