khoj/documentation/docs/advanced/litellm.md

# LiteLLM
:::info
This is only helpful for self-hosted users. If you're using [Khoj Cloud](https://app.khoj.dev), you're limited to our first-party models.
:::

:::info
Khoj natively supports local LLMs [available on HuggingFace in GGUF format](https://huggingface.co/models?library=gguf). Using an OpenAI API proxy with Khoj maybe useful for ease of setup, trying new models or using commercial LLMs via API.
:::

[LiteLLM](https://docs.litellm.ai/docs/proxy/quick_start) exposes an OpenAI compatible API that proxies requests to other LLM API services. This provides a standardized API to interact with both open-source and commercial LLMs.

Using LiteLLM with Khoj makes it possible to turn any LLM behind an API into your personal AI agent.

## Setup
1. Install LiteLLM
   ```bash
   pip install litellm[proxy]
   ```
2. Start LiteLLM and use Mistral tiny via Mistral API
   ```
   export MISTRAL_API_KEY=<MISTRAL_API_KEY>
   litellm --model mistral/mistral-tiny --drop_params
   ```
3. Create a new [OpenAI Processor Conversation Config](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/add) on your Khoj admin panel
   - Name: `proxy-name`
   - Api Key: `any string`
   - Api Base Url: **URL of your Openai Proxy API**
4. Create a new [Chat Model Option](http://localhost:42110/server/admin/database/chatmodeloptions/add) on your Khoj admin panel.
   - Name: `llama3.1` (replace with the name of your local model)
   - Model Type: `Openai`
   - Openai Config: `<the proxy config you created in step 3>`
   - Max prompt size: `20000` (replace with the max prompt size of your model)
   - Tokenizer: *Do not set for OpenAI, Mistral, Llama3 based models*
5. Go to [your config](http://localhost:42110/settings) and select the model you just created in the chat model dropdown.
Split setup of specific OpenAI API proxies into separate doc pages 2024-06-24 11:24:50 +02:00			`# LiteLLM`
			`:::info`
			`This is only helpful for self-hosted users. If you're using [Khoj Cloud](https://app.khoj.dev), you're limited to our first-party models.`
			`:::`

			`:::info`
			`Khoj natively supports local LLMs [available on HuggingFace in GGUF format](https://huggingface.co/models?library=gguf). Using an OpenAI API proxy with Khoj maybe useful for ease of setup, trying new models or using commercial LLMs via API.`
			`:::`

			`[LiteLLM](https://docs.litellm.ai/docs/proxy/quick_start) exposes an OpenAI compatible API that proxies requests to other LLM API services. This provides a standardized API to interact with both open-source and commercial LLMs.`

			`Using LiteLLM with Khoj makes it possible to turn any LLM behind an API into your personal AI agent.`

			`## Setup`
			`1. Install LiteLLM`
			```bash
			`pip install litellm[proxy]`
			```
			`2. Start LiteLLM and use Mistral tiny via Mistral API`
			```
			`export MISTRAL_API_KEY=<MISTRAL_API_KEY>`
			`litellm --model mistral/mistral-tiny --drop_params`
			```
			`3. Create a new [OpenAI Processor Conversation Config](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/add) on your Khoj admin panel`
			- Name: `proxy-name`
			- Api Key: `any string`
			`- Api Base Url: URL of your Openai Proxy API`
			`4. Create a new [Chat Model Option](http://localhost:42110/server/admin/database/chatmodeloptions/add) on your Khoj admin panel.`
Update docs to mention using Llama 3.1 and 20K max prompt size for it Update stale credits to better reflect bigger open source dependencies 2024-08-23 05:27:58 +02:00			- Name: `llama3.1` (replace with the name of your local model)
Split setup of specific OpenAI API proxies into separate doc pages 2024-06-24 11:24:50 +02:00			- Model Type: `Openai`
			- Openai Config: `<the proxy config you created in step 3>`
Update docs to mention using Llama 3.1 and 20K max prompt size for it Update stale credits to better reflect bigger open source dependencies 2024-08-23 05:27:58 +02:00			- Max prompt size: `20000` (replace with the max prompt size of your model)
			`- Tokenizer: Do not set for OpenAI, Mistral, Llama3 based models`
Remove need to set server chat settings from use openai proxies docs This was previously required, but now it's only usefuly for more advanced settings, not typical for self-hosting users. With recent updates, the user's selected chat model is used for both Khoj's train of thought and response. This makes it easy to switch your preferred chat model directly from the user settings page and not have to update this in the admin panel as well. Reflect these code changse in the docs, by removing the unnecessary step for self-hosted users to create a server chat setting when using an OpenAI proxy service like Ollama, LiteLLM etc. 2024-11-06 02:03:17 +01:00			`5. Go to [your config](http://localhost:42110/settings) and select the model you just created in the chat model dropdown.`