Simplify integrating Ollama, OpenAI proxies with Khoj on first run

- Integrate with Ollama or other OpenAI-compatible APIs by simply
  setting the `OPENAI_API_BASE` environment variable in docker-compose etc.
- Update docs on integrating with Ollama, OpenAI proxies on first run
- Auto-populate all chat models supported by OpenAI-compatible APIs
- Auto-set vision enabled for all commercial models

- Minor
  - Add huggingface cache to khoj_models volume. This is where chat
    models and (now) sentence transformer models are stored by default
  - Reduce verbosity of yarn install of web app. Otherwise it hits the
    docker log size limit and stops showing remaining logs after the
    web app install
  - Suggest `ollama pull <model_name>` to set up the chat model in the
    background
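For example, the intended first-run flow now reduces to the sketch below (assumes Ollama's default port 11434 and the stock docker-compose.yml; `llama3.1` is just an example model):

```bash
# Download a chat model with Ollama
ollama pull llama3.1

# Uncomment this line in docker-compose.yml so Khoj can reach Ollama on the host:
#   - OPENAI_API_BASE=http://host.docker.internal:11434/v1/

# Start Khoj; it auto-populates the chat models served by Ollama
docker-compose up
```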
Debanjum 2024-11-16 23:53:11 -08:00
parent 2366fa08b9
commit 69ef6829c1
6 changed files with 164 additions and 84 deletions


@@ -37,7 +37,7 @@ ENV PYTHONPATH=/app/src:$PYTHONPATH
 
 # Go to the directory src/interface/web and export the built Next.js assets
 WORKDIR /app/src/interface/web
-RUN bash -c "yarn install --frozen-lockfile --verbose && yarn ciexport && yarn cache clean"
+RUN bash -c "yarn install --frozen-lockfile && yarn ciexport && yarn cache clean"
 WORKDIR /app
 
 # Run the Application


@@ -37,6 +37,7 @@ services:
     volumes:
       - khoj_config:/root/.khoj/
       - khoj_models:/root/.cache/torch/sentence_transformers
+      - khoj_models:/root/.cache/huggingface
     # Use 0.0.0.0 to explicitly set the host ip for the service on the container. https://pythonspeed.com/articles/docker-connection-refused/
     environment:
       - POSTGRES_DB=postgres
@@ -48,12 +49,17 @@ services:
       - KHOJ_DEBUG=False
       - KHOJ_ADMIN_EMAIL=username@example.com
      - KHOJ_ADMIN_PASSWORD=password
-      # Uncomment lines below to use chat models by each provider.
+      # Uncomment line below to use with Ollama running on your local machine at localhost:11434.
+      # Change URL to use with other OpenAI API compatible providers like vLLM, LMStudio etc.
+      # - OPENAI_API_BASE=http://host.docker.internal:11434/v1/
+      #
+      # Uncomment appropriate lines below to use chat models by OpenAI, Anthropic, Google.
       # Ensure you set your provider specific API keys.
       # ---
       # - OPENAI_API_KEY=your_openai_api_key
       # - GEMINI_API_KEY=your_gemini_api_key
       # - ANTHROPIC_API_KEY=your_anthropic_api_key
+      #
       # Uncomment the necessary lines below to make your instance publicly accessible.
       # Replace the KHOJ_DOMAIN with either your domain or IP address (no http/https prefix).
       # Proceed with caution, especially if you are using anonymous mode.
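To sanity-check the new setting, you can confirm the variable reached the container. A minimal sketch, assuming the Khoj service in your compose file is named `server`:

```bash
# Print the env var as seen inside the running Khoj container
docker-compose exec server printenv OPENAI_API_BASE
# Expected: http://host.docker.internal:11434/v1/
```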


@@ -1,33 +0,0 @@
-# Ollama
-
-:::info
-This is only helpful for self-hosted users. If you're using [Khoj Cloud](https://app.khoj.dev), you're limited to our first-party models.
-:::
-
-:::info
-Khoj natively supports local LLMs [available on HuggingFace in GGUF format](https://huggingface.co/models?library=gguf). Using an OpenAI API proxy with Khoj maybe useful for ease of setup, trying new models or using commercial LLMs via API.
-:::
-
-Ollama allows you to run [many popular open-source LLMs](https://ollama.com/library) locally from your terminal.
-For folks comfortable with the terminal, Ollama's terminal based flows can ease setup and management of chat models.
-
-Ollama exposes a local [OpenAI API compatible server](https://github.com/ollama/ollama/blob/main/docs/openai.md#models). This makes it possible to use chat models from Ollama to create your personal AI agents with Khoj.
-
-## Setup
-1. Setup Ollama: https://ollama.com/
-2. Start your preferred model with Ollama. For example,
-   ```bash
-   ollama run llama3.1
-   ```
-3. Create a new [OpenAI Processor Conversation Config](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/add) on your Khoj admin panel
-   - Name: `ollama`
-   - Api Key: `any string`
-   - Api Base Url: `http://localhost:11434/v1/` (default for Ollama)
-4. Create a new [Chat Model Option](http://localhost:42110/server/admin/database/chatmodeloptions/add) on your Khoj admin panel.
-   - Name: `llama3.1` (replace with the name of your local model)
-   - Model Type: `Openai`
-   - Openai Config: `<the ollama config you created in step 3>`
-   - Max prompt size: `20000` (replace with the max prompt size of your model)
-5. Go to [your config](http://localhost:42110/settings) and select the model you just created in the chat model dropdown.
-
-That's it! You should now be able to chat with your Ollama model from Khoj. If you want to add additional models running on Ollama, repeat step 6 for each model.


@@ -0,0 +1,78 @@
+# Ollama
+
+```mdx-code-block
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+```
+
+:::info
+This is only helpful for self-hosted users. If you're using [Khoj Cloud](https://app.khoj.dev), you can use our first-party supported models.
+:::
+
+:::info
+Khoj can directly run local LLMs [available on HuggingFace in GGUF format](https://huggingface.co/models?library=gguf). The integration with Ollama is useful if you run Khoj in Docker and want the chat models to use your GPU, or if you prefer to try new models via the Ollama CLI.
+:::
+
+Ollama allows you to run [many popular open-source LLMs](https://ollama.com/library) locally from your terminal.
+For folks comfortable with the terminal, Ollama's terminal based flows can ease setup and management of chat models.
+
+Ollama exposes a local [OpenAI API compatible server](https://github.com/ollama/ollama/blob/main/docs/openai.md#models). This makes it possible to use chat models from Ollama with Khoj.
+
+## Setup
+
+:::info
+Restart your Khoj server after you apply the settings below, whether on first run or on update, to ensure they take effect.
+:::
+
+<Tabs groupId="type" queryString>
+  <TabItem value="first-run" label="First Run">
+    <Tabs groupId="server" queryString>
+      <TabItem value="docker" label="Docker">
+      1. Setup Ollama: https://ollama.com/
+      2. Download your preferred chat model with Ollama. For example,
+         ```bash
+         ollama pull llama3.1
+         ```
+      3. Uncomment the `OPENAI_API_BASE` environment variable in your downloaded Khoj [docker-compose.yml](https://github.com/khoj-ai/khoj/blob/master/docker-compose.yml#:~:text=OPENAI_API_BASE)
+      4. Start the Khoj docker containers for the first time. Khoj will automatically integrate and load the models from the Ollama server running on your host machine
+         ```bash
+         # run below command in the directory where you downloaded the Khoj docker-compose.yml
+         docker-compose up
+         ```
+      </TabItem>
+      <TabItem value="pip" label="Pip">
+      1. Setup Ollama: https://ollama.com/
+      2. Download your preferred chat model with Ollama. For example,
+         ```bash
+         ollama pull llama3.1
+         ```
+      3. Set the `OPENAI_API_BASE` environment variable to `http://localhost:11434/v1` in your shell before starting Khoj for the first time
+         ```bash
+         export OPENAI_API_BASE="http://localhost:11434/v1"
+         khoj --anonymous-mode
+         ```
+      </TabItem>
+    </Tabs>
+  </TabItem>
+  <TabItem value="update" label="Update">
+  1. Setup Ollama: https://ollama.com/
+  2. Download your preferred chat model with Ollama. For example,
+     ```bash
+     ollama pull llama3.1
+     ```
+  3. Create a new [OpenAI Processor Conversation Config](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/add) on your Khoj admin panel
+     - Name: `ollama`
+     - Api Key: `any string`
+     - Api Base Url: `http://localhost:11434/v1/` (default for Ollama)
+  4. Create a new [Chat Model Option](http://localhost:42110/server/admin/database/chatmodeloptions/add) on your Khoj admin panel.
+     - Name: `llama3.1` (replace with the name of your local model)
+     - Model Type: `Openai`
+     - Openai Config: `<the ollama config you created in step 3>`
+     - Max prompt size: `20000` (replace with the max prompt size of your model)
+  5. Go to [your config](http://localhost:42110/settings) and select the model you just created in the chat model dropdown.
+
+  If you want to add additional models running on Ollama, repeat step 4 for each model.
+  </TabItem>
+</Tabs>
+
+That's it! You should now be able to chat with your Ollama model from Khoj.
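Before pointing Khoj at Ollama, you can also verify that Ollama's OpenAI-compatible server is up. A quick check against its documented `/v1` endpoint:

```bash
# Lists the models Ollama serves over its OpenAI-compatible API
curl http://localhost:11434/v1/models
```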


@@ -19,7 +19,11 @@ These are the general setup instructions for self-hosted Khoj.
 You can install the Khoj server using either [Docker](?server=docker) or [Pip](?server=pip).
 
 :::info[Offline Model + GPU]
-If you want to use the offline chat model and you have a GPU, you should use Installation Option 2 - local setup via the Python package directly. Our Docker image doesn't currently support running the offline chat model on GPU, making inference times really slow.
+To use the offline chat model with your GPU, we recommend using the Docker setup with Ollama. You can also use the local Khoj setup via the Python package directly.
+:::
+
+:::info[First Run]
+Restart your Khoj server after the first run to ensure all settings are applied correctly.
 :::
 
 <Tabs groupId="server" queryString>
@@ -28,27 +32,28 @@ If you want to use the offline chat model and you have a GPU, you should use Ins
 
   <TabItem value="macos" label="MacOS">
     <h3>Prerequisites</h3>
     <h4>Docker</h4>
-    (Option 1) Click here to install [Docker Desktop](https://docs.docker.com/desktop/install/mac-install/). Make sure you also install the [Docker Compose](https://docs.docker.com/desktop/install/mac-install/) tool.
-    (Option 2) Use [Homebrew](https://brew.sh/) to install Docker and Docker Compose.
+    - *Option 1*: Click here to install [Docker Desktop](https://docs.docker.com/desktop/install/mac-install/). Make sure you also install the [Docker Compose](https://docs.docker.com/desktop/install/mac-install/) tool.
+    - *Option 2*: Use [Homebrew](https://brew.sh/) to install Docker and Docker Compose.
       ```shell
       brew install --cask docker
       brew install docker-compose
       ```
     <h3>Setup</h3>
     1. Download the Khoj docker-compose.yml file [from Github](https://github.com/khoj-ai/khoj/blob/master/docker-compose.yml)
        ```shell
        mkdir ~/.khoj && cd ~/.khoj
        wget https://raw.githubusercontent.com/khoj-ai/khoj/master/docker-compose.yml
        ```
-    2. Configure the environment variables in the docker-compose.yml
+    2. Configure the environment variables in the `docker-compose.yml`
        - Set `KHOJ_ADMIN_PASSWORD`, `KHOJ_DJANGO_SECRET_KEY` (and optionally the `KHOJ_ADMIN_EMAIL`) to something secure. This allows you to customize Khoj later via the admin panel.
-       - Set `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY` to your API key if you want to use OpenAI, Anthropic or Gemini chat models respectively.
+       - Set `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY` to your API key if you want to use OpenAI, Anthropic or Gemini commercial chat models respectively.
+       - Uncomment `OPENAI_API_BASE` to use [Ollama](/advanced/ollama?type=first-run&server=docker#setup) running on your host machine. Or set it to the URL of your OpenAI compatible API like vLLM or [LMStudio](/advanced/lmstudio).
     3. Start Khoj by running the following command in the same directory as your docker-compose.yml file.
        ```shell
        cd ~/.khoj
        docker-compose up
        ```
   </TabItem>
   <TabItem value="windows" label="Windows">
     <h3>Prerequisites</h3>
@@ -61,20 +66,21 @@ If you want to use the offline chat model and you have a GPU, you should use Ins
     <h3>Setup</h3>
     1. Download the Khoj docker-compose.yml file [from Github](https://github.com/khoj-ai/khoj/blob/master/docker-compose.yml)
        ```shell
        # Windows users should use their WSL2 terminal to run these commands
        mkdir ~/.khoj && cd ~/.khoj
        wget https://raw.githubusercontent.com/khoj-ai/khoj/master/docker-compose.yml
        ```
-    2. Configure the environment variables in the docker-compose.yml
+    2. Configure the environment variables in the `docker-compose.yml`
        - Set `KHOJ_ADMIN_PASSWORD`, `KHOJ_DJANGO_SECRET_KEY` (and optionally the `KHOJ_ADMIN_EMAIL`) to something secure. This allows you to customize Khoj later via the admin panel.
-       - Set `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY` to your API key if you want to use OpenAI, Anthropic or Gemini chat models respectively.
+       - Set `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY` to your API key if you want to use OpenAI, Anthropic or Gemini commercial chat models respectively.
+       - Uncomment `OPENAI_API_BASE` to use [Ollama](/advanced/ollama) running on your host machine. Or set it to the URL of your OpenAI compatible API like vLLM or [LMStudio](/advanced/lmstudio).
     3. Start Khoj by running the following command in the same directory as your docker-compose.yml file.
        ```shell
        # Windows users should use their WSL2 terminal to run these commands
        cd ~/.khoj
        docker-compose up
        ```
   </TabItem>
   <TabItem value="linux" label="Linux">
     <h3>Prerequisites</h3>
@@ -83,18 +89,19 @@ If you want to use the offline chat model and you have a GPU, you should use Ins
     <h3>Setup</h3>
     1. Download the Khoj docker-compose.yml file [from Github](https://github.com/khoj-ai/khoj/blob/master/docker-compose.yml)
        ```shell
        mkdir ~/.khoj && cd ~/.khoj
        wget https://raw.githubusercontent.com/khoj-ai/khoj/master/docker-compose.yml
        ```
-    2. Configure the environment variables in the docker-compose.yml
+    2. Configure the environment variables in the `docker-compose.yml`
        - Set `KHOJ_ADMIN_PASSWORD`, `KHOJ_DJANGO_SECRET_KEY` (and optionally the `KHOJ_ADMIN_EMAIL`) to something secure. This allows you to customize Khoj later via the admin panel.
-       - Set `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY` to your API key if you want to use OpenAI, Anthropic or Gemini chat models respectively.
+       - Set `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY` to your API key if you want to use OpenAI, Anthropic or Gemini commercial chat models respectively.
+       - Uncomment `OPENAI_API_BASE` to use [Ollama](/advanced/ollama) running on your host machine. Or set it to the URL of your OpenAI compatible API like vLLM or [LMStudio](/advanced/lmstudio).
     3. Start Khoj by running the following command in the same directory as your docker-compose.yml file.
        ```shell
        cd ~/.khoj
        docker-compose up
        ```
   </TabItem>
 </Tabs>
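The same `OPENAI_API_BASE` mechanism works for any OpenAI-compatible server, not just Ollama. A sketch with the usual default ports (verify against how you start each server):

```bash
export OPENAI_API_BASE="http://localhost:11434/v1"  # Ollama
export OPENAI_API_BASE="http://localhost:8000/v1"   # vLLM's OpenAI-compatible server
export OPENAI_API_BASE="http://localhost:1234/v1"   # LM Studio's local server
```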


@@ -2,12 +2,13 @@ import logging
 import os
 from typing import Tuple
 
+import openai
+
 from khoj.database.adapters import ConversationAdapters
 from khoj.database.models import (
     ChatModelOptions,
     KhojUser,
     OpenAIProcessorConversationConfig,
-    ServerChatSettings,
     SpeechToTextModelOptions,
     TextToImageModelConfig,
 )
@@ -42,14 +43,32 @@ def initialization(interactive: bool = True):
         "🗣️ Configure chat models available to your server. You can always update these at /server/admin using your admin account"
     )
 
+    openai_api_base = os.getenv("OPENAI_API_BASE")
+    provider = "Ollama" if openai_api_base and openai_api_base.endswith(":11434/v1/") else "OpenAI"
+    openai_api_key = os.getenv("OPENAI_API_KEY", "placeholder" if openai_api_base else None)
+    default_chat_models = default_openai_chat_models
+    if openai_api_base:
+        # Get available chat models from OpenAI compatible API
+        try:
+            openai_client = openai.OpenAI(api_key=openai_api_key, base_url=openai_api_base)
+            default_chat_models = [model.id for model in openai_client.models.list()]
+            # Put the available default OpenAI models at the top
+            valid_default_models = [model for model in default_openai_chat_models if model in default_chat_models]
+            other_available_models = [model for model in default_chat_models if model not in valid_default_models]
+            default_chat_models = valid_default_models + other_available_models
+        except Exception as e:
+            logger.warning(f"⚠️ Failed to fetch {provider} chat models. Fallback to default models. Error: {e}")
+
     # Set up OpenAI's online chat models
     openai_configured, openai_provider = _setup_chat_model_provider(
         ChatModelOptions.ModelType.OPENAI,
-        default_openai_chat_models,
-        default_api_key=os.getenv("OPENAI_API_KEY"),
+        default_chat_models,
+        default_api_key=openai_api_key,
+        api_base_url=openai_api_base,
         vision_enabled=True,
         is_offline=False,
         interactive=interactive,
+        provider_name=provider,
     )
 
     # Setup OpenAI speech to text model
@@ -154,6 +173,7 @@ def initialization(interactive: bool = True):
     default_chat_models: list,
     default_api_key: str,
     interactive: bool,
+    api_base_url: str = None,
     vision_enabled: bool = False,
     is_offline: bool = False,
     provider_name: str = None,
@@ -172,14 +192,16 @@ def initialization(interactive: bool = True):
     logger.info(f"️💬 Setting up your {provider_name} chat configuration")
 
-    chat_model_provider = None
+    chat_provider = None
     if not is_offline:
         if interactive:
             user_api_key = input(f"Enter your {provider_name} API key (default: {default_api_key}): ")
             api_key = user_api_key if user_api_key != "" else default_api_key
         else:
             api_key = default_api_key
-        chat_model_provider = OpenAIProcessorConversationConfig.objects.create(api_key=api_key, name=provider_name)
+        chat_provider = OpenAIProcessorConversationConfig.objects.create(
+            api_key=api_key, name=provider_name, api_base_url=api_base_url
+        )
 
     if interactive:
         chat_model_names = input(
@@ -201,13 +223,13 @@ def initialization(interactive: bool = True):
             "max_prompt_size": default_max_tokens,
             "vision_enabled": vision_enabled,
             "tokenizer": default_tokenizer,
-            "openai_config": chat_model_provider,
+            "openai_config": chat_provider,
         }
         ChatModelOptions.objects.create(**chat_model_options)
 
     logger.info(f"🗣️ {provider_name} chat model configuration complete")
-    return True, chat_model_provider
+    return True, chat_provider
 
     admin_user = KhojUser.objects.filter(is_staff=True).first()
     if admin_user is None:
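The model auto-population above boils down to one API call. You can reproduce from the shell what the initialization code will discover on first run; a sketch assuming `jq` is installed and Ollama serves on its default port:

```bash
# Same call the openai client makes: GET {OPENAI_API_BASE}/models
curl -s http://localhost:11434/v1/models | jq -r '.data[].id'
```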