mirror of
https://github.com/khoj-ai/khoj.git
synced 2024-11-23 15:38:55 +01:00
Simplify integrating Ollama, OpenAI proxies with Khoj on first run
- Integrate with Ollama or other OpenAI compatible APIs by simply setting the `OPENAI_API_BASE` environment variable in docker-compose etc.
- Update docs on integrating with Ollama, OpenAI proxies on first run
- Auto populate all chat models supported by OpenAI compatible APIs
- Auto set vision enabled for all commercial models
- Minor
  - Add huggingface cache to khoj_models volume. This is where chat models and (now) sentence transformer models are stored by default
  - Reduce verbosity of yarn install of web app. Otherwise the build hits the docker log size limit and stops showing the remaining logs after the web app install
  - Suggest `ollama pull <model_name>` to start it in background
This commit is contained in:
parent
2366fa08b9
commit
69ef6829c1
6 changed files with 164 additions and 84 deletions
@@ -37,7 +37,7 @@ ENV PYTHONPATH=/app/src:$PYTHONPATH
 
 # Go to the directory src/interface/web and export the built Next.js assets
 WORKDIR /app/src/interface/web
-RUN bash -c "yarn install --frozen-lockfile --verbose && yarn ciexport && yarn cache clean"
+RUN bash -c "yarn install --frozen-lockfile && yarn ciexport && yarn cache clean"
 WORKDIR /app
 
 # Run the Application

@@ -37,6 +37,7 @@ services:
     volumes:
       - khoj_config:/root/.khoj/
       - khoj_models:/root/.cache/torch/sentence_transformers
+      - khoj_models:/root/.cache/huggingface
     # Use 0.0.0.0 to explicitly set the host ip for the service on the container. https://pythonspeed.com/articles/docker-connection-refused/
     environment:
       - POSTGRES_DB=postgres

@@ -48,12 +49,17 @@ services:
       - KHOJ_DEBUG=False
       - KHOJ_ADMIN_EMAIL=username@example.com
       - KHOJ_ADMIN_PASSWORD=password
-      # Uncomment lines below to use chat models by each provider.
+      # Uncomment line below to use with Ollama running on your local machine at localhost:11434.
+      # Change URL to use with other OpenAI API compatible providers like VLLM, LMStudio etc.
+      # - OPENAI_API_BASE=http://host.docker.internal:11434/v1/
+      #
+      # Uncomment appropriate lines below to use chat models by OpenAI, Anthropic, Google.
       # Ensure you set your provider specific API keys.
       # ---
       # - OPENAI_API_KEY=your_openai_api_key
       # - GEMINI_API_KEY=your_gemini_api_key
       # - ANTHROPIC_API_KEY=your_anthropic_api_key
+      #
       # Uncomment the necessary lines below to make your instance publicly accessible.
       # Replace the KHOJ_DOMAIN with either your domain or IP address (no http/https prefix).
       # Proceed with caution, especially if you are using anonymous mode.

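Review note on the compose change above: the commented `OPENAI_API_BASE` uses `host.docker.internal` rather than `localhost`, because inside a container `localhost` refers to the container itself and not to the machine running Ollama. The sketch below illustrates that rewrite. The helper name `to_docker_host_url` is hypothetical and not part of Khoj, and on Linux hosts the `host.docker.internal` alias may additionally need a host-gateway mapping in the compose file.

```python
from urllib.parse import urlsplit, urlunsplit

# Hostnames that point back at the container itself when used inside Docker.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1"}


def to_docker_host_url(url: str, docker_host: str = "host.docker.internal") -> str:
    """Hypothetical helper: swap a loopback hostname for the Docker host alias."""
    parts = urlsplit(url)
    if parts.hostname not in LOOPBACK_HOSTS:
        return url  # already points somewhere other than the local machine
    # Keep the original port (if any) so the Ollama default 11434 survives.
    netloc = docker_host if parts.port is None else f"{docker_host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# The URL you would use from a host shell, rewritten for use inside the container.
print(to_docker_host_url("http://localhost:11434/v1/"))
# prints http://host.docker.internal:11434/v1/
```

URLs that already point at a remote provider pass through unchanged, so the same helper could be applied to any value before it goes into the compose file.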
@@ -1,33 +0,0 @@
-# Ollama
-:::info
-This is only helpful for self-hosted users. If you're using [Khoj Cloud](https://app.khoj.dev), you're limited to our first-party models.
-:::
-
-:::info
-Khoj natively supports local LLMs [available on HuggingFace in GGUF format](https://huggingface.co/models?library=gguf). Using an OpenAI API proxy with Khoj maybe useful for ease of setup, trying new models or using commercial LLMs via API.
-:::
-
-Ollama allows you to run [many popular open-source LLMs](https://ollama.com/library) locally from your terminal.
-For folks comfortable with the terminal, Ollama's terminal based flows can ease setup and management of chat models.
-
-Ollama exposes a local [OpenAI API compatible server](https://github.com/ollama/ollama/blob/main/docs/openai.md#models). This makes it possible to use chat models from Ollama to create your personal AI agents with Khoj.
-
-## Setup
-
-1. Setup Ollama: https://ollama.com/
-2. Start your preferred model with Ollama. For example,
-   ```bash
-   ollama run llama3.1
-   ```
-3. Create a new [OpenAI Processor Conversation Config](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/add) on your Khoj admin panel
-   - Name: `ollama`
-   - Api Key: `any string`
-   - Api Base Url: `http://localhost:11434/v1/` (default for Ollama)
-4. Create a new [Chat Model Option](http://localhost:42110/server/admin/database/chatmodeloptions/add) on your Khoj admin panel.
-   - Name: `llama3.1` (replace with the name of your local model)
-   - Model Type: `Openai`
-   - Openai Config: `<the ollama config you created in step 3>`
-   - Max prompt size: `20000` (replace with the max prompt size of your model)
-5. Go to [your config](http://localhost:42110/settings) and select the model you just created in the chat model dropdown.
-
-That's it! You should now be able to chat with your Ollama model from Khoj. If you want to add additional models running on Ollama, repeat step 6 for each model.

78  documentation/docs/advanced/ollama.mdx  (Normal file)
@@ -0,0 +1,78 @@
+# Ollama
+
+```mdx-code-block
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+```
+
+:::info
+This is only helpful for self-hosted users. If you're using [Khoj Cloud](https://app.khoj.dev), you can use our first-party supported models.
+:::
+
+:::info
+Khoj can directly run local LLMs [available on HuggingFace in GGUF format](https://huggingface.co/models?library=gguf). The integration with Ollama is useful to run Khoj on Docker and have the chat models use your GPU or to try new models via CLI.
+:::
+
+Ollama allows you to run [many popular open-source LLMs](https://ollama.com/library) locally from your terminal.
+For folks comfortable with the terminal, Ollama's terminal based flows can ease setup and management of chat models.
+
+Ollama exposes a local [OpenAI API compatible server](https://github.com/ollama/ollama/blob/main/docs/openai.md#models). This makes it possible to use chat models from Ollama with Khoj.
+
+## Setup
+:::info
+Restart your Khoj server after first run or update to the settings below to ensure all settings are applied correctly.
+:::
+
+<Tabs groupId="type" queryString>
+<TabItem value="first-run" label="First Run">
+<Tabs groupId="server" queryString>
+<TabItem value="docker" label="Docker">
+1. Setup Ollama: https://ollama.com/
+2. Download your preferred chat model with Ollama. For example,
+   ```bash
+   ollama pull llama3.1
+   ```
+3. Uncomment `OPENAI_API_BASE` environment variable in your downloaded Khoj [docker-compose.yml](https://github.com/khoj-ai/khoj/blob/master/docker-compose.yml#:~:text=OPENAI_API_BASE)
+4. Start Khoj docker for the first time to automatically integrate and load models from the Ollama running on your host machine
+   ```bash
+   # run below command in the directory where you downloaded the Khoj docker-compose.yml
+   docker-compose up
+   ```
+</TabItem>
+
+<TabItem value="pip" label="Pip">
+1. Setup Ollama: https://ollama.com/
+2. Download your preferred chat model with Ollama. For example,
+   ```bash
+   ollama pull llama3.1
+   ```
+3. Set `OPENAI_API_BASE` environment variable to `http://localhost:11434/v1` in your shell before starting Khoj for the first time
+   ```bash
+   export OPENAI_API_BASE="http://localhost:11434/v1"
+   khoj --anonymous-mode
+   ```
+</TabItem>
+</Tabs>
+</TabItem>
+<TabItem value="update" label="Update">
+1. Setup Ollama: https://ollama.com/
+2. Download your preferred chat model with Ollama. For example,
+   ```bash
+   ollama pull llama3.1
+   ```
+3. Create a new [OpenAI Processor Conversation Config](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/add) on your Khoj admin panel
+   - Name: `ollama`
+   - Api Key: `any string`
+   - Api Base Url: `http://localhost:11434/v1/` (default for Ollama)
+4. Create a new [Chat Model Option](http://localhost:42110/server/admin/database/chatmodeloptions/add) on your Khoj admin panel.
+   - Name: `llama3.1` (replace with the name of your local model)
+   - Model Type: `Openai`
+   - Openai Config: `<the ollama config you created in step 3>`
+   - Max prompt size: `20000` (replace with the max prompt size of your model)
+5. Go to [your config](http://localhost:42110/settings) and select the model you just created in the chat model dropdown.
+
+If you want to add additional models running on Ollama, repeat step 4 for each model.
+</TabItem>
+</Tabs>
+
+That's it! You should now be able to chat with your Ollama model from Khoj.

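Review note on the Ollama docs above: before the first run, it can help to confirm that the OpenAI compatible endpoint actually lists the model you pulled. The following is a minimal sketch using only the standard library, assuming the server answers `GET <base>/models` in the OpenAI list format (`{"data": [{"id": ...}]}`). The helper names `models_url`, `model_ids` and `list_models` are hypothetical and not part of Khoj or Ollama.

```python
import json
from urllib.request import urlopen


def models_url(api_base: str) -> str:
    """Build the model listing URL, tolerating a trailing slash on the base."""
    return api_base.rstrip("/") + "/models"


def model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI style model list payload."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(api_base: str = "http://localhost:11434/v1/") -> list[str]:
    """Query a running server. Needs Ollama (or another proxy) to be reachable."""
    with urlopen(models_url(api_base), timeout=5) as response:
        return model_ids(json.load(response))


# Offline demonstration with a hand-written payload in the assumed format.
sample = {"object": "list", "data": [{"id": "llama3.1:latest", "object": "model"}]}
print(models_url("http://localhost:11434/v1/"))  # prints http://localhost:11434/v1/models
print(model_ids(sample))  # prints ['llama3.1:latest']
```

Calling `list_models()` with Ollama running should return the ids that Khoj can auto populate on first run. If it raises a connection error, the base URL or the Ollama server is the first thing to check.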
@@ -19,7 +19,11 @@ These are the general setup instructions for self-hosted Khoj.
 You can install the Khoj server using either [Docker](?server=docker) or [Pip](?server=pip).
 
 :::info[Offline Model + GPU]
-If you want to use the offline chat model and you have a GPU, you should use Installation Option 2 - local setup via the Python package directly. Our Docker image doesn't currently support running the offline chat model on GPU, making inference times really slow.
+To use the offline chat model with your GPU, we recommend using the Docker setup with Ollama. You can also use the local Khoj setup via the Python package directly.
+:::
+
+:::info[First Run]
+Restart your Khoj server after the first run to ensure all settings are applied correctly.
 :::
 
 <Tabs groupId="server" queryString>

@@ -28,27 +32,28 @@ If you want to use the offline chat model and you have a GPU, you should use Ins
 <TabItem value="macos" label="MacOS">
 <h3>Prerequisites</h3>
 <h4>Docker</h4>
-(Option 1) Click here to install [Docker Desktop](https://docs.docker.com/desktop/install/mac-install/). Make sure you also install the [Docker Compose](https://docs.docker.com/desktop/install/mac-install/) tool.
+- *Option 1*: Click here to install [Docker Desktop](https://docs.docker.com/desktop/install/mac-install/). Make sure you also install the [Docker Compose](https://docs.docker.com/desktop/install/mac-install/) tool.
 
-(Option 2) Use [Homebrew](https://brew.sh/) to install Docker and Docker Compose.
-```shell
-brew install --cask docker
-brew install docker-compose
-```
+- *Option 2*: Use [Homebrew](https://brew.sh/) to install Docker and Docker Compose.
+   ```shell
+   brew install --cask docker
+   brew install docker-compose
+   ```
 <h3>Setup</h3>
 1. Download the Khoj docker-compose.yml file [from Github](https://github.com/khoj-ai/khoj/blob/master/docker-compose.yml)
-```shell
-mkdir ~/.khoj && cd ~/.khoj
-wget https://raw.githubusercontent.com/khoj-ai/khoj/master/docker-compose.yml
-```
-2. Configure the environment variables in the docker-compose.yml
-- Set `KHOJ_ADMIN_PASSWORD`, `KHOJ_DJANGO_SECRET_KEY` (and optionally the `KHOJ_ADMIN_EMAIL`) to something secure. This allows you to customize Khoj later via the admin panel.
-- Set `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY` to your API key if you want to use OpenAI, Anthropic or Gemini chat models respectively.
+   ```shell
+   mkdir ~/.khoj && cd ~/.khoj
+   wget https://raw.githubusercontent.com/khoj-ai/khoj/master/docker-compose.yml
+   ```
+2. Configure the environment variables in the `docker-compose.yml`
+   - Set `KHOJ_ADMIN_PASSWORD`, `KHOJ_DJANGO_SECRET_KEY` (and optionally the `KHOJ_ADMIN_EMAIL`) to something secure. This allows you to customize Khoj later via the admin panel.
+   - Set `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY` to your API key if you want to use OpenAI, Anthropic or Gemini commercial chat models respectively.
+   - Uncomment `OPENAI_API_BASE` to use [Ollama](/advanced/ollama?type=first-run&server=docker#setup) running on your host machine. Or set it to the URL of your OpenAI compatible API like vLLM or [LMStudio](/advanced/lmstudio).
 3. Start Khoj by running the following command in the same directory as your docker-compose.yml file.
-```shell
-cd ~/.khoj
-docker-compose up
-```
+   ```shell
+   cd ~/.khoj
+   docker-compose up
+   ```
 </TabItem>
 <TabItem value="windows" label="Windows">
 <h3>Prerequisites</h3>

@@ -61,20 +66,21 @@ If you want to use the offline chat model and you have a GPU, you should use Ins
 
 <h3>Setup</h3>
 1. Download the Khoj docker-compose.yml file [from Github](https://github.com/khoj-ai/khoj/blob/master/docker-compose.yml)
-```shell
-# Windows users should use their WSL2 terminal to run these commands
-mkdir ~/.khoj && cd ~/.khoj
-wget https://raw.githubusercontent.com/khoj-ai/khoj/master/docker-compose.yml
-```
-2. Configure the environment variables in the docker-compose.yml
-- Set `KHOJ_ADMIN_PASSWORD`, `KHOJ_DJANGO_SECRET_KEY` (and optionally the `KHOJ_ADMIN_EMAIL`) to something secure. This allows you to customize Khoj later via the admin panel.
-- Set `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY` to your API key if you want to use OpenAI, Anthropic or Gemini chat models respectively.
+   ```shell
+   # Windows users should use their WSL2 terminal to run these commands
+   mkdir ~/.khoj && cd ~/.khoj
+   wget https://raw.githubusercontent.com/khoj-ai/khoj/master/docker-compose.yml
+   ```
+2. Configure the environment variables in the `docker-compose.yml`
+   - Set `KHOJ_ADMIN_PASSWORD`, `KHOJ_DJANGO_SECRET_KEY` (and optionally the `KHOJ_ADMIN_EMAIL`) to something secure. This allows you to customize Khoj later via the admin panel.
+   - Set `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY` to your API key if you want to use OpenAI, Anthropic or Gemini commercial chat models respectively.
+   - Uncomment `OPENAI_API_BASE` to use [Ollama](/advanced/ollama) running on your host machine. Or set it to the URL of your OpenAI compatible API like vLLM or [LMStudio](/advanced/lmstudio).
 3. Start Khoj by running the following command in the same directory as your docker-compose.yml file.
-```shell
-# Windows users should use their WSL2 terminal to run these commands
-cd ~/.khoj
-docker-compose up
-```
+   ```shell
+   # Windows users should use their WSL2 terminal to run these commands
+   cd ~/.khoj
+   docker-compose up
+   ```
 </TabItem>
 <TabItem value="linux" label="Linux">
 <h3>Prerequisites</h3>

@@ -83,18 +89,19 @@ If you want to use the offline chat model and you have a GPU, you should use Ins
 
 <h3>Setup</h3>
 1. Download the Khoj docker-compose.yml file [from Github](https://github.com/khoj-ai/khoj/blob/master/docker-compose.yml)
-```shell
-mkdir ~/.khoj && cd ~/.khoj
-wget https://raw.githubusercontent.com/khoj-ai/khoj/master/docker-compose.yml
-```
-2. Configure the environment variables in the docker-compose.yml
-- Set `KHOJ_ADMIN_PASSWORD`, `KHOJ_DJANGO_SECRET_KEY` (and optionally the `KHOJ_ADMIN_EMAIL`) to something secure. This allows you to customize Khoj later via the admin panel.
-- Set `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY` to your API key if you want to use OpenAI, Anthropic or Gemini chat models respectively.
+   ```shell
+   mkdir ~/.khoj && cd ~/.khoj
+   wget https://raw.githubusercontent.com/khoj-ai/khoj/master/docker-compose.yml
+   ```
+2. Configure the environment variables in the `docker-compose.yml`
+   - Set `KHOJ_ADMIN_PASSWORD`, `KHOJ_DJANGO_SECRET_KEY` (and optionally the `KHOJ_ADMIN_EMAIL`) to something secure. This allows you to customize Khoj later via the admin panel.
+   - Set `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY` to your API key if you want to use OpenAI, Anthropic or Gemini commercial chat models respectively.
+   - Uncomment `OPENAI_API_BASE` to use [Ollama](/advanced/ollama) running on your host machine. Or set it to the URL of your OpenAI compatible API like vLLM or [LMStudio](/advanced/lmstudio).
 3. Start Khoj by running the following command in the same directory as your docker-compose.yml file.
-```shell
-cd ~/.khoj
-docker-compose up
-```
+   ```shell
+   cd ~/.khoj
+   docker-compose up
+   ```
 </TabItem>
 </Tabs>
 

@@ -2,12 +2,13 @@ import logging
 import os
 from typing import Tuple
 
+import openai
+
 from khoj.database.adapters import ConversationAdapters
 from khoj.database.models import (
     ChatModelOptions,
     KhojUser,
     OpenAIProcessorConversationConfig,
-    ServerChatSettings,
     SpeechToTextModelOptions,
     TextToImageModelConfig,
 )

@@ -42,14 +43,32 @@ def initialization(interactive: bool = True):
             "🗣️ Configure chat models available to your server. You can always update these at /server/admin using your admin account"
         )
 
+        openai_api_base = os.getenv("OPENAI_API_BASE")
+        provider = "Ollama" if openai_api_base and openai_api_base.endswith(":11434/v1/") else "OpenAI"
+        openai_api_key = os.getenv("OPENAI_API_KEY", "placeholder" if openai_api_base else None)
+        default_chat_models = default_openai_chat_models
+        if openai_api_base:
+            # Get available chat models from OpenAI compatible API
+            try:
+                openai_client = openai.OpenAI(api_key=openai_api_key, base_url=openai_api_base)
+                default_chat_models = [model.id for model in openai_client.models.list()]
+                # Put the available default OpenAI models at the top
+                valid_default_models = [model for model in default_openai_chat_models if model in default_chat_models]
+                other_available_models = [model for model in default_chat_models if model not in valid_default_models]
+                default_chat_models = valid_default_models + other_available_models
+            except Exception as e:
+                logger.warning(f"⚠️ Failed to fetch {provider} chat models. Fallback to default models. Error: {e}")
+
         # Set up OpenAI's online chat models
         openai_configured, openai_provider = _setup_chat_model_provider(
             ChatModelOptions.ModelType.OPENAI,
-            default_openai_chat_models,
-            default_api_key=os.getenv("OPENAI_API_KEY"),
+            default_chat_models,
+            default_api_key=openai_api_key,
+            api_base_url=openai_api_base,
             vision_enabled=True,
             is_offline=False,
             interactive=interactive,
+            provider_name=provider,
         )
 
         # Setup OpenAI speech to text model

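Review note on the hunk above: the model list fetched from the OpenAI compatible API is reordered so that any of the built-in default models that the server also offers come first, followed by everything else in the server's own order. The sketch below isolates that ordering as a pure function so it can be checked without a running server. The function name `order_chat_models` and the model names are illustrative assumptions, not identifiers from the commit.

```python
def order_chat_models(available: list[str], preferred: list[str]) -> list[str]:
    """Put preferred models that are actually available first, keep the rest in server order."""
    valid_preferred = [model for model in preferred if model in available]
    others = [model for model in available if model not in valid_preferred]
    return valid_preferred + others


# Illustrative inputs: two preferred defaults, only one of which the server offers.
preferred = ["gpt-4o-mini", "gpt-4o"]
available = ["llama3.1:latest", "gpt-4o", "qwen2.5:7b"]

print(order_chat_models(available, preferred))
# prints ['gpt-4o', 'llama3.1:latest', 'qwen2.5:7b']
```

With a plain Ollama server none of the preferred names are likely to match, in which case the result is simply the server's list unchanged.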
@@ -154,6 +173,7 @@ def initialization(interactive: bool = True):
         default_chat_models: list,
         default_api_key: str,
         interactive: bool,
+        api_base_url: str = None,
         vision_enabled: bool = False,
         is_offline: bool = False,
         provider_name: str = None,

@@ -172,14 +192,16 @@ def initialization(interactive: bool = True):
 
         logger.info(f"💬 Setting up your {provider_name} chat configuration")
 
-        chat_model_provider = None
+        chat_provider = None
         if not is_offline:
             if interactive:
                 user_api_key = input(f"Enter your {provider_name} API key (default: {default_api_key}): ")
                 api_key = user_api_key if user_api_key != "" else default_api_key
             else:
                 api_key = default_api_key
-            chat_model_provider = OpenAIProcessorConversationConfig.objects.create(api_key=api_key, name=provider_name)
+            chat_provider = OpenAIProcessorConversationConfig.objects.create(
+                api_key=api_key, name=provider_name, api_base_url=api_base_url
+            )
 
         if interactive:
             chat_model_names = input(

@@ -201,13 +223,13 @@ def initialization(interactive: bool = True):
                 "max_prompt_size": default_max_tokens,
                 "vision_enabled": vision_enabled,
                 "tokenizer": default_tokenizer,
-                "openai_config": chat_model_provider,
+                "openai_config": chat_provider,
             }
 
             ChatModelOptions.objects.create(**chat_model_options)
 
         logger.info(f"🗣️ {provider_name} chat model configuration complete")
-        return True, chat_model_provider
+        return True, chat_provider
 
     admin_user = KhojUser.objects.filter(is_staff=True).first()
     if admin_user is None:

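Review note on the provider name and API key handling in this file: the provider label is derived from a suffix match on `OPENAI_API_BASE`, and a placeholder key is used when a base URL is set without `OPENAI_API_KEY`. The sketch below restates those two rules as a standalone function to make the edge cases visible. `resolve_provider` is a hypothetical name, and the sketch takes the values as arguments instead of reading the environment.

```python
from typing import Optional, Tuple


def resolve_provider(api_base: Optional[str], api_key: Optional[str]) -> Tuple[str, Optional[str]]:
    """Mirror the diff's rules: label by URL suffix, default the key when a base URL is set."""
    provider = "Ollama" if api_base and api_base.endswith(":11434/v1/") else "OpenAI"
    if api_key is None and api_base:
        api_key = "placeholder"  # local servers such as Ollama do not check the key
    return provider, api_key


# Compose default: trailing slash present, so the label is "Ollama".
print(resolve_provider("http://host.docker.internal:11434/v1/", None))
# prints ('Ollama', 'placeholder')

# Same server without the trailing slash: the suffix match fails, label falls back to "OpenAI".
print(resolve_provider("http://localhost:11434/v1", None))
# prints ('OpenAI', 'placeholder')

# No base URL configured: nothing is defaulted.
print(resolve_provider(None, None))
# prints ('OpenAI', None)
```

As far as this diff shows, the label only feeds the config name and log messages, so the second case should still work against Ollama. It just ends up named "OpenAI" in the admin panel.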