Rebase with master

sabaimran 2024-04-02 16:16:06 +05:30
commit 47fc7e1ce6
61 changed files with 832 additions and 481 deletions

View file

@ -21,16 +21,18 @@ on:
 jobs:
   publish:
     name: Publish Python Package to PyPI
-    runs-on: ubuntu-20.04
+    runs-on: ubuntu-latest
+    permissions:
+      id-token: write
     steps:
       - uses: actions/checkout@v3
         with:
           fetch-depth: 0
-      - name: Set up Python 3.10
+      - name: Set up Python 3.11
         uses: actions/setup-python@v4
         with:
-          python-version: '3.10'
+          python-version: '3.11'
       - name: ⬇️ Install Application
         run: python -m pip install --upgrade pip && pip install --upgrade .

@ -59,6 +61,4 @@ jobs:
       - name: 📦 Publish Python Package to PyPI
         if: startsWith(github.ref, 'refs/tags') || github.ref == 'refs/heads/master'
-        uses: pypa/gh-action-pypi-publish@v1.6.4
-        with:
-          password: ${{ secrets.PYPI_API_KEY }}
+        uses: pypa/gh-action-pypi-publish@v1.8.14

Binary file not shown (new image, 14 MiB).

Binary file not shown (new image, 3 MiB).

Binary file not shown (new image, 3 MiB).

Binary file not shown (new image, 32 MiB).

View file

@ -0,0 +1,15 @@
---
sidebar_position: 4
---
# Agents
You can use agents to set up custom system prompts with Khoj. The server host can set up their own agents, which are accessible to all users. You can see ours at https://app.khoj.dev/agents.
![Demo](/img/agents_demo.gif)
## Creating an Agent (Self-Hosted)
Go to `server/admin/database/agent` on your server and click `Add Agent` to create a new one. You have to set it to `public` in order for it to be accessible to all the users on your server. To limit access to a specific user, do not set the `public` flag and add the user in the `Creator` field.
Set your custom prompt in the `personality` field.
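
If you prefer scripting over the admin UI, a rough sketch of the equivalent Django ORM call is below. The field names come from this commit's `Agent` model; the agent name, personality text, and chat model lookup are illustrative assumptions, and the snippet assumes a Django shell on the Khoj server.

```python
# Hypothetical sketch: create a public agent from a Django shell on the server.
from khoj.database.models import Agent, ChatModelOptions

chat_model = ChatModelOptions.objects.first()  # assumed: reuse an existing chat model config
agent = Agent.objects.create(
    name="Research Assistant",  # illustrative name
    personality="You are a meticulous research assistant.",  # the custom system prompt
    public=True,  # accessible to all users on this server
    chat_model=chat_model,
    tools=["*"],  # all tools, mirroring the default agent
)
# A pre_save signal on Agent derives a unique slug from the name, so no slug is passed.
```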

View file

@ -2,7 +2,7 @@
 sidebar_position: 1
 ---
-# Features
+# Overview

 Khoj supports a variety of features, including search and chat with a wide range of data sources and interfaces.

View file

@ -14,16 +14,16 @@ You can configure Khoj to chat with you about anything. When relevant, it'll use
 ### Setup (Self-Hosting)
 #### Offline Chat
-Offline chat stays completely private and works without internet using open-source models.
+Offline chat stays completely private and can work without internet using open-source models.

 > **System Requirements**:
 > - Minimum 8 GB RAM. Recommend **16Gb VRAM**
 > - Minimum **5 GB of Disk** available
 > - A CPU supporting [AVX or AVX2 instructions](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) is required
-> - A Mac M1+ or [Vulcan supported GPU](https://vulkan.gpuinfo.org/) should significantly speed up chat response times
+> - An Nvidia, AMD GPU or a Mac M1+ machine would significantly speed up chat response times

 1. Open your [Khoj offline settings](http://localhost:42110/server/admin/database/offlinechatprocessorconversationconfig/) and click *Enable* on the Offline Chat configuration.
-2. Open your [Chat model options](http://localhost:42110/server/admin/database/chatmodeloptions/) and add a new option for the offline chat model you want to use. Make sure to use `Offline` as its type. We currently only support offline models that use the [Llama chat prompt](https://replicate.com/blog/how-to-prompt-llama#wrap-user-input-with-inst-inst-tags) format. We recommend using `mistral-7b-instruct-v0.1.Q4_0.gguf`.
+2. Open your [Chat model options settings](http://localhost:42110/server/admin/database/chatmodeloptions/) and add any [GGUF chat model](https://huggingface.co/models?library=gguf) to use for offline chat. Make sure to use `Offline` as its type. For a balanced chat model that runs well on standard consumer hardware we recommend using [Hermes-2-Pro-Mistral-7B by NousResearch](https://huggingface.co/NousResearch/Hermes-2-Pro-Mistral-7B-GGUF) by default.
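
Under the hood, this release swaps gpt4all for llama-cpp-python, which can pull any GGUF chat model straight from Hugging Face. A minimal sketch of what that looks like, assuming the recommended model and a guessed quantization filename:

```python
# Minimal sketch: download and load a GGUF chat model with llama-cpp-python.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="NousResearch/Hermes-2-Pro-Mistral-7B-GGUF",
    filename="*Q4_K_M.gguf",  # assumed quantization; pick one that fits your RAM
    n_ctx=4096,
)
response = llm.create_chat_completion([{"role": "user", "content": "Hello!"}])
print(response["choices"][0]["message"]["content"])
```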
:::tip[Note]

View file

@ -0,0 +1,15 @@
# Image Generation
You can use Khoj to generate images from text prompts. This blog post digs deeper into the details of our image generation flow: https://blog.khoj.dev/posts/how-khoj-generates-images/.
To generate images, just include the image request in the instructions of your prompt to Khoj. Khoj will automatically detect the image generation intent, augment your prompt, and then create the image. Here are some examples:
| Prompt | Image |
| --- | --- |
| Paint a picture of the plants I got last month, pixar-animation | ![plants](/img/plants_i_got.png) |
| Create a picture of my dream house, based on my interests | ![house](/img/dream_house.png) |
## Setup (Self-Hosting)
Right now, we only support integration with OpenAI's DALL-E. You need to have an OpenAI API key to use this feature. Here's how you can set it up:
1. Setup your OpenAI API key. See instructions [here](/get-started/setup#2-configure)
2. Create a text to image config at http://localhost:42110/server/admin/database/texttoimagemodelconfig/. We recommend the value `dall-e-3`.
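
For context, the direct OpenAI call behind this feature looks roughly like the sketch below. Khoj makes this call for you once the config above is saved; the prompt and size are illustrative, and the snippet assumes `OPENAI_API_KEY` is set in your environment.

```python
# Illustrative only: a dall-e-3 image request with the official openai client.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.images.generate(
    model="dall-e-3",
    prompt="My dream house, a cabin by a mountain lake, pixar-animation style",
    n=1,
    size="1024x1024",
)
print(response.data[0].url)  # URL of the generated image
```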

View file

@ -0,0 +1,14 @@
# Voice
You can talk to Khoj using your voice. Khoj will respond to your queries using the same models as the chat feature. You can use voice chat on the web, Desktop, and Obsidian apps. Click on the little mic icon to send your voice message to Khoj. It will send back what it heard via text. You'll have some time to edit it before sending it, if required. Try it at https://app.khoj.dev/.
:::info[Voice Response]
Khoj doesn't yet respond with voice, but it will send back a text response. Let us know if you're interested in voice responses at team at khoj.dev.
:::
## Setup (Self-Hosting)
Voice chat will automatically be configured when you initialize the application. The default configuration will run locally. If you want to use the OpenAI whisper API for voice chat, you can set it up by following these steps:
1. Setup your OpenAI API key. See instructions [here](/get-started/setup#2-configure).
2. Create a new configuration at http://localhost:42110/server/admin/database/speechtotextmodeloptions/. We recommend the value `whisper-1` and model type `Openai`.
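
For reference, the server-side transcription call this enables is roughly the following; the audio filename is an assumption, and Khoj handles this for you once the config above is saved.

```python
# Illustrative only: transcribing a voice message with OpenAI's whisper-1 model.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
with open("voice_message.webm", "rb") as audio_file:  # assumed file from the web mic
    transcript = client.audio.transcriptions.create(model="whisper-1", file=audio_file)
print(transcript.text)
```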

View file

@ -37,9 +37,7 @@ Welcome to the Khoj Docs! This is the best place to get setup and explore Khoj's
 - [Read these instructions](/get-started/setup) to self-host a private instance of Khoj

 ## At a Glance
-<img src="https://docs.khoj.dev/img/khoj_search_on_web.png" width="400px" />
-<span>&nbsp;&nbsp;</span>
-<img src="https://docs.khoj.dev/img/khoj_chat_on_web.png" width="400px" />
+![demo_chat](/img/using_khoj_for_studying.gif)

 #### [Search](/features/search)
 - **Natural**: Use natural language queries to quickly find relevant notes and documents.

View file

@ -25,6 +25,10 @@ These are the general setup instructions for self-hosted Khoj.
 For Installation, you can either use Docker or install the Khoj server locally.

+:::info[Offline Model + GPU]
+If you want to use the offline chat model and you have a GPU, you should use Installation Option 2 - local setup via the Python package directly. Our Docker image doesn't currently support running the offline chat model on GPU, making inference times really slow.
+:::

 ### Installation Option 1 (Docker)

 #### Prerequisites

@ -97,6 +101,7 @@ sudo -u postgres createdb khoj --password

 ##### Local Server Setup

 - *Make sure [python](https://realpython.com/installing-python/) and [pip](https://pip.pypa.io/en/stable/installation/) are installed on your machine*
+- Check [llama-cpp-python setup](https://python.langchain.com/docs/integrations/llms/llamacpp#installation) if you hit any llama-cpp issues with the installation

 Run the following command in your terminal to install the Khoj backend.
@ -104,17 +109,36 @@ Run the following command in your terminal to install the Khoj backend.
 <Tabs groupId="operating-systems">
   <TabItem value="macos" label="MacOS">
     ```shell
+    # ARM/M1+ Machines
+    CMAKE_ARGS="-DLLAMA_METAL=on" python -m pip install khoj-assistant
+
+    # Intel Machines
     python -m pip install khoj-assistant
     ```
   </TabItem>
   <TabItem value="win" label="Windows">
     ```shell
+    # 1. (Optional) To use NVIDIA (CUDA) GPU
+    $env:CMAKE_ARGS = "-DLLAMA_CUBLAS=on"
+    # 1. (Optional) To use AMD (ROCm) GPU
+    CMAKE_ARGS="-DLLAMA_HIPBLAS=on"
+    # 1. (Optional) To use VULCAN GPU
+    CMAKE_ARGS="-DLLAMA_VULKAN=on"
+    # 2. Install Khoj
     py -m pip install khoj-assistant
     ```
   </TabItem>
   <TabItem value="unix" label="Linux">
     ```shell
+    # CPU
     python -m pip install khoj-assistant
+    # NVIDIA (CUDA) GPU
+    CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 python -m pip install khoj-assistant
+    # AMD (ROCm) GPU
+    CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 python -m pip install khoj-assistant
+    # VULCAN GPU
+    CMAKE_ARGS="-DLLAMA_VULKAN=on" FORCE_CMAKE=1 python -m pip install khoj-assistant
     ```
   </TabItem>
 </Tabs>
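
After installing with one of the GPU flag combinations above, you can sanity-check whether the wheel was actually built with GPU offload support. A sketch, assuming the `llama_supports_gpu_offload` helper from llama-cpp-python's low-level bindings:

```python
# Sketch: verify the llama-cpp-python build after installation.
import llama_cpp

print("llama-cpp-python version:", llama_cpp.__version__)
# Assumed helper from the low-level bindings; True if GPU offload was compiled in.
print("GPU offload available:", llama_cpp.llama_supports_gpu_offload())
```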
@ -163,7 +187,31 @@ Khoj should now be running at http://localhost:42110. You can see the web UI in
 Note: To start Khoj automatically in the background use [Task scheduler](https://www.windowscentral.com/how-create-automated-task-using-task-scheduler-windows-10) on Windows or [Cron](https://en.wikipedia.org/wiki/Cron) on Mac, Linux (e.g with `@reboot khoj`)

-### 2. Download the desktop client
+### Setup Notes
+
+Optionally, you can use Khoj with a custom domain as well. To do so, you need to set the `KHOJ_DOMAIN` environment variable to your domain (e.g., `export KHOJ_DOMAIN=my-khoj-domain.com` or add it to your `docker-compose.yml`). By default, the Khoj server you set up will not be accessible outside of `localhost` or `127.0.0.1`.
+
+:::warning[Must use an SSL certificate]
+If you're using a custom domain, you must use an SSL certificate. You can use [Let's Encrypt](https://letsencrypt.org/) to get a free SSL certificate for your domain.
+:::
+
+### 2. Configure
+1. Go to http://localhost:42110/server/admin and login with your admin credentials.
+1. Go to [OpenAI settings](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/) in the server admin settings to add an OpenAI processor conversation config. This is where you set your API key. Alternatively, you can go to the [offline chat settings](http://localhost:42110/server/admin/database/offlinechatprocessorconversationconfig/) and simply create a new setting with `Enabled` set to `True`.
+2. Go to the ChatModelOptions if you want to add additional models for chat.
+    - Set the `chat-model` field to a supported chat model[^1] of your choice. For example, you can specify `gpt-4-turbo-preview` if you're using OpenAI or `NousResearch/Hermes-2-Pro-Mistral-7B-GGUF` if you're using offline chat.
+    - Make sure to set the `model-type` field to `OpenAI` or `Offline` respectively.
+    - The `tokenizer` and `max-prompt-size` fields are optional. Set them only when using a non-standard model (i.e not mistral, gpt or llama2 model).
+1. Select files and folders to index [using the desktop client](/get-started/setup#2-download-the-desktop-client). When you click 'Save', the files will be sent to your server for indexing.
+    - Select Notion workspaces and Github repositories to index using the web interface.
+
+[^1]: Khoj, by default, can use [OpenAI GPT3.5+ chat models](https://platform.openai.com/docs/models/overview) or [GGUF chat models](https://huggingface.co/models?library=gguf). See [this section](/miscellaneous/advanced#use-openai-compatible-llm-api-server-self-hosting) to use non-standard chat models
+
+:::tip[Note]
+Using Safari on Mac? You might not be able to login to the admin panel. Try using Chrome or Firefox instead.
+:::
+
+### 3. Download the desktop client (Optional)

 You can use our desktop executables to select file paths and folders to index. You can simply select the folders or files, and they'll be automatically uploaded to the server. Once you specify a file or file path, you don't need to update the configuration again; it will grab any data diffs dynamically over time.

@ -171,22 +219,6 @@ You can use our desktop executables to select file paths and folders to index. Y

 To use the desktop client, you need to go to your Khoj server's settings page (http://localhost:42110/config) and copy the API key. Then, paste it into the desktop client's settings page. Once you've done that, you can select files and folders to index. Set the desktop client settings to use `http://127.0.0.1:42110` as the host URL.

-### 3. Configure
-1. Go to http://localhost:42110/server/admin and login with your admin credentials.
-1. Go to [OpenAI settings](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/) in the server admin settings to add an OpenAI processor conversation config. This is where you set your API key. Alternatively, you can go to the [offline chat settings](http://localhost:42110/server/admin/database/offlinechatprocessorconversationconfig/) and simply create a new setting with `Enabled` set to `True`.
-2. Go to the ChatModelOptions if you want to add additional models for chat.
-    - Set the `chat-model` field to a supported chat model[^1] of your choice. For example, you can specify `gpt-4-turbo-preview` if you're using OpenAI or `mistral-7b-instruct-v0.1.Q4_0.gguf` if you're using offline chat.
-    - Make sure to set the `model-type` field to `OpenAI` or `Offline` respectively.
-    - The `tokenizer` and `max-prompt-size` fields are optional. Set them only when using a non-standard model (i.e not mistral, gpt or llama2 model).
-1. Select files and folders to index [using the desktop client](/get-started/setup#2-download-the-desktop-client). When you click 'Save', the files will be sent to your server for indexing.
-    - Select Notion workspaces and Github repositories to index using the web interface.
-
-[^1]: Khoj, by default, can use [OpenAI GPT3.5+ chat models](https://platform.openai.com/docs/models/overview) or [GPT4All chat models that follow Llama2 Prompt Template](https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models2.json). See [this section](/miscellaneous/advanced#use-openai-compatible-llm-api-server-self-hosting) to use non-standard chat models
-
-:::tip[Note]
-Using Safari on Mac? You might not be able to login to the admin panel. Try using Chrome or Firefox instead.
-:::

 ### 4. Install Client Plugins (Optional)

 Khoj exposes a web interface to search, chat and configure by default.<br />

View file

@ -10,4 +10,4 @@ Many Open Source projects are used to power Khoj. Here's a few of them:
 - Charles Cave for [OrgNode Parser](http://members.optusnet.com.au/~charles57/GTD/orgnode.html)
 - [Org.js](https://mooz.github.io/org-js/) to render Org-mode results on the Web interface
 - [Markdown-it](https://github.com/markdown-it/markdown-it) to render Markdown results on the Web interface
-- [GPT4All](https://github.com/nomic-ai/gpt4all) to chat with local LLM
+- [Llama.cpp](https://github.com/ggerganov/llama.cpp) to chat with local LLM

View file

@ -1,10 +1,10 @@
 import multiprocessing

 bind = "0.0.0.0:42110"
-workers = 8
+workers = 1
 worker_class = "uvicorn.workers.UvicornWorker"
 timeout = 120
 keep_alive = 60
-accesslog = "access.log"
-errorlog = "error.log"
+accesslog = "-"
+errorlog = "-"
 loglevel = "debug"
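
Read as a whole file, the updated gunicorn config is below; gunicorn configs are plain Python. The rationale comments are my reading of the change, not text from the commit: a single worker avoids loading the large offline chat model once per process, and `-` streams logs to stdout/stderr, which suits containers.

```python
# Sketch of the resulting gunicorn-config.py, with assumed rationale in comments.
import multiprocessing

bind = "0.0.0.0:42110"
workers = 1  # one worker so the offline chat model is loaded only once
worker_class = "uvicorn.workers.UvicornWorker"
timeout = 120
keep_alive = 60
accesslog = "-"  # "-" means stdout (container-friendly)
errorlog = "-"   # "-" means stderr
loglevel = "debug"
```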

View file

@ -1,7 +1,7 @@
 {
     "id": "khoj",
     "name": "Khoj",
-    "version": "1.7.0",
+    "version": "1.8.0",
     "minAppVersion": "0.15.0",
     "description": "An AI copilot for your Second Brain",
     "author": "Khoj Inc.",

View file

@ -1,12 +1,9 @@
-# Use Nvidia's latest Ubuntu 22.04 image as the base image
-FROM nvidia/cuda:12.2.0-devel-ubuntu22.04
+FROM ubuntu:jammy

 LABEL org.opencontainers.image.source https://github.com/khoj-ai/khoj

 # Install System Dependencies
 RUN apt update -y && apt -y install python3-pip libsqlite3-0 ffmpeg libsm6 libxext6

-# Install Optional Dependencies
-RUN apt install vim -y

 WORKDIR /app

View file

@ -7,7 +7,7 @@ name = "khoj-assistant"
 description = "An AI copilot for your Second Brain"
 readme = "README.md"
 license = "AGPL-3.0-or-later"
-requires-python = ">=3.8"
+requires-python = ">=3.9"
 authors = [
     { name = "Debanjum Singh Solanky, Saba Imran" },
 ]

@ -23,8 +23,8 @@ keywords = [
     "pdf",
 ]
 classifiers = [
-    "Development Status :: 4 - Beta",
-    "License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
+    "Development Status :: 5 - Production/Stable",
+    "License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)",
     "Operating System :: OS Independent",
     "Programming Language :: Python :: 3",
     "Programming Language :: Python :: 3.9",

@ -33,7 +33,7 @@ classifiers = [
     "Topic :: Internet :: WWW/HTTP :: Indexing/Search",
     "Topic :: Scientific/Engineering :: Artificial Intelligence",
     "Topic :: Scientific/Engineering :: Human Machine Interfaces",
-    "Topic :: Text Processing :: Linguistic",
+    "Intended Audience :: Information Technology",
 ]
 dependencies = [
     "beautifulsoup4 ~= 4.12.3",

@ -62,8 +62,7 @@ dependencies = [
     "pymupdf >= 1.23.5",
     "django == 4.2.10",
     "authlib == 1.2.1",
-    "gpt4all == 2.1.0; platform_system == 'Linux' and platform_machine == 'x86_64'",
-    "gpt4all == 2.1.0; platform_system == 'Windows' or platform_system == 'Darwin'",
+    "llama-cpp-python == 0.2.56",
     "itsdangerous == 2.1.2",
     "httpx == 0.25.0",
     "pgvector == 0.2.4",

View file

@ -87,7 +87,7 @@
 function generateOnlineReference(reference, index) {
     // Generate HTML for Chat Reference
-    let title = reference.title;
+    let title = reference.title || reference.link;
     let link = reference.link;
     let snippet = reference.snippet;
     let question = reference.question;

@ -191,6 +191,15 @@
             referenceSection.appendChild(polishedReference);
         }
     }
+    if (onlineReference.webpages && onlineReference.webpages.length > 0) {
+        numOnlineReferences += onlineReference.webpages.length;
+        for (let index in onlineReference.webpages) {
+            let reference = onlineReference.webpages[index];
+            let polishedReference = generateOnlineReference(reference, index);
+            referenceSection.appendChild(polishedReference);
+        }
+    }
 }
 return numOnlineReferences;

View file

@ -1,6 +1,6 @@
 {
     "name": "Khoj",
-    "version": "1.7.0",
+    "version": "1.8.0",
     "description": "An AI copilot for your Second Brain",
     "author": "Saba Imran, Debanjum Singh Solanky <team@khoj.dev>",
     "license": "GPL-3.0-or-later",

View file

@ -6,7 +6,7 @@
 ;; Saba Imran <saba@khoj.dev>
 ;; Description: An AI copilot for your Second Brain
 ;; Keywords: search, chat, org-mode, outlines, markdown, pdf, image
-;; Version: 1.7.0
+;; Version: 1.8.0
 ;; Package-Requires: ((emacs "27.1") (transient "0.3.0") (dash "2.19.1"))
 ;; URL: https://github.com/khoj-ai/khoj/tree/master/src/interface/emacs

View file

@ -1,7 +1,7 @@
 {
     "id": "khoj",
     "name": "Khoj",
-    "version": "1.7.0",
+    "version": "1.8.0",
     "minAppVersion": "0.15.0",
     "description": "An AI copilot for your Second Brain",
     "author": "Khoj Inc.",

View file

@ -1,6 +1,6 @@
 {
     "name": "Khoj",
-    "version": "1.7.0",
+    "version": "1.8.0",
     "description": "An AI copilot for your Second Brain",
     "author": "Debanjum Singh Solanky, Saba Imran <team@khoj.dev>",
     "license": "GPL-3.0-or-later",

View file

@ -39,5 +39,6 @@
"1.6.0": "0.15.0", "1.6.0": "0.15.0",
"1.6.1": "0.15.0", "1.6.1": "0.15.0",
"1.6.2": "0.15.0", "1.6.2": "0.15.0",
"1.7.0": "0.15.0" "1.7.0": "0.15.0",
"1.8.0": "0.15.0"
} }

View file

@ -43,7 +43,7 @@ from khoj.search_filter.date_filter import DateFilter
 from khoj.search_filter.file_filter import FileFilter
 from khoj.search_filter.word_filter import WordFilter
 from khoj.utils import state
-from khoj.utils.config import GPT4AllProcessorModel
+from khoj.utils.config import OfflineChatProcessorModel
 from khoj.utils.helpers import generate_random_name, is_none_or_empty

@ -399,32 +399,26 @@ class AgentAdapters:
     DEFAULT_AGENT_SLUG = "khoj"

     @staticmethod
-    async def aget_agent_by_id(agent_id: int):
-        return await Agent.objects.filter(id=agent_id).afirst()
-
-    @staticmethod
-    async def aget_agent_by_slug(agent_slug: str):
-        return await Agent.objects.filter(slug__iexact=agent_slug.lower()).afirst()
+    async def aget_agent_by_slug(agent_slug: str, user: KhojUser):
+        return await Agent.objects.filter(
+            (Q(slug__iexact=agent_slug.lower())) & (Q(public=True) | Q(creator=user))
+        ).afirst()

     @staticmethod
     def get_agent_by_slug(slug: str, user: KhojUser = None):
-        agent = Agent.objects.filter(slug=slug).first()
-        # Check if agent is public or created by the user
-        if agent and (agent.public or agent.creator == user):
-            return agent
-        return None
+        if user:
+            return Agent.objects.filter((Q(slug__iexact=slug.lower())) & (Q(public=True) | Q(creator=user))).first()
+        return Agent.objects.filter(slug__iexact=slug.lower(), public=True).first()

     @staticmethod
     def get_all_accessible_agents(user: KhojUser = None):
-        return Agent.objects.filter(Q(public=True) | Q(creator=user)).distinct().order_by("created_at")
+        if user:
+            return Agent.objects.filter(Q(public=True) | Q(creator=user)).distinct().order_by("created_at")
+        return Agent.objects.filter(public=True).order_by("created_at")

     @staticmethod
     async def aget_all_accessible_agents(user: KhojUser = None) -> List[Agent]:
-        get_all_accessible_agents = sync_to_async(
-            lambda: Agent.objects.filter(Q(public=True) | Q(creator=user)).distinct().order_by("created_at").all(),
-            thread_sensitive=True,
-        )
-        agents = await get_all_accessible_agents()
+        agents = await sync_to_async(AgentAdapters.get_all_accessible_agents)(user)
         return await sync_to_async(list)(agents)

     @staticmethod

@ -444,26 +438,29 @@ class AgentAdapters:
         default_conversation_config = ConversationAdapters.get_default_conversation_config()
         default_personality = prompts.personality.format(current_date="placeholder")

-        if Agent.objects.filter(name=AgentAdapters.DEFAULT_AGENT_NAME).exists():
-            agent = Agent.objects.filter(name=AgentAdapters.DEFAULT_AGENT_NAME).first()
-            agent.tuning = default_personality
+        agent = Agent.objects.filter(name=AgentAdapters.DEFAULT_AGENT_NAME).first()
+
+        if agent:
+            agent.personality = default_personality
             agent.chat_model = default_conversation_config
             agent.slug = AgentAdapters.DEFAULT_AGENT_SLUG
             agent.name = AgentAdapters.DEFAULT_AGENT_NAME
             agent.save()
-            return agent
-
-        # The default agent is public and managed by the admin. It's handled a little differently than other agents.
-        return Agent.objects.create(
-            name=AgentAdapters.DEFAULT_AGENT_NAME,
-            public=True,
-            managed_by_admin=True,
-            chat_model=default_conversation_config,
-            tuning=default_personality,
-            tools=["*"],
-            avatar=AgentAdapters.DEFAULT_AGENT_AVATAR,
-            slug=AgentAdapters.DEFAULT_AGENT_SLUG,
-        )
+        else:
+            # The default agent is public and managed by the admin. It's handled a little differently than other agents.
+            agent = Agent.objects.create(
+                name=AgentAdapters.DEFAULT_AGENT_NAME,
+                public=True,
+                managed_by_admin=True,
+                chat_model=default_conversation_config,
+                personality=default_personality,
+                tools=["*"],
+                avatar=AgentAdapters.DEFAULT_AGENT_AVATAR,
+                slug=AgentAdapters.DEFAULT_AGENT_SLUG,
+            )
+            Conversation.objects.filter(agent=None).update(agent=agent)
+
+        return agent

     @staticmethod
     async def aget_default_agent():

@ -482,9 +479,10 @@ class ConversationAdapters:
                 .first()
             )
         else:
+            agent = AgentAdapters.get_default_agent()
             conversation = (
                 Conversation.objects.filter(user=user, client=client_application).order_by("-updated_at").first()
-            ) or Conversation.objects.create(user=user, client=client_application)
+            ) or Conversation.objects.create(user=user, client=client_application, agent=agent)

         return conversation

@ -514,11 +512,12 @@ class ConversationAdapters:
         user: KhojUser, client_application: ClientApplication = None, agent_slug: str = None
     ):
         if agent_slug:
-            agent = await AgentAdapters.aget_agent_by_slug(agent_slug)
+            agent = await AgentAdapters.aget_agent_by_slug(agent_slug, user)
             if agent is None:
-                raise HTTPException(status_code=400, detail="Invalid agent id")
+                raise HTTPException(status_code=400, detail="No such agent currently exists.")
             return await Conversation.objects.acreate(user=user, client=client_application, agent=agent)

-        return await Conversation.objects.acreate(user=user, client=client_application)
+        agent = await AgentAdapters.aget_default_agent()
+        return await Conversation.objects.acreate(user=user, client=client_application, agent=agent)

     @staticmethod
     async def aget_conversation_by_user(

@ -706,8 +705,8 @@ class ConversationAdapters:
         conversation_config = ConversationAdapters.get_default_conversation_config()
         if offline_chat_config and offline_chat_config.enabled and conversation_config.model_type == "offline":
-            if state.gpt4all_processor_config is None or state.gpt4all_processor_config.loaded_model is None:
-                state.gpt4all_processor_config = GPT4AllProcessorModel(conversation_config.chat_model)
+            if state.offline_chat_processor_config is None or state.offline_chat_processor_config.loaded_model is None:
+                state.offline_chat_processor_config = OfflineChatProcessorModel(conversation_config.chat_model)

             return conversation_config

View file

@ -23,7 +23,7 @@ class Migration(migrations.Migration):
("tools", models.JSONField(default=list)), ("tools", models.JSONField(default=list)),
("public", models.BooleanField(default=False)), ("public", models.BooleanField(default=False)),
("managed_by_admin", models.BooleanField(default=False)), ("managed_by_admin", models.BooleanField(default=False)),
("slug", models.CharField(blank=True, default=None, max_length=200, null=True)), ("slug", models.CharField(max_length=200)),
( (
"chat_model", "chat_model",
models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to="database.chatmodeloptions"), models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to="database.chatmodeloptions"),

View file

@ -3,6 +3,18 @@
 from django.db import migrations, models

+def set_default_locale(apps, schema_editor):
+    return
+
+
+def reverse_set_default_locale(apps, schema_editor):
+    GoogleUser = apps.get_model("database", "GoogleUser")
+    for user in GoogleUser.objects.all():
+        if not user.locale:
+            user.locale = "en"
+            user.save()
+
+
 class Migration(migrations.Migration):
     dependencies = [
         ("database", "0030_conversation_slug_and_title"),

@ -14,4 +26,5 @@ class Migration(migrations.Migration):
             name="locale",
             field=models.CharField(blank=True, default=None, max_length=200, null=True),
         ),
+        migrations.RunPython(set_default_locale, reverse_set_default_locale),
     ]

View file

@ -0,0 +1,17 @@
# Generated by Django 4.2.10 on 2024-03-23 16:01
from django.db import migrations
class Migration(migrations.Migration):
dependencies = [
("database", "0032_merge_20240322_0427"),
]
operations = [
migrations.RenameField(
model_name="agent",
old_name="tuning",
new_name="personality",
),
]

View file

@ -80,20 +80,22 @@ class ChatModelOptions(BaseModel):
     max_prompt_size = models.IntegerField(default=None, null=True, blank=True)
     tokenizer = models.CharField(max_length=200, default=None, null=True, blank=True)
-    chat_model = models.CharField(max_length=200, default="mistral-7b-instruct-v0.1.Q4_0.gguf")
+    chat_model = models.CharField(max_length=200, default="NousResearch/Hermes-2-Pro-Mistral-7B-GGUF")
     model_type = models.CharField(max_length=200, choices=ModelType.choices, default=ModelType.OFFLINE)


 class Agent(BaseModel):
-    creator = models.ForeignKey(KhojUser, on_delete=models.CASCADE, default=None, null=True, blank=True)
+    creator = models.ForeignKey(
+        KhojUser, on_delete=models.CASCADE, default=None, null=True, blank=True
+    )  # Creator will only be null when the agents are managed by admin
     name = models.CharField(max_length=200)
-    tuning = models.TextField()
+    personality = models.TextField()
     avatar = models.URLField(max_length=400, default=None, null=True, blank=True)
     tools = models.JSONField(default=list)  # List of tools the agent has access to, like online search or notes search
     public = models.BooleanField(default=False)
     managed_by_admin = models.BooleanField(default=False)
     chat_model = models.ForeignKey(ChatModelOptions, on_delete=models.CASCADE)
-    slug = models.CharField(max_length=200, default=None, null=True, blank=True)
+    slug = models.CharField(max_length=200)


 @receiver(pre_save, sender=Agent)

@ -108,7 +110,10 @@ def verify_agent(sender, instance, **kwargs):
         slug = instance.name.lower().replace(" ", "-")
         observed_random_numbers = set()
         while Agent.objects.filter(slug=slug).exists():
-            random_number = choice([i for i in range(0, 10000) if i not in observed_random_numbers])
+            try:
+                random_number = choice([i for i in range(0, 1000) if i not in observed_random_numbers])
+            except IndexError:
+                raise ValidationError("Unable to generate a unique slug for the Agent. Please try again later.")
             observed_random_numbers.add(random_number)
             slug = f"{slug}-{random_number}"

         instance.slug = slug

View file

@ -24,9 +24,9 @@
<img id="agent-avatar" src="{{ agent.avatar }}" alt="Agent Avatar"> <img id="agent-avatar" src="{{ agent.avatar }}" alt="Agent Avatar">
<input type="text" id="agent-name-input" value="{{ agent.name }}" {% if agent.creator_not_self %} disabled {% endif %}> <input type="text" id="agent-name-input" value="{{ agent.name }}" {% if agent.creator_not_self %} disabled {% endif %}>
</div> </div>
<div id="agent-instructions">Instructions</div> <div id="agent-instructions">Personality</div>
<div id="agent-tuning"> <div id="agent-tuning">
<p>{{ agent.tuning }}</p> <p>{{ agent.personality }}</p>
</div> </div>
<div class="divider"></div> <div class="divider"></div>
<div id="agent-public"> <div id="agent-public">
@ -256,8 +256,8 @@
} }
</style> </style>
<script> <script>
async function openChat(agentId) { async function openChat(agentSlug) {
let response = await fetch(`/api/chat/sessions?agent_slug=${agentId}`, { method: "POST" }); let response = await fetch(`/api/chat/sessions?agent_slug=${agentSlug}`, { method: "POST" });
let data = await response.json(); let data = await response.json();
if (response.status == 200) { if (response.status == 200) {
window.location.href = "/"; window.location.href = "/";

View file

@ -36,7 +36,7 @@
<a href="/agent/{{ agent.slug }}"> <a href="/agent/{{ agent.slug }}">
<h2>{{ agent.name }}</h2> <h2>{{ agent.name }}</h2>
</a> </a>
<p>{{ agent.tuning }}</p> <p>{{ agent.personality }}</p>
</div> </div>
<div class="agent-info"> <div class="agent-info">
<button onclick="openChat('{{ agent.slug }}')">Talk</button> <button onclick="openChat('{{ agent.slug }}')">Talk</button>
@ -69,7 +69,7 @@
} }
.agent-info p { .agent-info p {
height: 50px; /* Adjust this value as needed */ height: 50px;
overflow: auto; overflow: auto;
margin: 0px; margin: 0px;
} }
@ -84,8 +84,10 @@
} }
div.agent img { div.agent img {
width: 50px; width: 78px;
height: 78px;
border-radius: 50%; border-radius: 50%;
object-fit: cover;
} }
div.agent a { div.agent a {
@ -157,7 +159,7 @@
background-color: var(--frosted-background-color); background-color: var(--frosted-background-color);
box-shadow: 0px 8px 16px 0px rgba(0,0,0,0.2); box-shadow: 0px 8px 16px 0px rgba(0,0,0,0.2);
border-radius: 8px; border-radius: 8px;
width: 50%; width: 75%;
margin-right: auto; margin-right: auto;
margin-left: auto; margin-left: auto;
} }
@ -186,8 +188,8 @@
} }
</style> </style>
<script> <script>
async function openChat(agentId) { async function openChat(agentSlug) {
let response = await fetch(`/api/chat/sessions?agent_slug=${agentId}`, { method: "POST" }); let response = await fetch(`/api/chat/sessions?agent_slug=${agentSlug}`, { method: "POST" });
let data = await response.json(); let data = await response.json();
if (response.status == 200) { if (response.status == 200) {
window.location.href = "/"; window.location.href = "/";

View file

@ -5,6 +5,7 @@
     <title>Khoj - Chat</title>
     <link rel="icon" type="image/png" sizes="128x128" href="/static/assets/icons/favicon-128x128.png?v={{ khoj_version }}">
+    <link rel="apple-touch-icon" href="/static/assets/icons/favicon-128x128.png?v={{ khoj_version }}">
     <link rel="manifest" href="/static/khoj.webmanifest?v={{ khoj_version }}">
     <link rel="stylesheet" href="/static/assets/khoj.css?v={{ khoj_version }}">
 </head>

@ -19,7 +20,13 @@ Hi, I am Khoj, your open, personal AI 👋🏽. I can:
 - 🌄 Generate images based on your messages
 - 🔎 Search the web for answers to your questions
 - 🎙️ Listen to your audio messages (use the mic by the input box to speak your message)
+<<<<<<< HEAD
 - 📚 Understand files you drag & drop here
+||||||| 7416ca9a
+=======
+- 📚 Understand files you drag & drop here
+- 👩🏾‍🚀 Be tuned to your conversation needs via [agents](./agents)
+>>>>>>> 3c3e48b18c87a42e3b69af5dbfb86b5c8b68effa

 Get the Khoj [Desktop](https://khoj.dev/downloads), [Obsidian](https://docs.khoj.dev/clients/obsidian#setup), [Emacs](https://docs.khoj.dev/clients/emacs#setup) apps to search, chat with your 🖥️ computer docs. You can manage all the files you've shared with me at any time by going to [your settings](/config/content-source/computer/).

@ -111,7 +118,7 @@ To get started, just start typing below. You can also type / to see a list of co
         function generateOnlineReference(reference, index) {
             // Generate HTML for Chat Reference
-            let title = reference.title;
+            let title = reference.title || reference.link;
             let link = reference.link;
             let snippet = reference.snippet;
             let question = reference.question;

@ -214,6 +221,15 @@ To get started, just start typing below. You can also type / to see a list of co
                     referenceSection.appendChild(polishedReference);
                 }
             }
+            if (onlineReference.webpages && onlineReference.webpages.length > 0) {
+                numOnlineReferences += onlineReference.webpages.length;
+                for (let index in onlineReference.webpages) {
+                    let reference = onlineReference.webpages[index];
+                    let polishedReference = generateOnlineReference(reference, index);
+                    referenceSection.appendChild(polishedReference);
+                }
+            }
         }
         return numOnlineReferences;

@ -2452,6 +2468,7 @@ To get started, just start typing below. You can also type / to see a list of co
             width: 100%; /* Full width */
             height: 100%; /* Full height */
             background-color: rgba(0,0,0,0.4); /* Black w/ opacity */
+            margin: 0px;
         }
         .modal-content {

View file

@ -5,6 +5,7 @@
     <title>Khoj - Search</title>
     <link rel="icon" type="image/png" sizes="128x128" href="/static/assets/icons/favicon-128x128.png?v={{ khoj_version }}">
+    <link rel="apple-touch-icon" href="/static/assets/icons/favicon-128x128.png?v={{ khoj_version }}">
     <link rel="manifest" href="/static/khoj.webmanifest?v={{ khoj_version }}">
     <link rel="stylesheet" href="/static/assets/khoj.css?v={{ khoj_version }}">
 </head>

View file

@ -4,9 +4,9 @@
<img class="khoj-logo" src="/static/assets/icons/khoj-logo-sideways-500.png?v={{ khoj_version }}" alt="Khoj"></img> <img class="khoj-logo" src="/static/assets/icons/khoj-logo-sideways-500.png?v={{ khoj_version }}" alt="Khoj"></img>
</a> </a>
<nav class="khoj-nav"> <nav class="khoj-nav">
<a id="agents-nav" class="khoj-nav" href="/agents">Agents</a>
{% if has_documents %} {% if has_documents %}
<a id="chat-nav" class="khoj-nav" href="/chat">💬 Chat</a> <a id="search-nav" class="khoj-nav" href="/search">Search</a>
<a id="search-nav" class="khoj-nav" href="/search">🔎 Search</a>
{% endif %} {% endif %}
<!-- Dropdown Menu --> <!-- Dropdown Menu -->
{% if username %} {% if username %}

View file

@ -160,7 +160,9 @@ def start_server(app, host=None, port=None, socket=None):
     if socket:
         uvicorn.run(app, proxy_headers=True, uds=socket, log_level="debug", use_colors=True, log_config=None)
     else:
-        uvicorn.run(app, host=host, port=port, log_level="debug", use_colors=True, log_config=None)
+        uvicorn.run(
+            app, host=host, port=port, log_level="debug", use_colors=True, log_config=None, timeout_keep_alive=60
+        )

     logger.info("🌒 Stopping Khoj")

View file

@ -0,0 +1,71 @@
"""
Current format of khoj.yml
---
app:
...
content-type:
...
processor:
conversation:
offline-chat:
enable-offline-chat: false
chat-model: mistral-7b-instruct-v0.1.Q4_0.gguf
...
search-type:
...
New format of khoj.yml
---
app:
...
content-type:
...
processor:
conversation:
offline-chat:
enable-offline-chat: false
chat-model: NousResearch/Hermes-2-Pro-Mistral-7B-GGUF
...
search-type:
...
"""
import logging
from packaging import version
from khoj.utils.yaml import load_config_from_file, save_config_to_file
logger = logging.getLogger(__name__)
def migrate_offline_chat_default_model(args):
schema_version = "1.7.0"
raw_config = load_config_from_file(args.config_file)
previous_version = raw_config.get("version")
if "processor" not in raw_config:
return args
if raw_config["processor"] is None:
return args
if "conversation" not in raw_config["processor"]:
return args
if "offline-chat" not in raw_config["processor"]["conversation"]:
return args
if "chat-model" not in raw_config["processor"]["conversation"]["offline-chat"]:
return args
if previous_version is None or version.parse(previous_version) < version.parse(schema_version):
logger.info(
f"Upgrading config schema to {schema_version} from {previous_version} to change default (offline) chat model to mistral GGUF"
)
raw_config["version"] = schema_version
# Update offline chat model to use Nous Research's Hermes-2-Pro GGUF in path format suitable for llama-cpp
offline_chat_model = raw_config["processor"]["conversation"]["offline-chat"]["chat-model"]
if offline_chat_model == "mistral-7b-instruct-v0.1.Q4_0.gguf":
raw_config["processor"]["conversation"]["offline-chat"][
"chat-model"
] = "NousResearch/Hermes-2-Pro-Mistral-7B-GGUF"
save_config_to_file(raw_config, args.config_file)
return args

View file

@ -1,13 +1,15 @@
import json
import logging import logging
from collections import deque from datetime import datetime, timedelta
from datetime import datetime
from threading import Thread from threading import Thread
from typing import Any, Iterator, List, Union from typing import Any, Iterator, List, Union
from langchain.schema import ChatMessage from langchain.schema import ChatMessage
from llama_cpp import Llama
from khoj.database.models import Agent from khoj.database.models import Agent
from khoj.processor.conversation import prompts from khoj.processor.conversation import prompts
from khoj.processor.conversation.offline.utils import download_model
from khoj.processor.conversation.utils import ( from khoj.processor.conversation.utils import (
ThreadedGenerator, ThreadedGenerator,
generate_chatml_messages_with_context, generate_chatml_messages_with_context,
@ -22,7 +24,7 @@ logger = logging.getLogger(__name__)
def extract_questions_offline( def extract_questions_offline(
text: str, text: str,
model: str = "mistral-7b-instruct-v0.1.Q4_0.gguf", model: str = "NousResearch/Hermes-2-Pro-Mistral-7B-GGUF",
loaded_model: Union[Any, None] = None, loaded_model: Union[Any, None] = None,
conversation_log={}, conversation_log={},
use_history: bool = True, use_history: bool = True,
@ -32,22 +34,14 @@ def extract_questions_offline(
""" """
Infer search queries to retrieve relevant notes to answer user query Infer search queries to retrieve relevant notes to answer user query
""" """
try:
from gpt4all import GPT4All
except ModuleNotFoundError as e:
logger.info("There was an error importing GPT4All. Please run pip install gpt4all in order to install it.")
raise e
# Assert that loaded_model is either None or of type GPT4All
assert loaded_model is None or isinstance(loaded_model, GPT4All), "loaded_model must be of type GPT4All or None"
all_questions = text.split("? ") all_questions = text.split("? ")
all_questions = [q + "?" for q in all_questions[:-1]] + [all_questions[-1]] all_questions = [q + "?" for q in all_questions[:-1]] + [all_questions[-1]]
if not should_extract_questions: if not should_extract_questions:
return all_questions return all_questions
gpt4all_model = loaded_model or GPT4All(model) assert loaded_model is None or isinstance(loaded_model, Llama), "loaded_model must be of type Llama, if configured"
offline_chat_model = loaded_model or download_model(model)
location = f"{location_data.city}, {location_data.region}, {location_data.country}" if location_data else "Unknown" location = f"{location_data.city}, {location_data.region}, {location_data.country}" if location_data else "Unknown"
@ -56,37 +50,36 @@ def extract_questions_offline(
if use_history: if use_history:
for chat in conversation_log.get("chat", [])[-4:]: for chat in conversation_log.get("chat", [])[-4:]:
if chat["by"] == "khoj" and chat["intent"].get("type") != "text-to-image": if chat["by"] == "khoj" and "text-to-image" not in chat["intent"].get("type"):
chat_history += f"Q: {chat['intent']['query']}\n" chat_history += f"Q: {chat['intent']['query']}\n"
chat_history += f"A: {chat['message']}\n" chat_history += f"Khoj: {chat['message']}\n\n"
current_date = datetime.now().strftime("%Y-%m-%d") today = datetime.today()
last_year = datetime.now().year - 1 yesterday = (today - timedelta(days=1)).strftime("%Y-%m-%d")
last_christmas_date = f"{last_year}-12-25" last_year = today.year - 1
next_christmas_date = f"{datetime.now().year}-12-25" example_questions = prompts.extract_questions_offline.format(
system_prompt = prompts.system_prompt_extract_questions_gpt4all.format(
message=(prompts.system_prompt_message_extract_questions_gpt4all)
)
example_questions = prompts.extract_questions_gpt4all_sample.format(
query=text, query=text,
chat_history=chat_history, chat_history=chat_history,
current_date=current_date, current_date=today.strftime("%Y-%m-%d"),
yesterday_date=yesterday,
last_year=last_year, last_year=last_year,
last_christmas_date=last_christmas_date, this_year=today.year,
next_christmas_date=next_christmas_date,
location=location, location=location,
) )
message = system_prompt + example_questions messages = generate_chatml_messages_with_context(
example_questions, model_name=model, loaded_model=offline_chat_model
)
state.chat_lock.acquire() state.chat_lock.acquire()
try: try:
response = gpt4all_model.generate(message, max_tokens=200, top_k=2, temp=0, n_batch=512) response = send_message_to_model_offline(messages, loaded_model=offline_chat_model)
finally: finally:
state.chat_lock.release() state.chat_lock.release()
# Extract, Clean Message from GPT's Response # Extract, Clean Message from GPT's Response
try: try:
# This will expect to be a list with a single string with a list of questions # This will expect to be a list with a single string with a list of questions
questions = ( questions_str = (
str(response) str(response)
.strip(empty_escape_sequences) .strip(empty_escape_sequences)
.replace("['", '["') .replace("['", '["')
@ -94,11 +87,8 @@ def extract_questions_offline(
.replace("</s>", "") .replace("</s>", "")
.replace("']", '"]') .replace("']", '"]')
.replace("', '", '", "') .replace("', '", '", "')
.replace('["', "")
.replace('"]', "")
.split("? ")
) )
questions = [q + "?" for q in questions[:-1]] + [questions[-1]] questions: List[str] = json.loads(questions_str)
questions = filter_questions(questions) questions = filter_questions(questions)
except: except:
logger.warning(f"Llama returned invalid JSON. Falling back to using user message as search query.\n{response}") logger.warning(f"Llama returned invalid JSON. Falling back to using user message as search query.\n{response}")
@ -121,12 +111,12 @@ def filter_questions(questions: List[str]):
"do not know", "do not know",
"do not understand", "do not understand",
] ]
filtered_questions = [] filtered_questions = set()
for q in questions: for q in questions:
if not any([word in q.lower() for word in hint_words]) and not is_none_or_empty(q): if not any([word in q.lower() for word in hint_words]) and not is_none_or_empty(q):
filtered_questions.append(q) filtered_questions.add(q)
return filtered_questions return list(filtered_questions)
def converse_offline( def converse_offline(
@ -134,7 +124,7 @@ def converse_offline(
references=[], references=[],
online_results=[], online_results=[],
conversation_log={}, conversation_log={},
model: str = "mistral-7b-instruct-v0.1.Q4_0.gguf", model: str = "NousResearch/Hermes-2-Pro-Mistral-7B-GGUF",
loaded_model: Union[Any, None] = None, loaded_model: Union[Any, None] = None,
completion_func=None, completion_func=None,
conversation_commands=[ConversationCommand.Default], conversation_commands=[ConversationCommand.Default],
@ -147,26 +137,19 @@ def converse_offline(
""" """
Converse with user using Llama Converse with user using Llama
""" """
try:
from gpt4all import GPT4All
except ModuleNotFoundError as e:
logger.info("There was an error importing GPT4All. Please run pip install gpt4all in order to install it.")
raise e
assert loaded_model is None or isinstance(loaded_model, GPT4All), "loaded_model must be of type GPT4All or None"
gpt4all_model = loaded_model or GPT4All(model)
# Initialize Variables # Initialize Variables
assert loaded_model is None or isinstance(loaded_model, Llama), "loaded_model must be of type Llama, if configured"
offline_chat_model = loaded_model or download_model(model)
compiled_references_message = "\n\n".join({f"{item}" for item in references}) compiled_references_message = "\n\n".join({f"{item}" for item in references})
system_prompt = ""
current_date = datetime.now().strftime("%Y-%m-%d") current_date = datetime.now().strftime("%Y-%m-%d")
if agent and agent.tuning: if agent and agent.personality:
system_prompt = prompts.custom_system_prompt_message_gpt4all.format( system_prompt = prompts.custom_system_prompt_offline_chat.format(
name=agent.name, bio=agent.tuning, current_date=current_date name=agent.name, bio=agent.personality, current_date=current_date
) )
else: else:
system_prompt = prompts.system_prompt_message_gpt4all.format(current_date=current_date) system_prompt = prompts.system_prompt_offline_chat.format(current_date=current_date)
conversation_primer = prompts.query_prompt.format(query=user_query) conversation_primer = prompts.query_prompt.format(query=user_query)
@ -189,12 +172,12 @@ def converse_offline(
if ConversationCommand.Online in conversation_commands: if ConversationCommand.Online in conversation_commands:
simplified_online_results = online_results.copy() simplified_online_results = online_results.copy()
for result in online_results: for result in online_results:
if online_results[result].get("extracted_content"): if online_results[result].get("webpages"):
simplified_online_results[result] = online_results[result]["extracted_content"] simplified_online_results[result] = online_results[result]["webpages"]
conversation_primer = f"{prompts.online_search_conversation.format(online_results=str(simplified_online_results))}\n{conversation_primer}" conversation_primer = f"{prompts.online_search_conversation.format(online_results=str(simplified_online_results))}\n{conversation_primer}"
if not is_none_or_empty(compiled_references_message): if not is_none_or_empty(compiled_references_message):
conversation_primer = f"{prompts.notes_conversation_gpt4all.format(references=compiled_references_message)}\n{conversation_primer}" conversation_primer = f"{prompts.notes_conversation_offline.format(references=compiled_references_message)}\n{conversation_primer}"
# Setup Prompt with Primer or Conversation History # Setup Prompt with Primer or Conversation History
messages = generate_chatml_messages_with_context( messages = generate_chatml_messages_with_context(
@ -202,72 +185,44 @@ def converse_offline(
system_prompt, system_prompt,
conversation_log, conversation_log,
model_name=model, model_name=model,
loaded_model=offline_chat_model,
max_prompt_size=max_prompt_size, max_prompt_size=max_prompt_size,
tokenizer_name=tokenizer_name, tokenizer_name=tokenizer_name,
) )
g = ThreadedGenerator(references, online_results, completion_func=completion_func) g = ThreadedGenerator(references, online_results, completion_func=completion_func)
t = Thread(target=llm_thread, args=(g, messages, gpt4all_model)) t = Thread(target=llm_thread, args=(g, messages, offline_chat_model))
t.start() t.start()
return g return g
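For reference, the ThreadedGenerator returned here is a queue-backed generator that a producer thread feeds with response chunks. A minimal sketch of the pattern, assuming a simplified class shape rather than Khoj's exact implementation:

import queue
from threading import Thread

class ThreadedGenerator:
    # Simplified stand-in: Khoj's class also tracks references and a completion callback
    def __init__(self):
        self.queue = queue.Queue()

    def __iter__(self):
        return self

    def __next__(self):
        chunk = self.queue.get()
        if chunk is StopIteration:
            raise StopIteration
        return chunk

    def send(self, data):
        self.queue.put(data)

    def close(self):
        self.queue.put(StopIteration)

def producer(g):
    for token in ["Hello", ", ", "world!"]:
        g.send(token)
    g.close()

g = ThreadedGenerator()
Thread(target=producer, args=(g,)).start()
print("".join(g))  # consumes chunks as the producer thread emits them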
def llm_thread(g, messages: List[ChatMessage], model: Any): def llm_thread(g, messages: List[ChatMessage], model: Any):
user_message = messages[-1]
system_message = messages[0]
conversation_history = messages[1:-1]
formatted_messages = [
prompts.khoj_message_gpt4all.format(message=message.content)
if message.role == "assistant"
else prompts.user_message_gpt4all.format(message=message.content)
for message in conversation_history
]
stop_phrases = ["<s>", "INST]", "Notes:"] stop_phrases = ["<s>", "INST]", "Notes:"]
chat_history = "".join(formatted_messages)
templated_system_message = prompts.system_prompt_gpt4all.format(message=system_message.content)
templated_user_message = prompts.user_message_gpt4all.format(message=user_message.content)
prompted_message = templated_system_message + chat_history + templated_user_message
response_queue: deque[str] = deque(maxlen=3) # Create a response queue with a maximum length of 3
hit_stop_phrase = False
state.chat_lock.acquire() state.chat_lock.acquire()
response_iterator = send_message_to_model_offline(prompted_message, loaded_model=model, streaming=True)
try: try:
response_iterator = send_message_to_model_offline(
messages, loaded_model=model, stop=stop_phrases, streaming=True
)
for response in response_iterator: for response in response_iterator:
response_queue.append(response) g.send(response["choices"][0]["delta"].get("content", ""))
hit_stop_phrase = any(stop_phrase in "".join(response_queue) for stop_phrase in stop_phrases)
if hit_stop_phrase:
logger.debug(f"Stop response as hit stop phrase: {''.join(response_queue)}")
break
# Start streaming the response at a lag once the queue is full
# This allows stop word testing before sending the response
if len(response_queue) == response_queue.maxlen:
g.send(response_queue[0])
finally: finally:
if not hit_stop_phrase:
if len(response_queue) == response_queue.maxlen:
# remove already sent response chunk
response_queue.popleft()
# send the remaining response
g.send("".join(response_queue))
state.chat_lock.release() state.chat_lock.release()
g.close() g.close()
def send_message_to_model_offline( def send_message_to_model_offline(
message, loaded_model=None, model="mistral-7b-instruct-v0.1.Q4_0.gguf", streaming=False, system_message="" messages: List[ChatMessage],
) -> str: loaded_model=None,
try: model="NousResearch/Hermes-2-Pro-Mistral-7B-GGUF",
from gpt4all import GPT4All streaming=False,
except ModuleNotFoundError as e: stop=[],
logger.info("There was an error importing GPT4All. Please run pip install gpt4all in order to install it.") ):
raise e assert loaded_model is None or isinstance(loaded_model, Llama), "loaded_model must be of type Llama, if configured"
offline_chat_model = loaded_model or download_model(model)
assert loaded_model is None or isinstance(loaded_model, GPT4All), "loaded_model must be of type GPT4All or None" messages_dict = [{"role": message.role, "content": message.content} for message in messages]
gpt4all_model = loaded_model or GPT4All(model) response = offline_chat_model.create_chat_completion(messages_dict, stop=stop, stream=streaming)
if streaming:
return gpt4all_model.generate( return response
system_message + message, max_tokens=200, top_k=2, temp=0, n_batch=512, streaming=streaming else:
) return response["choices"][0]["message"].get("content", "")
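The rewritten offline path above delegates prompt templating and streaming to llama-cpp-python's chat completion API. A hedged sketch of both modes, reusing the model repo named in this diff:

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="NousResearch/Hermes-2-Pro-Mistral-7B-GGUF",
    filename="*Q4_K_M.gguf",
    n_ctx=0,  # read the context size from the model file
    verbose=False,
)
messages = [{"role": "user", "content": "Say hello in five words."}]

# Non-streaming: a single OpenAI-style response object
response = llm.create_chat_completion(messages, stream=False)
print(response["choices"][0]["message"].get("content", ""))

# Streaming: chunks carry incremental deltas, which may omit "content"
for chunk in llm.create_chat_completion(messages, stream=True):
    print(chunk["choices"][0]["delta"].get("content", ""), end="")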

View file

@ -1,43 +1,54 @@
import glob
import logging import logging
import os
from huggingface_hub.constants import HF_HUB_CACHE
from khoj.utils import state from khoj.utils import state
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
def download_model(model_name: str): def download_model(repo_id: str, filename: str = "*Q4_K_M.gguf"):
try: from llama_cpp.llama import Llama
import gpt4all
except ModuleNotFoundError as e: # Initialize Model Parameters. Use n_ctx=0 to get context size from the model
logger.info("There was an error importing GPT4All. Please run pip install gpt4all in order to install it.") kwargs = {"n_threads": 4, "n_ctx": 0, "verbose": False}
raise e
# Decide whether to load model to GPU or CPU # Decide whether to load model to GPU or CPU
chat_model_config = None device = "gpu" if state.chat_on_gpu and state.device != "cpu" else "cpu"
kwargs["n_gpu_layers"] = -1 if device == "gpu" else 0
# Check if the model is already downloaded
model_path = load_model_from_cache(repo_id, filename)
chat_model = None
try: try:
# Download the chat model and its config if model_path:
chat_model_config = gpt4all.GPT4All.retrieve_model(model_name=model_name, allow_download=True) chat_model = Llama(model_path, **kwargs)
# Try load chat model to GPU if:
# 1. Loading chat model to GPU isn't disabled via CLI and
# 2. Machine has GPU
# 3. GPU has enough free memory to load the chat model with max context length of 4096
device = (
"gpu"
if state.chat_on_gpu and gpt4all.pyllmodel.LLModel().list_gpu(chat_model_config["path"], 4096)
else "cpu"
)
except ValueError:
device = "cpu"
except Exception as e:
if chat_model_config is None:
device = "cpu" # Fallback to CPU as can't determine if GPU has enough memory
logger.debug(f"Unable to download model config from gpt4all website: {e}")
else: else:
raise e Llama.from_pretrained(repo_id=repo_id, filename=filename, **kwargs)
except:
# Load model on CPU if GPU is not available
kwargs["n_gpu_layers"], device = 0, "cpu"
if model_path:
chat_model = Llama(model_path, **kwargs)
else:
chat_model = Llama.from_pretrained(repo_id=repo_id, filename=filename, **kwargs)
# Now load the downloaded chat model onto appropriate device logger.debug(f"{'Loaded' if model_path else 'Downloaded'} chat model to {device.upper()}")
chat_model = gpt4all.GPT4All(model_name=model_name, n_ctx=4096, device=device, allow_download=False)
logger.debug(f"Loaded chat model to {device.upper()}.")
return chat_model return chat_model
def load_model_from_cache(repo_id: str, filename: str, repo_type="models"):
# Construct the path to the model file in the cache directory
repo_org, repo_name = repo_id.split("/")
object_id = "--".join([repo_type, repo_org, repo_name])
model_path = os.path.sep.join([HF_HUB_CACHE, object_id, "snapshots", "**", filename])
# Check if the model file exists
paths = glob.glob(model_path)
if paths:
return paths[0]
else:
return None
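The try/except above implements a GPU-first, CPU-fallback load, while load_model_from_cache avoids re-downloading a model already in the Hugging Face hub cache. A minimal sketch of the fallback pattern, assuming a failed GPU initialization raises from the Llama constructor:

from llama_cpp import Llama

repo_id, filename = "NousResearch/Hermes-2-Pro-Mistral-7B-GGUF", "*Q4_K_M.gguf"
kwargs = {"n_threads": 4, "n_ctx": 0, "verbose": False}
try:
    # n_gpu_layers=-1 offloads all layers to the GPU, if a GPU backend was built in
    kwargs["n_gpu_layers"] = -1
    chat_model = Llama.from_pretrained(repo_id=repo_id, filename=filename, **kwargs)
except Exception:
    # n_gpu_layers=0 keeps the whole model on the CPU
    kwargs["n_gpu_layers"] = 0
    chat_model = Llama.from_pretrained(repo_id=repo_id, filename=filename, **kwargs)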

View file

@ -1,7 +1,7 @@
import json import json
import logging import logging
from datetime import datetime, timedelta from datetime import datetime, timedelta
from typing import Optional from typing import Dict, Optional
from langchain.schema import ChatMessage from langchain.schema import ChatMessage
@ -105,7 +105,7 @@ def send_message_to_model(messages, api_key, model, response_type="text"):
def converse( def converse(
references, references,
user_query, user_query,
online_results: Optional[dict] = None, online_results: Optional[Dict[str, Dict]] = None,
conversation_log={}, conversation_log={},
model: str = "gpt-3.5-turbo", model: str = "gpt-3.5-turbo",
api_key: Optional[str] = None, api_key: Optional[str] = None,
@ -127,10 +127,10 @@ def converse(
conversation_primer = prompts.query_prompt.format(query=user_query) conversation_primer = prompts.query_prompt.format(query=user_query)
system_prompt = "" if agent and agent.personality:
system_prompt = prompts.custom_personality.format(
if agent and agent.tuning: name=agent.name, bio=agent.personality, current_date=current_date
system_prompt = prompts.custom_personality.format(name=agent.name, bio=agent.tuning, current_date=current_date) )
else: else:
system_prompt = prompts.personality.format(current_date=current_date) system_prompt = prompts.personality.format(current_date=current_date)
@ -151,7 +151,7 @@ def converse(
completion_func(chat_response=prompts.no_online_results_found.format()) completion_func(chat_response=prompts.no_online_results_found.format())
return iter([prompts.no_online_results_found.format()]) return iter([prompts.no_online_results_found.format()])
if ConversationCommand.Online in conversation_commands: if ConversationCommand.Online in conversation_commands or ConversationCommand.Webpage in conversation_commands:
conversation_primer = ( conversation_primer = (
f"{prompts.online_search_conversation.format(online_results=str(online_results))}\n{conversation_primer}" f"{prompts.online_search_conversation.format(online_results=str(online_results))}\n{conversation_primer}"
) )
@ -167,7 +167,7 @@ def converse(
max_prompt_size, max_prompt_size,
tokenizer_name, tokenizer_name,
) )
truncated_messages = "\n".join({f"{message.content[:40]}..." for message in messages}) truncated_messages = "\n".join({f"{message.content[:70]}..." for message in messages})
logger.debug(f"Conversation Context for GPT: {truncated_messages}") logger.debug(f"Conversation Context for GPT: {truncated_messages}")
# Get Response from GPT # Get Response from GPT

View file

@ -101,8 +101,3 @@ def llm_thread(g, messages, model_name, temperature, openai_api_key=None, model_
chat(messages=messages) chat(messages=messages)
g.close() g.close()
def extract_summaries(metadata):
"""Extract summaries from metadata"""
return "".join([f'\n{session["summary"]}' for session in metadata])

View file

@ -10,7 +10,7 @@ You were created by Khoj Inc. with the following capabilities:
- You *CAN REMEMBER ALL NOTES and PERSONAL INFORMATION FOREVER* that the user ever shares with you. - You *CAN REMEMBER ALL NOTES and PERSONAL INFORMATION FOREVER* that the user ever shares with you.
- Users can share files and other information with you using the Khoj Desktop, Obsidian or Emacs app. They can also drag and drop their files into the chat window. - Users can share files and other information with you using the Khoj Desktop, Obsidian or Emacs app. They can also drag and drop their files into the chat window.
- You can generate images, look-up information from the internet, and answer questions based on the user's notes. - You *CAN* generate images, look-up real-time information from the internet, and answer questions based on the user's notes.
- You cannot set reminders. - You cannot set reminders.
- Say "I don't know" or "I don't understand" if you don't know what to say or if you don't know the answer to a question. - Say "I don't know" or "I don't understand" if you don't know what to say or if you don't know the answer to a question.
- Ask crisp follow-up questions to get additional context, when the answer cannot be inferred from the provided notes or past conversations. - Ask crisp follow-up questions to get additional context, when the answer cannot be inferred from the provided notes or past conversations.
@ -65,9 +65,9 @@ no_entries_found = PromptTemplate.from_template(
""".strip() """.strip()
) )
## Conversation Prompts for GPT4All Models ## Conversation Prompts for Offline Chat Models
## -- ## --
system_prompt_message_gpt4all = PromptTemplate.from_template( system_prompt_offline_chat = PromptTemplate.from_template(
""" """
You are Khoj, a smart, inquisitive and helpful personal assistant. You are Khoj, a smart, inquisitive and helpful personal assistant.
- Use your general knowledge and past conversation with the user as context to inform your responses. - Use your general knowledge and past conversation with the user as context to inform your responses.
@ -79,7 +79,7 @@ Today is {current_date} in UTC.
""".strip() """.strip()
) )
custom_system_prompt_message_gpt4all = PromptTemplate.from_template( custom_system_prompt_offline_chat = PromptTemplate.from_template(
""" """
You are {name}, a personal agent on Khoj. You are {name}, a personal agent on Khoj.
- Use your general knowledge and past conversation with the user as context to inform your responses. - Use your general knowledge and past conversation with the user as context to inform your responses.
@ -93,40 +93,6 @@ Instructions:\n{bio}
""".strip() """.strip()
) )
system_prompt_message_extract_questions_gpt4all = f"""You are Khoj, a kind and intelligent personal assistant. When the user asks you a question, you ask follow-up questions to clarify the necessary information you need in order to answer from the user's perspective.
- Write the question as if you can search for the answer on the user's personal notes.
- Try to be as specific as possible. Instead of saying "they" or "it" or "he", use the name of the person or thing you are referring to. For example, instead of saying "Which store did they go to?", say "Which store did Alice and Bob go to?".
- Add as much context from the previous questions and notes as required into your search queries.
- Provide search queries as a list of questions
What follow-up questions, if any, will you need to ask to answer the user's question?
"""
system_prompt_gpt4all = PromptTemplate.from_template(
"""
<s>[INST] <<SYS>>
{message}
<</SYS>>Hi there! [/INST] Hello! How can I help you today? </s>"""
)
system_prompt_extract_questions_gpt4all = PromptTemplate.from_template(
"""
<s>[INST] <<SYS>>
{message}
<</SYS>>[/INST]</s>"""
)
user_message_gpt4all = PromptTemplate.from_template(
"""
<s>[INST] {message} [/INST]
""".strip()
)
khoj_message_gpt4all = PromptTemplate.from_template(
"""
{message}</s>
""".strip()
)
## Notes Conversation ## Notes Conversation
## -- ## --
notes_conversation = PromptTemplate.from_template( notes_conversation = PromptTemplate.from_template(
@ -139,7 +105,7 @@ Notes:
""".strip() """.strip()
) )
notes_conversation_gpt4all = PromptTemplate.from_template( notes_conversation_offline = PromptTemplate.from_template(
""" """
User's Notes: User's Notes:
{references} {references}
@ -178,7 +144,8 @@ online_search_conversation = PromptTemplate.from_template(
Use this up-to-date information from the internet to inform your response. Use this up-to-date information from the internet to inform your response.
Ask crisp follow-up questions to get additional context, when a helpful response cannot be provided from the online data or past conversations. Ask crisp follow-up questions to get additional context, when a helpful response cannot be provided from the online data or past conversations.
Information from the internet: {online_results} Information from the internet:
{online_results}
""".strip() """.strip()
) )
@ -190,58 +157,50 @@ Query: {query}""".strip()
) )
## Summarize Notes
## --
summarize_notes = PromptTemplate.from_template(
"""
Summarize the below notes about {user_query}:
{text}
Summarize the notes in second person perspective:"""
)
## Answer
## --
answer = PromptTemplate.from_template(
"""
You are a friendly, helpful personal assistant.
Using the users notes below, answer their following question. If the answer is not contained within the notes, say "I don't know."
Notes:
{text}
Question: {user_query}
Answer (in second person):"""
)
## Extract Questions ## Extract Questions
## -- ## --
extract_questions_gpt4all_sample = PromptTemplate.from_template( extract_questions_offline = PromptTemplate.from_template(
""" """
<s>[INST] <<SYS>>Current Date: {current_date}. User's Location: {location}<</SYS>> [/INST]</s> You are Khoj, an extremely smart and helpful search assistant with the ability to retrieve information from the user's notes. Construct search queries to retrieve relevant information to answer the user's question.
<s>[INST] How was my trip to Cambodia? [/INST] - You will be provided past questions(Q) and answers(A) for context.
How was my trip to Cambodia?</s> - Try to be as specific as possible. Instead of saying "they" or "it" or "he", use proper nouns like name of the person or thing you are referring to.
<s>[INST] Who did I visit the temple with on that trip? [/INST] - Add as much context from the previous questions and answers as required into your search queries.
Who did I visit the temple with in Cambodia?</s> - Break messages into multiple search queries when required to retrieve the relevant information.
<s>[INST] How should I take care of my plants? [/INST] - Add date filters to your search queries from questions and answers when required to retrieve the relevant information.
What kind of plants do I have? What issues do my plants have?</s>
<s>[INST] How many tennis balls fit in the back of a 2002 Honda Civic? [/INST] Current Date: {current_date}
What is the size of a tennis ball? What is the trunk size of a 2002 Honda Civic?</s> User's Location: {location}
<s>[INST] What did I do for Christmas last year? [/INST]
What did I do for Christmas {last_year} dt>='{last_christmas_date}' dt<'{next_christmas_date}'</s> Examples:
<s>[INST] How are you feeling today? [/INST]</s> Q: How was my trip to Cambodia?
<s>[INST] Is Alice older than Bob? [/INST] Khoj: ["How was my trip to Cambodia?"]
When was Alice born? What is Bob's age?</s>
<s>[INST] <<SYS>> Q: Who did I visit the temple with on that trip?
Use these notes from the user's previous conversations to provide a response: Khoj: ["Who did I visit the temple with in Cambodia?"]
Q: Which of them is older?
Khoj: ["When was Alice born?", "What is Bob's age?"]
Q: Where did John say he was? He mentioned it in our call last week.
Khoj: ["Where is John? dt>='{last_year}-12-25' dt<'{last_year}-12-26'", "John's location in call notes"]
Q: How can you help me?
Khoj: ["Social relationships", "Physical and mental health", "Education and career", "Personal life goals and habits"]
Q: What did I do for Christmas last year?
Khoj: ["What did I do for Christmas {last_year} dt>='{last_year}-12-25' dt<'{last_year}-12-26'"]
Q: How should I take care of my plants?
Khoj: ["What kind of plants do I have?", "What issues do my plants have?"]
Q: Who all did I meet here yesterday?
Khoj: ["Met in {location} on {yesterday_date} dt>='{yesterday_date}' dt<'{current_date}'"]
Chat History:
{chat_history} {chat_history}
<</SYS>> [/INST]</s> What searches will you perform to answer the following question, using the chat history as reference? Respond with relevant search queries as list of strings.
<s>[INST] {query} [/INST] Q: {query}
""" """.strip()
) )
@ -259,7 +218,7 @@ User's Location: {location}
Q: How was my trip to Cambodia? Q: How was my trip to Cambodia?
Khoj: {{"queries": ["How was my trip to Cambodia?"]}} Khoj: {{"queries": ["How was my trip to Cambodia?"]}}
A: The trip was amazing. I went to the Angkor Wat temple and it was beautiful. A: The trip was amazing. You went to the Angkor Wat temple and it was beautiful.
Q: Who did I visit that temple with? Q: Who did I visit that temple with?
Khoj: {{"queries": ["Who did I visit the Angkor Wat Temple in Cambodia with?"]}} Khoj: {{"queries": ["Who did I visit the Angkor Wat Temple in Cambodia with?"]}}
@ -285,8 +244,8 @@ Q: What is their age difference?
Khoj: {{"queries": ["What is Bob's age?", "What is Tom's age?"]}} Khoj: {{"queries": ["What is Bob's age?", "What is Tom's age?"]}}
A: Bob is {bob_tom_age_difference} years older than Tom. As Bob is {bob_age} years old and Tom is 30 years old. A: Bob is {bob_tom_age_difference} years older than Tom. As Bob is {bob_age} years old and Tom is 30 years old.
Q: What does yesterday's note say? Q: Who all did I meet here yesterday?
Khoj: {{"queries": ["Note from {yesterday_date} dt>='{yesterday_date}' dt<'{current_date}'"]}} Khoj: {{"queries": ["Met in {location} on {yesterday_date} dt>='{yesterday_date}' dt<'{current_date}'"]}}
A: Yesterday's note mentions your visit to your local beach with Ram and Shyam. A: Yesterday's note mentions your visit to your local beach with Ram and Shyam.
{chat_history} {chat_history}
@ -312,7 +271,7 @@ Target Query: {query}
Web Pages: Web Pages:
{corpus} {corpus}
Collate the relevant information from the website to answer the target query. Collate only relevant information from the website to answer the target query.
""".strip() """.strip()
) )
@ -394,6 +353,14 @@ AI: Good morning! How can I help you today?
Q: How can I share my files with Khoj? Q: How can I share my files with Khoj?
Khoj: {{"source": ["default", "online"]}} Khoj: {{"source": ["default", "online"]}}
Example:
Chat History:
User: What is the first element in the periodic table?
AI: The first element in the periodic table is Hydrogen.
Q: Summarize this article https://en.wikipedia.org/wiki/Hydrogen
Khoj: {{"source": ["webpage"]}}
Example: Example:
Chat History: Chat History:
User: I want to start a new hobby. I'm thinking of learning to play the guitar. User: I want to start a new hobby. I'm thinking of learning to play the guitar.
@ -412,6 +379,50 @@ Khoj:
""".strip() """.strip()
) )
infer_webpages_to_read = PromptTemplate.from_template(
"""
You are Khoj, an advanced web page reading assistant. You are to construct **up to three, valid** webpage urls to read before answering the user's question.
- You will receive the conversation history as context.
- Add as much context from the previous questions and answers as required to construct the webpage urls.
- Use multiple web page urls if required to retrieve the relevant information.
- You have access to the whole internet to retrieve information.
Which webpages will you need to read to answer the user's question?
Provide web page links as a list of strings in a JSON object.
Current Date: {current_date}
User's Location: {location}
Here are some examples:
History:
User: I like to use Hacker News to get my tech news.
AI: Hacker News is an online forum for sharing and discussing the latest tech news. It is a great place to learn about new technologies and startups.
Q: Summarize this post about vector database on Hacker News, https://news.ycombinator.com/item?id=12345
Khoj: {{"links": ["https://news.ycombinator.com/item?id=12345"]}}
History:
User: I'm currently living in New York but I'm thinking about moving to San Francisco.
AI: New York is a great city to live in. It has a lot of great restaurants and museums. San Francisco is also a great city to live in. It has good access to nature and a great tech scene.
Q: What is the climate like in those cities?
Khoj: {{"links": ["https://en.wikipedia.org/wiki/New_York_City", "https://en.wikipedia.org/wiki/San_Francisco"]}}
History:
User: Hey, how is it going?
AI: Not too bad. How can I help you today?
Q: What's the latest news on r/worldnews?
Khoj: {{"links": ["https://www.reddit.com/r/worldnews/"]}}
Now it's your turn to share actual webpage urls you'd like to read to answer the user's question.
History:
{chat_history}
Q: {query}
Khoj:
""".strip()
)
online_search_conversation_subqueries = PromptTemplate.from_template( online_search_conversation_subqueries = PromptTemplate.from_template(
""" """
You are Khoj, an advanced google search assistant. You are tasked with constructing **up to three** google search queries to answer the user's question. You are Khoj, an advanced google search assistant. You are tasked with constructing **up to three** google search queries to answer the user's question.
@ -490,7 +501,6 @@ help_message = PromptTemplate.from_template(
- **/image**: Generate an image based on your message. - **/image**: Generate an image based on your message.
- **/help**: Show this help message. - **/help**: Show this help message.
You are using the **{model}** model on the **{device}**. You are using the **{model}** model on the **{device}**.
**version**: {version} **version**: {version}
""".strip() """.strip()

View file

@ -3,29 +3,28 @@ import logging
import queue import queue
from datetime import datetime from datetime import datetime
from time import perf_counter from time import perf_counter
from typing import Any, Dict, List from typing import Any, Dict, List, Optional
import tiktoken import tiktoken
from langchain.schema import ChatMessage from langchain.schema import ChatMessage
from llama_cpp.llama import Llama
from transformers import AutoTokenizer from transformers import AutoTokenizer
from khoj.database.adapters import ConversationAdapters from khoj.database.adapters import ConversationAdapters
from khoj.database.models import ClientApplication, KhojUser from khoj.database.models import ClientApplication, KhojUser
from khoj.processor.conversation.offline.utils import download_model
from khoj.utils.helpers import is_none_or_empty, merge_dicts from khoj.utils.helpers import is_none_or_empty, merge_dicts
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
model_to_prompt_size = { model_to_prompt_size = {
"gpt-3.5-turbo": 3000, "gpt-3.5-turbo": 12000,
"gpt-3.5-turbo-0125": 3000, "gpt-3.5-turbo-0125": 12000,
"gpt-4-0125-preview": 7000, "gpt-4-0125-preview": 20000,
"gpt-4-turbo-preview": 7000, "gpt-4-turbo-preview": 20000,
"llama-2-7b-chat.ggmlv3.q4_0.bin": 1548, "TheBloke/Mistral-7B-Instruct-v0.2-GGUF": 3500,
"mistral-7b-instruct-v0.1.Q4_0.gguf": 1548, "NousResearch/Hermes-2-Pro-Mistral-7B-GGUF": 3500,
}
model_to_tokenizer = {
"llama-2-7b-chat.ggmlv3.q4_0.bin": "hf-internal-testing/llama-tokenizer",
"mistral-7b-instruct-v0.1.Q4_0.gguf": "mistralai/Mistral-7B-Instruct-v0.1",
} }
model_to_tokenizer: Dict[str, str] = {}
class ThreadedGenerator: class ThreadedGenerator:
@ -134,9 +133,10 @@ Khoj: "{inferred_queries if ("text-to-image" in intent_type) else chat_response}
def generate_chatml_messages_with_context( def generate_chatml_messages_with_context(
user_message, user_message,
system_message, system_message=None,
conversation_log={}, conversation_log={},
model_name="gpt-3.5-turbo", model_name="gpt-3.5-turbo",
loaded_model: Optional[Llama] = None,
max_prompt_size=None, max_prompt_size=None,
tokenizer_name=None, tokenizer_name=None,
): ):
@ -159,7 +159,7 @@ def generate_chatml_messages_with_context(
chat_notes = f'\n\n Notes:\n{chat.get("context")}' if chat.get("context") else "\n" chat_notes = f'\n\n Notes:\n{chat.get("context")}' if chat.get("context") else "\n"
chat_logs += [chat["message"] + chat_notes] chat_logs += [chat["message"] + chat_notes]
rest_backnforths = [] rest_backnforths: List[ChatMessage] = []
# Extract in reverse chronological order # Extract in reverse chronological order
for user_msg, assistant_msg in zip(chat_logs[-2::-2], chat_logs[::-2]): for user_msg, assistant_msg in zip(chat_logs[-2::-2], chat_logs[::-2]):
if len(rest_backnforths) >= 2 * lookback_turns: if len(rest_backnforths) >= 2 * lookback_turns:
@ -176,22 +176,31 @@ def generate_chatml_messages_with_context(
messages.append(ChatMessage(content=system_message, role="system")) messages.append(ChatMessage(content=system_message, role="system"))
# Truncate oldest messages from conversation history until under max supported prompt size by model # Truncate oldest messages from conversation history until under max supported prompt size by model
messages = truncate_messages(messages, max_prompt_size, model_name, tokenizer_name) messages = truncate_messages(messages, max_prompt_size, model_name, loaded_model, tokenizer_name)
# Return message in chronological order # Return message in chronological order
return messages[::-1] return messages[::-1]
def truncate_messages( def truncate_messages(
messages: list[ChatMessage], max_prompt_size, model_name: str, tokenizer_name=None messages: list[ChatMessage],
max_prompt_size,
model_name: str,
loaded_model: Optional[Llama] = None,
tokenizer_name=None,
) -> list[ChatMessage]: ) -> list[ChatMessage]:
"""Truncate messages to fit within max prompt size supported by model""" """Truncate messages to fit within max prompt size supported by model"""
try: try:
if model_name.startswith("gpt-"): if loaded_model:
encoder = loaded_model.tokenizer()
elif model_name.startswith("gpt-"):
encoder = tiktoken.encoding_for_model(model_name) encoder = tiktoken.encoding_for_model(model_name)
else: else:
encoder = AutoTokenizer.from_pretrained(tokenizer_name or model_to_tokenizer[model_name]) try:
encoder = download_model(model_name).tokenizer()
except:
encoder = AutoTokenizer.from_pretrained(tokenizer_name or model_to_tokenizer[model_name])
except: except:
default_tokenizer = "hf-internal-testing/llama-tokenizer" default_tokenizer = "hf-internal-testing/llama-tokenizer"
encoder = AutoTokenizer.from_pretrained(default_tokenizer) encoder = AutoTokenizer.from_pretrained(default_tokenizer)
@ -223,12 +232,17 @@ def truncate_messages(
original_question = "\n".join(messages[0].content.split("\n")[-1:]) if type(messages[0].content) == str else "" original_question = "\n".join(messages[0].content.split("\n")[-1:]) if type(messages[0].content) == str else ""
original_question = f"\n{original_question}" original_question = f"\n{original_question}"
original_question_tokens = len(encoder.encode(original_question)) original_question_tokens = len(encoder.encode(original_question))
remaining_tokens = max_prompt_size - original_question_tokens - system_message_tokens remaining_tokens = max_prompt_size - system_message_tokens
truncated_message = encoder.decode(encoder.encode(current_message)[:remaining_tokens]).strip() if remaining_tokens > original_question_tokens:
remaining_tokens -= original_question_tokens
truncated_message = encoder.decode(encoder.encode(current_message)[:remaining_tokens]).strip()
messages = [ChatMessage(content=truncated_message + original_question, role=messages[0].role)]
else:
truncated_message = encoder.decode(encoder.encode(original_question)[:remaining_tokens]).strip()
messages = [ChatMessage(content=truncated_message, role=messages[0].role)]
logger.debug( logger.debug(
f"Truncate current message to fit within max prompt size of {max_prompt_size} supported by {model_name} model:\n {truncated_message}" f"Truncate current message to fit within max prompt size of {max_prompt_size} supported by {model_name} model:\n {truncated_message}"
) )
messages = [ChatMessage(content=truncated_message + original_question, role=messages[0].role)]
return messages + [system_message] if system_message else messages return messages + [system_message] if system_message else messages
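The truncation fix above appends the user's original question verbatim only when the token budget can afford it; otherwise the question itself is truncated. A toy walkthrough of that arithmetic with tiktoken, using illustrative numbers:

import tiktoken

encoder = tiktoken.encoding_for_model("gpt-3.5-turbo")
max_prompt_size, system_message_tokens = 100, 20
current_message = "a long message with conversation history " * 50
original_question = "\nWhat did I do last Christmas?"

remaining_tokens = max_prompt_size - system_message_tokens
original_question_tokens = len(encoder.encode(original_question))
if remaining_tokens > original_question_tokens:
    # Budget left over: truncate the history, keep the question intact
    remaining_tokens -= original_question_tokens
    truncated = encoder.decode(encoder.encode(current_message)[:remaining_tokens]).strip()
    final_message = truncated + original_question
else:
    # No room for both: truncate the question itself instead
    final_message = encoder.decode(encoder.encode(original_question)[:remaining_tokens]).strip()
print(final_message)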

View file

@ -2,6 +2,7 @@ import asyncio
import json import json
import logging import logging
import os import os
from collections import defaultdict
from typing import Dict, Tuple, Union from typing import Dict, Tuple, Union
import aiohttp import aiohttp
@ -9,7 +10,11 @@ import requests
from bs4 import BeautifulSoup from bs4 import BeautifulSoup
from markdownify import markdownify from markdownify import markdownify
from khoj.routers.helpers import extract_relevant_info, generate_online_subqueries from khoj.routers.helpers import (
extract_relevant_info,
generate_online_subqueries,
infer_webpage_urls,
)
from khoj.utils.helpers import is_none_or_empty, timer from khoj.utils.helpers import is_none_or_empty, timer
from khoj.utils.rawconfig import LocationData from khoj.utils.rawconfig import LocationData
@ -38,7 +43,7 @@ MAX_WEBPAGES_TO_READ = 1
async def search_online(query: str, conversation_history: dict, location: LocationData): async def search_online(query: str, conversation_history: dict, location: LocationData):
if SERPER_DEV_API_KEY is None: if not online_search_enabled():
logger.warn("SERPER_DEV_API_KEY is not set") logger.warn("SERPER_DEV_API_KEY is not set")
return {} return {}
@ -52,24 +57,21 @@ async def search_online(query: str, conversation_history: dict, location: Locati
# Gather distinct web pages from organic search results of each subquery without an instant answer # Gather distinct web pages from organic search results of each subquery without an instant answer
webpage_links = { webpage_links = {
result["link"] organic["link"]: subquery
for subquery in response_dict for subquery in response_dict
for result in response_dict[subquery].get("organic", [])[:MAX_WEBPAGES_TO_READ] for organic in response_dict[subquery].get("organic", [])[:MAX_WEBPAGES_TO_READ]
if "answerBox" not in response_dict[subquery] if "answerBox" not in response_dict[subquery]
} }
# Read, extract relevant info from the retrieved web pages # Read, extract relevant info from the retrieved web pages
tasks = [] logger.info(f"Reading web pages at: {webpage_links.keys()}")
for webpage_link in webpage_links: tasks = [read_webpage_and_extract_content(subquery, link) for link, subquery in webpage_links.items()]
logger.info(f"Reading web page at '{webpage_link}'")
task = read_webpage_and_extract_content(subquery, webpage_link)
tasks.append(task)
results = await asyncio.gather(*tasks) results = await asyncio.gather(*tasks)
# Collect extracted info from the retrieved web pages # Collect extracted info from the retrieved web pages
for subquery, extracted_webpage_content in results: for subquery, webpage_extract, url in results:
if extracted_webpage_content is not None: if webpage_extract is not None:
response_dict[subquery]["extracted_content"] = extracted_webpage_content response_dict[subquery]["webpages"] = {"link": url, "snippet": webpage_extract}
return response_dict return response_dict
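The rewrite above fans out one read-and-extract task per distinct link and gathers them concurrently, instead of looping and logging per page. A generic sketch of the pattern, with a stand-in for the real fetch and extraction:

import asyncio

async def read_webpage_and_extract_content(subquery, url):
    await asyncio.sleep(0.1)  # stand-in for the webpage fetch + LLM extraction
    return subquery, f"summary of {url}", url

async def main():
    webpage_links = {"https://example.com/a": "query 1", "https://example.com/b": "query 2"}
    tasks = [read_webpage_and_extract_content(sq, link) for link, sq in webpage_links.items()]
    for subquery, webpage_extract, url in await asyncio.gather(*tasks):
        if webpage_extract is not None:
            print(subquery, "->", url)

asyncio.run(main())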
@ -93,19 +95,35 @@ def search_with_google(subquery: str):
return extracted_search_result return extracted_search_result
async def read_webpage_and_extract_content(subquery: str, url: str) -> Tuple[str, Union[None, str]]: async def read_webpages(query: str, conversation_history: dict, location: LocationData):
"Infer web pages to read from the query and extract relevant information from them"
logger.info(f"Inferring web pages to read")
urls = await infer_webpage_urls(query, conversation_history, location)
logger.info(f"Reading web pages at: {urls}")
tasks = [read_webpage_and_extract_content(query, url) for url in urls]
results = await asyncio.gather(*tasks)
response: Dict[str, Dict] = defaultdict(dict)
response[query]["webpages"] = [
{"query": q, "link": url, "snippet": web_extract} for q, web_extract, url in results if web_extract is not None
]
return response
async def read_webpage_and_extract_content(subquery: str, url: str) -> Tuple[str, Union[None, str], str]:
try: try:
with timer(f"Reading web page at '{url}' took", logger): with timer(f"Reading web page at '{url}' took", logger):
content = await read_webpage_with_olostep(url) if OLOSTEP_API_KEY else await read_webpage(url) content = await read_webpage_with_olostep(url) if OLOSTEP_API_KEY else await read_webpage_at_url(url)
with timer(f"Extracting relevant information from web page at '{url}' took", logger): with timer(f"Extracting relevant information from web page at '{url}' took", logger):
extracted_info = await extract_relevant_info(subquery, content) extracted_info = await extract_relevant_info(subquery, content)
return subquery, extracted_info return subquery, extracted_info, url
except Exception as e: except Exception as e:
logger.error(f"Failed to read web page at '{url}' with {e}") logger.error(f"Failed to read web page at '{url}' with {e}")
return subquery, None return subquery, None, url
async def read_webpage(web_url: str) -> str: async def read_webpage_at_url(web_url: str) -> str:
headers = { headers = {
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36", "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36",
} }
@ -129,3 +147,7 @@ async def read_webpage_with_olostep(web_url: str) -> str:
response.raise_for_status() response.raise_for_status()
response_json = await response.json() response_json = await response.json()
return response_json["markdown_content"] return response_json["markdown_content"]
def online_search_enabled():
return SERPER_DEV_API_KEY is not None
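For context, the renamed read_webpage_at_url helper fetches raw HTML and converts it to markdown before the relevant-information extraction step. A hedged sketch of that flow with aiohttp and markdownify:

import asyncio
import aiohttp
from markdownify import markdownify

async def read_webpage_at_url(web_url: str) -> str:
    headers = {"User-Agent": "Mozilla/5.0"}  # some sites reject the default client UA
    async with aiohttp.ClientSession() as session:
        async with session.get(web_url, headers=headers) as response:
            response.raise_for_status()
            html = await response.text()
    return markdownify(html)  # reduce HTML tags to markdown-ish plain text

print(asyncio.run(read_webpage_at_url("https://example.com"))[:200])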

View file

@ -35,7 +35,7 @@ from khoj.search_filter.file_filter import FileFilter
from khoj.search_filter.word_filter import WordFilter from khoj.search_filter.word_filter import WordFilter
from khoj.search_type import image_search, text_search from khoj.search_type import image_search, text_search
from khoj.utils import constants, state from khoj.utils import constants, state
from khoj.utils.config import GPT4AllProcessorModel from khoj.utils.config import OfflineChatProcessorModel
from khoj.utils.helpers import ConversationCommand, timer from khoj.utils.helpers import ConversationCommand, timer
from khoj.utils.rawconfig import LocationData, SearchResponse from khoj.utils.rawconfig import LocationData, SearchResponse
from khoj.utils.state import SearchType from khoj.utils.state import SearchType
@ -341,16 +341,16 @@ async def extract_references_and_questions(
using_offline_chat = True using_offline_chat = True
default_offline_llm = await ConversationAdapters.get_default_offline_llm() default_offline_llm = await ConversationAdapters.get_default_offline_llm()
chat_model = default_offline_llm.chat_model chat_model = default_offline_llm.chat_model
if state.gpt4all_processor_config is None: if state.offline_chat_processor_config is None:
state.gpt4all_processor_config = GPT4AllProcessorModel(chat_model=chat_model) state.offline_chat_processor_config = OfflineChatProcessorModel(chat_model=chat_model)
loaded_model = state.gpt4all_processor_config.loaded_model loaded_model = state.offline_chat_processor_config.loaded_model
inferred_queries = extract_questions_offline( inferred_queries = extract_questions_offline(
defiltered_query, defiltered_query,
loaded_model=loaded_model, loaded_model=loaded_model,
conversation_log=meta_log, conversation_log=meta_log,
should_extract_questions=False, should_extract_questions=True,
location_data=location_data, location_data=location_data,
) )
elif conversation_config and conversation_config.model_type == ChatModelOptions.ModelType.OPENAI: elif conversation_config and conversation_config.model_type == ChatModelOptions.ModelType.OPENAI:

View file

@ -16,7 +16,7 @@ logger = logging.getLogger(__name__)
api_agents = APIRouter() api_agents = APIRouter()
@api_agents.get("/", response_class=Response) @api_agents.get("", response_class=Response)
async def all_agents( async def all_agents(
request: Request, request: Request,
common: CommonQueryParams, common: CommonQueryParams,
@ -30,7 +30,7 @@ async def all_agents(
"slug": agent.slug, "slug": agent.slug,
"avatar": agent.avatar, "avatar": agent.avatar,
"name": agent.name, "name": agent.name,
"tuning": agent.tuning, "personality": agent.personality,
"public": agent.public, "public": agent.public,
"creator": agent.creator.username if agent.creator else None, "creator": agent.creator.username if agent.creator else None,
"managed_by_admin": agent.managed_by_admin, "managed_by_admin": agent.managed_by_admin,

View file

@ -14,9 +14,17 @@ from websockets import ConnectionClosedOK
from khoj.database.adapters import ConversationAdapters, EntryAdapters, aget_user_name from khoj.database.adapters import ConversationAdapters, EntryAdapters, aget_user_name
from khoj.database.models import KhojUser from khoj.database.models import KhojUser
from khoj.processor.conversation.prompts import help_message, no_entries_found from khoj.processor.conversation.prompts import (
help_message,
no_entries_found,
no_notes_found,
)
from khoj.processor.conversation.utils import save_to_conversation_log from khoj.processor.conversation.utils import save_to_conversation_log
from khoj.processor.tools.online_search import search_online from khoj.processor.tools.online_search import (
online_search_enabled,
read_webpages,
search_online,
)
from khoj.routers.api import extract_references_and_questions from khoj.routers.api import extract_references_and_questions
from khoj.routers.helpers import ( from khoj.routers.helpers import (
ApiUserRateLimiter, ApiUserRateLimiter,
@ -172,10 +180,15 @@ async def create_chat_session(
response = {"conversation_id": conversation.id} response = {"conversation_id": conversation.id}
conversation_metadata = {
"agent": agent_slug,
}
update_telemetry_state( update_telemetry_state(
request=request, request=request,
telemetry_type="api", telemetry_type="api",
api="create_chat_sessions", api="create_chat_sessions",
metadata=conversation_metadata,
**common.__dict__, **common.__dict__,
) )
@ -469,6 +482,7 @@ async def chat(
) -> Response: ) -> Response:
user: KhojUser = request.user.object user: KhojUser = request.user.object
q = unquote(q) q = unquote(q)
logger.info(f"Chat request by {user.username}: {q}")
await is_ready_to_chat(user) await is_ready_to_chat(user)
conversation_commands = [get_conversation_command(query=q, any_references=True)] conversation_commands = [get_conversation_command(query=q, any_references=True)]
@ -511,7 +525,7 @@ async def chat(
compiled_references, inferred_queries, defiltered_query = await extract_references_and_questions( compiled_references, inferred_queries, defiltered_query = await extract_references_and_questions(
request, common, meta_log, q, (n or 5), (d or math.inf), conversation_commands, location request, common, meta_log, q, (n or 5), (d or math.inf), conversation_commands, location
) )
online_results: Dict = dict() online_results: Dict[str, Dict] = {}
if conversation_commands == [ConversationCommand.Notes] and not await EntryAdapters.auser_has_entries(user): if conversation_commands == [ConversationCommand.Notes] and not await EntryAdapters.auser_has_entries(user):
no_entries_found_format = no_entries_found.format() no_entries_found_format = no_entries_found.format()
@ -521,17 +535,35 @@ async def chat(
response_obj = {"response": no_entries_found_format} response_obj = {"response": no_entries_found_format}
return Response(content=json.dumps(response_obj), media_type="text/plain", status_code=200) return Response(content=json.dumps(response_obj), media_type="text/plain", status_code=200)
if conversation_commands == [ConversationCommand.Notes] and is_none_or_empty(compiled_references):
no_notes_found_format = no_notes_found.format()
if stream:
return StreamingResponse(iter([no_notes_found_format]), media_type="text/event-stream", status_code=200)
else:
response_obj = {"response": no_notes_found_format}
return Response(content=json.dumps(response_obj), media_type="text/plain", status_code=200)
if ConversationCommand.Notes in conversation_commands and is_none_or_empty(compiled_references): if ConversationCommand.Notes in conversation_commands and is_none_or_empty(compiled_references):
conversation_commands.remove(ConversationCommand.Notes) conversation_commands.remove(ConversationCommand.Notes)
if ConversationCommand.Online in conversation_commands: if ConversationCommand.Online in conversation_commands:
if not online_search_enabled():
conversation_commands.remove(ConversationCommand.Online)
# If online search is not enabled, try to read webpages directly
if ConversationCommand.Webpage not in conversation_commands:
conversation_commands.append(ConversationCommand.Webpage)
else:
try:
online_results = await search_online(defiltered_query, meta_log, location)
except ValueError as e:
logger.warning(f"Error searching online: {e}. Attempting to respond without online results")
if ConversationCommand.Webpage in conversation_commands:
try: try:
online_results = await search_online(defiltered_query, meta_log, location) online_results = await read_webpages(defiltered_query, meta_log, location)
except ValueError as e: except ValueError as e:
return StreamingResponse( logger.warning(
iter(["Please set your SERPER_DEV_API_KEY to get started with online searches 🌐"]), f"Error directly reading webpages: {e}. Attempting to respond without online results", exc_info=True
media_type="text/event-stream",
status_code=200,
) )
if ConversationCommand.Image in conversation_commands: if ConversationCommand.Image in conversation_commands:
@ -586,6 +618,7 @@ async def chat(
cmd_set = set([cmd.value for cmd in conversation_commands]) cmd_set = set([cmd.value for cmd in conversation_commands])
chat_metadata["conversation_command"] = cmd_set chat_metadata["conversation_command"] = cmd_set
chat_metadata["agent"] = conversation.agent.slug if conversation.agent else None
update_telemetry_state( update_telemetry_state(
request=request, request=request,

View file

@ -33,10 +33,11 @@ from khoj.processor.conversation.utils import (
) )
from khoj.routers.storage import upload_image from khoj.routers.storage import upload_image
from khoj.utils import state from khoj.utils import state
from khoj.utils.config import GPT4AllProcessorModel from khoj.utils.config import OfflineChatProcessorModel
from khoj.utils.helpers import ( from khoj.utils.helpers import (
ConversationCommand, ConversationCommand,
is_none_or_empty, is_none_or_empty,
is_valid_url,
log_telemetry, log_telemetry,
mode_descriptions_for_llm, mode_descriptions_for_llm,
timer, timer,
@ -68,9 +69,9 @@ async def is_ready_to_chat(user: KhojUser):
if has_offline_config and user_conversation_config and user_conversation_config.model_type == "offline": if has_offline_config and user_conversation_config and user_conversation_config.model_type == "offline":
chat_model = user_conversation_config.chat_model chat_model = user_conversation_config.chat_model
if state.gpt4all_processor_config is None: if state.offline_chat_processor_config is None:
logger.info("Loading Offline Chat Model...") logger.info("Loading Offline Chat Model...")
state.gpt4all_processor_config = GPT4AllProcessorModel(chat_model=chat_model) state.offline_chat_processor_config = OfflineChatProcessorModel(chat_model=chat_model)
return True return True
ready = has_openai_config or has_offline_config ready = has_openai_config or has_offline_config
@ -168,7 +169,8 @@ async def aget_relevant_information_sources(query: str, conversation_history: di
chat_history=chat_history, chat_history=chat_history,
) )
response = await send_message_to_model_wrapper(relevant_tools_prompt, response_type="json_object") with timer("Chat actor: Infer information sources to refer", logger):
response = await send_message_to_model_wrapper(relevant_tools_prompt, response_type="json_object")
try: try:
response = response.strip() response = response.strip()
@ -212,7 +214,8 @@ async def aget_relevant_output_modes(query: str, conversation_history: dict):
chat_history=chat_history, chat_history=chat_history,
) )
response = await send_message_to_model_wrapper(relevant_mode_prompt) with timer("Chat actor: Infer output mode for chat response", logger):
response = await send_message_to_model_wrapper(relevant_mode_prompt)
try: try:
response = response.strip() response = response.strip()
@ -230,6 +233,36 @@ async def aget_relevant_output_modes(query: str, conversation_history: dict):
return ConversationCommand.Default return ConversationCommand.Default
async def infer_webpage_urls(q: str, conversation_history: dict, location_data: LocationData) -> List[str]:
"""
Infer webpage links from the given query
"""
location = f"{location_data.city}, {location_data.region}, {location_data.country}" if location_data else "Unknown"
chat_history = construct_chat_history(conversation_history)
utc_date = datetime.utcnow().strftime("%Y-%m-%d")
online_queries_prompt = prompts.infer_webpages_to_read.format(
current_date=utc_date,
query=q,
chat_history=chat_history,
location=location,
)
with timer("Chat actor: Infer webpage urls to read", logger):
response = await send_message_to_model_wrapper(online_queries_prompt, response_type="json_object")
# Validate that the response is a non-empty, JSON-serializable list of URLs
try:
response = response.strip()
urls = json.loads(response)
valid_unique_urls = {str(url).strip() for url in urls["links"] if is_valid_url(url)}
if is_none_or_empty(valid_unique_urls):
raise ValueError(f"Invalid list of urls: {response}")
return list(valid_unique_urls)
except Exception:
raise ValueError(f"Invalid list of urls: {response}")
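The validation above expects the model to return a JSON object shaped like {"links": [...]} and keeps only unique, parseable URLs. A sketch of the same parse-and-filter step; the scheme/netloc check is an assumed implementation of is_valid_url:

import json
from urllib.parse import urlparse

def is_valid_url(url: str) -> bool:
    # Assumed check: a valid URL needs both a scheme and a network location
    try:
        result = urlparse(url.strip())
        return all([result.scheme, result.netloc])
    except ValueError:
        return False

response = '{"links": ["https://news.ycombinator.com/item?id=12345", "not a url"]}'
urls = json.loads(response)
valid_unique_urls = {str(url).strip() for url in urls["links"] if is_valid_url(url)}
print(list(valid_unique_urls))  # -> ['https://news.ycombinator.com/item?id=12345']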
async def generate_online_subqueries(q: str, conversation_history: dict, location_data: LocationData) -> List[str]: async def generate_online_subqueries(q: str, conversation_history: dict, location_data: LocationData) -> List[str]:
""" """
Generate subqueries from the given query Generate subqueries from the given query
@ -245,7 +278,8 @@ async def generate_online_subqueries(q: str, conversation_history: dict, locatio
location=location, location=location,
) )
response = await send_message_to_model_wrapper(online_queries_prompt, response_type="json_object") with timer("Chat actor: Generate online search subqueries", logger):
response = await send_message_to_model_wrapper(online_queries_prompt, response_type="json_object")
# Validate that the response is a non-empty, JSON-serializable list # Validate that the response is a non-empty, JSON-serializable list
try: try:
@ -274,9 +308,10 @@ async def extract_relevant_info(q: str, corpus: str) -> Union[str, None]:
corpus=corpus.strip(), corpus=corpus.strip(),
) )
response = await send_message_to_model_wrapper( with timer("Chat actor: Extract relevant information from data", logger):
extract_relevant_information, prompts.system_prompt_extract_relevant_information response = await send_message_to_model_wrapper(
) extract_relevant_information, prompts.system_prompt_extract_relevant_information
)
return response.strip() return response.strip()
@ -292,10 +327,13 @@ async def generate_better_image_prompt(
Generate a better image prompt from the given query Generate a better image prompt from the given query
""" """
location = f"{location_data.city}, {location_data.region}, {location_data.country}" if location_data else "Unknown"
today_date = datetime.now(tz=timezone.utc).strftime("%Y-%m-%d") today_date = datetime.now(tz=timezone.utc).strftime("%Y-%m-%d")
location_prompt = prompts.user_location.format(location=location) if location_data:
location = f"{location_data.city}, {location_data.region}, {location_data.country}"
location_prompt = prompts.user_location.format(location=location)
else:
location_prompt = "Unknown"
user_references = "\n\n".join([f"# {item}" for item in note_references]) user_references = "\n\n".join([f"# {item}" for item in note_references])
@ -305,8 +343,8 @@ async def generate_better_image_prompt(
for result in online_results: for result in online_results:
if online_results[result].get("answerBox"): if online_results[result].get("answerBox"):
simplified_online_results[result] = online_results[result]["answerBox"] simplified_online_results[result] = online_results[result]["answerBox"]
elif online_results[result].get("extracted_content"): elif online_results[result].get("webpages"):
simplified_online_results[result] = online_results[result]["extracted_content"] simplified_online_results[result] = online_results[result]["webpages"]
image_prompt = prompts.image_generation_improve_prompt.format( image_prompt = prompts.image_generation_improve_prompt.format(
query=q, query=q,
@ -317,7 +355,8 @@ async def generate_better_image_prompt(
online_results=simplified_online_results, online_results=simplified_online_results,
) )
response = await send_message_to_model_wrapper(image_prompt) with timer("Chat actor: Generate contextual image prompt", logger):
response = await send_message_to_model_wrapper(image_prompt)
return response.strip() return response.strip()
@ -332,27 +371,31 @@ async def send_message_to_model_wrapper(
if conversation_config is None: if conversation_config is None:
raise HTTPException(status_code=500, detail="Contact the server administrator to set a default chat model.") raise HTTPException(status_code=500, detail="Contact the server administrator to set a default chat model.")
truncated_messages = generate_chatml_messages_with_context( chat_model = conversation_config.chat_model
user_message=message, system_message=system_message, model_name=conversation_config.chat_model
)
if conversation_config.model_type == "offline": if conversation_config.model_type == "offline":
if state.gpt4all_processor_config is None or state.gpt4all_processor_config.loaded_model is None: if state.offline_chat_processor_config is None or state.offline_chat_processor_config.loaded_model is None:
state.gpt4all_processor_config = GPT4AllProcessorModel(conversation_config.chat_model) state.offline_chat_processor_config = OfflineChatProcessorModel(chat_model)
loaded_model = state.offline_chat_processor_config.loaded_model
truncated_messages = generate_chatml_messages_with_context(
user_message=message, system_message=system_message, model_name=chat_model, loaded_model=loaded_model
)
loaded_model = state.gpt4all_processor_config.loaded_model
return send_message_to_model_offline( return send_message_to_model_offline(
message=truncated_messages[-1].content, messages=truncated_messages,
loaded_model=loaded_model, loaded_model=loaded_model,
model=conversation_config.chat_model, model=chat_model,
streaming=False, streaming=False,
system_message=truncated_messages[0].content,
) )
elif conversation_config.model_type == "openai": elif conversation_config.model_type == "openai":
openai_chat_config = await ConversationAdapters.aget_openai_conversation_config() openai_chat_config = await ConversationAdapters.aget_openai_conversation_config()
api_key = openai_chat_config.api_key api_key = openai_chat_config.api_key
chat_model = conversation_config.chat_model truncated_messages = generate_chatml_messages_with_context(
user_message=message, system_message=system_message, model_name=chat_model
)
openai_response = send_message_to_model( openai_response = send_message_to_model(
messages=truncated_messages, api_key=api_key, model=chat_model, response_type=response_type messages=truncated_messages, api_key=api_key, model=chat_model, response_type=response_type
) )
@ -367,7 +410,7 @@ def generate_chat_response(
meta_log: dict, meta_log: dict,
conversation: Conversation, conversation: Conversation,
compiled_references: List[str] = [], compiled_references: List[str] = [],
online_results: Dict[str, Any] = {}, online_results: Dict[str, Dict] = {},
inferred_queries: List[str] = [], inferred_queries: List[str] = [],
conversation_commands: List[ConversationCommand] = [ConversationCommand.Default], conversation_commands: List[ConversationCommand] = [ConversationCommand.Default],
user: KhojUser = None, user: KhojUser = None,
@ -398,10 +441,10 @@ def generate_chat_response(
conversation_config = ConversationAdapters.get_valid_conversation_config(user, conversation) conversation_config = ConversationAdapters.get_valid_conversation_config(user, conversation)
if conversation_config.model_type == "offline": if conversation_config.model_type == "offline":
if state.gpt4all_processor_config is None or state.gpt4all_processor_config.loaded_model is None: if state.offline_chat_processor_config is None or state.offline_chat_processor_config.loaded_model is None:
state.gpt4all_processor_config = GPT4AllProcessorModel(conversation_config.chat_model) state.offline_chat_processor_config = OfflineChatProcessorModel(conversation_config.chat_model)
loaded_model = state.gpt4all_processor_config.loaded_model loaded_model = state.offline_chat_processor_config.loaded_model
chat_response = converse_offline( chat_response = converse_offline(
references=compiled_references, references=compiled_references,
online_results=online_results, online_results=online_results,

View file

@ -142,7 +142,7 @@ def agents_page(request: Request):
"slug": agent.slug, "slug": agent.slug,
"avatar": agent.avatar, "avatar": agent.avatar,
"name": agent.name, "name": agent.name,
"tuning": agent.tuning, "personality": agent.personality,
"public": agent.public, "public": agent.public,
"creator": agent.creator.username if agent.creator else None, "creator": agent.creator.username if agent.creator else None,
"managed_by_admin": agent.managed_by_admin, "managed_by_admin": agent.managed_by_admin,
@ -186,7 +186,7 @@ def agent_page(request: Request, agent_slug: str):
"slug": agent.slug, "slug": agent.slug,
"avatar": agent.avatar, "avatar": agent.avatar,
"name": agent.name, "name": agent.name,
"tuning": agent.tuning, "personality": agent.personality,
"public": agent.public, "public": agent.public,
"creator": agent.creator.username if agent.creator else None, "creator": agent.creator.username if agent.creator else None,
"managed_by_admin": agent.managed_by_admin, "managed_by_admin": agent.managed_by_admin,

View file

@ -70,15 +70,12 @@ class SearchModels:
@dataclass @dataclass
class GPT4AllProcessorConfig: class OfflineChatProcessorConfig:
loaded_model: Union[Any, None] = None loaded_model: Union[Any, None] = None
class GPT4AllProcessorModel: class OfflineChatProcessorModel:
def __init__( def __init__(self, chat_model: str = "NousResearch/Hermes-2-Pro-Mistral-7B-GGUF"):
self,
chat_model: str = "mistral-7b-instruct-v0.1.Q4_0.gguf",
):
self.chat_model = chat_model self.chat_model = chat_model
self.loaded_model = None self.loaded_model = None
try: try:

View file

@ -6,7 +6,7 @@ empty_escape_sequences = "\n|\r|\t| "
app_env_filepath = "~/.khoj/env" app_env_filepath = "~/.khoj/env"
telemetry_server = "https://khoj.beta.haletic.com/v1/telemetry" telemetry_server = "https://khoj.beta.haletic.com/v1/telemetry"
content_directory = "~/.khoj/content/" content_directory = "~/.khoj/content/"
default_offline_chat_model = "mistral-7b-instruct-v0.1.Q4_0.gguf" default_offline_chat_model = "NousResearch/Hermes-2-Pro-Mistral-7B-GGUF"
default_online_chat_model = "gpt-4-turbo-preview" default_online_chat_model = "gpt-4-turbo-preview"
empty_config = { empty_config = {

View file

@@ -15,6 +15,7 @@ from os import path
 from pathlib import Path
 from time import perf_counter
 from typing import TYPE_CHECKING, Optional, Union
+from urllib.parse import urlparse

 import torch
 from asgiref.sync import sync_to_async
@@ -270,6 +271,7 @@ class ConversationCommand(str, Enum):
     Notes = "notes"
     Help = "help"
     Online = "online"
+    Webpage = "webpage"
     Image = "image"
@@ -278,15 +280,17 @@ command_descriptions = {
     ConversationCommand.Notes: "Only talk about information that is available in your knowledge base.",
     ConversationCommand.Default: "The default command when no command specified. It intelligently auto-switches between general and notes mode.",
     ConversationCommand.Online: "Search for information on the internet.",
+    ConversationCommand.Webpage: "Get information from webpage links provided by you.",
     ConversationCommand.Image: "Generate images by describing your imagination in words.",
     ConversationCommand.Help: "Display a help message with all available commands and other metadata.",
 }

 tool_descriptions_for_llm = {
     ConversationCommand.Default: "To use a mix of your internal knowledge and the user's personal knowledge, or if you don't entirely understand the query.",
-    ConversationCommand.General: "Use this when you can answer the question without any outside information or personal knowledge",
+    ConversationCommand.General: "To use when you can answer the question without any outside information or personal knowledge",
     ConversationCommand.Notes: "To search the user's personal knowledge base. Especially helpful if the question expects context from the user's notes or documents.",
     ConversationCommand.Online: "To search for the latest, up-to-date information from the internet. Note: **Questions about Khoj should always use this data source**",
+    ConversationCommand.Webpage: "To use if the user has directly provided the webpage urls or you are certain of the webpage urls to read.",
 }

 mode_descriptions_for_llm = {
@@ -340,3 +344,12 @@ def in_debug_mode():
     """Check if Khoj is running in debug mode.
     Set KHOJ_DEBUG environment variable to true to enable debug mode."""
     return is_env_var_true("KHOJ_DEBUG")
+
+
+def is_valid_url(url: str) -> bool:
+    """Check if a string is a valid URL"""
+    try:
+        result = urlparse(url.strip())
+        return all([result.scheme, result.netloc])
+    except:
+        return False
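The new `is_valid_url` helper underpins the webpage command: only strings that parse with both a scheme and a network location count as readable links. A standalone copy with a couple of illustrative checks (the bare `except` is narrowed to `Exception` here):

```python
from urllib.parse import urlparse

def is_valid_url(url: str) -> bool:
    """Check if a string is a valid URL"""
    try:
        result = urlparse(url.strip())
        return all([result.scheme, result.netloc])
    except Exception:
        return False

# Strings without a scheme or host are rejected, so bare domains fail the check.
assert is_valid_url("https://en.wikipedia.org/wiki/History_of_the_Internet")
assert not is_valid_url("en.wikipedia.org")  # no scheme
assert not is_valid_url("not a url")
```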

View file

@@ -32,17 +32,13 @@ def initialization():
     )

     try:
-        # Note: gpt4all package is not available on all devices.
-        # So ensure gpt4all package is installed before continuing this step.
-        import gpt4all
-
         use_offline_model = input("Use offline chat model? (y/n): ")
         if use_offline_model == "y":
             logger.info("🗣️ Setting up offline chat model")
             OfflineChatProcessorConversationConfig.objects.create(enabled=True)

             offline_chat_model = input(
-                f"Enter the offline chat model you want to use, See GPT4All for supported models (default: {default_offline_chat_model}): "
+                f"Enter the offline chat model you want to use. See HuggingFace for available GGUF models (default: {default_offline_chat_model}): "
             )
             if offline_chat_model == "":
                 ChatModelOptions.objects.create(
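For scripted deployments, the same configuration the interactive prompts above collect can be created directly through the ORM. A hedged sketch, assuming a configured Django environment and using the field names visible in this commit's test factory (the import path is an assumption):

```python
# Assumes Django is initialized, e.g. inside a Khoj management shell.
from khoj.database.models import (
    ChatModelOptions,
    OfflineChatProcessorConversationConfig,
)

OfflineChatProcessorConversationConfig.objects.create(enabled=True)
ChatModelOptions.objects.create(
    chat_model="NousResearch/Hermes-2-Pro-Mistral-7B-GGUF",  # default set by this commit
    model_type="offline",
)
```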

View file

@@ -91,7 +91,7 @@ class OpenAIProcessorConfig(ConfigBase):

 class OfflineChatProcessorConfig(ConfigBase):
     enable_offline_chat: Optional[bool] = False
-    chat_model: Optional[str] = "mistral-7b-instruct-v0.1.Q4_0.gguf"
+    chat_model: Optional[str] = "NousResearch/Hermes-2-Pro-Mistral-7B-GGUF"


 class ConversationProcessorConfig(ConfigBase):

View file

@@ -9,7 +9,7 @@ from whisper import Whisper
 from khoj.processor.embeddings import CrossEncoderModel, EmbeddingsModel
 from khoj.utils import config as utils_config
-from khoj.utils.config import ContentIndex, GPT4AllProcessorModel, SearchModels
+from khoj.utils.config import ContentIndex, OfflineChatProcessorModel, SearchModels
 from khoj.utils.helpers import LRU, get_device
 from khoj.utils.rawconfig import FullConfig
@@ -20,7 +20,7 @@ embeddings_model: Dict[str, EmbeddingsModel] = None
 cross_encoder_model: Dict[str, CrossEncoderModel] = None
 content_index = ContentIndex()
 openai_client: OpenAI = None
-gpt4all_processor_config: GPT4AllProcessorModel = None
+offline_chat_processor_config: OfflineChatProcessorModel = None
 whisper_model: Whisper = None
 config_file: Path = None
 verbose: int = 0

View file

@@ -189,7 +189,7 @@ def offline_agent():
     return Agent.objects.create(
         name="Accountant",
         chat_model=chat_model,
-        tuning="You are a certified CPA. You are able to tell me how much I've spent based on my notes. Regardless of what I ask, you should always respond with the total amount I've spent. ALWAYS RESPOND WITH A SUMMARY TOTAL OF HOW MUCH MONEY I HAVE SPENT.",
+        personality="You are a certified CPA. You are able to tell me how much I've spent based on my notes. Regardless of what I ask, you should always respond with the total amount I've spent. ALWAYS RESPOND WITH A SUMMARY TOTAL OF HOW MUCH MONEY I HAVE SPENT.",
     )
@@ -200,7 +200,7 @@ def openai_agent():
     return Agent.objects.create(
         name="Accountant",
         chat_model=chat_model,
-        tuning="You are a certified CPA. You are able to tell me how much I've spent based on my notes. Regardless of what I ask, you should always respond with the total amount I've spent.",
+        personality="You are a certified CPA. You are able to tell me how much I've spent based on my notes. Regardless of what I ask, you should always respond with the total amount I've spent.",
     )

View file

@@ -40,9 +40,9 @@ class ChatModelOptionsFactory(factory.django.DjangoModelFactory):
     class Meta:
         model = ChatModelOptions

-    max_prompt_size = 2000
+    max_prompt_size = 3500
     tokenizer = None
-    chat_model = "mistral-7b-instruct-v0.1.Q4_0.gguf"
+    chat_model = "NousResearch/Hermes-2-Pro-Mistral-7B-GGUF"
     model_type = "offline"

View file

@@ -96,3 +96,23 @@ class TestTruncateMessage:
         assert final_tokens <= self.max_prompt_size
         assert len(chat_messages) == 1
         assert truncated_chat_history[0] != copy_big_chat_message
+
+    def test_truncate_single_large_question(self):
+        # Arrange
+        big_chat_message_content = " ".join(["hi"] * (self.max_prompt_size + 1))
+        big_chat_message = ChatMessageFactory.build(content=big_chat_message_content)
+        big_chat_message.role = "user"
+        copy_big_chat_message = big_chat_message.copy()
+        chat_messages = [big_chat_message]
+        initial_tokens = sum([len(self.encoder.encode(message.content)) for message in chat_messages])
+
+        # Act
+        truncated_chat_history = utils.truncate_messages(chat_messages, self.max_prompt_size, self.model_name)
+        final_tokens = sum([len(self.encoder.encode(message.content)) for message in truncated_chat_history])
+
+        # Assert
+        # The original object has been modified. Verify certain properties
+        assert initial_tokens > self.max_prompt_size
+        assert final_tokens <= self.max_prompt_size
+        assert len(chat_messages) == 1
+        assert truncated_chat_history[0] != copy_big_chat_message
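The added test pins down an edge case: when a single user message alone exceeds the prompt budget, `truncate_messages` must shrink that message in place rather than drop it. A minimal word-based sketch of that invariant (the real implementation counts model tokens, and whether the head or tail is kept is an implementation detail):

```python
def truncate_single_message(content: str, max_words: int) -> str:
    """Shrink an oversized message to fit the budget; this sketch keeps the tail."""
    words = content.split()
    if len(words) <= max_words:
        return content
    return " ".join(words[-max_words:])

big_message = " ".join(["hi"] * 1001)
truncated = truncate_single_message(big_message, max_words=1000)
# Mirrors the test's assertions: the output fits the budget and was modified.
assert len(truncated.split()) <= 1000
assert truncated != big_message
```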

View file

@@ -7,7 +7,10 @@ import pytest
 from scipy.stats import linregress

 from khoj.processor.embeddings import EmbeddingsModel
-from khoj.processor.tools.online_search import read_webpage, read_webpage_with_olostep
+from khoj.processor.tools.online_search import (
+    read_webpage_at_url,
+    read_webpage_with_olostep,
+)
 from khoj.utils import helpers
@@ -90,7 +93,7 @@ async def test_reading_webpage():
     website = "https://en.wikipedia.org/wiki/Great_Chicago_Fire"

     # Act
-    response = await read_webpage(website)
+    response = await read_webpage_at_url(website)

     # Assert
     assert (
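The rename makes the direct scraper explicit about its input. A usage sketch matching the test's call shape (the return value is assumed to be the page's extracted text, possibly empty on failure):

```python
import asyncio

from khoj.processor.tools.online_search import read_webpage_at_url

async def main():
    # Fetch and extract readable text from a page, as the test above does.
    content = await read_webpage_at_url("https://en.wikipedia.org/wiki/Great_Chicago_Fire")
    print(content[:200] if content else "No content extracted")

asyncio.run(main())
```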

View file

@@ -5,18 +5,12 @@ import pytest
 SKIP_TESTS = True
 pytestmark = pytest.mark.skipif(
     SKIP_TESTS,
-    reason="The GPT4All library has some quirks that make it hard to test in CI. This causes some tests to fail. Hence, disable it in CI.",
+    reason="Disable in CI to avoid long test runs.",
 )

 import freezegun
 from freezegun import freeze_time

-try:
-    from gpt4all import GPT4All
-except ModuleNotFoundError as e:
-    print("There was an error importing GPT4All. Please run pip install gpt4all in order to install it.")
-
 from khoj.processor.conversation.offline.chat_model import (
     converse_offline,
     extract_questions_offline,
@@ -25,14 +19,12 @@ from khoj.processor.conversation.offline.chat_model import (
 from khoj.processor.conversation.offline.utils import download_model
 from khoj.processor.conversation.utils import message_to_log
 from khoj.routers.helpers import aget_relevant_output_modes
+from khoj.utils.constants import default_offline_chat_model

-MODEL_NAME = "mistral-7b-instruct-v0.1.Q4_0.gguf"

 @pytest.fixture(scope="session")
 def loaded_model():
-    download_model(MODEL_NAME)
-    return GPT4All(MODEL_NAME)
+    return download_model(default_offline_chat_model)


 freezegun.configure(extend_ignore_list=["transformers"])
@@ -40,7 +32,6 @@ freezegun.configure(extend_ignore_list=["transformers"])
 # Test
 # ----------------------------------------------------------------------------------------------------
-@pytest.mark.xfail(reason="Search actor isn't very date aware nor capable of formatting")
 @pytest.mark.chatquality
 @freeze_time("1984-04-02", ignore=["transformers"])
 def test_extract_question_with_date_filter_from_relative_day(loaded_model):
@@ -149,20 +140,22 @@ def test_generate_search_query_using_question_from_chat_history(loaded_model):
     message_list = [
         ("What is the name of Mr. Anderson's daughter?", "Miss Barbara", []),
     ]
+    query = "Does he have any sons?"

     # Act
     response = extract_questions_offline(
-        "Does he have any sons?",
+        query,
         conversation_log=populate_chat_history(message_list),
         loaded_model=loaded_model,
         use_history=True,
     )

-    all_expected_in_response = [
-        "Anderson",
+    any_expected_with_barbara = [
+        "sibling",
+        "brother",
     ]

-    any_expected_in_response = [
+    any_expected_with_anderson = [
         "son",
         "sons",
         "children",
@@ -170,12 +163,21 @@ def test_generate_search_query_using_question_from_chat_history(loaded_model):
     # Assert
     assert len(response) >= 1
-    assert all([expected_response in response[0] for expected_response in all_expected_in_response]), (
-        "Expected chat actor to ask for clarification in response, but got: " + response[0]
-    )
-    assert any([expected_response in response[0] for expected_response in any_expected_in_response]), (
-        "Expected chat actor to ask for clarification in response, but got: " + response[0]
-    )
+    assert response[-1] == query, "Expected last question to be the user query, but got: " + response[-1]
+    # Ensure the remaining generated search queries use proper nouns and chat history context
+    for question in response[:-1]:
+        if "Barbara" in question:
+            assert any([expected_relation in question for expected_relation in any_expected_with_barbara]), (
+                "Expected search queries using proper nouns and chat history for context, but got: " + question
+            )
+        elif "Anderson" in question:
+            assert any([expected_response in question for expected_response in any_expected_with_anderson]), (
+                "Expected search queries using proper nouns and chat history for context, but got: " + question
+            )
+        else:
+            assert False, (
+                "Expected search queries using proper nouns and chat history for context, but got: " + question
+            )

 # ----------------------------------------------------------------------------------------------------
@@ -312,6 +314,7 @@ def test_answer_from_chat_history_and_currently_retrieved_content(loaded_model):

 # ----------------------------------------------------------------------------------------------------
+@pytest.mark.xfail(reason="Chat actor lies when it doesn't know the answer")
 @pytest.mark.chatquality
 def test_refuse_answering_unanswerable_question(loaded_model):
     "Chat actor should not try make up answers to unanswerable questions."
@@ -436,7 +439,6 @@ def test_answer_general_question_not_in_chat_history_or_retrieved_content(loaded

 # ----------------------------------------------------------------------------------------------------
-@pytest.mark.xfail(reason="Chat actor doesn't ask clarifying questions when context is insufficient")
 @pytest.mark.chatquality
 def test_ask_for_clarification_if_not_enough_context_in_question(loaded_model):
     "Chat actor should ask for clarification if question cannot be answered unambiguously with the provided context"

View file

@@ -15,7 +15,7 @@ from tests.helpers import ConversationFactory
 SKIP_TESTS = True
 pytestmark = pytest.mark.skipif(
     SKIP_TESTS,
-    reason="The GPT4All library has some quirks that make it hard to test in CI. This causes some tests to fail. Hence, disable it in CI.",
+    reason="Disable in CI to avoid long test runs.",
 )

 fake = Faker()
@@ -48,7 +48,7 @@ def create_conversation(message_list, user, agent=None):
 @pytest.mark.xfail(AssertionError, reason="Chat director not capable of answering this question yet")
 @pytest.mark.chatquality
 @pytest.mark.django_db(transaction=True)
-def test_chat_with_no_chat_history_or_retrieved_content_gpt4all(client_offline_chat):
+def test_offline_chat_with_no_chat_history_or_retrieved_content(client_offline_chat):
     # Act
     response = client_offline_chat.get(f'/api/chat?q="Hello, my name is Testatron. Who are you?"&stream=true')
     response_message = response.content.decode("utf-8")
@@ -339,7 +339,7 @@ def test_answer_requires_date_aware_aggregation_across_provided_notes(client_off
     # Assert
     assert response.status_code == 200
-    assert "23" in response_message
+    assert "26" in response_message


 # ----------------------------------------------------------------------------------------------------
@@ -579,7 +579,7 @@ async def test_get_correct_tools_general(client_offline_chat):
 # ----------------------------------------------------------------------------------------------------
 @pytest.mark.anyio
 @pytest.mark.django_db(transaction=True)
-async def test_get_correct_tools_with_chat_history(client_offline_chat):
+async def test_get_correct_tools_with_chat_history(client_offline_chat, default_user2):
     # Arrange
     user_query = "What's the latest in the Israel/Palestine conflict?"
     chat_log = [
@@ -590,7 +590,7 @@ async def test_get_correct_tools_with_chat_history(client_offline_chat):
         ),
         ("What's up in New York City?", "A Pride parade has recently been held in New York City, on July 31st.", []),
     ]
-    chat_history = create_conversation(chat_log)
+    chat_history = create_conversation(chat_log, default_user2)

     # Act
     tools = await aget_relevant_information_sources(user_query, chat_history)

View file

@@ -11,6 +11,7 @@ from khoj.routers.helpers import (
     aget_relevant_information_sources,
     aget_relevant_output_modes,
     generate_online_subqueries,
+    infer_webpage_urls,
 )
 from khoj.utils.helpers import ConversationCommand
@@ -546,6 +547,34 @@ async def test_select_data_sources_actor_chooses_to_search_online(chat_client):
     assert ConversationCommand.Online in conversation_commands


+# ----------------------------------------------------------------------------------------------------
+@pytest.mark.anyio
+@pytest.mark.django_db(transaction=True)
+async def test_select_data_sources_actor_chooses_to_read_webpage(chat_client):
+    # Arrange
+    user_query = "Summarize the wikipedia page on the history of the internet"
+
+    # Act
+    conversation_commands = await aget_relevant_information_sources(user_query, {})
+
+    # Assert
+    assert ConversationCommand.Webpage in conversation_commands
+
+
+# ----------------------------------------------------------------------------------------------------
+@pytest.mark.anyio
+@pytest.mark.django_db(transaction=True)
+async def test_infer_webpage_urls_actor_extracts_correct_links(chat_client):
+    # Arrange
+    user_query = "Summarize the wikipedia page on the history of the internet"
+
+    # Act
+    urls = await infer_webpage_urls(user_query, {}, None)
+
+    # Assert
+    assert "https://en.wikipedia.org/wiki/History_of_the_Internet" in urls
+
+
 # Helpers
 # ----------------------------------------------------------------------------------------------------
 def populate_chat_history(message_list):
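Together these tests outline the new webpage flow: the source router can now select `Webpage`, and `infer_webpage_urls` extracts concrete links from the query. A hedged end-to-end sketch using the signatures shown in the tests (a running Khoj server environment is assumed; the third argument to `infer_webpage_urls` is the user, `None` as in the test):

```python
import asyncio

from khoj.routers.helpers import aget_relevant_information_sources, infer_webpage_urls
from khoj.utils.helpers import ConversationCommand

async def main():
    query = "Summarize the wikipedia page on the history of the internet"
    # Pick data sources first; only infer links when the router selects Webpage.
    commands = await aget_relevant_information_sources(query, {})
    if ConversationCommand.Webpage in commands:
        urls = await infer_webpage_urls(query, {}, None)
        print(urls)

asyncio.run(main())
```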

View file

@@ -39,5 +39,6 @@
     "1.6.0": "0.15.0",
     "1.6.1": "0.15.0",
     "1.6.2": "0.15.0",
-    "1.7.0": "0.15.0"
+    "1.7.0": "0.15.0",
+    "1.8.0": "0.15.0"
 }