Mirror of https://github.com/khoj-ai/khoj.git (synced 2024-11-23 23:48:56 +01:00)
Resolve merge conflicts in dependency imports
Commit ef5e9d66c1
43 changed files with 450 additions and 428 deletions
|
@ -9,7 +9,7 @@
|
|||
</div>
|
||||
|
||||
<div align="center">
|
||||
<b>An AI personal assistant for your digital brain</b>
|
||||
<b>An AI copilot for your Second Brain</b>
|
||||
|
||||
</div>
|
||||
|
||||
|
@ -24,30 +24,29 @@
|
|||
</div>
|
||||
|
||||
## Introduction
|
||||
Welcome to the Khoj Docs! This is the best place to [get started](./setup.md) with Khoj.
|
||||
Welcome to the Khoj Docs! This is the best place to get set up and explore Khoj's features.
|
||||
|
||||
- Khoj is a desktop application to [search](./search.md) and [chat](./chat.md) with your notes, documents and images
|
||||
- It is an offline-first, open source AI personal assistant accessible from your [Emacs](./emacs.md), [Obsidian](./obsidian.md) or [Web browser](./web.md)
|
||||
- It works with jpeg, markdown, [notion](./notion_integration.md), org-mode, pdf files and [github repositories](./github_integration.md)
|
||||
- If you have more questions, check out the [FAQ](https://faq.khoj.dev/) - it's a live Khoj instance indexing our Github repository!
|
||||
- Khoj is an open source, personal AI
|
||||
- You can [chat](chat.md) with it about anything. When relevant, it'll use any notes or documents you shared with it to respond
|
||||
- Quickly [find](search.md) relevant notes and documents using natural language
|
||||
- It understands pdf, plaintext, markdown, org-mode files, [notion pages](notion_integration.md) and [github repositories](github_integration.md)
|
||||
- Access it from your [Emacs](emacs.md), [Obsidian](obsidian.md), [Web browser](web.md) or the [Khoj Desktop app](desktop.md)
|
||||
- You can self-host Khoj on your consumer hardware or share it with your family, friends or team from your private cloud
|
||||
|
||||
## Quickstart
|
||||
[Click here](./setup.md) for full setup instructions
|
||||
|
||||
```shell
|
||||
pip install khoj-assistant && khoj
|
||||
```
|
||||
- [Try Khoj Cloud](https://app.khoj.dev) to get started quickly
|
||||
- [Read these instructions](./setup.md) to self-host a private instance of Khoj
|
||||
|
||||
## Overview
|
||||
<img src="https://docs.khoj.dev/assets/khoj_search_on_web.png" width="400px">
|
||||
<span> </span>
|
||||
<img src="https://docs.khoj.dev/assets/khoj_chat_on_web.png" width="400px">
|
||||
|
||||
#### [Search](./search.md)
|
||||
- **Local**: Your personal data stays local. All search and indexing is done on your machine.
|
||||
#### [Search](search.md)
|
||||
- **Natural**: Use natural language queries to quickly find relevant notes and documents.
|
||||
- **Incremental**: Incremental search for a fast, search-as-you-type experience
|
||||
|
||||
#### [Chat](./chat.md)
|
||||
#### [Chat](chat.md)
|
||||
- **Faster answers**: Find answers faster, smoother than search. No need to manually scan through your notes to find answers.
|
||||
- **Iterative discovery**: Iteratively explore and (re-)discover your notes
|
||||
- **Assisted creativity**: Smoothly weave across answer retrieval and content generation
|
||||
|
|
|
@ -1,12 +1,13 @@
|
|||
- Get Started
|
||||
- [Overview](README.md)
|
||||
- [Install](setup.md)
|
||||
- [Self-Host](setup.md)
|
||||
- [Demos](demos.md)
|
||||
- Use
|
||||
- [Features](features.md)
|
||||
- [Chat](chat.md)
|
||||
- [Search](search.md)
|
||||
- Interfaces
|
||||
- Clients
|
||||
- [Desktop](desktop.md)
|
||||
- [Obsidian](obsidian.md)
|
||||
- [Emacs](emacs.md)
|
||||
- [Web](web.md)
|
||||
|
|
|
@ -1,63 +1,11 @@
|
|||
|
||||
## Advanced Usage
|
||||
### Search across Different Languages
|
||||
|
||||
### Search across Different Languages (Self-Hosting)
|
||||
To search for notes in multiple, different languages, you can use a [multi-lingual model](https://www.sbert.net/docs/pretrained_models.html#multi-lingual-models).<br />
|
||||
For example, the [paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) supports [50+ languages](https://www.sbert.net/docs/pretrained_models.html#:~:text=we%20used%20the%20following%2050%2B%20languages), has good search quality and speed. To use it:
|
||||
1. Manually update `search-type > asymmetric > encoder` to `paraphrase-multilingual-MiniLM-L12-v2` in your `~/.khoj/khoj.yml` file for now. See diff of `khoj.yml` below for illustration:
|
||||
|
||||
```diff
|
||||
asymmetric:
|
||||
- encoder: sentence-transformers/multi-qa-MiniLM-L6-cos-v1
|
||||
+ encoder: paraphrase-multilingual-MiniLM-L12-v2
|
||||
cross-encoder: cross-encoder/ms-marco-MiniLM-L-6-v2
|
||||
model_directory: "~/.khoj/search/asymmetric/"
|
||||
```
|
||||
|
||||
2. Regenerate your content index. For example, by opening [\<khoj-url\>/api/update?t=force](http://localhost:42110/api/update?t=force)
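The regeneration step above amounts to requesting the update endpoint on your Khoj server. As a minimal sketch, the URL can be built for any host like this. The path `/api/update?t=force` and the default `http://localhost:42110` come from the step above; the helper name `build_update_url` is hypothetical.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_update_url(base_url: str = "http://localhost:42110") -> str:
    """Return the force-update URL for the Khoj server at base_url.

    Hypothetical helper: only the /api/update?t=force path is taken
    from the docs; everything else is illustrative.
    """
    scheme, netloc, _, _, _ = urlsplit(base_url)
    query = urlencode({"t": "force"})
    return urlunsplit((scheme, netloc, "/api/update", query, ""))


# Print the URL; open it in a browser or fetch it with any HTTP client.
print(build_update_url())
```

Pass your own server address, for example `build_update_url("http://name-of-server:42110")`, if Khoj does not run on localhost.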
|
||||
|
||||
### Access Khoj on Mobile
|
||||
1. [Setup Khoj](/#/setup) on your personal server. This can be any always-on machine, i.e an old computer, RaspberryPi(?) etc
|
||||
2. [Install](https://tailscale.com/kb/installation/) [Tailscale](https://tailscale.com/) on your personal server and phone
|
||||
3. Open the Khoj web interface of the server from your phone browser.<br /> It should be `http://tailscale-ip-of-server:42110` or `http://name-of-server:42110` if you've set up [MagicDNS](https://tailscale.com/kb/1081/magicdns/)
|
||||
4. Click the [Add to Homescreen](https://developer.mozilla.org/en-US/docs/Web/Progressive_web_apps/Add_to_home_screen) button
|
||||
5. Enjoy exploring your notes, documents and images from your phone!
|
||||
|
||||
![](./assets/khoj_pwa_android.png?)
|
||||
|
||||
### Use OpenAI Models for Search
|
||||
#### Setup
|
||||
1. Set `encoder-type`, `encoder` and `model-directory` under `asymmetric` and/or `symmetric` `search-type` in your `khoj.yml` (at `~/.khoj/khoj.yml`):
|
||||
```diff
|
||||
asymmetric:
|
||||
- encoder: "sentence-transformers/multi-qa-MiniLM-L6-cos-v1"
|
||||
+ encoder: text-embedding-ada-002
|
||||
+ encoder-type: khoj.utils.models.OpenAI
|
||||
cross-encoder: "cross-encoder/ms-marco-MiniLM-L-6-v2"
|
||||
- encoder-type: sentence_transformers.SentenceTransformer
|
||||
- model_directory: "~/.khoj/search/asymmetric/"
|
||||
+ model-directory: null
|
||||
```
|
||||
2. [Setup your OpenAI API key in Khoj](/#/chat?id=setup)
|
||||
3. Restart Khoj server to generate embeddings. It will take longer than with the offline search models.
|
||||
|
||||
#### Warnings
|
||||
This configuration *uses an online model*
|
||||
- It will **send all notes to OpenAI** to generate embeddings
|
||||
- **All queries will be sent to OpenAI** when you search with Khoj
|
||||
- You will be **charged by OpenAI** based on the total tokens processed
|
||||
- It *requires an active internet connection* to search and index
|
||||
|
||||
### Bootstrap Khoj Search for Offline Usage later
|
||||
|
||||
You can bootstrap Khoj pre-emptively to run on machines that do not have internet access. An example use-case would be to run Khoj on an air-gapped machine.
|
||||
Note: *Only search can currently run in fully offline mode, not chat.*
|
||||
|
||||
- With Internet
|
||||
1. Manually download the [asymmetric text](https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1), [symmetric text](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) and [image search](https://huggingface.co/sentence-transformers/clip-ViT-B-32) models from HuggingFace
|
||||
2. Pip install khoj (and dependencies) in an associated virtualenv. E.g `python -m venv .venv && source .venv/bin/activate && pip install khoj-assistant`
|
||||
- Without Internet
|
||||
1. Copy each of the search models into their respective folders, `asymmetric`, `symmetric` and `image` under the `~/.khoj/search/` directory on the air-gapped machine
|
||||
2. Copy the khoj virtual environment directory onto the air-gapped machine, activate the environment and start khoj as normal. E.g `source .venv/bin/activate && khoj`
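The model copy step for the air-gapped machine could be scripted roughly as follows. This is a sketch under assumptions: the folder names `asymmetric`, `symmetric` and `image` under `~/.khoj/search/` come from the steps above, while the source layout (one folder per model under a `download_root`) and the function name `copy_search_models` are hypothetical.

```python
import shutil
from pathlib import Path

# Folder names under ~/.khoj/search/ as listed in the steps above.
MODEL_FOLDERS = ("asymmetric", "symmetric", "image")


def copy_search_models(download_root: Path, khoj_home: Path) -> list[Path]:
    """Copy each downloaded model folder into <khoj_home>/search/<folder>.

    Assumes (hypothetically) that download_root contains one folder per
    model, named like the target folder.
    """
    copied = []
    for name in MODEL_FOLDERS:
        source = download_root / name
        if not source.is_dir():
            continue  # Skip models that were not downloaded.
        target = khoj_home / "search" / name
        # copytree creates missing parent directories of the target.
        shutil.copytree(source, target, dirs_exist_ok=True)
        copied.append(target)
    return copied
```

For example, `copy_search_models(Path("/media/usb/models"), Path.home() / ".khoj")` would populate the search folders from a USB drive, where the mount path is only an illustration.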
|
||||
1. Manually update the search config in server's admin settings page. Go to [the search config](http://localhost:42110/server/admin/database/searchmodelconfig/). Either create a new one, if none exists, or update the existing one. Set the bi_encoder to `sentence-transformers/multi-qa-MiniLM-L6-cos-v1` and the cross_encoder to `cross-encoder/ms-marco-MiniLM-L-6-v2`.
|
||||
2. Regenerate your content index from all the relevant clients. This step is very important, as you'll need to re-encode all your content with the new model.
|
||||
|
||||
### Query Filters
|
||||
|
||||
|
|
BIN
docs/assets/khoj_chat_on_desktop.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 298 KiB |
BIN
docs/assets/khoj_search_on_desktop.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 333 KiB |
21
docs/chat.md
|
@ -1,13 +1,13 @@
|
|||
### Khoj Chat
|
||||
#### Overview
|
||||
## Khoj Chat
|
||||
### Overview
|
||||
- Creates a personal assistant for you to inquire and engage with your notes
|
||||
- You can choose to use Online or Offline Chat depending on your requirements
|
||||
- Supports multi-turn conversations with the relevant notes for context
|
||||
- Shows reference notes used to generate a response
|
||||
|
||||
### Setup
|
||||
### Setup (Self-Hosting)
|
||||
#### Offline Chat
|
||||
Offline chat stays completely private and works without internet. But it is slower, lower quality and more compute intensive.
|
||||
Offline chat stays completely private and works without internet using open-source models.
|
||||
|
||||
> **System Requirements**:
|
||||
> - Minimum 8 GB RAM. Recommended: **16 GB VRAM**
|
||||
|
@ -15,9 +15,10 @@ Offline chat stays completely private and works without internet. But it is slow
|
|||
> - A CPU supporting [AVX or AVX2 instructions](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) is required
|
||||
> - A Mac M1+ or [Vulkan supported GPU](https://vulkan.gpuinfo.org/) should significantly speed up chat response times
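To check the AVX requirement before enabling offline chat, something like the sketch below may help on Linux. It assumes an x86 machine where `/proc/cpuinfo` exposes a `flags` line; other platforms need a different check. The function names are hypothetical and not part of Khoj.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect CPU feature flags from /proc/cpuinfo-style text."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def supports_avx(cpuinfo_text: str) -> bool:
    """Return True if the flags include avx or avx2."""
    return bool({"avx", "avx2"} & cpu_flags(cpuinfo_text))


cpuinfo = Path("/proc/cpuinfo")  # Linux only; absent on macOS and Windows
if cpuinfo.exists():
    print("AVX supported:", supports_avx(cpuinfo.read_text()))
```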
|
||||
|
||||
- Open your [Khoj settings](http://localhost:42110/config/) and click *Enable* on the Offline Chat card
|
||||
1. Open your [Khoj offline settings](http://localhost:42110/server/admin/database/offlinechatprocessorconversationconfig/) and click *Enable* on the Offline Chat configuration.
|
||||
2. Open your [Chat model options](http://localhost:42110/server/admin/database/chatmodeloptions/) and add a new option for the offline chat model you want to use. Make sure to use `Offline` as its type. We currently only support offline models that use the [Llama chat prompt](https://replicate.com/blog/how-to-prompt-llama#wrap-user-input-with-inst-inst-tags) format. We recommend using `mistral-7b-instruct-v0.1.Q4_0.gguf`.
|
||||
|
||||
![Configure offline chat](https://user-images.githubusercontent.com/6413477/257021364-8a2029f5-dc21-4de8-9af9-9ba6100d695c.mp4 ':include :type=mp4')
|
||||
!> **Note**: Offline chat is not supported for a multi-user scenario. The host machine will encounter segmentation faults if multiple users try to use offline chat at the same time.
|
||||
|
||||
#### Online Chat
|
||||
Online chat requires internet to use ChatGPT but is faster, higher quality and less compute intensive.
|
||||
|
@ -25,14 +26,12 @@ Online chat requires internet to use ChatGPT but is faster, higher quality and l
|
|||
!> **Warning**: This will enable Khoj to send your chat queries and query relevant notes to OpenAI for processing
|
||||
|
||||
1. Get your [OpenAI API Key](https://platform.openai.com/account/api-keys)
|
||||
2. Open your [Khoj Online Chat settings](http://localhost:42110/config/processor/conversation), add your OpenAI API key, and click *Save*. Then go to your [Khoj settings](http://localhost:42110/config) and click `Configure`. This will refresh Khoj with your OpenAI API key.
|
||||
|
||||
![Configure online chat](https://user-images.githubusercontent.com/6413477/256998908-ac26e55e-13a2-45fb-9348-3b90a62f7687.mp4 ':include :type=mp4')
|
||||
|
||||
2. Open your [Khoj Online Chat settings](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/). Add a new setting with your OpenAI API key, and click *Save*. Only one configuration will be used, so make sure that's the only one you have.
|
||||
3. Open your [Chat model options](http://localhost:42110/server/admin/database/chatmodeloptions/) and add a new option for the OpenAI chat model you want to use. Make sure to use `OpenAI` as its type.
|
||||
|
||||
### Use
|
||||
1. Open Khoj Chat
|
||||
- **On Web**: Open [/chat](http://localhost:42110/chat) in your web browser
|
||||
- **On Web**: Open [/chat](https://app.khoj.dev/chat) in your web browser
|
||||
- **On Obsidian**: Search for *Khoj: Chat* in the [Command Palette](https://help.obsidian.md/Plugins/Command+palette)
|
||||
- **On Emacs**: Run `M-x khoj <user-query>`
|
||||
2. Enter your queries to chat with Khoj. Use [slash commands](#commands) and [query filters](./advanced.md#query-filters) to change what Khoj uses to respond
|
||||
|
|
23
docs/desktop.md
Normal file
|
@ -0,0 +1,23 @@
|
|||
<h1><img src="./assets/khoj-logo-sideways-500.png" width="200" alt="Khoj Logo"> Desktop</h1>
|
||||
|
||||
> An AI copilot for your Second Brain
|
||||
|
||||
## Features
|
||||
- **Chat**
|
||||
- **Faster answers**: Find answers quickly, from your private notes or the public internet
|
||||
- **Assisted creativity**: Smoothly weave across retrieving answers and generating content
|
||||
- **Iterative discovery**: Iteratively explore and re-discover your notes
|
||||
- **Search**
|
||||
- **Natural**: Advanced natural language understanding using Transformer based ML Models
|
||||
- **Incremental**: Incremental search for a fast, search-as-you-type experience
|
||||
|
||||
## Setup
|
||||
|
||||
1. Install the [Khoj Desktop app](https://khoj.dev/downloads) for your OS
|
||||
2. Generate an API key on the [Khoj Web App](https://app.khoj.dev/config#clients)
|
||||
3. Set your Khoj API Key on the *Settings* page of the Khoj Desktop app
|
||||
4. [Optional] Add any files, folders you'd like Khoj to be aware of on the *Settings* page and Click *Save*
|
||||
|
||||
## Interface
|
||||
![](./assets/khoj_chat_on_desktop.png ':size=600px')
|
||||
![](./assets/khoj_search_on_desktop.png ':size=600px')
|
|
@ -28,5 +28,5 @@ For the Linux installation, you have to have `glibc` version 2.35 or higher. You
|
|||
If you decide you want to uninstall the application, you can uninstall it like any other application on your system. For example, on MacOS, you can drag the application to the trash. On Windows, you can uninstall it from the `Add or Remove Programs` menu. On Linux, you can uninstall it with `sudo apt remove khoj`.
|
||||
|
||||
In addition to that, you might want to `rm -rf` the following directories:
|
||||
- `~/.khoj`
|
||||
- `~/.cache/gpt4all`
|
||||
- `~/.khoj`
|
||||
- `~/.cache/gpt4all`
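Before running `rm -rf`, it can be worth checking which of these directories actually exist. A minimal dry-run sketch follows; it only lists directories and deletes nothing. The two paths come from the list above, and the helper name `leftover_dirs` is hypothetical.

```python
from pathlib import Path

# Directories named in the uninstall notes, relative to the home folder.
LEFTOVER_DIRS = (".khoj", ".cache/gpt4all")


def leftover_dirs(home: Path) -> list[Path]:
    """Return the Khoj-related directories that exist under home."""
    return [home / rel for rel in LEFTOVER_DIRS if (home / rel).is_dir()]


# Dry run: print what would be removed, without deleting anything.
for path in leftover_dirs(Path.home()):
    print("would remove:", path)
```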
|
||||
|
|
|
@ -25,13 +25,7 @@ pip install -e .'[dev]'
|
|||
khoj -vv
|
||||
```
|
||||
2. Configure Khoj
|
||||
- **Via the Settings UI**: Add files, directories to index the [Khoj settings](http://localhost:42110/config) UI once Khoj has started up. Once you've saved all your settings, click `Configure`.
|
||||
- **Manually**:
|
||||
- Copy the `config/khoj_sample.yml` to `~/.khoj/khoj.yml`
|
||||
- Set `input-files` or `input-filter` in each relevant `content-type` section of `~/.khoj/khoj.yml`
|
||||
- Set `input-directories` field in `image` `content-type` section
|
||||
- Delete `content-type` and `processor` sub-section(s) irrelevant for your use-case
|
||||
- Restart khoj
|
||||
- **Via the Desktop application**: Add files, directories to index using the settings page of your desktop application. Click "Save" to immediately trigger indexing.
|
||||
|
||||
Note: After configuration, wait for khoj to load the ML model, generate embeddings and expose the API to query the notes, images, documents etc. specified in the config YAML
|
||||
|
||||
|
|
131
docs/emacs.md
|
@ -1,6 +1,6 @@
|
|||
<h1><img src="./assets/khoj-logo-sideways-500.png" width="200" alt="Khoj Logo">Emacs</h1>
|
||||
<h1><img src="./assets/khoj-logo-sideways-500.png" width="200" alt="Khoj Logo"> Emacs</h1>
|
||||
|
||||
> An AI personal assistance for your digital brain
|
||||
> An AI copilot for your Second Brain in Emacs
|
||||
|
||||
<img src="https://stable.melpa.org/packages/khoj-badge.svg" width="150" alt="Melpa Stable Badge">
|
||||
<img src="https://melpa.org/packages/khoj-badge.svg" width="150" alt="Melpa Badge">
|
||||
|
@ -10,14 +10,13 @@
|
|||
|
||||
|
||||
## Features
|
||||
- **Chat**
|
||||
- **Faster answers**: Find answers quickly, from your private notes or the public internet
|
||||
- **Assisted creativity**: Smoothly weave across retrieving answers and generating content
|
||||
- **Iterative discovery**: Iteratively explore and re-discover your notes
|
||||
- **Search**
|
||||
- **Natural**: Advanced natural language understanding using Transformer based ML Models
|
||||
- **Local**: Your personal data stays local. All search, indexing is done on your machine*
|
||||
- **Incremental**: Incremental search for a fast, search-as-you-type experience
|
||||
- **Chat**
|
||||
- **Faster answers**: Find answers faster than search
|
||||
- **Iterative discovery**: Iteratively explore and (re-)discover your notes
|
||||
- **Assisted creativity**: Smoothly weave across answer retrieval and content generation
|
||||
|
||||
## Interface
|
||||
#### Search
|
||||
|
@ -27,79 +26,76 @@
|
|||
![khoj chat on emacs](./assets/khoj_chat_on_emacs.png ':size=400px')
|
||||
|
||||
## Setup
|
||||
- *Make sure [python](https://realpython.com/installing-python/) and [pip](https://pip.pypa.io/en/stable/installation/) are installed on your machine*
|
||||
1. Generate an API key on the [Khoj Web App](https://app.khoj.dev/config#clients)
|
||||
2. Add below snippet to your Emacs config file, usually at `~/.emacs.d/init.el`
|
||||
|
||||
- *khoj.el attempts to automatically install, start and configure the khoj server.*
|
||||
If this fails, follow [these instructions](/setup) to manually setup the khoj server.
|
||||
|
||||
### Direct Install
|
||||
<!-- tabs:start -->
|
||||
|
||||
#### **Direct Install**
|
||||
*Khoj will index your org-agenda files, by default*
|
||||
|
||||
```elisp
|
||||
;; Install Khoj.el
|
||||
M-x package-install khoj
|
||||
|
||||
; Set your Khoj API key
|
||||
(setq khoj-api-key "YOUR_KHOJ_CLOUD_API_KEY")
|
||||
```
|
||||
|
||||
### Minimal Install
|
||||
Add below snippet to your Emacs config file.
|
||||
Indexes your org-agenda files, by default.
|
||||
#### **Minimal Install**
|
||||
*Khoj will index your org-agenda files, by default*
|
||||
|
||||
```elisp
|
||||
;; Install Khoj Package from MELPA Stable
|
||||
(use-package khoj
|
||||
:ensure t
|
||||
:pin melpa-stable
|
||||
:bind ("C-c s" . 'khoj))
|
||||
```
|
||||
|
||||
- Note: Install `khoj.el` from MELPA (instead of MELPA Stable) if you installed the pre-release version of khoj
|
||||
- That is, use `:pin melpa` to install khoj.el in above snippet if khoj server was installed with `--pre` flag, i.e `pip install --pre khoj-assistant`
|
||||
- Else use `:pin melpa-stable` to install khoj.el in above snippet if khoj was installed with `pip install khoj-assistant`
|
||||
- This ensures both khoj.el and khoj app are from the same version (git tagged or latest)
|
||||
|
||||
### Standard Install
|
||||
Add below snippet to your Emacs config file.
|
||||
Indexes the specified org files, directories. Sets up OpenAI API key for Khoj Chat
|
||||
|
||||
```elisp
|
||||
;; Install Khoj Package from MELPA Stable
|
||||
;; Install Khoj client from MELPA Stable
|
||||
(use-package khoj
|
||||
:ensure t
|
||||
:pin melpa-stable
|
||||
:bind ("C-c s" . 'khoj)
|
||||
:config (setq khoj-org-directories '("~/docs/org-roam" "~/docs/notes")
|
||||
khoj-org-files '("~/docs/todo.org" "~/docs/work.org")
|
||||
khoj-openai-api-key "YOUR_OPENAI_API_KEY")) ; required to enable chat
|
||||
:config (setq khoj-api-key "YOUR_KHOJ_CLOUD_API_KEY"))
|
||||
```
|
||||
|
||||
### With [Straight.el](https://github.com/raxod502/straight.el)
|
||||
Add below snippet to your Emacs config file.
|
||||
Indexes the specified org files, directories. Sets up OpenAI API key for Khoj Chat
|
||||
#### **Standard Install**
|
||||
*Configures the specified org files, directories to be indexed by Khoj*
|
||||
|
||||
```elisp
|
||||
;; Install Khoj Package using Straight.el
|
||||
(use-package khoj
|
||||
;; Install Khoj client from MELPA Stable
|
||||
(use-package khoj
|
||||
:ensure t
|
||||
:pin melpa-stable
|
||||
:bind ("C-c s" . 'khoj)
|
||||
:config (setq khoj-api-key "YOUR_KHOJ_CLOUD_API_KEY"
|
||||
khoj-org-directories '("~/docs/org-roam" "~/docs/notes")
|
||||
khoj-org-files '("~/docs/todo.org" "~/docs/work.org")))
|
||||
```
|
||||
|
||||
#### **Straight.el**
|
||||
*Configures the specified org files, directories to be indexed by Khoj*
|
||||
|
||||
```elisp
|
||||
;; Install Khoj client using Straight.el
|
||||
(use-package khoj
|
||||
:after org
|
||||
:straight (khoj :type git :host github :repo "khoj-ai/khoj" :files (:defaults "src/interface/emacs/khoj.el"))
|
||||
:bind ("C-c s" . 'khoj)
|
||||
:config (setq khoj-org-directories '("~/docs/org-roam" "~/docs/notes")
|
||||
khoj-org-files '("~/docs/todo.org" "~/docs/work.org")
|
||||
khoj-openai-api-key "YOUR_OPENAI_API_KEY")) ; required to enable chat
|
||||
```
|
||||
:config (setq khoj-api-key "YOUR_KHOJ_CLOUD_API_KEY"
|
||||
khoj-org-directories '("~/docs/org-roam" "~/docs/notes")
|
||||
khoj-org-files '("~/docs/todo.org" "~/docs/work.org")))
|
||||
```
|
||||
|
||||
<!-- tabs:end -->
|
||||
## Use
|
||||
### Search
|
||||
See [Khoj Search](search.md) for details
|
||||
1. Hit `C-c s s` (or `M-x khoj RET s`) to open khoj search
|
||||
|
||||
2. Enter your query in natural language
|
||||
|
||||
e.g "What is the meaning of life?", "My life goals for 2023"
|
||||
2. Enter your query in natural language<br/>
|
||||
E.g *"What is the meaning of life?"*, *"My life goals for 2023"*
|
||||
|
||||
### Chat
|
||||
See [Khoj Chat](chat.md) for details
|
||||
1. Hit `C-c s c` (or `M-x khoj RET c`) to open khoj chat
|
||||
|
||||
2. Ask questions in a natural, conversational style
|
||||
|
||||
E.g "When did I file my taxes last year?"
|
||||
|
||||
See [Khoj Chat](/#/chat) for more details
|
||||
2. Ask questions in a natural, conversational style<br/>
|
||||
E.g *"When did I file my taxes last year?"*
|
||||
|
||||
### Find Similar Entries
|
||||
This feature finds entries similar to the one you are currently on.
|
||||
|
@ -108,7 +104,6 @@ This feature finds entries similar to the one you are currently on.
|
|||
|
||||
### Advanced Usage
|
||||
- Add [query filters](https://github.com/khoj-ai/khoj/#query-filters) during search to narrow down results further
|
||||
|
||||
e.g `What is the meaning of life? -"god" +"none" dt>"last week"`
|
||||
|
||||
- Use `C-c C-o 2` to open the current result at cursor in its source org file
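The query filter example earlier in this list (`-"god" +"none" dt>"last week"`) can also be composed programmatically. The sketch below only builds the query string. It assumes the filter syntax shown in these docs (`+"term"`, `-"term"`, `dt` comparisons), and the function name `build_query` is hypothetical and not part of Khoj or khoj.el. Whether filter order matters is not stated here, so the sketch simply mirrors the order used in the example.

```python
def build_query(text, include=(), exclude=(), date_filter=None):
    """Compose a search query with word and date filters.

    Hypothetical helper: it only concatenates the filter syntax shown in
    the docs and does not validate terms or dates.
    """
    parts = [text]
    parts += [f'-"{term}"' for term in exclude]  # entries must not contain term
    parts += [f'+"{term}"' for term in include]  # entries must contain term
    if date_filter:
        parts.append(date_filter)  # e.g. dt>"last week"
    return " ".join(parts)


print(build_query(
    "What is the meaning of life?",
    include=["none"],
    exclude=["god"],
    date_filter='dt>"last week"',
))
```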
|
||||
|
@ -121,31 +116,21 @@ This feature finds entries similar to the one you are currently on.
|
|||
![](./assets/khoj_emacs_menu.png)
|
||||
Hit `C-c s` (or `M-x khoj`) to open the khoj menu above. Then:
|
||||
- Hit `t` until your preferred content type is selected in the khoj menu
|
||||
|
||||
`Content Type` specifies the content to perform `Search`, `Update` or `Find Similar` actions on
|
||||
- Hit `n` twice and then enter number of results you want to see
|
||||
|
||||
`Results Count` is used by the `Search` and `Find Similar` actions
|
||||
- Hit `-f u` to `force` update the khoj content index
|
||||
|
||||
The `Force Update` switch is only used by the `Update` action
|
||||
|
||||
## Upgrade
|
||||
### Upgrade Khoj Backend
|
||||
```bash
|
||||
pip install --upgrade khoj-assistant
|
||||
```
|
||||
### Upgrade Khoj.el
|
||||
Use your Emacs package manager to upgrade `khoj.el`
|
||||
<!-- tabs:start -->
|
||||
|
||||
- For `khoj.el` from MELPA
|
||||
- Method 1
|
||||
- Run `M-x package-list-packages` to list all packages
|
||||
- Press `U` on `khoj` to mark it for upgrade
|
||||
- Press `x` to execute the marked actions
|
||||
- Method 2
|
||||
- Run `M-x package-refresh-contents`
|
||||
- Run `M-x package-reinstall khoj`
|
||||
#### **With MELPA**
|
||||
1. Run `M-x package-refresh-contents`
|
||||
2. Run `M-x package-reinstall khoj`
|
||||
|
||||
- For `khoj.el` from Straight
|
||||
- Run `M-x straight-pull-package khoj`
|
||||
#### **With Straight.el**
|
||||
- Run `M-x straight-pull-package khoj`
|
||||
|
||||
<!-- tabs:end -->
|
||||
|
|
|
@ -1,10 +1,10 @@
|
|||
## Features
|
||||
|
||||
#### [Search](./search.md)
|
||||
#### [Search](search.md)
|
||||
- **Local**: Your personal data stays local. All search and indexing is done on your machine.
|
||||
- **Incremental**: Incremental search for a fast, search-as-you-type experience
|
||||
|
||||
#### [Chat](./chat.md)
|
||||
#### [Chat](chat.md)
|
||||
- **Faster answers**: Find answers faster, smoother than search. No need to manually scan through your notes to find answers.
|
||||
- **Iterative discovery**: Iteratively explore and (re-)discover your notes
|
||||
- **Assisted creativity**: Smoothly weave across answer retrieval and content generation
|
||||
|
|
|
@ -1,14 +1,14 @@
|
|||
# Setup the Github integration
|
||||
# 🧑🏾💻 Setup the Github integration
|
||||
|
||||
The Github integration allows you to index as many repositories as you want. By default, it is configured to index Issues, Commits, and all Markdown/Org files in each repository. For large repositories, this takes a fairly long time, but it works well for smaller projects.
|
||||
|
||||
# Configure your settings
|
||||
|
||||
1. Go to [http://localhost:42110/config](http://localhost:42110/config) and enter in settings for the data sources you want to index. You'll have to specify the file paths.
|
||||
1. Go to [https://app.khoj.dev/config](https://app.khoj.dev/config) and enter in settings for the data sources you want to index. You'll have to specify the file paths.
|
||||
|
||||
## Use the Github plugin
|
||||
|
||||
1. Generate a [classic PAT (personal access token)](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens) from [Github](https://github.com/settings/tokens) with `repo` and `admin:org` scopes at least.
|
||||
2. Navigate to [http://localhost:42110/config/content-source/github](http://localhost:42110/config/content-source/github) to configure your Github settings. Enter in your PAT, along with details for each repository you want to index.
|
||||
2. Navigate to [https://app.khoj.dev/config/content-source/github](https://app.khoj.dev/config/content-source/github) to configure your Github settings. Enter in your PAT, along with details for each repository you want to index.
|
||||
3. Click `Save`. Go back to the settings page and click `Configure`.
|
||||
4. Go to [http://localhost:42110/](http://localhost:42110/) and start searching!
|
||||
4. Go to [https://app.khoj.dev/](https://app.khoj.dev/) and start searching!
|
||||
|
|
|
@ -17,11 +17,13 @@
|
|||
repo: 'https://github.com/khoj-ai/khoj',
|
||||
loadSidebar: true,
|
||||
themeColor: '#c2a600',
|
||||
auto2top: true,
|
||||
// coverpage: true,
|
||||
}
|
||||
</script>
|
||||
<!-- Docsify v4 -->
|
||||
<script src="//cdn.jsdelivr.net/npm/docsify@4"></script>
|
||||
<script src="//cdn.jsdelivr.net/npm/docsify-tabs@1"></script>
|
||||
<script src="//cdn.jsdelivr.net/npm/docsify/lib/plugins/search.min.js"></script>
|
||||
<script src="//cdn.jsdelivr.net/npm/docsify-copy-code/dist/docsify-copy-code.min.js"></script>
|
||||
<script src="//cdn.jsdelivr.net/npm/prismjs@1/components/prism-bash.min.js"></script>
|
||||
|
|
|
@ -8,7 +8,7 @@ We haven't setup a fancy integration with OAuth yet, so this integration still r
|
|||
![setup_new_integration](https://github.com/khoj-ai/khoj/assets/65192171/b056e057-d4dc-47dc-aad3-57b59a22c68b)
|
||||
3. Share all the workspaces that you want to integrate with the Khoj integration you just made in the previous step
|
||||
![enable_workspace](https://github.com/khoj-ai/khoj/assets/65192171/98290303-b5b8-4cb0-b32c-f68c6923a3d0)
|
||||
4. In the first step, you generated an API key. Use the newly generated API Key in your Khoj settings, by default at http://localhost:42110/config/content-source/notion. Click `Save`.
|
||||
5. Click `Configure` in http://localhost:42110/config to index your Notion workspace(s).
|
||||
4. In the first step, you generated an API key. Use the newly generated API Key in your Khoj settings, by default at https://app.khoj.dev/config/content-source/notion. Click `Save`.
|
||||
5. Click `Configure` in https://app.khoj.dev/config to index your Notion workspace(s).
|
||||
|
||||
That's it! You should be ready to start searching and chatting. Make sure you've configured your OpenAI API Key for chat.
|
||||
|
|
100
docs/obsidian.md
|
@ -1,16 +1,15 @@
|
|||
<h1><img src="./assets/khoj-logo-sideways-500.png" width="200" alt="Khoj Logo">Obsidian</h1>
|
||||
<h1><img src="./assets/khoj-logo-sideways-500.png" width="200" alt="Khoj Logo"> Obsidian</h1>
|
||||
|
||||
> An AI personal assistant for your Digital Brain in Obsidian
|
||||
> An AI copilot for your Second Brain in Obsidian
|
||||
|
||||
## Features
|
||||
- **Chat**
|
||||
- **Faster answers**: Find answers quickly, from your private notes or the public internet
|
||||
- **Assisted creativity**: Smoothly weave across retrieving answers and generating content
|
||||
- **Iterative discovery**: Iteratively explore and re-discover your notes
|
||||
- **Search**
|
||||
- **Natural**: Advanced natural language understanding using Transformer based ML Models
|
||||
- **Local**: Your personal data stays local. All search and indexing is done on your machine. *Unlike chat which requires access to GPT.*
|
||||
- **Incremental**: Incremental search for a fast, search-as-you-type experience
|
||||
- **Chat**
|
||||
- **Faster answers**: Find answers faster and with less effort than search
|
||||
- **Iterative discovery**: Iteratively explore and (re-)discover your notes
|
||||
- **Assisted creativity**: Smoothly weave across answer retrieval and content generation
|
||||
|
||||
## Interface
|
||||
![](./assets/khoj_search_on_obsidian.png ':size=400px')
|
||||
|
@ -18,102 +17,37 @@
|
|||
|
||||
|
||||
## Setup
|
||||
- *Make sure [python](https://realpython.com/installing-python/) and [pip](https://pip.pypa.io/en/stable/installation/) are installed on your machine*
|
||||
- *Ensure you follow the ordering of the setup steps. Install the plugin after starting the khoj backend. This allows the plugin to configure the khoj backend*
|
||||
|
||||
### 1. Setup Backend
|
||||
Open terminal/cmd and run below command to install and start the khoj backend
|
||||
- On Linux/MacOS
|
||||
```shell
|
||||
python -m pip install khoj-assistant && khoj
|
||||
```
|
||||
|
||||
- On Windows
|
||||
```shell
|
||||
py -m pip install khoj-assistant && khoj
|
||||
```
|
||||
|
||||
### 2. Setup Plugin
|
||||
1. Open [Khoj](https://obsidian.md/plugins?id=khoj) from the *Community plugins* tab in Obsidian settings panel
|
||||
2. Click *Install*, then *Enable* on the Khoj plugin page in Obsidian
|
||||
3. [Optional] To enable Khoj Chat, set your [OpenAI API key](https://platform.openai.com/account/api-keys) in the Khoj plugin settings
|
||||
3. Generate an API key on the [Khoj Web App](https://app.khoj.dev/config#clients)
|
||||
4. Set your Khoj API Key in the Khoj plugin settings in Obsidian
|
||||
|
||||
See [official Obsidian plugin docs](https://help.obsidian.md/Extending+Obsidian/Community+plugins) for details
|
||||
See the official [Obsidian Plugin Docs](https://help.obsidian.md/Extending+Obsidian/Community+plugins) for more details on installing Obsidian plugins.
|
||||
|
||||
## Use
|
||||
### Chat
|
||||
Run *Khoj: Chat* from the [Command Palette](https://help.obsidian.md/Plugins/Command+palette) and ask questions in a natural, conversational style.<br />
|
||||
E.g "When did I file my taxes last year?"
|
||||
|
||||
Notes:
|
||||
- *Using Khoj Chat will result in query relevant notes being shared with OpenAI for ChatGPT to respond.*
|
||||
- *To use Khoj Chat, ensure you've set your [OpenAI API key](https://platform.openai.com/account/api-keys) in the Khoj plugin settings.*
|
||||
E.g *"When did I file my taxes last year?"*
|
||||
|
||||
See [Khoj Chat](/chat) for more details
|
||||
|
||||
### Search
|
||||
Click the *Khoj search* icon 🔎 on the [Ribbon](https://help.obsidian.md/User+interface/Workspace/Ribbon) or run *Khoj: Search* from the [Command Palette](https://help.obsidian.md/Plugins/Command+palette)
|
||||
|
||||
*Note: Ensure the khoj server is running in the background before searching. Execute `khoj` in your terminal if it is not already running*
|
||||
|
||||
[search_demo](https://user-images.githubusercontent.com/6413477/218801155-cd67e8b4-a770-404a-8179-d6b61caa0f93.mp4 ':include :type=mp4')
|
||||
|
||||
#### Query Filters
|
||||
|
||||
Use structured query syntax to filter the natural language search results
|
||||
- **Word Filter**: Get entries that include/exclude a specified term
|
||||
- Entries that contain term_to_include: `+"term_to_include"`
|
||||
- Entries that contain term_to_exclude: `-"term_to_exclude"`
|
||||
- **Date Filter**: Get entries containing dates in YYYY-MM-DD format from specified date (range)
|
||||
- Entries from April 1st 1984: `dt:"1984-04-01"`
|
||||
- Entries after March 31st 1984: `dt>="1984-04-01"`
|
||||
- Entries before April 2nd 1984 : `dt<="1984-04-01"`
|
||||
- **File Filter**: Get entries from a specified file
|
||||
- Entries from incoming.org file: `file:"incoming.org"`
|
||||
- Combined Example
|
||||
- `what is the meaning of life? file:"1984.org" dt>="1984-01-01" dt<="1985-01-01" -"big" -"brother"`
|
||||
- Adds all filters to the natural language query. It should return entries
|
||||
- from the file *1984.org*
|
||||
- containing dates from the year *1984*
|
||||
- excluding words *"big"* and *"brother"*
|
||||
- that best match the natural language query *"what is the meaning of life?"*
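
The filter syntax above can be sketched as a small parser. This is only an illustration under stated assumptions: the regular expressions, the `ParsedQuery` container and `parse_query` are hypothetical names written for this example and are not Khoj's own filter implementation.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ParsedQuery:
    """Hypothetical container for a query split into filters and free text."""
    text: str = ""
    include_words: list = field(default_factory=list)
    exclude_words: list = field(default_factory=list)
    files: list = field(default_factory=list)
    dates: list = field(default_factory=list)  # (operator, "YYYY-MM-DD") pairs


# Assumed patterns for the documented syntax; not taken from the Khoj source.
FILE_FILTER = re.compile(r'file:"([^"]+)"')
DATE_FILTER = re.compile(r'dt(>=|<=|:)"(\d{4}-\d{2}-\d{2})"')
WORD_FILTER = re.compile(r'(?:^|\s)([+-])"([^"]+)"')


def parse_query(query: str) -> ParsedQuery:
    parsed = ParsedQuery()

    # Pull out file filters, then strip them from the query
    parsed.files = FILE_FILTER.findall(query)
    query = FILE_FILTER.sub("", query)

    # Pull out date filters along with their comparison operator
    parsed.dates = DATE_FILTER.findall(query)
    query = DATE_FILTER.sub("", query)

    # Sort word filters into include (+) and exclude (-) lists
    for sign, word in WORD_FILTER.findall(query):
        target = parsed.include_words if sign == "+" else parsed.exclude_words
        target.append(word)
    query = WORD_FILTER.sub(" ", query)

    # Whatever remains is the natural language part of the query
    parsed.text = " ".join(query.split())
    return parsed


combined = parse_query(
    'what is the meaning of life? file:"1984.org" '
    'dt>="1984-01-01" dt<="1985-01-01" -"big" -"brother"'
)
```

Here `combined.text` holds the natural language part, while `combined.files`, `combined.dates` and `combined.exclude_words` hold the structured filters that would narrow the entries to search.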
|
||||
|
||||
### Find Similar Notes
|
||||
To see other notes similar to the current one, run *Khoj: Find Similar Notes* from the [Command Palette](https://help.obsidian.md/Plugins/Command+palette)
|
||||
|
||||
### Search
|
||||
Click the *Khoj search* icon 🔎 on the [Ribbon](https://help.obsidian.md/User+interface/Workspace/Ribbon) or run *Khoj: Search* from the [Command Palette](https://help.obsidian.md/Plugins/Command+palette)
|
||||
|
||||
See [Khoj Search](/search) for more details. Use [query filters](/advanced#query-filters) to limit entries to search
|
||||
|
||||
[search_demo](https://user-images.githubusercontent.com/6413477/218801155-cd67e8b4-a770-404a-8179-d6b61caa0f93.mp4 ':include :type=mp4')
|
||||
|
||||
## Upgrade
|
||||
### 1. Upgrade Backend
|
||||
```shell
|
||||
pip install --upgrade khoj-assistant
|
||||
```
|
||||
### 2. Upgrade Plugin
|
||||
1. Open *Community plugins* tab in Obsidian settings
|
||||
2. Click the *Check for updates* button
|
||||
3. Click the *Update* button next to Khoj, if available
|
||||
|
||||
## Demo
|
||||
### Search Demo
|
||||
[demo](https://github-production-user-asset-6210df.s3.amazonaws.com/6413477/240061700-3e33d8ea-25bb-46c8-a3bf-c92f78d0f56b.mp4 ':include :type=mp4')
|
||||
|
||||
#### Description
|
||||
|
||||
1. Install Khoj via `pip` and start Khoj backend
|
||||
```shell
|
||||
python -m pip install khoj-assistant && khoj
|
||||
```
|
||||
2. Install Khoj plugin via Community Plugins settings pane on Obsidian app
|
||||
- Check the new Khoj plugin settings
|
||||
- Wait for Khoj backend to index markdown, PDF files in the current Vault
|
||||
- Open Khoj plugin on Obsidian via Search button on Left Pane
|
||||
- Search \"*Announce plugin to folks*\" in the [Obsidian Plugin docs](https://marcus.se.net/obsidian-plugin-docs/)
|
||||
- Jump to the [search result](https://marcus.se.net/obsidian-plugin-docs/publishing/submit-your-plugin)
|
||||
|
||||
|
||||
## Troubleshooting
|
||||
- Open the Khoj plugin settings pane, to configure Khoj
|
||||
- Toggle Enable/Disable Khoj, if setting changes have not applied
|
||||
- Click *Update* button to force index to refresh, if results are failing or stale
|
||||
|
||||
## Current Limitations
|
||||
- The plugin loads the index of only one vault at a time.<br/>
|
||||
So notes across multiple vaults **cannot** be searched at the same time
|
||||
|
|
|
@ -1,7 +1,7 @@
|
|||
## Khoj Search
|
||||
### Use
|
||||
1. Open Khoj Search
|
||||
- **On Web**: Open <http://localhost:42110/> in your web browser
|
||||
- **On Web**: Open <https://app.khoj.dev/> in your web browser
|
||||
- **On Obsidian**: Click the *Khoj search* icon 🔎 on the [Ribbon](https://help.obsidian.md/User+interface/Workspace/Ribbon) or Search for *Khoj: Search* in the [Command Palette](https://help.obsidian.md/Plugins/Command+palette)
|
||||
- **On Emacs**: Run `M-x khoj <user-query>`
|
||||
2. Query using natural language to find relevant entries from your knowledge base. Use [query filters](./advanced.md#query-filters) to limit entries to search
|
||||
|
|
186
docs/setup.md
|
@ -3,41 +3,15 @@ These are the general setup instructions for Khoj.
|
|||
|
||||
- Make sure [python](https://realpython.com/installing-python/) and [pip](https://pip.pypa.io/en/stable/installation/) are installed on your machine
|
||||
- Check the [Khoj Emacs docs](/emacs?id=setup) to setup Khoj with Emacs<br />
|
||||
Its simpler as it can skip the server *install*, *run* and *configure* step below.
|
||||
It's simpler as it can skip the server *install*, *run* and *configure* step below.
|
||||
- Check the [Khoj Obsidian docs](/obsidian?id=_2-setup-plugin) to setup Khoj with Obsidian<br />
|
||||
It's simpler as it can skip the *configure* step below.
|
||||
|
||||
### 1. Install
|
||||
For Installation, you can either use Docker or install Khoj locally.
|
||||
|
||||
#### 1.1 Local Server Setup
|
||||
Run the following command in your terminal to install the Khoj backend.
|
||||
### 1. Installation (Docker)
|
||||
|
||||
- On Linux/MacOS
|
||||
```shell
|
||||
python -m pip install khoj-assistant
|
||||
```
|
||||
|
||||
- On Windows
|
||||
```shell
|
||||
py -m pip install khoj-assistant
|
||||
```
|
||||
For more detailed Windows installation and troubleshooting, see [Windows Install](./windows_install.md).
|
||||
|
||||
|
||||
##### 1.1.1 Local Server Start
|
||||
|
||||
Run the following command from your terminal to start the Khoj backend and open Khoj in your browser.
|
||||
|
||||
```shell
|
||||
khoj
|
||||
```
|
||||
|
||||
Khoj should now be running at http://localhost:42110. You can see the web UI in your browser.
|
||||
|
||||
Note: To start Khoj automatically in the background use [Task scheduler](https://www.windowscentral.com/how-create-automated-task-using-task-scheduler-windows-10) on Windows or [Cron](https://en.wikipedia.org/wiki/Cron) on Mac, Linux (e.g with `@reboot khoj`)
|
||||
|
||||
#### 1.2 Local Docker Setup
|
||||
Use the sample docker-compose [in Github](https://github.com/khoj-ai/khoj/blob/master/docker-compose.yml) to run Khoj in Docker. To start the container, run the following command in the same directory as the docker-compose.yml file. You'll have to configure the mounted directories to match your local knowledge base.
|
||||
Use the sample docker-compose [in Github](https://github.com/khoj-ai/khoj/blob/master/docker-compose.yml) to run Khoj in Docker. Start by setting the environment variables in that file to suit your setup. Your admin account is created automatically from the admin credentials in that file, so choose them carefully. To start the container, run the following command in the same directory as the docker-compose.yml file. This will automatically set up the database and run the Khoj server.
|
||||
|
||||
```shell
|
||||
docker-compose up
|
||||
|
@ -45,27 +19,131 @@ docker-compose up
|
|||
|
||||
Khoj should now be running at http://localhost:42110. You can see the web UI in your browser.
|
||||
|
||||
#### 1.3 Download the desktop client [Optional]
|
||||
### 1. Installation (Local)
|
||||
|
||||
You can use our desktop executables to select file paths and folders to index. You can simply select the folders or files, and they'll be automatically uploaded to the server. Once you specify a file or file path, you don't need to update the configuration again; it will grab any data diffs dynamically over time. This part is currently optional, but may make setup and configuration slightly easier. It removes the need for setting up custom file paths for your Khoj data configurations.
|
||||
#### Prerequisites
|
||||
|
||||
**To download the desktop client, go to https://download.khoj.dev** and the correct executable for your OS will automatically start downloading. Once downloaded, you can configure your folders for indexing using the settings tab. To set your chat configuration, you'll have to use the web interface for the Khoj server you setup in the previous step.
|
||||
##### Install Postgres (with PgVector)
|
||||
|
||||
### 1.4 Use (deprecated) desktop builds
|
||||
Khoj uses the `pgvector` package to store embeddings of your index in a Postgres database. In order to use this, you need to have Postgres installed.
|
||||
|
||||
Before `v0.12.0`, we had self-contained desktop builds that included both the server and the client. These were difficult to maintain, but are still available as part of earlier releases. To find setup instructions, see here:
|
||||
<!-- tabs:start -->
|
||||
|
||||
- [Desktop Installation](desktop_installation.md)
|
||||
- [Windows Installation](windows_install.md)
|
||||
#### **MacOS**
|
||||
|
||||
### 2. Configure
|
||||
1. Set `File`, `Folder` and hit `Save` in each Plugins you want to enable for Search on the Khoj config page
|
||||
2. Add your OpenAI API key to Chat Feature settings if you want to use Chat
|
||||
3. Click `Configure` and wait. The app will download ML models and index the content for search and (optionally) chat
|
||||
Install [Postgres.app](https://postgresapp.com/). This comes pre-installed with `pgvector` and relevant dependencies.
|
||||
|
||||
![configure demo](https://user-images.githubusercontent.com/6413477/255307879-61247d3f-c69a-46ef-b058-9bc533cb5c72.mp4 ':include :type=mp4')
|
||||
#### **Windows**
|
||||
|
||||
### 3. Install Interface Plugins (Optional)
|
||||
Use the [recommended installer](https://www.postgresql.org/download/windows/)
|
||||
|
||||
#### **Linux**
|
||||
From [official instructions](https://wiki.postgresql.org/wiki/Apt)
|
||||
|
||||
```bash
|
||||
sudo apt install -y postgresql-common
|
||||
sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh
|
||||
sudo apt install postgresql-16 postgresql-16-pgvector
|
||||
```
|
||||
|
||||
##### **From Source**
|
||||
1. Follow instructions to [Install Postgres](https://www.postgresql.org/download/)
|
||||
2. Follow instructions to [Install PgVector](https://github.com/pgvector/pgvector#installation) in case you need to manually install it. Reproduced instructions below for convenience.
|
||||
|
||||
```bash
|
||||
cd /tmp
|
||||
git clone --branch v0.5.1 https://github.com/pgvector/pgvector.git
|
||||
cd pgvector
|
||||
make
|
||||
make install # may need sudo
|
||||
```
|
||||
|
||||
<!-- tabs:end -->
|
||||
|
||||
|
||||
##### Create the Khoj database
|
||||
|
||||
Make sure to update your environment variables to match your Postgres configuration if you're using a different name. The default values should work for most people.
|
||||
|
||||
<!-- tabs:start -->
|
||||
|
||||
#### **MacOS**
|
||||
```bash
|
||||
createdb khoj -U postgres
|
||||
```
|
||||
|
||||
#### **Windows**
|
||||
```bash
|
||||
createdb khoj -U postgres
|
||||
```
|
||||
|
||||
#### **Linux**
|
||||
```bash
|
||||
sudo -u postgres createdb khoj
|
||||
```
|
||||
|
||||
<!-- tabs:end -->
|
||||
|
||||
#### Install package
|
||||
|
||||
##### Local Server Setup
|
||||
- *Make sure [python](https://realpython.com/installing-python/) and [pip](https://pip.pypa.io/en/stable/installation/) are installed on your machine*
|
||||
|
||||
Run the following command in your terminal to install the Khoj backend.
|
||||
|
||||
<!-- tabs:start -->
|
||||
|
||||
#### **MacOS**
|
||||
|
||||
```shell
|
||||
python -m pip install khoj-assistant
|
||||
```
|
||||
|
||||
#### **Windows**
|
||||
|
||||
```shell
|
||||
py -m pip install khoj-assistant
|
||||
```
|
||||
For more detailed Windows installation and troubleshooting, see [Windows Install](./windows_install.md).
|
||||
|
||||
#### **Linux**
|
||||
|
||||
```shell
|
||||
python -m pip install khoj-assistant
|
||||
```
|
||||
|
||||
<!-- tabs:end -->
|
||||
|
||||
##### Local Server Start
|
||||
|
||||
Run the following command from your terminal to start the Khoj backend and open Khoj in your browser.
|
||||
|
||||
```shell
|
||||
khoj --anonymous-mode
|
||||
```
|
||||
`--anonymous-mode` allows you to run the server without setting up Google credentials for login. This allows you to use any of the clients without a login wall. If you want to use Google login, you can skip this flag, but you will have to add your Google developer credentials.
|
||||
|
||||
On the first run, you will be prompted to input credentials for your admin account and do some basic configuration for your chat model settings. Once created, you can go to http://localhost:42110/server/admin and login with the credentials you just created.
|
||||
|
||||
Khoj should now be running at http://localhost:42110. You can see the web UI in your browser.
|
||||
|
||||
Note: To start Khoj automatically in the background use [Task scheduler](https://www.windowscentral.com/how-create-automated-task-using-task-scheduler-windows-10) on Windows or [Cron](https://en.wikipedia.org/wiki/Cron) on Mac, Linux (e.g with `@reboot khoj`)
|
||||
|
||||
|
||||
### 2. Download the desktop client
|
||||
|
||||
You can use our desktop executables to select file paths and folders to index. You can simply select the folders or files, and they'll be automatically uploaded to the server. Once you specify a file or file path, you don't need to update the configuration again; it will grab any data diffs dynamically over time.
|
||||
|
||||
**To download the latest desktop client, go to https://download.khoj.dev** and the correct executable for your OS will automatically start downloading. Once downloaded, you can configure your folders for indexing using the settings tab. To set your chat configuration, you'll have to use the web interface for the Khoj server you set up in the previous step.
|
||||
|
||||
To use the desktop client, you need to go to your Khoj server's settings page (http://localhost:42110/config) and copy the API key. Then, paste it into the desktop client's settings page. Once you've done that, you can select files and folders to index.
|
||||
|
||||
### 3. Configure
|
||||
1. Go to http://localhost:42110/server/admin and login with your admin credentials. Go to the ChatModelOptions if you want to add additional models for chat.
|
||||
1. Select files and folders to index [using the desktop client](./setup.md?id=_2-download-the-desktop-client). When you click 'Save', the files will be sent to your server for indexing.
|
||||
- Select Notion workspaces and Github repositories to index using the web interface.
|
||||
|
||||
### 4. Install Client Plugins (Optional)
|
||||
Khoj exposes a web interface to search, chat and configure by default.<br />
|
||||
The optional steps below allow using Khoj from within an existing application like Obsidian or Emacs.
|
||||
|
||||
|
@ -75,9 +153,17 @@ The optional steps below allow using Khoj from within an existing application li
|
|||
- **Khoj Emacs**:<br />
|
||||
[Install](/emacs?id=setup) khoj.el
|
||||
|
||||
### 5. Use Khoj 🚀
|
||||
|
||||
You can head to http://localhost:42110 to use the web interface. You can also use the desktop client to search and chat.
|
||||
|
||||
## Upgrade
|
||||
### Upgrade Khoj Server
|
||||
|
||||
<!-- tabs:start -->
|
||||
|
||||
#### **Local Setup**
|
||||
|
||||
```shell
|
||||
pip install --upgrade khoj-assistant
|
||||
```
|
||||
|
@ -88,6 +174,16 @@ pip install --upgrade khoj-assistant
|
|||
pip install --upgrade --pre khoj-assistant
|
||||
```
|
||||
|
||||
#### **Docker**
|
||||
Run the following command from the directory containing your `docker-compose` file to fetch the latest build and upgrade your server.
|
||||
|
||||
```shell
|
||||
docker-compose up --build
|
||||
```
|
||||
|
||||
<!-- tabs:end -->
|
||||
|
||||
|
||||
### Upgrade Khoj on Emacs
|
||||
- Use your Emacs Package Manager to Upgrade
|
||||
- See [khoj.el package setup](/emacs?id=setup) for details
|
||||
|
@ -100,8 +196,8 @@ pip install --upgrade --pre khoj-assistant
|
|||
1. (Optional) Hit `Ctrl-C` in the terminal running the khoj server to stop it
|
||||
2. Delete the khoj directory in your home folder (i.e `~/.khoj` on Linux, Mac or `C:\Users\<your-username>\.khoj` on Windows)
|
||||
5. You might want to `rm -rf` the following directories:
|
||||
- `~/.khoj`
|
||||
- `~/.cache/gpt4all`
|
||||
- `~/.khoj`
|
||||
- `~/.cache/gpt4all`
|
||||
3. Uninstall the khoj server with `pip uninstall khoj-assistant`
|
||||
4. (Optional) Uninstall khoj.el or the khoj obsidian plugin in the standard way on Emacs, Obsidian
|
||||
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
# Telemetry
|
||||
# Telemetry (self-hosting)
|
||||
|
||||
We collect some high level, anonymized metadata about usage of Khoj. This includes:
|
||||
- Client (Web, Emacs, Obsidian)
|
||||
|
|
15
docs/web.md
|
@ -1,19 +1,18 @@
|
|||
<h1><img src="./assets/khoj-logo-sideways-500.png" width="200" alt="Khoj Logo">Web</h1>
|
||||
<h1><img src="./assets/khoj-logo-sideways-500.png" width="200" alt="Khoj Logo"> Web</h1>
|
||||
|
||||
> An AI personal assistant for your Digital Brain
|
||||
> An AI copilot for your Second Brain
|
||||
|
||||
## Features
|
||||
- **Chat**
|
||||
- **Faster answers**: Find answers quickly, from your private notes or the public internet
|
||||
- **Assisted creativity**: Smoothly weave across retrieving answers and generating content
|
||||
- **Iterative discovery**: Iteratively explore and re-discover your notes
|
||||
- **Search**
|
||||
- **Natural**: Advanced natural language understanding using Transformer based ML Models
|
||||
- **Local**: Your personal data stays local. All search and indexing is done on your machine. *Unlike chat which requires access to GPT.*
|
||||
- **Incremental**: Incremental search for a fast, search-as-you-type experience
|
||||
- **Chat**
|
||||
- **Faster answers**: Find answers faster and with less effort than search
|
||||
- **Iterative discovery**: Iteratively explore and (re-)discover your notes
|
||||
- **Assisted creativity**: Smoothly weave across answers retrieval and content generation
|
||||
|
||||
## Setup
|
||||
The Khoj web interface is the default interface. It comes packaged with the khoj server.
|
||||
No setup required. The Khoj web app is the default interface to Khoj. You can access it from any web browser. Try it on [Khoj Cloud](https://app.khoj.dev)
|
||||
|
||||
## Interface
|
||||
![](./assets/khoj_search_on_web.png ':size=400px')
|
||||
|
|
|
@ -1,7 +1,7 @@
|
|||
{
|
||||
"id": "khoj",
|
||||
"name": "Khoj",
|
||||
"version": "0.14.0",
|
||||
"version": "1.0.0",
|
||||
"minAppVersion": "0.15.0",
|
||||
"description": "An AI copilot for your Second Brain",
|
||||
"author": "Khoj Inc.",
|
||||
|
|
|
@ -39,7 +39,7 @@ dependencies = [
|
|||
"bs4 >= 0.0.1",
|
||||
"dateparser >= 1.1.1",
|
||||
"defusedxml == 0.7.1",
|
||||
"fastapi == 0.77.1",
|
||||
"fastapi >= 0.104.1",
|
||||
"python-multipart >= 0.0.5",
|
||||
"jinja2 == 3.1.2",
|
||||
"openai >= 0.27.0, < 1.0.0",
|
||||
|
@ -60,7 +60,7 @@ dependencies = [
|
|||
"bs4 >= 0.0.1",
|
||||
"anyio == 3.7.1",
|
||||
"pymupdf >= 1.23.5",
|
||||
"django == 4.2.5",
|
||||
"django == 4.2.7",
|
||||
"authlib == 1.2.1",
|
||||
"gpt4all >= 2.0.0; platform_system == 'Linux' and platform_machine == 'x86_64'",
|
||||
"gpt4all >= 2.0.0; platform_system == 'Windows' or platform_system == 'Darwin'",
|
||||
|
|
|
@ -240,10 +240,18 @@ class ConversationAdapters:
|
|||
def get_openai_conversation_config():
|
||||
return OpenAIProcessorConversationConfig.objects.filter().first()
|
||||
|
||||
@staticmethod
|
||||
async def aget_openai_conversation_config():
|
||||
return await OpenAIProcessorConversationConfig.objects.filter().afirst()
|
||||
|
||||
@staticmethod
|
||||
def get_offline_chat_conversation_config():
|
||||
return OfflineChatProcessorConversationConfig.objects.filter().first()
|
||||
|
||||
@staticmethod
|
||||
async def aget_offline_chat_conversation_config():
|
||||
return await OfflineChatProcessorConversationConfig.objects.filter().afirst()
|
||||
|
||||
@staticmethod
|
||||
def has_valid_offline_conversation_config():
|
||||
return OfflineChatProcessorConversationConfig.objects.filter(enabled=True).exists()
|
||||
|
@ -267,10 +275,21 @@ class ConversationAdapters:
|
|||
return None
|
||||
return config.setting
|
||||
|
||||
@staticmethod
|
||||
async def aget_conversation_config(user: KhojUser):
|
||||
config = await UserConversationConfig.objects.filter(user=user).prefetch_related("setting").afirst()
|
||||
if not config:
|
||||
return None
|
||||
return config.setting
|
||||
|
||||
@staticmethod
|
||||
def get_default_conversation_config():
|
||||
return ChatModelOptions.objects.filter().first()
|
||||
|
||||
@staticmethod
|
||||
async def aget_default_conversation_config():
|
||||
return await ChatModelOptions.objects.filter().afirst()
|
||||
|
||||
@staticmethod
|
||||
def save_conversation(user: KhojUser, conversation_log: dict):
|
||||
conversation = Conversation.objects.filter(user=user)
|
||||
|
@ -320,10 +339,6 @@ class ConversationAdapters:
|
|||
async def get_openai_chat_config():
|
||||
return await OpenAIProcessorConversationConfig.objects.filter().afirst()
|
||||
|
||||
@staticmethod
|
||||
async def aget_default_conversation_config():
|
||||
return await ChatModelOptions.objects.filter().afirst()
|
||||
|
||||
|
||||
class EntryAdapters:
|
||||
word_filer = WordFilter()
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
{
|
||||
"name": "Khoj",
|
||||
"version": "0.14.0",
|
||||
"version": "1.0.0",
|
||||
"description": "An AI copilot for your Second Brain",
|
||||
"author": "Saba Imran, Debanjum Singh Solanky <team@khoj.dev>",
|
||||
"license": "GPL-3.0-or-later",
|
||||
|
|
|
@ -6,7 +6,7 @@
|
|||
;; Saba Imran <saba@khoj.dev>
|
||||
;; Description: An AI copilot for your Second Brain
|
||||
;; Keywords: search, chat, org-mode, outlines, markdown, pdf, image
|
||||
;; Version: 0.14.0
|
||||
;; Version: 1.0.0
|
||||
;; Package-Requires: ((emacs "27.1") (transient "0.3.0") (dash "2.19.1"))
|
||||
;; URL: https://github.com/khoj-ai/khoj/tree/master/src/interface/emacs
|
||||
|
||||
|
@ -63,7 +63,7 @@
|
|||
;; Khoj Static Configuration
|
||||
;; -------------------------
|
||||
|
||||
(defcustom khoj-server-url "http://localhost:42110"
|
||||
(defcustom khoj-server-url "https://app.khoj.dev"
|
||||
"Location of Khoj API server."
|
||||
:group 'khoj
|
||||
:type 'string)
|
||||
|
@ -94,7 +94,7 @@
|
|||
:type 'number)
|
||||
|
||||
(defcustom khoj-api-key nil
|
||||
"API Key to Khoj server."
|
||||
"API Key to your Khoj. Default at https://app.khoj.dev/config#clients."
|
||||
:group 'khoj
|
||||
:type 'string)
|
||||
|
||||
|
|
|
@ -1,7 +1,7 @@
|
|||
{
|
||||
"id": "khoj",
|
||||
"name": "Khoj",
|
||||
"version": "0.14.0",
|
||||
"version": "1.0.0",
|
||||
"minAppVersion": "0.15.0",
|
||||
"description": "An AI copilot for your Second Brain",
|
||||
"author": "Khoj Inc.",
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
{
|
||||
"name": "Khoj",
|
||||
"version": "0.14.0",
|
||||
"version": "1.0.0",
|
||||
"description": "An AI copilot for your Second Brain",
|
||||
"author": "Debanjum Singh Solanky, Saba Imran <team@khoj.dev>",
|
||||
"license": "GPL-3.0-or-later",
|
||||
|
|
|
@ -75,7 +75,7 @@ export default class Khoj extends Plugin {
|
|||
|
||||
if (this.settings.khojUrl === "https://app.khoj.dev") {
|
||||
if (this.settings.khojApiKey === "") {
|
||||
new Notice(`❗️Khoj API key is not configured. Please visit https://app.khoj.dev to get an API key.`);
|
||||
new Notice(`❗️Khoj API key is not configured. Please visit https://app.khoj.dev/config#clients to get an API key.`);
|
||||
return;
|
||||
}
|
||||
|
||||
|
|
|
@ -13,7 +13,7 @@ export interface KhojSetting {
|
|||
|
||||
export const DEFAULT_SETTINGS: KhojSetting = {
|
||||
resultsCount: 6,
|
||||
khojUrl: 'http://127.0.0.1:42110',
|
||||
khojUrl: 'https://app.khoj.dev',
|
||||
khojApiKey: '',
|
||||
connectedToBackend: false,
|
||||
autoConfigure: true,
|
||||
|
|
|
@ -26,5 +26,6 @@
|
|||
"0.12.2": "0.15.0",
|
||||
"0.12.3": "0.15.0",
|
||||
"0.13.0": "0.15.0",
|
||||
"0.14.0": "0.15.0"
|
||||
"0.14.0": "0.15.0",
|
||||
"1.0.0": "0.15.0"
|
||||
}
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
{% block content %}
|
||||
|
||||
<div class="page">
|
||||
<div class="section">
|
||||
<div id="content" class="section">
|
||||
<h2 class="section-title">Content</h2>
|
||||
<div class="section-cards">
|
||||
<div class="card">
|
||||
|
@ -118,7 +118,7 @@
|
|||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section">
|
||||
<div id ="features" class="section">
|
||||
<h2 class="section-title">Features</h2>
|
||||
<div id="features-hint-text"></div>
|
||||
<div class="section-cards">
|
||||
|
@ -144,9 +144,9 @@
|
|||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section">
|
||||
<div id="clients" class="section">
|
||||
<h2 class="section-title">Clients</h2>
|
||||
<div class="api-settings">
|
||||
<div id="clients-api" class="api-settings">
|
||||
<div class="card-title-row">
|
||||
<img class="card-icon" src="/static/assets/icons/key.svg" alt="API Key">
|
||||
<h3 class="card-title">API Keys</h3>
|
||||
|
@ -172,7 +172,7 @@
|
|||
</div>
|
||||
</div>
|
||||
{% if billing_enabled %}
|
||||
<div class="section">
|
||||
<div id="billing" class="section">
|
||||
<h2 class="section-title">Billing</h2>
|
||||
<div class="section-cards">
|
||||
<div class="card">
|
||||
|
|
|
@ -1,5 +1,6 @@
|
|||
# Standard Packages
|
||||
import logging
|
||||
import json
|
||||
from datetime import datetime, timedelta
|
||||
from typing import Optional
|
||||
|
||||
|
@ -31,6 +32,10 @@ def extract_questions(
|
|||
"""
|
||||
Infer search queries to retrieve relevant notes to answer user query
|
||||
"""
|
||||
|
||||
def _valid_question(question: str):
|
||||
return not is_none_or_empty(question) and question != "[]"
|
||||
|
||||
# Extract Past User Message and Inferred Questions from Conversation Log
|
||||
chat_history = "".join(
|
||||
[
|
||||
|
@ -70,7 +75,7 @@ def extract_questions(
|
|||
|
||||
# Extract, Clean Message from GPT's Response
|
||||
try:
|
||||
questions = (
|
||||
split_questions = (
|
||||
response.content.strip(empty_escape_sequences)
|
||||
.replace("['", '["')
|
||||
.replace("']", '"]')
|
||||
|
@ -79,9 +84,18 @@ def extract_questions(
|
|||
.replace('"]', "")
|
||||
.split('", "')
|
||||
)
|
||||
questions = []
|
||||
|
||||
for question in split_questions:
|
||||
if question not in questions and _valid_question(question):
|
||||
questions.append(question)
|
||||
|
||||
if is_none_or_empty(questions):
|
||||
raise ValueError("GPT returned empty JSON")
|
||||
except:
|
||||
logger.warning(f"GPT returned invalid JSON. Falling back to using user message as search query.\n{response}")
|
||||
questions = [text]
|
||||
|
||||
logger.debug(f"Extracted Questions by GPT: {questions}")
|
||||
return questions
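
The de-duplication step added in this hunk can be tried in isolation. The sketch below is a minimal reproduction under assumptions: `is_none_or_empty` is a simplified stand-in for the Khoj helper of the same name, and `clean_questions` is a hypothetical wrapper that does not exist in the codebase.

```python
def is_none_or_empty(item) -> bool:
    # Simplified stand-in for the Khoj helper; its real behavior is assumed here.
    return item is None or item == "" or item == []


def clean_questions(split_questions, fallback):
    """Drop duplicate, empty and "[]" entries while keeping first-seen order."""

    def _valid_question(question):
        return not is_none_or_empty(question) and question != "[]"

    questions = []
    for question in split_questions:
        if question not in questions and _valid_question(question):
            questions.append(question)

    # Mirror the hunk's fallback: use the user's message when nothing valid remains
    return questions if questions else [fallback]


deduped = clean_questions(["taxes 2022", "taxes 2022", "", "[]", "tax filing date"], "when did I file?")
empty_case = clean_questions(["[]"], "when did I file?")
```

Using a list rather than a set keeps the model's original ordering of questions, at the cost of a linear membership check that is negligible for a handful of queries.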
|
||||
|
||||
|
|
|
@ -154,17 +154,20 @@ def truncate_messages(
|
|||
)
|
||||
|
||||
system_message = messages.pop()
|
||||
assert type(system_message.content) == str
|
||||
system_message_tokens = len(encoder.encode(system_message.content))
|
||||
|
||||
tokens = sum([len(encoder.encode(message.content)) for message in messages])
|
||||
tokens = sum([len(encoder.encode(message.content)) for message in messages if type(message.content) == str])
|
||||
while (tokens + system_message_tokens) > max_prompt_size and len(messages) > 1:
|
||||
messages.pop()
|
||||
tokens = sum([len(encoder.encode(message.content)) for message in messages])
|
||||
assert type(system_message.content) == str
|
||||
tokens = sum([len(encoder.encode(message.content)) for message in messages if type(message.content) == str])
|
||||
|
||||
# Truncate current message if still over max supported prompt size by model
|
||||
if (tokens + system_message_tokens) > max_prompt_size:
|
||||
current_message = "\n".join(messages[0].content.split("\n")[:-1])
|
||||
original_question = "\n".join(messages[0].content.split("\n")[-1:])
|
||||
assert type(system_message.content) == str
|
||||
current_message = "\n".join(messages[0].content.split("\n")[:-1]) if type(messages[0].content) == str else ""
|
||||
original_question = "\n".join(messages[0].content.split("\n")[-1:]) if type(messages[0].content) == str else ""
|
||||
original_question_tokens = len(encoder.encode(original_question))
|
||||
remaining_tokens = max_prompt_size - original_question_tokens - system_message_tokens
|
||||
truncated_message = encoder.decode(encoder.encode(current_message)[:remaining_tokens]).strip()
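
The drop-oldest loop in this hunk can be sketched without a real tokenizer. Everything named here is an assumption made for illustration: `Message`, `count_tokens` (a whitespace split standing in for `encoder.encode`) and `drop_oldest_messages` are hypothetical, and the sketch covers only the loop, not the final truncation of the current message.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical stand-in for a chat message with string content."""
    content: str


def count_tokens(text: str) -> int:
    # Whitespace split stands in for a real tokenizer; an assumption of this sketch.
    return len(text.split())


def drop_oldest_messages(messages, max_prompt_size):
    """Expects messages newest-first with the system message last, as in the hunk."""
    messages = list(messages)  # work on a copy so the caller's list is untouched
    system_message = messages.pop()
    system_tokens = count_tokens(system_message.content)

    def total_tokens():
        # Only string content is counted, matching the type guard added in the hunk
        return sum(count_tokens(m.content) for m in messages if isinstance(m.content, str))

    # Pop the oldest remaining message until the prompt fits or one message is left
    while total_tokens() + system_tokens > max_prompt_size and len(messages) > 1:
        messages.pop()

    return messages + [system_message]


history = [
    Message("new question here"),
    Message("older reply text"),
    Message("oldest note"),
    Message("system prompt"),
]
trimmed = drop_oldest_messages(history, max_prompt_size=6)
```

With a budget of 6 tokens the two older messages are dropped, leaving the current message and the system prompt.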
|
||||
|
|
|
@ -31,6 +31,7 @@ from khoj.utils import state, constants
|
|||
from khoj.utils.helpers import AsyncIteratorWrapper, get_device
|
||||
from fastapi.responses import StreamingResponse, Response
|
||||
from khoj.routers.helpers import (
|
||||
CommonQueryParams,
|
||||
get_conversation_command,
|
||||
validate_conversation_config,
|
||||
agenerate_chat_response,
|
||||
|
@ -55,6 +56,7 @@ from database.models import (
|
|||
Entry as DbEntry,
|
||||
GithubConfig,
|
||||
NotionConfig,
|
||||
ChatModelOptions,
|
||||
)
|
||||
|
||||
|
||||
|
@ -122,7 +124,7 @@ async def map_config_to_db(config: FullConfig, user: KhojUser):
|
|||
def _initialize_config():
|
||||
if state.config is None:
|
||||
state.config = FullConfig()
|
||||
state.config.search_type = SearchConfig.parse_obj(constants.default_config["search-type"])
|
||||
state.config.search_type = SearchConfig.model_validate(constants.default_config["search-type"])
|
||||
|
||||
|
||||
@api.get("/config/data", response_model=FullConfig)
|
||||
|
@ -355,15 +357,12 @@ def get_config_types(
|
|||
async def search(
|
||||
q: str,
|
||||
request: Request,
|
||||
common: CommonQueryParams,
|
||||
n: Optional[int] = 5,
|
||||
t: Optional[SearchType] = SearchType.All,
|
||||
r: Optional[bool] = False,
|
||||
max_distance: Optional[Union[float, None]] = None,
|
||||
dedupe: Optional[bool] = True,
|
||||
client: Optional[str] = None,
|
||||
user_agent: Optional[str] = Header(None),
|
||||
referer: Optional[str] = Header(None),
|
||||
host: Optional[str] = Header(None),
|
||||
):
|
||||
user = request.user.object
|
||||
start_time = time.time()
|
||||
|
@ -467,10 +466,7 @@ async def search(
|
|||
request=request,
|
||||
telemetry_type="api",
|
||||
api="search",
|
||||
client=client,
|
||||
user_agent=user_agent,
|
||||
referer=referer,
|
||||
host=host,
|
||||
**common.__dict__,
|
||||
)
|
||||
|
||||
end_time = time.time()
|
||||
|
@ -483,12 +479,9 @@ async def search(
|
|||
@requires(["authenticated"])
|
||||
def update(
|
||||
request: Request,
|
||||
common: CommonQueryParams,
|
||||
t: Optional[SearchType] = None,
|
||||
force: Optional[bool] = False,
|
||||
client: Optional[str] = None,
|
||||
user_agent: Optional[str] = Header(None),
|
||||
referer: Optional[str] = Header(None),
|
||||
host: Optional[str] = Header(None),
|
||||
):
|
||||
user = request.user.object
|
||||
if not state.config:
|
||||
|
@ -514,10 +507,7 @@ def update(
|
|||
request=request,
|
||||
telemetry_type="api",
|
||||
api="update",
|
||||
client=client,
|
||||
user_agent=user_agent,
|
||||
referer=referer,
|
||||
host=host,
|
||||
**common.__dict__,
|
||||
)
|
||||
|
||||
return {"status": "ok", "message": "khoj reloaded"}
|
||||
|
@ -527,10 +517,7 @@ def update(
|
|||
@requires(["authenticated"])
|
||||
def chat_history(
|
||||
request: Request,
|
||||
client: Optional[str] = None,
|
||||
user_agent: Optional[str] = Header(None),
|
||||
referer: Optional[str] = Header(None),
|
||||
host: Optional[str] = Header(None),
|
||||
common: CommonQueryParams,
|
||||
):
|
||||
user = request.user.object
|
||||
validate_conversation_config()
|
||||
|
@ -542,10 +529,7 @@ def chat_history(
|
|||
request=request,
|
||||
telemetry_type="api",
|
||||
api="chat",
|
||||
client=client,
|
||||
user_agent=user_agent,
|
||||
referer=referer,
|
||||
host=host,
|
||||
**common.__dict__,
|
||||
)
|
||||
|
||||
return {"status": "ok", "response": meta_log.get("chat", [])}
|
||||
|
@ -555,10 +539,7 @@ def chat_history(
|
|||
@requires(["authenticated"])
|
||||
async def chat_options(
|
||||
request: Request,
|
||||
client: Optional[str] = None,
|
||||
user_agent: Optional[str] = Header(None),
|
||||
referer: Optional[str] = Header(None),
|
||||
host: Optional[str] = Header(None),
|
||||
common: CommonQueryParams,
|
||||
) -> Response:
|
||||
cmd_options = {}
|
||||
for cmd in ConversationCommand:
|
||||
|
@ -568,10 +549,7 @@ async def chat_options(
|
|||
request=request,
|
||||
telemetry_type="api",
|
||||
api="chat_options",
|
||||
client=client,
|
||||
user_agent=user_agent,
|
||||
referer=referer,
|
||||
host=host,
|
||||
**common.__dict__,
|
||||
)
|
||||
return Response(content=json.dumps(cmd_options), media_type="application/json", status_code=200)
|
||||
|
||||
|
@ -580,14 +558,11 @@ async def chat_options(
|
|||
@requires(["authenticated"])
|
||||
async def chat(
|
||||
request: Request,
|
||||
common: CommonQueryParams,
|
||||
q: str,
|
||||
n: Optional[int] = 5,
|
||||
d: Optional[float] = 0.18,
|
||||
client: Optional[str] = None,
|
||||
stream: Optional[bool] = False,
|
||||
user_agent: Optional[str] = Header(None),
|
||||
referer: Optional[str] = Header(None),
|
||||
host: Optional[str] = Header(None),
|
||||
rate_limiter_per_minute=Depends(ApiUserRateLimiter(requests=30, window=60)),
|
||||
rate_limiter_per_day=Depends(ApiUserRateLimiter(requests=500, window=60 * 60 * 24)),
|
||||
) -> Response:
|
||||
|
@ -601,7 +576,7 @@ async def chat(
|
|||
meta_log = (await ConversationAdapters.aget_conversation_by_user(user)).conversation_log
|
||||
|
||||
compiled_references, inferred_queries, defiltered_query = await extract_references_and_questions(
|
||||
request, meta_log, q, (n or 5), (d or math.inf), conversation_command
|
||||
request, common, meta_log, q, (n or 5), (d or math.inf), conversation_command
|
||||
)
|
||||
online_results: Dict = dict()
|
||||
|
||||
|
@ -647,11 +622,8 @@ async def chat(
|
|||
request=request,
|
||||
telemetry_type="api",
|
||||
api="chat",
|
||||
client=client,
|
||||
user_agent=user_agent,
|
||||
referer=referer,
|
||||
host=host,
|
||||
metadata=chat_metadata,
|
||||
**common.__dict__,
|
||||
)
|
||||
|
||||
if llm_response is None:
|
||||
|
@ -678,6 +650,7 @@ async def chat(
|
|||
|
||||
async def extract_references_and_questions(
|
||||
request: Request,
|
||||
common: CommonQueryParams,
|
||||
meta_log: dict,
|
||||
q: str,
|
||||
n: int,
|
||||
|
@ -710,7 +683,16 @@ async def extract_references_and_questions(
|
|||
# Infer search queries from user message
|
||||
with timer("Extracting search queries took", logger):
|
||||
# If we've reached here, either the user has enabled offline chat or the openai model is enabled.
|
||||
if await ConversationAdapters.ahas_offline_chat():
|
||||
offline_chat_config = await ConversationAdapters.aget_offline_chat_conversation_config()
|
||||
conversation_config = await ConversationAdapters.aget_conversation_config(user)
|
||||
if conversation_config is None:
|
||||
conversation_config = await ConversationAdapters.aget_default_conversation_config()
|
||||
openai_chat_config = await ConversationAdapters.aget_openai_conversation_config()
|
||||
if (
|
||||
offline_chat_config
|
||||
and offline_chat_config.enabled
|
||||
and conversation_config.model_type == ChatModelOptions.ModelType.OFFLINE
|
||||
):
|
||||
using_offline_chat = True
|
||||
offline_chat = await ConversationAdapters.get_offline_chat()
|
||||
chat_model = offline_chat.chat_model
|
||||
|
@ -722,7 +704,7 @@ async def extract_references_and_questions(
|
|||
inferred_queries = extract_questions_offline(
|
||||
defiltered_query, loaded_model=loaded_model, conversation_log=meta_log, should_extract_questions=False
|
||||
)
|
||||
elif await ConversationAdapters.has_openai_chat():
|
||||
elif openai_chat_config and conversation_config.model_type == ChatModelOptions.ModelType.OPENAI:
|
||||
openai_chat_config = await ConversationAdapters.get_openai_chat_config()
|
||||
openai_chat = await ConversationAdapters.get_openai_chat()
|
||||
api_key = openai_chat_config.api_key
|
||||
|
@ -744,9 +726,9 @@ async def extract_references_and_questions(
|
|||
r=True,
|
||||
max_distance=d,
|
||||
dedupe=False,
|
||||
common=common,
|
||||
)
|
||||
)
|
||||
# Dedupe the results again, as duplicates may be returned across queries.
|
||||
result_list = text_search.deduplicated_search_responses(result_list)
|
||||
compiled_references = [item.additional["compiled"] for item in result_list]
|
||||
|
||||
|
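Note on the `chat` hunk above: the endpoint keeps two `ApiUserRateLimiter` dependencies (30 requests per 60 seconds and 500 per day), and the only part of that class visible in this commit is the line that appends the current request time to a per-user cache. The following is a minimal sketch of how such a sliding-window limiter could work. It is not khoj's implementation: the class name, the `allow` method and the injectable `clock` are assumptions made for illustration.

```python
import time
from collections import defaultdict
from typing import Callable, DefaultDict, List


class SlidingWindowRateLimiter:
    """Hypothetical stand-in for ApiUserRateLimiter; not khoj's actual code."""

    def __init__(self, requests: int, window: int, clock: Callable[[], float] = time.time):
        self.requests = requests  # maximum calls allowed per window
        self.window = window  # window length in seconds
        self.clock = clock  # injectable clock so the limiter can be tested
        self.cache: DefaultDict[str, List[float]] = defaultdict(list)

    def allow(self, user_id: str) -> bool:
        now = self.clock()
        # Keep only the timestamps that still fall inside the window
        recent = [t for t in self.cache[user_id] if t > now - self.window]
        if len(recent) >= self.requests:
            self.cache[user_id] = recent
            return False
        # Add the current request to the cache
        recent.append(now)
        self.cache[user_id] = recent
        return True
```

A web endpoint would presumably turn a `False` result into an HTTP error response rather than return a boolean; that part is left out because it is not shown in this diff.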
|
|
@ -6,10 +6,10 @@ from datetime import datetime
|
|||
from functools import partial
|
||||
import logging
|
||||
from time import time
|
||||
from typing import Iterator, List, Optional, Union, Tuple, Dict, Any
|
||||
from typing import Annotated, Iterator, List, Optional, Union, Tuple, Dict, Any
|
||||
|
||||
# External Packages
|
||||
from fastapi import HTTPException, Request
|
||||
from fastapi import HTTPException, Header, Request, Depends
|
||||
|
||||
# Internal Packages
|
||||
from khoj.utils import state
|
||||
|
@ -232,3 +232,20 @@ class ApiUserRateLimiter:
|
|||
|
||||
# Add the current request to the cache
|
||||
user_requests.append(time())
|
||||
|
||||
|
||||
class CommonQueryParamsClass:
|
||||
def __init__(
|
||||
self,
|
||||
client: Optional[str] = None,
|
||||
user_agent: Optional[str] = Header(None),
|
||||
referer: Optional[str] = Header(None),
|
||||
host: Optional[str] = Header(None),
|
||||
):
|
||||
self.client = client
|
||||
self.user_agent = user_agent
|
||||
self.referer = referer
|
||||
self.host = host
|
||||
|
||||
|
||||
CommonQueryParams = Annotated[CommonQueryParamsClass, Depends()]
|
||||
|
|
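Note on the hunk above: `CommonQueryParamsClass` collects the `client`, `user_agent`, `referer` and `host` parameters that each endpoint previously declared on its own, and the endpoints then forward them with `**common.__dict__`. The sketch below shows that forwarding pattern without FastAPI, so it runs with the standard library alone. `CommonParams`, `record_telemetry` and this `search` function are hypothetical names chosen for the example; they are not taken from the khoj codebase.

```python
from typing import Any, Dict, Optional


class CommonParams:
    """Hypothetical, framework-free analogue of CommonQueryParamsClass."""

    def __init__(
        self,
        client: Optional[str] = None,
        user_agent: Optional[str] = None,
        referer: Optional[str] = None,
        host: Optional[str] = None,
    ):
        self.client = client
        self.user_agent = user_agent
        self.referer = referer
        self.host = host


def record_telemetry(
    api: str,
    client: Optional[str] = None,
    user_agent: Optional[str] = None,
    referer: Optional[str] = None,
    host: Optional[str] = None,
) -> Dict[str, Any]:
    # Hypothetical telemetry sink: simply echoes back what it received
    return {"api": api, "client": client, "user_agent": user_agent, "referer": referer, "host": host}


def search(q: str, common: CommonParams) -> Dict[str, Any]:
    # The instance attributes expand into keyword arguments, so the
    # endpoint no longer has to list each of the four values itself
    return record_telemetry(api="search", **common.__dict__)
```

The expansion only works while the attribute names match the keyword names of the called function. An attribute added to the class later would be passed along as an extra keyword.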
|
@ -63,7 +63,7 @@ async def update(
|
|||
request: Request,
|
||||
files: list[UploadFile],
|
||||
force: bool = False,
|
||||
t: Optional[Union[state.SearchType, str]] = None,
|
||||
t: Optional[Union[state.SearchType, str]] = state.SearchType.All,
|
||||
client: Optional[str] = None,
|
||||
user_agent: Optional[str] = Header(None),
|
||||
referer: Optional[str] = Header(None),
|
||||
|
@ -182,13 +182,16 @@ def configure_content(
|
|||
files: Optional[dict[str, dict[str, str]]],
|
||||
search_models: SearchModels,
|
||||
regenerate: bool = False,
|
||||
t: Optional[state.SearchType] = None,
|
||||
t: Optional[state.SearchType] = state.SearchType.All,
|
||||
full_corpus: bool = True,
|
||||
user: KhojUser = None,
|
||||
) -> tuple[Optional[ContentIndex], bool]:
|
||||
content_index = ContentIndex()
|
||||
|
||||
success = True
|
||||
if t is not None and t in [type.value for type in state.SearchType]:
|
||||
t = state.SearchType(t)
|
||||
|
||||
if t is not None and not t.value in [type.value for type in state.SearchType]:
|
||||
logger.warning(f"🚨 Invalid search type: {t}")
|
||||
return None, False
|
||||
|
@ -201,7 +204,7 @@ def configure_content(
|
|||
|
||||
try:
|
||||
# Initialize Org Notes Search
|
||||
if (search_type == None or search_type == state.SearchType.Org.value) and files["org"]:
|
||||
if (search_type == state.SearchType.All.value or search_type == state.SearchType.Org.value) and files["org"]:
|
||||
logger.info("🦄 Setting up search for orgmode notes")
|
||||
# Extract Entries, Generate Notes Embeddings
|
||||
text_search.setup(
|
||||
|
@ -217,7 +220,9 @@ def configure_content(
|
|||
|
||||
try:
|
||||
# Initialize Markdown Search
|
||||
if (search_type == None or search_type == state.SearchType.Markdown.value) and files["markdown"]:
|
||||
if (search_type == state.SearchType.All.value or search_type == state.SearchType.Markdown.value) and files[
|
||||
"markdown"
|
||||
]:
|
||||
logger.info("💎 Setting up search for markdown notes")
|
||||
# Extract Entries, Generate Markdown Embeddings
|
||||
text_search.setup(
|
||||
|
@ -234,7 +239,7 @@ def configure_content(
|
|||
|
||||
try:
|
||||
# Initialize PDF Search
|
||||
if (search_type == None or search_type == state.SearchType.Pdf.value) and files["pdf"]:
|
||||
if (search_type == state.SearchType.All.value or search_type == state.SearchType.Pdf.value) and files["pdf"]:
|
||||
logger.info("🖨️ Setting up search for pdf")
|
||||
# Extract Entries, Generate PDF Embeddings
|
||||
text_search.setup(
|
||||
|
@ -251,7 +256,9 @@ def configure_content(
|
|||
|
||||
try:
|
||||
# Initialize Plaintext Search
|
||||
if (search_type == None or search_type == state.SearchType.Plaintext.value) and files["plaintext"]:
|
||||
if (search_type == state.SearchType.All.value or search_type == state.SearchType.Plaintext.value) and files[
|
||||
"plaintext"
|
||||
]:
|
||||
logger.info("📄 Setting up search for plaintext")
|
||||
# Extract Entries, Generate Plaintext Embeddings
|
||||
text_search.setup(
|
||||
|
@ -269,7 +276,7 @@ def configure_content(
|
|||
try:
|
||||
# Initialize Image Search
|
||||
if (
|
||||
(search_type == None or search_type == state.SearchType.Image.value)
|
||||
(search_type == state.SearchType.All.value or search_type == state.SearchType.Image.value)
|
||||
and content_config
|
||||
and content_config.image
|
||||
and search_models.image_search
|
||||
|
@ -286,7 +293,9 @@ def configure_content(
|
|||
|
||||
try:
|
||||
github_config = GithubConfig.objects.filter(user=user).prefetch_related("githubrepoconfig").first()
|
||||
if (search_type == None or search_type == state.SearchType.Github.value) and github_config is not None:
|
||||
if (
|
||||
search_type == state.SearchType.All.value or search_type == state.SearchType.Github.value
|
||||
) and github_config is not None:
|
||||
logger.info("🐙 Setting up search for github")
|
||||
# Extract Entries, Generate Github Embeddings
|
||||
text_search.setup(
|
||||
|
@ -305,7 +314,9 @@ def configure_content(
|
|||
try:
|
||||
# Initialize Notion Search
|
||||
notion_config = NotionConfig.objects.filter(user=user).first()
|
||||
if (search_type == None or search_type in state.SearchType.Notion.value) and notion_config:
|
||||
if (
|
||||
search_type == state.SearchType.All.value or search_type in state.SearchType.Notion.value
|
||||
) and notion_config:
|
||||
logger.info("🔌 Setting up search for notion")
|
||||
text_search.setup(
|
||||
NotionToEntries,
|
||||
|
|
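Note on the `configure_content` hunks above: the default for `t` moves from `None` to `state.SearchType.All`, and every content-type check now compares against `All` instead of `None`. The sketch below restates that selection logic in isolation. The `SearchType` enum here is a small illustrative subset with assumed member values, and `normalize_search_type` and `should_index` are hypothetical helpers, not functions from khoj.

```python
from enum import Enum
from typing import Optional, Union


class SearchType(str, Enum):
    # Illustrative subset; the member values are assumptions
    All = "all"
    Org = "org"
    Markdown = "markdown"
    Pdf = "pdf"


def normalize_search_type(t: Union[SearchType, str] = SearchType.All) -> Optional[SearchType]:
    """Return the matching enum member, or None for an unknown type."""
    if isinstance(t, SearchType):
        return t
    try:
        return SearchType(t)
    except ValueError:
        return None


def should_index(requested: SearchType, content: SearchType) -> bool:
    # Index a content type when everything was requested or it was named explicitly
    return requested in (SearchType.All, content)
```

Converting to the enum once at the top keeps the later checks free of `.value` comparisons and avoids calling `.value` on a plain string.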
|
@ -229,7 +229,7 @@ def collate_results(hits, image_names, output_directory, image_files_url, count=
|
|||
|
||||
# Add the image metadata to the results
|
||||
results += [
|
||||
SearchResponse.parse_obj(
|
||||
SearchResponse.model_validate(
|
||||
{
|
||||
"entry": f"{image_files_url}/{target_image_name}",
|
||||
"score": f"{hit['score']:.9f}",
|
||||
|
@ -237,7 +237,7 @@ def collate_results(hits, image_names, output_directory, image_files_url, count=
|
|||
"image_score": f"{hit['image_score']:.9f}",
|
||||
"metadata_score": f"{hit['metadata_score']:.9f}",
|
||||
},
|
||||
"corpus_id": hit["corpus_id"],
|
||||
"corpus_id": str(hit["corpus_id"]),
|
||||
}
|
||||
)
|
||||
]
|
||||
|
|
|
@ -163,7 +163,7 @@ def deduplicated_search_responses(hits: List[SearchResponse]):
|
|||
|
||||
else:
|
||||
hit_ids.add(hit.corpus_id)
|
||||
yield SearchResponse.parse_obj(
|
||||
yield SearchResponse.model_validate(
|
||||
{
|
||||
"entry": hit.entry,
|
||||
"score": hit.score,
|
||||
|
|
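Note on the hunk above: `deduplicated_search_responses` tracks the `corpus_id` values it has already seen in a set and yields only the first hit for each. The following is a minimal sketch of that pattern. The `Hit` dataclass and `deduplicate_hits` are hypothetical simplifications; the real function rebuilds `SearchResponse` objects with more fields than are shown here.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Set


@dataclass
class Hit:
    # Hypothetical, reduced version of SearchResponse
    entry: str
    score: float
    corpus_id: str


def deduplicate_hits(hits: Iterable[Hit]) -> Iterator[Hit]:
    """Yield the first hit seen for each corpus_id, preserving order."""
    seen: Set[str] = set()
    for hit in hits:
        if hit.corpus_id in seen:
            continue
        seen.add(hit.corpus_id)
        yield hit
```

Because this is a generator, its result can be consumed only once; wrap it in `list(...)` if it has to be iterated again. Casting `corpus_id` to `str`, as the image search hunk in this commit does, also keeps the set comparison consistent, since `1` and `"1"` would otherwise count as different ids.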
|
@ -288,15 +288,15 @@ def generate_random_name():
|
|||
# List of adjectives and nouns to choose from
|
||||
adjectives = [
|
||||
"happy",
|
||||
"irritated",
|
||||
"annoyed",
|
||||
"serendipitous",
|
||||
"exuberant",
|
||||
"calm",
|
||||
"brave",
|
||||
"scared",
|
||||
"energetic",
|
||||
"chivalrous",
|
||||
"kind",
|
||||
"grumpy",
|
||||
"suave",
|
||||
]
|
||||
nouns = ["dog", "cat", "falcon", "whale", "turtle", "rabbit", "hamster", "snake", "spider", "elephant"]
|
||||
|
||||
|
|
|
@ -14,7 +14,7 @@ from khoj.utils.helpers import to_snake_case_from_dash
|
|||
class ConfigBase(BaseModel):
|
||||
class Config:
|
||||
alias_generator = to_snake_case_from_dash
|
||||
allow_population_by_field_name = True
|
||||
populate_by_name = True
|
||||
|
||||
def __getitem__(self, item):
|
||||
return getattr(self, item)
|
||||
|
@ -29,8 +29,8 @@ class TextConfigBase(ConfigBase):
|
|||
|
||||
|
||||
class TextContentConfig(ConfigBase):
|
||||
input_files: Optional[List[Path]]
|
||||
input_filter: Optional[List[str]]
|
||||
input_files: Optional[List[Path]] = None
|
||||
input_filter: Optional[List[str]] = None
|
||||
index_heading_entries: Optional[bool] = False
|
||||
|
||||
|
||||
|
@ -50,31 +50,31 @@ class NotionContentConfig(ConfigBase):
|
|||
|
||||
|
||||
class ImageContentConfig(ConfigBase):
|
||||
input_directories: Optional[List[Path]]
|
||||
input_filter: Optional[List[str]]
|
||||
input_directories: Optional[List[Path]] = None
|
||||
input_filter: Optional[List[str]] = None
|
||||
embeddings_file: Path
|
||||
use_xmp_metadata: bool
|
||||
batch_size: int
|
||||
|
||||
|
||||
class ContentConfig(ConfigBase):
|
||||
org: Optional[TextContentConfig]
|
||||
image: Optional[ImageContentConfig]
|
||||
markdown: Optional[TextContentConfig]
|
||||
pdf: Optional[TextContentConfig]
|
||||
plaintext: Optional[TextContentConfig]
|
||||
github: Optional[GithubContentConfig]
|
||||
notion: Optional[NotionContentConfig]
|
||||
org: Optional[TextContentConfig] = None
|
||||
image: Optional[ImageContentConfig] = None
|
||||
markdown: Optional[TextContentConfig] = None
|
||||
pdf: Optional[TextContentConfig] = None
|
||||
plaintext: Optional[TextContentConfig] = None
|
||||
github: Optional[GithubContentConfig] = None
|
||||
notion: Optional[NotionContentConfig] = None
|
||||
|
||||
|
||||
class ImageSearchConfig(ConfigBase):
|
||||
encoder: str
|
||||
encoder_type: Optional[str]
|
||||
model_directory: Optional[Path]
|
||||
encoder_type: Optional[str] = None
|
||||
model_directory: Optional[Path] = None
|
||||
|
||||
|
||||
class SearchConfig(ConfigBase):
|
||||
image: Optional[ImageSearchConfig]
|
||||
image: Optional[ImageSearchConfig] = None
|
||||
|
||||
|
||||
class OpenAIProcessorConfig(ConfigBase):
|
||||
|
@ -95,26 +95,26 @@ class ConversationProcessorConfig(ConfigBase):
|
|||
|
||||
|
||||
class ProcessorConfig(ConfigBase):
|
||||
conversation: Optional[ConversationProcessorConfig]
|
||||
conversation: Optional[ConversationProcessorConfig] = None
|
||||
|
||||
|
||||
class AppConfig(ConfigBase):
|
||||
should_log_telemetry: bool
|
||||
should_log_telemetry: bool = True
|
||||
|
||||
|
||||
class FullConfig(ConfigBase):
|
||||
content_type: Optional[ContentConfig] = None
|
||||
search_type: Optional[SearchConfig] = None
|
||||
processor: Optional[ProcessorConfig] = None
|
||||
app: Optional[AppConfig] = AppConfig(should_log_telemetry=True)
|
||||
app: Optional[AppConfig] = AppConfig()
|
||||
version: Optional[str] = None
|
||||
|
||||
|
||||
class SearchResponse(ConfigBase):
|
||||
entry: str
|
||||
score: float
|
||||
cross_score: Optional[float]
|
||||
additional: Optional[dict]
|
||||
cross_score: Optional[float] = None
|
||||
additional: Optional[dict] = None
|
||||
corpus_id: str
|
||||
|
||||
|
||||
|
|
|
@ -39,7 +39,7 @@ def load_config_from_file(yaml_config_file: Path) -> dict:
|
|||
|
||||
def parse_config_from_string(yaml_config: dict) -> FullConfig:
|
||||
"Parse and validate config in YML string"
|
||||
return FullConfig.parse_obj(yaml_config)
|
||||
return FullConfig.model_validate(yaml_config)
|
||||
|
||||
|
||||
def parse_config_from_file(yaml_config_file):
|
||||
|
|
|
@ -9,9 +9,6 @@ import os
|
|||
from fastapi import FastAPI
|
||||
|
||||
|
||||
app = FastAPI()
|
||||
|
||||
|
||||
# Internal Packages
|
||||
from khoj.configure import configure_routes, configure_search_types, configure_middleware
|
||||
from khoj.processor.embeddings import CrossEncoderModel, EmbeddingsModel
|
||||
|
@ -320,6 +317,7 @@ def client(
|
|||
|
||||
state.anonymous_mode = False
|
||||
|
||||
app = FastAPI()
|
||||
configure_routes(app)
|
||||
configure_middleware(app)
|
||||
app.mount("/static", StaticFiles(directory=web_directory), name="static")
|
||||
|
|
|
@ -227,7 +227,7 @@ def test_answer_not_known_using_notes_command(chat_client_no_background, default
|
|||
|
||||
# Assert
|
||||
assert response.status_code == 200
|
||||
assert response_message == prompts.no_notes_found.format()
|
||||
assert response_message == prompts.no_entries_found.format()
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------------------------------------
|
||||
|
|
|
@ -26,5 +26,6 @@
|
|||
"0.12.2": "0.15.0",
|
||||
"0.12.3": "0.15.0",
|
||||
"0.13.0": "0.15.0",
|
||||
"0.14.0": "0.15.0"
|
||||
"0.14.0": "0.15.0",
|
||||
"1.0.0": "0.15.0"
|
||||
}
|
||||
|
|