From c28755ccd28c091291c5468c6e47c359f60c7be5 Mon Sep 17 00:00:00 2001
From: Debanjum Singh Solanky
Date: Fri, 21 Jul 2023 00:05:44 -0700
Subject: [PATCH] Fix diff blocks, links, remove footnotes & rearrange sections in docs

Extract performance into a separate section instead of shoving it under search
Create page for web interface
---
 .github/workflows/build_desktop.yml |  4 ++
 docs/_sidebar.md                    | 10 ++--
 docs/advanced.md                    | 86 +++++++++++++++--------------
 docs/chat.md                        | 22 ++++++--
 docs/development.md                 |  4 +-
 docs/emacs.md                       |  4 +-
 docs/index.html                     |  1 +
 docs/{README.md => overview.md}     |  9 +--
 docs/performance.md                 | 19 +++++++
 docs/search.md                      | 60 ++------------------
 docs/setup.md                       |  2 +-
 docs/web.md                         | 19 +++++++
 12 files changed, 125 insertions(+), 115 deletions(-)
 rename docs/{README.md => overview.md} (87%)
 create mode 100644 docs/performance.md
 create mode 100644 docs/web.md

diff --git a/.github/workflows/build_desktop.yml b/.github/workflows/build_desktop.yml
index 16dde6fb..8fc6b22c 100644
--- a/.github/workflows/build_desktop.yml
+++ b/.github/workflows/build_desktop.yml
@@ -4,6 +4,10 @@ on:
   push:
     branches:
       - master
+    paths:
+      - src/khoj/**
+      - pyproject.toml
+      - Khoj.spec
   workflow_dispatch:
 
 jobs:
diff --git a/docs/_sidebar.md b/docs/_sidebar.md
index ebad3d62..ab9db401 100644
--- a/docs/_sidebar.md
+++ b/docs/_sidebar.md
@@ -1,18 +1,20 @@
-- Getting Started
-  - [Overview](README.md)
+- Get Started
+  - [Overview](overview.md)
   - [Install](setup.md)
     - [Windows Installation](windows_install.md)
-- Learn More
+  - [Demos](demos.md)
+- Use
   - [Features](features.md)
   - [Chat](chat.md)
   - [Search](search.md)
-  - [Demos](demos.md)
 - Interfaces
   - [Obsidian](obsidian.md)
   - [Emacs](emacs.md)
+  - [Web](web.md)
 - Data Sources
   - [Github](github_integration.md)
   - [Notion](notion_integration.md)
 - [Advanced](advanced.md)
+  - [Performance](performance.md)
 - Contributing
   - [Development](development.md)
diff --git a/docs/advanced.md b/docs/advanced.md
index 95e6d351..9b47a373 100644
--- a/docs/advanced.md
+++ b/docs/advanced.md
@@ -1,54 +1,60 @@
 ## Advanced Usage
+### Search across Different Languages
+To search for notes in multiple, different languages, you can use a [multi-lingual model](https://www.sbert.net/docs/pretrained_models.html#multi-lingual-models).
+For example, the [paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) supports [50+ languages](https://www.sbert.net/docs/pretrained_models.html#:~:text=we%20used%20the%20following%2050%2B%20languages), has good search quality and speed. To use it:
+1. Manually update `search-type > asymmetric > encoder` to `paraphrase-multilingual-MiniLM-L12-v2` in your `~/.khoj/khoj.yml` file for now. See diff of `khoj.yml` below for illustration:
+
+    ```diff
+      asymmetric:
+    -   encoder: sentence-transformers/multi-qa-MiniLM-L6-cos-v1
+    +   encoder: paraphrase-multilingual-MiniLM-L12-v2
+        cross-encoder: cross-encoder/ms-marco-MiniLM-L-6-v2
+        model_directory: "~/.khoj/search/asymmetric/"
+    ```
+
+2. Regenerate your content index. For example, by opening [\/api/update?t=force](http://localhost:42110/api/update?t=force)
+
 ### Access Khoj on Mobile
-1. [Setup Khoj](#Setup) on your personal server. This can be any always-on machine, i.e an old computer, RaspberryPi(?) etc
+1. [Setup Khoj](/#/setup) on your personal server. This can be any always-on machine, i.e an old computer, RaspberryPi(?) etc
 2. [Install](https://tailscale.com/kb/installation/) [Tailscale](tailscale.com/) on your personal server and phone
 3. Open the Khoj web interface of the server from your phone browser. It should be `http://tailscale-ip-of-server:42110` or `http://name-of-server:42110` if you've setup [MagicDNS](https://tailscale.com/kb/1081/magicdns/)
 4. Click the [Add to Homescreen](https://developer.mozilla.org/en-US/docs/Web/Progressive_web_apps/Add_to_home_screen) button
 5. Enjoy exploring your notes, documents and images from your phone!
-![](https://github.com/khoj-ai/khoj/blob/master/docs/khoj_pwa_android.png?)
+![](./assets/khoj_pwa_android.png?)
 
-### Search across Different Languages
-  To search for notes in multiple, different languages, you can use a [multi-lingual model](https://www.sbert.net/docs/pretrained_models.html#multi-lingual-models).
-  For example, the [paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) supports [50+ languages](https://www.sbert.net/docs/pretrained_models.html#:~:text=we%20used%20the%20following%2050%2B%20languages), has good search quality and speed. To use it:
-  1. Manually update `search-type > asymmetric > encoder` to `paraphrase-multilingual-MiniLM-L12-v2` in your `~/.khoj/khoj.yml` file for now. See diff of `khoj.yml` below for illustration:
-     ```diff
-       asymmetric:
--        encoder: "sentence-transformers/multi-qa-MiniLM-L6-cos-v1"
-+        encoder: "paraphrase-multilingual-MiniLM-L12-v2"
-         cross-encoder: "cross-encoder/ms-marco-MiniLM-L-6-v2"
-         model_directory: "~/.khoj/search/asymmetric/"
-     ```
+### Use OpenAI Models for Search
+#### Setup
+1. Set `encoder-type`, `encoder` and `model-directory` under `asymmetric` and/or `symmetric` `search-type` in your `khoj.yml` (at `~/.khoj/khoj.yml`):
+    ```diff
+      asymmetric:
+    -   encoder: "sentence-transformers/multi-qa-MiniLM-L6-cos-v1"
+    +   encoder: text-embedding-ada-002
+    +   encoder-type: khoj.utils.models.OpenAI
+        cross-encoder: "cross-encoder/ms-marco-MiniLM-L-6-v2"
+    -   encoder-type: sentence_transformers.SentenceTransformer
+    -   model_directory: "~/.khoj/search/asymmetric/"
+    +   model-directory: null
+    ```
+2. [Setup your OpenAI API key in Khoj](/#/chat?id=setup)
+3. Restart Khoj server to generate embeddings. It will take longer than with the offline search models.
 
-  2. Regenerate your content index. For example, by opening [\/api/update?t=force](http://localhost:42110/api/update?t=force)
+#### Warnings
+  This configuration *uses an online model*
+  - It will **send all notes to OpenAI** to generate embeddings
+  - **All queries will be sent to OpenAI** when you search with Khoj
+  - You will be **charged by OpenAI** based on the total tokens processed
+  - It *requires an active internet connection* to search and index
 ### Bootstrap Khoj Search for Offline Usage later
-  You can bootstrap Khoj pre-emptively to run on machines that do not have internet access. An example use-case would be to run Khoj on an air-gapped machine.
-  Note: *Only search can currently run in fully offline mode, not chat.*
+You can bootstrap Khoj pre-emptively to run on machines that do not have internet access. An example use-case would be to run Khoj on an air-gapped machine.
+Note: *Only search can currently run in fully offline mode, not chat.*
 
-  - With Internet
-    1. Manually download the [asymmetric text](https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1), [symmetric text](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)and [image search](https://huggingface.co/sentence-transformers/clip-ViT-B-32) models from HuggingFace
-    2. Pip install khoj (and dependencies) in an associated virtualenv. E.g `python -m venv .venv && source .venv/bin/activate && pip install khoj-assistant`
-  - Without Internet
-    1. Copy each of the search models into their respective folders, `asymmetric`, `symmetric` and `image` under the `~/.khoj/search/` directory on the air-gapped machine
-    2. Copy the khoj virtual environment directory onto the air-gapped machine, activate the environment and start and khoj as normal. E.g `source .venv/bin/activate && khoj`
-
-
-## Miscellaneous
-### Set your OpenAI API key in Khoj
-If you want, Khoj can be configured to use OpenAI for search and chat.
-Add your OpenAI API to Khoj by using either of the two options below:
-  - Open your [Khoj settings](http://localhost:42110/config/processor/conversation), add your OpenAI API key, and click *Save*. Then go to your [Khoj settings](http://localhost:42110/config) and click `Configure`. This will refresh Khoj with your OpenAI API key.
-  - Set `openai-api-key` field under `processor.conversation` section in your `khoj.yml`[^1] to your [OpenAI API key](https://beta.openai.com/account/api-keys) and restart khoj:
-    ```diff
-      processor:
-        conversation:
-    -     openai-api-key: # "YOUR_OPENAI_API_KEY"
-    +     openai-api-key: sk-aaaaaaaaaaaaaaaaaaaaaaaahhhhhhhhhhhhhhhhhhhhhhhh
-          model: "text-davinci-003"
-          conversation-logfile: "~/.khoj/processor/conversation/conversation_logs.json"
-    ```
-
-!> **Warning**: This will enable Khoj to send your query and note(s) to OpenAI for processing
+- With Internet
+  1. Manually download the [asymmetric text](https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1), [symmetric text](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) and [image search](https://huggingface.co/sentence-transformers/clip-ViT-B-32) models from HuggingFace
+  2. Pip install khoj (and dependencies) in an associated virtualenv. E.g `python -m venv .venv && source .venv/bin/activate && pip install khoj-assistant`
+- Without Internet
+  1. Copy each of the search models into their respective folders, `asymmetric`, `symmetric` and `image` under the `~/.khoj/search/` directory on the air-gapped machine
+  2. Copy the khoj virtual environment directory onto the air-gapped machine, activate the environment and start and khoj as normal. E.g `source .venv/bin/activate && khoj`
diff --git a/docs/chat.md b/docs/chat.md
index f1799fad..1f30a197 100644
--- a/docs/chat.md
+++ b/docs/chat.md
@@ -1,16 +1,30 @@
 ### Khoj Chat
 #### Overview
 - Creates a personal assistant for you to inquire and engage with your notes
-- Uses [ChatGPT](https://openai.com/blog/chatgpt) and [Khoj search](#khoj-search). [Offline chat](https://github.com/khoj-ai/khoj/issues/201) is coming soon.
+- Uses [ChatGPT](https://openai.com/blog/chatgpt) and [Khoj search](/#/search). [Offline chat](https://github.com/khoj-ai/khoj/issues/201) is coming soon.
 - Supports multi-turn conversations with the relevant notes for context
 - Shows reference notes used to generate a response
-- **Note**: *Your query and top notes from khoj search will be sent to OpenAI for processing*
+
+!> **Warning**: This will enable Khoj to send your query and note(s) to OpenAI for processing
 
 #### Setup
-- [Setup your OpenAI API key in Khoj](#set-your-openai-api-key-in-khoj)
+- Get your [OpenAI API Key](https://platform.openai.com/account/api-keys)
+- Add your OpenAI API to Khoj by using either of the two options below:
+
+  - Open your [Khoj settings](http://localhost:42110/config/processor/conversation), add your OpenAI API key, and click *Save*. Then go to your [Khoj settings](http://localhost:42110/config) and click `Configure`. This will refresh Khoj with your OpenAI API key.
+
+  - Set `openai-api-key` field under `processor.conversation` section in your `khoj.yml` @ `~/.khoj/khoj.yml` to your [OpenAI API key](https://beta.openai.com/account/api-keys) and restart khoj:
+    ```diff
+      processor:
+        conversation:
+    -     openai-api-key: # "YOUR_OPENAI_API_KEY"
+    +     openai-api-key: sk-aaaaaaaaaaaaaaaaaaaaaaaahhhhhhhhhhhhhhhhhhhhhhhh
+          model: "text-davinci-003"
+          conversation-logfile: "~/.khoj/processor/conversation/conversation_logs.json"
+    ```
 
 #### Use
-1. Open [/chat](http://localhost:42110/chat)[^2]
+1. Open [/chat](http://localhost:42110/chat)
 2. Type your queries and see response by Khoj from your notes
 
 #### Demo
diff --git a/docs/development.md b/docs/development.md
index c4ba859a..eb15c417 100644
--- a/docs/development.md
+++ b/docs/development.md
@@ -40,8 +40,8 @@ git clone https://github.com/khoj-ai/khoj && cd khoj
 
 #### 2. Configure
 
-- **Required**: Update [docker-compose.yml](./docker-compose.yml) to mount your images, (org-mode or markdown) notes, PDFs and Github repositories
-- **Optional**: Edit application configuration in [khoj_docker.yml](./config/khoj_docker.yml)
+- **Required**: Update [docker-compose.yml](https://github.com/khoj-ai/khoj/blob/master/docker-compose.yml) to mount your images, (org-mode or markdown) notes, PDFs and Github repositories
+- **Optional**: Edit application configuration in [khoj_docker.yml](https://github.com/khoj-ai/khoj/blob/master/config/khoj_docker.yml)
 
 #### 3. Run
 
diff --git a/docs/emacs.md b/docs/emacs.md
index 3578a849..0ae0cea2 100644
--- a/docs/emacs.md
+++ b/docs/emacs.md
@@ -1,6 +1,6 @@
 Khoj LogoEmacs
 
-An AI personal assistance for your digital brain
+> An AI personal assistance for your digital brain
 
 Melpa Stable Badge Melpa Badge
@@ -100,7 +100,7 @@ Indexes the specified org files, directories. Sets up OpenAI API key for Khoj Ch
   E.g "When did I file my taxes last year?"
-  See [Khoj Chat](./README.md#khoj-chat) for more details
+  See [Khoj Chat](/#/chat) for more details
 ### Find Similar Entries
 This feature finds entries similar to the one you are currently on.
diff --git a/docs/index.html b/docs/index.html
index be0fdb3f..33ba0735 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -25,6 +25,7 @@
+