mirror of
https://github.com/khoj-ai/khoj.git
synced 2025-02-17 08:04:21 +00:00
Update references to all documentation to reflect instructions for managed service
- By default, assume the audience of this website is people looking to understand the feature offering of Khoj, and then people who are looking to self-host
This commit is contained in:
parent
7688228b9c
commit
3934633947
13 changed files with 29 additions and 87 deletions
|
@ -24,13 +24,13 @@
|
|||
</div>
|
||||
|
||||
## Introduction
|
||||
Welcome to the Khoj Docs! This is the best place to [get started](./setup.md) with Khoj. Unless otherwise mentioned, the docs only pertain to self-hosted Khoj instances.
|
||||
Welcome to the Khoj Docs! This is the best place to [get started](./setup.md) with Khoj. We have instructions on self-hosting, using Khoj with Emacs, Obsidian, and the Web, and more. We also include setup instructions for users on the hosted instance at [app.khoj.dev](https://app.khoj.dev).
|
||||
|
||||
- Khoj is an application to dynamically engage with your notes, documents and images. We support APIs for [semantic search](./search.md) and [chat](./chat.md).
|
||||
- It can be easily self-hosted and run on your consumer hardware or private cloud.
|
||||
- It provides an open source, AI personal assistant accessible from your [Emacs](./emacs.md), [Obsidian](./obsidian.md) or [Web browser](./web.md), or our [desktop app](https://khoj.dev/downloads).
|
||||
- It works with plaintext, markdown, [notion](./notion_integration.md), org-mode, pdf files and [github repositories](./github_integration.md)
|
||||
- It can support use with multiple users, so you and your family, friends, or team can have a shared assistance server. As the admin, you can configure the server settings at `/server/admin`.
|
||||
- It can support use with multiple users. If you're self-hosting, your family, friends, or team can have a shared assistance server. You'll find the suite of server admin settings at `/server/admin`.
|
||||
|
||||
## Quickstart
|
||||
[Click here](./setup.md) for full setup instructions
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
- Get Started
|
||||
- [Overview](README.md)
|
||||
- [Install](setup.md)
|
||||
- [Self-Host](setup.md)
|
||||
- [Demos](demos.md)
|
||||
- Use
|
||||
- [Features](features.md)
|
||||
|
|
|
@ -1,63 +1,11 @@
|
|||
|
||||
## Advanced Usage
|
||||
### Search across Different Languages
|
||||
|
||||
### Search across Different Languages (Self-Hosting)
|
||||
To search for notes in multiple, different languages, you can use a [multi-lingual model](https://www.sbert.net/docs/pretrained_models.html#multi-lingual-models).<br />
|
||||
For example, the [paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) supports [50+ languages](https://www.sbert.net/docs/pretrained_models.html#:~:text=we%20used%20the%20following%2050%2B%20languages), has good search quality and speed. To use it:
|
||||
1. Manually update `search-type > asymmetric > encoder` to `paraphrase-multilingual-MiniLM-L12-v2` in your `~/.khoj/khoj.yml` file for now. See diff of `khoj.yml` below for illustration:
|
||||
|
||||
```diff
|
||||
asymmetric:
|
||||
- encoder: sentence-transformers/multi-qa-MiniLM-L6-cos-v1
|
||||
+ encoder: paraphrase-multilingual-MiniLM-L12-v2
|
||||
cross-encoder: cross-encoder/ms-marco-MiniLM-L-6-v2
|
||||
model_directory: "~/.khoj/search/asymmetric/"
|
||||
```
|
||||
|
||||
2. Regenerate your content index. For example, by opening [\<khoj-url\>/api/update?t=force](http://localhost:42110/api/update?t=force)
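The regeneration step above can also be scripted. The following is a minimal sketch, assuming the server is reachable without authentication at the default `http://localhost:42110` address; the `/api/update?t=force` path is taken as-is from the step above, while the function names `build_force_update_url` and `regenerate_index` are hypothetical helpers, not part of Khoj.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen


def build_force_update_url(khoj_url: str = "http://localhost:42110") -> str:
    """Build the index regeneration URL quoted in the docs for a given server."""
    # Path and query parameter are copied from the documented URL, not from the Khoj source.
    return urljoin(khoj_url, "/api/update") + "?" + urlencode({"t": "force"})


def regenerate_index(khoj_url: str = "http://localhost:42110", timeout: float = 600.0) -> int:
    """Request a forced re-index and return the HTTP status code.

    Assumes an unauthenticated, self-hosted server; re-encoding can take a while,
    hence the generous (hypothetical) default timeout.
    """
    with urlopen(build_force_update_url(khoj_url), timeout=timeout) as response:
        return response.status
```

Calling `regenerate_index()` while the server is running should have the same effect as opening the link in a browser.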
|
||||
|
||||
### Access Khoj on Mobile
|
||||
1. [Set up Khoj](/#/setup) on your personal server. This can be any always-on machine, e.g. an old computer, Raspberry Pi(?) etc.
|
||||
2. [Install](https://tailscale.com/kb/installation/) [Tailscale](https://tailscale.com/) on your personal server and phone
|
||||
3. Open the Khoj web interface of the server from your phone browser.<br /> It should be `http://tailscale-ip-of-server:42110` or `http://name-of-server:42110` if you've setup [MagicDNS](https://tailscale.com/kb/1081/magicdns/)
|
||||
4. Click the [Add to Homescreen](https://developer.mozilla.org/en-US/docs/Web/Progressive_web_apps/Add_to_home_screen) button
|
||||
5. Enjoy exploring your notes, documents and images from your phone!
|
||||
|
||||
![](./assets/khoj_pwa_android.png?)
|
||||
|
||||
### Use OpenAI Models for Search
|
||||
#### Setup
|
||||
1. Set `encoder-type`, `encoder` and `model-directory` under `asymmetric` and/or `symmetric` `search-type` in your `khoj.yml` (at `~/.khoj/khoj.yml`):
|
||||
```diff
|
||||
asymmetric:
|
||||
- encoder: "sentence-transformers/multi-qa-MiniLM-L6-cos-v1"
|
||||
+ encoder: text-embedding-ada-002
|
||||
+ encoder-type: khoj.utils.models.OpenAI
|
||||
cross-encoder: "cross-encoder/ms-marco-MiniLM-L-6-v2"
|
||||
- encoder-type: sentence_transformers.SentenceTransformer
|
||||
- model_directory: "~/.khoj/search/asymmetric/"
|
||||
+ model-directory: null
|
||||
```
|
||||
2. [Setup your OpenAI API key in Khoj](/#/chat?id=setup)
|
||||
3. Restart Khoj server to generate embeddings. It will take longer than with the offline search models.
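The config change in step 1 can be expressed as a small transformation on the parsed `khoj.yml` contents. This is a sketch only, assuming the file has already been loaded into a plain dictionary (for example with a YAML library of your choice); the helper name `use_openai_encoder` is hypothetical, and the keys and values are the ones shown in the diff above.

```python
import copy


def use_openai_encoder(search_config: dict, search_type: str = "asymmetric") -> dict:
    """Return a copy of the search-type config switched to the OpenAI encoder."""
    updated = copy.deepcopy(search_config)  # leave the caller's config untouched
    section = updated.setdefault(search_type, {})
    section["encoder"] = "text-embedding-ada-002"
    section["encoder-type"] = "khoj.utils.models.OpenAI"
    # The diff removes the underscore-style key and sets the hyphenated one to null.
    section.pop("model_directory", None)
    section["model-directory"] = None
    return updated
```

The `cross-encoder` entry is left as it was, matching the diff. Write the result back to `~/.khoj/khoj.yml` before restarting the server.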
|
||||
|
||||
#### Warnings
|
||||
This configuration *uses an online model*
|
||||
- It will **send all notes to OpenAI** to generate embeddings
|
||||
- **All queries will be sent to OpenAI** when you search with Khoj
|
||||
- You will be **charged by OpenAI** based on the total tokens processed
|
||||
- It *requires an active internet connection* to search and index
|
||||
|
||||
### Bootstrap Khoj Search for Offline Usage later
|
||||
|
||||
You can bootstrap Khoj pre-emptively to run on machines that do not have internet access. An example use-case would be to run Khoj on an air-gapped machine.
|
||||
Note: *Only search can currently run in fully offline mode, not chat.*
|
||||
|
||||
- With Internet
|
||||
1. Manually download the [asymmetric text](https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1), [symmetric text](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) and [image search](https://huggingface.co/sentence-transformers/clip-ViT-B-32) models from HuggingFace
|
||||
2. Pip install khoj (and dependencies) in an associated virtualenv. E.g `python -m venv .venv && source .venv/bin/activate && pip install khoj-assistant`
|
||||
- Without Internet
|
||||
1. Copy each of the search models into their respective folders, `asymmetric`, `symmetric` and `image` under the `~/.khoj/search/` directory on the air-gapped machine
|
||||
2. Copy the khoj virtual environment directory onto the air-gapped machine, activate the environment and start khoj as normal. E.g. `source .venv/bin/activate && khoj`
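The model-copy step on the air-gapped machine could be scripted roughly as follows. This is a sketch under stated assumptions: each downloaded model is a local directory, and the three folder names under `~/.khoj/search/` are the ones listed above. The helper `place_models` and its arguments are hypothetical.

```python
import shutil
from pathlib import Path

# Folder names under ~/.khoj/search/, as listed in the steps above.
SEARCH_TYPES = ("asymmetric", "symmetric", "image")


def place_models(downloaded: dict[str, Path], search_root: Path) -> list[Path]:
    """Copy each downloaded model directory into its search-type folder."""
    placed = []
    for search_type in SEARCH_TYPES:
        target = search_root / search_type
        # copytree creates missing parent folders; dirs_exist_ok allows re-running.
        shutil.copytree(downloaded[search_type], target, dirs_exist_ok=True)
        placed.append(target)
    return placed
```

A typical call would pass `Path.home() / ".khoj" / "search"` as `search_root`, with `downloaded` pointing at wherever the models were copied onto the machine.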
|
||||
1. Manually update the search config in server's admin settings page. Go to [the search config](http://localhost:42110/server/admin/database/searchmodelconfig/). Either create a new one, if none exists, or update the existing one. Set the bi_encoder to `sentence-transformers/multi-qa-MiniLM-L6-cos-v1` and the cross_encoder to `cross-encoder/ms-marco-MiniLM-L-6-v2`.
|
||||
2. Regenerate your content index from all the relevant clients. This step is very important, as you'll need to re-encode all your content with the new model.
|
||||
|
||||
### Query Filters
|
||||
|
||||
|
|
17
docs/chat.md
|
@ -5,9 +5,9 @@
|
|||
- Supports multi-turn conversations with the relevant notes for context
|
||||
- Shows reference notes used to generate a response
|
||||
|
||||
### Setup
|
||||
### Setup (Self-Hosting)
|
||||
#### Offline Chat
|
||||
Offline chat stays completely private and works without internet. But it is slower, lower quality and more compute intensive.
|
||||
Offline chat stays completely private and works without internet using open-source models.
|
||||
|
||||
> **System Requirements**:
|
||||
> - Minimum 8 GB RAM. Recommended: **16 GB VRAM**
|
||||
|
@ -15,9 +15,10 @@ Offline chat stays completely private and works without internet. But it is slow
|
|||
> - A CPU supporting [AVX or AVX2 instructions](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) is required
|
||||
> - A Mac M1+ or [Vulkan-supported GPU](https://vulkan.gpuinfo.org/) should significantly speed up chat response times
|
||||
|
||||
- Open your [Khoj settings](http://localhost:42110/config/) and click *Enable* on the Offline Chat card
|
||||
1. Open your [Khoj offline settings](http://localhost:42110/server/admin/database/offlinechatprocessorconversationconfig/) and click *Enable* on the Offline Chat configuration.
|
||||
2. Open your [Chat model options](http://localhost:42110/server/admin/database/chatmodeloptions/) and add a new option for the offline chat model you want to use. Make sure to use `Offline` as its type. We currently only support offline models that use the [Llama chat prompt](https://replicate.com/blog/how-to-prompt-llama#wrap-user-input-with-inst-inst-tags) format. We recommend using `mistral-7b-instruct-v0.1.Q4_0.gguf`.
|
||||
|
||||
![Configure offline chat](https://user-images.githubusercontent.com/6413477/257021364-8a2029f5-dc21-4de8-9af9-9ba6100d695c.mp4 ':include :type=mp4')
|
||||
!> **Note**: Offline chat is not supported for a multi-user scenario. The host machine will encounter segmentation faults if multiple users try to use offline chat at the same time.
|
||||
|
||||
#### Online Chat
|
||||
Online chat requires internet to use ChatGPT but is faster, higher quality and less compute intensive.
|
||||
|
@ -25,14 +26,12 @@ Online chat requires internet to use ChatGPT but is faster, higher quality and l
|
|||
!> **Warning**: This will enable Khoj to send your chat queries and query relevant notes to OpenAI for processing
|
||||
|
||||
1. Get your [OpenAI API Key](https://platform.openai.com/account/api-keys)
|
||||
2. Open your [Khoj Online Chat settings](http://localhost:42110/config/processor/conversation), add your OpenAI API key, and click *Save*. Then go to your [Khoj settings](http://localhost:42110/config) and click `Configure`. This will refresh Khoj with your OpenAI API key.
|
||||
|
||||
![Configure online chat](https://user-images.githubusercontent.com/6413477/256998908-ac26e55e-13a2-45fb-9348-3b90a62f7687.mp4 ':include :type=mp4')
|
||||
|
||||
2. Open your [Khoj Online Chat settings](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/). Add a new setting with your OpenAI API key, and click *Save*. Only one configuration will be used, so make sure that's the only one you have.
|
||||
3. Open your [Chat model options](http://localhost:42110/server/admin/database/chatmodeloptions/) and add a new option for the OpenAI chat model you want to use. Make sure to use `OpenAI` as its type.
|
||||
|
||||
### Use
|
||||
1. Open Khoj Chat
|
||||
- **On Web**: Open [/chat](http://localhost:42110/chat) in your web browser
|
||||
- **On Web**: Open [/chat](https://app.khoj.dev/chat) in your web browser
|
||||
- **On Obsidian**: Search for *Khoj: Chat* in the [Command Palette](https://help.obsidian.md/Plugins/Command+palette)
|
||||
- **On Emacs**: Run `M-x khoj <user-query>`
|
||||
2. Enter your queries to chat with Khoj. Use [slash commands](#commands) and [query filters](./advanced.md#query-filters) to change what Khoj uses to respond
|
||||
|
|
|
@ -28,5 +28,5 @@ For the Linux installation, you have to have `glibc` version 2.35 or higher. You
|
|||
If you decide you want to uninstall the application, you can uninstall it like any other application on your system. For example, on MacOS, you can drag the application to the trash. On Windows, you can uninstall it from the `Add or Remove Programs` menu. On Linux, you can uninstall it with `sudo apt remove khoj`.
|
||||
|
||||
In addition to that, you might want to `rm -rf` the following directories:
|
||||
- `~/.khoj`
|
||||
- `~/.cache/gpt4all`
|
||||
- `~/.khoj`
|
||||
- `~/.cache/gpt4all`
|
||||
|
|
|
@ -25,13 +25,7 @@ pip install -e .'[dev]'
|
|||
khoj -vv
|
||||
```
|
||||
2. Configure Khoj
|
||||
- **Via the Settings UI**: Add files and directories to index in the [Khoj settings](http://localhost:42110/config) UI once Khoj has started up. Once you've saved all your settings, click `Configure`.
|
||||
- **Manually**:
|
||||
- Copy the `config/khoj_sample.yml` to `~/.khoj/khoj.yml`
|
||||
- Set `input-files` or `input-filter` in each relevant `content-type` section of `~/.khoj/khoj.yml`
|
||||
- Set `input-directories` field in `image` `content-type` section
|
||||
- Delete `content-type` and `processor` sub-section(s) irrelevant for your use-case
|
||||
- Restart khoj
|
||||
- **Via the Desktop application**: Add files, directories to index using the settings page of your desktop application. Click "Save" to immediately trigger indexing.
|
||||
|
||||
Note: After configuration, wait for khoj to load the ML models, generate embeddings and expose the API to query the notes, images, documents etc. specified in the config YAML
|
||||
|
||||
|
|
|
@ -69,7 +69,7 @@ Indexes your org-agenda files, by default.
|
|||
khoj-openai-api-key "YOUR_OPENAI_API_KEY")) ; required to enable chat
|
||||
```
|
||||
|
||||
### With [Straight.el](https://github.com/raxod502/straight.el)
|
||||
### With [ Straight.el](https://github.com/raxod502/straight.el)
|
||||
Add below snippet to your Emacs config file.
|
||||
Indexes the specified org files, directories. Sets up OpenAI API key for Khoj Chat
|
||||
|
||||
|
|
|
@ -4,11 +4,11 @@ The Github integration allows you to index as many repositories as you want. It'
|
|||
|
||||
# Configure your settings
|
||||
|
||||
1. Go to [http://localhost:42110/config](http://localhost:42110/config) and enter in settings for the data sources you want to index. You'll have to specify the file paths.
|
||||
1. Go to [https://app.khoj.dev/config](https://app.khoj.dev/config) and enter in settings for the data sources you want to index. You'll have to specify the file paths.
|
||||
|
||||
## Use the Github plugin
|
||||
|
||||
1. Generate a [classic PAT (personal access token)](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens) from [Github](https://github.com/settings/tokens) with at least the `repo` and `admin:org` scopes.
|
||||
2. Navigate to [http://localhost:42110/config/content-source/github](http://localhost:42110/config/content-source/github) to configure your Github settings. Enter in your PAT, along with details for each repository you want to index.
|
||||
2. Navigate to [https://app.khoj.dev/config/content-source/github](https://app.khoj.dev/config/content-source/github) to configure your Github settings. Enter in your PAT, along with details for each repository you want to index.
|
||||
3. Click `Save`. Go back to the settings page and click `Configure`.
|
||||
4. Go to [http://localhost:42110/](http://localhost:42110/) and start searching!
|
||||
4. Go to [https://app.khoj.dev/](https://app.khoj.dev/) and start searching!
|
||||
|
|
|
@ -17,6 +17,7 @@
|
|||
repo: 'https://github.com/khoj-ai/khoj',
|
||||
loadSidebar: true,
|
||||
themeColor: '#c2a600',
|
||||
auto2top: true,
|
||||
// coverpage: true,
|
||||
}
|
||||
</script>
|
||||
|
|
|
@ -8,7 +8,7 @@ We haven't setup a fancy integration with OAuth yet, so this integration still r
|
|||
![setup_new_integration](https://github.com/khoj-ai/khoj/assets/65192171/b056e057-d4dc-47dc-aad3-57b59a22c68b)
|
||||
3. Share all the workspaces that you want to integrate with the Khoj integration you just made in the previous step
|
||||
![enable_workspace](https://github.com/khoj-ai/khoj/assets/65192171/98290303-b5b8-4cb0-b32c-f68c6923a3d0)
|
||||
4. In the first step, you generated an API key. Use the newly generated API Key in your Khoj settings, by default at http://localhost:42110/config/content-source/notion. Click `Save`.
|
||||
5. Click `Configure` in http://localhost:42110/config to index your Notion workspace(s).
|
||||
4. In the first step, you generated an API key. Use the newly generated API Key in your Khoj settings, by default at https://app.khoj.dev/config/content-source/notion. Click `Save`.
|
||||
5. Click `Configure` in https://app.khoj.dev/config to index your Notion workspace(s).
|
||||
|
||||
That's it! You should be ready to start searching and chatting. Make sure you've configured your OpenAI API Key for chat.
|
||||
|
|
|
@ -1,7 +1,7 @@
|
|||
## Khoj Search
|
||||
### Use
|
||||
1. Open Khoj Search
|
||||
- **On Web**: Open <http://localhost:42110/> in your web browser
|
||||
- **On Web**: Open <https://app.khoj.dev/> in your web browser
|
||||
- **On Obsidian**: Click the *Khoj search* icon 🔎 on the [Ribbon](https://help.obsidian.md/User+interface/Workspace/Ribbon) or Search for *Khoj: Search* in the [Command Palette](https://help.obsidian.md/Plugins/Command+palette)
|
||||
- **On Emacs**: Run `M-x khoj <user-query>`
|
||||
2. Query using natural language to find relevant entries from your knowledge base. Use [query filters](./advanced.md#query-filters) to limit entries to search
|
||||
|
|
|
@ -196,8 +196,8 @@ docker-compose up --build
|
|||
1. (Optional) Hit `Ctrl-C` in the terminal running the khoj server to stop it
|
||||
2. Delete the khoj directory in your home folder (i.e `~/.khoj` on Linux, Mac or `C:\Users\<your-username>\.khoj` on Windows)
|
||||
5. You might want to `rm -rf` the following directories:
|
||||
- `~/.khoj`
|
||||
- `~/.cache/gpt4all`
|
||||
- `~/.khoj`
|
||||
- `~/.cache/gpt4all`
|
||||
3. Uninstall the khoj server with `pip uninstall khoj-assistant`
|
||||
4. (Optional) Uninstall khoj.el or the khoj obsidian plugin in the standard way on Emacs, Obsidian
|
||||
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
# Telemetry
|
||||
# Telemetry (self-hosting)
|
||||
|
||||
We collect some high level, anonymized metadata about usage of Khoj. This includes:
|
||||
- Client (Web, Emacs, Obsidian)
|
||||
|
|