Mirror of https://github.com/khoj-ai/khoj.git, synced 2024-11-23 15:38:55 +01:00
Update docs to mention using Llama 3.1 and 20K max prompt size for it
Update stale credits to better reflect bigger open source dependencies
Commit bdb81260ac (parent 238bc11a50)
5 changed files with 13 additions and 12 deletions
```diff
@@ -26,11 +26,11 @@ Using LiteLLM with Khoj makes it possible to turn any LLM behind an API into you
     - Api Key: `any string`
     - Api Base Url: **URL of your Openai Proxy API**
 4. Create a new [Chat Model Option](http://localhost:42110/server/admin/database/chatmodeloptions/add) on your Khoj admin panel.
-    - Name: `llama3` (replace with the name of your local model)
+    - Name: `llama3.1` (replace with the name of your local model)
     - Model Type: `Openai`
     - Openai Config: `<the proxy config you created in step 3>`
-    - Max prompt size: `2000` (replace with the max prompt size of your model)
-    - Tokenizer: *Do not set for OpenAI, mistral, llama3 based models*
+    - Max prompt size: `20000` (replace with the max prompt size of your model)
+    - Tokenizer: *Do not set for OpenAI, Mistral, Llama3 based models*
 5. Create a new [Server Chat Setting](http://localhost:42110/server/admin/database/serverchatsettings/add/) on your Khoj admin panel
     - Default model: `<name of chat model option you created in step 4>`
     - Summarizer model: `<name of chat model option you created in step 4>`
```
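To sanity-check the proxy before wiring it into Khoj, you can hit its OpenAI-compatible chat endpoint directly. A minimal sketch, assuming a LiteLLM proxy on its default port `4000` and the `llama3.1` model name from step 4; adjust the URL and model to your setup:

```bash
# Smoke-test the OpenAI-compatible proxy Khoj will talk to.
# Port 4000 is LiteLLM's default; any string works as the API key.
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer any-string" \
  -d '{
        "model": "llama3.1",
        "messages": [{"role": "user", "content": "Say hello"}]
      }'
```

If this returns a JSON completion, the same base URL, API key, and model name should work in the Khoj admin panel.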
```diff
@@ -19,10 +19,10 @@ LM Studio can expose an [OpenAI API compatible server](https://lmstudio.ai/docs/
     - Api Key: `any string`
     - Api Base Url: `http://localhost:1234/v1/` (default for LMStudio)
 4. Create a new [Chat Model Option](http://localhost:42110/server/admin/database/chatmodeloptions/add) on your Khoj admin panel.
-    - Name: `llama3` (replace with the name of your local model)
+    - Name: `llama3.1` (replace with the name of your local model)
     - Model Type: `Openai`
     - Openai Config: `<the proxy config you created in step 3>`
-    - Max prompt size: `2000` (replace with the max prompt size of your model)
+    - Max prompt size: `20000` (replace with the max prompt size of your model)
     - Tokenizer: *Do not set for OpenAI, mistral, llama3 based models*
 5. Create a new [Server Chat Setting](http://localhost:42110/server/admin/database/serverchatsettings/add/) on your Khoj admin panel
     - Default model: `<name of chat model option you created in step 4>`
```
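The Name field must match the model identifier the local server reports. As a quick check, assuming LM Studio's default address from step 3, you can list the models its OpenAI-compatible server exposes:

```bash
# List the model identifiers served by LM Studio's local server; copy the
# exact id into the Chat Model Option's Name field on the Khoj admin panel.
curl http://localhost:1234/v1/models
```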
````diff
@@ -17,17 +17,17 @@ Ollama exposes a local [OpenAI API compatible server](https://github.com/ollama/
 1. Setup Ollama: https://ollama.com/
 2. Start your preferred model with Ollama. For example,
     ```bash
-    ollama run llama3
+    ollama run llama3.1
     ```
 3. Create a new [OpenAI Processor Conversation Config](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/add) on your Khoj admin panel
     - Name: `ollama`
     - Api Key: `any string`
     - Api Base Url: `http://localhost:11434/v1/` (default for Ollama)
 4. Create a new [Chat Model Option](http://localhost:42110/server/admin/database/chatmodeloptions/add) on your Khoj admin panel.
-    - Name: `llama3` (replace with the name of your local model)
+    - Name: `llama3.1` (replace with the name of your local model)
     - Model Type: `Openai`
     - Openai Config: `<the ollama config you created in step 3>`
-    - Max prompt size: `1000` (replace with the max prompt size of your model)
+    - Max prompt size: `20000` (replace with the max prompt size of your model)
 5. Create a new [Server Chat Setting](http://localhost:42110/server/admin/database/serverchatsettings/add/) on your Khoj admin panel
     - Default model: `<name of chat model option you created in step 4>`
     - Summarizer model: `<name of chat model option you created in step 4>`
````
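To confirm Ollama's OpenAI-compatible endpoint is actually serving the model before configuring Khoj, a minimal sketch assuming the defaults above (Ollama ignores the API key, which is why any string suffices in step 3):

```bash
# Verify the model pulled in step 2 answers over Ollama's OpenAI-compatible API.
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "llama3.1",
        "messages": [{"role": "user", "content": "Say hello"}]
      }'
```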
```diff
@@ -25,7 +25,7 @@ Offline chat stays completely private and can work without internet using open-s
 > - An Nvidia, AMD GPU or a Mac M1+ machine would significantly speed up chat response times
 
 1. Open your [Khoj offline settings](http://localhost:42110/server/admin/database/offlinechatprocessorconversationconfig/) and click *Enable* on the Offline Chat configuration.
-2. Open your [Chat model options settings](http://localhost:42110/server/admin/database/chatmodeloptions/) and add any [GGUF chat model](https://huggingface.co/models?library=gguf) to use for offline chat. Make sure to use `Offline` as its type. For a balanced chat model that runs well on standard consumer hardware we recommend using [Llama 3.1 by Meta](https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF) by default.
+2. Open your [Chat model options settings](http://localhost:42110/server/admin/database/chatmodeloptions/) and add any [GGUF chat model](https://huggingface.co/models?library=gguf) to use for offline chat. Make sure to use `Offline` as its type. For a balanced chat model that runs well on standard consumer hardware we recommend using [Llama 3.1 by Meta](https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF) by default. For machines with no or small GPU we recommend using [Gemma 2 2B](https://huggingface.co/bartowski/gemma-2-2b-it-GGUF) or [Phi 3.5 mini](https://huggingface.co/bartowski/Phi-3.5-mini-instruct-GGUF)
 
 
 :::tip[Note]
```
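Khoj should download the configured GGUF model on first use. If you'd rather pre-fetch it, here is a hedged sketch using the Hugging Face CLI; the `Q4_K_M` quant filename is an assumption, so check the model page for the file that fits your hardware:

```bash
# Optional: warm the local cache with the recommended GGUF ahead of time.
pip install -U "huggingface_hub[cli]"
huggingface-cli download bartowski/Meta-Llama-3.1-8B-Instruct-GGUF \
  Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf
```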
```diff
@@ -5,9 +5,10 @@ sidebar_position: 4
 # Credits
 Many Open Source projects are used to power Khoj. Here's a few of them:
 
-- [Multi-QA MiniLM Model](https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1), [All MiniLM Model](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) for Text Search. See [SBert Documentation](https://www.sbert.net/examples/applications/retrieve_rerank/README.html)
-- [OpenAI CLIP Model](https://github.com/openai/CLIP) for Image Search. See [SBert Documentation](https://www.sbert.net/examples/applications/image-search/README.html)
+- [Llama.cpp](https://github.com/ggerganov/llama.cpp) to chat with local LLM
+- [SentenceTransformer](https://www.sbert.net/examples/applications/retrieve_rerank/README.html) for Text Search
+- [HuggingFace](https://huggingface.co/) for hosting open-source chat and search models
 - Charles Cave for [OrgNode Parser](http://members.optusnet.com.au/~charles57/GTD/orgnode.html)
 - [Org.js](https://mooz.github.io/org-js/) to render Org-mode results on the Web interface
 - [Markdown-it](https://github.com/markdown-it/markdown-it) to render Markdown results on the Web interface
-- [Llama.cpp](https://github.com/ggerganov/llama.cpp) to chat with local LLM
+- [Katex](https://katex.org/) to render math
```