sij/khoj

mirror of https://github.com/khoj-ai/khoj.git synced 2024-12-04 04:43:01 +01:00

Debanjum Singh Solanky 58c8068079 Upgrade default offline chat model to llama 3.1

2024-08-20 09:28:56 -07:00

4.5 KiB

Raw Blame History

sidebar_position
2

Chat

You can configure Khoj to chat with you about anything. When relevant, it'll use any notes or documents you shared with it to respond. It acts as an excellent research assistant, search engine, or personal tutor.

Overview

Creates a personal assistant for you to inquire and engage with your notes or online information as needed
You can choose to use Online or Offline Chat depending on your requirements
Supports multi-turn conversations with the relevant notes for context
Shows reference notes used to generate a response

Setup (Self-Hosting)

Offline Chat

Offline chat stays completely private and can work without internet using open-source models.

System Requirements:

Minimum 8 GB RAM. Recommend 16Gb VRAM

Minimum 5 GB of Disk available

A CPU supporting AVX or AVX2 instructions is required

An Nvidia, AMD GPU or a Mac M1+ machine would significantly speed up chat response times

Open your Khoj offline settings and click Enable on the Offline Chat configuration.
Open your Chat model options settings and add any GGUF chat model to use for offline chat. Make sure to use Offline as its type. For a balanced chat model that runs well on standard consumer hardware we recommend using Llama 3.1 by Meta by default.

:::tip[Note] Offline chat is not supported for a multi-user scenario. The host machine will encounter segmentation faults if multiple users try to use offline chat at the same time. :::

Online Chat

Online chat requires internet to use ChatGPT but is faster, higher quality and less compute intensive.

:::danger[Warning] This will enable Khoj to send your chat queries and query relevant notes to OpenAI for processing. :::

Get your OpenAI API Key
Open your Khoj Online Chat settings. Add a new setting with your OpenAI API key, and click Save. Only one configuration will be used, so make sure that's the only one you have.
Open your Chat model options and add a new option for the OpenAI chat model you want to use. Make sure to use OpenAI as its type.

Use

Open Khoj Chat
- On Web: Open /chat in your web browser
- On Obsidian: Search for Khoj: Chat in the Command Palette
- On Emacs: Run M-x khoj <user-query>
Enter your queries to chat with Khoj. Use slash commands and query filters to change what Khoj uses to respond

Details

Your query is used to retrieve the most relevant notes, if any, using Khoj search
These notes, the last few messages and associated metadata is passed to the enabled chat model along with your query to generate a response

Conversation File Filters

You can use conversation file filters to limit the notes used in the chat response. To do so, use the left panel in the web UI. Alternatively, you can also use query filters to limit the notes used in the chat response.

Commands

Slash commands allows you to change what Khoj uses to respond to your query

/notes: Limit chat to only respond using your notes, not just Khoj's general world knowledge as reference
/general: Limit chat to only respond using Khoj's general world knowledge, not using your notes as reference
/default: Allow chat to respond using your notes or it's general knowledge as reference. It's the default behavior when no slash command is used
/online: Use online information and incorporate it in the prompt to the LLM to send you a response.
/image: Generate an image in response to your query.
/help: Use /help to get all available commands and general information about Khoj
/summarize: Can be used to summarize 1 selected file filter for that conversation. Refer to File Summarization for details.

4.5 KiB Raw Blame History