Commit graph

2382 commits

Author SHA1 Message Date
Debanjum Singh Solanky
dcdd1edde2 Update docs to show how to setup llama-cpp with Khoj
- How to pip install khoj to run offline chat on GPU
  After migration to llama-cpp-python more GPU types are supported but
  require build step so mention how
- New default offline chat model
- Where to get supported chat models from on HuggingFace
2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky
8ca39a436c Use llama.cpp for offline chat models
- Benefits of moving to llama-cpp-python from gpt4all:
  - Support for all GGUF format chat models
  - Support for AMD, Nvidia, Mac, Vulcan GPU machines (instead of just Vulcan, Mac)
  - Supports models with more capabilities like tools, schema
    enforcement, speculative ddecoding, image gen etc.
- Upgrade default chat model, prompt size, tokenizer for new supported
  chat models

- Load offline chat model when present on disk without requiring internet
  - Load model onto GPU if not disabled and device has GPU
  - Load model onto CPU if loading model onto GPU fails
  - Create helper function to check and load model from disk, when model
    glob is present on disk.

    `Llama.from_pretrained' needs internet to get repo info from
    HuggingFace. This isn't required, if the model is already downloaded

    Didn't find any existing HF or llama.cpp method that looked for model
    glob on disk without internet
2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky
0a7392f6ec Only add location to image prompt generator when location known 2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky
15ed208996 Use Khoj icon when Khoj web is installed on iOS as a PWA 2024-03-26 00:13:12 +05:30
sabaimran
f8eaff574f Change prod docker image to use jammy, rather than nvidia base image 2024-03-25 23:09:58 +05:30
sabaimran
2b5341f53a [Temp] Reduce to 1 gunicorn worker 2024-03-25 16:13:04 +05:30
sabaimran
991f500775 Do not pipe access/error logs to separate files. Flow to stdout/stderr 2024-03-25 16:12:39 +05:30
Debanjum
586654e2af
Allow directly reading web pages, even when SERP not enabled (#676)
### Overview
Khoj can now read website directly without needing to go through the search step first

### Details
- Parallelize simple webpage read and extractor
- Rename extract_content online results field to web pages
- Tweak prompts to extract information from webpages, online results
- Test select webpage as data source and extract web urls chat actors

- Render webpage read in chat response references on Web, Desktop apps
- Pass multiple webpages with their urls in online results context

- Support webpage command in chat API
- Add webpage chat command for read web pages requested by user
- Create chat actor for directly reading webpages based on user message
2024-03-24 16:25:25 +05:30
Debanjum Singh Solanky
9e52ae9e98 Time chat actor responses & chat api request start for perf analysis 2024-03-24 15:47:38 +05:30
Debanjum Singh Solanky
dabf71bc3c Render webpage read in chat response references on Web, Desktop apps 2024-03-24 15:47:38 +05:30
Debanjum Singh Solanky
a2e79c94be Pass multiple webpages with their urls in online results context
Previously even if MAX_WEBPAGES_TO_READ was > 1, only 1 extracted
content would ever be passed.

URL of the extracted webpage content wasn't passed to clients in
online results context. This limited them from being rendered
2024-03-24 15:47:38 +05:30
Debanjum Singh Solanky
71b6905008 Parallelize simple webpage read and extractor
Similar to what is being done with search_online with olostep
2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky
1167f6ddf9 Rename extract_content online results field to webpages 2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky
b22a7dae5d Tweak prompts to extract information from webpages, online results
- Show more of the truncated messages for debugging context
- Update Khoj personality prompt to encourage it to remember it's capabilities
2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky
85c62efca1 Test select webpage as data source and extract web urls chat actors 2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky
ad6f6bb0ed Support webpage command in chat API
- Fallback to use webpage when SERPER not setup and online command was
  attempted
- Do not stop responding if can't retrieve online results. Try to
  respond without the online context
2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky
a6b7432837 Add webpage chat command for read web pages requested by user
Update auto chat command inference prompt to show example of when to
use webpage chat command (i.e when url is directly provided in link)
2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky
7416ca9ae1 Lower the default gunicorn workers running on prod 2024-03-21 04:35:52 +05:30
Debanjum Singh Solanky
aed4313cfc Fix updating specific conversation by id from the chat API endpoint
- Use the conversation id of the retrieved conversation rather than the
  potentially unset conversation id passed via API
- await creating new chat when no chat id provided and no existing
  conversations exist
2024-03-21 02:46:52 +05:30
Debanjum Singh Solanky
ec6dc0daaf Bump up the default gunicorn workers running on prod 2024-03-20 22:56:09 +05:30
Debanjum Singh Solanky
62a83dc9bb Fix online search actor to use natural dates not after: operator
The recently added after: operator to online search actor was too
restrictive, gave worse results than when just use natural language
dates in search query
2024-03-15 21:50:14 +05:30
Debanjum Singh Solanky
4a1e6a2275 Convert deleted old user requests log line to debug from info 2024-03-15 20:50:10 +05:30
Debanjum Singh Solanky
9a068dadbf Fix extract questions prompt to use YYYY-MM-DD date filter format 2024-03-15 18:43:18 +05:30
Debanjum
bb2693c792
Improve Chat Session UX, Fix Login, Chat Message Truncation (#677)
### Improve
- Improve delete, rename chat session UX in Desktop, Web app
- Get conversation by title when requested via chat API

### Fix
- Allow unset locale for Google authenticating user
- Handle truncation when single long non-system chat message
- Fix setting chat session title from Desktop app
- Only create new chat on get if a specific chat id, slug isn't requested
2024-03-15 18:19:36 +05:30
Debanjum Singh Solanky
ecddf98430 Handle truncation when single long non-system chat message
Previously was assuming the system prompt is being always passed as
the first message. So expected there to be at least 2 messages in logs.

This broke chat actors querying with single long non system message.

A more robust way to extract system prompt is via the message role
instead
2024-03-15 15:58:39 +05:30
Debanjum Singh Solanky
ec0c35b7ed Improve delete, rename chat session UX in Desktop, Web app
- Ask for Confirmation before deleting chat session in Desktop, Web app
- Save chat session rename on hitting enter in title edit input box
- No need to flash previous conversation cleared status message
- Move chat session delete button after rename button in Desktop app
2024-03-15 15:58:19 +05:30
Debanjum Singh Solanky
924b1215ce Allow unset locale for Google authenticated user 2024-03-15 15:35:20 +05:30
Debanjum Singh Solanky
c792fa819f Fix setting chat session title from Desktop app
Pass auth headers to not have the chat session title update request fail
2024-03-15 15:19:20 +05:30
Debanjum Singh Solanky
c9e05dc184 Get conversation by title when requested via chat API 2024-03-15 12:31:50 +05:30
Debanjum Singh Solanky
cac26dafe3 Only create new chat on get if a specific chat id, slug isn't requested 2024-03-15 11:58:39 +05:30
Debanjum Singh Solanky
8cdfaf41ec Update project URLs to show on pypi project page 2024-03-15 04:03:39 +05:30
Debanjum Singh Solanky
08993ff109 Add new, remove old known chat models from model to prompt size map 2024-03-15 04:02:25 +05:30
Debanjum Singh Solanky
fba0338787 Release Khoj version 1.7.0 2024-03-15 00:08:32 +05:30
Debanjum Singh Solanky
6118d1ff57 Create chat actor for directly reading webpages based on user message
- Add prompt for the read webpages chat actor to extract, infer
  webpage links
- Make chat actor infer or extract webpage to read directly from user
  message
- Rename previous read_webpage function to more narrow
  read_webpage_at_url function
2024-03-14 14:58:37 +05:30
Debanjum
e549824fe2
Improve OpenAI Chat Actors and their prompts (#673)
### Major
- Enforce json mode response from OpenAI chat actors prev using string lists
- Use `gpt-4-turbo-preview' as default chat model, extract questions actor
- Make Khoj read khoj website to respond with accurate, up-to-date information about itself
- Dedupe query in notes prompt. Improve OAI chat actor, director tests

### Minor
- Test data source, output mode selector, web search query chat actors
- Improve notes search actor to always create a non-empty list of queries
- Construct available data sources, output modes as a bullet list in prompts
- Use consistent agent name across static and dynamic examples in prompts
- Add actor's name to extract questions prompt to improve context for guidance
2024-03-14 12:44:40 +05:30
Debanjum Singh Solanky
a1ce12296f Fix rendering online with note references post streaming chat response
Previously only the notes references would get rendered post response
streaming when when both online and notes references were used to
respond to the user's message
2024-03-14 03:40:40 +05:30
Debanjum Singh Solanky
1aeea3d854 Fix opening external links from confirmation dialog box on desktop app 2024-03-14 02:29:22 +05:30
Debanjum Singh Solanky
2e5cc49cb3 Enforce json response from OpenAI chat actors prev using string lists
- Allow passing response format type to OpenAI API via chat actors
- Convert in-context examples to use json objects instead of str lists
- Update actors outputting str list to request output to be json_object
  - OpenAI's json mode enforces the model to output valid json object
2024-03-14 01:22:33 +05:30
Debanjum Singh Solanky
7211eb9cf5 Default to gpt-4-turbo-preview for chat model, extract questions actor
GPT-4 is more expensive and generally less capable than gpt-4-turbo-preview
2024-03-14 01:22:33 +05:30
Debanjum Singh Solanky
dd883dc53a Dedupe query in notes prompt. Improve OAI chat actor, director tests
- Remove stale tests
- Improve tests to pass across gpt-3.5 and gpt-4-turbo
- The haiku creation director was failing because of duplicate query in
  instantiated prompt
2024-03-14 01:22:33 +05:30
Debanjum Singh Solanky
70b04d16c0 Test data source, output mode selector, web search query chat actors 2024-03-14 01:22:33 +05:30
Debanjum Singh Solanky
14682d5354 Improve notes search actor to always create a non-empty list of queries
- Remove the option for Notes search query generation actor to return
  no queries. Whether search should be performed is decided before,
  this step doesn't need to decide that
- But do not throw warning if the response is a list with no elements
2024-03-14 01:22:33 +05:30
Debanjum Singh Solanky
f5734826cb Improve pick data source prompt to look online for info about Khoj
- Add examples where user queries requesting information about Khoj
  results in the "online" data source being selected
- Add an example for "general" to select chat command prompt
2024-03-14 01:21:13 +05:30
Debanjum Singh Solanky
9a516bed47 Construct available data sources, output modes as a bullet list in prompts 2024-03-14 00:34:57 +05:30
Debanjum Singh Solanky
f28fb89af8 Use consistent agent name across static and dynamic examples in prompts
Previously the examples constructed from chat history used "Khoj" as
the agent's name but all 3 prompts using the func used static examples
with "AI:" as the pertinent agent's name
2024-03-14 00:34:57 +05:30
Debanjum Singh Solanky
f5793149a9 Add actor's name to extract questions prompt to improve context for guidance 2024-03-14 00:34:57 +05:30
Debanjum Singh Solanky
73ad444086 Make online search Actor read khoj.dev for docs, info about Khoj
- Add example to read khoj.dev website for up-to-date info to setup,
  use khoj, discover khoj features etc.
- Online search should use site: and after: google search operators
  - Show example of adding the after: date filter to google search
- Give local event lookup example using user's current location in
  query
- Remove unused select search content type prompt
2024-03-14 00:34:57 +05:30
Debanjum
3abe7ccb26
Improve Online Search Speed and Context (#670)
### Major
- Read web pages in parallel to improve chat response time
- Read web pages directly when Olostep proxy not setup
- Include search results & web page content in online context for chat response

### Minor
- Simplify, modularize and add type hints to online search functions
2024-03-11 22:16:30 +05:30
Debanjum Singh Solanky
dc86e44a07 Include search results & webpage content in online context for chat response
Previously if a web page was read for a sub-query, only the extracted
web page content was provided as context for the given sub-query. But
the google results themselves have relevant snippets. So include them
2024-03-11 18:41:02 +05:30
Debanjum Singh Solanky
d136a6be44 Simplify, modularize and add type hints to online search functions
- Simplify content arg to `extract_relevant_info' function. Validate,
  clean the content arg inside the `extract_relevant_info' function

- Extract `search_with_google' function outside the parent function
- Call the parent function a more appropriate `search_online' instead
  of `search_with_google'
- Simplify the `search_with_google' function using list comprehension.
  Drop empty search result fields from chat model context for response
  to reduce cost and response latency

- No need to show stacktrace when unable to read webpage, basic error
  is enough
- Add type hints to online search functions to catch issues with mypy
2024-03-11 18:41:02 +05:30