Commit graph

2472 commits

Author SHA1 Message Date
sabaimran
6cb38d92c0 Specify version of pypi gh publish action 2024-03-28 12:47:31 +05:30
sabaimran
56da96b2e9 Increase minimum python required in the pyproject, use python 3.11 for building the wheel in the workflow 2024-03-28 12:19:07 +05:30
sabaimran
22014cfcbc
Merge pull request #682 from khoj-ai/features/full-integration-agents
Add support for custom agents configured by the server admin
2024-03-27 23:27:15 -07:00
sabaimran
17776daed8 Merge from master 2024-03-28 11:38:29 +05:30
sabaimran
32a505d841 Revert to using the nvidia base image for the next release 2024-03-28 11:37:37 +05:30
sabaimran
51d0c9b8b0 Add telemetry to keep state of new agents being used 2024-03-28 11:37:24 +05:30
sabaimran
46ebc55e2b Add a top tab for agents 2024-03-28 11:37:01 +05:30
sabaimran
8397187231 Use default agent when creating a new conversation without agent specified 2024-03-28 11:36:27 +05:30
Debanjum Singh Solanky
8c4ef9270d Fix using format string for logger in chat API endpoint 2024-03-27 16:31:22 +05:30
Debanjum Singh Solanky
4912c0ee30 Use extract queries actor to improve notes search with offline chat
Previously we skipped the extract questions step for offline chat
because the default offline chat model wasn't good enough at outputting
proper JSON to justify the time it took to extract questions.

The new default offline chat model outputs JSON much more reliably,
including date filters, so the extract questions step is now useful
enough to justify its impact on latency
2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky
1ebd5c3648 Rename GPT4AllChatProcessor* to OfflineChatProcessor Config, Model 2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky
2a0b943bb4 Use Hermes-2-Pro as default offline chat model in khoj.yml 2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky
dcdd1edde2 Update docs to show how to set up llama-cpp with Khoj
- How to pip install khoj to run offline chat on GPU
  After migration to llama-cpp-python more GPU types are supported but
  they require a build step, so mention how
- New default offline chat model
- Where to get supported chat models from on HuggingFace
2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky
8ca39a436c Use llama.cpp for offline chat models
- Benefits of moving to llama-cpp-python from gpt4all:
  - Support for all GGUF format chat models
  - Support for AMD, Nvidia, Mac, Vulkan GPU machines (instead of just Vulkan, Mac)
  - Supports models with more capabilities like tools, schema
    enforcement, speculative decoding, image gen etc.
- Upgrade default chat model, prompt size, tokenizer for new supported
  chat models

- Load offline chat model when present on disk without requiring internet
  - Load model onto GPU if not disabled and device has GPU
  - Load model onto CPU if loading model onto GPU fails
  - Create helper function to check and load model from disk, when model
    glob is present on disk.

    `Llama.from_pretrained` needs internet to get repo info from
    HuggingFace. This isn't required if the model is already downloaded

    Didn't find any existing HF or llama.cpp method that looked for model
    glob on disk without internet
2024-03-26 22:33:01 +05:30
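The load order described in the commit above (look for the model file on disk, try the GPU, fall back to the CPU) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Khoj's actual code: `find_model_on_disk`, `load_model` and the injected `loader` callable are hypothetical names, and the real llama-cpp-python loading call is deliberately left out.

```python
from pathlib import Path
from typing import Callable, Optional


def find_model_on_disk(model_dir: Path, pattern: str) -> Optional[Path]:
    """Return the first file in model_dir matching the glob pattern, if any.

    Hypothetical helper: it only touches the local filesystem, so it
    works without internet access.
    """
    matches = sorted(model_dir.glob(pattern))
    return matches[0] if matches else None


def load_model(
    path: Path,
    loader: Callable[[Path, bool], object],
    use_gpu: bool = True,
) -> object:
    """Try loading onto the GPU when allowed; fall back to the CPU on failure.

    `loader(path, on_gpu)` is a stand-in for whatever actually loads the model.
    """
    if use_gpu:
        try:
            return loader(path, True)
        except Exception:
            pass  # GPU load failed; fall through to the CPU attempt
    return loader(path, False)
```

Passing the loader in as a callable keeps the sketch independent of any specific library signature; a caller would check `find_model_on_disk` first and only download the model when it returns `None`.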
Debanjum Singh Solanky
0a7392f6ec Only add location to image prompt generator when location known 2024-03-26 22:33:01 +05:30
sabaimran
fdf78525b4
Part 2: Add web UI updates for basic agent interactions (#675)
* Initial pass at backend changes to support agents
- Add a db model for Agents, attaching them to conversations
- When an agent is added to a conversation, override the system prompt to tweak the instructions
- Agents can be configured with prompt modification, model specification, a profile picture, and other things
- Admin-configured models will not be editable by individual users
- Add unit tests to verify agent behavior. Unit tests demonstrate imperfect adherence to prompt specifications

* Customize default behaviors for conversations without agents or with default agents

* Add a new web client route for viewing all agents

* Use agent_id for getting correct agent

* Add web UI views for agents
- Add a page to view all agents
- Add slugs to manage agents
- Add a view to view single agent
- Display active agent when in chat window
- Fix post-login redirect issue

* Fix agent view

* Spruce up the 404 page and improve the overall layout for agents pages

* Create chat actor for directly reading webpages based on user message

- Add prompt for the read webpages chat actor to extract, infer
  webpage links
- Make chat actor infer or extract webpage to read directly from user
  message
- Rename previous read_webpage function to more narrow
  read_webpage_at_url function

* Rename agents_page -> agent_page

* Fix unit test for adding the filename to the compiled markdown entry

* Fix layout of agent, agents pages

* Merge migrations

* Let the name, slug of the default agent be Khoj, khoj

* Fix chat-related unit tests

* Add webpage chat command for read web pages requested by user

Update auto chat command inference prompt to show example of when to
use webpage chat command (i.e. when url is directly provided in link)

* Support webpage command in chat API

- Fallback to use webpage when SERPER not setup and online command was
  attempted
- Do not stop responding if can't retrieve online results. Try to
  respond without the online context

* Test select webpage as data source and extract web urls chat actors

* Tweak prompts to extract information from webpages, online results

- Show more of the truncated messages for debugging context
- Update Khoj personality prompt to encourage it to remember its capabilities

* Rename extract_content online results field to webpages

* Parallelize simple webpage read and extractor

Similar to what is being done with search_online with olostep

* Pass multiple webpages with their urls in online results context

Previously even if MAX_WEBPAGES_TO_READ was > 1, only 1 extracted
content would ever be passed.

URL of the extracted webpage content wasn't passed to clients in
online results context. This limited them from being rendered

* Render webpage read in chat response references on Web, Desktop apps

* Time chat actor responses & chat api request start for perf analysis

* Increase the keep alive timeout in the main application for testing

* Do not pipe access/error logs to separate files. Flow to stdout/stderr

* [Temp] Reduce to 1 gunicorn worker

* Change prod docker image to use jammy, rather than nvidia base image

* Use Khoj icon when Khoj web is installed on iOS as a PWA

* Make slug required for agents

* Simplify calling logic and prevent agent access for unauthenticated users

* Standardize to use personality over tuning in agent nomenclature

* Make filtering logic more stringent for accessible agents and remove unused method

* Format chat message query

---------

Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>
2024-03-26 18:13:24 +05:30
Debanjum Singh Solanky
15ed208996 Use Khoj icon when Khoj web is installed on iOS as a PWA 2024-03-26 00:13:12 +05:30
sabaimran
f8eaff574f Change prod docker image to use jammy, rather than nvidia base image 2024-03-25 23:09:58 +05:30
sabaimran
2b5341f53a [Temp] Reduce to 1 gunicorn worker 2024-03-25 16:13:04 +05:30
sabaimran
991f500775 Do not pipe access/error logs to separate files. Flow to stdout/stderr 2024-03-25 16:12:39 +05:30
Debanjum
586654e2af
Allow directly reading web pages, even when SERP not enabled (#676)
### Overview
Khoj can now read websites directly without needing to go through the search step first

### Details
- Parallelize simple webpage read and extractor
- Rename extract_content online results field to webpages
- Tweak prompts to extract information from webpages, online results
- Test select webpage as data source and extract web urls chat actors

- Render webpage read in chat response references on Web, Desktop apps
- Pass multiple webpages with their urls in online results context

- Support webpage command in chat API
- Add webpage chat command for read web pages requested by user
- Create chat actor for directly reading webpages based on user message
2024-03-24 16:25:25 +05:30
Debanjum Singh Solanky
9e52ae9e98 Time chat actor responses & chat api request start for perf analysis 2024-03-24 15:47:38 +05:30
Debanjum Singh Solanky
dabf71bc3c Render webpage read in chat response references on Web, Desktop apps 2024-03-24 15:47:38 +05:30
Debanjum Singh Solanky
a2e79c94be Pass multiple webpages with their urls in online results context
Previously even if MAX_WEBPAGES_TO_READ was > 1, only 1 extracted
content would ever be passed.

URL of the extracted webpage content wasn't passed to clients in
online results context. This limited them from being rendered
2024-03-24 15:47:38 +05:30
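A rough sketch of the shape this change implies: every page read is kept (up to the limit) and each one carries its URL so clients can render it as a reference. The `webpages` field name comes from the rename commit in this log; the `link` and `snippet` keys, the function name and the limit value are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Illustrative value only; the real setting is not stated in the commit.
MAX_WEBPAGES_TO_READ = 2


def build_webpages_context(
    extracted: List[Tuple[str, str]],
) -> Dict[str, List[Dict[str, str]]]:
    """Keep up to MAX_WEBPAGES_TO_READ pages, each paired with its url.

    `extracted` is a list of (url, extracted_text) pairs. The "link" and
    "snippet" key names are hypothetical.
    """
    pages = extracted[:MAX_WEBPAGES_TO_READ]
    return {"webpages": [{"link": url, "snippet": text} for url, text in pages]}
```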
Debanjum Singh Solanky
71b6905008 Parallelize simple webpage read and extractor
Similar to what is being done with search_online with olostep
2024-03-24 15:46:29 +05:30
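One common way to parallelize reads like this in Python is `asyncio.gather`. The sketch below assumes an async reader function is available; `read_webpages` and `read_one` are hypothetical names, and the commit does not say which mechanism was actually used.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def read_webpages(
    urls: List[str],
    read_one: Callable[[str], Awaitable[str]],
) -> Dict[str, str]:
    """Read all urls concurrently and map each url to its extracted content.

    `read_one` is a stand-in for the per-page read and extract step.
    asyncio.gather returns results in the same order as its inputs.
    """
    contents = await asyncio.gather(*(read_one(url) for url in urls))
    return dict(zip(urls, contents))
```

With this shape the total wait is roughly that of the slowest page rather than the sum of all pages.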
Debanjum Singh Solanky
1167f6ddf9 Rename extract_content online results field to webpages 2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky
b22a7dae5d Tweak prompts to extract information from webpages, online results
- Show more of the truncated messages for debugging context
- Update Khoj personality prompt to encourage it to remember its capabilities
2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky
85c62efca1 Test select webpage as data source and extract web urls chat actors 2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky
ad6f6bb0ed Support webpage command in chat API
- Fallback to use webpage when SERPER not setup and online command was
  attempted
- Do not stop responding if can't retrieve online results. Try to
  respond without the online context
2024-03-24 15:46:29 +05:30
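The two behaviors in the commit above (fall back to the webpage command when search is not configured, and keep responding when online results cannot be retrieved) might look roughly like this. The function names, command strings and the `search_api_key` parameter are assumptions for illustration, not Khoj's API.

```python
from typing import Callable, List, Optional


def pick_online_command(requested: str, search_api_key: Optional[str]) -> str:
    """Fall back to reading webpages directly when no search key is configured."""
    if requested == "online" and not search_api_key:
        return "webpage"
    return requested


def gather_online_context(fetch: Callable[[], List[str]]) -> List[str]:
    """Return online results, or an empty context if retrieval fails.

    An empty list lets the caller still generate a response, just
    without online context.
    """
    try:
        return fetch()
    except Exception:
        return []
```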
Debanjum Singh Solanky
a6b7432837 Add webpage chat command for read web pages requested by user
Update auto chat command inference prompt to show example of when to
use webpage chat command (i.e. when url is directly provided in link)
2024-03-24 15:46:29 +05:30
sabaimran
8abc8ded82
Part 1: Server-side changes to support agents integrated with Conversations (#671)
* Initial pass at backend changes to support agents
- Add a db model for Agents, attaching them to conversations
- When an agent is added to a conversation, override the system prompt to tweak the instructions
- Agents can be configured with prompt modification, model specification, a profile picture, and other things
- Admin-configured models will not be editable by individual users
- Add unit tests to verify agent behavior. Unit tests demonstrate imperfect adherence to prompt specifications

* Customize default behaviors for conversations without agents or with default agents

* Use agent_id for getting correct agent

* Merge migrations

* Simplify some variable definitions, add additional security checks for agents

* Rename agent.tuning -> agent.personality
2024-03-23 22:09:38 +05:30
sabaimran
4deb849fb1 Merge branch 'features/add-agents-ui' of github.com:khoj-ai/khoj into features/chat-socket-streaming 2024-03-23 14:04:25 +05:30
sabaimran
8edbd7094f Let the name, slug of the default agent be Khoj, khoj 2024-03-23 14:03:58 +05:30
sabaimran
6b4c4f10b5 Merge branch 'features/add-agents-ui' of github.com:khoj-ai/khoj into features/chat-socket-streaming 2024-03-23 11:22:00 +05:30
sabaimran
20617614ae Merge branch 'features/customize-chat-with-agents' of github.com:khoj-ai/khoj into features/add-agents-ui 2024-03-23 11:20:57 +05:30
sabaimran
2399d91f61 Merge migrations 2024-03-22 10:05:33 +05:30
sabaimran
d38089ab57 Merge with origin 2024-03-22 09:55:33 +05:30
Debanjum Singh Solanky
7416ca9ae1 Lower the default gunicorn workers running on prod 2024-03-21 04:35:52 +05:30
Debanjum Singh Solanky
aed4313cfc Fix updating specific conversation by id from the chat API endpoint
- Use the conversation id of the retrieved conversation rather than the
  potentially unset conversation id passed via API
- await creating new chat when no chat id provided and no existing
  conversations exist
2024-03-21 02:46:52 +05:30
Debanjum Singh Solanky
ec6dc0daaf Bump up the default gunicorn workers running on prod 2024-03-20 22:56:09 +05:30
sabaimran
6ba0d8e379 Add a connected notification if the websocket is connected 2024-03-20 20:53:28 +05:30
sabaimran
255b69dc58 Add a comma delimiter between outputted search queries 2024-03-20 19:43:35 +05:30
sabaimran
d84188b221 Scroll down when a message is added in the chat interface's handle stream response method 2024-03-20 15:04:41 +05:30
sabaimran
70ad78990a Use a common method for sending a generic message to the client from the server in the ws connection 2024-03-20 15:04:14 +05:30
sabaimran
d4e83b060a Update the web UI for the chat interface to establish a connection via a socket to the server
- Move some common methods into separate functions to make the UI components more efficient
- The normal HTTP-based chat connection will still work and serves as a fallback if the websocket is unavailable
2024-03-20 14:34:47 +05:30
sabaimran
a346f79b39 Add support for chatting via the web socket connection
- Convert to a model of calling the search API directly with a function call (rather than using the API method)
- Gracefully handle websocket connection disconnects
- Ensure that the rest of the response is still saved, as it is currently, if the user disconnects from the client
- Setup unchangeable context at the beginning of the session when the connection is established (like location, username, etc)
2024-03-20 14:33:33 +05:30
sabaimran
36af9776e6 Add the websockets dependency to pyproject.toml 2024-03-20 14:11:18 +05:30
Debanjum Singh Solanky
62a83dc9bb Fix online search actor to use natural dates not after: operator
The recently added after: operator to the online search actor was too
restrictive and gave worse results than just using natural language
dates in the search query
2024-03-15 21:50:14 +05:30
Debanjum Singh Solanky
4a1e6a2275 Convert deleted old user requests log line to debug from info 2024-03-15 20:50:10 +05:30
Debanjum Singh Solanky
9a068dadbf Fix extract questions prompt to use YYYY-MM-DD date filter format 2024-03-15 18:43:18 +05:30