Commit graph

2658 commits

Author SHA1 Message Date
Debanjum Singh Solanky
d341b1efe8 Store, retrieve task metadata from the job name field 2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
ae10ff4a5f Create create_scheduled_task func to dedupe logic across ws, http APIs
Previously, both the websocket and http endpoint were implementing
the same logic. This was becoming too unwieldy
2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
8dfa0bf047 Simplify task scheduler prompt. No timezone conversion. Infer subject
- Make timezone aware scheduling programmatic, instead of asking the
  chat model to do the conversion. This removes the need for
  scratchpad and may let smaller models handle the task as well
- Make chat model infer subject for email. This should make the
  notification email more readable
- Improve email by using subject in email subject, task heading. Move
  query to email final paragraph, which is where task metadata  should
  go
2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
2c563ad280 Use hash of query in process lock id to standardize id format
- Using inferred_query directly was brittle (like previous job id)
- And process lock id had a limited size, so wouldn't work for larger
  inferred query strings
2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
3ce06a938c Render scheduled task response as html to improve readability in email 2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
c17dbbeb92 Render next run time in user timezone in config, chat UIs
- Pass timezone string from ipapi to khoj via clients
  - Pass this data from web, desktop and obsidian clients to server
- Use user tz to render next run time of scheduled task in user tz
2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
6736551ba3 Improve scheduled task text rendered in UI 2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
0e01362469 Merge DB migrations from master with those from scheduled task feature 2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
a5ed4f2af2 Send email to share results of scheduled task 2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
69775b6d6e Add /task command. Use it to disable scheduling tasks from tasks
This takes the load of the task scheduling chat actor / prompt from
having to artifically differentiate query to create scheduled task
from a scheduled task run.
2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
22289a0002 Improve task scheduling by using json mode and agent scratchpad
- The task scheduling actor was having trouble calculating the
  timezone. Giving the actor a scratchpad to improve correctness by
  thinking step by step
- Add more examples to reduce chances of the inferred query looping to
  create another reminder instead of running the query and sharing
  results with user
- Improve task scheduling chat actor test with more tests and
  by ensuring unexpected words not present in response
2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
7f5981594c Only notify when scheduled task results satisfy user's requirements
There's a difference between running a scheduled task and notifying
the user about the results of running the scheduled task.

Decide to notify the user only when the results of running the
scheduled task satisfy the user's requirements.

Use sync version of send_message_to_model_wrapper for scheduled tasks
2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
7e084ef1e0 Improve job id. Fix refreshing list of jobs on delete from config page 2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky
a1e5195c8b Save separate user message time from Khoj response time in chat logs
Previously user message time was being stored the same as Khoj
response time in conversation logs.
2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky
5133b6e73b Minor improvements to styling the config page 2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky
648f1a5c71 Suffix chat response element vars with "El" in chat.html of web, desktop apps 2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky
98d0ffecf1 Add section in settings page to view, delete your scheduled tasks 2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky
423d61796d Add API endpoints to get and delete user scheduled tasks 2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky
af0972c539 Make scheduled jobs persistent and work in multiple worker setups
- Store scheduled job state in Postgres so job schedules persist
  across app restarts
- Use Process Locks to only allow single worker to process a given job
  type. This prevents duplicating job runs across all workers
2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky
fcf878e1f3 Add new operation Scheduled Job to Operation enum of ProcessLock 2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky
c28d7d3414 Add basic chat actor test to infer scheduled queries 2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky
c11742f443 Add chat actor to schedule run query for user at specified times
- Detect when user intends to schedule a task, aka reminder
  Add new output mode: reminder. Add example of selecting the reminder
  output mode
- Extract schedule time (as cron timestring) and inferred query to run
  from user message
- Use APScheduler to call chat with inferred query at scheduled time
- Handle reminder scheduling from both websocket and http chat requests

- Support constructing scheduled task using chat history as context
  Pass chat history to scheduled query generator for improved context
  for scheduled task generation
2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky
9e068fad4f Handle null ref, when refresh conversation from db in websocket chat 2024-04-30 14:19:07 +05:30
sabaimran
37879a7850 Release Khoj version 1.11.2 2024-04-30 13:31:06 +05:30
sabaimran
93b41170d1 Refresh the conversation log from the db before addressing the next query 2024-04-30 13:27:51 +05:30
Debanjum Singh Solanky
f1545d2b2f Add, fix help link, improve title style in web ui config pages
- Align title text with icon better in all config cards
- Fix help link to github setup docs
- Fix help link to notion setup docs
2024-04-30 05:50:08 +05:30
Debanjum Singh Solanky
e6da0f9a8c Fix response type of delete client tokens API endpoint
Previously the make delete API response failed, after deleting token.
Required a page refresh to see that the API token was actually gone.

This was happening because the response type of the delete token API
endpoint isn't a string, so it failed FastAPI response validation
checks.
2024-04-30 02:46:52 +05:30
sabaimran
0f4c3518d3 Allow session cookies to be stored with a lax policy for some localhost scenarios 2024-04-29 15:48:45 +05:30
sabaimran
5beedc9734 Use Secure proxy ssl header only if no https 2024-04-29 15:33:21 +05:30
sabaimran
408f4780ce Add and update documentation for setting up khoj with an openai proxy server or offline llm 2024-04-27 20:16:32 +05:30
sabaimran
12258f02d7 Release Khoj version 1.11.1 2024-04-27 18:42:24 +05:30
sabaimran
2047b0c973
Support customization of the OpenAI base url in admin settings (#725)
- Allow self-hosted users to customize their open ai base url. This allows you to easily use a proxy service and extend support for other models.
- This also includes a migration that associates any existing openai chat model configuration with an openai processor configuration
- Make changing model a paid/subscriber feature
- Removes usage of langchain's OpenAI wrapper for better control over parsing input/output
2024-04-27 18:24:35 +05:30
sabaimran
49834e3b00 Add a hero image for the og:image meta tag 2024-04-27 17:07:21 +05:30
sabaimran
138f12f957 Fix indentation and revert first run message link styling to all links 2024-04-27 09:56:58 +05:30
Debanjum Singh Solanky
4395ed8065 Improve extract_questions func. Set message role to user, not assistant
Previous behavior of passing message with role = "assistant was
reducing instruction following quality of the model
2024-04-26 11:55:22 +05:30
Debanjum Singh Solanky
346499f12c Fix, improve args being passed to chat_completion args
- Allow passing completion args through completion_with_backoff
- Pass model_kwargs in a separate arg to simplify this
- Pass model in `model_name' kwarg from the send_message_to_model func
  `model_name' kwarg is used by langchain, not `model' kwarg
2024-04-26 11:55:22 +05:30
sabaimran
d8f2eac6e0 Release Khoj version 1.11.0 2024-04-25 17:24:59 +05:30
Debanjum Singh Solanky
1842017393 Skip trying to index deleted files, folders from Desktop app
Previously app would crash on startup if desktop app was told to
index a file that had been deleted afterwards
2024-04-25 15:23:05 +05:30
Debanjum
17a06f152c
Support Llama 3 and Improve Offline Chat Actors (#724)
- Add support for Llama 3 in Khoj offline mode
- Make chat actors generate valid json with more local models
- Fix offline chat actor tests
2024-04-25 14:00:56 +05:30
Debanjum
220e5516ab
Make Search Models More Configurable. Upgrade Default Cross-Encoder (#722)
- Upgrade default cross-encoder to mixedbread ai's mxbai-rerank-xsmall
- Support more embedding models by making query, docs encoding configurable
2024-04-25 13:55:49 +05:30
Debanjum Singh Solanky
cf08eaf786 Add comments explaining each field in the search model config in DB 2024-04-25 13:54:13 +05:30
Debanjum
4ee5ac7c20
Fix Chat UI and Indexing on Desktop App (#723)
- Make valid file extension checking case insensitive on Desktop app
- Skip indexing non-existent folders on Desktop app
- Pass auth headers to fix lazy load of chat messages on Desktop app
- Set chat-message height to height of content in web, desktop
2024-04-24 18:49:03 +05:30
Debanjum Singh Solanky
89ef23de50 Upgrade gunicorn and make it only a production dependency 2024-04-24 11:28:55 +05:30
Debanjum Singh Solanky
799efb5974 Create DB migration to add new fields and change default cross-encoder 2024-04-24 09:50:34 +05:30
Debanjum Singh Solanky
ec41482324 Upgrade default cross-encoder to mixedbread ai's mxbai-rerank-xsmall
Previous cross-encoder model was a few years old, newer models should
have improved in quality. Model size increases by 50% compared to
previous for better performance, at least on benchmarks
2024-04-24 09:50:09 +05:30
Debanjum Singh Solanky
7eaf9367fe Support more embedding models by making query, docs encoding configurable
Most newer, better embeddings models add a query, docs prefix when
encoding. Previously Khoj admins couldn't configure these, so it
wasn't possible to use these newer models.

This change allows configuring the kwargs passed to the query, docs
encoders by updating the search config in the database.
2024-04-24 09:49:17 +05:30
Debanjum Singh Solanky
f2db8d7d99 Fix offline chat actor tests
Do not check for original q in extracted questions. Since this was
removed in a previous commit
2024-04-24 09:40:00 +05:30
Debanjum Singh Solanky
4f7237b158 Make chat actors generate valid json with more local models
Improve tool, online search, webpage links, docs search chat actor
prompts. Ensure works with hermes-2-pro and llama-3.

Be more specific about generating JSON and not saying anything else.
2024-04-24 09:40:00 +05:30
Debanjum Singh Solanky
a2e4e4bede Add support for Llama 3 in Khoj offline mode
- Improve extract question prompts to explicitly request JSON list
- Use llama-3 chat format if HF repo_id mentions llama-3. The
  llama-cpp-python logic for detecting when to use llama-3 chat format
  isn't robust enough currently
2024-04-24 09:40:00 +05:30
Debanjum Singh Solanky
8e77b3dc82 Fix infer_max_tokens func when configured_max_tokens is set to None 2024-04-24 09:36:29 +05:30