sij/khoj

mirror of https://github.com/khoj-ai/khoj.git synced 2024-12-21 03:47:45 +00:00

Author	SHA1	Message	Date
Debanjum Singh Solanky	e86899eec4	Click on notes referenced by Khoj chat to open them in Obsidian vault Allow opening Khoj chat references in Obsidian vault if the reference is a heading or file in the current Obsidian vault	2024-05-28 10:16:40 +05:30
Desmond	70fea6c6b6	fix: delete file request	2024-05-27 14:46:26 +08:00
sabaimran	607534021b	Add a link to github in the settings menu, improve styling	2024-05-27 11:39:30 +05:30
Desmond	3f49b5a4ab	fix: emacs tests	2024-05-27 10:42:09 +08:00
sabaimran	b97ca9d19d	Skip using max_tokens as input to the extract questions step, as that's not used for max_output	2024-05-27 01:23:54 +05:30
sabaimran	9ebf3a4d80	Improve the admin experience, add more metadata to the list_display - Don't propagate max_tokens to the openai chat completion method. the max for the newer models is fixed at 4096 max output. The token limit is just used for input	2024-05-27 00:49:20 +05:30
sabaimran	01cdc54ad0	Add support for Anthropic models (#760 ) * Add support for chatting with Anthropic's suite of models - Had to use a custom class because there was enough nuance with how the anthropic SDK works that it would be better to simply separate out the logic. The extract questions flow needed modification of the system prompt in order to work as intended with the haiku model	2024-05-26 22:50:34 +05:30
Debanjum Singh Solanky	0f796a79ec	Extract function to get link to entry in Obsidian vault for reuse	2024-05-26 18:03:15 +05:30
Debanjum Singh Solanky	e24ca9ec28	Pass file path of each doc reference in references returned by API - Pass file path of reference along with the compiled reference in list of references returned by chat API - Update the structure of references from list of strings to list of dictionaries (containing 'compiled' and 'file' keys) - Pull out the compiled reference from the new references data struct wherever it was being used	2024-05-26 18:02:11 +05:30
Debanjum Singh Solanky	ba330712f8	Fix to always pass online results in chat API response	2024-05-26 13:56:55 +05:30
Debanjum Singh Solanky	38d8d2bb56	Show online references used to generate response in Obsidian chat view	2024-05-26 13:55:22 +05:30
Debanjum Singh Solanky	f495d338eb	Modularize render message with references func in web based clients Simplify, reuse, standardize code to render messages with references in the obsidian, web and desktop clients. Specifically: - Reuse function to create reference section, dedupe code - Create reusable function to generate image markdown - Simplify logic to render message with references	2024-05-26 13:55:22 +05:30
Debanjum Singh Solanky	14a2006c76	Stream steps taken to generate response in Obsidian chat pane - Setup websocket using Khoj web app as reference. - Moved the geolocating code to chat view out from the general pane view - Use loading spinner from web instead of the thinking emoji	2024-05-26 13:55:22 +05:30
Debanjum Singh Solanky	afcd22d30c	Improve spacing, colors of chat message references and buttons Works better with dark modes. References have more spacing and adhere to background color of the chat message itself	2024-05-26 13:55:22 +05:30
Debanjum Singh Solanky	bd4931e70b	Add ability to paste chat messages directly into current file It'll replace any highlighted text with the chat message or, if no text is highlighted, it'll insert the chat message at the last cursor position in the active file	2024-05-26 13:55:22 +05:30
Debanjum Singh Solanky	032ad3b521	Add ability to copy messages to clipboard from Obsidian Khoj chat	2024-05-26 13:55:22 +05:30
Debanjum Singh Solanky	57f1c53214	Create Nav bar for Obsidian pane. Use abstract View class for reuse - Jump to chat, show similar actions from nav menu of Khoj side pane - Add chat, search icons from web, desktop app - Use lucide icon for find similar (for now) - Match proportions of find similar icon to khoj other icons via css, js - Use KhojPaneView abstract class to allow reuse of common functionality like - Creating the nav bar header in side pane views - Loading geo-location data for chat context This should make creating new views easier	2024-05-26 13:55:22 +05:30
sabaimran	e23c803cee	Release Khoj version 1.12.1	2024-05-24 21:42:03 +05:30
sabaimran	0308699849	Use links from assets.khoj.dev to render images in the automations page	2024-05-24 20:18:02 +05:30
sabaimran	3f9c20a399	Make it easier to manage server-level chat settings (#729 ) * Add support for server-wide model settings fix web page reading results returning logic	2024-05-24 20:15:18 +05:30
sabaimran	cbbbe2da9a	Add a schedule picker and automations preview func (#747 ) * Update suggested automations * add a schedule picker when creating an automation * Create a new conversation in flow of the automation scheduling in order to send a preview and deliver more consistent results * Start adding in scaffolding to manually trigger a test job for an automation * Add support for manually triggering automations for testing * Schedule automation asynchronously * Update styling of the preview button * Improve admin lookup experience and prevent jobs from being scheduled to run every minute of everyday * Ignore mypy issues on job info short description	2024-05-24 19:42:47 +05:30
sabaimran	4511c6ae7c	Fix bug in chat feedback flow - user message not included during live chat	2024-05-21 14:55:39 -05:00
Desmond	a3c6045328	Merge remote-tracking branch 'origin/master'	2024-05-21 21:55:53 +08:00
Desmond	b0630c1a98	Simplify partition	2024-05-21 21:52:01 +08:00
Raghav Tirumale	d57772f9e7	Add Feedback Buttons on Chat (#721 ) ### Description and Rationale for Changes This feature includes thumbs up and thumbs down buttons on Khoj's chat responses that provide automated feedback. When a thumbs up/down button is clicked, the code sends an email to team@khoj.dev with the following: * user query * khoj's response * whether the sentiment of the user was good or bad. This is critical in improving Khoj's nondeterministic LLM model for a better user experience. ### List of Changes * new endpoint in `api_chat.py` (/feedback) that can be used to trigger mail sending). * thumbs up and thumbs down buttons implemented in `chat.html` * new function in `routers/email.py` to handle feedback email sending via resend * `feedback.html` template for a formatted email with the feedback. --------- Co-authored-by: mythicalcow <mythicalcow@linux.myguest.virtualbox.org> Co-authored-by: sabaimran <narmiabas@gmail.com>	2024-05-20 16:29:08 -05:00
sabaimran	7feaf34702	Fix capitalization, update suggested prompt	2024-05-10 02:36:13 -07:00
sabaimran	b545aceb47	Use a simpler example for the sample automation and put schedule on top of instructions	2024-05-09 13:53:19 -07:00
sabaimran	7ae00832bd	Rename from parameter to sender in resend call	2024-05-09 13:29:39 -07:00
sabaimran	fbd76f8ebe	Improve the UX of automations (#737 ) * Improve the automations UX - Add suggested jobs to eliminate some of the cold start problem - Make each of the task cards clickable/editable * Hide suggested automations that have already been added * Add a footer and reapply styling when a save action is taken on a card	2024-05-09 01:29:48 -07:00
sabaimran	70d0ee4310	Only remove the process lock from a process that created it	2024-05-08 10:14:52 -07:00
Desmond Deng	20303feb3a	Merge branch 'khoj-ai:master' into master	2024-05-08 13:46:34 +08:00
Desmond	150cd18bf3	Update batch-size	2024-05-08 13:44:22 +08:00
Desmond	192cd53003	Batch send of index files	2024-05-08 13:38:40 +08:00
sabaimran	a50deb2762	Add better handling for empty responses	2024-05-07 11:49:33 -07:00
sabaimran	4aed6bd274	Add an admin view for subscriptions	2024-05-07 11:48:52 -07:00
sabaimran	77626d28d1	Include stack trace when automation is not successfully created	2024-05-07 06:52:41 -07:00
sabaimran	0c8c565ab0	Don't include the whole stack trace for an integrity error	2024-05-07 06:48:18 -07:00
Debanjum Singh Solanky	0a1a6cd041	Get detailed user info in Obsidian from the new v1/user API Previously we were just getting user email from the /health API Instead store the retrieved user info in the user settings	2024-05-07 04:37:26 +08:00
Debanjum Singh Solanky	f8f9d066db	Focus on input field, scroll to latest message on opening chat pane Previously scroll and chat input focus weren't applied as view hadn't been rendered yet	2024-05-07 04:37:26 +08:00
Debanjum Singh Solanky	9f65e8de98	Open Khoj Chat as a Pane instead of a Modal - Allows having it open on the side as you traverse your Obsidian notes - Allow faster time to response, having responses visible for context - Enables ambient interactions	2024-05-07 04:37:26 +08:00
sabaimran	9ae828cf11	Use assets.khoj.dev for loading math katex rendering	2024-05-07 01:43:46 +08:00
sabaimran	cf0b7628d0	Add the url scheme to the public share url	2024-05-06 21:37:49 +08:00
sabaimran	f6aaecb04f	Fix construction method for public share conversation URL	2024-05-06 08:32:51 +05:30
sabaimran	14c9bea663	Make conversations optionally shareable (#712 ) * Make conversations optionally shareable - Shared conversations are viewable by anyone, without a login wall - Can share a conversation from the three dot menu - Add a new model for Public Conversation - The rationale for a separate model is that public and private conversations have different assumptions. Separating them reduces some of the code specificity on our server-side code and allows us for easier interpretation and stricter security. Separating the data model makes it harder to accidentally view something that was meant to be private - Add a new, read-only view for public conversations	2024-05-05 23:16:04 +05:30
Debanjum Singh Solanky	80cbaca935	Serve generated images from Khoj domain instead of directly from S3 Use CNAME to forward requests from the khoj subdomain to the equivalent S3 bucket	2024-05-04 20:07:10 +05:30
Debanjum Singh Solanky	425496844b	Rename assets URL from Khoj S3 bucket to assets.khoj.dev Serve Khoj assets from Khoj domain	2024-05-04 20:07:10 +05:30
sabaimran	88daa841fd	Rename process lock migration and add a reverse migration step	2024-05-04 20:05:00 +05:30
sabaimran	509a8a412c	Throw an error if trying to create a process lock that already exists. Names should be unique	2024-05-04 19:03:53 +05:30
sabaimran	7100614de5	Add support for rendering math equations in the web view (#733 ) - Add parsing logic for LaTeX-format math equations in the web chat - Add placeholder delimiters when converting the markdown to HTML in order to avoid removing the escaped characters - Add the `<!DOCTYPE html>` specification to the page	2024-05-04 15:59:17 +05:30
Debanjum Singh Solanky	d9b3482b1a	Show error when required fields to create automation are not set	2024-05-04 11:17:30 +05:30
Debanjum Singh Solanky	91a5643c5c	Use Preview label for Automate feature. Prefix mailto: link to contact	2024-05-04 10:59:17 +05:30
Debanjum Singh Solanky	fd2328ab40	Do not hard code base url of path to automation icon in chat message	2024-05-04 10:59:07 +05:30
sabaimran	a38f3227e2	Revert domain in task send emails	2024-05-03 15:27:27 +05:30
sabaimran	a1263951e9	Use mail to in email contact link	2024-05-03 12:16:56 +05:30
sabaimran	7c9847fe48	Increase jitter to 60	2024-05-03 11:38:22 +05:30
sabaimran	737ebfd521	Make improvements to online search prompts and use a custom domain for automations emails	2024-05-03 10:47:42 +05:30
sabaimran	42e9504ba8	Use a different function for getting last run time, avoid async/sync issues	2024-05-02 12:13:45 +05:30
sabaimran	9e8491b814	Add experimental disclaimers to the automations	2024-05-02 11:40:37 +05:30
sabaimran	c418449311	Add additional robustness in verifying job execution parameters at run time	2024-05-02 11:13:04 +05:30
sabaimran	690e9d8ed3	Collapse the reminders after they're successfully scheduled	2024-05-02 09:55:04 +05:30
sabaimran	6b648ee3ad	Add experimental disclaimer in the automation page	2024-05-02 09:21:27 +05:30
sabaimran	f4fbc91515	Remove the exclamation point from the email	2024-05-01 19:01:51 +05:30
sabaimran	bddd1d0fcb	Quip, smart reminders	2024-05-01 16:39:07 +05:30
sabaimran	bc8b92a77d	Release Khoj version 1.12.0	2024-05-01 16:30:48 +05:30
sabaimran	b499851097	Use the cleaned query as the reference query in the email notification	2024-05-01 15:33:11 +05:30
sabaimran	f24495e0e6	Fix time zone used in query history. Closes #694	2024-05-01 15:31:48 +05:30
sabaimran	7fd57d737e	Adjustments to improve overall styling of config page, email template	2024-05-01 14:19:47 +05:30
sabaimran	28578310d1	Add log line when sending a task-related email	2024-05-01 13:56:02 +05:30
sabaimran	a86f95117e	Add the subject generation prompt and helper method	2024-05-01 13:55:32 +05:30
sabaimran	c30ba2e551	Set subject dynamically when creating new tasks, and make some minor improvements to the automations UI	2024-05-01 13:54:59 +05:30
sabaimran	d1b2037676	Shutdown the scheduler when the application is exiting	2024-05-01 13:53:34 +05:30
Debanjum Singh Solanky	89a8dbb81a	Fix edit job API. Use user timezone, pass all reqd. params to automation - Pass user and calling_url to the scheduled chat too when modifying params of automation - Update to use user timezone even when update job via API - Move timezone string to timezone object calculation into the schedule automation method	2024-05-01 10:29:49 +05:30
Debanjum Singh Solanky	19c5af3ebc	Handle natural language to cron translation error on web client	2024-05-01 09:10:18 +05:30
Debanjum Singh Solanky	70ee9ddf91	Merge migrations from main with feature branch	2024-05-01 09:10:18 +05:30
Debanjum Singh Solanky	8f28f6cc1e	Remove now unused location data from being passed to automation funcs	2024-05-01 08:48:16 +05:30
Debanjum Singh Solanky	815966cb25	Unify, modularize DB adapters to get automation metadata by user further	2024-05-01 08:47:50 +05:30
Debanjum Singh Solanky	21bdf45d6f	Add link to Automate page in nav pane of the web app	2024-05-01 08:47:50 +05:30
Debanjum Singh Solanky	bd5008136a	Move automations into independent page. Allow direct automation - Previously it was a section in the settings page. Move it to independent, top-level page to improve visibility of feature - Calculate crontime from natural language on web client before sending it to the server for saving new/updated schedule to disk. - Avoids round-trip of call to chat model - Convert POST /api/automation API endpoint into a direct request for automation with query_to_run, subject and schedule provided via the automation page. This allows more granular control to create automation - Make the POST automations endpoint more robust; runs validation checks, normalizes parameters	2024-05-01 08:47:48 +05:30
Debanjum Singh Solanky	cbc8a02179	Make, use func for constructing the automation created response - Dedupe logic across http, ws chat API endpoints - Reduces size of already too long http, ws chat API endpoint funcs	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	c52ed333fa	Make content, cards on config pages occupy the whole middle column - Make the config page content use the same top level 3-column layout as the khoj-header-wrapper This ensures the content is aligned with heading pane width - Let cards and other settings sections scale to the width of their grid element. This utilizes more of the screen space and does it consistently across the different settings pages	2024-05-01 08:30:10 +05:30
sabaimran	ad4145e48c	Fix unique hash for job id	2024-05-01 08:30:10 +05:30
sabaimran	311d58e1ed	Ensure the automated_task command is removed from the prepended query	2024-05-01 08:30:10 +05:30
sabaimran	eb65532386	Use Django ap scheduler in place of the sqlalchemy one	2024-05-01 08:30:10 +05:30
sabaimran	06213ea814	Fix token retrieval when executing the job and name async job appropriately	2024-05-01 08:30:10 +05:30
sabaimran	ca8a7d8368	Revert sync -> async in send welcome email method	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	6936875a82	Use DB adapter to unify logic to get, delete automation by auth user Use one place for the logic to get, view, delete (and edit soon) automations by (authenticated) user, instead of scattering it across code	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	1238cadd31	Allow editing query-to-run from the automation config section	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	cb2b1dccc5	Add icon for Automation feature. Replace old icons for delete, new	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	23f2057868	Allow creating automations from automation settings section in web ui - Create new POST API endpoint to create automations - Use it in the settings page on the web interface to create new automations This simplified managing automations from the setting page by allowing both delete and create from the same page	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	2f9241b5a3	Rename scheduled task to automations across code and UX - Fix query, subject parameters passed to email template - Show 12 hour scheduled time in automation created chat message	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	230d160602	Improve rendering task scheduled settings view and message - Render crontime string in natural language in message & settings UI - Show more fields in tasks web config UI - Add link to the tasks settings page in task scheduled chat response - Improve task variables names Rename executing_query to query_to_run. scheduling_query to scheduling_request	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	d341b1efe8	Store, retrieve task metadata from the job name field	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	ae10ff4a5f	Create create_scheduled_task func to dedupe logic across ws, http APIs Previously, both the websocket and http endpoint were implementing the same logic. This was becoming too unwieldy	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	8dfa0bf047	Simplify task scheduler prompt. No timezone conversion. Infer subject - Make timezone aware scheduling programmatic, instead of asking the chat model to do the conversion. This removes the need for scratchpad and may let smaller models handle the task as well - Make chat model infer subject for email. This should make the notification email more readable - Improve email by using subject in email subject, task heading. Move query to email final paragraph, which is where task metadata should go	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	2c563ad280	Use hash of query in process lock id to standardize id format - Using inferred_query directly was brittle (like previous job id) - And process lock id had a limited size, so wouldn't work for larger inferred query strings	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	3ce06a938c	Render scheduled task response as html to improve readability in email	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	c17dbbeb92	Render next run time in user timezone in config, chat UIs - Pass timezone string from ipapi to khoj via clients - Pass this data from web, desktop and obsidian clients to server - Use user tz to render next run time of scheduled task in user tz	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	6736551ba3	Improve scheduled task text rendered in UI	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	0e01362469	Merge DB migrations from master with those from scheduled task feature	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	a5ed4f2af2	Send email to share results of scheduled task	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	69775b6d6e	Add /task command. Use it to disable scheduling tasks from tasks This takes the load off the task scheduling chat actor / prompt from having to artificially differentiate query to create scheduled task from a scheduled task run.	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	22289a0002	Improve task scheduling by using json mode and agent scratchpad - The task scheduling actor was having trouble calculating the timezone. Giving the actor a scratchpad to improve correctness by thinking step by step - Add more examples to reduce chances of the inferred query looping to create another reminder instead of running the query and sharing results with user - Improve task scheduling chat actor test with more tests and by ensuring unexpected words not present in response	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	7f5981594c	Only notify when scheduled task results satisfy user's requirements There's a difference between running a scheduled task and notifying the user about the results of running the scheduled task. Decide to notify the user only when the results of running the scheduled task satisfy the user's requirements. Use sync version of send_message_to_model_wrapper for scheduled tasks	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	7e084ef1e0	Improve job id. Fix refreshing list of jobs on delete from config page	2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky	a1e5195c8b	Save separate user message time from Khoj response time in chat logs Previously user message time was being stored the same as Khoj response time in conversation logs.	2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky	5133b6e73b	Minor improvements to styling the config page	2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky	648f1a5c71	Suffix chat response element vars with "El" in chat.html of web, desktop apps	2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky	98d0ffecf1	Add section in settings page to view, delete your scheduled tasks	2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky	423d61796d	Add API endpoints to get and delete user scheduled tasks	2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky	af0972c539	Make scheduled jobs persistent and work in multiple worker setups - Store scheduled job state in Postgres so job schedules persist across app restarts - Use Process Locks to only allow single worker to process a given job type. This prevents duplicating job runs across all workers	2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky	fcf878e1f3	Add new operation Scheduled Job to Operation enum of ProcessLock	2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky	c11742f443	Add chat actor to schedule run query for user at specified times - Detect when user intends to schedule a task, aka reminder Add new output mode: reminder. Add example of selecting the reminder output mode - Extract schedule time (as cron timestring) and inferred query to run from user message - Use APScheduler to call chat with inferred query at scheduled time - Handle reminder scheduling from both websocket and http chat requests - Support constructing scheduled task using chat history as context Pass chat history to scheduled query generator for improved context for scheduled task generation	2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky	9e068fad4f	Handle null ref, when refresh conversation from db in websocket chat	2024-04-30 14:19:07 +05:30
sabaimran	37879a7850	Release Khoj version 1.11.2	2024-04-30 13:31:06 +05:30
sabaimran	93b41170d1	Refresh the conversation log from the db before addressing the next query	2024-04-30 13:27:51 +05:30
Debanjum Singh Solanky	f1545d2b2f	Add, fix help link, improve title style in web ui config pages - Align title text with icon better in all config cards - Fix help link to github setup docs - Fix help link to notion setup docs	2024-04-30 05:50:08 +05:30
Debanjum Singh Solanky	e6da0f9a8c	Fix response type of delete client tokens API endpoint Previously the delete API request failed after deleting the token. Required a page refresh to see that the API token was actually gone. This was happening because the response type of the delete token API endpoint isn't a string, so it failed FastAPI response validation checks.	2024-04-30 02:46:52 +05:30
sabaimran	0f4c3518d3	Allow session cookies to be stored with a lax policy for some localhost scenarios	2024-04-29 15:48:45 +05:30
sabaimran	5beedc9734	Use Secure proxy ssl header only if no https	2024-04-29 15:33:21 +05:30
sabaimran	12258f02d7	Release Khoj version 1.11.1	2024-04-27 18:42:24 +05:30
sabaimran	2047b0c973	Support customization of the OpenAI base url in admin settings (#725 ) - Allow self-hosted users to customize their open ai base url. This allows you to easily use a proxy service and extend support for other models. - This also includes a migration that associates any existing openai chat model configuration with an openai processor configuration - Make changing model a paid/subscriber feature - Removes usage of langchain's OpenAI wrapper for better control over parsing input/output	2024-04-27 18:24:35 +05:30
sabaimran	49834e3b00	Add a hero image for the og:image meta tag	2024-04-27 17:07:21 +05:30
sabaimran	138f12f957	Fix indentation and revert first run message link styling to all links	2024-04-27 09:56:58 +05:30
Debanjum Singh Solanky	4395ed8065	Improve extract_questions func. Set message role to user, not assistant Previous behavior of passing message with role = "assistant" was reducing instruction following quality of the model	2024-04-26 11:55:22 +05:30
Debanjum Singh Solanky	346499f12c	Fix, improve args being passed to chat_completion args - Allow passing completion args through completion_with_backoff - Pass model_kwargs in a separate arg to simplify this - Pass model in `model_name' kwarg from the send_message_to_model func `model_name' kwarg is used by langchain, not `model' kwarg	2024-04-26 11:55:22 +05:30
sabaimran	d8f2eac6e0	Release Khoj version 1.11.0	2024-04-25 17:24:59 +05:30
Debanjum Singh Solanky	1842017393	Skip trying to index deleted files, folders from Desktop app Previously app would crash on startup if desktop app was told to index a file that had been deleted afterwards	2024-04-25 15:23:05 +05:30
Debanjum	17a06f152c	Support Llama 3 and Improve Offline Chat Actors (#724 ) - Add support for Llama 3 in Khoj offline mode - Make chat actors generate valid json with more local models - Fix offline chat actor tests	2024-04-25 14:00:56 +05:30
Debanjum	220e5516ab	Make Search Models More Configurable. Upgrade Default Cross-Encoder (#722 ) - Upgrade default cross-encoder to mixedbread ai's mxbai-rerank-xsmall - Support more embedding models by making query, docs encoding configurable	2024-04-25 13:55:49 +05:30
Debanjum Singh Solanky	cf08eaf786	Add comments explaining each field in the search model config in DB	2024-04-25 13:54:13 +05:30
Debanjum	4ee5ac7c20	Fix Chat UI and Indexing on Desktop App (#723 ) - Make valid file extension checking case insensitive on Desktop app - Skip indexing non-existent folders on Desktop app - Pass auth headers to fix lazy load of chat messages on Desktop app - Set chat-message height to height of content in web, desktop	2024-04-24 18:49:03 +05:30
Debanjum Singh Solanky	799efb5974	Create DB migration to add new fields and change default cross-encoder	2024-04-24 09:50:34 +05:30
Debanjum Singh Solanky	ec41482324	Upgrade default cross-encoder to mixedbread ai's mxbai-rerank-xsmall Previous cross-encoder model was a few years old, newer models should have improved in quality. Model size increases by 50% compared to previous for better performance, at least on benchmarks	2024-04-24 09:50:09 +05:30
Debanjum Singh Solanky	7eaf9367fe	Support more embedding models by making query, docs encoding configurable Most newer, better embeddings models add a query, docs prefix when encoding. Previously Khoj admins couldn't configure these, so it wasn't possible to use these newer models. This change allows configuring the kwargs passed to the query, docs encoders by updating the search config in the database.	2024-04-24 09:49:17 +05:30
Debanjum Singh Solanky	4f7237b158	Make chat actors generate valid json with more local models Improve tool, online search, webpage links, docs search chat actor prompts. Ensure works with hermes-2-pro and llama-3. Be more specific about generating JSON and not saying anything else.	2024-04-24 09:40:00 +05:30
Debanjum Singh Solanky	a2e4e4bede	Add support for Llama 3 in Khoj offline mode - Improve extract question prompts to explicitly request JSON list - Use llama-3 chat format if HF repo_id mentions llama-3. The llama-cpp-python logic for detecting when to use llama-3 chat format isn't robust enough currently	2024-04-24 09:40:00 +05:30
Debanjum Singh Solanky	8e77b3dc82	Fix infer_max_tokens func when configured_max_tokens is set to None	2024-04-24 09:36:29 +05:30
Debanjum Singh Solanky	8196ab62f9	Make valid file extension checking case insensitive on Desktop app	2024-04-24 09:35:20 +05:30
Debanjum Singh Solanky	5def14e3bb	Skip indexing non-existent folders on Desktop app	2024-04-24 09:35:20 +05:30
Debanjum Singh Solanky	cd05f262a6	Pass auth headers to fix lazy load of chat messages on Desktop app	2024-04-24 09:35:20 +05:30
Debanjum Singh Solanky	4d5d3e6433	Set chat-message height to height of content in web, desktop In some cases, especially with image generation requests, this was causing the chat messages to overlap in the chat UI	2024-04-24 09:35:20 +05:30
sabaimran	60658a8037	Get rid of enable flag for the offline chat processor config - Default, assume that offline chat is enabled if there is an offline chat model option configured	2024-04-23 23:08:29 +05:30
sabaimran	ac474fce38	Ensure that the tokenizer and max prompt size are used in the wrapper method	2024-04-23 21:22:23 +05:30
Olatoyan George	ad59180fb8	Added indication in the desktop UI for back-end connectivity (#711 ) * Changed the styling of the link that takes a user to the settings page into a button * added an indicator that shows if a user is connected to the server or not * made a class name more descriptive and also made the text in first run message more intuitive * changed the command to install dependencies in the README.md * changed the class name of the first run message text to be more descriptive * added icons in the desktop UI that shows if a file is synced successfully or not * made the link class name in the homepage more descriptive * fixed the hover issue on status box in the chat header pane * fixed hovering issue on status box on macOS	2024-04-23 16:43:48 +05:30
Debanjum	419b044ac5	Use set, inferred max token limits wherever chat models are used (#713 ) - User configured max tokens limits weren't being passed to `send_message_to_model_wrapper' - One of the load offline model code paths wasn't reachable. Remove it to simplify code - When max prompt size isn't set infer max tokens based on free VRAM on machine - Use min of app configured max tokens, vram based max tokens and model context window	2024-04-23 16:42:35 +05:30
AjaySDwivedi1	abf6f963ea	Replaced reinitialize and save all button to a sync button in config.… (#701 ) Replaced reinitialize and save all button to a sync button in config	2024-04-23 16:42:11 +05:30
Debanjum Singh Solanky	c39c4e4ec4	Improve prompt for online search query generation chat actor - Allow searching github, pypi for information about Khoj - Enable creating multiple search queries by rewording prompt	2024-04-22 01:32:11 +05:30
Debanjum Singh Solanky	175169c156	Use set, inferred max token limits wherever chat models are used - User configured max tokens limits weren't being passed to `send_message_to_model_wrapper' - One of the load offline model code paths wasn't reachable. Remove it to simplify code - When max prompt size isn't set infer max tokens based on free VRAM on machine - Use min of app configured max tokens, vram based max tokens and model context window	2024-04-20 11:23:28 +05:30
Debanjum Singh Solanky	002cd14a65	Only let agent use online search tool if connected to it	2024-04-20 11:19:48 +05:30
Debanjum Singh Solanky	75c9ebbc54	Only show uvicorn debug logs at higher verbosity levels Don't automatically show the uvicorn logs when in_debug_mode, only show at verbosity >= 2, i.e. when starting khoj with the -vv flag	2024-04-20 11:18:01 +05:30
sabaimran	d11354f9c8	Remove additional references to image content config	2024-04-17 13:00:50 +05:30
sabaimran	105dbf49e4	Fix max_duration_in_seconds for the update_embeddings job	2024-04-17 13:00:18 +05:30
Debanjum Singh Solanky	8e0bae894d	Extract run with process lock logic into func. Use for content reindexing	2024-04-17 12:31:19 +05:30
Debanjum Singh Solanky	e9f608174b	Fix access to Khoj admin panel from non HTTPS custom domains To access the Khoj admin panel from a non HTTPS custom domain the `KHOJ_NO_SSL' and `KHOJ_DOMAIN' env vars need to be explicitly set. See the updated setup docs for details. Resolves #662	2024-04-17 03:20:05 +05:30
sabaimran	b0059654c9	Do not create an import error if the resend module is not available	2024-04-17 01:00:22 +05:30
sabaimran	f04ead7c37	Remove setting up log line for configuring image search	2024-04-17 00:45:39 +05:30
sabaimran	0208688801	Increase factor for n_ctx reduction to 2e6	2024-04-17 00:41:36 +05:30
Debanjum Singh Solanky	1f2ffce85b	Copy chat message with its markdown formatting in Web, Desktop apps	2024-04-16 22:10:34 +05:30
sabaimran	91c8b137f1	Add a database lock for jobs that shouldn't be run by multiple workers (#706 ) * Add a database lock for jobs that shouldn't be run by multiple workers * Import relevant functions from utils.helpers	2024-04-16 21:29:27 +05:30
sabaimran	adb2e8cc5f	Check if n is populated before making a comparison	2024-04-16 02:05:58 +05:30
Debanjum Singh Solanky	6707ccc463	Check before updating "chat" key in meta_log in chat history API endpoint	2024-04-15 21:06:47 +05:30
Debanjum Singh Solanky	4e7812fe55	Use Django management cmd to update inline images in DB to/from WebP/PNG This provides Khoj server admins more control on migrating their S3 images to WebP format from PNG	2024-04-15 20:19:49 +05:30
Debanjum Singh Solanky	7fab8d6586	Only use chat messages count in history API endpoint when set by client	2024-04-15 19:12:57 +05:30
Debanjum	6b3ef61dd2	Improve Chat Page Load Perf, Offline Chat Perf and Miscellaneous Fixes (#703 ) ### Store Generated Images as WebP - `78bac4ae` Add migration script to convert PNG to WebP references in database - `c6e84436` Update clients to support rendering webp images inline - `d21f22ff` Store Khoj generated images as webp instead of png for faster loading ### Lazy Fetch Chat Messages to Improve Time, Data to First Render This is especially helpful for long conversations with lots of images - `128829c4` Render latest msgs on chat session load. Fetch, render rest as they near viewport - `9e558577` Support getting latest N chat messages via chat history API ### Intelligently set Context Window of Offline Chat to Improve Performance - `4977b551` Use offline chat prompt config to set context window of loaded chat model ### Fixes - `148923c1` Fix to raise error on hitting rate limit during Github indexing - `b8bc6bee` Always remove loading animation on Desktop app if can't login to server - `38250705` Fix `get_user_photo` to only return photo, not user name from DB ### Miscellaneous Improvements - `689202e0` Update recommended CMAKE flag to enable using CUDA on linux in Docs - `b820daf3` Makes logs less noisy	2024-04-15 18:34:29 +05:30
Debanjum Singh Solanky	a352940dfd	Use Django management command to update images URL in DB to WebP This provides Khoj server admins more control on migrating their S3 images to WebP format from PNG	2024-04-15 17:53:41 +05:30
Debanjum Singh Solanky	7d8e8eb0cf	Use Enum to type text-to-image intent of Khoj chat response	2024-04-15 17:53:40 +05:30
Debanjum Singh Solanky	128829c477	Show latest msgs on chat session load. Fetch rest as they near viewport - Reduces time to first render when loading long chat sessions - Limits size of first page load, when loading long chat sessions These performance improvements are maximally felt for large chat sessions with lots of images generated by Khoj Updated web and desktop app to support these changes for now	2024-04-15 16:10:56 +05:30
Debanjum Singh Solanky	9e5585776c	Support getting latest N chat messages via chat history API Get latest N if N > 0, else return all messages except latest N from the conversation	2024-04-15 15:32:32 +05:30
Debanjum Singh Solanky	e5ff85f6fb	Start fetching khoj css before icons to reduce time with no styling This should reduce frequency of page load jitter when icons are loaded before style is applied	2024-04-15 15:32:32 +05:30
Debanjum Singh Solanky	d5de59d411	Do not assume results key present in notion content when indexing	2024-04-15 08:02:20 +05:30
Debanjum Singh Solanky	4977b55106	Use offline chat prompt config to set context window of loaded chat model Previously you couldn't configure the n_ctx of the loaded offline chat model. This made it hard to use good offline chat model (which these days also have larger context) on machines with lower VRAM	2024-04-14 02:35:36 +05:30
Debanjum Singh Solanky	148923c13a	Fix to raise error on hitting rate limit during Github indexing	2024-04-13 22:09:13 +05:30
sabaimran	f24d71c71c	Improve the agents UX (#702 ) - Make the chat buttons look more clickable - Show agent name in new conversation message - Add an icon to the CTA to send agent a message	2024-04-13 20:11:37 +05:30
Debanjum Singh Solanky	78bac4ae05	Add migration script to convert PNG to WebP references in database	2024-04-13 19:06:28 +05:30
Debanjum Singh Solanky	c6e8443631	Update clients to support rendering webp images inline This is for self-hosted scenarios where AWS S3 uploads is not enabled	2024-04-13 13:11:18 +05:30
Debanjum Singh Solanky	d21f22ffa1	Store Khoj generated images as webp instead of png for faster loading	2024-04-13 13:03:32 +05:30
Debanjum Singh Solanky	b820daf38f	Makes logs less noisy - Show telemetry enabled/disabled state on init, not every 2 minutes - Convert no docs synced logs to debug level instead of warning Having synced docs isn't as important to use Khoj now, unlike before	2024-04-13 11:22:58 +05:30
Debanjum Singh Solanky	b8bc6bee83	Always remove loading animation on Desktop app if can't login to server	2024-04-13 11:02:44 +05:30
Debanjum Singh Solanky	382507051f	Fix get_user_photo to only return photo, not user name from DB	2024-04-13 11:02:30 +05:30
sabaimran	f06ec485cb	Fix redirect url process for login flow, existing user	2024-04-12 17:10:05 +05:30
sabaimran	b86e68a29d	Make it easier to view agents in the admin page	2024-04-12 13:02:22 +05:30
sabaimran	1377a44a1a	Suppress debug logs from uvicorn.error to avoid clutter from websockets - If application is not in DEBUG_MODE	2024-04-12 12:12:16 +05:30
Debanjum Singh Solanky	89b8ec3546	Release Khoj version 1.10.2	2024-04-12 11:53:32 +05:30
Debanjum Singh Solanky	50b4788a91	Remove chat loading animation in login required state on Desktop app	2024-04-12 11:50:54 +05:30
Debanjum Singh Solanky	b3f4794d91	Remove the unnecessary async/await func chains on Desktop app	2024-04-12 11:49:25 +05:30
Debanjum Singh Solanky	1e30a072d4	Just use file ext to identify indexable files to fix Desktop app install - Magika on Desktop app was too bloated (100Mb to 250Mb) and broke install for some reason. Not sure why it was causing the app install to fail but do not have time to currently investigate - Just use a file extensions whitelist; it's good enough for now. Let server handle the deeper identification of file type	2024-04-12 11:16:07 +05:30
Debanjum Singh Solanky	5c7797dbca	Only check content type if file extension cannot identify text file	2024-04-12 03:40:42 +05:30
Debanjum Singh Solanky	7d2ef728e6	Fix identifying pdf files on server Introduced bug in previous commit that would stop indexing PDF files as trying to check content_group instead of mime_type is application/pdf	2024-04-12 03:07:46 +05:30
Debanjum Singh Solanky	07f8fb5c5b	Release Khoj version 1.10.1	2024-04-12 02:18:07 +05:30
Debanjum Singh Solanky	a7d9102c33	Make identifying text, code files with Magika more robust on server Use identified content group rather than mime_type to find text files.	2024-04-12 02:12:26 +05:30
Debanjum Singh Solanky	60337086f9	Release Khoj version 1.10.0	2024-04-12 01:01:02 +05:30
Debanjum Singh Solanky	34c3f70203	Index only files with valid text extension in folders synced by Desktop app This maintains consistent set of indexable files from Desktop app, whether indexing via file or folder filters	2024-04-12 00:59:54 +05:30
Debanjum	9a48f72041	Index more text file types from Desktop, Github (#692 ) ### Index more text file types - Index all text, code files in Github repos. Not just md, org files - Send more text file types from Desktop app and improve indexing them - Identify file type by content & allow server to index all text files ### Deprecate Github Indexing Features - Stop indexing commits, issues and issue comments in a Github repo - Skip indexing Github repo on hitting Github API rate limit ### Fixes and Improvements - Fix indexing files in sub-folders from Desktop app - Standardize structure of text to entries to match other entry processors	2024-04-12 00:08:29 +05:30
Debanjum Singh Solanky	0819b83d0b	Fix constructing status update strings for intermediate chat steps	2024-04-11 20:31:32 +05:30
Debanjum Singh Solanky	d15b9bc272	Tell doc search actor to not generate online queries for doc search This can pick up irrelevant details from notes	2024-04-11 19:49:41 +05:30
Debanjum Singh Solanky	15a78b19ad	Improve Inferred Document Search Query Extraction from GPT Using stop_words = "\n" was preventing JSON responses with newlines in them	2024-04-11 19:24:04 +05:30
Debanjum Singh Solanky	653681967e	Show inferred document search queries in intermediate chat step on Web app	2024-04-11 19:24:04 +05:30
Debanjum Singh Solanky	997741119a	Show better intermediate steps when responding to chat via web socket - Show internet search, webpage read, image query, image generation steps - Standardize, improve rendering of the intermediate steps on the web app Benefits: 1. Improved transparency, allow users to see what Khoj is doing behind the scenes and modify their query patterns to improve response quality 2. Reduced websocket connection keep alive timeouts for long running steps	2024-04-11 18:04:40 +05:30
sabaimran	fae7900f19	Remove more	2024-04-11 00:27:44 +05:30
sabaimran	5d1dd3e2b7	If resend not enabled, don't send the welcome email	2024-04-10 23:52:42 +05:30
sabaimran	d2f9c43c8e	Use datetime.timezone.utc instead of datetime.utc	2024-04-10 23:07:43 +05:30
Debanjum Singh Solanky	f2dc9709b7	Use Magika to more robustly identify text files to send for indexing - `file-type' doesn't handle mis-labelled files or files without extensions well - Only show supported file types in file selector dialog on Desktop app Use Magika to get list of text file extensions. Combine with other supported extensions to get complete list of supported file extensions. Use it to limit selectable files in the File Open dialog. Note: Folder selector will index text files with no extensions as well	2024-04-10 22:44:24 +05:30
sabaimran	3fe94a67b0	Send welcome emails when a new user signs up (#691 ) * Don't trigger any re-indexing on server initialization * Integrate Resend to send welcome emails when a new user signs up - Only send if this is the first time they've signed in - Configure welcome email with basic styling, as more complex designs don't work and style tag did not work	2024-04-10 19:57:33 +05:30
Debanjum	6d153022f6	Improve nav pane, chat session UI on Desktop, Web app (#693 ) ### Enable copying chat messages. Improve copy button behavior and styling - Add button to copy chat messages on Desktop, Web apps - Improve copy button's icon, hover color & click animation in Desktop, Web apps ### Improve Navigation, Chat Session Panes on Desktop, Web apps - Dynamically generate navigation menu based on user info from server - Create API endpoint to get authenticated user information - Collapse navigation tabs into icons on mobile. Add spacing to them - Add Chat navigation tab back to top pane on Web app - Use proper icons for Search, Chat and Agents tab on navigation pane ### Miscellaneous Improvements - Make current chat expand to full width when session panel collapsed on Desktop App - Add chat session loading spinner to Desktop App (same as Web app) ### Fixes - Show title bar in Khoj desktop app on Windows to simplify close, minimize etc. - Only render first run setup message once if error or server not running - Fix showing Search navigation tab from Agent pages on web client	2024-04-10 19:54:12 +05:30
Debanjum Singh Solanky	48d249db9e	Center the nav item text and user profile initial icons	2024-04-10 19:38:43 +05:30
Debanjum Singh Solanky	60f6a1c6f1	Use svg icons in nav pane to standardize styling on Web, Desktop apps Emojis varied based on device. svg icons standardize icon styles of the web, desktop apps	2024-04-10 19:38:43 +05:30
Debanjum Singh Solanky	cccea484e4	Pass username, location context in system prompt instead of chat message The username and location in system prompt should disambiguate user context from user's actual message for the chat model. It doesn't need to be told to not mention the context or acknowledge the context instructions in its response, as it understands that this information is just context and not part of the user's actual message.	2024-04-10 15:05:33 +05:30
Debanjum Singh Solanky	804c04f7b9	Do not render copy message button on every Khoj thinking step Only render copy chat message button once, after message text is rendered	2024-04-10 14:48:36 +05:30
sabaimran	a4afada746	Remove client-side timeouts for the khoj socket	2024-04-10 13:35:25 +05:30
Debanjum Singh Solanky	cadeaac769	Align conversation sessions side panel on Desktop app with Web app - Move new conversation button to right of "Conversation" title - Reduce size of chat message loading ellipsis animation - Add loading animation for chat session	2024-04-10 10:34:36 +05:30
Debanjum Singh Solanky	1c3d129e08	Add button to copy chat messages on Desktop client	2024-04-10 10:34:36 +05:30
Debanjum Singh Solanky	0a5a91619e	Improve copy button's icon, hover color & click animation in Desktop UI	2024-04-10 10:34:36 +05:30
Debanjum Singh Solanky	184873213c	Add button to copy chat messages on Web client	2024-04-10 10:34:36 +05:30
Debanjum Singh Solanky	f56522cb8e	Improve copy button's icon, hover color & click animation in Web UI	2024-04-10 10:34:36 +05:30
Debanjum Singh Solanky	8ff3890ba8	Dynamically generate navigation menu based on user info from server	2024-04-10 10:34:36 +05:30
Debanjum Singh Solanky	94c69eb8e3	Create API endpoint to get authenticated user information This helps clients render UI with user information	2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky	377e979800	Make current chat expand to full width when session panel collapsed This behavior also matches web client behavior on chat session panel collapse	2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky	913dcdfbcd	Only render first run setup message once if error or server not running	2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky	3b630841bd	s/aget_all_filenames_by_source/get_all_filenames_by_source as sync func	2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky	e45edbb992	Collapse navigation tabs into icons on mobile. Add spacing to them	2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky	93edd5427f	Add Chat navigation tab back to top pane on web client Reduces user confusion on how to go to chat pane Add emojis for each tab to provide cleaner, iconified division between the nav options	2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky	8159d1ab25	Fix showing Search navigation tab from Agent pages on web client The `has_documents' flag wasn't being passed. So the search tab always showing up as empty instead of being dynamically enabled if documents had been indexed.	2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky	76cb543347	Show title bar in Khoj desktop app on Windows	2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky	f040418cf1	Fix indexing files in sub-folders on the Desktop app - `fs.readdir' func in node version 18.18.2 has buggy `recursive' option See nodejs/node#48640, effect-ts/effect#1801 for details - We were recursing down a folder in two ways on the Desktop app. Remove `recursive: True' option to the `fs.readdirSync' method call to recurse down via app code only	2024-04-09 20:19:40 +05:30
Debanjum Singh Solanky	a8dec1c9d5	Index all text, code files in Github repos. Not just md, org files	2024-04-09 20:19:40 +05:30
Debanjum Singh Solanky	8291b898ca	Standardize structure of text to entries to match other entry processors Add process_single_plaintext_file func etc with similar signatures as org_to_entries and markdown_to_entries processors The standardization makes modifications, abstractions easier to create	2024-04-09 20:19:40 +05:30
Debanjum Singh Solanky	079f409238	Skip indexing Github repo on hitting Github API rate limit Sleeping until the rate limit has passed is too expensive, as it keeps an app worker occupied. Ideally we should schedule a job to continue after the rate limit wait time has passed. But this can only be added once we support job scheduling.	2024-04-09 20:19:40 +05:30
Debanjum Singh Solanky	d5c9b5cb32	Stop indexing commits, issues and issue comments in Github indexer Normal indexing quickly hits Github rate limits. Purpose of exposing Github indexer is for indexing content like notes, code and other knowledge base in a repo. The current indexer doesn't scale to index metadata given Github's rate limits, so remove it instead of giving a degraded experience of partially indexed repos	2024-04-09 20:19:40 +05:30
Debanjum Singh Solanky	7ff1bd9f8b	Send more text file types from Desktop app and improve indexing them - Allow syncing more file types from desktop app to index on server - Use `file-type' package to identify valid text file types on Desktop app - Split plaintext entries into smaller logical units than a whole file Since the text splitting upgrades in #645, compiled chunks have more logical splits like paragraph, sentence. Show those (potentially) smaller snippets to the user as references - Tangential Fix: Initialize unbound currentTime variable for error log timestamp	2024-04-09 20:19:40 +05:30
Debanjum Singh Solanky	89915dcb4c	Identify file type by content & allow server to index all text files - Use Magika's AI for a tiny, portable and better file type identification system - Existing file type identification tools like `file' and `magic' require system level packages, that may not be installed by default on all operating systems (e.g `file' command on Windows)	2024-04-09 20:19:39 +05:30
sabaimran	312528d471	Fix typo in SECURE_PROXY_SSL_HEADER settings	2024-04-09 12:33:21 +05:30
sabaimran	e56c5e67dd	Revert SSL Redirect setting as it prevents the admin page from loading	2024-04-09 12:24:48 +05:30
sabaimran	1770bb174b	Add UUID to the KhojUser search fields and inc frequency of telemetry job to 2 mins	2024-04-09 11:51:51 +05:30
sabaimran	ab51ae9091	Use SECURE_SSL_REDIRECT to ensure requests are routed to https always	2024-04-09 10:18:12 +05:30
sabaimran	1c229dad91	Set daily limit for unsubscribed users to 5 in websocket API	2024-04-08 21:16:48 +05:30
sabaimran	27815d982c	Redirect user to the login page when either of the csrf token inputs is missing	2024-04-08 20:22:17 +05:30
sabaimran	d257629f81	Handle case when properties field isn't present in the page	2024-04-08 16:15:47 +05:30
sabaimran	089e0d028b	Add a more graceful error message when the rate limit is exceeded	2024-04-08 15:20:54 +05:30
Debanjum	11ce3e2268	Update Text Chunking Strategy to Improve Search Context (#645 ) ## Major - Parse markdown, org parent entries as single entry if fit within max tokens - Parse a file as single entry if it fits with max token limits - Add parent heading ancestry to extracted markdown entries for context - Chunk text in preference order of para, sentence, word, character ## Minor - Create wrapper function to get entries from org, md, pdf & text files - Remove unused Entry to Jsonl converter from text to entry class, tests - Dedupe code by using single func to process an org file into entries Resolves #620	2024-04-08 13:56:38 +05:30
Debanjum Singh Solanky	67b1178aec	Remove debug logs generated while compiling org-mode entries	2024-04-08 13:01:24 +05:30
sabaimran	731ad03348	Skip indexing commits that are missing properties	2024-04-07 15:19:07 +05:30
sabaimran	376eaf64cd	Check if results are present in the pages or db response in Notion	2024-04-07 15:19:07 +05:30
Debanjum Singh Solanky	8222615280	Do not add original user message to knowledge search queries for offline chat It's not required anymore. The extracted questions by the offline chat model being used should be good enough.	2024-04-07 11:29:35 +05:30
sabaimran	351fb31a34	Add webpage search to socket codepath, add a feature page for online search	2024-04-07 09:23:29 +05:30
Debanjum Singh Solanky	4be4c53222	Release Khoj version 1.9.0	2024-04-05 17:13:58 +05:30
sabaimran	2aedd3c819	Increase freq. of telemetry upload to every 5 minutes	2024-04-05 14:13:47 +05:30
sabaimran	3b1234d084	Await the calls to the db in the notion.py file	2024-04-05 13:58:14 +05:30
sabaimran	00a67e9524	Add additional log lines when configuring the Notion settings for a user in the callback	2024-04-05 13:19:24 +05:30
sabaimran	d23f7da8e3	Handle the case where a previous search model isn't set when updating the model	2024-04-05 13:18:51 +05:30
sabaimran	f57f9f672d	Address Notion, Image tech debt in indexing code path (#687 ) * Add support for using OAuth2.0 in the Notion integration * Add notion to the admin page * Remove unnecessary content_index and image search/setup references * Trigger background job to start indexing Notion after user configures it * Add a log line when a new Notion integration is setup * Fix references to the configure_content methods	2024-04-05 12:10:03 +05:30
sabaimran	a60321b68e	Push khoj to include inline references when possible	2024-04-04 10:31:13 +05:30
sabaimran	5bdcb4e69c	Wait for location data to be returned before setting up the socket connection	2024-04-04 10:31:13 +05:30
Debanjum Singh Solanky	00f599ea78	Fix passing flags to re.split to break org, md content by heading level `re.MULTILINE' should be passed to the `flags' argument, not the `max_splits' argument of the `re.split' func This was messing up the indexing by only allowing a maximum of re.MULTILINE splits. Fixing this improves the search quality to previous state	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	32ac0622ff	Extract dates from compiled text entries	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	29c1c18042	Increase search distance to get relevant content for chat post indexer update More content indexed per entry would result in an overall scores lowering effect. Increase default search distance threshold to counter that - Details - Fix expected results post indexing updates - Fix search with max distance post indexing updates - Minor - Remove openai chat actor test for after: operator as it's not expected anymore	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	ad4fa4b2f4	Fix adding file path instead of stem to markdown entries	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	44b3247869	Update logical splitting of org-mode text into entries - Major - Do not split org file, entry if it fits within the max token limits - Recurse down org file entries, one heading level at a time until reach leaf node or the current parent tree fits context window - Update `process_single_org_file' func logic to do this recursion - Convert extracted org nodes with children into entries - Previously org node to entry code just had to handle leaf entries - Now it receives a list of org node trees - Only add ancestor path to root org-node of each tree - Indent each entry trees headings by +1 level from base level (=2) - Minor - Stop timing org-node parsing vs org-node to entry conversion Just time the wrapping function for org-mode entry extraction This standardizes what is being timed across at md, org etc. - Move try/catch to `extract_org_nodes' from `parse_single_org_file' func to standardize this also across md, org	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	eaa27ca841	Only add spaces after heading if any tags in orgnode raw entry repr	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	2ea8a832a0	Log error when fail to index md file. Fix, improve typing in md_to_entries	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	44eab74888	Dedupe code by using single func to process an org file into entries Add type hints to orgnode and org-to-entries packages	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	db2581459f	Parse markdown parent entries as single entry if fit within max tokens These changes improve context available to the search model. Specifically this should improve entry context from short knowledge trees, that is knowledge bases with sparse, short heading/entry trees Previously we'd always split markdown files by headings, even if a parent entry was small enough to fit entirely within the max token limits of the search model. This used to reduce the context available to the search model to select appropriate entries for a query, especially from short entry trees Revert back to using regex to parse through markdown file instead of using MarkdownHeaderTextSplitter. It was easier to implement the logical split using regexes rather than bend MarkdownHeaderTextSplitter to implement it. - DFS traverse the markdown knowledge tree, prefix ancestry to each entry	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	982ac1859c	Parse markdown file as single entry if it fits with max token limits These changes improve entry context available to the search model Specifically this should improve entry context from short knowledge trees, that is knowledge bases with small files Previously we split all markdown files by their headings, even if the file was small enough to fit entirely within the max token limits of the search model. This used to reduce the context available to select the appropriate entries for a given query for the search model, especially from short knowledge trees	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	d8f01876e5	Add parent heading ancestry to extracted markdown entries for context Improve, update the markdown to entries extractor tests	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	86575b2946	Chunk text in preference order of para, sentence, word, character - Previous simplistic chunking strategy of splitting text by space didn't capture notes with newlines, no spaces. For e.g. in #620 - New strategy will try to chunk the text at more natural points like paragraph, sentence, word first. If none of those work it'll split at character to fit within max token limit - Drop long words while preserving original delimiters Resolves #620	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	a627f56a64	Remove unused Entry to Jsonl converter from text to entry class, tests This was earlier used when the index was plaintext jsonl file. Now that documents are indexed in a DB this func is not required. Simplify org,md,pdf,plaintext to entries tests by removing the entry to jsonl conversion step	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	28105ee027	Create wrapper function to get entries from org, md, pdf & text files - Convert extract_org_entries function to actually extract org entries Previously it was extracting intermediary org-node objects instead Now it extracts the org-node objects from files and converts them into entries - Create separate, new function to extract_org_nodes from files - Similarly create wrapper funcs for md, pdf, plaintext to entries - Update org, md, pdf, plaintext to entries tests to use the new simplified wrapper function to extract org entries	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	f01a12b1d2	Improve styling of chat sessions side panel - Move green server connected dot to the bottom. Show status when disconnected from server - Move "New conversation" button to right of the "Conversation" title - Center alignment of the new conversation and connection status buttons	2024-04-04 01:43:26 +05:30
sabaimran	dd1e5e145a	Use List[Any] for typing	2024-04-03 21:46:41 +05:30
sabaimran	b8087c4c8e	Add typing to empty list variables in github_to_entries	2024-04-03 21:41:36 +05:30
sabaimran	d036fdfc26	If tree is not in the contents, then just return empty files list	2024-04-03 17:55:25 +05:30
Debanjum Singh Solanky	f915b2bd14	Fix passing model_name param to chatml formatter for online chat	2024-04-03 17:21:43 +05:30
sabaimran	6aa88761b8	Skip creating the default agent if there's no default conversation config	2024-04-03 17:21:01 +05:30
sabaimran	b4f71e06b3	Add timeout after 10 minutes of inactivity on socket	2024-04-02 22:12:27 +05:30
sabaimran	f48426623d	resolve merge conflict in chat.html	2024-04-02 17:29:48 +05:30
sabaimran	bf1187f465	Use new online/websearch logic and add agent to chat_metadata	2024-04-02 17:20:38 +05:30
sabaimran	867e1007d1	Remove superfluous newline	2024-04-02 17:20:08 +05:30
sabaimran	228ad68042	Merge with origin/master	2024-04-02 17:02:21 +05:30
sabaimran	776550d5ce	Add a migration for updating the default chat model, update for existing users	2024-04-02 17:01:31 +05:30
sabaimran	47fc7e1ce6	Rebase with master	2024-04-02 16:16:06 +05:30
Debanjum	215ab6e66a	Extract More Dates from entries to improve Date Filter (#683 ) - Overview - Extract more structured date variants (e.g with dot(.) & slash(/) separators, 2-digit year) - Extract some natural, partial dates as well from entries - Capability Add ability to extract the following additional date forms: - Natural Dates: 21st April 2000, February 29 2024 - Partial Natural Dates: March 24, Mar 2024 - Structured Dates: 20/12/24, 20.12.2024, 2024/12/20 Note: Previously only YYYY-MM-DD ISO-8601 structured date form was extracted for date filters - Performance Using regexes is MUCH faster than using the `dateparser' python library It's a little crude but gives acceptable performance for large datasets	2024-04-02 16:14:53 +05:30
Debanjum Singh Solanky	7afee2d55c	Let offline chat model set context window. Improve, fix prompts	2024-03-31 16:19:35 +05:30
Debanjum Singh Solanky	4228965c9b	Handle msg truncation when question is larger than max prompt size Notice and truncate the question itself at this point	2024-03-31 15:50:06 +05:30
Debanjum Singh Solanky	886d49e3a4	Merge branch 'master' into migrate-to-llama-cpp-for-offline-chat	2024-03-31 00:59:20 +05:30
Debanjum Singh Solanky	4f65dde201	Release Khoj version 1.8.0	2024-03-31 00:06:15 +05:30
Debanjum Singh Solanky	7923903d21	Improve date filter regexes to extract structured, natural, partial dates - Much faster than using dateparser - It took 2x-4x for improved regex to extract 1-15% more dates - Whereas it took 33x to 100x for dateparser to extract 65% - 400% more dates - Improve date extractor tests to test deduping dates, natural, structured date extraction from content - Extract some natural, partial dates and more structured dates Using regex is much faster than using dateparser. It's a little crude but should pay off in performance. Supports dates of form: - (Day-of-Month) Month\|AbbreviatedMonth Year\|2DigitYear - Month\|AbbreviatedMonth (Day-of-Month) Year\|2DigitYear	2024-03-30 00:07:19 +05:30
Debanjum Singh Solanky	104eeea274	Extract natural language and locale specific dates in content Previously we just extracted dates in YYYY-MM-DD format from content for date filtering during search. Use dateparser to extract dates across locales and natural language This should improve notes returned as context when chat searches knowledge base with date filters Fallback to regex for date parsing from content if dateparser fails - Limit natural date extractor capabilities to improve performance - Assume language is english Language detection otherwise takes a REALLY long time - Do not extract unix timestamps, timezone - This isn't required, as just using date and approximating dates as UTC	2024-03-30 00:06:56 +05:30
sabaimran	1195f843a3	Remove forward slash from the root agents endpoint	2024-03-28 23:06:55 +05:30
sabaimran	a1729b9b9e	Add telemetry for agents used in conversation, increase image width in agents page	2024-03-28 22:18:11 +05:30
sabaimran	d503b3e867	Use Personality vernacular in agent page - When setting up the default agent, configure every conversation that doesn't have an agent to use the Khoj agent - Fix reverse migration for the locale removal migration	2024-03-28 15:07:02 +05:30
sabaimran	e59de8c9b1	Constrain width/size of agent image in agents view	2024-03-28 13:32:11 +05:30
sabaimran	51d0c9b8b0	Add telemetry to keep state of new agents being used	2024-03-28 11:37:24 +05:30
sabaimran	46ebc55e2b	Add a top tab for agents	2024-03-28 11:37:01 +05:30
sabaimran	8397187231	Use default agent when creating a new conversation without agent specified	2024-03-28 11:36:27 +05:30
Debanjum Singh Solanky	4912c0ee30	Use extract queries actor to improve notes search with offline chat Previously we were skipping the extract questions step for offline chat as default offline chat model wasn't good enough to output proper json given the time it took to extract questions. The new default offline chat models gives json much more regularly and with date filters, so the extract questions step becomes useful given the impact on latency	2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky	1ebd5c3648	Rename GPT4AllChatProcessor* to OfflineChatProcessor Config, Model	2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky	2a0b943bb4	Use Hermes-2-Pro as default offline chat model in khoj.yml	2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky	8ca39a436c	Use llama.cpp for offline chat models - Benefits of moving to llama-cpp-python from gpt4all: - Support for all GGUF format chat models - Support for AMD, Nvidia, Mac, Vulkan GPU machines (instead of just Vulkan, Mac) - Supports models with more capabilities like tools, schema enforcement, speculative decoding, image gen etc. - Upgrade default chat model, prompt size, tokenizer for new supported chat models - Load offline chat model when present on disk without requiring internet - Load model onto GPU if not disabled and device has GPU - Load model onto CPU if loading model onto GPU fails - Create helper function to check and load model from disk, when model glob is present on disk. `Llama.from_pretrained' needs internet to get repo info from HuggingFace. This isn't required, if the model is already downloaded Didn't find any existing HF or llama.cpp method that looked for model glob on disk without internet	2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky	0a7392f6ec	Only add location to image prompt generator when location known	2024-03-26 22:33:01 +05:30
sabaimran	fdf78525b4	Part 2: Add web UI updates for basic agent interactions (#675 ) * Initial pass at backend changes to support agents - Add a db model for Agents, attaching them to conversations - When an agent is added to a conversation, override the system prompt to tweak the instructions - Agents can be configured with prompt modification, model specification, a profile picture, and other things - Admin-configured models will not be editable by individual users - Add unit tests to verify agent behavior. Unit tests demonstrate imperfect adherence to prompt specifications * Customize default behaviors for conversations without agents or with default agents * Add a new web client route for viewing all agents * Use agent_id for getting correct agent * Add web UI views for agents - Add a page to view all agents - Add slugs to manage agents - Add a view to view single agent - Display active agent when in chat window - Fix post-login redirect issue * Fix agent view * Spruce up the 404 page and improve the overall layout for agents pages * Create chat actor for directly reading webpages based on user message - Add prompt for the read webpages chat actor to extract, infer webpage links - Make chat actor infer or extract webpage to read directly from user message - Rename previous read_webpage function to more narrow read_webpage_at_url function * Rename agents_page -> agent_page * Fix unit test for adding the filename to the compiled markdown entry * Fix layout of agent, agents pages * Merge migrations * Let the name, slug of the default agent be Khoj, khoj * Fix chat-related unit tests * Add webpage chat command for read web pages requested by user Update auto chat command inference prompt to show example of when to use webpage chat command (i.e when url is directly provided in link) * Support webpage command in chat API - Fallback to use webpage when SERPER not setup and online command was attempted - Do not stop responding if can't retrieve online results. Try to respond without the online context * Test select webpage as data source and extract web urls chat actors * Tweak prompts to extract information from webpages, online results - Show more of the truncated messages for debugging context - Update Khoj personality prompt to encourage it to remember its capabilities * Rename extract_content online results field to webpages * Parallelize simple webpage read and extractor Similar to what is being done with search_online with olostep * Pass multiple webpages with their urls in online results context Previously even if MAX_WEBPAGES_TO_READ was > 1, only 1 extracted content would ever be passed. URL of the extracted webpage content wasn't passed to clients in online results context. This limited them from being rendered * Render webpage read in chat response references on Web, Desktop apps * Time chat actor responses & chat api request start for perf analysis * Increase the keep alive timeout in the main application for testing * Do not pipe access/error logs to separate files. Flow to stdout/stderr * [Temp] Reduce to 1 gunicorn worker * Change prod docker image to use jammy, rather than nvidia base image * Use Khoj icon when Khoj web is installed on iOS as a PWA * Make slug required for agents * Simplify calling logic and prevent agent access for unauthenticated users * Standardize to use personality over tuning in agent nomenclature * Make filtering logic more stringent for accessible agents and remove unused method: * Format chat message query --------- Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>	2024-03-26 18:13:24 +05:30
Debanjum Singh Solanky	15ed208996	Use Khoj icon when Khoj web is installed on iOS as a PWA	2024-03-26 00:13:12 +05:30
Debanjum	586654e2af	Allow directly reading web pages, even when SERP not enabled (#676 ) ### Overview Khoj can now read website directly without needing to go through the search step first ### Details - Parallelize simple webpage read and extractor - Rename extract_content online results field to web pages - Tweak prompts to extract information from webpages, online results - Test select webpage as data source and extract web urls chat actors - Render webpage read in chat response references on Web, Desktop apps - Pass multiple webpages with their urls in online results context - Support webpage command in chat API - Add webpage chat command for read web pages requested by user - Create chat actor for directly reading webpages based on user message	2024-03-24 16:25:25 +05:30
Debanjum Singh Solanky	9e52ae9e98	Time chat actor responses & chat api request start for perf analysis	2024-03-24 15:47:38 +05:30
Debanjum Singh Solanky	dabf71bc3c	Render webpage read in chat response references on Web, Desktop apps	2024-03-24 15:47:38 +05:30
Debanjum Singh Solanky	a2e79c94be	Pass multiple webpages with their urls in online results context Previously even if MAX_WEBPAGES_TO_READ was > 1, only 1 extracted content would ever be passed. URL of the extracted webpage content wasn't passed to clients in online results context. This limited them from being rendered	2024-03-24 15:47:38 +05:30
Debanjum Singh Solanky	71b6905008	Parallelize simple webpage read and extractor Similar to what is being done with search_online with olostep	2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky	1167f6ddf9	Rename extract_content online results field to webpages	2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky	b22a7dae5d	Tweak prompts to extract information from webpages, online results - Show more of the truncated messages for debugging context - Update Khoj personality prompt to encourage it to remember its capabilities	2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky	ad6f6bb0ed	Support webpage command in chat API - Fallback to use webpage when SERPER not setup and online command was attempted - Do not stop responding if can't retrieve online results. Try to respond without the online context	2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky	a6b7432837	Add webpage chat command for read web pages requested by user Update auto chat command inference prompt to show example of when to use webpage chat command (i.e when url is directly provided in link)	2024-03-24 15:46:29 +05:30
sabaimran	8abc8ded82	Part 1: Server-side changes to support agents integrated with Conversations (#671 ) * Initial pass at backend changes to support agents - Add a db model for Agents, attaching them to conversations - When an agent is added to a conversation, override the system prompt to tweak the instructions - Agents can be configured with prompt modification, model specification, a profile picture, and other things - Admin-configured models will not be editable by individual users - Add unit tests to verify agent behavior. Unit tests demonstrate imperfect adherence to prompt specifications * Customize default behaviors for conversations without agents or with default agents * Use agent_id for getting correct agent * Merge migrations * Simplify some variable definitions, add additional security checks for agents * Rename agent.tuning -> agent.personality	2024-03-23 22:09:38 +05:30
sabaimran	4deb849fb1	Merge branch 'features/add-agents-ui' of github.com:khoj-ai/khoj into features/chat-socket-streaming	2024-03-23 14:04:25 +05:30
sabaimran	8edbd7094f	Let the name, slug of the default agent be Khoj, khoj	2024-03-23 14:03:58 +05:30
sabaimran	6b4c4f10b5	Merge branch 'features/add-agents-ui' of github.com:khoj-ai/khoj into features/chat-socket-streaming	2024-03-23 11:22:00 +05:30
sabaimran	20617614ae	Merge branch 'features/customize-chat-with-agents' of github.com:khoj-ai/khoj into features/add-agents-ui	2024-03-23 11:20:57 +05:30
sabaimran	2399d91f61	Merge migrations	2024-03-22 10:05:33 +05:30
sabaimran	d38089ab57	Merge with origin	2024-03-22 09:55:33 +05:30
Debanjum Singh Solanky	aed4313cfc	Fix updating specific conversation by id from the chat API endpoint - Use the conversation id of the retrieved conversation rather than the potentially unset conversation id passed via API - await creating new chat when no chat id provided and no existing conversations exist	2024-03-21 02:46:52 +05:30
sabaimran	6ba0d8e379	Add a connected notification if the websocket is connected	2024-03-20 20:53:28 +05:30
sabaimran	255b69dc58	Add a comma delimiter between outputted search queries	2024-03-20 19:43:35 +05:30
sabaimran	d84188b221	Scroll down when a message is added in the chat interface's handle stream response method	2024-03-20 15:04:41 +05:30
sabaimran	70ad78990a	Use a common method for sending a generic message to the client from the server in the ws connection	2024-03-20 15:04:14 +05:30
sabaimran	d4e83b060a	Update the web UI for the chat interface to establish a connection via a socket to the server - Move some common methods into separate functions to make the UI components more efficient - The normal HTTP-based chat connection will still work and serves as a fallback if the websocket is unavailable	2024-03-20 14:34:47 +05:30
sabaimran	a346f79b39	Add support for chatting via the web socket connection - Convert to a model of calling the search API directly with a function call (rather than using the API method) - Gracefully handle websocket connection disconnects - Ensure that the rest of the response is still saved, as it is currently, if the user disconnects from the client - Setup unchangeable context at the beginning of the session when the connection is established (like location, username, etc)	2024-03-20 14:33:33 +05:30
Debanjum Singh Solanky	62a83dc9bb	Fix online search actor to use natural dates not after: operator The recently added after: operator to online search actor was too restrictive, gave worse results than when just use natural language dates in search query	2024-03-15 21:50:14 +05:30
Debanjum Singh Solanky	4a1e6a2275	Convert deleted old user requests log line to debug from info	2024-03-15 20:50:10 +05:30
Debanjum Singh Solanky	9a068dadbf	Fix extract questions prompt to use YYYY-MM-DD date filter format	2024-03-15 18:43:18 +05:30
Debanjum Singh Solanky	ecddf98430	Handle truncation when single long non-system chat message Previously was assuming the system prompt is being always passed as the first message. So expected there to be at least 2 messages in logs. This broke chat actors querying with single long non system message. A more robust way to extract system prompt is via the message role instead	2024-03-15 15:58:39 +05:30
Debanjum Singh Solanky	ec0c35b7ed	Improve delete, rename chat session UX in Desktop, Web app - Ask for Confirmation before deleting chat session in Desktop, Web app - Save chat session rename on hitting enter in title edit input box - No need to flash previous conversation cleared status message - Move chat session delete button after rename button in Desktop app	2024-03-15 15:58:19 +05:30
Debanjum Singh Solanky	924b1215ce	Allow unset locale for Google authenticated user	2024-03-15 15:35:20 +05:30
Debanjum Singh Solanky	c792fa819f	Fix setting chat session title from Desktop app Pass auth headers to not have the chat session title update request fail	2024-03-15 15:19:20 +05:30
Debanjum Singh Solanky	c9e05dc184	Get conversation by title when requested via chat API	2024-03-15 12:31:50 +05:30
sabaimran	724557fc7b	Merge branch 'master' of github.com:khoj-ai/khoj into features/add-agents-ui	2024-03-15 12:14:34 +05:30
sabaimran	7fc484ba7a	Merge branch 'master' of github.com:khoj-ai/khoj into features/customize-chat-with-agents	2024-03-15 12:13:28 +05:30
Debanjum Singh Solanky	cac26dafe3	Only create new chat on get if a specific chat id, slug isn't requested	2024-03-15 11:58:39 +05:30
sabaimran	416feb13ef	Fix layout of agent, agents pages	2024-03-15 11:17:40 +05:30
sabaimran	d734be61cf	Rename agents_page -> agent_page	2024-03-15 10:17:51 +05:30
Debanjum Singh Solanky	08993ff109	Add new, remove old known chat models from model to prompt size map	2024-03-15 04:02:25 +05:30
Debanjum Singh Solanky	fba0338787	Release Khoj version 1.7.0	2024-03-15 00:08:32 +05:30
Debanjum Singh Solanky	6118d1ff57	Create chat actor for directly reading webpages based on user message - Add prompt for the read webpages chat actor to extract, infer webpage links - Make chat actor infer or extract webpage to read directly from user message - Rename previous read_webpage function to more narrow read_webpage_at_url function	2024-03-14 14:58:37 +05:30
Debanjum	e549824fe2	Improve OpenAI Chat Actors and their prompts (#673 ) ### Major - Enforce json mode response from OpenAI chat actors prev using string lists - Use `gpt-4-turbo-preview' as default chat model, extract questions actor - Make Khoj read khoj website to respond with accurate, up-to-date information about itself - Dedupe query in notes prompt. Improve OAI chat actor, director tests ### Minor - Test data source, output mode selector, web search query chat actors - Improve notes search actor to always create a non-empty list of queries - Construct available data sources, output modes as a bullet list in prompts - Use consistent agent name across static and dynamic examples in prompts - Add actor's name to extract questions prompt to improve context for guidance	2024-03-14 12:44:40 +05:30
sabaimran	3caf0a79d8	Spruce up the 404 page and improve the overall layout for agents pages	2024-03-14 11:26:49 +05:30
sabaimran	c45030af44	Fix agent view	2024-03-14 11:13:19 +05:30
Debanjum Singh Solanky	a1ce12296f	Fix rendering online with note references post streaming chat response Previously only the notes references would get rendered post response streaming when both online and notes references were used to respond to the user's message	2024-03-14 03:40:40 +05:30
Debanjum Singh Solanky	1aeea3d854	Fix opening external links from confirmation dialog box on desktop app	2024-03-14 02:29:22 +05:30
Debanjum Singh Solanky	2e5cc49cb3	Enforce json response from OpenAI chat actors prev using string lists - Allow passing response format type to OpenAI API via chat actors - Convert in-context examples to use json objects instead of str lists - Update actors outputting str list to request output to be json_object - OpenAI's json mode enforces the model to output valid json object	2024-03-14 01:22:33 +05:30
Debanjum Singh Solanky	7211eb9cf5	Default to gpt-4-turbo-preview for chat model, extract questions actor GPT-4 is more expensive and generally less capable than gpt-4-turbo-preview	2024-03-14 01:22:33 +05:30
Debanjum Singh Solanky	dd883dc53a	Dedupe query in notes prompt. Improve OAI chat actor, director tests - Remove stale tests - Improve tests to pass across gpt-3.5 and gpt-4-turbo - The haiku creation director was failing because of duplicate query in instantiated prompt	2024-03-14 01:22:33 +05:30
Debanjum Singh Solanky	14682d5354	Improve notes search actor to always create a non-empty list of queries - Remove the option for Notes search query generation actor to return no queries. Whether search should be performed is decided before, this step doesn't need to decide that - But do not throw warning if the response is a list with no elements	2024-03-14 01:22:33 +05:30
Debanjum Singh Solanky	f5734826cb	Improve pick data source prompt to look online for info about Khoj - Add examples where user queries requesting information about Khoj results in the "online" data source being selected - Add an example for "general" to select chat command prompt	2024-03-14 01:21:13 +05:30
Debanjum Singh Solanky	9a516bed47	Construct available data sources, output modes as a bullet list in prompts	2024-03-14 00:34:57 +05:30
Debanjum Singh Solanky	f28fb89af8	Use consistent agent name across static and dynamic examples in prompts Previously the examples constructed from chat history used "Khoj" as the agent's name but all 3 prompts using the func used static examples with "AI:" as the pertinent agent's name	2024-03-14 00:34:57 +05:30
Debanjum Singh Solanky	f5793149a9	Add actor's name to extract questions prompt to improve context for guidance	2024-03-14 00:34:57 +05:30
Debanjum Singh Solanky	73ad444086	Make online search Actor read khoj.dev for docs, info about Khoj - Add example to read khoj.dev website for up-to-date info to setup, use khoj, discover khoj features etc. - Online search should use site: and after: google search operators - Show example of adding the after: date filter to google search - Give local event lookup example using user's current location in query - Remove unused select search content type prompt	2024-03-14 00:34:57 +05:30
sabaimran	290712c3fe	Add web UI views for agents - Add a page to view all agents - Add slugs to manage agents - Add a view to view single agent - Display active agent when in chat window - Fix post-login redirect issue	2024-03-14 00:07:36 +05:30
Debanjum	3abe7ccb26	Improve Online Search Speed and Context (#670 ) ### Major - Read web pages in parallel to improve chat response time - Read web pages directly when Olostep proxy not setup - Include search results & web page content in online context for chat response ### Minor - Simplify, modularize and add type hints to online search functions	2024-03-11 22:16:30 +05:30
Debanjum Singh Solanky	dc86e44a07	Include search results & webpage content in online context for chat response Previously if a web page was read for a sub-query, only the extracted web page content was provided as context for the given sub-query. But the google results themselves have relevant snippets. So include them	2024-03-11 18:41:02 +05:30
Debanjum Singh Solanky	d136a6be44	Simplify, modularize and add type hints to online search functions - Simplify content arg to `extract_relevant_info' function. Validate, clean the content arg inside the `extract_relevant_info' function - Extract `search_with_google' function outside the parent function - Call the parent function a more appropriate `search_online' instead of `search_with_google' - Simplify the `search_with_google' function using list comprehension. Drop empty search result fields from chat model context for response to reduce cost and response latency - No need to show stacktrace when unable to read webpage, basic error is enough - Add type hints to online search functions to catch issues with mypy	2024-03-11 18:41:02 +05:30
Debanjum Singh Solanky	88f096977b	Read webpages directly when Olostep proxy not setup This is useful for self-hosted, individual user, low traffic setups where a proxy service is not required	2024-03-11 18:41:02 +05:30
Debanjum Singh Solanky	ca2f962e95	Read, extract information from web pages in parallel to lower response time - Time reading webpage, extract info from webpage steps for perf analysis - Deduplicate webpages to read gathered across separate google searches - Use aiohttp to make API requests non-blocking, pair with asyncio to parallelize all the online search webpage read and extract calls	2024-03-11 18:41:02 +05:30
sabaimran	8e1445b15b	Use agent_id for getting correct agent	2024-03-11 14:44:46 +05:30
sabaimran	6ab649312f	Add a new web client route for viewing all agents	2024-03-11 14:40:40 +05:30
sabaimran	352168d6c2	Customize default behaviors for conversations without agents or with default agents	2024-03-11 14:20:28 +05:30
sabaimran	9b88976f36	Initial pass at backend changes to support agents - Add a db model for Agents, attaching them to conversations - When an agent is added to a conversation, override the system prompt to tweak the instructions - Agents can be configured with prompt modification, model specification, a profile picture, and other things - Admin-configured models will not be editable by individual users - Add unit tests to verify agent behavior. Unit tests demonstrate imperfect adherence to prompt specifications	2024-03-11 12:45:24 +05:30
Debanjum	18fa3e2384	Rerank Search Results by Default on GPU machines (#668 ) - Trigger SentenceTransformer Cross Encoder models now run fast on GPU enabled machines, including Mac ARM devices since UKPLab/sentence-transformers#2463 - Details - Use cross-encoder to rerank search results by default on GPU machines and when using an inference server - Only call search API when pause in typing search query on web, desktop apps	2024-03-10 15:15:25 +05:30
Debanjum Singh Solanky	53d402480c	Rerank search results with cross-encoder when using an inference server If an inference server is being used, we can expect the cross encoder to be running fast enough to rerank search results by default	2024-03-10 15:09:46 +05:30
Debanjum Singh Solanky	44c8d09342	Only call search API when pause in typing search query on web, desktop apps Wait for 300ms since stop typing before calling search API. This smooths out UI jitter when rendering search results, especially now that we're reranking for every search query on GPU enabled devices Emacs already has 300ms debounce time. More convoluted to add debounce time to Obsidian search modal, so not updating that yet	2024-03-10 14:29:24 +05:30
Debanjum Singh Solanky	1105d8814f	Use cross-encoder to rerank search results by default on GPU machines Latest sentence-transformer package uses GPU for cross-encoder. This makes it fast enough to enable reranking on machines with GPU. Enabling search reranking by default allows (at least) users with GPUs to side-step learning the UI affordance to rerank results (i.e hitting Cmd/Ctrl-Enter or ENTER).	2024-03-10 14:29:21 +05:30
Debanjum Singh Solanky	fd81446ba3	Do not create new chat session when an old chat session is deleted - Fix `get_conversation_by_user' shouldn't return new conversation if conversation with requested id not found. It should only return new conversation if no specific conversation is requested and no conversations found for user at all - Repro - Delete a new chat, this calls loadChat via window.onload which calls server /chat/history API endpoint with conversationId set to that of just deleted conversation sporadically The call to GET chat/history API with conversationId set occurs when window.onload triggers before the conversationId is deleted by the delete button after the DELETE /chat/history API call (via race) - In such a scenario, get_conversation_by_user called by chat/history API with conversationId of deleted conversation returns a new conversation - Miscellaneous - Chat history load should be logged as call to that chat_history api, not the "chat" api - Show status updates of clearing conversation history in chat input - Simplify web, desktop client code by removing unnecessary new variables	2024-03-10 02:17:23 +05:30
Debanjum Singh Solanky	b7fad04870	Use consistent field name for queries in chat history & better image prompt	2024-03-09 19:11:03 +05:30
sabaimran	6aae9864d3	Fix Notion indexing and add an admin view for Entry objects	2024-03-09 16:25:23 +05:30
sabaimran	12d6c4da7d	Only include inferred queries in the conversation history for images, not links. Overflow the side panel when too long	2024-03-09 11:59:35 +05:30
sabaimran	e5cd0237e3	Release Khoj version 1.6.2	2024-03-08 17:04:03 +05:30
Debanjum Singh Solanky	446ac7649d	Remove unused js method in web chat client, add newline to web data in prompt	2024-03-08 16:40:39 +05:30
Debanjum Singh Solanky	12d32ac99c	Increase user visibility into more errors during image generation Catch OpenAI connection error and errors during better image prompt generation	2024-03-08 16:40:39 +05:30
sabaimran	ff31759423	Fix target determination in the copy programmatic output button	2024-03-08 16:33:12 +05:30
sabaimran	9f934929c6	Infer mime type from file ending when not available in browser. Don't output image in conversation turns	2024-03-08 12:34:26 +05:30
sabaimran	81beb7940c	Upload generated images to s3, if AWS credentials and bucket is available (#667 ) * Upload generated images to s3, if AWS credentials and bucket is available. - In clients, render the images via the URL if it's returned with a text-to-image2 intent type * Make the loading screen more intuitive, less jerky and update the programmatic copy button * Update the loading icon when waiting for a chat response	2024-03-08 10:54:13 +05:30
sabaimran	13894e1fd5	add instructions for drag/drop files in sys prompt	2024-03-07 17:57:42 +05:30
sabaimran	7357b6eff1	Revert white-space preline and add more detailed help text when selecting file	2024-03-06 16:47:27 +05:30
sabaimran	b615c0719e	Support upload for files via drag/drop in the web UI (#666 ) * Add additional styling changes for showing UI changes when dragging file to the main screen * Add a loading spinner when file upload is in progress, and don't index github/notion when indexing files * Add an explicit icon for file uploading in the chat button menu * Add appropriate dragover styling when picking a file from the file picker/browser * Add a loading screen when retrieving chat history. Fix width of the chat window. Put attachment icon to the left of chat input	2024-03-06 16:43:05 +05:30
sabaimran	e323a6d69b	Include additional user context in the image generation flow (#660 ) * Make major improvements to the image generation flow - Include user context from online references and personal notes for generating images - Dynamically select the modality that the LLM should respond with - Return the inferred context in the query response for the desktop, web chat views to read * Add unit tests for retrieving response modes via LLM * Move output mode unit tests to the actor suite, rather than director * Only show the references button if there is at least one available * Rename aget_relevant_modes to aget_relevant_output_modes * Use a shared method for generating reference sections, simplify some of the prompting logic * Make out of space errors in the desktop client more obvious	2024-03-06 13:48:41 +05:30
Debanjum Singh Solanky	2d61591c22	Improve user visibility into errors during image generation	2024-02-29 13:19:13 +05:30
sabaimran	0bbb5cff85	Release Khoj version 1.6.1	2024-02-26 13:27:20 -08:00
sabaimran	c8194a7364	Make out of space errors in the desktop client more obvious	2024-02-26 11:53:36 -08:00
Debanjum Singh Solanky	956dd71d91	Clean entry before adding to DB and log when it fails Remove \0 null characters from entry fields as this is causing indexing errors	2024-02-27 01:19:34 +05:30
Debanjum Singh Solanky	bb613a8e1d	Make indentation styling more compact on Obsidian client	2024-02-25 14:41:45 +05:30
Debanjum Singh Solanky	682b70011f	Set chat body height to remove UX jitter on chat history load in Web, Desktop	2024-02-25 14:40:47 +05:30
Debanjum Singh Solanky	efe86ce159	Fix saved conversation logger to handle image responses	2024-02-25 13:46:32 +05:30
Debanjum Singh Solanky	4839f2901a	Open external links in Desktop app with default app for url on OS - Open external links using the default link handler registered on OS for the link type, e.g http:// -> firefox, mailto: thunderbird etc - Confirm before opening non-http URL using an external app	2024-02-25 13:21:52 +05:30
Debanjum	170bce2c02	Fix, Improve rendering images in Obsidian, Desktop, Web clients (#659 ) - Improve render of inferred query in image chat messages in Web, Desktop apps - Add inferred queries to image chat responses in Obsidian client - Fix rendering images from Khoj response in Obsidian client	2024-02-25 00:56:26 +05:30
Debanjum Singh Solanky	f84606325c	Improve render of inferred query in image chat messages in Web, Desktop apps	2024-02-25 00:47:06 +05:30
Debanjum Singh Solanky	a2e53d5e41	Add inferred queries to image chat responses in Obsidian client	2024-02-25 00:24:58 +05:30
Debanjum Singh Solanky	9b61f0b5f7	Fix rendering images from Khoj response in Obsidian client	2024-02-25 00:11:11 +05:30
sabaimran	b9d0533d92	Misc. fixes to prompting, admin, and others (#658 ) * Simplify and clarify prompt for selecting toolset dynamically * Add error handling around call to OLOSTEP api * Fix conversation admin page * Skip adding none or empty entries in the chunking method	2024-02-24 10:25:42 -08:00
Debanjum Singh Solanky	0e0e751ef7	Improve docstring of entrypoint function to the emacs client	2024-02-24 21:09:41 +05:30
Debanjum	8855529637	Improve Syncing Obsidian Vault, Invalidate Static Assets in Browser Cache in Web Client (#657 ) - Improve - Only send files modified since their last sync for indexing on server from the Obsidian client - Fix - Invalidate static asset browser cache in Web client when Khoj version changes	2024-02-24 20:20:30 +05:30
Debanjum Singh Solanky	a46f70c4b0	Remove deprecated lastSyncedFiles settings field from Obsidian client	2024-02-24 20:18:22 +05:30
Debanjum Singh Solanky	03a6b491b2	Warn when can't identify mimeType of files in Desktop, Obsidian clients	2024-02-24 19:59:03 +05:30
Debanjum Singh Solanky	3675ab4864	Only sync modified files from the Obsidian client Previously we'd send all files in vault and let the server deduplicate. This change takes inspiration from the desktop app, and only pushes files which were modified after their previous sync with the server. This should reduce the processing load on the server	2024-02-24 07:48:40 +05:30
Debanjum Singh Solanky	ddfbf31bc8	Append version query param to web asset URLs to bypass browser cache Ensure latest assets are loaded when khoj version is updated	2024-02-24 06:49:25 +05:30

2379 commits