sij/khoj

mirror of https://github.com/khoj-ai/khoj.git synced 2024-12-23 12:48:09 +00:00

Author	SHA1	Message	Date
sabaimran	807687a0ac	Automatically generate titles for conversations from history	2024-11-08 16:02:34 -08:00
sabaimran	7159b0b735	Enforce limits on file size when converting to text	2024-11-08 15:27:28 -08:00
sabaimran	4695174149	Add support for file preview in the chat input area (before message sent)	2024-11-08 15:12:48 -08:00
sabaimran	ad46b0e718	Label pages when extract text from pdf, docs content. Fix scroll area in doc preview.	2024-11-08 14:53:20 -08:00
sabaimran	ee062d1c48	Fix parsing for PDFs via content indexing API	2024-11-07 18:17:29 -08:00
sabaimran	623a97a9ee	Merge branch 'master' of github.com:khoj-ai/khoj into features/include-full-file-in-convo-with-filter	2024-11-07 17:18:23 -08:00
sabaimran	33498d876b	Simplify the share chat page. Don't need it to maintain its own conversation history - When chatting on a shared page, fork and redirect to a new conversation page	2024-11-07 17:14:11 -08:00
sabaimran	4b8be55958	Convert UUID to string when forking a conversation	2024-11-07 17:13:04 -08:00
sabaimran	9bbe27fe36	Set default value of attached files to empty list	2024-11-07 17:12:45 -08:00
sabaimran	3a51996f64	Process attached files in the chat history and add them to the chat message	2024-11-07 16:06:58 -08:00
sabaimran	a89160e2f7	Add support for converting an attached doc and chatting with it - Document is first converted in the chatinputarea, then sent to the chat component. From there, it's sent in the chat API body and then processed by the backend - We couldn't directly use an UploadFile type in the backend API because we'd have to convert the api type to a multipart form. This would require other client side migrations without uniform benefit, which is why we do it in this two-phase process. This also gives us capacity to repurpose the more generic interface down the road.	2024-11-07 16:06:37 -08:00
sabaimran	e521853895	Remove unnecessary console.log statements	2024-11-07 16:03:31 -08:00
sabaimran	92c3b9c502	Add function to get an icon from a file type	2024-11-07 16:02:53 -08:00
sabaimran	140c67f6b5	Remove focus ring from the text area component	2024-11-07 16:02:02 -08:00
sabaimran	b8ed98530f	Accept attached files in the chat API - weave through all subsequent subcalls to models, where relevant, and save to conversation log	2024-11-07 16:01:48 -08:00
sabaimran	ecc81e06a7	Add separate methods for docx and pdf files to just convert files to raw text, before further processing	2024-11-07 16:01:08 -08:00
sabaimran	394035136d	Add an api that gets a document, and converts it to just text	2024-11-07 16:00:10 -08:00
sabaimran	3b1e8462cd	Include attach files in calls to extract questions	2024-11-07 15:59:15 -08:00
sabaimran	de73cbc610	Add support for relaying attached files through backend calls to models	2024-11-07 15:58:52 -08:00
Debanjum	05a93fcbed	v-align attach, send buttons with chat input text area on web app Otherwise, those buttons look off-center when images are attached to the chat input area	2024-11-05 17:10:53 -08:00
sabaimran	a0480d5f6c	use fill weight for the toggle right (enabled state) for research mode	2024-11-04 22:01:09 -08:00
sabaimran	dc26da0a12	Add uploaded files in the conversation file filter for a new convo	2024-11-04 22:00:47 -08:00
Debanjum	b51ee644aa	Fix escaping filename when normalizing in org node parser	2024-11-04 20:24:57 -08:00
Debanjum	5724d16a6f	Fix passing images to anthropic chat models to extract questions	2024-11-04 20:24:57 -08:00
sabaimran	7543360210	Merge branch 'master' of github.com:khoj-ai/khoj into features/include-full-file-in-convo-with-filter	2024-11-04 16:55:48 -08:00
sabaimran	b6145df3be	Handle file retrieval when agent is None	2024-11-04 16:55:22 -08:00
sabaimran	3dc9139cee	Add additional handling for when file_object comes back empty	2024-11-04 16:53:07 -08:00
sabaimran	a27b8d3e54	Remove summarize condition for only 1 file filter	2024-11-04 16:51:37 -08:00
sabaimran	362bdebd02	Add methods for reading full files by name and including context Now that models have much larger context windows, we can reasonably include full texts of certain files in the messages. Do this when an explicit file filter is set in a conversation. Do so in a separate user message in order to mitigate any confusion in the operation. Pipe the relevant attached_files context through all methods calling into models. We'll want to limit the file sizes for which this is used and provide more helpful UI indicators that this sort of behavior is taking place.	2024-11-04 16:37:13 -08:00
sabaimran	e3ca52b7cb	Use .get() to get text accompanying image url, instead of subindexing	2024-11-04 16:09:16 -08:00
sabaimran	1e89baca7b	Deprecate the UserSearchModelConfig and remove all references - The server has moved to a model of standardization for the embeddings generation workflow. Remove references to the support for differentiated models. - The migration script for a new model needs to be updated to accommodate full regeneration.	2024-11-04 12:24:41 -08:00
sabaimran	99c1d2831a	Release Khoj version 1.28.3	2024-11-02 12:23:11 -07:00
sabaimran	075b4ecf15	Call subscription_to_state with sync_to_async wrapper when getting user subscription state - This is needed in case the renewal_date is not set and we need to reset it for the user	2024-11-02 12:22:35 -07:00
sabaimran	ec44cbe1e7	Release Khoj version 1.28.2	2024-11-02 07:53:51 -07:00
Debanjum	31b5fde163	Only enable prompt tracer if git python is installed	2024-11-02 02:07:02 -07:00
sabaimran	5b18dc96e0	Release Khoj version 1.28.1	2024-11-01 22:51:51 -07:00
Debanjum	e85dd59295	Release Khoj version 1.28.0	2024-11-01 19:06:59 -07:00
Debanjum	14e453039d	Add prompt tracing, agent personality to infer webpage urls chat actor	2024-11-01 18:12:50 -07:00
Debanjum	ab321dc518	Expect query before tool in response to give think space in research prompt	2024-11-01 17:51:41 -07:00
Debanjum	1a83bbcc94	Clean API chat router. Move FeedbackData response type to router helper	2024-11-01 17:51:41 -07:00
sabaimran	e6eb87bbb5	Merge branch 'improve-debug-reasoning-and-other-misc-fixes' of github.com:khoj-ai/khoj into improve-debug-reasoning-and-other-misc-fixes	2024-11-01 16:48:39 -07:00
sabaimran	a213b593e8	Limit the number of urls the webscraper can extract for scraping	2024-11-01 16:48:36 -07:00
sabaimran	327fcb8f62	create defiltered query after conversation command is extracted	2024-11-01 16:48:03 -07:00
sabaimran	b79a9ec36d	Clarify description of the code evaluation environment: not for document creation	2024-11-01 16:47:27 -07:00
Debanjum	9c7b36dc69	Use standard per minute rate limits across user types	2024-11-01 16:16:06 -07:00
Debanjum	ac21b10dd5	Simplify logic to get default search model. Remove unused import	2024-11-01 15:14:00 -07:00
sabaimran	2b35790165	Merge branch 'master' of github.com:khoj-ai/khoj into improve-debug-reasoning-and-other-misc-fixes	2024-11-01 14:51:26 -07:00
sabaimran	baa939f4ce	When running code, strip any code delimiters. Disable application json type specification in Gemini request.	2024-11-01 13:47:39 -07:00
sabaimran	8fd2fe162f	Determine if research mode is enabled by checking the conversation commands and 'linting' them in the selection phase	2024-11-01 13:12:34 -07:00
sabaimran	cead1598b9	Don't reset research mode after completing research execution	2024-11-01 13:00:11 -07:00
Debanjum	c1c779a7ef	Do not yaml format raw code results in context for LLM. It's confusing	2024-11-01 12:45:26 -07:00
sabaimran	b3dad1f393	Standardize rate limits to 1/6 ratio	2024-11-01 12:21:09 -07:00
Debanjum	cd75151431	Do not allow auto selecting research mode as tool for now. You are required to manually turn it on. This takes longer and should be a high intent activity initiated by user	2024-11-01 12:07:52 -07:00
Debanjum	0b0cfb35e6	Simplify in research mode check in api_chat. - Dedent code for readability - Use better name for in research mode check - Continue to remove inferred summarize command when multiple files in file filter even when not in research mode - Continue to show select information source train of thought. It was removed by mistake earlier	2024-11-01 12:07:08 -07:00
Debanjum	73750ef286	Merge branch 'master' into features/advanced-reasoning	2024-11-01 11:42:01 -07:00
sabaimran	1fc280db35	Handle case where infer_webpage_url returns no valid urls	2024-11-01 11:41:32 -07:00
Debanjum	1c920273dd	Add Prompt Tracer to Visualize, Analyze and Debug Khoj's Train of Thought (#951) ## Overview Use git to capture prompt traces of khoj's train of thought. View, analyze and debug them using your favorite git client (e.g. vscode, magit). - Each commit captures an interaction with an LLM The commit writes the query, response and system message each to a separate file in the repo. The commit message captures the chat model, Khoj version and other metadata - Each conversation turn can have multiple interactions with an LLM (e.g. Khoj's train of thought) - Each new conversation turn forks from and merges back into its conversation branch - Each new conversation branches from the user branch - Each new user branches from root commit on the main branch ## Usage 1. Set `KHOJ_DEBUG=true` or start khoj in very verbose mode with `khoj -vv` to turn on prompt tracing 2. Chat with Khoj as usual 3. Open the promptrace git repo to view the generated prompt traces using your favorite git porcelain. The Khoj prompt trace git repo is created at `/tmp/khoj_promptrace` by default. You can configure the prompt trace directory by setting the `PROMPTRACE_DIR` environment variable. ## Implementation - Add utility functions to capture prompt traces using git (via `gitpython`) - Make each model provider in Khoj commit their LLM interactions with promptrace - Weave chat metadata from chat API through all chat actors and commit it to the prompt trace	2024-11-01 11:33:54 -07:00
sabaimran	33d36ee58c	Add experimental notice to research mode tooltip	2024-11-01 11:00:27 -07:00
sabaimran	0145b2a366	Set usage limits on the research mode	2024-11-01 10:29:33 -07:00
sabaimran	3ea94ac972	Only include inferred-queries in chat history when present	2024-10-31 22:01:41 -07:00
sabaimran	149cbe1019	Use bottom anchor for the commandbar popover	2024-10-31 20:40:38 -07:00
sabaimran	21858acccc	Remove conversation command always in query, filter out inferred queries that were not with selected tool when going through tool selection iterations	2024-10-31 20:27:38 -07:00
sabaimran	19241805ee	Merge branch 'master' of github.com:khoj-ai/khoj into improve-debug-reasoning-and-other-misc-fixes	2024-10-31 18:20:23 -07:00
Debanjum	302bd51d17	Improve online chat actor prompt for research and normal mode - Match the online query generator prompt to match the formatting of extract questions - Separate iteration results by newline - Improve webpage and online tool descriptions	2024-10-31 18:17:12 -07:00
Debanjum	52163fe299	Improve research planner prompt to reduce looping	2024-10-31 18:17:01 -07:00
sabaimran	7ebf999688	Merge branch 'master' of github.com:khoj-ai/khoj into features/advanced-reasoning	2024-10-31 18:15:13 -07:00
sabaimran	159ea44883	Remove frame references in the diagramming prompts	2024-10-31 18:14:51 -07:00
Debanjum	89597aefe9	Json dump contents in prompt tracer to make structure discernable	2024-10-31 18:08:42 -07:00
Debanjum	5b15176e20	Only add /research prefix in research mode if not already in user query	2024-10-31 18:08:42 -07:00
sabaimran	559601dd0a	Do not exit if/else loop in research loop when notes not found	2024-10-31 13:51:10 -07:00
sabaimran	a13760640c	Only show trash can when turnId is present	2024-10-31 13:19:16 -07:00
Debanjum	adca6cbe9d	Merge branch 'master' into add-prompt-tracer-for-observability	2024-10-31 02:28:34 -07:00
Debanjum	e17dc9f7b5	Put train of thought ui before Khoj response on web app	2024-10-31 02:24:53 -07:00
Debanjum	e8e6ead39f	Fix deleting new messages generated after conversation load	2024-10-30 20:56:38 -07:00
Debanjum	cb90abc660	Resolve train of thought component needs unique key id error on web app	2024-10-30 14:00:21 -07:00
Debanjum	ca5a6831b6	Add ability to delete messages from the web app	2024-10-30 14:00:21 -07:00
Debanjum	ba15686682	Store turn id with each chat message. Expose API to delete chat turn Each chat turn is a user query, khoj response message pair	2024-10-30 14:00:21 -07:00
Debanjum	f64f5b3b6e	Handle add/delete file filter operation on non-existent conversation	2024-10-30 14:00:21 -07:00
Debanjum	b3a63017b5	Support setting seed for reproducible LLM response generation Anthropic models do not support seed. But offline, gemini and openai models do. Use these to debug and test Khoj via KHOJ_LLM_SEED env var	2024-10-30 14:00:21 -07:00
Debanjum	d44e68ba01	Improve handling embedding model config from admin interface - Allow server to start if loading embedding model fails with an error. This allows fixing the embedding model config via admin panel. Previously server failed to start if embedding model was configured incorrectly. This prevented fixing the model config via admin panel. - Convert boolean string in config json to actual booleans when passed via admin panel as json before passing to model, query configs - Only create default model if no search model configured by admin. Return first created search model if its been configured by admin.	2024-10-30 14:00:21 -07:00
Debanjum	358a6ce95d	Defer turning cursor color to selected agents color for later Capability exists but idea needs to be investigated further	2024-10-30 14:00:21 -07:00
Debanjum	2ac840e3f2	Make cursor in chat input take on selected agent color	2024-10-30 14:00:21 -07:00
Debanjum	1448b8b3fc	Use 3rd person for user in research prompt to reduce person confusion Models were getting a bit confused about who is searching for whose information. Using third person to explicitly call out on whose behalf these searches are running seems to perform better across models (gemini's, gpt etc.), even if the role of the message is user.	2024-10-30 13:49:48 -07:00
Debanjum	b8c6989677	Separate example from actual question in extract question prompt	2024-10-30 13:49:48 -07:00
Debanjum	86ffd7a7a2	Handle \n, dedupe json cleaning into single function for reusability Use placeholder for newline in json object values until json parsed and values extracted. This is useful when research mode models outputs multi-line codeblocks in queries etc.	2024-10-30 13:49:48 -07:00
Debanjum	83ca820abe	Encourage Anthropic models to output json object using { prefill Anthropic API doesn't have ability to enforce response with valid json object, unlike all the other model types. While the model will usually adhere to json output instructions. This step is meant to more strongly encourage it to just output json object when response_type of json_object is requested.	2024-10-30 13:49:48 -07:00
Debanjum	dc8e89b5de	Pass tool AIs iteration history as chat history for better context Separate conversation history with user from the conversation history between the tool AIs and the researcher AI. Tools AIs don't need top level conversation history, that context is meant for the researcher AI. The invoked tool AIs need previous attempts at using the tool in this research runs iteration history to better tune their next run. Or at least that is the hypothesis to break the models looping.	2024-10-30 13:49:48 -07:00
Debanjum	d865994062	Rename code tool arg `previous_iteration_history` to `context`	2024-10-30 13:49:48 -07:00
Debanjum	06aeca2670	Make researcher, docs search AIs ask more diverse retrieval questions Models weren't generating a diverse enough set of questions. They'd do minor variations on the original query. What is required is asking queries from a bunch of different lenses to retrieve the requisite information. This prompt update shows the AIs the breadth of questions to ask by example and instruction. Seems like performance improved based on vibes	2024-10-30 13:49:48 -07:00
Debanjum	01881dc7a2	Revert "Make extract question prompt in 1st person wrt user as its a user message" This reverts commit 6d3602798aa1b95a30c557576fd4f93ddef2ae76.	2024-10-30 13:49:48 -07:00
Debanjum	3e695df198	Make extract question prompt in 1st person wrt user as its a user message Divide Example from Actual chat history section in prompt	2024-10-30 13:49:48 -07:00
Debanjum	a3751d6a04	Make extract relevant information system prompt work for any document Previously it was too strongly tuned for extracting information from only webpages. This shouldn't be necessary	2024-10-30 13:49:48 -07:00
Debanjum	a39e747d07	Improve passing user name in pick next research tool prompt	2024-10-30 13:49:48 -07:00
Debanjum	deff512baa	Improve research mode prompts to reduce looping, increase webpage reads	2024-10-30 13:49:48 -07:00
Debanjum	d3184ae39a	Simplify storing and displaying document results in research mode - Mention count of notes and files discovered - Store query associated with each compiled reference retrieved for easier referencing	2024-10-30 13:49:48 -07:00
Debanjum	8bd94bf855	Do not use a message branch if no msg id provided to prompt tracer	2024-10-30 13:49:48 -07:00
sabaimran	b63fbc5345	Add a simple badge to the dropdown menu that shows subscription status	2024-10-30 13:00:16 -07:00
sabaimran	82f3d79064	Merge branch 'master' of github.com:khoj-ai/khoj into features/advanced-reasoning	2024-10-30 11:32:10 -07:00
sabaimran	2b2564257e	Handle subscription case where it's set to trial, but renewal_date is not set. set the renewal_date for LENGTH_OF_FREE_TRIAL days from subscription creation.	2024-10-30 11:05:31 -07:00
Debanjum	9935d4db0b	Do not use a message branch if no msg id provided to prompt tracer	2024-10-28 17:50:27 -07:00
Debanjum	d184498038	Pass context in separate message from user query to research chat actor	2024-10-28 15:37:28 -07:00
Debanjum	d75ce4a9e3	Format online, notes, code context with YAML to be legible for LLM	2024-10-28 15:37:28 -07:00
sabaimran	5bea0c705b	Use break-words in the train of thought for better formatting	2024-10-28 15:36:06 -07:00
sabaimran	1f1b182461	Automatically carry over research mode from home page to chat - Improve mobile friendliness with new research mode toggle, since chat input area is now taking up more space - Remove clunky title from the suggestion card - Fix fk lookup error for agent.creator	2024-10-28 15:29:24 -07:00
sabaimran	ebaed53069	Merge branch 'master' of github.com:khoj-ai/khoj into features/advanced-reasoning	2024-10-28 12:39:00 -07:00
sabaimran	889dbd738a	Add keyword diagram to diagram output mode description	2024-10-28 12:20:46 -07:00
Debanjum	50ffd7f199	Merge branch 'master' into features/advanced-reasoning	2024-10-28 04:10:59 -07:00
Debanjum	a5d0ca6e1c	Use selected agent color to theme the chat input area on home page	2024-10-28 03:47:40 -07:00
Debanjum	aad7528d1b	Render slash commands popup below chat input text area on home page	2024-10-28 02:06:04 -07:00
Debanjum	3e17ab438a	Separate notes, online context from user message sent to chat models (#950 ) Overview --- - Put context into separate user message before sending to chat model. This should improve model response quality and truncation logic in code - Pass online context from chat history to chat model for response. This should improve response speed when previous online context can be reused - Improve format of notes, online context passed to chat models in prompt. This should improve model response quality Details --- The document, online search context are now passed as separate user messages to chat model, instead of being added to the final user message. This will improve - Models ability to differentiate data from user query. That should improve response quality and reduce prompt injection probability - Make truncation logic simpler and more robust When context window hit, can simply pop messages to auto truncate context in order of context, user, assistant message for each conversation turn in history until reach current user query The complex, brittle logic to extract user query from context in last user message isn't required.	2024-10-28 02:03:18 -07:00
Debanjum	8ddd70f3a9	Put context into separate message before sending to offline chat model Align context passed to offline chat model with other chat models - Pass context in separate message for better separation between user query and the shared context - Pass filename in context - Add online results for webpage conversation command	2024-10-28 00:22:21 -07:00
Debanjum	ee0789eb3d	Mark context messages with user role as context role isn't being used Context role was added to allow changing message truncation order based on context role as well. Revert it for now since this is not currently being done.	2024-10-28 00:04:14 -07:00
Debanjum	4e39088f5b	Make agent name in home page carousel not text wrap on mobile	2024-10-27 23:03:53 -07:00
Debanjum	94074b7007	Focus chat input on toggle research mode. v-align it with send button	2024-10-27 22:54:55 -07:00
sabaimran	a691ce4aa6	Batch entries into smaller groups to process	2024-10-27 20:43:41 -07:00
sabaimran	2924909692	Add a research mode toggle to the chat input area	2024-10-27 16:37:40 -07:00
sabaimran	68499e253b	Auto-collapse train of thought, show after chat response in history	2024-10-27 15:48:13 -07:00
sabaimran	101ea6efb1	Add research mode as a slash command, remove from default path	2024-10-27 15:47:44 -07:00
sabaimran	0bd78791ca	Let user exit from command mode with esc, click out, etc.	2024-10-27 15:01:49 -07:00
sabaimran	a121d67b10	Persist the train of thought in the conversation history	2024-10-26 23:46:15 -07:00
sabaimran	9e8ac7f89e	Fix input/output mismatches in the /summarize command	2024-10-26 16:37:58 -07:00
sabaimran	e4285941d1	Use the advanced chat model if the user is subscribed	2024-10-26 16:00:54 -07:00
sabaimran	33e48aa27e	Merge branch 'add-prompt-tracer-for-observability' of github.com:khoj-ai/khoj into features/advanced-reasoning	2024-10-26 14:09:00 -07:00
sabaimran	fd71a4b086	Add better exception handling in the prompt trace logic, use default value from parameters	2024-10-26 14:08:00 -07:00
Debanjum	3e5b5ec122	Encourage model to read webpages more often after online search Previously model would rarely read webpages after webpage search. Need the model to read webpages more regularly for deeper research and to stop getting stuck in repetitive online search loops	2024-10-26 10:49:09 -07:00
Debanjum	bf96d81943	Format online results as YAML to pass it in more readable form to model Previous passing of online results as json dump in prompts was less readable for humans, and I'm guessing less readable for models (trained on human data) as well?	2024-10-26 10:49:09 -07:00
Debanjum	3e97ebf0c7	Unescape special characters in prompt traces for better readability	2024-10-26 10:49:09 -07:00
Debanjum	8af9dc3ee1	Unescape special characters in prompt traces for better readability	2024-10-26 10:45:42 -07:00
Debanjum Singh Solanky	0f3927e810	Send gathered references to client after code results calculated	2024-10-26 05:59:10 -07:00
Debanjum Singh Solanky	f04f871a72	Merge branch 'add-prompt-tracer-for-observability' of github.com:khoj-ai/khoj into features/advanced-reasoning - Start from this branch's src/khoj/routers/api_chat.py Add tracer to all old and new chat actors that don't have it set when they are called. - Update the new chat actors like apick next tool etc to use tracer too	2024-10-26 05:56:13 -07:00
Debanjum Singh Solanky	ddc6ccde2d	Merge branch 'master' into features/advanced-reasoning - Conflicts: Combine both sides of the conflict in all 3 files below - src/khoj/processor/conversation/utils.py - src/khoj/routers/helpers.py - src/khoj/utils/helpers.py	2024-10-26 05:15:51 -07:00
Debanjum Singh Solanky	ea0712424b	Commit conversation traces using user, chat, message branch hierarchy - Message train of thought forks and merges from its conversation branch - Conversation branches from user branch - User branches from root commit on the main branch - Weave chat tracer metadata from api endpoint through all chat actors and commit it to the prompt trace	2024-10-26 05:08:47 -07:00
Debanjum Singh Solanky	a3022b7556	Allow Offline Chat model calling functions to save conversation traces	2024-10-26 05:08:47 -07:00
Debanjum Singh Solanky	eb6424f14d	Allow Anthropic API calling functions to save conversation traces	2024-10-26 05:08:47 -07:00
Debanjum Singh Solanky	6fcd6a5659	Allow Gemini API calling functions to save conversation traces	2024-10-26 05:08:47 -07:00
Debanjum Singh Solanky	384f394336	Allow OpenAI API calling functions to save conversation traces	2024-10-26 04:59:21 -07:00
Debanjum Singh Solanky	10c8fd3b2a	Save conversation traces to git for visualization	2024-10-26 04:59:19 -07:00
sabaimran	7e0a692d16	Release Khoj version 1.27.1	2024-10-25 15:23:07 -07:00
sabaimran	b257fa1884	Add a None check before doing a DT comparison when getting subscription type	2024-10-25 15:22:48 -07:00
sabaimran	0f6f282c30	Release Khoj version 1.27.0	2024-10-25 14:11:14 -07:00
sabaimran	479e156168	Add to the ConversationCommand.Image description to LLM	2024-10-25 09:14:32 -07:00
sabaimran	a11b5293fb	Add uploaded images to research mode, code slash command, include code references	2024-10-24 23:56:24 -07:00
sabaimran	5acf40c440	Clean up summarization code paths Use assumption of summarization response being a str	2024-10-24 23:56:24 -07:00
sabaimran	12b32a3d04	Resolve merge conflicts	2024-10-24 23:43:55 -07:00
Debanjum	adee5a3e20	Give Vision to Anthropic models in Khoj (#948 ) ### Major - Give Vision to Anthropic models in Khoj ### Minor - Reuse logic to format messages for chat with anthropic models - Make the get image from url function more versatile and reusable - Encourage output mode chat actor to output only json and nothing else	2024-10-24 18:02:38 -07:00
Debanjum Singh Solanky	01d740debd	Return typed image from image_with_url function for readability	2024-10-24 17:58:46 -07:00
Debanjum Singh Solanky	37317e321d	Dedupe user location passed in image, diagram generation prompts	2024-10-24 01:03:29 -07:00
Debanjum Singh Solanky	2a32836d1a	Log more descriptive error when image gen fails with Replicate	2024-10-24 01:03:29 -07:00
sabaimran	30f9225021	Merge branch 'master' of github.com:khoj-ai/khoj into features/advanced-reasoning	2024-10-23 19:15:51 -07:00
sabaimran	5120597d4e	Remove user customized search model (#946 ) - Use a single standard search model across the server. There's diminishing benefits for having multiple user-customizable search models. - We may want to add server-level customization for specific tasks - Store the search model used to generate a given entry on the `Entry` object - Remove user-facing APIs and view - Add a management command for migrating the default search model on the server In a future PR (after running the migration), we'll also remove the `UserSearchModelConfig`	2024-10-23 17:38:37 -07:00
Debanjum Singh Solanky	8d588e0765	Encourage output mode chat actor to output only json and nothing else Latest claude model wanted to say more than just give the json output. The updated prompt encourages the model to output just json. This is similar to what is already being done for other prompts	2024-10-23 17:19:21 -07:00
Debanjum Singh Solanky	abad5348a0	Give Vision to Anthropic models in Khoj	2024-10-23 17:19:21 -07:00
Debanjum Singh Solanky	6fd50a5956	Reuse logic to format messages for chat with anthropic models	2024-10-23 17:19:21 -07:00
Debanjum Singh Solanky	82eac5a043	Make the get image from url function more versatile and reusable It was previously added under the google utils. Now it can be used by other conversation processors as well. The updated function - can get both base64 encoded and PIL formatted images from url - will return the media type of the image as well in response	2024-10-23 17:19:20 -07:00
sabaimran	f3ce47b445	Create explicit flow to enable the free trial (#944 ) * Create explicit flow to enable the free trial The current design is confusing. It obfuscates the fact that the user is on a free trial. This design will make the opt-in explicit and more intuitive. * Use the Subscription Type enum instead of hardcoded strings everywhere * Use length of free trial in the frontend code as well	2024-10-23 15:29:23 -07:00
Debanjum Singh Solanky	bc059eeb0b	Merge branch 'master' into put-retrieved-context-in-separate-chatml-message	2024-10-23 12:55:18 -07:00
Debanjum Singh Solanky	3b978b9b67	Fix chat history construction when generating chatml msgs with context	2024-10-23 12:55:12 -07:00
Debanjum Singh Solanky	9f2c02d9f7	Chat with the default agent by default from web app home Had temporarily updated the default selected agent to last used. Revert for now as 1. The previous logic was buggy. It didn't select the default agent even when the last used agent was the default agent. Which would require more work. 2. It maybe too early anyway to set the default agent to last used.	2024-10-23 03:43:57 -07:00
Debanjum Singh Solanky	218946edda	Fix copying message with user images on web app Adding div elements to message to render degraded text copied to clipboard for messages with user uploaded images. This change fixes that by separating message to render from message for clipboard. It ensures differently formatted forms of the user images are added to the two to allow proper rendering while still having decently formatted text copied to clipboard	2024-10-23 03:41:25 -07:00
Debanjum Singh Solanky	7d9a06c8ab	Merge branch 'master' into put-retrieved-context-in-separate-chatml-message	2024-10-23 00:13:38 -07:00
Debanjum Singh Solanky	2a50694089	Allow typing multi-line queries from a phone with Enter key Add newline instead of sending message when hit Enter key on mobile displays. As on phones shift key doesn't exist and send button is easily clickable. Limit hitting Enter key to send message to computers = larger display = expected to have full fledged keyboards.	2024-10-22 21:20:22 -07:00
Debanjum Singh Solanky	a134cd835c	Focus on chat input area to enter text after file uploads on web app	2024-10-22 21:19:17 -07:00
Debanjum Singh Solanky	750fbce0c2	Merge branch 'master' into improve-agent-pane-on-home-screen	2024-10-22 20:05:29 -07:00
Debanjum Singh Solanky	3be505db48	Only show type of error when image generation fails to clients Rather than showing raw error message from the underlying service as it could contain sensitive information	2024-10-22 20:03:20 -07:00
Debanjum Singh Solanky	b3fff43542	Sanitize user attached images. Constrain chat input width on home page Set max combined images size to 20mb to allow multiple photos to be shared	2024-10-22 19:42:40 -07:00
Debanjum Singh Solanky	6c393800cc	Merge branch 'master' into multi-image-chat-and-vision-for-gemini	2024-10-22 18:38:49 -07:00
Debanjum Singh Solanky	91bbd19333	Close the agent detail hover card when scroll on agent pane	2024-10-22 18:03:17 -07:00
Debanjum Singh Solanky	110c67f083	Improve agent pill, detail card styling. Handle null chatInputRef - Remove border from agent detail hover card on home page - Do not wrap long agent names in agent pills on home page - Handle scenario where chatInputRef is null	2024-10-22 18:03:17 -07:00
Debanjum Singh Solanky	aca8bef024	Only use recent chat sessions for agent MRU. Handle null agent chats	2024-10-22 17:46:45 -07:00
sabaimran	0dad4212fa	Generate dynamic diagrams (via Excalidraw) (#940 ) Add support for generating dynamic diagrams in flow with Excalidraw (https://github.com/excalidraw/excalidraw). This happens in three steps: 1. Default information collection & intent determination step. 2. Improving the overall guidance of the prompt for generating a JSON, Excalidraw-compatible declaration. 3. Generation of the diagram to output to the final UI. Add support in the web UI.	2024-10-22 16:13:46 -07:00
sabaimran	1e993d561b	Release Khoj version 1.26.4	2024-10-22 13:50:08 -07:00
Debanjum Singh Solanky	e8fb79a369	Rate limit the count and total size of images shared via API	2024-10-22 04:37:54 -07:00
Debanjum Singh Solanky	0847fb0102	Pass online context from chat history to chat model for response Previously only notes context from chat history was included. This change includes online context from chat history for model to use for response generation. This can reduce need for online lookups by reusing previous online context for faster responses. But will increase overall response time when not reusing past online context, as faster context buildup per conversation. Unsure if inclusion of context is preferable. If not, both notes and online context should be removed.	2024-10-22 03:09:36 -07:00
Debanjum Singh Solanky	0c52a1169a	Put context into separate user message before sending to chat model The document, online search context are now passed as separate user messages to chat model, instead of being added to the final user message. This will improve - Model's ability to differentiate data from user query. That should improve response quality and reduce prompt injection probability - Make truncation logic simpler and more robust When context window hit, can simply pop messages to auto truncate context in order of context, user, assistant message for each conversation turn in history until reach current user query The complex, brittle logic to extract user query from context in last user message isn't required. Marking the context message with assistant role doesn't translate well across chat models. E.g - Gemini can't handle consecutive messages by role = model well - Claude will merge consecutive messages by same role. In current message ordering the context message will get merged into the previous assistant response. And if move context message after user query. The truncation logic will have to hop and skip while doing deletions - GPT seems to handle consecutive roles of any type fine Using context role = user generalizes better across chat models for now and aligns with previous behavior.	2024-10-22 03:09:36 -07:00
Debanjum Singh Solanky	7ac241b766	Improve format of notes, online context passed to chat models in prompt Improve separation of note snippets and show its origin file in notes prompt to have more readable, contextualized text shared with model. Previously the references dict was being directly passed as a string. The documents don't look well formatted and are less intelligible. - Passing file path along with notes snippets will help contextualize the notes better. - Better formatting should help with making notes more readable by the chat model.	2024-10-22 03:09:36 -07:00
sabaimran	892040972f	Replace user_id with server_id in telemetry	2024-10-21 20:47:52 -07:00
sabaimran	21e69b506d	Release Khoj version 1.26.3	2024-10-21 08:19:05 -07:00
Debanjum Singh Solanky	9b554feb91	Show agent details card on hover on agent pill on web app home page - Double click on agent to open edit agent card - Focus on chat input pane when agent selected/clicked for quick, smooth agent switch and message flow - Hover on agent to see agent detail card on non-mobile displays - Use debounce to only show when hover on card for a bit	2024-10-21 00:08:01 -07:00
Debanjum Singh Solanky	220ff1df62	Set chatInputArea forward ref from parent components for control	2024-10-21 00:02:48 -07:00
Debanjum Singh Solanky	54b92eaf73	Extract isUserSubscribed check from Agents page to make it reusable	2024-10-20 23:31:48 -07:00
Debanjum Singh Solanky	bdbe8f003e	Move agent details and edit card out into reusable components on web app	2024-10-20 23:31:47 -07:00
sabaimran	59fec37943	Improve agents management, and limit agents view to private and official agents - Default to None for the input_tools and output_modes so that they can be managed in the admin panel - Hold off on showing off all Public Agents until we have a better experience for user profiles etc.	2024-10-20 22:24:51 -07:00
sabaimran	a979457442	Add unit tests for agents - Add permutations of testing for with, without knowledge base. Private, public, different users.	2024-10-20 20:04:50 -07:00
sabaimran	fc70f25583	Release Khoj version 1.26.2	2024-10-20 18:03:36 -07:00
sabaimran	046de57571	Improve error handling when documents not searched with stack trace - Stop extract OCR content from PDFs - Only use agent knowledge base when user not provided	2024-10-20 18:03:14 -07:00
sabaimran	2b68d61fef	Release Khoj version 1.26.1	2024-10-20 16:21:51 -07:00
Debanjum Singh Solanky	5fca41cc29	Show agents sorted by mru, Select mru agent by default on web app Have get agents API return agents ordered intelligently - Put the default agent first - Sort used agents by most recently chatted with agent for ease of access - Randomly shuffle the remaining unused agents for discoverability	2024-10-20 15:21:25 -07:00
Debanjum Singh Solanky	a6bfdbdbfe	Show all agents in carousel on home screen agent pane of web app This change wraps the agent pane in a scroll area with all agents shown. It allows selecting an agent to chat with directly from the home screen without breaking flow and having to jump to the agents page. The previous flow was not convenient to quickly and consistently start chat with one of your standard agents. This was because a random subset of agents were shown on the home page. To start chat with an agent not shown on home screen load you had to open the agents page and initiate the conversation from there.	2024-10-20 15:21:25 -07:00
Debanjum Singh Solanky	9ffd726799	Allow making sync api requests with body from khoj.el	2024-10-20 15:16:40 -07:00
Debanjum Singh Solanky	ac51920859	Start conversation with Agents from within Emacs Exposes a transient switch with available agents as selectable options in the Khoj chat sub-menu. Currently shows agent slugs instead of agent names as options. This isn't the cleanest but gets the job done for now. Only new conversations with a different agent can be started. Existing conversations will continue with the original agent it was created with. The ability to switch the conversation's agent doesn't exist on the server yet.	2024-10-20 15:16:40 -07:00
Debanjum Singh Solanky	7646ac6779	Style user attached images as carousel on chat input area of web app	2024-10-20 00:40:08 -07:00
sabaimran	5d5bea6a5f	Ensure images are reset after messages processed	2024-10-19 22:02:06 -07:00
sabaimran	1ad6e1749f	Move window redirect to after relevant data is dropped in localStorage on the home page One limitation of this methodology is that localStorage has a limit in how much data it can take. Should add more graceful error handling here as well.	2024-10-19 20:36:13 -07:00
sabaimran	cb6b3ec1e9	Improve mode description given to LLM when determining how to respond. Currently experiencing difficulty instruction following when an image is shared. It's more likely to try and output an image. Update to make a clearer distinction.	2024-10-19 20:35:32 -07:00
sabaimran	545259e308	Remove unused icons in chatInputArea	2024-10-19 16:54:21 -07:00
Debanjum Singh Solanky	3cc1426edf	Style user attached images with fixed height, in a single row on web app	2024-10-19 16:48:36 -07:00
Debanjum Singh Solanky	58a331227d	Display the attached images inside the chat input area on the web app - Put the attached images display div inside the same parent div as the text area - Keep the attachment, microphone/send message buttons aligned with the text area. So the attached images just show up at the top of the text area but everything else stays at the same horizontal height as before. - This improves the UX by - Ensuring that the attached images do not obscure the agents pane above the chat input area - The attached images visually look like they are inside the actual input area, rather than floating above it. So the visual aligns with the semantics	2024-10-19 16:29:45 -07:00
Debanjum Singh Solanky	3e39fac455	Add vision support for Gemini models in Khoj	2024-10-19 15:47:03 -07:00
Debanjum Singh Solanky	0d6a54c10f	Allow sharing multiple images as part of user query from the web app Previously the web app only expected a single image to be shared by the user as part of their query. This change allows sharing multiple images from the web app. Closes #921	2024-10-19 15:47:03 -07:00
Debanjum Singh Solanky	e2abc1a257	Handle multiple images shared in query to chat API Previously Khoj could respond to a single shared image at a time. This change updates the chat API to accept multiple images shared by the user and send it to the appropriate chat actors including the openai response generation chat actor for getting an image aware response	2024-10-19 14:53:33 -07:00
Debanjum Singh Solanky	d55cba8627	Pass user query for chat response when document lookup fails Recent changes made Khoj try respond even when document lookup fails. This change missed handling downstream effects of a failed document lookup, as the defiltered_query was null and so the text response didn't have the user query to respond to. This code initializes defiltered_query to original user query to handle that. Also response_type wasn't being passed via send_message_to_model_wrapper_sync unlike in the async scenario	2024-10-19 14:32:19 -07:00
Debanjum Singh Solanky	a4e6e1d5e8	Share webp images from web, desktop, obsidian app to chat with	2024-10-19 14:32:17 -07:00
sabaimran	dbd9a945b0	Re-evaluate agent private/public filtering after authenticateddata is retrieved. Update selectedAgent check logic to reflect.	2024-10-18 09:31:56 -07:00
Debanjum Singh Solanky	35015e720e	Release Khoj version 1.26.0	2024-10-17 18:25:53 -07:00
Debanjum Singh Solanky	f0dcfe4777	Explicitly ask Gemini models to format their response with markdown Otherwise it can get confused by the format of the passed context (e.g respond in org-mode if context contains org-mode notes)	2024-10-17 18:12:47 -07:00
Debanjum Singh Solanky	2c20f49bc5	Return enabled scrapers as WebScraper objects for more ergonomic code	2024-10-17 17:44:09 -07:00
Debanjum Singh Solanky	0db52786ed	Make web scraper priority configurable via admin panel - Simplifies changing order in which web scrapers are invoked to read web page by just changing their priority number on the admin panel. Previously you'd have to delete/, re-add the scrapers to change their priority. - Add help text for each scraper field to ease admin setup experience - Friendlier env var to use Firecrawl's LLM to extract content - Remove use of separate friendly name for scraper types. Reuse actual name and just make actual name better	2024-10-17 17:42:42 -07:00
Debanjum Singh Solanky	20b6f0c2f4	Access internal links directly via a simple get request The other webpage scrapers will not work for internal webpages. Try access those urls directly if they are visible to the Khoj server over the network. Only enable this by default for self-hosted, single user setups. Otherwise ability to scan internal network would be a liability! For use-cases where it makes sense, the Khoj server admin can explicitly add the direct webpage scraper via the admin panel	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	d94abba2dc	Fallback through enabled scrapers to reduce web page read failures - Set up scrapers via API keys, explicitly adding them via admin panel or enabling only a single scraper to use via server chat settings. - Use validation to ensure only valid scrapers added via admin panel Example API key is present for scrapers that require it etc. - Modularize the read webpage functions to take api key, url as args Removes dependence on constants loaded in online_search. Functions are now mostly self contained - Improve ability to read webpages by using the speed, success rate of different scrapers. Optimal configuration needs to be discovered	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	11c64791aa	Allow changing perf timer log level. Info log time for webpage read	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	c841abe13f	Change webpage scraper to use via server admin panel	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	e47922e53a	Aggregate webpage extract queries to run once for each distinct webpage This should reduce webpage read and response generation time. Previously, we'd run separate webpage read and extract relevant content pipes for each distinct (query, url) pair. Now we aggregate all queries for each url to extract information from and run the webpage read and extract relevant content pipes once for each distinct url. Even though the webpage content extraction pipes were previously being run in parallel, they increased response time by 1. adding more context for the response generation chat actor to respond from 2. and by being more susceptible to page read and extract latencies of the parallel jobs The aggregated retrieval of context for all queries for a given webpage could result in some hit to context quality. But it should improve and reduce variability in response time, quality and costs.	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	98f99fa6f8	Allow using Firecrawl to extract web page content Set the FIRECRAWL_TO_EXTRACT environment variable to true to have Firecrawl scrape and extract content from webpage using their LLM This could be faster, not sure about quality as LLM used is obfuscated	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	993fd7cd2b	Support using Firecrawl to read webpages Firecrawl is open-source, self-hostable with a default hosted service provided, similar to Jina.ai. So it can be 1. Self-hosted as part of a private Khoj cloud deployment 2. Used directly by getting an API key from the Firecrawl.dev service This is as an alternative to Olostep and Jina.ai for reading webpages.	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	731ea3779e	Return data sources to use if exception in data source chat actor Previously no value was returned if an exception got triggered when collecting information sources to search.	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	a932564169	Try respond even if web search, webpage read fails during chat Khoj shouldn't refuse to respond to user if web lookups fail. It should transparently mention that online search etc. failed. But try respond as best as it can without those references This change ensures a response to the users query is attempted even when web info retrieval fails.	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	1b04b801c6	Try respond even if document search via inference endpoint fails The huggingface endpoint can be flaky. Khoj shouldn't refuse to respond to user if document search fails. It should transparently mention that document lookup failed. But try respond as best as it can without the document references This change provides graceful failover when inference endpoint requests fail either when encoding query or reranking retrieved docs	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	9affeb9e85	Fix to log the client app calling the chat API - Remove unused subscribed variable from the chat API - Unexpectedly dropped client app logging when migrated API chat to do advanced streaming in July	2024-10-17 15:24:43 -07:00
Debanjum Singh Solanky	c6c48cfc18	Fix arg to generate_summary_from_file and type of this_iteration	2024-10-17 13:38:48 -07:00
Debanjum Singh Solanky	884fe42602	Allow automation as an output mode supported by custom agents	2024-10-17 11:58:52 -07:00
Debanjum Singh Solanky	c5e19b37ef	Use Khoj icons. Add automation & improve agent text on web login page	2024-10-17 11:58:52 -07:00
Debanjum Singh Solanky	42acc324dc	Handle correctly setting file filters as array when API call fails - Only set addedFiles to selectedFiles when selectedFiles is an array - Only set selectedFiles, addedFiles to API response json when response succeeded. Previously we set it to response json on errors as well. This made the variables into json objects instead of arrays on API call failure - Check if selectedFiles, addedFiles are arrays before running operations on them. Previously the addedFiles.includes was where the code would fail	2024-10-17 11:58:52 -07:00
sabaimran	07ab8ab931	Update handling of gemini response with new API changes. Per documentation: finish_reason (google.ai.generativelanguage_v1beta.types.Candidate.FinishReason): Optional. Output only. The reason why the model stopped generating tokens. If empty, the model has not stopped generating the tokens.	2024-10-17 09:00:01 -07:00
Debanjum Singh Solanky	19c65fb82b	Show user uuid field in django admin panel	2024-10-15 17:59:12 -07:00
Debanjum Singh Solanky	6c5b362551	Remove deprecated GET chat API endpoint	2024-10-15 15:13:09 -07:00
Debanjum Singh Solanky	931c56182e	Fix default chat model to use user model if no server chat model set - Advanced chat model should also fallback to user chat model if set - Get conversation config should fallback to user chat model if set These assume no server chat model settings is configured	2024-10-15 15:13:09 -07:00
Debanjum Singh Solanky	feb6d65ef8	Merge branch 'master' into features/advanced-reasoning	2024-10-15 09:37:56 -07:00
Debanjum Singh Solanky	336c6c3689	Show tool to use decision for next iteration in train of thought	2024-10-15 01:12:18 -07:00
Debanjum Singh Solanky	81fb65fa0a	Return data sources to use if exception in data source chat actor Previously no value was returned if an exception got triggered when collecting information sources to search.	2024-10-14 18:20:20 -07:00
Debanjum Singh Solanky	3c93f07b3f	Try respond even if web search, webpage read fails during chat Khoj shouldn't refuse to respond to user if web lookups fail. It should transparently mention that online search etc. failed. But try respond as best as it can without those references This change ensures a response to the users query is attempted even when web info retrieval fails.	2024-10-14 18:13:26 -07:00
Debanjum Singh Solanky	07ab7ebf07	Try respond even if document search via inference endpoint fails The huggingface endpoint can be flaky. Khoj shouldn't refuse to respond to user if document search fails. It should transparently mention that document lookup failed. But try respond as best as it can without the document references This change provides graceful failover when inference endpoint requests fail either when encoding query or reranking retrieved docs	2024-10-14 18:13:26 -07:00
Debanjum Singh Solanky	d6206aa80c	Remove deprecated GET chat API endpoint	2024-10-14 18:13:26 -07:00
Debanjum Singh Solanky	263eee4351	Fix default chat model to use user model if no server chat model set - Advanced chat model should also fallback to user chat model if set - Get conversation config should fallback to user chat model if set These assume no server chat model settings is configured	2024-10-14 18:13:26 -07:00
sabaimran	81aa1b5589	Update some edge cases and usability of create agent flow - Use the slug to determine which agent to PATCH - Make the agent creation form multi-step to streamline the process	2024-10-14 14:07:31 -07:00
Debanjum Singh Solanky	abcd11cfc0	Merge branch 'master' into features/advanced-reasoning	2024-10-13 03:06:23 -07:00
Debanjum Singh Solanky	9356e66b94	Fix default chat model to use user model if no server chat model set - Advanced chat model should also fallback to user chat model if set - Get conversation config should fallback to user chat model if set These assume no server chat model settings is configured	2024-10-13 03:02:29 -07:00
Debanjum Singh Solanky	9314f0a398	Fix default chat configs to use user model if no server chat model set Post merge cleanup in advanced reasoning to fallback to user chat model if no server chat model defined for advanced and default	2024-10-13 02:59:10 -07:00
Debanjum Singh Solanky	a2200466b7	Merge branch 'master' into features/advanced-reasoning	2024-10-12 21:01:22 -07:00
Debanjum	c66c571396	Simplify switching chat model when self-hosting (#934) # Overview - Default to use user chat models for train of thought when no server chat settings created by admins - Default to not create server chat settings on first run # Details This change simplifies switching chat models for self-hosted setups by just changing the chat model on the user settings page. It falls back to use the user chat model for train of thought if server chat settings have not been created on the admin panel. Server chat settings, when set, controls the chat model used for Khoj's train of thought and the default user chat model. Previously a self-hosted user had to update 1. the server chat settings in the admin panel and 2. their own user chat model in the user settings panel to completely switch to a different chat model for both train of thought & response generation respectively You can still set server chat settings via the admin panel to use a different chat model for train of thought vs response generation. But this is only useful for advanced, multi-user setups.	2024-10-12 19:58:05 -07:00
Debanjum Singh Solanky	90888a1099	Log when new user created via magic link or whatsapp as well	2024-10-12 19:56:01 -07:00
Debanjum Singh Solanky	8222c6629d	Remove unused subscribed argument to read_webpage function	2024-10-12 10:45:39 -07:00
Debanjum Singh Solanky	9daaae0fdb	Render inline any image files output by code in message Update regex to also include any links to code generated images that aren't explicitly meant to be displayed inline. This allows folks to download the image (unlike the fake link that doesn't work created by model)	2024-10-12 10:34:57 -07:00
Debanjum Singh Solanky	20d495c43a	Update the iterative chat director prompt to generalize across chat models These prompts work across o1 and standard openai model. Works with anthropic and google models as well	2024-10-12 10:34:57 -07:00
sabaimran	eb4d598d0f	Eliminate the drawer component from the Agents view	2024-10-10 20:40:59 -07:00
sabaimran	0a1c3e4f41	Release Khoj version 1.25.0	2024-10-10 18:07:30 -07:00
sabaimran	01a58b71a5	Skip image, code generation if in research mode	2024-10-10 18:06:29 -07:00
Debanjum Singh Solanky	1b13d069f5	Pass data collected from various sources to code tool in normal flow too	2024-10-10 05:19:27 -07:00
Debanjum Singh Solanky	f462d34547	Render images files output by code interpreter in message on web app	2024-10-10 05:17:53 -07:00
Debanjum Singh Solanky	564491e164	Extract date filters quoted with non-ascii quotes in query	2024-10-10 04:45:00 -07:00
Debanjum Singh Solanky	6a8fd9bf33	Reorder embeddings search arguments based on argument importance	2024-10-10 04:45:00 -07:00
Debanjum Singh Solanky	0eacc0b2b0	Use consistent name for user, planner to not miss current user query Previously Khoj would start answering the previous query. This may be because the prompt uses User for prompt in chat history but was using Q for current user prompt.	2024-10-10 04:45:00 -07:00
Debanjum Singh Solanky	284c8c331b	Increase default max iterations for research chat director to 5	2024-10-10 04:45:00 -07:00
Debanjum Singh Solanky	1e390325d2	Let research chat director decide which webpage to read, if any Make webpages to read automatically on search_online configurable via an argument. Set it to default to 1, so other callers of the function are unaffected. But iterative chat director can still decide which, if any, webpages to read based on the online search it performs	2024-10-10 04:45:00 -07:00
Debanjum Singh Solanky	5a699a52d2	Improve webpage summarization prompt to better extract links, excerpts This change allows the iterative director to dive deeper into its research as the data extracted contains relevant links from the webpage Previous summarization prompt didn't extract relevant links from the webpage which limited further explorations from webpages	2024-10-10 04:45:00 -07:00
Debanjum Singh Solanky	61df1d5db8	Pass previous iteration results to code interpreter chat actors This improves the code interpreter chat actors' ability to generate code with data collected during the previous iterations	2024-10-10 04:45:00 -07:00
Debanjum Singh Solanky	9e7025b330	Set python interpret sandbox url via environment variable	2024-10-10 04:45:00 -07:00
Debanjum Singh Solanky	2dc5804571	Extract defilter query into conversation utils for reuse	2024-10-10 04:45:00 -07:00
sabaimran	e69a8382f2	Add a code icon for code-related train of thought	2024-10-09 23:56:57 -07:00
sabaimran	536422a40c	Include code snippets in the reference panel	2024-10-09 23:54:11 -07:00
Debanjum Singh Solanky	8d33c764b7	Allow iterative chat director to use python interpreter as a tool	2024-10-09 23:38:20 -07:00
Debanjum Singh Solanky	b373073f47	Show executed code in web app chat message references	2024-10-09 22:13:18 -07:00
Debanjum Singh Solanky	a98f97ed5e	Refactor Run Code tool into separate module and modularize code functions Move construct_chat_history and ChatEvent enum into conversation.utils and move send_message_to_model_wrapper to conversation.helper to modularize code. And start thinning out the bloated routers.helper - conversation.util components are shared functions that conversation child packages can use. - conversation.helper components can't be imported by conversation packages but it can use these child packages This division allows better modularity while avoiding circular import dependencies	2024-10-09 22:13:17 -07:00
Debanjum Singh Solanky	8044733201	Give Khoj ability to run python code as a tool triggered via chat API Create python code executing chat actor - The chat actor generate python code within sandbox constraints - Run the generated python code in the cohere terrarium, pyodide based sandbox accessible at sandbox url	2024-10-09 21:37:22 -07:00
Debanjum Singh Solanky	4d33239af6	Improve prompts for the iterative chat director	2024-10-09 21:23:18 -07:00
Debanjum Singh Solanky	6ad85e2275	Fix to continue showing retrieved documents in train of thought	2024-10-09 21:20:22 -07:00
sabaimran	a6f6e4f418	Fix notes references and passage of user query in the chat flow	2024-10-09 20:34:20 -07:00
Debanjum Singh Solanky	ec248efd31	Allow iterative chat director to do notes search	2024-10-09 19:04:59 -07:00
Debanjum Singh Solanky	a6905a9f0c	Pass background context to iterating chat director	2024-10-09 19:04:59 -07:00
sabaimran	028b6e6379	Fix yield for scraping direct web page	2024-10-09 18:14:08 -07:00
sabaimran	717d9da8d8	Handle when summarize result is not present, rename variable in for loop from query	2024-10-09 17:57:08 -07:00
sabaimran	03544efde2	Ignore typing of the result dict for online, web page scrape	2024-10-09 17:48:24 -07:00
sabaimran	ab81b01fcb	Fix typing of direct_web_pages and remove the deprecated chat API	2024-10-09 17:46:28 -07:00
sabaimran	5b8d663cf1	Add intermediate summarization of results when planning with o1	2024-10-09 17:40:56 -07:00
sabaimran	7b288a1179	Clean up the function planning prompt a little bit	2024-10-09 16:59:20 -07:00
sabaimran	f71e4969d3	Skip summarize while it's broken, and snip some other parts of the workflow while under construction	2024-10-09 16:40:06 -07:00
sabaimran	f7e6f99a32	add typing for extract document references	2024-10-09 16:05:34 -07:00
sabaimran	6960fb097c	update types of prev iterations response	2024-10-09 16:04:39 -07:00
sabaimran	4978360852	Fix type of previous_iterations	2024-10-09 16:02:41 -07:00
sabaimran	46ef205a75	Add additional type annotations for compiled_references et al	2024-10-09 16:01:52 -07:00
sabaimran	4fbaef10e9	Correct usage of the summarize function	2024-10-09 15:58:05 -07:00
sabaimran	c91678078d	Correct the usage of query passed to summarize function	2024-10-09 15:55:55 -07:00
sabaimran	f867d5ed72	Working prototype of meta-level chain of reasoning and execution - Create a more dynamic reasoning agent that can evaluate information and understand what it doesn't know, making moves to get that information - Lots of hacks and code that needs to be reversed later on before submission	2024-10-09 15:54:25 -07:00
Debanjum Singh Solanky	05fb0f14d3	Use user chat models for train of thought when no server chat settings Update chat actors to use user's chat model for train of thought. This requires passing the user info as argument to all the chat actors. Whether the user is subscribed or not can be inferred from the user info being passed, so it doesn't need to be passed as a separate argument to chat actor functions Let send_message_to_model function infer chat model instead of passing it as an argument from some chat actors. Better if this logic can be done in a single place.	2024-10-09 00:07:08 -07:00
Debanjum Singh Solanky	ec0c79217f	Do not set server chat settings on first run Server chat settings can be set for advanced self-hosted or multi-user cloud setups. They are not necessary anymore as we fallback to use the users chat model for train of thought now	2024-10-09 00:07:08 -07:00
Debanjum Singh Solanky	a9009ea774	Default to use user chat model if server chat settings not defined Fallback to use user chat model for train of thought if server chat settings not defined. This simplifies switching chat models for single-user, self-hosted setups by just changing the chat model on the user settings page. Server chat settings, when set, controls the default user chat model and the chat model that is used for Khoj's train of thought. Previously a self-hosted user had to update both the server chat settings in the admin panel and their own user chat model in the user settings panel to explicitly switch to a different chat model (i.e to switch to a new model for both train of thought & response generation) You can still set server chat settings to use a different chat model for train of thought and response generation. But this is only necessary for advanced self-hosted or cloud hosted setups of Khoj.	2024-10-09 00:07:08 -07:00
Debanjum Singh Solanky	9a056383e0	Reduce size of start chat and edit buttons on agent card in web app	2024-10-09 00:00:32 -07:00
Debanjum Singh Solanky	dc7f22f76c	Mention no. of docs in agents knowledge base in its badge hover text	2024-10-08 23:51:00 -07:00
Debanjum Singh Solanky	13fb22f7e7	Update agent form data shown in edit card after save operation on web app Previously you had to refresh the page to see the updated data on reopening the agents edit card after a save operation. Now you see the latest saved agent data on reopening the agents edit card. This should avoid confusion on whether the data was saved correctly	2024-10-08 23:26:04 -07:00
Debanjum Singh Solanky	dd770cf1b9	Start chat with public and protected agents when shared via link	2024-10-08 22:10:07 -07:00
Debanjum Singh Solanky	80212c50fd	Use default agent in others chats with an agent if agent made private If a public or protected agent is made private. Other users who were having conversation with that agent will have to carry on their conversation using default agent instead	2024-10-08 22:08:38 -07:00
Debanjum Singh Solanky	d628f89ce9	Prefetch agents related database models	2024-10-08 21:59:15 -07:00
Debanjum Singh Solanky	8de67c5d4d	Fallback to use general command if no tool selected by agent	2024-10-08 19:48:02 -07:00
Debanjum Singh Solanky	b80c4bcfdd	Improve agent command descriptions	2024-10-08 19:47:51 -07:00
Debanjum Singh Solanky	67d0e59eac	Pass chat history to the summarize chat actor	2024-10-08 18:44:52 -07:00
Debanjum Singh Solanky	7e3090060b	Encourage Gemini to output more verbose responses	2024-10-08 18:41:43 -07:00
Debanjum Singh Solanky	bbbdba3093	Time embedding model load for better visibility into app startup time Loading the embeddings model, even locally seems to be taking much longer. Use timer to track visibility into embedding, cross-encoder model load times	2024-10-08 18:41:43 -07:00
Debanjum Singh Solanky	516472a8d5	Switch default tokenizer to tiktoken as more widely used The tiktoken BPE based tokenizers seem more widely used these days. Fallback to gpt-4o tiktoken tokenizer to count tokens for context stuffing	2024-10-08 18:41:43 -07:00
Debanjum Singh Solanky	2b8f7f3efb	Reuse a single func to format conversation for Gemini This deduplicates code and prevents logic from deviating across gemini chat actors	2024-10-08 18:41:42 -07:00
Debanjum Singh Solanky	452e360175	Do not use max prompt size to limit Gemini max output tokens We should start disambiguating the max input from output size. Max prompt size should only be used for the max input context to an LLM. If required max_output_tokens should be set as a separate new field	2024-10-08 15:30:08 -07:00
Debanjum Singh Solanky	bdc36fec5d	Remove unnecessary whitespace indent from personality context	2024-10-08 15:30:08 -07:00


3198 commits