sij/khoj

mirror of https://github.com/khoj-ai/khoj.git synced 2024-12-23 12:48:09 +00:00

Author	SHA1	Message	Date
Debanjum	73750ef286	Merge branch 'master' into features/advanced-reasoning	2024-11-01 11:42:01 -07:00
sabaimran	1fc280db35	Handle case where infer_webpage_url returns no valid urls	2024-11-01 11:41:32 -07:00
Debanjum	1c920273dd	Add Prompt Tracer to Visualize, Analyze and Debug Khoj's Train of Thought (#951) ## Overview Use git to capture prompt traces of khoj's train of thought. View, analyze and debug them using your favorite git client (e.g. vscode, magit). - Each commit captures an interaction with an LLM. The commit writes the query, response and system message each to a separate file in the repo. The commit message captures the chat model, Khoj version and other metadata - Each conversation turn can have multiple interactions with an LLM (e.g. Khoj's train of thought) - Each new conversation turn forks from and merges back into its conversation branch - Each new conversation branches from the user branch - Each new user branches from root commit on the main branch ## Usage 1. Set `KHOJ_DEBUG=true` or start khoj in very verbose mode with `khoj -vv` to turn on prompt tracing 2. Chat with Khoj as usual 3. Open the promptrace git repo to view the generated prompt traces using your favorite git porcelain. The Khoj prompt trace git repo is created at `/tmp/khoj_promptrace` by default. You can configure the prompt trace directory by setting the `PROMPTRACE_DIR` environment variable. ## Implementation - Add utility functions to capture prompt traces using git (via `gitpython`) - Make each model provider in Khoj commit their LLM interactions with promptrace - Weave chat metadata from chat API through all chat actors and commit it to the prompt trace	2024-11-01 11:33:54 -07:00
sabaimran	33d36ee58c	Add experimental notice to research mode tooltip	2024-11-01 11:00:27 -07:00
sabaimran	0145b2a366	Set usage limits on the research mode	2024-11-01 10:29:33 -07:00
sabaimran	3ea94ac972	Only include inferred-queries in chat history when present	2024-10-31 22:01:41 -07:00
sabaimran	149cbe1019	Use bottom anchor for the commandbar popover	2024-10-31 20:40:38 -07:00
sabaimran	21858acccc	Remove conversation command always in query, filter out inferred queries that were not with selected tool when going through tool selection iterations	2024-10-31 20:27:38 -07:00
sabaimran	19241805ee	Merge branch 'master' of github.com:khoj-ai/khoj into improve-debug-reasoning-and-other-misc-fixes	2024-10-31 18:20:23 -07:00
Debanjum	302bd51d17	Improve online chat actor prompt for research and normal mode - Match the online query generator prompt to match the formatting of extract questions - Separate iteration results by newline - Improve webpage and online tool descriptions	2024-10-31 18:17:12 -07:00
Debanjum	52163fe299	Improve research planner prompt to reduce looping	2024-10-31 18:17:01 -07:00
sabaimran	7ebf999688	Merge branch 'master' of github.com:khoj-ai/khoj into features/advanced-reasoning	2024-10-31 18:15:13 -07:00
sabaimran	159ea44883	Remove frame references in the diagramming prompts	2024-10-31 18:14:51 -07:00
Debanjum	89597aefe9	Json dump contents in prompt tracer to make structure discernable	2024-10-31 18:08:42 -07:00
Debanjum	5b15176e20	Only add /research prefix in research mode if not already in user query	2024-10-31 18:08:42 -07:00
sabaimran	559601dd0a	Do not exit if/else loop in research loop when notes not found	2024-10-31 13:51:10 -07:00
sabaimran	a13760640c	Only show trash can when turnId is present	2024-10-31 13:19:16 -07:00
sabaimran	8d1ecb9bd8	Add optional brew steps for docker install	2024-10-31 12:41:53 -07:00
Debanjum	adca6cbe9d	Merge branch 'master' into add-prompt-tracer-for-observability	2024-10-31 02:28:34 -07:00
Debanjum	e17dc9f7b5	Put train of thought ui before Khoj response on web app	2024-10-31 02:24:53 -07:00
Debanjum	e8e6ead39f	Fix deleting new messages generated after conversation load	2024-10-30 20:56:38 -07:00
Debanjum	cb90abc660	Resolve train of thought component needs unique key id error on web app	2024-10-30 14:00:21 -07:00
Debanjum	ca5a6831b6	Add ability to delete messages from the web app	2024-10-30 14:00:21 -07:00
Debanjum	ba15686682	Store turn id with each chat message. Expose API to delete chat turn. Each chat turn is a user query, khoj response message pair	2024-10-30 14:00:21 -07:00
Debanjum	f64f5b3b6e	Handle add/delete file filter operation on non-existent conversation	2024-10-30 14:00:21 -07:00
Debanjum	b3a63017b5	Support setting seed for reproducible LLM response generation. Anthropic models do not support seed. But offline, gemini and openai models do. Use these to debug and test Khoj via KHOJ_LLM_SEED env var	2024-10-30 14:00:21 -07:00
Debanjum	d44e68ba01	Improve handling embedding model config from admin interface - Allow server to start if loading embedding model fails with an error. This allows fixing the embedding model config via admin panel. Previously server failed to start if embedding model was configured incorrectly. This prevented fixing the model config via admin panel. - Convert boolean string in config json to actual booleans when passed via admin panel as json before passing to model, query configs - Only create default model if no search model configured by admin. Return first created search model if it's been configured by admin.	2024-10-30 14:00:21 -07:00
Debanjum	358a6ce95d	Defer turning cursor color to selected agent's color for later. Capability exists but idea needs to be investigated further	2024-10-30 14:00:21 -07:00
Debanjum	2ac840e3f2	Make cursor in chat input take on selected agent color	2024-10-30 14:00:21 -07:00
Debanjum	1448b8b3fc	Use 3rd person for user in research prompt to reduce person confusion. Models were getting a bit confused about who is searching for whose information. Using third person to explicitly call out on whose behalf these searches are running seems to perform better across models (gemini's, gpt etc.), even if the role of the message is user.	2024-10-30 13:49:48 -07:00
Debanjum	b8c6989677	Separate example from actual question in extract question prompt	2024-10-30 13:49:48 -07:00
Debanjum	86ffd7a7a2	Handle \n, dedupe json cleaning into single function for reusability. Use placeholder for newline in json object values until json parsed and values extracted. This is useful when research mode models output multi-line codeblocks in queries etc.	2024-10-30 13:49:48 -07:00
Debanjum	83ca820abe	Encourage Anthropic models to output json object using { prefill. Anthropic API doesn't have ability to enforce response with valid json object, unlike all the other model types. While the model will usually adhere to json output instructions, this step is meant to more strongly encourage it to just output json object when response_type of json_object is requested.	2024-10-30 13:49:48 -07:00
Debanjum	dc8e89b5de	Pass tool AIs iteration history as chat history for better context. Separate conversation history with user from the conversation history between the tool AIs and the researcher AI. Tool AIs don't need top level conversation history, that context is meant for the researcher AI. The invoked tool AIs need previous attempts at using the tool in this research run's iteration history to better tune their next run. Or at least that is the hypothesis to break the models looping.	2024-10-30 13:49:48 -07:00
Debanjum	d865994062	Rename code tool arg `previous_iteration_history` to `context`	2024-10-30 13:49:48 -07:00
Debanjum	06aeca2670	Make researcher, docs search AIs ask more diverse retrieval questions. Models weren't generating a diverse enough set of questions. They'd do minor variations on the original query. What is required is asking queries from a bunch of different lenses to retrieve the requisite information. This prompt update shows the AIs the breadth of questions to ask by example and instruction. Seems like performance improved based on vibes	2024-10-30 13:49:48 -07:00
Debanjum	01881dc7a2	Revert "Make extract question prompt in 1st person wrt user as its a user message" This reverts commit 6d3602798aa1b95a30c557576fd4f93ddef2ae76.	2024-10-30 13:49:48 -07:00
Debanjum	3e695df198	Make extract question prompt in 1st person wrt user as its a user message. Divide Example from Actual chat history section in prompt	2024-10-30 13:49:48 -07:00
Debanjum	a3751d6a04	Make extract relevant information system prompt work for any document. Previously it was too strongly tuned for extracting information from only webpages. This shouldn't be necessary	2024-10-30 13:49:48 -07:00
Debanjum	a39e747d07	Improve passing user name in pick next research tool prompt	2024-10-30 13:49:48 -07:00
Debanjum	deff512baa	Improve research mode prompts to reduce looping, increase webpage reads	2024-10-30 13:49:48 -07:00
Debanjum	d3184ae39a	Simplify storing and displaying document results in research mode - Mention count of notes and files discovered - Store query associated with each compiled reference retrieved for easier referencing	2024-10-30 13:49:48 -07:00
Debanjum	8bd94bf855	Do not use a message branch if no msg id provided to prompt tracer	2024-10-30 13:49:48 -07:00
sabaimran	b63fbc5345	Add a simple badge to the dropdown menu that shows subscription status	2024-10-30 13:00:16 -07:00
sabaimran	82f3d79064	Merge branch 'master' of github.com:khoj-ai/khoj into features/advanced-reasoning	2024-10-30 11:32:10 -07:00
sabaimran	2b2564257e	Handle subscription case where it's set to trial, but renewal_date is not set. Set the renewal_date to LENGTH_OF_FREE_TRIAL days from subscription creation.	2024-10-30 11:05:31 -07:00
Debanjum	9935d4db0b	Do not use a message branch if no msg id provided to prompt tracer	2024-10-28 17:50:27 -07:00
Debanjum	d184498038	Pass context in separate message from user query to research chat actor	2024-10-28 15:37:28 -07:00
Debanjum	d75ce4a9e3	Format online, notes, code context with YAML to be legible for LLM	2024-10-28 15:37:28 -07:00
sabaimran	5bea0c705b	Use break-words in the train of thought for better formatting	2024-10-28 15:36:06 -07:00

4147 commits