sij/khoj

mirror of https://github.com/khoj-ai/khoj.git synced 2024-12-23 12:48:09 +00:00

Author	SHA1	Message	Date
Debanjum	41d9011a26	Move evaluation script into tests/evals directory This should give more space for eval scripts, results and readme	2024-11-17 02:08:20 -08:00
Debanjum	d9d5884958	Enable evaluating Khoj on the OpenAI SimpleQA bench using eval script - Just load the raw csv from OpenAI bucket. Normalize it into FRAMES format - Improve docstring for frames datasets as well - Log the load dataset perf timer at info level	2024-11-17 02:08:20 -08:00
Debanjum	eb5bc6d9eb	Remove Talc search bench from Khoj eval script	2024-11-17 02:08:20 -08:00
Debanjum	fc45aceecf	Delete unused favicon ico in old web app directory	2024-11-17 02:08:20 -08:00
Debanjum	a16fc3ade8	Only add /research prefix when no slash command in message on web app - Explicitly adding a slash command is a higher priority intent than research mode being enabled in the background. Respect that for a more intuitive UX flow. - Explicit slash commands do not currently work in research mode. You have to turn research mode off to use other slash commands. This is strange and unnecessary, given the intent priority is clear.	2024-11-17 02:08:20 -08:00
sabaimran	a1b4587b34	Remove extract_images flag from PDF loader	2024-11-15 21:46:35 -08:00
sabaimran	15b4cec1e8	Add documentation for how to use the text to image model configs, reduce to Replicate	2024-11-15 15:26:14 -08:00
sabaimran	759873ec44	Add documentation for how to use the text to image model configs	2024-11-15 15:22:06 -08:00
sabaimran	c77dc84a68	Remove output_modes function reference in chat tests	2024-11-15 14:03:07 -08:00
sabaimran	e3f1ea9dee	Improve tool, output mode selection process - JSON extract from LLMs is pretty decent now, so get the input tools and output modes all in one go. It'll help the model think through the full cycle of what it wants to do to handle the request holistically. - Make slight improvements to tool selection indicators	2024-11-15 13:53:53 -08:00
sabaimran	c1a5b32ebf	Do not start server when importing the main.py file, unless gunicorn - Add more graceful shutdown when closing bg scheduler thread	2024-11-14 17:36:51 -08:00
sabaimran	be3ee5ec9f	Add cool new suggestion cards for math, diagramming	2024-11-14 17:36:51 -08:00
Debanjum	9fc44f1a7f	Enable evaluating Khoj on the Talc Search Bench using eval script - Just load the raw jsonl from Github and normalize it into FRAMES format - Color printed accuracy in eval script to blue for readability	2024-11-13 22:50:14 -08:00
Debanjum	8e009f48ce	Show tool call error in next iteration. Allow rerun if model requests. Previously errors would get eaten up but the model wouldn't see anything. And the model wouldn't be allowed to re-run the same query-tool combination in the next iteration. This update should give it insight into why it didn't get a result. So it can make an informed (hopefully better) decision on what to do next. And re-run the previous query if appropriate.	2024-11-13 22:50:14 -08:00
Debanjum	604da90fa8	Wrap try/catch around online search in research mode like other tools. Previously when call to online search API etc. failed, it'd error out of response to query in research mode. Khoj should skip tool use that iteration but continue to try to respond.	2024-11-13 16:46:09 -08:00
Debanjum	8851b5f78a	Standardize chat message truncation and serialization before print. Previously chatml messages were just strings, now they can be list of strings or list of dicts as well. - Use json serialization to manage their variations and truncate them before printing for context. - Put logic in single function for use across chat models	2024-11-13 16:30:17 -08:00
Debanjum	f4e37209a2	Improve error handling, display and configurability of eval script - Default to evaluation decision of None when either agent or evaluator llm fails. This fixes accuracy calculations on errors - Fix showing color for decision True - Enable arg flags to specify output results file paths	2024-11-13 14:32:22 -08:00
Debanjum	15b0cfa3dd	Improve structured message truncation in logger Previously chatml messages were just strings. Since gemini, anthropic models always have messages as list of strings, truncate those strings instead of the list of message content	2024-11-13 14:32:22 -08:00
Debanjum	153ae8bea9	Cut binary, long output files from code result for context efficiency. Removing binary data and truncating large data in output files generated by code runs should improve speed and cost of research mode runs with large or binary output files. Previously binary data in code results was passed around in iteration context during research mode. This made the context inefficient because models have limited efficiency and reasoning capabilities over b64 encoded image (and other binary) data and would hit context limits, leading to unnecessary truncation of other useful context. Also remove image data when logging output of code execution	2024-11-13 14:32:22 -08:00
sabaimran	de34cc3987	Remove og image url with khoj documentation	2024-11-13 10:23:02 -08:00
sabaimran	4a1b1e8b9a	Add support for interrupting messages after they've been sent.	2024-11-12 22:22:45 -08:00
sabaimran	d607ad7a27	Release Khoj version 1.29.1	2024-11-12 10:32:56 -08:00
sabaimran	8ec1764e42	Handle size calculation more gracefully for converted documents, depending on type	2024-11-12 02:00:29 -08:00
sabaimran	b6714c202f	Increase the title character limit to 500 for conversations	2024-11-12 01:51:19 -08:00
sabaimran	f05e64cf8c	Release Khoj version 1.29.0	2024-11-11 21:46:25 -08:00
sabaimran	47d3c8c235	Remove email query parameter from subscription patch api	2024-11-11 21:39:49 -08:00
sabaimran	d7027109a5	Add null handling for response output_files in code output	2024-11-11 21:14:56 -08:00
sabaimran	d68243a3fb	Revert clean_json logic temporarily. Eventually, we should do better validation here to extract markdown-formatted json.	2024-11-11 21:05:17 -08:00
sabaimran	1cab6c081f	Add better error handling for diagram output, and fix chat history construct - Make the `clean_json` method more robust as well	2024-11-11 20:44:19 -08:00
sabaimran	7bd2f83f97	Wrap test in suggestionCard	2024-11-11 20:12:46 -08:00
Debanjum	48862a8400	Enable Passing External Documents for Analysis in Code Sandbox (#960) - Allow passing user files as input into code sandbox for analysis - Update prompt to give more examples of complex, multi-line code - Simplify logic for model. Run one program at a time, instead of allowing model to run multiple programs in parallel - Show code generated charts and docs in Reference pane of web app and make them downloadable	2024-11-11 19:37:17 -08:00
Debanjum	5078ac0ce2	Await on conversation save when generate conversation title via API	2024-11-11 19:17:39 -08:00
Debanjum	e1d0015248	Allow disabling Khoj telemetry via KHOJ_TELEMETRY_DISABLE env var	2024-11-11 19:17:39 -08:00
Debanjum	a52500d289	Show generated code artifacts before notes and online references	2024-11-11 18:00:22 -08:00
Debanjum	218eed83cd	Show output file not code on hover. Remove reference card title border	2024-11-11 18:00:22 -08:00
Debanjum	b970cfd4b3	Align styling of reference panel card across code, docs, web results - Add a border below heading - Show code snippet in pre block - Overflow-x when reference side panel open to allow seeing whole text via x-scroll - Align header, body position of reference cards with each other - Only show filename in doc reference cards at message bottom. Show full file path in hover and reference side panel	2024-11-11 18:00:22 -08:00
Debanjum	8e9f4262a9	Render code output files with code references in reference section - Improve rendering code reference with better icons, smaller text and different line clamps for better visibility - Show code output files as sub card of code card in reference section - Allow downloading files generated by code instead of rendering it in chat message directly - Show executed code before online references in reference panel	2024-11-11 18:00:22 -08:00
Debanjum	92c1efe6ee	Fixes to render & save code context with non text based output modes - Fix to render code generated chart with images, excalidraw diagrams - Fix to save code context to chat history in image, diagram output modes - Fix bug in image markdown being wrapped twice in markdown syntax - Render newline in code references shown on chat page of web app. Previously newlines weren't getting rendered. This made the code executed by Khoj hard to read in references. This change fixes that. `dangerouslySetInnerHTML` usage is justified as rendered code snippet is being sanitized by DOMPurify before rendering.	2024-11-11 18:00:22 -08:00
Debanjum	af0215765c	Decode code text output files from b64 to str to ease client processing	2024-11-11 18:00:22 -08:00
Debanjum	7b39f2014a	Enable analysing user documents in code sandbox and other improvements - Run one program at a time, instead of allowing model to pass multiple programs to run in parallel to simplify logic for model - Update prompt to give more examples of complex, multi-line code - Allow passing user files as input into code sandbox for analysis - Log code execution timer at info level to evaluate execution latencies in production - Type the generated code for easier processing by caller functions	2024-11-11 17:59:37 -08:00
sabaimran	dc109559d4	Research mode gray when off, colored when on	2024-11-11 16:35:07 -08:00
sabaimran	cdda9c2e73	Improve text wrapping for attached files and preview context. For the research mode toggle, make it not fill when it's off	2024-11-11 13:32:10 -08:00
sabaimran	dd36303bb7	Fix sending file attachments in save_to_conversation method - When files attached but upload fails, don't update the state variables - Make removing null characters in pdf extraction more space efficient	2024-11-11 12:53:06 -08:00
Debanjum	ba2471dc02	Do not CRUD on entries, files & conversations in DB for null user (#958). Increase defense-in-depth by reducing paths to create, read, update or delete entries, files and conversations in DB when user is unset.	2024-11-11 12:47:22 -08:00
Debanjum	536fe994be	Remove unused db adapter methods, like for fact checker data store	2024-11-11 12:22:34 -08:00
Debanjum	10bca6fa8f	Convert required user param check into decorator. Use with more adapters	2024-11-11 12:22:32 -08:00
Debanjum	ff5c10c221	Do not CRUD on entries, files & conversations in DB for null user. Increase defense-in-depth by reducing paths to create, read, update or delete entries, files and conversations in DB when user is unset.	2024-11-11 12:20:07 -08:00
sabaimran	27fa39353e	Make custom agent creation flow available to everyone - For private agents, add guardrails to prevent against any misuse or violation of terms of service.	2024-11-11 11:54:59 -08:00
sabaimran	b563f46a2e	Merge pull request #957 from khoj-ai/features/include-full-file-in-convo-with-filter Support including file attachments in the chat message Now that models have much larger context windows, we can reasonably include full texts of certain files in the messages. Do this when an explicit file filter is set in a conversation. Do so in a separate user message in order to mitigate any confusion in the operation. Pipe the relevant attached_files context through all methods calling into models. This breaks certain prior behaviors. We will no longer automatically be processing/generating embeddings on the backend and adding documents to the "brain". You'll have to go to settings and go through the upload documents flow there in order to add docs to the brain (i.e., have search include them during question / response).	2024-11-11 11:34:42 -08:00
sabaimran	2bb2ff27a4	Rename attached_files to query_files. Update relevant backend and client-side code.	2024-11-11 11:21:26 -08:00

4145 commits