sij/khoj

mirror of https://github.com/khoj-ai/khoj.git synced 2024-11-30 10:53:02 +01:00

Author	SHA1	Message	Date
sabaimran	3badb27744	Remove stored uploaded files after they're processed.	2024-11-08 23:28:02 -08:00
sabaimran	78630603f4	Delete the fact checker application	2024-11-08 17:27:42 -08:00
sabaimran	807687a0ac	Automatically generate titles for conversations from history	2024-11-08 16:02:34 -08:00
sabaimran	7159b0b735	Enforce limits on file size when converting to text	2024-11-08 15:27:28 -08:00
sabaimran	4695174149	Add support for file preview in the chat input area (before message sent)	2024-11-08 15:12:48 -08:00
sabaimran	ad46b0e718	Label pages when extracting text from pdf, docs content. Fix scroll area in doc preview.	2024-11-08 14:53:20 -08:00
sabaimran	ee062d1c48	Fix parsing for PDFs via content indexing API	2024-11-07 18:17:29 -08:00
sabaimran	623a97a9ee	Merge branch 'master' of github.com:khoj-ai/khoj into features/include-full-file-in-convo-with-filter	2024-11-07 17:18:23 -08:00
sabaimran	33498d876b	Simplify the share chat page. Don't need it to maintain its own conversation history - When chatting on a shared page, fork and redirect to a new conversation page	2024-11-07 17:14:11 -08:00
sabaimran	4b8be55958	Convert UUID to string when forking a conversation	2024-11-07 17:13:04 -08:00
sabaimran	9bbe27fe36	Set default value of attached files to empty list	2024-11-07 17:12:45 -08:00
sabaimran	3a51996f64	Process attached files in the chat history and add them to the chat message	2024-11-07 16:06:58 -08:00
sabaimran	a89160e2f7	Add support for converting an attached doc and chatting with it - Document is first converted in the chatinputarea, then sent to the chat component. From there, it's sent in the chat API body and then processed by the backend - We couldn't directly use a UploadFile type in the backend API because we'd have to convert the api type to a multipart form. This would require other client side migrations without uniform benefit, which is why we do it in this two-phase process. This also gives us capacity to repurpose the more generic interface down the road.	2024-11-07 16:06:37 -08:00
sabaimran	e521853895	Remove unnecessary console.log statements	2024-11-07 16:03:31 -08:00
sabaimran	92c3b9c502	Add function to get an icon from a file type	2024-11-07 16:02:53 -08:00
sabaimran	140c67f6b5	Remove focus ring from the text area component	2024-11-07 16:02:02 -08:00
sabaimran	b8ed98530f	Accept attached files in the chat API - weave through all subsequent subcalls to models, where relevant, and save to conversation log	2024-11-07 16:01:48 -08:00
sabaimran	ecc81e06a7	Add separate methods for docx and pdf files to just convert files to raw text, before further processing	2024-11-07 16:01:08 -08:00
sabaimran	394035136d	Add an api that gets a document, and converts it to just text	2024-11-07 16:00:10 -08:00
sabaimran	3b1e8462cd	Include attached files in calls to extract questions	2024-11-07 15:59:15 -08:00
sabaimran	de73cbc610	Add support for relaying attached files through backend calls to models	2024-11-07 15:58:52 -08:00
Debanjum	4cad96ded6	Add Script to Evaluate Khoj on Google's FRAMES benchmark (#955) - Why We need better, automated evals to measure performance shifts of Khoj across prompt, model and capability changes. Google's FRAMES benchmark evaluates multi-step retrieval and reasoning capabilities of AI agents. It's a good starter benchmark to evaluate Khoj. - Details This PR adds an eval script to evaluate Khoj responses on the FRAMES benchmark prompts against the ground truth provided by it. Script allows configuring sample size, batch size, sampling queries from the eval dataset. Gemini is used as an LLM Judge to auto grade Khoj responses vs ground truth data from the benchmark.	2024-11-06 17:52:01 -08:00
Debanjum	8679294bed	Remove need to set server chat settings from use openai proxies docs. This was previously required, but now it's only useful for more advanced settings, not typical for self-hosting users. With recent updates, the user's selected chat model is used for both Khoj's train of thought and response. This makes it easy to switch your preferred chat model directly from the user settings page and not have to update this in the admin panel as well. Reflect these code changes in the docs, by removing the unnecessary step for self-hosted users to create a server chat setting when using an OpenAI proxy service like Ollama, LiteLLM etc.	2024-11-05 17:10:53 -08:00
Debanjum	05a93fcbed	v-align attach, send buttons with chat input text area on web app. Otherwise, those buttons look off-center when images are attached to the chat input area	2024-11-05 17:10:53 -08:00
sabaimran	a0480d5f6c	use fill weight for the toggle right (enabled state) for research mode	2024-11-04 22:01:09 -08:00
sabaimran	dc26da0a12	Add uploaded files in the conversation file filter for a new convo	2024-11-04 22:00:47 -08:00
Debanjum	b51ee644aa	Fix escaping filename when normalizing in org node parser	2024-11-04 20:24:57 -08:00
Debanjum	5724d16a6f	Fix passing images to anthropic chat models to extract questions	2024-11-04 20:24:57 -08:00
sabaimran	cf0bcec0e7	Revert SKIP_TESTS flag in offline chat director tests	2024-11-04 19:06:54 -08:00
sabaimran	1f372bf2b1	Update file summarization unit tests now that multiple files are allowed	2024-11-04 17:45:54 -08:00
sabaimran	7543360210	Merge branch 'master' of github.com:khoj-ai/khoj into features/include-full-file-in-convo-with-filter	2024-11-04 16:55:48 -08:00
sabaimran	b6145df3be	Handle file retrieval when agent is None	2024-11-04 16:55:22 -08:00
sabaimran	3dc9139cee	Add additional handling for when file_object comes back empty	2024-11-04 16:53:07 -08:00
sabaimran	a27b8d3e54	Remove summarize condition for only 1 file filter	2024-11-04 16:51:37 -08:00
sabaimran	362bdebd02	Add methods for reading full files by name and including context. Now that models have much larger context windows, we can reasonably include full texts of certain files in the messages. Do this when an explicit file filter is set in a conversation. Do so in a separate user message in order to mitigate any confusion in the operation. Pipe the relevant attached_files context through all methods calling into models. We'll want to limit the file sizes for which this is used and provide more helpful UI indicators that this sort of behavior is taking place.	2024-11-04 16:37:13 -08:00
sabaimran	e3ca52b7cb	Use .get() to get text accompanying image url, instead of subindexing	2024-11-04 16:09:16 -08:00
sabaimran	1e89baca7b	Deprecate the UserSearchModelConfig and remove all references - The server has moved to a model of standardization for the embeddings generation workflow. Remove references to the support for differentiated models. - The migration script for a new model needs to be updated to accommodate full regeneration.	2024-11-04 12:24:41 -08:00
Debanjum	1ccbf72752	Use logger instead of print to track eval	2024-11-04 00:40:26 -08:00
sabaimran	99c1d2831a	Release Khoj version 1.28.3	2024-11-02 12:23:11 -07:00
sabaimran	075b4ecf15	Call subscription_to_state with sync_to_async wrapper when getting user subscription state - This is needed in case the renewal_date is not set and we need to reset it for the user	2024-11-02 12:22:35 -07:00
sabaimran	ec44cbe1e7	Release Khoj version 1.28.2	2024-11-02 07:53:51 -07:00
Debanjum	791eb205f6	Run prompt batches in parallel for faster eval runs	2024-11-02 04:58:03 -07:00
Debanjum	96904e0769	Add script to evaluate khoj on Google's FRAMES benchmark. Google's FRAMES benchmark evaluates multi-step retrieval and reasoning capabilities of an agent. The script uses Gemini as an LLM Judge to evaluate Khoj responses to the FRAMES benchmark prompts against the ground truth provided by it.	2024-11-02 04:57:42 -07:00
Debanjum	31b5fde163	Only enable prompt tracer if git python is installed	2024-11-02 02:07:02 -07:00
sabaimran	5b18dc96e0	Release Khoj version 1.28.1	2024-11-01 22:51:51 -07:00
sabaimran	8d1b1bc78e	Move the git python dependency into top level dependencies	2024-11-01 22:51:00 -07:00
Debanjum	e85dd59295	Release Khoj version 1.28.0	2024-11-01 19:06:59 -07:00
Debanjum	1f79a10541	Fix link to code execution feature in docs	2024-11-01 18:22:21 -07:00
Debanjum	cff8e02b60	Research Mode [Part 2]: Improve Prompts, Edit Chat Messages. Set LLM Seed for Reproducibility (#954) - Improve chat actors and their prompts for research mode. - Add documentation to enable the code tool when self-hosting Khoj - Edit Chat Messages - Store Turn Id in each chat message. - Expose API to delete chat message. - Expose delete chat message button to turn delete chat message from web app - Set LLM Generation Seed for Reproducible Debugging and Testing - Setting seed for LLM generation is supported by Llama.cpp and OpenAI models. This can (somewhat) restrain LLM output - Getting fixed responses for fixed inputs helps test, debug longer reasoning chains like used in advanced reasoning	2024-11-01 18:16:42 -07:00
Debanjum	14e453039d	Add prompt tracing, agent personality to infer webpage urls chat actor	2024-11-01 18:12:50 -07:00

3866 commits