sij/khoj

mirror of https://github.com/khoj-ai/khoj.git synced 2024-11-23 15:38:55 +01:00

Author	SHA1	Message	Date
sabaimran	cdda9c2e73	Improve text wrapping for attached files and preview context For the research mode toggle, make it not fill when it's off	2024-11-11 13:32:10 -08:00
sabaimran	dd36303bb7	Fix sending file attachments in save_to_conversation method - When files attached but upload fails, don't update the state variables - Make removing null characters in pdf extraction more space efficient	2024-11-11 12:53:06 -08:00
Debanjum	ba2471dc02	Do not CRUD on entries, files & conversations in DB for null user (#958) Increase defense-in-depth by reducing paths to create, read, update or delete entries, files and conversations in DB when user is unset.	2024-11-11 12:47:22 -08:00
Debanjum	536fe994be	Remove unused db adapter methods, like for fact checker data store	2024-11-11 12:22:34 -08:00
Debanjum	10bca6fa8f	Convert required user param check into decorator. Use with more adapters	2024-11-11 12:22:32 -08:00
Debanjum	ff5c10c221	Do not CRUD on entries, files & conversations in DB for null user Increase defense-in-depth by reducing paths to create, read, update or delete entries, files and conversations in DB when user is unset.	2024-11-11 12:20:07 -08:00
sabaimran	27fa39353e	Make custom agent creation flow available to everyone - For private agents, add guardrails to prevent any misuse or violation of terms of service.	2024-11-11 11:54:59 -08:00
sabaimran	b563f46a2e	Merge pull request #957 from khoj-ai/features/include-full-file-in-convo-with-filter Support including file attachments in the chat message Now that models have much larger context windows, we can reasonably include full texts of certain files in the messages. Do this when an explicit file filter is set in a conversation. Do so in a separate user message in order to mitigate any confusion in the operation. Pipe the relevant attached_files context through all methods calling into models. This breaks certain prior behaviors. We will no longer automatically be processing/generating embeddings on the backend and adding documents to the "brain". You'll have to go to settings and go through the upload documents flow there in order to add docs to the brain (i.e., have search include them during question / response).	2024-11-11 11:34:42 -08:00
sabaimran	2bb2ff27a4	Rename attached_files to query_files. Update relevant backend and client-side code.	2024-11-11 11:21:26 -08:00
sabaimran	47937d5148	Merge branch 'features/include-full-file-in-convo-with-filter' of github.com:khoj-ai/khoj into features/include-full-file-in-convo-with-filter	2024-11-11 09:34:08 -08:00
sabaimran	ae4eb96d48	Consolidate file name to icon mapping	2024-11-11 09:34:04 -08:00
Debanjum	7954f39633	Use accept param to file input to indicate supported file types in web app Remove unused total size calculations in chat input	2024-11-11 04:06:17 -08:00
Debanjum	4223b355dc	Use python stdlib methods to write pdf, docx to temp files for loaders Use python standard method tempfile.NamedTemporaryFile to write, delete temporary files safely.	2024-11-11 03:24:50 -08:00
Debanjum	fd15fc1e59	Move construct chat history back to its original position in file. Keeping the function where it originally was allows tracking diffs and change history more easily	2024-11-11 03:24:50 -08:00
Debanjum	35d6c792e4	Show snippet of truncated messages in debug logs to avoid log flooding	2024-11-11 02:30:38 -08:00
sabaimran	8805e731fd	Merge branch 'master' of github.com:khoj-ai/khoj into features/include-full-file-in-convo-with-filter	2024-11-10 19:24:11 -08:00
sabaimran	a5e2b9e745	Exit early when running an automation if the conversation for the automation does not exist.	2024-11-10 19:22:21 -08:00
sabaimran	55200be4fa	Apply agent color fill to the toggle both in off and on states	2024-11-10 19:16:43 -08:00
Debanjum	7468f6a6ed	Deduplicate online references returned by chat API to clients This will ensure only unique online references are shown in all clients. The duplication issue was exacerbated in research mode as even with different online search queries, you can get previously seen results. This change does a global deduplication across all online results seen across research iterations before returning them in client response.	2024-11-10 16:10:32 -08:00
Debanjum	137687ee49	Deduplicate searches in normal mode & across research iterations - Deduplicate online, doc search queries across research iterations. This avoids running previously run online, doc searches again and dedupes online, doc context seen by model to generate response. - Deduplicate online search queries generated by chat model for each user query. - Do not pass online, docs, code context separately when generating response in research mode. These are already collected in the meta research passed with the user query - Improve formatting of context passed to generate research response - Use xml tags to delimit context. Pass per iteration queries in each iteration result - Put user query before meta research results in user message passed for generating response These deduplications will improve speed, cost & quality of research mode	2024-11-10 16:10:32 -08:00
Debanjum	306f7a2132	Show error in picking next tool to researcher llm in next iteration Previously the whole research mode response would fail if the pick next tool call to chat model failed. Now instead of it completely failing, the researcher actor is told to try again in next iteration. This allows for a more graceful degradation in answering a research question even if a (few?) calls to the chat model fail.	2024-11-10 14:52:02 -08:00
Debanjum	eb492f3025	Only keep webpage content requested, even if Jina API gets more data Jina search API returns content of all webpages in search results. Previously code wouldn't remove content beyond max_webpages_to_read limit set. Now, webpage content in organic results is explicitly removed beyond the requested max_webpage_to_read limit. This should align behavior of online results from Jina with other online search providers. And restrict llm context to a reasonable size when using Jina for online search.	2024-11-10 14:51:16 -08:00
Debanjum	8ef7892c5e	Exclude non-dictionary doc context from chat history sent to chat models This fixes chat with old chat sessions. Fixes issue where old WhatsApp users can't chat with Khoj because chat history doc context was stored as a list earlier	2024-11-10 14:51:16 -08:00
Debanjum	d892ab3174	Fix handling of command rate limit and improve rate limit messages Command rate limit wouldn't be shown to user as server wouldn't be able to handle HTTP exception in the middle of streaming. Catch exception and render it as LLM response message instead for visibility into command rate limiting to user on client. Log rate limit messages for all rate limit events on server as info messages. Convert exception messages into first person responses by Khoj to prevent breaking the fourth wall and provide more details on what happened and possible ways to resolve them.	2024-11-10 14:51:16 -08:00
Debanjum	80ee35b9b1	Wrap messages in web, obsidian UI to stay within screen when long links Wrap long links etc. in chat messages and train of thought lists on web app and obsidian plugin by breaking them into newlines by word	2024-11-10 14:49:51 -08:00
Debanjum	f967bdf702	Show correct example index being currently processed in frames eval Previously the batch start index wasn't being passed so all batches started in parallel were showing the same processing example index This change doesn't impact the evaluation itself, just the index shown of the example currently being evaluated	2024-11-10 14:49:51 -08:00
Debanjum	84a8088c2b	Only evaluate non-empty responses to reduce eval script latency, cost Empty responses by Khoj will always be an incorrect response, so no need to make call to an evaluator agent to check that	2024-11-10 14:49:51 -08:00
sabaimran	170d959feb	Handle offline messages differently, as they don't respond well to the structured messages	2024-11-09 19:52:46 -08:00
sabaimran	2c543bedd7	Add typing to the constructed messages listed	2024-11-09 19:40:27 -08:00
sabaimran	79b15e4594	Only add images when they're present and vision enabled	2024-11-09 19:37:30 -08:00
sabaimran	bd55028115	Fix randint import from random when creating filenames for tmp	2024-11-09 19:17:18 -08:00
sabaimran	92b6b3ef7b	Add attached files to latest structured message in chat ml format	2024-11-09 19:17:00 -08:00
sabaimran	835fa80a4b	Allow docx conversion in the chatFunction.ts	2024-11-09 18:51:00 -08:00
sabaimran	459318be13	Add random suffixes to decrease any clash probability when writing tmp files to disk	2024-11-09 18:46:34 -08:00
sabaimran	dbf0c26247	Remove _summary_ description in function descriptions	2024-11-09 18:42:42 -08:00
sabaimran	e5ac076fc4	Move construct_chat_history method back to conversation.utils.py	2024-11-09 18:27:46 -08:00
sabaimran	bc95a99fb4	Make tracer the last input parameter for all the relevant chat helper methods	2024-11-09 18:22:46 -08:00
sabaimran	ceb29eae74	Add phone number verification and remove telemetry update call from place where authentication middleware isn't yet installed (in the middleware itself).	2024-11-09 12:25:36 -08:00
sabaimran	3badb27744	Remove stored uploaded files after they're processed.	2024-11-08 23:28:02 -08:00
sabaimran	78630603f4	Delete the fact checker application	2024-11-08 17:27:42 -08:00
sabaimran	807687a0ac	Automatically generate titles for conversations from history	2024-11-08 16:02:34 -08:00
sabaimran	7159b0b735	Enforce limits on file size when converting to text	2024-11-08 15:27:28 -08:00
sabaimran	4695174149	Add support for file preview in the chat input area (before message sent)	2024-11-08 15:12:48 -08:00
sabaimran	ad46b0e718	Label pages when extracting text from pdf, docs content. Fix scroll area in doc preview.	2024-11-08 14:53:20 -08:00
sabaimran	ee062d1c48	Fix parsing for PDFs via content indexing API	2024-11-07 18:17:29 -08:00
sabaimran	623a97a9ee	Merge branch 'master' of github.com:khoj-ai/khoj into features/include-full-file-in-convo-with-filter	2024-11-07 17:18:23 -08:00
sabaimran	33498d876b	Simplify the share chat page. Don't need it to maintain its own conversation history - When chatting on a shared page, fork and redirect to a new conversation page	2024-11-07 17:14:11 -08:00
sabaimran	4b8be55958	Convert UUID to string when forking a conversation	2024-11-07 17:13:04 -08:00
sabaimran	9bbe27fe36	Set default value of attached files to empty list	2024-11-07 17:12:45 -08:00
sabaimran	3a51996f64	Process attached files in the chat history and add them to the chat message	2024-11-07 16:06:58 -08:00
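
Commit 4223b355dc in the log above switches to tempfile.NamedTemporaryFile for writing pdf and docx bytes to temporary files before handing them to loaders. The log does not show the actual code, so the following is only a minimal sketch of that pattern. The helper name with_temp_file and the loader callable are hypothetical and not taken from the khoj codebase.

```python
import os
import tempfile
from typing import Callable, TypeVar

T = TypeVar("T")


def with_temp_file(data: bytes, suffix: str, loader: Callable[[str], T]) -> T:
    """Write data to a temporary file, run loader on its path, then delete it.

    Hypothetical helper: `loader` stands in for any path-based document loader.
    """
    # delete=False so the file can be closed and reopened by the loader
    # on every platform; cleanup is handled explicitly in `finally`.
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        tmp.write(data)
        tmp.close()  # flush to disk before the loader reads the path
        return loader(tmp.name)
    finally:
        tmp.close()  # closing twice is harmless; covers a failed write
        os.unlink(tmp.name)
```

Because the file name is generated by the standard library, this approach also avoids the name clashes that hand-rolled random suffixes (commit 459318be13) were trying to reduce.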

3954 commits
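
Commits 10bca6fa8f and ff5c10c221 in the log above describe turning the "user must be set" check into a decorator applied to database adapter methods. The log does not include the implementation, so this is a hedged sketch of how such a decorator could look. The name require_user, the choice to raise ValueError, and the example function count_entries are all assumptions made for illustration.

```python
import functools
import inspect


def require_user(func):
    """Reject calls whose `user` argument is missing or None.

    Hypothetical decorator; raising ValueError is an assumption, the real
    adapters may handle a null user differently.
    """
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Resolve `user` whether it was passed positionally or by keyword.
        bound = sig.bind_partial(*args, **kwargs)
        if bound.arguments.get("user") is None:
            raise ValueError(f"{func.__name__} requires a non-null user")
        return func(*args, **kwargs)

    return wrapper


@require_user
def count_entries(user, limit=10):
    # Stand-in for a DB adapter method; returns a string instead of querying.
    return f"{user}:{limit}"
```

Centralizing the check in one decorator means each adapter method gets the same guard without repeating the condition in its body.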