sij/khoj

mirror of https://github.com/khoj-ai/khoj.git synced 2024-12-18 02:27:10 +00:00

Author	SHA1	Message	Date
Debanjum Singh Solanky	601ff2541b	Revert to using GPT to extract search queries from users message - Reasons: - GPT can extract date aware search queries with date filters better than ChatGPT given the same prompt. - Need quality more than cost savings for now. - Need to figure ways to improve prompt for ChatGPT before using it	2023-03-18 17:56:13 -06:00
Debanjum Singh Solanky	e28526bbc9	Extract search queries from users message using ChatGPT as Search Actor - Reasons - ChatGPT should be better at following instructions than GPT - At 1/10th the cost, it's much cheaper than using older GPT models	2023-03-18 16:33:24 -06:00
Debanjum Singh Solanky	939d7731da	Fix-up Search Actor GPT's response for decoding it as valid JSON	2023-03-18 16:30:55 -06:00
Debanjum Singh Solanky	f63fd0995e	Pass more search results as context to Chat Actor to improve inference	2023-03-18 16:30:55 -06:00
Debanjum Singh Solanky	10836dedee	Search should return user message if GPT response is not valid JSON Previously would throw if GPT response is not valid JSON. Better to return original message to use for search instead	2023-03-18 16:30:55 -06:00
Debanjum Singh Solanky	08f5fb315f	Add answers to context for Search Actor to generate relevant queries Update Search Actor prompt with answers, more precise primer and two more examples for context Mark the 3 chat quality tests using answer as context to generate queries as expected to pass. Verify that the 3 tests pass now, unlike before when the Search Actor did not have the answers for context	2023-03-18 16:30:55 -06:00
Debanjum Singh Solanky	f09bdd515b	Expect Chat Director can extract relative dates using new Search Actor	2023-03-18 16:30:55 -06:00
Debanjum Singh Solanky	36c7389b46	Test Search Actor generating search query from Chat History	2023-03-18 16:30:55 -06:00
Debanjum Singh Solanky	2600cc9d4d	Test Search Actor extracting relative dates & multiple questions	2023-03-18 16:30:55 -06:00
Debanjum Singh Solanky	45cb510421	Loosen search results score threshold used by chat for more context	2023-03-18 16:30:55 -06:00
Debanjum Singh Solanky	d871e04a81	Use past user messages, inferred questions as context to extract questions - Keep inferred questions in logs - Improve prompt to GPT to try to use past questions as context - Pass past user message and inferred questions as context to help GPT extract complete questions - This should improve search results quality - Example Expected Inferred Questions from User Message using History: 1. "What is the name of Arun's daughter?" => "What is the name of Arun's daughter" 2. "Where does she study?" => "Where does Arun's daughter study?" OR => "Where does Arun's daughter, Reena study?"	2023-03-18 16:30:50 -06:00
Debanjum Singh Solanky	1a5d1130f4	Generate search queries from message to answer users chat questions The Search Actor allows for 1. Looking up multiple pieces of information from the notes E.g "Is Bob older than Tom?" searches for age of Bob and Tom in 2 searches 2. Allow date aware user queries in Khoj chat Answer time range based questions Limit search to specified timeframe in question using date filter E.g "What national parks did I visit last year?" adds dt>="2022-01-01" dt<"2023-01-01" to Khoj search Note: Temperature set to 0. Message to search queries should be deterministic	2023-03-18 16:28:51 -06:00
Debanjum Singh Solanky	d0f14d3f85	Test usage of = in date filter queries	2023-03-16 14:52:59 -06:00
Debanjum Singh Solanky	dfb277ee37	Set skipif at module level if OpenAI API key not set for chat tests - Remove stale message_to_prompt test It is too broad, reduces maintainability. Remove as it doesn't really need its own test right now - Setting skipif at module level for chat actor, director tests reduces code duplication as earlier was using decorator on each chat test	2023-03-16 12:23:52 -06:00
Debanjum	e75e13d788	Create Tests to Measure Chat Quality, Capabilities Create Rubric to Test Chat Quality and Capabilities ### Issues - Previously the improvements in quality of Khoj Chat on changes was uncertain - Manual testing on my evolving set of notes was slow and didn't assess all expected, desired capabilities ### Fix 1. Create an Evaluation Dataset to assess Chat Capabilities - Create custom notes for a fictitious person (I'll publish a book with these soon 😅😋) - Add a few of Paul Graham's more personal essays. [Easy to get as markdown](https://github.com/ofou/graham-essays) 2. Write Unit Tests to Measure Chat Capabilities - Measure quality at 2 separate layers - Chat Actor: These are the narrow agents made of LLM + Prompt. E.g `summarize`, `converse` in `gpt.py` - Chat Director: This is the chat orchestration agent. It calls on required chat actors, search through user provided knowledge base (i.e notes, ledger, image) etc to respond appropriately to the users message. This is what the `/api/chat` API exposes. - Mark desired but not currently available capabilities as expected to fail <br /> This still allows measuring the chat capability score/percentage while only failing capability tests which were passing before on any changes to chat	2023-03-16 11:30:52 -06:00
Debanjum Singh Solanky	4e15b4e411	Create test notes dataset for chat testing Combine hand-written custom notes and PG essays with personal content to bulk up notes count Delete old documentation markdown as not a representative dataset for application (which is more tuned for personal notes)	2023-03-16 09:30:37 -06:00
Debanjum Singh Solanky	1b4d562700	Test Chat Director Capabilities: Answer from notes, chat history etc - Chat directors are broad agents. - Chat directors orchestrate narrow actor agents to synthesize final response for the user - Agents are Prompts + ML Model - Test Chat Director Capabilities 1. [X] Answer from retrieved notes 2. [X] Answer from chat history 3. [X] Answer general questions 4. [X] Carry out multi-turn conversation 5. [X] Say don't know when answer not in provided context 6. [X] Answers that require current date awareness This test is expected to fail as the chat is not capable of doing this without the Search actor. But the test allows assessing chat quality 7. [X] Date-aware aggregation across multiple different notes This test is expected to fail as the chat is not capable of doing this without the Search actor. But the test allows assessing chat quality 8. [X] Ask clarification questions if no unambiguous answer in provided context 9. [X] Retrieve answer from chat history beyond lookback window This test is expected to fail as the chat director is not capable of searching chat history yet. But the test allows assessing chat quality 10. [X] Retrieve context for answer using multiple independent searches on knowledge base This test is expected to fail as the chat is not capable of doing this without the Search actor. But the test allows assessing chat quality	2023-03-16 09:30:37 -06:00
Debanjum Singh Solanky	b6d63137f1	Setup Pytest fixture for conversation processor to test chat API - Index markdown test data as knowledge base. As easier to get good markdown content (vs org) - Setup markdown_content_config, processor_config and chat_client to test chat API	2023-03-16 09:30:37 -06:00
Debanjum Singh Solanky	3f719c9e17	Rename Chat Model+Prompt tests to chat actor tests	2023-03-16 09:30:37 -06:00
Debanjum Singh Solanky	7526a50dd4	Extract conversation processor utility funcs from gpt.py into utils.py	2023-03-16 09:30:37 -06:00
Debanjum Singh Solanky	7c4d546039	Configure tests to mark chat quality tests & filter unhelpful warnings - Mark chat quality tests, register custom mark for chat quality - Filter unhelpful deprecation warnings from within dateparser library - Error if tests use unregistered marks	2023-03-16 09:30:37 -06:00
Debanjum Singh Solanky	c1128a1ad8	Test Chat Actor Capabilities; ability to answer from notes, chat logs etc - Chat actors are narrow agents (prompt + ML model) Chat actors are different from the Chat director, who orchestrates the narrow actor agents to synthesize final response to the user - Test Chat Actor Capabilities 1. Answer from retrieved notes 2. Answer from chat history 3. Answer general questions 4. Carry out multi-turn conversation 5. Say don't know when answer not in provided context 6. Answers that require current date awareness 7. Date-aware aggregation across multiple different notes 8. Ask clarification questions if no unambiguous answer in provided context This test is expected to fail as the chat is not capable of doing this consistently yet. But having the test allows assessing chat quality - Use OpenAI API Key from OPENAI_API_KEY environment variable - Gitignore .env file, python virtualenv directory Put OpenAI API Key in .env file to run chatbot tests via vscode The .env file is default location for importing env vars	2023-03-16 09:30:37 -06:00
Debanjum Singh Solanky	9306cd901a	Clean up chat tests to work with updated chat methods in gpt.py	2023-03-16 09:30:37 -06:00
Debanjum Singh Solanky	24ddebf3ce	Make converse prompt more precise. Fix default arg vals in gpt methods - Set conversation_log arg default to dict - Increase default temperature to 0.2 for a little creativity in answering - Make GPT be more reliable in looking at past conversations for forming response	2023-03-16 09:30:37 -06:00
Debanjum Singh Solanky	8609e3129e	Fix, improve displaying chat messages, sources by Khoj in web interface Pretty print json in conversation logs	2023-03-14 11:24:47 -06:00
Debanjum	6c0e82b2d6	Merge Improve Khoj Chat PR #183 from debanjum/improve-chat-interface # Improve Khoj Chat ## Main Changes - Use the new [API](https://openai.com/blog/introducing-chatgpt-and-whisper-apis) for [ChatGPT](https://openai.com/blog/chatgpt) to improve conversation quality and cost - Improve Prompt to answer query using indexed notes - Previously was asking GPT to summarize the notes - Both the chat and answer API use this new prompt - Support Multi-Turn conversations - Pass previous messages and associated reference notes to ChatGPT for context - Show note snippets referenced to generate response - Allows fact-checking, getting details - Simplify chat interface by using only single unified chat type for now ## Miscellaneous - Replace summarize with answer API. Summarize via API not useful for now - Only pass Khoj search results above a threshold confidence to GPT for context - Allows Khoj to say don't know if it can't find answer to query from notes - Allows relying on (only) conversation history to generate response in multi-turn conversation - Move Chat API out of beta. Update Readme	2023-03-10 19:03:44 -06:00
Debanjum Singh Solanky	cccd225247	Deduplicate and simplify logic to render chat message with reference	2023-03-10 18:58:11 -06:00
Debanjum Singh Solanky	b9caad458e	Type score_threshold with union, not \|, to support python <3.10	2023-03-10 18:58:11 -06:00
Debanjum Singh Solanky	198d9af8cf	Update Readme to reflect Khoj Chat out of Beta	2023-03-10 18:58:11 -06:00
Debanjum Singh Solanky	a71f168273	Move the chat API out of beta. Save chat sessions at 15min intervals	2023-03-10 17:20:52 -06:00
Debanjum Singh Solanky	bcc0bed9db	Upgrade bump_version script to handle release and post-release commit - Updates version in khoj.el and Obsidian manifest, package, versions json files under interface and project root - Create and tag release commit with updated files - Creates commit with post-release version upgrade in files - Use flags to specify whether to create a release or post-release commit	2023-03-10 15:23:17 -06:00
Debanjum Singh Solanky	8bb8824d0c	Bump khoj versions in obsidian, emacs files	2023-03-10 15:23:17 -06:00
Debanjum Singh Solanky	e16d0b6d7e	Open references notes used for chat on mobile too (by clicking) Requires clicking the reference as hover doesn't work on mobile	2023-03-09 17:13:07 -06:00
Debanjum Singh Solanky	c3c7b8a951	Make Khoj chat a separate Progressive Web App (PWA) for easier access	2023-03-09 13:45:06 -06:00
Debanjum Singh Solanky	3838f9d8e3	Remove explicitly asking GPT to say I don't know in prompt for now GPT still mostly says I don't know when answer not in notes or chats But with this it's more inclined to answer general questions not in chats or notes while informing user that the information is not from existing chats or notes	2023-03-09 12:11:44 -06:00
Debanjum Singh Solanky	f7b8cdd02e	Log prompts being passed to GPT for debugging	2023-03-08 19:17:52 -06:00
Debanjum Singh Solanky	2739a492b4	Log message metadata along with Khoj message instead of user message References should be attached to khoj chat message rather than the user's message in the chat interface	2023-03-08 19:16:24 -06:00
Debanjum Singh Solanky	87d1e1341d	Show reference notes used as response context in chat interface	2023-03-08 19:16:24 -06:00
Debanjum Singh Solanky	280061e1fa	Do not deduplicate search results used for chat context - Chat uses compiled form of search results, not the raw entries to provide context for chat. The compiled snipped search results themselves are unique and using multiple of them for context from the same raw note is fine if they cross the score and rank thresholds This should improve the context provided for chat - Also apply score_threshold, no deduplication to the answers API	2023-03-06 23:51:31 -06:00
Debanjum Singh Solanky	672f61529e	Make getting deduped search results configurable via Search API	2023-03-06 23:48:46 -06:00
Debanjum Singh Solanky	4fb628975c	Fix jumping to note from Khoj Obsidian search modal result on Windows - Issue The file path separator by khoj server and the Obsidian vault were different on Windows - Fix Normalize file path to use forward slash(/) to find the matching note file in the Obsidian vault to jump to it Resolves #177	2023-03-05 21:07:54 -06:00
Debanjum Singh Solanky	b6cdc5c7cb	Do not expose answer API as a chat type in chat web interface or API Answer does not rely on past conversations, just the knowledge base. It is meant for one off interactions, like search rather than a continuing conversation like chat For now it is only exposed via API. Later it will be exposed in the interfaces as well Remove ability to select different chat types from the chat web interface as there is only a single chat type Stop appending answers to the conversation logs	2023-03-05 18:21:59 -06:00
Debanjum Singh Solanky	7f994274bb	Support multi-turn conversations in chat mode - Only use decent quality search results, if any, as context - Pass source results used by previous chat messages as context - Loosen prompt to allow looking at previous chats and notes to answer - Pass current date for context - Make GPT provide reason when it can't answer the question. Gives user context to tune their questions	2023-03-05 18:21:39 -06:00
Debanjum Singh Solanky	d73042426d	Support filtering for results above threshold score in search API	2023-03-05 18:21:39 -06:00
Debanjum Singh Solanky	45f461d175	Keep search results passed to GPT as context in conversation logs This will be useful to 1. Show source references used to arrive at answer 2. Carry out multi-turn conversations	2023-03-05 16:00:19 -06:00
Debanjum Singh Solanky	7cad1c9428	Only use past chat message, not session summaries as chat context Passing only chat messages for current active, and summaries for past session isn't currently as useful	2023-03-05 16:00:18 -06:00
Debanjum Singh Solanky	ad1f1cf620	Improve and simplify Khoj Chat using ChatGPT - Set context by either including last 2 chat messages from active session or past 2 conversation summaries from conversation logs - Set personality in system message - Place personality system message before last completed back & forth This may stop ChatGPT forgetting its personality as conversation progresses given: - The conditioning based on system role messages is light - If system message is too far back in conversation history, the model may forget its personality conditioning - If system message at end of conversation, the model can think it's the start of a new conversation - Inserting the system message before last completed back & forth should prevent ChatGPT from assuming it's the start of a new conversation while not losing personality conditioning from the system message - Simplify the Khoj Chat API to, for now, just answer from users notes instead of trying to infer other potential interaction types. - This is the default expected behavior from the feature anyway - Use the compiled text of the top 2 search results for context - Benefits of using ChatGPT - Better model - 1/10th the price - No hand rolled prompt required to make GPT provide more chatty, assistant type responses	2023-03-05 01:24:13 -06:00
Debanjum Singh Solanky	9d42b5d60d	Use multiple compiled search results for more relevant context to GPT Increase temperature to allow GPT to collect answer across multiple notes	2023-03-05 01:24:13 -06:00
Debanjum Singh Solanky	c3b624e351	Introduce improved answer API and prompt. Use by default in chat web interface - Improve GPT prompt - Make GPT answer users query based on provided notes instead of summarizing the provided notes - Make GPT be truthful using prompt and reduced temperature - Use Official OpenAI Q&A prompt from cookbook as starting reference - Replace summarize API with the improved answer API endpoint - Default to answer type in chat web interface. The chat type is not fit for default consumption yet	2023-03-05 01:24:13 -06:00
Debanjum Singh Solanky	7184508784	Mention Python and Pip need to be installed in Main and Emacs Readme	2023-03-02 21:28:54 -06:00
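
Commit 10836dedee describes falling back to the user's original message when the Search Actor's response is not valid JSON. A minimal Python sketch of that fallback follows. It assumes the model is asked to return a JSON list of query strings; the response shape and the function name `queries_from_response` are hypothetical and are not taken from the Khoj source.

```python
import json
from typing import List


def queries_from_response(response_text: str, user_message: str) -> List[str]:
    """Return search queries decoded from the model response.

    Hypothetical helper: assumes the model was asked for a JSON list of
    strings. Falls back to the original user message on any other output.
    """
    try:
        parsed = json.loads(response_text)
    except json.JSONDecodeError:
        # Not valid JSON: search with the message as typed instead of raising.
        return [user_message]

    # Valid JSON of an unexpected shape is treated the same way.
    if isinstance(parsed, list) and parsed and all(isinstance(q, str) for q in parsed):
        return parsed
    return [user_message]
```

With this shape, a malformed response degrades to a plain search on the user's message rather than failing the chat request.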

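Commit 1a5d1130f4 gives the example of "What national parks did I visit last year?" adding `dt>="2022-01-01" dt<"2023-01-01"` to the Khoj search. In that commit the filter is produced by the Search Actor's model; the sketch below only illustrates the date arithmetic behind the example. The function name `last_year_date_filter` is hypothetical.

```python
from datetime import date


def last_year_date_filter(today: date) -> str:
    """Build a date filter covering the calendar year before `today`.

    Hypothetical helper: the start date is inclusive, the end date exclusive,
    matching the dt>= / dt< form used in the commit message example.
    """
    start = date(today.year - 1, 1, 1)
    end = date(today.year, 1, 1)
    return f'dt>="{start.isoformat()}" dt<"{end.isoformat()}"'


# Using the date of the commit as "today"
print(last_year_date_filter(date(2023, 3, 18)))
```

The half-open range avoids having to work out the last day of the year and keeps adjacent ranges from overlapping.
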
1095 commits