sij/khoj

mirror of https://github.com/khoj-ai/khoj.git synced 2024-11-24 07:55:07 +01:00

Author	SHA1	Message	Date
Debanjum Singh Solanky	de2494154f	Set client request param when calling khoj server APIs from Emacs	2023-06-06 00:02:10 +05:30
Debanjum Singh Solanky	168c11cea7	Make server API endpoints accept client as query param - The chat, search and update API will accept client as request param. - This will allow logging the client from which these APIs was called.	2023-06-05 23:57:08 +05:30
Debanjum Singh Solanky	8617cf1389	Push telemetry to Posthog to grok Khoj usage	2023-06-05 22:47:49 +05:30
Debanjum Singh Solanky	d13db2e666	Make old telemetry server forward requests to new server	2023-06-05 13:06:45 +05:30
Saba	5f4223efb4	Increase timeout to OpenAI call	2023-06-04 20:49:47 -07:00
Saba	0e63a90377	Fix the mechanism to retrieve the message content	2023-06-04 20:25:37 -07:00
Saba	f0efe0177e	Pass truncated message as string in ChatMessage when exceeding max prompt size	2023-06-04 19:33:46 -07:00
Debanjum	f6ceb22373	Use api_key keyword argument to set the openai_api_key parameter for GPT	2023-06-04 15:05:34 +05:30
Saba	068ee0ac5e	Swap elif with else, as usage of this method does not use openai_api_key	2023-06-04 02:25:08 -07:00
Saba	6508379d7b	Use api_key keyword argument to set the openai_api_key parameter for GPT	2023-06-04 00:57:00 -07:00
Debanjum Singh Solanky	7af8a56434	Remove filename from reference before rendering references in khoj.el Fixes bug where actual reference heading in next line jumping out of references footnote section	2023-06-02 10:42:44 +05:30
Debanjum Singh Solanky	ec280067ef	Do not retrieve relevant notes when having a general chat with Khoj - This improves latency of @general chat by avoiding unnecessary compute - It also avoids passing references in API response when they haven't been used to generate the chat response. So interfaces don't have to add logic to not render them unnecessarily	2023-06-02 10:42:44 +05:30
Debanjum Singh Solanky	90439a8db1	Update Khoj subtitle to AI personal assistant for your digital brain	2023-06-02 10:42:44 +05:30
Debanjum	e022910f31	Search PDF files with Khoj. Integrate with LangChain - Introduce Khoj to LangChain: Call GPT with LangChain for Khoj Chat - Search (and Chat about) PDF files with Khoj - Create PDF to JSONL Processor: Convert PDF content into standardized JSONL format - Expose PDF search type via Khoj server API - Enable querying PDF files via Obsidian, Emacs and Web interfaces	2023-06-02 10:20:26 +05:30
Debanjum Singh Solanky	e9ed7a19fd	Update search prompt to extract PDF search type. Fix extract_question prompt	2023-06-02 10:06:03 +05:30
Debanjum Singh Solanky	89fbfce20a	Mention PDF are also supported in Khoj Readme	2023-06-01 21:42:48 +05:30
Debanjum Singh Solanky	bbe3bf9733	Render PDF search results in Khoj Obsidian interface - Make plugin update khoj server config to index PDF files in vault too - Make Obsidian plugin update index for PDF files in vault too - Show PDF results in Khoj Search modal as well - Ensure combined results are sorted by score across both types - Jump to PDF file when select it PDF search result from modal	2023-06-01 21:42:48 +05:30
Debanjum Singh Solanky	e3892945d4	Render PDF search results in Khoj.el Emacs interface	2023-06-01 21:42:48 +05:30
Debanjum Singh Solanky	85144006a1	Render PDF search results in khoj web interface	2023-06-01 21:42:48 +05:30
Debanjum Singh Solanky	acd14a5e41	Wire up PDF to jsonl processor to Khoj server layer (API, config) - Specify PDF content to index via khoj.yml - Index PDF content on app start, reconfigure - Expose PDF as a search type via API	2023-06-01 21:42:48 +05:30
Debanjum Singh Solanky	d63194c3a9	Create tests for PDF to JSONL processor	2023-06-01 21:42:48 +05:30
Debanjum Singh Solanky	286b500f66	Create PDF to JSONL processor using PyPDF and LangChain Switch `pydantic' to >= 1.9.1 else `langchain.document_loaders' starts throwing typing error for python 3.8, 3.9	2023-06-01 21:41:49 +05:30
Debanjum Singh Solanky	1b3effd8e6	Fork Markdown to JSONL processor as start template for PDF to Jsonl Processor	2023-06-01 09:13:31 +05:30
Debanjum Singh Solanky	1cd9ecd449	Truncate last message if still over max supported prompt size by model	2023-06-01 08:50:59 +05:30
Debanjum Singh Solanky	ed4d0f9076	Simplify argument names used in khoj openai completion functions - Match argument names passed to khoj openai completion funcs with arguments passed to langchain calls to OpenAI - This simplifies the logic in the khoj openai completion funcs	2023-06-01 08:50:59 +05:30
Debanjum Singh Solanky	703a7c89c0	Reduce retry count and request timeout for faster response or failure - Fix bug where both LangChain and Khoj retry requests 6 times each. So a total of 12 requests at >1minute intervals for each chat response in case of OpenAI API being down - Retrying too many times when the API is failing doesn't help - The earlier 60 second request timeout was spacing out the interval between retries way too much. This slowed down chat response times quite a bit when API was being flaky - With these updates you'll know if call to chat API failed in under a minute	2023-06-01 08:50:59 +05:30
Debanjum Singh Solanky	18081b3bc6	Use LangChain to call GPT over API	2023-06-01 08:50:59 +05:30
Debanjum Singh Solanky	277d2f5c96	Do not add "Notes:" suffix to chat messages when no notes retrieved This was causing spurious "Notes:" suffix being added to Khoj Chat in response	2023-06-01 08:50:59 +05:30
Debanjum Singh Solanky	334be4e600	Use LangChain to call OpenAI for Khoj Chat - Use ChatModel and ChatOpenAI to call OpenAI chat model instead of using OpenAI package directly - This is being done as part of migration to rely on LangChain for creating agents and managing their state	2023-06-01 08:50:59 +05:30
Debanjum Singh Solanky	efcf7d1508	Extract prompts as LangChain Prompt Templates into a separate module Improves code modularity, cleanliness. Reduces bloat in GPT.py module	2023-06-01 08:50:58 +05:30
Debanjum Singh Solanky	b484953bb3	Import app state correctly to generate embeddings with OpenAI model Resolves #216	2023-05-28 10:21:54 +05:30
Debanjum Singh Solanky	9cfaaf0941	Update docs to configure khoj.yml for using OpenAI model for embeddings	2023-05-28 10:21:54 +05:30
Debanjum Singh Solanky	a0d0dbaca7	Fix link to Khoj Obsidian Demo video in Readmes	2023-05-23 04:23:08 +05:30
Debanjum Singh Solanky	ebb5d7b8e5	Release Khoj version 0.6.2	2023-05-17 20:04:20 +05:30
Debanjum Singh Solanky	d02415edcc	Write generated server id to env file when env file does not contain it	2023-05-17 19:38:44 +05:30
Debanjum Singh Solanky	dc0626856e	Put the telemetry db in a separate directory by default	2023-05-17 18:58:47 +05:30
Debanjum	dc495babb3	Add Telemetry to Understand Khoj Usage ### Objective: Use telemetry to better understand Khoj usage. This will motivate and prioritize work for Khoj. Specific questions: - Number of active deployments of khoj server - How regularly is khoj used (hourly, daily, weekly etc)? - How much is which feature used (chat, search)? - Which UI interface is used most (obsidian, emacs, web ui)? ### Details - Expose setting to disable telemetry logging in khoj.yml - Create basic telemetry server to log data to a DB - Log calls to Khoj API /search, /chat, /update endpoints - Batch upload telemetry data to server at ~hourly interval	2023-05-17 19:09:50 +08:00
Debanjum Singh Solanky	55d72231b3	Generate docker image for telemetry server using Github workflow	2023-05-17 16:08:21 +05:30
Debanjum Singh Solanky	e9f04dc644	Add dockerfile to containerize telemetry server	2023-05-17 16:08:21 +05:30
Debanjum Singh Solanky	07b19964d4	Schedule jobs at (co-)prime intervals to reduce overlap in job runs	2023-05-17 16:08:21 +05:30
Debanjum Singh Solanky	d42f0f5055	Add basic telemetry server for khoj	2023-05-17 16:08:21 +05:30
Debanjum Singh Solanky	134cce9d32	Batch upload telemetry data at regular interval instead of while querying	2023-05-17 16:08:21 +05:30
Debanjum Singh Solanky	3ede919c66	Log usage of /search, /chat, /update API endpoints to telemetry server	2023-05-17 16:08:21 +05:30
Debanjum Singh Solanky	f2e89f6f46	Add khoj app helper methods to log app usage to a telemetry server	2023-05-17 16:08:21 +05:30
Debanjum Singh Solanky	9ca61d62ff	Enable/disable logging telemetry by setting bool in khoj.yml config We log usage telemetry by default, unless setting explicitly set in khoj.yml	2023-05-15 23:26:38 +08:00
Debanjum Singh Solanky	131b8407b5	Allow Khoj Chat to respond to general queries not in reference notes - Khoj chat will now respond to general queries if: 1. no relevant reference notes available or 2. when explicitly induced by prefixing the chat message with "@general" - Previously Khoj Chat would a lot of times refuse to respond to general queries not answerable from reference notes or chat history - Make chat quality tests more robust - Add more equivalent chat response options refusing to answer - Force haiku writing to not give any preable, just the haiku	2023-05-12 18:42:40 +08:00
Debanjum Singh Solanky	cc75f986b2	Test text search index only updates on changes to text content	2023-05-12 17:37:34 +08:00
Debanjum Singh Solanky	f9ccce430e	Allow configuring OpenAI chat model for Khoj chat - Simplifies switching between different OpenAI chat models. E.g GPT4 - It was previously hard-coded to use gpt-3.5-turbo. Now it just defaults to using gpt-3.5-turbo, unless chat-model field under conversation processor updated in khoj.yml	2023-05-03 23:01:13 +08:00
Debanjum	f0253e2cbb	Include Filename, Entry Heading in All Compiled Entries to Improve Search Context Merge pull request #214 from debanjum/add-filename-heading-to-compiled-entry-for-context - Set filename as top heading in compiled org, markdown entries - Note: Khoj was already indexing filenames in compiled markdown entries but they weren't set as top level headings but rather appended as bare text. The updated structure should provide more schematic context of relevance - Set entry heading as heading for compiled org, md entries, even if split by max tokens - Snip prepended heading to avoid crossing model max_token limits - Entries with no md headings should not get heading prefix prepended	2023-05-03 22:59:30 +08:00
Debanjum Singh Solanky	6b535cc345	Snip prepended heading to avoid crossing model max_token limits Otherwise if heading > max_tokens than the search models will just see a heading (with repeated filename) for each compiled entry and not actual content. 100 characters should be sufficient to include filename (not path) and entry heading. If longer rather truncate to pass entry unique text to model for search context	2023-05-03 22:53:13 +08:00

1 2 3 4 5 ...

1236 commits