- Use the conversation id of the retrieved conversation rather than the
potentially unset conversation id passed via API
- Await creating a new chat when no chat id is provided and no existing
conversations exist
- Move some common methods into separate functions to make the UI components more efficient
- The normal HTTP-based chat connection will still work and serves as a fallback if the websocket is unavailable
- Convert to a model of calling the search API directly with a function call (rather than using the API method)
- Gracefully handle websocket connection disconnects
- Ensure that the rest of the response is still saved, as it is currently, if the user disconnects from the client
- Set up unchangeable context at the beginning of the session when the connection is established (like location, username, etc)
The recently added after: operator for the online search actor was too
restrictive. It gave worse results than just using natural language
dates in the search query
Previously we assumed the system prompt is always passed as
the first message, so at least 2 messages were expected in logs.
This broke chat actors querying with a single long non-system message.
A more robust way to extract the system prompt is via the message role
instead
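
A minimal sketch of role-based extraction, assuming messages are plain dicts with `role` and `content` keys. The function name `extract_system_prompt` is hypothetical and not taken from the codebase:

```python
from typing import List, Optional, Tuple


def extract_system_prompt(messages: List[dict]) -> Tuple[Optional[str], List[dict]]:
    """Split messages into the system prompt (if any) and the remaining messages."""
    system_prompt: Optional[str] = None
    remaining: List[dict] = []
    for message in messages:
        # Identify the system prompt by its role, not by its position in the list
        if message.get("role") == "system" and system_prompt is None:
            system_prompt = message.get("content")
        else:
            remaining.append(message)
    return system_prompt, remaining
```

With this approach a single non-system message yields no system prompt and is passed through unchanged, instead of being mistaken for one.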
- Ask for Confirmation before deleting chat session in Desktop, Web app
- Save chat session rename on hitting enter in title edit input box
- No need to flash previous conversation cleared status message
- Move chat session delete button after rename button in Desktop app
- Add prompt for the read webpages chat actor to extract, infer
webpage links
- Make chat actor infer or extract webpage to read directly from user
message
- Rename previous read_webpage function to more narrow
read_webpage_at_url function
### Major
- Enforce json mode response from OpenAI chat actors previously using string lists
- Use `gpt-4-turbo-preview` as default chat model, extract questions actor
- Make Khoj read khoj website to respond with accurate, up-to-date information about itself
- Dedupe query in notes prompt. Improve OAI chat actor, director tests
### Minor
- Test data source, output mode selector, web search query chat actors
- Improve notes search actor to always create a non-empty list of queries
- Construct available data sources, output modes as a bullet list in prompts
- Use consistent agent name across static and dynamic examples in prompts
- Add actor's name to extract questions prompt to improve context for guidance
Previously only the notes references would get rendered post response
streaming when both online and notes references were used to
respond to the user's message
- Allow passing response format type to OpenAI API via chat actors
- Convert in-context examples to use json objects instead of str lists
- Update actors outputting str list to request output to be json_object
- OpenAI's json mode forces the model to output a valid json object
- Remove stale tests
- Improve tests to pass across gpt-3.5 and gpt-4-turbo
- The haiku creation director was failing because of duplicate query in
instantiated prompt
- Remove the option for Notes search query generation actor to return
no queries. Whether search should be performed is decided before,
this step doesn't need to decide that
- But do not throw warning if the response is a list with no elements
- Add examples where user queries requesting information about Khoj
result in the "online" data source being selected
- Add an example for "general" to select chat command prompt
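
The json mode and non-empty query list changes above could be handled on the parsing side roughly as below. This is a sketch under assumptions: the `queries` key, the `parse_queries` name and the fallback to the user's original query are illustrative, not the actual actor code.

```python
import json
from typing import List


def parse_queries(response_text: str, fallback_query: str) -> List[str]:
    """Parse a json object like {"queries": [...]} into a non-empty list of queries."""
    try:
        payload = json.loads(response_text)
    except json.JSONDecodeError:
        # Malformed output: fall back to the original query
        return [fallback_query]

    queries = payload.get("queries", []) if isinstance(payload, dict) else []
    if not isinstance(queries, list):
        queries = []

    # Keep only non-blank strings
    cleaned = [q.strip() for q in queries if isinstance(q, str) and q.strip()]

    # An empty list is not an error, but search always gets at least one query
    return cleaned or [fallback_query]
```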
Previously the examples constructed from chat history used "Khoj" as
the agent's name but all 3 prompts using the func used static examples
with "AI:" as the pertinent agent's name
- Add example to read khoj.dev website for up-to-date info to setup,
use khoj, discover khoj features etc.
- Online search should use site: and after: google search operators
- Show example of adding the after: date filter to google search
- Give local event lookup example using user's current location in
query
- Remove unused select search content type prompt
- Add a page to view all agents
- Add slugs to manage agents
- Add a page to view a single agent
- Display active agent when in chat window
- Fix post-login redirect issue
### Major
- Read web pages in parallel to improve chat response time
- Read web pages directly when the Olostep proxy is not set up
- Include search results & web page content in online context for chat response
### Minor
- Simplify, modularize and add type hints to online search functions
Previously if a web page was read for a sub-query, only the extracted
web page content was provided as context for the given sub-query. But
the google results themselves have relevant snippets. So include them
- Simplify content arg to `extract_relevant_info` function. Validate,
clean the content arg inside the `extract_relevant_info` function
- Extract `search_with_google` function outside the parent function
- Call the parent function a more appropriate `search_online` instead
of `search_with_google`
- Simplify the `search_with_google` function using list comprehension.
Drop empty search result fields from chat model context for response
to reduce cost and response latency
- No need to show stacktrace when unable to read webpage, basic error
is enough
- Add type hints to online search functions to catch issues with mypy
- Time reading webpage, extract info from webpage steps for perf
analysis
- Deduplicate webpages to read gathered across separate google
searches
- Use aiohttp to make API requests non-blocking, pair with asyncio to
parallelize all the online search webpage read and extract calls
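
The deduplicate-then-read-in-parallel pattern described above can be sketched as follows. The function name `read_webpages_in_parallel` is hypothetical, and the page reader is passed in as a parameter (an assumption) so the sketch runs without network access. In practice it would be an aiohttp-based coroutine.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, Optional


async def read_webpages_in_parallel(
    urls: Iterable[str],
    read_webpage: Callable[[str], Awaitable[str]],
) -> Dict[str, Optional[str]]:
    """Read each unique URL concurrently; map failed reads to None."""
    # Deduplicate URLs gathered across separate searches, preserving order
    unique_urls = list(dict.fromkeys(urls))

    # Schedule all reads at once. return_exceptions=True keeps one
    # unreadable page from aborting the other reads
    results = await asyncio.gather(
        *(read_webpage(url) for url in unique_urls), return_exceptions=True
    )

    return {
        url: None if isinstance(result, BaseException) else result
        for url, result in zip(unique_urls, results)
    }
```

Because the reads overlap, total wait time is roughly that of the slowest page rather than the sum of all pages.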
- Add a db model for Agents, attaching them to conversations
- When an agent is added to a conversation, override the system prompt to tweak the instructions
- Agents can be configured with prompt modification, model specification, a profile picture, and other things
- Admin-configured models will not be editable by individual users
- Add unit tests to verify agent behavior. Unit tests demonstrate imperfect adherence to prompt specifications
- Trigger
SentenceTransformer Cross Encoder models now run fast on GPU enabled machines, including Mac ARM devices since UKPLab/sentence-transformers#2463
- Details
- Use cross-encoder to rerank search results by default on GPU machines and when using an inference server
- Only call search API when pause in typing search query on web, desktop apps
Wait for 300ms after the user stops typing before calling the search API.
This smooths out UI jitter when rendering search results, especially
now that we're reranking for every search query on GPU enabled devices
Emacs already has 300ms debounce time. More convoluted to add
debounce time to Obsidian search modal, so not updating that yet
Latest sentence-transformer package uses GPU for cross-encoder. This
makes it fast enough to enable reranking on machines with GPU.
Enabling search reranking by default allows (at least) users with GPUs
to side-step learning the UI affordance to rerank results
(i.e. hitting Cmd/Ctrl-Enter or ENTER).
- Fix
`get_conversation_by_user` shouldn't return a new conversation if
a conversation with the requested id is not found.
It should only return a new conversation if no specific conversation
is requested and no conversations are found for the user at all
- Repro
- Delete a new chat. This calls loadChat via window.onload, which
sporadically calls the server /chat/history API endpoint with
conversationId set to that of the just deleted conversation
The call to GET chat/history API with conversationId set occurs
when window.onload triggers before the conversationId is deleted
by the delete button after the DELETE /chat/history API call (via race)
- In such a scenario, get_conversation_by_user called by
chat/history API with conversationId of deleted conversation
returns a new conversation
- Miscellaneous
- Chat history load should be logged as call to that chat_history api,
not the "chat" api
- Show status updates of clearing conversation history in chat input
- Simplify web, desktop client code by removing unnecessary new variables
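
The intended `get_conversation_by_user` behavior from the fix above can be illustrated with an in-memory stand-in for the database. The `ConversationStore` class and the "highest id is the latest conversation" rule are assumptions made for this sketch, not the actual adapter code.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Conversation:
    id: int
    user: str


@dataclass
class ConversationStore:
    conversations: Dict[int, Conversation] = field(default_factory=dict)
    next_id: int = 1

    def create(self, user: str) -> Conversation:
        conversation = Conversation(id=self.next_id, user=user)
        self.conversations[conversation.id] = conversation
        self.next_id += 1
        return conversation

    def get_conversation_by_user(
        self, user: str, conversation_id: Optional[int] = None
    ) -> Optional[Conversation]:
        if conversation_id is not None:
            # A specific conversation was requested: never create one as a side effect
            conversation = self.conversations.get(conversation_id)
            if conversation is not None and conversation.user == user:
                return conversation
            return None

        existing = [c for c in self.conversations.values() if c.user == user]
        if existing:
            # Assumption: the highest id stands in for the most recent conversation
            return max(existing, key=lambda c: c.id)

        # No id requested and the user has no conversations at all
        return self.create(user)
```

Under this logic, a request that races with a delete and carries the stale id gets nothing back, rather than silently creating a fresh conversation.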
* Upload generated images to s3, if AWS credentials and bucket are available.
- In clients, render the images via the URL if it's returned with a text-to-image2 intent type
* Make the loading screen more intuitive, less jerky and update the programmatic copy button
* Update the loading icon when waiting for a chat response
* Add additional styling changes for showing UI changes when dragging file to the main screen
* Add a loading spinner when file upload is in progress, and don't index github/notion when indexing files
* Add an explicit icon for file uploading in the chat button menu
* Add appropriate dragover styling when picking a file from the file picker/browser
* Add a loading screen when retrieving chat history. Fix width of the chat window. Put attachment icon to the left of chat input
* Make major improvements to the image generation flow
- Include user context from online references and personal notes for generating images
- Dynamically select the modality that the LLM should respond with
- Return the inferred context in the query response for the desktop, web chat views to read
* Add unit tests for retrieving response modes via LLM
* Move output mode unit tests to the actor suite, rather than director
* Only show the references button if there is at least one available
* Rename aget_relevant_modes to aget_relevant_output_modes
* Use a shared method for generating reference sections, simplify some of the prompting logic
* Make out of space errors in the desktop client more obvious
- Open external links using the default link handler registered on OS
for the link type, e.g. http:// -> Firefox, mailto: -> Thunderbird, etc.
- Confirm before opening non-http URL using an external app
- Improve render of inferred query in image chat messages in Web, Desktop apps
- Add inferred queries to image chat responses in Obsidian client
- Fix rendering images from Khoj response in Obsidian client
* Simplify and clarify prompt for selecting toolset dynamically
* Add error handling around call to OLOSTEP api
* Fix conversation admin page
* Skip adding none or empty entries in the chunking method
- Improve
- Only send files modified since their last sync for indexing on server from the Obsidian client
- Fix
- Invalidate static asset browser cache in Web client when Khoj version changes
Previously we'd send all files in vault and let the server
deduplicate.
This change takes inspiration from the desktop app, and only pushes
files which were modified after their previous sync with the server.
This should reduce the processing load on the server
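
The filtering idea can be sketched as below. The Obsidian client itself is not written in Python, so this only illustrates the logic. The `files_to_sync` name and the per-file timestamp dictionaries are assumptions.

```python
from typing import Dict, List


def files_to_sync(file_mtimes: Dict[str, float], last_synced: Dict[str, float]) -> List[str]:
    """Return paths that were never synced or were modified after their last sync."""
    return sorted(
        path
        for path, mtime in file_mtimes.items()
        # New files have no sync record; changed files have a newer mtime
        if path not in last_synced or mtime > last_synced[path]
    )
```

For example, with `a.md` modified after its last sync, `b.md` unchanged and `c.md` never synced, only `a.md` and `c.md` would be pushed.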
* Retrieve, create, and save conversations differently if they're coming from a client application
- Not all of our client apps will necessarily maintain state over the conversation IDs available to a user. For some (single-threaded conversations), it should just use a single conversation. Fix the code to do so
* Simplify conversation retrieval logic
* Keep 0 padding below chat response
* Add order_by sorting to retrieving the conversation without id
### Improvements to Chat UI on Web, Desktop apps
- Improve styling of chat session side panel
- Improve styling of chat message bubble in Desktop, Web app
- Add frosted, minimal chat UI to background of Login screen
- Improve PWA install experience of Khoj
### Fixes to Chat UI on Web, Desktop apps
- Fix creating new chat sessions from the Desktop app
- Only show 3 starter questions even when consecutive chat sessions created
### Other Improvements
- Update Khoj cloud trial period to a fortnight instead of a week
- Document using venv to handle dependency conflict on khoj pip install
Resolves #276
- Resolve PWA issues thrown by Chrome/Edge
- Add screenshot samples showcasing remember, browse and draw features
- This can provide a richer app store like experience when
installing Khoj PWA on Mobile or Desktop
- Add wide and narrow screenshots to show Mobile vs Desktop UX
- Add higher resolution favicon for PWA
- Use single web manifest instead of separate ones for Chat, Search
- Update manifest description with more details about Khoj features
Reset starter question suggestions before appending in web, desktop app
Previously it'd keep adding to existing starter question
suggestions on each new session creation if multiple consecutive new
chat sessions were created.
This would result in more than the 3 expected starter questions being
displayed at a time
- Make collapse, expand toggle arrow point in the direction the action
will expand the side panel in
- Make the collapsed side panel reduce to a 1px sliver
- Improve rate limit error message wording
- Make the "too many requests" error message more robust. Should throw
that exception if self.request >= self.subscribed_requests because
upgrading wouldn't fix this rate limiting
- Respect newline with pre-line but not for bullets to improve
formatting of responses by Khoj
- Respect bold font by loading tajawal font with other weights
- Reduce bottom margin in chat message bubble, it's taking too much space
* Document original query when subqueries can't be generated
* Only add messages to the chat message log if it's non-empty
* When changing the search model, alert the user that all underlying data will be deleted
* Add more clarification to the prompt input for username, location
* Check if has_more is in the notion results before getting next_cursor
* Update prompt template for user name/location, update confirmation message when changing search model
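
The `has_more` check mentioned above might look like the following sketch. `has_more` and `next_cursor` are the pagination fields in Notion API list responses. The `get_next_cursor` helper is hypothetical.

```python
from typing import Optional


def get_next_cursor(response: dict) -> Optional[str]:
    """Return the cursor for the next page, or None when there is no next page."""
    # Error payloads may lack pagination keys, so check membership first
    if "has_more" in response and response["has_more"]:
        return response.get("next_cursor")
    return None
```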