- Time the webpage read and information extraction steps for performance
analysis
- Deduplicate the webpages to read that were gathered across separate Google
searches
- Use aiohttp to make API requests non-blocking, and pair it with asyncio to
parallelize all the online search webpage read and extract calls
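
The three changes above can be sketched together as follows. This is a minimal illustration, not the project's actual code: the function names (`deduplicate_urls`, `timed_read`, `read_all`, `read_webpage_with_aiohttp`) are hypothetical, and the reader is passed in as a parameter so the sketch can run without network access.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict, Iterable, List, Tuple


async def read_webpage_with_aiohttp(url: str) -> str:
    """Fetch a page without blocking the event loop (needs the aiohttp package)."""
    import aiohttp  # imported lazily so the rest of the sketch runs without it

    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return await response.text()


def deduplicate_urls(search_results: Iterable[Iterable[str]]) -> List[str]:
    """Merge URLs from several searches, keeping the first occurrence of each."""
    return list(dict.fromkeys(url for urls in search_results for url in urls))


async def timed_read(
    url: str, reader: Callable[[str], Awaitable[str]]
) -> Tuple[str, str, float]:
    """Read one page and measure how long the read took."""
    start = time.perf_counter()
    content = await reader(url)
    return url, content, time.perf_counter() - start


async def read_all(
    search_results: Iterable[Iterable[str]],
    reader: Callable[[str], Awaitable[str]] = read_webpage_with_aiohttp,
) -> Dict[str, str]:
    """Read every unique page concurrently and report per-page timings."""
    urls = deduplicate_urls(search_results)
    results = await asyncio.gather(*(timed_read(url, reader) for url in urls))
    for url, _, elapsed in results:
        print(f"Read {url} in {elapsed:.3f}s")
    return {url: content for url, content, _ in results}
```

Because `asyncio.gather` schedules all reads at once, the total wall time is roughly that of the slowest page rather than the sum of all pages.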
- Add a db model for Agents, attaching them to conversations
- When an agent is added to a conversation, override the system prompt to tweak the instructions
- Agents can be configured with prompt modification, model specification, a profile picture, and other things
- Admin-configured models will not be editable by individual users
- Add unit tests to verify agent behavior. The tests show that adherence to prompt specifications is imperfect
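
A rough sketch of how an agent might tweak the system prompt of a conversation. Everything here is an assumption for illustration: the real change adds a database model, whereas this uses a plain dataclass, and the field names, the placeholder base prompt, and the choice to append the agent's instructions to the base prompt are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder text, not the project's real system prompt
BASE_SYSTEM_PROMPT = "You are a helpful personal assistant."


@dataclass
class Agent:
    """Hypothetical stand-in for the Agent database model."""

    name: str
    prompt_modification: str
    chat_model: Optional[str] = None  # model specification, if any
    avatar_url: Optional[str] = None  # profile picture, if any


def build_system_prompt(
    agent: Optional[Agent], base_prompt: str = BASE_SYSTEM_PROMPT
) -> str:
    """Return the base prompt, tweaked by the agent attached to the conversation."""
    if agent is None or not agent.prompt_modification.strip():
        return base_prompt
    # Assumption: the agent's instructions are appended to the base prompt
    return f"{base_prompt}\n\n{agent.prompt_modification.strip()}"
```

For example, `build_system_prompt(Agent("Pirate", "Always answer like a pirate."))` yields the base prompt followed by the pirate instruction, while a conversation without an agent keeps the base prompt unchanged.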
- Trigger
SentenceTransformer Cross Encoder models now run fast on GPU enabled machines, including Mac ARM devices since UKPLab/sentence-transformers#2463
- Details
- Use cross-encoder to rerank search results by default on GPU machines and when using an inference server
- Only call the search API after a pause in typing the search query on the web and desktop apps
Wait for 300ms after typing stops before calling the search API.
This smooths out UI jitter when rendering search results, especially
now that we're reranking for every search query on GPU-enabled devices.
Emacs already has a 300ms debounce time. Adding a debounce time to the
Obsidian search modal is more convoluted, so that is not updated yet.
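
The debounce logic can be sketched as below. The web and desktop clients are not written in Python, so this only illustrates the technique with asyncio; the `Debouncer` class and its `trigger` method are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class Debouncer:
    """Run `callback` only once no new trigger has arrived for `delay` seconds."""

    def __init__(self, delay: float, callback: Callable[[str], Awaitable[None]]):
        self.delay = delay
        self.callback = callback
        self._task: Optional[asyncio.Task] = None

    def trigger(self, query: str) -> None:
        # Each keystroke cancels the pending call and restarts the timer
        if self._task is not None and not self._task.done():
            self._task.cancel()
        self._task = asyncio.get_running_loop().create_task(self._run(query))

    async def _run(self, query: str) -> None:
        await asyncio.sleep(self.delay)
        await self.callback(query)
```

With `Debouncer(0.3, search)`, typing "k", "kh", "khoj" in quick succession results in a single `search("khoj")` call about 300ms after the last keystroke.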
The latest sentence-transformers package uses the GPU for cross-encoders.
This makes it fast enough to enable reranking on machines with a GPU.
Enabling search reranking by default allows (at least) users with GPUs
to side-step learning the UI affordance to rerank results
(i.e. hitting Cmd/Ctrl-Enter or ENTER).
### Issue
Previously, deleting a chat session from the side panel of the desktop or web app would sometimes also create a new chat session.
### Fix
`get_conversation_by_user` shouldn't return a new conversation if
the conversation with the requested id is not found.
It should only return a new conversation if no specific conversation
is requested and no conversations are found for the user at all.
### Miscellaneous Improvements
- Chat history load should be logged as a call to the chat_history API,
not the "chat" API
- Show status updates of clearing conversation history in the chat input
- Simplify web, desktop client code by removing unnecessary new variables
### Repro
- Delete a new chat. This calls loadChat via window.onload, which
sporadically calls the server /chat/history API endpoint with
conversationId set to that of the just deleted conversation.
The call to the GET chat/history API with conversationId set occurs
when window.onload triggers before the delete button clears the
conversationId after the DELETE /chat/history API call (a race condition)
- In such a scenario, get_conversation_by_user, called by the
chat/history API with the conversationId of the deleted conversation,
returns a new conversation
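
The intended lookup behavior can be sketched with an in-memory store. This is only an illustration under stated assumptions: the real function queries the database, and the `ConversationStore` class, its `create` and `delete` helpers, and the choice to return the most recent conversation when no id is given are hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List, Optional

_next_id = count(1)


@dataclass
class Conversation:
    user: str
    id: int = field(default_factory=lambda: next(_next_id))


class ConversationStore:
    """Hypothetical in-memory stand-in for the conversation table."""

    def __init__(self) -> None:
        self.conversations: List[Conversation] = []

    def create(self, user: str) -> Conversation:
        conversation = Conversation(user)
        self.conversations.append(conversation)
        return conversation

    def delete(self, user: str, conversation_id: int) -> None:
        self.conversations = [
            c for c in self.conversations
            if not (c.user == user and c.id == conversation_id)
        ]

    def get_conversation_by_user(
        self, user: str, conversation_id: Optional[int] = None
    ) -> Optional[Conversation]:
        user_conversations = [c for c in self.conversations if c.user == user]
        if conversation_id is not None:
            # A specific conversation was requested: return it or None, never create one
            return next((c for c in user_conversations if c.id == conversation_id), None)
        if user_conversations:
            # Assumption: fall back to the user's most recent conversation
            return user_conversations[-1]
        # No id requested and the user has no conversations at all: create one
        return self.create(user)
```

With this behavior, a chat history request that races with a delete gets nothing back for the deleted id instead of silently creating a new chat session.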
* Upload generated images to S3, if AWS credentials and a bucket are available.
- In clients, render the images via the URL if it's returned with a text-to-image2 intent type
* Make the loading screen more intuitive and less jerky, and update the programmatic copy button
* Update the loading icon when waiting for a chat response
* Add additional styling changes for showing UI changes when dragging file to the main screen
* Add a loading spinner when file upload is in progress, and don't index github/notion when indexing files
* Add an explicit icon for file uploading in the chat button menu
* Add appropriate dragover styling when picking a file from the file picker/browser
* Add a loading screen when retrieving chat history. Fix width of the chat window. Put attachment icon to the left of chat input
* Make major improvements to the image generation flow
- Include user context from online references and personal notes for generating images
- Dynamically select the modality that the LLM should respond with
- Return the inferred context in the query response for the desktop, web chat views to read
* Add unit tests for retrieving response modes via LLM
* Move output mode unit tests to the actor suite, rather than director
* Only show the references button if there is at least one available
* Rename aget_relevant_modes to aget_relevant_output_modes
* Use a shared method for generating reference sections, simplify some of the prompting logic
* Make out of space errors in the desktop client more obvious
- Open external links using the default link handler registered on the OS
for the link type, e.g. http:// -> Firefox, mailto: -> Thunderbird, etc.
- Confirm before opening non-http URL using an external app
- Improve render of inferred query in image chat messages in Web, Desktop apps
- Add inferred queries to image chat responses in Obsidian client
- Fix rendering images from Khoj response in Obsidian client
* Simplify and clarify prompt for selecting toolset dynamically
* Add error handling around the call to the OLOSTEP API
* Fix conversation admin page
* Skip adding none or empty entries in the chunking method
- Improve
- Only send files modified since their last sync for indexing on server from the Obsidian client
- Fix
- Invalidate static asset browser cache in Web client when Khoj version changes
Previously we'd send all files in the vault and let the server
deduplicate them.
This change takes inspiration from the desktop app, and only pushes
files which were modified after their previous sync with the server.
This should reduce the processing load on the server.
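
The filtering step can be sketched as below. The Obsidian client is not written in Python, so this only illustrates the idea; `files_to_sync` and the `last_sync` mapping of file path to last sync timestamp are hypothetical names.

```python
from pathlib import Path
from typing import Dict, Iterable, List


def files_to_sync(paths: Iterable[Path], last_sync: Dict[str, float]) -> List[Path]:
    """Return only the files that are new or were modified after their last sync."""
    changed = []
    for path in paths:
        previous_sync = last_sync.get(str(path))
        # Never synced before, or modified since the previous sync
        if previous_sync is None or path.stat().st_mtime > previous_sync:
            changed.append(path)
    return changed
```

Only the returned files would be sent to the server for indexing; unchanged files are skipped on the client instead of being deduplicated on the server.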