- Use a single standard search model across the server. There's diminishing benefits for having multiple user-customizable search models.
- We may want to add server-level customization for specific tasks
- Store the search model used to generate a given entry on the `Entry` object
- Remove user-facing APIs and view
- Add a management command for migrating the default search model on the server
In a future PR (after running the migration), we'll also remove the `UserSearchModelConfig`
Latest claude model wanted to say more than just give the json output.
The updated prompt encourages the model to ouput just json. This is
similar to what is already being done for other prompts
It was previously added under the google utils. Now it can be used by
other conversation processors as well.
The updated function
- can get both base64 encoded and PIL formatted images from url
- will return the media type of the image as well in response
* Create explicit flow to enable the free trial
The current design is confusing. It obfuscates the fact that the user is on a free trial. This design will make the opt-in explicit and more intuitive.
* Use the Subscription Type enum instead of hardcoded strings everywhere
* Use length of free trial in the frontend code as well
Had temporarily updated the default selected agent to last used.
Revert for now as
1. The previous logic was buggy. It didn't select the default agent
even when the last used agent was the default agent. Which would
require more work.
2. It maybe too early anyway to set the default agent to last used.
Adding div elements to message to render degraded text copied to
clipboard for messages with user uploaded images.
This change fixes that by separating message to render from message
for clipboard. It ensures differently formatted forms of the user
images are added to the two to allow proper rendering while still
having decently formatted text copied to clipboard
Add newline instead of sending message when hit Enter key on mobile
displays. As on phones shift key doesn't exist and send button is easily
clickable.
Limit hitting Enter key to send message to computers = larger display
= expected to have full fledged keyboards.
## Overview
Allow quickly selecting, switching agents from agents pane on home page of web app
## Details
- Show all agents in carousel on home screen agent pane of web app
- Smart Sort
1. Pin default agent as first for ease of access
2. Show used agents by MRU for ease of access
3. Shuffle unused agents for discoverability
- Select most recently used agent to chat with by default
- Push smart sort logic down to API
- Common logic can be reused across clients
- Agent sort was previously done in web app
- Focus on chat input on agent select
- Double click agent on home page to open edit agent card on agents page
## Overview
- Add vision support for Gemini models in Khoj
- Allow sharing multiple images as part of user query from the web app
- Handle multiple images shared in query to chat API
- Remove border from agent detail hover card on home page
- Do not wrap long agent names in agent pills on home page
- Handle scenario where chatInputRef is null
Add support for generating dynamic diagrams in flow with Excalidraw (https://github.com/excalidraw/excalidraw). This happens in three steps:
1. Default information collection & intent determination step.
2. Improving the overall guidance of the prompt for generating a JSON, Excalidraw-compatible declaration.
3. Generation of the diagram to output to the final UI.
Add support in the web UI.
Previously only notes context from chat history was included.
This change includes online context from chat history for model to use
for response generation.
This can reduce need for online lookups by reusing previous online
context for faster responses. But will increase overall response time
when not reusing past online context, as faster context buildup per
conversation.
Unsure if inclusion of context is preferrable. If not, both notes and
online context should be removed.
The document, online search context are now passed as separate user
messages to chat model, instead of being added to the final user message.
This will improve
- Models ability to differentiate data from user query.
That should improve response quality and reduce prompt injection
probability
- Make truncation logic simpler and more robust
When context window hit, can simply pop messages to auto truncate
context in order of context, user, assistant message for each
conversation turn in history until reach current user query
The complex, brittle logic to extract user query from context in
last user message isn't required.
Marking the context message with assistant role doesn't translate well
across chat models. E.g
- Gemini can't handle consecutive messages by role = model well
- Claude will merge consecutive messages by same role. In current
message ordering the context message will result get merged into the
previous assistant response. And if move context message after user
query. The truncation logic will have to hop and skip while doing
deletions
- GPT seems to handle consecutive roles of any type fine
Using context role = user generalizes better across chat models for
now and aligns with previous behavior.
Improve separation of note snippets and show its origin file in notes
prompt to have more readable, contextualized text shared with model.
Previously the references dict was being directly passed as a string.
The documents don't look well formatted and are less intelligible.
- Passing file path along with notes snippets will help contextualize
the notes better.
- Better formatting should help with making notes more readable by the
chat model.
- Double click on agent to open edit agent card
- Focus on chat input pane when agent selected/clicked
for quick, smooth agent switch and message flow
- Hover on agent to see agent detail card on non-mobile displays
- Use debounce to only show when hover on card for a bit
- Default to None for the input_tools and output_modes so that they can be managed in the admin panel
- Hold off on showing off all Public Agents until we have a better experience for user profiles etc.
Have get agents API return agents ordered intelligently
- Put the default agent first
- Sort used agents by most recently chatted with agent for ease of access
- Randomly shuffle the remaining unused agents for discoverability
This change wraps the agent pane in a scroll area with all agents shown.
It allows selecting an agent to chat with directly from the home
screen without breaking flow and having to jump to the agents page.
The previous flow was not convenient to quickly and consistently start
chat with one of your standard agents.
This was because a random subet of agents were shown on the home page.
To start chat with an agent not shown on home screen load you had to
open the agents page and initiate the conversation from there.