- Double click on agent to open edit agent card
- Focus on chat input pane when agent selected/clicked
for quick, smooth agent switch and message flow
- Hover on agent to see agent detail card on non-mobile displays
- Use debounce to only show when hover on card for a bit
Have get agents API return agents ordered intelligently
- Put the default agent first
- Sort used agents by most recently chatted with agent for ease of access
- Randomly shuffle the remaining unused agents for discoverability
This change wraps the agent pane in a scroll area with all agents shown.
It allows selecting an agent to chat with directly from the home
screen without breaking flow and having to jump to the agents page.
The previous flow was not convenient to quickly and consistently start
chat with one of your standard agents.
This was because a random subet of agents were shown on the home page.
To start chat with an agent not shown on home screen load you had to
open the agents page and initiate the conversation from there.
Exposes a transient switch with available agents as selectable options
in the Khoj chat sub-menu.
Currently shows agent slugs instead of agent names as options. This
isn't the cleanest but gets the job done for now.
Only new conversations with a different agent can be started. Existing
conversations will continue with the original agent it was created with.
The ability to switch the conversation's agent doesn't exist on the
server yet.
One limitation of this methodology is that localStorage has a limit in how much data it can take. Should add more graceful error handling here as well.
- Put the attached images display div inside the same parent div as
the text area
- Keep the attachment, microphone/send message buttons aligned with
the text area. So the attached images just show up at the top of the
text area but everything else stays at the same horizontal height as
before.
- This improves the UX by
- Ensuring that the attached images do not obscure the agents pane
above the chat input area
- The attached images visually look like they are inside the actual
input area, rather than floating above it. So the visual aligns
with the semantics
Previously the web app only expected a single image to be shared by
the user as part of their query.
This change allows sharing multiple images from the web app.
Closes#921
- Only set addedFiles to selectedFiles when selectedFiles is an array
- Only set seleectedFiles, addedFiles to API response json when
response succeeded. Previously we set it to response json
on errors as well. This made the variables into json objects instead
of arrays on API call failure
- Check if selectedFiles, addedFiles are arrays before running
operations on them. Previously the addedFiles.includes was where the
code would fail
Update regex to also include any links to code generated images that
aren't explicitly meant to be displayed inline. This allows folks to
download the image (unlike the fake link that doesn't work created by
model)
Previously you had to refresh the page to see the updated data on
reopening the agents edit card after a save operation.
Now you see the latest saved agent data on reopening the agents edit
card. This should avoid confusion on whether the data was saved
correctly
Currently, the personality of the agent is only included in the final response that it returns to the user. Historically, this was because models were quite bad at navigating the additional context of personality, and there was a bias towards having more control over certain operations (e.g., tool selection, question extraction).
Going forward, it should be more approachable to have prompts included in the sub tasks that Khoj runs in order to response to a given query. Make this possible in this PR. This also sets us up for agent creation becoming available soon.
Create custom agents in #928
Agents are useful insofar as you can personalize them to fulfill specific subtasks you need to accomplish. In this PR, we add support for using custom agents that can be configured with a custom system prompt (aka persona) and knowledge base (from your own indexed documents). Once created, private agents can be accessible only to the creator, and protected agents can be accessible via a direct link.
Custom tool selection for agents in #930
Expose the functionality to select which tools a given agent has access to. By default, they have all. Can limit both information sources and output modes.
Add new tools to the agent modification form
## Overview
Add user country code as context for doing online search with serper.dev API.
This should find more user relevant results from online searches by Khoj
## Details
### Major
- Default to using system clock to infer user timezone on js clients
- Infer country from timezone when only timezone received by chat API
- Localize online search results to user country when location available
### Minor
- Add `__str__` func to `LocationData` class to deduplicate location string generation
Make all the scroll actions just use requestAnimationFrame instead of
setTimeout. It better aligns with browser rendering loop, so better
for UX changes than setTimeout
Using system clock to infer user timezone on clients makes Khoj
more robust to provide location aware responses.
Previously only ip based location was used to infer timezone via API.
This didn't provide any decent fallback when calls to ipapi failed or
Khoj was being run in offline mode
Get country code to server chat api from i.p location check on clients.
Use country code to get country specific online search results via Serper.dev API
Remove unnecessary "Inferred Query" heading prefix to image generation prompt
used by Khoj. The inferred query in chat message has a heading of it's
own, so avoid two headings for the image prompt
The problem was the tool tip was visible on hover, but it was slow, so before the tool tip popped up, the user would click on the button and this stopped the tool tip from popping up.
So i reduced the popup delay to 10ms. now as soon as user hovers over the button, they will see that its a feature coming soon!
Improve Scrolling on Chat page of Web app
- Details
1. Only auto scroll Khoj's streamed response when scroll is near bottom of page
Allows scrolling to other messages in conversation while Khoj is formulating and streaming its response
2. Add button to scroll to bottom of the chat page
3. Scroll to most recent conversation turn on conversation first load
It's a better default to anchor to most recent conversation turn (i.e most recent user message)
4. Smooth scroll when Khoj's chat response is streamed
Previously the scroll would jitter during response streaming
5. Anchor scroll position when fetch and render older messages in conversation
Allow users to keep their scroll position when older messages are fetched from server and rendered
Resolves#758
* Update the conversation_id primary key field to be a uuid
- update associated API endpoints
- this is to improve the overall application health, by obfuscating some information about the internal database
- conversation_id type is now implicitly a string, rather than an int
- ensure automations are also migrated in place, such that the conversation_ids they're pointing to are now mapped to the new IDs
* Update client-side API calls to correctly query with a string field
* Allow modifying of conversation properties from the chat title
* Improve drag and drop file experience for chat input area
* Use a phosphor icon for the copy to clipboard experience for code snippets
* Update conversation_id parameter to be a str type
* If django_apscheduler is not in the environment, skip the migration script
* Fix create automation flow by storing conversation id as string
The new UUID used for conversation id can't be directly serialized.
Convert to string for serializing it for later execution
---------
Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>
This reverts commit c9665fb20b.
Revert "Fix handling for new conversation in agents page"
This reverts commit 3466f04992.
Revert "Add a unique_id field for identifiying conversations (#914)"
This reverts commit ece2ec2d90.
- This allows triggering khoj chat from the browser addressbar
- So now if you add Khoj to your browser bookmark with
- URL: https://app.khoj.dev/?q=%s
- Keyword: khoj
- Then you can type "khoj what is the news today" to trigger Khoj to
quickly respond to your query. This avoids having to open the Khoj web
app before asking your question
* Add a unique_id field to the conversation object
- This helps us keep track of the unique identity of the conversation without expose the internal id
- Create three staged migrations in order to first add the field, then add unique values to pre-fill, and then set the unique constraint. Without this, it tries to initialize all the existing conversations with the same ID.
* Parse and utilize the unique_id field in the query parameters of the front-end view
- Handle the unique_id field when creating a new conversation from the home page
- Parse the id field with a lightweight parameter called v in the chat page
- Share page should not be affected, as it uses the public slug
* Fix suggested card category
This happens sometimes when LLM respons contains [\[1\]] kind of links
as reference. Both markdown-it and katex apply styling.
Katex's span uses display: block which makes the rendering of these
references take up a whole line by themselves.
Override block styling of spans within an `a' element to prevent such
chat message styling issues
- Make train of thought icons to be top aligned, next to the
their intermediate step heading
- Add margin bottom to ordered, unordered lists in chat message,
similar to how it is already added for paragraphs
# Summary of Changes
* New UI to show preview of image uploads
* ChatML message changes to support gpt-4o vision based responses on images
* AWS S3 image uploads for persistent image context in conversations
* Database changes to have `vision_enabled` option in server admin panel while configuring models
* Render previously uploaded images in the chat history, show uploaded images for pending msgs
* Pass the uploaded_image_url through to subqueries
* Allow image to render upon first message from the homepage
* Add rendering support for images to shared chat as well
* Fix some UI/functionality bugs in the share page
* Convert user attached images for chat to webp format before upload
* Use placeholder to attached image for data source, response mode actors
* Update all clients to call /api/chat as a POST instead of GET request
* Fix copying chat messages with images to clipboard
TLDR; Add vision support for openai models on Khoj via the web UI!
---------
Co-authored-by: sabaimran <narmiabas@gmail.com>
Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>
Limit file types to sync with Khoj from Obsidian to:
- Avoid hitting per user index-able data limits, especially for folks on the Khoj cloud free tier. E.g by excluding images in Obsidian vault from being synced
- Improve context used by Khoj to generate responses
When user exceeds data sync limits. Show error notice with
- Link to web app settings page to upgrade subscription
- Link to Khoj plugin settings in Obsidian to configure file types to
sync from vault to Khoj
- Allow free tier users to have unlimited chats with default chat model. It'll only be rate-limited and at the same rate as subscribed users
- In the server chat settings, replace the concept of default/summarizer models with default/advanced chat models. Use the advanced models as a default for subscribed users.
- For each `ChatModelOption' configuration, allow the admin to specify a separate value of `max_tokens' for subscribed users. This allows server admins to configure different max token limits for unsubscribed and subscribed users
- Show error message in web app when hit rate limit or other server errors
Previously `force' was passed as a query param to the single indexing API. After the recent API updates, it is meant to select the API method to use (PATCH vs PATCH). Converting `force' argument to a bool fixes implementing this new behavior
* Add ability to cycle through the chat history in the chat input on Obsidian (similar to terminal history navigation)
* Add mod key shortcut to cycle through chat history in chat input
* Add shortcut help text in chat input placeholder
---------
Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>
Previously required the automation page to be refreshed to see updates
to the automation in the edit automation card. This would be seen when
user tries to edit an automation multiple times (without a page refresh)
Previously, the code incorrectly treated all non-nil values as true, leading to
the index being re-indexed with the force flag whenever the user selected to
update the index.
- Use color to provide visual feedback when hover, click on feedback
buttons
- Use color to provide visual feedback when hover on speech, copy
buttons click
- Add cooldown period before being able to send feedback on that message again.
Avoids inadvertent multiple consecutive clicks on feedback buttons
- Since the .gitignore will ignore any of the assets in the src/ folder when building the package wheel, we need to output the static assets to another folder just for the python pypi package. Use /compiled for this.
- Auto focus on email input on login screen for smoother login experience
- Use file icon associated with search page results. Improve search bar
- Show logged in user's email in nav menu for context
- Use previous icons with eyes for search, agents and automations items in nav menu
Hierarchical documents like org-mode, markdown have their ancestry
shown in first line. Remove it to show cleaner, deduplicated reference
text from org-mode, markdown files
Utilize chat footer space more efficiently. This is especially useful
on small screens
- Send button is anyway only enabled when there is text in chat input
- Otherwise voice message button is better to show by default
- Remove invalid call to styles.main
- Remove unnecessary top padding above side pane to keep side pane at
consistent position across web app
- Use same pageLayout styles and styling structure on agent like
automation
- Vertically center automation section and page title on it's row
- Fix applying flex vs grid with tailwind
- Remove x axis footer padding on small screens to preserve space,
keep equal spacing between footer items
- Add 1rem margin to buttons to not have overlap in boundary
- Add 1rem y-axis padding to chat footer to not have focus boundary
leave the footer boundary on smaller screens
Installing Khoj as PWA was supported in previous web UX as well. This
just adds link to the existing webmanifest to continue support for
installing Khoj as PWA with new web UX
Previously the rename wasn't updating the chat session title. We'd
have to refresh the page or side pane to get latest chat session names
after rename action.
Previously the footer's right border wasn't visible on small screens
due to usage of w-full
Use mr-1 on send button instead of px-1 on chat input parent to
eualize chat footer buttons spacing
- Show informative toast messages on copy, delete of API keys
- Onle show API keys card in non anonymous mode. API keys aren't
required (and is disabled on server side) in anon mode. Not showing
card at all in anon mode reduces chance of unnecessary confusion
Style profile pircture button on nav menu
- Use primary colored ring around subscribed user profile on nav menu
- Use gray colored ring around non-subscribed user profile on nav menu
- Use upper case initial as profile pic for user with no profile pic
- Click anywhere on nav menu item to trigger action
Previously the actual clickable area was smaller than the width of
the nav menu item