Commit graph

2740 commits

Author SHA1 Message Date
Debanjum Singh Solanky
8d588e0765 Encourage output mode chat actor to output only json and nothing else
Latest claude model wanted to say more than just give the json output.
The updated prompt encourages the model to ouput just json. This is
similar to what is already being done for other prompts
2024-10-23 17:19:21 -07:00
Debanjum Singh Solanky
abad5348a0 Give Vision to Anthropic models in Khoj 2024-10-23 17:19:21 -07:00
Debanjum Singh Solanky
6fd50a5956 Reuse logic to format messages for chat with anthropic models 2024-10-23 17:19:21 -07:00
Debanjum Singh Solanky
82eac5a043 Make the get image from url function more versatile and reusable
It was previously added under the google utils. Now it can be used by
other conversation processors as well.

The updated function
- can get both base64 encoded and PIL formatted images from url
- will return the media type of the image as well in response
2024-10-23 17:19:20 -07:00
Debanjum Singh Solanky
9f2c02d9f7 Chat with the default agent by default from web app home
Had temporarily updated the default selected agent to last used.
Revert for now as
1. The previous logic was buggy. It didn't select the default agent
   even when the last used agent was the default agent. Which would
   require more work.
2. It maybe too early anyway to set the default agent to last used.
2024-10-23 03:43:57 -07:00
Debanjum Singh Solanky
218946edda Fix copying message with user images on web app
Adding div elements to message to render degraded text copied to
clipboard for messages with user uploaded images.

This change fixes that by separating message to render from message
for clipboard. It ensures differently formatted forms of the user
images are added to the two to allow proper rendering while still
having decently formatted text copied to clipboard
2024-10-23 03:41:25 -07:00
Debanjum Singh Solanky
2a50694089 Allow typing multi-line queries from a phone with Enter key
Add newline instead of sending message when hit Enter key on mobile
displays. As on phones shift key doesn't exist and send button is easily
clickable.

Limit hitting Enter key to send message to computers = larger display
= expected to have full fledged keyboards.
2024-10-22 21:20:22 -07:00
Debanjum Singh Solanky
a134cd835c Focus on chat input area to enter text after file uploads on web app 2024-10-22 21:19:17 -07:00
Debanjum Singh Solanky
750fbce0c2 Merge branch 'master' into improve-agent-pane-on-home-screen 2024-10-22 20:05:29 -07:00
Debanjum Singh Solanky
3be505db48 Only show type of error when image generation fails to clients
Rather than showing raw error message from the underlying service as it
could contain sensitive information
2024-10-22 20:03:20 -07:00
Debanjum Singh Solanky
b3fff43542 Sanitize user attached images. Constrain chat input width on home page
Set max combined images size to 20mb to allow multiple photos to be shared
2024-10-22 19:42:40 -07:00
Debanjum Singh Solanky
6c393800cc Merge branch 'master' into multi-image-chat-and-vision-for-gemini 2024-10-22 18:38:49 -07:00
Debanjum Singh Solanky
91bbd19333 Close the agent detail hover card when scroll on agent pane 2024-10-22 18:03:17 -07:00
Debanjum Singh Solanky
110c67f083 Improve agent pill, detail card styling. Handle null chatInputRef
- Remove border from agent detail hover card on home page
- Do not wrap long agent names in agent pills on home page
- Handle scenario where chatInputRef is null
2024-10-22 18:03:17 -07:00
Debanjum Singh Solanky
aca8bef024 Only use recent chat sessions for agent MRU. Handle null agent chats 2024-10-22 17:46:45 -07:00
sabaimran
0dad4212fa
Generate dynamic diagrams (via Excalidraw) (#940)
Add support for generating dynamic diagrams in flow with Excalidraw (https://github.com/excalidraw/excalidraw). This happens in three steps:
1. Default information collection & intent determination step.
2. Improving the overall guidance of the prompt for generating a JSON, Excalidraw-compatible declaration.
3. Generation of the diagram to output to the final UI.

Add support in the web UI.
2024-10-22 16:13:46 -07:00
sabaimran
1e993d561b Release Khoj version 1.26.4 2024-10-22 13:50:08 -07:00
Debanjum Singh Solanky
e8fb79a369 Rate limit the count and total size of images shared via API 2024-10-22 04:37:54 -07:00
sabaimran
892040972f Replace user_id with server_id in telemetry 2024-10-21 20:47:52 -07:00
sabaimran
21e69b506d Release Khoj version 1.26.3 2024-10-21 08:19:05 -07:00
Debanjum Singh Solanky
9b554feb91 Show agent details card on hover on agent pill on web app home page
- Double click on agent to open edit agent card
- Focus on chat input pane when agent selected/clicked
  for quick, smooth agent switch and message flow
- Hover on agent to see agent detail card on non-mobile displays
  - Use debounce to only show when hover on card for a bit
2024-10-21 00:08:01 -07:00
Debanjum Singh Solanky
220ff1df62 Set chatInputArea forward ref from parent components for control 2024-10-21 00:02:48 -07:00
Debanjum Singh Solanky
54b92eaf73 Extract isUserSubscribed check from Agents page to make it resusable 2024-10-20 23:31:48 -07:00
Debanjum Singh Solanky
bdbe8f003e Move agent details and edit card out into reusable components on web app 2024-10-20 23:31:47 -07:00
sabaimran
59fec37943 Improve agents management, and limit agents view to private and official agents
- Default to None for the input_tools and output_modes so that they can be managed in the admin panel
- Hold off on showing off all Public Agents until we have a better experience for user profiles etc.
2024-10-20 22:24:51 -07:00
sabaimran
a979457442 Add unit tests for agents
- Add permutations of testing for with, without knowledge base. Private, public, different users.
2024-10-20 20:04:50 -07:00
sabaimran
fc70f25583 Release Khoj version 1.26.2 2024-10-20 18:03:36 -07:00
sabaimran
046de57571 Improve error handling when documents not searched with stack trace
- Stop extract OCR content from PDFs
- Only use agent knowledge base when user not provided
2024-10-20 18:03:14 -07:00
sabaimran
2b68d61fef Release Khoj version 1.26.1 2024-10-20 16:21:51 -07:00
Debanjum Singh Solanky
5fca41cc29 Show agents sorted by mru, Select mru agent by default on web app
Have get agents API return agents ordered intelligently
- Put the default agent first
- Sort used agents by most recently chatted with agent for ease of access
- Randomly shuffle the remaining unused agents for discoverability
2024-10-20 15:21:25 -07:00
Debanjum Singh Solanky
a6bfdbdbfe Show all agents in carousel on home screen agent pane of web app
This change wraps the agent pane in a scroll area with all agents shown.
It allows selecting an agent to chat with directly from the home
screen without breaking flow and having to jump to the agents page.

The previous flow was not convenient to quickly and consistently start
chat with one of your standard agents.

This was because a random subet of agents were shown on the home page.
To start chat with an agent not shown on home screen load you had to
open the agents page and initiate the conversation from there.
2024-10-20 15:21:25 -07:00
Debanjum Singh Solanky
9ffd726799 Allow making sync api requests with body from khoj.el 2024-10-20 15:16:40 -07:00
Debanjum Singh Solanky
ac51920859 Start conversation with Agents from within Emacs
Exposes a transient switch with available agents as selectable options
in the Khoj chat sub-menu.

Currently shows agent slugs instead of agent names as options. This
isn't the cleanest but gets the job done for now.

Only new conversations with a different agent can be started. Existing
conversations will continue with the original agent it was created with.
The ability to switch the conversation's agent doesn't exist on the
server yet.
2024-10-20 15:16:40 -07:00
Debanjum Singh Solanky
7646ac6779 Style user attached images as carousel on chat input area of web app 2024-10-20 00:40:08 -07:00
sabaimran
5d5bea6a5f Ensure images are reset after messages processed 2024-10-19 22:02:06 -07:00
sabaimran
1ad6e1749f Move window redirect to after relevant data is dropped in localStorage on the homage page
One limitation of this methodology is that localStorage has a limit in how much data it can take. Should add more graceful error handling here as well.
2024-10-19 20:36:13 -07:00
sabaimran
cb6b3ec1e9 Improve mode description given to LLM when determining how to respond.
Currently experiencing difficulty instruction following when an image is shared. It's more likely to try and output an image. Update to make a clearer distinction.
2024-10-19 20:35:32 -07:00
sabaimran
545259e308 Remove unused icons in chatInputArea 2024-10-19 16:54:21 -07:00
Debanjum Singh Solanky
3cc1426edf Style user attached images with fixed height, in a single row on web app 2024-10-19 16:48:36 -07:00
Debanjum Singh Solanky
58a331227d Display the attached images inside the chat input area on the web app
- Put the attached images display div inside the same parent div as
  the text area
- Keep the attachment, microphone/send message buttons aligned with
  the text area. So the attached images just show up at the top of the
  text area but everything else stays at the same horizontal height as
  before.

- This improves the UX by
  - Ensuring that the attached images do not obscure the agents pane
    above the chat input area
  - The attached images visually look like they are inside the actual
    input area, rather than floating above it. So the visual aligns
    with the semantics
2024-10-19 16:29:45 -07:00
Debanjum Singh Solanky
3e39fac455 Add vision support for Gemini models in Khoj 2024-10-19 15:47:03 -07:00
Debanjum Singh Solanky
0d6a54c10f Allow sharing multiple images as part of user query from the web app
Previously the web app only expected a single image to be shared by
the user as part of their query.

This change allows sharing multiple images from the web app.

Closes #921
2024-10-19 15:47:03 -07:00
Debanjum Singh Solanky
e2abc1a257 Handle multiple images shared in query to chat API
Previously Khoj could respond to a single shared image at a time.

This changes updates the chat API to accept multiple images shared by
the user and send it to the appropriate chat actors including the
openai response generation chat actor for getting an image aware
response
2024-10-19 14:53:33 -07:00
Debanjum Singh Solanky
d55cba8627 Pass user query for chat response when document lookup fails
Recent changes made Khoj try respond even when document lookup fails.
This change missed handling downstream effects of a failed document
lookup, as the defiltered_query was null and so the text response
didn't have the user query to respond to.

This code initializes defiltered_query to original user query to
handle that.

Also response_type wasn't being passed via
send_message_to_model_wrapper_sync unlike in the async scenario
2024-10-19 14:32:19 -07:00
Debanjum Singh Solanky
a4e6e1d5e8 Share webp images from web, desktop, obsidian app to chat with 2024-10-19 14:32:17 -07:00
sabaimran
dbd9a945b0 Re-evaluate agent private/public filtering after authenticateddata is retrieved. Update selectedAgent check logic to reflect. 2024-10-18 09:31:56 -07:00
Debanjum Singh Solanky
35015e720e Release Khoj version 1.26.0 2024-10-17 18:25:53 -07:00
Debanjum Singh Solanky
f0dcfe4777 Explicitly ask Gemini models to format their response with markdown
Otherwise it can get confused by the format of the passed context (e.g
respond in org-mode if context contains org-mode notes)
2024-10-17 18:12:47 -07:00
Debanjum Singh Solanky
2c20f49bc5 Return enabled scrapers as WebScraper objects for more ergonomic code 2024-10-17 17:44:09 -07:00
Debanjum Singh Solanky
0db52786ed Make web scraper priority configurable via admin panel
- Simplifies changing order in which web scrapers are invoked to read
  web page by just changing their priority number on the admin panel.
  Previously you'd have to delete/, re-add the scrapers to change
  their priority.

- Add help text for each scraper field to ease admin setup experience

- Friendlier env var to use Firecrawl's LLM to extract content

- Remove use of separate friendly name for scraper types.
  Reuse actual name and just make actual name better
2024-10-17 17:42:42 -07:00