Loading the embeddings model, even locally seems to be taking much
longer. Use timer to track visibility into embedding, cross-encoder
model load times
We should start disambiguating the the max input from output size. Max
prompt size should only be used for the max input context to an LLM.
If required max_output_tokens should be set as a separate new field
Currently, the personality of the agent is only included in the final response that it returns to the user. Historically, this was because models were quite bad at navigating the additional context of personality, and there was a bias towards having more control over certain operations (e.g., tool selection, question extraction).
Going forward, it should be more approachable to have prompts included in the sub tasks that Khoj runs in order to response to a given query. Make this possible in this PR. This also sets us up for agent creation becoming available soon.
Create custom agents in #928
Agents are useful insofar as you can personalize them to fulfill specific subtasks you need to accomplish. In this PR, we add support for using custom agents that can be configured with a custom system prompt (aka persona) and knowledge base (from your own indexed documents). Once created, private agents can be accessible only to the creator, and protected agents can be accessible via a direct link.
Custom tool selection for agents in #930
Expose the functionality to select which tools a given agent has access to. By default, they have all. Can limit both information sources and output modes.
Add new tools to the agent modification form
## Overview
Add user country code as context for doing online search with serper.dev API.
This should find more user relevant results from online searches by Khoj
## Details
### Major
- Default to using system clock to infer user timezone on js clients
- Infer country from timezone when only timezone received by chat API
- Localize online search results to user country when location available
### Minor
- Add `__str__` func to `LocationData` class to deduplicate location string generation
Make all the scroll actions just use requestAnimationFrame instead of
setTimeout. It better aligns with browser rendering loop, so better
for UX changes than setTimeout
Using system clock to infer user timezone on clients makes Khoj
more robust to provide location aware responses.
Previously only ip based location was used to infer timezone via API.
This didn't provide any decent fallback when calls to ipapi failed or
Khoj was being run in offline mode
Timezone is easier to infer using clients system clock. This can be
used to infer user country name, country code, even if ip based
location cannot be inferred.
This makes using location data to contextualize Khoj's responses more
robust. For example, online search results are retrieved for user's
country, even if call to ipapi.co for ip based location fails
Get country code to server chat api from i.p location check on clients.
Use country code to get country specific online search results via Serper.dev API
Previously the location string from location data was being generated
wherever it was being used.
By adding a __str__ representation to LocationData class, we can
dedupe and simplify the code to get the location string
- Use tabs for GPU/CPU type khoj being install on
- Update CMAKE flags to use to install Khoj with correct GPU support
Previous flags used DLLAMA, this has been updated to use DGGML now
in llama.cpp
Remove unnecessary "Inferred Query" heading prefix to image generation prompt
used by Khoj. The inferred query in chat message has a heading of it's
own, so avoid two headings for the image prompt
The problem was the tool tip was visible on hover, but it was slow, so before the tool tip popped up, the user would click on the button and this stopped the tool tip from popping up.
So i reduced the popup delay to 10ms. now as soon as user hovers over the button, they will see that its a feature coming soon!