## Improve
- Intelligently initialize a decent default set of chat model options
- Create non-interactive mode. Auto set default server configuration on first run via Docker
## Fix
- Make RapidOCR dependency optional, as its flaky requirements were causing Docker build failures
- Set default OpenAI text to image model correctly during initialization
## Details
Improve initialization flow during first run to remove need to configure Khoj:
- Set Google, Anthropic Chat models too
Previously only Offline, OpenAI chat models could be set during init
- Add multiple chat models for each LLM provider
Interactively set a comma separated list of models for each provider
- Auto add default chat models for each provider in non-interactive
mode if the `{OPENAI,GEMINI,ANTHROPIC}_API_KEY` env var is set
- Used when the server is run via Docker, as user input cannot be processed to configure the server during first run
- Do not ask for `max_tokens`, `tokenizer` for offline models during
initialization. Use better defaults inferred in code instead
- Explicitly set default chat model to use
If unset, it implicitly defaults to using the first chat model.
Make it explicit to reduce this confusion
Resolves #882
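
A minimal sketch of the non-interactive selection described above, not Khoj's actual initialization code. The provider table, the placeholder model names and both function names are assumptions made for illustration; only the `{OPENAI,GEMINI,ANTHROPIC}_API_KEY` variable names come from the notes.

```python
import os

# Hypothetical provider table: env var to check and placeholder model names.
# The real default model names shipped by Khoj are not listed in these notes.
PROVIDER_DEFAULTS = {
    "openai": ("OPENAI_API_KEY", ["openai-model-a", "openai-model-b"]),
    "google": ("GEMINI_API_KEY", ["gemini-model-a"]),
    "anthropic": ("ANTHROPIC_API_KEY", ["anthropic-model-a"]),
}


def pick_default_chat_models(env=None):
    """Return default chat models for every provider whose API key is set."""
    env = os.environ if env is None else env
    selected = {}
    for provider, (key_var, models) in PROVIDER_DEFAULTS.items():
        # An empty or missing key means the provider is skipped.
        if env.get(key_var):
            selected[provider] = list(models)
    return selected


def pick_default_chat_model(selected):
    """Explicitly choose the first configured chat model as the default."""
    for models in selected.values():
        if models:
            return models[0]
    return None


chosen = pick_default_chat_models({"OPENAI_API_KEY": "sk-test", "GEMINI_API_KEY": ""})
default_model = pick_default_chat_model(chosen)
```

Passing the environment as a plain dict keeps the selection logic testable without touching the real process environment.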
The chat model initialize interaction flow is fairly similar across
the chat model providers.
This should simplify adding new chat model providers and reduce
chances of bugs in the interactive chat model initialization flow.
This is an initial pass to add documentation for all the knobs
available on the Khoj Admin panel.
It should shed some light onto what each admin setting is for and how
they can be customized when self hosting.
Resolves #831
- Improve Self Hosting Docker Instructions
- Ask to Install Docker Desktop to not require separate
docker-compose install and unify the instruction across OS
- To Self Host on Windows, ask to use Docker Desktop with WSL2 backend
- Use nested Tab grouping to split Docker vs Pip Self Host Instructions
- Reduce Self Host Setup Steps in Documentation after code simplification
- First run now avoids need to configure Khoj via admin panel
- So move the chat model config steps into optional post setup
config section
- Improve Instructions to Configure chat models on First Run
- Compress configuring chat model providers into a Tab Group
- Add Documentation for Remote Access under Advanced Self Hosting
Given the LLM landscape is rapidly changing, providing a good default
set of options should help reduce decision fatigue to get started
Improve initialization flow during first run
- Set Google, Anthropic Chat models too
Previously only Offline, OpenAI chat models could be set during init
- Add multiple chat models for each LLM provider
Interactively set a comma separated list of models for each provider
- Auto add default chat models for each provider in non-interactive
mode if the {OPENAI,GEMINI,ANTHROPIC}_API_KEY env var is set
- Do not ask for max_tokens, tokenizer for offline models during
initialization. Use better defaults inferred in code instead
- Explicitly set default chat model to use
If unset, it implicitly defaults to using the first chat model.
Make it explicit to reduce this confusion
Resolves #882
This should configure Khoj with decent default configurations via
Docker and avoid needing to configure Khoj via admin page to start
using dockerized Khoj
Update default max prompt size set during Khoj initialization,
as online chat models are cheaper and offline chat models have larger
context windows now
RapidOCR depends on OpenCV, which by default requires a bunch of GUI
packages. This system package dependency set (like libgl1) is flaky
Making the RapidOCR dependency optional should allow khoj to be more
resilient to setup/dependency failures
Trade-off is that OCR for documents may not always be available and
it'll require looking at server logs to find out when this happens
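
The trade-off above can be sketched as a guarded import that logs a warning and lets the server continue without OCR. This is an illustration under assumptions, not the actual Khoj code: the helper name is hypothetical, and `rapidocr_onnxruntime` is assumed to be the module name of the installed RapidOCR package.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def load_optional_ocr(module_name="rapidocr_onnxruntime"):
    """Return the OCR module if it can be imported, else None."""
    try:
        return importlib.import_module(module_name)
    except (ImportError, OSError) as e:
        # Missing Python package or missing system library (e.g. libgl1).
        logger.warning("OCR disabled, could not load %s: %s", module_name, e)
        return None


ocr_module = load_optional_ocr()
if ocr_module is None:
    # Callers should skip OCR for documents instead of failing.
    pass
```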
This reverts commit c9665fb20b.
Revert "Fix handling for new conversation in agents page"
This reverts commit 3466f04992.
Revert "Add a unique_id field for identifiying conversations (#914)"
This reverts commit ece2ec2d90.
- This allows triggering khoj chat from the browser addressbar
- So now if you add Khoj to your browser bookmark with
- URL: https://app.khoj.dev/?q=%s
- Keyword: khoj
- Then you can type "khoj what is the news today" to trigger Khoj to
quickly respond to your query. This avoids having to open the Khoj web
app before asking your question
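
As a rough illustration of what the browser does with the `%s` placeholder, the sketch below builds the same kind of URL by hand. The helper name is hypothetical, and the exact escaping a browser applies to the query may differ from `urllib.parse.quote`.

```python
from urllib.parse import quote


def khoj_query_url(query, base="https://app.khoj.dev/"):
    """Build a Khoj URL that carries the query in the `q` parameter."""
    return f"{base}?q={quote(query)}"


url = khoj_query_url("what is the news today")
```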
* Add a unique_id field to the conversation object
- This helps us keep track of the unique identity of the conversation without exposing the internal id
- Create three staged migrations in order to first add the field, then add unique values to pre-fill, and then set the unique constraint. Without this, it tries to initialize all the existing conversations with the same ID.
* Parse and utilize the unique_id field in the query parameters of the front-end view
- Handle the unique_id field when creating a new conversation from the home page
- Parse the id field with a lightweight parameter called v in the chat page
- Share page should not be affected, as it uses the public slug
* Fix suggested card category
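
The three-stage approach can be sketched with the standard library's `sqlite3` instead of the project's real migrations. The table layout and function name here are assumptions for illustration; the point is only the ordering: add the column, backfill distinct values, then enforce uniqueness.

```python
import sqlite3
import uuid


def add_unique_id_in_stages(conn, table="conversation"):
    # Stage 1: add the column without any uniqueness constraint.
    conn.execute(f"ALTER TABLE {table} ADD COLUMN unique_id TEXT")
    # Stage 2: backfill every existing row with its own freshly generated value.
    row_ids = [row[0] for row in conn.execute(f"SELECT id FROM {table}")]
    for row_id in row_ids:
        conn.execute(
            f"UPDATE {table} SET unique_id = ? WHERE id = ?",
            (str(uuid.uuid4()), row_id),
        )
    # Stage 3: only now enforce uniqueness, since all rows hold distinct values.
    conn.execute(f"CREATE UNIQUE INDEX {table}_unique_id_idx ON {table} (unique_id)")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE conversation (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO conversation (title) VALUES (?)", [("a",), ("b",), ("c",)])
add_unique_id_in_stages(conn)
unique_ids = [row[0] for row in conn.execute("SELECT unique_id FROM conversation")]
```

Doing all three steps at once would require a single default value for existing rows, which is what causes every existing conversation to receive the same ID.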
Previously Khoj would stop in the middle of response generation when
the safety filters got triggered at default thresholds. This was
confusing as it felt like a service error, not expected behavior.
Going forward Khoj will
- Only block responding to high confidence harmful content detected by
Gemini's safety filters instead of using the default safety settings
- Show an explanatory, conversational response (w/ harm category)
when response is terminated due to Gemini's safety filters
- Support using image generation models like Flux via Replicate
- Modularize the image generation code
- Make the "generate better image prompt" chat actor add composition details
- Generate vivid images with DALLE-3
Enables using any image generation model on Replicate's Predictions
API endpoints.
The server admin just needs to add text-to-image model on the
server/admin panel in organization/model_name format and input their
Replicate API key with it
Create db migration (including merge)
Set sender email using the `RESEND_EMAIL` environment variable for the magic link sent via the Resend API for authentication. It was previously hard-coded, which prevented hosting Khoj on other domains.
Resolves #908
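
A minimal sketch of reading the sender address from the environment rather than hard-coding it. Only the `RESEND_EMAIL` variable name comes from the notes; the helper name and the fallback address are hypothetical.

```python
import os


def get_sender_email(env=None, fallback="noreply@example.com"):
    """Read the sender address from RESEND_EMAIL, else use a fallback."""
    env = os.environ if env is None else env
    return env.get("RESEND_EMAIL", fallback)


sender = get_sender_email({"RESEND_EMAIL": "login@my-khoj.example"})
```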
- Major
- The new O1 series doesn't seem to support streaming, response_format enforcement,
stop words or temperature currently.
- Remove any markdown json codeblock in chat actors expecting json responses
- Minor
- Override block display styling of links by Katex in chat messages
Strip any json md codeblock wrapper if exists before processing
response by output mode, extract questions chat actor. This is similar
to what is already being done by other chat actors
Useful for successfully interpreting json output in chat actors when
using non (json) schema enforceable models like o1 and gemma-2
Use conversation helper function to centralize the json md codeblock
removal code
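
A sketch of what such a centralized helper could look like. The function name is an assumption and this is not the project's actual implementation; the fence marker is built from single backticks only so the example itself stays readable inside Markdown.

```python
import json

FENCE = "`" * 3  # the three-backtick Markdown code fence marker


def remove_json_codeblock(response: str) -> str:
    """Strip a Markdown json code fence wrapper if one surrounds the response."""
    text = response.strip()
    if text.startswith(FENCE) and text.endswith(FENCE):
        inner = text[len(FENCE):-len(FENCE)]
        # Drop the optional language tag that follows the opening fence.
        if inner.startswith("json"):
            inner = inner[len("json"):]
        return inner.strip()
    return text


wrapped = FENCE + 'json\n{"queries": ["news today"]}\n' + FENCE
parsed = json.loads(remove_json_codeblock(wrapped))
```

Responses that arrive without a wrapper pass through unchanged, so the helper is safe to call for schema-enforcing models too.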
This sometimes happens when the LLM response contains [\[1\]] style links
as references. Both markdown-it and Katex apply styling.
Katex's span uses display: block, which makes the rendering of these
references take up a whole line by themselves.
Override block styling of spans within an `a` element to prevent such
chat message styling issues
* Add functions to chat with Google's gemini model series
* Gracefully close thread when there's an exception in the gemini llm thread
* Use enums for verifying the chat model option type
* Add a migration to add the gemini chat model type to the db model
* Fix chat model selection verification and math prompt tuning
* Fix extract questions method with gemini. Enforce json response in extract questions.
* Add standard stop sequence for Gemini chat response generation
---------
Co-authored-by: sabaimran <narmiabas@gmail.com>
Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>
Additional logging was enabled to debug automation failures in
production since migrating the chat API to use the POST request method
(from the earlier GET).
The redirect from http to https defaulted to using the GET method instead
of POST to call /api/chat on redirect. This has been resolved now