Commit graph

3041 commits

Author SHA1 Message Date
Debanjum
c53c3db96b Track, return cost and usage metrics in chat api response
- Track input, output token usage and cost for interactions
  via chat api with openai, anthropic and google chat models

- Get usage metadata from OpenAI using stream_options
- Handle openai proxies that do not support passing usage in response

- Add new usage, end response  events returned by chat api.
  - This can be optionally consumed by clients at a later point
  - Update streaming clients to mark message as completed after new
    end response event, not after end llm response event
- Ensure usage data from final response generation step is included
  - Pass usage data after llm response complete. This allows gathering
    token usage and cost for the final response generation step across
    streaming and non-streaming modes
2024-11-20 12:17:58 -08:00
Debanjum
7bdc9590dd Fix handling sources, output in chat actor when is automated task
Remove unnecessary ```python prefix removal. It isn't being triggered
in json deserialize path.
2024-11-19 13:49:27 -08:00
Debanjum
0e7d611a80 Remove ```python codeblock prefix from raw json before deserialize 2024-11-19 12:53:52 -08:00
Debanjum
001c13ef43 Upgrade web app package dependencies 2024-11-19 12:53:52 -08:00
sabaimran
5134d49d71 Release Khoj version 1.30.1 2024-11-18 17:30:33 -08:00
sabaimran
8bdd0b26d3 And a connections clean up decorator to all scheduled tasks 2024-11-18 17:19:36 -08:00
Debanjum
817601872f Update default offline models enabled 2024-11-18 16:38:17 -08:00
Debanjum
653127bf1d Improve data source, output mode selection
- Set output mode to single string. Specify output schema in prompt
  - Both thesee should encourage model to select only 1 output mode
    instead of encouraging it in prompt too many times
  - Output schema should also improve schema following in general
- Standardize variable, func name of io selector for readability
- Fix chat actors to test the io selector chat actor
- Make chat actor return sources, output separately for better
  disambiguation, at least during tests, for now
2024-11-18 15:11:37 -08:00
Debanjum
e3fd51d14b Pass user arg to create title from query in new automation flow 2024-11-18 12:58:10 -08:00
Debanjum
9e74de9b4f Improve serializing conversation JSON to print messages on console
- Handle chatml message.content with non-json serializable data like
  WebP image binary data used by Gemini models
2024-11-18 12:57:05 -08:00
sabaimran
3f70d2f685 Add more graceful exception handling when tool selection doesn't work 2024-11-18 09:34:49 -08:00
sabaimran
f75085dc7a Release Khoj version 1.30.0 2024-11-17 21:36:22 -08:00
sabaimran
c72813ba67
Merge pull request #981 from rznzippy/bugfix/980/database-connections-leakage
Fix database connections leakage (#980)
2024-11-17 21:01:06 -08:00
sabaimran
7d50c6590d
Merge pull request #977 from khoj-ai/features/improve-tool-selection
- JSON extract from LLMs is pretty decent now, so get the input tools and output modes all in one go. It'll help the model think through the full cycle of what it wants to do to handle the request holistically.
- Make slight improvements to tool selection indicators
2024-11-17 20:08:19 -08:00
Debanjum
48567fd468 Do not erase partial message when generation stopped via button on web app
Previously, we'd replace the generated message with an error message
when message generation stopped via stop button on chat page of web app.
So the partially generated message (which could be useful) gets lost.

This change just stops generation, while keeping the generated
response so any useful information from the partially generated
message can be retrieved.
2024-11-17 16:29:18 -08:00
Debanjum
285006d6c9 Sync chat models in Khoj with OpenAI proxies (e.g Ollama) on startup
- Allows managing chat models in the OpenAI proxy service like Ollama.
- Removes need to manually add, remove chat models from Khoj Admin Panel
  for these OpenAI compatible API services when enabled.
- Khoj still mantains the chat models configs within Khoj, so they can
  be configured via the Khoj admin panel as usual.
2024-11-17 15:34:36 -08:00
Debanjum
d6eece63f4 Use Jina API Key of Jina web scraper if configured in DB
Previously Jina search didn't API key. Now that it does need API key,
we should re-use the API key set in the Jina web scraper config,
otherwise fallback to using JINA_API_KEY from environment variable, if
either is present.

Resolves #978
2024-11-17 15:34:14 -08:00
sabaimran
6531f24ca0 Further improvements for descriptions to LLM on modes, code, diagram, image. 2024-11-17 13:23:57 -08:00
sabaimran
0eba6ce315 When diagram generation fails, save to conversation log
- Update tool name when choosing tools to execute
2024-11-17 13:23:12 -08:00
sabaimran
7e662a05f8 Merge branch 'master' of github.com:khoj-ai/khoj into features/improve-tool-selection 2024-11-17 12:26:55 -08:00
Ilya Khrustalev
00b1af8f99 Fix database connections leakage (#980) 2024-11-17 19:15:05 +01:00
Debanjum
69ef6829c1 Simplify integrating Ollama, OpenAI proxies with Khoj on first run
- Integrate with Ollama or other openai compatible APIs by simply
  setting `OPENAI_API_BASE' environment variable in docker-compose etc.
- Update docs on integrating with Ollama, openai proxies on first run
- Auto populate all chat models supported by openai compatible APIs
- Auto set vision enabled for all commercial models

- Minor
  - Add huggingface cache to khoj_models volume. This is where chat
  models and (now) sentence transformer models are stored by default
  - Reduce verbosity of yarn install of web app. Otherwise hit docker
  log size limit & stops showing remaining logs after web app install
  - Suggest `ollama pull <model_name>` to start it in background
2024-11-17 02:08:20 -08:00
Debanjum
2366fa08b9 Update default vision supported & anthropic chat models on first run
- Update to latest initialize with new claude 3.5 sonnet and haiku models
- Update to set vision enabled for google and anthropic models by
  default. Previously we didn't support but we've supported this for a
  month or two now
2024-11-17 02:08:20 -08:00
Debanjum
23ab258d78 Improve user conversation config details on Admin panel
Show user email and chat model that is associated with the user
conversation config
2024-11-17 02:08:20 -08:00
Debanjum
fc45aceecf Delete unused favicon ico in old web app directory 2024-11-17 02:08:20 -08:00
Debanjum
a16fc3ade8 Only add /research prefix when no slash command in message on web app
- Explictly adding a slash command is a higher priority intent than
  research mode being enabled in the background. Respect that for a
  more intuitive UX flow.

- Explicit slash commands do not currently work in research mode.
  You've to turn research mode off to use other slash commands. This
  is strange, unnecessary given intent priority is clear.
2024-11-17 02:08:20 -08:00
sabaimran
a1b4587b34 Remove extract_images flag from PDF loader 2024-11-15 21:46:35 -08:00
sabaimran
e3f1ea9dee Improve tool, output mode selection process
- JSON extract from LLMs is pretty decent now, so get the input tools and output modes all in one go. It'll help the model think through the full cycle of what it wants to do to handle the request holistically.
- Make slight improvements to tool selection indicators
2024-11-15 13:53:53 -08:00
sabaimran
c1a5b32ebf Do not start server when importing the main.py file, unless gunicorn
- Add more graceful shutdown when closing bg scheduler thread
2024-11-14 17:36:51 -08:00
sabaimran
be3ee5ec9f Add cool new suggestion cards for math, diagramming 2024-11-14 17:36:51 -08:00
Debanjum
8e009f48ce Show tool call error in next iteration. Allow rerun if model requests.
Previously errors would get eaten up but the model wouldn't see
anything. And the model wouldn't be allowed re-run the same query-tool
combination in the next iteration.

This update should give it insight into why it didn't get a result. So
it can make an informed (hopefully better) decision on what to do next.
And re-run the previous query if appropriate.
2024-11-13 22:50:14 -08:00
Debanjum
604da90fa8 Wrap try/catch around online search in research mode like other tools
Previously when call to online search API etc. failed, it'd error out
of response to query in research mode. Khoj should skip tool use that
iteration but continue to try respond.
2024-11-13 16:46:09 -08:00
Debanjum
8851b5f78a Standardize chat message truncation and serialization before print
Previously chatml messages were just strings, now they can be list of
strings or list of dicts as well.

- Use json seriallization to manage their variations and truncate them
  before printing for context.
- Put logic in single function for use across chat models
2024-11-13 16:30:17 -08:00
Debanjum
15b0cfa3dd Improve structured message truncation in logger
Previously chatml messages were just strings.
Since gemini, anthropic models always have messages as list of
strings, truncate those strings instead of the list of message content
2024-11-13 14:32:22 -08:00
Debanjum
153ae8bea9 Cut binary, long output files from code result for context efficiency
Removing binary data and truncating large data in output files
generated by code runs should improve speed and cost of research mode
runs with large or binary output files.

Previously binary data in code results was passed around in iteration
context during research mode. This made the context inefficient
because models have limited efficiency and reasoning capabilities over
b64 encoded image (and other binary) data and would hit context limits
leading to unnecessary truncation of other useful context

Also remove image data when logging output of code execution
2024-11-13 14:32:22 -08:00
sabaimran
4a1b1e8b9a Add support for interrupting messages after they've been sent. 2024-11-12 22:22:45 -08:00
sabaimran
d607ad7a27 Release Khoj version 1.29.1 2024-11-12 10:32:56 -08:00
sabaimran
8ec1764e42 Handle size calculation more gracefully for converted documents, depending on type 2024-11-12 02:00:29 -08:00
sabaimran
b6714c202f Increase the title character limit to 500 for conversations 2024-11-12 01:51:19 -08:00
sabaimran
f05e64cf8c Release Khoj version 1.29.0 2024-11-11 21:46:25 -08:00
sabaimran
47d3c8c235 Remove email query parameter from subscription patch api 2024-11-11 21:39:49 -08:00
sabaimran
d7027109a5 And null handling for response output_files in code output 2024-11-11 21:14:56 -08:00
sabaimran
d68243a3fb Revert clean_json logic temporarily. Eventually, we should do better validation here to extract markdown-formatted json. 2024-11-11 21:05:17 -08:00
sabaimran
1cab6c081f Add better error handling for diagram output, and fix chat history construct
- Make the `clean_json` method more robust as well
2024-11-11 20:44:19 -08:00
sabaimran
7bd2f83f97 Wrap test in suggestionCard 2024-11-11 20:12:46 -08:00
Debanjum
48862a8400
Enable Passing External Documents for Analysis in Code Sandbox (#960)
- Allow passing user files as input into code sandbox for analysis
- Update prompt to give more example of complex, multi-line code
- Simplify logic for model. Run one program at a time, 
  instead of allowing model to run multiple programs in parallel
- Show Code generated charts and docs in Reference pane of web app and make them downloaded
2024-11-11 19:37:17 -08:00
Debanjum
5078ac0ce2 Await on conversation save when generate conversation title via API 2024-11-11 19:17:39 -08:00
Debanjum
e1d0015248 Allow disabling Khoj telemetry via KHOJ_TELEMETRY_DISABLE env var 2024-11-11 19:17:39 -08:00
Debanjum
a52500d289 Show generated code artifacts before notes and online references 2024-11-11 18:00:22 -08:00
Debanjum
218eed83cd Show output file not code on hover. Remove reference card title border 2024-11-11 18:00:22 -08:00