Commit graph

3086 commits

Author SHA1 Message Date
Debanjum
47c926b0ff Add more typing to org|md_to_entries. Remove redundant f-string wraps
- Add type hints to improve maintainability of stabilzed indexing code
- It shouldn't be necessary to wrap string variables in an f-string

This change aims to improve code quality. It should not affect
functionality.
2024-12-01 23:02:52 -08:00
Debanjum
dffdd81345 Do not wrap filepath in Path to fix indexing markdown files on Windows
Issue
- Path with / are converted to \\ on Windows using the Path operator.
- The markdown to entries method for some reason was doing this.
  This would store the file paths in DB entry differently than the file
  to entries map. Resulting in a KeyError when trying to look up the
  entry file path from file_to_text_map in the
  text_to_entries:update_embeddings() function.

Fix
- Removing the unnecessary OS dependendent Path normalization in
  markdown_to_entries should keep the file path storage consistent
  across file_to_text_map var, FileObjectAdaptor, Entry DB tables on
  Windows for Markdown files as well

This issue would only affect users hosting Khoj server on Windows and
attempting to index markdown files.

Resolves #984
2024-12-01 23:00:31 -08:00
Debanjum
9e0a2c7a98 Restrict generated chat title to 200 chars limit allowed for chat slug
Some checks are pending
dockerize / Publish Khoj Docker Images (push) Waiting to run
build and deploy github pages for documentation / deploy (push) Waiting to run
pre-commit / Setup Application and Lint (push) Waiting to run
pypi / Publish Python Package to PyPI (push) Waiting to run
test / Run Tests (push) Waiting to run
2024-11-30 19:12:03 -08:00
Debanjum
8b8e2be82d Only create subscription object when it does not exist for user
This avoid unnecessarily throwing an internal server error when the
user tries to sign-up using multiple mechanisms (e.g first by email, then
by google oauth)
2024-11-30 19:08:34 -08:00
sabaimran
439b18c21f Release Khoj version 1.30.10
Some checks failed
build khoj.el / build (push) Has been cancelled
desktop / 🖥️ Build, Release Desktop App (push) Has been cancelled
dockerize / Publish Khoj Docker Images (push) Has been cancelled
build and deploy github pages for documentation / deploy (push) Has been cancelled
pypi / Publish Python Package to PyPI (push) Has been cancelled
test khoj.el / test (27.1) (push) Has been cancelled
test khoj.el / test (27.2) (push) Has been cancelled
test khoj.el / test (28.1) (push) Has been cancelled
test khoj.el / test (28.2) (push) Has been cancelled
test khoj.el / test (snapshot) (push) Has been cancelled
2024-11-28 19:43:06 -08:00
sabaimran
40d8a7a581 Release Khoj version 1.30.9 2024-11-28 18:45:50 -08:00
Debanjum
a552543f4f Use json5 to parse llm generated questions to query docs and web
Some checks failed
build khoj.el / build (push) Waiting to run
desktop / 🖥️ Build, Release Desktop App (push) Waiting to run
dockerize / Publish Khoj Docker Images (push) Waiting to run
build and deploy github pages for documentation / deploy (push) Waiting to run
pypi / Publish Python Package to PyPI (push) Waiting to run
test khoj.el / test (27.1) (push) Waiting to run
test khoj.el / test (27.2) (push) Waiting to run
test khoj.el / test (28.1) (push) Waiting to run
test khoj.el / test (28.2) (push) Waiting to run
test khoj.el / test (snapshot) (push) Waiting to run
test / Run Tests (push) Has been cancelled
pre-commit / Setup Application and Lint (push) Has been cancelled
json5 is more forgiving, handles double quotes, newlines in raw json
string
2024-11-28 14:35:34 -08:00
Debanjum
0a69af4f61 Update to latest ToDesktop runtime 2024-11-28 13:56:14 -08:00
Debanjum
1d0fe141dc Release Khoj version 1.30.8 2024-11-28 13:37:30 -08:00
Debanjum
8c120a5139 Fallback to json5 loader if json.loads cannot parse complex json str
JSON5 spec is more flexible, try to load using a fast json5 parser if
the stricter json.loads from the standard library can't load the
raw complex json string into a python dictionary/list
2024-11-26 21:17:00 -08:00
Debanjum
70b7e7c73a Improve load of complex json objects. Use it to pick tool, run code
Gemini doesn't work well when trying to output json objects. Using it
to output raw json strings with complex, multi-line structures
requires more intense clean-up of raw json string for parsing
2024-11-26 17:37:57 -08:00
Debanjum
29315f44e7 Add assetlinks.json to link android app to app.khoj.dev domain
Add sha cert of android upload, signing keys to open debug, prod apps
as TWA in fullscreen on android phones
2024-11-26 01:57:54 -08:00
Debanjum
a97a45bf20 Align agent personality with recently updated khoj personality
See update to Khoj personality in commit
6eb59464da
2024-11-26 00:06:16 -08:00
Debanjum
5723a3778e
Speed up Docker image builds using multi-stage parallel pipelines (#987)
## Objective
Improve build speed and size of khoj docker images

## Changes
### Improve docker image build speeds
  - Decouple web app and server build steps
  - Build the web app and server in parallel
  - Cache docker layers for reuse across dockerize github workflow runs
    - Split Docker build layers for improved cacheability (e.g separate `yarn install` and `yarn build` steps)
### Reduce size of khoj docker images 
  - Use an up-to-date `.dockerignore` to exclude unnecessary directories
  - Do not installing cuda python packages for cpu builds
### Improve web app builds
  - Use consistent mechanism to get fonts for web app
  - Make tailwind extensions production instead of dev dependencies
  - Make next.js create production builds for the web app (via `NODE_ENV=production` env var)
2024-11-24 21:49:46 -08:00
Debanjum
6a39651ad3 Standardize loading fonts locally across pages on web app 2024-11-24 20:41:15 -08:00
sabaimran
6eb59464da Add additional reinforcement to coax gemini into giving a minimum helpful response 2024-11-24 14:53:53 -08:00
sabaimran
15f062b34a Remove print statement for agent style map 2024-11-24 14:53:53 -08:00
sabaimran
d7e68a2d1b Wait for iplcodata to load before first message
- Fix the console khoj ai ascii art
- Remove some not so good suggested prompt
2024-11-24 14:53:53 -08:00
Debanjum
710e00ad9e Make tailwind extensions prod, instead of dev, deps of web app 2024-11-24 13:59:40 -08:00
Debanjum
7c77d65d35 Improve logic to disable telemetry via KHOJ_TELEMETRY_DISABLE env var
Some checks failed
dockerize / Publish Khoj Docker Images (push) Waiting to run
build and deploy github pages for documentation / deploy (push) Waiting to run
pre-commit / Setup Application and Lint (push) Waiting to run
pypi / Publish Python Package to PyPI (push) Waiting to run
test / Run Tests (push) Waiting to run
build khoj.el / build (push) Has been cancelled
desktop / 🖥️ Build, Release Desktop App (push) Has been cancelled
test khoj.el / test (27.1) (push) Has been cancelled
test khoj.el / test (27.2) (push) Has been cancelled
test khoj.el / test (28.1) (push) Has been cancelled
test khoj.el / test (28.2) (push) Has been cancelled
test khoj.el / test (snapshot) (push) Has been cancelled
The newly added KHOJ_TELEMETRY_DISABLE env var knob to disable
telemetry should override old config mechanism when set
2024-11-24 00:54:16 -08:00
sabaimran
2d683898c2 Release Khoj version 1.30.7 2024-11-23 22:51:10 -08:00
sabaimran
914ff994f7 Fix cost addition to chat_metadata
Some checks are pending
build khoj.el / build (push) Waiting to run
desktop / 🖥️ Build, Release Desktop App (push) Waiting to run
dockerize / Publish Khoj Docker Images (push) Waiting to run
build and deploy github pages for documentation / deploy (push) Waiting to run
pre-commit / Setup Application and Lint (push) Waiting to run
pypi / Publish Python Package to PyPI (push) Waiting to run
test / Run Tests (push) Waiting to run
test khoj.el / test (27.1) (push) Waiting to run
test khoj.el / test (27.2) (push) Waiting to run
test khoj.el / test (28.1) (push) Waiting to run
test khoj.el / test (28.2) (push) Waiting to run
test khoj.el / test (snapshot) (push) Waiting to run
2024-11-23 22:50:45 -08:00
Debanjum
caaa127dcf Release Khoj version 1.30.6 2024-11-23 21:07:00 -08:00
Debanjum
8f966b11ec Release Khoj version 1.30.5 2024-11-23 20:49:05 -08:00
Debanjum
e5b211a743 Release Khoj version 1.30.4 2024-11-23 19:48:21 -08:00
Debanjum
c4ef31d86f Release Khoj version 1.30.3
Some checks are pending
build khoj.el / build (push) Waiting to run
desktop / 🖥️ Build, Release Desktop App (push) Waiting to run
dockerize / Publish Khoj Docker Images (push) Waiting to run
build and deploy github pages for documentation / deploy (push) Waiting to run
pypi / Publish Python Package to PyPI (push) Waiting to run
test khoj.el / test (27.1) (push) Waiting to run
test khoj.el / test (27.2) (push) Waiting to run
test khoj.el / test (28.1) (push) Waiting to run
test khoj.el / test (28.2) (push) Waiting to run
test khoj.el / test (snapshot) (push) Waiting to run
2024-11-23 14:40:06 -08:00
sabaimran
4ac49ca90f Release Khoj version 1.30.2 2024-11-23 12:00:28 -08:00
sabaimran
7f5bf35806 Disambiguate renewal_date type. Previously, being used as None, False, and Datetime in different places.
Some checks failed
dockerize / Publish Khoj Docker Images (push) Waiting to run
build and deploy github pages for documentation / deploy (push) Waiting to run
pypi / Publish Python Package to PyPI (push) Waiting to run
pre-commit / Setup Application and Lint (push) Has been cancelled
test / Run Tests (push) Has been cancelled
2024-11-22 12:06:20 -08:00
sabaimran
5e8c824ecc Improve the experience for finding past conversation
- add a conversation title search filter, and an agents filter, for finding conversations
- in the chat session api, return relevant agent style data
2024-11-22 12:03:01 -08:00
sabaimran
a761865724 Fix handling of customer.subscription.updated event to process new renewal end date 2024-11-22 12:03:01 -08:00
sabaimran
6a054d884b Add quicker/easier filtering on auth 2024-11-22 12:03:01 -08:00
Debanjum
b9a889ab69 Fix Khoj responses when code generated charts in response context
The current fix should improve Khoj responses when charts in response
context. It truncates code context before sharing with response chat actors.

Previously Khoj would respond with it not being able to create chart
but than have a generated chart in it's response in default mode.

The truncate code context was added to research chat actor for
decision making but it wasn't added to conversation response
generation chat actors.

When khoj generated charts with code for its response, the images in
the context would exceed context window limits.

So the truncation logic to drop all past context, including chat
history, context gathered for current response.

This would result in chat response generator 'forgetting' all for the
current response when code generated images, charts in response context.
2024-11-21 14:43:52 -08:00
Debanjum
5475a262d4 Move truncate code context func for reusability across modules
It needs to be used across routers and processors. It being in
run_code tool makes it hard to be used in other chat provider contexts
due to circular dependency issues created by
send_message_to_model_wrapper func
2024-11-21 14:27:39 -08:00
Debanjum
f434c3fab2 Fix toggling prompt tracer on/off in Khoj via PROMPTRACE_DIR env var
Previous changes to depend on just the PROMPTRACE_DIR env var instead
of KHOJ_DEBUG or verbosity flag was partial/incomplete.

This fix adds all the changes required to only depend on the
PROMPTRACE_DIR env var to enable/disable prompt tracing in Khoj.
2024-11-21 14:06:00 -08:00
Debanjum
1f96c13f72 Enable starting khoj uvicorn server with ssl cert file, key for https
Pass your domain cert files via the --sslcert, --sslkey cli args.
For example, to start khoj at https://example.com, you'd run command:

KHOJ_DOMAIN=example.com khoj --sslcert example.com.crt --sslkey
example.com.key --host example.com

This sets up ssl certs directly with khoj without requiring a
reverse proxy like nginx to serve khoj behind https endpoint for
simple setups. More complex setups should, of course, still use a
reverse proxy for efficient request processing
2024-11-21 11:07:18 -08:00
sabaimran
9fea02f20f In telemetry, differentiate create_user google and email 2024-11-21 11:01:37 -08:00
sabaimran
9db885b5f7 Limit access to chat models to futurist users 2024-11-21 07:53:24 -08:00
sabaimran
3519dd76f0 Fix type of excalidraw image response 2024-11-20 19:01:13 -08:00
sabaimran
467de76fc1 Improve the image diagramming prompts and response parsing 2024-11-20 18:59:40 -08:00
Debanjum
2203236e4c Update desktop app dependencies 2024-11-20 13:05:55 -08:00
Debanjum
6f1adcfe67
Track Usage Metrics in Chat API. Track Running Cost, Accuracy in Evals (#985)
- Track, return cost and usage metrics in chat api response
  Track input, output token usage and cost of interactions with 
  openai, anthropic and google chat models for each call to the khoj chat api
- Collect, display and store costs & accuracy of eval run currently in progress
  This provides more insight into eval runs during execution 
  instead of having to wait until the eval run completes.
2024-11-20 12:59:44 -08:00
Debanjum
bbd24f1e98 Improve dropdown menus on web app setting page with scroll & min-width
- Previously when settings list became long the dropdown height would
  overflow screen height. Now it's max height is clamped and  y-scroll
- Previously the dropdown content would take width of content. This
  would mean the menu could sometimes be less wide than the button. It
  felt strange. Now dropdown content is at least width of parent button
2024-11-20 12:27:13 -08:00
Debanjum
c53c3db96b Track, return cost and usage metrics in chat api response
- Track input, output token usage and cost for interactions
  via chat api with openai, anthropic and google chat models

- Get usage metadata from OpenAI using stream_options
- Handle openai proxies that do not support passing usage in response

- Add new usage, end response  events returned by chat api.
  - This can be optionally consumed by clients at a later point
  - Update streaming clients to mark message as completed after new
    end response event, not after end llm response event
- Ensure usage data from final response generation step is included
  - Pass usage data after llm response complete. This allows gathering
    token usage and cost for the final response generation step across
    streaming and non-streaming modes
2024-11-20 12:17:58 -08:00
Debanjum
80df3bb8c4 Enable prompt tracing only when PROMPTRACE_DIR env var set
Better to decouple prompt tracing from debug mode or verbosity level
and require explicit, independent config to enable prompt tracing
2024-11-20 11:54:02 -08:00
Debanjum
9ab76ccaf1 Skip adding agent to chat metadata when chat unset to avoids null ref 2024-11-19 21:10:23 -08:00
Debanjum
4da0499cd7 Stream responses by openai's o1 model series, as api now supports it
Previously o1 models did not support streaming responses via API. Now
they seem to do
2024-11-19 21:10:23 -08:00
Debanjum
7bdc9590dd Fix handling sources, output in chat actor when is automated task
Remove unnecessary ```python prefix removal. It isn't being triggered
in json deserialize path.
2024-11-19 13:49:27 -08:00
Debanjum
0e7d611a80 Remove ```python codeblock prefix from raw json before deserialize 2024-11-19 12:53:52 -08:00
Debanjum
001c13ef43 Upgrade web app package dependencies 2024-11-19 12:53:52 -08:00
sabaimran
5134d49d71 Release Khoj version 1.30.1 2024-11-18 17:30:33 -08:00