Commit graph

2138 commits

Author SHA1 Message Date
Debanjum Singh Solanky
305c25ae1a Track ancestor headings for each org-mode entry in org-node parser 2023-11-16 02:39:14 -08:00
Debanjum
208ddddc6a
Make Search Model Configurable on Server (#544)
- Make search model configurable on server
- Update migration script to get search model from `khoj.yml` to Postgres
- Update first run message on Khoj Desktop and Web app landing page
- Other miscellaneous bug fixes
2023-11-16 00:11:58 -08:00
Debanjum Singh Solanky
cc05013715 Update first run message on Web app with Chat models setup instructions
- Link to Django admin panel for user to create Chat Models on their
  Khoj server
- This should only get hit when user is not using Khoj cloud, as Khoj
  cloud would already have Chat models configured
2023-11-15 22:44:24 -08:00
Debanjum Singh Solanky
6c1693b8f4 Update first run message on Desktop app with API token setup instructions
- Open Web app settings in the default browser via link click
- Open Desktop app settings via link click
2023-11-15 22:44:11 -08:00
Debanjum Singh Solanky
922983bd53 Set max cos distance to 0.18. Test search API query with max distance 2023-11-15 20:26:21 -08:00
Debanjum Singh Solanky
18dbad5edb Use Sigmoid to normalize cross-encoder score between 0-1
- While sigmoid normalization isn't required for reranking.
  Normalizing score to distance metrics for both encoder and cross
  encoder scores is useful to reason about them
- Softmax wasn't required as don't need probabilities, sigmoid is good
  enough to get distance metric
2023-11-15 19:31:59 -08:00
sabaimran
0da4db4310
Merge pull request #547 from khoj-ai/features/fix-api-token-generator
Update the return type of the API token generator
2023-11-15 19:23:18 -08:00
sabaimran
ea144de438 Merge with master 2023-11-15 18:34:46 -08:00
sabaimran
6b17aeb32d Resolve merge conflicts in auth.py with remove KhojApiUser import 2023-11-15 17:32:53 -08:00
Debanjum Singh Solanky
348cc0cf0e Use better name for DB adapter func to create user by Google token 2023-11-15 17:31:50 -08:00
Debanjum Singh Solanky
08a057bdd5 Rename SearchModel to SearchModelConfig DB model, Require Cross-Encoder 2023-11-15 17:31:50 -08:00
Debanjum Singh Solanky
0679b2a7bd Use embeddings model store from state in text to entries
Do not need to instantiating it separately. In all other places we're
using the embeddings model store in global state anyway
2023-11-15 17:31:50 -08:00
sabaimran
f88a5867b4 Allow dockerize step to run for prod from PR temporarily 2023-11-15 17:31:50 -08:00
sabaimran
245a9cbf63 Fix return type of the update_or_create method 2023-11-15 17:31:50 -08:00
sabaimran
10be8dfad9 Rename dockerize dev action to be more accurate 2023-11-15 17:31:50 -08:00
sabaimran
70f5d0ed3c Add a dev workflow for GitHub actions, change the production workflow to only kick off when pushed to master 2023-11-15 17:31:50 -08:00
sabaimran
bbae7dd83c Update logic for creating a new user to use aupdate_or_create 2023-11-15 17:31:50 -08:00
sabaimran
154de8c629 Update format for return type of the generate token method 2023-11-15 17:31:12 -08:00
sabaimran
cf74fa4a70 Allow dockerize step to run for prod from PR temporarily 2023-11-15 17:04:48 -08:00
sabaimran
8e62af77b9 Update format for return type of the generate token mehtod 2023-11-15 17:03:01 -08:00
sabaimran
4a487aff23 Fix return type of the update_or_create method 2023-11-15 14:35:42 -08:00
sabaimran
992e54c218 Rename dockerize dev action to b emore accurate 2023-11-15 14:09:28 -08:00
sabaimran
99f5a6082e Add a dev workflow for GitHub actions, change the production workflow to only kick off when pushed to master 2023-11-15 14:07:25 -08:00
sabaimran
b63856ecb4 Update logic for creating a new user to use aupdate_or_create 2023-11-15 12:50:39 -08:00
sabaimran
b8e7488a95 Use a more permissive distance filter for search results from notes 2023-11-15 11:13:47 -08:00
sabaimran
d06b2cf24b Downgrade pyproject.toml to avert depedency conflict 2023-11-15 10:47:54 -08:00
sabaimran
05b7542115 Remove config lock from the state 2023-11-15 10:44:45 -08:00
sabaimran
ecd005cac0 Check if search model is already in DB before creating a new one 2023-11-15 10:41:35 -08:00
Debanjum Singh Solanky
9c6e7bdea2 Upgrade server, desktop app dependencies to resolve CVE bugs 2023-11-15 01:47:53 -08:00
Debanjum Singh Solanky
5a6ab9cc85 Fix failing client tests 2023-11-15 00:17:44 -08:00
Debanjum Singh Solanky
8f200cf53f Remove unused parameter from configure_search_type method 2023-11-14 19:09:35 -08:00
Debanjum Singh Solanky
f8e5e118e1 Only create KhojUser on login if doesn't already exist 2023-11-14 19:09:35 -08:00
Debanjum Singh Solanky
3d8d6145f2 Add search model config from khoj.yml to Postgres DB via migration script 2023-11-14 19:09:35 -08:00
Debanjum Singh Solanky
4af194d74b Make search model configurable on server
- Expose ability to modify search model via Django admin interface
- Previously the bi_encoder and cross_encoder models to use were set
  in code
- Now it's user configurable but with a default config generated by
  default
2023-11-14 19:09:35 -08:00
Debanjum
b734984d6d
Fix, Improve Khoj with multi-user, db support for Khoj Cloud Release (#539)
### Overview
Prepare Khoj with multi-user, db support for Khoj Cloud release

### Details
- Add first run experience to configure Khoj via khoj CLI 
- Improve Web app settings page: Move files data into content section card. Move content index update button(s) to content section
- Improve OpenAI chat prompts
  - Push more general information for OpenAI models into system prompt
  - Make it more aware of it's current capabilities
  - Weaken asking follow-up questions
- Rate-limit calls to the chat API
- Add back search results quality threshold
  - Normalize quality score definitions across cross_encoder, encoder to distance metric
- Remove reference to deprecated button
- Await result of the search query
- Fixed Langchain issue by allowing the Docker image to rebuild with a later package version
2023-11-14 16:55:34 -08:00
Debanjum Singh Solanky
e98141f4c3 Subscribe default user to standard plan with a far away renewal date
Self hosted users in anonymous mode have all capabilities unlocked
2023-11-14 16:31:39 -08:00
Debanjum Singh Solanky
9d30fda26d Deduplicate, improve name of prompt templates for GPT4All chat models
- Do not pass unused rerank_results parameter to text_search.query method
2023-11-14 16:31:09 -08:00
Debanjum Singh Solanky
795ec9eb55 Add KHOJ_prefix to server admin credentials environment variables 2023-11-14 16:13:13 -08:00
sabaimran
ee005de662 Rename django files URL to server instead of django 2023-11-14 12:36:38 -08:00
sabaimran
75e5a6b6de Remove all the example mounted volumes as they're no longer required in the new architecture 2023-11-14 12:31:24 -08:00
sabaimran
20ce3d0c78 Update default docker compose configuration with Khoj local mode 2023-11-14 12:21:26 -08:00
sabaimran
8c36079f74 Add a first run experience to intialize the admin user if none exists and setup chat models 2023-11-13 21:07:12 -08:00
Debanjum Singh Solanky
e9adb58c16 Rate limit calls to the /chat API per user, per day/minute 2023-11-13 19:41:46 -08:00
Debanjum Singh Solanky
33a8eb0470 Log when new user is created 2023-11-13 19:37:24 -08:00
sabaimran
603f838115 Block input text field when waiting for chat response 2023-11-11 17:14:37 -08:00
Asim Shrestha
0bfc094e18
Add test separators 2023-11-11 17:08:58 -08:00
Debanjum Singh Solanky
9c321ac070 Fix cross encoder to use softmax to convert it to a distance metric 2023-11-11 16:12:24 -08:00
sabaimran
8a824167cf Merge branch 'fix/imports-and-references' of github.com:khoj-ai/khoj into fix/imports-and-references 2023-11-11 12:59:31 -08:00
sabaimran
fa428932a8 Update URL for downloading the desktop application 2023-11-11 12:59:15 -08:00
Debanjum Singh Solanky
941c7f23a3 Only get text search results above confidence threshold via API
- During the migration, the confidence score stopped being used. It
  was being passed down from API to some point and went unused

- Remove score thresholding for images as image search confidence
  score different from text search model distance score

- Default score threshold of 0.15 is experimentally determined by
  manually looking at search results vs distance for a few queries

- Use distance instead of confidence as metric for search result quality
  Previously we'd moved text search to a distance metric from a
  confidence score.

  Now convert even cross encoder, image search scores to distance metric
  for consistent results sorting
2023-11-11 04:11:33 -08:00