Commit graph

4037 commits

Author SHA1 Message Date
sabaimran
e9e49ea098
Allow custom inference endpoint for the crossencoder model (#616)
* Add support for custom inference endpoints for the cross encoder model
- Since there is no good out-of-the-box solution, I've deployed a custom model/handler via Hugging Face to support this use case.
* Use langchain.community for pdf, openai chat modules
* Explicitly require that the API endpoint for cross-encoder inference be a Hugging Face endpoint for now
2024-01-18 10:02:12 +05:30
Debanjum Singh Solanky
08012c71b1 Update Dockerfile with swig system package required by PyMuPDF 2024-01-17 19:24:27 +05:30
Debanjum Singh Solanky
870af19ba4 Update health API to pass authenticated users their info
This allows Khoj clients to get email address associated with
user's API token for display in client UX

In anonymous mode, default user information is passed
2024-01-17 13:38:57 +05:30
Debanjum
4d30f7d1d9
Short-circuit API rate limiter for unauthenticated users (#607)
### Major
- Short-circuit API rate limiter for unauthenticated user
  Calls by unauthenticated users were failing at API rate limiter as it
  failed to access user info object. This is a bug.
  
  API rate limiter should short-circuit for unauthenticated users so a
  proper Forbidden response can be returned by API
  
  Add regression test to verify that unauthenticated users get 403
  response when calling the /chat API endpoint
  
### Minor
- Remove trailing slash to normalize khoj url in obsidian plugin settings
- Move used /api/config API controllers into separate module
- Delete unused /api/beta API endpoint
- Fix error message rendering in khoj.el, khoj obsidian chat
- Handle deprecation warnings for subscribe renew date, langchain, pydantic & logger.warn
2024-01-17 00:59:52 +05:30
Debanjum Singh Solanky
d26a4ffcea Only run the OpenAI chat client, /online test when API keys are set 2024-01-17 00:36:03 +05:30
Debanjum Singh Solanky
2752e0d607 Update jinja2 and axios min supported package versions 2024-01-16 18:45:38 +05:30
Debanjum Singh Solanky
7039c202c8 Merge branch 'master' into short-circuit-api-rate-limiter 2024-01-16 18:18:34 +05:30
Debanjum Singh Solanky
8917228dbb Remove unused, deprecated /api/config/data API endpoints
- Use /api/health for server up check instead of api/config/default
- Remove unused `khoj--post-new-config' method
- Remove the now unused /config/data GET, POST API endpoints
2024-01-16 18:15:06 +05:30
Debanjum
51c59d0059
Remove the 1000 files limit when syncing from Desktop, Obsidian clients (#605)
### Major
- Push 1000 files at a time from the Desktop client for indexing
- Push 1000 files at a time from the Obsidian client for indexing
- Test 1000 file upload limit to index/update API endpoint

### Minor
- Show relevant error message in desktop app, e.g. when it can't connect to server
- Pass indexed filenames in API response for client validation
- Collect files to index in single dict to simplify index/update controller

Resolves #573
2024-01-16 17:59:26 +05:30
Debanjum Singh Solanky
6ded4c1d75 Merge branch 'master' into fix-1000-file-index-update-limit 2024-01-16 16:50:58 +05:30
sabaimran
c24389cff5 Add Algolia to documentation website for better search 2024-01-16 15:53:53 +05:30
Debanjum
45f892dfdd
Fix Offline Chat without GPU and Decoding Chat Query before Processing
- Only run /online command offline chat director test when `SERPER_DEV_API_KEY' present
- Decode URL encoded query string in chat API endpoint before processing
- Make references and online_results optional params to converse_offline
- Pass max context length to fix using updated `GPT4All.list_gpu' method
2024-01-16 14:53:34 +05:30
Debanjum Singh Solanky
e0b381d523 Only run /online command offline chat director test when SERPER KEY present 2024-01-16 13:09:38 +05:30
Debanjum Singh Solanky
16175137e5 Decode URL encoded query string in chat API endpoint before processing 2024-01-16 13:09:28 +05:30
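A minimal sketch of the decoding step described in the commit above, using only the Python standard library. The helper name `decode_query` and the sample query are hypothetical and are not taken from the Khoj code base.

```python
from urllib.parse import unquote


def decode_query(raw_query: str) -> str:
    """Return the query with percent-encoded characters decoded."""
    # unquote turns %20 into a space and %3F into "?".
    # It leaves "+" untouched; unquote_plus would also map "+" to a space.
    return unquote(raw_query)


# Hypothetical URL-encoded chat query.
decoded = decode_query("what%20is%20khoj%3F")
```

Decoding once at the API boundary means later processing steps see the text the user actually typed.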
Debanjum Singh Solanky
9fe1c8ae13 Make references and online_results optional params to converse_offline
Fixes all the failing GPT4All tests because they were missing the
online_results argument
2024-01-16 13:09:28 +05:30
Debanjum Singh Solanky
d74f8e03d3 Pass max context length to fix using updated GPT4All.list_gpu method
Its signature was updated in the GPT4All 2.1.0 PyPI release.

Resolves #610
2024-01-16 12:23:45 +05:30
Debanjum Singh Solanky
1ae6669fbf Correctly handle API response when no files to index 2024-01-16 11:57:40 +05:30
sabaimran
50575b749b
Add option to use HuggingFace's inference endpoint for generating embeddings (#609)
* Support using hosted Huggingface inference endpoint for embeddings generation
* Since the huggingface inference endpoint is model-specific, make the URL an optional property of the search model config
* Handle ECONNREFUSED error in desktop app
* Drive API key via the search model config model and use more generic names
2024-01-16 08:58:24 +05:30
Debanjum Singh Solanky
ba37b28fb5 Improve batched error handling. Catch can't connect to server error
Break out of batch processing when unable to connect to server or
when requests throttled by server
2024-01-14 01:04:44 +05:30
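The break-out behaviour described in the commit above could look roughly like the sketch below. It is written in Python for illustration only; the actual clients are not Python, and `ThrottledError`, `upload_batches` and `flaky_upload` are hypothetical names.

```python
from typing import Callable, Iterable, List


class ThrottledError(Exception):
    """Hypothetical error raised when the server throttles requests."""


def upload_batches(
    batches: Iterable[List[str]], upload: Callable[[List[str]], None]
) -> int:
    """Upload batches in order; stop at the first connection or throttling failure."""
    uploaded = 0
    for batch in batches:
        try:
            upload(batch)
        except (ConnectionError, ThrottledError):
            # Later batches would likely fail the same way, so stop here.
            break
        uploaded += len(batch)
    return uploaded


def flaky_upload(batch: List[str]) -> None:
    # Stand-in for a real HTTP call: fails on the batch containing "c.md".
    if "c.md" in batch:
        raise ConnectionError("cannot reach server")


count = upload_batches([["a.md", "b.md"], ["c.md"], ["d.md"]], flaky_upload)
```

Stopping at the first failure avoids sending the remaining batches to a server that is unreachable or already throttling.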
Debanjum Singh Solanky
7dfbcd2e5a Handle subscribe renew date, langchain, pydantic & logger.warn warnings
- Ensure langchain less than 0.2.0 is used, to prevent breaking
  ChatOpenAI, PyMuPDF usage due to their deprecation after 0.2.0
- Set subscription renewal date to a timezone aware datetime
- Use logger.warning instead of logger.warn as latter is deprecated
- Use `model_dump' not deprecated dict to get all configured content_types
2024-01-12 01:46:52 +05:30
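Two of the fixes listed in the commit above can be illustrated with the standard library alone. This is a sketch under assumptions: the 30-day renewal period and the log message are invented for the example and do not come from the Khoj code base.

```python
import logging
from datetime import datetime, timedelta, timezone

logger = logging.getLogger(__name__)

# Timezone-aware timestamp: tzinfo is set, unlike a naive datetime.now().
now = datetime.now(tz=timezone.utc)

# Hypothetical 30-day renewal period, chosen only for illustration.
renewal_date = now + timedelta(days=30)

# logger.warning is the supported spelling; logger.warn is a deprecated alias.
logger.warning("Subscription renews on %s", renewal_date.isoformat())
```

Frameworks with timezone support enabled, Django among them, warn when a naive datetime is saved, which is one reason to attach `tzinfo` when the value is created.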
Debanjum Singh Solanky
5f97357fe0 Delete unused /api/beta API endpoint 2024-01-12 01:11:05 +05:30
Debanjum Singh Solanky
bb1c1b39d8 Move /api/config API controllers into separate module for code modularity 2024-01-12 01:11:04 +05:30
Debanjum Singh Solanky
ba99089a12 Short-circuit API rate limiter for unauthenticated user
Calls by unauthenticated users were failing at API rate limiter as it
failed to access user info object. This is a bug.

API rate limiter should short-circuit for unauthenticated users so a
proper Forbidden response can be returned by API

Add regression test to verify that unauthenticated users get 403
response when calling the /chat API endpoint
2024-01-12 00:23:50 +05:30
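The short-circuit described in the commit above can be sketched as follows. This is a generic sliding-window limiter written for illustration; the class name, the `allow` method and the use of `None` for an unauthenticated caller are assumptions, not Khoj's implementation.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class ApiRateLimiter:
    """Illustrative sliding-window limiter keyed by user id."""

    def __init__(self, requests: int, window_seconds: float) -> None:
        self.requests = requests
        self.window_seconds = window_seconds
        self.history: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, user_id: Optional[str]) -> bool:
        # Short-circuit: with no authenticated user there is no user info to
        # read, so skip limiting and let the auth layer answer with 403.
        if user_id is None:
            return True

        now = time.monotonic()
        calls = self.history[user_id]
        # Drop calls that have fallen out of the window.
        while calls and now - calls[0] > self.window_seconds:
            calls.popleft()
        if len(calls) >= self.requests:
            return False
        calls.append(now)
        return True


limiter = ApiRateLimiter(requests=2, window_seconds=60)
anonymous = [limiter.allow(None) for _ in range(5)]
known = [limiter.allow("user-1") for _ in range(3)]
```

The limiter never touches per-user state for an unauthenticated call, so it cannot fail on a missing user object before the Forbidden response is produced.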
Debanjum Singh Solanky
b1269fdad2 Remove trailing slash to normalize khoj url in obsidian plugin settings 2024-01-11 21:56:36 +05:30
Debanjum Singh Solanky
ffdb291fe0 Fix error message rendering in khoj.el, khoj obsidian chat
- Fix failed to index error message in khoj.el
- Fix chat model not configured message in khoj obsidian chat
2024-01-11 21:55:54 +05:30
Debanjum Singh Solanky
af9ceb00a0 Show relevant error msg in desktop app, e.g. when can't connect to server 2024-01-09 23:09:34 +05:30
Debanjum Singh Solanky
43423432ce Pass indexed filenames in API response for client validation 2024-01-09 23:09:34 +05:30
Debanjum Singh Solanky
5f9ac5a630 Collect files to index in single dict to simplify index/update controller
Simplifies code while maintaining typing
2024-01-09 23:09:34 +05:30
Debanjum Singh Solanky
efe41aaaca Push 1000 files at a time from the Desktop client for indexing
FastAPI API endpoints only support uploading 1000 files at a time.
So split all files to index into groups of 1000 for upload to
index/update API endpoint
2024-01-09 23:09:34 +05:30
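Splitting files into groups of 1000, as the commit above describes, might look like this minimal Python sketch. The actual client is not written in Python; `batch_files`, `MAX_FILES_PER_REQUEST` and the file names are hypothetical.

```python
from typing import Iterator, Sequence

# Batch size taken from the 1000-file limit named in the commit message.
MAX_FILES_PER_REQUEST = 1000


def batch_files(
    files: Sequence[str], batch_size: int = MAX_FILES_PER_REQUEST
) -> Iterator[Sequence[str]]:
    """Yield consecutive groups of at most `batch_size` files."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(files), batch_size):
        yield files[start : start + batch_size]


# Hypothetical file list: 2500 files split into uploads of 1000, 1000 and 500.
files = [f"note-{i}.md" for i in range(2500)]
batch_sizes = [len(batch) for batch in batch_files(files)]
```

Each yielded group would then be sent in its own request to the index/update endpoint.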
sabaimran
02187b19bb Customize font styling for documentation 2024-01-08 08:50:42 +05:30
sabaimran
8389108653 Fix reference issue for demos in the main README 2024-01-08 08:29:51 +05:30
Debanjum
dbc59b2952
Fix, Improve Khoj Documentation Layout (#604)
- 26f96e00 Use Khoj Client, Data sources diagrams in feature docs
- c82d34b6 Add Docs footer, nav pane links. Fix tagline, Remove announcement topbar
- d920e4d0 Make the docs overview page as the main docs landing page
- 80d1ad5b Fix image urls on docs overview page. Remove logo header in client docs
2024-01-08 02:00:02 +05:30
Debanjum Singh Solanky
efc7b08cd9 Use Khoj Client, Data sources diagrams in feature docs 2024-01-08 01:58:57 +05:30
Debanjum Singh Solanky
c82d34b659 Add Docs footer, nav pane links. Fix tagline, Remove announcement topbar 2024-01-08 01:17:47 +05:30
Debanjum Singh Solanky
d920e4d0a7 Make the docs overview page as the main docs landing page
- Make the docs overview page available at docs.khoj.dev root instead of
  under docs.khoj.dev/docs path
  - Remove the new landing page, as it is unnecessary.
- Remove /docs path prefix from links to internal doc pages
- Remove .md path suffix in internal doc pages for consistency
2024-01-08 01:13:42 +05:30
Debanjum Singh Solanky
80d1ad5b6f Fix image urls on docs overview page. Remove logo header in client docs 2024-01-08 00:30:31 +05:30
sabaimran
ce53bc52c5 Modify permissions of the GITHUB_TOKEN for publishing to gh-pages 2024-01-07 20:53:57 +05:30
sabaimran
740453fa18 Use documentation folder for building project and uploading data 2024-01-07 20:50:15 +05:30
sabaimran
2be7c84203 Enter documentation repository before running yarn build 2024-01-07 20:46:21 +05:30
sabaimran
ad95e88838 Update node version in github action 2024-01-07 20:41:24 +05:30
sabaimran
bd9aa578f4 Add a yarn.lock file and use for node.js setup 2024-01-07 20:36:02 +05:30
sabaimran
9b991eb4fe
Migrate to using docusaurus, rather than docsify for documentation (#603)
* Add docusaurus documentation (to replace the docsify setup)
* Remove older docs
* Specify documentation as the gh pages build action working directory
2024-01-07 20:28:15 +05:30
Debanjum Singh Solanky
98081bc0d3 Update Uninstall Documentation for Khoj Server when Self Hosting 2024-01-06 01:37:29 +05:30
Debanjum Singh Solanky
5d52dc5b35 Fix spelling in the development documentation for Khoj 2024-01-04 19:24:58 +05:30
Debanjum Singh Solanky
b6d5392c0c Release Khoj version 1.2.1 2024-01-04 18:45:37 +05:30
Debanjum Singh Solanky
fca7a5ff32 Push 1000 files at a time from the Obsidian client for indexing
FastAPI API endpoints only support uploading 1000 files at a time.
So split all files to index into groups of 1000 for upload to
index/update API endpoint
2024-01-04 18:43:22 +05:30
Debanjum Singh Solanky
4ded32cc64 Test 1000 file upload limit to index/update API endpoint
Due to FastAPI limitation
2024-01-03 22:14:36 +05:30
Debanjum Singh Solanky
4a234c8db3 Use default offline/openai chat model to extract DB search queries
Make usage of the first offline/openai chat model as the default LLM
to use for background tasks more explicit

The idea is to use the default/first chat model for all background
activities, like extracting the search queries to run from a user message.
This is controlled by the server admin.

The chat model set by the user is used for user-facing functions like
generating chat responses
2024-01-03 14:04:49 +05:30
Debanjum Singh Solanky
e28adf2884 Also index pdf, markdown and plaintext files using khoj emacs client
Previously you could only index org-mode files and directories from
khoj.el

Mark the `khoj-org-directories', `khoj-org-files' variables for
deprecation, since `khoj-index-directories', `khoj-index-files'
replace them as more appropriate names for the more general case

Resolves #597
2024-01-03 11:46:17 +05:30
Debanjum Singh Solanky
5abaed9d08 Use user chosen OpenAI model to extract DB search questions from query
Previously Khoj was selecting the first OpenAI model configured on
server and not the OpenAI model configured by the user for themselves
2024-01-03 11:45:06 +05:30