Commit graph

497 commits

Author SHA1 Message Date
Debanjum Singh Solanky
2f1756cc15 Do not use icon for each file, folder to index in desktop app.
Other minor fixes based on PR feedback
2023-11-04 00:13:10 -07:00
Debanjum Singh Solanky
e8f568d79c Make splash screen wider, opaque and fix it's spinner radius
Radius should be such that final spin doesn't extend out of the circle
Opaque background improves contrast for better visual
2023-11-03 23:59:21 -07:00
Debanjum Singh Solanky
3ef05f4803 Use css var for main font color in search, chat page of desktop app 2023-11-03 23:59:21 -07:00
Debanjum Singh Solanky
a19cbde2d7 Add About page for Khoj to Desktop app. Expose it via system tray
- Pass current khoj version from package.json to about page via
  electron IPC between backend js and frontend page
- Update Khoj information in default About screen as well, in case
  it's exposed anywhere else
2023-11-03 23:59:21 -07:00
Debanjum Singh Solanky
a327294ee9 Rename khoj.js to utils.js in web and desktop client apps 2023-11-03 18:13:37 -07:00
Debanjum Singh Solanky
db57eeaefe Console log a welcome message on loading Desktop client 2023-11-03 05:15:41 -07:00
Debanjum Singh Solanky
4cd76311ad Slow down spinning at end of splash sequence. Make animation bigger 2023-11-03 04:28:17 -07:00
Debanjum Singh Solanky
34661c33a2 Show splash screen on starting desktop app 2023-11-03 03:19:08 -07:00
Debanjum Singh Solanky
126d3f4563 Render each file, folder to index row with icon in desktop app
Make the file, folders to index look less like an editable field
2023-11-03 02:48:42 -07:00
Debanjum Singh Solanky
80ae132cad Update Desktop, Obsidian client color theme to lighter yellow
- Update background color to a different shade of white
- Make primary and primary hover colors less intense and more aligned
  with lantern flame shade
- Add water, leaf, flower color variables
2023-11-03 02:48:42 -07:00
Debanjum Singh Solanky
345856e7be Merge branch 'master' of github.com:khoj-ai/khoj into features/multi-user-support-khoj
Merge changes to use latest GPT4All with GPU, GGUF model support into
khoj multi-user support rearchitecture branch
2023-11-02 22:44:25 -07:00
Debanjum Singh Solanky
041074ccd6 Make chat the landing page for the desktop app
Chat, unlike search, doesn't knowledge base indexing setup.
So you can get started with chat much faster.
2023-11-02 20:42:21 -07:00
Debanjum Singh Solanky
9fc6c97139 Use Khoj standard font family, weight in web client settings page 2023-11-02 20:42:21 -07:00
Tuan Nguyen
354605e73e
Autofocus to chat input when openning chat (#524) 2023-11-01 16:09:45 -07:00
sabaimran
54a387326c
[Multi-User Part 6]: Address small bugs and upstream PR comments (#518)
- 08654163cb: Add better parsing for XML files
- f3acfac7fb: Add a try/catch around the dateparser in order to avoid internal server errors in app
- 7d43cd62c0: Chunk embeddings generation in order to avoid large memory load
- e02d751eb3: Addresses comments from PR #498 
- a3f393edb4: Addresses comments from PR #503 
- 66eb078286: Addresses comments from PR #511 
- Address various items in https://github.com/khoj-ai/khoj/issues/527
2023-10-31 17:59:53 -07:00
Debanjum
9acc722f7f
[Multi-User Part 4]: Authenticate using API Tokens (#513)
###  New
- Use API keys to authenticate from Desktop, Obsidian, Emacs clients
- Create API, UI on web app config page to CRUD API Keys
- Create user API keys table and functions to CRUD them in Database

### 🧪 Improve
- Default to better search model, [gte-small](https://huggingface.co/thenlper/gte-small), to improve search quality
- Only load chat model to GPU if enough space, throw error on load failure
- Show encoding progress, truncate headings to max chars supported
- Add instruction to create db in Django DB setup Readme

### ⚙️ Fix
- Fix error handling when configure offline chat via Web UI
- Do not warn in anon mode about Google OAuth env vars not being set
- Fix path to load static files when server started from project root
2023-10-26 12:33:03 -07:00
Debanjum Singh Solanky
8346e1193c Release Khoj version 0.13.0 2023-10-18 03:43:54 -07:00
Debanjum Singh Solanky
6631fc38db Delete plaintext config via API. Catch any offline model loading exception 2023-10-18 03:37:45 -07:00
Debanjum Singh Solanky
53abd1a506 Mark sync completed on desktop client, even when no files to send
Previously Sync spinner on desktop config screen would hang when no
files to send to server & the Sync button had been manually triggered
2023-10-18 01:30:56 -07:00
Debanjum Singh Solanky
51363d280d Do not configure khoj server for pull based indexing from khoj.el
Do not make khoj server pull update index on Obsidian plugin load.
Index is updated on push from plugin instead now/
2023-10-17 21:47:19 -07:00
Debanjum Singh Solanky
c8293998d9 Fix encoding binary files like PDFs for sync from Obsidian client
Use readBinary to read binary files like PDFs instead of read
2023-10-17 15:08:30 -07:00
sabaimran
ba60c869c9 Fix encoding binary files like PDFs for sync from Desktop client
Use readFileSync, Buffer to pass appropriately formatted binary data
2023-10-17 15:08:23 -07:00
Debanjum Singh Solanky
b8976426eb Update offline chat model config schema used by Emacs, Obsidian clients
The server uses a new schema for the conversation config. The Emacs,
Obsidian clients need to use this schema to update the conversation
config
2023-10-17 07:01:35 -07:00
Debanjum
ecc6fbfeb2
Push Files to Index from Emacs, Obsidian & Desktop Clients using Multi-Part Forms (#499)
### Overview
- Add ability to push data to index from the Emacs, Obsidian client
- Switch to standard mechanism of syncing files via HTTP multi-part/form. Previously we were streaming the data as JSON
  - Benefits of new mechanism
    - No manual parsing of files to send or receive on clients or server is required as most have in-built mechanisms to send multi-part/form requests
    - The whole response is not required to be kept in memory to parse content as JSON. As individual files arrive they're automatically pushed to disk to conserve memory if required
    - Binary files don't need to be encoded on client and decoded on server

### Code Details
### Major
- Use multi-part form to receive files to index on server
- Use multi-part form to send files to index on desktop client
- Send files to index on server from the khoj.el emacs client
  - Send content for indexing on server at a regular interval from khoj.el
- Send files to index on server from the khoj obsidian client
- Update tests to test multi-part/form method of pushing files to index

#### Minor
- Put indexer API endpoint under /api path segment
- Explicitly make GET request to /config/data from khoj.el:khoj-server-configure method
- Improve emoji, message on content index updated via logger
- Don't call khoj server on khoj.el load, only once khoj invoked explicitly by user
- Improve indexing of binary files
  - Let fs_syncer pass PDF files directly as binary before indexing
  - Use encoding of each file set in indexer request to read file 
- Add CORS policy to khoj server. Allow requests from khoj apps, obsidian & localhost
- Update indexer API endpoint URL to` index/update` from `indexer/batch`

Resolves #471 #243
2023-10-17 06:05:15 -07:00
Debanjum Singh Solanky
5efae1ad55 Update indexer API endpoint query params for force, content type
New URL query params, `force' and `t' match name of query parameter in
existing Khoj API endpoints

Update Desktop, Obsidian and Emacs client to call using these new API
query params. Set `client' query param from each client for telemetry
visibility
2023-10-17 04:58:13 -07:00
Debanjum Singh Solanky
84654ffc5d Update indexer API endpoint URL to index/update from indexer/batch
New URL follows action oriented endpoint naming convention used for
other Khoj API endpoints

Update desktop, obsidian and emacs client to call this new API
endpoint
2023-10-17 04:58:13 -07:00
Debanjum Singh Solanky
05be6bd877 Clicking Update Index in Obsidian settings should push files to index
Use the indexer/batch API endpoint to regenerate content index rather
than the previous pull based content indexing API endpoint
2023-10-17 04:58:13 -07:00
Debanjum Singh Solanky
13a3122bf3 Stop configuring server to pull files to index from Obsidian client
Obsidian client now pushes vault files to index instead
2023-10-17 04:58:13 -07:00
Debanjum Singh Solanky
d27dc71dfe Use encoding of each file set in indexer request to read file
Get encoding type from multi-part/form-request body for each file
Read text files as utf-8 and pdfs, images as binary
2023-10-17 04:58:12 -07:00
Debanjum Singh Solanky
8e627a5809 Pass any files to be deleted to indexer API via Khoj Obsidian plugin
- Keep state of previously synced files to identify files to be deleted
- Last synced files stored in settings for persistence of this data
  across Obsidian reboots
2023-10-17 03:34:49 -07:00
Debanjum Singh Solanky
f2e293a149 Push Vault files to index to Khoj server using Khoj Obsidian plugin
Use the multi-part/form-data request to sync Markdown, PDF files in
vault to index on khoj server

Run scheduled job to push updates to value for indexing every 1 hour
2023-10-17 03:05:30 -07:00
Debanjum Singh Solanky
6baaaaf91a Test request body of multi-part form to update content index from khoj.el 2023-10-16 23:54:32 -07:00
Debanjum Singh Solanky
79b3f8273a Make khoj.el send files to be deleted from index to server 2023-10-16 23:53:02 -07:00
Debanjum Singh Solanky
f64fa06e22 Initialize the Khoj Transient menu on first run instead of load
This prevents Khoj from polling the Khoj server until explicitly
invoked via `khoj' entrypoint function.

Previously it'd make a request to the khoj server every time Emacs or
khoj.el was loaded

Closes #243
2023-10-16 19:11:46 -07:00
Debanjum Singh Solanky
96c0b21285 Sync desktop app package.json with other Khoj clients metadata
- Make `bump_version.sh' script set version for the Khoj desktop app too
- Sync Khoj desktop app authors, license, description and version with
  the other interfaces and server
- Update description in packages metadata to match project subtitle on Github
2023-10-13 20:43:55 -07:00
sabaimran
80fb56b8a5 Sync deksktop app package version with the other releases 2023-10-13 19:23:00 -07:00
Debanjum Singh Solanky
b669aa2395 Clean and fix the content indexing code in the Emacs client
- Pass payloads as unibyte. This was causing the request to fail for
  files with unicode characters
- Suppress messages with file content in on index updates
- Fix rendering response from server on index update API call
- Extract code to populate body of index update HTTP request with files
2023-10-13 18:00:37 -07:00
Debanjum Singh Solanky
bea196aa30 Explicitly make GET request to /config/data from khoj.el:khoj-server-configure method
Previously global state of `url-request-method' would affect the
kind of request made to api/config/data API endpoint as it wasn't
being explicitly being set before calling the API endpoint

This was done with the assumption that the default value of GET for
url-request-method wouldn't change globally

But in some cases, experientially, it can get changed. This was
resulting in khoj.el load failing as POST request was being made
instead which would throw error
2023-10-12 20:58:52 -07:00
Debanjum Singh Solanky
292f0420ad Send content for indexing on server at a regular interval from khoj.el
- Allow indexing frequency to be configurable by user
- Ensure there is only one khoj indexing timer running
2023-10-12 20:58:52 -07:00
Debanjum Singh Solanky
fc99431754 Send files to index on server from the khoj.el emacs client
- Add elisp variable to set API key to engage with the Khoj server
- Use multi-part form to POST the files to index to the indexer API
  endpoint on the khoj server
2023-10-12 20:58:52 -07:00
Debanjum Singh Solanky
68018ef397 Use multi-part form to send files to index on desktop client
- Add typing for variables in for loop and other minor formatting clean-up
- Assume utf8 encoding for text files and binary for image, pdf files
2023-10-12 20:58:49 -07:00
Debanjum Singh Solanky
6aa69da3ef Put indexer API endpoint under /api path segment
Update FastAPI app router, desktop app and to use new url path to
batch indexer API endpoint

All api endpoints should exist under /api path segment
2023-10-09 21:35:58 -07:00
Debanjum Singh Solanky
f6f7a62d80 Wait for user to stop typing to trigger search from khoj.el in Emacs
- Improves user experience by aligning idle time with search latency
  to avoid display jitter (to render results) while user is typing

- Makes the idle time configurable

Closes #480
2023-10-06 12:44:45 -07:00
sabaimran
4a5ed7f06c
Update Khoj package version for Electron, Desktop app (#492)
* Address package upgrade for Electron application
* Update package version for Electron desktop application
2023-10-03 12:21:32 -07:00
sabaimran
3f962a55c3
Fix Linux Desktop Application (#491)
* Use separate functions for adding files and folders to configuration for indexing
* Add a loading bar while data is syncing
* Bump the minor version for the application
2023-10-03 11:43:19 -07:00
sabaimran
63b3696af0 Release Khoj version 0.12.3 2023-09-26 22:41:11 -07:00
sabaimran
d2f9bca1cf Fix null ref issue in query method and update logic for determining whether khoj is already configured in obsidian 2023-09-26 22:33:44 -07:00
sabaimran
2f18383349 Release Khoj version 0.12.2 2023-09-26 11:59:47 -07:00
sabaimran
4e370d7a18 Release Khoj version 0.12.1 2023-09-26 09:24:53 -07:00
sabaimran
3675aa348a Update naming of Khoj in manifest.json for Obsidian 2023-09-26 09:24:36 -07:00
sabaimran
a82d1becc3 Release Khoj version 0.12.0 2023-09-26 09:17:56 -07:00
sabaimran
38f0df3d53 Remove unused icons from electron app folder 2023-09-26 07:56:29 -07:00
sabaimran
16874e1953 Provide force fallback for regeneration 2023-09-12 16:35:07 -07:00
sabaimran
76562f4250
Add front-end Electron application for Khoj local file syncing (#473)
* Initial version - setup a file-push architecture for generating embeddings with Khoj
* Use state.host and state.port for configuring the URL for the indexer
* Fix parsing of PDF files
* Read markdown files from streamed data and update unit tests
* On application startup, load in embeddings from configurations files, rather than regenerating the corpus based on file system
* Init: refactor indexer/batch endpoint to support a generic file ingestion format
* Add features to better support indexing from files sent by the desktop client
* Initial commit with Electron application
- Adds electron app
* Add import for pymupdf, remove import for pypdf
* Allow user to configure khoj host URL
* Remove search type configuration from index.html
* Use v1 path for current indexer routes
2023-09-06 12:04:18 -07:00
sabaimran
74409c2c64 Release Khoj version 0.11.4 2023-08-29 11:44:35 -07:00
sabaimran
e592f6eac8 Release Khoj version 0.11.3 2023-08-28 14:46:03 -07:00
sabaimran
bc09143856 Release Khoj version 0.11.2 2023-08-28 10:16:13 -07:00
Debanjum Singh Solanky
3ff4e19dd2 Release Khoj version 0.11.1 2023-08-16 22:53:29 -07:00
Debanjum Singh Solanky
26c3977fb9 Remove info hint to reindex khoj on unexpected search results
The index corruption was issue resolved a while ago in #325 and
hasn't cropped up again
2023-08-16 00:58:59 -07:00
sabaimran
6562ec6531 Release Khoj version 0.11.0 2023-08-14 19:25:03 -07:00
Jason Qin
3ef1b7073d
Update obsidian/manifest.json
Closes #426
2023-08-07 10:41:39 -07:00
Debanjum Singh Solanky
e6e3acdbe4 Release Khoj version 0.10.1 2023-08-01 23:55:13 -07:00
sabaimran
f65d157244 Release Khoj version 0.10.0 2023-07-28 19:27:47 -07:00
sabaimran
5ccb01343e
Add Offline chat to Obsidian (#359)
* Add support for configuring/using offline chat from within Obsidian
* Fix type checking for search type
* If Github is not configured, /update call should fail
* Fix regenerate tests same as the update ones
* Update help text for offline chat in obsidian
* Update relevant description for Khoj settings in Obsidian
* Simplify configuration logic and use smarter defaults
2023-07-28 18:47:56 -07:00
Debanjum Singh Solanky
ebfbef1f68 Configure using offline chat from Emacs
Closes #358
2023-07-28 16:07:33 -07:00
Debanjum Singh Solanky
715d56d4f0 Use new schema to update khoj.yml config from khoj.el 2023-07-26 17:34:16 -07:00
sabaimran
8b2af0b5ef
Add support for our first Local LLM 🤖🏠 (#330)
* Add support for gpt4all's falcon model as an additional conversation processor
- Update the UI pages to allow the user to point to the new endpoints for GPT
- Update the internal schemas to support both GPT4 models and OpenAI
- Add unit tests benchmarking some of the Falcon performance
* Add exc_info to include stack trace in error logs for text processors
* Pull shared functions into utils.py to be used across gpt4 and gpt
* Add migration for new processor conversation schema
* Skip GPT4All actor tests due to typing issues
* Fix Obsidian processor configuration in auto-configure flow
* Rename enable_local_llm to enable_offline_chat
2023-07-26 16:27:08 -07:00
Debanjum Singh Solanky
7722a9c347 Default to using the gpt-3.5-turbo model for chat from khoj.el 2023-07-22 00:29:26 -07:00
sabaimran
1610d2ebd9
📝 Add a documentation base for Khoj! (#333)
* Add docs for more organized, accessible information detailing Khoj setup
* Delete duplicated files
* Add a coverpage without enabling it. Add logo and theme
* Remove obsidian README.md
* Add plausible script to index.html via docsify
2023-07-20 22:34:25 -07:00
Debanjum Singh Solanky
3e59be7f1d Release Khoj version 0.9.0 2023-07-18 19:59:27 -07:00
Debanjum Singh Solanky
4d910936b7 Fix triggering index update on khoj server from khoj.el 2023-07-18 19:57:54 -07:00
Debanjum Singh Solanky
5c7d7f558d Make AI model used for Khoj chat configurable from khoj.el
- Fix bug. Set the unused model-name to a standad default value
2023-07-18 19:57:54 -07:00
Debanjum
5f2be2a9bb
Merge pull request #298 from HyunggyuJang/patch-1
Encode config as utf-8 during setup in khoj.el. This will allow utf-8 encoded files etc to be passed in config
2023-07-18 17:54:11 -07:00
Debanjum Singh Solanky
71e8ddd9a2 Check if PDF is configured before showing it as an option in khoj.el 2023-07-17 15:49:20 -07:00
HyunggyuJang
88c42b3043
Encode data as utf-8
otherwise it will complain, see 1c85531090
2023-07-11 17:06:05 +09:00
Debanjum Singh Solanky
f664a74e77 Update Khoj server to run on non standard port, 42110 instead of 8000
Resolves #295
2023-07-10 21:27:58 -07:00
sabaimran
55f5be7b03 Release Khoj version 0.8.2 2023-07-10 14:39:32 -07:00
sabaimran
53809298c0 Release Khoj version 0.8.1 2023-07-10 14:30:04 -07:00
Debanjum Singh Solanky
75ff871217 Release Khoj version 0.8.0 2023-07-10 13:37:51 -07:00
sabaimran
07cf5a214a
Check if PDF files are present in the Obsidian vault before initializing the Khoj configuration (#293) 2023-07-10 10:33:04 -07:00
sabaimran
4c135ea316
Make streaming optional for the /chat endpoint (#287)
* Update the /chat endpoint to conditionally support streaming

- If streams are enabled, return the threadgenerator as it does currently
- If stream is disabled, return a JSON response with the response/compiled references separated out
- Correspondingly, update the chat.html UI to use the streamed API, as well as Obsidian
- Rename chat/init/ to chat/history

* Update khoj.el to use the /history endpoint

- Update corresponding unit tests to use stream=true

* Remove & from call to /chat for obsidian

* Abstract functions out into a helpers.py file and clean up some of the error-catching
2023-07-09 10:12:09 -07:00
Debanjum Singh Solanky
0a86220d42 Use default values, delete content config on disable and update state 2023-07-07 20:36:16 -07:00
Debanjum Singh Solanky
362063f5fe By default, connect to Khoj server over IPv4 from Obsidian plugin 2023-07-07 20:36:16 -07:00
Debanjum Singh Solanky
bf427cd8dd Set no. of results used to generate chat response from Khoj Emacs 2023-07-07 12:34:50 -07:00
Debanjum Singh Solanky
1d77fe712c Set no. of results used to generate chat response from Khoj Obsidian 2023-07-07 12:32:32 -07:00
Debanjum Singh Solanky
a2c668268f Use node-fetch >=3.1.0 in khoj obsidian plugin to avoid security vulnerability 2023-07-06 13:05:52 -07:00
Debanjum
6c2a8a5bce
️ Stream Responses by Khoj Chat on Web, Obsidian
- What
   - Stream chat responses from OpenAI API to Web, Obsidian clients
      - Implement using a callback function which manages a queue where new tokens can be placed as they come on. As the thread is read from, tokens are removed.
      - When the final token has been processed, add the `compiled_references` to the queue to be rendered by the `chat` client
      - When the thread has been closed, save the accumulated conversation log in the user's history using a `partial func`
      - Incrementally decode tokens on the front end and add them as they appear from the streamed response

- Why
This significantly reduces perceived latency and OpenAI API request timeouts for Chat

Closes https://github.com/khoj-ai/khoj/issues/257
2023-07-05 20:02:11 -07:00
Debanjum Singh Solanky
0ba838b53a Show temp status message in Khoj Obsidian chat while Khoj is thinking
- Scroll to bottom after adding temporary status message and
references too
2023-07-05 18:02:43 -07:00
sabaimran
4e6b66b139 Add support for streaming chat response from OpenAI to Obsidian
- I needed to installed node-fetch to accomplish this, as the built-in request object from Obsidian doesn't seem to support streaming and the built-in fetch object is very sensitive to any and all cross origin requests
2023-07-05 15:01:22 -07:00
Debanjum Singh Solanky
5889eceba4 Make text selectable in Khoj chat modal on Obsidian
Previously the text in the Khoj chat modal couldn't be copied as it
was not selectable

Resolves #206
2023-07-03 23:24:04 -07:00
sabaimran
a6f313589e Release Khoj version 0.7.1 2023-07-03 12:26:41 -07:00
Debanjum
faad1297f4
Drop Support for Org Music, Ledger Content Types
Removing unused content types will reduce khoj code to manage

- 0f993b3 Drop support for Ledger as a separate content type
   Khoj will soon get a generic text indexing content type in Index plain text files #237.
   This along with a file filter should suffice for searching through Ledger transactions

- c9db532 Remove unused org-music as an indexable content type from Khoj
   Org-music was just a custom content type that worked with org-music.
   It was mostly only useful for me.
2023-07-02 17:48:29 -07:00
Debanjum Singh Solanky
0f993b332e Drop support for Ledger as a separate content type
Khoj will soon get a generic text indexing content type. This along
with a file filter should suffice for searching through Ledger
transactions, if required.

Having a specific content type for niche use-case like ledger isn't
useful. Removing unused content types will reduce khoj code to manage.
2023-07-02 16:57:49 -07:00
Debanjum Singh Solanky
c9db5321e7 Remove unused org-music as an indexable content type from Khoj
Org-music was just a custom content type that worked with org-music.
It was mostly only useful for me.

Cleaning up that code will reduce number of content types for khoj to
manage.
2023-07-02 16:21:21 -07:00
sabaimran
b86a3bb0c5 Merge branch 'master' of github.com:debanjum/khoj into fix/obsidian-setup-issues 2023-07-02 16:21:05 -07:00
sabaimran
a52c1c8380 Use built-in app.vault to determine whether there are any PDF files within 2023-07-02 16:20:43 -07:00
sabaimran
eff1436857 Overwrite existing PDFs in Obsidian as well, make if-block more legible 2023-07-02 16:17:25 -07:00
Debanjum Singh Solanky
30459ee4ba Fix Khoj subtitle in desktop entry, pyproject, cli and Obsidian Readme 2023-07-02 16:09:07 -07:00
sabaimran
4b02a8c788 Fix PDF setup in Obsidian plugin and force Obsidian configuration for markdown 2023-07-02 12:37:24 -07:00
sabaimran
f0f6390366 Make --no-gui the default behavior of Khoj and update corresponding documentation 2023-07-01 19:07:59 -07:00