Lock syncing to server if a sync is already in progress.
While the sync save button gets disabled while sync is in progress,
the background sync job can still trigger a sync in parallel. This
sync lock prevents that
- Remove spurious whitespace in chat input box on page load being
added because text area element was ending on newline
- Do not insert newline in message when send message by hitting enter key
This would be more evident when send message with cursor in the
middle of the sentence, as a newline would be inserted at the cursor
point
- Remove chat message separator tokens from model output. Model
sometimes starts to output text in it's chat format
- Pass current khoj version from package.json to about page via
electron IPC between backend js and frontend page
- Update Khoj information in default About screen as well, in case
it's exposed anywhere else
- Update background color to a different shade of white
- Make primary and primary hover colors less intense and more aligned
with lantern flame shade
- Add water, leaf, flower color variables
### ✨ New
- Use API keys to authenticate from Desktop, Obsidian, Emacs clients
- Create API, UI on web app config page to CRUD API Keys
- Create user API keys table and functions to CRUD them in Database
### 🧪 Improve
- Default to better search model, [gte-small](https://huggingface.co/thenlper/gte-small), to improve search quality
- Only load chat model to GPU if enough space, throw error on load failure
- Show encoding progress, truncate headings to max chars supported
- Add instruction to create db in Django DB setup Readme
### ⚙️ Fix
- Fix error handling when configure offline chat via Web UI
- Do not warn in anon mode about Google OAuth env vars not being set
- Fix path to load static files when server started from project root
### Overview
- Add ability to push data to index from the Emacs, Obsidian client
- Switch to standard mechanism of syncing files via HTTP multi-part/form. Previously we were streaming the data as JSON
- Benefits of new mechanism
- No manual parsing of files to send or receive on clients or server is required as most have in-built mechanisms to send multi-part/form requests
- The whole response is not required to be kept in memory to parse content as JSON. As individual files arrive they're automatically pushed to disk to conserve memory if required
- Binary files don't need to be encoded on client and decoded on server
### Code Details
### Major
- Use multi-part form to receive files to index on server
- Use multi-part form to send files to index on desktop client
- Send files to index on server from the khoj.el emacs client
- Send content for indexing on server at a regular interval from khoj.el
- Send files to index on server from the khoj obsidian client
- Update tests to test multi-part/form method of pushing files to index
#### Minor
- Put indexer API endpoint under /api path segment
- Explicitly make GET request to /config/data from khoj.el:khoj-server-configure method
- Improve emoji, message on content index updated via logger
- Don't call khoj server on khoj.el load, only once khoj invoked explicitly by user
- Improve indexing of binary files
- Let fs_syncer pass PDF files directly as binary before indexing
- Use encoding of each file set in indexer request to read file
- Add CORS policy to khoj server. Allow requests from khoj apps, obsidian & localhost
- Update indexer API endpoint URL to` index/update` from `indexer/batch`
Resolves#471#243
New URL query params, `force' and `t' match name of query parameter in
existing Khoj API endpoints
Update Desktop, Obsidian and Emacs client to call using these new API
query params. Set `client' query param from each client for telemetry
visibility
New URL follows action oriented endpoint naming convention used for
other Khoj API endpoints
Update desktop, obsidian and emacs client to call this new API
endpoint
- Make `bump_version.sh' script set version for the Khoj desktop app too
- Sync Khoj desktop app authors, license, description and version with
the other interfaces and server
- Update description in packages metadata to match project subtitle on Github
* Use separate functions for adding files and folders to configuration for indexing
* Add a loading bar while data is syncing
* Bump the minor version for the application
* Initial version - setup a file-push architecture for generating embeddings with Khoj
* Use state.host and state.port for configuring the URL for the indexer
* Fix parsing of PDF files
* Read markdown files from streamed data and update unit tests
* On application startup, load in embeddings from configurations files, rather than regenerating the corpus based on file system
* Init: refactor indexer/batch endpoint to support a generic file ingestion format
* Add features to better support indexing from files sent by the desktop client
* Initial commit with Electron application
- Adds electron app
* Add import for pymupdf, remove import for pypdf
* Allow user to configure khoj host URL
* Remove search type configuration from index.html
* Use v1 path for current indexer routes
- Why
The khoj pypi packages should be installed in `khoj' directory.
Previously it was being installed into `src' directory, which is a
generic top level directory name that is discouraged from being used
- Changes
- move src/* to src/khoj/*
- update `setup.py' to `find_packages' in `src' instead of project root
- rename imports to form `from khoj.*' in complete project
- update `constants.web_directory' path to use `khoj' directory
- rename root logger to `khoj' in `main.py'
- fix image_search tests to use the newly rename `khoj' logger
- update config, docs, workflows to reference new path `src/khoj'
- Default config has `input_files' set to None
- This was being passed to `FileBrowser' on Initialization
- But `FileBrowser' expects `content_files' of list type, not None
- This resulted in an unexpected NoneType failure
- When no file selected in file browser an empty line/entry gets added
to input entries list
- Bug got introduced due to insufficient update on change to add
instead of insert
- Update is_none_or_empty helper method to also check for empty string
- Allows adding multiple image directories via GUI
- Allow adding multiple files in different directories via GUI
- Previously users couldn't add multiple directories via GUI
They'd have to manually append to input field if multiple files, directories
- To clear/overwrite is much easier.
The user can just select text to delete in input area
- Issue
Fix configuring image search from Desktop GUI. It was broken before.
The Desktop GUI was updating input-files field under content-type > image.
This field is not used for image search. So image search couldn't be
configured from the Desktop GUI
- Fix
- Set input-directories when field of search type image is set from GUI
- Otherwise set input-files field in config
- Show a helpful error message in the GUI to the user, instead of the
crashing if loading config fails, for e.g if file wasn't found
- Collate GUI errors into an ErrorType enum class
- Remove previous error messages before showing the new one