- Improves user experience by aligning idle time with search latency,
  avoiding display jitter from rendering results while the user is typing
- Makes the idle time configurable
Closes #480
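
The idle-time behaviour described above can be sketched as a small debounce helper: a search fires only once the user has paused typing for the configured idle time. This is a minimal illustration in Python, not the client's actual code; the class name `IdleSearchTrigger`, its methods, and the default value are all hypothetical, and the clock is passed in explicitly so the logic stays easy to test.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IdleSearchTrigger:
    """Fire a search only after typing has paused for `idle_seconds`."""

    # Hypothetical default; the source only says the idle time is configurable.
    idle_seconds: float = 0.3
    _last_keystroke: Optional[float] = None
    _pending: bool = False

    def on_keystroke(self, now: float) -> None:
        # Every keystroke restarts the idle window and marks a search as pending.
        self._last_keystroke = now
        self._pending = True

    def should_search(self, now: float) -> bool:
        # Fire at most once per burst of typing, after the idle window elapses.
        if not self._pending or self._last_keystroke is None:
            return False
        if now - self._last_keystroke < self.idle_seconds:
            return False
        self._pending = False
        return True
```

Setting `idle_seconds` close to the typical search latency means results are rendered roughly once per pause rather than once per keystroke, which is the jitter this change is meant to avoid.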
* Use separate functions for adding files and folders to configuration for indexing
* Add a loading bar while data is syncing
* Bump the minor version for the application
* Initial version: set up a file-push architecture for generating embeddings with Khoj
* Use state.host and state.port for configuring the URL for the indexer
* Fix parsing of PDF files
* Read markdown files from streamed data and update unit tests
* On application startup, load embeddings from the configuration files rather than regenerating the corpus from the file system
* Init: refactor indexer/batch endpoint to support a generic file ingestion format
* Add features to better support indexing from files sent by the desktop client
* Initial commit with Electron application
- Adds Electron app
* Add import for pymupdf, remove import for pypdf
* Allow user to configure khoj host URL
* Remove search type configuration from index.html
* Use v1 path for current indexer routes
* Add support for configuring/using offline chat from within Obsidian
* Fix type checking for search type
* If GitHub is not configured, the /update call should fail
* Fix the regenerate tests in the same way as the update tests
* Update help text for offline chat in Obsidian
* Update relevant description for Khoj settings in Obsidian
* Simplify configuration logic and use smarter defaults
* Add support for gpt4all's falcon model as an additional conversation processor
- Update the UI pages to allow the user to point to the new endpoints for GPT
- Update the internal schemas to support both GPT4All models and OpenAI
- Add unit tests benchmarking some of the Falcon performance
* Add exc_info to include stack trace in error logs for text processors
* Pull shared functions into utils.py to be used across gpt4 and gpt
* Add migration for new processor conversation schema
* Skip GPT4All actor tests due to typing issues
* Fix Obsidian processor configuration in auto-configure flow
* Rename enable_local_llm to enable_offline_chat
* Add docs for more organized, accessible information detailing Khoj setup
* Delete duplicated files
* Add a coverpage without enabling it. Add logo and theme
* Remove Obsidian README.md
* Add plausible script to index.html via docsify
* Update the /chat endpoint to conditionally support streaming
- If streaming is enabled, return the threadgenerator as it does currently
- If streaming is disabled, return a JSON response with the response and compiled references separated out
- Correspondingly, update the chat.html UI and the Obsidian client to use the streamed API
- Rename chat/init/ to chat/history
* Update khoj.el to use the /history endpoint
- Update corresponding unit tests to use stream=true
* Remove & from call to /chat for Obsidian
* Abstract functions out into a helpers.py file and clean up some of the error-catching
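
The conditional streaming behaviour of the /chat endpoint listed above can be sketched roughly as follows. This is an illustration under assumptions, not the endpoint's actual code: the function name `build_chat_response` and the JSON keys `response` and `compiled_references` are hypothetical, and a plain iterator stands in for the real thread generator.

```python
import json
from typing import Iterable, Iterator, List, Union


def build_chat_response(
    tokens: Iterable[str], compiled_references: List[str], stream: bool
) -> Union[Iterator[str], str]:
    """Return a token iterator when streaming, else a single JSON body."""
    if stream:
        # Streaming: hand tokens to the caller one at a time as they arrive.
        return iter(tokens)
    # Non-streaming: wait for the full answer and keep the response text
    # and the compiled references in separate fields (key names assumed).
    return json.dumps(
        {
            "response": "".join(tokens),
            "compiled_references": compiled_references,
        }
    )
```

Clients that can render incrementally would request the streamed form, while callers that want one parseable payload can disable streaming.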
- What
- Stream chat responses from OpenAI API to Web, Obsidian clients
- Implement using a callback function which manages a queue where new tokens are placed as they come in. As the thread is read from, tokens are removed.
- When the final token has been processed, add the `compiled_references` to the queue to be rendered by the `chat` client
- When the thread has been closed, save the accumulated conversation log in the user's history using a `partial func`
- Incrementally decode tokens on the front end and add them as they appear from the streamed response
- Why
This significantly reduces perceived latency and OpenAI API request timeouts for Chat
Closes https://github.com/khoj-ai/khoj/issues/257
- I needed to install node-fetch to accomplish this, as the built-in request object from Obsidian doesn't seem to support streaming and the built-in fetch object is very sensitive to any and all cross-origin requests
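
The queue-based streaming design described under "What" above can be sketched as below. This is a simplified stand-in rather than the actual implementation: the names `ThreadedGenerator`, `stream_chat`, `save_to_history`, and the `REFERENCES_MARKER` delimiter are assumptions for illustration, and a list of tokens replaces the OpenAI API callback.

```python
import json
import queue
import threading
from functools import partial
from typing import Callable, Iterable, List, Optional

_DONE = object()  # sentinel marking the end of the stream
REFERENCES_MARKER = "### compiled references:"  # hypothetical delimiter


class ThreadedGenerator:
    """Iterator fed by a producer thread through a queue."""

    def __init__(
        self,
        compiled_references: List[str],
        on_complete: Optional[Callable[[str], None]] = None,
    ) -> None:
        self._queue: queue.Queue = queue.Queue()
        self._compiled_references = compiled_references
        self._on_complete = on_complete
        self._response = ""
        self._finished = False

    def __iter__(self) -> "ThreadedGenerator":
        return self

    def __next__(self) -> str:
        if self._finished:
            raise StopIteration
        item = self._queue.get()  # blocks until the producer adds an item
        if item is _DONE:
            self._finished = True
            if self._on_complete is not None:
                # Save the accumulated conversation once the stream closes.
                self._on_complete(self._response)
            raise StopIteration
        return item

    def send(self, token: str) -> None:
        # Called by the producer (the model callback) for each new token.
        self._response += token
        self._queue.put(token)

    def close(self) -> None:
        # After the final token, queue the references for the client to render.
        if self._compiled_references:
            self._queue.put(REFERENCES_MARKER + json.dumps(self._compiled_references))
        self._queue.put(_DONE)


def save_to_history(history: list, user_message: str, chat_response: str) -> None:
    history.append((user_message, chat_response))


def stream_chat(
    tokens: Iterable[str],
    compiled_references: List[str],
    on_complete: Callable[[str], None],
) -> ThreadedGenerator:
    generator = ThreadedGenerator(compiled_references, on_complete)

    def produce() -> None:
        for token in tokens:  # stands in for tokens arriving from the API
            generator.send(token)
        generator.close()

    threading.Thread(target=produce, daemon=True).start()
    return generator
```

A caller would bind the history and user message ahead of time, for example `stream_chat(tokens, references, partial(save_to_history, history, "hi"))`, so that only the final response text needs to be supplied when the stream ends.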