- Dedupe offline_chat_model variable. Only reference offline chat
model stored under offline_chat. Delete the previous chat_model
field under GPT4AllProcessorConfig
- Set offline chat model to use via config/offline_chat API endpoint
This gives the flexibility to use chat models that lack first-party support
- Create migration script to update khoj.yml config
- Put `enable_offline_chat` under new `offline-chat` section
Referring code needs to be updated to accommodate this change
- Move `offline_chat_model` to `chat-model` under new `offline-chat` section
- Put chat `tokenizer` under new `offline-chat` section
- Put `max_prompt` under existing `conversation` section
As `max_prompt` size affects both OpenAI and offline chat models
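The key moves listed above can be sketched roughly as follows. This is a minimal illustration, not the actual migration script: it assumes the config has already been loaded into a plain dict, that the legacy keys sit directly under a `conversation` mapping, and that `migrate_offline_chat` and the example model name are hypothetical.

```python
from copy import deepcopy


def migrate_offline_chat(config: dict) -> dict:
    """Return a copy of `config` with offline chat settings regrouped.

    Hypothetical helper: the real layout of khoj.yml may differ.
    """
    migrated = deepcopy(config)
    # Assumption: the legacy keys live directly under a "conversation" mapping.
    conversation = migrated.setdefault("conversation", {})
    offline_chat = conversation.setdefault("offline-chat", {})

    # Move the legacy flat keys into the new "offline-chat" section.
    if "enable_offline_chat" in conversation:
        offline_chat["enable_offline_chat"] = conversation.pop("enable_offline_chat")
    if "offline_chat_model" in conversation:
        offline_chat["chat-model"] = conversation.pop("offline_chat_model")
    if "tokenizer" in conversation:
        offline_chat["tokenizer"] = conversation.pop("tokenizer")

    # max_prompt stays under "conversation" since it applies to every chat model.
    conversation.setdefault("max_prompt", None)
    return migrated


# Example with a made-up model name.
old = {"conversation": {"enable_offline_chat": True, "offline_chat_model": "example-model.bin"}}
new = migrate_offline_chat(old)
```

Working on a deep copy leaves the loaded config untouched, so the migration can be re-run or compared against the original.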
Pass the user-configured chat model as an argument for converse_offline to use
The proper fix for this would allow users to configure the max_prompt
and tokenizer to use (while supplying default ones, if none provided)
For now, this is a reasonable start.
- Format extract questions prompt with newlines and whitespace
- Make llama v2 extract questions prompt consistent
- Remove empty questions extracted by offline extract_questions actor
- Update implicit qs extraction unit test for offline search actor
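Dropping empty questions, as mentioned above, could look something like this. It is only a sketch under the assumption that the extracted questions arrive as a list of strings; `remove_empty_questions` is a hypothetical name, not the function used by the offline extract_questions actor.

```python
def remove_empty_questions(questions: list[str]) -> list[str]:
    """Drop empty or whitespace-only entries and trim the rest."""
    return [q.strip() for q in questions if q and q.strip()]


cleaned = remove_empty_questions(["What did I do yesterday?", "", "   ", " Where is Khoj? "])
```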
* Use separate functions for adding files and folders to configuration for indexing
* Add a loading bar while data is syncing
* Bump the minor version for the application
- GPT4All integration had ceased working with 0.1.7 specification. Update to use 1.0.12. At a later date, we should also use first party support for llama v2 via gpt4all
- Update the system prompt for the extract_questions flow to add start and end date to the yesterday date filter example.
- Update all setup data in conftest.py to use new client-server indexing pattern
* Remove GPT4All dependency in pyproject.toml and use multiplatform builds in the dockerization setup in GH actions
* Move configure_search method into indexer
* Add conditional installation for gpt4all
* Add hint to go to localhost:42110 in the docs. Addresses #477
* Remove PySide, gui option from code
* Remove pyside 6 dependency from code
* Remove workflows which build desktop applications
* Update unit tests and update line in documentation
* Remove additional references to pyinstaller, gui
* Add uninstall steps to normal uninstall instructions
* Initial version - setup a file-push architecture for generating embeddings with Khoj
* Use state.host and state.port for configuring the URL for the indexer
* Fix parsing of PDF files
* Read markdown files from streamed data and update unit tests
* On application startup, load in embeddings from configurations files, rather than regenerating the corpus based on file system
* Init: refactor indexer/batch endpoint to support a generic file ingestion format
* Add features to better support indexing from files sent by the desktop client
* Initial commit with Electron application
- Adds electron app
* Add import for pymupdf, remove import for pypdf
* Allow user to configure khoj host URL
* Remove search type configuration from index.html
* Use v1 path for current indexer routes
* Update unit tests to fix with new application design
* Allow configure server to be called without regenerating the index; this no longer works because the API for indexing files is not up in time for the server to send a request
Git tag tests/data files with the linguist-vendored attribute to
prevent GitHub from including them in stats.
Otherwise Khoj gets marked as an HTML project due to the
tardigrades HTML page in the tests data, when it is currently
primarily a Python project
- Overview
- Allow applying word, file or date filters on your knowledge base from the chat interface
- This will limit the portion of the knowledge base Khoj chat can use to respond to your query