sij/khoj

mirror of https://github.com/khoj-ai/khoj.git synced 2024-11-30 19:03:01 +01:00

Author	SHA1	Message	Date
Debanjum Singh Solanky	1a9023d396	Update Chat Actor test to not incept with prior world knowledge	2023-10-15 17:22:44 -07:00
Debanjum Singh Solanky	df1d74a879	Use max_prompt_size, tokenizer from config for chat model context stuffing	2023-10-15 16:52:53 -07:00
Debanjum Singh Solanky	116595b351	Use chat_model specified in new offline_chat section of config - Dedupe offline_chat_model variable. Only reference offline chat model stored under offline_chat. Delete the previous chat_model field under GPT4AllProcessorConfig - Set offline chat model to use via config/offline_chat API endpoint	2023-10-15 16:37:49 -07:00
Debanjum Singh Solanky	feb4f17e3d	Update chat config schema. Make max_prompt, chat tokenizer configurable This provides flexibility to use non 1st party supported chat models - Create migration script to update khoj.yml config - Put `enable_offline_chat' under new `offline-chat' section Referring code needs to be updated to accommodate this change - Move `offline_chat_model' to `chat-model' under new `offline-chat' section - Put chat `tokenizer` under new `offline-chat' section - Put `max_prompt' under existing `conversation' section As `max_prompt' size affects both openai and offline chat models	2023-10-15 16:35:11 -07:00
Debanjum Singh Solanky	247e75595c	Use AutoTokenizer to support more tokenizers	2023-10-14 16:54:52 -07:00
Debanjum Singh Solanky	1ad8b150e8	Add default tokenizer, max_prompt as fallback for non-default offline chat models Pass user configured chat model as argument to use by converse_offline The proper fix for this would allow users to configure the max_prompt and tokenizer to use (while supplying default ones, if none provided) For now, this is a reasonable start.	2023-10-13 22:48:56 -07:00
Debanjum Singh Solanky	56bd69d5af	Improve Llama v2 extract questions actor and associated prompt - Format extract questions prompt format with newlines and whitespaces - Make llama v2 extract questions prompt consistent - Remove empty questions extracted by offline extract_questions actor - Update implicit qs extraction unit test for offline search actor	2023-10-13 22:48:56 -07:00
Debanjum Singh Solanky	a85ff941ca	Make offline chat model user configurable Only GPT4All supported Llama v2 models will work given the prompt structure is not currently configurable	2023-10-04 20:41:14 -07:00
Debanjum Singh Solanky	d1ff812021	Run GPT4All Chat Model on GPU, when available GPT4All now supports running models on GPU via Vulkan	2023-10-04 18:42:12 -07:00
Debanjum Singh Solanky	13b16a4364	Use default Llama 2 supported by GPT4All Remove custom logic to download custom Llama 2 model. This was added as GPT4All didn't support Llama 2 when it was added to Khoj	2023-10-03 19:01:54 -07:00
sabaimran	4a5ed7f06c	Update Khoj package version for Electron, Desktop app (#492) * Address package upgrade for Electron application * Update package version for Electron desktop application	2023-10-03 12:21:32 -07:00
sabaimran	3f962a55c3	Fix Linux Desktop Application (#491) * Use separate functions for adding files and folders to configuration for indexing * Add a loading bar while data is syncing * Bump the minor version for the application	2023-10-03 11:43:19 -07:00
sabaimran	63b3696af0	Release Khoj version 0.12.3	2023-09-26 22:41:11 -07:00
sabaimran	d2f9bca1cf	Fix null ref issue in query method and update logic for determining whether khoj is already configured in obsidian	2023-09-26 22:33:44 -07:00
sabaimran	2f18383349	Release Khoj version 0.12.2	2023-09-26 11:59:47 -07:00
sabaimran	588f35b6e9	Add max prompt size for gpt-3.5-turbo-16k	2023-09-26 10:57:35 -07:00
sabaimran	99f9c3f8e2	Update setup instructions	2023-09-26 09:40:36 -07:00
sabaimran	4e370d7a18	Release Khoj version 0.12.1	2023-09-26 09:24:53 -07:00
sabaimran	3675aa348a	Update naming of Khoj in manifest.json for Obsidian	2023-09-26 09:24:36 -07:00
sabaimran	4b6d8af218	Update metadata in manifest.json	2023-09-26 09:19:56 -07:00
sabaimran	a82d1becc3	Release Khoj version 0.12.0	2023-09-26 09:17:56 -07:00
sabaimran	38f0df3d53	Remove unused icons from electron app folder	2023-09-26 07:56:29 -07:00
sabaimran	29a64be939	Deprecate desktop build instructions from old setup	2023-09-25 22:02:02 -07:00
sabaimran	99995b2497	Add basic instructions for setting up the Khoj desktop interface	2023-09-25 21:08:14 -07:00
sabaimran	5e16074b92	Fix comparison for search type in plugins mode	2023-09-25 10:57:17 -07:00
sabaimran	efe5e09c3a	Use jammy for docker base image due to dependency issue with arm64 image	2023-09-18 15:38:18 -07:00
sabaimran	6df728c445	Move bash command in Dockerfile into single line	2023-09-18 15:13:11 -07:00
sabaimran	96a9fa07f0	Fix conf test setup for offline chat	2023-09-18 15:05:15 -07:00
sabaimran	2dd15e9f63	Resolve issues with GPT4All and fix prompt for yesterday extract questions date filter (#483) - GPT4All integration had ceased working with 0.1.7 specification. Update to use 1.0.12. At a later date, we should also use first party support for llama v2 via gpt4all - Update the system prompt for the extract_questions flow to add start and end date to the yesterday date filter example. - Update all setup data in conftest.py to use new client-server indexing pattern	2023-09-18 14:41:26 -07:00
sabaimran	8141be97f6	Update date filter test to use compiled rather than raw key	2023-09-18 11:24:56 -07:00
sabaimran	b225d1188c	Fix formatting of gpt.py	2023-09-18 11:09:02 -07:00
Jonny-GM	34b202b868	More lenient date searching (#481) * Modify DateFilter to use compiled entry key * Instruct search to include date in query * Minor prompt change * Prompt fix	2023-09-18 10:46:00 -07:00
sabaimran	16874e1953	Provide force fallback for regeneration	2023-09-12 16:35:07 -07:00
sabaimran	9f42a1a036	Propagate flags to configure index command	2023-09-11 10:33:44 -07:00
sabaimran	343854752c	Improve docker builds for local hosting (#476) * Remove GPT4All dependency in pyproject.toml and use multiplatform builds in the dockerization setup in GH actions * Move configure_search method into indexer * Add conditional installation for gpt4all * Add hint to go to localhost:42110 in the docs. Addresses #477	2023-09-08 17:07:26 -07:00
sabaimran	dccfae3853	Remove PySide dependency and deprecate desktop builds (#475) * Remove PySide, gui option from code * Remove pyside 6 dependency from code * Remove workflows which build desktop applications * Update unit tests and update line in documentation * Remove additional references to pyinstaller, gui * Add uninstall steps to normal uninstall instructions	2023-09-07 11:36:27 -07:00
sabaimran	76562f4250	Add front-end Electron application for Khoj local file syncing (#473) * Initial version - setup a file-push architecture for generating embeddings with Khoj * Use state.host and state.port for configuring the URL for the indexer * Fix parsing of PDF files * Read markdown files from streamed data and update unit tests * On application startup, load in embeddings from configurations files, rather than regenerating the corpus based on file system * Init: refactor indexer/batch endpoint to support a generic file ingestion format * Add features to better support indexing from files sent by the desktop client * Initial commit with Electron application - Adds electron app * Add import for pymupdf, remove import for pypdf * Allow user to configure khoj host URL * Remove search type configuration from index.html * Use v1 path for current indexer routes	2023-09-06 12:04:18 -07:00
bholagabbar	205dc90746	Fix notion title bug (#474) * Update notion_to_jsonl.py * Fix try-catch block	2023-09-05 10:47:42 -07:00
sabaimran	922222a813	Fix anyio package version to avoid backwards compatibility issue with start_blocking_portal method	2023-08-31 14:14:13 -07:00
sabaimran	4854258047	Move to a push-first model for retrieving embeddings from local files (#457) * Initial version - setup a file-push architecture for generating embeddings with Khoj * Update unit tests to fix with new application design * Allow configure server to be called without regenerating the index; this no longer works because the API for indexing files is not up in time for the server to send a request * Use state.host and state.port for configuring the URL for the indexer * On application startup, load in embeddings from configurations files, rather than regenerating the corpus based on file system	2023-08-31 12:55:17 -07:00
sabaimran	92cbfef7ab	Skip plaintext file indexing if there's a parsing issue and log the file	2023-08-29 14:34:08 -07:00
sabaimran	74409c2c64	Release Khoj version 0.11.4	2023-08-29 11:44:35 -07:00
sabaimran	1b85958bcc	trim chat input start	2023-08-28 19:18:10 -07:00
sabaimran	e592f6eac8	Release Khoj version 0.11.3	2023-08-28 14:46:03 -07:00
sabaimran	7c35da9fc4	Fix bug in /chat endpoint for general and update dependencies	2023-08-28 14:12:11 -07:00
Debanjum Singh Solanky	c93dcc948a	Exclude tests data file from programming stats on Github Git tag tests/data files with the linguist-vendored attribute to prevent github from including them in stats. Otherwise Khoj is getting marked as an HTML project due to the tardigrades html page in tests data, when it's primarily a python project currently	2023-08-28 11:00:52 -07:00
Debanjum Singh Solanky	59ffd1dc94	Document slash command and query filter in docs for chat and search	2023-08-28 11:00:52 -07:00
sabaimran	bc09143856	Release Khoj version 0.11.2	2023-08-28 10:16:13 -07:00
Debanjum	bc5e60defb	Filter knowledge base used by chat to respond (#469) - Overview - Allow applying word, file or date filters on your knowledge base from the chat interface - This will limit the portion of the knowledge base Khoj chat can use to respond to your query	2023-08-28 09:32:33 -07:00
Debanjum Singh Solanky	01b310635e	Enable passing search query filters via chat and test it	2023-08-28 09:24:32 -07:00
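
Commit feb4f17e3d in the log above describes its config schema change only in prose. A minimal sketch of that reshaping, assuming the `conversation` section of khoj.yml has already been loaded into a plain dict. The function name `migrate_conversation_section` is hypothetical, the key spellings are copied from the commit message rather than from the repository, and the `None` placeholders for the new fields are an assumption; the real migration script may differ:

```python
from copy import deepcopy


def migrate_conversation_section(conversation: dict) -> dict:
    """Return a copy of the conversation config in the regrouped layout.

    Hypothetical helper: key names follow the commit message text and
    are not verified against the Khoj codebase.
    """
    migrated = deepcopy(conversation)

    # Leave configs that already use the new layout untouched
    if "offline-chat" in migrated:
        return migrated

    # Group the offline chat settings under a new `offline-chat` section
    migrated["offline-chat"] = {
        "enable_offline_chat": migrated.pop("enable_offline_chat", False),
        "chat-model": migrated.pop("offline_chat_model", None),
        "tokenizer": None,  # assumed placeholder: no tokenizer configured yet
    }

    # `max_prompt` stays at the conversation level, since per the commit
    # message it applies to both openai and offline chat models
    migrated.setdefault("max_prompt", None)
    return migrated


old_section = {
    "enable_offline_chat": True,
    "offline_chat_model": "example-model.bin",  # illustrative value
    "openai": {"chat-model": "gpt-3.5-turbo"},
}
new_section = migrate_conversation_section(old_section)
```

The sketch copies its input and returns early when the `offline-chat` section already exists, so running it a second time leaves the config unchanged.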

1635 commits