Commit graph

3509 commits

Author SHA1 Message Date
sabaimran
6df728c445 Move bash command in Dockerfile into single line 2023-09-18 15:13:11 -07:00
sabaimran
96a9fa07f0 Fix conf test setup for offline chat 2023-09-18 15:05:15 -07:00
sabaimran
2dd15e9f63
Resolve issues with GPT4All and fix prompt for yesterday extract questions date filter (#483)
- GPT4All integration had ceased working with 0.1.7 specification. Update to use 1.0.12. At a later date, we should also use first party support for llama v2 via gpt4all
- Update the system prompt for the extract_questions flow to add start and end date to the yesterday date filter example.
- Update all setup data in conftest.py to use new client-server indexing pattern
2023-09-18 14:41:26 -07:00
sabaimran
8141be97f6 Update date filter test to use compiled rather than raw key 2023-09-18 11:24:56 -07:00
sabaimran
b225d1188c Fix formatting of gpt.py 2023-09-18 11:09:02 -07:00
Jonny-GM
34b202b868
More lenient date searching (#481)
* Modify DateFilter to use compiled entry key
* Instruct search to include date in query
* Minor prompt change
* Prompt fix
2023-09-18 10:46:00 -07:00
sabaimran
16874e1953 Provide force fallback for regeneration 2023-09-12 16:35:07 -07:00
sabaimran
9f42a1a036 Propagate flags to configure index command 2023-09-11 10:33:44 -07:00
sabaimran
343854752c
Improve docker builds for local hosting (#476)
* Remove GPT4All dependency in pyproject.toml and use multiplatform builds in the dockerization setup in GH actions
* Move configure_search method into indexer
* Add conditional installation for gpt4all
* Add hint to go to localhost:42110 in the docs. Addresses #477
2023-09-08 17:07:26 -07:00
sabaimran
dccfae3853
Remove PySide dependency and deprecate desktop builds (#475)
* Remove PySide, gui option from code
* Remove pyside 6 dependency from code
* Remove workflows which build desktop applications
* Update unit tests and update line in documentation
* Remove additional references to pyinstaller, gui
* Add uninstall steps to normal uninstall instructions
2023-09-07 11:36:27 -07:00
sabaimran
76562f4250
Add front-end Electron application for Khoj local file syncing (#473)
* Initial version - setup a file-push architecture for generating embeddings with Khoj
* Use state.host and state.port for configuring the URL for the indexer
* Fix parsing of PDF files
* Read markdown files from streamed data and update unit tests
* On application startup, load in embeddings from configurations files, rather than regenerating the corpus based on file system
* Init: refactor indexer/batch endpoint to support a generic file ingestion format
* Add features to better support indexing from files sent by the desktop client
* Initial commit with Electron application
- Adds electron app
* Add import for pymupdf, remove import for pypdf
* Allow user to configure khoj host URL
* Remove search type configuration from index.html
* Use v1 path for current indexer routes
2023-09-06 12:04:18 -07:00
bholagabbar
205dc90746
Fix notion title bug (#474)
* Update notion_to_jsonl.py
* Fix try-catch block
2023-09-05 10:47:42 -07:00
sabaimran
922222a813 Fix anyio package version to avoid backwards compatibility issue with start_blocking_portal method 2023-08-31 14:14:13 -07:00
sabaimran
4854258047
Move to a push-first model for retrieving embeddings from local files (#457)
* Initial version - setup a file-push architecture for generating embeddings with Khoj
* Update unit tests to fix with new application design
* Allow configure server to be called without regenerating the index; this no longer works because the API for indexing files is not up in time for the server to send a request
* Use state.host and state.port for configuring the URL for the indexer
* On application startup, load in embeddings from configurations files, rather than regenerating the corpus based on file system
2023-08-31 12:55:17 -07:00
sabaimran
92cbfef7ab Skip plaintext file indexing if there's a parsing issue and log the file 2023-08-29 14:34:08 -07:00
sabaimran
74409c2c64 Release Khoj version 0.11.4 2023-08-29 11:44:35 -07:00
sabaimran
1b85958bcc trim chat input start 2023-08-28 19:18:10 -07:00
sabaimran
e592f6eac8 Release Khoj version 0.11.3 2023-08-28 14:46:03 -07:00
sabaimran
7c35da9fc4 Fix bug in /chat endpoint for general and update depdendencies 2023-08-28 14:12:11 -07:00
Debanjum Singh Solanky
c93dcc948a Exclude tests data file from programming stats on Github
Git tag tests/data files with the linguist-vendored attribute to
prevent github from including them in stats.

Otherwise Khoj is getting marked as an HTML project due to the
tardigrades html page in tests data, when it's primarily a python
project currently
2023-08-28 11:00:52 -07:00
Debanjum Singh Solanky
59ffd1dc94 Document slash command and query filter in docs for chat and search 2023-08-28 11:00:52 -07:00
sabaimran
bc09143856 Release Khoj version 0.11.2 2023-08-28 10:16:13 -07:00
Debanjum
bc5e60defb
Filter knowledge base used by chat to respond (#469)
- Overview
  - Allow applying word, file or date filters on your knowledge base from the chat interface
  - This will limit the portion of the knowledge base Khoj chat can use to respond to your query
2023-08-28 09:32:33 -07:00
Debanjum Singh Solanky
01b310635e Enable passing search query filters via chat and test it 2023-08-28 09:24:32 -07:00
Debanjum Singh Solanky
794bad8bcb Make date_filter.extract_date_range method always return a list type 2023-08-28 00:55:28 -07:00
Debanjum Singh Solanky
d5a2de6222 Add method to extract filter terms from query to all filters
- Test the get_filter_term method in all 3 word, file, date filters
- Make the existing can_filter method by default in base filter abstract class
2023-08-28 00:55:28 -07:00
Debanjum
150105505b
Add Default chat command. Make Khoj ask clarifying questions (#468)
- Make Khoj ask clarifying questions when answer not in provided context
- Add default conversation command to auto switch b/w general, notes modes
- Show filtered list of commands available with the currently input text
- Use general prompt when no references found and not in Notes mode
- Test general and notes slash commands in offline chat director tests
2023-08-28 00:52:57 -07:00
Debanjum Singh Solanky
319f066aec Test general and notes slash commands in offline chat director tests 2023-08-28 00:47:02 -07:00
Debanjum Singh Solanky
eb6cd4f8d0 Use general prompt when no references found and not in Notes mode 2023-08-28 00:47:02 -07:00
Debanjum Singh Solanky
edffbad837 Make Khoj ask clarifying questions when answer not in provided context
Previously it would just refuse ask for clarification. This improves
the chat quality score for the existing director tests
2023-08-28 00:47:02 -07:00
Debanjum Singh Solanky
75c1016ec0 Show filtered list of commands available with the currently input text 2023-08-28 00:46:10 -07:00
Debanjum Singh Solanky
74605f6159 Add default conversation command to auto switch b/w general, notes modes
This was the default behavior but behavior regressed when adding slash
commands in PR #463
2023-08-28 00:46:10 -07:00
sabaimran
cbc978ea08 Update help links for notion, github to point to the main docs 2023-08-27 15:02:55 -07:00
sabaimran
b45e1d8c0d
Fix plaintext HTML parsing and rendering (#464)
* Store conversation command options in an Enum
* Move to slash commands instead of using @ to specify general commands
* Calculate conversation command once & pass it as arg to child funcs
* Add /notes command to respond using only knowledge base as context
This prevents the chat model to try respond using it's general world
knowledge only without any references pulled from the indexed
knowledge base
* Test general and notes slash commands in openai chat director tests
---------

Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>
2023-08-27 11:24:30 -07:00
Debanjum
7919787fb7
Use Slash Commands and Add Notes Slash Command (#463)
* Store conversation command options in an Enum

* Move to slash commands instead of using @ to specify general commands

* Calculate conversation command once & pass it as arg to child funcs

* Add /notes command to respond using only knowledge base as context

This prevents the chat model to try respond using it's general world
knowledge only without any references pulled from the indexed
knowledge base

* Test general and notes slash commands in openai chat director tests

* Update gpt4all tests to use md configuration

* Add a /help tooltip

* Add dynamic support for describing slash commands. Remove default and treat notes as the default type

---------

Co-authored-by: sabaimran <narmiabas@gmail.com>
2023-08-26 18:11:18 -07:00
sabaimran
e64357698d
Skip indexing single bad markdown, plaintext file (#460) 2023-08-23 15:34:56 -07:00
sabaimran
84bd579077 Format the chat outputted message with code, bolding, or italics. Add a copy button for code. Closes #445. 2023-08-19 20:02:57 -07:00
sabaimran
f9e09ba490 Do not try downloading model from GPT4All if the user is not connected to the internet 2023-08-19 19:09:21 -07:00
Debanjum Singh Solanky
3ff4e19dd2 Release Khoj version 0.11.1 2023-08-16 22:53:29 -07:00
sabaimran
4fb8c2c5e1 Pass a SIGTERM to tell the uvicorn server to exit and gracefully kill the thread 2023-08-16 21:27:05 -07:00
Debanjum Singh Solanky
34d5cd2bd8 Increase pytests workflow timeout duration to reduce intermittent failures
The test workflow fails regularly with an OperationCancelled error.
This is an intermittent failure that gets resolved on running the
failed workflows a few times.
2023-08-16 20:00:36 -07:00
sabaimran
4e03dfea43
Attach the parent to the server thread, allowing the kill signal to trigger a graceful exit (#446) 2023-08-16 19:36:10 -07:00
Debanjum Singh Solanky
3c58ab5fcb Unmark Python 3.8 as supported in khoj-assistant pypi package 2023-08-16 00:58:59 -07:00
Debanjum Singh Solanky
26c3977fb9 Remove info hint to reindex khoj on unexpected search results
The index corruption was issue resolved a while ago in #325 and
hasn't cropped up again
2023-08-16 00:58:59 -07:00
sabaimran
def909a913
Revert "Open Web interface within Desktop app in GUI mode" (#444) 2023-08-15 23:26:28 -07:00
sabaimran
6562ec6531 Release Khoj version 0.11.0 2023-08-14 19:25:03 -07:00
sabaimran
064b2fbc4a
Add a link to the FAQ in our docs (#438)
* Add a link to faq.khoj.dev in the docs
2023-08-14 15:05:08 -07:00
sabaimran
0ea901c7c1
Allow indexing to continue even if there's an issue parsing a particular org file (#430)
* Allow indexing to continue even if there's an issue parsing a particular org file
* Use approximation in pytorch comparison in text_search UT, skip additional file parser errors for org files
* Change error of expected failure
2023-08-14 07:56:33 -07:00
sabaimran
7b907add77
Add support for indexing plaintext files (#420)
* Add support for indexing plaintext files
- Adds backend support for parsing plaintext files generically (.html, .txt, .xml, .csv, .md)
- Add equivalent frontend views for setting up plaintext file indexing
- Update config, rawconfig, default config, search API, setup endpoints
* Add a nifty plaintext file icon to configure plaintext files in the Web UI
* Use generic glob path for plaintext files. Skip indexing files that aren't in whitelist
2023-08-09 15:44:40 -07:00
Debanjum Singh Solanky
84d774ea34 Retain desktop builds for 3 days to allow user tests
Upgrade minimum tiktoken version to work for encoding gpt4
2023-08-08 23:02:13 -07:00