Commit graph

3960 commits

Author SHA1 Message Date
Debanjum
83ed8561ee
Reduce size of Docker image and build it from local code
- Improvements
  - Install Khoj on Docker from local code instead of pulling from Github
  - Reduce Khoj Docker image size by 2Gb by not caching installed pip packages. Refer [issue comment](https://github.com/khoj-ai/khoj/issues/148#issuecomment-1627443570)
2023-07-11 01:30:06 -07:00
HyunggyuJang
88c42b3043
Encode data as utf-8
otherwise it will complain, see 1c85531090
2023-07-11 17:06:05 +09:00
Debanjum Singh Solanky
6308388dfc Install Khoj on Docker from local app instead of pulling from github
Just use a random static version for Khoj on the Docker as otherwise
the hatch vcs dynamic versioning requires the .git directory in the
docker image too
2023-07-11 00:41:05 -07:00
Debanjum Singh Solanky
802472cd99 Reduce Khoj Docker image size by 2Gb by not caching pip packages
Resolve #148
2023-07-10 23:27:02 -07:00
Debanjum Singh Solanky
f664a74e77 Update Khoj server to run on non standard port, 42110 instead of 8000
Resolves #295
2023-07-10 21:27:58 -07:00
Debanjum Singh Solanky
bfd516c1a4 Deprecate (unmaintained) support to setup Khoj via Conda 2023-07-10 21:27:58 -07:00
Debanjum Singh Solanky
58c2c3b71a Add Documentation to Release Khoj 2023-07-10 21:27:58 -07:00
sabaimran
effb52f859 Fix demo rendering with the new header 2023-07-10 21:16:19 -07:00
sabaimran
55f5be7b03 Release Khoj version 0.8.2 2023-07-10 14:39:32 -07:00
sabaimran
9a63f89f33 Merge branch 'master' of github.com:debanjum/khoj 2023-07-10 14:31:19 -07:00
sabaimran
53809298c0 Release Khoj version 0.8.1 2023-07-10 14:30:04 -07:00
tjsousa
5b37e988e6
Allow using configured GPT chat model (#292)
My account doesn't have gpt-4 enabled and it wouldn't work as the default value was always used from extract_questions, where the caller could use the configured model.
2023-07-10 14:24:40 -07:00
Debanjum Singh Solanky
75ff871217 Release Khoj version 0.8.0 2023-07-10 13:37:51 -07:00
Debanjum Singh Solanky
979088b3dc Add tooltip helper text on web settings page buttons
- Provide more details on what clicking configure, initialize buttons
  or changing the results count slider does
- This shows up on user hovering over those buttons
2023-07-10 13:32:41 -07:00
Debanjum Singh Solanky
255781e135 Use relative link on logo to jump to correct page on local and cloud 2023-07-10 13:22:20 -07:00
Debanjum Singh Solanky
b2d229c116 Move header pane style to base khoj.css for reuse. Fix logo size 2023-07-10 13:10:17 -07:00
Debanjum Singh Solanky
f4cef377ca Add details to run, configure Khoj via Web in Readme 2023-07-10 12:10:20 -07:00
Debanjum Singh Solanky
20cb314171 Open the Khoj config page in the browser on first run 2023-07-10 12:10:20 -07:00
sabaimran
07cf5a214a
Check if PDF files are present in the Obsidian vault before initializing the Khoj configuration (#293) 2023-07-10 10:33:04 -07:00
sabaimran
7364bac8ae Make the header take up less space
- Use a single row for the header
- Needed custom styling for each page because each of them are different in subtle ways, unfortunately
2023-07-09 22:31:37 -07:00
sabaimran
62704cac09
Add a plugin which allows users to index their Notion pages (#284)
* For the demo instance, re-instate the scheduler, but infrequently for api updates
- In constants, determine the cadence based on whether it's a demo instance or not
- This allow us to collect telemetry again. This will also allow us to save the chat session
* Conditionally skip updating the index altogether if it's a demo isntance
* Add backend support for Notion data parsing
- Add a NotionToJsonl class which parses the text of Notion documents made accessible to the API token
- Make corresponding updates to the default config, raw config to support the new notion addition
* Add corresponding views to support configuring Notion from the web-based settings page
- Support backend APIs for deleting/configuring notion setup as well
- Streamline some of the index updating code
* Use defaults for search and chat queries results count
* Update pagination of retrieving pages from Notion
* Update state conversation processor when update is hit
* frequency_penalty should be passed to gpt through kwargs
* Add check for notion in render_multiple method
* Add headings to Notion render
* Revert results count slider and split Notion files by blocks
* Clean/fix misc things in the function to update index
- Use the successText and errorText variables appropriately
- Name parameters in function calls
- Add emojis, woohoo
* Clean up and further modularize code for processing data in Notion
2023-07-09 15:29:26 -07:00
Debanjum
77755c0284
Fix Packaging the Khoj Desktop Apps (#289)
* Add langchain static files and pytorch metadata to Khoj native app

* Add pillow static files, metadata & hidden imports to Khoj native app

* Fix path to web interface static files on Khoj native app

* Add tiktoken hidden imports to make chat work from Khoj native app

* Fix Khoj native app to run with GUI mode enabled

This got broken when we moved from using the --no-gui flag to using
--gui in https://github.com/khoj-ai/khoj/pull/263
2023-07-09 10:21:16 -07:00
sabaimran
4c135ea316
Make streaming optional for the /chat endpoint (#287)
* Update the /chat endpoint to conditionally support streaming

- If streams are enabled, return the threadgenerator as it does currently
- If stream is disabled, return a JSON response with the response/compiled references separated out
- Correspondingly, update the chat.html UI to use the streamed API, as well as Obsidian
- Rename chat/init/ to chat/history

* Update khoj.el to use the /history endpoint

- Update corresponding unit tests to use stream=true

* Remove & from call to /chat for obsidian

* Abstract functions out into a helpers.py file and clean up some of the error-catching
2023-07-09 10:12:09 -07:00
Debanjum Singh Solanky
0a86220d42 Use default values, delete content config on disable and update state 2023-07-07 20:36:16 -07:00
Debanjum Singh Solanky
362063f5fe By default, connect to Khoj server over IPv4 from Obsidian plugin 2023-07-07 20:36:16 -07:00
Debanjum Singh Solanky
571e8c2548 Add rerank, index corruption hint on search page of web interface
Similar to the hint alrady in the Obsidian search modal
Closes #272
2023-07-07 20:36:16 -07:00
Debanjum
4b79d8216f
Move remaining chat actors to use OpenAI chat models
- Deprecate the unused beta /answer and /search type identification endpoints and associated GPT functions
- Update extract_questions to use GPT4
- Update summarize method to default to GPT-3.5
- Update date filter to support quoting values in single quotes too. So now both dt>'2023-04-01' and dt>"2023-04-01" should work
- Remove "model" field from chat settings on the web interface
2023-07-07 18:53:05 -07:00
Debanjum Singh Solanky
61e131f95c Hide unused model field from chat settings on web interface 2023-07-07 18:43:53 -07:00
Debanjum Singh Solanky
af30d01e85 Move to newer chat models to extract questions & summarize chats
Deprecate usage of the older gpt3 models in-place of the newer chat
based models
- text-davinci-003 is only 50% cheaper than gpt4 and less reliable for
  question extraction
- Using gpt-3.50turbo for summarization should reduce cost of chat

- Keep conversation.chat_session as a list instead of a string
- Update completion_with_backoff func to use ChatML format
2023-07-07 17:32:27 -07:00
Debanjum Singh Solanky
171ce19e1f Update date filter to allow quoting values in single quotes 2023-07-07 17:13:47 -07:00
Debanjum Singh Solanky
e588f7c528 Deprecate unused beta search and answer API endpoints 2023-07-07 16:38:07 -07:00
Debanjum Singh Solanky
c9fc4d1296 Revert to using cross-encoder to improve search results used by chat 2023-07-07 15:31:34 -07:00
Debanjum Singh Solanky
11f0a9f196 Fix chat tests since streaming. Pass args correctly to chat methods
- Fix testing gpt converse method after it started streaming responses
- Pass stop in model_kwargs dictionary and api key in openai_api_key
  parameter to chat completion methods. This should resolve the arg
  warning thrown by OpenAI module
2023-07-07 15:23:44 -07:00
Debanjum Singh Solanky
48870d9170 Fix parsing questions generated by extract_questions actor into list
The previous json parsing was failing to handle questions with date
filters

Fix the chat actor tests to run without throwing error with freezegun
complaining about importing transformers.local_llama model

Remove quote escapes from date filter examples provided to
extract_questions actor
2023-07-07 15:18:55 -07:00
Debanjum Singh Solanky
279662620b Move results count to settings page on web. Use it for search & chat
- Before
  Only the search interface had the results count configuration option

- After
  - The results count is set on the settings page instead of the
    search page
  - Both search and chat can use the configured results count instead
    of just search
2023-07-07 14:08:08 -07:00
Debanjum Singh Solanky
2ec8da89e8 Remove Update button from Khoj Search page on the Web interface
The settings page on the Khoj web interface already has a configure
button. Don't need the Update button on the search page as well
2023-07-07 12:49:58 -07:00
Debanjum Singh Solanky
bf427cd8dd Set no. of results used to generate chat response from Khoj Emacs 2023-07-07 12:34:50 -07:00
Debanjum Singh Solanky
1d77fe712c Set no. of results used to generate chat response from Khoj Obsidian 2023-07-07 12:32:32 -07:00
Debanjum Singh Solanky
2f31de5ed5 Set no. of references to use for chat configurable in Chat API 2023-07-07 12:29:36 -07:00
Debanjum Singh Solanky
d97682fdac Use tooltip, placeholders to guide Khoj setup via web settings page 2023-07-06 21:37:48 -07:00
Debanjum Singh Solanky
f5cf09424b Use more descriptive field names for content type settings on Khoj web
Resolves #281
2023-07-06 20:47:39 -07:00
Debanjum Singh Solanky
a2c668268f Use node-fetch >=3.1.0 in khoj obsidian plugin to avoid security vulnerability 2023-07-06 13:05:52 -07:00
sabaimran
d688ddf92c
Re-instate the scheduler for the demo instances (#279)
* For the demo instance, re-instate the scheduler, but infrequently for api updates

- In constants, determine the cadence based on whether it's a demo instance or not
- This allow us to collect telemetry again. This will also allow us to save the chat session

* Conditionally skip updating the index altogether if it's a demo isntance
2023-07-06 11:01:32 -07:00
Debanjum Singh Solanky
8f36572a9b Improve typing, null checks in controllers and gpt functions 2023-07-05 20:49:25 -07:00
Debanjum Singh Solanky
41ac1e24c9 Add docs for a pre-emptive setup of Khoj for later offline usage
Closes #151
2023-07-05 20:48:51 -07:00
Debanjum
6c2a8a5bce
️ Stream Responses by Khoj Chat on Web, Obsidian
- What
   - Stream chat responses from OpenAI API to Web, Obsidian clients
      - Implement using a callback function which manages a queue where new tokens can be placed as they come on. As the thread is read from, tokens are removed.
      - When the final token has been processed, add the `compiled_references` to the queue to be rendered by the `chat` client
      - When the thread has been closed, save the accumulated conversation log in the user's history using a `partial func`
      - Incrementally decode tokens on the front end and add them as they appear from the streamed response

- Why
This significantly reduces perceived latency and OpenAI API request timeouts for Chat

Closes https://github.com/khoj-ai/khoj/issues/257
2023-07-05 20:02:11 -07:00
Debanjum Singh Solanky
e111eda6ae Make client, app_config optional in telemetry logger for correct typing 2023-07-05 18:57:38 -07:00
Debanjum Singh Solanky
e562114f6b Improve comments, var names in js for chat streaming on web interface 2023-07-05 18:57:27 -07:00
Debanjum Singh Solanky
46269ddfd3 Fix chat logging messages to get context without flooding logs 2023-07-05 18:27:06 -07:00
Debanjum Singh Solanky
0ba838b53a Show temp status message in Khoj Obsidian chat while Khoj is thinking
- Scroll to bottom after adding temporary status message and
references too
2023-07-05 18:02:43 -07:00