- Memory consumption now only scales with search models used, not with
content types as well. Previously each content type had it's own
copy of the search ML models. That'd result in 300+ Mb per enabled
content type
- Split model state into 2 separate state objects, `search_models' and
`content_index'.
This allows loading text_search and image_search models first and then
reusing them across all content_types in content_index
- This should cut down memory utilization quite a bit for most users.
I see a ~50% drop in memory utilization.
This will, of course, vary for each user based on the amount of
content indexed vs number of plugins enabled
- This does not solve the RAM utilization scaling with size of the index.
As the whole content index is still kept in RAM while Khoj is running
Should help with #195, #301 and #303
* Add a Github workflow that allows you to build dev versions of Desktop applications
* Add pull_request trigger for testing
* Fix errant open quote in Package Khoj App step
* Nix the release step, since this isn't associated with any tags
- Set retention period for uploaded artifacts to 1 day
* Remove pull_request trigger - limit to manual triggers and pushes to master
Just use a random static version for Khoj on the Docker as otherwise
the hatch vcs dynamic versioning requires the .git directory in the
docker image too
My account doesn't have gpt-4 enabled and it wouldn't work as the default value was always used from extract_questions, where the caller could use the configured model.
- Provide more details on what clicking configure, initialize buttons
or changing the results count slider does
- This shows up on user hovering over those buttons
* For the demo instance, re-instate the scheduler, but infrequently for api updates
- In constants, determine the cadence based on whether it's a demo instance or not
- This allow us to collect telemetry again. This will also allow us to save the chat session
* Conditionally skip updating the index altogether if it's a demo isntance
* Add backend support for Notion data parsing
- Add a NotionToJsonl class which parses the text of Notion documents made accessible to the API token
- Make corresponding updates to the default config, raw config to support the new notion addition
* Add corresponding views to support configuring Notion from the web-based settings page
- Support backend APIs for deleting/configuring notion setup as well
- Streamline some of the index updating code
* Use defaults for search and chat queries results count
* Update pagination of retrieving pages from Notion
* Update state conversation processor when update is hit
* frequency_penalty should be passed to gpt through kwargs
* Add check for notion in render_multiple method
* Add headings to Notion render
* Revert results count slider and split Notion files by blocks
* Clean/fix misc things in the function to update index
- Use the successText and errorText variables appropriately
- Name parameters in function calls
- Add emojis, woohoo
* Clean up and further modularize code for processing data in Notion
* Add langchain static files and pytorch metadata to Khoj native app
* Add pillow static files, metadata & hidden imports to Khoj native app
* Fix path to web interface static files on Khoj native app
* Add tiktoken hidden imports to make chat work from Khoj native app
* Fix Khoj native app to run with GUI mode enabled
This got broken when we moved from using the --no-gui flag to using
--gui in https://github.com/khoj-ai/khoj/pull/263
* Update the /chat endpoint to conditionally support streaming
- If streams are enabled, return the threadgenerator as it does currently
- If stream is disabled, return a JSON response with the response/compiled references separated out
- Correspondingly, update the chat.html UI to use the streamed API, as well as Obsidian
- Rename chat/init/ to chat/history
* Update khoj.el to use the /history endpoint
- Update corresponding unit tests to use stream=true
* Remove & from call to /chat for obsidian
* Abstract functions out into a helpers.py file and clean up some of the error-catching
- Deprecate the unused beta /answer and /search type identification endpoints and associated GPT functions
- Update extract_questions to use GPT4
- Update summarize method to default to GPT-3.5
- Update date filter to support quoting values in single quotes too. So now both dt>'2023-04-01' and dt>"2023-04-01" should work
- Remove "model" field from chat settings on the web interface
Deprecate usage of the older gpt3 models in-place of the newer chat
based models
- text-davinci-003 is only 50% cheaper than gpt4 and less reliable for
question extraction
- Using gpt-3.50turbo for summarization should reduce cost of chat
- Keep conversation.chat_session as a list instead of a string
- Update completion_with_backoff func to use ChatML format
- Fix testing gpt converse method after it started streaming responses
- Pass stop in model_kwargs dictionary and api key in openai_api_key
parameter to chat completion methods. This should resolve the arg
warning thrown by OpenAI module
The previous json parsing was failing to handle questions with date
filters
Fix the chat actor tests to run without throwing error with freezegun
complaining about importing transformers.local_llama model
Remove quote escapes from date filter examples provided to
extract_questions actor
- Before
Only the search interface had the results count configuration option
- After
- The results count is set on the settings page instead of the
search page
- Both search and chat can use the configured results count instead
of just search
* For the demo instance, re-instate the scheduler, but infrequently for api updates
- In constants, determine the cadence based on whether it's a demo instance or not
- This allow us to collect telemetry again. This will also allow us to save the chat session
* Conditionally skip updating the index altogether if it's a demo isntance