sij/khoj

mirror of https://github.com/khoj-ai/khoj.git synced 2024-12-01 03:13:01 +01:00

Author	SHA1	Message	Date
sabaimran	8fa0b69c67	Resolve merge issue with adapters methods	2023-11-20 15:21:06 -08:00
sabaimran	fee99779bf	Add subqueries for internet-connected search results and update client-side code accordingly - Add a wrapper method to help make direct queries to the LLM and determine any intermediate responses needed for handling the request	2023-11-20 15:19:15 -08:00
Debanjum Singh Solanky	d61b0dd55c	Add Khoj Django app package to sys path to load Django module via pip install	2023-11-20 14:55:00 -08:00
sabaimran	b8e6883a81	Merge branch 'master' of github.com:khoj-ai/khoj into features/internet-enabled-search	2023-11-19 16:20:08 -08:00
sabaimran	237195e20e	Make all name-related fields nullable within the GoogleUser	2023-11-19 14:22:32 -08:00
Debanjum	71799add0b	Index Parent Headings of Org-Mode Entries to Improve Search Context (#548) ### Overview The parent hierarchy of org-mode entries can store important context. This change updates OrgNode to track parent headings for each org entry and adds the parent outline for each entry to the index ### Details - Test search uses ancestor headings as context for improved results - Add ancestor headings of each org-mode entry to their compiled form - Track ancestor headings for each org-mode entry in org-node parser Resolves #85	2023-11-19 13:18:19 -08:00
sabaimran	ef5e9d66c1	Resolve merge conflicts in dependency imports	2023-11-19 11:42:20 -08:00
Debanjum Singh Solanky	c3465d6982	Release Khoj version 1.0.0	2023-11-19 09:50:25 -08:00
Debanjum	736744be3a	Update documentation to reflect new multi-user config scenario (#550) - Update docs to show how to use Khoj Cloud - Move self-hosting Khoj to separate section - Add page to setup Desktop app - Set default URL to Khoj Cloud URL in Obsidian, Emacs clients	2023-11-18 18:22:46 -08:00
Debanjum Singh Solanky	e1bf1f0e86	Update default Khoj server URL to Khoj cloud on Emacs, Obsidian clients	2023-11-18 16:25:45 -08:00
Debanjum Singh Solanky	8775ce730a	Use URL fragments to allow jumping to config page sections on Web app	2023-11-18 16:25:45 -08:00
sabaimran	f792b1e301	Remove already defined identical function	2023-11-18 14:08:50 -08:00
sabaimran	e2fff5dc47	Don't explicitly use value to get the model type value	2023-11-18 14:01:01 -08:00
sabaimran	a8a25ceac2	Honor user's chat settings when running the extract questions phase - Add marginally better error handling when GPT gives a messed up response to the extract questions method - Remove debug log lines	2023-11-18 13:31:51 -08:00
sabaimran	67156e6aec	Add new logs for debugging issues with chat references	2023-11-18 12:10:50 -08:00
sabaimran	5de2ab6098	Change parse_obj calls to use model_validate per new pydantic specification	2023-11-18 12:10:36 -08:00
sabaimran	6d249645a6	Fix interpretation of the default search type	2023-11-18 00:04:18 -08:00
sabaimran	f180b2ba94	Resolve mypy errors for various data types	2023-11-17 23:26:15 -08:00
sabaimran	3328a41f08	Update types of base config models for pydantic 2.0	2023-11-17 23:08:52 -08:00
sabaimran	f688529150	Update the default configuration for the AppConfig	2023-11-17 19:26:31 -08:00
sabaimran	11ccb92755	Fix formatting of welcome message to use markdown	2023-11-17 18:55:59 -08:00
Debanjum Singh Solanky	ca87b4ede9	Wrap common API query parameters into shared class to deduplicate code - Upgrade FastAPI to >= latest version. Required upgrade of FastAPI. Earlier version didn't support wrapping common query params in class - Use per fixture app instead of a global FastAPI app in conftest - Upgrade minimum required Django version - Fix no notes chat director test with updated no notes message No notes message was updated in commit `118f1143`	2023-11-17 18:43:49 -08:00
sabaimran	262f3ccb59	Resolve mypy issues with formatting	2023-11-17 17:11:00 -08:00
sabaimran	a7e00898cb	Fix rendering even when no online context references are returned	2023-11-17 16:41:28 -08:00
sabaimran	0fcf234f07	Add support for using serper.dev for online queries - Use the knowledgeGraph, answerBox, peopleAlsoAsk and organic responses of serper.dev to provide online context for queries made with the /online command - Add it as an additional tool for doing Google searches - Render the results appropriately in the chat web window - Pass appropriate reference data down to the LLM	2023-11-17 16:19:11 -08:00
Debanjum Singh Solanky	55785d50c3	Use title, when present, as root ancestor of entries instead of file path	2023-11-17 15:03:27 -08:00
sabaimran	bfbe273ffd	Add some styling to the copy button for programmatic output	2023-11-17 12:18:35 -08:00
sabaimran	9ddf3b58c3	Use the markdown parser for rendering the chat messages in the web interface	2023-11-17 12:14:02 -08:00
sabaimran	a0b12b001a	Provide in-line rendering when output matches certain views	2023-11-17 11:04:36 -08:00
sabaimran	ec06d2c446	Move data indexer files into a separate folder under processor. Update assoc UTs	2023-11-16 17:19:55 -08:00
sabaimran	45a42faec8	Make adjectives more positive for api token generation	2023-11-16 15:55:35 -08:00
sabaimran	118f1143ff	When user tries using the notes slash command without having any data indexed	2023-11-16 12:52:39 -08:00
sabaimran	e8a13f0813	Add multi-user support to Khoj and use Postgres for backend storage (#549) - Adds support for multiple users to be connected to the same Khoj instance using their Google login credentials - Moves storage solution from in-memory json data to a Postgres db. This stores all relevant information, including accounts, embeddings, chat history, server side chat configuration - Adds the concept of a Khoj server admin for configuring instance-wide settings regarding search model, and chat configuration - Miscellaneous updates and fixes to the UX, including chat references, colors, and an updated config page - Adds billing to allow users to subscribe to the cloud service easily - Adds a separate GitHub action for building the dockerized production (tag `prod`) and dev (tag `dev`) images, separate from the image used for local building. The production image uses `gunicorn` with multiple workers to run the server. - Updates all clients (Obsidian, Emacs, Desktop) to follow the client/server architecture. The server no longer reads from the file system at all; it only accepts data via the indexer API. In line with that, removes the functionality to configure org, markdown, plaintext, or other file-specific settings in the server. Only leaves GitHub and Notion for server-side configuration. - Changes license to GNU AGPLv3 Resolves #467 Resolves #488 Resolves #303 Resolves #345 Resolves #195 Resolves #280 Resolves #461 Closes #259 Resolves #351 Resolves #301 Resolves #296	2023-11-16 11:48:01 -08:00
Debanjum Singh Solanky	74403e3536	Add ancestor headings of each org-mode entry to their compiled form Resolves #85	2023-11-16 02:54:41 -08:00
Debanjum Singh Solanky	305c25ae1a	Track ancestor headings for each org-mode entry in org-node parser	2023-11-16 02:39:14 -08:00
Debanjum Singh Solanky	cc05013715	Update first run message on Web app with Chat models setup instructions - Link to Django admin panel for user to create Chat Models on their Khoj server - This should only get hit when user is not using Khoj cloud, as Khoj cloud would already have Chat models configured	2023-11-15 22:44:24 -08:00
Debanjum Singh Solanky	6c1693b8f4	Update first run message on Desktop app with API token setup instructions - Open Web app settings in the default browser via link click - Open Desktop app settings via link click	2023-11-15 22:44:11 -08:00
Debanjum Singh Solanky	922983bd53	Set max cos distance to 0.18. Test search API query with max distance	2023-11-15 20:26:21 -08:00
Debanjum Singh Solanky	18dbad5edb	Use Sigmoid to normalize cross-encoder score between 0-1 - While sigmoid normalization isn't required for reranking. Normalizing score to distance metrics for both encoder and cross encoder scores is useful to reason about them - Softmax wasn't required as don't need probabilities, sigmoid is good enough to get distance metric	2023-11-15 19:31:59 -08:00
sabaimran	ea144de438	Merge with master	2023-11-15 18:34:46 -08:00
Debanjum Singh Solanky	348cc0cf0e	Use better name for DB adapter func to create user by Google token	2023-11-15 17:31:50 -08:00
Debanjum Singh Solanky	08a057bdd5	Rename SearchModel to SearchModelConfig DB model, Require Cross-Encoder	2023-11-15 17:31:50 -08:00
Debanjum Singh Solanky	0679b2a7bd	Use embeddings model store from state in text to entries Do not need to instantiate it separately. In all other places we're using the embeddings model store in global state anyway	2023-11-15 17:31:50 -08:00
sabaimran	245a9cbf63	Fix return type of the update_or_create method	2023-11-15 17:31:50 -08:00
sabaimran	bbae7dd83c	Update logic for creating a new user to use aupdate_or_create	2023-11-15 17:31:50 -08:00
sabaimran	8e62af77b9	Update format for return type of the generate token method	2023-11-15 17:03:01 -08:00
sabaimran	4a487aff23	Fix return type of the update_or_create method	2023-11-15 14:35:42 -08:00
sabaimran	b63856ecb4	Update logic for creating a new user to use aupdate_or_create	2023-11-15 12:50:39 -08:00
sabaimran	b8e7488a95	Use a more permissive distance filter for search results from notes	2023-11-15 11:13:47 -08:00
sabaimran	05b7542115	Remove config lock from the state	2023-11-15 10:44:45 -08:00
sabaimran	ecd005cac0	Check if search model is already in DB before creating a new one	2023-11-15 10:41:35 -08:00
Debanjum Singh Solanky	9c6e7bdea2	Upgrade server, desktop app dependencies to resolve CVE bugs	2023-11-15 01:47:53 -08:00
Debanjum Singh Solanky	8f200cf53f	Remove unused parameter from configure_search_type method	2023-11-14 19:09:35 -08:00
Debanjum Singh Solanky	f8e5e118e1	Only create KhojUser on login if doesn't already exist	2023-11-14 19:09:35 -08:00
Debanjum Singh Solanky	3d8d6145f2	Add search model config from khoj.yml to Postgres DB via migration script	2023-11-14 19:09:35 -08:00
Debanjum Singh Solanky	4af194d74b	Make search model configurable on server - Expose ability to modify search model via Django admin interface - Previously the bi_encoder and cross_encoder models to use were set in code - Now it's user configurable but with a default config generated by default	2023-11-14 19:09:35 -08:00
Debanjum Singh Solanky	e98141f4c3	Subscribe default user to standard plan with a far away renewal date Self hosted users in anonymous mode have all capabilities unlocked	2023-11-14 16:31:39 -08:00
Debanjum Singh Solanky	9d30fda26d	Deduplicate, improve name of prompt templates for GPT4All chat models - Do not pass unused rerank_results parameter to text_search.query method	2023-11-14 16:31:09 -08:00
Debanjum Singh Solanky	795ec9eb55	Add KHOJ_prefix to server admin credentials environment variables	2023-11-14 16:13:13 -08:00
sabaimran	ee005de662	Rename django files URL to server instead of django	2023-11-14 12:36:38 -08:00
sabaimran	20ce3d0c78	Update default docker compose configuration with Khoj local mode	2023-11-14 12:21:26 -08:00
sabaimran	8c36079f74	Add a first run experience to initialize the admin user if none exists and setup chat models	2023-11-13 21:07:12 -08:00
Debanjum Singh Solanky	e9adb58c16	Rate limit calls to the /chat API per user, per day/minute	2023-11-13 19:41:46 -08:00
Debanjum Singh Solanky	33a8eb0470	Log when new user is created	2023-11-13 19:37:24 -08:00
sabaimran	603f838115	Block input text field when waiting for chat response	2023-11-11 17:14:37 -08:00
Debanjum Singh Solanky	9c321ac070	Fix cross encoder to use softmax to convert it to a distance metric	2023-11-11 16:12:24 -08:00
sabaimran	8a824167cf	Merge branch 'fix/imports-and-references' of github.com:khoj-ai/khoj into fix/imports-and-references	2023-11-11 12:59:31 -08:00
sabaimran	fa428932a8	Update URL for downloading the desktop application	2023-11-11 12:59:15 -08:00
Debanjum Singh Solanky	941c7f23a3	Only get text search results above confidence threshold via API - During the migration, the confidence score stopped being used. It was being passed down from API to some point and went unused - Remove score thresholding for images as image search confidence score different from text search model distance score - Default score threshold of 0.15 is experimentally determined by manually looking at search results vs distance for a few queries - Use distance instead of confidence as metric for search result quality Previously we'd moved text search to a distance metric from a confidence score. Now convert even cross encoder, image search scores to distance metric for consistent results sorting	2023-11-11 04:11:33 -08:00
Debanjum Singh Solanky	e44e6df221	Reduce data dumped in console log from web, desktop app	2023-11-11 02:05:07 -08:00
Debanjum Singh Solanky	f044a89d50	Show status in Save, Reinitialize button of config page on web app - Show non-transient error message in status element if action fails - On success, just show temporary success message within button	2023-11-11 02:04:58 -08:00
Debanjum Singh Solanky	f17d9da36c	Move Configure, Reinitialize buttons into the Content section on Web app Remove the Results Count button from the web app. It's hanging weirdly with not much context to its purpose. Reintroduce it in the Search card when created under the Features section	2023-11-11 02:01:39 -08:00
Debanjum Singh Solanky	325cb0f7fb	Show message in Save button of Github, Notion config save in web app Show the success, failure message only temporarily. Previously it stuck around after clicking save until page refresh	2023-11-11 02:01:39 -08:00
Debanjum Singh Solanky	b34d4fa741	Save config, update index on save of Github, Notion config in web app Reduce user confusion by joining config update with index updation for each content type. So only a single click required to configure any content type instead of two clicks on two separate pages	2023-11-11 00:33:49 -08:00
Debanjum Singh Solanky	c4364b9100	Weaken asking follow-up qs and q&a mode in notes prompt to OpenAI models - Notes prompt doesn't need to be so tuned to question answering. User could just want to talk about life. The notes need to be used to respond to those, not necessarily only retrieve answers from notes - System and notes prompts were forcing asking follow-up questions a little too much. Reduce strength of follow-up question asking	2023-11-10 23:36:43 -08:00
Debanjum Singh Solanky	cba371678d	Stop OpenAI chat from emitting reference notes directly in chat body The Chat models sometime output reference notes directly in the chat body in unformatted form, specifically as Notes:\n['. Prevent that. Reference notes are shown in clean, formatted form anyway	2023-11-10 23:36:43 -08:00
Debanjum Singh Solanky	8585976f37	Revert "Use notes in system prompt, rather than in the user message" This reverts commit `e695b9ab8c`.	2023-11-10 23:36:43 -08:00
Debanjum Singh Solanky	b6441683c6	Increase reference text on 1st expansion to 3 lines and 140 characters	2023-11-10 23:36:43 -08:00
sabaimran	55c97241b5	Merge branch 'fix/imports-and-references' of github.com:khoj-ai/khoj into fix/imports-and-references	2023-11-10 22:38:34 -08:00
sabaimran	e2e96f9aa4	Add default settings to let new users be subscribed on trial - Add the default user to a subscription trial - Update associated unit tests	2023-11-10 22:38:28 -08:00
Debanjum Singh Solanky	501e7606a0	Increase reference text on 1st expansion to 3 lines and 140 characters	2023-11-10 21:27:04 -08:00
sabaimran	0a950d9382	Fix checker to determine if obsidian client is connected	2023-11-10 19:21:58 -08:00
sabaimran	c736604366	Merge with remote	2023-11-10 17:50:15 -08:00
sabaimran	b0b07bde6c	Allow chat reference to expand enough to show the whole reference, rather than constraining the height	2023-11-10 17:49:20 -08:00
sabaimran	14f8c151c8	Fix return type of the generate_chat_response method	2023-11-10 17:48:54 -08:00
Debanjum Singh Solanky	45b8670c25	Fix return type hint for generate_chat_response func	2023-11-10 17:34:19 -08:00
Debanjum Singh Solanky	9b6c5ddba4	Update action row padding in cards on config page of web app	2023-11-10 16:53:25 -08:00
sabaimran	54d4fd0e08	Add chat_model data for logging selected models to telemetry	2023-11-10 16:46:34 -08:00
sabaimran	e695b9ab8c	Use notes in system prompt, rather than in the user message	2023-11-10 15:09:33 -08:00
sabaimran	cec932d88a	Update prompt so that GPT is more context aware with its capabilities	2023-11-10 14:37:11 -08:00
sabaimran	e62788ad79	Await result for determining if user has entries	2023-11-10 13:51:56 -08:00
sabaimran	1a56344f12	Remove the old syncData reference as it no longer exists	2023-11-10 10:10:07 -08:00
Debanjum Singh Solanky	39ad1c6ce6	Release Khoj version 0.14.0 Fix Khoj subtitle in manifest of Khoj Obsidian plugin	2023-11-10 00:28:33 -08:00
Debanjum Singh Solanky	745d6bfeed	Add detailed intro message, mention download desktop app for docs sync	2023-11-10 00:20:28 -08:00
Debanjum Singh Solanky	6eb7df717c	Only show search in web app nav pane if user has documents indexed	2023-11-09 19:14:54 -08:00
Debanjum Singh Solanky	c0789dc57b	Use email to get_user_subscription from DB and other DB adapters - Needing user subscription requires chaining function - Simplify get_file_sources DB adapter	2023-11-09 19:09:57 -08:00
Debanjum Singh Solanky	841ed95521	Move active user profile halo check into nav pane macro on web app	2023-11-09 18:05:19 -08:00
Debanjum Singh Solanky	ddac693762	Hide download desktop app message in web app if synced files exist	2023-11-09 17:47:00 -08:00
Debanjum Singh Solanky	30a9674f25	Mark generated profile pic with subscription circle in web app	2023-11-09 15:22:38 -08:00
Debanjum Singh Solanky	d6e6ed1cfa	Keep single Save button, Show next sync, default to prod Khoj URL in Desktop app - Make mutable syncing variable not a const - Show next sync time to make users aware that data sync is automated - Keep a single Save button to reduce confusion. It does what Save All previously did. Intent to manual sync should Save All - Default to using app.khoj.dev as default Khoj URL to ease setup	2023-11-09 14:04:58 -08:00
Debanjum Singh Solanky	e1f0128576	Change config migration script to update to 0.15.0 version Next release, 0.14.0 wouldn't contain the migration to Postgres	2023-11-09 12:21:58 -08:00
Debanjum Singh Solanky	17cbbb0b01	Use Consistent Environment Variable for KHOJ_DEBUG	2023-11-09 11:01:28 -08:00
Debanjum Singh Solanky	391db80499	Improve subscribed user profile pictures and nav pane selection - Add yellow halo around subscribed user profile - Fix highlighting current page in header nav pane	2023-11-09 00:57:05 -08:00
Debanjum Singh Solanky	605058c72a	Allow null user profile picture from Google OAuth in DB - Fix width of generated profile picture generated for user - Ignore unused Stripe webhook events	2023-11-09 00:46:59 -08:00
Debanjum Singh Solanky	a2609973b8	Disable Subscription if Stripe environment not setup Deduplicate DJANGO_SECRET_KEY and KHOJ_DJANGO_SECRET_KEY to latter name as prefixed with KHOJ as KHOJ app specific	2023-11-08 19:39:32 -08:00
Debanjum Singh Solanky	09e1235832	Auto update billing card UI on (re/un-)subscribe click on web app Previously required a page load to see the updated billing state after clicking resubscribe or unsubscribe buttons	2023-11-08 18:38:12 -08:00
Debanjum Singh Solanky	8b8bb15866	Keep sync state in memory, initialized to false in Desktop app Prevent deadlock if desktop app killed in middle of syncing	2023-11-08 18:03:08 -08:00
Debanjum Singh Solanky	c043eb54ae	Use typed entry source instead of raw str to map source to conf in api.py	2023-11-08 18:03:08 -08:00
Debanjum Singh Solanky	8178004e6d	Move Subscription data into separate table in DB. Merge migrations	2023-11-08 18:03:08 -08:00
Debanjum Singh Solanky	3bb10128ef	Move subscription API to separate, independent router	2023-11-08 16:20:27 -08:00
Debanjum Singh Solanky	ec1395d072	Clean, merge subscription update events, API and functions - Reduce webhook triggers for subscription updates - Merge subscription update API endpoint, functions for (re/un-)subscribe	2023-11-08 15:55:20 -08:00
Debanjum Singh Solanky	ef5c13f968	Keep user subscription state. Update it when user has unsubscribed	2023-11-08 12:08:36 -08:00
Debanjum Singh Solanky	c52affc6d9	Get Khoj Cloud Subscription URL via environment variable	2023-11-08 12:07:53 -08:00
sabaimran	609d358b1a	Use sql datetime comparison for detecting validity of subscription renewal date - Update the unsubscribe endpoint to use query params - Use subscription id to process unsubscribe endpoint, rather than the customer id	2023-11-07 19:17:36 -08:00
sabaimran	98cf095b65	Fix bug for rendering chat references in LLM response	2023-11-07 16:44:41 -08:00
sabaimran	0e1cdb6536	Add additional error handling for processing unknown Stripe events and fix typo in STRIPE_SIGNING env variable	2023-11-07 16:43:05 -08:00
sabaimran	08c86927cb	Merge branch 'features/multi-user-support-khoj' of github.com:khoj-ai/khoj into fix-improve-config-page-on-desktop-and-web-app	2023-11-07 12:46:49 -08:00
sabaimran	cec54e3a8a	Merge pull request #536 from khoj-ai/features/update-chat-ui Update the chat UI to have richer representation of the references	2023-11-07 12:34:57 -08:00
Debanjum Singh Solanky	f466751f4d	Expose card on web app config page to manage subscription to Khoj cloud	2023-11-07 10:21:00 -08:00
Debanjum Singh Solanky	9aaf475c8a	Create API webhook, endpoints for subscription payments using Stripe - Add fields to mark users as subscribed to a specific plan and subscription renewal date in DB - Add ability to unsubscribe a user using their email address - Expose webhook for stripe to callback confirming payment	2023-11-07 10:20:51 -08:00
Debanjum Singh Solanky	156421d30a	Show file type icons for each indexed file in config card of web app	2023-11-07 05:48:44 -08:00
Debanjum Singh Solanky	045c2252d6	Set content enabled status on update via config buttons on web app Previously hitting configure or disable wouldn't update the state of the content cards. It needed page refresh to see if the content was synced correctly. Now cards automatically get set to new state on hitting disable button on card or global configure buttons	2023-11-07 05:28:13 -08:00
Debanjum Singh Solanky	7c424e0d5f	Enable deleting all indexed desktop files from Khoj via Desktop app	2023-11-07 05:28:13 -08:00
Debanjum Singh Solanky	779fa531a5	Prevent Desktop app triggering multiple simultaneous syncs to server Lock syncing to server if a sync is already in progress. While the sync save button gets disabled while sync is in progress, the background sync job can still trigger a sync in parallel. This sync lock prevents that	2023-11-07 05:28:13 -08:00
Debanjum Singh Solanky	404d47f1a1	Bubble up content indexing errors to notify user on client apps	2023-11-07 05:28:13 -08:00
Debanjum Singh Solanky	6e957584ac	Create config page on web app to manage computer files indexed by Khoj Remove the table of all files indexed by Khoj. This seems overkill and doesn't match the UI semantics of the other data sources like Github, Notion. Create instead a data source card for computer files with the same update, disable semantics of the Github and Notion data source cards Users can disable each data source from its card on the main config page. They can see/delete individual files indexed from the computer data source once they click into the computer files data source card on the config page	2023-11-07 04:42:53 -08:00
Debanjum Singh Solanky	d527b644f4	Update content by source via API. Make web client use this API for config	2023-11-07 03:41:19 -08:00
Debanjum Singh Solanky	9ab327a2b6	Store the data source of each entry in database This will be useful for updating, deleting entries by their data source. Data source can be one of Computer, Github or Notion for now Store each file/entries source in database	2023-11-07 02:18:48 -08:00
Debanjum Singh Solanky	c82cd0862a	Delete deprecated content config pages for local files from web client The desktop app now manages syncing local computer files to index The server only manages "cloud" data source like github and notion.	2023-11-06 23:55:37 -08:00
Debanjum Singh Solanky	97cf8339aa	Rename Sync button, Force Sync toggle to Save, Save All buttons	2023-11-06 21:57:37 -08:00
Debanjum Singh Solanky	a08b152358	Improve log messages in text_entries and memory leak unit test	2023-11-06 19:27:31 -08:00
sabaimran	6c8689e4ae	Update corresponding chat UX in the desktop client as well	2023-11-06 16:18:41 -08:00
sabaimran	e01ecf1419	/s/references/reference to fix bug of jumping references	2023-11-06 16:12:25 -08:00
Debanjum	38f24a037d	Improve Indexing Text Entries (#535) Major - Ensure search results logic consistent across migration to DB, multi-user - Manually verified search results for sample queries look the same across migration - Flatten indexing code for better indexing progress tracking and code readability Minor - `a4f407f` Test memory leak on MPS device when generating vector embeddings - `ef24485` Improve Khoj with DB setup instructions in the Django app readme (for now) - `f212cc7` Arrange remaining text search tests in arrange, act, assert order - `022017d` Fix text search tests to test updated indexing log messages	2023-11-06 16:01:53 -08:00
sabaimran	270f7b3eb3	Update the chat UI to have richer representation of the references	2023-11-05 15:46:43 -08:00
sabaimran	d697d752c2	Use repeat rather than manually specify auto in grid-template-rows Co-authored-by: Debanjum <debanjum@gmail.com>	2023-11-05 15:23:42 -08:00
sabaimran	5f1e37fff0	Adjust indentation for css property	2023-11-05 14:33:23 -08:00
Debanjum Singh Solanky	a4f407f595	Test memory leak on MPS device when generating vector embeddings Slope threshold of 2.0 determined qualitatively on local Mac device Minor unused import and clean-up	2023-11-05 03:48:54 -08:00
Debanjum Singh Solanky	ef24485ada	Improve Khoj with DB setup instructions in the Django app readme (for now)	2023-11-05 02:04:52 -08:00
sabaimran	084a8becc5	Fix bug to prevent default in chat trigger	2023-11-04 20:13:33 -07:00
Debanjum Singh Solanky	5489e98b9c	Do not index org heading entries by default This is to maintain the previous default behavior	2023-11-04 20:09:25 -07:00
Debanjum Singh Solanky	34b5a86d1d	Use SentenceTransformer to disable progress bar when encoding query The Langchain HuggingFaceEmbeddings wrapper doesn't support disabling progressbar, not especially for only query but not documents. This makes the logs noisy with encoding progressbar for each incremental queries No features of the Langchain wrapper for SentenceTransformer was currently being used anyway for now, and we can always switch back to it if required	2023-11-04 20:09:25 -07:00
Debanjum Singh Solanky	dc9946fc03	Flatten nested loops, improve progress reporting in text_to_jsonl indexer Flatten the nested loops to improve visibility into indexing progress Reduce spurious logs, report the logs at aggregated level and update the logging description text to improve indexing progress reporting	2023-11-04 20:09:25 -07:00
sabaimran	88eeee3f4b	Move try/catch for import one line later	2023-11-04 19:46:47 -07:00
sabaimran	dbaa892665	Flip catching modulenotfound to import error exception	2023-11-04 19:34:10 -07:00
sabaimran	8c3d5a49da	Add try/except around image extraction step	2023-11-04 19:27:18 -07:00
sabaimran	fdfab39942	Update the config UI to show all files indexed with option to delete - Given the separation of the client and server now, the web UI will no longer support configuration of local file paths of data to index - Expose a way to show all the files that are currently set for indexing, along with an option to delete all or specific files	2023-11-04 19:03:34 -07:00
sabaimran	800bb4f458	Remove references to demo - The demo setting is no longer necessary for the time being, as we won't have anymore demo instances	2023-11-04 17:17:04 -07:00
sabaimran	b5972e9311	Use OCR to extract image text in PDFs	2023-11-04 17:15:28 -07:00
Debanjum Singh Solanky	8273bf26b7	Fix multi-line chat input and output render on web, desktop clients - Remove spurious whitespace in chat input box on page load being added because text area element was ending on newline - Do not insert newline in message when send message by hitting enter key This would be more evident when send message with cursor in the middle of the sentence, as a newline would be inserted at the cursor point - Remove chat message separator tokens from model output. Model sometimes starts to output text in its chat format	2023-11-04 01:09:35 -07:00

1468 commits
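
Commit 18dbad5edb in the log above describes using a sigmoid to normalize cross-encoder scores into the 0-1 range so they can be reasoned about as a distance metric. The following is a minimal sketch of that idea, not the project's actual code: the function names `sigmoid`, `score_to_distance`, and `rerank`, the sample scores, and the `1 - sigmoid(score)` convention for turning a score into a distance are all assumptions made for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function mapping any real score into [0, 1]."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def score_to_distance(raw_score: float) -> float:
    """Assumed convention: a higher relevance score gives a smaller distance."""
    return 1.0 - sigmoid(raw_score)


def rerank(hits: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Convert raw scores to distances and sort ascending (closest first)."""
    scored = [(text, score_to_distance(score)) for text, score in hits]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical raw cross-encoder scores, for illustration only.
ranked = rerank([("weak match", -2.0), ("strong match", 4.0), ("neutral", 0.0)])
for text, distance in ranked:
    print(f"{distance:.3f}  {text}")
```

Because the sigmoid is monotonic, the ordering of results is the same as sorting by raw score; the normalization only makes the values comparable to other distance-style metrics.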