sij/khoj

mirror of https://github.com/khoj-ai/khoj.git synced 2024-12-01 11:23:01 +01:00

Author	SHA1	Message	Date
sabaimran	bf8914d0c8	Fix default config initialization for chat.html	2023-07-03 00:00:47 -07:00
Debanjum	faad1297f4	Drop Support for Org Music, Ledger Content Types Removing unused content types will reduce khoj code to manage - `0f993b3` Drop support for Ledger as a separate content type Khoj will soon get a generic text indexing content type in Index plain text files #237. This along with a file filter should suffice for searching through Ledger transactions - `c9db532` Remove unused org-music as an indexable content type from Khoj Org-music was just a custom content type that worked with org-music. It was mostly only useful for me.	2023-07-02 17:48:29 -07:00
Debanjum Singh Solanky	0f993b332e	Drop support for Ledger as a separate content type Khoj will soon get a generic text indexing content type. This along with a file filter should suffice for searching through Ledger transactions, if required. Having a specific content type for niche use-case like ledger isn't useful. Removing unused content types will reduce khoj code to manage.	2023-07-02 16:57:49 -07:00
sabaimran	fa218ff5aa	Fix call to update for Reinitialize button	2023-07-02 16:31:30 -07:00
sabaimran	a8b83da872	Merge branch 'master' of github.com:debanjum/khoj into features/simplify-configuration-steps	2023-07-02 16:21:54 -07:00
Debanjum Singh Solanky	c9db5321e7	Remove unused org-music as an indexable content type from Khoj Org-music was just a custom content type that worked with org-music. It was mostly only useful for me. Cleaning up that code will reduce number of content types for khoj to manage.	2023-07-02 16:21:21 -07:00
sabaimran	b86a3bb0c5	Merge branch 'master' of github.com:debanjum/khoj into fix/obsidian-setup-issues	2023-07-02 16:21:05 -07:00
sabaimran	a52c1c8380	Use built-in app.vault to determine whether there are any PDF files within	2023-07-02 16:20:43 -07:00
sabaimran	eff1436857	Overwrite existing PDFs in Obsidian as well, make if-block more legible	2023-07-02 16:17:25 -07:00
Debanjum Singh Solanky	30459ee4ba	Fix Khoj subtitle in desktop entry, pyproject, cli and Obsidian Readme	2023-07-02 16:09:07 -07:00
sabaimran	1a1b044d12	Simplify settings pages for configuration - Add one-click disablement - Remove fields that probably don't need to be edited (our implementation details) - Add a green tick if a given field is configured	2023-07-02 16:04:05 -07:00
sabaimran	e4c445f805	Add try-except-finally blocks around configure calls in /update	2023-07-02 13:35:02 -07:00
sabaimran	4b02a8c788	Fix PDF setup in Obsidian plugin and force Obsidian configuration for markdown	2023-07-02 12:37:24 -07:00
sabaimran	2a7e4f2b71	Escape special characters in the URL when adding a link to the remote file	2023-07-02 09:13:28 -07:00
sabaimran	c747562897	Update the GUI to just be a simple box with a button for the web UI	2023-07-01 20:37:21 -07:00
sabaimran	bab7f39d47	Move logic to open the web browser into the GUI section	2023-07-01 20:11:27 -07:00
sabaimran	36537606da	Update unit test and preserve prior operational ordering in main.py	2023-07-01 20:02:35 -07:00
sabaimran	ea9ae4ae28	Configure Khoj to automatically open the browser to their web home page when Khoj is up	2023-07-01 19:46:31 -07:00
sabaimran	d2083dd395	Remove bespoke processing for GithubToJsonl file demo	2023-07-01 19:09:22 -07:00
sabaimran	a71440f62a	Update the guidance in the error message if config is not set	2023-07-01 19:09:00 -07:00
sabaimran	7db97d8aa9	Fix: don't try to render the search_type.ALL	2023-07-01 19:08:19 -07:00
sabaimran	f0f6390366	Make --no-gui the default behavior of Khoj and update corresponding documentation	2023-07-01 19:07:59 -07:00
Debanjum Singh Solanky	d77e05c279	Release Khoj version 0.7.0	2023-07-01 05:44:22 -07:00
Debanjum Singh Solanky	30d87a9a01	Update color of Khoj chat in Obsidian plugin to Lantern theme	2023-07-01 02:18:47 -07:00
Debanjum Singh Solanky	51826d28d6	Ensure clicking Update in Khoj Obsidian indexes PDF files too	2023-07-01 02:18:47 -07:00
sabaimran	dac2d14380	Handle file names appropriately for md files and render commits in github results	2023-07-01 01:20:58 -07:00
sabaimran	dbe713604d	Fix error in tests for markdown_to_jsonl	2023-07-01 00:49:40 -07:00
sabaimran	931aab4464	Handle case for when headers value is None	2023-07-01 00:37:30 -07:00
sabaimran	d01afb3ee4	Fix path issues for URL-based markdown files	2023-07-01 00:25:11 -07:00
sabaimran	31655447e7	Add the sign-up list to the chat page as well and update copy	2023-06-30 21:43:01 -07:00
sabaimran	796102c74e	Add separate configuration if the given Khoj instance is meant for demo - In theory, this will be suitable for any Khoj instance that's meant for external-facing purposes (as in, outside of the user's network) - Prevent re-indexing for Github data if this is a demo instance - Fix up some issues with the CSS which made settings page small in mobile - In the frontend views for Khoj, add a button to get on the waitlist and links to the landing page	2023-06-30 20:38:55 -07:00
sabaimran	db3026739d	Resolve diffs in api.py to make /chat endpoint async with new request parameter	2023-06-30 00:25:37 -07:00
sabaimran	ef72508914	Try/catch around github file decoding, await call to search in chat API, fix img width	2023-06-30 00:23:21 -07:00
Debanjum Singh Solanky	b950889f47	Fix org-mode web renderer to handle results containing list in block - Break out of rendering list if at end of org block in org.js - This would previously hang rendering results in web interface Should try to fix this upstream in org.js as well	2023-06-29 19:01:25 -07:00
sabaimran	780c769567	Add additional request headers to improve telemetry	2023-06-29 18:51:24 -07:00
sabaimran	6c10d68262	Merge pull request #253 from khoj-ai/features/github-issues-indexing Support indexing Github issues as well as corresponding comments	2023-06-29 16:02:47 -07:00
sabaimran	b2dd946c6d	Rename issue to entry method for accuracy	2023-06-29 15:23:50 -07:00
Debanjum Singh Solanky	51dfa48e2b	Have Khoj support Python 3.11 as Pytorch supports it now - Previously Khoj could only support Python up to 3.10 due to pytorch. But lots of folks had python 3.11 installed by default on their machines. This required installing python 3.10 and dealing with virtual envs. With Torch >= 2.0.1 now able to support python 3.11, at least one class of installation troubles for Khoj should drop. See https://github.com/pytorch/pytorch/issues/86566 for reference - Preliminary testing indicates using the new torch 2.x may reduce search time by 25% (from 80ms to 60ms on Mac M1) - Update Docs to no longer mention that python <=3.10 is required - Update Github test workflow to run khoj tests with python 3.11 too	2023-06-29 15:13:26 -07:00
sabaimran	65bf894302	Interpret org files as a list and put them in separate divs. Update styling of search results to separate into cards	2023-06-29 15:12:48 -07:00
Debanjum Singh Solanky	d212298573	Make Configure button on web interface incrementally update by default We should add a way to force index everything. But force indexing should not be the default when user is just trying to update content to index	2023-06-29 14:52:51 -07:00
Debanjum Singh Solanky	da2de21339	Only return requested result count even if search in multiple content types - Set results_count to default value at start so it is an int, never None	2023-06-29 14:49:05 -07:00
sabaimran	77672ac0ae	Demarcate different results with a border box - Add back support for searching by type Github - Remove custom class name in markdown js file	2023-06-29 14:14:25 -07:00
sabaimran	6edc32f2f4	Accept current changes to include issues in rendering flow	2023-06-29 12:25:29 -07:00
sabaimran	ab7dabe74f	Explicitly use Union type for function parameters for lint checks	2023-06-29 11:44:30 -07:00
sabaimran	fecf6700d2	Limit small image rendering to just the avatar images	2023-06-29 11:27:18 -07:00
sabaimran	70e550250a	Add an additional data source for issues from Github repositories + quality of life updates - Use a request session to reduce the overhead of setting up a new connection with the Github URL each request - Use the streaming feature for the REST api to reduce some of the memory footprint	2023-06-29 10:59:54 -07:00
Debanjum Singh Solanky	5f2717cc4b	Use logger.warning since logger.warn is deprecated	2023-06-28 22:15:27 -07:00
Debanjum Singh Solanky	56ce97ef9e	Use async/await in tests for query method of text and image search The text, image search query method has become async. So async/await is required to get results correctly in tests etc	2023-06-28 22:07:02 -07:00
Debanjum Singh Solanky	b1767f93d6	Get any configured asymmetric search model to encode query for search - Set image_search.query to async to use it with multi-threading This is same as text_search.query being set to an async method - Exit search early if no search_model is defined in state.model	2023-06-28 22:07:02 -07:00
Debanjum Singh Solanky	8eae7c898c	Put each result under org heading when query for "all" content type in khoj.el - Add "all" as default content type when no content type retrieved from server	2023-06-28 22:07:02 -07:00
Debanjum Singh Solanky	630bf995f1	Style each result based on its content type in same view on Khoj web - So when searching across content types (with content-type = "all") org-mode results get rendered differently than markdown, PDF etc. results - Set div class for each result separately instead of a single uber div for styling. This allows styling div of each result based on the content-type of that result - No need to create placeholder "all" content type on web interface as server is passing an all content type by itself	2023-06-28 22:07:01 -07:00
Debanjum Singh Solanky	1773a78339	Fix createRequestUrl method signature to fetch results from khoj web	2023-06-28 12:10:45 -07:00
Debanjum Singh Solanky	212b1a96c8	Create "all" search type for search across all content types on khoj server Allows moving logic to handle search across all content types to server from clients	2023-06-28 11:34:26 -07:00
Debanjum Singh Solanky	0636ceaf14	Merge branch 'master' of github.com:khoj-ai/khoj into parallelize-search-across-all-asymmetric-text-content-types Conflicts: - src/khoj/routers/api.py: Use theirs	2023-06-27 16:10:32 -07:00
Debanjum Singh Solanky	510bb7e684	Use typing union in text_search for python 3.8 compatible type hinting	2023-06-27 15:59:50 -07:00
Debanjum Singh Solanky	1b11d5723d	Extract search request URL builder into js function in web interface	2023-06-27 15:50:41 -07:00
Debanjum Singh Solanky	09f739b8cc	Null check config, log warning instead of error when configuring search	2023-06-27 15:48:48 -07:00
sabaimran	9d62d66a77	Simplify construction of repo shorthand in GithubToJsonl	2023-06-27 15:05:03 -07:00
sabaimran	227169ebde	Support configuration of multiple Github repositories in the settings interface - Add cards to configure each of the Github repositories - Fix a bug in the API which caused all other settings to be wiped when updating one of the content types - Provide an error message to the user if they have a misconfiguration in their chat settings	2023-06-27 14:10:09 -07:00
sabaimran	37a1f15c38	Add backend support for indexing multiple repositories - Add support for indexing org files as well as markdown files from the Github repository and update corresponding search view - Support indexing a list of repositories	2023-06-27 12:06:15 -07:00
sabaimran	ddd550e6f4	Add call to use X-CSRFToken in relevant POST methods	2023-06-26 12:38:00 -07:00
sabaimran	35e24d7851	Fix null checking in state for content config API and telemetry API	2023-06-26 11:37:34 -07:00
sabaimran	5e39421f56	Merge branch 'master' of github.com:debanjum/khoj	2023-06-25 11:41:47 -07:00
sabaimran	4410a3bb4b	Limit max width of the pre tag to 100% of the screen width	2023-06-25 11:41:15 -07:00
sabaimran	ffe66b848a	Use a single column template for config plugins when in mobile	2023-06-25 11:27:41 -07:00
Debanjum Singh Solanky	b1890aa050	Null check intermediary objects when config not fully initialized	2023-06-24 15:34:18 -07:00
Debanjum Singh Solanky	946af0889d	Improve showing status message on saving config via web interface - Show success/failure status message much closer to the save button Previously status message was shown on top of the page, which wasn't always in view and wasn't easily seen - Improve the status message to more clearly show next steps on success	2023-06-24 00:49:57 -07:00
Debanjum Singh Solanky	40d1abfe50	Update the new /config APIs to configure Khoj for first time users - Setup state.config and sub-components from unset state - Setup search types with default settings	2023-06-24 00:45:30 -07:00
Debanjum Singh Solanky	edabede93a	Fix post configuration state update on error or success on config html	2023-06-23 14:52:25 -07:00
Debanjum Singh Solanky	4744d69221	Resolve button name, anchor tag feedback. Add status message to settings page - Use "Configure" name for settings config action - Use more standard anchor tag instead of button - Add configure status message	2023-06-23 09:48:38 -07:00
Debanjum Singh Solanky	26abafa658	Highlight currently active tab in web interface for orientation	2023-06-22 00:33:28 -07:00
Debanjum Singh Solanky	2728c714d7	Put pico.css in local assets. Move common css styling into khoj.css	2023-06-22 00:33:11 -07:00
Debanjum Singh Solanky	20a37697de	Add Khoj header with navigation pane to Search and Chat Interfaces	2023-06-22 00:33:11 -07:00
Debanjum Singh Solanky	c467a0cbb0	Update UI of config sub pages to use khoj lantern theme styling	2023-06-22 00:33:11 -07:00
Debanjum Singh Solanky	0ce2ec590a	Update main config page on khoj server to match khoj lantern theme	2023-06-21 20:25:25 -07:00
Debanjum Singh Solanky	d30a9ddd33	Use Khoj Logo on Search, Chat pages of Web Interface	2023-06-21 12:34:53 -07:00
Debanjum Singh Solanky	6d4aad57e1	Use new Khoj Lantern Logo in Web, Emacs, Obsidian UIs and Docs	2023-06-21 01:57:22 -07:00
Debanjum Singh Solanky	69d4fa6525	Rename project links across repo from debanjum/khoj to khoj-ai/khoj	2023-06-21 00:13:21 -07:00
Debanjum Singh Solanky	5c4eb950d5	Search across all content types via khoj.el on Emacs If no content-type selected in transient menu option, khoj.el queries khoj server without content-type parameter (t) set. This results in search across all enabled asymmetric search text content types	2023-06-20 23:39:56 -07:00
Debanjum Singh Solanky	2cd3e799d3	Improve null and type checks	2023-06-20 23:30:59 -07:00
Debanjum Singh Solanky	d5fb4196de	Update web interface to allow querying all content types at once	2023-06-20 22:21:50 -07:00
Debanjum Singh Solanky	5c7c8d1f46	Use async/await to fix parallelization of search across content types	2023-06-20 22:21:50 -07:00
Debanjum Singh Solanky	1192e49307	Pass default value matching argument types expected by text_search methods	2023-06-20 22:21:50 -07:00
Debanjum Singh Solanky	0144e610d6	Only search across content types that work with asymmetric search	2023-06-20 22:21:46 -07:00
Debanjum Singh Solanky	f6a7aa6c96	Style Khoj chat on web interface with new lantern theme - Color khoj chat message with new yellow theme color - Update Khoj chat emoji to lantern - Add page type to title of pages on web interface	2023-06-20 01:39:33 -07:00
Debanjum Singh Solanky	6d94d6e75a	Encode the asymmetric, symmetric search queries in parallel for speed Use timer to measure time to encode queries and total search time	2023-06-20 01:18:17 -07:00
Debanjum Singh Solanky	d292dc03b3	Use new Khoj Logotype in Web interface	2023-06-20 01:13:06 -07:00
Debanjum Singh Solanky	db07362ca3	Encode user query as same across search types to speed up query time - Add new filter abstract method to remove filter terms from query - Use the filter method to remove filter terms, encode this defiltered query and pass it to the query methods of each search types TODO: Encoding query is still taking 100-200 ms unlike before. Need to investigate why	2023-06-19 23:29:54 -07:00
Debanjum Singh Solanky	285d17af2a	Search in parallel across all enabled content types requested via API - Update API to return content from all enabled content types when type is not set to specific type in HTTP request param - To do this efficiently run the search queries in parallel threads	2023-06-19 23:29:06 -07:00
Debanjum Singh Solanky	79d325fbb6	Fix triggering @general queries in Khoj Chat	2023-06-19 23:05:33 -07:00
Debanjum Singh Solanky	e97a20d70c	Set conversation type if query param set, else return chat history Only initialize variables if query is not empty, to avoid unnecessary compute, variable null checks etc. Fixes #230	2023-06-19 19:59:16 -07:00
sabaimran	4722a2c16d	Add Github configuration page and success notifications	2023-06-18 10:06:45 -07:00
sabaimran	668135c763	Merge branch 'master' of github.com:debanjum/khoj into features/pretty-config-page	2023-06-18 08:35:09 -07:00
sabaimran	81183a1fe1	Address misc PR comments and update logo in all clients - Rename the new logo to reflect accuracy on size (e.g., 128x128) - Update the icns file for Mac - Update nomenclature in settings pages	2023-06-18 08:34:58 -07:00
Debanjum Singh Solanky	a44cde2865	Show hint to re-index vault if wonky results in Obsidian search modal Remove spurious indentation in Obsidian styles.css Resolves #207	2023-06-18 04:53:51 -07:00
Debanjum Singh Solanky	595cc5b0f5	Use printer icon for PDF logs. Only split lines if file at web link in web interface	2023-06-18 02:26:03 -07:00
Debanjum Singh Solanky	e31a540a5e	Get all md files recursively in repository by passing recursive param Previously the `get_markdown_files' method was only getting files at root of the repository Fix, improve logger messages in github to jsonl processor	2023-06-18 01:47:15 -07:00
Debanjum Singh Solanky	6fdac24416	Set page size to 100 to reduce requests required to Github API to 1/3 - Default is 30. So number of paginated requests required to get all items (commits, files) will reduce by 67% - No need to increase page size for the get tree Github API request from `get_markdown_files' Get tree Github API doesn't support pagination and returns 100K items in response. This should be way more than enough for our current use-cases	2023-06-18 01:44:36 -07:00
Debanjum Singh Solanky	87975e589a	Fix passing auth token to Github API to increase rate limits by x85 - Previously wasn't prefixing "token" to PAT token in Auth header This resulted in the request being considered unauthenticated - Unauthenticated requests to Github API are limited to 60 requests/hour Authenticated requests to Github API are allowed 5000 requests/hour	2023-06-18 01:19:26 -07:00
Debanjum Singh Solanky	9c70af960c	Extract logic to get file content from Github into a separate method	2023-06-18 01:19:13 -07:00

957 commits