Commit graph

957 commits

Author SHA1 Message Date
sabaimran
bf8914d0c8 Fix default config initialization for for chat.html 2023-07-03 00:00:47 -07:00
Debanjum
faad1297f4
Drop Support for Org Music, Ledger Content Types
Removing unused content types will reduce khoj code to manage

- 0f993b3 Drop support for Ledger as a separate content type
   Khoj will soon get a generic text indexing content type in Index plain text files #237.
   This along with a file filter should suffice for searching through Ledger transactions

- c9db532 Remove unused org-music as an indexable content type from Khoj
   Org-music was just a custom content type that worked with org-music.
   It was mostly only useful for me.
2023-07-02 17:48:29 -07:00
Debanjum Singh Solanky
0f993b332e Drop support for Ledger as a separate content type
Khoj will soon get a generic text indexing content type. This along
with a file filter should suffice for searching through Ledger
transactions, if required.

Having a specific content type for niche use-case like ledger isn't
useful. Removing unused content types will reduce khoj code to manage.
2023-07-02 16:57:49 -07:00
sabaimran
fa218ff5aa Fix call to update for Reinitialize button 2023-07-02 16:31:30 -07:00
sabaimran
a8b83da872 Merge branch 'master' of github.com:debanjum/khoj into features/simplify-configuration-steps 2023-07-02 16:21:54 -07:00
Debanjum Singh Solanky
c9db5321e7 Remove unused org-music as an indexable content type from Khoj
Org-music was just a custom content type that worked with org-music.
It was mostly only useful for me.

Cleaning up that code will reduce number of content types for khoj to
manage.
2023-07-02 16:21:21 -07:00
sabaimran
b86a3bb0c5 Merge branch 'master' of github.com:debanjum/khoj into fix/obsidian-setup-issues 2023-07-02 16:21:05 -07:00
sabaimran
a52c1c8380 Use built-in app.vault to determine whether there are any PDF files within 2023-07-02 16:20:43 -07:00
sabaimran
eff1436857 Overwrite existing PDFs in Obsidian as well, make if-block more legible 2023-07-02 16:17:25 -07:00
Debanjum Singh Solanky
30459ee4ba Fix Khoj subtitle in desktop entry, pyproject, cli and Obsidian Readme 2023-07-02 16:09:07 -07:00
sabaimran
1a1b044d12 Simplify settings pages for configuration
- Add one-click disablement
- Remove fields that probably don't need to be edited (our implementation details)
- Add a green tick if a given field is configured
2023-07-02 16:04:05 -07:00
sabaimran
e4c445f805 Add try-except-finally blocks around configure calls in /update 2023-07-02 13:35:02 -07:00
sabaimran
4b02a8c788 Fix PDF setup in Obsidian plugin and force Obsidian configuration for markdown 2023-07-02 12:37:24 -07:00
sabaimran
2a7e4f2b71 Escape special characters in the URL when adding a link to the remote file 2023-07-02 09:13:28 -07:00
sabaimran
c747562897 Update the GUI to just be a simple box with a button for the web UI 2023-07-01 20:37:21 -07:00
sabaimran
bab7f39d47 Move logic to open the web browser into the GUI section 2023-07-01 20:11:27 -07:00
sabaimran
36537606da Update unit test and preserve prior operational ordering in main.py 2023-07-01 20:02:35 -07:00
sabaimran
ea9ae4ae28 Configure Khoj to automatically open the browser to their web home page when Khoj is up 2023-07-01 19:46:31 -07:00
sabaimran
d2083dd395 Remove bespoke processing for GithubToJsonl file demo 2023-07-01 19:09:22 -07:00
sabaimran
a71440f62a Update the guidance in the error message if config is not set 2023-07-01 19:09:00 -07:00
sabaimran
7db97d8aa9 Fix: don't try to render the search_type.ALL 2023-07-01 19:08:19 -07:00
sabaimran
f0f6390366 Make --no-gui the default behavior of Khoj and update corresponding documentation 2023-07-01 19:07:59 -07:00
Debanjum Singh Solanky
d77e05c279 Release Khoj version 0.7.0 2023-07-01 05:44:22 -07:00
Debanjum Singh Solanky
30d87a9a01 Update color of Khoj chat in Obsidinan plugin to Lantern theme 2023-07-01 02:18:47 -07:00
Debanjum Singh Solanky
51826d28d6 Ensure clicking Update in Khoj Obsidian indexes PDF files too 2023-07-01 02:18:47 -07:00
sabaimran
dac2d14380 Handle file names appropriately for md files and render commits in github results 2023-07-01 01:20:58 -07:00
sabaimran
dbe713604d Fix error in tests for markdown_to_jsonl 2023-07-01 00:49:40 -07:00
sabaimran
931aab4464 Handle case for when headers value is None 2023-07-01 00:37:30 -07:00
sabaimran
d01afb3ee4 Fix path issues for URL-based markdown files 2023-07-01 00:25:11 -07:00
sabaimran
31655447e7 Add the sign-up list to the chat page as well and update copy 2023-06-30 21:43:01 -07:00
sabaimran
796102c74e Add separate configuration if the given Khoj instance is meant for demo
- In theory, this will be suitable for any Khoj instance that's meant for external-facing purposes (as in, outside of the user's network)
- Prevent re-indexing for Github data if this is a demo instance
- Fix up some issues with the CSS which made settings page small in mobile
- In the frontend views for Khoj, add a button to get on the waitlist and links to the landing page
2023-06-30 20:38:55 -07:00
sabaimran
db3026739d Resolve diffs in api.py to make /chat endpoint async with new request parameter 2023-06-30 00:25:37 -07:00
sabaimran
ef72508914 Try/catch around github file decoding, await call to search in chat API, fix img width 2023-06-30 00:23:21 -07:00
Debanjum Singh Solanky
b950889f47 Fix org-mode web renderer to handle results containing list in block
- Break out of rendering list if at end of org block in org.js
- This would previous hang rendering results in web interface

Should try fix this upstream in org.js as well
2023-06-29 19:01:25 -07:00
sabaimran
780c769567 Add additional request headers to improve telemetry 2023-06-29 18:51:24 -07:00
sabaimran
6c10d68262
Merge pull request #253 from khoj-ai/features/github-issues-indexing
Support indexing Github issues as well as corresponding comments
2023-06-29 16:02:47 -07:00
sabaimran
b2dd946c6d Rename issue to entry method for accuracy 2023-06-29 15:23:50 -07:00
Debanjum Singh Solanky
51dfa48e2b Have Khoj support Python 3.11 as Pytorch supports it now
- Previously Khoj could only support Python upto 3.10 due to pytorch.
  But lots of folks had python 3.11 installed by default on their machines.

  This required installing python 3.10 and dealing with virtual envs.

  With Torch >= 2.0.1 now able to support python 3.11, at least one
  class of installation troubles for Khoj should drop. See
  https://github.com/pytorch/pytorch/issues/86566 for reference

- Preliminary testing indicates using the new torch 2.x may reduce
  search time by 25% (from 80ms to 60ms on Mac M1)

- Update Docs to not require mentioning python <=3.10 required
- Update Github test workflow to run khoj tests with python 3.11 too
2023-06-29 15:13:26 -07:00
sabaimran
65bf894302 Interpret org files as a list and put them in separate divs. Update styling of search results to separate into cards 2023-06-29 15:12:48 -07:00
Debanjum Singh Solanky
d212298573 Make Configure button on web interface incrementally update by default
We should add a way to force index everything.

But force indexing should not be the default when user is just trying
update content to index
2023-06-29 14:52:51 -07:00
Debanjum Singh Solanky
da2de21339 Only return requested result count even if search in multiple content types
- Set results_count to default value at start so it is an int, never None
2023-06-29 14:49:05 -07:00
sabaimran
77672ac0ae Demarcate different results with a border box
- Add back support for searching by type Github
- Remove custom class name in markdown js file
2023-06-29 14:14:25 -07:00
sabaimran
6edc32f2f4 Accept current changes to include issues in rendering flow 2023-06-29 12:25:29 -07:00
sabaimran
ab7dabe74f Explicitly use Union type for function parameters for lint checks 2023-06-29 11:44:30 -07:00
sabaimran
fecf6700d2 Limit small image rendering to just the avatar images 2023-06-29 11:27:18 -07:00
sabaimran
70e550250a Add an additional data source for issues from Github repositories + quality of life updates
- Use a request session to reduce the overhead of setting up a new connection with the Github URL each request
- Use the streaming feature for the REST api to reduce some of the memory footprint
2023-06-29 10:59:54 -07:00
Debanjum Singh Solanky
5f2717cc4b Use logger.warning since logger.warn is deprecated 2023-06-28 22:15:27 -07:00
Debanjum Singh Solanky
56ce97ef9e Use async/await in tests for query method of text and image search
The text, image search query method has become async. So async/await
is required to get results correctly in tests etc
2023-06-28 22:07:02 -07:00
Debanjum Singh Solanky
b1767f93d6 Get any configured asymmetric search model to encode query for search
- Set image_search.query to async to use it with multi-threading
  This is same as text_search.query being set to an async method
- Exit search early if no search_model is defined in state.model
2023-06-28 22:07:02 -07:00
Debanjum Singh Solanky
8eae7c898c Put each result under org heading when query for "all" content type in khoj.el
- Add "all" as default content type when no content type retrieved
  from server
2023-06-28 22:07:02 -07:00
Debanjum Singh Solanky
630bf995f1 Style each result based on its content type in same view on Khoj web
- So when searching across content types (with content-type = "all")
  org-mode results get rendered differently than markdown, PDF etc. results

- Set div class for each result separately instead of a single uber div
  for styling. This allows styling div of each result based on the
  content-type of that result

- No need to create placeholder "all" content type on web interface as
  server is passing an all content type by itself
2023-06-28 22:07:01 -07:00
Debanjum Singh Solanky
1773a78339 Fix createRequestUrl method signature to fetch results from khoj web 2023-06-28 12:10:45 -07:00
Debanjum Singh Solanky
212b1a96c8 Create "all" search type for search across all content types on khoj server
Allows moving logic to handle search across all content types to
server from clients
2023-06-28 11:34:26 -07:00
Debanjum Singh Solanky
0636ceaf14 Merge branch 'master' of github.com:khoj-ai/khoj into parallelize-search-across-all-asymmetric-text-content-types
Conflicts:
- src/khoj/routers/api.py: Use theirs
2023-06-27 16:10:32 -07:00
Debanjum Singh Solanky
510bb7e684 Use typing union in text_search for python 3.8 compatible type hinting 2023-06-27 15:59:50 -07:00
Debanjum Singh Solanky
1b11d5723d Extract search request URL builder into js function in web interface 2023-06-27 15:50:41 -07:00
Debanjum Singh Solanky
09f739b8cc Null check config, log warning instead of error when configuring search 2023-06-27 15:48:48 -07:00
sabaimran
9d62d66a77 Simplify construction of repo shorthand in GithubToJsonl 2023-06-27 15:05:03 -07:00
sabaimran
227169ebde Support configuration of multiple Github repositories in the settings interface
- Add cards to configure each of the Github repositories
- Fix a bug in the API which caused all other settings to be wiped when updating one of the content types
- Provide an error message to the user if they have a misconfiguration in their chat settings
2023-06-27 14:10:09 -07:00
sabaimran
37a1f15c38 Add backend support for indexing multiple repositories
- Add support for indexing org files as well as markdown files from the Github repository and update corresponding search view
- Support indexing a list of repositories
2023-06-27 12:06:15 -07:00
sabaimran
ddd550e6f4 Add call to use X-CSRFToken in relevant POST methods 2023-06-26 12:38:00 -07:00
sabaimran
35e24d7851 Fix null checking in state for content config API and telemetry API 2023-06-26 11:37:34 -07:00
sabaimran
5e39421f56 Merge branch 'master' of github.com:debanjum/khoj 2023-06-25 11:41:47 -07:00
sabaimran
4410a3bb4b Limit max width of the pre tag to 100% of the screen width 2023-06-25 11:41:15 -07:00
sabaimran
ffe66b848a Use a single column tempalte for config plugins when in mobile 2023-06-25 11:27:41 -07:00
Debanjum Singh Solanky
b1890aa050 Null check intermediary objects when config not fully initialized 2023-06-24 15:34:18 -07:00
Debanjum Singh Solanky
946af0889d Improve showing status message on saving config via web interface
- Show success/failure status message much closer to the save button
  Previously status message was shown on top of the page, which wasn't
  always in view and wasn't easily seen
- Improve the status message to more clearly show next steps on success
2023-06-24 00:49:57 -07:00
Debanjum Singh Solanky
40d1abfe50 Update the new /config APIs to configure Khoj for first time users
- Setup state.config and sub-components from unset state
- Setup search types with default settings
2023-06-24 00:45:30 -07:00
Debanjum Singh Solanky
edabede93a Fix post configuration state update on error or success on config html 2023-06-23 14:52:25 -07:00
Debanjum Singh Solanky
4744d69221 Resolve button name, anchor tag feedback. Add status message to settings page
- Use "Configure" name for settings config action
- Use more standard anchor tag instead of button
- Add configure status message
2023-06-23 09:48:38 -07:00
Debanjum Singh Solanky
26abafa658 Highlight currently active tab in web interface for orientation 2023-06-22 00:33:28 -07:00
Debanjum Singh Solanky
2728c714d7 Put pico.css in local assets. Move common css styling into khoj.css 2023-06-22 00:33:11 -07:00
Debanjum Singh Solanky
20a37697de Add Khoj header with navigation pane to Search and Chat Interfaces 2023-06-22 00:33:11 -07:00
Debanjum Singh Solanky
c467a0cbb0 Update UI of config sub pages to use khoj lantern theme styling 2023-06-22 00:33:11 -07:00
Debanjum Singh Solanky
0ce2ec590a Update main config page on khoj server to match khoj lantern theme 2023-06-21 20:25:25 -07:00
Debanjum Singh Solanky
d30a9ddd33 Use Khoj Logo on Search, Chat pages of Web Interface 2023-06-21 12:34:53 -07:00
Debanjum Singh Solanky
6d4aad57e1 Use new Khoj Lantern Logo in Web, Emacs, Obsidian UIs and Docs 2023-06-21 01:57:22 -07:00
Debanjum Singh Solanky
69d4fa6525 Rename project links across repo from debanjum/khoj to khoj-ai/khoj 2023-06-21 00:13:21 -07:00
Debanjum Singh Solanky
5c4eb950d5 Search across all content types via khoj.el on Emacs
If no content-type selected in transient menu option, khoj.el queries
khoj server without content-type parameter (t) set.

This results in search across all enabled asymmetric search text
content types
2023-06-20 23:39:56 -07:00
Debanjum Singh Solanky
2cd3e799d3 Improve null and type checks 2023-06-20 23:30:59 -07:00
Debanjum Singh Solanky
d5fb4196de Update web interface to allow querying all content types at once 2023-06-20 22:21:50 -07:00
Debanjum Singh Solanky
5c7c8d1f46 Use async/await to fix parallelization of search across content types 2023-06-20 22:21:50 -07:00
Debanjum Singh Solanky
1192e49307 Pass default value matching argument types expected by text_search methods 2023-06-20 22:21:50 -07:00
Debanjum Singh Solanky
0144e610d6 Only search across content types that work with asymmetric search 2023-06-20 22:21:46 -07:00
Debanjum Singh Solanky
f6a7aa6c96 Style Khoj chat on web interface with new lantern theme
- Color khoj chat message with new yellow theme color
- Update Khoj chat emoji to lantern
- Add page type to title of pages on web interface
2023-06-20 01:39:33 -07:00
Debanjum Singh Solanky
6d94d6e75a Encode the asymmetric, symmetric search queries in parallel for speed
Use timer to measure time to encode queries and total search time
2023-06-20 01:18:17 -07:00
Debanjum Singh Solanky
d292dc03b3 Use new Khoj Logotype in Web interface 2023-06-20 01:13:06 -07:00
Debanjum Singh Solanky
db07362ca3 Encode user query as same across search types to speed up query time
- Add new filter abstract method to remove filter terms from query
- Use the filter method to remove filter terms, encode this defiltered
  query and pass it to the query methods of each search types

TODO: Encoding query is still taking 100-200 ms unlike before. Need to
investigate why
2023-06-19 23:29:54 -07:00
Debanjum Singh Solanky
285d17af2a Search in parallel across all enabled content types requested via API
- Update API to return content from all enabled content types when type
  is not set to specific type in HTTP request param
- To do this efficiently run the search queries in parallel threads
2023-06-19 23:29:06 -07:00
Debanjum Singh Solanky
79d325fbb6 Fix triggering @general queries in Khoj Chat 2023-06-19 23:05:33 -07:00
Debanjum Singh Solanky
e97a20d70c Set conversation type if query param set, else return chat history
Only initialize variables if query is not empty, to avoid unnecessary
compute, variable null checks etc.

Fixes #230
2023-06-19 19:59:16 -07:00
sabaimran
4722a2c16d Add Github configuration page and success notifications 2023-06-18 10:06:45 -07:00
sabaimran
668135c763 Merge branch 'master' of github.com:debanjum/khoj into features/pretty-config-page 2023-06-18 08:35:09 -07:00
sabaimran
81183a1fe1 Address misc PR comments and update logo in all clients
- Rename the new logo to reflect accuracy on size (e.g., 128x128)
- Update the icns file for Mac
- Update nomenclature in settings pages
2023-06-18 08:34:58 -07:00
Debanjum Singh Solanky
a44cde2865 Show hint to re-index vault if wonky results in Obsidian search modal
Remove spurious indentation in Obsidian styles.css

Resolves #207
2023-06-18 04:53:51 -07:00
Debanjum Singh Solanky
595cc5b0f5 Use printer icon for PDF logs. Only split lines if file at web link in web interface 2023-06-18 02:26:03 -07:00
Debanjum Singh Solanky
e31a540a5e Get all md files recursively in repository by passing recursive param
Previously the `get_markdown_files' method was only getting files at
root of the repository

Fix, improve logger messages in github to jsonl processor
2023-06-18 01:47:15 -07:00
Debanjum Singh Solanky
6fdac24416 Set page size to 100 to reduce requests required to Github API to 1/3
- Default is 30. So number of paginated requests required to get all
  items (commits, files) will reduce by 67%

- No need to increase page size for the get tree Github API request from
  `get_markdown_files'

  Get tree Github API doesn't support pagination and return 100K items
  in response. This should be way more than enough for our current
  use-cases
2023-06-18 01:44:36 -07:00
Debanjum Singh Solanky
87975e589a Fix passing auth token to Github API to increase rate limits by x85
- Previously wasn't prefixing "token" to PAT token in Auth header
  This resulted in the request being considered unauthenticated

- Unauthenticated requests to Github API are limited to 60 requests/hour
  Authenticated requests to Github API are allowed 5000 requests/hour
2023-06-18 01:19:26 -07:00
Debanjum Singh Solanky
9c70af960c Extract logic to get file content from Github into a separate method 2023-06-18 01:19:13 -07:00