Debanjum Singh Solanky
510bb7e684
Use typing union in text_search for python 3.8 compatible type hinting
2023-06-27 15:59:50 -07:00
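The commit above is about type hints that still work on Python 3.8. As a minimal sketch of the idea (the function name and signature are hypothetical and not taken from the khoj source), `typing.Union` can replace the `X | Y` annotation syntax, which is only supported at runtime from Python 3.10:

```python
from typing import List, Union


def normalize_query(query: Union[str, List[str]]) -> str:
    """Hypothetical helper: accept one query string or a list of terms."""
    # Writing `str | List[str]` in the annotation would raise a TypeError when
    # this function is defined on Python 3.8 (unless annotation evaluation is
    # deferred). Union[...] works on every Python 3 version with typing.
    if isinstance(query, list):
        return " ".join(query)
    return query
```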
Debanjum Singh Solanky
1b11d5723d
Extract search request URL builder into js function in web interface
2023-06-27 15:50:41 -07:00
Debanjum Singh Solanky
09f739b8cc
Null check config, log warning instead of error when configuring search
2023-06-27 15:48:48 -07:00
Debanjum Singh Solanky
5c4eb950d5
Search across all content types via khoj.el on Emacs
...
If no content-type selected in transient menu option, khoj.el queries
khoj server without content-type parameter (t) set.
This results in search across all enabled asymmetric search text
content types
2023-06-20 23:39:56 -07:00
Debanjum Singh Solanky
2cd3e799d3
Improve null and type checks
2023-06-20 23:30:59 -07:00
Debanjum Singh Solanky
d5fb4196de
Update web interface to allow querying all content types at once
2023-06-20 22:21:50 -07:00
Debanjum Singh Solanky
5c7c8d1f46
Use async/await to fix parallelization of search across content types
2023-06-20 22:21:50 -07:00
Debanjum Singh Solanky
1192e49307
Pass default value matching argument types expected by text_search methods
2023-06-20 22:21:50 -07:00
Debanjum Singh Solanky
0144e610d6
Only search across content types that work with asymmetric search
2023-06-20 22:21:46 -07:00
Debanjum Singh Solanky
6d94d6e75a
Encode the asymmetric, symmetric search queries in parallel for speed
...
Use timer to measure time to encode queries and total search time
2023-06-20 01:18:17 -07:00
Debanjum Singh Solanky
db07362ca3
Encode user query once and reuse it across search types to speed up query time
...
- Add new filter abstract method to remove filter terms from query
- Use the filter method to remove filter terms, encode this defiltered
query and pass it to the query methods of each search type
TODO: Encoding query is still taking 100-200 ms unlike before. Need to
investigate why
2023-06-19 23:29:54 -07:00
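The defiltering step described in the commit above could look roughly like the sketch below. All names here (`BaseFilter`, `WordFilter`, `defilter`, `defilter_query`) and the `+"word"` / `-"word"` filter syntax are assumptions made for illustration, not the project's confirmed API:

```python
import re
from abc import ABC, abstractmethod
from typing import Iterable


class BaseFilter(ABC):
    """Hypothetical base class: every filter can strip its own terms."""

    @abstractmethod
    def defilter(self, query: str) -> str:
        """Return the query with this filter's terms removed."""


class WordFilter(BaseFilter):
    # Assumed syntax for illustration: +"word" to require, -"word" to exclude
    pattern = re.compile(r'[+-]"[^"]+"')

    def defilter(self, query: str) -> str:
        # Drop the filter terms, then collapse the leftover whitespace
        return " ".join(self.pattern.sub("", query).split())


def defilter_query(query: str, filters: Iterable[BaseFilter]) -> str:
    """Apply every filter's defilter step; the result is what gets encoded."""
    for query_filter in filters:
        query = query_filter.defilter(query)
    return query
```

The defiltered string can then be encoded a single time and the resulting embedding handed to each search type, instead of every search type encoding the query again.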
Debanjum Singh Solanky
285d17af2a
Search in parallel across all enabled content types requested via API
...
- Update API to return content from all enabled content types when type
is not set to specific type in HTTP request param
- To do this efficiently run the search queries in parallel threads
2023-06-19 23:29:06 -07:00
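A minimal sketch of searching content types in parallel threads, as the commit above describes. The `search_all` function and the shape of the `searchers` mapping are hypothetical; the real server code may be organised differently:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional


def search_all(
    query: str,
    searchers: Dict[str, Callable[[str], List[str]]],
    content_type: Optional[str] = None,
) -> List[str]:
    """Search one content type if requested, else all enabled ones in parallel."""
    # With no content type requested, every enabled searcher is selected
    selected = {
        name: search
        for name, search in searchers.items()
        if content_type is None or name == content_type
    }
    if not selected:
        return []

    results: List[str] = []
    with ThreadPoolExecutor(max_workers=len(selected)) as executor:
        # Submit all searches first so they run concurrently
        futures = [executor.submit(search, query) for search in selected.values()]
        # Collect in submission order to keep the output deterministic
        for future in futures:
            results.extend(future.result())
    return results
```

Threads help here only to the extent that each search releases the GIL or waits on I/O; the sketch shows the control flow, not a measured speedup.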
Debanjum Singh Solanky
79d325fbb6
Fix triggering @general queries in Khoj Chat
2023-06-19 23:05:33 -07:00
Debanjum Singh Solanky
e97a20d70c
Set conversation type if query param set, else return chat history
...
Only initialize variables if query is not empty, to avoid unnecessary
compute, variable null checks etc.
Fixes #230
2023-06-19 19:59:16 -07:00
sabaimran
6224dce49d
Merge pull request #228 from debanjum/features/pretty-config-page
...
Update the config page to be more usable
2023-06-19 18:11:35 -07:00
sabaimran
4722a2c16d
Add Github configuration page and success notifications
2023-06-18 10:06:45 -07:00
sabaimran
668135c763
Merge branch 'master' of github.com:debanjum/khoj into features/pretty-config-page
2023-06-18 08:35:09 -07:00
sabaimran
81183a1fe1
Address misc PR comments and update logo in all clients
...
- Rename the new logo to accurately reflect its size (e.g., 128x128)
- Update the icns file for Mac
- Update nomenclature in settings pages
2023-06-18 08:34:58 -07:00
Debanjum Singh Solanky
a44cde2865
Show hint to re-index vault if wonky results in Obsidian search modal
...
Remove spurious indentation in Obsidian styles.css
Resolves #207
2023-06-18 04:53:51 -07:00
Debanjum Singh Solanky
595cc5b0f5
Use printer icon for PDF logs. Only split lines if file at web link in web interface
2023-06-18 02:26:03 -07:00
Debanjum
e06be395f9
Use Github REST API and Index Commit Messages off Github Repository
...
- Migrate to Github REST API instead of Llama Hub to index Markdown Docs in Github Repository
- Index Commit Messages from Github Repository as well
2023-06-18 14:51:32 +05:30
Debanjum Singh Solanky
e31a540a5e
Get all md files recursively in repository by passing recursive param
...
Previously the `get_markdown_files' method was only getting files at
root of the repository
Fix, improve logger messages in github to jsonl processor
2023-06-18 01:47:15 -07:00
Debanjum Singh Solanky
6fdac24416
Set page size to 100 to reduce requests required to Github API to 1/3
...
- Default is 30. So number of paginated requests required to get all
items (commits, files) will reduce by 67%
- No need to increase page size for the get tree Github API request from
`get_markdown_files'
Get tree Github API doesn't support pagination and returns up to 100K items
in response. This should be way more than enough for our current
use-cases
2023-06-18 01:44:36 -07:00
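The arithmetic behind the page size change can be checked with a short sketch. The helper name and the item count of 3000 are illustrative assumptions; `per_page` is the Github REST API query parameter, with a default of 30 and a maximum of 100:

```python
import math


def paginated_requests(total_items: int, per_page: int) -> int:
    """Number of paginated requests needed to fetch total_items."""
    return math.ceil(total_items / per_page)


# Illustrative repository with 3000 commits
default_requests = paginated_requests(3000, per_page=30)
larger_page_requests = paginated_requests(3000, per_page=100)
```

With these numbers the request count drops from 100 to 30, which is roughly the one-third figure in the commit subject.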
Debanjum Singh Solanky
87975e589a
Fix passing auth token to Github API to increase rate limits by ~83x
...
- Previously wasn't prefixing "token" to PAT token in Auth header
This resulted in the request being considered unauthenticated
- Unauthenticated requests to Github API are limited to 60 requests/hour
Authenticated requests to Github API are allowed 5000 requests/hour
2023-06-18 01:19:26 -07:00
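A minimal sketch of the header fix described above. The helper name `github_headers` is hypothetical; the `token` prefix in the `Authorization` header is the scheme the commit refers to:

```python
from typing import Dict


def github_headers(pat_token: str) -> Dict[str, str]:
    """Build request headers for the Github API from a personal access token."""
    if not pat_token:
        # No token configured: the request is treated as unauthenticated
        return {}
    # Without the "token " prefix the request is treated as unauthenticated
    return {"Authorization": f"token {pat_token}"}


# Rate limits quoted in the commit body: 60/hour vs 5000/hour
rate_limit_gain = 5000 / 60
```

The ratio of the two limits quoted in the commit body works out to roughly 83.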
Debanjum Singh Solanky
9c70af960c
Extract logic to get file content from Github into a separate method
2023-06-18 01:19:13 -07:00
Debanjum Singh Solanky
10d4c38ce9
Extract Wait for rate limit reset logic into a function for reuse
2023-06-18 01:06:46 -07:00
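One way such a reusable wait function could be written is sketched below. The function names are hypothetical; the `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers are the ones the Github REST API uses to report the remaining quota and the reset time in epoch seconds:

```python
import time
from typing import Callable, Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before the next request, 0 if quota remains."""
    # Note: a plain dict lookup is case-sensitive; HTTP client header objects
    # are usually case-insensitive.
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = float(headers.get("X-RateLimit-Reset", "0"))
    current = time.time() if now is None else now
    # Never return a negative wait if the reset time has already passed
    return max(0.0, reset_at - current)


def wait_for_rate_limit_reset(
    headers: Mapping[str, str], sleep: Callable[[float], None] = time.sleep
) -> float:
    """Sleep until the rate limit resets; returns the seconds waited."""
    wait = seconds_until_reset(headers)
    if wait > 0:
        sleep(wait)
    return wait
```

Passing `sleep` as a parameter keeps the function easy to test without actually pausing.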
sabaimran
aad7f825e0
Remove music configuration
2023-06-17 21:23:56 -07:00
sabaimran
5f97afbfac
Ignore type checks from mypy in subindexed fields
2023-06-17 16:53:36 -07:00
sabaimran
c2d46de8bc
Add endpoint for regenerating directly from the config page and add music content-type
2023-06-17 15:47:33 -07:00
sabaimran
ded3100caf
Update the configuration page to make config management easier
...
- Add a central configuration management page to make management of config details easier
- Add relevant api endpoints both for client and server to update/request data as necessary
- Attempt to update the favicon
2023-06-17 15:21:28 -07:00
Debanjum Singh Solanky
3f24e53b6e
Render URL as link in web interface if file param of result is a web link
2023-06-17 04:26:40 -07:00
Debanjum Singh Solanky
63ec84ad78
Store Github URL of Markdown files on Github in file jsonl param
2023-06-17 04:23:01 -07:00
Debanjum Singh Solanky
0c1c7583b5
Handle pagination, API rate limits. Get all commits from Github repo
2023-06-17 04:21:39 -07:00
Debanjum Singh Solanky
31d17d0b22
Index commit messages from repository with the github plugin
2023-06-17 02:59:54 -07:00
Debanjum Singh Solanky
c29c141a7e
Use Github Rest API to index Markdown files in Github Repository
...
The Llama_Hub Github plugin is fairly limited.
The Github Rest API is well supported and can easily be extended to
index commit messages, issues, discussions, PRs etc.
2023-06-17 02:16:13 -07:00
Debanjum
9f00a366ab
Add a Github plugin to index content from a Github repository
...
- Use the Github plugin on LlamaHub to read in markdown files from specified Github repository for indexing
- Update the desktop GUI application to take in the required parameters to read from Github
- Requires a classic PAT token for Github access
2023-06-17 12:28:47 +05:30
Saba
ac96f43b1b
Remove try-catch specific to Github plugin; consolidate GUI logic
2023-06-16 23:46:25 -07:00
Saba
07ade2262a
Set default value of pat_token in conftest.py to be empty string
2023-06-13 17:03:03 -07:00
Saba
751edfefe5
Add separate unit test for github. Will only run if a PAT token is set
2023-06-13 16:55:58 -07:00
Saba
3a61919344
Fix failing unit tests by hard-coding model presence of expected search types
2023-06-13 16:32:47 -07:00
Saba
019d3732de
Rename orgmode_search to org_search
2023-06-13 16:06:54 -07:00
Saba
08d79f5ba4
Unify types used in Github and other text-based configs. Fix typing issues
2023-06-13 15:52:36 -07:00
Saba
a6cd96a6a9
Add a Github plugin which can be used to read from a Github repository
2023-06-13 14:40:06 -07:00
Debanjum
c68cde4803
Log clients calling API endpoints on Khoj server
...
- Make API endpoints on Khoj server accept `client` as request parameter
- Khoj API endpoints: /chat, /search, /update
- Make Khoj clients set `client` request param when calling the API endpoints on the Khoj server
- Khoj clients: Emacs, Obsidian and Web
- Also log khoj server_version running to telemetry server
2023-06-09 18:36:49 +05:30
sabaimran
59fa48036f
Merge pull request #224 from debanjum/fix/message-exceeds-prompt-size
...
Pass truncated message as string in ChatMessage when exceeding max prompt size
2023-06-08 17:32:53 -07:00
Debanjum Singh Solanky
139a3ba060
Update server to log new server version field to telemetry db
2023-06-08 14:14:21 +05:30
Saba
c5666e0404
Move factory dependencies to optional settings
2023-06-06 23:26:24 -07:00
Saba
5d5ebcbf7c
Rename truncate messages method and update unit tests to simplify assertion logic
2023-06-06 23:25:43 -07:00
Saba
7119ed0849
Run pre-commit script
2023-06-05 19:29:23 -07:00
Saba
948ba6ddca
Remove unused logger
2023-06-05 19:01:03 -07:00