- Ensure all tensors are on MPS device before doing operations across them
- Background
- GPU is used by default for Khoj on MacOS now
- Using the GPU on Macs requires PyTorch > 1.13.0, which Khoj now uses
- MPS should speed up search and indexing on MacOS
Fix usage warning for unescaped single quote in `khoj.el' docstring.
Converts usage of '<text>' into `<text>' to use the correct quote forms in generated docs
⛔ Warning (comp): khoj.el:119:2: Warning: docstring has wrong usage of unescaped single quotes (use \= or different quoting)
⛔ Warning (comp): khoj.el:120:2: Warning: docstring has wrong usage of unescaped single quotes (use \= or different quoting)
⛔ Warning (comp): khoj.el:121:2: Warning: docstring has wrong usage of unescaped single quotes (use \= or different quoting)
⛔ Warning (comp): khoj.el:168:2: Warning: docstring has wrong usage of unescaped single quotes (use \= or different quoting)
### Plugin Features
- Search Obsidian notes using Khoj
*Provide Natural language search on your (markdown) notes in Obsidian Vault*
- Show search results as rendered Markdown
*Improve legibility of the results*
- Jump to selected note from search result in Khoj search modal
*Simplify seeing result within its original note context*
- Automatically configure khoj to index markdown files in current vault
*Reduce khoj setup steps for plugin users by using reasonable defaults*
- Code updates the markdown config in `khoj.yml` and triggers index update
- It can be configured by user in khoj plugin settings, if required
- Add Demo and detailed Readme for the Obsidian plugin
*Ease setup and usage. Give context about capabilities*
### Miscellaneous
- (Try) Keep a mono repo until the Khoj project is mature enough
to reduce maintenance burden
### Commits Details
- 0e39e0f Add details about the Khoj Obsidian plugin to the main Readme
- cd8b918 Add `manifest.json`, `versions.json` of Obsidian plugin to project root
- 66ccd0c Create Obsidian plugin for Khoj
- Add Khoj in Obsidian Demo
- Update Interfaces Screenshot to include Obsidian Plugin Screenshot
- Update .gitignore to ignore obsidian plugin ignorelist
Section the .gitignore for better readability
- Update the Setup, Usage instructions to include information about
the Obsidian plugin
- Obsidian provides limited support for plugins in larger repositories.
Currently, it does not have a way to specify the directory of a plugin,
so it expects the plugin's `manifest.json' and `versions.json' to be at
the project root
- While this unnecessarily litters the codebase, it is the (current)
required tradeoff for keeping the core plugins in a mono repo
- Features
- Search using Khoj from within the Obsidian app
Allow Natural language search on your (markdown) notes in Obsidian Vault
- Show search results as rendered (instead of raw) Markdown
Improve legibility of the results
- Jump to selected note from search result in Khoj search modal
Simplify seeing result within its original note context
- Automatically configure khoj to index markdown files in current vault
Reduce khoj setup steps for plugin users by using reasonable defaults
- Code updates the markdown config in khoj.yml and triggers index update
- It can be configured by user in khoj plugin settings, if required
- Add Demo and detailed Readme for the Obsidian plugin
Ease setup and usage. Give context about capabilities
- Miscellaneous
- Try to keep a mono repo until the Khoj project is mature enough
to reduce maintenance burden
This can ease configuring khoj from the different interfaces
- Don't need to know all the (default) config used by khoj.
- Just get default config by calling the above API endpoint.
- Then modify desired portions and call POST /api/config/data to
configure khoj.
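The "modify desired portions" step above can be sketched as a recursive merge of overrides into the default config. This is a minimal illustration, not Khoj's own code: `merge_config` is a hypothetical helper, and the config keys and values shown are assumed placeholders standing in for whatever the default-config endpoint returns.

```python
import copy


def merge_config(default: dict, overrides: dict) -> dict:
    """Return a copy of `default` with `overrides` applied recursively."""
    merged = copy.deepcopy(default)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Descend into nested sections so sibling keys are preserved
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Assumed placeholder for the default config fetched from the API
default_config = {
    "content-type": {
        "markdown": {"input-filter": None, "embeddings-file": "markdown.pt"}
    },
    "search-type": {"asymmetric": {"model": "some-model"}},
}

# Only the portion the client wants to change (hypothetical vault path)
overrides = {"content-type": {"markdown": {"input-filter": ["/vault/**/*.md"]}}}

new_config = merge_config(default_config, overrides)
```

The resulting `new_config` would then be sent as the JSON body of `POST /api/config/data`; the client never has to spell out the settings it leaves at their defaults.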
- Start khoj server (in non-GUI mode) without needing config file
already instantiated.
- But warn the user to configure khoj before using it
- This allows plugins to configure the app via the /config/data APIs
- To be used by the Khoj obsidian plugin to configure markdown content
in khoj
- c535953 Update index automatically in non GUI mode too
- 701d92e Lock the index before updating it via API or Scheduler
- 3b0783a Automate updating embeddings, search index on an hourly schedule
Resolves #106
- Poll scheduler every minute using threading.Timer
- Use 60 seconds polling interval to avoid fork bombing
- Schedule next via the same poll scheduler
- Allow clean program interrupt by running scheduler in daemon mode
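The polling pattern described above could look roughly like the sketch below. It is an illustration under assumptions, not the code from the commit: `start_poller`, `task`, and the stop event are hypothetical names, and `task` stands in for whatever checks and runs the due jobs.

```python
import threading


def start_poller(task, interval_seconds=60.0):
    """Call `task` every `interval_seconds` using chained daemon timers.

    Returns an Event; set it to stop scheduling further polls.
    """
    stop_event = threading.Event()

    def schedule_next():
        if stop_event.is_set():
            return
        timer = threading.Timer(interval_seconds, poll)
        timer.daemon = True  # daemon threads do not block interpreter exit
        timer.start()

    def poll():
        if stop_event.is_set():
            return
        try:
            task()
        finally:
            # Re-arm from inside the poll itself, so at most one timer is pending
            schedule_next()

    schedule_next()
    return stop_event
```

Because each poll arms only the next timer, a single timer is pending at any moment, and marking the timer thread as a daemon lets Ctrl-C end the program without waiting for it.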
- There are 3 paths to updating/setting the index (stored in state.model)
- App start
- API
- Scheduler
- Put all updates to the index behind a lock, as multiple update paths
could (potentially) run at the same time (via API or Scheduler)
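A minimal sketch of guarding the index with a lock follows. Only `state.model` comes from the notes above; `IndexState`, `update_index`, and `build_index` are hypothetical names used for illustration.

```python
import threading


class IndexState:
    """Holds the search index and the lock that guards updates to it."""

    def __init__(self):
        self.model = None
        self.lock = threading.Lock()


state = IndexState()


def update_index(build_index):
    """Rebuild the index; callers from any path (start, API, scheduler) share one lock."""
    with state.lock:
        # Only one thread at a time can run the build and swap in the result
        state.model = build_index()
        return state.model
```

Routing app start, the API handler, and the scheduler through the same function means a scheduled update that overlaps an API-triggered one waits for the lock instead of writing to the index concurrently.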
- Remove property drawer from test entry for max_words splitting test
- Property drawer is not required for the test
- Keep minimal test case to reduce chance for confusion
- Required because entries are now split by the max_word count supported
by the ML models
- This could now result in duplicate hits, with the same entry being
returned to the user more than once
- Do deduplication after ranking to get the top ranked deduplicated
results
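Deduplicating after ranking can be sketched as below. The hit structure (dicts with `entry_id` and `score` keys) and the function name are assumptions for illustration, not the shape Khoj actually uses.

```python
def deduplicate_ranked(hits, key="entry_id"):
    """Sort hits by descending score, then keep the first hit seen per entry."""
    seen = set()
    unique = []
    for hit in sorted(hits, key=lambda h: h["score"], reverse=True):
        if hit[key] in seen:
            continue  # a higher-ranked chunk of this entry was already kept
        seen.add(hit[key])
        unique.append(hit)
    return unique


# Hypothetical hits: entry "a" was split into two chunks that both matched
hits = [
    {"entry_id": "a", "score": 0.4},
    {"entry_id": "b", "score": 0.7},
    {"entry_id": "a", "score": 0.9},
]
top_hits = deduplicate_ranked(hits)
```

Ranking first matters: deduplicating before ranking could keep a low-scoring chunk of an entry and discard its best-matching one.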
- The instructions suggest installing khoj-assistant via pip install.
This installs the latest tagged/release version of khoj
- To match that version, the user should install khoj.el from MELPA Stable
instead of MELPA
- Issue
ML Models truncate entries exceeding some max token limit.
This lowers the quality of search results
- Fix
Split entries by max tokens before indexing.
This should improve searching for content in longer entries.
- Miscellaneous
- Test method to split entries by max tokens
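The fix can be illustrated with a short sketch. It assumes whitespace-separated words as a stand-in for model tokens; the function name and the default limit of 256 are hypothetical, not taken from the commit.

```python
def split_entry_by_max_tokens(text, max_tokens=256):
    """Split `text` into chunks of at most `max_tokens` whitespace-separated words."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    words = text.split()
    return [
        " ".join(words[start:start + max_tokens])
        for start in range(0, len(words), max_tokens)
    ]


chunks = split_entry_by_max_tokens("one two three four five", max_tokens=2)
```

Each chunk is then indexed as its own entry, so content past the model's limit is embedded rather than silently truncated.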
Update readme to ask user to install khoj.el from MELPA when a
pre-release version of the main khoj app is installed. Else install
khoj.el from MELPA Stable
- **Improve API Endpoints**
- ee65a4f Merge /reload, /regenerate into single /update API endpoint
- 9975497 Type the /search API response to better document the response schema
- 0521ea1 Put image score breakdown under `additional` field in search response
- **Formalize Intermediary Format to Index Text Content**
- 7e9298f Use new Text `Entry` class to track text entries in Intermediate Format
- 02d9440 Use Base `TextToJsonl` class to standardize `<text>_to_jsonl` processors
- **Modularize API router code**
- e42a38e Split router code into `web_client`, `api`, `api_beta` routers. Version Khoj API
- d292bdc Remove API versioning. Premature given current state of the codebase
- **Miscellaneous**
- c467df8 Setup `mypy` for static type checking
- 2c54813 Remove unused imports, `embeddings` variable from text search tests