- Why
The khoj PyPI packages should be installed into the `khoj' directory.
Previously they were installed into the `src' directory, a generic
top-level directory name whose use is discouraged.
- Changes
- move src/* to src/khoj/*
- update `setup.py' to `find_packages' in `src' instead of project root
- rename imports to the form `from khoj.*' across the whole project
- update `constants.web_directory' path to use `khoj' directory
- rename root logger to `khoj' in `main.py'
- fix image_search tests to use the newly renamed `khoj' logger
- update config, docs, workflows to reference new path `src/khoj'
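The logger rename above matters to tests because Python loggers form a dot-separated hierarchy. A minimal sketch, assuming module loggers are named under `khoj` (the `image_search` child name and the `ListHandler` class are hypothetical, not taken from the project):

```python
import logging


class ListHandler(logging.Handler):
    """Collect formatted log messages in memory so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(f"{record.name}: {record.getMessage()}")


# The package-level logger named in the commit above.
logger = logging.getLogger("khoj")
logger.setLevel(logging.INFO)

handler = ListHandler()
logger.addHandler(handler)

# Hypothetical child logger; records propagate up to the "khoj" logger.
child = logging.getLogger("khoj.image_search")
child.info("indexed 3 images")
```

A test that attaches its handler to a differently named logger would not see these records, which is one plausible reason the image_search tests needed updating.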
- By default the obsidian plugin automatically configures the khoj
backend to index the current vault
- For more complex scenarios, users can manage their ~/.khoj/khoj.yml
manually by toggling the auto-configure setting off in the khoj
plugin settings
Resolves #156
- setup.py best practice recommends specifying only core dependencies,
not the dependencies of core dependencies
- The latest sentence-transformers (version 2.2.2) correctly installs its
huggingface_hub dependency, without which the application fails to start
### Background
1. Obsidian stores markdown notes as `utf8`[1]
2. By default, the python `open` command uses the OS locale encoding[2]
### Issue
Given the above, if the OS locale encoding isn't `utf8`, reading a note raises the `UnicodeDecodeError: <locale_encoding> codec can't decode byte` error
### Fix
- Read markdown files as `utf8`
The Obsidian plugin is the main use-case for markdown files in khoj currently and that stores md files as `utf8`.
Do not assume utf8 for other content types like org-mode, beancount for now.
- Fail if error in reading file as utf8, instead of ignoring errors.
Would rather have user realize that their files are not going to get indexed correctly.
[1]: https://forum.obsidian.md/t/better-handle-md-files-not-stored-in-utf8-format/13524/3
[2]: https://docs.python.org/3/library/functions.html#open
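A minimal sketch of the fix described above, not the project's actual code: passing `encoding="utf-8"` overrides the locale default, and `errors="strict"` (Python's default) makes a badly encoded file fail loudly. The `read_markdown` helper name and the sample files are hypothetical.

```python
import tempfile
from pathlib import Path


def read_markdown(path: Path) -> str:
    """Read a markdown file as UTF-8, raising UnicodeDecodeError on bad bytes."""
    # errors="strict" is the default; it is spelled out to show the intent
    # of failing instead of silently ignoring undecodable bytes.
    with open(path, encoding="utf-8", errors="strict") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    # A note stored as UTF-8 reads back unchanged, whatever the OS locale is.
    good = Path(tmp) / "note.md"
    good.write_bytes("# Café notes\n".encode("utf-8"))
    text = read_markdown(good)

    # A note stored in another encoding fails instead of being mis-indexed.
    bad = Path(tmp) / "latin1.md"
    bad.write_bytes("# Café notes\n".encode("latin-1"))
    try:
        read_markdown(bad)
        failed = False
    except UnicodeDecodeError:
        failed = True
```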
- Khoj supports indexing subdirectories but the khoj docker config
wasn't updated to support the same
- This should also allow khoj docker users to index multiple separate
directory trees by mounting them into separate sub folders within
/data/<content-type>/.
E.g. /data/org/dir1, /data/org/dir2 etc. in khoj_docker.yml
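To illustrate how separate mounts under one content-type folder can be picked up together, here is a rough sketch using a recursive glob. It assumes the files are selected with a `**` pattern; the `find_org_files` helper and the temporary directory layout are hypothetical and only mimic the /data/org/dir1, /data/org/dir2 example.

```python
import tempfile
from pathlib import Path


def find_org_files(data_root: Path) -> list:
    """Return org files under <data_root>/org at any depth, as relative POSIX paths."""
    matches = data_root.glob("org/**/*.org")
    return sorted(p.relative_to(data_root).as_posix() for p in matches)


with tempfile.TemporaryDirectory() as tmp:
    data_root = Path(tmp)
    # Stand-ins for two directory trees mounted into separate sub folders.
    for sub in ("dir1", "dir2"):
        (data_root / "org" / sub).mkdir(parents=True)
        (data_root / "org" / sub / "notes.org").write_text("* Heading\n", encoding="utf-8")

    found = find_org_files(data_root)
```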
- Background
1. Obsidian stores markdown notes as utf8[1]
2. By default, the python `open' command uses the OS locale encoding[2]
This was causing the `UnicodeDecodeError: <locale_encoding> codec can't decode byte' error
- Fix
- Read markdown files as utf8
The Obsidian plugin is the main use-case for markdown files in
khoj currently and that stores md files as utf8.
Do not assume utf8 for other content types like org-mode, beancount for now.
- Fail if error in reading file as utf8, instead of ignoring errors.
Would rather have user realize that their files are not going to
get indexed correctly.
[1]: https://forum.obsidian.md/t/better-handle-md-files-not-stored-in-utf8-format/13524/3
[2]: https://docs.python.org/3/library/functions.html#open
The Khoj plugin readme isn't recognized from within Obsidian. It seems
Obsidian only accepts a readme file with an uppercase name, so it
doesn't show the Khoj readme on the plugin page within Obsidian itself.
- Update khoj.el test to reflect updated rendering logic
- Move ledger render function before the image render function to keep
functions with similar logic closer together
### Details
- b415f87 Split find and jump to notes code in `onChooseSuggestion' method
- 37063f6 Truncate query to 8k chars for find similar notes from Obsidian plugin
- 4456cf5 No need to use `then' or `finally' in `async' functions after an `await'
- 4070be6 Pass app object from plugin instance to child objects and functions
- c203c6a Use Sentence case for Find similar note Obsidian command name
Split find file, jump to file code to make onChooseSuggestion more readable
- Use find, instead of using return in forEach to get first match
- Move the jump to file+heading code out from forEach
Do not reference the global app object directly from child objects
and functions.
It is only available for debugging purposes and access to it may be
dropped in the future.
- Use ERT to test `khoj.el'
- Test extracting and rendering of Org, Markdown and Ledger entries from Khoj API response
- Automate `khoj.el' testing using Github workflow
- Fix, Simplify and Test the get text around point code for the "Find Similar" feature
Previously no query syntax helpers, like the "file:" prefix, were used
before checking if the query contains a file path.
This made image search queries prone to misinterpretation and
triggered pointless file path checks.
Add test to verify search by image at a file path works as expected
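A minimal sketch of the idea, assuming a query is treated as an image path only when it carries the "file:" prefix. The `image_path_from_query` helper is hypothetical and is not the project's implementation.

```python
import tempfile
from pathlib import Path
from typing import Optional

FILE_PREFIX = "file:"


def image_path_from_query(query: str) -> Optional[Path]:
    """Return the image path named by a 'file:' query, or None for a text query."""
    # Without the prefix, skip the filesystem check entirely.
    if not query.startswith(FILE_PREFIX):
        return None
    candidate = Path(query[len(FILE_PREFIX):].strip()).expanduser()
    return candidate if candidate.is_file() else None


with tempfile.TemporaryDirectory() as tmp:
    image = Path(tmp) / "kitten.jpg"
    image.write_bytes(b"not a real image")

    by_file = image_path_from_query(f"file:{image}")
    by_text = image_path_from_query("kitten playing in a park")
    # A bare path without the prefix is treated as plain text in this sketch.
    bare_path = image_path_from_query(str(image))
```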
### Overview
Find items of specified type similar to current text item at point
### Capabilities
- Support querying with text surrounding point in any text buffer
- Find similar items of specified content type indexed on Khoj
### Details
- Query using text in current section if in an `outline-mode` buffer (i.e. markdown heading, org-mode entry text)
- Query using text in current paragraph if in non `outline-mode` buffer
- Search for items of `content-type` set in khoj transient menu
- Update last used khoj content-type and results from the
*find-similar* and *update* functions for later reuse
### Related
- Recently added [Find Similar Notes in Khoj Obsidian](https://github.com/debanjum/khoj/pull/122) as well