Commit graph

1321 commits

Author SHA1 Message Date
Debanjum Singh Solanky
ab0d3a08e2 Index configured plugins on app start and via update API endpoint 2023-02-28 20:25:51 -06:00
Debanjum Singh Solanky
55a032e8c4 Add processor to index entries from jsonl files for plugins
- Read, merge entries from input jsonl files and filters
- Mark new, modified entries for update
2023-02-24 02:54:12 -06:00
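The two steps named in this commit message (merge entries from several jsonl files, then mark new or modified entries for update) can be sketched roughly as below. This is only an illustration under stated assumptions: the function names `load_jsonl`, `entry_hash` and `mark_for_update`, the `raw` field, and the hash-based change detection are all hypothetical and not taken from the Khoj code.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def load_jsonl(paths):
    # Hypothetical helper: merge entries from every input file, skipping blank lines.
    entries = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            entries.extend(json.loads(line) for line in f if line.strip())
    return entries


def entry_hash(entry):
    # Stable content hash so unchanged entries can be recognized on the next run.
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def mark_for_update(current, previous_hashes):
    # An entry whose hash was not seen before is treated as new or modified.
    return [(entry, entry_hash(entry) not in previous_hashes) for entry in current]


with tempfile.TemporaryDirectory() as tmp:
    a, b = Path(tmp) / "a.jsonl", Path(tmp) / "b.jsonl"
    a.write_text('{"raw": "first"}\n', encoding="utf-8")
    b.write_text('{"raw": "second"}\n\n', encoding="utf-8")
    merged = load_jsonl([a, b])

# Pretend only the first entry was indexed on a previous run.
previous = {entry_hash({"raw": "first"})}
flags = [needs_update for _, needs_update in mark_for_update(merged, previous)]
```

Here only the second entry is flagged, since its hash is absent from the assumed previous index.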
Debanjum Singh Solanky
fcbbe8c759 Read content plugin configs from Khoj config YAML
Configure external text content plugins via the Khoj YAML
Reuse existing TextContentConfig definition for external text content plugins
2023-02-23 23:57:32 -06:00
Debanjum Singh Solanky
f57d7bf5ad Use pypi khoj to fix docker builds and dockerize github workflow
- Instead of building the package locally like before
  The issue started since moving to dynamic git based versioning with hatch-vcs
  This should reduce image size of docker builds too

- Also move to ubuntu image since pyqt6 builds are available on it, so
  it does not need to be built locally for the image

- This s
2023-02-19 01:57:01 -06:00
Debanjum Singh Solanky
fada617faa Fix TOC links, Add how to auto start Khoj server to Readme
Rename tools directory to more standard scripts directory
2023-02-18 23:51:02 -06:00
Debanjum Singh Solanky
61b6ee2857 Use helper script to bump khoj pre-release versions 2023-02-17 20:31:51 -06:00
Debanjum Singh Solanky
47c2cc63e1 Automate uploading Obsidian artifacts to new releases 2023-02-17 19:57:44 -06:00
Debanjum Singh Solanky
a8940462c4 Automate khoj python package versioning using hatch-vcs and Git tags 2023-02-17 18:19:01 -06:00
Debanjum Singh Solanky
053d6141f3 Ignore ts typing error, Fix SPDX license identifier in Obsidian plugin 2023-02-17 18:19:01 -06:00
Debanjum Singh Solanky
47569da38e Fix usage of "\" in orgnode test string to resolve DeprecationWarning 2023-02-17 17:15:44 -06:00
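For context, Python emits a DeprecationWarning (a SyntaxWarning from Python 3.12) when a normal string literal contains a backslash followed by a character that is not a recognized escape. The actual orgnode test string is not shown in this log, so the string below is purely illustrative; the sketch only shows the two usual warning-free spellings.

```python
# Two warning-free ways to write a literal backslash in a string.
escaped = "\\* Heading"  # escape the backslash explicitly
raw = r"\* Heading"      # or use a raw string literal

# Both spellings produce the same 10-character string starting with "\".
same = escaped == raw
```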
Debanjum Singh Solanky
36be3c4b8f Fix or ignore MyPy issues in PyQt desktop GUI code
- Remove unneeded type ignore for mps with the latest mypy
- Stop excluding PyQT desktop GUI code from MyPy checks
- Do not warn about unused ignores. Some issue with mypy giving
  different errors in different environments (venv, system and pre-commit)
2023-02-17 16:13:05 -06:00
Debanjum Singh Solanky
fd0a2f55f8 Run mypy checks in test workflow and on push (via pre-commit)
- Run mypy on git push (not every commit) but for all files
  - Running it on pre-commit doesn't make sense, as mypy wants to look
    at all files, not just diff files
  - But this is too time consuming to run every commit, so run on push

- Update development section documentation on installing, manually
  running pre-commit for validation that includes running mypy checks
2023-02-17 16:08:56 -06:00
Debanjum Singh Solanky
5c0d340970 Update Development section in Readme. Add steps for code validation 2023-02-17 13:31:37 -06:00
Debanjum Singh Solanky
051f0e3fb5 Add, configure and run pre-commit locally and in test workflow 2023-02-17 13:31:36 -06:00
Debanjum Singh Solanky
5e83baab21 Use Black to format Khoj server code and tests 2023-02-17 11:55:17 -06:00
Debanjum Singh Solanky
6130fddf45 Install pytest as optional dev dependency of app in test workflow 2023-02-17 10:11:57 -06:00
Debanjum Singh Solanky
8b293edd7c Move mypy config into pyproject.toml. Ignore 2 remaining mypy issues 2023-02-16 03:33:08 -06:00
Debanjum Singh Solanky
7a9a811874 Fix authors, homepage URL in pyproject.toml and workflow triggers 2023-02-16 03:19:56 -06:00
Debanjum Singh Solanky
dcb86c2d3e Build khoj python package using hatchling, pyproject.toml
- Why
  - pyproject.toml is the python standards compliant config format
    - allows collating python tooling configs into single standard file
  - hatch(-ling) is a new lightweight build system for python packages

- Detailed Changes
  - Replace setup.py, setuptools with pyproject.toml, hatchling for
    khoj python config and build
  - move pytest into optional development dependencies
  - add more links to khoj in the project urls section
  - add topic classifiers and keywords to find khoj package

  - Delete setup.py, MANIFEST.in as moved to pyproject.toml based setup
  - Update pypi workflow to set python package version in pyproject.toml
2023-02-16 02:37:32 -06:00
Debanjum Singh Solanky
c641eb4ad6 Improve rendering log and error stacktraces using the Rich package
- Use Rich to render uvicorn, fastAPI logs as well
  The previous CustomFormatter only worked on khoj logs
- Improve rendering stacktrace on errors using Rich
2023-02-15 16:19:32 -06:00
Debanjum Singh Solanky
a403def19e Fix workflow to publish Khoj python package to PyPi 2023-02-14 22:19:21 -06:00
Debanjum
eee57599ad Improve Dockerize, Publish to PyPi Workflows
- fb86dea Create tagged Docker image on new tag/release
- 01fd98b Improve workflow to publish khoj to pypi
2023-02-14 21:11:56 -06:00
Debanjum Singh Solanky
af6d65a909 Create tagged Docker image on new tag/release 2023-02-14 20:04:06 -06:00
Debanjum Singh Solanky
25e06f26c0 Improve workflow to publish khoj to pypi
- Use emojis to improve visual indicator of action step
- Rename to pypi instead of the more ambiguous publish name
  Publish could mean publish docker image, publish to pypi, MELPA or
  Obsidian plugin
- Update workflow badge, link pypi badge to khoj pypi package page
- Use pypa official github action to upload package to (test) pypi
  instead of doing it manually using twine
- Upload python package artifact for easier access for testing.
  As uploading to testpypi doesn't work for PRs by others from forked repos
2023-02-14 20:03:35 -06:00
Debanjum
11873795a6 Use src layout to fix packaging khoj for pypi
### Issue
The khoj python package was using a common top level name[1], `src' instead of `khoj' due to incorrect usage of the src layout[2]

### Fix
Put content meant for python packaging from `src/' to `src/khoj/'
Update code, tests, configs and docs to reference new layout

The `khoj' python package should now get unpacked under `khoj' instead of `src' directory

### Details
- 25a749c Use the src/ layout to fix packaging Khoj for PyPi
- bc7477e Move Emacs, Obsidian plugin code out from under src/khoj directory
- f83cf4e Check wheel contents in workflow before publishing Khoj to PyPI

[1]: https://github.com/jwodder/check-wheel-contents#w005--wheel-contains-common-toplevel-name-in-library
[2]: https://packaging.python.org/en/latest/discussions/src-layout-vs-flat-layout/
2023-02-14 16:26:07 -06:00
Debanjum Singh Solanky
e76c285bdc No need to prune plugins as not included in pypi package.
Mention Obsidian as a supported interface in Readme
2023-02-14 16:15:40 -06:00
Debanjum Singh Solanky
bc7477ea3e Move Emacs, Obsidian plugin code out from under src/khoj directory
- What
  - The Emacs and Obsidian interfaces stay in their original
    directories under src/
  - src/khoj now only contains code meant for pypi packaging

- Benefits
  - This avoids having to update khoj MELPA, Obsidian plugin config as
    the Emacs, Obsidian code is under their original directories
  - It separates the code in src/khoj meant for python packaging from
    code for external interfaces like Emacs and Obsidian
2023-02-14 15:44:22 -06:00
Debanjum Singh Solanky
f83cf4ebc6 Check wheel contents in workflow before publishing it to PyPI 2023-02-14 15:20:44 -06:00
Debanjum Singh Solanky
25a749ca1d Use the src/ layout to fix packaging Khoj for PyPi
- Why
  The khoj pypi packages should be installed in `khoj' directory.
  Previously it was being installed into `src' directory, which is a
  generic top level directory name that is discouraged from being used

- Changes
 - move src/* to src/khoj/*
 - update `setup.py' to `find_packages' in `src' instead of project root
 - rename imports to form `from khoj.*' in complete project
 - update `constants.web_directory' path to use `khoj' directory
 - rename root logger to `khoj' in `main.py'
 - fix image_search tests to use the newly renamed `khoj' logger
 - update config, docs, workflows to reference new path `src/khoj'
2023-02-14 15:19:06 -06:00
Debanjum Singh Solanky
cc31cd070d Enable the publish workflow for PRs created in the main repo
The publish workflow was previously disabled for PRs in commit
d1945c5ba8
2023-02-14 13:51:31 -06:00
Debanjum
84322b2a45 Demo using Search in Khoj Obsidian Plugin 2023-02-14 08:43:50 -08:00
Debanjum Singh Solanky
a4dcb20622 Add setting to toggle auto configuring of khoj backend from Obsidian
- By default the obsidian plugin automatically configures the khoj
  backend to index the current vault
- For more complex scenarios, users can manage their ~/.khoj/khoj.yml
  manually by toggling the auto-configure setting off in the khoj
  plugin settings

Resolves #156
2023-02-13 20:15:28 -06:00
Debanjum Singh Solanky
24aa696ef5 Indicate indexing active on Update button in Obsidian plugin settings
Use moon rotating through phases to indicate notes indexing in progress

Resolves #129
2023-02-13 19:28:19 -06:00
Debanjum Singh Solanky
11517ba8eb Encode jsonl data as utf8 for gzip write for consistent read/write encoding
Should help with issue #89
2023-02-12 17:33:23 -06:00
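The idea of using one explicit encoding on both the write and read side of a gzipped jsonl file can be sketched as follows. This is a minimal illustration, not the Khoj implementation: the file name, the `raw` field and the sample entries are assumptions made for the example.

```python
import gzip
import json
import tempfile
from pathlib import Path

# Sample entries with non-ASCII text (hypothetical data).
entries = [{"raw": "naïve café"}, {"raw": "日本語"}]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "entries.jsonl.gz"

    # Write: serialize to JSON lines, encode explicitly as UTF-8, then gzip.
    jsonl = "".join(json.dumps(e, ensure_ascii=False) + "\n" for e in entries)
    with gzip.open(path, "wb") as f:
        f.write(jsonl.encode("utf-8"))

    # Read: decode with the same encoding that was used for the write.
    with gzip.open(path, "rb") as f:
        text = f.read().decode("utf-8")
    loaded = [json.loads(line) for line in text.splitlines() if line]
```

Because neither side falls back to the OS locale encoding, the round trip does not depend on the platform it runs on.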
Debanjum Singh Solanky
c156b3e087 Remove sub-dependencies from setup.py. Upgrade sentence-transformer
- setup.py best practice recommends only specifying core dependencies,
  not dependencies of core dependencies in it

- Latest sentence-transformer (version 2.2.2) correctly installs its
  huggingface_hub dependency. Else application fails to start
2023-02-12 10:42:05 -06:00
Debanjum Singh Solanky
3ec41c4d64 Wrap lines for org, markdown results in khoj search results buffer 2023-02-12 07:33:50 -06:00
Debanjum Singh Solanky
d1945c5ba8 Do not run publish workflow for PRs as forks do not have auth token 2023-02-12 07:31:24 -06:00
Debanjum Singh Solanky
9a013ec48f Add more details to setup Khoj backend in Obsidian plugin readme 2023-02-12 07:31:13 -06:00
Debanjum
24c553877c Merge pull request #152 from axelson/fix-obsidian-doc-link
Fix link to Obsidian plugins doc in Khoj Obsidian Readme
2023-02-10 22:20:06 -06:00
Jason Axelson
6d5930363a Fix obsidian plugins doc link
Also make it more obvious where the link is going, initially I thought
the link was to another official khoj documentation site.
2023-02-10 07:11:21 -10:00
Debanjum Singh Solanky
215235efd2 Bump khoj pre-release version 2023-02-08 20:24:36 -03:00
Debanjum Singh Solanky
55e4fa9719 Fix indentation in workflow yaml for testing khoj backend 2023-02-07 02:59:46 -03:00
Debanjum Singh Solanky
2445664d40 Deprioritize searching for Music content over other text content 2023-02-07 02:41:31 -03:00
Debanjum Singh Solanky
2e052913b6 Search in first configured content type when no search type set
Instead of searching through all configured content types but only
returning results of the last configured content type
2023-02-07 02:41:31 -03:00
Debanjum Singh Solanky
a26ab31d20 Allow chat with markdown notes if no org-mode content configured 2023-02-07 02:41:31 -03:00
Debanjum
99a03da3f7 Read Markdown file as utf8 instead of the default encoding used by OS
### Background
  1. Obsidian stores markdown notes as `utf8`[1]
  2. By default, the python `open` command uses the OS locale encoding[2]

### Issue
  Based on above background, if the OS locale encoding isn't `utf8` it causes the `UnicodeDecodeError: <locale_encoding> codec can't decode byte` error

### Fix
  - Read markdown files as `utf8`
    The Obsidian plugin is the main use-case for markdown files in khoj currently and that stores md files as `utf8`.
    Do not assume utf8 for other content types like org-mode, beancount for now.
  - Fail if error in reading file as utf8, instead of ignoring errors.
    Would rather have user realize that their files are not going to get indexed correctly.

[1]: https://forum.obsidian.md/t/better-handle-md-files-not-stored-in-utf8-format/13524/3
[2]: https://docs.python.org/3/library/functions.html#open
2023-02-07 01:46:42 -03:00
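The fix described in this commit can be illustrated with a minimal sketch. The helper name `read_markdown` and the sample files are hypothetical, not taken from the Khoj codebase. The sketch passes an explicit encoding to `open` so the OS locale encoding is never used, and keeps strict error handling so a file that is not valid UTF-8 raises instead of being silently mangled.

```python
import tempfile
from pathlib import Path


def read_markdown(path):
    # Hypothetical helper: decode as UTF-8 regardless of the OS locale.
    # errors="strict" (the default) raises UnicodeDecodeError on invalid bytes.
    with open(path, "r", encoding="utf-8", errors="strict") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    # A valid UTF-8 note is read back unchanged.
    note = Path(tmp) / "note.md"
    note.write_bytes("# Café notes ☕\n".encode("utf-8"))
    text = read_markdown(note)

    # A file with bytes that are not valid UTF-8 fails loudly.
    bad = Path(tmp) / "bad.md"
    bad.write_bytes(b"\xff\xfe not utf8")
    try:
        read_markdown(bad)
        failed = False
    except UnicodeDecodeError:
        failed = True
```

Failing on the bad file matches the stated preference in the commit: the user finds out that a file will not be indexed correctly rather than getting corrupted entries.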
Debanjum Singh Solanky
d3e82b918f Make Khoj require python version below 3.11 until PyTorch works with it
Closes #128
2023-02-06 23:11:51 -03:00
Debanjum Singh Solanky
c11f7b47e4 Update workflow to run backend tests for all supported python versions 2023-02-06 21:05:34 -03:00
Debanjum Singh Solanky
11a18cc452 Update khoj docker config to index sub directories for text content
- Khoj supports indexing subdirectories but the khoj docker config
  wasn't updated to support the same
- This should also allow khoj docker users to index multiple separate
  directory trees by mounting them into separate sub folders within
  /data/<content-type>/.
  E.g. /data/org/dir1, /data/org/dir2 etc. in khoj_docker.yml
2023-02-06 21:04:50 -03:00
Debanjum Singh Solanky
fbb7747dcc Read Markdown file as utf8 instead of the default encoding used by OS
- Background
  1. Obsidian stores markdown notes as utf8[1]
  2. By default, the python `open' command uses the OS locale encoding[2]

  This was causing the `UnicodeDecodeError: <locale_encoding> codec can't decode byte' error

- Fix
  - Read markdown files as utf8
    The Obsidian plugin is the main use-case for markdown files in
    khoj currently and that stores md files as utf8.
    Do not assume utf8 for other content types like org-mode, beancount for now.
  - Fail if error in reading file as utf8, instead of ignoring errors.
    Would rather have user realize that their files are not going to
    get indexed correctly.

[1]: https://forum.obsidian.md/t/better-handle-md-files-not-stored-in-utf8-format/13524/3
[2]: https://docs.python.org/3/library/functions.html#open
2023-02-06 21:04:50 -03:00