- Note: Support for MPS in PyTorch is currently only available in the v1.13.0 nightly builds
- Users will have to wait for PyTorch MPS support to land in stable builds
- Until then the code can be tweaked and tested to make use of the GPU
acceleration on newer Macs
- Pass the device to load models onto from app state
- SentenceTransformer models accept the device to load onto during initialization
- Pass the device to load corpus embeddings onto from app state
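
A minimal sketch of how a device could be chosen before being stored in app state, assuming the device is kept as a plain string. The helper names `select_device` and `detect_device` are hypothetical and not taken from Khoj. Since `torch.backends.mps` is only present in PyTorch builds that ship MPS support, the sketch guards against its absence.

```python
def select_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple's MPS backend, then fall back to CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Probe the installed PyTorch build; fall back to CPU if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # torch.backends.mps is absent in builds without MPS support
    mps_backend = getattr(torch.backends, "mps", None)
    mps_available = mps_backend is not None and mps_backend.is_available()
    return select_device(torch.cuda.is_available(), mps_available)
```

The returned string can then be passed as the `device` argument when initializing a SentenceTransformer model, and reused when loading corpus embeddings.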
### Fix Image Search
- Do not use XMP metadata by default for image search
- It seems to be buggy currently. The returned results do not make sense with XMP metadata enabled
### Fix Image Search using Desktop App
- Fix configuring Image Search via Desktop GUI
- Set `input-directories`, instead of unused `input-files` for `content-type.image` in `khoj.yml`
- Fix running Image Search via Desktop apps.
- Previously the `transformers` package wasn't getting packaged into the app by PyInstaller
- Image search requires it to run, so the desktop apps would fail to start when image search was enabled
- Resolves #68
- Append selected files, directories via "Add" button in Desktop GUI
- This allows selecting multiple files, directories using Desktop GUI
- Previously, multiple image directories had to be entered manually
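
A minimal sketch of the append behavior, assuming the input field holds a comma-separated list of paths. The function name `append_paths`, the separator, and the skipping of duplicates are all assumptions for illustration, not details confirmed by the Desktop GUI code.

```python
from typing import Iterable


def append_paths(current_text: str, selected: Iterable[str]) -> str:
    """Append newly selected paths to the text already in the input field."""
    # Split the existing field content, ignoring empty fragments
    paths = [part.strip() for part in current_text.split(",") if part.strip()]
    for path in selected:
        # Assumption: re-selecting an existing path should not duplicate it
        if path not in paths:
            paths.append(path)
    return ", ".join(paths)
```

Each click on "Add" would call this with the current field text and the newly selected files or directories, so earlier selections are kept rather than overwritten.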
### Improve Desktop App
- Show Splash Screen to Desktop on App Initialization
- The app takes a while to load during first run
- A splash screen signals that the app is loading and not unresponsive
- Note: _PyInstaller only supports splash screens on Windows and Linux, not on Macs._
- Add Khoj icon to the Windows, Linux app. Windows expects a `.ico` icon type
- Only exclude `libtorch_{cuda, cpu, python}` on Linux machines
- Those libraries seem to be used on Mac (and maybe Windows).
- Linux is where removing these reduces the app size the most anyway
- Fix PyInstaller Warnings on App Start
- The warnings show up as annoying error popups on Windows
- CLIP image score and XMP metadata score are not combining well.
When combined they give nonsensical results. Enable only once we
figure out how best to combine the two.
- Show scores with higher precision for image search
- Image search scores seem to mostly be between 0.2 and 0.3 for some reason
- Higher precision scores make it easier to understand the quality
of the returned results as perceived by the model itself
- Allow adding multiple image directories via GUI
- Allow adding multiple files in different directories via GUI
- Previously users couldn't add multiple directories via GUI.
They'd have to manually append to the input field to add multiple files or directories
- Clearing or overwriting is still easy.
The user can just select the text to delete in the input area
- On Mac, excluding the .dylib versions of these files throws errors.
Not sure why it didn't throw during testing. Maybe the libs were cached?
- Tested on Linux again. It still seems to pass with the above
libs excluded, so keeping those excluded for now, unless
further testing reveals those libs are really required for the app
- Issue
Fix configuring image search from Desktop GUI. It was broken before.
The Desktop GUI was updating input-files field under content-type > image.
This field is not used for image search. So image search couldn't be
configured from the Desktop GUI
- Fix
- Set input-directories when field of search type image is set from GUI
- Otherwise set input-files field in config
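
The fix above can be sketched as follows, assuming the loaded `khoj.yml` is represented as nested dictionaries keyed by `content-type` and then by search type. The function name `set_content_inputs` is hypothetical.

```python
def set_content_inputs(config: dict, search_type: str, paths: list) -> dict:
    """Write GUI-selected paths into the field that the search type actually reads."""
    # Image search reads input-directories; other search types read input-files
    field = "input-directories" if search_type == "image" else "input-files"
    content = config.setdefault("content-type", {}).setdefault(search_type, {})
    content[field] = list(paths)
    return config
```

For example, `set_content_inputs({}, "image", ["~/Pictures"])` would populate `input-directories` under `content-type` > `image`, leaving `input-files` unset.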
## Changes
- On **Debian**
- `libtorch_cuda.so` (1 GB) and `libtorch_cpu.so` (700 MB) are large shared libs
- They are available both at the package root and under the `torch/lib` directory in the package
- We remove the unused, duplicate libraries from under `torch/lib`, as only the top level libraries are used
- On **Mac**
- Remove `libtorch_{cpu,python}.dylib` under `torch/lib` directory from the Mac app.
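
A minimal sketch of this filtering step, assuming each packaged binary is described by a `(dest_name, src_path, typecode)` tuple, which is the shape of entries in a PyInstaller binaries list. The function name `drop_duplicate_torch_libs` and the platform keys (matching `sys.platform` values) are assumptions for illustration; the actual spec file may do this differently.

```python
from pathlib import PurePosixPath

# Duplicate libraries to drop from under torch/lib, per platform
DUPLICATE_TORCH_LIBS = {
    "linux": {"libtorch_cuda.so", "libtorch_cpu.so"},
    "darwin": {"libtorch_cpu.dylib", "libtorch_python.dylib"},
}


def drop_duplicate_torch_libs(binaries, platform):
    """Keep top-level copies of the libs; drop the duplicates under torch/lib."""
    unwanted = DUPLICATE_TORCH_LIBS.get(platform, set())
    kept = []
    for entry in binaries:
        # Normalize separators so the check also works on Windows-style paths
        dest = PurePosixPath(entry[0].replace("\\", "/"))
        in_torch_lib = dest.parent.parts[-2:] == ("torch", "lib")
        if not (in_torch_lib and dest.name in unwanted):
            kept.append(entry)
    return kept
```

Platforms without an entry in the mapping are left untouched, which keeps the exclusion opt-in per operating system.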
## App Size Reduction
- **Debian amd64** app size reduced by **42%** from **1.6 GB to 920 MB**
- **Mac arm64** app size reduced by **15%** from **190 MB to 160 MB**
- **Mac amd64** app size reduced by **33%** from **340 MB to 230 MB**
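
As a quick check of the arithmetic, the percentages can be recomputed from the before and after sizes. This sketch assumes 1.6 GB is treated as 1600 MB; since the listed sizes are themselves rounded, the recomputed values only match the stated percentages approximately.

```python
def percent_reduction(before_mb: float, after_mb: float) -> float:
    """Size reduction as a percentage of the original size."""
    return (before_mb - after_mb) / before_mb * 100


# (before, after) sizes in MB, taken from the list above
sizes_mb = {
    "Debian amd64": (1600, 920),
    "Mac arm64": (190, 160),
    "Mac amd64": (340, 230),
}

reductions = {
    name: percent_reduction(before, after)
    for name, (before, after) in sizes_mb.items()
}
```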
## Reference
- [Release Workflow Run](https://github.com/debanjum/khoj/actions/runs/2878104171) after changes
- [Release Workflow Run](https://github.com/debanjum/khoj/actions/runs/2869906116) before changes
libtorch_cuda only seems to be imported on Linux, which is why the
Mac and Windows apps are 700 MB smaller than the Debian app.
Guessing this is because libtorch_cuda only works on Linux machines?
Anyway, removing libtorch_{cpu,python}.dylib under torch/lib from the
Mac app reduces its size from 190 MB to 160 MB. A 15% reduction isn't too bad
- libtorch_cuda.so (1 GB) and libtorch_cpu.so (700 MB) are large shared
libs that are available both at the package root and under torch/lib.
- Only the top level libs are used, so the unused libs under torch/lib
are removed from the package
- This reduces the single file package size from 1.6 GB to 920 MB
- Improve Documentation
- 7866add Add Interface Screenshots to Docs
- 8ad3482 Update Readme Instructions to use Desktop GUI to configure App
- Fix Markdown Search Bug on Backend
- b891347 Fix condition in router to trigger markdown search
- Improve Desktop GUI
- 67ab40b Regenerate embeddings every time user clicks configure in Desktop GUI
- 7f479b0 Improve Displaying Error to User on Khoj window in Desktop GUI
- 873bb9d Do not force the Khoj window to always be on top. It's needlessly annoying
- 9bc4fd5 Set Web Interface URL from loaded state in Desktop GUIs. Not hard-coded
- Configure app using the configure screen during first run.
No config file args required to be passed via CLI by users
- Update instructions to copy khoj_sample.yml to default app configure
file location and edit that. Instead of editing the khoj_sample.yml
directly in source
- Show a helpful error message in the GUI to the user, instead of
crashing, if loading the config fails, e.g. if the file wasn't found
- Collate GUI errors into an ErrorType enum class
- Remove previous error messages before showing the new one
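
A minimal sketch of collating GUI errors into an enum and reporting a failure instead of crashing. Only the `ErrorType` class name comes from the notes above; the member names, `load_config_text`, and `error_message` are hypothetical, and the real GUI code may differ.

```python
from enum import Enum
from pathlib import Path
from typing import Optional, Tuple


class ErrorType(Enum):
    """Error categories shown in the GUI; member names are illustrative."""
    CONFIG_NOT_FOUND = "Config file not found"
    CONFIG_UNREADABLE = "Config file could not be read"


def load_config_text(path: str) -> Tuple[Optional[str], Optional[ErrorType]]:
    """Return (text, None) on success, or (None, error) instead of raising."""
    config_file = Path(path).expanduser()
    if not config_file.is_file():
        return None, ErrorType.CONFIG_NOT_FOUND
    try:
        return config_file.read_text(encoding="utf-8"), None
    except (OSError, UnicodeDecodeError):
        return None, ErrorType.CONFIG_UNREADABLE


def error_message(error: ErrorType) -> str:
    """Text for the GUI to display, replacing any previous error message."""
    return f"Error: {error.value}"
```

Returning the error as a value lets the GUI decide how to display it, and keeping all categories in one enum makes it easy to clear the previous message before showing a new one.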
Previously, if the embeddings already existed, only the khoj.yml
config file would get updated. The embeddings would remain stale.
1. This results in a stale app state where the config doesn't
match the embeddings
2. Currently the user cannot update their config from the config
screen. They'd have to use a combination of config screen and web
interface>regenerate button to trigger it or delete their ~/.khoj dir
This commit should resolve the above issues
- Prevent immediate overwrite of re-ranked results by
incremental-search without rerank triggered via post-command-hook.
- This triggers right after the reranked results are rendered, so
the user never ends up seeing them
- Why
- Simplify Installation of Khoj by providing OS specific apps
- What
- Create GitHub workflow to wrap Khoj into
- [X] A Mac .app Disk Image
- [X] A Windows .exe app
- [X] A Linux .deb package
Improves #56. Closes #64, #65