Khoj

Allows natural language search on user content like notes, images and transactions using transformer ML models

Users can interface with Khoj via the Web, Emacs or the API. All search is done locally*

Demo

https://user-images.githubusercontent.com/6413477/168417719-8a8bc4e5-8404-42b2-89a7-4493e3d2582c.mp4

Setup

1. Clone

git clone https://github.com/debanjum/khoj && cd khoj

2. Configure

  • [Required] Update docker-compose.yml to mount your images, (org-mode or markdown) notes and beancount directories
  • [Optional] Edit application configuration in khoj_sample.yml

3. Run

docker-compose up -d

Note: The first run will take time. Let it run; it is most likely not hung, just generating embeddings

Use

Run Unit tests

pytest

Upgrade

docker-compose build --pull

Troubleshooting

  • Symptom: Errors out with "Killed" in error message
  • Symptom: Errors out complaining about Tensors mismatch, null etc
    • Mitigation: Delete content-type > image section from dockersampleconfig.yml
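
The mitigation above can also be scripted rather than done by hand. A minimal sketch, assuming the config is block-style YAML indented with two spaces and that image sits directly under a top-level content-type key. The drop_image_section name is hypothetical and not part of Khoj, and the file names in the usage comment are placeholders:

```shell
# Hypothetical helper (not part of Khoj): print a YAML config from stdin
# without the image block nested under content-type.
# Assumes block-style YAML indented with two spaces.
drop_image_section() {
  awk '
    # A top-level key starts or ends the content-type section.
    /^[^ \t#]/ { in_ct = ($0 ~ /^content-type:/); skip = 0 }
    # Start skipping at the image key inside content-type.
    in_ct && /^  image:[ \t]*$/ { skip = 1; next }
    # The next sibling key (same two-space indent) ends the skipped block.
    skip && /^  [^ \t]/ { skip = 0 }
    !skip { print }
  '
}

# Usage (file names are placeholders):
#   drop_image_section < config.yml > config.no-image.yml
```

Writing to a new file instead of editing in place keeps the original config available if image search is re-enabled later.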

Miscellaneous

  • The experimental chat API endpoint uses the OpenAI API
    • It is disabled by default
    • To use it, add your openai-api-key to config.yml

Development Setup

Setup on Local Machine

  1. Install Dependencies

    1. Install Python3 [Required]

    2. Install Conda [Required]
  2. Install Khoj

    git clone https://github.com/debanjum/khoj && cd khoj
    conda env create -f config/environment.yml
    conda activate khoj
    
  3. Configure

    • Configure files/directories to search in content-type section of khoj_sample.yml
    • To run application on test data, update file paths containing /data/ to tests/data/ in khoj_sample.yml
      • Example: replace /data/org/*.org with tests/data/org/*.org
  4. Run

    Load the ML model, generate embeddings and expose an API to query the notes, images, transactions etc. specified in the config YAML

    python3 -m src.main -c=config/khoj_sample.yml -vv
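
The Configure step above suggests pointing paths that contain /data/ at tests/data/ to run on the test data. A minimal sketch of that substitution, assuming the affected paths begin with /data/ as in the example given. The use_test_data name and the khoj_test.yml output file are hypothetical and not part of Khoj:

```shell
# Hypothetical helper (not part of Khoj): read a config from stdin and point
# paths that start with /data/ at tests/data/ instead.
use_test_data() {
  # Only rewrite /data/ at the start of a path, so tests/data/ stays unchanged.
  sed -E 's#(^|[^[:alnum:]_./])/data/#\1tests/data/#g'
}

# Usage: write the result to a separate, hypothetical file and run with it:
#   use_test_data < config/khoj_sample.yml > config/khoj_test.yml
#   python3 -m src.main -c=config/khoj_test.yml -vv
```

Because paths already under tests/data/ are left alone, running the helper twice on the same text gives the same result.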
    

Upgrade On Local Machine

cd khoj
git pull origin master
conda deactivate
conda env update -f config/environment.yml
conda activate khoj

Acknowledgments