Semantic Search
Allows natural language search over user content such as notes, images, and transactions, using transformer-based models.
All data is processed locally. Users can interface with the semantic-search app via Emacs, the API, or the command line.
Setup
Setup using Docker
1. Clone Repository
git clone https://github.com/debanjum/semantic-search && cd semantic-search
2. Configure
Add Content Directories for Semantic Search to Docker-Compose
- Update docker-compose.yml to mount your images, org-mode notes, ledger/beancount directories
- If required, edit config settings in docker_sample_config.yml
3. Run
docker-compose up -d
Setup on Local Machine
1. Install Dependencies
- Install Python3 [Required]
- Install Conda [Required]
- Install Exiftool [Optional]
sudo apt-get -y install libimage-exiftool-perl
2. Install Semantic Search
git clone https://github.com/debanjum/semantic-search && cd semantic-search
conda env create -f environment.yml
conda activate semantic-search
3. Configure
Configure the application's search types and their underlying data sources/files in sample_config.yml
Use the sample_config.yml as a reference
4. Run
Load the ML model, generate embeddings, and expose an API to query the notes, images, transactions, etc. specified in the config YAML
python3 -m src.main -c=sample_config.yml -vv
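Loading the model and generating embeddings may take some time before the API starts answering. As a minimal sketch, assuming the server listens on localhost port 8000 (the address used in the API examples in this README), a script could poll the port before sending queries. The helper name `wait_for_server` and its parameters are hypothetical and not part of semantic-search. It only checks that the port accepts TCP connections, not that embeddings are ready.

```python
import socket
import time


def wait_for_server(host="localhost", port=8000, timeout=120.0, interval=1.0):
    """Poll until a TCP connection to host:port succeeds or the timeout expires.

    Hypothetical helper: host and port are assumed from the API examples
    in this README, not read from the application's config.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait, then try again.
            time.sleep(interval)
    return False
```

Calling `wait_for_server()` returns `True` once the port accepts connections and `False` if it does not within the timeout.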
Use
- Semantic Search via Emacs
  - Install semantic-search.el
  - Run M-x semantic-search <user-query>
- Semantic Search via API
  - Query: GET http://localhost:8000/search?q="What is the meaning of life"&t=notes
  - Regenerate Embeddings: GET http://localhost:8000/regenerate
  - Semantic Search API Docs
- UI to Edit Config
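The API calls listed above can also be made from a script. The following is a minimal sketch using only the Python standard library. It assumes the server runs at http://localhost:8000, that the quotes in the example URL only delimit the query and are not part of it, and that responses are JSON. The function names `build_search_url`, `search`, and `regenerate` are hypothetical, chosen for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Host and port as shown in the API examples; adjust if yours differ.
BASE_URL = "http://localhost:8000"


def build_search_url(query, search_type="notes", base_url=BASE_URL):
    """Return the /search URL with q and t percent-encoded."""
    params = urlencode({"q": query, "t": search_type})
    return f"{base_url}/search?{params}"


def search(query, search_type="notes", base_url=BASE_URL):
    """GET /search and decode the body as JSON (assumed response format)."""
    with urlopen(build_search_url(query, search_type, base_url)) as response:
        return json.loads(response.read().decode("utf-8"))


def regenerate(base_url=BASE_URL):
    """GET /regenerate to rebuild embeddings; returns the HTTP status code."""
    with urlopen(f"{base_url}/regenerate") as response:
        return response.status
```

For example, `search("What is the meaning of life", "notes")` sends the same request as the Query example. Encoding the query with `urlencode` keeps spaces and characters such as `&` from breaking the URL.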
Upgrade
Using Docker
docker-compose up
On Local Machine
cd semantic-search
git pull origin master
conda env update -f environment.yml
conda activate semantic-search
Acknowledgments
- MiniLM Model for Asymmetric Text Search. See SBert Documentation
- OpenAI CLIP Model for Image Search. See SBert Documentation
- Charles Cave for OrgNode Parser
- Sven Marnach for PyExifTool