[[https://github.com/debanjum/semantic-search/actions/workflows/build.yml/badge.svg]]
* Semantic Search
/Natural language search over user content such as notes, images and transactions, using transformer-based models/
All data is processed locally. Users can interact with the semantic-search app via [[./src/interface/emacs/semantic-search.el][Emacs]], the API or the command line.
** Dependencies
- Python3
- [[https://docs.conda.io/en/latest/miniconda.html#latest-miniconda-installer-links][Miniconda]]
** Install
#+begin_src shell
git clone https://github.com/debanjum/semantic-search && cd semantic-search
conda env create -f environment.yml
conda activate semantic-search
#+end_src
*** Install Environment Dependencies
#+begin_src shell
sudo apt-get -y install libimage-exiftool-perl
#+end_src
** Configure
Configure the application's search types and their underlying data sources or files in ~sample_config.yml~, using the existing entries in that file as a reference.
** Run
Load the ML models, generate embeddings, and expose an API to query the notes, images, transactions, etc. specified in the config YAML:
#+begin_src shell
python3 -m src.main -c=sample_config.yml -vv
#+end_src
** Use
- *Semantic Search via Emacs*
  - [[https://github.com/debanjum/semantic-search/tree/master/src/interface/emacs#installation][Install]] [[./src/interface/emacs/semantic-search.el][semantic-search.el]]
  - Run ~M-x semantic-search <user-query>~
- *Semantic Search via API*
  - Query: ~GET~ [[http://localhost:8000/search?q=%22what%20is%20the%20meaning%20of%20life%22][http://localhost:8000/search?q="What is the meaning of life"&t=notes]]
  - Regenerate Embeddings: ~GET~ [[http://localhost:8000/regenerate][http://localhost:8000/regenerate?t=image]]
  - [[http://localhost:8000/docs][Semantic Search API Docs]]
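
The query endpoint listed above can also be called from a script. The following is a minimal sketch, assuming the server is running on ~localhost:8000~ and that it replies with JSON (an assumption, not confirmed here). The helper names ~build_search_url~ and ~search~ are illustrative and not part of semantic-search. The ~q~ and ~t~ parameters and the ~notes~ value come from the example URLs above.

#+begin_src python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Host and port taken from the example URLs above
BASE_URL = "http://localhost:8000"


def build_search_url(query, search_type="notes", base_url=BASE_URL):
    """Return the /search URL with the q and t parameters percent-encoded."""
    # quote_via=quote encodes spaces as %20, matching the example URLs
    params = urlencode({"q": query, "t": search_type}, quote_via=quote)
    return f"{base_url}/search?{params}"


def search(query, search_type="notes", base_url=BASE_URL):
    """Send the GET request and decode the body (assumes a JSON response)."""
    with urlopen(build_search_url(query, search_type, base_url)) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the URL; call search(...) once the server is running
    print(build_search_url("what is the meaning of life"))
#+end_src

Building the URL separately from sending the request makes it easy to inspect the exact query before it reaches the server.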
** Upgrade
#+begin_src shell
cd semantic-search
git pull origin master
conda env update -f environment.yml
conda activate semantic-search
#+end_src
** Acknowledgments
- [[https://huggingface.co/sentence-transformers/msmarco-MiniLM-L-6-v3][MiniLM Model]] for Asymmetric Text Search. See [[https://www.sbert.net/examples/applications/retrieve_rerank/README.html][SBert Documentation]]
- [[https://github.com/openai/CLIP][OpenAI CLIP Model]] for Image Search. See [[https://www.sbert.net/examples/applications/image-search/README.html][SBert Documentation]]
- Charles Cave for [[http://members.optusnet.com.au/~charles57/GTD/orgnode.html][OrgNode Parser]]
- Sven Marnach for [[https://github.com/smarnach/pyexiftool/blob/master/exiftool.py][PyExifTool]]