Commit graph

71 commits

Author SHA1 Message Date
Debanjum Singh Solanky
ae15e429b5 Reduce indentation from 4 to 2 in Readme.md.
Prevent everything looking like code blocks due to 4 space indentations
2021-08-15 22:56:36 -07:00
Debanjum Singh Solanky
636b6195cc Add Readme, License. Update .gitignore 2021-08-15 22:52:37 -07:00
Debanjum Singh Solanky
354c541b62 Add org processor to generate compressed jsonl from org-mode files
The corpus embeddings are generated from this compressed JSONL
using the specified transformer ML model
2021-08-15 22:52:31 -07:00
Debanjum Singh Solanky
b74cb9a104 Move install.py to new utils dir as it's for cmdline ease of use only 2021-08-15 19:10:30 -07:00
Debanjum Singh Solanky
ec92f3e146 Move different search types into search_types directory 2021-08-15 19:09:50 -07:00
Debanjum Singh Solanky
4d681c86ec Update requirements.txt for users wanting to use pip install 2021-08-15 18:45:37 -07:00
Debanjum Singh Solanky
d75df54385 Create API interface for Semantic Search
Use FastAPI, Uvicorn to create app with API endpoint at /search
Example Query: http://localhost:8000/?q="why sleep?"&t="notes"&n=5
2021-08-15 18:11:48 -07:00
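
The example query in the commit above contains a space and a `?` inside the `q` value, which need percent-encoding in a real URL. Below is a minimal sketch of building such a URL with the Python standard library. The parameter names `q`, `t` and `n` come from the example query. The helper name `build_search_url` is hypothetical, and the sketch assumes the values are sent without the literal quotes shown in the commit message. The base URL is passed in by the caller, since the message mentions both `/search` and the root path.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, search_type, count):
    """Return base_url with q, t and n appended as an encoded query string."""
    # Parameter names follow the example query in the commit message.
    params = {"q": query, "t": search_type, "n": count}
    # urlencode escapes spaces as '+' and '?' as '%3F'.
    return f"{base_url}?{urlencode(params)}"


url = build_search_url("http://localhost:8000/", "why sleep?", "notes", 5)
print(url)  # http://localhost:8000/?q=why+sleep%3F&t=notes&n=5
```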
Debanjum Singh Solanky
e3088c8cf8 Create environment.yml to install prerequisites for app via conda 2021-08-15 17:48:38 -07:00
Debanjum Singh Solanky
660e6c3937 Add explicit filters to asymmetric search
Users can filter results to entries that include or exclude specified words.
To require or reject a word, prepend '+' or '-' to it in the query,
e.g. "+hello -bye" keeps entries containing "hello" but not "bye"
2021-08-15 17:48:38 -07:00
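
The include/exclude filter described in the commit above can be sketched as follows. This is an illustration only, not the repository's implementation: the function names `parse_filters` and `apply_filters` are hypothetical, and the sketch assumes case-insensitive matching on whitespace-separated words, which the commit message does not specify.

```python
def parse_filters(query):
    """Split a query into free text, required words and blocked words."""
    include, exclude, text = [], [], []
    for term in query.split():
        if term.startswith("+") and len(term) > 1:
            include.append(term[1:].lower())   # '+word' must be present
        elif term.startswith("-") and len(term) > 1:
            exclude.append(term[1:].lower())   # '-word' must be absent
        else:
            text.append(term)                  # plain search text
    return " ".join(text), include, exclude


def apply_filters(entries, include, exclude):
    """Keep entries containing every include word and no exclude word."""
    kept = []
    for entry in entries:
        # Assumption: words are compared after lowercasing and splitting on whitespace.
        words = set(entry.lower().split())
        if all(w in words for w in include) and not any(w in words for w in exclude):
            kept.append(entry)
    return kept


text, include, exclude = parse_filters("greetings +hello -bye")
entries = ["hello world", "hello and bye", "good day"]
print(apply_filters(entries, include, exclude))  # ['hello world']
```

In this sketch the remaining free text would go to the search model, and the filters would be applied to the entries before or after ranking.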
Debanjum Singh Solanky
91a2c598fe Resolve paths to absolute paths once. Use pathlib glob directly 2021-08-09 00:39:33 -07:00
Debanjum Singh Solanky
ca0a22f4dd Search for images similar to query image provided by the user
Example: the user passes the path to an image as the query, e.g. ~/Pictures/photo.jpg
The script should return the images in images_embedding most similar to
the query image
2021-08-09 00:21:02 -07:00
Debanjum Singh Solanky
00d0065c5b Allow user to search images via text queries 2021-08-08 23:02:30 -07:00
Debanjum Singh Solanky
181cab89d2 Exclude Title Notes, i.e. notes with just headings, from compute 2021-08-04 21:30:09 -07:00
Debanjum Singh Solanky
d6d7b9d6a8 Make installed script executable. Minor clean-up of duplicate code 2021-08-04 18:29:20 -07:00
Debanjum Singh Solanky
2eb029a7b0 Create script to install semantic-search as a program 2021-08-02 00:29:09 -07:00
Debanjum Singh Solanky
13d5100ce6 Rename script similarity to symmetric 2021-07-31 20:37:07 -07:00
Debanjum Singh Solanky
ad7e90bec3 Modularize script, provide cmdline control, improve results rendering 2021-07-31 17:13:39 -07:00
Debanjum Singh Solanky
eb03f57917 Save, Load Embeddings to/from file to speed up script load time 2021-07-31 10:13:41 -07:00
Debanjum Singh Solanky
0914f284bb Re-rank using cross encoder to get even more relevant results
The cross encoder re-ranked results are much better for more distant queries.
It does take more time with the cross-encoder re-ranking but it seems
worth it to get more relevant results
2021-07-31 03:10:44 -07:00
Debanjum Singh Solanky
9864a2b551 Retrieve most relevant entries for a query using MSMarco based bi-encoder
Returns the best 3 results, ranked by the MSMarco based bi-encoder score of
the query's match to entries from org-mode notes
2021-07-31 00:20:37 -07:00
debanjum
0ef5495701 Use Sentence Transformers to Encode, Query Schedule.org Headings 2021-04-04 04:53:03 -07:00