Sentence Transformer MSMarco Model isn't date aware
So no use of adding scheduled, deadline dates to model embeddings for consideration
This reverts commit a2a08d1354.
- Introduce prompt for GPT to automatically extract user's search intent
- Expose new search api endpoint to use that to set SearchType being
passed to search API
- Currently meant as an experimental API to gauge usefulness,
extendability. Evaluating for phone or voice use-case
To prompt improve readability:
- Remove newline escape sequence and use actual newline directly
- This avoids one long line of text as prompt and
- Remove escaping of double quotes
- Add search query to top of buffer as Beancount comment
- Remove trailing ) from response
- Separate entries by empty line
- Load beancount-mode in semantic search on ledger buffer
- Fix loading entries from jsonl in extract_entries method
- Only extract Title from jsonl of each entry
This is the only thing written to the jsonl for symmetric ledger
- This fixes the trailing escape seq in loaded entries
- Remove the need for semantic-search.el response reader to do pointless complicated cleanup
- Make symmetric_ledger:extract_entries use beancount_to_jsonl:load_jsonl
Both methods were doing similar work
- Make load_jsonl handle loading entries from both gzip and uncompressed jsonl
- Keeps directory paths consistent between host and container volumes
- Consistency simplifies documentation and updates required to setup
sample_config.yml for local installation
- Put test data for each content type into separate directories
- Makes config.yml for docker and local host consistent
- Prepending tests to /data in sample_config.yml makes application
run on local host using test data
- Allows mounting separate volume for each content type in docker-compose
- Ignore gitignore to only add tests content, not generated models or embeddings
- Mount the local directory to /app
- Reformat the file paths to generically indicate what their purpose is
- Add comments to assist users who wasnt to modify properties themselves
- Use /app as the working directory
- Clarify comment to explain why the ENTRYPOINT is constructed as it is
- Move explanations for the argument to docker-compose, where it's set
- Copy required artifacts from the first build image into the subsequent one (exiftool)
- Add a Dockerfile which uses an Ubuntu image to install relevant dependencies (exif) and uses a Miniconda image for setting up/reusing the conda environment
- Add a dummy docker-compose file
Save Search Models to Disk on First Run
## Why
- Improve application startup time
- Startup application and perform semantic search even if user offline
- Use search model config in YAML file for all search types (asymmetric, symmetric, image)
## Details
- Load search models from disk when available
- Use search model config specified in YAML file
- Add search config for Symmetric Search used by Ledger/Beancount transaction search
- Rename pytest fixture search_config to more appropriate
content_config
- Create search_config pytest fixture
- Use search_config where search being setup, used in tests