Debanjum Singh Solanky
1c3a1420f8
Update asymmetric extract_entries method to handle uncompressed jsonl
...
This is similar to what was done for the symmetric extract_entries
method earlier
2022-02-27 19:03:31 -05:00
Debanjum Singh Solanky
3d8a07f252
Extract empty line escape sequences var into constants file for reuse
2022-02-27 19:01:49 -05:00
Debanjum Singh Solanky
624a3faf92
Update Readme. Improve Organization, Reduce Staleness
2022-02-26 19:04:49 -05:00
Debanjum Singh Solanky
bb5d0d8908
Improve Semantic Search Buffer Names in Emacs
...
- Allow multiple semantic search buffers to exist simultaneously
- Uniquify semantic search buffer names
- Add query and search-type to semantic search buffer name for easier
disambiguation and for finding the appropriate buffer
2022-02-26 18:30:14 -05:00
Debanjum
6a84ca965a
Merge pull request #25 from debanjum/users/debanjum/improve-semantic-search-on-ledger
...
Improve Extraction and Rendering of Semantic Search on Ledger
2022-02-26 15:18:22 -08:00
Debanjum Singh Solanky
b68558651b
Improve Extraction of Beancount Entries
...
- Only extract entries starting with YYYY-MM-DD from Beancount
- Strip Trailing Escape Sequences from Entries
2022-02-26 17:48:45 -05:00
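The extraction rule described in commit b68558651b can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function name `extract_dated_entries`, the regex, and the assumption that entries are separated by empty lines are all hypothetical.

```python
import re

# Assumption: an entry of interest begins with a YYYY-MM-DD date followed by a space
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} ")


def extract_dated_entries(text):
    """Return the blocks that start with a date, with surrounding whitespace stripped."""
    # Assumption: entries are separated from each other by one or more empty lines
    blocks = re.split(r"\n\s*\n", text)
    stripped = (block.strip() for block in blocks)
    # Keep only blocks that start with a date; options, comments and blanks are dropped
    return [block for block in stripped if ENTRY_START.match(block)]
```

Stripping each block before matching is what removes the trailing newline escape sequences from the returned entries.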
Debanjum Singh Solanky
b3ac2dd730
Improve Results Rendered on Emacs from Semantic Search on Ledger
...
- Add search query to top of buffer as Beancount comment
- Remove trailing ) from response
- Separate entries by empty line
- Load beancount-mode in semantic search on ledger buffer
2022-02-26 17:48:45 -05:00
Debanjum Singh Solanky
502c68d4f8
Remove trailing escape sequence in ledger search response entries
...
- Fix loading entries from jsonl in extract_entries method
- Only extract Title from jsonl of each entry
This is the only thing written to the jsonl for symmetric ledger
- This fixes the trailing escape seq in loaded entries
- Remove the need for semantic-search.el response reader to do pointless complicated cleanup
- Make symmetric_ledger:extract_entries use beancount_to_jsonl:load_jsonl
Both methods were doing similar work
- Make load_jsonl handle loading entries from both gzip and uncompressed jsonl
2022-02-26 17:48:45 -05:00
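One way a loader could accept both gzip and uncompressed jsonl, as commit 502c68d4f8 describes for load_jsonl, is sketched below. The signature and the magic-byte check are assumptions made for illustration; the project's actual load_jsonl may detect compression differently.

```python
import gzip
import json


def load_jsonl(path):
    """Load a list of dicts from a JSON Lines file, gzip-compressed or plain."""
    # Assumption: detect gzip from its two magic bytes instead of the file extension
    with open(path, "rb") as f:
        is_gzip = f.read(2) == b"\x1f\x8b"

    # Both openers accept text mode, so the parsing code below is shared
    opener = gzip.open if is_gzip else open
    with opener(path, "rt", encoding="utf-8") as f:
        # Skip blank lines so a trailing newline does not raise a decode error
        return [json.loads(line) for line in f if line.strip()]
```

Keeping the detection inside one function lets callers, such as an extract_entries method, stay unaware of how the file was stored.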
Debanjum Singh Solanky
248aa632c0
Do not throw warning for beancount files with .beancount extension
2022-02-26 17:48:45 -05:00
Debanjum Singh Solanky
76cd63f4bd
Fix count of processed jsonl entries shown to user by ledger processor
...
Count lines not chars
2022-02-26 17:46:06 -05:00
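The "count lines not chars" fix in commit 76cd63f4bd concerns a common slip: calling len() on a jsonl string returns its character count, not its entry count. A small sketch with a hypothetical helper name:

```python
def count_entries(jsonl_data):
    """Count entries in a JSON Lines string: one entry per non-empty line."""
    # len(jsonl_data) would return the number of characters instead
    return sum(1 for line in jsonl_data.splitlines() if line.strip())
```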
Debanjum Singh Solanky
f08591c880
Set PORT arg when building docker image in the build workflow
2022-01-29 18:11:47 -05:00
Debanjum Singh Solanky
359f25b0a4
Rename publish workflow to build. Add badge to the workflow on Readme
2022-01-29 18:11:47 -05:00
Debanjum Singh Solanky
4add348e3c
Remove context from path to Dockerfile in Github build, push action
2022-01-29 17:16:12 -05:00
Debanjum Singh Solanky
859258864c
Update Readme badge post rename of build.yml to test.yml
2022-01-29 17:10:43 -05:00
Debanjum Singh Solanky
fa685dc37f
Create Github workflow to build, publish docker container to registry
...
- Rename the build workflow to test workflow
2022-01-29 17:08:19 -05:00
Debanjum Singh Solanky
78b76d65a0
Minor fix to notes jsonl file extension in sample_config.yml
2022-01-29 04:13:36 -05:00
Debanjum Singh Solanky
7c773d29ef
Update github workflow to use environment.yml under config/ directory
2022-01-29 03:43:34 -05:00
Debanjum Singh Solanky
c31abad0a6
Mount embeddings to /data/embeddings for directory naming consistency
...
- Keeps directory paths consistent between host and container volumes
- Consistency simplifies documentation and updates required to setup
sample_config.yml for local installation
2022-01-29 03:24:02 -05:00
Debanjum Singh Solanky
b0067fc32e
Store docker, conda, semantic-search configuration in a config directory
...
- Improves organization of config files required for application
- Declutters the application root directory from configs
2022-01-29 02:41:11 -05:00
Debanjum Singh Solanky
79c2224eaa
Improve test data organization and update corresponding conftests
...
- Put test data for each content type into separate directories
- Makes config.yml for docker and local host consistent
- Prepending tests to /data in sample_config.yml makes application
run on local host using test data
- Allows mounting separate volume for each content type in docker-compose
- Ignore gitignore to only add tests content, not generated models or embeddings
2022-01-29 02:03:17 -05:00
Debanjum Singh Solanky
3e889760c7
Merge sample_config, docker_sample_config yml into a single sample_config.yml
...
- Update readme to indicate how to update the new sample_config to run on test data
2022-01-29 01:32:12 -05:00
Debanjum Singh Solanky
2bc2780501
Mention the experimental /chat API interacts with OpenAI's API
2022-01-29 00:11:40 -05:00
Debanjum Singh Solanky
6ed667aed0
Add Troubleshooting Section, Minor Fixes to Readme
2022-01-29 00:11:40 -05:00
Debanjum
d943d2be80
Merge pull request #21 from debanjum/saba/dockerize
...
Add Docker support to semantic-search
2022-01-28 20:27:40 -08:00
Saba
1ba7fa66e5
Update README and default folders in docker_sample_config.yml
...
- Add instructions for using Docker to README
- Use the ./tests/data folder in docker_sample_config.yml so it can work right away for users
2022-01-28 23:20:50 -05:00
Saba
52e701b3c2
Simplify Dockerfile by removing multibuild
...
- Install exiftool dependency directly in the miniconda image
2022-01-24 21:54:10 -05:00
Saba
33bc62dc19
Fix type of use_xmp_metadata to be bool, rather than str
2022-01-24 21:53:26 -05:00
Saba
9fb410fc25
Clean up docker_sample_config.yml
...
- Uncomment other search types
- Explain the file prefixes behavior and how it interfaces with the docker image
2022-01-24 14:11:38 -05:00
Saba
9802023c79
Clean up docker-compose
...
- Mount the local directory to /app
- Reformat the file paths to generically indicate what their purpose is
- Add comments to assist users who want to modify properties themselves
2022-01-24 14:10:18 -05:00
Saba
4ae8c15170
Clean the Dockerfile
...
- Use /app as the working directory
- Clarify comment to explain why the ENTRYPOINT is constructed as it is
- Move explanations for the argument to docker-compose, where it's set
- Copy required artifacts from the first build image into the subsequent one (exiftool)
2022-01-24 14:08:55 -05:00
Saba
66d08ab5df
Rename web to server in docker-compose.yml
2022-01-24 00:14:01 -05:00
Saba
77fa8718d9
Working example with docker-compose
...
Still needs quite a bit of clean-up, but this adds a working docker-compose + Dockerfile setup
2022-01-23 23:44:38 -05:00
Saba
875188dc6f
Initialize working on #20 to add Docker support
...
- Add a Dockerfile which uses an Ubuntu image to install relevant dependencies (exif) and uses a Miniconda image for setting up/reusing the conda environment
- Add a dummy docker-compose file
2022-01-23 14:57:28 -05:00
sabaimran
974690939c
Merge pull request #19 from debanjum/rename-config-types-for-consistency
...
Rename RawConfig Types for Consistency
2022-01-14 21:14:08 -05:00
Debanjum Singh Solanky
179153dc5a
Rename RawConfig Types for Consistency
...
- Naming convention - [ContentType][ConfigType]Config
  - Where [ConfigType] ~ Content, Search, Processor
  - Where [ContentType] ~ Text, Image, Asymmetric, Symmetric, Conversation
- Current Configs:
  - Content:
    - Org Notes
    - Org Music
    - Image
    - Ledger/Beancount
  - Search:
    - Asymmetric
    - Symmetric
    - Image
  - Processor:
    - Conversation
2022-01-14 20:54:38 -05:00
Debanjum
ed7c2901f5
Merge pull request #18 from debanjum/deb/save-models-to-disk-on-first-run
...
Save Search Models to Disk on First Run
## Why
- Improve application startup time
- Startup application and perform semantic search even if user offline
- Use search model config in YAML file for all search types (asymmetric, symmetric, image)
## Details
- Load search models from disk when available
- Use search model config specified in YAML file
- Add search config for Symmetric Search used by Ledger/Beancount transaction search
2022-01-14 17:30:46 -08:00
Debanjum Singh Solanky
ed144f7984
Setup Search with Search_Config to Fix Tests
...
- Rename pytest fixture search_config to more appropriate
content_config
- Create search_config pytest fixture
- Use search_config where search being setup, used in tests
2022-01-14 20:13:14 -05:00
Debanjum Singh Solanky
c64e0c2965
Load model from HuggingFace if model_directory unset in config YAML
...
- Do not save/load the model to/from disk when model_directory unset
in config.yml
- Add symmetric search default config to cli.py
2022-01-14 17:36:59 -05:00
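The load order described in commit c64e0c2965 (skip the disk cache entirely when model_directory is unset, otherwise load from disk when available and save after the first download) might look roughly like the sketch below. The function name and the three callables `download`, `load_from_disk` and `save_to_disk` are hypothetical stand-ins for whatever model library is in use, not the project's API.

```python
from pathlib import Path


def load_model(model_name, model_directory, download, load_from_disk, save_to_disk):
    """Return a model, using model_directory as an on-disk cache when it is set."""
    # model_directory unset: always fetch the model, never touch the disk
    if model_directory is None:
        return download(model_name)

    model_path = Path(model_directory)
    # Cached copy present: load it without needing network access
    if model_path.exists():
        return load_from_disk(model_path)

    # First run: fetch the model, then save it for later startups
    model = download(model_name)
    save_to_disk(model, model_path)
    return model
```

Passing the loaders in as callables keeps the sketch independent of any particular library and easy to exercise with fakes.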
Debanjum Singh Solanky
510faa1904
Save Image Search Model to Disk
2022-01-14 17:36:59 -05:00
Debanjum Singh Solanky
934ec233b0
Add Search Config for Symmetric Model. Save Model to Disk
2022-01-14 17:36:59 -05:00
Debanjum Singh Solanky
b63026d97c
Save Asymmetric Search Model to Disk
...
- Improve application load time
- Remove dependence on internet to startup application and perform semantic search
2022-01-14 17:36:27 -05:00
Debanjum
e8146e8ebb
Merge pull request #17 from albd/patch-1
...
Fix url error in README
2022-01-13 18:09:10 -08:00
Albert Davies
2e2069f720
Fix url error in README
...
Misplaced quotes
2022-01-13 16:28:46 -08:00
Debanjum Singh Solanky
2e53fbc844
Fix the user intent extraction prompt for GPT. Clean up chatbot test
2022-01-12 10:36:01 -05:00
Debanjum Singh Solanky
ea28897cdd
Remove deprecated conversation_history field from config
2022-01-12 10:35:52 -05:00
Debanjum Singh Solanky
5a686b7be9
Add logs for chat bot in verbose mode
2022-01-12 10:35:52 -05:00
Debanjum Singh Solanky
6dc2a99d35
Merge branch 'master' of github.com:debanjum/semantic-search into add-summarize-capability-to-chat-bot
...
- Fix openai_api_key being set in ConfigProcessorConfig
- Merge addition of config UI and config instantiation updates
2021-12-20 13:30:42 +05:30
Debanjum
29543d2dc3
Merge pull request #16 from debanjum/cache-conda-env-setup-to-improve-cloud-build
...
Cache Conda Setup to Improve Cloud Build Time
2021-12-18 06:58:42 -08:00
Saba
ee6aae3a40
Merge branch 'master' of github.com:debanjum/semantic-search
2021-12-16 20:36:36 -05:00
Saba
916a1ffc73
Fix formatting of README env dependencies
2021-12-16 20:36:31 -05:00