Commit graph

38 commits

Author SHA1 Message Date
Saba
7fcc8d2cef Add null check for processor config 2021-12-04 10:11:00 -05:00
Saba
7ca4fc3453 Resolve mrege conflicts with updated processor conversation data model 2021-11-28 16:22:52 -05:00
Saba
5d50487d83 Linting
New line at end of config.html
Remove debug print statement
2021-11-28 13:32:56 -05:00
Saba
6f466c8d99 Use global config and add a regenerate button to the config ui' && git push 2021-11-28 13:28:22 -05:00
Saba
34d1e4199c Use alias generator when deserializing the config file 2021-11-28 13:05:48 -05:00
Saba
19b81e82f0 Write back to the raw config.yml file on update 2021-11-28 12:34:40 -05:00
Saba
8837b02de6 dump updated config to a yaml file 2021-11-28 12:26:07 -05:00
Saba
5b80b87379 Streamline None checking in initialize_search 2021-11-28 12:05:04 -05:00
Saba
bf8ae31e6a Streamline None checking in initialize_search 2021-11-28 11:59:45 -05:00
Saba
da52433d89 Update to re-use the raw config base models in config.py as well 2021-11-28 11:57:33 -05:00
Saba
66183cc298 Working API request body parsing to /post config! 2021-11-28 11:12:26 -05:00
Saba
64645c3ac1 Begin type checking/input validation effort 2021-11-27 21:47:56 -05:00
Saba
9a0264b7fc Add a dummy POST config endpoint, integrate with editable UI 2021-11-27 20:36:03 -05:00
Saba
f3b03ea5b7 Make raw data reactive to changes 2021-11-27 19:17:15 -05:00
Debanjum Singh Solanky
67c3cd7372 Wire up GPT understand method to /chat API. Log conversation metadata too 2021-11-28 00:04:39 +05:30
Saba
3db06eee3f Basic example of serving conifg as JSON and retriving on button click 2021-11-27 10:49:33 -05:00
Saba
3d4471e107 Merge branch 'master' of github.com:debanjum/semantic-search into saba/configui 2021-11-27 08:52:48 -05:00
Debanjum Singh Solanky
ccfb97e1a7 Wire up minimal conversation processor. Expose it over /chat API endpoint
Ensure conversation history persists across application restart
2021-11-27 18:12:01 +05:30
Saba
baee52648d Set up basic ui page with no functionality 2021-11-26 14:51:11 -05:00
Debanjum Singh Solanky
c47a8cdf16 Allow configuring host, port or unix socket of server via CLI 2021-10-02 16:16:33 -07:00
Debanjum Singh Solanky
d2905c4be6 Move tests out to project root. Use absolute import in project
tests/ directory in project root is more standard.
Just had to use absolute path for internal module imports to get it to
work
2021-09-30 04:12:14 -07:00
Debanjum Singh Solanky
d5597442f4 Modularize Code. Wrap Search, Model Config in Classes. Add Tests
Details
  - Rename method query_* to query in search_types for standardization
  - Wrapping Config code in classes simplified mocking test config
  - Reduce args beings passed to a function by passing it as single
    argument wrapped in a class
  - Minimize setup in main.py:__main__. Put most of it into functions
    These functions can be mocked if required in tests later too

Setup Flow:
  CLI_Args|Config_YAML -> (Text|Image)SearchConfig -> (Text|Image)SearchModel
2021-09-30 02:04:04 -07:00
Debanjum Singh Solanky
f4dd9cd117 Use type specific model for other search types too. Expose them via SearchModels
- Wrap Image, Music, Ledger search into the type of SearchModel they use
  Similar to what was done for notes model by wrapping it's config
  into an AsymmetricSearchModel.

- Use the uber wrapper class to expose all type specific search models
2021-09-29 21:09:42 -07:00
Debanjum Singh Solanky
e22e0b41e3 Wrap asymmetric search model into SearchModels. Test notes search end-to-end
- Wrap asymmetric search model parameters into AsymmetricSearchModel class
- Create wrapper for all search type models. Put notes search model into it
- Test notes search end-to-end from client API layer to results.
  Use model build on test data
2021-09-29 20:47:35 -07:00
Debanjum Singh Solanky
cde11a2331 Wrap search type enablement status in a search settings class
- Cleaner, more idiomatic usage of a global variable
- Simplifies mocking when testing client in pytest as setting wrapped
  in object rather than a simple type. So passed around by reference
2021-09-29 19:18:33 -07:00
Debanjum Singh Solanky
81ce0cacc3 Only allow supported search types to /search, /regenerate APIs
- Use a SearchType to limit types that can be passed by user
- FastAPI automatically validates type passed in query param
- Available type options show up in Swagger UI, FastAPI docs
- controller code looks neater instead of doing string comparisons for type
- Test invalid, valid search types via pytest
2021-09-29 19:12:56 -07:00
Debanjum Singh Solanky
169ddcc8c6 Make Using XMP Metadata to Enhance Image Search Optional, Configurable
- Break the compute embeddings method into separate methods:
  compute_image_embeddings and compute_metadata_embeddings

- If image_metadata_embeddings isn't defined, do not use it to enhance
  search results. Given image_metadata_embeddings wouldn't be defined
  if use_xmp_metadata is False, we can avoid unnecessary addition of
  args to query method
2021-09-16 12:01:05 -07:00
Debanjum Singh Solanky
3afe054312 Make image batch size to encode configurable via config.yml 2021-09-16 10:52:31 -07:00
Debanjum Singh Solanky
d8abbc0552 Use XMP metadata in images to improve image search
- Details
  - The CLIP model can represent images, text in the same vector space

  - Enhance CLIP's image understanding by augmenting the plain image
    with it's text based metadata.
    Specifically with any subject, description XMP tags on the image

  - Improve results by combining plain image similarity score with
    metadata similarity scores for the highest ranked images

- Minor Fixes
  - Convert verbose to integer from bool in image_search.
    It's already passed as integer from the main program entrypoint

  - Process images with ".jpeg" extensions too
2021-09-16 08:55:20 -07:00
Debanjum Singh Solanky
0263d4d068 Enable semantic search for songs in org-music
Org-Music: https://github.com/debanjum/org-music
2021-08-29 06:06:28 -07:00
Debanjum Singh Solanky
4daeddbbda Enable Semantic Search on Images 2021-08-22 21:42:37 -07:00
Debanjum Singh Solanky
fd217fe8b7 Enable Semantic Search for Beancount transactions 2021-08-22 21:36:06 -07:00
Debanjum Singh Solanky
97263b8209 Move CLI into a separate module. Move CLI tests into a separate file 2021-08-21 19:21:38 -07:00
Debanjum Singh Solanky
78a1f4ebb4 Use YAML file to allow user to configure application. Add tests
- YAML Config
  - Can specify all params[1] earlier being passed via cmd args in config YAML
  - Can now also configure sentence-transformer models to use etc for search
    - [1] Config params
       - org files
       - compressed entries file config path
       - embeddings file config path

  - Include sample_config.yaml
  - Include sample .org file from this repos readmes

- CLI
  - Configuration Priority: Config via cmd > Config via YAML > Default Config
  - Test CLI, include test config.yml for the tests

- Set default type to None unless set via query param to API
  Run notes search if search_enabled, also if type is None (default)
  Prepares for running queries on all search types unless type
  specified in API query param

- Update Readme
2021-08-21 19:07:39 -07:00
Debanjum Singh Solanky
252266b62a Pass type of item via regenerate API. Default type query param to None 2021-08-17 18:25:07 -07:00
Debanjum Singh Solanky
ff7207a6bd Extract commandline arguments into separate testable method 2021-08-17 04:11:03 -07:00
Debanjum Singh Solanky
a3a1100be9 Arrange modules in standardized ordering 2021-08-17 04:11:03 -07:00
Debanjum Singh Solanky
af9660f28e Move application files under src directory. Update Readmes
- Remove callign asymmetric search script directly command.
  It doesn't work anymore on calling directly due to internal package
  import issues
2021-08-17 04:11:03 -07:00
Renamed from main.py (Browse further)