sij/khoj

mirror of https://github.com/khoj-ai/khoj.git synced 2024-11-27 17:35:07 +01:00

Author	SHA1	Message	Date
Debanjum Singh Solanky	d292bdcc11	Do not version API. Premature given current state of the codebase - Reason - All clients that currently consume the API are part of Khoj - Any breaking API changes will be fixed in clients immediately - So decoupling client from API is not required - This removes the burden of maintaining muliple versions of the API	2022-10-08 16:32:46 +03:00
Debanjum Singh Solanky	e42a38e825	Version Khoj API, Update frontends, tests and docs to reflect it - Split router.py into v1.0, beta and frontend (no-prefix) api modules under new router package. Version tag in main.py via prefix - Update frontends to use the versioned api endpoints - Update tests to work with versioned api endpoints - Update docs to mentioned, reference only versioned api endpoints	2022-09-28 20:08:38 +03:00
Debanjum Singh Solanky	02d944030f	Use Base TextToJsonl class to standardize <text>_to_jsonl processors - Start standardizing implementation of the `text_to_jsonl' processors - `text_to_jsonl; scripts already had a shared structure - This change starts to codify that implicit structure - Benefits - Ease adding more `text_to_jsonl; processors - Allow merging shared functionality - Help with type hinting - Drawbacks - Lower agility to change. But this was already an implicit issue as the text_to_jsonl processors got more deeply wired into the app	2022-09-16 00:53:11 +03:00
Debanjum Singh Solanky	1bfe9c4ef2	Handle filter only queries. Short-circuit and return filtered results - For queries with only filters in them short-circuit and return filtered results. No need to run semantic search, re-ranking. - Add client test for filter only query and quote query in client tests	2022-09-12 17:13:05 +03:00
Debanjum Singh Solanky	c17a0fd05b	Do not store word filters index to file. Not necessary for now - It's more of a hassle to not let word filter go stale on entry updates - Generating index on 120K lines of notes takes 1s. Loading from file takes 0.2s. For less content load time difference will be even smaller - Let go of startup time improvement for simplicity for now	2022-09-10 21:01:54 +03:00
Debanjum Singh Solanky	092b9e329d	Setup Filters when configuring Text Search for each Search Type - Allows enabling different filters for different Text Search Types - Use FileFilter in Text Search on Org Files	2022-09-05 15:21:40 +03:00
Debanjum Singh Solanky	f930324350	Rename explicit filter to word filter to be more specific	2022-09-04 17:18:47 +03:00
Debanjum Singh Solanky	cdcee89ae5	Wrap words in quotes to trigger explicit filter from query - Do not run the more expensive explicit filter until the word to be filtered is completed by user. This requires an end sequence marker to identify end of explicit word filter to trigger filtering - Space isn't a good enough delimiter as the explicit filter could be at the end of the query in which case no space	2022-09-04 02:38:57 +03:00
Debanjum Singh Solanky	30c3eb372a	Update Tests to Configure Filters and Setup Text Search	2022-09-03 22:24:10 +03:00
Debanjum Singh Solanky	972523e8a9	Re-enable tests for image search Verify if recent fixes resolve test flakiness	2022-08-20 14:44:53 +03:00
Debanjum Singh Solanky	7b04978f52	Put global state variables into separate state module - Variables storing app, device state aren't constants. Do not mix with actual constants like empty_escape_sequence, web_directory	2022-08-06 03:13:18 +03:00
Debanjum Singh Solanky	bc423d8f76	Disable image search in tests. Import global state from constants module - Upstream issues causing load of image search model to fail. Disable tests related to image search for now	2022-08-06 02:47:52 +03:00
Debanjum Singh Solanky	1168244c92	Make cross-encoder re-rank results if query param set on /search API - Improve search speed by ~10x Tested on corpus of 125K lines, 12.5K entries - Allow cross-encoder to re-rank results by settings &?r=true when querying /search API - It's an optional param that default to False - Earlier all results were re-ranked by cross-encoder - Making this configurable allows for much faster results, if desired but for lower accuracy	2022-07-26 22:56:36 +04:00
Debanjum Singh Solanky	65fea7681a	Rename notes search type to org search, now that markdown notes supported	2022-07-21 22:09:44 +04:00
Debanjum Singh Solanky	0602d018c0	Merge Symmetric, Asymmetric Search Types into a single Text Search Type - The code for both the text search types were mostly the same It was earlier done this way for expedience while experimenting - The minor differences were reconciled and merged into a single text_search type - This simplifies the app and making it easier to process other text types	2022-07-21 21:19:52 +04:00
Debanjum Singh Solanky	c1369233db	Consistently use "entry", "score" in json response for all search types - Had already made some progress on this earlier by updating the image search responses. But needed to update the text search responses to use lowercase entry and score - Update khoj.el to consume the updated json response keys for text search	2022-07-20 20:33:27 +04:00
Debanjum Singh Solanky	c9ff97451b	Fix tests to handle updated response types by API	2022-07-20 03:01:56 +04:00
Debanjum Singh Solanky	68ee88cebc	Fix image search tests after update to API response for image search types - Look for 'entry' key in response json instead of 'Entry' - Expect image where id = alphanumeric order of image name	2022-07-20 01:37:01 +04:00
Debanjum Singh Solanky	732b2d287f	Give the project a short, less generic name. Rename it to Khoj - Semantic Search was just a placeholder used to test the idea out Didn't want to get into naming at that point of time	2022-07-19 18:26:16 +04:00
Debanjum Singh Solanky	2f7ef08b11	Add Unit Tests to verify the Reload API functions as desired	2022-06-29 23:47:17 +04:00
Debanjum Singh Solanky	179153dc5a	Rename RawConfig Types for Consistency - Naming convention - [ContentType][ConfigType]Config - Where [ConfigType] ~ Content, Search, Processor - Where [ContentType] ~ Text, Image, Asymmetric, Symmetric, Conversation - Current Configs: - Content: - Org Notes - Org Music - Image - Ledger/Beancount - Search: - Asymmetric - Symmetric - Image - Processor: - Conversation	2022-01-14 20:54:38 -05:00
Debanjum Singh Solanky	ed144f7984	Setup Search with Search_Config to Fix Tests - Rename pytest fixture search_config to more appropriate content_config - Create search_config pytest fixture - Use search_config where search being setup, used in tests	2022-01-14 20:13:14 -05:00
Debanjum	ef911aa6be	Skip Flaky Image Search Test Image search doesn't always return expected image path. Should resolve remaining issues with failing cloud test. See #11	2021-12-12 02:15:20 -08:00
Saba	ba8dc9ed5f	Update the search_config instantiated for tests in conftest	2021-12-11 14:14:31 -05:00
Saba	d65190c3ee	Update unit tests, files with removing model suffix to config types	2021-12-09 08:50:38 -05:00
Saba	76e9e9da2f	Update unit tests to use the new BaseModel types	2021-12-05 09:31:39 -05:00
Debanjum Singh Solanky	9dcffe3e8e	Rename test_main to test_client. It only contains client specific tests	2021-10-02 20:32:53 -07:00

27 commits