khoj/tests
Debanjum Singh Solanky db2581459f Parse markdown parent entries as single entry if fit within max tokens
These changes improve the context available to the search model.
Specifically, this should improve entry context from short knowledge trees,
that is, knowledge bases with sparse, short heading/entry trees.

Previously we'd always split markdown files by headings, even if a
parent entry was small enough to fit entirely within the max token
limit of the search model. This reduced the context available to the
search model when selecting appropriate entries for a query,
especially from short entry trees.

Revert to using regexes to parse markdown files instead of using
MarkdownHeaderTextSplitter. It was easier to implement the logical
split with regexes than to bend MarkdownHeaderTextSplitter to
implement it.
- DFS traverse the markdown knowledge tree, prefix ancestry to each entry
2024-04-04 02:41:55 +05:30
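
The splitting strategy described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function names markdown_to_entries and count_tokens are hypothetical, tokens are approximated by whitespace-separated words, and the ancestry prefix is not counted against the token limit.

```python
import re
from typing import List, Tuple


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for model tokens.
    return len(text.split())


def markdown_to_entries(
    text: str, max_tokens: int, ancestry: Tuple[str, ...] = (), level: int = 1
) -> List[str]:
    """Depth-first split of markdown by heading level (hypothetical helper).

    A section is kept as a single entry if it fits within max_tokens.
    Otherwise it is split on headings of the current level and each
    child is visited recursively. Every entry is prefixed with the
    heading lines of its ancestors.
    """
    body = text.strip()
    if not body:
        # Headings without any content under them produce no entry here.
        return []
    # Markdown has at most 6 heading levels. Past that, return the text
    # as is; chunking oversized leaf text is out of scope for this sketch.
    if count_tokens(body) <= max_tokens or level > 6:
        return ["\n".join(ancestry + (body,))]

    entries: List[str] = []
    # Zero-width split just before each heading of exactly this level.
    heading = rf"#{{{level}}}\s"
    for section in re.split(rf"^(?={heading})", text, flags=re.MULTILINE):
        if not section.strip():
            continue
        first_line, _, rest = section.partition("\n")
        if re.match(heading, first_line):
            # Descend into the section with its heading added to the ancestry.
            child_ancestry = ancestry + (first_line.strip(),)
            entries += markdown_to_entries(rest, max_tokens, child_ancestry, level + 1)
        else:
            # Text before the first heading of this level; it may still
            # contain deeper headings, so try the next level down.
            entries += markdown_to_entries(section, max_tokens, ancestry, level + 1)
    return entries


doc = (
    "# Heading 1\nintro\n"
    "## Sub 1\nalpha beta\n"
    "## Sub 2\ngamma delta\n"
    "# Heading 2\nshort\n"
)
small = markdown_to_entries(doc, max_tokens=8)
medium = markdown_to_entries(doc, max_tokens=12)
```

With a small limit every heading becomes its own entry, each carrying its parent headings. With a larger limit the whole "Heading 1" subtree stays together as one entry, which is the behaviour the commit message describes for short knowledge trees.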
..
data Update the default configuration for the AppConfig 2023-11-17 19:26:31 -08:00
__init__.py Move tests out to project root. Use absolute import in project 2021-09-30 04:12:14 -07:00
conftest.py Part 1: Server-side changes to support agents integrated with Conversations (#671) 2024-03-23 22:09:38 +05:30
helpers.py Use llama.cpp for offline chat models 2024-03-26 22:33:01 +05:30
test_cli.py Add isort to the pre-commit configuration and apply it to the whole project (#595) 2023-12-28 18:04:02 +05:30
test_client.py Short-circuit API rate limiter for unauthenticated users (#607) 2024-01-17 00:59:52 +05:30
test_conversation_utils.py Handle msg truncation when question is larger than max prompt size 2024-03-31 15:50:06 +05:30
test_date_filter.py Improve date filter regexes to extract structured, natural, partial dates 2024-03-30 00:07:19 +05:30
test_file_filter.py [Multi-User Part 1]: Enable storage of settings for plaintext files based on user account (#498) 2023-10-26 09:42:29 -07:00
test_helpers.py Part 2: Add web UI updates for basic agent interactions (#675) 2024-03-26 18:13:24 +05:30
test_image_search.py Add isort to the pre-commit configuration and apply it to the whole project (#595) 2023-12-28 18:04:02 +05:30
test_markdown_to_entries.py Parse markdown parent entries as single entry if fit within max tokens 2024-04-04 02:41:55 +05:30
test_multiple_users.py Add isort to the pre-commit configuration and apply it to the whole project (#595) 2023-12-28 18:04:02 +05:30
test_offline_chat_actors.py Merge branch 'master' into migrate-to-llama-cpp-for-offline-chat 2024-03-31 00:59:20 +05:30
test_offline_chat_director.py Merge branch 'master' into migrate-to-llama-cpp-for-offline-chat 2024-03-31 00:59:20 +05:30
test_openai_chat_actors.py Part 2: Add web UI updates for basic agent interactions (#675) 2024-03-26 18:13:24 +05:30
test_openai_chat_director.py Part 1: Server-side changes to support agents integrated with Conversations (#671) 2024-03-23 22:09:38 +05:30
test_org_to_entries.py Chunk text in preference order of para, sentence, word, character 2024-04-04 02:41:55 +05:30
test_orgnode.py Add isort to the pre-commit configuration and apply it to the whole project (#595) 2023-12-28 18:04:02 +05:30
test_pdf_to_entries.py Remove unused Entry to Jsonl converter from text to entry class, tests 2024-04-04 02:41:55 +05:30
test_plaintext_to_entries.py Remove unused Entry to Jsonl converter from text to entry class, tests 2024-04-04 02:41:55 +05:30
test_rawconfig.py Add isort to the pre-commit configuration and apply it to the whole project (#595) 2023-12-28 18:04:02 +05:30
test_text_search.py Chunk text in preference order of para, sentence, word, character 2024-04-04 02:41:55 +05:30
test_word_filter.py Fix test word filter 2023-11-19 13:14:58 -08:00