mirror of https://github.com/khoj-ai/khoj.git
synced 2024-11-23 15:38:55 +01:00
View each chunk of a non-hierarchical file as a separate corpus
If raw_is_compiled is set, the document being chunked has no inherent hierarchical structure, so its chunks shouldn't share a corpus_id. Otherwise, all chunks of a plain text file are shown as a single result during dedupe (default) search.
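The effect described above can be sketched in isolation. This is a minimal illustration, not khoj's code: `Chunk`, `split_into_chunks`, and `dedupe_by_corpus` are hypothetical names, and the dedupe step is an assumed, simplified model of a search that keeps one result per corpus_id.

```python
import uuid
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for a chunked entry (not khoj's Entry class)."""
    raw: str
    corpus_id: str


def split_into_chunks(text: str, max_words: int, raw_is_compiled: bool) -> list[Chunk]:
    """Split text into word-limited chunks and assign each a corpus_id."""
    words = text.split()
    shared_id = str(uuid.uuid4())  # one id for the whole entry
    chunks = []
    for start in range(0, len(words), max_words):
        piece = " ".join(words[start:start + max_words])
        # Mirrors the changed line: a fresh id per chunk for unstructured text,
        # the shared id otherwise.
        corpus_id = str(uuid.uuid4()) if raw_is_compiled else shared_id
        chunks.append(Chunk(raw=piece, corpus_id=corpus_id))
    return chunks


def dedupe_by_corpus(chunks: list[Chunk]) -> list[Chunk]:
    """Assumed model of dedupe search: keep the first chunk per corpus_id."""
    seen = set()
    unique = []
    for chunk in chunks:
        if chunk.corpus_id not in seen:
            seen.add(chunk.corpus_id)
            unique.append(chunk)
    return unique


text = "one two three four five six"
plain = split_into_chunks(text, max_words=2, raw_is_compiled=True)
structured = split_into_chunks(text, max_words=2, raw_is_compiled=False)
print(len(dedupe_by_corpus(plain)))       # 3: every chunk survives dedupe
print(len(dedupe_by_corpus(structured)))  # 1: chunks collapse into one result
```

Under these assumptions, chunks that share an id collapse to a single hit, while per-chunk ids keep each part of a plain text file separately retrievable.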
parent 2d35004371
commit 00620356e6
1 changed file with 1 addition and 1 deletion
@@ -108,7 +108,7 @@ class TextToEntries(ABC):
                         raw=entry.raw,
                         heading=entry.heading,
                         file=entry.file,
-                        corpus_id=corpus_id,
+                        corpus_id=uuid.uuid4() if raw_is_compiled else corpus_id,
                     )
                 )