- Reason:
Allow natural language search on Markdown-based notes, documentation,
websites, etc.
- Details:
- Create markdown processor to extract Markdown entries (identified by
Heading) into standard jsonl format required by text_search
- Update API, Configs to support interfacing with new markdown type
- Update Emacs, Web clients to support interfacing with new markdown
type via API
- Update Readme to mention that Markdown is also supported
Closes #35
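
A minimal sketch of the heading-based extraction described above, not the actual processor. It assumes an entry starts at any ATX heading line and runs until the next heading. The function names and the "entry" JSON key are hypothetical and chosen only for illustration. Heading-like lines inside fenced code blocks are not handled.

```python
import json
import re

# Assumption: an entry starts at an ATX heading ("#" to "######" followed by whitespace).
HEADING = re.compile(r"^#{1,6}\s")


def extract_markdown_entries(text):
    """Split Markdown text into entries, each beginning at a heading line."""
    entries, current = [], []
    for line in text.splitlines():
        # A new heading closes the chunk collected so far
        if HEADING.match(line) and current:
            entries.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        entries.append("\n".join(current).strip())
    # Drop any preamble that precedes the first heading
    return [entry for entry in entries if HEADING.match(entry)]


def entries_to_jsonl(entries):
    """Serialize entries as one JSON object per line (hypothetical "entry" key)."""
    return "\n".join(json.dumps({"entry": e}, ensure_ascii=False) for e in entries)
```

Calling entries_to_jsonl(extract_markdown_entries(text)) on the contents of a Markdown file yields one JSON line per heading, with the heading line kept as the first line of each entry.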
- Image search responses were already updated earlier, but the text
search responses still needed to use the lowercase entry and score
keys
- Update khoj.el to consume the updated json response keys for text
search
- Use shr to render image response from html in result buffer
Previously used org-mode, but rendering HTML with shr seems cleaner
- Use Headings to Add highlights
- Use Random to Force fetch of Image. Similar to what was done for Web interface
- Remove trailing elisp brackets from response
- Show query match scores by image model for each image in results
- Add search query to top of buffer as Beancount comment
- Remove trailing ) from response
- Separate entries by empty line
- Load beancount-mode in semantic search on ledger buffer
- Previously:
The text the model was trained on was being used to
re-create a semblance of the original org-mode entry.
- Now:
- Store raw entry as another key:value in each entry json too
Only return actual raw org entries in results
But create embeddings like before
- Also add link to entry in file:<filename>::<line_number> form
in property drawer of returned results
This can be used to jump to the actual entry in its original file
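
A rough sketch of the entry shape described above, not the actual implementation. It assumes each entry json keeps the text used for embeddings next to the raw org entry, and that the link is written into a property drawer placed right after the heading. The function name, the "compiled" and "raw" keys, and the LINE property name are hypothetical. Entries that already carry a property drawer are not handled.

```python
import json


def build_entry(compiled, raw_lines, filename, line_number):
    """Return one JSON line with the text to embed and the raw org entry plus link.

    raw_lines is assumed to be non-empty, with the heading as its first line.
    """
    link = f"file:{filename}::{line_number}"
    heading, *body = raw_lines
    # Property drawer goes directly under the heading (hypothetical LINE property)
    drawer = [":PROPERTIES:", f":LINE: {link}", ":END:"]
    raw = "\n".join([heading, *drawer, *body])
    # "compiled" feeds the embeddings as before; only "raw" is returned in results
    return json.dumps({"compiled": compiled, "raw": raw}, ensure_ascii=False)
```

Keeping both values in the same entry lets the embeddings stay unchanged while results show the org entry as written, with a link back to its source line.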