Having org-mode result headings change size based on their depth in
the source document makes is a confusing UI experience.
Improve font-size, line-spacing and margins of results to make
delineation between entries, and differntiating between entry heading
and it's body easier to visually infer.
Do not white-space: pre-line. Improves rendering of Markdown results
## Support Incremental Search on Khoj Web Interface
- Use default, fast path to query /search API while user is typing
- Upgrade to cross-encoder re-ranked results once user hits enter on search box
## Improve Render of Org Results on Web Interface
- We were previously just wrapping results from /search API into a pre formatted div field. This was not easy to read
- Use [org.js](https://mooz.github.io/org-js/) to render results from Khoj `/search` API as proper HTML
- Improve org.js to render all task states, stylize task tags and make org-mode results look more like original content
Closes#42#41
In current state:
- Rerank results:
- If user idles while entering query OR
- exits normally
- Do not rerank results:
- If user exits abnormally, e.g via C-g from query
- Rename functions to more standard, descriptive names
- Keep known, required code for incremental search
- E.g Do not set buffer local flag in hooks on minibuffer setup
- Only query when user in khoj minibuffer
- Use active-minibuffer-window and track khoj minibuffer
- (minibuffer-prompt) is not useful for our use-case here
- (For now) Run re-rank only if user idle while querying
- Do not run rerank on teardown/completion
- The reranking lag (~2s) is annoying; hit enter,
wait to see results
- Also triggered when user exits abnormally,
so C-g also results in rerank which is even more annoying
- Emacs will still hang if re-ranking gets triggered on idle but
that's better than always getting triggered. And better than not
having mechanism to get results re-ranked via cross-encoder at all
- Update khoj-simple to work cross-encoder re-ranked results like before
- Increment major version as incremental search considered a breaking
change and a major update to search capability
- Reason:
Allow natural search on markdown based notes, documentation,
websites etc
- Details:
- Create markdown processor to extract Markdown entries (identified by
Heading) into standard jsonl format required by text_search
- Update API, Configs to support interfacing with new markdown type
- Update Emacs, Web clients to support interfacing with new markdown
type via API
- Update Readme to mentiond markdown is also supported
Closes#35
- Had already made some progress on this earlier by updating the image
search responses. But needed to update the text search responses to
use lowercase entry and score
- Update khoj.el to consume the updated json response keys for text
search
- Avoids having to click the query input box
- Just open page, type whatever and hit enter to do image search
- For other search types select appropriate type from dropdown
- Use shr to render image response from html in result buffer
Earlier was using org-mode. But rendering HTML with shr seems cleaner
- Use Headings to Add highlights
- Use Random to Force fetch of Image. Similar to what was done for Web interface
- Remove trailing elisp brackets from response
- Show query match scores by image model for each image in results
Adding a random, unused url param at the end of the img.src string
fixes the issue. As the browser thinks it's a new image and doesn't
use the image data that's already cached because of which it wasn't
even making the fetch call for the image
- Allow viewing image results returned by Semantic Search.
Until now there wasn't any interface within the app to view image
search results. For text results, we at least had the emacs interface
- This should help with debugging issues with image search too
For text the Swagger interface was good enough
- Add search query to top of buffer as Beancount comment
- Remove trailing ) from response
- Separate entries by empty line
- Load beancount-mode in semantic search on ledger buffer
- Previously:
The text the model was trained on was being used to
re-create a semblance of the original org-mode entry.
- Now:
- Store raw entry as another key:value in each entry json too
Only return actual raw org entries in results
But create embeddings like before
- Also add link to entry in file:<filename>::<line_number> form
in property drawer of returned results
This can be used to jump to actual entry in it's original file