- This improves latency of @general chat by avoiding unnecessary
compute
- It also avoids passing references in API response when they haven't
been used to generate the chat response. So interfaces don't have to
add logic to not render them unnecessarily
- **Introduce Khoj to LangChain**:
Call GPT with LangChain for Khoj Chat
- **Search (and Chat about) PDF files with Khoj**
- Create PDF to JSONL Processor: Convert PDF content into standardized JSONL format
- Expose PDF search type via Khoj server API
- Enable querying PDF files via Obsidian, Emacs and Web interfaces
- Make plugin update khoj server config to index PDF files in vault too
- Make Obsidian plugin update index for PDF files in vault too
- Show PDF results in Khoj Search modal as well
- Ensure combined results are sorted by score across both types
- Jump to PDF file when select it PDF search result from modal
- Match argument names passed to khoj openai completion funcs with
arguments passed to langchain calls to OpenAI
- This simplifies the logic in the khoj openai completion funcs
- Fix bug where both LangChain and Khoj retry requests 6 times each.
So a total of 12 requests at >1minute intervals for each chat
response in case of OpenAI API being down
- Retrying too many times when the API is failing doesn't help
- The earlier 60 second request timeout was spacing out the interval
between retries way too much. This slowed down chat response times
quite a bit when API was being flaky
- With these updates you'll know if call to chat API failed in under a
minute
- Use ChatModel and ChatOpenAI to call OpenAI chat model instead of
using OpenAI package directly
- This is being done as part of migration to rely on LangChain for
creating agents and managing their state
### Objective:
Use telemetry to better understand Khoj usage.
This will motivate and prioritize work for Khoj.
Specific questions:
- Number of active deployments of khoj server
- How regularly is khoj used (hourly, daily, weekly etc)?
- How much is which feature used (chat, search)?
- Which UI interface is used most (obsidian, emacs, web ui)?
### Details
- Expose setting to disable telemetry logging in khoj.yml
- Create basic telemetry server to log data to a DB
- Log calls to Khoj API /search, /chat, /update endpoints
- Batch upload telemetry data to server at ~hourly interval
- Khoj chat will now respond to general queries if:
1. no relevant reference notes available or
2. when explicitly induced by prefixing the chat message with "@general"
- Previously Khoj Chat would a lot of times refuse to respond to
general queries not answerable from reference notes or chat history
- Make chat quality tests more robust
- Add more equivalent chat response options refusing to answer
- Force haiku writing to not give any preable, just the haiku
- Simplifies switching between different OpenAI chat models. E.g GPT4
- It was previously hard-coded to use gpt-3.5-turbo. Now it just
defaults to using gpt-3.5-turbo, unless chat-model field under
conversation processor updated in khoj.yml
Merge pull request #214 from debanjum/add-filename-heading-to-compiled-entry-for-context
- Set filename as top heading in compiled org, markdown entries
- Note: *Khoj was already indexing filenames in compiled markdown entries but they weren't set as top level headings but rather appended as bare text*. The updated structure should provide more schematic context of relevance
- Set entry heading as heading for compiled org, md entries, even if split by max tokens
- Snip prepended heading to avoid crossing model max_token limits
- Entries with no md headings should not get heading prefix prepended
Otherwise if heading > max_tokens than the search models will just see
a heading (with repeated filename) for each compiled entry and not
actual content.
100 characters should be sufficient to include filename (not path) and
entry heading. If longer rather truncate to pass entry unique text to
model for search context