The source URL returned by OpenAI would expire soon. This would make
the chat sessions contain non-accessible images/messages if using
OpenaI image URL
Get base64 encoded image from OpenAI and store directly in
conversation logs. This resolves the image link expiring issue
- Use new style references for Khoj chat modal in Obsidian
- Khoj Chat responses in Obsidian had regressed to not show references
for new questions after modal has been opened. Now even those are
rendered, and use new references style
- Render chat response as markdown while it's being streamed
- Add transcription button with mic icon
- Collect audio recording on pressing mic
- Process and send audio recording to server for transcription
- Extract the functionality to flash status in chat input for reuse
Previously it was only searching for PDF and Markdown files. This was
meant to show only content from current vault as results.
But it has not scaled well as other clients also allow syncing PDF and
markdown files now. So remove this content type filter for now.
A proper solution would limit by using file/dir filters on server or
client side.
- Adds support for multiple users to be connected to the same Khoj instance using their Google login credentials
- Moves storage solution from in-memory json data to a Postgres db. This stores all relevant information, including accounts, embeddings, chat history, server side chat configuration
- Adds the concept of a Khoj server admin for configuring instance-wide settings regarding search model, and chat configuration
- Miscellaneous updates and fixes to the UX, including chat references, colors, and an updated config page
- Adds billing to allow users to subscribe to the cloud service easily
- Adds a separate GitHub action for building the dockerized production (tag `prod`) and dev (tag `dev`) images, separate from the image used for local building. The production image uses `gunicorn` with multiple workers to run the server.
- Updates all clients (Obsidian, Emacs, Desktop) to follow the client/server architecture. The server no longer reads from the file system at all; it only accepts data via the indexer API. In line with that, removes the functionality to configure org, markdown, plaintext, or other file-specific settings in the server. Only leaves GitHub and Notion for server-side configuration.
- Changes license to GNU AGPLv3
Resolves#467Resolves#488Resolves#303Resolves#345Resolves#195Resolves#280Resolves#461Closes#259Resolves#351Resolves#301Resolves#296
- Update background color to a different shade of white
- Make primary and primary hover colors less intense and more aligned
with lantern flame shade
- Add water, leaf, flower color variables
### ✨ New
- Use API keys to authenticate from Desktop, Obsidian, Emacs clients
- Create API, UI on web app config page to CRUD API Keys
- Create user API keys table and functions to CRUD them in Database
### 🧪 Improve
- Default to better search model, [gte-small](https://huggingface.co/thenlper/gte-small), to improve search quality
- Only load chat model to GPU if enough space, throw error on load failure
- Show encoding progress, truncate headings to max chars supported
- Add instruction to create db in Django DB setup Readme
### ⚙️ Fix
- Fix error handling when configure offline chat via Web UI
- Do not warn in anon mode about Google OAuth env vars not being set
- Fix path to load static files when server started from project root
### Overview
- Add ability to push data to index from the Emacs, Obsidian client
- Switch to standard mechanism of syncing files via HTTP multi-part/form. Previously we were streaming the data as JSON
- Benefits of new mechanism
- No manual parsing of files to send or receive on clients or server is required as most have in-built mechanisms to send multi-part/form requests
- The whole response is not required to be kept in memory to parse content as JSON. As individual files arrive they're automatically pushed to disk to conserve memory if required
- Binary files don't need to be encoded on client and decoded on server
### Code Details
### Major
- Use multi-part form to receive files to index on server
- Use multi-part form to send files to index on desktop client
- Send files to index on server from the khoj.el emacs client
- Send content for indexing on server at a regular interval from khoj.el
- Send files to index on server from the khoj obsidian client
- Update tests to test multi-part/form method of pushing files to index
#### Minor
- Put indexer API endpoint under /api path segment
- Explicitly make GET request to /config/data from khoj.el:khoj-server-configure method
- Improve emoji, message on content index updated via logger
- Don't call khoj server on khoj.el load, only once khoj invoked explicitly by user
- Improve indexing of binary files
- Let fs_syncer pass PDF files directly as binary before indexing
- Use encoding of each file set in indexer request to read file
- Add CORS policy to khoj server. Allow requests from khoj apps, obsidian & localhost
- Update indexer API endpoint URL to` index/update` from `indexer/batch`
Resolves#471#243
New URL query params, `force' and `t' match name of query parameter in
existing Khoj API endpoints
Update Desktop, Obsidian and Emacs client to call using these new API
query params. Set `client' query param from each client for telemetry
visibility
New URL follows action oriented endpoint naming convention used for
other Khoj API endpoints
Update desktop, obsidian and emacs client to call this new API
endpoint
- Keep state of previously synced files to identify files to be deleted
- Last synced files stored in settings for persistence of this data
across Obsidian reboots
Use the multi-part/form-data request to sync Markdown, PDF files in
vault to index on khoj server
Run scheduled job to push updates to value for indexing every 1 hour
- Make `bump_version.sh' script set version for the Khoj desktop app too
- Sync Khoj desktop app authors, license, description and version with
the other interfaces and server
- Update description in packages metadata to match project subtitle on Github