- Reason
- All clients that currently consume the API are part of Khoj
- Any breaking API changes will be fixed in clients immediately
- So decoupling client from API is not required
- This removes the burden of maintaining multiple versions of the API
- Split router.py into v1.0, beta and frontend (no-prefix) api modules
under new router package. Version tag in main.py via prefix
- Update frontends to use the versioned api endpoints
- Update tests to work with versioned api endpoints
- Update docs to mention, reference only versioned api endpoints
- The logging to file code expects the config directory to already be setup
- But parent directory of config file was being set up later in code
- This resulted in app start failing with ~/.khoj dir does not exist error
- Stop passing verbose flag around app methods
- Minor remap of verbosity levels to match python logging framework levels
- verbose = 0 maps to logging.WARN
- verbose = 1 maps to logging.INFO
- verbose >=2 maps to logging.DEBUG
- Minor clean-up of app: unused modules, conversation file opening
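
The verbosity remap above can be sketched as a small helper. This is a
minimal sketch, not the app's actual code: the function name
`verbosity_to_log_level` is hypothetical, and treating negative values
like 0 is an assumption.

```python
import logging


def verbosity_to_log_level(verbose: int) -> int:
    """Map a count of -v flags to a Python logging level (hypothetical helper)."""
    if verbose <= 0:
        # Assumption: negative values are treated the same as 0
        return logging.WARNING  # logging.WARN is an alias of logging.WARNING
    if verbose == 1:
        return logging.INFO
    return logging.DEBUG  # verbose >= 2


if __name__ == "__main__":
    # Set the level once at startup instead of passing `verbose` around
    logging.basicConfig(level=verbosity_to_log_level(1))
    logging.getLogger(__name__).info("shown at verbose = 1")
```

Setting the level once on the logging framework is what makes it possible
to stop passing the verbose flag through app methods.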
- What
- Convert the config screen into the main application window
with configuration as just one of the functions it provides
- Rename config screen to main window to match new designation
- Why
- System Tray isn't available everywhere (e.g. Linux)
- This requires moving functionality into a normal window for cross-compat
- What
- On Linux
- Show Configure Screen, even if not first run experience
- Do not show system tray on Linux
- Quit app on closing Configure Screen
- On Windows, Mac
- Show Configure screen only if first run experience
- Show system tray always
- Do not quit app on closing Configure Screen
- Why
- Configure screen is the only GUI element on Linux. So closing it
should close the application
- On Windows, Mac the system tray exists, so app should not be closed
on closing configure screen
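
The per-platform behavior above can be summarized as a small decision
table. This is an illustrative sketch only: `GuiPolicy` and `gui_policy`
are hypothetical names, and the real app makes these decisions inside its
GUI setup code.

```python
import platform
from dataclasses import dataclass


@dataclass(frozen=True)
class GuiPolicy:
    show_configure_screen: bool
    show_system_tray: bool
    quit_on_configure_close: bool


def gui_policy(system: str, first_run: bool) -> GuiPolicy:
    """Return the GUI behavior for a platform name as given by platform.system()."""
    if system == "Linux":
        # Configure screen is the only GUI element, so closing it quits the app
        return GuiPolicy(
            show_configure_screen=True,
            show_system_tray=False,
            quit_on_configure_close=True,
        )
    # Windows, Mac ("Darwin"): the system tray keeps the app reachable
    return GuiPolicy(
        show_configure_screen=first_run,
        show_system_tray=True,
        quit_on_configure_close=False,
    )


if __name__ == "__main__":
    print(gui_policy(platform.system(), first_run=False))
```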
- Decouple configuring backend from starting server.
Backend search and processors can be configured after the backend
server has started
- Set global state in main instead of in configure_server method.
This allows the app to start even if configure_server exits early in
the first run scenario, where no config is available to configure server
- Now start server, even if no config, before GUI started in main
- This refactor of app startup flow will allow users to configure
backend using the configure screen after server start
- Search is being reconfigured multiple times in /regenerate and
/reload. A more appropriate name for it is configure_ rather than
initialize_
- Standardize name of methods under configure.py
- Main.py was becoming too big to manage. It had both
controllers/routers and component configurations (search, processors)
in it
- Now that the native app GUI code is also getting added to the main
path, good time to split/modularize/clean main.py
- Put global state into a separate file to share across modules
- Run FastAPI server in a separate thread.
- This allows starting both the server and gui in parallel
- Create System Tray for Khoj
- Contains menu items that open search or config pages in browser
- Rearrange code to have only the code required to start Backend and
GUI in the run() method
- Move the backend setup code into a separate method
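
The threading pattern above can be sketched with the standard library
alone. This is a sketch under stated assumptions: a stdlib `HTTPServer`
stands in for the FastAPI server, and `PingHandler`,
`start_server_in_thread` and `fetch` are hypothetical names, not part of
the app.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PingHandler(BaseHTTPRequestHandler):
    """Stand-in for the real backend: answers every GET with 'ok'."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_server_in_thread(host="127.0.0.1", port=0):
    """Start the server on a background thread so the caller can go on to start the GUI."""
    server = HTTPServer((host, port), PingHandler)  # port=0 picks a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server, thread


def fetch(port, host="127.0.0.1"):
    """Issue a GET request against the stand-in server and return the body."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/")
        return conn.getresponse().read()
    finally:
        conn.close()


if __name__ == "__main__":
    server, thread = start_server_in_thread()
    # The main thread is free here; a GUI event loop would start at this point
    print(fetch(server.server_address[1]))
    server.shutdown()
    server.server_close()
```

A daemon thread is used so the process can exit when the main (GUI)
thread ends, without waiting on the server loop.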
- Most concretely right now, it eliminates the re-rank latency hit
on re-rank triggered on user hitting enter after re-rank is already
done on user idle in the emacs interface
- Improves search latency of (incremental) search
- Improve code layout by ensuring all web interface specific code
under the src/interface/web directory
- Rename config API to more specific /config instead of /ui
- Rename config data GET, POST api to /config/data instead of /config
- Improve search speed by ~10x
Tested on corpus of 125K lines, 12.5K entries
- Allow cross-encoder to re-rank results by setting &?r=true when querying /search API
- It's an optional param that defaults to False
- Earlier all results were re-ranked by cross-encoder
- Making this configurable allows for much faster results, if desired
but for lower accuracy
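
The speed/accuracy trade-off above can be sketched as a two-stage search
with an opt-in second stage. This is a toy sketch, not the app's search
code: `search`, `cheap_score` and `rerank_score` are hypothetical names,
with the scorers standing in for the fast first-pass ranking and the
slower cross-encoder respectively.

```python
from typing import Callable, List

Scorer = Callable[[str, str], float]


def search(
    query: str,
    entries: List[str],
    cheap_score: Scorer,
    rerank_score: Scorer,
    r: bool = False,
    top_k: int = 5,
) -> List[str]:
    """Rank entries with the cheap scorer; re-rank the top hits only if r is True."""
    # Stage 1: fast ranking over the whole corpus
    hits = sorted(entries, key=lambda e: cheap_score(query, e), reverse=True)[:top_k]
    if r:
        # Stage 2 (optional): slower, more accurate scoring of the short list only
        hits = sorted(hits, key=lambda e: rerank_score(query, e), reverse=True)
    return hits
```

With `r` left at its default of `False` the slow scorer is never called,
which is where the faster response comes from.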
- Formalize filters into class with can_filter() and filter() methods
- Use can_filter() method to decide whether to apply filter and
create deep copies of entries and embeddings for it
- Improve search speed for queries with no filters
as deep copying entries, embeddings takes the most time
after cross-encoder scoring when calling the /search API.
Earlier we would create deep copies of entries, embeddings
even if the query did not contain any filter keywords
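
The can_filter()/filter() split above can be sketched as follows. This is
a minimal sketch under assumptions: `WordFilter`, its `-word` exclusion
syntax and `apply_filters` are hypothetical and only illustrate the
interface, not the app's actual filters.

```python
import copy
import re


class WordFilter:
    """Hypothetical filter: `-word` in the query excludes entries containing that word."""

    pattern = re.compile(r"(?:^|\s)-(\w+)")

    def can_filter(self, query):
        # Cheap check, so callers can skip the expensive path entirely
        return bool(self.pattern.search(query))

    def filter(self, query, entries, embeddings):
        excluded = {word.lower() for word in self.pattern.findall(query)}
        keep = [
            i for i, entry in enumerate(entries)
            if not excluded & set(entry.lower().split())
        ]
        cleaned_query = self.pattern.sub(" ", query).strip()
        return cleaned_query, [entries[i] for i in keep], [embeddings[i] for i in keep]


def apply_filters(query, entries, embeddings, filters):
    """Deep copy entries and embeddings only when some filter applies to the query."""
    applicable = [f for f in filters if f.can_filter(query)]
    if not applicable:
        return query, entries, embeddings  # fast path: no copies made
    entries, embeddings = copy.deepcopy(entries), copy.deepcopy(embeddings)
    for f in applicable:
        query, entries, embeddings = f.filter(query, entries, embeddings)
    return query, entries, embeddings
```

Queries without filter keywords get the original objects back untouched,
so the deep copy cost is only paid when a filter will actually run.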
- Reason:
Allow natural search on markdown based notes, documentation,
websites etc
- Details:
- Create markdown processor to extract Markdown entries (identified by
Heading) into standard jsonl format required by text_search
- Update API, Configs to support interfacing with new markdown type
- Update Emacs, Web clients to support interfacing with new markdown
type via API
- Update Readme to mention that markdown is also supported
Closes #35
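
The heading-based extraction described above can be sketched as follows.
This is a simplified sketch, not the actual markdown processor: the
function names and the `entry` JSON key are hypothetical, and it does not
handle `#` lines inside fenced code blocks.

```python
import json
import re

# An ATX heading: 1 to 6 '#' characters, whitespace, then some text
HEADING = re.compile(r"^#{1,6}\s+\S")


def extract_markdown_entries(markdown):
    """Split markdown text into entries, each starting at a heading.

    Assumption: any text before the first heading becomes its own entry.
    """
    entries, current = [], []
    for line in markdown.splitlines():
        if HEADING.match(line) and current:
            entries.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        entries.append("\n".join(current).strip())
    return [entry for entry in entries if entry]


def entries_to_jsonl(entries):
    """Serialize entries as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps({"entry": entry}) for entry in entries)


if __name__ == "__main__":
    notes = "# Groceries\nBuy milk\n\n## Later\nCall mom\n"
    print(entries_to_jsonl(extract_markdown_entries(notes)))
```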
- The code for both the text search types were mostly the same
It was earlier done this way for expedience while experimenting
- The minor differences were reconciled and merged into a single
text_search type
- This simplifies the app and makes it easier to process other
text types