- Improve escaping to load complex json objects
- Fall back to a more forgiving [json5](https://json5.org/) loader if `json.loads` cannot parse a complex json string
This should reduce agent failures when picking a research tool or running code
The JSON5 spec is more flexible. Try loading with a fast json5 parser if
the stricter `json.loads` from the standard library can't load the
raw complex json string into a python dictionary/list.
Gemini doesn't work well when trying to output json objects. Using it
to output raw json strings with complex, multi-line structures
requires more aggressive clean-up of the raw json string before parsing.
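
The strict-then-lenient fallback described above can be sketched roughly as follows. This is a minimal illustration, not the actual khoj code: the helper name `load_complex_json` is hypothetical, and `pyjson5` is only an assumed choice of parser, since the text says "a fast json5 parser" without naming the package.

```python
import json
from typing import Any, Callable, Optional

try:
    # Assumption: `pyjson5` stands in for the unnamed "fast json5 parser".
    import pyjson5

    _json5_loads: Optional[Callable[[str], Any]] = pyjson5.loads
except ImportError:
    _json5_loads = None


def load_complex_json(
    raw: str, fallback_loads: Optional[Callable[[str], Any]] = _json5_loads
) -> Any:
    """Parse with the strict stdlib loader first, then a lenient fallback."""
    cleaned = raw.strip()
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        if fallback_loads is None:
            # No lenient parser available: surface the original error.
            raise
        return fallback_loads(cleaned)
```

Valid JSON never reaches the fallback, so the lenient parser only costs time on inputs that the standard library already rejected.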
- Use pre-built wheels for torch and llama-cpp-python
- Install and link musl, as the llama-cpp-python pre-built
wheel uses it instead of glibc
- Join the Install git and Install Dependencies steps in the pytest workflow
to remove unnecessary steps
- Building the arm64 image on an ubuntu arm64 runner reduces `yarn build`
step time by 75%, from 12 mins to 3 mins.
- This is because QEMU emulation of arm64 on x86 is no longer required
- Parallelizing x64 and arm64 platform builds halves build time on top of that
- Revert to the standard ubuntu-latest runner, as the large x64 runner
doesn't give much additional speed improvement
This results in an effective additional 50%-66% reduction in build time
on top of #987.
So a full dockerize workflow run now takes *10 mins* vs the previous 35+ mins.
This is a total *72% improvement* in max dockerize run time.
Further speed improvements are possible when the docker layer cache is hit.
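
The quoted percentages follow from simple before/after arithmetic. A minimal sketch, using a hypothetical helper `percent_reduction`; note that the 35-minute baseline is only a lower bound ("35+ mins"), so it yields slightly less than the quoted 72%.

```python
def percent_reduction(before_mins: float, after_mins: float) -> float:
    """Return the percentage reduction going from before_mins to after_mins."""
    if before_mins <= 0:
        raise ValueError("before_mins must be positive")
    return (before_mins - after_mins) / before_mins * 100


# `yarn build` step on a native arm64 runner: 12 mins -> 3 mins
print(percent_reduction(12, 3))  # 75.0

# Full dockerize run: "35+ mins" -> 10 mins, taking 35 as a lower bound
print(round(percent_reduction(35, 10), 1))  # 71.4
```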
## Objective
Improve build speed and reduce size of khoj docker images
## Changes
### Improve docker image build speeds
- Decouple web app and server build steps
- Build the web app and server in parallel
- Cache docker layers for reuse across dockerize github workflow runs
- Split Docker build layers for improved cacheability (e.g. separate `yarn install` and `yarn build` steps)
### Reduce size of khoj docker images
- Use an up-to-date `.dockerignore` to exclude unnecessary directories
- Do not install cuda python packages for cpu builds
### Improve web app builds
- Use consistent mechanism to get fonts for web app
- Make tailwind extensions production instead of dev dependencies
- Make next.js create production builds for the web app (via `NODE_ENV=production` env var)