sij/khoj

mirror of https://github.com/khoj-ai/khoj.git synced 2024-11-23 23:48:56 +01:00

Commit graph

Author	SHA1	Message	Date
Debanjum	a2ccf6f59f	Fix github workflow to start Khoj, connect to PG and upload results - Do not trigger tests to run in ci on update to evals	2024-11-18 04:25:15 -08:00
Debanjum	7c0fd71bfd	Add GitHub workflow to quiz Khoj across modes and specified evals (#982) - Evaluate khoj on random 200 questions from each of google frames and openai simpleqa benchmarks across general, default and research modes - Run eval with Gemini 1.5 Flash as test giver and Gemini 1.5 Pro as test evaluator models - Trigger eval workflow on release or manually - Make dataset, khoj mode and sample size configurable when triggered via manual workflow - Enable Web search, webpage read tools during evaluation	2024-11-18 02:19:30 -08:00
Debanjum	41d9011a26	Move evaluation script into tests/evals directory. This should give more space for eval scripts, results and readme	2024-11-17 02:08:20 -08:00

3 commits