Merge branch 'master' of github.com:khoj-ai/khoj into features/improve-tool-selection

2025-02-18 22:54:20 +00:00 · 2024-11-17 12:26:55 -08:00 · 2024-11-17 12:26:55 -08:00 · 7e662a05f8
commit 7e662a05f8
parent c77dc84a68 69ef6829c1
13 changed files with 263 additions and 112 deletions
--- a/2
+++ b/2
@ -37,7 +37,7 @@ ENV PYTHONPATH=/app/src:$PYTHONPATH
 # Go to the directory src/interface/web and export the built Next.js assets
 WORKDIR /app/src/interface/web
-RUN bash -c "yarn install --frozen-lockfile --verbose && yarn ciexport && yarn cache clean"
+RUN bash -c "yarn install --frozen-lockfile && yarn ciexport && yarn cache clean"
 WORKDIR /app
 # Run the Application
--- a/docker-compose.yml
+++ b/docker-compose.yml
@ -37,6 +37,7 @@ services:
    volumes:
      - khoj_config:/root/.khoj/
      - khoj_models:/root/.cache/torch/sentence_transformers
      - khoj_models:/root/.cache/huggingface
    # Use 0.0.0.0 to explicitly set the host ip for the service on the container. https://pythonspeed.com/articles/docker-connection-refused/
    environment:
      - POSTGRES_DB=postgres
@ -48,12 +49,17 @@ services:
      - KHOJ_DEBUG=False
      - KHOJ_ADMIN_EMAIL=username@example.com
      - KHOJ_ADMIN_PASSWORD=password
-      # Uncomment lines below to use chat models by each provider.
+      # Uncomment line below to use with Ollama running on your local machine at localhost:11434.
      # Change URL to use with other OpenAI API compatible providers like VLLM, LMStudio etc.
      # - OPENAI_API_BASE=http://host.docker.internal:11434/v1/
      #
      # Uncomment appropriate lines below to use chat models by OpenAI, Anthropic, Google.
      # Ensure you set your provider specific API keys.
      # ---
      # - OPENAI_API_KEY=your_openai_api_key
      # - GEMINI_API_KEY=your_gemini_api_key
      # - ANTHROPIC_API_KEY=your_anthropic_api_key
      #
      # Uncomment the necessary lines below to make your instance publicly accessible.
      # Replace the KHOJ_DOMAIN with either your domain or IP address (no http/https prefix).
      # Proceed with caution, especially if you are using anonymous mode.
--- a/documentation/docs/advanced/ollama.md
+++ b/documentation/docs/advanced/ollama.md
@ -1,33 +0,0 @@
 # Ollama
 :::info
 This is only helpful for self-hosted users. If you're using [Khoj Cloud](https://app.khoj.dev), you're limited to our first-party models.
 :::
 :::info
 Khoj natively supports local LLMs [available on HuggingFace in GGUF format](https://huggingface.co/models?library=gguf). Using an OpenAI API proxy with Khoj may be useful for ease of setup, trying new models or using commercial LLMs via API.
 :::
 Ollama allows you to run [many popular open-source LLMs](https://ollama.com/library) locally from your terminal.
 For folks comfortable with the terminal, Ollama's terminal based flows can ease setup and management of chat models.
 Ollama exposes a local [OpenAI API compatible server](https://github.com/ollama/ollama/blob/main/docs/openai.md#models). This makes it possible to use chat models from Ollama to create your personal AI agents with Khoj.
 ## Setup
 1. Setup Ollama: https://ollama.com/
 2. Start your preferred model with Ollama. For example,
    ```bash
    ollama run llama3.1
    ```
 3. Create a new [OpenAI Processor Conversation Config](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/add) on your Khoj admin panel
   - Name: `ollama`
   - Api Key: `any string`
   - Api Base Url: `http://localhost:11434/v1/` (default for Ollama)
 4. Create a new [Chat Model Option](http://localhost:42110/server/admin/database/chatmodeloptions/add) on your Khoj admin panel.
   - Name: `llama3.1` (replace with the name of your local model)
   - Model Type: `Openai`
   - Openai Config: `<the ollama config you created in step 3>`
   - Max prompt size: `20000` (replace with the max prompt size of your model)
 5. Go to [your config](http://localhost:42110/settings) and select the model you just created in the chat model dropdown.
 That's it! You should now be able to chat with your Ollama model from Khoj. If you want to add additional models running on Ollama, repeat step 4 for each model.
--- a/documentation/docs/advanced/ollama.mdx
+++ b/documentation/docs/advanced/ollama.mdx
@ -0,0 +1,78 @@
 # Ollama
 ```mdx-code-block
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
 ```
 :::info
 This is only helpful for self-hosted users. If you're using [Khoj Cloud](https://app.khoj.dev), you can use our first-party supported models.
 :::
 :::info
 Khoj can directly run local LLMs [available on HuggingFace in GGUF format](https://huggingface.co/models?library=gguf). The integration with Ollama is useful to run Khoj on Docker and have the chat models use your GPU or to try new models via CLI.
 :::
 Ollama allows you to run [many popular open-source LLMs](https://ollama.com/library) locally from your terminal.
 For folks comfortable with the terminal, Ollama's terminal based flows can ease setup and management of chat models.
 Ollama exposes a local [OpenAI API compatible server](https://github.com/ollama/ollama/blob/main/docs/openai.md#models). This makes it possible to use chat models from Ollama with Khoj.
 ## Setup
 :::info
 Restart your Khoj server after first run or update to the settings below to ensure all settings are applied correctly.
 :::
 <Tabs groupId="type" queryString>
  <TabItem value="first-run" label="First Run">
    <Tabs groupId="server" queryString>
      <TabItem value="docker" label="Docker">
      1. Set up Ollama: https://ollama.com/
      2. Download your preferred chat model with Ollama. For example,
         ```bash
         ollama pull llama3.1
         ```
      3. Uncomment the `OPENAI_API_BASE` environment variable in your downloaded Khoj [docker-compose.yml](https://github.com/khoj-ai/khoj/blob/master/docker-compose.yml#:~:text=OPENAI_API_BASE)
      4. Start Khoj with Docker for the first time to automatically detect and load the models available from Ollama running on your host machine
         ```bash
         # run below command in the directory where you downloaded the Khoj docker-compose.yml
         docker-compose up
         ```
      </TabItem>
      <TabItem value="pip" label="Pip">
      1. Set up Ollama: https://ollama.com/
      2. Download your preferred chat model with Ollama. For example,
         ```bash
         ollama pull llama3.1
         ```
      3. Set the `OPENAI_API_BASE` environment variable to `http://localhost:11434/v1` in your shell before starting Khoj for the first time
         ```bash
         export OPENAI_API_BASE="http://localhost:11434/v1"
         khoj --anonymous-mode
         ```
      </TabItem>
   </Tabs>
  </TabItem>
  <TabItem value="update" label="Update">
   1. Set up Ollama: https://ollama.com/
   2. Download your preferred chat model with Ollama. For example,
      ```bash
      ollama pull llama3.1
      ```
   3. Create a new [OpenAI Processor Conversation Config](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/add) on your Khoj admin panel
      - Name: `ollama`
      - Api Key: `any string`
      - Api Base Url: `http://localhost:11434/v1/` (default for Ollama)
   4. Create a new [Chat Model Option](http://localhost:42110/server/admin/database/chatmodeloptions/add) on your Khoj admin panel.
      - Name: `llama3.1` (replace with the name of your local model)
      - Model Type: `Openai`
      - Openai Config: `<the ollama config you created in step 3>`
      - Max prompt size: `20000` (replace with the max prompt size of your model)
   5. Go to [your config](http://localhost:42110/settings) and select the model you just created in the chat model dropdown.
   If you want to add additional models running on Ollama, repeat step 4 for each model.
  </TabItem>
 </Tabs>
 That's it! You should now be able to chat with your Ollama model from Khoj.
--- a/documentation/docs/features/image_generation.md
+++ b/documentation/docs/features/image_generation.md
@ -10,6 +10,17 @@ To generate images, you just need to provide a prompt to Khoj in which the image
 ## Setup (Self-Hosting)
-Right now, we only support integration with OpenAI's DALL-E. You need to have an OpenAI API key to use this feature. Here's how you can set it up:
+You have a couple of image generation options.
-1. Setup your OpenAI API key. See instructions [here](/get-started/setup#2-configure)
+
-2. Create a text to image config at http://localhost:42110/server/admin/database/texttoimagemodelconfig/. We recommend the value `dall-e-3`.
+### Image Generation Models
 We support most state-of-the-art image generation models, including Ideogram, Flux, and Stable Diffusion. These run using [Replicate](https://replicate.com). Here's how to set them up:
 1. Get a Replicate API key [here](https://replicate.com/account/api-tokens).
 1. Create a new [Text to Image Model](https://app.khoj.dev/server/admin/database/texttoimagemodelconfig/). Set the `type` to `Replicate`. Use any of the model names you see [on this list](https://replicate.com/pricing#image-models).
 ### OpenAI
 1. Get [an OpenAI API key](https://platform.openai.com/settings/organization/api-keys).
 2. Set up your OpenAI API key, if you haven't already. See instructions [here](/get-started/setup#2-configure)
 3. Create a text to image config at http://localhost:42110/server/admin/database/texttoimagemodelconfig/. We recommend the `model name` `dall-e-3`. Make sure to associate it with the OpenAI API chat configuration you set up in step 2 using the `Openai config` field.
--- a/documentation/docs/get-started/setup.mdx
+++ b/documentation/docs/get-started/setup.mdx
@ -19,7 +19,11 @@ These are the general setup instructions for self-hosted Khoj.
 You can install the Khoj server using either [Docker](?server=docker) or [Pip](?server=pip).
 :::info[Offline Model + GPU]
-If you want to use the offline chat model and you have a GPU, you should use Installation Option 2  - local setup via the Python package directly. Our Docker image doesn't currently support running the offline chat model on GPU, making inference times really slow.
+To use the offline chat model with your GPU, we recommend using the Docker setup with Ollama. You can also use the local Khoj setup via the Python package directly.
 :::
 :::info[First Run]
 Restart your Khoj server after the first run to ensure all settings are applied correctly.
 :::
 <Tabs groupId="server" queryString>
@ -28,9 +32,9 @@ If you want to use the offline chat model and you have a GPU, you should use Ins
      <TabItem value="macos" label="MacOS">
        <h3>Prerequisites</h3>
        <h4>Docker</h4>
-        (Option 1) Click here to install [Docker Desktop](https://docs.docker.com/desktop/install/mac-install/). Make sure you also install the [Docker Compose](https://docs.docker.com/desktop/install/mac-install/) tool.
+        - *Option 1*: Click here to install [Docker Desktop](https://docs.docker.com/desktop/install/mac-install/). Make sure you also install the [Docker Compose](https://docs.docker.com/desktop/install/mac-install/) tool.
-        (Option 2) Use [Homebrew](https://brew.sh/) to install Docker and Docker Compose.
+        - *Option 2*: Use [Homebrew](https://brew.sh/) to install Docker and Docker Compose.
          ```shell
          brew install --cask docker
          brew install docker-compose
@ -41,9 +45,10 @@ If you want to use the offline chat model and you have a GPU, you should use Ins
           mkdir ~/.khoj && cd ~/.khoj
           wget https://raw.githubusercontent.com/khoj-ai/khoj/master/docker-compose.yml
           ```
-        2. Configure the environment variables in the docker-compose.yml
+        2. Configure the environment variables in the `docker-compose.yml`
           - Set `KHOJ_ADMIN_PASSWORD`, `KHOJ_DJANGO_SECRET_KEY` (and optionally the `KHOJ_ADMIN_EMAIL`) to something secure. This allows you to customize Khoj later via the admin panel.
-          - Set `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY` to your API key if you want to use OpenAI, Anthropic or Gemini chat models respectively.
+           - Set `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY` to your API key if you want to use OpenAI, Anthropic or Gemini commercial chat models respectively.
           - Uncomment `OPENAI_API_BASE` to use [Ollama](/advanced/ollama?type=first-run&server=docker#setup) running on your host machine. Or set it to the URL of your OpenAI compatible API like vLLM or [LMStudio](/advanced/lmstudio).
        3. Start Khoj by running the following command in the same directory as your docker-compose.yml file.
           ```shell
           cd ~/.khoj
@ -66,9 +71,10 @@ If you want to use the offline chat model and you have a GPU, you should use Ins
         mkdir ~/.khoj && cd ~/.khoj
         wget https://raw.githubusercontent.com/khoj-ai/khoj/master/docker-compose.yml
         ```
-      2. Configure the environment variables in the docker-compose.yml
+      2. Configure the environment variables in the `docker-compose.yml`
         - Set `KHOJ_ADMIN_PASSWORD`, `KHOJ_DJANGO_SECRET_KEY` (and optionally the `KHOJ_ADMIN_EMAIL`) to something secure. This allows you to customize Khoj later via the admin panel.
-        - Set `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY` to your API key if you want to use OpenAI, Anthropic or Gemini chat models respectively.
+         - Set `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY` to your API key if you want to use OpenAI, Anthropic or Gemini commercial chat models respectively.
         - Uncomment `OPENAI_API_BASE` to use [Ollama](/advanced/ollama) running on your host machine. Or set it to the URL of your OpenAI compatible API like vLLM or [LMStudio](/advanced/lmstudio).
      3. Start Khoj by running the following command in the same directory as your docker-compose.yml file.
         ```shell
         # Windows users should use their WSL2 terminal to run these commands
@ -87,9 +93,10 @@ If you want to use the offline chat model and you have a GPU, you should use Ins
           mkdir ~/.khoj && cd ~/.khoj
           wget https://raw.githubusercontent.com/khoj-ai/khoj/master/docker-compose.yml
           ```
-        2. Configure the environment variables in the docker-compose.yml
+        2. Configure the environment variables in the `docker-compose.yml`
           - Set `KHOJ_ADMIN_PASSWORD`, `KHOJ_DJANGO_SECRET_KEY` (and optionally the `KHOJ_ADMIN_EMAIL`) to something secure. This allows you to customize Khoj later via the admin panel.
-          - Set `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY` to your API key if you want to use OpenAI, Anthropic or Gemini chat models respectively.
+           - Set `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY` to your API key if you want to use OpenAI, Anthropic or Gemini commercial chat models respectively.
           - Uncomment `OPENAI_API_BASE` to use [Ollama](/advanced/ollama) running on your host machine. Or set it to the URL of your OpenAI compatible API like vLLM or [LMStudio](/advanced/lmstudio).
        3. Start Khoj by running the following command in the same directory as your docker-compose.yml file.
           ```shell
           cd ~/.khoj
--- a/src/interface/web/app/components/chatInputArea/chatInputArea.tsx
+++ b/src/interface/web/app/components/chatInputArea/chatInputArea.tsx
@ -170,7 +170,12 @@ export const ChatInputArea = forwardRef<HTMLTextAreaElement, ChatInputProps>((pr
        }
        let messageToSend = message.trim();
-        if (useResearchMode && !messageToSend.startsWith("/research")) {
+        // Check if message starts with an explicit slash command
        const startsWithSlashCommand =
            props.chatOptionsData &&
            Object.keys(props.chatOptionsData).some((cmd) => messageToSend.startsWith(`/${cmd}`));
        // Only add /research if useResearchMode is enabled and message doesn't already use a slash command
        if (useResearchMode && !startsWithSlashCommand) {
            messageToSend = `/research ${messageToSend}`;
        }
--- a/src/khoj/database/admin.py
+++ b/src/khoj/database/admin.py
@ -108,7 +108,6 @@ admin.site.register(GithubConfig)
 admin.site.register(NotionConfig)
 admin.site.register(UserVoiceModelConfig)
 admin.site.register(VoiceModelOption)
 admin.site.register(UserConversationConfig)
 admin.site.register(UserRequests)
@ -326,3 +325,35 @@ class ConversationAdmin(admin.ModelAdmin):
            if "export_selected_minimal_objects" in actions:
                del actions["export_selected_minimal_objects"]
        return actions
@admin.register(UserConversationConfig)
 class UserConversationConfigAdmin(admin.ModelAdmin):
    list_display = (
        "id",
        "get_user_email",
        "get_chat_model",
        "get_subscription_type",
    )
    search_fields = ("id", "user__email", "setting__chat_model", "user__subscription__type")
    ordering = ("-updated_at",)
    def get_user_email(self, obj):
        return obj.user.email
    get_user_email.short_description = "User Email"  # type: ignore
    get_user_email.admin_order_field = "user__email"  # type: ignore
    def get_chat_model(self, obj):
        return obj.setting.chat_model if obj.setting else None
    get_chat_model.short_description = "Chat Model"  # type: ignore
    get_chat_model.admin_order_field = "setting__chat_model"  # type: ignore
    def get_subscription_type(self, obj):
        if hasattr(obj.user, "subscription"):
            return obj.user.subscription.type
        return None
    get_subscription_type.short_description = "Subscription Type"  # type: ignore
    get_subscription_type.admin_order_field = "user__subscription__type"  # type: ignore
--- a/src/khoj/interface/web/assets/icons/favicon-128x128.ico
+++ b/src/khoj/interface/web/assets/icons/favicon-128x128.ico
--- a/src/khoj/processor/content/pdf/pdf_to_entries.py
+++ b/src/khoj/processor/content/pdf/pdf_to_entries.py
@ -101,7 +101,7 @@ class PdfToEntries(TextToEntries):
                tmpf.flush()  # Ensure all data is written
                # Load the content using PyMuPDFLoader
-                loader = PyMuPDFLoader(tmpf.name, extract_images=True)
+                loader = PyMuPDFLoader(tmpf.name)
                pdf_entries_per_file = loader.load()
                # Convert the loaded entries into the desired format
--- a/src/khoj/utils/constants.py
+++ b/src/khoj/utils/constants.py
@ -16,7 +16,7 @@ default_offline_chat_models = [
 ]
 default_openai_chat_models = ["gpt-4o-mini", "gpt-4o"]
 default_gemini_chat_models = ["gemini-1.5-flash", "gemini-1.5-pro"]
-default_anthropic_chat_models = ["claude-3-5-sonnet-20240620", "claude-3-opus-20240229"]
+default_anthropic_chat_models = ["claude-3-5-sonnet-20241022", "claude-3-5-haiku-20241022"]
 empty_config = {
    "search-type": {
--- a/src/khoj/utils/initialization.py
+++ b/src/khoj/utils/initialization.py
@ -2,12 +2,13 @@ import logging
 import os
 from typing import Tuple
 import openai
 from khoj.database.adapters import ConversationAdapters
 from khoj.database.models import (
    ChatModelOptions,
    KhojUser,
    OpenAIProcessorConversationConfig,
    ServerChatSettings,
    SpeechToTextModelOptions,
    TextToImageModelConfig,
 )
@ -42,14 +43,32 @@ def initialization(interactive: bool = True):
            "🗣️ Configure chat models available to your server. You can always update these at /server/admin using your admin account"
        )
        openai_api_base = os.getenv("OPENAI_API_BASE")
        provider = "Ollama" if openai_api_base and openai_api_base.endswith(":11434/v1/") else "OpenAI"
        openai_api_key = os.getenv("OPENAI_API_KEY", "placeholder" if openai_api_base else None)
        default_chat_models = default_openai_chat_models
        if openai_api_base:
            # Get available chat models from OpenAI compatible API
            try:
                openai_client = openai.OpenAI(api_key=openai_api_key, base_url=openai_api_base)
                default_chat_models = [model.id for model in openai_client.models.list()]
                # Put the available default OpenAI models at the top
                valid_default_models = [model for model in default_openai_chat_models if model in default_chat_models]
                other_available_models = [model for model in default_chat_models if model not in valid_default_models]
                default_chat_models = valid_default_models + other_available_models
            except Exception as e:
                logger.warning(f"⚠️ Failed to fetch {provider} chat models. Fallback to default models. Error: {e}")
        # Set up OpenAI's online chat models
        openai_configured, openai_provider = _setup_chat_model_provider(
            ChatModelOptions.ModelType.OPENAI,
-            default_openai_chat_models,
+            default_chat_models,
-            default_api_key=os.getenv("OPENAI_API_KEY"),
+            default_api_key=openai_api_key,
            api_base_url=openai_api_base,
            vision_enabled=True,
            is_offline=False,
            interactive=interactive,
            provider_name=provider,
        )
        # Setup OpenAI speech to text model
@ -87,7 +106,7 @@ def initialization(interactive: bool = True):
            ChatModelOptions.ModelType.GOOGLE,
            default_gemini_chat_models,
            default_api_key=os.getenv("GEMINI_API_KEY"),
-            vision_enabled=False,
+            vision_enabled=True,
            is_offline=False,
            interactive=interactive,
            provider_name="Google Gemini",
@ -98,7 +117,7 @@ def initialization(interactive: bool = True):
            ChatModelOptions.ModelType.ANTHROPIC,
            default_anthropic_chat_models,
            default_api_key=os.getenv("ANTHROPIC_API_KEY"),
-            vision_enabled=False,
+            vision_enabled=True,
            is_offline=False,
            interactive=interactive,
        )
@ -154,11 +173,14 @@ def initialization(interactive: bool = True):
        default_chat_models: list,
        default_api_key: str,
        interactive: bool,
        api_base_url: str = None,
        vision_enabled: bool = False,
        is_offline: bool = False,
        provider_name: str = None,
    ) -> Tuple[bool, OpenAIProcessorConversationConfig]:
-        supported_vision_models = ["gpt-4o-mini", "gpt-4o"]
+        supported_vision_models = (
            default_openai_chat_models + default_anthropic_chat_models + default_gemini_chat_models
        )
        provider_name = provider_name or model_type.name.capitalize()
        default_use_model = {True: "y", False: "n"}[default_api_key is not None or is_offline]
        use_model_provider = (
@ -170,14 +192,16 @@ def initialization(interactive: bool = True):
        logger.info(f"️💬 Setting up your {provider_name} chat configuration")
-        chat_model_provider = None
+        chat_provider = None
        if not is_offline:
            if interactive:
                user_api_key = input(f"Enter your {provider_name} API key (default: {default_api_key}): ")
                api_key = user_api_key if user_api_key != "" else default_api_key
            else:
                api_key = default_api_key
-            chat_model_provider = OpenAIProcessorConversationConfig.objects.create(api_key=api_key, name=provider_name)
+            chat_provider = OpenAIProcessorConversationConfig.objects.create(
                api_key=api_key, name=provider_name, api_base_url=api_base_url
            )
        if interactive:
            chat_model_names = input(
@ -199,13 +223,13 @@ def initialization(interactive: bool = True):
                "max_prompt_size": default_max_tokens,
                "vision_enabled": vision_enabled,
                "tokenizer": default_tokenizer,
-                "openai_config": chat_model_provider,
+                "openai_config": chat_provider,
            }
            ChatModelOptions.objects.create(**chat_model_options)
        logger.info(f"🗣️ {provider_name} chat model configuration complete")
-        return True, chat_model_provider
+        return True, chat_provider
    admin_user = KhojUser.objects.filter(is_staff=True).first()
    if admin_user is None:
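
The provider naming and model ordering added to `initialization.py` above can be tried in isolation. The following is a minimal sketch, not the project's code: the helper names `detect_provider` and `order_models` are hypothetical, and the model ids in the usage section are made-up examples of what an OpenAI compatible server might list.

```python
from typing import List, Optional

# Mirrors default_openai_chat_models in src/khoj/utils/constants.py from this commit.
DEFAULT_OPENAI_CHAT_MODELS = ["gpt-4o-mini", "gpt-4o"]


def detect_provider(api_base: Optional[str]) -> str:
    """Name the provider from the API base URL, using the same check as the diff."""
    return "Ollama" if api_base and api_base.endswith(":11434/v1/") else "OpenAI"


def order_models(available: List[str], preferred: List[str]) -> List[str]:
    """Put the preferred models the server actually offers first, then the rest."""
    preferred_available = [m for m in preferred if m in available]
    others = [m for m in available if m not in preferred_available]
    return preferred_available + others


if __name__ == "__main__":
    # Hypothetical model ids returned by an OpenAI compatible server.
    available = ["llama3.1", "gpt-4o", "qwen2.5"]
    print(order_models(available, DEFAULT_OPENAI_CHAT_MODELS))
    print(detect_provider("http://host.docker.internal:11434/v1/"))
    print(detect_provider("http://localhost:11434/v1"))
```

Note that the check matches on the trailing slash, so a base URL written as `http://localhost:11434/v1` is labelled `OpenAI` by this expression even when it points at Ollama.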
--- a/tests/eval_frames.py
+++ b/tests/eval_frames.py
@ -5,6 +5,7 @@ import logging
 import os
 import time
 from datetime import datetime
 from io import StringIO
 from typing import Any, Dict
 import pandas as pd
@ -36,11 +37,21 @@ SLEEP_SECONDS = 1  # Delay between API calls to avoid rate limiting
 def load_frames_dataset():
-    """Load the FRAMES benchmark dataset from HuggingFace"""
+    """
    Load the Google FRAMES benchmark dataset from HuggingFace
    FRAMES is a benchmark dataset to evaluate retrieval and answering capabilities of agents.
    It contains ~800 questions requiring multi-hop retrieval and reasoning across various topics.
    ### Data Fields
    - Prompt: The question to be answered
    - Answer: The ground truth answer
    - reasoning_types: The type of reasoning required to answer the question
    """
    try:
        dataset = load_dataset("google/frames-benchmark")
        # Use test split for evaluation. Sample and shuffle dataset if configured
        dataset = dataset.shuffle() if RANDOMIZE else dataset
        return dataset["test"][: int(SAMPLE_SIZE)] if SAMPLE_SIZE else dataset["test"]
    except Exception as e:
@ -48,24 +59,36 @@ def load_frames_dataset():
        return None
-def load_talc_dataset():
+def load_simpleqa_dataset():
    """
-    Load the TALC dataset from Github.
+    Load the OpenAI SimpleQA benchmark dataset from their public bucket.
-    Normalize it into the FRAMES benchmark structure and the HuggingFace Dataset format.
+    SimpleQA is a dataset of moderately difficult q&a for 2024 models to answer across various topics.
    It contains ~4000 human vetted questions and answers with additional metadata.
    Its usage can be seen in openai/simple-evals github repository as well.
    ### Data Fields
    - problem: The question to be answered
    - answer: The ground truth answer
    - metadata: Additional metadata including topic information
    """
    try:
-        # Load TALC search benchmark from Github
+        # Load SimpleQA benchmark from OpenAI public bucket
-        raw_url = "https://raw.githubusercontent.com/Talc-AI/search-bench/3fd5b0858e2effa4c1578c7d046bee0a3895c488/data/searchbench_08_30_2024.jsonl"
+        raw_url = "https://openaipublic.blob.core.windows.net/simple-evals/simple_qa_test_set.csv"
        response = requests.get(raw_url)
        response.raise_for_status()
-        # Parse benchmark from raw JSONL response
+        # Parse benchmark from raw CSV response
-        jsonl_data = [json.loads(line) for line in response.text.splitlines()]
+        csv_data = pd.read_csv(StringIO(response.text))
-
+        # Normalize it into FRAMES format
        # Rename keys to match FRAMES format
        formatted_data = [
-            {"Prompt": d["question"], "Answer": d["expected_answer"], "reasoning_types": "talc"} for d in jsonl_data
+            {
                "Prompt": d["problem"],
                "Answer": d["answer"],
                "reasoning_types": json.loads(d["metadata"].replace("'", '"'))["topic"],
            }
            for d in csv_data.to_dict("records")
        ]
        # Convert benchmark to HF Dataset
@ -74,9 +97,8 @@ def load_talc_dataset():
        dataset = dataset.select(range(int(SAMPLE_SIZE))) if SAMPLE_SIZE else dataset
        return dataset
    except Exception as e:
-        logger.error(f"Error loading dataset: {e}")
+        logger.error(f"Error loading simpleqa dataset: {e}")
        return None
@ -208,7 +230,7 @@ def parse_args():
        "--dataset",
        "-d",
        default="frames",
-        choices=["frames", "talc"],
+        choices=["frames", "simpleqa"],
        help="Dataset to use for evaluation (default: frames)",
    )
    return parser.parse_args()
@ -220,11 +242,11 @@ def main():
    dataset = None
    # Load dataset
-    with timer(f"Loaded {args.dataset} dataset in", logger):
+    with timer(f"Loaded {args.dataset} dataset in", logger, log_level=logging.INFO):
        if args.dataset == "frames":
            dataset = load_frames_dataset()
-        elif args.dataset == "talc":
+        elif args.dataset == "simpleqa":
-            dataset = load_talc_dataset()
+            dataset = load_simpleqa_dataset()
    if dataset is None:
        return
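
The SimpleQA normalization in `load_simpleqa_dataset` above can be sketched without network access or pandas. This is an illustrative variant under stated assumptions, not the commit's implementation: it uses the standard library `csv` module, swaps in `ast.literal_eval` for the quote replacement plus `json.loads` approach in the diff, and assumes the `metadata` column holds a Python-style dict literal with a `topic` key. The function name `normalize_simpleqa` and the sample rows are hypothetical.

```python
import ast
import csv
from io import StringIO
from typing import Dict, List

# Hypothetical sample in the assumed column layout of the SimpleQA CSV.
SAMPLE_CSV = """metadata,problem,answer
"{'topic': 'Geography', 'answer_type': 'Place'}",What is the capital of France?,Paris
"{'topic': 'Science and technology', 'answer_type': 'Number'}",How many planets orbit the Sun?,8
"""


def normalize_simpleqa(csv_text: str) -> List[Dict[str, str]]:
    """Rename SimpleQA columns to the FRAMES field names, one record per row."""
    rows = csv.DictReader(StringIO(csv_text))
    return [
        {
            "Prompt": row["problem"],
            "Answer": row["answer"],
            # Assumption: metadata is a Python dict literal; read the topic of this row.
            "reasoning_types": ast.literal_eval(row["metadata"])["topic"],
        }
        for row in rows
    ]


if __name__ == "__main__":
    for record in normalize_simpleqa(SAMPLE_CSV):
        print(record)
```

Parsing the literal directly avoids rewriting quotes, which could matter if a metadata value ever contains an apostrophe.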