Improve doc search actor performance on vague, random or meta questions

- Issue
  Previously, the doc search actor didn't extract good search queries to run on the user's documents for broad, vague questions.
- Fix
  The updated extract questions prompt shows and tells the doc search actor how to deal with such questions.
  The doc search actor's temperature was also increased from 0 to 0.7 to support more creative/random questions.
  The previous temperature of 0 was meant to encourage structured JSON output. But now, with JSON mode, a low temperature is not necessary to get JSON output.
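
The point about JSON output can be illustrated with a minimal sketch. It assumes the model returns either a JSON object with a `queries` key or a bare JSON list of strings, as the prompts in this commit request. `parse_search_queries` and its fallback behaviour are hypothetical and not taken from the Khoj codebase; they only show how a caller might stay robust to malformed output at a non-zero temperature.

```python
import json
from typing import List


def parse_search_queries(response: str, fallback_query: str) -> List[str]:
    """Hypothetical helper: extract search queries from a model's JSON response."""
    try:
        parsed = json.loads(response.strip())
    except json.JSONDecodeError:
        # Malformed output: fall back to searching with the user's question
        return [fallback_query]

    # Accept either {"queries": [...]} or a bare JSON list of strings
    queries = parsed.get("queries", []) if isinstance(parsed, dict) else parsed
    if not isinstance(queries, list):
        return [fallback_query]

    # Keep only non-empty strings, with surrounding whitespace removed
    cleaned = [q.strip() for q in queries if isinstance(q, str) and q.strip()]
    return cleaned or [fallback_query]
```

Falling back to the user's original question is one possible design choice; it keeps the search step usable even when the response cannot be parsed.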
2024-11-23 23:48:56 +01:00 · 2024-07-27 01:33:36 +05:30 · f75606d7f5
commit f75606d7f5
parent 3675938df6
4 changed files with 21 additions and 3 deletions
--- a/src/khoj/processor/conversation/anthropic/anthropic_chat.py
+++ b/src/khoj/processor/conversation/anthropic/anthropic_chat.py
@@ -24,7 +24,7 @@ def extract_questions_anthropic(
    model: Optional[str] = "claude-instant-1.2",
    conversation_log={},
    api_key=None,
-    temperature=0,
+    temperature=0.7,
    location_data: LocationData = None,
    user: KhojUser = None,
 ):
@@ -52,6 +52,7 @@ def extract_questions_anthropic(
    system_prompt = prompts.extract_questions_anthropic_system_prompt.format(
        current_date=today.strftime("%Y-%m-%d"),
        day_of_week=today.strftime("%A"),
+        current_month=today.strftime("%Y-%m"),
        last_new_year=last_new_year.strftime("%Y"),
        last_new_year_date=last_new_year.strftime("%Y-%m-%d"),
        current_new_year_date=current_new_year.strftime("%Y-%m-%d"),
--- a/src/khoj/processor/conversation/offline/chat_model.py
+++ b/src/khoj/processor/conversation/offline/chat_model.py
@@ -32,7 +32,7 @@ def extract_questions_offline(
    location_data: LocationData = None,
    user: KhojUser = None,
    max_prompt_size: int = None,
-    temperature: float = 0,
+    temperature: float = 0.7,
 ) -> List[str]:
    """
    Infer search queries to retrieve relevant notes to answer user query
@@ -67,6 +67,7 @@ def extract_questions_offline(
        chat_history=chat_history,
        current_date=today.strftime("%Y-%m-%d"),
        day_of_week=today.strftime("%A"),
+        current_month=today.strftime("%Y-%m"),
        yesterday_date=yesterday,
        last_year=last_year,
        this_year=today.year,
--- a/src/khoj/processor/conversation/openai/gpt.py
+++ b/src/khoj/processor/conversation/openai/gpt.py
@@ -24,7 +24,7 @@ def extract_questions(
    conversation_log={},
    api_key=None,
    api_base_url=None,
-    temperature=0,
+    temperature=0.7,
    max_tokens=100,
    location_data: LocationData = None,
    user: KhojUser = None,
@@ -52,6 +52,7 @@ def extract_questions(
    prompt = prompts.extract_questions.format(
        current_date=today.strftime("%Y-%m-%d"),
        day_of_week=today.strftime("%A"),
+        current_month=today.strftime("%Y-%m"),
        last_new_year=last_new_year.strftime("%Y"),
        last_new_year_date=last_new_year.strftime("%Y-%m-%d"),
        current_new_year_date=current_new_year.strftime("%Y-%m-%d"),
--- a/src/khoj/processor/conversation/prompts.py
+++ b/src/khoj/processor/conversation/prompts.py
@@ -208,6 +208,7 @@ Construct search queries to retrieve relevant information to answer the user's q
 - Add as much context from the previous questions and answers as required into your search queries.
 - Break messages into multiple search queries when required to retrieve the relevant information.
 - Add date filters to your search queries from questions and answers when required to retrieve the relevant information.
+- When asked a meta, vague or random questions, search for a variety of broad topics to answer the user's question.
 - Share relevant search queries as a JSON list of strings. Do not say anything else.

 Current Date: {day_of_week}, {current_date}
@@ -239,6 +240,9 @@ Khoj: ["What kind of plants do I have?", "What issues do my plants have?"]
 Q: Who all did I meet here yesterday?
 Khoj: ["Met in {location} on {yesterday_date} dt>='{yesterday_date}' dt<'{current_date}'"]

+Q: Share some random, interesting experiences from this month
+Khoj: ["Exciting travel adventures from {current_month}", "Fun social events dt>='{current_month}-01' dt<'{current_date}'", "Intense emotional experiences in {current_month}"]
+
 Chat History:
 {chat_history}
 What searches will you perform to answer the following question, using the chat history as reference? Respond only with relevant search queries as a valid JSON list of strings.
@@ -255,6 +259,7 @@ Construct search queries to retrieve relevant information to answer the user's q
 - Add as much context from the previous questions and answers as required into your search queries.
 - Break messages into multiple search queries when required to retrieve the relevant information.
 - Add date filters to your search queries from questions and answers when required to retrieve the relevant information.
+- When asked a meta, vague or random questions, search for a variety of broad topics to answer the user's question.

 What searches will you perform to answer the users question? Respond with search queries as list of strings in a JSON object.
 Current Date: {day_of_week}, {current_date}
@@ -281,6 +286,10 @@ Q: How many tennis balls fit in the back of a 2002 Honda Civic?
 Khoj: {{"queries": ["What is the size of a tennis ball?", "What is the trunk size of a 2002 Honda Civic?"]}}
 A: 1085 tennis balls will fit in the trunk of a Honda Civic

+Q: Share some random, interesting experiences from this month
+Khoj: {{"queries": ["Exciting travel adventures from {current_month}", "Fun social events dt>='{current_month}-01' dt<'{current_date}'", "Intense emotional experiences in {current_month}"]}}
+A: You had a great time at the local beach with your friends, attended a music concert and had a deep conversation with your friend, Khalid.
+
 Q: Is Bob older than Tom?
 Khoj: {{"queries": ["When was Bob born?", "What is Tom's age?"]}}
 A: Yes, Bob is older than Tom. As Bob was born on 1984-01-01 and Tom is 30 years old.
@@ -307,6 +316,7 @@ Construct search queries to retrieve relevant information to answer the user's q
 - Add as much context from the previous questions and answers as required into your search queries.
 - Break messages into multiple search queries when required to retrieve the relevant information.
 - Add date filters to your search queries from questions and answers when required to retrieve the relevant information.
+- When asked a meta, vague or random questions, search for a variety of broad topics to answer the user's question.

 What searches will you perform to answer the users question? Respond with a JSON object with the key "queries" mapping to a list of searches you would perform on the user's knowledge base. Just return the queries and nothing else.

@@ -331,6 +341,11 @@ A: I can help you live healthier and happier across work and personal life
 User: Who all did I meet here yesterday?
 Assistant: {{"queries": ["Met in {location} on {yesterday_date} dt>='{yesterday_date}' dt<'{current_date}'"]}}
 A: Yesterday's note mentions your visit to your local beach with Ram and Shyam.
+
+User: Share some random, interesting experiences from this month
+Assistant: {{"queries": ["Exciting travel adventures from {current_month}", "Fun social events dt>='{current_month}-01' dt<'{current_date}'", "Intense emotional experiences in {current_month}"]}}
+A: You had a great time at the local beach with your friends, attended a music concert and had a deep conversation with your friend, Khalid.
+
 """.strip()
 )
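
The new `current_month` placeholder and the date filter in the added example query can be checked with a short sketch. It uses a fixed date (an assumption, chosen for reproducibility) and the same `strftime` format strings as the keyword arguments added in this commit. The `placeholders` dict and `template` variable are illustrative names, not part of the codebase.

```python
from datetime import datetime

# Fixed date for a reproducible illustration; the real code uses the current date
today = datetime(2024, 7, 27)

# Same format strings as the keyword arguments passed to the prompt templates
placeholders = {
    "current_date": today.strftime("%Y-%m-%d"),
    "current_month": today.strftime("%Y-%m"),
}

# One of the example queries from the updated prompts
template = "Fun social events dt>='{current_month}-01' dt<'{current_date}'"
query = template.format(**placeholders)
print(query)  # Fun social events dt>='2024-07-01' dt<'2024-07-27'
```

Appending `-01` to `{current_month}` gives the first day of the month, so the two `dt` filters bound the search to the month so far.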