mirror of
https://github.com/khoj-ai/khoj.git
synced 2024-11-27 17:35:07 +01:00
Improve OpenAI Chat Actors and their prompts (#673)
### Major - Enforce json mode response from OpenAI chat actors prev using string lists - Use `gpt-4-turbo-preview' as default chat model, extract questions actor - Make Khoj read khoj website to respond with accurate, up-to-date information about itself - Dedupe query in notes prompt. Improve OAI chat actor, director tests ### Minor - Test data source, output mode selector, web search query chat actors - Improve notes search actor to always create a non-empty list of queries - Construct available data sources, output modes as a bullet list in prompts - Use consistent agent name across static and dynamic examples in prompts - Add actor's name to extract questions prompt to improve context for guidance
This commit is contained in:
commit
e549824fe2
9 changed files with 213 additions and 157 deletions
|
@ -175,7 +175,7 @@ To use the desktop client, you need to go to your Khoj server's settings page (h
|
||||||
1. Go to http://localhost:42110/server/admin and login with your admin credentials.
|
1. Go to http://localhost:42110/server/admin and login with your admin credentials.
|
||||||
1. Go to [OpenAI settings](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/) in the server admin settings to add an OpenAI processor conversation config. This is where you set your API key. Alternatively, you can go to the [offline chat settings](http://localhost:42110/server/admin/database/offlinechatprocessorconversationconfig/) and simply create a new setting with `Enabled` set to `True`.
|
1. Go to [OpenAI settings](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/) in the server admin settings to add an OpenAI processor conversation config. This is where you set your API key. Alternatively, you can go to the [offline chat settings](http://localhost:42110/server/admin/database/offlinechatprocessorconversationconfig/) and simply create a new setting with `Enabled` set to `True`.
|
||||||
2. Go to the ChatModelOptions if you want to add additional models for chat.
|
2. Go to the ChatModelOptions if you want to add additional models for chat.
|
||||||
- Set the `chat-model` field to a supported chat model[^1] of your choice. For example, you can specify `gpt-4` if you're using OpenAI or `mistral-7b-instruct-v0.1.Q4_0.gguf` if you're using offline chat.
|
- Set the `chat-model` field to a supported chat model[^1] of your choice. For example, you can specify `gpt-4-turbo-preview` if you're using OpenAI or `mistral-7b-instruct-v0.1.Q4_0.gguf` if you're using offline chat.
|
||||||
- Make sure to set the `model-type` field to `OpenAI` or `Offline` respectively.
|
- Make sure to set the `model-type` field to `OpenAI` or `Offline` respectively.
|
||||||
- The `tokenizer` and `max-prompt-size` fields are optional. Set them only when using a non-standard model (i.e not mistral, gpt or llama2 model).
|
- The `tokenizer` and `max-prompt-size` fields are optional. Set them only when using a non-standard model (i.e not mistral, gpt or llama2 model).
|
||||||
1. Select files and folders to index [using the desktop client](/get-started/setup#2-download-the-desktop-client). When you click 'Save', the files will be sent to your server for indexing.
|
1. Select files and folders to index [using the desktop client](/get-started/setup#2-download-the-desktop-client). When you click 'Save', the files will be sent to your server for indexing.
|
||||||
|
|
|
@ -35,7 +35,7 @@ Use structured query syntax to filter entries from your knowledge based used by
|
||||||
Use this if you want to use non-standard, open or commercial, local or hosted LLM models for Khoj chat
|
Use this if you want to use non-standard, open or commercial, local or hosted LLM models for Khoj chat
|
||||||
1. Setup your desired chat LLM by installing an OpenAI compatible LLM API Server like [LiteLLM](https://docs.litellm.ai/docs/proxy/quick_start), [llama-cpp-python](https://github.com/abetlen/llama-cpp-python?tab=readme-ov-file#openai-compatible-web-server)
|
1. Setup your desired chat LLM by installing an OpenAI compatible LLM API Server like [LiteLLM](https://docs.litellm.ai/docs/proxy/quick_start), [llama-cpp-python](https://github.com/abetlen/llama-cpp-python?tab=readme-ov-file#openai-compatible-web-server)
|
||||||
2. Set environment variable `OPENAI_API_BASE="<url-of-your-llm-server>"` before starting Khoj
|
2. Set environment variable `OPENAI_API_BASE="<url-of-your-llm-server>"` before starting Khoj
|
||||||
3. Add ChatModelOptions with `model-type` `OpenAI`, and `chat-model` to anything (e.g `gpt-4`) during [Config](/get-started/setup#3-configure)
|
3. Add ChatModelOptions with `model-type` `OpenAI`, and `chat-model` to anything (e.g `gpt-3.5-turbo`) during [Config](/get-started/setup#3-configure)
|
||||||
- *(Optional)* Set the `tokenizer` and `max-prompt-size` relevant to the actual chat model you're using
|
- *(Optional)* Set the `tokenizer` and `max-prompt-size` relevant to the actual chat model you're using
|
||||||
|
|
||||||
#### Sample Setup using LiteLLM and Mistral API
|
#### Sample Setup using LiteLLM and Mistral API
|
||||||
|
|
|
@ -11,7 +11,6 @@ from khoj.processor.conversation.openai.utils import (
|
||||||
completion_with_backoff,
|
completion_with_backoff,
|
||||||
)
|
)
|
||||||
from khoj.processor.conversation.utils import generate_chatml_messages_with_context
|
from khoj.processor.conversation.utils import generate_chatml_messages_with_context
|
||||||
from khoj.utils.constants import empty_escape_sequences
|
|
||||||
from khoj.utils.helpers import ConversationCommand, is_none_or_empty
|
from khoj.utils.helpers import ConversationCommand, is_none_or_empty
|
||||||
from khoj.utils.rawconfig import LocationData
|
from khoj.utils.rawconfig import LocationData
|
||||||
|
|
||||||
|
@ -20,7 +19,7 @@ logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
def extract_questions(
|
def extract_questions(
|
||||||
text,
|
text,
|
||||||
model: Optional[str] = "gpt-4",
|
model: Optional[str] = "gpt-4-turbo-preview",
|
||||||
conversation_log={},
|
conversation_log={},
|
||||||
api_key=None,
|
api_key=None,
|
||||||
temperature=0,
|
temperature=0,
|
||||||
|
@ -35,9 +34,9 @@ def extract_questions(
|
||||||
# Extract Past User Message and Inferred Questions from Conversation Log
|
# Extract Past User Message and Inferred Questions from Conversation Log
|
||||||
chat_history = "".join(
|
chat_history = "".join(
|
||||||
[
|
[
|
||||||
f'Q: {chat["intent"]["query"]}\n\n{chat["intent"].get("inferred-queries") or list([chat["intent"]["query"]])}\n\n{chat["message"]}\n\n'
|
f'Q: {chat["intent"]["query"]}\nKhoj: {{"queries": {chat["intent"].get("inferred-queries") or list([chat["intent"]["query"]])}}}\nA: {chat["message"]}\n\n'
|
||||||
for chat in conversation_log.get("chat", [])[-4:]
|
for chat in conversation_log.get("chat", [])[-4:]
|
||||||
if chat["by"] == "khoj" and chat["intent"].get("type") != "text-to-image"
|
if chat["by"] == "khoj" and "text-to-image" not in chat["intent"].get("type")
|
||||||
]
|
]
|
||||||
)
|
)
|
||||||
|
|
||||||
|
@ -66,7 +65,7 @@ def extract_questions(
|
||||||
model_name=model,
|
model_name=model,
|
||||||
temperature=temperature,
|
temperature=temperature,
|
||||||
max_tokens=max_tokens,
|
max_tokens=max_tokens,
|
||||||
model_kwargs={"stop": ["A: ", "\n"]},
|
model_kwargs={"stop": ["A: ", "\n"], "response_format": {"type": "json_object"}},
|
||||||
openai_api_key=api_key,
|
openai_api_key=api_key,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
@ -74,8 +73,8 @@ def extract_questions(
|
||||||
try:
|
try:
|
||||||
response = response.strip()
|
response = response.strip()
|
||||||
response = json.loads(response)
|
response = json.loads(response)
|
||||||
response = [q.strip() for q in response if q.strip()]
|
response = [q.strip() for q in response["queries"] if q.strip()]
|
||||||
if not isinstance(response, list) or not response or len(response) == 0:
|
if not isinstance(response, list) or not response:
|
||||||
logger.error(f"Invalid response for constructing subqueries: {response}")
|
logger.error(f"Invalid response for constructing subqueries: {response}")
|
||||||
return [text]
|
return [text]
|
||||||
return response
|
return response
|
||||||
|
@ -87,11 +86,7 @@ def extract_questions(
|
||||||
return questions
|
return questions
|
||||||
|
|
||||||
|
|
||||||
def send_message_to_model(
|
def send_message_to_model(messages, api_key, model, response_type="text"):
|
||||||
messages,
|
|
||||||
api_key,
|
|
||||||
model,
|
|
||||||
):
|
|
||||||
"""
|
"""
|
||||||
Send message to model
|
Send message to model
|
||||||
"""
|
"""
|
||||||
|
@ -101,6 +96,7 @@ def send_message_to_model(
|
||||||
messages=messages,
|
messages=messages,
|
||||||
model=model,
|
model=model,
|
||||||
openai_api_key=api_key,
|
openai_api_key=api_key,
|
||||||
|
model_kwargs={"response_format": {"type": response_type}},
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@ -150,7 +146,7 @@ def converse(
|
||||||
f"{prompts.online_search_conversation.format(online_results=str(online_results))}\n{conversation_primer}"
|
f"{prompts.online_search_conversation.format(online_results=str(online_results))}\n{conversation_primer}"
|
||||||
)
|
)
|
||||||
if not is_none_or_empty(compiled_references):
|
if not is_none_or_empty(compiled_references):
|
||||||
conversation_primer = f"{prompts.notes_conversation.format(query=user_query, references=compiled_references)}\n{conversation_primer}"
|
conversation_primer = f"{prompts.notes_conversation.format(query=user_query, references=compiled_references)}\n\n{conversation_primer}"
|
||||||
|
|
||||||
# Setup Prompt with Primer or Conversation History
|
# Setup Prompt with Primer or Conversation History
|
||||||
messages = generate_chatml_messages_with_context(
|
messages = generate_chatml_messages_with_context(
|
||||||
|
|
|
@ -104,8 +104,6 @@ Ask crisp follow-up questions to get additional context, when a helpful response
|
||||||
|
|
||||||
Notes:
|
Notes:
|
||||||
{references}
|
{references}
|
||||||
|
|
||||||
Query: {query}
|
|
||||||
""".strip()
|
""".strip()
|
||||||
)
|
)
|
||||||
|
|
||||||
|
@ -217,69 +215,52 @@ Use these notes from the user's previous conversations to provide a response:
|
||||||
|
|
||||||
extract_questions = PromptTemplate.from_template(
|
extract_questions = PromptTemplate.from_template(
|
||||||
"""
|
"""
|
||||||
You are Khoj, an extremely smart and helpful search assistant with the ability to retrieve information from the user's notes.
|
You are Khoj, an extremely smart and helpful search assistant with the ability to retrieve information from the user's notes. Construct search queries to retrieve relevant information to answer the user's question.
|
||||||
- The user will provide their questions and answers to you for context.
|
- You will be provided past questions(Q) and answers(A) for context.
|
||||||
- Add as much context from the previous questions and answers as required into your search queries.
|
- Add as much context from the previous questions and answers as required into your search queries.
|
||||||
- Break messages into multiple search queries when required to retrieve the relevant information.
|
- Break messages into multiple search queries when required to retrieve the relevant information.
|
||||||
- Add date filters to your search queries from questions and answers when required to retrieve the relevant information.
|
- Add date filters to your search queries from questions and answers when required to retrieve the relevant information.
|
||||||
|
|
||||||
What searches, if any, will you need to perform to answer the users question?
|
What searches will you need to perform to answer the users question? Respond with search queries as list of strings in a JSON object.
|
||||||
Provide search queries as a JSON list of strings
|
|
||||||
Current Date: {current_date}
|
Current Date: {current_date}
|
||||||
User's Location: {location}
|
User's Location: {location}
|
||||||
|
|
||||||
Q: How was my trip to Cambodia?
|
Q: How was my trip to Cambodia?
|
||||||
|
Khoj: {{"queries": ["How was my trip to Cambodia?"]}}
|
||||||
["How was my trip to Cambodia?"]
|
|
||||||
|
|
||||||
A: The trip was amazing. I went to the Angkor Wat temple and it was beautiful.
|
A: The trip was amazing. I went to the Angkor Wat temple and it was beautiful.
|
||||||
|
|
||||||
Q: Who did i visit that temple with?
|
Q: Who did i visit that temple with?
|
||||||
|
Khoj: {{"queries": ["Who did I visit the Angkor Wat Temple in Cambodia with?"]}}
|
||||||
["Who did I visit the Angkor Wat Temple in Cambodia with?"]
|
|
||||||
|
|
||||||
A: You visited the Angkor Wat Temple in Cambodia with Pablo, Namita and Xi.
|
A: You visited the Angkor Wat Temple in Cambodia with Pablo, Namita and Xi.
|
||||||
|
|
||||||
Q: What national parks did I go to last year?
|
Q: What national parks did I go to last year?
|
||||||
|
Khoj: {{"queries": ["National park I visited in {last_new_year} dt>='{last_new_year_date}' dt<'{current_new_year_date}'"]}}
|
||||||
["National park I visited in {last_new_year} dt>='{last_new_year_date}' dt<'{current_new_year_date}'"]
|
|
||||||
|
|
||||||
A: You visited the Grand Canyon and Yellowstone National Park in {last_new_year}.
|
A: You visited the Grand Canyon and Yellowstone National Park in {last_new_year}.
|
||||||
|
|
||||||
Q: How are you feeling today?
|
Q: How can you help me?
|
||||||
|
Khoj: {{"queries": ["Social relationships", "Physical and mental health", "Education and career", "Personal life goals and habits"]}}
|
||||||
[]
|
A: I can help you live healthier and happier across work and personal life
|
||||||
|
|
||||||
A: I'm feeling a little bored. Helping you will hopefully make me feel better!
|
|
||||||
|
|
||||||
Q: How many tennis balls fit in the back of a 2002 Honda Civic?
|
Q: How many tennis balls fit in the back of a 2002 Honda Civic?
|
||||||
|
Khoj: {{"queries": ["What is the size of a tennis ball?", "What is the trunk size of a 2002 Honda Civic?"]}}
|
||||||
["What is the size of a tennis ball?", "What is the trunk size of a 2002 Honda Civic?"]
|
|
||||||
|
|
||||||
A: 1085 tennis balls will fit in the trunk of a Honda Civic
|
A: 1085 tennis balls will fit in the trunk of a Honda Civic
|
||||||
|
|
||||||
Q: Is Bob older than Tom?
|
Q: Is Bob older than Tom?
|
||||||
|
Khoj: {{"queries": ["When was Bob born?", "What is Tom's age?"]}}
|
||||||
["When was Bob born?", "What is Tom's age?"]
|
|
||||||
|
|
||||||
A: Yes, Bob is older than Tom. As Bob was born on 1984-01-01 and Tom is 30 years old.
|
A: Yes, Bob is older than Tom. As Bob was born on 1984-01-01 and Tom is 30 years old.
|
||||||
|
|
||||||
Q: What is their age difference?
|
Q: What is their age difference?
|
||||||
|
Khoj: {{"queries": ["What is Bob's age?", "What is Tom's age?"]}}
|
||||||
["What is Bob's age?", "What is Tom's age?"]
|
|
||||||
|
|
||||||
A: Bob is {bob_tom_age_difference} years older than Tom. As Bob is {bob_age} years old and Tom is 30 years old.
|
A: Bob is {bob_tom_age_difference} years older than Tom. As Bob is {bob_age} years old and Tom is 30 years old.
|
||||||
|
|
||||||
Q: What does yesterday's note say?
|
Q: What does yesterday's note say?
|
||||||
|
Khoj: {{"queries": ["Note from {yesterday_date} dt>='{yesterday_date}' dt<'{current_date}'"]}}
|
||||||
["Note from {yesterday_date} dt>='{yesterday_date}' dt<'{current_date}'"]
|
A: Yesterday's note mentions your visit to your local beach with Ram and Shyam.
|
||||||
|
|
||||||
A: Yesterday's note contains the following information: ...
|
|
||||||
|
|
||||||
{chat_history}
|
{chat_history}
|
||||||
Q: {text}
|
Q: {text}
|
||||||
|
Khoj:
|
||||||
"""
|
""".strip()
|
||||||
)
|
)
|
||||||
|
|
||||||
system_prompt_extract_relevant_information = """As a professional analyst, create a comprehensive report of the most relevant information from a web page in response to a user's query. The text provided is directly from within the web page. The report you create should be multiple paragraphs, and it should represent the content of the website. Tell the user exactly what the website says in response to their query, while adhering to these guidelines:
|
system_prompt_extract_relevant_information = """As a professional analyst, create a comprehensive report of the most relevant information from a web page in response to a user's query. The text provided is directly from within the web page. The report you create should be multiple paragraphs, and it should represent the content of the website. Tell the user exactly what the website says in response to their query, while adhering to these guidelines:
|
||||||
|
@ -339,7 +320,12 @@ Khoj:
|
||||||
|
|
||||||
pick_relevant_information_collection_tools = PromptTemplate.from_template(
|
pick_relevant_information_collection_tools = PromptTemplate.from_template(
|
||||||
"""
|
"""
|
||||||
You are Khoj, a smart and helpful personal assistant. You have access to a variety of data sources to help you answer the user's question. You can use the data sources listed below to collect more relevant information. You can use any combination of these data sources to answer the user's question. Tell me which data sources you would like to use to answer the user's question.
|
You are Khoj, an extremely smart and helpful search assistant.
|
||||||
|
- You have access to a variety of data sources to help you answer the user's question
|
||||||
|
- You can use the data sources listed below to collect more relevant information
|
||||||
|
- You can use any combination of these data sources to answer the user's question
|
||||||
|
|
||||||
|
Which of the data sources listed below you would use to answer the user's question?
|
||||||
|
|
||||||
{tools}
|
{tools}
|
||||||
|
|
||||||
|
@ -351,7 +337,7 @@ User: I'm thinking of moving to a new city. I'm trying to decide between New Yor
|
||||||
AI: Moving to a new city can be challenging. Both New York and San Francisco are great cities to live in. New York is known for its diverse culture and San Francisco is known for its tech scene.
|
AI: Moving to a new city can be challenging. Both New York and San Francisco are great cities to live in. New York is known for its diverse culture and San Francisco is known for its tech scene.
|
||||||
|
|
||||||
Q: What is the population of each of those cities?
|
Q: What is the population of each of those cities?
|
||||||
Khoj: ["online"]
|
Khoj: {{"source": ["online"]}}
|
||||||
|
|
||||||
Example:
|
Example:
|
||||||
Chat History:
|
Chat History:
|
||||||
|
@ -359,23 +345,32 @@ User: I'm thinking of my next vacation idea. Ideally, I want to see something ne
|
||||||
AI: Excellent! Taking a vacation is a great way to relax and recharge.
|
AI: Excellent! Taking a vacation is a great way to relax and recharge.
|
||||||
|
|
||||||
Q: Where did Grandma grow up?
|
Q: Where did Grandma grow up?
|
||||||
Khoj: ["notes"]
|
Khoj: {{"source": ["notes"]}}
|
||||||
|
|
||||||
Example:
|
Example:
|
||||||
Chat History:
|
Chat History:
|
||||||
|
|
||||||
Q: What's the latest news with the first company I worked for?
|
|
||||||
Khoj: ["notes", "online"]
|
Q: What can you do for me?
|
||||||
|
Khoj: {{"source": ["notes", "online"]}}
|
||||||
|
|
||||||
|
Example:
|
||||||
|
Chat History:
|
||||||
|
User: Good morning
|
||||||
|
AI: Good morning! How can I help you today?
|
||||||
|
|
||||||
|
Q: How can I share my files with Khoj?
|
||||||
|
Khoj: {{"source": ["default", "online"]}}
|
||||||
|
|
||||||
Example:
|
Example:
|
||||||
Chat History:
|
Chat History:
|
||||||
User: I want to start a new hobby. I'm thinking of learning to play the guitar.
|
User: I want to start a new hobby. I'm thinking of learning to play the guitar.
|
||||||
AI: Learning to play the guitar is a great hobby. It can be a lot of fun and a great way to express yourself.
|
AI: Learning to play the guitar is a great hobby. It can be a lot of fun and a great way to express yourself.
|
||||||
|
|
||||||
Q: Who is Sandra?
|
Q: What is the first element of the periodic table?
|
||||||
Khoj: ["default"]
|
Khoj: {{"source": ["general"]}}
|
||||||
|
|
||||||
Now it's your turn to pick the tools you would like to use to answer the user's question. Provide your response as a list of strings.
|
Now it's your turn to pick the data sources you would like to use to answer the user's question. Respond with data sources as a list of strings in a JSON object.
|
||||||
|
|
||||||
Chat History:
|
Chat History:
|
||||||
{chat_history}
|
{chat_history}
|
||||||
|
@ -387,76 +382,71 @@ Khoj:
|
||||||
|
|
||||||
online_search_conversation_subqueries = PromptTemplate.from_template(
|
online_search_conversation_subqueries = PromptTemplate.from_template(
|
||||||
"""
|
"""
|
||||||
You are Khoj, an extremely smart and helpful search assistant. You are tasked with constructing **up to three** search queries for Google to answer the user's question.
|
You are Khoj, an advanced google search assistant. You are tasked with constructing **up to three** google search queries to answer the user's question.
|
||||||
- You will receive the conversation history as context.
|
- You will receive the conversation history as context.
|
||||||
- Add as much context from the previous questions and answers as required into your search queries.
|
- Add as much context from the previous questions and answers as required into your search queries.
|
||||||
- Break messages into multiple search queries when required to retrieve the relevant information.
|
- Break messages into multiple search queries when required to retrieve the relevant information.
|
||||||
|
- Use site: and after: google search operators when appropriate
|
||||||
- You have access to the the whole internet to retrieve information.
|
- You have access to the the whole internet to retrieve information.
|
||||||
|
- Official, up-to-date information about you, Khoj, is available at site:khoj.dev
|
||||||
|
|
||||||
What Google searches, if any, will you need to perform to answer the user's question?
|
What Google searches, if any, will you need to perform to answer the user's question?
|
||||||
Provide search queries as a list of strings
|
Provide search queries as a JSON list of strings
|
||||||
Current Date: {current_date}
|
Current Date: {current_date}
|
||||||
User's Location: {location}
|
User's Location: {location}
|
||||||
|
|
||||||
Here are some examples:
|
Here are some examples:
|
||||||
History:
|
History:
|
||||||
User: I like to use Hacker News to get my tech news.
|
User: I like to use Hacker News to get my tech news.
|
||||||
Khoj: Hacker News is an online forum for sharing and discussing the latest tech news. It is a great place to learn about new technologies and startups.
|
AI: Hacker News is an online forum for sharing and discussing the latest tech news. It is a great place to learn about new technologies and startups.
|
||||||
|
|
||||||
Q: Posts about vector databases on Hacker News
|
Q: Summarize posts about vector databases on Hacker News since Feb 2024
|
||||||
A: ["site:"news.ycombinator.com vector database"]
|
Khoj: {{"queries": ["site:news.ycombinator.com after:2024/02/01 vector database"]}}
|
||||||
|
|
||||||
History:
|
History:
|
||||||
User: I'm currently living in New York but I'm thinking about moving to San Francisco.
|
User: I'm currently living in New York but I'm thinking about moving to San Francisco.
|
||||||
Khoj: New York is a great city to live in. It has a lot of great restaurants and museums. San Francisco is also a great city to live in. It has a lot of great restaurants and museums.
|
AI: New York is a great city to live in. It has a lot of great restaurants and museums. San Francisco is also a great city to live in. It has good access to nature and a great tech scene.
|
||||||
|
|
||||||
Q: What is the weather like in those cities?
|
Q: What is the climate like in those cities?
|
||||||
A: ["weather in new york", "weather in san francisco"]
|
Khoj: {{"queries": ["climate in new york city", "climate in san francisco"]}}
|
||||||
|
|
||||||
History:
|
History:
|
||||||
User: I'm thinking of my next vacation idea. Ideally, I want to see something new and exciting.
|
AI: Hey, how is it going?
|
||||||
Khoj: You could time your next trip with the next lunar eclipse, as that would be a novel experience.
|
User: Going well. Ananya is in town tonight!
|
||||||
|
AI: Oh that's awesome! What are your plans for the evening?
|
||||||
|
|
||||||
Q: When is the next one?
|
Q: She wants to see a movie. Any decent sci-fi movies playing at the local theater?
|
||||||
A: ["next lunar eclipse"]
|
Khoj: {{"queries": ["new sci-fi movies in theaters near {location}"]}}
|
||||||
|
|
||||||
|
History:
|
||||||
|
User: Can I chat with you over WhatsApp?
|
||||||
|
AI: Yes, you can chat with me using WhatsApp.
|
||||||
|
|
||||||
|
Q: How
|
||||||
|
Khoj: {{"queries": ["site:khoj.dev chat with Khoj on Whatsapp"]}}
|
||||||
|
|
||||||
|
History:
|
||||||
|
|
||||||
|
|
||||||
|
Q: How do I share my files with you?
|
||||||
|
Khoj: {{"queries": ["site:khoj.dev sync files with Khoj"]}}
|
||||||
|
|
||||||
History:
|
History:
|
||||||
User: I need to transport a lot of oranges to the moon. Are there any rockets that can fit a lot of oranges?
|
User: I need to transport a lot of oranges to the moon. Are there any rockets that can fit a lot of oranges?
|
||||||
Khoj: NASA's Saturn V rocket frequently makes lunar trips and has a large cargo capacity.
|
AI: NASA's Saturn V rocket frequently makes lunar trips and has a large cargo capacity.
|
||||||
|
|
||||||
Q: How many oranges would fit in NASA's Saturn V rocket?
|
Q: How many oranges would fit in NASA's Saturn V rocket?
|
||||||
A: ["volume of an orange", "volume of saturn v rocket"]
|
Khoj: {{"queries": ["volume of an orange", "volume of saturn v rocket"]}}
|
||||||
|
|
||||||
Now it's your turn to construct a search query for Google to answer the user's question.
|
Now it's your turn to construct a search query for Google to answer the user's question.
|
||||||
History:
|
History:
|
||||||
{chat_history}
|
{chat_history}
|
||||||
|
|
||||||
Q: {query}
|
Q: {query}
|
||||||
A:
|
Khoj:
|
||||||
"""
|
""".strip()
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
## Extract Search Type
|
|
||||||
## --
|
|
||||||
search_type = """
|
|
||||||
Objective: Extract search type from user query and return information as JSON
|
|
||||||
|
|
||||||
Allowed search types are listed below:
|
|
||||||
- search-type=["notes", "image", "pdf"]
|
|
||||||
|
|
||||||
Some examples are given below for reference:
|
|
||||||
Q:What fiction book was I reading last week about AI starship?
|
|
||||||
A:{ "search-type": "notes" }
|
|
||||||
Q: What did the lease say about early termination
|
|
||||||
A: { "search-type": "pdf" }
|
|
||||||
Q:Can you recommend a movie to watch from my notes?
|
|
||||||
A:{ "search-type": "notes" }
|
|
||||||
Q:When did I go surfing last?
|
|
||||||
A:{ "search-type": "notes" }
|
|
||||||
Q:"""
|
|
||||||
|
|
||||||
|
|
||||||
# System messages to user
|
# System messages to user
|
||||||
# --
|
# --
|
||||||
help_message = PromptTemplate.from_template(
|
help_message = PromptTemplate.from_template(
|
||||||
|
|
|
@ -112,15 +112,15 @@ def update_telemetry_state(
|
||||||
]
|
]
|
||||||
|
|
||||||
|
|
||||||
def construct_chat_history(conversation_history: dict, n: int = 4) -> str:
|
def construct_chat_history(conversation_history: dict, n: int = 4, agent_name="AI") -> str:
|
||||||
chat_history = ""
|
chat_history = ""
|
||||||
for chat in conversation_history.get("chat", [])[-n:]:
|
for chat in conversation_history.get("chat", [])[-n:]:
|
||||||
if chat["by"] == "khoj" and chat["intent"].get("type") == "remember":
|
if chat["by"] == "khoj" and chat["intent"].get("type") == "remember":
|
||||||
chat_history += f"User: {chat['intent']['query']}\n"
|
chat_history += f"User: {chat['intent']['query']}\n"
|
||||||
chat_history += f"Khoj: {chat['message']}\n"
|
chat_history += f"{agent_name}: {chat['message']}\n"
|
||||||
elif chat["by"] == "khoj" and ("text-to-image" in chat["intent"].get("type")):
|
elif chat["by"] == "khoj" and ("text-to-image" in chat["intent"].get("type")):
|
||||||
chat_history += f"User: {chat['intent']['query']}\n"
|
chat_history += f"User: {chat['intent']['query']}\n"
|
||||||
chat_history += f"Khoj: [generated image redacted for space]\n"
|
chat_history += f"{agent_name}: [generated image redacted for space]\n"
|
||||||
return chat_history
|
return chat_history
|
||||||
|
|
||||||
|
|
||||||
|
@ -153,24 +153,26 @@ async def aget_relevant_information_sources(query: str, conversation_history: di
|
||||||
"""
|
"""
|
||||||
|
|
||||||
tool_options = dict()
|
tool_options = dict()
|
||||||
|
tool_options_str = ""
|
||||||
|
|
||||||
for tool, description in tool_descriptions_for_llm.items():
|
for tool, description in tool_descriptions_for_llm.items():
|
||||||
tool_options[tool.value] = description
|
tool_options[tool.value] = description
|
||||||
|
tool_options_str += f'- "{tool.value}": "{description}"\n'
|
||||||
|
|
||||||
chat_history = construct_chat_history(conversation_history)
|
chat_history = construct_chat_history(conversation_history)
|
||||||
|
|
||||||
relevant_tools_prompt = prompts.pick_relevant_information_collection_tools.format(
|
relevant_tools_prompt = prompts.pick_relevant_information_collection_tools.format(
|
||||||
query=query,
|
query=query,
|
||||||
tools=str(tool_options),
|
tools=tool_options_str,
|
||||||
chat_history=chat_history,
|
chat_history=chat_history,
|
||||||
)
|
)
|
||||||
|
|
||||||
response = await send_message_to_model_wrapper(relevant_tools_prompt)
|
response = await send_message_to_model_wrapper(relevant_tools_prompt, response_type="json_object")
|
||||||
|
|
||||||
try:
|
try:
|
||||||
response = response.strip()
|
response = response.strip()
|
||||||
response = json.loads(response)
|
response = json.loads(response)
|
||||||
response = [q.strip() for q in response if q.strip()]
|
response = [q.strip() for q in response["source"] if q.strip()]
|
||||||
if not isinstance(response, list) or not response or len(response) == 0:
|
if not isinstance(response, list) or not response or len(response) == 0:
|
||||||
logger.error(f"Invalid response for determining relevant tools: {response}")
|
logger.error(f"Invalid response for determining relevant tools: {response}")
|
||||||
return tool_options
|
return tool_options
|
||||||
|
@ -195,15 +197,17 @@ async def aget_relevant_output_modes(query: str, conversation_history: dict):
|
||||||
"""
|
"""
|
||||||
|
|
||||||
mode_options = dict()
|
mode_options = dict()
|
||||||
|
mode_options_str = ""
|
||||||
|
|
||||||
for mode, description in mode_descriptions_for_llm.items():
|
for mode, description in mode_descriptions_for_llm.items():
|
||||||
mode_options[mode.value] = description
|
mode_options[mode.value] = description
|
||||||
|
mode_options_str += f'- "{mode.value}": "{description}"\n'
|
||||||
|
|
||||||
chat_history = construct_chat_history(conversation_history)
|
chat_history = construct_chat_history(conversation_history)
|
||||||
|
|
||||||
relevant_mode_prompt = prompts.pick_relevant_output_mode.format(
|
relevant_mode_prompt = prompts.pick_relevant_output_mode.format(
|
||||||
query=query,
|
query=query,
|
||||||
modes=str(mode_options),
|
modes=mode_options_str,
|
||||||
chat_history=chat_history,
|
chat_history=chat_history,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
@ -240,13 +244,13 @@ async def generate_online_subqueries(q: str, conversation_history: dict, locatio
|
||||||
location=location,
|
location=location,
|
||||||
)
|
)
|
||||||
|
|
||||||
response = await send_message_to_model_wrapper(online_queries_prompt)
|
response = await send_message_to_model_wrapper(online_queries_prompt, response_type="json_object")
|
||||||
|
|
||||||
# Validate that the response is a non-empty, JSON-serializable list
|
# Validate that the response is a non-empty, JSON-serializable list
|
||||||
try:
|
try:
|
||||||
response = response.strip()
|
response = response.strip()
|
||||||
response = json.loads(response)
|
response = json.loads(response)
|
||||||
response = [q.strip() for q in response if q.strip()]
|
response = [q.strip() for q in response["queries"] if q.strip()]
|
||||||
if not isinstance(response, list) or not response or len(response) == 0:
|
if not isinstance(response, list) or not response or len(response) == 0:
|
||||||
logger.error(f"Invalid response for constructing subqueries: {response}. Returning original query: {q}")
|
logger.error(f"Invalid response for constructing subqueries: {response}. Returning original query: {q}")
|
||||||
return [q]
|
return [q]
|
||||||
|
@ -320,6 +324,7 @@ async def generate_better_image_prompt(
|
||||||
async def send_message_to_model_wrapper(
|
async def send_message_to_model_wrapper(
|
||||||
message: str,
|
message: str,
|
||||||
system_message: str = "",
|
system_message: str = "",
|
||||||
|
response_type: str = "text",
|
||||||
):
|
):
|
||||||
conversation_config: ChatModelOptions = await ConversationAdapters.aget_default_conversation_config()
|
conversation_config: ChatModelOptions = await ConversationAdapters.aget_default_conversation_config()
|
||||||
|
|
||||||
|
@ -348,9 +353,7 @@ async def send_message_to_model_wrapper(
|
||||||
api_key = openai_chat_config.api_key
|
api_key = openai_chat_config.api_key
|
||||||
chat_model = conversation_config.chat_model
|
chat_model = conversation_config.chat_model
|
||||||
openai_response = send_message_to_model(
|
openai_response = send_message_to_model(
|
||||||
messages=truncated_messages,
|
messages=truncated_messages, api_key=api_key, model=chat_model, response_type=response_type
|
||||||
api_key=api_key,
|
|
||||||
model=chat_model,
|
|
||||||
)
|
)
|
||||||
|
|
||||||
return openai_response
|
return openai_response
|
||||||
|
|
|
@ -7,7 +7,7 @@ app_env_filepath = "~/.khoj/env"
|
||||||
telemetry_server = "https://khoj.beta.haletic.com/v1/telemetry"
|
telemetry_server = "https://khoj.beta.haletic.com/v1/telemetry"
|
||||||
content_directory = "~/.khoj/content/"
|
content_directory = "~/.khoj/content/"
|
||||||
default_offline_chat_model = "mistral-7b-instruct-v0.1.Q4_0.gguf"
|
default_offline_chat_model = "mistral-7b-instruct-v0.1.Q4_0.gguf"
|
||||||
default_online_chat_model = "gpt-4"
|
default_online_chat_model = "gpt-4-turbo-preview"
|
||||||
|
|
||||||
empty_config = {
|
empty_config = {
|
||||||
"search-type": {
|
"search-type": {
|
||||||
|
|
|
@ -277,16 +277,16 @@ command_descriptions = {
|
||||||
ConversationCommand.General: "Only talk about information that relies on Khoj's general knowledge, not your personal knowledge base.",
|
ConversationCommand.General: "Only talk about information that relies on Khoj's general knowledge, not your personal knowledge base.",
|
||||||
ConversationCommand.Notes: "Only talk about information that is available in your knowledge base.",
|
ConversationCommand.Notes: "Only talk about information that is available in your knowledge base.",
|
||||||
ConversationCommand.Default: "The default command when no command specified. It intelligently auto-switches between general and notes mode.",
|
ConversationCommand.Default: "The default command when no command specified. It intelligently auto-switches between general and notes mode.",
|
||||||
ConversationCommand.Online: "Look up information on the internet.",
|
ConversationCommand.Online: "Search for information on the internet.",
|
||||||
ConversationCommand.Image: "Generate images by describing your imagination in words.",
|
ConversationCommand.Image: "Generate images by describing your imagination in words.",
|
||||||
ConversationCommand.Help: "Display a help message with all available commands and other metadata.",
|
ConversationCommand.Help: "Display a help message with all available commands and other metadata.",
|
||||||
}
|
}
|
||||||
|
|
||||||
tool_descriptions_for_llm = {
|
tool_descriptions_for_llm = {
|
||||||
ConversationCommand.Default: "Use this if there might be a mix of general and personal knowledge in the question, or if you don't entirely understand the query.",
|
ConversationCommand.Default: "To use a mix of your internal knowledge and the user's personal knowledge, or if you don't entirely understand the query.",
|
||||||
ConversationCommand.General: "Use this when you can answer the question without any outside information or personal knowledge",
|
ConversationCommand.General: "Use this when you can answer the question without any outside information or personal knowledge",
|
||||||
ConversationCommand.Notes: "Use this when you would like to use the user's personal knowledge base to answer the question. This is especially helpful if the query seems to be missing context.",
|
ConversationCommand.Notes: "To search the user's personal knowledge base. Especially helpful if the question expects context from the user's notes or documents.",
|
||||||
ConversationCommand.Online: "Use this when you would like to look up information on the internet",
|
ConversationCommand.Online: "To search for the latest, up-to-date information from the internet. Note: **Questions about Khoj should always use this data source**",
|
||||||
}
|
}
|
||||||
|
|
||||||
mode_descriptions_for_llm = {
|
mode_descriptions_for_llm = {
|
||||||
|
|
|
@ -7,7 +7,12 @@ from freezegun import freeze_time
|
||||||
|
|
||||||
from khoj.processor.conversation.openai.gpt import converse, extract_questions
|
from khoj.processor.conversation.openai.gpt import converse, extract_questions
|
||||||
from khoj.processor.conversation.utils import message_to_log
|
from khoj.processor.conversation.utils import message_to_log
|
||||||
from khoj.routers.helpers import aget_relevant_output_modes
|
from khoj.routers.helpers import (
|
||||||
|
aget_relevant_information_sources,
|
||||||
|
aget_relevant_output_modes,
|
||||||
|
generate_online_subqueries,
|
||||||
|
)
|
||||||
|
from khoj.utils.helpers import ConversationCommand
|
||||||
|
|
||||||
# Initialize variables for tests
|
# Initialize variables for tests
|
||||||
api_key = os.getenv("OPENAI_API_KEY")
|
api_key = os.getenv("OPENAI_API_KEY")
|
||||||
|
@ -154,33 +159,6 @@ def test_generate_search_query_using_question_and_answer_from_chat_history():
|
||||||
assert "Leia" in response[0] and "Luke" in response[0]
|
assert "Leia" in response[0] and "Luke" in response[0]
|
||||||
|
|
||||||
|
|
||||||
# ----------------------------------------------------------------------------------------------------
|
|
||||||
@pytest.mark.chatquality
|
|
||||||
def test_generate_search_query_with_date_and_context_from_chat_history():
|
|
||||||
# Arrange
|
|
||||||
message_list = [
|
|
||||||
("When did I visit Masai Mara?", "You visited Masai Mara in April 2000", []),
|
|
||||||
]
|
|
||||||
|
|
||||||
# Act
|
|
||||||
response = extract_questions(
|
|
||||||
"What was the Pizza place we ate at over there?", conversation_log=populate_chat_history(message_list)
|
|
||||||
)
|
|
||||||
|
|
||||||
# Assert
|
|
||||||
expected_responses = [
|
|
||||||
("dt>='2000-04-01'", "dt<'2000-05-01'"),
|
|
||||||
("dt>='2000-04-01'", "dt<='2000-04-30'"),
|
|
||||||
('dt>="2000-04-01"', 'dt<"2000-05-01"'),
|
|
||||||
('dt>="2000-04-01"', 'dt<="2000-04-30"'),
|
|
||||||
]
|
|
||||||
assert len(response) == 1
|
|
||||||
assert "Masai Mara" in response[0]
|
|
||||||
assert any([start in response[0] and end in response[0] for start, end in expected_responses]), (
|
|
||||||
"Expected date filter to limit to April 2000 in response but got: " + response[0]
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
# ----------------------------------------------------------------------------------------------------
|
# ----------------------------------------------------------------------------------------------------
|
||||||
@pytest.mark.chatquality
|
@pytest.mark.chatquality
|
||||||
def test_chat_with_no_chat_history_or_retrieved_content():
|
def test_chat_with_no_chat_history_or_retrieved_content():
|
||||||
|
@ -391,7 +369,7 @@ def test_answer_general_question_not_in_chat_history_or_retrieved_content():
|
||||||
# Act
|
# Act
|
||||||
response_gen = converse(
|
response_gen = converse(
|
||||||
references=[], # Assume no context retrieved from notes for the user_query
|
references=[], # Assume no context retrieved from notes for the user_query
|
||||||
user_query="Write a haiku about unit testing in 3 lines",
|
user_query="Write a haiku about unit testing in 3 lines. Do not say anything else",
|
||||||
conversation_log=populate_chat_history(message_list),
|
conversation_log=populate_chat_history(message_list),
|
||||||
api_key=api_key,
|
api_key=api_key,
|
||||||
)
|
)
|
||||||
|
@ -435,6 +413,47 @@ My sister, Aiyla is married to Tolga. They have 3 kids, Yildiz, Ali and Ahmet.""
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
|
# ----------------------------------------------------------------------------------------------------
|
||||||
|
@pytest.mark.anyio
|
||||||
|
@pytest.mark.django_db(transaction=True)
|
||||||
|
@freeze_time("2024-04-04", ignore=["transformers"])
|
||||||
|
async def test_websearch_with_operators(chat_client):
|
||||||
|
# Arrange
|
||||||
|
user_query = "Share popular posts on r/worldnews this month"
|
||||||
|
|
||||||
|
# Act
|
||||||
|
responses = await generate_online_subqueries(user_query, {}, None)
|
||||||
|
|
||||||
|
# Assert
|
||||||
|
assert any(
|
||||||
|
["reddit.com/r/worldnews" in response for response in responses]
|
||||||
|
), "Expected a search query to include site:reddit.com but got: " + str(responses)
|
||||||
|
|
||||||
|
assert any(
|
||||||
|
["site:reddit.com" in response for response in responses]
|
||||||
|
), "Expected a search query to include site:reddit.com but got: " + str(responses)
|
||||||
|
|
||||||
|
assert any(
|
||||||
|
["after:2024/04/01" in response for response in responses]
|
||||||
|
), "Expected a search query to include after:2024/04/01 but got: " + str(responses)
|
||||||
|
|
||||||
|
|
||||||
|
# ----------------------------------------------------------------------------------------------------
|
||||||
|
@pytest.mark.anyio
|
||||||
|
@pytest.mark.django_db(transaction=True)
|
||||||
|
async def test_websearch_khoj_website_for_info_about_khoj(chat_client):
|
||||||
|
# Arrange
|
||||||
|
user_query = "Do you support image search?"
|
||||||
|
|
||||||
|
# Act
|
||||||
|
responses = await generate_online_subqueries(user_query, {}, None)
|
||||||
|
|
||||||
|
# Assert
|
||||||
|
assert any(
|
||||||
|
["site:khoj.dev" in response for response in responses]
|
||||||
|
), "Expected search query to include site:khoj.dev but got: " + str(responses)
|
||||||
|
|
||||||
|
|
||||||
# ----------------------------------------------------------------------------------------------------
|
# ----------------------------------------------------------------------------------------------------
|
||||||
@pytest.mark.anyio
|
@pytest.mark.anyio
|
||||||
@pytest.mark.django_db(transaction=True)
|
@pytest.mark.django_db(transaction=True)
|
||||||
|
@ -454,7 +473,7 @@ async def test_use_default_response_mode(chat_client):
|
||||||
@pytest.mark.django_db(transaction=True)
|
@pytest.mark.django_db(transaction=True)
|
||||||
async def test_use_image_response_mode(chat_client):
|
async def test_use_image_response_mode(chat_client):
|
||||||
# Arrange
|
# Arrange
|
||||||
user_query = "Paint a picture of the scenery in Timbuktu in the winter"
|
user_query = "Paint a scenery in Timbuktu in the winter"
|
||||||
|
|
||||||
# Act
|
# Act
|
||||||
mode = await aget_relevant_output_modes(user_query, {})
|
mode = await aget_relevant_output_modes(user_query, {})
|
||||||
|
@ -463,6 +482,34 @@ async def test_use_image_response_mode(chat_client):
|
||||||
assert mode.value == "image"
|
assert mode.value == "image"
|
||||||
|
|
||||||
|
|
||||||
|
# ----------------------------------------------------------------------------------------------------
|
||||||
|
@pytest.mark.anyio
|
||||||
|
@pytest.mark.django_db(transaction=True)
|
||||||
|
async def test_select_data_sources_actor_chooses_to_search_notes(chat_client):
|
||||||
|
# Arrange
|
||||||
|
user_query = "Where did I learn to swim?"
|
||||||
|
|
||||||
|
# Act
|
||||||
|
conversation_commands = await aget_relevant_information_sources(user_query, {})
|
||||||
|
|
||||||
|
# Assert
|
||||||
|
assert ConversationCommand.Notes in conversation_commands
|
||||||
|
|
||||||
|
|
||||||
|
# ----------------------------------------------------------------------------------------------------
|
||||||
|
@pytest.mark.anyio
|
||||||
|
@pytest.mark.django_db(transaction=True)
|
||||||
|
async def test_select_data_sources_actor_chooses_to_search_online(chat_client):
|
||||||
|
# Arrange
|
||||||
|
user_query = "Where is the nearest hospital?"
|
||||||
|
|
||||||
|
# Act
|
||||||
|
conversation_commands = await aget_relevant_information_sources(user_query, {})
|
||||||
|
|
||||||
|
# Assert
|
||||||
|
assert ConversationCommand.Online in conversation_commands
|
||||||
|
|
||||||
|
|
||||||
# Helpers
|
# Helpers
|
||||||
# ----------------------------------------------------------------------------------------------------
|
# ----------------------------------------------------------------------------------------------------
|
||||||
def populate_chat_history(message_list):
|
def populate_chat_history(message_list):
|
||||||
|
|
|
@ -222,9 +222,17 @@ def test_no_answer_in_chat_history_or_retrieved_content(chat_client, default_use
|
||||||
response_message = response.content.decode("utf-8")
|
response_message = response.content.decode("utf-8")
|
||||||
|
|
||||||
# Assert
|
# Assert
|
||||||
expected_responses = ["don't know", "do not know", "no information", "do not have", "don't have"]
|
expected_responses = [
|
||||||
|
"don't know",
|
||||||
|
"do not know",
|
||||||
|
"no information",
|
||||||
|
"do not have",
|
||||||
|
"don't have",
|
||||||
|
"where were you born?",
|
||||||
|
]
|
||||||
|
|
||||||
assert response.status_code == 200
|
assert response.status_code == 200
|
||||||
assert any([expected_response in response_message for expected_response in expected_responses]), (
|
assert any([expected_response in response_message.lower() for expected_response in expected_responses]), (
|
||||||
"Expected chat director to say they don't know in response, but got: " + response_message
|
"Expected chat director to say they don't know in response, but got: " + response_message
|
||||||
)
|
)
|
||||||
|
|
||||||
|
@ -330,10 +338,8 @@ def test_answer_general_question_not_in_chat_history_or_retrieved_content(chat_c
|
||||||
populate_chat_history(message_list, default_user2)
|
populate_chat_history(message_list, default_user2)
|
||||||
|
|
||||||
# Act
|
# Act
|
||||||
response = chat_client.get(
|
response = chat_client.get(f'/api/chat?q="Write a haiku about unit testing. Do not say anything else."&stream=true')
|
||||||
f'/api/chat?q=""Write a haiku about unit testing. Do not say anything else."&stream=true'
|
response_message = response.content.decode("utf-8").split("### compiled references")[0]
|
||||||
)
|
|
||||||
response_message = response.content.decode("utf-8")
|
|
||||||
|
|
||||||
# Assert
|
# Assert
|
||||||
expected_responses = ["test", "Test"]
|
expected_responses = ["test", "Test"]
|
||||||
|
@ -350,8 +356,8 @@ def test_answer_general_question_not_in_chat_history_or_retrieved_content(chat_c
|
||||||
def test_ask_for_clarification_if_not_enough_context_in_question(chat_client_no_background):
|
def test_ask_for_clarification_if_not_enough_context_in_question(chat_client_no_background):
|
||||||
# Act
|
# Act
|
||||||
|
|
||||||
response = chat_client_no_background.get(f'/api/chat?q="What is the name of Namitas older son"&stream=true')
|
response = chat_client_no_background.get(f'/api/chat?q="What is the name of Namitas older son?"&stream=true')
|
||||||
response_message = response.content.decode("utf-8")
|
response_message = response.content.decode("utf-8").split("### compiled references")[0].lower()
|
||||||
|
|
||||||
# Assert
|
# Assert
|
||||||
expected_responses = [
|
expected_responses = [
|
||||||
|
@ -361,9 +367,11 @@ def test_ask_for_clarification_if_not_enough_context_in_question(chat_client_no_
|
||||||
"the birth order",
|
"the birth order",
|
||||||
"provide more context",
|
"provide more context",
|
||||||
"provide me with more context",
|
"provide me with more context",
|
||||||
|
"don't have that",
|
||||||
|
"haven't provided me",
|
||||||
]
|
]
|
||||||
assert response.status_code == 200
|
assert response.status_code == 200
|
||||||
assert any([expected_response in response_message.lower() for expected_response in expected_responses]), (
|
assert any([expected_response in response_message for expected_response in expected_responses]), (
|
||||||
"Expected chat director to ask for clarification in response, but got: " + response_message
|
"Expected chat director to ask for clarification in response, but got: " + response_message
|
||||||
)
|
)
|
||||||
|
|
||||||
|
@ -399,13 +407,18 @@ def test_answer_in_chat_history_beyond_lookback_window(chat_client, default_user
|
||||||
def test_answer_requires_multiple_independent_searches(chat_client):
|
def test_answer_requires_multiple_independent_searches(chat_client):
|
||||||
"Chat director should be able to answer by doing multiple independent searches for required information"
|
"Chat director should be able to answer by doing multiple independent searches for required information"
|
||||||
# Act
|
# Act
|
||||||
response = chat_client.get(f'/api/chat?q="Is Xi older than Namita?"&stream=true')
|
response = chat_client.get(f'/api/chat?q="Is Xi older than Namita? Just the older persons full name"&stream=true')
|
||||||
response_message = response.content.decode("utf-8")
|
response_message = response.content.decode("utf-8").split("### compiled references")[0].lower()
|
||||||
|
|
||||||
# Assert
|
# Assert
|
||||||
expected_responses = ["he is older than namita", "xi is older than namita", "xi li is older than namita"]
|
expected_responses = ["he is older than namita", "xi is older than namita", "xi li is older than namita"]
|
||||||
|
only_full_name_check = "xi li" in response_message and "namita" not in response_message
|
||||||
|
comparative_statement_check = any(
|
||||||
|
[expected_response in response_message for expected_response in expected_responses]
|
||||||
|
)
|
||||||
|
|
||||||
assert response.status_code == 200
|
assert response.status_code == 200
|
||||||
assert any([expected_response in response_message.lower() for expected_response in expected_responses]), (
|
assert only_full_name_check or comparative_statement_check, (
|
||||||
"Expected Xi is older than Namita, but got: " + response_message
|
"Expected Xi is older than Namita, but got: " + response_message
|
||||||
)
|
)
|
||||||
|
|
||||||
|
@ -415,15 +428,22 @@ def test_answer_requires_multiple_independent_searches(chat_client):
|
||||||
def test_answer_using_file_filter(chat_client):
|
def test_answer_using_file_filter(chat_client):
|
||||||
"Chat should be able to use search filters in the query"
|
"Chat should be able to use search filters in the query"
|
||||||
# Act
|
# Act
|
||||||
query = urllib.parse.quote('Is Xi older than Namita? file:"Namita.markdown" file:"Xi Li.markdown"')
|
query = urllib.parse.quote(
|
||||||
|
'Is Xi older than Namita? Just say the older persons full name. file:"Namita.markdown" file:"Xi Li.markdown"'
|
||||||
|
)
|
||||||
|
|
||||||
response = chat_client.get(f"/api/chat?q={query}&stream=true")
|
response = chat_client.get(f"/api/chat?q={query}&stream=true")
|
||||||
response_message = response.content.decode("utf-8")
|
response_message = response.content.decode("utf-8").split("### compiled references")[0].lower()
|
||||||
|
|
||||||
# Assert
|
# Assert
|
||||||
expected_responses = ["he is older than namita", "xi is older than namita", "xi li is older than namita"]
|
expected_responses = ["he is older than namita", "xi is older than namita", "xi li is older than namita"]
|
||||||
|
only_full_name_check = "xi li" in response_message and "namita" not in response_message
|
||||||
|
comparative_statement_check = any(
|
||||||
|
[expected_response in response_message for expected_response in expected_responses]
|
||||||
|
)
|
||||||
|
|
||||||
assert response.status_code == 200
|
assert response.status_code == 200
|
||||||
assert any([expected_response in response_message.lower() for expected_response in expected_responses]), (
|
assert only_full_name_check or comparative_statement_check, (
|
||||||
"Expected Xi is older than Namita, but got: " + response_message
|
"Expected Xi is older than Namita, but got: " + response_message
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
Loading…
Reference in a new issue