Commit graph

6 commits

Author SHA1 Message Date
Debanjum Singh Solanky
7e36f421f9 Truncate message logs to below max supported prompt size by model
- Use tiktoken to count tokens for chat models
- Make conversation turns to add to prompt configurable via method
  argument to generate_chatml_messages_with_context method
2023-03-25 05:13:56 +07:00
Debanjum Singh Solanky
508b2176b7 Update Chat API, Logs, Interfaces to store, use references as list
- Remove the need to split by magic string in emacs and chat interfaces
- Move compiling references into string as context for GPT to GPT layer
- Update setup in tests to use new style of setting references
- Name first argument to converse as more appropriate "references"
2023-03-24 22:10:11 +07:00
Debanjum Singh Solanky
f09bdd515b Expect Chat Director can extract relative dates using new Search Actor 2023-03-18 16:30:55 -06:00
Debanjum Singh Solanky
2600cc9d4d Test Search Actor extracting relative dates & multiple questions 2023-03-18 16:30:55 -06:00
Debanjum Singh Solanky
dfb277ee37 Set skipif at module level if OpenAI API key not set for chat tests
- Remove stale message_to_prompt test
  It is too broad, reduces maintainability.
  Remove as it doesn't really need its own test right now
- Setting skipif at module level for chat actor, director tests
  reduces code duplication as earlier was using decorator on each chat
  test
2023-03-16 12:23:52 -06:00
Debanjum Singh Solanky
1b4d562700 Test Chat Director Capabilities: Answer from notes, chat history etc
- Chat directors are broad agents.
  - Chat directors orchestrate narrow actor agents to synthesize
    final response for the user
  - Agents are Prompts + ML Model

- Test Chat Director Capabilities
  1. [X] Answer from retrieved notes
  2. [X] Answer from chat history
  3. [X] Answer general questions
  4. [X] Carry out multi-turn conversation
  5. [X] Say don't know when answer not in provided context
  6. [X] Answers that require current date awareness
     This test is expected to fail as the chat is not capable of doing
     this without the Search actor. But the test allows assessing chat quality
  7. [X] Date-aware aggregation across multiple different notes
     This test is expected to fail as the chat is not capable of doing
     this without the Search actor. But the test allows assessing chat quality
  8. [X] Ask clarification questions if no unambiguous answer in provided context
  9. [X] Retrieve answer from chat history beyond lookback window
     This test is expected to fail as the chat director is not capable
     of searching chat history yet. But the test allows assessing chat quality
 10. [X] Retrieve context for answer using multiple independent
         searches on knowledge base
     This test is expected to fail as the chat is not capable of doing
     this without the Search actor. But the test allows assessing chat quality
2023-03-16 09:30:37 -06:00