khoj/documentation/docs/features/voice-chat.md
Debanjum Singh Solanky 68e7c297e0 Add Advanced Self Hosting Section, Improve Self Hosting, OpenAI Proxy Docs
2024-06-24 16:12:20 +05:30

Voice

You can talk to Khoj using your voice. Khoj responds to your queries using the same models as the chat feature. Voice chat is available on the web, desktop, and Obsidian apps.

Voice Chat

Click the microphone icon to send a voice message to Khoj. Khoj transcribes the message and shows you the text it heard, which you can edit before it is sent. Try it at https://app.khoj.dev/.

Voice Response

If you send a voice message, Khoj automatically responds with a voice message. You can also click the speaker icon next to any message to hear it read aloud. Voice responses are currently available only in the web app.

Setup (Self-Hosting)

Voice chat is configured automatically when you initialize the application, and the default configuration runs locally. To use the OpenAI Whisper API for voice chat instead, follow these steps:

  1. Set up your OpenAI API key. See instructions here.
  2. Create a new configuration at http://localhost:42110/server/admin/database/speechtotextmodeloptions/. We recommend setting the model name to whisper-1 and the model type to Openai.
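
Before pointing Khoj at the OpenAI Whisper API, it can help to confirm that your API key and the whisper-1 model work on their own. The following is a minimal sketch, not part of Khoj: the `transcribe` helper is a hypothetical name, and the script assumes the official `openai` Python package is installed and that `OPENAI_API_KEY` is set in your environment.

```python
import sys
from pathlib import Path


def transcribe(audio_path, client, model="whisper-1"):
    """Send one audio file to the speech-to-text API and return the transcript.

    `client` is expected to behave like `openai.OpenAI()`, which exposes
    `client.audio.transcriptions.create(model=..., file=...)`.
    """
    with Path(audio_path).open("rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)
    return result.text


if __name__ == "__main__" and len(sys.argv) > 1:
    # Imported here so the helper above can be reused without the package.
    from openai import OpenAI  # assumption: installed via `pip install openai`

    # OpenAI() reads the API key from the OPENAI_API_KEY environment variable.
    print(transcribe(sys.argv[1], OpenAI()))
```

Run it with the path to a short recording as the only argument. If a transcript is printed, the same key and model name should be usable in the Khoj admin panel configuration described above.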

To use the Text to Speech feature, follow these steps:

  1. Set up an account on ElevenLabs.io.
  2. Set your API key in the environment variable ELEVEN_LABS_API_KEY.
  3. (Optional) Create a new Voice model option with the voice ID of the voice you want to use. You can explore the options here.
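
A missing or empty environment variable is an easy mistake to make in step 2. The sketch below is a small pre-flight check you could run in the same shell before starting the Khoj server. The function name `missing_voice_env` is hypothetical and not part of Khoj; only the variable name ELEVEN_LABS_API_KEY comes from the steps above.

```python
import os

# Variable named in the setup steps above.
REQUIRED_VARS = ["ELEVEN_LABS_API_KEY"]


def missing_voice_env(env=None):
    """Return the required variable names that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_voice_env()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Text to Speech environment looks ready.")
```

The check only confirms that the variable is present in the current environment. It does not validate the key against ElevenLabs.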