
# Ollama

You can run your own open source models locally with Ollama and use them with Khoj.

:::info[Ollama Integration]
This is only relevant for self-hosted users. If you're using Khoj Cloud, you're limited to our first-party models.
:::

Khoj supports any OpenAI API-compatible server, which includes Ollama. Ollama lets you run a local server with several popular open-source LLMs directly on your own computer. Combined with Khoj, you can chat with these LLMs and use them to search your notes and documents.
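Under the hood, this works because Ollama exposes an OpenAI-compatible API on its default port, `11434`. As a rough sketch (assuming the default port and a running `llama3` model), you can exercise the same chat completions endpoint Khoj will talk to:

```bash
# Send a chat request to Ollama's OpenAI-compatible endpoint.
# Assumes Ollama is running locally with the llama3 model loaded.
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3",
    "messages": [
      {"role": "user", "content": "Say hello in one sentence."}
    ]
  }'
```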

While Khoj also supports locally hosted LLMs downloaded from Hugging Face, the Ollama integration is particularly useful for its ease of setup and multi-model support, especially if you're already using Ollama.

## Setup

1. Set up Ollama: https://ollama.com/
2. Start your preferred model with Ollama. For example:
   ```bash
   ollama run llama3
   ```
3. Go to the Khoj settings at **OpenAI Processor Conversation Config**.
4. Create a new config:
   - Name: `ollama`
   - Api Key: any string
   - Api Base Url: `http://localhost:11434/v1/` (the default for Ollama; see the sanity check after these steps)
5. Go to **Chat Model Options**.
6. Create a new config:
   - Name: `llama3` (replace with the name of your local model)
   - Model Type: `Openai`
   - Openai Config: the Ollama config you created in step 4
   - Max prompt size: `1000` (replace with the max prompt size of your model)
7. Go to your config and select the model you just created in the chat model dropdown.
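Before pointing Khoj at the Api Base Url in step 4, it can be worth a quick sanity check that the server is reachable and that your model is actually being served. A minimal sketch, assuming Ollama is running on its default port:

```bash
# List the models Ollama serves through its OpenAI-compatible API.
# The model name you enter in Khoj's Chat Model Options (step 6)
# should appear in this list.
curl http://localhost:11434/v1/models
```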

That's it! You should now be able to chat with your Ollama model from Khoj. If you want to add additional models running on Ollama, repeat step 6 for each model.
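For example, to add a second model, you might start it with Ollama and then repeat step 6 with its name (here `mistral` is just an illustrative choice):

```bash
# Start a second model; `ollama run` downloads it first if needed.
ollama run mistral

# Then, in Khoj's Chat Model Options, create another config with
# Name: mistral, Model Type: Openai, and the same Ollama config.
```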