---
sidebar_position: 4
slug: /privacy
---

# Privacy
If you're using Khoj to index your personal data, that data almost certainly includes sensitive and private information.

Khoj is designed to be a personal AI, so one of our cornerstone principles is to make it as privacy-friendly as possible. That's why you can *always* choose to run Khoj on your own hardware and never share your data outside of your device. You can generate your embeddings directly on your machine, then use an offline chat client so that your data never leaves it. See the [self-hosting](./setup.mdx) instructions to get started.

Here's what to consider if you're using Khoj, whether self-hosted or on our cloud:
1. Some of your relevant indexed data may be included as context when you chat with Khoj. This means it may be sent to OpenAI if you use one of the OpenAI models.
1. We collect completely anonymized usage telemetry and send it to [PostHog](https://posthog.com/). This includes data like unique chat requests, unique search requests, and unique requests to index data. We collect usage data to understand how people use Khoj and to help us prioritize features.
    - We do not log your IP address or upload any of your personal data to PostHog.
    - You can see our telemetry aggregation code [here](https://github.com/khoj-ai/khoj/blob/master/src/khoj/routers/helpers.py#L71) and our telemetry server [here](https://github.com/khoj-ai/khoj/blob/master/src/telemetry/telemetry.py).
    - If you're self-hosting, you can opt out of telemetry by following [these instructions](/miscellaneous/telemetry).

Self-hosting isn't for everyone, so we've also taken steps to make Khoj privacy-friendly even if you choose to use our [cloud offering](https://app.khoj.dev/login). Here's what to consider when using Khoj Cloud:
1. Your embeddings are generated by an open-source model running on our own dedicated endpoint, [hosted on AWS with Hugging Face](https://huggingface.co/inference-endpoints/dedicated). The Hugging Face Inference Endpoints are stateless, so they keep no persistent memory of your data.
1. Your embeddings and the associated raw text are stored in a secure Postgres DB in our private AWS cloud. Your data is sharded on a unique user ID. We store the raw text of your files to improve file syncing and to provide context when you chat with Khoj.
1. When you use the single sign-on option with Google, we only receive your name, a link to your profile photo, and your email address.

:::tip[Info]
Your data is yours. We do not sell your data or use it for training models. Khoj is a sustainable, open-source alternative to closed-source, commercial personal AI. We have no interest in selling your data to make a quick buck.
:::

We have lots of ideas for how to make Khoj really robust as a personal AI and cloud offering, while also keeping it trustless and privacy-centric. Please [reach out](mailto:team@khoj.dev) if this is important to you and you'd like to help us build it.