mirror of
https://github.com/Mintplex-Labs/anything-llm.git
synced 2025-04-17 18:18:11 +00:00
inital commit ⚡
commit
27c58541bd
100 changed files with 5394 additions and 0 deletions
.gitignore LICENSE README.md clean.sh
collector
.env.example .gitignore README.md
hotdir
main.py requirements.txt
scripts
__init__.py gitbook.py link.py link_utils.py medium.py medium_utils.py substack.py substack_utils.py utils.py
watch.py
watch
youtube.py yt_utils.py
frontend
.env.production .eslintrc.cjs .gitignore .nvmrc index.html jsconfig.json package.json postcss.config.js
public
src
App.jsx AuthContext.jsx index.css main.jsx
tailwind.config.js vite.config.js
components
DefaultChat
Modals
Sidebar
UserIcon
WorkspaceChat
models
pages
utils
images
package.json
server
.env.example .gitignore .nvmrc
documents
endpoints
index.js
models
package.json
utils
chats
files
http
middleware
openAi
pinecone
vector-cache
10
.gitignore
vendored
Normal file
@ -0,0 +1,10 @@
v-env
.env
!.env.example

node_modules
__pycache__
v-env
*.lock
.DS_Store

21
LICENSE
Normal file
@ -0,0 +1,21 @@
The MIT License

Copyright (c) Mintplex Labs Inc.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
59
README.md
Normal file
@ -0,0 +1,59 @@
# 🤖 AnythingLLM: A full-stack personalized AI assistant

[Twitter](https://twitter.com/tcarambat) [Discord](https://discord.gg/6UyHPeGZAC)

A full-stack application and tool suite that turns any document, resource, or piece of content into data that any LLM can use as a reference while chatting. The application runs with minimal overhead because the LLM and vectorDB are hosted remotely by default, but both can be swapped for local instances. The project currently supports Pinecone and OpenAI.

[view more screenshots](/images/screenshots/SCREENSHOTS.md)

### Watch the demo!

_tbd_

### Product Overview
AnythingLLM aims to be a full-stack application where you can use commercial off-the-shelf LLMs with long-term-memory solutions, or use popular open-source LLM and vectorDB solutions.

AnythingLLM is a full-stack product that you can run locally or host remotely, and that lets you chat intelligently with any documents you provide.

AnythingLLM divides your documents into objects called `workspaces`. A workspace functions a lot like a thread, but it also acts as a container for your documents. Workspaces can share documents, but they do not talk to each other, so the context of each workspace stays clean.

Some cool features of AnythingLLM
- Atomically manage documents to be used in long-term memory from a simple UI
- Two chat modes: `conversation` and `query`. Conversation retains previous questions and amendments. Query is simple QA against your documents
- Each chat response contains a citation that is linked to the original content
- Simple technology stack for fast iteration
- Fully capable of being hosted remotely
- "Bring your own LLM" model and vector solution. _still in progress_
- Extremely efficient cost-saving measures for managing very large documents. You'll never pay to embed a massive document or transcript more than once. 90% more cost effective than other LTM chatbots

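The "never pay to embed a document more than once" feature can be pictured as a cache keyed on the document's content. The sketch below is only an illustration of that idea, not the project's actual vector cache: `cached_embedding` and `fake_embed` are hypothetical names, and the on-disk layout is an assumption.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def cached_embedding(text, cache_dir, embed):
    """Return the embedding for `text`, calling `embed` only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Key the cache on a digest of the content, so identical text maps to the same file.
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    cache_file = cache_dir / f"{key}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))
    vector = embed(text)  # the only call that would cost money
    cache_file.write_text(json.dumps(vector), encoding="utf-8")
    return vector


# Hypothetical stand-in for a paid embedding API; it records every invocation.
calls = []


def fake_embed(text):
    calls.append(text)
    return [float(len(text)), 0.0]


cache_dir = tempfile.mkdtemp()
first = cached_embedding("a very long transcript", cache_dir, fake_embed)
second = cached_embedding("a very long transcript", cache_dir, fake_embed)
```

The second call is served from disk, so `fake_embed` runs once even though the same text is requested twice.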
### Technical Overview
This monorepo consists of three main sections:
- `collector`: Python tools that let you quickly convert online resources or local documents into an LLM-usable format.
- `frontend`: A viteJS + React frontend that you can run to easily create and manage all the content the LLM can use.
- `server`: A nodeJS + express server that handles all the interactions, including vectorDB management and LLM calls.

### Requirements
- `yarn` and `node` on your machine
- `python` 3.8+ for running scripts in `collector/`.
- access to an LLM like `GPT-3.5`, `GPT-4`*.
- a [Pinecone.io](https://pinecone.io) free account*.

*You can use drop-in replacements for these. They are just the easiest way to get up and running fast.

### How to get started
- `yarn setup` from the project root directory.

This will create the required `.env` files in each of the application sections. Fill those out before proceeding, or things won't work right.

Next, you will need some content to embed. This could be a YouTube channel, Medium articles, local text files, Word documents, and the list goes on. This is where you will use the `collector/` part of the repo.

[Go set up and run collector scripts](./collector/README.md)

[Learn about documents](./server/documents/DOCUMENTS.md)

[Learn about vector caching](./server/documents/VECTOR_CACHE.md)

### Contributing
- create issue
- create PR with branch name format of `<issue number>-<short name>`
- yee haw let's merge
2
clean.sh
Normal file
@ -0,0 +1,2 @@
# Easily kill process on port because sometimes nodemon fails to reboot
kill -9 $(lsof -t -i tcp:5000)
1
collector/.env.example
Normal file
@ -0,0 +1 @@
GOOGLE_APIS_KEY=
6
collector/.gitignore
vendored
Normal file
@ -0,0 +1,6 @@
outputs/*/*.json
hotdir/*
hotdir/processed/*
!hotdir/__HOTDIR__.md
!hotdir/processed

45
collector/README.md
Normal file
@ -0,0 +1,45 @@
# How to collect data for vectorizing
This process should be run first. It lets you collect a large amount of data from various sources. The following services are currently supported:
- [x] YouTube Channels
- [x] Medium
- [x] Substack
- [x] Arbitrary Link
- [x] Gitbook
- [x] Local Files (.txt, .pdf, etc) [See full list](./hotdir/__HOTDIR__.md)

_these resources are under development or require PR_
- Twitter

### Requirements
- [ ] Python 3.8+
- [ ] Google Cloud Account (for YouTube channels)
- [ ] `brew install pandoc` [pandoc](https://pandoc.org/installing.html) (for .ODT document processing)

### Setup
This example uses python3.9, but it will work with 3.8+. Tested on macOS; untested on Windows.
- install virtualenv for python3.8+ first before any other steps. `python3.9 -m pip install virtualenv`
- `cd collector` from root directory
- `python3.9 -m virtualenv v-env`
- `source v-env/bin/activate`
- `pip install -r requirements.txt`
- `cp .env.example .env`
- `python main.py` for interactive collection or `python watch.py` to process local documents.
- Select the option you want and follow the prompts - Done!
- run `deactivate` to get back to regular shell

### Outputs
All JSON file data is cached in the `outputs/` folder. This prevents redundant API calls to services that may have rate limits or quota caps. Clearing out the `outputs/` folder makes the scripts run as if there were no cache.

As files are processed, you will see data being written to both the `collector/outputs` folder and the `server/documents` folder. Later in this process, once you boot up the server, you will bulk vectorize this content from a simple UI!

If collection fails at any point, it picks up where it left off, so you do not spend credits twice.

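The resume behavior described above can be sketched as "skip anything that already has an output file". This is a minimal illustration under that assumption; `collect` and `fake_fetch` are hypothetical names, and the real scripts may track progress differently.

```python
import json
import tempfile
from pathlib import Path


def collect(item_ids, output_dir, fetch):
    """Write one JSON file per item, skipping items that are already on disk."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    fetched = []
    for item_id in item_ids:
        out_file = output_dir / f"{item_id}.json"
        if out_file.exists():
            continue  # collected on an earlier run, so no API call is made
        data = fetch(item_id)
        out_file.write_text(json.dumps(data, indent=4), encoding="utf-8")
        fetched.append(item_id)
    return fetched


# Hypothetical stand-in for a rate-limited remote API.
def fake_fetch(item_id):
    return {"id": item_id, "pageContent": f"content of {item_id}"}


out = tempfile.mkdtemp()
first_run = collect(["a", "b"], out, fake_fetch)
second_run = collect(["a", "b", "c"], out, fake_fetch)
```

On the second run only `c` is fetched, because `a` and `b` already exist in the output folder.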
### How to get a Google Cloud API Key (YouTube data collection only)
**required to fetch YouTube transcripts and data**
- Have a Google account
- [Visit the GCP Cloud Console](https://console.cloud.google.com/welcome)
- Click the dropdown in the top right > Create new project. Name it whatever you like
- [Enable YouTube Data API v3](https://console.cloud.google.com/apis/library/youtube.googleapis.com)
- Once enabled, generate a credential key for this API
- Paste your key after `GOOGLE_APIS_KEY=` in your `collector/.env` file.
17
collector/hotdir/__HOTDIR__.md
Normal file
@ -0,0 +1,17 @@
### What is the "Hot directory"

This is the location where you can dump all supported file types and have them automatically converted and prepared for the vectorizing service, so they can be selected from the AnythingLLM frontend.

Files dropped in here will only be processed while you are running `python watch.py` from the `collector` directory.

Once converted, the original file is moved to the `hotdir/processed` folder so that it can still be linked to when it is cited as a source document during chatting.

**Supported File types**
- `.md`
- `.text`
- `.pdf`

__requires more development__
- `.png .jpg etc`
- `.mp3`
- `.mp4`
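The hot-directory flow described above (convert supported files, then move the originals into `processed`) can be sketched as a single pass over the folder. This is an illustration only: `process_hotdir` and the `convert` callback are hypothetical, and `watch.py` itself is not shown in this section.

```python
import shutil
import tempfile
from pathlib import Path

SUPPORTED = {".md", ".text", ".pdf"}  # the extensions listed above
IGNORED = {"__HOTDIR__.md"}           # assumption: the explainer file is never processed


def process_hotdir(hotdir, convert):
    """Convert each supported file, then move the original into hotdir/processed."""
    hotdir = Path(hotdir)
    processed = hotdir / "processed"
    processed.mkdir(exist_ok=True)
    handled = []
    for path in sorted(hotdir.iterdir()):
        if not path.is_file() or path.name in IGNORED:
            continue
        if path.suffix.lower() not in SUPPORTED:
            continue  # unsupported types are left where they are
        convert(path)
        shutil.move(str(path), str(processed / path.name))
        handled.append(path.name)
    return handled


hotdir = Path(tempfile.mkdtemp())
(hotdir / "notes.md").write_text("# hello", encoding="utf-8")
(hotdir / "song.mp3").write_bytes(b"")
converted = []
handled = process_hotdir(hotdir, lambda p: converted.append(p.name))
```

A real watcher would repeat this pass on a timer or on filesystem events; the single pass keeps the sketch short.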
81
collector/main.py
Normal file
@ -0,0 +1,81 @@
import os
from whaaaaat import prompt, Separator
from scripts.youtube import youtube
from scripts.link import link, links
from scripts.substack import substack
from scripts.medium import medium
from scripts.gitbook import gitbook

def main():
    if os.name == 'nt':
        # Windows: fall back to a plain numbered menu.
        methods = {
            '1': 'YouTube Channel',
            '2': 'Article or Blog Link',
            '3': 'Substack',
            '4': 'Medium',
            '5': 'Gitbook'
        }
        print("There are options for data collection to make this easier for you.\nType the number of the method you wish to execute.")
        print("1. YouTube Channel\n2. Article or Blog Link (Single)\n3. Substack\n4. Medium\n5. Gitbook\n\n[In development]:\nTwitter\n\n")
        selection = input("Your selection: ")
        method = methods.get(str(selection))
    else:
        questions = [
            {
                "type": "list",
                "name": "collector",
                "message": "What kind of data would you like to add to convert into long-term memory?",
                "choices": [
                    "YouTube Channel",
                    "Substack",
                    "Medium",
                    "Article or Blog Link(s)",
                    "Gitbook",
                    Separator(),
                    {"name": "Twitter", "disabled": "Needs PR"},
                    "Abort",
                ],
            },
        ]
        method = prompt(questions).get('collector')

    # An unrecognized selection leaves method as None; stop before the substring test below.
    if method is None:
        print("Selection was not valid.")
        exit(1)

    if('Article or Blog Link' in method):
        questions = [
            {
                "type": "list",
                "name": "collector",
                "message": "Do you want to scrape a single article/blog/url or many at once?",
                "choices": [
                    'Single URL',
                    'Multiple URLs',
                    'Abort',
                ],
            },
        ]
        method = prompt(questions).get('collector')
        if(method == 'Single URL'):
            link()
            exit(0)
        if(method == 'Multiple URLs'):
            links()
            exit(0)

    if(method == 'Abort'): exit(0)
    if(method == 'YouTube Channel'):
        youtube()
        exit(0)
    if(method == 'Substack'):
        substack()
        exit(0)
    if(method == 'Medium'):
        medium()
        exit(0)
    if(method == 'Gitbook'):
        gitbook()
        exit(0)

    print("Selection was not valid.")
    exit(1)

if __name__ == "__main__":
    main()
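The chain of `if(method == ...)` checks in `main.py` maps a menu label to a collector function. One alternative, shown here only as a sketch and not as the repository's code, is a dictionary lookup. The `dispatch` helper and the lambda handlers are hypothetical stand-ins for the real `youtube`, `substack`, `medium`, and `gitbook` functions.

```python
def dispatch(method, handlers):
    """Run the handler registered for `method`; return a process-style exit code."""
    handler = handlers.get(method)
    if handler is None:
        print("Selection was not valid.")
        return 1
    handler()
    return 0


# Hypothetical stand-ins for the collector entry points; each records that it ran.
ran = []
handlers = {
    "YouTube Channel": lambda: ran.append("youtube"),
    "Substack": lambda: ran.append("substack"),
    "Medium": lambda: ran.append("medium"),
    "Gitbook": lambda: ran.append("gitbook"),
}

ok = dispatch("Medium", handlers)
bad = dispatch("Twitter", handlers)
```

Adding a new collector then means adding one dictionary entry instead of another `if` block.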
221
collector/requirements.txt
Normal file
@ -0,0 +1,221 @@
about-time==4.2.1
aiohttp==3.8.4
aiosignal==1.3.1
alive-progress==3.1.2
anyio==3.7.0
appdirs==1.4.4
argilla==1.8.0
async-timeout==4.0.2
attrs==23.1.0
backoff==2.2.1
beautifulsoup4==4.12.2
bs4==0.0.1
certifi==2023.5.7
cffi==1.15.1
chardet==5.1.0
charset-normalizer==3.1.0
click==8.1.3
commonmark==0.9.1
cryptography==41.0.1
cssselect==1.2.0
dataclasses-json==0.5.7
Deprecated==1.2.14
et-xmlfile==1.1.0
exceptiongroup==1.1.1
fake-useragent==1.1.3
frozenlist==1.3.3
grapheme==0.6.0
greenlet==2.0.2
h11==0.14.0
httpcore==0.16.3
httpx==0.23.3
idna==3.4
importlib-metadata==6.6.0
importlib-resources==5.12.0
install==1.3.5
joblib==1.2.0
langchain==0.0.189
lxml==4.9.2
Markdown==3.4.3
marshmallow==3.19.0
marshmallow-enum==1.5.1
monotonic==1.6
msg-parser==1.2.0
multidict==6.0.4
mypy-extensions==1.0.0
nltk==3.8.1
numexpr==2.8.4
numpy==1.23.5
olefile==0.46
openapi-schema-pydantic==1.2.4
openpyxl==3.1.2
packaging==23.1
pandas==1.5.3
parse==1.19.0
pdfminer.six==20221105
Pillow==9.5.0
prompt-toolkit==1.0.14
pycparser==2.21
pydantic==1.10.8
pyee==8.2.2
Pygments==2.15.1
pyobjc==9.1.1
pyobjc-core==9.1.1
pyobjc-framework-Accounts==9.1.1
pyobjc-framework-AddressBook==9.1.1
pyobjc-framework-AdSupport==9.1.1
pyobjc-framework-AppleScriptKit==9.1.1
pyobjc-framework-AppleScriptObjC==9.1.1
pyobjc-framework-ApplicationServices==9.1.1
pyobjc-framework-AudioVideoBridging==9.1.1
pyobjc-framework-AuthenticationServices==9.1.1
pyobjc-framework-AutomaticAssessmentConfiguration==9.1.1
pyobjc-framework-Automator==9.1.1
pyobjc-framework-AVFoundation==9.1.1
pyobjc-framework-AVKit==9.1.1
pyobjc-framework-BusinessChat==9.1.1
pyobjc-framework-CalendarStore==9.1.1
pyobjc-framework-CFNetwork==9.1.1
pyobjc-framework-CloudKit==9.1.1
pyobjc-framework-Cocoa==9.1.1
pyobjc-framework-Collaboration==9.1.1
pyobjc-framework-ColorSync==9.1.1
pyobjc-framework-Contacts==9.1.1
pyobjc-framework-ContactsUI==9.1.1
pyobjc-framework-CoreAudio==9.1.1
pyobjc-framework-CoreAudioKit==9.1.1
pyobjc-framework-CoreBluetooth==9.1.1
pyobjc-framework-CoreData==9.1.1
pyobjc-framework-CoreHaptics==9.1.1
pyobjc-framework-CoreLocation==9.1.1
pyobjc-framework-CoreMedia==9.1.1
pyobjc-framework-CoreMediaIO==9.1.1
pyobjc-framework-CoreMIDI==9.1.1
pyobjc-framework-CoreML==9.1.1
pyobjc-framework-CoreMotion==9.1.1
pyobjc-framework-CoreServices==9.1.1
pyobjc-framework-CoreSpotlight==9.1.1
pyobjc-framework-CoreText==9.1.1
pyobjc-framework-CoreWLAN==9.1.1
pyobjc-framework-CryptoTokenKit==9.1.1
pyobjc-framework-DeviceCheck==9.1.1
pyobjc-framework-DictionaryServices==9.1.1
pyobjc-framework-DiscRecording==9.1.1
pyobjc-framework-DiscRecordingUI==9.1.1
pyobjc-framework-DiskArbitration==9.1.1
pyobjc-framework-DVDPlayback==9.1.1
pyobjc-framework-EventKit==9.1.1
pyobjc-framework-ExceptionHandling==9.1.1
pyobjc-framework-ExecutionPolicy==9.1.1
pyobjc-framework-ExternalAccessory==9.1.1
pyobjc-framework-FileProvider==9.1.1
pyobjc-framework-FileProviderUI==9.1.1
pyobjc-framework-FinderSync==9.1.1
pyobjc-framework-FSEvents==9.1.1
pyobjc-framework-GameCenter==9.1.1
pyobjc-framework-GameController==9.1.1
pyobjc-framework-GameKit==9.1.1
pyobjc-framework-GameplayKit==9.1.1
pyobjc-framework-ImageCaptureCore==9.1.1
pyobjc-framework-IMServicePlugIn==9.1.1
pyobjc-framework-InputMethodKit==9.1.1
pyobjc-framework-InstallerPlugins==9.1.1
pyobjc-framework-InstantMessage==9.1.1
pyobjc-framework-Intents==9.1.1
pyobjc-framework-IOBluetooth==9.1.1
pyobjc-framework-IOBluetoothUI==9.1.1
pyobjc-framework-IOSurface==9.1.1
pyobjc-framework-iTunesLibrary==9.1.1
pyobjc-framework-LatentSemanticMapping==9.1.1
pyobjc-framework-LaunchServices==9.1.1
pyobjc-framework-libdispatch==9.1.1
pyobjc-framework-libxpc==9.1.1
pyobjc-framework-LinkPresentation==9.1.1
pyobjc-framework-LocalAuthentication==9.1.1
pyobjc-framework-MapKit==9.1.1
pyobjc-framework-MediaAccessibility==9.1.1
pyobjc-framework-MediaLibrary==9.1.1
pyobjc-framework-MediaPlayer==9.1.1
pyobjc-framework-MediaToolbox==9.1.1
pyobjc-framework-Metal==9.1.1
pyobjc-framework-MetalKit==9.1.1
pyobjc-framework-MetalPerformanceShaders==9.1.1
pyobjc-framework-ModelIO==9.1.1
pyobjc-framework-MultipeerConnectivity==9.1.1
pyobjc-framework-NaturalLanguage==9.1.1
pyobjc-framework-NetFS==9.1.1
pyobjc-framework-Network==9.1.1
pyobjc-framework-NetworkExtension==9.1.1
pyobjc-framework-NotificationCenter==9.1.1
pyobjc-framework-OpenDirectory==9.1.1
pyobjc-framework-OSAKit==9.1.1
pyobjc-framework-OSLog==9.1.1
pyobjc-framework-PencilKit==9.1.1
pyobjc-framework-Photos==9.1.1
pyobjc-framework-PhotosUI==9.1.1
pyobjc-framework-PreferencePanes==9.1.1
pyobjc-framework-PushKit==9.1.1
pyobjc-framework-Quartz==9.1.1
pyobjc-framework-QuickLookThumbnailing==9.1.1
pyobjc-framework-SafariServices==9.1.1
pyobjc-framework-SceneKit==9.1.1
pyobjc-framework-ScreenSaver==9.1.1
pyobjc-framework-ScriptingBridge==9.1.1
pyobjc-framework-SearchKit==9.1.1
pyobjc-framework-Security==9.1.1
pyobjc-framework-SecurityFoundation==9.1.1
pyobjc-framework-SecurityInterface==9.1.1
pyobjc-framework-ServiceManagement==9.1.1
pyobjc-framework-Social==9.1.1
pyobjc-framework-SoundAnalysis==9.1.1
pyobjc-framework-Speech==9.1.1
pyobjc-framework-SpriteKit==9.1.1
pyobjc-framework-StoreKit==9.1.1
pyobjc-framework-SyncServices==9.1.1
pyobjc-framework-SystemConfiguration==9.1.1
pyobjc-framework-SystemExtensions==9.1.1
pyobjc-framework-UserNotifications==9.1.1
pyobjc-framework-VideoSubscriberAccount==9.1.1
pyobjc-framework-VideoToolbox==9.1.1
pyobjc-framework-Vision==9.1.1
pyobjc-framework-WebKit==9.1.1
pypandoc==1.11
pyppeteer==1.0.2
pyquery==2.0.0
python-dateutil==2.8.2
python-docx==0.8.11
python-dotenv==0.21.1
python-magic==0.4.27
python-pptx==0.6.21
python-slugify==8.0.1
pytz==2023.3
PyYAML==6.0
regex==2023.5.5
requests==2.31.0
requests-html==0.10.0
rfc3986==1.5.0
rich==13.0.1
six==1.16.0
sniffio==1.3.0
soupsieve==2.4.1
SQLAlchemy==2.0.15
tenacity==8.2.2
text-unidecode==1.3
tiktoken==0.4.0
tqdm==4.65.0
typer==0.9.0
typing-inspect==0.9.0
typing_extensions==4.6.3
unstructured==0.7.1
urllib3==1.26.16
uuid==1.30
w3lib==2.1.1
wcwidth==0.2.6
websockets==10.4
whaaaaat==0.5.2
wrapt==1.14.1
xlrd==2.0.1
XlsxWriter==3.1.2
yarl==1.9.2
youtube-transcript-api==0.6.0
zipp==3.15.0
0
collector/scripts/__init__.py
Normal file
44
collector/scripts/gitbook.py
Normal file
@ -0,0 +1,44 @@
import os, json
from langchain.document_loaders import GitbookLoader
from urllib.parse import urlparse
from datetime import datetime
from alive_progress import alive_it
from .utils import tokenize
from uuid import uuid4

def gitbook():
    url = input("Enter the URL of the GitBook you want to collect: ")
    if(url == ''):
        print("Not a gitbook URL")
        exit(1)

    primary_source = urlparse(url)
    output_path = f"./outputs/gitbook-logs/{primary_source.netloc}"
    transaction_output_dir = f"../server/documents/gitbook-{primary_source.netloc}"

    if os.path.exists(output_path) == False: os.makedirs(output_path)
    if os.path.exists(transaction_output_dir) == False: os.makedirs(transaction_output_dir)
    # Crawl every page only when the URL points at the root of the GitBook.
    loader = GitbookLoader(url, load_all_paths=primary_source.path in ['','/'])
    for doc in alive_it(loader.load()):
        metadata = doc.metadata
        content = doc.page_content
        source = urlparse(metadata.get('source'))
        name = 'home' if source.path in ['','/'] else source.path.replace('/','_')
        output_filename = f"doc-{name}.json"
        transaction_output_filename = f"doc-{name}.json"
        data = {
            'id': str(uuid4()),
            'url': metadata.get('source'),
            "title": metadata.get('title'),
            "description": metadata.get('title'),
            "published": datetime.today().strftime('%Y-%m-%d %H:%M:%S'),
            # Count words, not characters, to match the other collectors.
            "wordCount": len(content.split(' ')),
            'pageContent': content,
            'token_count_estimate': len(tokenize(content))
        }

        with open(f"{output_path}/{output_filename}", 'w', encoding='utf-8') as file:
            json.dump(data, file, ensure_ascii=True, indent=4)

        with open(f"{transaction_output_dir}/{transaction_output_filename}", 'w', encoding='utf-8') as file:
            json.dump(data, file, ensure_ascii=True, indent=4)
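The output file name in `gitbook.py` is derived from the path of each page's source URL. The sketch below isolates that rule so it can be tried on its own; `doc_filename` is a hypothetical helper name and the example URLs are made up.

```python
from urllib.parse import urlparse


def doc_filename(source_url):
    """Mirror the naming rule above: root pages become 'home', slashes become underscores."""
    path = urlparse(source_url).path
    name = 'home' if path in ['', '/'] else path.replace('/', '_')
    return f"doc-{name}.json"


root = doc_filename("https://docs.example.com/")
page = doc_filename("https://docs.example.com/guides/setup")
```

Because the leading slash is also replaced, nested pages get names such as `doc-_guides_setup.json`.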
139
collector/scripts/link.py
Normal file
@ -0,0 +1,139 @@
import os, json, tempfile
from urllib.parse import urlparse
from requests_html import HTMLSession
from langchain.document_loaders import UnstructuredHTMLLoader
from .link_utils import append_meta
from .utils import tokenize, ada_v2_cost

# Example article URL https://tim.blog/2022/08/09/nft-insider-trading-policy/
def link():
    print("[NOTICE]: The first time running this process it will download supporting libraries.\n\n")
    fqdn_link = input("Paste in the URL of an online article or blog: ")
    if(len(fqdn_link) == 0):
        print("Invalid URL!")
        exit(1)

    session = HTMLSession()
    req = session.get(fqdn_link)
    if(req.ok == False):
        print("Could not reach this url!")
        exit(1)

    # Render the page so JavaScript-generated content is present, then
    # extract its text through a temporary HTML file.
    req.html.render()
    full_text = None
    with tempfile.NamedTemporaryFile(mode = "w") as tmp:
        tmp.write(req.html.html)
        tmp.seek(0)
        loader = UnstructuredHTMLLoader(tmp.name)
        data = loader.load()[0]
        full_text = data.page_content
        tmp.close()

    link = append_meta(req, full_text, True)
    if(len(full_text) > 0):
        source = urlparse(req.url)
        output_filename = f"website-{source.netloc}-{source.path.replace('/','_')}.json"
        output_path = f"./outputs/website-logs"

        transaction_output_filename = f"article-{source.path.replace('/','_')}.json"
        transaction_output_dir = f"../server/documents/website-{source.netloc}"

        if os.path.isdir(output_path) == False:
            os.makedirs(output_path)

        if os.path.isdir(transaction_output_dir) == False:
            os.makedirs(transaction_output_dir)

        full_text = append_meta(req, full_text)
        tokenCount = len(tokenize(full_text))
        link['pageContent'] = full_text
        link['token_count_estimate'] = tokenCount

        with open(f"{output_path}/{output_filename}", 'w', encoding='utf-8') as file:
            json.dump(link, file, ensure_ascii=True, indent=4)

        with open(f"{transaction_output_dir}/{transaction_output_filename}", 'w', encoding='utf-8') as file:
            json.dump(link, file, ensure_ascii=True, indent=4)
    else:
        print("Could not parse any meaningful data from this link or url.")
        exit(1)

    print(f"\n\n[Success]: article or link content fetched!")
    print(f"////////////////////////////")
    print(f"Your estimated cost to embed this data using OpenAI's text-embedding-ada-002 model at $0.0004 / 1K tokens will cost {ada_v2_cost(tokenCount)} using {tokenCount} tokens.")
    print(f"////////////////////////////")
    exit(0)

def links():
    links = []
    prompt = "Paste in the URL of an online article or blog: "
    done = False

    # Collect URLs until an empty line is submitted, then de-duplicate them.
    while(done == False):
        new_link = input(prompt)
        if(len(new_link) == 0):
            done = True
            links = [*set(links)]
            continue

        links.append(new_link)
        prompt = f"\n{len(links)} links in queue. Submit an empty value when done pasting in links to execute collection.\nPaste in the next URL of an online article or blog: "

    if(len(links) == 0):
        print("No valid links provided!")
        exit(1)

    totalTokens = 0
    for link in links:
        print(f"Working on {link}...")
        session = HTMLSession()
        req = session.get(link)
        if(req.ok == False):
            print(f"Could not reach {link} - skipping!")
            continue

        req.html.render()
        full_text = None
        with tempfile.NamedTemporaryFile(mode = "w") as tmp:
            tmp.write(req.html.html)
            tmp.seek(0)
            loader = UnstructuredHTMLLoader(tmp.name)
            data = loader.load()[0]
            full_text = data.page_content
            tmp.close()

        # Keep the metadata in its own variable so `link` still holds the URL
        # for the failure message below.
        meta = append_meta(req, full_text, True)
        if(len(full_text) > 0):
            source = urlparse(req.url)
            output_filename = f"website-{source.netloc}-{source.path.replace('/','_')}.json"
            output_path = f"./outputs/website-logs"

            transaction_output_filename = f"article-{source.path.replace('/','_')}.json"
            transaction_output_dir = f"../server/documents/website-{source.netloc}"

            if os.path.isdir(output_path) == False:
                os.makedirs(output_path)

            if os.path.isdir(transaction_output_dir) == False:
                os.makedirs(transaction_output_dir)

            full_text = append_meta(req, full_text)
            tokenCount = len(tokenize(full_text))
            meta['pageContent'] = full_text
            meta['token_count_estimate'] = tokenCount
            totalTokens += tokenCount

            with open(f"{output_path}/{output_filename}", 'w', encoding='utf-8') as file:
                json.dump(meta, file, ensure_ascii=True, indent=4)

            with open(f"{transaction_output_dir}/{transaction_output_filename}", 'w', encoding='utf-8') as file:
                json.dump(meta, file, ensure_ascii=True, indent=4)
        else:
            print(f"Could not parse any meaningful data from {link}.")
            continue

    print(f"\n\n[Success]: {len(links)} article or link contents fetched!")
    print(f"////////////////////////////")
    print(f"Your estimated cost to embed this data using OpenAI's text-embedding-ada-002 model at $0.0004 / 1K tokens will cost {ada_v2_cost(totalTokens)} using {totalTokens} tokens.")
    print(f"////////////////////////////")
    exit(0)
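The cost line printed by `link.py` quotes a rate of $0.0004 per 1K tokens. The arithmetic behind such an estimate could look like the sketch below. `ada_v2_cost` lives in `scripts/utils.py`, which is not shown here, so `estimate_cost` is a hypothetical stand-in and its output format is an assumption.

```python
ADA_V2_USD_PER_1K_TOKENS = 0.0004  # rate quoted in the printed message


def estimate_cost(token_count):
    """Return the estimated embedding cost as a dollar string (hypothetical format)."""
    usd = (token_count / 1000) * ADA_V2_USD_PER_1K_TOKENS
    return f"${usd:.4f}"


small = estimate_cost(1_500)
large = estimate_cost(250_000)
```

At this rate a 250,000-token corpus costs about ten cents to embed, which is why caching the result matters mostly for repeated runs over very large documents.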
14
collector/scripts/link_utils.py
Normal file
@ -0,0 +1,14 @@
import json
from datetime import datetime
from dotenv import load_dotenv
load_dotenv()

def append_meta(request, text, metadata_only = False):
    meta = {
        'url': request.url,
        'title': request.html.find('title', first=True).text if len(request.html.find('title')) != 0 else '',
        'description': request.html.find('meta[name="description"]', first=True).attrs.get('content') if request.html.find('meta[name="description"]', first=True) != None else '',
        'published':request.html.find('meta[property="article:published_time"]', first=True).attrs.get('content') if request.html.find('meta[property="article:published_time"]', first=True) != None else datetime.today().strftime('%Y-%m-%d %H:%M:%S'),
        'wordCount': len(text.split(' ')),
    }
    return "Article JSON Metadata:\n"+json.dumps(meta)+"\n\n\nText Content:\n" + text if metadata_only == False else meta
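`append_meta` has two return shapes: the metadata dictionary alone, or the page text with a JSON metadata header in front of it. The sketch below reproduces just that branching with a plain dictionary in place of the `requests_html` response; `with_metadata` and the sample values are hypothetical.

```python
import json


def with_metadata(meta, text, metadata_only=False):
    """Return `meta` itself, or `text` prefixed with a JSON metadata header."""
    if metadata_only:
        return meta
    return "Article JSON Metadata:\n" + json.dumps(meta) + "\n\n\nText Content:\n" + text


meta = {"url": "https://example.com/post", "title": "Post", "wordCount": 2}
as_dict = with_metadata(meta, "hello world", metadata_only=True)
as_text = with_metadata(meta, "hello world")
```

Prefixing the text this way means the metadata is embedded along with the content, so it can surface in retrieved chunks.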
71
collector/scripts/medium.py
Normal file
|
@ -0,0 +1,71 @@
|
|||
import os, json
|
||||
from urllib.parse import urlparse
|
||||
from .utils import tokenize, ada_v2_cost
|
||||
from .medium_utils import get_username, fetch_recent_publications, append_meta
|
||||
from alive_progress import alive_it
|
||||
|
||||
# Example medium URL: https://medium.com/@yujiangtham or https://davidall.medium.com
|
||||
def medium():
|
||||
print("[NOTICE]: This method will only get the 10 most recent publishings.")
|
||||
author_url = input("Enter the medium URL of the author you want to collect: ")
|
||||
if(author_url == ''):
|
||||
print("Not a valid medium.com/@author URL")
|
||||
exit(1)
|
||||
|
||||
handle = get_username(author_url)
|
||||
if(handle is None):
|
||||
print("This does not appear to be a valid medium.com/@author URL")
|
||||
exit(1)
|
||||
|
||||
publications = fetch_recent_publications(handle)
|
||||
if(len(publications)==0):
|
||||
print("There are no public or free publications by this creator - nothing to collect.")
|
||||
exit(1)
|
||||
|
||||
totalTokenCount = 0
|
||||
transaction_output_dir = f"../server/documents/medium-{handle}"
|
||||
if os.path.isdir(transaction_output_dir) == False:
|
||||
os.makedirs(transaction_output_dir)
|
||||
|
||||
for publication in alive_it(publications):
|
||||
pub_file_path = transaction_output_dir + f"/publication-{publication.get('id')}.json"
|
||||
if os.path.exists(pub_file_path) == True: continue
|
||||
|
||||
full_text = publication.get('pageContent')
|
||||
if full_text is None or len(full_text) == 0: continue
|
||||
|
||||
full_text = append_meta(publication, full_text)
|
||||
item = {
|
||||
'id': publication.get('id'),
|
||||
'url': publication.get('url'),
|
||||
'title': publication.get('title'),
|
||||
'published': publication.get('published'),
|
||||
'wordCount': len(full_text.split(' ')),
|
||||
'pageContent': full_text,
|
||||
}
|
||||
|
||||
tokenCount = len(tokenize(full_text))
|
||||
item['token_count_estimate'] = tokenCount
|
||||
|
||||
totalTokenCount += tokenCount
|
||||
with open(pub_file_path, 'w', encoding='utf-8') as file:
|
||||
json.dump(item, file, ensure_ascii=True, indent=4)
|
||||
|
||||
print(f"[Success]: {len(publications)} scraped and fetched!")
|
||||
print(f"\n\n////////////////////////////")
|
||||
print(f"Your estimated cost to embed all of this data using OpenAI's text-embedding-ada-002 model at $0.0004 / 1K tokens will cost {ada_v2_cost(totalTokenCount)} using {totalTokenCount} tokens.")
|
||||
print(f"////////////////////////////\n\n")
|
||||
exit(0)
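The collector skips any publication whose output file already exists, so a rerun only writes what is missing. A minimal sketch of that write-once pattern, assuming a hypothetical helper name `write_once` and a temporary directory standing in for `../server/documents`:

```python
import json
import os
import tempfile


def write_once(path, item):
    """Write item as JSON unless path already exists; return True if written."""
    if os.path.exists(path):
        return False
    parent = os.path.dirname(path)
    if parent:
        # Create the per-author directory on first use.
        os.makedirs(parent, exist_ok=True)
    with open(path, 'w', encoding='utf-8') as fh:
        json.dump(item, fh, ensure_ascii=True, indent=4)
    return True


# Hypothetical paths for illustration only.
out_dir = tempfile.mkdtemp()
target = os.path.join(out_dir, "medium-example", "publication-abc123.json")
first = write_once(target, {"id": "abc123"})   # file is created
second = write_once(target, {"id": "abc123"})  # file exists, nothing written
```

Returning a boolean lets a caller count how many files were actually written on a given run.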
|
||||
|
71
collector/scripts/medium_utils.py
Normal file
|
@ -0,0 +1,71 @@
|
|||
import os, json, requests, re
|
||||
from bs4 import BeautifulSoup
|
||||
|
||||
def get_username(author_url):
|
||||
if '@' in author_url:
|
||||
pattern = r"medium\.com/@([\w-]+)"
|
||||
match = re.search(pattern, author_url)
|
||||
return match.group(1) if match else None
|
||||
else:
|
||||
# Given subdomain
|
||||
pattern = r"([\w-]+)\.medium\.com"
|
||||
match = re.search(pattern, author_url)
|
||||
return match.group(1) if match else None
|
||||
|
||||
def get_docid(medium_docpath):
|
||||
pattern = r"medium\.com/p/([\w-]+)"
|
||||
match = re.search(pattern, medium_docpath)
|
||||
return match.group(1) if match else None
|
||||
|
||||
def fetch_recent_publications(handle):
|
||||
rss_link = f"https://medium.com/feed/@{handle}"
|
||||
response = requests.get(rss_link)
|
||||
if(response.ok == False):
|
||||
print(f"Could not fetch RSS results for author.")
|
||||
return []
|
||||
|
||||
xml = response.content
|
||||
soup = BeautifulSoup(xml, 'xml')
|
||||
items = soup.find_all('item')
|
||||
publications = []
|
||||
|
||||
if os.path.isdir("./outputs/medium-logs") == False:
|
||||
os.makedirs("./outputs/medium-logs")
|
||||
|
||||
file_path = f"./outputs/medium-logs/medium-{handle}.json"
|
||||
|
||||
if os.path.exists(file_path):
|
||||
with open(file_path, "r") as file:
|
||||
print(f"Returning cached data for Author {handle}. If you do not wish to use stored data then delete the file for this author to allow refetching.")
|
||||
return json.load(file)
|
||||
|
||||
for item in items:
|
||||
tags = []
|
||||
for tag in item.find_all('category'): tags.append(tag.text)
|
||||
content = BeautifulSoup(item.find('content:encoded').text, 'html.parser')
|
||||
data = {
|
||||
'id': get_docid(item.find('guid').text),
|
||||
'title': item.find('title').text,
|
||||
'url': item.find('link').text.split('?')[0],
|
||||
'tags': ','.join(tags),
|
||||
'published': item.find('pubDate').text,
|
||||
'pageContent': content.get_text()
|
||||
}
|
||||
publications.append(data)
|
||||
|
||||
with open(file_path, 'w+', encoding='utf-8') as json_file:
|
||||
json.dump(publications, json_file, ensure_ascii=True, indent=2)
|
||||
print(f"{len(publications)} articles found for author medium.com/@{handle}. Saved to medium-logs/medium-{handle}.json")
|
||||
|
||||
return publications
|
||||
|
||||
def append_meta(publication, text):
|
||||
meta = {
|
||||
'url': publication.get('url'),
|
||||
'tags': publication.get('tags'),
|
||||
'title': publication.get('title'),
|
||||
'createdAt': publication.get('published'),
|
||||
'wordCount': len(text.split(' '))
|
||||
}
|
||||
return "Article Metadata:\n"+json.dumps(meta)+"\n\nArticle Content:\n" + text
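The handle extraction can be exercised without any network access. A minimal sketch, assuming only the two URL shapes handled in this file (`medium.com/@handle` and `handle.medium.com`); `extract_medium_handle` is a hypothetical name, and the dot before `medium` is escaped so it matches a literal dot only:

```python
import re


def extract_medium_handle(author_url):
    """Return the Medium handle from a profile URL, or None if not recognised."""
    # Profile form: https://medium.com/@handle
    match = re.search(r"medium\.com/@([\w-]+)", author_url)
    if match:
        return match.group(1)
    # Subdomain form: https://handle.medium.com
    match = re.search(r"([\w-]+)\.medium\.com", author_url)
    return match.group(1) if match else None
```

Trying the profile form first and the subdomain form second avoids branching on whether the string contains `@`, which could also appear in a query string.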
|
||||
|
78
collector/scripts/substack.py
Normal file
|
@ -0,0 +1,78 @@
|
|||
import os, json
|
||||
from urllib.parse import urlparse
|
||||
from .utils import tokenize, ada_v2_cost
|
||||
from .substack_utils import fetch_all_publications, only_valid_publications, get_content, append_meta
|
||||
from alive_progress import alive_it
|
||||
|
||||
# Example substack URL: https://swyx.substack.com/
|
||||
def substack():
|
||||
author_url = input("Enter the substack URL of the author you want to collect: ")
|
||||
if(author_url == ''):
|
||||
print("Not a valid author.substack.com URL")
|
||||
exit(1)
|
||||
|
||||
source = urlparse(author_url)
|
||||
if('substack.com' not in source.netloc or len(source.netloc.split('.')) != 3):
|
||||
print("This does not appear to be a valid author.substack.com URL")
|
||||
exit(1)
|
||||
|
||||
subdomain = source.netloc.split('.')[0]
|
||||
publications = fetch_all_publications(subdomain)
|
||||
valid_publications = only_valid_publications(publications)
|
||||
|
||||
if(len(valid_publications)==0):
|
||||
print("There are no public or free preview newsletters by this creator - nothing to collect.")
|
||||
exit(1)
|
||||
|
||||
print(f"{len(valid_publications)} of {len(publications)} publications are publicly readable text posts - collecting those.")
|
||||
|
||||
totalTokenCount = 0
|
||||
transaction_output_dir = f"../server/documents/substack-{subdomain}"
|
||||
if os.path.isdir(transaction_output_dir) == False:
|
||||
os.makedirs(transaction_output_dir)
|
||||
|
||||
for publication in alive_it(valid_publications):
|
||||
pub_file_path = transaction_output_dir + f"/publication-{publication.get('id')}.json"
|
||||
if os.path.exists(pub_file_path) == True: continue
|
||||
|
||||
full_text = get_content(publication.get('canonical_url'))
|
||||
if full_text is None or len(full_text) == 0: continue
|
||||
|
||||
full_text = append_meta(publication, full_text)
|
||||
item = {
|
||||
'id': publication.get('id'),
|
||||
'url': publication.get('canonical_url'),
|
||||
'thumbnail': publication.get('cover_image'),
|
||||
'title': publication.get('title'),
|
||||
'subtitle': publication.get('subtitle'),
|
||||
'description': publication.get('description'),
|
||||
'published': publication.get('post_date'),
|
||||
'wordCount': publication.get('wordcount'),
|
||||
'pageContent': full_text,
|
||||
}
|
||||
|
||||
tokenCount = len(tokenize(full_text))
|
||||
item['token_count_estimate'] = tokenCount
|
||||
|
||||
totalTokenCount += tokenCount
|
||||
with open(pub_file_path, 'w', encoding='utf-8') as file:
|
||||
json.dump(item, file, ensure_ascii=True, indent=4)
|
||||
|
||||
print(f"[Success]: {len(valid_publications)} scraped and fetched!")
|
||||
print(f"\n\n////////////////////////////")
|
||||
print(f"The estimated cost to embed all of this data using OpenAI's text-embedding-ada-002 model at $0.0004 / 1K tokens is {ada_v2_cost(totalTokenCount)} for {totalTokenCount} tokens.")
|
||||
print(f"////////////////////////////\n\n")
|
||||
exit(0)
|
||||
|
86
collector/scripts/substack_utils.py
Normal file
|
@ -0,0 +1,86 @@
|
|||
import os, json, requests, tempfile
|
||||
from requests_html import HTMLSession
|
||||
from langchain.document_loaders import UnstructuredHTMLLoader
|
||||
|
||||
def fetch_all_publications(subdomain):
|
||||
file_path = f"./outputs/substack-logs/substack-{subdomain}.json"
|
||||
|
||||
if os.path.isdir("./outputs/substack-logs") == False:
|
||||
os.makedirs("./outputs/substack-logs")
|
||||
|
||||
if os.path.exists(file_path):
|
||||
with open(file_path, "r") as file:
|
||||
print(f"Returning cached data for substack {subdomain}.substack.com. If you do not wish to use stored data then delete the file for this newsletter to allow refetching.")
|
||||
return json.load(file)
|
||||
|
||||
collecting = True
|
||||
offset = 0
|
||||
publications = []
|
||||
|
||||
while collecting is True:
|
||||
url = f"https://{subdomain}.substack.com/api/v1/archive?sort=new&offset={offset}"
|
||||
response = requests.get(url)
|
||||
if(response.ok == False):
|
||||
print("Bad response - exiting collection")
|
||||
collecting = False
|
||||
continue
|
||||
|
||||
data = response.json()
|
||||
|
||||
if(len(data) ==0 ):
|
||||
collecting = False
|
||||
continue
|
||||
|
||||
for publication in data:
|
||||
publications.append(publication)
|
||||
offset = len(publications)
|
||||
|
||||
with open(file_path, 'w+', encoding='utf-8') as json_file:
|
||||
json.dump(publications, json_file, ensure_ascii=True, indent=2)
|
||||
print(f"{len(publications)} publications found for author {subdomain}.substack.com. Saved to substack-logs/substack-{subdomain}.json")
|
||||
|
||||
return publications
|
||||
|
||||
def only_valid_publications(publications= []):
|
||||
valid_publications = []
|
||||
for publication in publications:
|
||||
is_paid = publication.get('audience') != 'everyone'
|
||||
if (is_paid and publication.get('should_send_free_preview') != True) or publication.get('type') != 'newsletter': continue
|
||||
valid_publications.append(publication)
|
||||
return valid_publications
|
||||
|
||||
def get_content(article_link):
|
||||
print(f"Fetching {article_link}")
|
||||
if(len(article_link) == 0):
|
||||
print("Invalid URL!")
|
||||
return None
|
||||
|
||||
session = HTMLSession()
|
||||
req = session.get(article_link)
|
||||
if(req.ok == False):
|
||||
print("Could not reach this url!")
|
||||
return None
|
||||
|
||||
req.html.render()
|
||||
|
||||
full_text = None
|
||||
with tempfile.NamedTemporaryFile(mode = "w") as tmp:
|
||||
tmp.write(req.html.html)
|
||||
tmp.seek(0)
|
||||
loader = UnstructuredHTMLLoader(tmp.name)
|
||||
data = loader.load()[0]
|
||||
full_text = data.page_content
|
||||
tmp.close()
|
||||
return full_text
|
||||
|
||||
def append_meta(publication, text):
|
||||
meta = {
|
||||
'url': publication.get('canonical_url'),
|
||||
'thumbnail': publication.get('cover_image'),
|
||||
'title': publication.get('title'),
|
||||
'subtitle': publication.get('subtitle'),
|
||||
'description': publication.get('description'),
|
||||
'createdAt': publication.get('post_date'),
|
||||
'wordCount': publication.get('wordcount')
|
||||
}
|
||||
return "Newsletter Metadata:\n"+json.dumps(meta)+"\n\nArticle Content:\n" + text
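The archive loop in `fetch_all_publications` is offset pagination: request a page, append it, and use the running total as the next offset until an empty page comes back. A minimal sketch of that loop, assuming a hypothetical `fetch_page(offset)` callable in place of the HTTP request, with `fake_fetch` as an in-memory stand-in for the archive endpoint:

```python
def collect_all(fetch_page):
    """Collect items from an offset-paginated source.

    fetch_page(offset) is a hypothetical callable that returns the list of
    items starting at `offset`, or an empty list when nothing is left.
    """
    items = []
    while True:
        page = fetch_page(len(items))
        if not page:
            break
        items.extend(page)
    return items


# Stand-in for the archive endpoint: five posts served two at a time.
posts = [{"id": i} for i in range(5)]


def fake_fetch(offset, page_size=2):
    return posts[offset:offset + page_size]
```

Passing the fetcher in as an argument keeps the loop testable without touching the network.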
|
10
collector/scripts/utils.py
Normal file
|
@ -0,0 +1,10 @@
|
|||
import tiktoken
|
||||
encoder = tiktoken.encoding_for_model("text-embedding-ada-002")
|
||||
|
||||
def tokenize(fullText):
|
||||
return encoder.encode(fullText)
|
||||
|
||||
def ada_v2_cost(tokenCount):
|
||||
rate_per = 0.0004 / 1_000 # $0.0004 / 1K tokens
|
||||
total = tokenCount * rate_per
|
||||
return '${:,.2f}'.format(total) if total >= 0.01 else '< $0.01'
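A worked example of the cost arithmetic, using the rate stated in the comment above ($0.0004 per 1K tokens). The function name `estimate_cost` is a stand-in for illustration, and the sketch skips tokenization and takes a token count directly:

```python
RATE_PER_TOKEN = 0.0004 / 1_000  # $0.0004 per 1K tokens


def estimate_cost(token_count):
    """Format the embedding cost, collapsing sub-cent totals to '< $0.01'."""
    total = token_count * RATE_PER_TOKEN
    return '${:,.2f}'.format(total) if total >= 0.01 else '< $0.01'
```

At this rate ten million tokens come to about four dollars, and anything under roughly 25,000 tokens rounds down to the sub-cent label.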
|
0
collector/scripts/watch/__init__.py
Normal file
58
collector/scripts/watch/convert/as_docx.py
Normal file
|
@ -0,0 +1,58 @@
|
|||
import os
|
||||
from langchain.document_loaders import Docx2txtLoader, UnstructuredODTLoader
|
||||
from slugify import slugify
|
||||
from ..utils import guid, file_creation_time, write_to_server_documents, move_source
|
||||
from ...utils import tokenize
|
||||
|
||||
# Process Word (.docx) and OpenDocument (.odt) files.
|
||||
def as_docx(**kwargs):
|
||||
parent_dir = kwargs.get('directory', 'hotdir')
|
||||
filename = kwargs.get('filename')
|
||||
ext = kwargs.get('ext', '.txt')
|
||||
fullpath = f"{parent_dir}/{filename}{ext}"
|
||||
|
||||
loader = Docx2txtLoader(fullpath)
|
||||
data = loader.load()[0]
|
||||
content = data.page_content
|
||||
|
||||
print(f"-- Working {fullpath} --")
|
||||
data = {
|
||||
'id': guid(),
|
||||
'url': "file://"+os.path.abspath(f"{parent_dir}/processed/{filename}{ext}"),
|
||||
'title': f"{filename}{ext}",
|
||||
'description': "a custom file uploaded by the user.",
|
||||
'published': file_creation_time(fullpath),
|
||||
'wordCount': len(content.split()),
|
||||
'pageContent': content,
|
||||
'token_count_estimate': len(tokenize(content))
|
||||
}
|
||||
|
||||
write_to_server_documents(data, f"{slugify(filename)}-{data.get('id')}")
|
||||
move_source(parent_dir, f"{filename}{ext}")
|
||||
print(f"[SUCCESS]: {filename}{ext} converted & ready for embedding.\n")
|
||||
|
||||
def as_odt(**kwargs):
|
||||
parent_dir = kwargs.get('directory', 'hotdir')
|
||||
filename = kwargs.get('filename')
|
||||
ext = kwargs.get('ext', '.txt')
|
||||
fullpath = f"{parent_dir}/{filename}{ext}"
|
||||
|
||||
loader = UnstructuredODTLoader(fullpath)
|
||||
data = loader.load()[0]
|
||||
content = data.page_content
|
||||
|
||||
print(f"-- Working {fullpath} --")
|
||||
data = {
|
||||
'id': guid(),
|
||||
'url': "file://"+os.path.abspath(f"{parent_dir}/processed/{filename}{ext}"),
|
||||
'title': f"{filename}{ext}",
|
||||
'description': "a custom file uploaded by the user.",
|
||||
'published': file_creation_time(fullpath),
|
||||
'wordCount': len(content.split()),
|
||||
'pageContent': content,
|
||||
'token_count_estimate': len(tokenize(content))
|
||||
}
|
||||
|
||||
write_to_server_documents(data, f"{slugify(filename)}-{data.get('id')}")
|
||||
move_source(parent_dir, f"{filename}{ext}")
|
||||
print(f"[SUCCESS]: {filename}{ext} converted & ready for embedding.\n")
|
32
collector/scripts/watch/convert/as_markdown.py
Normal file
|
@ -0,0 +1,32 @@
|
|||
import os
|
||||
from langchain.document_loaders import UnstructuredMarkdownLoader
|
||||
from slugify import slugify
|
||||
from ..utils import guid, file_creation_time, write_to_server_documents, move_source
|
||||
from ...utils import tokenize
|
||||
|
||||
# Process Markdown documents.
|
||||
def as_markdown(**kwargs):
|
||||
parent_dir = kwargs.get('directory', 'hotdir')
|
||||
filename = kwargs.get('filename')
|
||||
ext = kwargs.get('ext', '.txt')
|
||||
fullpath = f"{parent_dir}/{filename}{ext}"
|
||||
|
||||
loader = UnstructuredMarkdownLoader(fullpath)
|
||||
data = loader.load()[0]
|
||||
content = data.page_content
|
||||
|
||||
print(f"-- Working {fullpath} --")
|
||||
data = {
|
||||
'id': guid(),
|
||||
'url': "file://"+os.path.abspath(f"{parent_dir}/processed/{filename}{ext}"),
|
||||
'title': f"{filename}{ext}",
|
||||
'description': "a custom file uploaded by the user.",
|
||||
'published': file_creation_time(fullpath),
|
||||
'wordCount': len(content.split()),
|
||||
'pageContent': content,
|
||||
'token_count_estimate': len(tokenize(content))
|
||||
}
|
||||
|
||||
write_to_server_documents(data, f"{slugify(filename)}-{data.get('id')}")
|
||||
move_source(parent_dir, f"{filename}{ext}")
|
||||
print(f"[SUCCESS]: {filename}{ext} converted & ready for embedding.\n")
|
36
collector/scripts/watch/convert/as_pdf.py
Normal file
|
@ -0,0 +1,36 @@
|
|||
import os
|
||||
from langchain.document_loaders import PyPDFLoader
|
||||
from slugify import slugify
|
||||
from ..utils import guid, file_creation_time, write_to_server_documents, move_source
|
||||
from ...utils import tokenize
|
||||
|
||||
# Process PDF documents.
|
||||
def as_pdf(**kwargs):
|
||||
parent_dir = kwargs.get('directory', 'hotdir')
|
||||
filename = kwargs.get('filename')
|
||||
ext = kwargs.get('ext', '.txt')
|
||||
fullpath = f"{parent_dir}/{filename}{ext}"
|
||||
|
||||
loader = PyPDFLoader(fullpath)
|
||||
pages = loader.load_and_split()
|
||||
|
||||
print(f"-- Working {fullpath} --")
|
||||
for page in pages:
|
||||
pg_num = page.metadata.get('page')
|
||||
print(f"-- Working page {pg_num} --")
|
||||
|
||||
content = page.page_content
|
||||
data = {
|
||||
'id': guid(),
|
||||
'url': "file://"+os.path.abspath(f"{parent_dir}/processed/{filename}{ext}"),
|
||||
'title': f"{filename}_pg{pg_num}{ext}",
|
||||
'description': "a custom file uploaded by the user.",
|
||||
'published': file_creation_time(fullpath),
|
||||
'wordCount': len(content.split()),
|
||||
'pageContent': content,
|
||||
'token_count_estimate': len(tokenize(content))
|
||||
}
|
||||
write_to_server_documents(data, f"{slugify(filename)}-pg{pg_num}-{data.get('id')}")
|
||||
|
||||
move_source(parent_dir, f"{filename}{ext}")
|
||||
print(f"[SUCCESS]: {filename}{ext} converted & ready for embedding.\n")
|
28
collector/scripts/watch/convert/as_text.py
Normal file
|
@ -0,0 +1,28 @@
|
|||
import os
|
||||
from slugify import slugify
|
||||
from ..utils import guid, file_creation_time, write_to_server_documents, move_source
|
||||
from ...utils import tokenize
|
||||
|
||||
# Process all text-related documents.
|
||||
def as_text(**kwargs):
|
||||
parent_dir = kwargs.get('directory', 'hotdir')
|
||||
filename = kwargs.get('filename')
|
||||
ext = kwargs.get('ext', '.txt')
|
||||
fullpath = f"{parent_dir}/{filename}{ext}"
|
||||
content = open(fullpath).read()
|
||||
|
||||
print(f"-- Working {fullpath} --")
|
||||
data = {
|
||||
'id': guid(),
|
||||
'url': "file://"+os.path.abspath(f"{parent_dir}/processed/{filename}{ext}"),
|
||||
'title': f"{filename}{ext}",
|
||||
'description': "a custom file uploaded by the user.",
|
||||
'published': file_creation_time(fullpath),
|
||||
'wordCount': len(content.split()),
|
||||
'pageContent': content,
|
||||
'token_count_estimate': len(tokenize(content))
|
||||
}
|
||||
|
||||
write_to_server_documents(data, f"{slugify(filename)}-{data.get('id')}")
|
||||
move_source(parent_dir, f"{filename}{ext}")
|
||||
print(f"[SUCCESS]: {filename}{ext} converted & ready for embedding.\n")
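Each converter builds the same kind of record before handing it to `write_to_server_documents`. A reduced sketch of that record, assuming a hypothetical helper `build_record`; it leaves out the `url`, `published` and `token_count_estimate` fields so that it runs without the filesystem or `tiktoken`:

```python
from uuid import uuid4


def build_record(title, content):
    """Assemble a reduced document record similar to the converter output."""
    return {
        'id': str(uuid4()),
        'title': title,
        'description': "a custom file uploaded by the user.",
        # Splitting on whitespace counts words; len(content) would count characters.
        'wordCount': len(content.split()),
        'pageContent': content,
    }


record = build_record("notes.txt", "one two  three\n")
```

Calling `split()` with no argument treats runs of spaces and newlines as a single separator, so repeated whitespace does not inflate the count.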
|
12
collector/scripts/watch/filetypes.py
Normal file
|
@ -0,0 +1,12 @@
|
|||
from .convert.as_text import as_text
|
||||
from .convert.as_markdown import as_markdown
|
||||
from .convert.as_pdf import as_pdf
|
||||
from .convert.as_docx import as_docx, as_odt
|
||||
|
||||
FILETYPES = {
|
||||
'.txt': as_text,
|
||||
'.md': as_markdown,
|
||||
'.pdf': as_pdf,
|
||||
'.docx': as_docx,
|
||||
'.odt': as_odt,
|
||||
}
|
20
collector/scripts/watch/main.py
Normal file
|
@ -0,0 +1,20 @@
|
|||
import os
|
||||
from .filetypes import FILETYPES
|
||||
|
||||
RESERVED = ['__HOTDIR__.md']
|
||||
def watch_for_changes(directory):
|
||||
for raw_doc in os.listdir(directory):
|
||||
if os.path.isdir(f"{directory}/{raw_doc}") or raw_doc in RESERVED: continue
|
||||
|
||||
filename, fileext = os.path.splitext(raw_doc)
|
||||
if filename in ['.DS_Store'] or fileext == '': continue
|
||||
|
||||
if fileext not in FILETYPES.keys():
|
||||
print(f"{fileext} not a supported file type for conversion. Please remove from hot directory.")
|
||||
continue
|
||||
|
||||
FILETYPES[fileext](
|
||||
directory=directory,
|
||||
filename=filename,
|
||||
ext=fileext,
|
||||
)
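The watcher picks a converter by file extension through a plain dictionary lookup. A minimal sketch of that dispatch, assuming a hypothetical helper `pick_converter` and lambda placeholders in place of `as_text` and `as_pdf`:

```python
import os


def pick_converter(filename, converters):
    """Return the converter registered for filename's extension, or None."""
    _, ext = os.path.splitext(filename)
    return converters.get(ext)


# Hypothetical converters standing in for as_text / as_pdf.
converters = {
    ".txt": lambda path: f"text:{path}",
    ".pdf": lambda path: f"pdf:{path}",
}
```

`os.path.splitext` keeps the extension's case, so a file named `REPORT.PDF` would not match the `.pdf` key unless the extension is lowercased first.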
|
30
collector/scripts/watch/utils.py
Normal file
|
@ -0,0 +1,30 @@
|
|||
import os, json
|
||||
from datetime import datetime
|
||||
from uuid import uuid4
|
||||
|
||||
def guid():
|
||||
return str(uuid4())
|
||||
|
||||
def file_creation_time(path_to_file):
|
||||
try:
|
||||
if os.name == 'nt':
|
||||
return datetime.fromtimestamp(os.path.getctime(path_to_file)).strftime('%Y-%m-%d %H:%M:%S')
|
||||
else:
|
||||
stat = os.stat(path_to_file)
|
||||
return datetime.fromtimestamp(stat.st_birthtime).strftime('%Y-%m-%d %H:%M:%S')
|
||||
except AttributeError:
|
||||
return datetime.today().strftime('%Y-%m-%d %H:%M:%S')
|
||||
|
||||
def move_source(working_dir='hotdir', new_destination_filename= ''):
|
||||
destination = f"{working_dir}/processed"
|
||||
if os.path.exists(destination) == False:
|
||||
os.mkdir(destination)
|
||||
|
||||
os.replace(f"{working_dir}/{new_destination_filename}", f"{destination}/{new_destination_filename}")
|
||||
return
|
||||
|
||||
def write_to_server_documents(data, filename):
|
||||
destination = f"../server/documents/custom-documents"
|
||||
if os.path.exists(destination) == False: os.makedirs(destination)
|
||||
with open(f"{destination}/{filename}.json", 'w', encoding='utf-8') as file:
|
||||
json.dump(data, file, ensure_ascii=True, indent=4)
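`st_birthtime` is only available on some platforms (macOS and the BSDs, for example), which is why `file_creation_time` guards against `AttributeError`. A simplified sketch of the same idea, assuming a hypothetical helper `creation_time` that falls back to the modification time rather than today's date; that fallback is a choice made for this example, not the behaviour of the function above:

```python
import os
import tempfile
from datetime import datetime


def creation_time(path):
    """Return a 'YYYY-MM-DD HH:MM:SS' timestamp for the file at path."""
    stat = os.stat(path)
    # Use the birth time where the platform exposes it, else the
    # modification time (an assumption of this sketch).
    ts = getattr(stat, "st_birthtime", stat.st_mtime)
    return datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')


with tempfile.NamedTemporaryFile() as tmp:
    stamp = creation_time(tmp.name)
```

Using `getattr` with a default avoids the `try`/`except` while still handling platforms without a birth time.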
|
55
collector/scripts/youtube.py
Normal file
|
@ -0,0 +1,55 @@
|
|||
import os, json
|
||||
from youtube_transcript_api import YouTubeTranscriptApi
|
||||
from youtube_transcript_api.formatters import TextFormatter, JSONFormatter
|
||||
from .utils import tokenize, ada_v2_cost
|
||||
from .yt_utils import fetch_channel_video_information, get_channel_id, clean_text, append_meta, get_duration
|
||||
from alive_progress import alive_it
|
||||
|
||||
# Example Channel URL https://www.youtube.com/channel/UCmWbhBB96ynOZuWG7LfKong
|
||||
# Example Channel URL https://www.youtube.com/@mintplex
|
||||
|
||||
def youtube():
|
||||
channel_link = input("Paste in the URL of a YouTube channel: ")
|
||||
channel_id = get_channel_id(channel_link)
|
||||
|
||||
if channel_id == None or len(channel_id) == 0:
|
||||
print("Invalid input - must be full YouTube channel URL")
|
||||
exit(1)
|
||||
|
||||
channel_data = fetch_channel_video_information(channel_id)
|
||||
transaction_output_dir = f"../server/documents/youtube-{channel_data.get('channelTitle')}"
|
||||
|
||||
if os.path.isdir(transaction_output_dir) == False:
|
||||
os.makedirs(transaction_output_dir)
|
||||
|
||||
print(f"\nFetching transcripts for {len(channel_data.get('items'))} videos - please wait.\nStopping and restarting will not refetch known transcripts in case there is an error.\nSaving results to: {transaction_output_dir}.")
|
||||
totalTokenCount = 0
|
||||
for video in alive_it(channel_data.get('items')):
|
||||
video_file_path = transaction_output_dir + f"/video-{video.get('id')}.json"
|
||||
if os.path.exists(video_file_path) == True:
|
||||
continue
|
||||
|
||||
formatter = TextFormatter()
|
||||
json_formatter = JSONFormatter()
|
||||
try:
|
||||
transcript = YouTubeTranscriptApi.get_transcript(video.get('id'))
|
||||
raw_text = clean_text(formatter.format_transcript(transcript))
|
||||
duration = get_duration(json_formatter.format_transcript(transcript))
|
||||
|
||||
if(len(raw_text) > 0):
|
||||
fullText = append_meta(video, duration, raw_text)
|
||||
tokenCount = len(tokenize(fullText))
|
||||
video['pageContent'] = fullText
|
||||
video['token_count_estimate'] = tokenCount
|
||||
totalTokenCount += tokenCount
|
||||
with open(video_file_path, 'w', encoding='utf-8') as file:
|
||||
json.dump(video, file, ensure_ascii=True, indent=4)
|
||||
except Exception:
|
||||
print("There was an issue getting the transcription of a video in the list - likely because captions are disabled. Skipping")
|
||||
continue
|
||||
|
||||
print(f"[Success]: {len(channel_data.get('items'))} video transcripts fetched!")
|
||||
print(f"\n\n////////////////////////////")
|
||||
print(f"The estimated cost to embed all of this data using OpenAI's text-embedding-ada-002 model at $0.0004 / 1K tokens is {ada_v2_cost(totalTokenCount)} for {totalTokenCount} tokens.")
|
||||
print(f"////////////////////////////\n\n")
|
||||
exit(0)
|
120
collector/scripts/yt_utils.py
Normal file
|
@ -0,0 +1,120 @@
|
|||
import json, requests, os, re
|
||||
from slugify import slugify
|
||||
from dotenv import load_dotenv
|
||||
load_dotenv()
|
||||
|
||||
def is_yt_short(videoId):
|
||||
url = 'https://www.youtube.com/shorts/' + videoId
|
||||
ret = requests.head(url)
|
||||
return ret.status_code == 200
|
||||
|
||||
def get_channel_id(channel_link):
|
||||
if('@' in channel_link):
|
||||
pattern = r'https?://www\.youtube\.com/(@\w+)/?'
|
||||
match = re.match(pattern, channel_link)
|
||||
if match is None: return None
|
||||
handle = match.group(1)
|
||||
print('Need to map username to channelId - this can take a while sometimes.')
|
||||
response = requests.get(f"https://yt.lemnoslife.com/channels?handle={handle}", timeout=20)
|
||||
|
||||
if(response.ok == False):
|
||||
print("Handle => ChannelId mapping endpoint is too slow - use regular youtube.com/channel URL")
|
||||
return None
|
||||
|
||||
json_data = response.json()
|
||||
return json_data.get('items')[0].get('id')
|
||||
else:
|
||||
pattern = r"youtube\.com/channel/([\w-]+)"
|
||||
match = re.search(pattern, channel_link)
|
||||
return match.group(1) if match else None
|
||||
|
||||
|
||||
def clean_text(text):
|
||||
return re.sub(r"\[.*?\]", "", text)
|
||||
|
||||
def append_meta(video, duration, text):
|
||||
meta = {
|
||||
'youtubeURL': f"https://youtube.com/watch?v={video.get('id')}",
|
||||
'thumbnail': video.get('thumbnail'),
|
||||
'description': video.get('description'),
|
||||
'createdAt': video.get('published'),
|
||||
'videoDurationInSeconds': duration,
|
||||
}
|
||||
return "Video JSON Metadata:\n"+json.dumps(meta, indent=4)+"\n\n\nAudio Transcript:\n" + text
|
||||
|
||||
def get_duration(json_str):
|
||||
data = json.loads(json_str)
|
||||
return data[-1].get('start')
|
||||
|
||||
def fetch_channel_video_information(channel_id, windowSize = 50):
|
||||
if channel_id == None or len(channel_id) == 0:
|
||||
print("No channel id provided!")
|
||||
exit(1)
|
||||
|
||||
if os.path.isdir("./outputs/channel-logs") == False:
|
||||
os.makedirs("./outputs/channel-logs")
|
||||
|
||||
file_path = f"./outputs/channel-logs/channel-{channel_id}.json"
|
||||
if os.path.exists(file_path):
|
||||
with open(file_path, "r") as file:
|
||||
print(f"Returning cached data for channel {channel_id}. If you do not wish to use stored data then delete the file for this channel to allow refetching.")
|
||||
return json.load(file)
|
||||
|
||||
if(os.getenv('GOOGLE_APIS_KEY') == None):
|
||||
print("GOOGLE_APIS_KEY env variable not set!")
|
||||
exit(1)
|
||||
|
||||
done = False
|
||||
currentPage = None
|
||||
pageTokens = []
|
||||
items = []
|
||||
data = {
|
||||
'id': channel_id,
|
||||
}
|
||||
|
||||
print("Fetching first page of results...")
|
||||
while(done == False):
|
||||
url = f"https://www.googleapis.com/youtube/v3/search?key={os.getenv('GOOGLE_APIS_KEY')}&channelId={channel_id}&part=snippet,id&order=date&type=video&maxResults={windowSize}"
|
||||
if(currentPage != None):
|
||||
print(f"Fetching page {currentPage}")
|
||||
url += f"&pageToken={currentPage}"
|
||||
|
||||
req = requests.get(url)
|
||||
if(req.ok == False):
|
||||
print("Could not fetch channel_id items!")
|
||||
exit(1)
|
||||
|
||||
response = req.json()
|
||||
currentPage = response.get('nextPageToken')
|
||||
if currentPage in pageTokens:
|
||||
print('All pages iterated and logged!')
|
||||
done = True
|
||||
break
|
||||
|
||||
for item in response.get('items'):
|
||||
if 'id' in item and 'videoId' in item.get('id'):
|
||||
if is_yt_short(item.get('id').get('videoId')):
|
||||
print(f"Filtering out YT Short {item.get('id').get('videoId')}")
|
||||
continue
|
||||
|
||||
if data.get('channelTitle') is None:
|
||||
data['channelTitle'] = slugify(item.get('snippet').get('channelTitle'))
|
||||
|
||||
newItem = {
|
||||
'id': item.get('id').get('videoId'),
|
||||
'url': f"https://youtube.com/watch?v={item.get('id').get('videoId')}",
|
||||
'title': item.get('snippet').get('title'),
|
||||
'description': item.get('snippet').get('description'),
|
||||
'thumbnail': item.get('snippet').get('thumbnails').get('high').get('url'),
|
||||
'published': item.get('snippet').get('publishTime'),
|
||||
}
|
||||
items.append(newItem)
|
||||
|
||||
pageTokens.append(currentPage)
|
||||
|
||||
data['items'] = items
|
||||
with open(file_path, 'w+', encoding='utf-8') as json_file:
|
||||
json.dump(data, json_file, ensure_ascii=True, indent=2)
|
||||
print(f"{len(items)} videos found for channel {data.get('channelTitle')}. Saved to channel-logs/channel-{channel_id}.json")
|
||||
|
||||
return data
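Two of the helpers in this file are pure functions and easy to check in isolation: stripping bracketed cues such as `[Music]` from a transcript, and reading the start time of the last transcript segment. A minimal sketch, assuming hypothetical names `strip_bracketed` and `last_segment_start` and a transcript serialized as a JSON list of segments with `text`, `start` and `duration` keys:

```python
import json
import re


def strip_bracketed(text):
    """Remove bracketed cues like [Music] from transcript text."""
    return re.sub(r"\[.*?\]", "", text)


def last_segment_start(transcript_json):
    """Return the start time of the final segment, or None if there are none."""
    segments = json.loads(transcript_json)
    return segments[-1].get('start') if segments else None


# Hypothetical two-segment transcript for illustration.
sample = json.dumps([
    {"text": "hello", "start": 0.0, "duration": 2.5},
    {"text": "world", "start": 2.5, "duration": 3.0},
])
```

The start of the last segment understates the video length by that segment's own duration, so it is best read as an approximation.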
|
21
collector/watch.py
Normal file
|
@ -0,0 +1,21 @@
|
|||
import _thread, time
|
||||
from scripts.watch.main import watch_for_changes
|
||||
|
||||
a_list = []
|
||||
WATCH_DIRECTORY = "hotdir"
|
||||
def input_thread(a_list):
|
||||
input()
|
||||
a_list.append(True)
|
||||
|
||||
def main():
|
||||
_thread.start_new_thread(input_thread, (a_list,))
|
||||
print(f"Watching '{WATCH_DIRECTORY}/' for new files.\n\nUpload files into this directory while this script is running to convert them.\nPress enter or ctrl+c to exit script.")
|
||||
while not a_list:
|
||||
watch_for_changes(WATCH_DIRECTORY)
|
||||
time.sleep(1)
|
||||
|
||||
print("Stopping watching of hot directory.")
|
||||
exit(0)
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
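The script polls the hot directory once a second until a background thread records that the user pressed Enter. The same stop-on-signal loop can be sketched with `threading.Event` in place of the shared list. This swaps in a different mechanism from the `_thread` approach above, and `poll_until_stopped` and the `work` callback are hypothetical names:

```python
import threading


def poll_until_stopped(work, stop_event, interval=1.0):
    """Call work() repeatedly until stop_event is set; return the call count."""
    calls = 0
    while not stop_event.is_set():
        work()
        calls += 1
        # wait() returns early if the event is set during the pause.
        stop_event.wait(interval)
    return calls


stop = threading.Event()
seen = []


def work():
    seen.append(len(seen))
    if len(seen) == 3:
        stop.set()  # stand-in for the user pressing Enter


total = poll_until_stopped(work, stop, interval=0.01)
```

Waiting on the event instead of calling `time.sleep` means the loop exits as soon as the stop signal arrives, not after the remainder of the pause.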
|
1
frontend/.env.production
Normal file
|
@ -0,0 +1 @@
|
|||
GENERATE_SOURCEMAP=false
|
15
frontend/.eslintrc.cjs
Normal file
|
@ -0,0 +1,15 @@
|
|||
module.exports = {
|
||||
env: { browser: true, es2020: true },
|
||||
extends: [
|
||||
'eslint:recommended',
|
||||
'plugin:react/recommended',
|
||||
'plugin:react/jsx-runtime',
|
||||
'plugin:react-hooks/recommended',
|
||||
],
|
||||
parserOptions: { ecmaVersion: 'latest', sourceType: 'module' },
|
||||
settings: { react: { version: '18.2' } },
|
||||
plugins: ['react-refresh'],
|
||||
rules: {
|
||||
'react-refresh/only-export-components': 'warn',
|
||||
},
|
||||
}
|
25
frontend/.gitignore
vendored
Normal file
|
@ -0,0 +1,25 @@
|
|||
# Logs
|
||||
logs
|
||||
*.log
|
||||
npm-debug.log*
|
||||
yarn-debug.log*
|
||||
yarn-error.log*
|
||||
pnpm-debug.log*
|
||||
lerna-debug.log*
|
||||
|
||||
node_modules
|
||||
dist
|
||||
dist-ssr
|
||||
*.local
|
||||
|
||||
# Editor directories and files
|
||||
.vscode/*
|
||||
!.vscode/extensions.json
|
||||
.idea
|
||||
.DS_Store
|
||||
*.suo
|
||||
*.ntvs*
|
||||
*.njsproj
|
||||
*.sln
|
||||
*.sw?
|
||||
bundleinspector.html
|
1
frontend/.nvmrc
Normal file
|
@ -0,0 +1 @@
|
|||
v18.12.1
|
36
frontend/index.html
Normal file
|
@ -0,0 +1,36 @@
|
|||
<!DOCTYPE html>
|
||||
<html lang="en">
|
||||
|
||||
<head>
|
||||
<meta charset="UTF-8" />
|
||||
<link rel="icon" type="image/x-icon" href="/favicon.ico" />
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
||||
<title>AnythingLLM | Your personal LLM trained on anything</title>
|
||||
|
||||
<meta name="title" content="AnythingLLM | Your personal LLM trained on anything">
|
||||
<meta name="description" content="AnythingLLM | Your personal LLM trained on anything">
|
||||
|
||||
<!-- Facebook -->
|
||||
<meta property="og:type" content="website">
|
||||
<meta property="og:url" content="https://anything-llm.ai">
|
||||
<meta property="og:title" content="AnythingLLM | Your personal LLM trained on anything">
|
||||
<meta property="og:description" content="AnythingLLM | Your personal LLM trained on anything">
|
||||
<meta property="og:image" content="https://anything-llm.ai/promo.png">
|
||||
|
||||
<!-- Twitter -->
|
||||
<meta property="twitter:card" content="summary_large_image">
|
||||
<meta property="twitter:url" content="https://anything-llm.ai">
|
||||
<meta property="twitter:title" content="AnythingLLM | Your personal LLM trained on anything">
|
||||
<meta property="twitter:description" content="AnythingLLM | Your personal LLM trained on anything">
|
||||
<meta property="twitter:image" content="https://anything-llm.ai/promo.png">
|
||||
|
||||
<link rel="icon" href="/favicon.ico" />
|
||||
<link rel="apple-touch-icon" href="/favicon.ico" />
|
||||
</head>
|
||||
|
||||
<body>
|
||||
<div id="root" class="h-screen"></div>
|
||||
<script type="module" src="/src/main.jsx"></script>
|
||||
</body>
|
||||
|
||||
</html>
|
7
frontend/jsconfig.json
Normal file
|
@ -0,0 +1,7 @@
|
|||
{
|
||||
"compilerOptions": {
|
||||
"module": "commonjs",
|
||||
"target": "esnext",
|
||||
"jsx": "react"
|
||||
}
|
||||
}
|
50
frontend/package.json
Normal file
|
@ -0,0 +1,50 @@
{
  "name": "anything-llm-frontend",
  "private": false,
  "version": "0.1.0",
  "type": "module",
  "scripts": {
    "start": "vite --open",
    "build": "vite build",
    "lint": "yarn prettier --write ./src",
    "preview": "vite preview"
  },
  "dependencies": {
    "@esbuild-plugins/node-globals-polyfill": "^0.1.1",
    "@metamask/jazzicon": "^2.0.0",
    "@react-oauth/google": "^0.11.0",
    "buffer": "^6.0.3",
    "email-validator": "^2.0.4",
    "he": "^1.2.0",
    "js-file-download": "^0.4.12",
    "moment-timezone": "^0.5.43",
    "pluralize": "^8.0.0",
    "react": "^18.2.0",
    "react-confetti-explosion": "^2.1.2",
    "react-device-detect": "^2.2.2",
    "react-dom": "^18.2.0",
    "react-drag-drop-files": "^2.3.7",
    "react-feather": "^2.0.10",
    "react-loading-skeleton": "^3.1.0",
    "react-router-dom": "^6.3.0",
    "react-type-animation": "^3.0.1",
    "text-case": "^1.0.9",
    "truncate": "^3.0.0",
    "uuid": "^9.0.0"
  },
  "devDependencies": {
    "@types/react": "^18.0.28",
    "@types/react-dom": "^18.0.11",
    "@vitejs/plugin-react": "^4.0.0-beta.0",
    "autoprefixer": "^10.4.14",
    "eslint": "^8.38.0",
    "eslint-plugin-react": "^7.32.2",
    "eslint-plugin-react-hooks": "^4.6.0",
    "eslint-plugin-react-refresh": "^0.3.4",
    "postcss": "^8.4.23",
    "prettier": "^2.4.1",
    "rollup-plugin-visualizer": "^5.9.0",
    "tailwindcss": "^3.3.1",
    "vite": "^4.3.0"
  }
}
|
7
frontend/postcss.config.js
Normal file
|
@ -0,0 +1,7 @@
import tailwind from 'tailwindcss'
import autoprefixer from 'autoprefixer'
import tailwindConfig from './tailwind.config.js'

export default {
  plugins: [tailwind(tailwindConfig), autoprefixer],
}
|
BIN
frontend/public/favicon.ico
Normal file
Binary file not shown.
After Width: 100px | Height: 100px | Size: 3.6 KiB |
BIN
frontend/public/fonts/AvenirNext.ttf
Normal file
Binary file not shown.
19
frontend/src/App.jsx
Normal file
|
@ -0,0 +1,19 @@
import React, { lazy, Suspense } from "react";
import { Routes, Route } from "react-router-dom";
import { ContextWrapper } from "./AuthContext";

const Main = lazy(() => import("./pages/Main"));
const WorkspaceChat = lazy(() => import("./pages/WorkspaceChat"));

export default function App() {
  return (
    <Suspense fallback={<div />}>
      <ContextWrapper>
        <Routes>
          <Route path="/" element={<Main />} />
          <Route path="/workspace/:slug" element={<WorkspaceChat />} />
        </Routes>
      </ContextWrapper>
    </Suspense>
  );
}
|
30
frontend/src/AuthContext.jsx
Normal file
|
@ -0,0 +1,30 @@
import React, { useState, createContext } from "react";

export const AuthContext = createContext(null);
export function ContextWrapper(props) {
  const localUser = localStorage.getItem("anythingllm_user");
  const localAuthToken = localStorage.getItem("anythingllm_authToken");
  const [store, setStore] = useState({
    user: localUser ? JSON.parse(localUser) : null,
    authToken: localAuthToken ? localAuthToken : null,
  });

  const [actions] = useState({
    updateUser: (user, authToken = "") => {
      localStorage.setItem("anythingllm_user", JSON.stringify(user));
      localStorage.setItem("anythingllm_authToken", authToken);
      setStore({ user, authToken });
    },
    unsetUser: () => {
      localStorage.removeItem("anythingllm_user");
      localStorage.removeItem("anythingllm_authToken");
      setStore({ user: null, authToken: null });
    },
  });

  return (
    <AuthContext.Provider value={{ store, actions }}>
      {props.children}
    </AuthContext.Provider>
  );
}
|
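The `ContextWrapper` above seeds its state from `localStorage` and calls `JSON.parse` on the stored user directly, so a corrupted entry would throw during the first render. A minimal sketch of a more defensive read follows. It assumes nothing beyond the two storage keys used in the commit; `loadStoredAuth` and `makeMemoryStorage` are hypothetical helpers that are not part of the repository.

```javascript
// Hypothetical helper (not in the commit): read the persisted session from any
// Storage-like object exposing getItem(), tolerating malformed JSON.
function loadStoredAuth(storage) {
  const rawUser = storage.getItem("anythingllm_user");
  const rawToken = storage.getItem("anythingllm_authToken");

  let user = null;
  if (rawUser) {
    try {
      user = JSON.parse(rawUser);
    } catch (err) {
      // A corrupted entry is treated as "no user" instead of crashing.
      user = null;
    }
  }

  return { user, authToken: rawToken || null };
}

// Hypothetical in-memory stand-in for window.localStorage, so the helper can
// be exercised outside a browser.
function makeMemoryStorage(initial = {}) {
  const data = new Map(Object.entries(initial));
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => data.set(key, String(value)),
    removeItem: (key) => data.delete(key),
  };
}
```

In the component, the result of `loadStoredAuth(localStorage)` could be passed to `useState` as the initial store; whether that fits the project's intent is a judgment call for the maintainers.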
254
frontend/src/components/DefaultChat/index.jsx
Normal file
|
@ -0,0 +1,254 @@
|
|||
import React, { useEffect, useState } from "react";
|
||||
import { GitHub, GitMerge, Mail, Plus } from "react-feather";
|
||||
import NewWorkspaceModal, {
|
||||
useNewWorkspaceModal,
|
||||
} from "../Modals/NewWorkspace";
|
||||
|
||||
export default function DefaultChatContainer() {
|
||||
const [mockMsgs, setMockMessages] = useState([]);
|
||||
const {
|
||||
showing: showingNewWsModal,
|
||||
showModal: showNewWsModal,
|
||||
hideModal: hideNewWsModal,
|
||||
} = useNewWorkspaceModal();
|
||||
const popMsg = !window.localStorage.getItem("anythingllm_intro");
|
||||
|
||||
const MESSAGES = [
|
||||
<React.Fragment>
|
||||
<div
|
||||
className={`flex w-full mt-2 justify-start ${
|
||||
popMsg ? "chat__message" : ""
|
||||
}`}
|
||||
>
|
||||
<div className="p-4 max-w-[75%] bg-orange-100 dark:bg-stone-700 rounded-b-2xl rounded-tr-2xl rounded-tl-sm">
|
||||
<p className="text-slate-800 dark:text-slate-200 font-semibold">
|
||||
Welcome to AnythingLLM. AnythingLLM is an open-source AI tool by
|
||||
Mintplex Labs that turns <i>anything</i> into a trained chatbot you
|
||||
can query and chat with. AnythingLLM is a BYOK (bring-your-own-keys)
|
||||
software so there is no subscription, fee, or charges for this
|
||||
software outside of the services you want to use with it.
|
||||
</p>
|
||||
</div>
|
||||
</div>
|
||||
</React.Fragment>,
|
||||
|
||||
<React.Fragment>
|
||||
<div
|
||||
className={`flex w-full mt-2 justify-start ${
|
||||
popMsg ? "chat__message" : ""
|
||||
}`}
|
||||
>
|
||||
<div className="p-4 max-w-[75%] bg-orange-100 dark:bg-stone-700 rounded-b-2xl rounded-tr-2xl rounded-tl-sm">
|
||||
<p className="text-slate-800 dark:text-slate-200 font-semibold">
|
||||
AnythingLLM is the easiest way to put powerful AI products like
|
||||
OpenAI, GPT-4, LangChain, PineconeDB, ChromaDB, and other services
|
||||
together in a neat package with no fuss to increase your
|
||||
productivity by 100x.
|
||||
</p>
|
||||
</div>
|
||||
</div>
|
||||
</React.Fragment>,
|
||||
|
||||
<React.Fragment>
|
||||
<div
|
||||
className={`flex w-full mt-2 justify-start ${
|
||||
popMsg ? "chat__message" : ""
|
||||
}`}
|
||||
>
|
||||
<div className="p-4 max-w-[75%] bg-orange-100 dark:bg-stone-700 rounded-b-2xl rounded-tr-2xl rounded-tl-sm">
|
||||
<p className="text-slate-800 dark:text-slate-200 font-semibold">
|
||||
AnythingLLM can run totally locally on your machine with so little
overhead you won't even notice it's there! No GPU needed. Cloud and
on-premises installation is available as well.
|
||||
<br />
|
||||
The AI tooling ecosystem gets more powerful every day. AnythingLLM
|
||||
makes it easy to use.
|
||||
</p>
|
||||
<a
|
||||
href=""
|
||||
className="mt-4 w-fit flex flex-grow gap-x-2 py-[5px] px-4 border border-slate-400 rounded-lg text-slate-800 dark:text-slate-200 justify-start items-center hover:bg-slate-100 dark:hover:bg-stone-900 dark:bg-stone-900"
|
||||
>
|
||||
<GitMerge className="h-4 w-4" />
|
||||
<p className="text-slate-800 dark:text-slate-200 text-lg leading-loose">
|
||||
Create an issue on GitHub
|
||||
</p>
|
||||
</a>
|
||||
</div>
|
||||
</div>
|
||||
</React.Fragment>,
|
||||
|
||||
<React.Fragment>
|
||||
<div
|
||||
className={`flex w-full mt-2 justify-end ${
|
||||
popMsg ? "chat__message" : ""
|
||||
}`}
|
||||
>
|
||||
<div className="p-4 max-w-[75%] bg-slate-200 dark:bg-amber-800 rounded-b-2xl rounded-tl-2xl rounded-tr-sm">
|
||||
<p className="text-slate-800 dark:text-slate-200 font-semibold">
|
||||
How do I get started?!
|
||||
</p>
|
||||
</div>
|
||||
</div>
|
||||
</React.Fragment>,
|
||||
|
||||
<React.Fragment>
|
||||
<div
|
||||
className={`flex w-full mt-2 justify-start ${
|
||||
popMsg ? "chat__message" : ""
|
||||
}`}
|
||||
>
|
||||
<div className="p-4 max-w-[75%] bg-orange-100 dark:bg-stone-700 rounded-b-2xl rounded-tr-2xl rounded-tl-sm">
|
||||
<p className="text-slate-800 dark:text-slate-200 font-semibold">
|
||||
It's simple. All collections are organized into buckets we call{" "}
|
||||
<b>"Workspaces"</b>. Workspaces are buckets of files, documents,
|
||||
images, PDFs, and other files which will be transformed into
|
||||
something LLMs can understand and use in conversation.
|
||||
<br />
|
||||
<br />
|
||||
You can add and remove files at any time.
|
||||
</p>
|
||||
<button
|
||||
onClick={showNewWsModal}
|
||||
className="mt-4 w-fit flex flex-grow gap-x-2 py-[5px] px-4 border border-slate-400 rounded-lg text-slate-800 dark:text-slate-200 justify-start items-center hover:bg-slate-100 dark:hover:bg-stone-900 dark:bg-stone-900"
|
||||
>
|
||||
<Plus className="h-4 w-4" />
|
||||
<p className="text-slate-800 dark:text-slate-200 text-lg leading-loose">
|
||||
Create your first workspace
|
||||
</p>
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
</React.Fragment>,
|
||||
|
||||
<React.Fragment>
|
||||
<div
|
||||
className={`flex w-full mt-2 justify-end ${
|
||||
popMsg ? "chat__message" : ""
|
||||
}`}
|
||||
>
|
||||
<div className="p-4 max-w-[75%] bg-slate-200 dark:bg-amber-800 rounded-b-2xl rounded-tl-2xl rounded-tr-sm">
|
||||
<p className="text-slate-800 dark:text-slate-200 font-semibold">
|
||||
Is this like an AI Dropbox or something? What about chatting? It is
|
||||
a chatbot, isn't it?
|
||||
</p>
|
||||
</div>
|
||||
</div>
|
||||
</React.Fragment>,
|
||||
|
||||
<React.Fragment>
|
||||
<div
|
||||
className={`flex w-full mt-2 justify-start ${
|
||||
popMsg ? "chat__message" : ""
|
||||
}`}
|
||||
>
|
||||
<div className="p-4 max-w-[75%] bg-orange-100 dark:bg-stone-700 rounded-b-2xl rounded-tr-2xl rounded-tl-sm">
|
||||
<p className="text-slate-800 dark:text-slate-200 font-semibold">
|
||||
AnythingLLM is more than a smarter Dropbox.
|
||||
<br />
|
||||
<br />
|
||||
AnythingLLM offers two ways of talking with your data:
|
||||
<br />
|
||||
<br />
|
||||
<i>Query:</i> Your chats will return data or inferences found with
|
||||
the documents in your workspace it has access to. Adding more
|
||||
documents to the Workspace makes it smarter!
|
||||
<br />
|
||||
<br />
|
||||
<i>Conversational:</i> Your documents + your on-going chat history
|
||||
both contribute to the LLM knowledge at the same time. Great for
|
||||
appending real-time text-based info or corrections and
|
||||
misunderstandings the LLM might have.
|
||||
<br />
|
||||
<br />
|
||||
You can toggle between either mode <i>in the middle of chatting!</i>
|
||||
</p>
|
||||
</div>
|
||||
</div>
|
||||
</React.Fragment>,
|
||||
|
||||
<React.Fragment>
|
||||
<div
|
||||
className={`flex w-full mt-2 justify-end ${
|
||||
popMsg ? "chat__message" : ""
|
||||
}`}
|
||||
>
|
||||
<div className="p-4 max-w-[75%] bg-slate-200 dark:bg-amber-800 rounded-b-2xl rounded-tl-2xl rounded-tr-sm">
|
||||
<p className="text-slate-800 dark:text-slate-200 font-semibold">
|
||||
Wow, this sounds amazing, let me try it out already!
|
||||
</p>
|
||||
</div>
|
||||
</div>
|
||||
</React.Fragment>,
|
||||
|
||||
<React.Fragment>
|
||||
<div
|
||||
className={`flex w-full mt-2 justify-start ${
|
||||
popMsg ? "chat__message" : ""
|
||||
}`}
|
||||
>
|
||||
<div className="p-4 max-w-[75%] bg-orange-100 dark:bg-stone-700 rounded-b-2xl rounded-tr-2xl rounded-tl-sm">
|
||||
<p className="text-slate-800 dark:text-slate-200 font-semibold">
|
||||
Have Fun!
|
||||
</p>
|
||||
<div className="flex items-center gap-x-4">
|
||||
<a
|
||||
href=""
|
||||
className="mt-4 w-fit flex flex-grow gap-x-2 py-[5px] px-4 border border-slate-400 rounded-lg text-slate-800 dark:text-slate-200 justify-start items-center hover:bg-slate-100 dark:hover:bg-stone-900 dark:bg-stone-900"
|
||||
>
|
||||
<GitHub className="h-4 w-4" />
|
||||
<p className="text-slate-800 dark:text-slate-200 text-lg leading-loose">
|
||||
Star on GitHub
|
||||
</p>
|
||||
</a>
|
||||
<a
|
||||
href=""
|
||||
className="mt-4 w-fit flex flex-grow gap-x-2 py-[5px] px-4 border border-slate-400 rounded-lg text-slate-800 dark:text-slate-200 justify-start items-center hover:bg-slate-100 dark:hover:bg-stone-900 dark:bg-stone-900"
|
||||
>
|
||||
<Mail className="h-4 w-4" />
|
||||
<p className="text-slate-800 dark:text-slate-200 text-lg leading-loose">
|
||||
Contact Mintplex Labs
|
||||
</p>
|
||||
</a>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</React.Fragment>,
|
||||
];
|
||||
|
||||
useEffect(() => {
|
||||
function processMsgs() {
|
||||
if (!!window.localStorage.getItem("anythingllm_intro")) {
|
||||
setMockMessages([...MESSAGES]);
|
||||
return false;
|
||||
} else {
|
||||
setMockMessages([MESSAGES[0]]);
|
||||
}
|
||||
|
||||
var timer = 500;
|
||||
var messages = [];
|
||||
|
||||
MESSAGES.forEach((child) => {
|
||||
setTimeout(() => {
|
||||
setMockMessages([...messages, child]);
|
||||
messages.push(child);
|
||||
}, timer);
|
||||
timer += 2_500;
|
||||
});
|
||||
window.localStorage.setItem("anythingllm_intro", 1);
|
||||
}
|
||||
|
||||
processMsgs();
|
||||
}, []);
|
||||
|
||||
return (
|
||||
<div
|
||||
style={{ height: "calc(100% - 32px)" }}
|
||||
className="transition-all duration-500 relative ml-[2px] mr-[8px] my-[16px] rounded-[26px] bg-white dark:bg-black-900 min-w-[82%] p-[18px] h-full overflow-y-scroll"
|
||||
>
|
||||
{mockMsgs.map((content, i) => {
|
||||
return <React.Fragment key={i}>{content}</React.Fragment>;
|
||||
})}
|
||||
{showingNewWsModal && <NewWorkspaceModal hideModal={hideNewWsModal} />}
|
||||
</div>
|
||||
);
|
||||
}
|
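The effect in `DefaultChatContainer` reveals the intro messages one at a time: the first after 500 ms, and each later one 2,500 ms after the previous. The sketch below isolates that timing so it can be checked without React. `revealSchedule` and `scheduleReveal` are hypothetical names introduced here for illustration, and the injectable `schedule` parameter is an assumption added to make the logic testable.

```javascript
// Hypothetical helper: delay (in ms) at which each of `count` messages appears,
// using the same 500 ms start and 2,500 ms step as the component's effect.
function revealSchedule(count, firstDelayMs = 500, stepMs = 2500) {
  return Array.from({ length: count }, (_, i) => firstDelayMs + i * stepMs);
}

// Hypothetical helper: schedule one callback per message. Each callback
// receives every message revealed so far, mirroring setMockMessages([...]).
// `schedule` defaults to setTimeout; its return values are handed back so a
// caller could clear pending timers on unmount.
function scheduleReveal(items, onReveal, schedule = setTimeout) {
  const delays = revealSchedule(items.length);
  return items.map((_, i) =>
    schedule(() => onReveal(items.slice(0, i + 1)), delays[i])
  );
}
```

Returning the timer handles is a design choice of this sketch, not something the commit does. It would let a cleanup function call `clearTimeout` on each handle if the component unmounts mid-animation.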
163
frontend/src/components/Modals/Keys.jsx
Normal file
|
@ -0,0 +1,163 @@
|
|||
import React, { useState, useEffect } from "react";
|
||||
import { X } from "react-feather";
|
||||
import System from "../../models/system";
|
||||
|
||||
const noop = () => false;
|
||||
export default function KeysModal({ hideModal = noop }) {
|
||||
const [loading, setLoading] = useState(true);
|
||||
const [settings, setSettings] = useState({});
|
||||
|
||||
useEffect(() => {
|
||||
async function fetchKeys() {
|
||||
const settings = await System.keys();
|
||||
setSettings(settings);
|
||||
setLoading(false);
|
||||
}
|
||||
fetchKeys();
|
||||
}, []);
|
||||
|
||||
const allSettingsValid =
|
||||
!!settings && Object.values(settings).every((val) => !!val);
|
||||
return (
|
||||
<div className="fixed top-0 left-0 right-0 z-50 w-full p-4 overflow-x-hidden overflow-y-auto md:inset-0 h-[calc(100%-1rem)] h-full bg-black bg-opacity-50 flex items-center justify-center">
|
||||
<div
|
||||
className="flex fixed top-0 left-0 right-0 w-full h-full"
|
||||
onClick={hideModal}
|
||||
/>
|
||||
<div className="relative w-full max-w-2xl max-h-full">
|
||||
<div className="relative bg-white rounded-lg shadow dark:bg-stone-700">
|
||||
<div className="flex items-start justify-between p-4 border-b rounded-t dark:border-gray-600">
|
||||
<h3 className="text-xl font-semibold text-gray-900 dark:text-white">
|
||||
Your System Settings
|
||||
</h3>
|
||||
<button
|
||||
onClick={hideModal}
|
||||
type="button"
|
||||
className="text-gray-400 bg-transparent hover:bg-gray-200 hover:text-gray-900 rounded-lg text-sm p-1.5 ml-auto inline-flex items-center dark:hover:bg-gray-600 dark:hover:text-white"
|
||||
data-modal-hide="staticModal"
|
||||
>
|
||||
<X className="text-gray-300 text-lg" />
|
||||
</button>
|
||||
</div>
|
||||
<div className="p-6 space-y-6 flex h-full w-full">
|
||||
{loading ? (
|
||||
<div className="w-full h-full flex items-center justify-center">
|
||||
<p className="text-gray-800 dark:text-gray-200 text-base">
|
||||
loading system settings
|
||||
</p>
|
||||
</div>
|
||||
) : (
|
||||
<div className="w-full flex flex-col gap-y-4">
|
||||
{allSettingsValid ? (
|
||||
<div className="bg-green-300 p-4 rounded-lg border border-green-600 text-green-700 w-full">
|
||||
<p>All system settings are defined. You are good to go!</p>
|
||||
</div>
|
||||
) : (
|
||||
<div className="bg-red-300 p-4 rounded-lg border border-red-600 text-red-700 w-full text-sm">
|
||||
<p>
|
||||
ENV settings are missing - this software will not
|
||||
function fully.
|
||||
<br />
|
||||
After updating, restart the server.
|
||||
</p>
|
||||
</div>
|
||||
)}
|
||||
<ShowKey
|
||||
name="OpenAI API Key"
|
||||
value={settings?.OpenAiKey ? "*".repeat(20) : ""}
|
||||
valid={!!settings?.OpenAiKey}
|
||||
/>
|
||||
<ShowKey
|
||||
name="OpenAI Model for chats"
|
||||
value={settings?.OpenAiModelPref}
|
||||
valid={!!settings?.OpenAiModelPref}
|
||||
/>
|
||||
<div className="h-[2px] w-full bg-gray-200 dark:bg-stone-600" />
|
||||
<ShowKey
|
||||
name="Pinecone DB API Key"
|
||||
value={settings?.PineConeKey ? "*".repeat(20) : ""}
|
||||
valid={!!settings?.PineConeKey}
|
||||
/>
|
||||
<ShowKey
|
||||
name="Pinecone DB Environment"
|
||||
value={settings?.PineConeEnvironment}
|
||||
valid={!!settings?.PineConeEnvironment}
|
||||
/>
|
||||
<ShowKey
|
||||
name="Pinecone DB Index"
|
||||
value={settings?.PinceConeIndex}
|
||||
valid={!!settings?.PinceConeIndex}
|
||||
/>
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
<div className="flex items-center p-6 space-x-2 border-t border-gray-200 rounded-b dark:border-gray-600">
|
||||
<button
|
||||
onClick={hideModal}
|
||||
type="button"
|
||||
className="text-gray-500 bg-white hover:bg-gray-100 focus:ring-4 focus:outline-none focus:ring-blue-300 rounded-lg border border-gray-200 text-sm font-medium px-5 py-2.5 hover:text-gray-900 focus:z-10 dark:bg-gray-700 dark:text-gray-300 dark:border-gray-500 dark:hover:text-white dark:hover:bg-gray-600 dark:focus:ring-gray-600"
|
||||
>
|
||||
Close
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
function ShowKey({ name, value, valid }) {
|
||||
if (!valid) {
|
||||
return (
|
||||
<div>
|
||||
<label
|
||||
htmlFor="error"
|
||||
className="block mb-2 text-sm font-medium text-red-700 dark:text-red-500"
|
||||
>
|
||||
{name}
|
||||
</label>
|
||||
<input
|
||||
type="text"
|
||||
id="error"
|
||||
disabled={true}
|
||||
className="bg-red-50 border border-red-500 text-red-900 placeholder-red-700 text-sm rounded-lg focus:ring-red-500 dark:bg-gray-700 focus:border-red-500 block w-full p-2.5 dark:text-red-500 dark:placeholder-red-500 dark:border-red-500"
|
||||
placeholder={name}
|
||||
defaultValue={value}
|
||||
/>
|
||||
<p className="mt-2 text-sm text-red-600 dark:text-red-500">
|
||||
Needs to be set in the .env file.
|
||||
</p>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
return (
|
||||
<div className="mb-6">
|
||||
<label
|
||||
htmlFor="success"
|
||||
className="block mb-2 text-sm font-medium text-gray-800 dark:text-slate-200"
|
||||
>
|
||||
{name}
|
||||
</label>
|
||||
<input
|
||||
type="text"
|
||||
id="success"
|
||||
disabled={true}
|
||||
className="border border-white text-green-900 dark:text-green-400 placeholder-green-700 dark:placeholder-green-500 text-sm rounded-lg focus:ring-green-500 focus:border-green-500 block w-full p-2.5 dark:bg-gray-700 dark:border-green-500"
|
||||
defaultValue={value}
|
||||
/>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
export function useKeysModal() {
|
||||
const [showing, setShowing] = useState(false);
|
||||
const showModal = () => {
|
||||
setShowing(true);
|
||||
};
|
||||
const hideModal = () => {
|
||||
setShowing(false);
|
||||
};
|
||||
|
||||
return { showing, showModal, hideModal };
|
||||
}
|
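`KeysModal` treats the system as configured only when every value returned by `System.keys()` is truthy, and it masks secret values before display. The sketch below restates that check as plain functions and adds a way to list which entries are missing. `allSettingsValid` as a function, `missingSettings`, and `maskSecret` are hypothetical names introduced here. The setting keys are copied from the component as written, including its `PinceConeIndex` spelling.

```javascript
// Mirrors the component's check: a non-null settings object whose values are
// all truthy. Note that an empty object also passes, as it does in the commit.
function allSettingsValid(settings) {
  return !!settings && Object.values(settings).every((val) => !!val);
}

// Hypothetical helper: names of the settings that are unset or falsy.
function missingSettings(settings) {
  if (!settings) return [];
  return Object.entries(settings)
    .filter(([, val]) => !val)
    .map(([key]) => key);
}

// Hypothetical helper: show a fixed-width mask instead of the real secret,
// matching the "*".repeat(20) used for the API keys in the modal.
function maskSecret(value, width = 20) {
  return value ? "*".repeat(width) : "";
}
```

A list of missing keys could, for example, drive a more specific error banner than the generic "ENV settings are missing" message; that use is a suggestion, not something the commit implements.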
476
frontend/src/components/Modals/ManageWorkspace.jsx
Normal file
|
@ -0,0 +1,476 @@
|
|||
import React, { useState, useEffect } from "react";
|
||||
import {
|
||||
FileMinus,
|
||||
FilePlus,
|
||||
Folder,
|
||||
FolderMinus,
|
||||
FolderPlus,
|
||||
X,
|
||||
Zap,
|
||||
} from "react-feather";
|
||||
import System from "../../models/system";
|
||||
import Workspace from "../../models/workspace";
|
||||
import { nFormatter, dollarFormat } from "../../utils/numbers";
|
||||
import paths from "../../utils/paths";
|
||||
import { useParams } from "react-router-dom";
|
||||
|
||||
const noop = () => false;
|
||||
export default function ManageWorkspace({ hideModal = noop, workspace }) {
|
||||
const { slug } = useParams();
|
||||
const [loading, setLoading] = useState(true);
|
||||
const [saving, setSaving] = useState(false);
|
||||
const [showConfirmation, setShowConfirmation] = useState(false);
|
||||
const [directories, setDirectories] = useState(null);
|
||||
const [originalDocuments, setOriginalDocuments] = useState([]);
|
||||
const [selectedFiles, setSelectFiles] = useState([]);
|
||||
|
||||
useEffect(() => {
|
||||
async function fetchKeys() {
|
||||
const _workspace = await Workspace.bySlug(workspace.slug);
|
||||
const localFiles = await System.localFiles();
|
||||
const originalDocs = (_workspace?.documents || []).map((doc) => doc.docpath);
|
||||
setDirectories(localFiles);
|
||||
setOriginalDocuments([...originalDocs]);
|
||||
setSelectFiles([...originalDocs]);
|
||||
setLoading(false);
|
||||
}
|
||||
fetchKeys();
|
||||
}, []);
|
||||
|
||||
const deleteWorkspace = async () => {
|
||||
if (
|
||||
!window.confirm(
|
||||
`You are about to delete your entire ${workspace.name} workspace. This will remove all vector embeddings on your vector database.\n\nThe original source files will remain untouched. This action is irreversible.`
|
||||
)
|
||||
)
|
||||
return false;
|
||||
await Workspace.delete(workspace.slug);
|
||||
workspace.slug === slug
|
||||
? (window.location = paths.home())
|
||||
: window.location.reload();
|
||||
};
|
||||
|
||||
const docChanges = () => {
|
||||
const changes = {
|
||||
adds: [],
|
||||
deletes: [],
|
||||
};
|
||||
|
||||
selectedFiles.forEach((doc) => {
|
||||
const inOriginal = !!originalDocuments.find((oDoc) => oDoc === doc);
|
||||
if (!inOriginal) {
|
||||
changes.adds.push(doc);
|
||||
}
|
||||
});
|
||||
|
||||
originalDocuments.forEach((doc) => {
|
||||
const selected = !!selectedFiles.find((oDoc) => oDoc === doc);
|
||||
if (!selected) {
|
||||
changes.deletes.push(doc);
|
||||
}
|
||||
});
|
||||
|
||||
return changes;
|
||||
};
|
||||
|
||||
const confirmChanges = (e) => {
|
||||
e.preventDefault();
|
||||
const changes = docChanges();
|
||||
changes.adds.length > 0 ? setShowConfirmation(true) : updateWorkspace(e);
|
||||
};
|
||||
|
||||
const updateWorkspace = async (e) => {
|
||||
e.preventDefault();
|
||||
setSaving(true);
|
||||
setShowConfirmation(false);
|
||||
const changes = docChanges();
|
||||
await Workspace.modifyEmbeddings(workspace.slug, changes);
|
||||
setSaving(false);
|
||||
window.location.reload();
|
||||
};
|
||||
|
||||
const isSelected = (filepath) => {
|
||||
const isFolder = !filepath.includes("/");
|
||||
return isFolder
|
||||
? selectedFiles.some((doc) => doc.includes(filepath.split("/")[0]))
|
||||
: selectedFiles.some((doc) => doc.includes(filepath));
|
||||
};
|
||||
|
||||
const toggleSelection = (filepath) => {
|
||||
const isFolder = !filepath.includes("/");
|
||||
const parent = isFolder ? filepath : filepath.split("/")[0];
|
||||
|
||||
if (isSelected(filepath)) {
|
||||
const updatedDocs = isFolder
|
||||
? selectedFiles.filter((doc) => !doc.includes(parent))
|
||||
: selectedFiles.filter((doc) => !doc.includes(filepath));
|
||||
setSelectFiles([...new Set(updatedDocs)]);
|
||||
} else {
|
||||
var newDocs = [];
|
||||
if (isFolder) {
|
||||
const folderItems = directories.items.find(
|
||||
(item) => item.name === parent
|
||||
).items;
|
||||
newDocs = folderItems.map((item) => parent + "/" + item.name);
|
||||
} else {
|
||||
newDocs = [filepath];
|
||||
}
|
||||
|
||||
const combined = [...selectedFiles, ...newDocs];
|
||||
setSelectFiles([...new Set(combined)]);
|
||||
}
|
||||
};
|
||||
|
||||
if (loading) {
|
||||
return (
|
||||
<div className="fixed top-0 left-0 right-0 z-50 w-full p-4 overflow-x-hidden overflow-y-auto md:inset-0 h-[calc(100%-1rem)] h-full bg-black bg-opacity-50 flex items-center justify-center">
|
||||
<div
|
||||
className="flex fixed top-0 left-0 right-0 w-full h-full"
|
||||
onClick={hideModal}
|
||||
/>
|
||||
<div className="relative w-full max-w-2xl max-h-full">
|
||||
<div className="relative bg-white rounded-lg shadow dark:bg-stone-700">
|
||||
<div className="flex items-start justify-between p-4 border-b rounded-t dark:border-gray-600">
|
||||
<h3 className="text-xl font-semibold text-gray-900 dark:text-white">
|
||||
{workspace.name} Settings
|
||||
</h3>
|
||||
<button
|
||||
onClick={hideModal}
|
||||
type="button"
|
||||
className="text-gray-400 bg-transparent hover:bg-gray-200 hover:text-gray-900 rounded-lg text-sm p-1.5 ml-auto inline-flex items-center dark:hover:bg-gray-600 dark:hover:text-white"
|
||||
data-modal-hide="staticModal"
|
||||
>
|
||||
<X className="text-gray-300 text-lg" />
|
||||
</button>
|
||||
</div>
|
||||
<div className="p-6 flex h-full w-full max-h-[80vh] overflow-y-scroll">
|
||||
<div className="flex flex-col gap-y-1 w-full">
|
||||
<p className="text-slate-200 dark:text-stone-300 text-center">
|
||||
loading workspace files
|
||||
</p>
|
||||
</div>
|
||||
</div>
|
||||
<div className="flex items-center p-6 space-x-2 border-t border-gray-200 rounded-b dark:border-gray-600"></div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
return (
|
||||
<>
|
||||
{showConfirmation && (
|
||||
<ConfirmationModal
|
||||
directories={directories}
|
||||
hideConfirm={() => setShowConfirmation(false)}
|
||||
additions={docChanges().adds}
|
||||
updateWorkspace={updateWorkspace}
|
||||
/>
|
||||
)}
|
||||
<div className="fixed top-0 left-0 right-0 z-50 w-full p-4 overflow-x-hidden overflow-y-auto md:inset-0 h-[calc(100%-1rem)] h-full bg-black bg-opacity-50 flex items-center justify-center">
|
||||
<div
|
||||
className="flex fixed top-0 left-0 right-0 w-full h-full"
|
||||
onClick={hideModal}
|
||||
/>
|
||||
<div className="relative w-full max-w-2xl max-h-full">
|
||||
<div className="relative bg-white rounded-lg shadow dark:bg-stone-700">
|
||||
<div className="flex items-start justify-between p-4 border-b rounded-t dark:border-gray-600">
|
||||
<h3 className="text-xl font-semibold text-gray-900 dark:text-white">
|
||||
"{workspace.name}" workspace settings
|
||||
</h3>
|
||||
<button
|
||||
onClick={hideModal}
|
||||
type="button"
|
||||
className="text-gray-400 bg-transparent hover:bg-gray-200 hover:text-gray-900 rounded-lg text-sm p-1.5 ml-auto inline-flex items-center dark:hover:bg-gray-600 dark:hover:text-white"
|
||||
data-modal-hide="staticModal"
|
||||
>
|
||||
<X className="text-gray-300 text-lg" />
|
||||
</button>
|
||||
</div>
|
||||
<div className="p-6 flex h-full w-full max-h-[80vh] overflow-y-scroll">
|
||||
<div className="flex flex-col gap-y-1 w-full">
|
||||
<div className="flex flex-col mb-2">
|
||||
<p className="text-gray-800 dark:text-stone-200 text-base ">
|
||||
Select folders to add or remove from workspace.
|
||||
</p>
|
||||
<p className="text-gray-800 dark:text-stone-400 text-xs italic">
|
||||
{selectedFiles.length} documents in workspace selected.
|
||||
</p>
|
||||
</div>
|
||||
<div className="w-full h-auto border border-slate-200 dark:border-stone-600 rounded-lg px-4 py-2">
|
||||
{!!directories && (
|
||||
<Directory
|
||||
files={directories}
|
||||
toggleSelection={toggleSelection}
|
||||
isSelected={isSelected}
|
||||
/>
|
||||
)}
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div className="flex items-center justify-between p-6 space-x-2 border-t border-gray-200 rounded-b dark:border-gray-600">
|
||||
<button
|
||||
onClick={deleteWorkspace}
|
||||
type="button"
|
||||
className="border border-transparent text-gray-500 bg-white hover:bg-red-100 rounded-lg text-sm font-medium px-5 py-2.5 hover:text-red-900 focus:z-10 dark:bg-transparent dark:text-gray-300 dark:hover:text-white dark:hover:bg-red-600"
|
||||
>
|
||||
Delete Workspace
|
||||
</button>
|
||||
<div className="flex items-center">
|
||||
<button
|
||||
disabled={saving}
|
||||
onClick={confirmChanges}
|
||||
type="submit"
|
||||
className="text-slate-200 bg-black-900 px-4 py-2 rounded-lg hover:bg-gray-900"
|
||||
>
|
||||
{saving ? "Saving..." : "Confirm Changes"}
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</>
|
||||
);
|
||||
}
|
||||
|
||||
function Directory({
|
||||
files,
|
||||
parent = null,
|
||||
nested = 0,
|
||||
toggleSelection,
|
||||
isSelected,
|
||||
}) {
|
||||
const [isExpanded, toggleExpanded] = useState(false);
|
||||
const [showDetails, toggleDetails] = useState(false);
|
||||
const [showZap, setShowZap] = useState(false);
|
||||
|
||||
if (files.type === "folder") {
|
||||
return (
|
||||
<div style={{ marginLeft: nested }} className="mb-2">
|
||||
<div
|
||||
className={`flex items-center hover:bg-gray-100 gap-x-2 text-gray-800 dark:text-stone-200 dark:hover:bg-stone-800 px-2 rounded-lg`}
|
||||
>
|
||||
{files.items.some((files) => files.type === "folder") ? (
|
||||
<Folder className="w-6 h-6" />
|
||||
) : (
|
||||
<button onClick={() => toggleSelection(files.name)}>
|
||||
{isSelected(files.name) ? (
|
||||
<FolderMinus className="w-6 h-6 stroke-red-800 hover:fill-red-500" />
|
||||
) : (
|
||||
<FolderPlus className="w-6 h-6 hover:stroke-green-800 hover:fill-green-500" />
|
||||
)}
|
||||
</button>
|
||||
)}
|
||||
|
||||
<div
|
||||
className="flex gap-x-2 items-center cursor-pointer w-full"
|
||||
onClick={() => toggleExpanded(!isExpanded)}
|
||||
>
|
||||
<h2 className="text-2xl">{files.name}</h2>
|
||||
{files.items.some((files) => files.type === "folder") ? (
|
||||
<p className="text-xs italic">{files.items.length} folders</p>
|
||||
) : (
|
||||
<p className="text-xs italic">
|
||||
{files.items.length} documents |{" "}
|
||||
{nFormatter(
|
||||
files.items.reduce((a, b) => a + b.token_count_estimate, 0)
|
||||
)}{" "}
|
||||
tokens
|
||||
</p>
|
||||
)}
|
||||
</div>
|
||||
</div>
|
||||
{isExpanded &&
|
||||
files.items.map((item) => (
|
||||
<Directory
|
||||
key={item.name}
|
||||
parent={files.name}
|
||||
files={item}
|
||||
nested={nested + 20}
|
||||
toggleSelection={toggleSelection}
|
||||
isSelected={isSelected}
|
||||
/>
|
||||
))}
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
const { name, type: _type, ...meta } = files;
|
||||
return (
|
||||
<div className="ml-[20px] my-2">
|
||||
<div className="flex items-center">
|
||||
{meta?.cached && (
|
||||
<button
|
||||
type="button"
|
||||
onClick={() => setShowZap(true)}
|
||||
className="rounded-full p-1 hover:bg-stone-500 hover:bg-opacity-75"
|
||||
>
|
||||
<Zap className="h-4 w-4 stroke-yellow-500 fill-yellow-400" />
|
||||
</button>
|
||||
)}
|
||||
{showZap && (
|
||||
<dialog
|
||||
open={true}
|
||||
style={{ zIndex: 100 }}
|
||||
className="fixed top-0 flex bg-black bg-opacity-50 w-[100vw] h-full items-center justify-center "
|
||||
>
|
||||
<div className="w-fit px-10 py-4 w-[25%] rounded-lg bg-white shadow dark:bg-stone-700 text-black dark:text-slate-200">
|
||||
<div className="flex flex-col w-full">
|
||||
<p className="font-semibold text-xl flex items-center gap-x-1 justify-left">
|
||||
What does{" "}
|
||||
<Zap className="h-4 w-4 stroke-yellow-500 fill-yellow-400" />{" "}
|
||||
mean?
|
||||
</p>
|
||||
<p className="text-base mt-4">
|
||||
This symbol indicates that you have embedded this document before
|
||||
and will not have to pay to re-embed this document.
|
||||
</p>
|
||||
<div className="flex w-full justify-center items-center mt-4">
|
||||
<button
|
||||
onClick={() => setShowZap(false)}
|
||||
className="border border-gray-800 text-gray-800 hover:bg-gray-100 px-4 py-1 rounded-lg dark:text-slate-200 dark:border-slate-200 dark:hover:bg-stone-900"
|
||||
>
|
||||
Close
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</dialog>
|
||||
)}
|
||||
|
||||
<div
|
||||
className={`flex items-center gap-x-2 text-gray-800 dark:text-stone-200 hover:bg-gray-100 dark:hover:bg-stone-800 px-2 rounded-lg`}
|
||||
>
|
||||
<button onClick={() => toggleSelection(`${parent}/${name}`)}>
|
||||
{isSelected(`${parent}/${name}`) ? (
|
||||
<FileMinus className="w-6 h-6 stroke-red-800 hover:fill-red-500" />
|
||||
) : (
|
||||
<FilePlus className="w-6 h-6 hover:stroke-green-800 hover:fill-green-500" />
|
||||
)}
|
||||
</button>
|
||||
<div
|
||||
className="w-full items-center flex cursor-pointer"
|
||||
onClick={() => toggleDetails(!showDetails)}
|
||||
>
|
||||
<h3 className="text-sm">{name}</h3>
|
||||
<br />
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
{showDetails && (
|
||||
<div className="ml-[20px] flex flex-col gap-y-1 my-1 p-2 rounded-md bg-slate-200 font-mono text-sm overflow-x-scroll">
|
||||
{Object.entries(meta).map(([key, value]) => {
|
||||
if (key === "cached") return null;
|
||||
return (
|
||||
<p key={key} className="whitespace-pre">
|
||||
{key}: {value}
|
||||
</p>
|
||||
);
|
||||
})}
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
function ConfirmationModal({
|
||||
directories,
|
||||
hideConfirm,
|
||||
additions,
|
||||
updateWorkspace,
|
||||
}) {
|
||||
function estimateCosts() {
|
||||
const cachedTokens = additions.map((filepath) => {
|
||||
const [parent, filename] = filepath.split("/");
|
||||
const details = directories.items
|
||||
.find((folder) => folder.name === parent)
|
||||
.items.find((file) => file.name === filename);
|
||||
|
||||
const { token_count_estimate = 0, cached = false } = details;
|
||||
return cached ? token_count_estimate : 0;
|
||||
});
|
||||
const tokenEstimates = additions.map((filepath) => {
|
||||
const [parent, filename] = filepath.split("/");
|
||||
const details = directories.items
|
||||
.find((folder) => folder.name === parent)
|
||||
.items.find((file) => file.name === filename);
|
||||
|
||||
const { token_count_estimate = 0 } = details;
|
||||
return token_count_estimate;
|
||||
});
|
||||
|
||||
const totalTokens = tokenEstimates.reduce((a, b) => a + b, 0);
|
||||
const cachedTotal = cachedTokens.reduce((a, b) => a + b, 0);
|
||||
const dollarValue = 0.0004 * ((totalTokens - cachedTotal) / 1_000);
|
||||
|
||||
return {
|
||||
dollarValue,
|
||||
dollarText:
|
||||
dollarValue < 0.01 ? "< $0.01" : `about ${dollarFormat(dollarValue)}`,
|
||||
};
|
||||
}
|
||||
|
||||
const { dollarValue, dollarText } = estimateCosts();
|
||||
return (
|
||||
<dialog
|
||||
open={true}
|
||||
style={{ zIndex: 100 }}
|
||||
className="fixed top-0 flex bg-black bg-opacity-50 w-[100vw] h-full items-center justify-center "
|
||||
>
|
||||
<div className="w-fit px-10 p-4 min-w-1/2 rounded-lg bg-white shadow dark:bg-stone-700 text-black dark:text-slate-200">
|
||||
<div className="flex flex-col w-full">
|
||||
<p className="font-semibold">
|
||||
Are you sure you want to embed these documents?
|
||||
</p>
|
||||
|
||||
<div className="flex flex-col gap-y-1">
|
||||
{dollarValue <= 0 ? (
|
||||
<p className="text-base mt-4">
|
||||
You will be embedding {additions.length} new documents into this
|
||||
workspace.
|
||||
<br />
|
||||
This will not incur any costs for OpenAI credits.
|
||||
</p>
|
||||
) : (
|
||||
<p className="text-base mt-4">
|
||||
You will be embedding {additions.length} new documents into this
|
||||
workspace. <br />
|
||||
This will cost {dollarText} in OpenAI credits.
|
||||
</p>
|
||||
)}
|
||||
</div>
|
||||
|
||||
<div className="flex w-full justify-between items-center mt-4">
|
||||
<button
|
||||
onClick={hideConfirm}
|
||||
className="text-gray-800 hover:bg-gray-100 px-4 py-1 rounded-lg dark:text-slate-200 dark:hover:bg-stone-900"
|
||||
>
|
||||
Cancel
|
||||
</button>
|
||||
<button
|
||||
onClick={updateWorkspace}
|
||||
className="border border-gray-800 text-gray-800 hover:bg-gray-100 px-4 py-1 rounded-lg dark:text-slate-200 dark:border-slate-200 dark:hover:bg-stone-900"
|
||||
>
|
||||
Continue
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</dialog>
|
||||
);
|
||||
}
|
||||
|
||||
export function useManageWorkspaceModal() {
|
||||
const [showing, setShowing] = useState(false);
|
||||
const showModal = () => {
|
||||
setShowing(true);
|
||||
};
|
||||
const hideModal = () => {
|
||||
setShowing(false);
|
||||
};
|
||||
|
||||
return { showing, showModal, hideModal };
|
||||
}
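The cost estimate in `ConfirmationModal` boils down to: sum the token estimates of the selected files, skip the ones already cached, and multiply by a per-1K-token price. Below is a minimal, framework-free sketch of that arithmetic. The names `estimateEmbeddingCost` and `PRICE_PER_1K_TOKENS` are hypothetical and not part of this commit; the `0.0004` rate is copied from the component and is assumed to be a price per 1,000 tokens.

```javascript
// Hypothetical helper; mirrors the arithmetic of estimateCosts() without React.
// Assumption: 0.0004 is the dollar price per 1,000 tokens, as in the component.
const PRICE_PER_1K_TOKENS = 0.0004;

function estimateEmbeddingCost(files) {
  // files: [{ token_count_estimate?: number, cached?: boolean }]
  const billableTokens = files.reduce((sum, file) => {
    const { token_count_estimate = 0, cached = false } = file;
    // Cached documents were embedded before, so they add nothing to the bill.
    return cached ? sum : sum + token_count_estimate;
  }, 0);

  const dollarValue = PRICE_PER_1K_TOKENS * (billableTokens / 1000);
  return { billableTokens, dollarValue };
}

// Usage: one new file, one cached file, one file with no metadata.
const estimate = estimateEmbeddingCost([
  { token_count_estimate: 50000 },
  { token_count_estimate: 20000, cached: true },
  {},
]);
console.log(estimate.billableTokens, estimate.dollarValue);
```

Summing only the non-cached files gives the same result as the component's `totalTokens - cachedTotal`, but needs a single pass over the selection.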
|
104
frontend/src/components/Modals/NewWorkspace.jsx
Normal file
|
@ -0,0 +1,104 @@
|
|||
import React, { useRef, useState } from "react";
|
||||
import { X } from "react-feather";
|
||||
import Workspace from "../../models/workspace";
|
||||
|
||||
const noop = () => false;
|
||||
export default function NewWorkspaceModal({ hideModal = noop }) {
|
||||
const formEl = useRef(null);
|
||||
const [error, setError] = useState(null);
|
||||
const handleCreate = async (e) => {
|
||||
setError(null);
|
||||
e.preventDefault();
|
||||
const data = {};
|
||||
const form = new FormData(formEl.current);
|
||||
for (const [key, value] of form.entries()) data[key] = value;
|
||||
const { workspace, message } = await Workspace.new(data);
|
||||
if (workspace) return window.location.reload();
|
||||
setError(message);
|
||||
};
|
||||
|
||||
return (
|
||||
<div className="fixed top-0 left-0 right-0 z-50 w-full p-4 overflow-x-hidden overflow-y-auto md:inset-0 h-[calc(100%-1rem)] h-full bg-black bg-opacity-50 flex items-center justify-center">
|
||||
<div
|
||||
className="flex fixed top-0 left-0 right-0 w-full h-full"
|
||||
onClick={hideModal}
|
||||
/>
|
||||
<div className="relative w-full max-w-2xl max-h-full">
|
||||
<div className="relative bg-white rounded-lg shadow dark:bg-stone-700">
|
||||
<div className="flex items-start justify-between p-4 border-b rounded-t dark:border-gray-600">
|
||||
<h3 className="text-xl font-semibold text-gray-900 dark:text-white">
|
||||
Create a New Workspace
|
||||
</h3>
|
||||
<button
|
||||
onClick={hideModal}
|
||||
type="button"
|
||||
className="text-gray-400 bg-transparent hover:bg-gray-200 hover:text-gray-900 rounded-lg text-sm p-1.5 ml-auto inline-flex items-center dark:hover:bg-gray-600 dark:hover:text-white"
|
||||
data-modal-hide="staticModal"
|
||||
>
|
||||
<X className="text-gray-300 text-lg" />
|
||||
</button>
|
||||
</div>
|
||||
<form ref={formEl} onSubmit={handleCreate}>
|
||||
<div className="p-6 space-y-6 flex h-full w-full">
|
||||
<div className="w-full flex flex-col gap-y-4">
|
||||
<div>
|
||||
<label
|
||||
htmlFor="name"
|
||||
className="block mb-2 text-sm font-medium text-gray-900 dark:text-white"
|
||||
>
|
||||
Workspace Name
|
||||
</label>
|
||||
<input
|
||||
name="name"
|
||||
type="text"
|
||||
id="name"
|
||||
className="bg-gray-50 border border-gray-300 text-gray-900 text-sm rounded-lg focus:ring-blue-500 focus:border-blue-500 block w-full p-2.5 dark:bg-stone-600 dark:border-stone-600 dark:placeholder-gray-400 dark:text-white dark:focus:ring-blue-500 dark:focus:border-blue-500"
|
||||
placeholder="My Workspace"
|
||||
required={true}
|
||||
autoComplete="off"
|
||||
/>
|
||||
</div>
|
||||
{error && (
|
||||
<p className="text-red-600 dark:text-red-400 text-sm">
|
||||
Error: {error}
|
||||
</p>
|
||||
)}
|
||||
<p className="text-gray-800 dark:text-slate-200 text-sm">
|
||||
After creating a workspace you will be able to add and remove
|
||||
documents from it.
|
||||
</p>
|
||||
</div>
|
||||
</div>
|
||||
<div className="flex w-full justify-between items-center p-6 space-x-2 border-t border-gray-200 rounded-b dark:border-gray-600">
|
||||
<button
|
||||
onClick={hideModal}
|
||||
type="button"
|
||||
className="text-gray-800 hover:bg-gray-100 px-4 py-1 rounded-lg dark:text-slate-200 dark:hover:bg-stone-900"
|
||||
>
|
||||
Cancel
|
||||
</button>
|
||||
<button
|
||||
type="submit"
|
||||
className="text-gray-500 bg-white hover:bg-gray-100 focus:ring-4 focus:outline-none focus:ring-blue-300 rounded-lg border border-gray-200 text-sm font-medium px-5 py-2.5 hover:text-gray-900 focus:z-10 dark:bg-black dark:text-slate-200 dark:border-transparent dark:hover:text-slate-200 dark:hover:bg-gray-900 dark:focus:ring-gray-800"
|
||||
>
|
||||
Create Workspace
|
||||
</button>
|
||||
</div>
|
||||
</form>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
export function useNewWorkspaceModal() {
|
||||
const [showing, setShowing] = useState(false);
|
||||
const showModal = () => {
|
||||
setShowing(true);
|
||||
};
|
||||
const hideModal = () => {
|
||||
setShowing(false);
|
||||
};
|
||||
|
||||
return { showing, showModal, hideModal };
|
||||
}
|
82
frontend/src/components/Sidebar/ActiveWorkspaces/index.jsx
Normal file
|
@ -0,0 +1,82 @@
|
|||
import React, { useState, useEffect } from "react";
|
||||
import { Book, Settings } from "react-feather";
|
||||
import * as Skeleton from "react-loading-skeleton";
|
||||
import "react-loading-skeleton/dist/skeleton.css";
|
||||
import Workspace from "../../../models/workspace";
|
||||
import ManageWorkspace, {
|
||||
useManageWorkspaceModal,
|
||||
} from "../../Modals/ManageWorkspace";
|
||||
import paths from "../../../utils/paths";
|
||||
import { useParams } from "react-router-dom";
|
||||
|
||||
export default function ActiveWorkspaces() {
|
||||
const { slug } = useParams();
|
||||
const [loading, setLoading] = useState(true);
|
||||
const [workspaces, setWorkspaces] = useState([]);
|
||||
const [selectedWs, setSelectedWs] = useState(null);
|
||||
const { showing, showModal, hideModal } = useManageWorkspaceModal();
|
||||
|
||||
useEffect(() => {
|
||||
async function getWorkspaces() {
|
||||
const workspaces = await Workspace.all();
|
||||
setLoading(false);
|
||||
setWorkspaces(workspaces);
|
||||
}
|
||||
getWorkspaces();
|
||||
}, []);
|
||||
|
||||
if (loading) {
|
||||
return (
|
||||
<>
|
||||
<Skeleton.default
|
||||
height={36}
|
||||
width="100%"
|
||||
count={3}
|
||||
baseColor="#292524"
|
||||
highlightColor="#4c4948"
|
||||
enableAnimation={true}
|
||||
/>
|
||||
</>
|
||||
);
|
||||
}
|
||||
|
||||
return (
|
||||
<>
|
||||
{workspaces.map((workspace) => {
|
||||
const isActive = workspace.slug === slug;
|
||||
return (
|
||||
<div
|
||||
key={workspace.id}
|
||||
className="flex gap-x-2 items-center justify-between"
|
||||
>
|
||||
<a
|
||||
href={isActive ? null : paths.workspace.chat(workspace.slug)}
|
||||
className={`flex flex-grow w-[75%] h-[36px] gap-x-2 py-[5px] px-4 border border-slate-400 rounded-lg text-slate-800 dark:text-slate-200 justify-start items-center ${
|
||||
isActive
|
||||
? "bg-gray-100 dark:bg-stone-600"
|
||||
: "hover:bg-slate-100 dark:hover:bg-stone-900 "
|
||||
}`}
|
||||
>
|
||||
<Book className="h-4 w-4" />
|
||||
<p className="text-slate-800 dark:text-slate-200 text-xs leading-loose font-semibold">
|
||||
{workspace.name}
|
||||
</p>
|
||||
</a>
|
||||
<button
|
||||
onClick={() => {
|
||||
setSelectedWs(workspace);
|
||||
showModal();
|
||||
}}
|
||||
className="rounded-md bg-stone-200 p-2 h-[36px] w-[15%] flex items-center justify-center text-slate-800 hover:bg-stone-300 group dark:bg-stone-800 dark:text-slate-200 dark:hover:bg-stone-900 dark:border dark:border-stone-800"
|
||||
>
|
||||
<Settings className="h-3.5 w-3.5 transition-all duration-300 group-hover:rotate-90" />
|
||||
</button>
|
||||
</div>
|
||||
);
|
||||
})}
|
||||
{showing && !!selectedWs && (
|
||||
<ManageWorkspace hideModal={hideModal} workspace={selectedWs} />
|
||||
)}
|
||||
</>
|
||||
);
|
||||
}
|
34
frontend/src/components/Sidebar/IndexCount.jsx
Normal file
|
@ -0,0 +1,34 @@
|
|||
import pluralize from "pluralize";
|
||||
import React, { useEffect, useState } from "react";
|
||||
import System from "../../models/system";
|
||||
import { numberWithCommas } from "../../utils/numbers";
|
||||
|
||||
export default function IndexCount() {
|
||||
const [indexes, setIndexes] = useState(null);
|
||||
useEffect(() => {
|
||||
async function indexCount() {
|
||||
setIndexes(await System.totalIndexes());
|
||||
}
|
||||
indexCount();
|
||||
}, []);
|
||||
|
||||
if (indexes === null || indexes === 0) {
|
||||
return (
|
||||
<div className="flex w-full items-center justify-end gap-x-2">
|
||||
<div className="flex items-center gap-x-1 px-2 rounded-full">
|
||||
<p className="text-slate-400 leading-tight text-sm"></p>
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
return (
|
||||
<div className="flex w-full items-center justify-end gap-x-2">
|
||||
<div className="flex items-center gap-x-1 px-2 rounded-full">
|
||||
<p className="text-slate-400 leading-tight text-sm">
|
||||
{numberWithCommas(indexes)} {pluralize("index", indexes)}
|
||||
</p>
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
49
frontend/src/components/Sidebar/LLMStatus.jsx
Normal file
|
@ -0,0 +1,49 @@
|
|||
import React, { useEffect, useState } from "react";
|
||||
import { AlertCircle, Circle } from "react-feather";
|
||||
import System from "../../models/system";
|
||||
|
||||
export default function LLMStatus() {
|
||||
const [status, setStatus] = useState(null);
|
||||
useEffect(() => {
|
||||
async function checkPing() {
|
||||
setStatus(await System.ping());
|
||||
}
|
||||
checkPing();
|
||||
}, []);
|
||||
|
||||
if (status === null) {
|
||||
return (
|
||||
<div className="flex w-full items-center justify-start gap-x-2">
|
||||
<p className="text-slate-400 leading-loose text-sm">LLM</p>
|
||||
<div className="flex items-center gap-x-1 border border-slate-400 px-2 rounded-full">
|
||||
<p className="text-slate-400 leading-tight text-sm">unknown</p>
|
||||
<Circle className="h-3 w-3 stroke-slate-700 fill-slate-400 animate-pulse" />
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
// TODO: add modal or toast on click to identify why this is broken
|
||||
// need to likely start server.
|
||||
if (status === false) {
|
||||
return (
|
||||
<div className="flex w-full items-center justify-end gap-x-2">
|
||||
<p className="text-slate-400 leading-loose text-sm">LLM</p>
|
||||
<div className="flex items-center gap-x-1 border border-red-400 px-2 bg-red-200 rounded-full">
|
||||
<p className="text-red-700 leading-tight text-sm">offline</p>
|
||||
<AlertCircle className="h-3 w-3 stroke-red-100 fill-red-400" />
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
return (
|
||||
<div className="flex w-full items-center justify-end gap-x-2">
|
||||
<p className="text-slate-400 leading-loose text-sm">LLM</p>
|
||||
<div className="flex items-center gap-x-1 border border-slate-400 px-2 rounded-full">
|
||||
<p className="text-slate-400 leading-tight text-sm">online</p>
|
||||
<Circle className="h-3 w-3 stroke-green-100 fill-green-400 animate-pulse" />
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
133
frontend/src/components/Sidebar/index.jsx
Normal file
|
@ -0,0 +1,133 @@
|
|||
import React, { useRef } from "react";
|
||||
import { BookOpen, Briefcase, Cpu, GitHub, Key, Plus } from "react-feather";
|
||||
import IndexCount from "./IndexCount";
|
||||
import LLMStatus from "./LLMStatus";
|
||||
import KeysModal, { useKeysModal } from "../Modals/Keys";
|
||||
import NewWorkspaceModal, {
|
||||
useNewWorkspaceModal,
|
||||
} from "../Modals/NewWorkspace";
|
||||
import ActiveWorkspaces from "./ActiveWorkspaces";
|
||||
import paths from "../../utils/paths";
|
||||
|
||||
export default function Sidebar() {
|
||||
const sidebarRef = useRef(null);
|
||||
const {
|
||||
showing: showingKeyModal,
|
||||
showModal: showKeyModal,
|
||||
hideModal: hideKeyModal,
|
||||
} = useKeysModal();
|
||||
const {
|
||||
showing: showingNewWsModal,
|
||||
showModal: showNewWsModal,
|
||||
hideModal: hideNewWsModal,
|
||||
} = useNewWorkspaceModal();
|
||||
|
||||
// const handleWidthToggle = () => {
|
||||
// if (!sidebarRef.current) return false;
|
||||
// sidebarRef.current.classList.add('translate-x-[-100%]')
|
||||
// }
|
||||
|
||||
return (
|
||||
<>
|
||||
<div
|
||||
ref={sidebarRef}
|
||||
style={{ height: "calc(100% - 32px)" }}
|
||||
className="transition-all duration-500 relative m-[16px] rounded-[26px] bg-white dark:bg-black-900 min-w-[15.5%] p-[18px] "
|
||||
>
|
||||
{/* <button onClick={handleWidthToggle} className='absolute -right-[13px] top-[35%] bg-white w-auto h-auto bg-transparent flex items-center'>
|
||||
<svg width="16" height="96" viewBox="0 0 16 96" fill="none" xmlns="http://www.w3.org/2000/svg" stroke="#141414"><path d="M2.5 0H3C3 20 15 12 15 32V64C15 84 3 76 3 96H2.5V0Z" fill="black" fill-opacity="0.12" stroke="transparent" stroke-width="0px"></path><path d="M0 0H2.5C2.5 20 14.5 12 14.5 32V64C14.5 84 2.5 76 2.5 96H0V0Z" fill="#141414"></path></svg>
|
||||
<ChevronLeft className='absolute h-4 w-4 text-white mr-1' />
|
||||
</button> */}
|
||||
|
||||
<div className="w-full h-full flex flex-col overflow-x-hidden items-between">
|
||||
{/* Header Information */}
|
||||
<div className="flex w-full items-center justify-between">
|
||||
<p className="text-xl font-base text-slate-600 dark:text-slate-200">
|
||||
AnythingLLM
|
||||
</p>
|
||||
<div className="flex gap-x-2 items-center text-slate-500">
|
||||
<button
|
||||
onClick={showKeyModal}
|
||||
className="transition-all duration-300 p-2 rounded-full bg-slate-200 text-slate-400 dark:bg-stone-800 hover:bg-slate-800 hover:text-slate-200 dark:hover:text-slate-200"
|
||||
>
|
||||
<Key className="h-4 w-4 " />
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{/* Primary Body */}
|
||||
<div className="h-[100%] flex flex-col w-full justify-between pt-4 overflow-y-hidden">
|
||||
<div className="h-auto sidebar-items dark:sidebar-items">
|
||||
<div className="flex flex-col gap-y-4 h-[65vh] pb-8 overflow-y-scroll no-scroll">
|
||||
<div className="flex gap-x-2 items-center justify-between">
|
||||
<button
|
||||
onClick={showNewWsModal}
|
||||
className="flex flex-grow w-[75%] h-[36px] gap-x-2 py-[5px] px-4 border border-slate-400 rounded-lg text-slate-800 dark:text-slate-200 justify-start items-center hover:bg-slate-100 dark:hover:bg-stone-900"
|
||||
>
|
||||
<Plus className="h-4 w-4" />
|
||||
<p className="text-slate-800 dark:text-slate-200 text-xs leading-loose font-semibold">
|
||||
New workspace
|
||||
</p>
|
||||
</button>
|
||||
</div>
|
||||
<ActiveWorkspaces />
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div className="flex flex-col gap-y-2">
|
||||
<div className="w-full flex items-center justify-between">
|
||||
<LLMStatus />
|
||||
<IndexCount />
|
||||
</div>
|
||||
<a
|
||||
href=""
|
||||
className="flex flex-grow w-[100%] h-[36px] gap-x-2 py-[5px] px-4 border border-slate-400 dark:border-transparent rounded-lg text-slate-800 dark:text-slate-200 justify-center items-center hover:bg-slate-100 dark:bg-stone-800 dark:hover:bg-stone-900"
|
||||
>
|
||||
<Cpu className="h-4 w-4" />
|
||||
<p className="text-slate-800 dark:text-slate-200 text-xs leading-loose font-semibold">
|
||||
Managed cloud hosting
|
||||
</p>
|
||||
</a>
|
||||
<a
|
||||
href=""
|
||||
className="flex flex-grow w-[100%] h-[36px] gap-x-2 py-[5px] px-4 border border-slate-400 dark:border-transparent rounded-lg text-slate-800 dark:text-slate-200 justify-center items-center hover:bg-slate-100 dark:bg-stone-800 dark:hover:bg-stone-900"
|
||||
>
|
||||
<Briefcase className="h-4 w-4" />
|
||||
<p className="text-slate-800 dark:text-slate-200 text-xs leading-loose font-semibold">
|
||||
Enterprise Installation
|
||||
</p>
|
||||
</a>
|
||||
</div>
|
||||
|
||||
{/* Footer */}
|
||||
<div className="flex items-end justify-between mt-2">
|
||||
<div className="flex gap-x-1 items-center">
|
||||
<a
|
||||
href={paths.github()}
|
||||
className="transition-all duration-300 p-2 rounded-full bg-slate-200 text-slate-400 dark:bg-slate-800 hover:bg-slate-800 hover:text-slate-200 dark:hover:text-slate-200"
|
||||
>
|
||||
<GitHub className="h-4 w-4 " />
|
||||
</a>
|
||||
<a
|
||||
href={paths.docs()}
|
||||
className="transition-all duration-300 p-2 rounded-full bg-slate-200 text-slate-400 dark:bg-slate-800 hover:bg-slate-800 hover:text-slate-200 dark:hover:text-slate-200"
|
||||
>
|
||||
<BookOpen className="h-4 w-4 " />
|
||||
</a>
|
||||
</div>
|
||||
<a
|
||||
href={paths.mailToMintplex()}
|
||||
className="transition-all duration-300 text-xs text-slate-200 dark:text-slate-600 hover:text-blue-600 dark:hover:text-blue-400"
|
||||
>
|
||||
@MintplexLabs
|
||||
</a>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
{showingKeyModal && <KeysModal hideModal={hideKeyModal} />}
|
||||
{showingNewWsModal && <NewWorkspaceModal hideModal={hideNewWsModal} />}
|
||||
</>
|
||||
);
|
||||
}
|
27
frontend/src/components/UserIcon/index.jsx
Normal file
|
@ -0,0 +1,27 @@
|
|||
import React, { useRef, useEffect } from "react";
|
||||
import JAZZ from "@metamask/jazzicon";
|
||||
|
||||
export default function Jazzicon({ size = 10, user }) {
|
||||
const divRef = useRef(null);
|
||||
const seed = user?.uid
|
||||
? toPseudoRandomInteger(user.uid)
|
||||
: Math.floor(100000 + Math.random() * 900000);
|
||||
const result = JAZZ(size, seed);
|
||||
|
||||
useEffect(() => {
|
||||
if (!divRef || !divRef.current) return;
|
||||
|
||||
divRef.current.appendChild(result);
|
||||
}, []); // eslint-disable-line react-hooks/exhaustive-deps
|
||||
|
||||
return <div className="flex" ref={divRef} />;
|
||||
}
|
||||
|
||||
function toPseudoRandomInteger(uidString = "") {
|
||||
var numberArray = [];
|
||||
for (var i = 0; i < uidString.length; i++) {
|
||||
numberArray[i] = uidString.charCodeAt(i);
|
||||
}
|
||||
|
||||
return numberArray.reduce((a, b) => a + b, 0);
|
||||
}
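`toPseudoRandomInteger` turns a uid string into a Jazzicon seed by summing its character codes. A minimal sketch of the same idea without the intermediate array follows; the name `seedFromUid` is hypothetical and not part of this commit.

```javascript
// Hypothetical equivalent of toPseudoRandomInteger(): sum of UTF-16 code units.
function seedFromUid(uid = "") {
  let sum = 0;
  for (let i = 0; i < uid.length; i++) {
    sum += uid.charCodeAt(i);
  }
  return sum;
}

console.log(seedFromUid("user")); // 117 + 115 + 101 + 114 = 447
console.log(seedFromUid("ab") === seedFromUid("ba")); // true: anagrams collide
```

Because the sum ignores character order, two uids that are anagrams of each other get the same icon. That is probably acceptable for a decorative avatar, but it is worth knowing if icons are ever used to tell workspaces apart.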
|
|
@ -0,0 +1,106 @@
|
|||
import { useEffect, useRef, memo, useState } from "react";
|
||||
import { AlertTriangle } from "react-feather";
|
||||
import Jazzicon from "../../../../UserIcon";
|
||||
import { v4 } from "uuid";
|
||||
import { decode as HTMLDecode } from "he";
|
||||
|
||||
function HistoricalMessage({
|
||||
message,
|
||||
role,
|
||||
workspace,
|
||||
sources = [],
|
||||
error = false,
|
||||
}) {
|
||||
const replyRef = useRef(null);
|
||||
useEffect(() => {
|
||||
if (replyRef.current)
|
||||
replyRef.current.scrollIntoView({ behavior: "smooth", block: "end" });
|
||||
}, [replyRef.current]);
|
||||
|
||||
if (role === "user") {
|
||||
return (
|
||||
<div className="flex justify-end mb-4 items-start">
|
||||
<div className="mr-2 py-1 px-4 max-w-[75%] bg-slate-200 dark:bg-amber-800 rounded-b-2xl rounded-tl-2xl rounded-tr-sm">
|
||||
<span
|
||||
className={`inline-block p-2 rounded-lg whitespace-pre-line text-slate-800 dark:text-slate-200 font-semibold`}
|
||||
>
|
||||
{message}
|
||||
</span>
|
||||
</div>
|
||||
<Jazzicon size={30} user={{ uid: "user" }} />
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
if (error) {
|
||||
return (
|
||||
<div className="flex justify-start mb-4 items-end">
|
||||
<Jazzicon size={30} user={{ uid: workspace.slug }} />
|
||||
<div className="ml-2 max-w-[75%] bg-orange-100 dark:bg-stone-700 rounded-t-2xl rounded-br-2xl rounded-bl-sm">
|
||||
<span
|
||||
className={`inline-block p-2 rounded-lg bg-red-50 text-red-500`}
|
||||
>
|
||||
<AlertTriangle className="h-4 w-4 mb-1 inline-block" /> Could not
|
||||
respond to message.
|
||||
</span>
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
return (
|
||||
<div ref={replyRef} className="flex justify-start items-end mb-4">
|
||||
<Jazzicon size={30} user={{ uid: workspace.slug }} />
|
||||
<div className="ml-2 py-3 px-4 max-w-[75%] bg-orange-100 dark:bg-stone-700 rounded-t-2xl rounded-br-2xl rounded-bl-sm">
|
||||
<span className="whitespace-pre-line text-slate-800 dark:text-slate-200 font-semibold">
|
||||
{message}
|
||||
</span>
|
||||
<Citations sources={sources} />
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
const Citations = ({ sources = [] }) => {
|
||||
const [show, setShow] = useState(false);
|
||||
if (sources.length === 0) return null;
|
||||
|
||||
return (
|
||||
<div className="flex flex-col mt-4 justify-left">
|
||||
<button
|
||||
type="button"
|
||||
onClick={() => setShow(!show)}
|
||||
className="w-fit text-gray-700 dark:text-stone-400 italic text-xs"
|
||||
>
|
||||
{show ? "hide" : "show"} citations{show && "*"}
|
||||
</button>
|
||||
{show && (
|
||||
<>
|
||||
<div className="w-full flex flex-wrap items-center gap-4 mt-1 doc__source">
|
||||
{sources.map((source) => {
|
||||
const { id = null, title, url } = source;
|
||||
const handleClick = () => {
|
||||
if (!url) return false;
|
||||
window.open(url, "_blank");
|
||||
};
|
||||
return (
|
||||
<button
|
||||
key={id || v4()}
|
||||
onClick={handleClick}
|
||||
className="italic transition-all duration-300 w-fit bg-gray-400 text-gray-900 py-[1px] hover:text-slate-200 hover:bg-gray-500 hover:dark:text-gray-900 dark:bg-stone-400 dark:hover:bg-stone-300 rounded-full px-2 text-xs leading-tight"
|
||||
>
|
||||
"{HTMLDecode(title)}"
|
||||
</button>
|
||||
);
|
||||
})}
|
||||
</div>
|
||||
<p className="w-fit text-gray-700 dark:text-stone-400 text-xs mt-1">
|
||||
*citation may not be relevant to end result.
|
||||
</p>
|
||||
</>
|
||||
)}
|
||||
</div>
|
||||
);
|
||||
};
|
||||
|
||||
export default memo(HistoricalMessage);
|
|
@ -0,0 +1,112 @@
|
|||
import { memo, useEffect, useRef, useState } from "react";
|
||||
import { AlertTriangle } from "react-feather";
|
||||
import Jazzicon from "../../../../UserIcon";
|
||||
import { v4 } from "uuid";
import { decode as HTMLDecode } from "he";
|
||||
|
||||
function PromptReply({
|
||||
uuid,
|
||||
reply,
|
||||
pending,
|
||||
error,
|
||||
workspace,
|
||||
sources = [],
|
||||
closed = true,
|
||||
}) {
|
||||
const replyRef = useRef(null);
|
||||
useEffect(() => {
|
||||
if (replyRef.current)
|
||||
replyRef.current.scrollIntoView({ behavior: "smooth", block: "end" });
|
||||
}, [replyRef.current]);
|
||||
|
||||
if (!reply && sources.length === 0 && !pending && !error) return null;
|
||||
if (pending) {
|
||||
return (
|
||||
<div className="chat__message flex justify-start mb-4 items-end">
|
||||
<Jazzicon size={30} user={{ uid: workspace.slug }} />
|
||||
<div className="ml-2 pt-2 px-6 max-w-[75%] bg-orange-100 dark:bg-stone-700 rounded-t-2xl rounded-br-2xl rounded-bl-sm">
|
||||
<span className={`inline-block p-2`}>
|
||||
<div className="dot-falling"></div>
|
||||
</span>
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
if (error) {
|
||||
return (
|
||||
<div className="chat__message flex justify-start mb-4 items-center">
|
||||
<Jazzicon size={30} user={{ uid: workspace.slug }} />
|
||||
<div className="ml-2 py-3 px-4 rounded-br-3xl rounded-tr-3xl rounded-tl-xl text-slate-100 ">
|
||||
<div className="bg-red-50 text-red-500 rounded-lg w-fit flex flex-col p-2">
|
||||
<span className={`inline-block`}>
|
||||
<AlertTriangle className="h-4 w-4 mb-1 inline-block" /> Could not
|
||||
respond to message.
|
||||
</span>
|
||||
<span className="text-xs">Reason: {error || "unknown"}</span>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
return (
|
||||
<div
|
||||
key={uuid}
|
||||
ref={replyRef}
|
||||
className="chat__message mb-4 flex justify-start items-end"
|
||||
>
|
||||
<Jazzicon size={30} user={{ uid: workspace.slug }} />
|
||||
<div className="ml-2 py-3 px-4 max-w-[75%] bg-orange-100 dark:bg-stone-700 rounded-t-2xl rounded-br-2xl rounded-bl-sm">
|
||||
<p className="text-[15px] whitespace-pre-line break-words text-slate-800 dark:text-slate-200 font-semibold">
|
||||
{reply}
|
||||
{!closed && <i className="not-italic blink">|</i>}
|
||||
</p>
|
||||
<Citations sources={sources} />
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
const Citations = ({ sources = [] }) => {
|
||||
const [show, setShow] = useState(false);
|
||||
if (sources.length === 0) return null;
|
||||
|
||||
return (
|
||||
<div className="flex flex-col mt-4 justify-left">
|
||||
<button
|
||||
type="button"
|
||||
onClick={() => setShow(!show)}
|
||||
className="w-fit text-gray-700 dark:text-stone-400 italic text-xs"
|
||||
>
|
||||
{show ? "hide" : "show"} citations{show && "*"}
|
||||
</button>
|
||||
{show && (
|
||||
<>
|
||||
<div className="w-full flex flex-wrap items-center gap-4 mt-1 doc__source">
|
||||
{sources.map((source) => {
|
||||
const { id = null, title, url } = source;
|
||||
const handleClick = () => {
|
||||
if (!url) return false;
|
||||
window.open(url, "_blank");
|
||||
};
|
||||
return (
|
||||
<button
|
||||
key={id || v4()}
|
||||
onClick={handleClick}
|
||||
className="italic transition-all duration-300 w-fit bg-gray-400 text-gray-900 py-[1px] hover:text-slate-200 hover:bg-gray-500 hover:dark:text-gray-900 dark:bg-stone-400 dark:hover:bg-stone-300 rounded-full px-2 text-xs leading-tight"
|
||||
>
|
||||
"{HTMLDecode(title)}"
|
||||
</button>
|
||||
);
|
||||
})}
|
||||
</div>
|
||||
<p className="w-fit text-gray-700 dark:text-stone-400 text-xs mt-1">
|
||||
*citation may not be relevant to end result.
|
||||
</p>
|
||||
</>
|
||||
)}
|
||||
</div>
|
||||
);
|
||||
};
|
||||
|
||||
export default memo(PromptReply);
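A note on early-return guards such as the one at the top of `PromptReply`: in JavaScript, unary `!` binds tighter than `===`, so an expression of the form `!list.length === 0` compares a boolean with a number and is never true. The sketch below contrasts that form with the plain emptiness check; both function names are hypothetical and only illustrate the precedence rule.

```javascript
// Hypothetical illustration of operator precedence in an emptiness check.

// Parsed as (!list.length) === 0, i.e. boolean === number, which is always false.
function negatedLengthEqualsZero(list) {
  return !list.length === 0;
}

// The straightforward check: true only for an empty list.
function isEmpty(list) {
  return list.length === 0;
}

console.log(negatedLengthEqualsZero([]), negatedLengthEqualsZero([1])); // false false
console.log(isEmpty([]), isEmpty([1])); // true false
```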
|
|
@ -0,0 +1,70 @@
|
|||
import { Frown } from "react-feather";
|
||||
import HistoricalMessage from "./HistoricalMessage";
|
||||
import PromptReply from "./PromptReply";
|
||||
// import paths from '../../../../../utils/paths';
|
||||
|
||||
export default function ChatHistory({ history = [], workspace }) {
|
||||
if (history.length === 0) {
|
||||
return (
|
||||
<div className="flex flex-col h-[89%] md:mt-0 pb-5 w-full justify-center items-center">
|
||||
<div className="w-fit flex items-center gap-x-2">
|
||||
<Frown className="h-4 w-4 text-slate-400" />
|
||||
<p className="text-slate-400">No chat history found.</p>
|
||||
</div>
|
||||
<p className="text-slate-400 text-xs">
|
||||
Send your first message to get started.
|
||||
</p>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
return (
|
||||
<div
|
||||
className="h-[89%] pb-[100px] pt-[50px] md:pt-0 md:pb-5 mx-2 md:mx-0 overflow-y-scroll flex flex-col justify-between md:justify-start"
|
||||
id="chat-history"
|
||||
>
|
||||
{history.map(
|
||||
(
|
||||
{
|
||||
uuid = null,
|
||||
content,
|
||||
sources = [],
|
||||
role,
|
||||
closed = true,
|
||||
pending = false,
|
||||
error = false,
|
||||
animate = false,
|
||||
},
|
||||
index
|
||||
) => {
|
||||
const isLastBotReply =
|
||||
index === history.length - 1 && role === "assistant";
|
||||
if (isLastBotReply && animate) {
|
||||
return (
|
||||
<PromptReply
|
||||
key={uuid}
|
||||
uuid={uuid}
|
||||
reply={content}
|
||||
pending={pending}
|
||||
sources={sources}
|
||||
error={error}
|
||||
workspace={workspace}
|
||||
closed={closed}
|
||||
/>
|
||||
);
|
||||
}
|
||||
return (
|
||||
<HistoricalMessage
|
||||
key={index}
|
||||
message={content}
|
||||
role={role}
|
||||
workspace={workspace}
|
||||
sources={sources}
|
||||
error={error}
|
||||
/>
|
||||
);
|
||||
}
|
||||
)}
|
||||
</div>
|
||||
);
|
||||
}
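`ChatHistory` renders every entry as a `HistoricalMessage`, except for the last assistant entry while it is flagged `animate`, which goes to `PromptReply`. A minimal sketch of that decision as a pure function is shown below; `pickRenderer` is a hypothetical name, and it returns component names as strings purely for illustration.

```javascript
// Hypothetical helper mirroring the branch inside history.map() in ChatHistory.
function pickRenderer(entry, index, historyLength) {
  const { role, animate = false } = entry;
  const isLastBotReply = index === historyLength - 1 && role === "assistant";
  return isLastBotReply && animate ? "PromptReply" : "HistoricalMessage";
}

const history = [
  { role: "user", content: "hello" },
  { role: "assistant", content: "hi", animate: true },
];
console.log(history.map((entry, i) => pickRenderer(entry, i, history.length)));
```

Keeping the choice in one function makes it easy to see that an older assistant message never animates, even if its `animate` flag was left set.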
|
|
@ -0,0 +1,106 @@
|
|||
import React, { useState, useRef } from "react";
|
||||
import { Loader, Menu, Send, X } from "react-feather";
|
||||
|
||||
export default function PromptInput({
|
||||
workspace,
|
||||
message,
|
||||
submit,
|
||||
onChange,
|
||||
inputDisabled,
|
||||
buttonDisabled,
|
||||
}) {
|
||||
const [showMenu, setShowMenu] = useState(false);
|
||||
const formRef = useRef(null);
|
||||
const [_, setFocused] = useState(false);
|
||||
const handleSubmit = (e) => {
|
||||
setFocused(false);
|
||||
submit(e);
|
||||
};
|
||||
const captureEnter = (event) => {
|
||||
if (event.keyCode === 13) {
|
||||
if (!event.shiftKey) {
|
||||
submit(event);
|
||||
}
|
||||
}
|
||||
};
|
||||
const adjustTextArea = (event) => {
|
||||
const element = event.target;
|
||||
element.style.height = "1px";
|
||||
element.style.height =
|
||||
event.target.value.length !== 0
|
||||
? 25 + element.scrollHeight + "px"
|
||||
: "1px";
|
||||
};
|
||||
|
||||
const setTextCommand = (command = "") => {
|
||||
onChange({ target: { value: `${command} ${message}` } });
|
||||
};
|
||||
|
||||
return (
|
||||
<div className="w-full fixed md:absolute bottom-0 left-0">
|
||||
<form
|
||||
onSubmit={handleSubmit}
|
||||
className="flex flex-col gap-y-1 bg-transparent rounded-t-lg w-3/4 mx-auto"
|
||||
>
|
||||
<div className="flex items-center py-2 px-4 rounded-lg">
|
||||
{/* Toggle selector? */}
|
||||
{/* <button
|
||||
onClick={() => setShowMenu(!showMenu)}
|
||||
type="button"
|
||||
className="p-2 text-slate-200 bg-transparent rounded-md hover:bg-gray-50 dark:hover:bg-stone-500">
|
||||
<Menu className="w-4 h-4 md:h-6 md:w-6" />
|
||||
</button> */}
|
||||
<textarea
|
||||
onKeyUp={adjustTextArea}
|
||||
onKeyDown={captureEnter}
|
||||
onChange={onChange}
|
||||
required={true}
|
||||
maxLength={240}
|
||||
disabled={inputDisabled}
|
||||
onFocus={() => setFocused(true)}
|
||||
onBlur={(e) => {
|
||||
setFocused(false);
|
||||
adjustTextArea(e);
|
||||
}}
|
||||
value={message}
|
||||
className="cursor-text max-h-[100px] md:min-h-[40px] block mx-2 md:mx-4 p-2.5 w-full text-[16px] md:text-sm rounded-lg border bg-gray-50 border-gray-300 placeholder-gray-400 text-white dark:bg-stone-600 dark:border-stone-700 dark:placeholder-stone-400"
|
||||
placeholder="Shift + Enter for newline. Enter to submit."
|
||||
/>
|
||||
<button
|
||||
ref={formRef}
|
||||
type="submit"
|
||||
disabled={buttonDisabled}
|
||||
className="inline-flex justify-center p-0 md:p-2 rounded-full cursor-pointer text-black-900 dark:text-slate-200 hover:bg-gray-600 dark:hover:bg-stone-500"
|
||||
>
|
||||
{buttonDisabled ? (
|
||||
<Loader className="w-6 h-6 animate-spin" />
|
||||
) : (
|
||||
<svg
|
||||
aria-hidden="true"
|
||||
className="w-6 h-6 rotate-45"
|
||||
fill="currentColor"
|
||||
viewBox="0 0 20 20"
|
||||
xmlns="http://www.w3.org/2000/svg"
|
||||
>
|
||||
<path d="M10.894 2.553a1 1 0 00-1.788 0l-7 14a1 1 0 001.169 1.409l5-1.429A1 1 0 009 15.571V11a1 1 0 112 0v4.571a1 1 0 00.725.962l5 1.428a1 1 0 001.17-1.408l-7-14z"></path>
|
||||
</svg>
|
||||
)}
|
||||
<span className="sr-only">Send message</span>
|
||||
</button>
|
||||
</div>
|
||||
<Tracking />
|
||||
</form>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
const Tracking = () => {
|
||||
return (
|
||||
<div className="flex flex-col w-full justify-center items-center gap-y-2 mb-2 px-4 md:px-0">
|
||||
<p className="text-slate-400 text-xs">
|
||||
Responses from the system may be inaccurate or invalid - use
|
||||
with caution.
|
||||
</p>
|
||||
</div>
|
||||
);
|
||||
};
|
|
@ -0,0 +1,87 @@
|
|||
import { useState, useEffect } from "react";
|
||||
import ChatHistory from "./ChatHistory";
|
||||
import PromptInput from "./PromptInput";
|
||||
import Workspace from "../../../models/workspace";
|
||||
import handleChat from "../../../utils/chat";
|
||||
|
||||
export default function ChatContainer({ workspace, knownHistory = [] }) {
|
||||
const [message, setMessage] = useState("");
|
||||
const [loadingResponse, setLoadingResponse] = useState(false);
|
||||
const [chatHistory, setChatHistory] = useState(knownHistory);
|
||||
|
||||
const handleMessageChange = (event) => {
|
||||
setMessage(event.target.value);
|
||||
};
|
||||
|
||||
const handleSubmit = async (event) => {
|
||||
event.preventDefault();
|
||||
if (!message || message === "") return false;
|
||||
|
||||
const prevChatHistory = [
|
||||
...chatHistory,
|
||||
{ content: message, role: "user" },
|
||||
{
|
||||
content: "",
|
||||
role: "assistant",
|
||||
pending: true,
|
||||
userMessage: message,
|
||||
animate: true,
|
||||
},
|
||||
];
|
||||
|
||||
setChatHistory(prevChatHistory);
|
||||
setMessage("");
|
||||
setLoadingResponse(true);
|
||||
};
|
||||
|
||||
useEffect(() => {
|
||||
async function fetchReply() {
|
||||
const promptMessage =
|
||||
chatHistory.length > 0 ? chatHistory[chatHistory.length - 1] : null;
|
||||
const remHistory = chatHistory.length > 0 ? chatHistory.slice(0, -1) : [];
|
||||
const _chatHistory = [...remHistory];
|
||||
|
||||
if (!promptMessage || !promptMessage?.userMessage) {
|
||||
setLoadingResponse(false);
|
||||
return false;
|
||||
}
|
||||
|
||||
const chatResult = await Workspace.sendChat(
|
||||
workspace,
|
||||
promptMessage.userMessage
|
||||
);
|
||||
if (!chatResult) {
|
||||
alert("Could not send chat.");
|
||||
setLoadingResponse(false);
|
||||
return;
|
||||
}
|
||||
handleChat(
|
||||
chatResult,
|
||||
setLoadingResponse,
|
||||
setChatHistory,
|
||||
remHistory,
|
||||
_chatHistory
|
||||
);
|
||||
}
|
||||
loadingResponse === true && fetchReply();
|
||||
}, [loadingResponse, chatHistory, workspace]);
|
||||
|
||||
return (
|
||||
<div
|
||||
style={{ height: "calc(100% - 32px)" }}
|
||||
className="transition-all duration-500 relative ml-[2px] mr-[8px] my-[16px] rounded-[26px] bg-white dark:bg-black-900 min-w-[82%] p-[18px] h-full overflow-y-scroll"
|
||||
>
|
||||
<div className="flex flex-col h-full w-full">
|
||||
<ChatHistory history={chatHistory} workspace={workspace} />
|
||||
<PromptInput
|
||||
workspace={workspace}
|
||||
message={message}
|
||||
submit={handleSubmit}
|
||||
onChange={handleMessageChange}
|
||||
inputDisabled={loadingResponse}
|
||||
buttonDisabled={loadingResponse}
|
||||
/>
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
57
frontend/src/components/WorkspaceChat/LoadingChat/index.jsx
Normal file
|
@ -0,0 +1,57 @@
|
|||
import * as Skeleton from "react-loading-skeleton";
|
||||
import "react-loading-skeleton/dist/skeleton.css";
|
||||
|
||||
export default function LoadingChat() {
|
||||
return (
|
||||
<div
|
||||
style={{ height: "calc(100% - 32px)" }}
|
||||
className="transition-all duration-500 relative ml-[2px] mr-[8px] my-[16px] rounded-[26px] bg-white dark:bg-black-900 min-w-[82%] p-[18px] h-full overflow-y-scroll"
|
||||
>
|
||||
<Skeleton.default
|
||||
height="100px"
|
||||
width="100%"
|
||||
baseColor={"#2a3a53"}
|
||||
highlightColor={"#395073"}
|
||||
count={1}
|
||||
className="max-w-[75%] p-4 rounded-b-2xl rounded-tr-2xl rounded-tl-sm mt-6"
|
||||
containerClassName="flex justify-start"
|
||||
/>
|
||||
<Skeleton.default
|
||||
height="100px"
|
||||
width="45%"
|
||||
baseColor={"#2a3a53"}
|
||||
highlightColor={"#395073"}
|
||||
count={1}
|
||||
className="max-w-[75%] p-4 rounded-b-2xl rounded-tr-2xl rounded-tl-sm mt-6"
|
||||
containerClassName="flex justify-end"
|
||||
/>
|
||||
<Skeleton.default
|
||||
height="100px"
|
||||
width="30%"
|
||||
baseColor={"#2a3a53"}
|
||||
highlightColor={"#395073"}
|
||||
count={1}
|
||||
className="max-w-[75%] p-4 rounded-b-2xl rounded-tr-2xl rounded-tl-sm mt-6"
|
||||
containerClassName="flex justify-start"
|
||||
/>
|
||||
<Skeleton.default
|
||||
height="100px"
|
||||
width="25%"
|
||||
baseColor={"#2a3a53"}
|
||||
highlightColor={"#395073"}
|
||||
count={1}
|
||||
className="max-w-[75%] p-4 rounded-b-2xl rounded-tr-2xl rounded-tl-sm mt-6"
|
||||
containerClassName="flex justify-end"
|
||||
/>
|
||||
<Skeleton.default
|
||||
height="160px"
|
||||
width="100%"
|
||||
baseColor={"#2a3a53"}
|
||||
highlightColor={"#395073"}
|
||||
count={1}
|
||||
className="max-w-[75%] p-4 rounded-b-2xl rounded-tr-2xl rounded-tl-sm mt-6"
|
||||
containerClassName="flex justify-start"
|
||||
/>
|
||||
</div>
|
||||
);
|
||||
}
|
62
frontend/src/components/WorkspaceChat/index.jsx
Normal file
|
@ -0,0 +1,62 @@
|
|||
import React, { useEffect, useState } from "react";
|
||||
import Workspace from "../../models/workspace";
|
||||
import LoadingChat from "./LoadingChat";
|
||||
import ChatContainer from "./ChatContainer";
|
||||
import paths from "../../utils/paths";
|
||||
|
||||
export default function WorkspaceChat({ loading, workspace }) {
|
||||
const [history, setHistory] = useState([]);
|
||||
const [loadingHistory, setLoadingHistory] = useState(true);
|
||||
|
||||
useEffect(() => {
|
||||
async function getHistory() {
|
||||
if (loading) return;
|
||||
if (!workspace?.slug) {
|
||||
setLoadingHistory(false);
|
||||
return false;
|
||||
}
|
||||
|
||||
const chatHistory = await Workspace.chatHistory(workspace.slug);
|
||||
setHistory(chatHistory);
|
||||
setLoadingHistory(false);
|
||||
}
|
||||
getHistory();
|
||||
}, [workspace, loading]);
|
||||
|
||||
if (loadingHistory) return <LoadingChat />;
|
||||
if (!loading && !loadingHistory && !workspace)
|
||||
return (
|
||||
<>
|
||||
{loading === false && !workspace && (
|
||||
<dialog
|
||||
open={true}
|
||||
style={{ zIndex: 100 }}
|
||||
className="fixed top-0 flex bg-black bg-opacity-50 w-[100vw] h-full items-center justify-center "
|
||||
>
|
||||
<div className="w-fit px-10 p-4 w-1/4 rounded-lg bg-white shadow dark:bg-stone-700 text-black dark:text-slate-200">
|
||||
<div className="flex flex-col w-full">
|
||||
<p className="font-semibold text-red-500">
|
||||
We cannot locate this workspace!
|
||||
</p>
|
||||
<p className="text-sm mt-4">
|
||||
It looks like a workspace by this name is not available.
|
||||
</p>
|
||||
|
||||
<div className="flex w-full justify-center items-center mt-4">
|
||||
<a
|
||||
href={paths.home()}
|
||||
className="border border-gray-800 text-gray-800 hover:bg-gray-100 px-4 py-1 rounded-lg dark:text-slate-200 dark:border-slate-200 dark:hover:bg-stone-900"
|
||||
>
|
||||
Go back to homepage
|
||||
</a>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</dialog>
|
||||
)}
|
||||
<LoadingChat />
|
||||
</>
|
||||
);
|
||||
|
||||
return <ChatContainer workspace={workspace} knownHistory={history} />;
|
||||
}
|
293
frontend/src/index.css
Normal file
|
@ -0,0 +1,293 @@
|
|||
@tailwind base;
|
||||
@tailwind components;
|
||||
@tailwind utilities;
|
||||
|
||||
html,
|
||||
body {
|
||||
padding: 0;
|
||||
margin: 0;
|
||||
font-family: -apple-system, BlinkMacSystemFont, Segoe UI, Roboto, Oxygen,
|
||||
Ubuntu, Cantarell, Fira Sans, Droid Sans, Helvetica Neue, sans-serif;
|
||||
background-color: white;
|
||||
}
|
||||
|
||||
a {
|
||||
color: inherit;
|
||||
text-decoration: none;
|
||||
}
|
||||
|
||||
* {
|
||||
box-sizing: border-box;
|
||||
}
|
||||
|
||||
.g327 {
|
||||
border-color: #302f30;
|
||||
}
|
||||
|
||||
@font-face {
  font-family: "AvenirNextW10-Bold";
  src: url("../public/fonts/AvenirNext.ttf");
  /* font-display is an @font-face descriptor, so it belongs here */
  font-display: swap;
}

.Avenir {
  font-family: AvenirNextW10-Bold;
}
|
||||
|
||||
.grr {
|
||||
grid-template-columns: repeat(2, 1fr);
|
||||
}
|
||||
|
||||
.greyC {
|
||||
filter: grayscale(100%);
|
||||
-webkit-filter: grayscale(100%);
|
||||
transition: 0.4s;
|
||||
}
|
||||
|
||||
.greyC:hover {
|
||||
filter: none;
|
||||
-webkit-filter: none;
|
||||
transition: 0.4s;
|
||||
}
|
||||
|
||||
.chat__message {
|
||||
transform-origin: 0 100%;
|
||||
transform: scale(0);
|
||||
animation: message 0.15s ease-out 0s forwards;
|
||||
animation-delay: 500ms;
|
||||
}
|
||||
|
||||
@keyframes message {
|
||||
0% {
|
||||
max-height: 100%;
|
||||
}
|
||||
|
||||
80% {
|
||||
transform: scale(1.1);
|
||||
}
|
||||
|
||||
100% {
|
||||
transform: scale(1);
|
||||
max-height: 100%;
|
||||
overflow: visible;
|
||||
padding-top: 1rem;
|
||||
}
|
||||
}
|
||||
|
||||
.doc__source {
|
||||
transform-origin: 0 100%;
|
||||
transform: scale(0);
|
||||
animation: message2 0.15s ease-out 0s forwards;
|
||||
animation-delay: 50ms;
|
||||
}
|
||||
|
||||
@keyframes message2 {
|
||||
0% {
|
||||
max-height: 100%;
|
||||
}
|
||||
|
||||
80% {
|
||||
transform: scale(1.1);
|
||||
}
|
||||
|
||||
100% {
|
||||
transform: scale(1);
|
||||
max-height: 100%;
|
||||
overflow: visible;
|
||||
}
|
||||
}
|
||||
|
||||
@media (prefers-color-scheme: light) {
|
||||
.sidebar-items:after {
|
||||
content: " ";
|
||||
position: absolute;
|
||||
left: 0;
|
||||
right: 0px;
|
||||
height: 4em;
|
||||
top: 69vh;
|
||||
background: linear-gradient(
|
||||
to bottom,
|
||||
rgba(173, 3, 3, 0),
|
||||
rgb(255 255 255) 50%
|
||||
);
|
||||
z-index: 1;
|
||||
pointer-events: none;
|
||||
}
|
||||
}
|
||||
|
||||
@media (prefers-color-scheme: dark) {
|
||||
.sidebar-items:after {
|
||||
content: " ";
|
||||
position: absolute;
|
||||
left: 0;
|
||||
right: 0px;
|
||||
height: 4em;
|
||||
top: 69vh;
|
||||
background: linear-gradient(
|
||||
to bottom,
|
||||
rgba(173, 3, 3, 0),
|
||||
rgb(20 20 20) 50%
|
||||
);
|
||||
z-index: 1;
|
||||
pointer-events: none;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* ==============================================
|
||||
* Dot Falling
|
||||
* ==============================================
|
||||
*/
|
||||
.dot-falling {
|
||||
position: relative;
|
||||
left: -9999px;
|
||||
width: 10px;
|
||||
height: 10px;
|
||||
border-radius: 5px;
|
||||
background-color: #5fa4fa;
|
||||
color: #5fa4fa;
|
||||
box-shadow: 9999px 0 0 0 #5fa4fa;
|
||||
animation: dot-falling 1.5s infinite linear;
|
||||
animation-delay: 0.1s;
|
||||
}
|
||||
|
||||
.dot-falling::before,
|
||||
.dot-falling::after {
|
||||
content: "";
|
||||
display: inline-block;
|
||||
position: absolute;
|
||||
top: 0;
|
||||
}
|
||||
|
||||
.dot-falling::before {
|
||||
width: 10px;
|
||||
height: 10px;
|
||||
border-radius: 5px;
|
||||
background-color: #5fa4fa;
|
||||
color: #5fa4fa;
|
||||
animation: dot-falling-before 1.5s infinite linear;
|
||||
animation-delay: 0s;
|
||||
}
|
||||
|
||||
.dot-falling::after {
|
||||
width: 10px;
|
||||
height: 10px;
|
||||
border-radius: 5px;
|
||||
background-color: #5fa4fa;
|
||||
color: #5fa4fa;
|
||||
animation: dot-falling-after 1.5s infinite linear;
|
||||
animation-delay: 0.2s;
|
||||
}
|
||||
|
||||
@keyframes dot-falling {
|
||||
0% {
|
||||
box-shadow: 9999px -15px 0 0 rgba(152, 128, 255, 0);
|
||||
}
|
||||
|
||||
25%,
|
||||
50%,
|
||||
75% {
|
||||
box-shadow: 9999px 0 0 0 #5fa4fa;
|
||||
}
|
||||
|
||||
100% {
|
||||
box-shadow: 9999px 15px 0 0 rgba(152, 128, 255, 0);
|
||||
}
|
||||
}
|
||||
|
||||
@keyframes dot-falling-before {
|
||||
0% {
|
||||
box-shadow: 9984px -15px 0 0 rgba(152, 128, 255, 0);
|
||||
}
|
||||
|
||||
25%,
|
||||
50%,
|
||||
75% {
|
||||
box-shadow: 9984px 0 0 0 #5fa4fa;
|
||||
}
|
||||
|
||||
100% {
|
||||
box-shadow: 9984px 15px 0 0 rgba(152, 128, 255, 0);
|
||||
}
|
||||
}
|
||||
|
||||
@keyframes dot-falling-after {
|
||||
0% {
|
||||
box-shadow: 10014px -15px 0 0 rgba(152, 128, 255, 0);
|
||||
}
|
||||
|
||||
25%,
|
||||
50%,
|
||||
75% {
|
||||
box-shadow: 10014px 0 0 0 #5fa4fa;
|
||||
}
|
||||
|
||||
100% {
|
||||
box-shadow: 10014px 15px 0 0 rgba(152, 128, 255, 0);
|
||||
}
|
||||
}
|
||||
|
||||
#chat-history::-webkit-scrollbar,
|
||||
#chat-container::-webkit-scrollbar,
|
||||
.no-scroll::-webkit-scrollbar {
|
||||
display: none !important;
|
||||
}
|
||||
|
||||
/* Hide scrollbar for IE, Edge and Firefox */
|
||||
#chat-history,
|
||||
#chat-container,
|
||||
.no-scroll {
|
||||
-ms-overflow-style: none !important;
|
||||
/* IE and Edge */
|
||||
scrollbar-width: none !important;
|
||||
/* Firefox */
|
||||
}
|
||||
|
||||
.z-99 {
|
||||
z-index: 99;
|
||||
}
|
||||
|
||||
.z-98 {
|
||||
z-index: 98;
|
||||
}
|
||||
|
||||
.file-uploader {
|
||||
width: 100% !important;
|
||||
height: 100px !important;
|
||||
}
|
||||
|
||||
.blink {
|
||||
animation: blink 1.5s steps(1) infinite;
|
||||
}
|
||||
|
||||
@keyframes blink {
|
||||
0% {
|
||||
opacity: 0;
|
||||
}
|
||||
|
||||
50% {
|
||||
opacity: 1;
|
||||
}
|
||||
|
||||
100% {
|
||||
opacity: 0;
|
||||
}
|
||||
}
|
||||
|
||||
.background-animate {
|
||||
background-size: 400%;
|
||||
-webkit-animation: bgAnimate 10s ease infinite;
|
||||
-moz-animation: bgAnimate 10s ease infinite;
|
||||
animation: bgAnimate 10s ease infinite;
|
||||
}
|
||||
|
||||
@keyframes bgAnimate {
|
||||
0%,
|
||||
100% {
|
||||
background-position: 0% 50%;
|
||||
}
|
||||
|
||||
50% {
|
||||
background-position: 100% 50%;
|
||||
}
|
||||
}
|
15
frontend/src/main.jsx
Normal file
|
@ -0,0 +1,15 @@
|
|||
import React from "react";
|
||||
import ReactDOM from "react-dom/client";
|
||||
import { BrowserRouter as Router } from "react-router-dom";
|
||||
import App from "./App.jsx";
|
||||
import "./index.css";
|
||||
const isDev = process.env.NODE_ENV !== "production";
|
||||
const REACTWRAP = isDev ? React.Fragment : React.StrictMode;
|
||||
|
||||
ReactDOM.createRoot(document.getElementById("root")).render(
|
||||
<REACTWRAP>
|
||||
<Router>
|
||||
<App />
|
||||
</Router>
|
||||
</REACTWRAP>
|
||||
);
|
38
frontend/src/models/system.js
Normal file
|
@ -0,0 +1,38 @@
|
|||
import { API_BASE } from "../utils/constants";
|
||||
|
||||
const System = {
|
||||
ping: async function () {
|
||||
return await fetch(`${API_BASE}/ping`)
|
||||
.then((res) => res.ok)
|
||||
.catch(() => false);
|
||||
},
|
||||
totalIndexes: async function () {
|
||||
return await fetch(`${API_BASE}/system-vectors`)
|
||||
.then((res) => {
|
||||
if (!res.ok) throw new Error("Could not find indexes.");
|
||||
return res.json();
|
||||
})
|
||||
.then((res) => res.vectorCount)
|
||||
.catch(() => 0);
|
||||
},
|
||||
keys: async function () {
|
||||
return await fetch(`${API_BASE}/setup-complete`)
|
||||
.then((res) => {
|
||||
if (!res.ok) throw new Error("Could not find setup information.");
|
||||
return res.json();
|
||||
})
|
||||
.then((res) => res.results)
|
||||
.catch(() => null);
|
||||
},
|
||||
localFiles: async function () {
|
||||
return await fetch(`${API_BASE}/local-files`)
|
||||
.then((res) => {
|
||||
if (!res.ok) throw new Error("Could not find local files.");
|
||||
return res.json();
|
||||
})
|
||||
.then((res) => res.localFiles)
|
||||
.catch(() => null);
|
||||
},
|
||||
};
|
||||
|
||||
export default System;
|
77
frontend/src/models/workspace.js
Normal file
|
@ -0,0 +1,77 @@
|
|||
import { API_BASE } from "../utils/constants";
|
||||
|
||||
const Workspace = {
|
||||
new: async function (data = {}) {
|
||||
const { workspace, message } = await fetch(`${API_BASE}/workspace/new`, {
|
||||
method: "POST",
|
||||
body: JSON.stringify(data),
|
||||
})
|
||||
.then((res) => res.json())
|
||||
.catch((e) => {
|
||||
return { workspace: null, message: e.message };
|
||||
});
|
||||
|
||||
return { workspace, message };
|
||||
},
|
||||
modifyEmbeddings: async function (slug, changes = {}) {
|
||||
const { workspace, message } = await fetch(
|
||||
`${API_BASE}/workspace/${slug}/update-embeddings`,
|
||||
{
|
||||
method: "POST",
|
||||
body: JSON.stringify(changes), // contains 'adds' and 'deletes' keys that are arrays of filepaths
|
||||
}
|
||||
)
|
||||
.then((res) => res.json())
|
||||
.catch((e) => {
|
||||
return { workspace: null, message: e.message };
|
||||
});
|
||||
|
||||
return { workspace, message };
|
||||
},
|
||||
chatHistory: async function (slug) {
|
||||
const history = await fetch(`${API_BASE}/workspace/${slug}/chats`)
|
||||
.then((res) => res.json())
|
||||
.then((res) => res.history || [])
|
||||
.catch(() => []);
|
||||
return history;
|
||||
},
|
||||
sendChat: async function ({ slug }, message, mode = "query") {
|
||||
const chatResult = await fetch(`${API_BASE}/workspace/${slug}/chat`, {
|
||||
method: "POST",
|
||||
body: JSON.stringify({ message, mode }),
|
||||
})
|
||||
.then((res) => res.json())
|
||||
.catch((e) => {
|
||||
console.error(e);
|
||||
return null;
|
||||
});
|
||||
|
||||
return chatResult;
|
||||
},
|
||||
all: async function () {
|
||||
const workspaces = await fetch(`${API_BASE}/workspaces`)
|
||||
.then((res) => res.json())
|
||||
.then((res) => res.workspaces || [])
|
||||
.catch(() => []);
|
||||
|
||||
return workspaces;
|
||||
},
|
||||
bySlug: async function (slug = "") {
|
||||
const workspace = await fetch(`${API_BASE}/workspace/${slug}`)
|
||||
.then((res) => res.json())
|
||||
.then((res) => res.workspace)
|
||||
.catch(() => null);
|
||||
return workspace;
|
||||
},
|
||||
delete: async function (slug) {
|
||||
const result = await fetch(`${API_BASE}/workspace/${slug}`, {
|
||||
method: "DELETE",
|
||||
})
|
||||
.then((res) => res.ok)
|
||||
.catch(() => false);
|
||||
|
||||
return result;
|
||||
},
|
||||
};
|
||||
|
||||
export default Workspace;
|
24
frontend/src/pages/404.jsx
Normal file
|
@ -0,0 +1,24 @@
|
|||
import Header from "../components/Header";
|
||||
import Footer from "../components/Footer";
|
||||
|
||||
export default function NotFound() {
|
||||
return (
|
||||
<div className="text-black">
|
||||
<Header />
|
||||
<div className="flex flex-col justify-center mx-auto mt-52 text-center max-w-2xl">
|
||||
<h1 className="text-3xl font-bold tracking-tight text-black md:text-5xl">
|
||||
404 – Unavailable
|
||||
</h1>
|
||||
<br />
|
||||
<a
|
||||
className="w-64 p-1 mx-auto font-bold text-center text-black border border-gray-500 rounded-lg sm:p-4"
|
||||
href="/"
|
||||
>
|
||||
Return Home
|
||||
</a>
|
||||
</div>
|
||||
<div className="mt-64"></div>
|
||||
<Footer />
|
||||
</div>
|
||||
);
|
||||
}
|
12
frontend/src/pages/Main/index.jsx
Normal file
|
@ -0,0 +1,12 @@
|
|||
import React from "react";
|
||||
import DefaultChatContainer from "../../components/DefaultChat";
|
||||
import Sidebar from "../../components/Sidebar";
|
||||
|
||||
export default function Main() {
|
||||
return (
|
||||
<div className="w-screen h-screen overflow-hidden bg-orange-100 dark:bg-stone-700 flex">
|
||||
<Sidebar />
|
||||
<DefaultChatContainer />
|
||||
</div>
|
||||
);
|
||||
}
|
28
frontend/src/pages/WorkspaceChat/index.jsx
Normal file
|
@ -0,0 +1,28 @@
|
|||
import React, { useEffect, useState } from "react";
|
||||
import { default as WorkspaceChatContainer } from "../../components/WorkspaceChat";
|
||||
import Sidebar from "../../components/Sidebar";
|
||||
import { useParams } from "react-router-dom";
|
||||
import Workspace from "../../models/workspace";
|
||||
|
||||
export default function WorkspaceChat() {
|
||||
const { slug } = useParams();
|
||||
const [workspace, setWorkspace] = useState(null);
|
||||
const [loading, setLoading] = useState(true);
|
||||
|
||||
useEffect(() => {
|
||||
async function getWorkspace() {
|
||||
if (!slug) return;
|
||||
const _workspace = await Workspace.bySlug(slug);
|
||||
setWorkspace(_workspace);
|
||||
setLoading(false);
|
||||
}
|
||||
getWorkspace();
|
||||
}, [slug]);
|
||||
|
||||
return (
|
||||
<div className="w-screen h-screen overflow-hidden bg-orange-100 dark:bg-stone-700 flex">
|
||||
<Sidebar />
|
||||
<WorkspaceChatContainer loading={loading} workspace={workspace} />
|
||||
</div>
|
||||
);
|
||||
}
|
59
frontend/src/utils/chat/index.js
Normal file
|
@ -0,0 +1,59 @@
|
|||
// Handles chat responses that arrive synchronously, i.e. without streaming.
|
||||
export default function handleChat(
|
||||
chatResult,
|
||||
setLoadingResponse,
|
||||
setChatHistory,
|
||||
remHistory,
|
||||
_chatHistory
|
||||
) {
|
||||
const { uuid, textResponse, type, sources = [], error, close } = chatResult;
|
||||
|
||||
if (type === "abort") {
|
||||
setLoadingResponse(false);
|
||||
alert(error);
|
||||
setChatHistory([
|
||||
...remHistory,
|
||||
{
|
||||
uuid,
|
||||
content: textResponse,
|
||||
role: "assistant",
|
||||
sources,
|
||||
closed: true,
|
||||
error,
|
||||
animate: true,
|
||||
},
|
||||
]);
|
||||
_chatHistory.push({
|
||||
uuid,
|
||||
content: textResponse,
|
||||
role: "assistant",
|
||||
sources,
|
||||
closed: true,
|
||||
error,
|
||||
animate: true,
|
||||
});
|
||||
} else if (type === "textResponse") {
|
||||
setLoadingResponse(false);
|
||||
setChatHistory([
|
||||
...remHistory,
|
||||
{
|
||||
uuid,
|
||||
content: textResponse,
|
||||
role: "assistant",
|
||||
sources,
|
||||
closed: close,
|
||||
error,
|
||||
animate: true,
|
||||
},
|
||||
]);
|
||||
_chatHistory.push({
|
||||
uuid,
|
||||
content: textResponse,
|
||||
role: "assistant",
|
||||
sources,
|
||||
closed: close,
|
||||
error,
|
||||
animate: true,
|
||||
});
|
||||
}
|
||||
}
|
2
frontend/src/utils/constants.js
Normal file
|
@ -0,0 +1,2 @@
|
|||
export const API_BASE =
|
||||
import.meta.env.VITE_API_BASE || "http://localhost:5000"; // assumption: the API URL is supplied as VITE_API_BASE rather than VITE_ENABLE_GOOGLE_AUTH
|
16
frontend/src/utils/numbers.js
Normal file
|
@ -0,0 +1,16 @@
|
|||
const Formatter = Intl.NumberFormat("en", { notation: "compact" });
|
||||
|
||||
export function numberWithCommas(input) {
|
||||
return input.toString().replace(/\B(?=(\d{3})+(?!\d))/g, ",");
|
||||
}
|
||||
|
||||
export function nFormatter(input) {
|
||||
return Formatter.format(input);
|
||||
}
|
||||
|
||||
export function dollarFormat(input) {
|
||||
return new Intl.NumberFormat("en-us", {
|
||||
style: "currency",
|
||||
currency: "USD",
|
||||
}).format(input);
|
||||
}
|
19
frontend/src/utils/paths.js
Normal file
|
@ -0,0 +1,19 @@
|
|||
export default {
|
||||
home: () => {
|
||||
return "/";
|
||||
},
|
||||
github: () => {
|
||||
return "/";
|
||||
},
|
||||
docs: () => {
|
||||
return "/";
|
||||
},
|
||||
mailToMintplex: () => {
|
||||
return "mailto:team@mintplex.xyz";
|
||||
},
|
||||
workspace: {
|
||||
chat: (slug) => {
|
||||
return `/workspace/${slug}`;
|
||||
},
|
||||
},
|
||||
};
|
13
frontend/tailwind.config.js
Normal file
|
@ -0,0 +1,13 @@
|
|||
/** @type {import('tailwindcss').Config} */
|
||||
export default {
|
||||
content: ["./src/**/*.{js,jsx}"],
|
||||
theme: {
|
||||
extend: {
|
||||
colors: {
|
||||
'black-900': '#141414',
|
||||
}
|
||||
},
|
||||
},
|
||||
plugins: [],
|
||||
}
|
||||
|
59
frontend/vite.config.js
Normal file
|
@ -0,0 +1,59 @@
|
|||
import { defineConfig } from 'vite'
|
||||
import postcss from './postcss.config.js'
|
||||
import react from '@vitejs/plugin-react'
|
||||
import dns from 'dns'
|
||||
import { visualizer } from "rollup-plugin-visualizer";
|
||||
|
||||
dns.setDefaultResultOrder('verbatim')
|
||||
|
||||
// https://vitejs.dev/config/
|
||||
export default defineConfig({
|
||||
server: {
|
||||
port: 3000,
|
||||
host: 'localhost'
|
||||
},
|
||||
define: {
|
||||
'process.env': process.env
|
||||
},
|
||||
css: {
|
||||
postcss,
|
||||
},
|
||||
plugins: [
|
||||
react(),
|
||||
visualizer({
|
||||
template: "treemap", // or sunburst
|
||||
open: false,
|
||||
gzipSize: true,
|
||||
brotliSize: true,
|
||||
filename: "bundleinspector.html", // will be saved in project's root
|
||||
}),
|
||||
],
|
||||
resolve: {
|
||||
alias: [
|
||||
{
|
||||
process: "process/browser",
|
||||
stream: "stream-browserify",
|
||||
zlib: "browserify-zlib",
|
||||
util: "util",
|
||||
find: /^~.+/,
|
||||
replacement: (val) => {
|
||||
return val.replace(/^~/, "");
|
||||
},
|
||||
},
|
||||
],
|
||||
},
|
||||
build: {
|
||||
commonjsOptions: {
|
||||
transformMixedEsModules: true,
|
||||
}
|
||||
},
|
||||
optimizeDeps: {
|
||||
esbuildOptions: {
|
||||
define: {
|
||||
global: 'globalThis'
|
||||
},
|
||||
plugins: [
|
||||
]
|
||||
}
|
||||
}
|
||||
})
|
BIN
images/choices.png
Normal file
Binary file not shown.
Size: 152 KiB
BIN
images/gcp-project-bar.png
Normal file
Binary file not shown.
Size: 19 KiB
14
images/screenshots/SCREENSHOTS.md
Normal file
|
@ -0,0 +1,14 @@
|
|||
# AnythingLLM Screenshots
|
||||
|
||||
### Homescreen
|
||||

|
||||
|
||||
### Document Manager
|
||||
⚡ means the current version of the document has been embedded before and will not cost money to convert into a vector!
|
||||

|
||||
|
||||
### Chatting
|
||||

|
||||
|
||||
### Setup check
|
||||

|
BIN
images/screenshots/chat.png
Normal file
Binary file not shown.
Size: 320 KiB
BIN
images/screenshots/document.png
Normal file
Binary file not shown.
Size: 766 KiB
BIN
images/screenshots/home.png
Normal file
Binary file not shown.
Size: 575 KiB
BIN
images/screenshots/keys.png
Normal file
Binary file not shown.
Size: 515 KiB
15
package.json
Normal file
|
@ -0,0 +1,15 @@
|
|||
{
|
||||
"name": "socials-to-chat",
|
||||
"version": "1.0.0",
|
||||
"description": "Turn creator socials into chatbots with long-term-memory through a simple UI",
|
||||
"main": "index.js",
|
||||
"author": "Timothy Carambat (Mintplex Labs)",
|
||||
"license": "MIT",
|
||||
"scripts": {
|
||||
"setup": "cd server && yarn && cd .. && yarn setup:envs && echo \"Please run yarn dev:server and yarn dev:frontend in separate terminal tabs.\"",
|
||||
"setup:envs": "cd server && cp -n .env.example .env.development && cd ../collector && cp -n .env.example .env && cd ..",
|
||||
"dev:server": "cd server && yarn dev",
|
||||
"dev:frontend": "cd frontend && yarn start"
|
||||
},
|
||||
"private": false
|
||||
}
|
8
server/.env.example
Normal file
|
@ -0,0 +1,8 @@
|
|||
SERVER_PORT=5000
|
||||
OPEN_AI_KEY=
|
||||
OPEN_MODEL_PREF='gpt-3.5-turbo'
|
||||
PINECONE_ENVIRONMENT=
|
||||
PINECONE_API_KEY=
|
||||
PINECONE_INDEX=
|
||||
AUTH_TOKEN="hunter2" # This is the password to your application if remote hosting.
|
||||
CACHE_VECTORS="true"
|
7
server/.gitignore
vendored
Normal file
|
@ -0,0 +1,7 @@
|
|||
.env.production
|
||||
.env.development
|
||||
documents/*
|
||||
vector-cache/*.json
|
||||
!documents/DOCUMENTS.md
|
||||
logs/server.log
|
||||
*.db
|
1
server/.nvmrc
Normal file
|
@ -0,0 +1 @@
|
|||
v18.12.1
|
10
server/documents/DOCUMENTS.md
Normal file
|
@ -0,0 +1,10 @@
|
|||
### What is this folder of documents?
|
||||
|
||||
This is a temporary cache of the files produced by `collector/`. You should not add files to this folder manually. As a general rule, partition data by how it was collected; it will be added to the appropriate namespace when it is vectorized.
|
||||
|
||||
You can manage these files from the frontend application.
|
||||
|
||||
All files should be JSON files. In general there is only one required key, `pageContent`; all other keys are inserted as metadata for each document added to the vector DB.
|
||||
|
||||
There is also a special reserved key, `published`, that should be used only for timestamps.
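
As a rough illustration of the shape described above, the sketch below checks a document object and separates `pageContent` from its metadata. The helper names (`checkCollectedDocument`, `splitMetadata`), the `title` and `url` keys, and the rule that `published` must be parseable by `Date` are assumptions made for this example; they are not part of the collector or the server.

```javascript
// Hypothetical helper (not in this repository): reports problems with a
// collected document instead of throwing, so callers can decide what to do.
function checkCollectedDocument(doc) {
  if (typeof doc !== "object" || doc === null || Array.isArray(doc)) {
    return ["document must be a JSON object"];
  }
  const problems = [];
  // `pageContent` is the one required key.
  if (typeof doc.pageContent !== "string" || doc.pageContent.length === 0) {
    problems.push("pageContent must be a non-empty string");
  }
  // Assumption: `published`, when present, is a value that Date can parse.
  if ("published" in doc && Number.isNaN(new Date(doc.published).getTime())) {
    problems.push("published must be a parseable timestamp");
  }
  return problems;
}

// Hypothetical helper: every key other than `pageContent` is metadata.
function splitMetadata(doc) {
  const { pageContent, ...metadata } = doc;
  return { pageContent, metadata };
}

// Illustrative document; `title` and `url` are made-up metadata keys.
const exampleDocument = {
  pageContent: "Text of the collected page.",
  title: "Example page",
  url: "https://example.com/post",
  published: "2023-06-03T00:00:00Z",
};

console.log(checkCollectedDocument(exampleDocument));
console.log(splitMetadata(exampleDocument).metadata);
```

Returning a list of problems rather than throwing keeps the check usable in a batch job that wants to report every bad file at once.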
|
||||
|
23
server/endpoints/chat.js
Normal file
|
@ -0,0 +1,23 @@
|
|||
const { reqBody } = require('../utils/http');
|
||||
const { Workspace } = require('../models/workspace');
|
||||
const { chatWithWorkspace } = require('../utils/chats');
|
||||
|
||||
function chatEndpoints(app) {
|
||||
if (!app) return;
|
||||
|
||||
app.post('/workspace/:slug/chat', async (request, response) => {
|
||||
const { slug } = request.params
|
||||
const { message, mode = 'query' } = reqBody(request)
|
||||
const workspace = await Workspace.get(`slug = '${slug}'`);
|
||||
if (!workspace) {
|
||||
response.sendStatus(400).end();
|
||||
return;
|
||||
}
|
||||
|
||||
const result = await chatWithWorkspace(workspace, message, mode);
|
||||
response.status(200).json({ ...result });
|
||||
})
|
||||
|
||||
}
|
||||
|
||||
module.exports = { chatEndpoints }
|
34
server/endpoints/system.js
Normal file
|
@ -0,0 +1,34 @@
|
|||
require('dotenv').config({ path: `.env.${process.env.NODE_ENV}` })
|
||||
const { Pinecone } = require('../utils/pinecone');
|
||||
const { viewLocalFiles } = require('../utils/files');
|
||||
|
||||
function systemEndpoints(app) {
|
||||
if (!app) return;
|
||||
|
||||
app.get('/ping', (_, response) => {
|
||||
response.sendStatus(200);
|
||||
})
|
||||
|
||||
app.get('/setup-complete', (_, response) => {
|
||||
const results = {
|
||||
OpenAiKey: !!process.env.OPEN_AI_KEY,
|
||||
OpenAiModelPref: process.env.OPEN_MODEL_PREF || 'gpt-3.5-turbo',
|
||||
PineConeEnvironment: process.env.PINECONE_ENVIRONMENT,
|
||||
PineConeKey: !!process.env.PINECONE_API_KEY,
|
||||
PinceConeIndex: process.env.PINECONE_INDEX,
|
||||
}
|
||||
response.status(200).json({ results })
|
||||
})
|
||||
|
||||
app.get('/system-vectors', async (_, response) => {
|
||||
const vectorCount = await Pinecone.totalIndicies();
|
||||
response.status(200).json({ vectorCount })
|
||||
})
|
||||
|
||||
app.get('/local-files', async (_, response) => {
|
||||
const localFiles = await viewLocalFiles()
|
||||
response.status(200).json({ localFiles })
|
||||
})
|
||||
}
|
||||
|
||||
module.exports = { systemEndpoints }
|
75
server/endpoints/workspaces.js
Normal file
|
@ -0,0 +1,75 @@
|
|||
const { Pinecone } = require('../utils/pinecone');
|
||||
const { reqBody } = require('../utils/http');
|
||||
const { Workspace } = require('../models/workspace');
|
||||
const { Document } = require('../models/documents');
|
||||
const { DocumentVectors } = require('../models/vectors');
|
||||
const { WorkspaceChats } = require('../models/workspaceChats');
|
||||
const { convertToChatHistory } = require('../utils/chats');
|
||||
|
||||
function workspaceEndpoints(app) {
|
||||
if (!app) return;
|
||||
|
||||
app.post('/workspace/new', async (request, response) => {
|
||||
const { name = null } = reqBody(request);
|
||||
const { workspace, message } = await Workspace.new(name);
|
||||
response.status(200).json({ workspace, message })
|
||||
})
|
||||
|
||||
app.post('/workspace/:slug/update-embeddings', async (request, response) => {
|
||||
const { slug = null } = request.params;
|
||||
const { adds = [], deletes = [] } = reqBody(request);
|
||||
const currWorkspace = await Workspace.get(`slug = '${slug}'`);
|
||||
|
||||
if (!currWorkspace) {
|
||||
response.sendStatus(400).end();
|
||||
return;
|
||||
}
|
||||
|
||||
await Document.removeDocuments(currWorkspace, deletes);
|
||||
await Document.addDocuments(currWorkspace, adds);
|
||||
const updatedWorkspace = await Workspace.get(`slug = '${slug}'`);
|
||||
response.status(200).json({ workspace: updatedWorkspace })
|
||||
})
|
||||
|
||||
app.delete('/workspace/:slug', async (request, response) => {
|
||||
const { slug = '' } = request.params
|
||||
const workspace = await Workspace.get(`slug = '${slug}'`);
|
||||
|
||||
if (!workspace) {
|
||||
response.sendStatus(400).end();
|
||||
return;
|
||||
}
|
||||
|
||||
await Workspace.delete(`slug = '${slug.toLowerCase()}'`);
|
||||
await DocumentVectors.deleteForWorkspace(workspace.id);
|
||||
await Document.delete(`workspaceId = ${Number(workspace.id)}`)
|
||||
await WorkspaceChats.delete(`workspaceId = ${Number(workspace.id)}`)
|
||||
try { await Pinecone['delete-namespace']({ namespace: slug }) } catch (e) { console.error(e.message) }
|
||||
response.sendStatus(200).end()
|
||||
})
|
||||
|
||||
app.get('/workspaces', async (_, response) => {
|
||||
const workspaces = await Workspace.where();
|
||||
response.status(200).json({ workspaces })
|
||||
})
|
||||
|
||||
app.get('/workspace/:slug', async (request, response) => {
|
||||
const { slug } = request.params
|
||||
const workspace = await Workspace.get(`slug = '${slug}'`);
|
||||
response.status(200).json({ workspace })
|
||||
})
|
||||
|
||||
app.get('/workspace/:slug/chats', async (request, response) => {
|
||||
const { slug } = request.params
|
||||
const workspace = await Workspace.get(`slug = '${slug}'`);
|
||||
if (!workspace) {
|
||||
response.sendStatus(400).end()
|
||||
return;
|
||||
}
|
||||
|
||||
const history = await WorkspaceChats.forWorkspace(workspace.id)
|
||||
response.status(200).json({ history: convertToChatHistory(history) })
|
||||
})
|
||||
}
|
||||
|
||||
module.exports = { workspaceEndpoints }
|
59
server/index.js
Normal file
|
@ -0,0 +1,59 @@
|
|||
require('dotenv').config({ path: `.env.${process.env.NODE_ENV}` })
|
||||
const express = require('express')
|
||||
const bodyParser = require('body-parser')
|
||||
const cors = require('cors');
|
||||
const { validatedRequest } = require('./utils/middleware/validatedRequest');
|
||||
const { Pinecone } = require('./utils/pinecone');
|
||||
const { reqBody } = require('./utils/http');
|
||||
const { systemEndpoints } = require('./endpoints/system');
|
||||
const { workspaceEndpoints } = require('./endpoints/workspaces');
|
||||
const { chatEndpoints } = require('./endpoints/chat');
|
||||
const app = express();
|
||||
|
||||
app.use(cors({ origin: true }));
|
||||
app.use(validatedRequest);
|
||||
app.use(bodyParser.text());
|
||||
app.use(bodyParser.json());
|
||||
app.use(bodyParser.urlencoded({
|
||||
extended: true
|
||||
}));
|
||||
|
||||
systemEndpoints(app);
|
||||
workspaceEndpoints(app);
|
||||
chatEndpoints(app);
|
||||
|
||||
app.post('/v/:command', async (request, response) => {
|
||||
const { command } = request.params
|
||||
if (!Object.getOwnPropertyNames(Pinecone).includes(command)) {
|
||||
response.status(500).json({ message: 'invalid interface command', commands: Object.getOwnPropertyNames(Pinecone) });
|
||||
return
|
||||
}
|
||||
|
||||
try {
|
||||
const body = reqBody(request);
|
||||
const resBody = await Pinecone[command](body)
|
||||
response.status(200).json({ ...resBody });
|
||||
} catch (e) {
|
||||
// console.error(e)
|
||||
console.error(JSON.stringify(e))
|
||||
response.status(500).json({ error: e.message });
|
||||
}
|
||||
return;
|
||||
})
|
||||
|
||||
|
||||
app.all('*', function (_, response) {
|
||||
response.sendStatus(404);
|
||||
});
|
||||
|
||||
app.listen(process.env.SERVER_PORT || 5000, () => {
|
||||
console.log(`Example app listening on port ${process.env.SERVER_PORT || 5000}`)
|
||||
})
|
||||
.on("error", function (err) {
|
||||
process.once("SIGUSR2", function () {
|
||||
process.kill(process.pid, "SIGUSR2");
|
||||
});
|
||||
process.on("SIGINT", function () {
|
||||
process.kill(process.pid, "SIGINT");
|
||||
});
|
||||
});
|
99
server/models/documents.js
Normal file
|
@ -0,0 +1,99 @@
|
|||
const { fileData } = require('../utils/files');
|
||||
const { v4: uuidv4 } = require('uuid');
|
||||
|
||||
const Document = {
|
||||
tablename: 'workspace_documents',
|
||||
colsInit: `
|
||||
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
docId TEXT NOT NULL UNIQUE,
|
||||
filename TEXT NOT NULL,
|
||||
docpath TEXT NOT NULL,
|
||||
workspaceId INTEGER NOT NULL,
|
||||
metadata TEXT NULL,
|
||||
createdAt TEXT DEFAULT CURRENT_TIMESTAMP,
|
||||
lastUpdatedAt TEXT DEFAULT CURRENT_TIMESTAMP
|
||||
`,
|
||||
db: async function () {
|
||||
const sqlite3 = require('sqlite3').verbose();
|
||||
const { open } = require('sqlite');
|
||||
|
||||
const db = await open({
|
||||
filename: 'anythingllm.db',
|
||||
driver: sqlite3.Database
|
||||
})
|
||||
|
||||
await db.exec(`CREATE TABLE IF NOT EXISTS ${this.tablename} (${this.colsInit})`);
|
||||
db.on('trace', (sql) => console.log(sql))
|
||||
return db
|
||||
},
|
||||
forWorkspace: async function (workspaceId = null) {
|
||||
if (!workspaceId) return [];
|
||||
return await this.where(`workspaceId = ${workspaceId}`);
|
||||
},
|
||||
delete: async function (clause = '') {
|
||||
const db = await this.db()
|
||||
await db.get(`DELETE FROM ${this.tablename} WHERE ${clause}`)
|
||||
db.close()
|
||||
return true
|
||||
},
|
||||
where: async function (clause = '', limit = null) {
|
||||
const db = await this.db()
|
||||
const results = await db.all(`SELECT * FROM ${this.tablename} ${clause ? `WHERE ${clause}` : ''} ${!!limit ? `LIMIT ${limit}` : ''}`)
|
||||
|
||||
db.close()
|
||||
return results
|
||||
},
|
||||
firstWhere: async function (clause = '') {
|
||||
const results = await this.where(clause);
|
||||
return results.length > 0 ? results[0] : null
|
||||
},
|
||||
addDocuments: async function (workspace, additions = []) {
|
||||
const { Pinecone } = require('../utils/pinecone');
|
||||
if (additions.length === 0) return;
|
||||
|
||||
const db = await this.db()
|
||||
const stmt = await db.prepare(`INSERT INTO ${this.tablename} (docId, filename, docpath, workspaceId, metadata) VALUES (?,?,?,?,?)`)
|
||||
for (const path of additions) {
|
||||
const data = await fileData(path);
|
||||
if (!data) continue;
|
||||
|
||||
const docId = uuidv4();
|
||||
const { pageContent, ...metadata } = data
|
||||
const newDoc = {
|
||||
docId,
|
||||
filename: path.split('/')[1],
|
||||
docpath: path,
|
||||
workspaceId: Number(workspace.id),
|
||||
metadata: JSON.stringify(metadata)
|
||||
}
|
||||
const vectorized = await Pinecone.addDocumentToNamespace(workspace.slug, { ...data, docId }, path);
|
||||
if (!vectorized) {
|
||||
console.error('Failed to vectorize', path)
|
||||
continue;
|
||||
}
|
||||
await stmt.run([docId, newDoc.filename, newDoc.docpath, newDoc.workspaceId, newDoc.metadata])
|
||||
}
|
||||
stmt.finalize();
|
||||
db.close();
|
||||
|
||||
return;
|
||||
},
|
||||
removeDocuments: async function (workspace, removals = []) {
|
||||
const { Pinecone } = require('../utils/pinecone');
|
||||
|
||||
if (removals.length === 0) return;
|
||||
const db = await this.db()
|
||||
const stmt = await db.prepare(`DELETE FROM ${this.tablename} WHERE docpath = ? AND workspaceId = ?`);
|
||||
for (const path of removals) {
|
||||
const document = await this.firstWhere(`docPath = '${path}' AND workspaceId = ${workspace.id}`)
|
||||
if (!document) continue;
|
||||
await Pinecone.deleteDocumentFromNamespace(workspace.slug, document.docId);
|
||||
await stmt.run([path, workspace.id])
|
||||
}
|
||||
stmt.finalize();
|
||||
db.close();
|
||||
return true;
|
||||
}
|
||||
}
|
||||
|
||||
module.exports = { Document }
|
63
server/models/vectors.js
Normal file
|
@ -0,0 +1,63 @@
|
|||
const { Document } = require('./documents');
|
||||
|
||||
// TODO: Do we want to store entire vectorized chunks in here
|
||||
// so that we can easily spin up temp-namespace clones for threading
|
||||
//
|
||||
const DocumentVectors = {
|
||||
tablename: 'document_vectors',
|
||||
colsInit: `
|
||||
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
docId TEXT NOT NULL,
|
||||
vectorId TEXT NOT NULL,
|
||||
createdAt TEXT DEFAULT CURRENT_TIMESTAMP,
|
||||
lastUpdatedAt TEXT DEFAULT CURRENT_TIMESTAMP
|
||||
`,
|
||||
db: async function () {
|
||||
const sqlite3 = require('sqlite3').verbose();
|
||||
const { open } = require('sqlite');
|
||||
|
||||
const db = await open({
|
||||
filename: 'anythingllm.db',
|
||||
driver: sqlite3.Database
|
||||
})
|
||||
|
||||
await db.exec(`CREATE TABLE IF NOT EXISTS ${this.tablename} (${this.colsInit})`);
|
||||
db.on('trace', (sql) => console.log(sql))
|
||||
return db
|
||||
},
|
||||
bulkInsert: async function (vectorRecords = []) {
|
||||
if (vectorRecords.length === 0) return;
|
||||
const db = await this.db();
|
||||
const stmt = await db.prepare(`INSERT INTO ${this.tablename} (docId, vectorId) VALUES (?, ?)`);
|
||||
for (const record of vectorRecords) {
|
||||
const { docId, vectorId } = record
|
||||
await stmt.run([docId, vectorId])
|
||||
}
|
||||
|
||||
stmt.finalize()
|
||||
db.close()
|
||||
return { documentsInserted: vectorRecords.length };
|
||||
},
|
||||
deleteForWorkspace: async function (workspaceId) {
|
||||
const documents = await Document.forWorkspace(workspaceId);
|
||||
const docIds = [...(new Set(documents.map((doc) => doc.docId)))];
|
||||
const ids = (await this.where(`docId IN (${docIds.map((id) => `'${id}'`).join(',')})`)).map((doc) => doc.id)
|
||||
await this.deleteIds(ids)
|
||||
return true;
|
||||
},
|
||||
where: async function (clause = '', limit = null) {
|
||||
const db = await this.db()
|
||||
const results = await db.all(`SELECT * FROM ${this.tablename} ${clause ? `WHERE ${clause}` : ''} ${!!limit ? `LIMIT ${limit}` : ''}`)
|
||||
|
||||
db.close()
|
||||
return results
|
||||
},
|
||||
deleteIds: async function (ids = []) {
|
||||
const db = await this.db()
|
||||
await db.get(`DELETE FROM ${this.tablename} WHERE id IN (${ids.join(', ')}) `)
|
||||
db.close()
|
||||
return true
|
||||
}
|
||||
}
|
||||
|
||||
module.exports = { DocumentVectors }
|
63
server/models/workspace.js
Normal file
|
@ -0,0 +1,63 @@
|
|||
const slugify = require('slugify');
|
||||
const { Document } = require('./documents');
|
||||
|
||||
const Workspace = {
|
||||
tablename: 'workspaces',
|
||||
colsInit: `
|
||||
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
name TEXT NOT NULL UNIQUE,
|
||||
slug TEXT NOT NULL UNIQUE,
|
||||
vectorTag TEXT DEFAULT NULL,
|
||||
createdAt TEXT DEFAULT CURRENT_TIMESTAMP,
|
||||
lastUpdatedAt TEXT DEFAULT CURRENT_TIMESTAMP
|
||||
`,
|
||||
db: async function () {
|
||||
const sqlite3 = require('sqlite3').verbose();
|
||||
const { open } = require('sqlite');
|
||||
|
||||
const db = await open({
|
||||
filename: 'anythingllm.db',
|
||||
driver: sqlite3.Database
|
||||
})
|
||||
|
||||
await db.exec(`CREATE TABLE IF NOT EXISTS ${this.tablename} (${this.colsInit})`);
|
||||
db.on('trace', (sql) => console.log(sql))
|
||||
return db
|
||||
},
|
||||
new: async function (name = null) {
|
||||
if (!name) return { workspace: null, message: 'name cannot be null' };
|
||||
|
||||
const db = await this.db()
|
||||
const { id, success, message } = await db.run(`INSERT INTO ${this.tablename} (name, slug) VALUES (?, ?)`, [name, slugify(name, { lower: true })])
|
||||
.then((res) => {
|
||||
return { id: res.lastID, success: true, message: null }
|
||||
})
|
||||
.catch((error) => {
|
||||
return { id: null, success: false, message: error.message }
|
||||
})
|
||||
if (!success) return { workspace: null, message }
|
||||
|
||||
const workspace = await db.get(`SELECT * FROM ${this.tablename} WHERE id = ${id}`)
|
||||
return { workspace, message: null }
|
||||
},
|
||||
get: async function (clause = '') {
|
||||
const db = await this.db()
|
||||
const result = await db.get(`SELECT * FROM ${this.tablename} WHERE ${clause}`).then((res) => res || null)
|
||||
if (!result) return null;
|
||||
|
||||
const documents = await Document.forWorkspace(result.id);
|
||||
return { ...result, documents }
|
||||
},
|
||||
delete: async function (clause = '') {
|
||||
const db = await this.db()
|
||||
await db.get(`DELETE FROM ${this.tablename} WHERE ${clause}`)
|
||||
return true
|
||||
},
|
||||
where: async function (clause = '', limit = null) {
|
||||
const db = await this.db()
|
||||
const results = await db.all(`SELECT * FROM ${this.tablename} ${clause ? `WHERE ${clause}` : ''} ${!!limit ? `LIMIT ${limit}` : ''}`)
|
||||
return results
|
||||
},
|
||||
}
|
||||
|
||||
module.exports = { Workspace }
|
68
server/models/workspaceChats.js
Normal file
|
@ -0,0 +1,68 @@
|
|||
|
||||
const WorkspaceChats = {
|
||||
tablename: 'workspace_chats',
|
||||
colsInit: `
|
||||
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
workspaceId INTEGER NOT NULL,
|
||||
prompt TEXT NOT NULL,
|
||||
response TEXT NOT NULL,
|
||||
include BOOL DEFAULT true,
|
||||
createdAt TEXT DEFAULT CURRENT_TIMESTAMP,
|
||||
lastUpdatedAt TEXT DEFAULT CURRENT_TIMESTAMP
|
||||
`,
|
||||
db: async function () {
|
||||
const sqlite3 = require('sqlite3').verbose();
|
||||
const { open } = require('sqlite');
|
||||
|
||||
const db = await open({
|
||||
filename: 'anythingllm.db',
|
||||
driver: sqlite3.Database
|
||||
})
|
||||
|
||||
await db.exec(`CREATE TABLE IF NOT EXISTS ${this.tablename} (${this.colsInit})`);
|
||||
db.on('trace', (sql) => console.log(sql))
|
||||
return db
|
||||
},
|
||||
new: async function ({ workspaceId, prompt, response = {} }) {
|
||||
const db = await this.db()
|
||||
const { id, success, message } = await db.run(`INSERT INTO ${this.tablename} (workspaceId, prompt, response) VALUES (?, ?, ?)`, [workspaceId, prompt, JSON.stringify(response)])
|
||||
.then((res) => {
|
||||
return { id: res.lastID, success: true, message: null }
|
||||
})
|
||||
.catch((error) => {
|
||||
return { id: null, success: false, message: error.message }
|
||||
})
|
||||
if (!success) return { chat: null, message }
|
||||
|
||||
const chat = await db.get(`SELECT * FROM ${this.tablename} WHERE id = ${id}`)
|
||||
return { chat, message: null }
|
||||
},
|
||||
forWorkspace: async function (workspaceId = null) {
|
||||
if (!workspaceId) return [];
|
||||
return await this.where(`workspaceId = ${workspaceId} AND include = true`, null, 'ORDER BY id ASC')
|
||||
},
|
||||
markHistoryInvalid: async function (workspaceId = null) {
|
||||
if (!workspaceId) return;
|
||||
const db = await this.db()
|
||||
await db.run(`UPDATE ${this.tablename} SET include = false WHERE workspaceId = ?`, [workspaceId]);
|
||||
return;
|
||||
},
|
||||
get: async function (clause = '') {
|
||||
const db = await this.db()
|
||||
const result = await db.get(`SELECT * FROM ${this.tablename} WHERE ${clause}`).then((res) => res || null)
|
||||
if (!result) return null;
|
||||
return result
|
||||
},
|
||||
delete: async function (clause = '') {
|
||||
const db = await this.db()
|
||||
await db.get(`DELETE FROM ${this.tablename} WHERE ${clause}`)
|
||||
return true
|
||||
},
|
||||
where: async function (clause = '', limit = null, order = null) {
|
||||
const db = await this.db()
|
||||
const results = await db.all(`SELECT * FROM ${this.tablename} ${clause ? `WHERE ${clause}` : ''} ${!!order ? order : ''} ${!!limit ? `LIMIT ${limit}` : ''}`)
|
||||
return results
|
||||
},
|
||||
}
|
||||
|
||||
module.exports = { WorkspaceChats }
|
35
server/package.json
Normal file
|
@ -0,0 +1,35 @@
|
|||
{
|
||||
"name": "socials-to-chat-server",
|
||||
"version": "1.0.0",
|
||||
"description": "Server endpoints to process or create content for chatting",
|
||||
"main": "index.js",
|
||||
"author": "Timothy Carambat (Mintplex Labs)",
|
||||
"license": "MIT",
|
||||
"private": false,
|
||||
"engines": {
|
||||
"node": ">=18.12.1"
|
||||
},
|
||||
"scripts": {
|
||||
"dev": "NODE_ENV=development nodemon --ignore documents index.js",
|
||||
"start": "NODE_ENV=production node index.js"
|
||||
},
|
||||
"dependencies": {
|
||||
"@googleapis/youtube": "^9.0.0",
|
||||
"@pinecone-database/pinecone": "^0.1.6",
|
||||
"body-parser": "^1.20.2",
|
||||
"cors": "^2.8.5",
|
||||
"dotenv": "^16.0.3",
|
||||
"express": "^4.18.2",
|
||||
"langchain": "^0.0.81",
|
||||
"moment": "^2.29.4",
|
||||
"openai": "^3.2.1",
|
||||
"pinecone-client": "^1.1.0",
|
||||
"slugify": "^1.6.6",
|
||||
"sqlite": "^4.2.1",
|
||||
"sqlite3": "^5.1.6",
|
||||
"uuid": "^9.0.0"
|
||||
},
|
||||
"devDependencies": {
|
||||
"nodemon": "^2.0.22"
|
||||
}
|
||||
}
|
17
server/utils/chats/commands/reset.js
Normal file
|
@ -0,0 +1,17 @@
|
|||
const { WorkspaceChats } = require("../../../models/workspaceChats");
|
||||
|
||||
async function resetMemory(workspace, _message, msgUUID) {
|
||||
await WorkspaceChats.markHistoryInvalid(workspace.id);
|
||||
return {
|
||||
uuid: msgUUID,
|
||||
type: 'textResponse',
|
||||
textResponse: 'Workspace chat memory was reset!',
|
||||
sources: [],
|
||||
close: true,
|
||||
error: false,
|
||||
};
|
||||
}
|
||||
|
||||
module.exports = {
|
||||
resetMemory
|
||||
}
|
128
server/utils/chats/index.js
Normal file
|
@ -0,0 +1,128 @@
|
|||
const { v4: uuidv4 } = require('uuid');
|
||||
const { OpenAi } = require('../openAi');
|
||||
const { Pinecone } = require('../pinecone');
|
||||
const { WorkspaceChats } = require('../../models/workspaceChats');
|
||||
const { resetMemory } = require("./commands/reset");
|
||||
const moment = require('moment')
|
||||
|
||||
function convertToChatHistory(history = []) {
|
||||
const formattedHistory = []
|
||||
history.forEach((history) => {
|
||||
const { prompt, response, createdAt } = history
|
||||
const data = JSON.parse(response);
|
||||
formattedHistory.push([
|
||||
{
|
||||
role: 'user',
|
||||
content: prompt,
|
||||
sentAt: moment(createdAt).unix(),
|
||||
},
|
||||
{
|
||||
role: 'assistant',
|
||||
content: data.text,
|
||||
sources: data.sources || [],
|
||||
sentAt: moment(createdAt).unix(),
|
||||
},
|
||||
])
|
||||
})
|
||||
|
||||
return formattedHistory.flat()
|
||||
}
|
||||
|
||||
function convertToPromptHistory(history = []) {
|
||||
const formattedHistory = []
|
||||
history.forEach((history) => {
|
||||
const { prompt, response } = history
|
||||
const data = JSON.parse(response);
|
||||
formattedHistory.push([
|
||||
{ role: 'user', content: prompt },
|
||||
{ role: 'assistant', content: data.text },
|
||||
])
|
||||
})
|
||||
return formattedHistory.flat()
|
||||
}
|
||||
|
||||
|
||||
const VALID_COMMANDS = {
|
||||
'/reset': resetMemory,
|
||||
}
|
||||
|
||||
function grepCommand(message) {
|
||||
const availableCommands = Object.keys(VALID_COMMANDS);
|
||||
|
||||
for (let i = 0; i < availableCommands.length; i++) {
|
||||
const cmd = availableCommands[i];
|
||||
const re = new RegExp(`^(${cmd})`, "i");
|
||||
if (re.test(message)) {
|
||||
return cmd;
|
||||
}
|
||||
}
|
||||
|
||||
return null
|
||||
}
|
||||
|
||||
async function chatWithWorkspace(workspace, message, chatMode = 'query') {
|
||||
const uuid = uuidv4();
|
||||
const openai = new OpenAi();
|
||||
|
||||
const command = grepCommand(message)
|
||||
if (!!command && Object.keys(VALID_COMMANDS).includes(command)) {
|
||||
return await VALID_COMMANDS[command](workspace, message, uuid);
|
||||
}
|
||||
|
||||
const { safe, reasons = [] } = await openai.isSafe(message)
|
||||
if (!safe) {
|
||||
return {
|
||||
id: uuid,
|
||||
type: 'abort',
|
||||
textResponse: null,
|
||||
sources: [],
|
||||
close: true,
|
||||
error: `This message was moderated and will not be allowed. Violations for ${reasons.join(', ')} found.`
|
||||
};
|
||||
}
|
||||
|
||||
const hasVectorizedSpace = await Pinecone.hasNamespace(workspace.slug);
|
||||
if (!hasVectorizedSpace) {
|
||||
const rawHistory = await WorkspaceChats.forWorkspace(workspace.id)
|
||||
const chatHistory = convertToPromptHistory(rawHistory);
|
||||
const response = await openai.sendChat(chatHistory, message);
|
||||
const data = { text: response, sources: [], type: 'chat' }
|
||||
|
||||
await WorkspaceChats.new({ workspaceId: workspace.id, prompt: message, response: data })
|
||||
return {
|
||||
id: uuid,
|
||||
type: 'textResponse',
|
||||
textResponse: response,
|
||||
sources: [],
|
||||
close: true,
|
||||
error: null,
|
||||
};
|
||||
} else {
|
||||
const { response, sources, message: error } = await Pinecone[chatMode]({ namespace: workspace.slug, input: message });
|
||||
if (!response) {
|
||||
return {
|
||||
id: uuid,
|
||||
type: 'abort',
|
||||
textResponse: null,
|
||||
sources: [],
|
||||
close: true,
|
||||
error,
|
||||
};
|
||||
}
|
||||
|
||||
const data = { text: response, sources, type: chatMode }
|
||||
await WorkspaceChats.new({ workspaceId: workspace.id, prompt: message, response: data })
|
||||
return {
|
||||
id: uuid,
|
||||
type: 'textResponse',
|
||||
textResponse: response,
|
||||
sources,
|
||||
close: true,
|
||||
error,
|
||||
};
|
||||
}
|
||||
}
|
||||
module.exports = {
|
||||
convertToChatHistory,
|
||||
chatWithWorkspace
|
||||
}
|
120
server/utils/files/index.js
Normal file
|
@ -0,0 +1,120 @@
|
|||
const fs = require("fs")
|
||||
const path = require('path');
|
||||
const { v5: uuidv5 } = require('uuid');
|
||||
|
||||
async function collectDocumentData(folderName = null) {
|
||||
if (!folderName) throw new Error('No docPath provided in request');
|
||||
const folder = path.resolve(__dirname, `../../documents/${folderName}`)
|
||||
const dirExists = fs.existsSync(folder);
|
||||
if (!dirExists) throw new Error(`No documents folder for ${folderName} - did you run collector/main.py for this element?`);
|
||||
|
||||
const files = fs.readdirSync(folder);
|
||||
const fileData = [];
|
||||
files.forEach(file => {
|
||||
if (path.extname(file) === '.json') {
|
||||
const filePath = path.join(folder, file);
|
||||
const data = fs.readFileSync(filePath, 'utf8');
|
||||
console.log(`Parsing document: ${file}`);
|
||||
fileData.push(JSON.parse(data))
|
||||
}
|
||||
});
|
||||
return fileData;
|
||||
}
|
||||
|
||||
// Should take in a file path relative to the documents folder
|
||||
// eg: youtube-subject/video-123.json
|
||||
async function fileData(filePath = null) {
|
||||
if (!filePath) throw new Error('No docPath provided in request');
|
||||
const fullPath = path.resolve(__dirname, `../../documents/${filePath}`)
|
||||
const fileExists = fs.existsSync(fullPath);
|
||||
if (!fileExists) return null;
|
||||
|
||||
const data = fs.readFileSync(fullPath, 'utf8');
|
||||
return JSON.parse(data)
|
||||
}
|
||||
|
||||
async function viewLocalFiles() {
|
||||
const folder = path.resolve(__dirname, `../../documents`)
|
||||
const dirExists = fs.existsSync(folder);
|
||||
if (!dirExists) return {}
|
||||
|
||||
const directory = {
|
||||
name: "documents",
|
||||
type: "folder",
|
||||
items: [],
|
||||
}
|
||||
|
||||
for (const file of fs.readdirSync(folder)) {
|
||||
if (path.extname(file) === '.md') continue;
|
||||
const folderPath = path.resolve(__dirname, `../../documents/${file}`)
|
||||
const isFolder = fs.lstatSync(folderPath).isDirectory()
|
||||
if (isFolder) {
|
||||
const subdocs = {
|
||||
name: file,
|
||||
type: "folder",
|
||||
items: [],
|
||||
}
|
||||
const subfiles = fs.readdirSync(folderPath);
|
||||
|
||||
for (const subfile of subfiles) {
|
||||
if (path.extname(subfile) !== '.json') continue;
|
||||
const filePath = path.join(folderPath, subfile);
|
||||
const rawData = fs.readFileSync(filePath, 'utf8');
|
||||
const cachefilename = `${file}/${subfile}`
|
||||
const { pageContent, ...metadata } = JSON.parse(rawData)
|
||||
|
||||
subdocs.items.push({
|
||||
name: subfile,
|
||||
type: "file",
|
||||
...metadata,
|
||||
cached: await cachedVectorInformation(cachefilename, true)
|
||||
})
|
||||
}
|
||||
directory.items.push(subdocs)
|
||||
}
|
||||
};
|
||||
|
||||
return directory
|
||||
}
|
||||
|
||||
// Searches the vector-cache folder for existing information so we don't have to re-embed a
|
||||
// document and can instead push directly to vector db.
|
||||
async function cachedVectorInformation(filename = null, checkOnly = false) {
|
||||
if (!process.env.CACHE_VECTORS) return checkOnly ? false : { exists: false, chunks: [] };
|
||||
if (!filename) return checkOnly ? false : { exists: false, chunks: [] };
|
||||
|
||||
const digest = uuidv5(filename, uuidv5.URL);
|
||||
const file = path.resolve(__dirname, `../../vector-cache/${digest}.json`);
|
||||
const exists = fs.existsSync(file);
|
||||
|
||||
if (checkOnly) return exists
|
||||
if (!exists) return { exists, chunks: [] }
|
||||
|
||||
console.log(`Cached vectorized results of ${filename} found! Using cached data to save on embed costs.`)
|
||||
const rawData = fs.readFileSync(file, 'utf8');
|
||||
return { exists: true, chunks: JSON.parse(rawData) }
|
||||
}
|
||||
|
||||
// vectorData: pre-chunked vectorized data for a given file that includes the proper metadata and chunk-size limit so it can be iterated and dumped into Pinecone, etc
|
||||
// filename is the fullpath to the doc so we can compare by filename to find cached matches.
|
||||
async function storeVectorResult(vectorData = [], filename = null) {
|
||||
if (!process.env.CACHE_VECTORS) return;
|
||||
if (!filename) return;
|
||||
console.log(`Caching vectorized results of ${filename} to prevent duplicated embedding.`)
|
||||
const folder = path.resolve(__dirname, `../../vector-cache`);
|
||||
|
||||
if (!fs.existsSync(folder)) fs.mkdirSync(folder);
|
||||
|
||||
const digest = uuidv5(filename, uuidv5.URL);
|
||||
const writeTo = path.resolve(folder, `${digest}.json`);
|
||||
fs.writeFileSync(writeTo, JSON.stringify(vectorData), 'utf8');
|
||||
return;
|
||||
}
|
||||
|
||||
module.exports = {
|
||||
cachedVectorInformation,
|
||||
collectDocumentData,
|
||||
viewLocalFiles,
|
||||
storeVectorResult,
|
||||
fileData
|
||||
}
|
14
server/utils/http/index.js
Normal file
|
@ -0,0 +1,14 @@
|
|||
function reqBody(request) {
|
||||
return typeof request.body === 'string'
|
||||
? JSON.parse(request.body)
|
||||
: request.body;
|
||||
}
|
||||
|
||||
function queryParams(request) {
|
||||
return request.query;
|
||||
}
|
||||
|
||||
module.exports = {
|
||||
reqBody,
|
||||
queryParams,
|
||||
};
|
37
server/utils/middleware/validatedRequest.js
Normal file
|
@ -0,0 +1,37 @@
|
|||
function validatedRequest(request, response, next) {
|
||||
// In development, or when no AUTH_TOKEN is configured, skip token validation for ease of development.
|
||||
if (process.env.NODE_ENV === 'development' || !process.env.AUTH_TOKEN) {
|
||||
next();
|
||||
return;
|
||||
}
|
||||
|
||||
if (!process.env.AUTH_TOKEN) {
|
||||
response.status(403).json({
|
||||
error: "You need to set an AUTH_TOKEN environment variable."
|
||||
});
|
||||
return;
|
||||
}
|
||||
|
||||
const auth = request.header('Authorization');
|
||||
const token = auth ? auth.split(' ')[1] : null;
|
||||
|
||||
if (!token) {
|
||||
response.status(403).json({
|
||||
error: "No auth token found."
|
||||
});
|
||||
return;
|
||||
}
|
||||
|
||||
if (token !== process.env.AUTH_TOKEN) {
|
||||
response.status(403).json({
|
||||
error: "Invalid auth token found."
|
||||
});
|
||||
return;
|
||||
}
|
||||
|
||||
next();
|
||||
}
|
||||
|
||||
module.exports = {
|
||||
validatedRequest,
|
||||
};
|
64
server/utils/openAi/index.js
Normal file
|
@ -0,0 +1,64 @@
|
|||
const { Configuration, OpenAIApi } = require('openai')
|
||||
class OpenAi {
|
||||
constructor() {
|
||||
const config = new Configuration({ apiKey: process.env.OPEN_AI_KEY, organization: 'org-amIuvAIIcdUmN5YCiwRayVfb' })
|
||||
const openai = new OpenAIApi(config);
|
||||
this.openai = openai
|
||||
}
|
||||
isValidChatModel(modelName = '') {
|
||||
const validModels = ['gpt-4', 'gpt-3.5-turbo']
|
||||
return validModels.includes(modelName)
|
||||
}
|
||||
|
||||
async isSafe(input = '') {
|
||||
const { flagged = false, categories = {} } = await this.openai.createModeration({ input })
|
||||
.then((json) => {
|
||||
const res = json.data;
|
||||
if (!res.hasOwnProperty('results')) throw new Error('OpenAI moderation: No results!');
|
||||
if (res.results.length === 0) throw new Error('OpenAI moderation: No results length!');
|
||||
return res.results[0]
|
||||
})
|
||||
|
||||
if (!flagged) return { safe: true, reasons: [] };
|
||||
const reasons = Object.keys(categories).map((category) => {
|
||||
const value = categories[category]
|
||||
if (value === true) {
|
||||
return category.replace('/', ' or ');
|
||||
} else {
|
||||
return null;
|
||||
}
|
||||
}).filter((reason) => !!reason)
|
||||
|
||||
return { safe: false, reasons }
|
||||
}
|
||||
|
||||
async sendChat(chatHistory = [], prompt) {
|
||||
const model = process.env.OPEN_MODEL_PREF || 'gpt-3.5-turbo'
|
||||
if (!this.isValidChatModel(model)) throw new Error(`OpenAI chat: ${model} is not valid for chat completion!`);
|
||||
|
||||
const textResponse = await this.openai.createChatCompletion({
|
||||
model,
|
||||
temperature: 0.7,
|
||||
n: 1,
|
||||
messages: [
|
||||
{ role: 'system', content: '' },
|
||||
...chatHistory,
|
||||
{ role: 'user', content: prompt },
|
||||
]
|
||||
})
|
||||
.then((json) => {
|
||||
const res = json.data
|
||||
if (!res.hasOwnProperty('choices')) throw new Error('OpenAI chat: No results!');
|
||||
if (res.choices.length === 0) throw new Error('OpenAI chat: No results length!');
|
||||
return res.choices[0].message.content
|
||||
})
|
||||
|
||||
return textResponse
|
||||
}
|
||||
}
|
||||
|
||||
module.exports = {
|
||||
OpenAi,
|
||||
};
|
||||
|
||||
|
279
server/utils/pinecone/index.js
Normal file
|
@ -0,0 +1,279 @@
|
|||
const { PineconeClient } = require("@pinecone-database/pinecone");
|
||||
const { PineconeStore } = require("langchain/vectorstores/pinecone");
|
||||
const { OpenAI } = require("langchain/llms/openai");
|
||||
const { ChatOpenAI } = require('langchain/chat_models/openai');
|
||||
const { VectorDBQAChain, LLMChain, RetrievalQAChain, ConversationalRetrievalQAChain } = require("langchain/chains");
|
||||
const { OpenAIEmbeddings } = require("langchain/embeddings/openai");
|
||||
const { VectorStoreRetrieverMemory, BufferMemory } = require("langchain/memory");
|
||||
const { PromptTemplate } = require("langchain/prompts");
|
||||
const { RecursiveCharacterTextSplitter } = require("langchain/text_splitter");
|
||||
const { storeVectorResult, cachedVectorInformation } = require('../files');
|
||||
const { Configuration, OpenAIApi } = require('openai')
|
||||
const { v4: uuidv4 } = require('uuid');
|
||||
|
||||
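// Split `arr` into consecutive sub-arrays of at most `size` items,
// e.g. toChunks([1, 2, 3], 2) gives [[1, 2], [3]].
const toChunks = (arr, size) => {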
|
||||
return Array.from({ length: Math.ceil(arr.length / size) }, (_v, i) =>
|
||||
arr.slice(i * size, i * size + size)
|
||||
);
|
||||
}
|
||||
|
||||
function curateSources(sources = []) {
|
||||
const knownDocs = [];
|
||||
const documents = []
|
||||
for (const source of sources) {
|
||||
const { metadata = {} } = source
|
||||
if (Object.keys(metadata).length > 0 && !knownDocs.includes(metadata.title)) {
|
||||
documents.push({ ...metadata })
|
||||
knownDocs.push(metadata.title)
|
||||
}
|
||||
}
|
||||
|
||||
return documents;
|
||||
}
|
||||
|
||||
const Pinecone = {
|
||||
connect: async function () {
|
||||
const client = new PineconeClient();
|
||||
await client.init({
|
||||
apiKey: process.env.PINECONE_API_KEY,
|
||||
environment: process.env.PINECONE_ENVIRONMENT,
|
||||
});
|
||||
const pineconeIndex = client.Index(process.env.PINECONE_INDEX);
|
||||
const { status } = await client.describeIndex({ indexName: process.env.PINECONE_INDEX });
|
||||
|
||||
if (!status.ready) throw new Error("Pinecone::Index not ready.")
|
||||
return { client, pineconeIndex, indexName: process.env.PINECONE_INDEX };
|
||||
},
|
||||
embedder: function () {
|
||||
return new OpenAIEmbeddings({ openAIApiKey: process.env.OPEN_AI_KEY });
|
||||
},
|
||||
openai: function () {
|
||||
const config = new Configuration({ apiKey: process.env.OPEN_AI_KEY })
|
||||
const openai = new OpenAIApi(config);
|
||||
return openai
|
||||
},
|
||||
embedChunk: async function (openai, textChunk) {
|
||||
const { data: { data } } = await openai.createEmbedding({
|
||||
model: 'text-embedding-ada-002',
|
||||
input: textChunk
|
||||
})
|
||||
return data.length > 0 && data[0].hasOwnProperty('embedding') ? data[0].embedding : null
|
||||
},
|
||||
llm: function () {
|
||||
const model = process.env.OPEN_MODEL_PREF || 'gpt-3.5-turbo'
|
||||
return new OpenAI({ openAIApiKey: process.env.OPEN_AI_KEY, temperature: 0.7, modelName: model });
|
||||
},
|
||||
chatLLM: function () {
|
||||
const model = process.env.OPEN_MODEL_PREF || 'gpt-3.5-turbo'
|
||||
return new ChatOpenAI({ openAIApiKey: process.env.OPEN_AI_KEY, temperature: 0.7, modelName: model });
|
||||
},
|
||||
totalIndicies: async function () {
|
||||
const { pineconeIndex } = await this.connect();
|
||||
const { namespaces } = await pineconeIndex.describeIndexStats1();
|
||||
return Object.values(namespaces).reduce((a, b) => a + (b?.vectorCount || 0), 0)
|
||||
},
|
||||
namespace: async function (index, namespace = null) {
|
||||
if (!namespace) throw new Error("No namespace value provided.");
|
||||
const { namespaces } = await index.describeIndexStats1();
|
||||
return namespaces.hasOwnProperty(namespace) ? namespaces[namespace] : null
|
||||
},
|
||||
hasNamespace: async function (namespace = null) {
|
||||
if (!namespace) return false;
|
||||
const { pineconeIndex } = await this.connect();
|
||||
return await this.namespaceExists(pineconeIndex, namespace)
|
||||
},
|
||||
namespaceExists: async function (index, namespace = null) {
|
||||
if (!namespace) throw new Error("No namespace value provided.");
|
||||
const { namespaces } = await index.describeIndexStats1();
|
||||
return namespaces.hasOwnProperty(namespace)
|
||||
},
|
||||
deleteVectorsInNamespace: async function (index, namespace = null) {
|
||||
await index.delete1({ namespace, deleteAll: true })
|
||||
return true
|
||||
},
|
||||
addDocumentToNamespace: async function (namespace, documentData = {}, fullFilePath = null) {
|
||||
const { DocumentVectors } = require("../../models/vectors");
|
||||
try {
|
||||
const { pageContent, docId, ...metadata } = documentData
|
||||
if (!pageContent || pageContent.length === 0) return false;
|
||||
|
||||
console.log("Adding new vectorized document into namespace", namespace);
|
||||
const cacheResult = await cachedVectorInformation(fullFilePath)
|
||||
if (cacheResult.exists) {
|
||||
const { pineconeIndex } = await this.connect();
|
||||
const { chunks } = cacheResult
|
||||
const documentVectors = []
|
||||
|
||||
for (const chunk of chunks) {
|
||||
// Before sending to Pinecone and saving the records to our db
|
||||
// we need to assign the id of each chunk that is stored in the cached file.
|
||||
const newChunks = chunk.map((vector) => {
const id = uuidv4()
documentVectors.push({ docId, vectorId: id });
return { ...vector, id }
|
||||
})
|
||||
|
||||
// Push chunks with new ids to pinecone.
|
||||
await pineconeIndex.upsert({
|
||||
upsertRequest: {
|
||||
vectors: [...newChunks],
|
||||
namespace,
|
||||
}
|
||||
})
|
||||
}
|
||||
|
||||
await DocumentVectors.bulkInsert(documentVectors)
|
||||
return true
|
||||
}
|
||||
|
||||
// If we are here then we are going to embed and store a novel document.
|
||||
// We have to do this manually as opposed to using LangChain's `PineconeStore.fromDocuments`
|
||||
// because we then cannot atomically control our namespace to granularly find/remove documents
|
||||
// from vectordb.
|
||||
// https://github.com/hwchase17/langchainjs/blob/2def486af734c0ca87285a48f1a04c057ab74bdf/langchain/src/vectorstores/pinecone.ts#L167
|
||||
const textSplitter = new RecursiveCharacterTextSplitter({ chunkSize: 1000, chunkOverlap: 20 });
|
||||
const textChunks = await textSplitter.splitText(pageContent)
|
||||
|
||||
console.log('Chunks created from document:', textChunks.length)
|
||||
const documentVectors = []
|
||||
const vectors = []
|
||||
const openai = this.openai()
|
||||
for (const textChunk of textChunks) {
|
||||
const vectorValues = await this.embedChunk(openai, textChunk);
|
||||
|
||||
if (!!vectorValues) {
|
||||
const vectorRecord = {
|
||||
id: uuidv4(),
|
||||
values: vectorValues,
|
||||
// [DO NOT REMOVE]
|
||||
// LangChain will be unable to find your text if you embed manually and don't include the `text` key.
|
||||
// https://github.com/hwchase17/langchainjs/blob/2def486af734c0ca87285a48f1a04c057ab74bdf/langchain/src/vectorstores/pinecone.ts#L64
|
||||
metadata: { ...metadata, text: textChunk },
|
||||
}
|
||||
vectors.push(vectorRecord);
|
||||
documentVectors.push({ docId, vectorId: vectorRecord.id });
|
||||
} else {
|
||||
console.error('Could not use OpenAI to embed document chunk! This document will not be recorded.')
|
||||
}
|
||||
}
|
||||
|
||||
if (vectors.length > 0) {
|
||||
const chunks = []
|
||||
const { pineconeIndex } = await this.connect();
|
||||
console.log('Inserting vectorized chunks into Pinecone.')
|
||||
for (const chunk of toChunks(vectors, 100)) {
|
||||
chunks.push(chunk)
|
||||
await pineconeIndex.upsert({
|
||||
upsertRequest: {
|
||||
vectors: [...chunk],
|
||||
namespace,
|
||||
}
|
||||
})
|
||||
}
|
||||
await storeVectorResult(chunks, fullFilePath)
|
||||
}
|
||||
|
||||
await DocumentVectors.bulkInsert(documentVectors)
|
||||
return true;
|
||||
} catch (e) {
|
||||
console.error('addDocumentToNamespace', e.message)
|
||||
return false;
|
||||
}
|
||||
},
|
||||
deleteDocumentFromNamespace: async function (namespace, docId) {
|
||||
const { DocumentVectors } = require("../../models/vectors");
|
||||
const { pineconeIndex } = await this.connect();
|
||||
if (!await this.namespaceExists(pineconeIndex, namespace)) return;
|
||||
|
||||
const knownDocuments = await DocumentVectors.where(`docId = '${docId}'`)
|
||||
if (knownDocuments.length === 0) return;
|
||||
|
||||
const vectorIds = knownDocuments.map((doc) => doc.vectorId);
|
||||
await pineconeIndex.delete1({
|
||||
ids: vectorIds,
|
||||
namespace,
|
||||
})
|
||||
|
||||
const indexes = knownDocuments.map((doc) => doc.id);
|
||||
await DocumentVectors.deleteIds(indexes)
|
||||
return true;
|
||||
},
|
||||
'namespace-stats': async function (reqBody = {}) {
|
||||
const { namespace = null } = reqBody
|
||||
if (!namespace) throw new Error("namespace required");
|
||||
const { pineconeIndex } = await this.connect();
|
||||
if (!await this.namespaceExists(pineconeIndex, namespace)) throw new Error('Namespace by that name does not exist.');
|
||||
const stats = await this.namespace(pineconeIndex, namespace)
|
||||
return stats ? stats : { message: 'No stats were able to be fetched from DB' }
|
||||
},
|
||||
'delete-namespace': async function (reqBody = {}) {
|
||||
const { namespace = null } = reqBody
|
||||
const { pineconeIndex } = await this.connect();
|
||||
if (!await this.namespaceExists(pineconeIndex, namespace)) throw new Error('Namespace by that name does not exist.');
|
||||
|
||||
const details = await this.namespace(pineconeIndex, namespace);
|
||||
await this.deleteVectorsInNamespace(pineconeIndex, namespace);
|
||||
return { message: `Namespace ${namespace} was deleted along with ${details.vectorCount} vectors.` }
|
||||
},
|
||||
query: async function (reqBody = {}) {
|
||||
const { namespace = null, input } = reqBody;
|
||||
if (!namespace || !input) throw new Error("Invalid request body");
|
||||
|
||||
const { pineconeIndex } = await this.connect();
|
||||
if (!await this.namespaceExists(pineconeIndex, namespace)) {
|
||||
return {
|
||||
response: null, sources: [], message: 'Invalid query - no documents found for workspace!'
|
||||
}
|
||||
}
|
||||
|
||||
const vectorStore = await PineconeStore.fromExistingIndex(
|
||||
this.embedder(),
|
||||
{ pineconeIndex, namespace }
|
||||
);
|
||||
|
||||
const model = this.llm();
|
||||
const chain = VectorDBQAChain.fromLLM(model, vectorStore, {
|
||||
k: 5,
|
||||
returnSourceDocuments: true,
|
||||
});
|
||||
const response = await chain.call({ query: input });
|
||||
return { response: response.text, sources: curateSources(response.sourceDocuments), message: false }
|
||||
},
|
||||
// This implementation of chat also expands the memory of the chat itself
|
||||
// and adds more tokens to the PineconeDB instance namespace
|
||||
chat: async function (reqBody = {}) {
|
||||
const { namespace = null, input } = reqBody;
|
||||
if (!namespace || !input) throw new Error("Invalid request body");
|
||||
|
||||
const { pineconeIndex } = await this.connect();
|
||||
if (!await this.namespaceExists(pineconeIndex, namespace)) throw new Error("Invalid namespace - has it been collected and seeded yet?");
|
||||
|
||||
const vectorStore = await PineconeStore.fromExistingIndex(
|
||||
this.embedder(),
|
||||
{ pineconeIndex, namespace }
|
||||
);
|
||||
|
||||
const memory = new VectorStoreRetrieverMemory({
|
||||
vectorStoreRetriever: vectorStore.asRetriever(1),
|
||||
memoryKey: "history",
|
||||
});
|
||||
|
||||
const model = this.llm();
|
||||
const prompt =
|
||||
PromptTemplate.fromTemplate(`The following is a friendly conversation between a human and an AI. The AI is very casual and talkative and responds with a friendly tone. If the AI does not know the answer to a question, it truthfully says it does not know.
|
||||
Relevant pieces of previous conversation:
|
||||
{history}
|
||||
|
||||
Current conversation:
|
||||
Human: {input}
|
||||
AI:`);
|
||||
|
||||
const chain = new LLMChain({ llm: model, prompt, memory });
|
||||
const response = await chain.call({ input });
|
||||
return { response: response.text, sources: [], message: false }
|
||||
},
|
||||
}
|
||||
|
||||
module.exports = {
|
||||
Pinecone
|
||||
}
|
5
server/vector-cache/VECTOR_CACHE.md
Normal file
|
@ -0,0 +1,5 @@
|
|||
### What is this folder?
|
||||
|
||||
`vector-cache` stores JSON documents that have already been embedded. This lets you reuse the same large documents across multiple workspaces without paying to re-embed them each time.
|
||||
|
||||
This also lets you reset entire workspaces to their original state without paying for the embeddings again, which saves money on large documents that take a while to embed.
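
The caching idea can be illustrated with a minimal sketch. This is not the project's actual implementation: the helper names (`cacheFileFor`, `readCachedVectors`, `writeCachedVectors`), the choice of a SHA-256 hash of the document path as the cache key, and the temporary directory are all assumptions made for the example.

```javascript
// Hypothetical sketch of an embedding cache. Names and file layout are
// assumptions for illustration, not the project's real helpers.
const fs = require('fs');
const os = require('os');
const path = require('path');
const crypto = require('crypto');

// Derive a stable cache file name from the source document's path.
function cacheFileFor(cacheDir, documentPath) {
  const digest = crypto.createHash('sha256').update(documentPath).digest('hex');
  return path.join(cacheDir, `${digest}.json`);
}

// Return previously stored vector chunks, if a cache file exists.
function readCachedVectors(cacheDir, documentPath) {
  const file = cacheFileFor(cacheDir, documentPath);
  if (!fs.existsSync(file)) return { exists: false, chunks: [] };
  return { exists: true, chunks: JSON.parse(fs.readFileSync(file, 'utf8')) };
}

// Persist vector chunks so other workspaces can reuse them without re-embedding.
function writeCachedVectors(cacheDir, documentPath, chunks) {
  fs.mkdirSync(cacheDir, { recursive: true });
  fs.writeFileSync(cacheFileFor(cacheDir, documentPath), JSON.stringify(chunks), 'utf8');
}

// Usage: the first lookup misses, the second one hits after the chunks are stored.
const cacheDir = fs.mkdtempSync(path.join(os.tmpdir(), 'vector-cache-'));
const docPath = 'documents/example.json';
const before = readCachedVectors(cacheDir, docPath);
writeCachedVectors(cacheDir, docPath, [[{ values: [0.1, 0.2], metadata: { text: 'hello' } }]]);
const after = readCachedVectors(cacheDir, docPath);
console.log(before.exists, after.exists);
```

On a cache hit, only the vector ids need to be regenerated before upserting, so no embedding request is made for that document.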
|