Mirror of https://github.com/khoj-ai/khoj.git, synced 2024-11-23 15:38:55 +01:00
Drop support for Ledger as a separate content type
Khoj will soon get a generic text indexing content type. That, combined with a file filter, should suffice for searching through Ledger transactions when required. Having a specific content type for a niche use-case like ledger isn't useful, and removing unused content types reduces the amount of Khoj code to maintain.
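For illustration, once the generic text content type lands, a ledger-scoped search could be expressed as an ordinary text query plus a file filter instead of the dedicated `ledger` type. The sketch below is hypothetical: it assumes a local server on the default port 8000 and that the existing `file:"..."` filter syntax applies to the new content type.

```python
# Minimal sketch (assumptions noted above): query the Khoj search API and
# scope results to beancount ledger files via a file filter in the query.
import requests

query = 'coffee expenses file:"*.beancount"'  # hypothetical filter syntax
response = requests.get(
    "http://localhost:8000/api/search",
    params={"q": query, "t": "all"},  # "all" searches every enabled content type
)
for result in response.json():
    print(result["entry"])  # each result carries the matched entry text
```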
Parent commit: c9db5321e7
This commit: 0f993b332e
19 changed files with 18 additions and 635 deletions
README.md (10 changes)
@@ -64,7 +64,7 @@
 - **General**
   - **Natural**: Advanced natural language understanding using Transformer based ML Models
   - **Pluggable**: Modular architecture makes it easy to plug in new data sources, frontends and ML models
-  - **Multiple Sources**: Index your Org-mode and Markdown notes, Beancount transactions, PDF files, Github repositories, and Photos
+  - **Multiple Sources**: Index your Org-mode and Markdown notes, PDF files, Github repositories, and Photos
   - **Multiple Interfaces**: Interact from your [Web Browser](./src/khoj/interface/web/index.html), [Emacs](./src/interface/emacs/khoj.el) or [Obsidian](./src/interface/obsidian/)

 ## Demos
@@ -267,7 +267,7 @@ pip install --upgrade --pre khoj-assistant
 2. [Install](https://tailscale.com/kb/installation/) [Tailscale](tailscale.com/) on your personal server and phone
 3. Open the Khoj web interface of the server from your phone browser.<br /> It should be `http://tailscale-ip-of-server:8000` or `http://name-of-server:8000` if you've setup [MagicDNS](https://tailscale.com/kb/1081/magicdns/)
 4. Click the [Add to Homescreen](https://developer.mozilla.org/en-US/docs/Web/Progressive_web_apps/Add_to_home_screen) button
-5. Enjoy exploring your notes, transactions and images from your phone!
+5. Enjoy exploring your notes, documents and images from your phone!

 ![](https://github.com/khoj-ai/khoj/blob/master/docs/khoj_pwa_android.png?)

@@ -399,7 +399,7 @@ pip install -e .[dev]
 - Delete `content-type` and `processor` sub-section(s) irrelevant for your use-case
 - Restart khoj

-Note: Wait after configuration for khoj to Load ML model, generate embeddings and expose API to query notes, images, transactions etc specified in config YAML
+Note: Wait after configuration for khoj to Load ML model, generate embeddings and expose API to query notes, images, documents etc specified in config YAML

 #### Using Docker
 ##### 1. Clone
@@ -410,7 +410,7 @@ git clone https://github.com/khoj-ai/khoj && cd khoj

 ##### 2. Configure

-- **Required**: Update [docker-compose.yml](./docker-compose.yml) to mount your images, (org-mode or markdown) notes, pdf, Github repositories, and beancount directories
+- **Required**: Update [docker-compose.yml](./docker-compose.yml) to mount your images, (org-mode or markdown) notes, PDFs and Github repositories
 - **Optional**: Edit application configuration in [khoj_docker.yml](./config/khoj_docker.yml)

 ##### 3. Run
@@ -449,7 +449,7 @@ python3 -m pip install pyqt6 # As conda does not support pyqt6 yet
 ```shell
 python3 -m src.khoj.main -vv
 ```
-Load ML model, generate embeddings and expose API to query notes, images, transactions etc specified in config YAML
+Load ML model, generate embeddings and expose API to query notes, images, documents etc specified in config YAML

 ##### 5. Upgrade
 ```shell
@@ -18,7 +18,6 @@ services:
       # must match the path prefix in your config file.
       - ./tests/data/org/:/data/org/
       - ./tests/data/images/:/data/images/
-      - ./tests/data/ledger/:/data/ledger/
       - ./tests/data/markdown/:/data/markdown/
       - ./tests/data/pdf/:/data/pdf/
       # Embeddings and models are populated after the first run
@@ -19,7 +19,6 @@ keywords = [
     "AI",
     "org-mode",
     "markdown",
-    "beancount",
     "images",
     "pdf",
 ]
@@ -4,7 +4,7 @@

 ;; Author: Debanjum Singh Solanky <debanjum@gmail.com>
 ;; Description: An AI personal assistant for your digital brain
-;; Keywords: search, chat, org-mode, outlines, markdown, pdf, beancount, image
+;; Keywords: search, chat, org-mode, outlines, markdown, pdf, image
 ;; Version: 0.7.0
 ;; Package-Requires: ((emacs "27.1") (transient "0.3.0") (dash "2.19.1"))
 ;; URL: https://github.com/khoj-ai/khoj/tree/master/src/interface/emacs
@@ -29,8 +29,7 @@
 ;;; Commentary:

 ;; Create an AI personal assistant for your `org-mode', `markdown' notes,
-;; `beancount' transactions, PDFs and images. This package exposes
-;; two assistance modes, search and chat:
+;; PDFs and images. The assistant exposes 2 modes, search and chat:
 ;;
 ;; Chat provides faster answers, iterative discovery and assisted
 ;; creativity. It requires your OpenAI API key to access GPT models
@@ -93,7 +92,6 @@
   :group 'khoj
   :type '(choice (const "org")
                  (const "markdown")
-                 (const "ledger")
                  (const "image")
                  (const "pdf")))

@@ -119,7 +117,6 @@

 (declare-function org-element-property "org-mode" (PROPERTY ELEMENT))
 (declare-function org-element-type "org-mode" (ELEMENT))
-(declare-function beancount-mode "beancount" ())
 (declare-function markdown-mode "markdown-mode" ())
 (declare-function which-key--show-keymap "which-key" (KEYMAP-NAME KEYMAP &optional PRIOR-ARGS ALL
                                                                   NO-PAGING FILTER))
@@ -135,8 +132,6 @@ NO-PAGING FILTER))
             "C-x m | markdown\n")
           (when (member 'org enabled-content-types)
             "C-x o | org-mode\n")
-          (when (member 'ledger enabled-content-types)
-            "C-x l | ledger\n")
           (when (member 'image enabled-content-types)
             "C-x i | image\n")
           (when (member 'pdf enabled-content-types)
@@ -146,7 +141,6 @@ NO-PAGING FILTER))
 (defvar khoj--reference-count 0 "Track number of references currently in chat bufffer.")
 (defun khoj--search-markdown () "Set content-type to `markdown'." (interactive) (setq khoj--content-type "markdown"))
 (defun khoj--search-org () "Set content-type to `org-mode'." (interactive) (setq khoj--content-type "org"))
-(defun khoj--search-ledger () "Set content-type to `ledger'." (interactive) (setq khoj--content-type "ledger"))
 (defun khoj--search-images () "Set content-type to image." (interactive) (setq khoj--content-type "image"))
 (defun khoj--search-pdf () "Set content-type to pdf." (interactive) (setq khoj--content-type "pdf"))
 (defun khoj--improve-rank () "Use cross-encoder to rerank search results." (interactive) (khoj--incremental-search t))
@@ -159,8 +153,6 @@ NO-PAGING FILTER))
       (define-key kmap (kbd "C-x m") #'khoj--search-markdown))
     (when (member 'org enabled-content-types)
       (define-key kmap (kbd "C-x o") #'khoj--search-org))
-    (when (member 'ledger enabled-content-types)
-      (define-key kmap (kbd "C-x l") #'khoj--search-ledger))
     (when (member 'image enabled-content-types)
       (define-key kmap (kbd "C-x i") #'khoj--search-images))
     (when (member 'pdf enabled-content-types)
@@ -531,18 +523,6 @@ CONFIG is json obtained from Khoj config API."
                ;; remove leading (, ) or SPC from extracted entries string
                (replace-regexp-in-string "^[\(\) ]" "")))

-(defun khoj--extract-entries-as-ledger (json-response query)
-  "Convert JSON-RESPONSE, QUERY from API to ledger entries."
-  (thread-last json-response
-               ;; extract and render entries from API response
-               (mapcar (lambda (args) (format "%s\n\n" (cdr (assoc 'entry args)))))
-               ;; Set query as heading in rendered results buffer
-               (format ";; %s\n\n%s\n" query)
-               ;; remove leading (, ) or SPC from extracted entries string
-               (replace-regexp-in-string "^[\(\) ]" "")
-               ;; remove trailing (, ) or SPC from extracted entries string
-               (replace-regexp-in-string "[\(\) ]$" "")))
-
 (defun khoj--extract-entries-as-pdf (json-response query)
   "Convert QUERY, JSON-RESPONSE from API with PDF results to `org-mode' entries."
   (thread-last
@@ -614,7 +594,6 @@ CONFIG is json obtained from Khoj config API."
   (let ((enabled-content-types (khoj--get-enabled-content-types))
         (file-extension (file-name-extension buffer-name)))
     (cond
-     ((and (member 'ledger enabled-content-types) (or (equal file-extension "bean") (equal file-extension "beancount"))) "ledger")
      ((and (member 'org enabled-content-types) (equal file-extension "org")) "org")
      ((and (member 'org enabled-content-types) (equal file-extension "pdf")) "pdf")
      ((and (member 'markdown enabled-content-types) (or (equal file-extension "markdown") (equal file-extension "md"))) "markdown")
@@ -673,7 +652,6 @@ Render results in BUFFER-NAME using QUERY, CONTENT-TYPE."
          (cond ((equal content-type "org") (khoj--extract-entries-as-org json-response query))
                ((equal content-type "markdown") (khoj--extract-entries-as-markdown json-response query))
                ((equal content-type "pdf") (khoj--extract-entries-as-pdf json-response query))
-               ((equal content-type "ledger") (khoj--extract-entries-as-ledger json-response query))
                ((equal content-type "image") (khoj--extract-entries-as-images json-response query))
                (t (khoj--extract-entries json-response query))))
    (cond ((or (equal content-type "all")
@@ -688,7 +666,6 @@ Render results in BUFFER-NAME using QUERY, CONTENT-TYPE."
                                      (org-set-startup-visibility)))
          ((equal content-type "markdown") (progn (markdown-mode)
                                                  (visual-line-mode)))
-         ((equal content-type "ledger") (beancount-mode))
          ((equal content-type "image") (progn (shr-render-region (point-min) (point-max))
                                               (goto-char (point-min))))
          (t (fundamental-mode))))
@@ -1004,7 +981,7 @@ Paragraph only starts at first text after blank line."
   ;; set content type to: last used > based on current buffer > default type
   :init-value (lambda (obj) (oset obj value (format "--content-type=%s" (or khoj--content-type (khoj--buffer-name-to-content-type (buffer-name))))))
   ;; dynamically set choices to content types enabled on khoj backend
-  :choices (or (ignore-errors (mapcar #'symbol-name (khoj--get-enabled-content-types))) '("all" "org" "markdown" "pdf" "ledger" "image")))
+  :choices (or (ignore-errors (mapcar #'symbol-name (khoj--get-enabled-content-types))) '("all" "org" "markdown" "pdf" "image")))

 (transient-define-suffix khoj--search-command (&optional args)
   (interactive (list (transient-args transient-current-command)))
@@ -1064,7 +1041,7 @@ Paragraph only starts at first text after blank line."

 ;;;###autoload
 (defun khoj ()
-  "Provide natural, search assistance for your notes, transactions and images."
+  "Provide natural, search assistance for your notes, documents and images."
   (interactive)
   (when khoj-auto-setup
     (khoj-setup t))
@@ -112,46 +112,6 @@ Rule everything\n\
 \n"))))


-(ert-deftest khoj-tests--extract-entries-as-ledger ()
-  "Test `json-response', `query' from API formatted as beancount ledger."
-  (let ((user-query "Become God")
-        (json-response-from-khoj-backend
-         (json-read-from-string
-          "[\
-{\
-\"entry\": \"4242-04-01 * \\\"Penance Center\\\" \\\"Book Stay for 10,000 Years\\\"\\n Expenses:Health:Mental 15 GOLD\\n Assets:Commodities:Gold\",\
-\"score\": \"0.42\",\
-\"additional\": {\
-\"file\": \"/home/ravan/ledger.beancount\",\
-\"compiled\": \"4242-04-01 * \\\"Penance Center\\\" \\\"Book Stay for 10,000 Years\\\" Expenses:Health:Mental 15 GOLD Assets:Commodities:Gold\"\
-}\
-},\
-{\
-\"entry\": \"14242-04-01 * \\\"Brahma\\\" \\\"Boon for Invincibility from Higher Beings\\\"\\n Income:Health -1,00,00,000 LIFE\\n Assets:Commodities:Life\",\
-\"score\": \"0.42\",\
-\"additional\": {\
-\"file\": \"/home/ravan/ledger.beancount\",\
-\"compiled\": \"4242-04-01 * \\\"Brahma\\\" \\\"Boon for Invincibility from Higher Beings\\\" Income:Health -1,00,00,000 LIFE Assets:Commodities:Life\"\
-}\
-}]\
-")))
-    (should
-     (equal
-      (khoj--extract-entries-as-ledger json-response-from-khoj-backend user-query)
-      ";; Become God\n\
-\n\
-4242-04-01 * \"Penance Center\" \"Book Stay for 10,000 Years\"\n\
-Expenses:Health:Mental 15 GOLD\n\
-Assets:Commodities:Gold\n\
-\n\
-14242-04-01 * \"Brahma\" \"Boon for Invincibility from Higher Beings\"\n\
-Income:Health -1,00,00,000 LIFE\n\
-Assets:Commodities:Life\n\
-\n\
-\n\
-"))))
-
-

 ;; -------------------------------------
 ;; Test Helpers for Find Similar Feature
@@ -12,7 +12,6 @@ from fastapi.staticfiles import StaticFiles

 # Internal Packages
 from khoj.processor.conversation.gpt import summarize
-from khoj.processor.ledger.beancount_to_jsonl import BeancountToJsonl
 from khoj.processor.jsonl.jsonl_to_jsonl import JsonlToJsonl
 from khoj.processor.markdown.markdown_to_jsonl import MarkdownToJsonl
 from khoj.processor.org_mode.org_to_jsonl import OrgToJsonl
@@ -122,18 +121,6 @@ def configure_search(model: SearchModels, config: FullConfig, regenerate: bool,
             filters=[DateFilter(), WordFilter(), FileFilter()],
         )

-    # Initialize Ledger Search
-    if (t == state.SearchType.Ledger or t == None) and config.content_type.ledger and config.search_type.symmetric:
-        logger.info("💸 Setting up search for ledger")
-        # Extract Entries, Generate Ledger Embeddings
-        model.ledger_search = text_search.setup(
-            BeancountToJsonl,
-            config.content_type.ledger,
-            search_config=config.search_type.symmetric,
-            regenerate=regenerate,
-            filters=[DateFilter(), WordFilter(), FileFilter()],
-        )
-
     # Initialize PDF Search
     if (t == state.SearchType.Pdf or t == None) and config.content_type.pdf and config.search_type.asymmetric:
         logger.info("🖨️ Setting up search for pdf")
@@ -47,12 +47,6 @@
                 }).join("\n");
             }

-            function render_ledger(query, data) {
-                return data.map(function (item) {
-                    return `<div class="results-ledger">` + `<p>${item.entry}</p>` + `</div>`;
-                }).join("\n");
-            }
-
             function render_pdf(query, data) {
                 return data.map(function (item) {
                     let compiled_lines = item.additional.compiled.split("\n");
@@ -90,8 +84,6 @@
                     results = render_org(query, data, "org-");
                 } else if (type === "image") {
                     results = data.map(render_image).join('');
-                } else if (type === "ledger") {
-                    results = render_ledger(query, data);
                 } else if (type === "pdf") {
                     results = render_pdf(query, data);
                 } else if (type === "github" || type === "all") {
@@ -360,8 +352,7 @@
             white-space: pre-wrap;
         }
         .results-pdf,
-        .results-plugin,
-        .results-ledger {
+        .results-plugin {
             text-align: left;
             white-space: pre-line;
         }
@@ -143,19 +143,15 @@ search_type = """
 Objective: Extract search type from user query and return information as JSON

 Allowed search types are listed below:
-- search-type=["notes","ledger","image", "pdf"]
+- search-type=["notes", "image", "pdf"]

 Some examples are given below for reference:
 Q:What fiction book was I reading last week about AI starship?
 A:{ "search-type": "notes" }
 Q: What did the lease say about early termination
 A: { "search-type": "pdf" }
-Q:How much did I spend at Subway for dinner last time?
-A:{ "search-type": "ledger" }
 Q:Can you recommend a movie to watch from my notes?
 A:{ "search-type": "notes" }
-Q:When did I buy Groceries last?
-A:{ "search-type": "ledger" }
 Q:When did I go surfing last?
 A:{ "search-type": "notes" }
 Q:"""
@@ -1,133 +0,0 @@
-# Standard Packages
-import glob
-import re
-import logging
-from typing import List
-
-# Internal Packages
-from khoj.processor.text_to_jsonl import TextToJsonl
-from khoj.utils.helpers import get_absolute_path, is_none_or_empty, timer
-from khoj.utils.constants import empty_escape_sequences
-from khoj.utils.jsonl import dump_jsonl, compress_jsonl_data
-from khoj.utils.rawconfig import Entry
-
-
-logger = logging.getLogger(__name__)
-
-
-class BeancountToJsonl(TextToJsonl):
-    # Define Functions
-    def process(self, previous_entries=None):
-        # Extract required fields from config
-        beancount_files, beancount_file_filter, output_file = (
-            self.config.input_files,
-            self.config.input_filter,
-            self.config.compressed_jsonl,
-        )
-
-        # Input Validation
-        if is_none_or_empty(beancount_files) and is_none_or_empty(beancount_file_filter):
-            print("At least one of beancount-files or beancount-file-filter is required to be specified")
-            exit(1)
-
-        # Get Beancount Files to Process
-        beancount_files = BeancountToJsonl.get_beancount_files(beancount_files, beancount_file_filter)
-
-        # Extract Entries from specified Beancount files
-        with timer("Parse transactions from Beancount files into dictionaries", logger):
-            current_entries = BeancountToJsonl.convert_transactions_to_maps(
-                *BeancountToJsonl.extract_beancount_transactions(beancount_files)
-            )
-
-        # Split entries by max tokens supported by model
-        with timer("Split entries by max token size supported by model", logger):
-            current_entries = self.split_entries_by_max_tokens(current_entries, max_tokens=256)
-
-        # Identify, mark and merge any new entries with previous entries
-        with timer("Identify new or updated transaction", logger):
-            if not previous_entries:
-                entries_with_ids = list(enumerate(current_entries))
-            else:
-                entries_with_ids = TextToJsonl.mark_entries_for_update(
-                    current_entries, previous_entries, key="compiled", logger=logger
-                )
-
-        with timer("Write transactions to JSONL file", logger):
-            # Process Each Entry from All Notes Files
-            entries = list(map(lambda entry: entry[1], entries_with_ids))
-            jsonl_data = BeancountToJsonl.convert_transaction_maps_to_jsonl(entries)
-
-            # Compress JSONL formatted Data
-            if output_file.suffix == ".gz":
-                compress_jsonl_data(jsonl_data, output_file)
-            elif output_file.suffix == ".jsonl":
-                dump_jsonl(jsonl_data, output_file)
-
-        return entries_with_ids
-
-    @staticmethod
-    def get_beancount_files(beancount_files=None, beancount_file_filters=None):
-        "Get Beancount files to process"
-        absolute_beancount_files, filtered_beancount_files = set(), set()
-        if beancount_files:
-            absolute_beancount_files = {get_absolute_path(beancount_file) for beancount_file in beancount_files}
-        if beancount_file_filters:
-            filtered_beancount_files = {
-                filtered_file
-                for beancount_file_filter in beancount_file_filters
-                for filtered_file in glob.glob(get_absolute_path(beancount_file_filter), recursive=True)
-            }
-
-        all_beancount_files = sorted(absolute_beancount_files | filtered_beancount_files)
-
-        files_with_non_beancount_extensions = {
-            beancount_file
-            for beancount_file in all_beancount_files
-            if not beancount_file.endswith(".bean") and not beancount_file.endswith(".beancount")
-        }
-        if any(files_with_non_beancount_extensions):
-            print(f"[Warning] There maybe non beancount files in the input set: {files_with_non_beancount_extensions}")
-
-        logger.debug(f"Processing files: {all_beancount_files}")
-
-        return all_beancount_files
-
-    @staticmethod
-    def extract_beancount_transactions(beancount_files):
-        "Extract entries from specified Beancount files"
-
-        # Initialize Regex for extracting Beancount Entries
-        transaction_regex = r"^\n?\d{4}-\d{2}-\d{2} [\*|\!] "
-        empty_newline = f"^[\n\r\t\ ]*$"
-
-        entries = []
-        transaction_to_file_map = []
-        for beancount_file in beancount_files:
-            with open(beancount_file) as f:
-                ledger_content = f.read()
-                transactions_per_file = [
-                    entry.strip(empty_escape_sequences)
-                    for entry in re.split(empty_newline, ledger_content, flags=re.MULTILINE)
-                    if re.match(transaction_regex, entry)
-                ]
-                transaction_to_file_map += zip(transactions_per_file, [beancount_file] * len(transactions_per_file))
-                entries.extend(transactions_per_file)
-        return entries, dict(transaction_to_file_map)
-
-    @staticmethod
-    def convert_transactions_to_maps(parsed_entries: List[str], transaction_to_file_map) -> List[Entry]:
-        "Convert each parsed Beancount transaction into a Entry"
-        entries = []
-        for parsed_entry in parsed_entries:
-            entries.append(
-                Entry(compiled=parsed_entry, raw=parsed_entry, file=f"{transaction_to_file_map[parsed_entry]}")
-            )
-
-        logger.debug(f"Converted {len(parsed_entries)} transactions to dictionaries")
-
-        return entries
-
-    @staticmethod
-    def convert_transaction_maps_to_jsonl(entries: List[Entry]) -> str:
-        "Convert each Beancount transaction entry to JSON and collate as JSONL"
-        return "".join([f"{entry.to_json()}\n" for entry in entries])
@@ -171,11 +171,9 @@ async def search(
     defiltered_query = filter.defilter(user_query)

     encoded_asymmetric_query = None
-    if t == SearchType.All or (t != SearchType.Ledger and t != SearchType.Image):
+    if t == SearchType.All or t != SearchType.Image:
         text_search_models: List[TextSearchModel] = [
-            model
-            for model_name, model in state.model.__dict__.items()
-            if isinstance(model, TextSearchModel) and model_name != "ledger_search"
+            model for model in state.model.__dict__.values() if isinstance(model, TextSearchModel)
         ]
         if text_search_models:
             with timer("Encoding query took", logger=logger):
@@ -244,19 +242,6 @@ async def search(
                 )
             ]

-        if (t == SearchType.Ledger) and state.model.ledger_search:
-            # query transactions
-            search_futures += [
-                executor.submit(
-                    text_search.query,
-                    user_query,
-                    state.model.ledger_search,
-                    rank_results=r or False,
-                    score_threshold=score_threshold,
-                    dedupe=dedupe or True,
-                )
-            ]
-
         if (t == SearchType.Image) and state.model.image_search:
             # query images
             search_futures += [
@@ -16,7 +16,7 @@ import json
 web_client = APIRouter()
 templates = Jinja2Templates(directory=constants.web_directory)

-VALID_CONTENT_TYPES = ["org", "ledger", "markdown", "pdf"]
+VALID_TEXT_CONTENT_TYPES = ["org", "markdown", "pdf"]


 # Create Routes
@@ -60,7 +60,7 @@ if not state.demo:

     @web_client.get("/config/content_type/{content_type}", response_class=HTMLResponse)
     def content_config_page(request: Request, content_type: str):
-        if content_type not in VALID_CONTENT_TYPES:
+        if content_type not in VALID_TEXT_CONTENT_TYPES:
             return templates.TemplateResponse("config.html", context={"request": request})

         default_copy = constants.default_config.copy()
@@ -19,7 +19,6 @@ if TYPE_CHECKING:
 class SearchType(str, Enum):
     All = "all"
     Org = "org"
-    Ledger = "ledger"
     Markdown = "markdown"
     Image = "image"
     Pdf = "pdf"
@@ -60,7 +59,6 @@ class ImageSearchModel:
 @dataclass
 class SearchModels:
     org_search: TextSearchModel = None
-    ledger_search: TextSearchModel = None
     markdown_search: TextSearchModel = None
     pdf_search: TextSearchModel = None
     image_search: ImageSearchModel = None
@@ -22,12 +22,6 @@ default_config = {
             "compressed-jsonl": "~/.khoj/content/markdown/markdown.jsonl.gz",
             "embeddings-file": "~/.khoj/content/markdown/markdown_embeddings.pt",
         },
-        "ledger": {
-            "input-files": None,
-            "input-filter": None,
-            "compressed-jsonl": "~/.khoj/content/ledger/ledger.jsonl.gz",
-            "embeddings-file": "~/.khoj/content/ledger/ledger_embeddings.pt",
-        },
         "pdf": {
             "input-files": None,
             "input-filter": None,
@@ -72,7 +72,6 @@ class ImageContentConfig(ConfigBase):

 class ContentConfig(ConfigBase):
     org: Optional[TextContentConfig]
-    ledger: Optional[TextContentConfig]
     image: Optional[ImageContentConfig]
     markdown: Optional[TextContentConfig]
     pdf: Optional[TextContentConfig]
@@ -1,233 +0,0 @@
-; -*- mode: org; mode: beancount; -*-
-;; Otzi's Ledger: A 3rd Millenium B.C Mountain Shepherd's Ledger
-;;
-;; A stylized recreation of Otzi's transaction history from a few months before his death
-;; based on https://en.wikipedia.org/wiki/Otzi#Scientific_analyses
-
-* Options ; Beancount options
-#+STARTUP: content
-option "title" "Beancount Ledger"
-option "operating_currency" "COWRIE" ; The main currencies you use
-
-* Accounts ; Open all the accounts
-3300-04-01 open Equity:Sheep ANIMALS
-  description: "Inheritance from Parents"
-
-3300-04-01 open Income:Hunt ANIMALS
-  description: "From Hunting Animals"
-3300-04-01 open Income:Forage PLANTS
-  description: "From Foraging Wild Fruits, Plants"
-3300-04-01 open Income:Market COWRIE
-  description: "Assets sold at the market"
-
-3300-04-01 open Assets:Animal ANIMALS
-  description: "Animals Owned Like Sheep, Goats, Cows"
-3300-04-01 open Assets:Food MEALS
-  description: "Food for Consumption"
-3300-04-01 open Assets:Food:Meat MEALS
-  description: "Killed Animals for Consumption"
-3300-04-01 open Assets:Food:Veggie MEALS
-  description: "Procured, Foraged Fruits, Grains"
-3300-04-01 open Assets:Plant PLANTS
-  description: "Procured, Foraged Plants"
-3300-04-01 open Assets:Tools TOOLS
-  description: "Procured, Made Tools"
-3300-04-01 open Assets:Cash COWRIE
-  description: "Cowrie Shells in Pouch"
-
-3300-04-01 open Expenses:Medicine COWRIE
-  description: "Procured, Foraged Medicinals"
-3300-04-01 open Expenses:Tools:Weapons COWRIE
-  description: "Bought Weapons"
-3300-04-01 open Expenses:Food
-  description: "Bought, Consumed Meals"
-3300-04-01 open Expenses:Clothing COWRIE
-  description: "Bought Clothes"
-3300-04-01 open Expenses:Tools COWRIE
-  description: "Bought Tools"
-
-* Transactions
-3345-03-15 * "Parents" "Inheritance"
-  note: "Opening Balance"
-  Equity:Sheep -20 ANIMALS
-  Assets:Animal
-
-3345-03-26 * "Hauslabjoch Pass, Otzal Alps" "Red Deers"
-  Income:Hunt -2 ANIMALS {50 COWRIE}
-  Assets:Food:Meat 10 MEALS {7.5 COWRIE, "Deer"}
-  Assets:Animal 0.5 ANIMALS {50 COWRIE}
-
-3345-03-28 * "Hauslabjoch Pass, Otzal Alps" "Wild Berries"
-  Income:Forage -60 PLANTS
-  Assets:Food:Veggie 3 MEALS {20 PLANTS, "Berry"}
-
-3345-04-02 * "Hauslabjoch Pass, Otzal Alps" "Last Weeks Meals"
-  Assets:Food:Meat -7 MEALS {7.5 COWRIE, "Deer"}
-  Assets:Food:Veggie -3 MEALS {20 PLANTS, "Berry"}
-  Expenses:Food
-
-3345-04-02 * "Hauslabjoch Pass, Otzal Alps" "Sloe"
-  Income:Forage -50 PLANTS
-  Assets:Food:Veggie 5 MEALS {10 PLANTS, "Sloe"}
-
-3345-04-05 * "Hauslabjoch Pass, Otzal Alps" "Ibex"
-  Income:Hunt -2 ANIMALS {100 COWRIE}
-  Assets:Food:Meat 10 MEALS {15 COWRIE, "Ibex"}
-  Assets:Animal 0.5 ANIMALS {100 COWRIE}
-
-3345-04-08 * "Hauslabjoch Pass, Otzal Alps" "Birch Fungus Medicinal Mushroom"
-  Income:Forage -6 PLANTS {100 COWRIE}
-  Assets:Plant 6 PLANTS {100 COWRIE}
-
-3345-04-09 * "Hauslabjoch Pass, Otzal Alps" "Last Weeks Meals"
-  Assets:Food:Meat -3 MEALS {7.5 COWRIE, "Deer"}
-  Assets:Food:Meat -4 MEALS {15 COWRIE, "Ibex"}
-  Assets:Food:Veggie -3 MEALS {10 PLANTS, "Sloe"}
-  Expenses:Food
-
-3345-04-15 * "Innsbruck Farmers Market" "Sold Red Deers Skin, Antler"
-  Assets:Animal -0.5 ANIMALS {50 COWRIE}
-  Assets:Cash 25 COWRIE
-
-3345-04-15 * "Innsbruck Farmers Market" "Sold Ibex Skin, Antler"
-  Assets:Animal -0.5 ANIMALS {100 COWRIE}
-  Assets:Cash 50 COWRIE
-
-3345-04-15 * "Innsbruck Farmers Market" "Sold Birch Fungus Medicinal Mushroom"
-  Assets:Plant -5 PLANTS {100 COWRIE}
-  Assets:Cash 500 COWRIE
-
-3345-04-15 * "Innsbruck Farmers Market" "Snow Shoes: Bearskin, Deer hide, Tree Bark"
-  note: "Expensive Bearkskin but need not want"
-  Assets:Cash -90 COWRIE
-  Expenses:Clothing
-
-3345-04-15 * "Innsbruck Farmers Market" "Soft Grass Socks"
-  Assets:Cash -10 COWRIE
-  Expenses:Clothing
-
-3345-04-15 * "Innsbruck Farmers Market" "Cattle Shoelace"
-  Assets:Cash -5 COWRIE
-  Expenses:Clothing
-
-3345-04-15 * "Innsbruck Farmers Market" "Einkorn Wheat Bran Bread"
-  Assets:Cash -50 COWRIE
-  Assets:Food:Veggie 5 MEALS {10 COWRIE, "Bread"}
-
-3345-04-16 * "Enroute to Innsbruck" "Last Weeks Meals"
-  Assets:Food:Meat -6 MEALS {15 COWRIE, "Ibex"}
-  Assets:Food:Veggie -2 MEALS {10 PLANTS, "Sloe"}
-  Expenses:Food
-
-3345-04-16 * "Innsbruck Tools Market" "Firelighting Kit: Plants, Pyrite, Flint"
-  Assets:Cash -30 COWRIE
-  Expenses:Tools
-
-3345-04-16 * "Innsbruck Tools Market" "Flint Blade, Ash Handle Knife"
-  Assets:Cash -50 COWRIE
-  Expenses:Tools:Weapons
-
-3345-04-20 * "Tisenjoch Pass, Otzal Alps" "Chamois"
-  Income:Hunt -1 ANIMALS {100 COWRIE}
-  Assets:Food:Meat 5 MEALS {10 COWRIE, "Chamois"}
-  Assets:Animal 0.5 ANIMALS {100 COWRIE}
-
-3345-04-22 * "Tisenjoch Pass, Otzal Alps" "Roe Deer"
-  Income:Hunt -2 ANIMALS {50 COWRIE}
-  Assets:Food:Meat 10 MEALS {7.5 COWRIE, "Deer"}
-  Assets:Animal 0.5 ANIMALS {50 COWRIE}
-
-3345-04-23 * "Tisenjoch Pass, Otzal Alps" "Last Weeks Meals"
-  Assets:Food:Veggie -4 MEALS {10 COWRIE, "Bread"}
-  Assets:Food:Meat -4 MEALS {10 COWRIE, "Chamois"}
-  Assets:Food:Meat -3 MEALS {7.5 COWRIE, "Deer"}
-  Expenses:Food
-
-3345-04-25 * "Tisenjoch Pass, Otzal Alps" "Roe Deer Quiver"
-  Assets:Animal -0.25 ANIMALS {50 COWRIE}
-  Assets:Tools 1 TOOLS {12.50 COWRIE}
-
-3345-04-28 * "Tisenjoch Pass, Otzal Alps" "Wild Berries"
-  Income:Forage -60 PLANTS
-  Assets:Food:Veggie 3 MEALS {20 PLANTS, "Berry"}
-
-3345-04-30 * "Tisenjoch Pass, Otzal Alps" "Last Weeks Meals"
-  Assets:Food:Veggie -1 MEALS {10 COWRIE, "Bread"}
-  Assets:Food:Meat -1 MEALS {10 COWRIE, "Chamois"}
-  Assets:Food:Meat -6 MEALS {7.5 COWRIE, "Deer"}
-  Expenses:Food
-
-3345-05-02 * "Enroute to Bolzano City" "Poppy Seed"
-  Income:Forage -80 PLANTS
-  Assets:Food:Veggie 8 MEALS {10 PLANTS, "Poppy"}
-
-3345-05-06 * "Enroute to Bolzano City" "Barley, Flax Seeds"
-  Income:Forage -80 PLANTS
-  Assets:Food:Veggie 4 MEALS {10 PLANTS, "Barley"}
-  Assets:Food:Veggie 4 MEALS {10 PLANTS, "Flax"}
-
-3345-05-07 * "Enroute to Bolzano City" "Last Weeks Meals"
-  Assets:Food:Veggie -5 MEALS {10 PLANTS, "Poppy"}
-  Assets:Food:Veggie -3 MEALS {20 PLANTS, "Berry"}
-  Assets:Food:Meat -1 MEALS {7.5 COWRIE, "Deer"}
-  Expenses:Food
-
-3345-05-09 * "Bolzano City Market" "Sold Roe Deers Hide"
-  Assets:Animal -0.25 ANIMALS {50 COWRIE}
-  Assets:Cash 12.5 COWRIE
-
-3345-05-09 * "Bolzano City Market" "Sold Chamois Hide"
-  Assets:Animal -0.5 ANIMALS {100 COWRIE}
-  Assets:Cash 50 COWRIE
-
-3345-05-10 * "Bolzano City Market" "Yewood Handle Copper Axe"
-  note: "Expensive Bearkskin but need not want"
-  Assets:Cash -140 COWRIE
-  Expenses:Tools:Weapons
-
-3345-05-10 * "Bolzano City Market" "Sheepskin Hide Coat"
-  Assets:Cash -40 COWRIE
-  Expenses:Clothing
-
-3345-05-10 * "Bolzano City Market" "Sheepskin Loincloth"
-  Assets:Cash -20 COWRIE
-  Expenses:Clothing
-
-3345-05-10 * "Bolzano City Market" "Goat Skin Leggings"
-  Assets:Cash -40 COWRIE
-  Expenses:Clothing
-
-3345-05-10 * "Bolzano City Market" "Brown Bear Fur Hat"
-  Assets:Cash -60 COWRIE
-  Expenses:Clothing
-
-3345-05-10 * "Bolzano City Market" "Viburnum, Dogwood, Flint"
-  note: "For Making Arrows"
-  Assets:Cash -40 COWRIE
-  Expenses:Tools:Weapons
-
-3345-05-10 * "Bolzano City Market" "Yew Wood"
-  note: "For Making Yewood Longbow"
-  Assets:Cash -32.5 COWRIE
-  Expenses:Tools:Weapons
-
-3345-05-10 * "Bolzano City Market" "Birch Bark Baskets"
-  note: "Need Better Containers for Storage, Carrying"
-  Assets:Cash -30 COWRIE
-  Expenses:Tools
-
-3345-05-13 * "Near Feldthurns, South Tyrol" "Ibex"
-  Income:Hunt -2 ANIMALS {100 COWRIE}
-  Assets:Food:Meat 10 MEALS {15 COWRIE, "Ibex"}
-  Assets:Animal 0.5 ANIMALS {100 COWRIE}
-
-3345-05-14 * "Near Feldthurns, South Tyrol" "Last Weeks Meals"
-  Assets:Food:Veggie -4 MEALS {10 PLANTS, "Barley"}
-  Assets:Food:Veggie -3 MEALS {10 PLANTS, "Flax"}
-  Assets:Food:Veggie -3 MEALS {10 PLANTS, "Poppy"}
-  Expenses:Food
-
-3345-05-21 * "Fineilspitze Peak, Otzal Alps" "Last Weeks Meals"
-  Assets:Food:Meat -7 MEALS {15 COWRIE, "Ibex"}
-  Assets:Food:Veggie -1 MEALS {10 PLANTS, "Flax"}
-  Expenses:Food
@@ -1,18 +0,0 @@
-* The Beatles - Across The Universe :60s:BRITISH:POP:
-:PROPERTIES:
-:TYPE: song
-:END:
-:LOGBOOK:
-ENQUEUED: [1984-04-01 Sun 00:00]
-:END:
-
-* Ram Narayan :INDIAN:CLASSICAL:SARANGI:
-** Ram Narayan - Raag Kirwani Alap
-:PROPERTIES:
-:TYPE: song
-:QUERY: Raga Kirvani (feat. Suresh Talwalkar, François Auboux) (Alap)
-:CATEGORY: youtube
-:END:
-:LOGBOOK:
-ENQUEUED: [1984-04-01 Sun 00:00]
-:END:
@@ -1,118 +0,0 @@
-# Standard Packages
-import json
-
-# Internal Packages
-from khoj.processor.ledger.beancount_to_jsonl import BeancountToJsonl
-
-
-def test_no_transactions_in_file(tmp_path):
-    "Handle file with no transactions."
-    # Arrange
-    entry = f"""
-    - Bullet point 1
-    - Bullet point 2
-    """
-    beancount_file = create_file(tmp_path, entry)
-
-    # Act
-    # Extract Entries from specified Beancount files
-    entry_nodes, file_to_entries = BeancountToJsonl.extract_beancount_transactions(beancount_files=[beancount_file])
-
-    # Process Each Entry from All Beancount Files
-    jsonl_string = BeancountToJsonl.convert_transaction_maps_to_jsonl(
-        BeancountToJsonl.convert_transactions_to_maps(entry_nodes, file_to_entries)
-    )
-    jsonl_data = [json.loads(json_string) for json_string in jsonl_string.splitlines()]
-
-    # Assert
-    assert len(jsonl_data) == 0
-
-
-def test_single_beancount_transaction_to_jsonl(tmp_path):
-    "Convert transaction from single file to jsonl."
-    # Arrange
-    entry = f"""
-1984-04-01 * "Payee" "Narration"
-Expenses:Test:Test 1.00 KES
-Assets:Test:Test -1.00 KES
-"""
-    beancount_file = create_file(tmp_path, entry)
-
-    # Act
-    # Extract Entries from specified Beancount files
-    entries, entry_to_file_map = BeancountToJsonl.extract_beancount_transactions(beancount_files=[beancount_file])
-
-    # Process Each Entry from All Beancount Files
-    jsonl_string = BeancountToJsonl.convert_transaction_maps_to_jsonl(
-        BeancountToJsonl.convert_transactions_to_maps(entries, entry_to_file_map)
-    )
-    jsonl_data = [json.loads(json_string) for json_string in jsonl_string.splitlines()]
-
-    # Assert
-    assert len(jsonl_data) == 1
-
-
-def test_multiple_transactions_to_jsonl(tmp_path):
-    "Convert multiple transactions from single file to jsonl."
-    # Arrange
-    entry = f"""
-1984-04-01 * "Payee" "Narration"
-Expenses:Test:Test 1.00 KES
-Assets:Test:Test -1.00 KES
-\t\r
-1984-04-01 * "Payee" "Narration"
-Expenses:Test:Test 1.00 KES
-Assets:Test:Test -1.00 KES
-"""
-
-    beancount_file = create_file(tmp_path, entry)
-
-    # Act
-    # Extract Entries from specified Beancount files
-    entries, entry_to_file_map = BeancountToJsonl.extract_beancount_transactions(beancount_files=[beancount_file])
-
-    # Process Each Entry from All Beancount Files
-    jsonl_string = BeancountToJsonl.convert_transaction_maps_to_jsonl(
-        BeancountToJsonl.convert_transactions_to_maps(entries, entry_to_file_map)
-    )
-    jsonl_data = [json.loads(json_string) for json_string in jsonl_string.splitlines()]
-
-    # Assert
-    assert len(jsonl_data) == 2
-
-
-def test_get_beancount_files(tmp_path):
-    "Ensure Beancount files specified via input-filter, input-files extracted"
-    # Arrange
-    # Include via input-filter globs
-    group1_file1 = create_file(tmp_path, filename="group1-file1.bean")
-    group1_file2 = create_file(tmp_path, filename="group1-file2.bean")
-    group2_file1 = create_file(tmp_path, filename="group2-file1.beancount")
-    group2_file2 = create_file(tmp_path, filename="group2-file2.beancount")
-    # Include via input-file field
-    file1 = create_file(tmp_path, filename="ledger.bean")
-    # Not included by any filter
-    create_file(tmp_path, filename="not-included-ledger.bean")
-    create_file(tmp_path, filename="not-included-text.txt")
-
-    expected_files = sorted(map(str, [group1_file1, group1_file2, group2_file1, group2_file2, file1]))
-
-    # Setup input-files, input-filters
-    input_files = [tmp_path / "ledger.bean"]
-    input_filter = [tmp_path / "group1*.bean", tmp_path / "group2*.beancount"]
-
-    # Act
-    extracted_org_files = BeancountToJsonl.get_beancount_files(input_files, input_filter)
-
-    # Assert
-    assert len(extracted_org_files) == 5
-    assert extracted_org_files == expected_files
-
-
-# Helper Functions
-def create_file(tmp_path, entry=None, filename="ledger.beancount"):
-    beancount_file = tmp_path / filename
-    beancount_file.touch()
-    if entry:
-        beancount_file.write_text(entry)
-    return beancount_file
@@ -34,7 +34,7 @@ def test_search_with_invalid_content_type(client):

 # ----------------------------------------------------------------------------------------------------
 def test_search_with_valid_content_type(client):
-    for content_type in ["all", "org", "markdown", "ledger", "image", "pdf", "plugin1"]:
+    for content_type in ["all", "org", "markdown", "image", "pdf", "plugin1"]:
         # Act
         response = client.get(f"/api/search?q=random&t={content_type}")
         # Assert
@@ -52,7 +52,7 @@ def test_update_with_invalid_content_type(client):

 # ----------------------------------------------------------------------------------------------------
 def test_update_with_valid_content_type(client):
-    for content_type in ["org", "markdown", "ledger", "image", "pdf", "plugin1"]:
+    for content_type in ["org", "markdown", "image", "pdf", "plugin1"]:
         # Act
         response = client.get(f"/api/update?t={content_type}")
         # Assert
@@ -70,7 +70,7 @@ def test_regenerate_with_invalid_content_type(client):

 # ----------------------------------------------------------------------------------------------------
 def test_regenerate_with_valid_content_type(client):
-    for content_type in ["org", "markdown", "ledger", "image", "pdf", "plugin1"]:
+    for content_type in ["org", "markdown", "image", "pdf", "plugin1"]:
         # Act
         response = client.get(f"/api/update?force=true&t={content_type}")
         # Assert