khoj/tests/data
Raghav Tirumale 8eccd8a5e4
Support Indexing Images via OCR (#823)
- Added support for uploading .jpeg, .jpg, and .png files to Khoj from the Web and Desktop apps
- Updated indexer to generate raw text and entries using RapidOCR
- Details
  * added support for indexing images via OCR
  * fixed pyproject.toml
  * Update src/khoj/processor/content/images/image_to_entries.py
     Co-authored-by: Debanjum <debanjum@gmail.com>
  * Update src/khoj/processor/content/images/image_to_entries.py
     Co-authored-by: Debanjum <debanjum@gmail.com>
  * removed redundant try except blocks
  * updated desktop js file to support image formats
  * added tests for jpg and png
  * Fix processing for image to entries files
  * Update unit tests with working image indexer
  * Change png test from version verification to open-cv verification

---------

Co-authored-by: Debanjum <debanjum@gmail.com>
Co-authored-by: sabaimran <narmiabas@gmail.com>
2024-07-01 06:00:00 -07:00
..
docx Support Indexing Docx Files (#801) 2024-06-20 11:18:01 +05:30
images Support Indexing Images via OCR (#823) 2024-07-01 06:00:00 -07:00
markdown Bonus: Rename all md files to markdown for cleanliness 2023-06-29 11:27:47 -07:00
org Spell fix s/e.g/e.g./ across code, tests and docs 2024-06-24 15:24:45 +05:30
pdf Try adding dependencies for libgl in order to run OCR in github action unit tests 2023-11-05 15:09:40 -08:00
plaintext Fix plaintext HTML parsing and rendering (#464) 2023-08-27 11:24:30 -07:00
config.yml Update text search test since indexing ancestor hierarchy added 2023-11-17 15:26:55 -08:00