A curated collection of interesting GitHub repositories
View the Project on GitHub tom-doerr/repo_posts
Extracts text and metadata from PDFs, Word documents, HTML, images, and more, with a fast Rust core and Python support.