Paperless-ngx, an excellent application for managing your documents

The release of a new version of Paperless-ngx was recently announced. Paperless-ngx is a web-based document management application that converts paper (physical) documents into electronic documents that can be searched in full text, downloaded and stored online.

For those who do not know Paperless-ngx, it is a fork of the paperless-ng project, which in turn is a fork of the original paperless project (the forks were created to continue development after the previous developers stopped maintaining them).

About Paperless-ngx

After a scanned document is uploaded by any of the available methods (FTP, the web interface, the Android app, or email via IMAP), the program performs optical character recognition (OCR) using the Tesseract engine.

In addition, it lets you organize and index scanned documents with tags, correspondents, document types and more. The OCR step adds selectable text to image-only documents, and the application can assign tags, correspondents, and document types to documents.

Paperless-ngx supports PDF documents, images, plain text files, and Office documents (Word, Excel, PowerPoint, and their LibreOffice equivalents). Office document support is optional and provided by Apache Tika.

Paperless-ngx stores your documents on disk. File names and folders are managed by Paperless and their format is freely configurable. It also has a document matching function powered by machine learning.

The application is optimized for multi-core systems, so it can process multiple documents in parallel, and the built-in checker makes sure that the document archive is healthy.
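
To illustrate what a configurable file name format means in practice, here is a minimal Python sketch. It assumes a format string with placeholders such as {created_year}, {correspondent} and {title}, similar to what can be set through the PAPERLESS_FILENAME_FORMAT option. The helper render_storage_path and the sample metadata are hypothetical and are not part of Paperless-ngx itself.

```python
from pathlib import PurePosixPath


def render_storage_path(filename_format: str, metadata: dict, suffix: str = ".pdf") -> PurePosixPath:
    """Expand a placeholder-based format string into a relative storage path.

    Hypothetical helper for illustration only; Paperless-ngx performs its own
    rendering and sanitizing internally.
    """
    relative = filename_format.format(**metadata)
    return PurePosixPath(relative + suffix)


# Sample metadata (made up); the keys mirror the assumed placeholder names.
doc = {"created_year": "2022", "correspondent": "ACME", "title": "Invoice 42"}
path = render_storage_path("{created_year}/{correspondent}/{title}", doc)
print(path)
```

With a format like this, documents end up grouped in one folder per year and correspondent, which keeps the archive browsable even without the web interface.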

Other notable features of Paperless-ngx:

  • Single page application interface.
  • A dashboard that shows basic statistics and supports document uploads.
  • Filtering by tags, correspondents, types and more.
  • Customizable views can be saved and displayed on the dashboard.
  • Full text search helps you find what you need.
  • Auto completion suggests relevant words from your documents.
  • The results are sorted by relevance to your search query.
  • Highlighting shows you which parts of the document matched the query.
  • Search for similar documents (“More like this”)
  • Email Processing: Paperless aggregates documents from your email accounts.
  • Set up multiple accounts and filters for each account.
  • When adding documents from email, Paperless can move the messages to a new folder, mark them as read, mark them as important, or delete them.
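
The full text search listed above can also be reached programmatically. The sketch below only builds the request URL and does not contact a server. It assumes the REST API exposes documents under /api/documents/ with a query parameter for full text search. The base URL and the helper build_search_url are hypothetical, so check the API documentation of your installed version before relying on it.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, query: str) -> str:
    """Build a full text search URL for an (assumed) /api/documents/ endpoint."""
    return f"{base_url.rstrip('/')}/api/documents/?{urlencode({'query': query})}"


# Hypothetical local instance; replace with the address of your own server.
url = build_search_url("http://localhost:8000", "electricity invoice")
print(url)
```

A real request would also need authentication, which is left out here to keep the example small.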

What's new in Paperless-ngx 1.8.0

The highlight of this new version is that the pre- and post-processing scripts now receive their data through environment variables instead of command-line arguments. In addition, web interface thumbnails are now stored in WebP format instead of PNG, and the web interface configuration is stored in the database.

On the other hand, when the language of the document is changed, the interface shows a hint that the page needs to be reloaded, and if a Redis communication error occurs, more detailed information is displayed.

In addition, the web interface can now show the queue of documents waiting to be processed.
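
As a rough idea of what a post-processing script can look like now that its input arrives through environment variables, here is a minimal Python sketch. The variable names DOCUMENT_ID, DOCUMENT_FILE_NAME and DOCUMENT_TAGS are assumptions based on the project documentation and should be verified for your version. The function describe_document is hypothetical.

```python
import os
from typing import Mapping


def describe_document(env: Mapping[str, str]) -> str:
    """Summarize a consumed document from environment-style key/value pairs.

    The variable names are assumptions; missing values fall back to defaults.
    """
    doc_id = env.get("DOCUMENT_ID", "unknown")
    name = env.get("DOCUMENT_FILE_NAME", "unknown")
    # Tags are assumed to arrive as a comma-separated string.
    tags = [t for t in env.get("DOCUMENT_TAGS", "").split(",") if t]
    return f"Document {doc_id} ({name}) consumed with tags: {', '.join(tags) or 'none'}"


if __name__ == "__main__":
    # When run as a script, read the values from the real environment.
    print(describe_document(os.environ))
```

Reading from a mapping instead of positional arguments means the script keeps working if more variables are added later, since it simply ignores the ones it does not use.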

Finally, if you are interested in learning more about the application, you can check the details at the following link.

The code is written in Python using the Django framework and is released under the GPLv3 license.

How to install Paperless-ngx on Ubuntu and derivatives?

Those who want to install this application on their system should know that the easiest way to deploy it is with Docker.

The installation can be done by opening a terminal and running:

bash -c "$(curl -L https://raw.githubusercontent.com/paperless-ngx/paperless-ngx/main/install-paperless-ngx.sh)"

Those who prefer to build it themselves can refer to the instructions at the following link.

