Extracting payment info from a rasterized invoice
In this short article I explain how I combined an OCR tool with a barcode reader and converter, all of them FLOSS (Free/Libre and Open Source Software), so that I can pay bills from my phone based on an image of an invoice on my desktop computer.
This short section explains the issue that forced me to come up with this solution. In general, it is good to know the background in order to integrate the solution properly into your workflow, but if you don't feel like it, just jump to the next section.
After getting a grant for scientific research in Finland, if you don't have a salary contract with a university or company to handle your insurance and pension, you have to apply for MELA (a.k.a. farmers' insurance). MELA, like most government and government-affiliated entities, has outdated workflows, portals, and websites, so simple tasks that should take 5 minutes in today's world (2021) take several weeks and involve tons of human interaction. Once MELA makes its decision according to your income and the regulations, you receive an invoice by snail mail (the traditional paper-based postal system), which you then have to pay. In stark contrast to the outdated, constantly crashing, and super complicated MELA portal, Posti (the Finnish postal service) has a fantastic website where you can usually get the PDF of such invoices almost immediately after they are issued, so you don't need to wait for the paper mail to arrive.
Another thing to know is that e-invoices in Finland usually come with a 1D (one-dimensional) barcode, and Finnish bank apps have a built-in 1D barcode reader to scan them quickly. Upon scanning, everything about the payment is filled in automatically, so you can just press the "Pay" button. Unfortunately, these bank apps are practically unmaintained from a UI/UX perspective, and their scanners do not read barcodes easily: it is a constant struggle to get the angle, zoom, and focus right. But there is a backup system that can be [ab]used :) . These bank apps usually accept a "virtual barcode", which is basically the barcode's data in plain text.
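To make "the data of the barcode in plain text" a bit more concrete, here is a rough sketch that splits such a string into its fields. It is not part of my workflow, and it rests on an assumption that is not verified against my own invoices: that the string follows the "version 4" layout of the Finnish bank barcode (1 version digit, 16 IBAN digits without the "FI" prefix, 6 euro digits, 2 cent digits, 3 reserved digits, 20 reference digits, and a 6-digit YYMMDD due date, 54 digits in total). The function name `parse_virtual_barcode` and its helpers are made up for this example.

```shell
# Hypothetical helper: print characters FROM..TO of a string.
_vb_field() { printf '%s\n' "$1" | cut -c"$2"-"$3"; }

# Hypothetical helper: drop leading zeros but keep at least one digit.
_vb_strip0() { printf '%s\n' "$1" | sed 's/^0*\([0-9]\)/\1/'; }

# Split a 54-digit virtual barcode into human-readable fields.
# ASSUMPTION: "version 4" field layout as described in the text above.
parse_virtual_barcode() {
    code=$1
    case $code in
        ''|*[!0-9]*) echo "expected digits only" >&2; return 1 ;;
    esac
    if [ "${#code}" -ne 54 ]; then
        echo "expected 54 digits, got ${#code}" >&2
        return 1
    fi
    if [ "$(_vb_field "$code" 1 1)" != 4 ]; then
        echo "only the version 4 layout is handled in this sketch" >&2
        return 1
    fi
    printf 'IBAN: FI%s\n' "$(_vb_field "$code" 2 17)"
    printf 'Amount: %s.%s EUR\n' \
        "$(_vb_strip0 "$(_vb_field "$code" 18 23)")" \
        "$(_vb_field "$code" 24 25)"
    # Characters 26-28 are reserved and skipped here.
    printf 'Reference: %s\n' "$(_vb_strip0 "$(_vb_field "$code" 29 48)")"
    printf 'Due (YYMMDD): %s\n' "$(_vb_field "$code" 49 54)"
}
```

For example, with an entirely made-up account number, `parse_virtual_barcode "41234567890123456$(printf '%06d' 123)45000$(printf '%020d' 12345)210615"` should report an amount of 123.45 EUR and the reference 12345. Comparing these fields with the numbers printed on the invoice is a cheap sanity check before pressing "Pay".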
I received an invoice as a PDF, but it only had a 1D barcode, and the whole PDF was an image rather than parsable text, hence no copy-paste. I had to translate the barcode into a format that I could read and then pay the bill. On top of all this, the bill was in Finnish (a language I'm not very comfortable handling financial matters in, as I'm not so good at comprehending official and technical Finnish text [yet]).
In my free time I help with Flameshot[Git], and I realized I can use it to select the parts of the screen I want and then pass them to other software that handles either the OCR (Optical Character Recognition) or the decoding and encoding of barcodes.
All I needed to do was select the region I want to OCR using flameshot, then pass it to tesseract for OCR (i.e., convert it to text), and copy the resulting text to the clipboard so that I can use Google Translate or DeepL to make sure I have understood the Finnish text correctly.
For OCR I wrote the following one-liner (broke it into three lines to improve readability):
flameshot gui --raw \
  | tesseract -l fin stdin stdout \
  | xclip -selection clipboard
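If you want to reuse this for invoices in other languages, the pipeline can be wrapped in a small shell function. This is only a sketch: the function name `ocr_region` is made up, the language defaults to `fin` as above, and it assumes the matching tesseract language data is installed.

```shell
# Hypothetical wrapper around the OCR pipeline.
# Usage: ocr_region [LANG]   (LANG is a tesseract language code, default: fin)
ocr_region() {
    lang=${1:-fin}
    # Fail early with a readable message if a tool is missing.
    for cmd in flameshot tesseract xclip; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "ocr_region: missing dependency: $cmd" >&2
            return 1
        fi
    done
    # Select a screen region, OCR it, and put the text on the clipboard.
    flameshot gui --raw \
      | tesseract -l "$lang" stdin stdout \
      | xclip -selection clipboard
}
```

Calling `ocr_region` behaves like the one-liner, while `ocr_region eng` would OCR an English region instead.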
For the barcode, the situation is different.
I have an app on my phone that reads QR codes perfectly.
So if I can read the 1D barcode, decode it with zbar, encode it as a QR code using qrencode, and then show it on the screen (in this case using imagemagick), I can scan it with my phone app and paste the barcode data into my banking app.
I came up with the following one-liner:
flameshot gui --raw \
  | zbarimg --raw --quiet PNG:- \
  | qrencode --level=H --size=7 --output="-" \
  | display -
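One weakness of the one-liner is that if zbarimg fails to decode anything, qrencode just receives empty input. A slightly more defensive variant, sketched below, checks the decoded text first and draws the QR code directly in the terminal using qrencode's `ANSIUTF8` output type instead of opening an image window. The function name `barcode_to_qr` is made up, and the terminal is assumed to render UTF-8 block characters well enough for the phone to scan.

```shell
# Hypothetical wrapper: decode a 1D barcode from a screen region and
# re-encode it as a QR code drawn in the terminal.
barcode_to_qr() {
    # zbarimg exits non-zero when it finds no barcode; treat that as empty.
    data=$(flameshot gui --raw | zbarimg --raw --quiet PNG:-) || data=""
    if [ -z "$data" ]; then
        echo "barcode_to_qr: no barcode decoded, try a tighter selection" >&2
        return 1
    fi
    # Show the decoded text on stderr so it can be compared with the invoice.
    printf 'decoded: %s\n' "$data" >&2
    printf '%s' "$data" | qrencode --level=H -t ANSIUTF8
}
```

The decoded text is printed as well, so you can compare it with the invoice before scanning the QR code with your phone.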
With this, I paid two crappy, rasterized, low-quality invoices in less than a minute with zero human error.