All You Need To Know About OCR

When you convert images to PDF, you naturally receive a PDF containing those images. It’s a great way to prepare images for printing, yet that is not all a conversion to PDF can do. The convert to PDF function of PDF2Go also allows you to extract text from an image!

Why would you need that, you ask? Imagine you’re at a lecture, presentation or conference and instead of speed-typing or writing things down, you simply take a photo of the slides. Later, when you want to revise, add comments or simply go over your notes again, you only have them in an image format.

Luckily, with the help of a feature we will explain here, you can extract the text from your images and use it properly! 

Optical Character Recognition (OCR)

When you go to PDF2Go.com/convert-pdf and upload a file, you will find a button underneath the “Optional settings” headline that reads: Use OCR.

Let’s have a look at what this function does.

What is OCR?

OCR stands for Optical Character Recognition. It is the conversion of images that contain text into machine-encoded text. This works with typed and printed text and, on rare occasions, also with very clear handwritten text.

OCR is heavily used in the process of digitizing documents like bank statements, passports, invoices and other scanned documents. It allows cataloging, storing, digital editing and better searching of the records.

How does it work?

There are two main ways in which text in an image is processed by OCR: one character at a time, or one word at a time. The latter can only be used for languages that separate words with spaces. Both of these processes are known as OCR. The process that interprets handwritten or cursive text is known as Intelligent Character Recognition (ICR), which relies heavily on machine learning.
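To make the difference between the character-by-character and the word-by-word approach more tangible, here is a minimal Python sketch. It assumes the image has already been reduced to a tiny black-and-white bitmap. The `LINE` sample, the `#`/`.` notation, the `word_gap` value and all function names are made up for illustration. It simply splits a line of text wherever it finds blank columns, while real OCR engines use far more robust techniques.

```python
from typing import List, Tuple

# Hypothetical sample: one line of "text" as a bitmap.
# '#' is an ink pixel, '.' is background. Three glyphs, two of them close together.
LINE = [
    "##.##....##",
    "#..#.....#.",
    "##.##....##",
]


def ink_columns(bitmap: List[str]) -> List[bool]:
    """For each pixel column, report whether it contains any ink."""
    width = len(bitmap[0])
    return [any(row[x] == "#" for row in bitmap) for x in range(width)]


def find_runs(has_ink: List[bool]) -> List[Tuple[int, int]]:
    """Return (start, end) ranges of consecutive ink columns, end exclusive."""
    runs = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a glyph begins
        elif not ink and start is not None:
            runs.append((start, x))        # a blank column ends the glyph
            start = None
    if start is not None:
        runs.append((start, len(has_ink)))
    return runs


def group_into_words(chars: List[Tuple[int, int]],
                     word_gap: int = 3) -> List[Tuple[int, int]]:
    """Merge character ranges separated by fewer than word_gap blank columns."""
    words: List[Tuple[int, int]] = []
    for start, end in chars:
        if words and start - words[-1][1] < word_gap:
            words[-1] = (words[-1][0], end)  # small gap: same word
        else:
            words.append((start, end))       # large gap: new word
    return words


char_ranges = find_runs(ink_columns(LINE))
word_ranges = group_into_words(char_ranges)
print("characters:", char_ranges)
print("words:", word_ranges)
```

Narrow gaps separate characters and wider gaps separate words, which is why the word-based approach depends on languages that put spaces between words.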

Early versions of OCR processors were trained with images containing one character of a specific font. The program was then able to convert scans or photos containing text in the trained font. By now, most fonts are supported by OCR, and slightly crooked scans and photos that are not perfectly aligned can be processed as well.

To make the scan as successful as possible, many OCR programs pre-process the (scanned) image before the actual OCR process begins. This includes de-skewing, despeckling and converting a colored image to greyscale, among other things. Then, the single characters (or words) are detected, segmented and interpreted. In some cases, the output of the OCR is improved even further. This is done by post-processing the converted text and comparing it to a lexicon of words in the given language.
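The lexicon comparison mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny `LEXICON` list, the `cutoff` value and the function names are invented for this example, and the fuzzy matching from Python's standard `difflib` module stands in for whatever method a real OCR program uses.

```python
import difflib

# Hypothetical mini-lexicon; a real one would hold many thousands of words.
LEXICON = ["optical", "character", "recognition", "invoice", "passport"]


def correct_word(word: str, cutoff: float = 0.8) -> str:
    """Replace a word with its closest lexicon entry, if one is similar enough."""
    lowered = word.lower()
    if lowered in LEXICON:
        return word  # already a known word, leave it untouched
    matches = difflib.get_close_matches(lowered, LEXICON, n=1, cutoff=cutoff)
    # Keep the original word when nothing in the lexicon is close enough.
    return matches[0] if matches else word


def correct_text(text: str) -> str:
    """Apply the word-level correction to every whitespace-separated word."""
    return " ".join(correct_word(w) for w in text.split())


# Typical OCR confusion: the digit 0 read instead of the letter o.
print(correct_text("0ptical character recogniti0n"))
```

Words that are not similar to any lexicon entry are left alone, so names and numbers are not "corrected" into something wrong. Choosing the cutoff is a trade-off between fixing more errors and changing words that were actually right.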

How to get better results?

There are a few ways to improve the success rate of OCR without meddling with the process itself:

  1. Use fonts that can be easily recognized
  2. Give higher contrast to the image
  3. Make sure there is no watermark, image or item covering the text
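
To illustrate what "higher contrast" means in practice, here is a small Python sketch that converts colored pixels to greyscale and then stretches the grey values over the full range from 0 to 255. It works on plain lists of pixel values so that it runs without any image library. The function names and sample values are made up for this example, the greyscale weights are the common luminance weights (0.299, 0.587, 0.114), and in practice an image editor or imaging library would do this for you.

```python
from typing import List, Tuple


def to_greyscale(rgb_pixels: List[Tuple[int, int, int]]) -> List[int]:
    """Convert (r, g, b) pixels to grey values using common luminance weights."""
    return [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in rgb_pixels]


def stretch_contrast(grey: List[int]) -> List[int]:
    """Linearly rescale grey values so the darkest is 0 and the brightest is 255."""
    lo, hi = min(grey), max(grey)
    if hi == lo:
        return list(grey)  # flat image: nothing to stretch
    return [round((v - lo) * 255 / (hi - lo)) for v in grey]


# A washed-out photo: dark text is only slightly darker than the background.
washed_out = [100, 120, 150]
print(stretch_contrast(washed_out))
```

After stretching, the text pixels are pushed toward black and the background toward white, which makes it easier to tell them apart.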
