OCR and Nostalgia

OCR & A Dose Of Nostalgia

A few days ago, my mother gave me something unexpected. She had found pages and pages of typewritten texts from her own mother, my grandmother. Poems, short stories, diary entries, memories. A whole nostalgic world suddenly opened up in front of me in the form of a couple of JPEG images.

Of course, in the modern world we live in, my first instinct was to preserve this writing digitally. Luckily, I work at just the place for this!

Read on to find out how to preserve such old, personal, nostalgic texts using OCR.

Prepare The Scans

The scanned images I received of my grandmother’s writing are of fairly good quality. Some letters are hard to read, however, because the typewriter ink is faint or smudged. Some parts have been “corrected” by typing X-es over spelling errors. A few scans are also slightly crooked.

However, I did not feel the need to change anything. I extracted the texts from my scans as is.

If your scans or images are of lower quality, you could try increasing the contrast between the background and the text.

Deskewing pages that are “too crooked” can also improve the result. This can easily be done on PDF2Go, by the way. Just tick the “Deskew” option when converting to PDF.
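If you would rather see what "changing the contrast" means than just drag a slider, here is a minimal sketch of a linear contrast stretch. It is not what PDF2Go or any particular image editor does internally; the function name `stretch_contrast`, the percentile parameters, and the representation of the image as nested lists of grayscale values (0 to 255) are all assumptions made for illustration.

```python
def stretch_contrast(pixels, low_pct=2.0, high_pct=98.0):
    """Linearly rescale grayscale values so the chosen percentiles map to 0 and 255.

    `pixels` is a hypothetical representation: a list of rows, each a list of
    integers from 0 (black) to 255 (white).
    """
    flat = sorted(value for row in pixels for value in row)
    if not flat:
        return []

    # Pick the dark and bright reference points from the sorted values.
    lo = flat[int((len(flat) - 1) * low_pct / 100)]
    hi = flat[int((len(flat) - 1) * high_pct / 100)]
    if hi <= lo:
        # Flat image: nothing to stretch, return an unchanged copy.
        return [list(row) for row in pixels]

    scale = 255 / (hi - lo)
    # Values outside the reference range are clipped to 0 or 255.
    return [
        [min(255, max(0, round((value - lo) * scale))) for value in row]
        for row in pixels
    ]


if __name__ == "__main__":
    # A washed-out scan where every pixel sits between 100 and 160.
    faded = [[100, 120], [140, 160]]
    print(stretch_contrast(faded, low_pct=0, high_pct=100))
```

Faded ink on yellowed paper often uses only a narrow band of gray values, so spreading that band over the full range makes the letters darker and the background lighter. For real scans you would load the pixel values with an imaging library of your choice.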

Extract The Text

Now it’s time to get the text from your image or scanned PDF. For this, head over to PDF2Go.com.

There are a few different formats in which you can get your text. Below, I will tell you about three of them in particular.

  1. Convert to text
    This is the most basic conversion you can do. It simply takes the text from your scan and extracts it without formatting and the like. The result is a TXT document you can open in any writing program.
  2. Convert to Word
    This conversion tries to keep the formatting intact as well as possible, including images from your scan. You can choose to convert to DOC or DOCX, which is perfect for users of Microsoft Word.
  3. Convert to LibreOffice
    For this conversion, you have to choose the ODT format as your conversion output. Like the conversion to Word, it retains formatting and images as best it can, but it is designed for users of LibreOffice or OpenOffice programs.

Of course, there are more formats you could convert to, but these are the most popular and probably the most useful ones. Whichever one you pick, make sure you tick the box next to OCR to switch it on!

The Finishing Touches

Now that you have your text, there is one more thing you should do: go through it.

Luckily, reading through texts that evoke such nostalgia is more of a feast than a burden. Read through the text, replace letters that the OCR did not quite catch, correct line breaks, etc.
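If you have many pages, part of this cleanup can be automated before you do the final read-through. The following is a rough sketch under a few assumptions: that you exported plain TXT, that crossed-out mistakes survive OCR as runs of three or more "x" characters (in practice they may come out far more garbled), and that a hyphen at the end of a line always marks a split word. The function name `clean_ocr_text` is made up for this example.

```python
import re


def clean_ocr_text(text):
    """Tidy typical typewriter-scan artifacts in OCR output (heuristic sketch)."""
    # Assumption: mistakes typed over with X-es show up as "xxx" or longer.
    # This also removes genuine tokens such as the Roman numeral "XXX".
    text = re.sub(r"\b[xX]{3,}\b", "", text)

    # Rejoin words that were hyphenated across a line break.
    # Caveat: real compounds split at the hyphen lose their hyphen too.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)

    # Turn single line breaks into spaces, but keep blank lines,
    # which separate paragraphs.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)

    # Collapse the extra spaces left behind by the steps above.
    text = re.sub(r"[ \t]{2,}", " ", text)

    return "\n".join(line.strip() for line in text.split("\n")).strip()


if __name__ == "__main__":
    sample = (
        "My grand-\nmother wrote xxxxx poems\nand stories.\n\n"
        "A new para-\ngraph."
    )
    print(clean_ocr_text(sample))
```

This only takes care of the mechanical part. Misread letters still need a human eye, which is the enjoyable part anyway.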

Afterward, your nostalgic text is ready to be shared among your family!