#2 Solution TXTOCR Blitz5

alarih · February 5, 2021, 10:04pm

Tesseract.

The data consisted of images with one or two English words written on them. Tesseract generally read the text well, except when it encountered a fancy font or when a word sat at the edge of the image and was partly cut off.

Preprocessing
Binarization helped Tesseract handle the images better. It turns each image into black text on a white background:

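 import cv2
 import numpy as np

 def binarize(fname):
     # work in float so the median differences are not truncated to uint8
     img = cv2.imread(fname).astype(np.float64)
     for d in range(3):
         # distance of each pixel from the channel median (roughly the background level)
         img[:, :, d] = np.abs(img[:, :, d] - np.median(img[:, :, d]))
     bw = np.sum(img, axis=2)
     bw = bw / (np.max(bw) or 1) * 255  # scale to 0..255; "or 1" guards a flat image
     bw = (255 - bw).astype(np.uint8)  # invert: dark text on a light background
     fname = fname.replace('.png', '_bin.png')
     cv2.imwrite(fname, bw)
     return fname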

I also tried upscaling the images from 256x256 to 512x512.

Run Tesseract, using --psm 7 to treat the image as a single line of text:
 tesseract img_bin.png out --psm 7 -l eng
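
To run the same command from a script over many images, one option is to wrap it with subprocess. This is only a sketch: the helper names tesseract_cmd and run_tesseract are hypothetical, and it assumes the tesseract binary is on PATH.

```python
import subprocess
from pathlib import Path


def tesseract_cmd(image_path, out_base, psm=7, lang="eng"):
    """Build the argument list for the tesseract CLI call shown above."""
    # --psm 7 tells tesseract to treat the image as a single line of text.
    return ["tesseract", str(image_path), str(out_base), "--psm", str(psm), "-l", lang]


def run_tesseract(image_path, out_base="out"):
    """Run tesseract and return the recognized text (needs tesseract on PATH)."""
    subprocess.run(tesseract_cmd(image_path, out_base), check=True)
    # tesseract appends ".txt" to the output base name.
    return Path(f"{out_base}.txt").read_text(encoding="utf-8").strip()
```

Calling run_tesseract('img_bin.png') would then return the text that the command above writes to out.txt.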
Vocabulary check:
Lastly, check whether the prediction consists of actual words; if not, try a different type of preprocessing.
 import enchant
 usdict = enchant.Dict('en_US')
 usdict.check('word')
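
The "check, and otherwise try another preprocessing" idea could be wired up roughly as follows. This is a sketch under assumptions, not the exact code used: the function names are hypothetical, the word check is passed in as a callable so that enchant's check method can be plugged in, and falling back to the first prediction when nothing validates is my own choice.

```python
def is_real_text(text, is_word):
    """True if text is non-empty and every whitespace-separated token passes is_word."""
    tokens = text.split()
    return bool(tokens) and all(is_word(t) for t in tokens)


def pick_prediction(candidates, is_word):
    """Return the first candidate made only of real words.

    candidates may be a generator, so later OCR passes run only if needed.
    If nothing validates, fall back to the first candidate (an assumption).
    """
    first = None
    for text in candidates:
        if first is None:
            first = text
        if is_real_text(text, is_word):
            return text
    return first if first is not None else ""


# Toy stand-in for a dictionary; with pyenchant, pass enchant.Dict('en_US').check instead.
toy_vocab = {"hello", "world"}


def toy_check(word):
    return word.lower() in toy_vocab


# Predictions from two hypothetical preprocessing variants of the same image.
best = pick_prediction(["he1lo w0rld", "hello world"], toy_check)
print(best)
```

Here the first prediction fails the vocabulary check, so the second one, "hello world", is chosen.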