#2 Solution TXTOCR Blitz5

The solution is built on Tesseract.

The data consisted of images with one or two English words written on them. Tesseract generally read the text well, except when it encountered a fancy font or when the word sat at the edge of the image and was partly cut off.

  1. Preprocessing
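    Binarization helped Tesseract handle the images better. It turns each image into dark text on a white background:

     import cv2
     import numpy as np

     def binarize(fname):
         # Work in float so intermediate values are not truncated to uint8.
         img = cv2.imread(fname).astype(np.float64)
         for d in range(3):
             # Distance of each pixel from the channel median (the background colour).
             img[:, :, d] = np.abs(img[:, :, d] - np.median(img[:, :, d]))
         bw = np.sum(img, axis=2)
         bw = bw / max(np.max(bw), 1) * 255  # scale to 0..255, avoid dividing by zero
         bw = 255 - bw  # invert: dark text on a white background
         fname = fname.replace('.png', '_bin.png')
         cv2.imwrite(fname, bw.astype(np.uint8))
         return fname  # path of the binarized image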

I also tried resizing the images from 256x256 to 512x512.
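
As a rough illustration of the resize step, the sketch below doubles an image with plain NumPy by repeating every pixel (nearest-neighbour upscaling). The post does not say which method or interpolation was used, so the function name `upscale2x` and the choice of nearest-neighbour are assumptions. With OpenCV the same step would typically be a `cv2.resize` call.

```python
import numpy as np

def upscale2x(img):
    """Double height and width by repeating each pixel (nearest neighbour)."""
    # Repeat rows, then columns; any trailing channel axis is left untouched.
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

# A blank 256x256 three-channel image stands in for a real competition image.
small = np.zeros((256, 256, 3), dtype=np.uint8)
big = upscale2x(small)
print(big.shape)  # (512, 512, 3)
```

Nearest-neighbour keeps edges hard, which suits an already binarized image. A smoother interpolation may work better on the raw colour image.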

  2. Run Tesseract:

     tesseract img_bin.png out --psm 7 -l eng

  3. Vocabulary check:
    Lastly, check whether the prediction consists of actual words and, if it does not, try a different type of preprocessing.

     import enchant
     usdict = enchant.Dict('en_US')
     usdict.check('word')
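
The Tesseract run and the vocabulary check can be tied together roughly as follows. This is a minimal sketch, not the exact code used for the submission: the function names, the file name `img_512.png`, and the choice to fall back to the first prediction when nothing passes the check are all assumptions. The word check is passed in as a function, so `enchant.Dict('en_US').check` can be plugged in where pyenchant is installed.

```python
import subprocess
from pathlib import Path

def tesseract_cmd(image_path, out_base="out"):
    """Build the command line from the Tesseract step (single text line, English)."""
    return ["tesseract", str(image_path), out_base, "--psm", "7", "-l", "eng"]

def run_tesseract(image_path, out_base="out"):
    """Run the tesseract CLI and return the recognized text (needs tesseract installed)."""
    subprocess.run(tesseract_cmd(image_path, out_base), check=True)
    # The CLI writes its result to <out_base>.txt.
    return Path(out_base + ".txt").read_text(encoding="utf-8").strip()

def all_words_valid(text, is_word):
    """True if the text is non-empty and every whitespace-separated token passes is_word."""
    tokens = text.split()
    return bool(tokens) and all(is_word(t) for t in tokens)

def first_valid_prediction(image_paths, recognize, is_word):
    """Try each preprocessed variant in order; return the first prediction made of real words.

    Assumption: if no variant passes, fall back to the first variant's prediction.
    """
    fallback = None
    for path in image_paths:
        text = recognize(path)
        if fallback is None:
            fallback = text
        if all_words_valid(text, is_word):
            return text
    return fallback

# Demo with stand-ins so the sketch runs without tesseract or pyenchant.
fake_ocr = {"img_bin.png": "he1lo wor1d", "img_512.png": "hello world"}
vocabulary = {"hello", "world"}
result = first_valid_prediction(
    ["img_bin.png", "img_512.png"],
    recognize=fake_ocr.get,
    is_word=lambda w: w.lower() in vocabulary,
)
print(result)  # hello world
```

With the real tools, the call would look like `first_valid_prediction(paths, run_tesseract, enchant.Dict('en_US').check)`, where `paths` lists the differently preprocessed versions of one image.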
