Python-TensorFlow Number Recognition

A neural network trained to detect numbers in a consistent font within a large dataset of complex images.

Related Tags

Python, TensorFlow, Computer Vision, OpenCV

Network Training Data

Class    # of Training Images
Zero     1368
One      588
Two      144
Three    109
Four     105
Five     184
Six      119
Seven    324
Eight    558
Nine     493
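
The class counts above are uneven, so it can help to see each class's share of the whole set. The following is a minimal sketch that totals the table and prints the per-class share; only the counts come from the table, and the variable names are illustrative.

```python
# Per-class training image counts, copied from the table above.
counts = {
    "Zero": 1368, "One": 588, "Two": 144, "Three": 109, "Four": 105,
    "Five": 184, "Six": 119, "Seven": 324, "Eight": 558, "Nine": 493,
}

total = sum(counts.values())
shares = {name: n / total for name, n in counts.items()}

# Print one row per class: name, count, share of the full set.
for name, n in counts.items():
    print(f"{name:<6}{n:>6}{shares[name]:>8.1%}")
print(f"Total {total:>6}")
```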

Network Structure

Layer #  Specific Type       Settings
0        Random Flip
1        Random Rotation
2        Random Zoom
3        Rescaling
4        2D Convolutional    16 filters, kernel size 3; ReLU activation
5        2D Max Pooling
6        2D Convolutional    32 filters, kernel size 3; ReLU activation
7        2D Max Pooling
8        2D Convolutional    64 filters, kernel size 3; ReLU activation
9        2D Max Pooling
10       Dropout             Rate of 0.2
11       (type not listed)   128 units; ReLU activation
12       (type not listed)   10 units; ReLU activation
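
The layer list above maps closely onto a Keras Sequential model. The following is a rough sketch, not the project's actual code. Several details are assumptions because the table does not give them: the 32x32 grayscale input size, the flip mode, the rotation and zoom factors, the 1/255 rescaling factor, the "same" padding, treating layers 11 and 12 as dense layers (inferred from their "units" setting), and the Flatten layer placed before them. The ReLU activation on the final layer mirrors the table, although a linear or softmax output is more common for classification.

```python
import tensorflow as tf
from tensorflow.keras import layers

IMG_HEIGHT, IMG_WIDTH = 32, 32  # assumed input size, not given in the source
NUM_CLASSES = 10                # digits zero through nine


def build_model():
    """Build a model following the layer table (see assumptions above)."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(IMG_HEIGHT, IMG_WIDTH, 1)),  # grayscale, assumed
        layers.RandomFlip("horizontal"),   # layer 0, mode assumed
        layers.RandomRotation(0.1),        # layer 1, factor assumed
        layers.RandomZoom(0.1),            # layer 2, factor assumed
        layers.Rescaling(1.0 / 255),       # layer 3, scale assumed
        layers.Conv2D(16, 3, padding="same", activation="relu"),  # layer 4
        layers.MaxPooling2D(),                                    # layer 5
        layers.Conv2D(32, 3, padding="same", activation="relu"),  # layer 6
        layers.MaxPooling2D(),                                    # layer 7
        layers.Conv2D(64, 3, padding="same", activation="relu"),  # layer 8
        layers.MaxPooling2D(),                                    # layer 9
        layers.Dropout(0.2),                                      # layer 10
        layers.Flatten(),  # not in the table; assumed so dense layers get a vector
        layers.Dense(128, activation="relu"),          # layer 11, type assumed
        layers.Dense(NUM_CLASSES, activation="relu"),  # layer 12, type assumed
    ])
```

Calling `build_model()` on a batch shaped `(N, 32, 32, 1)` yields one row of 10 scores per image. The random augmentation layers are only active during training.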

Training Stages

Initial Training:
Initially, the algorithm was applied to separate all numbers within each image (as determined by the strongest contours found).

Self-supported Training:
With a functional but still inaccurate network running, every image in the set was classified.
The results were then manually filtered, producing a much larger set of accurately labeled data.
The network was then retrained on this larger dataset, which significantly improved its accuracy.
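
One way to make the manual filtering step manageable is to copy every image into a folder named after its predicted class, so that misclassified images can be spotted and moved by hand. The sketch below illustrates that idea and is not the project's actual tooling. The `predict` callable, the function name, and the folder layout are all hypothetical.

```python
from pathlib import Path
import shutil


def sort_by_prediction(image_paths, predict, out_dir):
    """Copy each image into out_dir/<predicted label>/ for manual review.

    `predict` is a hypothetical callable that takes a Path and returns
    the predicted digit (0-9) from the first, rough network.
    """
    out_dir = Path(out_dir)
    counts = {}
    for path in map(Path, image_paths):
        label = predict(path)
        target = out_dir / str(label)
        target.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target / path.name)
        counts[label] = counts.get(label, 0) + 1
    return counts
```

After a reviewer has corrected the folders, the folder names can serve as labels for the next training run.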

Algorithm

  1. The user provides a file name (in the form of a number).
  2. The image is opened and preprocessed using a combination of Gaussian blur, Canny edge detection, and external edge detection.
  3. The four largest contours are taken (most likely the numbers in the image).
  4. Each contour region is blurred, converted to grayscale, and resized to the input size the network requires.
  5. The neural network is run on each number, and the predicted values are concatenated and returned.

Limitations

Correctly identified all 4 numbers: 1294/1324 (97.734%)
Correctly identified all 4 non-corrupted numbers: 1300/1324 (98.187%)
Given the accuracy of the neural network, the remaining errors appear to come from images where Canny edge detection produced incorrect edges. Different threshold values or a different blurring method may improve accuracy.
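
As one possible way to explore different threshold values, the Canny thresholds can be derived from each image's median intensity instead of being fixed. This is a common rule of thumb and not something the project is known to use. The function name and the `sigma` default are illustrative assumptions.

```python
import numpy as np


def median_canny_thresholds(gray, sigma=0.33):
    """Derive (lower, upper) Canny thresholds from the median pixel value."""
    median = float(np.median(gray))
    lower = int(round(max(0.0, (1.0 - sigma) * median)))
    upper = int(round(min(255.0, (1.0 + sigma) * median)))
    return lower, upper


# Usage with OpenCV (not imported here), on an already blurred grayscale image:
#   lower, upper = median_canny_thresholds(blurred)
#   edges = cv2.Canny(blurred, lower, upper)
```

Whether this helps would need to be checked against the same 1324 test images.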