This application takes an input photo of the front of a wine bottle, recognizes the label using computer vision techniques, flattens the label as if it were peeled off, and proceeds to read the label using Tesseract's optical character recognition (OCR) library.
Prediction of the label location using a U-Net-type neural network trained on a small dataset of examples.
Binarization of the pixels (black or white), rotation to maximize the number of all-black columns, and an attempt to locate all 6 key points of the label cylinder via a custom iterative algorithm.
Application of the cylinder points to the source photo to estimate the geometry of the label.
Geometric interpolation to flatten the label and enable optical character recognition (OCR).
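
The flattening step can be illustrated with a minimal sketch. It is not the application's actual code. It assumes an upright bottle, an orthographic view, and a label whose left and right edges sit on the bottle's silhouette, so that the top and bottom label edges project to half-ellipses. The names `LabelPoints`, `flat_to_source`, and `unwrap` are hypothetical, as is the choice of the six points (left, center, and right of the top and bottom edges). Nearest-neighbour sampling stands in for whatever interpolation the application really uses.

```python
import math
from typing import List, NamedTuple, Tuple

Point = Tuple[float, float]  # (x, y) in source-photo pixels


class LabelPoints(NamedTuple):
    """Six hypothetical key points on the top and bottom label edges."""
    top_left: Point
    top_center: Point
    top_right: Point
    bottom_left: Point
    bottom_center: Point
    bottom_right: Point


def _curve_point(left: Point, center: Point, right: Point, theta: float) -> Point:
    """Point on a half-ellipse through left, center, right at cylinder angle theta."""
    cx = (left[0] + right[0]) / 2        # horizontal center of the cylinder
    radius = (right[0] - left[0]) / 2    # projected cylinder radius
    y_edge = (left[1] + right[1]) / 2    # height of the curve at the silhouette
    sag = center[1] - y_edge             # how far the curve bulges at the center
    return cx + radius * math.sin(theta), y_edge + sag * math.cos(theta)


def flat_to_source(u: float, v: float, pts: LabelPoints) -> Point:
    """Map flat-label coordinates (u, v in [0, 1]) to source-photo coordinates."""
    # Equal steps in u are equal arc lengths on the cylinder, i.e. equal angles.
    theta = (u - 0.5) * math.pi
    tx, ty = _curve_point(pts.top_left, pts.top_center, pts.top_right, theta)
    bx, by = _curve_point(pts.bottom_left, pts.bottom_center, pts.bottom_right, theta)
    # Linear interpolation between the top and bottom curves.
    return tx + v * (bx - tx), ty + v * (by - ty)


def unwrap(image: List[list], pts: LabelPoints, width: int, height: int) -> List[list]:
    """Build a flat width x height label by nearest-neighbour sampling of image."""
    rows, cols = len(image), len(image[0])
    flat = []
    for j in range(height):
        row = []
        for i in range(width):
            sx, sy = flat_to_source((i + 0.5) / width, (j + 0.5) / height, pts)
            x = min(max(round(sx), 0), cols - 1)  # clamp to the photo bounds
            y = min(max(round(sy), 0), rows - 1)
            row.append(image[y][x])
        flat.append(row)
    return flat
```

By construction, `flat_to_source(0.5, 0.0, pts)` returns `pts.top_center`, and pixels near the label's left and right edges are stretched more than those in the middle, which is what undoes the foreshortening on the curved surface. A real photo has perspective and a label that rarely covers exactly half the bottle, so this mapping is only a first approximation.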