
David Enright
Copper Contributor
May 22, 2017

Computer Vision API - OCR bounding boxes

I'm building an API for a customer that leverages computer vision to analyse images. I am trying to get it to analyse handwriting on the whiteboard. When I upload my test image to my API, the JS...
  • David Enright
    May 25, 2017

    I worked it out.

    The API returns each line's bounding box as eight numbers, i.e. four X,Y corner pairs (X,Y,X,Y,X,Y,X,Y), but it sorts the lines by the first X coordinate, not the first Y coordinate.

    So for example:

    Line 1: 179, 73, 767, 60, 770, 145, 181, 158
    Line 2: 214, 257, 1328, 219, 1331, 306, 217, 344
    Line 3: 185, 345, 1298, 350, 1297, 444, 184, 438
    Line 9: 29, 1099, 1396, 1162, 1391, 1281, 24, 1218

    However, the Vision API returns Line 9 first, because it sorts by the first X coordinate (29 is the smallest of the four). Since handwriting is read from top to bottom, it should sort by the first Y coordinate instead.
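
Until the ordering changes, the lines can be re-sorted on the client side. Here is a minimal Python sketch, assuming each bounding box arrives as a flat list of eight numbers like the examples above. The names `lines`, `top_y` and `sort_lines_top_to_bottom` are made up for illustration and are not part of the Vision API. It uses the smallest Y of the four corners rather than strictly the first Y, which should cope a little better with slanted handwriting; for these four boxes both give the same order.

```python
# Bounding boxes from the examples above. Line 9 is placed first to mimic
# the order described in the post; the order of the rest is arbitrary.
# The labels are illustrative only, not fields returned by the API.
lines = [
    ("Line 9", [29, 1099, 1396, 1162, 1391, 1281, 24, 1218]),
    ("Line 1", [179, 73, 767, 60, 770, 145, 181, 158]),
    ("Line 2", [214, 257, 1328, 219, 1331, 306, 217, 344]),
    ("Line 3", [185, 345, 1298, 350, 1297, 444, 184, 438]),
]


def top_y(box):
    """Return the smallest Y of the four corners (Y values sit at odd indices)."""
    return min(box[1::2])


def sort_lines_top_to_bottom(lines):
    """Return the (label, box) pairs ordered by how high they sit in the image."""
    return sorted(lines, key=lambda item: top_y(item[1]))


ordered = sort_lines_top_to_bottom(lines)
print([label for label, _ in ordered])
```

To sort by the first Y exactly as described above, replace `top_y(item[1])` with `item[1][1]`.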

    Is there anywhere I can leave feedback for Microsoft to look at this?
