Copper Contributor

May 21, 2017

Solved

Computer Vision API - OCR bounding boxes

I'm building an API for a customer that leverages computer vision to analyse images. I am trying to get it to analyse handwriting on a whiteboard. When I upload my test image to my API, the JS...

azure

Cortana Intelligence

David Enright
May 24, 2017
I worked it out.

The API gives back each line's coordinates as four X,Y pairs (X,Y,X,Y,X,Y,X,Y), but it sorts the lines based on the first X coordinate, not the first Y coordinate.

So for example:

Line 1: 179, 73, 767, 60, 770, 145, 181, 158
Line 2: 214, 257, 1328, 219, 1331, 306, 217, 344
Line 3: 185, 345, 1298, 350, 1297, 444, 184, 438
Line 9: 29, 1099, 1396, 1162, 1391, 1281, 24, 1218

The Vision API, however, is returning line 9 first, because it's sorting by the first X coordinate. In reality we read from top to bottom (Y, not X), so it should be sorting by the first Y.
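Until the ordering changes on the service side, the lines can be re-sorted on the client. Here is a minimal sketch in Python using the four bounding boxes from the example above. It assumes the parsed response has already been reduced to a list of dicts, each with a `boundingBox` list of eight numbers; that shape and the helper name `sort_lines_top_to_bottom` are assumptions for illustration, not something the API guarantees.

```python
def sort_lines_top_to_bottom(lines):
    """Order lines by the Y of their first point, breaking ties by its X.

    Assumes each line is a dict whose "boundingBox" is a flat list
    [x1, y1, x2, y2, x3, y3, x4, y4] (hypothetical, simplified shape).
    """
    return sorted(
        lines,
        key=lambda line: (line["boundingBox"][1], line["boundingBox"][0]),
    )


# The example lines, in the order the API returned them (line 9 first).
lines = [
    {"boundingBox": [29, 1099, 1396, 1162, 1391, 1281, 24, 1218]},
    {"boundingBox": [179, 73, 767, 60, 770, 145, 181, 158]},
    {"boundingBox": [185, 345, 1298, 350, 1297, 444, 184, 438]},
    {"boundingBox": [214, 257, 1328, 219, 1331, 306, 217, 344]},
]

ordered = sort_lines_top_to_bottom(lines)
first_ys = [line["boundingBox"][1] for line in ordered]
print(first_ys)  # [73, 257, 345, 1099]
```

Sorting on the first Y alone can misorder lines that sit side by side or are heavily slanted, so treat this as a starting point for simple top-to-bottom layouts like a whiteboard.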

Is there anywhere I can leave feedback for Microsoft to look at this?

Jake Dan Attis
Copper Contributor
Jul 10, 2017
You could start by adding it to UserVoice! https://cognitive.uservoice.com/forums/430309-computer-vision
- dakesh
  Copper Contributor
  Jun 06, 2019
  Jake Dan Attis
  I am using https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/read/core/asyncBatchAnalyze. Does boundingBox give { X top left, Y top left, X top right, Y top right, X bottom right, Y bottom right, X bottom left, Y bottom left } in the response? I need to find x, y, height and width. Please suggest.
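One way to get x, y, width and height from an eight-number box, sketched in Python. It assumes only that the values alternate X,Y for the four corners, as in the examples earlier in the thread. Because it takes the minimum and maximum of each axis, it does not depend on which corner comes first. The function name `quad_to_rect` is made up for this example.

```python
def quad_to_rect(bounding_box):
    """Convert [x1, y1, x2, y2, x3, y3, x4, y4] to (x, y, width, height).

    The result is the axis-aligned rectangle that encloses all four
    corners, so a slanted line of text gets a slightly larger box.
    """
    xs = bounding_box[0::2]  # every even index is an X value
    ys = bounding_box[1::2]  # every odd index is a Y value
    x, y = min(xs), min(ys)
    return x, y, max(xs) - x, max(ys) - y


# "Line 1" from the example earlier in the thread.
rect = quad_to_rect([179, 73, 767, 60, 770, 145, 181, 158])
print(rect)  # (179, 60, 591, 98)
```

Note that y comes out as 60 rather than 73, because the top-right corner of that line sits higher than the top-left one.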