Vision API: deskew

Hello,

In the Vision API response there's an angle field that's supposed to give us the skew of the image (I need to run OCR on some documents). How should I interpret that angle (it seems to be in percent?!), and are there any tools that would use the angle to de-skew the image?

Thanks!

Roxana

3 Replies

@RoxanaTFA Supposing you're using the Computer Vision API 3.1 (https://westcentralus.dev.cognitive.microsoft.com/docs/services/computer-vision-v3-1-ga/operations/5...), the textAngle field is in radians, not percent. You can use it together with the orientation field either to re-orient the image so that it lines up with the overlaid recognition results, or to draw the overlaid text at the angle of the text in the image.
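To make the radians point concrete, here is a minimal sketch in plain Python that converts textAngle to degrees and rotates an overlay point around the image center. The sample response values, the image size, and the helper names `text_angle_degrees` and `rotate_point` are made up for illustration; they are not part of the API.

```python
import math


def text_angle_degrees(ocr_response: dict) -> float:
    """Return the response's textAngle converted from radians to degrees."""
    return math.degrees(ocr_response["textAngle"])


def rotate_point(x, y, angle_rad, cx, cy):
    """Rotate (x, y) by angle_rad about (cx, cy).

    In image coordinates (y grows downward) a positive angle turns the
    point clockwise as seen on screen.
    """
    dx, dy = x - cx, y - cy
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    return (cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)


# Made-up sample response; only the fields used here are shown.
response = {"textAngle": 0.05, "orientation": "Up"}
print(f"skew: {text_angle_degrees(response):.2f} degrees")

# Rotate one corner of a recognized box about the center of a
# hypothetical 1000x800 image so the overlay follows the text angle.
print(rotate_point(120, 340, response["textAngle"], 500, 400))
```

Whether you need the angle or its negative depends on whether you rotate the overlay to match the image or the other way around, so it is worth checking against one known document first.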

You can use a photo editor, like Photoshop or Paint 3D (guide here: https://www.laptopmag.com/articles/rotate-resize-paint-3d), to rotate the image before passing it to the Vision API.

Let me know if you have any follow-ups!

Thank you for your answer @Shunderpooch! We are looking for an automated solution that would read the angle, deskew the image, and send it for preprocessing again. Since you already know the angle, are there any plans to deskew the image automatically in the future (as part of the Computer Vision API call), or to release any tools that would do that?

Here is a repository that uses OpenCV to deskew and scale images. It is mostly geared toward form documents, but it could be adapted to general images as well:
https://github.com/jakeatmsft/Form_OCR_ACV_API