The Computer Vision service offers two APIs that you can use to read text.

  • The OCR API:
    • Use this API to read small to medium volumes of text from images.
    • The API can read text in multiple languages.
    • Results are returned immediately from a single function call.
  • The Read API:
    • Use this API to read small to large volumes of text from images and PDF documents.
    • This API uses a newer model than the OCR API, resulting in greater accuracy.
    • The Read API can read printed text in multiple languages, and handwritten text in English.
    • The initial function call returns an asynchronous operation ID, which must be used in a subsequent call to retrieve the results.
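To make the last point concrete: when the Read API is called over REST, the initial response carries an Operation-Location header whose final path segment is the operation ID. The sketch below is written in Python with the standard library only and extracts that ID; the endpoint and ID in the example are placeholders, not real values.

```python
from urllib.parse import urlparse


def operation_id_from_location(operation_location: str) -> str:
    """Return the last path segment of an Operation-Location URL."""
    path = urlparse(operation_location).path
    return path.rstrip("/").split("/")[-1]


# Placeholder header value for illustration only.
example_location = (
    "https://example.cognitiveservices.azure.com"
    "/vision/v3.2/read/analyzeResults/00000000-0000-0000-0000-000000000000"
)
operation_id = operation_id_from_location(example_location)
print(operation_id)
```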

You can access both APIs via the REST API or a client library. In the next few units, we’ll show you how to call the REST API and return a JSON response. Then for the exercise, you’ll use a client library to return objects that abstract the JSON response.

To use the OCR API, call the OCR REST function (or the equivalent SDK method), passing the image URL or binary image data. Specify the language of the text to be detected (the default value is en for English), and optionally set the detectOrientation parameter to return information about the orientation of the text in the image. Here is a result example:
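For illustration, an OCR response is a JSON document that nests regions, lines, and words. The Python sketch below uses a hand-written sample in that shape (all values are placeholders, not real service output) and a hypothetical helper, ocr_lines, that joins the words of each line.

```python
# Hand-written sample in the shape of an OCR response; values are placeholders.
sample_ocr_result = {
    "language": "en",
    "textAngle": 0.0,
    "orientation": "Up",
    "regions": [
        {
            "boundingBox": "20,26,300,80",
            "lines": [
                {
                    "boundingBox": "20,26,300,30",
                    "words": [
                        {"boundingBox": "20,26,120,30", "text": "HELLO"},
                        {"boundingBox": "150,26,170,30", "text": "WORLD"},
                    ],
                },
                {
                    "boundingBox": "20,70,200,30",
                    "words": [{"boundingBox": "20,70,200,30", "text": "Welcome"}],
                },
            ],
        }
    ],
}


def ocr_lines(result: dict) -> list[str]:
    """Collect the text of every line by joining its words with spaces."""
    lines = []
    for region in result.get("regions", []):
        for line in region.get("lines", []):
            lines.append(" ".join(word["text"] for word in line.get("words", [])))
    return lines


for text in ocr_lines(sample_ocr_result):
    print(text)
```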

Create a new Cognitive Services Resource

Clone the AI-102-AIEngineer repo in Visual Studio:

https://github.com/MicrosoftLearning/AI-102-AIEngineer

Update the endpoint and service key.
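If you prefer to keep the endpoint and key out of source files, one option is to read them from environment variables. This is a minimal Python sketch; the variable names COG_SERVICE_ENDPOINT and COG_SERVICE_KEY and the helper load_settings are assumptions for illustration, not names taken from this walkthrough.

```python
import os
from typing import Mapping


def load_settings(env: Mapping[str, str] = os.environ) -> tuple[str, str]:
    """Read the endpoint and key from a mapping (the process environment by default)."""
    # The variable names below are assumed for this sketch.
    endpoint = env.get("COG_SERVICE_ENDPOINT", "").strip()
    key = env.get("COG_SERVICE_KEY", "").strip()
    if not endpoint or not key:
        raise ValueError("Set COG_SERVICE_ENDPOINT and COG_SERVICE_KEY first")
    # Drop a trailing slash so request paths can be appended safely.
    return endpoint.rstrip("/"), key


# Usage with an explicit mapping, so the example runs without real credentials.
endpoint, key = load_settings(
    {"COG_SERVICE_ENDPOINT": "https://example.cognitiveservices.azure.com/",
     "COG_SERVICE_KEY": "<your-key>"}
)
print(endpoint)
```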

Import the missing namespaces.

Using the OCR API

Inside the Main function, create a Computer Vision client object. Note the call to the getTextOcr() function at the end.

Here is the function called from Main:
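For comparison with the client-library call, the same operation over REST is a single POST to the OCR endpoint. The Python sketch below only builds the request and does not send it; the helper name build_ocr_request, the v3.2 path, and the example URLs are assumptions for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_ocr_request(endpoint: str, key: str, image_url: str,
                      language: str = "en",
                      detect_orientation: bool = True) -> Request:
    """Build (but do not send) a POST request for the OCR REST function."""
    query = urlencode({
        "language": language,
        "detectOrientation": str(detect_orientation).lower(),
    })
    # Assumes API version v3.2; adjust the path for other versions.
    url = f"{endpoint.rstrip('/')}/vision/v3.2/ocr?{query}"
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    return Request(url, data=body, headers=headers, method="POST")


# Placeholder endpoint, key, and image URL.
request = build_ocr_request(
    "https://example.cognitiveservices.azure.com",
    "<your-key>",
    "https://example.com/sign.jpg",
)
print(request.get_method(), request.full_url)
```

Sending the request with urllib.request.urlopen(request) would return the JSON result immediately, in line with the single-call behaviour of the OCR API.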

The picture we will test it on:

Here is the output of the application:

And the output for the image:

Using the Read API

Here is the code for the GetTextRead() method:
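The heart of any Read API call is a polling loop: submit the document, then check the operation until it reports succeeded or failed. The Python sketch below shows only that loop. The helper wait_for_read_result and its parameters are hypothetical, and the status check is passed in as a function so the example runs without a network connection.

```python
import time
from typing import Callable


def wait_for_read_result(get_status: Callable[[], dict],
                         interval: float = 1.0,
                         max_attempts: int = 30,
                         sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll get_status() until the Read operation succeeds, fails, or times out."""
    for _ in range(max_attempts):
        result = get_status()
        status = result.get("status", "").lower()
        if status == "succeeded":
            return result
        if status == "failed":
            raise RuntimeError("Read operation failed")
        # Still notStarted or running: wait before asking again.
        sleep(interval)
    raise TimeoutError("Read operation did not finish in time")


# Simulated sequence of status responses instead of real HTTP calls.
responses = iter([
    {"status": "notStarted"},
    {"status": "running"},
    {"status": "succeeded", "analyzeResult": {"readResults": []}},
])
final = wait_for_read_result(lambda: next(responses), sleep=lambda seconds: None)
print(final["status"])
```

In a real call, get_status would issue a GET against the URL from the Operation-Location header and return the parsed JSON body.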

As an example, we will read a PDF travel document for Rome, Italy.
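For a multi-page PDF, the Read result lists one entry per page, each with its own lines. The Python sketch below groups line text by page number. The sample is hand-written in the shape of a Read response with placeholder text (not the content of the travel document), and text_by_page is a hypothetical helper.

```python
# Hand-written sample in the shape of a Read result; text is placeholder only.
sample_read_result = {
    "status": "succeeded",
    "analyzeResult": {
        "readResults": [
            {"page": 1, "lines": [{"text": "First page, line one"},
                                  {"text": "First page, line two"}]},
            {"page": 2, "lines": [{"text": "Second page, line one"}]},
        ]
    },
}


def text_by_page(result: dict) -> dict[int, list[str]]:
    """Map each page number to the list of text lines found on that page."""
    pages = {}
    for page in result.get("analyzeResult", {}).get("readResults", []):
        pages[page["page"]] = [line["text"] for line in page.get("lines", [])]
    return pages


for number, lines in text_by_page(sample_read_result).items():
    print(f"Page {number}: {len(lines)} line(s)")
```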

Read handwritten text

We will read handwritten text from an image:
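In version 3.2 of the Read API, each line can carry an appearance entry whose style name distinguishes handwriting from other text. Assuming that field is present in the response, the Python sketch below filters for handwritten lines; the sample data, the helper handwritten_lines, and the 0.5 confidence threshold are illustrative choices, not values from the service.

```python
# Hand-written sample; text and confidence values are placeholders.
sample_result = {
    "status": "succeeded",
    "analyzeResult": {
        "readResults": [
            {
                "page": 1,
                "lines": [
                    {"text": "Shopping list",
                     "appearance": {"style": {"name": "handwriting", "confidence": 0.9}}},
                    {"text": "Printed footer",
                     "appearance": {"style": {"name": "other", "confidence": 0.8}}},
                ],
            }
        ]
    },
}


def handwritten_lines(result: dict, min_confidence: float = 0.5) -> list[str]:
    """Return the text of lines whose style is reported as handwriting."""
    found = []
    for page in result.get("analyzeResult", {}).get("readResults", []):
        for line in page.get("lines", []):
            # Lines without an appearance entry are treated as not handwritten.
            style = line.get("appearance", {}).get("style", {})
            if (style.get("name") == "handwriting"
                    and style.get("confidence", 0.0) >= min_confidence):
                found.append(line["text"])
    return found


print(handwritten_lines(sample_result))
```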

You can practice this yourself here: https://docs.microsoft.com/en-us/learn/modules/read-text-images-documents-with-computer-vision-service/