For Optical Character Recognition (OCR) in Windows Store apps, we can use the Bing OCR control. To run and deploy programs that use the Bing OCR control, you need a device with a built-in rear-facing camera (for example, a tablet running Windows 8/8.1). The control searches for a built-in rear-facing camera when the application starts, so the application cannot run in the emulator or on a local machine (such as a laptop) that lacks one. Let's look at the procedure for using the Bing OCR control to build OCR-enabled Windows Store apps.
- Create a new Windows Store App project.
- Download and install the Bing Optical Character Recognition (OCR) control from http://visualstudiogallery.msdn.microsoft.com/5434265c-683c-4bb9-adc2-2710f89c264a.
- Register your application in the Microsoft Azure Marketplace: https://datamarket.azure.com/developer/applications/
In the registration form, fill in all the fields. The Client ID and Client Secret are important, confidential values. You will use them in your Bing OCR applications, so note them down somewhere safe.
The Client Secret is generated automatically during registration. Once you register your application, its details, including the Client ID and Name, appear in the Registered Applications list: https://datamarket.azure.com/developer/applications/
- Add references to your OCR project as in the following:
In Solution Explorer, right-click References under the project and select Add Reference. This opens the project's Reference Manager.
Add both the “Bing Optical Character Recognition (OCR) Control” and the “Microsoft Visual C++ Runtime Package for Windows” references to the project.
- Check for added references and change the target platform to point to your device's CPU Architecture.
Once you have added the references, check that they appear under References. They are listed, but each one is flagged with an exclamation mark. If you build the application now, the build fails with errors and warnings. To fix this, change the target platform from “Any CPU” to the CPU architecture of the Windows device on which you plan to run the Metro style application. In my case, the CPU architecture is x64.
- Add a Bing OCR control to the toolbox.
In the toolbox, right-click the General tab and click Choose Items. This opens the Choose Toolbox Items window.
In the Choose Toolbox Items window, type OcrControl in the filter box and check “OcrControl” so that the control is added to your toolbox.
After adding it, confirm that OcrControl is listed in the toolbox.
- Drag and drop the OcrControl on the design view.
- Provide a name to the control.
Select the OcrControl and press F4 to open its properties, then give it a meaningful name. I have named it “ocr”.
- Enable the webcam capability for the project.
a) Open the “package.appxmanifest” file from Solution Explorer. The package information window opens on the left side.
b) Click on the Capabilities tab and check the Webcam option under Capabilities.
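Checking the Webcam option writes an entry into the manifest XML. As a rough sketch, assuming a standard Windows 8.1 project, the relevant fragment of package.appxmanifest should look something like the following. The `internetClient` entry is normally present by default in new projects; it is shown here on the assumption that the app needs network access, since the captured image is sent to the Bing OCR service.

```xml
<!-- Fragment of package.appxmanifest (sketch; other elements omitted) -->
<Capabilities>
  <!-- Usually enabled by default; assumed to be required to reach the OCR service -->
  <Capability Name="internetClient" />
  <!-- Added when the Webcam box is checked on the Capabilities tab -->
  <DeviceCapability Name="webcam" />
</Capabilities>
```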
- Now include the namespace in the code-behind file and start working on the OCR.
Add the following statement to the using directives at the top of your code-behind file:
using Bing.Ocr;
- Use the Client ID and Client Secret in the application.
Go to https://datamarket.azure.com/developer/applications/ and log in with your credentials. You will see your registered applications listed there.
Copy the Client ID and Client Secret of your registered application and set them on the OCR control in the code-behind file.
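As a minimal sketch of this step, the credentials can be assigned in the page constructor. This assumes the control exposes `ClientId` and `ClientSecret` properties, as in the MSDN documentation for the control; the page class name `MainPage` and the placeholder strings are illustrative only.

```csharp
using Bing.Ocr;
using Windows.UI.Xaml.Controls;

public sealed partial class MainPage : Page
{
    public MainPage()
    {
        this.InitializeComponent();

        // "ocr" is the name given to the OcrControl in the designer.
        // Placeholders: replace with the values from your registered application.
        // Property names are assumed from the control's documentation.
        ocr.ClientId = "<YOUR CLIENT ID>";
        ocr.ClientSecret = "<YOUR CLIENT SECRET>";
    }
}
```

Because both values are confidential, avoid committing real credentials to source control.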
- Extracting text from OCR results
Read the MSDN article on extracting text from an OCR result: http://msdn.microsoft.com/en-us/library/dn261763.aspx. Add a button to the page; when the user clicks it, the OCR control starts capturing from the camera. Use the StartPreviewAsync() method to start the default camera in preview mode.
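A minimal sketch of the two handlers involved might look like this. It assumes the result object exposes `Lines`, each with `Words` that carry a `Value` string, as described in the MSDN article; `resultText` is a hypothetical TextBlock added to the page to show the output, and `MainPage` is an illustrative class name.

```csharp
using System.Text;
using Bing.Ocr;
using Windows.UI.Xaml;
using Windows.UI.Xaml.Controls;

public sealed partial class MainPage : Page
{
    public MainPage()
    {
        this.InitializeComponent();

        // Subscribe to the event raised when the OCR service responds.
        ocr.Completed += ocr_Completed;
    }

    // Called when the user clicks the button: starts the camera preview.
    private async void Button_Click(object sender, RoutedEventArgs e)
    {
        await ocr.StartPreviewAsync();
    }

    // Called when the OCR service returns its result.
    private void ocr_Completed(object sender, OcrCompletedEventArgs e)
    {
        var sb = new StringBuilder();

        // Assumed result shape: lines made of words, each with a Value.
        foreach (var line in e.Result.Lines)
        {
            foreach (var word in line.Words)
            {
                sb.Append(word.Value).Append(' ');
            }
            sb.AppendLine();
        }

        // "resultText" is a hypothetical TextBlock on the page.
        resultText.Text = sb.ToString();
    }
}
```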
Button_Click is the event handler called when the button in the application is clicked. Remember that “ocr” is the name given to the OcrControl added on the design view. As soon as the OcrControl receives the response from the OCR service, it raises the Completed event, which invokes the event handler ocr_Completed.
- To display the extracted text after the OCR process, add a TextBlock or any other suitable control; alternatively, use the extracted text in whatever way the application requires.
- When running the app on a Windows 8/8.1 tablet or any other device, tap the OcrControl to capture text from a textbook or any other source. The OcrControl has its own camera capture capability, so the live camera frames are displayed within the OcrControl area. As soon as you tap the OcrControl, the image of the text is captured and sent to the Bing server for Optical Character Recognition.
- Remember that Bing OCR works well for printed text but does not work properly for handwritten text or blurry images. The OCR service runs on the server side: the picture taken on the client is sent through the app to the Bing OCR service, the service analyses the picture for recognizable text, and the recognized text is sent back to the client device.