
Optical Character Recognition in C# in Universal Windows Applications – Part #2, using Windows.Media.Ocr

This is the second part in my series on Optical Character Recognition using C#. Last time I looked at the Apache 2 licensed package Tesseract, where I tested its recognition ability against a sample image, and wrote some sample code showing how to use it.

This time I want to test the abilities of the Windows.Media.Ocr library. This one is a bit different from a normal C# library, as it is only usable in Windows Store applications or Universal Windows Platform (UWP) applications.

I’m not going to present code samples in this post – most of the code would be about how to create a UWP application, with probably only a couple of lines dedicated to the actual OCR library. There’s an excellent blog post by Jelena Mojasevic here, which gives some sample code.

Getting Started with testing a Windows.Media.Ocr app in Visual Studio 2015

Microsoft provide a huge amount of starter information and samples for UWP – these are freely available from their GitHub page. It’s pretty easy to test these applications – I needed a Windows Phone so I could deploy the sample applications, but that’s only because I’m developing on a machine that is a bit old and doesn’t support Hyper-V, so I can’t run the phone emulator. The image below shows the error I get when my Windows Phone device isn’t attached.

install windows phone tools

You can get this code using your favourite tool (e.g. TortoiseGit), or download the zip and extract it. The code I found useful was in the OCR sample directory. This solution might compile and run on your machine first time, but if it doesn’t, there are two things that might be useful to check:

1. Make sure the UWP tools are installed.

I didn’t include all the UWP tools when I installed VS2015, but even if I hadn’t remembered this, it’s pretty easy to check whether they are installed. Select File -> New Project -> Visual C# -> Windows -> Universal. Since they weren’t installed on my machine, I saw a screen like the one below, which invited me to install the Universal Windows Tools:

install windows tools

I just selected this option, and my Visual Studio installer opened and guided me through the process of downloading and installing the necessary components. This took a long time so prepare to be patient!

2. Developer mode is required for running and debugging Windows Store apps

This is pretty easy to solve – if your machine isn’t set up for debugging apps, you’ll see a message like the one below:

install windows tools 3

Just follow the instructions – go to “Settings”, “Update & Security”, and “For developers”, and choose to put your computer into Developer mode (Note – do this at your own risk, this is obviously something you should only do if you’re comfortable with it!)

install windows tools 6

If you change to Developer mode, you’ll get a warning like this anyway:

install windows tools 5

Testing how the application recognises text from our sample image

I used the same image as previously, and copied it to my Windows Phone. I was then able to run the OCR application through Visual Studio, which made it open on my Windows Phone. Using the app, I browsed to the location where I had saved the file, and triggered the app’s text recognition function. The picture below shows how the app interpreted the text from the source image:

wp_ss_20160317_0003

My review comments are:

  1. The text at the top seems to be close to gibberish – but remember this is the light grey text, which Tesseract didn’t even recognise in the last post.
  2. The rest of the text has been interpreted perfectly.
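
Although this post doesn’t walk through building a UWP app, it may help to see roughly what the recognition step involves. The snippet below is a minimal sketch, not the code from Microsoft’s sample: the class name OcrSketch and the method name RecognizeTextAsync are hypothetical, and it assumes you already have a StorageFile for the image (for example, from a file picker) and that the code is running inside a UWP app.

```csharp
using System;
using System.Threading.Tasks;
using Windows.Graphics.Imaging;
using Windows.Media.Ocr;
using Windows.Storage;
using Windows.Storage.Streams;

// Hypothetical helper class, for illustration only.
public static class OcrSketch
{
    // Returns the recognised text, or null if no OCR engine is available
    // for any of the user's profile languages.
    public static async Task<string> RecognizeTextAsync(StorageFile imageFile)
    {
        // Try to create an engine for one of the languages in the user's profile.
        OcrEngine engine = OcrEngine.TryCreateFromUserProfileLanguages();
        if (engine == null)
        {
            return null;
        }

        using (IRandomAccessStream stream = await imageFile.OpenAsync(FileAccessMode.Read))
        {
            // Decode the image file into a SoftwareBitmap, which is what the engine expects.
            BitmapDecoder decoder = await BitmapDecoder.CreateAsync(stream);
            using (SoftwareBitmap bitmap = await decoder.GetSoftwareBitmapAsync())
            {
                OcrResult result = await engine.RecognizeAsync(bitmap);

                // result.Lines gives per-line detail; Text is the whole result as one string.
                return result.Text;
            }
        }
    }
}
```

This sketch does not check the image against OcrEngine.MaxImageDimension, so a very large image would need to be scaled down before being passed to the engine.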

Conclusion

Windows.Media.Ocr tried to interpret the faint grey text, and didn’t fare well. However, for darker text, it gave extremely impressive results – it recognised the darker text perfectly.

So on the face of it, this is a very good option for OCR applications written in C#. But this library is only directly accessible through UWP apps – I’d prefer to be able to use it in my regular Windows applications as well. For example, I may want to allow users to upload an image to a website and have the server recognise the text in the image.

Fortunately, Microsoft have us covered – they have created the “Project Oxford” web service for exactly this kind of purpose. I’ll return to this in the third post in this series, with a bit more C# code on how to get started using this service.