While scanning QR codes can be achieved by porting the ZXing library app to Google Glass (which has been accomplished by BarcodeEye), I thought it would be interesting to also combine OCR and Glass. OCR stands for Optical Character Recognition. Wikipedia defines it as the mechanical or electronic conversion of scanned or photographed images of typewritten or printed text into machine-encoded, computer-readable text. I normally describe OCR as the process of reading printed text. In this blog post I will share how I’ve used Google Glass to do OCR. I hope some of you will find it interesting.
Here are the steps involved:
1. Create a Google Glass project in Eclipse.
If you are not familiar with that, it is best to follow Google’s GDK Quick Start guide directly. Because I wanted to add a custom voice command to the Ok Glass menu, I added the following line to the manifest file:
<uses-permission android:name="com.google.android.glass.permission.DEVELOPMENT" />
2. Build Tesseract.
Here we are using Tesseract for OCR. The Android version used here provides a set of Android APIs and build files for compiling the Tesseract Optical Character Recognition and Leptonica image processing libraries.
1) Download Tesseract from GitHub (this is an excellent version forked by Robert Theis, called tess-two)
2) Download the Android NDK, which is needed to build Tesseract, from http://developer.android.com/tools/sdk/ndk/index.html
3) Extract it to a folder (e.g. c:\Software\android-ndk)
4) Add it to your PATH environment variable. Go to Control Panel -> Environment Variables and add the extracted folder path (e.g. c:\Software\android-ndk) to the PATH variable.
5) Open a command prompt and go into the tess-two project folder you downloaded (ndk-build needs to run in the folder that contains the library’s jni directory)
6) Type “ndk-build”, and the project will compile. It takes about an hour.
3. Add Tesseract as a library.
Open the properties of the project created in step one and add the tess-two library as a reference.
4. Add the following sample code to the project’s main activity.
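// Assumes extras is the Bundle of the camera result Intent (e.g. in onActivityResult)
String thumbnailFilePath = extras.getString(CameraManager.EXTRA_THUMBNAIL_FILE_PATH);
String pictureFilePath = extras.getString(CameraManager.EXTRA_PICTURE_FILE_PATH);
// Downsample by a factor of 4 to keep memory use low
BitmapFactory.Options options = new BitmapFactory.Options();
options.inSampleSize = 4;
Bitmap bitmap = BitmapFactory.decodeFile(thumbnailFilePath, options);
// Tesseract works on an ARGB_8888 bitmap
bitmap = bitmap.copy(Bitmap.Config.ARGB_8888, true);
String lang = "eng";
// Make sure this path exists and contains tessdata/eng.traineddata
String DATA_PATH = Environment.getExternalStorageDirectory().toString() + "/Android/data/";
TessBaseAPI baseApi = new TessBaseAPI();
baseApi.setDebug(true);
baseApi.init(DATA_PATH, lang);
// Hand the bitmap to Tesseract before asking for the text
baseApi.setImage(bitmap);
String recognizedText = baseApi.getUTF8Text();
// Release the native resources once recognition is done
baseApi.end();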
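The "make sure this path exists" comment in the sample can also be checked in code before calling init. Below is a minimal sketch, assuming tess-two's convention that the path handed to init is the parent of a tessdata folder holding the <lang>.traineddata file. The class and method names (TessDataCheck, hasTrainedData, ensureTessdataDir) are hypothetical and not part of tess-two.

```java
import java.io.File;

/** Hypothetical helper (not part of tess-two): checks the language data layout before init(). */
public class TessDataCheck {

    /**
     * Returns true if dataPath/tessdata/<lang>.traineddata exists as a regular file.
     * dataPath is the parent folder that would be passed to TessBaseAPI.init().
     */
    public static boolean hasTrainedData(String dataPath, String lang) {
        File tessdataDir = new File(dataPath, "tessdata");
        File trainedData = new File(tessdataDir, lang + ".traineddata");
        return trainedData.isFile();
    }

    /** Creates dataPath/tessdata if it is missing; returns true if the folder exists afterwards. */
    public static boolean ensureTessdataDir(String dataPath) {
        File tessdataDir = new File(dataPath, "tessdata");
        return tessdataDir.isDirectory() || tessdataDir.mkdirs();
    }
}
```

In the activity, one option would be to call TessDataCheck.hasTrainedData(DATA_PATH, lang) first and skip recognition with a message when it returns false, rather than letting init fail on missing data.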
That should do it. You can download the sample code from GitHub for further customization and extension.
This is Michael Siu from Idea Notion, a consulting firm that develops enterprise web and mobile apps. If you have any questions, would like to learn more, or are interested in developing Google Glass apps, feel free to contact me. Follow our Google Glass Series to find out what exciting things we are building with Google Glass.
By Michael Siu