Google and open source OCR

Filed under: Official Google Blog. Written by Lees on Tuesday, December 11th, 2007 @ 6:18 am

Posted by T.V. Raman, Research Scientist

From time to time, our own T.V. Raman
(http://emacspeak.sourceforge.net/raman/) shares his tips on how to
use Google from his perspective as a technologist who cannot see,
tips that sighted people, among others, may also find useful. - Ed.

As someone who cannot see, I prefer to live in a mostly paperless
world. This means ruthlessly turning every piece of paper that
enters my life into a set of bits that I can process digitally. I
scan in everything. Until now, I have relied on commercial OCR
packages to convert these images into readable text. OCR is perhaps
one of the areas where the benefits of Moore's Law
(http://en.wikipedia.org/wiki/Moore's_law) are most evident; today,
OCR can do remarkably well when handed a page image. My only
dissatisfaction with the status quo has been that commercial OCR
engines give me little flexibility to train them to do better on
documents that are specific to me.

The advent of our own open source OCR initiative, OCRopus
(http://code.google.com/p/ocropus/; source code: Ocropus Sources), is
a welcome change in this regard. I recently introduced support for
OCRopus in Emacspeak, and the HTML output this produces compares
favorably with output from commercial OCR engines, provided you
place the page at the right orientation on the scanner.
OCRopus' extensibility and its ability to express the OCR result as a
structured HTML document make it an ideal starting point for
producing rich spoken output. The possibilities are enormous when
people can collectively train, customize and improve an OCR engine.
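
To make the point about structured HTML concrete, here is a minimal sketch of how a speech front end might walk such a document and turn its blocks into utterances. It is an illustration only, not how Emacspeak or OCRopus does it: the function names (spoken_outline, to_utterances) are hypothetical, and the assumption that headings and paragraphs arrive as plain h1-h3 and p elements may not match the markup OCRopus actually emits.

```python
from html.parser import HTMLParser


class SpokenOutline(HTMLParser):
    """Collect (tag, text) pairs for headings and paragraphs.

    Assumption: the OCR output marks blocks with h1-h3 and p tags.
    Real OCR engines may use other elements or class attributes.
    """

    BLOCK_TAGS = {"h1", "h2", "h3", "p"}

    def __init__(self):
        super().__init__()
        self.items = []    # finished (tag, text) pairs, in document order
        self._tag = None   # block tag currently open, if any
        self._parts = []   # text fragments seen inside that block

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self._tag = tag
            self._parts = []

    def handle_data(self, data):
        if self._tag is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            # Collapse runs of whitespace left over from line breaks.
            text = " ".join("".join(self._parts).split())
            if text:
                self.items.append((tag, text))
            self._tag = None


def spoken_outline(html):
    """Return the document's blocks as (tag, text) pairs."""
    parser = SpokenOutline()
    parser.feed(html)
    parser.close()
    return parser.items


def to_utterances(items):
    """Announce headings so a listener hears the page structure."""
    out = []
    for tag, text in items:
        if tag.startswith("h"):
            out.append("Heading: " + text)
        else:
            out.append(text)
    return out
```

Each string returned by to_utterances could then be handed to a speech synthesizer one block at a time, so the listener can skip between headings and paragraphs instead of hearing one undifferentiated stream of text.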

