Optical Character Recognition and You: A Primer and Handy Guide
08:07am EST - 11/13/2008

But what’s OCR? Why should I care if my PDFs are tagged with (OCR)? Well, fair reader, let me explain.

Optical Character Recognition, or OCR, is a method of recognizing the letters on a scanned page so that a PDF becomes searchable. By stringing letters together to form words, the program (Adobe Acrobat, in this case) is able to turn what was a JPEG or similar image into clickable, highlightable, searchable text. Sounds pretty handy, doesn’t it? Why yes, yes it is! OCR makes reading a PDF that much simpler. In addition to making the text interactive, it also allows PDFs to be compressed. Take your average Dark Heresy scan. It’s about 200 megs, give or take a few for the differences between the scans available. Once run through an OCR converter, that size can shrink all the way down to under 50 megs, a cut of roughly three quarters. The same goes for just about any book out there.
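For the curious, here is a rough Python sketch of the "stringing letters together" idea. To be clear, this is not how Acrobat does it under the hood. The `Word` class, the `group_letters` and `search` functions, the sample letter positions, and the 8-pixel gap are all made up for illustration, and the sketch assumes the letters have already been recognized and sit on a single line.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One recognized word plus where it sits on the page image (pixels)."""
    text: str
    x: int
    y: int
    width: int
    height: int


def group_letters(letters, gap=8):
    """String recognized letters into words.

    `letters` is a list of (char, x, y, width, height) tuples in reading
    order on one line. A horizontal gap wider than `gap` pixels (an
    arbitrary, assumed threshold) starts a new word.
    """
    words = []
    current = None
    for char, x, y, w, h in letters:
        if current is not None and x - (current.x + current.width) <= gap:
            # Close enough to the previous letter: extend the current word.
            current.text += char
            current.width = (x + w) - current.x
            current.height = max(current.height, h)
        else:
            # Too far away (or first letter): start a new word.
            current = Word(char, x, y, w, h)
            words.append(current)
    return words


def search(words, query):
    """Return every word matching `query`, ignoring case."""
    return [word for word in words if word.text.lower() == query.lower()]


# Hypothetical letter boxes, as a recognizer might report them.
sample_letters = [
    ("O", 10, 20, 12, 16), ("C", 24, 20, 12, 16), ("R", 38, 20, 12, 16),
    ("P", 70, 20, 12, 16), ("D", 84, 20, 12, 16), ("F", 98, 20, 12, 16),
]

if __name__ == "__main__":
    page_words = group_letters(sample_letters)
    for hit in search(page_words, "pdf"):
        # The box is what a viewer would highlight on top of the scan.
        print(hit)
```

Because each word keeps its position on the page, a viewer can draw a highlight over the right spot on the scan whenever a search matches, which is roughly what makes an OCR’d PDF feel like real text.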

“But if OCR is so good, why aren’t all books OCR’d?” one might wonder. Well, there is a slight loss of image quality involved in the process. That same Dark Heresy book remained an excellent scan, but it did lose a little along the way. The other problem is the necessary equipment, and not just any program will do: Adobe Acrobat is the only one I’ve found so far that is able to perform the process. Along with the program, a certain amount of time is needed, usually a few hours for the OCR to run. During that period the program eats a large amount of system resources, making other endeavors nigh on impossible.

However, this brave author has started OCRing books in a venture he calls UNLIMITED OCR WORKS. Intensive as the process is, I’ve personally OCR’d a fair number of books through it, and the library grows often. The catalog will be updated and posted shortly, along with handy links.

If you have a particular or special request, please send it my way. It’s kind of fun to OCR books and stuff, so let me know over the IRC or something.

~Ishallcallu


Site code and contents © 2007-2024 thisisnotatrueending.com. All rights reserved.