PSN-L Email List Message

Subject: Re: PDFing KS36000 manuals
From: "Charles R. Patton" charles.r.patton@........
Date: Thu, 24 May 2001 09:54:02 -0700

Karl Cunningham wrote:
> I think the major drawback is the number of bytes created and the labor
> involved. ... I don't think putting them into a PDF
> is practical.

I can confirm this. If you PDF'd a scan without first converting to text
it would be larger.  On a quick test I just did, a page from a technical
journal with text and line drawings scanned at 200 dpi (marginally
better than a fax), and PDF'd at 600 dpi, was 278 KB.  If I JPG'd the
page at 90%, I got 1239 KB.  A Publisher page I had with text, boxes,
and line drawings from PPT, took only 86 KB at 600 dpi.  The Publisher
page looked perfect in the PDF while the scanned page had visible
artifacts.  It took about ½ a minute just to scan the page.

I use OCR (OmniPage Pro 10) all the time for a newsletter we do. The
OmniPage software is fairly current, state-of-the-art software.  Still,
unless the original copy is absolutely flawless (i.e., not a
third-generation Xerox with broken letters, smeared background,
out-of-focus sections, etc.), it can be easier to just re-type the copy
than to go through and edit the OCR version.  For instance, much of the
input we get comes as faxes,
and about 50% of the time I just re-type rather than OCR due to the poor
recognition and high error rate.  I hasten to add that even with
perfect original copy, it's still maybe only 98% correct.  That
translates to a 100% certainty that you'll have to perform edits on
every page.  As Karl mentioned, attempting to OCR mixed text and
pictures would be absolutely laborious.  OmniPage can pick up the
indentations, but then the pictures would have to be inserted
separately in an editing program such as Word or Publisher.  I would not
be up for converting the manuals.
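
To put a rough number on the 98% figure above, here is a minimal Python
sketch.  It assumes (my assumption, not something measured) that "98%
correct" means per-character accuracy, that errors are independent, and
that a page holds about 3000 characters.  The function names are made up
for illustration.

```python
# Assumptions, not from the original message: "98% correct" is read as
# per-character accuracy, errors are independent, and a page holds
# roughly 3000 characters.

def expected_errors(chars_per_page: int, accuracy: float) -> float:
    """Average number of wrong characters on one page."""
    return chars_per_page * (1.0 - accuracy)


def prob_page_needs_edit(chars_per_page: int, accuracy: float) -> float:
    """Chance that a page contains at least one wrong character."""
    return 1.0 - accuracy ** chars_per_page


if __name__ == "__main__":
    chars = 3000   # hypothetical page length
    acc = 0.98     # figure quoted in the message
    print(f"expected errors per page: {expected_errors(chars, acc):.0f}")
    print(f"P(at least one error):    {prob_page_needs_edit(chars, acc):.6f}")
```

Under those assumptions the chance of an error-free page is effectively
zero, which is consistent with having to edit every page.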

At best, scanning and direct conversion to PDF would take a huge amount
of time and require lots of disk space.  I know I would not be up to it.
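
For a feel of the scale, the per-page figures from my quick test can be
multiplied out.  This is only a sketch: the 500-page count is a
hypothetical number, not the actual size of the manuals, and the time
covers scanning alone, not handling pages or cleaning up the results.

```python
# Per-page figures quoted earlier in this message.
KB_SCANNED_PDF = 278    # 200 dpi scan converted to PDF
KB_SCANNED_JPG = 1239   # same page saved as JPG at 90%
KB_NATIVE_PDF = 86      # Publisher page converted to PDF
SCAN_SECONDS = 30       # "about half a minute" per page


def estimate(pages: int, kb_per_page: float, seconds_per_page: float = 0.0):
    """Return (megabytes, minutes) for a job of the given page count."""
    megabytes = pages * kb_per_page / 1024.0
    minutes = pages * seconds_per_page / 60.0
    return megabytes, minutes


if __name__ == "__main__":
    pages = 500  # hypothetical page count, not from the message
    for label, kb in [("scanned PDF", KB_SCANNED_PDF),
                      ("scanned JPG", KB_SCANNED_JPG),
                      ("native PDF", KB_NATIVE_PDF)]:
        mb, _ = estimate(pages, kb)
        print(f"{label:12s} {mb:8.1f} MB")
    _, minutes = estimate(pages, KB_SCANNED_PDF, SCAN_SECONDS)
    print(f"scanning alone: {minutes:.0f} minutes")
```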

Charles R. Patton

Public Seismic Network Mailing List (PSN-L)

