
Digitizing Books, Fast and Affordable

I recently chatted with a librarian who maintains a collection of rare books in several languages. He uses proprietary software to scan the books and then convert the scans to digital text. After that, he intends to keep the physical books preserved and offer electronic library lending. He already has a dedicated file server set up for sharing his electronic library. I pointed him to Project Gutenberg (PG) and Project Madurai (an Indic/Tamil-language initiative that aims to do for ancient and classical works in Tamil what PG did).
However, he wasn't happy that scanning books took ages, usually anywhere from one to several weeks, and required a sizable amount of storage. He could automate the naming of some of the scanned images, but that was as far as he could get.
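The naming step, at least, is easy to script. Here is a minimal sketch in Python, assuming the scanner software writes files whose alphabetical order matches the capture order. The naming scheme (`<book id>_p<page number>.<ext>`) and the function names are my own illustration, not something the librarian's software provides.

```python
from pathlib import Path


def page_filename(book_id: str, page_number: int, ext: str = "tif") -> str:
    """Build a sortable name such as 'rarebook-01_p0007.tif' (hypothetical scheme)."""
    if page_number < 1:
        raise ValueError("page numbers start at 1")
    return f"{book_id}_p{page_number:04d}.{ext.lstrip('.').lower()}"


def rename_plan(scans: list[str], book_id: str) -> list[tuple[str, str]]:
    """Pair each raw scan with its new name, in capture order.

    Assumes that sorting the raw file names alphabetically reproduces
    the order in which the pages were scanned.
    """
    plan = []
    for number, old in enumerate(sorted(scans), start=1):
        ext = Path(old).suffix or ".tif"  # fall back to .tif if there is no extension
        plan.append((old, page_filename(book_id, number, ext)))
    return plan


if __name__ == "__main__":
    raw = ["scan_003.TIF", "scan_001.TIF", "scan_002.TIF"]
    for old, new in rename_plan(raw, "rarebook-01"):
        print(old, "->", new)
```

The sketch only builds a list of (old, new) pairs and touches no files, so the plan can be reviewed before anything is actually renamed.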
 
Looking at the FAQ at PG, I found that they too rely on volunteer scanners, who are advised to use a traditional flatbed scanner (which limits the page size of scans). Some scanners have an Automatic Document Feeder (ADF), which helps if you can remove the book from its binding. For classical works, this is probably very tricky and sometimes impossible. The alternative is to scan two pages at a time and then post-process each scan. You can check out the details here at the Scanning FAQ.
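To give an idea of what post-processing a two-page scan involves, here is a rough sketch in Python that splits a spread into left and right pages. It assumes the scan is a grayscale image held as a list of pixel rows (0 is black, 255 is white) and that the binding shows up as the darkest column in the middle third of the image. Both the representation and the gutter heuristic are my assumptions, not what the PG FAQ prescribes; real tools would also deskew and crop.

```python
def find_gutter(image: list[list[int]]) -> int:
    """Return the index of the darkest column in the central third of the scan."""
    width = len(image[0])
    lo, hi = width // 3, 2 * width // 3
    # Total brightness of each candidate column; the binding shadow is darkest.
    column_sums = {x: sum(row[x] for row in image) for x in range(lo, hi + 1)}
    return min(column_sums, key=column_sums.get)


def split_spread(image: list[list[int]]) -> tuple[list[list[int]], list[list[int]]]:
    """Cut a two-page scan at the gutter into (left page, right page)."""
    gutter = find_gutter(image)
    left = [row[:gutter] for row in image]
    right = [row[gutter:] for row in image]
    return left, right


if __name__ == "__main__":
    # A tiny 2 x 9 "scan": bright paper with a dark binding shadow in column 4.
    spread = [[200, 200, 200, 200, 50, 200, 200, 200, 200] for _ in range(2)]
    left, right = split_spread(spread)
    print(find_gutter(spread), len(left[0]), len(right[0]))  # prints: 4 4 5
```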
 
Many of the books being scanned are fragile and printed on thin paper. That rules out removing the binding or, in some rare cases, even laying them flat for scanning. This is a known problem to the gadget makers who build scanners. You can read about some of the gadgets specialized for scanning books here. (Wikipedia Link)
 
A good book scanner from anthroblogs/nomadicthoughts 
If you did check the Wikipedia link, you would have seen the open designs, which are built to accommodate a variety of book sizes. The closed unit above, by comparison, is less sensitive to ambient lighting and needs less powerful lighting (which saves on your power budget, something critical today).
 
Now, why would this be an interesting space for innovation? The answer lies in the pricing of the available solutions, which are bundled with software. Flatbed book scanners differ from traditional flatbed scanners in that they offer high-speed scanning specifically for standard book sizes (the fastest manage about 3 s per page). They do not turn pages automatically, which is obviously the tough part and also risks damaging the book.
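To put the 3 s per page figure in perspective, here is a back-of-the-envelope sketch in Python. The 300-page book, the 6 x 9 inch page, the 300 dpi resolution and 8-bit grayscale are all assumptions of mine for illustration. The time covers scanning only and ignores turning pages by hand, and the storage figure is for uncompressed images.

```python
def scan_minutes(pages: int, seconds_per_page: float = 3.0) -> float:
    """Pure scanning time in minutes, excluding manual page turning."""
    return pages * seconds_per_page / 60


def raw_bytes(pages: int, width_in: float, height_in: float,
              dpi: int, bytes_per_pixel: int = 1) -> int:
    """Uncompressed storage for a whole book (1 byte per pixel = 8-bit grayscale)."""
    pixels_per_page = round(width_in * dpi) * round(height_in * dpi)
    return pages * pixels_per_page * bytes_per_pixel


if __name__ == "__main__":
    pages = 300  # assumed book length
    print(scan_minutes(pages))           # 15.0 minutes at 3 s per page
    print(raw_bytes(pages, 6, 9, 300))   # 1458000000 bytes, roughly 1.5 GB raw
```

At that speed the scanner itself is clearly not the bottleneck. Under these assumptions a book is scanned in minutes, so a workflow that takes weeks is losing its time in page turning and post-processing.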
 
If you look for a complete solution that turns pages automatically, you will find the following listed.
 
Now, with prices running up to $10,000, this is definitely no easy investment for NGOs or even government libraries in developing nations. The pricey components are:
  • The camera, which in many of these cases is a high-end CCD with a lens capable of autofocus and control
  • Robotic or non-touch techniques for page turning
  • Illumination with automatic adjustment for paper type, quality and color
  • Bundled software for OCR and automated PDF output
  • Non-standard adjustable mechanical casing
  • Demand volume that is too low to bring prices down, creating a vicious cycle

If you really want to penetrate this space, there is quite a lot that can be done. Here are a few suggestions:

  • Use CMOS cameras with built-in autofocus DSP functions (monochrome should suffice for most needs).
  • Use the built-in DSP for quick post-processing and OCR (processors like the OMAP already make this possible).
  • Use minimal illumination and a partially closed box built to traditional flatbed scanner or OHP dimensions.
  • Improvise on non-touch page turning (robotics will need calibration and has component scalability issues).
  • Bundle software that integrates well with content management systems and supports multiple output formats.

I strongly feel that beating that price point and reaching a larger number of people will create a compelling business case. New books are usually created on digital media and therefore already have a digital version available. It is the volumes of old books containing priceless information that need to be scanned. One could imagine a host of business models here, including library-centric data centers hosting readable material accessible on almost any reading device.

 
 


Comments

I am absolutely loving this site. Going to need to put this on my blogroll.

I've had my Kindle 2 for a couple of months now. I wanted to wait before I wrote a review to get a little experience with the device.

I have to say that I very much like it. Those who have never read on one and maintain that a book with actual paper pages cannot be replaced are, in fact, not in a position to speak with any authority.

First of all, the Kindle 2 is very easy on the eyes. A couple of books I've purchased aren't great, fontwise. They look like bad PDF scans of a paper printout of something, but they're nonetheless readable. Most books, however, are very easy to read, and the variable font size enables me to find the font size that works best for me. While not as much text fits on a screen as on a mass-market paperback page using the font size I like, pushing the "next page" button has become second nature. I now press it at just the right time in the middle of the last sentence on the page so it turns just at the right moment. Works like a charm. (My one complaint is that I'd like a "previous page" button on both sides of the screen, the way there's a "next page" button on both sides, but that's a very minor consideration.)

Some have said they don't like the black text on a greyish background, but I find the contrast not only more than adequate but easier on the eyes than if the background were white. Further, the display doesn't glare. If you do have a lightbulb nearby - or perhaps the sun if you're outdoors - you can, with some effort, position the screen at exactly the right angle to reflect an image of the bright and shining light bulb (or the sun, I presume) in the surface of the screen. But you really have to get it just right to have that happen and, otherwise, there isn't a hint of glare.

I don't miss backlighting at all. A backlighted display is not easy on the eyes for hours on end and the Kindle 2 display is as nice to read as any traditional book. If I don't have a reading lamp handy, I find that the "Mighty Bright" clip-on battery-operated book light that amazon.com recommends works very nicely. I bought one and I've tried it out, so I know if I'm ever in the middle of a power failure, I can read. The other thing to think about is that a backlighted display would no doubt run down the Kindle 2's battery faster. Also, people who have reviewed the Sony Reader's backlighted version have all said the display isn't as crisp and sharp as an e-ink display, and I have to say that the Kindle 2's e-ink display is as crisp and sharp as reading a traditional book.

I very much like how thin the unit is. There are two slots on the left-hand edge that enable the Kindle 2 to lock right into a leather cover. I bought the basic black morocco leather cover, which works nicely. Reading the book in the cover isn't a problem as the front cover readily folds around to the back so that it isn't making the left-edge buttons hard to press. Yes, it costs extra, but they lowered the price on the Kindle 2 so that the cost of the basic cover plus the purchase price of the device itself equals the cost of a Kindle 1. The advantage here is that the buyer can choose the kind of cover (s)he wants to buy. There are some more expensive ones available that add some additional features, such as one that zips closed on three sides, which keeps out dirt and other unwanted detritus when camping outdoors. There are designer leather covers that come in multiple colors. The basic one, however, meets my needs as an indoor user.

As to the tiny qwerty keyboard, I have no complaints. I've used the search function many times and it's easy to "type" in the search terms. The little square "joystick" that moves the cursor in the four directions of the compass is also surprisingly easy to use. It's square and protrudes just enough that I can move it left, right, up or down by catching its edge with my fingertip. It's way better than trying to wiggle the rubber-capped thingy in the middle of a laptop keyboard with a fingertip. To use the "enter" function, that is to execute a command or enter the text I've typed in, I just have to push downward on the square button. Very easy, works like a charm, and I expected it to be difficult to use to some degree, but it isn't at all.

I can't complain about network access so far. I'm in San Francisco and coverage is excellent anywhere I've been so far. It happens that I've had a Sprint cell phone for years and, apparently, the WhisperNet access amazon.com provides makes use of the Sprint network. I do have the complaint that anyone who gets their hands on my Kindle 2 could buy books using my credit card because amazon.com doesn't let me keep it password protected, as it ought to do. One thing I like, though, is that I can go to the amazon.com website on my computer and buy books there and they download right to the Kindle 2. (It's way easier to browse for books on my PC with its full-sized monitor, and I can do it from any PC with Internet access, but I can also do it right from the Kindle 2 if I'm traveling and can't get at a PC.)

So far, the battery charge seems to last quite a long time for reading. I leave the wireless access feature turned off for longer battery life. It's easy enough to turn the feature on when I need it, and turn it off again when I don't, so why leave it on?

So far, I've recharged the unit at the office and at home by plugging it into a USB port on the front of a PC. I haven't tried the AC outlet adapter yet, but that's designed for use with the USB cable, and I'm intrigued to see that the AC adapter will work at any voltage from 100 volts in Japan to 240 volts in the UK, at either 50 or 60 Hz, and would only need an outlet adapter to do that and not a voltage transformer. So far, it has taken anywhere from two to three hours to recharge the device, depending on how far down I've let the battery run. New out of the box, it took less than two hours to fully charge. There's a nicely functional tiny LED next to where the USB cable plugs into the bottom end of the Kindle 2 that shows amber when charging and turns green when fully charged.

The USB cable is fairly long, and it's possible to continue reading while charging the unit. I should mention that the USB cable has a tiny plug for the Kindle end and a standard USB plug to go into the USB port on a computer. I have a feeling that finding such a cable other than on the amazon.com website might be tricky. I bought one at a retailer that had a tiny plug on one end, looked right to me, but when I got home it turned out not to be the same as the one on the Kindle 2.

When hooking up the device to the computer with the USB cable, the PC perceives the Kindle as a flash drive. I have found that I can cut and paste book files to my PC and burn them to a CD or DVD-ROM as a way of storing them. I can copy and paste them back to the Kindle at any time. There are two files for each book with the same name but a different extension after the "dot." One is apparently the book itself and the other file gets created once you start reading it - it keeps track of bookmarks, notes you make, and also where you were last reading so that any time you reopen the book, you'll go right to the place you left off.

That leads me to another thing I like about the device - when you turn it on the next time, it goes right to where you left off reading the last time. Very nice. Also, you can search for any string of text you want. Useful in a novel if you want to look something up earlier in the book, such as a character's name. I frequently get to a point where a character introduced earlier comes back into the picture and I'm not quite sure which character that was if their first mention didn't amount to much. Way easier with the Kindle 2 to find the first introduction of that character using the search function than trying to find it in a paper book.

I'm not pleased with the prospect of having to send the unit in to have the battery replaced when it finally fails and won't take or hold a charge. Apparently the Kindle 1 had a battery the owner could replace quickly, the way a cell phone does. I think this is the one thing I can complain about legitimately.

I've given the voice synthesizer unit a try. I won't be using it much. It works just fine - with a choice of a female or male voice. The latter feature is important. Reading a novel narrated by a male character doesn't work so well, I find, with a woman's voice, and vice versa. So the choice is a nice thing. That said, as with the voice synthesizer on my PC, it's just too approximate. Many pronunciations are approximate and even confusing, and sentences are never read with the correct vocal inflection to convey the meaning of the sentence. Of course, a voice synthesizer program cannot comprehend and interpret what it's reading. Thus, I think the assertion that this device will cut into the sales of audiobooks is unfounded. I won't be using the feature much anyway because I get a lot more out of a book I'm reading with my eyes and hearing the voice of the storyteller in my mind. (I haven't wanted to be read to since I learned to read myself and that was a very long time ago.) I don't think the volume goes quite loud enough on this device, anyway, but the tiny stereo speakers on the back produce better quality sound than they ought to given how small they are. There is a tiny headphones jack on the top edge of the device, and perhaps with earphones or headphones, the volume would be fine. (I haven't bothered to check.)

All in all, I think this is a good thing. I'm pleased with getting books I purchase within seconds of buying them. Saves lots of fuel and shipping costs - I don't have to go out to the bookstore and a truck doesn't have to deliver the book to me. It also saves trees - no paper is involved. The books seem to be cheaper than their paper counterparts in all cases. When a book is out in hard binding only, it's cheaper than the hardbound book, and when it's out in paperback, it's cheaper than the paperback. I don't care that I can't lend the books out or swap them or sell them to a used bookstore. I have a tiny junior one bedroom city apartment and I long ago ran out of shelf space for more books, and I'm delighted at the prospect of a digital reading library that takes up no physical space. Closed in its cover, the Kindle 2 has the same dimensions as a book of Millay poetry, 5/8" thick at the spine and when I'm not reading or charging it, it gets put on my bedside shelf of books next to those books of poetry and simply looks like one more book.

If I never bought a printed-on-paper book again, it would be more than fine with me, unless it were an art book or one full of beautiful color photos. If it's just text with the occasional illustration, the Kindle 2 is fine. In fact, the 16-shade greyscale feature with black and white shows illustrations very nicely. When the device goes into the powersaving mode - or when you put it there by giving the power switch a momentary slide - an illustration of an author or perhaps an illuminated manuscript page comes up and remains there. Apparently, no power is required to maintain an e-ink image on the screen once it displays. The spring-loaded, sliding power switch works as follows. To turn on from a blank screen, a momentary slide of the switch does it. The same to put it manually into the powersaving mode. To turn the device off, slide the switch and hold it for a few seconds until the screen goes blank.

So I give the Kindle 2 an A-minus, the minus because the battery isn't replaceable by the end user. That is, frankly, a situation for which there is just no excuse. I also would like to see the cost come down, and I have a feeling, over time, that it will. That happened with VCRs, CD players, DVD players, LED HDTVs and PCs, among other devices. It'll happen with the Kindle eventually, too, I'm sure. But given my book storage issues and how much it would cost me to rent a bigger apartment here in the City, I didn't mind paying the current price. That may not be true for everyone, but, as I said, I'm sure the price will eventually come down.

So my only real complaint right now is a battery the end-user can't replace. Other than that, I think the Kindle 2 is an excellent thing and I recommend it to all who don't mind paying what it costs.

I saw this really good post today, found it on Google. I think I may return some time.

Nice post. Keep up the great work


Wow! What an idea! What a concept! Beautiful... Amazing...

Super-duper site! I am loving it!! Will come back again - taking your feeds also. Thanks.
