11-01-2008, 10:12 AM
Quote: Google Inc. this week took another step in its effort to shed light on the so-called Dark Web, announcing that its engine can now search scanned documents in a PDF.
Using optical character recognition (OCR) technology, Google's search engine now can convert scanned PDF documents into text that can be searched and indexed, the company said. Thus, government reports, academic papers and other scanned documents can now show up in search results...
This is part of an ongoing effort by Google to shed more light on the Deep, or Dark Web, where lies a massive amount of information that can be accessed but not indexed by a search engine because it is behind databases or in a format -- like PDF -- that can't be easily searched...
full article: http://www.computerworld.com/action/arti...ticleBasic&articleId=9118683