Microsoft Office Document Imaging

see also Microsoft Document Imaging Format

Microsoft Office Document Imaging (MODI) is a Microsoft Office application for editing documents scanned by Microsoft Office Document Scanning. It was introduced in Microsoft Office XP and was included in later Office versions up to and including Office 2007. It is no longer available in Office 2010. According to Microsoft, MODI allows users to:

  • Scan single or multi-page documents.
  • Produce editable text from a scanned document using OCR.
  • Copy and export scanned text and images to Microsoft Word.
  • View a scanned document (the software does not permit navigating among multiple documents).
  • Search for text within scanned documents.
  • Easily reorganize scanned document pages.
  • Send scanned documents via e-mail or Internet fax.
  • Annotate scanned documents including using ink on a Tablet PC.

MODI's native file format is MDI, but it can also read and write a limited range of TIFF files, and it can save OCR text into the original TIFF file. However, MODI produces .tif files that violate the TIFF standard[1] and are usable only by the Microsoft Office Document Imaging products[2]. JPEG images can be recovered from these files with data carving tools such as foremost[3], which are designed to extract intact files from images of damaged hard drives. The OCR text in these files is visible in a binary editor.
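The idea behind signature-based carving can be sketched in a few lines. The following is a simplified illustration only, not foremost's implementation: it assumes that each embedded JPEG begins with the start-of-image bytes FF D8 FF and ends at the next end-of-image bytes FF D9, and the output file names are hypothetical. Real files may contain nested thumbnails or marker-like bytes in metadata, which this naive scan does not handle.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module JpegCarver
    ' Returns every byte range that starts with FF D8 FF (JPEG start of image)
    ' and ends with the next FF D9 (JPEG end of image).
    Function CarveJpegs(data As Byte()) As List(Of Byte())
        Dim found As New List(Of Byte())
        Dim i As Integer = 0
        While i <= data.Length - 3
            If data(i) = &HFF AndAlso data(i + 1) = &HD8 AndAlso data(i + 2) = &HFF Then
                Dim j As Integer = i + 3
                ' Advance until the closing FF D9 pair or the end of the data.
                While j <= data.Length - 2 AndAlso Not (data(j) = &HFF AndAlso data(j + 1) = &HD9)
                    j += 1
                End While
                If j <= data.Length - 2 Then
                    ' VB arrays are sized by upper bound: this holds bytes i through j + 1.
                    Dim chunk(j + 1 - i) As Byte
                    Array.Copy(data, i, chunk, 0, chunk.Length)
                    found.Add(chunk)
                    i = j + 2
                    Continue While
                End If
            End If
            i += 1
        End While
        Return found
    End Function

    Sub Main()
        Dim data As Byte() = File.ReadAllBytes("C:\test\multipage.tif")
        Dim images As List(Of Byte()) = CarveJpegs(data)
        For n As Integer = 0 To images.Count - 1
            ' Hypothetical output names, one file per carved candidate.
            File.WriteAllBytes("C:\test\carved_" & n.ToString() & ".jpg", images(n))
        Next
        Console.WriteLine("Carved {0} candidate JPEG(s)", images.Count)
    End Sub
End Module
```

Each carved candidate still needs to be opened in an image viewer to confirm that it is a complete, valid JPEG.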

In its default mode, the OCR engine deskews and re-orients the page where required. If the document object's Save() method is then called, the deskewed, reoriented images are written back into the original image file.
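The default behaviour described above can be overridden when OCR is started. The following is a minimal sketch, assuming that the project references the MODI type library and that the OCR method accepts a language identifier followed by two Boolean flags for automatic orientation and straightening. The module name is hypothetical, and the argument order should be checked against the installed type library.

```vbnet
Imports System

' Sketch only: requires a COM reference to the Microsoft Office Document Imaging type library.
Module OcrWithoutDeskew
    Sub Main()
        Dim doc As New MODI.Document()
        doc.Create("C:\test\multipage.tif")

        ' Assumed argument order: language, auto-orient page, straighten (deskew) page.
        ' Passing False twice asks the engine to leave the page images as scanned.
        doc.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, False, False)

        ' Print the recognized text of the first page.
        Dim firstPage As MODI.Image = DirectCast(doc.Images(0), MODI.Image)
        Console.WriteLine(firstPage.Layout.Text)

        doc.Close(False) ' close without saving changes to the input file
    End Sub
End Module
```

Because Save() is never called in this sketch, the input file should be left untouched whichever flags are used.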

Programmability

Via COM, MODI provides an object model based on 'document' and 'image' (page) objects. One feature that has elicited particular interest on the Web is MODI's ability to convert scanned images to text under program control, using its built-in OCR engine.

The MODI object model is accessible from development tools that support the Component Object Model (COM) by adding a reference to the Microsoft Office Document Imaging 11.0 Type Library. The MODI Viewer control is accessible from any development tool that supports ActiveX controls by adding Microsoft Office Document Imaging Viewer Control 11.0 or 12.0 (MDIVWCTL.DLL) to the application project. This file is usually located in C:\Program Files\Common Files\Microsoft Shared\MODI.

The MODI object model first became programmable in the Office 2003 release. The associated programs were already included in Office XP, but that version did not expose the object model to programmatic control.

A simple example in Visual Basic .NET follows:

Imports System.IO ' at the top of the source file; needed for File.AppendAllText

Dim inputFile As String = "C:\test\multipage.tif"
Dim strRecText As String = ""
Dim Doc1 As New MODI.Document()

Doc1.Create(inputFile)
Doc1.OCR()  ' OCR every page of the multi-page TIFF file
Doc1.Save() ' save the deskewed, reoriented images and the OCR text back to inputFile

For imageCounter As Integer = 0 To (Doc1.Images.Count - 1) ' step through each page of results
    ' the explicit cast keeps this line valid when Option Strict is On
    strRecText &= DirectCast(Doc1.Images(imageCounter), MODI.Image).Layout.Text
Next

File.AppendAllText("C:\test\testmodi.txt", strRecText)     ' append the OCR text to a file on disk

Doc1.Close() ' clean up
Doc1 = Nothing
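
The 'document' and 'image' objects also give access to finer-grained results than a page's full text. As a hedged sketch, assuming that each page's Layout object exposes a Words collection whose items have a Text property (the module name is hypothetical), the recognized words can be listed page by page:

```vbnet
Imports System

' Sketch only: requires a COM reference to the Microsoft Office Document Imaging type library.
Module ListRecognizedWords
    Sub Main()
        Dim doc As New MODI.Document()
        doc.Create("C:\test\multipage.tif")
        doc.OCR() ' default settings, as in the example above

        For pageIndex As Integer = 0 To doc.Images.Count - 1
            Dim page As MODI.Image = DirectCast(doc.Images(pageIndex), MODI.Image)
            ' Assumed: Layout.Words enumerates MODI.Word objects in reading order.
            For Each w As MODI.Word In page.Layout.Words
                Console.WriteLine("Page {0}: {1}", pageIndex + 1, w.Text)
            Next
        Next

        doc.Close(False) ' close without saving changes
    End Sub
End Module
```

Working at word level is useful when only part of a page is needed, for example when searching for a keyword, since it avoids parsing the concatenated page text.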

Changes since Office 2003 Service Pack 3

In Office 2003 Service Pack 3, Microsoft removed the association of the .TIF and .TIFF file extensions with Microsoft Office Document Imaging as part of the Service Pack's security changes. In addition, TIFF files can no longer use JPEG compression[4]. Microsoft gives no details about the underlying security issue.

In Office 2010, MODI is fully deprecated. This change also affects the setup tree, which no longer shows the MODI Help, OCR, or Indexing Service Filter nodes on the Tools menu. The Internet Fax feature in Office 2010 uses the Windows Fax printer driver to generate a fixed file format (TIF). MODI and all its components are deprecated for 64-bit Office 2010[5].

Alternatives to MODI for Office 2010 Users

Users of Office 2010, which lacks MODI, have these alternatives, among others:

  • Follow Microsoft's suggestions, which include installing only the MODI component from Microsoft Office 2007 (this installation process might also work with earlier versions of Office): http://support.microsoft.com/kb/982760
  • Install the Alterna-TIFF viewer, available as an ActiveX control (for Internet Explorer) or as a browser plug-in (for other browsers): http://www.alternatiff.com/

Wikimedia Foundation. 2010.
