Convert to Office

Last updated on June 21, 2021

The Convert To Office process is an optional node you can use to convert image files and PDFs to standard Microsoft Office formats, such as the following:

  • Word
  • Excel
  • PowerPoint

A scan of 300 DPI or higher is recommended for good output. This process works with the following file types:

  • TIF/TIFF
  • PNG
  • JPG/JPEG
  • BMP
  • PDF

Note: This node can process GIF files. However, the accuracy of the output may vary.
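As a rough illustration of the file-type list above, the following Python sketch sorts incoming file names by extension before they reach the workflow. It is not part of Dispatcher Phoenix; the function name `classify_input` and the three category labels are hypothetical and exist only for this example.

```python
from pathlib import Path

# Extensions copied from the list above.
LISTED = {".tif", ".tiff", ".png", ".jpg", ".jpeg", ".bmp", ".pdf"}
# GIF can be processed, but the accuracy of the output may vary.
VARIABLE_ACCURACY = {".gif"}


def classify_input(filename: str) -> str:
    """Return 'listed', 'variable', or 'unlisted' for a file name (hypothetical labels)."""
    ext = Path(filename).suffix.lower()
    if ext in LISTED:
        return "listed"
    if ext in VARIABLE_ACCURACY:
        return "variable"
    return "unlisted"
```

A pre-check like this could be used to route GIF files to a manual review step, since their results are less predictable.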

Important! Some features of this node may not work correctly without Microsoft .NET Framework 3.5. Install the framework from Windows Update or download it from the Microsoft website.

Note: This node includes the OmniPage OCR engine.

Limits

For incoming documents, any image that exceeds the following limits is not processed:

  • Maximum allowable size: A0 (33.1” x 46.8”)
  • Maximum DPI: 2400

In addition, word processor formats (.doc and .rtf) have a length/height limit of 22”.
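The limits above can be checked ahead of time from an image's pixel dimensions and resolution, since physical size in inches is pixels divided by DPI. The sketch below is a minimal illustration, not the node's own logic. It assumes that A0 applies in either orientation and that the 22” limit refers to page height; the function name `limit_violations` and its parameters are hypothetical.

```python
# Limits quoted in this section (inches and dots per inch).
A0_SHORT_IN = 33.1
A0_LONG_IN = 46.8
MAX_DPI = 2400
WORD_PROCESSOR_MAX_IN = 22.0


def limit_violations(width_px, height_px, dpi, word_processor_output=False):
    """Return a list of violated limits; an empty list means none were found."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    problems = []
    if dpi > MAX_DPI:
        problems.append("dpi exceeds 2400")
    # Physical size in inches = pixels / dots per inch.
    width_in = width_px / dpi
    height_in = height_px / dpi
    # Assumption: orientation does not matter, so compare sorted sides to A0.
    short_side, long_side = sorted((width_in, height_in))
    if short_side > A0_SHORT_IN or long_side > A0_LONG_IN:
        problems.append("page larger than A0")
    # Assumption: the 22-inch limit for .doc and .rtf output applies to height.
    if word_processor_output and height_in > WORD_PROCESSOR_MAX_IN:
        problems.append("height exceeds 22 inches for .doc/.rtf")
    return problems
```

For example, a letter-size scan at 300 DPI (2550 x 3300 pixels) measures 8.5” x 11” and returns an empty list.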

Configuring the Convert to Office Node

To open the Convert to Office node, drag a Convert to Office node from the Distribution panel onto the workflow, and then double-click it. The Convert to Office Properties window opens.

The following options are available:

  • Enabled - To enable this node in the current workflow, check this box. If you leave the box blank, the workflow ignores the node and documents pass through as if the node were not present. Note that a disabled node does not check for logic or error conditions.

  • Node Name - This field is prefilled with the default node name. The name appears in the workflow under the node icon. Use this field to give the node a meaningful name that indicates its use in the workflow.

  • Node Description - Enter an optional description for this node. A description can help you remember the purpose of the node in the workflow or distinguish nodes from each other. If the description is long, you can hover the mouse over the field to read its entire contents.

Configure Advanced Settings

To adjust the accuracy and speed of the OCR process, click the Advanced Settings button to access the OCR Settings screen. For example, if the PDF conversion drastically changes the appearance of the output file, use this screen to adjust the OCR settings accordingly.

Output Settings

Use this section to specify your output settings.

  • Output File Format - Choose the output file format. For more information on which file formats and types are supported, please see the OCR Settings page.

    Note: For Microsoft Excel, if you process a file with multiple pages, the output is a single workbook file with one tab for each page.

    Note: PowerPoint 97 files are output as .rtf files, but they can be opened in PowerPoint once the extension is changed to .ppt.

  • Remove Blank Pages - To remove any blank pages from the output file, check this box.

  • Auto-Rotate - To have the OCR engine automatically rotate text to the correct orientation during conversion, check this box. To prevent any automatic rotation of text, leave this box blank.

  • Enable Spell Check - To improve OCR accuracy by checking whether recognized words are acceptable and automatically correcting misspellings, check this box. This option is enabled by default. If you do not want incoming files to be automatically spell-checked, uncheck the box. Disabling Spell Check may speed up the OCR process, but may reduce the accuracy of the results.
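The PowerPoint 97 note above says that the .rtf output must have its extension changed to .ppt before it opens in PowerPoint. One way to do that after the workflow has written the file is sketched below in Python. This is an illustration only and not a Dispatcher Phoenix feature; the function name `rename_rtf_to_ppt` is hypothetical, and the sketch assumes you have write access to the output folder.

```python
from pathlib import Path


def rename_rtf_to_ppt(path: str) -> Path:
    """Rename an .rtf output file to .ppt and return the new path."""
    source = Path(path)
    if source.suffix.lower() != ".rtf":
        raise ValueError(f"expected an .rtf file, got {source.name}")
    target = source.with_suffix(".ppt")
    # Refuse to overwrite an existing presentation with the same name.
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    source.rename(target)
    return target
```

Only the extension changes; the file contents are left untouched, which matches what the note describes.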

Buttons

  • Help - To access Dispatcher Phoenix Online Help, click this button.
  • Save - To preserve your node definition and exit the window, click this button.
  • Cancel - To exit the window without saving any changes, click this button.