Advanced OCR

18 minute read Last updated on September 11, 2024

Use the Advanced OCR process node to refine optical character recognition (OCR) results and extract metadata through the use of zones. Through zones, you can define how you want the OCR engine to recognize various elements on the page, such as text, forms, tables, and graphics. For example, to capture invoice numbers from incoming documents, you can create a zone in the area of the document where invoice numbers appear. In addition, zones can extract metadata automatically and associate the metadata with the original document.

This node works with the following file types:

TIFF
JPG
PNG
BMP
PDF
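
A workflow that feeds this node can screen files before they reach it. The following minimal Python sketch checks a file name against the list above. The function name is_supported_by_advanced_ocr is hypothetical, and treating .tif and .jpeg as aliases of TIFF and JPG is an assumption that this page does not state.

```python
from pathlib import Path

# Extensions taken from the list above. The ".tif" and ".jpeg" aliases are
# an assumption; the documentation names only TIFF and JPG.
SUPPORTED_EXTENSIONS = {".tiff", ".tif", ".jpg", ".jpeg", ".png", ".bmp", ".pdf"}


def is_supported_by_advanced_ocr(filename: str) -> bool:
    """Return True if the file extension matches a supported file type."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ("invoice_0042.PDF", "scan.tif", "notes.docx"):
        print(name, is_supported_by_advanced_ocr(name))
```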

Notes:

Some features of this node may not work correctly without Microsoft .NET Framework 3.5 or higher. We recommend that you install the framework from Windows Update or download it from the Microsoft website.
The recommended minimum resolution for scanned documents is 200 DPI. Expect better results from documents scanned at 300 DPI or higher.
This node includes the Tesseract OCR engine. You can also purchase a license for the OmniPage OCR engine.

Properties Window

To configure the Advanced OCR process node, drag and drop the node onto the Workflow Builder’s work area and double-click it. The Advanced OCR node properties window appears, as in the following illustration:

The above illustration shows the default configuration for the Properties window, with Tesseract as the OCR engine.

If you have access to more than one OCR engine, the OCR Engine field appears as a dropdown menu. Options include:

Tesseract
OmniPage

On the Advanced OCR properties window, you can fully define and customize zones for your OCR processing. The window consists of the following areas:

General Settings
Preview Area
ToolBar
Zone List
Additional Settings

General Settings

Enabled - To enable this node in the current workflow, check the box at this field. If you leave the box blank, the workflow ignores the node and documents pass through as if the node were not present. Note that a disabled node does not check for logic or error conditions.
Node Name - The node name defaults into this field. This name appears in the workflow below the node icon. Use this field to specify a meaningful name for the node that indicates its use in the workflow.
Description - Enter an optional description for this node. A description can help you remember the purpose of the node in the workflow or distinguish nodes from each other. If the description is long, you can hover the mouse over the field to read its entire contents.
OCR Engine - If multiple OCR engines are licensed and installed in Dispatcher Phoenix, a dropdown menu appears from which you can select the engine you want to use. If only one engine is available on your system, the name of the engine displays at this field.

Note: If you change to a different OCR engine, all configuration settings except the Enabled checkbox and Node Name are discarded and set back to the default settings.

Buttons

Advanced Settings - To access the Advanced OCR Settings window, click this button.
Help - To access Dispatcher Phoenix Online Help, click this button.
Save - To preserve your node definition and exit the window, click this button.
Cancel - To exit the window without saving any changes, click this button.

Preview Area

Use the Preview area to upload a sample document you can use to help define your zones. The document should resemble the documents you want to scan.

When you first open the Advanced OCR node properties window, the Preview area contains only the Upload your document window, and many options on the screen are inactive. Once you upload a document, the image appears in the Preview area and the options activate.

To upload a document, click on the icon in the Upload your document window or click on the Upload icon on the Toolbar. The Open Sample window appears from which you can choose a document. In addition, the application provides several sample documents of various sizes that you can also use. See the following illustration:

Select a document and click Open. The sample document appears in the Preview area.

Note: Once you select a sample document, you can select a different document by clicking on the Upload icon on the Toolbar. If you have already created zones, a window appears after you select the new document, and you can choose to save or delete the existing zones.

ToolBar

Use the toolbar at the top of the window to further define the zone as well as customize the view of the node properties window. Note that many options on the Toolbar do not activate until you upload a sample document to the Preview area.

When using the drop-down palettes on the Toolbar, pressing the Enter key or clicking anywhere outside of the palette applies those changes to the zone.

ToolBar Icons	Description
	Test Zone(s) - Click to test all the zones on the page currently being previewed. Results will appear in a section below the preview area.
	Zone Coordinates - Click to define specific coordinates for the zone. You can change the size of the zone by entering values (in pixels) in the Width and Height fields. You can also move the position of the zone by entering values (in pixels) in the Left and Top fields.
	Zone Type - Click on this icon to define settings for a selected zone.
	Zone Page Range - Click to specify the pages on which to apply the zone. Options are: (1) All pages within allowable range, which applies the zone to every page within the range; (2) Only these pages from allowable range, which applies the zone only to the pages you specify. If you select the second option, enter the page range in the empty field provided below it.
	Delete - Click to delete a selected zone.
	OCR - Click to have the application automatically detect and apply zones for you. Any preexisting zones defined on the page will be removed before this process begins.
	Pages - Click on the arrows to navigate through multiple pages of the sample document (if necessary).
	Upload Sample Document - Click to find and upload another sample document to use in the Preview area.
	Actual Size - Click to revert the preview sample document to its original size.
	Fit to Width - Click to stretch the sample document to fit the width of the Preview area.
	Whole Page - Click to fit the sample document completely in the Preview area.
	Zoom controls - Use either the magnifying glass icons or the sliding bar to zoom in and out of the Preview area.

Zone List

Use this area to create, edit, and delete detection zones. Zones define areas of an imaged document for use by the OCR engine, and they can output text from the document. You can create zones manually or you can configure the node to create zones automatically. For example, to capture invoice numbers from incoming documents, you can create a zone in the area of the document where invoice numbers appear.

Once you upload a document to the Preview area, the Zone List activates. All defined zones (if any) for the node appear in the list, as in the following illustration:

To access additional options for zones, open the More Actions menu by clicking the icon (three dots) at the upper right of the Zone List area, as shown below:

The More Actions menu for the Zone List area allows you to:

Menu Option	Menu Action	Keyboard Shortcut
Show / Hide all zones	Toggle the visibility of all zones on the Canvas and display a “hidden” icon next to each zone in the list when it is hidden. If the current selection includes a mix of zones that are shown and hidden, clicking this option will hide all zones.	F6
Delete all zones	Delete all zones from the Zone Editor / Canvas.	Ctrl+Shift+Del

Clicking the three dots next to a zone allows you to:

Menu Option	Menu Action	Keyboard Shortcut
Show / Hide zone	See next table	See next table
Delete	Delete the selected zone from the Zone Editor / Canvas.	Del
Rename	Rename the selected zone.	F2
Test this zone	Run and test a single selected zone, to help users test and debug individual zones.	F5

Notes:

After successfully testing a zone, the zone results query section will appear at the bottom of the preview area with the values for all tested zones.
You can adjust the size of the zone area, the preview area, and the zone results query section that appears after testing a zone by clicking and dragging the edges between two of those areas. For example, you can make the Zones area larger (and the preview area smaller) by dragging the border between the zones area and the preview area to the right.
There are two ways to copy the value of a tested zone:
1. Right-click on the zone and select Copy zone value.
2. Right-click on the zone result in the zone results query section and select Copy.

There is a second menu for Show / Hide Zone with more options:

Menu Option	Menu Action	Keyboard Shortcut
Show / Hide all zones	Toggle the visibility of all zones on the Canvas and display a “hidden” icon next to each zone in the list when it is hidden. If the current selection includes a mix of zones that are shown and hidden, clicking this option will hide all zones.	F6
Show / Hide this zone	Toggle the visibility of the selected zone on the Canvas.	F7
Hide all zones but this	Hide all zones on the Canvas except the selected zone.	F9

The node also supports additional actions:

Menu Option	Menu Action	Keyboard Shortcut
Test selected zones	Run and test multiple selected zones at once, to help users debug several zones at a time.	F5
Copy zone value to the Windows Clipboard	Copy the detected zone value to the Windows clipboard. You can access this command from the right-click menu on the zone result, which is produced after using the Test Zone(s) feature.	Ctrl+C
Delete selected zones	Delete selected zones from the Zone Editor / Canvas.	Del

Note: You can also Rename, Delete, and/or Show / Hide the properties of individual zones by right-clicking them in either the Zones List area or the Preview area and selecting an option from the menu that appears.

You can select multiple zones in two ways:

Click and drag the mouse in the Preview area to highlight multiple zones at once.
Use CTRL+Click to select multiple zones. This method works in the Zone List area and in the Preview area.

If you have multiple zones selected, open the More Actions menu from any of the selected zones to display the options for modifying or testing multiple zones.

Creating Zones Manually

To create zones manually, do the following:

Add New Zone - Click this button to access a drop-down palette, as in the following illustration:
On the Add New Zone drop-down palette, do the following:
- Zone Name - Enter an identifying name for the zone (e.g., invoice or address). You can enter up to 15 characters.
- Left and Top - Enter a value (in pixels) to position the zone from the left and top of the document.
- Width - Enter a value (in pixels) to define an appropriate width for the zone.
- Height - Enter a value (in pixels) to define an appropriate height for the zone.
- Zone Page Range - Specify the pages on which the zone will be applied. Options include:
  - All pages within allowable range - Select this radio button to apply the zone to all pages within the specified range. This means that a zone configured on the first page of the document will automatically be applied to the rest of the pages in the document (if the specified page range to process is Every Page).
  - Only these pages from allowable range - Select this radio button to apply the zone to only a specific range of pages within the specified range. Then enter the page range in the empty field provided.
- Save - Click this button when you are done. The zone appears in the specified location on the Preview area. See the illustration below.
- Cancel - Click this button to exit the drop-down palette without saving any changes.
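
Because zone positions and sizes are entered in pixels, the same physical area of a page corresponds to different values at different scan resolutions. The sketch below converts a measurement in inches to values for the Left, Top, Width, and Height fields. It assumes that pixel coordinates refer to the image at its scanned resolution, which this page does not confirm. ZoneBox and zone_from_inches are hypothetical helper names, not part of Dispatcher Phoenix.

```python
from dataclasses import dataclass

MAX_ZONE_NAME_LENGTH = 15  # limit stated for the Zone Name field


@dataclass(frozen=True)
class ZoneBox:
    """Pixel values for the Zone Name, Left, Top, Width, and Height fields."""
    name: str
    left: int
    top: int
    width: int
    height: int


def zone_from_inches(name: str, left_in: float, top_in: float,
                     width_in: float, height_in: float, dpi: int) -> ZoneBox:
    """Convert a zone measured in inches to pixels at the given DPI.

    Assumes pixels = inches * DPI, i.e. coordinates are relative to the
    image at its scanned resolution (an assumption, not documented here).
    """
    if len(name) > MAX_ZONE_NAME_LENGTH:
        raise ValueError(f"zone name exceeds {MAX_ZONE_NAME_LENGTH} characters")
    if dpi <= 0:
        raise ValueError("dpi must be positive")

    def to_px(inches: float) -> int:
        return round(inches * dpi)

    return ZoneBox(name, to_px(left_in), to_px(top_in),
                   to_px(width_in), to_px(height_in))


if __name__ == "__main__":
    # The same 2.0 x 0.5 inch area, measured on scans of two resolutions.
    print(zone_from_inches("invoice", 5.5, 1.0, 2.0, 0.5, dpi=200))
    print(zone_from_inches("invoice", 5.5, 1.0, 2.0, 0.5, dpi=300))
```

A zone defined against a 200 DPI sample would therefore need different values for 300 DPI input, which is one reason to upload a sample that resembles the documents you expect to scan.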

Creating Zones Automatically

The Advanced OCR node can automatically detect zones, whether or not you upload a sample document. If you choose not to upload a sample document, you can use the Page range to process and Output fields to specify a page range and output format, respectively. Then select the Save button.

If you choose to upload a sample document, the OCR engine can divide the content of pages into ordered zones automatically. Do the following:

OCR - To create zones automatically, click on the icon on the Toolbar. On a multi-page document, the zones appear on the page you are currently previewing. For example, on a four-page document, if you click this button while previewing page 2, the zones appear on page 2. To apply zones to another page in a multi-page document, preview that page and then click on this button.

Editing Zones

To edit a zone, click on it in the Zone List or the Preview area. You have the following options for editing a selected zone:

Preview area
- Re-locate - Click on the zone and drag it to a new area on the palette.
- Re-size - Click on one of the handles on the zone border and drag the edge to a new location on the palette. Note that the handles may not be available if you drastically change the size of the sample document. In such cases, click on the icon on the Toolbar to use the Zone Coordinates option to re-size the zone.

Defining Type/Content for Zones

You can choose settings for each zone to match the specific format of your zone content. With the zone selected on the Preview area, click on the icon on the toolbar to display the Zone Type drop-down palette. Next, choose a type for the zone. The table below lists your options:

Note: The zone types available in the drop-down palette are determined by the OCR engine, based on the functional capabilities of the engine. The table below lists the options available for each OCR engine.

Zone Type	OmniPage	Tesseract
Text Zone - Zone contents will be treated as flowing text.	Y	Y
Table Zone - Zone contents will be treated as a table, with information expected in rows and columns.	Y	Y
Graphic Zone - Zone contents will be treated as an embedded image, and not as recognized text (e.g., photos, logos, and drawings).	Y	Y
Form Zone - Zone contents will be treated as form elements and contain a description of the form objects.	Y	N
Vertical Asian Text Zone - Zone contents will be treated as vertical Asian text.	Y	N
Vertical left-rotated Text Zone - Zone contents will be treated as vertical text that has been rotated left.	Y	N
Vertical right-rotated Text Zone - Zone contents will be treated as vertical text that has been rotated right.	Y	N

Note: Automatic Image Rotation conflicts with the Vertical left-rotated and Vertical right-rotated options. If you select either of these options, disable Automatic Image Rotation, which is enabled by default.
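
The engine table and the rotation note above can be combined into a simple pre-flight check. The sketch below is illustrative only: check_zone_type and the short zone type labels are hypothetical, and the sets are transcribed from the table rather than read from the product.

```python
# Zone types per engine, transcribed from the table above.
ZONE_TYPES_BY_ENGINE = {
    "OmniPage": {
        "Text", "Table", "Graphic", "Form", "Vertical Asian Text",
        "Vertical left-rotated Text", "Vertical right-rotated Text",
    },
    "Tesseract": {"Text", "Table", "Graphic"},
}

# Zone types that conflict with Automatic Image Rotation, per the note above.
ROTATED_TYPES = {"Vertical left-rotated Text", "Vertical right-rotated Text"}


def check_zone_type(engine: str, zone_type: str, auto_rotate: bool) -> list[str]:
    """Return a list of problems with a zone type choice (empty if none)."""
    supported = ZONE_TYPES_BY_ENGINE.get(engine)
    if supported is None:
        return [f"unknown OCR engine: {engine}"]
    problems = []
    if zone_type not in supported:
        problems.append(f"{zone_type} zones are not available with {engine}")
    if zone_type in ROTATED_TYPES and auto_rotate:
        problems.append("disable Automatic Image Rotation for rotated text zones")
    return problems


if __name__ == "__main__":
    print(check_zone_type("Tesseract", "Table", auto_rotate=True))
    print(check_zone_type("Tesseract", "Form", auto_rotate=True))
    print(check_zone_type("OmniPage", "Vertical left-rotated Text", auto_rotate=True))
```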

OCR Metadata

Once you define an OCR zone, other nodes in the workflow can reference it.

The syntax for referencing an OCR zone is the following:
- {ocr:zone.nameofzone.[<page>]}
This provides the value extracted from the zone.
The syntax for referencing where in the zone the OCR engine found a value is the following:
- {ocr:zone.nameofzone.[zonecoordinate]}
[zonecoordinate] is either top, left, width, or height, expressed in pixels.

You can also use the Metadata Browser window to choose the OCR zone variable, as in the following illustration:

Additional options for OCR Metadata include Specifying Page Level Metadata and Specifying Metadata Occurrence Number.

Additional Settings

You can specify which pages to include in the OCR process and the output format. These fields appear in the lower-left corner of the Advanced OCR Node properties window.

Specifying Page Ranges to Process

You can specify which pages to include in the OCR process. You have the following options:

Page Range Process - This area appears at the left of the page, below the Zone List. If you click on the drop-down, the following options appear:
- Every page - Process every page.
- Every even page - Process even pages only.
- Every odd page - Process odd pages only.
- First page - Process the first page only.
- Last page - Process the last page only.
- Define your own page range - Process a custom page range. Once you choose this option, an empty field appears where you can enter the page range. You have the following options:
  - Specify a page range by using commas and/or dash signs counting from the start of the document. For example, enter 1, 2, 5-7 to process pages 1, 2, 5, 6, and 7.
  - Specify a specific sequence within a range of pages by using parentheses. For example, enter 1-10(3) to process every third page from pages 1 to 10.
  - Specify the last page by using ‘end’. For example, enter end(-5) - end to process pages 15-20 of a 20-page document.

  Other examples include:

  - To process pages 1, 2, 5, 6, 7, and 19 of a 20-page document, enter: 1,2,5-7, end(-1).
  - To process pages 10-15 of a 20-page document, enter: 10-end(-5).
  - To process every other page from pages 10-15 of a 20-page document, enter: 10-end(-5)(2).
  - To process pages 15-20 of a 25-page document, enter: end(-10)-end(-5).
  - To process pages 10-20 of a 20-page document, enter: end(-10)-end.
Note: If you specify a page range that does not correspond to the number of pages in the incoming document (e.g., processing pages 10-20 for a three-page document), the file will go out on error.
Process pages using auto-zone if there are no user-defined zones - When processing multi-page documents, the default behavior is to process only those pages that have zones created on them. If your document includes multiple pages that do not have any zones defined on them, you can enable auto-zoning for those pages by checking the Process pages using auto-zone if there are no user-defined zones box. Zones will be automatically defined for the pages without zones. Note that blank pages will not have zones defined for them.

Note: If you check this option, {ocr:zone} will appear in the Metadata Browser window along with any available user-defined zones.
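
The custom page range notation described above can be sketched as a small parser, which is useful for checking a range before you enter it. This illustrates the notation only and is not the Dispatcher Phoenix implementation. expand_page_range is a hypothetical name, and the sketch assumes that a step such as (3) counts from the first page of the range. It raises an error for ranges that do not fit the document, mirroring the note that such files go out on error.

```python
import re

# A page reference is a number, "end", or "end(-N)".
_REF = r"(\d+|end(?:\(-\d+\))?)"
# A token is a reference, an optional "-reference", and an optional "(step)".
_TOKEN = re.compile(rf"^{_REF}(?:-{_REF})?(?:\((\d+)\))?$")


def _resolve(ref: str, page_count: int) -> int:
    """Turn '7', 'end', or 'end(-5)' into a 1-based page number."""
    if ref.startswith("end"):
        offset = int(ref[4:-1]) if "(" in ref else 0
        return page_count + offset
    return int(ref)


def expand_page_range(spec: str, page_count: int) -> list[int]:
    """Expand a range such as '1,2,5-7, end(-1)' into a list of page numbers."""
    pages: list[int] = []
    for raw in spec.split(","):
        token = raw.replace(" ", "")
        match = _TOKEN.match(token)
        if not match:
            raise ValueError(f"unrecognized page range token: {raw!r}")
        first = _resolve(match.group(1), page_count)
        last = _resolve(match.group(2), page_count) if match.group(2) else first
        step = int(match.group(3)) if match.group(3) else 1
        if step < 1 or not (1 <= first <= last <= page_count):
            raise ValueError(f"{raw.strip()!r} does not fit a {page_count}-page document")
        # Assumption: the step counts from the first page of the range.
        pages.extend(range(first, last + 1, step))
    return pages


if __name__ == "__main__":
    print(expand_page_range("1,2,5-7, end(-1)", 20))
    print(expand_page_range("10-end(-5)(2)", 20))
    print(expand_page_range("end(-10)-end(-5)", 25))
```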

Choosing an Output Format

Use the Output field to specify the format of the output file. This area appears at the left of the page, below the Zone List.

At the Output field, if you click on the drop-down, a list of output options appears. The OCR engine can affect the list of options available at this field. If an option does not appear at the Output field, it is not supported by the OCR engine. See the table below.

Note: All processed output files include only the content captured in the user-defined zones, with the following exceptions. These output formats include the original file along with the content captured in the zones:

Original Document + Metadata
PDF Searchable

Output Option	OmniPage	Tesseract
Original Document + Metadata - Outputs the original file along with metadata extracted from defined zones. This is the default setting and is necessary to use metadata in other nodes within the workflow, such as Metadata to File and Metadata Route, for further processing.	Y	Y
PDF - A highly configurable, general PDF output converter that supports many PDF features and relies heavily on the position of the recognized characters. The PDF will have a very similar look to the original document.	Y	N
PDF Searchable - PDF output converter that retains the original image in the foreground with the recognized text hidden in the background (in the correct position). Recommended for archiving and indexing documents. With this format, the entire input document is included as output.	Y	Y
PDF with image substitutes - Special PDF converter that covers suspect words with their images that have been cut out from the original image.	Y	N
Microsoft Word options - Outputs the document to various versions of Word. Choices are Word 2000, XP; 97; 2003. Note that Microsoft Word has a length/height limit of 22”.	Y	N
Microsoft Excel - Outputs the document to various versions of Excel. Choices are Excel 2003, XP; 97.	Y	N
Microsoft PowerPoint - Generates a plain and simple RTF file, which can be interpreted by Microsoft PowerPoint.	Y	N
XPS Searchable - Outputs the document to an XPS file that is searchable. XPS is an XML-based, fixed-layout document format similar to PDF.	Y	N
RTF options - Outputs the document to various versions of RTF. Choices are RTF Word 2000, 97, or Word 6.0/95; RTF 2000 Exact Word (renders pages more exactly in Microsoft Word). Note that Microsoft Word has a length/height limit of 22”.	Y	N
Text - Outputs the document to plain text (*.TXT) that can be read by most text editors and word processors.	Y	Y
Comma Separated Text - Outputs the document into a tabled text file that can be read by Excel (*.CSV).	Y	Y
Formatted Text - Outputs the document to a *.TXT file, trying to retain the layout of the page by inserting extra spaces.	Y	N
Text with line breaks - Outputs the document to text with a line break after each line.	Y	Y
Unicode Text - Outputs the document to plain text, using two-byte Unicode characters.	Y	Y
Unicode Comma Separated Text - Outputs the document into a tabled text file using two-byte Unicode characters. The output file can be read by Excel.	Y	Y
Unicode Formatted Text - Outputs the document to formatted text, using two-byte Unicode characters.	Y	N
Unicode Text with line breaks - Outputs the document to text with a line break after each line and uses two-byte Unicode characters.	Y	Y
XML - Outputs the document to an XML file format.	Y	N
eBook - Uses the Open Ebook Specification 1.0 XML converter.	Y	N
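
When planning a workflow, it can help to check an output choice against the table above before switching engines. The sketch below transcribes the table into two sets and also records which formats carry the whole original file, as described in the note before the table. The function names are hypothetical and the data is copied from this page, not queried from the product.

```python
# Output options the table above marks "Y" for both OmniPage and Tesseract.
BOTH_ENGINES = {
    "Original Document + Metadata", "PDF Searchable", "Text",
    "Comma Separated Text", "Text with line breaks", "Unicode Text",
    "Unicode Comma Separated Text", "Unicode Text with line breaks",
}

# Output options the table marks "Y" for OmniPage and "N" for Tesseract.
OMNIPAGE_ONLY = {
    "PDF", "PDF with image substitutes", "Microsoft Word options",
    "Microsoft Excel", "Microsoft PowerPoint", "XPS Searchable",
    "RTF options", "Formatted Text", "Unicode Formatted Text", "XML", "eBook",
}

# Formats that include the original file, not only the zone content.
INCLUDES_ORIGINAL = {"Original Document + Metadata", "PDF Searchable"}


def output_is_available(engine: str, option: str) -> bool:
    """Return True if the table lists the output option for the engine."""
    if option in BOTH_ENGINES:
        return engine in ("OmniPage", "Tesseract")
    if option in OMNIPAGE_ONLY:
        return engine == "OmniPage"
    raise KeyError(f"unknown output option: {option}")


def output_includes_original(option: str) -> bool:
    """Return True if the output contains the whole original document."""
    return option in INCLUDES_ORIGINAL


if __name__ == "__main__":
    for option in ("PDF Searchable", "XML"):
        print(option,
              output_is_available("Tesseract", option),
              output_includes_original(option))
```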